User:Szha/Notes/CMSC714/note0911

Revision as of 15:05, 13 September 2012 by Szha (talk | contribs)
  1. Office Hours
  2. MPI project
  3. bug cluster account info

MPI

goals

standardize previous message passing
  • PVM, P4, NX (intel), MPL (IBM), ...
support copy-free message passing
portable to many platforms - defines an API, not an implementation

features

point-to-point messaging
group/collective communications
profiling interface: every function has a name-shifted version (allow wrapper and metrics)

buffering (in standard mode)

no guarantee that there are buffers (existence or size)
possible that send will block until receive is called
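Because standard-mode buffering is not guaranteed, the classic symmetric exchange (both ranks send first, then receive) can deadlock. A minimal sketch, not from the notes, of the unsafe pattern and the `MPI_Sendrecv` fix; the function and variable names here (`exchange`, `peer`, `sendbuf`, `recvbuf`) are illustrative assumptions:

```c
#include <mpi.h>

/* Exchange n ints with a peer rank. If both ranks called MPI_Send first
 * and the implementation provided no buffering, each send could block
 * waiting for a matching receive -> deadlock. MPI_Sendrecv pairs the
 * send and receive so the exchange is always safe. */
void exchange(int peer, int *sendbuf, int *recvbuf, int n)
{
    /* Unsafe pattern (may deadlock in standard mode):
     *   MPI_Send(sendbuf, n, MPI_INT, peer, 0, MPI_COMM_WORLD);
     *   MPI_Recv(recvbuf, n, MPI_INT, peer, 0, MPI_COMM_WORLD,
     *            MPI_STATUS_IGNORE);
     */
    MPI_Sendrecv(sendbuf, n, MPI_INT, peer, 0,
                 recvbuf, n, MPI_INT, peer, 0,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);
}
```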

delivery order

two sends from the same process to the same destination will arrive in order
no guarantee of fairness between processes on receive (in wildcard-receiving)

MPI communicators

provide a named set of processes for communication

plus a context - system allocated unique tag

all processes within a communicator can be named

a communicator is a group of processes and a context
numbered from 0..n-1

allows libraries to be constructed

application creates communicators
library uses it
prevents problems with posting wildcard receives
  • adds a communicator scope to each receive

all programs start with MPI_COMM_WORLD

functions for creating communicators from other communicators (split, duplicate, etc.)
  • duplicate: to get different tag (context)
functions for finding out about processes within communicator (size, my_rank, ...)
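A sketch (not from the notes) of the creation and query calls above; the `color` formula grouping every 4 ranks is an arbitrary illustrative choice:

```c
#include <mpi.h>

/* Fragment, assumed to run after MPI_Init. */
int world_rank, world_size;
MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
MPI_Comm_size(MPI_COMM_WORLD, &world_size);

/* split: ranks with the same color land in the same new communicator,
 * renumbered 0..k-1 within it (ordered by the key argument) */
MPI_Comm row_comm;
int color = world_rank / 4;   /* illustrative grouping */
MPI_Comm_split(MPI_COMM_WORLD, color, world_rank, &row_comm);

/* duplicate: same process group, new context, so a library's tags
 * cannot collide with the application's */
MPI_Comm lib_comm;
MPI_Comm_dup(MPI_COMM_WORLD, &lib_comm);

int row_rank;
MPI_Comm_rank(row_comm, &row_rank);

MPI_Comm_free(&row_comm);
MPI_Comm_free(&lib_comm);
```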

non-blocking point-to-point functions

two parts

post the operation
wait for results

also includes a poll/test option

checks if the operation has finished

semantics

must not alter buffer while operation is pending (wait returns or test returns true)
and data not valid for a receive until operation completes
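A sketch of the post/test/wait pattern described above (not from the notes; `peer`, `sendbuf`, `recvbuf`, and `n` are assumed to be set up elsewhere):

```c
#include <mpi.h>

/* Fragment: post both operations, overlap work, then wait. */
MPI_Request sreq, rreq;
MPI_Isend(sendbuf, n, MPI_INT, peer, 0, MPI_COMM_WORLD, &sreq);
MPI_Irecv(recvbuf, n, MPI_INT, peer, 0, MPI_COMM_WORLD, &rreq);

/* ... computation that touches neither buffer may run here ... */

/* poll/test option: non-blocking check for completion */
int done = 0;
MPI_Test(&rreq, &done, MPI_STATUS_IGNORE);

/* recvbuf is valid only after the receive completes;
 * sendbuf may be reused only after the send completes */
MPI_Wait(&rreq, MPI_STATUS_IGNORE);
MPI_Wait(&sreq, MPI_STATUS_IGNORE);
```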

collective communication

communicator specifies process group to participate

various operations that may be optimized in an MPI implementation

barrier synchronization
broadcast
gather/scatter (with one destination, or all in group)
reduction operations - predefined and user-defined
  • also with one destination or all in group
scan - prefix reductions
  • for processes p1 - p5 producing values x1 - x5, an inclusive scan gives p1: x1, p2: x1+x2, ..., p5: x1+...+x5 (an exclusive scan would instead give 0, x1, x1+x2, x1+x2+x3, ...)

collective operations may or may not synchronize

up to the implementation, so application can't make assumptions
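A sketch (not from the notes) of the collective calls listed above; `myrank` is assumed to come from `MPI_Comm_rank`, and the contribution `x = myrank + 1` is an illustrative choice:

```c
#include <mpi.h>

/* Fragment: each process contributes one int. */
int x = myrank + 1;
int sum = 0, prefix = 0;

/* reduction with one destination (root = rank 0) */
MPI_Reduce(&x, &sum, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);

/* reduction with the result delivered to all in the group */
MPI_Allreduce(&x, &sum, 1, MPI_INT, MPI_SUM, MPI_COMM_WORLD);

/* inclusive prefix sum: rank i receives x_0 + ... + x_i */
MPI_Scan(&x, &prefix, 1, MPI_INT, MPI_SUM, MPI_COMM_WORLD);

/* barrier synchronization: every process in the communicator
 * must reach this call before any process continues */
MPI_Barrier(MPI_COMM_WORLD);
```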

MPI calls

include <mpi.h> in c/c++ program

for every process

first call MPI_Init(&argc, &argv)

MPI_Comm_rank(MPI_COMM_WORLD, &myrank)

myrank is set to id of this process (in range 0 to P-1)

MPI_Wtime()

returns wall time

at the end, call MPI_Finalize()

no MPI calls allowed after this
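The lifecycle above, as a minimal complete program (a sketch, not from the notes):

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    /* first MPI call in every process */
    MPI_Init(&argc, &argv);

    int myrank, nprocs;
    MPI_Comm_rank(MPI_COMM_WORLD, &myrank);  /* 0 .. P-1 */
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    double t0 = MPI_Wtime();                 /* wall-clock seconds */
    printf("rank %d of %d\n", myrank, nprocs);
    double elapsed = MPI_Wtime() - t0;
    printf("rank %d took %f s\n", myrank, elapsed);

    /* last MPI call: no MPI calls allowed after this */
    MPI_Finalize();
    return 0;
}
```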

MPI communication

parameters of various calls

var - a variable (pointer to memory)
num - number of elements in the variable to use
type {MPI_INT, MPI_REAL, MPI_BYTE, ... }
root - rank of process at root of collective operation
src/dest - rank of source/destination process
status

calls (all return a code - check for MPI_SUCCESS)

MPI_Send(var, num, type, dest, tag, MPI_COMM_WORLD)
MPI_Recv(var, num, type, src, MPI_ANY_TAG, MPI_COMM_WORLD, &status)
MPI_Bcast(var, num, type, root, MPI_COMM_WORLD)
  all processes call Bcast (the same call) with compatible parameters ("compatible" means the buffer pointers may differ across processes)
MPI_Barrier(MPI_COMM_WORLD)
  all processes must call the barrier, otherwise the remaining processes hang

MPI Misc.

MPI Types

all messages are typed
  • base/primitive types are pre-defined
int, double, real, {unsigned}{short, char, long}
  • can construct user-defined types
includes non-contiguous data types
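A sketch (not from the notes) of a user-defined type for non-contiguous data, sending one column of a row-major matrix; `ROWS`, `COLS`, `a`, `col`, `dest`, and `tag` are illustrative assumptions:

```c
#include <mpi.h>

/* Fragment: in a row-major ROWS x COLS matrix, one column is ROWS
 * blocks of 1 element, each COLS elements apart -> MPI_Type_vector. */
MPI_Datatype column;
MPI_Type_vector(ROWS, 1, COLS, MPI_DOUBLE, &column);
MPI_Type_commit(&column);

/* send the column starting at a[0][col] as a single typed message */
MPI_Send(&a[0][col], 1, column, dest, tag, MPI_COMM_WORLD);

MPI_Type_free(&column);
```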

processor topologies

allows construction of Cartesian & arbitrary graph topologies
may allow some systems to run faster
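A sketch (not from the notes) of building a 2-D Cartesian topology; `nprocs` is assumed to come from `MPI_Comm_size`:

```c
#include <mpi.h>

/* Fragment: let MPI factor nprocs into a 2-D grid, then create a
 * communicator with that topology (reorder = 1 lets the system
 * renumber ranks to match the hardware, which may run faster). */
int dims[2] = {0, 0}, periods[2] = {0, 0};
MPI_Dims_create(nprocs, 2, dims);

MPI_Comm cart;
MPI_Cart_create(MPI_COMM_WORLD, 2, dims, periods, 1, &cart);

/* neighbors one step apart along dimension 1 (left/right) */
int left, right;
MPI_Cart_shift(cart, 1, 1, &left, &right);

MPI_Comm_free(&cart);
```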

language bindings for C, Fortran, C++, ...

What's not in MPI-1

process creation
I/O
one-sided communication (get/put)

sample MPI program

#include "mpi.h"

/* fragment from main  */ 

int myrank, friendRank;
char message[MESSAGESIZE];
int i, tag = MSG_TAG;
MPI_Status status;

/* initialize, no spawning necessary */
MPI_Init(&argc, &argv);
MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
if (myrank == 0) {
    friendRank = 1;
}
else {
    friendRank = 0;
}
MPI_Barrier(MPI_COMM_WORLD);
if (myrank == 0) {
    for (i = 0; i < MESSAGESIZE; i++) {
        message[i] = '1';
    }
}

for (i = 0; i<ITERATIONS; i++) {
    if (myrank == 0) {
        MPI_Send(message, MESSAGESIZE, MPI_CHAR, friendRank, tag, MPI_COMM_WORLD);
        MPI_Recv(message, MESSAGESIZE, MPI_CHAR, friendRank, tag, MPI_COMM_WORLD, &status);
    }
    else {
        MPI_Recv(message, MESSAGESIZE, MPI_CHAR, friendRank, tag, MPI_COMM_WORLD, &status);
        MPI_Send(message, MESSAGESIZE, MPI_CHAR, friendRank, tag, MPI_COMM_WORLD);
    }
}
MPI_Finalize();
exit(0);