User:Szha/Notes/CMSC714/note0911
- Office Hours
- MPI project
- bug cluster account info
MPI
goals
- standardize previous message passing
- PVM, P4, NX (intel), MPL (IBM), ...
- support copy-free message passing
- portable to many platforms: defines an API, not an implementation
features
- point-to-point messaging
- group/collective communications
- profiling interface: every function has a name-shifted version (allows wrappers that collect metrics)
buffering (in standard mode)
- no guarantee that there are buffers (existence or size)
- possible that send will block until receive is called
delivery order
- two sends from the same process to the same destination will arrive in order
- no guarantee of fairness between sending processes when receiving with a wildcard source
MPI communicators
provide a named set of processes for communication
- plus a context - system allocated unique tag
all processes within a communicator can be named
- a communicator is a group of processes and a context
- numbered from 0..n-1
allows libraries to be constructed
- application creates communicators
- library uses it
- prevents problems with posting wildcard receives
- adds a communicator scope to each receive
all programs start with MPI_COMM_WORLD
- functions for creating communicators from other communicators (split, duplicate, etc.)
- duplicate: to get different tag (context)
- functions for finding out about processes within communicator (size, my_rank, ...)
non-blocking point-to-point functions
two parts
- post the operation
- wait for results
also includes a poll/test option
- checks if the operation has finished
semantics
- must not alter the buffer while the operation is pending (i.e., until wait returns or test returns true)
- data in a receive buffer is not valid until the operation completes
collective communication
communicator specifies process group to participate
various operations, which may be optimized in an MPI implementation
- barrier synchronization
- broadcast
- gather/scatter (with one destination, or all in group)
- reduction operations - predefined and user-defined
- also with one destination or all in group
- scan - prefix reductions
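- for processes p1 - p5 that contribute x1 - x5, MPI_Scan is inclusive, so pi receives x1+...+xi: x1, x1+x2, x1+x2+x3, ...
- the shifted sequence (0), x1, x1+x2, x1+x2+x3 is the exclusive variant (MPI_Exscan, added in MPI-2)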
collective operations may or may not synchronize
- up to the implementation, so application can't make assumptions
MPI calls
#include <mpi.h> in C/C++ programs
- for every process
first call MPI_Init(&argc, &argv)
MPI_Comm_rank(MPI_COMM_WORLD, &myrank)
- myrank is set to id of this process (in range 0 to P-1)
MPI_Wtime()
- returns wall time
at the end, call MPI_Finalize()
- no MPI calls allowed after this
MPI communication
parameters of various calls
- var - a variable (pointer to memory)
- num - number of elements in the variable to use
- type {MPI_INT, MPI_FLOAT (MPI_REAL in Fortran), MPI_BYTE, ... }
- root - rank of process at root of collective operation
- src/dest - rank of source/destination process
- status
calls (all return a code - check for MPI_SUCCESS)
- MPI_Send(var, num, type, dest, tag, MPI_COMM_WORLD)
- MPI_Recv(var, num, type, src, MPI_ANY_TAG, MPI_COMM_WORLD, &status)
- MPI_Bcast(var, num, type, root, MPI_COMM_WORLD)
- all processes call Bcast (the same call) with compatible parameters ("compatible" means the buffer pointers may differ between processes, while the root and the amount of data must match)
- MPI_Barrier(MPI_COMM_WORLD)
- all processes must call the barrier; otherwise the processes that did call it hang
MPI Misc.
MPI Types
- all messages are typed
- base/primitive types are pre-defined
- int, double, real, {unsigned}{short, char, long}
- can construct user-defined types
- includes non-contiguous data types
processor topologies
- allows construction of Cartesian and arbitrary graph topologies
- may allow some systems to run faster
language bindings for C, Fortran, C++, ...
What's not in MPI-1
- process creation
- I/O
- one-sided communication (get/put)
sample MPI program
#include <stdlib.h>
#include "mpi.h"

/* fragment from main; MESSAGESIZE, MSG_TAG and ITERATIONS are defined elsewhere */
int myrank, friendRank;
char message[MESSAGESIZE];
int i, tag = MSG_TAG;
MPI_Status status;

/* initialize, no spawning necessary */
MPI_Init(&argc, &argv);
MPI_Comm_rank(MPI_COMM_WORLD, &myrank);

/* pick the ping-pong partner; written for exactly two processes */
if (myrank == 0) {
    friendRank = 1;
} else {
    friendRank = 0;
}
MPI_Barrier(MPI_COMM_WORLD);

/* rank 0 fills the message it will send first */
if (myrank == 0) {
    for (i = 0; i < MESSAGESIZE; i++) {
        message[i] = '1';
    }
}

for (i = 0; i < ITERATIONS; i++) {
    if (myrank == 0) {
        /* rank 0 sends first, then waits for the reply */
        MPI_Send(message, MESSAGESIZE, MPI_CHAR, friendRank, tag, MPI_COMM_WORLD);
        MPI_Recv(message, MESSAGESIZE, MPI_CHAR, friendRank, tag, MPI_COMM_WORLD, &status);
    } else {
        /* the partner receives first, so the exchange cannot deadlock */
        MPI_Recv(message, MESSAGESIZE, MPI_CHAR, friendRank, tag, MPI_COMM_WORLD, &status);
        MPI_Send(message, MESSAGESIZE, MPI_CHAR, friendRank, tag, MPI_COMM_WORLD);
    }
}

MPI_Finalize();
exit(0);