15:01, 13 September 2012
# Office Hours
# MPI project
# bug cluster account info
= MPI =
== goals ==
: standardize earlier message-passing systems
::* PVM, P4, NX (intel), MPL (IBM), ...
: support copy-free message passing
: portable to many platforms - defines an API, not an implementation
== features ==
: point-to-point messaging
: group/collective communications
: profiling interface: every function has a name-shifted version (PMPI_ prefix), allowing wrapper libraries and metrics collection
== buffering (in standard mode) ==
: no guarantee that there are buffers (existence or size)
: possible that a send will block until the matching receive is posted
== delivery order ==
: two sends from the same process to the same destination will arrive in order
: no guarantee of fairness between sending processes on receive (when receiving with a wildcard source)
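The wildcard case can be sketched as follows; `collect` and `nsenders` are illustrative names, not from the notes:

```c
/* Sketch: wildcard receive. Messages from a single sender arrive in
   order, but with MPI_ANY_SOURCE the interleaving between different
   senders is not guaranteed; status.MPI_SOURCE reports who actually
   sent each message. */
#include <mpi.h>
#include <stdio.h>

void collect(int nsenders) {
    int i, value;
    MPI_Status status;
    for (i = 0; i < nsenders; i++) {
        MPI_Recv(&value, 1, MPI_INT, MPI_ANY_SOURCE, MPI_ANY_TAG,
                 MPI_COMM_WORLD, &status);
        printf("got %d from rank %d\n", value, status.MPI_SOURCE);
    }
}
```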
= MPI communicators =
== provide a named set of processes for communication ==
: plus a context - a system-allocated unique tag
== all processes within a communicator can be named ==
: a communicator is a group of processes and a context
: numbered from 0..n-1
== allows libraries to be constructed ==
: application creates communicators
: library uses it
: prevents problems with posting wildcard receives
::* adds a communicator scope to each receive
== all programs start with MPI_COMM_WORLD ==
: functions for creating communicators from other communicators (split, duplicate, etc.)
::* duplicate: to get different tag (context)
: functions for finding out about processes within communicator (size, my_rank, ...)
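A minimal sketch of creating a communicator from MPI_COMM_WORLD with MPI_Comm_split; the color choice (rank / 2, giving pairs of processes per new communicator) is arbitrary, for illustration only:

```c
/* Sketch: splitting MPI_COMM_WORLD into smaller communicators.
   Processes passing the same color end up in the same new
   communicator; the key (here world_rank) orders ranks within it. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[]) {
    int world_rank, world_size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
    MPI_Comm_size(MPI_COMM_WORLD, &world_size);

    MPI_Comm pair_comm;
    MPI_Comm_split(MPI_COMM_WORLD, world_rank / 2, world_rank, &pair_comm);

    int pair_rank, pair_size;
    MPI_Comm_rank(pair_comm, &pair_rank);
    MPI_Comm_size(pair_comm, &pair_size);
    printf("world %d/%d -> pair %d/%d\n",
           world_rank, world_size, pair_rank, pair_size);

    MPI_Comm_free(&pair_comm);
    MPI_Finalize();
    return 0;
}
```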
= non-blocking point-to-point functions =
== two parts ==
: post the operation
: wait for results
== also includes a poll/test option ==
: checks if the operation has finished
== semantics ==
: must not alter the buffer while the operation is pending (i.e., until wait returns or test returns true)
: and data not valid for a receive until operation completes
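The post/wait pattern can be sketched as below; `exchange` and its parameters are illustrative names, not from the notes:

```c
/* Sketch of post/wait: MPI_Irecv and MPI_Isend return immediately;
   the buffers must not be touched until MPI_Waitall returns (or
   MPI_Test reports completion). */
#include <mpi.h>

void exchange(int partner, double *sendbuf, double *recvbuf, int n) {
    MPI_Request reqs[2];
    MPI_Status stats[2];

    /* Post both operations up front; this also avoids the deadlock
       that two blocking sends could cause in standard mode when no
       buffering is available. */
    MPI_Irecv(recvbuf, n, MPI_DOUBLE, partner, 0, MPI_COMM_WORLD, &reqs[0]);
    MPI_Isend(sendbuf, n, MPI_DOUBLE, partner, 0, MPI_COMM_WORLD, &reqs[1]);

    /* ... overlap computation that touches neither buffer ... */

    MPI_Waitall(2, reqs, stats);  /* now both buffers are safe to reuse */
}
```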
= collective communication =
== communicator specifies process group to participate ==
== various operations that may be optimized by an MPI implementation ==
: barrier synchronization
: broadcast
: gather/scatter (with one destination, or all in group)
: reduction operations - predefined and user-defined
::* also with one destination or all in group
: scan - prefix reductions
::* for processes p1..p5 producing values x1..x5, an inclusive scan (MPI_Scan) gives pi the result x1+...+xi; an exclusive scan (MPI_Exscan) gives 0, x1, x1+x2, x1+x2+x3, x1+x2+x3+x4
== collective operations may or may not synchronize ==
: up to the implementation, so application can't make assumptions
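A sketch of reduction and prefix reduction: with each rank contributing rank + 1, an MPI_SUM reduce at the root yields P*(P+1)/2 and MPI_Scan gives each rank the inclusive prefix sum 1 + ... + (rank+1).

```c
/* Sketch: MPI_Reduce delivers the combined result at the root only;
   MPI_Scan delivers an inclusive prefix reduction at every rank. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[]) {
    int rank, value, total = 0, prefix = 0;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    value = rank + 1;

    MPI_Reduce(&value, &total, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);
    MPI_Scan(&value, &prefix, 1, MPI_INT, MPI_SUM, MPI_COMM_WORLD);

    printf("rank %d: prefix sum = %d\n", rank, prefix);
    if (rank == 0)
        printf("total = %d\n", total);  /* P*(P+1)/2 */

    MPI_Finalize();
    return 0;
}
```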
= MPI calls =
== include <mpi.h> in c/c++ program ==
: for every process
== first call MPI_Init(&argc, &argv) ==
== MPI_Comm_rank(MPI_COMM_WORLD, &myrank) ==
: myrank is set to id of this process (in range 0 to P-1)
== MPI_Wtime() ==
: returns wall time
== at the end, call MPI_Finalize() ==
: no MPI calls allowed after this
= MPI communication =
== parameters of various calls ==
: var - a variable (pointer to memory)
: num - number of elements in the variable to use
: type {MPI_INT, MPI_REAL, MPI_BYTE, ... }
: root - rank of process at root of collective operation
: src/dest - rank of source/destination process
: status
== calls (all return a code - check for MPI_SUCCESS) ==
<pre>
MPI_Send(var, num, type, dest, tag, MPI_COMM_WORLD)
MPI_Recv(var, num, type, src, MPI_ANY_TAG, MPI_COMM_WORLD, &status)
MPI_Bcast(var, num, type, root, MPI_COMM_WORLD)
all processes call Bcast (the same call) with compatible parameters (same count and type; the buffer pointers may differ)
MPI_Barrier(MPI_COMM_WORLD)
all processes must call the barrier, otherwise the remaining processes hang
</pre>
= MPI Misc. =
== MPI Types ==
: all messages are typed
::* base/primitive types are pre-defined
::: int, double, real, {unsigned}{short, char, long}
::* can construct user-defined types
::: includes non-contiguous data types
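Non-contiguous types can be sketched with MPI_Type_vector; `send_column` and the 4x5 matrix shape are illustrative, not from the notes:

```c
/* Sketch: a strided (non-contiguous) datatype. MPI_Type_vector with
   count=4, blocklength=1, stride=5 picks out one column of a 4x5
   row-major matrix, so a single send moves the whole column. */
#include <mpi.h>

void send_column(double mat[4][5], int col, int dest) {
    MPI_Datatype column;
    MPI_Type_vector(4, 1, 5, MPI_DOUBLE, &column);
    MPI_Type_commit(&column);

    /* one element of type "column", starting at the top of the column */
    MPI_Send(&mat[0][col], 1, column, dest, 0, MPI_COMM_WORLD);

    MPI_Type_free(&column);
}
```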
== processor topologies ==
: allows construction of Cartesian and arbitrary graph topologies
: may allow some systems to run faster
== language bindings for C, Fortran, C++, ... ==
== What's not in MPI-1 ==
: process creation
: I/O
: one-sided communication (get/put)
= sample MPI program =
<pre>
#include "mpi.h"

/* fragment from main; assumes MESSAGESIZE, ITERATIONS, and MSG_TAG
   are defined elsewhere */
int myrank, friendRank;
char message[MESSAGESIZE];
int i, tag = MSG_TAG;
MPI_Status status;

/* initialize, no spawning necessary */
MPI_Init(&argc, &argv);
MPI_Comm_rank(MPI_COMM_WORLD, &myrank);

if (myrank == 0) {
    friendRank = 1;
} else {
    friendRank = 0;
}

MPI_Barrier(MPI_COMM_WORLD);

if (myrank == 0) {
    for (i = 0; i < MESSAGESIZE; i++) {
        message[i] = '1';
    }
}

/* ping-pong: rank 0 sends first, rank 1 receives first */
for (i = 0; i < ITERATIONS; i++) {
    if (myrank == 0) {
        MPI_Send(message, MESSAGESIZE, MPI_CHAR, friendRank, tag, MPI_COMM_WORLD);
        MPI_Recv(message, MESSAGESIZE, MPI_CHAR, friendRank, tag, MPI_COMM_WORLD, &status);
    } else {
        MPI_Recv(message, MESSAGESIZE, MPI_CHAR, friendRank, tag, MPI_COMM_WORLD, &status);
        MPI_Send(message, MESSAGESIZE, MPI_CHAR, friendRank, tag, MPI_COMM_WORLD);
    }
}

MPI_Finalize();
exit(0);
</pre>