Office Hours
MPI project
bug cluster account info

MPI

goals

standardize earlier message-passing systems

PVM, P4, NX (Intel), MPL (IBM), ...

support copy-free message passing

portable to many platforms - defines an API, not an implementation

features

point-to-point messaging

group/collective communications

profiling interface: every function also has a name-shifted version (PMPI_ prefix), which allows wrappers and metrics collection

buffering (in standard mode)

no guarantee that there are buffers (existence or size)

possible that send will block until receive is called

delivery order

two sends from the same process to the same destination will arrive in order

no guarantee of fairness between processes on receive (in wildcard-receiving)

MPI communicators

provide a named set of processes for communication

plus a context - system allocated unique tag

all processes within a communicator can be named

a communicator is a group of processes and a context

processes in the group are numbered (ranked) from 0..n-1

allows libraries to be constructed

application creates communicators

library uses it

prevents problems with posting wildcard receives

adds a communicator scope to each receive

all programs start with MPI_COMM_WORLD

functions for creating communicators from other communicators (split, duplicate, etc.)

duplicate: same group of processes, but with a different tag (context)

functions for finding out about processes within communicator (size, my_rank, ...)

non-blocking point-to-point functions

two parts

post the operation

wait for results

also includes a poll/test option

checks if the operation has finished

semantics

must not alter the buffer while the operation is pending (i.e., until wait returns or test returns true)

for a receive, the data is not valid until the operation completes

collective communication

communicator specifies process group to participate

various operations that may be optimized in an MPI implementation

barrier synchronization

broadcast

gather/scatter (with one destination, or all in group)

reduction operations - predefined and user-defined

also with one destination or all in group

scan - prefix reductions

for processes p1 - p5 that contribute values x1 - x5, MPI_Scan (an inclusive scan) delivers x1 to p1, x1+x2 to p2, x1+x2+x3 to p3, and so on

collective operations may or may not synchronize

up to the implementation, so application can't make assumptions

MPI calls

#include <mpi.h> in a C/C++ program

for every process

first call MPI_Init(&argc, &argv)

MPI_Comm_rank(MPI_COMM_WORLD, &myrank)

myrank is set to id of this process (in range 0 to P-1)

MPI_Wtime()

returns wall time

at the end, call MPI_Finalize()

no MPI calls allowed after this

MPI communication

parameters of various calls

var - a variable (pointer to memory)

num - number of elements in the variable to use

type - an MPI datatype {MPI_INT, MPI_DOUBLE, MPI_BYTE, ... } (MPI_REAL is a Fortran datatype)

root - rank of process at root of collective operation

src/dest - rank of source/destination process

status - MPI_Status structure filled in by a receive (fields MPI_SOURCE, MPI_TAG, MPI_ERROR)

calls (all return a code - check for MPI_SUCCESS)

MPI_Send(var, num, type, dest, tag, MPI_COMM_WORLD)
MPI_Recv(var, num, type, src, MPI_ANY_TAG, MPI_COMM_WORLD, &status)
MPI_Bcast(var, num, type, root, MPI_COMM_WORLD)
  all processes call Bcast (the same call) with compatible parameters ("compatible" means the same count, type, and root; each process passes its own buffer pointer)
MPI_Barrier(MPI_COMM_WORLD)
  all processes in the communicator must call the barrier; otherwise the processes that did call it hang

MPI Misc.

MPI Types

all messages are typed

base/primitive types are pre-defined

int, double, real, {unsigned}{short, char, long}

can construct user-defined types

includes non-contiguous data types

processor topologies

allows construction of cartesian & arbitrary graph

may allow some systems to run faster

language bindings for C, Fortran, C++, ...

What's not in MPI-1

process creation

I/O

one-sided communication (get/put)

sample MPI program

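#include <stdlib.h>   /* exit */
#include "mpi.h"

/* Illustrative values; the original notes do not define these. */
#define MESSAGESIZE 1024
#define ITERATIONS  100
#define MSG_TAG     1

/* fragment from main */

int myrank, friendRank;
char message[MESSAGESIZE];
int i, tag = MSG_TAG;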
MPI_Status status;

/* initialize, no spawning necessary */
MPI_Init(&argc, &argv);
MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
if (myrank == 0) {
    friendRank = 1;
}
else {
    friendRank = 0;
}
MPI_Barrier(MPI_COMM_WORLD);
if (myrank == 0) {
    for (i = 0; i < MESSAGESIZE; i++) {
        message[i] = '1';
    }
}

for (i = 0; i<ITERATIONS; i++) {
    if (myrank == 0) {
        MPI_Send(message, MESSAGESIZE, MPI_CHAR, friendRank, tag, MPI_COMM_WORLD);
        MPI_Recv(message, MESSAGESIZE, MPI_CHAR, friendRank, tag, MPI_COMM_WORLD, &status);
    }
    else {
        MPI_Recv(message, MESSAGESIZE, MPI_CHAR, friendRank, tag, MPI_COMM_WORLD, &status);
        MPI_Send(message, MESSAGESIZE, MPI_CHAR, friendRank, tag, MPI_COMM_WORLD);
    }
}
MPI_Finalize();
exit(0);

User:Szha/Notes/CMSC714/note0911
