Trapezoid Tutorial

Introduction
This tutorial will show you how to compile and run a simple MPI program on one of the Beowulf clusters. The use of helios is assumed, but you can substitute the name of another cluster for helios. Be sure to substitute username with your St. Olaf username.

Connecting to the cluster
The most flexible interface to the cluster is, of course, the command line.

Log in to helios using your St. Olaf username and password. From a lab machine, type:

ssh -X helios.public.stolaf.edu

(The -X flag enables X-windows applications to show on your local computer.)

You can also use PuTTY on Windows as an SSH client.

Once logged in, you will be in your home directory on the cluster. The home directories on each cluster are separate from each other and from the St. Olaf home directories. Your home directory on a cluster is NFS mounted on each node of that cluster, making it easy to get to your programs and data. If you are working with very large data sets, you may want to consider an alternative way of moving data to the nodes.
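Because these home directories are separate from your St. Olaf home directory, you will usually copy source files and data over before working with them. One typical approach from a lab machine (the file name here is only a placeholder) is scp:

scp trap.c username@helios.public.stolaf.edu:

which places trap.c in your home directory on the cluster.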

Compiling an MPI program
The Message Passing Interface (MPI) is a set of functions that let you write parallel programs, usually in C or Fortran. Our clusters have the OpenMPI runtime installed, which implements the MPI standard. Your code interacts with MPI in three phases:


 * Include mpi.h in your program.
 * Compile with mpicc, which links in the necessary libraries.
 * Run with mpirun, which launches your program on the appropriate nodes and handles inter-node communication. (We will actually use Slurm to run mpirun.)
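To make those three phases concrete before we write the real program, here is a minimal sketch (a hypothetical hello.c, not part of this tutorial's program):

#include <stdio.h>
#include "mpi.h"                               /* phase 1: include the MPI header */

int main(int argc, char** argv) {
    int rank;
    MPI_Init(&argc, &argv);                    /* start up MPI */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);      /* each process learns its rank */
    printf("Hello from process %d\n", rank);
    MPI_Finalize();                            /* shut down MPI */
    return 0;
}

You would compile this with mpicc -o hello hello.c (phase 2) and launch it with mpirun (phase 3); each process prints its own rank.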

Let's open a text editor and enter trap.c (a C++ version is available in MPI and C++):


#include <stdio.h>
#include <math.h>
#include "mpi.h"

/* Begin C Program */
int main(int argc, char** argv) {
    /* Variables */
    int         my_rank;
    int         nprocs;
    double      a = 0.0;
    double      b = 1.0;
    int         n = 1024;
    double      h;
    double      local_a;
    double      local_b;
    int         local_n;
    double      integral;
    double      total;
    int         source;
    int         dest = 0;
    int         tag = 0;
    MPI_Status  status;
    double Trap(double local_a, double local_b, int local_n, double h);

    /* Let's get things started */
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    /* Compute the area of the trapezoids */
    h = (b - a) / n;
    local_n = n / nprocs;
    local_a = a + my_rank * local_n * h;
    local_b = local_a + local_n * h;
    integral = Trap(local_a, local_b, local_n, h);

    if (my_rank == 0) {
        /* The head node fetches the results */
        total = integral;
        for (source = 1; source < nprocs; source++) {
            MPI_Recv(&integral, 1, MPI_DOUBLE, source, tag,
                     MPI_COMM_WORLD, &status);
            total = total + integral;
        }
    } else {
        /* Each node sends its answer to the head */
        MPI_Send(&integral, 1, MPI_DOUBLE, dest, tag, MPI_COMM_WORLD);
    }

    /* Print our answer */
    if (my_rank == 0) {
        printf("With n = %d trapezoids, our estimate\n", n);
        printf("of the integral from %g to %g = %g\n", a, b, total);
    }

    /* Shut things down */
    MPI_Finalize();
    return 0;
}

double Trap(double local_a, double local_b, int local_n, double h) {
    double integral;
    double x;
    int    i;
    double f(double x);

    integral = (f(local_a) + f(local_b)) / 2.0;
    x = local_a;
    for (i = 1; i <= local_n - 1; i++) {
        x = x + h;
        integral = integral + f(x);
    }
    integral = integral * h;
    return integral;
}

double f(double x) {
    double return_val;
    return_val = pow(x, 8) - 4*pow(x, 7) - pow(x, 6) + 12*pow(x, 5)
               + 3*pow(x, 4) - 4*pow(x, 3) - 7*pow(x, 2) - 20*x - 12;
    return return_val;
}
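For reference, Trap is simply the composite trapezoid rule applied to one process's subinterval. In LaTeX notation, the full estimate is

\[
\int_a^b f(x)\,dx \approx h\left[\frac{f(a) + f(b)}{2} + \sum_{i=1}^{n-1} f(a + ih)\right],
\qquad h = \frac{b - a}{n},
\]

and each process evaluates the same formula with its own local_a, local_b, and local_n in place of a, b, and n.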

This program implements the Trapezoid Rule for integration. There are several MPI calls that make this parallel program work:


 * MPI_Init must be the first MPI function you call. It accepts your program's command line parameters.
 * MPI_Comm_rank saves the process's *rank* to the integer pointed to. If your program is running with 12 processes, each process will have a unique rank ranging from 0 to 11.
 * MPI_Comm_size saves the number of processes to the integer pointed to.
 * MPI_Recv receives a message. We will go over these parameters, but keep in mind that MPI_Send and MPI_Recv are blocking. A process will block on MPI_Recv until another process sends a message.
 * MPI_Send sends a message.
 * MPI_Finalize must be the last MPI call in your program.
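For reference while reading the list above, the MPI standard declares the two point-to-point calls as follows (these prototypes come from the MPI specification itself, not from anything specific to our clusters):

int MPI_Send(void* buf, int count, MPI_Datatype datatype,
             int dest, int tag, MPI_Comm comm);
int MPI_Recv(void* buf, int count, MPI_Datatype datatype,
             int source, int tag, MPI_Comm comm, MPI_Status* status);

In our program, buf is &integral, count is 1, the datatype is MPI_DOUBLE, and both ends use tag 0.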

An excellent reference to start learning MPI is _Parallel Programming with MPI_ by Peter Pacheco. Prof. Brown has a couple of copies.

Now we want to compile this program:
mpicc -c trap.c
mpicc -o trap trap.o

(Use mpic++ instead of mpicc for C++ programs, as described in MPI and C++.) Make sure you know the location of the compiled program. For example, your program might be located at: /home/username/trap/trap
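As an aside, because mpicc passes its arguments through to the system C compiler, the compile and link steps can also be combined into one; and if your linker should report an undefined reference to pow, append -lm (whether that is needed depends on the compiler installation):

mpicc -o trap trap.c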

Once we have an MPI program, we want to run it on the cluster using Slurm and OpenMPI.

Creating a Slurm job
Slurm is a queuing system that organizes jobs to run on the cluster. You specify a job by creating a *job script* and submitting the job script to Slurm.

Let's open a text editor and enter trapjob.sh, a script to run our trap program:

#!/bin/sh

mpirun trap

The first line indicates that this file is a script for the /bin/sh program. The last line runs your program with mpirun.

When your job is ready to run, Slurm executes this script on one of the nodes. OpenMPI interacts with Slurm to determine which nodes your MPI program should run on and launches it accordingly. Make sure that the program you specify to mpirun is actually accessible on each node. (NFS mounting of home directories takes care of this.)

Now, submit your job to Slurm. This step copies your job script to a spooling directory where it waits to be run by the scheduler:

sbatch -n 4 -o trapjob.out trapjob.sh
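Alternatively (this is standard Slurm behavior, not specific to our clusters), the same options can be embedded in the job script itself as #SBATCH directives, so that a plain sbatch trapjob.sh suffices:

#!/bin/sh
#SBATCH -n 4
#SBATCH -o trapjob.out

mpirun trap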

As your job runs, Slurm saves the job's standard output and standard error streams to a file in your home directory; with the -o flag above, both land in the file trapjob.out.
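While you wait, the standard Slurm commands can show the job's progress (substitute your own username; the job ID is a placeholder printed by sbatch when you submit):

squeue -u username     # list your pending and running jobs
scancel jobid          # cancel a job if needed
cat trapjob.out        # read the output once the job finishes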