HPC MPI Homework Project 1: Getting Started With MPI


Getting Started

Begin by making a new directory for this course, containing a subdirectory for this exercise, and save a copy of the C program greetings.c in that subdirectory. This program is a parallel version of a "hello world" style program, using the Message Passing Interface (MPI).

Compiling an MPI Program

To compile the program, use the command:

   mpicc greetings.c 
mpicc is a wrapper that compiles a C program in the usual way (here, using gcc), but it also supplies the include paths and libraries that MPI needs. Compiling with a regular C compiler alone would typically generate errors, because the MPI header or library would not be found. The resulting binary is stored in a file named a.out.

(If you get an error message like:

   bash: mpicc: command not found 
then you will need to add MPI's bin directory to your PATH variable. For example, if MPI's bin directory were in /opt/openmpi/bin/, you would add:
    PATH=/opt/openmpi/bin/:$PATH
    export PATH 
in the file .bash_profile in your home directory.)

Running an MPI Program

To run an MPI program in parallel on multiple machines, we will use the mpirun command. In the Unix lab, we will invoke the mpirun command using two switches:

  mpirun -np numProcesses -machinefile hostNameFile binaryFileName 
will run numProcesses instances of the program in binaryFileName, starting the first instance on our local machine, and then starting the remaining numProcesses-1 instances on the machines whose names are listed in hostNameFile.

Your first task in the Unix lab is to set up your own personal file of host names. To make this easy, download and run either Tim Brom's Perl script, or John VanEnk's shell script. These scripts generate a randomized list of all the ulab machines that are currently running Linux. (Note: Neither of these scripts generates host files for multicore hosts. Feel free to hack them so that they do so.) Change the permissions on the script file to allow execution; then run it and redirect its output to a file (e.g., hosts).

The mpirun command will start up remote processes on the machines according to their order in this file, so you may want to ensure that the machine on which you are working occurs first in this file. (Note: You will not need to use the -machinefile switch on the cluster.)

Once you have your own "randomized" personal machines file, we are nearly ready to run our program in the Unix lab. To run the program in a.out on the first four machines listed in hosts, use the command

   mpirun -np 4 -machinefile hosts ./a.out 
The argument to the -np switch must be an integer specifying how many processes you want to use (generally, more than 1).

One annoying thing is that mpirun uses ssh to launch the remote processes, and if you have not yet set up SSH key authentication for each remote node, you will have to enter your password. To avoid having to enter your password each time, follow Kevin DeGraaf's instructions on SSH Key Authentication.

Run the program with different arguments for the -np switch (e.g., 4, 8, 12, 16) and compare its behavior against its source code, until you understand how it is generating the behavior you observe.

When you understand how greetings.c is working, continue on to today's homework assignment, which is given below.

Homework

Today's assignment has these parts:

    1. In the Unix Lab:
      • Using greetings.c as a model, write a program that sends a message around a ring of processes. More precisely, the process with rank 0 should send a message containing its rank to the process with rank 1, receive a message from the process with rank n-1, and then print this message to the screen. All other processes i > 0 should wait to receive a message from the process with rank i-1, verify that the message contains the integer i-1, replace the integer i-1 with the integer i, and then send the modified message on to the process with rank i+1 (using modulus to "wrap around" from n-1 to 0). Test this much with different arguments for the -np switch, so that you are confident it works correctly before continuing.
      • Using the information from Quinn Chapter 4 on benchmarking, add the necessary code to make the process with rank 0 time how long it takes the message to circulate around the "ring". Since you are competing for network bandwidth and CPU cycles with others in the lab, this time will vary from execution to execution. To compensate for this variance, modify your code so that it computes the average time per trip using 3 trips around the "ring". Test and record the timing data when using rings of 4, 8, 16, 32, and 64 processes. Be careful to only compute the time it takes the message to traverse the ring -- your timing should exclude I/O, your loop to "do it 3 times", etc. If you're not certain about this, check with me before proceeding to the next step.
    2. On The Cluster:
      • Transfer your code to the cluster (dahl.calvin.edu). The secure-copy utility scp is probably the easiest way to do so. For example, if my program is in a file named ringCircler.c, to transfer it to my 374/proj1 directory on Dahl, I would type:
            scp ringCircler.c adams@dahl.calvin.edu:/home/adams/374/proj1 
      • Several people can work at a time on the cluster, though you may interfere with one another if more than a half-dozen or so are working simultaneously. When you log in, use the who or w command to see what other users are logged in, and if there are more than a few, log out and try again later.

        On dahl, repeat the timing experiments you did in the Unix lab by measuring the average time per trip using 3 trips around the "ring". Note that you need not prepare a machine file, and you should not need to use the -machinefile switch on the cluster.

        Record the timing data when using rings of 4, 8, 16, 32, and 64 processes.

    Hand In

    A script file listing your program, showing its compilation, and then its execution for 4, 8, 16, 32, and 64 processes. Attach a line-chart created using a spreadsheet program (e.g., Excel, Open Office, etc.), that plots the average time for a message to circle the ring against the number of processes in the ring, for both the Unix lab and the cluster. Attach a 1-2 paragraph written analysis of your interpretation of the data on your chart.




    This page maintained by Joel Adams.