HPC Homework Project 7:
A Simple Problem
Overview
In this exercise, we want to practice using some of the
MPI collective communication operations,
and introduce the use of OpenMP.
The problem we'll be solving is a simple one:
sum the values in an array.
The basic idea is to parallelize this summation
by distributing it across N different PEs.
To make the problem more interesting,
we'll be using some rather large arrays.
Exercise
The file arraySum.c
contains a sequential program to sum the values in an array.
The files 10k.txt, 100k.txt,
1m.txt, and 10m.txt
(in the directory /home/cs/374/homework/07)
can be used to test this program,
either by using their absolute pathnames,
or by creating symbolic links to them.
Please access the files directly from there --
don't copy them, to avoid wasting space.
Compile arraySum.c,
and execute it using each of these data files,
recording the sums you get for each.
Homework
This week's assignment has these parts:
- Part I is to write a parallel version of arraySum.c
-- mpiArraySum.c --
in which PE0 reads in the array,
scatters a fragment of the array to every PE, after which
each PE sums the array fragment it has received.
Then the PEs perform a reduction with the MPI_SUM operation
to combine these sums at PE0, after which PE0 displays the result
and the execution time, which includes
reading the values from the file, scattering them,
and summing the sums via the reduce operation.
Your program should compute:
  - The total time taken by the program;
  - The time spent in I/O;
  - The time to scatter the values; and
  - The time to sum the array.
- Part II is to copy your program to a new name
-- ompArraySum.c --
and revise it to use
OpenMP and shared-memory parallelism in place of
MPI and distributed-memory parallelism.
In this version, record the total time, the time spent in I/O,
and the time to sum the array.
The actual work of summing the array (but not the I/O)
should all be done by a rewritten/parallel version
of the sumArray() function.
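For Part II, the parallel sumArray() might look something like the minimal sketch below. It assumes the array holds doubles and that sumArray() takes a pointer and a count; the actual signature in arraySum.c may differ. The reduction(+:total) clause gives each thread a private partial sum and combines the partial sums when the loop ends.

```c
/* Hypothetical signature: the sumArray() in arraySum.c may differ. */
double sumArray(const double *a, long numValues) {
    double total = 0.0;
    long i;
    /* Each thread accumulates into its own private copy of total;
       OpenMP adds the private copies into the shared total
       when the loop finishes. */
    #pragma omp parallel for reduction(+:total)
    for (i = 0; i < numValues; i++) {
        total += a[i];
    }
    return total;
}
```

With GCC, the pragma takes effect when compiling with -fopenmp; without that flag the pragma is ignored and the loop runs sequentially, which is handy for checking the result against arraySum.c. The number of threads can be set with the OMP_NUM_THREADS environment variable, and omp_get_wtime() from omp.h is one way to take the timestamps.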
Given an array of M values and N "server" PEs, each "server" should sum roughly M / N values.
The "master" PE (e.g., PE0) can sum the remaining
M % N values as it waits for the others to finish.
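The M / N and M % N split described above can be sketched in plain C as follows. The names Split, splitWork(), and sumLeftover() are hypothetical, and placing the leftover values at the tail of the array is an assumption; other layouts are equally reasonable. In the MPI version, the chunk size computed here would be the count passed to MPI_Scatter, which sends the same number of elements to every process.

```c
/* Hypothetical helper type: how M values are divided among N PEs. */
typedef struct {
    long chunk;      /* values each PE sums: M / N */
    long leftover;   /* values left for the master PE: M % N */
    long leftStart;  /* index of the first leftover value
                        (assumed to be the tail of the array) */
} Split;

/* Compute the split for m values and n PEs (n must be > 0). */
Split splitWork(long m, int n) {
    Split s;
    s.chunk = m / n;
    s.leftover = m % n;
    s.leftStart = s.chunk * n;
    return s;
}

/* Sum the leftover tail sequentially, as the master PE might do. */
double sumLeftover(const double *a, Split s) {
    double total = 0.0;
    long i;
    for (i = s.leftStart; i < s.leftStart + s.leftover; i++) {
        total += a[i];
    }
    return total;
}
```

For example, with M = 10 values and N = 4 PEs, each PE sums 2 values and the master sums the 2 values starting at index 8.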
Test your programs in the ulab using small arrays.
When they seem to be working properly, scp them
to the cluster and test them on the data files there,
recording your execution times.
On Dahl, the data files are in the directory
/home/cs/374/homework/07.
Access the files directly from there -- please don't copy them,
to avoid wasting space.
Hand In
- Your source code for the different versions of the program.
- Four spreadsheet charts
-- one for each input file:
10k.txt, 100k.txt, 1m.txt,
and 10m.txt --
comparing the execution times of your programs
on Dahl using 1, 2, 4, 8, and 16 PEs.
For each PE value, your chart should have two columns,
one for the MPI program and one for the OpenMP program.
These columns should be "stacked" columns,
in which the entire length of each column is the total time
for your program,
the top segment of each column is the I/O time,
the middle segment of each MPI column is the scatter time,
and the bottom segment of each column is the sum+reduce time.
- A 1-2 page analysis of your results.
Explain the time measurements you've observed.
Where is your computation "losing" time?
What is different about this computation and those
we have done in previous assignments?
How do Amdahl's and Gustafson's laws relate to your observations?
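For reference when writing the analysis, the usual statements of the two laws are given below. Here $N$ is the number of PEs and $p$ is the parallelizable fraction of the work; these symbols are this sketch's notation, not the assignment's. In Amdahl's law $p$ is measured on a fixed-size sequential run, whereas in Gustafson's law it is measured on the parallel run of a problem that grows with $N$.

```latex
S_{\text{Amdahl}}(N) = \frac{1}{(1 - p) + \dfrac{p}{N}}
\qquad
S_{\text{Gustafson}}(N) = (1 - p) + pN
```

As a rough guide, if the I/O is performed by a single PE, the measured I/O time divided by the total time approximates the serial fraction $1 - p$ for that run.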
Please staple or paperclip your pages together.
This page maintained by
Joel Adams.