In this exercise, we want to use parallelism to accelerate the computation of the sum of the squares of the values stored in an input file. The basic idea is to parallelize this summation by distributing it across P different PEs. To make the problem interesting, we'll be using the same large files you used in this week's lab.
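One way to picture the distribution across P PEs is a block decomposition, in which each PE handles one contiguous slice of the N values. The sketch below is only an illustration under that assumption: the helper name `block_range` and the rule for handing out leftover values are hypothetical, and the scheme from this week's lab may differ.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: compute the slice of n values that PE `rank`
 * (0 <= rank < p) would handle under a block decomposition.
 * When n is not divisible by p, the first (n % p) PEs get one extra value.
 */
void block_range(size_t n, int p, int rank, size_t *start, size_t *count)
{
    assert(p > 0 && rank >= 0 && rank < p);

    size_t np = (size_t)p;
    size_t r = (size_t)rank;
    size_t base = n / np;    /* values every PE receives */
    size_t extra = n % np;   /* leftover values, one each to the lowest ranks */

    *count = base + (r < extra ? 1 : 0);
    *start = r * base + (r < extra ? r : extra);
}
```

With N = 10 and P = 4, this assigns 3, 3, 2, and 2 values to PEs 0 through 3, so no PE has more than one value beyond any other.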
The file sumSquares.c contains a sequential program to read N double values from a file into an array, sum the squares of the values in the array, and report the resulting value. The file Makefile lets you build sumSquares using the make utility.
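For orientation, the summation step might look roughly like the sketch below. This is not the contents of sumSquares.c; the function name `sum_squares` and its signature are assumptions made for illustration.

```c
#include <stddef.h>

/*
 * Hypothetical sketch of the sequential summation step:
 * return the sum of the squares of the n values in `values`.
 */
double sum_squares(const double *values, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; ++i) {
        sum += values[i] * values[i];   /* square each value, then accumulate */
    }
    return sum;
}
```

For the three values 1.0, 2.0, and 3.0, this returns 14.0.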
Use the text files in the directory /home/cs/374/exercises/04 to test this program.
If you do not want to use these clunky, long, absolute file-names, you can create a symbolic link to each one. For example, if you enter the following command in your project folder:
ln -s /home/cs/374/exercises/04/1m-doubles.txt
Linux will create a symbolic link or shortcut named 1m-doubles.txt there in your current folder. This both avoids copying the files / wasting space AND provides a more convenient way to access a given file.
Add the necessary calls to sumSquares.c to separately time the read step, the sum-the-squares step, and the program as a whole.
Then build the program, and when it builds without errors or warnings, execute it using each of the .txt files in /home/cs/374/exercises/04. In a spreadsheet, record the sums and times you get for each file.
The end-goal of this project is to create a program mpiSumSquares that uses MPI to perform this computation faster. Use everything you learned in this week's lab exercise to complete this task.
As in sumSquares.c, your mpiSumSquares program should use MPI_Wtime() calls to calculate the read time, the sum-the-squares time, and the total time, to indicate how much of the overall time the read and sum-the-squares steps are consuming.
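The three times can all be derived from the same three timestamps, which keeps them consistent with one another. The sketch below assumes the total is measured from just before the read to just after the sum; the struct and helper names are hypothetical, and the comment only indicates where the MPI_Wtime() calls would go.

```c
#include <assert.h>

/* Durations, in seconds, derived from three wall-clock timestamps. */
typedef struct {
    double read_time;    /* time spent reading the values */
    double sum_time;     /* time spent summing the squares */
    double total_time;   /* time from before the read to after the sum */
} phase_times;

/*
 * Hypothetical helper (not part of MPI or the course files).
 * In an MPI program the timestamps would be taken along these lines:
 *
 *     double t_start     = MPI_Wtime();   // before reading the file
 *     ... read the values ...
 *     double t_read_done = MPI_Wtime();   // after reading, before summing
 *     ... sum the squares ...
 *     double t_end       = MPI_Wtime();   // after the final sum is known
 */
phase_times phases_from_timestamps(double t_start, double t_read_done,
                                   double t_end)
{
    assert(t_start <= t_read_done && t_read_done <= t_end);

    phase_times t;
    t.read_time = t_read_done - t_start;
    t.sum_time = t_end - t_read_done;
    t.total_time = t_end - t_start;
    return t;
}
```

Because the phases share their boundary timestamps, the read time and the sum-the-squares time account for the whole of the total time, which makes it easy to see what fraction each step consumes.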
Follow a procedure similar to the one we used in the lab, running your program for 3 trials for each file-size. For each trial, record the sum, the read time, the sum-the-squares time, and the total time in a spreadsheet, and use those entries to calculate the minimum value for each time. You should get approximately the same sum-values as sumSquares.c produced.
Your sum-values may not be exactly the same because when adding and multiplying many real numbers, round-off errors may occur, and the particular round-off errors depend on the order in which the adds and multiplies occur. Since the exact order in which the adds and multiplies occur within a parallel computation is non-deterministic, you may not get exactly the same results in different executions of a parallel computation that uses real numbers.
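The effect of ordering can be seen even in a sequential program. The sketch below sums the squares of the same three values in two different orders; the function names are hypothetical, and the stated results assume IEEE 754 double-precision arithmetic.

```c
#include <stddef.h>

/* Sum the squares from the first element to the last. */
double sum_squares_forward(const double *values, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; ++i) {
        sum += values[i] * values[i];
    }
    return sum;
}

/* Sum the squares from the last element to the first. */
double sum_squares_backward(const double *values, size_t n)
{
    double sum = 0.0;
    for (size_t i = n; i > 0; --i) {
        sum += values[i - 1] * values[i - 1];
    }
    return sum;
}
```

For the values 1e8, 1.0, and 1.0, the forward order adds each 1.0 to 1e16 separately, and each addition is rounded away, giving 1e16. The backward order first combines the two small squares into 2.0, which is large enough to survive being added to 1e16, so the two results differ by 2.0. A parallel reduction can combine partial sums in a similar variety of orders.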
When your program works correctly, use your spreadsheet to record the sums, the trial-times, and the minima you get for each file, using P = 1, 2, 4, 6, and 8 processes.
For P = 2, 4, 6, and 8, use your spreadsheet to compute the parallel speedup and the parallel efficiency.
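As a reminder of the arithmetic, the sketch below uses the usual definitions: speedup is the one-process time divided by the P-process time, and efficiency is speedup divided by P. The function names are hypothetical, and which baseline time to use (the P = 1 run or the sequential sumSquares run) is an assumption you should check against the definitions given in class.

```c
#include <assert.h>

/* Speedup: baseline time t1 divided by the time tp measured with P processes. */
double speedup(double t1, double tp)
{
    assert(tp > 0.0);
    return t1 / tp;
}

/* Efficiency: speedup divided by the number of processes p. */
double efficiency(double t1, double tp, int p)
{
    assert(p > 0);
    return speedup(t1, tp) / (double)p;
}
```

For example, a baseline time of 8.0 seconds and a time of 2.0 seconds with P = 8 give a speedup of 4.0 and an efficiency of 0.5. The same two divisions can be entered directly as spreadsheet formulas.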
Using the data in your spreadsheet, create the following charts:
These four charts should allow you to explore how well your program scales as P and N change. Be sure to use descriptive titles and labels for your charts and their axes, and use chart-columns or lines that can be easily distinguished from one another when the chart is printed on a gray-scale printer.
Hard copies of:
Please staple these pages together and make certain that your name is on each page.