Begin by typing
man -k pthread
and
man pthread_create
and skim the information that is displayed, particularly the signature of the pthread_create() function.
When you have a basic understanding of what pthread_create does and how it works, create a new directory and save copies of pi.c and Makefile there.
The file pi.c contains a program that computes an approximation of PI using one or more threads. Since PI == 4 * arctan(1) and arctan(x) == the integral from 0 to x of 1/(1+t*t) dt, the program calculates a numerical approximation of this integral (with x == 1) and uses it to approximate PI. Take a few minutes to study the program. Don't proceed until you understand what it is doing, and why.
On your workstation, compile the program using the Makefile to ensure that it compiles correctly. Execute the program using the command:
./a.out 1000 1
and verify that it works as it should. The "usage" for this program is as follows:
./a.out intervals threads
where intervals is the number of "pieces" into which we divide the integral computation, and threads is the number of threads across which we spread the computation of those "pieces".
Note that the precision of our approximation of PI is rather limited. PI is an irrational (in fact transcendental) number, so its exact value cannot be written out as a finite decimal; we must always deal with an approximation. The question is how precise our approximation is.
Execute the program again, increasing the first command-line argument by a factor of ten:
./a.out 10000 1
What happens to the precision of PI?
Keep increasing the command-line argument by a factor of ten until (a) the program takes longer than five seconds; or (b) the approximation of PI is as accurate as the value given for comparison purposes. Record this intervals value.
Time the program by running
time ./a.out yourIntervals 1
where yourIntervals is your intervals value from Part I. Record the real and user time values in a spreadsheet.
Repeat this procedure using 2, 4, and 8 for the number of threads. Using your spreadsheet, compute the speedup for 2, 4, and 8 threads, that is, the 1-thread time divided by the time using that many threads. Create a line chart showing the real execution time using 1, 2, 4, and 8 threads.
Log in to acolyte and recompile your program there.
Log in to ohm.calvin.edu (the cluster's head node) and recompile your program there. Repeat the "experiment" a final time, using your modified version of the program. Add a final series to your spreadsheet and a line to your chart with the results of this "experiment."
Have a good holiday!
(Optional.) For extra credit, rewrite the circuit program as a shared-memory (multithreaded) application. Compare its performance to that of the MPI version we wrote previously.
To learn more about POSIX threads, see the LLNL POSIX Threads Programming tutorial.