Last week, we parallelized calcPI, a sequential program to calculate PI using integration and the trapezoidal method. This week, we are going to start with calcPI2, which calculates PI using integration and the arctangent method, and uses the pthreads library to do the computation in parallel.
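As a reminder of the underlying math, the arctangent approach relies on the identity pi = 4 * arctan(1), which is the integral of 4/(1+x^2) from 0 to 1. The following is a minimal sequential sketch of that integration, not the calcPI2 source: the function name approximatePi and the use of the midpoint rule are assumptions made for illustration. Note that the interval count is a long long, since 10,000,000,000 does not fit in a 32-bit int.

```c
#include <assert.h>

/* Hypothetical helper (not part of calcPI2): approximates pi as the
 * integral of 4/(1+x^2) over [0,1], using the midpoint rule with the
 * given number of equal-width intervals. */
double approximatePi(long long intervals) {
    assert(intervals > 0);
    double width = 1.0 / (double)intervals;
    double sum = 0.0;
    for (long long i = 0; i < intervals; i++) {
        double x = ((double)i + 0.5) * width;   /* midpoint of interval i */
        sum += 4.0 / (1.0 + x * x);
    }
    return sum * width;                          /* scale heights by the width */
}
```

Calling approximatePi with a larger interval count should give a more precise answer, at the cost of more time, which is the trade-off the timing runs are meant to expose.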
This version of calcPI2 has each thread compute its portion of the integral in a variable localSum. At the end of the computation, the thread adds its localSum to a shared variable pi. Since each thread is reading and writing this shared variable, this addition is a critical section, so the program performs this addition using the Mutual Exclusion pattern, to avoid a race condition.
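The structure described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual calcPI2 code: the names computePortion, calcPiWithMutex, numIntervals, numThreads, and MAX_THREADS are hypothetical, and the way intervals are divided among threads (round-robin) is a guess. Only pi, piLock, and localSum come from the assignment.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

#define MAX_THREADS 64                 /* hypothetical upper bound */

static double pi = 0.0;                /* shared result */
static pthread_mutex_t piLock = PTHREAD_MUTEX_INITIALIZER;
static long long numIntervals;         /* hypothetical shared settings */
static int numThreads;

/* Each thread integrates 4/(1+x^2) over its share of the intervals. */
static void *computePortion(void *arg) {
    long long id = (long long)(intptr_t)arg;
    double width = 1.0 / (double)numIntervals;
    double localSum = 0.0;
    for (long long i = id; i < numIntervals; i += numThreads) {
        double x = ((double)i + 0.5) * width;
        localSum += 4.0 / (1.0 + x * x);
    }
    localSum *= width;

    /* Critical section: only one thread at a time may update pi. */
    pthread_mutex_lock(&piLock);
    pi += localSum;
    pthread_mutex_unlock(&piLock);
    return NULL;
}

/* Hypothetical driver: starts the threads, waits for them, returns pi. */
double calcPiWithMutex(long long intervals, int threads) {
    assert(intervals > 0 && threads > 0 && threads <= MAX_THREADS);
    pthread_t tids[MAX_THREADS];
    pi = 0.0;
    numIntervals = intervals;
    numThreads = threads;
    for (int t = 0; t < threads; t++) {
        pthread_create(&tids[t], NULL, computePortion, (void *)(intptr_t)t);
    }
    for (int t = 0; t < threads; t++) {
        pthread_join(tids[t], NULL);
    }
    return pi;
}
```

Because each thread touches the lock only once, the threads add their results to pi one after another, so the final summation takes time proportional to the number of threads.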
Using a spreadsheet, record the precision of the answer and the time required to compute it, using 1; 10; 100; 1,000; 10,000; 100,000; 1,000,000; 10,000,000; 100,000,000; 1,000,000,000; and 10,000,000,000 intervals. Record these timings using 1, 2, 4, 8, and 16 threads. This will provide a baseline for Part II.
Part II of this week's assignment is to replace that Mutual Exclusion pattern with a Reduction pattern, to see if we can improve the program's performance. More precisely, you should replace the lines:
    pthread_mutex_lock(&piLock);
    pi += localSum;
    pthread_mutex_unlock(&piLock);

with something like this:

    pthreadReductionSum(localSum, &pi);

that reliably achieves the same result, but using a tree-like parallel summation instead of a mutually exclusive sequential summation.
Unfortunately, the pthreads library does not provide any constructs that implement the Reduction pattern, so you will need to implement this yourself. There are different ways this can be done; do some research to look for a speedy way. Depending on the approach you use, you may find the Barrier pattern to be useful. If you need more information about pthreads, this tutorial by Blaise Barney is a good resource.
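To make the idea of a tree-like summation concrete, here is one possible sketch that combines a shared array of partial sums with the Barrier pattern. It is only an illustration of one approach, not the required solution and not necessarily the fastest one. Every name in it (treeSumSketch, runTreeSumDemo, partial, sketchBarrier, and so on) is hypothetical, and unlike pthreadReductionSum() it takes the thread's id as an extra parameter. It assumes a platform that provides pthread_barrier_t, which is an optional part of POSIX (Linux provides it; some other systems do not).

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

#define SKETCH_MAX_THREADS 64

static double partial[SKETCH_MAX_THREADS];  /* one slot per thread */
static pthread_barrier_t sketchBarrier;
static int sketchThreads;
static double total;

/* Every thread calls this once with its own id in [0, sketchThreads).
 * At each level, threads whose id is a multiple of 2*stride add in the
 * slot that lies stride positions to the right, so the number of
 * levels grows with log2 of the thread count. */
static void treeSumSketch(double localSum, int id, double *result) {
    partial[id] = localSum;
    pthread_barrier_wait(&sketchBarrier);    /* all slots are now filled */
    for (int stride = 1; stride < sketchThreads; stride *= 2) {
        if (id % (2 * stride) == 0 && id + stride < sketchThreads) {
            partial[id] += partial[id + stride];
        }
        pthread_barrier_wait(&sketchBarrier); /* finish level before next */
    }
    if (id == 0) {
        *result = partial[0];                /* thread 0 holds the total */
    }
}

/* Demo worker: thread id contributes the value id + 1. */
static void *sketchWorker(void *arg) {
    int id = (int)(intptr_t)arg;
    treeSumSketch((double)(id + 1), id, &total);
    return NULL;
}

/* Hypothetical driver: sums 1 + 2 + ... + threads using the tree. */
double runTreeSumDemo(int threads) {
    assert(threads > 0 && threads <= SKETCH_MAX_THREADS);
    pthread_t tids[SKETCH_MAX_THREADS];
    sketchThreads = threads;
    total = 0.0;
    pthread_barrier_init(&sketchBarrier, NULL, (unsigned)threads);
    for (int t = 0; t < threads; t++) {
        pthread_create(&tids[t], NULL, sketchWorker, (void *)(intptr_t)t);
    }
    for (int t = 0; t < threads; t++) {
        pthread_join(tids[t], NULL);
    }
    pthread_barrier_destroy(&sketchBarrier);
    return total;
}
```

In this sketch the result is only safe to read after the threads have been joined. The barrier at every level is also a cost in its own right, so whether this beats the mutex version is exactly what the timing runs should tell you.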
Store your pthreadReductionSum() function in its own header file (e.g., pthreadReduction.h), so that you can easily use it in another program if the need arises. Your function should not use any non-local variables except for ones defined in pthreadReduction.h.
Once you have finished it, time this version of your program using the same numbers of intervals and threads as before.
Then use your program to collect the timing data needed for the 3D charts described below. For these timing runs, you should use the Ulab machines.
This is a one-week project.
Hard copies of: