### HPC Project 5: Applying Patterns

The program in calcPI.c calculates an approximate value for PI. However, a single process (i.e., process 0) is doing all the work. This week's project is to use the patterns you have learned about so far to speed up this computation as much as you can, so that you can calculate PI to 17 digits of precision "quickly".
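As a starting point, the sketch below shows one way the trapezoids might be divided among processes using a block decomposition, with each process summing only its own range. It is an illustration under stated assumptions, not the contents of calcPI.c: the integrand `4/(1+x^2)` on [0, 1] is assumed (the `f()` in calcPI.c may differ), the names `partialTrap`, `blockRange`, and `simulatedParallelPi` are hypothetical, and the processes are simulated by an ordinary loop so the code runs without any parallel library.

```c
#include <assert.h>
#include <math.h>

/* ASSUMED integrand: the integral of 4/(1+x^2) over [0,1] equals PI.
   The f() in the real calcPI.c may be different. */
static long double f(long double x) {
    return 4.0L / (1.0L + x * x);
}

/* Trapezoid-rule sum over trapezoids [first, last) out of n total on [a, b].
   Each trapezoid evaluates f twice; simple rather than fast. */
long double partialTrap(long double a, long double b,
                        long long n, long long first, long long last) {
    assert(n > 0 && first >= 0 && last <= n);
    long double h = (b - a) / n;          /* width of one trapezoid */
    long double sum = 0.0L;
    for (long long i = first; i < last; ++i) {
        long double x0 = a + i * h;
        long double x1 = a + (i + 1) * h;
        sum += 0.5L * (f(x0) + f(x1));
    }
    return sum * h;
}

/* Block decomposition: rank owns the half-open index range [*first, *last).
   The first (n % numProcs) ranks each take one extra trapezoid. */
void blockRange(long long n, int rank, int numProcs,
                long long *first, long long *last) {
    assert(numProcs > 0 && rank >= 0 && rank < numProcs);
    long long base = n / numProcs;
    long long extra = n % numProcs;
    *first = rank * base + (rank < extra ? rank : extra);
    *last = *first + base + (rank < extra ? 1 : 0);
}

/* Simulates numProcs processes with a loop: each "process" integrates its
   own block, and the running total stands in for a sum reduction. */
long double simulatedParallelPi(long long n, int numProcs) {
    long double total = 0.0L;
    for (int rank = 0; rank < numProcs; ++rank) {
        long long first, last;
        blockRange(n, rank, numProcs, &first, &last);
        total += partialTrap(0.0L, 1.0L, n, first, last);
    }
    return total;
}
```

Calling `simulatedParallelPi(1000, 4)` should return nearly the same value as the single-range call `partialTrap(0.0L, 1.0L, 1000, 0, 1000)`. In a real parallel version, each process would call `blockRange` and `partialTrap` once with its own rank, and the partial sums would be combined with a reduction (in MPI, for example, `MPI_Reduce` with `MPI_SUM`). Whether `long double` actually provides 17 digits depends on the platform.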

This is not an especially difficult task, so this is a one-week project.

Once you have your parallel version of calcPI working, use it to find how many trapezoids and processes are needed to compute PI to 17 digits of precision "quickly".

Then use your program to collect the timing data needed for the 3D chart described below. For these timing runs, you may use either the Ulab or the cluster.

#### Hand In

Hard copies of:

1. Your spreadsheet data and three 3D bar charts -- one showing your timing data, one showing your program's speedup, and one showing its computational efficiency. In each chart, the X-axis is the number of processes (1, 2, 4, 8, 16, and 32); the Z-axis is the number of trapezoids (100000, 1000000, 10000000, 100000000, 1000000000, and 10000000000); and the Y-axis is your program's run-time, speedup, or computational efficiency (depending on the chart) for the corresponding X and Z values. As always, label each chart and its axes, and format each chart so that it displays well when printed on the Ulab printer.
2. A 1-2 page analysis of your program's behavior, exploring the relationship between the number of PEs and the number of trapezoids:
   - Discuss your projections on how long it would take to calculate PI to 20 and 17 digits sequentially, and how fast you were able to compute the 17-digit value using parallelism.
   - Discuss Intel's Parallel Advisor, what it revealed about the runtime behavior of calcPI, and how you used that information to parallelize the program.
   - Discuss your program's run-times, speedups, and computational efficiencies, and explain how your observations correspond to or disagree with Amdahl's and Gustafson's Laws.
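For the speedup and efficiency charts and the comparison with Amdahl's and Gustafson's Laws, the standard definitions can be sketched in C as follows. The function names are hypothetical, and the sequential fraction `s` is a value you would have to estimate from your own measurements; nothing here comes from calcPI.c.

```c
#include <assert.h>

/* Speedup: time on 1 process divided by time on p processes. */
double speedup(double t1, double tp) {
    assert(t1 > 0.0 && tp > 0.0);
    return t1 / tp;
}

/* Efficiency: speedup per process; 1.0 means every PE is fully used. */
double efficiency(double t1, double tp, int p) {
    assert(p > 0);
    return speedup(t1, tp) / p;
}

/* Amdahl's Law: predicted speedup for a FIXED problem size, where s is
   the fraction of the sequential run-time that cannot be parallelized. */
double amdahlSpeedup(double s, int p) {
    assert(s >= 0.0 && s <= 1.0 && p > 0);
    return 1.0 / (s + (1.0 - s) / p);
}

/* Gustafson's Law: scaled speedup when the problem size GROWS with p,
   where s is the sequential fraction of the parallel run-time. */
double gustafsonSpeedup(double s, int p) {
    assert(s >= 0.0 && s <= 1.0 && p > 0);
    return p - s * (p - 1);
}
```

For example, a run that takes 8 seconds on 1 process and 2 seconds on 4 processes has a speedup of 4 and an efficiency of 1.0. Reading a chart along the process axis for a fixed trapezoid count is the setting Amdahl's Law describes, while comparing runs in which the trapezoid count grows with the process count is closer to Gustafson's setting.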