HPC MPI Exercise 3: Homework Project -- Monte Carlo Forest Fire


Computational models that use the Monte Carlo pattern model a physical phenomenon using randomness, probability, and the average of many trials to compute "typical" behavior. The more trials that are used, the more accurately the model describes the physical phenomenon. However, running many trials can greatly increase a simulation's runtime, especially as the size of the problem (in this case, the forest size) increases.

Recall that we only used nine burn-probabilities in the lab exercise, but there are 101 burn-probabilities (0, 1, 2, ..., 100). To accurately model the phenomenon, we must include all 101 probabilities, in case some probabilities produce unexpected behaviors.

Your homework assignment this week is to take a sequential version of the forest-fire simulation and "parallelize" it using MPI. Make a new folder for this project (e.g., 03/project); then download this tarball of the sequential model and save it in your project folder.

If your browser does not automatically extract the tarball's files, you can extract its contents manually by opening a terminal, navigating to the folder containing the tarball, and entering:

   tar xvzf Fire.tgz

Compile it using the included Makefile and then run the program.

This simulation uses 10,000 trials for each of the 101 probabilities to model the burning of a modest 20x20 forest (400 trees). When the simulation is done, the program displays a table of the 101 burn-probabilities, their corresponding percentage-burned values, and their corresponding 'burn-times' (number of iterations). It also generates two simple graphs that, for each of the 101 burn-probabilities, respectively plot the percentage-burned and 'burn-time' values. These provide you with quick visualizations of the values the program is generating.

In the Lab

  1. Add MPI timing commands to measure how long this sequential program takes in "wall clock" time. To time the full calculation, start timing before the code that initializes the random number generator, and stop timing after the for loop that calculates and prints out the 101 probability-results. (Do not include the code that plots the xgraph.)
  2. Even with a "small" (20x20) forest, this sequential simulation takes a non-trivial amount of time to perform its 10,000 trials. Rewrite this program using MPI to spread the computation of these burn-percentage and time values across P processes. You should be able to develop a good parallel solution to the problem, using the patterns we have explored so far. You may find the chunks.h file to be useful.
  3. Test and debug your program using the 20x20 forest and P = 1, 2, and 4 processes. In your spreadsheet, record the percent-burned and burn-time values that your program produces.

    Create quick charts for the percent-burned and burn-time values, and verify that these results are consistent with those produced by the sequential version, regardless of the number of processes used. (If they are markedly different, your program has a bug that you need to fix before proceeding further.)

  4. When your program is fully tested and debugged, copy it to the cluster.

On The Cluster

On the cluster, modify your program as necessary for it to build using this Makefile. Since we will be running it via SLURM, you will need to remove the x_graph variables and other code related to generating the simple plots. You should also remove the X-graph and display files from your project folder on the cluster.

When your code builds without errors or warnings, generate the SLURM file(s) needed to run your program using P = 1, 2, 4, 8, 16, 32, 64, 128, and 256 processes and three forest-sizes--20x20, 40x40, and 80x80--a total of 27 different runs. Note that the forest-size can be passed to the program via command-line arguments. Note also that these sizes are small compared to an actual forest, but they should let us see how increasing the forest size affects the performance of our model.

Since each compute node on Borg has 16 cores, use a single cluster-node for your runs of P = 1, 2, 4, 8, and 16 processes; then double the number of nodes for each subsequent run.
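For example, the P = 32 run on a 40x40 forest needs two 16-core nodes, which might be requested with a SLURM script like the following. The executable name (fire), its argument order, and the output filename are assumptions -- adjust them to match your program and SLURM files.

```shell
#!/bin/bash
# Hypothetical SLURM script: 32 processes across 2 nodes (16 cores each)
#SBATCH --nodes=2
#SBATCH --ntasks=32
#SBATCH --ntasks-per-node=16
#SBATCH --output=fire-32-40x40.out

mpirun -np 32 ./fire 40 40
```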

Data Collection and Visualization

For each forest-size:

  1. Open a new tab in your spreadsheet and using your program's output on the cluster, record your program's wall-clock run-times when using P = 1, 2, 4, 8, 16, 32, 64, 128, and 256 processes. Next to the times for which P >= 2, use your spreadsheet to calculate your program's parallel speedup and parallel efficiency. (Double- and triple-check the formulas you use to calculate the speedup and efficiency, as it's easy to get these wrong in a spreadsheet.)
  2. Open a new spreadsheet tab and, for all 101 burn-probabilities, record the burn-probability and the corresponding burn-percentages and burn-times from your runs on the cluster with P = 256.

Using your spreadsheet, create five line-charts:

  1. One that plots your simulation's wall-clock run-times against P, the number of processes used, for P = 1, 2, 4, 8, 16, 32, 64, 128, and 256.
  2. One that plots your simulation's parallel speedup against P = 1, 2, 4, 8, 16, 32, 64, 128, and 256.
  3. One that plots your simulation's parallel efficiency against P = 1, 2, 4, 8, 16, 32, 64, 128, and 256.
  4. One that plots your simulation's burn-percentage against the 101 burn-probabilities when P = 256.
  5. One that plots your simulation's burn-times (# of steps) against the 101 burn-probabilities when P = 256.
On each chart, put the first quantity (e.g., wall-clock time, speedup, efficiency, burn-percentage, or burn-time) on the vertical axis and the second quantity (e.g., P or burn-probability) on the horizontal axis.

Each chart should feature three lines--one for each forest-size--so that any effects of increasing the forest size can be seen. Be sure to use distinct line-styles (e.g., solid, dashed, dotted) so that the lines may be easily distinguished when printed on a gray-scale printer.

Since you will be submitting these charts, give each chart a descriptive title, well-labelled axes, etc. Resize each chart so that it is a full-page in size when printed, to maximize the visible details. If necessary, copy them into a word processor for printing.

Hand In

Hard copies of:

  1. A 2-3 page analysis of what this model reveals about the behavior of a forest fire, and why it behaves that way. As usual, be as quantitative as possible, citing specific data values to support each of your claims.

    If you used AI in completing this project, attach an additional 1-2 pages in which you document your use of it, as was described in the course policies.

  2. The charts you created in this exercise;
  3. The spreadsheet data you collected in this exercise;
  4. The final version of your source program from the cluster.
  5. The SLURM ".out" file produced by running your program on the cluster using 256 processes and the 20x20 forest.
Please staple these pages together and make certain that your name is on each page.




This page maintained by Joel Adams.