SI 486L Lab 5: Life Goes On

This lab is due prior to our next lab period. All the files should be zipped or tar'd into a single file and e-mailed to the instructor as an attachment (the subject line should contain the phrase "lab 5"). One of the files should be a makefile with a default target that builds an MPI executable called "life".

Introduction

Earlier this semester you built a cellular automaton simulating a simple life form. Your task is to convert it to run in parallel across up to 8 processes. See Lab 2 for background on the simulation "Life". Your MPI version should be able to handle a large board, because you will divide it into as many as 8 sub-boards with each processor computing its own portion of the board.

Tasks

Your objective is to modify your C version of the game of Life, using MPI, to make it a parallel application. As in the earlier lab, the first line of the input file gives the size of the "board", i.e., the dimensions of the 2-D grid. It is followed by an arbitrary number of lines containing x and y value pairs that indicate which cells are populated at the start of the simulation. Here's a very simple example of input:

    1600 400
    3 1
    3 2
    3 3

This declares a 1600 x 400 grid and has only 3 populated cells: (3,1), (3,2) and (3,3).

Design

The program should take one argument on the command line, an integer. The argument (argv[1]) is the number of generations to run the simulation. The final values should be written to stdout in the format of an input file for your colorIt program from an earlier lab. The input to the program is read from stdin. Your version may support a filename specified as an argument, but that isn't necessary today. Here's an example of invoking the program:

    mpirun -n 8 ./life 500 < huge.data

This invokes, as an MPI parallel application, 8 copies of the binary "life" in the current directory, to run 500 generations. The input comes from stdin, redirected from the file huge.data in the current directory.

Here's a simple design (in pseudocode) for the main() function:

    main
        parse command line argument
        initialize the boards
        for each generation
            halo exchange
            compute the next generation
            switch boards
        consolidate the results

Dividing up the work and sharing it among the processors can be tricky. We'll make it easier on ourselves by dividing along just the x-axis, so each rank owns a vertical strip of the board and shares a boundary column with its left and right neighbors. We'll use periodic boundaries (wrap around) both on the top/bottom (internally) and between the first and last boards.

Step #1

Read in all the points on rank 0 and send all the x-y pairs to all the ranks using MPI_Bcast; let each rank select its own points from the master list. That's easier (I hope) than sending each rank its own unique points. How will each rank know how much space to allocate for the Bcast receive buffer? You'll have to broadcast that count first.

Step #2

Have each rank treat its coordinates as if they were the base grid. Because of the halo, the lowest numbered location will be (1,1). If the 4th board starts at, say, (1501, 1) for sub-grids that are 500 cells across, have the program run as if it started at (1,1). The input section can find its points and map them to the base dimensions (e.g., subtract the board's starting offset, 1500 here, from the x coordinate). That way, all array accesses work the same way regardless of which section of the board each rank uses. A sketch of Steps #1 and #2 follows.
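Here is a minimal sketch of Steps #1 and #2, assuming the points are kept as interleaved x,y pairs and that the board width divides evenly among the ranks; the names (read_and_distribute_points, local_width, x_offset, and so on) are illustrative, not part of the assignment.

    /* Sketch of Steps #1 and #2: broadcast the point list from rank 0,
     * then let each rank keep and remap only the points in its slice. */
    #include <mpi.h>
    #include <stdio.h>
    #include <stdlib.h>

    void read_and_distribute_points(int rank, int nprocs, int board_width,
                                    int **my_points, int *my_count)
    {
        int npoints = 0;
        int *points = NULL;                       /* interleaved x,y pairs */

        if (rank == 0) {
            /* rank 0 reads "x y" pairs from stdin (the dimensions on the
               first line are assumed to have been read already) */
            int x, y, cap = 1024;
            points = malloc(2 * cap * sizeof(int));
            while (scanf("%d %d", &x, &y) == 2) {
                if (npoints == cap) {
                    cap *= 2;
                    points = realloc(points, 2 * cap * sizeof(int));
                }
                points[2 * npoints]     = x;
                points[2 * npoints + 1] = y;
                npoints++;
            }
        }

        /* Step #1: everyone needs the count before allocating the buffer */
        MPI_Bcast(&npoints, 1, MPI_INT, 0, MPI_COMM_WORLD);
        if (rank != 0)
            points = malloc(2 * npoints * sizeof(int));
        MPI_Bcast(points, 2 * npoints, MPI_INT, 0, MPI_COMM_WORLD);

        /* Step #2: keep only the points in this rank's slice and shift them
           so the slice's first column becomes local x == 1 (x == 0 is halo) */
        int local_width = board_width / nprocs;
        int x_offset    = rank * local_width;

        *my_points = malloc(2 * npoints * sizeof(int));
        *my_count  = 0;
        for (int i = 0; i < npoints; i++) {
            int x = points[2 * i], y = points[2 * i + 1];
            if (x > x_offset && x <= x_offset + local_width) {
                (*my_points)[2 * (*my_count)]     = x - x_offset;
                (*my_points)[2 * (*my_count) + 1] = y;
                (*my_count)++;
            }
        }
        free(points);
    }

Broadcasting the whole list trades a little redundant data for much simpler bookkeeping, which is exactly the trade-off Step #1 suggests.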
Step #3

The halo exchange will consist of two parts: internal copies for the top and bottom rows, and MPI calls to copy the neighboring edge columns.

    halo exchange
        copy the top row to the bottom halo and the bottom row to the top halo
        copy the left and right boundary columns to the neighboring ranks

Be sure to include the halo cells when copying a column or row. A sketch of the exchange follows.
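Here is a minimal sketch of the exchange, assuming the local board is a flat char array indexed as board[x * (height + 2) + y], with x running 0..width+1 and y running 0..height+1 (x == 0, x == width+1, y == 0 and y == height+1 are halo cells). With that layout each column of constant x is contiguous, so the left/right edges can be sent without packing. The function and variable names are illustrative.

    #include <mpi.h>

    /* Sketch of Step #3: wrap the top/bottom rows locally, then exchange
     * the left/right boundary columns with the neighboring ranks. */
    static void halo_exchange(char *board, int width, int height,
                              int rank, int nprocs)
    {
        int stride = height + 2;                    /* one column of cells */
        int left   = (rank + nprocs - 1) % nprocs;  /* periodic neighbors  */
        int right  = (rank + 1) % nprocs;

        /* internal part: wrap the top and bottom rows of the real columns */
        for (int x = 1; x <= width; x++) {
            board[x * stride + 0]          = board[x * stride + height];
            board[x * stride + height + 1] = board[x * stride + 1];
        }

        /* MPI part: send the rightmost real column to the right neighbor's
           left halo, receive our own left halo from the left neighbor.
           Sending the full column (stride cells) includes the halo rows,
           so the corner cells come along for free. */
        MPI_Sendrecv(&board[width * stride], stride, MPI_CHAR, right, 0,
                     &board[0],              stride, MPI_CHAR, left,  0,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);

        /* send the leftmost real column left, receive the right halo */
        MPI_Sendrecv(&board[1 * stride],           stride, MPI_CHAR, left,  1,
                     &board[(width + 1) * stride], stride, MPI_CHAR, right, 1,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    }

MPI_Sendrecv pairs each send with the matching receive, so neighboring ranks cannot deadlock waiting on each other; separate MPI_Send and MPI_Recv calls also work if you order them carefully.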
Step #4

After all the generations are computed you can "gather" the final results. Normally this isn't a good approach, because the root rank has to have enough memory to hold all of the boards; don't send the whole board, just send the x,y pairs where there is life. Use these data to generate a file for your colorIt lab! A sketch of this gather appears at the end of this handout.

Why

Why are we doing this? This is a simple "halo exchange", a staple of supercomputing applications that cover large grids. It gets us ready for a more complex halo, where a single rank has neighbors not just to the left and right, but above and below as well. We are dividing the work among the processors; each gets a piece of the data on which to work. This is called Data Parallelism. The computation involved in Life isn't that intensive; we aren't making the simulation run that much faster than it would on a single processor. Running the same work faster by using more processors is called Strong Scaling. However, we are able to run bigger simulations by running across multiple processors. This is called Weak Scaling.

Clean Code Counts

The code for this and other labs will be graded on two criteria: 1) does it work? and 2) is it well written? That means clean, readable code, well structured and well commented, including having your name in the comments!
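Here is a minimal sketch of the Step #4 gather, assuming each rank has already walked its final board and built live[], an array of 2 * nlive ints holding interleaved global x,y coordinates of its live cells; the names (gather_results, live, nlive) are illustrative, and any header line your colorIt input format needs would be printed before the pairs.

    #include <mpi.h>
    #include <stdio.h>
    #include <stdlib.h>

    /* Sketch of Step #4: collect only the live (x,y) pairs on rank 0. */
    static void gather_results(int *live, int nlive, int rank, int nprocs)
    {
        int mycount = 2 * nlive;                    /* ints, not pairs */
        int *counts = NULL, *displs = NULL, *all = NULL;

        if (rank == 0)
            counts = malloc(nprocs * sizeof(int));

        /* rank 0 learns how many ints each rank will contribute */
        MPI_Gather(&mycount, 1, MPI_INT, counts, 1, MPI_INT, 0, MPI_COMM_WORLD);

        int total = 0;
        if (rank == 0) {
            displs = malloc(nprocs * sizeof(int));
            for (int r = 0; r < nprocs; r++) {
                displs[r] = total;
                total += counts[r];
            }
            all = malloc(total * sizeof(int));
        }

        /* variable-length gather of the coordinate pairs themselves */
        MPI_Gatherv(live, mycount, MPI_INT,
                    all, counts, displs, MPI_INT, 0, MPI_COMM_WORLD);

        if (rank == 0) {
            for (int i = 0; i < total; i += 2)
                printf("%d %d\n", all[i], all[i + 1]);
            free(counts);
            free(displs);
            free(all);
        }
    }

MPI_Gatherv is used instead of MPI_Gather because each rank contributes a different number of pairs; only rank 0 ever holds the full list, and that list is far smaller than the whole board.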