CPS343: Homework on using a Monte Carlo method to estimate π
Homework on using a Monte Carlo method to estimate π
1 Problem Overview
To prepare to write parallel programs we first examine a simple model problem. This problem was
selected because it is embarrassingly parallel, meaning that nearly all the computation it requires
can be separated into segments of work that can be done completely independently of one another.
While many important problems do not have this property, some do, and it gives us a good way
to focus on the various types of parallelism.
Suppose we want to compute an estimate of π, the ratio of the circumference of a circle to its
diameter. As you probably know, the area of a circle of radius r is A = πr^2. A circle
with r = 1 is called the unit circle and has area π, so the area of one fourth of the unit circle is just
π/4. The left side of Figure 1 shows a unit square, that is, a square with sides of length one, with
an embedded quarter unit circle. On the right is the same square with 500 random points plotted
inside it.

Figure 1: Left: quarter of unit circle in unit square; Right: unit square with 500 random points

The ratio of the number of points inside the quarter circle to the total number of points
in the square should ideally be the same as the ratio of the area of the quarter circle to the area of
the square. This means

\[
\pi = 4 \cdot \frac{\text{area of quarter unit circle}}{\text{area of square}}
\approx 4 \cdot \frac{\text{number of points in quarter unit circle}}{\text{number of points in square}}
\]

which gives us a reasonable way to estimate π. The more random points we generate, the better
our estimate will be. Generating random points in the unit square is easy; many pseudorandom
number generators [1] (PRNGs) return values in the interval [0, 1), which is just what we need. The
point (x, y) will be inside the unit circle if x^2 + y^2 < 1. Pseudocode for the algorithm is
count = 0
for i = 1 to number_of_samples do
    x = random value from [0,1)
    y = random value from [0,1)
    if x * x + y * y < 1 then
        count = count + 1
    endif
endfor
estimate_of_pi = 4 * count / number_of_samples
As already noted, the more points we generate, the better our estimate will be. This is a natural
place to exploit parallelism since the generation of each point is independent of the generation of
all other points. We can create separate tasks that each compute a number of points and count
the number within the circle, then add all of these together before estimating π.
2 Monte Carlo Simulation
This process is an example of Monte Carlo simulation [2], a very important technique used in problems
ranging from models of plasmas and surface properties in physics to weather forecasting to economic
modeling. All Monte Carlo simulations have one thing in common: the generation of enough
pseudorandom data to provide a desired level of accuracy from the simulation. Since more random
data leads to more accurate models, “enough” random data usually translates to “as much as
possible, as fast as possible.”
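For the π estimate above, "enough" can be made concrete with a standard binomial argument. This derivation is an addition for illustration, and it assumes ideal, independent, uniformly distributed samples. Each point lands inside the quarter circle with probability π/4, so with C the number of points inside and N the number of samples:

```latex
\[
\hat{\pi}_N = \frac{4C}{N}, \qquad C \sim \mathrm{Binomial}\!\left(N, \tfrac{\pi}{4}\right)
\]
\[
\operatorname{Var}(\hat{\pi}_N)
  = \frac{16}{N}\cdot\frac{\pi}{4}\left(1 - \frac{\pi}{4}\right)
  = \frac{\pi(4-\pi)}{N},
\qquad
\sigma(\hat{\pi}_N) = \sqrt{\frac{\pi(4-\pi)}{N}} \approx \frac{1.64}{\sqrt{N}}
\]
```

For N = 5 × 10^7 samples the typical error is therefore about 2.3 × 10^-4, and halving it takes roughly four times as many samples.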
2.1 Random number generation
Monte Carlo simulations typically require many random numbers. Computers cannot generate true
random numbers without special hardware, but they can generate sequences of pseudorandom numbers, that is, numbers whose distribution appears random according to various statistical measures. There
are many ways to do this, some better than others in the sense that they generate longer sequences
before they repeat or exhibit other patterns not expected from a sequence of true random numbers.
Most programming languages supply pseudorandom number generators (PRNGs, or sometimes
just called RNGs). Using a PRNG in a program usually consists of setting a seed for
the generator. Pseudorandom numbers are then generated by applying a function to the seed and
generating a new seed from the old one. When started with the same seed, a PRNG will produce
exactly the same sequence of numbers.
As already suggested, some PRNGs are better than others. For example, the C library on Unix-like
systems has historically provided two different functions, rand() (ISO C) and random() (POSIX), to
produce random integers between 0 and some machine-dependent upper bound. While recent
implementations of these functions are equivalent, this was not always the case: sequences generated
by older implementations of rand() were significantly less “random” than those produced by random().

[1] http://en.wikipedia.org/wiki/Pseudorandom_number_generator
[2] http://en.wikipedia.org/wiki/Monte_Carlo_method
2.2 Thread Safe PRNGs
Another important issue is that neither of these functions is thread-safe [3]. This means that it
is possible for multiple threads to obtain the same pseudorandom values from them. Normally a
PRNG uses a single seed, or perhaps a more complicated structure, to maintain its state. If one
thread requests a random number before another thread is finished updating the generator’s state,
the same number can be returned to both threads and/or the state of the PRNG may not be
updated properly. One way to make random number generation safe is to make sure each thread
maintains its own PRNG state. Several robust thread-safe PRNGs are found in the GNU Scientific
Library [4]. There is a thread-safe PRNG included in the GNU C library (drand48_r()), but it does
not perform well (in terms of speed) in certain OpenMP applications.
The following code demonstrates how to set up and use the GSL PRNG to generate uniformly
distributed pseudorandom numbers in [0, 1):
// Set up random number generator using system time as seed
gsl_rng* rng = gsl_rng_alloc( gsl_rng_default );
gsl_rng_set( rng, time( NULL ) );

// Get and display samples
for ( long i = 0L; i < numSamples; i++ )
{
    printf( "%20.16f\n", gsl_rng_uniform( rng ) );
}

// Clean up
gsl_rng_free( rng );
Programs using the GSL PRNG should also #include <gsl/gsl_rng.h> and should be linked with
the -lgsl -lgslcblas (or some other CBLAS) libraries.

[3] http://en.wikipedia.org/wiki/Thread_safety
[4] http://www.gnu.org/software/gsl/

3 Assignment
Write a program in C or C++ to estimate π using the Monte Carlo approach just described. We’ll
eventually write different parallel versions of this program, but for this assignment you should focus
on the following:
a. Your program should accept the desired number of random points to generate as a command-line option using the -n switch. See the file
http://www.cs.gordon.edu/courses/cps343/assignments/gen_rand_samples.cc
for how to use the getopt() function to do this. You might also find the man page helpful:
type man 3 getopt to get it.
b. Use the wtime() function that was supplied in the Git repository for our first hands-on exercise (https://github.com/gordon-cs/cps343-hoe/tree/master/00-calculate-pi) to
compute the elapsed time for the computation of π. See the pi_serial.cc file in the same
repository for example usage.
c. Have your program report the following information on a single line: (1) the estimate of π,
(2) the error between the estimate and the exact value (include math.h in C or cmath in
C++ and use the constant M_PI for the exact value), (3) the elapsed time in seconds, and
(4) the number of samples used. For example, the output might be
Pi: 3.1417392800, error: -1.466e-04, seconds: 0.7675, samples: 50000000
The printf() format string for this output is
"Pi: %12.10f, error: %10.3e, seconds: %g, samples: %ld\n"
d. You might find it helpful later to have the option of producing only numbers in the output,
something like
3.1414080000  1.847e-04   0.667400 50000000
with format string
"%12.10f %10.3e %10.6f %ld\n"
Modify the getopt()-section of your program to handle the command-line switch -q (quiet).
When supplied, this should toggle a boolean variable that is used to determine whether or
not to display descriptive text in the output. The default behavior should be to display the
labels.