Chapter 1
Probability Theory and Random Number Generation
S. B. Santra
1.1 A few definitions
In an experiment, suppose the outcomes are {A_1, A_2, · · ·, A_k} and the experiment is performed repeatedly, say N times, N ≫ 1. The probability for the outcome A_k is given by

p_k = \lim_{N \to \infty} \frac{n_k}{N}    (1.1)

where the outcome A_k appeared n_k times out of N experiments. The probability of occurrence must lie between 0 and 1: 0 ≤ p_k ≤ 1. p_k = 0 means A_k never occurred and p_k = 1 means A_k always occurred. The sum of the probabilities of occurrence of all the states must be 1: \sum_k p_k = 1. The basic definitions of probability can be found in Refs. [1, 2].
There are two ways of interpreting the above equation. First, we perform the same
experiment over and over again, altogether N times. The experiments are carried
out at different times, one after another. The number n_k is the number of times the outcome A_k occurred in this sequence of experiments. Second, we envisage N identical (indistinguishable) systems (an ensemble) and perform the same experiment on all N systems at the same time. The number n_k is then the number of systems that yield the outcome A_k. For sufficiently large N, the final result would be the same for both methods (ergodicity principle). If there is no apparent reason for one
outcome to occur more frequently than another, then their respective probabilities
are assumed to be equal (equal a priori probability) [1] .
1.1.1 Conditional probability: Dependent and independent events:
Conditional probability measures the effect (if any) on the occurrence of an event E2
when it is known that an event E1 has occurred in the experiment. The conditional
probability p(E2 |E1 ) is defined as
p(E_2|E_1) = \frac{p(E_1 E_2)}{p(E_1)}    (1.2)

where p(E1 E2) is the probability for both E1 and E2 to occur and p(E1) ≠ 0. In general, p(E1|E2) ≠ p(E2|E1).
If the occurrence or nonoccurrence of E1 does not affect the probability of occurrence of E2, then p(E2|E1) = p(E2) and we say that E1 and E2 are independent events; otherwise, they are dependent events. The probability that both the events E1 and E2 occur is given by p(E1 E2) = p(E1) p(E2|E1) for dependent events and by p(E1 E2) = p(E1) p(E2) for independent events.
For three dependent events E1 , E2 and E3 , we have
p(E1 E2 E3 ) = p(E1 )p(E2 |E1 )p(E3 |E1 E2 )
(1.3)
where p(E1 ) is the probability for E1 to occur, p(E2 |E1 ) is the probability for E2 to
occur given that E1 has occurred, and p(E3 |E1 E2 ) is the probability of E3 to occur
given that both E1 and E2 have occurred. For three independent events,
p(E1 E2 E3 ) = p(E1 )p(E2 )p(E3 ).
(1.4)
For N such independent events,

p(E_1 \cdots E_N) = \prod_{i=1}^{N} p(E_i).    (1.5)
Example 1. Consider tossing a fair die a large number of times. What is the probability to get a 6 in all three of the 10^3-th, 10^4-th and 10^5-th tosses?
The three events are independent. Thus, the probability to have a 6 in all three tosses is

p(6, 6, 6) = p(6)\, p(6)\, p(6) = \frac{1}{6} \times \frac{1}{6} \times \frac{1}{6} = \frac{1}{216}.    (1.6)
Example 2. Suppose that a box contains 3 white balls and 2 black balls. Two balls are drawn successively from the box without replacement. What is the probability that both balls are black?
The events are dependent events. Thus, the probability that both balls are black is
p(\text{black}, \text{black}) = p(\text{black})\, p(\text{black}|\text{black}) = \frac{2}{5} \times \frac{1}{4} = \frac{1}{10}.    (1.7)
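The result can also be checked by a quick direct simulation. The following is a minimal Python sketch (the function name, ball labels and number of trials are illustrative, not part of the text):

import random

def both_black(trials=100000):
    """Estimate p(black, black) when two balls are drawn without replacement."""
    box = ["white"] * 3 + ["black"] * 2      # 3 white and 2 black balls
    hits = 0
    for _ in range(trials):
        draw = random.sample(box, 2)         # two successive draws, no replacement
        if draw[0] == "black" and draw[1] == "black":
            hits += 1
    return hits / trials

print(both_black())   # should be close to 2/5 * 1/4 = 0.1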
1.1.2 Mutually exclusive events:
Two or more events are called mutually exclusive if the occurrence of any one of them excludes the occurrence of the others. Thus, if E1 and E2 are mutually exclusive events, then p(E1 E2) = 0.
If (E1 + E2) denotes the event that either E1 or E2 or both occur, then

p(E1 + E2) = p(E1) + p(E2) − p(E1 E2).    (1.8)

In particular,

p(E1 + E2) = p(E1) + p(E2)    (1.9)

for mutually exclusive events.
As an extension of this, if E1, E2, · · ·, EN are N mutually exclusive events, then the probability of occurrence of either E1 or E2 or · · · or EN is

p(E_1 + \cdots + E_N) = \sum_{i=1}^{N} p(E_i).    (1.10)
Example 3. Consider a deck of 52 cards. (a) What is the probability of drawing a King or a Queen in a single draw? (b) What is the probability of drawing either a King or a Spade or both in a single draw?
(a) The events are mutually exclusive. Thus, the probability to have a King or a Queen in a single draw is

p(\text{King} + \text{Queen}) = p(\text{King}) + p(\text{Queen}) = \frac{1}{13} + \frac{1}{13} = \frac{2}{13}.    (1.11)
(b) The events are not mutually exclusive. Thus, the probability of drawing either a King or a Spade or both in a single draw is

p(\text{King} + \text{Spade}) = p(\text{King}) + p(\text{Spade}) − p(\text{King}, \text{Spade}) = \frac{4}{52} + \frac{13}{52} − \frac{1}{52} = \frac{4}{13}.    (1.12)
Problem: Three balls are drawn successively from a box containing 6 red balls, 4
white balls and 5 blue balls. Find the probability that they are drawn in the order
red, white and blue if each ball is (a) replaced and (b) not replaced.
[Ans: (a) 8/225, (b) 4/91]

1.1.3 Expectation and variance of random events:
The outcome of certain random experiments may be represented by some real variable xi . The expectation value of this random variable is given by
\langle x \rangle = \sum_{i=1}^{N} p_i x_i    (1.13)
where pi is the probability for a real variable xi to occur. This is also called the
mean or average value of the variable x.
The variance of the measurement is

\mathrm{var}(x) = \sigma^2 = \langle (x − \langle x \rangle)^2 \rangle = \langle x^2 \rangle − \langle x \rangle^2, \qquad \langle x^2 \rangle = \sum_{i=1}^{N} p_i x_i^2    (1.14)
where σ is called the standard deviation.
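For illustration, the sums in Eqs. (1.13) and (1.14) can be evaluated directly; the following minimal Python sketch (a fair six-sided die is used only as an example) is one way to do so.

import numpy as np

def mean_and_variance(x, p):
    """Expectation and variance of a discrete random variable taking
    values x_i with probabilities p_i (Eqs. 1.13 and 1.14)."""
    x, p = np.asarray(x, float), np.asarray(p, float)
    mean = np.sum(p * x)                    # <x>
    var = np.sum(p * x**2) - mean**2        # <x^2> - <x>^2
    return mean, var

faces = np.arange(1, 7)          # a fair die: outcomes 1..6
probs = np.full(6, 1 / 6)        # equal a priori probabilities
print(mean_and_variance(faces, probs))   # (3.5, 2.9166...)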
1.2 Probability Distributions:

1.2.1 Binomial distribution:
If p is the probability that an event will happen in any single trial and q = 1 − p is
the probability that the event will not happen, then the probability that the event
will happen exactly n times in N independent identical trials is given by
P(n) = \binom{N}{n} p^n (1 − p)^{N−n}    (1.15)

where \binom{N}{n} are the binomial coefficients.
Problem: Calculate the mean ⟨x⟩ and variance σ² of N samples of a variable x which is binomially distributed. [Ans. ⟨x⟩ = Np, σ² = Np(1 − p)]
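Eq. (1.15) and the quoted answers are easy to check numerically; the sketch below is a minimal Python example with illustrative parameters (N = 20, p = 0.3).

from math import comb
import numpy as np

def binomial_pmf(n, N, p):
    """P(n) of Eq. (1.15): probability of exactly n successes in N trials."""
    return comb(N, n) * p**n * (1 - p)**(N - n)

N, p = 20, 0.3
n = np.arange(N + 1)
P = np.array([binomial_pmf(k, N, p) for k in n])

mean = np.sum(n * P)                  # should equal N*p = 6.0
var = np.sum(n**2 * P) - mean**2      # should equal N*p*(1-p) = 4.2
print(P.sum(), mean, var)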
1.2.2 Normal distribution:
The normal or Gaussian distribution is given by

p(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left[ −\frac{(x − \mu)^2}{2\sigma^2} \right]    (1.16)

where µ is the mean and σ is the standard deviation. The total area between the distribution function and the x-axis is 1:
\int_{−\infty}^{+\infty} p(x)\, dx = 1.    (1.17)
The mean and the variance of the distribution are defined as

\langle x \rangle = \int_{−\infty}^{+\infty} x\, p(x)\, dx \qquad \text{and} \qquad \sigma^2 = \int_{−\infty}^{+\infty} (x − \langle x \rangle)^2\, p(x)\, dx.    (1.18)
If N is large and if neither p nor q is too close to zero, the binomial distribution can be approximated by a normal distribution with mean Np and standard deviation \sqrt{Npq}.
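This approximation is easily verified numerically; the minimal Python sketch below (illustrative parameters only) compares the binomial P(n) of Eq. (1.15) with the Gaussian of Eq. (1.16) evaluated at integer n.

from math import comb, exp, pi, sqrt

N, p = 100, 0.4
q = 1 - p
mu, sigma = N * p, sqrt(N * p * q)       # mean Np and standard deviation sqrt(Npq)

for n in range(35, 46):
    binom = comb(N, n) * p**n * q**(N - n)
    gauss = exp(-(n - mu)**2 / (2 * sigma**2)) / sqrt(2 * pi * sigma**2)
    print(n, round(binom, 5), round(gauss, 5))   # the two columns nearly coincide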
Problem: Calculate the mean ⟨x⟩ and variance var(x) of a variable x which is normally distributed. [Ans. ⟨x⟩ = µ, var(x) = σ²]
1.2.3 Poisson distribution:
The Poisson distribution is given by

p(x = n) = \frac{\lambda^n}{n!} \exp(−\lambda), \qquad n = 0, 1, 2, \cdots    (1.19)

where λ is a given constant.
In the binomial distribution, if N is large while the probability p of the occurrence of an event is close to zero, so that q = 1 − p is close to 1, the event is called a rare event. In such cases the binomial distribution is very closely approximated by the Poisson distribution with λ = Np.
Since there is a relation between the binomial and normal distribution, it follows
that there also is a relation between the Poisson and normal distribution. It can
be shown that the Poisson distribution approaches a normal distribution with mean
⟨x⟩ = λ and σ² = λ as λ increases indefinitely.
Problem: Calculate the mean ⟨x⟩ and variance σ² of a variable x which follows the Poisson distribution. [Ans. ⟨x⟩ = λ, σ² = λ]
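The rare-event limit can be checked in the same spirit: for large N and small p, the binomial probabilities approach the Poisson probabilities with λ = Np. A minimal Python comparison with illustrative parameters:

from math import comb, exp, factorial

N, p = 1000, 0.005            # large N, small p  ->  lambda = N*p = 5
lam = N * p
for n in range(10):
    binom = comb(N, n) * p**n * (1 - p)**(N - n)
    poisson = lam**n * exp(-lam) / factorial(n)    # Eq. (1.19)
    print(n, round(binom, 6), round(poisson, 6))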
1.3 Central limit theorem:
If a set of random variables x1, x2, · · ·, xN, all independent of each other and with finite variance, are drawn from the same distribution, the sample mean ⟨x⟩ = \sum_i x_i/N in the limit N → ∞ will always be distributed according to a Gaussian distribution with mean µ and standard deviation σ/√N, where µ and σ are the mean and standard deviation of the distribution from which the xi were drawn, irrespective of the form of that distribution. This behavior is known as the “central limit theorem” and plays a very important role in the sampling of states of a system.
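The theorem is easily demonstrated numerically: draw many samples of size N from a non-Gaussian parent distribution (uniform numbers are used below purely as an illustration) and examine the mean and spread of the sample means. A minimal Python sketch:

import numpy as np

rng = np.random.default_rng(0)
N, samples = 100, 20000
x = rng.random((samples, N))      # uniform on [0,1): mu = 0.5, sigma^2 = 1/12
means = x.mean(axis=1)            # one sample mean per row

print(means.mean())               # close to mu = 0.5
print(means.std())                # close to sigma/sqrt(N) = 0.0289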
Problem: A population consists of five numbers 2, 3, 6, 8, and 11. Consider all
possible samples of size 2 that can be drawn with replacement from this population.
Find (a) the mean of the population, (b) standard deviation of the population, (c)
the mean of the sampling distribution of the means, (d) the standard deviation of
the sampling distribution of the means.
[Ans. (a) µ = 6.0, (b) σ² = 10.8, (c) µ_⟨x⟩ = 6.0, (d) σ²_⟨x⟩ = 5.40]
1.4 Markov process and Markov chain:
Markov processes are stochastic processes. Consider a stochastic or random process
at discrete times t1 , t2 , · · · for a system with a finite set of possible states S1 , S2 , · · · .
The probability that the system is in the state Sn at time tn, given that at the preceding times the system was in the states Sn−i at tn−i for 0 < i < n, is the conditional probability P(Sn|Sn−1, Sn−2, · · ·, S1). If this probability depends only on the immediately previous state Sn−1 and is independent of all other previous states,
then the process is called a Markov process and the series of states {S1 , S2 , · · · } is
called a Markov chain. The conditional probability that the system is in the state
Sn at time tn then should reduce to
P (Sn |Sn−1 , Sn−2 , · · · , S1 ) → P (Sn |Sn−1)
(1.20)
for a Markov chain. The conditional probability can be interpreted as the transition
probability Wij to move from state i to state j,
Wij = W (Si → Sj ) = P (Sj |Si )
(1.21)
with the further requirement

W_{ij} \ge 0, \qquad \sum_j W_{ij} = 1,    (1.22)
as usual for the transition probability. Note that we kept the transition probability
Wij = p(Sj |Si ) time independent. Such Markov chains are called “time homogeneous”. If there are N states in a Markov chain, the one-step transition probabilities
Wij s can be arranged in an N × N array called the (one-step) transition matrix W
of S and can be written as
\mathbf{W} = \begin{pmatrix} W_{11} & W_{12} & \cdots & W_{1N} \\ W_{21} & W_{22} & \cdots & W_{2N} \\ \vdots & \vdots & \ddots & \vdots \\ W_{N1} & W_{N2} & \cdots & W_{NN} \end{pmatrix}.    (1.23)
One can now find the total probability p(Sj, t) of finding the system in the state Sj at time t as

p(S_j, t) = \sum_i p(S_i, t − 1)\, W_{ij},    (1.24)
or in matrix notation, for N such states, it is given by
p(t) = p(t − 1) · W.
(1.25)
The Markov process and Markov chain are well described in Ref. [2].
For example, consider a three state Markov chain with states (1, 2, 3). Say, the
transition matrix for such a chain is given by

\mathbf{W} = \begin{pmatrix} 0.2 & 0.7 & 0.1 \\ 0.4 & 0.3 & 0.3 \\ 0.6 & 0.2 & 0.2 \end{pmatrix}.    (1.26)
(Note that each row adds up to 1.) If the initial probability of the states is given
by p0 = (0.1, 0.8, 0.1), (a) what is the probability to find the system in state S = 1
after one jump (t = 1)? (b) what is the probability to find the system in state S = 2
after two jumps (t = 2)? (c) what is the probability to find the system in state
S = 3 after three jumps (t = 3) if the system initially was at S = 1?
(a) p1(t = 1) = (p0 · W)_1, the first element of the row vector. Thus,

p_{1,t=1} = \left[ (0.1, 0.8, 0.1) \begin{pmatrix} 0.2 & 0.7 & 0.1 \\ 0.4 & 0.3 & 0.3 \\ 0.6 & 0.2 & 0.2 \end{pmatrix} \right]_1 = (0.40, 0.33, 0.27)_1 = 0.40.    (1.27)
(b) p2(t = 2) = (p0 · W²)_2, the second element of the row vector. Thus,

p_{2,t=2} = (0.374, 0.433, 0.193)_2 = 0.433.    (1.28)
(c) This is the conditional probability p(3, t = 3 | 1, t = 0) = (W^3)_{13} = 0.199, the (1, 3) element of the W^3 matrix.
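These numbers can be reproduced with a few lines of matrix algebra; the following minimal Python (NumPy) sketch simply propagates the initial probability vector with the transition matrix of Eq. (1.26).

import numpy as np

W = np.array([[0.2, 0.7, 0.1],
              [0.4, 0.3, 0.3],
              [0.6, 0.2, 0.2]])
p0 = np.array([0.1, 0.8, 0.1])

p1 = p0 @ W                              # distribution after one jump
p2 = p0 @ np.linalg.matrix_power(W, 2)   # distribution after two jumps
W3 = np.linalg.matrix_power(W, 3)

print(p1[0])       # (a) 0.40
print(p2[1])       # (b) 0.433
print(W3[0, 2])    # (c) 0.199, the (1, 3) element of W^3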
The “steady-state” behavior of Markov chains as t → ∞ is of considerable interest. In the limit t → ∞, the steady state will be achieved if the probability flux out of a state i is equal to the probability flux into the state i. The steady-state condition then can be written as

\sum_j p_i(t)\, W_{ij} = \sum_j p_j(t)\, W_{ji}.    (1.29)
This is called “global balance”. Since \sum_j W_{ij} = 1, the above equation reduces to

p_i(t) = \sum_j p_j(t)\, W_{ji},    (1.30)
or in matrix notation
p(t) = p(t) · W.
(1.31)
Thus, the probability distribution of the states {pi } under the transition matrix W
remains constant. Note that this condition may not be satisfied if the Markov chain
is transient or recurrent [2] .
Problem: Show that the (one-step) transition matrices W for the following three Markov chains are as given below. The lines between the sites indicate the connectivity between them, along which transitions are possible. All transitions from a site are equally probable. Find the corresponding steady-state distribution {pj}.

[Connectivity diagrams (a)–(c), as implied by the matrices below: (a) a linear chain of sites 1–2–3–4; (b) a ring of sites 1–2–3–4; (c) the same ring with an additional link between sites 2 and 4.]

(a) \mathbf{W} = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 1/2 & 0 & 1/2 & 0 \\ 0 & 1/2 & 0 & 1/2 \\ 0 & 0 & 1 & 0 \end{pmatrix}
(b) \mathbf{W} = \begin{pmatrix} 0 & 1/2 & 0 & 1/2 \\ 1/2 & 0 & 1/2 & 0 \\ 0 & 1/2 & 0 & 1/2 \\ 1/2 & 0 & 1/2 & 0 \end{pmatrix}
(c) \mathbf{W} = \begin{pmatrix} 0 & 1/2 & 0 & 1/2 \\ 1/3 & 0 & 1/3 & 1/3 \\ 0 & 1/2 & 0 & 1/2 \\ 1/3 & 1/3 & 1/3 & 0 \end{pmatrix}

Solving the set of equations p_i(t) = \sum_j p_j(t)\, W_{ji} at the steady state, one may find

(a) p = (1/6, 1/3, 1/3, 1/6),
(b) p = (1/4, 1/4, 1/4, 1/4),
(c) p = (1/5, 3/10, 1/5, 3/10).
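The steady-state vectors can also be obtained numerically, for example as the left eigenvector of W with eigenvalue 1 (equivalent to solving p = p · W). A minimal Python sketch, shown for chain (c) as an illustration:

import numpy as np

def steady_state(W):
    """Stationary distribution p satisfying p = p.W: the left eigenvector
    of W for eigenvalue 1, normalised so that the components sum to 1."""
    vals, vecs = np.linalg.eig(W.T)
    k = np.argmin(np.abs(vals - 1.0))     # eigenvalue closest to 1
    p = np.real(vecs[:, k])
    return p / p.sum()

Wc = np.array([[0,   1/2, 0,   1/2],      # chain (c) of the problem above
               [1/3, 0,   1/3, 1/3],
               [0,   1/2, 0,   1/2],
               [1/3, 1/3, 1/3, 0  ]])
print(steady_state(Wc))                   # [0.2, 0.3, 0.2, 0.3]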
In the continuous time representation, the probability to find the system at time t + dt is given by

p(t + dt) = p(t) · W.    (1.32)
Hence, the probability to find the system in the state Sj at time t + dt is

p(S_j, t + dt) = \sum_i p(S_i, t)\, W_{ij}
             = p(S_j, t)\, W_{jj} + \sum_{i\, (i \ne j)} p(S_i, t)\, W_{ij}
             = p(S_j, t) \Big(1 − \sum_{i\, (i \ne j)} W_{ji}\Big) + \sum_{i\, (i \ne j)} p(S_i, t)\, W_{ij}    (1.33)

or,

p(S_j, t + dt) − p(S_j, t) = −\sum_{i\, (i \ne j)} p(S_j, t)\, W_{ji} + \sum_{i\, (i \ne j)} p(S_i, t)\, W_{ij}.    (1.34)
Hence, the time rate of change of the probability of occurrence of a state with time t is given by the master equation:

\frac{dp(S_j, t)}{dt} = −\sum_i W_{ji}\, p(S_j, t) + \sum_i W_{ij}\, p(S_i, t)    (1.35)
where i ≠ j in the summations. The above equation can be considered as a continuity equation expressing the fact that the total probability is conserved (\sum_j p(S_j, t) = 1 at all times): the probability that is ‘lost’ by a state i through transitions to a state j is gained by state j, and vice versa.
At equilibrium, when W_{ji}\, p_{eq}(S_j) = W_{ij}\, p_{eq}(S_i), the net flux between states is zero and the steady state is achieved. At this point, the master equation yields

\frac{dp_{eq}(S_j, t)}{dt} = 0,    (1.36)

since the gain and loss terms cancel exactly.
Example: A system has two energy states labeled 0 and 1, with energies E0 and E1. Transitions between the two states take place at rates W01 = w0 exp[−β(E1 − E0)] and W10 = w0, where β = 1/kBT. Solve the master equation for the probabilities
p(0, t) and p(1, t) of occupation of the two states as a function of time t with the
initial conditions p(0, 0) = 0 and p(1, 0) = 1. What is the steady state probability
distribution for the states? What is the average energy of the system in the steady
state?
Let us consider the master equation for p(1, t), which is given by

\frac{dp(1, t)}{dt} = −p(1, t)\, W_{10} + p(0, t)\, W_{01}
                    = −w_0\, p(1, t) + w_0\, e^{−\beta(E_1−E_0)} [1 − p(1, t)]
                    = w_0\, e^{−\beta(E_1−E_0)} − w_0 [1 + e^{−\beta(E_1−E_0)}]\, p(1, t)
                    = a − b\, p(1, t)
where a = w_0\, e^{−\beta(E_1−E_0)} and b = w_0 [1 + e^{−\beta(E_1−E_0)}]. A solution to the above equation can easily be obtained by differentiating it once more. Writing p′(1, t) = a − b\, p(1, t), one has

\frac{dp′(1, t)}{dt} = −b\, p′(1, t) \quad\Rightarrow\quad \frac{dp′(1, t)}{p′(1, t)} = −b\, dt,

or p′(1, t) = C e^{−bt}, i.e. a − b\, p(1, t) = C e^{−bt}.
As per the initial condition, p(1, 0) = 1 at t = 0, and then C = a − b = −w_0. The solution to the master equation is then given by

p(1, t) = \frac{a}{b} + \frac{w_0}{b}\, e^{−bt}.
In the t → ∞ limit, the steady state is reached and the probability p_1 of the state 1 becomes

p_1 = \frac{a}{b} = \frac{e^{−\beta(E_1−E_0)}}{1 + e^{−\beta(E_1−E_0)}} = \frac{e^{−\beta E_1}}{e^{−\beta E_0} + e^{−\beta E_1}},
and the probability p_0 of the state 0 will be

p_0 = 1 − p_1 = \frac{e^{−\beta E_0}}{e^{−\beta E_0} + e^{−\beta E_1}}.
Thus, in the steady state, the probability of any state s is given by the Boltzmann distribution

p_s = \frac{1}{Z}\, e^{−\beta E_s}

where Z = \sum_s e^{−\beta E_s} is the canonical partition function. The average energy of the system is given by
\langle E \rangle = E_0 p_0 + E_1 p_1 = \frac{E_0\, e^{−\beta E_0} + E_1\, e^{−\beta E_1}}{e^{−\beta E_0} + e^{−\beta E_1}} = \frac{1}{Z} \sum_s E_s\, e^{−\beta E_s}.
Thus with the right choice of transition probabilities, the Markov process takes the
system to the right Boltzmann distribution in the t → ∞ limit.
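The analytical solution can be checked by integrating the master equation numerically, for example with a simple Euler scheme; the following minimal Python sketch uses purely illustrative parameter values.

import numpy as np

w0, beta, E0, E1 = 1.0, 1.0, 0.0, 1.0          # illustrative parameters
W01 = w0 * np.exp(-beta * (E1 - E0))           # transition rate 0 -> 1
W10 = w0                                       # transition rate 1 -> 0

dt, steps = 1e-3, 20000
p1 = 1.0                                       # initial condition p(1, 0) = 1
for _ in range(steps):
    p0 = 1.0 - p1
    p1 += dt * (-W10 * p1 + W01 * p0)          # Euler step of Eq. (1.35)

boltzmann = np.exp(-beta * E1) / (np.exp(-beta * E0) + np.exp(-beta * E1))
print(p1, boltzmann)                           # both close to 0.2689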
Finally, it should be noted that the restriction to a discrete set of states {Si} is not at all important: one can generalize the discussion to a continuum of states, working with suitable probability densities in the appropriate space [3], which leads to the Fokker–Planck equation.
1.5 Random number generators:
Monte Carlo methods are heavily dependent on the fast and efficient production of
streams of random numbers. Random number sequences can be produced directly on
the computer using software. Since such algorithms are actually deterministic, the
random number sequences which are thus produced are only ‘pseudo-random’ and
do indeed have limitations. These deterministic features are not always negative.
For example, for testing a program it is often useful to compare the results with a
previous run made using exactly the same random numbers. On the other hand,
because of its deterministic nature, the sequence has a finite cycle of repetition. The cycle must be larger than the number of calls one makes during a simulation of a particular problem. For a detailed discussion one may consult Ref. [5].
There are different algorithms to generate random numbers such as Congruential
method, Shift register algorithms, Lagged Fibonacci generators etc. Below, we will
be discussing these algorithms briefly. These algorithms generate a sequence of
random integers. Usually floating point numbers uniformly distributed between
0 and 1 are needed for MC simulation. Random numbers uniformly distributed
between 0 and 1 are then obtained by carrying out a floating point division by the
largest integer which can fit into the word length of a computer.
1.5.1 Congruential method:
A simple and very popular method for generating random number sequences is the
multiplicative or congruential method. Here, a fixed multiplier c is chosen along
with a given seed and subsequent numbers are generated by simple multiplication:
X_n = (c X_{n−1} + a_0)\, \mathrm{MOD}(N_{max}),    (1.37)
where Xn is an integer between 1 and Nmax . X0 is the initial seed to be supplied to
generate the sequence. A random number rn is then obtained as
rn = Xn /Nmax .
It is important that the value of the multiplier be chosen to have ‘good’ properties
and various choices have been used in the past. In addition, the best performance
is obtained when the initial random number X0 is odd. Experience has shown that
a ‘good’ congruential generator is the 32-bit linear congruential algorithm
X_n = (c X_{n−1})\, \mathrm{MOD}(N_{max})    (1.38)

where c = 7^5 = 16807 and N_{max} = 2^{31} − 1 = 2147483647. A congruential generator
which was quite popular earlier turned out to have quite noticeable correlation
between consecutive triplets of random numbers.
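A minimal Python sketch of the multiplicative generator of Eq. (1.38) with the constants quoted above might look as follows (the function name and seed are illustrative; this is a sketch, not production code):

def lcg(seed, c=16807, nmax=2**31 - 1):
    """Multiplicative congruential generator X_n = (c X_{n-1}) MOD Nmax,
    yielding floating point numbers uniformly distributed in (0, 1)."""
    x = seed                      # the seed must be a nonzero integer
    while True:
        x = (c * x) % nmax
        yield x / nmax

gen = lcg(seed=12345)
print([next(gen) for _ in range(5)])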
The correlation in a random number sequence can be reduced by shuffling. Call a random number z and use z to find a random location in a stored sequence of random numbers: for a sequence of length N, the location is I = z × N. Replace the random number y at the Ith location by z, set z = y, and then repeat the procedure, as sketched below.
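A minimal Python sketch of this shuffling procedure (the table size and the use of Python's own random module as the raw stream are illustrative):

import random

def shuffled(raw, table_size=64):
    """Reduce sequential correlations in the stream 'raw' (an iterator of
    floats in [0, 1)) by the shuffling procedure described above."""
    table = [next(raw) for _ in range(table_size)]   # fill a table first
    z = next(raw)
    while True:
        i = int(z * table_size)       # location I = z * N
        z, table[i] = table[i], z     # output the old entry, store z in its place
        yield z

raw = iter(random.random, None)       # any uniform stream can be plugged in here
gen = shuffled(raw)
print([next(gen) for _ in range(5)])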
The period of the random number sequence obtained from the above congruential generator is ≈ 10^9. In order to improve the period of the sequence, two different sequences with different periods are combined to produce a new sequence whose period is the lowest common multiple of the two periods. By choosing N_{max,1} = 2147483562 and N_{max,2} = 2147483398 (slightly less than 2^{31}), one can produce a sequence of period ≈ 10^18. A few random number generators incorporating these measures and added safeguards are provided in Numerical Recipes by W. H. Press et al. [5].
Congruential generators which use a longer word length also have improved properties. An assembly language implementation using a 64-bit product register is straightforward but not portable from machine to machine.
The shift register algorithm was introduced to eliminate some of the problems with correlations in a congruential method. A table of random numbers is first produced, and a new random number is produced by combining two different existing numbers from the table:

X_n = X_{n−p} \cdot\mathrm{XOR}\cdot X_{n−q}    (1.39)
where p and q must be properly chosen to obtain a good sequence. The ·XOR· operator is the bit-wise exclusive-OR operator. The best choices of the pairs (p, q) are determined by the primitive trinomials given by

X^p + X^q + 1 = \text{primitive}.    (1.40)
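A shift-register generator of this type can be sketched in a few lines of Python; the pair (p, q) = (250, 103), corresponding to the primitive trinomial X^250 + X^103 + 1, is used here as an illustration, and the table is seeded from Python's own generator purely for simplicity.

import random

def gfsr(p=250, q=103, bits=32, seed=1):
    """Shift-register generator X_n = X_{n-p} .XOR. X_{n-q} of Eq. (1.39)."""
    rng = random.Random(seed)
    table = [rng.getrandbits(bits) for _ in range(p)]   # initial table of p integers
    n = 0
    while True:
        x = table[n % p] ^ table[(n - q) % p]   # X_{n-p} XOR X_{n-q}
        table[n % p] = x                        # overwrite the oldest entry
        n += 1
        yield x / 2**bits                       # float in [0, 1)

gen = gfsr()
print([round(next(gen), 6) for _ in range(5)])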
1.6 Test of quality:
The quality of random numbers generated from a given random number generator depends on two factors: (i) uniformity and (ii) independence (absence of correlations). These criteria can be tested using the frequency test and the auto-correlation test, respectively. Apart from these two tests, a parking lot test can be performed to visualize the quality of a sequence of random numbers generated. We will briefly discuss these tests here; for a detailed discussion one may consult Ref. [6].
1.6.1 Frequency test:
This test verifies the uniformity of the random numbers. The interval [0, 1] is divided into N_{bin} bins and a total of N random numbers are distributed among them. If the random numbers fall n_j times in the jth bin, the χ² is defined as

\chi^2 = \sum_{j=1}^{N_{bin}} \frac{(n_j − Nw)^2}{Nw}, \qquad w = 1/N_{bin}    (1.41)
where n_j in every bin is expected to have a Poisson distribution. A high value of χ² does not correspond to a uniform distribution of random numbers. On the other hand, χ² = 0 corresponds to no randomness.
Since the sum of the occupancies in all the bins must be equal to N, the total number
of random numbers generated, the number of degrees of freedom ν is one less than
the number of bins, i.e., ν = N_{bin} − 1. Now, one needs to determine the value of χ²_{α,ν} for the significance level α and degrees of freedom ν from the “Percentage points
of the chi-square distribution table”. The considered random number distribution
will be accepted as a uniform distribution if
χ2 ≤ χ2α,ν .
(1.42)
If the sequence is truly random, the departure in any one bin from the average value of Nw would be given by a normal distribution centered about zero with variance Nw. Then the expected value of the reduced chi-square χ²/χ²_ν is ≈ 1 with negligible α. Thus, in the figures below the generator with reduced χ² ≈ 0.99 is a good generator.
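The frequency test itself takes only a few lines; a minimal Python sketch of Eq. (1.41) with Nbin = 20 bins as in Figure 1.1 (NumPy's default generator is tested purely for illustration):

import numpy as np

def chi2_frequency(r, nbin=20):
    """Chi-square statistic of Eq. (1.41) for a sequence r of numbers in [0, 1)."""
    N = len(r)
    counts, _ = np.histogram(r, bins=nbin, range=(0.0, 1.0))   # n_j for each bin
    expected = N / nbin                                        # N*w with w = 1/Nbin
    return np.sum((counts - expected)**2 / expected)

rng = np.random.default_rng(1)
chi2 = chi2_frequency(rng.random(10000))
print(chi2, chi2 / (20 - 1))    # reduced value ~ 1 for a good generator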
[Figure 1.1: 10000 random numbers distributed over 20 equal bins of width 0.05, generated from two random number generators (one with reduced χ² ≈ 0.99, the other with χ² ≈ 58). The generator with reduced χ² ≈ 0.99 is a good generator.]
1.6.2 Auto-correlation test:
The auto-correlation test checks for dependence between numbers in a sequence. For example, consider the following sequence:
0.12 0.01 0.23 0.28 0.89 0.31 0.64 0.28 0.83 0.93
0.68 0.49 0.05 0.43 0.95 0.58 0.19 0.36 0.69 0.87
In this sequence every 5th number (0.89, 0.93, 0.95, 0.87) is of high value.
One then needs to find out the correlation among every mth member of the sequence, starting from the ith member, in a sequence of N random numbers. The auto-correlation ρ_{im} of interest is that between the numbers r_i, r_{i+m}, r_{i+2m}, · · ·, r_{i+(M+1)m}, where M is the largest integer such that i + (M + 1)m ≤ N. If the values are uncorrelated, the distribution of the estimator of ρ_{im}, denoted by \hat{ρ}_{im}, is approximately normal for large values of M. In order to test the auto-correlation, one needs to calculate a quantity Z_0 defined as

Z_0 = \frac{\hat{\rho}_{im}}{\sigma_{\hat{\rho}_{im}}}
where

\hat{\rho}_{im} = \frac{1}{M + 1} \left[ \sum_{k=0}^{M} r_{i+km}\, r_{i+(k+1)m} \right] − 0.25 \qquad \text{and} \qquad \sigma_{\hat{\rho}_{im}} = \frac{\sqrt{13M + 7}}{12(M + 1)}.
Z0 is distributed normally with mean 0 and variance 1. The numbers in the sequence
are said to be independent if
−Zα/2 ≤ Z0 ≤ Zα/2
where α is the level of significance and Zα/2 is obtained from the standard normal
distribution table. For example, for 5% significance, Z_{0.05/2} = Z_{0.025} = 1.96. Hence, if the calculated value of Z_0 for a given sequence is found to be within −1.96 and +1.96, then the numbers in the sequence can be assumed to be independent.
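A direct implementation of this test, following the formulas above, might look as follows (a minimal Python sketch; the text's index convention i = 1, 2, · · · is assumed, and the 5% critical value 1.96 is hard-coded for illustration):

import numpy as np

def autocorrelation_test(r, i, m, z_crit=1.96):
    """Z0 statistic for the auto-correlation between r_i, r_{i+m}, r_{i+2m}, ..."""
    r = np.asarray(r, float)
    N = len(r)
    M = (N - i) // m - 1                  # largest M with i + (M+1)m <= N
    idx = (i - 1) + m * np.arange(M + 2)  # 0-based positions of r_i, ..., r_{i+(M+1)m}
    rho = np.sum(r[idx[:-1]] * r[idx[1:]]) / (M + 1) - 0.25
    sigma = np.sqrt(13 * M + 7) / (12 * (M + 1))
    Z0 = rho / sigma
    return Z0, abs(Z0) <= z_crit          # True -> numbers assumed independent

rng = np.random.default_rng(2)
print(autocorrelation_test(rng.random(1000), i=3, m=5))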
There is another type of correlation called serial correlation. Here there is a tendency for each random number to be followed by another of a particular type; for example, a large number may always be followed by a small number, or always by another large number. To check this, a correlation coefficient between a sequence of N random numbers r_1, r_2, · · ·, r_N can be defined as
C = \frac{N \left( \sum_{i=1}^{N} r_{i−1} r_i \right) − \left( \sum_{i=1}^{N} r_i \right)^2}{N \sum_{i=1}^{N} r_i^2 − \left( \sum_{i=1}^{N} r_i \right)^2}    (1.43)
where r_0 in the first term of the numerator can be taken as r_N. The value of C for an uncorrelated sequence is expected to be in the range [µ_N − 2σ_N, µ_N + 2σ_N], where

\mu_N = −\frac{1}{N − 1}, \qquad \sigma_N = \frac{1}{N − 1} \sqrt{\frac{N(N − 3)}{N + 1}}.    (1.44)
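The coefficient C and its acceptance band can be computed directly; a minimal Python sketch, treating the sequence as circular (r_0 = r_N) as stated above:

import numpy as np

def serial_correlation(r):
    """Serial correlation coefficient C of Eq. (1.43) and the acceptance
    band [mu_N - 2 sigma_N, mu_N + 2 sigma_N] of Eq. (1.44)."""
    r = np.asarray(r, float)
    N = len(r)
    s1 = np.sum(r)
    num = N * np.sum(np.roll(r, 1) * r) - s1**2    # roll gives r_{i-1}, with r_0 = r_N
    den = N * np.sum(r**2) - s1**2
    C = num / den
    mu = -1.0 / (N - 1)
    sigma = np.sqrt(N * (N - 3) / (N + 1)) / (N - 1)
    return C, (mu - 2 * sigma, mu + 2 * sigma)

rng = np.random.default_rng(3)
print(serial_correlation(rng.random(1000)))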
1.6.3 Parking lot test:
By carrying out a ‘parking lot’ test one may visualize the correlations among the random numbers in a sequence. Consecutive pairs of random numbers are taken as x- and y-coordinates and plotted in the xy-plane. For an uncorrelated distribution, one expects to see a uniform distribution of these points. Any particular pattern in the plot indicates a kind of correlation among the numbers. A comparison of two different random number sequences taken from two different generators is shown in Figure 1.2.
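A minimal Python sketch of the parking lot test (matplotlib is used for the scatter plot, and the sequence tested is purely illustrative):

import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(4)
r = rng.random(2000)                  # a sequence of 2000 random numbers
x, y = r[0::2], r[1::2]               # consecutive pairs as (x, y) coordinates

plt.scatter(x, y, s=2)
plt.xlabel("x")
plt.ylabel("y")
plt.title("Parking lot test: 1000 points")
plt.show()                            # correlations show up as regular patterns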
[Figure 1.2: 1000 points plotted using consecutive pairs of random numbers as x- and y-coordinates. At the left is the result of a ‘bad’ generator and at the right the result of a ‘good’ generator.]
1.7 Random numbers of a given distribution
In many cases we need random numbers with a specific distribution that is different from a uniform one. The general approach is to start with a uniform distribution p(x) and change it to the required one. There are several ways to carry out such a transformation. We will discuss a single one here.
Consider a set of random numbers evenly distributed in the interval [0, 1]. The probability of finding a number between x and x + dx, denoted by p(x)dx, is given by

p(x)\, dx = \begin{cases} dx & 0 < x < 1 \\ 0 & \text{otherwise} \end{cases}    (1.45)
The probability distribution p(x) is of course normalized, so that

\int_{−\infty}^{+\infty} p(x)\, dx = 1.    (1.46)
The interest is to find out a transformation from x to y such that the random number
y has a desired distribution p(y). It is convenient to assume that the p(y) has the
same normalization as that of p(x) given in Eq.1.46. The probability distribution
of y, denoted by p(y)dy, is determined by the fundamental transformation law of
probabilities, which is simply
|p(y)\, dy| = |p(x)\, dx|    (1.47)
or

p(y) = p(x) \left| \frac{dx}{dy} \right|    (1.48)
where the Jacobian of the transformation is given by the absolute value of the
derivative of x with respect to y, coming from the fact that probabilities cannot be
negative.
As an example, say the required distribution is an exponential distribution given by

p(y)\, dy = e^{−y}\, dy    (1.49)
and it is to be obtained from a uniform distribution p(x). This exponential distribution occurs frequently in real problems, usually as the distribution of waiting times between independent Poisson random events, for example the radioactive decay of nuclei. Using the above prescription one has e^{−y} = x, or y(x) = −ln(x), which changes a set of uniformly distributed numbers {x_i} into a set {y_i} having an exponential distribution.
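In practice the transformation takes only a couple of lines; a minimal Python sketch (1 − x is used instead of x to avoid log(0), which leaves the distribution unchanged):

import numpy as np

rng = np.random.default_rng(5)
x = rng.random(100000)           # uniform in [0, 1)
y = -np.log(1.0 - x)             # exponentially distributed numbers

print(y.mean(), y.var())         # both close to 1 for p(y) = exp(-y)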
Such a method of carrying out the transformation works well in cases where it is possible to carry out the indefinite integral

x = F(y) = \int p(y)\, dy    (1.50)

and the inverse of F(y) = x can be found. In practice, the relation

y = F^{−1}(x)    (1.51)

is often complicated in form. In such cases, other methods are more efficient.
Problem: (i) Generate a sequence of random numbers using the congruential method, assuming the computer has a 4-bit word. What is the cycle of the sequence?
(ii) Generate a sequence of random numbers using the congruential method with c = 111, Nmax = 23767 and X0 = 99999999. Perform the frequency and parking lot tests for the sequence.
(iii) Generate a sequence of random numbers using ran2 given in Numerical
Recipes by W. H. Press et al with a large negative integer as seed. Compare with
the generator given in (ii) by performing the same quality tests.
Bibliography

[1] E. Atlee Jackson, Equilibrium Statistical Mechanics (Dover Publications, Inc., Mineola, New York, 1968).
[2] R. Y. Rubinstein and D. P. Kroese, Simulation and the Monte Carlo Method (John Wiley & Sons, Inc., Hoboken, New Jersey, 2008).
[3] D. P. Landau and K. Binder, A Guide to Monte Carlo Simulations in Statistical Physics (Cambridge University Press, Cambridge, 2005).
[4] M. E. J. Newman and G. T. Barkema, Monte Carlo Methods in Statistical Physics (Clarendon Press, Oxford, 2001).
[5] W. H. Press, S. A. Teukolsky, W. T. Vetterling, and B. P. Flannery, Numerical Recipes (Cambridge University Press, Cambridge, 1998).
[6] S. S. M. Wong, Computational Methods in Physics and Engineering (World Scientific, Singapore, 1997).