The Multiple Prime Random Number Generator

ALEXANDER HAAS
Mercury Consolidated, Inc.
A new pseudorandom number generator, the Multiple Prime Random Number Generator, has been
developed; it is efficient, conceptually simple, flexible, and easy to program. The generator utilizes
cycles around prime numbers to guarantee the length of the period, which can easily be programmed
to surpass the maximum period of any other presently available random number generator. There
are only minimal limits placed on the seed values of the variables because the period of the generator is
not a function of the initial values of the variables. The generator passes thirteen standard random
number generator tests. It requires only about fifteen lines of FORTRAN code to program and utilizes
programming language constructs found in most major languages. Finally, it compares very favorably
to the fastest of the other available generators.
Categories and Subject Descriptors: G.3 [Mathematics of Computing]: Probability and Statistics - probabilistic algorithms, random number generation; I.6.3 [Simulation and Modeling]: Applications; I.6.4 [Simulation and Modeling]: Model Validation and Analysis; J.2 [Computer Applications]: Physical Sciences and Engineering - mathematics and statistics; J.4 [Computer Applications]: Social and Behavioral Sciences - economics, sociology
General Terms: Algorithms, Economics, Experimentation, Reliability, Performance
1. INTRODUCTION
This research began in order to find a random number generator (RNG) with a sufficiently long period, which would behave in a manner similar to “true” random phenomena (e.g., the toss of a die), be statistically unbiased, and be conceptually simple, flexible, and easy to program. Lehmer-type linear congruential generators [4, 5], although conceptually simple, sometimes have short periods, are deterministic (i.e., any given number is always followed by the same number), and in many cases cannot pass some of the relevant statistical tests [1, 2, 11]. The Tausworthe digital shift-register sequence generator and its offspring [6, 8-11] appear to pass all relevant statistical tests and can be programmed for long periods, but require the user either to program in the registers or to have language constructs, such as the Exclusive OR, not readily available in many languages. The Power Residue Generator [3] lacks flexibility in that some starting values are more efficient than others. The following algorithm, the Multiple Prime Random Number Generator (MPRNG), a modified linear congruential generator, resolves all of these problems.

Author’s address: 6999 Moores Mill Road, Huntsville, AL 35811.
Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the ACM copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Association for Computing Machinery. To copy otherwise, or to republish, requires a fee and/or specific permission.
© 1987 ACM 0098-3500/87/1200-0368 $01.50
ACM Transactions on Mathematical Software, Vol. 13, No. 4, December 1987, Pages 368-381.

C     THE MULTIPLE PRIME RANDOM NUMBER GENERATOR
C     THE FOLLOWING CODE IS FORTRAN
C     INITIALIZE SEED VARIABLES
      M = 971
      IA = 11113
      IB = 104322
      IRAND = 4181
C     GENERATE 5000 NUMBERS
      DO 10 I = 1,5000
      M = M + 7
      IA = IA + 1907
      IB = IB + 73939
      IF(M .GE. 9973)M = M - 9871
      IF(IA .GE. 99991)IA = IA - 89989
      IF(IB .GE. 224729)IB = IB - 96233
      IRAND = (MOD((IRAND * M + IA + IB), 100000))/10
   10 CONTINUE
C     END OF RANDOM NUMBER GENERATOR

Following the conventions of the linear congruential generators, M is referred to as the multiplier, and IA and IB are called increments. The major difference between MPRNG and linear congruential generators is that in the MPRNG the multiplier and increments change from one iteration to the next. The two types of considerations in determining the validity of random number generators, periodicity and statistical behavior, will be covered fully later in the text. For now, the following should be noted:
(1) All arithmetic in the above FORTRAN code is computer integer arithmetic. It was designed for a 32-bit integer. However, it has been tested in Pascal using real arithmetic with 11-digit accuracy and in Basic using double-precision real arithmetic, and it performed well.
(2) It was decided by the author that four digits were all that were necessary for his particular application. Therefore, the generated value lies not between 0 and 1, but between 0000 and 9999, and an integer probability rather than a decimal probability is used when making comparisons. It was later found that at least one other author [2] had a similar idea. For those users who find this too coarse and have applications requiring greater precision, IRAND can be concatenated with IRAND(t + 1) to give a range from 0 to 99,999,999, which is much more precise than generators with 24-binary-bit accuracy.
(3) The four rightmost digits (4-1) are not used because bias was found in the rightmost digit. Therefore, digits 5-2 are used.
(4) The largest possible integer occurs when IRAND = 9,999, M = 9,972, IA = 99,990, and IB = 224,728, giving the quantity IRAND × M + IA + IB = 100,034,746, well within the largest integer size for a 32-bit computer.
(5) The minimum value of M is about 100. The minimum value of IA is about 10,000, and IB about 128,000. This was done in order that the expression IRAND × M + IA + IB would always have some values in the relevant digits (see the discussion in Section 4).
(6) All language constructs in the code are found in virtually every major language.
(7) In testing on a UNIVAC 1160 in FORTRAN, MPRNG generated about 50,000 numbers per second. Some testing was done on a 32-bit minicomputer, and approximately 6,000,000 numbers were generated per hour. On an IBM AT, 26,000 numbers per minute were generated in compiled Basic. These rates depend on the language and machine (on a 16-bit home computer, the algorithm generates approximately 20,000 numbers per hour in interpreted Basic).
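For readers without a FORTRAN compiler, the loop body of the listing can be sketched in Python. This is only a transcription under stated assumptions: the generator name `mprng` and its keyword arguments are hypothetical, the default seeds are the ones in the listing, and Python's `%` and `//` are taken to match FORTRAN's `MOD` and integer division because every quantity involved is nonnegative.

```python
from itertools import islice


def mprng(m=971, ia=11113, ib=104322, irand=4181):
    """Yield four-digit integers (0..9999), mirroring the FORTRAN loop body.

    The function name and keyword arguments are illustrative, not from the paper.
    """
    while True:
        # Advance the multiplier and both increments around their prime cycles.
        m += 7
        ia += 1907
        ib += 73939
        if m >= 9973:
            m -= 9871
        if ia >= 99991:
            ia -= 89989
        if ib >= 224729:
            ib -= 96233
        # Keep digits 5-2 of the sum: take five digits, then drop the rightmost.
        irand = ((irand * m + ia + ib) % 100000) // 10
        yield irand


first_five = list(islice(mprng(), 5))
sample = list(islice(mprng(), 5000))
```

Because Python integers do not overflow, the 32-bit bound in note (4) is not enforced here; it matters only when the same arithmetic is carried out in fixed-width integers.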
2. PERIODICITY
As noted elsewhere [3, p. 3], “Any sequence of numbers produced by a subroutine with finite input will eventually repeat, since the computer has only a finite number of ‘stable states.’ Thus, the first problem in determining a procedure to produce random numbers is to assure a long period for the repeating sequence.” In the past, the periods of other generators [2-4, 8, 11] have been limited to the word size of the computer, the largest period being smaller than the maximum integer available. Two reports on improved Tausworthe generators [6, 10] and Knuth [4] have overcome this problem and shown ways of achieving virtually infinite periods.
The periodicity of MPRNG is guaranteed by setting up cycles around prime numbers. To begin, two prime numbers are selected, for instance, 3 and 7. Then any number less than the larger prime is taken, for instance, 2, and the smaller prime is added to it. Any time a sum greater than or equal to the larger prime is obtained, the larger prime is subtracted from it. This process results in a series of all numbers less than the larger prime, with a period of length equal to the larger prime, as shown in the following example, designated SERIES1:
    2-5-1-4-0-3-6-2.
If, instead of subtracting the larger prime from the sum, a third prime, lying between the other two primes, is subtracted, the period becomes that of the third prime, as shown in the following example, designated SERIES2, where the third prime is 5:
    2-5-3-6-4-2.
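The two example series can be reproduced with a few lines of Python. This is a minimal sketch; the helper name `prime_cycle` and its parameters are hypothetical, and it assumes the starting value already lies on the cycle.

```python
def prime_cycle(start, step, limit, subtract):
    """Return the values visited before `start` recurs (illustrative helper).

    step     -- the smaller prime that is added each time
    limit    -- the larger prime that triggers a subtraction
    subtract -- the prime that is subtracted (equal to `limit` for SERIES1)
    """
    values = [start]
    x = start
    for _ in range(limit):  # every value stays below `limit`, so this suffices
        x += step
        if x >= limit:
            x -= subtract
        if x == start:
            return values
        values.append(x)
    raise ValueError("start value is not on the cycle")


series1 = prime_cycle(2, 3, 7, 7)  # subtract the larger prime
series2 = prime_cycle(2, 3, 7, 5)  # subtract the third prime, 5
```

With these arguments `series1` has seven entries and `series2` has five, matching the periods quoted above.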
Mathematically, the series and cycles shown above can be derived as follows: For SERIES1 there are two primes, $P_1$ and $P_2$, where
$$P_1 < P_2.$$
Going through the process above eventually leads to some number $N$, defined as
$$P_2 - P_1 + n = N,$$
where
$$(P_2 - P_1) \le N < P_2.$$
In order to determine how many adds are needed to get back to this point, or what the cycle or period is, the following equation is used to describe the process:
$$N = N + m \times P_1 - n \times P_2,$$
where $m$ is the cycle length. This equation implies
$$m \times P_1 = n \times P_2$$
or
$$m \times P_1 \pmod{P_2} = 0.$$
From the fundamental theorem of arithmetic, it is known that the quantity $m \times P_1$ is the product of primes. Since $P_1$ and $P_2$ are relatively prime, $P_2$ must divide $m$, so the minimal $m$ that can satisfy the above conditions is $m = P_2$; that is, there are $P_2$ adds, or the period of the series is $P_2$. This guarantees that all $P_2$ numbers $P_2 - 1, P_2 - 2, \ldots, 0$ are produced in the series.
Another way of stating the above is that the procedure produces the finite set of $P_2$ integers that contains all of the integers from 0 to $P_2 - 1$ (not necessarily in order) and that set is ordered, discrete, and uniformly distributed. The probability that any particular integer from the set will be randomly selected is $1/P_2$.
Further, suppose a second series is generated. This series will have properties exactly the same as described in the above paragraph. However, different primes will be used, and therefore, the period will be different. For the sake of discussion, the period of this second series will be designated $P_4$, and the probability that any integer from the series can be selected is $1/P_4$.
When the two separate series are combined, a set of ordered pairs is generated. Suppose all members of the set of length $P_2$ are designated by X, all members of the second set are designated by Y, and $P_2 = 5$ and $P_4 = 3$. Then, the following series is generated:
    X1  Y1
    X2  Y2
    X3  Y3
    X4  Y1
    X5  Y2
    X1  Y3
    X2  Y1
    X3  Y2
    X4  Y3
    X5  Y1
    X1  Y2
    X2  Y3
    X3  Y1
    X4  Y2
    X5  Y3
    X1  Y1
Note that all possible pairs of integers are generated, and each possible pair occurs only once before the combined series begins to repeat, just as each possible integer occurs only once in each separate series before that series repeats.
In order to predict the period of the series where the separate series are combined, any pair is selected, for instance, $X_I$ and $Y_J$. $X_I$ will be repeated every $P_2$th generation, and $Y_J$ every $P_4$th generation. This leads to the following identity:
$$c \times P_2 = b \times P_4,$$
where $c$ = the number of series where $X_I$ is repeated and $b$ = the number of series where $Y_J$ is repeated.
As above, this implies
$$c \times P_2 \pmod{P_4} = 0,$$
and the smallest $c$ that can satisfy this identity is $c = P_4$. Therefore, the period of the combined series is $P_2 \times P_4$; that is, the period of the combined series is the product of the periods of the separate series. This assumes the length of each separate series to be independent of the other; that is, the lengths are relatively prime (this has been assured by making them prime). If the lengths are not independent, the length of the period of the combined series is less than the product of the separate periods. For instance, if $P_2 = 2$ and $P_4 = 3$, the period is 6. If $P_4$ is changed to 4, the period becomes 4. Furthermore, this property holds when any number of separate series are combined.
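The combined period described above is the least common multiple of the separate periods, which equals their product exactly when they are relatively prime. The sketch below illustrates this by brute force; the names `combined_period` and `pairs` are hypothetical, and the series members are represented only by their positions within each cycle.

```python
from math import gcd


def combined_period(p, q):
    """Least common multiple of two cycle lengths."""
    return p * q // gcd(p, q)


def pairs(p, q):
    """Position pairs (i, j) visited before the combined series repeats."""
    seen = []
    i = j = 0
    while (i, j) not in seen:
        seen.append((i, j))
        i, j = (i + 1) % p, (j + 1) % q
    return seen


prime_case = pairs(5, 3)      # the X/Y example with lengths 5 and 3
dependent_case = pairs(2, 4)  # lengths that share a common factor
```

In the first case all 15 pairs appear once; in the second only 4 of the 8 possible pairs are ever produced.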
To determine the period for SERIES2, there are three primes, $P_1$, $P_2$, and $P_3$, where
$$P_1 < P_2 < P_3.$$
Going through the process above eventually leads to some integer $N$, defined as
$$P_3 - P_2 + n = N,$$
where
$$(P_3 - P_2) \le N < P_3.$$
In order to determine how many adds are needed to get back to this point, or what the cycle or period is, the following equation is used to describe the process:
$$P_3 - P_2 + n = P_3 - P_2 + n + m \times P_1 - n \times P_2,$$
where $m$ is the cycle length. As above, this equation implies
$$m \times P_1 = n \times P_2$$
or
$$m \times P_1 \pmod{P_2} = 0,$$
and the minimum $m$ that can satisfy the identity is $m = P_2$; that is, there are $P_2$ adds. This guarantees that all $P_2$ integers $P_3 - 1, P_3 - 2, \ldots, P_3 - P_2$ are produced in the series. This series may be considered to be exactly like SERIES1 with all of its important properties, including being able to be combined with other similar series. It is this type of series that is used in MPRNG because the smallest integer in the set can be adjusted to be greater than 0.
In this MPRNG, 7, 1,907, 9,973, 9,871, 73,939, 89,989, 96,233, 99,991, and
224,729 are all prime. They were selected somewhat arbitrarily, 99,991 being the
largest prime available from a table of prime numbers with values less than
100,000. However, the analysis above indicates that it does not matter which
primes are selected as long as the necessary criteria are met when setting up each
separate series. A different prime should be used wherever a prime is called for.
The primes 9,871, 89,989, and 96,233 were selected so as to leave a remainder
after the subtraction; however, the size of the remainder was also arbitrary within
the constraints mentioned in Section 4. The period for this generator is equal to
9,871 X 89,989 X 96,233 or about 85.5 trillion.
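The quoted period can be checked by walking each of the three cycles once and multiplying the lengths. This is a sketch under the assumption that each walk starts at the smallest value on its cycle; `cycle_length` is a hypothetical helper, and the result is the period of the (M, IA, IB) triple as analyzed in this section.

```python
def cycle_length(step, limit, subtract):
    """Number of adds needed to return to the smallest value on the cycle."""
    start = limit - subtract
    x, count = start, 0
    while True:
        x += step
        if x >= limit:
            x -= subtract
        count += 1
        if x == start:
            return count


lengths = [
    cycle_length(7, 9973, 9871),         # multiplier M
    cycle_length(1907, 99991, 89989),    # increment IA
    cycle_length(73939, 224729, 96233),  # increment IB
]
period = lengths[0] * lengths[1] * lengths[2]
```

The product comes to roughly 8.55 × 10^13, in line with the "about 85.5 trillion" stated above.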
3. STATISTICAL CONSIDERATIONS
For the tests below, appropriate chi-square values are from tables [7, p. 619]. For sample sizes of N > 30, the following formula [7, p. 618] was used to calculate the significant chi-square value (p = .95):
$$\chi^2 = n \left(1 - A + z \sqrt{A}\right)^3,$$
where n = degrees of freedom, z = the normal deviate (value of x for F(x) = .95) = 1.645, and A = 2/(9 × n).
The following relevant chi-square values were calculated from the above formula:
    9,999 degrees of freedom: 10,232.8
    99 degrees of freedom: 123.23
    65 degrees of freedom: 85.61
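The approximation formula is easy to evaluate directly. The following sketch uses a hypothetical function name, `chi_square_cutoff`, and takes z = 1.645 from the text.

```python
from math import sqrt


def chi_square_cutoff(n, z=1.645):
    """Approximate chi-square cutoff for n degrees of freedom (p = .95)."""
    a = 2.0 / (9.0 * n)
    return n * (1.0 - a + z * sqrt(a)) ** 3


cutoff_9999 = chi_square_cutoff(9999)
cutoff_99 = chi_square_cutoff(99)
```

Evaluated this way, the formula agrees with the first two listed values. For 65 degrees of freedom the same expression gives roughly 84.8, so the 85.61 listed above may come from a slightly different calculation; that is an observation from the arithmetic, not something the text states.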
Using the normal conventions with the chi-square statistic, the cutoff of the cells
to be analyzed comes when the expected frequency in the cell falls below 10-15.
In the following analyses, a cutoff at 15, 10, or 5 made no difference in the
conclusions drawn.
The following tests, run on the generator, as described above, were used by IBM to validate the power residue method [3]:
(1) Uniform distribution of values. In this test each possible number represents a cell. Testing with 10,000,000 generated numbers, the expected frequency in each cell should be 1,000. A chi-square value less than 10,232.8 indicates a uniform distribution of numbers between 0000 and 9,999.
The chi-square value (9,999 degrees of freedom) for this distribution is 9,881.2, indicating a uniform distribution.
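The statistic used in this test can be sketched as follows. The function name `chi_square_uniform` is hypothetical, and the sketch assumes every value is an integer cell index in the range 0 to `cells - 1`.

```python
def chi_square_uniform(values, cells):
    """Chi-square statistic of `values` against equal expected cell counts."""
    counts = [0] * cells
    for v in values:
        counts[v] += 1
    expected = len(values) / cells
    return sum((c - expected) ** 2 / expected for c in counts)


perfectly_even = chi_square_uniform([0, 1, 2, 3] * 5, 4)
lopsided = chi_square_uniform([0, 0, 0, 1], 2)
```

For the test described above, `cells` would be 10,000 and the result would be compared with the 10,232.8 cutoff.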
(2) Independence of successive values. In this test the leftmost two digits of IRAND are concatenated with the leftmost two digits of IRAND(t + 1). This results in the creation of 10,000 cells. With 5,000,000 generated numbers, the expected frequency in each cell is 500. A chi-square value of less than 10,232.8 indicates that successive values are equally distributed for each value of IRAND.
The chi-square value (9,999 degrees of freedom) for this distribution is 9,922.91, indicating that successive values are unpredictable.
Knuth suggests doing this test in a different manner using his serial test. This involves taking successive pairs of numbers. In the above test, the pairs used were distributed as (1, 2), (2, 3), (3, 4), . . . , (n - 1, n). In this test the pairs are distributed as (1, 2), (3, 4), (5, 6), . . . ; and 10,000,000 pairs of numbers are generated and put together as described above. The resulting chi-square value of 10,182.3 indicates that MPRNG also passes the test in this form.
(3) The autocorrelation coefficient is another measure of independence. It is defined by the equation
$$C_h = \frac{1}{N} \sum_{t} \mathrm{IRAND}(t) \times \mathrm{IRAND}(t + h).$$
The mean should approach 0.25, with a standard deviation equal to 0.22. Using this test, the Lehmer generator has been found to perform unsatisfactorily, whereas the autocorrelation for the Tausworthe generator was satisfactory for all h = 1 up to 50 [11].
In this test each generated number is divided by 10,000 to obtain the necessary decimal. Then IRAND is multiplied by IRAND(t + h). MPRNG passed this test for all h = 1 to 25, and N = 1,000,000. The results are shown in Table I.

Table I. The Results of the Autocorrelation Test
     h     Mean      Standard deviation
     1     0.2503    0.2206
     2     0.2505    0.2208
     3     0.2503    0.2205
     4     0.2503    0.2205
     5     0.2502    0.2207
     6     0.2505    0.2207
     7     0.2503    0.2207
     8     0.2503    0.2205
     9     0.2502    0.2205
    10     0.2503    0.2204
    11     0.2504    0.2206
    12     0.2503    0.2207
    13     0.2504    0.2206
    14     0.2503    0.2206
    15     0.2503    0.2205
    16     0.2504    0.2206
    17     0.2504    0.2205
    18     0.2503    0.2206
    19     0.2503    0.2205
    20     0.2504    0.2207
    21     0.2503    0.2205
    22     0.2503    0.2204
    23     0.2503    0.2205
    24     0.2504    0.2206
    25     0.2501    0.2203
    Expected values   0.2500    0.2200
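The mean and standard deviation of the lagged products can be sketched as below. The function name `lagged_product_stats` is hypothetical, and the sketch assumes the products are formed only where both IRAND(t) and IRAND(t + h) exist, since the text does not say how the end of the sample was handled. For independent uniform values the product has mean 1/4 and standard deviation sqrt(7/144), about 0.2205, which is consistent with the expected values quoted in the text.

```python
from math import sqrt


def lagged_product_stats(values, h, scale=10000):
    """Mean and standard deviation of IRAND(t) * IRAND(t + h) after scaling."""
    u = [v / scale for v in values]
    products = [u[t] * u[t + h] for t in range(len(u) - h)]
    mean = sum(products) / len(products)
    variance = sum((p - mean) ** 2 for p in products) / len(products)
    return mean, sqrt(variance)


constant_mean, constant_sd = lagged_product_stats([5000, 5000, 5000], 1)
theoretical_sd = sqrt(7.0 / 144.0)
```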
(4) “Runs up and down.” The expected frequency of runs of length k is defined by the equation
$$\frac{2\left[(k^2 + 3k + 1)N - (k^3 + 3k^2 - k - 4)\right]}{(k + 3)!},$$
with the total number of expected runs equal to (2N - 1)/3, where N is the size of the sample of generated numbers. Numbers that do not follow a random pattern tend to have more long runs than predicted. Knuth suggests performing this test in a different manner than described by IBM. The test was performed in both ways, and the results are identical. Knuth’s remarks about this test should be considered to apply only to linear-congruential-type generators. The reader is referred to IBM for a more applicable discussion.
MPRNG passed this test for N = 5,000,000. The results are shown in Table II. Chi-square values below 12.6 are not significant.

Table II. The Results of the “Runs Up and Down” Test
    k     Expected values    Up         Down
    1     416,667            416,858    416,878
    2     183,333            183,151    183,478
    3     52,778             52,636     52,577
    4     11,508             11,625     11,394
    5     2,034              2,074      2,031
    6     303                347        328
    7     39                 30         34
    8     5                  5          5
    Chi-square (6 degrees of freedom)   11.09      4.82
Chi-square values below 12.6 are not significant.
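The expected-frequency formula for runs of length k can be evaluated directly. The function name `expected_runs` is hypothetical. As an observation from the arithmetic rather than a statement from the text, evaluating the formula at N = 1,000,000 reproduces the Expected values column of Table II for k = 1 to 7.

```python
from math import factorial


def expected_runs(k, n):
    """Expected number of runs of length k in a sample of n numbers."""
    numerator = 2 * ((k * k + 3 * k + 1) * n - (k ** 3 + 3 * k * k - k - 4))
    return numerator / factorial(k + 3)


table_column = [round(expected_runs(k, 1000000)) for k in range(1, 8)]
```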
(5) The coin flip or the “above and below the mean test.” This is done by dichotomizing the values the generated numbers can assume; for instance, 0-4,999 is “heads,” and 5,000-9,999 is “tails.” Then, the runs of length k of heads and tails are counted. A chi-square test is used to confirm the results.
With a sample of 5,000,000 generated numbers, MPRNG passed this test. The results are shown in Table III. The relevant chi-square value (p = .95; 16 degrees of freedom) is 26.3.
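For a fair coin, a run of exactly k heads needs a tail, k heads, and another tail, so in N flips roughly N / 2^(k+2) such runs are expected (ignoring the two ends of the sample). The sketch below applies that standard result, which the text does not state explicitly; `expected_coin_runs` is a hypothetical name, and rounding is done half-up.

```python
def expected_coin_runs(k, n):
    """Approximate expected number of runs of exactly k heads in n fair flips."""
    return n / 2 ** (k + 2)


# Round half up so that 39,062.5 becomes 39,063.
expected_column = [int(expected_coin_runs(k, 5000000) + 0.5) for k in range(1, 18)]
```

With N = 5,000,000 this gives 625,000 for k = 1 and 10 for k = 17, the first and last expected values listed in the coin flip table.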
Table III. The Results of the Coin Flip Test
    k     Expected values    Heads      Tails
    1     625,000            625,153    624,284
    2     312,500            312,181    312,469
    3     156,250            155,904    156,297
    4     78,125             77,734     77,926
    5     39,063             39,419     39,297
    6     19,531             19,530     19,652
    7     9,766              9,748      9,768
    8     4,883              4,939      4,916
    9     2,441              2,452      2,475
    10    1,221              1,233      1,237
    11    610                617        645
    12    305                278        272
    13    153                161        142
    14    76                 84         75
    15    38                 44         31
    16    19                 25         13
    17    10                 9          9
    18                       3          4
    19                       0          2
    20                       0          0
    Chi-square (16 degrees of freedom)  14.07      14.22
Chi-square values below 26.3 are not significant.

(6) The independence of a subset of generated values. Binder [1, 2] has noted that in one type of generator, even though successive values seem to be independent, there appears to be a predictable relationship (“strong triplet correlations”) between a generated number and the numbers generated after it.
Two tests of a subset of generated values were done. The first is similar to that described in test 2 above. In this test the leftmost two digits of IRAND were concatenated with the leftmost two digits of IRAND(t + h) for h = 2 to 5. The resulting chi-square values (p = .95, 9,999 degrees of freedom), shown below, show that the second through the fifth numbers following IRAND cannot be predicted by knowing IRAND (for N = 10,000,000):
    h     chi-square
    2     10,082.9
    3     10,131.6
    4     10,176.2
    5     9,885.1
The second test is a crude version of that described above. For this test 2,000,000 numbers were generated. The leftmost digit of IRAND was concatenated with the leftmost digit of IRAND(t + h) for h = 1 up to 25. This results in 100 cells for each h. A chi-square value (99 degrees of freedom) of less than 123.23 is not significant. The results, presented in Table IV, show that there is no relationship between any generated number and the 25 following generated numbers.
(7) Frequency of digits in each place. The results of this test are shown in Table V. Chi-square values (9 degrees of freedom) below 16.9 are not significant. This test shows that digits are distributed equally in each place for 5,000,000 generated numbers.
(8) The gap test. This tests the frequency of observed distances separating two equal digits in a place. The expected distance is 10. However, the expected curve follows a classical decay function where 10 percent are expected to repeat in the next generated number, 9 percent the following number, and so on. A chi-square test is used to confirm the results.
This test was confirmed using 100,000 generated numbers. The results are shown in Tables VI and VII. In Table VII the results represent a cutoff of the cells where the expected value in the cell falls below 10.
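The decay curve described for the gap test is a geometric distribution with success probability 0.1, whose mean is 10. The sketch below shows the expected proportions and one way of collecting observed gaps; both function names are hypothetical, and the text does not spell out how gaps were tallied.

```python
def gap_probability(g, p=0.1):
    """Probability that a digit next recurs exactly g numbers later."""
    return p * (1.0 - p) ** (g - 1)


def observed_gaps(digits, target):
    """Distances between successive occurrences of `target` in `digits`."""
    gaps, last = [], None
    for i, d in enumerate(digits):
        if d == target:
            if last is not None:
                gaps.append(i - last)
            last = i
    return gaps


mean_gap = sum(g * gap_probability(g) for g in range(1, 2001))
example_gaps = observed_gaps([3, 1, 3, 3, 5, 3], 3)
```

The first two proportions are 10 percent and 9 percent, as quoted in the text, and the truncated sum for the mean is numerically indistinguishable from 10.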
Table IV. Results of the Test of a Subset of Values
    h     Chi-square
    1     102.56
    2     110.50
    3     106.97
    4     110.27
    5     102.85
    6     89.02
    7     109.67
    8     107.03
    9     78.92
    10    90.59
    11    109.64
    12    75.82
    13    100.80
    14    109.41
    15    79.22
    16    97.77
    17    84.85
    18    88.54
    19    92.39
    20    97.49
    21    107.55
    22    110.14
    23    84.67
    24    81.67
    25    119.39
Chi-square values below 123.23 are not significant.

Table V. The Results of the Test for Digits in each Place
    Digit         Thousands    Hundreds    Tens       Ones
    0             500,727      500,450     500,797    500,819
    1             498,950      500,046     500,315    498,757
    2             498,664      500,839     500,712    499,636
    3             500,813      500,337     498,941    499,600
    4             500,570      500,528     501,234    500,117
    5             499,368      499,479     499,054    500,354
    6             499,722      499,768     499,465    499,468
    7             500,805      499,362     499,331    501,191
    8             499,797      500,526     499,073    500,183
    9             500,584      498,665     501,078    499,875
    Chi-square    11.82        8.18        15.07      8.80
Expected frequency in each cell = 500,000. Chi-square values below 16.9 are not significant.

Following are additional tests, which were also done on MPRNG.
(9) Repeatability. Some generators do not allow numbers to repeat. However, with a truly random sequence of numbers, digits and combinations of digits will repeat with a known probability. MPRNG was designed keeping this in mind (it was also a consideration in [6]). Each number generated represents a particular combination of four digits with a probability of repeating equal to 0.0001.
Repeatability was demonstrated using 5,000,000 generated numbers. The expected number of repeats is 500; the actual number is 487. This falls easily within the expected standard deviation of 22.4.
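The expected count and standard deviation quoted for the repeatability test follow from a binomial model with success probability 0.0001. The sketch below assumes that a "repeat" means a generated number equal to its immediate predecessor, which the text does not state explicitly; both function names are hypothetical.

```python
from math import sqrt


def repeat_statistics(n, p=0.0001):
    """Binomial mean and standard deviation of the number of repeats."""
    mean = n * p
    return mean, sqrt(mean * (1.0 - p))


def count_repeats(values):
    """Count positions where a value equals the one just before it."""
    return sum(1 for a, b in zip(values, values[1:]) if a == b)


mean_repeats, sd_repeats = repeat_statistics(5000000)
```

Under this model the observed 487 repeats lie less than one standard deviation from the mean.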
Table VI. The Average Distance between Digits in each Place (the distance should approach 10)
    Digit: 0  1  2  3  4  5  6  7  8  9
    Thousands and Hundreds:
        9.95  10.02  9.99  9.96  9.98  9.92  10.03  10.01
        10.07  9.86  10.22  9.87  10.22  9.91  10.11  9.99  9.84  9.91  10.08  10.06
    Tens and Ones:
        10.01  9.87  10.16  9.98  10.00  9.92  9.94  10.08  9.97  10.04
        10.09  10.01  9.88  10.09  9.85  9.94  9.97  10.01  10.19  9.97

Table VII. Chi-Square Values in each Place for the Gap Test
    Degrees of freedom    Expected maximum    Thousands    Hundreds    Tens     Ones
    65                    85.61               80.94        66.73       51.48    52.33
(10) Independence between digits in adjacent places. Not only was independence tested from one generated number to the next, but within each generated number, independence was tested from one place to the next.
With a sample of 2,000,000 generated numbers, it was confirmed that digits in one place cannot be predicted from any digits in its neighboring place. In this test the digit in each place was concatenated with the digit in all other places, resulting in 100 cells for each possible combination of digits. The chi-square values are given in Table VIII. Note that the table is symmetric; that is, the chi-square value for the tens-ones pair is the same as the chi-square value for the ones-tens pair.
(11) Collision test (from Knuth [4]). For this test suppose that n balls are tossed into m urns, where m is greater than n. Each time a ball falls into an urn that is already occupied, a “collision” occurs. Knuth considers this to be one of the better tests for random number generators. The probability that a given urn will contain exactly k balls is
$$p_k = \binom{n}{k} m^{-k} \left(1 - m^{-1}\right)^{n-k},$$
giving the average total number of collisions to be something less than
$$n^2 / (2 \times m).$$
For this test 200,000 numbers were generated, giving 200 series of 1,000 numbers. If m = 10,000 and n = 1,000, then the expected average will be less than 50. The actual result is an average of 48.73 collisions, which indicates MPRNG passes this test.
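The expected number of collisions can be computed exactly from the probability that an urn stays empty: the expected number of occupied urns is m(1 - (1 - 1/m)^n), and every ball beyond those is a collision. The sketch below uses that standard identity; the function names are hypothetical, and treating each four-digit value as its own urn is an assumption suggested by m = 10,000.

```python
def expected_collisions(n, m):
    """Expected collisions when n balls are tossed into m urns."""
    expected_occupied = m * (1.0 - (1.0 - 1.0 / m) ** n)
    return n - expected_occupied


def count_collisions(values, m):
    """Count values that land in an already occupied urn (urn index = value)."""
    occupied = [False] * m
    collisions = 0
    for v in values:
        if occupied[v]:
            collisions += 1
        else:
            occupied[v] = True
    return collisions


expected = expected_collisions(1000, 10000)
upper_bound = 1000 ** 2 / (2 * 10000)
```

For n = 1,000 and m = 10,000 the exact expectation falls between 48 and 49, below the bound of 50 and close to the reported average of 48.73.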
Table VIII. The Results of the Test for Independence of Adjacent Digits
    Chi-square    Thousands    Hundreds    Tens     Ones
    Thousands     N/A
    Hundreds      95.88        N/A
    Tens          115.70       90.82       N/A
    Ones          90.12        79.00       85.01    N/A
Chi-square values below 123.23 are not significant.

(12) The serial correlation test (suggested by Knuth). This is another test that measures the extent of the relationship between IRAND and IRAND(t + h). It is actually the correlation coefficient and measures the extent to which they covary. It is given by the equation
$$C = \frac{N \times \mathrm{SUMT1} - (\mathrm{SUMT})^2}{N \times \mathrm{SUMT2} - (\mathrm{SUMT})^2},$$
where N = sample size, SUMT = sum of IRAND(t), SUMT1 = sum of IRAND × IRAND(t + h), and SUMT2 = sum of IRAND × IRAND(t); C will vary from +1 to -1, where either extreme indicates a perfect relationship, and C = 0 indicates no relationship. For 50,000 generated numbers, the serial correlation test indicated no relationship between IRAND and IRAND(t + h) for h = 1 up to 25. The results are shown in Table IX.

Table IX. The Results of the Serial Correlation Test
    h     Correlation
    1     -0.0030
    2     -0.0017
    3     -0.0040
    4     -0.0001
    5     0.0060
    6     0.0032
    7     0.0009
    8     0.0005
    9     0.0047
    10    0.0009
    11    0.0011
    12    0.0010
    13    0.0015
    14    -0.0050
    15    -0.0014
    16    -0.0064
    17    -0.0082
    18    -0.0066
    19    0.0012
    20    -0.0005
    21    -0.0008
    22    -0.0065
    23    -0.0092
    24    0.0050
    25    -0.0008
A correlation of 0 indicates no relationship; +1 or -1 indicates a perfect relationship.
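The serial correlation formula can be sketched as below, keeping the names SUMT, SUMT1, and SUMT2 from the text. The function name `serial_correlation` is hypothetical, and pairing the last h values with the first h (wrap-around) is an assumption; the text does not say how the end of the sample was treated.

```python
def serial_correlation(values, h):
    """Serial correlation between IRAND(t) and IRAND(t + h), with wrap-around."""
    n = len(values)
    sumt = sum(values)
    sumt1 = sum(values[t] * values[(t + h) % n] for t in range(n))
    sumt2 = sum(v * v for v in values)
    return (n * sumt1 - sumt ** 2) / (n * sumt2 - sumt ** 2)


identical = serial_correlation([1, 2, 3, 4], 0)
shifted = serial_correlation([1, 2, 3, 4], 2)
```

With wrap-around the sums of IRAND(t) and IRAND(t + h) are equal, which is why a single SUMT suffices in the formula.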
(13) The Spectral test. This test is used by Knuth to test the efficacy of the multiplier in the linear congruential generators. It measures the equal distribution of IRAND over t dimensions and is normally run over the full period of the generator. Knuth [4, p. 91] notes that equal distribution of anything over 10 dimensions has “no practical significance.”
The test of a subset of numbers (test (6) above) shows that MPRNG is equally distributed over 25 dimensions and would therefore pass the spectral test, at least for the limited period of the test.
4. FLEXIBILITY
Owing to the properties of the type of series generated, the user is allowed to use virtually any starting values for the seeds M, IA, and IB, because, by the time MPRNG has gone through its full period, all combinations of M, IA, and IB will have been used. Generally, the starting seed values should fall within the range between the minimum and maximum integers of the particular series, although starting at less than the minimum value affects the period of the generator only minimally. For MPRNG above, it is recommended that M lie between 1,000 and 9,000, IA between 20,000 and 90,000, and IB between 130,000 and 220,000. This gives the user on the order of $5 \times 10^{13}$ unique series of numbers from which to choose.
As it turns out, it also does not matter what the beginning seed for IRAND is, although it should be less than 10,000. This is due to the fact that, after the generator has gone through its full period, the final value of IRAND will always be the same regardless of its starting value. At this point the entire series starts repeating. This happens because, at some point in the series, the values for IRAND become the same no matter what the beginning value of IRAND was. This point is different depending on the initial value of IRAND. For instance, in MPRNG above, if the initial value of IRAND is changed from 4,181 to 8,362, every number beginning with the 1,128th generated number is the same in either series. If it is changed from 4,181 to 2,093 or 5,000, the point at which the series become identical is after the 4,806th generated number. This means that, in order to change MPRNG from one series to another, one or more of the values of the variables M, IA, or IB must be changed. Changing IRAND will have essentially no effect at all on the generated series.
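The merging of series started from different IRAND seeds can be explored with a short experiment. This is a sketch only: `mprng_stream` re-transcribes the FORTRAN loop, `merge_point` is a hypothetical helper, and the search limit of 100,000 numbers is an arbitrary assumption. No particular merge index is asserted here; the figures quoted in the text are the author's.

```python
def mprng_stream(irand, m=971, ia=11113, ib=104322):
    """Yield MPRNG output for a given IRAND seed (other seeds as in the listing)."""
    while True:
        m += 7
        ia += 1907
        ib += 73939
        if m >= 9973:
            m -= 9871
        if ia >= 99991:
            ia -= 89989
        if ib >= 224729:
            ib -= 96233
        irand = ((irand * m + ia + ib) % 100000) // 10
        yield irand


def merge_point(seed_a, seed_b, limit=100000):
    """1-based index of the first generated number at which both streams agree.

    M, IA, and IB are identical in both streams, so once the two IRAND values
    coincide they stay equal. Returns None if no match occurs within `limit`.
    """
    a, b = mprng_stream(seed_a), mprng_stream(seed_b)
    for index in range(1, limit + 1):
        if next(a) == next(b):
            return index
    return None


same_seed = merge_point(4181, 4181)
other_seed = merge_point(4181, 8362)
```

Merging is possible because the step from one IRAND to the next discards digits, so distinct IRAND values can map to the same successor.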
The user can adjust the length of the period in two ways. First, the user can just change the relevant prime numbers. For instance, by changing the M comparison from
      IF(M .GE. 9973)M = M - 9871
to
      IF(M .GE. 99901)M = M - 89909,
the period becomes 89,989 × 89,909 × 96,233, or about 800 trillion.
Second, one can include another increment, for instance, an IC, making the formula IRAND × M + IA + IB + IC. In MPRNG, if we were to include an IC in the form
      IF(IC .GE. 9925387)IC = IC - 9121439,
the period then becomes 9,121,439 × 9,871 × 89,989 × 96,233, or about $8 \times 10^{20}$.
It has already been shown that both of these methods of increasing the period are valid. The number of increments that can be included is limited only by the integer size of the computer and, as a practical matter, the number of prime numbers of the needed magnitude that are readily available. In this way, if it is desired, the period of the generator can be pushed past the approximate 1018’ numbers described by [4], [6], and [10]. In one estimation, using just over 100 increments, the period could become about $10^{750}$.
Whenever the primes are changed, the minimum integer values for each
increment series are summed, and that value must be greater than 100,000. This
is done in order to ensure that there are digits remaining in the relevant places
when the product M X IRAND equals 0 (which will occur on the average of once
for every 10,000 generated numbers, i.e., when IRAND equals 0). In MPRNG
the minimums for IA and IB are 10,002 and 128,496, respectively, and their sum
is 138,498. The maxima must also be checked in order to ensure that the
maximum value does not exceed the largest value the computer can have.
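Both checks described above can be written out for the primes used in MPRNG. The helper name `cycle_bounds` is hypothetical, and the largest signed 32-bit integer is assumed to be 2^31 - 1.

```python
def cycle_bounds(limit, subtract):
    """Smallest and largest values on a cycle with the given primes."""
    return limit - subtract, limit - 1


m_min, m_max = cycle_bounds(9973, 9871)
ia_min, ia_max = cycle_bounds(99991, 89989)
ib_min, ib_max = cycle_bounds(224729, 96233)

# The increments alone must reach the relevant digits when IRAND is 0.
minimum_sum = ia_min + ib_min
# The largest intermediate value must fit in a 32-bit signed integer.
largest_value = 9999 * m_max + ia_max + ib_max
fits_32_bit = largest_value <= 2 ** 31 - 1
```

Repeating the same two checks after any change of primes is a quick way to confirm that a modified generator still meets both constraints.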
Clearly, MPRNG presently passes all the stated criteria. It is easily transferable
to any machine of 32 bits or higher. The period is long and, better still, absolutely
predictable. It passes all of the requisite statistical tests, and is conceptually easy
to understand, flexible, and easy to program in most major languages on most
machines.
REFERENCES
1. BINDER, K. Introduction: Theory and technical aspects of Monte Carlo simulations. In Topics in Current Physics: Monte Carlo Methods in Statistical Physics, K. Binder, Ed. Springer-Verlag, New York, 1979, pp. 1-45.
2. BINDER, K. Topics in Current Physics: Applications of the Monte Carlo Method in Statistical Physics. Springer-Verlag, New York, 1984, pp. 2-4.
3. IBM. Random number generation and testing. IBM Manual GC20-8011-0, IBM, Mechanicsburg, Pa., copyright 1959, reprinted 1969.
4. KNUTH, D. E. The Art of Computer Programming. Vol. 2, Seminumerical Algorithms. Addison-Wesley, Reading, Mass., 1981.
5. LEHMER, D. H. Mathematical methods in large scale computing units. In Proceedings of the 2nd Symposium on Large Scale Digital Calculating Machinery. Ann. Comput. Lab., Harvard Univ. XXVI, Cambridge, Mass., 1951, pp. 141-146.
6. LEWIS, T. G., AND PAYNE, W. H. Generalized feedback shift register pseudorandom number algorithm. J. ACM 20, 3 (July 1973), 456-468.
7. SELBY, S. M., ED. CRC Standard Mathematical Tables. 23rd ed. CRC Press, Boca Raton, Fla., 1975.
8. TAUSWORTHE, R. C. Random numbers generated by linear recurrence modulo two. Math. Comput. 19 (1965), 201-209.
9. TOOTILL, J. P. R., ROBINSON, W. D., AND ADAMS, A. G. The runs up-and-down performance of Tausworthe pseudo-random number generators. J. ACM 18, 3 (July 1971), 381-399.
10. TOOTILL, J. P. R., ROBINSON, W. D., AND EAGLE, D. J. An asymptotically random Tausworthe sequence. J. ACM 20, 3 (July 1973), 469-481.
11. WHITTLESEY, J. R. B. A comparison of the correlational behavior of random number generators for the IBM 360. Commun. ACM 11, 9 (Sept. 1968), 641-644.
Received February 1986; revised June 1987; accepted September 1987