Uniform Random Rational Number Generation

Thomas Morgenstern
Hochschule Harz, Friedrichstr. 57-59, D-38855 Wernigerode
[email protected]
Summary. Classical floating point random numbers fail simple tests when considered as rational numbers.
A uniform random number generator based on a linear congruential generator
with output in the rational numbers is implemented within the two computer algebra
systems Maple 10 and MuPAD Pro 3.1.
The empirical tests in Knuth’s suite of tests show no evidence against the hypothesis that the output samples independent and identically uniformly distributed
random variables.
Key words: uniform random number generation, computer algebra system, statistical test, stochastic simulation
1 Introduction
Randomized algorithms like Monte Carlo simulation require long sequences of
random numbers $(x_i)_{i \in \mathbb{N}}$. These sequences are required to be samples of independent and identically distributed (i.i.d.) random variables $(X_i)_{i \in \mathbb{N}}$. Of special
interest are random variables that are uniformly distributed in the open unit interval $(0, 1) \subseteq \mathbb{R}$,
i.e. $U_i \sim U(0, 1)$.
Random numbers produced by algorithmic random number generators
(RNG) are never random, but should appear to be random to the uninitiated. We call these generators pseudorandom number generators [1]. They
should pass statistical tests of the hypothesis [2]:
H0 : the sequence is a sample of i.i.d. random variables with the given
distribution.
In fact, no pseudo RNG can pass all statistical tests. So we may say that
bad RNGs are those that fail simple tests, whereas good RNGs fail only
complicated tests that are very hard to find and to run [4, 5].
1.1 Definitions
We follow the definitions in [2–4]:
Definition 1. A (pseudo-)random number generator (RNG) is a structure
$(S, \mu, t, O, o)$ where $S$ is a finite set of states (the state space), $\mu$ is a probability distribution on $S$ used to select the initial state (or seed) $s_0$, $t : S \to S$
is the transition function, $O$ is the output space, and $o : S \to O$ is the output
function.
The state of the RNG evolves according to the recurrence $s_i = t(s_{i-1})$,
for $i \ge 1$, and the output at step $i$ is $u_i = o(s_i) \in O$. The output values
$u_0, u_1, u_2, \ldots$ are called the random numbers produced by the RNG.
In order to keep the computation times moderate and to compare our results with the literature, we use a generator implemented in Maple 10 [9] and
MuPAD Pro 3.1 [10] (and discussed in [4]) and call it LCG10. It is a multiplicative congruential generator ($c = 0$) with multiplier $a = 427\,419\,669\,081$
and modulus $m = 10^{12} - 11 = 999\,999\,999\,989$, i.e. with the recurrence:
$$s_{i+1} := 427\,419\,669\,081 \cdot s_i \bmod 999\,999\,999\,989 . \qquad (1)$$
It has period length $\rho = 10^{12} - 12$. To produce values in the unit interval
$O = (0, 1) \subseteq \mathbb{R}$ one usually uses the output function:
$$u_i = o(s_i) := s_i / m .$$
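The recurrence of Eq. 1 and the usual output function can be sketched as follows. This is a Python illustration, not the Maple or MuPAD implementation used in this paper; the names `lcg10_step` and `lcg10_output` are assumptions made for the example, and the output is kept as an exact fraction rather than a floating point number.

```python
from fractions import Fraction

A = 427_419_669_081   # multiplier a of Eq. (1)
M = 10**12 - 11       # modulus m = 999 999 999 989

def lcg10_step(s):
    """Transition function t: one step of the multiplicative recurrence."""
    return A * s % M

def lcg10_output(s):
    """Output function o: the state divided by the modulus, kept exact."""
    return Fraction(s, M)

# Generate a few outputs starting from seed 1.
s = 1
outputs = []
for _ in range(5):
    s = lcg10_step(s)
    outputs.append(lcg10_output(s))

for u in outputs:
    print(u, float(u))
```

Written as exact fractions, all of these outputs share the single denominator m, which is the property examined in Sec. 1.2.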
1.2 Problems and Indications
Example 1. Consider a microwave with frequency $10^{10}$ Hz. We want to determine the signal energy $E = \int_0^1 \cos^2(2\pi\, 10^{10} \times t)\, dt$ by Monte-Carlo integration. We use the generator Eq. 1 to produce (decimal floating point, 10 digits precision) numbers $u_i \in (0, 1)$ and simulate the times $t_i := 2\pi\, 10^{10} \times u_i$.
For $n$ random numbers and $S := \sum_{i=1}^{n} \cos^2(t_i)$ we expect $E \approx S/n$.
Using $n := 10^6$ and 10 iterations starting with seed 1 we get values in the
interval [0.949, 0.950] with mean 0.9501, far from the true result 0.50. Using
the same numbers to integrate the cosine $\int_0^1 \cos(2\pi\, 10^{10} \times t)\, dt$ we get values
in [0.899, 0.901] with mean 0.9000.
These results are due to the fact that mainly natural-number multiples of $2\pi$
are generated. To see this, we multiply the number $u_i$ by $m = 10^{12}$ or $m =
999\,999\,999\,989$ and treat the result as a rational number $r_i$ (e.g. convert it
to a rational number using convert(u, rational, exact) or convert(u,
rational, 10) in Maple 10). The least common multiple of the denominators
is 1, which is unlikely for true random rational numbers.
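The effect in Example 1 can be reproduced in a small Python sketch, under the assumption that the floating point outputs are rounded to 10 significant decimal digits (emulated here with the `decimal` module). Any output $u_i \ge 0.1$ then has exactly 10 decimal places, so $10^{10} \times u_i$ is an integer and the cosine equals 1. This happens for roughly 90% of the outputs, which is consistent with means near $0.9 + 0.1 \cdot 0.5 = 0.95$ for $\cos^2$ and near 0.90 for $\cos$. The function name `monte_carlo_means` and the reduced sample size are illustrative choices, and the argument is reduced exactly before the cosine is evaluated, which may differ from how Maple evaluates it.

```python
import math
from decimal import Decimal, localcontext
from fractions import Fraction

A = 427_419_669_081   # multiplier of LCG10
M = 10**12 - 11       # modulus of LCG10

def monte_carlo_means(n, seed=1, digits=10):
    """Return the sample means of cos(t_i) and cos^2(t_i), t_i = 2*pi*1e10*u_i."""
    s = seed
    sum_cos = 0.0
    sum_cos2 = 0.0
    with localcontext() as ctx:
        ctx.prec = digits                    # 10 significant decimal digits
        for _ in range(n):
            s = A * s % M
            u = Decimal(s) / Decimal(M)      # rounded floating point output u_i
            k = Fraction(u) * 10**10         # exact value of 1e10 * u_i
            frac = float(k % 1)              # cos has period 2*pi: keep the fractional part
            c = math.cos(2 * math.pi * frac)
            sum_cos += c
            sum_cos2 += c * c
    return sum_cos / n, sum_cos2 / n

mean_cos, mean_cos2 = monte_carlo_means(20_000)
print("mean of cos  :", mean_cos)
print("mean of cos^2:", mean_cos2)
```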
2 Random Rational Numbers
We construct a new generator producing rational numbers as output.
2.1 A Uniform Random Rational Number Generator
The new generator RationalLCG10 is defined in Table 1 and is based on
LCG10 (Eq. 1). The initial state s0 is chosen arbitrarily.
Table 1. RationalLCG10
s := 427419669081*s mod 999999999989;   # first state of the pair
q := s;
s := 427419669081*s mod 999999999989;   # second state of the pair
if (s < q) then p := s else p := q; q := s end if;   # p: smaller state, q: larger state
return r := p/q;
The generator has period length $\rho = 5 \cdot 10^{11} - 6$, i.e. half the period length of
LCG10, because each output consumes two states. (Note that $p \neq q$, because two consecutive states of an LCG can only
be equal if all states are the same.)
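As a minimal sketch, the steps of Table 1 can be translated into a Python generator with exact rational output. This is an illustrative translation rather than the Maple or MuPAD code used in this paper; the name `rational_lcg10` is an assumption.

```python
from fractions import Fraction
from itertools import islice

A = 427_419_669_081   # multiplier of LCG10
M = 10**12 - 11       # modulus of LCG10

def rational_lcg10(seed=1):
    """Yield the outputs of RationalLCG10 (Table 1) as exact fractions."""
    s = seed
    while True:
        s = A * s % M
        q = s                      # first of two consecutive states
        s = A * s % M
        if s < q:
            p = s                  # the second state is the smaller one
        else:
            p, q = q, s            # the first state is the smaller one
        yield Fraction(p, q)       # smaller state over larger state

sample = list(islice(rational_lcg10(seed=1), 1000))
print(sample[:3])
print(len({r.denominator for r in sample}), "distinct denominators")
```

Each output lies strictly between 0 and 1, and the denominators vary from draw to draw, in contrast to the single denominator m of LCG10.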
2.2 Empirical Tests
We test our generator RationalLCG10 with Knuth’s suite of empirical tests
(see [1, pp. 61–73]), all implemented in Maple 10 [9] and MuPAD Pro 3.1 [10].
Kolmogorov-Smirnov equidistribution test
The Kolmogorov-Smirnov test is described in [1, pp. 48–58]. Starting with
seed 1 and $n = 10^5$ numbers, we get in 55 tests the results in Table 2.
Table 2. Kolmogorov-Smirnov test, n = 10^5

  Range of p+           Tests     Range of p−           Tests
  0% ≤ p+ < 5%            4       0% ≤ p− < 5%            4
  5% ≤ p+ < 10%           3       5% ≤ p− < 10%           3
  10% ≤ p+ ≤ 90%         42       10% ≤ p− ≤ 90%         41
  90% < p+ ≤ 95%          3       90% < p− ≤ 95%          4
  95% < p+ ≤ 100%         3       95% < p− ≤ 100%         3
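One way to compute the two Kolmogorov-Smirnov statistics for a sample of RationalLCG10 is sketched below in Python. The formulas for K+ and K− follow the usual definitions for the uniform distribution $F(x) = x$; the percentile uses only the leading large-n approximation $1 - e^{-2k^2}$, which is an assumption of this sketch and not necessarily how the values in Table 2 were obtained. The function names and the reduced sample size are illustrative.

```python
import math
from itertools import islice

A = 427_419_669_081
M = 10**12 - 11

def rational_lcg10(seed=1):
    """RationalLCG10 of Table 1, returned as floats for this test."""
    s = seed
    while True:
        s = A * s % M
        q = s
        s = A * s % M
        p, q = (s, q) if s < q else (q, s)   # p: smaller state, q: larger state
        yield p / q

def ks_statistics(xs):
    """Return (K+, K-) for the hypothesis that xs is uniform on (0, 1)."""
    n = len(xs)
    xs = sorted(xs)
    k_plus = math.sqrt(n) * max((j + 1) / n - x for j, x in enumerate(xs))
    k_minus = math.sqrt(n) * max(x - j / n for j, x in enumerate(xs))
    return k_plus, k_minus

def approx_percentile(k):
    """Leading large-n approximation of P(K <= k)."""
    return 1.0 - math.exp(-2.0 * k * k)

xs = list(islice(rational_lcg10(seed=1), 1000))
k_plus, k_minus = ks_statistics(xs)
print("K+ =", k_plus, "percentile ~", approx_percentile(k_plus))
print("K- =", k_minus, "percentile ~", approx_percentile(k_minus))
```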
Chi-square equidistribution test
The Chi-square test is described in [1, pp. 42–47]. Starting with seed 1 and
20 tests we get with $n = 10^5$ random numbers and 20 000 subintervals the
occurrences in Table 3(a). With $n = 10^6$ and 200 000 classes we get the results
in Table 3(b).
Table 3. Chi-square test: (a) n = 10^5; (b) n = 10^6

  Range of p            (a)     (b)
  0% ≤ p < 5%            2       1
  5% ≤ p < 10%           1       3
  10% ≤ p ≤ 90%         17      15
  90% < p ≤ 95%          0       0
  95% < p ≤ 100%         0       1
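A chi-square equidistribution test of this kind can be sketched in Python as follows. The sample is sorted into equally wide subintervals with exact rational arithmetic, and the statistic is compared against a chi-square distribution with one degree of freedom less than the number of subintervals. The percentile uses the Wilson-Hilferty normal approximation, which is a substitute chosen for this sketch and not necessarily the method behind Table 3. Function names and the reduced sizes (10 000 numbers, 2 000 subintervals, which keeps the expected count of 5 per subinterval) are illustrative.

```python
import math
from fractions import Fraction
from itertools import islice

A = 427_419_669_081
M = 10**12 - 11

def rational_lcg10(seed=1):
    """RationalLCG10 of Table 1 with exact rational output."""
    s = seed
    while True:
        s = A * s % M
        q = s
        s = A * s % M
        p, q = (s, q) if s < q else (q, s)   # p: smaller state, q: larger state
        yield Fraction(p, q)

def chi_square_statistic(xs, cells):
    """Chi-square statistic for equidistribution of xs over `cells` subintervals."""
    n = len(xs)
    counts = [0] * cells
    for x in xs:
        counts[math.floor(x * cells)] += 1   # exact cell index, since x is a Fraction
    expected = n / cells
    return sum((c - expected) ** 2 for c in counts) / expected

def chi_square_percentile(v, dof):
    """Wilson-Hilferty normal approximation of the chi-square CDF."""
    z = ((v / dof) ** (1 / 3) - (1 - 2 / (9 * dof))) / math.sqrt(2 / (9 * dof))
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

cells = 2000
xs = list(islice(rational_lcg10(seed=1), 10_000))
v = chi_square_statistic(xs, cells)
p_value = chi_square_percentile(v, cells - 1)
print("V =", v, "percentile ~", p_value)
```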
Birthday spacings test
The Birthday spacings test is described in [1, pp. 71–72] and discussed in
[5–8]. We consider a sparse case in $t = 2$ dimensions.
For $n = 10^5$ random numbers in 50 000 random vectors we choose $d =
5\,590\,169$ classes per dimension, such that we have $k = 31\,249\,989\,448\,561$
categories in total. For $n = 10^6$ random numbers, we choose $d = 176\,776\,695$
classes per dimension, such that we have $k = 31\,249\,999\,895\,123\,025$ categories.
In both cases we expect on average $\lambda = 1$ collisions of birthday spacings.
For LCG10 in 10 tests (and initial state 1) we get more than 3 collisions
for the tests in the first case and more than 1 100 collisions in the second case.
The probability of getting this many or more collisions is less than 2% in the first case and even less
than $10^{-40}$ in the second case. LCG generators are known to fail this test for
$n > \sqrt[3]{\rho}$ (see [5, 6, 8]). For the generator RationalLCG10 we get the unsuspicious
results shown in Table 4.
Table 4. Birthday spacings test: (a) n = 10^5; (b) n = 10^6

  Range of p            (a)     (b)
  0% ≤ p < 5%            2       2
  5% ≤ p < 10%           3       2
  10% ≤ p ≤ 90%         15      16
  90% < p ≤ 95%          0       0
  95% < p ≤ 100%         0       0
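The collision count of a two-dimensional birthday spacings test can be sketched in Python as below. The sketch follows one common formulation: each pair of consecutive outputs is mapped to one of $k = d^2$ categories, the category indices are sorted, the spacings between neighbours are sorted again, and equal neighbouring spacings are counted as collisions. The expected number of collisions is approximated by the standard value $\lambda = N^3 / (4k)$ for $N$ points, which gives $\lambda \approx 1$ for the parameters stated above. The sizes here (5 000 points, d = 176 776) are scaled down for illustration and are not the ones used for Table 4; the function names are assumptions.

```python
import math
from fractions import Fraction

A = 427_419_669_081
M = 10**12 - 11

def rational_lcg10(seed=1):
    """RationalLCG10 of Table 1 with exact rational output."""
    s = seed
    while True:
        s = A * s % M
        q = s
        s = A * s % M
        p, q = (s, q) if s < q else (q, s)   # p: smaller state, q: larger state
        yield Fraction(p, q)

def birthday_collisions(gen, n_points, d):
    """Count equal spacings among n_points points in d*d categories."""
    cells = []
    for _ in range(n_points):
        u1, u2 = next(gen), next(gen)        # one random vector in t = 2 dimensions
        cells.append(math.floor(u1 * d) * d + math.floor(u2 * d))
    cells.sort()
    spacings = sorted(b - a for a, b in zip(cells, cells[1:]))
    return sum(1 for a, b in zip(spacings, spacings[1:]) if a == b)

n_points = 5000
d = 176_776                                  # classes per dimension (illustrative)
lam = n_points**3 / (4 * d * d)              # approximate expected number of collisions
collisions = birthday_collisions(rational_lcg10(seed=1), n_points, d)
print("lambda ~", lam, "collisions:", collisions)
```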
Serial correlation test
The Serial correlation test computes the autocorrelations as described in [1,
pp. 72–73]. With $n = 10^6$ random numbers one expects correlations within
the range [−0.002, 0.002] in 95% of the tests.
For the generator RationalLCG10 starting from seed 1 with lags from 1
up to 100 in one test we get the results shown in Table 5.
Table 5. Serial correlation test, n = 10^6

  Range of p                 Lags
  p < −0.003                   0
  −0.003 ≤ p < −0.002          2
  −0.002 ≤ p ≤ 0.0            50
  0.0 < p ≤ 0.002             46
  0.002 < p ≤ 0.003            2
  0.003 < p                    0
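The autocorrelation for a given lag can be sketched in Python as follows, using the cyclic form of the serial correlation coefficient. For $n$ numbers the 95% range is roughly $\pm 2/\sqrt{n}$, which matches the range [−0.002, 0.002] quoted above for $n = 10^6$. The function name `serial_correlation` and the reduced sample size are illustrative assumptions.

```python
import math
from itertools import islice

A = 427_419_669_081
M = 10**12 - 11

def rational_lcg10(seed=1):
    """RationalLCG10 of Table 1, returned as floats for this test."""
    s = seed
    while True:
        s = A * s % M
        q = s
        s = A * s % M
        p, q = (s, q) if s < q else (q, s)   # p: smaller state, q: larger state
        yield p / q

def serial_correlation(xs, lag):
    """Cyclic serial correlation coefficient of xs for the given lag."""
    n = len(xs)
    s1 = sum(xs)
    s2 = sum(x * x for x in xs)
    sp = sum(xs[j] * xs[(j + lag) % n] for j in range(n))
    return (n * sp - s1 * s1) / (n * s2 - s1 * s1)

n = 10_000
xs = list(islice(rational_lcg10(seed=1), n))
bound = 2 / math.sqrt(n)                     # approximate 95% range
correlations = [serial_correlation(xs, lag) for lag in range(1, 11)]
for lag, c in enumerate(correlations, start=1):
    print(lag, c, "inside" if abs(c) <= bound else "outside")
```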
2.3 Improvements
We continue Example 1 using the generator RationalLCG10. In 10 simulations
with $n = 10^6$ numbers we get on average 0.500 028 for the energy and 0.000 386
for the integral of cos. In Table 6 we find the results of 20 simulations,
with average 0.000 270 for the integral of cos and 0.499 968 for $\cos^2$.
Table 6. Monte-Carlo simulation: (a) cos(x); (b) cos^2(x)

  (a) Range of S̄                 Simulations
  −0.0015 ≤ S̄ ≤ −0.0005              4
  −0.0005 < S̄ ≤ +0.0005              7
  +0.0005 < S̄ ≤ +0.0015              9

  (b) Range of S̄                 Simulations
  0.4993 ≤ S̄ < 0.4998                7
  0.4998 < S̄ ≤ 0.5003               10
  0.5003 < S̄ ≤ 0.5008                3
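The simulation of Example 1 with rational outputs can be sketched in Python as below. Because the output $r = p/q$ is exact, the argument $10^{10} \times r$ can be reduced modulo 1 with integer arithmetic before the cosine is evaluated. This reduction, the function names and the reduced sample size are assumptions of the sketch and not necessarily how the values in Table 6 were computed.

```python
import math
from itertools import islice

A = 427_419_669_081
M = 10**12 - 11

def rational_lcg10_pairs(seed=1):
    """Yield (p, q) with p < q: the two states behind each RationalLCG10 output."""
    s = seed
    while True:
        s = A * s % M
        q = s
        s = A * s % M
        yield (s, q) if s < q else (q, s)

def monte_carlo_means(n, seed=1):
    """Return the sample means of cos(t_i) and cos^2(t_i), t_i = 2*pi*1e10*p/q."""
    sum_cos = 0.0
    sum_cos2 = 0.0
    for p, q in islice(rational_lcg10_pairs(seed), n):
        frac = (10**10 * p % q) / q          # fractional part of 1e10 * p/q, reduced exactly
        c = math.cos(2 * math.pi * frac)
        sum_cos += c
        sum_cos2 += c * c
    return sum_cos / n, sum_cos2 / n

mean_cos, mean_cos2 = monte_carlo_means(20_000)
print("mean of cos  :", mean_cos)
print("mean of cos^2:", mean_cos2)
```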
The generator RationalLCG10 produces random numbers with many different denominators. As in Sec. 1.2 we multiply the rational numbers by
$m = 999\,999\,999\,989$, take the denominators $q_i$ and their least common multiple $d$. Plotting $\log_{10}(d)$ against the number of iterations $i$ shows that $d$ grows
exponentially, i.e. $\log_{10}(d)$ grows roughly linearly with $i$ (see Fig. 1).
[Figure: two panels plotting log10(d) against the iteration i. Left panel: i up to 100, log10(d) up to about 1 000. Right panel: i up to 1 000, log10(d) up to about 10 000.]
Fig. 1. log10 of the lcm of the denominators of RationalLCG10
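The growth of the least common multiple can be traced with a short Python sketch. Each output is multiplied by m, its denominator is taken, and the running least common multiple $d$ is updated; since every denominator is below $10^{12}$, $\log_{10}(d)$ can grow by at most 12 per iteration. The function name `lcm_growth` is an assumption of the sketch.

```python
import math
from fractions import Fraction

A = 427_419_669_081
M = 10**12 - 11

def rational_lcg10(seed=1):
    """RationalLCG10 of Table 1 with exact rational output."""
    s = seed
    while True:
        s = A * s % M
        q = s
        s = A * s % M
        p, q = (s, q) if s < q else (q, s)   # p: smaller state, q: larger state
        yield Fraction(p, q)

def lcm_growth(n_iter, seed=1):
    """Return log10 of the running lcm of the denominators of m * r_i."""
    d = 1
    logs = []
    gen = rational_lcg10(seed)
    for _ in range(n_iter):
        q = (M * next(gen)).denominator      # denominator after multiplying by m
        d = d * q // math.gcd(d, q)          # least common multiple of d and q
        logs.append(math.log10(d))
    return logs

logs = lcm_growth(100)
for i in (10, 50, 100):
    print(i, logs[i - 1])
```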
3 Conclusions
Using division by the modulus as output function for a linear congruential
random number generator can lead to bad results even in simple simulations.
This is mainly due to the fact that only a few denominators are generated. A
first improvement was suggested in [11].
The new generator RationalLCG10 produces rational random numbers
uniformly distributed in (0, 1). It shows a greater variety of denominators and
improved simulation results, compared to the underlying generator LCG10.
The generator RationalLCG10 passes all of Knuth’s empirical tests [1]
(not all results are presented here due to the lack of space) and the birthday
spacings test indicates even further improvements.
References
[1] Knuth D E (1998) The art of computer programming. Vol. 2: Seminumerical algorithms. Third edition. Addison-Wesley, Reading, Mass.
[2] L’Ecuyer P (1994) Uniform random number generation. Annals of Operations Research 53:77–120
[3] L’Ecuyer P (1998) Random number generation. In: Banks J (ed) Handbook on Simulation. John Wiley, Hoboken, NJ
[4] L’Ecuyer P (2004) Random number generation. In: Gentle J E, Härdle W, Mori Y (eds) Handbook of computational statistics. Concepts and methods. Springer, Berlin Heidelberg New York
[5] L’Ecuyer P (2001) Software for uniform random number generation: Distinguishing the good and the bad. In: Proceedings of the 2001 Winter Simulation Conference. IEEE Press, Piscataway, NJ
[6] L’Ecuyer P, Hellekalek P (1998) Random number generators: Selection criteria and testing. In: Hellekalek P (ed) Random and quasi-random point sets. Springer Lecture Notes in Statistics 138. Springer, New York
[7] L’Ecuyer P, Simard R (2001) On the performance of birthday spacings tests with certain families of random number generators. Mathematics and Computers in Simulation 55(1-3):131–137
[8] L’Ecuyer P, Simard R, Wegenkittl S (2002) Sparse serial tests of uniformity for random number generators. SIAM Journal on Scientific Computing 24(2):652–668
[9] Maple 10 (2005) Maplesoft, a division of Waterloo Maple Inc., www.maplesoft.com
[10] MuPAD Pro 3.1 (2005) SciFace Software GmbH & Co. KG, www.sciface.com
[11] Morgenstern T (2006) Uniform Random Binary Floating Point Number Generation. In: Proceedings of the 2. Wernigeröder Automatisierungs- und Informatiktage. Hochschule Harz, Wernigerode