A Parallel Approach to Finding Prime
Numbers and Visualizing their Distribution
Pamela K. Speidel†
Ernest Sibert‡
Abstract
A data-parallel implementation of the Sieve of Eratosthenes has been used to
calculate prime numbers up to approximately 2^30 in C*. Alternative programming
techniques to implement the sieve approach on the Connection Machine have been
explored. In addition, ways to visualize the distribution of the prime numbers using
the parallel graphics capability of the CM were investigated.
1 Introduction
The prime numbers are of fundamental importance to number theory and are of current
interest in public-key cryptosystems as well as in other areas. A standard method for
calculating primes is the sieve of Eratosthenes, but the sieve technique is used for other
number-theoretic calculations as well. Data-parallel computers have not been considered
very effective for sieve calculations, so the possibility of an efficient data-parallel sieve was
studied to determine primes up to 2^30 + 1. The resulting data were used to investigate ways
of displaying information about the distribution of primes.
2 How to Recognize Whether a Number is a Prime
The approach taken is a data-parallel implementation of the famous sieve of Eratosthenes to
calculate prime numbers up to 2^30 + 1. The implementation has three main steps.
The first step is to identify the composite numbers by marking only the multiples of
prime numbers p ≤ 2^15, since if n ≤ 2^30 + 1 is composite, then n has a prime factor < 2^15.
The prime numbers p ≤ 2^15 were obtained in 0.12 seconds using the program discussed in
the training session [1]. The program uses the mod function to mark off the nonprimes in
is_prime, a boolean parallel variable of 2^15 entries that has the shape of 32K processors
representing odd integers up to 2^16 (Figure 1). The program was extended to create a bit
array mk, a boolean parallel variable of 2^29 bits that has the shape of 32K processors, with
each processor having a 16K array representing odd integers to 2^30 + 1 (Table 1).
Work funded in part by NSF Grant CDA-9200577 for the 1992 Research Experiences for Undergraduates
(REU) Site Program at the Northeast Parallel Architectures Center (NPAC), Syracuse University, Syracuse,
NY. Additional funding provided by the Office of the Vice President for Research and Computing, and by
NPAC, Syracuse University.
† Research Apprentice, 1992 NPAC REU Program, Syracuse University; Computer Science Major,
University of Missouri.
‡ Professor, School of Computer and Information Science, Syracuse University.
<------------------------- 32K processors ------------------------->
+---+---+---+---+---+----------------------------------------------+
| 0 | 1 | 1 | 0 | 1 | ...                                          |
+---+---+---+---+---+----------------------------------------------+
  2   3   5   7   9   ...
FIG. 1. is_prime, a boolean parallel variable.
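Table 1. mk, the bit array 32K by 16K.
Each column is one of the 32K processors; each column holds 16K bits.
    3        32K+3      64K+3      ...
    5        32K+5      64K+5      ...
    7        32K+7      64K+7      ...
    9        32K+9      64K+9      ...
   ...        ...        ...       ...
  32K+1      64K+1       ...       ...      2^30 + 1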
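The layout in Table 1 can be summarized by a pair of index conversions. The following is a minimal Python sketch, not the C* program; the names ROWSIZE, NPROCS, to_integer, and to_position are hypothetical, and the sketch assumes that each processor holds consecutive odd integers, as Table 1 and the later use of pcoord(0) * rowsize suggest.

```python
ROWSIZE = 16 * 1024   # bits held by each processor (16K), assumed name
NPROCS = 32 * 1024    # number of processors in the shape (32K), assumed name


def to_integer(proc: int, row: int, rowsize: int = ROWSIZE) -> int:
    """Odd integer represented by bit `row` of processor `proc`."""
    return 2 * (proc * rowsize + row) + 3


def to_position(n: int, rowsize: int = ROWSIZE) -> tuple[int, int]:
    """(processor, row) of the bit that represents the odd integer n >= 3."""
    if n < 3 or n % 2 == 0:
        raise ValueError("only odd integers >= 3 are stored")
    return divmod((n - 3) // 2, rowsize)
```

Under these assumptions the last bit of the last processor corresponds to 2^30 + 1, which agrees with the 2^29 bits stated for mk.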
In step two, the "small" prime numbers were sieved using a parallel counting procedure.
First in the procedure, p is set as an integer variable that holds the prime number to count
off, acquired from is_prime. Each time a prime number is obtained, is_prime is updated.
Next, counter, an integer parallel variable of the shape of 32K processors, is set to the
starting position for each processor (Figure 2). The counter is set, making certain not to
skip any multiples of p except for p itself, with the following code:
    q = (pcoord(0) * rowsize) - ((p - 3) / 2)
    counter = (q % p)
    where ((q > 0) and (counter == 0)) counter = p.
After counter and p are set, the parallel program checks the equality of counter to p. If
they are equal, then the position of mk is marked as a multiple of p, and that processor's
counter is reset to zero. Then counter is incremented by 1 in all the processors. All 32K
processors in this procedure are run in parallel and the timing for small prime numbers is
relatively fast.
+---+---+---+---+---+--------------------------------------+---+
| 0 | 1 | 2 | 3 | 1 | ...                                  | 2 |
+---+---+---+---+---+--------------------------------------+---+
FIG. 2. counter, an integer parallel variable, set to the starting positions for p = 3.
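The counting procedure can be simulated sequentially on a small layout. The sketch below is a Python illustration under stated assumptions, not the C* program: the function names are hypothetical, the inner loop over processors stands in for the data-parallel operation, and the helper c_remainder assumes that % yields a remainder with the sign of q when q is negative.

```python
def c_remainder(q: int, p: int) -> int:
    """Remainder carrying the sign of q (assumed behavior of %), for p > 0."""
    return q % p if q >= 0 else -((-q) % p)


def initial_counter(p: int, proc: int, rowsize: int) -> int:
    """Starting counter of one processor, following the three lines of code above."""
    q = proc * rowsize - (p - 3) // 2
    counter = c_remainder(q, p)
    if q > 0 and counter == 0:
        counter = p          # the first position of this processor is a multiple of p
    return counter


def counting_sieve(p: int, nprocs: int, rowsize: int) -> list[list[bool]]:
    """Mark every odd multiple of p except p itself.

    mk[proc][row] stands for the odd integer 2 * (proc * rowsize + row) + 3.
    """
    mk = [[False] * rowsize for _ in range(nprocs)]
    counter = [initial_counter(p, proc, rowsize) for proc in range(nprocs)]
    for row in range(rowsize):        # every row is visited, one after another
        for proc in range(nprocs):    # stands in for the parallel operation
            if counter[proc] == p:
                mk[proc][row] = True  # this position holds a multiple of p
                counter[proc] = 0
            counter[proc] += 1
    return mk
```

Because every row is visited regardless of p, the work in this sketch does not depend on the prime being sieved, which is consistent with the constant counting times reported in Table 2.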
The third step is to sieve the "large" prime numbers using a parallel stepping procedure.
Indirect addressing is used in the stepping procedure, accessing only those positions in mk
that are multiples of the prime being sieved. As before, p is the prime number obtained
from is_prime. Next, step, an integer parallel variable of the shape of 32K processors, is set
to the starting position for each processor (Figure 3). The step is set, making certain not
to mark off the p being sieved, with the following code:
    qm = ((p - 3) / 2) - (pcoord(0) * rowsize)
    step = (qm %% p)
    where (qm >= 0) step = step + p.
After step and p are set, the 32K processors run in parallel, each marking the mk[step] position,
which is a multiple of p. Then step is incremented by p to continue stepping through the
columns of numbers. A processor is active only while its step is less than the 16K rowsize. For
smaller primes this is slow; the time required decreases as p increases, being essentially
constant for 2^14 ≤ p ≤ 2^15.
+----+----+----+----+----+---------------------------------+----+
| 27 | 17 |  4 | 10 | 16 | ...                             | 18 |
+----+----+----+----+----+---------------------------------+----+
FIG. 3. step, an integer parallel variable, set to the starting positions for p = 19.
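The stepping procedure can be sketched in the same sequential style. This is a Python illustration, not the C* program: the function names are hypothetical, and Python's % (whose result takes the sign of the divisor) is assumed to play the role of the %% operator for negative qm.

```python
def initial_step(p: int, proc: int, rowsize: int) -> int:
    """First row of this processor that holds a multiple of p other than p."""
    qm = (p - 3) // 2 - proc * rowsize
    step = qm % p        # non-negative result, standing in for %%
    if qm >= 0:
        step += p        # skip the position of p itself
    return step


def stepping_sieve(p: int, nprocs: int, rowsize: int) -> list[list[bool]]:
    """Mark every odd multiple of p except p itself, touching only multiples.

    mk[proc][row] stands for the odd integer 2 * (proc * rowsize + row) + 3.
    """
    mk = [[False] * rowsize for _ in range(nprocs)]
    for proc in range(nprocs):       # stands in for the parallel operation
        step = initial_step(p, proc, rowsize)
        while step < rowsize:        # processor stays active while step fits in its row range
            mk[proc][step] = True
            step += p
    return mk
```

In this sketch each processor performs roughly rowsize / p marking operations, so the work shrinks as p grows, which matches the trend of the stepping times in Table 2.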
After the first three steps of the program were designed, timing experiments were
conducted to determine the division between the "small" and "large" prime numbers. In
order to obtain reliable times, the timing experiments were carried out with a directly
attached CM-2 (as opposed to time-shared) with the safety off. Findings indicated that
primes smaller than 19 were sieved faster using the counting procedure, while primes
19 and greater were sieved faster using the stepping procedure. Therefore, "small" prime numbers
include 3 through 17 and "large" prime numbers include 19 through 2^15. Some timings are
displayed in Table 2.
Table 2. Timings for counting and stepping procedures.
prime   counting (sec)   stepping (sec)
    3        2.81             6.22
    5        2.81             4.52
    7        2.81             3.80
   11        2.81             3.14
   13        2.81             2.96
   17        2.81             2.73
   19        2.81             2.65
   23        2.81             2.54
   29        2.81             2.42
   31        2.81             2.39
A division within the "large" prime numbers was also examined. A special stepping
procedure was used as a shorter revision of the general stepping procedure. This procedure
is shorter because, for the largest primes, each processor marks zero or one multiple of p
only. Sieving the odd numbers up to 2^30 + 1 with the "very large" primes was expected to be
faster using this shorter procedure. However, results of the timing experiments of both
the general and special stepping procedures indicated no difference in time. Thus, the general
stepping procedure was suitable for sieving with the "largest" prime numbers. Some timings are
displayed in Table 3. The time it took to sieve odd numbers to 2^30 + 1 was 1.47 hours. It
took 16.87 seconds to sieve all the "small" prime numbers using the counting procedure, and
it took 87.919 minutes to sieve all the "large" prime numbers using the stepping procedure.
Thus, it took approximately 1.51 seconds per prime to sieve the 3512 odd primes ≤ 2^15
through the numbers up to 2^30 + 1. While sieving the bit array mk, checkpoints were taken, writing
and reading the important information to and from a file on the datavault. The time it took to read
this information from the datavault file was approximately 6.30 seconds, and writing the
information took approximately 4.90 seconds. Thus, the 512M bits of data move to and from the
datavault at between 85.2 × 10^6 bits/sec and 109.7 × 10^6 bits/sec. Graph 1 is a graphic
representation of the timings for the counting procedure and the stepping procedure from
the timing experiments.
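The totals quoted above follow from simple arithmetic on the reported figures. The short Python sketch below only restates that arithmetic; the variable names are hypothetical and every number is taken from the text.

```python
BITS = 2 ** 29                  # 512M bits in the bit array mk

small_s = 16.87                 # seconds for all "small" primes (counting)
large_s = 87.919 * 60           # 87.919 minutes for all "large" primes (stepping)
total_s = small_s + large_s     # total sieving time in seconds

total_hours = total_s / 3600    # compare with the reported 1.47 hours
per_prime_s = total_s / 3512    # compare with the reported 1.51 seconds per prime

read_rate = BITS / 6.30         # bits/sec when reading the checkpoint file
write_rate = BITS / 4.90        # bits/sec when writing the checkpoint file

print(f"{total_hours:.2f} h, {per_prime_s:.2f} s/prime")
print(f"read {read_rate / 1e6:.1f}e6 bits/sec, write {write_rate / 1e6:.1f}e6 bits/sec")
```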
Table 3. Timings for general and special stepping procedures.
prime    general (sec)   special (sec)
16363        1.32            1.32
16381        1.32            1.32
16411        1.32            1.32
16451        1.32            1.32
24509        1.32            1.32
24593        1.32            1.32
32707        1.32            1.32
32749        1.32            1.32
Graph 1. Timings for counting and stepping procedures. The horizontal axis is
the natural logarithm of the prime being sieved and the vertical axis is the time
in seconds to complete each sieve.
A program was implemented to check the accuracy of the resulting data. The checking
program searches for both types of errors, checking that no prime number was
marked off as a nonprime and that no nonprime number was left marked as a prime number.
As the time for the project came to a close, only 85 columns of the 32K columns (that is,
the odd integers up to 2720K + 1) were checked for accuracy, resulting in no errors.
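A check of this kind can be sketched with trial division as the independent reference. The Python below is an illustration only; the paper does not describe how its checking program tests primality, so is_prime_trial and check_column are hypothetical names and trial division is an assumed technique. A marked bit is taken to mean "composite", and bit `row` of processor `proc` is assumed to represent 2 * (proc * rowsize + row) + 3.

```python
def is_prime_trial(n: int) -> bool:
    """Primality by trial division with odd divisors up to sqrt(n)."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True


def check_column(column: list[bool], proc: int, rowsize: int) -> list[tuple[int, str]]:
    """Return both kinds of errors found in one processor's column of marks."""
    errors = []
    for row, marked in enumerate(column):
        n = 2 * (proc * rowsize + row) + 3
        prime = is_prime_trial(n)
        if marked and prime:
            errors.append((n, "prime marked as composite"))
        elif not marked and not prime:
            errors.append((n, "composite left unmarked"))
    return errors
```

An empty result for a column corresponds to the "no errors" outcome reported for the 85 columns that were checked.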
3 How Many Prime Numbers Are There Up To 2^30 + 1?
As a result of the sieving, the set of primes is represented as a bit vector with an entry for
each integer of interest. Since 2 is the only even prime, and 1 by convention is not prime,
2^29 bits represent odd integers from 3 through 2^30 + 1, and after sieving, the set of odd
primes in this range is obtained. On the other hand, the formula due to Hadamard and
de la Vallée Poussin [2] indicates that the number P(x) of primes p ≤ x is asymptotically
x/ln(x), thus
P(2^30 + 1) = 51,636,066 (approximately).
This shows that about 10% of the integers represented in the 2^29 bits are prime, so there
are about 10 bits per prime. This is much more compact than a vector of 51,636,066
integers of 32 bits each. Fortunately, the memory of a 16K CM-2 at 2^14 × 2^16 = 2^30 bits
is sufficient for the calculation, which uses exactly half of that memory. After sieving,
the number of odd primes was counted and found to be 54,400,027 up to 2^30 + 1. The
values of the function x/ln(x) are compared to the actual counted results in
Graph 2. Graph 3 is a graphic representation of the ratio of x/ln(x) to
the counted results.
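The estimate and its ratio to the count can be reproduced at the single point x = 2^30 + 1. The Python sketch below uses only the figures quoted in the text; the names are hypothetical, and the prime 2 is added to the 54,400,027 odd primes so that both quantities count all primes up to x.

```python
import math


def pnt_estimate(x: float) -> float:
    """Asymptotic estimate x / ln(x) for the number of primes up to x."""
    return x / math.log(x)


x = 2 ** 30 + 1
estimate = pnt_estimate(x)
counted = 54_400_027 + 1        # odd primes counted by the sieve, plus the prime 2
ratio = estimate / counted      # the quantity plotted in Graph 3, at this one x

bits_per_prime = 2 ** 29 / estimate   # storage cost of the bit vector per prime

print(round(estimate), f"{ratio:.3f}", f"{bits_per_prime:.1f}")
```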
Graph 2. The comparison of x/ln(x) and counted results. The horizontal axis
is the natural logarithm of the number x and the vertical axis is the natural
logarithm of the number of primes less than x. Plotted are both the theoretical
x/ln(x) and the actual counted results; on this scale the two lines are virtually
indistinguishable.
Graph 3. The ratio of x/ln(x) and counted results. The graph displays the ratio
of the theoretical x/ln(x) to the actual counted results as plotted in Graph 2.
4 How Are the Prime Numbers Distributed?
A close examination of the table of prime numbers suggests an irregular distribution of
primes; however, if these same primes are viewed at a distance, the distribution appears
smooth. One technique explored for visualizing the distribution of prime numbers is
called the shading technique. The shading technique displays the distribution of the prime
numbers to 230 +1 onto a 1024 1024 size screen. As a result of this technique, 512 numbers
per pixel can be displayed from the sieved data. Initially, the minimum and maximum
values for the grouping are found and values are placed in the following formulas. Then,
with the results from these formulas and the number of primes within each grouping, a
color distribution is determined. The color distribution is shown in Figure 4.
a = 254=(max ? min)
b = 1 ? a (min ?1)
where (np == 0)
color = 0
else color = (a (np ? 1)) + b.
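The color formulas map the smallest group count to color 1 and the largest to color 255, with 0 reserved for groupings that contain no primes. The following Python sketch restates them for a single grouping; the function name shade and the parameters lo and hi (for min and max) are hypothetical, rounding to the nearest integer is an assumption the text does not state, and hi > lo is required.

```python
def shade(np_count: int, lo: int, hi: int) -> int:
    """Color index for a grouping that contains np_count primes.

    lo and hi are the minimum and maximum prime counts over all groupings.
    """
    if hi <= lo:
        raise ValueError("hi must exceed lo")
    if np_count == 0:
        return 0                      # no primes in this grouping
    a = 254 / (hi - lo)
    b = 1 - a * (lo - 1)
    return round(a * (np_count - 1) + b)   # rounding is an assumption
```

Substituting np = min gives a(min - 1) + b = 1 and np = max gives a(max - min) + 1 = 255, so the formulas span the 255 non-zero colors exactly.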
FIG. 4. The shading technique color distribution. The distribution of primes is
visualized using a color distribution in which groupings with the maximum number of primes
are assigned color 255 (magenta) and groupings with the minimum number are assigned
color 1 (red). The rainbow color scheme contains red, orange, yellow, green, blue,
and magenta.
Transposing the resulting data was also explored. In this instance, transposing is defined
as placing columns into rows, as displayed in Tables 4 and 5. Only 2^30 bits of memory
in the 16K CM-2 can be used for this procedure; therefore, two arrays of 32K by 16K use
too much memory. Thus, four different files of 32K by 8K must be created to hold the
transposed data, which can be used to select certain ranges of data, while still utilizing all
the processors for efficiency. This plan appeared to be straightforward; however, since C*
cannot express the data movement required, the plan was not completed. Because of time
limitations no alternative technique was explored.
Table 4. Format of resulting data.
3 19 35 51 67 83 99 115
5 21 37 53 69 85 101 117
7 23 39 55 71 87 103 119
9 25 41 57 73 89 105 121
11 27 43 59 75 91 107 123
13 29 45 61 77 93 109 125
15 31 47 63 79 95 111 127
17 33 49 65 81 97 113 129
Table 5. Transposed format of the data.
3 5 7 9 11 13 15 17
19 21 23 25 27 29 31 33
35 37 39 41 43 45 47 49
51 53 55 57 59 61 63 65
67 69 71 73 75 77 79 81
83 85 87 89 91 93 95 97
99 101 103 105 107 109 111 113
115 117 119 121 123 125 127 129
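The relationship between Tables 4 and 5 can be reproduced on the same 8 by 8 example. The Python sketch below is sequential and only illustrates the index relationship; it says nothing about how the data movement could be expressed on the Connection Machine. The function names are hypothetical.

```python
def column_layout(ncols: int, rowsize: int) -> list[list[int]]:
    """Table in which column c holds rowsize consecutive odd integers (as in Table 4)."""
    return [[2 * (c * rowsize + r) + 3 for c in range(ncols)]
            for r in range(rowsize)]


def transpose(table: list[list[int]]) -> list[list[int]]:
    """Place columns into rows (as in Table 5)."""
    return [list(col) for col in zip(*table)]


table4 = column_layout(8, 8)
table5 = transpose(table4)
for row in table5:
    print(" ".join(str(n) for n in row))
```

After transposition the integers run in increasing order along each row, so a contiguous range of integers occupies whole rows rather than a single column.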
5 Summary
An extended form of the sieve of Eratosthenes was implemented to calculate prime numbers
up to 2^30 + 1. A counting procedure that steps through all of the bits, marking off multiples
of p for "small" prime numbers, and a stepping procedure that uses indirect addressing,
accessing only those bits that are multiples of p for "large" prime numbers, were added.
Timing experiments placed the division between "small" and "large" prime numbers at 17.
A second division within the "large" prime numbers to speed up the run time
was considered. The resulting times were the same; therefore, there was no need for this
division and a special stepping procedure. A program to count the number of odd prime
numbers up to 2^30 + 1 was implemented and found 54,400,027 odd primes. A
program for transposing the resulting data was explored to improve the efficiency of the
processors when certain ranges of data were selected for visualizing the distribution. Last,
a program to visualize the distribution of the prime numbers was implemented using the
shading technique.
6 Future Work
The data resulting from the sieving program needs further checking to confirm its accuracy.
Additional exploration of ways to visualize the distribution of the prime numbers would be
beneficial. Finally, the current programs could be adapted to run on the CM-5. Timing
experiments and the sieving program that runs on the CM-5 could then be tested for speed
and efficiency, as compared to the CM-2 programs.
Acknowledgements
I would like to thank the National Science Foundation for its support of the REU Program
and NPAC for providing the Connection Machine time. A special thanks goes to Professor
Ernest Sibert who advised me in the course of the project, and to Kristin M. Tschinkel for
helping me edit this article. Thanks go to E. Bogucz, B. LaPlante, and N. McCracken for
organizing the REU Program. Finally, thanks to everyone in the REU Program for their
encouragement throughout and for making this program fun and a worthwhile experience.
References
[1] Computational Science REU Program, The Training Session, 1992.
[2] E. T. Bell, "The Queen of Mathematics," The World of Mathematics, J. R. Newman, Ed.,
Simon and Schuster, New York (1956), pp. 498-518.