Riemann’s Explicit Formula
Sean Li
Cornell University
[email protected]
May 11, 2012
This paper is a brief study of Bernhard Riemann's main result in analytic number theory: the 1859 article "Über die Anzahl der Primzahlen unter einer gegebenen Grösse," in which he derives an explicit formula for the prime counting function. Much of our paper works to make Riemann's intuitive statements more rigorous. In fact, to prove some of his ideas, we need theorems that were not established until decades after his lifetime.
1 Introduction
The theory begins with Euler's product formula, which states that for s > 1,

$$\sum_{n=1}^{\infty} \frac{1}{n^s} = \prod_p \frac{1}{1 - p^{-s}},$$

where p ranges over all primes. The formula can be shown by expanding each term in the product as

$$\frac{1}{1 - p^{-s}} = 1 + \frac{1}{p^s} + \frac{1}{p^{2s}} + \frac{1}{p^{3s}} + \cdots$$
and multiplying out all of them. This results in an infinite sum of terms of the form

$$\frac{1}{(p_1^{k_1} p_2^{k_2} \cdots p_m^{k_m})^s},$$
where p_1, ..., p_m are distinct primes and k_1, ..., k_m are positive integers. Then one may use the fundamental theorem of arithmetic, which states that every integer has a unique prime factorization, to see that each of these terms is a 1/n^s, with each n occurring exactly once. When summed, these terms equal the left-hand side.
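As an aside not in the original paper, Euler's identity is easy to check numerically for a real s > 1. The helper `primes_up_to` and the truncation cutoffs below are choices made for this sketch:

```python
from math import prod

def primes_up_to(n):
    """Sieve of Eratosthenes: all primes p <= n."""
    sieve = [True] * (n + 1)
    sieve[0] = sieve[1] = False
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = [False] * len(sieve[i * i :: i])
    return [i for i, is_p in enumerate(sieve) if is_p]

s = 2.0
# Truncated Dirichlet series and truncated Euler product.
zeta_sum = sum(1 / n ** s for n in range(1, 100_000))
zeta_prod = prod(1 / (1 - p ** -s) for p in primes_up_to(1000))
print(zeta_sum, zeta_prod)  # both near pi^2/6 ≈ 1.6449
```

Both truncations agree with ζ(2) = π²/6 to about three decimal places, which is as much as these cutoffs allow.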
Riemann called this function ζ(s) and considered its behavior when s is
a complex variable. It is not hard to see that it converges in the halfplane
Re(s) > 1. Let s = σ + it where σ and t are real. Then for σ > 1,
$$n^s = n^{\sigma} n^{it} = n^{\sigma} e^{it \log n},$$
so that

$$\zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s} = \sum_{n=1}^{\infty} \frac{1}{n^{\sigma}}\, e^{-it \log n}.$$
But |e^{−it log n}| = 1, so the sum converges absolutely in the halfplane Re(s) > 1. Moreover, the convergence is uniform on every halfplane Re(s) ≥ 1 + δ with δ > 0, so ζ is holomorphic in the halfplane Re(s) > 1.
2 Properties of ζ(s)
To obtain a formula for ζ(s) that works when s is outside the halfplane Re(s) >
1, we shall extend ζ to a meromorphic function in C, using the gamma and
theta functions.
2.1 The Gamma Function
Our first object of study is the gamma function, defined as

$$\Gamma(s) = \int_0^{\infty} e^{-t}\, t^{s-1}\, dt$$
for s > 0. When s is a positive integer, Γ(s) = (s − 1)!. To see that it converges, one may break it up into

$$\Gamma(s) = \int_0^1 e^{-t}\, t^{s-1}\, dt + \int_1^{\infty} e^{-t}\, t^{s-1}\, dt$$
and observe that the second integral defines an entire function, while the first can be handled as follows. Expand e^{−t} as a power series and integrate termwise, resulting in

$$\int_0^1 e^{-t}\, t^{s-1}\, dt = \sum_{n=0}^{\infty} \frac{(-1)^n}{n!} \int_0^1 t^{n+s-1}\, dt = \sum_{n=0}^{\infty} \frac{(-1)^n}{n!\,(n+s)},$$
which defines a meromorphic function on C with simple poles at the nonpositive integers, with residue (−1)^n/n! at s = −n. This is easy to verify, as the rapid growth of n! in the denominator makes the series converge uniformly on compact sets avoiding the poles. Therefore, the relation
$$\Gamma(s) = \sum_{n=0}^{\infty} \frac{(-1)^n}{n!\,(n+s)} + \int_1^{\infty} e^{-t}\, t^{s-1}\, dt$$

defines a meromorphic function on all of C.
Before going on, we first write out a property of the gamma function that shall be useful later:

$$\Gamma(s+1) = \lim_{N \to \infty} \frac{N!\,(N+1)^s}{(s+1)(s+2)\cdots(s+N)} = \prod_{n=1}^{\infty} \frac{n^{1-s}(n+1)^s}{s+n} = \prod_{n=1}^{\infty} \frac{\left(1+\frac{1}{n}\right)^s}{1+\frac{s}{n}}. \qquad (1)$$

The first expression is due to Euler, and the other two are reformulations of it.
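Equation (1) lends itself to a quick numerical check against the standard library's gamma function. This sketch (mine, not the paper's) accumulates the product in log form to avoid overflow:

```python
import math

def gamma_euler(s, terms=200_000):
    """Truncation of Euler's product Γ(s+1) = ∏_{n≥1} (1 + 1/n)^s / (1 + s/n)."""
    log_total = 0.0
    for n in range(1, terms + 1):
        log_total += s * math.log1p(1.0 / n) - math.log1p(s / n)
    return math.exp(log_total)

print(gamma_euler(0.5), math.gamma(1.5))  # both ≈ 0.886227
```

The n-th factor differs from 1 by O(1/n²) in the log, so the truncation error after N factors is O(1/N), which the choice of 200,000 terms makes invisible at this precision.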
2.2 The Theta Function
For our case, define the theta function for real t > 0 as

$$\vartheta(t) = \sum_{n=-\infty}^{\infty} e^{-\pi n^2 t}.$$
This satisfies the functional equation

$$\vartheta(t) = t^{-1/2}\, \vartheta\!\left(\frac{1}{t}\right),$$

which can be shown by application of the Poisson summation formula to ϑ(t).
The growth of ϑ(t) is bounded like

$$|\vartheta(t) - 1| \leq C e^{-\pi t}.$$

This can be seen from the fact that

$$\sum_{n=1}^{\infty} e^{-\pi n^2 t} \leq \sum_{n=1}^{\infty} e^{-\pi n t} \leq C e^{-\pi t}$$
for t ≥ 1. The behavior of ϑ(t) near t = 0 is given by
$$\vartheta(t) \leq C t^{-1/2},$$

which can be seen from the functional equation.
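Because the series converges very rapidly for t bounded away from 0, the functional equation can be verified numerically to essentially machine precision; this check is an illustration, not part of the development:

```python
import math

def theta(t, terms=60):
    """Truncation of ϑ(t) = Σ_{n=-∞}^{∞} e^{-π n² t} (symmetric, so 1 + 2·tail)."""
    return 1.0 + 2.0 * sum(math.exp(-math.pi * n * n * t) for n in range(1, terms + 1))

for t in (0.5, 1.0, 3.0):
    print(t, theta(t), t ** -0.5 * theta(1.0 / t))  # the two columns agree
```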
2.3 Analytic Continuation and Functional Equation
Now we are in a position to relate ζ, Γ, and ϑ as follows. The proof is based on Stein and Shakarchi [S1]. Let Re(s) > 1. If n ≥ 1, then

$$\int_0^{\infty} e^{-\pi n^2 u}\, u^{(s/2)-1}\, du = \pi^{-s/2}\,\Gamma(s/2)\, n^{-s},$$

which can be seen immediately from the change of variable u = t/(πn²), making the integral

$$(\pi n^2)^{-s/2} \int_0^{\infty} e^{-t}\, t^{(s/2)-1}\, dt$$

equal to π^{−s/2} Γ(s/2) n^{−s}. Now, because
$$\frac{\vartheta(u) - 1}{2} = \sum_{n=1}^{\infty} e^{-\pi n^2 u},$$

and because of the previously shown bounds on the growth and decay of ϑ, we may interchange the sum and integral. Then
$$\frac{1}{2} \int_0^{\infty} u^{(s/2)-1}\,[\vartheta(u) - 1]\, du = \sum_{n=1}^{\infty} \int_0^{\infty} u^{(s/2)-1}\, e^{-\pi n^2 u}\, du = \pi^{-s/2}\,\Gamma(s/2) \sum_{n=1}^{\infty} n^{-s} = \pi^{-s/2}\,\Gamma(s/2)\,\zeta(s).$$

Now define

$$\psi(u) = \frac{\vartheta(u) - 1}{2}.$$
The functional equation ϑ(u) = u^{−1/2} ϑ(1/u) implies

$$\psi(u) = u^{-1/2}\,\psi(1/u) + \frac{1}{2u^{1/2}} - \frac{1}{2}.$$
From the previously derived equation, we have, for Re(s) > 1,

$$\begin{aligned}
\pi^{-s/2}\,\Gamma(s/2)\,\zeta(s) &= \int_0^{\infty} u^{(s/2)-1}\,\psi(u)\, du \\
&= \int_0^1 u^{(s/2)-1}\,\psi(u)\, du + \int_1^{\infty} u^{(s/2)-1}\,\psi(u)\, du \\
&= \int_0^1 u^{(s/2)-1}\left[ u^{-1/2}\,\psi(1/u) + \frac{1}{2u^{1/2}} - \frac{1}{2} \right] du + \int_1^{\infty} u^{(s/2)-1}\,\psi(u)\, du \\
&= \frac{1}{s-1} - \frac{1}{s} + \int_1^{\infty} \left[ u^{-(s/2)-1/2} + u^{(s/2)-1} \right] \psi(u)\, du.
\end{aligned}$$
Note that this defines a meromorphic function with simple poles at 0 and 1: the exponential decay of ψ means the integral defines an entire function. Also, observe that the value is unchanged if s is replaced by 1 − s. Hence

$$\pi^{-s/2}\,\Gamma(s/2)\,\zeta(s) = \pi^{-(1-s)/2}\,\Gamma\!\left(\frac{1-s}{2}\right)\zeta(1-s),$$

which allows us to define values for zeta everywhere except the pole at s = 1.
We shall follow Riemann's notation and multiply π^{−s/2} Γ(s/2) ζ(s) by the factor s(s − 1)/2 and define this as∗

$$\xi(s) = \Gamma(s/2+1)\,(s-1)\,\pi^{-s/2}\,\zeta(s).$$

The advantage of this definition of ξ is that multiplying by s and s − 1 effectively cancels the simple poles of π^{−s/2} Γ(s/2) ζ(s); hence ξ(s) is an entire function and satisfies

$$\xi(s) = \xi(1-s).$$
We may rearrange to find

$$\zeta(s) = \frac{\pi^{s/2}\,\xi(s)}{(s-1)\,\Gamma(s/2+1)},$$

which shows that ζ has a simple pole at 1 and zeros where Γ(s/2 + 1) has poles, namely at s/2 = −n, n ∈ N. Hence zeta has simple zeros at −2, −4, −6, etc. These are called the trivial zeros. Note that all other zeros of ζ must also be zeros of ξ. These nontrivial zeros are denoted by ρ.
Furthermore, the zeta function can be defined in the halfplane Re(s) > 1 by

$$\zeta(s) = \prod_p \frac{1}{1 - p^{-s}}.$$

Now Σ_p [(1 − p^{−s})^{−1} − 1] converges absolutely, so the product converges absolutely, and such a product can vanish only if one of the factors (1 − p^{−s})^{−1} equals zero. This is impossible, so ζ has no zeros in the halfplane Re(s) > 1. And from the functional equation ξ(s) = ξ(1 − s), it follows that there are no zeros in the halfplane Re(s) < 0, except the trivial zeros. Thus all of the nontrivial zeros must lie in the strip 0 ≤ Re(s) ≤ 1. This bound can be improved to remove the lines Re(s) = 0 and Re(s) = 1, giving the statement that all nontrivial zeros of zeta lie in the region 0 < Re(s) < 1, known as the critical strip.
2.4 Product Formula for ξ(s)
Riemann assumed it was possible to factor ξ(s) in terms of its roots in something of the form

$$\xi(s) = f(s) \prod_{\rho} \left( 1 - \frac{s}{\rho} \right),$$

where f(s) is a function that does not vanish. Given this was possible, he showed that f(s) must be constant, and that the constant must be f(s) = ξ(0), which follows upon setting s = 0.
∗ The ξ function is usually defined as ξ(s) = π −s/2 Γ(s/2)ζ(s), which has been shown to
have simple poles at 0 and 1.
The factoring step is indeed valid, as shown by Hadamard in 1893, some 34 years after the publication of Riemann's paper. We will not repeat the proof of the Hadamard factorization here, as it is a fairly intricate result (a proof can be found starting on p. 147 of [S1]). The factorization theorem states for this case that f(s) = e^{a+bs}, because ξ has order of growth 1 (this can be easily checked from the equation defining ξ). Then since ξ(s + 1/2) is an even function (this follows from ξ(s) = ξ(1 − s)), Re log ξ(s + 1/2) is an even function but must grow slower than s^{1+ε}. A linear term cannot be even, so it must be constant.
Hence we have the equation

$$\xi(s) = \xi(0) \prod_{\rho} \left( 1 - \frac{s}{\rho} \right).$$
But we also have by definition that

$$\xi(s) = \Gamma(s/2+1)\,(s-1)\,\pi^{-s/2}\,\zeta(s),$$
so we may combine these, take the log, and rearrange to obtain

$$\log \zeta(s) = \log \xi(0) + \sum_{\rho} \log\left( 1 - \frac{s}{\rho} \right) - \log \Gamma\!\left( \frac{s}{2} + 1 \right) + \frac{s}{2}\log \pi - \log(s-1). \qquad (2)$$
3 Building the Formula

3.1 π(x) and J(x)
The end goal is to obtain a formula for π(x), which counts the number of primes less than x. For our purposes, we shall use the formula

$$\pi(x) = \frac{1}{2}\left( \sum_{p < x} 1 + \sum_{p \leq x} 1 \right).$$
This function starts at 0 when x = 0 and jumps by 1 at each prime. At each
jump, the function assumes the halfway value. Since π(x) almost everywhere
assumes integer values, it is difficult to imagine why a formula based on analytic
techniques should exist.
Riemann next defined the function J(x). Like π(x), this function starts at 0 when x = 0 and jumps by 1 at every prime, but it also jumps by 1/2 at every prime square, 1/3 at every prime cube, etc. It may be defined as

$$J(x) = \frac{1}{2}\left( \sum_{p^n < x} \frac{1}{n} + \sum_{p^n \leq x} \frac{1}{n} \right),$$
where it assumes halfway values at the jumps. The reason this function is
interesting is that it may be related to the zeta function as follows.
Consider the product formula of ζ(s) for Re(s) > 1,

$$\zeta(s) = \prod_p \frac{1}{1 - p^{-s}}.$$
Taking the log of both sides and using the Taylor series for the log yields

$$\begin{aligned}
\log \zeta(s) &= \sum_p -\log\left(1 - \frac{1}{p^s}\right) \\
&= \sum_p \left( \frac{1}{p^s} + \frac{1}{2p^{2s}} + \frac{1}{3p^{3s}} + \cdots \right) \\
&= \sum_p \sum_n \frac{p^{-ns}}{n}.
\end{aligned}$$
Observe that

$$p^{-ns} = s \int_{p^n}^{\infty} x^{-s-1}\, dx,$$
which follows from elementary calculus. We may substitute this into the log ζ(s) formula to obtain

$$\log \zeta(s) = s \sum_p \sum_n \frac{1}{n} \int_{p^n}^{\infty} x^{-s-1}\, dx.$$
Because this is absolutely convergent for Re(s) > 1, it follows that we may interchange the order of summation and integration, resulting in†

$$\log \zeta(s) = s \int_0^{\infty} \left( \sum_{p^n < x} \frac{1}{n} \right) x^{-s-1}\, dx = s \int_0^{\infty} J(x)\, x^{-s-1}\, dx. \qquad (3)$$
This is the key relation between ζ(s) and J(x). Later in this section we shall
use this formula again.
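For a real s > 1 the identity behind (3) — namely log ζ(s) = Σ_p Σ_n p^{−ns}/n — can be confirmed numerically. The truncation cutoffs here are loose choices made for this sketch:

```python
import math

def primes_up_to(n):
    """Sieve of Eratosthenes: all primes p <= n."""
    sieve = [True] * (n + 1)
    sieve[0] = sieve[1] = False
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = [False] * len(sieve[i * i :: i])
    return [i for i, is_p in enumerate(sieve) if is_p]

s = 3.0
# log of a truncated zeta series, versus the truncated prime-power double sum.
lhs = math.log(sum(1.0 / n ** s for n in range(1, 200_000)))
rhs = sum(p ** (-n * s) / n for p in primes_up_to(5000) for n in range(1, 40))
print(lhs, rhs)  # agree to many decimal places
```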
Now we need a relation between J(x) and π(x). This is given by

$$J(x) = \pi(x) + \frac{1}{2}\,\pi(x^{1/2}) + \frac{1}{3}\,\pi(x^{1/3}) + \cdots \qquad (4)$$
where the number of primes less than x is counted with weight 1, the number
of prime squares less than x is counted with weight 1/2, etc. Note that the sum
is actually a finite sum, as π(x) = 0 for x < 2 (there are no primes less than 2).
This shall be helpful, though not necessary, for inverting the relation.
† Note that since jumps occur on a set of measure zero, it does not matter in the sum whether we use p^n < x or p^n ≤ x.
The method of inversion will be Möbius inversion. Let μ(n) denote the Möbius function, defined for n ∈ N as

$$\mu(n) = \begin{cases} 1, & \text{if } n = 1, \\ (-1)^k, & \text{if } n \text{ is the product of } k \text{ distinct primes}, \\ 0, & \text{otherwise}. \end{cases}$$
Then Möbius inversion on equation (4) gives

$$\pi(x) = \sum_{n=1}^{\infty} \frac{\mu(n)}{n}\, J(x^{1/n}),$$

which is also a finite sum, for when x < 2 we have J(x) = 0 (there are no primes or prime powers less than 2). So all the terms where x^{1/n} < 2 vanish, which means there are only ⌊log x / log 2⌋ non-zero terms.
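The inversion is easy to exercise numerically. The helpers below are written for this sketch; they evaluate π and J only at arguments that are not jump points, so the halfway-value convention can be ignored:

```python
from math import log

def pi_count(x):
    """π(x): number of primes ≤ x (evaluated away from jump points)."""
    n = int(x)
    if n < 2:
        return 0
    sieve = [True] * (n + 1)
    sieve[0] = sieve[1] = False
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = [False] * len(sieve[i * i :: i])
    return sum(sieve)

def J(x):
    """J(x) = Σ_{n≥1} π(x^{1/n})/n — a finite sum, since π vanishes below 2."""
    return sum(pi_count(x ** (1.0 / n)) / n
               for n in range(1, int(log(x) / log(2)) + 1))

def mobius(n):
    """Möbius function μ(n) by trial factorization."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    return -result if n > 1 else result

x = 1000
# Recover π(x) from J via Möbius inversion: π(x) = Σ μ(n)/n · J(x^{1/n}).
recovered = sum(mobius(n) / n * J(x ** (1.0 / n))
                for n in range(1, int(log(x) / log(2)) + 1))
print(pi_count(x), round(recovered))  # both 168
```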
At this point, note that since J(x) counts primes and weighted prime powers below x, J(x) grows no faster than x (in fact, the prime number theorem implies J(x) ∼ x/log x). Then the function J(x)x^{−s−1} grows slower than x^{−s}. Combining this with the fact that J(x) = 0 for x < 2, we see that J(x)x^{−s−1} is integrable over (0, ∞) when Re(s) > 1. So we may use the inverse Laplace transform on the equation

$$\frac{\log \zeta(s)}{s} = \int_0^{\infty} J(x)\, x^{-s-1}\, dx,$$
which is a reassembling of equation (3), to find

$$J(x) = \frac{1}{2\pi i} \int_{a-i\infty}^{a+i\infty} \log \zeta(s)\, \frac{x^s}{s}\, ds \qquad (5)$$

with a > 1.
3.2 The Product Formula and the Result
The next step begins a long line of hard work. We now attempt to substitute equation (2), reprinted below,

$$\log \zeta(s) = \log \xi(0) + \sum_{\rho} \log\left(1 - \frac{s}{\rho}\right) - \log \Gamma\!\left(\frac{s}{2} + 1\right) + \frac{s}{2}\log \pi - \log(s-1),$$
into (5). If this works, then we can integrate term-wise and obtain a formula for J(x). Unfortunately, the direct substitution does not work because it leads to divergent integrals. We can, however, first integrate (5) by parts to obtain

$$J(x) = -\frac{1}{2\pi i} \cdot \frac{1}{\log x} \int_{a-i\infty}^{a+i\infty} \frac{d}{ds}\!\left[ \frac{\log \zeta(s)}{s} \right] x^s\, ds \qquad (6)$$
and then carry out the processes of substitution and term-wise integration to obtain the desired formula. The integration by parts of (5) depends on the behavior of the term

$$\frac{1}{2\pi i} \cdot \frac{1}{\log x} \cdot \frac{\log \zeta(s)}{s}\, x^s$$

as s → a ± i∞. To prove the validity of (6), it suffices to show that
$$\lim_{T \to \infty} \frac{\log \zeta(a \pm iT)}{a \pm iT}\, x^{a \pm iT} = 0.$$
This follows from the inequality

$$|\log \zeta(a \pm iT)| = \left| \sum_n \sum_p \frac{1}{n}\, p^{-n(a \pm iT)} \right| \leq \sum_n \sum_p \frac{1}{n}\, p^{-na} = \log \zeta(a) < \infty,$$

so that the numerator is bounded, the denominator goes to infinity, and the factor x^{a±iT} has constant modulus x^a. Hence the term goes to zero and the integration by parts is valid. The next section, in which we integrate term-wise, is the hard part.
4 The Terms of J(x)
After substitution, formula (6) at the end of the last section gives us an integral
with 5 terms. The evaluations of some of these integrals are certainly not trivial.
Much of the work in this section is due to Edwards [E1].
For ease of reference, the integral is

$$J(x) = -\frac{1}{2\pi i} \cdot \frac{1}{\log x} \int_{a-i\infty}^{a+i\infty} \frac{d}{ds}\!\left[ \frac{\log \zeta(s)}{s} \right] x^s\, ds,$$

and the terms are

$$\log \zeta(s) = \log \xi(0) + \sum_{\rho} \log\left(1 - \frac{s}{\rho}\right) - \log \Gamma\!\left(\frac{s}{2} + 1\right) + \frac{s}{2}\log \pi - \log(s-1),$$
derived in the previous sections.
4.1 The Main Term
We shall start with the −log(s − 1) term. This becomes

$$\frac{1}{2\pi i} \cdot \frac{1}{\log x} \int_{a-i\infty}^{a+i\infty} \frac{d}{ds}\!\left[ \frac{\log(s-1)}{s} \right] x^s\, ds.$$
To compute this integral, we first define a few auxiliary functions, the first of which is

$$F(\beta) = \frac{1}{2\pi i} \cdot \frac{1}{\log x} \int_{a-i\infty}^{a+i\infty} \frac{d}{ds}\!\left[ \frac{\log[(s/\beta) - 1]}{s} \right] x^s\, ds,$$

where our term in question is the special case F(1). To extend F, we take a > Re β and define log[(s/β) − 1] as log(s − β) − log β, following the principal branch of log. Moreover, the integral is absolutely convergent because
$$\left| \frac{d}{ds}\, \frac{\log[(s/\beta) - 1]}{s} \right| \leq \frac{|\log[(s/\beta) - 1]|}{|s|^2} + \frac{1}{|s(s-\beta)|}$$

is integrable, while x^s merely oscillates on the line of integration. Now we use the derivative
$$\frac{d}{d\beta}\, \frac{\log[(s/\beta) - 1]}{s} = \frac{1}{(\beta - s)\beta}$$
to obtain

$$\begin{aligned}
F'(\beta) &= \frac{1}{2\pi i} \cdot \frac{1}{\log x} \int_{a-i\infty}^{a+i\infty} \frac{d}{ds}\!\left[ \frac{1}{(\beta - s)\beta} \right] x^s\, ds \\
&= -\frac{1}{2\pi i} \int_{a-i\infty}^{a+i\infty} \frac{x^s}{(\beta - s)\beta}\, ds \\
&= \frac{1}{2\pi i\,\beta} \int_{a-i\infty}^{a+i\infty} \frac{x^s}{s - \beta}\, ds,
\end{aligned}$$

where the first step comes from differentiation under the integral sign, the second from integration by parts, and the third from trivial rearrangement.
This can be computed. Consider the function

$$\frac{1}{s - \beta} = \int_1^{\infty} x^{-s}\, x^{\beta - 1}\, dx \qquad [\operatorname{Re}(s - \beta) > 0].$$
Substitute x = e^λ, dx = e^λ dλ and write s = a + iμ to obtain

$$\frac{1}{a + i\mu - \beta} = \int_0^{\infty} e^{-i\lambda\mu}\, e^{\lambda(\beta - a)}\, d\lambda \qquad [a > \operatorname{Re}(\beta)],$$
which gives, from Fourier inversion,

$$\int_{-\infty}^{\infty} \frac{e^{i\mu x}}{a + i\mu - \beta}\, d\mu = \begin{cases} 2\pi e^{x(\beta - a)}, & \text{if } x > 0, \\ 0, & \text{if } x < 0. \end{cases}$$
It follows that

$$\frac{1}{2\pi i} \int_{a-i\infty}^{a+i\infty} \frac{y^s}{s - \beta}\, ds = \begin{cases} y^{\beta}, & \text{if } y > 1, \\ 0, & \text{if } y < 1. \end{cases} \qquad (7)$$

Since we already have x > 1, F′(β) = x^β/β.
The next step is to evaluate a contour integral. Let C⁺ be the contour from 0 to x that consists of the real line segment from 0 to 1 − ε, the semicircle in the upper halfplane Im t ≥ 0 from 1 − ε to 1 + ε, and then the real line segment from 1 + ε to x. Define

$$G(\beta) = \int_{C^+} \frac{t^{\beta - 1}}{\log t}\, dt$$
and note that

$$G'(\beta) = \int_{C^+} t^{\beta - 1}\, dt = \frac{t^{\beta}}{\beta} \Big|_0^x = \frac{x^{\beta}}{\beta} = F'(\beta).$$
Since G(β) is defined and analytic for Re(β) > 0, G(β) and F(β) must differ by a constant. The hope is that we can compute this constant and hence find F(β) as G(β) plus a constant. We shall evaluate the constant by setting β = σ + iτ, holding σ fixed, letting τ → ∞, and evaluating F(β) and G(β). First, we evaluate the limit of G(β).
Making the change of variable t = e^u puts G(β) in the form

$$\int_{i\delta - \infty}^{i\delta + \log x} \frac{e^{\beta u}}{u}\, du + \int_{i\delta + \log x}^{\log x} \frac{e^{\beta u}}{u}\, du.$$
Note that the path of integration has been altered slightly based on Cauchy’s
integral theorem. The further changes of variable u = iδ + v in the first integral
and u = log x + iw in the second put G(β) in the form
$$e^{i\delta\sigma}\, e^{-\delta\tau} \int_{-\infty}^{\log x} \frac{e^{\sigma v}\, e^{i\tau v}}{i\delta + v}\, dv \;-\; i x^{\beta} \int_0^{\delta} \frac{e^{-\tau w}\, e^{i\sigma w}}{\log x + iw}\, dw,$$

whose values both approach 0 as τ → ∞. In the first integral, e^{−δτ} → 0 is enough to make the value 0, and in the second, e^{−τw} → 0 except at w = 0. Therefore, the limit of G(β) as τ → ∞ is 0.
Evaluating the limit of F(β) is a bit trickier. Define another auxiliary function

$$H(\beta) = \frac{1}{2\pi i} \cdot \frac{1}{\log x} \int_{a-i\infty}^{a+i\infty} \frac{d}{ds}\!\left[ \frac{\log[1 - (s/\beta)]}{s} \right] x^s\, ds,$$
where a > Re β and log[1−(s/β)] is defined for complex β as log(s−β)−log(−β).
The goal is to compare this to F(β) and thereby to G(β). In the upper halfplane Im β > 0, the difference is

$$\begin{aligned}
H(\beta) - F(\beta) &= \frac{1}{2\pi i} \cdot \frac{1}{\log x} \int_{a-i\infty}^{a+i\infty} \frac{d}{ds}\!\left[ \frac{\log \beta - \log(-\beta)}{s} \right] x^s\, ds \\
&= \frac{1}{2\pi i} \cdot \frac{1}{\log x} \int_{a-i\infty}^{a+i\infty} \frac{d}{ds}\!\left[ \frac{i\pi}{s} \right] x^s\, ds \\
&= -\frac{1}{2\pi i} \int_{a-i\infty}^{a+i\infty} \frac{i\pi}{s}\, x^s\, ds \\
&= -i\pi,
\end{aligned}$$
where the last result is derived from equation (7). Therefore, F (β) = H(β) + iπ
in the upper halfplane, reducing the problem to finding the limit of H(β) as
τ → ∞. From the derivative
$$\frac{d}{ds}\, \frac{\log[1 - (s/\beta)]}{s} = -\frac{\log[1 - (s/\beta)]}{s^2} + \frac{1}{s(s-\beta)} = -\frac{\log[1 - (s/\beta)]}{s^2} + \frac{1}{\beta(s-\beta)} - \frac{1}{\beta s},$$
we may put this in the integral defining H(β). The first term is then

$$-\frac{1}{2\pi i} \int_{a-i\infty}^{a+i\infty} \frac{\log[1 - (s/\beta)]}{s^2}\, x^s\, ds.$$
Since 1 − (s/β) → 1 and hence log[1 − (s/β)] → 0 as |β| → ∞, the numerator is uniformly bounded. The denominator is s², which grows like |s|², while x^s merely oscillates along the line of integration. The 1/s² decay means we may use the Lebesgue dominated convergence theorem, so that the limit of the integral is the integral of the limit, which is 0 because of the log[1 − (s/β)] in the numerator. Hence this integral tends to 0. The second and third terms combine to give
$$\frac{1}{2\pi i} \int_{a-i\infty}^{a+i\infty} \left[ \frac{1}{\beta(s-\beta)} - \frac{1}{\beta s} \right] x^s\, ds = \frac{x^{\beta}}{\beta} - \frac{1}{\beta}$$
from equation (7). The numerators are bounded and |β| → ∞, hence these
terms go to 0, and the function H(β) goes to 0. This implies F (β) → iπ, and
thus F (β) = G(β) + iπ in the halfplane Re β > 0. Finally, this allows us to
write the main J(x) term as

$$F(1) = \int_0^{1-\varepsilon} \frac{dt}{\log t} + \int_{1-\varepsilon}^{1+\varepsilon} \frac{dt}{\log t} + \int_{1+\varepsilon}^{x} \frac{dt}{\log t} + i\pi,$$

where the middle integral is taken over the semicircle of C⁺.
Taking the limit as ε → 0, the second integral traverses a semicircle around a pole of residue 1, but with the negative orientation, so by the residue theorem

$$\int_{1-\varepsilon}^{1+\varepsilon} \frac{dt}{\log t} \to -i\pi.$$

This implies that the iπ terms cancel and we are left with

$$F(1) = \lim_{\varepsilon \to 0} \left[ \int_0^{1-\varepsilon} \frac{dt}{\log t} + \int_{1+\varepsilon}^{x} \frac{dt}{\log t} \right] = \operatorname{Li}(x).$$
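Li(x) here is the principal-value integral from 0, which for x > 1 has the classical convergent expansion li(x) = γ + log log x + Σ_{n≥1} (log x)^n/(n · n!). The sketch below (an aside, not from the paper) uses that series to compare Li(x) with the prime count:

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler–Mascheroni constant

def Li(x, terms=120):
    """li(x) = γ + log log x + Σ_{n≥1} (log x)^n / (n · n!), valid for x > 1."""
    lx = math.log(x)
    total = EULER_GAMMA + math.log(lx)
    term = 1.0
    for n in range(1, terms + 1):
        term *= lx / n       # term is now (log x)^n / n!
        total += term / n
    return total

print(Li(1000))  # ≈ 177.6, versus π(1000) = 168
```

The overshoot of Li(x) past π(x) is exactly what the remaining terms of Riemann's formula correct.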
4.2 The Oscillatory Term
Next, we shall look at the term

$$\sum_{\rho} \log\left(1 - \frac{s}{\rho}\right),$$
which involves the nontrivial roots of the zeta function. In the integral form for J(x), this becomes

$$-\frac{1}{2\pi i} \cdot \frac{1}{\log x} \int_{a-i\infty}^{a+i\infty} \frac{d}{ds}\!\left[ \frac{\sum_{\rho} \log\left(1 - \frac{s}{\rho}\right)}{s} \right] x^s\, ds. \qquad (8)$$
s
At this point it is not clear what to do, since we do not know whether the
integral and sum can be interchanged. Riemann did not know how to prove
this, but he assumed it could be done. We will see in a later section that if we
assume the interchange is valid, the final result is the correct one, despite the
possible invalidity of the method.
Assuming we can interchange the integral and sum, this expression becomes

$$-\sum_{\rho} H(\rho)$$
with the same H(ρ) as defined in the previous section. We showed that H(ρ) = G(ρ) in the first quadrant (Re ρ > 0, Im ρ > 0), and if we take the integral defining G(ρ) to go through the lower halfplane, the same holds for ρ in the fourth quadrant (Re ρ > 0, Im ρ ≤ 0). That is, let C⁻ be the contour that goes in a line segment from 0 to 1 − ε, in a semicircle in the lower halfplane Im t ≤ 0 from 1 − ε to 1 + ε, and then in a line segment from 1 + ε to x. Then, after pairing the terms ρ and 1 − ρ, we find that the total sum is equal to
$$-\sum_{\operatorname{Im} \rho > 0} \left[ \int_{C^-} \frac{t^{-\rho}}{\log t}\, dt + \int_{C^+} \frac{t^{\rho - 1}}{\log t}\, dt \right].$$
If β is real and positive, then the change of variable u = t^β, log t = (log u)/β, dt/t = du/(uβ) gives

$$\int_{C^+} \frac{t^{\beta-1}}{\log t}\, dt = \int_0^{x^{\beta}} \frac{du}{\log u} = \operatorname{Li}(x^{\beta}) - i\pi,$$
where the path from 0 to x^β passes in the upper halfplane near u = 1. Now the integral converges in the halfplane Re β > 0 and thus gives an analytic continuation of Li(x^β) to this halfplane. On the other hand, the integral

$$\int_{C^-} \frac{t^{\beta-1}}{\log t}\, dt = \operatorname{Li}(x^{\beta}) + i\pi,$$

through a similar argument. Thus the formula for equation (8) is
$$-\sum_{\operatorname{Im} \rho > 0} \left[ \operatorname{Li}(x^{\rho}) + \operatorname{Li}(x^{1-\rho}) \right].$$
We must be careful as this sum converges only conditionally. We take the sum
in order of increasing | Im(ρ)|.
4.3 The Constant Term
The next term is log ξ(0), which becomes, in the integral,

$$-\frac{1}{2\pi i} \cdot \frac{1}{\log x} \int_{a-i\infty}^{a+i\infty} \frac{d}{ds}\!\left[ \frac{\log \xi(0)}{s} \right] x^s\, ds.$$
Integrating by parts and using equation (7), we have that the above is equal to

$$\frac{1}{2\pi i} \int_{a-i\infty}^{a+i\infty} \frac{\log \xi(0)}{s}\, x^s\, ds = \log \xi(0),$$

and since ξ(0) = Γ(1) π^0 (0 − 1) ζ(0) = −ζ(0) = 1/2, we get

$$\log \xi(0) = -\log 2.$$
4.4 The Integral Term
The last useful term is

$$\log \Gamma\!\left(\frac{s}{2} + 1\right)$$

and the corresponding integral is

$$\frac{1}{2\pi i} \cdot \frac{1}{\log x} \int_{a-i\infty}^{a+i\infty} \frac{d}{ds}\!\left[ \frac{\log \Gamma\!\left(\frac{s}{2} + 1\right)}{s} \right] x^s\, ds. \qquad (9)$$
Using formula (1), a property of the gamma function, we may rewrite

$$\log \Gamma\!\left(\frac{s}{2} + 1\right) = \sum_{n=1}^{\infty} \left[ -\log\left(1 + \frac{s}{2n}\right) + \frac{s}{2}\log\left(1 + \frac{1}{n}\right) \right].$$
Putting this formula in (9) and assuming that we can interchange the sum and integral, we have (9) in the form

$$\frac{1}{2\pi i} \cdot \frac{1}{\log x} \int_{a-i\infty}^{a+i\infty} \sum_{n=1}^{\infty} \frac{d}{ds}\!\left[ \frac{-\log[1 + (s/2n)]}{s} \right] x^s\, ds,$$

where only the first sum survives (the second vanishes because division by s leaves a constant, which has derivative 0). But this is equal to

$$-\sum_{n=1}^{\infty} H(-2n),$$
where H is defined as in section 4.1 in the evaluation of the main term. In that section we only evaluated H for Re(β) > 0. To analyze the behavior of H in Re(β) < 0, define

$$E(\beta) = -\int_x^{\infty} \frac{t^{\beta-1}}{\log t}\, dt$$
and note that

$$E'(\beta) = -\int_x^{\infty} t^{\beta-1}\, dt = \frac{x^{\beta}}{\beta} = F'(\beta) = H'(\beta),$$
so that E(β) and H(β) differ by a constant. Now both E and H approach zero as β → −∞, and so the constant is zero, giving E(β) = H(β). Thus our term becomes
$$-\sum_{n=1}^{\infty} H(-2n) = \sum_{n=1}^{\infty} \int_x^{\infty} \frac{t^{-2n-1}}{\log t}\, dt = \int_x^{\infty} \frac{1}{t \log t} \sum_{n=1}^{\infty} t^{-2n}\, dt = \int_x^{\infty} \frac{dt}{t(t^2 - 1)\log t},$$

assuming that termwise integration is valid.
To show that it is, we consider

$$\frac{d}{ds}\, \frac{\log \Gamma(s/2+1)}{s} = \sum_{n=1}^{\infty} \frac{d}{ds}\, \frac{-\log[1 + (s/2n)]}{s}.$$
For large n, take the Taylor series expansion log(1 + x) = x − x²/2 + x³/3 − ⋯ to find that

$$\sum_{n=1}^{\infty} \frac{d}{ds}\, \frac{-\log[1 + (s/2n)]}{s} = \sum_{n=1}^{\infty} \left[ \frac{1}{2} \cdot \frac{1}{4n^2} - \frac{2}{3} \cdot \frac{s}{8n^3} + \frac{3}{4} \cdot \frac{s^2}{16n^4} - \cdots \right],$$

which converges uniformly on compact sets, as the highest-order term in n is n^{−2}. This justifies the termwise differentiation. The termwise integration is likewise justified, as the terms decay like 1/n² and the sum is hence uniformly convergent.
4.5 The Vanishing Term
The final term we look at is

$$\frac{s}{2} \log \pi,$$

which, as it turns out, completely vanishes in the formula for J(x), because

$$-\frac{1}{2\pi i} \cdot \frac{1}{\log x} \int_{a-i\infty}^{a+i\infty} \frac{d}{ds}\!\left[ \frac{\frac{s}{2}\log \pi}{s} \right] x^s\, ds = 0.$$
The term is divided by s and becomes constant, resulting in a derivative of 0,
and thus the entire term is 0.
4.6 Result
In the final analysis, we have

$$J(x) = \operatorname{Li}(x) - \sum_{\rho} \operatorname{Li}(x^{\rho}) - \log 2 + \int_x^{\infty} \frac{dt}{t(t^2 - 1)\log t},$$

with x > 1, and with the sum over ρ only conditionally convergent (one must sum in order of increasing | Im(ρ)|, pairing ρ with 1 − ρ as in section 4.2). Combining this formula with
$$\pi(x) = \sum_{n=1}^{\infty} \frac{\mu(n)}{n}\, J(x^{1/n})$$
gives an analytic formula for π(x). Remembering that this formula involves
a finite sum, we can see easily that if the formula for J(x) is valid, then the
formula for π(x) must also be valid.
We have not yet shown the validity of termwise integration for the second term

$$\sum_{\rho} \operatorname{Li}(x^{\rho}).$$
A proof dealing with this sum directly was not discovered until 1908, nearly
half a century after Riemann’s paper, by Landau [L1]. There were also methods
of indirect proof which involved formulas for functions similar to J(x), one of
which we shall examine in the next section.
5 The Von Mangoldt Formula

5.1 Deriving the Formula
Consider a counting function that counts primes and prime powers, weighting each by the log of the prime, that is,

$$\psi(x) = \sum_{p^n < x} \log p,$$

where the function assumes the halfway value at each jump.
This function has the corresponding formula (proved by von Mangoldt in 1894, see [E1])

$$\psi(x) = x - \sum_{\rho} \frac{x^{\rho}}{\rho} - \log(2\pi) + \sum_{n} \frac{x^{-2n}}{2n}$$
for x > 1. While we shall not fully prove it here, we can show that it is a very
reasonable result. One can differentiate the formula for J(x) to obtain

$$dJ = \left( \frac{1}{\log x} - \sum_{\rho} \frac{x^{\rho-1}}{\log x} - \frac{1}{x(x^2-1)\log x} \right) dx.$$
Now, since J jumps by 1/n at prime powers, dJ = 1/n at x = p^n. Similarly, dψ = log p = (1/n) log(p^n) = (1/n) log x at x = p^n. Both are 0 everywhere else. Hence these equations give

$$d\psi = (\log x)\, dJ = \left( 1 - \sum_{\rho} x^{\rho-1} - \sum_{n} x^{-2n-1} \right) dx,$$
where the last term can be derived with a geometric series. This leads to the plausible guess that

$$\psi(x) = x - \sum_{\rho} \frac{x^{\rho}}{\rho} + \sum_{n} \frac{x^{-2n}}{2n} + C.$$
The hard part in showing that von Mangoldt's formula holds is showing that the oscillatory term, i.e. Σ_ρ x^ρ/ρ, converges.
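Although the convergence proof is the hard part, the formula can at least be watched in action. In the sketch below the imaginary parts of the first ten zeros are hardcoded from standard tables and are assumed to lie on the critical line — both facts are outside inputs, not consequences of this paper:

```python
import cmath
import math

# Imaginary parts of the first ten nontrivial zeros (tabulated values; the
# real part 1/2 is an assumption imported from numerical verifications of RH).
ZEROS_IM = [14.134725, 21.022040, 25.010858, 30.424876, 32.935062,
            37.586178, 40.918719, 43.327073, 48.005151, 49.773832]

def psi_direct(x):
    """ψ(x) = Σ_{p^n ≤ x} log p, by brute force."""
    total = 0.0
    for p in range(2, int(x) + 1):
        if all(p % d for d in range(2, int(p ** 0.5) + 1)):
            pk = p
            while pk <= x:
                total += math.log(p)
                pk *= p
    return total

def psi_truncated(x):
    """x − Σ_ρ x^ρ/ρ − log 2π − ½ log(1 − x⁻²), summing ten conjugate pairs."""
    osc = sum(2 * (cmath.exp(complex(0.5, g) * math.log(x)) / complex(0.5, g)).real
              for g in ZEROS_IM)
    return x - osc - math.log(2 * math.pi) - 0.5 * math.log(1 - x ** -2.0)

x = 100.5  # chosen between jumps of ψ
print(psi_direct(x), psi_truncated(x))
```

With only ten zero pairs the two values already agree to within a few units; adding more zeros sharpens the match.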
To derive such a formula for ψ(x) in terms of ζ, von Mangoldt used the same method as Riemann, i.e., he first found a formula for ζ(s) in an integral form of ψ(x), and then took the Laplace transform. In his case, he found

$$-\frac{\zeta'(s)}{\zeta(s)} = s \int_0^{\infty} \psi(x)\, x^{-s-1}\, dx,$$
which comes from log-differentiating the product formula for zeta, and then he applies the transform to obtain

$$\psi(x) = \frac{1}{2\pi i} \int_{a-i\infty}^{a+i\infty} \left[ -\frac{\zeta'(s)}{\zeta(s)} \right] x^s\, \frac{ds}{s} \qquad (10)$$

for a > 1.
For the next step, we shall find a formula for −ζ 0 (s)/ζ(s) and take the
integral termwise. The reader will probably recognize this process as nearly
identical so far to the process Riemann used to find J(x).
Using the equation

$$\xi(0) \prod_{\rho} \left( 1 - \frac{s}{\rho} \right) = \Gamma(s/2+1)\,(s-1)\,\pi^{-s/2}\,\zeta(s)$$
developed at the end of section 2.4 and log-differentiating, we find that

$$-\frac{\zeta'(s)}{\zeta(s)} = \frac{1}{s-1} - \sum_{\rho} \frac{1}{s-\rho} + \sum_{n} \left[ \frac{1}{2}\log\left(1 + \frac{1}{n}\right) - \frac{1}{s+2n} \right] - \frac{1}{2}\log \pi.$$
Plugging in s = 0 gives

$$-\frac{\zeta'(0)}{\zeta(0)} = -1 + \sum_{\rho} \frac{1}{\rho} + \sum_{n} \left[ \frac{1}{2}\log\left(1 + \frac{1}{n}\right) - \frac{1}{2n} \right] - \frac{1}{2}\log \pi,$$
which, when subtracted from the previous equation, gives

$$-\frac{\zeta'(s)}{\zeta(s)} = \frac{s}{s-1} - \sum_{\rho} \frac{s}{\rho(s-\rho)} + \sum_{n} \frac{s}{2n(s+2n)} - \frac{\zeta'(0)}{\zeta(0)}. \qquad (11)$$
5.2 The ∑_ρ x^ρ/ρ Term
When we plug equation (11) into the integral in equation (10), the terms actually converge, so we do not need the extra step of integrating by parts as we did for J(x). This simplifies the calculation immensely. We shall skip over the calculation of the first, third, and fourth terms, as we already know from the calculation of J(x) what they should be (except for the value of the constant) and why they converge. We shall concern ourselves with the second term, arising from the nontrivial zeros, namely

$$-\sum_{\rho} \frac{s}{\rho(s-\rho)},$$
with the integral expression

$$\frac{1}{2\pi i} \int_{a-i\infty}^{a+i\infty} \left[ \sum_{\rho} \frac{s}{\rho(s-\rho)} \right] x^s\, \frac{ds}{s}. \qquad (12)$$

The goal will be to show that this term converges and is equal to

$$\sum_{\rho} \frac{x^{\rho}}{\rho}. \qquad (13)$$
If we pair the roots ρ and 1 − ρ (such roots exist because ξ(s) = ξ(1 − s)), we find that the sum actually converges uniformly. This can be seen from

$$\frac{1}{s-\rho} + \frac{1}{s-(1-\rho)} = \frac{1}{(s-\tfrac{1}{2})-(\rho-\tfrac{1}{2})} + \frac{1}{(s-\tfrac{1}{2})+(\rho-\tfrac{1}{2})} = \frac{2(s-\tfrac{1}{2})}{(s-\tfrac{1}{2})^2 - (\rho-\tfrac{1}{2})^2} \leq \frac{C}{(\rho-\tfrac{1}{2})^2}$$

in absolute value for large ρ, and the fact that
$$\sum_{\rho} \frac{1}{|\rho - \tfrac{1}{2}|^{1+\varepsilon}} < \infty,$$

which is essentially due to ξ(s) having order of growth 1. The uniform convergence implies that this sum can be integrated termwise over finite intervals.
Thus the term (12) is equal to

$$\lim_{h \to \infty} \sum_{\rho} \frac{1}{2\pi i} \int_{a-ih}^{a+ih} \frac{x^s\, ds}{\rho(s-\rho)} = \lim_{h \to \infty} \sum_{\rho} \frac{x^{\rho}}{\rho} \cdot \frac{1}{2\pi i} \int_{a-ih}^{a+ih} \frac{x^{s-\rho}\, ds}{s-\rho},$$

and defines the correct term in the formula for ψ(x). It is not hard to find, for x > 1,
$$\lim_{h \to \infty} \frac{1}{2\pi i} \int_{a-ih}^{a+ih} \frac{x^{s-\rho}\, ds}{s-\rho} = 1,$$
which follows immediately from the formula

$$\lim_{h \to \infty} \frac{1}{2\pi i} \int_{a-ih}^{a+ih} \frac{y^s\, ds}{s} = \begin{cases} 0, & \text{if } 0 < y < 1, \\ \tfrac{1}{2}, & \text{if } y = 1, \\ 1, & \text{if } y > 1. \end{cases}$$
This would imply that the term (12) converges to

$$\lim_{h \to \infty} \sum_{\rho} \frac{x^{\rho}}{\rho} \cdot \frac{1}{2\pi i} \int_{a-ih}^{a+ih} \frac{x^{s-\rho}\, ds}{s-\rho} = \sum_{\rho} \frac{x^{\rho}}{\rho},$$

if we are allowed to interchange the limit and sum. If this is possible, then we will have shown that (13) converges.
To do this, we shall follow von Mangoldt's proof, which takes the limit "diagonally" using the function

$$\sum_{|\operatorname{Im}(\rho)| \leq h} \frac{x^{\rho}}{\rho} \cdot \frac{1}{2\pi i} \int_{a-ih}^{a+ih} \frac{x^{s-\rho}\, ds}{s-\rho}. \qquad (14)$$
Before doing the proof, we need two bounds on the integral

$$\frac{1}{2\pi i} \int \frac{x^s\, ds}{s}.$$

The first bound is

$$\left| \frac{1}{2\pi i} \int_{a-ih}^{a+ih} \frac{x^s\, ds}{s} - 1 \right| \leq \frac{x^a}{\pi h \log x} \qquad (15)$$

with x > 1 and a > 0, and the second is

$$\left| \frac{1}{2\pi i} \int_{a-ic}^{a+id} \frac{x^s\, ds}{s} \right| \leq K \frac{x^a}{(a+c)\log x} \qquad (16)$$
(16)
where x > 1, a > 0, and d > c ≥ 0. The proofs for both can be found in [E1],
and we shall not provide them in this paper.
We also need a statement about the density of roots ρ. Namely, there exists
H such that for T ≥ H, the number of roots in the region T ≤ Im(ρ) ≤ T + 1
is less than 2 log T . It is clear due to ξ(s) having order of growth 1 that this
density must be less than T , but to obtain the bound 2 log T requires a bit more
detail, and it in fact uses Stirling’s approximation for the gamma function. We
shall not give the proof here, but it can also be found in [E1].
Now, on with the proof that (12) converges to (13). Consider, for a given h, the differences

$$\sum_{\rho} \frac{x^{\rho}}{\rho} \cdot \frac{1}{2\pi i} \int_{a-ih}^{a+ih} \frac{x^{s-\rho}\, ds}{s-\rho} \;-\; \sum_{|\operatorname{Im}(\rho)| \leq h} \frac{x^{\rho}}{\rho} \cdot \frac{1}{2\pi i} \int_{a-ih}^{a+ih} \frac{x^{s-\rho}\, ds}{s-\rho} \qquad (17)$$
and

$$\sum_{|\operatorname{Im}(\rho)| \leq h} \frac{x^{\rho}}{\rho} \cdot \frac{1}{2\pi i} \int_{a-ih}^{a+ih} \frac{x^{s-\rho}\, ds}{s-\rho} \;-\; \sum_{|\operatorname{Im}(\rho)| \leq h} \frac{x^{\rho}}{\rho}. \qquad (18)$$
The goal will be to show that both of these go to 0 as h → ∞, which will prove that (12) is equal to (13), and since the former converges, so does the latter.
We shall consider first an estimate of (17). Write ρ = β + iγ. From (16), we see that the modulus of (17) is at most

$$\left| \sum_{|\gamma| > h} \frac{x^{\rho}}{\rho} \cdot \frac{1}{2\pi i} \int_{a-ih}^{a+ih} \frac{x^{s-\rho}\, ds}{s-\rho} \right| \leq 2 \sum_{\gamma > h} \frac{x^{\beta}}{|\rho|} \left| \frac{1}{2\pi i} \int_{a-\beta+i(\gamma-h)}^{a-\beta+i(\gamma+h)} \frac{x^t\, dt}{t} \right| \leq 2 \sum_{\gamma > h} \frac{x^{\beta}}{|\rho|} \cdot \frac{K x^{a-\beta}}{(a-\beta+\gamma-h)\log x} \leq \frac{2K x^{a}}{\log x} \sum_{\gamma > h} \frac{1}{\gamma(\gamma-h+c)},$$

where c = a − 1 > 0, so that c ≤ a − β for all roots ρ. Grouping the roots with γ > h into intervals h < γ ≤ h + 1, h + 1 < γ ≤ h + 2, ..., then for large h the interval h + j < γ ≤ h + j + 1 contains at most 2 log(h + j) roots, and thus the modulus of (17) is at most a constant times
$$\sum_{j=0}^{\infty} \frac{\log(h+j)}{(h+j)(j+c)}.$$
This sum obviously converges because of the j² in the denominator. However, we need to show that as h → ∞, the sum converges to 0. Choose h large enough so that log(h + j) < (h + j)^{1/2} for all j ≥ 0; then the sum is at most

$$\sum_{j=0}^{\infty} \frac{1}{(h+j)^{1/2}(j+c)},$$

which can be made arbitrarily small by choosing large h. Hence (17) goes to 0.
Now consider (18). The modulus of (18) is at most

$$2 \sum_{0 < \gamma \leq h} \frac{x^{\beta}}{|\rho|} \left| \frac{1}{2\pi i} \int_{a-\beta-i(\gamma+h)}^{a-\beta+i(h-\gamma)} \frac{x^t\, dt}{t} - 1 \right|.$$
Note the difference in bounds of integration. The integral bounds (15) and (16)
imply that this is at most
$$2 \sum_{0 < \gamma \leq h} \frac{x^{\beta}}{|\rho|} \left| \frac{1}{2\pi i} \int_{a-\beta-i(h+\gamma)}^{a-\beta+i(h+\gamma)} \frac{x^t\, dt}{t} - 1 \right| + 2 \sum_{0 < \gamma \leq h} \frac{x^{\beta}}{|\rho|} \left| \frac{1}{2\pi i} \int_{a-\beta+i(h-\gamma)}^{a-\beta+i(h+\gamma)} \frac{x^t\, dt}{t} \right|$$

$$\leq 2 \sum_{0 < \gamma \leq h} \frac{x^{\beta}}{|\rho|} \cdot \frac{x^{a-\beta}}{\pi(h+\gamma)\log x} + 2 \sum_{0 < \gamma \leq h} \frac{x^{\beta}}{|\rho|} \cdot \frac{K x^{a-\beta}}{(a-\beta+h-\gamma)\log x} \leq \frac{2x^{a}}{\pi \log x} \sum_{0 < \gamma \leq h} \frac{1}{\gamma(h+\gamma)} + \frac{2K x^{a}}{\log x} \sum_{0 < \gamma \leq h} \frac{1}{\gamma(c+h-\gamma)},$$
where c = a − 1 > 0 and c ≤ a − β as before. Now we just need to show that the two sums

$$\sum_{0 < \gamma \leq h} \frac{1}{\gamma(h+\gamma)} \qquad \text{and} \qquad \sum_{0 < \gamma \leq h} \frac{1}{\gamma(c+h-\gamma)}$$
both go to 0. For the first sum, let H be an integer large enough such that the interval H + j ≤ γ ≤ H + j + 1 contains at most 2 log(H + j) roots. Then

$$\sum_{0 < \gamma \leq h} \frac{1}{\gamma(h+\gamma)} \leq \sum_{0 < \gamma \leq H} \frac{1}{\gamma(h+\gamma)} + \sum_{0 \leq j \leq h-H} \frac{2\log(H+j)}{(H+j)(h+H+j)},$$

where the first sum on the right has a finite number of terms, and thus goes to 0 as h → ∞.
The second sum is at most

$$2 \log h \sum_{0 \leq j \leq h-H} \frac{1}{h}\left( \frac{1}{H+j} - \frac{1}{h+H+j} \right) \leq \frac{2\log h}{h} \sum_{0 \leq j \leq h-H} \frac{1}{H+j} \leq \frac{2\log h}{h} \int_{H-1}^{h} \frac{dt}{t} \leq \frac{2(\log h)^2}{h},$$

which goes to 0 as h → ∞. A similar calculation shows that the sum
$$\sum_{0 < \gamma \leq h} \frac{1}{\gamma(c+h-\gamma)}$$

goes to 0 as h → ∞.
With this, we have shown that (17) and (18) go to 0, and hence

$$\lim_{h \to \infty} \left[ \sum_{\rho} \frac{x^{\rho}}{\rho} \cdot \frac{1}{2\pi i} \int_{a-ih}^{a+ih} \frac{x^{s-\rho}\, ds}{s-\rho} \;-\; \sum_{|\operatorname{Im}(\rho)| \leq h} \frac{x^{\rho}}{\rho} \right] = 0,$$
and therefore we have shown the convergence of

$$\sum_{\rho} \frac{x^{\rho}}{\rho}.$$
6 The Prime Number Theorem and Concluding Remarks
After the argument in the previous section, von Mangoldt then uses a Stieltjes integral to transform the formula for ψ(x) into the formula Riemann obtained for J(x) (the integral is based on dψ = (log x)dJ). Note that there is no circular reasoning here, as von Mangoldt proved the formula for ψ(x) without using J(x) at all; the plausibility argument at the beginning of section 5.1 using dψ = (log x)dJ is purely for motivation. In the Stieltjes integral that von Mangoldt computed, there were two terms corresponding to the convergent Σ_ρ x^ρ/ρ: a first term that contained the sum over ρ but did not contain the variable over which he was integrating, hence the validity of termwise integration, and a second term that contained ρ² in the denominator, so that the sum converged uniformly. These formulas can be found on p. 63 of [E1].
With these facts, this would constitute an indirect proof that the second term in J(x), i.e. the term Σ_ρ Li(x^ρ), converges. Then the formula

$$J(x) = \operatorname{Li}(x) - \sum_{\rho} \operatorname{Li}(x^{\rho}) - \log 2 + \int_x^{\infty} \frac{dt}{t(t^2 - 1)\log t},$$

where x > 1 and the second term is summed in order of increasing | Im(ρ)|, is valid.
We now turn our attention for the remainder of the paper to the prime number theorem,

$$\pi(x) \sim \frac{x}{\log x},$$

which can almost be seen in Riemann's formula, as π(x) ∼ J(x) and x/log x ∼ Li(x). Obviously the third and fourth terms do not grow, but to show the prime number theorem, one must show that

$$\lim_{x \to \infty} \frac{1}{x/\log x} \sum_{\rho} \operatorname{Li}(x^{\rho}) = 0.$$
Perhaps it is much easier to see this with von Mangoldt's formula, after noting that ψ(x) ∼ π(x) log x. Then the prime number theorem amounts to showing that ψ(x) ∼ x. From von Mangoldt's formula

$$\psi(x) = x - \sum_{\rho} \frac{x^{\rho}}{\rho} - \log(2\pi) + \sum_{n} \frac{x^{-2n}}{2n},$$
we see that the prime number theorem is equivalent to

$$\lim_{x \to \infty} \frac{-\sum_{\rho} \frac{x^{\rho}}{\rho} - \log(2\pi) + \sum_{n} \frac{x^{-2n}}{2n}}{x} = 0.$$
Since the last two terms do not grow with x, it suffices to show that

$$\lim_{x \to \infty} \sum_{\rho} \frac{x^{\rho-1}}{\rho} = 0,$$

which would follow from x^{ρ−1} → 0 for all ρ. This requires the proof that there
are no zeros on the line Re(s) = 1, which is precisely what Hadamard and de la
Vallée Poussin showed in their proofs of the prime number theorem.
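As a closing numerical aside (not in the paper), the statement ψ(x) ∼ x can be watched directly by brute force:

```python
import math

def psi(x):
    """Chebyshev ψ(x) = Σ_{p^n ≤ x} log p."""
    total = 0.0
    for p in range(2, int(x) + 1):
        if all(p % d for d in range(2, int(p ** 0.5) + 1)):
            pk = p
            while pk <= x:
                total += math.log(p)
                pk *= p
    return total

for x in (100, 1000, 10000):
    print(x, psi(x) / x)  # each ratio is close to 1
```

The convergence is slow, which reflects how delicate the oscillatory term Σ_ρ x^{ρ−1}/ρ really is.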
References
[D1] Derbyshire, J., Prime Obsession, Joseph Henry Press, Washington, DC,
2003.
[E1] Edwards, H. M., Riemann’s Zeta Function, Academic Press, New York,
NY, 1974.
[L1] Landau, E., Nouvelle démonstration pour la formule de Riemann... Ann.
Sci. Ecole Norm. Sup., 25, 399-442 (1908).
[S1] Stein, E. M. and Shakarchi, R., Complex Analysis, Princeton University
Press, Princeton, NJ, 2003.
[S2] Stopple, J.,A Primer on Analytic Number Theory, Cambridge University
Press, Cambridge, UK, 2003.