Sieves in Number Theory
Lecture Notes
Taught Course Centre 2007
Tim Browning, Roger Heath-Brown
Typeset by Sandro Bettin¹

¹ All errors are the responsibility of the typesetter. In particular there are some arguments which, as an exercise for the typesetter, have been "fleshed out" or re-interpreted, possibly incorrectly. Tim's lectures were neater and more concise. Corrections would be gratefully received at [email protected]
Contents
1 Introduction
2 Sieve of Eratosthenes
3 Large Sieve
4 Selberg sieve
5 Sieve limitations
6 Small gaps between primes
Chapter 1
Introduction
Sieves can be used to tackle the following questions:
i) Are there infinitely many primes $p$ such that $p + 2$ is also prime?
ii) Are there infinitely many primes $p$ such that $p = n^2 + 1$ for some $n \in \mathbb{N}$?
iii) Are there infinitely many primes $p$ such that $4p + 1$ is also prime?
iv) Is every sufficiently large $n$ a sum of two primes?
v) Is it true that the interval $(n^2, (n+1)^2)$ contains at least one prime for every $n \in \mathbb{N}^*$?
These problems are still open but, using sieve methods, some progress has been made towards their solution. For example, in 1966 Chen proved a weaker version of iv), stating that every sufficiently large $n$ is a sum of a prime and a $P_2$ (where $P_r$ denotes the integers that have at most $r$ prime factors).
These problems are also related to important problems in other branches of mathematics, such as Artin's primitive root conjecture, which says that, for all $a \in \mathbb{Z}$ with $a \neq 0, \pm 1$, there exist infinitely many primes $p$ such that $a$ is a primitive root modulo $p$.
Proposition 1. If iii) is true, then Artin's conjecture is true for $a = 2$, i.e. there exist infinitely many primes $p$ such that 2 is a primitive root modulo $p$.
Proof. Let $p = 2k + 1$, with $k \in \mathbb{N}$, and $q = 4p + 1 = 8k + 5$ be primes. Recall that for all odd primes $r$
$$\left(\frac{2}{r}\right) = \begin{cases} 1 & \text{if } r \equiv \pm 1 \pmod 8, \\ -1 & \text{if } r \equiv \pm 3 \pmod 8, \end{cases}$$
where $\left(\frac{a}{p}\right)$ is the Legendre symbol. Therefore $\left(\frac{2}{q}\right) = -1$ and so there is no $x$ such that $2 \equiv x^2 \pmod q$. Furthermore, by Fermat's little theorem, $2^{4p} = 2^{q-1} \equiv 1 \pmod q$, and so the order of 2 modulo $q$ must be 1, 2, 4, $p$, $2p$ or $4p$. It is easily checked that the order can't be 1, 2 or 4, and it can't be $p$ either, because otherwise $2^p = 2^{\frac{q-1}{4}} \equiv 1 \pmod q$ would give $(2^{k+1})^2 = 2^{2k+2} \equiv 2 \pmod q$, exhibiting 2 as a square modulo $q$. It remains to show that $2^{2p} \not\equiv 1 \pmod q$. If it weren't so, we would have $2^2 \equiv 2^{-4k} \pmod q$ and so there would be two possibilities: $2 \equiv 2^{-2k} \pmod q$ or $2 \equiv -2^{-2k} \pmod q$. The first is impossible for the same reason as before (it would make 2 the square of $2^{-k}$); the second is impossible because it would make $-2$ the square of $2^{-k}$, implying
$$1 = \left(\frac{-2}{q}\right) = \left(\frac{-1}{q}\right)\left(\frac{2}{q}\right) = -\left(\frac{-1}{q}\right) = -1,$$
since $q \equiv 1 \pmod 4$ gives $\left(\frac{-1}{q}\right) = 1$. Hence the order of 2 modulo $q$ is $4p = q - 1$, i.e. 2 is a primitive root modulo $q$, and iii) supplies infinitely many such $q$.
The fundamental goal of sieve theory is to produce upper and lower bounds for quantities of the type
$$S(A, ℘, z) = \#\{n \in A \mid p|n \Rightarrow p > z,\ \forall p \in ℘\},$$
where $A$ is a finite subset of $\mathbb{N}$, $℘$ is a subset of the set of primes $P$, and $z > 0$.
Examples 2.
1. Let $A = \{n \in \mathbb{N} \mid n \le x\}$ and $℘ = \{p \in P \mid p \equiv 3 \pmod 4\}$. Then
$$S(A, ℘, x) = \#\{n \le x \mid p|n,\ p \in P \Rightarrow p \not\equiv 3 \pmod 4\} = \#\{n \le x \mid n = a^2 + b^2 \text{ for some coprime } a, b \in \mathbb{N}\},$$
so through this function we can detect sums of two squares.
2. Let $A = \{n \in \mathbb{N} \mid n \le x\}$ and $\sqrt{x} < z \le x$. Then
$$S(A, P, z) = \#\{n \le x \mid p|n \Rightarrow p > z\} = \pi(x) - \pi(z),$$
where $\pi(x) = \#\{p \in P \mid p \le x\}$.
3. Let $A = \{n(2N - n) \mid n \in \mathbb{N},\ 2 \le n \le 2N - 2\}$. Then
$$S(A, P, \sqrt{2N}) = \#\{(p, 2N - p) \in P^2 \mid \sqrt{2N} < p < 2N - \sqrt{2N}\},$$
and this is related to the Goldbach conjecture.
Chapter 2
Sieve of Eratosthenes
The Möbius function is the function $\mu : \mathbb{N}^* \to \{0, \pm 1\}$ defined by
$$\mu(n) = \begin{cases} 1 & \text{if } n = 1, \\ 0 & \text{if } \exists p \in P \text{ such that } p^2 | n, \\ (-1)^r & \text{if } n = p_1 \cdots p_r \text{ with } p_1, \dots, p_r \text{ distinct primes}. \end{cases}$$
Lemma 1. For all $n \in \mathbb{N}$ we have
$$\sum_{d|n} \mu(d) = \begin{cases} 1 & \text{if } n = 1, \\ 0 & \text{otherwise}. \end{cases}$$
Proof. Suppose $n = p_1^{e_1} \cdots p_r^{e_r}$ with $p_1, \dots, p_r$ distinct primes and $e_1, \dots, e_r \in \mathbb{N}^*$. Then
$$\sum_{d|n} \mu(d) = \sum_{d | p_1 \cdots p_r} \mu(d) = 1 + \binom{r}{1}(-1) + \cdots + \binom{r}{r}(-1)^r = (1-1)^r = 0.$$
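Lemma 1 is easy to check numerically. The following Python sketch (all names are ours, not part of the lectures) computes $\mu$ by trial division and verifies the divisor sum:

```python
def mobius(n: int) -> int:
    """Moebius function: 0 if a square divides n, else (-1)^(number of prime factors)."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:          # p^2 divided the original n
                return 0
            result = -result
        p += 1
    if n > 1:                       # one prime factor left over
        result = -result
    return result

def divisor_mobius_sum(n: int) -> int:
    """Sum of mu(d) over all divisors d of n; Lemma 1 says this is 1 for n = 1, else 0."""
    return sum(mobius(d) for d in range(1, n + 1) if n % d == 0)
```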
Lemma 2 (Abel's partial summation formula). Let $\lambda_1, \lambda_2, \dots$ be an increasing sequence of real numbers tending to $\infty$ and $c_1, c_2, \dots$ a sequence of complex numbers. Let $C(x) = \sum_{\lambda_n \le x} c_n$ and let $\varphi : [\lambda_1, \infty) \to \mathbb{R}$ be of class $C^1$. Then
$$\sum_{\lambda_n \le X} c_n \varphi(\lambda_n) = C(X)\varphi(X) - \int_{\lambda_1}^{X} C(x)\varphi'(x)\,dx, \tag{2.1}$$
for all $X \ge \lambda_1$. Moreover, if $C(X)\varphi(X) \to 0$ as $X \to \infty$, then
$$\sum_{n=1}^{\infty} c_n \varphi(\lambda_n) = - \int_{\lambda_1}^{\infty} C(x)\varphi'(x)\,dx, \tag{2.2}$$
provided that either side is convergent.
Proof. One has
$$C(X)\varphi(X) - \sum_{\lambda_n \le X} c_n \varphi(\lambda_n) = \sum_{\lambda_n \le X} c_n (\varphi(X) - \varphi(\lambda_n)) = \sum_{\lambda_n \le X} c_n \int_{\lambda_n}^{X} \varphi'(x)\,dx = \int_{\lambda_1}^{X} \Big( \sum_{\lambda_n \le x} c_n \Big) \varphi'(x)\,dx = \int_{\lambda_1}^{X} C(x)\varphi'(x)\,dx.$$
This proves (2.1). To prove (2.2) it's enough to let $X$ go to infinity.
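As a quick sanity check of (2.1), here is a sketch (a toy choice of ours) with $\lambda_n = n$, $c_n = 1$ and $\varphi(x) = 1/x$, so that $C(x) = \lfloor x \rfloor$ and both sides can be computed exactly with rational arithmetic:

```python
from fractions import Fraction

def lhs(X: int) -> Fraction:
    # sum_{lambda_n <= X} c_n * phi(lambda_n) with lambda_n = n, c_n = 1, phi(x) = 1/x
    return sum(Fraction(1, n) for n in range(1, X + 1))

def rhs(X: int) -> Fraction:
    # C(X)phi(X) - integral_1^X C(x) phi'(x) dx, where C(x) = floor(x), phi'(x) = -1/x^2.
    # On [n, n+1) we have C(x) = n, and the integral of 1/x^2 there is 1/n - 1/(n+1).
    integral = sum(n * (Fraction(1, n) - Fraction(1, n + 1)) for n in range(1, X))
    return Fraction(1) + integral   # C(X)phi(X) = X * (1/X) = 1
```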
Let
$$\Pi = \Pi(℘, z) := \prod_{p \in ℘,\ p \le z} p, \qquad A_d := \{n \in \mathbb{N} \mid dn \in A\},$$
for all $d \in \mathbb{N}^*$. Applying Lemma 1, we can write
$$S(A, ℘, z) = \sum_{\substack{n \in A \\ (n,\Pi)=1}} 1 = \sum_{n \in A} \sum_{d | (n,\Pi)} \mu(d) = \sum_{d | \Pi(℘,z)} \mu(d) \#A_d. \tag{2.3}$$
Now, suppose that there exist $X$, $R_d$ and a completely multiplicative function $\omega(d)$, with
$$\omega(d) \ge 0 \quad \forall d, \qquad \omega(p) = 0 \quad \forall p \in P \setminus ℘,$$
such that
$$\#A_d = \frac{\omega(d)}{d} X + R_d \qquad \forall d \in \mathbb{N}^*. \tag{2.4}$$
Then we can prove the following.
Theorem A (Sieve of Eratosthenes). Let $X$, $R_d$, $\omega(d)$ be as above and assume furthermore that:
1. $R_d = O(\omega(d))$;
2. $\exists k \ge 0$ such that $\sum_{p | \Pi(℘,z)} \frac{\omega(p) \log p}{p} \le k \log z + O(1)$;
3. $\exists y > 0$ such that $\#A_d = 0$ for $d > y$.
Then we have
$$S(A, ℘, z) = XW(z) + O\Big( \Big( X + \frac{y}{\log z} \Big) (\log z)^{k+1} \exp\Big( -\frac{\log y}{\log z} \Big) \Big),$$
where
$$W(z) = \prod_{p \in ℘,\ p \le z} \Big( 1 - \frac{\omega(p)}{p} \Big).$$
Proof. Assume all the hypotheses of the theorem. For all $\delta > 0$, we have
$$F(t, z) := \sum_{d \le t,\ d|\Pi} \omega(d) \le \sum_{d|\Pi} \omega(d) \Big( \frac{t}{d} \Big)^{\delta},$$
using "Rankin's trick". Since $1 + x \le e^x$ for all $x \in \mathbb{R}$, using the multiplicativity of $\omega$ we deduce that
$$F(t, z) \le t^{\delta} \prod_{p|\Pi} \Big( 1 + \frac{\omega(p)}{p^{\delta}} \Big) \le t^{\delta} \prod_{p|\Pi} \exp\Big( \frac{\omega(p)}{p^{\delta}} \Big) = \exp\Big( \delta \log t + \sum_{p|\Pi} \frac{\omega(p)}{p^{\delta}} \Big).$$
Now, writing $\delta = 1 - \eta$ and using the inequality $e^x \le 1 + x e^x$ for $x > 0$, we see that
$$p^{\eta} = \exp(\eta \log p) \le 1 + \eta (\log p) p^{\eta} \le 1 + \eta z^{\eta} \log p,$$
since every prime $p | \Pi(℘, z)$ is less than $z$. Therefore
$$F(t, z) \le t \exp(-\eta \log t) \exp\Big( \sum_{p|\Pi} \frac{\omega(p)}{p} p^{\eta} \Big) \le t \exp\Big( -\eta \log t + \sum_{p|\Pi} \frac{\omega(p)}{p} + \eta z^{\eta} \sum_{p|\Pi} \frac{\omega(p)}{p} \log p \Big).$$
Now, applying Lemma 2 to $c_p = \frac{\omega(p)}{p} \log p$ and $\varphi(x) = \frac{1}{\log x}$, we have
$$\sum_{p|\Pi(℘,z)} \frac{\omega(p)}{p} = \frac{1}{\log z} \sum_{p|\Pi(℘,z)} \frac{\omega(p)}{p} \log p + \int_{p_1}^{z} \Big( \sum_{p|\Pi(℘,x)} \frac{\omega(p)}{p} \log p \Big) \frac{dx}{x \log^2 x} \le k \log\log z + O(1),$$
by hypothesis 2. Hence
$$F(t, z) \le t \exp\big( -\eta \log t + k \log\log z + k \eta z^{\eta} \log z + O(1) \big).$$
Choosing $\eta = (\log z)^{-1}$, so that $z^{\eta} = e$, we obtain
$$F(t, z) \ll t \exp\Big( \frac{-\log t}{\log z} \Big) (\log z)^{k}. \tag{2.5}$$
Moreover, by partial summation (Lemma 2) with $c_d = \omega(d)$, $\varphi(x) = \frac{1}{x}$, we can conclude that
$$\sum_{d|\Pi,\ d>y} \frac{\omega(d)}{d} = -\frac{F(y,z)}{y} + \int_{y}^{\infty} \frac{F(t,z)}{t^2}\,dt = \int_{y}^{\infty} \frac{F(t,z) - F(y,z)}{t^2}\,dt \le \int_{y}^{\infty} \frac{F(t,z)}{t^2}\,dt$$
$$\ll (\log z)^{k} \int_{y}^{\infty} \frac{1}{t} \exp\Big( \frac{-\log t}{\log z} \Big)\,dt \ll (\log z)^{k+1} \exp\Big( \frac{-\log y}{\log z} \Big). \tag{2.6}$$
Finally, by hypothesis 3 and (2.3)–(2.4) we have
$$S(A, ℘, z) = \sum_{d|\Pi,\ d \le y} \mu(d)\#A_d = X \sum_{d|\Pi,\ d \le y} \frac{\mu(d)\omega(d)}{d} + O\Big( \sum_{d|\Pi,\ d \le y} |\mu(d)||R_d| \Big)$$
$$= XW(z) + O\Big( X \sum_{d|\Pi,\ d>y} \frac{\omega(d)}{d} + \sum_{d|\Pi,\ d \le y} \omega(d) \Big) = XW(z) + O\Big( \Big( X + \frac{y}{\log z} \Big)(\log z)^{k+1} \exp\Big( \frac{-\log y}{\log z} \Big) \Big),$$
where we used hypothesis 1 and (2.5)–(2.6), together with $\sum_{d|\Pi} \frac{\mu(d)\omega(d)}{d} = W(z)$ and $\sum_{d|\Pi,\ d \le y} \omega(d) = F(y,z)$.
We apply the previous theorem to the problem of twin primes.
Corollary A.1 (Brun's theorem). We have
$$\sum_{p,\ p+2 \in P} \frac{1}{p} < \infty.$$
Proof. It follows from a slightly modified version of Theorem A.
Corollary A.2. For $z \le x^{\frac{1}{4\log\log x}}$, we have
$$\varphi(x, z) = \#\{n \le x \mid p|n \Rightarrow p \ge z\} \sim x \prod_{p<z} \Big( 1 - \frac{1}{p} \Big).$$
Proof. Exercise.
Note the following.
Lemma 3 (Mertens' formula). We have
$$\prod_{p \le z} \Big( 1 - \frac{1}{p} \Big) \sim \frac{e^{-\gamma}}{\log z},$$
where $\gamma$ is Euler's constant.
Proof. See Hardy and Wright, Theorem 429.
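Mertens' formula is easy to observe numerically. The following sketch (names and the tolerance are our choices) builds the primes with the sieve of Eratosthenes itself and compares the product against $e^{-\gamma}/\log z$:

```python
import math

def primes_up_to(n: int) -> list:
    """Classical sieve of Eratosthenes up to n."""
    is_prime = [True] * (n + 1)
    is_prime[0:2] = [False, False]
    for p in range(2, math.isqrt(n) + 1):
        if is_prime[p]:
            is_prime[p * p::p] = [False] * len(is_prime[p * p::p])
    return [p for p, flag in enumerate(is_prime) if flag]

EULER_GAMMA = 0.5772156649015329

def mertens_ratio(z: int) -> float:
    """prod_{p <= z} (1 - 1/p) divided by e^(-gamma)/log z; tends to 1 by Mertens' formula."""
    prod = 1.0
    for p in primes_up_to(z):
        prod *= 1.0 - 1.0 / p
    return prod / (math.exp(-EULER_GAMMA) / math.log(z))
```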
Chapter 3
Large Sieve
Lemma 4. Let $F : [0,1] \to \mathbb{C}$ be a differentiable function with continuous derivative. Then, if we extend $F$ to all of $\mathbb{R}$ by periodicity with period 1, we have
$$\sum_{d \le z} \sum_{\substack{1 \le a \le d \\ (a,d)=1}} \Big| F\Big( \frac{a}{d} \Big) \Big| \le z^2 \int_0^1 |F(\alpha)|\,d\alpha + \int_0^1 |F'(\alpha)|\,d\alpha,$$
for all $z \in \mathbb{N}^*$.
Proof. We have that
$$F\Big( \frac{a}{d} \Big) = F(\alpha) - \int_{a/d}^{\alpha} F'(t)\,dt.$$
Therefore
$$\Big| F\Big( \frac{a}{d} \Big) \Big| \le |F(\alpha)| + \Big| \int_{a/d}^{\alpha} |F'(t)|\,dt \Big|. \tag{3.1}$$
Now, let $\delta = \frac{1}{2z^2}$, so that the intervals $I = I_{a/d} := \big( \frac{a}{d} - \delta, \frac{a}{d} + \delta \big)$, for $d \le z$, $1 \le a \le d$ and $(a,d)=1$, are pairwise disjoint and contained in $[0,1]$ (modulo 1). Integrating (3.1) over $I$, we obtain
$$2\delta \Big| F\Big( \frac{a}{d} \Big) \Big| \le \int_I |F(\alpha)|\,d\alpha + \int_I \Big| \int_{a/d}^{\alpha} |F'(t)|\,dt \Big|\,d\alpha \le \int_I |F(\alpha)|\,d\alpha + 2\delta \int_I |F'(t)|\,dt,$$
since, if $\alpha \in I$, then the interval with endpoints $\frac{a}{d}$ and $\alpha$ is contained in $I$. Summing over $a$ and $d$ and multiplying by $z^2 = \frac{1}{2\delta}$, we obtain
$$\sum_{d \le z} \sum_{\substack{1 \le a \le d \\ (a,d)=1}} \Big| F\Big( \frac{a}{d} \Big) \Big| \le \sum_{d \le z} \sum_{\substack{1 \le a \le d \\ (a,d)=1}} \Big( z^2 \int_I |F(\alpha)|\,d\alpha + \int_I |F'(t)|\,dt \Big) \le z^2 \int_0^1 |F(\alpha)|\,d\alpha + \int_0^1 |F'(\alpha)|\,d\alpha.$$
Theorem B (Analytic large sieve inequality). Let $\{a_n\}_{n \in \mathbb{N}}$ be a sequence in $\mathbb{C}$, $x \in \mathbb{N}$ and
$$S(\alpha) = \sum_{n \le x} a_n e(n\alpha),$$
where $e(\beta) = \exp(2\pi i \beta)$. Then
$$\sum_{d \le z} \sum_{\substack{1 \le a \le d \\ (a,d)=1}} \Big| S\Big( \frac{a}{d} \Big) \Big|^2 \le (z^2 + 4\pi x) \sum_{n \le x} |a_n|^2.$$
Proof. Applying Lemma 4 with $F(\alpha) = S(\alpha)^2$, we obtain
$$\sum_{d \le z} \sum_{\substack{1 \le a \le d \\ (a,d)=1}} \Big| S\Big( \frac{a}{d} \Big) \Big|^2 \le z^2 \int_0^1 |S(\alpha)|^2\,d\alpha + 2 \int_0^1 |S'(\alpha)S(\alpha)|\,d\alpha.$$
By Parseval's identity we have that
$$\int_0^1 |S(\alpha)|^2\,d\alpha = \sum_{n \le x} |a_n|^2$$
and, since $S'(\alpha) = 2\pi i \sum_{n \le x} n a_n e(n\alpha)$, by Cauchy's inequality and Parseval's identity we get
$$\Big( \int_0^1 |S'(\alpha)S(\alpha)|\,d\alpha \Big)^2 \le 4\pi^2 \Big( \sum_{n \le x} |a_n|^2 \Big) \Big( \sum_{n \le x} n^2 |a_n|^2 \Big) \le 4\pi^2 x^2 \Big( \sum_{n \le x} |a_n|^2 \Big)^2,$$
which completes the proof.
Remark. Montgomery and Vaughan (1974) and Selberg independently proved that the factor $4\pi$ can be removed from the analytic large sieve inequality. Moreover, 1 is the best possible coefficient of $x$.
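The inequality of Theorem B can be tested directly on random coefficients. The sketch below (all names, the seed and the sizes are toy choices of ours) evaluates both sides and returns their ratio, which the theorem guarantees is at most 1:

```python
import cmath, math, random

def large_sieve_ratio(coeffs, z):
    """Ratio (left side)/(right side) of Theorem B, where coeffs = [a_1, ..., a_x]."""
    x = len(coeffs)
    def S(alpha):
        return sum(a * cmath.exp(2j * math.pi * n * alpha)
                   for n, a in enumerate(coeffs, start=1))
    lhs = sum(abs(S(a / d)) ** 2
              for d in range(1, z + 1)
              for a in range(1, d + 1) if math.gcd(a, d) == 1)
    rhs = (z * z + 4 * math.pi * x) * sum(abs(c) ** 2 for c in coeffs)
    return lhs / rhs

random.seed(0)
ratio = large_sieve_ratio([random.uniform(-1, 1) for _ in range(60)], 8)
```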
Next we deduce a sieve method from Theorem B. We need the following lemma about Ramanujan sums.
Lemma 5. For all $d, n \in \mathbb{N}$, let
$$c_d(n) = \sum_{\substack{1 \le a \le d \\ (a,d)=1}} e\Big( \frac{na}{d} \Big).$$
Then:
1. $(d, d') = 1 \Rightarrow c_{dd'}(n) = c_d(n) c_{d'}(n)$;
2. $c_d(n) = \sum_{D | (d,n)} \mu\big( \frac{d}{D} \big) D$;
3. $(d, n) = 1 \Rightarrow c_d(n) = \mu(d)$.
Proof.
1. By Bézout's identity, every $a$ with $1 \le a \le dd'$ and $(a, dd') = 1$ can be written as $a \equiv rd + sd' \pmod{dd'}$ with $1 \le r \le d'$, $(r,d')=1$ and $1 \le s \le d$, $(s,d)=1$. Hence
$$c_{dd'}(n) = \sum_{\substack{1 \le a \le dd' \\ (a,dd')=1}} e\Big( \frac{na}{dd'} \Big) = \sum_{\substack{1 \le s \le d \\ (s,d)=1}} \sum_{\substack{1 \le r \le d' \\ (r,d')=1}} e\Big( \frac{n(rd + sd')}{dd'} \Big) = \sum_{\substack{1 \le s \le d \\ (s,d)=1}} e\Big( \frac{ns}{d} \Big) \sum_{\substack{1 \le r \le d' \\ (r,d')=1}} e\Big( \frac{nr}{d'} \Big) = c_d(n) c_{d'}(n).$$
2. By Lemma 1, we have
$$c_d(n) = \sum_{1 \le a \le d} e\Big( \frac{na}{d} \Big) \sum_{D | (a,d)} \mu(D) = \sum_{D | d} \mu(D) \sum_{1 \le b \le d/D} e\Big( \frac{nbD}{d} \Big) = \sum_{D | (d,n)} \mu\Big( \frac{d}{D} \Big) D,$$
since
$$\sum_{1 \le b \le d/D} e\Big( \frac{nbD}{d} \Big) = \begin{cases} 0 & \text{if } \frac{d}{D} \nmid n, \\ \frac{d}{D} & \text{if } \frac{d}{D} \mid n. \end{cases}$$
3. It's a special case of the previous item.
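The three items of Lemma 5 can all be checked numerically. A small sketch (our names; `round` is safe because $c_d(n)$ is a rational integer):

```python
import cmath, math

def ramanujan_sum(d: int, n: int) -> int:
    """c_d(n): sum of e(na/d) over 1 <= a <= d with gcd(a, d) = 1 (always an integer)."""
    total = sum(cmath.exp(2j * math.pi * n * a / d)
                for a in range(1, d + 1) if math.gcd(a, d) == 1)
    return round(total.real)

def mobius(n: int) -> int:
    """Moebius function by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    return -result if n > 1 else result
```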
Theorem C (Arithmetic large sieve inequality). Let $℘ \subset P$ and $A = \{n \in \mathbb{N} \mid n \le x\}$. For each $p \in ℘$, let $\Omega_p = \{w_{1,p}, \dots, w_{\omega(p),p}\}$ be a set of $\omega(p)$ residue classes modulo $p$, and put $\omega(p) = 0$ if $p \notin ℘$. Finally, let
$$\mathcal{S}(A, ℘, z) = \{n \in A \mid n \not\equiv w_{i,p} \pmod p\ \forall 1 \le i \le \omega(p),\ \forall p | \Pi(℘,z)\}$$
and $S(A, ℘, z) = \#\mathcal{S}(A, ℘, z)$. Then
$$S(A, ℘, z) \le \frac{z^2 + 4\pi x}{L(z)},$$
where
$$L(z) = \sum_{d \le z} |\mu(d)| \prod_{p|d} \frac{\omega(p)}{p - \omega(p)}.$$
Proof. Let $d = p_1 \cdots p_t$ be a (square-free) integer dividing $\Pi(℘, z)$. By the Chinese remainder theorem, for every $i = (i_1, \dots, i_t)$ with $1 \le i_j \le \omega(p_j)$ there exists a unique $W_{i,d}$ such that $0 \le W_{i,d} < d$ and $W_{i,d} \equiv w_{i_j,p_j} \pmod{p_j}$ for $j \le t$. Let's call $\omega(d) = \prod_{j=1}^{t} \omega(p_j)$ the total number of possible $W_{i,d}$ as we vary $i$.
Now let $n \in \mathcal{S}(A, ℘, z)$. Then $(n - W_{i,d}, d) = 1$ for all $d$ and $i$. Hence, by Lemma 5, item 3, we have
$$\mu(d) = c_d(n - W_{i,d}) = \sum_{\substack{1 \le a \le d \\ (a,d)=1}} e\Big( \frac{a(n - W_{i,d})}{d} \Big).$$
Summing over $i$ and $n \in \mathcal{S}(A, ℘, z)$, we deduce that
$$\mu(d)\, S(A,℘,z)\, \omega(d) = \sum_i \sum_n c_d(n - W_{i,d}) = \sum_{\substack{1 \le a \le d \\ (a,d)=1}} \Big( \sum_i e\Big( \frac{-aW_{i,d}}{d} \Big) \Big) \Big( \sum_n e\Big( \frac{an}{d} \Big) \Big)$$
and therefore, by the Cauchy–Schwarz inequality,
$$|\mu(d)\, S(A,℘,z)\, \omega(d)|^2 \le \Big( \sum_{\substack{1 \le a \le d \\ (a,d)=1}} \Big| \sum_i e\Big( \frac{-aW_{i,d}}{d} \Big) \Big|^2 \Big) \Big( \sum_{\substack{1 \le a \le d \\ (a,d)=1}} \Big| \sum_n e\Big( \frac{an}{d} \Big) \Big|^2 \Big).$$
The first factor on the right-hand side is
$$\sum_{\substack{1 \le a \le d \\ (a,d)=1}} \Big| \sum_i e\Big( \frac{-aW_{i,d}}{d} \Big) \Big|^2 = \sum_{W_{i,d},\, W_{i',d}} \sum_{\substack{1 \le a \le d \\ (a,d)=1}} e\Big( \frac{(W_{i,d} - W_{i',d})a}{d} \Big) = \sum_{W_{i,d},\, W_{i',d}} c_d(W_{i,d} - W_{i',d})$$
$$= \sum_{W_{i,d},\, W_{i',d}} \sum_{D | (d,\, W_{i,d} - W_{i',d})} \mu\Big( \frac{d}{D} \Big) D = \sum_{D|d} D\, \mu\Big( \frac{d}{D} \Big) \sum_{W_{i,d}} \sum_{\substack{W_{i',d} \\ D |\, W_{i,d} - W_{i',d}}} 1 = \sum_{D|d} D\, \mu\Big( \frac{d}{D} \Big)\, \omega(d)\, \omega\Big( \frac{d}{D} \Big)$$
$$= d\, \omega(d) \sum_{E|d} \frac{\mu(E)\omega(E)}{E} = d\, \omega(d) \prod_{p|d} \Big( 1 - \frac{\omega(p)}{p} \Big) = \omega(d) \prod_{p|d} (p - \omega(p)),$$
where we used Lemma 5, item 2, and the fact that $W_{i,d} \equiv W_{i',d} \pmod D$ iff $i_j = i'_j$ for all $p_j | D$. Hence we have
$$|\mu(d)|\, S(A,℘,z)^2 \prod_{p|d} \frac{\omega(p)}{p - \omega(p)} \le \sum_{\substack{1 \le a \le d \\ (a,d)=1}} \Big| \sum_n e\Big( \frac{an}{d} \Big) \Big|^2,$$
and this inequality is obviously true also if $d$ is not square-free or if it doesn't divide $\Pi(℘, z)$. Summing over $d \le z$ and applying Theorem B with $a_n = 1$ if $n \in \mathcal{S}(A, ℘, z)$ and $0$ otherwise, we obtain
$$L(z)\, S(A, ℘, z)^2 \le (z^2 + 4\pi x)\, S(A, ℘, z),$$
which gives the theorem.
Given a prime $p$, let's define $q(p)$ to be the smallest positive integer such that $q(p)$ is not a square modulo $p$ (i.e. the Legendre symbol $\big( \frac{q(p)}{p} \big)$ is equal to $-1$). Note that, the Legendre symbol being completely multiplicative, $q(p) \in P$. Moreover, $q(p) = 2$ if $p \equiv \pm 3 \pmod 8$, since
$$\Big( \frac{2}{p} \Big) = \begin{cases} 1 & \text{if } p \equiv \pm 1 \pmod 8, \\ -1 & \text{if } p \equiv \pm 3 \pmod 8. \end{cases}$$
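For small primes, $q(p)$ can be computed directly; a sketch (our names) using Euler's criterion for the Legendre symbol:

```python
def legendre(a: int, p: int) -> int:
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    r = pow(a % p, (p - 1) // 2, p)
    return r - p if r == p - 1 else r

def least_nonresidue(p: int) -> int:
    """q(p): the smallest positive integer that is not a square modulo p."""
    q = 2
    while legendre(q, p) == 1:
        q += 1
    return q
```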
The best unconditional result known is $q(p) \ll p^{\theta + \varepsilon}$ for all $\varepsilon > 0$, where $\theta = \frac{1}{4\sqrt{e}} = 0.1516\dots$, while, assuming the Riemann hypothesis, one has $q(p) \ll \log^2 p$. This problem is linked to Artin's conjecture on primitive roots. Using Theorem C, we can now prove the following corollary.
Corollary C.1. Let $\varepsilon > 0$ and $E_\varepsilon(N) = \#\{\text{primes } p \le N \mid q(p) > N^\varepsilon\}$. Then
$$E_\varepsilon(N) \ll_\varepsilon 1.$$
Proof. Since $E_{\varepsilon'}(N) \subseteq E_\varepsilon(N)$ if $\varepsilon < \varepsilon'$, we can suppose $\varepsilon^{-1} \in \mathbb{N}$. Let $A = \{1, \dots, N^2\}$, $℘ = \{p \in P \mid \big(\frac{n}{p}\big) = 1\ \forall n \le N^\varepsilon\}$ and $\Omega_p = \{v \pmod p \mid \big(\frac{v}{p}\big) = -1\}$. Thus $\omega(p) = \#\Omega_p = \frac{p-1}{2}$ for all $p \in ℘$, and $h(p) := \frac{\omega(p)}{p - \omega(p)} = \frac{p-1}{p+1} \ge \frac{1}{3}$ if $p \in ℘$. Theorem C implies that
$$S(A, ℘, N) \le \frac{N^2 + 4\pi N^2}{\sum_{\substack{d \le N \\ p|d \Rightarrow p \in ℘}} |\mu(d)| \prod_{p|d} h(p)} \le \frac{(1 + 4\pi)N^2}{\sum_{p \le N,\ p \in ℘} h(p)} \le \frac{(1 + 4\pi)N^2}{\sum_{p \le N,\ q(p) > N^\varepsilon} h(p)},$$
since every prime $p \le N$ with $q(p) > N^\varepsilon$ lies in $℘$. But
$$E_\varepsilon(N) = \sum_{\substack{p \le N \\ q(p) > N^\varepsilon}} 1 \le \sum_{\substack{p \le N \\ q(p) > N^\varepsilon}} 3h(p)$$
and so
$$E_\varepsilon(N)\, S(A, ℘, N) \le 3(1 + 4\pi)N^2. \tag{3.2}$$
Moreover, we have
$$S(A, ℘, N) = \#\Big\{ n \le N^2 \;\Big|\; p \le N,\ \Big( \frac{m}{p} \Big) = 1\ \forall m \le N^\varepsilon \;\Longrightarrow\; \Big( \frac{n}{p} \Big) \ne -1 \Big\} \tag{3.3}$$
$$\ge \#\big\{ n = m \cdot p_1 \cdots p_k \le N^2 \;\big|\; N^{\varepsilon - \varepsilon^2/2} < p_j < N^\varepsilon \text{ for } 1 \le j \le k = 2\varepsilon^{-1} \big\}.$$
Indeed, if $n = m \cdot p_1 \cdots p_k \le N^2$ with $N^{\varepsilon - \varepsilon^2/2} < p_j < N^\varepsilon$ for $1 \le j \le k = 2\varepsilon^{-1}$, then for all $p \in ℘$ we have $\big( \frac{p_j}{p} \big) = 1$ for all $1 \le j \le k$ and $\big( \frac{m}{p} \big) = 1$, since $N^2 \ge m \big( N^{\varepsilon - \varepsilon^2/2} \big)^k = m N^{2 - \varepsilon}$ and so $m \le N^\varepsilon$. Thus $\big( \frac{n}{p} \big) = \big( \frac{m \cdot p_1 \cdots p_k}{p} \big) = 1$. Using the fact that $\sum_{p \le B} \frac{1}{p} \sim \log\log B$, equation (3.3) gives
$$S(A, ℘, N) \ge \sum_{\substack{p_1, \dots, p_k \\ N^{\varepsilon - \varepsilon^2/2} < p_j < N^\varepsilon}} \frac{N^2}{p_1 \cdots p_k} - \sum_{\substack{p_1, \dots, p_k \\ p_j < N^\varepsilon}} 1 > N^2 \Big( \log\frac{1}{1 - \varepsilon/2} \Big)^{2\varepsilon^{-1}} - \Big( \frac{N^\varepsilon}{\varepsilon \log N} \Big)^{2\varepsilon^{-1}}$$
$$= N^2 \Big( \Big( \log\frac{1}{1 - \varepsilon/2} \Big)^{2\varepsilon^{-1}} - \Big( \frac{1}{\varepsilon \log N} \Big)^{2\varepsilon^{-1}} \Big) \gg_\varepsilon N^2, \tag{3.4}$$
since
$$\log\frac{1}{1 - \varepsilon/2} > \frac{1}{\varepsilon \log N}$$
for $N$ large enough (depending on $\varepsilon$). To complete the proof it is enough to put together (3.2) and (3.4).
We would now like to tackle the following question: for $a, b \in \mathbb{N}$, how likely is it that the conic
$$C_{a,b} := \{ax^2 + by^2 = z^2,\ (x,y,z) \ne (0,0,0)\} \subset \mathbb{P}^2_{\mathbb{Q}}$$
has a rational point? If $M(H)$ is defined as
$$M(H) = \#\{a, b \in \mathbb{N} \mid a, b \le H,\ C_{a,b}(\mathbb{Q}) \ne \emptyset\},$$
what happens to the ratio $M(H)/H^2$ as $H$ goes to infinity?
We are now going to deduce from Theorem C a partial answer to this problem, but first we need to state some definitions and results.
Let $K = \mathbb{R}$ or $\mathbb{Q}_p$ for some prime $p$. The Hilbert symbol for $K$ is the function defined by
$$(a,b)_K = \begin{cases} 1 & \exists (x,y,z) \in K^3 \setminus \{0\} \text{ s.t. } ax^2 + by^2 = z^2, \\ -1 & \text{otherwise}, \end{cases}$$
for all $a, b \in K^*$. Write
$$(a,b)_K = \begin{cases} (a,b)_p & K = \mathbb{Q}_p, \\ (a,b)_\infty & K = \mathbb{R}. \end{cases}$$
We'll need the following properties.
Proposition 3. Let $K = \mathbb{Q}_p$ for some prime $p$, or $K = \mathbb{R}$, and let $a, a', b \in K^*$. Then:
1. $(a,b)_K = (b,a)_K$;
2. $(aa', b)_K = (a,b)_K (a',b)_K$ (bimultiplicativity);
3. $(a,b)_\infty = \begin{cases} 1 & a \text{ or } b > 0, \\ -1 & a, b < 0; \end{cases}$
4. if $p > 2$ and $a = p^\alpha u$, $b = p^\beta v$ with $p \nmid uv$, then
$$(a,b)_p = (-1)^{\alpha\beta\frac{p-1}{2}} \Big( \frac{u}{p} \Big)^{\beta} \Big( \frac{v}{p} \Big)^{\alpha},$$
where the last two factors are Legendre symbols.
Proof. See Chapter III of Serre's "A Course in Arithmetic".
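The formula of Proposition 3, item 4, makes $(a,b)_p$ computable for odd $p$. A sketch (our names; integer inputs, odd prime $p$ assumed):

```python
def legendre(a: int, p: int) -> int:
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    r = pow(a % p, (p - 1) // 2, p)
    return r - p if r == p - 1 else r

def hilbert_symbol_odd(a: int, b: int, p: int) -> int:
    """(a, b)_p for an odd prime p, computed via Proposition 3, item 4."""
    alpha = 0
    while a % p == 0:            # a = p^alpha * u with p not dividing u
        a //= p
        alpha += 1
    beta = 0
    while b % p == 0:            # b = p^beta * v with p not dividing v
        b //= p
        beta += 1
    sign = -1 if (alpha * beta * ((p - 1) // 2)) % 2 else 1
    return sign * legendre(a, p) ** beta * legendre(b, p) ** alpha
```

For instance, $(p, -p)_p = 1$ always (take $x = y = 1$, and $z^2 = 0$ is excluded but a $p$-adic solution exists), while $(3,3)_3 = -1$.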
It's worthwhile to know the following theorem, which shows that the conics $C_{a,b}$ satisfy the Hasse principle.
Theorem (Hasse–Minkowski). There exists $(x,y,z) \in \mathbb{Q}^3 \setminus \{0\}$ such that $ax^2 + by^2 = z^2$ if and only if $(a,b)_\infty = 1$ and $(a,b)_p = 1$ for all primes $p$.
Proof. See Serre's "A Course in Arithmetic".
Now we are ready to prove the following.
Corollary C.2. We have
$$M(H) \ll \frac{H^2}{(\log H)^{\frac{1}{2} - \varepsilon}}.$$
Proof. Let
$$M^*(H', H) = \#\{a, b \in \mathbb{N} \mid a \le H',\ |\mu(a)| = 1,\ b \le H,\ C_{a,b}(\mathbb{Q}) \ne \emptyset\}.$$
Clearly, we have
$$M^*(H', H) \le \sum_{a \le H'} |\mu(a)| M_a(H),$$
where
$$M_a(H) = \#\{b \le H \mid (a,b)_p = 1\ \forall p > 2\}.$$
If we define $℘ = \{p \in P \mid p > 2\}$, $A = \{b \le H\}$ and $\Omega_p = \{v \pmod p \mid p \nmid v,\ (a,v)_p = -1\}$, then
$$M_a(H) \le S(A, ℘, z) \qquad \forall z > 0.$$
Let's now fix a square-free $a \le H'$ and assume $H' \le H$. Since $a$ is square-free, for each $p$ we can write $a = p^\alpha u$ with $p \nmid u$ and $\alpha \in \{0,1\}$. Thus, by Proposition 3, item 4, for $p > 2$ we have
$$\Omega_p = \Big\{ 1 \le v \le p-1 \;\Big|\; \Big( \frac{v}{p} \Big)^{\alpha} = -1 \Big\},$$
and so $\omega(p) = \frac{p-1}{2}$ if $\alpha = 1$, and $\omega(p) = 0$ otherwise. Applying Theorem C, we therefore obtain
$$M_a(H) \ll \frac{z^2 + 4\pi H}{L_a(z)},$$
where
$$L_a(z) = \sum_{\substack{d \le z \\ p|d \Rightarrow p|a}} |\mu(d)| \prod_{p|d} \frac{p-1}{p+1} = \sum_{d \le z,\ d|a} g(d)$$
and $g(d) = \prod_{p|d} \frac{p-1}{p+1}$.
Now, let $\varepsilon > 0$ and note that $\frac{p-1}{p+1} \ge \frac{1}{1+\varepsilon}$ iff $p \ge \frac{2+\varepsilon}{\varepsilon} \gg_\varepsilon 1$. If we take $z = \sqrt{a}$ and define $\nu(d) := \sum_{p|d} 1$, we have
$$L_a(z) = \sum_{d \le z,\ d|a} \prod_{p|d} \frac{p-1}{p+1} \gg_\varepsilon \sum_{d \le z,\ d|a} \Big( \frac{1}{1+\varepsilon} \Big)^{\nu(d)} \ge \Big( \frac{1}{1+\varepsilon} \Big)^{\nu(a)} \sum_{d \le \sqrt{a},\ d|a} 1 \ge \frac{1}{2} \Big( \frac{2}{1+\varepsilon} \Big)^{\nu(a)},$$
since at least half of the $2^{\nu(a)}$ divisors of $a$ are $\le \sqrt{a}$. Moreover, we have $z = \sqrt{a} \le \sqrt{H'} \le \sqrt{H}$, thus
$$M^*(H', H) \le \sum_{a \le H'} |\mu(a)| M_a(H) \ll H \sum_{a \le H'} |\mu(a)| \Big( \frac{2}{1+\varepsilon} \Big)^{-\nu(a)} \ll H \sum_{a \le H'} \Big( \frac{1+\varepsilon}{2} \Big)^{\nu(a)}.$$
Hardy and Ramanujan proved that $\sum_{a \le H'} \beta^{\nu(a)} \ll H'(\log H')^{\beta - 1}$, and so we obtain
$$M^*(H', H) \ll \frac{H H'}{(\log H')^{\frac{1-\varepsilon}{2}}}.$$
Finally, note that $C_{uv^2,b}(\mathbb{Q}) \ne \emptyset$ implies $C_{u,b}(\mathbb{Q}) \ne \emptyset$, so, writing $a = uv^2$ with $u$ square-free, we get
$$M(H) \le \sum_{v \le \sqrt{H}} M^*\Big( \frac{H}{v^2}, H \Big) \ll \frac{H^2}{(\log H)^{\frac{1}{2} - \varepsilon}}.$$
Remark C.2.1. The result proved in the previous corollary can be improved. In fact, Hooley and Serre proved that
$$\frac{H^2}{\log H} \ll M(H) \ll \frac{H^2}{\log H}.$$
Chapter 4
Selberg sieve
The sieve of Eratosthenes investigates the function $S(A, ℘, z) = \sum_{n \in A,\ (n,\Pi)=1} 1$, where $\Pi = \Pi(℘,z) = \prod_{p \in ℘,\ p < z} p$, via the equality
$$S(A, ℘, z) = \sum_{n \in A} \sum_{\substack{d|n \\ d|\Pi}} \mu(d) = \sum_{d|\Pi} \mu(d) \#A_d.$$
The "basic sieve problem" is to find arithmetic functions $\mu^\pm(d) : \mathbb{N} \to \mathbb{R}$ such that
$$\sum_{\substack{d|n \\ d|\Pi}} \mu^-(d) \le \begin{cases} 1 & \text{if } (n,\Pi) = 1, \\ 0 & \text{if } (n,\Pi) > 1; \end{cases} \tag{4.1}$$
$$\sum_{\substack{d|n \\ d|\Pi}} \mu^+(d) \ge \begin{cases} 1 & \text{if } (n,\Pi) = 1, \\ 0 & \text{if } (n,\Pi) > 1, \end{cases} \tag{4.2}$$
so that
$$\sum_{d|\Pi} \mu^-(d) \#A_d = \sum_{n \in A} \sum_{\substack{d|n \\ d|\Pi}} \mu^-(d) \le S(A, ℘, z) \le \sum_{n \in A} \sum_{\substack{d|n \\ d|\Pi}} \mu^+(d) = \sum_{d|\Pi} \mu^+(d) \#A_d.$$
Writing $\#A_d = \frac{\omega(d)X}{d} + R_d$ with $\omega(d)$ completely multiplicative, this gives
$$S(A, ℘, z) \le X \sum_{d|\Pi} \frac{\mu^+(d)\omega(d)}{d} + \sum_{d|\Pi} |\mu^+(d) R_d|. \tag{4.3}$$
The Selberg sieve arose out of an effort to minimize (4.3) subject to (4.2). The key idea is to replace $\mu^+(d)$ by a quadratic form, optimally chosen.
We'll need the following lemmas.
Lemma 6. Let $\zeta > 0$ and $\{\lambda_d\}_{d \in \mathbb{N}} \subset \mathbb{R}$. Then
$$\sum_{\substack{\ell|\Pi,\ d|\ell \\ \ell < \zeta}} \mu(\ell) y_\ell = \mu(d) \frac{\omega(d)\lambda_d}{d} \tag{4.4}$$
holds for all $d|\Pi$ with $d < \zeta$ if and only if
$$y_\ell = \sum_{\substack{\delta|\Pi,\ \ell|\delta \\ \delta < \zeta}} \frac{\omega(\delta)\lambda_\delta}{\delta}$$
for all $\ell < \zeta$, $\ell|\Pi$.
Proof. If $y_\ell = \sum_{\delta|\Pi,\ \ell|\delta,\ \delta<\zeta} \frac{\omega(\delta)\lambda_\delta}{\delta}$ for all $\ell < \zeta$, $\ell|\Pi$, we have that
$$\sum_{\substack{\ell|\Pi,\ d|\ell \\ \ell<\zeta}} \mu(\ell) y_\ell = \sum_{\substack{\ell|\Pi,\ d|\ell \\ \ell<\zeta}} \mu(\ell) \sum_{\substack{\delta|\Pi,\ \ell|\delta \\ \delta<\zeta}} \frac{\omega(\delta)\lambda_\delta}{\delta} = \sum_{\substack{\delta|\Pi,\ d|\delta \\ \delta<\zeta}} \frac{\omega(\delta)\lambda_\delta}{\delta} \sum_{d|\ell,\ \ell|\delta} \mu(\ell) = \sum_{\substack{\delta|\Pi,\ d|\delta \\ \delta<\zeta}} \frac{\omega(\delta)\lambda_\delta}{\delta}\, \mu(d) \sum_{m | \frac{\delta}{d}} \mu(m) = \mu(d) \frac{\omega(d)\lambda_d}{d},$$
since the innermost sum vanishes unless $\delta = d$. Vice versa, if (4.4) held for another collection $\{y'_\ell\}_{\ell<\zeta}$ with $\{y'_\ell\}_{\ell<\zeta} \ne \{y_\ell\}_{\ell<\zeta}$, then there would exist a maximal $\tilde\ell < \zeta$, $\tilde\ell|\Pi$, such that $y_{\tilde\ell} \ne y'_{\tilde\ell}$, and this is a contradiction since
$$0 = \sum_{\substack{\ell|\Pi,\ \tilde\ell|\ell \\ \ell<\zeta}} \mu(\ell)(y_\ell - y'_\ell) = \mu(\tilde\ell)(y_{\tilde\ell} - y'_{\tilde\ell}) \ne 0. \tag{4.5}$$
Lemma 7. Let $d|\Pi$ and $z, \zeta > 0$. For all $a|\Pi$, let
$$G_a(\zeta, z) = \sum_{\substack{am|\Pi(℘,z) \\ m<\zeta}} g(m),$$
with $g(m)$ the multiplicative arithmetic function defined by $g(m) = \frac{\omega(m)}{m} \prod_{p|m} \big( 1 - \frac{\omega(p)}{p} \big)^{-1}$. Then, if $0 \le \omega(p) < p$ for all $p \in ℘$, we have
$$G_1(\zeta, z) \ge G_d\Big( \frac{\zeta}{d}, z \Big) \prod_{p|d} \Big( 1 - \frac{\omega(p)}{p} \Big)^{-1}.$$
Proof. We have that
$$G_1(\zeta, z) = \sum_{\substack{m|\Pi(℘,z) \\ m<\zeta}} g(m) = \sum_{\ell|d} \sum_{\substack{m|\Pi,\ (m,d)=\ell \\ m<\zeta}} g(m) = \sum_{\ell|d} \sum_{\substack{\ell m'|\Pi,\ (m', \frac{d}{\ell})=1 \\ \ell m'<\zeta}} g(\ell m') = \sum_{\ell|d} g(\ell) \sum_{\substack{\ell m'|\Pi,\ (m', \frac{d}{\ell})=1 \\ m'<\frac{\zeta}{\ell}}} g(m')$$
$$\ge \sum_{\ell|d} g(\ell) \sum_{\substack{dm'|\Pi \\ m'<\frac{\zeta}{d}}} g(m') = G_d\Big( \frac{\zeta}{d}, z \Big) \sum_{\ell|d} g(\ell),$$
since $g(m') \ge 0$. To conclude the proof it's enough to observe that
$$\sum_{\ell|d} g(\ell) = \prod_{p|d} (1 + g(p)) = \prod_{p|d} \Big( 1 + \frac{\omega(p)}{p}\Big( 1 - \frac{\omega(p)}{p} \Big)^{-1} \Big) = \prod_{p|d} \Big( 1 - \frac{\omega(p)}{p} \Big)^{-1}.$$
We are now ready to prove the following.
Theorem D (Fundamental theorem of the Selberg sieve). Let $z > 0$, $y > 1$ and let $\omega(d)$ be a completely multiplicative arithmetic function such that
$$0 \le \omega(p) < p \qquad \forall p \in ℘ \tag{$\Omega_1$}$$
and $\#A_d = \frac{\omega(d)X}{d} + R_d$. Then
$$S(A, ℘, z) \le \frac{X}{G(\sqrt{y}, z)} + \sum_{\substack{d|\Pi(℘,z) \\ d<y}} 3^{\nu(d)} |R_d|,$$
where
$$\nu(d) = \sum_{p|d} 1, \qquad G(\sqrt{y}, z) = \sum_{\substack{\ell|\Pi(℘,z) \\ \ell<\sqrt{y}}} g(\ell), \qquad g(\ell) = \frac{\omega(\ell)}{\ell} \prod_{p|\ell} \Big( 1 - \frac{\omega(p)}{p} \Big)^{-1}.$$
Proof. Let $\{\lambda_d\}_{d \in \mathbb{N}} \subset \mathbb{R}$ with $\lambda_1 = 1$ and define
$$\mu^+(d) = \sum_{\substack{d_1, d_2 \\ d = [d_1,d_2]}} \lambda_{d_1}\lambda_{d_2},$$
where $[a,b] = \frac{ab}{(a,b)}$ is the least common multiple of $a$ and $b$. This choice of $\mu^+(d)$ satisfies inequality (4.2); indeed
$$\sum_{\substack{d|n \\ d|\Pi}} \mu^+(d) = \sum_{[d_1,d_2] | (n,\Pi)} \lambda_{d_1}\lambda_{d_2} = \sum_{d_1, d_2 | (n,\Pi)} \lambda_{d_1}\lambda_{d_2} = \Big( \sum_{d | (n,\Pi)} \lambda_d \Big)^2 \ge 0,$$
and if $(n,\Pi) = 1$ then $\sum_{d|n,\ d|\Pi} \mu^+(d) = \mu^+(1) = \lambda_1^2 = 1$. Thus (4.3) holds, that is
$$S(A, ℘, z) \le X \sum_{d|\Pi} \frac{\mu^+(d)\omega(d)}{d} + \sum_{d|\Pi} |\mu^+(d)R_d| = XM + E,$$
say. Now, assume that $\lambda_d = 0$ for $d \ge \sqrt{y}$. As a consequence we have that $\mu^+(d) = 0$ for $d \ge y$. Thus
$$M = \sum_{d|\Pi} \frac{\mu^+(d)\omega(d)}{d} = \sum_{[d_1,d_2]|\Pi} \lambda_{d_1}\lambda_{d_2} \frac{\omega([d_1,d_2])}{[d_1,d_2]} = \sum_{\substack{d_1,d_2|\Pi,\ d_1,d_2<\sqrt{y} \\ \omega(d_1d_2)\ne 0}} \frac{\omega(d_1)\lambda_{d_1}}{d_1} \frac{\omega(d_2)\lambda_{d_2}}{d_2} \frac{(d_1,d_2)}{\omega((d_1,d_2))}.$$
By condition $(\Omega_1)$, we can define $g(k) = \frac{\omega(k)}{k} \prod_{p|k} \big( 1 - \frac{\omega(p)}{p} \big)^{-1} \ge 0$ and, if $\omega(k) \ne 0$ and $\mu(k) \ne 0$, we have
$$\frac{1}{g(k)} = \frac{k}{\omega(k)} \prod_{p|k} \Big( 1 - \frac{\omega(p)}{p} \Big) = \frac{k}{\omega(k)} \sum_{\ell|k} \frac{\mu(\ell)\omega(\ell)}{\ell} = \sum_{\ell|k} \mu(\ell) \frac{k/\ell}{\omega(k/\ell)} = \sum_{\ell'|k} \mu\Big( \frac{k}{\ell'} \Big) \frac{\ell'}{\omega(\ell')} = \mu(k) \sum_{\ell'|k} \mu(\ell') \frac{\ell'}{\omega(\ell')}.$$
Therefore, by the Möbius inversion formula, if $\omega(d) \ne 0$ and $\mu(d) \ne 0$, we have $\frac{d}{\omega(d)} = \sum_{k|d} \frac{1}{g(k)}$. Thus
$$M = \sum_{\substack{d_1,d_2|\Pi,\ d_1,d_2<\sqrt{y} \\ \omega(d_1d_2)\ne 0}} \frac{\omega(d_1)\lambda_{d_1}}{d_1} \frac{\omega(d_2)\lambda_{d_2}}{d_2} \sum_{k|(d_1,d_2)} \frac{1}{g(k)} = \sum_{\substack{\ell|\Pi,\ \ell<\sqrt{y} \\ \omega(\ell)\ne 0}} \frac{1}{g(\ell)} \Bigg( \sum_{\substack{d|\Pi,\ \ell|d \\ d<\sqrt{y}}} \frac{\omega(d)\lambda_d}{d} \Bigg)^2 = \sum_{\substack{\ell|\Pi,\ \ell<\sqrt{y} \\ \omega(\ell)\ne 0}} \frac{y_\ell^2}{g(\ell)}, \tag{4.6}$$
say, where $y_\ell := \sum_{d|\Pi,\ \ell|d,\ d<\sqrt{y}} \frac{\omega(d)\lambda_d}{d}$. Applying Lemma 6 with $d = 1$ and $\zeta = \sqrt{y}$, we get
$$1 = \sum_{\substack{\ell|\Pi \\ \ell<\sqrt{y}}} \mu(\ell) y_\ell = \sum_{\substack{\ell|\Pi,\ \ell<\sqrt{y} \\ \omega(\ell)\ne 0}} \mu(\ell) y_\ell = \sum_{\substack{\ell|\Pi,\ \ell<\sqrt{y} \\ \omega(\ell)\ne 0}} \mu(\ell)\sqrt{g(\ell)}\, \frac{y_\ell}{\sqrt{g(\ell)}}.$$
So, by Cauchy's inequality, we obtain
$$1 \le \Bigg( \sum_{\substack{\ell|\Pi \\ \ell<\sqrt{y}}} \mu(\ell)^2 g(\ell) \Bigg) \Bigg( \sum_{\substack{\ell|\Pi,\ \ell<\sqrt{y} \\ \omega(\ell)\ne 0}} \frac{y_\ell^2}{g(\ell)} \Bigg) = G(\sqrt{y}, z)\, M,$$
since $\Pi$ is square-free, and by (4.6). Therefore we have $M \ge \frac{1}{G(\sqrt{y},z)}$, and equality holds if and only if equality holds in Cauchy's inequality, or, equivalently, if there exists a constant $c$ such that
$$\frac{y_\ell}{\sqrt{g(\ell)}} = c\, \mu(\ell)\sqrt{g(\ell)} \qquad \forall \ell|\Pi \text{ s.t. } \ell < \sqrt{y},\ \omega(\ell) \ne 0.$$
So, to obtain the best estimate, we have to choose $y_\ell = c\,\mu(\ell)g(\ell)$, and if that holds, applying again Lemma 6 with $d = 1$ and $\zeta = \sqrt{y}$, we get
$$1 = \sum_{\substack{\ell|\Pi \\ \ell<\sqrt{y}}} \mu(\ell) y_\ell = c \sum_{\substack{\ell|\Pi \\ \ell<\sqrt{y}}} \mu(\ell)^2 g(\ell) = c\, G(\sqrt{y}, z).$$
Thus, to obtain the optimal estimate, we have to check that there exist some $\lambda_d$ such that $y_\ell = \frac{\mu(\ell)g(\ell)}{G(\sqrt{y},z)}$ for all $\ell < \sqrt{y}$, $\ell|\Pi$. So, applying Lemma 6 with $\zeta = \sqrt{y}$, we find that the sought $\lambda_d$ exist and have to be
$$\lambda_d = \frac{\mu(d)\, d}{\omega(d)} \sum_{\substack{\ell|\Pi,\ d|\ell \\ \ell<\sqrt{y}}} \mu(\ell) y_\ell = \frac{\mu(d)\, d}{\omega(d)\, G(\sqrt{y},z)} \sum_{\substack{\ell|\Pi,\ d|\ell \\ \ell<\sqrt{y}}} \mu(\ell)^2 g(\ell) = \frac{\mu(d)}{G(\sqrt{y},z)} \frac{d\, g(d)}{\omega(d)} \sum_{\substack{dj|\Pi \\ j<\frac{\sqrt{y}}{d}}} g(j) = \frac{\mu(d)}{G(\sqrt{y},z)} \prod_{p|d} \Big( 1 - \frac{\omega(p)}{p} \Big)^{-1} G_d\Big( \frac{\sqrt{y}}{d}, z \Big), \tag{4.7}$$
using the notation of Lemma 7. With this choice of $\lambda_d$ we have $M = \frac{1}{G(\sqrt{y},z)}$, and (4.3) becomes
$$S(A, ℘, z) \le \frac{X}{G(\sqrt{y},z)} + \sum_{\substack{d|\Pi \\ d<y}} |\mu^+(d)R_d|.$$
Therefore, to conclude it's enough to observe that by (4.7) and Lemma 7 we have $|\lambda_d| \le 1$ (since $G_1(\zeta,z) = G(\zeta,z)$), and so
$$|\mu^+(d)| = \Big| \sum_{d=[d_1,d_2]} \lambda_{d_1}\lambda_{d_2} \Big| \le \sum_{d=[d_1,d_2]} 1 = \sum_{a=0}^{\nu(d)} \binom{\nu(d)}{a} 2^a = 3^{\nu(d)},$$
for all square-free $d$.
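The optimal weights (4.7) can be computed explicitly in the simplest case $\omega \equiv 1$, where $g(\ell) = 1/\varphi(\ell)$ on square-free $\ell$. The sketch below (the prime list, $y$ and all names are toy choices of ours) checks $\lambda_1 = 1$ and the bound $|\lambda_d| \le 1$ from Lemma 7:

```python
from math import prod

PRIMES = [2, 3, 5, 7, 11, 13]      # the primes below z = 14 (a toy choice)
PI = prod(PRIMES)                  # Pi(P, z)
Y = 100.0                          # sieve level y; lambda_d is needed for d < sqrt(y)
ROOT_Y = Y ** 0.5

def phi(n: int) -> int:
    # Euler's totient, valid for the square-free divisors of PI used here
    return prod(p - 1 for p in PRIMES if n % p == 0) if n > 1 else 1

DIVS = [1]
for q in PRIMES:                   # enumerate all divisors of PI
    DIVS += [d * q for d in DIVS]
DIVS.sort()

def G(zeta: float, d: int = 1) -> float:
    """G_d(zeta, z): sum of g(m) over m < zeta with d*m | Pi; here omega = 1, so g(m) = 1/phi(m)."""
    return sum(1 / phi(m) for m in DIVS if m < zeta and PI % (d * m) == 0)

def selberg_lambda(d: int) -> float:
    """The optimal weights (4.7) specialised to omega = 1."""
    mu = (-1) ** sum(1 for p in PRIMES if d % p == 0)
    factor = prod(1 / (1 - 1 / p) for p in PRIMES if d % p == 0)
    return mu * factor * G(ROOT_Y / d, d) / G(ROOT_Y)
```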
Theorem D can be used to obtain an upper bound for the function
$$\varphi(x, z) = \#\{n \le x \mid p|n \Rightarrow p \ge z\}.$$
To prove it we'll need the following lemmas.
Lemma 8. Let
$$H_k(z) = \sum_{\substack{(\ell,k)=1 \\ \ell<z}} \frac{\mu(\ell)^2}{\varphi(\ell)},$$
where $\varphi(\ell)$ is Euler's totient function. Then
$$H_k(z) \ge \frac{\varphi(k)}{k} \log z.$$
Proof. Firstly we prove the statement for $k = 1$. We have that
$$H_1(z) = \sum_{\ell<z} \frac{\mu(\ell)^2}{\varphi(\ell)} = \sum_{\substack{\ell = p_1\cdots p_h < z \\ p_1<\cdots<p_h}} \prod_{i=1}^{h} (p_i - 1)^{-1} = \sum_{\substack{p_1\cdots p_h < z \\ p_1<\cdots<p_h \\ \alpha_i \ge 1}} \frac{1}{p_1^{\alpha_1}\cdots p_h^{\alpha_h}} = \sum_{\kappa(n)<z} \frac{1}{n},$$
where $\kappa(n) = \prod_{p|n} p$ is the square-free kernel of $n$. Thus,
$$H_1(z) = \sum_{\kappa(n)<z} \frac{1}{n} \ge \sum_{n<z} \frac{1}{n} \ge \log z.$$
On the other hand, we have
$$H_1(z) = \sum_{n<z} \frac{\mu(n)^2}{\varphi(n)} = \sum_{\ell|k} \sum_{\substack{n<z \\ \ell=(n,k)}} \frac{\mu(n)^2}{\varphi(n)} \le \sum_{\ell|k} \frac{\mu(\ell)^2}{\varphi(\ell)} \sum_{\substack{n'<z/\ell \\ (n',k)=1}} \frac{\mu(n')^2}{\varphi(n')} = \sum_{\ell|k} \frac{\mu(\ell)^2}{\varphi(\ell)} H_k\Big( \frac{z}{\ell} \Big)$$
$$\le \sum_{\ell|k} \frac{\mu(\ell)^2}{\varphi(\ell)}\, H_k(z) = \prod_{p|k} \Big( 1 + \frac{1}{p-1} \Big) H_k(z) = \frac{k}{\varphi(k)} H_k(z),$$
and so
$$H_k(z) \ge \frac{\varphi(k)}{k} \log z.$$
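Lemma 8 is easy to test numerically. A sketch (our names) computing $H_k(z)$ by brute force:

```python
import math

def mobius(n: int) -> int:
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    return -result if n > 1 else result

def phi(n: int) -> int:
    """Euler's totient by trial factorisation."""
    result, p, m = n, 2, n
    while p * p <= m:
        if m % p == 0:
            while m % p == 0:
                m //= p
            result -= result // p
        p += 1
    if m > 1:
        result -= result // m
    return result

def H(z: int, k: int = 1) -> float:
    """H_k(z): sum of mu(l)^2 / phi(l) over l < z with gcd(l, k) = 1."""
    return sum(mobius(l) ** 2 / phi(l) for l in range(1, z) if math.gcd(l, k) == 1)
```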
Lemma 9. For all $h \in \mathbb{N}$ we have
$$S_1 = \sum_{d \le x} \mu(d)^2 h^{\nu(d)} \le x (1 + \log x)^h, \qquad S_2 = \sum_{d \le x} \mu(d)^2 \frac{h^{\nu(d)}}{d} \le (1 + \log x)^h,$$
where $\nu(d) = \sum_{p|d} 1$.
Proof. We have that
$$S_1 \le \sum_{d \le x} \frac{x}{d} \mu(d)^2 h^{\nu(d)} = x S_2.$$
Moreover, since $h^{\nu(d)}$ counts the ordered factorizations of a square-free $d$ into $h$ factors,
$$S_2 = \sum_{d \le x} \frac{\mu(d)^2}{d} \sum_{\substack{d_1, \dots, d_h \\ d = d_1 \cdots d_h}} 1 \le \sum_{\substack{d_1, \dots, d_h \\ d_i \le x}} \frac{\mu(d_1)^2}{d_1} \cdots \frac{\mu(d_h)^2}{d_h} \le \Big( \sum_{d \le x} \frac{\mu(d)^2}{d} \Big)^h \le (1 + \log x)^h.$$
Remark 9.1. Using Perron's formula, one can prove that $S_1 \ll x (\log x)^{h-1}$; however, this improvement has no effect on our final result about $\varphi(x,z)$: it just forces us to use an asymptotic inequality instead of a simple inequality.
Now we are ready to prove the following.
Corollary D.1. We have
i) $\varphi(x, z) \le \frac{x}{\log z} + z^2 (1 + 2\log z)^3$;
ii) $\pi(x) \ll \frac{x}{\log x}$.
Proof. If we define $A = \{n \in \mathbb{N} \mid n \le x\}$, we have that $\varphi(x, z) = S(A, P, z)$. Moreover, we have
$$\#A_d = \frac{x\,\omega(d)}{d} + R_d,$$
with $\omega(d) = 1$ for all $d$ and $|R_d| < 1$. Applying Theorem D with $y = z^2$, we have
$$\varphi(x, z) \le \frac{x}{G(z, z)} + \sum_{\substack{d|\Pi(P,z) \\ d<z^2}} 3^{\nu(d)} |R_d|,$$
where
$$G(z, z) = \sum_{\substack{\ell|\Pi \\ \ell<z}} \frac{\omega(\ell)}{\ell} \prod_{p|\ell} \Big( 1 - \frac{\omega(p)}{p} \Big)^{-1} = \sum_{\ell<z} \frac{\mu(\ell)^2}{\varphi(\ell)}.$$
Thus, applying Lemma 8 with $k = 1$, we find
$$\varphi(x, z) \le \frac{x}{\log z} + \sum_{d<z^2} 3^{\nu(d)} \mu(d)^2,$$
and so to obtain item i) it's enough to apply Lemma 9 (with $h = 3$ and $x$ replaced by $z^2$).
To deduce item ii), we have just to observe that by item i) we have
$$\pi(x) \le \varphi(x, z) + \pi(z) \le \frac{x}{\log z} + O\big( z^2 (\log z)^3 \big) + z$$
and choose $z = \frac{x^{\frac{1}{2}}}{(\log x)^2}$.
Remark D.1.1. In the previous corollary we obtained a better estimate than the one we could obtain from Corollary A.2. This is due to the fact that the main terms of Theorems A and D are basically the same, but the error term of the Selberg sieve is much better than the one of the sieve of Eratosthenes.
We can also use Theorem D to estimate $\pi(x; k, a) = \#\{\text{primes } p \le x \mid p \equiv a \pmod k\}$ for given coprime $a$ and $k$.
Corollary D.2. Let $℘ = \{\text{primes } p \le x \mid p \nmid k\}$, $A = \{n \le x \mid n \equiv a \pmod k\}$ and let
$$\omega(p) = \begin{cases} 1 & \text{if } p \nmid k, \\ 0 & \text{otherwise}. \end{cases}$$
Then
$$S(A, ℘, z) \le \frac{k}{\varphi(k)} \frac{x}{k \log z} + \sum_{\substack{d|\Pi(℘,z),\ (d,k)=1 \\ d<z^2}} 3^{\nu(d)} |R_d|.$$
Proof. Exercise.
Dirichlet's theorem on primes in arithmetic progressions assures that $\pi(x; k, a)$ goes to infinity as $x \to \infty$ if $(k,a) = 1$ (otherwise it's clearly 0 or 1). In fact, Dirichlet showed that (if $(k,a) = 1$) the primes $p \equiv a \pmod k$ have analytic density
$$\lim_{s \to 1^+} \frac{\sum_{p \equiv a \,(\mathrm{mod}\ k)} p^{-s}}{\log\frac{1}{s-1}} = \frac{1}{\varphi(k)},$$
which coincides with the arithmetic density
$$\lim_{x \to \infty} \frac{\pi(x; k, a)}{\pi(x)} = \frac{1}{\varphi(k)}$$
(but be aware that the two statements aren't equivalent). More precisely, we have
$$\pi(x; k, a) \sim \frac{1}{\varphi(k)} \frac{x}{\log x},$$
with an error term that's not uniform in $k$. Siegel and Walfisz proved the following result, uniform in $k$.
Theorem (Siegel–Walfisz). Let $(a, k) = 1$. For all $N > 0$ there exists $c = c(N) > 0$ such that for any $k \le (\log x)^N$ we have
$$\pi(x; k, a) = \frac{1}{\varphi(k)}\, \mathrm{li}\, x + O\big( x \exp\big( -c\sqrt{\log x} \big) \big),$$
uniformly in $k$, where $\mathrm{li}\, x := \int_2^x \frac{du}{\log u}$ is the logarithmic integral function. Moreover, if the generalized Riemann hypothesis holds, we have that, for any $k \le \frac{\sqrt{x}}{(\log x)^2}$,
$$\pi(x; k, a) = \frac{1}{\varphi(k)}\, \mathrm{li}\, x + O\big( \sqrt{x} \log(kx) \big),$$
uniformly in $k$.
As a consequence of Theorem D, we can prove the following corollary, which gives an estimate for $\pi(x; k, a)$ that is worse than the previous ones, but holds in a much bigger range of $k$.
Corollary D.3 (Brun–Titchmarsh). Let $(a, k) = 1$ and $k \le \frac{x}{4}$. Then
$$\pi(x; k, a) \le \frac{2x}{\varphi(k) \log\frac{x}{k}} + O\Bigg( \frac{x \log\log\frac{x}{k}}{\varphi(k) \log^2\frac{x}{k}} \Bigg),$$
uniformly in $k$.
Proof. Let $A = \{n \le x \mid n \equiv a \pmod k\}$ and $℘ = \{p \in P \mid p \nmid k\}$. Then
$$\pi(x; k, a) \le S(A, ℘, z) + \frac{z}{k} + 1.$$
Moreover $\#A_d = \frac{x}{k} \frac{\omega(d)}{d} + R_d$, where
$$\omega(p) = \begin{cases} 1 & \text{if } p \nmid k, \\ 0 & \text{if } p | k \end{cases}$$
and $|R_d| < 1$. Hence, by Corollary D.2 and Lemma 9, we have
$$S(A, ℘, z) \le \frac{k}{\varphi(k)} \frac{x}{k \log z} + \sum_{\substack{d|\Pi(℘,z),\ (d,k)=1 \\ d<z^2}} 3^{\nu(d)} \mu(d)^2 = \frac{x}{\varphi(k) \log z} + O\big( z^2 (\log z)^3 \big).$$
Taking $z = \big( \frac{x}{k} \big)^{\frac{1}{2}} \big( \log\frac{x}{k} \big)^{-\frac{5}{2}}$ we complete the proof.
Remark D.3.1. If we could replace 2 by $2 - \delta$ for some $\delta > 0$ in Corollary D.3, we would have as a consequence that Landau–Siegel zeros don't exist.
We now state the following theorem.
Theorem E (Bombieri–Vinogradov theorem, 1965). For all $A > 0$, there exist $C = C(A) > 0$ and $B = B(A) > 0$ such that
$$\sum_{k \le K} \max_{a \in (\mathbb{Z}/k\mathbb{Z})^*} \Big| \pi(x; k, a) - \frac{\mathrm{li}\, x}{\varphi(k)} \Big| \le C \frac{x}{(\log x)^A}$$
for $K = x^{\frac{1}{2}} (\log x)^{-B}$.
Proof. See Davenport, Multiplicative Number Theory (it's proved using the large sieve).
Combining Theorems D and E, we can study the "Titchmarsh divisor problem", that is, to compute the order of the function
$$S(x) = \sum_{p \le x} d(p + a),$$
for fixed $a \in \mathbb{N}$, where $d(n) := \sum_{d|n} 1$. In 1930 Titchmarsh was able to prove that $S(x) = O(x)$. The following corollary goes beyond that estimate, providing the asymptotic behaviour of $S(x)$.
Corollary E.1. For all $a \in \mathbb{N}$, there exists $c > 0$ such that
$$S(x) = cx + O\Big( \frac{x \log\log x}{\log x} \Big).$$
Proof. For all $n \in \mathbb{N}$ we have that
$$d(n) = 2 \sum_{\substack{d|n \\ d \le \sqrt{n}}} 1 - \delta(n),$$
where
$$\delta(n) = \begin{cases} 1 & \text{if } n \text{ is a square}, \\ 0 & \text{otherwise}. \end{cases}$$
Thus
$$S(x) = 2 \sum_{p \le x} \sum_{\substack{d|p+a \\ d \le \sqrt{p+a}}} 1 - \sum_{p \le x} \delta(p+a) = 2 \sum_{d \le \sqrt{x}} \pi(x; d, -a) + O(\sqrt{x}) = 2 \sum_{\substack{d \le \sqrt{x} \\ (a,d)=1}} \pi(x; d, -a) + O(\sqrt{x}),$$
since $\sum_{p \le x} \delta(p+a) \le \sum_{n \le x} \delta(n+a) = O(\sqrt{x})$.
Now, let $A > 0$ and let $B = B(A) > 0$ be as in Theorem E. Write
$$\sum_{\substack{d \le \sqrt{x} \\ (a,d)=1}} \pi(x; d, -a) = \sum_{\substack{d \le \sqrt{x}(\log x)^{-B} \\ (a,d)=1}} \pi(x; d, -a) + \sum_{\substack{\sqrt{x}(\log x)^{-B} < d \le \sqrt{x} \\ (a,d)=1}} \pi(x; d, -a) = S_1(x) + S_2(x),$$
say. Theorem E implies that
$$S_1(x) = \sum_{\substack{d \le \sqrt{x}(\log x)^{-B} \\ (a,d)=1}} \frac{\mathrm{li}\, x}{\varphi(d)} + O\Bigg( \sum_{d \le \sqrt{x}(\log x)^{-B}} \Big| \pi(x; d, -a) - \frac{\mathrm{li}\, x}{\varphi(d)} \Big| \Bigg) = \mathrm{li}\, x \sum_{\substack{d \le \sqrt{x}(\log x)^{-B} \\ (a,d)=1}} \frac{1}{\varphi(d)} + O\Big( \frac{x}{(\log x)^A} \Big).$$
Moreover we have that
$$\sum_{\substack{d < t \\ (a,d)=1}} \frac{1}{\varphi(d)} = c' \log t + O(1), \tag{4.8}$$
for some $c' > 0$. Hence $S_1(x) = cx + O\big( \frac{x \log\log x}{\log x} \big)$, with $c = c'/2$. Finally, Corollary D.3 implies
$$S_2(x) \ll \sum_{\substack{\sqrt{x}(\log x)^{-B} < d \le \sqrt{x} \\ (a,d)=1}} \frac{x}{\varphi(d) \log\frac{x}{d}} \ll \frac{x}{\log x} \sum_{\substack{\sqrt{x}(\log x)^{-B} < d \le \sqrt{x} \\ (a,d)=1}} \frac{1}{\varphi(d)} \ll \frac{x}{\log x} \log\log x,$$
by (4.8).
Chapter 5
Sieve limitations
The optimization problem for the upper bound sieve requires minimising the functional
$$\tilde{L}(\mu^+) := X \sum_{d|\Pi(℘,z)} \frac{\mu^+(d)\omega(d)}{d} + \sum_{d|\Pi} |\mu^+(d) R_d|,$$
subject to
$$\sum_{\substack{d|n \\ d|\Pi}} \mu^+(d) \ge \begin{cases} 1 & \text{if } (n,\Pi) = 1, \\ 0 & \text{if } (n,\Pi) > 1. \end{cases}$$
This is almost a problem of linear programming. To obtain a linear programming problem in standard form, we need to write $\mu^+ = \mu^+_1 - \mu^+_2$ with $\mu^+_i \ge 0$ and try to minimize the linear functional
$$L(\mu^+) := X \sum_{d|\Pi(℘,z)} \frac{\omega(d)}{d} \big( \mu^+_1(d) - \mu^+_2(d) \big) + \sum_{d|\Pi} \big( \mu^+_1(d) + \mu^+_2(d) \big) |R_d|,$$
subject to
$$\sum_{\substack{d|n \\ d|\Pi}} \big( \mu^+_1(d) - \mu^+_2(d) \big) \ge \begin{cases} 1 & \text{if } (n,\Pi) = 1, \\ 0 & \text{if } (n,\Pi) > 1 \end{cases}$$
and
$$\mu^+_1(d) \ge 0, \qquad \mu^+_2(d) \ge 0.$$
Now, define
$$c = \Big( (-1)^{k-1} \frac{\omega(d)}{d} X + |R_d| \Big)_{d|\Pi,\ k=1,2}, \qquad x = \big( \mu^+_k(d) \big)_{d|\Pi,\ k=1,2}, \qquad b = \big( \delta_{1,(n,\Pi)} \big)_{n \in A},$$
where $\delta_{i,j}$ is Kronecker's delta, and
$$A_{n;\, d,k} = \begin{cases} (-1)^{k-1} & \text{if } d|n, \\ 0 & \text{otherwise}, \end{cases}$$
with $n \in A$, $d|\Pi$ and $k = 1, 2$. Then, what we are trying to minimize is $c^T x$, under the conditions $Ax \ge b$ and $x \ge 0$. The dual problem is to maximize $y^T b$, subject to $y \ge 0$ and $y^T A \le c^T$.
Note that, if the conditions $Ax \ge b$, $x \ge 0$, $y^T A \le c^T$ and $y \ge 0$ hold, we have that
$$c^T x \ge y^T A x \ge y^T b. \tag{5.1}$$
Moreover, the strong duality theorem assures that there exist $x$ and $y$ such that equality holds in (5.1), and clearly those vectors are solutions for the linear programming problem and its dual. Thus, tackling the dual problem, we can obtain information about the best upper bound it's possible to obtain through sieve methods.
Now, in this case the dual problem is maximizing the function
$$J(y) = \sum_{\substack{n \in A \\ (n,\Pi)=1}} y_n,$$
under the conditions
$$X\frac{\omega(d)}{d} - |R_d| \le \sum_{\substack{m \\ dm \in A}} y_{dm} \le X\frac{\omega(d)}{d} + |R_d|, \qquad y_n \ge 0.$$
Note that, taking $y_n = 1$ for any $n$, we obtain $J(y) = S(A, ℘, z)$. Moreover, for any subset $\tilde{A} \subset A$ such that
$$\Big| \#\tilde{A}_d - X\frac{\omega(d)}{d} \Big| \le |R_d|, \tag{5.2}$$
taking $y_n = 1$ if $n \in \tilde{A}$ and $0$ otherwise, we find $J(y) = S(\tilde{A}, ℘, z)$. Thus, for any $\tilde{A} \subset A$ satisfying (5.2), we have
$$L(\mu^+) \ge S(\tilde{A}, ℘, z),$$
and it's easy to show that we can drop the condition $\tilde{A} \subset A$.
We now give an example where the upper bound given by the Selberg sieve is optimal. Let $\Omega(n)$ be the number of prime factors of $n$ counted with multiplicity and let $\lambda(n) = (-1)^{\Omega(n)}$ be the Liouville function. Set
$$A^\pm = \{n \in \mathbb{N} \mid n \le x,\ \lambda(n) = \mp 1\}.$$
Now,
$$S(A^+, P, z) = \#\{n \in A^+ \mid p|n \Rightarrow p \ge z\} = \#\{n \le x \mid \lambda(n) = -1,\ p|n \Rightarrow p \ge z\}.$$
Clearly, if $z > x^{\frac{1}{3}}$ we have that
$$S(A^+, P, z) = \pi(x) - \pi(z) = \frac{x}{\log x} + O\Big( \frac{x}{\log^2 x} \Big) + O(z).$$
We now want to find an upper bound for $S(A^+, P, z)$ using the Selberg sieve. We need the following lemma.
Lemma 10. Let $\Lambda(x) = \sum_{n \le x} \lambda(n)$. Then there exists $c > 0$ such that $\Lambda(x) \ll E_c(x)$, where $E_c(x) = x \exp(-c\sqrt{\log x})$.
Proof. Let's consider the Mertens function $M(x) = \sum_{n \le x} \mu(n)$. It's well known that $\sum_{n=1}^{\infty} \frac{\mu(n)}{n^s} = \frac{1}{\zeta(s)}$. Moreover, using Perron's formula, if $x$ isn't an integer we have that
$$M(x) = \frac{1}{2\pi i} \int_{k-i\infty}^{k+i\infty} \frac{1}{\zeta(s)} \frac{x^s}{s}\,ds,$$
for any $k > 1$. Using Cauchy's theorem and the zero-free region for $\zeta(s)$, one can prove that $M(x) = O(E_c(x))$. Moreover, note that if $\ell^2$ is the largest square dividing $n$, so that $n/\ell^2$ is square-free, then we have $\lambda(n) = \mu(n/\ell^2) = \sum_{d^2|n} \mu(n/d^2)$. Hence
$$\Lambda(x) = \sum_{n \le x} \sum_{d^2|n} \mu(n/d^2) = \sum_{d \le \sqrt{x}} \sum_{m \le \frac{x}{d^2}} \mu(m) = \sum_{d \le \sqrt{x}} M\Big( \frac{x}{d^2} \Big) \ll E_c(x).$$
Remark 10.1. Note that the Riemann hypothesis is true if and only if $M(x) = O(x^{\frac{1}{2}+\varepsilon})$ for any $\varepsilon > 0$, and if and only if $\Lambda(x) = O(x^{\frac{1}{2}+\varepsilon})$ for any $\varepsilon > 0$.
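The Liouville function and its summatory function $\Lambda(x)$ of Lemma 10 are easy to experiment with. A sketch (our names; brute force, so only suitable for small $x$):

```python
def liouville(n: int) -> int:
    """lambda(n) = (-1)^Omega(n), with Omega counting prime factors with multiplicity."""
    count, p = 0, 2
    while p * p <= n:
        while n % p == 0:
            n //= p
            count += 1
        p += 1
    if n > 1:
        count += 1
    return -1 if count % 2 else 1

def Lambda_sum(x: int) -> int:
    """The summatory function Lambda(x) of Lemma 10."""
    return sum(liouville(n) for n in range(1, x + 1))
```

The smallness of `Lambda_sum` relative to $x$ reflects the cancellation that Lemma 10 quantifies.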
Now, let's go back to our sets $A^{\pm}$. We have that
$$\#A^{\pm}_d=\#\{n\le x\mid\lambda(n)=\mp1,\ d\mid n\}=\#\Big\{m\le\frac{x}{d}\ \Big|\ \lambda(m)=\mp\lambda(d)\Big\}.$$
Observe that $1\mp\lambda(d)\lambda(m)$ is $2$ if $\lambda(m)=\mp\lambda(d)$ and it's $0$ otherwise. Thus
$$\#A^{\pm}_d=\frac12\sum_{m\le x/d}\big(1\mp\lambda(d)\lambda(m)\big)=\frac12\Big[\frac{x}{d}\Big]\mp\frac{\lambda(d)}{2}\Lambda\Big(\frac{x}{d}\Big)=\frac{x}{2d}+O\Big(E_c\Big(\frac{x}{d}\Big)\Big),$$
by lemma 10 (and if $d\le x$). Therefore, we have to take $X=\frac{x}{2}$ and $\omega(d)=1$ for all $d$.
Applying Theorem D to this problem we obtain the remainder term
$$\sum_{\substack{d\mid\Pi,\\ d<y}}3^{\nu(d)}|R_d|\ll\sum_{d<y}\mu(d)^2\,3^{\nu(d)}E_c\Big(\frac{x}{d}\Big)\ll\Bigg(\sum_{d<y}\mu(d)^2\frac{9^{\nu(d)}}{d}\Bigg)^{\frac12}\Bigg(\sum_{d<y}\mu(d)^2\,d\,E_c\Big(\frac{x}{d}\Big)^2\Bigg)^{\frac12},$$
where we used Cauchy's inequality and we are assuming $y<x$. Now, by lemma 9, we have
$$\sum_{d<y}\mu(d)^2\frac{9^{\nu(d)}}{d}\ll\log^9 y.$$
Moreover, we have
$$\sum_{d<y}\mu(d)^2\,d\,E_c\Big(\frac{x}{d}\Big)^2\ll x^2\sum_{d<y}\frac{1}{d}\exp\big(-2c\sqrt{\log(x/y)}\big)\ll x^2\log y\,\exp\big(-2c\sqrt{\log(x/y)}\big).$$
Thus
$$\sum_{\substack{d\mid\Pi,\\ d<y}}3^{\nu(d)}|R_d|\ll x\log^5 y\,\exp\big(-c\sqrt{\log(x/y)}\big)$$
and, taking $y\le E_1(x)$, we find
$$\sum_{\substack{d\mid\Pi,\\ d<y}}3^{\nu(d)}|R_d|\ll x\log^5 x\,\exp\big(-c\sqrt[4]{\log x}\big)\ll\frac{x}{\log^2 x}.$$
Therefore, taking $y\le E_1(x)$ and $z\ge y^{\frac12}$, theorem D gives us
$$S(A^+,P,z)\le\frac{X}{G(\sqrt y,z)}+O\Big(\frac{x}{\log^2 x}\Big),$$
where
$$G(\sqrt y,z)\ge G(\sqrt y,\sqrt y)=\sum_{\substack{\ell\mid\Pi,\\ \ell<\sqrt y}}\frac{\omega(\ell)}{\ell}\prod_{p\mid\ell}\Big(1-\frac{\omega(p)}{p}\Big)^{-1}=\sum_{\ell<\sqrt y}\frac{\mu(\ell)^2}{\varphi(\ell)}\ge\frac12\log y,$$
by lemma 8. Thus
$$S(A^+,P,z)\le\frac{x}{\log y}+O\Big(\frac{x}{\log^2 x}\Big).$$
Taking $y=E_1(x)$, so that $\log y=\log x-\sqrt{\log x}$, we find
$$S(A^+,P,z)\le\frac{x}{\log x}+O\bigg(\frac{x}{(\log x)^{\frac32}}\bigg)$$
and so, since we already knew that
$$S(A^+,P,z)=\frac{x}{\log x}+O\Big(\frac{x}{\log^2 x}\Big)+O(z)$$
for $z>x^{\frac13}$, we have that with Selberg's sieve we are able to prove an optimal upper bound for $x^{\frac12}\le z\le\frac{x}{\log^2 x}$. Therefore Selberg's coefficients $\mu^+$ are optimal solutions to the minimization problem for $L(\mu^+)$ and, correspondingly, $A^+$ is optimal for the dual problem.
We now turn to the lower bound sieve problem. It's clear that this problem too can be expressed as a linear programming problem, with the new condition
$$\sum_{d\mid n,\ d\mid\Pi}\mu^-(d)\le\begin{cases}1 & \text{if }(n,\Pi)=1,\\ 0 & \text{otherwise.}\end{cases}$$
Obviously, the choice $\mu^-(d)=0$ for all $d$ satisfies this condition, and the corresponding inequality is
$$S(A,\wp,z)\ge X\sum_{d\mid\Pi}\frac{\mu^-(d)\omega(d)}{d}-\sum_{d\mid\Pi}\mu^-(d)R_d=0.$$
Now, for $A=A^-$ and $z>x^{\frac12}$, we have that $S(A^-,P,z)=1$ (only $n=1$ survives: an unsieved $n>1$ would have an even, positive number of prime factors, each exceeding $x^{\frac12}$), so the coefficients $\mu^-(d)=0$ are essentially optimal for our linear programming problem and thus so is $A^-$ for the dual problem. In particular, since $A^-$ and $A^+$ have the same inputs $\omega(d)$, $X$ and $O(R_d)$, it is not possible for the sieve machinery to distinguish them. Therefore, for $x^{\frac12}<z<\frac{x}{\log^2 x}$, we can't prove that $S(A^+,P,z)\gg\frac{x}{\log x}$ through sieve methods, and thus neither that $\pi(x)\gg\frac{x}{\log x}$. This problem is due to the fact that integers $n$ with $2\mid\Omega(n)$ are seen by sieves as the same as integers $n$ with $2\nmid\Omega(n)$. This phenomenon is known as the "parity problem" and it's a big limitation of sieve methods. To tackle this kind of problem it is therefore necessary to insert some other machinery that doesn't come from sieve methods.
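The parity phenomenon can be seen numerically: the integers up to $x$ with $\Omega(n)$ even and those with $\Omega(n)$ odd are nearly equinumerous, their difference being exactly $\Lambda(x)$. An illustrative Python sketch:

```python
# Illustration of the parity phenomenon: up to x, the integers with an even
# number of prime factors and those with an odd number are nearly equal in
# count; their difference is exactly Lambda(x), small compared with x.

def omega_big(n):
    count, p = 0, 2
    while p * p <= n:
        while n % p == 0:
            n //= p
            count += 1
        p += 1
    if n > 1:
        count += 1
    return count

x = 20000
even = sum(1 for n in range(1, x + 1) if omega_big(n) % 2 == 0)
odd = x - even
# even - odd = Lambda(x); the two classes differ by far less than x
assert abs(even - odd) < x // 10
```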
Chapter 6
Small gaps between primes
As a consequence of the prime number theorem (stating that $\pi(x)=\#\{p\le x\}\sim\frac{x}{\log x}$), one can prove that
$$\sum_{n=1}^{N}\frac{p_{n+1}-p_n}{\log p_n}\sim N$$
and thus
$$\liminf_{n\to\infty}\frac{p_{n+1}-p_n}{\log p_n}\le1\le\limsup_{n\to\infty}\frac{p_{n+1}-p_n}{\log p_n}.$$
The twin primes conjecture, saying that there are infinitely many primes $p$ such that $p+2$ is also prime, leads one to think that the much weaker statement
$$\liminf_{n\to\infty}\frac{p_{n+1}-p_n}{\log p_n}=0$$
is true. Hardy and Littlewood were the first to obtain results in this direction. In 1926, they proved that
$$\liminf_{n\to\infty}\frac{p_{n+1}-p_n}{\log p_n}\le\frac35,$$
under the generalized Riemann hypothesis. Further progress has been made over the years, one of the latest results being Maier's proof (1986) that
$$\liminf_{n\to\infty}\frac{p_{n+1}-p_n}{\log p_n}<\frac14.$$
Finally, in 2005 Goldston, Pintz and Yildirim managed to prove that this liminf is $0$, together with other results towards the twin prime conjecture. The following are results they were able to obtain.
Theorem 1. We have
$$\liminf_{n\to\infty}\frac{p_{n+1}-p_n}{\log p_n}=0.$$

Theorem 2. We have
$$\liminf_{n\to\infty}\frac{p_{n+1}-p_n}{(\log p_n)^{\frac12}(\log\log p_n)^2}<\infty.$$
Moreover, denote by BV($\theta$) the following statement:
$$\sum_{q\le x^{\theta}}\max_{(a,q)=1}\max_{y\le x}\bigg|\sum_{\substack{n\le y,\\ n\equiv a\ (\mathrm{mod}\ q)}}\Lambda(n)-\frac{y}{\varphi(q)}\bigg|\ll\frac{x}{\log^A x}\quad\text{for any }A>0.\tag{BV($\theta$)}$$
Note that the Bombieri-Vinogradov theorem (Theorem E) states that BV($\theta$) holds for any $\theta<\frac12$, and that the Elliott-Halberstam conjecture implies that BV($\theta$) holds for any $\theta<1$.
The following are conditional results proved by Goldston, Pintz and Yildirim.

Theorem 3. If BV($\theta$) holds for some $\theta>\frac12$, then
$$\liminf_{n\to\infty}\,(p_{n+1}-p_n)<\infty.$$

Theorem 4. If BV($\theta$) holds for all $\theta<1$, then
$$\liminf_{n\to\infty}\,(p_{n+1}-p_n)\le20.$$

Theorem 5. If BV($\theta$) holds for all $\theta<1$, then
$$\liminf_{n\to\infty}\,(p_{n+1}-p_n)\le16.$$

We are now going to prove the first four theorems. The fifth can be obtained in a similar way with some refinements.
Let $H>0$ and $\mathcal{H}\subseteq[0,H]\cap\mathbb{Z}$. $\mathcal{H}$ is said to be admissible if for any prime $p$ there exists $n_p$ such that $p\nmid n_p+h$ for all $h\in\mathcal{H}$. For example, $\{0,2\}$ is admissible, but $\{0,2,4\}$ isn't, since the condition fails for $p=3$. Clearly, if a set $\mathcal{H}$ isn't admissible, there aren't infinitely many $n$ such that $n+h$ is prime for all $h\in\mathcal{H}$.

If we set $k=\#\mathcal{H}$, to verify that $\mathcal{H}$ is admissible it's enough to check the condition for all primes $p\le k$. Thus the set $\mathcal{H}=\{p_1,\dots,p_k\}$ with $k<p_1<\dots<p_k$ is admissible for any $k$, and so we can find admissible sets of any cardinality.
Now, let $N\in\mathbb{N}$ and let $\mathcal{H}$ be an admissible set. Define
$$S_0:=\sum_{N<n\le2N}\Bigg(\sum_{\substack{h\in\mathcal{H},\\ n+h\text{ prime}}}\log(n+h)-\log3N\Bigg).$$
Clearly, if we were able to prove that $S_0>0$ for infinitely many $N$, we would have that there exist infinitely many $n$ such that
$$\sum_{\substack{h\in\mathcal{H},\\ n+h\text{ prime}}}\log(n+h)>\log3N,$$
and so that for infinitely many $n$ there exist at least two $h$ such that $n+h$ is prime (each summand is at most $\log(2N+H)<\log3N$ for $N$ large, so a single prime $n+h$ cannot suffice). Unfortunately this is not the case, since $S_0\to-\infty$ as $N\to\infty$. Thus, we try to sum just over the $n$ that are more likely to give more than one $h$ such that $n+h$ is prime. To do that, we follow Selberg's idea, multiplying the summands by $\big(\sum_d\lambda_d\big)^2$ and letting this weight be essentially supported on almost-primes. Therefore, we consider the sum
$$S:=\sum_{N<n\le2N}\Bigg(\sum_{\substack{h\in\mathcal{H},\\ n+h\text{ prime}}}\log(n+h)-\log3N\Bigg)\Bigg(\sum_{d\mid\prod_{h\in\mathcal{H}}(n+h)}\lambda_d\Bigg)^2.$$
The optimal values of the $\lambda_d$ are still not known in this context, and so we try the Selberg sieve ones, which are essentially
$$\lambda_d\approx\mu(d)\bigg(\frac{\log^+(\xi/d)}{\log\xi}\bigg)^{\kappa},$$
where
$$\log^+x:=\begin{cases}\log x & \text{if }x\ge1,\\ 0 & \text{if }x\in(0,1),\end{cases}$$
and $\kappa$ is the dimension of the sieve. We take $\kappa\ge k=\#\mathcal{H}$ and we write $\kappa=k+\ell$, with $\ell\ge0$. As before, if we are able to prove that $S>0$ for infinitely many $N$, we'll obtain
that there exist at least two h such that n + h is prime for infinitely many n. Now, assume
that $k$, $\ell$ and $\mathcal{H}\subset[0,H]$ are fixed and that $N^{\frac1{10}}\le\xi\le N$. Since the $\lambda_d$ are supported on $[0,\xi]$, we can write $S$ in the form
$$S=\sum_{D\le\xi^2}\Lambda_D\sum_{\substack{N<n\le2N,\\ D\mid\prod_{h\in\mathcal{H}}(n+h)}}\Bigg(\sum_{\substack{h\in\mathcal{H},\\ n+h\text{ prime}}}\log(n+h)-\log3N\Bigg),$$
where $\Lambda_D:=\sum_{\substack{d_1,d_2\le\xi,\\ [d_1,d_2]=D}}\lambda_{d_1}\lambda_{d_2}$. Since $|\lambda_d|\le1$, we have that
$$|\Lambda_D|\le\#\{d_1,d_2\mid[d_1,d_2]=D\}\le d(D)^2.$$
Moreover, $\lambda_d=0$ unless $d$ is squarefree, and so the same holds for $\Lambda_D$.
Let's fix an $h_0\in\mathcal{H}$ and let $\mathcal{H}_0=\mathcal{H}\setminus\{h_0\}$. We have
$$S(h_0):=\sum_{D\le\xi^2}\Lambda_D\sum_{\substack{N<n\le2N,\\ n+h_0\text{ prime},\\ D\mid\prod_{h\in\mathcal{H}}(n+h)}}\log(n+h_0)=\sum_{D\le\xi^2}\Lambda_D\sum_{\substack{N+h_0<p\le2N+h_0,\\ D\mid\prod_{h\in\mathcal{H}_0}(p-h_0+h)}}\log p,$$
since $(D,p)=1$: indeed $p>N\ge\xi$, while every prime factor of $D=[d_1,d_2]$ is at most $\xi$. Now, let $a_1,\dots,a_{\nu_0(D)}$ be the classes in the set
$$\{x\ (\mathrm{mod}\ D)\mid\prod_{h\in\mathcal{H}_0}(x-h_0+h)\equiv0\ (\mathrm{mod}\ D)\}$$
that are coprime to $D$. If $D$ is prime we have $\nu_0(D)\le\#\mathcal{H}_0=k-1\le2^k=d(D)^k$, and $\nu_0(D)$ is clearly multiplicative, so $\nu_0(D)\le d(D)^k$ holds for all square-free $D$. Moreover, we have
$$S(h_0)=\sum_{D\le\xi^2}\Lambda_D\sum_{j=1}^{\nu_0(D)}\ \sum_{\substack{N+h_0<p\le2N+h_0,\\ p\equiv a_j\ (\mathrm{mod}\ D)}}\log p.$$
If we define
$$\Delta(y,q,a)=\sum_{\substack{p\le y,\\ p\equiv a\ (\mathrm{mod}\ q)}}\log p-\frac{y}{\varphi(q)},$$
we have that
$$S(h_0)=\sum_{D\le\xi^2}\Lambda_D\sum_{j=1}^{\nu_0(D)}\bigg(\frac{N}{\varphi(D)}+O\big(\Delta(2N+h_0,D,a_j)\big)+O\big(\Delta(N+h_0,D,a_j)\big)\bigg)$$
$$=N\sum_{D\le\xi^2}\frac{\Lambda_D\,\nu_0(D)}{\varphi(D)}+O\bigg(\sum_{D\le\xi^2}|\Lambda_D|\,\nu_0(D)\max_{(a,D)=1}\max_{y\le3N}|\Delta(y,D,a)|\bigg)$$
$$=N\sum_{D\le\xi^2}\frac{\Lambda_D\,\nu_0(D)}{\varphi(D)}+O\bigg(\sum_{D\le\xi^2}d(D)^{2+k}\max_{(a,D)=1}\max_{y\le3N}|\Delta(y,D,a)|\bigg).\tag{6.1}$$
We now need the following lemma.

Lemma 11. If BV($\theta$) holds, then for all $t\in\mathbb{N}$ and $A>0$ we have
$$\sum_{q\le x^{\theta}}d(q)^t\max_{(a,q)=1}\max_{y\le x}|\Delta(y,q,a)|\ll\frac{x}{(\log x)^A}.$$

Remark 11.1. The previous lemma is in some sense surprising, since $d(q)$ can be greater than any fixed power of $\log q$.
Now, assume that BV($\theta$) holds. Applying lemma 11 with $x=3N$ and $\xi=(3N)^{\frac\theta2}$, from (6.1) we have that
$$S(h_0)=N\sum_{D\le\xi^2}\frac{\Lambda_D\,\nu_0(D)}{\varphi(D)}+O\big(N(\log N)^{-A}\big).$$
For any prime $p$, it is easy to show that
$$\nu_0(p)=\#\{n\ (\mathrm{mod}\ p)\mid\exists h\in\mathcal{H},\ h\equiv n\ (\mathrm{mod}\ p)\}-1=\nu(p)-1,$$
say. Thus $\nu_0(p)$ is independent of $h_0$ and so
$$\sum_{h_0\in\mathcal{H}}S(h_0)=\sum_{D\le\xi^2}\Lambda_D\sum_{\substack{N<n\le2N,\\ D\mid\prod_{h\in\mathcal{H}}(n+h)}}\ \sum_{\substack{h_0\in\mathcal{H},\\ n+h_0\text{ prime}}}\log(n+h_0)=kN\sum_{D\le\xi^2}\frac{\Lambda_D\,\nu_0(D)}{\varphi(D)}+O\big(N(\log N)^{-A}\big).$$
With a similar but easier argument, one can prove that
$$\sum_{D\le\xi^2}\Lambda_D\sum_{\substack{N<n\le2N,\\ D\mid\prod_{h\in\mathcal{H}}(n+h)}}\log3N=N(\log3N)\sum_{D\le\xi^2}\frac{\Lambda_D\,\nu(D)}{\varphi(D)}+O\big(N(\log N)^{-A}\big),$$
with the main difference that this time we don't need to assume that BV($\theta$) holds, and we can take $\xi=(3N)^{\frac\theta2}$ for all $\theta<1$.
We need the following two lemmas.

Lemma 12. We have
$$\sum_{D\le\xi^2}\frac{\Lambda_D\,\nu_0(D)}{\varphi(D)}=S(\mathcal{H})\bigg(\frac{(k+\ell)!}{(\ell+1)!}\bigg)^2\frac{(2\ell+2)!}{(k+2\ell+1)!}(\log\xi)^{-k+1}+O\big((\log\xi)^{-k+\frac12}\big),$$
where
$$S(\mathcal{H})=\prod_p\frac{1-\frac{\nu(p)}{p}}{\big(1-\frac1p\big)^k}.$$
Remark 12.1. Note that $\nu(p)=\#\{n\ (\mathrm{mod}\ p)\mid\exists h\in\mathcal{H},\ n\equiv h\ (\mathrm{mod}\ p)\}=k$ if $p>H$. Thus
$$\prod_{p>H}\frac{1-\frac{\nu(p)}{p}}{\big(1-\frac1p\big)^k}=\prod_{p>H}\frac{1-\frac kp}{1-\frac kp+\cdots}=\prod_{p>H}\Big(1+O\big(\tfrac1{p^2}\big)\Big)<\infty.$$
Moreover, we have that $S(\mathcal{H})>0$ iff $\nu(p)<p$ for all $p$, iff $\mathcal{H}$ is admissible.
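The convergence and positivity just remarked can be seen by computing a truncated version of the Euler product for $S(\mathcal{H})$. The following Python sketch is illustrative only; the truncation point `P` is an arbitrary choice.

```python
# Truncated Euler product for the singular series S(H) of lemma 12.
# For p > H one has nu(p) = k, so each factor is 1 + O(1/p^2) and the
# product converges; S(H) > 0 exactly when H is admissible.

def primes_up_to(P):
    sieve = [True] * (P + 1)
    sieve[0:2] = [False, False]
    for p in range(2, int(P ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p:: p] = [False] * len(sieve[p * p:: p])
    return [p for p, flag in enumerate(sieve) if flag]

def singular_series(H, P=10 ** 5):
    k = len(H)
    S = 1.0
    for p in primes_up_to(P):
        nu_p = len({h % p for h in H})
        S *= (1 - nu_p / p) / (1 - 1 / p) ** k
    return S

# For the twin-prime set {0, 2} this converges to twice the twin-prime
# constant, about 1.3203; for a non-admissible set a factor vanishes.
assert abs(singular_series({0, 2}) - 1.32) < 0.01
assert singular_series({0, 2, 4}) == 0.0
```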
Lemma 13. We have
$$\sum_{D\le\xi^2}\frac{\Lambda_D\,\nu(D)}{\varphi(D)}=S(\mathcal{H})\bigg(\frac{(k+\ell)!}{\ell!}\bigg)^2\frac{(2\ell)!}{(k+2\ell)!}(\log\xi)^{-k}+O\big((\log\xi)^{-k-\frac12}\big).$$
Applying the previous lemmas we obtain
$$S=N\,S(\mathcal{H})\bigg(\frac{(k+\ell)!}{(\ell+1)!}\bigg)^2\frac{(2\ell)!}{(k+2\ell+1)!}(\log\xi)^{-k}\Big(k(2\ell+1)(2\ell+2)\log\xi-(\ell+1)^2(k+2\ell+1)\log3N\Big)+O\big(N(\log N)^{-k+\frac12}\big)$$
and, since we chose $\xi=(3N)^{\frac\theta2}$, we have
$$S=N\,S(\mathcal{H})\bigg(\frac{(k+\ell)!}{(\ell+1)!}\bigg)^2\frac{(2\ell)!\,(\ell+1)}{(k+2\ell+1)!}\Big(\frac\theta2\Big)^{-k}(\log3N)^{-k+1}\Big(\theta k(2\ell+1)-(\ell+1)(k+2\ell+1)\Big)+O\big(N(\log N)^{-k+\frac12}\big).\tag{6.2}$$
We are now ready to prove theorems 3 and 4.
Proofs of theorems 3 and 4. Suppose that BV($\theta$) holds for some $\theta>\frac12$. Then, taking $k=\ell^2$, from (6.2) we have that $S>0$ for $N$ large enough if
$$\theta>\frac{\ell+1}{2\ell+1}\cdot\frac{\ell^2+2\ell+1}{\ell^2},$$
and since the right-hand side tends to $\frac12$ as $\ell\to\infty$ while $\theta>\frac12$ is fixed, it is certainly possible to find an $\ell$ that satisfies this inequality. This completes the proof of theorem 3.

Now, suppose that BV($\theta$) holds for some $\frac{20}{21}<\theta<1$. The set $\mathcal{H}=\{11,13,17,19,23,29,31\}$ is admissible (with $k=7$) and, taking $\ell=1$, (6.2) implies that $S>0$ for $N$ large enough, so that infinitely many $n$ admit at least two primes among $n+h$, $h\in\mathcal{H}$; the corresponding gaps are at most $31-11=20$.
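The positivity condition behind both proofs, $\theta k(2\ell+1)>(\ell+1)(k+2\ell+1)$, is easy to explore numerically. A purely illustrative Python sketch:

```python
# Numeric check of the positivity condition from (6.2):
# S > 0 for large N when theta * k * (2l+1) > (l+1) * (k+2l+1).
# With k = l^2 the threshold tends to 1/2; for k = 7, l = 1 it is 20/21.

from fractions import Fraction

def theta_threshold(k, l):
    return Fraction(l + 1, 1) * (k + 2 * l + 1) / (k * (2 * l + 1))

# k = 7, l = 1: exactly 20/21, matching the hypothesis of theorem 4
assert theta_threshold(7, 1) == Fraction(20, 21)

# k = l^2: the threshold (l+1)^3 / (l^2 (2l+1)) decreases towards 1/2,
# so any theta > 1/2 works once l is large enough (theorem 3)
thresholds = [theta_threshold(l * l, l) for l in range(1, 201)]
assert all(t > Fraction(1, 2) for t in thresholds)
assert thresholds[-1] < Fraction(51, 100)
```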
To prove theorems 1 and 2 we need to modify slightly our arguments. Let
$$S':=\sum_{N<n\le2N}\Bigg(\sum_{\substack{1\le h\le H,\\ n+h\text{ prime}}}\log(n+h)-\log3N\Bigg)\Bigg(\sum_{d\mid\prod_{h\in\mathcal{H}}(n+h)}\lambda_d\Bigg)^2.$$
As before, if $S'>0$, we can find two primes $p,p'>N$ such that $|p-p'|<H$. Now, suppose that BV($\theta$) holds and take again $\xi=(3N)^{\frac\theta2}$. Proceeding in a similar way as for $S(h_0)$ with $h_0\in\mathcal{H}$, one can prove that the contribution to $S'$ of $h_0\notin\mathcal{H}$ is
$$N\sum_{D\le\xi^2}\frac{\Lambda_D\,\nu_0'(D)}{\varphi(D)}+O\big(N(\log N)^{-A}\big),$$
where
$$\nu_0'(D)=\#\{x\ (\mathrm{mod}\ D)\mid\prod_{h\in\mathcal{H}}(x-h_0+h)\equiv0\ (\mathrm{mod}\ D)\}.$$
We now need the analogue of lemmas 12 and 13 for $\nu_0'(D)$.

Lemma 14. We have
$$\sum_{D\le\xi^2}\frac{\Lambda_D\,\nu_0'(D)}{\varphi(D)}=S(\mathcal{H}\cup\{h_0\})\bigg(\frac{(k+\ell)!}{\ell!}\bigg)^2\frac{(2\ell)!}{(k+2\ell)!}(\log\xi)^{-k}+O\big((\log\xi)^{-k-\frac12}\big).$$
Therefore,
$$S'=N\,S(\mathcal{H})\bigg(\frac{(k+\ell)!}{(\ell+1)!}\bigg)^2\frac{(2\ell)!}{(k+2\ell+1)!}(\log\xi)^{-k}\Bigg(k(2\ell+1)(2\ell+2)\log\xi+(\ell+1)^2(k+2\ell+1)\bigg(\sum_{\substack{1\le h_0\le H,\\ h_0\notin\mathcal{H}}}\frac{S(\mathcal{H}\cup\{h_0\})}{S(\mathcal{H})}-\log3N\bigg)\Bigg)+O\big(N(\log N)^{-k+\frac12}\big).$$
Since $S(\mathcal{H}\cup\{h\})=S(\mathcal{H})$ if $h\in\mathcal{H}$, we have that
$$\sum_{\substack{1\le h_0\le H,\\ h_0\notin\mathcal{H}}}\frac{S(\mathcal{H}\cup\{h_0\})}{S(\mathcal{H})}=\sum_{h_0=1}^{H}\frac{S(\mathcal{H}\cup\{h_0\})}{S(\mathcal{H})}-k\tag{6.3}$$
and the behaviour of the first summand on the right is given by the following lemma.
Lemma 15. We have that
$$\sum_{h=1}^{L}\frac{S(\mathcal{H}\cup\{h\})}{S(\mathcal{H})}\sim L.$$
We are now ready to prove theorem 1.

Proof of theorem 1. Let's take $H=c\log3N$ with $c>0$ arbitrary. Since we chose $\xi=(3N)^{\frac\theta2}$, by lemma 15 and (6.3) we have that $S'>0$ for sufficiently large $N$ if
$$\theta k(2\ell+1)-(\ell+1)(k+2\ell+1)(1-c)>0.\tag{6.4}$$
The Bombieri-Vinogradov theorem (Theorem E) assures that BV($\theta$) holds for any $\theta<\frac12$, and so we can take $\theta=\frac12-\frac c4$. For $k=\ell^2$, (6.4) becomes
$$\frac{\ell^2(2\ell+1)}{(\ell+1)^3}>\frac{2-2c}{1-c/2},$$
and since the left-hand side tends to $2$ as $\ell\to\infty$ while the right-hand side is smaller than $2$ for every $c>0$, we have that $S'>0$ for $\ell$ large enough. Thus for all $N$ large enough there exist $n\in(N,2N]$ and $0\le h_1<h_2\le c\log3N$ such that $p_1=n+h_1$ and $p_2=n+h_2$ are primes. Therefore
$$\frac{p_2-p_1}{\log p_1}\le c\,\frac{\log3N}{\log N}\le2c,$$
and this proves theorem 1, since $c>0$ was arbitrary.
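The final step, choosing $\ell$ once $c$ is fixed, can be explored numerically: condition (6.4) with $\theta=\frac12-\frac c4$ and $k=\ell^2$ holds for all sufficiently large $\ell$, and the smaller $c$ is, the larger $\ell$ must be. An illustrative Python sketch (the values of $c$ are arbitrary):

```python
# Numeric sketch of the last step of theorem 1: with theta = 1/2 - c/4 and
# k = l^2, condition (6.4), theta*k*(2l+1) > (l+1)*(k+2l+1)*(1-c), holds
# once l is large enough, for every fixed c > 0.  Illustration only.

def condition_64(k, l, theta, c):
    return theta * k * (2 * l + 1) > (l + 1) * (k + 2 * l + 1) * (1 - c)

def smallest_l(c):
    theta = 0.5 - c / 4
    l = 1
    while not condition_64(l * l, l, theta, c):
        l += 1
    return l

# the smaller c is, the larger l must be taken
assert smallest_l(0.5) < smallest_l(0.1) < smallest_l(0.02)
```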
We now give a sketch of the proof of lemma 13. Lemmas 12 and 14 can be proven in a similar way.

Proof of lemma 13. We need to estimate
$$M=\sum_{D\le\xi^2}\frac{\Lambda_D\,\nu(D)}{\varphi(D)},\qquad\text{where }\Lambda_D=\sum_{[d_1,d_2]=D}\lambda_{d_1}\lambda_{d_2},\quad\lambda_d=\mu(d)\bigg(\frac{\log^+(\xi/d)}{\log\xi}\bigg)^{k+\ell},$$
$N^{\frac1{10}}\le\xi\le N$, $\nu(p)\in[0,k]$ and $\nu(p)=k$ if $p>H$. Firstly, observe that for all $\delta>0$
$$\frac{\big(\log^+(\xi/d)\big)^{k+\ell}}{(k+\ell)!}=\frac{1}{2\pi i}\int_{(\delta)}\Big(\frac{\xi}{d}\Big)^s\frac{ds}{s^{k+\ell+1}},$$
where $(\delta)$ indicates the line $(\delta-i\infty,\delta+i\infty)$.
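This line integral is evaluated by shifting the contour: for $\xi/d>1$ the only pole of $(\xi/d)^s/s^{k+\ell+1}$ is at $s=0$, with residue $(\log(\xi/d))^{k+\ell}/(k+\ell)!$. A quick numerical check of that residue, via the trapezoid rule on a small circle, is illustrative (the radius and point count are arbitrary choices):

```python
# Numerical check: the residue of x^s / s^(m+1) at s = 0 equals
# (log x)^m / m!, computed by integrating over the circle |s| = r.

import cmath, math

def residue_at_zero(x, m, r=0.5, n_points=4000):
    total = 0j
    for j in range(n_points):
        phi = 2 * math.pi * j / n_points
        s = r * cmath.exp(1j * phi)
        ds = 1j * s * (2 * math.pi / n_points)   # ds along the circle
        total += x ** s / s ** (m + 1) * ds
    return (total / (2j * math.pi)).real

x, m = 3.0, 4
assert abs(residue_at_zero(x, m) - math.log(x) ** m / math.factorial(m)) < 1e-8
```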
Thus
$$M=\frac{(k+\ell)!^2}{(\log\xi)^{2(k+\ell)}}\,\frac{1}{(2\pi i)^2}\int_{(\delta)}\int_{(\delta)}F(s_1,s_2)\frac{\xi^{s_1+s_2}}{(s_1s_2)^{k+\ell+1}}\,ds_1\,ds_2=\frac{(k+\ell)!^2}{(\log\xi)^{2(k+\ell)}}\,M',$$
say, where
$$F(s_1,s_2)=\sum_{d_1,d_2=1}^{\infty}\mu(d_1)\mu(d_2)\frac{\nu([d_1,d_2])}{[d_1,d_2]}\,d_1^{-s_1}d_2^{-s_2}=\prod_p\bigg(1-\frac{\nu(p)}{p}\Big(\frac{1}{p^{s_1}}+\frac{1}{p^{s_2}}-\frac{1}{p^{s_1+s_2}}\Big)\bigg).$$
For $p>H$ we have $\nu(p)=k$, and so $F(s_1,s_2)$ is approximately
$$F(s_1,s_2)\approx\zeta(s_1+1)^{-k}\zeta(s_2+1)^{-k}\zeta(s_1+s_2+1)^{k}.$$
It is clear that the function $G(s_1,s_2)$, defined by
$$G(s_1,s_2)=F(s_1,s_2)\,\zeta(s_1+1)^{k}\zeta(s_2+1)^{k}\zeta(s_1+s_2+1)^{-k},$$
has an Euler product
$$G(s_1,s_2)=\prod_p G_p(s_1,s_2).$$
Note that $G(0,0)=S(\mathcal{H})$. Moreover, if $p<H$, then $G_p$ is regular for $\Re(s_1),\Re(s_2)>-1$, while, if $p>H$, $\nu(p)=k$ and so $G_p=1+O\big(p^{-\frac32}\big)$ for $\Re(s_1),\Re(s_2)>-\frac18$. Therefore $\prod_p G_p$ is uniformly convergent for $\Re(s_1),\Re(s_2)>-\frac18$. Thus on this region $G(s_1,s_2)$ is holomorphic in $s_1$ and $s_2$, and
$$G(s_1,s_2)\ll\prod_{\nu(p)\ne k}\Big(1+O\big(p^{-\frac78}\big)\Big)^{O(1)}\ll_k1.$$
Now, take $\delta=\frac{1}{\log N}$. On $\Re(s_1)=\Re(s_2)=\delta$ we have that $\xi^{s_1+s_2}=O(1)$. Moreover, $\zeta(1+s)$ and $\zeta(1+s)^{-1}$ are $O(\log(2+|t|))$ on $\Re(s)=\delta$, $|t|\ge1$. Therefore, for any $T\ge1$ we have
$$\int_{(\delta)}\int_{\substack{(\delta),\\ |t_1|\ge2T}}F(s_1,s_2)\frac{\xi^{s_1+s_2}}{(s_1s_2)^{k+\ell+1}}\,ds_1\,ds_2\ll\iint_{|t_1|\ge2T}\frac{\log(2+|t_1|+|t_2|)^{O(1)}}{(1+|t_1|)^{k+\ell+1}(1+|t_2|)^{k+\ell+1}}\,dt_1\,dt_2\ll\frac{(\log T)^{O(1)}}{T^{k+\ell}},$$
and clearly the same holds taking $|t_2|\ge T$ instead of $|t_1|\ge2T$. Thus
$$M'=\frac{1}{(2\pi i)^2}\int_{\delta-iT}^{\delta+iT}\int_{\delta-2iT}^{\delta+2iT}F(s_1,s_2)\frac{\xi^{s_1+s_2}}{(s_1s_2)^{k+\ell+1}}\,ds_1\,ds_2+O\big(T^{1-k-\ell}\big).$$
It is known that there exists $c>0$ such that for $\sigma\ge-\frac{c}{\log T}$, $1\le|t|\le T$, one has that $\zeta(1+s)\ne0$ and that $\zeta(1+s)$ and $\zeta(1+s)^{-1}$ are $O(\log(1+|t|))$. Therefore, $\zeta(1+s_1)\ne0$ inside the rectangle $\Gamma$ with vertices $a^{\pm}_T:=\delta\pm2iT$, $b^{\pm}_T:=-\frac{c}{\log2T}\pm2iT$, with the usual orientation. Clearly, if $s_1\in\Gamma$, $|t_2|\le T$ and $\sigma_2=\delta$, and supposing $T<N^c$, we have that $s_1+s_2\ne0$. Applying Cauchy's formula, for $|t_2|\le T$ we have
$$\frac{1}{2\pi i}\int_{a^-_T}^{a^+_T}F(s_1,s_2)\frac{\xi^{s_1+s_2}}{(s_1s_2)^{k+\ell+1}}\,ds_1=\frac{1}{2\pi i}\bigg(\int_{a^-_T}^{b^-_T}+\int_{b^-_T}^{b^+_T}+\int_{b^+_T}^{a^+_T}\bigg)F(s_1,s_2)\frac{\xi^{s_1+s_2}}{(s_1s_2)^{k+\ell+1}}\,ds_1$$
$$+\operatorname{Res}\bigg(F(s_1,s_2)\frac{\xi^{s_1+s_2}}{(s_1s_2)^{k+\ell+1}};\ s_1=-s_2\bigg)+\operatorname{Res}\bigg(F(s_1,s_2)\frac{\xi^{s_1+s_2}}{(s_1s_2)^{k+\ell+1}};\ s_1=0\bigg).$$
Now, we have that
$$\int_{\delta-iT}^{\delta+iT}\bigg(\int_{a^-_T}^{b^-_T}+\int_{b^+_T}^{a^+_T}\bigg)F(s_1,s_2)\frac{\xi^{s_1+s_2}}{(s_1s_2)^{k+\ell+1}}\,ds_1\,ds_2\ll\int_{-T}^{T}\int_{-\frac{c}{\log2T}}^{\delta}\frac{O(1)(\log T)^{O(1)}}{(\delta+|t_2|)^{k+\ell+1}\,T^{k+\ell+1}}\,d\sigma_1\,dt_2=O\big(T^{-k-\ell}\log N\big)$$
and
$$\int_{\delta-iT}^{\delta+iT}\int_{b^-_T}^{b^+_T}F(s_1,s_2)\frac{\xi^{s_1+s_2}}{(s_1s_2)^{k+\ell+1}}\,ds_1\,ds_2\ll\int_{-T}^{T}\int_{-2T}^{2T}\frac{\xi^{\delta-\frac{c}{\log2T}}(\log T)^{O(1)}}{(\delta+|t_2|)^{k+\ell+1}\big(\frac{c}{\log2T}+|t_1|\big)^{k+\ell+1}}\,dt_1\,dt_2=O\Big(\xi^{-\frac{c}{\log2T}}(\log T)^{O(1)}\log N\Big).$$
Thus, choosing $T=\exp\big(\sqrt{\log N}\big)$, we find that
$$M'=\frac{1}{2\pi i}\int_{\delta-iT}^{\delta+iT}\Bigg(\operatorname{Res}\bigg(F(s_1,s_2)\frac{\xi^{s_1+s_2}}{(s_1s_2)^{k+\ell+1}};\ s_1=-s_2\bigg)+\operatorname{Res}\bigg(F(s_1,s_2)\frac{\xi^{s_1+s_2}}{(s_1s_2)^{k+\ell+1}};\ s_1=0\bigg)\Bigg)ds_2+O\big(e^{-c'\sqrt{\log N}}\big),$$
for some $c'>0$. Let's compute the first residue. Let $C$ be the circle $s_1=-s_2+\frac{e^{i\varphi}}{\log N}$.
We have
$$\operatorname{Res}\bigg(F(s_1,s_2)\frac{\xi^{s_1+s_2}}{(s_1s_2)^{k+\ell+1}};\ s_1=-s_2\bigg)=\frac{1}{2\pi i}\int_C F(s_1,s_2)\frac{\xi^{s_1+s_2}}{(s_1s_2)^{k+\ell+1}}\,ds_1\ll\frac{(\log N)^{k-1}(\log t_2)^k}{|s_2|^{2k+2\ell+2}\,|\zeta(1+s_2)|^k},$$
since $\zeta\big(1-s_2+\frac{e^{i\varphi}}{\log N}\big)^{-1}\ll\log t_2$ for $\sigma_2=\delta$, $|t_2|\le T$ (and $N$ big enough). Therefore,
$$\int_{\delta-iT}^{\delta+iT}\operatorname{Res}\bigg(F(s_1,s_2)\frac{\xi^{s_1+s_2}}{(s_1s_2)^{k+\ell+1}};\ s_1=-s_2\bigg)ds_2=O\big((\log N)^{k-1}\big).$$
Now, let's consider the second residue. The function
$$Z(s_1,s_2)=G(s_1,s_2)\bigg(\frac{(s_1+s_2)\,\zeta(1+s_1+s_2)}{s_1\zeta(1+s_1)\,s_2\zeta(1+s_2)}\bigg)^k$$
is regular (and nonzero) near $s_1=s_2=0$. Let
$$f(s_2):=\operatorname{Res}\bigg(F(s_1,s_2)\frac{\xi^{s_1+s_2}}{(s_1s_2)^{k+\ell+1}};\ s_1=0\bigg)=\operatorname{Res}\bigg(\frac{\xi^{s_1+s_2}\,Z(s_1,s_2)}{(s_1s_2)^{\ell+1}(s_1+s_2)^k};\ s_1=0\bigg).$$
On the rectangle $\Gamma'$ with vertices $\delta\pm iT$, $-\frac{c}{\log T}\pm iT$, with the usual orientation, we have that
$$f(s_2)\ll\frac{\xi^{\Re(s_2)}}{|s_2|^{k+\ell+1}}(\log\xi)^{O(1)}.$$
Thus, applying Cauchy's formula, we have that the integral
$$\frac{1}{2\pi i}\int_{\delta-iT}^{\delta+iT}f(s_2)\,ds_2$$
is the residue of $f(s_2)$ at $s_2=0$, plus an error term that is $O\big(\exp(-c'\sqrt{\log N})\big)$. To conclude, it is therefore sufficient to compute $\operatorname{Res}(f(s_2);\ s_2=0)$, and that is just a (long) calculation.