Sieves in Number Theory
Lecture Notes
Taught Course Centre 2007
Tim Browning, Roger Heath-Brown
Typeset by Sandro Bettin (1)

(1) All errors are the responsibility of the typesetter. In particular there are some arguments which, as an exercise for the typesetter, have been "fleshed out" or re-interpreted, possibly incorrectly. Tim's lectures were neater and more concise. Corrections would be gratefully received at [email protected]

Contents
1 Introduction
2 Sieve of Eratosthenes
3 Large Sieve
4 Selberg sieve
5 Sieve limitations
6 Small gaps between primes

Chapter 1
Introduction

Sieves can be used to tackle the following questions:

i) Are there infinitely many primes $p$ such that $p + 2$ is also prime?
ii) Are there infinitely many primes $p$ such that $p = n^2 + 1$ for some $n \in \mathbb{N}$?
iii) Are there infinitely many primes $p$ such that $4p + 1$ is also prime?
iv) Is every sufficiently large even $n$ a sum of two primes?
v) Is it true that the interval $(n^2, (n+1)^2)$ contains at least one prime for every $n \in \mathbb{N}^*$?

These problems are still open, but sieve methods have produced partial results towards their solution. For example, in 1966 Chen proved a weaker version of iv), stating that every sufficiently large even $n$ is a sum of a prime and a $P_2$ (where $P_r$ denotes the numbers that have at most $r$ prime factors).

These problems are also related to important problems in other branches of mathematics, such as Artin's primitive root conjecture, which says that, for every $a \in \mathbb{Z}$ that is neither $-1$ nor a perfect square, there exist infinitely many primes $p$ such that $a$ is a primitive root modulo $p$.

Proposition 1. If iii) is true, then Artin's conjecture is true for $a = 2$, i.e. there exist infinitely many primes $p$ such that 2 is a primitive root modulo $p$.

Proof. Let $p = 2k + 1$, with $k \in \mathbb{N}$, and $q = 4p + 1 = 8k + 5$ be primes. Recall that for every odd prime $r$
\[ \left(\frac{2}{r}\right) = \begin{cases} 1 & \text{if } r \equiv \pm 1 \pmod 8, \\ -1 & \text{if } r \equiv \pm 3 \pmod 8, \end{cases} \]
where $\left(\frac{a}{p}\right)$ is the Legendre symbol.
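Therefore $\left(\frac{2}{q}\right) = -1$ and so there doesn't exist any $x$ such that $2 \equiv x^2 \pmod q$. Furthermore, by Fermat's little theorem, $2^{4p} = 2^{q-1} \equiv 1 \pmod q$ and so the order of 2 modulo $q$ must be 1, 2, 4, $p$, $2p$ or $4p$. It's easily checked that the order can't be 1, 2 or 4, and it can't be $p$ either because otherwise
\[ 2^p = 2^{\frac{q-1}{4}} \equiv 1 \pmod q \]
and so $(2^{k+1})^2 \equiv 2 \pmod q$, which would make 2 a square modulo $q$. It remains to show that $2^{2p} \not\equiv 1 \pmod q$. If it weren't so, we would have $2^2 \equiv 2^{-4k} \pmod q$ and so there would be two possibilities: $2 \equiv 2^{-2k} \pmod q$ or $2 \equiv -2^{-2k} \pmod q$. The first is impossible for the same reason as before; the second is impossible because it would make $-2$ a square modulo $q$, and since $q \equiv 1 \pmod 4$ this would imply that
\[ 1 = \left(\frac{-2}{q}\right) = \left(\frac{-1}{q}\right)\left(\frac{2}{q}\right) = -\left(\frac{-1}{q}\right) = -1. \]
Hence the order of 2 modulo $q$ is $4p = q - 1$, that is, 2 is a primitive root modulo $q$.

The fundamental goal of sieve theory is to produce upper and lower bounds for quantities of the type
\[ S(\mathcal{A}, \wp, z) = \#\{n \in \mathcal{A} \mid p|n \Rightarrow p > z \ \forall p \in \wp\}, \]
where $\mathcal{A}$ is a finite subset of $\mathbb{N}$, $\wp$ is a subset of the set of primes $\mathcal{P}$ and $z > 0$.

Examples 2.

1. Let $\mathcal{A} = \{n \in \mathbb{N} \mid n \le x\}$ and $\wp = \{p \in \mathcal{P} \mid p \equiv 3 \pmod 4\}$. Then
\[ S(\mathcal{A}, \wp, x) = \#\{n \le x \mid p|n,\ p \in \mathcal{P} \Rightarrow p \not\equiv 3 \pmod 4\} \ge \#\{n \le x \mid n = a^2 + b^2 \text{ for some coprime } a, b \in \mathbb{N}\}, \]
so through this function we can detect sums of two squares.

2. Let $\mathcal{A} = \{n \in \mathbb{N} \mid n \le x\}$ and $\sqrt{x} < z \le x$. Then
\[ S(\mathcal{A}, \mathcal{P}, z) = \#\{n \le x \mid p|n \Rightarrow p > z\} = \pi(x) - \pi(z) + 1, \]
where $\pi(x) = \#\{p \in \mathcal{P} \mid p \le x\}$ and the term $+1$ accounts for $n = 1$.

3. Let $\mathcal{A} = \{n(2N - n) \mid n \in \mathbb{N},\ 2 \le n \le 2N - 2\}$. Then
\[ S(\mathcal{A}, \mathcal{P}, \sqrt{2N}) = \#\{(p, 2N - p) \in \mathcal{P}^2 \mid \sqrt{2N} < p < 2N - \sqrt{2N}\} \]
and this is related to the Goldbach conjecture.

Chapter 2
Sieve of Eratosthenes

The Möbius function is the function $\mu : \mathbb{N}^* \to \{0, \pm 1\}$ defined by
\[ \mu(n) = \begin{cases} 1 & \text{if } n = 1, \\ 0 & \text{if } \exists p \in \mathcal{P} \text{ such that } p^2 | n, \\ (-1)^r & \text{if } n = p_1 \cdots p_r \text{ with } p_1, \dots, p_r \text{ distinct primes.} \end{cases} \]

Lemma 1. For all $n \in \mathbb{N}$ we have
\[ \sum_{d|n} \mu(d) = \begin{cases} 1 & \text{if } n = 1, \\ 0 & \text{otherwise.} \end{cases} \]

Proof. The case $n = 1$ is clear. Suppose $n = p_1^{e_1} \cdots p_r^{e_r}$ with $p_1, \dots, p_r$ distinct primes, $r \ge 1$ and $e_1, \dots, e_r \in \mathbb{N}^*$. Then
\[ \sum_{d|n} \mu(d) = \sum_{d | p_1 \cdots p_r} \mu(d) = 1 + (-1)\binom{r}{1} + \dots + (-1)^r \binom{r}{r} = (1 - 1)^r = 0. \]

Lemma 2 (Abel's partial summation formula). Let $\lambda_1, \lambda_2, \dots$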
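be an increasing sequence of real numbers that goes to $\infty$ and $c_1, c_2, \dots$ a sequence of complex numbers. Let $C(x) = \sum_{\lambda_n \le x} c_n$ and let $\phi : [\lambda_1, \infty) \to \mathbb{R}$ be of class $C^1$. Then
\[ \sum_{\lambda_n \le X} c_n \phi(\lambda_n) = -\int_{\lambda_1}^{X} C(x)\phi'(x)\,dx + C(X)\phi(X), \tag{2.1} \]
for all $X \ge \lambda_1$. Moreover, if $C(X)\phi(X) \to 0$ as $X \to \infty$, then
\[ \sum_{n=1}^{\infty} c_n \phi(\lambda_n) = -\int_{\lambda_1}^{\infty} C(x)\phi'(x)\,dx, \tag{2.2} \]
provided that either side is convergent.

Proof. One has
\[ C(X)\phi(X) - \sum_{\lambda_n \le X} c_n \phi(\lambda_n) = \sum_{\lambda_n \le X} c_n \bigl(\phi(X) - \phi(\lambda_n)\bigr) = \sum_{\lambda_n \le X} c_n \int_{\lambda_n}^{X} \phi'(x)\,dx = \int_{\lambda_1}^{X} \sum_{\lambda_n \le x} c_n \phi'(x)\,dx = \int_{\lambda_1}^{X} C(x)\phi'(x)\,dx. \]
This proves (2.1). To prove (2.2) it's enough to let $X$ go to infinity.

Let
\[ \Pi = \Pi(\wp, z) := \prod_{p \in \wp,\ p \le z} p, \qquad \mathcal{A}_d := \{n \in \mathbb{N} \mid dn \in \mathcal{A}\}, \]
for all $d \in \mathbb{N}^*$. Applying Lemma 1, we can write
\[ S(\mathcal{A}, \wp, z) = \sum_{n \in \mathcal{A},\ (n,\Pi)=1} 1 = \sum_{n \in \mathcal{A}} \sum_{d | (n,\Pi)} \mu(d) = \sum_{d | \Pi(\wp,z)} \mu(d)\,\#\mathcal{A}_d. \tag{2.3} \]
Now, suppose that there exist $X$, $R_d$ and a completely multiplicative function $\omega(d)$, with $\omega(d) \ge 0$ for all $d$ and $\omega(p) = 0$ for all $p \in \mathcal{P} \setminus \wp$, such that
\[ \#\mathcal{A}_d = \frac{\omega(d)}{d} X + R_d \qquad \forall d \in \mathbb{N}^*. \tag{2.4} \]
Then we can prove the following

Theorem A (Sieve of Eratosthenes). Let $X$, $R_d$, $\omega(d)$ be as above and assume furthermore that

1. $R_d = O(\omega(d))$;
2. there exists $k \ge 0$ such that $\sum_{p | \Pi(\wp,z)} \frac{\omega(p) \log p}{p} \le k \log z + O(1)$;
3. there exists $y > 0$ such that $\#\mathcal{A}_d = 0$ for $d > y$.

Then we have
\[ S(\mathcal{A}, \wp, z) = X W(z) + O\left( \left( X + \frac{y}{\log z} \right) (\log z)^{k+1} \exp\left( -\frac{\log y}{\log z} \right) \right), \]
where
\[ W(z) = \prod_{p \in \wp,\ p \le z} \left( 1 - \frac{\omega(p)}{p} \right). \]

Proof. Assume all the hypotheses of the theorem. For all $\delta > 0$, we have
\[ F(t, z) := \sum_{d \le t,\ d | \Pi} \omega(d) \le \sum_{d | \Pi} \omega(d) \left( \frac{t}{d} \right)^{\delta}, \]
using "Rankin's trick". Since $1 + x \le e^x$ for all $x \in \mathbb{R}$, using the multiplicativity of $\omega$ we deduce that
\[ F(t, z) \le t^{\delta} \prod_{p | \Pi} \left( 1 + \frac{\omega(p)}{p^{\delta}} \right) \le t^{\delta} \prod_{p | \Pi} \exp\left( \frac{\omega(p)}{p^{\delta}} \right) = \exp\left( \delta \log t + \sum_{p | \Pi} \frac{\omega(p)}{p^{\delta}} \right). \]
Now, writing $\delta = 1 - \eta$ and using the inequality $e^x \le 1 + x e^x$ for $x > 0$, we see that
\[ p^{\eta} = \exp(\eta \log p) \le 1 + \eta p^{\eta} \log p \le 1 + \eta z^{\eta} \log p, \]
since every prime $p | \Pi(\wp, z)$ is at most $z$. Therefore
\[ F(t, z) \le t \exp(-\eta \log t) \exp\left( \sum_{p | \Pi} \frac{\omega(p)}{p} p^{\eta} \right) \le t \exp\left( -\eta \log t + \sum_{p | \Pi} \frac{\omega(p)}{p} + \eta z^{\eta} \sum_{p | \Pi} \frac{\omega(p)}{p} \log p \right). \]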
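Now, applying Lemma 2 to $c_p = \frac{\omega(p)}{p} \log p$ and $\phi(x) = \frac{1}{\log x}$, we have
\[ \sum_{p | \Pi(\wp,z)} \frac{\omega(p)}{p} = \sum_{p | \Pi(\wp,z)} \frac{\omega(p)}{p} \log p \cdot \frac{1}{\log p} = \int_{p_1}^{z} \left( \sum_{p | \Pi(\wp,x)} \frac{\omega(p)}{p} \log p \right) \frac{dx}{x \log^2 x} + \frac{1}{\log z} \sum_{p | \Pi(\wp,z)} \frac{\omega(p)}{p} \log p \le k \log\log z + O(1), \]
by hypothesis 2), where $p_1$ is the smallest prime in $\wp$. Hence
\[ F(t, z) \ll t \exp\left( -\eta \log t + k \log\log z + k \eta z^{\eta} \log z \right). \]
Choosing $\eta = (\log z)^{-1}$, we obtain
\[ F(t, z) \ll t \exp\left( -\frac{\log t}{\log z} \right) (\log z)^k. \tag{2.5} \]
Moreover, by partial summation (Lemma 2) with $c_d = \omega(d)$, $\phi(x) = \frac{1}{x}$, we can conclude that
\[ \sum_{d | \Pi,\ d > y} \frac{\omega(d)}{d} = \int_y^{\infty} \frac{F(t, z) - F(y, z)}{t^2}\,dt = -\frac{F(y, z)}{y} + \int_y^{\infty} \frac{F(t, z)}{t^2}\,dt \]
\[ \ll (\log z)^k \left( \exp\left( -\frac{\log y}{\log z} \right) + \int_y^{\infty} \frac{1}{t} \exp\left( -\frac{\log t}{\log z} \right) dt \right) \ll (\log z)^{k+1} \exp\left( -\frac{\log y}{\log z} \right). \tag{2.6} \]
Finally, by hypothesis 3) and (2.3)-(2.4) we have
\[ S(\mathcal{A}, \wp, z) = \sum_{d | \Pi,\ d \le y} \mu(d)\,\#\mathcal{A}_d = X \sum_{d | \Pi,\ d \le y} \frac{\mu(d)\omega(d)}{d} + O\left( \sum_{d | \Pi,\ d \le y} |\mu(d)||R_d| \right) \]
\[ = X W(z) + O\left( X \sum_{d | \Pi,\ d > y} \frac{\omega(d)}{d} + \sum_{d | \Pi,\ d \le y} \omega(d) \right) = X W(z) + O\left( \left( X + \frac{y}{\log z} \right) (\log z)^{k+1} \exp\left( -\frac{\log y}{\log z} \right) \right), \]
where we used hypothesis 1) and (2.5)-(2.6).

We apply the previous theorem to the problem of twin primes.

Corollary A.1 (Brun's theorem). We have
\[ \sum_{p,\ p+2 \in \mathcal{P}} \frac{1}{p} < \infty. \]

Proof. It follows from a slightly modified version of Theorem A.

Corollary A.2. For $z \le x^{\frac{1}{4 \log\log x}}$, we have
\[ \phi(x, z) = \#\{n \le x \mid p|n \Rightarrow p \ge z\} \sim x \prod_{p < z} \left( 1 - \frac{1}{p} \right). \]

Proof. Exercise.

Note the following classical estimate.

Lemma 3 (Mertens' formula). We have
\[ \prod_{p \le z} \left( 1 - \frac{1}{p} \right) \sim \frac{e^{-\gamma}}{\log z}, \]
where $\gamma$ is Euler's constant.

Proof. See Hardy and Wright, Theorem 429.

Chapter 3
Large Sieve

Lemma 4. Let $F : [0, 1] \to \mathbb{C}$ be a differentiable function with continuous derivative. Then, if we extend $F$ by periodicity to all of $\mathbb{R}$ with period 1, we have
\[ \sum_{d \le z} \sum_{1 \le a \le d,\ (a,d)=1} \left| F\left( \frac{a}{d} \right) \right| \le z^2 \int_0^1 |F(\alpha)|\,d\alpha + \int_0^1 |F'(\alpha)|\,d\alpha, \]
for all $z \in \mathbb{N}^*$.

Proof. We have that
\[ -F\left( \frac{a}{d} \right) = -F(\alpha) + \int_{a/d}^{\alpha} F'(t)\,dt. \]
Therefore
\[ \left| F\left( \frac{a}{d} \right) \right| \le |F(\alpha)| + \left| \int_{a/d}^{\alpha} |F'(t)|\,dt \right|. \tag{3.1} \]
Now, let $\delta = \frac{1}{2z^2}$, so that the intervals $I = I\left(\frac{a}{d}\right) := \left( \frac{a}{d} - \delta, \frac{a}{d} + \delta \right)$, for $d \le z$, $1 \le a \le d$ and $(a, d) = 1$, are pairwise disjoint and, by periodicity, can be regarded as contained in $[0, 1]$.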
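Integrating (3.1) over $I$, we obtain
\[ 2\delta \left| F\left( \frac{a}{d} \right) \right| \le \int_I |F(\alpha)|\,d\alpha + \int_I \left| \int_{a/d}^{\alpha} |F'(t)|\,dt \right| d\alpha \le \int_I |F(\alpha)|\,d\alpha + \int_I \int_I |F'(t)|\,dt\,d\alpha = \int_I |F(\alpha)|\,d\alpha + 2\delta \int_I |F'(t)|\,dt, \]
since, if $\alpha \in I$, then the interval between $\frac{a}{d}$ and $\alpha$ is contained in $I$. Summing over $a$ and $d$ and multiplying by $z^2 = \frac{1}{2\delta}$ we obtain
\[ \sum_{d \le z} \sum_{1 \le a \le d,\ (a,d)=1} \left| F\left( \frac{a}{d} \right) \right| \le \sum_{d \le z} \sum_{1 \le a \le d,\ (a,d)=1} \left( z^2 \int_I |F(\alpha)|\,d\alpha + \int_I |F'(t)|\,dt \right) \le z^2 \int_0^1 |F(\alpha)|\,d\alpha + \int_0^1 |F'(\alpha)|\,d\alpha. \]

Theorem B (Analytic large sieve inequality). Let $\{a_n\}_{n \in \mathbb{N}}$ be a sequence in $\mathbb{C}$, $x \in \mathbb{N}$ and
\[ S(\alpha) = \sum_{n \le x} a_n e(n\alpha), \]
where $e(\beta) = \exp(2\pi i \beta)$. Then
\[ \sum_{d \le z} \sum_{1 \le a \le d,\ (a,d)=1} \left| S\left( \frac{a}{d} \right) \right|^2 \le (z^2 + 4\pi x) \sum_{n \le x} |a_n|^2. \]

Proof. Applying Lemma 4 with $F(\alpha) = S(\alpha)^2$, we obtain
\[ \sum_{d \le z} \sum_{1 \le a \le d,\ (a,d)=1} \left| S\left( \frac{a}{d} \right) \right|^2 \le z^2 \int_0^1 |S(\alpha)|^2\,d\alpha + 2 \int_0^1 |S'(\alpha) S(\alpha)|\,d\alpha. \]
By Parseval's identity we have that
\[ \int_0^1 |S(\alpha)|^2\,d\alpha = \sum_{n \le x} |a_n|^2 \]
and, since $S'(\alpha) = 2\pi i \sum_{n \le x} n a_n e(n\alpha)$, by Cauchy's inequality and Parseval's identity we get
\[ \left( \int_0^1 |S'(\alpha) S(\alpha)|\,d\alpha \right)^2 \le \left( \sum_{n \le x} |a_n|^2 \right) \left( 4\pi^2 \sum_{n \le x} n^2 |a_n|^2 \right) \le 4\pi^2 x^2 \left( \sum_{n \le x} |a_n|^2 \right)^2, \]
which completes the proof.

Remark. Montgomery-Vaughan (1974) and Selberg proved independently that $4\pi$ can be removed from the analytic large sieve inequality. Moreover, 1 is the best possible coefficient of $x$.

Next we deduce a sieve method from Theorem B. We need the following lemma about Ramanujan sums.

Lemma 5. For all $d, n \in \mathbb{N}$, let
\[ c_d(n) = \sum_{1 \le a \le d,\ (a,d)=1} e\left( \frac{na}{d} \right). \]
Then

1. $(d, d') = 1 \Rightarrow c_{dd'}(n) = c_d(n) c_{d'}(n)$;
2. $c_d(n) = \sum_{D | (d,n)} \mu\left( \frac{d}{D} \right) D$;
3. $(d, n) = 1 \Rightarrow c_d(n) = \mu(d)$.

Proof. 1. By Bézout's identity, every reduced residue $a$ modulo $dd'$ can be written uniquely as $a \equiv rd + sd'$ with $r$ a reduced residue modulo $d'$ and $s$ a reduced residue modulo $d$. Hence
\[ c_{dd'}(n) = \sum_{1 \le a \le dd',\ (a,dd')=1} e\left( \frac{na}{dd'} \right) = \sum_{\substack{1 \le s \le d,\ (s,d)=1 \\ 1 \le r \le d',\ (r,d')=1}} e\left( \frac{n(rd + sd')}{dd'} \right) = \sum_{1 \le s \le d,\ (s,d)=1} e\left( \frac{ns}{d} \right) \sum_{1 \le r \le d',\ (r,d')=1} e\left( \frac{nr}{d'} \right) = c_d(n) c_{d'}(n). \]

2. By Lemma 1, we have
\[ c_d(n) = \sum_{1 \le a \le d} e\left( \frac{na}{d} \right) \sum_{D | (a,d)} \mu(D) = \sum_{D | d} \mu(D) \sum_{1 \le a \le \frac{d}{D}} e\left( \frac{naD}{d} \right) = \sum_{D | d,\ \frac{d}{D} | n} \mu(D) \frac{d}{D} = \sum_{D | (d,n)} \mu\left( \frac{d}{D} \right) D, \]
since
\[ \sum_{1 \le a \le d} e\left( \frac{na}{d} \right) = \begin{cases} 0 & \text{if } d \nmid n, \\ d & \text{if } d \mid n. \end{cases} \]

3. It's a special case of the previous point.

Theorem C (Arithmetic large sieve inequality).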
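Let $\wp \subset \mathcal{P}$ and $\mathcal{A} = \{n \in \mathbb{N} \mid n \le x\}$. For each $p \in \wp$, let $\Omega_p = \{w_{1,p}, \dots, w_{\omega(p),p}\}$ be a set of $\omega(p)$ residue classes modulo $p$ and put $\omega(p) = 0$ if $p \notin \wp$. Finally, let
\[ \mathcal{S}(\mathcal{A}, \wp, z) = \{n \in \mathcal{A} \mid n \not\equiv w_{i,p} \pmod p \ \ \forall\, 1 \le i \le \omega(p) \ \ \forall p | \Pi(\wp, z)\} \]
and $S(\mathcal{A}, \wp, z) = \#\mathcal{S}(\mathcal{A}, \wp, z)$. Then
\[ S(\mathcal{A}, \wp, z) \le \frac{z^2 + 4\pi x}{L(z)}, \]
where
\[ L(z) = \sum_{d \le z} |\mu(d)| \prod_{p | d} \frac{\omega(p)}{p - \omega(p)}. \]

Proof. Let $d = p_1 \cdots p_t$ be a (square-free) integer dividing $\Pi(\wp, z)$. By the Chinese remainder theorem, for every $i = (i_1, \dots, i_t)$ with $1 \le i_j \le \omega(p_j)$ there exists a unique $W_{i,d}$ such that $0 \le W_{i,d} < d$ and $W_{i,d} \equiv w_{i_j,p_j} \pmod{p_j}$ for $j \le t$. Let's call $\omega(d) = \prod_{j=1}^{t} \omega(p_j)$ the total number of possible $W_{i,d}$ as we vary $i$.

Now let $n \in \mathcal{S}(\mathcal{A}, \wp, z)$. Then $(n - W_{i,d}, d) = 1$ for all $d$ and $i$. Hence, by Lemma 5 item 3), we have
\[ \mu(d) = c_d(n - W_{i,d}) = \sum_{1 \le a \le d,\ (a,d)=1} e\left( \frac{a(n - W_{i,d})}{d} \right). \]
Summing over $i$ and $n \in \mathcal{S}(\mathcal{A}, \wp, z)$, we deduce that
\[ \mu(d)\, S(\mathcal{A}, \wp, z)\, \omega(d) = \sum_n \sum_i c_d(n - W_{i,d}) = \sum_{1 \le a \le d,\ (a,d)=1} \left( \sum_i e\left( \frac{-aW_{i,d}}{d} \right) \right) \left( \sum_n e\left( \frac{an}{d} \right) \right) \]
and therefore, by the Cauchy-Schwarz inequality,
\[ |\mu(d)\, S(\mathcal{A}, \wp, z)\, \omega(d)|^2 \le \sum_{1 \le a \le d,\ (a,d)=1} \left| \sum_i e\left( \frac{-aW_{i,d}}{d} \right) \right|^2 \cdot \sum_{1 \le a \le d,\ (a,d)=1} \left| \sum_n e\left( \frac{an}{d} \right) \right|^2. \]
The first term on the right hand side is
\[ \sum_{1 \le a \le d,\ (a,d)=1} \left| \sum_i e\left( \frac{-aW_{i,d}}{d} \right) \right|^2 = \sum_{1 \le a \le d,\ (a,d)=1} \sum_{W_{i,d}, W'_{i,d}} e\left( \frac{(W_{i,d} - W'_{i,d})a}{d} \right) = \sum_{W_{i,d}, W'_{i,d}} c_d(W_{i,d} - W'_{i,d}) \]
\[ = \sum_{W_{i,d}, W'_{i,d}} \sum_{D | (d,\, W_{i,d} - W'_{i,d})} \mu\left( \frac{d}{D} \right) D = \sum_{D | d} D \mu\left( \frac{d}{D} \right) \sum_{W_{i,d}} \sum_{W'_{i,d}:\ D | W_{i,d} - W'_{i,d}} 1 \]
\[ = \sum_{D | d} D \mu\left( \frac{d}{D} \right) \omega(d)\, \omega\left( \frac{d}{D} \right) = d\,\omega(d) \sum_{E | d} \frac{\mu(E)\omega(E)}{E} = d\,\omega(d) \prod_{p | d} \left( 1 - \frac{\omega(p)}{p} \right) = \omega(d) \prod_{p | d} (p - \omega(p)), \]
where we used Lemma 5 item 2). Hence we have
\[ |\mu(d)|\, S(\mathcal{A}, \wp, z)^2 \prod_{p | d} \frac{\omega(p)}{p - \omega(p)} \le \sum_{1 \le a \le d,\ (a,d)=1} \left| \sum_n e\left( \frac{an}{d} \right) \right|^2 \]
and this inequality is obviously true also if $d$ is not square-free or if it doesn't divide $\Pi(\wp, z)$. Summing over $d \le z$ and applying Theorem B with $a_n = 1$ if $n \in \mathcal{S}(\mathcal{A}, \wp, z)$, 0 otherwise, we obtain
\[ L(z)\, S(\mathcal{A}, \wp, z)^2 \le (z^2 + 4\pi x)\, S(\mathcal{A}, \wp, z). \]

Given a prime $p$ let's define $q(p)$ to be the smallest positive integer such that $q(p)$ is not a square modulo $p$ (that is, such that $\left(\frac{q(p)}{p}\right) = -1$). Note that,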
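the Legendre symbol being completely multiplicative, $q(p) \in \mathcal{P}$. Moreover, $q(p) = 2$ if $p \equiv \pm 3 \pmod 8$, since
\[ \left(\frac{2}{p}\right) = \begin{cases} 1 & \text{if } p \equiv \pm 1 \pmod 8, \\ -1 & \text{if } p \equiv \pm 3 \pmod 8. \end{cases} \]
The best result known is $q(p) \ll_{\varepsilon} p^{\theta + \varepsilon}$ for all $\varepsilon > 0$ unconditionally, where $\theta = \frac{1}{4\sqrt{e}} = 0.1516\ldots$, while, assuming the generalized Riemann hypothesis, it is $q(p) \ll \log^2 p$. This problem is linked to Artin's conjecture on primitive roots. Using Theorem C, we can now prove the following corollary.

Corollary C.1. Let $\varepsilon > 0$ and
\[ E_{\varepsilon}(N) = \#\{\text{primes } p \le N \mid q(p) > N^{\varepsilon}\}. \]
Then $E_{\varepsilon}(N) \ll_{\varepsilon} 1$.

Proof. Since $E_{\varepsilon'}(N) \le E_{\varepsilon}(N)$ if $\varepsilon < \varepsilon'$, we can suppose $\varepsilon^{-1} \in \mathbb{N}$. Let $\mathcal{A} = \{1, \dots, N^2\}$, $\wp = \{p \in \mathcal{P} \mid \left(\frac{n}{p}\right) = 1 \ \forall n \le N^{\varepsilon}\}$ and $\Omega_p = \{v \pmod p \mid \left(\frac{v}{p}\right) = -1\}$. Thus $\omega(p) = \#\Omega_p = \frac{p-1}{2}$ for all $p \in \wp$ and
\[ h(p) := \frac{\omega(p)}{p - \omega(p)} = \frac{p-1}{p+1} \ge \frac{1}{3} \quad \text{if } p \in \wp. \]
Theorem C (with $x = N^2$ and $z = N$) implies that
\[ S(\mathcal{A}, \wp, N) \le \frac{N^2 + 4\pi N^2}{\sum_{d \le N} |\mu(d)| \prod_{p | d} h(p)} = \frac{(1 + 4\pi) N^2}{\sum_{d \le N,\ p | d \Rightarrow p \in \wp} |\mu(d)| \prod_{p | d} h(p)} \le \frac{(1 + 4\pi) N^2}{\sum_{p \le N,\ q(p) > N^{\varepsilon}} h(p)}. \]
But
\[ E_{\varepsilon}(N) = \sum_{p \le N,\ q(p) > N^{\varepsilon}} 1 \le \sum_{p \le N,\ q(p) > N^{\varepsilon}} 3h(p) \]
and so
\[ E_{\varepsilon}(N)\, S(\mathcal{A}, \wp, N) \le 3(1 + 4\pi) N^2. \tag{3.2} \]
Moreover, we have
\[ S(\mathcal{A}, \wp, N) = \#\left\{ n \le N^2 \ \middle|\ p \le N,\ \left(\tfrac{m}{p}\right) = 1 \ \forall m \le N^{\varepsilon} \Rightarrow \left(\tfrac{n}{p}\right) \ne -1 \right\} \]
\[ \ge \#\left\{ n = m \cdot p_1 \cdots p_k \le N^2 \ \middle|\ N^{\varepsilon - \varepsilon^2/2} < p_j < N^{\varepsilon} \text{ for } 1 \le j \le k = 2\varepsilon^{-1} \right\}. \tag{3.3} \]
Indeed if $n = m \cdot p_1 \cdots p_k \le N^2$ with $N^{\varepsilon - \varepsilon^2/2} < p_j < N^{\varepsilon}$ for $1 \le j \le k = 2\varepsilon^{-1}$, then for all $p \in \wp$ we have $\left(\frac{p_j}{p}\right) = 1$ for all $1 \le j \le k$ and $\left(\frac{m}{p}\right) = 1$, since $N^2 \ge m \left( N^{\varepsilon - \varepsilon^2/2} \right)^k = m N^{2 - \varepsilon}$ and so $m \le N^{\varepsilon}$. Thus $\left(\frac{n}{p}\right) = \left(\frac{m \cdot p_1 \cdots p_k}{p}\right) = 1$. Using Mertens' estimate $\sum_{p \le B} \frac{1}{p} = \log\log B + M + o(1)$ and the bound $\pi(N^{\varepsilon}) \le (1 + o(1)) \frac{N^{\varepsilon}}{\varepsilon \log N}$, the inequality (3.3) gives
\[ S(\mathcal{A}, \wp, N) \ge \sum_{\substack{p_1, \dots, p_k \\ N^{\varepsilon - \varepsilon^2/2} < p_j < N^{\varepsilon}}} \left\lfloor \frac{N^2}{p_1 \cdots p_k} \right\rfloor > N^2 \sum_{\substack{p_1, \dots, p_k \\ N^{\varepsilon - \varepsilon^2/2} < p_j < N^{\varepsilon}}} \frac{1}{p_1 \cdots p_k} - \sum_{\substack{p_1, \dots, p_k \\ p_j < N^{\varepsilon}}} 1 \]
\[ = N^2 \left( (1 + o(1)) \left( \log \frac{1}{1 - \frac{\varepsilon}{2}} \right)^{2\varepsilon^{-1}} - (1 + o(1)) \left( \frac{1}{\varepsilon \log N} \right)^{2\varepsilon^{-1}} \right) \gg_{\varepsilon} N^2, \tag{3.4} \]
since $\log \frac{1}{1 - \frac{\varepsilon}{2}} > \frac{1}{\varepsilon \log N}$ for $N$ large enough (depending on $\varepsilon$). To complete the proof it is enough to put together (3.2) and (3.4).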
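The two elementary facts about $q(p)$ used above (that $q(p)$ is always prime, and that $q(p) = 2$ exactly when $p \equiv \pm 3 \pmod 8$) can be checked numerically. The following is a minimal sketch, not part of the original notes; the helper names `legendre`, `least_nonresidue` and `primes_up_to` are hypothetical, and the Legendre symbol is computed with Euler's criterion.

```python
def legendre(a: int, p: int) -> int:
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    r = pow(a, (p - 1) // 2, p)
    return -1 if r == p - 1 else r


def least_nonresidue(p: int) -> int:
    """Smallest positive integer q with (q/p) = -1, for an odd prime p."""
    q = 2
    while legendre(q, p) != -1:
        q += 1
    return q


def primes_up_to(n: int) -> list[int]:
    """All primes <= n (n >= 2), by the sieve of Eratosthenes."""
    flags = [True] * (n + 1)
    flags[0] = flags[1] = False
    for i in range(2, int(n ** 0.5) + 1):
        if flags[i]:
            for j in range(i * i, n + 1, i):
                flags[j] = False
    return [i for i, is_p in enumerate(flags) if is_p]


if __name__ == "__main__":
    for p in primes_up_to(100)[1:]:  # skip p = 2
        print(p, least_nonresidue(p))
```

Running the loop shows how slowly $q(p)$ grows in practice, in line with the bounds quoted before Corollary C.1.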
We now would like to tackle the following question: for $a, b \in \mathbb{N}$ how likely is it that the conic
\[ C_{a,b} := \{ax^2 + by^2 = z^2,\ (x, y, z) \ne (0, 0, 0)\} \subset \mathbb{P}^2_{\mathbb{Q}} \]
has a rational point? If $M(H)$ is defined as
\[ M(H) = \#\{a, b \in \mathbb{N} \mid a, b \le H,\ C_{a,b}(\mathbb{Q}) \ne \emptyset\}, \]
what is the behaviour of the ratio $\frac{M(H)}{H^2}$ as $H$ goes to infinity? We are now going to deduce from Theorem C a partial answer to this problem, but first we need to state some definitions and results.

Let $K = \mathbb{R}$ or $\mathbb{Q}_p$ for some prime $p$. The Hilbert symbol for $K$ is the function defined by
\[ (a, b)_K = \begin{cases} 1 & \text{if } \exists (x, y, z) \in K^3 \setminus \{0\} \text{ s.t. } ax^2 + by^2 = z^2, \\ -1 & \text{otherwise,} \end{cases} \]
for all $a, b \in K^*$. Write
\[ (a, b)_K = \begin{cases} (a, b)_p & \text{if } K = \mathbb{Q}_p, \\ (a, b)_{\infty} & \text{if } K = \mathbb{R}. \end{cases} \]
We'll need the following properties:

Proposition 3. Let $K = \mathbb{Q}_p$ for some prime $p$ or $K = \mathbb{R}$ and let $a, a', b \in K^*$. Then:

1. $(a, b)_K = (b, a)_K$;
2. $(aa', b)_K = (a, b)_K (a', b)_K$ (bimultiplicativity);
3. $(a, b)_{\infty} = 1$ if $a > 0$ or $b > 0$, and $(a, b)_{\infty} = -1$ if $a, b < 0$;
4. if $p > 2$ and $a = p^{\alpha} u$, $b = p^{\beta} v$ for $p \nmid uv$, then
\[ (a, b)_p = (-1)^{\frac{\alpha\beta(p-1)}{2}} \left(\frac{u}{p}\right)^{\beta} \left(\frac{v}{p}\right)^{\alpha}, \]
where the last two factors are Legendre symbols.

Proof. See §3 of Serre's "A course in arithmetic".

It's worthwhile to know the following theorem, which shows that the conics $C_{a,b}$ satisfy the Hasse principle.

Theorem (Hasse-Minkowski). There exists $(x, y, z) \in \mathbb{Q}^3 \setminus \{0\}$ such that $ax^2 + by^2 = z^2$ iff $(a, b)_{\infty} = 1$ and $(a, b)_p = 1$ for all primes $p$.

Proof. See Serre's "A course in arithmetic".

Now we are ready to prove the following

Corollary C.2. We have
\[ M(H) \ll_{\varepsilon} \frac{H^2}{(\log H)^{\frac{1}{2} - \varepsilon}}. \]

Proof. Let
\[ M^*(H', H) = \#\{a, b \in \mathbb{N} \mid a \le H',\ |\mu(a)| = 1,\ b \le H,\ C_{a,b}(\mathbb{Q}) \ne \emptyset\}. \]
Clearly, we have
\[ M^*(H', H) \le \sum_{a \le H'} |\mu(a)|\, M_a(H), \]
where
\[ M_a(H) = \#\{b \le H \mid (a, b)_p = 1 \ \forall p > 2\}. \]
If we define $\wp = \{p \in \mathcal{P} \mid p > 2\}$, $\mathcal{A} = \{b \le H\}$ and $\Omega_p = \{v \pmod p \mid p \nmid v,\ (a, v)_p = -1\}$, then $M_a(H) \le S(\mathcal{A}, \wp, z)$ for all $z > 0$.

Let's now fix a square-free $a \le H'$ and assume $H' \le H$. Since $a$ is square-free we can write $a = p^{\alpha} u$ for $p \nmid u$ and $\alpha \in \{0, 1\}$.
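Thus, by Proposition 3 item 4), we have that if $p > 2$,
\[ \Omega_p = \left\{ 1 \le v \le p - 1 \ \middle|\ -1 = \left(\frac{v}{p}\right)^{\alpha} \right\} \]
and so
\[ \omega(p) = \begin{cases} \frac{p-1}{2} & \text{if } \alpha = 1, \\ 0 & \text{otherwise.} \end{cases} \]
Applying Theorem C, we therefore obtain
\[ M_a(H) \le \frac{z^2 + 4\pi H}{L_a(z)}, \]
where
\[ L_a(z) = \sum_{d \le z,\ p | d \Rightarrow p | a} |\mu(d)| \prod_{p | d} \frac{p-1}{p+1} = \sum_{d \le z,\ d | a} g(d) \]
and $g(d) = \prod_{p | d} \frac{p-1}{p+1}$.

Now, let $\varepsilon > 0$ and note that $\frac{p-1}{p+1} \ge \frac{1}{1 + \varepsilon}$ iff $p \ge \frac{2 + \varepsilon}{\varepsilon} \gg_{\varepsilon} 1$. If we take $z = \sqrt{a}$ and we define $\nu(d) := \sum_{p | d} 1$, we have
\[ L_a(z) = \sum_{d \le z,\ d | a} \prod_{p | d} \frac{p-1}{p+1} \gg_{\varepsilon} \sum_{d \le z,\ d | a} \prod_{p | d,\ p \gg_{\varepsilon} 1} \frac{p-1}{p+1} \ge \sum_{d \le z,\ d | a} \left( \frac{1}{1 + \varepsilon} \right)^{\nu(d)} \ge \frac{1}{(1 + \varepsilon)^{\nu(a)}} \sum_{d \le \sqrt{a},\ d | a} 1 = \frac{1}{2} \left( \frac{2}{1 + \varepsilon} \right)^{\nu(a)}. \]
Moreover, we have that $z = \sqrt{a} \le \sqrt{H'} \le \sqrt{H}$, thus
\[ M^*(H', H) \le \sum_{a \le H'} |\mu(a)|\, M_a(H) \ll_{\varepsilon} H \sum_{a \le H'} |\mu(a)| \left( \frac{2}{1 + \varepsilon} \right)^{-\nu(a)} \le H \sum_{a \le H'} \left( \frac{1 + \varepsilon}{2} \right)^{\nu(a)}. \]
Hardy and Ramanujan proved that $\sum_{a \le H'} \beta^{\nu(a)} \ll \frac{H'}{(\log H')^{1 - \beta}}$ and so, with $\beta = \frac{1 + \varepsilon}{2}$, we obtain
\[ M^*(H', H) \ll_{\varepsilon} \frac{H H'}{(\log H')^{\frac{1 - \varepsilon}{2}}}. \]
Finally, note that $C_{uv^2,b}(\mathbb{Q}) \ne \emptyset$ implies $C_{u,b}(\mathbb{Q}) \ne \emptyset$, so, writing $a = uv^2$ for $u$ square-free, we get
\[ M(H) \le \sum_{v \le \sqrt{H}} M^*\left( \frac{H}{v^2}, H \right) \ll_{\varepsilon} \frac{H^2}{(\log H)^{\frac{1}{2} - \varepsilon}}. \]

Remark C.2.1. The result proved in the previous corollary can be improved. In fact, Hooley and Serre proved that
\[ \frac{H^2}{\log H} \ll M(H) \ll \frac{H^2}{\log H}. \]

Chapter 4
Selberg sieve

The sieve of Eratosthenes investigates the function $S(\mathcal{A}, \wp, z) = \sum_{n \in \mathcal{A},\ (n,\Pi)=1} 1$, where $\Pi = \Pi(\wp, z) = \prod_{p \in \wp,\ p < z} p$, via the equality
\[ S(\mathcal{A}, \wp, z) = \sum_{n \in \mathcal{A}} \sum_{d | n,\ d | \Pi} \mu(d) = \sum_{d | \Pi} \mu(d)\,\#\mathcal{A}_d. \]
The "basic sieve problem" is to find some arithmetic functions $\mu^{\pm} : \mathbb{N} \to \mathbb{R}$ such that
\[ \sum_{d | n,\ d | \Pi} \mu^-(d) \le \begin{cases} 1 & \text{if } (n, \Pi) = 1, \\ 0 & \text{if } (n, \Pi) > 1; \end{cases} \tag{4.1} \]
\[ \sum_{d | n,\ d | \Pi} \mu^+(d) \ge \begin{cases} 1 & \text{if } (n, \Pi) = 1, \\ 0 & \text{if } (n, \Pi) > 1, \end{cases} \tag{4.2} \]
so that
\[ \sum_{d | \Pi} \mu^-(d)\,\#\mathcal{A}_d = \sum_{n \in \mathcal{A}} \sum_{d | n,\ d | \Pi} \mu^-(d) \le S(\mathcal{A}, \wp, z) \le \sum_{n \in \mathcal{A}} \sum_{d | n,\ d | \Pi} \mu^+(d) = \sum_{d | \Pi} \mu^+(d)\,\#\mathcal{A}_d. \]
Writing $\#\mathcal{A}_d$ as $\#\mathcal{A}_d = \frac{\omega(d) X}{d} + R_d$ with $\omega(d)$ completely multiplicative, this gives
\[ S(\mathcal{A}, \wp, z) \le X \sum_{d | \Pi} \frac{\mu^+(d)\omega(d)}{d} + \sum_{d | \Pi} |\mu^+(d) R_d|. \tag{4.3} \]
The Selberg sieve arose out of an effort to minimize (4.3) subject to (4.2). The key idea is to replace $\mu^+(d)$ by a quadratic form, optimally chosen. We'll need the following lemmas.

Lemma 6. Let $\zeta > 0$ and $\{\lambda_i\}_{i \in \mathbb{N}} \subset \mathbb{R}$.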
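Then
\[ \sum_{\ell | \Pi,\ d | \ell,\ \ell < \zeta} \mu(\ell) y_{\ell} = \mu(d) \frac{\omega(d)\lambda_d}{d} \tag{4.4} \]
holds for all $d | \Pi$ with $d < \zeta$ if and only if
\[ y_{\ell} = \sum_{\delta | \Pi,\ \ell | \delta,\ \delta < \zeta} \frac{\omega(\delta)\lambda_{\delta}}{\delta} \]
for all $\ell < \zeta$, $\ell | \Pi$.

Proof. If $y_{\ell} = \sum_{\delta | \Pi,\ \ell | \delta,\ \delta < \zeta} \frac{\omega(\delta)\lambda_{\delta}}{\delta}$ for all $\ell < \zeta$, $\ell | \Pi$, we have that
\[ \sum_{\ell | \Pi,\ d | \ell,\ \ell < \zeta} \mu(\ell) y_{\ell} = \sum_{\ell | \Pi,\ d | \ell,\ \ell < \zeta} \mu(\ell) \sum_{\delta | \Pi,\ \ell | \delta,\ \delta < \zeta} \frac{\omega(\delta)\lambda_{\delta}}{\delta} = \sum_{\delta | \Pi,\ d | \delta,\ \delta < \zeta} \frac{\omega(\delta)\lambda_{\delta}}{\delta} \sum_{d | \ell,\ \ell | \delta} \mu(\ell) \]
\[ = \sum_{\delta | \Pi,\ d | \delta,\ \delta < \zeta} \frac{\omega(\delta)\lambda_{\delta}}{\delta} \sum_{md | \delta} \mu(md) = \mu(d) \sum_{\delta | \Pi,\ d | \delta,\ \delta < \zeta} \frac{\omega(\delta)\lambda_{\delta}}{\delta} \sum_{m | \frac{\delta}{d}} \mu(m) = \mu(d) \frac{\omega(d)\lambda_d}{d}. \]
Conversely, if (4.4) held for another $\{y'_{\ell}\}_{\ell < \zeta}$ with $\{y'_{\ell}\}_{\ell < \zeta} \ne \{y_{\ell}\}_{\ell < \zeta}$, then there would exist a maximal $\tilde{\ell} < \zeta$, $\tilde{\ell} | \Pi$ such that $y_{\tilde{\ell}} \ne y'_{\tilde{\ell}}$, and this is a contradiction since
\[ 0 = \sum_{\ell | \Pi,\ \tilde{\ell} | \ell,\ \ell < \zeta} \mu(\ell)(y_{\ell} - y'_{\ell}) = \mu(\tilde{\ell})(y_{\tilde{\ell}} - y'_{\tilde{\ell}}) \ne 0. \tag{4.5} \]

Lemma 7. Let $d | \Pi$ and $z, \zeta > 0$. For all $a | \Pi$, let
\[ G_a(\zeta, z) = \sum_{am | \Pi(\wp,z),\ m < \zeta} g(m), \]
with $g(m)$ the multiplicative arithmetic function defined by $g(m) = \frac{\omega(m)}{m} \prod_{p | m} \left( 1 - \frac{\omega(p)}{p} \right)^{-1}$. Then, if $0 \le \omega(p) < p$ for all $p \in \wp$, we have
\[ G_1(\zeta, z) \ge G_d\left( \frac{\zeta}{d}, z \right) \prod_{p | d} \left( 1 - \frac{\omega(p)}{p} \right)^{-1}. \]

Proof. We have that
\[ G_1(\zeta, z) = \sum_{m | \Pi(\wp,z),\ m < \zeta} g(m) = \sum_{\ell | d} \sum_{m | \Pi,\ (m,d) = \ell,\ m < \zeta} g(m) = \sum_{\ell | d} \sum_{\ell m' | \Pi,\ (m', \frac{d}{\ell}) = 1,\ \ell m' < \zeta} g(\ell m') \]
\[ = \sum_{\ell | d} g(\ell) \sum_{dm' | \Pi,\ m' < \frac{\zeta}{\ell}} g(m') \ge \sum_{\ell | d} g(\ell) \sum_{dm' | \Pi,\ m' < \frac{\zeta}{d}} g(m') = G_d\left( \frac{\zeta}{d}, z \right) \sum_{\ell | d} g(\ell), \]
since $g(m') \ge 0$. To conclude the proof it's enough to observe that
\[ \sum_{\ell | d} g(\ell) = \prod_{p | d} (1 + g(p)) = \prod_{p | d} \left( 1 + \frac{\omega(p)}{p} \left( 1 - \frac{\omega(p)}{p} \right)^{-1} \right) = \prod_{p | d} \left( 1 - \frac{\omega(p)}{p} \right)^{-1}. \]

We are now ready to prove the following

Theorem D (Fundamental theorem for Selberg sieve). Let $z > 0$, $y > 1$ and $\omega(d)$ a completely multiplicative arithmetic function such that
\[ 0 \le \omega(p) < p \qquad \forall p \in \wp \tag{$\Omega_1$} \]
and $\#\mathcal{A}_d = \frac{\omega(d) X}{d} + R_d$. Then
\[ S(\mathcal{A}, \wp, z) \le \frac{X}{G(\sqrt{y}, z)} + \sum_{d | \Pi(\wp,z),\ d < y} 3^{\nu(d)} |R_d|, \]
where
\[ \nu(d) = \sum_{p | d} 1, \qquad G(\sqrt{y}, z) = \sum_{\ell | \Pi(\wp,z),\ \ell < \sqrt{y}} g(\ell), \qquad g(\ell) = \frac{\omega(\ell)}{\ell} \prod_{p | \ell} \left( 1 - \frac{\omega(p)}{p} \right)^{-1}. \]

Proof. Let $\{\lambda_d\}_{d \in \mathbb{N}} \subset \mathbb{R}$ with $\lambda_1 = 1$ and define
\[ \mu^+(d) = \sum_{d_1, d_2,\ d = [d_1, d_2]} \lambda_{d_1} \lambda_{d_2}, \]
where $[a, b] = \frac{ab}{(a,b)}$ is the least common multiple of $a$ and $b$.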
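This choice of $\mu^+(d)$ satisfies the inequality (4.2); indeed
\[ \sum_{d | n,\ d | \Pi} \mu^+(d) = \sum_{[d_1, d_2] | (n,\Pi)} \lambda_{d_1} \lambda_{d_2} = \sum_{d_1, d_2 | (n,\Pi)} \lambda_{d_1} \lambda_{d_2} = \left( \sum_{d | (n,\Pi)} \lambda_d \right)^2 \ge 0 \]
and if $(n, \Pi) = 1$ then $\sum_{d | n,\ d | \Pi} \mu^+(d) = \mu^+(1) = \lambda_1^2 = 1$. Thus (4.3) holds, that is
\[ S(\mathcal{A}, \wp, z) \le X \sum_{d | \Pi} \frac{\mu^+(d)\omega(d)}{d} + \sum_{d | \Pi} |\mu^+(d) R_d| = XM + E, \]
say. Now, assume that $\lambda_d = 0$ for $d \ge \sqrt{y}$. As a consequence we have that $\mu^+(d) = 0$ for $d \ge y$. Thus
\[ M = \sum_{d | \Pi} \frac{\mu^+(d)\omega(d)}{d} = \sum_{[d_1, d_2] | \Pi} \lambda_{d_1} \lambda_{d_2} \frac{\omega([d_1, d_2])}{[d_1, d_2]} = \sum_{\substack{d_1, d_2 | \Pi,\ d_1, d_2 < \sqrt{y}, \\ \omega(d_1 d_2) \ne 0}} \frac{\omega(d_1)\lambda_{d_1}}{d_1} \frac{\omega(d_2)\lambda_{d_2}}{d_2} \frac{(d_1, d_2)}{\omega((d_1, d_2))}. \]
By condition ($\Omega_1$), we can define $g(k) = \frac{\omega(k)}{k} \prod_{p | k} \left( 1 - \frac{\omega(p)}{p} \right)^{-1} \ge 0$ and, if $\omega(k) \ne 0$ and $\mu(k) \ne 0$, we have
\[ \frac{1}{g(k)} = \frac{k}{\omega(k)} \prod_{p | k} \left( 1 - \frac{\omega(p)}{p} \right) = \frac{k}{\omega(k)} \sum_{\ell | k} \frac{\mu(\ell)\omega(\ell)}{\ell} = \sum_{\ell | k} \mu(\ell) \frac{k/\ell}{\omega(k/\ell)} = \sum_{\ell' | k} \mu\left( \frac{k}{\ell'} \right) \frac{\ell'}{\omega(\ell')} = \mu(k) \sum_{\ell' | k} \mu(\ell') \frac{\ell'}{\omega(\ell')}. \]
Therefore, by the Möbius inversion formula, if $\omega(d) \ne 0$ and $\mu(d) \ne 0$, we have $\frac{d}{\omega(d)} = \sum_{k | d} \frac{1}{g(k)}$. Thus
\[ M = \sum_{\substack{d_1, d_2 | \Pi,\ d_1, d_2 < \sqrt{y}, \\ \omega(d_1 d_2) \ne 0}} \frac{\omega(d_1)\lambda_{d_1}}{d_1} \frac{\omega(d_2)\lambda_{d_2}}{d_2} \frac{(d_1, d_2)}{\omega((d_1, d_2))} = \sum_{\substack{d_1, d_2 | \Pi,\ d_1, d_2 < \sqrt{y}, \\ \omega(d_1 d_2) \ne 0}} \frac{\omega(d_1)\lambda_{d_1}}{d_1} \frac{\omega(d_2)\lambda_{d_2}}{d_2} \sum_{k | d_1, d_2} \frac{1}{g(k)} \]
\[ = \sum_{\substack{\ell | \Pi,\ \ell < \sqrt{y}, \\ \omega(\ell) \ne 0}} \frac{1}{g(\ell)} \left( \sum_{d | \Pi,\ \ell | d,\ d < \sqrt{y}} \frac{\omega(d)\lambda_d}{d} \right)^2 = \sum_{\substack{\ell | \Pi,\ \ell < \sqrt{y}, \\ \omega(\ell) \ne 0}} \frac{y_{\ell}^2}{g(\ell)}, \tag{4.6} \]
say. Applying Lemma 6 with $d = 1$ and $\zeta = \sqrt{y}$, we get
\[ 1 = \sum_{\ell | \Pi,\ \ell < \sqrt{y}} \mu(\ell) y_{\ell} = \sum_{\substack{\ell | \Pi,\ \ell < \sqrt{y}, \\ \omega(\ell) \ne 0}} \mu(\ell) y_{\ell} = \sum_{\substack{\ell | \Pi,\ \ell < \sqrt{y}, \\ \omega(\ell) \ne 0}} \mu(\ell) \sqrt{g(\ell)} \cdot \frac{y_{\ell}}{\sqrt{g(\ell)}}. \]
So, by Cauchy's inequality, we obtain
\[ 1 \le \left( \sum_{\ell | \Pi,\ \ell < \sqrt{y}} \mu(\ell)^2 g(\ell) \right) \left( \sum_{\substack{\ell | \Pi,\ \ell < \sqrt{y}, \\ \omega(\ell) \ne 0}} \frac{y_{\ell}^2}{g(\ell)} \right) = G(\sqrt{y}, z)\, M, \]
since $\Pi$ is square-free and by (4.6). Therefore we have $M \ge \frac{1}{G(\sqrt{y}, z)}$ and equality holds if and only if equality holds in Cauchy's inequality or, equivalently, if there exists a constant $c$ such that
\[ \frac{y_{\ell}}{\sqrt{g(\ell)}} = c\,\mu(\ell) \sqrt{g(\ell)} \qquad \forall \ell | \Pi \text{ s.t. } \ell < \sqrt{y},\ \omega(\ell) \ne 0. \]
So, to obtain the best estimate, we have to choose $y_{\ell} = c\,\mu(\ell) g(\ell)$ and, if that holds, applying again Lemma 6 with $d = 1$ and $\zeta = \sqrt{y}$, we get
\[ 1 = \sum_{\ell | \Pi,\ \ell < \sqrt{y}} \mu(\ell) y_{\ell} = c \sum_{\ell | \Pi,\ \ell < \sqrt{y}} \mu(\ell)^2 g(\ell) = c\,G(\sqrt{y}, z). \]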
`|Π, √ `< y Thus to obtain the optimal estimate we have to find if there exist some λd such that √ µ(`)g(`) √ y` = G( for all ` < ζ, `|Π. So, applying lemma 6 with ζ = y, we find that the sought y,z) 23 λd exist and have to be λd = X µ(d)d X µ(d)d µ(`)y` = µ(`)2 g(`) √ ω(d) ω(d)G( y, z) `|Π,√ d|`, `< z `|Π,√ d|`, `< z −1 √ y µ(d) d g(d) X µ(d) Y ω(p) = g(j) = 1− Gd ,z , √ √ G( y, z) ω(d) G( y, z) p d dj|Π, √ j< dz (4.7) p|d using the notation of lemma 7. With this choice of λd we have M = √1 G( y,z) and (4.3) becomes S(A, wp, z) ≤ X X + |µ+ (d)Rd |. √ G( y, z) d|Π, d<y Therefore, to conclude it’s enough to observe that by (4.7) and lemma 7 we have |λd | ≤ 1 (since G1 (ζ, z) = G(ζ, z)) and so ν(d) X X X ν(d) a + |µ (d)| = λd1 λd2 ≤ 1= 2 = 3ν(d) , a d=[d1 ,d2 ] d=[d1 ,d2 ] a=0 for all square-free d. Theorem D can be used to obtain an upper bound for the function φ(x, z) = #{n ≤ x | p|n ⇒ p ≥ z}. To prove it we’ll need the following lemmas. Lemma 8. Let Hk (z) = X µ(`)2 , ϕ(`) (`,k)=1, `<z where ϕ(`) is the Euler’s φ function. Then Hk (z) ≥ ϕ(k) log z. k 24 Proof. Firstly we prove the statement for k = 1. We have that H1 (z) = X µ(`)2 `<z where κ(n) = Q p|n ϕ(`) h Y (pi − 1)−1 = X = X p1 ···ph <z, p1 <···<ph , αi ≥1 `=p1 ···ph <z, i=1 p1 <···<ph X 1 1 = , pα1 1 · · · pαh h n κ(n)<z p is the square-free kernel of n. Thus, H1 (z) = X 1 X1 ≥ ≥ log z. n n<z n κ(n)<z On the other hand, we have H1 (z) = X µ(n)2 `<z ϕ(n) = X X µ(n)2 X X = ϕ(n) 0 n<z, `|k `|k `=(n,k) = X µ(`)2 `|k ≤ ϕ(`) n <z/`, (n,z/`)=1 z X µ(n0 )2 X µ(`)2 = H k ϕ(n0 ) ϕ(`) ` 0 `|k (n ,k)=1, n0 <z/` X µ(`)2 `|k Y 1 k Hk (z) = Hk (z) = Hk (z) ϕ(`) p−1 ϕ(k) p|k and so Hk (z) ≥ ϕ(k) log z. k Lemma 9. For all h ∈ N we have X S1 = µ(d)2 hν(d) ≤ x (1 + log x)h , d≤x S2 = X µ(d)2 d≤x where ν(d) = P p|d µ(`n0 )2 ϕ(`n0 ) hν(d) ≤ (1 + log x)h , d 1. Proof. We have that S1 ≤ x µ(d)2 hν(d) = xS2 . 
Moreover, since a square-free $d$ has exactly $h^{\nu(d)}$ ordered factorizations $d = d_1 \cdots d_h$,
\[ S_2 = \sum_{d \le x} \frac{\mu(d)^2}{d} \sum_{d_1, \dots, d_h,\ d = d_1 \cdots d_h} 1 \le \sum_{d=1}^{\infty} \frac{1}{d} \sum_{d = d_1 \cdots d_h,\ d_i \le x} \mu(d_1)^2 \cdots \mu(d_h)^2 = \sum_{d_1, \dots, d_h \le x} \frac{\mu(d_1)^2}{d_1} \cdots \frac{\mu(d_h)^2}{d_h} = \left( \sum_{d \le x} \frac{\mu(d)^2}{d} \right)^h \le (1 + \log x)^h. \]

Remark 9.1. Using Perron's formula, one can prove that $S_1 \ll x (\log x)^{h-1}$. This improvement has no effect on our final result about $\phi(x, z)$; it would only force us to use an asymptotic inequality instead of an explicit one.

Now we are ready to prove the following

Corollary D.1. We have

i) $\phi(x, z) \le \frac{x}{\log z} + z^2 (1 + 2\log z)^3$,
ii) $\pi(x) \ll \frac{x}{\log x}$.

Proof. If we define $\mathcal{A} = \{n \in \mathbb{N} \mid n \le x\}$, we have that $\phi(x, z) = S(\mathcal{A}, \mathcal{P}, z)$. Moreover, we have
\[ \#\mathcal{A}_d = \frac{x\,\omega(d)}{d} + R_d, \]
with $\omega(d) = 1$ for all $d$ and $|R_d| \le 1$. Applying Theorem D with $y = z^2$ we have
\[ \phi(x, z) \le \frac{x}{G(z, z)} + \sum_{d | \Pi(\mathcal{P},z),\ d < z^2} 3^{\nu(d)} |R_d|, \]
where
\[ G(z, z) = \sum_{\ell | \Pi,\ \ell < z} \frac{\omega(\ell)}{\ell} \prod_{p | \ell} \left( 1 - \frac{\omega(p)}{p} \right)^{-1} = \sum_{\ell < z} \frac{\mu(\ell)^2}{\varphi(\ell)}. \]
Thus, applying Lemma 8 with $k = 1$, we find
\[ \phi(x, z) \le \frac{x}{\log z} + \sum_{d < z^2} 3^{\nu(d)} \mu(d)^2 \]
and so to obtain item i) it's enough to apply Lemma 9. To deduce item ii), we just have to observe that by item i) we have
\[ \pi(x) \le \phi(x, z) + \pi(z) \le \frac{x}{\log z} + O\left( z^2 (\log z)^3 + z \right) \]
and choose $z = \frac{x^{1/2}}{(\log x)^2}$.

Remark D.1.1. In the previous corollary we obtained a better estimate than the one we could obtain from Corollary A.2. This is due to the fact that the main terms of Theorems A and D are basically the same, but the error term of the Selberg sieve is much better than that of the sieve of Eratosthenes.

We can also use Theorem D to estimate
\[ \pi(x; k, a) = \#\{\text{primes } p \le x \mid p \equiv a \pmod k\} \]
for given coprime $a$ and $k$.

Corollary D.2. Let $\wp = \{\text{primes } p \le x \mid p \nmid k\}$ and let
\[ \omega(p) = \begin{cases} 1 & \text{if } p \nmid k, \\ 0 & \text{otherwise.} \end{cases} \]
Then
\[ S(\mathcal{A}, \wp, z) \le \frac{k}{\varphi(k)} \frac{X}{\log z} + \sum_{d | \Pi(\wp,z),\ (d,k)=1,\ d < z^2} 3^{\nu(d)} |R_d|. \]

Proof. Exercise.

Dirichlet's theorem on primes in arithmetic progressions ensures that $\pi(x; k, a)$ goes to infinity as $x \to \infty$ if $(k, a) = 1$ (otherwise it's clearly 0 or 1).
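As a purely illustrative check of this statement (not part of the original notes), one can tabulate $\pi(x; k, a)$ for every residue class $a$ modulo $k$ and see that the primes split roughly evenly among the $\varphi(k)$ classes coprime to $k$, while every other class contains at most one prime. The function names below are hypothetical.

```python
from math import gcd


def primes_up_to(n: int) -> list[int]:
    """All primes <= n (n >= 2), by the sieve of Eratosthenes."""
    flags = [True] * (n + 1)
    flags[0] = flags[1] = False
    for i in range(2, int(n ** 0.5) + 1):
        if flags[i]:
            for j in range(i * i, n + 1, i):
                flags[j] = False
    return [i for i, is_p in enumerate(flags) if is_p]


def euler_phi(k: int) -> int:
    """Euler's totient, computed directly from the definition."""
    return sum(1 for a in range(1, k + 1) if gcd(a, k) == 1)


def primes_by_class(x: int, k: int) -> list[int]:
    """counts[a] = pi(x; k, a) for every residue a in 0..k-1."""
    counts = [0] * k
    for p in primes_up_to(x):
        counts[p % k] += 1
    return counts


if __name__ == "__main__":
    x, k = 10**5, 12
    counts = primes_by_class(x, k)
    expected = sum(counts) / euler_phi(k)
    for a, c in enumerate(counts):
        tag = "coprime" if gcd(a, k) == 1 else "not coprime"
        print(f"a = {a:2d} ({tag}): pi(x; k, a) = {c}, pi(x)/phi(k) = {expected:.1f}")
```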
In fact, Dirichlet showed that (if $(k, a) = 1$) the primes $p \equiv a \pmod k$ have analytic density $\frac{1}{\varphi(k)}$, that is
\[ \lim_{s \to 1^+} \frac{\sum_{p \equiv a \pmod k} \frac{1}{p^s}}{\log \frac{1}{s-1}} = \frac{1}{\varphi(k)}, \]
which coincides with the arithmetic density
\[ \lim_{x \to \infty} \frac{\pi(x; k, a)}{\pi(x)} = \frac{1}{\varphi(k)} \]
(but be aware that the two statements aren't equivalent). More precisely, we have
\[ \pi(x; k, a) \sim \frac{1}{\varphi(k)} \frac{x}{\log x} \]
with an error term that's not uniform in $k$. Siegel and Walfisz proved the following result, which is uniform in $k$.

Theorem (Siegel-Walfisz). Let $(a, k) = 1$. For all $N > 0$ there exists a $c = c(N) > 0$ such that for any $k \le (\log x)^N$ we have
\[ \pi(x; k, a) = \frac{1}{\varphi(k)} \operatorname{li} x + O\left( x \exp\left( -c \sqrt{\log x} \right) \right), \]
uniformly in $k$, where $\operatorname{li} x := \int_2^x \frac{du}{\log u}$ is the logarithmic integral function. Moreover, if the generalized Riemann hypothesis holds, we have that, for any $k \le \frac{\sqrt{x}}{(\log x)^2}$,
\[ \pi(x; k, a) = \frac{1}{\varphi(k)} \operatorname{li} x + O\left( \sqrt{x} \log(kx) \right), \]
uniformly in $k$.

As a consequence of Theorem D, we can prove the following corollary, which gives an estimate for $\pi(x; k, a)$ that is weaker than the previous ones, but holds for a bigger range of $k$.

Corollary D.3 (Brun-Titchmarsh). Let $(a, k) = 1$ and $k \le \frac{x}{4}$. Then
\[ \pi(x; k, a) \le \frac{2x}{\varphi(k) \log \frac{x}{k}} + O\left( \frac{x \log\log \frac{x}{k}}{\varphi(k) \log^2 \frac{x}{k}} \right), \]
uniformly in $k$.

Proof. Let $\mathcal{A} = \{n \le x \mid n \equiv a \pmod k\}$ and $\wp = \{p \in \mathcal{P} \mid p \nmid k\}$. Then
\[ \pi(x; k, a) \le S(\mathcal{A}, \wp, z) + \frac{z}{k} + 1. \]
Moreover $\#\mathcal{A}_d = \frac{x}{k} \frac{\omega(d)}{d} + R_d$, where
\[ \omega(p) = \begin{cases} 1 & \text{if } p \nmid k, \\ 0 & \text{if } p \mid k \end{cases} \]
and $|R_d| \le 1$. Hence, by Corollary D.2 and Lemma 9, we have
\[ S(\mathcal{A}, \wp, z) \le \frac{k}{\varphi(k)} \frac{x}{k \log z} + \sum_{d | \Pi(\wp,z),\ (d,k)=1,\ d < z^2} 3^{\nu(d)} \mu(d)^2 = \frac{x}{\varphi(k) \log z} + O\left( z^2 (\log z)^3 \right). \]
Taking $z = \sqrt{\frac{x}{k}} \left( \log \frac{x}{k} \right)^{-\frac{5}{2}}$ we complete the proof.

Remark D.3.1. If we could replace 2 by $2 - \delta$ for some $\delta > 0$ in Corollary D.3, we would have as a consequence that Landau-Siegel zeros don't exist.

We now state the following theorem.

Theorem E (Bombieri-Vinogradov Theorem, 1965). For all $A > 0$, there exist $C = C(A) > 0$ and $B = B(A) > 0$ such that
\[ \sum_{k \le K} \max_{a \in (\mathbb{Z}/k\mathbb{Z})^*} \left| \pi(x; k, a) - \frac{\operatorname{li} x}{\varphi(k)} \right| \le C \frac{x}{(\log x)^A} \]
for $K = x^{\frac{1}{2}} (\log x)^{-B}$.

Proof.
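See Davenport, Multiplicative Number Theory (it's proved using the large sieve).

Combining Theorems D and E, we can study the "Titchmarsh divisor problem", that is to compute the order of the function
\[ S(x) = \sum_{p \le x} d(p + a), \]
for $a \in \mathbb{N}$ fixed and where $d(n) := \sum_{d | n} 1$. In 1930 Titchmarsh was able to prove that $S(x) = O(x)$. The following corollary goes beyond that estimate, providing the asymptotic behaviour of $S(x)$.

Corollary E.1. For all $a \in \mathbb{N}$, there exists $c > 0$ such that
\[ S(x) = cx + O\left( \frac{x \log\log x}{\log x} \right). \]

Proof. For all $n \in \mathbb{N}$ we have that
\[ d(n) = 2 \sum_{d | n,\ d \le \sqrt{n}} 1 - \delta(n), \]
where
\[ \delta(n) = \begin{cases} 1 & \text{if } n \text{ is a square,} \\ 0 & \text{otherwise.} \end{cases} \]
Thus
\[ S(x) = 2 \sum_{p \le x} \sum_{d | p+a,\ d \le \sqrt{p+a}} 1 - \sum_{p \le x} \delta(p + a) = 2 \sum_{d \le \sqrt{x}} \pi(x; d, -a) + O(\sqrt{x}) = 2 \sum_{d \le \sqrt{x},\ (a,d)=1} \pi(x; d, -a) + O(\sqrt{x}), \]
since $\sum_{p \le x} \delta(p + a) \le \sum_{n \le x} \delta(n + a) = O(\sqrt{x})$.

Now, let $A > 0$ and let $B = B(A) > 0$ be as in Theorem E. Write
\[ \sum_{d \le \sqrt{x},\ (a,d)=1} \pi(x; d, -a) = \sum_{\substack{d \le \sqrt{x} (\log x)^{-B}, \\ (a,d)=1}} \pi(x; d, -a) + \sum_{\substack{\sqrt{x} (\log x)^{-B} \le d \le \sqrt{x}, \\ (a,d)=1}} \pi(x; d, -a) = S_1(x) + S_2(x), \]
say. Theorem E implies that
\[ S_1(x) = \sum_{\substack{d \le \sqrt{x} (\log x)^{-B}, \\ (a,d)=1}} \frac{\operatorname{li} x}{\varphi(d)} + \sum_{\substack{d \le \sqrt{x} (\log x)^{-B}, \\ (a,d)=1}} \left( \pi(x; d, -a) - \frac{\operatorname{li} x}{\varphi(d)} \right) = \operatorname{li} x \sum_{\substack{d \le \sqrt{x} (\log x)^{-B}, \\ (a,d)=1}} \frac{1}{\varphi(d)} + O\left( \frac{x}{(\log x)^A} \right). \]
Moreover we have that
\[ \sum_{d < t,\ (a,d)=1} \frac{1}{\varphi(d)} = c \log t + O(1), \tag{4.8} \]
for some $c > 0$. Hence
\[ S_1(x) = \frac{c}{2} x + O\left( \frac{x \log\log x}{\log x} \right). \]
Finally, Corollary D.3 implies
\[ S_2(x) \ll \frac{x}{\log x} \sum_{\substack{\sqrt{x} (\log x)^{-B} \le d \le \sqrt{x}, \\ (a,d)=1}} \frac{1}{\varphi(d)} \ll \frac{x}{\log x} \log\log x, \]
by (4.8). Combining the estimates for $S_1(x)$ and $S_2(x)$ gives the claim.

Chapter 5
Sieve limitations

The optimization problem for the upper bound sieve requires minimising the functional
\[ \tilde{L}(\mu^+) := X \sum_{d | \Pi} \frac{\mu^+(d)\omega(d)}{d} + \sum_{d | \Pi(\wp,z)} |\mu^+(d) R_d|, \]
subject to
\[ \sum_{d | n,\ d | \Pi} \mu^+(d) \ge \begin{cases} 1 & \text{if } (n, \Pi) = 1, \\ 0 & \text{if } (n, \Pi) > 1. \end{cases} \]
This is almost a problem of linear programming.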
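To obtain a linear programming problem in standard form, we need to write $\mu^+ = \mu_1^+ - \mu_2^+$ with $\mu_i^+ \ge 0$ and try to minimize the linear functional
\[ L(\mu^+) := X \sum_{d | \Pi(\wp,z)} \frac{\omega(d)}{d} \left( \mu_1^+(d) - \mu_2^+(d) \right) + \sum_{d | \Pi} \left( \mu_1^+(d) + \mu_2^+(d) \right) |R_d|, \]
subject to
\[ \sum_{d | n,\ d | \Pi} \left( \mu_1^+(d) - \mu_2^+(d) \right) \ge \begin{cases} 1 & \text{if } (n, \Pi) = 1, \\ 0 & \text{if } (n, \Pi) > 1 \end{cases} \]
and
\[ \mu_1^+(d) \ge 0, \qquad \mu_2^+(d) \ge 0. \]
Now, define
\[ c = \left( (-1)^{k-1} X \frac{\omega(d)}{d} + |R_d| \right)_{d | \Pi,\ k = 1, 2}, \qquad x = \left( \mu_k^+(d) \right)_{d | \Pi,\ k = 1, 2}, \qquad b = \left( \delta_{1,(n,\Pi)} \right)_{n \in \mathcal{A}}, \]
where $\delta_{i,j}$ is Kronecker's delta, and
\[ A_{n;\,d,k} = \begin{cases} (-1)^{k-1} & \text{if } d | n, \\ 0 & \text{otherwise,} \end{cases} \]
with $n \in \mathcal{A}$, $d | \Pi$ and $k = 1, 2$. Then, what we are trying to minimize is $c^T x$, under the conditions $Ax \ge b$ and $x \ge 0$. The dual problem is to maximize $y^T b$, subject to $y \ge 0$ and $y^T A \le c^T$. Note that, if the conditions $Ax \ge b$, $x \ge 0$, $y^T A \le c^T$ and $y \ge 0$ hold, we have that
\[ c^T x \ge y^T A x \ge y^T b. \tag{5.1} \]
Moreover, the strong duality theorem ensures that there exist $x$ and $y$ such that equality holds in (5.1), and clearly those vectors are solutions of the linear programming problem and of its dual. Thus, by tackling the dual problem, we can obtain information about the best upper bound that can be obtained through sieve methods.

Now, in this case the dual problem is maximizing the function
\[ J(y) = \sum_{n \in \mathcal{A},\ (n,\Pi)=1} y_n, \]
under the conditions
\[ X \frac{\omega(d)}{d} - |R_d| \le \sum_{n \in \mathcal{A},\ d | n} y_n \le X \frac{\omega(d)}{d} + |R_d|, \qquad y_n \ge 0. \]
Note that, taking $y_n = 1$ for any $n$, we obtain $J(y) = S(\mathcal{A}, \wp, z)$. Moreover, for any subset $\tilde{\mathcal{A}} \subset \mathcal{A}$ such that
\[ \left| \#\tilde{\mathcal{A}}_d - X \frac{\omega(d)}{d} \right| \le |R_d|, \tag{5.2} \]
taking $y_n = 1$ if $n \in \tilde{\mathcal{A}}$, 0 otherwise, we find $J(y) = S(\tilde{\mathcal{A}}, \wp, z)$. Thus, for any $\tilde{\mathcal{A}} \subset \mathcal{A}$ satisfying (5.2), we have
\[ L(\mu^+) \ge S(\tilde{\mathcal{A}}, \wp, z) \]
and it's easy to show that we can drop the condition $\tilde{\mathcal{A}} \subset \mathcal{A}$.

We now give an example where the upper bound given by the Selberg sieve is optimal. Let $\Omega(n)$ be the number of prime factors of $n$ counted with multiplicity and let $\lambda(n) = (-1)^{\Omega(n)}$ be the Liouville function. Set
\[ \mathcal{A}^{\pm} = \{n \in \mathbb{N} \mid n \le x,\ \lambda(n) = \mp 1\}. \]
Now,
\[ S(\mathcal{A}^+, \mathcal{P}, z) = \#\{n \in \mathcal{A}^+ \mid p|n \Rightarrow p \ge z\} = \#\{n \le x \mid \lambda(n) = -1,\ p|n \Rightarrow p \ge z\}. \]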
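Clearly, if $z > x^{\frac{1}{3}}$ we have that
\[ S(\mathcal{A}^+, \mathcal{P}, z) = \pi(x) - \pi(z) + O(1) = \frac{x}{\log x} + O\left( \frac{x}{\log^2 x} \right) + O(z), \]
since an integer $n \le x$ with $\lambda(n) = -1$ and all prime factors at least $z$ must be a prime. We now want to find an upper bound for $S(\mathcal{A}^+, \mathcal{P}, z)$ using the Selberg sieve. We need the following lemma.

Lemma 10. Let $\Lambda(x) = \sum_{n \le x} \lambda(n)$. Then there exists $c > 0$ such that
\[ \Lambda(x) \ll E_c(x), \]
where $E_c(x) = x \exp(-c \sqrt{\log x})$.

Proof. Let's consider the Mertens function $M(x) = \sum_{n \le x} \mu(n)$. It's well known that $\sum_{n=1}^{\infty} \frac{\mu(n)}{n^s} = \frac{1}{\zeta(s)}$. Moreover, using Perron's formula, if $x$ isn't an integer we have that
\[ M(x) = \frac{1}{2\pi i} \int_{k - i\infty}^{k + i\infty} \frac{1}{\zeta(s)} \frac{x^s}{s}\,ds, \]
for any $k > 1$. Using Cauchy's theorem and the zero-free region for $\zeta(s)$, one can prove that $M(x) = O(E_c(x))$. Moreover, note that if $n/\ell^2$ is square-free, then we have $\lambda(n) = \mu(n/\ell^2) = \sum_{d^2 | n} \mu(n/d^2)$. Hence
\[ \Lambda(x) = \sum_{n \le x} \sum_{d^2 | n} \mu(n/d^2) = \sum_{d \le \sqrt{x}} \sum_{m \le \frac{x}{d^2}} \mu(m) \ll E_c(x). \]

Remark 10.1. Note that the Riemann hypothesis is true if and only if $M(x) = O(x^{\frac{1}{2} + \varepsilon})$ for any $\varepsilon > 0$, and if and only if $\Lambda(x) = O(x^{\frac{1}{2} + \varepsilon})$ for any $\varepsilon > 0$.

Now, let's go back to our sets $\mathcal{A}^{\pm}$. We have that
\[ \#\mathcal{A}_d^{\pm} = \#\{n \le x \mid \lambda(n) = \mp 1,\ d | n\} = \#\left\{ m \le \frac{x}{d} \ \middle|\ \lambda(m) = \mp\lambda(d) \right\}. \]
Observe that if $\lambda(m) = \mp\lambda(d)$, then $1 \mp \lambda(d)\lambda(m) = 2$, and it is 0 otherwise. Thus
\[ \#\mathcal{A}_d^{\pm} = \frac{1}{2} \sum_{m \le \frac{x}{d}} \left( 1 \mp \lambda(d)\lambda(m) \right) = \frac{1}{2} \left[ \frac{x}{d} \right] \mp \frac{\lambda(d)}{2} \Lambda\left( \frac{x}{d} \right) = \frac{1}{2} \frac{x}{d} + O\left( E_c\left( \frac{x}{d} \right) \right), \]
by Lemma 10 (and if $d \le x$). Therefore, we have to take $X = \frac{x}{2}$ and $\omega(d) = 1$ for all $d$. Applying Theorem D to this problem we obtain the remainder term
\[ \sum_{d | \Pi,\ d < y} 3^{\nu(d)} |R_d| \ll \sum_{d < y} \mu(d)^2 3^{\nu(d)} E_c\left( \frac{x}{d} \right) \le \left( \sum_{d < y} \mu(d)^2 \frac{9^{\nu(d)}}{d} \right)^{\frac{1}{2}} \left( \sum_{d < y} \mu(d)^2\, d\, E_c\left( \frac{x}{d} \right)^2 \right)^{\frac{1}{2}}, \]
where we used Cauchy's inequality and we are assuming $y < x$. Now, by Lemma 9, we have
\[ \sum_{d < y} \mu(d)^2 \frac{9^{\nu(d)}}{d} \ll \log^9 y. \]
Moreover, we have
\[ \sum_{d < y} \mu(d)^2\, d\, E_c\left( \frac{x}{d} \right)^2 \ll x^2 \sum_{d < y} \frac{1}{d} \exp\left( -2c \sqrt{\log(x/y)} \right) \ll x^2 \log y \exp\left( -2c \sqrt{\log(x/y)} \right). \]
Thus
\[ \sum_{d | \Pi,\ d < y} 3^{\nu(d)} |R_d| \ll x \log^5 y \exp\left( -c \sqrt{\log(x/y)} \right) \]
and taking $y \le E_1(x)$, we find
\[ \sum_{d | \Pi,\ d < y} 3^{\nu(d)} |R_d| \ll x \log^5 x \exp\left( -c \sqrt[4]{\log x} \right) \ll \frac{x}{\log^2 x}. \]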
log2 x 1 Therefore, taking y ≤ E1 (x) and z ≥ y 2 , theorem D gives us X x + , S(A , P, z) ≤ +O √ G( y, z) log2 x where −1 X ω(`) Y X µ(`)2 1 ω(p) √ √ √ G( y, z) ≥ G( y, y) = ≥ log y, 1− = ` p 2 √ ϕ(`) `|Π, √ `< y `< y p|` by lemma 8. Thus x S(A , P, z) ≤ +O log y + x log2 x . Taking y = E1 (x), we find x S(A , P, z) ≤ +O log x ! x + 3 (log x) 2 and so, since we already knew that x +O S(A , P, z) = log x + x log2 x + O(z) 1 for z > x 3 , we have that with Selberg sieve we are able to prove an optimal upper bound 1 for x 2 ≤ z ≤ x . log2 x Therefore Selberg’s coefficients µ+ are optimal solutions to the mini- mization problem for L(µ+ ) and, correspondly, A+ is optimal for the dual problem. We now turn to the lower bound sieve problem. It’s clear that also this problem can be expressed as a linear programming problem with the new condition ( X 1 if (n, Π) = 1, µ− (d) ≤ 0 otherwise. d|n, d|Π 35 Obviously, the choice µ− (d) = 0 for all d satisfies this condition, and the corresponding inequality is S(A, ℘, z) ≥ X X µ− (d)ω(d) d|Π d − X µ− (d)Rd = 0. d|Π 1 Now, for A = A− and z > x 2 , we have that S(A− , P, z) = 1, so the coefficients µ− (d) = 0 are essentially optimal for our linear programming problem and thus so is A− for the dual problem. In particular, since A− and A+ have the same inputs, ω(d), X and O(Rd ), it 1 is not possible for Sieve machinery to distinguish them. Therefore, for x 2 < z < logx2 x , we can’t prove that S(A+ , P, z) x log x through sieve methods and thus that π(x) x . log x This problem is due to the fact that integers n with 2|Ω(n) are seen by sieves as same as integers n with 2 - Ω(n). This phenomenon is known as “Parity problem” and it’s a big limitation for sieve methods. To tackle this kind of problems is therefore necessary to insert some other machinery that doesn’t come from sieve methods. 
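The parity phenomenon described above can be observed numerically. The following is only a small illustrative sketch, not part of the lectures: the function names (`sifted_counts`, `congruence_counts`) and the sample values of x, z and d are assumptions chosen for the demonstration. It counts the sifted elements of A+ (λ(n) = −1) and A− (λ(n) = +1) by brute force, and also the number of multiples of d in each set, which is the only information a sieve sees.

```python
from math import isqrt


def smallest_prime_factors(x):
    """Return spf with spf[n] = smallest prime factor of n for 2 <= n <= x."""
    spf = list(range(x + 1))
    for p in range(2, isqrt(x) + 1):
        if spf[p] == p:  # p is prime
            for m in range(p * p, x + 1, p):
                if spf[m] == m:
                    spf[m] = p
    return spf


def big_omega(x, spf):
    """Return omega with omega[n] = number of prime factors of n with multiplicity."""
    omega = [0] * (x + 1)
    for n in range(2, x + 1):
        omega[n] = omega[n // spf[n]] + 1
    return omega


def sifted_counts(x, z):
    """Brute-force (S(A+, P, z), S(A-, P, z)): n <= x with all prime factors >= z."""
    spf = smallest_prime_factors(x)
    omega = big_omega(x, spf)
    s_plus = s_minus = 0
    for n in range(1, x + 1):
        if n > 1 and spf[n] < z:
            continue  # n has a prime factor below z, so it is sifted out
        if omega[n] % 2 == 1:
            s_plus += 1   # lambda(n) = -1, i.e. n belongs to A+
        else:
            s_minus += 1  # lambda(n) = +1, i.e. n belongs to A-
    return s_plus, s_minus


def congruence_counts(x, d):
    """Return (#A+_d, #A-_d): multiples of d up to x, split by the parity of Omega."""
    spf = smallest_prime_factors(x)
    omega = big_omega(x, spf)
    a_plus = sum(1 for n in range(d, x + 1, d) if omega[n] % 2 == 1)
    a_minus = sum(1 for n in range(d, x + 1, d) if omega[n] % 2 == 0)
    return a_plus, a_minus


if __name__ == "__main__":
    x, z = 10_000, 101  # sample values with z > sqrt(x)
    print("S(A+), S(A-):", sifted_counts(x, z))
    for d in (2, 3, 5, 7):
        print("d =", d, "#A+_d, #A-_d:", congruence_counts(x, d), "x/(2d) =", x / (2 * d))
```

For z > x^{1/2} the first count consists of the primes in [z, x] and the second only of n = 1, while the congruence counts of the two sets both stay close to x/(2d).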
Chapter 6

Small gaps between primes

As a consequence of the prime number theorem (stating that $\pi(x) = \#\{p \le x\} \sim \frac{x}{\log x}$), one can prove that

\[ \sum_{n=1}^{N} \frac{p_{n+1} - p_n}{\log p_n} \sim N \]

and thus

\[ \liminf_{n \to \infty} \frac{p_{n+1} - p_n}{\log p_n} \le 1 \le \limsup_{n \to \infty} \frac{p_{n+1} - p_n}{\log p_n}. \]

The twin prime conjecture, which says that there are infinitely many primes $p$ such that $p + 2$ is also prime, suggests that the much weaker statement

\[ \liminf_{n \to \infty} \frac{p_{n+1} - p_n}{\log p_n} = 0 \]

is true. Hardy and Littlewood were the first to obtain some results in this direction. In 1926, they proved that

\[ \liminf_{n \to \infty} \frac{p_{n+1} - p_n}{\log p_n} \le \frac{2}{3} \]

under the generalized Riemann hypothesis. Further progress has been made over the years, one of the last steps being Maier's proof (1988) of

\[ \liminf_{n \to \infty} \frac{p_{n+1} - p_n}{\log p_n} < \frac{1}{4}. \]

Finally, in 2005 Goldston, Pintz and Yildirim managed to prove that this lim inf is 0, together with other results towards the twin prime conjecture. The following are results they were able to obtain.

Theorem 1. We have

\[ \liminf_{n \to \infty} \frac{p_{n+1} - p_n}{\log p_n} = 0. \]

Theorem 2. We have

\[ \liminf_{n \to \infty} \frac{p_{n+1} - p_n}{(\log p_n)^{1/2} (\log\log p_n)^2} < \infty. \]

Moreover, denote by BV(θ) the following statement:

\[ \sum_{q \le x^{\theta}} \max_{(a,q)=1} \max_{y \le x} \Bigl| \sum_{n \le y,\ n \equiv a \ (\mathrm{mod}\ q)} \Lambda(n) - \frac{y}{\varphi(q)} \Bigr| \ll \frac{x}{\log^A x} \quad \text{for any } A > 0. \tag{BV(θ)} \]

Note that the Bombieri-Vinogradov Theorem (Theorem E) states that BV(θ) holds for any $\theta < \frac{1}{2}$ and that the Elliott-Halberstam conjecture implies that BV(θ) holds for any $\theta < 1$. The following are conditional results proved by Goldston, Pintz and Yildirim.

Theorem 3. If BV(θ) holds for some $\theta > \frac{1}{2}$, then

\[ \liminf_{n \to \infty}\, (p_{n+1} - p_n) < \infty. \]

Theorem 4. If BV(θ) holds for all $\theta < 1$, then

\[ \liminf_{n \to \infty}\, (p_{n+1} - p_n) \le 20. \]

Theorem 5. If BV(θ) holds for all $\theta < 1$, then

\[ \liminf_{n \to \infty}\, (p_{n+1} - p_n) \le 16. \]

We are now going to prove the first 4 theorems. The fifth can be obtained in a similar way with some refinements.

Let $H > 0$ and $\mathcal{H} \subseteq [0, H] \cap \mathbb{Z}$. The set $\mathcal{H}$ is said to be admissible if for any prime $p$ there exists $n_p$ such that $p \nmid n_p + h$ for all $h \in \mathcal{H}$.
For example, $\{0, 2\}$ is admissible, but $\{0, 2, 4\}$ is not, since the condition fails for $p = 3$. Clearly, if a set $\mathcal{H}$ is not admissible, there are not infinitely many $n$ such that $n + h$ is prime for all $h \in \mathcal{H}$.

If we set $k = \#\mathcal{H}$, to verify that $\mathcal{H}$ is admissible it is enough to check the condition for all primes $p \le k$, since $k$ integers cannot occupy all the residue classes modulo a prime $p > k$. Thus any set $\mathcal{H} = \{p_1, \dots, p_k\}$ of primes with $k < p_1 < \dots < p_k$ is admissible (no element is divisible by a prime $p \le k$, so $n_p = 0$ works), and so we can find admissible sets of any cardinality.

Now, let $N \in \mathbb{N}$ and let $\mathcal{H}$ be an admissible set. Define

\[ S_0 := \sum_{N < n \le 2N} \Bigl( \sum_{h \in \mathcal{H},\ n+h \text{ prime}} \log(n + h) - \log 3N \Bigr). \]

Clearly, if we were able to prove that $S_0 > 0$ for infinitely many $N$, we would have that there exist infinitely many $n$ such that

\[ \sum_{h \in \mathcal{H},\ n+h \text{ prime}} \log(n + h) > \log 3N \]

and so that for infinitely many $n$ there exist at least two $h \in \mathcal{H}$ such that $n + h$ is prime. Unfortunately this is not the case, since $S_0 \to -\infty$ as $N \to \infty$. Thus, we try to sum only over those $n$ that are more likely to give more than one $h$ such that $n + h$ is prime. To do that, we use Selberg's idea: we multiply the summands by a square $\bigl(\sum_{d \mid n} \lambda_d\bigr)^2$, chosen so that it is essentially supported on almost primes. Therefore, we consider the sum

\[ S := \sum_{N < n \le 2N} \Bigl( \sum_{h \in \mathcal{H},\ n+h \text{ prime}} \log(n + h) - \log 3N \Bigr) \Bigl( \sum_{d \mid \prod_{h \in \mathcal{H}} (n+h)} \lambda_d \Bigr)^2. \]

The optimal values of the $\lambda_d$ are still not known in this context, and so we try to use the ones from Selberg's sieve, which are essentially

\[ \lambda_d \approx \mu(d) \Bigl( \frac{\log^+(\xi/d)}{\log \xi} \Bigr)^{\kappa}, \]

where

\[ \log^+ x := \begin{cases} \log x & \text{if } x \ge 1,\\ 0 & \text{if } x \in (0, 1), \end{cases} \]

and $\kappa$ is the dimension of the sieve. We take $\kappa \ge k = \#\mathcal{H}$ and we write $\kappa = k + \ell$, with $\ell \ge 0$. As before, if we are able to prove that $S > 0$ for infinitely many $N$, we will obtain that there exist at least two $h \in \mathcal{H}$ such that $n + h$ is prime for infinitely many $n$. Now, assume that $k$, $\ell$ and $\mathcal{H} \subset [0, H]$ are fixed and that $N^{1/10} \le \xi \le N$.
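The admissibility criterion stated above (it is enough to test the primes p ≤ k) translates directly into a finite check. The following is a minimal sketch under that criterion; the helper names `primes_up_to` and `is_admissible` are hypothetical and introduced only for illustration.

```python
def primes_up_to(n):
    """Return the list of primes p <= n (simple sieve of Eratosthenes)."""
    if n < 2:
        return []
    sieve = [True] * (n + 1)
    sieve[0] = sieve[1] = False
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            for m in range(p * p, n + 1, p):
                sieve[m] = False
    return [p for p in range(2, n + 1) if sieve[p]]


def is_admissible(hs):
    """Check that for every prime p the set hs misses some residue class mod p.

    Only primes p <= k = #hs need to be tested: k integers cannot cover
    all p residue classes when p > k.
    """
    hs = set(hs)
    for p in primes_up_to(len(hs)):
        residues = {h % p for h in hs}
        if len(residues) == p:  # every class mod p is hit, so no n_p exists
            return False
    return True


if __name__ == "__main__":
    print(is_admissible({0, 2}))       # the admissible pair from the text
    print(is_admissible({0, 2, 4}))    # fails at p = 3
    print(is_admissible({11, 13, 17, 19, 23, 29, 31}))  # seven primes larger than 7
```

If a missed class a modulo p is found, one may take n_p ≡ −a (mod p) in the definition.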
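Since the $\lambda_d$ are supported on $[0, \xi]$, we can write $S$ in the form

\[ S = \sum_{N < n \le 2N} \Bigl( \sum_{h \in \mathcal{H},\ n+h \text{ prime}} \log(n+h) - \log 3N \Bigr) \sum_{d_1, d_2 \mid \prod_{h \in \mathcal{H}} (n+h)} \lambda_{d_1} \lambda_{d_2} = \sum_{D \le \xi^2} \Lambda_D \sum_{N < n \le 2N,\ D \mid \prod_{h \in \mathcal{H}} (n+h)} \Bigl( \sum_{h \in \mathcal{H},\ n+h \text{ prime}} \log(n+h) - \log 3N \Bigr), \]

where

\[ \Lambda_D := \sum_{d_1, d_2 \le \xi,\ [d_1, d_2] = D} \lambda_{d_1} \lambda_{d_2}. \]

Since $|\lambda_d| \le 1$, we have that

\[ |\Lambda_D| \le \#\{d_1, d_2 \mid [d_1, d_2] = D\} \le d(D)^2. \]

Moreover, $\lambda_d = 0$ unless $d$ is square-free, and so the same holds for $\Lambda_D$. Let us fix an $h_0 \in \mathcal{H}$ and let $\mathcal{H}_0 = \mathcal{H} \setminus \{h_0\}$. We have

\[ S(h_0) := \sum_{D \le \xi^2} \Lambda_D \sum_{N < n \le 2N,\ n + h_0 \text{ prime},\ D \mid \prod_{h \in \mathcal{H}} (n+h)} \log(n + h_0) = \sum_{D \le \xi^2} \Lambda_D \sum_{N + h_0 < p \le 2N + h_0,\ D \mid \prod_{h \in \mathcal{H}} (p - h_0 + h)} \log p = \sum_{D \le \xi^2} \Lambda_D \sum_{N + h_0 < p \le 2N + h_0,\ D \mid \prod_{h \in \mathcal{H}_0} (p - h_0 + h)} \log p, \]

since $(D, p) = 1$: indeed $D = [d_1, d_2]$ with $d_1, d_2 \le \xi \le N < p$. Now, let $a_1, \dots, a_{\nu_0(D)}$ be the classes in the set

\[ \Bigl\{ x \ (\mathrm{mod}\ D) \ \Bigm|\ \prod_{h \in \mathcal{H}_0} (x - h_0 + h) \equiv 0 \ (\mathrm{mod}\ D) \Bigr\} \]

that are coprime to $D$. If $D$ is prime we have $\nu_0(D) \le \#\mathcal{H}_0 \le k - 1 \le 2^k = d(D)^k$, and $\nu_0(D)$ is clearly multiplicative, so $\nu_0(D) \le d(D)^k$ holds for all square-free $D$. Moreover, we have

\[ S(h_0) = \sum_{D \le \xi^2} \Lambda_D \sum_{j=1}^{\nu_0(D)} \sum_{N + h_0 < p \le 2N + h_0,\ p \equiv a_j \ (\mathrm{mod}\ D)} \log p. \]

If we define

\[ \Delta(y, q, a) = \sum_{p \le y,\ p \equiv a \ (\mathrm{mod}\ q)} \log p - \frac{y}{\varphi(q)}, \]

we have that

\[ S(h_0) = \sum_{D \le \xi^2} \Lambda_D \sum_{j=1}^{\nu_0(D)} \Bigl( \frac{N}{\varphi(D)} + O(\Delta(2N + h_0, D, a_j)) + O(\Delta(N + h_0, D, a_j)) \Bigr) \]
\[ = N \sum_{D \le \xi^2} \frac{\Lambda_D \nu_0(D)}{\varphi(D)} + O\Bigl( \sum_{D \le \xi^2} |\Lambda_D|\, \nu_0(D) \max_{(a,D)=1} \max_{y \le 3N} |\Delta(y, D, a)| \Bigr) \]
\[ = N \sum_{D \le \xi^2} \frac{\Lambda_D \nu_0(D)}{\varphi(D)} + O\Bigl( \sum_{D \le \xi^2} d(D)^{2+k} \max_{(a,D)=1} \max_{y \le 3N} |\Delta(y, D, a)| \Bigr). \tag{6.1} \]

We now need the following lemma.

Lemma 11. If BV(θ) holds, then for all $t \in \mathbb{N}$ and $A > 0$ we have

\[ \sum_{q \le x^{\theta}} d(q)^t \max_{(a,q)=1} \max_{y \le x} |\Delta(y, q, a)| \ll \frac{x}{(\log x)^A}. \]

Remark 11.1. The previous lemma is in some sense surprising, since $d(q)$ can be greater than any fixed power of $\log q$.

Now, assume that BV(θ) holds. Applying lemma 11 with $x = 3N$ and $\xi = (3N)^{\theta/2}$, from (6.1) we have that

\[ S(h_0) = N \sum_{D \le \xi^2} \frac{\Lambda_D \nu_0(D)}{\varphi(D)} + O\bigl( N (\log N)^{-A} \bigr). \]

For any prime $p$, it is easy to show that

\[ \nu_0(p) = \#\{n \ (\mathrm{mod}\ p) \mid \exists h \in \mathcal{H},\ h \equiv n \ (\mathrm{mod}\ p)\} - 1 = \nu(p) - 1, \]

say.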
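Thus $\nu_0(p)$ is independent of $h_0$ and so

\[ \sum_{h_0 \in \mathcal{H}} S(h_0) = \sum_{D \le \xi^2} \Lambda_D \sum_{N < n \le 2N,\ D \mid \prod_{h \in \mathcal{H}} (n+h)}\ \sum_{h_0 \in \mathcal{H},\ n + h_0 \text{ prime}} \log(n + h_0) = kN \sum_{D \le \xi^2} \frac{\Lambda_D \nu_0(D)}{\varphi(D)} + O\bigl( N (\log N)^{-A} \bigr). \]

With a similar but easier argument, using that the number of $n \in (N, 2N]$ with $D \mid \prod_{h \in \mathcal{H}} (n+h)$ is $N\nu(D)/D + O(\nu(D))$, one can prove that

\[ \sum_{D \le \xi^2} \Lambda_D \sum_{N < n \le 2N,\ D \mid \prod_{h \in \mathcal{H}} (n+h)} \log 3N = N (\log 3N) \sum_{D \le \xi^2} \frac{\Lambda_D \nu(D)}{D} + O\bigl( N (\log N)^{-A} \bigr), \]

with the main difference that this time we do not need to assume that BV(θ) holds and we can take $\xi = (3N)^{\theta/2}$ for all $\theta < 1$. We need the following two lemmas.

Lemma 12. We have

\[ \sum_{D \le \xi^2} \frac{\Lambda_D \nu_0(D)}{\varphi(D)} = \mathfrak{S}(\mathcal{H}) \Bigl( \frac{(k+\ell)!}{(\ell+1)!} \Bigr)^2 \frac{(2\ell+2)!}{(k+2\ell+1)!} (\log \xi)^{-k+1} + O\bigl( (\log \xi)^{-k+\frac{1}{2}} \bigr), \]

where

\[ \mathfrak{S}(\mathcal{H}) = \prod_p \frac{1 - \frac{\nu(p)}{p}}{\bigl(1 - \frac{1}{p}\bigr)^k}. \]

Remark 12.1. Note that $\nu(p) = \#\{n \ (\mathrm{mod}\ p) \mid \exists h \in \mathcal{H},\ n \equiv h \ (\mathrm{mod}\ p)\} = k$ if $p > H$. Thus

\[ \prod_{p > H} \frac{1 - \frac{\nu(p)}{p}}{\bigl(1 - \frac{1}{p}\bigr)^k} = \prod_{p > H} \frac{1 - \frac{k}{p}}{1 - \frac{k}{p} + \cdots} = \prod_{p > H} \Bigl( 1 + O\Bigl( \frac{1}{p^2} \Bigr) \Bigr) < \infty. \]

Moreover we have that $\mathfrak{S}(\mathcal{H}) > 0$ iff $\nu(p) < p$ for all $p$ iff $\mathcal{H}$ is admissible.

Lemma 13. We have

\[ \sum_{D \le \xi^2} \frac{\Lambda_D \nu(D)}{D} = \mathfrak{S}(\mathcal{H}) \Bigl( \frac{(k+\ell)!}{\ell!} \Bigr)^2 \frac{(2\ell)!}{(k+2\ell)!} (\log \xi)^{-k} + O\bigl( (\log \xi)^{-k-\frac{1}{2}} \bigr). \]

Applying the previous lemmas to $S = \sum_{h_0 \in \mathcal{H}} S(h_0) - \log 3N \sum_{D \le \xi^2} \Lambda_D\, \#\{N < n \le 2N : D \mid \prod_{h \in \mathcal{H}} (n+h)\}$ we obtain

\[ S = N\, \mathfrak{S}(\mathcal{H}) \Bigl( \frac{(k+\ell)!}{(\ell+1)!} \Bigr)^2 \frac{(2\ell)!}{(k+2\ell+1)!} (\log \xi)^{-k} \Bigl( k(2\ell+1)(2\ell+2) \log \xi - (\ell+1)^2 (k+2\ell+1) \log 3N \Bigr) + O\bigl( N (\log N)^{-k+\frac{1}{2}} \bigr) \]

and, since we chose $\xi = (3N)^{\theta/2}$, we have

\[ S = N\, \mathfrak{S}(\mathcal{H})\, (\ell+1) \Bigl( \frac{(k+\ell)!}{(\ell+1)!} \Bigr)^2 \frac{(2\ell)!}{(k+2\ell+1)!} \Bigl( \frac{\theta}{2} \Bigr)^{-k} (\log 3N)^{-k+1} \Bigl( 2k(2\ell+1) \frac{\theta}{2} - (\ell+1)(k+2\ell+1) \Bigr) + O\bigl( N (\log N)^{-k+\frac{1}{2}} \bigr). \tag{6.2} \]

We are now ready to prove theorems 3 and 4.

Proofs of theorems 3 and 4. Suppose that BV(θ) holds for some $\theta > \frac{1}{2}$. Then, taking $k = \ell^2$, from (6.2) we have that $S > 0$ for $N$ large enough if

\[ \theta > \frac{\ell + 1}{2\ell + 1} \cdot \frac{\ell^2 + 2\ell + 1}{\ell^2}, \]

and since $\theta > \frac{1}{2}$ and the right-hand side tends to $\frac{1}{2}$ as $\ell \to \infty$, it is certainly possible to find an $\ell$ that satisfies this inequality. This completes the proof of theorem 3.

Now, suppose that BV(θ) holds for some $\frac{20}{21} < \theta < 1$. The set $\mathcal{H} = \{11, 13, 17, 19, 23, 29, 31\}$ is admissible (with $k = 7$) and, taking $\ell = 1$, (6.2) implies that $S > 0$ for $N$ large enough. Since any two elements of $\mathcal{H}$ differ by at most 20, this proves theorem 4.

To prove theorems 1 and 2 we need to modify our arguments slightly.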
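The positivity condition extracted from (6.2) in the proofs of theorems 3 and 4 is easy to tabulate. The following is a small sketch, assuming only that the main term of (6.2) is positive exactly when θ k(2ℓ+1) > (ℓ+1)(k+2ℓ+1); the function names `theta_threshold` and `smallest_l` are hypothetical helpers, and exact rational arithmetic is used to avoid rounding issues.

```python
from fractions import Fraction


def theta_threshold(k, l):
    """Value that theta must exceed for the main term of (6.2) to be positive.

    Obtained by solving theta * k * (2l + 1) > (l + 1) * (k + 2l + 1) for theta.
    """
    return Fraction((l + 1) * (k + 2 * l + 1), k * (2 * l + 1))


def smallest_l(theta, max_l=10_000):
    """Smallest l >= 1 such that the choice k = l^2 works for the given theta > 1/2.

    Returns None if no l <= max_l works (max_l is an arbitrary search cap).
    """
    for l in range(1, max_l + 1):
        if theta_threshold(l * l, l) < theta:
            return l
    return None


if __name__ == "__main__":
    # Theorem 4 uses k = 7, l = 1: the threshold is 20/21.
    print(theta_threshold(7, 1))
    # With k = l^2 the threshold (l+1)^3 / ((2l+1) l^2) decreases towards 1/2.
    for l in (1, 10, 100, 1000):
        print(l, float(theta_threshold(l * l, l)))
    print(smallest_l(Fraction(3, 5)))
```

With k = ℓ² the threshold stays strictly above 1/2 for every ℓ, which is why theorem 3 needs BV(θ) for some θ > 1/2 and the Bombieri-Vinogradov theorem alone is not enough for this argument.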
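Let

\[ S' := \sum_{N < n \le 2N} \Bigl( \sum_{h = 1,\ n+h \text{ prime}}^{H} \log(n + h) - \log 3N \Bigr) \Bigl( \sum_{d \mid \prod_{h \in \mathcal{H}} (n+h)} \lambda_d \Bigr)^2. \]

As before, if $S' > 0$, we can find two primes $p, p' > N$ such that $|p - p'| < H$.

Now, suppose that BV(θ) holds and take again $\xi = (3N)^{\theta/2}$. Proceeding in a similar way as for $S(h_0)$ with $h_0 \in \mathcal{H}$, one can prove that the contribution to $S'$ of an $h_0 \notin \mathcal{H}$ is

\[ N \sum_{D \le \xi^2} \frac{\Lambda_D \nu_0'(D)}{\varphi(D)} + O\bigl( N (\log N)^{-A} \bigr), \]

where

\[ \nu_0'(D) = \#\Bigl\{ x \ (\mathrm{mod}\ D) \ \Bigm|\ (x, D) = 1,\ \prod_{h \in \mathcal{H}} (x - h_0 + h) \equiv 0 \ (\mathrm{mod}\ D) \Bigr\}. \]

We now need the analogue of lemmas 12 and 13 for $\nu_0'(D)$.

Lemma 14. We have

\[ \sum_{D \le \xi^2} \frac{\Lambda_D \nu_0'(D)}{\varphi(D)} = \mathfrak{S}(\mathcal{H} \cup \{h_0\}) \Bigl( \frac{(k+\ell)!}{\ell!} \Bigr)^2 \frac{(2\ell)!}{(k+2\ell)!} (\log \xi)^{-k} + O\bigl( (\log \xi)^{-k-\frac{1}{2}} \bigr). \]

Therefore,

\[ S' = N\, \mathfrak{S}(\mathcal{H}) \Bigl( \frac{(k+\ell)!}{(\ell+1)!} \Bigr)^2 \frac{(2\ell)!}{(k+2\ell+1)!} (\log \xi)^{-k} \Bigl( k(2\ell+1)(2\ell+2) \log \xi + (\ell+1)^2 (k+2\ell+1) \Bigl( \sum_{h_0 = 1,\ h_0 \notin \mathcal{H}}^{H} \frac{\mathfrak{S}(\mathcal{H} \cup \{h_0\})}{\mathfrak{S}(\mathcal{H})} - \log 3N \Bigr) \Bigr) + O\bigl( N (\log N)^{-k+\frac{1}{2}} \bigr). \]

Since $\mathfrak{S}(\mathcal{H} \cup \{h\}) = \mathfrak{S}(\mathcal{H})$ if $h \in \mathcal{H}$, we have that

\[ \sum_{h_0 = 1,\ h_0 \notin \mathcal{H}}^{H} \frac{\mathfrak{S}(\mathcal{H} \cup \{h_0\})}{\mathfrak{S}(\mathcal{H})} = \sum_{h_0 = 1}^{H} \frac{\mathfrak{S}(\mathcal{H} \cup \{h_0\})}{\mathfrak{S}(\mathcal{H})} - k \tag{6.3} \]

and the behaviour of the first summand on the right is given by the following lemma.

Lemma 15. We have that

\[ \sum_{h=1}^{L} \frac{\mathfrak{S}(\mathcal{H} \cup \{h\})}{\mathfrak{S}(\mathcal{H})} \sim L. \]

We are now ready to prove theorem 1.

Proof of theorem 1. Let us take $H = c \log 3N$ for $c > 0$ arbitrary. Since we chose $\xi = (3N)^{\theta/2}$, by lemma 15 and (6.3) we have that $S' > 0$ for sufficiently large $N$ if

\[ 2k(2\ell+1) \frac{\theta}{2} - (\ell+1)(k+2\ell+1)(1 - c) > 0. \tag{6.4} \]

The Bombieri-Vinogradov Theorem (Theorem E) assures that BV(θ) holds for any $\theta < \frac{1}{2}$ and so we can take $\theta = \frac{1}{2} - \frac{c}{4}$. For $k = \ell^2$, (6.4) becomes

\[ \frac{\ell^2 (2\ell+1)}{(\ell+1)^3} > \frac{2 - 2c}{1 - c/2}, \]

and, since the left-hand side tends to 2 as $\ell \to \infty$ while the right-hand side is smaller than 2, we have that $S' > 0$ for $\ell$ large enough. Thus for all $N$ large enough there exist $n \in (N, 2N]$ and $0 \le h_1 < h_2 \le c \log 3N$ such that $p_1 = n + h_1$ and $p_2 = n + h_2$ are primes. Therefore

\[ \frac{p_2 - p_1}{\log p_1} \le c\, \frac{\log 3N}{\log N} \le 2c \]

and this proves theorem 1, since $c > 0$ was arbitrary.

We now give a sketch of the proof of lemma 13. Lemmas 12 and 14 can be proven in a similar way.

Proof of lemma 13.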
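We need to estimate

\[ M = \sum_{D \le \xi^2} \frac{\Lambda_D \nu(D)}{D}, \]

where $\Lambda_D = \sum_{[d_1, d_2] = D} \lambda_{d_1} \lambda_{d_2}$, $\lambda_d = \mu(d) \bigl( \frac{\log^+(\xi/d)}{\log \xi} \bigr)^{k+\ell}$, $N^{1/10} \le \xi \le N$ and $\nu(p) \in [0, k]$, with $\nu(p) = k$ if $p > H$. Firstly, observe that for all $\delta > 0$

\[ \frac{\bigl( \log^+(\xi/d) \bigr)^{k+\ell}}{(k+\ell)!} = \frac{1}{2\pi i} \int_{(\delta)} \Bigl( \frac{\xi}{d} \Bigr)^s \frac{ds}{s^{k+\ell+1}}, \]

where $(\delta)$ indicates the line $(\delta - i\infty, \delta + i\infty)$. Thus

\[ M = \frac{(k+\ell)!^2}{(\log \xi)^{2(k+\ell)}} \cdot \frac{1}{(2\pi i)^2} \int_{(\delta)} \int_{(\delta)} F(s_1, s_2) \frac{\xi^{s_1 + s_2}}{(s_1 s_2)^{k+\ell+1}}\, ds_1\, ds_2 = \frac{(k+\ell)!^2}{(\log \xi)^{2(k+\ell)}}\, M', \]

say, where

\[ F(s_1, s_2) = \sum_{d_1, d_2 = 1}^{\infty} \mu(d_1) \mu(d_2) \frac{\nu([d_1, d_2])}{[d_1, d_2]}\, d_1^{-s_1} d_2^{-s_2} = \prod_p \Bigl( 1 - \frac{\nu(p)}{p} \Bigl( \frac{1}{p^{s_1}} + \frac{1}{p^{s_2}} - \frac{1}{p^{s_1 + s_2}} \Bigr) \Bigr). \]

For $p > H$ we have $\nu(p) = k$, and so $F(s_1, s_2)$ is approximately

\[ F(s_1, s_2) \approx \zeta(s_1 + 1)^{-k}\, \zeta(s_2 + 1)^{-k}\, \zeta(s_1 + s_2 + 1)^{k}. \]

It is clear that the function $G(s_1, s_2)$, defined by

\[ G(s_1, s_2) = F(s_1, s_2)\, \zeta(s_1 + 1)^{k}\, \zeta(s_2 + 1)^{k}\, \zeta(s_1 + s_2 + 1)^{-k}, \]

has an Euler product

\[ G(s_1, s_2) = \prod_p G_p(s_1, s_2). \]

Note that $G(0, 0) = \mathfrak{S}(\mathcal{H})$. Moreover, if $p < H$, then $G_p$ is regular for $\Re(s_1), \Re(s_2) > -1$, while, if $p > H$, then $\nu(p) = k$ and so $G_p = 1 + O\bigl( p^{-3/2} \bigr)$ for $\Re(s_1), \Re(s_2) > -\frac{1}{8}$. Therefore $\prod_p G_p$ is uniformly convergent for $\Re(s_1), \Re(s_2) > -\frac{1}{8}$. Thus on this region $G(s_1, s_2)$ is holomorphic in $s_1$ and $s_2$ and

\[ G(s_1, s_2) \ll \prod_{\nu(p) \ne k} \Bigl( 1 + O\bigl( p^{-7/8} \bigr) \Bigr)^{O(1)} \ll_k 1. \]

Now, take $\delta = \frac{1}{\log N}$. On $\Re(s_1) = \Re(s_2) = \delta$ we have that $\xi^{s_1 + s_2} = O(1)$. Moreover $\zeta(1 + s)$ and $\zeta(1 + s)^{-1}$ are $O(\log(2 + |t|))$ on $\Re(s) = \delta$, $|t| \ge 1$. Therefore, for any $T \ge 1$ we have

\[ \int_{(\delta)} \int_{|t_1| \ge 2T} \frac{\xi^{s_1 + s_2}}{(s_1 s_2)^{k+\ell+1}} F(s_1, s_2)\, ds_1\, ds_2 \ll \int_{-\infty}^{+\infty} \int_{|t_1| \ge 2T} \frac{\log(2 + |t_1| + |t_2|)^{O(1)}}{(1 + |t_1|)^{k+\ell+1} (1 + |t_2|)^{k+\ell+1}}\, dt_1\, dt_2 \ll (\log T)^{O(1)}\, T^{-k-\ell}, \]

and clearly the same holds taking $|t_2| \ge T$ instead of $|t_1| \ge 2T$. Thus

\[ M' = \frac{1}{(2\pi i)^2} \int_{\delta - iT}^{\delta + iT} \int_{\delta - 2iT}^{\delta + 2iT} \frac{\xi^{s_1 + s_2}}{(s_1 s_2)^{k+\ell+1}} F(s_1, s_2)\, ds_1\, ds_2 + O\bigl( T^{1-k-\ell} \bigr). \]

It is known that there exists $c > 0$ such that for $\sigma \ge -\frac{c}{\log T}$, $1 \le |t| \le T$, one has that $\zeta(1 + s) \ne 0$ and that $\zeta(1 + s)$ and $\zeta(1 + s)^{-1}$ are $O(\log(1 + |t|))$.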
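Therefore, $\zeta(1 + s) \ne 0$ inside the rectangle $\Gamma$ with vertices $a_T^{\pm} := \delta \pm 2iT$, $b_T^{\pm} := -\frac{c}{\log 2T} \pm 2iT$, with the usual orientation. Clearly, if $s_1 \in \Gamma$, $|t_2| \le T$ and $\sigma_2 = \delta$, then, supposing $T < N^c$, we have that $s_1 + s_2 \ne 0$. Applying Cauchy's formula, for $|t_2| \le T$ we have

\[ \frac{1}{2\pi i} \int_{a_T^-}^{a_T^+} \frac{\xi^{s_1 + s_2}}{(s_1 s_2)^{k+\ell+1}} F(s_1, s_2)\, ds_1 = \frac{1}{2\pi i} \Bigl( \int_{a_T^-}^{b_T^-} + \int_{b_T^-}^{b_T^+} + \int_{b_T^+}^{a_T^+} \Bigr) \frac{\xi^{s_1 + s_2}}{(s_1 s_2)^{k+\ell+1}} F(s_1, s_2)\, ds_1 \]
\[ \qquad + \operatorname{Res}\Bigl( \frac{\xi^{s_1 + s_2}}{(s_1 s_2)^{k+\ell+1}} F(s_1, s_2);\ s_1 = -s_2 \Bigr) + \operatorname{Res}\Bigl( \frac{\xi^{s_1 + s_2}}{(s_1 s_2)^{k+\ell+1}} F(s_1, s_2);\ s_1 = 0 \Bigr). \]

Now, we have that

\[ \int_{\delta - iT}^{\delta + iT} \Bigl( \int_{a_T^-}^{b_T^-} + \int_{b_T^+}^{a_T^+} \Bigr) \frac{\xi^{s_1 + s_2}}{(s_1 s_2)^{k+\ell+1}} F(s_1, s_2)\, ds_1\, ds_2 \ll \int_{-T}^{T} \int_{-\frac{c}{\log 2T}}^{\delta} \frac{O(1) (\log T)^{O(1)}}{(\delta + |t_2|)^{k+\ell+1}\, T^{k+\ell+1}}\, d\sigma_1\, dt_2 = O\bigl( T^{-k-\ell} \log N \bigr) \]

and

\[ \int_{\delta - iT}^{\delta + iT} \int_{b_T^-}^{b_T^+} \frac{\xi^{s_1 + s_2}}{(s_1 s_2)^{k+\ell+1}} F(s_1, s_2)\, ds_1\, ds_2 \ll \int_{-T}^{T} \int_{-2T}^{2T} \frac{\xi^{\delta - \frac{c}{\log 2T}} (\log T)^{O(1)}}{(\delta + |t_2|)^{k+\ell+1} \bigl( \frac{c}{\log 2T} + |t_1| \bigr)^{k+\ell+1}}\, dt_1\, dt_2 = O\bigl( \xi^{-\frac{c}{\log 2T}} (\log T)^{O(1)} \log N \bigr). \]

Thus, choosing $T = \exp\bigl( \sqrt{\log N} \bigr)$, we find that

\[ M' = \frac{1}{2\pi i} \int_{\delta - iT}^{\delta + iT} \Bigl( \operatorname{Res}\Bigl( \frac{\xi^{s_1 + s_2}}{(s_1 s_2)^{k+\ell+1}} F(s_1, s_2);\ s_1 = -s_2 \Bigr) + \operatorname{Res}\Bigl( \frac{\xi^{s_1 + s_2}}{(s_1 s_2)^{k+\ell+1}} F(s_1, s_2);\ s_1 = 0 \Bigr) \Bigr)\, ds_2 + O\bigl( e^{-c' \sqrt{\log N}} \bigr), \]

for some $c' > 0$. Let us compute the first residue. Let $C$ be the circle $s_1 = -s_2 + \frac{1}{\log N} e^{i\varphi}$. We have

\[ \operatorname{Res}\Bigl( \frac{\xi^{s_1 + s_2}}{(s_1 s_2)^{k+\ell+1}} F(s_1, s_2);\ s_1 = -s_2 \Bigr) = \frac{1}{2\pi i} \int_C \frac{\xi^{s_1 + s_2}}{(s_1 s_2)^{k+\ell+1}} F(s_1, s_2)\, ds_1 \ll \frac{(\log N)^{k-1} (\log(2 + |t_2|))^k}{|s_2|^{2k+2\ell+2}\, |\zeta(1 + s_2)|^k}, \]

since $\zeta\bigl( 1 - s_2 + \frac{1}{\log N} e^{i\varphi} \bigr)^{-1} \ll \log(2 + |t_2|)$ for $\sigma_2 = \delta$, $|t_2| \le T$ (and $N$ big enough). Therefore,

\[ \int_{\delta - iT}^{\delta + iT} \operatorname{Res}\Bigl( \frac{\xi^{s_1 + s_2}}{(s_1 s_2)^{k+\ell+1}} F(s_1, s_2);\ s_1 = -s_2 \Bigr)\, ds_2 = O\bigl( (\log N)^{k-1} \bigr). \]

Now, let us consider the second residue. The function

\[ Z(s_1, s_2) = G(s_1, s_2) \Bigl( \frac{(s_1 + s_2)\, \zeta(1 + s_1 + s_2)}{s_1 \zeta(1 + s_1)\, s_2 \zeta(1 + s_2)} \Bigr)^k \]

is regular (and nonzero) near $s_1 = s_2 = 0$. Let

\[ f(s_2) := \operatorname{Res}\Bigl( \frac{\xi^{s_1 + s_2}}{(s_1 s_2)^{k+\ell+1}} F(s_1, s_2);\ s_1 = 0 \Bigr) = \operatorname{Res}\Bigl( \frac{\xi^{s_1 + s_2}\, Z(s_1, s_2)}{(s_1 s_2)^{\ell+1} (s_1 + s_2)^k};\ s_1 = 0 \Bigr). \]

On the rectangle $\Gamma'$ with vertices $\delta \pm iT$, $-\frac{c}{\log T} \pm iT$, with the usual orientation, we have that

\[ f(s_2) \ll \frac{\xi^{\Re(s_2)} (\log \xi)^{O(1)}}{|s_2|^{k+\ell+1}}. \]

Thus, applying Cauchy's formula, we have that the integral

\[ \frac{1}{2\pi i} \int_{\delta - iT}^{\delta + iT} f(s_2)\, ds_2 \]

is the residue of $f(s_2)$ at $s_2 = 0$ plus an error term that is $O\bigl( \exp\bigl( -c' \sqrt{\log N} \bigr) \bigr)$. To conclude, it is therefore sufficient to compute $\operatorname{Res}(f(s_2); s_2 = 0)$, and that is just a (long) calculation.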