LINNIK’S PROOF OF THE WARING-HILBERT THEOREM
FROM HUA’S BOOK
(with a correction)
Notes by Tim Jameson
For integers $s \ge 1$, $k \ge 2$ and $n \ge 0$, let $r_s^{(k)}(n)$ denote the number of solutions
$(n_1, \dots, n_s)$ of the equation $n_1^k + \cdots + n_s^k = n$ with $n_1, \dots, n_s \ge 0$.
The fact that $r_4^{(2)}(n) > 0$ for all $n$ was probably already suspected by Diophantus
(c.250). It was stated explicitly by Bachet in 1621, and later Fermat claimed to have a
proof. The first accepted proof was by Lagrange in 1770, building on work of Euler.
Also in 1770, Waring wrote a letter to Euler in which he asserted that every natural
number is a sum of 4 squares, 9 cubes, 19 biquadrates “and so on”. By this it is usually
assumed that he meant that for each $k$ there exists an $s$ such that $r_s^{(k)}(n) > 0$ for all $n$. We
let g(k) denote the least such value of s. The problem of showing that g(k) < ∞ for all k has
become known as Waring’s problem. It was first solved by Hilbert in 1909, by a complicated
method [Hil]. Hardy and Littlewood [HL] gave a more elegant proof by the “circle method”
in 1919, which was then refined and simplified by Vinogradov [Vin].
An “elementary” proof, using only number-theoretic methods, was given by Linnik
in 1943 [Lin]. This method is presented in [Hua], but with one serious mistake. Here I
give a version with this mistake corrected. A more general version, with $x^k$ replaced by a
polynomial, is given in [Nath].
Linnik’s method depends on the notion of Shnirelman density, defined as follows. Let
A be a set of non-negative integers (possibly including 0). For each n ≥ 1, let A(n) be the
number of a ∈ A with 1 ≤ a ≤ n. The Shnirelman density is
$$\sigma(A) = \inf_{n \ge 1} \frac{A(n)}{n}.$$
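As a rough illustration of the definition, the following sketch computes the truncated quantity $\min_{1 \le n \le N} A(n)/n$ for a finite cutoff $N$. The function name `truncated_density` and the cutoff are assumptions made here for illustration only; the true density is an infimum over all $n$, so the truncated value is merely an upper bound for it.

```python
from fractions import Fraction

def truncated_density(members, cutoff):
    """Return the minimum over 1 <= n <= cutoff of A(n)/n, where A(n) counts
    the elements a of `members` with 1 <= a <= n (so 0 is never counted)."""
    member_set = set(members)
    count = 0
    best = None
    for n in range(1, cutoff + 1):
        if n in member_set:
            count += 1
        ratio = Fraction(count, n)
        if best is None or ratio < best:
            best = ratio
    return best

odd_numbers = range(1, 200, 2)
squares = [m * m for m in range(15)]
even_numbers = range(0, 200, 2)

print(truncated_density(odd_numbers, 100))   # ratio is smallest at even n
print(truncated_density(squares, 100))       # keeps shrinking as the cutoff grows
print(truncated_density(even_numbers, 100))  # 1 is missing, so A(1)/1 = 0
```

The last case shows why membership of 1 matters: any set omitting 1 has density 0, however large it is otherwise.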
The application to Waring’s problem is via the following Lemma.
LEMMA 1. If A contains 0 and has positive Shnirelman density, then there exists h
such that every positive integer is expressible as the sum of h members of A.
We omit the proof of this Lemma, which is quite straightforward. See [Nath, Theorem
11.4] or [Hua, Theorem 19.2.3].
So it will be enough to prove that for given $k$, there exists $s$ such that $\{n : r_s^{(k)}(n) > 0\}$
has positive Shnirelman density.
LEMMA 2. Let
$$q(n; h_1, h_2) = \sum_{\substack{m_1, m_2 = -M \\ h_1 m_1 + h_2 m_2 = n}}^{M} 1.$$
For $(h_1, h_2) \ne (0, 0)$ we have $q(n; h_1, h_2) = 0$ unless $g = (h_1, h_2)$ divides $n$. Writing $h_1 = g a_1$, $h_2 = g a_2$
(so $(a_1, a_2) = 1$), we then have
$$q(n; h_1, h_2) \le \frac{2M}{\max(|a_1|, |a_2|)} + 1.$$
Proof. Say $n = gf$. Then the equation becomes
$$a_1 m_1 + a_2 m_2 = f.$$
On writing $a_1 \bar{a}_1 + a_2 \bar{a}_2 = 1$, we may write this as
$$a_1 (m_1 - f \bar{a}_1) + a_2 (m_2 - f \bar{a}_2) = 0.$$
Thus the general solution has the form
$$m_1 = f \bar{a}_1 + k a_2, \qquad m_2 = f \bar{a}_2 - k a_1.$$
Without loss of generality we may suppose $a_1 \ge a_2 \ge 0$. This implies $a_1 > 0$ and
$$q(n; h_1, h_2) = \sum_{\substack{k : \\ -M - f \bar{a}_1 \le k a_2 \le M - f \bar{a}_1 \\ -M + f \bar{a}_2 \le k a_1 \le M + f \bar{a}_2}} 1.$$
By the second condition, $k$ is constrained to lie in an interval of length $2M/a_1$. Such an
interval contains at most $2M/a_1 + 1$ integers, hence the statement. Note this all works fine
if $n = 0$.
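The bound of Lemma 2 can be sanity-checked by brute force for small parameters. The sketch below is only such a check, not part of the proof; the helper names `q_pair`, `lemma2_holds` and `check_lemma2` are made up for this illustration.

```python
from math import gcd

def q_pair(n, h1, h2, M):
    """Brute-force count of (m1, m2) in [-M, M]^2 with h1*m1 + h2*m2 == n."""
    return sum(1
               for m1 in range(-M, M + 1)
               for m2 in range(-M, M + 1)
               if h1 * m1 + h2 * m2 == n)

def lemma2_holds(n, h1, h2, M):
    """Check the statement of Lemma 2 for one triple with (h1, h2) != (0, 0)."""
    g = gcd(h1, h2)  # math.gcd returns a non-negative value, gcd(h, 0) == |h|
    count = q_pair(n, h1, h2, M)
    if n % g != 0:
        return count == 0
    a = max(abs(h1), abs(h2)) // g  # max(|a1|, |a2|)
    # count <= 2M/a + 1, rewritten without division
    return count * a <= 2 * M + a

def check_lemma2(H, M):
    """Check Lemma 2 for all |h1|, |h2| <= H and all n that can occur."""
    return all(lemma2_holds(n, h1, h2, M)
               for h1 in range(-H, H + 1)
               for h2 in range(-H, H + 1)
               if (h1, h2) != (0, 0)
               for n in range(-2 * H * M, 2 * H * M + 1))

print(check_lemma2(3, 5))
print(q_pair(0, 1, 1, 3))  # solutions m1 = -m2, so the bound 2M + 1 is attained
```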
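LEMMA 3. Let
$$q(n) = \sum_{\substack{h_1, h_2 = -H \\ h_1, h_2 \ne 0}}^{H} \; \sum_{\substack{m_1, m_2 = -M \\ h_1 m_1 + h_2 m_2 = n}}^{M} 1.$$
Then
$$q(n) \le \begin{cases} 20 H M \sigma_{-1}(n) & (n \ne 0,\ H \le M), \\ 20 H^2 M & (n = 0), \end{cases}$$
where $\sigma_{-1}(n) = \sum_{d \mid n} 1/d$.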
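Proof. We have
$$\begin{aligned}
q(n) &= \sum_{\substack{h_1, h_2 = -H \\ h_1, h_2 \ne 0}}^{H} q(n; h_1, h_2) \\
&= 4 \sum_{h_1, h_2 = 1}^{H} q(n; h_1, h_2) \\
&\le 4H^2 + 8M \sum_{\substack{g \mid n \\ g \le H}} \; \sum_{\substack{1 \le a_1 \le H/g \\ 0 \le a_2 \le H/g}} \frac{1}{\max(a_1, a_2)}.
\end{aligned}$$
We could have said $a_2 \ge 1$ here, but have allowed $a_2 = 0$ also (it naturally leads to no
worse an estimate), so that the following applies to the variant of this lemma required
for Lemma 7b:
$$\begin{aligned}
q(n) &\le 4H^2 + 8M \sum_{\substack{g \mid n \\ g \le H}} \left( \sum_{1 \le a_1 \le H/g} \; \sum_{0 \le a_2 < a_1} \frac{1}{a_1} + \sum_{1 \le a_2 \le H/g} \; \sum_{1 \le a_1 \le a_2} \frac{1}{a_2} \right) \\
&= 4H^2 + 16M \sum_{\substack{g \mid n \\ g \le H}} \; \sum_{1 \le a \le H/g} 1 \\
&\le 4H^2 + 16HM \sum_{\substack{g \mid n \\ g \le H}} \frac{1}{g},
\end{aligned}$$
and the result follows.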
Although we have thrown away a lot in the case $n = 0$ (we could give the result as
$4H^2 + 16HM \log(eH)$) this will have little effect in the application.
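For small $H$ and $M$ the two bounds of Lemma 3 can be compared with exact counts. The following is a minimal sketch under that assumption; `q_counts`, `sigma_minus_1` and `check_lemma3` are hypothetical helper names, and $q(n)$ is obtained by convolving the distribution of the products $hm$ with itself.

```python
from collections import Counter
from fractions import Fraction

def q_counts(H, M):
    """Counter mapping n to q(n): the number of (h1, h2, m1, m2) with
    1 <= |h1|, |h2| <= H, |m1|, |m2| <= M and h1*m1 + h2*m2 == n."""
    hs = [h for h in range(-H, H + 1) if h != 0]
    # distribution of a single product h*m
    single = Counter(h * m for h in hs for m in range(-M, M + 1))
    counts = Counter()
    for x, cx in single.items():
        for y, cy in single.items():
            counts[x + y] += cx * cy
    return counts

def sigma_minus_1(n):
    """Sum of 1/d over the positive divisors d of n (n nonzero), exactly."""
    n = abs(n)
    return sum(Fraction(1, d) for d in range(1, n + 1) if n % d == 0)

def check_lemma3(H, M):
    """Check both cases of Lemma 3 for given H <= M."""
    counts = q_counts(H, M)
    ok_zero = counts[0] <= 20 * H * H * M
    ok_rest = all(c <= 20 * H * M * sigma_minus_1(n)
                  for n, c in counts.items() if n != 0)
    return ok_zero and ok_rest

print(check_lemma3(3, 5))
```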
The following lemmas are of some interest in their own right.
LEMMA 4. We have
$$\sum_{\substack{a, b = 1 \\ (a, b) = 1}}^{\infty} \frac{1}{a^2 b^2} = \frac{5}{2}.$$
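Proof. Denote the sum by $S$. We have
$$\sum_{d, e = 1}^{\infty} \frac{1}{d^2 e^2} = \zeta(2)^2.$$
Now write $(d, e) = g$ and $d = ga$, $e = gb$, so that $(a, b) = 1$. Then
$$\sum_{d, e = 1}^{\infty} \frac{1}{d^2 e^2} = \sum_{\substack{g, a, b = 1 \\ (a, b) = 1}}^{\infty} \frac{1}{g^4 a^2 b^2} = \zeta(4) S.$$
Hence
$$S = \frac{\zeta(2)^2}{\zeta(4)} = \frac{\pi^4/36}{\pi^4/90} = \frac{5}{2}.$$
LEMMA 5. We have
$$\sum_{d, e = 1}^{\infty} \frac{1}{de[d, e]} = \frac{5}{2} \zeta(3).$$
Proof. Denote the sum by $C$. With $g, a, b$ as above, we have $[d, e] = gab$, hence
$$C = \sum_{\substack{g, a, b = 1 \\ (a, b) = 1}}^{\infty} \frac{1}{g^3 a^2 b^2} = \zeta(3) S.$$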
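The values in Lemmas 4 and 5 can be checked numerically by truncating the double sums. The sketch below does this with a cutoff of 400 in each variable; the cutoff and the function names are arbitrary choices made for illustration, and the partial sums fall slightly short of the limits because all terms are positive.

```python
from math import gcd, pi

def coprime_sum(limit):
    """Partial sum of 1/(a^2 b^2) over coprime pairs 1 <= a, b <= limit."""
    return sum(1.0 / (a * a * b * b)
               for a in range(1, limit + 1)
               for b in range(1, limit + 1)
               if gcd(a, b) == 1)

def lcm_sum(limit):
    """Partial sum of 1/(d e [d, e]) over 1 <= d, e <= limit."""
    total = 0.0
    for d in range(1, limit + 1):
        for e in range(1, limit + 1):
            lcm = d * e // gcd(d, e)
            total += 1.0 / (d * e * lcm)
    return total

def zeta3(terms=100000):
    """Partial sum of the series for zeta(3)."""
    return sum(1.0 / n ** 3 for n in range(1, terms + 1))

S_partial = coprime_sum(400)
C_partial = lcm_sum(400)
print(S_partial, "vs", (pi ** 4 / 36) / (pi ** 4 / 90))
print(C_partial, "vs", 2.5 * zeta3())
```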
LEMMA 6. We have
$$\sum_{n \le x} \sigma_{-1}(n)^2 \le \frac{5}{2} \zeta(3) x.$$
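Proof. This sum is
$$\sum_{n \le x} \sum_{d, e \mid n} \frac{1}{de} = \sum_{d, e \le x} \frac{1}{de} \sum_{\substack{n \le x \\ n \equiv 0 \bmod [d, e]}} 1 \le \sum_{d, e \le x} \frac{x}{de[d, e]} \le Cx,$$
where $C$ is as in Lemma 5.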
A few further estimates show that in fact
$$\sum_{n \le x} \sigma_{-1}(n)^2 = \frac{5}{2} \zeta(3) x + O(\log^2 x).$$
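The inequality of Lemma 6 is easy to test numerically. The sketch below tabulates $\sigma_{-1}(n)$ with a sieve and compares the second moment with $\frac{5}{2}\zeta(3)x$; the function names are assumptions for this illustration, and $\zeta(3)$ is approximated from below by a partial sum.

```python
def sigma_minus_1_table(x):
    """Return a list s with s[n] = sum of 1/d over divisors d of n, 1 <= n <= x."""
    s = [0.0] * (x + 1)
    for d in range(1, x + 1):
        for n in range(d, x + 1, d):
            s[n] += 1.0 / d
    return s

def second_moment(x):
    """Sum of sigma_{-1}(n)^2 over 1 <= n <= x."""
    return sum(v * v for v in sigma_minus_1_table(x)[1:])

# partial sum of the zeta(3) series, slightly below the true value
zeta3_lower = sum(1.0 / n ** 3 for n in range(1, 1001))

for x in (1, 10, 100, 1000):
    print(x, second_moment(x), 2.5 * zeta3_lower * x)
```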
LEMMA 7 (for the inductive step in Theorem 1). For $H \le M$ we have
$$\sum_{\substack{h_1, \dots, h_4 = -H \\ h_1, \dots, h_4 \ne 0}}^{H} \; \sum_{\substack{m_1, \dots, m_4 = -M \\ h_1 m_1 + h_2 m_2 = h_3 m_3 + h_4 m_4}}^{M} 1 \le 5250 (HM)^3.$$
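Proof. The left-hand side is
$$\begin{aligned}
\sum_{n = -2HM}^{2HM} q(n)^2 &\le 20^2 \left( H^4 M^2 + 2 H^2 M^2 \sum_{n=1}^{2HM} \sigma_{-1}(n)^2 \right) \\
&\le 20^2 \left( H^4 M^2 + 2 H^2 M^2 \cdot \frac{5}{2} \zeta(3) \cdot 2HM \right) \\
&= 20^2 \left( H^4 M^2 + 10 \zeta(3) H^3 M^3 \right) \\
&\le 20^2 (1 + 10 \zeta(3)) (HM)^3.
\end{aligned}$$
Calculation shows that $20^2 (1 + 10 \zeta(3)) \approx 5208$.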
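Both the numerical constant and the inequality of Lemma 7 can be checked directly for small parameters. The following sketch, with the hypothetical helper `lemma7_count`, evaluates the left-hand side as $\sum_n q(n)^2$ and approximates $\zeta(3)$ by a partial sum.

```python
from collections import Counter

def lemma7_count(H, M):
    """Number of (h1..h4, m1..m4) with 1 <= |hi| <= H, |mi| <= M and
    h1*m1 + h2*m2 == h3*m3 + h4*m4, computed as the sum of q(n)^2."""
    single = Counter(h * m
                     for h in range(-H, H + 1) if h != 0
                     for m in range(-M, M + 1))
    q = Counter()
    for x, cx in single.items():
        for y, cy in single.items():
            q[x + y] += cx * cy
    return sum(c * c for c in q.values())

zeta3 = sum(1.0 / n ** 3 for n in range(1, 100001))
constant = 20 ** 2 * (1 + 10 * zeta3)
print(constant)

for H, M in [(1, 1), (2, 3), (3, 5), (4, 4)]:
    print(H, M, lemma7_count(H, M), 5250 * (H * M) ** 3)
```

For such small values the exact count is far below the bound, which reflects how much is given away in the constants.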
LEMMA 7b (irritating variant needed for Lemma 8). For $2H \le M$ we have
$$\sum_{h_1, \dots, h_4 = -H}^{H} \; \sum_{\substack{m_1, \dots, m_4 = -M \\ h_1 m_1 + h_2 m_2 = h_3 m_3 + h_4 m_4}}^{M} 1 \le 162 M^4 + 5250 (HM)^3.$$
Proof. Let
$$\begin{aligned}
Q(n) &= \sum_{h_1, h_2 = -H}^{H} \; \sum_{\substack{m_1, m_2 = -M \\ h_1 m_1 + h_2 m_2 = n}}^{M} 1 \\
&= \sum_{h_1, h_2 = -H}^{H} q(n; h_1, h_2) \\
&= q(n; 0, 0) + 4 \sum_{\substack{1 \le h_1 \le H \\ 0 \le h_2 \le H}} q(n; h_1, h_2).
\end{aligned}$$
For $n \ne 0$ we have $q(n; 0, 0) = 0$ and obtain the same bound for $Q(n)$ as that for $q(n)$ given
by Lemma 3: in the working of Lemma 3 the $4H^2$ is replaced by $4H(H + 1)$, but since
$H < M$ we can still say $4H(H + 1) \le 4HM$.
However, in the case $n = 0$ we have an additional term $q(0; 0, 0) = (2M + 1)^2 \le 9M^2$.
Thus we have
$$Q(0) \le 9M^2 + 20 H^2 M,$$
and so
$$\begin{aligned}
Q(0)^2 &\le 2 (9M^2)^2 + 2 (20 H^2 M)^2 \\
&= 162 M^4 + 20^2 H^3 M^2 \cdot 2H \\
&\le 162 M^4 + 20^2 (HM)^3.
\end{aligned}$$
The working of Lemma 7 now gives
$$\sum_{h_1, \dots, h_4 = -H}^{H} \; \sum_{\substack{m_1, \dots, m_4 = -M \\ h_1 m_1 + h_2 m_2 = h_3 m_3 + h_4 m_4}}^{M} 1 = \sum_{n = -2HM}^{2HM} Q(n)^2 \le 162 M^4 + 5250 (HM)^3,$$
as required.
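The role of the extra term $162M^4$ becomes visible numerically when $M$ is much larger than $H$. The sketch below, using hypothetical helpers `pair_counts` and `compare`, counts solutions with and without the value $h = 0$; the choice $H = 1$, $M = 400$ is just one convenient example in which the count with $h = 0$ allowed exceeds $5250(HM)^3$.

```python
from collections import Counter

def pair_counts(H, M, allow_zero_h):
    """Counter n -> number of (h1, h2, m1, m2) with |hi| <= H, |mi| <= M and
    h1*m1 + h2*m2 == n; the value h = 0 is included only if allow_zero_h."""
    hs = [h for h in range(-H, H + 1) if allow_zero_h or h != 0]
    single = Counter(h * m for h in hs for m in range(-M, M + 1))
    counts = Counter()
    for x, cx in single.items():
        for y, cy in single.items():
            counts[x + y] += cx * cy
    return counts

def compare(H, M):
    """Return Q(0), the sum of Q(n)^2 and the sum of q(n)^2."""
    Q = pair_counts(H, M, allow_zero_h=True)
    q = pair_counts(H, M, allow_zero_h=False)
    return Q[0], sum(c * c for c in Q.values()), sum(c * c for c in q.values())

H, M = 1, 400
Q0, total_Q, total_q = compare(H, M)
print(Q0, (2 * M + 1) ** 2)                      # Q(0) contains the (2M+1)^2 term
print(total_Q > 5250 * (H * M) ** 3)             # bound of Lemma 7 fails here
print(total_Q <= 162 * M ** 4 + 5250 * (H * M) ** 3)
print(total_q <= 5250 * (H * M) ** 3)
```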
LEMMA 8 (case $k = 2$ of Theorem 1). Let
$$f(n) = a_2 n^2 + a_1 n$$
where $a_2, a_1$ are integers with
$$0 < |a_2| \le c_2, \qquad |a_1| \le c_1 N.$$
Then for $N \ge 1$ we have
$$\int_0^1 \left| \sum_{n=0}^{N} e(\alpha f(n)) \right|^8 d\alpha \le C N^6,$$
where $e(x) = e^{2\pi i x}$ and
$$C = 162 (2c_2 + c_1)^4 + 5250 (2c_2 + c_1)^3.$$
In particular $C = 44592$ when $f(n) = n^2$, $c_2 = 1$, $c_1 = 0$.
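Proof. We have
$$\begin{aligned}
\int_0^1 \left| \sum_{n=0}^{N} e(\alpha f(n)) \right|^8 d\alpha
&= \sum_{n_1, \dots, n_8 = 0}^{N} \int_0^1 e\big(\alpha (f(n_1) + \cdots + f(n_4) - f(n_5) - \cdots - f(n_8))\big) \, d\alpha \\
&= \sum_{\substack{n_1, \dots, n_8 = 0 \\ f(n_1) + \cdots + f(n_4) = f(n_5) + \cdots + f(n_8)}}^{N} 1.
\end{aligned}$$
We may write the equation here as
$$\sum_{i=1}^{4} \big(f(n_i) - f(n_{i+4})\big) = \sum_{i=1}^{4} \big(a_2 (n_i^2 - n_{i+4}^2) + a_1 (n_i - n_{i+4})\big) = \sum_{i=1}^{4} h_i m_i = 0,$$
where
$$h_i = n_i - n_{i+4}, \qquad m_i = a_2 (n_i + n_{i+4}) + a_1.$$
Note that $(h_i, m_i)$ uniquely determines $(n_i, n_{i+4})$ since
$$\begin{pmatrix} 1 & -1 \\ a_2 & a_2 \end{pmatrix} \begin{pmatrix} n_i \\ n_{i+4} \end{pmatrix} = \begin{pmatrix} h_i \\ m_i - a_1 \end{pmatrix}$$
has the inverse
$$\begin{pmatrix} n_i \\ n_{i+4} \end{pmatrix} = \frac{1}{2a_2} \begin{pmatrix} a_2 & 1 \\ -a_2 & 1 \end{pmatrix} \begin{pmatrix} h_i \\ m_i - a_1 \end{pmatrix}$$
for $a_2 \ne 0$. Clearly we have
$$|h_i| \le N$$
and
$$|m_i| \le M, \quad \text{where } M = (2c_2 + c_1) N.$$
Noting that $M \ge 2N$ since $c_2 \ge |a_2| \ge 1$, the result follows from Lemma 7b (applied with $H = N$).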
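For $f(n) = n^2$ the integral in Lemma 8 is simply the number of solutions of $n_1^2 + \cdots + n_4^2 = n_5^2 + \cdots + n_8^2$ with $0 \le n_i \le N$, so the bound $44592 N^6$ can be compared with exact counts for small $N$. A minimal sketch, with the hypothetical function name `eighth_moment_squares`:

```python
from collections import Counter

def eighth_moment_squares(N):
    """Number of (n1..n8) in [0, N]^8 with n1^2+...+n4^2 == n5^2+...+n8^2,
    which equals the integral in Lemma 8 for f(n) = n^2."""
    squares = [n * n for n in range(N + 1)]
    # number of ways to write t as a sum of two squares from the list
    two = Counter(a + b for a in squares for b in squares)
    # number of ways to write t as a sum of four squares from the list
    four = Counter()
    for x, cx in two.items():
        for y, cy in two.items():
            four[x + y] += cx * cy
    return sum(c * c for c in four.values())

for N in range(1, 11):
    print(N, eighth_moment_squares(N), 44592 * N ** 6)
```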
THEOREM 1. Let $k \ge 2$ and
$$f(n) = a_k n^k + \cdots + a_1 n$$
where $a_1, \dots, a_k$ are integers with $a_k \ne 0$ and
$$|a_j| \le c_{j,k} N^{k-j}.$$
Then for $N \ge 1$ we have
$$\int_0^1 \left| \sum_{n=0}^{N} e(\alpha f(n)) \right|^{8^{k-1}} d\alpha \ll_{k, c_{1,k}, \dots, c_{k,k}} N^{8^{k-1} - k}.$$
Proof. The statement is more than we need for the application. It has been elaborated
to make its proof by induction on k work. The case k = 2 is given by Lemma 8. We will
omit the suffices in the notation. Suppose the statement is true with k − 1 in place of k.
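We have
$$\left| \sum_{n=0}^{N} e(\alpha f(n)) \right|^2 = \sum_{m, n = 0}^{N} e(\alpha f(m) - \alpha f(n)) = N + 1 + \sum_{\substack{h = -N \\ h \ne 0}}^{N} b_h, \qquad (1)$$
where
$$b_h = \sum_{\substack{m, n = 0 \\ m - n = h}}^{N} e(\alpha f(m) - \alpha f(n)) = \sum_{n = \max(0, -h)}^{\min(N, N - h)} e(\alpha h \phi(n, h)),$$
where
$$\begin{aligned}
\phi(n, h) &= \frac{1}{h} \big(f(n + h) - f(n)\big) \\
&= \frac{1}{h} \sum_{j=1}^{k} a_j \big((n + h)^j - n^j\big) \\
&= \sum_{j=1}^{k} a_j \sum_{r=0}^{j-1} \binom{j}{r} h^{j-r-1} n^r \\
&= \sum_{r=0}^{k-1} \left( \sum_{j=r+1}^{k} \binom{j}{r} a_j h^{j-r-1} \right) n^r
\end{aligned}$$
is a degree $k - 1$ polynomial in $n$. From the definition we see that $\phi(n, h) \ll N^{k-1}$. The
coefficient of $n^{k-1}$ in $\phi(n, h)$ is
$$\binom{k}{k-1} a_k h^{k-(k-1)-1} = k a_k \ne 0,$$
and the coefficient of $n^r$ is
$$\sum_{j=r+1}^{k} \binom{j}{r} a_j h^{j-r-1} \ll \sum_{j=r+1}^{k} \binom{j}{r} N^{k-j} N^{j-r-1} \ll N^{k-r-1}.$$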
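The expansion of $\phi(n, h)$ and the identity (1) can both be verified numerically for a sample polynomial. The sketch below does so; the coefficient list, the value of $\alpha$ and the helper names are arbitrary choices for illustration.

```python
import cmath
from math import comb

def f_value(coeffs, n):
    """f(n) = a_1 n + ... + a_k n^k, with coeffs = [a_1, ..., a_k]."""
    return sum(a * n ** j for j, a in enumerate(coeffs, start=1))

def phi(coeffs, n, h):
    """phi(n, h) evaluated from the double-sum formula (an integer)."""
    k = len(coeffs)
    total = 0
    for r in range(k):
        coeff_r = sum(comb(j, r) * coeffs[j - 1] * h ** (j - r - 1)
                      for j in range(r + 1, k + 1))
        total += coeff_r * n ** r
    return total

def e(x):
    """e(x) = exp(2 pi i x)."""
    return cmath.exp(2j * cmath.pi * x)

def identity_1_error(coeffs, N, alpha):
    """Absolute difference between the two sides of identity (1)."""
    lhs = abs(sum(e(alpha * f_value(coeffs, n)) for n in range(N + 1))) ** 2
    rhs = N + 1
    for h in range(-N, N + 1):
        if h == 0:
            continue
        rhs += sum(e(alpha * h * phi(coeffs, n, h))
                   for n in range(max(0, -h), min(N, N - h) + 1))
    return abs(lhs - rhs)

sample = [3, -2, 5]  # f(n) = 3n - 2n^2 + 5n^3, an arbitrary example
print(all(h * phi(sample, n, h) == f_value(sample, n + h) - f_value(sample, n)
          for n in range(6) for h in (-4, -3, -2, -1, 1, 2, 3, 4)))
print(identity_1_error(sample, 6, 0.137))
```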
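Raising (1) to the power $8^{k-2}$ and using Hölder’s inequality gives
$$\begin{aligned}
\left| \sum_{n=0}^{N} e(\alpha f(n)) \right|^{2 \cdot 8^{k-2}}
&\ll N^{8^{k-2}} + \Bigg| \sum_{\substack{h = -N \\ h \ne 0}}^{N} b_h \Bigg|^{8^{k-2}} \\
&\ll N^{8^{k-2}} + \Bigg( \sum_{\substack{h = -N \\ h \ne 0}}^{N} 1 \Bigg)^{8^{k-2} - 1} \sum_{\substack{h = -N \\ h \ne 0}}^{N} |b_h|^{8^{k-2}} \\
&\ll N^{8^{k-2}} + N^{8^{k-2} - 1} \sum_{\substack{h = -N \\ h \ne 0}}^{N} |b_h|^{8^{k-2}}.
\end{aligned}$$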
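Raising this to a further fourth power and integrating over $\alpha$ then gives
$$\int_0^1 \left| \sum_{n=0}^{N} e(\alpha f(n)) \right|^{8^{k-1}} d\alpha \ll N^{4 \cdot 8^{k-2}} + N^{4 \cdot 8^{k-2} - 4} \int_0^1 \Bigg( \sum_{\substack{h = -N \\ h \ne 0}}^{N} |b_h|^{8^{k-2}} \Bigg)^4 d\alpha. \qquad (2)$$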
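As a function of $\alpha$, $b_h$ has period $1/|h|$. Let $|b_h|^{8^{k-2}}$ have the Fourier series
$$|b_h|^{8^{k-2}} = \sum_{m = -\infty}^{\infty} A(m, h) e(\alpha h m).$$
This is finite really because
$$A(m, h) \ne 0 \;\Rightarrow\; m \ll \max_{0 \le n \le N} |\phi(n, h)| \ll N^{k-1},$$
so we may write the range for $m$ as $|m| \le C N^{k-1}$ (where $C$ is independent of $h$). The
coefficients are given by
$$\begin{aligned}
A(m, h) &= \int_0^1 |b_h|^{8^{k-2}} e(-\alpha h m) \, d\alpha \\
&= \int_0^{|h|} \left| \sum_{n = \max(0, -h)}^{\min(N, N - h)} e\big((\operatorname{sgn} h) \beta \phi(n, h)\big) \right|^{8^{k-2}} e\big(-(\operatorname{sgn} h) \beta m\big) \, \frac{d\beta}{|h|} \\
&= \int_0^1 \left| \sum_{n = \max(0, -h)}^{\min(N, N - h)} e(\beta \phi(n, h)) \right|^{8^{k-2}} e\big(-(\operatorname{sgn} h) \beta m\big) \, d\beta.
\end{aligned}$$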
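Thus
$$|A(m, h)| \le \int_0^1 \left| \sum_{n = \max(0, -h)}^{\min(N, N - h)} e(\beta \phi(n, h)) \right|^{8^{k-2}} d\beta \ll N^{8^{k-2} - (k-1)},$$
by the inductive hypothesis. (We have trivially translated the region of summation over $n$.
This should be written in as part of the official statement, or perhaps done away with by
restricting to $h > 0$ using $2\Re$.) Now we have
$$\begin{aligned}
\int_0^1 \Bigg( \sum_{\substack{h = -N \\ h \ne 0}}^{N} |b_h|^{8^{k-2}} \Bigg)^4 d\alpha
&= \int_0^1 \Bigg( \sum_{\substack{h = -N \\ h \ne 0}}^{N} \; \sum_{|m| \le C N^{k-1}} A(m, h) e(\alpha h m) \Bigg)^4 d\alpha \\
&= \sum_{\substack{h_1, \dots, h_4 = -N \\ h_1, \dots, h_4 \ne 0}}^{N} \; \sum_{|m_1|, \dots, |m_4| \le C N^{k-1}} \left( \prod_{i=1}^{4} A(m_i, h_i) \right) \int_0^1 e\left( \alpha \sum_{i=1}^{4} h_i m_i \right) d\alpha \\
&= \sum_{\substack{h_1, \dots, h_4 = -N \\ h_1, \dots, h_4 \ne 0}}^{N} \; \sum_{\substack{|m_1|, \dots, |m_4| \le C N^{k-1} \\ h_1 m_1 + h_2 m_2 + h_3 m_3 + h_4 m_4 = 0}} \; \prod_{i=1}^{4} A(m_i, h_i) \\
&\ll N^{4(8^{k-2} - (k-1))} \sum_{\substack{h_1, \dots, h_4 = -N \\ h_1, \dots, h_4 \ne 0}}^{N} \; \sum_{\substack{|m_1|, \dots, |m_4| \le C N^{k-1} \\ h_1 m_1 + h_2 m_2 + h_3 m_3 + h_4 m_4 = 0}} 1 \\
&\ll N^{4(8^{k-2} - (k-1))} N^{3k} \\
&= N^{4 \cdot 8^{k-2} - k + 4},
\end{aligned}$$
by Lemma 7 (note that we may assume $C \ge 1$). So finally (2) gives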
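$$\begin{aligned}
\int_0^1 \left| \sum_{n=0}^{N} e(\alpha f(n)) \right|^{8^{k-1}} d\alpha
&\ll N^{4 \cdot 8^{k-2}} + N^{4 \cdot 8^{k-2} - 4} N^{4 \cdot 8^{k-2} - k + 4} \\
&\ll N^{4 \cdot 8^{k-2}} + N^{8^{k-1} - k}.
\end{aligned}$$
The fact that the second term dominates is equivalent to $8^k \ge 16k$, which is certainly true
for $k \ge 3$, so completing the proof.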
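The exponent bookkeeping in this last step is easy to get wrong, so here is a small sketch that recomputes the exponents of $N$ for several values of $k$. The function name `exponents` is made up for this check; it only restates the arithmetic above.

```python
def exponents(k):
    """Exponents of N in the final step of the induction, for k >= 3."""
    p = 8 ** (k - 2)
    # N^{4(8^{k-2} - (k-1))} * N^{3k}, from the bound on A(m, h) and Lemma 7
    lemma7_term = 4 * (p - (k - 1)) + 3 * k
    # first term on the right of (2)
    first = 4 * p
    # second term on the right of (2) after inserting lemma7_term
    second = (4 * p - 4) + lemma7_term
    return lemma7_term, first, second

for k in range(3, 9):
    lemma7_term, first, second = exponents(k)
    print(k, lemma7_term, first, second, 8 ** (k - 1) - k)
```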
THEOREM 2. Let $k \ge 2$ and $s = 8^{k-1}$. Then the set $A = \{n \ge 1 : r_s^{(k)}(n) > 0\}$ has
positive Shnirelman density.
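Proof. Write $r(n)$ for $r_s^{(k)}(n)$. We have
$$\sum_{n=0}^{N} r(n) = \sum_{\substack{m_1, \dots, m_s \ge 0 \\ m_1^k + \cdots + m_s^k \le N}} 1 \ge \sum_{0 \le m_1, \dots, m_s \le (N/s)^{1/k}} 1 \ge (N/s)^{s/k} \gg_k N^{s/k}.$$
But on the other hand, for $n \ge 1$,
$$\begin{aligned}
r(n) &= \sum_{0 \le m_1, \dots, m_s \le n^{1/k}} \int_0^1 e\big(\alpha (m_1^k + \cdots + m_s^k - n)\big) \, d\alpha \\
&= \int_0^1 \Bigg( \sum_{0 \le m \le n^{1/k}} e(\alpha m^k) \Bigg)^s e(-\alpha n) \, d\alpha \\
&\le \int_0^1 \Bigg| \sum_{0 \le m \le n^{1/k}} e(\alpha m^k) \Bigg|^s d\alpha \\
&\ll_k (n^{1/k})^{s-k} \qquad \text{(by Theorem 1)} \\
&= n^{s/k - 1},
\end{aligned}$$
so that
$$\sum_{n=0}^{N} r(n) \ll_k 1 + N^{s/k - 1} A(N).$$
Comparing this with the lower bound gives $A(N) \gg_k N$ for all sufficiently large $N$. Since $1 \in A$,
we also have $A(N) \ge 1$ for every $N \ge 1$, so $A(N)/N$ is bounded below by a positive constant
and the statement follows.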
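The counting function $r_s^{(k)}(n)$ used in this proof can be tabulated for small parameters by repeated convolution. The sketch below is only an illustration of the definition and of the lower bound $\sum_{n \le N} r(n) \ge (N/s)^{s/k}$; the function name `representation_counts` is an assumption, and the small values of $s$ used are not the $s = 8^{k-1}$ of the theorem.

```python
def representation_counts(k, s, N):
    """Return a list r with r[n] = number of ordered tuples (m1, ..., ms),
    mi >= 0, such that m1^k + ... + ms^k == n, for 0 <= n <= N."""
    powers = []
    m = 0
    while m ** k <= N:
        powers.append(m ** k)
        m += 1
    r = [0] * (N + 1)
    r[0] = 1  # the empty sum represents 0 in exactly one way
    for _ in range(s):
        new = [0] * (N + 1)
        for n, c in enumerate(r):
            if c:
                for p in powers:
                    if n + p > N:
                        break
                    new[n + p] += c
        r = new
    return r

four_squares = representation_counts(2, 4, 200)
print(all(c > 0 for c in four_squares))          # Lagrange: four squares suffice
print(sum(four_squares) >= (200 / 4) ** (4 / 2)) # lower bound from the proof
print(representation_counts(3, 8, 30)[23])       # 23 is not a sum of 8 cubes
```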
By Lemma 1, we can deduce at once:
THEOREM 3 (the Waring-Hilbert theorem). For each $k \ge 2$, there exists $s = s(k)$ such
that $r_s^{(k)}(n) > 0$ for all $n \ge 0$.
The mistake in [Hua] is the claim that $Q(0) \ll \min(H^2 M, M^2 H) \ll (HM)^{3/2}$, which
is wrong because there are $(2M + 1)^2$ solutions with $h_1 = h_2 = 0$. This leads to the incorrect
bound $\ll (HM)^3$ (regardless of the relative sizes of $H$ and $M$) for the quantity in Lemma
7b. This is wrong if $M$ is much bigger than $H$ (as it is in the application to the inductive
step in Theorem 1) since $Q(0)^2$ then dominates. The way I’ve corrected it is simply to note
that $h_1, \dots, h_4 = 0$ does not occur in Theorem 1, so we can use Lemma 7 instead.
References (added by Graham Jameson)
[HL] G. Hardy and J. Littlewood, A new solution of Waring’s problem, Quart. J. Math. 48 (1919), 272–293.
[Hil] D. Hilbert, Beweis für die Darstellbarkeit der ganzen Zahlen durch eine feste Anzahl n-ter Potenzen, Math. Annalen 67 (1909), 281–300.
[Hua] L. K. Hua, Introduction to Number Theory, Springer (1982).
[Lin] Yu. V. Linnik, An elementary solution of Waring’s problem by Shnirelman’s method (Russian), Mat. Sbornik NS 12 (1943), 225–230.
[Nath] M. B. Nathanson, Elementary Methods in Number Theory, Springer (2000).
[Vin] I. M. Vinogradov, On Waring’s theorem (Russian), Izv. Akad. Nauk SSSR 4 (1928), 393–400.