ELEMENTARY NUMBER THEORY

ELEMENTARY NUMBER THEORY (5CCM224A AND 6CCM224B)
JAMES NEWTON
Course Arrangements
Problem sheets and solutions, lecture notes and other information will all be posted on
the KEATS page for the course.
Tutorials. Weekly from second week of semester 2 (week 22) onwards.
Assessment. 90% of your mark is from the final exam, 10% from a class test (in week
28).
Office hours. My office hours are Thursdays 2-4pm. If you have questions, comments or
any feedback about the course you are encouraged to come to my office hours to discuss.
If the timing of the office hours isn’t convenient, email me at [email protected] and
we can make an appointment to meet at some other time. I am also happy to answer
questions and receive comments by email.
Other reading. The lecture notes from last year’s course can be found on KEATS. The
course content this year will be very similar, but the material will be reorganised a little.
Here are three recommended textbooks which cover much of the course material:
• Elementary number theory — David M. Burton
• A friendly introduction to number theory — Joseph H. Silverman
• Elementary number theory — Gareth A. Jones and Mary R. Jones
Course outline. This course is an introduction to a variety of topics in number theory.
The reason for the word ‘elementary’ in the course title is that we won’t rely on any advanced theory from other parts of mathematics (e.g. real or complex analysis, commutative
algebra) — we will use a little bit of the material on groups and rings from the course Introduction to abstract algebra, but even this isn’t really essential. Here are some of the
topics we will cover:
(1) Review from Introduction to abstract algebra:
divisibility, prime numbers, congruences
(2) The ring of residue classes mod m and its unit group, primitive roots
(3) Quadratic residues and the quadratic reciprocity law
(4) Sums of squares
(5) Irrational and transcendental numbers
Date: March 14, 2017.
1
Week 1
2
JAMES NEWTON
(6) Some examples of solving equations in integers and rational numbers:
‘Diophantine equations’
What is number theory? Number theory is the study of the natural numbers, and the
integers:
N = {1, 2, 3, . . .}
Z = {. . . , −2, −1, 0, 1, 2, . . .}
We are interested in different kinds of numbers (even, odd, square, prime,. . . ) and the
relationships between them.
Here are some examples of number theoretic questions, some of which we will discuss
more in the course:
• Can we describe all the integer solutions (a, b, c) to the equation a2 +b2 = c2 ? What
about if we replace the squares by cubes, fourth powers, etc.?
• Which numbers are sums of two squares?
• What can we say about the gaps between prime numbers?
1. Divisibility
Definition 1.1. Let a and b be two integers. We say that b divides a if there exists an
integer q such that a = qb. If b divides a, we write b|a.
Here are some basic properties of divisibility: let a, b, c, be three integers
(1)
(2)
(3)
(4)
(5)
If a|b and b|c, then a|c.
If a|b and a|c then a|(bx + cy) for all x, y ∈ Z.
If a|1 then a = ±1.
If a|b and b|a then a = ±b.
Suppose c 6= 0. Then a|b if and only if ac|bc.
Exercise. Prove the above properties.
Theorem 1.2. Let a ∈ Z and b ∈ N. Then there exists unique integers q, r ∈ Z such that
a = qb + r
and 0 ≤ r < b.
Proof. Let S = {a − nb : n ∈ Z}. Then we let r = a − qb be the smallest non-negative
element of S. Since S contains positive elements (take n to be the negative of a very large
integer) this smallest element exists by the well-ordering principle. We have 0 ≤ r < b,
otherwise r − b is a smaller non-negative element of S.
If a = qb + r = q 0 b + r0 then b|(r − r0 ), and −b < r − r0 < b so this implies that r = r0
and therefore q = q 0 . This shows uniqueness of q and r.
ELEMENTARY NUMBER THEORY (5CCM224A AND 6CCM224B)
3
Greatest common divisors.
Definition 1.3. Let a and b be integers. If d is another integer with d|a and d|b we say
that d is a common divisor of a and b.
If at least one of a and b are non-zero, we define the greatest common divisor of a and b
to be the largest positive integer d which is a common divisor of a and b. We write gcd(a, b)
for the greatest common divisor of a and b.
Exercise.
(1) Why did we not define gcd(0, 0)?
(2) Show that gcd(a, b) = gcd(|a|, |b|)
(3) For a a non-zero integer, show that gcd(a, 0) = |a|.
The Euclidean algorithm. The most obvious way to find the gcd of two integers is to
write down all the divisors of each number and compare them to find the greatest common
divisor. Of course this method will be very slow in practice! A more efficient way of
calculating the gcd is based on the following:
Lemma 1.4. If a = qb + r then gcd(a, b) = gcd(b, r).
Proof. Suppose d is a common divisor of a and b. Since r = a − qb we also have d|r so
d is a common divisor of b and r. In the other direction, if d is a common divisor of b
and r, then we also have d|a, since a = qb + r. So d is a common divisor of a and b. We
therefore see that the two sets of integers {common divisors of a and b}, and {common
divisors of b and r} are the same, so they have the same largest element and we have
gcd(a, b) = gcd(b, r).
Suppose we start off with integers a, b and we want to work out gcd(a, b). First we do
some easy simplifications: we can assume that a and b are both non-zero, since the gcd is
easy to work out if one of them is zero. Also, since gcd(a, b) = gcd(|a|, |b|) we can assume
that a and b are both positive. If a = b then gcd(a, b) = |a|, so we can assume a 6= b.
Finally, since gcd(a, b) = gcd(b, a) we can assume a > b > 0.
The above Lemma tells us that we can work out gcd(a, b) by first dividing a by b and
then working out the gcd of b and r, where b > r ≥ 0. Since r is smaller than b we have
reduced the problem to working out the gcd of smaller integers. Let’s write this out more
formally.
First we write
a = q1 b + r1 with 0 ≤ r1 < b.
If r1 = 0 then b|a and gcd(a, b) = b. If r1 6= 0 we divide again and write:
b = q2 r1 + r2 with 0 ≤ r2 < r1 .
We have gcd(a, b) = gcd(b, r1 ) = gcd(r1 , r2 ). If r2 = 0 then gcd(a, b) = r1 . If r2 6= 0 we
divide again:
r1 = q3 r2 + r3 with 0 ≤ r3 < r2
and we continue on in this way. Since b > r1 > r2 > · · · ≥ 0 we eventually get a remainder
rn = 0 (with rn−1 > 0). So gcd(a, b) = gcd(b, r1 ) = gcd(r1 , r2 ) = · · · = gcd(rn−1 , rn ) =
gcd(rn−1 , 0) = rn−1 . Let’s do an example:
4
JAMES NEWTON
Example. Let a = 1492 and b = 1066. We have
1492 = 1 · 1066 + 426
1066 = 2 · 426 + 214
426 = 1 · 214 + 212
214 = 1 · 212 + 2
212 = 106 · 2 + 0
The last non-zero remainder is 2, so gcd(1492, 1066) = 2.
Exercise. Write a computer program to implement the Euclidean algorithm.
This algorithm gives us a practical way to work out the gcd of two integers. It also gives
us an important theoretical result:
Theorem 1.5 (Bezout’s lemma). Let a and b be integers (not both 0). Then there exist
integers u and v such that
gcd(a, b) = au + bv
Proof. As before, we can reduce to the case that a > b > 0. The last two steps in the
Euclidean algorithm to compute gcd(a, b) have
rn−3 = qn−1 rn−2 + rn−1
and
rn−2 = qn rn−1 + 0
with rn−1 = gcd(a, b). So we have an equation
gcd(a, b) = rn−3 − qn−1 rn−2 .
From the previous equation in the Euclidean algorithm, we get rn−2 = rn−4 − qn−2 rn−3 , so
we can substitute this in to eliminate rn−2 and write gcd(a, b) as a linear combination of
rn−3 and rn−4 . Repeating this process, we eventually write gcd(a, b) as a linear combination
of a and b.
Note that the above proof gives us a practical algorithm to compute integers u, v with
gcd(a, b) = au + bv:
ELEMENTARY NUMBER THEORY (5CCM224A AND 6CCM224B)
5
Example. We again set a = 1492 and b = 1066. We have
gcd(a, b) = 2
= 214 − 1 · 212
= 214 − 1 · (426 − 1 · 214)
= −1 · 426 + 2 · 214
= −1 · 426 + 2(1066 − 2 · 426)
= 2 · 1066 − 5 · 426
= 2 · 1066 − 5(1492 − 1 · 1066)
= −5 · 1492 + 7 · 1066,
Bezout’s lemma is very useful for establishing theoretical properties of the greatest common divisor.
Exercise. Extend your programme implementing the Euclidean algorithm so it also outputs
integers u, v with gcd(a, b) = au + bv.
Another proof of Bezout’s lemma. Here is a different proof.
Proposition 1.6. Let a, b be integers, not both zero, and consider the set
S = {au + bv : u, v ∈ Z}.
Let d > 0 be the smallest positive integer in S. Then d = gcd(a, b).
Proof. We have d = au + bv. Dividing a by d we get
a = qd + r
with 0 ≤ r < d. Since r = a − qd = a(1 − qu) − b(qv) ∈ S and d is the smallest positive
element of S we must have r = 0. So d|a. By the same argument, we have d|b, so d is
a common divisor of a and b. Since gcd(a, b) divides a and b, it divides d = au + bv. In
particular, we have gcd(a, b) ≤ d. Since d is a common divisor of a and b we also have
d ≤ gcd(a, b) and we conclude that d = gcd(a, b).
Remark. Here are two consequences of this Proposition:
(1) We have integers u, v such that gcd(a, b) = au + bv (in other words, the Proposition
gives a new proof of Bezout’s lemma)
(2) gcd(a, b) = 1 if and only if there are integers u, v such that
1 = au + bv.
Corollary 1.7. Let a, b be integers, not both zero, and again consider the set
S = {au + bv : u, v ∈ Z}.
We can also consider the set
S 0 = {n gcd(a, b) : n ∈ Z}.
Then the two sets of integers S, S 0 are equal.
6
JAMES NEWTON
Proof. We let S 0 = {n gcd(a, b) : n ∈ Z}. We are going to show that S ⊂ S 0 and S 0 ⊂ S
and deduce that S = S 0 . For the first inclusion, if x = au + bv ∈ S then, since gcd(a, b)
divides a and b, it divides au + bv = x. So x is a multiple of gcd(a, b) which means that
x ∈ S 0 . This proves that S ⊂ S 0 .
Conversely, we now prove that S 0 ⊂ S. So let y ∈ S 0 . We have y = n gcd(a, b). By
Bezout’s lemma, we have gcd(a, b) = au + bv for some integers u, v. Multiplying by n we
get y = a(nu) + b(nv). So y ∈ S. This shows that S 0 ⊂ S, and we have completed the
proof.
Corollary 1.8. Let a, b be integers, not both zero. Let c be an integer. Then c is a common
divisor of a and b if and only if c| gcd(a, b).
Week 2
Proof. If c| gcd(a, b) then c|a and c|b, so c is a common divisor of a and b.
Conversely, suppose c is a common divisor of a and b. By Bezout’s lemma we have
integers u, v such that
gcd(a, b) = au + bv.
Since c|a and c|b we c|(au + bv) = gcd(a, b).
Coprimality.
Definition 1.9. We say that two integers a, b are coprime or relatively prime if
gcd(a, b) = 1.
Lemma 1.10. Suppose a, b are coprime.
(1) If a|c and b|c then (ab)|c.
(2) If a|(bc) then a|c.
(3) If a and c are also coprime, then a and bc are coprime.
Proof.
(1) We have 1 = au + bv for some integers u, v. We can also write c = ae and
c = bf , since a|c and b|c. Multiplying the first equation by c we get
c = cau + cbv = (bf )au + (ae)bv = ab(f u + ev)
so (ab)|c.
(2) Again we have c = cau + cbv. Since a|(bc) and a|a, we get that a|a(cu) + (bc)v = c.
(3) We have 1 = au + bv and 1 = ax + cy. Multiplying the equations together gives
1 = (au + bv)(ax + cy) = a(uax + ucy + bvx) + bc(vy).
It follows from Proposition 1.6 that gcd(a, bc) = 1.
Least common multiples.
Definition 1.11. If a, b are integers, then a common multiple of a and b is an integer c
such that a|c and b|c. If a and b are both non-zero, the least common multiple of a and
b is defined to be the smallest positive integer lcm(a, b) which is a common multiple of a
and b.
ELEMENTARY NUMBER THEORY (5CCM224A AND 6CCM224B)
7
Proposition 1.12. Let a, b be non-zero integers. Then
gcd(a, b)lcm(a, b) = |ab|.
Proof. Set a0 =
a
gcd(a,b)
and b0 =
b
.
gcd(a,b)
We want to show that
|ab|
gcd(a,b)
= lcm(a, b). First
|ab|
gcd(a,b)
we need to check that
is a (positive) common multiple of a and b. Since gcd(a, b) is
a common divisor of a and b, this follows from the fact that
|b|
|a|
|ab|
= |a|
= |b|
gcd(a, b)
gcd(a, b)
gcd(a, b)
|ab|
Now we need to show that gcd(a,b)
is the least common multiple. Suppose c is an arbitrary
positive common multiple of a and b. We have a|c and b|c. Dividing by gcd(a, b), we get
c
that a0 and b0 divide gcd(a,b)
. Since a0 and b0 are coprime (by Problem Sheet 1), Lemma
1.10 implies that a0 b0 divides
c
.
gcd(a,b)
Multiplying by gcd(a, b), we deduce that
gcd(a, b)|a0 b0 | divides c, so in particular c ≥
This shows that
|ab|
gcd(a,b)
|ab|
gcd(a,b)
=
|ab|
.
gcd(a,b)
= lcm(a, b).
Remark. We showed in the above proof that any common multiple of a and b is a multiple
of lcm(a, b).
1.1. Linear Diophantine equations. Diophantine equations are equations in one or
more variables, for which we seek integer-valued solutions. Corollary 1.7 allows us to
describe the solutions to the simplest family of Diophantine equations: linear Diophantine
equations where we have integers a, b, c and look for integer solutions (x, y) to
ax + by = c.
Theorem 1.13. Let a, b, c be integers, with a and b not both 0, and let d = gcd(a, b). The
equation
ax + by = c
has an integer solution (x, y) if and only if c is a multiple of d.
Now we assume c is a multiple of d. Fix integers u, v with au + bv = d. Then the
solutions to
ax + by = c
are given by (xn , yn )n∈Z , where
b
c
a
c
xn = u + n, yn = v − n
d
d
d
d
Proof. The first part of the theorem follows immediately from Corollary 1.7. For the second
part, we can check by substituting in that the pairs (xn , yn ) are solutions to the equation.
It remains to check that these are all the solutions. Suppose that (x, y) is a solution. We
can assume that b 6= 0 (otherwise we swap a and b). We have ax+by = c and ax0 +by0 = c,
so subtracting one equation from the other gives
a(x − x0 ) + b(y − y0 ) = 0
8
JAMES NEWTON
Dividing by d and rearranging gives
a
b
(x − x0 ) = − (y − y0 ).
d
d
b
a
Again we use the fact that d and d are coprime (Problem Sheet 1). Since db divides ad (x−x0 )
it follows from Lemma 1.10 that db divides x − x0 . So we have n ∈ Z such that x − x0 = db n.
n
= yn .
This shows that x = xn and we then have y = c−ax
b
2. Prime numbers
Definition 2.1. An integer p > 1 is called a prime number or a prime if it has no positive
divisors other that 1 and p. An integer n > 1 is called composite if it is not prime.
Remark. By convention, the number 1 is neither prime nor composite.
Lemma 2.2.
(1) Let p be a prime number and let a, b be integers. Suppose p|ab. Then
p|a or p|b.
(2) If we have integers a1 , a2 , . . . , an and p|(a1 a2 · · · an ) then p|ai for some i.
Proof.
(1) If p does not divide a, then gcd(a, p) = 1. So it follows from Lemma 1.10
that p|b.
(2) We repeatedly apply the first part. If p does not divide a1 then p|(a2 a3 · · · an ). If
p does not divide a2 either, then p|(a3 a4 · · · an ). We see that p eventually has to
divide one of the ai .
Theorem 2.3 (Fundamental theorem of arithmetic). Every integer n > 1 can be expressed
uniquely (up to reordering) as a product of primes.
Proof. First we prove existence of a prime factorisation, by induction on n. Since 2 is
prime, it has a prime factorisation. For the inductive step, we suppose that all integers
less than n can be expressed as a product of primes. If n is prime, then it is its own prime
factorisation and we are done. If n is not prime, then we have n = md with 1 < m < n
and 1 < d < n. By our inductive hypothesis, both m and d have prime factorisations, so
n does. Uniqueness follows from Lemma 2.2: suppose
n = p1 p2 · · · pr = q1 q2 · · · qs
are two prime factorisations. Since p1 |n we have p1 |qi for some i. Up to reordering we can
assume p1 |q1 , and since q1 is prime we must have p1 = q1 . Now we dividing through by p1
we get
p2 · · · pr = q2 · · · qs
Repeatedly applying the same argument, we can show that r = s and pi = qi for all i’s
(after reordering the qi ).
If n > 1 is an integer, we often write its prime factorisation as
n = pa11 pa22 · · · par r
ELEMENTARY NUMBER THEORY (5CCM224A AND 6CCM224B)
9
with the pi distinct primes and the ai positive integers. Sometimes we will allow some of
the ai to be equal to 0 as well (in this case prime pi is not a factor of n).
We can read off information about the divisibility properties of integers from their prime
factorisations.
Lemma 2.4. Let
n = pa11 pa22 · · · par r
with the pi distinct primes and the ai positive integers.
(1) d > 0 is a divisor of n if and only if
d = pb11 pb22 · · · pbrr
with 0 ≤ bi ≤ ai for each i.
Q
(2) The number of positive divisors of n is ri=1 (ai + 1).
Proof. Exercise (see problem sheet 2).
Lemma 2.5. Let m, n be two positive integers with
m = pa11 pa22 · · · par r
n = pb11 pb22 · · · pbrr
where ai , bi are integers which are ≥ 0.
(1) gcd(m, n) = pe11 pe22 · · · perr where ei = min(ai , bi )
(2) lcm(m, n) = pf11 pf22 · · · pfrr where fi = max(ai , bi )
Proof. Exercise (use the previous lemma to work out what the common divisors of m and
n are, and use Proposition 1.12 to work out the lcm once you know the gcd).
Theorem 2.6 (Euclid). There are infinitely many primes.
Proof. We give a proof by contradiction. Suppose there are only finitely many primes.
Then there is an integer N such that N ≥ p for all primes p. Consider the integer
n = N ! + 1 = 1 · 2 · · · (N − 1) · N + 1.
By the fundamental theorem of arithmetic, there is a prime p with p|n. Since p ≤ N , we
also have p|N !. But 1 = n − N ! so this implies that p|1 which is impossible.
Exercise. Prove that there are infinitely many primes of the form 4k − 1, with k a positive
integer.
Proposition 2.7. There are arbitrarily large gaps between consecutive primes. More precisely, if k ≥ 2 is an integer then there are k consecutive integers which are composite.
Proof. Let k ≥ 2 be an integer. We are going to show that there are k consecutive integers
none of which is prime. Consider
a2 = (k + 1)! + 2, a3 = (k + 1)! + 3, . . . , ak+1 = (k + 1)! + (k + 1)
this gives k consecutive integers, and we have i|ai for each 2 ≤ i ≤ k + 1. We also have
ai > i for each i, so none of the ai is prime.
10
JAMES NEWTON
Example. The above proof tells us that the seven consecutive integers 8!+2 = 40322, . . . , 8!+
8 = 40328 are composite. In fact, we can find a smaller sequence: the seven consecutive
integers 90, 91, 92, 93, 94, 95, 96 are all composite.
A very famous conjecture in number theory is about small gaps between primes:
Conjecture (Twin prime conjecture). There are infinitely many prime numbers p such
that p + 2 is also prime.
Recently, a major breakthrough was achieved by Yitang Zhang, which, with improvements by J. Maynard, T. Tao, and an online project with many collaborators called Polymath shows:
Theorem 2.8 (Zhang–Maynard–Tao–Polymath, 2013–2014). There are infinitely many
prime numbers p such that there is another prime number q with p < q < p + 246.
This is the closest we can get to the twin prime conjecture, for now!
3. Congruences
Definition 3.1. Let m be a non-zero integer, and let a, b ∈ Z. We say that a is congruent
to b modulo m if m|(a − b).
If a is congruent to b modulo m, we write
a ≡ b (mod m).
Here are some basic properties of congruences:
(1) a ≡ b (mod m) ⇐⇒ b ≡ a (mod m) ⇐⇒ a − b ≡ 0 (mod m)
(2) if a ≡ b (mod m) and b ≡ c (mod m), then a ≡ c (mod m)
(3) if a ≡ b (mod m) and c ≡ d (mod m) then ac ≡ bd (mod m) and ax+cy ≡ bx+dy
(mod m) for all x, y ∈ Z.
(4) if a ≡ b (mod m) and d|m, then a ≡ b (mod d).
(5) if a ≡ b (mod m) and c 6= 0, then ac ≡ bc (mod mc).
Exercise. Prove the above properties. Here is the first part of (3) to get you started: if
a ≡ b (mod m) and c ≡ d (mod m) then m|(a − b) and m|(c − d). We have ac − bd =
a(c − d) + d(a − b) so m|(ac − bd).
Week 3
Definition 3.2. Let m be a positive integer. A set {x1 , x2 , . . . , xr } is called a complete
residue system modulo m if for every integer y there is exactly one xi such that
y ≡ xi
(mod m).
Example. Let m be a positive integer. Then {0, 1, 2, . . . , m − 1} is a complete residue
system modulo m. In general, every complete residue system has size m.
There are other possibilities too: for example, if m = 5 then {−2, −1, 0, 1, 2} is a complete residue system modulo 5.
ELEMENTARY NUMBER THEORY (5CCM224A AND 6CCM224B)
11
Definition 3.3. Let m be a non-zero integer and a ∈ Z. The residue class or congruence
class of a is the set
[a]m = {b ∈ Z : b ≡ a (mod m)}
We might just write [a] if we don’t need to specify the modulus m.
Lemma 3.4.
[a]m = [b]m ⇐⇒ a ≡ b (mod m)
Proof. This follows from the definitions, it’s an exercise to write down the proof.
If follows from the above Lemma that if {x1 , x2 , . . . , xm } is a complete residue system
modulo m then for every integer a there is a unique xi such that [a]m = [xi ]m . The
collection of all congruence classes modulo m is therefore a finite set of cardinality m.
Definition 3.5. For a positive integer m, we let Zm denote the set of congruence classes
modulo m.
Remark. We can write
Zm = {[0]m , [1]m , . . . , [m − 1]m }.
In fact, if {x1 , x2 , . . . , xm } is any complete residue system modulo m then we have
Zm = {[x1 ]m , [x2 ]m , . . . , [xm ]m }.
Now we are going to recall the fact that the congruence classes modulo m naturally form
a commutative ring. In other words, we can add and multiply congruence classes.
Definition 3.6. For m 6= 0 and a, b ∈ Z we define addition and multiplication operations
on Zm by:
[a]m + [b]m = [a + b]m
[a]m · [b]m = [a · b]m .
Using the properties of congruences, you can check that these binary operations are
well-defined: in other words, if a ≡ a0 (mod m) and b ≡ b0 (mod m) then [a]m + [b]m =
[a0 ]m + [b0 ]m and [a]m · [b]m = [a0 ]m · [b0 ]m .
Solving equations in Zm . We already saw the example of linear Diophantine equations.
Suppose we want to find integer solutions x to a polynomial equation in one variable like
x7 + 2x + 2 = 0.
This might be difficult in general, but in the next few lectures we will study the easier
problem of solving these equations modulo m for positive integers m.
12
JAMES NEWTON
Problem. Given a polynomial f (x) = a0 + a1 x + · · · + an xn with ai ∈ Z and a positive
integer m, find all integers a ∈ Z such that f (a) ≡ 0 (mod m).
Lemma 3.7. Let f (x) = a0 + a1 x + · · · + an xn be a polynomial with integer coefficients
ai ∈ Z. If a ≡ b (mod m) then f (a) ≡ f (b) (mod m).
Proof. Since a ≡ b (mod m) we have ai ≡ bi (mod m) for all i ≥ 0. So ai ai ≡ ai bi
(mod m) for all i ≥ 0. Adding together these congruences we get f (a) ≡ f (b) (mod m).
This lemma means that to find all the integers a ∈ Z such that f (a) ≡ 0 (mod m)
we just need to check if f (a) ≡ 0 (mod m) or not for one a in each congruence class.
Moreover, we can view the set of solutions to the congruence equation f (x) ≡ 0 (mod m)
as a subset of Zm .
Another way of seeing this is that we can consider the reduction modulo m of f (x):
[f ]m (x) = [a0 ]m + [a1 ]m x + · · · + [an ]m xn
this is a polynomial with coefficients in Zm . If [a]m ∈ Zm we can evaluate [f ]m (x) at [a]m
to and the Problem is to solve the equation
[f ]m (x) = [0]m
in Zm . Again we see that the solutions to this equation are a subset of Zm .
Example. Let f (x) = x2 + 1 and consider m = 3. Then there are no solutions to f (x) ≡ 0
(mod 3) (since none of 02 + 1, 12 + 1 and 22 + 1 are divisible by 3)/
Consider m = 5. Then the solutions to f (x) ≡ 0 (mod 5) are given by x ≡ ±2 (mod 5).
Remark. If we replace f (x) be a polynomial g(x) whose coefficients are all congruent to
those of f modulo m, then the solutions to f (x) ≡ 0 (mod m) and g(x) ≡ 0 (mod m) are
exactly the same.
Here are some examples I didn’t do in the lecture:
Example. Let m = 5 and f (x) = x2 + 2x + 2. We check which elements in the complete
residue system {−2, −1, 0, 1, 2} satisfy f (x) ≡ 0 (mod 5):
f (−2) = 2
f (−1) = 1
f (0) = 2
f (1) = 5 ≡ 0
f (2) = 10 ≡ 0
(mod 5)
(mod 5)
So the solutions to f (x) ≡ 0 (mod 5) are given by x ≡ 1, 2 (mod 5).
Example. Let m = 4 and f (x) = x5 −x2 +x−3. We check which elements in the complete
residue system {−1, 0, 1, 2} satisfy f (x) ≡ 0 (mod 4):
ELEMENTARY NUMBER THEORY (5CCM224A AND 6CCM224B)
13
f (−1) = −6
f (0) = −3
f (1) = −2
f (2) = 27
So there are no solutions to f (x) ≡ 0 (mod 4). As an immediate consequence, there
are no integer solutions to f (x) = 0, as a solution to this equation would give a solution
(mod 4).
If m is very large this method would take a long time! We are going to develop some
tools that will enable us to solve this problem more efficiently.
Theorem 3.8 (Chinese Remainder Theorem). Let m1 , . . . , mr be positive integers, with
gcd(mi , mj ) = 1 for all i 6= j. Let a1 , . . . , ar be integers. Then the solutions of the
simultaneous congruence equations
x ≡ a1
(mod m1 )
x ≡ a2
..
.
(mod m2 )
x ≡ ar
(mod mr )
are given by integers x lying in a single congruence class (mod m1 m2 · · · mr ).
Proof. We start by doing the case r = 2. By Bezout’s lemma we have integers u, v such
that m1 u + m2 v = 1. In particular, we have m1 u ≡ 1 (mod m2 ) and m2 v ≡ 1 (mod m1 ).
Suppose x ≡ m1 ua2 + m2 va1 (mod m1 m2 ). Then we have x ≡ m2 va1 ≡ a1 (mod m1 )
and x ≡ m1 ua2 ≡ a2 (mod m2 ). This gives us some solutions, and now we check that all
the solutions are given by x ≡ m1 ua2 + m2 va1 (mod m1 m2 ).
Suppose x0 is another solution. Then x − x0 ≡ a1 − a1 ≡ 0 (mod m1 ), and similarly
x − x0 ≡ 0 (mod m2 ). So m1 |(x − x0 ) and m2 |(x − x0 ). Since m1 and m2 are coprime this
implies that m1 m2 divides (x − x0 ). We deduce that x0 ≡ x (mod m1 m2 ).
This finishes the case of r = 2. For the general case r > 2 we just repeatedly apply the
case r = 2. Let’s explain the case r = 3: we have three simultaneous equations and we
begin by considering the first two:
x ≡ a1
(mod m1 )
x ≡ a2
(mod m2 )
The case r = 2 tells us that these two equations are equivalent to the equation x ≡ a12
(mod m1 m2 ) where a12 is some integer (it’s m1 ua2 +m2 va1 , using the notation from earlier).
Now we just have two simultaneous equations to solve:
14
JAMES NEWTON
x ≡ a12
x ≡ a3
(mod m1 m2 )
(mod m3 )
and applying the case r = 2 again tells us that these are equivalent to x ≡ a (mod m1 m2 m3 )
for some integer a.
Note that the above proof gives a procedure for finding the solutions x (see the example
later).
Here is the ring-theoretic formulation of this theorem:
Theorem (Chinese Remainder Theorem). Let m1 , . . . , mr be positive integers, with gcd(mi , mj ) =
1 for all i 6= j. Set m = m1 m2 · · · mr . The map
Zm → Zm1 × Zm2 × · · · × Zmr
given by
[a]m 7→ ([a]mi )i=1,...,r
is a bijection (in fact it is a ring isomorphism).
Example. Let’s find all the solutions to
x2 + 1 ≡ 0
(mod 10).
We begin by finding solutions to
x2 + 1 ≡ 0
(mod mi )
where m1 = 2, m2 = 5. The solutions are given by x ≡ 1 (mod 2), x ≡ −2, 2 (mod 5).
Now we apply the proof of the CRT: the integers u, v in Bezout’s lemma can be taken to
be u = −2, v = 1 since 1 = 2(−2) + 5(1). Our solutions are given by x ≡ m1 ua2 + m2 va1
(mod m1 m2 ) where m1 = 2, m2 = 5, a1 = 1 and a2 = ±2.
So we get
x ≡ 2 · (−2) · (±2) + 5 · (1) · (1) ≡ ∓8 + 5 ≡ ±3
(mod 10)
Subsituting back in you see that these do give solutions, since (±3)2 + 1 = 10 ≡ 0
(mod 10).
Now we move on to discussing inverses modulo an integer, which appeared implicitly in
the proof of the Chinese Remainder Theorem (where we find u and v such that m1 u ≡ 1
(mod m2 ) and m2 v ≡ 1 (mod m1 ).)
Lemma 3.9. Let m be a positive integer and let a ∈ Z. If gcd(a, m) = 1 then there exists
b ∈ Z such that ab ≡ 1 (mod m).
We call such a b an inverse of a modulo m, and denote the residue class [b]m (which
depends only on the residue class [a]m ) by [a]−1
m .
ELEMENTARY NUMBER THEORY (5CCM224A AND 6CCM224B)
15
Proof. We know by Bezout’s lemma that we have au + mv = 1 for some integers u, v. This
says that au ≡ 1 (mod m) so u is an inverse to a modulo m.
If a0 ≡ a (mod m) then a0 u ≡ au ≡ 1 (mod m) so u is also an inverse to a0 (mod m).
Finally if b, b0 are two inverses to a modulo m then ab ≡ ab0 (mod m) which implies that
m|a(b − b0 ). Since m, a are coprime this implies that m|b − b0 and so [b]m = [b0 ]m . This
shows that the residue class [b]m depends only on [a]m (in particular, it is independent of
the choice of b).
Remark. We showed that if a, m are coprime then there is an inverse to a modulo m. In
fact, if there exists b such that ab ≡ 1 (mod m) then a must be coprime to m, since we
know that there are integers u, v with au + mv = 1 if and only if a, m are coprime.
Example. If p is a prime number, then the congruence classes [1]p , . . . , [p − 1]p has an
inverse modulo p. The remaining congruence class [0]p does not have an inverse modulo p.
Definition 3.10. We write Z×
m for the multiplicative group of integers modulo m (of the
group of units modulo m), which are defined by
Z×
m = {[a]m ∈ Zm : a, m are coprime}
The explanation for the terminology in the above definition is that it follows from
Lemma 3.9 that Z×
m with the multiplication operation is a (commutative) group, with
−1
identity element [1]m . In particular, for every g ∈ Z×
∈ Z×
m there is an inverse g
m with
−1
−1
g · g = g · g = [1]m .
We will investigate the structure of this group later in the course.
Here is some terminology which is sometimes used to describe the analogue of a complete
residue system, but just representing congruence classes in Z×
m.
Definition 3.11. Let m be a non-zero integer. A set {x1 , x2 , . . . , xr } is called a reduced
residue system modulo m if for every integer y with gcd(y, m) = 1 there is exactly one xi
such that
y ≡ xi (mod m).
We can use the existence of inverses to solve some simple equations modulo m:
Proposition 3.12. Let a, b ∈ Z and let m be a positive integer. Set d = gcd(a, m). The
congruence equation
ax ≡ b (mod m)
has an integer solution for x if and only if d|b.
If d|b, the solutions are given by integers x such that
−1 " #
[x] md =
a
d
m
d
b
d
m
d
Proof. If ax ≡ b (mod m) then b = ax + km for some integer k. So gcd(a, m) (which
divides a and m) must divide b. Conversely, if d|b then
a
b
m
x≡
(mod )
d
d
d
16
JAMES NEWTON
if and onlu if ax ≡ b (mod m). Multiplying by an inverse of
since ad and md are coprime) we get that
b
a
x≡
d
d
(mod
a
d
modulo
m
d
(which exists
m
)
d
if and only if
−1 " #
[x] md =
a
d
m
d
b
d
m
d
Hensel’s Lemma. We go back to find roots of polynomials modulo m. Suppose we have
a prime factorisation m = pa11 pa22 · · · par r . The Chinese Remainder Theorem tells us that to
solve an equation modulo m it suffices to solve the equation modulo pai i for each i. A very
powerful tool for finding roots of polynomials modulo a prime power (like pai i ) is provided
by Hensel’s Lemma:
Theorem 3.13 (Hensel’s Lemma). Let f (x) = a0 + a1 x + · · · + an xn be a polynomial with
integer coefficients, let p be a prime, and let r be a positive integer. We let f 0 (x) be the
derivative of f (x), so f 0 (x) = a1 + 2a2 x + · · · + nan xn−1 . Suppose xr is an integer with
f (xr ) ≡ 0
(mod pr )
f 0 (xr ) 6≡ 0
(mod p)
and
Then there exists xr+1 ∈ Z satisfying
f (xr+1 ) ≡ 0
(mod pr+1 ) and xr+1 ≡ xr
(mod pr )
Moreover, the xr+1 satisfying these properties is unique modulo pr+1 and we can take
xr+1 = xr − f (xr )u
where u is an inverse of f 0 (xr ) modulo p.
i)
Remark. The formula xr+1 = xr − f (xr )u is similar to the formula xi+1 = xi − ff0(x
which
(xi )
appears in the Newton–Raphson method for finding zeroes of a function f .
Remark. If f 0 (xr ) ≡ 0 (mod p) then Hensel’s lemma doesn’t tell us anything — we have
to find roots of the polynomial another way.
We will prove Hensel’s Lemma using the following key result:
Lemma 3.14. For t ∈ Z and a positive integer r, we have
f (x + pr t) ≡ f (x) + f 0 (x)pr t (mod pr+1 )
where we view both sides as polynomials in x and we mean that all the coefficients of these
two polynomials are congruent modulo pr+1 .
ELEMENTARY NUMBER THEORY (5CCM224A AND 6CCM224B)
17
Proof. It suffices to prove the Lemma for f (x) = xn (multiply by the coefficients ai and
add together to get the general case). We have
(x + pr t)n =
n
X
i=0
r
i
For i ≥ 2, (p t) =≡ 0 (mod p
(x + pr t)n ≡
1
X
i=0
r+1
!
n n−i r i
x (p t)
i
), so we have
!
n n−i r i
x (p t) = xn + nxn−1 (pr t)
i
(mod pr+1 )
Proof of Hensel’s Lemma. We are looking for a solution to f (x) ≡ 0 (mod pr+1 ) of the
form x = xr + pr t. Applying the previous lemma, we see that we get a solution if and only
if
f (xr ) + f 0 (xr )pr t ≡ 0 (mod pr+1 )
This equation is equivalent to
pr t ≡ −f (xr )u
(mod pr+1 )
where u is an inverse to f 0 (xr ) modulo p. So we deduce that all solutions to f (x) ≡ 0
(mod pr+1 ) with x ≡ xr (mod pr ) are given by x ≡ xr − f (xr )u (mod pr+1 ).
Example. Let’s consider the equation
f (x) = x2 + 1 ≡ 0
(mod 65e )
We are going to show that for each e ≥ 1 this equation has exactly 4 solutions in Z65e .
It suffices to show that there are two solutions x1 , x2 modulo 5e and two solutions y1 , y2
modulo 13e for each e (then we apply the Chinese Remainder theorem to get four solutions,
for example one congruent to x1 (mod 5e ) and one congruent to y2 (mod 13e )).
First we consider the equation mod 5: there are two solutions x ≡ −2, 2 (mod 5). Now
0
f (x) = 2x, so f 0 (−2) and f 0 (2) are both non-zero mod 5. It follows that we can apply
Hensel’s Lemma to show that there are exactly two solutions mod 52 , one congruent to
2 (mod 5), the other to −2. Repeatedly applying Hensel’s Lemma we get exactly two
solutions mod 5e for each e ≥ 1. The same argument works mod 13 (the two solutions
mod 13 are x ≡ ±5). So we conclude there are also exactly two solutions mod 13e .
The conclusion of this section is that, using the CRT and Hensel’s Lemma, we can, in
good situations (when the conditions of Hensel’s Lemma are met) reduce solving congruence
equations modulo m to the problem of solving congruence equations modulo p, for prime
factors p of m. The next theorem gives some control on the number of possible solutions.
Theorem 3.15. Let p be a prime, and let f (x) = a0 + a1 x + · · · + an xn be a polynomial of
degree ≤ n with integer coefficients (we allow the possibility that an = 0). We suppose that
ai 6≡ 0 (mod p) for some i. Then the congruence equation
f (x) ≡ 0
(mod p)
Week 4
18
JAMES NEWTON
has at most n solutions in Zp .
Proof. The proof is by induction on n. If n = 0, and a0 6≡ 0 (mod p) then there are no
solutions to a0 = f (x) ≡ 0 (mod p), as required. Now we do the inductive step. So we
assume that n ≥ 1 and that all polynomials g(x) of degree ≤ n − 1 with some coefficient
not divisible by p have at most n − 1 solutions.
If f (x) ≡ 0 has no solutions, then there is nothing left to prove, so suppose [a]p is a
solution. So p|f (a). We have
f (x) − f (a) =
n
X
ai (xi − ai )
i=1
and for each i we have
xi − ai = (x − a)(xi−1 + axi−2 + · · · + ai−2 x + ai−1 )
Taking out the common factor of (x − a) we can write
f (x) − f (a) = (x − a)g(x)
where g(x) is a polynomial with integer coefficients of degree ≤ n − 1. If p divided all
the coefficients of g(x) it would divide all the coefficients of f (x) − f (a). Since p|f (a) this
implies that p divides all the coefficients of f (x), which is a contradiction. So there is a
coefficient of g(x) which is not divisible by p and by our inductive hypothesis we conclude
that the equation g(x) ≡ 0 has at most n − 1 roots in Zp . Now suppose we have f (b) ≡ 0
(mod p). Then we have (b − a)g(b) ≡ 0 (mod p). If p|(b − a)g(b) then either p|(b − a) or
p|g(b). So there are at most n possibilities for b (mod p) — either b ≡ a (mod p) or the
congruence class of b is one of the roots of g(b) in Zp .
In fact the above Theorem is a special case of a general fact about roots of polynomials
with coefficients in a field. Note that the statement of the Theorem doesn’t hold without
the assumption that p is prime: for example, let’s consider the equation
x2 ≡ 1
(mod 8)
You can check that the solutions are given by x ≡ 1, 3, 5, 7 (mod 8) so there are > 2
solutions in Z8 .
4. Euler’s φ function
Definition 4.1. Let m be a positive integer. We define φ(m) to be the number of integers
a such that 1 ≤ a ≤ m and gcd(a, m) = 1. Equivalently φ(m) = |Z×
m |, the cardinality of
the multiplicative group Z×
.
m
Example. If p is prime then φ(p) = p − 1.
Lemma 4.2. Let m, n be coprime positive integers. Then φ(mn) = φ(m)φ(n).
Proof. Here are two examples of proofs (which are mathematically equivalent, but one
takes a slightly more group-theoretic viewpoint):
ELEMENTARY NUMBER THEORY (5CCM224A AND 6CCM224B)
19
(1) Recall that the CRT gives a bijection
Zmn → Zm × Zn
where the maps is [a]mn 7→ ([a]m , [a]n ). We claim that this gives a bijection
×
×
Z×
mn → Zm × Zn
and comparing cardinalities of the two sides proves the lemma. In fact this bijection
is an isomorphism of groups. We need to show the claim. In other words, we need
to show that gcd(a, mn) = 1 if and only if gcd(a, m) = gcd(a, n) = 1. A common
divisor of a and m is a common divisor of a and mn, so if gcd(a, mn) = 1 we
must have gcd(a, m) = 1; applying the same argument to n as well shows that if
gcd(a, mn) = 1 we have gcd(a, m) = gcd(a, n) = 1.
Conversely, suppose that gcd(a, m) = gcd(a, n) = 1. Then gcd(a, mn) = 1 by
Lemma 1.10. This completes the proof of the claim.
(2) We want to count integers 1 ≤ a ≤ mn with gcd(a, mn) = 1. First we note
that a is coprime to mn if and only if it is coprime to m and n (see the previous
paragraph for a proof of this). So we need to count integers 1 ≤ a ≤ mn with
gcd(a, m) = gcd(a, n) = 1. Since m, n are coprime the Chinese Remainder Theorem
says that for each pair of integers 1 ≤ a1 ≤ m, 1 ≤ a2 ≤ n there is an integer a,
unique modulo mn with a ≡ a1 (mod m) and a ≡ a2 (mod n). Since the integers
between 1 and mn are a complete residue system mod mn this says that there is
a unique integer a with 1 ≤ a ≤ mn with a ≡ a1 (mod m) and a ≡ a2 (mod n).
As we let a1 run over the φ(m) possibilities which are coprime to m and a2 run
over the φ(n) possibilities which are coprime to mn, we get φ(m)φ(n) integers a
with 1 ≤ a ≤ mn and a coprime to mn, and we get all such integers this way. So
φ(mn) = φ(m)φ(n).
This Lemma says that φ is a multiplicative function. If m and n are not coprime, we
usually don’t have φ(mn) = φ(m)φ(n). For example φ(4) = 2 6= φ(2)φ(2) = 1.
Lemma 4.3. Let p be prime and i a positive integer. Then
φ(pi ) = pi−1 (p − 1)
Proof. An integer a with 1 ≤ a ≤ pi is coprime to pi if and only if a is not divisible by
p. So the number of such integers is equal to pi minus the number of multiples of p in
{1, . . . , pi }. There are pi−1 multiples of p in this range, namely {p, 2p, . . . , pi−1 p}, so we
get φ(pi ) = pi − pi−1 = pi−1 (p − 1).
Example. φ(1000) = φ(23 53 ) = φ(23 )φ(53 ) = 4 · 25 · 4 = 400
Proposition 4.4. Let m be a positive integer. Then
X
0<d|m
φ(d) = m.
20
JAMES NEWTON
Proof. (I didn’t do this proof in lectures) For each positive divisor d of m we let Sd = {1 ≤
a ≤ d : gcd(a, d) = 1}. We have |Sd | = φ(d). For every integer a with 1 ≤ a ≤ m we have
m
a
m
a positive divisor gcd(a,m)
of m and an element gcd(a,m)
of S gcd(a,m)
. Conversely, for a ∈ Sd
m
m
m
m
we get an integer a d with 1 ≤ a d ≤ m, and gcd(a d , m) = d . This shows that we have a
bijection
{1, 2, . . . , m} ↔ {pairs (d, a) : 0 < d|m and a ∈ Sd }
The cardinality of the left hand side is m, whilst the cardinality of the right hand side is
P
0<d|m φ(d).
Exercise. Find another proof of the above first Proposition, by first proving it for m = pi
using the exact formula we have for φ(pi ) (I did this part in the lecture) and then using
P
multiplicativity of φ to show that the function F (m) = 0<d|m φ(d) is also multiplicative.
Theorem 4.5 (The Fermat–Euler Theorem). Let a ∈ Z and let m be a positive integer.
Suppose gcd(a, m) = 1. Then
aφ(m) ≡ 1 (mod m)
×
Proof. Since Z×
m is a group of order φ(m), and [a]m ∈ Zm , it is an immediate consequence
of Lagrange’s theorem in group theory that [a]φ(m)
= [1]m , because the order of the cyclic
m
group generated by [a]m is a divisor of φ(m). In other words,
aφ(m) ≡ 1
(mod m)
Here is an alternative proof, without using group theory: let x1 , x2 , . . . , xφ(m) be the list
of integers between 1 and m which are coprime to m. You can check that the integers
ax1 , ax2 , . . . , axφ(m) are all distinct modulo m and coprime to m, so this gives another list
(possibly reordered) of integers whose congruence classes give all the units in Zm . In other
words we have
Z×
m = {[x1 ]m , [x2 ]m , . . . , [xφ(m) ]m } = {[ax1 ]m , [ax2 ]m , . . . , [axφ(m) ]m }
Taking the product of all the elements in each list, we get
[x1 ]m [x2 ]m · · · [xφ(m) ]m = [ax1 ]m [ax2 ]m · · · [axφ(m) ]m = [aφ(m) ]m [x1 ]m [x2 ]m · · · [xφ(m) ]m .
Dividing through by the unit [x1 ]m [x2 ]m · · · [xφ(m) ]m gives [aφ(m) ]m = 1.
Corollary 4.6 (Fermat’s Little Theorem). Let p be a prime and let a ∈ Z be an integer
which is coprime to p. Then
ap−1 ≡ 1 (mod p)
If a is an arbitrary integer, we have ap ≡ a (mod p).
Proof. The first part is a special case of the Fermat–Euler Theorem. The second part
follows from the first, unless p|a, in which case ap ≡ a ≡ 0 (mod p), so the statement also
holds in this case.
Definition 4.7. Let m be a positive integer and let [a] ∈ Z×
m . The order of [a] (also called
the order of a modulo m) is defined to be the smallest positive integer i such that [a]i = [1].
In other words, it is the order of the cyclic subgroup of Z×
m generated by [a].
ELEMENTARY NUMBER THEORY (5CCM224A AND 6CCM224B)
21
As noted in the proof of the Fermat–Euler Theorem, the order of [a] is a divisor of φ(m).
5. Primitive roots
This section is about trying to understand the structure of Z×
m as a group. We will
briefly recall some basic facts from group theory.
Let G be a finite group, with group multiplication · and identity e. The order of G is
the number of elements of G. If g ∈ G then the order of g, denoted o(g) is the smallest
positive integer i such that g i = e. The order o(g) is also the order of the cyclic group
hgi = {e, g, g 2 , . . . , g o(g)−1 }
generated by g.
If i is an integer, we have g i = e ⇐⇒ o(g)|i.
Problem. For which positive integers m is Z×
m a cyclic group?
In other words, for which m is there an integer a (coprime to m) such that the order of
a modulo m is φ(m). If such an a exists, then Z×
m is cyclic, generated by [a]m , and we say
that a is a primitive root modulo m.
Lemma 5.1. Let m be a positive integer, and let a be coprime to m. Then a is a primitive
root modulo m if and only if
aφ(m)/p 6≡ 1 (mod m)
for all prime divisors p of φ(m).
Proof. The order of a modulo m is a divisor of φ(m). There are two possibilities: either
the order is equal to φ(m), or the order is less than φ(m), in which case it divides φ(m)/p
for some prime factor p of φ(m). Exercise to finish from here.
Example. Let m = 5. Then Z×
5 = {[1], [2], [3], [4]} and o([2]) = 4 = φ(5), by the previous
Lemma since [2]2 6= [1]. So Z×
is
cyclic.
5
Example. Let m = 8. Then Z×
8 = {[1], [3], [5], [7]}. You can check that o([3]) = o([5]) =
o([7]) = 2, so no element has order 4 = φ(8) and the group is not cyclic.
Week 5
Theorem 5.2. Let p be a prime number. Then Z×
p has φ(d) elements of order d, for each
×
0 < d|(p − 1). In particular, Zp is cyclic, as there are φ(p − 1) elements of order p − 1
(which are the primitive roots).
Proof. Let d|(p−1) be a positive divisor of p−1. We define Wd = {[a] ∈ Z×
p : [a] has order d}
and let wd = |Wd |. We want to show that wd = φ(d) for each d. We know that every
element of Z×
p is an element of Wd for exactly one d (the order of the element, which is
necessarily a divisor of p − 1). So we have
X
0<d|p−1
wd = p − 1.
22
JAMES NEWTON
We also have (Proposition 4.4)
X
φ(d) = p − 1
0<d|p−1
so
X
(φ(d) − wd ) = 0.
0<d|p−1
We are going to show that wd ≤ φ(d) for all d. This means that the above sum is a sum
of non-negative integers with value 0, so all the summands must be 0, and we are done.
If Wd is empty then we obviously have 0 = wd ≤ φ(d). So suppose Wd contains an
element [a]. The powers a, a2 , . . . , ad are all distinct modulo p, since a has order d modulo
p, they are all roots of the polynomial f (x) = xd − 1 in Zp . It follows from Theorem 3.15
that these are all the roots of this polynomial, since it has at most d. Now let [b] be any
element of Wd . Since [b] is a root of xd − 1 we have [b] = [ai ] for some i = 1, 2, . . . , d. Since
o([b]) = d, we must have gcd(i, d) = 1, as we have [b]d/ gcd(d,i) = ([a]d )i/ gcd(d,i) = [1] (see
Exercise 1 on Problem Sheet 5), so we must have d/ gcd(d, i) = d. It follows that there
are at most φ(d) possibilities for [b] — it is equal to [ai ] for one of the φ(d) integers i with
1 ≤ i ≤ d and gcd(a, d) = 1. So we have shown that wd ≤ φ(d), which completes the
proof.
Applications of primitive roots.
Example.
(1) We check that 2 is a primitive root modulo 11: Since φ(11) = 10, by
Lemma 5.1 we just need to check that 25 and 22 are not 1 mod 11, which is easy.
(2) We find all solutions to
7x3 ≡ 3 (mod 11)
We have [7]−1
11 = [−3]11 so we have to solve
x3 ≡ −3 · 3 ≡ 2
(mod 11)
Since 2 is a primitive root mod 11, we have
i
Z×
11 = {[2]11 : i = 0, 1, . . . , 10}
so to solve the equation we need to find the i such that ([2]i11 )3 = [2]3i
11 = [2]11 . This
is equivalent to
3i ≡ 1 (mod 10).
We have [3]−1
10 = 7, so we get
i≡7
(mod 10).
So finally, the solution is given by [x]11 = [2]711 , or equivalently
x ≡ 27 = 128 ≡ 7
(mod 11)
ELEMENTARY NUMBER THEORY (5CCM224A AND 6CCM224B)
23
(3) We find all positive integers y such that
4y ≡ 5
(mod 11)
Since 5 ≡ 24 (mod 1)1 we have to solve 22y ≡ 24 (mod 11). Equivalently, we have
2y ≡ 4 (mod 10). By Corollary 3.12, this is equivalent to y ≡ 2 (mod 5).
(4) Here’s an additional example where we will find there are no solutions: consider
the equation
x8 ≡ 10 (mod 11)
We have 10 ≡ −1 ≡ 25 (mod 11). So we need to find i such that ([2]i11 )8 = [2]8i
11 =
[2]511 . This is equivalent to
8i ≡ 5
(mod 10).
But, by Corollary 3.12, this equation has no solutions, since gcd(8, 10) = 2 - 5. So
there are no solutions to the original equation.
Now we are going to study the case of prime powers m = pi .
Proposition 5.3. Let p be a prime number. Suppose a is a primitive root modulo p. Then
either a or a + p is a primitive root modulo p2 .
Proof. We have φ(p2 ) = p(p − 1). Since o([a]p ) = o([a + p]p ) = p − 1 we know that o([a]p2 )
and o([a + p]p2 ) are divisors of p(p − 1) which are divisible by p − 1 (see the below remark
for more on this). So these orders are equal to either p − 1 or p(p − 1). If o([a]p2 ) = p(p − 1)
then a is a primitive root modulo p2 and we are done. So suppose o([a]p2 ) = p − 1. Then
(a + p)p−1 ≡ ap−1 + pap−2 ≡ 1 + pap−2
(mod p2 )
which is not congruent to 1, so we must have o([a + p]p2 ) = p(p − 1) and a + p is a primitive
root modulo p2 .
Remark. A bit more explanation of one of the steps in the above proof: we claimed that
o([a]p2 ) and o([a + p]p2 ) are divisors of p(p − 1) which are divisible by p − 1. The fact that
these orders divide p(p − 1) is because of Lagrange’s theorem: the order of an element
2
divides the order of Z×
p2 which is φ(p ) = p(p − 1). The fact that the orders are divisible by
p − 1 is because of the following (the same argument works for a and a + p): by definition
we have ao([a]p2 ) ≡ 1 (mod p2 ). If an integer is divisible by p2 it is divisible by p, so we
get ao([a]p2 ) ≡ 1 (mod p). But now we use a general fact about orders of elements in finite
groups: if g i = e then o(g) divides i. In this case this means that o([a]p ) divides o([a]p2 ).
Since a is a primitive root modulo p, we get that p − 1 divides o([a]p2 ).
The Proposition immediately implies that Z×
p2 is cyclic, and gives a procedure to find a
primitive root if we already know a primitive root mod p.
Proposition 5.4. Let p be an odd prime and suppose a is a primitive root modulo p2 .
Then a is a primitive root modulo pi for all i ≥ 2.
24
JAMES NEWTON
Proof. The proof is by induction. Let i ≥ 2 and suppose a is a primitive root modulo pi .
We want to show that a is a primitive root modulo pi+1 . By the same proof as described
in the above remark, we know that o([a]pi+1 ) is divisible by o([a]pi ) = pi−1 (p − 1), so it
suffices to show that
i−1
ap (p−1) 6≡ 1 (mod pi+1 ).
i−1
i−2
We have ap (p−1) = (ap (p−1) )p . Since a is a primitive root modulo pi we know that
ap
but
i−2 (p−1)
≡1
(mod pi−1 )
i−2
ap (p−1) 6≡ 1 (mod pi )
i−2
So there is an integer x such that ap (p−1) = 1 + pi−1 x and p does not divide x.
Now we have
!
p 2i−2 2
pi−1 (p−1)
pi−2 (p−1) p
i−1 p
i
a
= (a
) = (1 + p x) ≡ 1 + p x +
p
x (mod pi+1 )
2
and since p is an odd prime we have p|
ap
i−1 (p−1)
p
2
, so pi+1 |
p
2
≡ 1 + pi x 6≡ 1
p2i−2 (here we use i ≥ 2). So we get
(mod pi+1 )
since p does not divide x. This completes the proof.
We already say that Z×
8 is not cyclic, so we cannot expect this Proposition to work for
p = 2.
i
i
Fact 5.5. Let m be a positive integer. Then Z×
m is cyclic if and only if m = 1, 2, 4, p or 2p
for some odd prime p and positive integer i.
Here is a useful additional Proposition which I mentioned in a later lecture, generalising
Theorem 5.2:
Proposition 5.6. Suppose m > 0 is a positive integer, and suppose that Z×
m has a primitive
root. (For example, we could have m = pn with p an odd prime). Then the number of
primitive roots in Z×
m is φ(φ(m)).
i
Proof. Let g be a primitive root modulo m. Then the elements of Z×
m are {[g] : 1 ≤
i ≤ φ(m)}. We claim that [g]i is a primitive root if and only if i is coprime to φ(m). In
particular there are φ(φ(m)) primitive roots. The claim follows from Exercise 1 on Problem
Sheet 5.
6. Quadratic residues
Problem. Given a prime p and an integer a, decide whether
x2 ≡ a (mod p)
has a solution or not.
If a ≡ 0 (mod p) then the equation is solved by x ≡ 0 (mod p). If p = 2, then the only
other possibility is a ≡ 1 (mod 2), in which case the equation is solved by x ≡ 1 (mod 2).
So from now on we assume that p is odd and a is coprime to p.
ELEMENTARY NUMBER THEORY (5CCM224A AND 6CCM224B)
25
Definition 6.1. Let p be an odd prime and a an integer coprime to p. We say that a is a
quadratic residue modulo p if the equation
x2 ≡ a (mod p)
has a solution, and we say that a is a quadratic non-residue modulo p otherwise.
Note that whether a is a quadratic residue or not depends only on the congruence class
[a]p . So it makes sense to ask if a congruence class [a] ∈ Z×
p is a quadratic residue or not.
Example. The quadratic residues modulo 11 are the congruence classes of 1, 3, 4, 5, 9.
Proposition 6.2. Let g ∈ Z be a primitive root modulo p. Then [g i ]p is a quadratic residue
if and only if i is even.
Proof. If i = 2k is even, then (g k )2 ≡ g i (mod p), so [g i ] is a quadratic residue.
Conversely, if x2 ≡ g i (mod p) then we have [x] = [g k ] for some integer k, so we have
2k
g ≡ g i (mod p). Equivalently, we have 2k ≡ i (mod p − 1). Since 2k and p − 1 are even,
i must also be even.
Corollary 6.3. There are (p−1)/2 quadratic residues and (p−1)/2 quadratic non-residues
in Z×
p.
p−2
]}. By the
Proof. Let g be a primitive root modulo p. We have Z×
p = {[1], [g], . . . , [g
2
4
p−3
previous Proposition, the quadratic residues are {[1], [g ], [g ], . . . [g ]} and the quadratic
non-residues are {[g], [g 3 ], . . . , [g p−2 ]}.
Theorem 6.4. −1 is a quadratic residue modulo p if p ≡ 1 (mod 4) and a quadratic
non-residue if p ≡ 3 (mod 4).
Proof. Let g be a primitive root modulo p, and let x = g (p−1)/2 . We have x2 = g p−1 ≡ 1
(mod p) and x 6≡ 1 (mod p), since g is a primitive root. The equation x2 ≡ 1 (mod p) has
only two solutions, so we have x ≡ −1 (mod p). We deduce that −1 is a quadratic residue
if and only if (p − 1)/2 is even, which gives the statement of the Theorem.
Corollary 6.5. There are infinitely many primes p with p ≡ 1 (mod 4).
Proof. Suppose every prime with p ≡ 1 (mod 4) is ≤ N . Consider the positive integer
n = (N !)2 + 1. Let p be a prime divisor of n. Since p is odd and does not divide N ! we
have p ≡ 3 (mod 4). On the other hand (N !)2 +1 ≡ 0 (mod p), so −1 is a quadratic residue
modulo p. This contradicts the Theorem above, so we deduce that there are infinitely many
primes ≡ 1 (mod 4).
This result is a special case of Dirichlet’s Theorem whose proof is beyond the scope of
this course:
Theorem 6.6 (Dirichlet’s Theorem). Let a, b be coprime positive integers. Then there
exist infinitely many prime numbers p with
p ≡ a (mod b)
26
JAMES NEWTON
Corollary 6.7 (Euler’s criterion: Corollary to Proposition 6.2). Let [a] ∈ Z×
p . Then [a] is
a quadratic residue if and only if
[a](p−1)/2 ≡ 1
(mod p)
and [a] is a quadratic non-residue if and only if
[a](p−1)/2 ≡ −1
(mod p)
Proof. We let g be a primitive root and suppose [a] = [g]i . Then [a](p−1)/2 = [g]i(p−1)/2 .
If i is even we have i(p − 1)/2 ≡ 0 (mod p − 1), so [g]i(p−1)/2 = [1]. If i is odd we have
i(p − 1)/2 ≡ (p − 1)/2 (mod p − 1), so [g]i(p−1)/2 = [g](p−1)/2 = [−1], as in the proof of the
Theorem.
The Legendre symbol. We now give some new notation which will us to keep track of
whether something is a quadratic residue.
Definition 6.8. Let a be an integer. We define the Legendre symbol
a
p
by,
!
a
= +1 if a is a quadratic residue modulo p
p
!
a
= −1 if a is a quadratic non-residue modulo p
p
If p|a we define
a
p
= 0.
The definition of the Legendre symbol depends only on the congruence class of a, so we
can also view it as being defined on elements [a] ∈ Zp .
Lemma 6.9. The number of solutions in Zp to
x2 ≡ a (mod p)
is equal to 1 +
a
p
.
Proof. If p|a the only solution is x = [0], and 1 = 1 + 0 = 1 + ap .
If a is a quadratic non-residue mod
p then there are no solutions (by definition of a
a
non-residue), and 0 = 1 − 1 = 1 + p .
If a is a quadratic residue mod p we have one solution x1 , and −x1 gives a second
solution (since p > 2 and x1 6= [0] we have −x1 6= x1 ). Since the polynomial x2 − a
has degree 2, there
are at most two solutions, so we have two solutions in this case, and
2 = 1 + 1 = 1 + ap .
Remark. Let’s reformulate Euler’s criterion (Corollary 6.7) in terms of the Legendre symbol.
It says that if [a] ∈ Z×
p then we have
a
(p−1)/2
a
≡
p
!
(mod p).
ELEMENTARY NUMBER THEORY (5CCM224A AND 6CCM224B)
27
In fact, if p|a we also have
!
a
a
≡
p
because in this case both sides are 0 (mod p).
(p−1)/2
(mod p)
Lemma 6.10 (Multiplicativity of the Legendre symbol). Let a, b be two integers. Then
!
ab
a
=
p
p
!
b
p
!
Proof. By Euler’s criterion, as reformulated in the above remark, we have
!
a
ab
≡ (ab)(p−1)/2 ≡ a(p−1)/2 b(p−1)/2 ≡
p
p
!
b
p
!
(mod p)
Since p > 2, the only way we can have a congruence
!
ab
a
≡
p
p
!
b
p
!
(mod p)
between two numbers, which are one of −1, 0, +1, is if
!
ab
a
=
p
p
!
b
p
!
Lemma 6.11.
!
X
[a]∈Zp
a
=0
p
Proof. We have shown that there are (p − 1)/2 quadratic residues and (p − 1)/2 nonresidues. So the sum has
(p − 1)/2 terms equal to +1 and the same number equal to −1.
The remaining term is p0 , which is 0. So adding up gives us 0.
NB: I’ve added a Proposition at the end of the primitive roots section (Proposition 5.6).
Sums of squares modulo p.
Problem. Let n be a positive integer. How many solutions (x1 , . . . , xn ) ∈ Znp does the
equation
x21 + x22 + · · · + x2n ≡ 1 (mod p)
have?
We write Nn (p) for the number of solutions. Note that we have N1 (p) = 2.
Theorem 6.12. Let a be an integer. The equation
x21 + x22 ≡ a (mod p)
has p −
−1
p
solutions (x1 , x2 ) ∈ Zp × Zp if p - a and p + (p − 1)
−1
p
solutions if p|a.
Week 6
28
JAMES NEWTON
Proof. For each pair (y, z) ∈ Zp × Zp with y + z = [a] and solutions to the equations x21 = y
and x22 = z in Zp , we get a solution (x1 , x2 ) to our original equation, and
all solutions are
y
2
given by this process. The number of solutions to x1 = y is 1 + p , and similarly for
x22 = z, so we get total number of solutions
!
!
!
!
X
X
X
X
y
z
y
z
yz
(1 +
)(1 +
)=
1+
+
+
p
p
p
p
p
y+z=[a]
y+z=[a]
y+z=[a]
y+z=[a]
y+z=[a]
!
X
where we have multiplied out the terms in side the first sum and used multiplicativity of the
Legendre symbol. The first sum on the right hand side gives p, and the second and third
have
yz = y(a−y)
= y 2 (ay −1 −1)
give zero,
by Lemma
6.11. For the final sum, if y ∈ Z×
p we −1 2
and so yz
= ay p −1 , whilst if y ≡ 0 (mod p) we have yz
= 0. Since yp = 1 we have
p
p
X ay −1 − 1
yz
=
p
p
y∈Z×
!
X
y+z=[a]
!
p
−1
If p|a then this is equal to (p − 1) −1
. If p - a then as y runs over Z×
also runs over
p , ay
p
×
−1
Zp , and so ay − 1 runs over {[0], [1], . . . , [p − 2]}. We therefore have
X y0
−1
−1
ay −1 − 1
=(
)−
=−
p
p
p
p
y 0 ∈Zp
!
X
y∈Z×
p
!
!
!
(again by Lemma 6.11). Putting everything together gives the result in the Theorem. The examinable material for the Class Test on 06/03/2017 is everything up to here.
Corollary 6.13. Let p be an odd prime. Then
!

+1 if p ≡ 1 or 7
2
2
= (−1)(p −1)/8 =
−1 if p ≡ 3 or 5
p
(mod 8)
(mod 8)
Proof. We do this by counting the number of solutions N2 (p) to
x21 + x22 ≡ 1
(mod p)
in two different ways. On the one hand we have N2 (p) = p − −1
.
p
On the other hand, suppose we have a solution (x1 , x2 ). Then we get more solutions by
considering (x2 , x1 ), (−x1 , x2 ), etc. Swapping signs and switching x1 and x2 usually gives
8 distinct solutions. The only cases when we do not get distinct solutions are if x1 or x2
are equal to 0 (in each case we have two solutions), or if x1 =±x
2 . If (x1 , x2 ) = (c, ±c)
2
2
then we have 2c ≡ 1 (mod p). There are 2 possibilities for c if p = 1 (and so there are 4
possibilities for (c, ±c)) and no possibilities if
5). To conclude we have shown that we have
2
p
= −1 (see Exercise 5 on Problem Sheet
N2 (p) = 8k + 2 + 2 + 4 = 8k + 8
ELEMENTARY NUMBER THEORY (5CCM224A AND 6CCM224B)
for some integer k if
2
p
= 1 and N2 (p) = 8k + 4 if
2
p
29
= −1. We deduce that
!
−1
≡0
p−
p
if
2
p
(mod 8)
= 1 and
!
−1
p−
≡4
p
(mod 8)
if p2 = −1. Running through the different possibilities for p mod 8 and using Theorem
6.4 gives the final result.
Theorem 6.14. Let n = 2k + 1 be an odd positive integer. Then
!k
Nn (p) = p2k + pk
−1
p
Proof. We prove the Theorem by finding a formula for Nn (p) in terms of Nn−2 (p). A proof
by induction (and the base case N1 (p) = 2) then proves the formula in the statement of
the Theorem holds — we leave this part of the proof as an exercise.
Let n ≥ 3. For any integer a, we denote the number of solutions to
x21 + x22 + · · · + x2n ≡ a (mod p)
by Nn (p, a). By considering the solutions of x21 + x22 ≡ a (mod p) and x23 + · · · + x2n ≡ 1 − a
(mod p) for each congruence class [a] ∈ Zp , we see that
Nn (p) =
X
N2 (p, a)Nn−2 (p, 1 − a)
[a]∈Zp
We worked out N2 (p, a) in the previous Theorem. So we have

Nn (p) =
!

!
−1
−1

(p −
)Nn−2 (p, 1 − a) + (p + (p − 1)
)Nn−2 (p, 1)
p
p
×
[a]∈Z
 X

p

!

!
−1
−1
=
(p −
)Nn−2 (p, 1 − a) + p
Nn−2 (p, 1)
p
p
[a]∈Zp
X
! 

!
−1  X
−1
= (p −
)
Nn−2 (p, 1 − a) + p
Nn−2 (p, 1)
p
p
[a]∈Zp
As a runs over all the congruence classes in Zp , 1 − a also runs over Zp so we have
! 

!
−1  X
−1
Nn (p) = (p −
)
Nn−2 (p, a) + p
Nn−2 (p, 1)
p
p
[a]∈Zp
30
JAMES NEWTON
, since each
The sum [a]∈Zp Nn−2 (p, a) counts all the elements (x1 , . . . , xn−2 ) ∈ Zn−2
P p
2
2
(n − 2)-tuple is counted once, in the term Nn−2 (p, x1 + · · · + xn−2 ). So [a]∈Zp Nn−2 (p, a) =
pn−2 . We deduce that
P
!
!
−1
−1
Nn (p) = (p −
)pn−2 + p
Nn−2 (p)
p
p
and this gives us enough information to prove the Theorem.
Exercise. Use the same method to find a formula for Nn (p) when n is even.
Quadratic Reciprocity.
Theorem 6.15 (Quadratic reciprocity law). Let p and q be two distinct odd primes. Then
!
p
= (−1)
q
p−1 q−1
2
2
!
q
=
p
  q
p − q
p
if p ≡ 1
(mod 4) or q ≡ 1
if p ≡ q ≡ 3
(mod 4)
(mod 4)
This Theorem was first proved by Gauss in 1796 (having been conjectured by Euler and
Legendre). Gauss gave eight different proofs, and there are over 200 published proofs in
total. We will explain a proof due to V. Lebesgue (published in 1838) using our previous
results on computing Nn (p).
Proof. We prove the Theorem by computing Nq (p) (mod q) in two different ways. Firstly,
by the previous theorem, we have
!(q−1)/2
Nq (p) = pq−1 + p(q−1)/2
−1
p
!
p−1 q−1
p
≡1+
(−1) 2 2
q
(mod q)
where the congruence follows from Fermat’s little theorem, and the Euler criterion.
On the other hand, if (x1 , x2 , . . . , xq ) is a solution of x21 + x22 + · · · + x2q ≡ 1 (mod p),
then we get q solutions by considering the cyclic shifts
(x1 , x2 , . . . , xq ), (xq , x1 , . . . , xq−1 ), . . . (x2 , x3 , . . . , xq , x1 ).
These q solutions are all distinct, unless we have x1 = x2 = · · · = xq (it’s an exercise to
check this, it uses the fact that q is prime). The number of solutions with
x
1 = x2 = · · · = xq
q
2
is given by the number of x with qx ≡ 1 (mod p). This is equal to 1+ p (by Exercise 5 on
Problem Sheet 5). We deduce that Nq (p) = qk+1+ pq for some integer k, so Nq (p) ≡ 1+ pq
(mod q). Comparing our two formulas for Nq (p) mod q gives the reciprocity law.
As well as being a beautiful theorem, the quadratic reciprocity law is useful for calculating Legendre symbols:
Example. Compute
43
83
:
ELEMENTARY NUMBER THEORY (5CCM224A AND 6CCM224B)
31
First we check that 43 and 83 are prime. They are also both congruent to 3 mod 4 so
we have
!
!
43
83
=−
83
43
!
40
since 83 ≡ 40
=−
43
!3
!
23 · 5
2
=−
=−
43
43
(mod 43)
!
!
5
2
=−
43
43
5
43
!
!
5
=
since 43 ≡ 3
43
!
(mod 8)
!
43
3
=
=
apply quadratic reciprocity and reduce mod 5
5
5
!
!
5
2
=
quadratic reciprocity and reduce mod 3
=
3
3
Finally, we can check that
2
3
= −1. So we get
!
43
= −1
83
Example. How many solutions does the equation
3x2 + 6x + 2 ≡ 0
(mod 23)
have in Z23 ?
We have 3x2 + 6x + 2 = 3(x + 1)2 − 1, so we have to solve
3(x + 1)2 ≡ 1
(mod 23)
We have [8] = [3]−1 , so we have to solve
(x + 1)2 ≡ 8.
Now we compute
!
!3
2
8
=
23
23
!
2
=
= +1
23
since 23 ≡ 7 (mod 8). So there are two solutions to this equation in Z23 .
Example. We are going to describe all of the odd primes p with
!
3
= +1.
p
32
JAMES NEWTON
We may assume that p 6= 3. The quadratic reciprocity law says that
(mod 4) and
(mod 4) or if
p
3
We have
3
p p
3
=−
p
3
3
p
if p ≡ 3 (mod 4). So we get
= +1 if
p
3
3
p
=
p
3
if p ≡ 1
= +1 and p ≡ 1
= +1 and p ≡ 3 (mod 4).
= +1 if p ≡ 1 (mod 3) and
p
3
= −1 if p ≡ 2 (mod 3). So we conclude that
3
p
= +1 if p ≡ 1 (mod 4) and p ≡ 1 (mod 3) or if p ≡ −1 (mod 4) and p ≡ −1 (mod 3).
Using the Chinese remainder theorem, we see that these conditions are equivalent to
p ≡ ±1
(mod 12),
and this is our final description of the odd primes with
3
p
= +1.
7. Sums of squares
Proposition 7.1. A positive integer n is a square if and only if every exponent ai in the
prime factorisation n = pa11 pa22 · · · par r is even.
Proof. If n = m2 is a square then by considering the prime factorisation of m we see that
the exponents in the prime factorisation of n are even.
Conversely, if every exponent ai is even, then
a /2 a /2
n = (p11 p22 · · · par r /2 )2
Question. Which positive integers are sums of two squares?
If you look at the first few integers, you can check that 1, 2, 4, 5, 8, 9, 10, 13 are sums of
two squares and 3, 6, 7, 11, 12 are not.
Lemma 7.2. Suppose n = a2 + b2 with a, b ∈ Z. Then n ≡ 0, 1 or 2 (mod 4).
Proof. If x ∈ Z then x2 is either 0 or 1 (mod 4).
Week 7
Corollary 7.3. If n ≡ 3 (mod 4) then n is not a sum of two squares.
If we have n = a2 + b2 then n = (a + ib)(a − ib) = |a + ib|2 .
Lemma 7.4. Let m, n be positive integers. If m and n are sums of two squares, so is mn.
Proof. We can write m = |a + ib|2 and n = |c + id|2 , with a, b, c, d ∈ Z. So mn =
|(a + ib)(c + id)|2 = |(ac − bd) + i(ad + bc)|2 is also a sum of two squares.
Theorem 7.5. Let p be a prime. p is a sum of two squares if and only if p = 2 or p ≡ 1
(mod 4).
Proof. We know that if p ≡ 3 (mod 4) then p is not a sum of two squares, and 2 = 12 + 12 .
So it remains to show that if p ≡ 1 (mod
4) then p is a sum of two squares.
−1
Since p ≡ 1 (mod 4) we have p = +1. So we can let a ∈ Z be a solution to the
√
equation a2 + 1 ≡ 0 (mod p). Let k = b pc, so k 2 < p < (k + 1)2 . For each of the
(k + 1)2 > p different pairs of integers (c, d) with 0 ≤ c ≤ k and 0 ≤ d ≤ k consider
ELEMENTARY NUMBER THEORY (5CCM224A AND 6CCM224B)
33
the integer c + ad. Since there are only p distinct residue classes mod p, we must have
c1 + ad1 ≡ c2 + ad2 (mod p) for two different pairs (c1 , d1 ), (c2 , d2 ) (this is an example of
the pigeonhole principle). So we have
c1 − c2 ≡ a(d2 − d1 )
(mod p)
and squaring both sides gives
(c1 − c2 )2 ≡ −(d2 − d1 )2
(mod p)
We deduce that p divides (c1 − c2 )2 + (d2 − d1 )2 . Moreover, we have
0 < (c1 − c2 )2 + (d2 − d1 )2 ≤ k 2 + k 2 < 2p
but the only multiple of p between 0 and 2p is p itself, so we are forced to have p =
(c1 − c2 )2 + (d2 − d1 )2 and we have shown that p is a sum of two squares.
Theorem 7.6 (Two squares theorem). A positive integer n is a sum of two squares if and
only if the exponent of every prime number which is congruent to 3 (mod 4) in the prime
factorisation of n is even.
Proof. First suppose that n = p1 p2 · · · pk m2 with each prime pi equal to either 2 or ≡ 1
(mod 4). Then n is a sum of two squares, since it is a product of integers which are sums
of two squares.
Conversely, suppose n = a2 + b2 be a sum of two squares. Let d = gcd(a, b). Then
2
2
we have n = d2 ( ad + db ) and gcd( ad , db ) = 1. Suppose p|n and p ≡ 3 (mod 4). If
2
2
p|( ad + db ) then we can solve the equation x2 + y 2 ≡ 0 (mod p), with x, y coprime. In
particular p does not divide x or y (otherwise the equation would force p to divide both,
2
in which case p| gcd(x.y)). So we have ([x]p [y]−1
p ) = [−1]p and −1 is a quadratic residue
modulo p. This contradicts the assumption that p ≡ 3 (mod 4). So if p|n and p ≡ 3
2
2
(mod 4) we must have p|d and p - ( ad + db ). This implies that the exponent of p in
the prime factorisation of n is even.
Fact 7.7. A positive integer is a sum of three squares if and only if it is not of the form
4a (8k + 7) with a, k non-negative integers.
Fact 7.8. Every positive integer is a sum of four squares.
8. Irrational and transcendental numbers
Recall that Q = { ab : a ∈ Z, b ∈ N} is the set of rational numbers, and we use R to
denote the real numbers and C to denote the complex numbers.
Definition 8.1. A complex number x ∈ C is called irrational if x ∈
/ Q.
Theorem 8.2. A real number x ∈ R is rational if and only if its decimal expansion either
terminates or repeats.
34
JAMES NEWTON
Proof. We analyse what it means for a real number to have a repeating decimal expansion.
We suppose that 0 < x < 1 and that we can write x = 0.ab = 0.a1 a2 a3 · · · an b1 b2 · · · bk .
Then we have y := 10n x − a = 0.b and 10k y = b + y, so y = 10kb−1 . So we have
x=
a+y
(10k − 1)a + b
=
10n
10n (10k − 1)
Conversely if x = 10n (10c k −1) with 0 < c < 10n (10k −1) then we can write c = (10k −1)a+b
with 0 ≤ b < (10k −1) and 0 ≤ a < 10n . So to show that a rational number has a repeating
(or terminating) decimal expansion, it suffices to show that it can be written as a fraction
with denominator 10n (10k − 1). Suppose x = rs with r ∈ Z and s ∈ N. Multiplying top
0
and bottom of the fraction by a multiple of 2 or 5 if necessary, we can write x = 10rn s0
where s0 is coprime to 10. Since 10 is coprime to s0 there is an integer k such that 10k ≡ 1
(mod s0 ). For example, we can take k = φ(s0 ). In other words, we have s0 |(10k − 1), so
multiplying top and bottom of the fraction by (10k − 1)/s0 we can write x as a fraction
with denominator 10n (10k − 1) as desired.
Example. The number
α=
∞
X
1
n!
n=1 10
is irrational.
Proposition 8.3. Let z ∈ C be a root of a polynomial xm + cm−1 xm−1 + · · · + c1 x + c0 with
integer coefficients ci ∈ Z. Then either z is an integer or z is irrational.
Proof. Suppose z is rational. We must show that z is actually an integer. Write z =
with a ∈ Z, b ∈ N and gcd(a, b) = 1. We have
m
a
+ cm−1
m−1
a
b
and multiplying by b we get
b
+ · · · + c1
a
b
a
+ c0 = 0
b
m
am + cm−1 am−1 b + · · · + c1 abm−1 + c0 bm = 0
and in particular we have b|am . Since gcd(a, b) = 1 we also have gcd(am , b) = 1. Combining
this with the fact that b|am means we must have b = 1 so z ∈ Z.
Remark. Note that in the above Proposition it is crucial that the leading coefficient of
the polynomial is 1. For example, if you consider the polynomial 2x + 1 you see that
polynomials which are not monic (whose leading coefficient is not 1) can have rational, but
non-integral, roots.
√
3
Example. √
2 is irrational: since it is a root of x3 − 2 the preceding proposition implies
that either 3 2 is irrational, or there is an integer n with n3 = 2. But there is no such
integer (if n > 1 then n3 > 2).
Recall the number e = 1 +
1
1!
+
1
2!
+
1
3!
+ ···
ELEMENTARY NUMBER THEORY (5CCM224A AND 6CCM224B)
35
Theorem 8.4 (Euler, 1737). e is irrational
Proof. This proof is due to Fourier. Euler’s original proof uses continued fractions which
is a very nice topic in elementary number theory which we don’t have time to cover in this
course. They are discussed in some of the books on the reading list.
Suppose e = ab is rational. Then for n ≥ b the number
1
1
1
a
1
1
1
1
1
− 1 + + + + ··· +
N = n! e − 1 + + + + · · · +
= n!
1! 2! 3!
n!
b
1! 2! 3!
n!
is a positive integer, since multiplying by n! clears all the denominators. On the other
hand, we have
1
1
N = n!
+
+ ···
(n + 1)! (n + 2)!
!
=
1
1
+
+ ···
n + 1 (n + 1)(n + 2)
and the right hand side is strictly less than
1
1
1
+
+ ··· =
2
n + 1 (n + 1)
n
So, taking n > 1 we get a contradiction to the claim that N is a positive integer. So we
deduce that e must be irrational after all.
A harder fact to prove (proven by Lambert in 1761) is that π is also irrational.
8.1. Algebraic and transcendental numbers.
Definition 8.5. A complex number z ∈ C is called algebraic if z is a root of a non-zero
polynomial with rational coefficients.
A complex number z is called transcendental if it is not algebraic.
Example. If z ∈ Q then z is a root of the polynomial x − z, and is therefore algebraic.
So every transcendental number is irrational
√
√
Example. 3 2 and 3 2 + 1 are algebraic numbers.
Exercise. Find a non-zero polynomial with rational coefficients and
√
3
2 + 1 as a root.
Fact 8.6. e is transcendental (Hermite, 1873) π is transcendental (Lindemann, 1882)
Conjecture (A special case of Schanuel’s conjecture). e + π is transcendental
At the moment, it is not even known whether or not e + π is irrational!
P
1
The proofs of the above facts are quite complicated. An easier result is that ∞
n=1 10n! is
transcendental. We will prove this by considering the problem of approximating irrational
numbers by rational numbers.
Since the rational numbers are dense in the real numbers, any irrational number α ∈ R
can be approximated arbitrarily closely by a rational number — for example, consider the
rational numbers obtained by cutting off the decimal expansion of α further and further
along. However, the denominators of these rational numbers get large as we approximate α
closer and closer. So, more precisely, we want to say something about the difference |α − ab |
Week 8
36
JAMES NEWTON
in terms of the denominator b. Let’s think about how good our naive way of producing rational approximations using the decimal expansion does: if α = 0.a1 a2 a3 · · · an an+1 an+2 · · ·
and we consider the rational approximation αn = 0.a1 a2 · · · an then we have
1
10n
and the denominator of αn is 10n , so our approximation αn is, in general, within distance
1
of α, where b is the denominator of αn .
b
|α − αn | = 0.00 · · · 0an+1 an+2 · · · <
Theorem 8.7 (Dirichlet’s approximation theorem). Let α ∈ R be a real number. Let
n > 0 be a positive integer. Then there exist integers a ∈ Z, b ∈ N, with b ≤ n, such that
1
a
|α − | <
b
bn
Proof. We begin by rewriting the inequality in the statement of the theorem as
1
.
n
So we want to find a natural number b ≤ n such that bα is close to an integer a. Consider
the multiples 0α, 1α, . . . , nα of α. For each integer i with 0 ≤ i ≤ n we write iα = Ni + Fi
where Ni is an integer and 0 ≤ Fi < 1 (so Fi is the fractional part of iα). We have n + 1
numbers F0 , F1 , . . . , Fn all in the interval [0, 1). We divide [0, 1) into the n smaller intervals
[0, 1/n), [1/n, 2/n), . . . , [(n − 1)/n, 1). By the pigeonhole principle one of these intervals
must contain at least 2 of the numbers Fi . So we have two integers i < j (with j ≤ n)
such that |Fi − Fj | < n1 . We can rewrite this inequality as
|a − bα| <
|(iα − Ni ) − (jα − Nj )| <
1
n
and finally we get
|(Nj − Ni ) − (j − i)α| <
1
.
n
So we can take a = Nj − Ni and b = j − i.
Corollary 8.8. Suppose α ∈ R is irrational. Then there exist infinitely many (distinct)
rational numbers ab such that
a
1
|α − | < 2
b
b
Proof. The preceding theorem tells us that for each n we can find an , bn such that
|α −
an
1
|<
bn
bn n
|α −
an
1
|< 2.
bn
bn
and bn ≤ n. So we also have
ELEMENTARY NUMBER THEORY (5CCM224A AND 6CCM224B)
We claim that this must give us infinitely many distinct rational numbers
must have a single rational number ab with
37
an
:
bn
if not, we
a
1
|α − | <
b
bn
for infinitely many n. But taking the limit as n → ∞ shows that this implies that α = ab
and we assumed that α was irrational. So we have shown the claim.
Remark. Note that the rational approximations a/b which are guaranteed to exist by the
above corollary are much better than the approximations we got in general using the
decimal expansion — the error is < 1/b2 whereas for the decimal approximations we got
error < 1/b.
Continued fractions give a systematic way to compute numbers a/b with |α − ab | < b12 .
Dirichlet’s approximation theorem tells us that we can find rational numbers which
approximate α quite closely (relative to the denominator of the rational number). Now
we can ask whether it is possible to do even better than the result in Dirichlet’s theorem:
for example, for an irrational number α ∈ R is it possible to find infinitely many rational
numbers ab with
a
1
|α − | < 3 ?
b
b
Theorem 8.9 (Liouville). Let α ∈ R be an irrational number which is a root of a polynomial
f (x) = cm xm + cm−1 xm−1 + · · · + c1 x + c0
with ci ∈ Q and cm 6= 0. Then there is a real number C > 0 such that that
a
C
α − >
b
bm
for all a ∈ Z, b ∈ N.
In particular, if m = 2 this theorem implies that we cannot find infinitely many rational
numbers ab with |α − ab | < b13 , as for b large enough we have b13 < bC2 .
Proof. By dividing out factors x − β with β rational, we can assume that all the roots of
f (x) are irrational. Multiplying by an integer to clear denominators, we can also assume
that the coefficients ci are all integers. Let α = α1 , α2 , . . . , αm be the complex roots of
f (x), so f (x) = cm (x − α1 ) · · · (x − αm ). Suppose a ∈ Z, b ∈ N with |α − ab | < 1. We have
a f
= |cm ||a/b − α1 | · · · |a/b − αm |
b < |cm ||α − a/b|(1 + |α| + |α2 |) · · · (1 + |α| + |αm |) = c|α − a/b|
where we use the inequality |a/b| < 1 + |α|, and c > 1 is a real number.
On the other hand,
a
cm am + cm−1 am−1 b + · · · + c1 abm−1 + c0 bm
=
b
bm
f
38
JAMES NEWTON
and the numerator of this fraction
is a non-zero integer (non-zero because f (x) has no
rational roots). So we have |f ab | ≥ b1m .
We deduce that
a 1
f
< c|α − a/b|
≤
bm
b Now we take C = c−1 and claim that this satisfies the statement of the theorem. Since
c > 1 we have C < 1, so if
a
C
α − ≤
b
bm
we certainly have |α − ab | < 1. Now we know that |α − a/b| > c−1 b1m = bCm which is a
contradiction. So we have proven that
a
C
α − >
b
bm
for all a, b.
Example. The proof
√ of the above theorem is a bit tricky. Let’s2 go through√the proof
√ in
the special case α =√ 2. In this case,
our
polynomial
is
f
(x)
=
x
−
2
=
(x
−
2)(x
+
2).
√
We have α = α1 = 2 and α2 = − 2.
√
We assume that ab is a rational number with | 2 − ab | < 1. Then we have |sqrt2 + ab | ≤
√
√
√
√
| ab − 2| + 2 2 by the triangle inequality, so | 2 + ab | < 1 + 2 2.
2
2
On the other hand we have ( ab ) = a −2b
so |f ( ab )| ≥ b12 .
b2
Combining everything, we get
√
a √
a √ √
a
a
1
| − 2|(1 + 2 2) > | − 2|| 2 + | = |f ( )| ≥ 2
b
b
b
b
b
and so
1
a √
√
| − 2| >
.
b
(1 + 2 2)b2
√
√
√
This was all assuming | 2− ab | < 1 but if | 2− ab | ≥ 1 we clearly have | 2− ab | > (1+21√2)b2
as well (since b ≥ 1). So we deduce that we have
√
1
a
√
| 2− |>
b
(1 + 2 2)b2
in all cases.
So in this case we can take C =
Corollary 8.10. The number α =
1√
1+2 2
P∞
in the statement of Liouville’s theorem.
1
n=1 10n!
is transcendental.
Proof. Suppose for a contradiction α is a root of a polynomial of degree m with rational
coefficients. So there is a real number C > 0 such that
a
C
α − >
b
bm
for all a ∈ Z, b ∈ N.
ELEMENTARY NUMBER THEORY (5CCM224A AND 6CCM224B)
To approximate α by rational numbers, we just consider the finite sums αk =
which have denominator 10k! . We have
|α − αk | =
39
Pk
1
n=1 10n! ,
∞
X
1
2
< (k+1)!
n!
10
n=k+1 10
(to check the inequality, just compare the two decimal expansions). By taking k large
2
= (10k!2)k+1 less than (10Ck! )m , which contradicts Liouville’s
enough, we can make 10(k+1)!
theorem. So α is transcendental.
9. Diophantine equations
We met the notion of Diophantine equations, and the special case of linear Diophantine
equations, earlier in the course 1.1.
The general form of a Diophantine equation is a multivariable polynomial equation:
f (x1 , x2 , . . . , xr ) = c1 xa111 xa212 · · · xar 1r + · · · + cm xa1m1 xa2m2 · · · xar mn = 0
where the coefficients ci are all integers, and we seek integer solutions (x1 , x2 , . . . , xr ) to
this equation.
Example.
x4 + 2x2 − 4xy + 2y 2 − 1 = 0
Given a Diophantine equation we can ask how many (if any) solutions the equation has,
and whether there is an algorithm or formula to find the solutions. There is no general
theory to answer these questions (in fact there is a theorem in mathematical logic which
says that there cannot be such a general theory — try reading about ‘Hilbert’s tenth
problem’). We will discuss a collection of examples which illustrate some of the methods
you can use. We should emphasise that these examples are all carefully chosen — if you
pick a completely random Diophantine equation you are very unlikely to be able to find
its solutions.
Example. Here are some simple examples of equations which we can solve completely:
(1) x3 − 2x = 0 — the only integer solution is x = 0.
(2) x4 + 2x2 − 4xy + 2y 2 − 1 = 0 — we can rewrite this equation as x4 + 2(x − y)2 = 1. If
x, y are integers then x4 +2(x−y)2 = 1 is only possible if we have x4 = 1, (x−y)2 = 0,
so the solutions are given by x = ±1 and y = x.
(3) x3 = 5y 6 — we get one obvious solution x = y = 0. If y 6= 0 then x 6= 0. Let
a be the power of 5 in the prime factorisation of x and let b be the power of 5 in
the prime factorisation of y. Then the equation implies that 3a = 6b + 1 so we get
1 = 3(a − 2b) which is impossible. So there is no solution with y 6= 0.
40
JAMES NEWTON
9.1. Using divisibility and congruences to analyse Diophantine equations. Suppose we have a Diophantine equation f (x1 , x2 , . . . , xr ) = 0. We observe that if this equation
has an integer solution then the congruence equation
f (x1 , x2 , . . . , xr ) ≡ 0
(mod m)
has a solution for every positive integer m. So one way to show that a Diophantine equation
has no solutions is to find a positive integer m such that the equation
f (x1 , x2 , . . . , xr ) ≡ 0
(mod m)
has no solutions.
Example.
(1) Consider the equation x2 = y 5 + 7. We are going to show it has no
integer solutions. We can consider the equation mod 11. Euler’s criterion implies
that y 5 ≡ 0, 1 or −1 (mod 11) for every integer y. So the right hand side of the
equation can take the values 6, 7 or 8 (mod 11). On the other hand x2 can take
the values 0, 1, 3, 4, 5, 9 (mod 11). Since there are no common values in these lists,
there are no solutions to the equation
x2 ≡ y 5 + 7
(mod 11)
and hence no integer solutions to the equation x2 = y 5 + 7.
(2) The equation x12 + y 12 = z 12 + w12 + 3 has no integer solutions: we consider the
equation mod 13. Fermat’s little theorem says that x12 + y 12 ≡ 0, 1 or 2 (mod 13),
and similarly z 12 + w12 + 3 ≡ 3, 4 or 5 (mod 13). As in the previous example, there
are no common values so the equation x12 + y 12 = z 12 + w12 + 3 has no integer
solutions.
(3) Consider the equation 15x2 − 7y 2 = 9. Suppose we have an integer solution (x, y).
We get 3|(7y 2 ), so 3|y. Let y = 3y 0 . Then we have 15x2 −63(y 0 )2 = 9 or equivalently
5x2 − 21(y 0 )2 = 3. This now implies that 3|x so we let x = 3x0 . Then we get
the equation 15(x0 )2 − 7(y 0 )2 = 1. Finally, we observe that this equation has no
solutions mod 3: reducing mod 3 we get an equation 2(y 0 )2 ≡ 1 (mod 3), which
has no solutions. So the original equation 15x2 − 7y 2 = 9 has no integer solutions.
The first two of these examples could be described as finding a prime p such that the
congruence classes of terms in the equation only have a few possible values, in order to
rule out solutions. The last one works by identifying prime divisors of possible solutions
(x, y) and dividing out by these to simplify the equation.
We already considered the example of y 2 = x5 + 7. Now we will consider a very similar
equation which is rather trickier to analyse:
Example. Consider the equation y 2 = x3 + 7. First we note that if (x, y) is a solution
then x must be odd: if x is even then y 2 ≡ 3 (mod 4) which is impossible.
So we deduce that x is odd and y is even. Let’s rewrite the equation as
y 2 + 1 = x3 + 8 = (x + 2)((x − 1)2 + 3)
ELEMENTARY NUMBER THEORY (5CCM224A AND 6CCM224B)
41
Since (x − 1) is even, the second factor on the right hand side is a positive integer which
is ≡ 3 (mod 4). This implies that the right hand side of the above equation is divisible by
a prime p with p ≡ 3 (mod 4). So y 2 ≡ −1 (mod p). But this is impossible since −1 is
not a square modulo a prime which is 3 (mod 4). We deduce that the original equation
y 2 = x3 + 7 has no integer solutions.
Example. Find all integer solutions of x3 + 2y 3 + 4z 3 = 9w3 . There is one obvious
solution: (0, 0, 0, 0). Now suppose that not all of x, y, z, w are 0. Then we let d be the
greatest common divisor of x, y, z, w. Dividing through by d we get another solutions
(x/d, y/d, z/d, w/d), so we can assume that x, y, z, w have greatest common divisor 1.
For any integer a, a3 ≡ 0 or ±1 (mod 9). Since x3 + 2y 3 + 4z 3 ≡ 0 (mod 9), you can
check that we must have x3 ≡ 2y 3 ≡ 4z 3 ≡ 0 (mod 9). So we have x ≡ y ≡ z ≡ 0 (mod 3)
and therefore x3 ≡ y 3 ≡ z 3 ≡ 0 (mod 27). We deduce that 3|w. But this contradicts the
assumption that x, y, z, w have gcd 1 since we have shown that 3 divides all of them.
In this example, we used the fact that the equation was homogeneous (each term has
the same degree) to divide through by common divisors and reduce to the case where a
solution has no common factors.
Finally, we give another tricky example (we didn’t do this one in lectures):
Example. Consider the equation x4 + x3 + x2 + x + 1 = y 2 .
We consider f (x) = 4(x4 + x3 + x2 + x + 1). Then
f (x) = (2x2 + x)2 + 3x2 + 4x + 4 = (2x2 + x)2 + 3(x + 2/3)2 + 8/3 > (2x2 + x)2
On the other hand,
f (x) = (2x2 + x + 1)2 − (x + 1)(x − 3).
Since we also have f (x) = (2y)2 , f (x) cannot lie between the squares of consecutive integers,
2x2 + x and 2x2 + x + 1. So we must have (x + 1)(x − 3) < 0. This means that we only get
solutions with x ∈ [−1, 3]. By considering each possible value for x in turn and solving for
y we get solutions (−1, ±1), (0, ±1), (3, ±11).
9.2. Pythagorean triples. Let’s consider the equation x2 + y 2 = z 2 and try to describe
all integer solutions (x, y, z).
Definition 9.1. A Pythagorean triple is a set of three positive integers (x, y, z) such that
x2 + y 2 = z 2 . We say that the triple is primitive if the only positive common divisor of x, y
and z is 1 (i.e. gcd(x, y, z) = 1).
We call a right-angled triangle with integer length sides a Pythagorean triangle. Its side
lengths form a Pythagorean triple.
Example. (3, 4, 5) and (5, 12, 13) are primitive Pythagorean triples.
Lemma 9.2. Supppose (x, y, z) is a primitive Pythagorean triple. Then any two of the
three integers (x, y, z) are coprime.
Week 9
42
JAMES NEWTON
Proof. Suppose p is a prime number with p|x, p|z. Then y 2 = z 2 − x2 so p|y 2 which implies
that p|y. This contradicts the primitivity of (x, y, z). So x, z are coprime. A similar
argument applies to the other pairs x, y, y, z.
To find all Pythagorean triples, it suffices to find all primitive Pythagorean triples —
any Pythagorean triple (x, y, z) is equal to (dx0 , dy 0 , dz 0 ) where d ∈ N and (x0 , y 0 , z 0 ) is a
primitive Pythagorean triple.
Lemma 9.3. If (x, y, z) is a primitive Pythagorean triple then one of x, y is even and the
other is odd.
Proof. If x, y are both even, then z 2 = x2 + y 2 is also even, so z is even. This is impossible,
because (x, y, z) is assumed to be primitive. If x, y are both odd, then z 2 ≡ 2 (mod 4)
which is impossible (the square of any integer is either 0 or 1 mod 4).
From now on, we can assume (by swapping x and y if necessary) that a primitive
Pythagorean triple (x, y, z) has x even and y odd.
Here is a trick which we will use a couple of times:
Lemma 9.4. Suppose a, b are coprime positive integers and ab = c2 is the square of an
integer c. Then a and b are themselves squares.
Proof. Consider the prime factorisations a = pa11 · · · par r , b = q1b1 · · · qsbs , with the ai and bi
positive integers. Since a and b are coprime, the primes pi , qi are all distinct, so the prime
factorisation of ab is
ab = pa11 · · · par r q1b1 · · · qsbs
and ab is a square if and only if all the numbers ai , bi are even, which happens if and only
if both a and b are squares.
Theorem 9.5. All primitive Pythagorean triples, with x even, are given by the formulas:
x = 2st, y = s2 − t2 , z = s2 + t2
for integers s > t > 0 such that gcd(s, t) = 1 and s 6≡ t (mod 2).
To get all Pythagorean triples (up to swapping x and y) we take integers s, t as above
and d another positive integer and consider
x = 2dst, y = d(s2 − t2 ), z = d(s2 + t2 ).
Proof. Let (x, y, z) be a primitive Pythagorean triple. Since y and z are both odd we have
integers u, v such that z − y = 2u and z + y = 2v. The equation x2 + y 2 = z 2 can be
rewritten as
x2 = z 2 − y 2 = (z − y)(z + y) = 4uv
so
2
x
2
= uv.
Now we claim that gcd(u, v) = 1. Indeed, if d|u and d|v then d divides u + v = y and
u − v = z. Since y and z are coprime, we must have d = 1. Since uv is a square and u, v
are coprime, we see that both u and v are squares, so we write u = t2 and v = s2 with s, t
positive integers.
ELEMENTARY NUMBER THEORY (5CCM224A AND 6CCM224B)
43
Now we have z = s2 + t2 , y = s2 − t2 and x2 = 4s2 t2 , so x = 2st. Since gcd(y, z) = 1, we
have gcd(s, t) = 1, and since y is odd we have s 6≡ t (mod 2).
To check that the formulas give a primitive Pythagorean triple is left as an exercise.
To get all Pythagorean triples, we just use the observation above that if (x, y, z) is a
Pythagorean triple then (x, y, z) = (dx0 , dy 0 , dz 0 ) where (x0 , y 0 , z 0 ) is a primitive Pythagorean
triple.
9.3. Fermat’s Last Theorem.
Theorem 9.6 (Wiles, Taylor, 1994). If n ≥ 3 there are no positive integer solutions
(x, y, z) to the equation
xn + y n = z n
The simplest case of Fermat’s Last Theorem is when n = 4 — for this case (and perhaps
only this case) Fermat is likely to have found a proof himself. In fact we can proof something
a bit stronger:
Theorem 9.7. There are no positive integer solutions (x, y, z) to the equation
x4 + y 4 = z 2
Proof. For a contradiction, we suppose (x0 , y0 , z0 ) is a solution. We can assume that
gcd(x0 , y0 ) = 1, as if d divides x0 and y0 then d2 divides z0 and (x0 /d, y0 /d, z0 /d2 ) is
another solution. Now (x20 , y02 , z0 ) is a primitive Pythagorean triple. We can assume x0 is
even (otherwise we swap x0 and y0 ), and then we have
x20 = 2st, y02 = s2 − t2 , z0 = s2 + t2
for integers s > t > 0 such that gcd(s, t) = 1 and s 6≡ t (mod 2).
So we get a primitive Pythagorean triple (t, y0 , s), with y0 odd, and therefore t even. So
we have
t = 2uv, y0 = u2 − v 2 , s = u2 + v 2
with u, v coprime, and we have x20 = 4uv(u2 + v 2 ). Since u, v are coprime, we also have
u, u2 + v 2 and v, u2 + v 2 coprime. Considering the prime factorisations of u, v and (u2 + v 2 )
we deduce from the equation x20 = 4uv(u2 +v 2 ) that all three of u, v and u2 +v 2 are squares.
Letting u2 + v 2 = z12 , u = x21 and v = y12 , we get
z12 = x41 + y14
so we have produced another positive integer solution (x1 , y1 , z1 ) to the equation x4 + y 4 =
z 2 . Moreover, we have z1 ≤ s < s2 + t2 = z0 . So z1 is strictly less than z0 . But
now we can repeat this procedure and produce an infinite sequence of positive integers
z0 > z1 > z2 > · · · . This is impossible, so we get the desired contradiction. This method
of proof is called ‘Fermat descent’.
Germain made the most progress towards solving Fermat’s last theorem by elementary
methods.
44
JAMES NEWTON
Definition 9.8. A Sophie Germain prime is a prime number p such that 2p + 1 is also
prime.
It is conjectured that there are infinitely many Sophie Germain primes, but this remains
unproven. The first few examples are p = 2, 3, 5, 11, 23, 29, . . ..
Theorem 9.9. Let p be a Sophie Germain prime. Then the equation xp + y p = z p has no
integer solution (x, y, z) with p - xyz.
Proof. (Not examinable, it’s a bit complicated, but still only uses ideas we’ve seen already.)
We assume for a contradiction that such a solution exists. Dividing through by gcd(x, y, z)
we can assume that gcd(x, y, z) = 1. We can also assume p is odd, since we already showed
that if p = 2 then one of x, y is even so p|xyz. By changing the sign of z we can rewrite
the equation as xp + y p + z p = 0, so we can instead consider solutions to this equation.
Now we have
−xp = y p + z p = (y + z)(z p−1 − z p−2 y + · · · + y p−1 )
We claim that the two factors on the right hand side are coprime. Since p - x, p is not
a factor of the right hand side. If r 6= p is a prime and r|(y + z) then y ≡ −z (mod r) so
z p−1 − z p−2 y + · · · + y p−1 ≡ z p−1 + z p−1 + · · · + z p−1 ≡ pz p−1
(mod r).
Since gcd(x, y, z) = 1 and r|x we can’t have r|z, otherwise the equation y p = −xp − z p
implies that r|y and r would be a common divisor of x, y and z. This implies that r does
not divide (z p−1 − z p−2 y + · · · + y p−1 ) and establishes the claim.
Now considering the prime factorisations of the coprime integers (whose product is a pth
power) (y + z) and (z p−1 − z p−2 y + · · · + y p−1 ) shows that we have integers A and T such
that
y + z = Ap
and
z p−1 − z p−2 y + · · · + y p−1 = T p .
The same argument shows that we have integers B, C with x + y = B p and x + z = C p .
Let q = 2p + 1, which we are assuming is prime. Considering the equation mod q we get
x(q−1)/2 + y (q−1)/2 + z (q−1)/2 ≡ 0
(mod q).
If q - xyz then each term on the left hand side is ±1 (mod q) by Euler’s criterion. Since
q > 5, this is impossible. So we have q|xyz. Without loss of generality we can assume q|x,
since everything is symmetric. Since gcd(x, y, z) = 1 arguing as before (with r) shows that
this implies q - yz.
Now we have B p +C p −Ap = 2x, so B (q−1)/2 +C (q−1)/2 −A(q−1)/2 ≡ 0 (mod q). Applying
Euler’s criterion again, we deduce that q|ABC. Since q|x but q - yz we have q - BC so we
must have q|A. We then get T p ≡ py p−1 (mod q).
Since y ≡ B p ≡ B (q−1)/2 (mod q) and p is odd we have y p−1 ≡ 1 (mod q), by Fermat’s
little theorem. Euler’s criterion implies that T p ≡ ±1 (mod q). So we get an equation
±1 ≡ p (mod q), which is impossible. Finally we have obtained a contradiction, so there
was no solution (x, y, z) with p - xyz to our original equation.
ELEMENTARY NUMBER THEORY (5CCM224A AND 6CCM224B)
45
9.4. Pell’s equation. A Pell equation is an equation of the form
x2 − Dy 2 = 1
where D is a fixed positive integer which is not a perfect square.
Here is one natural way this equation appears: suppose we are interested in numbers
which are simultaneously square and triangular. This means that we are interested in
solutions to the equation
m(m + 1)
n2 =
2
Rearranging this equation we get
2(2n)2 = (2m + 1)2 − 1
so solutions to this equation correspond to solutions to the Pell equation x2 − 2y 2 = 1.
We can easily spot the solution x = 3, y = 2 (this corresponds to the square and
triangular number 1!). To get other solutions, we can observe that the equation x2 −2y 2 = 1
can be written as
√
√
(x + y 2)(x − y 2) = 1
If k is a positive integer then raising both sides to the power of k gives
√
√
(x + y 2)k (x − y 2)k = 1
√
√
So if we
(3 +√
2 2)k in the form x + y 2 we get a new solution. For example,
√ rewrite
(3 + 2 2)3 = 99 + 70 2, so x = 99, y = 70 is another solution, which corresponds to the
square and triangular number 352 = 49·50
= 1225.
2
We can do the same thing for any non-square positive integer D: if we have one solution
2
2
(x1 , y1√) to the equation
√ kx − Dy = 1 we get more solutions by choosing x, y so that
x + y D = (x1 + y1 D) .
This process actually gives all the solutions to the Pell equation.
Week 10
Theorem 9.10. Let D be a positive integer which is not a square. Suppose the equation
x2 − Dy 2 = 1
has a positive integer solutions. Let (x1 , y1 ) be the positive integer solution with the smallest
value of x1 . Then every positive integer solution (x, y) is given by
√
√
x + y D = (x1 + y1 D)k
for k = 1, 2, 3, . . ..
Proof. The proof of this Theorem is a version of Fermat descent.
We suppose that (u, v) is a solution with u > x1 . We are going to show that
√
√
√
u + v D = (x1 + y1 D)(s + t D)
where s, t are positive integers, s < u and s2 − Dt2 = 1. We can then repeat the procedure,
applying it to (s, t). As the value of x is decreasing with each step, we eventually reach
the solution
with the
x value, which is (x1 , y1 ). Then we will have shown that
√
√ smallest
k
u + v D = (x1 + y1 D) for some k.
46
JAMES NEWTON
We are looking for s, t such that
√
√
√
u + v D = (x1 + y1 D)(s + t D)
Multiplying out the equation we get
√
√
u + v D = (x1 s + Dy1 t) + (x1 t + y1 s) D
so we need to solve the simultaneous equations
u = x1 s + Dy1 t and v = x1 t + y1 s
for s and t. Multiplying the first equation by y1 and the second by x1 and taking the
difference we get
y1 u − x1 v = Dy12 t − x21 t = −t
so we get
t = x1 v − y1 u and similarly s = ux1 − Dy1 v
You can check with a bit of algebra that (s, t) is a solution, so it remains to show that
s < u and s, t are positive. Once we’ve shown s, t are positive then since u = x1 s + Dy1 t
it’s immediate that u > s.
√
√
2
2
2
We have
u
=
1
+
Dv
>
Dv
so
u
>
Dv
and
similarly
x
>
Dy1 , so s = ux1 −
1
√
√
Dy1 v > ( Dv)( Dy1 ) − Dy1 v = 0. Finally we need to show that t is positive. We have
t = x1 v − y1 u so we need to show that x1 v > y1 u. Squaring both sides, it suffices to show
that x21 v 2 > y12 u2 . We can rewrite this as
(1 + Dy12 )v 2 > y12 (1 + Dv 2 )
which simplifies to v 2 > y12 . This inequality follows from our assumption that u > x1 .
Fact 9.11. In fact, the Pell equation x2 − Dy 2 always has positive integer solutions. So the
above Theorem always applies, and tells us that every solution can be obtained from the
solution with the smallest value of x.
Proof of fact. (Not examinable) Here is a sketch proof of this fact: First we use Dirichlet’s
approximation theorem to show that there are infinitely many pairs of positive integers
(x, y) with
√
x
1
| D − | < 2.
y
y
In particular, there are infinitely many (x, y) with
√
√
√
√
√
1
1
|x2 − Dy 2 | = |x − y D||x + y D| < |x + y D| = |(x − y D) + 2y D|
y
y
√
√
1 1
< ( + 2y D) ≤ 1 + 2 D
y y
ELEMENTARY NUMBER THEORY (5CCM224A AND 6CCM224B)
47
Applying
the pigeonhole principle, we deduce that there is an integer M with |M | <
√
1+2 D and infinitely many solutions to x2 −Dy 2 = M . Applying the pigeonhole principle
again, we can find two distinct solutions (x1 , y1 ), (x2 , y2 ) to this equation with
x1 ≡ x2
(mod M ) and y1 ≡ y2
(mod M ).
Now we consider
√
√
√
√
x1 − Dy1
x1 − Dy1 x2 + Dy2
√
√
√
=
=u+v D
x2 − Dy2
x2 − Dy2 x2 + Dy2
with u, v ∈ Q. We claim that u, v are non-zero integers and u2 − Dv 2 = 1. Switching signs
if necessary, this gives positive integer solutions to the Pell equation.
First we show that u, v are integers. We have
x1 y 2 − x2 y 1
x1 x2 − Dy1 y2
and v =
u=
M
M
and it follows from the congruence between x1 , x2 and y1 , y2 , together with the fact that
these are solutions to x2 − Dy 2 = M , that u and v are integers. Now since √
we have
2
2
u − Dv√ = 1, we must have u 6= 0. If v = 0 then u = ±1 which implies that x1 − Dy1 =
±(x2 − Dy2 ) which contradicts the fact that (x1 , y1 ) and (x2 , y2 ) are chosen to be distinct
pairs of positive integers.
Example. Let’s find a solution to
x2 − 13y 2 = 1
Applying the proof of Dirichlet’s approximation theorem gives (x, y) with x2 − 13y 2 small.
In particular, we have
112 − 13 · 32 = 1192 − 13 · 332 = 4
Unfortunately these solutions are not congruent mod 4, so
√
√
119 − 33 13
11 3 √
√
u + v 13 =
−
13
=
2
2
11 − 3 13
is not an integer solution. Searching further (with a computer!) we find that (14159, 3927)
is a solution to x2 − 13y 2 = 4, and 14159 ≡ 11 (mod 4), 3927 ≡ 3 (mod 4). We get
√
√
√
14159 − 3927 13
√
u + v 13 =
= 649 − 180 13
11 − 3 13
so a solution to the original equation is given by (649, 180).
The above method doesn’t seem to be a very efficient way of finding solutions! Continued
fractions, which we mentioned in the section on Dirichlet’s approximation theorem, give a
much faster way of computing solutions.

Download Report

ELEMENTARY NUMBER THEORY

Paperzz.com

Your Paperzz