ELEMENTARY NUMBER THEORY (5CCM224A AND 6CCM224B) JAMES NEWTON Course Arrangements Problem sheets and solutions, lecture notes and other information will all be posted on the KEATS page for the course. Tutorials. Weekly from second week of semester 2 (week 22) onwards. Assessment. 90% of your mark is from the final exam, 10% from a class test (in week 28). Office hours. My office hours are Thursdays 2-4pm. If you have questions, comments or any feedback about the course you are encouraged to come to my office hours to discuss. If the timing of the office hours isn’t convenient, email me at [email protected] and we can make an appointment to meet at some other time. I am also happy to answer questions and receive comments by email. Other reading. The lecture notes from last year’s course can be found on KEATS. The course content this year will be very similar, but the material will be reorganised a little. Here are three recommended textbooks which cover much of the course material: • Elementary number theory — David M. Burton • A friendly introduction to number theory — Joseph H. Silverman • Elementary number theory — Gareth A. Jones and Mary R. Jones Course outline. This course is an introduction to a variety of topics in number theory. The reason for the word ‘elementary’ in the course title is that we won’t rely on any advanced theory from other parts of mathematics (e.g. real or complex analysis, commutative algebra) — we will use a little bit of the material on groups and rings from the course Introduction to abstract algebra, but even this isn’t really essential. Here are some of the topics we will cover: (1) Review from Introduction to abstract algebra: divisibility, prime numbers, congruences (2) The ring of residue classes mod m and its unit group, primitive roots (3) Quadratic residues and the quadratic reciprocity law (4) Sums of squares (5) Irrational and transcendental numbers Date: March 14, 2017. 1 Week 1 2 JAMES NEWTON (6) Some examples of solving equations in integers and rational numbers: ‘Diophantine equations’ What is number theory? Number theory is the study of the natural numbers, and the integers: N = {1, 2, 3, . . .} Z = {. . . , −2, −1, 0, 1, 2, . . .} We are interested in different kinds of numbers (even, odd, square, prime,. . . ) and the relationships between them. Here are some examples of number theoretic questions, some of which we will discuss more in the course: • Can we describe all the integer solutions (a, b, c) to the equation a2 +b2 = c2 ? What about if we replace the squares by cubes, fourth powers, etc.? • Which numbers are sums of two squares? • What can we say about the gaps between prime numbers? 1. Divisibility Definition 1.1. Let a and b be two integers. We say that b divides a if there exists an integer q such that a = qb. If b divides a, we write b|a. Here are some basic properties of divisibility: let a, b, c, be three integers (1) (2) (3) (4) (5) If a|b and b|c, then a|c. If a|b and a|c then a|(bx + cy) for all x, y ∈ Z. If a|1 then a = ±1. If a|b and b|a then a = ±b. Suppose c 6= 0. Then a|b if and only if ac|bc. Exercise. Prove the above properties. Theorem 1.2. Let a ∈ Z and b ∈ N. Then there exists unique integers q, r ∈ Z such that a = qb + r and 0 ≤ r < b. Proof. Let S = {a − nb : n ∈ Z}. Then we let r = a − qb be the smallest non-negative element of S. Since S contains positive elements (take n to be the negative of a very large integer) this smallest element exists by the well-ordering principle. We have 0 ≤ r < b, otherwise r − b is a smaller non-negative element of S. If a = qb + r = q 0 b + r0 then b|(r − r0 ), and −b < r − r0 < b so this implies that r = r0 and therefore q = q 0 . This shows uniqueness of q and r. ELEMENTARY NUMBER THEORY (5CCM224A AND 6CCM224B) 3 Greatest common divisors. Definition 1.3. Let a and b be integers. If d is another integer with d|a and d|b we say that d is a common divisor of a and b. If at least one of a and b are non-zero, we define the greatest common divisor of a and b to be the largest positive integer d which is a common divisor of a and b. We write gcd(a, b) for the greatest common divisor of a and b. Exercise. (1) Why did we not define gcd(0, 0)? (2) Show that gcd(a, b) = gcd(|a|, |b|) (3) For a a non-zero integer, show that gcd(a, 0) = |a|. The Euclidean algorithm. The most obvious way to find the gcd of two integers is to write down all the divisors of each number and compare them to find the greatest common divisor. Of course this method will be very slow in practice! A more efficient way of calculating the gcd is based on the following: Lemma 1.4. If a = qb + r then gcd(a, b) = gcd(b, r). Proof. Suppose d is a common divisor of a and b. Since r = a − qb we also have d|r so d is a common divisor of b and r. In the other direction, if d is a common divisor of b and r, then we also have d|a, since a = qb + r. So d is a common divisor of a and b. We therefore see that the two sets of integers {common divisors of a and b}, and {common divisors of b and r} are the same, so they have the same largest element and we have gcd(a, b) = gcd(b, r). Suppose we start off with integers a, b and we want to work out gcd(a, b). First we do some easy simplifications: we can assume that a and b are both non-zero, since the gcd is easy to work out if one of them is zero. Also, since gcd(a, b) = gcd(|a|, |b|) we can assume that a and b are both positive. If a = b then gcd(a, b) = |a|, so we can assume a 6= b. Finally, since gcd(a, b) = gcd(b, a) we can assume a > b > 0. The above Lemma tells us that we can work out gcd(a, b) by first dividing a by b and then working out the gcd of b and r, where b > r ≥ 0. Since r is smaller than b we have reduced the problem to working out the gcd of smaller integers. Let’s write this out more formally. First we write a = q1 b + r1 with 0 ≤ r1 < b. If r1 = 0 then b|a and gcd(a, b) = b. If r1 6= 0 we divide again and write: b = q2 r1 + r2 with 0 ≤ r2 < r1 . We have gcd(a, b) = gcd(b, r1 ) = gcd(r1 , r2 ). If r2 = 0 then gcd(a, b) = r1 . If r2 6= 0 we divide again: r1 = q3 r2 + r3 with 0 ≤ r3 < r2 and we continue on in this way. Since b > r1 > r2 > · · · ≥ 0 we eventually get a remainder rn = 0 (with rn−1 > 0). So gcd(a, b) = gcd(b, r1 ) = gcd(r1 , r2 ) = · · · = gcd(rn−1 , rn ) = gcd(rn−1 , 0) = rn−1 . Let’s do an example: 4 JAMES NEWTON Example. Let a = 1492 and b = 1066. We have 1492 = 1 · 1066 + 426 1066 = 2 · 426 + 214 426 = 1 · 214 + 212 214 = 1 · 212 + 2 212 = 106 · 2 + 0 The last non-zero remainder is 2, so gcd(1492, 1066) = 2. Exercise. Write a computer program to implement the Euclidean algorithm. This algorithm gives us a practical way to work out the gcd of two integers. It also gives us an important theoretical result: Theorem 1.5 (Bezout’s lemma). Let a and b be integers (not both 0). Then there exist integers u and v such that gcd(a, b) = au + bv Proof. As before, we can reduce to the case that a > b > 0. The last two steps in the Euclidean algorithm to compute gcd(a, b) have rn−3 = qn−1 rn−2 + rn−1 and rn−2 = qn rn−1 + 0 with rn−1 = gcd(a, b). So we have an equation gcd(a, b) = rn−3 − qn−1 rn−2 . From the previous equation in the Euclidean algorithm, we get rn−2 = rn−4 − qn−2 rn−3 , so we can substitute this in to eliminate rn−2 and write gcd(a, b) as a linear combination of rn−3 and rn−4 . Repeating this process, we eventually write gcd(a, b) as a linear combination of a and b. Note that the above proof gives us a practical algorithm to compute integers u, v with gcd(a, b) = au + bv: ELEMENTARY NUMBER THEORY (5CCM224A AND 6CCM224B) 5 Example. We again set a = 1492 and b = 1066. We have gcd(a, b) = 2 = 214 − 1 · 212 = 214 − 1 · (426 − 1 · 214) = −1 · 426 + 2 · 214 = −1 · 426 + 2(1066 − 2 · 426) = 2 · 1066 − 5 · 426 = 2 · 1066 − 5(1492 − 1 · 1066) = −5 · 1492 + 7 · 1066, Bezout’s lemma is very useful for establishing theoretical properties of the greatest common divisor. Exercise. Extend your programme implementing the Euclidean algorithm so it also outputs integers u, v with gcd(a, b) = au + bv. Another proof of Bezout’s lemma. Here is a different proof. Proposition 1.6. Let a, b be integers, not both zero, and consider the set S = {au + bv : u, v ∈ Z}. Let d > 0 be the smallest positive integer in S. Then d = gcd(a, b). Proof. We have d = au + bv. Dividing a by d we get a = qd + r with 0 ≤ r < d. Since r = a − qd = a(1 − qu) − b(qv) ∈ S and d is the smallest positive element of S we must have r = 0. So d|a. By the same argument, we have d|b, so d is a common divisor of a and b. Since gcd(a, b) divides a and b, it divides d = au + bv. In particular, we have gcd(a, b) ≤ d. Since d is a common divisor of a and b we also have d ≤ gcd(a, b) and we conclude that d = gcd(a, b). Remark. Here are two consequences of this Proposition: (1) We have integers u, v such that gcd(a, b) = au + bv (in other words, the Proposition gives a new proof of Bezout’s lemma) (2) gcd(a, b) = 1 if and only if there are integers u, v such that 1 = au + bv. Corollary 1.7. Let a, b be integers, not both zero, and again consider the set S = {au + bv : u, v ∈ Z}. We can also consider the set S 0 = {n gcd(a, b) : n ∈ Z}. Then the two sets of integers S, S 0 are equal. 6 JAMES NEWTON Proof. We let S 0 = {n gcd(a, b) : n ∈ Z}. We are going to show that S ⊂ S 0 and S 0 ⊂ S and deduce that S = S 0 . For the first inclusion, if x = au + bv ∈ S then, since gcd(a, b) divides a and b, it divides au + bv = x. So x is a multiple of gcd(a, b) which means that x ∈ S 0 . This proves that S ⊂ S 0 . Conversely, we now prove that S 0 ⊂ S. So let y ∈ S 0 . We have y = n gcd(a, b). By Bezout’s lemma, we have gcd(a, b) = au + bv for some integers u, v. Multiplying by n we get y = a(nu) + b(nv). So y ∈ S. This shows that S 0 ⊂ S, and we have completed the proof. Corollary 1.8. Let a, b be integers, not both zero. Let c be an integer. Then c is a common divisor of a and b if and only if c| gcd(a, b). Week 2 Proof. If c| gcd(a, b) then c|a and c|b, so c is a common divisor of a and b. Conversely, suppose c is a common divisor of a and b. By Bezout’s lemma we have integers u, v such that gcd(a, b) = au + bv. Since c|a and c|b we c|(au + bv) = gcd(a, b). Coprimality. Definition 1.9. We say that two integers a, b are coprime or relatively prime if gcd(a, b) = 1. Lemma 1.10. Suppose a, b are coprime. (1) If a|c and b|c then (ab)|c. (2) If a|(bc) then a|c. (3) If a and c are also coprime, then a and bc are coprime. Proof. (1) We have 1 = au + bv for some integers u, v. We can also write c = ae and c = bf , since a|c and b|c. Multiplying the first equation by c we get c = cau + cbv = (bf )au + (ae)bv = ab(f u + ev) so (ab)|c. (2) Again we have c = cau + cbv. Since a|(bc) and a|a, we get that a|a(cu) + (bc)v = c. (3) We have 1 = au + bv and 1 = ax + cy. Multiplying the equations together gives 1 = (au + bv)(ax + cy) = a(uax + ucy + bvx) + bc(vy). It follows from Proposition 1.6 that gcd(a, bc) = 1. Least common multiples. Definition 1.11. If a, b are integers, then a common multiple of a and b is an integer c such that a|c and b|c. If a and b are both non-zero, the least common multiple of a and b is defined to be the smallest positive integer lcm(a, b) which is a common multiple of a and b. ELEMENTARY NUMBER THEORY (5CCM224A AND 6CCM224B) 7 Proposition 1.12. Let a, b be non-zero integers. Then gcd(a, b)lcm(a, b) = |ab|. Proof. Set a0 = a gcd(a,b) and b0 = b . gcd(a,b) We want to show that |ab| gcd(a,b) = lcm(a, b). First |ab| gcd(a,b) we need to check that is a (positive) common multiple of a and b. Since gcd(a, b) is a common divisor of a and b, this follows from the fact that |b| |a| |ab| = |a| = |b| gcd(a, b) gcd(a, b) gcd(a, b) |ab| Now we need to show that gcd(a,b) is the least common multiple. Suppose c is an arbitrary positive common multiple of a and b. We have a|c and b|c. Dividing by gcd(a, b), we get c that a0 and b0 divide gcd(a,b) . Since a0 and b0 are coprime (by Problem Sheet 1), Lemma 1.10 implies that a0 b0 divides c . gcd(a,b) Multiplying by gcd(a, b), we deduce that gcd(a, b)|a0 b0 | divides c, so in particular c ≥ This shows that |ab| gcd(a,b) |ab| gcd(a,b) = |ab| . gcd(a,b) = lcm(a, b). Remark. We showed in the above proof that any common multiple of a and b is a multiple of lcm(a, b). 1.1. Linear Diophantine equations. Diophantine equations are equations in one or more variables, for which we seek integer-valued solutions. Corollary 1.7 allows us to describe the solutions to the simplest family of Diophantine equations: linear Diophantine equations where we have integers a, b, c and look for integer solutions (x, y) to ax + by = c. Theorem 1.13. Let a, b, c be integers, with a and b not both 0, and let d = gcd(a, b). The equation ax + by = c has an integer solution (x, y) if and only if c is a multiple of d. Now we assume c is a multiple of d. Fix integers u, v with au + bv = d. Then the solutions to ax + by = c are given by (xn , yn )n∈Z , where b c a c xn = u + n, yn = v − n d d d d Proof. The first part of the theorem follows immediately from Corollary 1.7. For the second part, we can check by substituting in that the pairs (xn , yn ) are solutions to the equation. It remains to check that these are all the solutions. Suppose that (x, y) is a solution. We can assume that b 6= 0 (otherwise we swap a and b). We have ax+by = c and ax0 +by0 = c, so subtracting one equation from the other gives a(x − x0 ) + b(y − y0 ) = 0 8 JAMES NEWTON Dividing by d and rearranging gives a b (x − x0 ) = − (y − y0 ). d d b a Again we use the fact that d and d are coprime (Problem Sheet 1). Since db divides ad (x−x0 ) it follows from Lemma 1.10 that db divides x − x0 . So we have n ∈ Z such that x − x0 = db n. n = yn . This shows that x = xn and we then have y = c−ax b 2. Prime numbers Definition 2.1. An integer p > 1 is called a prime number or a prime if it has no positive divisors other that 1 and p. An integer n > 1 is called composite if it is not prime. Remark. By convention, the number 1 is neither prime nor composite. Lemma 2.2. (1) Let p be a prime number and let a, b be integers. Suppose p|ab. Then p|a or p|b. (2) If we have integers a1 , a2 , . . . , an and p|(a1 a2 · · · an ) then p|ai for some i. Proof. (1) If p does not divide a, then gcd(a, p) = 1. So it follows from Lemma 1.10 that p|b. (2) We repeatedly apply the first part. If p does not divide a1 then p|(a2 a3 · · · an ). If p does not divide a2 either, then p|(a3 a4 · · · an ). We see that p eventually has to divide one of the ai . Theorem 2.3 (Fundamental theorem of arithmetic). Every integer n > 1 can be expressed uniquely (up to reordering) as a product of primes. Proof. First we prove existence of a prime factorisation, by induction on n. Since 2 is prime, it has a prime factorisation. For the inductive step, we suppose that all integers less than n can be expressed as a product of primes. If n is prime, then it is its own prime factorisation and we are done. If n is not prime, then we have n = md with 1 < m < n and 1 < d < n. By our inductive hypothesis, both m and d have prime factorisations, so n does. Uniqueness follows from Lemma 2.2: suppose n = p1 p2 · · · pr = q1 q2 · · · qs are two prime factorisations. Since p1 |n we have p1 |qi for some i. Up to reordering we can assume p1 |q1 , and since q1 is prime we must have p1 = q1 . Now we dividing through by p1 we get p2 · · · pr = q2 · · · qs Repeatedly applying the same argument, we can show that r = s and pi = qi for all i’s (after reordering the qi ). If n > 1 is an integer, we often write its prime factorisation as n = pa11 pa22 · · · par r ELEMENTARY NUMBER THEORY (5CCM224A AND 6CCM224B) 9 with the pi distinct primes and the ai positive integers. Sometimes we will allow some of the ai to be equal to 0 as well (in this case prime pi is not a factor of n). We can read off information about the divisibility properties of integers from their prime factorisations. Lemma 2.4. Let n = pa11 pa22 · · · par r with the pi distinct primes and the ai positive integers. (1) d > 0 is a divisor of n if and only if d = pb11 pb22 · · · pbrr with 0 ≤ bi ≤ ai for each i. Q (2) The number of positive divisors of n is ri=1 (ai + 1). Proof. Exercise (see problem sheet 2). Lemma 2.5. Let m, n be two positive integers with m = pa11 pa22 · · · par r n = pb11 pb22 · · · pbrr where ai , bi are integers which are ≥ 0. (1) gcd(m, n) = pe11 pe22 · · · perr where ei = min(ai , bi ) (2) lcm(m, n) = pf11 pf22 · · · pfrr where fi = max(ai , bi ) Proof. Exercise (use the previous lemma to work out what the common divisors of m and n are, and use Proposition 1.12 to work out the lcm once you know the gcd). Theorem 2.6 (Euclid). There are infinitely many primes. Proof. We give a proof by contradiction. Suppose there are only finitely many primes. Then there is an integer N such that N ≥ p for all primes p. Consider the integer n = N ! + 1 = 1 · 2 · · · (N − 1) · N + 1. By the fundamental theorem of arithmetic, there is a prime p with p|n. Since p ≤ N , we also have p|N !. But 1 = n − N ! so this implies that p|1 which is impossible. Exercise. Prove that there are infinitely many primes of the form 4k − 1, with k a positive integer. Proposition 2.7. There are arbitrarily large gaps between consecutive primes. More precisely, if k ≥ 2 is an integer then there are k consecutive integers which are composite. Proof. Let k ≥ 2 be an integer. We are going to show that there are k consecutive integers none of which is prime. Consider a2 = (k + 1)! + 2, a3 = (k + 1)! + 3, . . . , ak+1 = (k + 1)! + (k + 1) this gives k consecutive integers, and we have i|ai for each 2 ≤ i ≤ k + 1. We also have ai > i for each i, so none of the ai is prime. 10 JAMES NEWTON Example. The above proof tells us that the seven consecutive integers 8!+2 = 40322, . . . , 8!+ 8 = 40328 are composite. In fact, we can find a smaller sequence: the seven consecutive integers 90, 91, 92, 93, 94, 95, 96 are all composite. A very famous conjecture in number theory is about small gaps between primes: Conjecture (Twin prime conjecture). There are infinitely many prime numbers p such that p + 2 is also prime. Recently, a major breakthrough was achieved by Yitang Zhang, which, with improvements by J. Maynard, T. Tao, and an online project with many collaborators called Polymath shows: Theorem 2.8 (Zhang–Maynard–Tao–Polymath, 2013–2014). There are infinitely many prime numbers p such that there is another prime number q with p < q < p + 246. This is the closest we can get to the twin prime conjecture, for now! 3. Congruences Definition 3.1. Let m be a non-zero integer, and let a, b ∈ Z. We say that a is congruent to b modulo m if m|(a − b). If a is congruent to b modulo m, we write a ≡ b (mod m). Here are some basic properties of congruences: (1) a ≡ b (mod m) ⇐⇒ b ≡ a (mod m) ⇐⇒ a − b ≡ 0 (mod m) (2) if a ≡ b (mod m) and b ≡ c (mod m), then a ≡ c (mod m) (3) if a ≡ b (mod m) and c ≡ d (mod m) then ac ≡ bd (mod m) and ax+cy ≡ bx+dy (mod m) for all x, y ∈ Z. (4) if a ≡ b (mod m) and d|m, then a ≡ b (mod d). (5) if a ≡ b (mod m) and c 6= 0, then ac ≡ bc (mod mc). Exercise. Prove the above properties. Here is the first part of (3) to get you started: if a ≡ b (mod m) and c ≡ d (mod m) then m|(a − b) and m|(c − d). We have ac − bd = a(c − d) + d(a − b) so m|(ac − bd). Week 3 Definition 3.2. Let m be a positive integer. A set {x1 , x2 , . . . , xr } is called a complete residue system modulo m if for every integer y there is exactly one xi such that y ≡ xi (mod m). Example. Let m be a positive integer. Then {0, 1, 2, . . . , m − 1} is a complete residue system modulo m. In general, every complete residue system has size m. There are other possibilities too: for example, if m = 5 then {−2, −1, 0, 1, 2} is a complete residue system modulo 5. ELEMENTARY NUMBER THEORY (5CCM224A AND 6CCM224B) 11 Definition 3.3. Let m be a non-zero integer and a ∈ Z. The residue class or congruence class of a is the set [a]m = {b ∈ Z : b ≡ a (mod m)} We might just write [a] if we don’t need to specify the modulus m. Lemma 3.4. [a]m = [b]m ⇐⇒ a ≡ b (mod m) Proof. This follows from the definitions, it’s an exercise to write down the proof. If follows from the above Lemma that if {x1 , x2 , . . . , xm } is a complete residue system modulo m then for every integer a there is a unique xi such that [a]m = [xi ]m . The collection of all congruence classes modulo m is therefore a finite set of cardinality m. Definition 3.5. For a positive integer m, we let Zm denote the set of congruence classes modulo m. Remark. We can write Zm = {[0]m , [1]m , . . . , [m − 1]m }. In fact, if {x1 , x2 , . . . , xm } is any complete residue system modulo m then we have Zm = {[x1 ]m , [x2 ]m , . . . , [xm ]m }. Now we are going to recall the fact that the congruence classes modulo m naturally form a commutative ring. In other words, we can add and multiply congruence classes. Definition 3.6. For m 6= 0 and a, b ∈ Z we define addition and multiplication operations on Zm by: [a]m + [b]m = [a + b]m [a]m · [b]m = [a · b]m . Using the properties of congruences, you can check that these binary operations are well-defined: in other words, if a ≡ a0 (mod m) and b ≡ b0 (mod m) then [a]m + [b]m = [a0 ]m + [b0 ]m and [a]m · [b]m = [a0 ]m · [b0 ]m . Solving equations in Zm . We already saw the example of linear Diophantine equations. Suppose we want to find integer solutions x to a polynomial equation in one variable like x7 + 2x + 2 = 0. This might be difficult in general, but in the next few lectures we will study the easier problem of solving these equations modulo m for positive integers m. 12 JAMES NEWTON Problem. Given a polynomial f (x) = a0 + a1 x + · · · + an xn with ai ∈ Z and a positive integer m, find all integers a ∈ Z such that f (a) ≡ 0 (mod m). Lemma 3.7. Let f (x) = a0 + a1 x + · · · + an xn be a polynomial with integer coefficients ai ∈ Z. If a ≡ b (mod m) then f (a) ≡ f (b) (mod m). Proof. Since a ≡ b (mod m) we have ai ≡ bi (mod m) for all i ≥ 0. So ai ai ≡ ai bi (mod m) for all i ≥ 0. Adding together these congruences we get f (a) ≡ f (b) (mod m). This lemma means that to find all the integers a ∈ Z such that f (a) ≡ 0 (mod m) we just need to check if f (a) ≡ 0 (mod m) or not for one a in each congruence class. Moreover, we can view the set of solutions to the congruence equation f (x) ≡ 0 (mod m) as a subset of Zm . Another way of seeing this is that we can consider the reduction modulo m of f (x): [f ]m (x) = [a0 ]m + [a1 ]m x + · · · + [an ]m xn this is a polynomial with coefficients in Zm . If [a]m ∈ Zm we can evaluate [f ]m (x) at [a]m to and the Problem is to solve the equation [f ]m (x) = [0]m in Zm . Again we see that the solutions to this equation are a subset of Zm . Example. Let f (x) = x2 + 1 and consider m = 3. Then there are no solutions to f (x) ≡ 0 (mod 3) (since none of 02 + 1, 12 + 1 and 22 + 1 are divisible by 3)/ Consider m = 5. Then the solutions to f (x) ≡ 0 (mod 5) are given by x ≡ ±2 (mod 5). Remark. If we replace f (x) be a polynomial g(x) whose coefficients are all congruent to those of f modulo m, then the solutions to f (x) ≡ 0 (mod m) and g(x) ≡ 0 (mod m) are exactly the same. Here are some examples I didn’t do in the lecture: Example. Let m = 5 and f (x) = x2 + 2x + 2. We check which elements in the complete residue system {−2, −1, 0, 1, 2} satisfy f (x) ≡ 0 (mod 5): f (−2) = 2 f (−1) = 1 f (0) = 2 f (1) = 5 ≡ 0 f (2) = 10 ≡ 0 (mod 5) (mod 5) So the solutions to f (x) ≡ 0 (mod 5) are given by x ≡ 1, 2 (mod 5). Example. Let m = 4 and f (x) = x5 −x2 +x−3. We check which elements in the complete residue system {−1, 0, 1, 2} satisfy f (x) ≡ 0 (mod 4): ELEMENTARY NUMBER THEORY (5CCM224A AND 6CCM224B) 13 f (−1) = −6 f (0) = −3 f (1) = −2 f (2) = 27 So there are no solutions to f (x) ≡ 0 (mod 4). As an immediate consequence, there are no integer solutions to f (x) = 0, as a solution to this equation would give a solution (mod 4). If m is very large this method would take a long time! We are going to develop some tools that will enable us to solve this problem more efficiently. Theorem 3.8 (Chinese Remainder Theorem). Let m1 , . . . , mr be positive integers, with gcd(mi , mj ) = 1 for all i 6= j. Let a1 , . . . , ar be integers. Then the solutions of the simultaneous congruence equations x ≡ a1 (mod m1 ) x ≡ a2 .. . (mod m2 ) x ≡ ar (mod mr ) are given by integers x lying in a single congruence class (mod m1 m2 · · · mr ). Proof. We start by doing the case r = 2. By Bezout’s lemma we have integers u, v such that m1 u + m2 v = 1. In particular, we have m1 u ≡ 1 (mod m2 ) and m2 v ≡ 1 (mod m1 ). Suppose x ≡ m1 ua2 + m2 va1 (mod m1 m2 ). Then we have x ≡ m2 va1 ≡ a1 (mod m1 ) and x ≡ m1 ua2 ≡ a2 (mod m2 ). This gives us some solutions, and now we check that all the solutions are given by x ≡ m1 ua2 + m2 va1 (mod m1 m2 ). Suppose x0 is another solution. Then x − x0 ≡ a1 − a1 ≡ 0 (mod m1 ), and similarly x − x0 ≡ 0 (mod m2 ). So m1 |(x − x0 ) and m2 |(x − x0 ). Since m1 and m2 are coprime this implies that m1 m2 divides (x − x0 ). We deduce that x0 ≡ x (mod m1 m2 ). This finishes the case of r = 2. For the general case r > 2 we just repeatedly apply the case r = 2. Let’s explain the case r = 3: we have three simultaneous equations and we begin by considering the first two: x ≡ a1 (mod m1 ) x ≡ a2 (mod m2 ) The case r = 2 tells us that these two equations are equivalent to the equation x ≡ a12 (mod m1 m2 ) where a12 is some integer (it’s m1 ua2 +m2 va1 , using the notation from earlier). Now we just have two simultaneous equations to solve: 14 JAMES NEWTON x ≡ a12 x ≡ a3 (mod m1 m2 ) (mod m3 ) and applying the case r = 2 again tells us that these are equivalent to x ≡ a (mod m1 m2 m3 ) for some integer a. Note that the above proof gives a procedure for finding the solutions x (see the example later). Here is the ring-theoretic formulation of this theorem: Theorem (Chinese Remainder Theorem). Let m1 , . . . , mr be positive integers, with gcd(mi , mj ) = 1 for all i 6= j. Set m = m1 m2 · · · mr . The map Zm → Zm1 × Zm2 × · · · × Zmr given by [a]m 7→ ([a]mi )i=1,...,r is a bijection (in fact it is a ring isomorphism). Example. Let’s find all the solutions to x2 + 1 ≡ 0 (mod 10). We begin by finding solutions to x2 + 1 ≡ 0 (mod mi ) where m1 = 2, m2 = 5. The solutions are given by x ≡ 1 (mod 2), x ≡ −2, 2 (mod 5). Now we apply the proof of the CRT: the integers u, v in Bezout’s lemma can be taken to be u = −2, v = 1 since 1 = 2(−2) + 5(1). Our solutions are given by x ≡ m1 ua2 + m2 va1 (mod m1 m2 ) where m1 = 2, m2 = 5, a1 = 1 and a2 = ±2. So we get x ≡ 2 · (−2) · (±2) + 5 · (1) · (1) ≡ ∓8 + 5 ≡ ±3 (mod 10) Subsituting back in you see that these do give solutions, since (±3)2 + 1 = 10 ≡ 0 (mod 10). Now we move on to discussing inverses modulo an integer, which appeared implicitly in the proof of the Chinese Remainder Theorem (where we find u and v such that m1 u ≡ 1 (mod m2 ) and m2 v ≡ 1 (mod m1 ).) Lemma 3.9. Let m be a positive integer and let a ∈ Z. If gcd(a, m) = 1 then there exists b ∈ Z such that ab ≡ 1 (mod m). We call such a b an inverse of a modulo m, and denote the residue class [b]m (which depends only on the residue class [a]m ) by [a]−1 m . ELEMENTARY NUMBER THEORY (5CCM224A AND 6CCM224B) 15 Proof. We know by Bezout’s lemma that we have au + mv = 1 for some integers u, v. This says that au ≡ 1 (mod m) so u is an inverse to a modulo m. If a0 ≡ a (mod m) then a0 u ≡ au ≡ 1 (mod m) so u is also an inverse to a0 (mod m). Finally if b, b0 are two inverses to a modulo m then ab ≡ ab0 (mod m) which implies that m|a(b − b0 ). Since m, a are coprime this implies that m|b − b0 and so [b]m = [b0 ]m . This shows that the residue class [b]m depends only on [a]m (in particular, it is independent of the choice of b). Remark. We showed that if a, m are coprime then there is an inverse to a modulo m. In fact, if there exists b such that ab ≡ 1 (mod m) then a must be coprime to m, since we know that there are integers u, v with au + mv = 1 if and only if a, m are coprime. Example. If p is a prime number, then the congruence classes [1]p , . . . , [p − 1]p has an inverse modulo p. The remaining congruence class [0]p does not have an inverse modulo p. Definition 3.10. We write Z× m for the multiplicative group of integers modulo m (of the group of units modulo m), which are defined by Z× m = {[a]m ∈ Zm : a, m are coprime} The explanation for the terminology in the above definition is that it follows from Lemma 3.9 that Z× m with the multiplication operation is a (commutative) group, with −1 identity element [1]m . In particular, for every g ∈ Z× ∈ Z× m there is an inverse g m with −1 −1 g · g = g · g = [1]m . We will investigate the structure of this group later in the course. Here is some terminology which is sometimes used to describe the analogue of a complete residue system, but just representing congruence classes in Z× m. Definition 3.11. Let m be a non-zero integer. A set {x1 , x2 , . . . , xr } is called a reduced residue system modulo m if for every integer y with gcd(y, m) = 1 there is exactly one xi such that y ≡ xi (mod m). We can use the existence of inverses to solve some simple equations modulo m: Proposition 3.12. Let a, b ∈ Z and let m be a positive integer. Set d = gcd(a, m). The congruence equation ax ≡ b (mod m) has an integer solution for x if and only if d|b. If d|b, the solutions are given by integers x such that −1 " # [x] md = a d m d b d m d Proof. If ax ≡ b (mod m) then b = ax + km for some integer k. So gcd(a, m) (which divides a and m) must divide b. Conversely, if d|b then a b m x≡ (mod ) d d d 16 JAMES NEWTON if and onlu if ax ≡ b (mod m). Multiplying by an inverse of since ad and md are coprime) we get that b a x≡ d d (mod a d modulo m d (which exists m ) d if and only if −1 " # [x] md = a d m d b d m d Hensel’s Lemma. We go back to find roots of polynomials modulo m. Suppose we have a prime factorisation m = pa11 pa22 · · · par r . The Chinese Remainder Theorem tells us that to solve an equation modulo m it suffices to solve the equation modulo pai i for each i. A very powerful tool for finding roots of polynomials modulo a prime power (like pai i ) is provided by Hensel’s Lemma: Theorem 3.13 (Hensel’s Lemma). Let f (x) = a0 + a1 x + · · · + an xn be a polynomial with integer coefficients, let p be a prime, and let r be a positive integer. We let f 0 (x) be the derivative of f (x), so f 0 (x) = a1 + 2a2 x + · · · + nan xn−1 . Suppose xr is an integer with f (xr ) ≡ 0 (mod pr ) f 0 (xr ) 6≡ 0 (mod p) and Then there exists xr+1 ∈ Z satisfying f (xr+1 ) ≡ 0 (mod pr+1 ) and xr+1 ≡ xr (mod pr ) Moreover, the xr+1 satisfying these properties is unique modulo pr+1 and we can take xr+1 = xr − f (xr )u where u is an inverse of f 0 (xr ) modulo p. i) Remark. The formula xr+1 = xr − f (xr )u is similar to the formula xi+1 = xi − ff0(x which (xi ) appears in the Newton–Raphson method for finding zeroes of a function f . Remark. If f 0 (xr ) ≡ 0 (mod p) then Hensel’s lemma doesn’t tell us anything — we have to find roots of the polynomial another way. We will prove Hensel’s Lemma using the following key result: Lemma 3.14. For t ∈ Z and a positive integer r, we have f (x + pr t) ≡ f (x) + f 0 (x)pr t (mod pr+1 ) where we view both sides as polynomials in x and we mean that all the coefficients of these two polynomials are congruent modulo pr+1 . ELEMENTARY NUMBER THEORY (5CCM224A AND 6CCM224B) 17 Proof. It suffices to prove the Lemma for f (x) = xn (multiply by the coefficients ai and add together to get the general case). We have (x + pr t)n = n X i=0 r i For i ≥ 2, (p t) =≡ 0 (mod p (x + pr t)n ≡ 1 X i=0 r+1 ! n n−i r i x (p t) i ), so we have ! n n−i r i x (p t) = xn + nxn−1 (pr t) i (mod pr+1 ) Proof of Hensel’s Lemma. We are looking for a solution to f (x) ≡ 0 (mod pr+1 ) of the form x = xr + pr t. Applying the previous lemma, we see that we get a solution if and only if f (xr ) + f 0 (xr )pr t ≡ 0 (mod pr+1 ) This equation is equivalent to pr t ≡ −f (xr )u (mod pr+1 ) where u is an inverse to f 0 (xr ) modulo p. So we deduce that all solutions to f (x) ≡ 0 (mod pr+1 ) with x ≡ xr (mod pr ) are given by x ≡ xr − f (xr )u (mod pr+1 ). Example. Let’s consider the equation f (x) = x2 + 1 ≡ 0 (mod 65e ) We are going to show that for each e ≥ 1 this equation has exactly 4 solutions in Z65e . It suffices to show that there are two solutions x1 , x2 modulo 5e and two solutions y1 , y2 modulo 13e for each e (then we apply the Chinese Remainder theorem to get four solutions, for example one congruent to x1 (mod 5e ) and one congruent to y2 (mod 13e )). First we consider the equation mod 5: there are two solutions x ≡ −2, 2 (mod 5). Now 0 f (x) = 2x, so f 0 (−2) and f 0 (2) are both non-zero mod 5. It follows that we can apply Hensel’s Lemma to show that there are exactly two solutions mod 52 , one congruent to 2 (mod 5), the other to −2. Repeatedly applying Hensel’s Lemma we get exactly two solutions mod 5e for each e ≥ 1. The same argument works mod 13 (the two solutions mod 13 are x ≡ ±5). So we conclude there are also exactly two solutions mod 13e . The conclusion of this section is that, using the CRT and Hensel’s Lemma, we can, in good situations (when the conditions of Hensel’s Lemma are met) reduce solving congruence equations modulo m to the problem of solving congruence equations modulo p, for prime factors p of m. The next theorem gives some control on the number of possible solutions. Theorem 3.15. Let p be a prime, and let f (x) = a0 + a1 x + · · · + an xn be a polynomial of degree ≤ n with integer coefficients (we allow the possibility that an = 0). We suppose that ai 6≡ 0 (mod p) for some i. Then the congruence equation f (x) ≡ 0 (mod p) Week 4 18 JAMES NEWTON has at most n solutions in Zp . Proof. The proof is by induction on n. If n = 0, and a0 6≡ 0 (mod p) then there are no solutions to a0 = f (x) ≡ 0 (mod p), as required. Now we do the inductive step. So we assume that n ≥ 1 and that all polynomials g(x) of degree ≤ n − 1 with some coefficient not divisible by p have at most n − 1 solutions. If f (x) ≡ 0 has no solutions, then there is nothing left to prove, so suppose [a]p is a solution. So p|f (a). We have f (x) − f (a) = n X ai (xi − ai ) i=1 and for each i we have xi − ai = (x − a)(xi−1 + axi−2 + · · · + ai−2 x + ai−1 ) Taking out the common factor of (x − a) we can write f (x) − f (a) = (x − a)g(x) where g(x) is a polynomial with integer coefficients of degree ≤ n − 1. If p divided all the coefficients of g(x) it would divide all the coefficients of f (x) − f (a). Since p|f (a) this implies that p divides all the coefficients of f (x), which is a contradiction. So there is a coefficient of g(x) which is not divisible by p and by our inductive hypothesis we conclude that the equation g(x) ≡ 0 has at most n − 1 roots in Zp . Now suppose we have f (b) ≡ 0 (mod p). Then we have (b − a)g(b) ≡ 0 (mod p). If p|(b − a)g(b) then either p|(b − a) or p|g(b). So there are at most n possibilities for b (mod p) — either b ≡ a (mod p) or the congruence class of b is one of the roots of g(b) in Zp . In fact the above Theorem is a special case of a general fact about roots of polynomials with coefficients in a field. Note that the statement of the Theorem doesn’t hold without the assumption that p is prime: for example, let’s consider the equation x2 ≡ 1 (mod 8) You can check that the solutions are given by x ≡ 1, 3, 5, 7 (mod 8) so there are > 2 solutions in Z8 . 4. Euler’s φ function Definition 4.1. Let m be a positive integer. We define φ(m) to be the number of integers a such that 1 ≤ a ≤ m and gcd(a, m) = 1. Equivalently φ(m) = |Z× m |, the cardinality of the multiplicative group Z× . m Example. If p is prime then φ(p) = p − 1. Lemma 4.2. Let m, n be coprime positive integers. Then φ(mn) = φ(m)φ(n). Proof. Here are two examples of proofs (which are mathematically equivalent, but one takes a slightly more group-theoretic viewpoint): ELEMENTARY NUMBER THEORY (5CCM224A AND 6CCM224B) 19 (1) Recall that the CRT gives a bijection Zmn → Zm × Zn where the maps is [a]mn 7→ ([a]m , [a]n ). We claim that this gives a bijection × × Z× mn → Zm × Zn and comparing cardinalities of the two sides proves the lemma. In fact this bijection is an isomorphism of groups. We need to show the claim. In other words, we need to show that gcd(a, mn) = 1 if and only if gcd(a, m) = gcd(a, n) = 1. A common divisor of a and m is a common divisor of a and mn, so if gcd(a, mn) = 1 we must have gcd(a, m) = 1; applying the same argument to n as well shows that if gcd(a, mn) = 1 we have gcd(a, m) = gcd(a, n) = 1. Conversely, suppose that gcd(a, m) = gcd(a, n) = 1. Then gcd(a, mn) = 1 by Lemma 1.10. This completes the proof of the claim. (2) We want to count integers 1 ≤ a ≤ mn with gcd(a, mn) = 1. First we note that a is coprime to mn if and only if it is coprime to m and n (see the previous paragraph for a proof of this). So we need to count integers 1 ≤ a ≤ mn with gcd(a, m) = gcd(a, n) = 1. Since m, n are coprime the Chinese Remainder Theorem says that for each pair of integers 1 ≤ a1 ≤ m, 1 ≤ a2 ≤ n there is an integer a, unique modulo mn with a ≡ a1 (mod m) and a ≡ a2 (mod n). Since the integers between 1 and mn are a complete residue system mod mn this says that there is a unique integer a with 1 ≤ a ≤ mn with a ≡ a1 (mod m) and a ≡ a2 (mod n). As we let a1 run over the φ(m) possibilities which are coprime to m and a2 run over the φ(n) possibilities which are coprime to mn, we get φ(m)φ(n) integers a with 1 ≤ a ≤ mn and a coprime to mn, and we get all such integers this way. So φ(mn) = φ(m)φ(n). This Lemma says that φ is a multiplicative function. If m and n are not coprime, we usually don’t have φ(mn) = φ(m)φ(n). For example φ(4) = 2 6= φ(2)φ(2) = 1. Lemma 4.3. Let p be prime and i a positive integer. Then φ(pi ) = pi−1 (p − 1) Proof. An integer a with 1 ≤ a ≤ pi is coprime to pi if and only if a is not divisible by p. So the number of such integers is equal to pi minus the number of multiples of p in {1, . . . , pi }. There are pi−1 multiples of p in this range, namely {p, 2p, . . . , pi−1 p}, so we get φ(pi ) = pi − pi−1 = pi−1 (p − 1). Example. φ(1000) = φ(23 53 ) = φ(23 )φ(53 ) = 4 · 25 · 4 = 400 Proposition 4.4. Let m be a positive integer. Then X 0<d|m φ(d) = m. 20 JAMES NEWTON Proof. (I didn’t do this proof in lectures) For each positive divisor d of m we let Sd = {1 ≤ a ≤ d : gcd(a, d) = 1}. We have |Sd | = φ(d). For every integer a with 1 ≤ a ≤ m we have m a m a positive divisor gcd(a,m) of m and an element gcd(a,m) of S gcd(a,m) . Conversely, for a ∈ Sd m m m m we get an integer a d with 1 ≤ a d ≤ m, and gcd(a d , m) = d . This shows that we have a bijection {1, 2, . . . , m} ↔ {pairs (d, a) : 0 < d|m and a ∈ Sd } The cardinality of the left hand side is m, whilst the cardinality of the right hand side is P 0<d|m φ(d). Exercise. Find another proof of the above first Proposition, by first proving it for m = pi using the exact formula we have for φ(pi ) (I did this part in the lecture) and then using P multiplicativity of φ to show that the function F (m) = 0<d|m φ(d) is also multiplicative. Theorem 4.5 (The Fermat–Euler Theorem). Let a ∈ Z and let m be a positive integer. Suppose gcd(a, m) = 1. Then aφ(m) ≡ 1 (mod m) × Proof. Since Z× m is a group of order φ(m), and [a]m ∈ Zm , it is an immediate consequence of Lagrange’s theorem in group theory that [a]φ(m) = [1]m , because the order of the cyclic m group generated by [a]m is a divisor of φ(m). In other words, aφ(m) ≡ 1 (mod m) Here is an alternative proof, without using group theory: let x1 , x2 , . . . , xφ(m) be the list of integers between 1 and m which are coprime to m. You can check that the integers ax1 , ax2 , . . . , axφ(m) are all distinct modulo m and coprime to m, so this gives another list (possibly reordered) of integers whose congruence classes give all the units in Zm . In other words we have Z× m = {[x1 ]m , [x2 ]m , . . . , [xφ(m) ]m } = {[ax1 ]m , [ax2 ]m , . . . , [axφ(m) ]m } Taking the product of all the elements in each list, we get [x1 ]m [x2 ]m · · · [xφ(m) ]m = [ax1 ]m [ax2 ]m · · · [axφ(m) ]m = [aφ(m) ]m [x1 ]m [x2 ]m · · · [xφ(m) ]m . Dividing through by the unit [x1 ]m [x2 ]m · · · [xφ(m) ]m gives [aφ(m) ]m = 1. Corollary 4.6 (Fermat’s Little Theorem). Let p be a prime and let a ∈ Z be an integer which is coprime to p. Then ap−1 ≡ 1 (mod p) If a is an arbitrary integer, we have ap ≡ a (mod p). Proof. The first part is a special case of the Fermat–Euler Theorem. The second part follows from the first, unless p|a, in which case ap ≡ a ≡ 0 (mod p), so the statement also holds in this case. Definition 4.7. Let m be a positive integer and let [a] ∈ Z× m . The order of [a] (also called the order of a modulo m) is defined to be the smallest positive integer i such that [a]i = [1]. In other words, it is the order of the cyclic subgroup of Z× m generated by [a]. ELEMENTARY NUMBER THEORY (5CCM224A AND 6CCM224B) 21 As noted in the proof of the Fermat–Euler Theorem, the order of [a] is a divisor of φ(m). 5. Primitive roots This section is about trying to understand the structure of Z× m as a group. We will briefly recall some basic facts from group theory. Let G be a finite group, with group multiplication · and identity e. The order of G is the number of elements of G. If g ∈ G then the order of g, denoted o(g) is the smallest positive integer i such that g i = e. The order o(g) is also the order of the cyclic group hgi = {e, g, g 2 , . . . , g o(g)−1 } generated by g. If i is an integer, we have g i = e ⇐⇒ o(g)|i. Problem. For which positive integers m is Z× m a cyclic group? In other words, for which m is there an integer a (coprime to m) such that the order of a modulo m is φ(m). If such an a exists, then Z× m is cyclic, generated by [a]m , and we say that a is a primitive root modulo m. Lemma 5.1. Let m be a positive integer, and let a be coprime to m. Then a is a primitive root modulo m if and only if aφ(m)/p 6≡ 1 (mod m) for all prime divisors p of φ(m). Proof. The order of a modulo m is a divisor of φ(m). There are two possibilities: either the order is equal to φ(m), or the order is less than φ(m), in which case it divides φ(m)/p for some prime factor p of φ(m). Exercise to finish from here. Example. Let m = 5. Then Z× 5 = {[1], [2], [3], [4]} and o([2]) = 4 = φ(5), by the previous Lemma since [2]2 6= [1]. So Z× is cyclic. 5 Example. Let m = 8. Then Z× 8 = {[1], [3], [5], [7]}. You can check that o([3]) = o([5]) = o([7]) = 2, so no element has order 4 = φ(8) and the group is not cyclic. Week 5 Theorem 5.2. Let p be a prime number. Then Z× p has φ(d) elements of order d, for each × 0 < d|(p − 1). In particular, Zp is cyclic, as there are φ(p − 1) elements of order p − 1 (which are the primitive roots). Proof. Let d|(p−1) be a positive divisor of p−1. We define Wd = {[a] ∈ Z× p : [a] has order d} and let wd = |Wd |. We want to show that wd = φ(d) for each d. We know that every element of Z× p is an element of Wd for exactly one d (the order of the element, which is necessarily a divisor of p − 1). So we have X 0<d|p−1 wd = p − 1. 22 JAMES NEWTON We also have (Proposition 4.4) X φ(d) = p − 1 0<d|p−1 so X (φ(d) − wd ) = 0. 0<d|p−1 We are going to show that wd ≤ φ(d) for all d. This means that the above sum is a sum of non-negative integers with value 0, so all the summands must be 0, and we are done. If Wd is empty then we obviously have 0 = wd ≤ φ(d). So suppose Wd contains an element [a]. The powers a, a2 , . . . , ad are all distinct modulo p, since a has order d modulo p, they are all roots of the polynomial f (x) = xd − 1 in Zp . It follows from Theorem 3.15 that these are all the roots of this polynomial, since it has at most d. Now let [b] be any element of Wd . Since [b] is a root of xd − 1 we have [b] = [ai ] for some i = 1, 2, . . . , d. Since o([b]) = d, we must have gcd(i, d) = 1, as we have [b]d/ gcd(d,i) = ([a]d )i/ gcd(d,i) = [1] (see Exercise 1 on Problem Sheet 5), so we must have d/ gcd(d, i) = d. It follows that there are at most φ(d) possibilities for [b] — it is equal to [ai ] for one of the φ(d) integers i with 1 ≤ i ≤ d and gcd(a, d) = 1. So we have shown that wd ≤ φ(d), which completes the proof. Applications of primitive roots. Example. (1) We check that 2 is a primitive root modulo 11: Since φ(11) = 10, by Lemma 5.1 we just need to check that 25 and 22 are not 1 mod 11, which is easy. (2) We find all solutions to 7x3 ≡ 3 (mod 11) We have [7]−1 11 = [−3]11 so we have to solve x3 ≡ −3 · 3 ≡ 2 (mod 11) Since 2 is a primitive root mod 11, we have i Z× 11 = {[2]11 : i = 0, 1, . . . , 10} so to solve the equation we need to find the i such that ([2]i11 )3 = [2]3i 11 = [2]11 . This is equivalent to 3i ≡ 1 (mod 10). We have [3]−1 10 = 7, so we get i≡7 (mod 10). So finally, the solution is given by [x]11 = [2]711 , or equivalently x ≡ 27 = 128 ≡ 7 (mod 11) ELEMENTARY NUMBER THEORY (5CCM224A AND 6CCM224B) 23 (3) We find all positive integers y such that 4y ≡ 5 (mod 11) Since 5 ≡ 24 (mod 1)1 we have to solve 22y ≡ 24 (mod 11). Equivalently, we have 2y ≡ 4 (mod 10). By Corollary 3.12, this is equivalent to y ≡ 2 (mod 5). (4) Here’s an additional example where we will find there are no solutions: consider the equation x8 ≡ 10 (mod 11) We have 10 ≡ −1 ≡ 25 (mod 11). So we need to find i such that ([2]i11 )8 = [2]8i 11 = [2]511 . This is equivalent to 8i ≡ 5 (mod 10). But, by Corollary 3.12, this equation has no solutions, since gcd(8, 10) = 2 - 5. So there are no solutions to the original equation. Now we are going to study the case of prime powers m = pi . Proposition 5.3. Let p be a prime number. Suppose a is a primitive root modulo p. Then either a or a + p is a primitive root modulo p2 . Proof. We have φ(p2 ) = p(p − 1). Since o([a]p ) = o([a + p]p ) = p − 1 we know that o([a]p2 ) and o([a + p]p2 ) are divisors of p(p − 1) which are divisible by p − 1 (see the below remark for more on this). So these orders are equal to either p − 1 or p(p − 1). If o([a]p2 ) = p(p − 1) then a is a primitive root modulo p2 and we are done. So suppose o([a]p2 ) = p − 1. Then (a + p)p−1 ≡ ap−1 + pap−2 ≡ 1 + pap−2 (mod p2 ) which is not congruent to 1, so we must have o([a + p]p2 ) = p(p − 1) and a + p is a primitive root modulo p2 . Remark. A bit more explanation of one of the steps in the above proof: we claimed that o([a]p2 ) and o([a + p]p2 ) are divisors of p(p − 1) which are divisible by p − 1. The fact that these orders divide p(p − 1) is because of Lagrange’s theorem: the order of an element 2 divides the order of Z× p2 which is φ(p ) = p(p − 1). The fact that the orders are divisible by p − 1 is because of the following (the same argument works for a and a + p): by definition we have ao([a]p2 ) ≡ 1 (mod p2 ). If an integer is divisible by p2 it is divisible by p, so we get ao([a]p2 ) ≡ 1 (mod p). But now we use a general fact about orders of elements in finite groups: if g i = e then o(g) divides i. In this case this means that o([a]p ) divides o([a]p2 ). Since a is a primitive root modulo p, we get that p − 1 divides o([a]p2 ). The Proposition immediately implies that Z× p2 is cyclic, and gives a procedure to find a primitive root if we already know a primitive root mod p. Proposition 5.4. Let p be an odd prime and suppose a is a primitive root modulo p2 . Then a is a primitive root modulo pi for all i ≥ 2. 24 JAMES NEWTON Proof. The proof is by induction. Let i ≥ 2 and suppose a is a primitive root modulo pi . We want to show that a is a primitive root modulo pi+1 . By the same proof as described in the above remark, we know that o([a]pi+1 ) is divisible by o([a]pi ) = pi−1 (p − 1), so it suffices to show that i−1 ap (p−1) 6≡ 1 (mod pi+1 ). i−1 i−2 We have ap (p−1) = (ap (p−1) )p . Since a is a primitive root modulo pi we know that ap but i−2 (p−1) ≡1 (mod pi−1 ) i−2 ap (p−1) 6≡ 1 (mod pi ) i−2 So there is an integer x such that ap (p−1) = 1 + pi−1 x and p does not divide x. Now we have ! p 2i−2 2 pi−1 (p−1) pi−2 (p−1) p i−1 p i a = (a ) = (1 + p x) ≡ 1 + p x + p x (mod pi+1 ) 2 and since p is an odd prime we have p| ap i−1 (p−1) p 2 , so pi+1 | p 2 ≡ 1 + pi x 6≡ 1 p2i−2 (here we use i ≥ 2). So we get (mod pi+1 ) since p does not divide x. This completes the proof. We already say that Z× 8 is not cyclic, so we cannot expect this Proposition to work for p = 2. i i Fact 5.5. Let m be a positive integer. Then Z× m is cyclic if and only if m = 1, 2, 4, p or 2p for some odd prime p and positive integer i. Here is a useful additional Proposition which I mentioned in a later lecture, generalising Theorem 5.2: Proposition 5.6. Suppose m > 0 is a positive integer, and suppose that Z× m has a primitive root. (For example, we could have m = pn with p an odd prime). Then the number of primitive roots in Z× m is φ(φ(m)). i Proof. Let g be a primitive root modulo m. Then the elements of Z× m are {[g] : 1 ≤ i ≤ φ(m)}. We claim that [g]i is a primitive root if and only if i is coprime to φ(m). In particular there are φ(φ(m)) primitive roots. The claim follows from Exercise 1 on Problem Sheet 5. 6. Quadratic residues Problem. Given a prime p and an integer a, decide whether x2 ≡ a (mod p) has a solution or not. If a ≡ 0 (mod p) then the equation is solved by x ≡ 0 (mod p). If p = 2, then the only other possibility is a ≡ 1 (mod 2), in which case the equation is solved by x ≡ 1 (mod 2). So from now on we assume that p is odd and a is coprime to p. ELEMENTARY NUMBER THEORY (5CCM224A AND 6CCM224B) 25 Definition 6.1. Let p be an odd prime and a an integer coprime to p. We say that a is a quadratic residue modulo p if the equation x2 ≡ a (mod p) has a solution, and we say that a is a quadratic non-residue modulo p otherwise. Note that whether a is a quadratic residue or not depends only on the congruence class [a]p . So it makes sense to ask if a congruence class [a] ∈ Z× p is a quadratic residue or not. Example. The quadratic residues modulo 11 are the congruence classes of 1, 3, 4, 5, 9. Proposition 6.2. Let g ∈ Z be a primitive root modulo p. Then [g i ]p is a quadratic residue if and only if i is even. Proof. If i = 2k is even, then (g k )2 ≡ g i (mod p), so [g i ] is a quadratic residue. Conversely, if x2 ≡ g i (mod p) then we have [x] = [g k ] for some integer k, so we have 2k g ≡ g i (mod p). Equivalently, we have 2k ≡ i (mod p − 1). Since 2k and p − 1 are even, i must also be even. Corollary 6.3. There are (p−1)/2 quadratic residues and (p−1)/2 quadratic non-residues in Z× p. p−2 ]}. By the Proof. Let g be a primitive root modulo p. We have Z× p = {[1], [g], . . . , [g 2 4 p−3 previous Proposition, the quadratic residues are {[1], [g ], [g ], . . . [g ]} and the quadratic non-residues are {[g], [g 3 ], . . . , [g p−2 ]}. Theorem 6.4. −1 is a quadratic residue modulo p if p ≡ 1 (mod 4) and a quadratic non-residue if p ≡ 3 (mod 4). Proof. Let g be a primitive root modulo p, and let x = g (p−1)/2 . We have x2 = g p−1 ≡ 1 (mod p) and x 6≡ 1 (mod p), since g is a primitive root. The equation x2 ≡ 1 (mod p) has only two solutions, so we have x ≡ −1 (mod p). We deduce that −1 is a quadratic residue if and only if (p − 1)/2 is even, which gives the statement of the Theorem. Corollary 6.5. There are infinitely many primes p with p ≡ 1 (mod 4). Proof. Suppose every prime with p ≡ 1 (mod 4) is ≤ N . Consider the positive integer n = (N !)2 + 1. Let p be a prime divisor of n. Since p is odd and does not divide N ! we have p ≡ 3 (mod 4). On the other hand (N !)2 +1 ≡ 0 (mod p), so −1 is a quadratic residue modulo p. This contradicts the Theorem above, so we deduce that there are infinitely many primes ≡ 1 (mod 4). This result is a special case of Dirichlet’s Theorem whose proof is beyond the scope of this course: Theorem 6.6 (Dirichlet’s Theorem). Let a, b be coprime positive integers. Then there exist infinitely many prime numbers p with p ≡ a (mod b) 26 JAMES NEWTON Corollary 6.7 (Euler’s criterion: Corollary to Proposition 6.2). Let [a] ∈ Z× p . Then [a] is a quadratic residue if and only if [a](p−1)/2 ≡ 1 (mod p) and [a] is a quadratic non-residue if and only if [a](p−1)/2 ≡ −1 (mod p) Proof. We let g be a primitive root and suppose [a] = [g]i . Then [a](p−1)/2 = [g]i(p−1)/2 . If i is even we have i(p − 1)/2 ≡ 0 (mod p − 1), so [g]i(p−1)/2 = [1]. If i is odd we have i(p − 1)/2 ≡ (p − 1)/2 (mod p − 1), so [g]i(p−1)/2 = [g](p−1)/2 = [−1], as in the proof of the Theorem. The Legendre symbol. We now give some new notation which will us to keep track of whether something is a quadratic residue. Definition 6.8. Let a be an integer. We define the Legendre symbol a p by, ! a = +1 if a is a quadratic residue modulo p p ! a = −1 if a is a quadratic non-residue modulo p p If p|a we define a p = 0. The definition of the Legendre symbol depends only on the congruence class of a, so we can also view it as being defined on elements [a] ∈ Zp . Lemma 6.9. The number of solutions in Zp to x2 ≡ a (mod p) is equal to 1 + a p . Proof. If p|a the only solution is x = [0], and 1 = 1 + 0 = 1 + ap . If a is a quadratic non-residue mod p then there are no solutions (by definition of a a non-residue), and 0 = 1 − 1 = 1 + p . If a is a quadratic residue mod p we have one solution x1 , and −x1 gives a second solution (since p > 2 and x1 6= [0] we have −x1 6= x1 ). Since the polynomial x2 − a has degree 2, there are at most two solutions, so we have two solutions in this case, and 2 = 1 + 1 = 1 + ap . Remark. Let’s reformulate Euler’s criterion (Corollary 6.7) in terms of the Legendre symbol. It says that if [a] ∈ Z× p then we have a (p−1)/2 a ≡ p ! (mod p). ELEMENTARY NUMBER THEORY (5CCM224A AND 6CCM224B) 27 In fact, if p|a we also have ! a a ≡ p because in this case both sides are 0 (mod p). (p−1)/2 (mod p) Lemma 6.10 (Multiplicativity of the Legendre symbol). Let a, b be two integers. Then ! ab a = p p ! b p ! Proof. By Euler’s criterion, as reformulated in the above remark, we have ! a ab ≡ (ab)(p−1)/2 ≡ a(p−1)/2 b(p−1)/2 ≡ p p ! b p ! (mod p) Since p > 2, the only way we can have a congruence ! ab a ≡ p p ! b p ! (mod p) between two numbers, which are one of −1, 0, +1, is if ! ab a = p p ! b p ! Lemma 6.11. ! X [a]∈Zp a =0 p Proof. We have shown that there are (p − 1)/2 quadratic residues and (p − 1)/2 nonresidues. So the sum has (p − 1)/2 terms equal to +1 and the same number equal to −1. The remaining term is p0 , which is 0. So adding up gives us 0. NB: I’ve added a Proposition at the end of the primitive roots section (Proposition 5.6). Sums of squares modulo p. Problem. Let n be a positive integer. How many solutions (x1 , . . . , xn ) ∈ Znp does the equation x21 + x22 + · · · + x2n ≡ 1 (mod p) have? We write Nn (p) for the number of solutions. Note that we have N1 (p) = 2. Theorem 6.12. Let a be an integer. The equation x21 + x22 ≡ a (mod p) has p − −1 p solutions (x1 , x2 ) ∈ Zp × Zp if p - a and p + (p − 1) −1 p solutions if p|a. Week 6 28 JAMES NEWTON Proof. For each pair (y, z) ∈ Zp × Zp with y + z = [a] and solutions to the equations x21 = y and x22 = z in Zp , we get a solution (x1 , x2 ) to our original equation, and all solutions are y 2 given by this process. The number of solutions to x1 = y is 1 + p , and similarly for x22 = z, so we get total number of solutions ! ! ! ! X X X X y z y z yz (1 + )(1 + )= 1+ + + p p p p p y+z=[a] y+z=[a] y+z=[a] y+z=[a] y+z=[a] ! X where we have multiplied out the terms in side the first sum and used multiplicativity of the Legendre symbol. The first sum on the right hand side gives p, and the second and third have yz = y(a−y) = y 2 (ay −1 −1) give zero, by Lemma 6.11. For the final sum, if y ∈ Z× p we −1 2 and so yz = ay p −1 , whilst if y ≡ 0 (mod p) we have yz = 0. Since yp = 1 we have p p X ay −1 − 1 yz = p p y∈Z× ! X y+z=[a] ! p −1 If p|a then this is equal to (p − 1) −1 . If p - a then as y runs over Z× also runs over p , ay p × −1 Zp , and so ay − 1 runs over {[0], [1], . . . , [p − 2]}. We therefore have X y0 −1 −1 ay −1 − 1 =( )− =− p p p p y 0 ∈Zp ! X y∈Z× p ! ! ! (again by Lemma 6.11). Putting everything together gives the result in the Theorem. The examinable material for the Class Test on 06/03/2017 is everything up to here. Corollary 6.13. Let p be an odd prime. Then ! +1 if p ≡ 1 or 7 2 2 = (−1)(p −1)/8 = −1 if p ≡ 3 or 5 p (mod 8) (mod 8) Proof. We do this by counting the number of solutions N2 (p) to x21 + x22 ≡ 1 (mod p) in two different ways. On the one hand we have N2 (p) = p − −1 . p On the other hand, suppose we have a solution (x1 , x2 ). Then we get more solutions by considering (x2 , x1 ), (−x1 , x2 ), etc. Swapping signs and switching x1 and x2 usually gives 8 distinct solutions. The only cases when we do not get distinct solutions are if x1 or x2 are equal to 0 (in each case we have two solutions), or if x1 =±x 2 . If (x1 , x2 ) = (c, ±c) 2 2 then we have 2c ≡ 1 (mod p). There are 2 possibilities for c if p = 1 (and so there are 4 possibilities for (c, ±c)) and no possibilities if 5). To conclude we have shown that we have 2 p = −1 (see Exercise 5 on Problem Sheet N2 (p) = 8k + 2 + 2 + 4 = 8k + 8 ELEMENTARY NUMBER THEORY (5CCM224A AND 6CCM224B) for some integer k if 2 p = 1 and N2 (p) = 8k + 4 if 2 p 29 = −1. We deduce that ! −1 ≡0 p− p if 2 p (mod 8) = 1 and ! −1 p− ≡4 p (mod 8) if p2 = −1. Running through the different possibilities for p mod 8 and using Theorem 6.4 gives the final result. Theorem 6.14. Let n = 2k + 1 be an odd positive integer. Then !k Nn (p) = p2k + pk −1 p Proof. We prove the Theorem by finding a formula for Nn (p) in terms of Nn−2 (p). A proof by induction (and the base case N1 (p) = 2) then proves the formula in the statement of the Theorem holds — we leave this part of the proof as an exercise. Let n ≥ 3. For any integer a, we denote the number of solutions to x21 + x22 + · · · + x2n ≡ a (mod p) by Nn (p, a). By considering the solutions of x21 + x22 ≡ a (mod p) and x23 + · · · + x2n ≡ 1 − a (mod p) for each congruence class [a] ∈ Zp , we see that Nn (p) = X N2 (p, a)Nn−2 (p, 1 − a) [a]∈Zp We worked out N2 (p, a) in the previous Theorem. So we have Nn (p) = ! ! −1 −1 (p − )Nn−2 (p, 1 − a) + (p + (p − 1) )Nn−2 (p, 1) p p × [a]∈Z X p ! ! −1 −1 = (p − )Nn−2 (p, 1 − a) + p Nn−2 (p, 1) p p [a]∈Zp X ! ! −1 X −1 = (p − ) Nn−2 (p, 1 − a) + p Nn−2 (p, 1) p p [a]∈Zp As a runs over all the congruence classes in Zp , 1 − a also runs over Zp so we have ! ! −1 X −1 Nn (p) = (p − ) Nn−2 (p, a) + p Nn−2 (p, 1) p p [a]∈Zp 30 JAMES NEWTON , since each The sum [a]∈Zp Nn−2 (p, a) counts all the elements (x1 , . . . , xn−2 ) ∈ Zn−2 P p 2 2 (n − 2)-tuple is counted once, in the term Nn−2 (p, x1 + · · · + xn−2 ). So [a]∈Zp Nn−2 (p, a) = pn−2 . We deduce that P ! ! −1 −1 Nn (p) = (p − )pn−2 + p Nn−2 (p) p p and this gives us enough information to prove the Theorem. Exercise. Use the same method to find a formula for Nn (p) when n is even. Quadratic Reciprocity. Theorem 6.15 (Quadratic reciprocity law). Let p and q be two distinct odd primes. Then ! p = (−1) q p−1 q−1 2 2 ! q = p q p − q p if p ≡ 1 (mod 4) or q ≡ 1 if p ≡ q ≡ 3 (mod 4) (mod 4) This Theorem was first proved by Gauss in 1796 (having been conjectured by Euler and Legendre). Gauss gave eight different proofs, and there are over 200 published proofs in total. We will explain a proof due to V. Lebesgue (published in 1838) using our previous results on computing Nn (p). Proof. We prove the Theorem by computing Nq (p) (mod q) in two different ways. Firstly, by the previous theorem, we have !(q−1)/2 Nq (p) = pq−1 + p(q−1)/2 −1 p ! p−1 q−1 p ≡1+ (−1) 2 2 q (mod q) where the congruence follows from Fermat’s little theorem, and the Euler criterion. On the other hand, if (x1 , x2 , . . . , xq ) is a solution of x21 + x22 + · · · + x2q ≡ 1 (mod p), then we get q solutions by considering the cyclic shifts (x1 , x2 , . . . , xq ), (xq , x1 , . . . , xq−1 ), . . . (x2 , x3 , . . . , xq , x1 ). These q solutions are all distinct, unless we have x1 = x2 = · · · = xq (it’s an exercise to check this, it uses the fact that q is prime). The number of solutions with x 1 = x2 = · · · = xq q 2 is given by the number of x with qx ≡ 1 (mod p). This is equal to 1+ p (by Exercise 5 on Problem Sheet 5). We deduce that Nq (p) = qk+1+ pq for some integer k, so Nq (p) ≡ 1+ pq (mod q). Comparing our two formulas for Nq (p) mod q gives the reciprocity law. As well as being a beautiful theorem, the quadratic reciprocity law is useful for calculating Legendre symbols: Example. Compute 43 83 : ELEMENTARY NUMBER THEORY (5CCM224A AND 6CCM224B) 31 First we check that 43 and 83 are prime. They are also both congruent to 3 mod 4 so we have ! ! 43 83 =− 83 43 ! 40 since 83 ≡ 40 =− 43 !3 ! 23 · 5 2 =− =− 43 43 (mod 43) ! ! 5 2 =− 43 43 5 43 ! ! 5 = since 43 ≡ 3 43 ! (mod 8) ! 43 3 = = apply quadratic reciprocity and reduce mod 5 5 5 ! ! 5 2 = quadratic reciprocity and reduce mod 3 = 3 3 Finally, we can check that 2 3 = −1. So we get ! 43 = −1 83 Example. How many solutions does the equation 3x2 + 6x + 2 ≡ 0 (mod 23) have in Z23 ? We have 3x2 + 6x + 2 = 3(x + 1)2 − 1, so we have to solve 3(x + 1)2 ≡ 1 (mod 23) We have [8] = [3]−1 , so we have to solve (x + 1)2 ≡ 8. Now we compute ! !3 2 8 = 23 23 ! 2 = = +1 23 since 23 ≡ 7 (mod 8). So there are two solutions to this equation in Z23 . Example. We are going to describe all of the odd primes p with ! 3 = +1. p 32 JAMES NEWTON We may assume that p 6= 3. The quadratic reciprocity law says that (mod 4) and (mod 4) or if p 3 We have 3 p p 3 =− p 3 3 p if p ≡ 3 (mod 4). So we get = +1 if p 3 3 p = p 3 if p ≡ 1 = +1 and p ≡ 1 = +1 and p ≡ 3 (mod 4). = +1 if p ≡ 1 (mod 3) and p 3 = −1 if p ≡ 2 (mod 3). So we conclude that 3 p = +1 if p ≡ 1 (mod 4) and p ≡ 1 (mod 3) or if p ≡ −1 (mod 4) and p ≡ −1 (mod 3). Using the Chinese remainder theorem, we see that these conditions are equivalent to p ≡ ±1 (mod 12), and this is our final description of the odd primes with 3 p = +1. 7. Sums of squares Proposition 7.1. A positive integer n is a square if and only if every exponent ai in the prime factorisation n = pa11 pa22 · · · par r is even. Proof. If n = m2 is a square then by considering the prime factorisation of m we see that the exponents in the prime factorisation of n are even. Conversely, if every exponent ai is even, then a /2 a /2 n = (p11 p22 · · · par r /2 )2 Question. Which positive integers are sums of two squares? If you look at the first few integers, you can check that 1, 2, 4, 5, 8, 9, 10, 13 are sums of two squares and 3, 6, 7, 11, 12 are not. Lemma 7.2. Suppose n = a2 + b2 with a, b ∈ Z. Then n ≡ 0, 1 or 2 (mod 4). Proof. If x ∈ Z then x2 is either 0 or 1 (mod 4). Week 7 Corollary 7.3. If n ≡ 3 (mod 4) then n is not a sum of two squares. If we have n = a2 + b2 then n = (a + ib)(a − ib) = |a + ib|2 . Lemma 7.4. Let m, n be positive integers. If m and n are sums of two squares, so is mn. Proof. We can write m = |a + ib|2 and n = |c + id|2 , with a, b, c, d ∈ Z. So mn = |(a + ib)(c + id)|2 = |(ac − bd) + i(ad + bc)|2 is also a sum of two squares. Theorem 7.5. Let p be a prime. p is a sum of two squares if and only if p = 2 or p ≡ 1 (mod 4). Proof. We know that if p ≡ 3 (mod 4) then p is not a sum of two squares, and 2 = 12 + 12 . So it remains to show that if p ≡ 1 (mod 4) then p is a sum of two squares. −1 Since p ≡ 1 (mod 4) we have p = +1. So we can let a ∈ Z be a solution to the √ equation a2 + 1 ≡ 0 (mod p). Let k = b pc, so k 2 < p < (k + 1)2 . For each of the (k + 1)2 > p different pairs of integers (c, d) with 0 ≤ c ≤ k and 0 ≤ d ≤ k consider ELEMENTARY NUMBER THEORY (5CCM224A AND 6CCM224B) 33 the integer c + ad. Since there are only p distinct residue classes mod p, we must have c1 + ad1 ≡ c2 + ad2 (mod p) for two different pairs (c1 , d1 ), (c2 , d2 ) (this is an example of the pigeonhole principle). So we have c1 − c2 ≡ a(d2 − d1 ) (mod p) and squaring both sides gives (c1 − c2 )2 ≡ −(d2 − d1 )2 (mod p) We deduce that p divides (c1 − c2 )2 + (d2 − d1 )2 . Moreover, we have 0 < (c1 − c2 )2 + (d2 − d1 )2 ≤ k 2 + k 2 < 2p but the only multiple of p between 0 and 2p is p itself, so we are forced to have p = (c1 − c2 )2 + (d2 − d1 )2 and we have shown that p is a sum of two squares. Theorem 7.6 (Two squares theorem). A positive integer n is a sum of two squares if and only if the exponent of every prime number which is congruent to 3 (mod 4) in the prime factorisation of n is even. Proof. First suppose that n = p1 p2 · · · pk m2 with each prime pi equal to either 2 or ≡ 1 (mod 4). Then n is a sum of two squares, since it is a product of integers which are sums of two squares. Conversely, suppose n = a2 + b2 be a sum of two squares. Let d = gcd(a, b). Then 2 2 we have n = d2 ( ad + db ) and gcd( ad , db ) = 1. Suppose p|n and p ≡ 3 (mod 4). If 2 2 p|( ad + db ) then we can solve the equation x2 + y 2 ≡ 0 (mod p), with x, y coprime. In particular p does not divide x or y (otherwise the equation would force p to divide both, 2 in which case p| gcd(x.y)). So we have ([x]p [y]−1 p ) = [−1]p and −1 is a quadratic residue modulo p. This contradicts the assumption that p ≡ 3 (mod 4). So if p|n and p ≡ 3 2 2 (mod 4) we must have p|d and p - ( ad + db ). This implies that the exponent of p in the prime factorisation of n is even. Fact 7.7. A positive integer is a sum of three squares if and only if it is not of the form 4a (8k + 7) with a, k non-negative integers. Fact 7.8. Every positive integer is a sum of four squares. 8. Irrational and transcendental numbers Recall that Q = { ab : a ∈ Z, b ∈ N} is the set of rational numbers, and we use R to denote the real numbers and C to denote the complex numbers. Definition 8.1. A complex number x ∈ C is called irrational if x ∈ / Q. Theorem 8.2. A real number x ∈ R is rational if and only if its decimal expansion either terminates or repeats. 34 JAMES NEWTON Proof. We analyse what it means for a real number to have a repeating decimal expansion. We suppose that 0 < x < 1 and that we can write x = 0.ab = 0.a1 a2 a3 · · · an b1 b2 · · · bk . Then we have y := 10n x − a = 0.b and 10k y = b + y, so y = 10kb−1 . So we have x= a+y (10k − 1)a + b = 10n 10n (10k − 1) Conversely if x = 10n (10c k −1) with 0 < c < 10n (10k −1) then we can write c = (10k −1)a+b with 0 ≤ b < (10k −1) and 0 ≤ a < 10n . So to show that a rational number has a repeating (or terminating) decimal expansion, it suffices to show that it can be written as a fraction with denominator 10n (10k − 1). Suppose x = rs with r ∈ Z and s ∈ N. Multiplying top 0 and bottom of the fraction by a multiple of 2 or 5 if necessary, we can write x = 10rn s0 where s0 is coprime to 10. Since 10 is coprime to s0 there is an integer k such that 10k ≡ 1 (mod s0 ). For example, we can take k = φ(s0 ). In other words, we have s0 |(10k − 1), so multiplying top and bottom of the fraction by (10k − 1)/s0 we can write x as a fraction with denominator 10n (10k − 1) as desired. Example. The number α= ∞ X 1 n! n=1 10 is irrational. Proposition 8.3. Let z ∈ C be a root of a polynomial xm + cm−1 xm−1 + · · · + c1 x + c0 with integer coefficients ci ∈ Z. Then either z is an integer or z is irrational. Proof. Suppose z is rational. We must show that z is actually an integer. Write z = with a ∈ Z, b ∈ N and gcd(a, b) = 1. We have m a + cm−1 m−1 a b and multiplying by b we get b + · · · + c1 a b a + c0 = 0 b m am + cm−1 am−1 b + · · · + c1 abm−1 + c0 bm = 0 and in particular we have b|am . Since gcd(a, b) = 1 we also have gcd(am , b) = 1. Combining this with the fact that b|am means we must have b = 1 so z ∈ Z. Remark. Note that in the above Proposition it is crucial that the leading coefficient of the polynomial is 1. For example, if you consider the polynomial 2x + 1 you see that polynomials which are not monic (whose leading coefficient is not 1) can have rational, but non-integral, roots. √ 3 Example. √ 2 is irrational: since it is a root of x3 − 2 the preceding proposition implies that either 3 2 is irrational, or there is an integer n with n3 = 2. But there is no such integer (if n > 1 then n3 > 2). Recall the number e = 1 + 1 1! + 1 2! + 1 3! + ··· ELEMENTARY NUMBER THEORY (5CCM224A AND 6CCM224B) 35 Theorem 8.4 (Euler, 1737). e is irrational Proof. This proof is due to Fourier. Euler’s original proof uses continued fractions which is a very nice topic in elementary number theory which we don’t have time to cover in this course. They are discussed in some of the books on the reading list. Suppose e = ab is rational. Then for n ≥ b the number 1 1 1 a 1 1 1 1 1 − 1 + + + + ··· + N = n! e − 1 + + + + · · · + = n! 1! 2! 3! n! b 1! 2! 3! n! is a positive integer, since multiplying by n! clears all the denominators. On the other hand, we have 1 1 N = n! + + ··· (n + 1)! (n + 2)! ! = 1 1 + + ··· n + 1 (n + 1)(n + 2) and the right hand side is strictly less than 1 1 1 + + ··· = 2 n + 1 (n + 1) n So, taking n > 1 we get a contradiction to the claim that N is a positive integer. So we deduce that e must be irrational after all. A harder fact to prove (proven by Lambert in 1761) is that π is also irrational. 8.1. Algebraic and transcendental numbers. Definition 8.5. A complex number z ∈ C is called algebraic if z is a root of a non-zero polynomial with rational coefficients. A complex number z is called transcendental if it is not algebraic. Example. If z ∈ Q then z is a root of the polynomial x − z, and is therefore algebraic. So every transcendental number is irrational √ √ Example. 3 2 and 3 2 + 1 are algebraic numbers. Exercise. Find a non-zero polynomial with rational coefficients and √ 3 2 + 1 as a root. Fact 8.6. e is transcendental (Hermite, 1873) π is transcendental (Lindemann, 1882) Conjecture (A special case of Schanuel’s conjecture). e + π is transcendental At the moment, it is not even known whether or not e + π is irrational! P 1 The proofs of the above facts are quite complicated. An easier result is that ∞ n=1 10n! is transcendental. We will prove this by considering the problem of approximating irrational numbers by rational numbers. Since the rational numbers are dense in the real numbers, any irrational number α ∈ R can be approximated arbitrarily closely by a rational number — for example, consider the rational numbers obtained by cutting off the decimal expansion of α further and further along. However, the denominators of these rational numbers get large as we approximate α closer and closer. So, more precisely, we want to say something about the difference |α − ab | Week 8 36 JAMES NEWTON in terms of the denominator b. Let’s think about how good our naive way of producing rational approximations using the decimal expansion does: if α = 0.a1 a2 a3 · · · an an+1 an+2 · · · and we consider the rational approximation αn = 0.a1 a2 · · · an then we have 1 10n and the denominator of αn is 10n , so our approximation αn is, in general, within distance 1 of α, where b is the denominator of αn . b |α − αn | = 0.00 · · · 0an+1 an+2 · · · < Theorem 8.7 (Dirichlet’s approximation theorem). Let α ∈ R be a real number. Let n > 0 be a positive integer. Then there exist integers a ∈ Z, b ∈ N, with b ≤ n, such that 1 a |α − | < b bn Proof. We begin by rewriting the inequality in the statement of the theorem as 1 . n So we want to find a natural number b ≤ n such that bα is close to an integer a. Consider the multiples 0α, 1α, . . . , nα of α. For each integer i with 0 ≤ i ≤ n we write iα = Ni + Fi where Ni is an integer and 0 ≤ Fi < 1 (so Fi is the fractional part of iα). We have n + 1 numbers F0 , F1 , . . . , Fn all in the interval [0, 1). We divide [0, 1) into the n smaller intervals [0, 1/n), [1/n, 2/n), . . . , [(n − 1)/n, 1). By the pigeonhole principle one of these intervals must contain at least 2 of the numbers Fi . So we have two integers i < j (with j ≤ n) such that |Fi − Fj | < n1 . We can rewrite this inequality as |a − bα| < |(iα − Ni ) − (jα − Nj )| < 1 n and finally we get |(Nj − Ni ) − (j − i)α| < 1 . n So we can take a = Nj − Ni and b = j − i. Corollary 8.8. Suppose α ∈ R is irrational. Then there exist infinitely many (distinct) rational numbers ab such that a 1 |α − | < 2 b b Proof. The preceding theorem tells us that for each n we can find an , bn such that |α − an 1 |< bn bn n |α − an 1 |< 2. bn bn and bn ≤ n. So we also have ELEMENTARY NUMBER THEORY (5CCM224A AND 6CCM224B) We claim that this must give us infinitely many distinct rational numbers must have a single rational number ab with 37 an : bn if not, we a 1 |α − | < b bn for infinitely many n. But taking the limit as n → ∞ shows that this implies that α = ab and we assumed that α was irrational. So we have shown the claim. Remark. Note that the rational approximations a/b which are guaranteed to exist by the above corollary are much better than the approximations we got in general using the decimal expansion — the error is < 1/b2 whereas for the decimal approximations we got error < 1/b. Continued fractions give a systematic way to compute numbers a/b with |α − ab | < b12 . Dirichlet’s approximation theorem tells us that we can find rational numbers which approximate α quite closely (relative to the denominator of the rational number). Now we can ask whether it is possible to do even better than the result in Dirichlet’s theorem: for example, for an irrational number α ∈ R is it possible to find infinitely many rational numbers ab with a 1 |α − | < 3 ? b b Theorem 8.9 (Liouville). Let α ∈ R be an irrational number which is a root of a polynomial f (x) = cm xm + cm−1 xm−1 + · · · + c1 x + c0 with ci ∈ Q and cm 6= 0. Then there is a real number C > 0 such that that a C α − > b bm for all a ∈ Z, b ∈ N. In particular, if m = 2 this theorem implies that we cannot find infinitely many rational numbers ab with |α − ab | < b13 , as for b large enough we have b13 < bC2 . Proof. By dividing out factors x − β with β rational, we can assume that all the roots of f (x) are irrational. Multiplying by an integer to clear denominators, we can also assume that the coefficients ci are all integers. Let α = α1 , α2 , . . . , αm be the complex roots of f (x), so f (x) = cm (x − α1 ) · · · (x − αm ). Suppose a ∈ Z, b ∈ N with |α − ab | < 1. We have a f = |cm ||a/b − α1 | · · · |a/b − αm | b < |cm ||α − a/b|(1 + |α| + |α2 |) · · · (1 + |α| + |αm |) = c|α − a/b| where we use the inequality |a/b| < 1 + |α|, and c > 1 is a real number. On the other hand, a cm am + cm−1 am−1 b + · · · + c1 abm−1 + c0 bm = b bm f 38 JAMES NEWTON and the numerator of this fraction is a non-zero integer (non-zero because f (x) has no rational roots). So we have |f ab | ≥ b1m . We deduce that a 1 f < c|α − a/b| ≤ bm b Now we take C = c−1 and claim that this satisfies the statement of the theorem. Since c > 1 we have C < 1, so if a C α − ≤ b bm we certainly have |α − ab | < 1. Now we know that |α − a/b| > c−1 b1m = bCm which is a contradiction. So we have proven that a C α − > b bm for all a, b. Example. The proof √ of the above theorem is a bit tricky. Let’s2 go through√the proof √ in the special case α =√ 2. In this case, our polynomial is f (x) = x − 2 = (x − 2)(x + 2). √ We have α = α1 = 2 and α2 = − 2. √ We assume that ab is a rational number with | 2 − ab | < 1. Then we have |sqrt2 + ab | ≤ √ √ √ √ | ab − 2| + 2 2 by the triangle inequality, so | 2 + ab | < 1 + 2 2. 2 2 On the other hand we have ( ab ) = a −2b so |f ( ab )| ≥ b12 . b2 Combining everything, we get √ a √ a √ √ a a 1 | − 2|(1 + 2 2) > | − 2|| 2 + | = |f ( )| ≥ 2 b b b b b and so 1 a √ √ | − 2| > . b (1 + 2 2)b2 √ √ √ This was all assuming | 2− ab | < 1 but if | 2− ab | ≥ 1 we clearly have | 2− ab | > (1+21√2)b2 as well (since b ≥ 1). So we deduce that we have √ 1 a √ | 2− |> b (1 + 2 2)b2 in all cases. So in this case we can take C = Corollary 8.10. The number α = 1√ 1+2 2 P∞ in the statement of Liouville’s theorem. 1 n=1 10n! is transcendental. Proof. Suppose for a contradiction α is a root of a polynomial of degree m with rational coefficients. So there is a real number C > 0 such that a C α − > b bm for all a ∈ Z, b ∈ N. ELEMENTARY NUMBER THEORY (5CCM224A AND 6CCM224B) To approximate α by rational numbers, we just consider the finite sums αk = which have denominator 10k! . We have |α − αk | = 39 Pk 1 n=1 10n! , ∞ X 1 2 < (k+1)! n! 10 n=k+1 10 (to check the inequality, just compare the two decimal expansions). By taking k large 2 = (10k!2)k+1 less than (10Ck! )m , which contradicts Liouville’s enough, we can make 10(k+1)! theorem. So α is transcendental. 9. Diophantine equations We met the notion of Diophantine equations, and the special case of linear Diophantine equations, earlier in the course 1.1. The general form of a Diophantine equation is a multivariable polynomial equation: f (x1 , x2 , . . . , xr ) = c1 xa111 xa212 · · · xar 1r + · · · + cm xa1m1 xa2m2 · · · xar mn = 0 where the coefficients ci are all integers, and we seek integer solutions (x1 , x2 , . . . , xr ) to this equation. Example. x4 + 2x2 − 4xy + 2y 2 − 1 = 0 Given a Diophantine equation we can ask how many (if any) solutions the equation has, and whether there is an algorithm or formula to find the solutions. There is no general theory to answer these questions (in fact there is a theorem in mathematical logic which says that there cannot be such a general theory — try reading about ‘Hilbert’s tenth problem’). We will discuss a collection of examples which illustrate some of the methods you can use. We should emphasise that these examples are all carefully chosen — if you pick a completely random Diophantine equation you are very unlikely to be able to find its solutions. Example. Here are some simple examples of equations which we can solve completely: (1) x3 − 2x = 0 — the only integer solution is x = 0. (2) x4 + 2x2 − 4xy + 2y 2 − 1 = 0 — we can rewrite this equation as x4 + 2(x − y)2 = 1. If x, y are integers then x4 +2(x−y)2 = 1 is only possible if we have x4 = 1, (x−y)2 = 0, so the solutions are given by x = ±1 and y = x. (3) x3 = 5y 6 — we get one obvious solution x = y = 0. If y 6= 0 then x 6= 0. Let a be the power of 5 in the prime factorisation of x and let b be the power of 5 in the prime factorisation of y. Then the equation implies that 3a = 6b + 1 so we get 1 = 3(a − 2b) which is impossible. So there is no solution with y 6= 0. 40 JAMES NEWTON 9.1. Using divisibility and congruences to analyse Diophantine equations. Suppose we have a Diophantine equation f (x1 , x2 , . . . , xr ) = 0. We observe that if this equation has an integer solution then the congruence equation f (x1 , x2 , . . . , xr ) ≡ 0 (mod m) has a solution for every positive integer m. So one way to show that a Diophantine equation has no solutions is to find a positive integer m such that the equation f (x1 , x2 , . . . , xr ) ≡ 0 (mod m) has no solutions. Example. (1) Consider the equation x2 = y 5 + 7. We are going to show it has no integer solutions. We can consider the equation mod 11. Euler’s criterion implies that y 5 ≡ 0, 1 or −1 (mod 11) for every integer y. So the right hand side of the equation can take the values 6, 7 or 8 (mod 11). On the other hand x2 can take the values 0, 1, 3, 4, 5, 9 (mod 11). Since there are no common values in these lists, there are no solutions to the equation x2 ≡ y 5 + 7 (mod 11) and hence no integer solutions to the equation x2 = y 5 + 7. (2) The equation x12 + y 12 = z 12 + w12 + 3 has no integer solutions: we consider the equation mod 13. Fermat’s little theorem says that x12 + y 12 ≡ 0, 1 or 2 (mod 13), and similarly z 12 + w12 + 3 ≡ 3, 4 or 5 (mod 13). As in the previous example, there are no common values so the equation x12 + y 12 = z 12 + w12 + 3 has no integer solutions. (3) Consider the equation 15x2 − 7y 2 = 9. Suppose we have an integer solution (x, y). We get 3|(7y 2 ), so 3|y. Let y = 3y 0 . Then we have 15x2 −63(y 0 )2 = 9 or equivalently 5x2 − 21(y 0 )2 = 3. This now implies that 3|x so we let x = 3x0 . Then we get the equation 15(x0 )2 − 7(y 0 )2 = 1. Finally, we observe that this equation has no solutions mod 3: reducing mod 3 we get an equation 2(y 0 )2 ≡ 1 (mod 3), which has no solutions. So the original equation 15x2 − 7y 2 = 9 has no integer solutions. The first two of these examples could be described as finding a prime p such that the congruence classes of terms in the equation only have a few possible values, in order to rule out solutions. The last one works by identifying prime divisors of possible solutions (x, y) and dividing out by these to simplify the equation. We already considered the example of y 2 = x5 + 7. Now we will consider a very similar equation which is rather trickier to analyse: Example. Consider the equation y 2 = x3 + 7. First we note that if (x, y) is a solution then x must be odd: if x is even then y 2 ≡ 3 (mod 4) which is impossible. So we deduce that x is odd and y is even. Let’s rewrite the equation as y 2 + 1 = x3 + 8 = (x + 2)((x − 1)2 + 3) ELEMENTARY NUMBER THEORY (5CCM224A AND 6CCM224B) 41 Since (x − 1) is even, the second factor on the right hand side is a positive integer which is ≡ 3 (mod 4). This implies that the right hand side of the above equation is divisible by a prime p with p ≡ 3 (mod 4). So y 2 ≡ −1 (mod p). But this is impossible since −1 is not a square modulo a prime which is 3 (mod 4). We deduce that the original equation y 2 = x3 + 7 has no integer solutions. Example. Find all integer solutions of x3 + 2y 3 + 4z 3 = 9w3 . There is one obvious solution: (0, 0, 0, 0). Now suppose that not all of x, y, z, w are 0. Then we let d be the greatest common divisor of x, y, z, w. Dividing through by d we get another solutions (x/d, y/d, z/d, w/d), so we can assume that x, y, z, w have greatest common divisor 1. For any integer a, a3 ≡ 0 or ±1 (mod 9). Since x3 + 2y 3 + 4z 3 ≡ 0 (mod 9), you can check that we must have x3 ≡ 2y 3 ≡ 4z 3 ≡ 0 (mod 9). So we have x ≡ y ≡ z ≡ 0 (mod 3) and therefore x3 ≡ y 3 ≡ z 3 ≡ 0 (mod 27). We deduce that 3|w. But this contradicts the assumption that x, y, z, w have gcd 1 since we have shown that 3 divides all of them. In this example, we used the fact that the equation was homogeneous (each term has the same degree) to divide through by common divisors and reduce to the case where a solution has no common factors. Finally, we give another tricky example (we didn’t do this one in lectures): Example. Consider the equation x4 + x3 + x2 + x + 1 = y 2 . We consider f (x) = 4(x4 + x3 + x2 + x + 1). Then f (x) = (2x2 + x)2 + 3x2 + 4x + 4 = (2x2 + x)2 + 3(x + 2/3)2 + 8/3 > (2x2 + x)2 On the other hand, f (x) = (2x2 + x + 1)2 − (x + 1)(x − 3). Since we also have f (x) = (2y)2 , f (x) cannot lie between the squares of consecutive integers, 2x2 + x and 2x2 + x + 1. So we must have (x + 1)(x − 3) < 0. This means that we only get solutions with x ∈ [−1, 3]. By considering each possible value for x in turn and solving for y we get solutions (−1, ±1), (0, ±1), (3, ±11). 9.2. Pythagorean triples. Let’s consider the equation x2 + y 2 = z 2 and try to describe all integer solutions (x, y, z). Definition 9.1. A Pythagorean triple is a set of three positive integers (x, y, z) such that x2 + y 2 = z 2 . We say that the triple is primitive if the only positive common divisor of x, y and z is 1 (i.e. gcd(x, y, z) = 1). We call a right-angled triangle with integer length sides a Pythagorean triangle. Its side lengths form a Pythagorean triple. Example. (3, 4, 5) and (5, 12, 13) are primitive Pythagorean triples. Lemma 9.2. Supppose (x, y, z) is a primitive Pythagorean triple. Then any two of the three integers (x, y, z) are coprime. Week 9 42 JAMES NEWTON Proof. Suppose p is a prime number with p|x, p|z. Then y 2 = z 2 − x2 so p|y 2 which implies that p|y. This contradicts the primitivity of (x, y, z). So x, z are coprime. A similar argument applies to the other pairs x, y, y, z. To find all Pythagorean triples, it suffices to find all primitive Pythagorean triples — any Pythagorean triple (x, y, z) is equal to (dx0 , dy 0 , dz 0 ) where d ∈ N and (x0 , y 0 , z 0 ) is a primitive Pythagorean triple. Lemma 9.3. If (x, y, z) is a primitive Pythagorean triple then one of x, y is even and the other is odd. Proof. If x, y are both even, then z 2 = x2 + y 2 is also even, so z is even. This is impossible, because (x, y, z) is assumed to be primitive. If x, y are both odd, then z 2 ≡ 2 (mod 4) which is impossible (the square of any integer is either 0 or 1 mod 4). From now on, we can assume (by swapping x and y if necessary) that a primitive Pythagorean triple (x, y, z) has x even and y odd. Here is a trick which we will use a couple of times: Lemma 9.4. Suppose a, b are coprime positive integers and ab = c2 is the square of an integer c. Then a and b are themselves squares. Proof. Consider the prime factorisations a = pa11 · · · par r , b = q1b1 · · · qsbs , with the ai and bi positive integers. Since a and b are coprime, the primes pi , qi are all distinct, so the prime factorisation of ab is ab = pa11 · · · par r q1b1 · · · qsbs and ab is a square if and only if all the numbers ai , bi are even, which happens if and only if both a and b are squares. Theorem 9.5. All primitive Pythagorean triples, with x even, are given by the formulas: x = 2st, y = s2 − t2 , z = s2 + t2 for integers s > t > 0 such that gcd(s, t) = 1 and s 6≡ t (mod 2). To get all Pythagorean triples (up to swapping x and y) we take integers s, t as above and d another positive integer and consider x = 2dst, y = d(s2 − t2 ), z = d(s2 + t2 ). Proof. Let (x, y, z) be a primitive Pythagorean triple. Since y and z are both odd we have integers u, v such that z − y = 2u and z + y = 2v. The equation x2 + y 2 = z 2 can be rewritten as x2 = z 2 − y 2 = (z − y)(z + y) = 4uv so 2 x 2 = uv. Now we claim that gcd(u, v) = 1. Indeed, if d|u and d|v then d divides u + v = y and u − v = z. Since y and z are coprime, we must have d = 1. Since uv is a square and u, v are coprime, we see that both u and v are squares, so we write u = t2 and v = s2 with s, t positive integers. ELEMENTARY NUMBER THEORY (5CCM224A AND 6CCM224B) 43 Now we have z = s2 + t2 , y = s2 − t2 and x2 = 4s2 t2 , so x = 2st. Since gcd(y, z) = 1, we have gcd(s, t) = 1, and since y is odd we have s 6≡ t (mod 2). To check that the formulas give a primitive Pythagorean triple is left as an exercise. To get all Pythagorean triples, we just use the observation above that if (x, y, z) is a Pythagorean triple then (x, y, z) = (dx0 , dy 0 , dz 0 ) where (x0 , y 0 , z 0 ) is a primitive Pythagorean triple. 9.3. Fermat’s Last Theorem. Theorem 9.6 (Wiles, Taylor, 1994). If n ≥ 3 there are no positive integer solutions (x, y, z) to the equation xn + y n = z n The simplest case of Fermat’s Last Theorem is when n = 4 — for this case (and perhaps only this case) Fermat is likely to have found a proof himself. In fact we can proof something a bit stronger: Theorem 9.7. There are no positive integer solutions (x, y, z) to the equation x4 + y 4 = z 2 Proof. For a contradiction, we suppose (x0 , y0 , z0 ) is a solution. We can assume that gcd(x0 , y0 ) = 1, as if d divides x0 and y0 then d2 divides z0 and (x0 /d, y0 /d, z0 /d2 ) is another solution. Now (x20 , y02 , z0 ) is a primitive Pythagorean triple. We can assume x0 is even (otherwise we swap x0 and y0 ), and then we have x20 = 2st, y02 = s2 − t2 , z0 = s2 + t2 for integers s > t > 0 such that gcd(s, t) = 1 and s 6≡ t (mod 2). So we get a primitive Pythagorean triple (t, y0 , s), with y0 odd, and therefore t even. So we have t = 2uv, y0 = u2 − v 2 , s = u2 + v 2 with u, v coprime, and we have x20 = 4uv(u2 + v 2 ). Since u, v are coprime, we also have u, u2 + v 2 and v, u2 + v 2 coprime. Considering the prime factorisations of u, v and (u2 + v 2 ) we deduce from the equation x20 = 4uv(u2 +v 2 ) that all three of u, v and u2 +v 2 are squares. Letting u2 + v 2 = z12 , u = x21 and v = y12 , we get z12 = x41 + y14 so we have produced another positive integer solution (x1 , y1 , z1 ) to the equation x4 + y 4 = z 2 . Moreover, we have z1 ≤ s < s2 + t2 = z0 . So z1 is strictly less than z0 . But now we can repeat this procedure and produce an infinite sequence of positive integers z0 > z1 > z2 > · · · . This is impossible, so we get the desired contradiction. This method of proof is called ‘Fermat descent’. Germain made the most progress towards solving Fermat’s last theorem by elementary methods. 44 JAMES NEWTON Definition 9.8. A Sophie Germain prime is a prime number p such that 2p + 1 is also prime. It is conjectured that there are infinitely many Sophie Germain primes, but this remains unproven. The first few examples are p = 2, 3, 5, 11, 23, 29, . . .. Theorem 9.9. Let p be a Sophie Germain prime. Then the equation xp + y p = z p has no integer solution (x, y, z) with p - xyz. Proof. (Not examinable, it’s a bit complicated, but still only uses ideas we’ve seen already.) We assume for a contradiction that such a solution exists. Dividing through by gcd(x, y, z) we can assume that gcd(x, y, z) = 1. We can also assume p is odd, since we already showed that if p = 2 then one of x, y is even so p|xyz. By changing the sign of z we can rewrite the equation as xp + y p + z p = 0, so we can instead consider solutions to this equation. Now we have −xp = y p + z p = (y + z)(z p−1 − z p−2 y + · · · + y p−1 ) We claim that the two factors on the right hand side are coprime. Since p - x, p is not a factor of the right hand side. If r 6= p is a prime and r|(y + z) then y ≡ −z (mod r) so z p−1 − z p−2 y + · · · + y p−1 ≡ z p−1 + z p−1 + · · · + z p−1 ≡ pz p−1 (mod r). Since gcd(x, y, z) = 1 and r|x we can’t have r|z, otherwise the equation y p = −xp − z p implies that r|y and r would be a common divisor of x, y and z. This implies that r does not divide (z p−1 − z p−2 y + · · · + y p−1 ) and establishes the claim. Now considering the prime factorisations of the coprime integers (whose product is a pth power) (y + z) and (z p−1 − z p−2 y + · · · + y p−1 ) shows that we have integers A and T such that y + z = Ap and z p−1 − z p−2 y + · · · + y p−1 = T p . The same argument shows that we have integers B, C with x + y = B p and x + z = C p . Let q = 2p + 1, which we are assuming is prime. Considering the equation mod q we get x(q−1)/2 + y (q−1)/2 + z (q−1)/2 ≡ 0 (mod q). If q - xyz then each term on the left hand side is ±1 (mod q) by Euler’s criterion. Since q > 5, this is impossible. So we have q|xyz. Without loss of generality we can assume q|x, since everything is symmetric. Since gcd(x, y, z) = 1 arguing as before (with r) shows that this implies q - yz. Now we have B p +C p −Ap = 2x, so B (q−1)/2 +C (q−1)/2 −A(q−1)/2 ≡ 0 (mod q). Applying Euler’s criterion again, we deduce that q|ABC. Since q|x but q - yz we have q - BC so we must have q|A. We then get T p ≡ py p−1 (mod q). Since y ≡ B p ≡ B (q−1)/2 (mod q) and p is odd we have y p−1 ≡ 1 (mod q), by Fermat’s little theorem. Euler’s criterion implies that T p ≡ ±1 (mod q). So we get an equation ±1 ≡ p (mod q), which is impossible. Finally we have obtained a contradiction, so there was no solution (x, y, z) with p - xyz to our original equation. ELEMENTARY NUMBER THEORY (5CCM224A AND 6CCM224B) 45 9.4. Pell’s equation. A Pell equation is an equation of the form x2 − Dy 2 = 1 where D is a fixed positive integer which is not a perfect square. Here is one natural way this equation appears: suppose we are interested in numbers which are simultaneously square and triangular. This means that we are interested in solutions to the equation m(m + 1) n2 = 2 Rearranging this equation we get 2(2n)2 = (2m + 1)2 − 1 so solutions to this equation correspond to solutions to the Pell equation x2 − 2y 2 = 1. We can easily spot the solution x = 3, y = 2 (this corresponds to the square and triangular number 1!). To get other solutions, we can observe that the equation x2 −2y 2 = 1 can be written as √ √ (x + y 2)(x − y 2) = 1 If k is a positive integer then raising both sides to the power of k gives √ √ (x + y 2)k (x − y 2)k = 1 √ √ So if we (3 +√ 2 2)k in the form x + y 2 we get a new solution. For example, √ rewrite (3 + 2 2)3 = 99 + 70 2, so x = 99, y = 70 is another solution, which corresponds to the square and triangular number 352 = 49·50 = 1225. 2 We can do the same thing for any non-square positive integer D: if we have one solution 2 2 (x1 , y1√) to the equation √ kx − Dy = 1 we get more solutions by choosing x, y so that x + y D = (x1 + y1 D) . This process actually gives all the solutions to the Pell equation. Week 10 Theorem 9.10. Let D be a positive integer which is not a square. Suppose the equation x2 − Dy 2 = 1 has a positive integer solutions. Let (x1 , y1 ) be the positive integer solution with the smallest value of x1 . Then every positive integer solution (x, y) is given by √ √ x + y D = (x1 + y1 D)k for k = 1, 2, 3, . . .. Proof. The proof of this Theorem is a version of Fermat descent. We suppose that (u, v) is a solution with u > x1 . We are going to show that √ √ √ u + v D = (x1 + y1 D)(s + t D) where s, t are positive integers, s < u and s2 − Dt2 = 1. We can then repeat the procedure, applying it to (s, t). As the value of x is decreasing with each step, we eventually reach the solution with the x value, which is (x1 , y1 ). Then we will have shown that √ √ smallest k u + v D = (x1 + y1 D) for some k. 46 JAMES NEWTON We are looking for s, t such that √ √ √ u + v D = (x1 + y1 D)(s + t D) Multiplying out the equation we get √ √ u + v D = (x1 s + Dy1 t) + (x1 t + y1 s) D so we need to solve the simultaneous equations u = x1 s + Dy1 t and v = x1 t + y1 s for s and t. Multiplying the first equation by y1 and the second by x1 and taking the difference we get y1 u − x1 v = Dy12 t − x21 t = −t so we get t = x1 v − y1 u and similarly s = ux1 − Dy1 v You can check with a bit of algebra that (s, t) is a solution, so it remains to show that s < u and s, t are positive. Once we’ve shown s, t are positive then since u = x1 s + Dy1 t it’s immediate that u > s. √ √ 2 2 2 We have u = 1 + Dv > Dv so u > Dv and similarly x > Dy1 , so s = ux1 − 1 √ √ Dy1 v > ( Dv)( Dy1 ) − Dy1 v = 0. Finally we need to show that t is positive. We have t = x1 v − y1 u so we need to show that x1 v > y1 u. Squaring both sides, it suffices to show that x21 v 2 > y12 u2 . We can rewrite this as (1 + Dy12 )v 2 > y12 (1 + Dv 2 ) which simplifies to v 2 > y12 . This inequality follows from our assumption that u > x1 . Fact 9.11. In fact, the Pell equation x2 − Dy 2 always has positive integer solutions. So the above Theorem always applies, and tells us that every solution can be obtained from the solution with the smallest value of x. Proof of fact. (Not examinable) Here is a sketch proof of this fact: First we use Dirichlet’s approximation theorem to show that there are infinitely many pairs of positive integers (x, y) with √ x 1 | D − | < 2. y y In particular, there are infinitely many (x, y) with √ √ √ √ √ 1 1 |x2 − Dy 2 | = |x − y D||x + y D| < |x + y D| = |(x − y D) + 2y D| y y √ √ 1 1 < ( + 2y D) ≤ 1 + 2 D y y ELEMENTARY NUMBER THEORY (5CCM224A AND 6CCM224B) 47 Applying the pigeonhole principle, we deduce that there is an integer M with |M | < √ 1+2 D and infinitely many solutions to x2 −Dy 2 = M . Applying the pigeonhole principle again, we can find two distinct solutions (x1 , y1 ), (x2 , y2 ) to this equation with x1 ≡ x2 (mod M ) and y1 ≡ y2 (mod M ). Now we consider √ √ √ √ x1 − Dy1 x1 − Dy1 x2 + Dy2 √ √ √ = =u+v D x2 − Dy2 x2 − Dy2 x2 + Dy2 with u, v ∈ Q. We claim that u, v are non-zero integers and u2 − Dv 2 = 1. Switching signs if necessary, this gives positive integer solutions to the Pell equation. First we show that u, v are integers. We have x1 y 2 − x2 y 1 x1 x2 − Dy1 y2 and v = u= M M and it follows from the congruence between x1 , x2 and y1 , y2 , together with the fact that these are solutions to x2 − Dy 2 = M , that u and v are integers. Now since √ we have 2 2 u − Dv√ = 1, we must have u 6= 0. If v = 0 then u = ±1 which implies that x1 − Dy1 = ±(x2 − Dy2 ) which contradicts the fact that (x1 , y1 ) and (x2 , y2 ) are chosen to be distinct pairs of positive integers. Example. Let’s find a solution to x2 − 13y 2 = 1 Applying the proof of Dirichlet’s approximation theorem gives (x, y) with x2 − 13y 2 small. In particular, we have 112 − 13 · 32 = 1192 − 13 · 332 = 4 Unfortunately these solutions are not congruent mod 4, so √ √ 119 − 33 13 11 3 √ √ u + v 13 = − 13 = 2 2 11 − 3 13 is not an integer solution. Searching further (with a computer!) we find that (14159, 3927) is a solution to x2 − 13y 2 = 4, and 14159 ≡ 11 (mod 4), 3927 ≡ 3 (mod 4). We get √ √ √ 14159 − 3927 13 √ u + v 13 = = 649 − 180 13 11 − 3 13 so a solution to the original equation is given by (649, 180). The above method doesn’t seem to be a very efficient way of finding solutions! Continued fractions, which we mentioned in the section on Dirichlet’s approximation theorem, give a much faster way of computing solutions.
© Copyright 2026 Paperzz