Elliptic curve cryptography
Matthew England
MSc Applied Mathematical Sciences
Heriot-Watt University
Summer 2006
Abstract
This project studies the mathematics of elliptic curves, starting with their
derivation and the proof of how points upon them form an additive abelian
group. We then work on the mathematics necessary to use these groups
for cryptographic purposes, specifically results for the group formed by an
elliptic curve over a finite field, E(Fq ). We examine the mathematics behind
the group of torsion points, to which every point in E(Fq ) belongs, and
prove Hasse’s theorem along with a number of other useful results. We finish
by describing how to define a discrete logarithm problem using E(Fq ) and
showing how this can form public key cryptographic systems for use in both
encryption and key exchange.
Acknowledgments
Many thanks to Dr. Mark Lawson, for his help, supervision and enthusiasm
for this project.
Contents
1 Introduction
2 Elliptic curves
    2.1 A class of algebraic curves
    2.2 Group law
        2.2.1 Prime curve examples
3 Torsion points and endomorphisms of elliptic curves
    3.1 Endomorphisms of elliptic curves
    3.2 Torsion points
        3.2.1 Successive doubling
        3.2.2 The basis for E[n]
    3.3 Division polynomials
    3.4 The Weil pairing
4 Elliptic curves over finite fields
    4.1 Examples
    4.2 Hasse’s theorem
        4.2.1 The Frobenius endomorphism
    4.3 Orders of points
        4.3.1 Baby Step, giant step
5 Elliptic curve cryptography
    5.1 The basics of cryptography
    5.2 Public key cryptography
    5.3 The discrete logarithm problem
        5.3.1 Diffie-Hellman key exchange
        5.3.2 The El Gamal cryptosystem
    5.4 Elliptic curve cryptography
        5.4.1 The discrete logarithm problem for elliptic curves
        5.4.2 Diffie-Hellman key exchange for elliptic curves
        5.4.3 El Gamal cryptosystem for elliptic curves
6 Summary and conclusions
Bibliography
APPENDIX
A Elliptic curve material
    A.1 Singular curves
        A.1.1 The relationship between multiple roots and singular points
        A.1.2 Triple root
        A.1.3 Double root
    A.2 Deriving the condition for distinct roots
        A.2.1 Determining the roots
        A.2.2 The discriminant
        A.2.3 Relating back to elliptic curves
    A.3 Elliptic curves in characteristic 2
    A.4 Elliptic curves in characteristic 3
    A.5 The proof of associativity
        A.5.1 Projective geometry and the point at infinity
        A.5.2 Lines in $P^2_K$
        A.5.3 The proof of associativity
    A.6 The proofs omitted from Chapter 3
    A.7 Methods to determine the order of $E(\mathbb{F}_q)$ exactly
        A.7.1 Subfield curves
        A.7.2 Legendre symbols
    A.8 Supersingular curves
B Mathematical background material
    B.1 Algebraic curves
    B.2 Fractions in polynomial rings
    B.3 Number theory
    B.4 Group theory
    B.5 Field theory
        B.5.1 Finite fields
        B.5.2 Constructing $\mathbb{F}_9$
        B.5.3 Constructing $\mathbb{F}_8$
        B.5.4 Addition and multiplication tables of $\mathbb{F}_4$
    B.6 Miscellaneous
C Matlab Code
    C.1 The Matlab code for ECAD.m
    C.2 The Matlab code for PC.m
    C.3 The Matlab code for ECADP.m
    C.4 The Matlab code for inve.m
    C.5 The Matlab code for SUCDOB.m
    C.6 The Matlab code for check.m
    C.7 The Matlab code for RR44.m
Chapter 1
Introduction
Chapter 1
Introduction
An elliptic curve is usually defined to be the graph of an equation
$$y^2 = x^3 + Ax + B$$
where x, y, A and B belong to a specified field. These curves are of great
use in a number of applications, largely because it is possible to take two points
on such a curve and generate a third. In fact, we will show that by defining
an addition operation and introducing an extra point, ∞, the points on an
elliptic curve form an additive abelian group.
Such a group can then be used to create an analogue of the discrete
logarithm problem which is the basis for several public key cryptosystems.
This project will introduce the mathematics behind elliptic curves and then
demonstrate how to use them for cryptography.
The project loosely follows and adds to the work in Chapters 2 to 6 of
[9]. If not otherwise stated the material has been adapted from this source.
Chapter 2 of the project introduces the basic mathematics behind elliptic
curves, such as the proof that the points upon them form an abelian group.
Chapter 3 then considers those points in the group which are torsion while
Chapter 4 considers elliptic curves defined over finite fields. Here we prove
Hasse’s theorem to give a bound on the size of the group. Chapter 5 demonstrates how the mathematics of the previous chapters can be employed in a
cryptographic algorithm for use in key exchange or encryption of messages.
Appendix A contains some further results on elliptic curves while Appendix B contains the mathematical background material that is employed
throughout the project. We also make use of Matlab to speed up calculations
with elliptic curves and the relevant codes can be found in Appendix C.
Chapter 2
Elliptic curves
Elliptic curves have, over the last three decades, become an increasingly
important subject of research in number theory and related fields such as
cryptography. They have also played a part in numerous other mathematical
problems over hundreds of years. For example, the congruent number problem
of finding which integers n can occur as the area of a right-angled triangle with
rational sides can be expressed using elliptic curves (see Chapter 1 of [9]).
In this chapter we set out the basic mathematics of elliptic curves, starting
with their derivation and definition followed by the proof that points upon
them form an additive abelian group.
2.1 A class of algebraic curves
Elliptic curves are a specific class of algebraic curves. In this section we show
how we arrive at their standard definition, seen in the introduction, from the
more general case. First consider an algebraic curve formed from a conic on
the left and a cubic on the right:
$$y^2 + \theta_1 xy + \theta_2 y + \theta_3 x + \theta_4 = x^3 + \sigma_1 x^2 + \sigma_2 x + \sigma_3$$
where $\theta_i$, $\sigma_i$ are constants. We can then combine the constant and linear
terms to form what is known as the generalised Weierstrass equation:
$$y^2 + a_1 xy + a_3 y = x^3 + a_2 x^2 + a_4 x + a_6 \tag{2.1}$$
where $a_1, \ldots, a_6$ are constants. In practice we must specify which field these
constants and the variables x, y belong to. So long as this field does not have
characteristic 2 then we can divide by 2 and complete
the square. This gives
$$\left(y + \frac{a_1 x}{2} + \frac{a_3}{2}\right)^2 = x^3 + \left(a_2 + \frac{a_1^2}{4}\right)x^2 + \left(a_4 + \frac{a_1 a_3}{2}\right)x + \left(\frac{a_3^2}{4} + a_6\right)$$
which can be written as
$$y_1^2 = x^3 + a_2' x^2 + a_4' x + a_6'$$
with $y_1 = y + a_1 x/2 + a_3/2$ and some constants $a_2', a_4', a_6'$. If the characteristic
were 2 then 2 would be equal to 0 in this field. We would then not be
able to perform the above operation as we cannot divide by zero.
If the characteristic is neither 2 nor 3, then we can perform a further
substitution, letting $x_1 = x + a_2'/3$, to obtain
$$y_1^2 = x_1^3 + Ax_1 + B$$
for some constants A, B. This equation is known as the Weierstrass equation
for an elliptic curve and is used in all cases, except those where the characteristic of the field is either 2 or 3. If the characteristic is 2 then we use the
generalised Weierstrass equation (2.1), and if it is 3 we use the form $y_1^2 = x^3 + a_2' x^2 + a_4' x + a_6'$ obtained by completing the square.
Notice that we assume the coefficients of the $y^2$ and $x^3$ terms are one.
Suppose we start with an equation
$$cy^2 = dx^3 + ax + b$$
with $c, d \neq 0$. Then multiply both sides of the equation by $c^3 d^2$ to obtain
$$(c^2 dy)^2 = (cdx)^3 + (ac^2 d)(cdx) + (bc^3 d^2)$$
and so if we use the change of variables
$$y_1 = c^2 dy, \qquad x_1 = cdx$$
then we have an equation in Weierstrass form.
We cannot draw meaningful pictures of such curves over most fields, but
for intuition we can think of graphs over the real numbers of which there are
two main types.
Figure 2.1: Some examples of elliptic curves defined over the real numbers.
On the left is $y^2 = x^3 - x$ and on the right $y^2 = x^3 + x$.
In the first example the cubic on the right-hand side has three real roots, while in the second it has one. We
prove in Appendix A.1 that when this cubic has a multiple root the curve
will have a singular point, which causes problems when defining the addition
operation. We investigate the singular cases in Appendix A.1 but otherwise
assume that all the roots are distinct.
In Appendix A.2 we use the definition of the discriminant, applied to this
case when the characteristic is neither 2 nor 3, to derive the following condition
for distinct roots:
$$4A^3 + 27B^2 \neq 0$$
The general definition of an elliptic curve will be the Weierstrass equation
together with the above condition.
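The condition for distinct roots is easy to test numerically. The following is a minimal sketch in Python (not one of the Matlab m-files of Appendix C); the function name `is_nonsingular` and the optional modulus argument are our own assumptions, and the modular version is only meaningful for a prime p > 3.

```python
def is_nonsingular(A, B, p=None):
    """Return True when 4A^3 + 27B^2 is nonzero.

    With p=None the test is over the integers/rationals; otherwise the
    discriminant is reduced modulo the prime p (assumed to be > 3).
    """
    disc = 4 * A**3 + 27 * B**2
    if p is not None:
        disc %= p
    return disc != 0


print(is_nonsingular(-1, 0))        # y^2 = x^3 - x (left curve of Figure 2.1)
print(is_nonsingular(1, 0))         # y^2 = x^3 + x (right curve of Figure 2.1)
print(is_nonsingular(0, 0))         # y^2 = x^3 has a triple root at x = 0
print(is_nonsingular(1, 6, p=11))   # y^2 = x^3 + x + 6 over Z_11
```

Only the third call should report a singular curve, since $x^3$ has a repeated root.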
As mentioned above we must specify what set A, B, x and y belong to.
Usually they will belong to a field such as R, C or Q, one of the finite fields
$\mathbb{F}_p$ ($= \mathbb{Z}_p$) for a prime p, or one of the finite fields $\mathbb{F}_q$ where $q = p^k$ with $k \geq 1$.
If K is a field with $A, B \in K$ then we say the elliptic curve E is defined
over K. In general we use E and K to represent an elliptic curve and the
field over which it is defined. If we wish to consider points with coordinates in a field $L \supseteq K$
we write E(L), which is defined as below.
$$E(L) = \{\infty\} \cup \{(x, y) \in L \times L \mid y^2 = x^3 + Ax + B\}$$
We include this point at infinity on elliptic curves for use in the group operation defined in the following section. It is easiest to regard it as a point
(∞, ∞), denoted simply by ∞, sitting at the top of the y-axis. A line
is said to pass through ∞ when it is exactly vertical (i.e. x = constant),
and so two vertical lines will meet at ∞. We make sense of this concept and
interpret ∞ as being on an elliptic curve in Appendix A.5.1. We could also think
of ∞ as sitting at the bottom of the y-axis, but this would imply that two straight
lines meet at two points. Instead we require the top and bottom ∞ to be
the same point (as if the y-axis were wrapped around to form a circle).
2.2 Group law
As stated in the introduction, we can start with two points on an elliptic curve
(or even one) and produce another. In this section we describe how to carry
out this process and derive the formula for use with the Weierstrass equation.
We then show that by defining this process as an addition operation we can
generate an additive abelian group.
Suppose we have a point $P = (x_0, y_0)$ on an elliptic curve (in any characteristic). If L is a line through P and ∞ then it is the vertical line $x = x_0$.
We denote the other point of intersection between L and E as $P'$. For the
Weierstrass equation, $P' = (x_0, -y_0)$ since this curve is symmetric about the
x-axis. For the generalised Weierstrass equation it is calculated as in the
lemma below.
Lemma 2.1. If $P = (x_0, y_0)$ lies on the curve, E, given by
$$y^2 + a_1 xy + a_3 y = x^3 + a_2 x^2 + a_4 x + a_6$$
then the other point of intersection between E and $x = x_0$ is
$$P' = (x_0, -a_1 x_0 - a_3 - y_0)$$
Proof We know that when $x = x_0$ there are two points on E, with y-coordinates $y_0$ and $y_1$, so:
$$y^2 + a_1 x_0 y + a_3 y = x_0^3 + a_2 x_0^2 + a_4 x_0 + a_6$$
$$0 = y^2 + y(a_1 x_0 + a_3) + (-x_0^3 - a_2 x_0^2 - a_4 x_0 - a_6)$$
$$\equiv (y - y_0)(y - y_1) = y^2 - y(y_0 + y_1) + y_0 y_1$$
We can see that the negative of the coefficient of the linear term is the sum
of the roots. Therefore
$$y_0 + y_1 = -a_1 x_0 - a_3$$
$$y_1 = -a_1 x_0 - a_3 - y_0$$
So $P' = (x_0, -a_1 x_0 - a_3 - y_0)$ as required.
So if $P = (x_0, y_0)$ then $P'$ as defined above is $(x_0, -a_1 x_0 - a_3 - y_0)$ if
the characteristic of K is 2 and $(x_0, -y_0)$ otherwise. Later we conclude that
$P' = -P$ in group notation.
We can now define elliptic curve addition. Suppose we are on an elliptic
curve, E, defined over a field K of any characteristic. If we start with two
points, $P_1 = (x_1, y_1)$ and $P_2 = (x_2, y_2)$ on E then we can find a third point,
$P_3$, as follows. Draw the line L through $P_1$ and $P_2$ and find its third point
of intersection with E, denoted $P_3'$. Finally calculate $(P_3')' = P_3$ using the method
above. The addition operation is then defined as
$$P_1 + P_2 = P_3$$
Figure 2.2: Adding points on an elliptic curve
We now find explicit formulae for $P_3$ by looking at the different possibilities
for $P_1$ and $P_2$. Suppose that we are on an elliptic curve E given by the
Weierstrass equation $y^2 = x^3 + Ax + B$.
First assume $P_1 \neq P_2$ and that neither point is ∞. Assume also that $x_2 \neq x_1$. We then know that
the slope of the line L is
$$m = \frac{y_2 - y_1}{x_2 - x_1}$$
and the equation of L is
$$y = m(x - x_1) + y_1 \tag{2.2}$$
To find the intersection with E substitute (2.2) into the equation for E:
$$(m(x - x_1) + y_1)^2 = x^3 + Ax + B$$
$$\Rightarrow x^3 - m^2 x^2 + \cdots = 0$$
where the three roots of this cubic are the x-coordinates of the three points where L intersects
E. Note from Theorem B.16 that the sum of the roots is the negative of the
coefficient of the $x^2$ term in the cubic. We know two of the roots are $x_1$ and
$x_2$ and so we can conclude that $x_3' = m^2 - x_1 - x_2$. We can then substitute
back to get $y_3' = m(x_3' - x_1) + y_1$. Finally we can reflect in the x-axis to find
$P_3 = (x_3, y_3)$:
$$x_3 = m^2 - x_1 - x_2, \qquad y_3 = m(x_1 - x_3) - y_1$$
In the case that $x_1 = x_2$ but $y_1 \neq y_2$ the line through $P_1$ and $P_2$ is
vertical and so intersects E at ∞. Reflecting ∞ in the x-axis gives ∞ and
so $P_1 + P_2 = \infty$.
In the case where $P_1 = P_2 = (x_1, y_1)$ the line, L, is the tangent at $(x_1, y_1)$.
Implicit differentiation allows us to find m, the slope of L:
$$2y\frac{dy}{dx} = 3x^2 + A \implies m = \frac{dy}{dx} = \frac{3x_1^2 + A}{2y_1}$$
If $y_1 = 0$ then L is vertical so we set $P_1 + P_2 = \infty$. Otherwise the equation
of L is
$$y = m(x - x_1) + y_1$$
as before. We can substitute in to obtain the same cubic and then use the
fact that $x_1$ is a double root to obtain $P_3 = (x_3, y_3)$:
$$x_3 = m^2 - 2x_1, \qquad y_3 = m(x_1 - x_3) - y_1$$
Finally suppose $P_2 = \infty$, in which case the line through $P_1$ and ∞ is a
vertical line that intersects E at $P_1'$, the reflection of $P_1$ in the x-axis. Then
when we reflect this back we get $P_1$ so
$$P_1 + \infty = P_1$$
We can extend this to include ∞ + ∞ = ∞.
We can now begin to see why elliptic curves are suited for the definition of
such an operation. The right-hand side of the Weierstrass equation is cubic,
which ensures that the line between any two points will intersect the curve at a third
point, the first step in the operation. Then the $y^2$ term on the left-hand side
makes the curve symmetric about the x-axis, which is vital for the reflection
part. The addition operation is summarised in the box below.
SUMMARY
Let E be an elliptic curve defined by $y^2 = x^3 + Ax + B$.
Let $P_1 = (x_1, y_1)$ and $P_2 = (x_2, y_2)$ be points on E with $P_1, P_2 \neq \infty$.
We then define $P_1 + P_2 = P_3 = (x_3, y_3)$ as follows.
1. If $x_1 \neq x_2$ then
$$x_3 = m^2 - x_1 - x_2, \qquad y_3 = m(x_1 - x_3) - y_1, \qquad \text{where } m = \frac{y_2 - y_1}{x_2 - x_1}$$
2. If $x_1 = x_2$ but $y_1 \neq y_2$ then $P_1 + P_2 = \infty$.
3. If $P_1 = P_2$ and $y_1 \neq 0$ then
$$x_3 = m^2 - 2x_1, \qquad y_3 = m(x_1 - x_3) - y_1, \qquad \text{where } m = \frac{3x_1^2 + A}{2y_1}$$
4. If $P_1 = P_2$ and $y_1 = 0$ then $P_1 + P_2 = \infty$.
Also we define $P + \infty = P$ for all points P on E.
If the characteristic of K is 2 or 3 then we use the same method for elliptic
curve addition but the formulae are different. We consider the characteristic
2 and 3 cases in Appendix A.3 and Appendix A.4 respectively.
Theorem 2.2. The points on E form an additive abelian group with ∞ as
the identity element and elliptic curve addition as the group operation.
Proof Recall the definition of a group from Appendix B.4. Commutativity is clear from the formulas and from the intuition of drawing a straight
line through two points, while the identity property holds by definition. It is
also clear from the formulas that the sum of any two points will also be on
the elliptic curve, and if the original points had coordinates in a field L,
then so does the sum.
For inverses we define −P as $P'$ (the reflection of P in the x-axis when
the characteristic is not 2). Then $P + P' = \infty$ for all P. Associativity
can be proved with the formulas, trying all cases, or with a number of other
approaches. We use projective space to prove this property in Appendix A.5.
The theorem also holds in characteristic 2, with a similar proof
(defining −P as the point $P'$ given by Lemma 2.1 for Equation (2.1)).
Example 2.1. Let E be the curve $y^2 = x^3 - 25x$ and suppose we know the
point (−4, 6) lies on the curve. To find another point on E we can add this
point to itself. In the notation of elliptic curve addition we have:
$$m = \frac{3(-4)^2 - 25}{2(6)} = \frac{23}{12}$$
Hence
$$2(-4, 6) = (-4, 6) + (-4, 6) = \left(\left(\frac{23}{12}\right)^2 - 2(-4),\ \frac{23}{12}(-4 - x_3) - 6\right) = \left(\frac{1681}{144},\ -\frac{62279}{1728}\right)$$
A Matlab m-file was constructed to perform elliptic curve addition over
the real numbers. Suppose we have an elliptic curve, E, given by y 2 =
x3 + Ax + B and two points P1 = (x1 , y1 ), P2 = (x2 , y2 ). The m-file will find
the sum, P1 + P2 = P3 = (x3 , y3 ), where + represents elliptic curve addition.
It takes as its inputs x1 , y1 , x2 , y2 and A and produces x3 , y3 and, if requested,
m. In future examples elliptic curve addition is performed with this m-file
to save calculation.
The file is stored in ECAD.m and can be found in Appendix C.1.
Note that if P is a point on an elliptic curve and k is a positive integer,
then kP denotes P + P + ... + P (with k summands). If k < 0 then
kP = (−P ) + (−P ) + ... + (−P ), (with |k| summands).
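To illustrate the addition formulas of the summary box, here is a minimal sketch in Python. It is not the Matlab file ECAD.m of Appendix C.1; the names `INF` and `ec_add` are our own assumptions, and exact rational arithmetic (`fractions.Fraction`) is used in place of floating point so that Example 2.1 can be reproduced exactly.

```python
from fractions import Fraction

INF = None  # our stand-in for the point at infinity


def ec_add(P1, P2, A):
    """Add two points on y^2 = x^3 + Ax + B over the rationals.

    B does not appear in the formulas, so it is not an argument.
    Both points are assumed to lie on the same curve.
    """
    if P1 is INF:
        return P2
    if P2 is INF:
        return P1
    x1, y1 = P1
    x2, y2 = P2
    if x1 != x2:
        # Case 1: chord through two points with distinct x-coordinates.
        m = Fraction(y2 - y1) / Fraction(x2 - x1)
    elif y1 != y2 or y1 == 0:
        # Cases 2 and 4: the line is vertical, so the sum is infinity.
        return INF
    else:
        # Case 3: tangent line at a point with y1 != 0.
        m = Fraction(3 * x1**2 + A) / Fraction(2 * y1)
    x3 = m**2 - x1 - x2
    y3 = m * (x1 - x3) - y1
    return (x3, y3)


# Example 2.1: doubling (-4, 6) on y^2 = x^3 - 25x.
x3, y3 = ec_add((-4, 6), (-4, 6), -25)
print(x3, y3)                      # 1681/144 -62279/1728
print(y3**2 == x3**3 - 25 * x3)    # the result lies on the curve
```

When $P_1 = P_2$ the expression `m**2 - x1 - x2` reduces to $m^2 - 2x_1$, so one formula serves both the chord and tangent cases.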
2.2.1 Prime curve examples
This section contains some examples of working with elliptic curves which
are defined over Zp . These are often called the prime curves and can be
far simpler to work with as we can reduce modulo p at each stage. These
examples are derived from those in Section 10.3 of [8].
Suppose we have an elliptic curve, E, over $\mathbb{Z}_p$. In this case we have a
cubic equation in which the variables and coefficients take values in the set
of integers $\{0, 1, \ldots, p - 1\}$ and all calculations are performed modulo p:
$$y^2 \equiv x^3 + Ax + B \pmod{p}$$
We write $E_p(A, B)$ for the set of pairs of integers (x, y) that satisfy the above equation, together with a point at infinity, ∞.
Example 2.2. The set $E_{11}(1, 6)$ consists of the pairs of integers (x, y) that satisfy
$$y^2 \equiv x^3 + x + 6 \pmod{11}$$
together with ∞. We can see that (x, y) = (7, 9) is in this set as
$$9^2 \bmod 11 = (7^3 + 7 + 6) \bmod 11$$
$$81 \bmod 11 = 356 \bmod 11 \iff 4 = 4$$
To find all the points in $E_{11}(1, 6)$ we find all the possible values of $x^3 + x + 6$
(mod 11) and then see which values of $y^2$ match. There are 11 choices of
x, the integers {0, 1, ..., 10}. Substituting these values in turn into the cubic and
reducing modulo 11 gives the possible values of $y^2$:
x = 0   ⇒  RHS = 6
x = 1   ⇒  RHS = 8
x = 2   ⇒  RHS = 16 ≡ 5
x = 3   ⇒  RHS = 36 ≡ 3
x = 4   ⇒  RHS = 74 ≡ 8
x = 5   ⇒  RHS = 136 ≡ 4
x = 6   ⇒  RHS = 228 ≡ 8
x = 7   ⇒  RHS = 356 ≡ 4
x = 8   ⇒  RHS = 526 ≡ 9
x = 9   ⇒  RHS = 744 ≡ 7
x = 10  ⇒  RHS = 1016 ≡ 4
So we can see that the possible values of $y^2$ are {3, 4, 5, 6, 7, 8, 9},
i.e. $y^2$ cannot be 0, 1, 2 or 10.
Next examine the 11 possible values of y and identify which values of x
they could be paired with to give a point on the curve.
y = 0   ⇒  y^2 = 0         ⇒  No points
y = 1   ⇒  y^2 = 1         ⇒  No points
y = 2   ⇒  y^2 = 4         ⇒  x = 5, 7, 10
y = 3   ⇒  y^2 = 9         ⇒  x = 8
y = 4   ⇒  y^2 = 16 ≡ 5    ⇒  x = 2
y = 5   ⇒  y^2 = 25 ≡ 3    ⇒  x = 3
y = 6   ⇒  y^2 = 36 ≡ 3    ⇒  x = 3
y = 7   ⇒  y^2 = 49 ≡ 5    ⇒  x = 2
y = 8   ⇒  y^2 = 64 ≡ 9    ⇒  x = 8
y = 9   ⇒  y^2 = 81 ≡ 4    ⇒  x = 5, 7, 10
y = 10  ⇒  y^2 = 100 ≡ 1   ⇒  No points
So there are 13 points in $E_{11}(1, 6)$ (the 12 found above and ∞):
$E_{11}(1, 6)$ = {(2, 4), (2, 7), (3, 5), (3, 6), (5, 2), (5, 9), (7, 2), (7, 9), (8, 3), (8, 8), (10, 2), (10, 9), ∞}
An m-file, PC.m, to find and plot all the points on a prime curve was constructed and is stored in Appendix C.2. This m-file takes as its inputs, A, B
and p and produces two vectors X, Y which contain all the points (x, y) that
lie on y 2 ≡ x3 + Ax + B (mod p).
When run on this example it verified that we had found all the
points in $E_{11}(1, 6)$ and plotted them. We can see that the points
are symmetric about the line y = 5.5.
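The brute-force search of Example 2.2 can be sketched in a few lines of Python. This is an illustration only, not the Matlab file PC.m of Appendix C.2; the function name `prime_curve_points` is our own assumption and no plotting is attempted.

```python
def prime_curve_points(A, B, p):
    """List all affine points (x, y) with y^2 = x^3 + Ax + B (mod p).

    The point at infinity is not included in the returned list.
    """
    points = []
    for x in range(p):
        rhs = (x**3 + A * x + B) % p
        for y in range(p):
            if (y * y) % p == rhs:
                points.append((x, y))
    return points


pts = prime_curve_points(1, 6, 11)
print(pts)
print(len(pts) + 1)  # group order, counting the point at infinity
```

For $E_{11}(1, 6)$ this returns the 12 affine points listed above, so the group has 13 elements once ∞ is included. The double loop takes about $p^2$ steps, which is acceptable only for small p.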
We can perform the elliptic curve addition operation on prime curves,
however we reduce modulo p at each step. For example, still considering
E11 (1, 6):
• If P = (8, 3) then we know that −P = (8, −3). Working modulo 11 we
see that −P = (8, 8) which is also a point in E11 (1, 6).
• Let P = (8, 3) and Q = (3, 5). Then to find R = P + Q:
$$m = \frac{5 - 3}{3 - 8} = \frac{2}{-5} \equiv \frac{2}{6} = \frac{1}{3} = 1 \times 4 = 4 \pmod{11}$$
The penultimate step involved taking the multiplicative inverse of 3 in
$\mathbb{Z}_{11}$, which is 4 since $3 \times 4 = 12 \equiv 1$. We now proceed to show that
$$x_R = 4^2 - 8 - 3 = 5, \qquad y_R = 4(8 - 5) - 3 = 9$$
So in $E_{11}(1, 6)$ we find (8, 3) + (3, 5) = (5, 9).
• Again let P = (8, 3). To calculate 2P = P + P:
$$m = \frac{3(8^2) + 1}{2 \cdot 3} = \frac{193}{6} \equiv \frac{6}{6} = 1 \pmod{11}$$
Then
$$x_{2P} = 1^2 - 2(8) = -15 \equiv 7 \pmod{11}$$
$$y_{2P} = 1(8 - 7) - 3 = -2 \equiv 9 \pmod{11}$$
So in $E_{11}(1, 6)$ we find 2(8, 3) = (7, 9).
The earlier m-file for performing elliptic curve addition was modified for use
with prime curves. It now reduces modulo p at each stage using Matlab’s mod
function and finds the inverses of elements, so the final answer is a point on
a prime curve.
This new m-file is ECADP.m and can be found in Appendix C.3. It
has the same inputs and outputs as ECAD.m but the user must input
p in addition. It makes use of the m-file inve.m which is stored in Appendix
C.4. This m-file takes as its inputs a number N and a prime p and outputs
the multiplicative inverse of N in the field $\mathbb{Z}_p$.
The m-file ECADP.m was used to calculate the remaining entries in the
addition table overleaf (Table 2.1). In Example 3.4 we show that (2, 7) is a
generator of this group and so it is isomorphic to Z13 .
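The modular version of the addition law can be sketched as follows. Again this is a Python illustration under our own naming (`ec_add_mod`, `order`, `INF`), not the Matlab files ECADP.m or inve.m; the modular inverse is taken with Python's built-in `pow(n, -1, p)`, which requires Python 3.8 or later. Points are assumed to have coordinates already reduced to the range 0, ..., p − 1.

```python
INF = None  # our stand-in for the point at infinity


def ec_add_mod(P1, P2, A, p):
    """Add two points of E_p(A, B); B is not needed by the formulas."""
    if P1 is INF:
        return P2
    if P2 is INF:
        return P1
    x1, y1 = P1
    x2, y2 = P2
    if x1 != x2:
        # Chord: slope (y2 - y1) / (x2 - x1) computed in Z_p.
        m = (y2 - y1) * pow((x2 - x1) % p, -1, p) % p
    elif (y1 + y2) % p == 0:
        # P2 = -P1 (this includes doubling a point with y = 0).
        return INF
    else:
        # Tangent: slope (3 x1^2 + A) / (2 y1) computed in Z_p.
        m = (3 * x1 * x1 + A) * pow((2 * y1) % p, -1, p) % p
    x3 = (m * m - x1 - x2) % p
    y3 = (m * (x1 - x3) - y1) % p
    return (x3, y3)


def order(P, A, p):
    """Smallest k >= 1 with kP = infinity, found by repeated addition."""
    k, Q = 1, P
    while Q is not INF:
        Q = ec_add_mod(Q, P, A, p)
        k += 1
    return k


print(ec_add_mod((8, 3), (3, 5), 1, 11))   # (5, 9)
print(ec_add_mod((8, 3), (8, 3), 1, 11))   # (7, 9)
print(order((2, 7), 1, 11))                # 13
```

The first two results agree with the worked examples above, and the order of (2, 7) equals the number of points in $E_{11}(1, 6)$, which is consistent with (2, 7) being a generator.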
+       (2,4)   (2,7)   (3,5)   (3,6)   (5,2)   (5,9)   (7,2)   (7,9)   (8,3)   (8,8)   (10,2)  (10,9)  ∞
(2,4)   (5,9)   ∞       (7,2)   (10,2)  (2,7)   (8,8)   (7,9)   (3,6)   (5,2)   (10,9)  (8,3)   (3,5)   (2,4)
(2,7)   ∞       (5,2)   (10,9)  (7,9)   (8,3)   (2,4)   (3,5)   (7,2)   (10,2)  (5,9)   (3,6)   (8,8)   (2,7)
(3,5)   (7,2)   (10,9)  (8,3)   ∞       (8,8)   (7,9)   (5,2)   (2,7)   (5,9)   (3,6)   (2,4)   (10,2)  (3,5)
(3,6)   (10,2)  (7,9)   ∞       (8,8)   (7,2)   (8,3)   (2,4)   (5,9)   (3,5)   (5,2)   (10,9)  (2,7)   (3,6)
(5,2)   (2,7)   (8,3)   (8,8)   (7,2)   (10,2)  ∞       (10,9)  (3,5)   (3,6)   (2,4)   (7,9)   (5,9)   (5,2)
(5,9)   (8,8)   (2,4)   (7,9)   (8,3)   ∞       (10,9)  (3,6)   (10,2)  (2,7)   (3,5)   (5,2)   (7,2)   (5,9)
(7,2)   (7,9)   (3,5)   (5,2)   (2,4)   (10,9)  (3,6)   (2,7)   ∞       (8,8)   (10,2)  (5,9)   (8,3)   (7,2)
(7,9)   (3,6)   (7,2)   (2,7)   (5,9)   (3,5)   (10,2)  ∞       (2,4)   (10,9)  (8,3)   (8,8)   (5,2)   (7,9)
(8,3)   (5,2)   (10,2)  (5,9)   (3,5)   (3,6)   (2,7)   (8,8)   (10,9)  (7,9)   ∞       (7,2)   (2,4)   (8,3)
(8,8)   (10,9)  (5,9)   (3,6)   (5,2)   (2,4)   (3,5)   (10,2)  (8,3)   ∞       (7,2)   (2,7)   (7,9)   (8,8)
(10,2)  (8,3)   (3,6)   (2,4)   (10,9)  (7,9)   (5,2)   (5,9)   (8,8)   (7,2)   (2,7)   (3,5)   ∞       (10,2)
(10,9)  (3,5)   (8,8)   (10,2)  (2,7)   (5,9)   (7,2)   (8,3)   (5,2)   (2,4)   (7,9)   ∞       (3,6)   (10,9)
∞       (2,4)   (2,7)   (3,5)   (3,6)   (5,2)   (5,9)   (7,2)   (7,9)   (8,3)   (8,8)   (10,2)  (10,9)  ∞
Table 2.1: The addition table for $E_{11}(1, 6)$.
This is the group of points (x, y) that satisfy $y^2 = x^3 + x + 6$ within the field $\mathbb{Z}_{11}$ along with the point ∞.
This group can be shown to be isomorphic to $\mathbb{Z}_{13}$ and generated by the point (2,7).
Example 2.3. Consider $E_{23}(1, 1)$, the set of pairs of integers (x, y) that satisfy
$$y^2 \equiv x^3 + x + 1 \pmod{23}$$
Running PC.m with A = B = 1 and p = 23 produced a plot of these points.
Note that all the points with the exception of (4,0) are symmetric about
the line y = 11.5. If there were another point symmetric to (4,0) then it
would be at (4,23). However this is equivalent to (4,0) modulo 23,
so it is as if the y-axis were wrapped around to form a circle, the analogy
given earlier.
An m-file to check whether a point lies on a prime curve, check.m,
was created and is stored in Appendix C.6. This m-file takes as its inputs
x, y, A, B, p and checks whether the point (x, y) lies on the curve
$$y^2 \equiv x^3 + Ax + B \pmod{p}$$
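A membership test of this kind is a one-line computation. The sketch below is a Python illustration with a hypothetical function name, `on_prime_curve`; it is not the Matlab file check.m of Appendix C.6.

```python
def on_prime_curve(x, y, A, B, p):
    """Return True when y^2 = x^3 + Ax + B holds modulo p."""
    return (y * y - (x**3 + A * x + B)) % p == 0


print(on_prime_curve(4, 0, 1, 1, 23))    # the self-symmetric point of Example 2.3
print(on_prime_curve(4, 23, 1, 1, 23))   # the same point, since 23 = 0 modulo 23
print(on_prime_curve(7, 9, 1, 6, 11))    # the point checked in Example 2.2
print(on_prime_curve(1, 1, 1, 1, 23))    # not on the curve: 1 is not 3 modulo 23
```

The first two calls agree because the coordinates are only used modulo 23, which is the wrap-around behaviour described above.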
Chapter 3
Torsion points and endomorphisms of elliptic curves
The order of an element, a, in the additive abelian group defined by an
elliptic curve is the smallest positive integer m such that ma = ∞. If no such
m exists, we say that a has infinite order. A finitely generated abelian group
can be split into a torsion subgroup and a torsion-free part, where the former
contains the torsion points, which are those points whose orders are finite.
These points play a large role in the theory of elliptic curves, especially in
elliptic curves defined over finite fields, where all points are torsion. In general
the torsion subgroup is simpler to work with, which is another reason why
elliptic curves over finite fields are of such great interest. In this chapter we
examine the properties of the torsion points as well as deriving some results
for use in Chapter 4. We start by considering endomorphisms of elliptic
curves, which help in our study of the torsion points since multiplication by
n on an elliptic curve can be described as an endomorphism.
3.1 Endomorphisms of elliptic curves
Recall that a homomorphism is a structure-preserving map between two algebraic structures (in this case, groups). Here we use endomorphism to mean
a homomorphism $\alpha : E(K) \to E(K)$ that is given by rational functions. In
other words, $\alpha(P_1 + P_2) = \alpha(P_1) + \alpha(P_2)$, and there are rational functions
$R_1(x, y)$, $R_2(x, y)$ with coefficients in K such that
$$\alpha(x, y) = (R_1(x, y), R_2(x, y))$$
for all $(x, y) \in E(K)$. Since α is a homomorphism we have α(∞) = ∞. We also
assume that α is not the trivial endomorphism that maps every point to ∞,
denoted by α = 0.
Example 3.1. Let E be given by $y^2 = x^3 + Ax + B$ and let α(P) = 2P.
Then α is a homomorphism and $\alpha(x, y) = (R_1(x, y), R_2(x, y))$ where
$$R_1(x, y) = \left(\frac{3x^2 + A}{2y}\right)^2 - 2x$$
$$R_2(x, y) = \left(\frac{3x^2 + A}{2y}\right)\left(3x - \left(\frac{3x^2 + A}{2y}\right)^2\right) - y$$
Since α is a homomorphism given by rational functions, it is an endomorphism of E.
The following theorem will allow us to use a standard form for the rational
functions that describe an endomorphism.
Theorem 3.1. Let E be given by $y^2 = x^3 + Ax + B$, and defined over a field
K. Any endomorphism, α, can be written in the following form, where
p(x), q(x) are polynomials with no common factors and s(x), t(x) likewise.
$$\alpha(x, y) = (r_1(x), r_2(x)y) = \left(\frac{p(x)}{q(x)},\ y\frac{s(x)}{t(x)}\right)$$
Proof α is an endomorphism and so can be expressed with rational functions,
$\alpha(x, y) = (R_1(x, y), R_2(x, y))$. Now, since $y^2 = x^3 + Ax + B$ for all $(x, y) \in E(K)$ we can replace any even power of y by a polynomial in x, and any odd
power of y by y times a polynomial in x, so that each rational function R can be written as
$$R(x, y) = \frac{p_1(x) + p_2(x)y}{p_3(x) + p_4(x)y}$$
We can then rationalize the denominator, multiplying top and bottom by $p_3(x) - p_4(x)y$, and replace $y^2$ to get
$$R(x, y) = \frac{q_1(x) + q_2(x)y}{q_3(x)} \tag{3.1}$$
Since α is a homomorphism it preserves negation, so
$$\alpha(x, -y) = \alpha(-(x, y)) = -\alpha(x, y)$$
This means that
$$R_1(x, -y) = R_1(x, y) \quad \text{and} \quad R_2(x, -y) = -R_2(x, y)$$
By writing $R_1$ in the form of Equation (3.1) we can see that $q_2(x) = 0$, and
similarly with $R_2$, we find that $q_1(x) = 0$. Therefore we may assume that
$$\alpha(x, y) = (r_1(x), r_2(x)y)$$
for rational functions $r_1(x)$, $r_2(x)$.
We must still consider what happens when one of the rational functions
is not defined at a point. Write
$$r_1(x) = \frac{p(x)}{q(x)} \quad \text{and} \quad r_2(x) = \frac{s(x)}{t(x)}$$
with polynomials p(x), q(x) that do not have a common factor and s(x), t(x)
likewise. If q(x) = 0 at some point (x, y) then we define α(x, y) = ∞.
If $q(x) \neq 0$ then part (ii) of Lemma 3.2 below shows that $r_2(x)$ will also be
defined. This completes the proof of Theorem 3.1.
Lemma 3.2. Let
$$\alpha(x, y) = \left(\frac{p(x)}{q(x)},\ y\frac{s(x)}{t(x)}\right)$$
be an endomorphism of the elliptic curve E given by $y^2 = x^3 + Ax + B$. Let
p, q be polynomials with no common root, and s, t likewise. Then
(i) There is a polynomial u(x), such that u and q have no common root, with
$$\frac{(x^3 + Ax + B)s(x)^2}{t(x)^2} = \frac{u(x)}{q(x)^3}$$
(ii) $t(x_0) = 0$ if and only if $q(x_0) = 0$.
Proof (i) Because α is an endomorphism, the point α(x, y) also lies on the
elliptic curve E. Hence
$$\frac{(x^3 + Ax + B)s(x)^2}{t(x)^2} = \frac{y^2 s(x)^2}{t(x)^2} = \left(y\frac{s(x)}{t(x)}\right)^2 = \left(\frac{p(x)}{q(x)}\right)^3 + A\frac{p(x)}{q(x)} + B$$
$$= \frac{p(x)^3 + Ap(x)q(x)^2 + Bq(x)^3}{q(x)^3} \equiv \frac{u(x)}{q(x)^3}$$
where $u(x) = p(x)^3 + Ap(x)q(x)^2 + Bq(x)^3$. We still need to show that u(x)
and q(x) do not share a root.
Suppose q(a) = 0. If u(a) = 0 also, then
$$u(a) = p(a)^3 + Ap(a)q(a)^2 + Bq(a)^3 = 0$$
$$p(a)^3 = 0 \implies p(a) = 0$$
We assumed p(x) and q(x) shared no common roots so this cannot happen.
Therefore if q(a) = 0 then $u(a) \neq 0$, meaning u and q have no common roots.
(ii) From part (i) we know that
$$(x^3 + Ax + B)s(x)^2 q(x)^3 = t(x)^2 u(x)$$
Then if $q(x_0) = 0$ we have
$$t(x_0)^2 u(x_0) = 0$$
Now we know that u and q do not share a common root so $u(x_0) \neq 0$, therefore
$t(x_0) = 0$ as required.
To prove the converse, suppose $t(x_0) = 0$, then
$$(x_0^3 + Ax_0 + B)s(x_0)^2 q(x_0)^3 = 0$$
But $s(x_0) \neq 0$ because t and s are assumed to have no common roots so
$$(x_0^3 + Ax_0 + B)q(x_0)^3 = 0$$
We now consider the following two cases.
a) If $x_0^3 + Ax_0 + B \neq 0$ then $q(x_0)^3 = 0$ so $q(x_0) = 0$ and we are done.
b) If $x_0^3 + Ax_0 + B = 0$ then $(x - x_0)$ divides $(x^3 + Ax + B)$ so
$$x^3 + Ax + B = (x - x_0)Q(x)$$
where $Q(x_0) \neq 0$ as we have assumed no multiple roots. Now because
$t(x_0) = 0$ we can make a similar factorisation to get $t(x) = (x - x_0)T(x)$
for some polynomial T(x). Now we can consider again the equation from
part (i)
$$(x^3 + Ax + B)s(x)^2 q(x)^3 = t(x)^2 u(x)$$
$$(x - x_0)Q(x)s(x)^2 q(x)^3 = [(x - x_0)T(x)]^2 u(x)$$
$$q(x)^3 Q(x)s(x)^2 = (x - x_0)T(x)^2 u(x)$$
Now when $x = x_0$ we get
$$q(x_0)^3 Q(x_0)s(x_0)^2 = 0$$
We have already shown that $s(x_0) \neq 0$ and that $Q(x_0) \neq 0$ so we have
$q(x_0) = 0$ as required.
Define the degree of α to be $\deg(\alpha) = \max\{\deg(p(x)), \deg(q(x))\}$ if α
is non-trivial. If α = 0 then define deg(α) = 0.
Define $\alpha \neq 0$ to be a separable endomorphism if the derivative $r_1'(x)$ is
not identically zero. (Recall that if a function is identically zero then it is the
zero function as opposed to merely zero at a particular point.) By Lemma
3.3 below, this is equivalent to saying that at least one of $p'(x)$ and $q'(x)$ is
not identically zero.
Lemma 3.3. Let p(x), q(x) be polynomials with no common roots. Then
$$\frac{d}{dx}\frac{p(x)}{q(x)} = 0 \text{ if and only if } p'(x) = 0 \text{ and } q'(x) = 0$$
Proof Using the quotient rule
$$\frac{d}{dx}\frac{p(x)}{q(x)} = \frac{q(x)p'(x) - p(x)q'(x)}{q(x)^2}$$
So if $r_1'(x) = 0$ then $q(x)p'(x) - p(x)q'(x) = 0$. Suppose for a contradiction that $p'(x) \neq 0$. We can then write
\[ q(x) = \frac{p(x)q'(x)}{p'(x)} \]
Let $x_0$ be a root of $q(x)$; then by assumption $p(x_0) \neq 0$. We consider the following two cases.
(i) If $x_0$ is not a root of $q'(x)$, then $q'(x_0) \neq 0$. Setting $x = x_0$ gives
\[ \begin{aligned}
q(x_0) &= \frac{p(x_0)q'(x_0)}{p'(x_0)} \\
0 &= p(x_0)q'(x_0)
\end{aligned} \]
But $p(x_0) \neq 0$ and $q'(x_0) \neq 0$, so we have a contradiction.
(ii) If $x_0$ is a root of $q'(x)$ then
\[ \begin{aligned}
q(x) &= (x - x_0)^n Q(x) \\
q'(x) &= (x - x_0)^m R(x)
\end{aligned} \]
where $Q(x_0) \neq 0$, $R(x_0) \neq 0$ and $m < n$. Substituting gives
\[ \begin{aligned}
(x - x_0)^n Q(x) &= \frac{p(x)(x - x_0)^m R(x)}{p'(x)} \\
(x - x_0)^r Q(x) &= \frac{p(x)R(x)}{p'(x)}
\end{aligned} \]
where $r = n - m > 0$. Now let $x = x_0$:
\[ 0 = p(x_0)R(x_0) \]
But $p(x_0) \neq 0$ and $R(x_0) \neq 0$, so we have a contradiction.
So we must have $p'(x) = 0$. The proof that $q'(x) = 0$ is similar, with the roles of $p$ and $q$ reversed.
Example 3.2. Consider again $\alpha(P) = 2P$, which had
\[ R_1(x, y) = \left(\frac{3x^2 + A}{2y}\right)^2 - 2x \]
Substituting for $y^2$ and simplifying yields
\[ r_1(x) = \frac{x^4 - 2Ax^2 - 8Bx + A^2}{4(x^3 + Ax + B)} \]
Therefore $\deg(\alpha) = 4$. Note that $q'(x) = 4(3x^2 + A)$, which is not identically zero. This is true even in characteristic 3, where $q'(x) \equiv 4A$: we cannot have $A = 0$ there, because the cubic $x^3 + B$ has multiple roots in characteristic 3 ($4A^3 + 27B^2 \equiv 0$), which is contrary to assumption. Therefore $\alpha$ is a separable endomorphism.
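The closed form for $r_1$ can be checked numerically against the slope formula for doubling. The following is a minimal Python sketch, not part of the project's m-files; it assumes the prime curve $E_{11}(1, 6)$ used in the examples of this project, and the function names are hypothetical.

```python
# Compare the x-coordinate of 2P computed two ways on y^2 = x^3 + A x + B (mod p).
p, A, B = 11, 1, 6  # assumption: the curve E11(1, 6) from the text's examples

def double_x_by_slope(x, y):
    """x-coordinate of 2P from the tangent slope m = (3x^2 + A) / (2y)."""
    m = (3 * x * x + A) * pow(2 * y % p, -1, p) % p
    return (m * m - 2 * x) % p

def double_x_by_r1(x):
    """x-coordinate of 2P from r1(x) = (x^4 - 2Ax^2 - 8Bx + A^2) / (4(x^3 + Ax + B))."""
    num = (x**4 - 2 * A * x**2 - 8 * B * x + A * A) % p
    den = 4 * (x**3 + A * x + B) % p
    return num * pow(den, -1, p) % p

# All affine points with y != 0, so that both formulas are defined.
points = [(x, y) for x in range(p) for y in range(1, p)
          if (y * y - (x**3 + A * x + B)) % p == 0]
agree = all(double_x_by_slope(x, y) == double_x_by_r1(x) for x, y in points)
print(agree)
```

The check depends only on $x$ once $y^2$ has been substituted, which is exactly why $r_1$ is a rational function of $x$ alone.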
Example 3.3. We now repeat the previous example in characteristic 2, using the formula from Appendix A.3 for doubling a point.
If $y^2 + xy = x^3 + a_2x^2 + a_6$ we have
\[ \alpha(x, y) = (r_1(x), R_2(x, y)) \]
with $r_1(x) = (x^4 + a_6)/x^2$. Therefore $\deg(\alpha) = 4$. Since $p'(x) = 4x^3 \equiv 0$ and $q'(x) = 2x \equiv 0$, the endomorphism $\alpha$ is not separable.
Similarly, in the case $y^2 + a_3y = x^3 + a_4x + a_6$ we have $r_1(x) = (x^4 + a_4^2)/a_3^2$. Therefore $\deg(\alpha) = 4$ but $\alpha$ is not separable.
In general, in characteristic $p$, the map $\alpha(Q) = pQ$ has degree $p^2$ and is not separable.
Suppose $E$ is defined over the finite field $\mathbb{F}_q$. Then we define the Frobenius map as
\[ \phi_q(x, y) = (x^q, y^q) \]
Lemma 3.4. Let $E$ be defined over $\mathbb{F}_q$. Then $\phi_q$ is an endomorphism of $E$ with degree $q$, and $\phi_q$ is not separable.
Proof The main task of this proof is to show that $\phi_q : E(\overline{\mathbb{F}}_q) \to E(\overline{\mathbb{F}}_q)$ is a homomorphism. So we need to show that if $(x_1, y_1) + (x_2, y_2) = (x_3, y_3)$ then $\phi_q(x_1, y_1) + \phi_q(x_2, y_2) = \phi_q(x_3, y_3)$ for all the possible combinations of $(x_1, y_1)$ and $(x_2, y_2) \in E(\overline{\mathbb{F}}_q)$. Throughout the proof we can use Proposition B.14, because $E$ is defined over $\mathbb{F}_q$. This stated that
\[ \phi_q(x + y) = \phi_q(x) + \phi_q(y), \qquad \phi_q(xy) = \phi_q(x)\phi_q(y) \]
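(i) If $x_1 \neq x_2$ then $(x_3, y_3)$ is given by
\[ x_3 = m^2 - x_1 - x_2, \qquad y_3 = m(x_1 - x_3) - y_1, \qquad m = \frac{y_2 - y_1}{x_2 - x_1} \]
Now consider the sum of $\phi_q(x_1, y_1)$ and $\phi_q(x_2, y_2)$, given by $(X, Y)$ where
\[ \begin{aligned}
X &= \left(\frac{y_2^q - y_1^q}{x_2^q - x_1^q}\right)^2 - x_1^q - x_2^q = \left(\frac{(y_2 - y_1)^q}{(x_2 - x_1)^q}\right)^2 - x_1^q - x_2^q \\
  &= \left[\left(\frac{y_2 - y_1}{x_2 - x_1}\right)^2 - x_1 - x_2\right]^q = x_3^q \\
Y &= \frac{y_2^q - y_1^q}{x_2^q - x_1^q}\,(x_1^q - x_3^q) - y_1^q = \left(\frac{y_2 - y_1}{x_2 - x_1}\right)^q (x_1 - x_3)^q - y_1^q \\
  &= \left[\frac{y_2 - y_1}{x_2 - x_1}\,(x_1 - x_3) - y_1\right]^q = y_3^q
\end{aligned} \]
So $\phi_q(x_1, y_1) + \phi_q(x_2, y_2) = (x_3^q, y_3^q) = \phi_q(x_3, y_3)$ as required.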
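(ii) If $(x_1, y_1) = (x_2, y_2)$ and $y_1 \neq 0$ then $(x_3, y_3)$ is given by
\[ x_3 = m^2 - 2x_1, \qquad y_3 = m(x_1 - x_3) - y_1, \qquad m = \frac{3x_1^2 + A}{2y_1} \]
We now show that the sum of $\phi_q(x_1, y_1)$ and $\phi_q(x_2, y_2)$, given by $(X, Y)$, is $\phi_q(x_3, y_3)$ as before. We use $2^q = 2$, $3^q = 3$, $A^q = A$, since $2, 3, A \in \mathbb{F}_q$.
\[ \begin{aligned}
X &= \left(\frac{3x_1^{2q} + A}{2y_1^q}\right)^2 - 2x_1^q = \left(\frac{3^q x_1^{2q} + A^q}{2^q y_1^q}\right)^2 - 2^q x_1^q \\
  &= \left(\frac{(3x_1^2 + A)^q}{(2y_1)^q}\right)^2 - 2x_1^q = \left[\left(\frac{3x_1^2 + A}{2y_1}\right)^2 - 2x_1\right]^q = x_3^q \\
Y &= \frac{3x_1^{2q} + A}{2y_1^q}\,(x_1^q - x_3^q) - y_1^q = \left(\frac{3x_1^2 + A}{2y_1}\right)^q (x_1 - x_3)^q - y_1^q \\
  &= \left[\frac{3x_1^2 + A}{2y_1}\,(x_1 - x_3) - y_1\right]^q = y_3^q
\end{aligned} \]
So $\phi_q(x_1, y_1) + \phi_q(x_2, y_2) = (x_3^q, y_3^q) = \phi_q(x_3, y_3)$ as required.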
(iii) If $x_1 = x_2$ but $y_1 \neq y_2$ (so $y_2 = -y_1$) then $(x_3, y_3) = \infty$. So
\[ \phi_q(x_1, y_1) + \phi_q(x_2, y_2) = \phi_q(x_1, y_1) + \phi_q(x_1, -y_1) = (x_1^q, y_1^q) + (x_1^q, -y_1^q) \]
The final equality uses the fact that $q$ is odd (it is a power of the characteristic, which is not 2 for a curve in this form), meaning $(-y)^q = -y^q$. Now, by definition the sum of a point on an elliptic curve and its reflection in the x-axis is the point $\infty$, so
\[ \phi_q(x_1, y_1) + \phi_q(x_2, y_2) = \infty \]
Finally we note that
\[ \phi_q(\infty) = \phi_q((X, Y) + (X, -Y)) = \phi_q(X, Y) + \phi_q(X, -Y) = (X^q, Y^q) + (X^q, -Y^q) = \infty \]
So $\phi_q(x_1, y_1) + \phi_q(x_2, y_2) = \infty = \phi_q(x_3, y_3)$ as required.
(iv) If $(x_1, y_1) = (x_2, y_2)$ and $y_1 = 0$, then $(x_3, y_3) = \infty$ by definition. Then
\[ \phi_q(x_1, y_1) + \phi_q(x_2, y_2) = (x_1^q, 0) + (x_1^q, 0) = \infty \]
We showed in the case above that $\phi_q(\infty) = \infty$, so
\[ \phi_q(x_1, y_1) + \phi_q(x_2, y_2) = \infty = \phi_q(\infty) = \phi_q(x_3, y_3) \]
as required.
(v) If one of the points is $\infty$, say $(x_2, y_2) = \infty$, then $(x_3, y_3) = (x_1, y_1)$. So
\[ \phi_q(x_1, y_1) + \phi_q(x_2, y_2) = \phi_q(x_1, y_1) + \infty = \phi_q(x_1, y_1) = \phi_q(x_3, y_3) \]
as required.
So we have shown that $\phi_q$ is a homomorphism. Since $\phi_q(x, y) = (x^q, y^q)$, the map is given by rational functions, making $\phi_q$ an endomorphism. We can clearly see that the degree is $q$, and since $q \equiv 0$ in $\mathbb{F}_q$, the derivative $qx^{q-1}$ of $r_1(x) = x^q$ is identically zero, meaning $\phi_q$ is not separable.
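The homomorphism property of the Frobenius map can be tested by brute force on a small example. The sketch below is illustrative only and rests on several assumptions not taken from this project: the curve $y^2 = x^3 + x + 1$ over $\mathbb{F}_7$ is a hypothetical choice, the points are taken in the extension $\mathbb{F}_{49} = \mathbb{F}_7[i]$ with $i^2 = -1$ (valid because $7 \equiv 3 \pmod 4$), and all function names are made up for the example.

```python
from itertools import product

p, A, B = 7, 1, 1  # hypothetical curve y^2 = x^3 + x + 1 over F_7

# Elements of F_49 are pairs (a, b) standing for a + b*i with i^2 = -1.
def f_add(u, v):
    return ((u[0] + v[0]) % p, (u[1] + v[1]) % p)

def f_sub(u, v):
    return ((u[0] - v[0]) % p, (u[1] - v[1]) % p)

def f_mul(u, v):
    return ((u[0] * v[0] - u[1] * v[1]) % p, (u[0] * v[1] + u[1] * v[0]) % p)

def f_inv(u):
    # 1/(a + bi) = (a - bi)/(a^2 + b^2); the norm is nonzero for u != 0.
    n = pow((u[0] * u[0] + u[1] * u[1]) % p, -1, p)
    return (u[0] * n % p, (-u[1] * n) % p)

def f_pow(u, e):
    r = (1, 0)
    for _ in range(e):
        r = f_mul(r, u)
    return r

def rhs(x):
    return f_add(f_add(f_pow(x, 3), f_mul((A, 0), x)), (B, 0))

def ec_add(P, Q):
    """Group law on the curve; None plays the role of the point at infinity."""
    if P is None:
        return Q
    if Q is None:
        return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and f_add(y1, y2) == (0, 0):
        return None
    if P == Q:
        m = f_mul(f_add(f_mul((3, 0), f_mul(x1, x1)), (A, 0)), f_inv(f_mul((2, 0), y1)))
    else:
        m = f_mul(f_sub(y2, y1), f_inv(f_sub(x2, x1)))
    x3 = f_sub(f_sub(f_mul(m, m), x1), x2)
    y3 = f_sub(f_mul(m, f_sub(x1, x3)), y1)
    return (x3, y3)

def frob(P):
    """Frobenius map (x, y) -> (x^p, y^p), with infinity fixed."""
    return None if P is None else (f_pow(P[0], p), f_pow(P[1], p))

field = [(a, b) for a in range(p) for b in range(p)]
points = [None] + [(x, y) for x in field for y in field if f_mul(y, y) == rhs(x)]
is_hom = all(frob(ec_add(P, Q)) == ec_add(frob(P), frob(Q))
             for P, Q in product(points, repeat=2))
fixed = [P for P in points if frob(P) == P]  # exactly the points defined over F_7
print(is_hom, len(fixed))
```

The points fixed by the Frobenius map are those with coordinates in the base field, which is the observation used later when counting points over finite fields.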
The following is the key result of this section, which allows us to relate the degree of an endomorphism to the size of its kernel. If a homomorphism maps from $G$ to $H$ then the kernel is the set of elements mapped to $e_H$, the identity of $H$. Since a group homomorphism preserves identity elements, the identity element $e_G$ of $G$ must belong to the kernel. If this is the only element of the kernel then the homomorphism is injective.
Theorem 3.5. Let $\alpha \neq 0$ be a separable endomorphism of an elliptic curve $E$. Then
\[ \deg(\alpha) = \#\mathrm{Ker}(\alpha) \]
where $\mathrm{Ker}(\alpha)$ is the kernel of the homomorphism $\alpha : E(\overline{K}) \to E(\overline{K})$.
If $\alpha$ is not separable then
\[ \deg(\alpha) > \#\mathrm{Ker}(\alpha) \]
Proof Write $\alpha(x, y) = (r_1(x), yr_2(x))$ with $r_1(x) = p(x)/q(x)$, as above. Assume first that $\alpha$ is a separable endomorphism, so $r_1' \neq 0$:
\[ r_1' = [p(x)q(x)^{-1}]' = p'(x)q(x)^{-1} - p(x)q(x)^{-2}q'(x) \neq 0 \]
So we can multiply by $q(x)^2$ to see that $p'q - pq'$ is not the zero polynomial.
Let $S$ be the set of $x \in \overline{K}$ such that $(pq' - p'q)(x)\,q(x) = 0$. Since neither $pq' - p'q$ nor $q(x)$ is the zero polynomial, $S$ is the set of zeros of a non-zero polynomial and hence finite. Its image under $r_1(x)$ will hence be finite as well.
Let $(a, b) \in E(\overline{K})$ be such that
(i) $a \neq 0$, $b \neq 0$, $(a, b) \neq \infty$.
(ii) $\deg(p(x) - aq(x)) = \max\{\deg(p), \deg(q)\} = \deg(\alpha)$.
(iii) $a \notin r_1(S)$.
(iv) $(a, b) \in \alpha(E(\overline{K}))$.
We must prove that such an $(a, b)$ exists. Consider each of the conditions in turn:
(i) There are infinitely many $(a, b) \in E(\overline{K})$, since $\overline{K}$ is algebraically closed. So clearly we can exclude those with $a = 0$, $b = 0$ or $(a, b) = \infty$.
(ii) Let $p(x) = cx^n + $ lower order terms and $q(x) = dx^m + $ lower order terms. If $\deg(p) > \deg(q)$ then $n > m$, so $p - aq$ will clearly have degree $n$ as required. Similarly, if $\deg(p) < \deg(q)$ then the condition will always hold. So consider what happens when $n = m$. The condition will only fail if $c - ad = 0$. This happens for at most one value of $a$, namely $a = c/d$, so we simply choose a point with a different value of $a$.
(iii) We can always find a point that satisfies this condition, as $r_1(S)$ is finite but we have an infinite number of points.
(iv) There are infinitely many points in $E(\overline{K})$. If the set $\{r_1(x) \mid (x, y) \in E(\overline{K})\}$ were finite then for at least one $k \in \overline{K}$ there would be infinitely many $x$ with $r_1(x) = k$. This would mean that $r_1(x) - k = 0$ for infinitely many $x$, which implies that $r_1(x)$ is a constant. That would make its derivative zero and give us a contradiction. Hence the image of $r_1$ is infinite, making $\alpha(E(\overline{K}))$ an infinite set. So we can always find $(a, b) \in \alpha(E(\overline{K}))$.
So such a point $(a, b)$ exists. We want to prove that there are exactly $\deg(\alpha)$ points $(x_1, y_1) \in E(\overline{K})$ such that $\alpha(x_1, y_1) = (a, b)$. For such a point we have
\[ \frac{p(x_1)}{q(x_1)} = a, \qquad y_1 r_2(x_1) = b \]
Since $(a, b) \neq \infty$ we must have $q(x_1) \neq 0$, so by Lemma 3.2 $r_2(x_1)$ is defined. Since $b \neq 0$ and $y_1 r_2(x_1) = b$ we know that $r_2(x_1) \neq 0$, so we can set $y_1 = b/r_2(x_1)$. Therefore $x_1$ determines $y_1$, so we need only count how many values of $x_1$ satisfy
\[ p(x_1) = aq(x_1) \implies p(x_1) - aq(x_1) = 0 \]
By assumption (ii) $p(x) - aq(x) = 0$ has $\deg(\alpha)$ roots, counting multiplicities, so if all the roots are distinct we are done. We must show that $p - aq$ has no multiple roots. Suppose that $x_0$ is a multiple root of $p - aq$. Then we know that both the polynomial and its derivative are zero here:
\[ \begin{aligned}
p(x_0) - aq(x_0) = 0 &\implies p(x_0) = aq(x_0) \\
p'(x_0) - aq'(x_0) = 0 &\implies aq'(x_0) = p'(x_0)
\end{aligned} \]
Multiplying the two equations yields
\[ ap(x_0)q'(x_0) = ap'(x_0)q(x_0) \]
Since $a \neq 0$
\[ p(x_0)q'(x_0) - p'(x_0)q(x_0) = 0 \]
which implies that $x_0$ is a root of $pq' - p'q$, so $x_0 \in S$. Therefore $a = r_1(x_0) \in r_1(S)$, which is contrary to assumption (iii). Therefore $p - aq$ has $\deg(\alpha)$ distinct roots and hence there are $\deg(\alpha)$ points $(x_1, y_1) \in E(\overline{K})$ such that $\alpha(x_1, y_1) = (a, b)$.
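Since $\alpha$ is a homomorphism, the set of points mapping to any given element of $\alpha(E(\overline{K}))$ is a coset of the kernel, so every such element has the same number of preimages. As this number is $\deg(\alpha)$ for the point $(a, b)$, it is $\deg(\alpha)$ for all points of $\alpha(E(\overline{K}))$, including the identity, meaning the kernel of $\alpha$ has $\deg(\alpha)$ elements.
If $\alpha$ is not separable then the above steps hold, but $p' - aq'$ is always the zero polynomial, so $p(x) - aq(x) = 0$ always has multiple roots and so fewer than $\deg(\alpha)$ solutions.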
Theorem 3.6. Let $E$ be an elliptic curve defined over a field $K$. Let $\alpha \neq 0$ be an endomorphism of $E$. Then $\alpha : E(\overline{K}) \to E(\overline{K})$ is surjective.
Proof Let $(a, b) \in E(\overline{K})$. We want to prove that there is a point $(x, y) \in E(\overline{K})$ that $\alpha$ maps to it. Since $\alpha(\infty) = \infty$, we may assume that $(a, b) \neq \infty$. Let $r_1(x) = p(x)/q(x)$ as above. We consider two cases:
(i) If $p(x) - aq(x)$ is not a constant then it has a root, at $x_0$ say. Since $p$ and $q$ have no common roots we know $q(x_0) \neq 0$ (if it were zero, then it would imply $p(x_0) = 0$, which is contrary to assumption). So
\[ p(x_0) - aq(x_0) = 0 \implies a = \frac{p(x_0)}{q(x_0)} \]
Choose $y_0 \in \overline{K}$ to be either square root of $x_0^3 + Ax_0 + B$. Then $\alpha(x_0, y_0)$ is defined and equals $(a, b')$ for some $b'$. Since $(b')^2 = a^3 + Aa + B = b^2$ we have $b = \pm b'$. If $b' = b$ then we have found our point $(x, y)$ that maps to $(a, b)$ and we are done. If $b' = -b$ then $\alpha(x_0, -y_0) = (a, -b') = (a, b)$.
(ii) Now consider the case when $p - aq$ is constant. Since $E(\overline{K})$ is infinite and the kernel of $\alpha$ is finite, only finitely many points of $E(\overline{K})$ can map to a point with a given x-coordinate. So either $p(x)$ or $q(x)$ is not constant.
If $p$ and $q$ are two polynomials that are not both constant then there is at most one value of $a$ such that $p - aq$ is constant. Therefore there are at most two points, $(a, b)$ and $(a, -b)$, that are not mapped to by $\alpha$. Let $(a_1, b_1) = \alpha(P_1)$ be any other point. We can choose it such that $(a_1, b_1) + (a, b) \neq (a, \pm b)$. So there exists $P_2$ with $\alpha(P_2) = (a_1, b_1) + (a, b)$. Then $\alpha(P_2 - P_1) = (a, b)$ and $\alpha(P_1 - P_2) = (a, -b)$. So every point $(a, b)$ is mapped to by $\alpha$.
We have shown that if $\alpha \neq 0$ is an endomorphism of $E$ then every point $(a, b) \in E(\overline{K})$ is mapped to by a point $(x, y) \in E(\overline{K})$. Therefore $\alpha$ is surjective.
We next want to derive a criterion for separability (Proposition 3.10). If $(x, y)$ is a point on $y^2 = x^3 + Ax + B$, then we can differentiate to get
\[ 2yy' = 3x^2 + A \]
Similarly we can differentiate a rational function to get
\[ \frac{d}{dx} f(x, y) = f_x(x, y) + f_y(x, y)\,y' \]
where $f_x$ and $f_y$ are the partial derivatives.
Lemma 3.7. Let $E$ be the elliptic curve $y^2 = x^3 + Ax + B$. Fix a point $(u, v)$ on $E$. For any point $(x, y)$ with $x \neq u$ write
\[ (u, v) + (x, y) = (f(x, y), g(x, y)) \]
where $f(x, y)$ and $g(x, y)$ are rational functions whose coefficients depend on $(u, v)$. Then
\[ \frac{\frac{d}{dx} f(x, y)}{g(x, y)} = \frac{1}{y} \]
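Proof From the addition formulas we have
\[ \begin{aligned}
f(x, y) &= \left(\frac{y - v}{x - u}\right)^2 - u - x \\
g(x, y) &= \frac{y - v}{x - u}\left(u - \left(\frac{y - v}{x - u}\right)^2 + u + x\right) - v \\
&= \frac{y - v}{x - u}\cdot\frac{2u(x - u)^2 - (y - v)^2 + x(x - u)^2}{(x - u)^2} - v \\
&= \frac{-(y - v)^3 + x(y - v)(x - u)^2 + 2u(y - v)(x - u)^2 - v(x - u)^3}{(x - u)^3}
\end{aligned} \]
Then using the quotient rule we can calculate
\[ \begin{aligned}
\frac{d}{dx} f(x, y) &= \frac{2(x - u)^2(y - v)y' - 2(y - v)^2(x - u)(1)}{(x - u)^4} - 1 \\
&= \frac{2y'(y - v)(x - u) - 2(y - v)^2 - (x - u)^3}{(x - u)^3}
\end{aligned} \]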
Because $2yy' = 3x^2 + A$ we can substitute for $y'$ to give
\[ \begin{aligned}
\frac{d}{dx} f(x, y) &= \frac{2\left(\frac{3x^2 + A}{2y}\right)(y - v)(x - u) - 2(y - v)^2 - (x - u)^3}{(x - u)^3} \\
&= \frac{(3x^2 + A)(y - v)(x - u) - 2y(y - v)^2 - y(x - u)^3}{y(x - u)^3}
\end{aligned} \]
so that
\[ \begin{aligned}
y\frac{d}{dx} f(x, y) - g(x, y) &= \frac{(3x^2 + A)(y - v)(x - u) - 2y(y - v)^2 - y(x - u)^3}{(x - u)^3} \\
&\quad + \frac{(y - v)^3 - x(y - v)(x - u)^2 - 2u(y - v)(x - u)^2 + v(x - u)^3}{(x - u)^3}
\end{aligned} \]
Then
\[ \begin{aligned}
(x - u)^3\left[y\frac{d}{dx} f(x, y) - g(x, y)\right] &= (3x^2 + A)(y - v)(x - u) - 2y(y - v)^2 - y(x - u)^3 \\
&\quad + (y - v)^3 - x(y - v)(x - u)^2 - 2u(y - v)(x - u)^2 + v(x - u)^3 \\
&= -Avx + vu^3 - yu^3 + yv^2 + y^2v - Ayu + Avu - y^3 - v^3 + x^3y - x^3v + Ayx \\
&= v[Au + u^3 - v^2 - Ax - x^3 + y^2] + y[-Au - u^3 + v^2 + Ax + x^3 - y^2]
\end{aligned} \]
Because $(u, v)$ and $(x, y)$ lie on $E$ we can use $v^2 = u^3 + Au + B$ and $y^2 = x^3 + Ax + B$ to reduce the above expression:
\[ \begin{aligned}
(x - u)^3\left[y\frac{d}{dx} f(x, y) - g(x, y)\right] &= v[Au + u^3 - (u^3 + Au + B) - Ax - x^3 + (x^3 + Ax + B)] \\
&\quad + y[-Au - u^3 + (u^3 + Au + B) + Ax + x^3 - (x^3 + Ax + B)] \\
&= v[-B + B] + y[B - B] = 0
\end{aligned} \]
Then because $x \neq u$ this implies
\[ y\frac{d}{dx} f(x, y) = g(x, y) \]
which can be rearranged to give the desired result.
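Since the identity $y\,\frac{d}{dx}f = g$ is purely algebraic, it can be spot-checked with modular arithmetic. The following Python sketch is an illustration under stated assumptions: it works on the prime curve $E_{11}(1, 6)$ from the examples in this project, treats $y' = (3x^2 + A)/(2y)$ formally modulo $p$, and uses hypothetical function names.

```python
p, A, B = 11, 1, 6  # assumption: the curve E11(1, 6) from the text's examples

def inv(a):
    """Inverse modulo p (requires a not divisible by p)."""
    return pow(a % p, -1, p)

points = [(x, y) for x in range(p) for y in range(p)
          if (y * y - (x**3 + A * x + B)) % p == 0]

def lemma_holds(u, v, x, y):
    """Check y * df/dx == g (mod p) for (u, v) + (x, y) = (f, g), with x != u, y != 0."""
    m = (y - v) * inv(x - u) % p            # slope of the chord
    f = (m * m - u - x) % p                 # x-coordinate of the sum
    g = (m * (u - f) - v) % p               # y-coordinate of the sum
    dy = (3 * x * x + A) * inv(2 * y) % p   # y' from 2yy' = 3x^2 + A
    df = (2 * dy * (y - v) * (x - u) - 2 * (y - v) ** 2 - (x - u) ** 3) * inv((x - u) ** 3) % p
    return (y * df - g) % p == 0

all_ok = all(lemma_holds(u, v, x, y)
             for (u, v) in points for (x, y) in points if x != u)
print(all_ok)
```

This curve has no points with $y = 0$, so every pair of points with distinct x-coordinates can be tested.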
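Lemma 3.8. Let $\alpha_1, \alpha_2, \alpha_3$ be non-zero endomorphisms of an elliptic curve $E$ with $\alpha_1 + \alpha_2 = \alpha_3$. Write $\alpha_j(x, y) = (R_{\alpha_j}(x), yS_{\alpha_j}(x))$. Suppose there are constants $c_{\alpha_1}, c_{\alpha_2}$ such that
\[ \frac{R_{\alpha_1}'(x)}{S_{\alpha_1}(x)} = c_{\alpha_1} \quad \text{and} \quad \frac{R_{\alpha_2}'(x)}{S_{\alpha_2}(x)} = c_{\alpha_2} \]
Then
\[ \frac{R_{\alpha_3}'(x)}{S_{\alpha_3}(x)} = c_{\alpha_1} + c_{\alpha_2} \]
Proof Let $(x_1, y_1)$ and $(x_2, y_2)$ be variable points on $E$, with $x_1 \neq x_2$. Write
\[ (x_3, y_3) = (x_1, y_1) + (x_2, y_2) \]
where
\[ (x_1, y_1) = \alpha_1(x, y), \qquad (x_2, y_2) = \alpha_2(x, y) \]
Then $x_3$ and $y_3$ are rational functions of $x_1, y_1, x_2, y_2$, which in turn are rational functions of $x, y$. By Lemma 3.7 with $(x, y) = (x_1, y_1)$ and $(u, v) = (x_2, y_2)$
\[ \frac{\partial x_3}{\partial x_1} = \frac{y_3}{y_1} \]
Similarly, with $(x, y) = (x_2, y_2)$ and $(u, v) = (x_1, y_1)$
\[ \frac{\partial x_3}{\partial x_2} = \frac{y_3}{y_2} \]
By assumption
\[ \frac{\partial x_j}{\partial x} = c_{\alpha_j}\frac{y_j}{y} \]
for $j = 1, 2$. So by the chain rule
\[ \frac{dx_3}{dx} = \frac{\partial x_3}{\partial x_1}\frac{\partial x_1}{\partial x} + \frac{\partial x_3}{\partial x_2}\frac{\partial x_2}{\partial x} = \frac{y_3}{y_1}\frac{y_1}{y}c_{\alpha_1} + \frac{y_3}{y_2}\frac{y_2}{y}c_{\alpha_2} = (c_{\alpha_1} + c_{\alpha_2})\frac{y_3}{y} \]
Then dividing by $y_3/y$ gives the result.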
Proposition 3.9. Let $E$ be an elliptic curve defined over a field $K$, and let $n$ be a nonzero integer. Suppose that multiplication by $n$ on $E$ is given by
\[ n(x, y) = (R_n(x), yS_n(x)) \]
for all $(x, y) \in E(\overline{K})$, where $R_n$ and $S_n$ are rational functions. Then
\[ \frac{R_n'(x)}{S_n(x)} = n \]
This then implies that multiplication by $n$ is separable if and only if $n$ is not a multiple of the characteristic $p$ of the field.
Proof We showed earlier that $R_{-n} = R_n$ and $S_{-n} = -S_n$, and so we have $R_{-n}'/S_{-n} = -R_n'/S_n$. Therefore the result for positive $n$ will imply the result for negative $n$.
We will prove that $R_n'(x)/S_n(x) = n$ for all positive $n$ using proof by mathematical induction (PMI). We can see this is trivially true for $n = 1$, where $R_1(x) = x$ and $S_1(x) = 1$. Suppose that it is true for $n$; then Lemma 3.8, applied to the endomorphisms $n$ and $1$, implies that it is true for the sum, $n + 1$. Therefore
\[ \frac{R_n'(x)}{S_n(x)} = n \]
for all $n \geq 1$ by PMI. This, coupled with the fact that if it holds for positive $n$ then it holds for negative $n$, implies the result for all nonzero integers $n$.
Now for multiplication by $n$ to be separable we need $R_n'(x) \neq 0$. This will be the case if and only if $n = R_n'(x)/S_n(x) \neq 0$ in $K$, which is equivalent to $p$ not dividing $n$. So this proves the second part of the proposition: multiplication by $n$ is separable if and only if $p \nmid n$.
Proposition 3.10. Let $E$ be an elliptic curve defined over $\mathbb{F}_q$, where $q$ is a power of the prime $p$. Let $r$ and $s$ be integers, not both 0. The endomorphism $r\phi_q + s$ is separable if and only if $p \nmid s$. (Here $\phi_q$ is the Frobenius map.)
Proof Let the endomorphism that describes multiplication by $r$ be
\[ r(x, y) = (R_r(x), yS_r(x)) \]
Then the endomorphism $r\phi_q$ is given by
\[ \begin{aligned}
(R_{r\phi_q}(x), yS_{r\phi_q}(x)) = (r\phi_q)(x, y) &= (R_r^q(x), y^qS_r^q(x)) \\
&= (R_r^q(x), y(x^3 + Ax + B)^{(q-1)/2}S_r^q(x))
\end{aligned} \]
Therefore
\[ c_{r\phi_q} = \frac{R_{r\phi_q}'}{S_{r\phi_q}} = \frac{qR_r^{q-1}R_r'}{S_{r\phi_q}} = 0 \]
Also $c_s = R_s'/S_s = s$ by Proposition 3.9. So by Lemma 3.8
\[ \frac{R_{r\phi_q + s}'}{S_{r\phi_q + s}} = c_{r\phi_q + s} = c_{r\phi_q} + c_s = 0 + s = s \]
Therefore $R_{r\phi_q + s}' \neq 0$ (and hence the endomorphism is separable) if and only if $p \nmid s$.
3.2 Torsion points
The torsion points are those points in $E$ whose orders are finite. Let $E$ be an elliptic curve defined over a field $K$, with algebraic closure $\overline{K}$, and let $n$ be a positive integer. For a given $n$ we define the subgroup
\[ E[n] = \{P \in E(\overline{K}) \mid nP = \infty\} \]
This group is the kernel of the multiplication by $n$ endomorphism, which maps $P \mapsto nP$. We will start by looking at the form of $E[2]$ and $E[3]$ before moving on to the general case.
When the characteristic is not 2, $E$ can be expressed in the form
\[ y^2 = x^3 + a_2'x^2 + a_4'x + a_6 = (x - e_1)(x - e_2)(x - e_3) \]
with $e_1, e_2, e_3 \in \overline{K}$. It is easy to calculate $E[2]$, as a point satisfies $2P = \infty$ if and only if the tangent line at $P$ is vertical. When we have a curve in characteristic not 2 this only happens when $y = 0$, so
\[ E[2] = \{\infty, (e_1, 0), (e_2, 0), (e_3, 0)\} \]
Because $E[n]$ is a finite abelian group we can apply Theorem B.6 here. When the characteristic is not 2, $E[2]$ is a group of order 4 and so isomorphic to either $\mathbb{Z}_4$ or $\mathbb{Z}_2 \oplus \mathbb{Z}_2$. We know the group is not cyclic, as all points other than $\infty$ have order 2, so we conclude that in this case
\[ E[2] \simeq \mathbb{Z}_2 \oplus \mathbb{Z}_2 \]
If the characteristic is 2 then, from Appendix A.3, $E$ has one of the following forms:
(I) $y^2 + xy + x^3 + a_2x^2 + a_6 = 0$
(II) $y^2 + a_3y + x^3 + a_4x + a_6 = 0$
In the first case $a_6 \neq 0$ and in the second case $a_3 \neq 0$, otherwise the curves would be singular. If $P = (x, y)$ is a point of order 2 then once again the tangent at $P$ must be vertical. This time, however, the curve is not symmetric about the x-axis, so we look for the points where the partial derivative with respect to $y$ vanishes:
(I) $f_y = 2y + x \equiv x \pmod 2$
(II) $f_y = 2y + a_3 \equiv a_3 \pmod 2$
So in the first case we need $x = 0$, meaning $0 = y^2 + a_6 = (y + \sqrt{a_6})^2$. Therefore $(0, \sqrt{a_6})$ is the only point of order 2 and
\[ E[2] = \{\infty, (0, \sqrt{a_6})\} \simeq \mathbb{Z}_2 \]
In the second case the partial derivative with respect to $y$ is $a_3 \neq 0$. Therefore there is no point of order 2, so
\[ E[2] = \{\infty\} \simeq \mathbb{Z}_1 \]
We denote the trivial group, with only one element, by 0. The following proposition summarises these results.
Proposition 3.11. Let $E$ be an elliptic curve over a field $K$. If the characteristic of $K$ is not 2 then
\[ E[2] \simeq \mathbb{Z}_2 \oplus \mathbb{Z}_2 \]
If the characteristic of $K$ is 2 then $E[2] \simeq 0$ or $\mathbb{Z}_2$.
Now consider $E[3]$. Assume first that the characteristic is neither 2 nor 3, in which case $E$ is given by $y^2 = x^3 + Ax + B$. A point $P$ satisfies $3P = \infty$ if and only if $2P = -P$. This means that the x-coordinate of $2P$ equals the x-coordinate of $P$, while the y-coordinates differ in sign. (If the y-coordinates were equal then $2P = P$, implying $P = \infty$.) So using the addition equations
\[ m^2 - 2x = x, \qquad m = \frac{3x^2 + A}{2y} \]
Hence
\[ \begin{aligned}
\frac{(3x^2 + A)^2}{4y^2} &= 3x \\
(3x^2 + A)^2 &= 12x(x^3 + Ax + B) \\
3x^4 + 6Ax^2 + 12Bx - A^2 &= 0
\end{aligned} \]
The discriminant of this polynomial is $-6912(4A^3 + 27B^2)^2$, which is non-zero since we assumed the roots of the Weierstrass equation were distinct. So this polynomial has no multiple roots, meaning there are 4 distinct values of $x \in \overline{K}$, each yielding 2 values of $y$, summing to 8 points of order 3. Since $\infty$ is also in $E[3]$ we see that $E[3]$ is a group of order 9, so from Theorem B.6 we know that it is isomorphic to either $\mathbb{Z}_9$ or $\mathbb{Z}_3 \oplus \mathbb{Z}_3$. But every element is 3-torsion, so no point has order 9, meaning the group is not cyclic. Therefore
\[ E[3] \simeq \mathbb{Z}_3 \oplus \mathbb{Z}_3 \]
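Over a prime field only some of the 3-torsion points have coordinates in the field, but the criterion itself can still be tested: an affine point should have order 3 exactly when its x-coordinate is a root of $3x^4 + 6Ax^2 + 12Bx - A^2$. The Python sketch below is a hedged illustration; the curve $y^2 = x^3 + 1$ over $\mathbb{F}_7$ and the function names are assumptions chosen for the example, not taken from the project.

```python
p, A, B = 7, 0, 1  # hypothetical curve y^2 = x^3 + 1 over F_7 (4A^3 + 27B^2 != 0 mod 7)

def ec_add(P, Q):
    """Group law on y^2 = x^3 + A x + B over F_p; None is the point at infinity."""
    if P is None:
        return Q
    if Q is None:
        return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return None
    if P == Q:
        m = (3 * x1 * x1 + A) * pow(2 * y1 % p, -1, p) % p
    else:
        m = (y2 - y1) * pow((x2 - x1) % p, -1, p) % p
    x3 = (m * m - x1 - x2) % p
    y3 = (m * (x1 - x3) - y1) % p
    return (x3, y3)

def cubic_torsion_poly(x):
    """The polynomial 3x^4 + 6Ax^2 + 12Bx - A^2 reduced mod p."""
    return (3 * x**4 + 6 * A * x**2 + 12 * B * x - A * A) % p

affine = [(x, y) for x in range(p) for y in range(p)
          if (y * y - (x**3 + A * x + B)) % p == 0]
order3 = [P for P in affine if ec_add(ec_add(P, P), P) is None]
roots = [P for P in affine if cubic_torsion_poly(P[0]) == 0]
print(order3, roots)
```

Here only two of the eight points of order 3 are defined over $\mathbb{F}_7$; the rest live in extension fields, which is why $E[n]$ is defined over the algebraic closure.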
Next assume we are in characteristic 3, meaning we have an equation of the form $y^2 = x^3 + a_2x^2 + a_4x + a_6$. We can compute the x-coordinate of $2P$ by the usual method. We first use implicit differentiation to calculate the gradient of the tangent, $m = (2a_2x + a_4)/(2y)$ (since $3x^2 \equiv 0$), and then we substitute in $E$ and note that the $x^2$ coefficient contributes an extra term this time. So setting the x-coordinate of $2P$ equal to that of $P$ gives
\[ \begin{aligned}
\left(\frac{2a_2x + a_4}{2y}\right)^2 - a_2 &= 3x \equiv 0 \\
(4a_2^2x^2 + a_4^2 + 4a_2a_4x) - 4a_2y^2 &= 0 \\
a_2^2x^2 + a_4^2 + a_2a_4x - a_2(x^3 + a_2x^2 + a_4x + a_6) &= 0 \\
a_2x^3 + a_2a_6 - a_4^2 &= 0
\end{aligned} \]
Recall that $3 \equiv 0$ and $4 \equiv 1$ in characteristic 3.
Note that we cannot have $a_2 = a_4 = 0$, as then $y^2 = x^3 + a_6 = (x + a_6^{1/3})^3$ has multiple roots. If $a_2 = 0$ then we get $-a_4^2 = 0$, which cannot happen, so $E[3] = \{\infty\} \simeq \mathbb{Z}_1$ in this case. If $a_2 \neq 0$ then the equation becomes $a_2(x^3 + a) = 0$ for some constant $a$. This has a single triple root, so there is one value of $x$ and 2 corresponding values of $y$, meaning two points of order 3. Since $\infty$ is also a point we see that $E[3]$ has order 3, so $E[3] \simeq \mathbb{Z}_3$.
Finally assume that we are in characteristic 2. We can use the addition formulas from Appendix A.3 to show that $E[3] \simeq \mathbb{Z}_3 \oplus \mathbb{Z}_3$. As before we have two possibilities:
(I) If $y^2 + xy = x^3 + a_2x^2 + a_6$ then calculating $2P$ and setting the x-coordinate equal to the x-coordinate of $P$ gives
\[ \begin{aligned}
x &= \frac{x^4 + a_6}{x^2} \\
0 &= x^4 - x^3 + a_6
\end{aligned} \]
The discriminant of this polynomial is $256a_6^3 - 27a_6^2 \equiv a_6^2 \pmod 2$. We cannot have $a_6 = 0$, as then the curve would be singular, so we conclude the discriminant is non-zero. So the polynomial has 4 roots, giving 8 points of order 3. Therefore, as before, $E[3] \simeq \mathbb{Z}_3 \oplus \mathbb{Z}_3$.
(II) If $y^2 + a_3y = x^3 + a_4x + a_6$ then we get
\[ \begin{aligned}
x &= \frac{x^4 + a_4^2}{a_3^2} \\
0 &= x^4 + a_4^2 - xa_3^2
\end{aligned} \]
The discriminant of this polynomial is $-27(a_3^2)^4 + 256(a_4^2)^3 \equiv a_3^8 \pmod 2$. We cannot have $a_3 = 0$, as then the curve would be singular, so we conclude the discriminant is non-zero and hence $E[3] \simeq \mathbb{Z}_3 \oplus \mathbb{Z}_3$.
So to conclude, if we are in characteristic not 3, then $E[3] \simeq \mathbb{Z}_3 \oplus \mathbb{Z}_3$, while if we are in characteristic 3, then $E[3] \simeq \mathbb{Z}_3$ or $\mathbb{Z}_1$. The following theorem describes the general case.
Theorem 3.12. Let $E$ be an elliptic curve over a field $K$, and let $n$ be a positive integer. If the characteristic of $K$ does not divide $n$, or is zero, then
\[ E[n] \simeq \mathbb{Z}_n \oplus \mathbb{Z}_n \]
If the characteristic of $K$ is $p > 0$ and $p \mid n$, write $n = p^rn'$ with $p \nmid n'$. Then
\[ E[n] \simeq \mathbb{Z}_{n'} \oplus \mathbb{Z}_{n'} \quad \text{or} \quad \mathbb{Z}_n \oplus \mathbb{Z}_{n'} \]
This theorem will be proved in the next section, but notice how it covers the two examples we have just looked at. For example, when $n = 3$, as long as the characteristic did not divide 3 (i.e. was not 3) we had $E[3] \simeq \mathbb{Z}_n \oplus \mathbb{Z}_n = \mathbb{Z}_3 \oplus \mathbb{Z}_3$. When the characteristic was 3, we could write $3 = 3^1 \times 1$ and then $E[3] \simeq \mathbb{Z}_1 \oplus \mathbb{Z}_1 = \mathbb{Z}_1$ or $\mathbb{Z}_3 \oplus \mathbb{Z}_1 = \mathbb{Z}_3$.
An elliptic curve $E$ in characteristic $p$ is called ordinary if $E[p] \simeq \mathbb{Z}_p$. It is called supersingular if $E[p] \simeq 0$, and so only contains the point $\infty$. As expected, this was one of the possibilities for $E[3]$ in the characteristic 3 case above.
3.2.1 Successive doubling
Recall that if $P$ is a point on an elliptic curve and $k$ is a positive integer, then $kP$ denotes $P + P + ... + P$ (with $k$ summands). If $k$ is a large integer it is more efficient to use successive doubling, as used below to compute $19P$.
\[ 2P = P + P, \quad 4P = 2P + 2P, \quad 8P = 4P + 4P, \quad 16P = 8P + 8P, \quad 19P = 16P + 2P + P \]
The only problem is that if we are working in the rational numbers the size of the coordinates increases rapidly. This is not a problem when working with finite fields though, as we can continually reduce modulo $p$. The following algorithm uses successive doubling to calculate $kP$.
The Successive Doubling Algorithm
Let $k$ be a positive integer and let $P$ be a point on an elliptic curve.
The following procedure computes $kP$.
1. Set $a = k$, $B = \infty$ and $C = P$.
2. If $a$ is even let $a = a/2$, and let $B = B$, $C = 2C$.
3. If $a$ is odd let $a = a - 1$, and let $B = B + C$, $C = C$.
4. If $a \neq 0$ go to step 2.
5. Output $B$.
The output, $B$, is $kP$.
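The steps above can be sketched in Python as follows. This is an illustrative translation rather than the project's SUCDOB.m; the function names `ec_add` and `ec_mul` are hypothetical, the accumulator $B$ of the algorithm is called `acc` to avoid a clash with the curve coefficient $B$, and `None` stands for $\infty$.

```python
def ec_add(P, Q, A, p):
    """Add two points on y^2 = x^3 + A x + B over F_p; None is the point at infinity."""
    if P is None:
        return Q
    if Q is None:
        return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return None                       # P + (-P), including doubling a point with y = 0
    if P == Q:
        m = (3 * x1 * x1 + A) * pow(2 * y1 % p, -1, p) % p   # tangent slope
    else:
        m = (y2 - y1) * pow((x2 - x1) % p, -1, p) % p        # chord slope
    x3 = (m * m - x1 - x2) % p
    y3 = (m * (x1 - x3) - y1) % p
    return (x3, y3)

def ec_mul(k, P, A, p):
    """Compute kP by the successive doubling algorithm (steps 1 to 5 above)."""
    a, acc, C = k, None, P                # step 1: a = k, B = infinity, C = P
    while a != 0:                         # step 4: repeat until a = 0
        if a % 2 == 0:                    # step 2: halve a and double C
            a //= 2
            C = ec_add(C, C, A, p)
        else:                             # step 3: decrement a and add C into B
            a -= 1
            acc = ec_add(acc, C, A, p)
    return acc                            # step 5: output B

# Usage on the curve y^2 = x^3 + x + 6 (mod 11) with the point (2, 7).
print(ec_mul(12, (2, 7), 1, 11), ec_mul(13, (2, 7), 1, 11))
```

The number of loop iterations is at most about twice the bit length of $k$, which is what makes the method practical for the very large multipliers used in cryptography.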
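Example 3.4. Consider $E_{11}(1, 6)$ from Example 2.2, which was defined by
\[ y^2 \equiv x^3 + x + 6 \pmod{11} \]
Let $G = (2, 7)$ and suppose we wish to compute $G, 2G, ..., 13G$. Working from the addition formulas:
\[ \begin{aligned}
2G = 1G + 1G &= \left(\frac{-615}{196}, \frac{-6117}{2744}\right) \equiv \left(\frac{1}{9}, \frac{10}{5}\right) \equiv (5, 2) \pmod{11} \\
3G = 2G + 1G &= \left(\frac{-38}{9}, \frac{-469}{27}\right) \equiv \left(\frac{6}{9}, \frac{4}{5}\right) \equiv (6 \cdot 5, 4 \cdot 9) \equiv (8, 3) \pmod{11}
\end{aligned} \]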
We perform the rest of the calculations with ECADP.m:
4G = 2G + 2G = (10, 2)        9G = 5G + 4G = (10, 9)
5G = 4G + 1G = (3, 6)        10G = 5G + 5G = (8, 8)
6G = 3G + 3G = (7, 9)        11G = 8G + 3G = (5, 9)
7G = 4G + 3G = (7, 2)        12G = 6G + 6G = (2, 4)
8G = 4G + 4G = (3, 5)        13G = 6G + 7G = ∞
As expected, all of these points lie on $E_{11}(1, 6)$; in fact this has generated the whole of $E_{11}(1, 6)$. This means that $E_{11}(1, 6)$ is a cyclic group with $G = (2, 7)$ a generator.
If we had just wanted to calculate $13G$, however, we could have used the successive doubling algorithm. This would have taken only 6 steps, as opposed to the 12 used above:
(1) a = 13, B = ∞,   C = G
(2) a = 12, B = G,   C = G
(3) a = 6,  B = G,   C = 2G
(4) a = 3,  B = G,   C = 4G
(5) a = 2,  B = 5G,  C = 4G
(6) a = 1,  B = 5G,  C = 8G
(7) a = 0,  B = 13G, C = 8G
An m-file to perform the successive doubling algorithm over prime curves (SUCDOB.m) was created and can be found in Appendix C.5. This m-file takes as its inputs X1, Y1, k, A, p and outputs X2, Y2 where
\[ (X2, Y2) = k(X1, Y1) = (X1, Y1) + (X1, Y1) + ... + (X1, Y1) \quad (k \text{ summands}) \]
and addition is performed over the elliptic curve
\[ y^2 \equiv x^3 + Ax + B \pmod p \]
Testing this m-file on the example above gives $12G = (2, 4)$ and $13G = \infty$ as expected.
3.2.2 The basis for E[n]
Let $n$ be a positive integer not divisible by the characteristic of $K$. We show here (for use in the following sections) that we can find a basis $\{\beta_1, \beta_2\}$ for $E[n] \simeq \mathbb{Z}_n \oplus \mathbb{Z}_n$. Every element of $E[n]$ can be expressed in the form $m_1\beta_1 + m_2\beta_2$ with integers $m_1, m_2$ that are uniquely determined mod $n$. Let $\alpha : E(\overline{K}) \to E(\overline{K})$ be a homomorphism. $\alpha$ maps $E[n]$ to $E[n]$, so there exist $a, b, c, d \in \mathbb{Z}_n$ such that
\[ \alpha(\beta_1) = a\beta_1 + c\beta_2, \qquad \alpha(\beta_2) = b\beta_1 + d\beta_2 \]
Therefore each homomorphism is represented by a $2 \times 2$ matrix
\[ \alpha_n = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \]
Composition of homomorphisms then corresponds to multiplication of the corresponding matrices.
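Example 3.5. Let $E$ be the elliptic curve defined over $\mathbb{R}$ by $y^2 = x^3 - 2$ and let $n = 2$. Then
\[ E[2] = \{\infty, (2^{1/3}, 0), (\zeta 2^{1/3}, 0), (\zeta^2 2^{1/3}, 0)\} \]
where $\zeta$ is a non-trivial cube root of unity. Let
\[ \beta_1 = (2^{1/3}, 0), \qquad \beta_2 = (\zeta 2^{1/3}, 0) \]
Then $\{\beta_1, \beta_2\}$ is a basis for $E[2]$, and $\beta_3 = (\zeta^2 2^{1/3}, 0) = \beta_1 + \beta_2$.
Let $\alpha : E(\mathbb{C}) \to E(\mathbb{C})$ represent complex conjugation: $\alpha(x, y) = (\overline{x}, \overline{y})$, where $\overline{x}$ is the complex conjugate of $x$. It is easy to verify that $\alpha$ is a homomorphism: $\overline{P_1 + P_2} = \overline{P_1} + \overline{P_2}$, which is the same as $\alpha(P_1 + P_2) = \alpha(P_1) + \alpha(P_2)$. We have
\[ \alpha(\beta_1) = 1 \cdot \beta_1 + 0 \cdot \beta_2, \qquad \alpha(\beta_2) = 1 \cdot \beta_1 + 1 \cdot \beta_2 = \beta_3 \]
Therefore
\[ \alpha_2 = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} \]
Note that $\alpha_2^2$ is the identity matrix mod 2, which corresponds to the fact that $\alpha \circ \alpha$ is the identity homomorphism.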
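The matrix calculation in this example is small enough to confirm directly. The following Python sketch is illustrative only; the helper names are hypothetical, and points of $E[2]$ are represented by their coordinate vectors $(m_1, m_2)$ with respect to the basis $\{\beta_1, \beta_2\}$.

```python
def mat_mul_mod(M, N, n):
    """Product of two 2x2 matrices with entries reduced mod n."""
    return [[sum(M[i][k] * N[k][j] for k in range(2)) % n for j in range(2)]
            for i in range(2)]

def apply_mod(M, v, n):
    """Apply a 2x2 matrix to a coordinate vector (m1, m2), reduced mod n."""
    return tuple(sum(M[i][k] * v[k] for k in range(2)) % n for i in range(2))

alpha_2 = [[1, 1], [0, 1]]        # matrix of complex conjugation on E[2]
beta_1, beta_2 = (1, 0), (0, 1)   # coordinates of the basis points

print(apply_mod(alpha_2, beta_1, 2))      # coordinates of alpha(beta_1)
print(apply_mod(alpha_2, beta_2, 2))      # coordinates of alpha(beta_2)
print(mat_mul_mod(alpha_2, alpha_2, 2))   # matrix of alpha composed with alpha
```

The columns of the matrix are the images of the basis points, so composing conjugation with itself corresponds to squaring the matrix modulo 2.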
3.3 Division polynomials
This section aims to prove Theorem 3.12 as well as obtain other results for use in Chapter 4. Define the division polynomials $\psi_m \in \mathbb{Z}[x, y, A, B]$ by
\[ \begin{aligned}
\psi_0 &= 0 \\
\psi_1 &= 1 \\
\psi_2 &= 2y \\
\psi_3 &= 3x^4 + 6Ax^2 + 12Bx - A^2 \\
\psi_4 &= 4y(x^6 + 5Ax^4 + 20Bx^3 - 5A^2x^2 - 4ABx - 8B^2 - A^3) \\
\psi_{2m+1} &= \psi_{m+2}\psi_m^3 - \psi_{m-1}\psi_{m+1}^3, \quad m \geq 2 \\
\psi_{2m} &= (2y)^{-1}(\psi_m)(\psi_{m+2}\psi_{m-1}^2 - \psi_{m-2}\psi_{m+1}^2), \quad m \geq 3
\end{aligned} \]
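To get a feel for the recursion, the division polynomials can be evaluated at a single point modulo a prime. The sketch below is a Python illustration under stated assumptions: it uses the point $G = (2, 7)$ on $E_{11}(1, 6)$ from Example 3.4, the function name is hypothetical, and it relies on the standard fact (consistent with Theorem 3.18 later in this section) that $\psi_n$ vanishes at an affine point $P$ when $nP = \infty$.

```python
from functools import lru_cache

def division_poly_at(x, y, A, B, p):
    """Return a function n -> psi_n(x, y) mod p for a fixed point (x, y) with y != 0."""
    inv2y = pow(2 * y % p, -1, p)

    @lru_cache(maxsize=None)
    def psi(n):
        if n == 0:
            return 0
        if n == 1:
            return 1
        if n == 2:
            return 2 * y % p
        if n == 3:
            return (3 * x**4 + 6 * A * x**2 + 12 * B * x - A * A) % p
        if n == 4:
            return 4 * y * (x**6 + 5 * A * x**4 + 20 * B * x**3 - 5 * A * A * x**2
                            - 4 * A * B * x - 8 * B * B - A**3) % p
        m = n // 2
        if n % 2 == 1:   # n = 2m + 1 with m >= 2
            return (psi(m + 2) * psi(m) ** 3 - psi(m - 1) * psi(m + 1) ** 3) % p
        # n = 2m with m >= 3
        return inv2y * psi(m) * (psi(m + 2) * psi(m - 1) ** 2 - psi(m - 2) * psi(m + 1) ** 2) % p

    return psi

psi = division_poly_at(2, 7, 1, 6, 11)
values = [psi(n) for n in range(1, 14)]
print(values)
```

Because $G$ has order 13 in Example 3.4, the first index at which the value is zero should be $n = 13$.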
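Lemma 3.13. $\psi_n$ is a polynomial in $\mathbb{Z}[x, y^2, A, B]$ when $n$ is odd, and a polynomial in $2y\mathbb{Z}[x, y^2, A, B]$ when $n$ is even.
Proof We can see the lemma is true for $n \leq 4$. Assume for induction that the lemma holds for all $n < 2m$, where $2m > 4$, so $m > 2$. We must now prove that the lemma holds for $n = 2m$ and $n = 2m + 1$ to prove the lemma with PMI. Because $2m > m + 2$ we can see that all polynomials in the definition of $\psi_{2m}$ and $\psi_{2m+1}$ satisfy the induction assumptions.
First consider the case when $m$ is even. Then $\psi_m, \psi_{m+2}, \psi_{m-2}$ are in $2y\mathbb{Z}[x, y^2, A, B]$ and $\psi_{m-1}$ and $\psi_{m+1}$ are in $\mathbb{Z}[x, y^2, A, B]$, so
\[ \begin{aligned}
\psi_{m+2}\psi_m^3 &\in 2^4y^4\mathbb{Z}[x, y^2, A, B] \subset \mathbb{Z}[x, y^2, A, B] \\
\psi_{m-1}\psi_{m+1}^3 &\in \mathbb{Z}[x, y^2, A, B] \\
\therefore\ \psi_{2m+1} &\in \mathbb{Z}[x, y^2, A, B]
\end{aligned} \]
Similarly
\[ \begin{aligned}
\psi_{m+2}\psi_{m-1}^2 &\in 2y\mathbb{Z}[x, y^2, A, B] \\
\psi_{m-2}\psi_{m+1}^2 &\in 2y\mathbb{Z}[x, y^2, A, B] \\
\psi_{m+2}\psi_{m-1}^2 - \psi_{m-2}\psi_{m+1}^2 &\in 2y\mathbb{Z}[x, y^2, A, B] \\
(2y)^{-1}(\psi_{m+2}\psi_{m-1}^2 - \psi_{m-2}\psi_{m+1}^2) &\in \mathbb{Z}[x, y^2, A, B] \\
\therefore\ \psi_{2m} &\in 2y\mathbb{Z}[x, y^2, A, B]
\end{aligned} \]
Now consider the case when $m$ is odd. Then $\psi_{m-1}$ and $\psi_{m+1}$ are in $2y\mathbb{Z}[x, y^2, A, B]$ while $\psi_m, \psi_{m+2}, \psi_{m-2}$ are in $\mathbb{Z}[x, y^2, A, B]$, so
\[ \begin{aligned}
\psi_{m+2}\psi_m^3 &\in \mathbb{Z}[x, y^2, A, B] \\
\psi_{m-1}\psi_{m+1}^3 &\in 2^4y^4\mathbb{Z}[x, y^2, A, B] \subset \mathbb{Z}[x, y^2, A, B] \\
\therefore\ \psi_{2m+1} &\in \mathbb{Z}[x, y^2, A, B]
\end{aligned} \]
Similarly
\[ \begin{aligned}
\psi_{m+2}\psi_{m-1}^2 &\in 2^2y^2\mathbb{Z}[x, y^2, A, B] \\
\psi_{m-2}\psi_{m+1}^2 &\in 2^2y^2\mathbb{Z}[x, y^2, A, B] \\
\psi_{m+2}\psi_{m-1}^2 - \psi_{m-2}\psi_{m+1}^2 &\in 2^2y^2\mathbb{Z}[x, y^2, A, B] \\
(2y)^{-1}(\psi_{m+2}\psi_{m-1}^2 - \psi_{m-2}\psi_{m+1}^2) &\in 2y\mathbb{Z}[x, y^2, A, B] \\
\therefore\ \psi_{2m} &\in 2y\mathbb{Z}[x, y^2, A, B]
\end{aligned} \]
So we have proved the lemma with PMI for both choices of $m$.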
Define the polynomials
\[ \begin{aligned}
\phi_n &= x\psi_n^2 - \psi_{n+1}\psi_{n-1} \\
\omega_n &= (4y)^{-1}(\psi_{n+2}\psi_{n-1}^2 - \psi_{n-2}\psi_{n+1}^2)
\end{aligned} \]
Lemma 3.14. $\phi_n \in \mathbb{Z}[x, y^2, A, B]$ for all $n$. If $n$ is odd then $\omega_n \in y\mathbb{Z}[x, y^2, A, B]$, while if $n$ is even then $\omega_n \in \mathbb{Z}[x, y^2, A, B]$.
Proof This proof is a lengthy but simple application of PMI. The proof can be found in Appendix A.6.
Next consider an elliptic curve $y^2 = x^3 + Ax + B$ with no multiple roots ($4A^3 + 27B^2 \neq 0$). We don't specify what field $A, B$ are in, so we treat them as variables. We regard the polynomials in $\mathbb{Z}[x, y^2, A, B]$ as polynomials in $\mathbb{Z}[x, A, B]$ by substituting for $y^2$. Note that $\psi_n$ is not necessarily a polynomial in $x$ alone, but $\psi_n^2(x)$ is.
Lemma 3.15. When considering points on the elliptic curve $y^2 = x^3 + Ax + B$
(i) $\psi_n^2(x) = n^2x^{n^2 - 1} + $ lower degree terms
(ii) $\phi_n(x) = x^{n^2} + $ lower degree terms
Proof This is another lengthy but simple use of PMI, which can be found in Appendix A.6.
Lemma 3.16. Let $\Delta = 4A^3 + 27B^2$ and let
\[ \begin{aligned}
F(x, z) &= x^4 - 2Ax^2z^2 - 8Bxz^3 + A^2z^4 \\
G(x, z) &= 4z(x^3 + Axz^2 + Bz^3) \\
f_1(x, z) &= 12x^2z + 16Az^3 \\
g_1(x, z) &= 3x^3 - 5Axz^2 - 27Bz^3 \\
f_2(x, z) &= 4\Delta x^3 - 4A^2Bx^2z + 4A(3A^3 + 22B^2)xz^2 + 12B(A^3 + 8B^2)z^3 \\
g_2(x, z) &= A^2Bx^3 + A(5A^3 + 32B^2)x^2z + 2B(13A^3 + 96B^2)xz^2 - 3A^2(A^3 + 8B^2)z^3
\end{aligned} \]
Then by simply multiplying out the brackets we see
\[ \begin{aligned}
Ff_1 - Gg_1 &= 16A^3z^7 + 108B^2z^7 = 4\Delta z^7 \\
Ff_2 + Gg_2 &= 16x^7A^3 + 108x^7B^2 = 4\Delta x^7
\end{aligned} \]
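Multiplying out the brackets by hand is tedious, so the two identities can instead be spot-checked with exact integer arithmetic. The Python sketch below is a sanity check rather than a proof; the function name and the range of sample values are arbitrary choices made for the illustration.

```python
from itertools import product

def lemma_identities_hold(A, B, x, z):
    """Evaluate both identities of the lemma at integer values of A, B, x, z."""
    D = 4 * A**3 + 27 * B**2
    F = x**4 - 2 * A * x**2 * z**2 - 8 * B * x * z**3 + A**2 * z**4
    G = 4 * z * (x**3 + A * x * z**2 + B * z**3)
    f1 = 12 * x**2 * z + 16 * A * z**3
    g1 = 3 * x**3 - 5 * A * x * z**2 - 27 * B * z**3
    f2 = (4 * D * x**3 - 4 * A**2 * B * x**2 * z
          + 4 * A * (3 * A**3 + 22 * B**2) * x * z**2
          + 12 * B * (A**3 + 8 * B**2) * z**3)
    g2 = (A**2 * B * x**3 + A * (5 * A**3 + 32 * B**2) * x**2 * z
          + 2 * B * (13 * A**3 + 96 * B**2) * x * z**2
          - 3 * A**2 * (A**3 + 8 * B**2) * z**3)
    return F * f1 - G * g1 == 4 * D * z**7 and F * f2 + G * g2 == 4 * D * x**7

# Check every combination of small integer values.
samples = range(-3, 4)
all_hold = all(lemma_identities_hold(A, B, x, z)
               for A, B, x, z in product(samples, repeat=4))
print(all_hold)
```

Since both sides are polynomials of bounded degree in each variable, agreement on a grid of this size is already strong evidence that the expansion has been transcribed correctly.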
Theorem 3.17. Let $E$ be an elliptic curve. The endomorphism of $E$ given by multiplication by $n$ has degree $n^2$.
Proof By Lemma 3.15 we see that the maximum of the degrees of $\phi_n(x)$ and $\psi_n^2(x)$ is $n^2$. So we can conclude that $n^2$ is the degree of the endomorphism by definition, provided that $\phi_n(x)$ and $\psi_n^2(x)$ have no common roots.
Suppose for a contradiction that they share a common root, with $n$ the smallest index for which this happens. First suppose $n = 2m$ is even. We have
\[ \phi_2(x) = x^4 - 2Ax^2 - 8Bx + A^2, \qquad \psi_2^2 = 4y^2 = 4(x^3 + Ax + B) \]
From Theorem 3.18
\[ 2m(x, y) = 2[m(x, y)] = 2\left(\frac{\phi_m(x)}{\psi_m^2(x)}, \frac{\omega_m(x, y)}{\psi_m(x, y)^3}\right) = \left(\frac{\phi_2(\phi_m/\psi_m^2)}{\psi_2^2(\phi_m/\psi_m^2)}, \frac{\omega_2(m(x, y))}{\psi_2(m(x, y))^3}\right) \]
So considering the first coordinate gives
\[ \begin{aligned}
\frac{\phi_{2m}}{\psi_{2m}^2} &= \frac{\phi_2(\phi_m/\psi_m^2)}{\psi_2^2(\phi_m/\psi_m^2)} \\
&= \left(\frac{\phi_m^4}{\psi_m^8} - 2A\frac{\phi_m^2}{\psi_m^4} - 8B\frac{\phi_m}{\psi_m^2} + A^2\right) \Big/\ 4\left(\frac{\phi_m^3}{\psi_m^6} + A\frac{\phi_m}{\psi_m^2} + B\right) \\
&= \frac{\phi_m^4 - 2A\phi_m^2\psi_m^4 - 8B\phi_m\psi_m^6 + A^2\psi_m^8}{(4\psi_m^2)(\phi_m^3 + A\phi_m\psi_m^4 + B\psi_m^6)} \\
&= \frac{U}{V}
\end{aligned} \]
Then, using Lemma 3.16 with $U = F(\phi_m, \psi_m^2)$ and $V = G(\phi_m, \psi_m^2)$,
\[ \begin{aligned}
U \cdot f_1(\phi_m, \psi_m^2) - V \cdot g_1(\phi_m, \psi_m^2) &= 4\psi_m^{14}\Delta \\
U \cdot f_2(\phi_m, \psi_m^2) + V \cdot g_2(\phi_m, \psi_m^2) &= 4\phi_m^7\Delta
\end{aligned} \]
If $U, V$ have a common root then so do $\phi_m$ and $\psi_m^2$. But since $n = 2m$ is the first index for which there is a common root this is impossible, so $U$ and $V$ do not share a common root.
We need to show that $U = \phi_{2m}$ and $V = \psi_{2m}^2$. Since $U/V = \phi_{2m}/\psi_{2m}^2$ and $U, V$ have no common root, it follows that $\phi_{2m}$ is a multiple of $U$ and $\psi_{2m}^2$ is a multiple of $V$. But by Lemma 3.15 we can show that both $\phi_{2m}$ and $U$ equal $x^{4m^2} + $ lower order terms, so $\phi_{2m} = U$. Therefore $V = \psi_{2m}^2$ and they share no common roots.
Now suppose that $n$, the smallest index such that there is a common root, is odd, so $n = 2m + 1$. Let $r$ be a common root of $\phi_n$ and $\psi_n^2$. We have
\[ \phi_n = x\psi_n^2 - \psi_{n-1}\psi_{n+1} \]
and since $\psi_n^2(r) = 0$ it follows that $(\psi_{n-1}\psi_{n+1})(r) = 0$. Now, $\psi_{n+1}^2$ and $\psi_{n-1}^2$ are polynomials in $x$, and their product vanishes at $r$, therefore $\psi_{n+\delta}^2(r) = 0$ where $\delta$ is either 1 or -1.
Since $n$ is odd, both $\psi_n$ and $\psi_{n+2\delta}$ are polynomials in $x$ and
\[ (\psi_n\psi_{n+2\delta})^2 = \psi_n^2\psi_{n+2\delta}^2 \]
vanishes at $r$ (as $\psi_n^2$ does). Therefore $\psi_n\psi_{n+2\delta}$ vanishes at $r$ also. Since
\[ \phi_{n+\delta} = x\psi_{n+\delta}^2 - \psi_n\psi_{n+2\delta} \]
we find that $\phi_{n+\delta}(r) = 0$. Therefore $\phi_{n+\delta}$ and $\psi_{n+\delta}^2$ have a common root (where $n + \delta$ is even).
When considering the $n$ even case we showed that if $\phi_{2m}$ and $\psi_{2m}^2$ have a common root then so do $\phi_m$ and $\psi_m^2$. Since $n + \delta$ is even we can apply this to $2m = n + \delta$. Since $n$ is the smallest index for which there is a common root
\[ \frac{n + \delta}{2} \geq n \implies n \leq \delta \]
The only option would be $n = 1$, but clearly $\phi_1 = x$ and $\psi_1^2 = 1$ have no common roots, so we have a contradiction.
So $\phi_n$ and $\psi_n^2$ have no common roots in all cases. Therefore, we can conclude that the multiplication by $n$ map has degree $n^2$.
Theorem 3.18. (Proof omitted, see Section 9.5 of [9]): Let $P = (x, y)$ be a point on the elliptic curve $y^2 = x^3 + Ax + B$ over a field of characteristic not 2. Let $n$ be a positive integer, then
\[ nP = \left(\frac{\phi_n(x)}{\psi_n^2(x)}, \frac{\omega_n(x, y)}{\psi_n(x, y)^3}\right) \]
We now use the above results to prove Theorem 3.12, from the previous section.
Theorem 3.12 Let $E$ be an elliptic curve over a field $K$, and let $n$ be a positive integer. If the characteristic of $K$ does not divide $n$, or is zero, then
\[ E[n] \simeq \mathbb{Z}_n \oplus \mathbb{Z}_n \]
If the characteristic of $K$ is $p > 0$ and $p \mid n$, write $n = p^rn'$ with $p \nmid n'$. Then
\[ E[n] \simeq \mathbb{Z}_{n'} \oplus \mathbb{Z}_{n'} \quad \text{or} \quad \mathbb{Z}_n \oplus \mathbb{Z}_{n'} \]
Proof We first deal with the case when $p \nmid n$. Recall that if $\alpha(x, y) = (R(x), yS(x))$ is an endomorphism on an elliptic curve then $\alpha$ is separable if $R'(x)$ is not identically zero. From Theorem 3.18 and Lemma 3.15 we see the multiplication by $n$ map has
\[ R(x) = \frac{\phi_n(x)}{\psi_n^2(x)} = \frac{x^{n^2} + ...}{n^2x^{n^2 - 1} + ...} \]
So using the quotient rule, the numerator of $R'(x)$ is
\[ \begin{aligned}
R_{num}'(x) &= (n^2x^{n^2 - 1} + ...)(n^2x^{n^2 - 1} + ...) - (x^{n^2} + ...)(n^2(n^2 - 1)x^{n^2 - 2} + ...) \\
&= (n^4x^{2n^2 - 2} + ...) - ((n^4 - n^2)x^{2n^2 - 2} + ...) \\
&= n^2x^{2n^2 - 2} + ... \neq 0
\end{aligned} \]
where the leading coefficient $n^2$ is non-zero because $p \nmid n$. So $R'(x) \neq 0$ and therefore multiplication by $n$ is separable.
As stated earlier, E[n] is the kernel of the multiplication by n endomorphism. We have just shown this to be separable so we can apply Theorem
3.5 to show the group has order equal to the degree of the endomorphism.
By Theorem 3.17 this is n2 . The structure theorem for finite abelian groups
then says that E[n] is isomorphic to
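$$\mathbb{Z}_{n_1} \oplus \mathbb{Z}_{n_2} \oplus \dots \oplus \mathbb{Z}_{n_k}$$
for some integers $n_1, n_2, \dots, n_k$ with $n_i | n_{i+1}$ for all i.
Let l be a prime dividing $n_1$, so that l divides every $n_i$. By Lemma B.8, E[l] then has order $l^k$, but since we proved above that E[l] has order $l^2$ we must have k = 2. So $E[n] \simeq \mathbb{Z}_{n_1} \oplus \mathbb{Z}_{n_2}$ where $n_1 | n_2$. Multiplication by n annihilates E[n], so $n_2 | n$; since the order of E[n] is $n^2 = n_1 n_2$ it follows that $n_1 = n_2 = n$. Therefore
$$E[n] \simeq \mathbb{Z}_n \oplus \mathbb{Z}_n$$
when the characteristic p of the field does not divide n.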
Now consider the case when p | n. We consider first the p-power torsion on E. By Proposition 3.9 multiplication by p is not separable, and so by Theorem 3.5 the kernel, E[p], of multiplication by p has order less than the degree of the endomorphism, which is $p^2$ by Theorem 3.17. Every element of E[p] has order 1 or p, so the order of E[p] is a power of p and hence is either 1 or p. If E[p] was trivial then $E[p^k]$ would be for all k, so suppose E[p] has order p.
We will show that $E[p^k] \simeq \mathbb{Z}_{p^k}$ for all k. First we must show that the order can not be smaller than $p^k$. Suppose there exists an element P of order $p^j$. By Theorem 3.6 multiplication by p is surjective so there exists a point Q with pQ = P. Since
$$p^j Q = p^{j-1} P \neq \infty, \qquad p^{j+1} Q = p^j P = \infty$$
Q has order $p^{j+1}$. There is an element of order 1, (∞), so by induction there are points of order $p^k$ for all k. Therefore a point of order $p^k$ will generate $E[p^k]$, meaning $E[p^k]$ is a cyclic group of order $p^k$, and so $E[p^k] \simeq \mathbb{Z}_{p^k}$.
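Finally write $n = p^r n'$ with r ≥ 0 and $p \nmid n'$. Then
$$E[n] \simeq E[n'] \oplus E[p^r]$$
We have $E[n'] \simeq \mathbb{Z}_{n'} \oplus \mathbb{Z}_{n'}$, since $p \nmid n'$, and we have just shown that $E[p^r] \simeq 0$ or $\mathbb{Z}_{p^r}$. So
$$E[n] \simeq \mathbb{Z}_{n'} \oplus \mathbb{Z}_{n'} \oplus 0 \quad \text{or} \quad \mathbb{Z}_{n'} \oplus \mathbb{Z}_{n'} \oplus \mathbb{Z}_{p^r}$$
Now since $p \nmid n'$ we can use the Chinese remainder theorem (B.1) to show
$$\mathbb{Z}_{n'} \oplus \mathbb{Z}_{p^r} \simeq \mathbb{Z}_{n' p^r} \simeq \mathbb{Z}_n$$
Therefore we obtain
$$E[n] \simeq \mathbb{Z}_{n'} \oplus \mathbb{Z}_{n'} \quad \text{or} \quad \mathbb{Z}_n \oplus \mathbb{Z}_{n'}$$
which completes the proof of Theorem 3.12.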
3.4 The Weil pairing
Here we consider the Weil pairing, which is in itself a worthwhile subject. However, many of its uses are omitted in the project and so we state it here without proof in order to derive some useful results for the next chapter. For this section we let E be an elliptic curve over a field K and let n be an integer not divisible by the characteristic of K. Then $E[n] \simeq \mathbb{Z}_n \oplus \mathbb{Z}_n$. Let
$$\mu_n = \{x \in \overline{K} \mid x^n = 1\}$$
be the group of nth roots of unity in the algebraic closure $\overline{K}$. Since the characteristic of K does not divide n, the equation $x^n = 1$ has no multiple roots, and hence n roots in $\overline{K}$. Therefore $\mu_n$ is a cyclic group of order n. Any generator, ζ, of $\mu_n$ is called a primitive nth root of unity, which in Theorem A.14 we show is equivalent to saying that $\zeta^k = 1$ if and only if n divides k.
Theorem 3.19. (Proof omitted - See Chapter 11 of [9]): Let E be an elliptic
curve defined over a field K and let n be a positive integer . Assume that the
characteristic of K does not divide n. Then there is a pairing
en : E[n] × E[n] → µn
called the Weil pairing that satisfies the following properties.
1. en is bilinear in each variable. This means
en (S1 + S2 , T ) = en (S1 , T )en (S2 , T )
en (S, T1 + T2 ) = en (S, T1 )en (S, T2 )
for all S, S1 , S2 , T, T1 , T2 ∈ E[n].
2. en is non degenerate in each variable. This means that if en (S, T ) = 1
for all T ∈ E[n] then S = ∞ and also that if en (S, T ) = 1 for all
S ∈ E[n] then T = ∞.
3. en (T, T ) = 1 for all T ∈ E[n].
4. $e_n(T, S) = e_n(S, T)^{-1}$ for all S, T ∈ E[n].
5. en (σS, σT ) = σ(en (S, T )) for all automorphisms σ of K such that σ is
the identity map on the coefficients of E. (If E is in Weierstrass form
this means that σ(A) = A and σ(B) = B.)
6. $e_n(\alpha(S), \alpha(T)) = e_n(S, T)^{\deg(\alpha)}$ for all separable endomorphisms α of E. If the coefficients of E lie in the finite field Fq then the statement also holds when α is the Frobenius endomorphism $\phi_q$. (Note this statement holds for all endomorphisms α, separable or not.)
Corollary 3.20. Let {T1 , T2 } be a basis of E[n]. Then en (T1 , T2 ) is a primitive nth root of unity.
Proof Suppose $e_n(T_1, T_2) = \zeta$ with $\zeta^d = 1$. Then
$$e_n(T_1, dT_2) = e_n(T_1, T_2 + \dots + T_2) = e_n(T_1, T_2)^d = \zeta^d = 1$$
$$e_n(T_2, dT_2) = e_n(T_2, T_2 + \dots + T_2) = e_n(T_2, T_2)^d = 1^d = 1$$
Let S ∈ E[n], then $S = aT_1 + bT_2$ for some integers a, b. Therefore
$$e_n(S, dT_2) = e_n(T_1, dT_2)^a \, e_n(T_2, dT_2)^b = 1^a 1^b = 1$$
This holds for all S so Theorem 3.19(2) implies that dT2 = ∞. This can
happen only if n|d so it follows from Theorem A.14 that ζ is a primitive nth
root of unity.
Corollary 3.21. If E[n] ⊆ E(K) (as opposed to $E(\overline{K})$) then $\mu_n \subset K$.
Proof Let σ be an automorphism of $\overline{K}$ such that σ is the identity on K. Let $T_1, T_2$ be a basis of E[n]. Since $T_1, T_2$ are assumed to have coordinates in K we have $\sigma T_1 = T_1$ and $\sigma T_2 = T_2$. Then by Theorem 3.19(5)
$$\zeta = e_n(T_1, T_2) = e_n(\sigma T_1, \sigma T_2) = \sigma(e_n(T_1, T_2)) = \sigma(\zeta)$$
The fundamental theorem of Galois theory says that if an element $x \in \overline{K}$ is fixed by all such automorphisms σ then x ∈ K. Therefore ζ ∈ K and, by Corollary 3.20, ζ is also a primitive nth root of unity. Hence $\mu_n \subset K$.
We now deduce two propositions for use in the proof of Hasse’s theorem.
Recall that if α is an endomorphism of E then we obtain
$$\alpha_n = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$$
with entries in $\mathbb{Z}_n$, describing the action of α on a basis $\{T_1, T_2\}$ of E[n].
Proposition 3.22. Let α be an endomorphism of an elliptic curve E defined
over a field K. Let n be a positive integer not divisible by the characteristic
of K. Then det(αn ) ≡ deg(α) (mod n).
Proof By Corollary 3.20, $\zeta = e_n(T_1, T_2)$ is a primitive nth root of unity. By Theorem 3.19(6)
$$\zeta^{\deg(\alpha)} = e_n(\alpha(T_1), \alpha(T_2)) = e_n(aT_1 + cT_2, bT_1 + dT_2)$$
$$= e_n(T_1, T_1)^{ab} \, e_n(T_1, T_2)^{ad} \, e_n(T_2, T_1)^{cb} \, e_n(T_2, T_2)^{cd}$$
$$= \zeta^{ad - bc}$$
So
$$\zeta^{\deg(\alpha)} \zeta^{-(ad-bc)} = \zeta^{ad-bc} \zeta^{-(ad-bc)}$$
$$\zeta^{\deg(\alpha) - (ad-bc)} = 1$$
ζ is a primitive nth root of unity so by Theorem A.14, n | [deg(α) − (ad − bc)]. Therefore
$$\deg(\alpha) - (ad - bc) \equiv 0 \pmod{n}$$
$$\deg(\alpha) \equiv ad - bc \pmod{n}$$
So we can now reduce questions about the degree to calculations with matrices. Propositions 3.22 and 3.23 hold for all endomorphisms (as Theorem
3.19(6) holds for all) but we prove Proposition 3.23 for separable endomorphisms only.
Let α and β be endomorphisms of E and let a, b be integers. The endomorphism aα + bβ is defined by
(aα + bβ)(P ) = aα(P ) + bβ(P )
Proposition 3.23.
$$\deg(a\alpha + b\beta) = a^2 \deg(\alpha) + b^2 \deg(\beta) + ab(\deg(\alpha + \beta) - \deg(\alpha) - \deg(\beta))$$
Proof Let n be any integer not divisible by the characteristic of K. Represent α and β by matrices $\alpha_n$ and $\beta_n$, with respect to some basis of E[n]. Then $a\alpha_n + b\beta_n$ gives the action of aα + bβ on E[n]. By Theorem B.17
$$\det(a\alpha_n + b\beta_n) = a^2 \det(\alpha_n) + b^2 \det(\beta_n) + ab(\det(\alpha_n + \beta_n) - \det(\alpha_n) - \det(\beta_n))$$
for any matrices $\alpha_n, \beta_n$. Therefore by Proposition 3.22
$$\deg(a\alpha + b\beta) \equiv a^2 \deg(\alpha) + b^2 \deg(\beta) + ab(\deg(\alpha + \beta) - \deg(\alpha) - \deg(\beta)) \pmod{n}$$
Since this holds for infinitely many n it is an equality.
Chapter 4
Elliptic curves over finite fields
Let F be a finite field and E an elliptic curve defined over F. Since there are
only a finite number of pairs (x, y), with x, y ∈ F, the group E(F) must itself
be finite. In this chapter we discuss the basic theory of elliptic curves over
finite fields, which is the starting point for cryptographic applications.
During the course of the chapter we prove Hasse’s theorem which gives a
bound of the size of the group defined by E(Fq ). We also look at methods
to find the order of a point in E(F).
4.1 Examples
A finite field will have $p^n$ elements for some prime p and some integer n ≥ 1 (see Appendix B.5.1). The fields $\mathbb{F}_p$ with n = 1 are isomorphic to $\mathbb{Z}_p$, and elliptic curves defined over them are known as the prime curves. When working with an elliptic curve defined over a finite field $\mathbb{F}_p$ we perform all operations modulo p.
Example 4.1. Let E be $y^2 = x^3 + x + 1$ over F5 (= Z5). To find all the points on E(F5) we consider the possible values of x, the values of $x^3 + x + 1$ they give, and then what values of y will give the same value when squared.
x      x^3 + x + 1    y      Points
0      1              ±1     (0,1), (0,4)
1      3              -      -
2      1              ±1     (2,1), (2,4)
3      1              ±1     (3,1), (3,4)
4      4              ±2     (4,2), (4,3)
∞                     ∞      ∞
So we see that E(F5 ) has order 9.
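The table can be reproduced by brute force. The following is a minimal Python sketch, assuming a curve in Weierstrass form $y^2 = x^3 + Ax + B$ over a prime field; the function name `curve_points` and the use of `None` for the point at infinity are our own conventions, not part of the text.

```python
def curve_points(A, B, p):
    """List the points of y^2 = x^3 + A*x + B over F_p (p prime).

    The point at infinity is represented by None (an assumed convention).
    """
    points = [None]
    for x in range(p):
        rhs = (x**3 + A * x + B) % p
        for y in range(p):
            if (y * y) % p == rhs:
                points.append((x, y))
    return points


points = curve_points(1, 1, 5)
print(points)
print(len(points))  # 9
```

This tries all $p^2$ pairs, so it is only sensible for very small p.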
We can perform addition as before. For example let's compute 3(0, 1) = 2(0, 1) + (0, 1). We first need to calculate 2(0, 1) = (x, y), so using the notation of the addition formulas:
$$m = \frac{3(0)^2 + 1}{2(1)} = \frac{1}{2} \equiv 3, \qquad x = 3^2 - 2(0) = 9 \equiv 4 \pmod{5}$$
and then
$$y = 3(0 - 4) - 1 = -13 \equiv 2 \pmod{5}$$
Next we compute 3(0, 1) = (4, 2) + (0, 1) = (X, Y) where
$$m = \frac{1 - 2}{0 - 4} = \frac{1}{4} \equiv 4 \pmod{5}$$
$$X = 4^2 - 4 - 0 = 12 \equiv 2 \pmod{5}$$
$$Y = 4(4 - 2) - 2 = 6 \equiv 1 \pmod{5}$$
So 3(0, 1) = (2, 1). Now we know that E(F5 ) has order 9, so all its elements
have order dividing 9. The only choices are 1,3 or 9 and we have shown that
(0,1) does not have order 1 or 3. Therefore (0,1) has order 9 and hence E(F5 )
is cyclic and generated by (0,1). For more examples of working with E(Fp )
see Section 2.2.1.
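The arithmetic of Example 4.1 can be checked mechanically. The sketch below is illustrative only: it assumes the standard chord and tangent formulas for $y^2 = x^3 + Ax + B$ over a prime field, with points given by coordinates reduced modulo p and `None` standing for ∞. The names `ec_add` and `naive_order` are hypothetical.

```python
def ec_add(P, Q, A, p):
    """Add two points on y^2 = x^3 + A*x + B over F_p; None is infinity."""
    if P is None:
        return Q
    if Q is None:
        return P
    x1, y1 = P
    x2, y2 = Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return None  # P + (-P) = infinity (also covers doubling when y = 0)
    if P == Q:
        m = (3 * x1 * x1 + A) * pow(2 * y1 % p, -1, p) % p  # tangent slope
    else:
        m = (y2 - y1) * pow((x2 - x1) % p, -1, p) % p        # chord slope
    x3 = (m * m - x1 - x2) % p
    y3 = (m * (x1 - x3) - y1) % p
    return (x3, y3)


def naive_order(P, A, p):
    """Order of P found by repeated addition (fine for tiny groups only)."""
    k, R = 1, P
    while R is not None:
        R = ec_add(R, P, A, p)
        k += 1
    return k


P = (0, 1)
P2 = ec_add(P, P, 1, 5)
P3 = ec_add(P2, P, 1, 5)
print(P2, P3)                # (4, 2) (2, 1)
print(naive_order(P, 1, 5))  # 9
```

The modular inverse uses `pow(x, -1, p)`, which requires Python 3.8 or later.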
Example 4.2. Let E be the elliptic curve $y^2 + xy = x^3 + 1$ defined over F2. There are only four pairs (x, y) with coordinates in F2 and all except (0,0) satisfy the elliptic curve equation so
E(F2 ) = {∞, (0, 1), (1, 0), (1, 1)}
This is a cyclic group of order 4. The point ∞ has order 1 and the point (0,1)
has order 2. We can show, (using the formula from Appendix A.3 since we are
in characteristic 2), that (1,0) and (1,1) have order 4 and so are generators
of the group.
Now consider $E(\mathbb{F}_4) = E(\mathbb{F}_{2^2})$. F4 is a finite field with 4 elements which we can write as $\mathbb{F}_4 = \{0, 1, \omega, \omega^2\}$, where $\omega^2 + \omega + 1 = 0$ (see Appendix B.5.1). We can use $\omega^3 = 1$ since
$$0(\omega - 1) = (\omega^2 + \omega + 1)(\omega - 1)$$
$$0 = \omega^3 + \omega^2 + \omega - \omega^2 - \omega - 1 = \omega^3 - 1$$
Now let's list the elements of E(F4).
x = 0      ⇒  y^2 = 1             ⇒  y = 1
x = 1      ⇒  y^2 + y = 0         ⇒  y = 0, 1
x = ω      ⇒  y^2 + ωy = 0        ⇒  y = 0, ω
x = ω^2    ⇒  y^2 + ω^2 y = 0     ⇒  y = 0, ω^2
x = ∞      ⇒                      ⇒  y = ∞
Therefore $E(\mathbb{F}_4) = \{\infty, (0, 1), (1, 0), (1, 1), (\omega, 0), (\omega, \omega), (\omega^2, 0), (\omega^2, \omega^2)\}$.
Since we are in characteristic 2 we know, by Proposition 3.11, that there is at most one point of order 2 which we have already identified as (0,1). E(F4) is a group of order 8, so its elements must have order 1, 2, 4 or 8. We know only ∞ has order 1 and only (0,1) has order 2. By Theorem B.6 we know that only 4 elements have order dividing 4, so it is those of the order 4 subgroup, E(F2). We can conclude that E(F4) is cyclic of order 8 where any of the four points that contain ω or $\omega^2$ is a generator.
Let $\phi_2(x, y) = (x^2, y^2)$ be the Frobenius map. We can see that $\phi_2$ permutes the elements of E(F4) as
$$\phi_2(E(\mathbb{F}_4)) = \{\infty, (0, 1), (1, 0), (1, 1), (\omega^2, 0), (\omega^2, \omega^2), (\omega^4, 0), (\omega^4, \omega^4)\}$$
$$= \{\infty, (0, 1), (1, 0), (1, 1), (\omega^2, 0), (\omega^2, \omega^2), (\omega, 0), (\omega, \omega)\} = E(\mathbb{F}_4)$$
using $\omega^3 = 1$. Furthermore we can see that
$$E(\mathbb{F}_2) = \{(x, y) \in E(\mathbb{F}_4) \mid \phi_2(x, y) = (x, y)\}$$
In general, for any elliptic curve E, defined over Fq and any extension F of
Fq , the Frobenius map φq permutes the elements of E(F) and is the identity
on the subgroup E(Fq ). (See Lemma 4.3)
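The claims of Example 4.2 can be checked with a small sketch of arithmetic in F4. The encoding is our own assumption: the integers 0, 1, 2, 3 stand for 0, 1, ω, ω + 1 = ω², addition is bitwise XOR, and the hypothetical helper `f4_mul` multiplies polynomials in ω and reduces modulo ω² + ω + 1.

```python
def f4_mul(a, b):
    """Multiply two elements of F_4 encoded as 2-bit integers."""
    r = 0
    if b & 1:
        r ^= a
    if b & 2:
        r ^= a << 1
    if r & 0b100:       # reduce using w^2 = w + 1
        r ^= 0b111
    return r


def on_curve(x, y):
    """Check y^2 + x*y = x^3 + 1 in F_4 (addition is XOR)."""
    lhs = f4_mul(y, y) ^ f4_mul(x, y)
    rhs = f4_mul(f4_mul(x, x), x) ^ 1
    return lhs == rhs


def frobenius(point):
    """The map (x, y) -> (x^2, y^2)."""
    x, y = point
    return (f4_mul(x, x), f4_mul(y, y))


affine = [(x, y) for x in range(4) for y in range(4) if on_curve(x, y)]
fixed = [pt for pt in affine if frobenius(pt) == pt]

print(len(affine) + 1)                                        # 8, counting infinity
print(sorted(frobenius(pt) for pt in affine) == sorted(affine))  # True
print(fixed)                                                  # [(0, 1), (1, 0), (1, 1)]
```

The fixed points are exactly the affine points of E(F2), in agreement with the example.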
Theorem 4.1. Let E be an elliptic curve over the finite field Fq. Then
$$E(\mathbb{F}_q) \simeq \mathbb{Z}_n \quad \text{or} \quad \mathbb{Z}_{n_1} \oplus \mathbb{Z}_{n_2}$$
for some integer n ≥ 1, or for some integers $n_1, n_2 \geq 1$ with $n_1 | n_2$.
Proof From Theorem B.6 we know that a finite abelian group, such as E(Fq), is isomorphic to a direct sum of cyclic groups
$$E(\mathbb{F}_q) \simeq \mathbb{Z}_{n_1} \oplus \mathbb{Z}_{n_2} \oplus \dots \oplus \mathbb{Z}_{n_r}$$
with $n_i | n_{i+1}$ for i ≥ 1. We can then apply Corollary B.7 to show E(Fq) has $n_1^r$ elements of order dividing $n_1$. However, by Theorem 3.12 there are at most $n_1^2$ such points, therefore r ≤ 2, which gives the desired result.
4.2 Hasse's theorem
The aim of this section is to prove Hasse’s theorem, which gives a bound on
the size of E(Fq ). We follow the logic in Chapter VI of [5] to understand the
size of E(Fq ).
For each of the q possible values of x, there are at most 2 y’s which
together with the x could satisfy the Weierstrass equation. So it is easy to
see that there are at most 2q + 1 points in E(Fq ) — ∞ along with the 2q
possible pairs (x, y). However, since only half the elements in Fq have square
roots we might expect around half that number.
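Recall the Legendre symbol (Appendix B.6). We can generalise this to a finite field Fq, q odd, by defining for x ∈ Fq
$$\left(\frac{x}{\mathbb{F}_q}\right) = \begin{cases} +1 & \text{if } t^2 = x \text{ has a solution } t \in \mathbb{F}_q^{\times} \\ -1 & \text{if } t^2 = x \text{ has no solution } t \in \mathbb{F}_q \\ 0 & \text{if } x = 0 \end{cases}$$
We can now give a more accurate expression for the number of points on E(Fq):
$$\#E(\mathbb{F}_q) = 1 + \sum_{x \in \mathbb{F}_q} \left( 1 + \left(\frac{x^3 + Ax + B}{\mathbb{F}_q}\right) \right) = q + 1 + \sum_{x \in \mathbb{F}_q} \left(\frac{x^3 + Ax + B}{\mathbb{F}_q}\right)$$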
We would expect $x^3 + Ax + B$ to be equally likely to have a square root or not. So we could treat the sum as a random walk where we have equal chance of taking one step forwards or back at each stage. From probability theory the net distance traveled after q steps is of the order $\sqrt{q}$. So using this analysis we would expect the size of E(Fq) to differ from q + 1 by something of the order of $\sqrt{q}$. As we see from Hasse's Theorem below, this is close to the truth.
Theorem 4.2 (Hasse). Let E be an elliptic curve over the finite field Fq. Then the order of E(Fq) satisfies the following inequality.
$$|q + 1 - \#E(\mathbb{F}_q)| \leq 2\sqrt{q}$$
The proof is given in the following section.
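The point-counting sum and the Hasse bound can be illustrated numerically for prime fields. This is a sketch under stated assumptions: q = p is an odd prime, the symbol is evaluated with Euler's criterion (a standard fact that is not part of the text), and the names `legendre`, `count_points` and `satisfies_hasse` are hypothetical.

```python
def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    a %= p
    if a == 0:
        return 0
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1


def count_points(A, B, p):
    """#E(F_p) = p + 1 + sum over x of ((x^3 + A*x + B)/p)."""
    return p + 1 + sum(legendre(x**3 + A * x + B, p) for x in range(p))


def satisfies_hasse(N, q):
    """Check |q + 1 - N| <= 2*sqrt(q) using integers only."""
    return (q + 1 - N) ** 2 <= 4 * q


N = count_points(1, 1, 5)      # the curve of Example 4.1
print(N)                       # 9
print(satisfies_hasse(N, 5))   # True
```

Squaring both sides of the inequality avoids floating point square roots.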
4.2.1 The Frobenius endomorphism
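Let Fq be a finite field with algebraic closure $\overline{\mathbb{F}}_q$ and let the Frobenius map for Fq, $\phi_q : \overline{\mathbb{F}}_q \to \overline{\mathbb{F}}_q$, be given by
$$\phi_q : x \mapsto x^q$$
Let E be an elliptic curve defined over Fq, then $\phi_q$ acts on the coordinates of points in $E(\overline{\mathbb{F}}_q)$ as below.
$$\phi_q(x, y) = (x^q, y^q), \qquad \phi_q(\infty) = \infty$$
Lemma 4.3. Let E be defined over Fq and let $(x, y) \in E(\overline{\mathbb{F}}_q)$. Then
(i) $\phi_q(x, y) \in E(\overline{\mathbb{F}}_q)$.
(ii) $(x, y) \in E(\mathbb{F}_q)$ if and only if $\phi_q(x, y) = (x, y)$.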
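Proof We know from Theorem B.14 that, since q is a power of the characteristic of the field,
• $(a + b)^q = a^q + b^q$ for all $a, b \in \overline{\mathbb{F}}_q$
• $a^q = a$ for all $a \in \mathbb{F}_q$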
This proof will hold for both the Weierstrass and generalised Weierstrass
equation so assume E is given by
$$y^2 + a_1 xy + a_3 y = x^3 + a_2 x^2 + a_4 x + a_6$$
with $a_i \in \mathbb{F}_q$. Now raising each side of the equation to the power q gives
$$(y^2 + a_1 xy + a_3 y)^q = (x^3 + a_2 x^2 + a_4 x + a_6)^q$$
$$(y^2)^q + a_1^q x^q y^q + a_3^q y^q = (x^3)^q + a_2^q (x^2)^q + a_4^q x^q + a_6^q$$
$$(y^q)^2 + a_1 (x^q y^q) + a_3 (y^q) = (x^q)^3 + a_2 (x^q)^2 + a_4 (x^q) + a_6$$
So we see that (xq , y q ) lies on E, proving part (i).
For part (ii) we recall from Theorem B.14 that x ∈ Fq if and only if
φq (x) = x. The same will be true for y, and so using part (i)
(x, y) ∈ E(Fq ) ⇔ x, y ∈ Fq
⇔ φq (x) = x, φq (y) = y
⇔ φq (x, y) = (x, y)
Let E be an elliptic curve defined over Fq. Recall from Lemma 3.4 that $\phi_q$ is then an endomorphism of E of degree q, and is not separable. We also find that the kernel of the endomorphism $\phi_q$ is trivial (related to the fact that it is not separable by Theorem 3.5).
Since $\phi_q$ is an endomorphism of E, so is $\phi_q^2 = \phi_q \circ \phi_q$. Moreover so is
$$\phi_q^n = \phi_q \circ \phi_q \circ \dots \circ \phi_q$$
for every n ≥ 1. Since multiplication by -1 is also an endomorphism we can conclude that the sum $\phi_q^n - 1$ is an endomorphism of E.
Proposition 4.4. Let E be defined over Fq and let n ≥ 1. Then
(i) $\mathrm{Ker}(\phi_q^n - 1) = E(\mathbb{F}_{q^n})$.
(ii) $\phi_q^n - 1$ is a separable endomorphism, so $\#E(\mathbb{F}_{q^n}) = \deg(\phi_q^n - 1)$.
Proof Part (i) can be seen easily from Lemma 4.3 and the fact that $\phi_q^n - 1$ is separable was proved in Proposition 3.10. Therefore part (ii) follows from Theorem 3.5.
PROOF OF HASSE'S THEOREM
Let
$$a = q + 1 - \#E(\mathbb{F}_q) = q + 1 - \deg(\phi_q - 1) \qquad (4.1)$$
We need to show that $|a| \leq 2\sqrt{q}$. We use the following.
Lemma 4.5. Let r, s be integers with gcd(s, q) = 1. Then
$$\deg(r\phi_q - s) = r^2 q + s^2 - rsa$$
Proof Using Proposition 3.23 with a = r, α = $\phi_q$, b = s and β = −1:
$$\deg(r\phi_q - s) = r^2 \deg(\phi_q) + s^2 \deg(-1) + rs[\deg(\phi_q - 1) - \deg(\phi_q) - \deg(-1)]$$
We know that $\deg(\phi_q) = q$ and deg(−1) = 1 so using the definition of a
$$\deg(r\phi_q - s) = r^2 q + s^2 + rs[\deg(\phi_q - 1) - q - 1]$$
$$= r^2 q + s^2 - rs[q + 1 - \deg(\phi_q - 1)]$$
$$= r^2 q + s^2 - rsa$$
Note that the assumption that gcd(s, q) = 1 was included to allow the
use of Proposition 3.23. We now return to the proof of Hasse’s Theorem.
By definition $\deg(r\phi_q - s) \geq 0$, so by the above lemma
$$r^2 q + s^2 - rsa \geq 0$$
so, dividing by $s^2$,
$$q\left(\frac{r}{s}\right)^2 - a\left(\frac{r}{s}\right) + 1 \geq 0$$
for all r, s with gcd(s, q) = 1.
We show here that the set of rational numbers r/s such that
gcd(s, q) = 1 is dense in R.
For a subset X ⊆ R to be dense in R means that for every real number x, every open interval centered on x contains points of X.
Let X denote the set in question and let s be equal to a power of 2 or a power of 3. One of these choices will be coprime to q, since q is a power of a single prime p. It is easy to see that the rationals of the form $r/2^m$ or $r/3^m$ will be dense in R.
Therefore X will contain a subset that is dense in R and so X is itself
dense in R.
Since the set of rationals r/s such that gcd(s, q) = 1 is dense in R we conclude that for all real numbers x, $qx^2 - ax + 1 \geq 0$.
Suppose for a contradiction that this were not the case and that there was r ∈ R such that $qr^2 - ar + 1 < 0$.
Consider a sequence of open intervals about r:
(r − ε, r + ε) where ε = 1/n, n = 1, 2, 3, ...
Then within each of these intervals there would be a point $x_n \in X$ where X is the dense set of rationals r/s such that gcd(s, q) = 1.
We would get a sequence, $x_1, x_2, \dots$ of numbers getting closer and closer to r. For i sufficiently large we could find a value of
$$q x_i^2 - a x_i + 1$$
that was arbitrarily close to $qr^2 - ar + 1$. However, since $x_i \in X$ this first value would be ≥ 0 while the second is strictly less than zero. So we have a contradiction.
So $qx^2 - ax + 1 \geq 0$ for all x ∈ R. Therefore the polynomial must have either a double real root or a pair of complex roots. Hence, the discriminant of the polynomial is negative or 0:
$$a^2 - 4q \leq 0$$
This means that $|a| \leq 2\sqrt{q}$ which completes the proof of Hasse's theorem.
The following theorem is another useful consequence of Proposition 4.4.
Theorem 4.6. Let E be an elliptic curve defined over Fq and a as defined in Equation (4.1). Then
$$\phi_q^2 - k\phi_q + q = 0$$
as endomorphisms of E, if and only if k = a. In other words,
$$(x^{q^2}, y^{q^2}) - k(x^q, y^q) + q(x, y) = \infty$$
for all $(x, y) \in E(\overline{\mathbb{F}}_q)$ if and only if k = a.
Moreover a is the unique integer satisfying
$$a \equiv \mathrm{Trace}((\phi_q)_m) \pmod{m}$$
for all m with gcd(m, q) = 1.
Proof If $\phi_q^2 - a\phi_q + q$ is not the zero endomorphism, then its kernel is finite (Proposition 3.5), so we must show that its kernel is infinite.
Let m ≥ 1 be an integer with gcd(m, q) = 1. Recall that $\phi_q$ induces a matrix $(\phi_q)_m$ that describes the action of $\phi_q$ on E[m]. Let
$$(\phi_q)_m = \begin{pmatrix} s & t \\ u & v \end{pmatrix}$$
$\phi_q - 1$ is separable by Proposition 3.10, so we can use Theorem 3.5 and Proposition 3.22 to show
$$\#\mathrm{Ker}(\phi_q - 1) = \deg(\phi_q - 1) \equiv \det((\phi_q)_m - I) \pmod{m}$$
$$= \begin{vmatrix} s - 1 & t \\ u & v - 1 \end{vmatrix} = (s - 1)(v - 1) - tu = sv - tu - (s + v) + 1$$
By Proposition 3.22, $sv - tu = \det((\phi_q)_m) \equiv \deg(\phi_q) = q \pmod{m}$. Note also from Equation (4.1) that $\#\mathrm{Ker}(\phi_q - 1) = q + 1 - a$ so we can conclude
$$\mathrm{Trace}((\phi_q)_m) = s + v \equiv a \pmod{m}$$
By the Cayley-Hamilton theorem (every square matrix satisfies its characteristic equation) or straightforward calculation
$$(\phi_q)_m^2 - a(\phi_q)_m + qI \equiv 0 \pmod{m}$$
where I is the 2 × 2 identity matrix. This means that the endomorphism $\phi_q^2 - a\phi_q + q$ is identically zero on E[m]. Since there are infinitely many choices for m, the kernel is infinite, making the endomorphism 0, as required.
Suppose $a_1 \neq a$ satisfies $\phi_q^2 - a_1\phi_q + q = 0$. Then
$$(a - a_1)\phi_q = (\phi_q^2 - a_1\phi_q + q) - (\phi_q^2 - a\phi_q + q) = 0 - 0 = 0$$
By Theorem 3.6, $\phi_q : E(\overline{\mathbb{F}}_q) \to E(\overline{\mathbb{F}}_q)$ is surjective, therefore for any element $y \in E(\overline{\mathbb{F}}_q)$ there exists $x \in E(\overline{\mathbb{F}}_q)$ such that $\phi_q(x) = y$. So for all $y \in E(\overline{\mathbb{F}}_q)$
$$(a - a_1)y = (a - a_1)\phi_q(x) = 0$$
therefore $(a - a_1)$ annihilates $E(\overline{\mathbb{F}}_q)$. In particular $(a - a_1)$ annihilates E[m] for every m ≥ 1. Since there are points in E[m] of order m when gcd(m, q) = 1, we find that $a - a_1 \equiv 0 \pmod{m}$ for all such m. Therefore $a - a_1 = 0$, so a is unique.
4.3 Orders of points
Let P ∈ E(Fq ). The order of P is the smallest positive integer k such that
kP = ∞. In this section we show how knowing the order of a point in E(Fq )
can allow us to find the order of E(Fq ) itself. We then derive and demonstrate
an algorithm to find the order of a point.
The order of a point will always divide the order of the group, E(Fq), (see Theorem B.3). Also, for an integer n, we have nP = ∞ if and only if the order of P divides n. By Hasse's Theorem #E(Fq) lies in an interval of length $4\sqrt{q}$. Therefore if we find a point of order greater than $4\sqrt{q}$, then #E(Fq) must be a multiple of this. There could only be one multiple in the interval which will therefore be #E(Fq).
Even if the order of the point is smaller than $4\sqrt{q}$, we will still obtain a relatively small list of possibilities for #E(Fq). Also using several more points could shorten the list to a unique possibility for #E(Fq). In the following subsection we will discuss a method for finding the order of a point.
Example 4.3. Let E be y 2 = x3 − 10x + 21 over F557 . The point (2,3) can
be shown to have order 189 (see Example 4.6). Hasse’s Theorem implies
511 ≤ #E(F557 ) ≤ 605
The only multiple of 189 in this range is 3(189) = 567, so #E(F557 ) = 567.
Example 4.4. Let E be y 2 = x3 + 7x + 12 over F103 . It is relatively easy
to show that the point (−1, 2) has order 13 and the point (19,0) has order
2. Therefore the order of E(F103 ) is a multiple of 26. By Hasse’s Theorem
84 ≤ #E(F103 ) ≤ 124 so the order must be 104.
Example 4.5. Let E be $y^2 = x^3 + 2$ over F7. In this case $E(\mathbb{F}_7) \simeq \mathbb{Z}_3 \oplus \mathbb{Z}_3$ and every point except infinity has order 3. Hasse's theorem gives 3 ≤ #E(F7) ≤ 13 so all we can conclude is that the order is 3, 6, 9 or 12.
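The reasoning in Examples 4.3 to 4.5 amounts to listing the multiples of a known point order that lie in the Hasse interval. A minimal sketch follows; the function name `hasse_candidates` is our own, and the integer square root is used so that the bounds are exact.

```python
from math import isqrt


def hasse_candidates(q, order):
    """Multiples of `order` in the interval [q+1-2*sqrt(q), q+1+2*sqrt(q)]."""
    bound = isqrt(4 * q)                  # floor(2*sqrt(q))
    lo, hi = q + 1 - bound, q + 1 + bound
    first = -(-lo // order) * order       # smallest multiple of order >= lo
    return list(range(first, hi + 1, order))


print(hasse_candidates(557, 189))  # [567]
print(hasse_candidates(103, 26))   # [104]
print(hasse_candidates(7, 3))      # [3, 6, 9, 12]
```

When the list has a single entry, that entry is the group order; otherwise more points are needed.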
When we are in situations where $E(\mathbb{F}_q) \simeq \mathbb{Z}_n \oplus \mathbb{Z}_n$, as in the previous example, finding the order of the group is far more difficult. However this situation is fairly rare, as the next proposition shows.
Proposition 4.7. Let E be an elliptic curve over Fq and suppose
$$E(\mathbb{F}_q) \simeq \mathbb{Z}_n \oplus \mathbb{Z}_n$$
for some integer n. Then either $q = n^2 + 1$, $q = n^2 \pm n + 1$ or $q = (n \pm 1)^2$.
Proof In this case $\#E(\mathbb{F}_q) = n^2$, so by Hasse's Theorem $n^2 = q + 1 - a$ where $|a| \leq 2\sqrt{q}$. We now need the following lemma.
Lemma 4.8. a ≡ 2 (mod n)
Proof Let p be the characteristic of Fq. If p | n then, by Theorem B.4, there would be (p − 1) elements of order p in $\mathbb{Z}_n$ and so (including ∞) $p^2$ points in E[p]. However, if p | n then by Theorem 3.12 we write $n = p^r n'$ and we have either
$$E[n] \simeq \mathbb{Z}_{n'} \oplus \mathbb{Z}_{n'} \quad \text{or} \quad \mathbb{Z}_n \oplus \mathbb{Z}_{n'}$$
where $p \nmid n'$. If we are in the first case then E[p] has only 1 element and if we are in the second it has p, so we must conclude that $p \nmid n$.
Since E[n] ⊆ E(Fq), we can use Corollary 3.21 to show the nth roots of unity are in Fq. Then by Proposition B.15 (q − 1) is a multiple of n. Therefore
$$a = q + 1 - n^2 = (q - 1) + 2 - n^2 \equiv 2 \pmod{n}$$
Now write a = 2 + kn for some integer k. Then
$$n^2 = q + 1 - a = q - 1 - kn \implies q = n^2 + kn + 1$$
By Hasse's Theorem
$$2\sqrt{q} \geq |q + 1 - \#E(\mathbb{F}_q)| = |n^2 + kn + 1 + 1 - n^2| = |2 + kn|$$
Taking squares of both sides gives
$$4q \geq 4 + 4kn + k^2 n^2$$
$$4(n^2 + kn + 1) \geq 4 + 4kn + k^2 n^2$$
$$\implies k^2 \leq 4$$
So |k| ≤ 2, meaning the possible values of k are 0, ±1, ±2. Substituting these into $q = n^2 + kn + 1$ gives the possible values of q stated in the proposition:
k = 0 ⇒ $q = n^2 + 1$
k = ±1 ⇒ $q = n^2 \pm n + 1$
k = ±2 ⇒ $q = n^2 \pm 2n + 1 = (n \pm 1)^2$
Most values of q are not in one of these forms, and even for such q it is unlikely the elliptic curve would have the form $E(\mathbb{F}_q) \simeq \mathbb{Z}_n \oplus \mathbb{Z}_n$.
More generally, most q are such that all elliptic curves over Fq have points of order greater than $4\sqrt{q}$. So we can usually find points with orders that will allow us to determine #E(Fq).
We discuss other methods to determine exactly the size of E(Fq ) in Appendix A.7. We show how we can derive the size of E(Fqn ) from the size of
E(Fq ) if it is known in Section A.7.1. Then in Section A.7.2 we show how to
use the Legendre symbol mentioned earlier in a point counting algorithm.
4.3.1 Baby Step, giant step
We want to find the order of P ∈ E(Fq). We will need to find an integer k, so kP = ∞. Let #E(Fq) = N, then
$$q + 1 - 2\sqrt{q} \leq N \leq q + 1 + 2\sqrt{q}$$
We could try every integer in this range, to see which ones satisfy NP = ∞, which would take around $4\sqrt{q}$ steps. However we can speed this up to $4q^{1/4}$ steps using the following baby step, giant step algorithm.
(1) Compute Q = (q + 1)P
(2) Choose an integer, m, with $m > q^{1/4}$. Compute and store the points jP for j = 0, 1, 2, ..., m.
(3) Compute the points
Q + k(2mP) for k = −m, −(m − 1), ..., m
until there is a match with a point or its negative in the stored list:
Q + k(2mP) = ±jP
(4) Conclude that (q + 1 + 2mk ∓ j)P = ∞. Let M = q + 1 + 2mk ∓ j.
(5) Factor M. Let $p_1, \dots, p_r$ be the distinct prime factors of M.
(6) Compute $(M/p_i)P$ for i = 1, ..., r. If $(M/p_i)P = \infty$ for some i replace M with $M/p_i$ and go back to step (5). If $(M/p_i)P \neq \infty$ for all i then M is the order of the point P.
(7) If we are looking for #E(Fq) then repeat steps 1-6 with randomly chosen points in E(Fq) until the least common multiple of the orders divides only one integer N with $q + 1 - 2\sqrt{q} \leq N \leq q + 1 + 2\sqrt{q}$. Then N = #E(Fq).
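Steps (1) to (6) can be sketched in Python as follows. This is an illustration under stated assumptions, not a definitive implementation: it handles only prime fields (q = p) and curves $y^2 = x^3 + Ax + B$, represents ∞ by `None`, expects coordinates already reduced modulo p, and factors M by trial division. All function names are hypothetical.

```python
from math import isqrt


def ec_add(P, Q, A, p):
    """Add points on y^2 = x^3 + A*x + B over F_p; None is infinity."""
    if P is None:
        return Q
    if Q is None:
        return P
    x1, y1 = P
    x2, y2 = Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return None
    if P == Q:
        m = (3 * x1 * x1 + A) * pow(2 * y1 % p, -1, p) % p
    else:
        m = (y2 - y1) * pow((x2 - x1) % p, -1, p) % p
    x3 = (m * m - x1 - x2) % p
    return (x3, (m * (x1 - x3) - y1) % p)


def ec_mul(k, P, A, p):
    """Compute kP by successive doubling; a negative k uses -P."""
    if k < 0:
        k, P = -k, (None if P is None else (P[0], -P[1] % p))
    R = None
    while k:
        if k & 1:
            R = ec_add(R, P, A, p)
        P = ec_add(P, P, A, p)
        k >>= 1
    return R


def prime_factors(n):
    """Distinct prime factors of n by trial division."""
    factors, d = [], 2
    while d * d <= n:
        if n % d == 0:
            factors.append(d)
            while n % d == 0:
                n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors


def point_order(P, A, p):
    """Order of P following steps (1)-(6), assuming q = p is prime."""
    q = p
    Q = ec_mul(q + 1, P, A, p)                      # step (1)
    m = isqrt(isqrt(q)) + 1                         # step (2): m > q^(1/4)
    baby, R = {}, None
    for j in range(m + 1):                          # store x-coordinates only
        baby.setdefault(None if R is None else R[0], j)
        R = ec_add(R, P, A, p)
    G = ec_mul(2 * m, P, A, p)
    M = None
    for k in range(-m, m + 1):                      # step (3): giant steps
        S = ec_add(Q, ec_mul(k, G, A, p), A, p)
        key = None if S is None else S[0]
        if key in baby:
            j = baby[key]
            for cand in (q + 1 + 2 * m * k - j, q + 1 + 2 * m * k + j):
                if cand > 0 and ec_mul(cand, P, A, p) is None:
                    M = cand                        # step (4)
                    break
        if M is not None:
            break
    reduced = True
    while reduced:                                  # steps (5) and (6)
        reduced = False
        for r in prime_factors(M):
            if ec_mul(M // r, P, A, p) is None:
                M //= r
                reduced = True
                break
    return M


print(point_order((2, 3), -10, 557))  # 189
```

For clarity each giant step is recomputed with `ec_mul`; an efficient version would add 2mP repeatedly instead.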
We must now show that this method works. The first point to prove is
that there will always be a match in step (3):
Lemma 4.9. Let a be an integer with $|a| \leq 2m^2$. There exist integers $a_0$ and $a_1$ with $-m \leq a_0 \leq m$ and $-m \leq a_1 \leq m$ such that
$$a = a_0 + 2ma_1$$
Proof Let $a_0 \equiv a \pmod{2m}$, with $-m < a_0 \leq m$ and $a_1 = (a - a_0)/2m$. Now the integer $a_0$ clearly exists and satisfies the conditions of the lemma.
$$|a_1| \leq \frac{2m^2 + m}{2m} = \frac{2m + 1}{2} < m + 1$$
Because $a_1$ is an integer we see $|a_1| \leq m$, and so it also satisfies the conditions of the lemma. Finally we see that, as required
$$a_0 + 2ma_1 = a_0 + (a - a_0) = a$$
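The decomposition in Lemma 4.9 is constructive, and the following short sketch checks it exhaustively for a small m. The function name `decompose` is our own.

```python
def decompose(a, m):
    """Return (a0, a1) with a = a0 + 2*m*a1 and -m < a0 <= m."""
    a0 = a % (2 * m)          # representative in [0, 2m)
    if a0 > m:
        a0 -= 2 * m           # shift into (-m, m]
    a1 = (a - a0) // (2 * m)  # exact division, since a0 = a (mod 2m)
    return a0, a1


m = 5
for a in range(-2 * m * m, 2 * m * m + 1):
    a0, a1 = decompose(a, m)
    assert a == a0 + 2 * m * a1
    assert -m < a0 <= m and -m <= a1 <= m

print(decompose(-9, 5))  # (1, -1)
```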
Let a = a0 + 2ma1 be as in the lemma. Let k = −a1 which is reasonable
as −a1 will be one of the k’s tested. Then
Q + k(2mP ) = (q + 1 − 2ma1 )P = (q + 1 − a + a0 )P
= N P + a0 P = a0 P = ±jP
where j = |a0 |. This is again reasonable as one of the j’s will be |a0 |. So we
see that we will always find a match in stage (3).
To make the conclusion of part (4) note that
(q + 1 + 2mk ∓ j)P = [Q + k(2mP )] ∓ jP
= [±jP ] ∓ jP = ∞
by the rules of elliptic curve addition.
We must now show that step (6) yields the order of P , and the algorithm
will find the order of the point.
Lemma 4.10. Let G be an additive group (with identity 0), and let g ∈ G. Suppose Mg = 0 for some positive integer M. Let $p_1, \dots, p_r$ be the distinct primes dividing M. If $(M/p_i)g \neq 0$ for all i, then M is the order of g.
Proof Let k be the order of g, then k | M. Suppose k ≠ M and let $p_i$ be a prime dividing M/k. Then $p_i k | M$ so $k | (M/p_i)$. Therefore $(M/p_i)g = 0$ contrary to assumption. Therefore k = M.
Therefore step (6) finds the order of P .
Example 4.6. Let E be the elliptic curve $y^2 = x^3 - 10x + 21$ over F557 and let P = (2, 3). We will show P has order 189 as stated in Example 4.3 using the procedure above.
(1) Q = 558P, which using successive doubling is (418,33...)
(2) Let m = 5 which is greater than $557^{1/4}$ (= 4.858...). The list of jP is
∞, (2, 3), (58, 164), (44, 294), (56, 339), (132, 364)
(3) When k = 1 we have Q + k(2mP) = (2, 3) which matches a point on the list, when j = 1
(4) We have (q + 1 + 2mk − j)P = 567P = ∞.
(5) Factor $567 = 3^4 \cdot 7$
(6) (567/3)P = 189P = ∞. So we now try again with $189 = 3^3 \cdot 7$.
(7) (189/3)P = (38, 535) ≠ ∞ and (189/7)P = (136, 360) ≠ ∞. Therefore 189 is the order of P.
As stated in Example 4.3 this allows us to determine #E(F557 ) = 567.
Notes on this algorithm:
• To save storage space only store the x-coordinates of the points jP .
• Computing Q + k(2mP) can be done by computing Q and 2mP once
only, and then using it for all points. Then to get from Q + k(2mP ) to
Q+(k+1)(2mP ) simply add 2mP rather than recomputing everything.
Similarly once jP has been computed just add P to get (j + 1)P .
• The baby steps are from the point jP to (j + 1)P , while the giant steps
are from k(2mP ) to (k + 1)(2mP ). The second step is far bigger, 2mP
instead of P , hence the name of the algorithm.
Chapter 5
Elliptic curve cryptography
We start this chapter by introducing the basic terms used in cryptography,
and then move on to discuss public key cryptography in more detail. We give
the definitions of two public key systems, one for key exchange and one for
encryption, and show how they can be adapted for use with elliptic curves.
Most of the cryptographic definitions and explanations are well known
and here the basics are adapted from [7] Chapter 1. The background on
public key schemes in Section 5.2 was adapted from Chapter 6 of [8].
5.1 The basics of cryptography
In keeping with the traditions of cryptographic discussion suppose that we
have two users Alice and Bob who wish to communicate securely so that the
eavesdropper, Eve, does not learn about the information exchanged. They
will use cryptography, the science of keeping messages secure.
If Alice wishes to send the plaintext, M , (her message) to Bob she will
use some encryption function (E) to transform this message to ciphertext,
C. This ciphertext should be unintelligible to any third party, but also able
to be decrypted once it has been received by Bob.
Plaintext −→ Encryption −→ Ciphertext −→ Decryption −→ Plaintext
We will think of the plaintext (and ciphertext) as strings of 0s and 1s (bits)
which almost all messages (text, pictures etc.) can be converted into.
The cryptographic algorithm that is used for encryption and decryption is known as the cipher. Restricted algorithms have security based on keeping this algorithm a secret. Such a requirement is unrealistic given any relatively large system and also allows no quality control or standardization of the algorithm. Kerckhoffs' assumption (1883) was that the secrecy of a cryptosystem must rely on a key and not the cipher.
It is these key based systems that are used in practice, with the keyspace,
K, defined as the range of possible keys. Increasing the key by 1 bit will
double the size of the key space, so adding 5 bits for example, will make the
keyspace 32 times bigger. There are two main types of key-based cryptosystems:
• Symmetric key algorithms use the same key for both encryption and
decryption (or the decryption key can be easily derived from the encryption key).
EK (M ) = C,
DK (C) = M
Alice and Bob need to agree on this secret key before they can communicate securely.
• Public key algorithms use separate keys for encryption and decryption.
EK1 (M ) = C,
DK2 (C) = M
The encryption key is often known as the public key and the decryption key as the private key. Because the encryption key is known publicly, Alice does not need to have had prior communication with Bob to send him a message.
A cryptosystem is an algorithm, plus all possible plaintexts, ciphertexts
and keys.
Cryptanalysis is the attempt to obtain the plaintext without access to
the key, by attacking the system. The most basic form of attack would be
to try every possible key until the correct one is found, which is known as a
brute-force attack. It is important to make the keyspace large enough for this
to be infeasible. However a larger key will result in more time and memory
needed to perform the algorithm and so there is a trade off to consider.
There are many other more sophisticated attacks that a cryptanalyst can
employ, which users of a cryptosystem must consider. A cryptosystem would
be unconditionally secure if no matter how much ciphertext an opponent
has they are unable to derive the plaintext. There has only ever been one
such cryptosystem, the one time pad. This system had a key as long as the
message itself, which could only be used once and so is not very practical.
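As an illustration of a symmetric system where the same key both encrypts and decrypts, a one time pad on byte strings can be sketched as below. We assume the common realisation in which the key is combined with the message by bitwise XOR (the text does not specify the operation), and the function name `otp_encrypt` is hypothetical.

```python
import secrets


def otp_encrypt(message: bytes, key: bytes) -> bytes:
    """XOR the message with a key of the same length."""
    if len(key) != len(message):
        raise ValueError("key must be exactly as long as the message")
    return bytes(m ^ k for m, k in zip(message, key))


# XOR is its own inverse, so decryption is the same operation.
otp_decrypt = otp_encrypt

message = b"ELLIPTIC CURVES"
key = secrets.token_bytes(len(message))   # fresh random key, used once
ciphertext = otp_encrypt(message, key)
print(otp_decrypt(ciphertext, key) == message)  # True
```

The need for a fresh key as long as every message is exactly what makes the scheme impractical.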
Most systems aim for computational security which is when the cryptosystem cannot be broken with ’available resources’. This can be defined in
a variety of ways, including the amount of time, data and memory required.
There are other applications of cryptography in addition to keeping messages secure that can be of great use. These include:
• Authentication: A system with authentication is able to prove the origin of a message. If Bob receives a message it would be valuable to
know for sure that it was sent by Alice and not some impostor.
• Integrity: A system with integrity would allow Bob to be sure that the
message he has received has not been modified.
• Nonrepudiation: If a system provides nonrepudiation then Alice would
not be able to falsely deny sending a message to Bob.
Public key systems, in particular, allow for these other applications.
Elliptic curves are used to create public key cryptosystems which we focus
on in the next section. However, at present public key systems are too cumbersome for large scale use and so messages are still encoded with symmetric
key algorithms. In most industrial cryptosystems public key is used to create
the key needed for the symmetric algorithm which sends the message. Since
symmetrical algorithms still play such an important part we briefly look at
them here.
These algorithms are usually based on substitutions (swapping a bit
stream for another) and permutations (rearranging the ones we have). A
simple example is the Caesar cipher (used by the Roman commander to
communicate with his generals). Each letter is replaced by the one three
places to its right in the alphabet (modulo 26). For example:
CRYPTOGRAPHY −→ FUBSWRJUDSKB
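The substitution above can be sketched in a few lines of Python. This is only an illustration; the function name `caesar_shift` and the choice to leave non-letters unchanged are assumptions made here, not part of the original description.

```python
import string

ALPHABET = string.ascii_uppercase  # "ABC...Z"

def caesar_shift(text: str, shift: int = 3) -> str:
    """Shift each letter A-Z by `shift` places modulo 26; other characters are kept."""
    out = []
    for ch in text.upper():
        if ch in ALPHABET:
            out.append(ALPHABET[(ALPHABET.index(ch) + shift) % 26])
        else:
            out.append(ch)
    return "".join(out)

ciphertext = caesar_shift("CRYPTOGRAPHY", 3)   # encrypt with a shift of three
plaintext = caesar_shift(ciphertext, -3)       # decrypt by shifting back
print(ciphertext, plaintext)
```

Decryption is the same operation with the opposite shift, which is why this counts as a symmetric scheme: the decryption key is trivially derived from the encryption key.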
Such a simple example could be easily broken by looking at the letter frequencies, for example. However there are much more sophisticated systems
used in practice. Two such examples are the block ciphers, DES and AES.
DES (the Data Encryption Standard) is a block cipher with a 56-bit key,
constructed by IBM and the NSA and adopted by the USA in ’76. It enjoyed widespread
use internationally but in recent years has been considered insecure for many
applications. This is chiefly due to the 56-bit key size being too small; DES
keys have been broken in less than 24 hours.
AES (the Advanced Encryption Standard) is a block cipher with keys of at
least 128 bits, constructed by two Belgian cryptographers, Joan Daemen and
Vincent Rijmen, which often goes by its original name, Rijndael. This cipher
was adopted, after a 5-year standardization process, by the USA in 2001 to
replace DES. Notice that the keyspace is substantially bigger (recall that
one extra bit doubles the keyspace).
5.2 Public key cryptography
Public key cryptography (also known as asymmetric) uses two separate keys,
as opposed to symmetric encryption where the decryption key is easily derived from the encryption key. This use of two keys has profound consequences in the areas of key distribution and authentication.
It should also be noted that from its earliest beginnings to modern times
cryptography has been based on permutations and substitutions (from the
rotor machines of WWII to complicated computer code like DES). Public
key revolutionised this, basing algorithms on mathematical functions.
In 1976 Whitfield Diffie and Martin Hellman came up with the idea of public
key cryptography as a method of solving the problem of key distribution
and the need for digital signatures in symmetric cryptography, by using two
different but related keys for encryption and decryption. They recognised
that it must be computationally infeasible to determine the decryption key
given the knowledge of the cryptographic algorithm and encryption key. Figure 5.1
demonstrates how such a system would allow Alice to securely send
a message to Bob without any prior contact.
Some algorithms will also have the property that either of the two keys
can be used for encryption with the other used for decryption. In this case
the public key algorithm could be used for authentication as in Figure 5.2. In
addition to knowing the message could only have come from Alice, Bob can
also be sure of the data integrity, as no-one without access to Alice’s private
key could have altered the message.
Figure 5.1: Public key encryption: Alice encrypts a message with Bob’s
public key and sends it to him. Only Bob could read this message as only he
has access to the private key necessary for decryption.
Figure 5.2: Public key authentication: Alice encrypts a message with her
private key and sends it to Bob. Only Alice could have sent the message as
only she has access to the private key necessary for encryption.
The authenticated message could be read by anyone who has access to
Alice’s public key, so it must also be encrypted with Bob’s public key to be
secure. To be more efficient Alice should only encrypt a small segment with
her private key for authentication purposes (an authenticator block ) and then
encrypt the whole message in Bob’s public key.
Diffie & Hellman recognised the possible uses of such public key cryptosystems:
• Encryption / decryption: The sender encrypts a message with the recipient’s public key.
• Digital signature: The sender signs a message with his private key.
• Key exchange: Two sides cooperate to exchange a session key.
Although they postulated this system, Diffie & Hellman did not demonstrate
that such an algorithm for encryption exists (although they did propose a
scheme for key exchange which is examined in more detail in the next section).
Diffie & Hellman also recognised the need for a trapdoor one-way function
in such a system. A one-way function maps a domain so that every function value
has a unique inverse, with the condition that the calculation of the function
value is easy whereas the calculation of the inverse is infeasible. (Easy
implies polynomial length computation time.) A trapdoor one-way function
is the same except that the inverse is easy to compute if certain additional
information is known. Therefore we require a function $f_k$ such that:
$Y = f_k(X)$ is easy to compute, if $k$ and $X$ are known
$X = f_k^{-1}(Y)$ is easy to compute, if $k$ and $Y$ are known
$X = f_k^{-1}(Y)$ is infeasible to compute, if $Y$ is known but $k$ is not
The classic example of such a function is the multiplication of two large
primes. While it is relatively easy to multiply the two primes it is extremely
difficult to factorise the product, unless some other information is known.
The first successful algorithm for public key encryption was RSA in 1978,
named after its creators Ron Rivest, Adi Shamir and Len Adleman. This
system relied on the prime factorisation problem described above and has
since been widely used in a variety of applications. Although an important
subject in cryptography it is not used in conjunction with elliptic curves and
so not discussed here.
As with symmetric schemes, the security of a public key system depends
on the size of the key, and any algorithm would be vulnerable to a brute
force attack of trying all possible keys. The countermeasure is to use large
keys, however unlike symmetric schemes the computation time may not rise
linearly with the key size and so there is a trade off between security and
practicality. In practice the key sizes that make brute force attacks impractical
result in encryption speeds that are too slow for general use. This is
why, as mentioned earlier, public key cryptography has been confined to key
management and signature applications, such as key exchange and authentication.
The actual message to be transferred is then encoded with a symmetric
key system (e.g. AES). Due to the level of computation involved in public key
systems this is likely to remain the case for some time, with Whitfield Diffie himself
saying, ‘the restriction of public key cryptography to key management and
signature applications is almost universally accepted’.
5.3 The discrete logarithm problem
Diffie & Hellman derived an algorithm that allowed users to exchange a key
securely, which can then be used in the subsequent encryption of messages.
It appeared in the original paper by Diffie & Hellman (’76) and has been
employed in a number of commercial products. The algorithm depends on
the difficulty of computing discrete logarithms.
Recall that a primitive root of a prime $p$ is a number whose powers
generate the integers from 1 to $p - 1$. So if $\alpha$ is a primitive root of a prime
number $p$ then the numbers $\alpha, \alpha^2, \alpha^3, \ldots, \alpha^{p-1} \pmod p$ are distinct and consist
of the integers 1 through $p - 1$ in some permutation. For any integer $\beta$ not
divisible by $p$ one can find a unique exponent $a$ such that
\[ \beta \equiv \alpha^a \pmod p, \qquad \text{where } 0 \le a \le p - 2. \]
The exponent $a$ is referred to as the discrete logarithm and is denoted by
$\mathrm{ind}_{\alpha,p}(\beta)$.
We are able to define a one-way function with discrete logarithms since it
is relatively easy to calculate $b = \alpha^a \pmod p$ but extremely difficult to find $a$
given $b$, $\alpha$ and $p$. Diffie & Hellman originally recognised the problem below for
the multiplicative group $\mathbb{Z}_p^*$ (see Appendix B.3), however in the next section
we show how it can be redefined for the groups formed by elliptic curves.
Discrete log problem: Let $p$ be prime, $\alpha$ a primitive element of $\mathbb{Z}_p^*$ and $\beta \in \mathbb{Z}_p^*$.
Find the unique integer $a$, $0 \le a \le p - 2$, such that $\alpha^a \equiv \beta \pmod p$.
There is no known efficient (polynomial time) algorithm to solve the discrete log problem, provided p is carefully chosen.
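The asymmetry between the two directions can be seen in a minimal Python sketch. The values p = 97 and α = 5 are small illustrative choices made here (5 is a primitive root of 97), and `discrete_log_brute_force` is a hypothetical helper name. The forward direction uses fast modular exponentiation, while the inverse is found here only by trying every exponent, which takes time proportional to p.

```python
def discrete_log_brute_force(alpha: int, beta: int, p: int):
    """Return the smallest a in 0..p-2 with alpha**a == beta (mod p), or None."""
    value = 1                      # alpha**0
    for a in range(p - 1):
        if value == beta % p:
            return a
        value = (value * alpha) % p
    return None

p, alpha = 97, 5                   # toy parameters, far too small for real use
a = 36
beta = pow(alpha, a, p)            # easy direction: square-and-multiply
recovered = discrete_log_brute_force(alpha, beta, p)   # hard direction: exhaustive search
print(beta, recovered)
```

For a prime of several hundred digits the call to `pow` is still fast, whereas the exhaustive loop is hopeless; better algorithms than brute force exist, but none is known to run in polynomial time.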
5.3.1 Diffie-Hellman key exchange
This description of the key exchange and following example was adapted
from Chapter 6.4 of [8]. Suppose Alice and Bob want to securely exchange
a key for future communications. To use the classical version of the
Diffie-Hellman key exchange they would proceed as follows.
1. A prime number $p$ and a primitive root of $p$, $\alpha$, are known publicly.
2. Alice selects a random integer $X_A < p$ and computes $Y_A = \alpha^{X_A} \pmod p$.
Bob selects a random integer $X_B < p$ and computes $Y_B = \alpha^{X_B} \pmod p$.
3. Each user keeps $X$ secret and sends $Y$ to the other.
4. Alice computes $K = (Y_B)^{X_A} \bmod p$. Bob computes $K = (Y_A)^{X_B} \bmod p$.
These two calculations produce identical results since
\[ (Y_A)^{X_B} = (\alpha^{X_A})^{X_B} = \alpha^{X_A X_B} = (\alpha^{X_B})^{X_A} = (Y_B)^{X_A} \pmod p \]
and so the two sides have exchanged a secret key. The only information
an attacker has to work with is $p$, $\alpha$, $Y_A$ and $Y_B$. It is believed that it is
computationally infeasible to obtain $K$ from this information. The obvious
attack is to take a discrete logarithm and compute $X_B = \mathrm{ind}_{\alpha,p}(Y_B)$.
The attacker's task is summarised as the following problem.
The Diffie-Hellman problem: Given $p$ prime, $\alpha$ a primitive root modulo $p$ and
elements $\alpha^a \pmod p$ and $\alpha^b \pmod p$, find $\alpha^{ab} \pmod p$.
The security of the Diffie Hellman Key Exchange lies in the fact that it is
relatively easy to calculate exponentials modulo a prime but very difficult to
calculate discrete logarithms. For large primes the latter task is considered
infeasible. However it has not been proved that there is no other way to solve
the Diffie-Hellman problem, other than first finding the discrete log.
Example 5.1. Suppose $p = 97$, $\alpha = 5$, $X_A = 36$, $X_B = 58$. Then
\[ Y_A = 5^{36} \equiv 50 \pmod{97} \quad \text{and} \quad Y_B = 5^{58} \equiv 44 \pmod{97} \]
Alice and Bob will exchange $Y$’s and each compute:
\[ K_A = (Y_B)^{X_A} = 44^{36} \equiv 75 \pmod{97}, \qquad K_B = (Y_A)^{X_B} = 50^{58} \equiv 75 \pmod{97} \]
From {50, 44} the attacker cannot easily compute the shared secret key, 75.
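The numbers in Example 5.1 can be reproduced with Python's built-in three-argument `pow`, which performs modular exponentiation. The variable names below are chosen here for illustration only.

```python
p, alpha = 97, 5          # public prime and primitive root from Example 5.1
x_a, x_b = 36, 58         # Alice's and Bob's secret exponents

y_a = pow(alpha, x_a, p)  # Alice's public value Y_A
y_b = pow(alpha, x_b, p)  # Bob's public value Y_B

k_a = pow(y_b, x_a, p)    # key as computed by Alice
k_b = pow(y_a, x_b, p)    # key as computed by Bob

print(y_a, y_b, k_a, k_b)
```

Both parties arrive at the same key without either secret exponent ever being transmitted.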
5.3.2 The El Gamal cryptosystem
This is a public key cryptosystem based on the discrete log problem, first
proposed in 1984. It will allow Alice to securely send a message to Bob
without prior communication. This description of the El Gamal system was
adapted from Chapter 6.2 of [10]. For simplicity, assume the message can be
stored as an element of $\mathbb{Z}_p^*$ and define the algorithm as follows.
The key is formed from the prime $p$, the primitive root $\alpha$, an integer $a$
and $\beta = \alpha^a \pmod p$. The values $p$, $\alpha$, $\beta$ are made public while $a$ is kept
private. If Alice wants to send a message, $M \in \{1, \ldots, p - 1\}$, to Bob she
proceeds as follows.
1. Alice selects a random integer $r \in \mathbb{Z}_p^*$.
2. Alice computes $y_1 = \alpha^r \pmod p$ and $y_2 = M\beta^r \pmod p$.
3. Alice sends the ciphertext $C = (y_1, y_2)$ to Bob.
4. Bob uses his private key, $a$, to calculate $y_2 y_1^{p-1-a} \pmod p$ which gives
the message $M$.
The decryption in the final step works because
\[ y_2 y_1^{p-1-a} \equiv y_2 y_1^{-a} \quad \text{since } x^{p-1} \equiv 1 \pmod p \]
\[ = (M\beta^r)(\alpha^r)^{-a} \quad \text{by the definition of } y_1 \text{ and } y_2 \]
\[ = M(\beta^r)(\alpha^{-ar}) = M(\alpha^{ar})(\alpha^{-ar}) \equiv M \pmod p \]
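The four steps and the decryption identity above can be sketched as follows. This is a toy illustration under stated assumptions: the function names are hypothetical, the parameters p = 97 and α = 5 are far too small for real use, and Python's `secrets` module is used here simply as a convenient source of random integers.

```python
import secrets

def elgamal_keygen(p: int, alpha: int):
    """Return (private a, public beta) with beta = alpha**a mod p."""
    a = secrets.randbelow(p - 2) + 1          # private exponent in 1..p-2
    return a, pow(alpha, a, p)

def elgamal_encrypt(m: int, p: int, alpha: int, beta: int):
    """Encrypt m in 1..p-1 with a fresh random r; returns (y1, y2)."""
    r = secrets.randbelow(p - 2) + 1          # new r for every message
    y1 = pow(alpha, r, p)
    y2 = (m * pow(beta, r, p)) % p
    return y1, y2

def elgamal_decrypt(y1: int, y2: int, p: int, a: int) -> int:
    """Recover m as y2 * y1**(p-1-a) mod p, as in step 4."""
    return (y2 * pow(y1, p - 1 - a, p)) % p

p, alpha = 97, 5                              # toy public parameters
a, beta = elgamal_keygen(p, alpha)
y1, y2 = elgamal_encrypt(42, p, alpha, beta)
print(elgamal_decrypt(y1, y2, p, a))
```

Raising to the power p - 1 - a avoids computing a modular inverse explicitly, which is exactly the step justified by Fermat's little theorem in the derivation.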
Any third party would know $p$, $\alpha$, $\beta$, $y_1 = \alpha^r$ and $y_2 = M\beta^r$. To recover $M$ a
third party could attempt to solve the discrete logarithm problem and find
$a$ from $\beta = \alpha^a$. If the problem is set up carefully then this is considered
infeasible.
It is important that Alice use a different random integer each time she
sends a message. Suppose the same $r$ was used to encrypt both $m_1$ and $m_2$
and the resulting ciphertexts were $(y_1, y_2)$, $(z_1, z_2)$. Then
\[ \frac{y_2}{z_2} = \frac{m_1 \beta^r}{m_2 \beta^r} = \frac{m_1}{m_2} \]
Then suppose that the secret message $m_1$ was made public at some later
point. If this happened then anyone who had stored the ciphertext could
easily compute the new secret message $m_2$ by calculating $m_1 z_2 / y_2 = m_2$.
Even worse, the eavesdropper can easily recognise that this mistake had been
made as $y_1$ would equal $z_1$.
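The consequence of reusing r can be demonstrated with toy numbers. Everything below (the parameters, the fixed r = 17 and the two messages) is an illustrative assumption chosen here; `pow(y2, -1, p)` computes a modular inverse and needs Python 3.8 or later.

```python
p, alpha = 97, 5                  # toy public parameters
a = 36                            # Bob's private key (unknown to the attacker)
beta = pow(alpha, a, p)           # Bob's public key
r = 17                            # Alice's mistake: the same r for both messages

def encrypt_with_fixed_r(m: int):
    """El Gamal encryption that (wrongly) reuses the same r every time."""
    return pow(alpha, r, p), (m * pow(beta, r, p)) % p

m1, m2 = 20, 63
y1, y2 = encrypt_with_fixed_r(m1)
z1, z2 = encrypt_with_fixed_r(m2)

reuse_detected = (y1 == z1)       # the eavesdropper sees identical first components
# Once m1 becomes public, m2 = m1 * z2 / y2 (mod p); no private key is needed
recovered_m2 = (m1 * z2 * pow(y2, -1, p)) % p
print(reuse_detected, recovered_m2)
```

Note that the attacker's computation uses only public values and the leaked message, never a or r.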
5.4 Elliptic curve cryptography
In this section we show how elliptic curves are able to perform the protocols
of the previous section. We describe the discrete logarithm for elliptic curves,
and how it can be used for key exchange and encryption.
5.4.1 The discrete logarithm problem for elliptic curves
The systems of the previous section were originally designed for the finite
abelian group $\mathbb{F}_q^\times$, the multiplicative group of a finite field. We will now
redefine them for use with the finite, additive, abelian group formed by an elliptic
curve over a finite field $\mathbb{F}_q$.
The elliptic curve analogue of multiplying two elements of $\mathbb{F}_q^\times$ is adding two
points in $E(\mathbb{F}_q)$. So where we were raising an element $P \in \mathbb{F}_q^\times$ to the $k$th power we
are now multiplying $P \in E(\mathbb{F}_q)$ by $k$. When using these systems in practice,
with large $k$, it will be necessary to use the method of successive doubling
described in Section 3.2.1.
Let $\alpha, \beta \in E(\mathbb{F}_q)$ and suppose we know $a\alpha = \beta$ for some integer $a$.
Then the discrete logarithm problem for elliptic curves would be to find the
integer $a$.
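A minimal Python sketch of the group operations involved is given below: point addition by the chord-and-tangent formulas and scalar multiplication by successive doubling. It is written for this section as an illustration and is not the SUCDOB.m or ECADP.m code used in the project; the names `ec_add` and `ec_mul` are hypothetical, `None` stands for the point ∞, and the field is assumed to be a prime field with p > 3.

```python
def ec_add(P, Q, A, p):
    """Add points on y^2 = x^3 + Ax + B over F_p (B does not appear in the formulas)."""
    if P is None:
        return Q
    if Q is None:
        return P
    x1, y1 = P
    x2, y2 = Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return None                                   # P + (-P) = infinity
    if x1 == x2:
        m = (3 * x1 * x1 + A) * pow((2 * y1) % p, -1, p) % p   # tangent slope
    else:
        m = (y2 - y1) * pow((x2 - x1) % p, -1, p) % p          # chord slope
    x3 = (m * m - x1 - x2) % p
    y3 = (m * (x1 - x3) - y1) % p
    return (x3, y3)

def ec_mul(k, P, A, p):
    """Compute kP for k >= 0 by successive doubling (double-and-add)."""
    result, addend = None, P
    while k > 0:
        if k & 1:
            result = ec_add(result, addend, A, p)
        addend = ec_add(addend, addend, A, p)
        k >>= 1
    return result

# Toy instance (assumed for illustration): y^2 = x^3 + 8x + 1 over F_101
A, p = 8, 101
P = (11, 39)
beta = ec_mul(7, P, A, p)

# Brute-force discrete log: add P repeatedly until beta is reached
a, Q = 0, None
while Q != beta:
    Q = ec_add(Q, P, A, p)
    a += 1
print(a, beta)
```

`ec_mul` needs only about log2(k) doublings, whereas the search loop needs up to one addition per candidate value of a; this gap is what the elliptic curve discrete logarithm problem relies on.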
One way of solving the problem would be to try all possible a (brute force
attack), so in cryptographic applications a is usually chosen from a range
large enough that it could be an integer of several hundred digits. There are
also more advanced attacks on the discrete logarithm problem which mean that
the elliptic curve E and finite field Fq need to be selected carefully. We
should specifically ensure that the order of E(Fq ) is large enough to
maintain security and that E is not supersingular.
Recall that an elliptic curve E in characteristic p is defined to be
supersingular if E[p] = {∞}. These curves are important as many calculations can
be done more quickly on them than on an arbitrary elliptic curve. Unfortunately,
however, discrete logarithms can be significantly easier to solve on
these curves and the cryptographic algorithms defined on them are open to
specific attacks. Some useful results for identifying supersingular curves can
be found in Appendix A.8.
As in the classical case, there is no known efficient method for solving a
well formed discrete logarithm problem for elliptic curves. We now look at
how the systems described in the previous section can be used with elliptic
curves. The description of these systems is adapted from Chapters 6.2 and
6.4 of [9] respectively.
5.4.2 Diffie-Hellman key exchange for elliptic curves
Here we describe the Diffie-Hellman key exchange for use with elliptic curves.
This will enable Alice and Bob to securely construct a key for use in a symmetric encryption scheme such as DES or AES.
1. Alice and Bob agree on an elliptic curve E over a finite field Fq so the
discrete logarithm problem is hard in E(Fq ).
They also agree on a point P ∈ E(Fq ) such that the subgroup generated
by P has large order (usually prime).
2. Alice chooses secret integer, a, computes Pa = aP and sends Pa to Bob.
3. Bob chooses secret integer, b, computes Pb = bP and sends Pb to Alice.
4. Alice computes aPb = abP . Bob computes bPa = abP .
5. Alice and Bob agree on a method to extract a key from abP . (For
example, use the last 256 bits of the x-coordinate.)
The only information the eavesdropper, Eve, has is the curve, E, the finite
field, Fq , and the points P, aP and bP . She will therefore need to solve:
Diffie-Hellman problem for elliptic curves: Given P, aP and bP in E(Fq )
compute abP .
If Eve can solve discrete logs in E(Fq ) then she could use P and aP to
find a. She could then compute a(bP ) to get abP . However, if E and Fq
are chosen carefully then this is considered computationally infeasible. It is
not known whether there is a way of computing abP without first solving a
discrete log problem.
Example 5.2. (From Chapter 6.5 of [8]) The following will allow Alice and
Bob to exchange a secret key:
1. Let $E$ be $y^2 = x^3 - 4$ defined over $\mathbb{F}_{211}$ and let $P = (2, 2) \in E(\mathbb{F}_{211})$.
Both of these are agreed publicly by Alice and Bob.
2. Alice chooses a secret integer, a = 121 and calculates
Pa = aP = 121(2, 2) = (115, 48)
where SUCDOB.m was used for the final step. Alice sends Pa to Bob.
3. Bob chooses a secret integer, b = 203 and calculates
Pb = bP = 203(2, 2) = (130, 203)
where SUCDOB.m was used for the final step. Bob sends Pb to Alice.
4. Alice computes aPb = 121(130, 203), which using SUCDOB.m gives (161, 69).
Bob computes bPa = 203(115, 48), which using SUCDOB.m gives (161, 69).
5. So Alice and Bob have securely generated the point (161, 69). They
will have previously agreed some way to extract a key from this point.
Any eavesdropper would know the system $E(\mathbb{F}_{211})$ and the points (2,2), (115,48)
and (130,203). To obtain (161,69) though, Eve would have to solve the
Diffie-Hellman problem for elliptic curves.
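The exchange in Example 5.2 can be re-run with a short Python sketch. The helper names are hypothetical and the code is an independent illustration, not the project's Matlab programs; `None` represents ∞ and `on_curve` is an assumed convenience check that a point satisfies the curve equation.

```python
def ec_add(P, Q, A, p):
    """Chord-and-tangent addition on y^2 = x^3 + Ax + B over F_p."""
    if P is None:
        return Q
    if Q is None:
        return P
    x1, y1 = P
    x2, y2 = Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return None
    if x1 == x2:
        m = (3 * x1 * x1 + A) * pow((2 * y1) % p, -1, p) % p
    else:
        m = (y2 - y1) * pow((x2 - x1) % p, -1, p) % p
    x3 = (m * m - x1 - x2) % p
    return (x3, (m * (x1 - x3) - y1) % p)

def ec_mul(k, P, A, p):
    """kP by successive doubling."""
    result, addend = None, P
    while k > 0:
        if k & 1:
            result = ec_add(result, addend, A, p)
        addend = ec_add(addend, addend, A, p)
        k >>= 1
    return result

def on_curve(P, A, B, p):
    """True if P is infinity or satisfies y^2 = x^3 + Ax + B over F_p."""
    if P is None:
        return True
    x, y = P
    return (y * y - (x ** 3 + A * x + B)) % p == 0

A, B, p = 0, -4, 211              # E: y^2 = x^3 - 4 over F_211
P = (2, 2)
a, b = 121, 203                   # Alice's and Bob's secret integers

P_a = ec_mul(a, P, A, p)          # sent from Alice to Bob
P_b = ec_mul(b, P, A, p)          # sent from Bob to Alice
shared_alice = ec_mul(a, P_b, A, p)
shared_bob = ec_mul(b, P_a, A, p)
print(P_a, P_b, shared_alice, shared_bob)
```

Checking `on_curve` on every received point is a cheap sanity test in the spirit of the check.m program mentioned in Example 5.3.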
5.4.3 El Gamal cryptosystem for elliptic curves
Here we describe the El Gamal cryptosystem adapted for use with elliptic
curves. Suppose Alice wants to send a message to Bob. Bob will establish his
public key as follows. Choose an elliptic curve E over a finite field Fq such
that the discrete log problem is hard for E(Fq ). He also chooses a point, P ,
on E (usually so that the order of P is a large prime). He chooses a secret
integer s and computes B = sP.
Bob’s public key consists of E, Fq , and the points P and B, while the
integer s is kept private. To send a message to Bob, Alice proceeds as follows:
1. Alice obtains Bob’s public key and encodes her message as a point,
M ∈ E(Fq ).
2. Alice chooses a secret random integer r and computes
M1 = rP
and
M2 = M + rB
3. Alice sends M1 , M2 to Bob.
4. Bob decrypts by calculating M2 − sM1
The decryption works because
M2 − sM1 = (M + rB) − s(rP )
= (M + rsP ) − s(rP ) = M
An eavesdropper would know Bob’s public information and the points $M_1$, $M_2$.
If she could calculate discrete logs then she could use $P$ and $B$ to find $s$, and
then decrypt the message. This should be infeasible for a careful choice of
system. There is not any other known way to find $M$.
As in the classical case it is important that Alice uses a different random
integer, $r$, each time. If the same $r$ were used to encrypt both $M$ and $M'$
then the eavesdropper would notice that $M_1 = M_1'$. She would then compute
\[ M_2' - M_2 = M' + rB - M - rB = M' - M \]
If at any point in the future the original message, $M$, were made public then
Eve could easily calculate the new message, $M'$.
Example 5.3. The following is an example of how Alice would send a message to Bob using the El Gamal cryptosystem adapted for elliptic curves. It
was generated using the Matlab programs created throughout the project.
Bob chooses $E$ to be $y^2 = x^3 + 8x + 1$ defined over $\mathbb{F}_{101}$ and $P$ to be
$(11, 39) \in E(\mathbb{F}_{101})$. (To generate a list of elements on $E(\mathbb{F}_{101})$ PC.m was
used). Bob then chooses s = 96 and calculates
B = sP = 96(11, 39) = (26, 98)
using SUCDOB.m
(To ensure no errors were made we use check.m to guarantee this (and all
following points) are on E(F101 ).) Bob makes E, Fq , P and B public while
keeping s private. To send a message to Bob Alice proceeds as follows.
1. Alice obtains Bob’s public key and encodes her message as
M = (74, 91) ∈ E(F101 ).
2. Alice chooses her secret integer r = 128 and computes
M1 = rP = 128(11, 39) = (85, 76)
M2 = M + rB = (74, 91) + 128(26, 98) = (74, 91) + (3, 70) = (76, 72)
(To perform the multiplication steps SUCDOB.m was used, while ECADP.m
was used for the addition steps.)
3. Alice sends M1 and M2 to Bob.
4. Bob calculates
M2 − sM1 = (76, 72) − 96(85, 76) = (76, 72) − (3, 70)
= (76, 72) + (3, −70) = (74, 91) = M
So Bob has securely received Alice’s message M .
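The encryption and decryption steps of Example 5.3 can be sketched in Python as follows. This is an independent illustration with hypothetical function names, not the Matlab programs (SUCDOB.m, ECADP.m) used in the project; `None` stands for ∞ and subtraction of a point is implemented as addition of its negative (x, −y).

```python
def ec_add(P, Q, A, p):
    """Chord-and-tangent addition on y^2 = x^3 + Ax + B over F_p."""
    if P is None:
        return Q
    if Q is None:
        return P
    x1, y1 = P
    x2, y2 = Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return None
    if x1 == x2:
        m = (3 * x1 * x1 + A) * pow((2 * y1) % p, -1, p) % p
    else:
        m = (y2 - y1) * pow((x2 - x1) % p, -1, p) % p
    x3 = (m * m - x1 - x2) % p
    return (x3, (m * (x1 - x3) - y1) % p)

def ec_mul(k, P, A, p):
    """kP by successive doubling."""
    result, addend = None, P
    while k > 0:
        if k & 1:
            result = ec_add(result, addend, A, p)
        addend = ec_add(addend, addend, A, p)
        k >>= 1
    return result

def ec_neg(P, p):
    """Negative of a point: (x, y) -> (x, -y)."""
    return None if P is None else (P[0], (-P[1]) % p)

def encrypt(M, P, B_pub, r, A, p):
    """Return (M1, M2) = (rP, M + rB)."""
    return ec_mul(r, P, A, p), ec_add(M, ec_mul(r, B_pub, A, p), A, p)

def decrypt(M1, M2, s, A, p):
    """Return M2 - s*M1."""
    return ec_add(M2, ec_neg(ec_mul(s, M1, A, p), p), A, p)

A, p = 8, 101                       # E: y^2 = x^3 + 8x + 1 over F_101
P, s = (11, 39), 96                 # Bob's base point and private key
B_pub = ec_mul(s, P, A, p)          # Bob's public point B = sP

M, r = (74, 91), 128                # Alice's message point and random integer
M1, M2 = encrypt(M, P, B_pub, r, A, p)
recovered = decrypt(M1, M2, s, A, p)
print(B_pub, M1, M2, recovered)
```

In practice r would be drawn at random for every message rather than fixed as it is here to mirror the example.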
Chapter 6
Summary and conclusions
In this project we studied the mathematics of elliptic curves, starting with
their definition and the proof that points upon them can form an additive
abelian group. We then showed how, using points on this group, we could
form a discrete logarithm problem which is the basis of several public key
cryptography systems. Finally we demonstrated how elliptic curves could be
used for key exchange and encryption. These cryptosystems are considered
secure providing they are set up carefully, which is where results such as
Hasse’s theorem on the group size are useful.
There were, however, numerous areas of elliptic curve mathematics that
were omitted from this project. For example, the specific attacks that can
be used against the elliptic curve discrete log problem, or other algorithms
for finding the order of E(Fq ). There are also a number of non-cryptographic
uses for elliptic curves, such as the proof of Fermat’s last theorem and in
the areas of primality testing and factorisation. This could be considered
ironic since breakthroughs in these areas would damage the security of RSA
— the system elliptic curve cryptography could replace. For further details
of the elliptic curve discrete log problem and the non-cryptographic uses of
elliptic curves see Chapters 5 and 7 of [9] respectively. More background on
the history and development of public key cryptography can be found in [6]
while [2] gives a far more detailed examination of elliptic curve cryptography.
We have demonstrated how elliptic curves can be used to create public
key systems for both key exchange and encryption. It is also possible to use
elliptic curves to form an analogue of the popular RSA system. However,
these were not discussed here since they are based on the same underlying
hard problem (factorising the product of two large primes) and offered no
real advantage over the classical RSA system.
This however, is not the case for the elliptic curve schemes using discrete
logarithms. At present the methods for computing elliptic curve discrete
logarithms are much less efficient than their classical counterparts. As a
result shorter key sizes can be employed for the elliptic curve schemes with
obvious memory and performance benefits. As mentioned earlier, there are
specific attacks that can be employed against elliptic curves, but these can
be avoided if the system is set up carefully.
When comparing an elliptic curve system with the widely implemented
RSA scheme there are also obvious benefits. Since both schemes are largely
used in conjunction with a symmetric scheme we compare them as to the
security needed for this. On the NSA website (see [11]) it is claimed that to
provide security for a 128-bit symmetric key an RSA scheme would require
a 3072-bit key, while an elliptic curve scheme would only require a 256-bit
key. It is also claimed here that, the ‘United States, the UK, Canada and
certain other NATO nations have all adopted some form of elliptic curve
cryptography for future systems to protect classified information throughout
and between their governments’.
Despite the obvious advantages elliptic curve schemes are yet to enjoy the
success of RSA. This is because they have yet to generate the same level of
confidence that RSA has, through years of testing and use. However, elliptic
curves are the subject of continued research and development, and in future
years their use may become widespread.
Bibliography
[1] J. W. Archbold, Algebra, Fourth Edition, Pitman Paperbacks, 1970.
[2] H. Cohen, G. Frey, Handbook of elliptic and hyperelliptic curve cryptography, Chapman & Hall/CRC, 2006.
[3] J. B. Fraleigh, A first course in abstract algebra, Fifth Edition, Addison-Wesley, 1994.
[4] W. Fulton, Algebraic curves, W. A. Benjamin, Inc., 1969.
[5] N. Koblitz, A course in number theory and cryptography, Springer, 1994.
[6] S. Levy, Crypto, Allen Lane, 2000.
[7] B. Schneier, Applied cryptography, Second Edition, John Wiley, 1996.
[8] W. Stallings, Cryptography and network security, Third Edition, Prentice Hall, 2003.
[9] L. C. Washington, Elliptic curves, Chapman & Hall/CRC, 2003.
[10] Course notes - MT362 Cipher systems, Royal Holloway University of London, 2004.
[11] NSA website: The case for elliptic curve cryptography.
http://www.nsa.gov/ia/industry/crypto_elliptic_curve.cfm?MenuID=10.2.7
Appendix A
Elliptic curve material
A.1 Singular curves
Throughout this project we have been working with $y^2 = x^3 + Ax + B$ under
the assumption that $x^3 + Ax + B$ has distinct roots. The reason given for
this assumption was that an elliptic curve will have a singular point if and
only if it has multiple roots, and these singular points cause problems for
the elliptic curve addition operation. In this section we prove this result and
examine what happens when the curves have multiple roots. We show that
by defining the set $E_{ns}(K)$ of non singular points on these curves, the elliptic
curve addition becomes either addition of elements in $K$, or multiplication
of elements in $K^\times$ or a quadratic extension of $K$.
Note that if $x^3 + Ax + B$ has a triple root then by translating we can
assume the root is at $x = 0$, and so the curve has equation $y^2 = x^3$. Similarly
if there is a double root we may assume this root is at zero and so $E$ has
equation $y^2 = x^2(x + a)$ for some $a \neq 0$.
A.1.1 The relationship between multiple roots and singular points
We show here that an elliptic curve has singular points if and only if it has
multiple roots. This result was not adapted from any reference but proved
directly from the definition.
First recall that a singular point on a curve is a point where the curve is
not smooth (i.e. not differentiable). For algebraic curves the singular points
are those points where both partial derivatives vanish. Elliptic curves can be
described as algebraic curves by rewriting the Weierstrass equation as
\[ f(x, y) = y^2 - x^3 - Ax - B = 0 \]
and a point $(x_0, y_0)$ is singular if $f_x(x_0, y_0) = f_y(x_0, y_0) = 0$.
Theorem A.1. An elliptic curve with multiple roots has a singular point.
Proof We prove this for the two different cases.
(i) In the case when there is a triple root, $y^2 = x^3$ so
\[ f(x, y) = x^3 - y^2, \qquad \frac{\partial f}{\partial x} = 3x^2, \qquad \frac{\partial f}{\partial y} = -2y \]
At the point $(x, y) = (0, 0)$ all three of the above expressions are zero,
so (0,0) is a singular point.
(ii) In the case when there is a double root, $y^2 = x^2(x + a)$ so
\[ f(x, y) = x^3 + ax^2 - y^2, \qquad \frac{\partial f}{\partial x} = 3x^2 + 2ax, \qquad \frac{\partial f}{\partial y} = -2y \]
At the point $(x, y) = (0, 0)$ all three of the above expressions are zero,
so (0,0) is a singular point.
Theorem A.2. An elliptic curve with a singular point has multiple roots.
Proof Consider the Weierstrass equation
\[ y^2 = x^3 + Ax + B \]
We can define this as an algebraic curve and calculate the partial derivatives
\[ f(x, y) = x^3 + Ax + B - y^2, \qquad \frac{\partial f}{\partial x} = 3x^2 + A, \qquad \frac{\partial f}{\partial y} = -2y \]
If a point $(x_0, y_0)$ were singular then
\[ \frac{\partial f}{\partial x}(x_0, y_0) = 0 \implies A = -3x_0^2 \]
\[ \frac{\partial f}{\partial y}(x_0, y_0) = 0 \implies y_0 = 0 \]
\[ f(x_0, y_0) = 0 \implies B = -x_0^3 - Ax_0 = -x_0^3 + 3x_0^3 = 2x_0^3 \]
But, if this were the case then
\[ 4A^3 + 27B^2 = 4[-3x_0^2]^3 + 27[2x_0^3]^2 = -108x_0^6 + 108x_0^6 = 0 \]
which in Appendix A.2 is shown to imply the existence of a multiple root.
These two theorems together show that an elliptic curve defined by the
Weierstrass equation has singular points if and only if it has multiple roots.
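The equivalence can be checked numerically on small examples. The sketch below is an illustration written for this section: the function names are hypothetical, the prime p = 101 is an arbitrary choice, and the search for singular points is a naive scan over all of F_p × F_p.

```python
def discriminant(A: int, B: int) -> int:
    """The quantity 4A^3 + 27B^2, which is zero exactly when x^3 + Ax + B has a multiple root."""
    return 4 * A ** 3 + 27 * B ** 2

def singular_points(A: int, B: int, p: int):
    """Affine points over F_p where f, df/dx and df/dy all vanish, f = x^3 + Ax + B - y^2."""
    found = []
    for x in range(p):
        for y in range(p):
            f = (x ** 3 + A * x + B - y * y) % p
            fx = (3 * x * x + A) % p
            fy = (-2 * y) % p
            if f == 0 and fx == 0 and fy == 0:
                found.append((x, y))
    return found

p = 101
# Triple root: y^2 = x^3 has discriminant 0 and the single singular point (0, 0)
print(discriminant(0, 0), singular_points(0, 0, p))
# Double root at x0 = 2: A = -3*x0^2, B = 2*x0^3, as in the proof of Theorem A.2
x0 = 2
print(discriminant(-3 * x0 ** 2, 2 * x0 ** 3), singular_points(-3 * x0 ** 2, 2 * x0 ** 3, p))
# Distinct roots: y^2 = x^3 + 8x + 1 over F_101 has nonzero discriminant mod p
print(discriminant(8, 1) % p, singular_points(8, 1, p))
```

The second case uses the untranslated form of the curve, so the singular point sits at (x0, 0) rather than at the origin.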
A.1.2 Triple root
Consider the case when $x^3 + Ax + B$ has a triple root. By translating we can
assume the root is at $x = 0$, and the curve has equation $y^2 = x^3$.
Figure A.1: The graph of $y^2 = x^3$
We can see from the graph, or from a quick check of the conditions, that
the point (0,0) is the only singular point on the curve. Consider a straight
line through the origin, $y = mx$. By substitution we can see where this line
will intersect the elliptic curve:
\[ y^2 = x^3 \]
\[ (mx)^2 = x^3 \]
\[ m^2 = x \]
So any line through (0,0) will intersect the curve again in, at most, one other
point where $x = m^2$ and hence $y = mx = m^3$. This will clearly cause problems for
the elliptic curve addition operation since we require there to be another
point on this line.
However, if we exclude (0,0) then the remaining points, denoted $E_{ns}(K)$,
form a group with the same group law as before. We show in the next
theorem that this is an additive group isomorphic to $K$.
Theorem A.3. Let $E$ be the curve $y^2 = x^3$ and let $E_{ns}(K)$ be the nonsingular
points on this curve with coordinates in $K$, including ∞. The map
\[ E_{ns}(K) \to K : \quad (x, y) \mapsto \frac{x}{y}, \quad \infty \mapsto 0 \]
is a group isomorphism (bijective structure preserving map) between $E_{ns}(K)$
and $K$, which is itself an additive group.
Proof Let $t = x/y$. Then
\[ x = \frac{x^3}{x^2} = \frac{y^2}{x^2} = \frac{1}{t^2} \]
\[ y = \frac{x}{t} = \frac{x^3}{tx^2} = \frac{y^2}{tx^2} = \frac{1}{t^3} \]
So every point in $E_{ns}(K)$ can be expressed in terms of the parameter $t \in K$,
(with $t = 0$ corresponding to the point ∞). Also every value of $t$ can produce
a point in $E_{ns}(K)$, hence the map is a bijection between $E_{ns}(K)$ and $K$.
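Suppose $(x_1, y_1) + (x_2, y_2) = (x_3, y_3)$. We must show that in all the
different cases, $t_1 + t_2 = t_3$, where $t_i = x_i/y_i$, in order to show that the map
is structure-preserving.
(i) If $x_1 \neq x_2$ then the addition formula says that
\[ x_3 = \left(\frac{y_2 - y_1}{x_2 - x_1}\right)^2 - x_1 - x_2 \]
Substitute $x_i = 1/t_i^2$ and $y_i = 1/t_i^3$ to get
\begin{align*}
\frac{1}{t_3^2} &= \left(\frac{\frac{1}{t_2^3} - \frac{1}{t_1^3}}{\frac{1}{t_2^2} - \frac{1}{t_1^2}}\right)^2 - \frac{1}{t_1^2} - \frac{1}{t_2^2}
= \left(\frac{(t_1^3 - t_2^3)/(t_1 t_2)^3}{(t_1^2 - t_2^2)/(t_1 t_2)^2}\right)^2 - \frac{t_2^2 + t_1^2}{(t_1 t_2)^2} \\
&= \left(\frac{t_1^3 - t_2^3}{t_1 t_2 (t_1^2 - t_2^2)}\right)^2 - \frac{(t_1^2 + t_2^2)(t_1^2 - t_2^2)^2}{t_1^2 t_2^2 (t_1^2 - t_2^2)^2} \\
&= \frac{(t_1^3 - t_2^3)^2 - (t_1^2 + t_2^2)(t_1^2 - t_2^2)^2}{t_1^2 t_2^2 (t_1^2 - t_2^2)^2} \\
&= \frac{-2t_1^3 t_2^3 + t_1^2 t_2^4 + t_1^4 t_2^2}{t_1^2 t_2^2 (t_1^4 + t_2^4 - 2t_1^2 t_2^2)}
= \frac{t_1^2 t_2^2 (-2t_1 t_2 + t_1^2 + t_2^2)}{t_1^2 t_2^2 (t_1 - t_2)^2 (t_1 + t_2)^2} \\
&= \frac{(t_1 - t_2)^2}{(t_1 - t_2)^2 (t_1 + t_2)^2}
\end{align*}
so that
\[ \frac{1}{t_3^2} = \frac{1}{(t_1 + t_2)^2} \]
Similarly
\[ y_3 = \left(\frac{y_2 - y_1}{x_2 - x_1}\right)(x_1 - x_3) - y_1 \]
gives
\begin{align*}
\frac{1}{t_3^3} &= \left[\frac{\frac{1}{t_2^3} - \frac{1}{t_1^3}}{\frac{1}{t_2^2} - \frac{1}{t_1^2}}\right]\left(\frac{1}{t_1^2} - \frac{1}{t_3^2}\right) - \frac{1}{t_1^3} \\
&= \frac{t_1^3 - t_2^3}{t_1 t_2 (t_1^2 - t_2^2)} \cdot \frac{(t_1 + t_2)^2 - t_1^2}{t_1^2 (t_1 + t_2)^2} - \frac{1}{t_1^3} \\
&= \frac{t_2 (t_2 + 2t_1)(t_1 - t_2)(t_2^2 + t_1 t_2 + t_1^2)}{t_1^3 t_2 (t_1 - t_2)(t_1 + t_2)^3} - \frac{1}{t_1^3} \\
&= \frac{(t_2 + 2t_1)(t_2^2 + t_1 t_2 + t_1^2)}{t_1^3 (t_1 + t_2)^3} - \frac{1}{t_1^3} \\
&= \frac{(t_2 + 2t_1)(t_2^2 + t_1 t_2 + t_1^2) - (t_1 + t_2)^3}{t_1^3 (t_1 + t_2)^3} \\
&= \frac{t_1^3}{t_1^3 (t_1 + t_2)^3} = \frac{1}{(t_1 + t_2)^3}
\end{align*}
So by taking the ratio of the expressions we can see
\[ \frac{1/t_3^2}{1/t_3^3} = \frac{1/(t_1 + t_2)^2}{1/(t_1 + t_2)^3} \implies t_3 = t_1 + t_2 \]
as required.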
(ii) If $x_1 = x_2$ but $y_1 \neq y_2$ then we have $t_2 = -t_1$, recalling that $t = x/y$.
Hence $t_3 = t_1 + t_2 = 0$ which corresponds to the point ∞ as required.
(iii) If $(x_1, y_1) = (x_2, y_2)$ then we need only consider the case when $y_1 \neq 0$.
This is because if $y_1 = 0$ then we are at the point (0,0) which we have
excluded. Here we have $t_1 = t_2$ so we must show that $t_3 = 2t_1$. Recalling
that $A = 0$ for this curve, the addition operation gives
\[ x_3 = \left(\frac{3x_1^2}{2y_1}\right)^2 - 2x_1 \]
Substituting $x_i = 1/t_i^2$ and $y_i = 1/t_i^3$ gives
\[ \frac{1}{t_3^2} = \left(\frac{3/t_1^4}{2/t_1^3}\right)^2 - \frac{2}{t_1^2} = \left(\frac{3}{2t_1}\right)^2 - \frac{8}{4t_1^2} = \frac{9 - 8}{4t_1^2} = \frac{1}{4t_1^2} \]
Similarly
\[ y_3 = \left(\frac{3x_1^2}{2y_1}\right)(x_1 - x_3) - y_1 \]
gives
\[ \frac{1}{t_3^3} = \frac{3}{2t_1}\left(\frac{1}{t_1^2} - \frac{1}{4t_1^2}\right) - \frac{1}{t_1^3} = \frac{3}{2t_1} \cdot \frac{3}{4t_1^2} - \frac{1}{t_1^3} = \frac{9}{8t_1^3} - \frac{8}{8t_1^3} = \frac{1}{8t_1^3} \]
So taking the ratio of the expressions gives
\[ \frac{1/t_3^2}{1/t_3^3} = \frac{1/4t_1^2}{1/8t_1^3} \implies t_3 = 2t_1 \]
as required.
(iv) If one of $(x_1, y_1)$, $(x_2, y_2)$ were ∞ then $(x_3, y_3)$ would be the other point.
This corresponds to either $t_1$ or $t_2$ being zero, making this final case
trivial.
So we have shown that this map is structure preserving in all cases, and
a bijection between $E_{ns}(K)$ and $K$, meaning it is a group isomorphism.
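The isomorphism can also be checked exhaustively over a small field. The sketch below takes K = F_101 (an arbitrary choice made here), builds the point with parameter t as (1/t², 1/t³), adds points with the usual chord-and-tangent formulas for A = 0, and confirms that the parameter of the sum is t1 + t2. All function names are hypothetical, and `None` represents ∞.

```python
p = 101   # K = F_p, an illustrative choice; any prime p > 3 would do

def point_from_t(t):
    """Point on y^2 = x^3 with x/y = t; t = 0 corresponds to infinity (None)."""
    if t % p == 0:
        return None
    inv = pow(t, -1, p)
    return (inv * inv % p, inv * inv * inv % p)

def t_from_point(P):
    """The map of Theorem A.3: (x, y) -> x/y and infinity -> 0."""
    if P is None:
        return 0
    x, y = P
    return x * pow(y, -1, p) % p

def add_ns(P, Q):
    """Chord-and-tangent addition on the nonsingular points of y^2 = x^3."""
    if P is None:
        return Q
    if Q is None:
        return P
    x1, y1 = P
    x2, y2 = Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return None                                      # case (ii)
    if x1 == x2:
        m = 3 * x1 * x1 * pow((2 * y1) % p, -1, p) % p   # case (iii), A = 0
    else:
        m = (y2 - y1) * pow((x2 - x1) % p, -1, p) % p    # case (i)
    x3 = (m * m - x1 - x2) % p
    return (x3, (m * (x1 - x3) - y1) % p)

# Check t(P1 + P2) = t1 + t2 for every pair of parameters in F_p
all_ok = all(
    t_from_point(add_ns(point_from_t(t1), point_from_t(t2))) == (t1 + t2) % p
    for t1 in range(p)
    for t2 in range(p)
)
print(all_ok)
```

A numerical check of this kind does not replace the proof, but it is a quick way to catch algebra slips in the case analysis.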
A.1.3 Double root
Consider the case where $x^3 + Ax + B$ has a double root. By translating $x$,
we may assume this root is at zero and so the curve $E$ has equation
\[ y^2 = x^2(x + a) \]
for some $a \neq 0$.
We can again show that the point (0,0) is the only singularity from the
definition or from the graph below. If we consider the straight line through
the origin, $y = mx$, then we see that as before, it only intersects $E$ at the
origin and, at most, one other point:
\[ y^2 = x^2(x + a) \]
\[ (mx)^2 = x^2(x + a) \]
\[ m^2 = x + a \]
So we have similar problems with the elliptic curve addition operation.
Figure A.2: The graph of y 2 = x2 (x + 1) = x3 + x2
We again define $E_{ns}(K)$ to be the nonsingular points in E with coordinates in K, including the point ∞. Let $\alpha^2 = a$ (so α might lie in K or in an
extension of K). The equation for E may be rewritten
\[ \left(\frac{y}{x}\right)^2 = a + x \]
Now when x is near 0 the right hand side is approximately a. Therefore E
is approximated by $(y/x)^2 = a$, or $y/x = \pm\alpha$, near x = 0. This means that
the two tangents to E at (0,0) are
\[ y = \alpha x, \qquad y = -\alpha x \]
We will show that $E_{ns}(K)$ is isomorphic to a multiplicative group: either $K^\times$
itself or a subgroup of a quadratic extension of K, depending on whether or not $\alpha \in K$.
Theorem A.4. Let E be the curve $y^2 = x^2(x + a)$ with $0 \neq a \in K$. Let
$E_{ns}(K)$ be the nonsingular points on E with coordinates in K. Let $\alpha^2 = a$.
Consider the map
\[ \psi : (x, y) \mapsto \frac{y + \alpha x}{y - \alpha x}, \qquad \infty \mapsto 1 \]
(i) If $\alpha \in K$, then ψ gives an isomorphism from $E_{ns}(K)$ to $K^\times$, which is
the multiplicative group of the field K.
(ii) If $\alpha \notin K$ then ψ gives an isomorphism
\[ E_{ns}(K) \simeq \{u + \alpha v \mid u, v \in K,\ u^2 - av^2 = 1\} \]
where the right hand side is a group under multiplication.
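Proof (i) Let ψ(x, y) = t, then
\[ t = \frac{y + \alpha x}{y - \alpha x} \tag{A.1} \]
We show that
\[ \alpha\,\frac{t+1}{t-1} = \alpha\,\frac{y + \alpha x + y - \alpha x}{y - \alpha x}\times\frac{y - \alpha x}{y + \alpha x - y + \alpha x} = \alpha\,\frac{2y}{2\alpha x} = \frac{y}{x} \tag{A.2} \]
We can rewrite E as $x = (y/x)^2 - a$, and then use Equation (A.2) to obtain
\begin{align*}
x &= \frac{y^2}{x^2} - a = \alpha^2\frac{(t+1)^2}{(t-1)^2} - \alpha^2 = \frac{4\alpha^2 t}{(t-1)^2} \\
y &= \frac{y}{x}\,x = \alpha\,\frac{t+1}{t-1}\times\frac{4\alpha^2 t}{(t-1)^2} = \frac{4\alpha^3 t(t+1)}{(t-1)^3}
\end{align*}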
So (x, y) determines t and t determines (x, y). In case (i) $\alpha \in K$, so given
any $(x, y) \in E_{ns}(K)$ we have $\psi(x, y) = t \in K^\times$, and since t determines (x, y) the map ψ is injective. Conversely, if
we are given any $t \in K^\times$ we can find the corresponding point of $E_{ns}(K)$ (with t = 1 corresponding to ∞), so ψ
is surjective. Hence in case (i) the map ψ is a bijection.
We have shown that ψ is bijective, but we must also show it is a homomorphism (i.e. structure preserving) in order to conclude that it is an isomorphism.
Suppose $(x_1, y_1) + (x_2, y_2) = (x_3, y_3)$ and let
\[ t_i = \frac{y_i + \alpha x_i}{y_i - \alpha x_i} \]
We must show that $t_1t_2 = t_3$. First recall that
\[ x_i = \frac{4\alpha^2 t_i}{(t_i - 1)^2} \tag{A.3} \]
\[ y_i = \frac{4\alpha^3 t_i(t_i + 1)}{(t_i - 1)^3} \tag{A.4} \]
We now consider the various cases, but note that as $y^2 = x^2(x + a)$ is not in
Weierstrass form, the addition formulas will differ from normal.
(a) If $x_1 \neq x_2$, then the line through $(x_1, y_1)$ and $(x_2, y_2)$ will be given by
$y = m(x - x_1) + y_1$ as before. However, when substituting into the equation
for E, the coefficient of $x^2$ will have an extra term, $-a$. So
\[ x_3 = \left(\frac{y_2 - y_1}{x_2 - x_1}\right)^2 - a - x_1 - x_2 \]
We can substitute from Equations (A.3) and (A.4) to get
\begin{align*}
\frac{y_2 - y_1}{x_2 - x_1} &= \left[\frac{4\alpha^3t_2(t_2+1)}{(t_2-1)^3} - \frac{4\alpha^3t_1(t_1+1)}{(t_1-1)^3}\right] \Big/ \left[\frac{4\alpha^2t_2}{(t_2-1)^2} - \frac{4\alpha^2t_1}{(t_1-1)^2}\right] \\
&= \frac{4\alpha^3}{4\alpha^2}\cdot\frac{t_2(t_2+1)(t_1-1)^3 - t_1(t_1+1)(t_2-1)^3}{(t_2-1)^3(t_1-1)^3} \div \frac{t_2(t_1-1)^2 - t_1(t_2-1)^2}{(t_2-1)^2(t_1-1)^2} \\
&= \alpha\cdot\frac{t_2(t_2+1)(t_1-1)^3 - t_1(t_1+1)(t_2-1)^3}{\left(t_2(t_1-1)^2 - t_1(t_2-1)^2\right)(t_2-1)(t_1-1)} \\
&= \alpha\cdot\frac{(t_1-t_2)\,N}{(t_1t_2-1)(t_1-t_2)(t_2-1)(t_1-1)} \\
&= \frac{\alpha N}{(t_1t_2-1)(t_2-1)(t_1-1)}
\end{align*}
where, for brevity, we write
\[ N = t_1^2t_2^2 + t_2t_1^2 + t_1 - 6t_1t_2 + t_1t_2^2 + t_2 + 1 \]
Then the addition equation gives
\[ \frac{4\alpha^2t_3}{(t_3-1)^2} = \frac{\alpha^2N^2}{(t_1t_2-1)^2(t_2-1)^2(t_1-1)^2} - \alpha^2 - \frac{4\alpha^2t_1}{(t_1-1)^2} - \frac{4\alpha^2t_2}{(t_2-1)^2} \]
Dividing through by $\alpha^2$ and putting everything over a common denominator,
\begin{align*}
\frac{4t_3}{(t_3-1)^2} &= \frac{N^2 - (t_1t_2-1)^2(t_2-1)^2(t_1-1)^2 - 4t_1(t_1t_2-1)^2(t_2-1)^2 - 4t_2(t_1t_2-1)^2(t_1-1)^2}{(t_1t_2-1)^2(t_2-1)^2(t_1-1)^2} \\
&= \frac{4t_1t_2(t_2-1)^2(t_1-1)^2}{(t_1t_2-1)^2(t_2-1)^2(t_1-1)^2} \\
\frac{t_3}{(t_3-1)^2} &= \frac{t_1t_2}{(t_1t_2-1)^2}
\end{align*}
Similarly
\[ y_3 = \frac{y_2 - y_1}{x_2 - x_1}(x_1 - x_3) - y_1 \]
So substituting from Equations (A.3) and (A.4), and using the expression for $x_3$ just found, gives
\[ \frac{4\alpha^3t_3(t_3+1)}{(t_3-1)^3} = \frac{\alpha N}{(t_1t_2-1)(t_2-1)(t_1-1)}\times\left[\frac{4\alpha^2t_1}{(t_1-1)^2} - \frac{4\alpha^2t_1t_2}{(t_1t_2-1)^2}\right] - \frac{4\alpha^3t_1(t_1+1)}{(t_1-1)^3} \]
Dividing through by $4\alpha^3$,
\begin{align*}
\frac{t_3(t_3+1)}{(t_3-1)^3} &= \frac{N}{(t_1t_2-1)(t_2-1)(t_1-1)}\times\frac{t_1(t_1t_2-1)^2 - t_1t_2(t_1-1)^2}{(t_1t_2-1)^2(t_1-1)^2} - \frac{t_1(t_1+1)}{(t_1-1)^3} \\
&= \frac{N\left(t_1(t_1t_2-1)^2 - t_1t_2(t_1-1)^2\right) - t_1(t_1+1)(t_1t_2-1)^3(t_2-1)}{(t_1t_2-1)^3(t_1-1)^3(t_2-1)} \\
&= \frac{t_1t_2(t_1t_2+1)(t_1-1)^3(t_2-1)}{(t_1t_2-1)^3(t_1-1)^3(t_2-1)} \\
&= \frac{t_1t_2(t_1t_2+1)}{(t_1t_2-1)^3}
\end{align*}
Then taking the ratio of the two results yields
\begin{align*}
\frac{t_3-1}{t_3+1} &= \frac{t_1t_2-1}{t_1t_2+1} \\
(t_3-1)(t_1t_2+1) &= (t_3+1)(t_1t_2-1) \\
t_1t_2t_3 + t_3 - t_1t_2 - 1 &= t_1t_2t_3 - t_3 + t_1t_2 - 1 \\
\Rightarrow\ 2t_3 - 2t_1t_2 &= 0 \\
t_1t_2 &= t_3
\end{align*}
as desired.
(b) If $x_1 = x_2$ but $y_1 \neq y_2$ then we know $(x_3, y_3) = \infty$. Recall from Equation (A.2) that
\[ \alpha\,\frac{t+1}{t-1} = \frac{y}{x} \]
So because $x_1 = x_2$ and $y_1 = -y_2$ we have
\begin{align*}
\alpha\,\frac{t_1+1}{t_1-1} &= -\alpha\,\frac{t_2+1}{t_2-1} \\
(t_1+1)(t_2-1) &= -(t_2+1)(t_1-1) \\
t_1t_2 + t_2 - t_1 - 1 &= -t_1t_2 - t_1 + t_2 + 1 \\
2t_1t_2 &= 2 \\
t_2 &= \frac{1}{t_1}
\end{align*}
So we find that $t_3 = t_1t_2 = 1$, which corresponds to the point (x, y) = ∞
as required.
(c) If $(x_1, y_1) = (x_2, y_2)$ and $y_1 \neq 0$ then to add the points we draw the
tangent at $(x_1, y_1)$. Using implicit differentiation we see this has gradient
$m = (3x^2 + 2ax)/(2y)$. So the addition operation gives
\[ x_3 = \left(\frac{3x_1^2 + 2\alpha^2x_1}{2y_1}\right)^2 - \alpha^2 - 2x_1 \]
We can substitute to get
\begin{align*}
\frac{3x_1^2 + 2\alpha^2x_1}{2y_1} &= \left[\frac{48\alpha^4t_1^2}{(t_1-1)^4} + \frac{8\alpha^4t_1}{(t_1-1)^2}\right] \Big/ \frac{8\alpha^3t_1(t_1+1)}{(t_1-1)^3} \\
&= \frac{48\alpha^4t_1^2 + 8\alpha^4t_1(t_1-1)^2}{(t_1-1)^4} \Big/ \frac{8\alpha^3t_1(t_1+1)}{(t_1-1)^3} \\
&= \frac{8\alpha^4}{8\alpha^3}\cdot\frac{6t_1^2 + t_1(t_1-1)^2}{(t_1-1)^4}\times\frac{(t_1-1)^3}{t_1(t_1+1)} \\
&= \frac{\alpha(4t_1 + t_1^2 + 1)t_1}{t_1(t_1+1)(t_1-1)} = \frac{\alpha(4t_1 + t_1^2 + 1)}{(t_1+1)(t_1-1)}
\end{align*}
Then the addition operation gives
\begin{align*}
\frac{4\alpha^2t_3}{(t_3-1)^2} &= \frac{\alpha^2(4t_1 + t_1^2 + 1)^2}{(t_1+1)^2(t_1-1)^2} - \frac{8\alpha^2t_1}{(t_1-1)^2} - \alpha^2 \\
\frac{4t_3}{(t_3-1)^2} &= \frac{(4t_1 + t_1^2 + 1)^2 - 8t_1(t_1+1)^2 - (t_1+1)^2(t_1-1)^2}{(t_1+1)^2(t_1-1)^2} \\
&= \frac{4t_1^2}{(t_1+1)^2(t_1-1)^2} \\
\frac{t_3}{(t_3-1)^2} &= \frac{t_1^2}{(t_1+1)^2(t_1-1)^2}
\end{align*}
Similarly
\[ y_3 = \frac{3x_1^2 + 2\alpha^2x_1}{2y_1}(x_1 - x_3) - y_1 \]
gives
\begin{align*}
\frac{4\alpha^3t_3(t_3+1)}{(t_3-1)^3} &= \frac{\alpha(4t_1 + t_1^2 + 1)}{(t_1+1)(t_1-1)}\left[\frac{4\alpha^2t_1}{(t_1-1)^2} - \frac{4\alpha^2t_1^2}{(t_1+1)^2(t_1-1)^2}\right] - \frac{4\alpha^3t_1(t_1+1)}{(t_1-1)^3} \\
&= \frac{\alpha(4t_1 + t_1^2 + 1)}{(t_1+1)(t_1-1)}\cdot\frac{4\alpha^2t_1(t_1+1)^2 - 4\alpha^2t_1^2}{(t_1-1)^2(t_1+1)^2} - \frac{4\alpha^3t_1(t_1+1)}{(t_1-1)^3} \\
\frac{t_3(t_3+1)}{(t_3-1)^3} &= \frac{[4t_1 + t_1^2 + 1]\left(t_1(t_1+1)^2 - t_1^2\right) - t_1(t_1+1)^4}{(t_1-1)^3(t_1+1)^3} \\
&= \frac{t_1^2(t_1^2+1)}{(t_1+1)^3(t_1-1)^3}
\end{align*}
So taking the ratio yields
\begin{align*}
\frac{t_3}{(t_3-1)^2} \Big/ \frac{t_3(t_3+1)}{(t_3-1)^3} &= \frac{t_1^2}{(t_1+1)^2(t_1-1)^2} \Big/ \frac{t_1^2(1+t_1^2)}{(t_1+1)^3(t_1-1)^3} \\
\frac{t_3-1}{t_3+1} &= \frac{(t_1+1)(t_1-1)}{1+t_1^2} = \frac{t_1^2-1}{t_1^2+1}
\end{align*}
So
\begin{align*}
(t_3-1)(t_1^2+1) &= (t_1^2-1)(t_3+1) \\
t_3 + t_3t_1^2 - 1 - t_1^2 &= t_3t_1^2 + t_1^2 - t_3 - 1 \\
2t_3 - 2t_1^2 &= 0
\end{align*}
So $t_3 = t_1^2 = t_1t_2$ as required.
(d) If $(x_1, y_1) = (x_2, y_2)$ and $y_1 = 0$ then either $x_1 = 0$ or $x_1 = -a$. We cannot
have $x_1 = 0$ as we have excluded the point (0,0). So $x_1 = -a = -\alpha^2$.
This implies
\begin{align*}
-\alpha^2 &= \frac{4\alpha^2t_1}{(t_1-1)^2} \\
-(t_1-1)^2 &= 4t_1 \\
(t_1+1)^2 &= 0
\end{align*}
So $t_1 = -1$, meaning $t_3 = t_1^2 = 1$, corresponding to the point ∞ as
required.
(e) Finally consider the case when one of $(x_1, y_1)$, $(x_2, y_2)$ is ∞. In this case
$(x_3, y_3)$ is the other point, which corresponds to either $t_1$ or $t_2$
being one, making this final case trivial.
So we have shown that ψ preserves the structure of the group $E_{ns}(K)$. We
also showed earlier that ψ is a bijective map from $E_{ns}(K)$ to $K^\times$ and so we
conclude that in case (i) it is an isomorphism.
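As a sanity check on case (i), the multiplicative property can be tested exhaustively over a small field. This is only a sketch under stated assumptions: the prime p = 11 and the value α = 2 (so a = 4) are arbitrary choices, and `point`, `psi` and `add` are hypothetical helper names. The addition uses the modified formulas above, with the extra term −a in $x_3$.

```python
p, alpha = 11, 2          # assumed small example with alpha in F_p
a = alpha * alpha % p     # so the curve is y^2 = x^2 (x + a)

def inv(v):
    """Inverse modulo the prime p via Fermat's little theorem."""
    return pow(v, p - 2, p)

def point(t):
    """Inverse of psi, from Equations (A.3) and (A.4); t = 1 gives infinity (None)."""
    if t % p == 1:
        return None
    d = inv(t - 1)
    x = 4 * a * t * d * d % p
    y = 4 * a * alpha * t * (t + 1) * d ** 3 % p
    return (x, y)

def psi(P):
    """The map (x, y) -> (y + alpha x) / (y - alpha x), with infinity -> 1."""
    if P is None:
        return 1
    x, y = P
    return (y + alpha * x) * inv(y - alpha * x) % p

def add(P, Q):
    """Chord and tangent addition on y^2 = x^2 (x + a)."""
    if P is None:
        return Q
    if Q is None:
        return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return None                       # covers cases (b) and (d)
    if P == Q:
        m = (3 * x1 * x1 + 2 * a * x1) * inv(2 * y1) % p
    else:
        m = (y2 - y1) * inv(x2 - x1) % p
    x3 = (m * m - a - x1 - x2) % p
    y3 = (m * (x1 - x3) - y1) % p
    return (x3, y3)

units = range(1, p)
bijective = all(psi(point(t)) == t for t in units)
multiplicative = all(psi(add(point(s), point(t))) == s * t % p
                     for s in units for t in units)
print(bijective, multiplicative)
```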
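Proof (ii) We will first show that in case (ii) the map ψ is a bijection. Notice
that we can rationalise the denominator of $(y + \alpha x)/(y - \alpha x)$ by multiplying
top and bottom by $(y + \alpha x)$ to get an expression of the form $u + \alpha v$:
\begin{align*}
\psi(x, y) = \frac{y + \alpha x}{y - \alpha x} &= \frac{y + \alpha x}{y - \alpha x}\times\frac{y + \alpha x}{y + \alpha x} \\
&= \frac{(y + \alpha x)^2}{y^2 - \alpha^2x^2} = \frac{(y + \alpha x)^2}{y^2 - ax^2} = \frac{(y + \alpha x)^2}{x^3} \\
&= \frac{y^2 + ax^2}{x^3} + \alpha\,\frac{2yx}{x^3} \equiv u + \alpha v
\end{align*}
where we have used $y^2 - ax^2 = x^3$ from the equation of E. Now notice that we can change the sign of α throughout this equation while
preserving the equality (because $(-\alpha)^2 = a$ also), so
\[ \frac{y - \alpha x}{y + \alpha x} = u - \alpha v \]
We can now show that
\[ u^2 - av^2 = (u + \alpha v)(u - \alpha v) = \frac{(y + \alpha x)(y - \alpha x)}{(y - \alpha x)(y + \alpha x)} = 1 \]
So for any $(x, y) \in E_{ns}(K)$, ψ(x, y) is an element of the form $u + \alpha v$ where
$u, v \in K$ and $u^2 - av^2 = 1$. As in case (i), the value t = ψ(x, y) determines (x, y), so ψ is injective.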
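Conversely let us suppose that we have $u, v \in K$ such that
$u^2 - av^2 = 1$. If v = 0 then $u = \pm 1$, and these are the images of ∞ and of $(-a, 0)$ respectively, so assume $v \neq 0$. Let
\[ x = \left(\frac{u+1}{v}\right)^2 - a, \qquad y = \left(\frac{u+1}{v}\right)x \quad\Longrightarrow\quad \frac{y}{x} = \frac{u+1}{v} \]
Then (x, y) satisfy $y^2 = x^2(x + a)$ and so lie on the curve E. Also
\begin{align*}
\psi(x, y) = \frac{y + \alpha x}{y - \alpha x} &= \frac{(y/x) + \alpha}{(y/x) - \alpha} \\
&= \frac{\frac{u+1}{v} + \alpha}{\frac{u+1}{v} - \alpha} = \frac{u + 1 + \alpha v}{u + 1 - \alpha v} \\
&= \frac{(u+1) + \alpha v}{(u+1) - \alpha v}\times\frac{(u+1) + \alpha v}{(u+1) + \alpha v} \\
&= \frac{(u+1)^2 + 2\alpha v(u+1) + \alpha^2v^2}{(u+1)^2 - \alpha^2v^2} \\
&= \frac{u^2 + 2u + 1 + 2\alpha v(u+1) + av^2}{u^2 + 2u + 1 - av^2} \\
&= \frac{u^2 + 2u + u^2 + 2\alpha v(u+1)}{2u + 2} \\
&= \frac{u^2 + u + \alpha v(u+1)}{u+1} \\
&= u + \alpha v
\end{align*}
where the sixth line uses $av^2 = u^2 - 1$. So for any $u, v \in K$ such that $u^2 - av^2 = 1$ we can find $(x, y) \in E_{ns}(K)$ such
that $\psi(x, y) = u + \alpha v$. Therefore ψ is surjective and hence a bijection in case
(ii) as well.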
We must also show that ψ is structure preserving for this case as well,
but the details will be almost identical to those given in the proof of case (i)
so we omit them here.
The final task is to check that the set $G = \{u + \alpha v \mid u, v \in K,\ u^2 - av^2 = 1\}$
on the right hand side of case (ii) is a multiplicative group.
• If (u, v) and $(u', v') \in G$ then:
\begin{align*}
(u, v)\times(u', v') &\equiv (u + \alpha v)\times(u' + \alpha v') \\
&= uu' + \alpha uv' + \alpha vu' + \alpha^2vv' \\
&= (uu' + avv') + \alpha(uv' + vu') \\
&\equiv U + \alpha V
\end{align*}
and for this U, V
\begin{align*}
U^2 - aV^2 &= (uu' + avv')^2 - a(uv' + vu')^2 \\
&= u^2u'^2 + 2a\,uu'vv' + a^2v^2v'^2 - a\,u^2v'^2 - 2a\,uu'vv' - a\,v^2u'^2 \\
&= u'^2[u^2 - av^2] - a\,v'^2[u^2 - av^2] + (2a\,uu'vv' - 2a\,uu'vv') \\
&= u'^2[1] - a\,v'^2[1] + 0 = 1
\end{align*}
So $(u, v)\times(u', v')$ gives an element $U + \alpha V$ where $U, V \in K$ and
$U^2 - aV^2 = 1$. Hence G is closed.
• We check that all elements have inverses:
\[ \frac{1}{u + \alpha v} = \frac{1}{u + \alpha v}\times\frac{u - \alpha v}{u - \alpha v} = \frac{u - \alpha v}{u^2 - \alpha^2v^2} = u - \alpha v \]
So the inverse of $u + \alpha v$ is $u - \alpha v$. So all elements have inverses.
• There is an identity element, I = (u + αv) = (1 + α × 0), such that
g × I = g for all g ∈ G.
• The group operation is standard multiplication which is associative.
So we have verified that G = {u + αv | u, v ∈ K, u2 − av 2 = 1} is a
multiplicative group.
One situation where singular curves arise naturally is when curves have integral coefficients and we reduce modulo various primes. For example let E
be
\[ y^2 = x(x + 35)(x - 55) \]
Then
\begin{align*}
E \pmod 5 &: \quad y^2 \equiv x^3 \\
E \pmod 7 &: \quad y^2 \equiv x^2(x + 1) \\
E \pmod{11} &: \quad y^2 \equiv x^2(x + 2)
\end{align*}
The first case is called additive reduction and was treated by Theorem A.3.
The second case is split multiplicative reduction and was covered by Theorem
A.4(i). In the final case $\alpha = \sqrt{2} \notin F_{11}$, so we are in the situation of Theorem
A.4(ii). This is called non-split multiplicative reduction.
It can be shown that for all primes $p \geq 13$ the cubic polynomial has
distinct roots mod p, so E mod p is nonsingular. This situation is called
good reduction.
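The classification above can be reproduced mechanically. The sketch below is illustrative only: `reduction_type` is a hypothetical helper, it assumes a prime p ≥ 5 and a cubic given in factored form, and it decides the split or non-split case by Euler's criterion applied to the constant a obtained after translating the double root to zero.

```python
def reduction_type(p, roots=(0, -35, 55)):
    """Classify y^2 = (x - r1)(x - r2)(x - r3) modulo a prime p >= 5."""
    r = sorted(v % p for v in roots)
    if r[0] == r[1] == r[2]:
        return "additive"                      # triple root: y^2 = x^3 after translation
    if r[0] != r[1] and r[1] != r[2]:
        return "good"                          # sorted, so all three roots are distinct
    double = r[1]                              # the middle entry always belongs to the repeated pair
    single = r[0] if r[1] == r[2] else r[2]
    a = (double - single) % p                  # translating gives y^2 = x^2 (x + a)
    is_square = pow(a, (p - 1) // 2, p) == 1   # Euler's criterion
    return "split multiplicative" if is_square else "non-split multiplicative"

for p in (5, 7, 11, 13, 17):
    print(p, reduction_type(p))
```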
A.2 Deriving the condition for distinct roots
In Appendix A.1 we proved that if the cubic defining an elliptic curve has a multiple root then
the curve will have a singular point. In the project we considered only those elliptic
curves without multiple roots. It was stated earlier that this was equivalent
to imposing the condition $4A^3 + 27B^2 \neq 0$. In this section we prove this
result by calculating the discriminant using the method in Chapter 12 of [1].
A.2.1 Determining the roots
Let f(x) be a general cubic polynomial given by
\[ f(x) = a_0x^3 + 3a_1x^2 + 3a_2x + a_3, \qquad a_0 \neq 0 \]
with coefficients in the field F. The cubic has three roots in C.
We wish to find an expression for the discriminant of the cubic polynomial
in terms of the coefficients rather than the roots. To derive this formula we will
first have to determine an expression for the roots.
It will be easier to perform the calculation on a reduced version of the
polynomial so define
\begin{align*}
g(x) &= a_0^2\, f\!\left(\frac{x - a_1}{a_0}\right) \\
&= a_0^3\left(\frac{x - a_1}{a_0}\right)^3 + 3a_1a_0^2\left(\frac{x - a_1}{a_0}\right)^2 + 3a_2a_0^2\left(\frac{x - a_1}{a_0}\right) + a_0^2a_3 \\
&= (x - a_1)^3 + 3a_1(x - a_1)^2 + 3a_0a_2(x - a_1) + a_0^2a_3 \\
&= x^3 + x(3a_0a_2 - 3a_1^2) + (a_0^2a_3 - 3a_0a_1a_2 + 2a_1^3) \\
&= x^3 + 3Hx + G
\end{align*}
where
\[ G = a_0^2a_3 - 3a_0a_1a_2 + 2a_1^3, \qquad H = a_0a_2 - a_1^2 \]
We call g(x) the reduced cubic of f(x). Note $g(a_0x + a_1) = a_0^2 f(x)$ and so
1. On multiplying the roots of f(x) by $a_0$ and then adding $a_1$ we obtain
the roots of g(x).
2. g(x) has no term in $x^2$ and its coefficients are in F.
Recall that the nth roots of unity are the complex numbers which yield 1
when raised to the power n. The third roots (cube roots) of unity are
\[ 1, \qquad \frac{-1 + \sqrt{3}\,i}{2}, \qquad \frac{-1 - \sqrt{3}\,i}{2} \]
where i is the imaginary unit; the latter two roots are primitive. Let w be a
primitive cube root of 1 and u, v any numbers. Since
\[ (x - u - v)(x - uw - vw^2)(x - uw^2 - vw) = x^3 - 3uvx - u^3 - v^3 \]
using either of the primitive cube roots, we know that the roots of
$x^3 - 3uvx - u^3 - v^3$ are
\[ u + v, \qquad uw + vw^2, \qquad uw^2 + vw \]
We want to determine the roots of g(x) by choosing u and v so that
\[ uv = -H, \qquad u^3 + v^3 = -G \]
Here we show that this implies $u^3$ and $v^3$ are the roots of the quadratic
\[ C(x) = x^2 + Gx - H^3 \]
Using the quadratic formula the roots of C(x) are
\[ \xi = \tfrac{1}{2}\left(-G + \sqrt{G^2 + 4H^3}\right), \qquad \eta = \tfrac{1}{2}\left(-G - \sqrt{G^2 + 4H^3}\right) \]
Now set u to be any cube root of ξ. This implies $v = -H/u$ because
\begin{align*}
v^3 = \eta &= \tfrac{1}{2}\left(-G - \sqrt{G^2 + 4H^3}\right) \\
&= \frac{\tfrac{1}{2}\left(-G - \sqrt{G^2 + 4H^3}\right)\left(-G + \sqrt{G^2 + 4H^3}\right)}{-G + \sqrt{G^2 + 4H^3}} \\
&= \frac{\tfrac{1}{2}\left(G^2 - G\sqrt{G^2 + 4H^3} + G\sqrt{G^2 + 4H^3} - G^2 - 4H^3\right)}{-G + \sqrt{G^2 + 4H^3}} \\
&= \frac{-2H^3}{-G + \sqrt{G^2 + 4H^3}} = \frac{-H^3}{\xi}
\end{align*}
So the necessary choices of u and v satisfy
\[ u^3 = \xi, \quad v^3 = \eta, \qquad u = \sqrt[3]{\xi}, \quad v = \frac{-H}{u} \]
We can now see that this choice of u and v satisfies the conditions:
\begin{align*}
uv &= u\cdot\frac{-H}{u} = -H \\
u^3 + v^3 &= \xi + \eta = \tfrac{1}{2}\left(-G + \sqrt{G^2 + 4H^3}\right) + \tfrac{1}{2}\left(-G - \sqrt{G^2 + 4H^3}\right) = -G
\end{align*}
So the roots of g(x) can now be found.
Note that if ξ = 0 then this implies that H = 0 and so the roots of g(x)
are the cube roots of −G.
Example
Solve $x^3 + 3x^2 - 3x - 14 = 0$. Here
\[ a_0 = 1, \quad a_1 = 1, \quad a_2 = -1, \quad a_3 = -14 \]
\begin{align*}
H &= 1(-1) - (1^2) = -2 \\
G &= (1^2)(-14) - 3(1)(1)(-1) + 2(1^3) = -9
\end{align*}
So $C(x) = x^2 - 9x + 8$, which has roots 1 and 8.
Since the roles of $u^3$ and $v^3$ are symmetric we may take $u^3 = 1$, so u = 1 and then $v = -H/u = 2$.
Hence the roots of g(x) are
\[ 1 + 2, \qquad w + 2w^2, \qquad w^2 + 2w \]
which using either of the two options for w gives
\[ 3, \qquad -\tfrac{1}{2}(3 + i\sqrt{3}), \qquad -\tfrac{1}{2}(3 - i\sqrt{3}) \]
Finally we subtract $a_1$ and divide by $a_0$ to get the roots of the unreduced equation, f(x):
\[ 2, \qquad -\tfrac{1}{2}(5 + i\sqrt{3}), \qquad -\tfrac{1}{2}(5 - i\sqrt{3}) \]
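The method of this section translates directly into a short numerical routine. The following is a sketch rather than a robust solver: `solve_cubic` is a hypothetical name, the coefficients follow the convention $f(x) = a_0x^3 + 3a_1x^2 + 3a_2x + a_3$ used above, and the code takes whichever root of C(x) has larger modulus as $u^3$ purely to avoid dividing by zero.

```python
import cmath

def solve_cubic(a0, a1, a2, a3):
    """Roots of a0*x^3 + 3*a1*x^2 + 3*a2*x + a3 via the reduced cubic x^3 + 3Hx + G."""
    H = a0 * a2 - a1 ** 2
    G = a0 ** 2 * a3 - 3 * a0 * a1 * a2 + 2 * a1 ** 3
    s = cmath.sqrt(G * G + 4 * H ** 3)
    xi, eta = (-G + s) / 2, (-G - s) / 2      # the two roots of C(x) = x^2 + Gx - H^3
    if abs(xi) < abs(eta):
        xi = eta                              # either root may play the role of u^3
    if xi == 0:
        u = v = 0                             # then G = H = 0 and g(x) = x^3
    else:
        u = xi ** (1 / 3)                     # a cube root of xi (principal value)
        v = -H / u
    w = complex(-1, 3 ** 0.5) / 2             # a primitive cube root of unity
    g_roots = [u + v, u * w + v * w ** 2, u * w ** 2 + v * w]
    return [(r - a1) / a0 for r in g_roots]   # undo the substitution x -> a0*x + a1

# The worked example: x^3 + 3x^2 - 3x - 14 = 0
roots = solve_cubic(1, 1, -1, -14)
print(roots)
```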
A.2.2 The discriminant
The discriminant of a polynomial is a number that can be easily computed
from the coefficients of the polynomial and which is zero if and only if the
polynomial has a multiple root. If the polynomial p(x) of degree n has roots $r_1, ..., r_n$
and leading coefficient $a_0$ then
\[ p(x) = a_0(x - r_1)(x - r_2)\cdots(x - r_n) \]
and it can be shown that the discriminant is
\[ D = a_0^{2n-2}\prod_{i<j}(r_i - r_j)^2 \]
Note that for a quadratic polynomial $ax^2 + bx + c$ the discriminant is $b^2 - 4ac$.
Let α, β, γ be the roots of f(x), then the discriminant of f(x) is
\[ D = a_0^4(\beta - \gamma)^2(\gamma - \alpha)^2(\alpha - \beta)^2 \]
This term helps to discriminate between different types of cubics in the
following ways:
• D = 0 if and only if f(x) has at least two equal roots.
• If all the roots of f(x) are different and $D/a_0^4$ is real then
(i) $D/a_0^4 > 0$ when all the roots are real.
(ii) $D/a_0^4 < 0$ if at least one root is not real.
Theorem A.5. When f(x) has real coefficients then these further statements
hold
• D > 0 ⇒ The cubic has three distinct real roots.
• D = 0 ⇒ The cubic has three real roots of which at least two are equal.
• D < 0 ⇒ The cubic has one real root and two conjugate non-real roots.
Proof Since f(x) has real coefficients and odd degree it has a real root, so it can be written as a product of two real
factors, one linear and one quadratic. Taking α to be the real root:
\[ f(x) = (x - \alpha)(a_0x^2 + b_0x + c_0) \]
The leading coefficient of the quadratic is the same $a_0$, as we know the coefficient of $x^3$ is $a_0$. However, $b_0$
and $c_0$ are new constants.
Now, β and γ are the roots of $a_0x^2 + b_0x + c_0$ so
\begin{align*}
D &= a_0^4\{(\alpha - \beta)(\alpha - \gamma)\}^2(\beta - \gamma)^2 \\
&= a_0^4\{\alpha^2 - \alpha(\beta + \gamma) + \beta\gamma\}^2\{(\beta + \gamma)^2 - 4\beta\gamma\} \\
&= \{a_0(\alpha - \beta)(\alpha - \gamma)\}^2\{a_0^2(\beta - \gamma)^2\} \\
&= \{a_0\alpha^2 + b_0\alpha + c_0\}^2\{b_0^2 - 4a_0c_0\}
\end{align*}
For the final step note that the second term is the discriminant of the
quadratic, which can be found using the general formula above or the specific
quadratic form.
Now, the first term is positive unless α is also a (real) root of $a_0x^2 + b_0x + c_0$,
which would make the first term zero and imply the third root is real. The
second term is only zero when $a_0x^2 + b_0x + c_0$ has equal real roots, making
the ± part of the quadratic formula redundant.
Hence D = 0 if and only if f(x) has three real roots of which at least two
are equal.
If $D \neq 0$ then the sign of D is the same as that of the second term,
$b_0^2 - 4a_0c_0$. This is the discriminant of the quadratic and clearly if it is
positive then the roots of the cubic are all real, and if it is negative then two
of them are complex.
We want to get the discriminant of the cubic in terms of the coefficients so
that we can apply the theorem without knowing the roots. We still assume
that f(x) has roots α, β, γ and so g(x) has by definition the roots
\[ p = a_0\alpha + a_1, \qquad q = a_0\beta + a_1, \qquad r = a_0\gamma + a_1 \]
Because g(x) is monic the discriminant is
\begin{align*}
(q - r)^2(r - p)^2(p - q)^2 &= (a_0\beta + a_1 - a_0\gamma - a_1)^2(a_0\gamma + a_1 - a_0\alpha - a_1)^2(a_0\alpha + a_1 - a_0\beta - a_1)^2 \\
&= a_0^6(\beta - \gamma)^2(\gamma - \alpha)^2(\alpha - \beta)^2 \\
&= a_0^2D
\end{align*}
So if we find the discriminant of g(x) we can easily calculate the discriminant
of f(x). So we choose the easier task of calculating the discriminant of g(x).
Set $p = u + v$, $q = uw + vw^2$ and $r = uw^2 + vw$, the three roots of g(x)
found earlier. Then using either value of w we find that
\begin{align*}
p + q + r &= 0 \\
pq + pr + rq &= -3uv \\
pqr &= u^3 + v^3
\end{align*}
So
\[ p + q + r = 0, \qquad pq + pr + rq = 3H, \qquad pqr = -G \]
We can then show, using $q + r = -p$,
\begin{align*}
p(q - r)^2 = p(q + r)^2 - 4pqr &= p^3 + 4G \\
&= (u + v)^3 + 4G \\
&= (u^3 + 3u^2v + 3uv^2 + v^3) + 4G \\
&= u^3 + v^3 + 3uv(u + v) + 4G \\
&= -G - 3Hp + 4G \\
&= 3(-Hp + G)
\end{align*}
Similarly
\begin{align*}
q(r - p)^2 &= 3(-Hq + G) \\
r(p - q)^2 &= 3(-Hr + G)
\end{align*}
We can now calculate the discriminant of g(x) multiplied by $-G = pqr$:
\begin{align*}
-G(q - r)^2(r - p)^2(p - q)^2 &= p(q - r)^2\, q(r - p)^2\, r(p - q)^2 \\
&= 27\{(-Hp + G)(-Hq + G)(-Hr + G)\} \\
&= 27\{-H^3pqr + GH^2(qr + rp + pq) - G^2H(p + q + r) + G^3\} \\
&= 27\{H^3G + GH^2(3H) + 0 + G^3\} \\
&= 27G(G^2 + 4H^3)
\end{align*}
Thus we can see that if $G \neq 0$ then g(x) has discriminant $-27(G^2 + 4H^3)$.
If G = 0 then the roots of g(x) are $0, \pm\sqrt{-3H}$, making the squared differences $-3H$, $-3H$ and $-12H$. This then makes the discriminant $-108H^3$,
which is $-27(G^2 + 4H^3)$ with G set to zero.
Thus in all cases the discriminant of g(x) is $-27(G^2 + 4H^3)$.
It then follows that the discriminant of f(x) is
\begin{align*}
D &= \frac{-27(G^2 + 4H^3)}{a_0^2} \\
&= \frac{-27\{(a_0^2a_3 - 3a_0a_1a_2 + 2a_1^3)^2 + 4(a_0a_2 - a_1^2)^3\}}{a_0^2} \\
&= -27(a_0^2a_3^2 - 6a_0a_1a_2a_3 + 4a_0a_2^3 - 3a_1^2a_2^2 + 4a_1^3a_3)
\end{align*}
A.2.3 Relating back to elliptic curves
We are considering elliptic curves that are the solutions to the Weierstrass
equation
\[ y^2 = x^3 + Ax + B \]
The roots of this curve will be the same as the roots of the cubic on the right
hand side. We can calculate the discriminant of the cubic by relating it to
$g(x) = x^3 + 3Hx + G$, which had discriminant $-27(G^2 + 4H^3)$.
We can see that here
\begin{align*}
3H = A &\Rightarrow H = \frac{A}{3} \Rightarrow H^3 = \frac{A^3}{27} \\
G = B &\Rightarrow G^2 = B^2
\end{align*}
So the elliptic curve cubic has discriminant
\[ -27(G^2 + 4H^3) = -27\left(B^2 + \frac{4A^3}{27}\right) = -(27B^2 + 4A^3) \]
as required.
So to impose the condition that all roots are distinct we will require
\[ 4A^3 + 27B^2 \neq 0 \]
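Over a prime field the condition can be compared against a direct search for repeated roots. The sketch below assumes a prime p ≥ 5 (so that division by 2 and 3 is harmless) and uses hypothetical helper names; a repeated root r is detected as a common zero of the cubic and its derivative $3x^2 + A$.

```python
def is_singular(A, B, p):
    """True when 4A^3 + 27B^2 vanishes modulo the prime p."""
    return (4 * A ** 3 + 27 * B ** 2) % p == 0

def has_repeated_root(A, B, p):
    """True when x^3 + Ax + B and its derivative share a root in F_p."""
    return any((r ** 3 + A * r + B) % p == 0 and (3 * r * r + A) % p == 0
               for r in range(p))

# Compare the two criteria for every curve over a few small fields.
agree = all(is_singular(A, B, p) == has_repeated_root(A, B, p)
            for p in (5, 7, 11, 13)
            for A in range(p) for B in range(p))
print(agree)
```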
If we are working with the generalised Weierstrass equation then a similar
calculation will have to be performed to find the discriminant, using the
equation for D, the discriminant of f (x).
A.3 Elliptic curves in characteristic 2
The formulas for elliptic curve addition in Section 2.2 were derived using the
Weierstrass equation, $y^2 = x^3 + Ax + B$, and so do not apply when the field K
has characteristic 2. When in characteristic 2 we work with the generalised
Weierstrass equation:
\[ y^2 + a_1xy + a_3y = x^3 + a_2x^2 + a_4x + a_6 \]
for an elliptic curve E. We now consider two different possibilities:
(I) If $a_1 \neq 0$ then letting
\[ x = a_1^2x_1 + \frac{a_3}{a_1}, \qquad y = a_1^3y_1 + \frac{a_1^2a_4 + a_3^2}{a_1^3} \]
will change the generalised Weierstrass equation to
\begin{align*}
&\left(a_1^3y_1 + \frac{a_1^2a_4 + a_3^2}{a_1^3}\right)^2 + a_1\left(a_1^2x_1 + \frac{a_3}{a_1}\right)\left(a_1^3y_1 + \frac{a_1^2a_4 + a_3^2}{a_1^3}\right) + a_3\left(a_1^3y_1 + \frac{a_1^2a_4 + a_3^2}{a_1^3}\right) \\
&\qquad = \left(a_1^2x_1 + \frac{a_3}{a_1}\right)^3 + a_2\left(a_1^2x_1 + \frac{a_3}{a_1}\right)^2 + a_4\left(a_1^2x_1 + \frac{a_3}{a_1}\right) + a_6
\end{align*}
Collecting powers of $x_1$ and $y_1$ gives
\[ a_1^6y_1^2 + a_1^6x_1y_1 + y_1\left[2(a_1^2a_4 + a_3^2) + 2a_1^3a_3\right] = a_1^6x_1^3 + Cx_1^2 + x_1\left[2a_3^2 + 2a_1a_2a_3\right] + D \]
where C and D are new constants. Because we are in characteristic 2
we can reduce modulo 2, to give
\begin{align*}
a_1^6y_1^2 + a_1^6x_1y_1 &= a_1^6x_1^3 + Cx_1^2 + D \\
y_1^2 + x_1y_1 &= x_1^3 + a_2'x_1^2 + a_6'
\end{align*}
for new constants $a_2'$, $a_6'$.
Considering the partial derivatives:
\begin{align*}
f(x_1, y_1) &= y_1^2 + x_1y_1 - x_1^3 - a_2'x_1^2 - a_6' \\
f_y(x_1, y_1) &= 2y_1 + x_1 \equiv x_1 \pmod 2 \\
f_x(x_1, y_1) &= y_1 - 3x_1^2 - 2a_2'x_1 \equiv y_1 + x_1^2 \pmod 2
\end{align*}
So a singular point on this curve must have $x_1 = 0$, which in turn
implies $y_1 = 0$. So the curve will have a singular point if and only
if the origin lies on the curve. So we can conclude that this curve is
nonsingular if and only if $a_6' \neq 0$.
(II) If $a_1 = 0$ then let
\[ x = x_1 + a_2, \qquad y = y_1 \]
Then the generalised Weierstrass equation becomes
\begin{align*}
y_1^2 + a_3y_1 &= (x_1 + a_2)^3 + a_2(x_1 + a_2)^2 + a_4(x_1 + a_2) + a_6 \\
&= x_1^3 + 4a_2x_1^2 + 5a_2^2x_1 + a_4x_1 + 2a_2^3 + a_4a_2 + a_6 \\
y_1^2 + a_3'y_1 &\equiv x_1^3 + a_4'x_1 + a_6'
\end{align*}
for constants $a_3'$, $a_4'$, $a_6'$.
Considering the partial derivatives:
\begin{align*}
f(x_1, y_1) &= y_1^2 + a_3'y_1 - x_1^3 - a_4'x_1 - a_6' \\
f_y(x_1, y_1) &= 2y_1 + a_3' \equiv a_3' \pmod 2 \\
f_x(x_1, y_1) &= -3x_1^2 - a_4'
\end{align*}
So we see that this curve is nonsingular if and only if $a_3' \neq 0$.
Addition of points is similar to the simple case. To add two points $P_1$
and $P_2$ on E we draw the line, L, through them (the tangent if $P_1 = P_2$) and
find the third point of intersection $P_3'$. We then compute $P_3 = -P_3'$ using
Equation (2.1), which is no longer simply a reflection in the x-axis. Then $P_1 + P_2 = P_3$. We
still have P + ∞ = P, for all points P.
As before, the points on E form an additive abelian group with ∞ as the
identity element. We now explicitly find the formulas for doubling a point,
treating the two cases separately.
(I) $y^2 + xy = x^3 + a_2x^2 + a_6$: Because we are in characteristic 2 we can
rewrite this as
\[ 0 = y^2 + xy + x^3 + a_2x^2 + a_6 \]
Implicit differentiation yields
\[ 0 = 2yy' + y + xy' + 3x^2 + 2a_2x \equiv (y + x^2) + xy' \pmod 2 \]
Therefore the slope of the tangent line, L, through $P_0 = (x_0, y_0)$ is
\[ m = \frac{y_0 + x_0^2}{x_0} \]
The line, L, is given by
\[ y = m(x - x_0) + y_0 = mx + b \]
for a constant b. To find the other point where L intersects E, $(x_1, y_1)$,
we substitute:
\[ 0 = (mx + b)^2 + x(mx + b) + x^3 + a_2x^2 + a_6 = x^3 + (m^2 + m + a_2)x^2 + \dots \]
We know the sum of the roots, $(x_0 + x_0 + x_1)$, is equal to the negative
of the $x^2$ coefficient. So we obtain
\begin{align*}
x_1 &= -(m^2 + m + a_2) - 2x_0 \equiv m^2 + m + a_2 \\
&= \left(\frac{y_0 + x_0^2}{x_0}\right)^2 + \frac{y_0 + x_0^2}{x_0} + a_2 \\
&= \frac{y_0^2 + 2y_0x_0^2 + x_0^4 + x_0y_0 + x_0^3 + a_2x_0^2}{x_0^2} \\
&= \frac{(x_0^3 + a_2x_0^2 + a_6 + x_0y_0) + 2y_0x_0^2 + x_0^4 + x_0y_0 + x_0^3 + a_2x_0^2}{x_0^2} \\
&= \frac{2(x_0^3 + a_2x_0^2 + x_0y_0 + y_0x_0^2) + x_0^4 + a_6}{x_0^2} \equiv \frac{x_0^4 + a_6}{x_0^2} \pmod 2
\end{align*}
The y-coordinate of this intersection is $y_1 = m(x_1 - x_0) + y_0$. Since
$(x_1, y_1) = -2P$ we get $2P = (x_2, y_2)$ where $x_2 = x_1$ and $y_2$ is given by
Equation (2.1). (Note the coefficients in (2.1) refer to the generalised
Weierstrass equation, so here $a_1 = 1$, $a_3 = 0$.)
So if $P = (x_0, y_0)$ we obtain $2P = (x_2, y_2)$ where
\begin{align*}
x_2 &= x_1 = \frac{x_0^4 + a_6}{x_0^2} \\
y_2 &= -x_1 - y_1 \equiv x_2 + m(x_2 - x_0) + y_0, \qquad m = \frac{y_0 + x_0^2}{x_0}
\end{align*}
(II) $y^2 + a_3y = x^3 + a_4x + a_6$: Because we are in characteristic 2 we can
rewrite this as
\[ 0 = y^2 + a_3y + x^3 + a_4x + a_6 \]
Implicit differentiation yields
\[ 0 = 2yy' + a_3y' + 3x^2 + a_4 \equiv a_3y' + (x^2 + a_4) \]
The tangent line L at $P = (x_0, y_0)$ is
\[ y = m(x - x_0) + y_0, \qquad m = \frac{x_0^2 + a_4}{a_3} \]
Note that earlier we showed $a_3 \neq 0$, otherwise the curve would be
singular. Now, substituting to find the third point of intersection,
$(x_1, y_1)$, gives
\[ 0 = (mx + b)^2 + a_3(mx + b) + x^3 + a_4x + a_6 = x^3 + m^2x^2 + \dots \]
So
\[ x_1 = -m^2 - 2x_0 \equiv m^2 = \frac{x_0^4 + 2a_4x_0^2 + a_4^2}{a_3^2} \equiv \frac{x_0^4 + a_4^2}{a_3^2} \]
and $y_1 = m(x_1 - x_0) + y_0$. Therefore $2P = (x_2, y_2)$ where
\begin{align*}
x_2 &= x_1 = \frac{x_0^4 + a_4^2}{a_3^2} \\
y_2 &= -a_3 - y_1 \equiv a_3 + y_1 = a_3 + m(x_2 - x_0) + y_0, \qquad m = \frac{x_0^2 + a_4}{a_3}
\end{align*}
If we want to add two distinct points, so $(x_0, y_0) + (x_1, y_1) = (x_2, y_2)$ with $x_0 \neq x_1$, then
we proceed as before. The line L will have gradient
\[ m = \frac{y_1 - y_0}{x_1 - x_0} \]
and equation $y = m(x - x_0) + y_0$.
(I) If $y^2 + xy = x^3 + a_2x^2 + a_6$ then substituting into E to find the third
point of intersection $(x_2', y_2')$ gives, from the $x^2$ coefficient $m^2 + m + a_2$ found above,
\[ x_2' = m^2 + m + a_2 - x_0 - x_1, \qquad y_2' = m(x_2' - x_0) + y_0 \]
Then using Equation (2.1) we find
\begin{align*}
x_2 &= x_2' = m^2 + m + a_2 - x_0 - x_1 \\
y_2 &= -x_2' - y_2' \equiv x_2 + m(x_2 - x_0) + y_0
\end{align*}
(II) If $y^2 + a_3y = x^3 + a_4x + a_6$ then substituting in E gives
\[ x_2' = m^2 - x_0 - x_1, \qquad y_2' = m(x_2' - x_0) + y_0 \]
Then using Equation (2.1), with $a_1 = 0$, we find
\begin{align*}
x_2 &= x_2' = m^2 - x_0 - x_1 \\
y_2 &= -a_3 - y_2' \equiv a_3 + m(x_2 - x_0) + y_0
\end{align*}
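The case (I) formulas can be exercised on a small field of characteristic 2. The sketch below is illustrative only: it assumes the field $F_8$ built from the irreducible polynomial $x^3 + x + 1$, with elements stored as 3-bit integers so that addition is XOR, and the coefficients `A2`, `A6` are hypothetical values chosen so that $a_6 \neq 0$. It checks that doubling and chord addition always land back on the curve, and that the doubled x-coordinate agrees with $(x_0^4 + a_6)/x_0^2$.

```python
MOD = 0b1011  # x^3 + x + 1, irreducible over F_2, so F_8 = F_2[x]/(MOD)

def mul(a, b):
    """Multiply two elements of F_8 (3-bit integers); addition in F_8 is XOR."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & 0b1000:       # degree reached 3: reduce modulo MOD
            a ^= MOD
    return r

def inv(a):
    """Inverse as a^6, since the nonzero elements form a group of order 7."""
    r = 1
    for _ in range(6):
        r = mul(r, a)
    return r

A2, A6 = 1, 1  # hypothetical coefficients of y^2 + xy = x^3 + a2 x^2 + a6

def on_curve(P):
    x, y = P
    lhs = mul(y, y) ^ mul(x, y)
    rhs = mul(mul(x, x), x) ^ mul(A2, mul(x, x)) ^ A6
    return lhs == rhs

def double(P):
    x0, y0 = P
    if x0 == 0:
        return None                          # here P = -P, so 2P is the point at infinity
    m = mul(y0 ^ mul(x0, x0), inv(x0))       # m = (y0 + x0^2) / x0
    x2 = mul(m, m) ^ m ^ A2
    return (x2, x2 ^ mul(m, x2 ^ x0) ^ y0)

def add(P, Q):
    (x0, y0), (x1, y1) = P, Q                # assumes x0 != x1
    m = mul(y1 ^ y0, inv(x1 ^ x0))
    x2 = mul(m, m) ^ m ^ A2 ^ x0 ^ x1
    return (x2, x2 ^ mul(m, x2 ^ x0) ^ y0)

points = [(x, y) for x in range(8) for y in range(8) if on_curve((x, y))]
doubles_ok = all(on_curve(double(P)) for P in points if P[0] != 0)
sums_ok = all(on_curve(add(P, Q)) for P in points for Q in points if P[0] != Q[0])
print(len(points), doubles_ok, sums_ok)
```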
A.4 Elliptic curves in characteristic 3
The case in characteristic 3 is simpler. We will have an equation of the form
\[ y^2 = x^3 + a_2x^2 + a_4x + a_6 \]
As always, to add two points $P_1$ and $P_2$ on E we draw the line, L, through
them (the tangent if $P_1 = P_2$). We then find the third point of intersection
$P_3'$. We can compute $P_3 = -P_3'$ by reflecting in the x-axis as in the original
case, because here the curve is symmetric about the x-axis, as with
$y^2 = x^3 + Ax + B$. Then $P_1 + P_2 = P_3$.
A.5 The proof of associativity
In this section we introduce the topic of projective geometry. This will allow
us to interpret the point at infinity as being on an elliptic curve, and give us
the necessary background to tackle the proof of associativity.
A.5.1 Projective geometry and the point at infinity
Two dimensional projective space over K, $\mathbb{P}^2_K$, is given by equivalence classes
of triples (x, y, z) with $x, y, z \in K$ and at least one of x, y, z non-zero. We say
two triples $(x_1, y_1, z_1)$ and $(x_2, y_2, z_2)$ are equivalent if there exists a non-zero
element $\lambda \in K$ such that
\[ (x_1, y_1, z_1) = (\lambda x_2, \lambda y_2, \lambda z_2) \]
We then write $(x_1, y_1, z_1) \sim (x_2, y_2, z_2)$. The equivalence class of an element
is the set of elements that are equivalent to it. So here, the equivalence class
of a triple only depends on the ratios of x to y to z. Therefore the equivalence
class of (x, y, z) is denoted (x : y : z).
If (x : y : z) is a point with $z \neq 0$ then (x : y : z) = (x/z : y/z : 1). These
are the finite points in $\mathbb{P}^2_K$. However if z = 0 then we think of this as setting
the x or y coordinate to ∞. Therefore the points (x : y : 0) are the points at
infinity in $\mathbb{P}^2_K$. Later in this section the point at infinity on an elliptic curve
will be identified as one of these points.
The 2-dimensional affine plane over K is usually denoted
\[ \mathbb{A}^2_K = \{(x, y) \in K \times K\} \]
Clearly the map $(x, y) \mapsto (x : y : 1)$ maps all the points of $\mathbb{A}^2_K$ to points in
$\mathbb{P}^2_K$ and so is an inclusion $\mathbb{A}^2_K \hookrightarrow \mathbb{P}^2_K$. So the affine plane is identified
with the finite points in $\mathbb{P}^2_K$.
A polynomial is homogeneous of degree n if it is a sum of terms of the
form $ax^iy^jz^k$ with $a \in K$ and $i + j + k = n$. For example
\[ F(x, y, z) = 2x^3 - 5xyz + 7yz^2 \]
is homogeneous of degree 3. If a polynomial, F, is homogeneous of degree n
then $F(\lambda x, \lambda y, \lambda z) = \lambda^nF(x, y, z)$ for all $\lambda \in K$. So if F is homogeneous of
some degree and $(x_1, y_1, z_1) \sim (x_2, y_2, z_2)$ then $F(x_1, y_1, z_1) = 0$ if and only if
$F(x_2, y_2, z_2) = 0$. Therefore a zero of F in $\mathbb{P}^2_K$ does not depend on how the
equivalence class is represented, so the set of zeros of F in $\mathbb{P}^2_K$ is well defined.
If F(x, y, z) is an arbitrary polynomial in x, y, z then we cannot discuss the
points in $\mathbb{P}^2_K$ where F = 0, as this depends on the representative chosen for the equivalence class of (x, y, z).
For example if $F = x^2 + 2y - 3z$, then F(1, 1, 1) = 0. But F(2, 2, 2) = 2,
even though (1 : 1 : 1) = (2 : 2 : 2). To avoid this problem we work with
homogeneous polynomials as described above.
If f(x, y) is a polynomial in x, y then we can make it homogeneous by
inserting the appropriate powers of z. For example if $f(x, y) = y^2 - x^3 - Ax - B$
then the homogeneous polynomial would be $F(x, y, z) = y^2z - x^3 - Axz^2 - Bz^3$. Explicitly if
\[ f(x, y) = \sum_i a_ix^{p_i}y^{q_i} \]
with $\max_i(p_i + q_i) = n$, then its homogeneous form is
\[ F(x, y, z) = \sum_i a_ix^{p_i}y^{q_i}z^{n - p_i - q_i} \]
We show that
\[ F(x, y, z) = z^n\sum_i a_ix^{p_i}z^{-p_i}y^{q_i}z^{-q_i} = z^n\sum_i a_i\left(\frac{x}{z}\right)^{p_i}\left(\frac{y}{z}\right)^{q_i} = z^nf\!\left(\frac{x}{z}, \frac{y}{z}\right) \tag{A.5} \]
Also, it is clear that
\[ F(x, y, 1) = f(x, y) \]
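Homogenisation is easy to mimic with a dictionary of exponents. The sketch below is an illustration under stated assumptions: polynomials are stored as maps from exponent tuples to integer coefficients, the helper names are hypothetical, and the values A = −1, B = 1 are arbitrary. It checks both $F(x, y, 1) = f(x, y)$ and the scaling property $F(\lambda x, \lambda y, \lambda z) = \lambda^nF(x, y, z)$ at one sample point.

```python
def homogenize(f):
    """Turn {(p, q): coeff} into {(p, q, n - p - q): coeff}, where n is the total degree."""
    n = max(p + q for (p, q) in f)
    return {(p, q, n - p - q): c for (p, q), c in f.items()}, n

def evaluate(poly, point):
    """Evaluate a polynomial stored as {exponent tuple: coeff} at the given point."""
    total = 0
    for exps, c in poly.items():
        term = c
        for base, e in zip(point, exps):
            term *= base ** e
        total += term
    return total

A, B = -1, 1  # hypothetical curve coefficients, for illustration only
f = {(0, 2): 1, (3, 0): -1, (1, 0): -A, (0, 0): -B}   # y^2 - x^3 - Ax - B
F, n = homogenize(f)                                   # y^2 z - x^3 - A x z^2 - B z^3

print(F, n)
print(evaluate(F, (2, 3, 1)) == evaluate(f, (2, 3)))
print(evaluate(F, (4, 6, 2)) == 2 ** n * evaluate(F, (2, 3, 1)))
```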
We can now see why two parallel lines are said to meet at infinity. Let
\[ y = mx + b_1, \qquad y = mx + b_2 \]
be two non-vertical parallel lines, with $b_1 \neq b_2$. Their homogeneous forms
can be found as before (in the form F = 0), or expressed as below by simply
rearranging:
\[ y = mx + b_1z, \qquad y = mx + b_2z \]
To find the point of intersection we solve these simultaneously, to get
\[ z(b_1 - b_2) = 0 \ \Rightarrow\ z = 0 \ \Rightarrow\ y = mx \]
We cannot have all of x, y, z equal to 0, so $x \neq 0$. This allows us to rescale
by x to show the intersection is at
\[ (x : mx : 0) = (1 : m : 0) \]
Similarly if $x = c_1$ and $x = c_2$ are two vertical lines then they intersect at
(0 : 1 : 0), which is also one of the points at infinity in $\mathbb{P}^2_K$.
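The same intersection can be computed for any two projective lines at once. The sketch below uses a standard alternative to solving simultaneously, swapped in here for brevity: if the lines are $s_1x + t_1y + r_1z = 0$ and $s_2x + t_2y + r_2z = 0$, the cross product of the coefficient vectors is a point lying on both. The helper names and the sample values m = 3, $b_1 = 1$, $b_2 = 5$ are assumptions for illustration.

```python
from fractions import Fraction

def cross(l1, l2):
    """Cross product of two coefficient vectors (s, t, r) of lines sx + ty + rz = 0."""
    s1, t1, r1 = l1
    s2, t2, r2 = l2
    return (t1 * r2 - r1 * t2, r1 * s2 - s1 * r2, s1 * t2 - t1 * s2)

def normalize(P):
    """Rescale a projective point so that its first non-zero coordinate is 1."""
    first = next(c for c in P if c != 0)
    return tuple(Fraction(c) / first for c in P)

m, b1, b2 = Fraction(3), Fraction(1), Fraction(5)   # hypothetical sample values
# y = m x + b z  rewritten as  m x - y + b z = 0
parallel = normalize(cross((m, -1, b1), (m, -1, b2)))
# x = c z  rewritten as  x - c z = 0, here with c = 2 and c = 7
vertical = normalize(cross((1, 0, -2), (1, 0, -7)))

print(parallel)   # the point (1 : m : 0)
print(vertical)   # the point (0 : 1 : 0)
```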
Now consider the elliptic curve $y^2 = x^3 + Ax + B$ with homogeneous form
\[ y^2z = x^3 + Axz^2 + Bz^3 \]
The points (x, y) on the original curve correspond to (x : y : 1) on the
projective curve. To see which points on E lie at infinity, set z = 0 to obtain
$0 = x^3$. Therefore x = 0 and y is any nonzero number. We rescale by y to
show that
\[ (0 : y : 0) = (0 : 1 : 0) \]
is the only point at infinity of E. This is why we think of the infinity point
as being at the end of the y-axis. Also since (0 : 1 : 0) = (0 : −1 : 0) the
points at infinity at the top and bottom of the y-axis are the same.
Next look for points at infinity on the generalised Weierstrass equation.
The homogeneous form of the equation is
\[ y^2z + a_1xyz + a_3yz^2 = x^3 + a_2x^2z + a_4xz^2 + a_6z^3 \]
When we set z = 0 we get $0 = x^3$. Therefore ∞ = (0 : 1 : 0) is the only point
at infinity here, just as it was with the Weierstrass equation.
Throughout this project we usually work in the standard affine coordinates. However, there are situations where projective coordinates simplify
calculations, such as the proof of associativity, which is simpler in
projective notation.
A.5.2 Lines in $\mathbb{P}^2_K$
The standard way to describe a line in $\mathbb{P}^2_K$ is by a linear equation
$sx + ty + rz = 0$. Sometimes it is useful to give a parametric description:
\begin{align*}
x &= a_1u + b_1v \\
y &= a_2u + b_2v \tag{A.6} \\
z &= a_3u + b_3v
\end{align*}
where u, v run through K, and at least one of u, v is non-zero. For example
if $s \neq 0$ the line $sx + ty + rz = 0$ can be described by
\begin{align*}
x &= -\frac{t}{s}u - \frac{r}{s}v \\
y &= 1\cdot u + 0\cdot v = u \\
z &= 0\cdot u + 1\cdot v = v
\end{align*}
Suppose all the vectors $(a_i, b_i)$ are multiples of each other, so $(a_i, b_i) = \lambda_i(a_1, b_1)$. Then $(x, y, z) = x(1, \lambda_2, \lambda_3)$ for all u, v such that $x \neq 0$. So we get
a point, rather than a line, in projective space. We need to impose a condition
on the coefficients $a_1, ..., b_3$ that ensures we actually get a line. This can be
expressed as making sure the matrix
\[ \begin{pmatrix} a_1 & b_1 \\ a_2 & b_2 \\ a_3 & b_3 \end{pmatrix} \]
has rank 2.
If $(u_1, v_1) = \lambda(u_2, v_2)$ for some $\lambda \in K^\times$ then $(u_1, v_1)$ and $(u_2, v_2)$ yield
equivalent triples (x, y, z). Therefore we can regard (u, v) as running through
points (u : v) in 1-dimensional projective space $\mathbb{P}^1_K$.
We want to quantify the order to which a line intersects a curve at a
point.
Lemma A.6. Let G(u, v) be a non-zero homogeneous polynomial and let
$(u_0 : v_0) \in \mathbb{P}^1_K$. Then there exists an integer $k \geq 0$ and a polynomial H(u, v)
with $H(u_0, v_0) \neq 0$ such that
\[ G(u, v) = (v_0u - u_0v)^kH(u, v) \]
Proof Suppose $v_0 \neq 0$. Let m be the degree of G and let $g(u) = G(u, v_0)$.
Factor out as large a power of $(u - u_0)$ as possible so
\[ g(u) = (u - u_0)^kh(u) \]
for some $k \geq 0$ and for some polynomial h, with degree $(m - k)$ and with
$h(u_0) \neq 0$. Let $H(u, v) = (v^{m-k}/v_0^m)\,h(uv_0/v)$, so H(u, v) is homogeneous of
degree $(m - k)$. Then by Equation (A.5)
\begin{align*}
G(u, v) &= \left(\frac{v}{v_0}\right)^m g\!\left(\frac{uv_0}{v}\right) \\
&= \left(\frac{v}{v_0}\right)^m\left(\frac{uv_0}{v} - u_0\right)^k h\!\left(\frac{uv_0}{v}\right) \\
&= \frac{v^{m-k}}{v_0^m}(v_0u - u_0v)^k\, h\!\left(\frac{uv_0}{v}\right) \\
&= (v_0u - u_0v)^kH(u, v)
\end{align*}
as desired.
If $v_0 = 0$ then $u_0 \neq 0$ and the proof would be the same with the roles of
u and v reversed.
Let f (x, y) = 0 describe a curve C in the affine plane and let
x = a1 t + b 1 ,
y = a2 t + b 2
be a line L written in terms of the parameter t. Let
f˜(t) = f (a1 t + b1 , a2 t + b2 )
Then L intersects C when t = t0 if f̃(t0 ) = 0. If (t − t0 )^2 divides f̃(t), and the
point corresponding to t0 is nonsingular, then L is tangent to C (see Lemma
A.8). Generally, we say that L intersects C to order n at the point (x, y)
corresponding to t = t0 if (t − t0 )^n is the highest power of (t − t0 ) that divides
f̃(t).
The homogeneous version of this is as follows. Let F (x, y, z) be a homogeneous polynomial, so F = 0 describes a curve C in PK2 . Let L be a line
given parametrically and let
F̃ (u, v) = F (a1 u + b1 v, a2 u + b2 v, a3 u + b3 v)
We say that L intersects C to order n at the point P = (x0 : y0 : z0 )
corresponding to (u : v) = (u0 : v0 ) if (v0 u − u0 v)^n is the highest power of
(v0 u − u0 v) dividing F̃ (u, v). We denote this by
\[ \operatorname{ord}_{L,P}(F) = n \]
If F̃ is identically zero, then we let ord_{L,P}(F ) = ∞. This order is independent
of the chosen parameterization of L. Note that v = v0 = 1 corresponds to
the non-homogeneous case above, and the benefit of this formulation is that
we can treat the points at infinity along with the finite points in a uniform
manner.
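The affine definition of intersection order can be tried out numerically. The following is only a sketch under our own assumptions: the curve is stored as a dictionary of monomials, the line is translated so that the point of interest corresponds to t = 0, and arithmetic is done over the rationals. The helper names (poly_mul, restrict_to_line, intersection_order) are hypothetical and do not come from the Matlab files of this project.

```python
from fractions import Fraction


def poly_mul(p, q):
    """Multiply two polynomials in t given as coefficient lists (lowest degree first)."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def poly_pow(p, n):
    """Raise a polynomial to a non-negative integer power."""
    out = [Fraction(1)]
    for _ in range(n):
        out = poly_mul(out, p)
    return out


def restrict_to_line(f, point, direction):
    """Coefficients of f~(t) = f(x0 + a*t, y0 + b*t).

    f is a dict {(i, j): c} standing for the sum of c * x^i * y^j.
    """
    (x0, y0), (a, b) = point, direction
    total = [Fraction(0)]
    for (i, j), c in f.items():
        term = poly_mul(poly_pow([Fraction(x0), Fraction(a)], i),
                        poly_pow([Fraction(y0), Fraction(b)], j))
        term = [c * coeff for coeff in term]
        size = max(len(total), len(term))
        total = [(total[k] if k < len(total) else 0) +
                 (term[k] if k < len(term) else 0) for k in range(size)]
    return total


def intersection_order(f, point, direction):
    """Highest power of t dividing f~(t); None stands for infinite order."""
    for k, coeff in enumerate(restrict_to_line(f, point, direction)):
        if coeff != 0:
            return k
    return None


# The cuspidal cubic y^2 = x^3, written as f(x, y) = y^2 - x^3.
cusp = {(0, 2): 1, (3, 0): -1}
print(intersection_order(cusp, (0, 0), (1, 1)))  # generic line through the cusp: 2
print(intersection_order(cusp, (0, 0), (1, 0)))  # the line y = 0: 3
```

Shifting the parameter so that t0 = 0 keeps the code short: the order is then just the index of the first non-zero coefficient of f̃(t).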
Lemma A.7. Let L1 and L2 be lines intersecting at a point P . For i = 1, 2
let Li (x, y, z) be a linear polynomial defining Li . Then ord_{L1,P}(L2 ) = 1 unless
L1 (x, y, z) = αL2 (x, y, z) for a constant α, in which case ord_{L1,P}(L2 ) = ∞.
Proof When we substitute the parameterization for L1 into L2 (x, y, z), we
obtain L̃2 , which is a linear form in u, v. Let P correspond to (u0 : v0 ).
Since L̃2 (u0 , v0 ) = 0, it follows that L̃2 (u, v) = β(v0 u − u0 v) for some constant
β. If β ≠ 0 then ord_{L1,P}(L2 ) = 1.
If β = 0 then all points on L1 lie on L2 . Since two points in PK2 determine
a line, and L1 has at least three points, it follows that L1 and L2 are the same
line. Therefore L1 (x, y, z) is proportional to L2 (x, y, z), L̃2 is identically zero
and ord_{L1,P}(L2 ) = ∞.
A line that intersects a curve to order at least 2 is usually tangent to the
curve. But consider the curve C defined by
\[ F(x, y, z) = y^2 z - x^3 = 0 \]
Let
\[ x = au, \qquad y = bu, \qquad z = v \]
be a line through the point P = (0 : 0 : 1). Note that P corresponds to
(u : v) = (0 : 1). Then F̃ (u, v) = (b^2 u^2 )v − a^3 u^3 = u^2 (b^2 v − a^3 u), so every line
through P intersects C to order at least 2. The line with b = 0 intersects
with order 3, and is the best choice for the tangent at P . The affine part
of C is y^2 = x^3 , which has a singular point at (0, 0).
A curve C in PK2 defined by F (x, y, z) = 0 is said to be non-singular at a
point P if at least one of the partial derivatives Fx , Fy , Fz is nonzero at P .
Consider the elliptic curve defined by
\[ F(x, y, z) = y^2 z - x^3 - A x z^2 - B z^3 = 0 \]
Assume the characteristic of our field, K, is not 2 or 3. We have
\[ F_x = -3x^2 - A z^2, \qquad F_y = 2yz, \qquad F_z = y^2 - 2Axz - 3Bz^2 \]
Now suppose P = (x : y : z) is a singular point, so the partial derivatives at
this point all vanish. If z = 0 then Fx = 0 implies x = 0 and Fz = 0 implies
y = 0, so P = (0 : 0 : 0), which is impossible. Therefore z ≠ 0, so take z = 1.
Now Fy = 0 gives y = 0. Since (x : y : 1) lies on the curve we know x
satisfies both
\[ x^3 + Ax + B = 0 \qquad \text{and} \qquad F_x = -3x^2 - A = 0 \]
So x is a root of the polynomial and of its derivative, making it a multiple root.
However we assumed this was not the case, so we have a contradiction. Therefore an elliptic curve (with no multiple roots) has no singular points.
Note this is true even when considering points over K̄, the algebraic closure of
K. In general a non-singular curve will mean a curve with no singular points
over K̄.
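As a sanity check of this argument over a prime field, one can compare a brute-force search for singular affine points with the standard repeated-root test 4A^3 + 27B^2 = 0. The sketch below assumes a prime p > 3 and uses hypothetical helper names of our own choosing; it is an illustration, not one of the project's m-files.

```python
def discriminant_is_zero(A, B, p):
    """True when x^3 + Ax + B has a repeated root over the closure of F_p (p > 3)."""
    return (4 * A**3 + 27 * B**2) % p == 0


def affine_singular_points(A, B, p):
    """Points (x, y) over F_p on y^2 = x^3 + Ax + B where F_x and F_y also vanish (z = 1)."""
    points = []
    for x in range(p):
        for y in range(p):
            on_curve = (y * y - (x**3 + A * x + B)) % p == 0
            f_x = (-3 * x * x - A) % p
            f_y = (2 * y) % p
            if on_curve and f_x == 0 and f_y == 0:
                points.append((x, y))
    return points


print(affine_singular_points(0, 0, 7))   # the cusp y^2 = x^3 has the singular point (0, 0)
print(affine_singular_points(1, 1, 5))   # y^2 = x^3 + x + 1 over F_5 has none
```

The partial derivative F_z is not tested separately: with z = 1, Euler's relation x F_x + y F_y + z F_z = 3F forces it to vanish once F, F_x and F_y do.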
If P is a non-singular point of a curve F (x, y, z) = 0 then the tangent line
at P is Fx (P )x + Fy (P )y + Fz (P )z = 0.
For example if F (x, y, z) = y^2 z − x^3 − Axz^2 − Bz^3 = 0, then the tangent
line at (x0 : y0 : z0 ) is
\[ (-3x_0^2 - A z_0^2)\,x + (2 y_0 z_0)\,y + (y_0^2 - 2Ax_0 z_0 - 3B z_0^2)\,z = 0 \]
If we set z0 = z = 1 then we obtain
\[ (-3x_0^2 - A)\,x + (2y_0)\,y + (y_0^2 - 2Ax_0 - 3B) = 0 \]
Then using y0^2 = x0^3 + Ax0 + B gives
\[ (-3x_0^2 - A)(x - x_0) + 2y_0 (y - y_0) = 0 \]
which is the tangent line in affine coordinates that was used in deriving the
addition formulas. Now consider the point at infinity on this curve. We have
(x0 : y0 : z0 ) = (0 : 1 : 0). The tangent line is given by 0x + 0y + 1z = 0,
which is the line at infinity z = 0 in PK2 . It intersects the elliptic curve only at
(0 : 1 : 0), which corresponds to the fact that ∞ + ∞ = ∞ on an elliptic
curve.
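The tangent line formula is easy to evaluate directly. The short sketch below (with hypothetical function names, working with integer coordinates) returns the coefficient triple (Fx (P ), Fy (P ), Fz (P )) for the Weierstrass cubic and checks that P itself lies on the resulting line.

```python
def tangent_line(A, B, P):
    """Coefficients (F_x(P), F_y(P), F_z(P)) for F = y^2 z - x^3 - A x z^2 - B z^3."""
    x, y, z = P
    return (-3 * x * x - A * z * z,
            2 * y * z,
            y * y - 2 * A * x * z - 3 * B * z * z)


def on_line(line, Q):
    """True when the projective point Q satisfies s*x + t*y + r*z = 0."""
    return sum(c * q for c, q in zip(line, Q)) == 0


# y^2 = x^3 + 1 (A = 0, B = 1) contains the point (2 : 3 : 1).
line = tangent_line(0, 1, (2, 3, 1))
print(line, on_line(line, (2, 3, 1)))

# At the point at infinity (0 : 1 : 0) the tangent is the line z = 0.
print(tangent_line(0, 1, (0, 1, 0)))
```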
Lemma A.8. Let F (x, y, z) = 0 define a curve C. If P is a nonsingular
point of C, then there is exactly one line in PK2 that intersects C at P to order at
least 2, and it is the tangent to C at P .
Proof Let L be a line intersecting C at P to order k ≥ 1. Parameterize L
and substitute into F to give F̃ (u, v). Let (u0 : v0 ) correspond to P ; then F̃ =
(v0 u − u0 v)^k H(u, v) for some H with H(u0 , v0 ) ≠ 0. Then using the product
rule
\[ \tilde{F}_u(u, v) = k v_0 (v_0 u - u_0 v)^{k-1} H(u, v) + (v_0 u - u_0 v)^k H_u(u, v) \]
\[ \tilde{F}_v(u, v) = -k u_0 (v_0 u - u_0 v)^{k-1} H(u, v) + (v_0 u - u_0 v)^k H_v(u, v) \]
We know that k ≥ 2 if and only if F̃u (u0 , v0 ) = F̃v (u0 , v0 ) = 0.
Suppose k ≥ 2; then the chain rule shows that at P
\[ \tilde{F}_u = a_1 F_x + a_2 F_y + a_3 F_z = 0, \qquad \tilde{F}_v = b_1 F_x + b_2 F_y + b_3 F_z = 0 \tag{A.7} \]
Recall that since we are dealing with a line, the vectors (a1 , a2 , a3 ) and
(b1 , b2 , b3 ) are linearly independent.
Suppose that L′ were another line that intersects C at P to order at least 2.
Then we obtain the second set of equations
\[ a'_1 F_x + a'_2 F_y + a'_3 F_z = 0, \qquad b'_1 F_x + b'_2 F_y + b'_3 F_z = 0 \]
at P .
If the vectors a′ = (a′1 , a′2 , a′3 ) and b′ = (b′1 , b′2 , b′3 ) span the same plane in
K^3 as a = (a1 , a2 , a3 ) and b = (b1 , b2 , b3 ) then
\[ a' = \alpha a + \beta b, \qquad b' = \gamma a + \delta b \]
for some invertible matrix
\[ \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} \]
Therefore
\[ u a' + v b' = (u\alpha + v\gamma)\,a + (u\beta + v\delta)\,b \equiv u_1 a + v_1 b \]
for a new choice of parameters u1 , v1 . This means that L and L′ are the same
line.
If the vectors spanned different planes then L and L′ would be different lines.
However if this were the case then a, b, a′, b′ would span all of K^3 . Since (Fx , Fy , Fz )
has dot product zero with these vectors, this implies it is the zero vector.
This in turn means P is a singular point, contrary to assumption.
So we have shown that there is only one line that intersects C at P with order
k ≥ 2. We must now show that this is the tangent line. Suppose that Fx ≠ 0.
The tangent line can be given the parameterization
\[ x = -(F_y/F_x)\,u - (F_z/F_x)\,v, \qquad y = u, \qquad z = v \]
so, in the notation of Equation (A.6),
\[ a_1 = -F_y/F_x, \quad b_1 = -F_z/F_x, \quad a_2 = 1, \quad b_2 = 0, \quad a_3 = 0, \quad b_3 = 1 \]
Substitute into Equation (A.7) to get
\[ \tilde{F}_u = (-F_y/F_x)F_x + F_y = 0, \qquad \tilde{F}_v = (-F_z/F_x)F_x + F_z = 0 \]
Therefore the tangent line intersects the curve to order k ≥ 2.
A.5.3 The proof of associativity
The proof of associativity will follow easily from the next theorem. The
proof of this theorem would be considerably simplified if the points Pij were
assumed to be distinct. The cases where they are equal correspond to the
cases when a tangent line is used in the group operation.
Theorem A.9. Let C(x, y, z) be a homogeneous cubic polynomial and let C
be the curve in PK2 described by C(x, y, z) = 0. Let l1 , l2 , l3 and m1 , m2 , m3 be
lines in PK2 such that li ≠ mj for all i, j. Let Pij be the point of intersection
of li and mj . Suppose Pij is a nonsingular point on the curve C for all
(i, j) ≠ (3, 3).
In addition we require that if, for some i, there are k ≥ 2 of the points
Pi1 , Pi2 , Pi3 equal to the same point, then li intersects C to order at least k.
Similarly, if for some j there are k ≥ 2 of the points P1j , P2j , P3j equal to the
same point, then mj intersects C to order at least k.
Then P33 also lies on the curve C.
Proof Express l1 in the parametric form of Equation (A.6), so C(x, y, z)
becomes C̃(u, v). The line l1 passes through P11 , P12 , P13 . Let (u1 : v1 ),
(u2 : v2 ), (u3 : v3 ) be the parameters on l1 for these points. Since these points
lie on C we have C̃(ui , vi ) = 0 for i = 1, 2, 3.
Let mj have equation mj (x, y, z) = aj x + bj y + cj z = 0. Substituting
the parameterization for l1 yields m̃j (u, v). Since P1j lies on mj we have
m̃j (uj , vj ) = 0 for j = 1, 2, 3. Since l1 ≠ mj and since the zeros of m̃j yield
the intersections of l1 and mj , the function m̃j (u, v) vanishes only at P1j ,
so the linear form m̃j is nonzero.
Therefore m̃1 (u, v)m̃2 (u, v)m̃3 (u, v) is a nonzero cubic homogeneous polynomial. We need to relate this to C̃.
Lemma A.10. Let R(u, v) and S(u, v) be homogeneous polynomials of degree
3, with S not identically zero. Suppose there are three points (ui : vi ), i = 1, 2, 3
at which R and S vanish. If k of these points are equal to the same point
then we also require that (vi u − ui v)^k divides R and S.
Then there is a constant α ∈ K such that R = αS.
Proof First we prove that a non-zero cubic homogeneous polynomial S(u, v)
can have at most 3 zeros (u : v) in PK1 (counting multiplicities). Factor off
the highest power of v, say v^k , so S(u, v) vanishes to order k at (1 : 0) and
S(u, v) = v^k S0 (u, v) with S0 (1, 0) ≠ 0. Since S0 (u, 1) is a polynomial of
degree (3 − k), it can have at most (3 − k) zeros, and exactly this many if K is
algebraically closed. All points (u : v) ≠ (1 : 0) can be written in the form
(u : 1), so S0 (u, v) has at most 3 − k zeros in PK1 . Therefore S(u, v) has at
most k + (3 − k) = 3 zeros in PK1 .
Let (u0 : v0 ) be any point in PK1 not equal to any of the (ui : vi ). Since
S can have at most three zeros, S(u0 , v0 ) ≠ 0. Let α = R(u0 , v0 )/S(u0 , v0 ).
Then R(u, v) − αS(u, v) is a cubic homogeneous polynomial that vanishes at
the four points (ui : vi ), i = 0, 1, 2, 3. Therefore R − αS must be identically
zero.
Now we can note that C̃ and m̃1 m̃2 m̃3 vanish at the points (ui : vi ), i =
1, 2, 3. Also if k of the points P1j are the same point then k of the linear
functions vanish at this point, so m̃1 m̃2 m̃3 vanishes to order at least k, and
by assumption so does C̃. So by the lemma there is a constant α so
C̃ = αm̃1 m̃2 m̃3
Let
C1 (x, y, z) = C(x, y, z) − αm1 (x, y, z)m2 (x, y, z)m3 (x, y, z)
The line l1 can be described by the linear equation l1 (x, y, z) = ax + by + cz =
0. At least one coefficient is non-zero, so assume a ≠ 0 (the other cases are
similar). The parameterization of l1 can be taken to be
\[ x = -(b/a)\,u - (c/a)\,v, \qquad y = u, \qquad z = v \tag{A.8} \]
Then C̃1 (u, v) = C1 (−(b/a)u − (c/a)v, u, v). We can regroup to write C1 (x, y, z)
as a polynomial in x with polynomials in y, z as coefficients. Then writing
\[ x^n = \frac{1}{a^n}\left[(ax + by + cz) - (by + cz)\right]^n = \frac{1}{a^n}\left[(ax + by + cz)^n + \dots\right] \]
(which on l1 itself reduces to x^n = [−(b/a)u − (c/a)v]^n = (1/a^n )[−(by + cz)]^n )
allows us to give C1 (x, y, z) as a polynomial in ax + by + cz whose coefficients
are polynomials in y, z:
\[ C_1(x, y, z) = a_3(y, z)(ax + by + cz)^3 + \dots + a_0(y, z) \tag{A.9} \]
for some polynomials ai (y, z), i = 0, 1, 2, 3. Substituting Equation (A.8) into
Equation (A.9) yields
\[ 0 = \tilde{C}_1(u, v) = a_0(u, v) \]
Therefore a0 (y, z) = a0 (u, v) is the zero polynomial. It follows from Equation
(A.9) that C1 (x, y, z) is a multiple of l1 (x, y, z) = ax + by + cz.
Similarly there is a constant β such that C(x, y, z) − βl1 l2 l3 is a multiple
of m1 . Let
D(x, y, z) = C − αm1 m2 m3 − βl1 l2 l3
Then D is a multiple of l1 and a multiple of m1 .
Lemma A.11. D(x, y, z) is a multiple of l1 (x, y, z)m1 (x, y, z).
Proof Write D = m1 D1 , so we need to show that l1 divides D1 . Parameterize
l1 as in Equation (A.8) (again considering the case a ≠ 0). Then substituting
yields D̃ = m̃1 D̃1 . Since l1 divides D, we have D̃ = 0, and since m1 ≠ l1 we
have m̃1 ≠ 0. Therefore D̃1 (u, v) is the zero polynomial. This implies that
D1 (x, y, z) is a multiple of l1 as required.
So by the lemma D(x, y, z) = l1 m1 l where l(x, y, z) is linear. By assumption C = 0 at P22 , P23 , P32 , and both l1 l2 l3 and m1 m2 m3 vanish at these points.
Therefore D(x, y, z) vanishes at these three points. We must show that D is
identically zero.
Lemma A.12. l(P22 ) = l(P23 ) = l(P32 ) = 0.
Proof Suppose that P13 ≠ P23 . If l1 (P23 ) = 0 then P23 is on the line l1 , as
well as on l2 and m3 by definition. Therefore P23 = P13 , the intersection of
l1 and m3 , which is a contradiction. So l1 (P23 ) ≠ 0, and then because D(P23 ) = 0
we have m1 (P23 )l(P23 ) = 0.
Next suppose that P13 = P23 . Then by the assumption of Theorem A.9,
m3 is tangent to C at P23 , so ord_{m3,P23}(C) ≥ 2. Since P13 = P23 and P23 lies
on m3 we have
\[ \operatorname{ord}_{m_3,P_{23}}(l_1) = \operatorname{ord}_{m_3,P_{23}}(l_2) = 1 \]
Therefore ord_{m3,P23}(βl1 l2 l3 ) ≥ 2. Also ord_{m3,P23}(αm1 m2 m3 ) = ∞, therefore
ord_{m3,P23}(D) ≥ 2 since D is a sum of terms, each of which vanishes to order
at least 2. But ord_{m3,P23}(l1 ) = 1, so
\[ \operatorname{ord}_{m_3,P_{23}}(m_1 l) = \operatorname{ord}_{m_3,P_{23}}(D) - \operatorname{ord}_{m_3,P_{23}}(l_1) \geq 1 \]
Therefore m1 (P23 )l(P23 ) = 0.
So in both cases we have m1 (P23 )l(P23 ) = 0.
If m1 (P23 ) ≠ 0 then l(P23 ) = 0 as required.
If m1 (P23 ) = 0 then P23 lies on m1 , as well as on l2 and m3 . Therefore
P23 = P21 , since l2 and m1 intersect at a unique point. By the assumption
of Theorem A.9, l2 is tangent to C at P23 and so ord_{l2,P23}(C) ≥ 2. As above
ord_{l2,P23}(D) ≥ 2 so
\[ \operatorname{ord}_{l_2,P_{23}}(l_1 l) \geq 1 \]
If l1 (P23 ) = 0 then P23 lies on l1 , l2 , m3 and therefore P13 = P23 . By assumption m3 is tangent to C at P23 . Since P23 is a nonsingular point of C, by
Lemma A.8 we have l2 = m3 , a contradiction.
Therefore l1 (P23 ) ≠ 0 and so l(P23 ) = 0 as required.
The equalities l(P22 ) = l(P32 ) = 0 follow similarly.
Suppose for a contradiction that l(x, y, z) is not identically zero, and so defines a line l.
First suppose that P23 , P22 , P32 are distinct. Then l and l2 are lines
through P23 and P22 , and so l = l2 . Similarly l = m2 and so l2 = m2 ,
which is a contradiction.
Next suppose P32 = P22 , so m2 is tangent to C at P22 . As before
\[ \operatorname{ord}_{m_2,P_{22}}(l_1 m_1 l) \geq 2 \]
We will show this forces l = m2 .
If m1 (P22 ) = 0, then P22 lies on m1 , m2 , l2 and so P21 = P22 . This means
that l2 is tangent to C at P22 . By Lemma A.8, l2 = m2 , a contradiction.
Therefore m1 (P22 ) ≠ 0.
If l1 (P22 ) ≠ 0, then ord_{m2,P22}(l) ≥ 2, which means l = m2 .
If l1 (P22 ) = 0, then P22 = P32 lies on l1 , l2 , l3 , m2 so P12 = P22 = P32 .
Therefore ord_{m2,P22}(C) ≥ 3 and so by the reasoning above ord_{m2,P22}(l1 m1 l) ≥
3. We proved m1 (P22 ) ≠ 0 so ord_{m2,P22}(l) ≥ 2. This means l = m2 .
So under the assumption that P32 = P22 , l is the same line as m2 . Now
by Lemma A.12 P23 lies on l and therefore on m2 , as well as on l2 and m3 by
definition. Therefore P22 = P23 , and so l2 is tangent to C at P22 . However
P32 = P22 means m2 is tangent to C at P22 as well. This means that l2 = m2 ,
contrary to assumption, so P32 ≠ P22 .
We can show that P23 ≠ P22 similarly, with the roles of the indices reversed.
Finally suppose that P23 = P32 , so P23 lies on l2 , l3 , m2 , m3 . This implies
P22 = P32 , which we know is impossible.
All possibilities lead to contradictions, so we conclude that l(x, y, z) is
identically zero. This in turn gives D = 0 so
\[ C = \alpha m_1 m_2 m_3 + \beta l_1 l_2 l_3 \]
Since l3 and m3 vanish at P33 , we have C(P33 ) = 0 as desired, completing
the proof of Theorem A.9.
Proof Of Associativity
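Let P, Q, R be points on an elliptic curve E. Define the lines
\[ l_1 = \overline{P, Q}, \qquad l_2 = \overline{\infty, Q + R}, \qquad l_3 = \overline{R, P + Q} \]
\[ m_1 = \overline{Q, R}, \qquad m_2 = \overline{\infty, P + Q}, \qquad m_3 = \overline{P, Q + R} \]
where + refers to elliptic curve addition and the bar denotes the line through the two points. It can be easily verified that these
lines have the following intersections (where X is unknown).

        l1           l2           l3
m1      Q            −(Q + R)     R
m2      −(P + Q)     ∞            P + Q
m3      P            Q + R        X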
We first deal with some special cases:
(i) If P, Q or R is ∞ then associativity is trivial. For example, if P = ∞
then, as required
(P + Q) + R = (Q) + R = Q + R
P + (Q + R) = (Q + R) = Q + R
(ii) If P + Q = ∞ then
(P + Q) + R = ∞ + R = R
To find (Q + R) we draw the line L through Q and R, which intersects
E again at −(Q + R). Since P + Q = ∞ we have P = −Q, the reflection of Q
in the x-axis. So the reflection of L will be the line L′ which
passes through P , −R and (Q + R). Now P + (Q + R) is found by
drawing the line through P and (Q + R), which is L′. The third point
of intersection of L′ with E is −R. Therefore
P + (Q + R) = R
So associativity holds in this case.
(iii) If Q + R = ∞ then associativity holds similarly to above.
So now assume that P, Q, R, (P + Q), (Q + R) ≠ ∞. We must now verify
the assumptions of Theorem A.9 for the remaining cases. Now, if two of the
points on a line are equal then by definition the line through them will be
the tangent line, and will intersect to order 2. If three of the points are equal
then it implies that all three are ∞. Earlier we saw that if the tangent line to
the curve intersects at ∞ then it will intersect to order 3, so this assumption
is satisfied.
Suppose that li ≠ mj for all i, j. Then the assumptions of Theorem A.9
are all satisfied and so all the points in the table, including X lie on E. Now
l3 will have three points of intersection with E; R, (P + Q) and X. By the
definition of elliptic curve addition we have
X = −[(P + Q) + R]
Similarly m3 intersects E in three places; P, (Q + R) and X so
X = −[P + (Q + R)]
So we see that, (P + Q) + R = P + (Q + R) as desired.
Our final task will be to consider what happens if some line li equals some
line mj . First observe the following three results:
(i) If P, Q, R are collinear then
(P + Q) + R = (−R) + R = ∞ and P + (Q + R) = P + (−P ) = ∞
So associativity holds.
(ii) If P,Q,(Q+R) are collinear then P + (Q + R) = −Q.
Also P + Q = −(Q + R) so
(P + Q) + R = −(Q + R) + R = −Q
where the second equality is proved by Lemma A.13 below.
(iii) If Q, R, (P + Q) are collinear then associativity holds as above.
Lemma A.13. Let P1 , P2 be points on an elliptic curve. Then
(P1 + P2 ) − P2 = P1
and
− (P1 + P2 ) + P2 = −P1
Proof The first equation is the reflection of the second so we just prove the
second. The line, L, through P1 and P2 intersects the elliptic curve again at
−(P1 + P2 ). So to calculate −(P1 + P2 ) + P2 we would draw the line between
them which is L. This cuts again at P1 so its reflection is −P1 .
Now suppose li = mj for some i, j. We can assume that all the points of
intersection except ∞ and possibly X are finite. Consider the various cases.
(i) l1 = m1 : Then P, Q, R are on the same line. This means they are
collinear and so associativity follows.
(ii) l1 = m2 : The line through ∞ and P + Q is vertical, so the line through P and Q is too. Therefore P + Q = ∞,
and by the earlier argument associativity follows.
(iii) l2 = m1 : In this case Q + R = ∞, so associativity holds similarly.
(iv) l1 = m3 : Then P, Q and (Q + R) are collinear, so associativity holds.
(v) l3 = m1 : Then Q, R and (P + Q) are collinear, so associativity holds.
(vi) l2 = m2 : So we know that (P + Q), (Q + R) and ∞ are on this line. So
P + Q = ±(Q + R). If P + Q = Q + R then by Lemma A.13
P = (P + Q) − Q = (Q + R) − Q = R
Therefore
(P +Q) +R = R + (P +Q) = P +(P +Q) = P +(R +Q) = P +(Q+R)
as required. If P + Q = −(Q + R), then
(P + Q) + R = −(Q + R) + R = −Q
P + (Q + R) = P − (P + Q) = −Q
So associativity holds.
(vii) l2 = m3 : We have a line with P, (Q+R), ∞ on it meaning P = −(Q+R).
Since Q, R and −(Q + R) are collinear by definition we have that Q
and R are on this line as well. So P, Q, R are collinear and associativity
holds.
(viii) l3 = m2 : We have a line with R, (P + Q), ∞ on it so associativity holds
similarly to the previous case.
(ix) l3 = m3 : So P, R, (Q + R) and (P + Q) lie on the same line, but
this line cannot intersect E in 4 points, so either P = R, P = P + Q or
Q + R = P + Q (other combinations would imply ∞ was on the line).
If P = R then we are in the case l2 = m2 . If P = P + Q then
P − P = (P + Q) − P
∞ = Q
and so associativity follows. If Q + R = P + Q then similarly adding
−Q, gives P = R which we have already treated.
So this completes the proof of associativity for all possible cases. When
we are working in characteristic 2 the proof of associativity is very similar
to this case, since with the generalised Weierstrass equation E can still be
given as a homogeneous cubic polynomial and so Theorem A.9 can still be
applied.
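Over a small prime field the associative law can also be confirmed exhaustively. The sketch below is our own illustration (not one of the project's Matlab files): it implements the affine addition formulas for y^2 = x^3 + Ax + B over Fp , with None standing for ∞, and tests every triple of points. The function names are hypothetical, and pow(x, -1, p) for the modular inverse needs Python 3.8 or later.

```python
from itertools import product

INF = None  # the point at infinity


def ec_add(P, Q, A, p):
    """Add two points of y^2 = x^3 + Ax + B over F_p (p an odd prime)."""
    if P is INF:
        return Q
    if Q is INF:
        return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return INF                                   # P = -Q (includes 2-torsion)
    if P == Q:
        slope = (3 * x1 * x1 + A) * pow(2 * y1, -1, p) % p   # tangent line
    else:
        slope = (y2 - y1) * pow(x2 - x1, -1, p) % p          # chord
    x3 = (slope * slope - x1 - x2) % p
    y3 = (slope * (x1 - x3) - y1) % p
    return (x3, y3)


def curve_points(A, B, p):
    """All points of E(F_p), found by brute force."""
    pts = [INF]
    for x in range(p):
        for y in range(p):
            if (y * y - (x**3 + A * x + B)) % p == 0:
                pts.append((x, y))
    return pts


def is_associative(A, B, p):
    """Check (P + Q) + R == P + (Q + R) for every triple of points."""
    pts = curve_points(A, B, p)
    return all(
        ec_add(ec_add(P, Q, A, p), R, A, p) == ec_add(P, ec_add(Q, R, A, p), A, p)
        for P, Q, R in product(pts, repeat=3)
    )


print(is_associative(1, 1, 5))   # y^2 = x^3 + x + 1 over F_5
```

Such a check is of course no substitute for the proof, but it exercises all of the special cases (tangent lines, vertical lines, coincident li and mj ) treated above.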
A.6 The proofs omitted from Chapter 3
In Chapter 3 the proofs of Lemmas 3.14 and 3.15 were omitted and said to
be lengthy but simple exercises in proof by mathematical induction (PMI).
We give the proofs of these lemmas here along with Lemma A.14 which was
used in Section 3.4.
Lemma 3.14 φn ∈ Z[x, y^2 , A, B] for all n. If n is odd then ωn ∈ yZ[x, y^2 , A, B]
while if n is even then ωn ∈ Z[x, y^2 , A, B].
Proof If n is odd then ψ_{n+1} and ψ_{n−1} are in yZ[x, y^2 , A, B] so their product
is in Z[x, y^2 , A, B] and so is xψn^2 . If n is even then ψn is in yZ[x, y^2 , A, B]
so ψn^2 is in Z[x, y^2 , A, B] and so are ψ_{n+1} and ψ_{n−1} . So either way all the
components of φn are in Z[x, y^2 , A, B] so φn is as well.
Now consider ωn . If n is odd then ψ_{n+2} and ψ_{n−2} are in Z[x, y^2 , A, B],
while ψ_{n+1} and ψ_{n−1} are in 2yZ[x, y^2 , A, B]. So
\[ \psi_{n+2}\psi_{n-1}^2 \in 2^2 y^2\, \mathbb{Z}[x, y^2, A, B], \qquad \psi_{n-2}\psi_{n+1}^2 \in 2^2 y^2\, \mathbb{Z}[x, y^2, A, B], \qquad \therefore\ \omega_n \in y\,\mathbb{Z}[x, y^2, A, B] \]
While if n is even then ψ_{n+2} and ψ_{n−2} are in 2yZ[x, y^2 , A, B], while ψ_{n+1} and
ψ_{n−1} are in Z[x, y^2 , A, B]. So
\[ \psi_{n+2}\psi_{n-1}^2 \in 2y\, \mathbb{Z}[x, y^2, A, B], \qquad \psi_{n-2}\psi_{n+1}^2 \in 2y\, \mathbb{Z}[x, y^2, A, B], \qquad \therefore\ \omega_n \in \tfrac{1}{2}\,\mathbb{Z}[x, y^2, A, B] \]
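This result will suffice for future applications, but to prove the lemma we
need to get rid of the 2 in the denominator when n is even. We will prove
with PMI that
\[ \psi_n \equiv (x^2 + A)^{(n^2-1)/4} \pmod 2 \quad (n \text{ odd}), \qquad \psi_n \equiv (yn)(x^2 + A)^{(n^2-4)/4} \pmod 2 \quad (n \text{ even}) \]
We can see the hypothesis is true for n ≤ 4:
\[
\begin{aligned}
\psi_0 &= 0 &&\Rightarrow\ (yn)(x^2 + A)^{(n^2-4)/4} = 0 \ \checkmark \\
\psi_1 &= 1 &&\Rightarrow\ (x^2 + A)^{(n^2-1)/4} = (x^2 + A)^{(1-1)/4} = 1 \ \checkmark \\
\psi_2 &= 2y &&\Rightarrow\ (yn)(x^2 + A)^{(n^2-4)/4} = 2y(x^2 + A)^{(4-4)/4} = 2y \ \checkmark \\
\psi_3 &= 3x^4 + 6Ax^2 + 12Bx - A^2 \equiv x^4 + A^2 \pmod 2 &&\Rightarrow\ (x^2 + A)^{(n^2-1)/4} = (x^2 + A)^2 \equiv x^4 + A^2 \ \checkmark \\
\psi_4 &= 4y(x^6 + \dots) \equiv 0 &&\Rightarrow\ (ny)(x^2 + A)^{(n^2-4)/4} = 4y(x^2 + A)^3 \equiv 0 \pmod 2 \ \checkmark
\end{aligned}
\]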
Assume for induction that the lemma holds for all n < 2m, where 2m > 4,
so m > 2. We must now prove that the lemma holds for n = 2m and
n = 2m + 1 to prove the lemma with PMI. Because 2m > m + 2 we can see
that all polynomials in the definition of ψ2m and ψ2m+1 satisfy the induction
assumptions.
First assume m is odd, so m ± 2 is odd also and m ± 1 is even. Then
\[
\begin{aligned}
\psi_{2m+1} &= \psi_{m+2}\psi_m^3 - \psi_{m-1}\psi_{m+1}^3 \\
&\equiv (x^2 + A)^{\frac14((m+2)^2 - 1 + 3m^2 - 3)} - (m-1)(m+1)^3 y^4 (x^2 + A)^{\frac14((m-1)^2 - 4 + 3(m+1)^2 - 12)} \pmod 2
\end{aligned}
\]
Because (m ± 1) is even the second term will be even and so ≡ 0 (mod 2):
\[ \psi_{2m+1} \equiv (x^2 + A)^{\frac14(4m^2 + 4m)} + 0 = (x^2 + A)^{\frac14((2m+1)^2 - 1)} \pmod 2 \]
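as required. Similarly
\[
\begin{aligned}
\psi_{2m} &= (2y)^{-1}(\psi_m)(\psi_{m+2}\psi_{m-1}^2 - \psi_{m-2}\psi_{m+1}^2) \\
&\equiv \frac{1}{2y}(x^2 + A)^{\frac{m^2-1}{4}} \left[ (x^2 + A)^{\frac14((m+2)^2-1)}\, y^2 (m-1)^2 (x^2 + A)^{\frac24((m-1)^2-4)} \right] \\
&\quad - \frac{1}{2y}(x^2 + A)^{\frac{m^2-1}{4}} \left[ (x^2 + A)^{\frac14((m-2)^2-1)}\, y^2 (m+1)^2 (x^2 + A)^{\frac24((m+1)^2-4)} \right] \\
&= (x^2 + A)^{\frac{m^2-1}{4}}\, y \left[ \frac{(m-1)^2}{2}(x^2 + A)^{\frac14(3m^2-3)} - \frac{(m+1)^2}{2}(x^2 + A)^{\frac14(3m^2-3)} \right] \\
&= y(x^2 + A)^{\frac14(4m^2-4)} \left[ \frac{(m-1)^2}{2} - \frac{(m+1)^2}{2} \right] \\
&= y(x^2 + A)^{\frac14((2m)^2-4)} \left[ \frac{-4m}{2} \right] \\
&\equiv [2m]\, y (x^2 + A)^{\frac14((2m)^2-4)} \equiv 0 \pmod 2
\end{aligned}
\]
as required.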
Now assume m is even, so m ± 2 is even also and m ± 1 is odd. Then
\[ \psi_{2m+1} \equiv (m+2)m^3 y^4 (x^2 + A)^{\frac14((m+2)^2 - 4 + 3m^2 - 12)} - (x^2 + A)^{\frac14((m-1)^2 - 1 + 3(m+1)^2 - 3)} \pmod 2 \]
Because (m + 2) and m are even the first term will be ≡ 0 (mod 2):
\[ \psi_{2m+1} \equiv 0 + (x^2 + A)^{\frac14(4m^2 + 4m)} = (x^2 + A)^{\frac14((2m+1)^2 - 1)} \pmod 2 \]
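as required. Similarly
\[
\begin{aligned}
\psi_{2m} &\equiv \frac{1}{2y}\left(my(x^2 + A)^{\frac{m^2-4}{4}}\right) \left[ (m+2)\,y\,(x^2 + A)^{\frac14((m+2)^2 - 4 + 2(m-1)^2 - 2)} \right] \\
&\quad - \frac{1}{2y}\left(my(x^2 + A)^{\frac{m^2-4}{4}}\right) \left[ (m-2)\,y\,(x^2 + A)^{\frac14((m-2)^2 - 4 + 2(m+1)^2 - 2)} \right] \\
&= \frac{my}{2}(x^2 + A)^{\frac{m^2-4}{4}} \left[ (m+2)(x^2 + A)^{\frac{3m^2}{4}} - (m-2)(x^2 + A)^{\frac{3m^2}{4}} \right] \\
&= \frac{my}{2}(x^2 + A)^{\frac{4m^2-4}{4}} \left[ m + 2 - m + 2 \right] \\
&= (2m)\,y\,(x^2 + A)^{\frac14((2m)^2-4)}
\end{aligned}
\]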
as required. So by PMI we conclude that
\[ \psi_n \equiv (x^2 + A)^{(n^2-1)/4} \pmod 2 \quad (n \text{ odd}), \qquad \psi_n \equiv (yn)(x^2 + A)^{(n^2-4)/4} \pmod 2 \quad (n \text{ even}) \]
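Now if n is even then (n ± 2) is even and (n ± 1) is odd so
\[
\begin{aligned}
\omega_n &= (4y)^{-1}(\psi_{n+2}\psi_{n-1}^2 - \psi_{n-2}\psi_{n+1}^2) \\
&\equiv \frac{1}{4y}\left[ (n+2)\,y\,(x^2 + A)^{\frac14((n+2)^2 - 4 + 2(n-1)^2 - 2)} \right] - \frac{1}{4y}\left[ (n-2)\,y\,(x^2 + A)^{\frac14((n-2)^2 - 4 + 2(n+1)^2 - 2)} \right] \pmod 2 \\
&= \frac14 \left[ (n+2)(x^2 + A)^{\frac{3n^2}{4}} - (n-2)(x^2 + A)^{\frac{3n^2}{4}} \right] \\
&= \frac14 (x^2 + A)^{\frac{3n^2}{4}} \left[ n + 2 - n + 2 \right] \\
&= (x^2 + A)^{\frac{3n^2}{4}}
\end{aligned}
\]
So now we have ωn ∈ Z[x, y^2 , A, B] if n is even, completing the proof.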
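Lemma 3.15 When considering points on the elliptic curve y^2 = x^3 + Ax + B:
(i) ψn^2 (x) = n^2 x^(n^2 −1) + lower degree terms
(ii) φn (x) = x^(n^2) + lower degree terms
Proof We will first show by induction that
\[ \psi_n = y\left(n x^{(n^2-4)/2} + \dots\right) \quad (n \text{ even}), \qquad \psi_n = n x^{(n^2-1)/2} + \dots \quad (n \text{ odd}) \]
where (+...) represents lower order terms. The hypothesis is true for n ≤ 4:
\[
\begin{aligned}
\psi_0 &= 0 &&\Rightarrow\ y(n x^{(n^2-4)/2} + \dots) = 0 \ \checkmark \\
\psi_1 &= 1 &&\Rightarrow\ n x^{(n^2-1)/2} + \dots = x^0 = 1 \ \checkmark \\
\psi_2 &= 2y &&\Rightarrow\ y(n x^{(n^2-4)/2} + \dots) = 2y x^0 + \dots = 2y \ \checkmark \\
\psi_3 &= 3x^4 + \dots &&\Rightarrow\ n x^{(n^2-1)/2} + \dots = 3x^{(9-1)/2} + \dots = 3x^4 + \dots \ \checkmark \\
\psi_4 &= 4y(x^6 + \dots) &&\Rightarrow\ y(n x^{(n^2-4)/2} + \dots) = 4y x^{(16-4)/2} + \dots = 4y x^6 + \dots \ \checkmark
\end{aligned}
\]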
Assume for induction that the lemma holds for all n < 2m, where 2m > 4,
so m > 2. We must now prove that the lemma holds for n = 2m and
n = 2m + 1 to prove the lemma with PMI. Because 2m > m + 2 we can see
that all polynomials in the definition of ψ2m and ψ2m+1 satisfy the induction
assumptions.
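First assume m is odd, so m ± 2 is odd also and m ± 1 is even. Then
\[
\begin{aligned}
\psi_{2m+1} &= \psi_{m+2}\psi_m^3 - \psi_{m-1}\psi_{m+1}^3 \\
&= \left[ (m+2)m^3 x^{[(m+2)^2 - 1 + 3m^2 - 3]/2} + \dots \right] - y^4 \left[ (m-1)(m+1)^3 x^{[(m-1)^2 - 4 + 3(m+1)^2 - 12]/2} + \dots \right] \\
&= \left[ (m^4 + 2m^3) x^{2m^2 + 2m} + \dots \right] - (x^6 + \dots)\left[ (m^4 + 2m^3 - 2m - 1) x^{2m^2 + 2m - 6} + \dots \right] \\
&= \left[ (m^4 + 2m^3) x^{2m^2 + 2m} + \dots \right] - \left[ (m^4 + 2m^3 - 2m - 1) x^{2m^2 + 2m} + \dots \right] \\
&= (2m+1)\, x^{[(2m+1)^2 - 1]/2} + \dots
\end{aligned}
\]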
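as required. Similarly
\[
\begin{aligned}
\psi_{2m} &= (2y)^{-1}(\psi_m)(\psi_{m+2}\psi_{m-1}^2 - \psi_{m-2}\psi_{m+1}^2) \\
&= \frac{1}{2y}\left(m x^{\frac{m^2-1}{2}} + \dots\right) \times y^2 \left[ (m+2)(m-1)^2 x^{[(m+2)^2 - 1 + 2(m-1)^2 - 8]/2} + \dots \right] \\
&\quad - \frac{1}{2y}\left(m x^{\frac{m^2-1}{2}} + \dots\right) \times y^2 \left[ (m-2)(m+1)^2 x^{[(m-2)^2 - 1 + 2(m+1)^2 - 8]/2} + \dots \right] \\
&= \frac{y}{2}\left(m x^{\frac{m^2-1}{2}}\right) \left[ \left((m+2)(m-1)^2 x^{[3m^2-3]/2} + \dots\right) - \left((m-2)(m+1)^2 x^{[3m^2-3]/2} + \dots\right) \right] \\
&= \frac{my}{2}\, x^{[4m^2-4]/2} \left[ (m+2)(m-1)^2 - (m-2)(m+1)^2 \right] + \dots \\
&= \frac{my}{2}\, x^{[4m^2-4]/2} \left[ (m^3 - 3m + 2) - (m^3 - 3m - 2) \right] + \dots = (2m)\, y\, x^{[(2m)^2-4]/2} + \dots
\end{aligned}
\]
as required.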
Now assume m is even, so m ± 2 is even also and m ± 1 is odd. Then
\[
\begin{aligned}
\psi_{2m+1} &= y^4 \left[ (m+2)m^3 x^{[(m+2)^2 - 4 + 3m^2 - 12]/2} + \dots \right] - \left[ (m-1)(m+1)^3 x^{[(m-1)^2 - 1 + 3(m+1)^2 - 3]/2} + \dots \right] \\
&= (x^6 + \dots)\left[ (m+2)m^3 x^{[4m^2 + 4m - 12]/2} + \dots \right] - \left[ (m-1)(m+1)^3 x^{[4m^2 + 4m]/2} + \dots \right] \\
&= \left[ (m^4 + 2m^3) - (m^4 + 2m^3 - 2m - 1) \right] x^{[4m^2 + 4m]/2} + \dots \\
&= (2m+1)\, x^{[(2m+1)^2 - 1]/2} + \dots
\end{aligned}
\]
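as required. Similarly
\[
\begin{aligned}
\psi_{2m} &= \frac{1}{2y}\left(y m x^{[m^2-4]/2}\right) y \left[ (m+2)(m-1)^2 x^{[(m+2)^2 - 4 + 2(m-1)^2 - 2]/2} + \dots \right] \\
&\quad - \frac{1}{2y}\left(y m x^{[m^2-4]/2}\right) y \left[ (m-2)(m+1)^2 x^{[(m-2)^2 - 4 + 2(m+1)^2 - 2]/2} + \dots \right] \\
&= \frac{my}{2}\, x^{[m^2-4]/2} \left[ \left((m+2)(m-1)^2 x^{3m^2/2} + \dots\right) - \left((m-2)(m+1)^2 x^{3m^2/2} + \dots\right) \right] \\
&= \frac{my}{2} \left[ (m+2)(m-1)^2 - (m-2)(m+1)^2 \right] x^{[4m^2-4]/2} + \dots \\
&= (2m)\, y\, x^{[(2m)^2-4]/2} + \dots
\end{aligned}
\]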
as required. So, by PMI we can conclude that
\[ \psi_n = y\left(n x^{(n^2-4)/2} + \dots\right) \quad (n \text{ even}), \qquad \psi_n = n x^{(n^2-1)/2} + \dots \quad (n \text{ odd}) \]
We can now use this to prove the lemma. Consider the case when n is odd:
\[ \psi_n^2 = \left(n x^{(n^2-1)/2} + \dots\right) \times \left(n x^{(n^2-1)/2} + \dots\right) = n^2 x^{n^2-1} + \dots \]
as required. Next consider the case when n is even:
\[
\begin{aligned}
\psi_n^2 &= y\left(n x^{(n^2-4)/2} + \dots\right) \times y\left(n x^{(n^2-4)/2} + \dots\right) \\
&= y^2 n^2 \left(x^{n^2-4} + \dots\right) = (x^3 + Ax + B)\, n^2 \left(x^{n^2-4} + \dots\right) \\
&= n^2 x^{n^2-1} + \dots
\end{aligned}
\]
as required, proving part (i) of the lemma.
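Now for part (ii). First let n be odd, so (n ± 1) is even:
\[
\begin{aligned}
\phi_n &= x\psi_n^2 - \psi_{n+1}\psi_{n-1} \\
&= x\left(n^2 x^{n^2-1} + \dots\right) - y^2 \left( (n+1)(n-1)\, x^{[(n+1)^2 - 4 + (n-1)^2 - 4]/2} + \dots \right) \\
&= \left(n^2 x^{n^2} + \dots\right) - (x^3 + \dots)\left( (n^2 - 1)\, x^{n^2-3} + \dots \right) \\
&= \left(n^2 x^{n^2} + \dots\right) - \left( (n^2 - 1)\, x^{n^2} + \dots \right) = x^{n^2} + \dots
\end{aligned}
\]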
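as required. Finally consider the case when n is even, so (n ± 1) is odd:
\[
\begin{aligned}
\phi_n &= x\left(n^2 x^{n^2-1} + \dots\right) - \left( (n+1)(n-1)\, x^{[(n+1)^2 - 1 + (n-1)^2 - 1]/2} + \dots \right) \\
&= \left(n^2 x^{n^2} + \dots\right) - \left( (n^2 - 1)\, x^{n^2} + \dots \right) \\
&= x^{n^2} + \dots
\end{aligned}
\]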
as required. This completes the proof of part (ii) and the lemma.
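The leading terms claimed for ψn can be checked numerically for a particular curve. The sketch below is our own illustration with hypothetical helper names. It stores g_n = ψn for odd n and g_n = ψn /y for even n as integer coefficient tuples in x (lowest degree first), replaces y^2 by x^3 + Ax + B, and uses the recurrences for ψ_{2m+1} and ψ_{2m} quoted in the proof together with the explicit ψ3 and ψ4 .

```python
from functools import lru_cache


def trim(p):
    """Drop zero coefficients from the top of a coefficient tuple."""
    while len(p) > 1 and p[-1] == 0:
        p = p[:-1]
    return p


def sub(p, q):
    """Return p - q for coefficient tuples (lowest degree first)."""
    n = max(len(p), len(q))
    p = tuple(p) + (0,) * (n - len(p))
    q = tuple(q) + (0,) * (n - len(q))
    return trim(tuple(a - b for a, b in zip(p, q)))


def mul(*polys):
    """Product of any number of coefficient tuples."""
    out = (1,)
    for q in polys:
        res = [0] * (len(out) + len(q) - 1)
        for i, a in enumerate(out):
            for j, b in enumerate(q):
                res[i + j] += a * b
        out = trim(tuple(res))
    return out


def division_polynomials(A, B):
    """Return g with g(n) = psi_n (n odd) or psi_n / y (n even) for y^2 = x^3 + Ax + B."""
    f = (B, A, 0, 1)
    f2 = mul(f, f)  # y^4 written as a polynomial in x

    @lru_cache(maxsize=None)
    def g(n):
        if n == 0:
            return (0,)
        if n == 1:
            return (1,)
        if n == 2:
            return (2,)
        if n == 3:
            return (-A * A, 12 * B, 6 * A, 0, 3)
        if n == 4:
            inner = (-8 * B * B - A**3, -4 * A * B, -5 * A * A, 20 * B, 5 * A, 0, 1)
            return tuple(4 * c for c in inner)
        m, odd = divmod(n, 2)
        if odd:  # psi_{2m+1} = psi_{m+2} psi_m^3 - psi_{m-1} psi_{m+1}^3
            first = mul(g(m + 2), g(m), g(m), g(m))
            second = mul(g(m - 1), g(m + 1), g(m + 1), g(m + 1))
            if m % 2:
                second = mul(f2, second)   # m odd: y^4 sits on the second term
            else:
                first = mul(f2, first)     # m even: y^4 sits on the first term
            return sub(first, second)
        # psi_{2m} = (2y)^{-1} psi_m (psi_{m+2} psi_{m-1}^2 - psi_{m-2} psi_{m+1}^2)
        inner = sub(mul(g(m + 2), g(m - 1), g(m - 1)),
                    mul(g(m - 2), g(m + 1), g(m + 1)))
        return tuple(c // 2 for c in mul(g(m), inner))

    return g


g = division_polynomials(1, 1)
for n in range(1, 9):
    coeffs = g(n)
    print(n, len(coeffs) - 1, coeffs[-1])  # n, degree in x, leading coefficient
```

For each n the printed degree should be (n^2 − 1)/2 when n is odd and (n^2 − 4)/2 when n is even, with leading coefficient n, which is exactly the induction hypothesis used above.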
We now state and prove Lemma A.14 which was used in the corollaries
of the Weil pairing given in Section 3.4. For this lemma we suppose that
E is an elliptic curve over a field K and n is an integer not divisible by the
characteristic of K. Let
\[ \mu_n = \{ x \in \overline{K} \mid x^n = 1 \} \]
be the group of nth roots of unity in K̄. Since the characteristic of K does
not divide n, the equation x^n = 1 has no multiple roots, and hence has n roots
in K̄. Therefore µn is a cyclic group of order n. Any generator, ζ, of µn is
called a primitive nth root of unity.
Lemma A.14. ζ being a primitive nth root of unity is equivalent to saying
that ζ^k = 1 if and only if n divides k.
Proof To prove the lemma we need to prove the following two statements:
(i) Let ζ be a primitive nth root of unity. Then ζ^k = 1 if and only if n|k.
(ii) Suppose ζ^k = 1 if and only if n|k. Then ζ is a primitive nth root of unity.
First consider statement (i). ζ is a primitive nth root of unity, so
\[ \mu_n = \{ \zeta^i,\ i = 0, \dots, (n-1) \} \]
a. If n|k then k = nj for some integer j and
\[ \zeta^k = \zeta^{nj} = (\zeta^n)^j = 1^j = 1 \]
as required.
b. If ζ^k = 1, write k = qn + r for some r such that 0 ≤ r < n. Then
\[ \zeta^k = \zeta^{qn}\zeta^r = \zeta^r \]
so ζ^r = 1. But 0 is the only r in the range 0 ≤ r < n such that ζ^r = 1, so
r = 0, meaning k = qn. So n|k as required.
Next consider statement (ii). Suppose ζ^k = 1 ⇐⇒ n|k; then ζ^n = 1 = ζ^0 .
Suppose for a contradiction that ζ^i = ζ^j for some 0 ≤ i, j < n, i ≠ j. Then
ζ^(i−j) = 1 so n|(i − j). This would imply that i ≡ j (mod n), which is a
contradiction. Therefore ζ^i for i = 0, ..., (n − 1) are all distinct elements.
Further
\[ (\zeta^i)^n = (\zeta^n)^i = (\zeta^0)^i = 1^i = 1. \]
So {ζ^i , i = 0, ..., (n − 1)} = µn as required.
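The equivalence in Lemma A.14 can be observed concretely inside the multiplicative group of a prime field. The following sketch is an illustration under our own assumptions: we work in Fp× rather than in a general K̄, and the function names are hypothetical. It compares the "generator of µn" definition with the divisibility criterion, testing the latter for k = 1, ..., p − 1, which suffices here because every element of Fp× has order dividing p − 1.

```python
def generates_mu_n(zeta, n, p):
    """True when zeta^n = 1 and 1, zeta, ..., zeta^(n-1) are n distinct elements of F_p."""
    powers = {pow(zeta, i, p) for i in range(n)}
    return pow(zeta, n, p) == 1 and len(powers) == n


def satisfies_divisibility_criterion(zeta, n, p):
    """True when zeta^k = 1 exactly for the k in 1..p-1 that are divisible by n."""
    return all((pow(zeta, k, p) == 1) == (k % n == 0) for k in range(1, p))


p = 13
for n in (1, 2, 3, 4, 6, 12):
    primitive = [z for z in range(1, p) if generates_mu_n(z, n, p)]
    print(n, primitive)
```

For p = 13 the two predicates agree on every pair (ζ, n), as the lemma predicts.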
A.7 Methods to determine the order of E(Fq ) exactly
Hasse’s theorem gave bounds for the order of the group of points on an elliptic curve over
a finite field. In this section we discuss methods for determining the group
order exactly.
A.7.1 Subfield curves
Suppose we have an elliptic curve defined over a small finite field Fq , so that
we can determine the order of E(Fq ) by listing the points, or some other
elementary procedure. We can then determine the order of E(Fqn ) for all n.
Theorem A.15. Let #E(Fq ) = q + 1 − a and write X^2 − aX + q =
(X − α)(X − β). Then for all n ≥ 1
\[ \#E(\mathbb{F}_{q^n}) = q^n + 1 - (\alpha^n + \beta^n) \tag{A.10} \]
Proof We first need to show that αn + β n is an integer, which is implied by
the following.
Lemma A.16. Let s_n = α^n + β^n . Then s_0 = 2, s_1 = a and s_{n+1} = a s_n − q s_{n−1}
for all n ≥ 1.
Proof Clearly s_0 = α^0 + β^0 = 1 + 1 = 2 and s_1 = α + β. By considering
Equation (A.10) with n = 1 and Equation (4.1) we see that α + β = a as required.
Let g(X) = X^2 − aX + q = (X − α)(X − β) so g(α) = g(β) = 0. Therefore
\[ \alpha^2 - a\alpha + q = 0, \qquad \beta^2 - a\beta + q = 0 \]
Multiplying by α^(n−1) and β^(n−1) respectively gives
\[ \alpha^{n+1} - a\alpha^n + q\alpha^{n-1} = 0, \qquad \beta^{n+1} - a\beta^n + q\beta^{n-1} = 0 \]
\[ \alpha^{n+1} = a\alpha^n - q\alpha^{n-1}, \qquad \beta^{n+1} = a\beta^n - q\beta^{n-1} \]
Then
\[ s_{n+1} = \alpha^{n+1} + \beta^{n+1} = a\alpha^n - q\alpha^{n-1} + a\beta^n - q\beta^{n-1} = a(\alpha^n + \beta^n) - q(\alpha^{n-1} + \beta^{n-1}) = a s_n - q s_{n-1} \]
So α^n + β^n is an integer for all n ≥ 0. Let
\[ f(X) = (X^n - \alpha^n)(X^n - \beta^n) = X^{2n} - (\alpha^n + \beta^n)X^n + q^n \]
Then X^2 − aX + q = (X − α)(X − β) divides f (X). It follows from the
standard algorithm for dividing polynomials that the quotient, Q(X), is a
polynomial with integer coefficients. Therefore, letting X = φq gives
\[ (\phi_q^n)^2 - (\alpha^n + \beta^n)\phi_q^n + q^n = f(\phi_q) = Q(\phi_q)(\phi_q^2 - a\phi_q + q) = 0 \]
with the final equality using Theorem 4.6. Note that φ_q^n = φ_{q^n} so
\[ (\phi_{q^n})^2 - (\alpha^n + \beta^n)\phi_{q^n} + q^n = 0 \]
We know from Theorem 4.6 that there is only one k such that (φ_{q^n})^2 − kφ_{q^n} +
q^n = 0, and that it is k = q^n + 1 − #E(F_{q^n}). Therefore
\[ \alpha^n + \beta^n = q^n + 1 - \#E(\mathbb{F}_{q^n}) \]
which can be rearranged to complete the proof of Theorem A.15.
Example A.1. We showed in Example 4.2 that the curve, E, given by
y^2 + xy = x^3 + 1 over F2 satisfies #E(F2 ) = 4. Therefore a = q + 1 − #E(Fq ) =
2 + 1 − 4 = −1 and we obtain the polynomial
\[ X^2 + X + 2 = \left(X - \frac{-1 + \sqrt{-7}}{2}\right)\left(X - \frac{-1 - \sqrt{-7}}{2}\right) \]
Theorem A.15 tells us that
\[ \#E(\mathbb{F}_4) = 4 + 1 - \left(\frac{-1 + \sqrt{-7}}{2}\right)^2 - \left(\frac{-1 - \sqrt{-7}}{2}\right)^2 \]
We could compute the last expression directly, but it is better to use the recurrence
relation of Lemma A.16:
\[ \alpha^2 + \beta^2 = s_2 = a s_1 - 2 s_0 = (-1)(-1) - 2(2) = -3 \]
So #E(F4 ) = 4 + 1 − (−3) = 8 (as we calculated when listing points).
We could perform a similar calculation to count the points over much larger fields. A Matlab m-file (RR44.m) was created to encode the recurrence relation, and can be found in Appendix C.7. This takes as its inputs $n$, $q$ and $\#E(\mathbb{F}_q)$ and outputs $s_n$ as defined by Lemma A.16. It was used to calculate
\[ s_{101} = \left(\frac{-1 + \sqrt{-7}}{2}\right)^{101} + \left(\frac{-1 - \sqrt{-7}}{2}\right)^{101} = 2969292210605269. \]
We can then show that
\[ \#E(\mathbb{F}_{2^{101}}) = 2^{101} + 1 - 2969292210605269 = 2.535301200456456 \times 10^{30} \]
to 16 significant figures, using Matlab.
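The recurrence of Lemma A.16 is easy to encode. The following Python sketch is not the RR44.m file from Appendix C.7; it is an independent illustration taking the same inputs ($n$, $q$ and $\#E(\mathbb{F}_q)$), and the function names are assumptions chosen here. Python integers have arbitrary precision, so no rounding is involved.

```python
def trace_sequence(n, q, order_fq):
    """Return s_n = alpha^n + beta^n via s_{k+1} = a*s_k - q*s_{k-1}."""
    a = q + 1 - order_fq          # a = q + 1 - #E(F_q)
    s_prev, s_curr = 2, a         # s_0 = 2, s_1 = a
    if n == 0:
        return s_prev
    for _ in range(n - 1):
        s_prev, s_curr = s_curr, a * s_curr - q * s_prev
    return s_curr


def order_over_extension(n, q, order_fq):
    """Return #E(F_{q^n}) = q^n + 1 - s_n (Theorem A.15)."""
    return q**n + 1 - trace_sequence(n, q, order_fq)


# Example A.1: y^2 + xy = x^3 + 1 over F_2 has 4 points.
print(trace_sequence(2, 2, 4))        # -3
print(order_over_extension(2, 2, 4))  # 8
```

Calling `order_over_extension(101, 2, 4)` gives the exact integer value of $\#E(\mathbb{F}_{2^{101}})$ rather than a 16-figure approximation.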
A.7.2 Legendre symbols
To make a list of all the points on y 2 = x3 + Ax + B over a finite field,
we listed every possible value of x, and then found the square roots, y, of
(x3 + Ax + B) if they existed. This procedure will be the basis for a simple
point counting algorithm.
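Recall our generalisation of the Legendre symbol to a finite field $\mathbb{F}_q$, $q$ odd:
\[ \left(\frac{x}{\mathbb{F}_q}\right) = \begin{cases} +1 & \text{if } t^2 = x \text{ has a solution } t \in \mathbb{F}_q^\times \\ -1 & \text{if } t^2 = x \text{ has no solution } t \in \mathbb{F}_q \\ 0 & \text{if } x = 0 \end{cases} \]
Theorem A.17. Let $E$ be an elliptic curve, $y^2 = x^3 + Ax + B$ over $\mathbb{F}_q$. Then
\[ \#E(\mathbb{F}_q) = q + 1 + \sum_{x \in \mathbb{F}_q} \left(\frac{x^3 + Ax + B}{\mathbb{F}_q}\right). \]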
Proof Consider a point $x_0 \in \mathbb{F}_q$. There are two points on $E$ with $x$-coordinate $x_0$ if $x_0^3 + Ax_0 + B$ is a non-zero square in $\mathbb{F}_q$. There is one such point if it is zero, and no such points if it is not a square. It follows that the number of points on $E$ with $x$-coordinate $x_0$ is
\[ 1 + \left(\frac{x_0^3 + Ax_0 + B}{\mathbb{F}_q}\right). \]
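So to find the order of $E(\mathbb{F}_q)$ we must sum over all $x_0 \in \mathbb{F}_q$ and add 1 for the point at infinity:
\[ \#E(\mathbb{F}_q) = 1 + \sum_{x \in \mathbb{F}_q} \left(1 + \left(\frac{x^3 + Ax + B}{\mathbb{F}_q}\right)\right) = 1 + q + \sum_{x \in \mathbb{F}_q} \left(\frac{x^3 + Ax + B}{\mathbb{F}_q}\right). \]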
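Corollary A.18. Let $x^3 + Ax + B$ be a polynomial with $A, B \in \mathbb{F}_q$, $q$ odd. Then
\[ \left| \sum_{x \in \mathbb{F}_q} \left(\frac{x^3 + Ax + B}{\mathbb{F}_q}\right) \right| \le 2\sqrt{q}. \]
Proof Suppose $x^3 + Ax + B$ has no repeated roots, so $y^2 = x^3 + Ax + B$ is an elliptic curve. By Theorem A.17
\[ \left| \sum_{x \in \mathbb{F}_q} \left(\frac{x^3 + Ax + B}{\mathbb{F}_q}\right) \right| = \left| -\sum_{x \in \mathbb{F}_q} \left(\frac{x^3 + Ax + B}{\mathbb{F}_q}\right) \right| = |q + 1 - \#E(\mathbb{F}_q)| \le 2\sqrt{q} \]
as required (the inequality follows from Hasse's Theorem).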
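We now consider the case when $x^3 + Ax + B$ has repeated roots. First recall that for a finite field $\mathbb{F}_q$ with $q$ odd, $\mathbb{F}_q^\times$ is cyclic of even order $q - 1$. This means that half the elements of $\mathbb{F}_q^\times$ are squares and half are non-squares. Therefore
\[ \sum_{x \in \mathbb{F}_q} \left(\frac{x}{\mathbb{F}_q}\right) = 0 + 1 - 1 + 1 - 1 + \dots = 0. \]
Now consider $u \in \mathbb{F}_q$. Since the set $\{x + u : x \in \mathbb{F}_q\} = \mathbb{F}_q$ we have
\[ \sum_{x \in \mathbb{F}_q} \left(\frac{x + u}{\mathbb{F}_q}\right) = 0. \tag{A.11} \]
Now let the cubic have repeated root $r$, so
\[ \sum_{x \in \mathbb{F}_q} \left(\frac{x^3 + Ax + B}{\mathbb{F}_q}\right) = \sum_{x \in \mathbb{F}_q} \left(\frac{(x - r)^2 (x - s)}{\mathbb{F}_q}\right). \]
Now if $x \ne r$ then $(x - r)^2 (x - s)$ is a square exactly when $(x - s)$ is, so the two Legendre symbols agree for every $x \ne r$. Writing $f(x) = x^3 + Ax + B$, so that $f(r) = 0$, and using (A.11) with $u = -s$,
\[ \sum_{x \in \mathbb{F}_q} \left(\frac{x^3 + Ax + B}{\mathbb{F}_q}\right) = \left[ \sum_{x \in \mathbb{F}_q} \left(\frac{x - s}{\mathbb{F}_q}\right) - \left(\frac{r - s}{\mathbb{F}_q}\right) \right] + \left(\frac{f(r)}{\mathbb{F}_q}\right) = 0 - \left(\frac{r - s}{\mathbb{F}_q}\right) + \left(\frac{0}{\mathbb{F}_q}\right) = -\left(\frac{r - s}{\mathbb{F}_q}\right). \]
Since the absolute value of this is $\le 1$ we can conclude that it is $\le 2\sqrt{q}$, completing the proof of the corollary.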
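Example A.2. Let $E$ be the curve $y^2 = x^3 + x + 1$ over $\mathbb{F}_5$ (as in Example 4.1). We have
\[ 1^2 = 1, \qquad 2^2 = 4, \qquad 3^2 = 9 \equiv 4 \pmod 5, \qquad 4^2 = 16 \equiv 1 \pmod 5. \]
So the non-zero squares modulo 5 are 1 and 4. Using Theorem A.17
\begin{align*}
\#E(\mathbb{F}_q) &= q + 1 + \sum_{x \in \mathbb{F}_q} \left(\frac{x^3 + Ax + B}{\mathbb{F}_q}\right) = 5 + 1 + \sum_{x=0}^{4} \left(\frac{x^3 + x + 1}{\mathbb{F}_5}\right) \\
&= 6 + \left(\frac{1}{\mathbb{F}_5}\right) + \left(\frac{3}{\mathbb{F}_5}\right) + \left(\frac{11}{\mathbb{F}_5}\right) + \left(\frac{31}{\mathbb{F}_5}\right) + \left(\frac{69}{\mathbb{F}_5}\right) \\
&= 6 + \left(\frac{1}{\mathbb{F}_5}\right) + \left(\frac{3}{\mathbb{F}_5}\right) + \left(\frac{1}{\mathbb{F}_5}\right) + \left(\frac{1}{\mathbb{F}_5}\right) + \left(\frac{4}{\mathbb{F}_5}\right) \\
&= 6 + 1 - 1 + 1 + 1 + 1 = 9
\end{align*}
which is what we calculated in Example 4.1. Note also that this verifies Corollary A.18:
\[ \left| \sum_{x \in \mathbb{F}_q} \left(\frac{x^3 + Ax + B}{\mathbb{F}_q}\right) \right| = 3 \le 2\sqrt{5} = 2\sqrt{q}. \]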
Lemma A.19. Let $x \in \mathbb{F}_q$ with $q$ odd. Then as elements of $\mathbb{F}_q$
\[ \left(\frac{x}{\mathbb{F}_q}\right) = x^{(q-1)/2}. \]
Proof The Lemma is trivially true in the case when $x = 0$:
\[ \left(\frac{x}{\mathbb{F}_q}\right) = \left(\frac{0}{\mathbb{F}_q}\right) = 0 = 0^{(q-1)/2} = x^{(q-1)/2}. \]
Now if $t^2 = x$ for some $t \ne 0$ then
\[ x^{(q-1)/2} = t^{q-1} = \frac{t^q}{t} = \frac{t}{t} = 1 = \left(\frac{x}{\mathbb{F}_q}\right) \]
so the lemma is true here also. Finally suppose $x$ does not have a square root. Note that
\[ \left(x^{(q-1)/2} - 1\right)\left(x^{(q-1)/2} + 1\right) = x^{q-1} - 1 = 1 - 1 = 0. \]
So if we show that, given $x$ is not a perfect square, $x^{(q-1)/2} \ne 1$, then we must have $x^{(q-1)/2} = -1$ by the equation above.
Let $G = \mathbb{F}_q^\times$, the cyclic group of order $q - 1$. Let $H$ be the subgroup of $G$ which contains those elements of $G$ whose order divides $(q - 1)/2$. Since $G$ is cyclic we know that $H$ exists and has $(q - 1)/2$ elements. Let $H'$ be the subset of $G$ consisting of the perfect squares. Since $G$ is cyclic we have, for a primitive root $g$,
\[ G = \{g^0, g^1, g^2, \dots, g^{q-2}\}. \]
The squares are exactly the even powers of $g$, so half the elements are squares and half are non-squares. Therefore $H'$ also has $(q - 1)/2$ elements. Every square $t^2$ satisfies $(t^2)^{(q-1)/2} = t^{q-1} = 1$, so $H' \subseteq H$ and hence $H = H'$. Therefore the elements in $\mathbb{F}_q^\times$ that are squares are exactly those whose order divides $(q - 1)/2$. Hence if $x$ is not a perfect square then $x^{(q-1)/2} \ne 1$, which implies $x^{(q-1)/2} = -1$, completing the proof.
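Lemma A.19 can be checked numerically. The sketch below is restricted to prime fields $\mathbb{F}_p$ with $p$ an odd prime (an assumption made here for simplicity; the lemma itself covers every odd $q$), and both function names are illustrative. It compares the definition of the symbol, found by searching for a square root, with the power $x^{(p-1)/2}$.

```python
def legendre_by_search(x, p):
    """Legendre symbol of x over F_p straight from the definition."""
    x %= p
    if x == 0:
        return 0
    return 1 if any(t * t % p == x for t in range(1, p)) else -1


def legendre_by_power(x, p):
    """Legendre symbol of x over F_p as x^((p-1)/2), as in Lemma A.19."""
    r = pow(x % p, (p - 1) // 2, p)   # r is 0, 1 or p - 1
    return -1 if r == p - 1 else r


print([legendre_by_power(x, 5) for x in range(5)])  # [0, 1, -1, -1, 1]
```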
When using Theorem A.17 it is possible to compute each individual generalised Legendre symbol quickly (using Lemma A.19, for example). However, it is more efficient to square all the elements of $\mathbb{F}_q^\times$ and store the list of squares for future use.
Consider the case of $\mathbb{F}_p$. Make a vector with $p$ entries, one for each element of $\mathbb{F}_p$, and initially set all entries to $-1$. Now, for each $j$ with $1 \le j \le (p-1)/2$, square $j$, reduce $j^2$ mod $p$ to get $k$, and change the $k$th entry in the vector to $+1$. Finally change the 0th entry to 0, which leaves the resulting vector as a list of the values of the Legendre symbol.
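The table-based procedure, combined with Theorem A.17, can be sketched in a few lines of Python. This is an illustration for prime fields $\mathbb{F}_p$ only, with $p$ an odd prime, and the function names are assumptions rather than anything from the project's own code.

```python
def legendre_table(p):
    """Table of Legendre symbols over F_p, built by squaring 1..(p-1)/2."""
    table = [-1] * p
    for j in range(1, (p - 1) // 2 + 1):
        table[j * j % p] = 1      # k = j^2 mod p is a non-zero square
    table[0] = 0
    return table


def count_points(A, B, p):
    """#E(F_p) for y^2 = x^3 + Ax + B using Theorem A.17."""
    table = legendre_table(p)
    return p + 1 + sum(table[(x**3 + A * x + B) % p] for x in range(p))


print(legendre_table(5))       # [0, 1, -1, -1, 1]
print(count_points(1, 1, 5))   # 9
```

The second printed value reproduces Example A.2. The table is built once with $(p-1)/2$ squarings and is then reused for every $x$.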
A.8 Supersingular curves
Recall that an elliptic curve $E$ in characteristic $p$ is defined to be supersingular if $E[p] = \{\infty\}$. This means there are no points of order $p$, even with coordinates in an algebraically closed field.
These curves are important as many calculations can be done more quickly on them than on an arbitrary elliptic curve. Unfortunately, however, discrete logarithms can be significantly easier to solve on these curves and the cryptographic algorithms defined on them are open to specific attacks.
So when using elliptic curves for cryptographic purposes it is useful to ensure the curve is not supersingular. The following result gives a way of determining this.
Proposition A.20. Let E be an elliptic curve over Fq , where q is a power
of a prime number p. Let a = q + 1 − #E(Fq ). Then E is supersingular if
and only if a ≡ 0 (mod p), which is if and only if #E(Fq ) ≡ 1 (mod p).
Proof Write $X^2 - aX + q = (X - \alpha)(X - \beta)$. Theorem A.15 implies
\[ \#E(\mathbb{F}_{q^n}) = q^n + 1 - (\alpha^n + \beta^n). \]
Lemma A.16 says that $s_n = \alpha^n + \beta^n$ satisfies the recurrence relation
\[ s_0 = 2, \qquad s_1 = a, \qquad s_{n+1} = a s_n - q s_{n-1}. \]
Suppose $a \equiv 0 \pmod p$. Then $s_1 = a \equiv 0$, $s_2 = a s_1 - q s_0 \equiv 0$, and so $s_n \equiv 0 \pmod p$ for all $n \ge 1$ by the recurrence relation. Therefore, writing $q^n = p^m$,
\[ \#E(\mathbb{F}_{q^n}) = q^n + 1 - (\alpha^n + \beta^n) = p^m + 1 - s_n \equiv 1 \pmod p. \]
This means that $\#E(\mathbb{F}_{q^n}) = 1 + pR$ for some integer $R$, so $p$ is not a divisor of $\#E(\mathbb{F}_{q^n})$. Therefore there are no points of order $p$ in $E(\mathbb{F}_{q^n})$ for any $n \ge 1$. Since $\overline{\mathbb{F}}_q = \bigcup_{n \ge 1} \mathbb{F}_{q^n}$ (Appendix B.5.1) there are no points of order $p$ in $E(\overline{\mathbb{F}}_q)$. Therefore $E$ is supersingular, proving the 'if' part of the proposition.
Now suppose $a \not\equiv 0 \pmod p$. Since $q \equiv 0 \pmod p$, the recurrence gives $s_{n+1} \equiv a s_n \pmod p$ for $n \ge 1$. Since $s_1 = a$ we have $s_n \equiv a^n \pmod p$ for all $n \ge 1$. Therefore
\[ \#E(\mathbb{F}_{q^n}) = q^n + 1 - s_n \equiv 1 - a^n \pmod p. \]
By Fermat's Little Theorem $a^{p-1} \equiv 1 \pmod p$. Therefore $E(\mathbb{F}_{q^{p-1}})$ has order divisible by $p$, and hence contains at least one point of order $p$ (Theorem B.4). This means that $E$ is not supersingular, as there is a point of order $p$ with coordinates in the algebraically closed field.
Finally note that
\[ \#E(\mathbb{F}_q) = q + 1 - a \equiv 1 - a \pmod p. \]
So $\#E(\mathbb{F}_q) \equiv 1 \pmod p$ if and only if $a \equiv 0 \pmod p$.
Corollary A.21. Suppose $p \ge 5$ is a prime. Then an elliptic curve $E$, defined over $\mathbb{F}_p$, is supersingular if and only if $a = 0$, which is the case if and only if $\#E(\mathbb{F}_p) = p + 1$.
Proof If $a = 0$ then $E$ is supersingular by Proposition A.20. Conversely suppose that $E$ is supersingular but $a \ne 0$. Since $a \equiv 0 \pmod p$ we must have $|a| \ge p$. By Hasse's Theorem $|a| \le 2\sqrt{p}$, so $p \le 2\sqrt{p}$. This means that $\sqrt{p} \le 2$, so $p \le 4$, contradicting $p \ge 5$.
The curve $y^2 + a_3 y = x^3 + a_4 x + a_6$ is supersingular in characteristic 2. Similarly in characteristic 3 the curve $y^2 = x^3 + a_2 x^2 + a_4 x + a_6$ is supersingular if and only if $a_2 = 0$. The following allows us to construct supersingular curves in other characteristics.
Proposition A.22. Suppose $q$ is odd and $q \equiv 2 \pmod 3$. Let $B \in \mathbb{F}_q^\times$. Then the elliptic curve $E$ given by $y^2 = x^3 + B$ is supersingular.
Proof Let $\varphi : \mathbb{F}_q^\times \to \mathbb{F}_q^\times$ be the homomorphism defined by $\varphi(x) = x^3$. $\mathbb{F}_q^\times$ has $q - 1$ elements and since $q - 1$ is not a multiple of 3 we can conclude that there are no elements of order 3 in $\mathbb{F}_q^\times$. Therefore the kernel of $\varphi$ (the set of elements that $\varphi$ maps to the identity) is trivial. Therefore $\varphi$ is injective and hence must be surjective, as it is a map from a finite group to itself. In particular this shows that every element in $\mathbb{F}_q$ has a unique cube root in $\mathbb{F}_q$.
For each $y \in \mathbb{F}_q$ there is exactly one $x \in \mathbb{F}_q$ such that $(x, y)$ lies on the curve, namely the unique cube root of $y^2 - B$. Since there are $q$ values of $y$ there are $q$ such points. Including the point $\infty$ gives
\[ \#E(\mathbb{F}_q) = q + 1 = p^n + 1 \equiv 1 \pmod p. \]
Therefore, by Proposition A.20, $E$ is supersingular.
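Proposition A.22 can be checked by brute force for small primes. The sketch below works over prime fields $\mathbb{F}_p$ only (an assumption made here; the proposition holds for any odd $q \equiv 2 \pmod 3$), and the function name is illustrative. It counts, for each $x$, how many $y$ satisfy $y^2 = x^3 + B$.

```python
def count_points_cubic(B, p):
    """#E(F_p) for y^2 = x^3 + B, counting the point at infinity."""
    roots = [0] * p                 # roots[v] = number of y with y^2 = v
    for y in range(p):
        roots[y * y % p] += 1
    return 1 + sum(roots[(x**3 + B) % p] for x in range(p))


for p in (5, 11, 17, 23):           # primes with p = 2 (mod 3)
    counts = {count_points_cubic(B, p) for B in range(1, p)}
    print(p, counts)                # each set contains only p + 1
```

For every non-zero $B$ the count is $p + 1$, so $a = 0$ and the curve is supersingular by Proposition A.20.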
Appendix B
Mathematical background material
In this chapter we summarise the background mathematics that is used
throughout the project. Some of the results are well-known and as such
are stated without proof or reference.
B.1 Algebraic curves
An algebraic curve is the set of zeros of a polynomial in two variables. An elliptic curve can be viewed as an algebraic curve in the variables $(x, y)$ by rewriting the Weierstrass equation as: find $(x, y)$ such that
\[ y^2 - x^3 - Ax - B = 0. \]
A defining feature of an algebraic curve is that a straight line not contained in the curve can only intersect it at a finite number of points. So the graph of $y = \sin(x)$ is not an algebraic curve, for example, as the straight line $y = 1/2$ intersects it at an infinite number of points.
The benefit of elliptic curves being algebraic curves is that we can use techniques other than calculus to study them. This section defines many of the terms and techniques used with algebraic curves and follows Chapter 1 of [4].
A domain (or integral domain) is a ring with at least two elements in which the cancellation law holds. A field is a domain in which every non-zero element is a unit, that is, has a multiplicative inverse (for the full definition see Appendix B.5). Throughout this project $\mathbb{Z}$ denotes the domain of integers, while $\mathbb{Q}$, $\mathbb{R}$ and $\mathbb{C}$ are the fields of rational, real and complex numbers respectively.
For any ring $R$, $R[x]$ is the ring of polynomials with coefficients in $R$. The degree of a polynomial $\sum a_i x^i$ is the largest integer $d$ such that $a_d \ne 0$. The polynomial is monic if $a_d = 1$. The ring of polynomials in $n$ variables over $R$ is written $R[X_1, \dots, X_n]$, although we often write $R[X, Y]$ and $R[X, Y, Z]$ when $n = 2, 3$. The monomials in $R[X_1, \dots, X_n]$ are the polynomials $X_1^{i_1} X_2^{i_2} \cdots X_n^{i_n}$ where the $i_j$ are non-negative integers. The degree of a monomial is $i_1 + i_2 + \dots + i_n$. Every $F \in R[X_1, \dots, X_n]$ has a unique expression $F = \sum a_i x^i$ where the $x^i$ are the monomials and $a_i \in R$. $F$ is homogeneous, or a form of degree $d$, if all coefficients $a_i$ are zero except possibly those belonging to monomials of degree $d$. Any polynomial $F$ has a unique expression $F = F_0 + F_1 + \dots + F_d$, where $F_i$ is a form of degree $i$. If $F_d \ne 0$ then $d$ is the degree of $F$, written $\deg(F)$. The terms $F_0, F_1, F_2, \dots$ are called the constant, linear, quadratic, ... terms of $F$. $F$ is constant if $F = F_0$.
Let $R$ be a ring, with $(R, +)$ its underlying abelian group. A subset $I$ of $R$ is called a right ideal if
• $(I, +)$ is a subgroup of $(R, +)$.
• $xr$ is in $I$ for all $x$ in $I$ and all $r$ in $R$.
The subset is called a left ideal if
• $(I, +)$ is a subgroup of $(R, +)$.
• $rx$ is in $I$ for all $x$ in $I$ and all $r$ in $R$.
An ideal $I$ in a ring $R$ is proper if $I \ne R$. A proper ideal is maximal if it is not contained in any larger proper ideal. A proper ideal $I$ is a prime ideal if whenever $ab \in I$ either $a \in I$ or $b \in I$.
A set $S$ of elements of a ring $R$ generates an ideal
\[ I = \left\{ \sum a_i s_i \;\middle|\; s_i \in S,\ a_i \in R \right\}. \]
The ideal is finitely generated if $S$ is a finite set $S = \{f_1, \dots, f_n\}$.
Let $I$ be an ideal in a ring $R$. The residue class ring of $R$ modulo $I$ is written $R/I$. It is the set of equivalence classes of elements in $R$, under the equivalence relation: $a \equiv b$ if $a - b \in I$. The equivalence class containing $a$ is called the $I$-residue of $a$, denoted $\bar{a}$.
$R/I$ forms a ring so that the function $\pi : R \to R/I$ taking each element to its $I$-residue is a ring homomorphism. If $\varphi : R \to S$ is a ring homomorphism to a ring $S$, and $\varphi(I) = 0$, then there is a unique ring homomorphism $\bar{\varphi} : R/I \to S$ such that $\varphi = \bar{\varphi}\pi$. A proper ideal $I \subset R$ is prime if and only if $R/I$ is a domain, and maximal if and only if $R/I$ is a field. Every maximal ideal is prime.
If $R$ is a ring, $a \in R$, $F \in R[X]$ and $a$ is a root of $F$, then $F = (X - a)G$ for some $G \in R[X]$. A field $k$ is algebraically closed if any non-constant $F \in k[X]$ has a root. It follows that $F = \mu \prod (X - \lambda_i)^{e_i}$, $\mu, \lambda_i \in k$, where the $\lambda_i$ are the distinct roots of $F$. $e_i$ is called the multiplicity of $\lambda_i$. A polynomial of degree $d$ has $d$ roots in $k$, counting multiplicities.
Let $R$ be a ring. The derivative of a polynomial $F = \sum a_i X^i \in R[X]$ is defined to be $\sum i a_i X^{i-1}$, and is written $F_X$ or $\partial F / \partial X$. If $F \in R[X_1, \dots, X_n]$ then $F_{X_i}$ is defined by considering $F$ as a polynomial in $X_i$ with coefficients in $R[X_1, \dots, X_{i-1}, X_{i+1}, \dots, X_n]$. The following rules can be easily verified:
1. $(aF + bG)_X = aF_X + bG_X$, where $a, b \in R$.
2. $F_X = 0$ if $F$ is a constant.
3. $(FG)_X = F_X G + F G_X$
4. $(F^n)_X = n F^{n-1} F_X$
5. If $G_1, \dots, G_n \in R[X]$ and $F \in R[X_1, \dots, X_n]$ then $F(G_1, \dots, G_n)_X = \sum_{i=1}^{n} F_{X_i}(G_1, \dots, G_n)(G_i)_X$
6. $F_{X_i X_j} = F_{X_j X_i}$, where $F_{X_i X_j} = (F_{X_i})_{X_j}$
7. (Euler's Theorem) If $F$ is a form of degree $m$ in $R[X_1, \dots, X_n]$ then $mF = \sum_{i=1}^{n} X_i F_{X_i}$
B.2 Fractions in polynomial rings
This section, adapted from Chapter 9 of [1], describes how to work with
fractions inside polynomial rings, which is necessary throughout the project.
The properties of a polynomial ring $F[x]$ closely resemble the properties of a number field. However, one aspect where it differs is that given two polynomials $a(x), b(x)$ where $b(x) \ne 0$ in $F[x]$, it is not always possible to find a polynomial $q(x)$ such that $a(x) = b(x)q(x)$. For example, the ring $F[x]$ contains the polynomials $x$ and $(1 + x)$, but $x$ does not divide $(1 + x)$.
Consider a second pair of polynomials $\alpha(x), \beta(x)$ such that $\beta(x) \ne 0$. These polynomials are said to be equivalent to $a(x), b(x)$ when
\[ a(x)\beta(x) = \alpha(x)b(x). \]
Let $a(x)/b(x)$ denote the equivalence class of pairs equivalent to $a(x), b(x)$. The class is then also representable by $\alpha(x)/\beta(x)$ and so we write
\[ \frac{a(x)}{b(x)} = \frac{\alpha(x)}{\beta(x)}. \]
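Addition and multiplication are defined by analogy with ordinary fractions:
\[ \frac{a(x)}{b(x)} + \frac{c(x)}{d(x)} = \frac{a(x) \cdot d(x) + b(x) \cdot c(x)}{b(x) \cdot d(x)}, \qquad \frac{a(x)}{b(x)} \cdot \frac{c(x)}{d(x)} = \frac{a(x) \cdot c(x)}{b(x) \cdot d(x)}. \]
If $a(x)/b(x) = \alpha(x)/\beta(x)$ and $c(x)/d(x) = \gamma(x)/\delta(x)$ then it follows that
\[ \frac{a(x)}{b(x)} + \frac{c(x)}{d(x)} = \frac{\alpha(x)}{\beta(x)} + \frac{\gamma(x)}{\delta(x)}, \qquad \frac{a(x)}{b(x)} \cdot \frac{c(x)}{d(x)} = \frac{\alpha(x)}{\beta(x)} \cdot \frac{\gamma(x)}{\delta(x)}. \]
We call $a(x)/b(x)$ a rational function of $x$ over $F$.
It can be easily verified that these laws for addition and multiplication satisfy commutativity, associativity and distributivity.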
A unique rational function $p(x)/q(x)$ can always be found so that
\[ \frac{a(x)}{b(x)} = \frac{c(x)}{d(x)} + \frac{p(x)}{q(x)} \quad\Rightarrow\quad \frac{p(x)}{q(x)} = \frac{a(x)}{b(x)} - \frac{c(x)}{d(x)} = \frac{a(x)d(x) - b(x)c(x)}{b(x)d(x)}. \]
This rational function is called the difference.
Similarly if $c(x) \ne 0$ then a unique rational function $r(x)/s(x)$ can always be found so that
\[ \frac{a(x)}{b(x)} = \frac{c(x)}{d(x)} \cdot \frac{r(x)}{s(x)} \quad\Rightarrow\quad \frac{r(x)}{s(x)} = \frac{a(x)d(x)}{b(x)c(x)}. \]
This rational function is called the quotient of $a(x)/b(x)$ by $c(x)/d(x)$.
The sum, product, difference and quotient (when there is one) of two
rational functions over F is also a rational function over F . This system of
rational functions forms a field.
We observe that the rational integral functions $a(x)/1$ have the same properties as the polynomials $a(x)$. So we can take the system of rational functions and replace all those of the form $a(x)/1$ by $a(x)$. The resulting set of polynomials and rational functions is called the quotient field of the polynomial ring $F[x]$. Now if $b(x) \ne 0$ and if $a(x) = b(x)q(x)$ then $q(x) = a(x)/b(x)$.
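The rules of this section can be tried out with a small Python sketch. It assumes integer coefficients and stores a polynomial as a list of coefficients, lowest degree first; this representation and the function names are choices made here for illustration only. A rational function is a pair (numerator, denominator), and equivalence is tested by cross-multiplication, exactly as in the definition $a(x)\beta(x) = \alpha(x)b(x)$.

```python
def trim(p):
    """Drop trailing zero coefficients, keeping at least one entry."""
    while len(p) > 1 and p[-1] == 0:
        p = p[:-1]
    return p


def poly_add(a, b):
    n = max(len(a), len(b))
    return trim([(a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0)
                 for i in range(n)])


def poly_mul(a, b):
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return trim(out)


def frac_add(f, g):
    (a, b), (c, d) = f, g
    return poly_add(poly_mul(a, d), poly_mul(b, c)), poly_mul(b, d)


def frac_mul(f, g):
    (a, b), (c, d) = f, g
    return poly_mul(a, c), poly_mul(b, d)


def frac_equal(f, g):
    (a, b), (alpha, beta) = f, g
    return poly_mul(a, beta) == poly_mul(alpha, b)


x, one, one_plus_x = [0, 1], [1], [1, 1]
print(frac_add((x, one_plus_x), (one, x)))   # ([1, 1, 1], [0, 1, 1])
```

The printed pair represents $(1 + x + x^2)/(x + x^2)$, the sum of $x/(1+x)$ and $1/x$. No reduction to lowest terms is attempted, which is why `frac_equal` is needed to compare results.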
B.3 Number theory
• The greatest common divisor (gcd), of two non-zero integers, is the
largest positive integer that divides both numbers.
• The integers a and b are said to be coprime if they have no common
factor other than 1 and -1, or equivalently, if their gcd is 1.
• The Euler totient function φ(n) of a positive integer n is defined to be
the number of positive integers less than or equal to n and coprime to
n. For example, $\varphi(8) = 4$ since the four numbers 1, 3, 5 and 7 are coprime to 8, but 2, 4 and 6 are not.
• Let $n$ be a positive integer. Then $\mathbb{Z}_n$ is the set of integers modulo $n$:
\[ \mathbb{Z}_n = \{0, 1, 2, \dots, n - 1\} \]
and $\mathbb{Z}_n$ is a group under addition. Define $\mathbb{Z}_n^*$ as
\[ \mathbb{Z}_n^* = \{a \mid 1 \le a \le n,\ \gcd(a, n) = 1\}. \]
$\mathbb{Z}_n^*$ is a group with respect to multiplication mod $n$.
• Let $a \in \mathbb{Z}_n^*$. The order of $a$ mod $n$ is the smallest integer $k > 0$ such that $a^k \equiv 1 \pmod n$. The order of $a$ mod $n$ divides $\varphi(n)$ (the Euler totient function).
• A primitive root modulo $n$ is an integer $g$ such that, modulo $n$, every integer coprime to $n$ is congruent to a power of $g$. Consider, for example, $n = 14$, so $\mathbb{Z}_{14}^* = \{1, 3, 5, 9, 11, 13\}$. We then see that 3 is a primitive root modulo 14 as
\[ \{3^1, 3^2, 3^3, 3^4, 3^5, 3^6\} = \{3, 9, 27, 81, 243, 729\} \equiv \{3, 9, 13, 11, 5, 1\} = \mathbb{Z}_{14}^* \pmod{14}. \]
The only other primitive root modulo 14 is 5.
• Let $p$ be prime and $a \in \mathbb{Z}_p^*$. The order of $a$ mod $p$ divides $(p - 1)$. A primitive root mod $p$ is an integer, $g$, such that the order of $g$ mod $p$ equals $(p - 1)$. Then every integer is congruent modulo $p$ to 0 or a power of $g$. For example, 3 is a primitive root mod 7:
\[ \{1, 3, 9, 27, 81, 243\} \equiv \{1, 3, 2, 6, 4, 5\} \pmod 7 = \mathbb{Z}_7^*. \]
There are $\varphi(p - 1)$ primitive roots mod $p$. A primitive root mod $p$ always exists and so $\mathbb{Z}_p^*$ is a cyclic group.
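The definitions in this list translate directly into code. The sketch below is a brute-force illustration for small moduli $n \ge 2$; the function names are assumptions made here, and only `math.gcd` from the Python standard library is used.

```python
from math import gcd


def units(n):
    """Elements of Z_n^*: integers 1 <= a <= n coprime to n."""
    return [a for a in range(1, n + 1) if gcd(a, n) == 1]


def totient(n):
    """Euler totient function, by direct counting."""
    return len(units(n))


def order(a, n):
    """Smallest k > 0 with a^k = 1 (mod n); assumes gcd(a, n) = 1, n >= 2."""
    k, x = 1, a % n
    while x != 1:
        x = x * a % n
        k += 1
    return k


def primitive_roots(n):
    """Units whose order equals the totient of n (may be empty)."""
    phi = totient(n)
    return [a for a in units(n) if order(a, n) == phi]


print(totient(8))            # 4
print(units(14))             # [1, 3, 5, 9, 11, 13]
print(primitive_roots(14))   # [3, 5]
print(primitive_roots(7))    # [3, 5]
```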
Theorem B.1 (Chinese remainder theorem). Let $n_1, n_2, \dots, n_r$ be positive integers such that $\gcd(n_i, n_j) = 1$ when $i \ne j$. Let $a_1, a_2, \dots, a_r$ be integers. There exists an $x$ such that
\[ x \equiv a_i \pmod{n_i} \quad \text{for all } i. \]
The integer $x$ is uniquely determined modulo $n_1 n_2 \cdots n_r$.
Example B.1. Let $n_1 = 4$, $n_2 = 3$, $n_3 = 5$ and let $a_1 = 1$, $a_2 = 2$, $a_3 = 3$. Then $x = 53$ is a solution to the simultaneous congruences
\[ x \equiv 1 \pmod 4, \qquad x \equiv 2 \pmod 3, \qquad x \equiv 3 \pmod 5 \]
and any solution to the congruences is congruent to 53 modulo 60.
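One way to find such an $x$ is to combine the congruences one at a time. The following Python sketch does this; the function name is hypothetical, the moduli are assumed to be pairwise coprime, and the modular inverse uses the built-in `pow(m, -1, n)`, which requires Python 3.8 or later.

```python
def crt(residues, moduli):
    """Return (x, N) with x = a_i (mod n_i) for all i and N = n_1 * ... * n_r."""
    x, m = 0, 1
    for a, n in zip(residues, moduli):
        # Choose t so that x + m*t = a (mod n); m is invertible mod n.
        t = (a - x) * pow(m, -1, n) % n
        x += m * t
        m *= n
    return x, m


print(crt([1, 2, 3], [4, 3, 5]))   # (53, 60)
```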
Theorem B.2 (Fermat's little theorem). If $p$ is a prime number then for any integer $a$
\[ a^p \equiv a \pmod p. \]
B.4 Group theory
• A set is a collection of objects considered as a whole. The objects of a
set are called elements. If A and B are sets and every element of A is
also an element of B, then A is a subset of B.
• A group (G, ∗) is a nonempty set, G, together with a group operator, ∗, which satisfy the group axioms:
– Associativity: ∀a, b, c ∈ G, (a ∗ b) ∗ c = a ∗ (b ∗ c)
– Identity element: ∃e ∈ G such that ∀a ∈ G, e ∗ a = a ∗ e = a
– Inverse element: ∀a ∈ G ∃b ∈ G such that a ∗ b = b ∗ a = e (where e is the neutral element).
– Closure: ∀a, b ∈ G, a ∗ b ∈ G
• A group G is said to be abelian (or commutative) if for every a, b ∈ G,
a ∗ b = b ∗ a. Groups lacking this property are called non-abelian.
• The integers under addition form an abelian group while the integers under multiplication do not (as not every integer has an inverse under multiplication that is also an integer).
• If the operation is thought of as an analogue of multiplication, then the
group operations are written multiplicatively. That is:
– write a · b or even ab for a ∗ b and call it the product of a and b.
– write 1 (or e) for the identity element and call it the unit element.
– write a−1 for the inverse of a and call it the reciprocal of a.
However, sometimes the group operation is thought of as analogous to
addition and written additively:
– write a + b for a ∗ b and call it the sum of a and b.
– write 0 for the identity element and call it the zero element.
– write −a for the inverse of a and call it the opposite of a.
Usually, only abelian groups are written additively, although abelian
groups may also be written multiplicatively.
• As elliptic curves form additive abelian groups we use additive group
notation in this project (although we use ∞ for the identity element).
• The order of a group G, denoted by |G|, is the number of elements of
the set G. A group is called finite if it has finitely many elements.
• The order of an element g ∈ G is the smallest integer k > 0 such that g ∗ g ∗ ... ∗ g (k times) = e. So using the additive notation of this project the order of g ∈ G is the smallest integer k > 0 such that kg = 0. Note that, in multiplicative notation, if k is the order of g then
$g^i = g^j \iff i \equiv j \pmod k$
• Given a group G under a binary operation ∗, we say that a subset H
of G is a subgroup of G if H also forms a group under the operation ∗.
Theorem B.3 (Lagrange's theorem). Let G be a finite group.
(i) Let H be a subgroup of G. Then the order of H divides the order of G.
(ii) Let g ∈ G. Then the order of g divides the order of G.
Consider two sets of elements, the domain and the codomain, and a function f that maps elements from the domain to the codomain.
• f is injective (1-1) if, for every y in the codomain, there is at most one
x in the domain such that f (x) = y.
• f is surjective (onto) if, for every y in the codomain, there is at least
one x in the domain such that f (x) = y.
• f is bijective if, for every y in the codomain there is exactly one x in
the domain such that f (x) = y.
So the function f is bijective if it is both injective and surjective.
• A homomorphism is a structure-preserving map between two algebraic
structures (such as groups, rings, or vector spaces). So a homomorphism between groups preserves the structure of the group operation.
• An isomorphism is a bijective (1-1 & onto) map f such that both f
and its inverse f −1 are homomorphisms.
• An automorphism is an isomorphism from an object to itself.
• An endomorphism is a homomorphism from an object to itself.
The diagram below denotes implication.
Automorphism −→ Isomorphism
↓
↓
Endomorphism −→ Homomorphism
A cyclic group is a group isomorphic to either $\mathbb{Z}$ or $\mathbb{Z}_n$ for some $n$. These groups can be generated by one element. For example $\mathbb{Z}_4$ is generated by 3:
\[ \{0, 3, 3 + 3, 3 + 3 + 3\} = \{0, 3, 6, 9\} \equiv \{0, 3, 2, 1\} \pmod 4 = \mathbb{Z}_4. \]
Theorem B.4. Let G be a finite cyclic group of order n and let d > 0 divide
n. Then
(i) G has a unique subgroup of order d.
(ii) G has d elements of order dividing d, and G has φ(d) elements of order
exactly d (where φ(d) is the Euler Totient function).
Example B.2. Consider $\mathbb{Z}_6$. Since $3 \mid 6$ there is a unique subgroup of $\mathbb{Z}_6$, $\{0, 2, 4\}$, which is of order 3. Also $\varphi(3) = 2$ and, as expected, $\mathbb{Z}_6$ has two elements of order three (2 and 4).
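Theorem B.4 can be checked in $\mathbb{Z}_n$ by listing elements by their additive order. The sketch below relies on the standard fact that the order of $x$ in $\mathbb{Z}_n$ is $n / \gcd(x, n)$; the function names are illustrative choices made here.

```python
from math import gcd


def additive_order(x, n):
    """Order of x in the additive group Z_n."""
    return n // gcd(x, n)


def elements_of_order(d, n, exact=True):
    """Elements of Z_n whose order is exactly d, or divides d if exact=False."""
    if exact:
        return [x for x in range(n) if additive_order(x, n) == d]
    return [x for x in range(n) if d % additive_order(x, n) == 0]


print(elements_of_order(3, 6))               # [2, 4]
print(elements_of_order(3, 6, exact=False))  # [0, 2, 4]
```

The second list is the unique subgroup of order 3 from Example B.2.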
The direct sum of two groups G1 and G2 is defined to be the set of ordered
pairs formed from elements of G1 and G2 :
G1 ⊕ G2 = {(g1 , g2 ) | g1 ∈ G1 , g2 ∈ G2 }
Ordered pairs can be added componentwise:
(g1 , g2 ) + (h1 , h2 ) = (g1 + h1 , g2 + h2 )
This makes G1 ⊕ G2 into a group with (0,0) as the identity element. These
definitions can be extended for the sum of more than two groups.
Remark B.5. Suppose $H = A \oplus B \oplus \dots \oplus R$ is a direct sum of groups. Then any point $(a, b, \dots, r)$ in $H$ of order dividing $n$ satisfies
\[ (0, 0, \dots, 0) = n(a, b, \dots, r) = (na, nb, \dots, nr) \]
where $a, b, \dots$ represent elements in $A, B, \dots$
This implies that any point in $H$ of order dividing $n$ is composed of points in $A, B, \dots$ that also have order dividing $n$.
Theorem B.6. A finite abelian group, G, is isomorphic to
Zn1 ⊕ Zn2 ⊕ ... ⊕ Zns
with ni |ni+1 for i = 1, 2, ..., s − 1. The ni are uniquely determined by G.
Example B.3. If we have a finite abelian group of order 12, then n1 ...ns
multiply to give 12. So the only options are (n1 , n2 ) = (1, 12), (2, 6) and
(3, 4). Of these only (1,12) and (2,6) satisfy n1 |n2 so we conclude that the
group is isomorphic to either Z12 or Z2 ⊕ Z6 .
Example B.4. Similarly, if we have a finite abelian group of order 27 then
it is isomorphic to either Z27 , Z3 ⊕ Z9 or Z3 ⊕ Z3 ⊕ Z3 .
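The possible decompositions in Examples B.3 and B.4 can be enumerated mechanically. The Python sketch below lists every chain $(n_1, \dots, n_s)$ with $n_i \mid n_{i+1}$ and product equal to the group order; it omits trivial factors $n_1 = 1$ (so $\mathbb{Z}_{12}$ appears as `(12,)` rather than as $(1, 12)$), and the function name is an assumption made here.

```python
def invariant_factor_chains(order):
    """All tuples (n_1, ..., n_s), n_i >= 2, n_i | n_{i+1}, product = order."""
    def extend(remaining, prev):
        # prev is the previous factor; the next one must be a multiple of it.
        if remaining == 1:
            return [()]
        chains = []
        for d in range(max(prev, 2), remaining + 1):
            if remaining % d == 0 and d % prev == 0:
                for rest in extend(remaining // d, d):
                    chains.append((d,) + rest)
        return chains
    return extend(order, 1)


print(invariant_factor_chains(12))   # [(2, 6), (12,)]
print(invariant_factor_chains(27))   # [(3, 3, 3), (3, 9), (27,)]
```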
Corollary B.7. Suppose we have a finite abelian group $G$ in the form of Theorem B.6 above. Then $G$ will have $n_1^s$ elements of order dividing $n_1$.
Proof For each $i$, we have $n_1 \mid n_i$ and so by Theorem B.4 $\mathbb{Z}_{n_i}$ will have $n_1$ elements of order dividing $n_1$. By Remark B.5 any element of $G$ with order dividing $n_1$ will be composed of $s$ components, each of which has order dividing $n_1$ itself. Therefore, since each group $\mathbb{Z}_{n_i}$ has $n_1$ candidates, there will be $n_1^s$ elements in $G$ of order dividing $n_1$.
Lemma B.8. Suppose $E[n]$ is isomorphic to a direct sum of groups
\[ E[n] \simeq \mathbb{Z}_{n_1} \oplus \mathbb{Z}_{n_2} \oplus \dots \oplus \mathbb{Z}_{n_k} \]
with $n_i \mid n_{i+1}$ as in Theorem B.6. Let $l$ be a prime dividing $n_1$. Then $E[l] \subseteq E[n]$ and has order $l^k$.
Proof We have $l \mid n_1$ and $n_i \mid n_{i+1}$ for all $i$. So $l \mid n_i$ for all $i$ and also $l \mid n$. Therefore any point in $E[l]$ will also be in $E[n]$, so $E[l] \subseteq E[n]$.
Recall Theorem B.4 part (ii): a cyclic group $G$ of order divisible by $d$ has $\varphi(d)$ elements of order exactly $d$ ($\varphi$ the Euler totient function). So if $G = \mathbb{Z}_{n_i}$ then, since the prime $l$ divides $n_i$, there are $l - 1$ points of order $l$. The set $\{x \in \mathbb{Z}_{n_i} : lx = 0\}$ contains these $l - 1$ points as well as the identity, and so has size $l$.
Finally apply this to the direct sum of groups that we are working with. The size of $E[l]$ will be the number of points in the set
\[ \{x \in E[n] : l \cdot x = \infty\} \]
which, considering the form of $E[n]$, is $l^k$. So $E[l] \subseteq E[n]$ and has order $l^k$.
B.5 Field theory
A field is a set in which we can perform analogues of the operations (+, −, ×) for all elements and also ÷ by all elements except for 0. We usually think of division by an element as multiplying by that element's inverse. So $b/a = ba^{-1}$ where $a^{-1}$ is the element such that $a^{-1} \times a = 1$. The formal definition of a field follows.
A field is a commutative ring (F, +, ×) such that 0 does not equal 1 and all elements of F except 0 have a multiplicative inverse.
(Note: 0 and 1 here stand for the identity elements for the + and × operations, and not the real numbers.) This means that the following all hold:
• Closure of F under + and ×
For all a, b belonging to F , both a + b and a × b belong to F (or more
formally, + and × are binary operations on F ).
• Both + and × are associative
For all a, b, c ∈ F , a + (b + c) = (a + b) + c and a × (b × c) = (a × b) × c.
• Both + and × are commutative
For all a, b belonging to F, a + b = b + a and a × b = b × a.
• The operation × is distributive over the operation +
For all a, b, c, belonging to F , a × (b + c) = (a × b) + (a × c).
• Existence of an additive identity
There exists an element 0 ∈ F , such that for all a belonging to F ,
a + 0 = a.
• Existence of a multiplicative identity
There exists an element 1 ∈ F different from 0, such that for all a belonging to F, a × 1 = a.
• Existence of additive inverses
For every a ∈ F , there is an element −a ∈ F , such that a + (−a) = 0.
• Existence of multiplicative inverses
For every $a \ne 0$ in F, there is an element $a^{-1} \in F$, such that $a \times a^{-1} = 1$.
The requirement $0 \ne 1$ ensures that the set which only contains a single element is not a field.
There are infinite fields, such as $\mathbb{Q}$, $\mathbb{R}$ and $\mathbb{C}$, and also finite fields, such as $\mathbb{Z}_p$ for $p$ prime.
Example B.5. The set $\mathbb{Z}_5 = \{0, 1, 2, 3, 4\}$ is a finite field. To see this we calculate the addition and multiplication tables.

    +   0  1  2  3  4
    0   0  1  2  3  4
    1   1  2  3  4  0
    2   2  3  4  0  1
    3   3  4  0  1  2
    4   4  0  1  2  3

    ×   0  1  2  3  4
    0   0  0  0  0  0
    1   0  1  2  3  4
    2   0  2  4  1  3
    3   0  3  1  4  2
    4   0  4  3  2  1

So we can clearly see that both the addition and multiplication operations
are closed, commutative and associative. Further analysis shows the rest of
the rules hold, with 0 as the additive identity and 1 as the multiplicative
identity. We can also see that each element has an additive inverse and each
element (except 0) has a multiplicative inverse.
For example $4 + 1 = 5 \equiv 0 \pmod 5$ and $4 \times 4 = 16 \equiv 1 \pmod 5$, so the additive inverse of 4 is 1, while its multiplicative inverse is itself.
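The tables and inverses of Example B.5 can be generated for any modulus, which also shows what goes wrong when the modulus is not prime. The following Python sketch uses illustrative function names chosen here.

```python
def cayley_tables(n):
    """Addition and multiplication tables of Z_n as lists of rows."""
    add = [[(a + b) % n for b in range(n)] for a in range(n)]
    mul = [[(a * b) % n for b in range(n)] for a in range(n)]
    return add, mul


def inverses(n):
    """Additive inverses of all elements and multiplicative inverses where they exist."""
    add_inv = {a: (-a) % n for a in range(n)}
    mul_inv = {a: b for a in range(1, n) for b in range(1, n) if a * b % n == 1}
    return add_inv, mul_inv


add5, mul5 = cayley_tables(5)
add_inv5, mul_inv5 = inverses(5)
print(mul5[3])                    # [0, 3, 1, 4, 2]
print(add_inv5[4], mul_inv5[4])   # 1 4
print(sorted(inverses(6)[1]))     # [1, 5]
```

In $\mathbb{Z}_6$ only 1 and 5 have multiplicative inverses, so $\mathbb{Z}_6$ is not a field.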
Let K be a field. There is a ring homomorphism ϕ : Z → K that sends
1 ∈ Z to 1 ∈ K. If ϕ is injective then we say K has characteristic 0.
Otherwise there is a smallest positive integer p such that ϕ(p) = 0 and we
say K has characteristic p.
So if we are in a field (K, +, ×) with identities 0 and 1 then consider the elements
\[ 1, \quad 1 + 1, \quad 1 + 1 + 1, \quad \dots \]
Now if there is an $n$ such that
\[ \underbrace{1 + 1 + \dots + 1}_{n \text{ times}} = 0 \]
then we say the field K has characteristic $n$, where $n$ is the smallest such positive integer. If however all those elements are distinct then we say K has characteristic 0.
(Clearly if K is a finite field then it cannot have characteristic zero, but there are infinite fields with positive characteristic.)
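Theorem B.9. If a field K has non-zero characteristic $p$, then $p$ is prime.
Proof By contradiction. Assume $p = ab$ with $1 < a \le b < p$. Then $\varphi(a)\varphi(b) = \varphi(p) = 0$, so $\varphi(a) = 0$ or $\varphi(b) = 0$, since a field has no zero divisors. This contradicts the choice of $p$ as the smallest positive integer with $\varphi(p) = 0$, so $p$ is prime.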
• A multiplicative group is formed from a field (K, +, ×) under the multiplication operator with the zero element removed. This group is usually denoted $K^\times$.
• When K has characteristic 0 the field Q of rational numbers is contained in K. When K has characteristic p the field Fp of integers
modulo p is contained in K.
• Let K and L be fields with K ⊆ L. If α ∈ L we say that α is algebraic over K if there exists a non-constant polynomial
\[ f(X) = X^n + a_{n-1}X^{n-1} + \dots + a_0 \]
with $a_0, \dots, a_{n-1} \in K$ such that $f(\alpha) = 0$.
• We say that the field L is algebraic over K (or that L is an algebraic
extension of K) if every element of L is algebraic over K.
• An algebraic closure of a field $K$ is a field $\overline{K}$ containing $K$ such that:
1. $\overline{K}$ is algebraic over $K$.
2. Every non-constant polynomial $g(X)$ with coefficients in $\overline{K}$ has a root in $\overline{K}$ (that is, $\overline{K}$ is algebraically closed).
If $g(X)$ has degree $n$ and has a root $\alpha \in \overline{K}$, then we can write $g(X) = (X - \alpha)g_1(X)$ with $g_1(X)$ of degree $(n - 1)$. By induction we see that $g(X)$ has exactly $n$ roots (counting multiplicities) in $\overline{K}$.
• It can be shown that every field K has an algebraic closure, and that any
two algebraic closures of K are isomorphic. Assume that a particular
algebraic closure of a field K has been chosen, and refer to it as the
algebraic closure of K.
• A field K is said to be algebraically closed if every polynomial (in one
variable of degree at least 1), with coefficients in K, has a zero (root) in
K. C is algebraically closed (by the fundamental theorem of algebra).
The algebraic closure of K can also be defined as the smallest algebraically closed field containing K.
Example B.6. $\mathbb{C}$ is the algebraic closure of $\mathbb{R}$. We have $x^2 + 1 \ne (x + n)(x + m)$ for any $n, m \in \mathbb{R}$, so $\mathbb{R}$ is not algebraically closed, as the roots of $x^2 + 1$ are not in $\mathbb{R}$. However $\mathbb{C}$ is algebraically closed and is the smallest such field containing $\mathbb{R}$:
\[ x^2 + 1 = (x + i)(x - i) \in \mathbb{C}[x]. \]
When $K = \mathbb{Q}$ the algebraic closure, $\overline{\mathbb{Q}}$, is the set of complex numbers that are algebraic over $\mathbb{Q}$. When $K = \mathbb{C}$ the algebraic closure is $\mathbb{C}$ itself, since $\mathbb{C}$ is algebraically closed.
B.5.1
Finite fields
A finite field is a field that contains only finitely many elements. The finite
fields are completely known as described below.
1. Every finite field has $p^n$ elements for some prime $p$ and some integer $n \ge 1$. (This $p$ is the characteristic of the field.)
2. For every prime $p$ and integer $n \ge 1$, there exists a finite field with $p^n$ elements.
3. All fields with $p^n$ elements are isomorphic, which justifies using the same name for all of them, $\mathbb{F}_{p^n}$ (in other literature $\mathrm{GF}(p^n)$ is often used).
So for example, there is a finite field $\mathbb{F}_8 = \mathbb{F}_{2^3}$ with 8 elements, and every field with 8 elements is isomorphic to this one. However, there is no finite field with 6 elements, because 6 is not a power of any prime.
Example B.7. Let $p$ be prime. The integers mod $p$ form a finite field $\mathbb{F}_p$ with $p$ elements (i.e. with $n = 1$ in the above). However the ring $\mathbb{Z}_{p^n}$ is not a field when $n \ge 2$, since then $p$ does not have a multiplicative inverse.
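Theorem B.10. $\mathbb{F}_{p^m} \subseteq \mathbb{F}_{p^n} \iff m \mid n$
Theorem B.11. The algebraic closure of $\mathbb{F}_p$ is
\[ \overline{\mathbb{F}}_p = \bigcup_{n \ge 1} \mathbb{F}_{p^n}. \]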
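Theorem B.12. If $F$ is a finite field with $q = p^n$ elements then $x^q = x$ for all elements $x \in F$.
Theorem B.13. Let $\overline{\mathbb{F}}_p$ be the algebraic closure of $\mathbb{F}_p$ and let $q = p^n$. Then
\[ \mathbb{F}_q = \{\alpha \in \overline{\mathbb{F}}_p \mid \alpha^q = \alpha\}. \]
Proof Let $\mathbb{F}_q^\times$ be the set of non-zero elements of $\mathbb{F}_q$ under the multiplication operator. $\mathbb{F}_q^\times$ is a group of order $q - 1$. We know that an element $0 \ne \alpha \in \mathbb{F}_q$ will have order dividing $q - 1$, so $\alpha^{q-1} = 1$. Therefore $\alpha^q = \alpha$ for all $\alpha \in \mathbb{F}_q$.
Next recall that a polynomial $g(X)$ has multiple roots if and only if it shares a common root with $g'(X)$. Let $g(X) = X^q - X$, defined over $\overline{\mathbb{F}}_p$. Then
\[ \frac{d}{dX}(X^q - X) = qX^{q-1} - 1 = -1 \]
since $q = p^n = 0$ in $\overline{\mathbb{F}}_p$. So the polynomial $X^q - X$ has no multiple roots. Therefore there are $q$ distinct $\alpha \in \overline{\mathbb{F}}_p$ such that $\alpha^q = \alpha$.
Because $\overline{\mathbb{F}}_p = \bigcup_{n \ge 1} \mathbb{F}_{p^n}$ we know $\mathbb{F}_q \subset \overline{\mathbb{F}}_p$. There are $q$ elements in $\mathbb{F}_q$, all of which satisfy $\alpha^q = \alpha$. There are exactly $q$ elements in $\overline{\mathbb{F}}_p$ with this property so
\[ \mathbb{F}_q = \{\alpha \in \overline{\mathbb{F}}_p \mid \alpha^q = \alpha\} \]
as required.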
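Define the $q$-th power Frobenius automorphism $\phi_q$ of $\overline{\mathbb{F}}_q$ by
\[ \phi_q(x) = x^q \quad \text{for all } x \in \overline{\mathbb{F}}_q. \]
Proposition B.14. Let $q$ be a power of the prime $p$. Then
(i) $\overline{\mathbb{F}}_q = \overline{\mathbb{F}}_p$
(ii) $\phi_q$ is an automorphism of $\overline{\mathbb{F}}_q$. In particular,
\[ \phi_q(x + y) = \phi_q(x) + \phi_q(y), \qquad \phi_q(xy) = \phi_q(x)\phi_q(y) \]
for all $x, y \in \overline{\mathbb{F}}_q$.
(iii) Let $\alpha \in \overline{\mathbb{F}}_q$. Then
\[ \alpha \in \mathbb{F}_{q^n} \iff \phi_q^n(\alpha) = \alpha. \]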
151
Proof
(i) This is a special case of the fact that if $K \subset L$ and every element of
$L$ is algebraic over $K$, then $\overline{L} = \overline{K}$. We prove this as follows. If $\alpha$ is
algebraic over $L$ and $L$ is algebraic over $K$ then $\alpha$ is algebraic over $K$.
Therefore $\overline{L}$ is algebraic over $K$, and is algebraically closed. Therefore
it is the algebraic closure of $K$.
(ii) If $1 \le j \le p - 1$ then the binomial coefficient $\binom{p}{j}$ has a factor of $p$
in the numerator that is not cancelled by the denominator, and so is
congruent to 0 modulo $p$. Therefore
\[ (x + y)^p = x^p + \binom{p}{1} x^{p-1} y + \binom{p}{2} x^{p-2} y^2 + \dots + y^p = x^p + y^p. \]
Now assume this holds for $p^n$, so $(x + y)^{p^n} = x^{p^n} + y^{p^n}$. Then
\[ (x + y)^{p^{n+1}} = [(x + y)^{p^n}]^p = [x^{p^n} + y^{p^n}]^p = x^{p^{n+1}} + y^{p^{n+1}}. \]
So by the principle of mathematical induction, for all $n \ge 1$ we have
\[ (x + y)^{p^n} = x^{p^n} + y^{p^n} \implies \phi_q(x + y) = \phi_q(x) + \phi_q(y). \]
The fact that $\phi_q(xy) = \phi_q(x)\phi_q(y)$ is clear from the definition of $\phi_q$.
So together these show that $\phi_q$ is a homomorphism of fields. We need
to show that $\phi_q$ is bijective. Both 0 and 1 are mapped to themselves, so
if $x \ne 0$ then $1 = \phi_q(x x^{-1}) = \phi_q(x)\phi_q(x^{-1})$ and hence $\phi_q(x) \ne 0$.
The only element mapped to 0 is therefore 0 itself, so $\phi_q$ is injective. It remains to show
that $\phi_q$ is surjective. If $\alpha \in \overline{\mathbb{F}}_p$, then $\alpha \in \mathbb{F}_{q^n}$ for some $n \ge 1$, so $\phi_q^n(\alpha) = \alpha$.
Therefore $\alpha = \phi_q(\phi_q^{n-1}(\alpha))$ is in the image of $\phi_q$, meaning $\phi_q$ is surjective and hence an
automorphism.
(iii) This is a restatement of Theorem B.13 with $q^n$ in place of $q$. The
theorem still holds as $q^n$ is still a power of the prime $p$.
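The divisibility of the binomial coefficients used in part (ii) is easy to check numerically. The Python sketch below is only an illustration; the function name `middle_binomials_mod` is our own. It reduces $\binom{n}{j}$ modulo $n$ for $1 \le j \le n-1$, which should give all zeros when $n$ is prime.

```python
from math import comb


def middle_binomials_mod(n):
    """Return C(n, j) mod n for j = 1, ..., n-1."""
    return [comb(n, j) % n for j in range(1, n)]


if __name__ == "__main__":
    print(middle_binomials_mod(7))  # prime modulus
    print(middle_binomials_mod(4))  # composite modulus, for contrast
```

For the composite modulus 4 the middle coefficient $\binom{4}{2} = 6$ is not divisible by 4, which is why the argument needs $p$ to be prime.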
Let $\mathbb{F}_p^\times$ be the group formed from the nonzero elements of $\mathbb{F}_p$ under the
multiplication operator. In Appendix B.3 we showed that $\mathbb{F}_p^\times = \mathbb{Z}_p^*$ is a cyclic
group, which has the following useful consequence.
Proposition B.15. Let $m$ be a positive integer such that $p \nmid m$ and let $\mu_m$
be the group of $m$th roots of unity. Then
\[ \mu_m \subseteq \mathbb{F}_q^\times \iff m \mid (q - 1). \]
Proof Suppose $\mu_m \subseteq \mathbb{F}_q^\times$. Because $\mu_m$ is a group of order $m$, and $\mathbb{F}_q^\times$ is a group of order $q - 1$,
we have by Lagrange's theorem (B.3) that $m \mid (q - 1)$.
Conversely suppose $m \mid (q - 1)$. Since $\mathbb{F}_q^\times$ is a cyclic group of order $q - 1$,
by Theorem B.4 it has a subgroup of order $m$. Then by Lagrange's theorem
(B.3) the elements of this subgroup satisfy $x^m = 1$. Hence they must be the
$m$ elements of $\mu_m$.
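Proposition B.15 can be illustrated over a prime field by counting the solutions of $x^m = 1$. The Python sketch below restricts to $q = p$ prime; the function name `roots_of_unity_mod` is our own. All $m$ roots of unity lie in $\mathbb{F}_p^\times$ exactly when the list it returns has length $m$.

```python
def roots_of_unity_mod(m, p):
    """Return the nonzero x in Z_p with x^m = 1 (mod p)."""
    return [x for x in range(1, p) if pow(x, m, p) == 1]


if __name__ == "__main__":
    p = 7
    for m in range(1, p):
        found = roots_of_unity_mod(m, p)
        print(m, found, len(found) == m, (p - 1) % m == 0)
```

With $p = 7$ the two boolean columns agree for every $m$: the full set of $m$ roots appears only for $m = 1, 2, 3, 6$, the divisors of $q - 1 = 6$.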
If we are dealing with $\mathbb{F}_p$, the finite field of order $p$, where $p$ is prime, then
this is isomorphic to $\mathbb{Z}_p = \{0, 1, 2, ..., p - 1\}$. Addition and multiplication of
elements can then be performed modulo $p$. However, $\mathbb{F}_{p^n}$ is not isomorphic
to $\mathbb{Z}_{p^n}$ when $n \ge 2$, as discussed earlier, so these fields must be explicitly constructed.
To do this we first select an irreducible polynomial $f(x)$ of degree $n$ with
coefficients in $\mathbb{F}_p = \mathbb{Z}_p$. Then $\mathbb{F}_q = \mathbb{F}_p[x]/\langle f(x) \rangle$, where $q = p^n$, $\mathbb{F}_p[x]$ is the ring of
polynomials with coefficients in $\mathbb{F}_p$ and $\langle f(x) \rangle$ is the ideal generated by
$f(x)$.
Example B.8. Consider $\mathbb{F}_4$. The polynomial $f(x) = x^2 + x + 1$ is irreducible
over $\mathbb{F}_2$, so we have $\mathbb{F}_4 = \mathbb{F}_2[x]/\langle x^2 + x + 1 \rangle$. This is written as the set
$\{0, 1, x, x + 1\}$ where we work under the relation $x^2 + x + 1 = 0$. Since we
are working in characteristic 2, we can write this as $x^2 = x + 1$. Then, for
example,
\[ x^3 = x(x^2) = x(x + 1) = x^2 + x = 2x + 1 \equiv 1. \]
B.5.2 Constructing $\mathbb{F}_9$
Since $9 = 3^2$, we will be working over $\mathbb{F}_3$, whose elements we represent
by 0, 1 and 2, and where addition and multiplication are done modulo 3. We
seek an extension of degree 2 over the prime field, so our first task is to find
a monic irreducible polynomial of degree 2 with coefficients in $\mathbb{F}_3$. For large
fields this can be a difficult task, and there are some theorems that
can help. However, when the prime field is small the brute force procedure is
effective. We can in fact easily list all of the monic quadratics in $\mathbb{F}_3[x]$:
(1) $x^2$
(2) $x^2 + 1$
(3) $x^2 + 2$
(4) $x^2 + x$
(5) $x^2 + x + 1$
(6) $x^2 + x + 2$
(7) $x^2 + 2x$
(8) $x^2 + 2x + 1$
(9) $x^2 + 2x + 2$
Now the problem is to find the irreducible ones in this list. Clearly, any
polynomial without a constant term is factorable ($x$ is a factor), so the first,
fourth and seventh can immediately be crossed out. For the remaining six
polynomials, we may opt for one of two procedures:
(a) We could substitute in turn, for $x$, all the elements of the prime field in
which we are working. A quadratic is reducible exactly when it has a linear
factor, so if none of these substitutions evaluates to zero (i.e. the polynomial
has no root in the field) then the polynomial is irreducible. So, for
example, substituting in the polynomial $x^2 + 2$ gives the following values:
(i) $x = 0 \implies 0^2 + 2 = 2$
(ii) $x = 1 \implies 1^2 + 2 = 0$
(iii) $x = 2 \implies 2^2 + 2 = 0$
Thus $x^2 + 2$ factors; in fact $x^2 + 2 = (x + 1)(x + 2)$. On the other hand,
the same procedure for $x^2 + 1$ gives:
(i) $x = 0 \implies 0^2 + 1 = 1$
(ii) $x = 1 \implies 1^2 + 1 = 2$
(iii) $x = 2 \implies 2^2 + 1 = 2$
meaning $x^2 + 1$ is irreducible. We could do this to each polynomial in
turn to find the irreducible ones.
(b) The second possible procedure is to take all the linear factors and multiply them in all possible pairs to get a list of all the factorable quadratics;
removing these from our list leaves all the irreducible quadratics. So
(i) $(x + 1)(x + 1) = x^2 + 2x + 1$
(ii) $(x + 1)(x + 2) = x^2 + 2$
(iii) $(x + 2)(x + 2) = x^2 + x + 1$
implying that the remaining polynomials $x^2 + 1$, $x^2 + x + 2$ and $x^2 + 2x + 2$
are the only irreducible monic quadratic polynomials in $\mathbb{F}_3[x]$.
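Procedure (a) is mechanical enough to automate. The Python sketch below applies the root test to every monic quadratic $x^2 + bx + c$ over $\mathbb{F}_p$; the function name and the $(b, c)$ encoding of a polynomial are our own choices. The test is only valid in this form for degree 2 (or 3), where reducibility is equivalent to having a root.

```python
def monic_irreducible_quadratics(p):
    """Return (b, c) for every x^2 + b*x + c that has no root in Z_p."""
    return [(b, c)
            for b in range(p)
            for c in range(p)
            if all((x * x + b * x + c) % p != 0 for x in range(p))]


if __name__ == "__main__":
    # (0, 1) is x^2 + 1, (1, 2) is x^2 + x + 2, (2, 2) is x^2 + 2x + 2.
    print(monic_irreducible_quadratics(3))
```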
We could now use any one of these polynomials to construct the field. We
would let $\rho$ be a zero of the chosen polynomial and write out the elements of
$\mathbb{F}_9$ in vector form using the basis $(1, \rho)$. For example, if we
used the polynomial $x^2 + 1$ and let $\rho$ be a root, then
\[ \mathbb{F}_9 \cong \{0, 1, 2, \rho, \rho + 1, \rho + 2, 2\rho, 2\rho + 1, 2\rho + 2\} \]
where $\rho^2 + 1 = 0$.
This, however, does not give us the most useful representation of the field.
We will use the fact that the multiplicative group of a finite field is cyclic, so there
exists a primitive element (a generator of the cyclic group) that gives
us a handy representation of the elements. The primitive elements are
to be found among the roots of the irreducible polynomials (they cannot be
elements of the prime field). The cyclic group we are after has order 8, so not
every root need be primitive. For example, $\rho$ was a root of $x^2 + 1$, so $\rho^2 + 1 = 0$
and hence $\rho^2 = -1 = 2$. We can now write out the powers of $\rho$:
(i) $\rho^1 = \rho$
(ii) $\rho^2 = 2$
(iii) $\rho^3 = \rho(\rho^2) = 2\rho$
(iv) $\rho^4 = \rho(\rho^3) = 2\rho^2 = 2(2) \equiv 1 \pmod 3$
So $\rho$ has order 4, does not generate the cyclic group of order 8, and is therefore not
a primitive element. On the other hand, consider $\mu$, a root of the polynomial
$x^2 + x + 2$. Then $\mu^2 + \mu + 2 = 0$, so $\mu^2 = -\mu - 2 = 2\mu + 1$. Now the powers of $\mu$ give us:
(i) $\mu^1 = \mu$
(ii) $\mu^2 = 2\mu + 1$
(iii) $\mu^3 = \mu(\mu^2) = \mu(2\mu + 1) = 2\mu^2 + \mu = 2(2\mu + 1) + \mu = 5\mu + 2 \equiv 2\mu + 2$
(iv) $\mu^4 = \mu(\mu^3) = 2\mu^2 + 2\mu = 4\mu + 2 + 2\mu = 6\mu + 2 \equiv 2$
(v) $\mu^5 = \mu(\mu^4) = 2\mu$
(vi) $\mu^6 = \mu(\mu^5) = 2\mu^2 = 4\mu + 2 \equiv \mu + 2$
(vii) $\mu^7 = \mu(\mu^6) = \mu^2 + 2\mu = 2\mu + 1 + 2\mu \equiv \mu + 1$
(viii) $\mu^8 = (\mu^4)^2 = 2^2 = 4 \equiv 1$
So $\mu$ is a primitive element, and we can represent the elements of $\mathbb{F}_9$ as the 8
powers of $\mu$ together with 0. Notice also that the terms on the right are all
the possible nonzero linear combinations of the basis
$(1, \mu)$ over $\mathbb{F}_3$. When working with finite fields it is convenient to have both
of the above representations, since the terms on the left are easy to multiply
and the terms on the right are easy to add. For example:
\[ (2\mu + 2)^3 = (\mu^3)^3 = \mu^9 = \mu \]
\[ (2\mu + 2)^3 + \mu + 2 = \mu + \mu + 2 = 2\mu + 2 = \mu^3 \]
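The table of powers of $\mu$ can be reproduced with a few lines of arithmetic. In the Python sketch below an element $a + b\mu$ of $\mathbb{F}_9$ is stored as the pair `(a, b)` and products are reduced with $\mu^2 = 2\mu + 1$; the pair encoding and the function names are our own assumptions for illustration.

```python
def mul(u, v):
    """Multiply a + b*mu by c + d*mu in F_9, using mu^2 = 2*mu + 1."""
    a, b = u
    c, d = v
    # (a + b mu)(c + d mu) = ac + (ad + bc) mu + bd mu^2
    #                      = (ac + bd) + (ad + bc + 2bd) mu
    return ((a * c + b * d) % 3, (a * d + b * c + 2 * b * d) % 3)


def powers_of_mu():
    """Return [mu^1, ..., mu^8] as (constant, mu-coefficient) pairs."""
    powers, x = [], (0, 1)
    for _ in range(8):
        powers.append(x)
        x = mul(x, (0, 1))
    return powers


if __name__ == "__main__":
    for k, (a, b) in enumerate(powers_of_mu(), start=1):
        print(f"mu^{k} = {b}*mu + {a}")
```

The eight pairs are all distinct and the last one is `(1, 0)`, i.e. $\mu^8 = 1$, in agreement with the list above.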
B.5.3 Constructing $\mathbb{F}_8$
Since $8 = 2^3$, the prime field is $\mathbb{F}_2$ and we need to find a monic irreducible
cubic polynomial over that field. Since the coefficients can only be 0 and 1, and
a cubic with zero constant term has $x$ as a factor,
the list of candidates is easily obtained:
(1) $x^3 + 1$
(2) $x^3 + x + 1$
(3) $x^3 + x^2 + 1$
(4) $x^3 + x^2 + x + 1$
Now substituting 0 gives 1 in all cases, and substituting 1 gives 0 exactly when
there is an odd number of terms involving $x$. So the irreducible cubics are $x^3 + x + 1$
and $x^3 + x^2 + 1$. The multiplicative group of this field is a cyclic group
of order 7, which is prime, and so every nonidentity element is a generator. Letting $\mu$ be a
root of the first polynomial, we have $\mu^3 + \mu + 1 = 0$, so $\mu^3 = \mu + 1$. The
powers of $\mu$ are:
(i) $\mu^1 = \mu$
(ii) $\mu^2 = \mu^2$
(iii) $\mu^3 = \mu + 1$
(iv) $\mu^4 = \mu(\mu^3) = \mu^2 + \mu$
(v) $\mu^5 = \mu(\mu^4) = \mu^3 + \mu^2 = \mu^2 + \mu + 1$
(vi) $\mu^6 = \mu(\mu^5) = \mu^3 + \mu^2 + \mu = \mu^2 + 2\mu + 1 = \mu^2 + 1$
(vii) $\mu^7 = \mu(\mu^6) = \mu^3 + \mu = 2\mu + 1 = 1$
So $\mu$ is a generator. Now suppose we had chosen a root of the second polynomial, say $\rho$. We would then have $\rho^3 = \rho^2 + 1$ and the powers of $\rho$
are
(i) $\rho^1 = \rho$
(ii) $\rho^2 = \rho^2$
(iii) $\rho^3 = \rho^2 + 1$
(iv) $\rho^4 = \rho(\rho^3) = \rho^3 + \rho = \rho^2 + \rho + 1$
(v) $\rho^5 = \rho(\rho^4) = \rho^3 + \rho^2 + \rho = 2\rho^2 + \rho + 1 \equiv \rho + 1$
(vi) $\rho^6 = \rho(\rho^5) = \rho^2 + \rho$
(vii) $\rho^7 = \rho(\rho^6) = \rho^3 + \rho^2 = 2\rho^2 + 1 \equiv 1$
We know that these two representations must be isomorphic, and in fact an
isomorphism is given by the map $\mu \mapsto \rho^6$.
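Both tables of powers, and the claim that $\mu \mapsto \rho^6$ is compatible with the defining relations, can be checked mechanically. The Python sketch below encodes a polynomial over $\mathbb{F}_2$ as a bit mask (bit $i$ is the coefficient of $x^i$), so $x^3 + x + 1$ is `0b1011` and $x^3 + x^2 + 1$ is `0b1101`. This encoding and the function names are our own assumptions.

```python
def gf8_mul(a, b, modulus):
    """Multiply two elements of F_2[x]/<modulus>, each encoded as a bit mask."""
    result = 0
    while b:
        if b & 1:
            result ^= a          # addition in characteristic 2 is XOR
        b >>= 1
        a <<= 1                  # multiply a by x
        if a & 0b1000:           # degree reached 3: reduce by the modulus
            a ^= modulus
    return result


def gf8_pow(a, k, modulus):
    """Return a^k by repeated multiplication (k >= 0)."""
    result = 1
    for _ in range(k):
        result = gf8_mul(result, a, modulus)
    return result


if __name__ == "__main__":
    x = 0b010
    print([bin(gf8_pow(x, k, 0b1011)) for k in range(1, 8)])  # powers of mu
    print([bin(gf8_pow(x, k, 0b1101)) for k in range(1, 8)])  # powers of rho
    rho6 = gf8_pow(x, 6, 0b1101)
    # rho^6 should satisfy t^3 + t + 1 = 0, the relation defining mu.
    print((gf8_pow(rho6, 3, 0b1101) ^ rho6 ^ 1) == 0)
```

Since $\rho^6$ is a root of $x^3 + x + 1$ inside the second representation, sending $\mu$ to $\rho^6$ respects the relation $\mu^3 = \mu + 1$.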
B.5.4 Addition and multiplication tables of $\mathbb{F}_4$
Earlier we showed that $\mathbb{F}_4 = \{0, 1, w, w + 1\}$ where $w^2 + w + 1 = 0$, which
in turn implied that $w^3 = 2w + 1 \equiv 1$. We now construct the addition and
multiplication tables:
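$0 \times x = 0 \quad \forall x \in \mathbb{F}_4$
$1 \times x = x \quad \forall x \in \mathbb{F}_4$
$w \times w = w^2 = -w - 1 \equiv w + 1$
$w \times (w + 1) = w^2 + w = 2w + 1 = 1$
$(w + 1) \times (w + 1) = w^2 + 2w + 1 = 3w + 2 = w$

$0 + x = x \quad \forall x \in \mathbb{F}_4$
$1 + 1 = 2 \equiv 0$
$1 + w = w + 1$
$1 + (w + 1) = w + 2 \equiv w$
$w + w = 2w \equiv 0$
$w + (w + 1) = 2w + 1 \equiv 1$
$(w + 1) + (w + 1) = 2w + 2 \equiv 0$

So

×    | 0    1    w    w+1
0    | 0    0    0    0
1    | 0    1    w    w+1
w    | 0    w    w+1  1
w+1  | 0    w+1  1    w

+    | 0    1    w    w+1
0    | 0    1    w    w+1
1    | 1    0    w+1  w
w    | w    w+1  0    1
w+1  | w+1  w    1    0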
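The two tables for $\mathbb{F}_4$ can also be generated programmatically. In the Python sketch below the element $a + bw$ is encoded as the integer $a + 2b$, so the four elements are 0, 1, 2, 3; this encoding and the names `add`, `mul`, `table` and `NAMES` are our own choices for illustration.

```python
NAMES = ["0", "1", "w", "w+1"]  # index a + 2*b stands for a + b*w


def add(u, v):
    """Add in F_4: coefficients are added modulo 2, which is bitwise XOR."""
    return u ^ v


def mul(u, v):
    """Multiply in F_4, reducing with w^2 = w + 1."""
    result = 0
    for shift in range(2):
        if (v >> shift) & 1:
            result ^= u << shift
    if result & 0b100:       # a w^2 term is present
        result ^= 0b111      # subtract w^2 + w + 1, which is zero in F_4
    return result


def table(op):
    """Return the 4x4 table of op as a list of rows of element names."""
    return [[NAMES[op(u, v)] for v in range(4)] for u in range(4)]


if __name__ == "__main__":
    for op in (mul, add):
        for row in table(op):
            print("  ".join(f"{name:<4}" for name in row))
        print()
```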
B.6 Miscellaneous
• The $n$th roots of unity are the complex numbers which yield 1 when raised
to a given power $n$. So they are the complex numbers $z$ which solve
\[ z^n = 1, \qquad n = 1, 2, ... \]
The $n$th roots of unity form, under multiplication, a cyclic group of order $n$.
A generator for this group is a primitive $n$th root of unity. The primitive
$n$th roots of unity are
\[ e^{2\pi i k/n} \quad \text{where } k \text{ and } n \text{ are coprime.} \]
Example B.9. The third roots (cube roots) of unity are
\[ 1, \quad \frac{-1 + \sqrt{3}\,i}{2}, \quad \frac{-1 - \sqrt{3}\,i}{2} \]
where $i$ is the imaginary unit. The latter two roots are primitive.
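The formula $e^{2\pi i k/n}$ translates directly into code. The following Python sketch uses floating point complex arithmetic, so equalities hold only up to rounding; the function names are our own.

```python
import cmath
from math import gcd


def roots_of_unity(n):
    """Return the n complex n-th roots of unity, e^(2*pi*i*k/n) for k = 0..n-1."""
    return [cmath.exp(2j * cmath.pi * k / n) for k in range(n)]


def primitive_roots_of_unity(n):
    """Keep only the roots whose exponent k is coprime to n."""
    return [cmath.exp(2j * cmath.pi * k / n)
            for k in range(n) if gcd(k, n) == 1]


if __name__ == "__main__":
    for z in roots_of_unity(3):
        print(z, abs(z ** 3 - 1) < 1e-9)
    print(len(primitive_roots_of_unity(3)))
```

For $n = 3$ the roots printed agree, up to rounding, with the three values of Example B.9, and two of them are primitive.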
• The kernel of a homomorphism measures the degree to which the
homomorphism fails to be injective. Let G and H be groups and let f be a
group homomorphism from G to H. If eH is the identity element of H, then
the kernel of f is the set
{g ∈ G | f (g) = eH }
This is the subset of G consisting of all those elements of G that are mapped
by f to the element eH . The kernel is usually denoted ker(f).
Since a group homomorphism preserves identity elements, the identity
element eG of G must belong to the kernel. The homomorphism f is injective
if and only if its kernel contains just one element, eG .
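• Let $p$ be a prime number and $x$ an integer. The Legendre symbol is
then defined as
\[ \left(\frac{x}{p}\right) = \begin{cases} +1 & \text{if } t^2 \equiv x \pmod p \text{ has a solution } t \not\equiv 0 \pmod p \\ -1 & \text{if } t^2 \equiv x \pmod p \text{ has no solution } t \\ 0 & \text{if } x \equiv 0 \pmod p \end{cases} \]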
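For small primes the Legendre symbol can be evaluated straight from this definition by trying every $t$. The Python sketch below does exactly that; the function name `legendre` is our own, and the brute-force search is only sensible for small $p$.

```python
def legendre(x, p):
    """Evaluate the Legendre symbol (x/p) from the definition, for a prime p."""
    x %= p
    if x == 0:
        return 0
    # Look for a nonzero t with t^2 = x (mod p).
    has_root = any(t * t % p == x for t in range(1, p))
    return 1 if has_root else -1


if __name__ == "__main__":
    print([legendre(x, 7) for x in range(7)])
```

Modulo 7 the nonzero squares are 1, 2 and 4, so those residues give $+1$ and the residues 3, 5 and 6 give $-1$.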
Theorem B.16. Suppose the roots of a monic cubic polynomial sum to give a value
$V$. Then $-V$ is the coefficient of the $x^2$ term in the cubic.
Proof Let the three roots of the cubic be $a$, $b$ and $c$. We can then represent
the cubic as
\[ (x - a)(x - b)(x - c) = (x - a)(x^2 - bx - cx + bc) \]
\[ = x^3 - bx^2 - cx^2 + bcx - ax^2 + abx + acx - abc \]
\[ = x^3 - (a + b + c)x^2 + (ab + ac + bc)x - abc \]
So clearly the coefficient of the $x^2$ term is the negative of the sum of the
roots.
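The expansion in the proof can be repeated numerically for any three roots. The Python sketch below multiplies out $(x - a)(x - b)(x - c)$ one factor at a time, storing coefficients from the highest power down; the function name and coefficient ordering are our own choices.

```python
def monic_cubic_from_roots(a, b, c):
    """Return [1, c2, c1, c0] with x^3 + c2*x^2 + c1*x + c0 = (x-a)(x-b)(x-c)."""
    coeffs = [1]
    for root in (a, b, c):
        # Multiply the current polynomial by (x - root):
        # coeffs + [0] is the polynomial times x, [0] + coeffs is it unshifted.
        coeffs = [hi - root * lo
                  for hi, lo in zip(coeffs + [0], [0] + coeffs)]
    return coeffs


if __name__ == "__main__":
    print(monic_cubic_from_roots(1, 2, 3))
```

For the roots 1, 2, 3 the $x^2$ coefficient is $-6 = -(1 + 2 + 3)$, as the theorem states.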
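Theorem B.17. Let $M$ and $N$ be arbitrary $2 \times 2$ matrices:
\[ M = \begin{pmatrix} A & B \\ C & D \end{pmatrix}, \qquad N = \begin{pmatrix} W & X \\ Y & Z \end{pmatrix}. \]
Define
\[ \widetilde{N} = \begin{pmatrix} Z & -X \\ -Y & W \end{pmatrix}. \]
Then
(i) $\mathrm{Tr}(M\widetilde{N}) = \det(M + N) - \det(M) - \det(N)$
(ii) $\det(aM + bN) - a^2\det(M) - b^2\det(N) = ab[\det(M + N) - \det(M) - \det(N)]$
Proof (i)
\[ M\widetilde{N} = \begin{pmatrix} AZ - BY & BW - AX \\ CZ - DY & DW - CX \end{pmatrix} \]
\[ \therefore \mathrm{Tr}(M\widetilde{N}) = AZ - BY + DW - CX \]
\[ M + N = \begin{pmatrix} A + W & B + X \\ C + Y & D + Z \end{pmatrix} \]
\[ \therefore \det(M + N) = (A + W)(D + Z) - (B + X)(C + Y) \]
\[ = AD + AZ + WD + WZ - BC - BY - XC - XY \]
\[ \det(M) = AD - BC, \qquad \det(N) = WZ - XY \]
Therefore
\[ \det(M + N) - \det(M) - \det(N) = AD + AZ + WD + WZ - BC - BY - XC - XY - AD + BC - WZ + XY \]
\[ = AZ + WD - BY - XC \]
\[ = AZ - BY + DW - CX = \mathrm{Tr}(M\widetilde{N}) \]
Proof (ii)
\[ aM + bN = \begin{pmatrix} aA + bW & aB + bX \\ aC + bY & aD + bZ \end{pmatrix} \]
\[ \therefore \det(aM + bN) = (aA + bW)(aD + bZ) - (aC + bY)(aB + bX) \]
\[ = a^2 AD + abAZ + abWD + b^2 WZ - a^2 BC - abCX - abYB - b^2 YX \]
So the LHS of the identity is
\[ \mathrm{LHS} = \det(aM + bN) - a^2\det(M) - b^2\det(N) \]
\[ = abAZ + abWD - abCX - abYB \]
\[ = ab[AZ - BY + DW - CX] \]
Then the RHS is
\[ \mathrm{RHS} = ab[\det(M + N) - \det(M) - \det(N)] \]
\[ = ab\,\mathrm{Tr}(M\widetilde{N}) \quad \text{by part (i)} \]
\[ = ab \times \mathrm{Tr}\begin{pmatrix} AZ - BY & BW - AX \\ CZ - DY & DW - CX \end{pmatrix} \]
\[ = ab[AZ - BY + DW - CX] = \mathrm{LHS} \]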
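Both identities of Theorem B.17 are polynomial identities in the matrix entries, so they can be spot-checked with exact integer arithmetic. The Python sketch below stores a $2 \times 2$ matrix as a pair of row tuples; all helper names are our own and the check is only a sanity test, not a proof.

```python
def det(m):
    (a, b), (c, d) = m
    return a * d - b * c


def trace(m):
    return m[0][0] + m[1][1]


def add(m, n):
    return tuple(tuple(p + q for p, q in zip(r, s)) for r, s in zip(m, n))


def scale(k, m):
    return tuple(tuple(k * p for p in r) for r in m)


def mat_mul(m, n):
    (a, b), (c, d) = m
    (w, x), (y, z) = n
    return ((a * w + b * y, a * x + b * z), (c * w + d * y, c * x + d * z))


def tilde(n):
    """Return the matrix written N-tilde in the theorem."""
    (w, x), (y, z) = n
    return ((z, -x), (-y, w))


def identity_i(m, n):
    return trace(mat_mul(m, tilde(n))) == det(add(m, n)) - det(m) - det(n)


def identity_ii(a, b, m, n):
    lhs = det(add(scale(a, m), scale(b, n))) - a * a * det(m) - b * b * det(n)
    rhs = a * b * (det(add(m, n)) - det(m) - det(n))
    return lhs == rhs


if __name__ == "__main__":
    M, N = ((1, 2), (3, 4)), ((5, 6), (7, 8))
    print(identity_i(M, N), identity_ii(2, 3, M, N))
```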
Appendix C
Matlab Code
This Appendix contains the code for all the Matlab programs that were
constructed during the course of this project. Below is a table summarising
the programs.
Appendix   Code       Description
C.1        ECAD.m     Performs elliptic curve addition over the real numbers.
C.2        PC.m       Finds all the points on a prime curve, and plots them.
C.3        ECADP.m    Performs elliptic curve addition over a prime curve.
C.4        inve.m     Finds the inverse of an element in Zp for p prime.
C.5        SUCDOB.m   Performs the successive doubling algorithm.
C.6        check.m    Checks whether a point lies on a particular prime curve.
C.7        RR44.m     Performs the recurrence relation of Lemma A.16.
C.1 The Matlab code for ECAD.m
Below is the Matlab code for ECAD.m, which performs elliptic curve
addition over the real numbers.
Let $E$ be the elliptic curve $y^2 = x^3 + Ax + B$ and let $P_1 = (x_1, y_1)$,
$P_2 = (x_2, y_2)$. The m-file will then produce
\[ P_1 + P_2 = P_3 = (x_3, y_3) \]
where $+$ is the elliptic curve addition operation over $E$. The user must input
the coordinates $x_1$, $y_1$, $x_2$, $y_2$ and, if $P_1 = P_2$, also the parameter $A$. The
m-file will then produce $x_3$, $y_3$ and, if requested, the value of $m$, the slope of
the line used in the addition.
function [x3,y3,m] = ECAD(x1,y1,x2,y2,A)
% This function m-file performs the Elliptic Curve addition
% operation over the real numbers.
% Suppose we are working on the elliptic curve y^2 = x^3 + Ax + B
% Define P1 = (x1,y1)
%        P2 = (x2,y2)
% Then P1 + P2 = P3 = (x3,y3) is defined as below
% If one of the points is infinity then we define P + infinity = P
% The user should type in 'infinity' for both the x and y values.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
m = [];   % slope; stays empty when no line is needed

if strcmp(x1,'infinity')      % P1 is the point at infinity
    x3=x2; y3=y2;
    return
end
if strcmp(x2,'infinity')      % P2 is the point at infinity
    x3=x1; y3=y1;
    return
end
if x1==x2
    if y1==y2
        if y1==0              % the tangent at P1 is vertical
            disp('P3 is infinity')
            x3='infinity'; y3='infinity';
            return
        end
        % P1 = P2: m is the slope of the tangent at P1
        m = sym( (3*(x1)^2 + A)/(2*(y1)) );
        x3 = sym( m^2 - x1 - x2);
        y3 = sym( m*(x1 - x3) - y1 );
        return
    end
    % Same x but different y: the line through P1 and P2 is vertical
    disp('P3 is infinity')
    x3='infinity'; y3='infinity';
    return
end
% Distinct x values: m is the slope of the chord through P1 and P2
m = sym( (y2-y1)/(x2-x1) );
x3 = sym( m^2 - x1 - x2 );
y3 = sym( m*(x1 - x3) - y1 );
C.2 The Matlab code for PC.m
Below is the Matlab code for PC.m, which will find and plot all the points
on a specific prime curve. This m-file takes as its inputs $A$, $B$ and $p$ and
produces two vectors X, Y which contain all the points $(x, y)$ that lie on
\[ y^2 \equiv x^3 + Ax + B \pmod p \]
function [X,Y,n] = PC(A,B,p)
% This function m-file finds and plots all the points that lie in E_p(A,B)
% These points are on the curve y^2 = x^3 + Ax + B (mod p)

RHS = zeros(p,1);
LHS = zeros(p,1);
X = [];
Y = [];

% Tabulate x^3 + Ax + B and y^2 modulo p for every residue 0..p-1
for i=0:1:(p-1)
    RHS(i+1) = mod((i)^3 + A*(i) + B, p);
    LHS(i+1) = mod((i)^2, p);
end

ii=1;
% For each residue z, pair every x with RHS = z with every y with LHS = z
for z=0:1:(p-1)
    I=find(RHS==z);
    J=find(LHS==z);
    if ~isempty(I) && ~isempty(J)
        for h=1:length(I)
            for g=1:length(J)
                X(ii)=I(h)-1;
                Y(ii)=J(g)-1;
                ii=ii+1;
            end
        end
    end
end
n=length(X) + 1;   % points found, plus one for the point at infinity

%%%%%%%PLOTTING%%%%%%%%%%%
h=plot(X,Y,'ko');
set(h(1),'LineWidth',1.5)
axis([0, (max(X)+1), 0,(max(Y)+1) ])
xlabel('X','FontSize',15,'FontWeight','bold')
ylabel('Y','FontSize',15,'FontWeight','bold')
title(['The points in E_{',int2str(p),'}(',int2str(A),',',int2str(B),')'], ...
    'FontSize',12,'FontWeight','bold')
C.3 The Matlab code for ECADP.m
Below is the Matlab code for ECADP.m, which is a modified version of
ECAD.m for use with prime curves. It takes the same inputs and produces the same outputs
as ECAD.m, but the user must input $p$ in addition. It makes use of the m-file
inve.m, which is listed in Appendix C.4.
function [x3,y3,m] = ECADP(x1,y1,x2,y2,A,p)
% This function m-file performs Elliptic Curve addition over prime curves.
% Suppose we are working on the elliptic curve y^2 = x^3 + Ax + B (mod p)
% Define P1 = (x1,y1)
%        P2 = (x2,y2)
% Then P1 + P2 = P3 = (x3,y3) is defined as below
% If one of the points is infinity then we define P + infinity = P
% and the user should type in 'infinity' for both the x and y values
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
m = [];   % slope; stays empty when no line is needed

if strcmp(x1,'infinity')      % P1 is the point at infinity
    x3=x2; y3=y2;
    return
end
if strcmp(x2,'infinity')      % P2 is the point at infinity
    x3=x1; y3=y1;
    return
end
if x1==x2
    if y1==y2
        if y1==0              % the tangent at P1 is vertical
            disp('P3 is infinity')
            x3='infinity'; y3='infinity';
            return
        end
        % P1 = P2: tangent slope, with division replaced by the inverse mod p
        mnum = 3*(x1)^2 + A;
        mden = 2*(y1);
        m = mod( (mnum * inve(mden,p)) , p );
        x3 = mod( (m^2 - x1 - x2) , p);
        y3 = mod( (m*(x1 - x3) - y1) , p);
        return
    end
    % Same x but different y: the line through P1 and P2 is vertical
    disp('P3 is infinity')
    x3='infinity'; y3='infinity';
    return
end
% Distinct x values: chord slope, again using the inverse mod p
mnum = y2 - y1;
mden = x2 - x1;
m = mod( (mnum * inve(mden,p)) , p);
x3 = mod( (m^2 - x1 - x2) , p);
y3 = mod( (m*(x1 - x3) - y1) , p);
C.4 The Matlab code for inve.m
Below is the Matlab code for inve.m, which finds the multiplicative inverse of an element
$N$ in $\mathbb{Z}_p$. This is used for working with prime curves, where we
reduce modulo $p$. The user must input the element $N$ and the prime $p$.
function [I] = inve(N,p)
% This m-file finds the inverse of an element, N, in the group Z_p
% for use with prime curves.
N = mod(N,p);
H = zeros(p-1,1);
% Record N*i modulo p for every nonzero residue i
for i = 1:(p-1)
    H(i) = mod(N*i,p);
end
I = find(H==1);   % the i with N*i = 1 (mod p); empty if N = 0 (mod p)
C.5 The Matlab code for SUCDOB.m
Below is the Matlab code for SUCDOB.m, which performs the successive doubling algorithm over prime curves. This m-file takes as its inputs X1, Y1, k, A, p
and outputs X2, Y2 where
\[ (X2, Y2) = k(X1, Y1) = (X1, Y1) + (X1, Y1) + ... + (X1, Y1) \quad (k \text{ summands}) \]
and addition is performed over the elliptic curve
\[ y^2 \equiv x^3 + Ax + B \pmod p \]
function [X2,Y2] = SUCDOB(X1,Y1,k,A,p)
% This is a function m-file to perform the successive doubling algorithm
% on prime curves. If P = (X1,Y1) and k is a non-negative integer, then this
% algorithm will find kP = (X2,Y2) where we are operating over the elliptic curve
% y^2 = x^3 + Ax + B (mod p), p prime

a = k;
% Throughout the loop kP = B + a*C, where B = (BX,BY) and C = (CX,CY)
BX = 'infinity';
BY = 'infinity';
CX = X1;
CY = Y1;
while a~=0
    if mod(a,2) == 0
        % a even: halve a and double C
        a = a/2;
        [CX,CY] = ECADP(CX,CY,CX,CY,A,p);
    else
        % a odd: reduce a by one and add C to B
        a = a-1;
        [BX,BY] = ECADP(BX,BY,CX,CY,A,p);
    end
end
X2 = BX;
Y2 = BY;
C.6 The Matlab code for check.m
Below is the Matlab code for check.m, which checks whether a specific point
lies on a prime curve. This m-file takes as its inputs x, y, A, B, p and checks
whether the point $(x, y)$ lies on the curve
\[ y^2 \equiv x^3 + Ax + B \pmod p \]
function [flag] = check(x,y,A,B,p)
% An m-file to check if the point (x,y) lies on the prime curve
% y^2 = x^3 + Ax + B (mod p)
R = mod(x^3 + A*x + B, p);   % right-hand side of the curve equation
L = mod(y^2, p);             % left-hand side of the curve equation
if L == R
    flag = 'YES';
    disp('This point lies on the curve')
else
    flag = 'NO';
    disp('This point does not lie on the curve')
end
C.7 The Matlab code for RR44.m
Below is the Matlab code for RR44.m, which performs the recurrence relation
of Lemma A.16. It takes as its inputs $n$, $q$ and $\#E(\mathbb{F}_q)$, and outputs $s(n)$,
where $s(n)$ is defined by the recurrence relation of Lemma A.16:
\[ s(0) = 2, \quad s(1) = a, \quad s(n + 1) = a\,s(n) - q\,s(n - 1) \]
function [A] = RR44(n,q,EFQ)
% Function m-file to calculate s(n) where s is defined by
% s(0)=2, s(1)=a, s(n+1) = as(n) - qs(n-1)
% a = q + 1 - #E(F_q)
% Inputs - n,q & EFQ = #E(F_q)
% Outputs - A = s(n)
a = q + 1 - EFQ;
s = zeros(n+1,1);
s(1) = 2;   % s(0); Matlab indices start at 1, so s(k) is stored in s(k+1)
s(2) = a;   % s(1)
for i = 3:n+1
    s(i) = a*s(i-1) - q*s(i-2);
end
A = s(n+1);