FINDING THE ROOTS OF CUBIC AND QUARTIC
POLYNOMIALS BY RADICALS
B. A. BAILEY
Abstract. In this document, we give solutions to the problems of solving cubic and quartic equations by means of radicals. The presented solutions rely on the work of the mathematicians Cardan, Vieta, Descartes,
and de Moivre. This document is intended to be read and studied by
students who are thoroughly grounded in algebra (including that of complex numbers) and trigonometry.
A note to the student: do not read this document quickly. If you do not
verify all the claims made (and there is plenty to verify), you will most likely
not have a full understanding or appreciation of the content.
1. Introduction
In this document we present solutions to the following problems:
Problem 1. Let $a$, $b$, and $c$ be complex numbers. Find an algorithm to
determine formulas (using radicals) for the roots of $z^3 + az^2 + bz + c$. In
other words, what is a procedure to find the values $z_0$, $z_1$, and $z_2$ such that
$z^3 + az^2 + bz + c = (z - z_0)(z - z_1)(z - z_2)$ for all $z \in \mathbb{C}$?
Problem 2. Let $a$, $b$, $c$, and $d$ be complex numbers. Find an algorithm to
determine formulas (using radicals) for the roots of $z^4 + az^3 + bz^2 + cz + d$.
In other words, what is a procedure to find the values $z_0$, $z_1$, $z_2$, and $z_3$ such
that $z^4 + az^3 + bz^2 + cz + d = (z - z_0)(z - z_1)(z - z_2)(z - z_3)$ for all $z \in \mathbb{C}$?
We content ourselves with describing algorithms rather than the formulas that arise from implementing them. This is because the formulas, while
expressible in terms of radicals, are very long and complicated, and convey
almost no intuition as to their origin.
Though the history of Problems 1 and 2 is extensive, we will not discuss
it in detail here (see [1] for more information). We only touch lightly on the
history of the specific methods that are presented in this document, which
are discussed below in order of logical flow.
When 16th century Italian algebraists attempted to find all the roots of
generic cubic and quartic equations, the notion of complex numbers quickly
arose, but it was not clear at the time how such quantities could be accurately reasoned with and manipulated. Abraham de Moivre (1667-1754)
shed significant light on the topic by discovering what is now known as de
Moivre’s formula. Among other things, de Moivre’s formula clarifies how
root extraction of complex numbers (which is necessary to solve any high
degree polynomial with radicals) may be meaningfully accomplished. Another key in finding the roots of a cubic or a quartic polynomial is the
conversion of the original polynomial to another one of the same degree but
with fewer coefficients (Girolamo Cardano (Cardan) (1501-1576)). We can
then relate the roots of this simpler polynomial to the roots of the original
one.
The solution to Problem 1 that we present is based on Vieta’s substitution (François Viète (Vieta) (1540-1603)), which is at the heart of all
methods of solution of generic cubic equations by radicals. This substitution
cleverly reduces the problem of finding an initial root of the cubic equation
(the hard step) to solving a related quadratic equation.
For Problem 2, we employ a technique of René Descartes (1596-1650) (see
[1]), which reduces the difficult case of Problem 2 to solving a particular cubic equation. As far as the author is aware, every method to solve a quartic
equation by radicals involves solving some auxiliary cubic equation (the
particular cubic depends on the method). What distinguishes Descartes’
technique is the surprisingly intuitive and natural way in which this cubic
equation is discovered.
In section 2, we present algorithms which solve Problems 1 and 2. We
follow with section 3, where the necessary groundwork is laid, and conclude
with section 4, where the algorithms are derived and discussed as needed.
2. Algorithms to solve problems 1 and 2
2.1. An algorithm to find the roots of $f(z) = z^3 + az^2 + bz + c$.
Consider the cubic equation
\[
g(u) = u^3 + pu + q \tag{1}
\]
where
\[
p = -\frac{a^2}{3} + b, \qquad q = \frac{2a^3}{27} - \frac{ab}{3} + c. \tag{2}
\]
If the complex roots of $g(u)$ are $u_0$, $u_1$, and $u_2$, then the roots of $f(z)$ are
\[
z_0 = -\frac{a}{3} + u_0, \qquad z_1 = -\frac{a}{3} + u_1, \qquad z_2 = -\frac{a}{3} + u_2. \tag{3}
\]
Here is how $u_0$, $u_1$, and $u_2$ may be found:
Case 1) $p = 0$.
In this case, $u_0$, $u_1$, and $u_2$ (the roots of $g(u)$) are the three complex cube
roots of $-q$, which are well defined.
Case 2) $p \neq 0$.
Let $\omega$ be a complex number satisfying $\omega^2 + q\omega - \frac{p^3}{27} = 0$. Note $\omega$ may be
found by the quadratic formula. If $t_0$, $t_1$, and $t_2$ are the complex cube roots
of $\omega$ (which are well defined), then $u_0$, $u_1$, and $u_2$ (the roots of $g(u)$) are
\[
u_0 = t_0 - \frac{p}{3t_0}, \qquad u_1 = t_1 - \frac{p}{3t_1}, \qquad u_2 = t_2 - \frac{p}{3t_2}. \tag{4}
\]
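The steps of this section can be sketched numerically. The following Python sketch is an illustration only: the function names `cube_roots` and `solve_cubic` are hypothetical, floating-point arithmetic replaces exact radicals, and the exact test `p == 0` is an assumption that is only reliable for exactly representable inputs.

```python
import cmath

def cube_roots(w):
    """Return the three complex cube roots of w (all zero when w == 0)."""
    if w == 0:
        return [0j, 0j, 0j]
    r, theta = cmath.polar(w)
    return [cmath.rect(r ** (1 / 3), (theta + 2 * cmath.pi * k) / 3)
            for k in range(3)]

def solve_cubic(a, b, c):
    """Roots of z^3 + a z^2 + b z + c, following equations (1)-(4)."""
    p = -a * a / 3 + b                        # equation (2)
    q = 2 * a ** 3 / 27 - a * b / 3 + c       # equation (2)
    if p == 0:
        us = cube_roots(-q)                   # Case 1
    else:
        # Case 2: pick one root of w^2 + q w - p^3/27 = 0.
        # Either root is allowed; we take the one of larger modulus
        # to limit floating-point cancellation (our choice, not the text's).
        s = cmath.sqrt(q * q + 4 * p ** 3 / 27)
        w = (-q + s) / 2 if abs(-q + s) >= abs(-q - s) else (-q - s) / 2
        us = [t - p / (3 * t) for t in cube_roots(w)]   # equation (4)
    return [u - a / 3 for u in us]            # equation (3)
```

For instance, `solve_cubic(-6, 11, -6)` should return values close to 1, 2, and 3 in some order, since $z^3 - 6z^2 + 11z - 6 = (z-1)(z-2)(z-3)$.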
2.2. An algorithm to find the roots of $f(z) = z^4 + az^3 + bz^2 + cz + d$.
Consider the quartic equation
\[
g(u) = u^4 + pu^2 + qu + r \tag{5}
\]
where
\[
p = -\frac{3a^2}{8} + b, \qquad q = \frac{a^3}{8} - \frac{ab}{2} + c, \qquad r = -\frac{3a^4}{256} + \frac{ba^2}{16} - \frac{ca}{4} + d. \tag{6}
\]
If the complex roots of $g(u)$ are $u_0$, $u_1$, $u_2$, and $u_3$, then the roots of $f(z)$
are
\[
z_0 = -\frac{a}{4} + u_0, \qquad z_1 = -\frac{a}{4} + u_1, \qquad z_2 = -\frac{a}{4} + u_2, \qquad z_3 = -\frac{a}{4} + u_3. \tag{7}
\]
Here is how $u_0$, $u_1$, $u_2$, and $u_3$ may be found:
Case 1) $q = 0$.
The equation $u^4 + pu^2 + r = 0$ is quadratic in $u^2$. Solve for $u^2$ via the
quadratic formula, and take both square roots of each answer. These four
complex numbers $u_0$, $u_1$, $u_2$, and $u_3$ are the roots of $g(u)$.
Case 2) $q \neq 0$.
Let $\omega$ be a complex number satisfying
\[
\omega^3 + 2p\omega^2 + (p^2 - 4r)\omega - q^2 = 0.
\]
We can find such a value by the algorithm of section 2.1. Let $k$ be a complex
square root of $\omega$. Since $q \neq 0$, we have $\omega \neq 0$ and hence $k \neq 0$. Define
\[
m = \frac{1}{2}\left(p + k^2 + \frac{q}{k}\right), \qquad n = \frac{1}{2}\left(p + k^2 - \frac{q}{k}\right).
\]
The roots of $g(u)$ are the four complex roots $u_0$, $u_1$, $u_2$, and $u_3$ of the two
quadratic expressions $u^2 + ku + n$ and $u^2 - ku + m$, which may be found
by the quadratic formula.
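A numerical sketch of the quartic procedure may help when checking the steps by hand. The Python below is an illustration under assumptions: the names `one_cubic_root`, `quadratic_roots`, and `solve_quartic` are hypothetical, floating-point arithmetic stands in for exact radicals, and the exact test `q == 0` is only reliable for exactly representable inputs.

```python
import cmath

def one_cubic_root(a, b, c):
    """One root of z^3 + a z^2 + b z + c, via the algorithm of section 2.1."""
    p = -a * a / 3 + b
    q = 2 * a ** 3 / 27 - a * b / 3 + c
    if p == 0:
        r, theta = cmath.polar(-q)
        u = cmath.rect(r ** (1 / 3), theta / 3)
    else:
        s = cmath.sqrt(q * q + 4 * p ** 3 / 27)
        # Either root of the quadratic works; take the larger in modulus.
        w = (-q + s) / 2 if abs(-q + s) >= abs(-q - s) else (-q - s) / 2
        r, theta = cmath.polar(w)
        t = cmath.rect(r ** (1 / 3), theta / 3)
        u = t - p / (3 * t)
    return u - a / 3

def quadratic_roots(b, c):
    """Both roots of u^2 + b u + c."""
    s = cmath.sqrt(b * b - 4 * c)
    return [(-b + s) / 2, (-b - s) / 2]

def solve_quartic(a, b, c, d):
    """Roots of z^4 + a z^3 + b z^2 + c z + d, following equations (5)-(7)."""
    p = -3 * a * a / 8 + b                                  # equation (6)
    q = a ** 3 / 8 - a * b / 2 + c
    r = -3 * a ** 4 / 256 + b * a * a / 16 - c * a / 4 + d
    if q == 0:
        # Case 1: quadratic in u^2, then both square roots of each value.
        us = [sign * cmath.sqrt(v)
              for v in quadratic_roots(p, r) for sign in (1, -1)]
    else:
        # Case 2: one root w of w^3 + 2p w^2 + (p^2 - 4r) w - q^2 = 0.
        w = one_cubic_root(2 * p, p * p - 4 * r, -q * q)
        k = cmath.sqrt(w)
        m = (p + k * k + q / k) / 2
        n = (p + k * k - q / k) / 2
        us = quadratic_roots(k, n) + quadratic_roots(-k, m)
    return [u - a / 4 for u in us]                          # equation (7)
```

For example, `solve_quartic(-10, 35, -50, 24)` falls under Case 1 and should return values close to 1, 2, 3, and 4 in some order.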
2.3. Comments on the cubic and quartic algorithms.
In the case of the cubic algorithm, we have two choices for ω. For the
quartic algorithm, we have three choices for ω, and for each of these, there
are two choices for k. What does this say about the uniqueness of our final
z values? If a polynomial of degree n can be factored into n linear terms, it
can be done in only one way, up to the order in which the terms are written.
Because of this, we can be sure that our final z values do not depend on our
choices of ω and k.
As it happens, the fundamental theorem of algebra (FTA) ensures that
any polynomial with complex coefficients can be written as a product of
linear factors. This result has a long history, and was essentially proven by
Gauss in 1799. However, a discussion of why FTA is true is beyond the
scope of this document.
3. Necessary background for the justification of the cubic
and quartic algorithms
3.1. de Moivre’s formula and nth roots of complex numbers.
The purpose of this section is to demonstrate that complex numbers have
complex $n$th roots. To this end, we show that each nonzero complex number $a + ib$ has $n$ distinct complex $n$th roots (and no others) which can be
computed by the functions sine and cosine. Note that this also shows that
$z^n - (a + ib)$ has $n$ distinct complex roots when $a + ib$ is nonzero.
Let $n$ be a positive integer, $\theta$ be an angle, and $r$ be a positive real number.
The equality below is known as de Moivre’s formula:
\[
(r\cos(\theta) + ir\sin(\theta))^n = r^n(\cos(n\theta) + i\sin(n\theta)), \qquad n \geq 1. \tag{8}
\]
Exercise 1: Use mathematical induction to prove de Moivre’s formula.
If we adopt the notation
\[
\operatorname{cis}(\theta) := \cos(\theta) + i\sin(\theta),
\]
then (8) becomes
\[
(r\operatorname{cis}(\theta))^n = r^n\operatorname{cis}(n\theta). \tag{9}
\]
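Recall that any nonzero complex number $a + ib$, where $a$ and $b$ are real, can
be written in polar form:
\[
a + ib = r(\cos(\theta) + i\sin(\theta)) = r\operatorname{cis}(\theta),
\]
where $r = \sqrt{a^2 + b^2}$, and $\theta$ is the angle in $[0, 2\pi)$ that satisfies
\[
\cos(\theta) = \frac{a}{\sqrt{a^2 + b^2}} \quad\text{and}\quad \sin(\theta) = \frac{b}{\sqrt{a^2 + b^2}}.
\]
Given $0 \neq a + ib = r\operatorname{cis}(\theta)$, define the quantity
\[
\omega_k = r^{1/n}\operatorname{cis}\left(\frac{\theta + 2\pi k}{n}\right)
\]
for each integer $k$ from $0$ to $n - 1$. Notice that $\omega_0, \omega_1, \dots, \omega_{n-1}$ is a list of
$n$ distinct complex numbers. Applying (9), we have
\[
(\omega_k)^n = \left(r^{1/n}\operatorname{cis}\left(\frac{\theta + 2\pi k}{n}\right)\right)^n = (r^{1/n})^n\operatorname{cis}\left(n \cdot \frac{\theta + 2\pi k}{n}\right) = r\operatorname{cis}(\theta + 2\pi k) = r\operatorname{cis}(\theta) = a + ib.
\]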
That is, each $\omega_k$ is an $n$th root of $a + ib$. Are there any other $n$th roots of
$a + ib$? The answer is no. The statement “$z$ is an $n$th root of $a + ib$” means
$z^n - (a + ib) = 0$, but the polynomial $z^n - (a + ib)$ can have at most $n$ roots
since its degree is $n$. Since we already have a list of $n$ distinct complex $n$th
roots of $a + ib$, there are no more to be discovered.
To summarize, if $n$ is a positive integer, and $a + ib$ is a nonzero complex
number with polar form $r\operatorname{cis}(\theta)$, then the $n$th roots of $a + ib$ (which are all
distinct) are
\[
\omega_k = r^{1/n}\operatorname{cis}\left(\frac{\theta + 2\pi k}{n}\right), \quad \text{for } k = 0, \dots, n - 1. \tag{10}
\]
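Formula (10) translates directly into a short computation. The sketch below is illustrative only; the name `nth_roots` is hypothetical. One assumption to note: Python's `cmath.polar` returns an angle in $(-\pi, \pi]$ rather than $[0, 2\pi)$, which changes the order of the list but not the set of roots.

```python
import cmath

def nth_roots(z, n):
    """All n distinct nth roots of a nonzero complex number z, per (10)."""
    if z == 0:
        raise ValueError("z must be nonzero")
    r, theta = cmath.polar(z)          # z = r * cis(theta)
    return [cmath.rect(r ** (1 / n), (theta + 2 * cmath.pi * k) / n)
            for k in range(n)]
```

For example, `nth_roots(-8, 3)` should give three distinct values, each of whose cubes is close to $-8$.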
3.2. Depressed polynomials.
3.2.1. Depressed polynomials and roots of polynomials.
In this section we will show that finding the roots of a polynomial can be
reduced to finding the roots of a corresponding depressed polynomial.
Let $n \geq 1$. A polynomial
\[
b_0 + b_1 u + \cdots + b_{n-1}u^{n-1} + u^n
\]
is said to be depressed if $b_{n-1} = 0$. We show that any polynomial
\[
f(z) = a_0 + a_1 z + \cdots + a_{n-1}z^{n-1} + z^n, \qquad n \geq 2,
\]
can be written as a depressed polynomial in terms of $u = z + \frac{a_{n-1}}{n}$. Letting
$z = u - \frac{a_{n-1}}{n}$, we have
\[
f(z) = a_0 + a_1\left(u - \frac{a_{n-1}}{n}\right) + \cdots + a_{n-1}\left(u - \frac{a_{n-1}}{n}\right)^{n-1} + \left(u - \frac{a_{n-1}}{n}\right)^n. \tag{11}
\]
After expanding the terms in (11) and collecting terms according to powers
of $u$, we have
\[
f(z) = b_0 + b_1 u + \cdots + b_{n-1}u^{n-1} + u^n =: g(u) \tag{12}
\]
for some complex numbers $b_0, \dots, b_{n-1}$ given in terms of $a_0, \dots, a_{n-1}$. Expanding the last two terms in (11), we see that $b_{n-1}$, the coefficient of $u^{n-1}$
in $f(z)$, is $b_{n-1} = a_{n-1} - n \cdot \frac{a_{n-1}}{n} = 0$.
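By (12) we have
\[
f(z) = u^n + \sum_{k=0}^{n-2} b_k u^k = g(u) \quad \text{where } u = z + \frac{a_{n-1}}{n}. \tag{13}
\]
The connection to finding roots is this: if
\[
g(u) = (u - u_0) \cdots (u - u_{n-1}),
\]
then
\[
f(z) = \left(z - \left(u_0 - \frac{a_{n-1}}{n}\right)\right) \cdots \left(z - \left(u_{n-1} - \frac{a_{n-1}}{n}\right)\right).
\]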
As we will see in sections 4.1 and 4.2, given a cubic or a quartic polynomial
$f(z)$, we can always find the roots of the associated depressed polynomial $g(u)$.
A comment: while the treatment above is purely algebraic in nature, one
can efficiently study depressed polynomials using the concept of a derivative
from calculus (developed between 1650 and 1700). Once this tool is available,
one can write concise, motivated formulas for $b_0, \dots, b_{n-2}$ from (13). If
you are familiar with derivatives, try to see how to do it!
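The substitution $z = u - \frac{a_{n-1}}{n}$ can also be carried out mechanically. The sketch below is one possible way to do so, using Horner's rule on coefficient lists rather than the derivative-based formulas alluded to above; the function name `depress` and the list-based representation are assumptions made for illustration.

```python
def depress(coeffs):
    """Given [a_0, ..., a_{n-1}] for the monic polynomial
    f(z) = a_0 + a_1 z + ... + a_{n-1} z^{n-1} + z^n,
    return (shift, [b_0, ..., b_{n-1}]) where shift = a_{n-1}/n and
    f(u - shift) = b_0 + b_1 u + ... + b_{n-1} u^{n-1} + u^n."""
    n = len(coeffs)
    shift = coeffs[-1] / n
    result = [1]                        # coefficients, lowest degree first
    for a in reversed(coeffs):
        # Horner step: result <- result * (u - shift) + a
        shifted = [0] + result          # result * u
        for i, c in enumerate(result):
            shifted[i] -= shift * c     # minus shift * result
        shifted[0] += a
        result = shifted
    return shift, result[:-1]           # drop the leading coefficient 1
```

For $f(z) = z^3 - 6z^2 + 11z - 6$, calling `depress([-6, 11, -6])` should give a shift of $-2$ and coefficients close to `[0, -1, 0]`, that is, $g(u) = u^3 - u$, in agreement with (2).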
3.2.2. A depressed polynomial in disguise.
The idea of using depressed polynomials to solve polynomial equations is
one that you have seen before, just at a lower level of generality: factoring
$f(z) = z^2 + az + b$ by completing the square. Note
\[
f(z) = \left(z + \frac{a}{2}\right)^2 + b - \frac{a^2}{4}.
\]
From this we see
\[
f(z) = g(u) := u^2 + b - \frac{a^2}{4} = u^2 - \left(\frac{a^2}{4} - b\right)
\]
where $u = z + \frac{a}{2}$. Note $g(u)$ is a depressed quadratic expression. This yields
\[
g(u) = \left(u - \sqrt{\frac{a^2}{4} - b}\right)\left(u + \sqrt{\frac{a^2}{4} - b}\right) \tag{14}
\]
where $\sqrt{\frac{a^2}{4} - b}$ is some complex square root of $\frac{a^2}{4} - b$, which is defined by
the results of section 3.1. By (14),
\[
f(z) = \left(z - \left(-\frac{a}{2} + \sqrt{\frac{a^2}{4} - b}\right)\right)\left(z - \left(-\frac{a}{2} - \sqrt{\frac{a^2}{4} - b}\right)\right).
\]
As we see, factoring $g(u)$ lets us factor $f(z)$.
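As a small illustration of the factorization above, the following sketch computes the two values $-\frac{a}{2} \pm \sqrt{\frac{a^2}{4} - b}$ numerically. The name `factor_quadratic` is hypothetical, and `cmath.sqrt` simply supplies one of the two complex square roots.

```python
import cmath

def factor_quadratic(a, b):
    """Return (z0, z1) with z^2 + a z + b = (z - z0)(z - z1)."""
    root = cmath.sqrt(a * a / 4 - b)    # some complex square root of a^2/4 - b
    return -a / 2 + root, -a / 2 - root
```

For example, `factor_quadratic(-3, 2)` should return values close to 2 and 1, matching $z^2 - 3z + 2 = (z - 2)(z - 1)$.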
4. Justification of the cubic and quartic algorithms
Exercise 2: Using section 3.2.1 as a reference, verify the statements in
sections 2.1 and 2.2 concerning equations (1), (2), (3), (5), (6), and (7).
4.1. Derivation and discussion of the cubic algorithm.
4.1.1. Derivation of the cubic algorithm.
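Case 1) needs no justification, so we focus on Case 2). Let $v \neq 0$. We substitute $u = v - \frac{p}{3v}$ into $g(u)$. This is the aforementioned Vieta’s substitution,
and yields
\[
g\left(v - \frac{p}{3v}\right) = \left(v - \frac{p}{3v}\right)^3 + p\left(v - \frac{p}{3v}\right) + q = \frac{1}{v^3}\left((v^3)^2 + qv^3 - \frac{p^3}{27}\right). \tag{15}
\]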
Let $\omega$ be a solution to
\[
\omega^2 + q\omega - \frac{p^3}{27} = 0, \tag{16}
\]
and $t_0$ be some cube root of $\omega$. Since $p \neq 0$, $\omega$ and $t_0$ are nonzero. By (15)
and (16), we have that $g\left(t_0 - \frac{p}{3t_0}\right) = 0$. That is, $u_0$ below is a root of $g(u)$:
\[
u_0 := t_0 - \frac{p}{3t_0}. \tag{17}
\]
We now know that $g(u) = (u - u_0)h(u)$, where $h(u)$ is some quadratic
expression. By polynomial long division, we see that
\[
g(u) = (u - u_0)(u^2 + u_0 u + p + u_0^2).
\]
The other roots of $g(u)$ are the roots of $u^2 + u_0 u + p + u_0^2$, which (by the
quadratic formula) are
\[
-\frac{u_0}{2} \pm i\sqrt{\frac{3u_0^2}{4} + p}, \tag{18}
\]
where $\sqrt{\frac{3u_0^2}{4} + p}$ is some complex square root of $\frac{3u_0^2}{4} + p$. Using (17), we
see
\[
\frac{3u_0^2}{4} + p = \frac{3}{4}\left(t_0 - \frac{p}{3t_0}\right)^2 + p = \frac{3}{4}\left(t_0 + \frac{p}{3t_0}\right)^2,
\]
so let the square root in (18) satisfy
\[
\sqrt{\frac{3u_0^2}{4} + p} = \frac{\sqrt{3}}{2}\left(t_0 + \frac{p}{3t_0}\right).
\]
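Equation (18) becomes
\[
-\frac{u_0}{2} \pm i\sqrt{\frac{3u_0^2}{4} + p} = -\frac{1}{2}\left(t_0 - \frac{p}{3t_0}\right) \pm i\frac{\sqrt{3}}{2}\left(t_0 + \frac{p}{3t_0}\right) = \left(-\frac{1}{2} \pm i\frac{\sqrt{3}}{2}\right)t_0 - \frac{p}{3t_0\left(-\frac{1}{2} \pm i\frac{\sqrt{3}}{2}\right)}. \tag{19}
\]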
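Also,
\[
\operatorname{cis}\left(\frac{2\pi}{3}\right) = -\frac{1}{2} + i\frac{\sqrt{3}}{2}, \qquad \operatorname{cis}\left(\frac{4\pi}{3}\right) = -\frac{1}{2} - i\frac{\sqrt{3}}{2}. \tag{20}
\]
By (10), the numbers $1$, $\operatorname{cis}\left(\frac{2\pi}{3}\right)$, and $\operatorname{cis}\left(\frac{4\pi}{3}\right)$ are the distinct cube roots
of $1$, so
\[
t_0, \qquad t_1 := t_0\operatorname{cis}\left(\frac{2\pi}{3}\right), \qquad t_2 := t_0\operatorname{cis}\left(\frac{4\pi}{3}\right) \tag{21}
\]
are the distinct cube roots of $\omega$. By equations (17) through (21), the roots
of $g(u)$ are those values given in (4).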
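A worked example may make the derivation concrete. The polynomial $g(u) = u^3 - 6u - 9$ is chosen here purely for illustration (it does not appear elsewhere in this document); with $p = -6$ and $q = -9$, the steps (16) through (21) run as follows.

```latex
\begin{align*}
&\text{(16):}\quad \omega^2 - 9\omega + 8 = 0, \ \text{so } \omega = 8 \text{ or } \omega = 1. \ \text{Take } \omega = 8,\ t_0 = 2. \\
&\text{(17):}\quad u_0 = t_0 - \frac{p}{3t_0} = 2 + \frac{6}{6} = 3, \qquad g(3) = 27 - 18 - 9 = 0. \\
&\text{Division:}\quad g(u) = (u - 3)(u^2 + 3u + 3), \ \text{since } p + u_0^2 = -6 + 9 = 3. \\
&\text{(18):}\quad -\frac{u_0}{2} \pm i\sqrt{\frac{3u_0^2}{4} + p} = -\frac{3}{2} \pm i\sqrt{\frac{27}{4} - 6} = -\frac{3}{2} \pm i\frac{\sqrt{3}}{2}. \\
&\text{(21), (4):}\quad t_1 = 2\operatorname{cis}\left(\tfrac{2\pi}{3}\right) = -1 + i\sqrt{3}, \qquad
u_1 = t_1 + \frac{2}{t_1} = (-1 + i\sqrt{3}) + \frac{-1 - i\sqrt{3}}{2} = -\frac{3}{2} + i\frac{\sqrt{3}}{2}.
\end{align*}
```

The value $u_1$ obtained from (4) agrees with one of the two values from (18), as the derivation predicts; $u_2$ is handled the same way and gives the conjugate value.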
4.1.2. Comments on the derivation of the cubic algorithm.
A question: why is the work past (17) necessary, since the argument
that $u_0 = t_0 - \frac{p}{3t_0}$ satisfies $g(u_0) = 0$ also shows that $u_1 = t_1 - \frac{p}{3t_1}$ and
$u_2 = t_2 - \frac{p}{3t_2}$ satisfy $g(u_1) = 0$ and $g(u_2) = 0$ as well? The reason the
factoring argument is necessary is this: we don’t just want values $u$ such
that $g(u) = 0$. We want the right three values $u_0$, $u_1$, and $u_2$ such that
\[
g(u) = (u - u_0)(u - u_1)(u - u_2). \tag{22}
\]
In this instance, there are two choices for $\omega$ that satisfy (16), say $\omega$ and
$\omega_1$, and each of those choices has three cube roots, say $t_0, t_1, t_2$ and $t_3, t_4, t_5$.
For each of these six numbers, the reasoning in section 4.1.1 applies, and the
value $u_k = t_k - \frac{p}{3t_k}$ will satisfy $g(u_k) = 0$. So which ones do we choose so
that we can factor $g(u)$ into the form of (22)? Our chosen approach makes
a direct attack on this question unnecessary.
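The six candidate values can be examined numerically. The sketch below is illustrative only (the names `candidates` and `distinct` are hypothetical, and $p \neq 0$ is assumed). It relies on an observation not spelled out in the text: the two solutions of (16) have product $-\frac{p^3}{27}$, so if $t$ is a cube root of one of them, then $-\frac{p}{3t}$ is a cube root of the other, and both give the same value $t - \frac{p}{3t}$. When $g(u)$ has three distinct roots, the six candidates therefore take only three distinct values.

```python
import cmath

def candidates(p, q):
    """All six values t - p/(3t), with t ranging over the cube roots
    of both solutions of (16). Assumes p != 0."""
    s = cmath.sqrt(q * q + 4 * p ** 3 / 27)
    values = []
    for w in ((-q + s) / 2, (-q - s) / 2):       # both solutions of (16)
        r, theta = cmath.polar(w)
        for k in range(3):
            t = cmath.rect(r ** (1 / 3), (theta + 2 * cmath.pi * k) / 3)
            values.append(t - p / (3 * t))
    return values

def distinct(values, tol=1e-9):
    """Keep one representative of each cluster of numerically equal values."""
    out = []
    for v in values:
        if all(abs(v - w) > tol for w in out):
            out.append(v)
    return out
```

For $g(u) = u^3 - 6u - 9$ (an example chosen for illustration), `candidates(-6, -9)` returns six numbers, each a root of $g$ up to rounding, and `distinct` reduces them to three.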
4.2. Derivation of the quartic algorithm.
Case 1) needs no justification, so we focus on Case 2). Our approach to
finding the roots of $g(u)$ is this: first, we factor $g(u)$ into two quadratics
\[
u^4 + pu^2 + qu + r = (u^2 + k_1 u + n)(u^2 + k_2 u + m) \tag{23}
\]
whose coefficients $k_1$, $n$, and $k_2$, $m$ we can determine in terms of $p$, $q$, and
$r$. Second, we find the roots of the quadratics in (23), which are the roots
of $g(u)$.
If we expand the right hand side of (23), collect terms according to powers
of $u$, and equate the coefficients with those on the left hand side, we obtain
\[
k_2 + k_1 = 0, \qquad m + n + k_1 k_2 = p, \qquad k_1 m + k_2 n = q, \qquad nm = r.
\]
Immediately, $k_2 = -k_1 := -k$. This gives
\[
u^4 + pu^2 + qu + r = (u^2 + ku + n)(u^2 - ku + m),
\]
\[
m + n = p + k^2, \qquad k(m - n) = q, \qquad \text{and} \qquad nm = r. \tag{24}
\]
Now $q \neq 0$, so $k \neq 0$. Therefore, we can divide by $k$ in the second equality
in (24):
\[
m + n = p + k^2, \qquad m - n = \frac{q}{k}, \qquad nm = r.
\]
These equations yield
\[
m = \frac{1}{2}\left(p + k^2 + \frac{q}{k}\right), \qquad n = \frac{1}{2}\left(p + k^2 - \frac{q}{k}\right), \qquad \text{and}
\]
\[
\left(p + k^2 + \frac{q}{k}\right)\left(p + k^2 - \frac{q}{k}\right) = 4r.
\]
The last equation above reduces to
\[
k^6 + 2pk^4 + (p^2 - 4r)k^2 - q^2 = 0.
\]
Analysis of these steps in reverse order justifies the algorithm from Case
2).
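To see the steps in action, consider $g(u) = u^4 - 2u^2 + 8u - 3$, a polynomial chosen here only for illustration, with $p = -2$, $q = 8$, and $r = -3$. Writing $\omega = k^2$, the computation proceeds as follows.

```latex
\begin{align*}
&\omega^3 + 2p\omega^2 + (p^2 - 4r)\omega - q^2 = \omega^3 - 4\omega^2 + 16\omega - 64 = (\omega - 4)(\omega^2 + 16), \ \text{so take } \omega = 4,\ k = 2. \\
&m = \frac{1}{2}\left(p + k^2 + \frac{q}{k}\right) = \frac{1}{2}(-2 + 4 + 4) = 3, \qquad
n = \frac{1}{2}\left(p + k^2 - \frac{q}{k}\right) = \frac{1}{2}(-2 + 4 - 4) = -1. \\
&\text{Check of (24):}\quad m + n = 2 = p + k^2, \qquad k(m - n) = 8 = q, \qquad nm = -3 = r. \\
&g(u) = (u^2 + 2u - 1)(u^2 - 2u + 3), \ \text{with roots } u = -1 \pm \sqrt{2} \ \text{ and } \ u = 1 \pm i\sqrt{2}.
\end{align*}
```

The other two solutions $\omega = \pm 4i$ of the cubic lead to different values of $k$, $m$, and $n$, but (as noted in section 2.3) to the same four roots.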
References
[1] David M. Burton, The History of Mathematics, an Introduction, The McGraw-Hill
Companies, Inc., 1221 Avenue of the Americas, New York, NY 10020 (2011).