RATIONAL APPROXIMATIONS TO
ALGEBRAIC NUMBERS
By K. F. ROTH
1. Let $\alpha$ be any algebraic irrational number and suppose there are
infinitely many rational approximations $h/q$ to $\alpha$ such that

$$\left|\alpha - \frac{h}{q}\right| < q^{-\kappa}. \tag{1}$$

I proved in 1955 that this implies $\kappa \le 2$. In this talk I shall try to outline
the proof, and to say a few words about some of the possible extensions,
and about the limitations of the method.
It is easily seen that there is no loss of generality in supposing that
$\alpha$ is an algebraic integer. Accordingly, we shall suppose that $\alpha$ is a root
of the polynomial

$$f(x) = x^n + a_1 x^{n-1} + \dots + a_n \tag{2}$$

with integral coefficients and highest coefficient 1.
2. Previous work on the problem entailed the use of polynomials in
two variables. It has long been recognized that further progress would
demand the use of polynomials in more than two variables, and that
polynomials in a large number of variables would have to be used to
obtain the full result. It is not difficult to formulate properties which
a polynomial should have in order to be useful for our purpose.
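Suppose that $h_1/q_1, \dots, h_m/q_m$ are rational approximations to $\alpha$, all
satisfying (1). Let $Q(x_1, \dots, x_m)$ denote a polynomial with integral
coefficients, of degree at most $r_j$ in $x_j$ for each $j$. Then

$$\left|Q\!\left(\frac{h_1}{q_1}, \dots, \frac{h_m}{q_m}\right)\right| \ge \frac{1}{P}, \tag{3}$$

where $P = q_1^{r_1} \cdots q_m^{r_m}$, provided, of course, that

$$Q\!\left(\frac{h_1}{q_1}, \dots, \frac{h_m}{q_m}\right) \ne 0.$$

Let the Taylor expansion of $Q(h_1/q_1, \dots, h_m/q_m)$ in powers of
$(h_1/q_1) - \alpha, \dots, (h_m/q_m) - \alpha$ be

$$\sum_{i_1} \cdots \sum_{i_m} Q_{i_1, \dots, i_m} \left(\frac{h_1}{q_1} - \alpha\right)^{i_1} \cdots \left(\frac{h_m}{q_m} - \alpha\right)^{i_m}.$$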
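Suppose now that $Q$ has the properties:

A: $\sum_{i_1} \cdots \sum_{i_m} |Q_{i_1, \dots, i_m}| < P^{\Delta}$, where $\Delta$ is small;

B: $Q_{i_1, \dots, i_m} = 0$ for all $i_1, \dots, i_m$ satisfying
$q_1^{i_1} \cdots q_m^{i_m} < P^{\phi}$ $(\phi > 0)$.
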
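Then each term in this Taylor expansion with a non-zero coefficient has

$$\left|\frac{h_1}{q_1} - \alpha\right|^{i_1} \cdots \left|\frac{h_m}{q_m} - \alpha\right|^{i_m} < \frac{1}{(q_1^{i_1} \cdots q_m^{i_m})^{\kappa}} \le \frac{1}{P^{\kappa\phi}}.$$

Hence

$$\left|Q\!\left(\frac{h_1}{q_1}, \dots, \frac{h_m}{q_m}\right)\right| < \frac{1}{P^{\kappa\phi - \Delta}}. \tag{4}$$

Comparison of (3) and (4) yields

$$\kappa < \frac{1 + \Delta}{\phi}. \tag{5}$$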
We must bear in mind that to obtain (5) we used, in addition to A and B,
the condition

C: $Q(h_1/q_1, \dots, h_m/q_m) \ne 0$.

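The comparison of (3) and (4) can be written out in full. The following is only a sketch of the arithmetic, under the additional assumption (not stated explicitly in the text) that $P > 1$, which holds as soon as every $q_j \ge 2$ and every $r_j \ge 1$.

```latex
% Upper bound (4): by B, every surviving term has q_1^{i_1} ... q_m^{i_m} >= P^{\phi};
% by A, the surviving coefficients have absolute sum less than P^{\Delta}.
\[
\left| Q\!\left(\tfrac{h_1}{q_1}, \dots, \tfrac{h_m}{q_m}\right) \right|
  \le \Bigl( \sum_{i_1} \cdots \sum_{i_m} |Q_{i_1,\dots,i_m}| \Bigr)
      \max_{Q_{i_1,\dots,i_m} \ne 0} \prod_{j=1}^{m} \left| \tfrac{h_j}{q_j} - \alpha \right|^{i_j}
  < P^{\Delta} \cdot P^{-\kappa\phi}.
\]
% Lower bound (3), valid under C, combined with the upper bound:
\[
P^{-1} \le \left| Q\!\left(\tfrac{h_1}{q_1}, \dots, \tfrac{h_m}{q_m}\right) \right| < P^{-(\kappa\phi - \Delta)}
\;\Longrightarrow\; \kappa\phi - \Delta < 1
\;\Longrightarrow\; \kappa < \frac{1 + \Delta}{\phi}.
\]
% With \phi close to 1/2 and \Delta close to 0, the right-hand side is close to 2.
```

The last comment is why the remainder of the argument aims at $\phi$ near $\tfrac12$: any fixed $\kappa > 2$ is then excluded once $\phi$ is close enough to $\tfrac12$ and $\Delta$ is small enough.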
To prove our theorem we shall establish the existence of a polynomial $Q$
satisfying the conditions A, B, C with $\phi$ near to $\tfrac12$. For this
purpose it will be necessary to take $m$ large and to choose the approximations
$h_1/q_1, \dots, h_m/q_m$ suitably. Only after choosing these approximations
do we choose the polynomial $Q$, and the latter choice depends on
the former.
[With $m = 2$ it is only possible to satisfy condition B with $\phi$ of the
order of $n^{-1/2}$ (where $n$ is the degree of $\alpha$), and this leads to an estimate of
the type $\kappa < c\,n^{1/2}$.]
3. The logical structure of the proof is as follows. We suppose $\kappa > 2$;
$m$ is chosen sufficiently large and is fixed throughout. A small positive
number $\delta$ $(< 1/m)$ will be fixed until the end of the proof, when we let
$\delta \to 0$. We denote by $\Delta$ any function of $\delta$ and $m$ such that $\Delta \to 0$ as $\delta \to 0$
for fixed $m$.
We begin by choosing $h_1/q_1, \dots, h_m/q_m$ [with $(h_j, q_j) = 1$] from the
assumed infinite sequence of approximations to $\alpha$ (satisfying (1))
by first taking $q_1$ sufficiently large (in terms of $m$, $\delta$, $\alpha$), then taking $q_2$
sufficiently large in terms of $q_1$, and so on. It will in fact suffice if

$$\frac{\log q_j}{\log q_{j-1}} > \delta^{-1} \quad (j = 2, \dots, m).$$
Then we choose integers $r_1, \dots, r_m$ which are sufficiently large in relation
to $q_1, \dots, q_m$ and which satisfy

$$q_1^{r_1} \le q_j^{r_j} < q_1^{r_1(1+\Delta)}. \tag{6}$$

This presents no difficulty. We note that (6) implies

$$q_1^{m r_1} \le P < q_1^{m r_1(1+\Delta)}. \tag{7}$$

Condition B now takes the form that the Taylor coefficients $Q_{i_1, \dots, i_m}$
vanish for all $i_1, \dots, i_m$ satisfying

$$\frac{i_1}{r_1} + \dots + \frac{i_m}{r_m} < m\phi + \Delta. \tag{8}$$
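The passage from condition B to (8) can be sketched as follows. This is an illustration only: it assumes that (6) and (7) read $q_1^{r_1} \le q_j^{r_j} < q_1^{r_1(1+\Delta)}$ and $q_1^{m r_1} \le P < q_1^{m r_1(1+\Delta)}$, and the symbol $\Delta'$ is introduced here purely for bookkeeping.

```latex
% Left-hand inequality of (6), raised to the power i_j / r_j and multiplied over j:
\[
q_1^{i_1} \cdots q_m^{i_m}
  = \prod_{j=1}^{m} \bigl( q_j^{r_j} \bigr)^{i_j / r_j}
  \ge q_1^{\, r_1 \left( \frac{i_1}{r_1} + \dots + \frac{i_m}{r_m} \right)} .
\]
% Right-hand inequality of (7), raised to the power \phi:
\[
P^{\phi} < q_1^{\, m r_1 (1 + \Delta) \phi} .
\]
% Hence every (i_1, ..., i_m) covered by condition B lies in a region of the shape (8):
\[
q_1^{i_1} \cdots q_m^{i_m} < P^{\phi}
\;\Longrightarrow\;
\frac{i_1}{r_1} + \dots + \frac{i_m}{r_m} < m\phi(1 + \Delta) = m\phi + \Delta',
\qquad \Delta' = m\phi\,\Delta \to 0 \ \text{as } \delta \to 0 \ (m \text{ fixed}).
\]
```

So a polynomial whose Taylor coefficients vanish throughout a region of the form (8) automatically satisfies B, at the cost of replacing one quantity of type $\Delta$ by another.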
4. We shall first outline a proof of the existence of a polynomial $Q^*$
satisfying conditions A and B only (with $\phi$ near to $\tfrac12$). This proof, due
essentially to Siegel, is based on the use of Dirichlet's compartment
principle. The question of satisfying C as well, which gives rise to the
principal difficulty, is deferred until later.
We put $B_1 = q_1^{\delta r_1}$ and consider all polynomials $W(x_1, \dots, x_m)$ of degree
at most $r_j$ in $x_j$, having positive integral coefficients, each less than $B_1$.
We try to find two such polynomials $W'$, $W''$ such that their derivatives
of order $i_1, \dots, i_m$ are equal when $x_1 = \dots = x_m = \alpha$, for all $i_1, \dots, i_m$
satisfying (8). Since any such derivative is of the form

$$A_0 + A_1 \alpha + \dots + A_{n-1} \alpha^{n-1},$$

where $A_0, \dots, A_{n-1}$ are integers, one can estimate the number of possibilities
for a derivative for given $i_1, \dots, i_m$. This number can be shown to
be less than $B_1^{n(1+3\delta)}$. The number of polynomials $W$ is about $B_1^{r}$, where
$r = (r_1 + 1) \cdots (r_m + 1)$. Thus the number of polynomials $W$ will exceed
the number of possible distinct sets of derivatives provided that the
number of sets of $i_1, \dots, i_m$ satisfying (8), and with no $i$ exceeding the
corresponding $r$, is less than about $r/\{n(1 + 3\delta)\}$. The number of integer
points $(i_1, \dots, i_m)$ in the region defined by the above conditions can be
shown to be less than $\tfrac12 r/n$ if $\phi$ is chosen so that

$$m\phi + \Delta = \tfrac12 m - 3 n m^{1/2}. \tag{9}$$
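The counting condition can be checked numerically in a toy case. The sketch below is an illustration under simplifying assumptions that are not those of the proof: all $r_j$ are taken equal to a hypothetical common value `r0` (in the proof the $r_j$ are large and differ widely), and the values `n = 2`, `m = 400` are arbitrary choices for which the right-hand side of (9) is positive. It counts the integer points with $i_1/r_1 + \dots + i_m/r_m$ below that right-hand side and compares the count with $r/\{n(1+3\delta)\}$.

```python
from fractions import Fraction
from math import isqrt

def count_points(m, r0, threshold):
    """Number of integer points (i_1, ..., i_m) with 0 <= i_j <= r0 and
    i_1/r0 + ... + i_m/r0 < threshold (all r_j equal to r0 for simplicity)."""
    counts = [1]  # counts[s] = number of partial points with i_1 + ... + i_k = s
    for _ in range(m):
        new = [0] * (len(counts) + r0)
        for s, c in enumerate(counts):
            for i in range(r0 + 1):
                new[s + i] += c
        counts = new
    return sum(c for s, c in enumerate(counts) if Fraction(s, r0) < threshold)

n, m, r0 = 2, 400, 2                  # hypothetical sizes; m = 400 is a perfect square
delta = Fraction(1, 2 * m)            # any delta < 1/m
threshold = Fraction(m, 2) - 3 * n * isqrt(m)   # right-hand side of (9): 200 - 120
total = (r0 + 1) ** m                 # r = (r_1 + 1) ... (r_m + 1)
small = count_points(m, r0, threshold)

# Pigeonhole condition: the number of index sets must be below r / (n (1 + 3 delta)).
print(small * n * (1 + 3 * delta) < total)
```

In this toy case the admissible points form a vanishingly small fraction of all $r$ points, which is the phenomenon the compartment principle exploits: far from the central value $m/2$ of $i_1/r_1 + \dots + i_m/r_m$ there are comparatively few index sets.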
The polynomial $Q^* = W' - W''$ satisfies B, by its definition, and can
be shown by a process of simple estimation to satisfy A. Furthermore,
on letting $\delta \to 0$ we would obtain

$$\phi = \tfrac12 - 3 n m^{-1/2},$$

so that $\phi$ could be assumed to be sufficiently near to $\tfrac12$, since $m$ is large.
[It is in order to be able to choose $\phi$ near to $\tfrac12$ that we must work with
polynomials in many variables.]
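5. To find a polynomial $Q$ which satisfies A, B and C as well, we seek
a derivative

$$Q = \frac{1}{j_1! \cdots j_m!} \, \frac{\partial^{\,j_1 + \dots + j_m} Q^*}{\partial x_1^{j_1} \cdots \partial x_m^{j_m}}$$

of not too high an order, of the polynomial $Q^*$ just considered. We want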
$Q$ not to vanish at $(h_1/q_1, \dots, h_m/q_m)$. The 'order' is measured by
$j_1/r_1 + \dots + j_m/r_m$. The replacement of $Q^*$ by $Q$ will involve a weakening
of condition B, but provided the order in question is a $\Delta$, this will make
no difference on letting $\delta \to 0$. There will also be an effect on condition A,
but this turns out to be insignificant. Condition C is the essential
requirement now.
The existence of such a derivative, whose order is a $\Delta$, is not easy to
establish. One would in fact expect this to cause difficulty, as the choice
of $Q^*$ was designed to make $Q^*$ very small at $(h_1/q_1, \dots, h_m/q_m)$.
At this stage it is convenient to introduce the notion of the index of
a polynomial at a point. We define the index of a polynomial at the
point $(\alpha_1, \dots, \alpha_m)$ relative to positive parameters $r_1, \dots, r_m$ to be the
minimal order of derivative (measured as above) which does not vanish
at the point $(\alpha_1, \dots, \alpha_m)$. In this language, we need to show that the
index of $Q^*$ at $(h_1/q_1, \dots, h_m/q_m)$ is a $\Delta$.
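The definition of the index can be made concrete on a small example. The sketch below is an illustration only: the dictionary representation of a polynomial and the helper names `taylor_coefficients` and `index` are assumptions introduced here, not notation from the proof. It uses the fact that a derivative of order $(i_1, \dots, i_m)$ fails to vanish at a point exactly when the corresponding Taylor coefficient about that point is non-zero.

```python
from fractions import Fraction
from itertools import product
from math import comb

def taylor_coefficients(poly, point):
    """Non-zero Taylor coefficients of a polynomial about `point`.

    `poly` maps exponent tuples (e_1, ..., e_m) to coefficients; the result maps
    (i_1, ..., i_m) to the coefficient of prod_j (x_j - point_j)^{i_j}.
    """
    coeffs = {}
    for exps, c in poly.items():
        # x^e = sum_i C(e, i) a^(e - i) (x - a)^i, applied to each variable in turn
        for idx in product(*(range(e + 1) for e in exps)):
            term = Fraction(c)
            for e, i, a in zip(exps, idx, point):
                term *= comb(e, i) * Fraction(a) ** (e - i)
            coeffs[idx] = coeffs.get(idx, Fraction(0)) + term
    return {idx: v for idx, v in coeffs.items() if v != 0}

def index(poly, point, r):
    """Least value of i_1/r_1 + ... + i_m/r_m over the non-zero Taylor coefficients."""
    taylor = taylor_coefficients(poly, point)
    return min(sum(Fraction(i, rj) for i, rj in zip(idx, r)) for idx in taylor)

# R(x1, x2) = (2 x1 - 1)^2 (3 x2 - 1), expanded into monomials
R = {(2, 1): 12, (2, 0): -4, (1, 1): -12, (1, 0): 4, (0, 1): 3, (0, 0): -1}

print(index(R, (Fraction(1, 2), Fraction(1, 3)), (4, 2)))  # index at (1/2, 1/3)
print(index(R, (0, 0), (4, 2)))                            # index at a point where R != 0
```

Here $R = 12\,(x_1 - \tfrac12)^2 (x_2 - \tfrac13)$, so relative to $(r_1, r_2) = (4, 2)$ the index at $(\tfrac12, \tfrac13)$ is $2/4 + 1/2 = 1$, while at $(0, 0)$, where $R$ does not vanish, it is $0$.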
For polynomials in two variables, two quite different lines of reasoning
have been used to obtain upper bounds for the index of $Q^*$ at
$(h_1/q_1, \dots, h_m/q_m)$. The first, due to Siegel, is algebraic in nature. It is
based on the principle that, under certain conditions, the sum of the
indices of a polynomial at a finite number of points (not restricted to
be rational) is bounded in terms of its degrees in the various variables.
Since $Q^*$ satisfies condition B (with an appropriate $\phi$), it has an almost
maximal index (in a certain sense) at the point $(\alpha, \dots, \alpha)$ and at the
points obtained by replacing $\alpha$ by its conjugates; and it can be deduced
that the index of $Q^*$ is small at any other point. I have been unable to
extend this method to polynomials in more than 2 variables.
The second method, due to Schneider, is arithmetic in nature. It is
based on the principle that, under certain conditions, the index of a
polynomial at a rational point is bounded in terms of the magnitude of
its coefficients. Since the coefficients of $Q^*$ are not too large, this leads
to a result of the desired kind.
My treatment is based on Schneider's approach and enables me to
prove the following lemma.
Principal lemma. Let $0 < \delta < m^{-1}$, and let $r_1, \dots, r_m$ be positive integers
satisfying

$$r_m > 10\,\delta^{-1}, \qquad \frac{r_{j-1}}{r_j} > \delta^{-1} \quad (j = 2, \dots, m).$$

Let $q_1, \dots, q_m$ be positive integers satisfying

$$q_1 > c = c(m, \delta), \qquad q_j^{r_j} \ge q_1^{r_1}.$$

Consider any polynomial $R$, not identically zero, with integral coefficients
of absolute value at most $q_1^{\delta r_1}$ and of degree at most $r_j$ in $x_j$. Then

$$\operatorname{index} R < 10^m \delta^{(1/2)^m},$$

where the index is taken at a point $(h_1/q_1, \dots, h_m/q_m)$ relative to $r_1, \dots, r_m$;
the $h$'s being integers relatively prime to the corresponding $q$'s.
This suffices for the purpose of finding $Q$; the hypotheses of the lemma
are satisfied when $R = Q^*$, and the lemma shows that the index of $Q^*$
at $(h_1/q_1, \dots, h_m/q_m)$ is a $\Delta$, as required.
6. The proof of the lemma is self-contained, as indeed it must be, for
it uses induction on the number of variables, whereas in the main proof
the number m is fixed. Furthermore, the lemma has to be generalized
before the induction can be set up.
We consider the class of all polynomials $R(x_1, \dots, x_m)$ with integral coefficients,
each coefficient being numerically at most $B$, say, and of degree
at most $r_j$ in $x_j$. We obtain, under certain conditions, an upper bound
for the indices of polynomials of this class at a point $(h_1/q_1, \dots, h_m/q_m)$
relative to $r_1, \dots, r_m$. During the course of the proof, which is by induction
on $m$, it is necessary to consider various different sets of values
of the parameters involved. The final estimate is of the type required to
establish the lemma.
The case $m = 1$ is simple. Suppose the coefficients of a polynomial
$R(x_1)$ are numerically less than $B$. If $\theta_1$ is the index of $R$ at $h_1/q_1$ relative
to $r_1$, the polynomial $R(x_1)$ is divisible by

$$\left(x_1 - \frac{h_1}{q_1}\right)^{\theta_1 r_1}.$$

It follows from Gauss's theorem on the factorization of polynomials
with integral coefficients into polynomials with rational coefficients, that

$$R(x_1) = (q_1 x_1 - h_1)^{\theta_1 r_1} R^*(x_1),$$

where $R^*(x_1)$ is a polynomial with integral coefficients. Hence the
coefficient of the highest term in $R$ is an integral multiple of $q_1^{\theta_1 r_1}$, so that

$$\theta_1 \le \frac{\log B}{r_1 \log q_1}.$$

This gives an upper bound of the required type for $m = 1$.
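The one-variable bound can be checked on a small example. The sketch below is an illustration only: the polynomial, the point $2/3$ and the helper name `vanishing_order` are hypothetical choices made here. It computes the order of vanishing at $h_1/q_1$ by repeated exact division, converts it to the index $\theta_1$, and compares with $\log B / (r_1 \log q_1)$.

```python
from fractions import Fraction
from math import log

def vanishing_order(coeffs, h, q):
    """Order of vanishing at x = h/q of a non-zero polynomial with the given
    coefficients (constant term first), by repeated division by (x - h/q)."""
    a = Fraction(h, q)
    poly = [Fraction(c) for c in coeffs]
    order = 0
    while len(poly) > 1:
        # Horner's scheme: running values are the quotient coefficients
        # (highest degree first), and the last value is the remainder.
        quotient, acc = [], Fraction(0)
        for c in reversed(poly):
            acc = acc * a + c
            quotient.append(acc)
        remainder = quotient.pop()
        if remainder != 0:
            break
        poly = quotient[::-1]
        order += 1
    return order

coeffs = [20, -56, 33, 9]        # R(x) = (3x - 2)^2 (x + 5) = 9x^3 + 33x^2 - 56x + 20
h, q, r1 = 2, 3, 3               # point h/q = 2/3, degree bound r_1 = 3
order = vanishing_order(coeffs, h, q)
theta = Fraction(order, r1)      # index of R at h/q relative to r_1
B = max(abs(c) for c in coeffs) + 1   # every coefficient is numerically less than B
bound = log(B) / (r1 * log(q))

print(order, theta, coeffs[-1] % q**order == 0, theta <= bound)
```

In this example the leading coefficient $9$ is divisible by $q_1^{\theta_1 r_1} = 3^2$, as Gauss's theorem predicts, and $\theta_1 = 2/3$ lies below the bound.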
Now suppose that upper bounds of this kind have been obtained for
$m = 1, 2, \dots, p - 1$, where $p \ge 2$. We wish to deduce an upper bound for
the indices for classes of polynomials in $p$ variables.
For any given polynomial $R(x_1, \dots, x_p)$ we consider all representations
of the form

$$R = \phi_0(x_p)\,\psi_0(x_1, \dots, x_{p-1}) + \dots + \phi_{l-1}(x_p)\,\psi_{l-1}(x_1, \dots, x_{p-1}), \tag{10}$$

where the $\phi_\nu$ and $\psi_\nu$ are polynomials with rational coefficients, subject
to the condition that the $\phi_\nu$ and $\psi_\nu$ are of degree at most $r_j$ in $x_j$. Such
a representation is possible, e.g. with $l - 1 = r_p$ and $\phi_\nu(x_p) = x_p^{\nu}$. From
all such representations we select one for which $l$ is least.
In this representation the polynomials $\phi_\nu$ form a linearly independent
set, and so do the $\psi_\nu$. Thus the Wronskian $W(x_p)$ of the $\phi_\nu$ is not
identically zero, and the same is true of a certain generalized Wronskian
$G(x_1, \dots, x_{p-1})$ of the $\psi_\nu$. From (10) and the rule for multiplication of
determinants by rows, it follows that

$$G(x_1, \dots, x_{p-1})\, W(x_p) = F(x_1, \dots, x_p) \tag{11}$$
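is a certain determinant whose elements are all of the form

$$\frac{1}{j_1! \cdots j_p!} \, \frac{\partial^{\,j_1 + \dots + j_p} R}{\partial x_1^{j_1} \cdots \partial x_p^{j_p}}.$$

Since $G$ and $W$ have rational coefficients, there is an equivalent factorization
of $F$ in the form

$$F(x_1, \dots, x_p) = U(x_1, \dots, x_{p-1})\, V(x_p), \tag{12}$$

where $U$ and $V$ have integral coefficients.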
If the coefficients of $R$ are assumed to be numerically less than $B$, this
will imply an upper bound for the coefficients of $F$; and this, in turn,
will imply upper bounds for the coefficients of $U$ and $V$. The induction
hypothesis then gives us upper bounds for the indices of $U$ and $V$ at
the points $(h_1/q_1, \dots, h_{p-1}/q_{p-1})$ and $h_p/q_p$ respectively; and by a
multiplicative property of indices, (12) then yields an upper bound for the
index of $F$ at $(h_1/q_1, \dots, h_p/q_p)$.
On the other hand, $F$ is obtained from $R$ by the operations of differentiation,
addition and multiplication; and by using some simple properties
of indices relating to these operations, one obtains a lower bound for the
index of $F$ in terms of the index of $R$. Thus the upper bound for the index
of $F$ leads to an upper bound for the index of $R$.
In this way it is possible to set up the induction on m, although the
details are somewhat more complicated than they are made to appear
above.
This concludes the outline of the proof of our theorem. We note that
the proof of the existence of a polynomial Q satisfying conditions A, B, C
of § 2 is very indirect, and it would be of considerable interest if such a
polynomial could be obtained by a direct construction.
7. The theorem can be generalized and extended in various ways. For
example, instead of considering rational approximations to the algebraic
number $\alpha$, we may consider approximations to $\alpha$ by algebraic numbers $\beta$
(a) lying in a fixed algebraic field, or (b) of fixed degree. In each case the
accuracy of the approximation is measured in terms of $H(\beta)$, the maximum
absolute value of the rational integral coefficients in the primitive
irreducible equation satisfied by $\beta$.
The results already found by Siegel can be improved in both cases.
In case (a) the best possible result has been obtained.† In case (b),
Siegel's result is significant only if the degree of $\beta$ is not too large compared
to the degree of $\alpha$, and I do not know how to obtain an improvement
which does not suffer from a similar limitation.
The theorem can also be extended to $p$-adic and $g$-adic number fields,
and this has been done by Ridout and Mahler respectively.
Various deductions can be made from the theorem. For example, for
a given $\alpha$ and $\kappa > 2$, it is possible to estimate the number of solutions
of (1), as has been done by Davenport and myself. This leads to estimates
for the number of solutions of certain Diophantine equations.
The method is subject to a severe limitation, however, due to the role
played by the selected approximations $h_1/q_1, \dots, h_m/q_m$. One cannot
answer questions of the following type:
(i) Can one give, in terms of $\alpha$ and $\kappa$ (if $\kappa > 2$), an upper bound for
the greatest denominator $q$ among the finite number of solutions $h/q$
of (1)?
(ii) Can one prove that

$$\left|\alpha - \frac{h}{q}\right| < q^{-(2 + f(q))}$$

has only a finite number of solutions $h/q$ for some explicit function $f(q)$
such that $f(q) \to 0$ as $q \to \infty$?
† See W. J. LeVeque, Topics in Number Theory, Addison-Wesley, 1956.
Liouville's result

$$\left|\alpha - \frac{h}{q}\right| > c(\alpha)\, q^{-n}$$

remains the only known result of its type for which an explicit value of
the constant can be given.
Our method can only throw light on such questions if some assumption
is made concerning the 'gaps' between the convergents to $\alpha$. It would
appear that a completely new idea is needed to obtain any information
concerning problems of the above type.
One outstanding problem is to obtain a theorem analogous to ours
concerning simultaneous approximations to two or more algebraic
numbers by rationals of the same denominator. In the case of such
simultaneous approximation to two algebraic numbers $\alpha_1$, $\alpha_2$ (subject
to a suitable independence condition), one would expect the inequalities

$$\left|\alpha_1 - \frac{h_1}{q}\right| < q^{-\kappa}, \qquad \left|\alpha_2 - \frac{h_2}{q}\right| < q^{-\kappa}$$

to have at most a finite number of solutions for any $\kappa > \tfrac32$. But
practically nothing is known in this connection.
A complete solution of the problem of simultaneous approximations
could lead to the complete solution of many others, such as, for example,
case (b) of the first problem mentioned in this section.