Monographs in Theoretical Computer Science
An EATCS Series
Editors: W. Brauer, G. Rozenberg, A. Salomaa
Advisory Board: G. Ausiello, M. Broy, S. Even, J. Hartmanis, N. Jones, T. Leighton, M. Nivat, C. Papadimitriou, D. Scott
Cristian Calude
Information and Randomness
An Algorithmic Perspective
Forewords by
Gregory J. Chaitin and Arto Salomaa
Springer-Verlag Berlin Heidelberg GmbH
Author
Prof. Dr. Cristian Calude
Department of Computer Science, Auckland University
Private Bag 92019, Auckland, New Zealand
and
Faculty of Mathematics, Bucharest University
Str. Academiei 14, RO-70109 Bucharest, Romania
E-mail: [email protected]
Editors
Prof. Dr. Wilfried Brauer
Institut für Informatik, Technische Universität München
Arcisstraße 21, D-80333 München, FRG
Prof. Dr. Grzegorz Rozenberg
Institute of Applied Mathematics and Computer Science
University of Leiden, Niels-Bohr-Weg 1, P. O. Box 9512
NL-2300 RA Leiden, The Netherlands
Prof. Dr. Arto Salomaa
The Academy of Finland
Department of Mathematics, University of Turku
FIN-20500 Turku, Finland
Library of Congress Cataloging-in-Publication Data
Calude, Cristian
Information and randomness: an algorithmic perspective / Cristian Calude;
forewords by A. Salomaa and Gregory J. Chaitin.
(Monographs in theoretical computer science)
Includes bibliographical references and index.
ISBN 978-3-662-03051-6
ISBN 978-3-662-03049-3 (eBook)
DOI 10.1007/978-3-662-03049-3
1. Machine theory. 2. Computational complexity. 3. Stochastic processes.
I. Title. II. Series: EATCS monographs in theoretical computer science.
QA267.C33 1995
003'.54'015113-dc20
94-33125
This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitations, broadcasting, reproduction on microfilms or in other ways, and storage in data banks. Duplication of this publication or parts thereof is only permitted under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer-Verlag Berlin Heidelberg GmbH. Violations fall under the prosecution act of the German Copyright Law.
© Springer-Verlag Berlin Heidelberg 1994
Originally published by Springer-Verlag Berlin Heidelberg New York in 1994
Softcover reprint of the hardcover 1st edition 1994
The use of registered names, trademarks, etc. in this publication does not imply, even in the
absence of a specific statement, that such names are exempt from the relevant protective laws
and regulations and therefore free for general use.
Production: PRODUserv Springer Produktions-Gesellschaft, Berlin
Data conversion by Lewis & Leins, Berlin
Cover-layout: MetaDesign plus GmbH, Berlin
SPIN 10129482 45/3020-543210 - Printed on acid-free paper
A Note from the Series Editors
The EATCS Monographs series already has a fairly long tradition of more
than thirty volumes over ten years. Many of the volumes have turned out
to be useful also as textbooks. To give more freedom for prospective authors
and more choice for the audience, a Texts series has been branched off:
Texts in Theoretical Computer Science. An EATCS Series.
Texts published in this series are intended mostly for graduate level. Typically, an undergraduate background in computer science will be assumed.
However, the background required will vary from topic to topic, and some
books will be self-contained. The texts will cover both modern and classical
areas with an innovative approach that may give them additional value as
monographs. Most books in this series will have examples and exercises.
The original series continues as
Monographs in Theoretical Computer Science. An EATCS Series.
Books published in this series present original research or material of interest
to the research community and graduate students. Each volume is normally
a uniform monograph rather than a compendium of articles. The series also
contains high-level presentations of special topics. Nevertheless, as research
and teaching usually go hand in hand, these volumes may still be useful as
textbooks, too.
The present volume is an excellent example of a monograph that also has textbook potential. Enjoy!
June 1994
W. Brauer, G. Rozenberg, A. Salomaa
Editor's Foreword
The present book by Calude fits very well in the EATCS Monographs series.
Much original research is presented especially on topological aspects of algorithmic information theory. The theory of complexity and randomness is
developed with respect to an arbitrary alphabet, not necessarily binary. This
approach is richer in consequences than the classical one.
Remarkably, however, the text is so self-contained and coherent that the
book may also serve as a textbook. All proofs are given in the book and thus
it is not necessary to consult other sources for classroom instruction.
The research in algorithmic information theory is already some 30 years
old. However, only the recent years have witnessed a really vigorous growth
in this area. As a result, also the early history of the field from the mid-1960s has become an object of debates, sometimes rather hectic. This is very
natural because in the early days many authors were not always careful in
their definitions and proofs.
In my estimation, the present book has a very comprehensive list of references. Often results not at all used in the book are referenced for the sake
of historical completeness. The system of crediting original results is stated
clearly in the preface and followed consistently throughout the book.
May 1994
Arto Salomaa
Academy of Finland
Foreword
Algorithmic information theory (AIT) is the result of putting Shannon's information theory and Turing's computability theory into a cocktail shaker and shaking vigorously. The basic idea is to measure the complexity of an object by the size in bits of the smallest program for computing it.
AIT appeared in two installments.
In the original formulation of AIT, AIT1, which lasted about a decade, there were 2^N programs of size N. For the past twenty years, AIT1 has been superseded by a theory, AIT2, in which no extension of a valid program is a valid program. Therefore there are far fewer than 2^N possible programs of size N.
I have been the main intellectual driving force behind both AIT1 and AIT2, and in my opinion AIT1 is only of historical or pedagogic interest. Unfortunately, AIT1 is better known at this time by the general scientific public than the new and vastly superior AIT2. Most people who talk about program-size complexity are unaware of the fact that they are using a completely obsolete version of this concept! This book should help to remedy this situation.
In my opinion, program-size complexity is a much deeper concept than
run-time complexity, which however is of greater practical importance in
designing useful algorithms.
The main applications of AIT are twofold. First, to give a mathematical definition of what it means for a string of bits to be patternless, random, unstructured, typical. Indeed, most bit strings are algorithmically irreducible and therefore random. And, even more important, AIT casts an entirely new light on the incompleteness phenomenon discovered by Gödel. AIT does this by placing information-theoretic limits on the power of any formal axiomatic theory.
The new information-theoretic viewpoint provided by AIT suggests that incompleteness is natural and pervasive and cannot be brushed away in our everyday mathematical work. Indeed, AIT provides theoretical support for a quasi-empirical attitude to the foundations of mathematics and for adopting new arithmetical axioms that are not self-evident but are only justified pragmatically.
There are also connections between AIT and physics.
The program-size complexity measure of AIT is analogous to the Boltzmann entropy concept that plays a key role in statistical mechanics. And my
work on Hilbert's 10th problem using AIT shows that God not only plays dice
in quantum mechanics and nonlinear dynamics, but even in elementary number theory. AIT thus plays a role in recent efforts to build a bridge between
theoretical computer science and theoretical physics.
In this spirit, I should point out that a universal Turing machine is, from
a physicist's point of view, just a physical system with such a rich repertoire
of possible behavior that it can simulate any other physical system. This
bridge-building is also connected with recent efforts by theoretical physicists
to understand complex physical systems such as those encountered in biology.
This book, benefiting as it does from Cristian Calude's own research in
AIT and from his experience teaching AIT in university courses around the
world, should help to make the detailed mathematical techniques of AIT accessible to a much wider audience.
April 1993
G. J. Chaitin
IBM Watson Research Center
Preface
We sail within a vast sphere, ever drifting in uncertainty, driven from
end to end. When we think to attach ourselves to any point and to
fasten to it, it wavers and leaves us; and if we follow it, it eludes our
grasp, slips past us, and vanishes forever.
Blaise Pascal
This book represents an elementary and, to a large extent, subjective introduction to algorithmic information theory (AIT). As is clear from its name, this theory deals with algorithmic methods in the study of the quantity of information.
While the classical theory of information is based on Shannon's concept
of entropy, AIT adopts as a primary concept the information-theoretic complexity or descriptional complexity of an individual object. The entropy is
a measure of ignorance concerning which possibility holds in a set endowed
with an a priori probability distribution. Its point of view is largely global.
The classical definition of randomness as considered in probability theory and used, for instance, in quantum mechanics allows one to speak of a process (such as tossing a coin, or measuring the diagonal polarization of a horizontally-polarized photon) as being random. It does not allow one to call a particular outcome (or string of outcomes, or sequence of outcomes) random, except in an intuitive, heuristic sense. The information-theoretic complexity
of an object (independently introduced in the mid-1960s by R. J. Solomonoff,
A. N. Kolmogorov and G. J. Chaitin) is a measure of the difficulty of specifying that object; it focuses the attention on the individual, allowing one
to formalize the randomness intuition. An algorithmically random string is
one not producible from a description significantly shorter than itself, when
a universal computer is used as the decoding apparatus.
Our interest is mainly directed to the basics of AIT. The first three chapters present the necessary background, i.e. relevant notions and results from
recursion theory, topology, probability, noiseless coding and descriptional
complexity. In Chapter 4 we introduce two important tools: the Kraft-Chaitin
Theorem (an extension of Kraft's classical condition for the construction of
prefix codes corresponding to arbitrary recursively enumerable codes) and
relativized complexities and probabilities. As a major result, one computes
the halting probability of a universal, self-delimiting computer and one proves
that Chaitin's complexity equals, within O(1), the halting entropy (Coding
Theorem). Chapter 5 is devoted to the definition of random strings and to the
proof that these strings satisfy almost all stochasticity requirements, e.g. almost all random strings are Borel normal. Random sequences are introduced
and studied in Chapter 6. In contrast with the case of strings - for which randomness is a matter of degree - the definition of random sequences is "robust". With probability one every sequence is random (Martin-Löf Theorem) and every sequence is reducible to a random one (Gács Theorem); however, the set of random sequences is topologically "small". Chaitin's Omega Number,
defined as the halting probability of a universal self-delimiting computer, has
a random sequence of binary digits; the randomness property is preserved
even when we re-write this number in an arbitrary base. In fact, a more
general result is true: random sequences are invariant under change of base.
We develop the theory of complexity and randomness with respect to an
arbitrary alphabet, not necessarily binary. This approach is more general and
richer in consequences than the classical one; see especially Sections 4.5 and
6.7.
The concepts and results of AIT are relevant for other subjects, for instance for logic, physics and biology. A brief exploration of some applications
may be found in Chapter 7. Finally, Chapter 8 is dedicated to some open
problems.
The literature on AIT has grown significantly in recent years. Chaitin's books Algorithmic Information Theory, Information, Randomness & Incompleteness and Information-Theoretic Incompleteness are fundamental for the subject. Osamu Watanabe has edited a beautiful volume entitled Kolmogorov Complexity and Computational Complexity, published in 1992 by Springer-Verlag. Ming Li and Paul Vitányi have written a comprehensive book, An Introduction to Kolmogorov Complexity and Its Applications, published by Springer-Verlag. Karl Svozil is the author of an important book entitled Randomness & Undecidability in Physics, published by World Scientific in 1993.
The bibliography tries to be as complete as possible. In crediting a result I
have cited the first paper in which the result is stated and completely proven.
*
I am most grateful to Arto Salomaa for being the springboard of the project
leading to this book, for his inspiring comments, suggestions and permanent
encouragement.
I reserve my deepest gratitude to Greg Chaitin for many illuminating
conversations about AIT that have improved an earlier version of the book,
for permitting me to incorporate some of his beautiful unpublished results
and for writing the Foreword.
My warm thanks go to Charles Bennett, Ronald Book, Egon Börger, Wilfried Brauer, Douglas Bridges, Cezar Câmpeanu, Ion Chiţescu, Rusins Freivalds, Péter Gács, Josef Gruska, Juris Hartmanis, Lane Hemaspaandra (Hemachandra), Gabriel Istrate, Helmut Jürgensen, Mike Lennon, Ming Li, Jack Lutz, Solomon Marcus, George Markowsky, Per Martin-Löf, Hermann Maurer, Ion Măndoiu, Michel Mendes-France, George Odifreddi, Roger Penrose, Marian Pour-El, Grzegorz Rozenberg, Charles Rackoff, Sergiu Rudeanu, Bob Solovay, Ludwig Staiger, Karl Svozil, Andy Szilard, Doru Ştefănescu, Garry Tee, Monica Tătărâm, Mark Titchener, Vladimir Uspensky, Dragoş Vaida, and Marius Zimand for stimulating discussions and comments; their beautiful ideas and/or results are now part of this book.
This book was typeset using the LaTeX package CLMono01 produced by Springer-Verlag. I offer special thanks to Helmut Jürgensen, Kai Salomaa, and Jeremy Gibbons - my TeX and LaTeX teachers.
I have taught parts of this book at Bucharest University (Romania), the
University of Western Ontario (London, Canada) and Auckland University
(New Zealand). I am grateful to all these universities, specifically to the respective chairs Ioan Tomescu, Helmut Jürgensen, and Bob Doran, for the
assistance generously offered. My eager students have influenced this book
more than they may imagine.
I am indebted to Bruce Benson, Rob Burrowes, Peter Dance, and Peter
Shields for their competent technical support.
The co-operation with Frank Holzwarth, J. Andrew Ross, and Hans Wössner from Springer-Verlag was particularly efficient and pleasant.
Finally, a word of gratitude to my wife Elena and daughter Andreea; I
hope that they do not hate this book as writing it took my energy and attention for a fairly long period.
March 1994
Cristian Calude
Auckland, New Zealand
Table of Contents

1. Mathematical Background ............................... 1
   1.1 Prerequisites ..................................... 1
   1.2 Recursive Function Theory ......................... 2
   1.3 Topology .......................................... 5
   1.4 Probability Theory ................................ 6

2. Noiseless Coding ..................................... 15
   2.1 Prefix-Free Sets ................................. 15
   2.2 Instantaneous Coding ............................. 17
   2.3 Exercises and Problems ........................... 22
   2.4 History of Results ............................... 23

3. Program Size ......................................... 25
   3.1 An Example ....................................... 25
   3.2 Computers and Complexities ....................... 26
   3.3 Algorithmic Properties of Complexities ........... 32
   3.4 Quantitative Estimates ........................... 33
   3.5 Halting Probabilities ............................ 35
   3.6 Exercises and Problems ........................... 37
   3.7 History of Results ............................... 38

4. Recursively Enumerable Instantaneous Codes ........... 41
   4.1 The Kraft-Chaitin Theorem ........................ 41
   4.2 Relativized Complexities and Probabilities ....... 50
   4.3 Speed-Up Theorem ................................. 59
   4.4 Coding Theorem ................................... 62
   4.5 Binary vs Non-Binary Coding ...................... 64
   4.6 Exercises and Problems ........................... 67
   4.7 History of Results ............................... 69

5. Random Strings ....................................... 71
   5.1 Empirical Analysis ............................... 71
   5.2 Chaitin's Definition of Random Strings ........... 75
   5.3 Relating Complexities K and H .................... 80
   5.4 A Statistical Analysis ........................... 81
   5.5 A Computational Analysis ......................... 89
   5.6 Borel Normality .................................. 93
   5.7 Extensions of Random Strings ..................... 98
   5.8 Exercises and Problems .......................... 103
   5.9 History of Results .............................. 106

6. Random Sequences .................................... 107
   6.1 From Random Strings to Random Sequences ......... 107
   6.2 The Definition of Random Sequences .............. 116
   6.3 Characterizations of Random Sequences ........... 125
   6.4 Properties of Random Sequences .................. 137
   6.5 Reducibility Theorem ............................ 153
   6.6 Chaitin's Omega Number .......................... 166
   6.7 Is Randomness Robust? ........................... 168
   6.8 Exercises and Problems .......................... 178
   6.9 History of Results .............................. 181

7. Applications ........................................ 183
   7.1 Three Information-Theoretic Proofs .............. 183
   7.2 Information-Theoretic Incompleteness ............ 187
   7.3 Coding Mathematical Knowledge ................... 190
   7.4 Randomness in Mathematics ....................... 193
   7.5 Probabilistic Algorithms ........................ 198
   7.6 Structural Complexity ........................... 201
   7.7 What Is Life? ................................... 205
   7.8 Randomness in Physics ........................... 210
   7.9 Metaphysical Themes ............................. 214

8. Open Problems ....................................... 217

Bibliography ........................................... 221
Notation Index ......................................... 233
Subject Index .......................................... 235
Author Index ........................................... 237
1. Mathematical Background
In this chapter we collect facts and results which will be used freely in the
book, in an attempt to make it as self-contained as possible.
1.1 Prerequisites
Denote by N, Q, I and R, respectively, the sets of natural, rational, irrational, and real numbers; N+ = N \ {0} and R+ = {x ∈ R | x ≥ 0}. If S is a finite set, then #S denotes the cardinality of S. We shall use the following functions: i) rem(m, i), the remainder of the integral division of m by i (m, i ∈ N+), ii) ⌊a⌋, the "floor" of the real a (rounding downwards), iii) ⌈a⌉, the "ceiling" of the real a (rounding upwards), iv) the binomial coefficient, v) log_Q, the base Q logarithm; log = ⌊log_2⌋. By | we denote the divisibility predicate. By ⊆ we denote the (non-strict) inclusion relation between sets.
Fix A = {a_1, ..., a_Q}, Q ≥ 2, a finite alphabet. By A* we denote the set of all strings x_1x_2...x_n with elements x_i ∈ A (1 ≤ i ≤ n); the empty string is denoted by λ. A* is a (free) monoid under concatenation (this operation is associative and the empty string is the null element). Let A+ = A* \ {λ}. For x in A*, |x|_A is the length of x (|λ|_A = 0). If there is no ambiguity we write |x| instead of |x|_A. Every total ordering on A, say a_1 < a_2 < ... < a_Q, induces a quasi-lexicographical order on A*:

λ < a_1 < a_2 < ... < a_Q < a_1a_1 < ... < a_1a_Q < ... < a_Qa_Q < a_1a_1a_1 < ... < a_1a_1a_Q < ... < a_Qa_Qa_Q < ...

We denote by string(n) the nth string according to the quasi-lexicographical order. In this way we get a bijective function string : N → A*. It is seen that |string(n)| = ⌊log_Q(n(Q−1)+1)⌋.
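The bijection string : N → A* is exactly bijective base-Q numeration, with the letters a_1 < ... < a_Q playing the role of the "digits" 1, ..., Q. A small sketch (not from the book; the function names are made up) that also checks the length formula with exact integer arithmetic:

```python
# Sketch: string(n) as bijective base-Q numeration over a given alphabet.

def string(n, alphabet):
    """Return the n-th string of alphabet* in quasi-lexicographical order."""
    Q, digits = len(alphabet), []
    while n > 0:
        r = n % Q
        if r == 0:          # digit Q: "borrow" from the next position
            r = Q
        digits.append(alphabet[r - 1])
        n = (n - r) // Q
    return "".join(reversed(digits))   # string(0) is the empty string

def expected_length(n, Q):
    """floor(log_Q(n(Q-1)+1)), computed without floating point."""
    length, power, value = 0, 1, n * (Q - 1) + 1
    while power * Q <= value:
        power *= Q
        length += 1
    return length
```

For A = {a, b} the enumeration starts λ, a, b, aa, ab, ba, bb, ..., and for every n the computed string has length ⌊log_Q(n(Q−1)+1)⌋.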
On A* we define the prefix-order relation as follows: x <_p y iff¹ there exists a string z ∈ A* such that y = xz. A prefix-free set S ⊆ A* is a set such that for all strings x, y ∈ S with x <_p y one has x = y.
If x ∈ A* and i ∈ N, then x^i is the concatenation xx...x (i times), in case i > 0; x^0 = λ. For two subsets S, T ⊆ A* their concatenation ST is defined to be the set {xy | x ∈ S, y ∈ T}. For m in N, A^m = {x ∈ A* | |x| = m}. In case m ≥ 1 we consider the alphabet B = A^m and construct the free monoid B* = (A^m)*. Every x ∈ B* belongs to A*, but the converse is false. For x ∈ B* we denote by |x|_m the length of x (according to B), which is exactly |x|/m.

¹ We adopt the abbreviation iff for "if and only if".
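Prefix-freeness of a finite set can be tested directly from the definition; a minimal sketch (not from the book, helper names made up):

```python
# Sketch: testing prefix-freeness of a finite set S of strings,
# using the prefix-order x <_p y  iff  y = xz for some string z.

def is_prefix(x, y):
    """x <_p y : y extends x by some (possibly empty) string z."""
    return y.startswith(x)

def is_prefix_free(S):
    """S is prefix-free iff x <_p y forces x = y, for all x, y in S."""
    return all(x == y or not is_prefix(x, y) for x in S for y in S)
```

For instance, {0, 10, 110} is prefix-free, while {0, 01} is not, since 0 <_p 01. Prefix-free sets are the subject of Chapter 2.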
For Q ∈ N, Q ≥ 2, let A_Q be the alphabet {0, 1, ..., Q−1}. The elements of A_Q are to be considered as the digits used in natural positional representations of numbers in base Q. Thus, an element a ∈ A_Q denotes both the symbol used in number representations and the numerical value in the range from 0 to Q−1 which it represents. By (n)_Q we denote the base-Q representation of the number n.
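The representation (n)_Q is the usual positional one with digits 0, ..., Q−1, in contrast with the bijective numeration underlying string(n) above; a sketch (illustration only, function name made up):

```python
# Sketch: the base-Q positional representation (n)_Q over the
# digit alphabet A_Q = {0, 1, ..., Q-1}.

def to_base_q(n, Q):
    """Return the list of base-Q digits of n, most significant first."""
    if n == 0:
        return [0]
    digits = []
    while n > 0:
        digits.append(n % Q)
        n //= Q
    return digits[::-1]
```

Note the difference: positional base Q uses the digit 0 and is not injective on strings (leading zeros), whereas bijective base Q uses digits 1, ..., Q and enumerates all strings exactly once.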
By A^ω we denote the set of all (infinite) sequences x = x_1x_2...x_n... with elements x_i in A. The set A^ω is no longer a monoid, but it comes equipped with an interesting probabilistic structure, which will be discussed in Section 1.4.

For x ∈ A^ω and n ∈ N+, put x(n) = x_1...x_n ∈ A*. For S ⊆ A*,

SA^ω = {x ∈ A^ω | x(n) ∈ S, for some natural n ≥ 1};  xA^ω = {x}A^ω, x ∈ A*.
Let f, g : A* → R+ be two functions. We say that f ≤ g + O(1) if there exists a constant c > 0 such that f(x) ≤ g(x) + c, for all strings x ∈ A*; sometimes we may use the notation f ⪯ g. If f ≤ g + O(1) and g ≤ f + O(1), then we write f ~ g. In general,

O(f) = {g : A* → R+ | there exist c ∈ R+, m ∈ N such that g(x) ≤ cf(x), for all strings x, |x| ≥ m}.
A partial function φ : X ⇀ Y is a function defined on a subset Z of X, called the domain of φ (write: dom(φ)). In case dom(φ) = X we say that φ is total and we indicate this by writing φ : X → Y. For x ∈ dom(φ) we write φ(x) ≠ ∞; in the opposite case, i.e. when x ∉ dom(φ), we put φ(x) = ∞. The range of φ is range(φ) = {φ(x) | x ∈ dom(φ)}; the graph of φ is graph(φ) = {(x, φ(x)) | x ∈ dom(φ)}. Two partial functions φ, f : X ⇀ Y are equal iff dom(φ) = dom(f) and φ(x) = f(x), for all x ∈ dom(φ).
Each chapter is divided into sections. The definitions, theorems, propositions, lemmata, corollaries, and facts are sequentially numbered within each chapter. Each proof ends with the Halmos end-mark □.
1.2 Recursive Function Theory
Algorithmic information theory is essentially based on recursion theory.
Informally, an algorithm for computing a partial function φ : N ⇀ N is a finite set of instructions which, given an input x ∈ dom(φ), yields after a finite number of steps the output y = φ(x). The algorithm must specify unambiguously how to obtain each step in the computation from the previous steps and from the input. In case φ is computed by an algorithm we call it a partial computable function; if φ is also total, then it is called a computable function. These informal notions have as formal models the partial recursive functions - abbreviated p.r. functions - respectively, the recursive functions.

A partial function φ : A* ⇀ A* is called partial recursive if there exists a partial recursive function f : N ⇀ N such that

φ(x) = string(f(string⁻¹(x))),

for all x ∈ A*. Similarly, for recursive functions.
There are many equivalent ways to formally define p.r. functions, i.e. by means of Turing machines, Gödel-Kleene equations, Kleene operations, Markov algorithms, abstract programming languages, etc. The essential way does not matter for what follows. The main result to be used is the possibility of enumerating all p.r. functions

φ_e^(n) : (A*)^n ⇀ A*

in such a way that the following two conditions are fulfilled:

Universality Theorem. There is a p.r. function of two variables φ_u^(2)(e, x) such that

φ_u^(2)(e, x) = φ_e^(1)(x).

Uniform Composition Theorem. There is a recursive function of two variables comp such that

φ_comp(x,y)^(1)(z) = φ_x^(1)(φ_y^(1)(z)).
The p.r. functions of one variable, φ_x = φ_x^(1), are essential for the whole theory as there exist pairing functions, i.e. recursive bijective functions <·,·> : A* × A* → A*, which may be iterated and by which one can reduce the number of arguments.
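One concrete pairing function (an illustration, not the book's construction) is obtained by conjugating the classical Cantor pairing of natural numbers with the bijection string : N → A* of Section 1.1; all function names below are made up, and A = {a, b} is fixed for the sketch:

```python
# Sketch: a recursive bijection <.,.> : A* x A* -> A*, built from the
# Cantor pairing N x N -> N and the bijection string : N -> A*
# (bijective base-Q numeration over the alphabet "ab").

from math import isqrt

def string(n, alphabet="ab"):
    Q, digits = len(alphabet), []
    while n > 0:
        r = n % Q or Q              # digits run over 1..Q
        digits.append(alphabet[r - 1])
        n = (n - r) // Q
    return "".join(reversed(digits))

def string_inv(s, alphabet="ab"):
    Q, n = len(alphabet), 0
    for c in s:
        n = n * Q + (alphabet.index(c) + 1)
    return n

def cantor_pair(m, n):
    return (m + n) * (m + n + 1) // 2 + n

def cantor_unpair(k):
    w = (isqrt(8 * k + 1) - 1) // 2   # largest w with w(w+1)/2 <= k
    n = k - w * (w + 1) // 2
    return w - n, n

def pair(x, y):
    return string(cantor_pair(string_inv(x), string_inv(y)))

def unpair(z):
    m, n = cantor_unpair(string_inv(z))
    return string(m), string(n)
```

Iterating pair reduces any number of string arguments to one, which is why the one-variable functions φ_x suffice.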
As a basic result one gets

Theorem 1.1 (Kleene). For every m ∈ N+ and every recursive function f there effectively exists an x (called a fixed point of f) such that φ_x^(m) = φ_{f(x)}^(m).
The subsets of A* are studied from the point of view of calculability. A set X ⊆ A* is recursive if its characteristic function is recursive. A weaker property is recursive enumerability: a set X is recursively enumerable - abbreviated r.e. - if it is either empty or else the range of some recursive function. Equivalently, X is r.e. if it is the domain of a p.r. function. An infinite r.e. set is the range of some one-one recursive function, i.e. it can be enumerated injectively. Every infinite r.e. set has an infinite recursive subset. As usual, W_i = dom(φ_i) is an enumeration of all r.e. sets.
There exists a very strong relation between computations and polynomials. To every polynomial P(x, y_1, y_2, ..., y_m) with integer coefficients one associates the set

D = {x ∈ N | P(x, y_1, y_2, ..., y_m) = 0, for some y_1, y_2, ..., y_m ∈ Z}.

Call a set Diophantine if it is of the above form. The main relation is given by the following result:

Theorem 1.2 (Matijasevič). A set is r.e. iff it is Diophantine.
If the polynomial P is built up not only by means of the operations of addition and multiplication, but also by exponentiation, then it is called an exponential polynomial. Using exponential polynomials instead of polynomials we may define in a straightforward way the notion of an exponential Diophantine set. Of course, by Matijasevič's Theorem, a set is r.e. iff it is exponential Diophantine. However, a stronger result may be proven. Call a set D singlefold exponential Diophantine if it is exponential Diophantine via the exponential polynomial P(x, y_1, y_2, ..., y_m) and for each x ∈ D there is a unique m-tuple of non-negative integers y_1, y_2, ..., y_m such that P(x, y_1, y_2, ..., y_m) = 0.

Theorem 1.3 (Jones-Matijasevič). A set is r.e. iff it is singlefold exponential Diophantine.

For more details see Matijasevič's recent monograph [170], and the Jones and Matijasevič paper [126]. It is not known whether singlefold representations are always possible without exponentiation.
A function f : A* → R+ is called semi-computable from below if its graph approximation set

{(x, r) ∈ A* × Q | r < f(x)}

is r.e. A function f is semi-computable from above if −f is semi-computable from below. If f is semi-computable from both below and above, then f is called computable. It is not too difficult to see that:

A function f is semi-computable from below iff there exists a non-decreasing (in n) recursive function g : A* × N → Q such that f(x) = lim_{n→∞} g(x, n).

A function f is computable iff there exists a recursive function g : A* × N → Q such that for all n ≥ 1, |f(x) − g(x, n)| < 1/n.
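These characterizations can be illustrated on a toy example (an illustration only; the string argument is dropped and a single real value is approximated): truncating a real to n binary digits yields a non-decreasing rational approximation from below with error less than 2^(−n) ≤ 1/n.

```python
# Sketch: a monotone rational approximation g(n) of sqrt(2) from below,
# in the spirit of the two characterizations above.
# Exact integer arithmetic only; no floating point.

from fractions import Fraction
from math import isqrt

def g(n):
    """Truncation of sqrt(2) to n binary digits: floor(2^n*sqrt(2))/2^n."""
    return Fraction(isqrt(2 * 4**n), 2**n)
```

Here g(n) is non-decreasing in n and converges to √2 from below; since √2 is also semi-computable from above (truncate −√2 similarly), it is computable.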
For more facts in recursion theory we recommend the following books: Azra, Jaulin [3], Börger [27], Bridges [28], Calude [31], Cohen [76], Mal'cev [161], Odifreddi [183], Rogers [198], Salomaa [205], Soare [215], Wood [247].
1.3 Topology
We are going to use some rudiments of topology, mainly to measure the size of different sets. The idea that comes naturally to mind is to use a Baire-like classification.
Given a set X, a topology on X is a collection τ of subsets of X such that:

1. ∅ ∈ τ and X ∈ τ.
2. For every U ∈ τ and V ∈ τ, we have U ∩ V ∈ τ.
3. For every W ⊆ τ, we have ∪W ∈ τ.
When a topology has been chosen, its members are called open sets. Their complements are called closed sets. The pair (X, τ) is called a topological space.
An alternative, equivalent way to define a topology is by means of a closure operator Cl (mapping subsets of X into subsets of X) satisfying the following (Kuratowski) properties:

1. Cl(∅) = ∅.
2. Z ⊆ Cl(Z), for all subsets Z ⊆ X.
3. Cl(Cl(Z)) = Cl(Z), for all subsets Z ⊆ X.
4. Cl(Y ∪ Z) = Cl(Y) ∪ Cl(Z), for all subsets Y, Z ⊆ X.

For instance, in the topological space (X, τ) the closure operator Cl_τ is defined by

Cl_τ(Z) = ∩{F ⊆ X | Z ⊆ F, F is closed}.
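On a small finite space the operator Cl_τ can be computed directly from the closed sets and the four Kuratowski properties checked exhaustively; a sketch (the three-point space and its topology are made-up examples):

```python
# Sketch: Cl_t(Z) as the intersection of all closed supersets of Z,
# on a made-up finite topological space; the Kuratowski axioms are
# then verified over all subsets.

from itertools import chain, combinations

X = frozenset({1, 2, 3})
# A topology on X: contains {} and X, closed under union and intersection.
topology = {frozenset(), frozenset({1}), frozenset({1, 2}), X}
closed_sets = {X - U for U in topology}

def cl(Z):
    """Closure of Z: intersection of every closed set containing Z."""
    result = X
    for F in closed_sets:
        if Z <= F:
            result &= F
    return result

def subsets(S):
    """All subsets of the finite set S."""
    return [frozenset(c) for c in chain.from_iterable(
        combinations(sorted(S), r) for r in range(len(S) + 1))]
```

For this topology the closed sets are ∅, {3}, {2, 3} and X, so for instance cl({2}) = {2, 3}, and all four axioms hold for every pair of subsets.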
Let τ be a topology on a set X and let Cl_τ be its closure operator. A set T ⊆ X is said to be rare with respect to τ if for every x ∈ X and every open neighborhood N_x of x one has N_x ⊄ Cl_τ(T). A set which is a countable union of rare sets is called meagre, or a set of the first Baire category. A set which is not meagre is called a second Baire category set. A dense set is a set whose closure is equal to the whole space. Passing to complements we get co-rare, co-meagre, co-dense sets.

Intuitively, the properties of being rare, meagre, dense, second Baire category, co-meagre, co-rare describe an increasing scale for the "sizes" of subsets of X, according to the topology τ. Thus, for instance, a dense set is "larger" than a rare one, and a co-rare set is "larger" than a dense set.
We shall work with the spaces of strings and sequences endowed with topologies induced by various order relations. If < is an order relation on A*, then the induced topology is defined by means of the closure operator Cl_{τ(<)} acting as follows:

Cl_{τ(<)}(Z) = {u ∈ A* | v < u, for some v ∈ Z},

or, equivalently, by means of the basic open neighborhoods

N_u^< = {v ∈ A* | u < v}.
The space of sequences A^ω is endowed with the topology generated by the sets xA^ω, x ∈ A*.

Various conditions of constructivity will be discussed when using these topologies.
Let (X, τ) be a topological space. A subset S of X is compact if whenever
W ⊂ τ and S = ∪W, there is a finite V ⊂ W such that S = ∪V. If X is
itself compact, then we say that the topological space (X, τ) is compact.
Let (X_i, τ_i) be topological spaces for all i ∈ I. Let X be the Cartesian
product X = ∏_{i∈I} X_i. Let p_i be the projection from X onto the i-th
coordinate space X_i:

p_i({x_j}_{j∈I}) = x_i,  i ∈ I.

There is a unique topology τ on X - called the product topology - which is
the smallest topology on X making all coordinate projections continuous, i.e.
for all W ∈ τ_i, one has p_i^{-1}(W) ∈ τ.
Theorem 1.4 (Tychonoff). Let (X_i, τ_i) be compact topological spaces for all
i ∈ I. Then the Cartesian product X = ∏_{i∈I} X_i endowed with the product
topology is compact.
In the case of the space of sequences A^ω one can see that the topology induced
by the family (xA^ω)_{x∈A*} coincides with the product topology of an infinity
of copies of A, each of which comes with the discrete topology (i.e. every
subset of A is open). So, by Tychonoff's Theorem, the space of all sequences
is compact.
See more on topology in Kelley [134].
1.4 Probability Theory
In this section we describe the probabilities on the space of sequences A^ω.
Probabilities are easiest to define for finite sets; see, for instance, Chung
[73] and Feller [97]. The classical example concerns a toss of a fair² coin. We
may model this situation by means of an alphabet A = {0, 1}, 0 = "heads", 1 =
"tails". We agree to set to 1 the total probability of all possible outcomes.
Also, if two possible outcomes cannot both happen, then we assume that their
probabilities add. Introducing the notation "P(...)" for "the probability of
...", we may write the relations P(0) + P(1) = 1, P(0) = P(1), so P(0) =
P(1) = 1/2.
Let us toss our fair coin. If we toss it twice we get four possible outcomes

00, 01, 10, 11,

each of which has the probability 1/4. In general, if the coin is tossed n times,
we get 2^n possible strings of length n over the alphabet A = {0, 1}; and each
string has probability 2^{-n}.
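These finite-case probabilities are easy to verify mechanically. The following sketch (our own illustration, with n = 4 chosen arbitrarily) enumerates all outcome strings and checks that each has probability 2^{-n} and that the probabilities sum to 1:

```python
from itertools import product

# Enumerate all outcome strings of n tosses of a fair coin.
n = 4
outcomes = ["".join(t) for t in product("01", repeat=n)]
assert len(outcomes) == 2 ** n          # 2^n strings of length n

# Each string carries probability 2^-n; the total probability is 1.
prob = {s: 2.0 ** -n for s in outcomes}
assert abs(sum(prob.values()) - 1.0) < 1e-12
print(len(outcomes), prob["0101"])
```
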
² That is, heads and tails are equally likely.
What about letting n tend to infinity? We will get all possible sequences
of 0's and 1's, i.e. the space A^ω = {0, 1}^ω. Note that each sequence has
probability zero, but this does not determine the probabilities of other
interesting sets of possible outcomes, as in the finite case. To see this,
"convert" our sequences into "reals" in the interval [0, 1] by preceding the
sequence by a "binary point" and regarding it as a binary expansion. For
instance, to the sequence

0101010101010101...

we associate the number

0.0101010101010101...,

i.e.

1/4 + 1/16 + ... = 1/3.
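The value 1/3 can be confirmed with exact rational arithmetic. This sketch (ours, using Python's `fractions` module) computes the partial sums Σ_{k=1}^{N} 4^{-k} and checks their convergence to 1/3:

```python
from fractions import Fraction

def partial_sum(terms):
    """Partial sum 1/4 + 1/16 + ... + 1/4^terms, computed exactly."""
    return sum(Fraction(1, 4 ** k) for k in range(1, terms + 1))

print(partial_sum(5))           # 341/1024, already close to 1/3
print(float(partial_sum(30)))   # numerically indistinguishable from 1/3

# The remainder 1/3 - S_N equals (1/3) * 4^-N, so it shrinks geometrically.
assert abs(partial_sum(30) - Fraction(1, 3)) < Fraction(1, 4 ** 30)
```
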
Every number in [0, 1] has such an expansion; the dyadic rationals k2^{-n} (and
only them) have in fact two such expansions. On [0, 1] we have the usual
Lebesgue measure, which assigns to each subinterval its length. Via this
identification we see that a possible string of outcomes on the first n tosses
corresponds to the set of all infinite sequences beginning with that string, and
in turn to a subinterval of [0, 1] of length 2^{-n}. Furthermore, every set of k
different strings of outcomes for the first n tosses corresponds to a set in
[0, 1] with Lebesgue measure k2^{-n}. We may say that the Lebesgue measure
"corresponds" to the probabilities previously defined.
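This correspondence between strings and subintervals can be made explicit. In the sketch below (an illustration of ours), a binary string x of length n is mapped to the interval [0.x, 0.x + 2^{-n}), whose Lebesgue measure equals the probability of x:

```python
from fractions import Fraction

def interval(x):
    """The subinterval of [0, 1] of reals whose binary expansion begins with x."""
    n = len(x)
    left = Fraction(int(x, 2), 2 ** n)
    return left, left + Fraction(1, 2 ** n)

# The string "0101" corresponds to [5/16, 6/16), of length 2^-4.
a, b = interval("0101")
assert b - a == Fraction(1, 16)

# k different strings of the same length give total measure k * 2^-n.
strings = ["000", "011", "110"]
total = sum(interval(s)[1] - interval(s)[0] for s in strings)
assert total == Fraction(3, 8)
print(a, b, total)
```
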
With this correspondence in mind - as a guide - we pass to the "direct"
construction of the uniform probability on the space of all sequences, over a
not necessarily binary alphabet A. For the rest of this section we shall follow
Calude and Chitescu [36].
First, let us review some notions. A (Boolean) ring of sets is a non-empty
class R of sets which is closed under union and difference. A (Boolean) algebra
of sets is a non-empty class R of sets which is closed under union and
complementation. A σ-ring is a ring which is closed under countable union,
and a σ-algebra is an algebra which is closed under countable union. In every
set X, the collection of all finite sets is a ring, but not an algebra unless X
is finite. The collection of all finite and co-finite sets is an algebra, but not
a σ-algebra, unless X is finite. The collection of all subsets of a given set is
a σ-algebra. So, for any family C of subsets of a given set we can construct
the smallest σ-algebra containing C; it is called the σ-algebra generated by C.
In a topological space, the σ-algebra generated by the open sets is called the
Borel σ-algebra, and sets in it are called Borel sets.
Let R be a ring. A measure is a real-valued, non-negative, and countably
additive function μ defined on R such that μ(∅) = 0.³ A measure for which
the whole space has measure one is called a probability.

³ The function μ is countably additive if for every disjoint sequence {E_n}_{n≥0} of
sets in R, whose union is also in R, we have μ(∪_{n≥0} E_n) = Σ_{n≥0} μ(E_n).
Every ring R generates a unique σ-ring S(R). If μ is a finite measure on
a ring R, then there is a unique measure μ̄ on the σ-ring S(R) such that for
every E ∈ R, μ̄(E) = μ(E); the measure μ̄ is finite. See, for instance, Dudley
[94].
Consider now the total space A^ω. One can see that the class of sets

P = {xA^ω | x ∈ A*} ∪ {∅}

has the following properties:

1. xA^ω ⊂ yA^ω iff y <_p x,
2. xA^ω ∩ yA^ω ≠ ∅ iff x <_p y or y <_p x,
3. in case X, Y ∈ P, X ∩ Y ∈ {X, Y, ∅}.
Next let us consider the topology on A^ω generated by P, which coincides
with the product topology on A^ω, previously discussed. Also, note that every
element in P is both open and compact, and the σ-algebra generated by P
is exactly the Borel σ-algebra. Indeed, because P consists of open sets, we
get one inclusion; on the other hand, every open set is a union (at most
countable) of sets in the canonical basis generating the product topology,
and every set in this basis is a finite union of sets in P.
Theorem 1.5. If X and (X_i)_{i∈N} are in P, and X = ∪_{i∈N} X_i, the sets X_i
being mutually disjoint, then only a finite number of the X_i are non-empty.
Proof. Let X = ∪_{i∈N} X_i, with the X_i as above, and suppose X_i ≠ ∅ for
infinitely many i ∈ N. Because X is compact and all the X_i are open, we can
find a natural number n such that

X = ∪_{i=1}^{n} X_i.

Let m > n be such that X_m ≠ ∅. Every sequence x ∈ X_m belongs to X;
consequently it belongs to some X_i with i ≤ n < m, contradicting the fact
that X_i and X_m are disjoint. □
Before passing further we note that for every string x ∈ A* and every natural
k ≥ |x|, there exists a single partition of xA^ω formed with elements zA^ω,
with |z| = k, namely

xA^ω = ∪_{y∈A*, |y|=k-|x|} xyA^ω.
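This partition, and the additivity of the uniform measure over it, can be checked on a small example. The sketch below is our own illustration; the assignment μ(zA^ω) = r^{-|z|}, with r the alphabet size, anticipates the uniform probability constructed in this section:

```python
from fractions import Fraction
from itertools import product

# Partition the cylinder xA^w into cylinders zA^w with |z| = k, i.e. z = xy
# over all y with |y| = k - |x|. Illustrative choices: A = {0,1}, x = "0", k = 3.
A = "01"
x = "0"
k = 3
pieces = [x + "".join(y) for y in product(A, repeat=k - len(x))]
assert len(pieces) == len(A) ** (k - len(x))
assert len(set(pieces)) == len(pieces)        # the prefixes are pairwise distinct

# Uniform measure of a cylinder zA^w: r^{-|z|}, r = alphabet size.
mu = lambda z: Fraction(1, len(A) ** len(z))
assert sum(mu(z) for z in pieces) == mu(x)    # additivity over the partition
print(pieces)
```
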
We introduce the class C of all finite mutually disjoint unions of sets in P.
Theorem 1.6. The class C is an algebra.