ICS-C2000 Theoretical Computer Science Recap from the previous

Topics:
Countable and uncountable sets
Universal Turing machines
ICS-C2000 Theoretical Computer Science
The diagonal and universal languages
Round 9: Decidability
Material:
Aalto University
School of Science
Deptartment of Computer Science
In Finnish: sections 6.1, 6.2, 1.7, 6.3 and 6.4 in “Orposen pruju”
In English: sections 4.1–4.2 in Sipser’s book and these slides
Spring 2016
ICS-C2000 Theoretical Computer Science / Round 9
Aalto University / Dept. Computer Science
2/40
Recap from the previous round
Example: A Turing machine recognizing the language {ak bk ck | k ≥ 0}
b/b, R
C/C, R
a/a, R
B/B, R
q0
a/A, R
b/B, R
q1
⊳/⊳, L
a/A, R
q4
B/B, R
⊳/⊳, L
q5
B/B, R
C/C, R
A/A, R
q2
c/C, L
q3
C/C, L
b/b, L
B/B, L
a/a, L
Computation on the input aabbcc:
(q0 , aabbcc) ` (q1 , Aabbcc) `
(q1 , Aabbcc) ` (q2 , AaBbcc) `
(q2 , AaBbcc) ` (q3 , AaBbCc) `
(q3 , AaBbCc) ` (q3 , AaBbCc) `
(q3 , AaBbCc) ` (q4 , AaBbCc) `
(q1 , AABbCc) ` (q1 , AABbCc) `
(q2 , AABBCc) ` (q2 , AABBCc) `
(q3 , AABBCC) ` (q3 , AABBCC) `
(q3 , AABBCC) ` (q3 , AABBCC) `
(q4 , AABBCC) ` (q5 , AABBCC) `
(q5 , AABBCC) ` (q5 , AABBCC) `
(q5 , AABBCC/) `
(qacc , AABBCC/).
Some computability theory
ICS-C2000 Theoretical Computer Science / Round 9
ICS-C2000 Theoretical Computer Science / Round 9
Aalto University / Dept. Computer Science
Aalto University / Dept. Computer Science
3/40
4/40
The Church–Turing thesis: (strong enough) computing device ≡
Turing machine
Computability theory: Study what can be, and especially what
cannot be computed with Turing machines
An important distinction: Machines that always halt and those
that don’t
An alternative terminoly:
Recall the connection between languages and decision problems
(binary input-output relations): the language AΠ corresponding to
a decision problem Π consists of those inputs x for which the
answer is “yes” (binary output 1) in the decision problem Π
The decision problem Π is
Definition 6.1
I
A Turing machine
I
I
M = (Q, Σ, Γ, δ, q0 , qacc , qrej )
is total (a decider in Sipser’s book) if it halts on all possible input
strings. A language A is
decidable if the corresponding language AΠ is recursive, and
semi-decidable if AΠ is recursively enumerable.
A problem that is not decidable is called undecidable. (Note: an
undecidable problem can be semi-decidable.)
In other words, a decision problem is (i) decidable if it has an
algorithm that solves it correctly and always terminates, and (ii)
semi-decidable if it has an algorithm that solves it correctly but
may not terminate on some “no” instances.
recursively enumerable (also called Turing-recognizable) if it can
be recognized with some Turing machine
recursive (also called Turing-decidable) if it can be recognized by
some total Turing machine.
ICS-C2000 Theoretical Computer Science / Round 9
ICS-C2000 Theoretical Computer Science / Round 9
Aalto University / Dept. Computer Science
Aalto University / Dept. Computer Science
5/40
6/40
Example:
Example:
The language
The language
{an bn cn | n ≥ 0}
{n | n is a prime number in binary}
over the alphabet {a, b, c} corresponds to the decision problem
over the alphabet {0, 1} corresponds to the decision problem
Given a string x over the alphabet {a, b, c}. Is x of form
an bn cn for some n ≥ 0?
Given a binary number n. Is n a prime number?
ICS-C2000 Theoretical Computer Science / Round 9
ICS-C2000 Theoretical Computer Science / Round 9
Aalto University / Dept. Computer Science
Aalto University / Dept. Computer Science
7/40
8/40
6.2 Some basic properties of recursive and recursively
enumerable languages
Example:
The decision problem
Lemma 6.1
Given a multivariate polynomial P.
Does the polynomial have integer roots?
Let A, B ⊆ Σ∗ be recursive languages. Then Ā = Σ∗ − A, A ∪ B, and
A ∩ B are recursive languages as well.
can be expressed as the language
Proof
(i) Let MA be a total Turing machine with L (MA ) = A. We obtain a
total Turing machine MĀ recognizing Ā simply by swapping the
accept and reject states of MA :
{P | P is a multivariate polynomial having an integer root}
over, for instance, the ASCII alphabet when we agree some representation for polynomials, e.g., x1^3*x2-7*x1 could represent the polynomial x13 x2 − 7x1 .
MA
ICS-C2000 Theoretical Computer Science / Round 9
ICS-C2000 Theoretical Computer Science / Round 9
Aalto University / Dept. Computer Science
Aalto University / Dept. Computer Science
9/40
10/40
Lemma 6.2
Let A, B ⊆ Σ∗ be recursively enumerable languages. Then A ∪ B and
A ∩ B are recursively enumerable languages, too.
(ii) Let MA and MB be total Turing machines with L (MA ) = A and
L (MB ) = B. We obtain a total Turing machine M recognizing the
language A ∪ B by sequential composition of MA and MB : if MA
accepts the input, then M also accepts; if MA rejects the input,
then execute MB on the input string.
Proof
A ∩ B as in Lemma 6.1 and A ∪ B with a construction similar to that in
Lemma 6.3 (left as an exercise).
Lemma 6.3
A language A ⊆ Σ∗ is recursive if and only if the languages A and Ā
are recursively enumerable.
MA
MB
Proof
By Lemma 6.1(i), if A is recursive, then Ā is also recursive and thus
both A and Ā are recursively enumerable as well.
We next show that if A and Ā are both recursively enumerable, then A
is recursive:
(iii) A ∩ B = Ā ∪ B̄.
ICS-C2000 Theoretical Computer Science / Round 9
ICS-C2000 Theoretical Computer Science / Round 9
Aalto University / Dept. Computer Science
Aalto University / Dept. Computer Science
11/40
12/40
Let MA and MĀ be Turing machines
recognizing the languages A and Ā,
respectively. For all x ∈ Σ∗ , either MA or
MĀ halts and accepts x.
We build a 2-tape Turing machine M by
combining MA and MĀ “in parallel”: on the
tape 1, M simulates the machine MA , and
on the tape 2 it simulates the machine MĀ .
MA
MĀ
Countable and uncountable sets
If the simulation on the tape 1 halts in an accepting configuration, then
M accepts the input. If the tape 2 simulation accepts, then M rejects
the input.
Corollary 6.4
Let A ⊆ Σ∗ be a recursively enumerable language that is not recursive.
Then the language Ā is not recursively enumerable.
ICS-C2000 Theoretical Computer Science / Round 9
ICS-C2000 Theoretical Computer Science / Round 9
Aalto University / Dept. Computer Science
Aalto University / Dept. Computer Science
13/40
14/40
1.7 Countable and uncountable sets
Lemma 1.11
Assume a finite alphabet Σ. The set Σ∗ of all strings over Σ is
countably infinite.
Definition 1.10
A set X is countably infinite if there is a bijection f : N → X .
Proof
A set is countable if it is finite or countably infinite.
We construct a bijection f : N → Σ∗ as follows. Let Σ = {a1 , a2 , . . . , an }.
We fix some arbitrary “alphabetical order” for the symbols in Σ, for instance, a1 < a2 < · · · < an .
The strings in Σ∗ can now be enumerated in shortlex order with respect
to the chosen alphabetical order:
A set that is not countable is called uncountable.
Intuitivly, a set X is countable if its elements can be ordered and
indexed with natural numbers:
I X = {x , x , x , . . . , x
0 1 2
n−1 } if X is a finite set with n elements, and
I X = {x , x , x , . . .} if X is countably infinite.
0 1 2
we first enumerate string of length 0 (i.e., ε), then those of length
1 (i.e., a1 , a2 , . . . , an ), then those of length 2, and so on
Each subset of a countable set is also a countable set (proof left
as an exercise). On the other hand, uncountable sets have both
countable and uncountable subsets. Thus uncountable sets are,
in some sense, “larger” than countable sets.
inside each length group, the strings are enumerated in the
lexicographical order with respect to the chosen alphabetical
order
ICS-C2000 Theoretical Computer Science / Round 9
ICS-C2000 Theoretical Computer Science / Round 9
Aalto University / Dept. Computer Science
Aalto University / Dept. Computer Science
15/40
16/40
The bijection f is thus
0 7→ ε
2n + 1 7→ a2 a1
1 7→ a1
..
.
2 7→ a2
..
.
..
.
..
.
By Lemma 1.11 the set of such strings is countably infinite, and
therefore the set of all possible programs in any programming
language is countable as well.
3n 7→ a2 an
..
.
n 7→ an
n + 1 7→ a1 a1
..
.
We next prove that the set of languages over any alphabet is
uncountable. Thus there are more languages than there are
possible computer programs and it is impossible, in any
programming language, to write a program for every language.
n2 + n 7→ an an
n + 2 7→ a1 a2
..
.
Note that, in fact all the programs written in any programming
language are just strings over the chosen base alphabet (ASCII
in C, Unicode allowed in some others)
n2 + n + 1 7→ a1 a1 a1
..
.
n2 + n + 2 7→ a1 a1 a2
..
.
2n 7→ a1 an
..
.
ICS-C2000 Theoretical Computer Science / Round 9
ICS-C2000 Theoretical Computer Science / Round 9
Aalto University / Dept. Computer Science
Aalto University / Dept. Computer Science
17/40
18/40
Lemma 1.12
Let Σ be an alphabet. The set of all languages over Σ is uncountable.
Ã
Proof (Cantor’s diagonal argument)
Denote the set P (Σ∗ ) of all languages over Σ by A . Assume that these
languages can be enumerated in some order, i.e.
&
A1
A2
A3 · · ·
1
x0
A = {A0 , A1 , A2 , . . .}.
A0
60
0
0
1 ···
0 ···
0
Let the set of all strings over Σ, i.e. Σ∗ , be enumerated in the shortlex
order as x0 , x1 , x2 , . . .. Using these orders, we define the language à as
x1
0
61
0
à = {xi ∈ Σ∗ | xi ∈
/ Ai }.
x2
1
1
61
1 ···
0
6 0 ···
As à ∈ A and we have assumed that the languages in A can be enumerated, we must have that à = Ak for some k ∈ N. But now, according
to the definition of Ã, we have
x3
..
.
0
..
.
0
..
.
0
..
.
1
.. . .
.
.
Graphically, the idea of the proof can
be illustrated as follows:
We form the “incidence matrix”
over the languages A0 , A1 , A2 , . . .
and the strings x0 , x1 , x2 , . . ..
The cell on row i and column j
there has the value (i) 1 if xi ∈ Aj
and (ii) 0 otherwise.
Now the language à differs from
any other language Ak in the
“diagonal” of the matrix.
xk ∈ Ã ⇔ xk ∈
/ Ak = Ã.
We obtained a contradiction and thus the assumption must be wrong
and A cannot be enumerated.
ICS-C2000 Theoretical Computer Science / Round 9
ICS-C2000 Theoretical Computer Science / Round 9
Aalto University / Dept. Computer Science
Aalto University / Dept. Computer Science
19/40
20/40
Universal Turing Machines
Modern computers are programmable, meaning that they can
execute different programs
Universal Turing Machines
There are even programs that execute other programs (Java
virtual machine, Python interpreter etc)
Can we program Turing machines?
Yes, “universal Turing machines” can execute (simulate) another
Turing machines that they receive as (a part of their) input string
ICS-C2000 Theoretical Computer Science / Round 9
ICS-C2000 Theoretical Computer Science / Round 9
Aalto University / Dept. Computer Science
Aalto University / Dept. Computer Science
21/40
22/40
Computer programs can also tranform other programs into new
programs
For instance, a C++ compiler transforms a program written in C++
into an assembly language program
More generally, programs are just strings and thus they can be
modified quite freely into other programs with text editors or by
another programs
And programs can also be executed as parts of other programs
(libraries etc)
If we agree some encoding scheme for Turing machines, they can
be used and manipulated in the same way!
In the following, we present one encoding for Turing machines
and outline a “universal Turing machine” that can “execute” Turing
machines in the encoding
The details of the encoding are not the main point but the idea
that “Turing machines ≈ algorithms/programs” is
An encoding for Turing machines
Without loss of generality, we study standand, deterministic
1-tape Turing machines whose input alphabet is Σ = {0, 1}
Each scuh machine
M = (Q, Σ, Γ, δ, q0 , qacc , qrej )
can be encoded into a binary string
The idea:
I
I
we agree some ordering for states and tape alphabet, and
use unary coding for encoding the transitions
ICS-C2000 Theoretical Computer Science / Round 9
ICS-C2000 Theoretical Computer Science / Round 9
Aalto University / Dept. Computer Science
Aalto University / Dept. Computer Science
23/40
24/40
Example:
Assume that
I Q = {q , q , . . . , qn } withä qacc = q
0 1
n−1 and qrej = qn
I Γ ∪ {., /} = {a , a , . . . , a } with a = 0, a = 1, a = ., and
m
0 1
0
1
2
a3 = /
Consider the following machine M .
Q = {q0 , q1 , q2 , q3 }
Σ = {0, 1}
Γ = {0, 1}
⊲/⊲, R
⊳/⊳, L
1/1, R
0/0, R
Furthermore, let ∆0 = L and ∆1 = R
Encoding the transition function δ: the code for the value
δ(qi , aj ) = (qr , as , ∆t ) is
q0
0/1, R
⊲/⊲, R
⊳/⊳, L
q1
1/
0/1, R
0,
R
cij = 0i+1 10j+1 10r+1 10s+1 10t+1 .
⊳/1, R
q2
⊳/⊳, L
1/1, R
0/0, R
⊲/⊲, R
q3
⊲/⊲, R
The code for the whole machine M is
1/0, R
cM = 111c0,0 11c0,1 11c0,2 11c0,3 11c1,0 11...11c3,3 111, where
cM = 111c00 11c01 11. . .11c0m 11c10 11. . .11c1m 11
. . .11cn−2,0 11. . .11cn−2,m 111.
Note: from the code of a machine it is easy to algorithmically
deduce the number of states as well as the size of the tape
alphabet
δ(q0 , 0) = (q1 , 1, R) :
c0,0 = 010100100100
δ(q0 , 1) = (q2 , 0, R) :
c0,1 = 0100100010100
δ(q0 , .) = (qrej , ., R) :
c0,2 = 0100010000001000100
δ(q0 , /) = (qrej , /, L) :
c0,3 = 01000010000001000010
..
.
ICS-C2000 Theoretical Computer Science / Round 9
..
.
ICS-C2000 Theoretical Computer Science / Round 9
Aalto University / Dept. Computer Science
Aalto University / Dept. Computer Science
25/40
26/40
Above we show how to associate each Turing machine M to a
binary string cM encoding it
In reverse, we can associate each binary string c to a Turing
machine Mc .
A language that is not recursively enumerable
Some binary strings do not encode any Turing machine in the
way described above — to such strings, we associate some trivial
Turing machine Mtriv that rejects all input strings
That is, we define
Mc =
the machine M for which cM = c if c is a valid encoding
the machine Mtriv otherwise
ICS-C2000 Theoretical Computer Science / Round 9
ICS-C2000 Theoretical Computer Science / Round 9
Aalto University / Dept. Computer Science
Aalto University / Dept. Computer Science
27/40
28/40
As a result, we obtain
I
I
an enumeration of all the Turing machines (with the input alphabet
{0, 1}) and also
an enumeration of all the recursively enumerable languages over
the alphabet {0, 1}
The machines are
Lemma 6.5
The “diagonal language”
D = {c ∈ {0, 1}∗ | c ∈
/ L (Mc )}
is not recursively enumerable.
Mε , M0 , M1 , M00 , M01 , . . . ,
Proof
Assume that D is recursively enumerable; thus D = L (M) for some
Turing machine M . Let d be the binary encoding of M , i.e. D = L (Md ).
Now
(the indices are in the shortlex order)
The recursively enumerable languages are
d∈D
L (Mε ), L (M0 ), L (M1 ), L (M00 ), L (M01 ), . . .
⇔
d∈
/ L (Md ) = D.
As we obtained a contradiction, the assumption must be wrong and
thus D is not recursively enumerable.
Each language can appear multiple times in the list as some
machines recognize the same language
ICS-C2000 Theoretical Computer Science / Round 9
ICS-C2000 Theoretical Computer Science / Round 9
Aalto University / Dept. Computer Science
Aalto University / Dept. Computer Science
29/40
30/40
The decision problem corresponding to the language D is: “Given
a binary string c. Is it the case that the Turing machine associated
to c does not accept the string c?” More natural examples of
undecidable languages will be seen later.
Forming the language D graphically: if the characteristic functions
of the languages L (Mε ), L (M0 ), L (M1 ), . . . are represented as
an infinite array, then the language D is the one that is obtained
by “flipping” the language obtained from the the diagonal:
D
& L (Mε ) L (M0 ) L (M1 ) L (M00 )
ε
6 01
0
0
0
0
0
6 10
1
0
1
0
0
6 10
1
00
0
0
0
6 01
..
.
..
.
..
.
..
.
..
.
Universal Language and Universal Turing Machines
···
···
···
···
···
..
.
ICS-C2000 Theoretical Computer Science / Round 9
ICS-C2000 Theoretical Computer Science / Round 9
Aalto University / Dept. Computer Science
Aalto University / Dept. Computer Science
31/40
32/40
Universal language
Lemma 6.6
The language U is recursively enumerable.
1
We define the “universal language” (here everything is over the
binary alphabet {0, 1}) as
Proof
It is easier to describe a universal Turing machine MU recognizing U as
a 3-tape Turing machine (which can then be translated into a standard
1-tape machine as explained in the last round).
In the beginning, the input is on the tape 1 as usual. After this, the
machine works as follows:
U = {cM w | the Turing machine M accepts the string w}
The corresponding decision problem is
Given a Turing machine M and a string w.
Does M accept the string w?
If A is a recursively enumerable language and M is a Turing
machine recognizing A, then
A = {w ∈ {0, 1}∗ | cM w ∈ U}.
The language U is recursively enumerable, too. The machines
recognizing the language U are called universal Turing machines.
1
1. First, MU checks whether the input is of form cw, where c is a
valid coding for a Turing machine. If the input is not that form, MU
rejects it (and thus works as expected for Mtriv ). Otherwise, it
copies the binary input sub-string w = a1 a2 . . . ak ∈ {0, 1}∗ to the
tape 2 in the unary form
00010a1 +1 10a2 +1 1 . . . 10ak +1 10000.
The language ATM in Sipser’s book
1. · · ·
1 1 0
0 1 0
···
i+1
···
ICS-C2000 Theoretical Computer Science / Round 9
ICS-C2000 Theoretical Computer Science / Round 9
Aalto University / Dept. Computer Science
Aalto University / Dept. Computer Science
33/40
34/40
0 1 0
1. · · ·
···
···
1 0
···
0 1 0
···
i+1
···
0 1 0
···
j+1
j+1
2.
2.
1 1 0
0 1
···
1 0
···
0 1
···
j+1
···
j+1
3.
0
···
0
···
i+1
3.
0
···
0
···
i+1
2. Now it is known that the input is of form cw, where c = cM for
some machine M and MU has to check whether M accepts w. To
do this, MU keeps the description c of M on the tape 1, uses the
tape 2 to simulate the tape of M (in unary coding), and stores the
current simulated state of M on the tape 3 in the unary form
qi ∼ 0i+1 (in the beginning, MU thus writes the code 0 for the
state q0 on the tape 3).
3. After the initial preparations, MU works in phases and simulates
the action of one transition of M in each phase. In the beginning
of each phase, MU searches the place on the description of M on
the tape 1 that corresponds to the current simulated state qi (tape
3) and symbol aj under the simulated tape head (tape 2).
Let that place on the tape 1 be 0i+1 10j+1 10r+1 10s+1 10t+1 .
Now MU replaces the sub-string 0i+1 on the tape 3 with 0r+1 and
the sub-string 0j+1 on the tape 2 with 0s+1 , and move the tape
head of the tape 2 one simulated symbol left if t = 0 and right if
t = 1.
ICS-C2000 Theoretical Computer Science / Round 9
ICS-C2000 Theoretical Computer Science / Round 9
Aalto University / Dept. Computer Science
Aalto University / Dept. Computer Science
35/40
36/40
Lemma 6.7
1. · · ·
1 1 0
0 1 0
···
i+1
2.
···
1 0
···
0 1 0
···
The language U is not recursive.
···
j+1
0 1
Proof
···
j+1
3.
0
···
0
···
i+1
If the description on the tape 1 does not contain a place
corresponding to the simulated state qi , the simulated machine M
has reached the accept or the reject state. Thus i = k + 1 or
i = k + 2, where qk is the last state having transitions encoded in
the description of M , and the machine MU enters the state qacc or
qrej , respectively.
Assume that U would be recursive. Thus there is a total Turing machine
MUT recognizing U . We can now build a total Turing machine MD that
recognizes the diagonal language D in Lemma 6.5 as follows.
Let MOK be a total Turing machine that checks whether a string is a
valid coding of a Turing machine. Similarly, let MDUP be a total Turing machine that transforms a string c into the string cc. We build the
T, M
machine MD by combining the machines MU
OK and MDUP :
c
MOK
MDUP
cc
MUT
ICS-C2000 Theoretical Computer Science / Round 9
ICS-C2000 Theoretical Computer Science / Round 9
Aalto University / Dept. Computer Science
Aalto University / Dept. Computer Science
37/40
38/40
Corollary 6.8
c
MOK
MDUP
cc
The language
MUT
Ũ = {cM w | w ∈
/ L (M)}
is not recursively enumerable.
T is, and M recogNow clearly MD is total whenever the machine MU
D
nizes D:
c ∈ L (MD ) ⇔ c ∈
/ L (MOK ) tai cc ∈
/ L (MUT )
⇔ c∈
/ L (Mc )
Proof
The language Ũ is effectively the same as the complement Ū of the
universal language U ; more precisely, Ū = Ũ ∪ ERR, where ERR is the
(obviously) recursive language
ERR = {x ∈ {0, 1}∗ | x does not start with
⇔ c ∈ D.
a valid coding of a Turing machine}.
But by Lemma 6.5, the language D is not recursive. We have thus
obtained a contradiction and our assumption that U is recursive must
be wrong.
If the language Ũ were recursively enumerable, then so would be the
language Ū . But as the language U is recursively enumerable, it would
follow (by Lemma 6.3) that U is recursive. This contradicts Lemma 6.7
and therefore we must conclude that Ũ is not recursively enumerable.
ICS-C2000 Theoretical Computer Science / Round 9
ICS-C2000 Theoretical Computer Science / Round 9
Aalto University / Dept. Computer Science
Aalto University / Dept. Computer Science
39/40
40/40