ON HEADS VERSUS TAPES
W. Paul
Fakultät für Mathematik
Universität Bielefeld
4800 Bielefeld 1
let $F_i$ be $n_i \times m_i$-figures. We define $(F_1,F_2)$
to be the $(n_1+n_2+1) \times \max\{m_1,m_2\}$ figure specified
by picture 1. For figures $F_1,\dots,F_k$ we define
$(F_1,\dots,F_k) = (\dots((F_1,F_2),F_3),\dots,F_k)$.
Abstract:
2-dimensional 2-tape Turing machines cannot simulate 2-dimensional Turing machines with 2 heads
on 1 tape in real time.
1. Extant related work
h-head Turing machines can be simulated in linear
time by h-tape Turing machines [4]. They can
also be simulated in real time by multitape Turing machines [1], and 4h-4 tapes suffice for
the simulation [2]. For d > 1, 2-head d-dimensional Turing machines can be simulated in real
time by Turing machines with 3 d-dimensional and
some 1-dimensional tapes. For d > 1 and h > 2,
h-head d-dimensional Turing machines can be simulated in real time by Turing machines with
$3h(h-1)(h-2)/2$ d-dimensional and $O(h^3)$ 1-dimensional tapes [2]. For d > 1, simulating n
steps of a d-dimensional (h+1)-head machine
on-line by a d-dimensional h-head machine may
require $n^{1+\varepsilon(h,d)}$ steps [3]. We will use techniques from [3].
[Picture 1: the figure $(F_1,F_2)$, built from $F_1$, $F_2$ and blanks B]
We fix a simple ordering among figures.
For i = 1,2 let $F^i$ be an $n_i \times m_i$-figure and
$s_i = F^i_{1,1} \dots F^i_{1,m_i} \dots F^i_{n_i,1} \dots F^i_{n_i,m_i} \in \{0,1,B\}^*$
the string which is obtained by concatenating the
rows of $F^i$. We define $F^1 < F^2$ if
$(n_1,m_1) <_{lex} (n_2,m_2)$, or $(n_1,m_1) = (n_2,m_2)$ and
$s_1 <_{lex} s_2$ (with B < 0 < 1, say). Occasionally we
will treat sets D of figures like sequences of
figures. In such cases the sequence of figures in
D ordered by < is meant.
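The ordering on figures can be sketched in a few lines of Python. This is only an illustration: the representation of a figure as a list of row strings and the names `figure_key` and `sort_figures` are assumptions made here, not notation from the paper.

```python
from typing import List, Tuple

# Assumed representation: a figure is a list of rows,
# each row a string over the alphabet {"B", "0", "1"}.
Figure = List[str]

# Symbol order used in the text: B < 0 < 1.
RANK = {"B": 0, "0": 1, "1": 2}


def figure_key(fig: Figure) -> Tuple[Tuple[int, int], Tuple[int, ...]]:
    """Sort key: first the shape (n, m), then the row-concatenated string."""
    n = len(fig)
    m = len(fig[0]) if fig else 0
    s = "".join(fig)  # concatenate the rows
    return (n, m), tuple(RANK[ch] for ch in s)


def sort_figures(figs: List[Figure]) -> List[Figure]:
    """Return the figures of a set D as the sequence ordered by <."""
    return sorted(figs, key=figure_key)
```

Python compares tuples lexicographically, so the key compares shapes first and the symbol strings only when the shapes agree.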
2. Basics
We review some known facts, which will be used
later.
i) Self-delimiting strings. Let s be a
0/1-string of length $l$ and let bin($l$) be
the binary representation of $l$. Form bin'($l$)
by replacing in bin($l$) each 0 by 00 and
each 1 by 01. We call the string
$s' := \mathrm{bin}'(l)\,11\,s$ the self-delimiting version of
s.
Fact 1: Let $s_1,\dots,s_k$ be 0/1-strings with
lengths $l_1,\dots,l_k$. Then $s'_1 \dots s'_{k-1} s_k$ is a
0/1-string of length
$\sum_{i=1}^{k} l_i + O(\sum_{i=1}^{k-1} \log l_i)$ which codes $s_1,\dots,s_k$.
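The self-delimiting code and Fact 1 can be tried out with the following minimal Python sketch. The function names are illustrative, and the decoder assumes that the number k of coded strings is known, which the text does not spell out.

```python
from typing import List


def self_delimit(s: str) -> str:
    """Return s' = bin'(l) 11 s, where l = len(s)."""
    # bin'(l): replace each 0 by 00 and each 1 by 01.
    doubled = "".join("0" + bit for bit in format(len(s), "b"))
    return doubled + "11" + s


def encode(strings: List[str]) -> str:
    """Code s_1..s_k as s'_1 ... s'_{k-1} s_k (requires k >= 1)."""
    *init, last = strings
    return "".join(self_delimit(s) for s in init) + last


def decode(code: str, k: int) -> List[str]:
    """Recover s_1..s_k from the code, assuming k is known."""
    out, pos = [], 0
    for _ in range(k - 1):
        bits = []
        # Read pairs 00 / 01 until the delimiter 11 is found.
        while code[pos:pos + 2] != "11":
            bits.append(code[pos + 1])
            pos += 2
        pos += 2  # skip the delimiter
        length = int("".join(bits), 2)
        out.append(code[pos:pos + length])
        pos += length
    out.append(code[pos:])  # the last string is not delimited
    return out
```

Each of the first k-1 strings costs about 2 log l_i + 2 extra bits, which matches the overhead term in Fact 1.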
iii) Kolmogorov complexity of figures. Let C
be the class of Turing machines with one 1-dimensional input tape, one 2-dimensional working
tape and tape alphabet {0,1,B} on both tapes.
For $M \in C$ let c(M) denote the self-delimiting
version of the standard encoding of M into
$\{0,1\}^*$. Let U be a universal machine in C,
i.e. for any $M \in C$ and $v \in \{0,1\}^*$ the machine U
started with input c(M)v will simulate M with
input v. Let $F_1,\dots,F_s,G_1,\dots,G_t$ be figures.
The Kolmogorov complexity $K(F_1,\dots,F_s \mid G_1,\dots,G_t)$
of $F_1,\dots,F_s$ given $G_1,\dots,G_t$ is defined as
the length of the shortest $x \in \{0,1\}^*$ such that
U started with x on the input tape and
ii) Figures. An $n \times m$-matrix F with entries in
{0,1,B} is called an $n \times m$-figure.
$|F| := n \cdot m$ is called the area of F. Figures
correspond in an obvious way to rectangular parts
of inscriptions of 2-dimensional Turing machine
tapes with alphabet {0,1,B}. A figure G is
called a subfigure of F at position (a,b) if
$G_{i,j} = F_{i+a,\,j+b}$ for all i and j. For i = 1,2
CH1695-6/81/0000/0068$00.75 © 1981 IEEE
$(G_1,\dots,G_t)$ on the working tape with the head on
the top left corner of $G_1$ produces $(F_1,\dots,F_s)$
on the working tape and halts.
here functions from configurations into $\mathbb{N}$ whose
value changes only little in a single computation
step.
The Kolmogorov complexity $K(F_1,\dots,F_s)$ of
$F_1,\dots,F_s$ is defined as $K(F_1,\dots,F_s \mid A)$ where
A is a figure consisting of blanks only.
Intuitively $K(F_1,\dots,F_s \mid G_1,\dots,G_t)$ is the number
of bits necessary to specify $F_1,\dots,F_s$ if
$G_1,\dots,G_t$ are known. Also
$I(F_1,\dots,F_s \mid G_1,\dots,G_t) := K(F_1,\dots,F_s) - K(F_1,\dots,F_s \mid G_1,\dots,G_t)$
is intuitively the number of bits which are saved by the knowledge of
$G_1,\dots,G_t$ if we wish to specify $F_1,\dots,F_s$.
Consequently it is called the information about
$F_1,\dots,F_s$ in $G_1,\dots,G_t$. We will not use it
formally here, but in order to support intuition
we will occasionally rephrase and interpret intermediate results involving Kolmogorov complexity
in terms of information.
Some immediate consequences of the definitions
and fact 1 are summarized in
iv) Random squares. An $n \times n$-0/1-matrix Q is
called an $n \times n$-0/1-square. It is called
Chaitin-random if $K(Q) \ge n^2$. By counting, random
squares of all sizes exist. Basically random
squares are their own shortest descriptions. For
lower bound proofs they have the very desirable
property that all coding tricks for such squares
are more or less obvious or impossible. We make
this now somewhat more precise.
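The counting argument behind the existence of random squares can be spelled out as a small Python check. The function names are illustrative; the sketch only counts binary descriptions and does not compute Kolmogorov complexity.

```python
def descriptions_shorter_than(bits: int) -> int:
    """Number of binary strings of length 0, 1, ..., bits - 1."""
    return sum(2 ** length for length in range(bits))  # equals 2**bits - 1


def random_square_exists(n: int) -> bool:
    """There are 2**(n*n) squares but fewer descriptions shorter than n*n bits,
    so some n x n square Q must have K(Q) >= n*n."""
    return descriptions_shorter_than(n * n) < 2 ** (n * n)
```

Since every description specifies at most one square, at least one square of each size has no description shorter than n^2 bits.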
Fact 5: Let Q be an $n \times n$-0/1-random square and
let $P_1,\dots,P_s$ be pairwise non-overlapping subfigures of Q. Let Q' be obtained from Q by
replacing for each i an occurrence of $P_i$ in Q
by blanks. Then
$K(P_1,\dots,P_s \mid Q') \ge \sum_i |P_i| - O(s \log n)$.
Proof: Q can be specified by n, the shapes
and positions of $P_1,\dots,P_s$ in Q, the bits of
Q' in row order and how to get $P_1,\dots,P_s$ from
Q'. Thus
$n^2$
Fact 2: Let F, G and H be any figures. Then
$K(F \mid H) \le K(F,G \mid H) + O(\log |F|)$
$K(G \mid H) \le K(F,G \mid H) + O(\log |G|)$
$K(F \mid G,H) \le K(F \mid G) + O(\log |G|)$
$K(F \mid H) \le K(F \mid G) + K(G \mid H) + O(\log |F|)$
$K(F,G \mid H) \le K(F \mid H) + K(G \mid H) + O(\log |F|)$
Fact 3: Let $F_1$ and $F_2$ be figures of the same
shape which differ in at most k entries. Let G
be a third figure. Then
$K(F_2 \mid F_1) \le O(k \log |F_1|)$
$|K(G \mid F_1) - K(G \mid F_2)| \le O(k \log |F_1|)$.
A simple but all-important consequence of fact 3
for Turing machine computations is
$\le K(Q) \le O(s \log n) + n^2 - \sum_i |P_i| + K(P_1,\dots,P_s \mid Q')$. □
Phrased in terms of information this says that a
reasonably regularly shaped portion Q' of a
random square Q contains very little information
about the missing portions $P_1,\dots,P_s$ of Q.
Taking s = 1 and applying fact 2 we get
$K(P_1) \ge |P_1| - O(\log n)$. Thus random squares are
locally almost random.
Fact 6: Let Q be an $n \times n$-0/1-random square and
let $P_1,\dots,P_s$ be pairwise non-overlapping
$p \times p$-subsquares of Q. Let R be any figure.
Then
$\max_i K(P_i \mid R) \ge p^2 - O(|R|/s) - O(\log n)$.
Proof: Q can be specified by n, p, the positions
of $P_1,\dots,P_s$ in Q, R, for each i how to get
$P_i$ from R and finally the remaining bits of Q
in row order. Thus
$n^2 \le K(Q) \le O(s \log n) + O(|R|) + \sum_i K(P_i \mid R) + n^2 - s p^2$. □
Fact 4: Let S be a 2-dimensional multitape
Turing machine with tape alphabet {0,1,B}. For
i = 1,...,s let $C_i$ be the inscription of a
rectangular portion of some tape of S. Suppose
in some time interval for all i the portion of
tape occupied by $C_i$ was visited at most $k_i$
times and $C'_i$ is the resulting inscription. Then
$K(C'_1,\dots,C'_s \mid C_1,\dots,C_s) \le O(\sum_i k_i \log |C_i|)$.
This says that a figure R can contain much information only about a constant times as many subsquares $P_i$ of Q as its area can cover.
3. Definition of a machine M with 2 heads on one
tape and a basic property of its simulators
M has linear input and output tapes and one 2-dimensional working tape. The working and output
alphabets are {0,1,B}. An action of M is to
move one of its heads (we allow diagonal moves)
or to print a symbol under one of its heads or to
output the symbol under one of its heads on the
working tape. With each action an input symbol is
associated. Upon reading an input symbol M performs the corresponding action. M is an abstract
Consequently if F is any figure then
$|K(F \mid C_1,\dots,C_s) - K(F \mid C'_1,\dots,C'_s)| \le O(\sum_i k_i \log |C_i|)$.
Phrased in terms of information this says that by
visiting the portion of tape occupied by
$C_1,\dots,C_s$ $\sum_i k_i$ times one can pump no more than
$O(\sum_i k_i \log |C_i|)$ bits of information about figure
F into that portion. Also observe that we have
Part $I_2$ consists of $l$ parts $J_1,\dots,J_l$ where
each part $J_i$ consists of 2 parts $M_i W_i$. The
"moving part" $M_i$ has length at most t and drives head 2 of M to the top left corner of some
block $B_i$ in T. The choice of $B_i$ will depend
on the behaviour of S during $I_1 J_1 \dots J_{i-1}$
($M_1$ is empty). In the "writing part" $W_i$ head 1
of M writes down $L_i$ row by row (see picture 2).
Observe
$|M_1 \dots M_l| = O(n^{9/8} \log^2 n)$.
storage unit in the sense of [3].
Let S be a 2-dimensional 2-tape Turing machine
with working alphabet {0,1,B} and suppose S
simulates M in real time with delay c, i.e.
S makes at most c steps to simulate any step
of M.
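The storage unit M described in this section can be modelled by a short Python sketch. The class and method names, the dictionary-based tape and the starting position (0,0) of both heads are assumptions made for illustration; the paper only fixes the three kinds of actions.

```python
from typing import Dict, List, Tuple

BLANK = "B"


class TwoHeadTape:
    """Illustrative model of M: one 2-dimensional tape shared by two heads."""

    def __init__(self) -> None:
        self.cells: Dict[Tuple[int, int], str] = {}  # unwritten cells are blank
        self.heads: List[Tuple[int, int]] = [(0, 0), (0, 0)]
        self.output: List[str] = []

    def move(self, head: int, dr: int, dc: int) -> None:
        """Move one head by one step; diagonal moves are allowed."""
        assert dr in (-1, 0, 1) and dc in (-1, 0, 1)
        r, c = self.heads[head]
        self.heads[head] = (r + dr, c + dc)

    def write(self, head: int, symbol: str) -> None:
        """Print a symbol under one of the heads."""
        assert symbol in ("0", "1", BLANK)
        self.cells[self.heads[head]] = symbol

    def emit(self, head: int) -> None:
        """Output the symbol currently under one of the heads."""
        self.output.append(self.cells.get(self.heads[head], BLANK))
```

The point of the model is that both heads read the same cells: a symbol written by head 1 is immediately visible to head 2, which is what a 2-tape simulator S has to reproduce with bounded delay.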
Fact 7: Let J be any input sequence for M and
S. For i = 1,2 let $B_i$ be a figure each of
whose entries can be reached after execution of
J by head i of M within d steps and let
$C_i$ be a figure which contains all entries
reachable after execution of J by head i of
S within cd steps. Then
$K(B_1,B_2 \mid C_1,C_2) \le O(\log |(C_1,C_2)|)$.
Proof: If $(C_1,C_2)$ is known, $(B_1,B_2)$ can be
described in the following way:
i) state and head positions of S relative to
$C_1$ and $C_2$ after J:
$O(\log |(C_1,C_2)|)$ bits
ii) for each input sequence J' of length d
which drives a head of M somewhere into
$B_1$ or $B_2$ and then prints out the symbol
under that head run S on that input sequence. One has only to use
$O(\log d) \le O(\log |(C_1,C_2)|)$ bits in order
to specify the relative positions of $B_1$
and $B_2$ to the heads of M plus an additional O(1) bits for a little simulation
program which generates from the given data
all the sequences J' and runs S on input J'.
In terms of information this fact states that
if there is an information deficit about
$(B_1,B_2)$ in $(C_1,C_2)$, then one cannot answer
all questions concerning $(B_1,B_2)$ by looking at
$(C_1,C_2)$ only. So much for survey.
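[Picture 2: the square Q with the top left square T of side $n^{7/8}\log^2 n$ containing a block $B_i$, and the bottom right block L of side $n^{3/4}$ divided into layers $L_1$ ($n^{1/2}\log^2 n$ rows), $L_2$, ... ($n^{1/2}$ rows each)]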
4. The input sequence for M
The input sequence for M whose last steps cannot be simulated with delay c will consist of
5 parts $I_1 \dots I_5$ where $I_2$, $I_4$ and $I_5$ will
depend on the behaviour of the simulator.
i) Choose a large $n \times n$-0/1-random square Q.
Partition it into $n^{3/4} \times n^{3/4}$-blocks. Let R
be the figure which is obtained by replacing the
bottom right block L of Q by blanks. Part
$I_1$ makes head 1 of M print R row by row
and then moves head 1 to the top left corner of
the missing block. Head 2 stays at the top left
corner of R.
ii) Let $t = n^{7/8}\log^2 n$ and let T be the top
left $t \times t$-subsquare of Q (resp. R). It
consists of $n^{1/4}\log^4 n$ blocks. Let $L_1$ consist of the first $n^{1/2}\log^2 n$ rows of the
missing bottom right block L of Q. Let $L_2$
consist of the next $n^{1/2}$ rows of L, $L_3$ of
the next $n^{1/2}$ rows and so on. $L_i$ is called
layer i. Thus layer 1 is by a $\log^2 n$ factor
larger than the other layers and there are
iii) Part $I_3$ drives both heads of M to the top
left corner of L.
iv) Partition L into small $n^{11/16} \times n^{11/16}$ blocks.
Depending on the behaviour of S in
$I_1 I_2 I_3$ a pair (a,b) of the small blocks is
chosen. Part $I_4$ drives the heads of M to the
top left corners of a and b.
v) There will be at least one choice for $I_5$ of
$n^{11/16}$ moves which cannot be correctly simulated by S with $c\,n^{11/16}$ further steps.
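The sizes used in this construction are all of the form $n^e \log^k n$, so the bookkeeping can be checked mechanically. The following Python sketch tracks only the pair (e, k); the class `Growth` and the variable names are introduced here for illustration and ignore constant factors.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class Growth:
    """Represents n**e * (log n)**k, ignoring constant factors."""
    e: Fraction
    k: int = 0

    def __mul__(self, other: "Growth") -> "Growth":
        return Growth(self.e + other.e, self.k + other.k)

    def __truediv__(self, other: "Growth") -> "Growth":
        return Growth(self.e - other.e, self.k - other.k)

    def squared(self) -> "Growth":
        return self * self


block_side = Growth(Fraction(3, 4))        # blocks of Q are n^(3/4) x n^(3/4)
t = Growth(Fraction(7, 8), 2)              # side of T
blocks_in_T = (t / block_side).squared()   # n^(1/4) log^4 n

layer_area = Growth(Fraction(1, 2)) * block_side       # |L_i| for i >= 2
layer1_area = Growth(Fraction(1, 2), 2) * block_side   # |L_1|

small_side = Growth(Fraction(11, 16))
small_area = small_side.squared()                      # n^(11/8)
small_blocks_in_L = (block_side / small_side).squared()  # n^(1/8)
```

The computed exponents agree with the values stated in the text: T consists of $n^{1/4}\log^4 n$ blocks, a layer has area $n^{5/4}$ (times $\log^2 n$ for layer 1), and L splits into $n^{1/8}$ small blocks of area $n^{11/8}$.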
5. Choosing the blocks $B_i$
Partition the tapes of S into $cn^{3/4} \times cn^{3/4}$-blocks.
For sets D of blocks of S and $i \in \mathbb{N}$
we define $N_i(D)$ as the set of blocks reachable
from D within $i \cdot c \cdot n^{3/4}$ moves. It is called the
i-neighbourhood of D. Clearly for fixed i we
have
$|N_i(D)| = O(|D|)$.
A block of S is called fat if it has been visited
in at least $n^{5/4}/\log^2 n$ steps of $I_2$. We denote
layers.
n5/ 410g 2n-0(10g
by fat(i) the set of blocks of S which are fat
after execution of $J_i$. Now
$|I_2| = \sum_i |M_i| + \sum_i |W_i| = O(n^{9/8}\log^2 n) + O(n^{3/2}) = O(n^{3/2})$.
Thus at most
$O(n^{3/2-5/4}\log^2 n) = O(n^{1/4}\log^2 n)$ blocks of S
will ever be fat. Hence for all i the set of
blocks $N_3(\mathrm{fat}(i))$, ordered and interpreted as
a figure, has area at most $O(n^{3/2+1/4}\log^2 n) = O(n^{7/4}\log^2 n)$.
By fact 6 we can choose a block $B_{i+1}$ in T
such that
$K(B_{i+1} \mid N_3(\mathrm{fat}(i))) \ge n^{3/2}/2$.
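The application of fact 6 rests on a short exponent calculation, which the following Python sketch reproduces. Quantities $n^e \log^k n$ are represented as pairs (e, k); the helper names `mul` and `div` and the variable names are illustrative only, and constant factors are ignored.

```python
from fractions import Fraction as F


def mul(x, y):
    """Product of n^e1 log^k1 n and n^e2 log^k2 n."""
    return (x[0] + y[0], x[1] + y[1])


def div(x, y):
    """Quotient of n^e1 log^k1 n by n^e2 log^k2 n."""
    return (x[0] - y[0], x[1] - y[1])


length_I2 = (F(3, 2), 0)        # |I_2| = O(n^(3/2))
fat_threshold = (F(5, 4), -2)   # a fat block is visited n^(5/4)/log^2 n times
fat_blocks = div(length_I2, fat_threshold)      # number of fat blocks

block_area = (F(3, 2), 0)       # area of a cn^(3/4) x cn^(3/4) block
area_R = mul(fat_blocks, block_area)            # area of N_3(fat(i))

blocks_in_T = (F(1, 4), 4)      # s in fact 6
loss = div(area_R, blocks_in_T)                 # the term |R|/s in fact 6
```

With $p^2 = n^{3/2}$, $s = n^{1/4}\log^4 n$ and $|R| = O(n^{7/4}\log^2 n)$, the loss term $|R|/s$ is $O(n^{3/2}/\log^2 n)$, which is of smaller order than $p^2$, so the bound $n^{3/2}/2$ holds for large n.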
$n^{5/4}\log^2 n - O(\log n) \le K(L_1 \mid R)$
$\le K(L_1 \mid N_1(C_1,C_2)$ at time $t)$
$+\, K(N_1(C_1,C_2)$ at time $t \mid N_1(C_1,C_2)$ after $I_1)$
$+\, K(N_1(C_1,C_2)$ after $I_1 \mid R)$
$+\, O(\log n)$.
By fact 7 the first term can be estimated by
$O(\log n)$. No block in $N_1(C_1,C_2)$ is fat after
$J_i$ and $|J_{i+1}| = O(n^{5/4})$, thus by fact 4 the
second term can be estimated by $O(n^{5/4}\log n)$.
Computing $I_1$ from R and simulating S on input $I_1$ gives all the tape inscriptions of S
after $I_1$. $O(\log n)$ further bits give $N_1(C_1,C_2)$
after $I_1$. Thus the third term is bounded by
$O(\log n)$. □
An immediate consequence is
Lemma 3: For each $i \ge 1$ the head of S which is
outside of $N_2(\mathrm{fat}(i))$ after $M_{i+1}$ does not
touch $N_2(\mathrm{fat}(i))$ during $W_{i+1}$.
6. The effect of $I_2$ and $I_3$
The goal behind moving head 2 of M after
$J_i$ to a block $B_{i+1}$ such that the vicinity
of the fat blocks has a large information deficit
about $B_{i+1}$ is of course to force at least one
head of S far away from the fat blocks; recall
that after $M_{i+1}$ the simulator S must be
ready to answer questions about $B_{i+1}$ quickly.
On the other hand head 1 of M stays close to
$L_1$ and thus S must also be ready to answer
questions about $L_1$ quickly. But at least intuitively only fat blocks can have enough information about $L_1$, thus we expect one head of S
to be trapped in the close vicinity of at least
one fat block. Precisely this is asserted in the
first two lemmas.
Lemma 1: For each $i \ge 1$ during all of $W_{i+1}$ at
least one head of S is outside of $N_2(\mathrm{fat}(i))$.
Proof: Suppose false at time t during $W_{i+1}$
and let $C_1,C_2$ be the blocks in $N_2(\mathrm{fat}(i))$
visited by the heads of S at time t. Thus
$N_1(C_1,C_2) \subset N_3(\mathrm{fat}(i))$. By choice of $B_{i+1}$ and
fact 2 we have
$n^{3/2}/2 \le K(B_{i+1} \mid N_3(\mathrm{fat}(i))$ after $J_i)$
$\le K(B_{i+1} \mid N_1(C_1,C_2)$ at time $t)$
Proof: Suppose it does touch $N_2(\mathrm{fat}(i))$ for the first time
during $W_{i+1}$ at time t. Then at time
t it is still outside of $N_1(\mathrm{fat}(i))$, thus by
lemma 2 the other head is at time t inside of
$N_1(\mathrm{fat}(i))$. But this contradicts lemma 1. □
Now we know that for all i after $M_{i+1}$ at least
one head of S is outside of $N_2(\mathrm{fat}(i))$ and
stays there during all of $W_{i+1}$. Wlog let us
assume it is head 2 of S for i = 1. The crucial
point of the whole construction is that for all
i it will be the same head.
Lemma 4: For all $i \ge 1$ head 2 of S is outside
of $N_2(\mathrm{fat}(i))$ after $M_{i+1}$.
Proof: The lemma is true for i = 1. Suppose it
is true for all i' < i and false for i. Then
after $M_{i+1}$ by lemma 1 head 1 of S is in some
block $C_1 \notin N_2(\mathrm{fat}(i))$ and by lemma 2 head 2
of S is in some block $C_2 \in N_1(\mathrm{fat}(i))$.
Let $k = \min\{k' \mid N_1(C_2) \cap \mathrm{fat}(k') \ne \emptyset\}$ and let
$E \in N_1(C_2) \cap \mathrm{fat}(k)$. By induction hypothesis and
lemma 3 no block in $N_2(E)$ and hence no block in
$N_1(C_2)$ was touched in $W_{k+1},\dots,W_i$.
Intuitively the argument now is the following.
Head 1 of S is not even close to a fat block,
thus it has for the next $cn^{3/4}$ steps access to
only very little information about $L_1$ and $L_2$.
The fat blocks which are accessible quickly by
head 2 of S were not visited often before $J_k$,
they were visited at most $|J_k|$ times during $J_k$
and they were not visited after $J_k$ except possibly in the relatively short periods
$M_{k+1},\dots,M_{i+1}$. Thus if k = 1 then there was no
opportunity to get enough information about $L_2$
into $N_1(C_1,C_2)$. But if k > 1 then only during
$J_k$ there was an opportunity to get $O(|J_k| \log n)$
bits about $L_1$ into $N_1(C_1,C_2)$.
As $|L_1| = n^{5/4}\log^2 n$ this is not enough information.
$+\, K(N_1(C_1,C_2)$ at time $t \mid N_1(C_1,C_2)$ after $J_i)$
$+\, K(N_1(C_1,C_2)$ after $J_i \mid N_3(\mathrm{fat}(i))$ after $J_i)$
$+\, O(\log n)$.
By fact 7 the first term is bounded by $O(\log n)$.
By fact 4 and because $|J_{i+1}| = O(n^{5/4})$ the
second term is bounded by $O(n^{5/4}\log n)$. By
fact 2 the third term is bounded by $O(\log n)$. □
Lemma 2: For each $i \ge 1$ during all of $J_{i+1}$ at
least one head of S is in $N_1(\mathrm{fat}(i))$.
Proof: Suppose false at time t during $J_{i+1}$.
Let $C_1,C_2$ be the blocks visited by S at time
t. By fact 5 and fact 2 we have
For sets D of small blocks of S let n(D) be
the set of blocks reachable from blocks in D
within $cn^{11/16}$ steps. We say that a pair (c,d)
of small blocks with c in $N_1(C_1)$ and d in
$N_1(C_2)$ is useful for a pair (a,b) of small blocks
in L if $K(a,b \mid n(c,d)) < n^{11/8}/2$, i.e. if n(c,d)
contains a lot of information about (a,b). Our
next goal is to find a pair (a,b) for which no
pair (c,d) is useful. Let u(c,d) be the number
of pairs (a,b) for which (c,d) is useful. Then
we have
Lemma 6: For all c: $\sum_d u(c,d) < n^{1/8}/\log n$.
Proof: Let $\mu = n^{1/8}/\log n$ and suppose
$\sum_d u(c,d) \ge \mu$ for some small block c in $N_1(C_1)$.
Let $P_1,\dots,P_\mu$ be distinct pairs of small blocks
in L such that for each $P_i$ there is a small
block $d_i$ in $N_1(C_2)$ such that $(c,d_i)$ is useful for $P_i$. These pairs are formed of at least
$p := \mu^{1/2} = n^{1/16}/(\log n)^{1/2}$ many distinct small
blocks $a_j$ of L. For each $j \le p$ let $P_{i_j}$
be a pair where $a_j$ occurs and let $e_j = d_{i_j}$.
Then by fact 2
$K(a_j \mid n(c,e_j)) < K(P_{i_j} \mid n(c,e_j)) + O(\log n) \le n^{11/8}/2 + O(\log n)$.
Now by fact 5 and fact 2
$p \cdot (n^{11/8} - O(\log n))$
$\le K(a_1,\dots,a_p \mid Q$ without $a_1,\dots,a_p)$
$\le K(a_1,\dots,a_p \mid R) + O(p \log n)$
$\le K(a_1,\dots,a_p \mid n(c), n(e_1),\dots,n(e_p))$
$+\, K(n(c))$
$+\, K(n(e_1),\dots,n(e_p) \mid N_1(C_2)$ after $I_3)$
$+\, K(N_1(C_2)$ after $I_3 \mid R)$
$+\, O(p \log n)$.
The first term can be estimated by
$\sum_i (K(a_i \mid n(c,e_i)) + O(\log n)) \le p\,(n^{11/8}/2 + O(\log n))$.
The second term is $O(n^{11/8})$. The third term is
$O(p \log n)$. By lemma 5 and fact 4 the fourth
term can be estimated by $O(n^{5/4}\log^3 n)$. □
By lemma 6, $\sum_{c,d} u(c,d) = o(n^{1/4})$. Thus there is a
pair (a,b) of small blocks of L for which no
pair (c,d) of small blocks c in $N_1(C_1)$ and
d in $N_1(C_2)$ is useful. Part $I_4$ of the input
sequence is chosen to move the heads of M to
the top left corners of a and b in $O(n^{3/4})$
steps.
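The choice of (a,b) is a pigeonhole argument: L has about $n^{1/4}$ pairs of small blocks, while all pairs (c,d) together are useful for only $o(n^{1/4})$ of them. A minimal Python sketch of the selection step follows; the function name, the dictionary `useful_for` and the decision to allow pairs with a = b are assumptions made for illustration.

```python
from itertools import product
from typing import Dict, Hashable, Iterable, Optional, Set, Tuple

Pair = Tuple[Hashable, Hashable]


def find_uncovered_pair(
    small_blocks_of_L: Iterable[Hashable],
    useful_for: Dict[Pair, Set[Pair]],
) -> Optional[Pair]:
    """Return a pair (a, b) for which no pair (c, d) is useful.

    useful_for maps each simulator pair (c, d) to the set of pairs (a, b)
    it is useful for, so u(c, d) = len(useful_for[(c, d)]).
    """
    covered: Set[Pair] = set().union(*useful_for.values())
    for pair in product(list(small_blocks_of_L), repeat=2):
        if pair not in covered:
            return pair
    return None  # cannot happen when sum of u(c, d) < number of pairs
```

Whenever the sum of the values u(c,d) is smaller than the number of pairs (a,b), the covered set cannot contain every pair, so the function returns a pair.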
8. Knockout
Let (c,d) be the pair of small blocks visited by
S after $I_4$. It was not useful for (a,b) after
$I_3$. Hence by fact 2
$n^{11/8}/2 \le K(a,b \mid n(c,d)$ after $I_3)$
$\le K(a,b \mid n(c,d)$ after $I_4)$
$+\, K(n(c,d)$ after $I_4 \mid n(c,d)$ after $I_3)$
$+\, O(\log n)$
$\le O(\log n) + O(n^{3/4}\log n) + O(\log n)$
by fact 7 and fact 4. □
The formal argument below follows exactly these
lines.
If k = 1 then by fact 5 and fact 2 we have
$n^{5/4} - O(\log n) \le K(L_2 \mid Q$ without $L_2)$
$\le K(L_2 \mid R, L_1) + O(\log n)$
$\le K(L_2 \mid N_1(C_1,C_2)$ after $M_{i+1})$
$+\, K(N_1(C_1)$ after $M_{i+1} \mid R)$
$+\, K(N_1(C_2)$ after $M_{i+1} \mid N_1(C_2)$ after $J_1)$
$+\, K(N_1(C_2)$ after $J_1 \mid R, L_1)$
$+\, O(\log n)$.
By fact 7 the first term is bounded by $O(\log n)$.
In $J_1 \dots J_i M_{i+1}$ the blocks in $N_1(C_1)$ were
visited at most $O(n^{5/4}/\log^2 n + n^{7/8}\log^2 n)$ times,
thus by fact 4 the second term can be estimated by
$O(n^{5/4}/\log n)$. In $J_2 \dots J_i M_{i+1}$ the blocks in
$N_1(C_2)$ can only have been visited in $M_2,\dots,M_{i+1}$,
thus by fact 4 the third term can be estimated by
$O(n^{9/8}\log^3 n)$. The fourth term can clearly be estimated by $O(\log n)$.
If k > 1 then by fact 5 and fact 2 we have
$n^{5/4}\log^2 n - O(\log n)$
$\le K(L_1 \mid Q$ without $L_1)$
$\le K(L_1 \mid N_1(C_1,C_2)$ after $M_{i+1})$
$+\, K(N_1(C_1)$ after $M_{i+1} \mid R)$
$+\, K(N_1(C_2)$ after $M_{i+1} \mid R)$
$+\, O(\log n)$.
By fact 7 the first term can be estimated by
$O(\log n)$. Exactly as above the second term can be
estimated by $O(n^{5/4}/\log n)$.
The blocks in $N_1(C_2)$ were visited at most
$O(n^{5/4}/\log^2 n)$ times in $J_1 \dots J_{k-1}$, at most
$O(n^{5/4})$ times in $J_k$ and at most $O(n^{9/8}\log^2 n)$
times in $M_{k+1} \dots M_{i+1}$. Thus by fact 4 the third
term can be estimated by $O(n^{5/4}\log n)$. □
Lemma 3, Lemma 4 and $|I_3| = O(n)$ imply
Lemma 5: In $I_2 I_3$ no block of tape 2 of S is
visited more than $O(n^{5/4}\log^2 n)$ times. □
7. The choice of 14
We have forced the simulator S to spread out the
information about L all over tape 2, and nowhere
on tape 2 is there a lot of it close together. For
purposes of retrieving information about L, tape
2 can intuitively be considered as degenerate and
we are almost in the situation of 2 heads versus 1
head. Consequently after an appropriate modification we will make the corresponding argument from
[3] work.
Let $C_1,C_2$ be the blocks visited by S after
$I_3$. Partition $N_1(C_1,C_2)$ into small
$cn^{11/16} \times cn^{11/16}$-blocks. Thus each of L, $N_1(C_1)$
and $N_1(C_2)$ is partitioned into
$O((n^{3/4-11/16})^2) = O(n^{1/8})$ small blocks, each of
area $n^{11/8}$ resp. $O(n^{11/8})$.
9. Exercises and conjectures
a) h heads on one 2-dimensional tape cannot be simulated in real time by 2h-2 2-dimensional tapes.
b) the results hold in dimension > 2.
c) attaching any number of linear tapes to the
simulator does not help.
10. References
[1] Fischer, Meyer and Rosenberg:
Real-time simulation for multihead
tape units.
J.ACM 19, 590-607, 1972
[2] Leong and Seiferas:
New real-time simulations of multihead tape units.
Proc. 9th ACM-STOC, 239-247, 1977
[3] Paul, Seiferas and Simon:
An information theoretic approach
to time bounds for on-line computation.
Proc. 12th ACM-STOC, 357-367, 1980
[4] Stoss:
k-Band-Simulation von k-Kopf-Turing-Maschinen.
Computing 6, 309-317, 1970