Ergodicity of Cellular Automata

Andrei Toom
E-mail: [email protected]

This course will be delivered on January 14-18, 2013 at Tartu University, Estonia as a part of the graduate program.

Content

1. Undirected plane percolation: finite and infinite. Duality.
2. Directed plane percolation: finite and infinite. Duality.
3. Stavskaya process has a critical value. Approximations.
4. d-dimensional DCA (Deterministic Cellular Automata).
5. σ is empty ⟺ D is an eroder ⟺ ∃ α > 0 : R_α D is non-ergodic.
6. Measures on the sigma-algebra in Ω = A^G. M is compact.
7. General definition of PCA (Probabilistic Cellular Automata).
8. In dealing with PCA, should we count time as an extra coordinate?
9. Coupling of measures and processes. Order and monotonicity.
10. The problem of ergodicity for PCA is undecidable.
Main terms and notations.
References.

Foreword

For several decades mathematicians have studied processes with many (or infinitely many) interacting components. Such processes are known under many names. In this course we call them cellular automata. By choosing this title we dismiss many non-essential technical complications. The parameter of time is discrete and the set of states of every component is finite, often having only two elements. In spite of their conceptual simplicity, cellular automata demonstrate a plethora of interesting phenomena, and the title "programmable matter" coined by T. Toffoli and N. Margolus [32] is quite appropriate.

In a cellular automaton with discrete time and positive transition probabilities, any local event can happen with a positive probability (which is not true for systems with continuous time, where only one change can occur at a time). If, in spite of this, a cellular automaton is non-ergodic (and we shall present examples of this), it is a convincing analog of phase transitions, which have many forms in the real world (freezing, melting, evaporation, condensation etc.) and are among the most important natural phenomena. We are especially interested in non-trivial (that is, strictly between zero and one) critical values of parameters, when properties of the process on the opposite sides of a critical value are qualitatively different, thus imitating natural phase transitions, where the structure of matter changes qualitatively as temperature moves continuously across a freezing or condensation point. We start with some basic notions of percolation because percolation is strongly connected with cellular automata.

The present book consists of ten chapters, and almost every chapter contains at least one theorem relevant to our study. Other statements are called lemmas. Some of them are proved; the other proofs are left to the reader. Some results included here can be found in the surveys [36] and [37]. Our list of references certainly does not include all the relevant studies; it includes only those publications to which we explicitly refer. There are many solved and still more unsolved problems in this area.
We selected a few of them, which are closest to our notions, and placed them at the end of every chapter along with a few exercises for those students who want to get some hands-on experience of doing mathematics in this area.

Chapter 1
Undirected plane percolation

1.1. Percolation models. We call a real function f of n real arguments monotonic if

x_1 ≤ x'_1, ..., x_n ≤ x'_n ⟹ f(x_1, ..., x_n) ≤ f(x'_1, ..., x'_n).

Let us call a variable Boolean if it has only two possible values, 0 and 1. According to the general definition above, we call a Boolean function φ of n Boolean arguments monotonic if

(v_1 ≤ v'_1, ..., v_n ≤ v'_n) ⟹ φ(v_1, ..., v_n) ≤ φ(v'_1, ..., v'_n).

In this chapter a percolation model is a graph with a finite or countable set of vertices and a finite or countable set of edges. For every edge there are two vertices, called its ends. An edge may be directed or undirected. If an edge is undirected, it is either open in both directions or closed in both directions. If an edge is directed, it is open or closed in one direction and open or closed in the other direction independently.

1.2. Finite undirected percolation. We say that a graph is drawn on a plane if the following conditions are fulfilled:

• All vertices of the graph are represented by different points of that plane, every edge is represented by a self-avoiding arc, and the ends of this arc represent the vertices which this edge connects.

• An arc is a set {f(t) : 0 ≤ t ≤ 1}, where f is a continuous function from [0, 1] to the plane and the relation t ↔ f(t) is one-to-one. The set {f(t) : 0 < t < 1} is called the inner part of this arc and its elements are called inner points. No inner point of an edge may belong to another edge. In fact we assume for simplicity that every arc representing an edge is a broken line. Those values of t, in the vicinity of which this function is not linear, are called angles.

• Every edge connects two vertices, which are called its endpoints. Inner points of an arc representing an edge have no common points with other edges. The endpoints of an arc may coincide with each other only when the edge has one and the same vertex as both of its ends.

Throughout this text plane graph means a graph drawn on a plane in the described way. (Graphs which can be drawn on a plane in this way are called planar, but we do not speak about them.)

Every plane graph divides the plane into parts, which are called faces. In this subsection we consider only finite graphs, so one of the faces is unbounded and the others are bounded. For short we may say "vertices" instead of "points representing them". Every plane graph has a dual graph, also a plane graph. For any plane graph Γ and its dual plane graph Γ':

(a) There is a one-to-one correspondence between faces of Γ and vertices of Γ', such that every vertex of Γ' belongs to the corresponding face of Γ.

(b) There is a one-to-one correspondence between edges of Γ and edges of Γ', such that the inner parts of two corresponding edges intersect at exactly one point, which is not an angle of either of them.

(c) There is a one-to-one correspondence between faces of Γ' and vertices of Γ, such that every vertex of Γ belongs to the corresponding face of Γ'.

Let us establish the following rule connecting states of edges of a plane graph and its dual: given an undirected plane graph Γ and its dual Γ',

an edge of Γ' is open if and only if the corresponding edge of Γ is closed.   (1)

A path in a graph is a finite or infinite sequence vertex-edge-vertex-edge-...
, where every edge connects the two vertices between which it is placed in this sequence. A contour is a path in which the first and the last vertices coincide. A path or contour is called open if all the edges included in it are open. We say that there is percolation from a vertex A to a vertex B if there is an open path from A to B.

Theorem 1.1. Let Γ be a plane graph and Γ' its dual. Assume the rule (1). Let A, B be two different vertices of Γ. Then in Γ there is no open path connecting A and B if and only if in Γ' there is an open contour separating point A from point B.

I cannot say who discovered this theorem; for a long time it was considered "generally known" and remained unwritten. Now its proof is available in [26]. Instead of proving this theorem, we illustrate it. The following diagram shows a graph Γ whose vertices are represented by black circles and whose edges are represented by double lines. Vertices of its dual are represented by white circles and its edges are represented by curves.

[Figure 1.1. A finite graph Γ with four vertices A, B, C, D drawn on a plane and its dual graph Γ' with three vertices X, Y, Z also drawn there.]

Figure 1.1 shows a finite graph Γ with four vertices A, B, C, D drawn on a plane and its dual graph Γ' with three vertices X, Y, Z also drawn there. According to our theorem 1.1, there is no path in Γ connecting A with B if and only if there is an open contour in Γ' separating A from B.

[Figure 1.2. The "checkered paper" lattice with a self-avoiding path of length 14 starting at the origin (0, 0).]

Figure 1.2. Infinite undirected percolation.

The most interesting and important property of infinite percolation is the possibility of phase transition, and we go straight to one case of it. Figure 1.2 represents an infinite graph, which we call "checkered paper". In mathematical terms, it is the graph whose set of vertices is Z², the set of pairs (i, j), where both i and j are integer numbers. Any vertex (i, j) is connected by undirected edges with (i+1, j), (i, j+1), (i−1, j), (i, j−1).

Let us imagine that every edge is a pipe which is either open or closed, namely it is open with probability ε and closed with probability 1 − ε independently of the other edges. The origin (0, 0) is the only source of some liquid, which can pass along open edges, but not along closed ones. There are no one-way edges: any edge is either open in both directions or closed in both directions. Vertices are always open. The vertices which the liquid can reach are called wet, the others are called dry. The source (0, 0) is always wet by definition. If the edge from (0, 0) to (1, 0) is open, the vertex (1, 0) is also wet. If, in addition, the edge from (1, 0) to (1, 1) is also open, the vertex (1, 1) is also wet, and so on.

Let us call a path a sequence "vertex-edge-vertex-edge-..." in which every edge connects the vertices between which it is placed. If a path is finite, it ends with a vertex, and the number of edges in the sequence is called the length of the path. One finite path of length 14 is shown on figure 1.2. A path is called self-avoiding if all the vertices in its sequence are different. For example, the path shown on figure 1.2 is self-avoiding. A path is called open if all its edges are open. A vertex is wet if and only if there is an open path from the source (0, 0) to this vertex (or from this vertex to the source, which means the same). We say that there is percolation from (0, 0) to ∞ if the set of wet vertices is infinite.
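These definitions translate directly into a small Monte Carlo experiment. The sketch below is ours, not part of the original text (Python; the window radius, sample counts and the values of ε are arbitrary choices): it samples the edges of a finite window of the checkered paper, finds the wet set by breadth-first search, and counts how often the wet cluster reaches the boundary of the window - a crude finite-size stand-in for percolation from (0, 0) to ∞.

import random
from collections import deque

def wet_set(radius, eps, rng):
    # Every undirected edge of the window [-radius, radius]^2 is open with
    # probability eps, independently; edge states are sampled lazily.
    open_edge = {}
    def is_open(a, b):
        key = (min(a, b), max(a, b))   # one state per undirected edge
        if key not in open_edge:
            open_edge[key] = rng.random() < eps
        return open_edge[key]
    # Breadth-first search from the source (0, 0) along open edges.
    wet, queue = {(0, 0)}, deque([(0, 0)])
    while queue:
        i, j = queue.popleft()
        for v in ((i + 1, j), (i - 1, j), (i, j + 1), (i, j - 1)):
            if (max(abs(v[0]), abs(v[1])) <= radius
                    and v not in wet and is_open((i, j), v)):
                wet.add(v)
                queue.append(v)
    return wet

for eps in (0.3, 0.5, 0.7):
    hits = 0
    for seed in range(200):
        wet = wet_set(40, eps, random.Random(seed))
        hits += any(max(abs(i), abs(j)) == 40 for i, j in wet)
    print(eps, hits / 200)

For ε well below the critical value the cluster almost never reaches the boundary; well above it, it usually does.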
The most interesting feature of this kind of percolation is the existence of a non-trivial critical value, which in the present case can be formulated as follows:

Theorem 1.2. Percolation from (0, 0) to ∞ on the checkered paper, as described above, has a critical value ε* strictly between zero and one such that:
(a) If ε < ε*, the probability of percolation from (0, 0) to ∞ is zero.
(b) If ε > ε*, the probability of percolation from (0, 0) to ∞ is positive.

In fact we shall prove that (a') if ε is small enough, the probability of percolation from (0, 0) to ∞ is zero, and (b') if ε is large enough, the probability of percolation from (0, 0) to ∞ is positive.

Items (a') and (b') are sufficient to prove our theorem. Indeed, we may define ε* as the supremum of those values of ε for which the probability of percolation from (0, 0) to ∞ is zero. According to what will be proved, the ε* thus defined is strictly between zero and one. In chapter 9 we shall prove that the probability of percolation in the finite case is a non-decreasing function of ε. The same is true for percolation from (0, 0) to ∞, whence the critical value is unique. It remains to prove (a') and (b'). To prove this, we need the following lemma.

Lemma 1.1. For any wet vertex there is an open self-avoiding path from (0, 0) to this vertex.

Proof. Let us denote by v the vertex in question. Since v is wet, there is an open path from (0, 0) to v. If this path is self-avoiding, we are done. Let us assume that this path is not self-avoiding. Then its sequence of vertices v_0 = (0, 0), v_1, ..., v_n = v contains two identical vertices v_i = v_j. Let us exclude from this path all the vertices with numbers in the range i+1, ..., j. Thus we obtain another open path, which is shorter than the previous one. Let us repeat this operation, obtaining a sequence of open paths, each shorter than the previous one. This process must stop because the lengths of these paths are non-negative.

Lemma 1.2. Percolation from (0, 0) to ∞ on the checkered paper is equivalent to the existence of an open self-avoiding infinite path starting at (0, 0).

In one direction it is evident: if such a path exists, the set of its vertices is infinite and all of them are wet, so the set of wet vertices is also infinite. Now let us prove lemma 1.2 in the opposite direction. We can encode a path starting at (0, 0) by the sequence of directions of its edges as we pass them. For example, the path shown on figure 1.2 can be encoded as the sequence of directions east, north, west, north, west, south, west, south, south, south, east, south, east, east.

Let the set of wet vertices be infinite. Let us call S the set of open finite self-avoiding paths starting at (0, 0). Since for any wet vertex there is such a path leading there, S is infinite. Let us classify S into four subsets depending on the direction of the first edge in the path:

S = S_east ∪ S_north ∪ S_west ∪ S_south.

Since the union of these four sets is infinite, at least one of them is also infinite. Let it be S_east (the other cases are analogous). Then we classify S_east into three classes according to the direction of the second edge:

S_east = S_east,east ∪ S_east,north ∪ S_east,south.

Again, at least one of these subsets must be infinite. If it is, say, S_east,north, we classify it again:

S_east,north = S_east,north,east ∪ S_east,north,north ∪ S_east,north,west,

and again at least one of these subsets must be infinite. Thus we continue inductively.
At the n-th step of our inductive argument we already have a sequence of n directions such that the set of open self-avoiding finite paths starting with these directions is infinite. Since we can continue this argument infinitely, this sequence grows infinitely, thereby defining an infinite open self-avoiding path starting at (0, 0). Lemma 1.2 is proved.

Now let us prove the statement (a'). If there is an infinite open self-avoiding path starting at zero, then, by taking its first n steps, we obtain a finite open self-avoiding path starting at (0, 0) whose length is n. Let us estimate the probability of its existence. For any self-avoiding path of length n the probability that it is open is ε^n. The number of self-avoiding paths of length n starting at (0, 0) does not exceed 4 · 3^(n−1). Thus the event "there is an open self-avoiding path of length n starting at (0, 0)" is a union of at most 4 · 3^(n−1) events, the probability of each being ε^n. Therefore the probability of this event does not exceed the sum of their probabilities, which is

4 · 3^(n−1) · ε^n = (4/3) · (3ε)^n.

If ε < 1/3, this quantity tends to zero when n → ∞. But the probability of percolation from (0, 0) to ∞ is not greater than this quantity. Therefore the probability of percolation is zero for all ε < 1/3.

The proof of (b') is more difficult. Which sets of closed edges make percolation impossible? Let us call such sets obstacles. It is better to speak about minimal obstacles, that is obstacles all of whose proper subsets are not obstacles. One minimal obstacle is shown on figure 1.3. Closed edges are crossed and wet vertices are circled.

[Figure 1.3. One minimal obstacle, that is a minimal set of closed edges that makes percolation impossible. Closed edges are crossed, wet vertices are circled.]

You can see that the closed edges on figure 1.3 form some kind of fence around the origin. It is better to make the crossing bars longer, so that they form a continuous contour around (0, 0), as shown on figure 1.4. This observation can be turned into a rigorous statement. In the previous section we defined dual graphs for plane graphs. In an analogous way we may define dual graphs for infinite plane graphs, including the checkered paper. On figure 1.4 the dual graph is shown by dotted lines. There is a one-to-one correspondence between edges of the two graphs, namely every edge of the dual graph crosses exactly one edge of the original graph and vice versa, and the relation between their being open is exactly the same as (1).

Lemma 1.3. If the rule (1) is applied, there is no open path in the checkered paper from (0, 0) to ∞ if and only if in the dual graph there is an open contour surrounding (0, 0).

The proof of this lemma is published in [26]. The method we use here is known as the Peierls contour method because R. Peierls used it first; he applied it to the Ising model.

[Figure 1.4. The crossing bars form a contour surrounding the origin.]

Now let us prove assertion (b'). According to lemma 1.3, the probability that there is no percolation in checkered paper from (0, 0) to ∞ equals the probability of existence of an open contour surrounding (0, 0) in the dual graph. This probability does not exceed the sum, over all contours surrounding (0, 0), of the probability that a given contour is open. Let us estimate this sum. All contours have an even number of steps, and therefore this number can be denoted 2n, where n ≥ 2 because the minimal contour has length 4.
A contour having 2n steps is open with probability (1 − ε)^(2n). Thus the probability that there is no percolation from (0, 0) to ∞ does not exceed

Prob(no percolation from (0, 0) to ∞) ≤ Σ_{n=2}^{∞} C_n (1 − ε)^(2n),

where C_n is the number of different contours having 2n steps and surrounding (0, 0). It remains to estimate C_n. To determine a contour surrounding (0, 0) and having 2n steps, it is sufficient to:

i) Specify the i-coordinate of the leftmost point of intersection of our contour with the positive half of the axis i. This coordinate equals k + 1/2, where k is an integer number between zero and n − 2. (For example, k = 2 on figures 1.3 and 1.4.) Thus here we have n − 1 cases.

ii) Specify the directions of the 2n edges, starting from the edge which we hit in item i) and going counterclockwise along the contour. The first edge's direction is north, every other edge's direction has at most three possible values, and the last edge's direction is predetermined because the contour must return to its initial point, so the number of cases here does not exceed 3^(2n−2).

Therefore C_n ≤ (n − 1) · 3^(2n−2), and the probability that there is no percolation from (0, 0) to ∞ does not exceed

Σ_{n=2}^{∞} (n − 1) · 3^(2n−2) · (1 − ε)^(2n).

For 1 − ε small enough this sum is less than one, and this is what we need. In fact, this sum equals

( x² / (3(1 − x²)) )²,   where x = 3(1 − ε).

It is less than one if

ε > 1 − 1/(2√3) ≈ 0.71.

Thus the probability of percolation on checkered paper is zero if ε is small enough and positive if ε is large enough. We can define the critical value ε* as the supremum of those values of ε for which the probability of percolation is zero. Then

0 < 1/3 ≤ ε* ≤ 1 − 1/(2√3) < 1.

In chapter 9 we shall prove an important property of percolation: the probability of percolation is a non-decreasing function of ε in all the cases considered in this chapter. Hence in every case there is only one critical value ε* such that the probability of percolation is zero for ε < ε* and positive for ε > ε*. If the critical value ε* equals 0 or 1, we call it trivial; if it is in (0, 1), we call it non-trivial. Thus we have proved our main statement: the existence of a non-trivial critical value, which we define as the supremum of those values of ε for which the probability of percolation is zero.

Of course, our estimates of the critical value are very rough and can be improved. In fact, in the present case the critical value is known exactly: it equals 1/2. However, this is an exception connected with the fact that the dual graph of checkered paper is isomorphic to it. To prove this exact value is much more difficult. The proof was first obtained by Kesten and published in his book [16]. You can find a later version of this proof in [13]. Generally, dual graphs are not isomorphic to the original graphs and the critical values are difficult even to estimate. Even the computability of some of these critical values was brought to public attention only recently [14].

We formulated lemma 1.3 only for the checkered paper, but in fact it can be formulated in more general terms: for any periodic plane graph. A plane graph is called periodic if it has a 2-dimensional group of automorphisms. Let us assume that (0, 0) is the only source of liquid. As before, a vertex is wet if the liquid can reach it, and percolation from (0, 0) to ∞ means that the set of wet vertices is infinite.

Lemma 1.4. (A general version of Lemma 1.3.)
If rule (1) is applied, there is no percolation from (0, 0) to ∞ in a plane periodic graph, all of whose faces are bounded, if and only if in its dual graph there is an open contour surrounding (0, 0).

You can find the proof in [26]. One of the graphs for which this statement allows us to prove the existence of a critical value is shown on figure 1.5.

Exercise 1.1. Prove the existence of a critical value for the triangular lattice shown on figure 1.5.

Exercise 1.2. Let us consider one-dimensional percolation, where the set of vertices is Z and two vertices x and y are connected with an edge if |x − y| ≤ 100. As before, suppose that 0 is the only source of liquid, every edge is open with probability ε independently of the other edges, and percolation means that the set of wet vertices is infinite. Prove that in this case the probability of percolation is zero for all ε < 1. As before, we can define ε* as the supremum of those values of ε for which the probability of percolation is zero, but now this supremum is trivial: ε* = 1. This suggests that there is a qualitative difference between the one-dimensional and multi-dimensional cases in percolation.

[Figure 1.5. Continuous lines show a triangular lattice and dotted lines show the dual graph. In this case the dual graph is a hexagonal lattice.]

Exercise 1.3. Let us take the infinite graph used in theorem 1.2 and take its subgraph, keeping only those vertices (i, j) for which i ≤ j and j ≥ 0 and only those edges both ends of which are kept. As before, every edge is open with probability ε independently of the other edges. Let us denote by Π the probability of percolation from (0, 0) to ∞. Prove that Π undergoes a phase transition as ε grows from 0 to 1.

Chapter 2
Directed plane percolation: finite and infinite

Another kind of percolation, which is still more important in the present course, is directed percolation. In this case an edge may be closed in one direction and open in the other direction. Figure 2.1 shows a directed version of checkered paper.
[Figure 2.1. A directed version of checkered paper; all edges point east or north. One path starting at (0, 0) is shown by arrows.]

Figure 2.1 is a directed version of checkered paper. In this case a path can contain only east or north directed steps. One path starting at (0, 0) is shown by arrows. Suppose that every edge on figure 2.1 is open in the direction of its arrow with probability ε and closed with probability 1 − ε, independently of the others, and always closed in the opposite direction. As before, we call a path open if all its steps are open - in the direction in which we use them - and a vertex is wet if there is an open path from (0, 0) to this vertex. Percolation means, as before, that the set of wet vertices is infinite or, which is equivalent, that there is an infinite open path starting at (0, 0).

Theorem 2.1. Percolation from (0, 0) to ∞ on the directed checkered paper has a critical value ε* strictly between zero and one such that:
(a) If ε < ε*, the probability of percolation is zero.
(b) If ε > ε*, the probability of percolation is positive.

As before, it is sufficient to prove that (a') the probability of percolation is zero for ε small enough and (b') the probability of percolation is positive for ε large enough. The assertion (a') is easy to prove. However, when proving the assertion (b') we meet a new difficulty: the minimal obstacles are more complicated.

[Figure 2.2. One minimal obstacle on the directed checkered paper. Wet vertices are circled. This obstacle can also be presented as a contour surrounding (0, 0); see the next figure.]

First of all let us formulate how the states of edges of the dual graph depend on the states of edges of the original graph. We must choose one of the two orientations of the plane, and we choose the counterclockwise one. This choice is arbitrary, but it must be made. This is a kind of chirality similar to the choice which the police of any country must make: all vehicles must drive on one and the same side, right or left. A similar choice must be made in electromagnetism, the screw industry and other circumstances. Thus to every direction of an edge of the original graph there corresponds a direction of the corresponding edge of the dual graph: the direction from the right hand to the left hand as we advance straight ahead along the original edge in the chosen direction. So we adopt the following rule:

Every edge of the dual graph is open in a certain direction if and only if the corresponding edge of the original graph is closed in the corresponding direction.   (2)

Lemma 2.1. If rule (2) is applied, there is no percolation in a directed graph on a plane from (0, 0) to ∞ if and only if in the dual graph there is an open contour going around the source (0, 0) in the positive direction, that is counterclockwise.

The proof of lemma 2.1 is available in [26]. It is illustrated by figure 2.3, where one contour is shown.

[Figure 2.3. Contour corresponding to the obstacle shown on figure 2.2. Here i = 2 and j = 1 (the coordinates of its beginning and end).]
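Theorem 2.1 can also be probed numerically before we prove it. The following sketch is ours, not from the text (Python; the lattice size, sample count and values of ε are arbitrary choices). Since all edges point east or north, the wet set can be computed by a single sweep in order of increasing i + j, and reaching the anti-diagonal i + j = n serves as a finite-size substitute for percolation to ∞.

import random

def directed_wet(n, eps, rng):
    # east[i][j] / north[i][j]: is the edge out of (i, j) open in that direction?
    east = [[rng.random() < eps for _ in range(n + 1)] for _ in range(n + 1)]
    north = [[rng.random() < eps for _ in range(n + 1)] for _ in range(n + 1)]
    wet = [[False] * (n + 1) for _ in range(n + 1)]
    wet[0][0] = True
    for i in range(n + 1):          # (i-1, j) and (i, j-1) are always
        for j in range(n + 1):      # processed before (i, j)
            if i > 0 and wet[i - 1][j] and east[i - 1][j]:
                wet[i][j] = True
            if j > 0 and wet[i][j - 1] and north[i][j - 1]:
                wet[i][j] = True
    return wet

rng = random.Random(1)
for eps in (0.55, 0.65, 0.75):
    hits = 0
    for _ in range(100):
        wet = directed_wet(60, eps, rng)
        hits += any(wet[i][60 - i] for i in range(61))
    print(eps, hits / 100)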
Let us use lemma 2.1 to prove the assertion (b'). As before, the probability that there is no percolation does not exceed

Σ_{ω∈Ω} α^|ω|,   (3)

where α = 1 − ε, Ω is the set of all minimal obstacles and |ω| is the cardinality of ω. It follows from lemma 2.1 that to every minimal obstacle there corresponds a contour in the dual graph, like the one shown on figure 2.3. The original graph cuts the plane into infinitely many square faces and one unbounded face. Every contour in the dual graph starts and ends in one and the same face, namely the unbounded one. Let us denote by i the horizontal coordinate of its beginning and by j the vertical coordinate of its end. For the contour shown on figure 2.3, i = 2 and j = 1; generally i and j take any positive integer values. Also let us denote by e, n, w, s the numbers of east, north, west, south steps in a contour. Notice that w = i + e and n = j + s. The number of edges in the corresponding obstacle is n + w = i + j + e + s. The total number of steps in the contour is e + n + w + s = i + j + 2e + 2s. The directions of the first and last steps are determined uniquely and the directions of all the others are chosen from at most three options, so the number of contours with given e, n, w, s does not exceed 3^(e+n+w+s−2) = 3^(i+j+2e+2s−2).

The table below (figure 2.4) shows the probabilities of the original edges being open in all directions and the resulting probabilities for the dual edges under the rule (2).

Original graph                    Dual graph
east:  open with prob. ε          north: open with prob. α = 1 − ε
north: open with prob. ε          west:  open with prob. α = 1 − ε
west:  always closed              south: always open
south: always closed              east:  always open

Figure 2.4.

Therefore the probability of an obstacle is α^(n+w) = α^(i+j+e+s), whence (3) does not exceed

Σ_{i=1}^{∞} Σ_{j=1}^{∞} Σ_{e=0}^{∞} Σ_{s=0}^{∞} 3^(i+j+2e+2s−2) · α^(i+j+e+s)
= (1/9) · Σ_{i=1}^{∞} (3α)^i · Σ_{j=1}^{∞} (3α)^j · Σ_{e=0}^{∞} (9α)^e · Σ_{s=0}^{∞} (9α)^s
= (1/9) · (3α/(1 − 3α)) · (3α/(1 − 3α)) · (1/(1 − 9α)) · (1/(1 − 9α))
= ( α / ((1 − 3α)(1 − 9α)) )².

If α is small enough, e.g. smaller than 0.09, this expression is less than 1. Thus we have proved that the probability of percolation is positive as soon as ε > 1 − 0.09 = 0.91. This is only an estimate. The critical value is less than this; its exact value is unknown. It is not even proved yet that it is computable. Using the method of [14], it is probably possible to prove that this critical value is computable.

Exercise 2.1. For every natural n let us consider a finite directed graph Γ_n which is a square piece of the graph in theorem 2.1. In other words, its set of vertices is {0, ..., n}² and from every vertex (i, j) there go two directed edges: to (i + 1, j) and to (i, j + 1). Every edge is open in this direction with probability ε independently of the other edges and always closed in the opposite direction. Let us denote by Π the probability of percolation from (0, 0) to (n, n). Prove that Π undergoes a phase transition as n → ∞.

Chapter 3
Stavskaya process

You may wonder why we paid so much attention to percolation if the title of this course promised to speak about cellular automata. The answer is that among the many applications of percolation, the study of cellular automata plays an essential part.
So let us go to processes. Every process considered here is based on a set denoted by G and called the ground space. (Most often G = Z^d.) Elements of G are called points; we may imagine them as components of a multi-component system. Let us present our first example of a cellular automaton, called the Stavskaya process; Olga Stavskaya, advised by I. I. Piatetski-Shapiro, was the first person who wrote a computer program simulating the process named after her and experimentally showed the existence of a phase transition in that process. Our notations are somewhat different from the article [31], which initiated our studies of probabilistic cellular automata at Moscow University, partially summed up in [36, 37].

In the present case the ground space is G = Z. Dealing with PCA, we shall use an operator which we call random noise and denote R_αβ, where α and β are parameters with values in [0, 1]. The operator R_αβ substitutes

every 0 by 1 with probability α and every 1 by 0 with probability β,   (4)

doing this to all components simultaneously and independently. If one of these parameters equals zero, it may be omitted: R_α means R_αβ with β = 0 and R_β means R_αβ with α = 0.

Dealing with PCA, we in every case choose a finite or countable ground space G and a non-empty finite alphabet A. Its elements are called letters. In the present case A = {0, 1} and the letters are 0 and 1. We may imagine that the state 1 means the presence of a particle and the state 0 means an empty site. The set A^G is called the configuration space and denoted by Ω. In the present case Ω = {0, 1}^Z. Its elements are bi-infinite sequences called configurations. Every configuration x is determined by its components x_i ∈ {0, 1}, where i ∈ Z.

We shall consider a sequence of probability measures enumerated by t = 0, 1, 2, ..., which we call the "Stavskaya process". We assume that initially all the components are zeros for sure, and then, at every step of the discrete time, two transformations occur, denoted by D (deterministic) and by R_α (random), where α ∈ [0, 1] is a real parameter. The deterministic transformation D, when applied to a configuration x, turns it into the configuration y defined as follows:

y_i = 1 if x_i = x_{i+1} = 1, and y_i = 0 in all the other cases.

Speaking informally, every particle dies if it finds no protection from its right neighbor. The transformation R_α is probabilistic: under its action every component turns into the state 1 (present) with probability α, independently of what happens at other places.

We shall use pseudo-codes to formalize our ideas. We enumerate their lines for convenience of reference. The Stavskaya process with "all zeros" as the initial condition may be expressed by the following pseudo-code:

1  for all i ∈ Z do simultaneously
2      x(i, 0) ← 0
3  for t = 1 to ∞ do
4      for all i ∈ Z do simultaneously
5          x(i, t) ← min(x(i, t − 1), x(i + 1, t − 1))
6      for all i ∈ Z do simultaneously
7          if rnd < α then x(i, t) ← 1

The sign ← in lines 2, 5 and 7 is the assignment operator; x ← a means that the variable x is assigned the value a. Thus lines 1-2 assign the initial configuration "all zeros". Lines 3-7 perform the time steps: lines 4-5 correspond to the operator D and lines 6-7 correspond to the operator R_α. This pseudo-code uses a random number rnd, which is uniformly distributed between 0 and 1, newly generated every time it is called, and independent of all the previously generated random numbers.
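The pseudo-code cannot be run literally, because i ranges over all of Z, but it can be run exactly on a finite window: as noted below, the state of a site i at time t depends only on the initial states and the noise at sites i, ..., i + t, so a window that shrinks by one site per step reproduces the infinite process exactly on the surviving sites. A sketch of this (ours, not the author's; Python, with arbitrary window width and parameters):

import random

def stavskaya_density(alpha, t_max, width=2000, seed=0):
    # The state of site i at time t depends only on sites i..i+t at time 0,
    # so tracking width + t_max sites and letting the window shrink by one
    # site per step is exact on sites 0..width-1.
    rng = random.Random(seed)
    x = [0] * (width + t_max)
    for _ in range(t_max):
        x = [min(x[i], x[i + 1]) for i in range(len(x) - 1)]   # operator D
        x = [1 if rng.random() < alpha else v for v in x]      # operator R_alpha
    return sum(x[:width]) / width   # density of ones among the exact sites

for alpha in (0.05, 0.2, 0.4):
    print(alpha, stavskaya_density(alpha, 200))

For small α the density of ones stays visibly bounded away from one (zeros persist), while for larger α it approaches one.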
Let us represent the same idea in mathematical terms. We assume that the process is induced by independent random variables y(i, t), each of which equals 1 with probability α and 0 with probability 1 − α, by the map defined in the following inductive way:

Base of induction: x(i, 0) = 0 for all i ∈ Z.

Induction step: x(i, t) = max(y(i, t), min(x(i, t − 1), x(i + 1, t − 1))).

The state of the point 0 at time t depends only on the variables attached to a triangle of points; figure 3.1 shows this triangle for i = 0 and t = 3. (The axis of time is slanted to make the scheme symmetric.)

[Figure 3.1. The triangle of points (j, s), 0 ≤ s ≤ 3, 0 ≤ j ≤ 3 − s, on which the state x(0, 3) depends.]

[Figure 3.2. The part of the percolation model for the Stavskaya process which is relevant for the state of the point 0 at time 3, with sources s_0, s_1, s_2, s_3 and target T.]

Figure 3.2 shows that part of the percolation model for the Stavskaya process which is relevant for the state of the point 0 at time 3. In this model: edges are always open upward and closed downward; the vertices s_0, s_1, s_2, s_3 are always open; the other vertices are open with probability 1 − α and closed with probability α, independently of each other. There is a zero at the point (0, 3) (the point 0 at time 3) if and only if there is an open path in this graph from one of the sources s_i to the target T.

Lemma 3.1. In the Stavskaya process starting from "all zeros" there is a zero at a point (v, t) if and only if there is an open path from some initial vertex to the vertex (v, t) in the percolation graph.

Notice that the state of a point i at time t depends only on what happens in the triangle {(j, s) : 0 ≤ s ≤ t, i ≤ j ≤ i + t − s}. However, it is better to stretch every vertex, thus turning vertex percolation into bond percolation, as shown on figure 3.3.

[Figure 3.3. Stavskaya process as percolation. Presence of a zero at the point (0, 3) amounts to percolation in the graph shown by continuous lines from the source S to the target T. Dotted lines show the dual graph. The sides AB and BC correspond to one face.]

Let us imagine that the four vertices denoted s_0, s_1, s_2, s_3 on figure 3.2 are sources of liquid and that the arrows are directed pipes which can transmit this liquid upward, but not back. The inclined arrows are always open, but the vertical arrows may be open or closed because they imitate our random operator: every one of them is closed with probability α and open with probability 1 − α. Then the probability that there is a zero at the point 0 at time 3 in our process equals the probability that there is an open path from at least one of the sources s_0, s_1, s_2, s_3 to the target T. Thus we have reduced a problem about our random process to a problem about percolation.

However, it is better to have only one source. For this reason we introduce a special vertex S and connect it by edges with s_0, s_1, s_2, s_3. It is convenient to assume that these edges are always open in both directions - then the dual edges will be always closed in both directions and we don't even need to draw them.
For the same reason it is convenient to assume that the vertical edges of the original graph are always open downward; this does not create any unwanted opportunities for percolation because the slanted edges are always closed downward. Then, according to the rule (2), the edges of the dual graph (shown by dotted lines) are open as follows: the slanted dual edges are always open in their directions and always closed in the opposite directions; the horizontal dual edges are open in their direction with probability α and always closed in the opposite direction.

First let us prove that the critical value α* of the Stavskaya process satisfies α* ≤ 1/2. Indeed, the probability of percolation from the t = 0 level to the point (0, t) is

P(percolation) = P(∃ open path) ≤ Σ_path P(path is open) ≤ Σ_path (1 − α)^t = (2(1 − α))^t.

If α > 1/2, the last expression tends to zero as t → ∞.

Now let us prove that α* > 0. With this purpose we concentrate our attention on the dual graph shown on figure 3.3. According to lemma 2.1, there is no percolation in the original graph if and only if there is an open contour in the dual graph surrounding T and going in the positive (counterclockwise) direction. We may assume that every contour starts and ends at the topmost point B. The probability that there is such a contour does not exceed

Σ_{k=1}^{∞} C_k α^k,   (5)

where C_k is the number of such contours corresponding to obstacles with k elements, that is having k horizontal steps. Every contour has equal numbers of steps of all the three directions, so altogether it has at most 3k steps. Since every step of a contour has only three possible directions, C_k ≤ 3^(3k), and therefore the probability that there is a one at the site 0 at time t does not exceed

Σ_{k=1}^{∞} 3^(3k) · α^k = 27α / (1 − 27α),   (6)

which is less than one as soon as α < 1/54. Thus, whenever α < 1/54, the zeros do not die out because their density does not tend to zero. We have proved that 1/54 ≤ α* ≤ 1/2 for the Stavskaya process.

[Figure 3.4. The dual graph for the Stavskaya process. Thick continuous arrows show one contour surrounding T. Existence of an open contour surrounding T in the positive direction amounts to the presence of a one at the point 0 at time 3 in the process.]

Stavskaya process with a finite space

The Stavskaya process has been simulated on a computer using the Monte Carlo method. But a computer cannot deal with infinity! So in fact this simulation was performed with a finite space. To bring our pseudo-code closer to our simulation we need a finite space Z_m - the set of remainders modulo m, where m is an arbitrary natural number. The following pseudo-code shows how to do it, with the lines enumerated for reference purposes:

1  for all i ∈ Z_m do simultaneously
2      x(i, 0) ← 0
3  for t = 1 to t_max do
4      for all i ∈ Z_m do simultaneously
5          x(i, t) ← min(x(i, t − 1), x(i + 1, t − 1))
6      for all i ∈ Z_m do simultaneously
7          if rnd < α then x(i, t) ← 1

Operations with residues modulo m result in residues modulo m; in particular, (m − 1) + 1 = 0. A runnable version of this pseudo-code is sketched below.
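The following sketch is ours, not the author's (Python; m, α and the sample count are arbitrary choices). Besides running the finite process, it records the first time at which "all ones" is reached on the ring Z_m - the quantity estimated in theorem 3.1 below.

import random

def time_to_all_ones(m, alpha, rng, t_cap=10**6):
    # Stavskaya process on the ring Z_m, started from "all zeros";
    # returns the first t at which the configuration is "all ones".
    x = [0] * m
    for t in range(1, t_cap + 1):
        x = [min(x[i], x[(i + 1) % m]) for i in range(m)]   # operator D
        x = [1 if rng.random() < alpha else v for v in x]   # operator R_alpha
        if all(x):
            return t
    return t_cap   # cap reached; the true time is at least this large

rng = random.Random(2)
for m in (8, 16, 32):
    times = [time_to_all_ones(m, 0.8, rng) for _ in range(200)]
    print(m, sum(times) / len(times))

With α = 0.8 > 3/4 the mean time grows slowly with m; rerunning with α below 1/54 shows the opposite, exponential, regime already for small m (at the price of very long runs).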
It is easy to prove that, for any α > 0, on any finite space the zeros die out in a finite mean time. Indeed, it is sufficient to notice that all the zeros may die at once, with probability α^m. Hence the expectation of the time when all the zeros die out is not greater than α^(−m).

Why was this not observed in computer experiments? Notice that the number α^(−m) is enormous even for quite moderate values of α and m, much greater than the time which we can afford in an experiment. What we observe in an experiment is not the distinction between ergodicity and non-ergodicity, which pertains to infinite processes, but the distinction between fast and slow convergence, which pertains to finite processes. This distinction can be formulated as follows.

Theorem 3.1. Let us take the Stavskaya process with Z_m as the ground space and with the initial condition "all zeros". As before, α means the birth rate. Then:
(a) If α > 3/4, the mathematical expectation of the time when all components turn into ones grows at most as a logarithm of m when m → ∞.
(b) If α < 1/54, the mathematical expectation of the time when all components turn into ones grows at least as an exponent of m when m → ∞.

Our process starts with "all zeros". Let us estimate the expectation of the time T when it reaches "all ones". Since T takes only natural values,

E(T) = P(T ≥ 1) + P(T ≥ 2) + P(T ≥ 3) + ...   (7)

We may assume that m > 1 and shall consider two cases.

Case 1. Let α ≥ 3/4. As in the infinite case, we use percolation on the appropriate graph. The event T > t, that is, "all ones" is not yet reached at time t, takes place if and only if there is an open path from time 0 to time t. Every one of these paths is open with probability (1 − α)^t and there are m · 2^t such paths. So the probability that at least one of these paths is open does not exceed m · (2(1 − α))^t. But this probability is P(T > t). Therefore

P(T > t) ≤ m · (2(1 − α))^t ≤ m · 2^(−t).   (8)

Since T takes only natural values, we may rewrite the sum (7) as follows:

E(T) = P(T > 0) + P(T > 1) + P(T > 2) + ...   (9)

Of course, each of these terms does not exceed 1, but we use this trivial fact only for those terms whose parameter t does not exceed log₂ m. The others, whose parameter t exceeds log₂ m, do not exceed the terms of a geometric progression whose first term does not exceed 1 and whose common ratio does not exceed 1/2, so their sum does not exceed 2. Altogether we get

E(T) ≤ 2 + log₂ m.

Thus for large enough values of α the expectation E(T) grows at most as a logarithm of m.

Case 2. Let α ≤ 1/54. We reach the configuration "all ones" not earlier than at time t if and only if there are i, j ∈ Z_m such that there is an open path from (i, 0) to (j, t). Connecting all the vertices (i, 0) to one pole A and all the vertices (j, t) to another pole B, we get percolation on a finite plane graph. Let us denote by Γ this graph and by Γ' its dual. There is no percolation from any (i, 0) to any (j, t) if and only if there is no percolation from A to B. This is equivalent to the existence of an open contour in the dual graph which separates A from B. Any contour of this sort may be described as follows: we start at the point (0, 0) and let t grow until we meet an edge of Γ' belonging to this contour; then we move along our contour in the direction in which it surrounds A in the counterclockwise orientation. The length of this contour is not less than m, at every step we have to choose among at most three directions, and the starting height can take at most t values. Then, denoting by k the number of horizontal steps in such a contour, T ≤ t if and only if at least one of these contours is open, whence

P(T ≤ t) ≤ t · Σ_{k=m}^{∞} 3^(3k) · α^k.
Thus, remembering that 27α ≤ 1/2, we get for all t:

P(T ≥ t) ≥ 1 − 2t · (27α)^m.

Each of these terms is not less than 1/2 if t ≤ 1/(4 · (27α)^m). Remembering that α ≤ 1/54, this holds whenever t ≤ 2^m/4. Substituting 1/2 for each of these terms and discarding all the others, we get E(T) ≥ 2^m/8. Thus we have proved that for α small enough E(T) grows at least exponentially when m → ∞.

Thus for finite Stavskaya processes there is also some kind of phase transition. We may denote by α_log the infimum of those α for which the mathematical expectation of the time when all zeros die out grows at most as a logarithm of m when m → ∞. We may also denote by α_exp the supremum of those α for which this expectation grows at least as an exponent of m when m → ∞. So

0 < α_exp ≤ α_log < 1.

We guess that α_exp = α_log, but cannot prove it.

Chaos approximation

We have only very rough estimates of the critical value of the Stavskaya process. This is typical of probabilistic cellular automata and percolation. Considering these difficulties, it makes sense to examine approximations. One of them is called the "chaos approximation" or "mean-field approximation". Let us assume that every time the Stavskaya operator is applied, all components are randomly mixed. In this way we get a product measure after every time step, and one parameter x_t, the density of ones, is sufficient to describe what happens at all times. This parameter is determined by the conditions

x_0 = 0,   x_{t+1} = α + (1 − α) x_t².   (10)

Of course, the behavior of x_t is different from the behavior of the density of ones at time t in the Stavskaya process. However, their qualitative similarity is intriguing: in both cases there is a critical value of α. In physics such approximations of complex processes by simple iterations are widely used and called "mean-field approximations" because they can be interpreted as a substitution of individual particles and their interactions by some uniformly distributed mean field described by just one parameter - density.

We iteratively define a sequence x_0, x_1, x_2, ... with initial value x_0 = 0 by the rule

∀ t : x_{t+1} = α + (1 − α) · x_t²,   (11)

where α ∈ [0, 1] is a constant parameter. We are interested in the limit of x_t as t → ∞. The diagram 3.6 helps us to see that

lim_{t→∞} x_t = 1 if α ≥ 1/2, and lim_{t→∞} x_t = α/(1 − α) if α < 1/2.

[Figure 3.5.]

[Figure 3.6. The thick curve shows the limit lim_{t→∞} x_t as a function of α. The value α = 1/2 is critical.]

In figure 3.6 the thick curve shows the limit lim_{t→∞} x_t as a function of α. The value α = 1/2 is critical. Thus we see that even such a simple approximation has a critical point, only its value is 1/2, which is different from the critical value of the Stavskaya process.

Cayley tree and Bethe lattice

We can present a graph on which the chaos approximation is exact. It is based on ideas of Bethe and Cayley adapted to our kind of random processes. A part of this graph is shown on figure 3.7. The central point, marked with the black circle, represents the value of the zero-th component at a time t.

[Figure 3.7. Part of the Bethe lattice, where the chaos approximation for uniform operators with two neighbors is exact because any product measure turns into a product measure.]

Exercise 3.1. Prove that the limit lim_{t→∞} x_t of the iterations (11) exists and study its behavior depending on the parameter α. For which values of α does this limit equal 1 and for which values of α is it less than 1?
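As a numerical companion to exercise 3.1, here is a sketch (ours; Python) iterating (11) and comparing the result with the limit stated above. The iteration cap and tolerance are arbitrary choices, and convergence is very slow near the critical point α = 1/2.

def chaos_limit(alpha, t_max=100_000, tol=1e-12):
    # Iterate x_{t+1} = alpha + (1 - alpha) * x_t^2 from x_0 = 0.
    # The sequence is non-decreasing and bounded by 1, so it converges.
    x = 0.0
    for _ in range(t_max):
        x_next = alpha + (1 - alpha) * x * x
        if x_next - x < tol:
            break
        x = x_next
    return x

for alpha in (0.2, 0.4, 0.5, 0.6):
    predicted = 1.0 if alpha >= 0.5 else alpha / (1 - alpha)
    print(alpha, round(chaos_limit(alpha), 6), round(predicted, 6))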
Solved problem 3.1. Throughout this book the alphabet A is finite. Let us make an exception just to make the reader aware of this possibility. Let A = {0, 1, 2, 3, ...}, G = Z and Ω = A^Z. The initial condition is δ_0, concentrated in "all zeros". Let a deterministic operator D be defined as follows:

∀ x ∈ Ω, ∀ v ∈ G : (Dx)_v = min(x_v, x_{v+1}).

The random operator R_α increases every component by 1 with probability α, independently of all the others. So we get a sequence of measures μ_t = (R_α D)^t δ_0. Let us denote by E_t the mathematical expectation of the component x_0 with respect to μ_t. Let us say that this model grows if E_t tends to infinity as t → ∞. Then there is a critical value α* ∈ (0, 1) such that this model grows for α > α* and does not grow for α < α*.

Chapter 4
Deterministic cellular automata, aka DCA

Throughout this chapter our alphabet A will have a special element written as 0 and called "zero". The configuration "all zeros", all of whose components are zeros, is assumed to be invariant under those deterministic operators which we consider and denote by D:

D("all zeros") = "all zeros".   (12)

You may imagine the configuration "all zeros" as an abstract analog of some uniform tissue or crystal. Imagine that there is a small defect in this tissue: a defect in a crystal, a small tumor in a healthy tissue etc. We want to predict what will happen to this defect if we leave it alone: will it disappear, or grow, or remain the same?

The area of processes with local interaction is very large, and we try to restrict it. One of the means for that is this. In every case we take a set G which we call the ground space. Elements of G are called points. In the most typical case G = Z^d. Also we choose a natural number n and n maps v_i : G → G. Also we choose a non-empty finite set A called the alphabet and a transition function f : A^n → A. Given all this, we have a deterministic operator D : Ω → Ω defined as follows:

(Dx)_p = f(x_{v_1(p)}, ..., x_{v_n(p)}).   (13)

For any p ∈ G and i ∈ {1, ..., n} we call v_i(p) the i-th neighbor of p. Given a set G and maps v_i : G → G, we call a map H : G → G an automorphism if H is one-to-one and

∀ p ∈ G, ∀ i ∈ {1, ..., n} : v_i(H(p)) = H(v_i(p)).

We call this interaction scheme uniform, or USI, if for all p, q there is an automorphism H such that H(p) = q. All the random processes in this book are based on USI. The most usual examples of USI, where d is called the dimension, are: (a) G = Z^d and v_i(p) = p + v_i(0) for all i ∈ {1, ..., n}; (b) G = Z_m^d and v_i(p) = p + v_i(0) (mod m) for all i.

What we call DCA are traditionally called Cellular Automata. DCA have been studied for several decades and you can find a lot of references to them on the internet. In particular, John von Neumann [25] used a concrete two-dimensional DCA with 29 letters in the alphabet, each component having 9 neighbors including itself, to design an abstract "animal" capable of self-reproduction. John Conway designed a concrete two-dimensional DCA with the same 9 neighbors and only two letters in the alphabet, which he called the "Game of Life". Studies of "Gardens of Eden" provided a contribution to the theory of DCA. Stephen Wolfram organized a lot of computer experiments with DCA. Some early results about undecidability in DCA are available in [15]. You can find some modern studies of DCA by searching the internet for Computational Mechanics, Complexity and Emergence. Our results are motivated by the study of ergodicity of probabilistic cellular automata.

Attractors and Eroders. A configuration x is called a finite deviation, or f.d.,
of a configuration y if the set {v ∈ G : x_v ≠ y_v} is finite. This definition is interesting only if G is infinite because, if G is finite, all configurations are finite deviations of each other. Any map from A^G to A^G will be called a D-operator (where D means deterministic). We call a configuration x invariant for a D-operator D if Dx = x. Given two configurations x, y ∈ A^G, we say that x attracts y under D if x is invariant for D and there is a natural number t such that D^t y = x. We call a configuration x an attractor under a D-operator D if x is invariant under D and for any finite deviation y of x there is a natural number t such that D^t y = x. The notion of attractor is a deterministic analog of the notion of ergodicity, which we shall discuss later.

A configuration is called an island if it is a finite deviation from "all zeros". Due to our assumptions, D transforms any island into an island. ("All zeros" is an island because we claim that the empty set is finite.) In applications zero often represents an empty unit of space, while other letters represent units occupied by various particles. Let us call a deterministic operator D an eroder if, given it, the configuration "all zeros" is an attractor.

Our first theorem is about the non-decidability of the problem of eroders. In the bulk of mathematics, the greater the class of objects which we study, the more valuable are our results. There is, however, one area where it is the other way round: proofs of undecidability. If we prove the non-existence of an algorithm declaring "yes" or "no" for each object of a certain class, our result is the more valuable, the smaller this class is. We are going to declare undecidability of the problem of deciding which deterministic operators are eroders. To make this result as valuable as possible, we consider only DCA with the following additional conditions:

d = 1, f(0, ..., 0) = 0, V(0) = {−1, 0, 1}.   (14)

Theorem 4.1. For DCA satisfying the additional conditions (14) the problem of recognizing eroders vs. non-eroders is undecidable.

This theorem should not astonish us because DCA are a very rich class of objects. We do not present a proof of this theorem here; it can be found in [27]. To give you some taste of proofs of undecidability, we present a proof of undecidability in chapter 9.

Now we go to theorem 4.2, which asserts decidability of the problem of recognizing eroders vs. non-eroders in some cases. The main difference between theorem 4.1 and theorem 4.2 is that in the latter case we assume that the function f is monotonic. This is what it means. Let us enumerate A to obtain A = {0, ..., h} with the natural order on this set. We call a function f : A^V → A monotonic if

a_1 ≤ b_1, ..., a_W ≤ b_W ⟹ f(a_1, ..., a_W) ≤ f(b_1, ..., b_W).   (15)

        d = 1      d = 2   d = 3   d = 4
h = 4   [10]
h = 3   [10]       [24]
h = 2   [10]       [30]
h = 1   trivial    [35]    [35]    [35]

Figure 4.1.

Figure 4.1 illustrates our theorem 4.2: it shows that we know necessary and sufficient criteria for eroders in the case h = 1 [35] and in the case d = 1 [10]. Also we have some incomplete results about the case d = h = 2 [30, 24]. Monotonicity is essential in all these cases.
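For small Boolean functions (the case h = 1) the definition (15) can be checked mechanically. A sketch of such a check (ours; Python): it suffices to test single-coordinate increases, since any componentwise increase is a chain of such steps.

from itertools import product

def is_monotonic(f, n):
    # Definition (15) for a Boolean f of n arguments: raising any single
    # argument from 0 to 1 must never lower the value of f.
    for x in product((0, 1), repeat=n):
        for k in range(n):
            if x[k] == 0:
                y = x[:k] + (1,) + x[k + 1:]
                if f(*x) > f(*y):
                    return False
    return True

print(is_monotonic(lambda a, b, c: min(a, max(b, c)), 3))  # True
print(is_monotonic(lambda a, b: 1 - a * b, 2))             # False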
Eroders vs. non-eroders for the case h = 1.

Although the space of our operators is discrete, we now need to speak about the continuous real space R^d. A set in R^d is called convex if with any two points it contains the segment with ends at these points. Given a set in the real space R^d, its convex hull is the intersection of all convex sets containing this set. The following figure gives two examples.

[Figure 4.2. Examples of a set and its convex hull. ABCD is a non-convex quadrangle; the triangle ABD is its convex hull. The quadrangle EFGH is the convex hull of the finite set {E, F, G, H, I, J, K, L, M}.]

We are mostly interested in convex hulls of finite sets. For example, the convex hull of a point is the same point; the convex hull of two points is the segment with ends at these points; the convex hull of three points is either the triangle with these points as vertices or the segment with two of these points as ends and the third point between them. If you put nails into a board at several points and turn a cord around them, it takes the form of the boundary of the convex hull of these points.

Trying to find a solvable case, we restrict ourselves to the case A = {0, 1}, that is, h = 1. More than that, let us consider only uniform operators. This means that we choose a finite list of vectors v_1, ..., v_n ∈ Z^d and a Boolean function f : {0, 1}^n → {0, 1} and define D as follows:

(Dx)_v = f(x_{v+v_1}, ..., x_{v+v_n}) for all v ∈ G = Z^d.

However, even after all these restrictions the problem of discerning eroders remains algorithmically unsolvable. So we need a stronger assumption. Here it is: we assume that D is monotonic, which means that the function f(·) is monotonic, that is,

x_1 ≤ y_1, ..., x_n ≤ y_n ⟹ f(x_1, ..., x_n) ≤ f(y_1, ..., y_n).

Let us denote by V the list v_1, ..., v_n of neighbor vectors. Let us call a subset z of V a zero-set if

(∀ v ∈ z : x_v = 0) ⟹ f(x_{v_1}, ..., x_{v_n}) = 0.

We call a zero-set minimal if none of its proper subsets is a zero-set. Since V is finite, the set of its subsets is also finite, whence the set of minimal zero-sets is also finite. On the other hand, (12) implies that the set {v_1, ..., v_n} is a zero-set, whence the family of zero-sets is not empty, and we may denote the minimal zero-sets by z_1, ..., z_k. Now let us embed Z^d into the real space R^d and denote by conv(S) the convex hull of any set S ⊂ R^d. We denote by σ the intersection of the convex hulls of all the minimal zero-sets:

σ = conv(z_1) ∩ ... ∩ conv(z_k).   (16)

Theorem 4.2 [35]. D is an eroder if and only if σ is empty.

Let us consider several examples. We define the function median of any odd number of real arguments as follows: given an odd sample (x_0, ..., x_{2k}), we first reorder it into non-decreasing order (y_0 ≤ y_1 ≤ ... ≤ y_{2k−1} ≤ y_{2k}), and then we define the median as the middle term:

median(x_0, ..., x_{2k}) = y_k.

What we call "median" was called "voting" in some articles [39].

Example 4.1. Let d = 1, V = {−k, ..., k} and

(Dx)(0) = median(x(−k), ..., x(k)).

In this case a subset of V is a minimal zero-set if and only if it has k + 1 elements. Therefore σ has one element, zero, whence it is non-empty. Accordingly, D is not an eroder. It is sufficient to take the island x such that x(p) = 1 if 0 ≤ p ≤ k, and x(p) = 0 at all the other points.
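For d = 1 the criterion of theorem 4.2 is easy to mechanize, because the convex hull of a finite set of integers is just the interval between its minimum and maximum. The sketch below (ours; Python) extracts the minimal zero-sets of a monotonic Boolean transition function by brute force and tests whether the intervals have a common point; the example functions are the median of example 4.1 with k = 1 and a Stavskaya-like function min(x_0, x_1), both on V = {−1, 0, 1}.

from itertools import combinations

def minimal_zero_sets(f, V):
    # For a monotonic f, z ⊆ V is a zero-set iff f = 0 when the arguments
    # are 0 exactly on z and 1 elsewhere.
    def is_zero_set(z):
        return f(*[0 if v in z else 1 for v in V]) == 0
    zero_sets = [set(c) for r in range(len(V) + 1)
                 for c in combinations(V, r) if is_zero_set(set(c))]
    return [z for z in zero_sets if not any(w < z for w in zero_sets)]

def is_eroder_1d(f, V):
    # Theorem 4.2 in dimension one: conv(z) = [min z, max z], and the
    # intervals have an empty intersection iff some minimum exceeds some maximum.
    zs = minimal_zero_sets(f, V)
    return max(min(z) for z in zs) > min(max(z) for z in zs)

V = [-1, 0, 1]
print(is_eroder_1d(lambda a, b, c: sorted((a, b, c))[1], V))  # median: False
print(is_eroder_1d(lambda a, b, c: min(b, c), V))             # Stavskaya-like: True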
Example 4.2. Let d = 2, let f again be the median, and let us denote points, that is, elements of Z², by pairs (i, j), where i, j ∈ Z. We choose the following neighborhood:

V = {(0,0), (1,0), (−1,0), (0,1), (0,−1)}.

In this case σ also has one element, zero, and therefore is non-empty. Accordingly, D is not an eroder; as an island x which D does not erode we may take x(i, j) = 1 if 0 ≤ i ≤ 1 and 0 ≤ j ≤ 1, and x(i, j) = 0 at all the other points.

Example 4.3. The NEC operator, where NEC stands for North-East-Center. In this case the deterministic operator D_NEC turns any x ∈ Ω into D x defined as follows:

(D_NEC x)_{i,j} = median(x_{i,j+1}, x_{i+1,j}, x_{i,j}) for all i, j. (17)

For D_NEC there are three minimal zero-sets: {(0,1), (1,0)}, {(0,1), (0,0)}, {(1,0), (0,0)}. Their convex hulls are the segments with these points as ends: [(0,1), (1,0)], [(0,1), (0,0)], [(0,0), (1,0)]. The intersection of these three segments is empty. Thus the set σ for D_NEC is empty and the operator is an eroder. You may check this by looking at figure 3.5. In fact, both "all zeros" and "all ones" are attractors under this operator. To prove this, it is sufficient to notice that this function is 0-1 symmetric, which means that if all its arguments change their values, the function also changes its value.

Example 4.4 shows that σ may be non-empty but have no integer point:

(D x)(0,0) = min( max(x(0,0), x(1,1)), max(x(0,1), x(1,0)) ).

This is because D has two minimal zero-sets, namely {(0,0), (1,1)} and {(0,1), (1,0)}. Their convex hulls are the diagonals of the square with vertices (0,0), (1,0), (1,1), (0,1). Therefore in this case σ consists of one point (1/2, 1/2), the center of this square.

Example 4.5, flattening, for which

(D_flat x)_{i,j} = min( max(x_{i,j}, x_{i,j+1}), max(x_{i+1,j}, x_{i+1,j+1}) ). (18)

This formula already has the min-max form. Its σ is empty, and so is that of its analog (with 0 and 1 trading places). So both "all zeros" and "all ones" are attractors under D_flat. You can figure this out by looking at figures 4.3 and 4.4.

Theorem 4.3 (a version of Helly's theorem). If there is a finite family of convex sets in R^d such that every d + 1 of them have a common point, then all the sets in the family have a common point.

We don't need this theorem in full generality; it is sufficient to prove it for the case when all these convex sets are closed half-planes. We leave it to the reader to prove the following special case of Helly's theorem for d = 1: if there are several closed half-lines in a line, every two of which have a non-empty intersection, then all of them have a non-empty intersection. (A closed half-line is one of the two halves into which a line is cut by a point, including this point.) Also we need the following statement: if there are two closed convex sets in a plane which do not intersect, then there is a line in this plane which separates them, so that these sets are on different sides of this line. For our case, when all the sets in question are intersections of several closed half-planes, this statement is evident, and we also leave its proof to the reader.

Now let us prove by contradiction that if there is a finite family of closed half-planes in a plane such that every three of them have a common point, then all of them have a common point. Let n be the smallest number of closed half-planes in a counter-example, and let C_1, ..., C_n be closed half-planes in a plane whose intersection is empty although every three of them have a non-empty intersection. Since n is minimal with this property, the intersection I = C_1 ∩ ⋯ ∩ C_{n−1} is non-empty. However, I has no common points with C_n.
Then there is a straight line x in the plane which separates them, so that I and C_n are on different sides of it. Now for all i = 1, ..., n−1 we denote by D_i the intersection of C_i with this line x. Let us prove that every two of the sets D_1, ..., D_{n−1} have a common point. Let us take any two sets C_i and C_j. We know that C_i, C_j and C_n have a common point; this point lies on C_n's side of x, while I, which is also contained in C_i ∩ C_j, lies on the other side. Therefore the intersection C_i ∩ C_j has a point on one side of x and a point on the other side. Hence, since C_i ∩ C_j is convex, it has a common point with x, and this point belongs to D_i ∩ D_j, which therefore is non-empty.

Now we can apply Helly's theorem for the one-dimensional case to the sets D_1, ..., D_{n−1}, because all of them are closed half-lines (unless they are empty or coincide with the whole line, which is easy to handle). Since we have proved that every two of them have a common point, all of them must have a common point. Let us call this point y. But every D_i is a subset of C_i, whence y belongs to all of C_1, ..., C_{n−1}, whence it belongs to their intersection I. So the line x has a common point y with I, which contradicts our choice of the line x. This contradiction proves Helly's theorem in the case in which we need it.

Lemma 4.1. If σ is empty, then there are at most three zero-half-planes whose intersection is empty. (Recall that a zero-half-plane is a closed half-plane which contains a zero-set.)

Proof. Every convex hull of a zero-set can be represented as an intersection of several closed half-planes, all of which are zero-half-planes. Therefore σ can be represented as an intersection of several zero-half-planes. Since σ is empty, Helly's theorem implies that some at most three of them already have an empty intersection. Lemma 4.1 is proved.

Lemma 4.2. If there are at most three zero-half-planes whose intersection is empty, then D is an eroder.

What is important here is the minimal number of zero-half-planes whose intersection is empty. In the plane this number is either 2 or 3, and these cases should be considered separately. To get the idea in each case it is sufficient to examine our two examples: flattening for the former case and D_NEC for the latter.

Figure 4.3. Why flattening is an eroder, that is, why "all zeros" is an attractor under flattening. The set of points in state 1 is between two vertical lines until it disappears. The left line does not move; the right line moves in the direction of the arrow as time goes on.

Figure 4.4. Why "all ones" is an attractor under flattening. The set of points in state 0 is between two lines until it disappears. The lower line does not move; the upper line moves in the direction of the arrow as time goes on.

For flattening there are two zero-half-planes whose intersection is empty:

{(i,j) : i ≤ 0} and {(i,j) : i ≥ 1}.

Based on this, let us show that flattening is an eroder. Given a configuration x with a finite I_1(x), let us draw two vertical lines such that this set is between them. Then I_1(D^t x) is also between two vertical lines: one of these lines remains the same and the other moves towards it as t grows. Thus at every time step the distance between the lines decreases, and the configuration becomes "all zeros" in a time equal to the initial distance between the lines.

Figure 4.5. Why D_NEC is an eroder. The set of points in state 1 is in the triangle formed by three lines until it disappears. The horizontal and vertical lines do not move; the inclined line moves in the direction of the arrow as time goes on.
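Both erosion pictures are easy to reproduce numerically. A sketch of ours (the grid is a torus, chosen large enough that the wrap-around of np.roll never touches the island before it disappears):

    # Apply the flattening and NEC operators to a square island and count
    # the steps until it disappears (cf. figures 4.3-4.5).
    import numpy as np

    def flat(x):   # (18): min(max(x_{i,j}, x_{i,j+1}), max(x_{i+1,j}, x_{i+1,j+1}))
        up, right = np.roll(x, -1, 1), np.roll(x, -1, 0)
        return np.minimum(np.maximum(x, up),
                          np.maximum(right, np.roll(up, -1, 0)))

    def nec(x):    # (17): median(x_{i,j+1}, x_{i+1,j}, x_{i,j}), majority of three
        s = np.roll(x, -1, 1) + np.roll(x, -1, 0) + x
        return (s >= 2).astype(int)

    for op in (flat, nec):
        x = np.zeros((32, 32), dtype=int)
        x[8:16, 8:16] = 1                  # an 8 x 8 island of ones
        t = 0
        while x.any():
            x, t = op(x), t + 1
        print(op.__name__, t)              # erased in a time ~ island size

Flattening erases the island in a time equal to its width (the moving vertical line of figure 4.3); D_NEC needs about twice as long (the moving inclined line of figure 4.5).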
Due to its 0-1 symmetry, "all ones" is also an attractor under D_NEC. For D_NEC there are three zero-half-planes whose intersection is empty:

{(i,j) : i ≤ 0}, {(i,j) : j ≤ 0} and {(i,j) : i + j ≥ 1}.

Based on this, let us show that D_NEC is an eroder. For any configuration x with a finite set I_1(x) we draw three lines so as to place this set into a triangle bounded by these lines: one horizontal, one vertical, and one with slope −1. After every application of D_NEC the set I_1(D_NEC^t x) remains between the lines; the horizontal and vertical lines remain the same, but the line with slope −1 moves towards their intersection. So the set I_1(D_NEC^t x) shrinks to the empty set, and the configuration becomes "all zeros" in a time proportional to its initial size. Using these ideas you can prove lemma 4.2.

Lemma 4.3. If σ is not empty, then D is not an eroder.

We shall prove this for the case when σ contains the origin. In this case we shall present a configuration x, a finite deviation from "all zeros", such that I_1(D x) ⊇ I_1(x). By monotonicity this implies that I_1(D^t x) is non-empty for all t. Any set S + v = {i + v : i ∈ S}, where v is a vector, is called a shift of S. Let us call a set S obtuse for another set Q if any shift of S which intersects conv(Q) intersects Q also. Figure 4.6 illustrates this.

Figure 4.6. The set S is not obtuse for the set Q because its shift S + v intersects the convex hull of Q without intersecting Q.

Theorem 4.4 (Carathéodory's theorem on convex hulls of bounded two-dimensional sets). Let S be any set in a plane. Then a point p belongs to conv(S) if and only if it belongs to conv({p_1, p_2, p_3}), where p_1, p_2, p_3 are some points in S (some of which may coincide). In other words,

conv(S) = ⋃_{p_1, p_2, p_3 ∈ S} conv({p_1, p_2, p_3}).

We leave its proof to the reader.

Lemma 4.4. For any set S in a plane the set −3 conv(S) is obtuse for S.

We leave to the reader the proof of this lemma when S consists of one, two or three points; assuming that for these cases the lemma is already proved, let us prove it in general. Let us take any shift S + v of S, assume that it does not intersect −3 conv(S), and prove that conv(S) + v also does not intersect −3 conv(S). Let us take any point p ∈ conv(S). Due to theorem 4.4, there are p_1, p_2, p_3 ∈ S such that p ∈ conv({p_1, p_2, p_3}). By assumption, S + v does not intersect −3 conv(S), whence p_1 + v, p_2 + v, p_3 + v do not belong to −3 conv(S). Since our lemma is already proved for sets containing at most three points, we can conclude that conv({p_1 + v, p_2 + v, p_3 + v}) does not intersect −3 conv(S), whence p + v does not belong to −3 conv(S). Thus conv(S) + v does not intersect −3 conv(S). Lemma 4.4 is proved.

Now notice that for any S_1, S_2, z ⊆ R²: if S_1 is obtuse for z and S_2 is non-empty, then S_1 + S_2 is obtuse for z, where + means the vector sum: S_1 + S_2 = {s_1 + s_2 : s_1 ∈ S_1, s_2 ∈ S_2}.

The number of minimal zero-sets is finite, since all of them are subsets of V, so we can denote them z_1, ..., z_n. For every minimal zero-set z_i we have a bounded set S_i obtuse for it. Then the vector sum S_1 + ⋯ + S_n is also bounded and obtuse for all minimal zero-sets. Let us add a large enough ball S_0, to be sure that the intersection of the total vector sum with Z² is non-empty, and define S = S_0 + S_1 + ⋯ + S_n. Thus we have a bounded set S which is obtuse for all zero-sets.
Let us prove that the configuration x with I_1(x) = S ∩ Z² is not eroded by D. In fact we shall prove that I_1(D x) ⊇ I_1(x). Assume the opposite: there is a point v such that x_v = 1 but (D x)_v = 0. Since (D x)_v = 0, there must be a minimal zero-set z such that x_{v+i} = 0 for all i ∈ z. Therefore z + v does not intersect S. Since S is obtuse for z, the convex hull of z + v also does not intersect S. Since σ contains the origin, the convex hull of z also contains the origin, so the convex hull of z + v contains v, whence v does not belong to S, which contradicts our assumption that x_v = 1. Lemma 4.3 is proved.

Example 4.6. Note that σ may contain no integer points and still be non-empty. Let D : Ω → Ω, where Ω = {0,1}^G, G = Z², be defined as follows:

(D x)_{i,j} = min( max(x_{i,j}, x_{i+1,j+1}), max(x_{i+1,j}, x_{i,j+1}) ).

a) What is σ in this case? b) Let us denote by x the configuration such that I_1(x) = {(0,0), (1,0), (0,1), (1,1)}. What are the sets I_1(D^t x) for t = 1, 2, 3, ...?

Exercise 4.1. Prove another version of Helly's theorem: if there is a family of bounded closed convex sets in R^d such that every d + 1 of them have a common point, then all the sets in the family have a common point. (The family does not need to be finite.) You can find a proof of a very general version of Helly's theorem and many related facts in [28].

Exercise 4.2. Prove that the following statement is false: "If there is a family of convex sets in R^d such that every d + 1 of them have a common point, then all the sets in the family have a common point."

Chapter 5. σ is empty ⟺ D is an eroder ⟺ RD may be non-ergodic

In this chapter we do not yet consider cellular automata in general; instead we consider an important class of them. As before, DCA means a Deterministic Cellular Automaton as defined in chapter 4, usually denoted by D with or without a subscript. An RD-operator is a composition of a deterministic operator D and a random operator R defined in (4). Following tradition, we apply them in one order (first D, then R) but write them in the opposite order (first R, then D). Let us define a deterministic operator D as we did in chapter 4: the ground space is Z^d, we choose a natural number n and n different neighbor vectors v_1, ..., v_n ∈ Z^d, and we need a function f : A^n → A.

One class of RD-operators are percolation operators. In this case A = {0, 1}; you may imagine that "one" means that a component is alive and "zero" means that it is dead. In this case we define the function f as follows:

f(a_1, ..., a_n) = 0 if a_1 = ⋯ = a_n = 0, and 1 in all the other cases. (19)

The random operator R_α acts as defined in (4). The main purpose of this chapter is to present and partially prove the following theorem about RD-operators.

Theorem 5.1. Let A = {0, 1}. Let D denote a monotonic DCA acting on A^G, where G = Z^d, which turns "all zeros" into "all zeros". Let the set σ ⊂ R^d be defined as in (16). Then:

(a) The following limit exists for all α:

lim_{t→∞} (R_α D)^t δ_0. (20)

(b) If σ is empty, then D is an eroder and the limit (20) is different from δ_1 for α small enough; this limit tends to δ_0 as α → 0.

(c) If σ is not empty, then D is not an eroder and the limit (20) equals δ_1 for all α > 0.

The scheme in figure 5.1 shows the parts of which the proof consists. By a closed half-plane we mean one of the two halves into which a plane is divided by a line, including this line. A zero-half-plane means a closed half-plane which contains a zero-set.
Figure 5.1. Scheme of the proof of theorem 5.1. Every arrow represents an argument:

σ is empty ⟹ there are at most three zero-half-planes whose intersection is empty ⟹ D is an eroder, and R_α D is non-ergodic for small α;

σ is not empty ⟹ D is not an eroder ⟹ R_α D is ergodic for all α > 0.

But first of all we need the following lemma.

Lemma 5.1. If D is not an eroder, then R_α D is ergodic for all positive α.

For this purpose we estimate the probability that the 0-th component is zero after t applications of R_α D to any initial configuration and prove that this probability tends to zero as t → ∞. Since D is not an eroder, there is a finite deviation x from "all zeros" not eroded by it. So I_1(D^t x) is not empty for all natural t, and we can choose a point p_t in it. For every u ∈ [1, t] let us consider the event "at time u the random operator R_α turns all the components of I_1(x) − p_{t−u} into ones". Each of these events is sufficient for the point 0 to be in state 1 at time t. These events are independent of each other (because they pertain to different times), and each of them happens with probability α^C, where C is the number of elements of I_1(x). So the probability that none of these events happens is (1 − α^C)^t, which tends to zero as t → ∞. Thus the probability to have zero at the origin tends to zero as t → ∞. The same is true for any point, and this is sufficient for zeros to die out.

Lemma 5.2. If there are at most three zero-half-planes whose intersection is empty, then R_α D is non-ergodic for small enough α > 0.

A general proof needs some sophisticated technique: either branching analogs of contours [35], or renormalization [4]. [21] contains a good explanation of the branching method for the NEC operator.

Flattening: its non-ergodicity. We shall prove lemma 5.2 for only one particular case: P = R_α D_flat, where D_flat was defined in example 4.5, formula (18). Notice that δ_1 is invariant for P, so it is sufficient to prove that P^t δ_0 does not tend to δ_1 as t → ∞ for α small enough. In fact we shall estimate from above the density of ones in the measures P^t δ_0, uniformly in t.

Let us use the triple (i, j, t) when we speak of the point (i, j) at time t. We shall imagine this triple as a point in a three-dimensional integer space where the axes i, j are horizontal and the axis t goes upward. We may also denote a point (i, j, t) by one letter, say A, and in this case we write i = i(A), j = j(A), t = t(A). Let x(i, j, t) equal 1 if there is a particle at (i, j, t) and 0 otherwise. Also let us use independent random variables y(i, j, t), which equal 1 if the random operator turned zero into one at this point and 0 otherwise. Every y(i, j, t) equals one with probability α and zero with probability 1 − α, independently of the other variables y(·). Thus we can define the variables x(i, j, t) in the following inductive way.

Base of induction: x(i, j, 0) = 0 for all i, j.

Induction step: x(i, j, t) = max(y(i, j, t), z(i, j, t)), where the intermediate variable z(i, j, t) is the (i, j)-th component of the result of applying D_flat to the configuration at time t − 1, that is,

z(i, j, t) = min( max(x(i, j, t−1), x(i, j+1, t−1)), max(x(i+1, j, t−1), x(i+1, j+1, t−1)) ).
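In runnable form this induction looks as follows (a sketch of ours on a finite torus, which only approximates the infinite system; since on a finite torus "all ones" is absorbing, we watch only a bounded time horizon, and the size, horizon and seed are arbitrary):

    # Simulate x(i, j, t) for P = R_alpha D_flat starting from "all zeros"
    # and report the density of ones at the final time.
    import numpy as np

    def flat(x):
        up, right = np.roll(x, -1, 1), np.roll(x, -1, 0)
        return np.minimum(np.maximum(x, up),
                          np.maximum(right, np.roll(up, -1, 0)))

    def density_of_ones(alpha, N=64, T=300, seed=0):
        rng = np.random.default_rng(seed)
        x = np.zeros((N, N), dtype=int)         # base of induction
        for t in range(1, T + 1):
            y = (rng.random((N, N)) < alpha).astype(int)
            x = np.maximum(y, flat(x))          # induction step
        return float(x.mean())

    print(density_of_ones(0.05), density_of_ones(0.45))

For small α the density of ones stays small uniformly over the horizon, which is what lemma 5.2 asserts for the infinite system; for large α the ones take over.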
The following pseudo-code expresses the same idea on the infinite lattice:

    for all (i, j) ∈ Z² do simultaneously:
        x(i, j, 0) ← 0
    for t = 1 to ∞:
        for all (i, j) ∈ Z² do simultaneously:
            x(i, j, t) ← (D_flat x(t−1))(i, j)
            if rnd < α then x(i, j, t) ← 1

Here (D_flat x(t−1))(i, j) is the (i, j)-th component of the result of applying D_flat to the configuration x(t−1), whose components are x(i, j, t−1), i, j ∈ Z.

Let us estimate from above the probability that there is a particle at (0, 0) at time T, that is, that x(0, 0, T) = 1. We shall cover this event by several events and estimate the sum of their probabilities. To each of these events there will correspond a special path. If the k-th and (k+1)-th vertices of this path are (i_k, j_k, t_k) and (i_{k+1}, j_{k+1}, t_{k+1}), then we denote ∆_k = (∆i_k, ∆j_k, ∆t_k), where ∆i_k = i_{k+1} − i_k, ∆j_k = j_{k+1} − j_k, ∆t_k = t_{k+1} − t_k.

Let us take an arbitrary realization of our process and construct the event to which it belongs, along with the corresponding path. We shall proceed inductively, at every step constructing some event and some path and proving the following induction assumptions about them:

a) The path leads from (0, 0, T) to (1, 0, T).

b) The path has steps of only the three following types: down, having ∆i = 0 and ∆t = −1; horizontal, having ∆i = 1 and ∆t = 0; up, having ∆i = −1 and ∆t = 1. In all three cases |∆j| ≤ 1.

c) If a horizontal step starts at (i, j, t), then x(i, j, t) = 1.

Base of induction. The event is "x(0, 0, T) = 1" and the path is (0, 0, T) → (1, 0, T). Evidently all the assumptions are fulfilled.

We stop when our path has the following property: d) for every vertex (i, j, t) where a horizontal step starts, y(i, j, t) = 1.

Figure 5.2. An illustration of one induction step, showing the points A = (i, j, t), C = (i, j_1, t−1) and D = (i+1, j_2, t−1). The axis j is not drawn because it is perpendicular to the page. If x(i, j, t) = 1 but y(i, j, t) = 0, this value of x must be "inherited" from the previous time step: there must be points (i, j_1, t−1) and (i+1, j_2, t−1) where x equals one, j_1 and j_2 being equal to either j or j + 1.

Induction step. Suppose that there is a vertex A = (i, j, t) where a horizontal step AB starts, but y(i, j, t) = 0. By the induction assumption, x(i, j, t) = 1. Since the variable y at this point is zero, the value of x at this point is inherited from the previous time step. Looking at figure 5.2 will help you realize that there must be two other points C and D, where x equals one, such that

t(C) = t(D) = t(A) − 1, i(C) = i(A) and i(D) = i(A) + 1.

Then we define another vertex E by the rule: the vector from D to E is the same as the vector from A to B. After that we change our path as follows: instead of going straight from A to B, we go from A to C, then to D, then to E, and thence to B. In other words, we insert C, D, E between A and B into the sequence of vertices of our path. It is easy to prove that the new path also satisfies the induction assumptions a), b), c).

This induction process cannot last forever because x(i, j, 0) ≡ 0. When it stops, we have a path satisfying the requirements a), b), c) and d), but this path may not be self-avoiding yet. We want a self-avoiding path with all these properties. To obtain it, we use "delooping" similar to the one we used when proving lemma 1.1. If the path which we have is not yet self-avoiding, it visits some vertex twice and makes a loop between these visits.
We eliminate this loop (keeping only one of these visits) from our path, thus obtaining a shorter path which has all the properties a), b), c) and d). We repeat this until we get a self-avoiding path with these properties.

Our event is that y(i, j, t) = 1 for all those vertices of our path where horizontal steps start. Since the path is self-avoiding, the probability of this event is not greater than α^k, where k is the number of horizontal steps in the path. Notice that the number of down steps is equal to the number of up steps and both equal k − 1. So the length of the path is 3k − 2. The number of possible paths of length 3k − 2 does not exceed C^{3k−3}, where C is the number of different vectors (∆i, ∆j, ∆t) allowed by condition b). Thus the sum of the probabilities of all the events does not exceed

∑_{k=1}^∞ C^{3k−3} α^k = α / (1 − C³ α).

For α small enough this sum is less than 1, whence the measures P^t δ_0 do not tend to δ_1. Lemma 5.2 is proved.

Exercise 5.1. Let us take percolation operators with the initial configuration "all ones", that is, "all alive". (a) Let us take percolation operators with n = 1. Prove that all of them are ergodic whenever α > 0. (b) Let us take percolation operators with n = 2. Prove that these operators behave like the Stavskaya operator: all of them have one and the same critical value.

Exercise 5.2. Let d = 1 and let D : Ω → Ω be defined thus: (D x)_i = median(x_{i−1}, x_i, x_{i+1}). Let also R_αβ be the random noise defined in (4). Prove that the composition R_αβ D is ergodic whenever 2/3 < α + β < 4/3.

Exercise 5.3. Prove that the operator R_αβ D_NEC is ergodic as soon as 2/3 < α + β < 4/3.

Solved problem 5.1. [10] contains a criterion of eroders for an arbitrary finite set of states, but only for dimension d = 1.

Solved problem 5.2. Theorem 5.1 states a connection between eroders and the existence of critical values. This connection does not hold for greater numbers of states of components. This is shown by the following one-dimensional counter-example, where every point has three states: 0, 1 and 2. In this case Ω = {0, 1, 2}^Z and the uniform deterministic operator D is defined by the rule

∀ x ∈ Ω, v ∈ Z : (D x)_v = f(x_{v−1}, x_v, x_{v+1}),

the function f being defined as follows:

f(x_{−1}, x_0, x_1) = 1 if x_{−1} = 1, x_0 = x_1 = 2;
f(x_{−1}, x_0, x_1) = 0 if x_{−1} = x_0 = 1, x_1 = 0;
f(x_{−1}, x_0, x_1) = median(x_{−1}, x_0, x_1) in all the other cases.

a) Check that the function f(·) is monotonic. b) Let us call a configuration x a finite deviation from "all zeros" if the set {v ∈ Z : x_v ≠ 0} is finite. Prove that D is an eroder, that is, for any finite deviation x from "all zeros" there is t such that D^t x = "all zeros". c) Let R_α be a random operator which turns any component into the state 2 with probability α, independently of the others. Prove that for any α > 0 the operator R_α D is ergodic [36].

Unsolved problem 5.1. Does uniqueness of the invariant measure imply ergodicity? In other words, is there a non-ergodic cellular automaton which has exactly one invariant measure?

Unsolved problem 5.2. Along with the NEC operator, the article [39] wrote about another operator R_αβ D_median acting on the same space, where D_median is defined by

(D_median x)_{i,j} = median(x_{i,j}, x_{i+1,j}, x_{i,j+1}, x_{i−1,j}, x_{i,j−1}), (21)

where the Boolean function median(·) of five arguments equals 0 if the majority of its arguments equal 0 and equals 1 if the majority of its arguments equal 1. Since neither "all zeros" nor "all ones" is an attractor under D_median, theorem 5.2 cannot be used.
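Experimenting with this operator is straightforward. A sketch of ours (finite torus; the size, horizon, noise level and seed are arbitrary), run from the two contrasting initial conditions:

    # Simulate R_alpha_beta D_median on an N x N torus from "all zeros" and
    # from "all ones" and compare the final densities of ones.
    import numpy as np

    def median5(x):
        # (21): the majority of the five arguments, i.e. 1 iff at least 3 are 1.
        s = (x + np.roll(x, 1, 0) + np.roll(x, -1, 0)
               + np.roll(x, 1, 1) + np.roll(x, -1, 1))
        return (s >= 3).astype(int)

    def run(alpha, beta, start, N=64, T=400, seed=0):
        rng = np.random.default_rng(seed)
        x = np.full((N, N), start, dtype=int)
        for _ in range(T):
            x = median5(x)
            r = rng.random((N, N))
            x = np.where((x == 0) & (r < alpha), 1,
                         np.where((x == 1) & (r < beta), 0, x))
        return float(x.mean())

    print(run(0.05, 0.05, 0), run(0.05, 0.05, 1))

For small α = β the two runs typically settle at densities far apart over this horizon; on a finite torus this is evidence, not proof.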
Computer modelling suggested that this operator is non-ergodic for small enough α = β. Also it seems plausible that this operator is ergodic whenever α ≠ β. Both statements remain unproved.

Chapter 6. Measures

If we want to define a normalized probability distribution on a finite or countable set Ω, it is easy: we just attribute a non-negative probability Prob(ω) to every ω ∈ Ω so that

∑_{ω∈Ω} Prob(ω) = 1.

Then for any S ⊂ Ω we simply assign

Prob(S) = ∑_{ω∈S} Prob(ω)

and obtain a probability distribution with many good qualities. The situation is more difficult when Ω is uncountable.

Let us recall some notions which we introduced in the previous chapters. Suppose that we have a non-empty finite set denoted by A and called the alphabet. Elements of A are called letters. We call G the ground space and Ω = A^G the configuration space. Elements of G are called points and elements of Ω are called configurations. Every configuration ω is determined by its components ω_i for all i ∈ G. For any i ∈ G and any a ∈ A we call the set

{ω : ω_i = a} (22)

a basic set. A finite intersection of basic sets is called a thin cylinder, and a finite union of thin cylinders is called a cylinder. Cylinders form an algebra A because a finite intersection of cylinders, a finite union of cylinders and the complement of a cylinder are cylinders. We want to have a measure µ defined on all cylinders in Ω which satisfies the following conditions:

(a) µ(empty set) = 0;
(b) µ(Ω) = 1; (23)
(c) if S_1 ∩ S_2 is empty, then µ(S_1 ∪ S_2) = µ(S_1) + µ(S_2).

But this is not enough, because many important subsets of Ω are not cylinders. For example, any set consisting of only one configuration is not a cylinder. To include into consideration many important sets which are not cylinders, we define the σ-algebra A′ as the minimal set which includes A and also satisfies the following rule: any countable intersection and any countable union of elements of A′ also belong to A′. Now instead of the condition (c) we want the stronger condition (c′): if every two sets of a sequence S_1, S_2, S_3, ... of subsets of Ω have an empty intersection, then

µ(S_1 ∪ S_2 ∪ S_3 ∪ ...) = µ(S_1) + µ(S_2) + µ(S_3) + ...

Theorem 6.1 (Carathéodory's extension theorem). If we have a measure µ defined on all elements of an algebra A which satisfies the conditions (23), then there is a measure µ′ defined on all the elements of the corresponding σ-algebra A′ which coincides with µ on A and satisfies the condition (c′) on all the elements of A′.

A proof may be found in [1] or [6]. One example of the usefulness of this extension: given any configuration x, we obtain the set which consists only of this configuration as the countable intersection

⋂_{i∈G} {ω : ω_i = x_i}.

Let us denote by M the set of probability measures on Ω, that is, on the σ-algebra generated by the cylinders as described above. By convergence in M we mean convergence on all thin cylinders (or on all cylinders, which is the same).

Theorem 6.2. The set M of normalized measures on the σ-algebra generated by cylinders in Ω is compact.

Proof. In fact we shall prove that every sequence in M has a convergent subsequence; it is well known that this is equivalent to compactness of M. Let C_1, C_2, C_3, ... be a list of all thin cylinders. Now let us take an arbitrary sequence µ_i ∈ M and prove that it has a converging subsequence ν_i. Let us consider the sequence µ_i(C_1), i.e. the values of our measures on C_1.
These values are real numbers between zero and one, so their sequence has a converging subsequence. So the sequence (µ_i) has a subsequence (µ¹_i) whose values on C_1 converge. Let us define ν_1 = µ¹_1 and consider the sequence of values

µ¹_i(C_2), i = 2, 3, 4, ... (24)

Again, this sequence has a converging subsequence, whence the sequence (24) has a subsequence (µ²_i) of measures whose values on C_1 and C_2 converge. Let us define ν_2 = µ²_1 and consider the values

µ²_i(C_3), i = 2, 3, 4, ... (25)

Arguing in the same way, we obtain a subsequence whose values on C_3 converge, and so on. Thus we define a sequence (ν_1, ν_2, ν_3, ...) whose values on all thin cylinders converge. Thus we have found a subsequence which converges on all thin cylinders. Theorem 6.2 is proved.

An important class of measures is product measures. We call µ a product measure if for any thin cylinder C = {ω : ω_{i_1} = a_{i_1}, ..., ω_{i_n} = a_{i_n}} we have

µ(C) = ∏_{j=1}^{n} µ(ω_{i_j} = a_{i_j}).

Uniformity. We consider measures on A^Z. Shifts on Z induce shifts on M, and we call a measure µ uniform if it is invariant under all shifts.

Existence of non-measurable sets. The idea of the proof is not new: a similar one was used in [17] on p. 18, in [18] on pp. 13-14, and essentially in the same way in [6] on pp. 408-409. All of these arguments are in the spirit of Lebesgue measure; accordingly, the space (the analog of our configuration space) is a circumference, and two sets are congruent if one turns into the other by a rotation. In our case (in the style of our approach) the configuration space is A^Z and two sets are congruent if one turns into the other by a shift. Now we present our argument.

For every v ∈ Z we define a map T_v : Ω → Ω as follows: for any configuration x = (..., x_{−1}, x_0, x_1, ...) and all integer i,

(T_v(x))_i = x_{i−v}.

Let us call such maps translations. We call two sets S_1, S_2 ⊂ Ω equivalent if one of them may be turned into the other by a translation. This is consistent with the general mathematical notion of equivalence and has all its properties, including the representation of Ω as a union of classes of equivalence. Suppose we want to attribute to every S ⊂ Ω a non-negative number µ(S), called its measure, which satisfies the following conditions:

(a) µ(empty set) = 0.
(b) For every configuration x ∈ Ω: µ({x}) = 0.
(c) µ(Ω) = 1.
(d) If S_1 ⊂ S_2 ⊂ Ω, then µ(S_1) ≤ µ(S_2).
(e) If S_1, S_2 ⊂ Ω and S_1 and S_2 are equivalent, then µ(S_1) = µ(S_2).
(f) If every two sets of a sequence S_1, S_2, S_3, ... ⊂ Ω have an empty intersection, then µ(S_1 ∪ S_2 ∪ S_3 ∪ ...) = µ(S_1) + µ(S_2) + µ(S_3) + ...

Theorem 6.3. The conditions (a), (b), (c), (d), (e), (f) are incompatible.

Notice that these conditions are redundant; for example, (a) follows from (b) and (d). But such as they are written, these conditions are good enough for us.

Proof. Let us assume all these conditions and come to a contradiction. We have called two configurations x, y equivalent if there is v ∈ Z such that T_v(x) = y. Evidently this "equivalence" satisfies all the conditions of an equivalence relation in the general sense, so it allows us to partition Ω into classes of equivalence. Let us call a configuration x periodic if there exists a shift T_v, where v ≠ 0, such that x = T_v(x). The set of periodic configurations is countable, therefore it has measure 0. Therefore the set of non-periodic configurations has measure 1. This set is already partitioned into classes.
For every class of non-periodic configurations let us choose just one representative, an element of this class. Putting all these representatives together, we get a set S. Notice that no two different elements of S belong to one class, and therefore no two different elements of S are equivalent. Now let us consider the translations T_v(S) of S, indexed by the vector v ∈ Z.

Lemma 6.2. If v ≠ w, then the sets T_v(S) and T_w(S) have an empty intersection.

Proof by contradiction. Let T_v(S) and T_w(S) have a common configuration x: x ∈ T_v(S) and x ∈ T_w(S), whence T_{−v} x ∈ S and T_{−w} x ∈ S. Then S has two different equivalent elements T_{−v} x and T_{−w} x, which is impossible. Lemma 6.2 is proved.

Now notice that the translations T_v(S) over all vectors v have pairwise empty intersections and their union has measure 1. Therefore

∑_{v∈Z} µ(T_v(S)) = 1.

Now let us consider two cases: (a) µ(S) = 0; then µ(Ω) = 0. (b) µ(S) > 0; then µ(Ω) = ∞. In both cases µ(Ω) is not 1, as it should be. This contradiction proves theorem 6.3.

Exercise 6.1. Let A = {0, 1} and G = Z. Prove that the following sets belong to the σ-algebra generated by cylinders in Ω = A^G:

(a) {x ∈ Ω : lim_{n→∞} (x_1 + ⋯ + x_n)/n = 1/2};
(b) {x ∈ Ω : lim_{n→∞} (x_1 + ⋯ + x_n)/n < 1/2};
(c) {x ∈ Ω : lim_{n→∞} (x_1 + ⋯ + x_n)/n exists}.

Exercise 6.2. In chapters 1 and 2 we naively spoke about probabilities of percolation from a vertex to infinity without having checked that these events belong to the σ-algebra generated by cylinders. But it is better late than never: prove that these events really belong to that σ-algebra.

Chapter 7. PCA in general

In this chapter we speak about cellular automata. All of them are linear operators acting on measures belonging to the space M introduced in the previous chapter, so we shall often use the word "operator" instead of the longer phrase "cellular automaton". By "process" we mean a sequence of measures µ, Pµ, P²µ, ... resulting from the iterative application of an operator P to some initial measure µ. A measure µ is called invariant for P if Pµ = µ. We call an operator P : M → M ergodic if it has an invariant measure µ_inv and for any measure µ the limit lim_{t→∞} P^t µ exists and is one and the same for all µ. If P is ergodic, it has only one invariant measure, but the converse is not proved for cellular automata and is not true for operators in general. In this chapter we shall prove that all cellular automata of a large class have at least one invariant measure.

Generally, it is important to study ergodicity and the sets of invariant measures of cellular automata. Ergodic operators correspond to real systems which "forget" their initial conditions; this is what we usually want to achieve when we mix a drink. Non-ergodic operators correspond to real systems which partially remember their initial conditions; this is what we want to achieve when we keep information in computer memory. The central goal of this course is to present some examples of ergodic and some examples of non-ergodic cellular automata.

Let us present a general definition of PCA, a.k.a. Probabilistic Cellular Automata, and prove some general statements about them. As usual, the ground space G is either Z^d or Z^d_m. We denote by M the space of normalized measures on the σ-algebra generated by cylinders in Ω = A^G, where A is the alphabet. We have n neighbor vectors v_1, ..., v_n ∈ Z^d and denote V(p) = {p + v_1, ..., p + v_n}.
If the ground space is finite, then the configuration space is also finite and our PCA is just a finite Markov chain. We define it as follows. For any delta-measure δ(x), where x is any configuration in Ω, the measure P δ(x) is a product measure with factors defined as follows. Let us call the distribution of the i-th component according to this measure the transitional distribution and denote it by θ_i(·|x). In fact, the i-th transitional distribution depends only on the components of x in the neighborhood of i, so we can write it also as θ_i(·|x_{V(i)}), where x_{V(i)} is the restriction of x to V(i). By θ_i(y|x) we denote the value of θ_i(·|x) on y ∈ A, that is, the conditional probability that after the application of the operator P the i-th component will be in the state y if before the application the neighborhood of i was in the state x_{V(i)}. This probability is called a transition probability.

Thus we have defined how our operator acts on delta-measures. If our ground space is finite, this is sufficient, because Ω is finite also and any measure is a finite linear combination of delta-measures. If the ground space is infinite, measures on Ω generally are not finite linear combinations of delta-measures, but as soon as we concentrate on the value of Pµ on a certain thin cylinder with a finite support S ⊂ G, we may restrict µ to V(S). Since S is finite, V(S) is finite also, and this restriction is a linear combination of delta-measures. Thus we can write an explicit formula for the value of Pµ on an arbitrary thin cylinder:

(Pµ)(y_i = b_i, i ∈ S) = ∑_{a_j, j∈V(S)} µ(x_i = a_i, i ∈ V(S)) ∏_{i∈S} θ_i(b_i | a_{V(i)}) (26)

for any finite set S ⊂ G and any b_i, i ∈ S.

Several examples. We call a random operator degenerate if at least one of its transition probabilities equals zero, and non-degenerate if all its transition probabilities are strictly positive. Any deterministic operator can be considered as a very degenerate PCA which transforms any delta-measure into a delta-measure; its transition probabilities are

θ_i(y|x_{V(i)}) = 1 if y = f_i(x_{V(i)}), and 0 otherwise.

Any percolation operator P = R_α D defined in (19) can be represented in the form (26) by taking V(i) = {i + v_1, ..., i + v_n} and

θ_i(1|x) = 0 if x_j = 0 for all j ∈ V(i), and 1 − α otherwise.

Any map from M to M is called an operator. We call a measure µ invariant for an operator P if Pµ = µ.

Theorem 7.1. Any PCA has at least one invariant measure.

Proof. Let us apply our operator P iteratively to an arbitrary initial measure µ. We obtain a sequence of measures µ, Pµ, P²µ, P³µ, .... Let us form another sequence of measures ψ_1, ψ_2, ψ_3, ..., where

ψ_k = (µ + ⋯ + P^{k−1}µ)/k, k = 1, 2, 3, ... (27)

By theorem 6.2 this sequence has a subsequence which converges to some measure φ. Let us prove that φ is invariant for P. Suppose that it is not, which means that there is a thin cylinder C = {y : y_i = b_i for all i ∈ S} on which φ and Pφ have different values:

φ(C) ≠ (Pφ)(C). (28)

Let us denote H = |φ(C) − (Pφ)(C)| > 0. Let us also denote C_a = {x : x_i = a_i for all i ∈ V(S)}, where a ∈ A^{V(S)}. Using these notations, we can rewrite the formula (26) as

(Pµ)(C) = ∑_a µ(C_a) ∏_{i∈S} θ_i(b_i|a). (29)

Since the sequence (27) has a subsequence which converges to φ, we can take k so large that

|ψ_k(C) − φ(C)| < H/3 and |ψ_k(C_a) − φ(C_a)| < H / (3 |A|^{|V(S)|}) (30)

for all a, where |A|^{|V(S)|} is the number of possible values of a.
Then P ψ_k = (Pµ + ⋯ + P^k µ)/k, whence

ψ_k(C) − (P ψ_k)(C) = (µ(C) − (P^k µ)(C))/k,

which is not greater than 2/k in absolute value. This is less than H/3 as soon as we choose k > 6/H. Also let us prove that |(Pφ)(C) − (P ψ_k)(C)| < H/3:

|(Pφ)(C) − (P ψ_k)(C)| = | ∑_a φ(C_a) ∏_{i∈S} θ_i(b_i|a_{V(i)}) − ∑_a ψ_k(C_a) ∏_{i∈S} θ_i(b_i|a_{V(i)}) | = | ∑_a (φ − ψ_k)(C_a) ∏_{i∈S} θ_i(b_i|a_{V(i)}) |. (31)

Since every transition probability does not exceed one,

∏_{i∈S} θ_i(b_i|a_{V(i)}) ≤ 1.

Hence, by (30), the quantity (31) does not exceed H/3. Thus

|φ(C) − (Pφ)(C)| ≤ |φ(C) − ψ_k(C)| + |ψ_k(C) − (P ψ_k)(C)| + |(P ψ_k)(C) − (Pφ)(C)| < H/3 + H/3 + H/3 = H.

This contradiction shows that our assumption (28) was false. Theorem 7.1 is proved.

In fact, we often need a similar but more general theorem, which can be formulated as follows.

Theorem 7.2. Suppose that we have a non-empty convex compact subset C of M on which a cellular automaton P acts. Suppose that there is µ ∈ M such that P^t µ belongs to C for all t. Then P has an invariant measure which belongs to C.

The proof is evident. The following theorem achieves one of the main goals of this book.

Theorem 7.3. If D is a monotonic deterministic operator defined by (37) and both "all zeros" and "all ones" are attractors under it, then R_αβ D has at least two different invariant measures for small enough positive α and β.

Proof. Let C be the set of measures for which the density of zeros does not exceed, say, 1/3. Then, taking the initial measure concentrated in "all ones" and α such that 27α/(1 − 27α) < 1/3, and using theorem 5.3, we conclude that the operator P has an invariant measure in C. The same is true if instead of 1/3 we take any positive number. Theorem 7.3 is proved.

Do we really need to select a subsequence in (27) to prove theorem 7.1? Perhaps the sequence ψ_k itself always converges? The following example shows that this assumption is false.

Example 7.1, which shows that a sequence ψ defined by (27) may have no limit. Let Ω = {0,1}^Z, let P be the deterministic operator which simply shifts every configuration one site to the left, and let µ be concentrated in the configuration a ∈ Ω, where for all i ∈ Z

a_i = 1 if i ≥ 1 and the integer part of log_3 i is odd, and a_i = 0 in all the other cases.

Take C = {x ∈ Ω : x_0 = 1}. Then P^k µ has no limit as k → ∞. An explanation of a similar difficulty for processes with continuous time has been provided by Liggett [23, p. 12 of section 1 of chapter 1].

Exercise 7.1. Show that theorem 7.1 becomes false if A is infinite.

Chapter 8. Ergodicity and dimension

For a long time (since the first studies of the Ising model by Ising, Peierls, Onsager and others in the first half of the XX century) it was a common opinion in statistical physics that phase transitions can occur only in systems whose dimension is greater than one. For example, §152 of Landau and Lifshitz's fundamental monograph [20] was called "The impossibility of the existence of phases in one-dimensional systems", and an argument of a physical nature was presented in support of this impossibility. Another example: "In one dimension bosons do not condense, electrons do not superconduct, ferromagnets do not magnetize, and liquids do not freeze" [22, p. vi]. (Our exercise 1.2 points in this direction also.) However, all these ideas were formed in dealing with models of equilibrium, which have no time parameter. Cellular automata, besides space, have time. Should it be counted as an additional dimension?
There were many arguments about this in Moscow in the 1960s and 1970s. Dobrushin suggested that time should be counted as a dimension. In this sense cellular automata which we call one-dimensional, because their ground space is one-dimensional, are essentially two-dimensional, and as such should be non-ergodic as soon as their parameters satisfy certain inequalities. Piatetski-Shapiro expressed his doubts about this and, to clarify the dispute, undertook a computer simulation of some cellular automata. Three processes were chosen for modelling, and both Dobrushin and Piatetski-Shapiro agreed that these processes might be treated as crucial ones. In all three cases the notion of median, borrowed from statistics, was used to define the deterministic operators; only, instead of "median", they spoke about "voting" [39].

First we define three deterministic operators D_3, D_5, D_NEC. In the first case the ground space is G = Z; in the other cases the ground space is G = Z². In all three cases the configuration space is A^G. Their definitions:

∀ i ∈ Z, x ∈ Ω : (D_3 x)_i = median(x_{i−1}, x_i, x_{i+1});
∀ (i,j) ∈ Z², x ∈ Ω : (D_5 x)_{i,j} = median(x_{i,j+1}, x_{i+1,j}, x_{i,j}, x_{i,j−1}, x_{i−1,j}); (33)
∀ (i,j) ∈ Z², x ∈ Ω : (D_NEC x)_{i,j} = median(x_{i,j+1}, x_{i+1,j}, x_{i,j}). (34)

Now let us recall what we called the random noise R_αβ: it turns any delta-measure δ_a, where a ∈ Ω is a given configuration, into a product measure µ, according to which for any point p ∈ G the component x_p is distributed as follows:

if a_p = 0, then µ(x_p = 1) = α and µ(x_p = 0) = 1 − α;
if a_p = 1, then µ(x_p = 1) = 1 − β and µ(x_p = 0) = β.

Dobrushin expected that with α = β positive but small enough all these processes would be non-ergodic, that is, their limit behavior as t → ∞ would depend on the initial condition. Especially appropriate are two contrasting initial conditions: the measure concentrated in "all zeros" and the measure concentrated in "all ones". Denoting by D any of these operators, a computer was used to simulate the measures (R_αβ D)^t δ_0 and (R_αβ D)^t δ_1 as t grows. Especially important was the case α = β and D = D_3. The results of the modelling and their interpretations were published in [39]. They seemed to show that R_αβ D_3 was ergodic for all α = β strictly between 0 and 1/2. At that time these results seemed quite convincing. However, no rigorous proof of this conjecture has been obtained till now. Gray [11] proved a similar statement for continuous time.

As for the other two operators, their results seemed similar to each other, although it can be proved rigorously that in the biased case β = 0 the operators (33) and (34) behave differently: for (34) both "all zeros" and "all ones" are attractors, while for (33) neither of them is.

After that the positive rates conjecture was proposed by several authors, based on various informal considerations. It claimed that all uniform non-degenerate one-dimensional probabilistic cellular automata with a finite alphabet are ergodic (see, for example, [23, chapter 4, section 3], [36, p. 115], or [11]).

Bennett and Grinstein's computer simulation of NEC. Let us mention here the results of a more recent computer modelling of the composition R_αβ D_NEC. Since this operator has two parameters α and β, instead of a critical point there is a critical curve; see figure 8.1.
(Due to monotonicity, about which we shall speak below, it was sufficient to consider only the two initial conditions "all zeros" and "all ones", and due to the symmetry of D_3 it was sufficient to consider only one of them.) When α = β was close to 1/2, the operator was ergodic, that is, it tended to one and the same regime from all initial conditions; but for α = β small enough the operator was non-ergodic, i.e. it "remembered" the initial condition: if the simulation started from "all zeros", zeros prevailed all the time; if it started from "all ones", ones prevailed all the time.

Figure 8.1. Sketch borrowed from Bennett and Grinstein's article [2]: computer simulation (using CAM, the Cellular Automata Machine [32]) and chaos approximation of the NEC operator with asymmetric noise. The horizontal axis is noise = β + α; the vertical axis is bias = (β − α)/(β + α), where α and β are the parameters of the noise, the probabilities of 0 to become 1 and of 1 to become 0. The left curve is the simulation using CAM; the right curve is the mean-field a.k.a. chaos approximation. The left part of the rectangle is the two-phase region; the right part is the one-phase region.

Regretfully, no analogous simulation was performed for D_5, so we can only guess for which α and β the operator R_αβ D_5 is ergodic. One guess is that it is non-ergodic only for α = β small enough.

Most of the problems considered here pertain to the infinite case. One of them is to decide, for any configuration x and any operator D, whether x is an attractor under D. We have presented some non-degenerate non-ergodic cellular automata for which G = Z², so we may call them two-dimensional. Similar constructions and arguments can be presented for all dimensions greater than one. Although the Stavskaya process shows the possibility of a phase transition in the one-dimensional case, it is not quite satisfactory, since one of its invariant measures is degenerate (concentrated in "all ones"). Is it possible to present a one-dimensional cellular automaton which has two non-degenerate invariant measures? The following processes fall well short of answering this question, but deserve some attention. In both, A = {0, 1} and 0 < 1.

Example 8.1. The GKL (Gács, Kurdyumov, Levin) process [8]. Its operator is a composition of a deterministic operator D_GKL and our standard random noise R_αβ, where D_GKL is defined as follows:

∀ x ∈ Ω : (D_GKL x)_i = median(x_i, x_{i−1}, x_{i−3}) if x_i = 0, and median(x_i, x_{i+1}, x_{i+3}) if x_i = 1.

The deterministic operator D_GKL in example 8.1 is uniform. In this example both "all zeros" and "all ones" are attractors, but it is a unanimous guess that R_αβ D_GKL is ergodic whenever α + β > 0. (See a detailed computer study of D_GKL without and with random noise in [29].)

Example 8.2. The 2-lines process D_2lines [37]. In this case the ground space is special: G = Z × {−1, 1}. Elements of G may be written as pairs (i, j), where i ∈ Z and j equals −1 or 1. The alphabet is {0, 1}. Now we define the deterministic operator D_2lines as follows:

∀ x ∈ Ω, ∀ (i,j) ∈ G : (D_2lines x)_{(i,j)} = min( max(x_{i,j}, x_{i,−j}), x_{i−j,j} ).

The deterministic operator D_2lines in example 8.2 is uniform, but it is the only example in this book where the group of automorphisms is non-commutative. The operator D_2lines is an eroder, but it has been proved [34] that R_α D_2lines is ergodic whenever α > 0.
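The "chaos approximation" drawn in figure 8.1, which exercise 8.1 below asks you to work out for R_αβ D_NEC, is a one-line iteration: pretend that all sites are independent with density p of ones, so that D_NEC maps p to 3p² − 2p³ (the majority of three independent bits) and the noise maps q to (1 − q)α + q(1 − β). A sketch of ours (the independence assumption is the whole approximation):

    # Mean-field iteration for R_alpha_beta D_NEC.
    def chaos_fixed_point(alpha, beta, p, iters=1000):
        for _ in range(iters):
            q = 3 * p**2 - 2 * p**3              # majority of three
            p = (1 - q) * alpha + q * (1 - beta)  # noise
        return round(p, 4)

    for a in (0.05, 0.2):
        print(a, chaos_fixed_point(a, a, 0.0), chaos_fixed_point(a, a, 1.0))

For α = β = 0.05 the two starting points settle at different densities, the two-phase region of figure 8.1; for α = β = 0.2 they both converge to 1/2, the one-phase region.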
Gács' construction. In spite of considerable computer simulations, nothing can substitute for a rigorous argument. The systems which mathematicians consider may be much more general than those which arise from physical considerations, and may contradict physical intuition. Now the positive rates conjecture is refuted: P. Gács proposed a non-ergodic non-degenerate uniform one-dimensional system [9]. Gács' system is an operator acting on Ω = A^Z which is a composition of a deterministic and a random operator, the random operator turning any state into any other state with a small probability. The main property of the system is that errors do not accumulate, so that the density of components in "wrong" states remains small forever. The system is very complicated, and some defects were found in its first version, but Gray [12] has helped to correct all the flaws and a final version of Gács' construction has been published [9]. It takes more than two hundred pages to describe, and it needs a finite but enormous number of elements in the set A of states of a single component and a positive but very small probability of error. I asked Gács to estimate, at least roughly, the parameters of his construction. He was not sure, but suggested 2^100 as a rough estimate of the number of states and one divided by the square of this number as a rough estimate of the probability of error. Although Gács's result is very important theoretically, these numbers make any practical application very unlikely. It would be interesting to find out whether such a large number of states and such a small probability of error are really necessary.

Exercise 8.1. Write a chaos approximation for the operator R_αβ D_NEC and study for which values of the parameters α and β it is ergodic.

Chapter 9. Coupling, Order, Monotonicity

We start with a trivial lemma.

Lemma 9.1. (a) Given a percolation model on a finite graph with vertices A and B, we attribute to the edges e_1, ..., e_n Boolean variables v¹, ..., vⁿ so that each vⁱ equals 1 if the edge e_i is open and 0 if it is closed. We denote by Π the Boolean variable which equals 1 if there is an open path from A to B and 0 otherwise. Then Π is a monotonic Boolean function of (v¹, ..., vⁿ).

(b) Given a percolation model on an infinite graph with a vertex A, we attribute to the edges e_1, e_2, ... Boolean variables v¹, v², ... in the same way. We denote by Π the Boolean variable which equals 1 if there is an open path from A to ∞ and 0 otherwise. Then Π is a monotonic Boolean function of (v¹, v², ...).

The proof of lemma 9.1 is evident and we omit it. Now we go to the main business of this chapter. By a coupling of two or more measures we mean a measure on the product of their spaces whose marginals are the given measures. Any two given measures have a lot of couplings, most of them useless; for example, the product of two measures is one of their couplings (a fairly useless one). The proof of the following theorem is based on a useful coupling.

Theorem 9.1. Suppose that we have a percolation model in which every edge is open with probability ε and closed with probability 1 − ε, independently of all the other edges. Given any two vertices A and B, the probability of percolation from A to B is a monotonic function of ε.

Proof.
We denote by P(ε) the probability of percolation from A to B if every edge is open with probability ε and closed with probability 1 − ε independently of all the other edges. In fact we shall prove that

ε_1 < ε_2  ⟹  P(ε_1) ≤ P(ε_2).

To prove this, we use the idea of coupling. Given ε_1 and ε_2 such that ε_1 < ε_2, we introduce for all i ∈ {1, ..., n} variables wⁱ, independent of each other, each having three possible states "both", "one" and "none", distributed as follows:

wⁱ = "both" with probability ε_1, "one" with probability ε_2 − ε_1, "none" with probability 1 − ε_2. (35)

For all i ∈ {1, ..., n} we define the states of the variables v₁ⁱ and v₂ⁱ as functions of the states of w¹, ..., wⁿ in the following way:

v₁ⁱ = 1 if wⁱ = "both" and 0 if wⁱ ≠ "both";  v₂ⁱ = 1 if wⁱ ≠ "none" and 0 if wⁱ = "none". (36)

Then:

(a) every v₁ⁱ equals 1 with probability ε_1 independently of the others, just like in the percolation with parameter ε_1;
(b) every v₂ⁱ equals 1 with probability ε_2 independently of the others, just like in the percolation with parameter ε_2;
(c) v₁ⁱ ≤ v₂ⁱ a.s. for all i.

Evidently, the function Π of lemma 9.1 is monotonic. Hence and from (c),

Π(v₁¹, ..., v₁ⁿ) ≤ Π(v₂¹, ..., v₂ⁿ).

Hence and from (a) and (b), P(ε_1) ≤ P(ε_2). Theorem 9.1 is proved.

Coupling of processes. By a coupling of two or more processes we mean a process on the product of their spaces whose marginals are the given processes. This time we use a general class of deterministic operators. In all of them A = {0, 1}. To define a deterministic operator D, we take an arbitrary non-empty finite list of vectors v_1, ..., v_n ∈ Z^d and an arbitrary Boolean function f : {0,1}^n → {0,1}. Then D is defined as follows:

(D x)_v = f(x_{v+v_1}, ..., x_{v+v_n}) for all v ∈ Z^d. (37)

Also we use a random operator R_αβ, defined in (4).

Theorem 9.2. Let D be defined by (37) and let

1 − 1/n < α + β ≤ 1. (38)

Then the operator R_αβ D is ergodic.

Proof. In the present case we use a coupling of three processes: two processes generated by our operator R_αβ D with different initial measures, and a percolation process. This coupling is defined by the following pseudo-code, where x(v, t), y(v, t) and m(v, t) are the components of the three marginals at a point v at time t:

     1  for all v ∈ Z^d do simultaneously:
     2      x(v, 0) ← a sample of µ_x^init
     3      y(v, 0) ← a sample of µ_y^init
     4      m(v, 0) ← 1
     5  for t = 1 to ∞:
     6      for all v ∈ Z^d do simultaneously:
     7          x(v, t) ← f(x(v + v_1, t−1), ..., x(v + v_n, t−1))
     8          y(v, t) ← f(y(v + v_1, t−1), ..., y(v + v_n, t−1))
     9          m(v, t) ← max(m(v + v_1, t−1), ..., m(v + v_n, t−1))
    10          r ← rnd
    11          if r < α then
    12              x(v, t) ← 1
    13              y(v, t) ← 1
    14              m(v, t) ← 0
    15          if r > 1 − β then
    16              x(v, t) ← 0
    17              y(v, t) ← 0
    18              m(v, t) ← 0

Let us first ignore all the lines of this pseudo-code which deal with the values m(v, t) and concentrate our attention on those lines which deal with x(v, t) and y(v, t). These lines describe two processes with one and the same operator R_αβ D and arbitrary different initial conditions, set by lines 1-3. These processes run simultaneously, using a common source of random noise. In both processes every component at every time step does the following: first, due to lines 7 and 8, it assumes a value which depends on the states of its neighbors one time step ago; second, due to lines 10-17, it makes a random change, becoming 1 with probability α and becoming 0 with probability β. Let us call a point (v, t) a break if x(v, t) ≠ y(v, t).
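This pseudo-code translates directly into a runnable sketch (ours; d = 1, a torus of N sites, and f = min of two neighbors are arbitrary concrete choices, giving n = 2):

    # Coupled simulation of two copies of R_alpha_beta D with a common
    # source of noise, plus the mark process m; prints the final densities
    # of breaks and of marks.
    import numpy as np

    def couple(alpha, beta, N=200, T=100, seed=0):
        rng = np.random.default_rng(seed)
        f = lambda z: np.minimum(z, np.roll(z, -1))   # neighbor vectors {0, 1}
        x = np.zeros(N, dtype=int)                    # initial condition delta_0
        y = np.ones(N, dtype=int)                     # initial condition delta_1
        m = np.ones(N, dtype=int)
        for _ in range(T):
            x, y = f(x), f(y)
            m = np.maximum(m, np.roll(m, -1))
            r = rng.random(N)
            up, down = r < alpha, r > 1 - beta        # common noise for x, y, m
            x = np.where(up, 1, np.where(down, 0, x))
            y = np.where(up, 1, np.where(down, 0, y))
            m = np.where(up | down, 0, m)
        return float((x != y).mean()), float(m.mean())

    print(couple(0.3, 0.3))   # alpha + beta = 0.6 > 1 - 1/n: breaks die out

With n = 2 and α + β = 0.6, condition (38) holds and both printed densities are 0, matching the bound coef^t discussed next.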
It follows from percolation arguments that the density of breaks at time t does not exceed coef^t, where

coef = n(1 − (α + β)). (39)

Under the condition (38) we have coef < 1, and therefore coef^t tends to zero as t → ∞.

To monitor what happens to breaks, we have the special marks m(v, t), which may equal 0 or 1. We call a point (v, t) marked if m(v, t) = 1 and unmarked otherwise. The interaction is arranged in such a way that m(v, t) = 1 at every break:

∀ v, t : x(v, t) ≠ y(v, t)  ⟹  m(v, t) = 1.

(The converse may be false: m(v, t) may equal 1 at other points also.) Initially all the points are marked (to be on the safe side, because initially all the points may be breaks), and a point becomes unmarked only when lines 12-13 or 16-17 assign equal values to x(v, t) and y(v, t). Notice that the mark process is a percolation process with the death rate α + β.

Now let us prove that under condition (38) our operator P = R_αβ D is ergodic. By theorem 7.1, P has at least one invariant measure µ_inv. Let us take an arbitrary measure µ_0 and prove that P^t µ_0 tends to µ_inv as t → ∞. Remember that by convergence in M we mean convergence on all thin cylinders. Let us take an arbitrary thin cylinder C and prove that P^t µ_0(C) tends to µ_inv(C) as t → ∞. Let us denote µ_t = P^t µ_0 for all t. Remember that any thin cylinder is an intersection of several basic sets (22); in particular, C = C_1 ∩ ⋯ ∩ C_k. Therefore

|µ_t(C) − µ_inv(C)| ≤ ∑_{i=1}^{k} |µ_t(C_i) − µ_inv(C_i)|.

Since every term on the right side tends to zero as t → ∞, the right side tends to zero, whence the left side also tends to zero. Theorem 9.2 is proved.

Order on configurations. Let us assume that the alphabet A is ordered (perhaps partially). For example, if A = {0, 1, ..., m}, it may be ordered in the same way as on the number line:

0 ≺ 1 ≺ ⋯ ≺ m−1 ≺ m.

We shall use the signs ≺ and ≻ and the words precedes and succeeds when speaking about this order. For example, when A = {0, 1}, we typically assume that 0 ≺ 1, which means that zero precedes one, or (which is the same) 1 ≻ 0, which means that one succeeds zero. These relations are a generalization of the relations "less" and "more" among real numbers, but in the present case there may be incomparable elements. For any x, y ∈ A we assume that: (a) x ≺ y means the same as y ≻ x; (b) if x ≺ y and x ≻ y, then x = y.
Notice also that all our definitions are consistent: if we consider a deterministic operator as a degenerate random operator, our definitions of monotonicity coincide.

Lemma 9.2. Let us have two product-measures µ, ν ∈ M, where µ = ∏_i µ_i and ν = ∏_i ν_i. Then µ ≺ ν if and only if µ_i ≺ ν_i for all i.

The proof is easy and we omit it.

Of course, a composition of monotonic operators is monotonic, so to know that a composition of two operators is monotonic, it is sufficient to check the monotonicity of each. How to check monotonicity of an operator?

Lemma 9.3. An operator P defined by (26) is monotonic if and only if all the transition distributions θ_i(·|x) are monotonic functions of x, that is,

   x ≺ y =⇒ θ_i(·|x) ≺ θ_i(·|y).                                        (40)

Proof of lemma 9.3. In one direction: suppose that (40) is false, that is, there are i and y ≺ z such that θ_i(·|y) does not precede θ_i(·|z). Then δ(y) ≺ δ(z) serve as those µ ≺ ν for which Pµ does not precede Pν, because both are product-measures and the i-th factor of Pµ does not precede the i-th factor of Pν; from lemma 9.2 this is sufficient. In the opposite direction it also follows from lemma 9.2.

Lemma 9.4. Given two operators P1 and P2 with one and the same Ω and transition distributions θ_i^1(·|x) and θ_i^2(·|x) respectively. Then P1 ≺ P2 if and only if

   θ_i^1(·|x) ≺ θ_i^2(·|x)                                              (41)

for all i ∈ G and x ∈ Ω.

Proof of lemma 9.4. In one direction: let us assume that θ_i^1(·|y) does not precede θ_i^2(·|y) for some i ∈ G and some y ∈ Ω and prove that P1 does not precede P2, that is, there exists a measure µ such that P1 µ does not precede P2 µ. Let us take µ = δ(y). Then both P1 µ and P2 µ are product-measures, the i-th factors of which violate the condition of lemma 9.2, whence P1 µ does not precede P2 µ. In the opposite direction: now assume (41) and prove that P1 µ ≺ P2 µ for all µ. It is sufficient to prove this for delta-measures, for which it follows from lemma 9.2. Lemma 9.4 is proved.

Lemma 9.5. Let A = {0, 1}. If P is monotonic, then the sequences P^t δ0 and P^t δ1 converge, and P is ergodic if and only if their limits coincide.

Proof. It is easy to prove by induction that

   δ0 ≺ P δ0 ≺ P^2 δ0 ≺ P^3 δ0 ≺ P^4 δ0 ≺ · · ·   and   δ1 ≻ P δ1 ≻ P^2 δ1 ≻ P^3 δ1 ≻ P^4 δ1 ≻ · · · .

Indeed, in each case the first inequality is evident because δ0 precedes and δ1 succeeds any measure, and all the other inequalities follow from this by monotonicity of P. Thus for any upper or lower set C the sequences P^t δ1(C) and P^t δ0(C) are monotonic and therefore each of them has a limit. Now let us take any thin cylinder C and denote

   C̄ = {x | ∃ y ∈ C : y ≺ x}   and   C̄′ = C̄ \ C.

It is easy to show that C̄ and C̄′ are upper sets, so the values of P^t δ0 and P^t δ1 at these sets have limits. Therefore their values at C also have limits, which means that these measures have limits. Lemma 9.5 is proved.

One example of application of lemma 9.5: the sequence P^t δ1 for the Stavskaya operator P has a limit; since from (6)

   P^t δ1 (x0 = 0) ≤ 27α / (1 − 27α) for all t,

the same inequality is true for the limit measure, whence for small values of α the Stavskaya operator has at least two different invariant measures.

Exercise 9.1. Given two measures µ and ν on one and the same sigma-algebra generated by cylinders in Ω. Given µ ≺ ν and ν ≺ µ. May we conclude that µ = ν?

Exercise 9.2. Prove that the operator R_{αβ} defined in (4) is monotonic whenever α + β ≤ 1.
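Lemma 9.5 suggests a simple experiment: iterate a monotone operator from δ0 and from δ1 and compare the two limits. Below is a Monte Carlo sketch on a finite ring, assuming the convention (consistent with the bound above) that the Stavskaya deterministic operator is (Dx)_i = max(x_i, x_{i+1}) and that the noise turns every component into 0 with probability α independently; if chapter 3 states the opposite 0/1 convention, exchange the roles of 0 and 1. All parameter values are illustrative.

import random

N, T, TRIALS = 60, 100, 100

def run(alpha, init):
    x = [init] * N
    for _ in range(T):
        x = [max(x[i], x[(i + 1) % N]) for i in range(N)]      # deterministic step D
        x = [0 if random.random() < alpha else s for s in x]   # noise step
    return x[0]

def estimate(alpha, init):
    # Monte Carlo estimate of P^T delta_init (x_0 = 1)
    return sum(run(alpha, init) for _ in range(TRIALS)) / TRIALS

for alpha in (0.01, 0.4):
    p0, p1 = estimate(alpha, 0), estimate(alpha, 1)
    # by lemma 9.5, the (monotone) operator is ergodic iff the two limits coincide
    print(f"alpha={alpha}: from delta_0 -> {p0:.2f}, from delta_1 -> {p1:.2f}")

For small α the two estimates stay far apart, signalling non-ergodicity and at least two invariant measures, while for large α both tend to the same value.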
Chapter 10

Undecidability of the problem of ergodicity for PCA

The main result of this chapter is the undecidability of the problem of ergodicity of cellular automata.

Let us think about what we really want to achieve in dealing with cellular automata, in particular in the study of their ergodicity. Mathematics is an abstract science and we, mathematicians, want to prove general theorems. In the present case we want to have criteria of ergodicity for as large classes of cellular automata as possible. However, it is known that some general problems in all areas of mathematics cannot be solved in the algorithmic sense. It is only natural that in dealing with cellular automata we face such situations very often, because our object is very general. When we meet an undecidable problem, it means that we are working close to the boundaries of what is possible. This moves us to treat with more respect those partial results which we have obtained: perhaps they are close to what can be done at all.

In this chapter we shall show that the problem of deciding which cellular automata are ergodic and which are not is algorithmically unsolvable for a certain class of them. There are several formalizations of the notion of algorithm. We choose one of them, namely Turing machines, named after their inventor Alan Turing, because they are the most similar to cellular automata. In fact we shall use the following class of Turing machines with one head and one bi-infinite tape.

A Turing machine of this class consists of a head and a tape. The tape is infinite in both directions and consists of cells enumerated by integer numbers. Every cell can be in several states; the set G of states is one and the same for all cells. The head also has a finite set H ∪ {stop} of states, where one state, called stop, plays a special role described below. At every step of the discrete time the head observes one cell of the tape as shown in figure 10.1.

Figure 10.1. A Turing machine. At every step of the discrete time the head observes one cell, exchanges information with it and then moves to another cell.

Also we choose three functions:

   F_tape : G × H → G,   F_head : G × H → H ∪ {stop},   F_move : G × H → {←, →}.

When the machine starts, the tape is "empty", which means that all cells are filled with the initial symbol g1 ∈ G; the head is in the initial state h1 ∈ H and observes the 0-th cell of the tape. At every step the head simultaneously writes into the cell of the tape which it observes a new symbol according to the function F_tape, goes to a new state according to the function F_head, and moves one cell left or one cell right along the tape according to the values ←, → of the function F_move respectively, the arguments of all three functions being the symbol in the presently observed cell of the tape and the present state of the head. The machine stops when and if the head reaches the state stop. (That is why we need not define our functions when the head is in the state stop.)

It is well known that the problem of deciding which of these machines ever stop is algorithmically unsolvable, that is, there is no algorithm capable of predicting, for all Turing machines of this class, which of them ever stop having started with an empty tape. This famous theorem is described in many books. We shall use it to prove another theorem about undecidability, this time about cellular automata.
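Here is a minimal sketch of a machine of this class, assuming a concrete encoding: tape symbols and head states are small integers, the tape is a dict from cell index to symbol (absent cells hold the initial symbol g1), and the three functions are dicts indexed by pairs (symbol, head state). The example machine at the end is purely illustrative.

G1, H1, STOP = 0, 0, "stop"     # initial tape symbol, initial head state, stop

def run(F_tape, F_head, F_move, max_steps):
    # Start from the empty tape; return the number of steps made if the
    # machine stops within max_steps, otherwise None.
    tape, head, pos = {}, H1, 0
    for step in range(max_steps):
        g = tape.get(pos, G1)            # the symbol in the observed cell
        # all three functions take the old pair (symbol, head state)
        tape[pos] = F_tape[(g, head)]
        move = F_move[(g, head)]
        head = F_head[(g, head)]
        pos += 1 if move == "->" else -1
        if head == STOP:
            return step + 1              # the machine stops
    return None                          # no stop within max_steps

# An illustrative machine with G = {0, 1}, H = {0, 1}: it writes 1, steps
# right, writes 1 again and stops.
F_tape = {(0, 0): 1, (1, 0): 1, (0, 1): 1, (1, 1): 1}
F_head = {(0, 0): 1, (1, 0): 1, (0, 1): STOP, (1, 1): STOP}
F_move = {(g, h): "->" for g in (0, 1) for h in (0, 1)}
print(run(F_tape, F_head, F_move, 100))   # prints 2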
In the greater part of mathematics a theoretical result is the better, the more general it is, that is, the larger is the class of objects under consideration. However, when we prove algorithmic unsolvability, it is the other way round: the result is the more valuable, the smaller is the class of objects for which it is proved. For this reason we minimize our class as much as possible: the arbitrary number n of states of a single component is the only source of infiniteness of our class; everything else is minimized: the dimension is 1, the interaction is only between nearest neighbors and all the transition probabilities are either 0 or 1/2 or 1.

Our configuration space is {1, . . . , n}^Z, where n is a natural parameter. A cellular automaton is an operator which is a composition of two operators: first a deterministic one, D, then a random one, R. Our deterministic operator is determined by a function f : {1, . . . , n}^3 → {1, . . . , n} in the following way: it transforms any configuration x into Dx, where

   (Dx)_i = f(x_{i−1}, x_i, x_{i+1}) for all i ∈ Z.                      (42)

Our random operator R is very simple: all components of a configuration which are not in the state n turn into the state 1 with probability 1/2 independently of each other. An operator P is called ergodic if the distribution P^t µ tends to one and the same limit distribution from all initial conditions µ.

Theorem 10.1. There is no algorithm to decide which cellular automata of the class just described are ergodic and which are not [19, 36].

Our method of proof consists in the following: for every Turing machine of the class described above we construct a cellular automaton of the class just described which is ergodic if and only if that Turing machine stops having started on an empty tape. This is sufficient to prove that the problem of deciding which of our cellular automata are ergodic is unsolvable, because if it were solvable, the problem of deciding which Turing machines stop would be solvable also, but it is known that it is not. In more detail: if the ergodicity problem were solvable, then we could take a Turing machine, construct the corresponding cellular automaton, apply to it that hypothetical deciding procedure, decide whether it is ergodic or not, and thereby conclude whether the Turing machine stops or not.

Thus for every Turing machine M we shall construct an operator P which is ergodic if and only if M stops. In fact, P imitates the functioning of M in the following way: under its action, components at any time randomly (with probability 1/2) turn into heads of M in the initial condition. Every head marks its territory with brackets and imitates the functioning of M on its territory. This functioning may be interrupted by other heads, but measures are taken to prevent the heads from collaborating: as soon as a head gets any sign of the presence of another head, it commits suicide. If M never stops, the process remains in this messy regime forever. If M stops, some components go to a special final state, which starts an "epidemic" by turning their neighbors into the same state, so that the process tends to δ(all final), the measure concentrated in the configuration "all components are in the final state". In fact, this measure is invariant in all cases, but the process tends to it from all initial conditions only if M stops.
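Before the detailed construction, here is a minimal sketch of one step of an operator P = RD of this class, assuming a finite ring in place of Z; the particular function f below is an arbitrary illustrative choice, not the one built from a Turing machine.

import random

N, n = 20, 5                 # ring size; the states are {1, ..., n}

def f(a, b, c):              # an arbitrary illustrative local function
    return max(a, b, c)

def step(x):
    # deterministic operator D, formula (42), with periodic boundaries
    x = [f(x[i - 1], x[i], x[(i + 1) % N]) for i in range(N)]
    # random operator R: every component not in the state n turns into
    # the state 1 with probability 1/2, independently of the others
    return [1 if s != n and random.random() < 0.5 else s for s in x]

x = [random.randint(1, n) for _ in range(N)]
for _ in range(5):
    x = step(x)
print(x)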
Let us now construct, for a given Turing machine M of the class described above, the promised operator P. We set

   S = S_left × S_right × S_tape × S_head,

where S_left = S_right = {0, 1}, S_tape = G, S_head = H ∪ {0, stop}. Accordingly, we write a generic element of S as

   x = (left(x), right(x), tape(x), head(x)).

We say that a state x has a left bracket if left(x) = 1 and that it has a right bracket if right(x) = 1. We call x a no-head if head(x) = 0 and a head otherwise. We call x a stop-head if head(x) = stop. The state (0, 0, g1, 0) is called empty, the state (1, 1, g1, h1) is called newborn and the state (0, 0, g1, stop) is called final. For brevity we shall write F_*(x) = F_*(tape(x), head(x)), where * means 'tape', 'head' or 'move'. We say that a head x wants to move left or to move right when F_move(x) equals ← or → respectively.

We need all these states to make our process imitate the functioning of the Turing machine M. The tape component imitates what is written on the tape; the head component imitates what is in the head, or its absence if it is zero. The left and right brackets are necessary to exclude interference of the heads. Our operator P is a composition P = RD, where R turns any state except final into the newborn state with probability 1/2 independently. It remains to define the deterministic operator D, that is, to define the function f(·) in formula (42). Its definition consists of several rules.

Rule 0. If x or y or z is a stop-head, then f(x, y, z) = final.

Formulating all the other rules, we assume that none of x, y, z is a stop-head. We call a triple (x, y, z) ∈ S^3 normal if at most one of x, y, z is a head.

Rule 1. Whenever the triple (x, y, z) is not normal, f(x, y, z) = empty.

In all the following rules we assume that the triple (x, y, z) is normal.

Rule 2. If all of x, y, z are no-heads, then f(x, y, z) = y.

All the subsequent rules form three groups depending on which of the three arguments is a head: the center, denoted y, or its left neighbor, denoted x, or its right neighbor, denoted z.

The center-rules:

Rule 1-center. If y is a head which wants to move left, then f(x, y, z) = (0, right(y), F_tape(y), 0).

Rule 2-center. If y is a head which wants to move right, then f(x, y, z) = (left(y), 0, F_tape(y), 0).

Since the left rules and the right rules are symmetric, we shall omit the right ones.

The left-rules:

Rule 1-left. If x is a head which wants to move right and has a right bracket, then f(x, y, z) = (0, 1, g1, F_head(x)).

Rule 2-left. If x is a head which wants to move right and has no right bracket, and y has no left bracket, then f(x, y, z) = (0, right(y), tape(y), F_head(x)).

Rule 3-left. If x is a head which wants to move right and has no right bracket, but y has a left bracket, then f(x, y, z) = y.

Rule 4-left. If x is a head which wants to move left, then f(x, y, z) = y.

The right-rules are analogous to the left-rules, only with right and left permuted. Our operator D is defined. To make the operator R satisfy the promised condition, it is sufficient to choose n equal to the cardinality of S and to enumerate the states of S so that the newborn state gets number 1 and the final state gets number n.
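A direct transcription of these rules into code may help to check them. Below is a sketch, assuming that states are tuples (left, right, tape, head) with head = 0 for a no-head and head = "stop" for a stop-head, and that F_TAPE, F_HEAD, F_MOVE are the transition tables of M indexed by pairs (tape symbol, head state); the right-rules are written out as the promised mirror images. All names are ours.

G1, H1 = 0, 1            # initial tape symbol and initial head state (assumed encoding)
EMPTY   = (0, 0, G1, 0)
NEWBORN = (1, 1, G1, H1)
FINAL   = (0, 0, G1, "stop")

def is_head(s):      return s[3] != 0
def is_stop_head(s): return s[3] == "stop"

def f(x, y, z, F_TAPE, F_HEAD, F_MOVE):
    if any(is_stop_head(s) for s in (x, y, z)):       # Rule 0
        return FINAL
    heads = sum(is_head(s) for s in (x, y, z))
    if heads > 1:                                     # Rule 1: the triple is not normal
        return EMPTY
    if heads == 0:                                    # Rule 2
        return y
    if is_head(y):                                    # center-rules
        if F_MOVE[(y[2], y[3])] == "<-":              # Rule 1-center
            return (0, y[1], F_TAPE[(y[2], y[3])], 0)
        return (y[0], 0, F_TAPE[(y[2], y[3])], 0)     # Rule 2-center
    if is_head(x):                                    # left-rules
        if F_MOVE[(x[2], x[3])] == "->":
            if x[1] == 1:                             # Rule 1-left
                return (0, 1, G1, F_HEAD[(x[2], x[3])])
            if y[0] == 0:                             # Rule 2-left
                return (0, y[1], y[2], F_HEAD[(x[2], x[3])])
            return y                                  # Rule 3-left
        return y                                      # Rule 4-left
    # right-rules: mirror images of the left-rules, z is the head
    if F_MOVE[(z[2], z[3])] == "<-":
        if z[0] == 1:                                 # Rule 1-right
            return (1, 0, G1, F_HEAD[(z[2], z[3])])
        if y[1] == 0:                                 # Rule 2-right
            return (y[0], 0, y[2], F_HEAD[(z[2], z[3])])
        return y                                      # Rule 3-right
    return y                                          # Rule 4-right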
Lemma 10.1. The operator P = RD is ergodic if and only if the Turing machine M stops.

Proof of lemma 10.1. Due to rule 0, the measure δ(all final) concentrated in the configuration "all components are in the final state" is invariant for P. Therefore P is ergodic if and only if our process tends to δ(all final) from any initial configuration. Now let us argue in two directions.

One direction: let us suppose that M stops after T steps and prove that our process tends to δ(all final) from any initial configuration. Let us consider a region [s0 − 2T, s0 + 2T] ⊂ Z, where s0 is any integer number. If a stop-head is present there, it turns into final, which expands in both directions due to rule 0. If there is no stop-head there, then the following scenario has a positive probability: first, at some time t0 births occur at all sites in the range [s0 − 2T, s0 + 2T]. At the next time step all of these sites become empty. At the next time step a birth occurs at the middle site s0 and this is the only birth that occurs in the space-time region

   {(s, t) | s0 − 2T + (t − t0) ≤ s ≤ s0 + 2T − (t − t0), 0 < t − t0 ≤ 2T}.

Under these conditions we are dealing with configurations imitating the functioning of M during a time long enough for M to stop. As soon as the head stops, it turns into final, which expands in both directions due to rule 0. This scenario has a positive probability; therefore it happens somewhere almost surely, whence our process tends to δ(all final).

The other direction: let us assume that M never stops, that is, continues to function forever having started at the empty tape. Let us take the initial measure concentrated in the configuration "all components are in the empty state" and prove that the resulting distributions cannot contain a stop-head with a positive probability and therefore cannot tend to δ(all final). This would be evident if every head functioned alone, never interacting with other heads. Let us show that in our process every head either functions as if it were alone or disappears.

In our construction every head creates its own "territory", marked by a left bracket at its left end and by a right bracket at its right end. This territory consists of the sites which this head has visited. However, this territory may be invaded by another head, which changes the states of the cells which it visits, and our head must recognize when this happens. Every time a head wants to move beyond its territory, it carries the bracket one step further, perhaps invading another head's territory. In this case, due to rule 1-left, it changes the symbol on the tape to the initial one, so its functioning does not differ from the functioning of a solitary head on a tape which was empty at the beginning.

The crucial question is what happens when a head returns to a place which was its territory but was invaded by another head. If our head did not notice that the site had been invaded and used a symbol written there by another head, it might eventually stop, although it would not stop if it functioned alone. We must avoid this. Let us examine the situation in more detail. Since right and left are symmetric, it is sufficient to examine what happens if some head wants to move right. If it has a right bracket, it means that it is expanding its territory; in this case it can do so due to rule 1-left, and in doing so it will erase the former tape symbol and write the initial symbol, as if it were alone on the tape. If it has no right bracket and its right neighbor has no left bracket, it means that it is moving within its own territory and goes to a place which has never been invaded, so the symbol in the right neighbor was written by this head itself (see rule 2-left). However, if it has no right bracket but its right neighbor has a left bracket, it means that another head has visited this site. Having discovered that it is not unique, our head gets so angry that it commits suicide, that is, turns into a no-head.
In more detail: on one hand, due to rule 2-center, our head is no longer where it was; on the other hand, this head does not emerge in its right neighbor cell due to rule 3-left, and the cell which it wanted to invade remains intact. All this assures that every head either moves within its own territory, never visited by other heads, expanding it and imitating the functioning of the original Turing machine with one head, or disappears. Therefore the probability that a stop-head will ever emerge is zero and the process does not tend to δ(all final). Thus lemma 10.1 is proved, whence theorem 10.1 immediately follows.

Exercise 10.1. Let us consider the class of operators R_{αβ}D on {0, 1}^G, where G = Z^d and D is any operator defined by formula (37) with only one restriction: the number of neighbors is n = 1. Present an algorithm which decides, for all operators of this class, which of them are ergodic and which are not.

Exercise 10.2. Let us consider the class of monotonic deterministic operators D : Ω → Ω defined by (37), where Ω = {0, 1}^G and G = Z^d. Present an algorithm to decide which of these operators are ergodic. (Since every deterministic operator can be interpreted as a random operator, we can apply to them the notion of ergodicity.)

Exercise 10.3. Prove that the problem of deciding which cellular automata have only one invariant measure is algorithmically unsolvable. This statement is similar to theorem 10.1, but is not identical with it and needs to be proved separately, because we do not yet know whether uniqueness of an invariant measure implies ergodicity.

Main terms and notations

Z - the set of integer numbers.
Z^d - the d-dimensional integer space.
R - the set of real numbers.
R^d - the d-dimensional real space.
Percolation model - a graph whose edges may be open or closed.
Path in a graph - a finite or infinite sequence "vertex-edge-vertex-edge-...", where every edge connects those vertices between which it is placed in this sequence.
Contour - a path in which the first and last vertices coincide.
G - ground space, a finite or countable set, a discrete analog of physical space.
A - a non-empty finite set called the alphabet, the set of states of any component.
Letter - any element of A.
A^G - the configuration space; its elements are called configurations.
Configuration space - the product-space Ω = A^G.
Configuration - an element x ∈ Ω of the configuration space, determined by its components x_v for all v ∈ G.
Thin cylinder - a subset of Ω of the form {x ∈ Ω : x_{i1} = a_{i1}, . . . , x_{in} = a_{in}}. The support of this thin cylinder is the set {i1, . . . , in}.
Normalized measure - a measure µ on Ω, defined by its values on thin cylinders; "normalized" means µ(Ω) = 1.
M - the set of normalized measures on Ω.
Delta-measure δ(x) - the measure concentrated on one configuration x.
Product-measure - a measure on a product-space in which all the marginals are independent.
Pµ - the result of application of an operator P to a measure µ.
Transition distribution θ_i(·|x) - the distribution of the i-th component according to the measure Pδ(x).
Transition probability θ_i(y|x) - the probability that the i-th component equals y according to the measure Pδ(x).
Degenerate measure - a measure on the sigma-algebra generated by thin cylinders which equals zero at at least one thin cylinder.
Degenerate cellular automaton - a cellular automaton at least one transition probability of which is zero.
Composition PQ of two operators P and Q - an operator whose action consists of the action first of Q, then of P.
Uniform measure - a measure which is invariant under space shifts.
Uniform operator - an operator which commutes with space shifts.
Invariant measure - a measure µ ∈ M is called invariant for an operator P if Pµ = µ.
Ergodicity - an operator P : M → M is called ergodic if the limit lim_{t→∞} P^t µ exists and is one and the same for all µ ∈ M.
Monotonic deterministic operator - a deterministic operator D is monotonic if x ≺ y implies Dx ≺ Dy.
Monotonic random operator - a random operator P : M → M is monotonic if µ ≺ ν implies Pµ ≺ Pν.
Coupling of two or more measures - a measure on a product-space whose marginals are the given measures.
Coupling of two or more processes - a process on a product-space whose marginals are the given processes.
Finite deviation - given two configurations x, y ∈ Ω, we call x and y finite deviations of each other if the set {g ∈ G : x_g ≠ y_g} is finite.
Attractor - a configuration x ∈ Ω is called an attractor of a deterministic operator D if x is invariant for D and for any y which is a finite deviation of x there is a natural t such that D^t y = x.
Shift of a set S in a linear space by a vector v, denoted S + v - the set {i + v | i ∈ S}.
Vector sum of two sets in a linear space - S1 + S2 = {i + j | i ∈ S1, j ∈ S2}.
Convex set - a set in a linear space which with any two points a, b contains the segment [a, b].
Convex hull of a set S in a linear space - the intersection of all convex sets containing S.
Turing machine - an abstract "machine" proposed by Alan Turing as a formalization of the notion of algorithm.

References

[1] Robert B. Ash. Measure, Integration, and Functional Analysis. Academic Press, New York and London (1972).
[2] C. Bennett and G. Grinstein. Role of Irreversibility in Stabilizing Complex and Nonergodic Behavior in Locally Interacting Discrete Systems. Phys. Rev. Letters, v. 55 (1985), n. 7, pp. 657-660.
[3] P. Berman and J. Simon. Investigations of Fault-Tolerant Networks of Computers. ACM Symp. on Theory of Computing, v. 20 (1988), pp. 66-77.
[4] M. Bramson and L. Gray. A Useful Renormalization Argument. Festschrift for F. Spitzer. Birkhäuser, Boston, MA.
[5] S. R. Broadbent and J. M. Hammersley. Percolation processes I. Crystals and mazes. Proceedings of the Cambridge Philosophical Society, v. 53 (1957), pp. 629-641.
[6] R. Durrett. Probability: Theory and Examples. 4th edition (2010).
[7] R. E. Edwards. Functional Analysis: Theory and Applications. Dover Publications, Inc., N.Y. (1995).
[8] P. Gacs, G. Kurdyumov and L. Levin. One-dimensional homogeneous media, which erode finite islands. Problems of Information Transmission, v. 14 (1978), pp. 223-226. (Translated from Russian.)
[9] P. Gács. Reliable cellular automata with self-organization. Journal of Stat. Physics, v. 103 (2001), n. 1/2, pp. 45-267.
[10] G. Galperin. Homogeneous local monotone operators with memory. Doklady of Soviet Acad. of Sciences, v. 228 (1976), pp. 277-280. (Translated from Russian.)
[11] L. Gray. The Positive Rates Problem for Attractive Nearest Neighbor Spin Systems on Z. Z. Wahrscheinlichkeitstheorie verw. Gebiete, v. 61 (1982), pp. 389-404.
[12] L. Gray. A reader's guide to P. Gács's "positive rates" paper: "Reliable cellular automata with self-organization" [J. Statist. Phys. 103 (2001), no. 1-2, 45-267]. Journal of Stat. Physics, v. 103 (2001), n. 1-2, pp. 1-44.
[13] G. Grimmett. Percolation. Springer (1999).
[14] O. Häggström. Computability of Percolation Thresholds. In and Out of Equilibrium 2 (2007), Progress in Probability, v. 60 (2008), pp. 321-329, ed. by V. Sidoravicius and M. Vares.
[15] F. C. Hennie. Iterative Arrays of Logical Circuits. MIT Press Classics (1961).
[16] H. Kesten. Percolation Theory for Mathematicians. Birkhäuser, Boston (1982).
[17] A. N. Kolmogorov and S. V. Fomin. Measure, Lebesgue Integrals, and Hilbert Space. Academic Press, New York and London (1961). (Translated from Russian.)
[18] A. N. Kolmogorov and S. V. Fomin. Elements of the Theory of Functions and Functional Analysis. Volume 2. Measure. The Lebesgue Integral. Hilbert Space. (Translated from Russian.)
[19] G. I. Kurdumov. An algorithm-theoretic Method for the Study of Uniform Random Networks. Multicomponent Random Systems, ed. by R. L. Dobrushin and Ya. G. Sinai. Advances in Probability and Related Topics, v. 6 (1980), pp. 471-503. Contributing editor D. Griffeath, series editor P. Ney. Marcel Dekker, Inc., New York and Basel. (Translated from Russian.)
[20] L. D. Landau and E. M. Lifshitz. Statistical Physics. (Vol. 5 of Course of Theoretical Physics.) 2nd edition. Pergamon Press (1969).
[21] J. L. Lebowitz, C. Maes and E. R. Speer. Statistical mechanics of probabilistic cellular automata. Journal of Stat. Physics, v. 59 (1990), n. 1-2, pp. 117-168.
[22] Mathematical Physics in One Dimension. Exactly Soluble Models of Interacting Particles. A Collection of Reprints with Introductory Text by Elliott H. Lieb and Daniel C. Mattis. Academic Press, N.Y. (1966).
[23] Thomas M. Liggett. Interacting Particle Systems. Springer-Verlag, N.Y. (1985 and 2005).
[24] M. L. Menezes and A. Toom. A non-linear eroder in presence of one-sided noise. Brazilian Journal of Probability and Statistics, v. 20 (2006), n. 1, pp. 1-12.
[25] J. von Neumann and A. W. Burks. Theory of Self-Reproducing Automata. University of Illinois Press, Urbana (1966).
[26] P. Lima and A. Toom. Dualities Useful in Bond Percolation. Cubo, v. 10 (2008), n. 3, pp. 93-102.
[27] N. Petri. Unsolvability of the recognition problem for annihilating iterative networks. Selecta Mathematica Sovietica, v. 6 (1987), pp. 354-363. (Translated from Russian.)
[28] R. T. Rockafellar. Convex Analysis. Princeton University Press (1970).
[29] P. G. de Sá and C. Maes. The Gacs-Kurdyumov-Levin Automaton Revisited. Journal of Stat. Physics, v. 67 (1992), n. 3/4, pp. 507-522.
[30] Luis de Santana. Velocities à la Galperin in Continuous Spaces. Doctoral thesis defended on July 26, 2012 at CCEN/UFPE, Brazil.
[31] O. Stavskaya and I. Piatetski-Shapiro. On homogeneous nets of spontaneously active elements. Systems Theory Res., v. 20 (1971), pp. 75-88. (Translated from Russian.)
[32] T. Toffoli and N. Margolus. Programmable matter, concepts and realization. Physica D, v. 47 (1991), pp. 263-272.
[33] A. Toom and L. Mityushin. Two Results regarding Non-Computability for Univariate Cellular Automata. Problems of Information Transmission, v. 12 (1976), n. 2, pp. 135-140. (Translated from Russian.)
[34] A. Toom. Unstable Multicomponent Systems. Problems of Information Transmission, v. 12 (1976), n. 3, pp. 220-224. (Translated from Russian.)
[35] A. Toom. Stable and attractive trajectories in multicomponent systems. Multicomponent Random Systems, ed. by R. Dobrushin and Ya. Sinai. Advances in Probability and Related Topics, v. 6 (1980), pp. 549-575. Contributing editor D. Griffeath, series editor P. Ney. Marcel Dekker, Inc., New York and Basel. (Translated from Russian.)
[36] A. Toom, N. Vasilyev, O. Stavskaya, L. Mityushin, G. Kurdyumov and S. Pirogov. Discrete Local Markov Systems. Stochastic Cellular Systems: Ergodicity, Memory, Morphogenesis. Ed. by R. Dobrushin, V. Kryukov and A. Toom. Nonlinear Science: Theory and Applications, Manchester University Press (1990), pp. 1-182. (Translated from Russian.)
[37] A. Toom. Cellular Automata with Errors: Problems for Students of Probability. Topics in Contemporary Probability and its Applications, ed. by J. L. Snell. Series Probability and Stochastics, ed. by R. Durrett and M. Pinsky. CRC Press (1995), pp. 117-157.
[38] A. Toom. Every continuous operator has an invariant measure. Journal of Stat. Physics, v. 129 (2007), n. 3, pp. 555-566.
[39] N. Vasilyev, M. Petrovskaya and I. Piatetski-Shapiro. Modelling of voting with random errors. Automation and Remote Control, v. 10 (1970), pp. 1639-1642. (Translated from Russian.)