5
STATISTICAL BEHAVIOUR
In this chapter I introduce some notions concerning the statistical behaviour of a
dynamical system, including in particular a characterization of systems that exhibit
chaos. I will illustrate the concepts of ergodicity and mixing, discussing the
best-known examples.
The classical definition of a dynamical system may be summarized as follows.1
One considers a differentiable manifold M and assumes the following:
(i) there is a measure µ on M defined by a positive and smooth density;
(ii) there is a flow on M , i.e., a one parameter group φt of diffeomorphisms;
(iii) the flow φt preserves the measure µ, i.e., for any measurable subset A ⊂ M one
has µ(φt A) = µ(A).
The definition above is strongly related to the origin of the concept of ergodicity (or,
more generically, of chaotic and statistical behaviour) in the mechanical problem of
describing large systems of particles, beginning with the molecules in a gas, and then
extending the same concepts to other physical situations such as, e.g., the dynamics
of atoms in a crystal lattice or the electromagnetic field in a cavity.
However, these concepts were later developed as characterizing a wide class
of dynamical systems, thus introducing the so-called abstract dynamical systems.
We may roughly distinguish between flows (the dynamics depends on a continuous
time) and maps (or systems with discrete time). The flow is typically generated by a
system of differential equations. Hamiltonian systems represent a special but fundamental class. A map is defined as a discrete group generated by a diffeomorphism Φ,
as has been discussed in the previous chapter. In the latter case it is usual to denote
$\varphi^t(x) = \Phi^t(x)$ with t ∈ Z. In the following I will use the notation $\varphi^t$ for the iterates of
the map and $\varphi$ for $\varphi^1 = \Phi$.
1 The reader will remark that there is a nontrivial difference with respect to the problems
treated in the previous chapters: the condition that the dynamics preserves the measure is
added. This selects a particular class of dynamical systems that is of major interest in
physics.
A more general and more abstract definition is often used, by asking M to be a
measurable space and weakening the hypothesis that the map φ is a diffeomorphism:
as we shall see below, in some cases the map is not even required to be continuous. In
particular, the map need not be invertible, in which case one considers only positive
iterates of the map. However, the measure preserving property is maintained, i.e., one
has $\mu(\varphi^{-1}(A)) = \mu(A)$ for every measurable A ⊂ M. Moreover, it is also assumed
that µ(M) = 1 (the measure is said to be normalized).2 An abstract dynamical system
is often denoted as a triple (M, µ, φ).
5.1 Ergodicity
The first characterization of a system exhibiting a statistical behaviour is related to
the property of ergodicity. In order to introduce this notion we need some definitions.
I give the definitions for the case of a discrete map; the corresponding definitions
for a flow are easily obtained by adapting them.
5.1.1 Time average and phase average
Consider an orbit of a dynamical system, namely the subset Ω(x) ⊂ M defined as
either
$$\Omega(x) = \bigcup_{t\in\mathbb{Z}} \varphi^t x \quad\text{or}\quad \Omega(x) = \bigcup_{t\in\mathbb{Z}_+} \varphi^t x$$
according to whether the map is invertible or not.
Definition 5.1: For x ∈ M the time average of a function f : M → R is defined as
$$\bar f(x) = \lim_{N\to\infty} \frac{1}{N} \sum_{t=0}^{N} f(\varphi^t(x)) ,$$
provided the limit exists.
Definition 5.2: For a measurable function f : M → R the phase average is defined
as
$$\langle f\rangle = \int_M f\, d\mu .$$
5.1.2 The ergodic theorem of Birkhoff
The following theorem, due to Birkhoff, states that the time average is a well defined
quantity for almost every initial point. The next corollary states the independence of
the time average from the initial point of the orbit.
2 The relevant property is that M has finite measure, i.e., µ(M) < ∞. Normalization then
is a trivial matter.
Proposition 5.3: Let (M, µ, φ) be a discrete dynamical system, and let f be a
measurable (and µ-integrable) function. Then the time average
$$\bar f(x) = \lim_{N\to\infty} \frac{1}{N} \sum_{t=0}^{N} f(\varphi^t(x))$$
exists almost everywhere.
Corollary 5.4: Under the same hypotheses as in proposition 5.3 one has
$$\bar f(\varphi^t x) = \bar f(x) ,$$
whenever the average exists, for any t ∈ Z or t ∈ Z+, as appropriate.
Proof. *** proof to be added ***
Q.E.D.
5.1.3 Sojourn time and invariant functions
Definition 5.5: For an orbit $\{x, \varphi(x), \varphi^2(x), \ldots\}$ the average sojourn time in a
measurable set A ⊂ M is defined as
$$\tau_A(x) = \lim_{N\to\infty} \frac{1}{N}\, \#\{t : 0 \le t \le N \,\wedge\, \varphi^t(x) \in A\} .$$
Equivalently:
$$\tau_A(x) = \lim_{N\to\infty} \frac{1}{N} \sum_{t=0}^{N} \chi_A(\varphi^t(x)) = \bar\chi_A(x) ,$$
where $\chi_A$ is the characteristic function of A, i.e., $\chi_A(x) = 1$ for x ∈ A, otherwise
$\chi_A(x) = 0$.
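The definition of the sojourn time translates directly into a finite-N estimate. The following is only a minimal sketch: the function name `sojourn_time` and the toy map (a cyclic permutation of three points, for which every point spends one third of the time in each state) are assumptions made for illustration, not part of the text.

```python
def sojourn_time(phi, x0, in_A, n_steps):
    """Finite-N estimate of tau_A(x0): number of visits to A for
    0 <= t <= n_steps, divided by n_steps (as in definition 5.5)."""
    x, visits = x0, 0
    for _ in range(n_steps + 1):
        if in_A(x):          # chi_A evaluated at the current point
            visits += 1
        x = phi(x)           # move one step along the orbit
    return visits / n_steps


# Toy example (hypothetical): cyclic permutation of the set {0, 1, 2}.
def cycle(x):
    return (x + 1) % 3


estimate = sojourn_time(cycle, 0, lambda x: x == 0, 3000)
print(estimate)  # close to 1/3
```

For a finite N the estimate differs from the limit by terms of order 1/N, so it should be read as an approximation of $\tau_A$, not as its exact value.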
Definition 5.6:
An invariant function for φ is a function f : M → R such that
$f(\varphi(x)) = f(x)$ for all x ∈ M.
This clearly extends to a map the concept of first integral for the flow of a system of
differential equations. An example of invariant function is easily constructed as follows.
Take any measurable function g(x) and define $f(x) = \bar g(x)$, the time average of g. In view of corollary 5.4
the function f(x) has the same value at every point of the orbit $\{\varphi^t(x)\}$; thus it is
invariant.
Definition 5.7: A subset A ⊂ M is invariant in case one has $\varphi^{-1}(A) = A$.
5.1.4 Equivalent definitions of ergodicity
The following definition characterizes an ergodic system.
Definition 5.8: A system (M, µ, φ) is said to be ergodic if the time average of any
measurable function coincides with the phase average almost everywhere (i.e., except
for a subset of points of zero measure). That is:
$$\bar f = \langle f\rangle \quad \text{a.e. in } M .$$
Proposition 5.9: Let the measure µ be normalized, i.e., µ(M ) = 1. The following
properties are equivalent definitions of ergodicity:
(i) for every measurable subset A ⊂ M one has $\tau_A = \mu(A)$ a.e. in M;
(ii) one has metric indecomposability: if $\varphi^{-1}(A) = A$ then we have either µ(A) = 0
or µ(A) = 1;
(iii) if f is a measurable invariant function then f = const a.e. in M.
Proof. Let us denote the definition as property (0). I will proceed cyclically by
proving the following implications:
(0) ⇒ (i) ⇒ (ii) ⇒ (iii) ⇒ (0) .
a. The implication (0) ⇒ (i) is straightforward: just apply (0) to the characteristic
function χA of a measurable subset A.
b. For the implication (i) ⇒ (ii) proceed by contradiction. Let the measurable subset
A ⊂ M be invariant with 0 < µ(A) < 1. Then the sojourn time in A of an orbit $\{\varphi^t(x)\}$
is either 0 or 1 according to whether x ∉ A or x ∈ A, respectively. This contradicts (i), because
the sojourn time should be $\tau_A = \bar\chi_A = \mu(A) \neq 0, 1$ almost everywhere.
c. For the implication (ii) ⇒ (iii) proceed again by contradiction. Let f(x) be invariant
and not constant a.e.; then the measurable subset $A = \{x \in M : f(x) > \langle f\rangle\}$ satisfies
0 < µ(A) < 1. This subset is invariant, because f takes the same value at all points
of the same orbit, so that x ∈ A if and only if φ(x) ∈ A. Thus A is an invariant
subset with 0 < µ(A) < 1, which contradicts (ii).
d. The last implication, (iii) ⇒ (0), is proven again by contradiction. Let g(x) be such
that $\bar g(x) \neq \langle g\rangle$ for x in some measurable subset A ⊂ M with µ(A) > 0. Define
$f(x) = \bar g(x)$. By corollary 5.4 f(x) is an invariant function, and, trivially, $\bar f(x) = \bar g(x)$.
Moreover $\langle f\rangle = \langle g\rangle$, because the time average of g has the same phase average as g
(this is also part of Birkhoff's theorem), so that $f(x) \neq \langle f\rangle$ on A. Thus we have found
an invariant function which is not constant a.e., which contradicts (iii).
Q.E.D.
5.1.5 The circle map
Let $M = \mathbb{T}^1$ be a circle endowed with the coordinate ϑ ∈ R/(2πZ), and let µ be the
Lebesgue measure on the circle, i.e., the length of the smallest arc joining two points
ϑ0, ϑ1. The circle map is defined as a rotation by a given angle ω ∈ R, namely
(5.1) $\vartheta \mapsto \vartheta + \omega \pmod{2\pi}$ .
The map is clearly invertible and measure preserving, since the rotation does not
change the length of any arc.
Lemma 5.10: An orbit of the circle map is periodic if and only if $\frac{\omega}{2\pi}$ is a rational
number. In that case all orbits are periodic. If $\frac{\omega}{2\pi}$ is an irrational number then every
orbit is dense on the circle.
Proof. The orbit is periodic if and only if the equality $\vartheta + r\omega = \vartheta \pmod{2\pi}$ holds true
for some integer r ≠ 0, i.e., if rω = 2sπ for a pair of integers s, r. In that case the equality
holds true for every ϑ, and one has $\frac{\omega}{2\pi} = \frac{s}{r}$, which is a rational number. If $\frac{\omega}{2\pi}$
is irrational then all points $\{\vartheta_s = \vartheta_0 + s\omega \bmod 2\pi\}_{s\in\mathbb{Z}}$ of the orbit with arbitrary
initial point ϑ0 are distinct. Since the circle is compact, the orbit has an accumulation
point, i.e., for arbitrary δ > 0 there are integers j, k such that $0 < \tilde\delta = |\vartheta_j - \vartheta_k| < \delta$.
Setting s = k − j, by translational invariance the relation $\tilde\delta = |\vartheta_j - \vartheta_{j+s}| < \delta$ holds
true for every j. Therefore consecutive points of the subsequence
$\vartheta_0, \vartheta_s, \vartheta_{2s}, \vartheta_{3s}, \ldots$ are spaced by $\tilde\delta$, so that the subsequence has a point that falls
inside any interval of length δ on the circle. Since δ is arbitrary, the orbit is dense.
Q.E.D.
Proposition 5.11: The circle map (5.1) is ergodic with respect to the Lebesgue
measure if and only if $\frac{\omega}{2\pi}$ is an irrational number.
Proof. If $\frac{\omega}{2\pi}$ is a rational number, then every orbit has a finite number of points.
Let N be such number, and denote by $\vartheta_0, \ldots, \vartheta_{N-1}$ the distinct points of the orbit.
Take a positive $\delta < \frac{1}{2N}$. Let $U_\delta(\vartheta) = (\vartheta - \delta/2, \vartheta + \delta/2)$ be the open interval of width
δ centered in ϑ, which is measurable for the Lebesgue measure on the circle (normalized
so that the whole circle has measure 1). Consider the measurable set
$A = \bigcup_{j=0}^{N-1} U_\delta(\vartheta_j)$. Then one has µ(A) ≤ Nδ < 1/2. On the
other hand A is clearly invariant, because the rotation maps every interval $U_\delta(\vartheta_j)$
onto the interval centered at the next point of the orbit. Thus there is an
invariant subset A with positive measure µ(A) < 1, and the system is not ergodic in
view of property (ii) of proposition 5.9.
Let now $\frac{\omega}{2\pi}$ be irrational. We prove that every measurable and invariant subset A ⊂ M
satisfies property (ii) of proposition 5.9. If µ(A) = 0 there is nothing to prove (consider,
e.g., a single orbit starting at a given point), so we may assume µ(A) > 0 and prove
that in such a case it must be µ(A) = 1. Let us say that a point ϑ ∈ A is a concentration
point for the measure of A in case for every positive ε (however small) there exists a δ
such that one has $\mu(A \cap U_\delta(\vartheta)) > \delta(1 - \varepsilon)$. If µ(A) > 0 then
A possesses at least one concentration point (this follows from Lebesgue's density
theorem). Let ϑ0 ∈ A be a concentration point,
and let ε > 0 and δ > 0 be the corresponding parameters. Let moreover $\{\vartheta_s\}_{s\in\mathbb{Z}}$ be
the orbit with initial point ϑ0. Recall that the orbit is a subset of A, because A is
assumed to be invariant. Since the orbit is dense, the sequence of
intervals $U_\delta(\vartheta_s) = \varphi^s U_\delta(\vartheta_0)$ covers the circle, and for all of them we have
$$\mu(A \cap U_\delta(\vartheta_s)) > \delta(1 - \varepsilon) ,$$
since the measure is preserved and A is invariant. Adding up the contributions of all intervals we have
$$\mu\Bigl(\bigcup_{s\in\mathbb{Z}} A \cap U_\delta(\vartheta_s)\Bigr) > 1 - \varepsilon ,$$
so that µ(A) = 1 because ε is arbitrary. Thus, every invariant subset of positive measure
must have full measure, so that the system is ergodic in view of property (ii) of
proposition 5.9. *** proof to be improved ***
Q.E.D.
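Proposition 5.11 can be explored numerically by comparing the sojourn time in an arc with the normalized measure of the arc. The code below is only a sketch under stated assumptions: the names `arc_sojourn`, `golden`, the arc [0, 1) and the initial points are choices made here for illustration, the sum runs over 0 ≤ t < N, and floating-point numbers are of course rational, so the "irrational" case is only a finite-time illustration.

```python
import math

TWO_PI = 2.0 * math.pi


def arc_sojourn(omega, theta0, arc_len, n_steps):
    """Fraction of the points theta0 + t*omega (mod 2*pi), 0 <= t < n_steps,
    that fall inside the arc [0, arc_len)."""
    visits = 0
    for t in range(n_steps):
        theta = (theta0 + t * omega) % TWO_PI
        if theta < arc_len:
            visits += 1
    return visits / n_steps


arc_len = 1.0
mu_arc = arc_len / TWO_PI            # normalized Lebesgue measure of the arc

# omega / (2*pi) equal to the golden mean minus one (irrational).
golden = (math.sqrt(5.0) - 1.0) / 2.0
tau_irr = arc_sojourn(TWO_PI * golden, 0.1, arc_len, 100_000)

# omega / (2*pi) = 1/4 (rational): the orbit has only four points,
# and the sojourn time depends on the initial point.
tau_rat_a = arc_sojourn(math.pi / 2.0, 0.1, arc_len, 100_000)
tau_rat_b = arc_sojourn(math.pi / 2.0, 1.2, arc_len, 100_000)

print(mu_arc, tau_irr, tau_rat_a, tau_rat_b)
```

In the irrational case the estimate is close to µ(A) whatever the initial point, while in the rational case two different initial points give two different sojourn times, neither of which equals µ(A).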
5.1.6 The translation on the torus
A generalization of the circle map is the translation on the torus Tn = Rn /(2π Z)n
(we might call it a rotation). The map is defined as
(5.2) $\vartheta \mapsto \vartheta + \omega \pmod{2\pi}$ , $\omega \in \mathbb{R}^n$ .
In order to investigate the ergodicity properties of the map we must generalize
the concept of irrationality to a generic vector ω ∈ Rn .
Figure 5.1. The doubling map of the circle, φ(x) = 2x (mod 1), with an interval I and its preimage $\Phi^{-1}(I)$.
Definition 5.12: The vector $\omega \in \mathbb{R}^n$ is said to be irrational in case3
(5.3) $\langle k, \omega\rangle \neq 0 \pmod{2\pi}$ for $0 \neq k \in \mathbb{Z}^n$ .
That is: the plane ⟨k, ω⟩ = 0 orthogonal to ω intersects the lattice $2\pi\mathbb{Z}^n$ (considered as
a subset of $\mathbb{R}^n$) only at the origin. For n = 1 this is equivalent to saying that $\frac{\omega}{2\pi}$ is an irrational
number.
Example 5.1: Irrational rotation on a torus. For n = 2 the irrationality condition
means only that the line with slope $\frac{\omega}{2\pi}$ intersects the lattice $\mathbb{Z}^2$ (considered as a subset
of the plane $\mathbb{R}^2$) only at the origin.
Proposition 5.13: The translation map (5.2) of the torus is ergodic with respect
to the Lebesgue measure if and only if ω is irrational.
The proof may be worked out by suitably adapting the argument for the circle
map in the previous section. Thus I leave it to the reader.
*** comment on the rational case (non ergodic; give a non-constant invariant
function) ***
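When ω is not irrational, i.e., ⟨k, ω⟩ = 0 mod 2π for some nonzero integer vector k, the function f(ϑ) = cos⟨k, ϑ⟩ is invariant under the translation and is not constant, so that the map is not ergodic in view of property (iii) of proposition 5.9. The sketch below checks this numerically for n = 2; the specific vector ω, the integer vector k and the helper names are assumptions chosen only for illustration.

```python
import math

TWO_PI = 2.0 * math.pi


def translate(theta, omega):
    """One step of the translation on the torus, componentwise mod 2*pi."""
    return tuple((th + om) % TWO_PI for th, om in zip(theta, omega))


def f(theta, k):
    """Candidate invariant function cos(<k, theta>)."""
    return math.cos(sum(kj * thj for kj, thj in zip(k, theta)))


# Hypothetical example: omega_2 = 2 * omega_1, so that <k, omega> = 0 for k = (2, -1).
omega1 = TWO_PI * math.sqrt(2.0)
omega = (omega1, 2.0 * omega1)
k = (2, -1)

theta = (0.3, 1.7)
values = []
for _ in range(50):
    values.append(f(theta, k))
    theta = translate(theta, omega)

# f is constant along the orbit (up to rounding) ...
spread = max(values) - min(values)
# ... but it is not a constant function on the torus.
difference = abs(f((0.0, 0.0), k) - f((0.3, 1.7), k))
print(spread, difference)
```

The value of f along the orbit changes only by rounding errors, while f takes different values at different points of the torus.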
The translation on the torus $\mathbb{T}^n$ is the direct counterpart of the Kronecker flow
on the torus $\mathbb{T}^{n+1}$, namely the flow generated by the dynamical system $\dot\vartheta = \omega$. The
correspondence is easily established by making the Poincaré section of the continuous
time map $\varphi^t(\vartheta) = \vartheta + \omega t \pmod{2\pi}$ with the plane4 $\vartheta_{n+1} = 0$. This is equivalent to
considering the discrete map $\varphi^\tau$ with $\tau = \frac{2\pi}{\omega_{n+1}}$.
3 I use the notation $\langle k, \omega\rangle = \sum_j k_j \omega_j$.
4 It is actually equivalent to considering the successive intersections of the line ϑ + ωt with
the family of planes $\vartheta_{n+1} = 2j\pi$ with j ∈ Z and reducing all intersections to the cube
$[0, 2\pi)^n$ representing the torus $\mathbb{T}^n$.
5.1.7 The doubling map on the circle
Consider the map of the circle T = R/Z
$$\varphi : \mathbb{T} \to \mathbb{T} , \qquad x \mapsto 2x \pmod 1 .$$
If we consider the usual topology and the Lebesgue measure on T then the map is
both continuous and measure preserving. It has x = 0 as the unique fixed point.
Exercise 5.1: Try to calculate the iterates of the doubling map of the circle on
a digital computer. The result will likely be that after some iterates (e.g., a few
tens, the number depending on the initial point) the sequence falls on the fixed point
x = 0. If this happens, try the following further experiments.
(i) Iterate the map on a programmable desk calculator (the difficult part might be
to find one and to learn how to program it).
(ii) Iterate the similar map x → 3x mod 1; this will likely result in a very long
sequence (possibly endless, apparently non periodic) of non zero values, except
for some very particular initial points.
Explain these two facts.
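The first experiment of the exercise may be set up as in the following minimal sketch, which assumes that the computer uses binary floating-point numbers (as Python floats do); the function name `first_zero` and the initial points are choices made here for illustration. The sketch only reproduces the observation; the explanation is left to the reader as requested.

```python
def first_zero(x0, max_steps=200):
    """Iterate x -> 2x mod 1 in floating point and return the first t
    such that the t-th iterate is exactly 0.0 (None if it never happens
    within max_steps iterations)."""
    x = x0
    for t in range(max_steps + 1):
        if x == 0.0:
            return t
        x = (2.0 * x) % 1.0
    return None


steps_for_tenth = first_zero(0.1)
steps_for_third = first_zero(1.0 / 3.0)
print(steps_for_tenth, steps_for_third)
```

Both initial points, which are rational but not dyadic in exact arithmetic, reach the fixed point after a few tens of iterations.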
Exercise 5.2: Change the computer code for the iteration of the doubling map of
the circle so that it produces an infinite, either periodic or non periodic sequence of
non zero iterates.
Let us say that an orbit is definitely periodic (i.e., eventually periodic) in case it
becomes a periodic orbit after a transient. This may well happen, because the map is
not invertible (produce an example).
Exercise 5.3: Prove the following properties for the doubling map of the circle.
(i) If the initial point x is a rational number then the orbit is either periodic or
definitely periodic.
(ii) If the initial point may be written as a dyadic fraction (i.e., a fraction of the
form $j/2^k$, with a power of 2 as denominator) then the orbit falls on the
fixed point x = 0 in at most k iterates.
(iii) If the initial point x is an irrational number then the orbit is not periodic.
(iv) The map possesses a dense orbit (non periodic, of course). Produce an example.
(v) There exists a non periodic orbit which is not dense. Produce an example.
(vi) The map is ergodic.
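Properties (i) and (ii) can be observed by iterating the map in exact rational arithmetic, which avoids the rounding effects of floating-point numbers. This is one possible sketch, based on Python's `fractions.Fraction`; the function name `doubling_orbit` and the chosen initial points are assumptions made for illustration, and the code is an experiment, not a proof.

```python
from fractions import Fraction


def doubling_orbit(x0, max_steps=1000):
    """Follow the exact orbit of x -> 2x mod 1 until a point repeats.
    Returns (transient, period), or None if no repetition is found."""
    seen = {}                      # point -> time of first visit
    x = x0
    for t in range(max_steps):
        if x in seen:
            return seen[x], t - seen[x]
        seen[x] = t
        x = (2 * x) % 1
    return None


result_tenth = doubling_orbit(Fraction(1, 10))    # definitely periodic
result_dyadic = doubling_orbit(Fraction(3, 8))    # falls on the fixed point 0
result_seventh = doubling_orbit(Fraction(1, 7))   # periodic from the start
print(result_tenth, result_dyadic, result_seventh)
# prints (1, 4) (3, 1) (0, 3)
```

The point 1/10 enters the cycle 1/5, 2/5, 4/5, 3/5 after one step; the dyadic point 3/8 reaches x = 0 in three steps; the point 1/7 is periodic with period 3.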
The underlying mechanism that produces the variety of orbits illustrated by the
exercise is the stretching of every interval in the segment.
5.2 Mixing
Mixing is a stronger property than ergodicity. In rough terms one can imagine the
action of stirring two incompressible liquids (e.g., milk and coffee) in a glass until they
are completely mixed together.
5.2.1 Equivalent definitions of mixing
Definition 5.14: A discrete and invertible dynamical system (M, µ, φ) is said to be
mixing in case for any two measurable subsets A, B ⊂ M one has
(5.4) $\lim_{t\to\infty} \mu(\varphi^{-t}A \cap B) = \mu(A) \cdot \mu(B)$ .
The following proposition provides an equivalent definition of mixing.
Proposition 5.15: A discrete and invertible dynamical system (M, µ, φ) is mixing
if and only if for any two measurable (square-integrable) real functions f, g on M one has
(5.5) $\lim_{t\to\infty} \int_M (f \circ \varphi^t)\, g \, d\mu = \int_M f \, d\mu \cdot \int_M g \, d\mu$ .
In shorter notation we may rewrite (5.5) as
(5.6) $\lim_{t\to\infty} \langle (f \circ \varphi^t)\, g\rangle = \langle f\rangle\langle g\rangle$ .
The quantity $\langle (f \circ \varphi^t)\, g\rangle$ is named correlation between the functions f and g. The
property essentially characterizes the loss of information during the evolution of the
system, because any connection between the initial time and the current time is lost.
Proof of proposition 5.15. Let $f = \chi_A$ and $g = \chi_B$, the characteristic functions
of two measurable subsets A and B. Remark that one has
$\chi_A \circ \varphi^t(x) = \chi_A(\varphi^t(x)) = \chi_{\varphi^{-t}A}(x)$. Thus (5.6) writes
$$\lim_{t\to\infty} \langle (\chi_A \circ \varphi^t)\, \chi_B\rangle = \lim_{t\to\infty} \langle \chi_{\varphi^{-t}A}\, \chi_B\rangle = \langle\chi_A\rangle\langle\chi_B\rangle .$$
On the other hand for the characteristic functions we have
$$\chi_{\varphi^{-t}A}\, \chi_B = \chi_{\varphi^{-t}A \cap B} ,$$
so that the relation above reads
$$\lim_{t\to\infty} \mu(\varphi^{-t}A \cap B) = \mu(A)\mu(B) ,$$
which is the definition 5.14 of mixing. This proves that the relations (5.6) and (5.4)
are equivalent for characteristic functions, and in particular that property (5.6) for all
measurable functions implies (5.4).
We must now prove that (5.4) implies (5.6) for all measurable functions. To this end
we first prove that the claim is true for sums of characteristic functions. Let us consider
any two partitions $\{A_j\}$ and $\{B_k\}$ of M into disjoint measurable subsets, and let the
functions f and g be written as
$$f = \sum_j f_j \chi_{A_j} , \qquad g = \sum_k g_k \chi_{B_k}$$
with real coefficients $f_j$ and $g_k$. Then, using that the equivalence of the two conditions
applies to characteristic functions, calculate
$$\langle (f \circ \varphi^t)\, g\rangle = \sum_{j,k} f_j g_k \langle \chi_{\varphi^{-t}A_j} \chi_{B_k}\rangle = \sum_{j,k} f_j g_k\, \mu(\varphi^{-t}A_j \cap B_k) .$$
Letting t → ∞ we get
$$\lim_{t\to\infty} \langle (f \circ \varphi^t)\, g\rangle = \sum_{j,k} f_j g_k\, \mu(A_j)\mu(B_k) = \sum_j f_j \langle\chi_{A_j}\rangle \sum_k g_k \langle\chi_{B_k}\rangle = \langle f\rangle\langle g\rangle .$$
Thus the claim is true for sums of characteristic functions.
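Let now f, g be measurable functions. We may write $f = \tilde f + f'$ and $g = \tilde g + g'$ with
$\tilde f, \tilde g$ sums of characteristic functions and $\|f'\| < \varepsilon$, $\|g'\| < \varepsilon$ (use the $L^2$ norm). Then
one easily finds
$$\bigl|\langle (f \circ \varphi^t)\, g\rangle - \langle (\tilde f \circ \varphi^t)\, \tilde g\rangle\bigr| < C\varepsilon , \qquad \bigl|\langle f\rangle\langle g\rangle - \langle\tilde f\rangle\langle\tilde g\rangle\bigr| < C\varepsilon$$
with some constant C and uniformly in t. The claim holds true for $\tilde f, \tilde g$, and since ε
is arbitrary it is true for f, g.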
Q.E.D.
The relation between ergodic and mixing systems is clarified by the following
Proposition 5.16:
A mixing system is also ergodic.
Proof. Let A be invariant, so that $\varphi^{-t}A = A$. Let B = A, so that $\mu(\varphi^{-t}A \cap A) = \mu(A)$.
Then apply definition 5.14 and get
$$\mu(A) = \lim_{t\to\infty} \mu(\varphi^{-t}A \cap B) = \mu(A)^2 .$$
This implies either µ(A) = 0 or µ(A) = 1. Thus the system is ergodic in view of
property (ii) of proposition 5.9.
Q.E.D.
The converse of proposition 5.16 is false. An elementary example is the circle map
illustrated in sect. 5.1.5: it is immediate to see that every interval is just translated,
while keeping its length. Some expansion/contraction mechanism is necessary in order
for the dynamics to spread a subset over the whole manifold, as required by mixing.
Such a mechanism is illustrated by the example of the next section in a model that
generalizes the doubling map of the circle to a torus $\mathbb{T}^2$.
5.2.2 The baker transformation
We consider the square [0, 1) × [0, 1) and the mapping φ defined as $\varphi(x, y) = (x_1, y_1)$
with
(5.7) $x_1 = 2x - \lfloor 2x\rfloor , \qquad y_1 = \frac{1}{2}\bigl(y + \lfloor 2x\rfloor\bigr)$ .
The mapping is not continuous, so we forget the topological aspect and concentrate
on the measure. The Lebesgue measure is clearly preserved.
Let us see how subsets of the square are modified by the map. For instance, let us
follow for a few steps the evolution of the rectangle [1/2, 1) × [0, 1). This is illustrated
in figure 5.3. The area is dispersed at every step into thinner and thinner horizontal
strips that tend to densely fill the square. Similarly, the map φ−1 generates a sequence
of thinner and thinner vertical strips.
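The spreading of a subset can be checked on the quantity $\mu(\varphi^{-t}A \cap B)$ that enters the definition of mixing. The following is a minimal sketch under stated assumptions: A is the lower half {y < 1/2}, B is the left half {x < 1/2}, and the Lebesgue measure is replaced by a uniform grid of n × n points with dyadic coordinates, which is exact for these particular sets only as long as t does not exceed log2(n). The names `baker` and `overlap` are chosen here for illustration.

```python
import math


def baker(x, y):
    """One step of the baker transformation (5.7)."""
    k = math.floor(2.0 * x)
    return 2.0 * x - k, 0.5 * (y + k)


def overlap(t, n=64):
    """Fraction of grid points p in B = {x < 1/2} such that the t-th image
    of p lies in A = {y < 1/2}; a grid estimate of mu(phi^{-t} A ∩ B),
    meaningful here only for t <= log2(n)."""
    hits = 0
    for i in range(n):
        for j in range(n):
            x0 = (i + 0.5) / n        # dyadic grid point, exact in binary
            y0 = (j + 0.5) / n
            x, y = x0, y0
            for _ in range(t):
                x, y = baker(x, y)
            if x0 < 0.5 and y < 0.5:
                hits += 1
    return hits / (n * n)


for t in range(1, 6):
    print(t, overlap(t))
# t = 1 gives 0.5, while t = 2, ..., 5 give 0.25 = mu(A) * mu(B)
```

For these two sets the product µ(A)µ(B) = 1/4 is reached already at t = 2; for generic sets the limit in (5.4) is approached only asymptotically.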
Figure 5.2. Representing the baker transformation. The square is uniformly
stretched by a factor 2 in the horizontal direction and shrunk by a factor 1/2 in
the vertical direction, thus transforming into a rectangle. Then the right part of
the rectangle is cut and superimposed to the left part, thus reconstructing the
square.
The reader may try to figure out the fate of any open subset, e.g., a small circle,
under the map. By applying only the stretching/shrinking operation (i.e., delaying
the operation of cutting and superimposing the rectangles) the square is transformed
into a thinner and thinner rectangle with an exponentially increasing width and an
exponentially decreasing height. Thus a small circle is stretched into a very long ellipse
with the same area as the initial circle. By cutting the rectangle and superimposing
all slices the ellipse is transformed into narrower and narrower strips that fill uniformly
the square. Thus the shrinking/stretching mechanism causes the points of the ellipse
to spread over the square, while keeping constant the total area.
5.3 Isomorphism between dynamical systems
As usual in mathematics, introducing the concept of isomorphism is a helpful device.
In the context of dynamical systems we may consider different settings. I restrict here
the definition to the case of measure preserving mappings.
Definition 5.17: The dynamical systems (M, µ, Φ) and (N, ν, Ψ) are said to be
isomorphic in case there is a one-to-one measure preserving function h : M → N such
that the following diagram is commutative, i.e., if Ψ ◦ h = h ◦ Φ ,
$$\begin{array}{ccc}
M & \underset{h^{-1}}{\overset{h}{\rightleftarrows}} & N \\
{\scriptstyle\Phi}\downarrow & & \downarrow{\scriptstyle\Psi} \\
M & \underset{h^{-1}}{\overset{h}{\rightleftarrows}} & N
\end{array}$$
Figure 5.3. Representing the forward iterations of the baker transformation.
The backward iterations are also represented by following the path in reverse
direction.
As a matter of fact, the definition appears somewhat too strong and not fully appropriate,
because most statements include the condition “almost everywhere”. In fact we can
introduce a slightly weaker (but essentially equivalent) definition by asking the
isomorphism h to be defined almost everywhere, i.e., except for sets of zero measure. We
shall need this weaker form later.
5.4 Symbolic dynamics
Let $\mathcal{A}$ denote a finite set of symbols, that we call an alphabet. We define S as the set
of doubly infinite sequences
$$s = \{s_j\}_{j\in\mathbb{Z}} = \{\ldots, s_{-2}, s_{-1}, s_0, s_1, s_2, \ldots\} , \qquad s_j \in \mathcal{A} .$$
Similarly, we define $S^+$ as the set of one-sided sequences
$$s = \{s_j\}_{j\in\mathbb{Z}_+} = \{s_0, s_1, s_2, \ldots\} , \qquad s_j \in \mathcal{A} .$$
Such a set may be equipped with a topology, a metric and/or a measure.
5.4.1 Topology
For any $s^* \in S$ we define a neighbourhood basis via the family of sets
$$U_k(s^*) = \{s \in S : s_j = s^*_j \ \text{for } |j| \le k\} .$$
With a minor adaptation, a similar topology is introduced in $S^+$ via the basis
$$U_k(s^*) = \{s \in S^+ : s_j = s^*_j \ \text{for } 0 \le j \le k\} .$$
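5.4.2 Metric
Considering two sequences s, t ∈ S and with a constant D > 1 the distance may be
defined as5
(5.8) $\mathrm{dist}(s, t) = \sum_{j\in\mathbb{Z}} \frac{\delta(s_j, t_j)}{D^{|j|}} , \qquad \delta(x, y) = \begin{cases} 1 & \text{if } x \neq y , \\ 0 & \text{if } x = y . \end{cases}$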
Another possibility is to identify the alphabet with the set {0, . . . , N − 1} and to
define
(5.9) $\mathrm{dist}(s, t) = \sum_{j\in\mathbb{Z}} \frac{|s_j - t_j|}{D^{|j|}}$ .
For the space S + of one sided sequences the same definition may be used by just
restricting the sum to j ∈ Z+ .
Let us investigate the relations with the topology of sect. 5.4.1 in S. For $s \in U_k(s^*)$
the distance (5.8) is calculated as
$$\mathrm{dist}(s, s^*) = \sum_{|j|>k} \frac{\delta(s_j, s^*_j)}{D^{|j|}} \le \sum_{|j|>k} \frac{1}{D^{|j|}} = \frac{2}{(D-1)D^k} ,$$
the upper limit occurring when $s_j \neq s^*_j$ for all |j| > k. If we further assume that
$s \notin U_{k+1}(s^*)$ then at least one of $s_{-j} \neq s^*_{-j}$ or $s_j \neq s^*_j$ holds true for j = k + 1, and so we get the
relations
$$\frac{1}{D^{k+1}} \le \mathrm{dist}(s, s^*) \le \frac{2}{(D-1)D^k} .$$
It is now an easy matter to conclude with
5 In some texts the constant D is chosen to be $D = N = \#\mathcal{A}$, the cardinality of $\mathcal{A}$.
Lemma 5.18: Let D > 3, and let $\mathrm{dist}(s, s^*) = \varepsilon$ for some ε > 0. If
$\frac{1}{D^{k+1}} \le \varepsilon < \frac{1}{D^k}$ then $s \in U_k(s^*)$ and $s \notin U_{k+1}(s^*)$.
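The bounds on the distance (5.8) can be checked numerically on explicit sequences. This is a minimal sketch in which a two-sided sequence is represented by a function from the integers to the symbols and the infinite sum is truncated at |j| ≤ `cutoff`; the truncation, the names `dist`, `s_star`, `far`, `near` and the values D = 4, k = 2 are all assumptions made for illustration.

```python
def dist(s, t, D, cutoff=60):
    """Distance (5.8) between two-sided sequences given as functions
    j -> symbol, with the sum truncated to |j| <= cutoff."""
    return sum((s(j) != t(j)) / D ** abs(j) for j in range(-cutoff, cutoff + 1))


D, k = 4, 2


def s_star(j):
    return 0                        # reference sequence: all symbols equal to 0


def far(j):
    return 1 if abs(j) > k else 0   # agrees for |j| <= k, differs everywhere else


def near(j):
    return 1 if j == k + 1 else 0   # differs from s_star only at j = k + 1


upper = 2 / ((D - 1) * D ** k)      # upper bound for sequences in U_k(s_star)
lower = 1 / D ** (k + 1)            # lower bound for sequences not in U_{k+1}(s_star)

print(dist(s_star, far, D), upper)
print(dist(s_star, near, D), lower)
```

The sequence `far` attains the upper bound (up to the truncation error) and the sequence `near` attains the lower bound; both distances lie in the interval $[1/D^{k+1}, 1/D^k)$ considered in lemma 5.18.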
Exercise 5.4: Formulate the corresponding lemma for the distance (5.9) in S and
for both distances (5.8) and (5.9) in S + .
5.4.3 Measure
We consider the σ-algebra generated by the subsets
$$C_j^\beta = \{s \in S : s_j = \beta\} , \qquad \beta \in \mathcal{A} .$$
These subsets will be referred to as cylinders.6 Let $j_1, \ldots, j_k$ be distinct integers and
$C_{j_1}^{\beta_1}, \ldots, C_{j_k}^{\beta_k}$ be cylinders, with $\beta_1, \ldots, \beta_k$ arbitrary symbols from the alphabet $\mathcal{A}$.
We extend the definition of cylinders, also introducing a natural notation, to elements
of the σ-algebra constructed as intersections of cylinders, namely
$$C_{j_1,\ldots,j_k}^{\beta_1,\ldots,\beta_k} = C_{j_1}^{\beta_1} \cap \ldots \cap C_{j_k}^{\beta_k} .$$
We introduce the product measure µ as follows: choose $N = \#\mathcal{A}$ positive numbers
(or weights)7 $p_0, \ldots, p_{N-1}$, such that $p_0 + \cdots + p_{N-1} = 1$, and make a one-to-one
correspondence of $\{p_0, \ldots, p_{N-1}\}$ with $\mathcal{A}$. E.g., denote the symbols as $\alpha_0, \ldots, \alpha_{N-1}$
and assign $p_j$ to the symbol $\alpha_j$, or just denote the weights by $p_{\alpha_0}, \ldots, p_{\alpha_{N-1}}$. Using
the alphabet $\mathcal{A} = \{0, \ldots, N-1\}$ is allowed, of course: this is just what we do when
representing numbers in a given base. E.g., in decimal notation, we use the digits
{0, . . . , 9} as alphabet. The measure of $C_j^\beta$ is defined as
$$\mu(C_j^\beta) = p_\beta .$$
The measure of the intersections is defined via the product measure, i.e.,
$$\mu\bigl(C_{j_1,\ldots,j_k}^{\beta_1,\ldots,\beta_k}\bigr) = p_{\beta_1} \cdot \ldots \cdot p_{\beta_k} .$$
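The product measure of a cylinder is straightforward to compute once the constraints are stored as pairs (index, symbol). The sketch below is only illustrative: the three-symbol alphabet, the weights and the function name `cylinder_measure` are assumptions and do not come from the text.

```python
from math import prod


def cylinder_measure(weights, constraints):
    """Measure of the cylinder {s : s_j = beta for every (j, beta) in constraints},
    i.e. the product of the weights of the prescribed symbols."""
    return prod(weights[beta] for beta in constraints.values())


# Hypothetical alphabet {a, b, c} with weights summing to 1.
weights = {"a": 0.5, "b": 0.3, "c": 0.2}

m_single = cylinder_measure(weights, {0: "a"})            # mu(C_0^a) = p_a
m_double = cylinder_measure(weights, {-1: "b", 3: "c"})   # p_b * p_c

# The cylinders with prescribed symbols at j = 0 and j = 1 are disjoint
# and cover the whole space, so their measures add up to 1.
total = sum(cylinder_measure(weights, {0: x, 1: y}) for x in weights for y in weights)
print(m_single, m_double, total)
```

Note that the measure depends only on the prescribed symbols, not on the positions at which they are prescribed.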
Lemma 5.19: The following properties hold true.
(i) For β ≠ γ and for any j one has $C_j^\beta \cap C_j^\gamma = \emptyset$, and so also
(5.10) $\mu(C_j^\beta \cup C_j^\gamma) = \mu(C_j^\beta) + \mu(C_j^\gamma) = p_\beta + p_\gamma$ .
6 Some authors call them rectangles. It is just a matter of taste.
7 Some of the weights $p_j$ may well be set to zero. In such a case one introduces a singular
measure concentrated on some particular subsets. I simplify the discussion by omitting
these cases.
(ii) The complement of the cylinder $C_j^\beta$ is the union of disjoint cylinders; precisely
(5.11) $S \setminus C_j^\beta = \bigcup_{\beta' \in \mathcal{A}\setminus\{\beta\}} C_j^{\beta'}$ , with $C_j^{\beta'} \cap C_j^{\beta''} = \emptyset$ for β′ ≠ β′′ .
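(iii) Property (i) generalizes as follows. Let $\{j_1, \ldots, j_n\} \cap \{k_1, \ldots, k_n\} \neq \emptyset$, and
let $\beta_1, \ldots, \beta_n$ and $\gamma_1, \ldots, \gamma_n$ be the corresponding symbols. If there is a pair
$j_l = k_m$ such that $\beta_l \neq \gamma_m$ then $C_{j_1,\ldots,j_n}^{\beta_1,\ldots,\beta_n} \cap C_{k_1,\ldots,k_n}^{\gamma_1,\ldots,\gamma_n} = \emptyset$, and so also
(5.12) $\mu\bigl(C_{j_1,\ldots,j_n}^{\beta_1,\ldots,\beta_n} \cup C_{k_1,\ldots,k_n}^{\gamma_1,\ldots,\gamma_n}\bigr) = \mu\bigl(C_{j_1,\ldots,j_n}^{\beta_1,\ldots,\beta_n}\bigr) + \mu\bigl(C_{k_1,\ldots,k_n}^{\gamma_1,\ldots,\gamma_n}\bigr) = p_{\beta_1} \cdot \ldots \cdot p_{\beta_n} + p_{\gamma_1} \cdot \ldots \cdot p_{\gamma_n}$ .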
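(iv) Property (ii) generalizes as follows. The complement of the cylinder $C_{j_1,\ldots,j_n}^{\beta_1,\ldots,\beta_n}$
is the union of disjoint cylinders
(5.13) $S \setminus C_{j_1,\ldots,j_n}^{\beta_1,\ldots,\beta_n} = \bigcup_{\beta'_1,\ldots,\beta'_n} C_{j_1,\ldots,j_n}^{\beta'_1,\ldots,\beta'_n}$ ,
the union being made over all n-tuples $(\beta'_1, \ldots, \beta'_n) \in \mathcal{A}^n$ different from $(\beta_1, \ldots, \beta_n)$.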
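(v) Let $\{A_1, \ldots, A_n\}$ and $\{B_1, \ldots, B_m\}$ be two sets of cylinders such that the
cylinders of each set are pairwise disjoint, i.e., $A_j \cap A_k = \emptyset$ and $B_j \cap B_k = \emptyset$ for every
allowed pair j ≠ k, and such that the indices defining the cylinders $A_j$ are all different
from the indices defining the cylinders $B_k$. Then
(5.14) $\mu\bigl((A_1 \cup \ldots \cup A_n) \cap (B_1 \cup \ldots \cup B_m)\bigr) = \mu(A_1 \cup \ldots \cup A_n)\, \mu(B_1 \cup \ldots \cup B_m)$ .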
Proof. (i) Just use the definition of cylinder: a sequence with $s_j = \gamma$ cannot belong
to $C_j^\beta$ because this means $s_j = \beta \neq \gamma$.
(ii) One clearly has $S = \bigcup_{\beta' \in \mathcal{A}} C_j^{\beta'}$, the cylinders $C_j^{\beta'}$ being disjoint by (i). Just
subtract $C_j^\beta$.
(iii) Similar to the proof of (i), recalling that $C_{j_1,\ldots,j_n}^{\beta_1,\ldots,\beta_n} = C_{j_1}^{\beta_1} \cap \ldots \cap C_{j_n}^{\beta_n}$.
(iv) Similar to the proof of (ii). Use $S = \bigcup_{\beta'_1,\ldots,\beta'_n} C_{j_1,\ldots,j_n}^{\beta'_1,\ldots,\beta'_n}$ with $\beta'_1, \ldots, \beta'_n$ taking all
possible values; then subtract $C_{j_1,\ldots,j_n}^{\beta_1,\ldots,\beta_n}$. The cylinders in the union are disjoint in view
of property (iii).
(v) Recall that we can write $(A_1 \cup \ldots \cup A_n) \cap (B_1 \cup \ldots \cup B_m) = \bigcup_{j,k} A_j \cap B_k$, where
$\{A_j \cap B_k\}_{j=1,\ldots,n,\ k=1,\ldots,m}$ is a set of pairwise disjoint cylinders. Thus
$$\mu\bigl((A_1 \cup \ldots \cup A_n) \cap (B_1 \cup \ldots \cup B_m)\bigr) = \sum_{j,k} \mu(A_j \cap B_k) = \sum_j \mu(A_j) \times \sum_k \mu(B_k) = \mu(A_1 \cup \ldots \cup A_n)\, \mu(B_1 \cup \ldots \cup B_m) ,$$
as claimed.
Q.E.D.
Example 5.2: Sets of zero measure. Let $A_0 \subset S^+$ be the set of sequences which do not
contain the symbol $\alpha_0$. Then we have $\mu(A_0) = 0$. *** proof to be added ***
A further property may be stated by referring to the base subsets of the topology.
Precisely, we must consider the special set of cylinders that correspond to the base sets
of the topology. To this end I shall introduce the notation C(n, β) for the cylinder
$$C(n, \beta) = C_{-n,\ldots,0,\ldots,n}^{\beta_1,\ldots,\beta_{2n+1}} , \qquad \beta \in \mathcal{A}^{2n+1} .$$
For a given n there are $N^{2n+1}$ distinct cylinders, of course. Moreover, by property (iii)
of lemma 5.19, we have $C(n, \beta) \cap C(n, \beta') = \emptyset$ for β ≠ β′. Finally, we
have $\bigcup_{\beta \in \mathcal{A}^{2n+1}} C(n, \beta) = S$.
*** the following lemma must be checked again ***
Lemma 5.20: The following properties hold true.
(i) Let $A = C_j^\gamma \cap C_k^\delta$ with j < k and |j|, |k| ≤ n. Then A is the union of disjoint
cylinders. E.g.
$$A = \bigcup_{\beta \in B} C(n, \beta) , \qquad B = \{\beta \in \mathcal{A}^{2n+1} : \beta_j = \gamma ,\ \beta_k = \delta\} .$$
(ii) More generally, let $A = C_{j_1}^{\gamma_1} \cap \ldots \cap C_{j_n}^{\gamma_n}$ with $j_1 < \ldots < j_n$ be a finite intersection
of cylinders. Then A is the union of disjoint cylinders. E.g.
$$A = \bigcup_{\beta \in B} C(n, \beta) , \qquad B = \{\beta \in \mathcal{A}^{2n+1} : \beta_{j_1} = \gamma_1 , \ldots , \beta_{j_n} = \gamma_n\} .$$
(iii) Let A be a finite arbitrary union of cylinders. Then A is the union of disjoint
cylinders.
Proof. The proof of (i) and (ii) is written in the corresponding formulæ for A. The
proof of (iii) is a straightforward application of property (ii).
Q.E.D.
Lemma 5.21: For any measurable subset A ⊂ S and for any positive ε there exists
à ⊂ S which is a finite union of disjoint cylinders and satisfies8
(5.15) $\mu(A \,\triangle\, \tilde A) < \varepsilon$ .
Proof. *** check the statement and add the proof ***
Q.E.D.
8 The symbol △ denotes the symmetric difference of sets, i.e., A △ B = (A ∪ B) \ (A ∩ B),
or, equivalently, A △ B = (A \ B) ∪ (B \ A).
5.4.4 The Bernoulli shift
A dynamics on the set S of two sided sequences is defined via the (left) shift
transformation σ as follows:
$$\sigma : S \to S , \qquad s \mapsto \sigma(s) : \ (\sigma(s))_j = s_{j+1} \quad \text{for } j \in \mathbb{Z} .$$
A similar definition applies also to the set $S^+$ of one sided sequences, by throwing away
the first symbol of the sequence s. That is
$$\sigma : S^+ \to S^+ , \qquad s \mapsto \sigma(s) : \ (\sigma(s))_j = s_{j+1} \quad \text{for } j \in \mathbb{Z}_+ .$$
The dynamical system defined by the map σ on S is called Bernoulli shift. It is clearly
a generalization of the coin toss, the latter being represented by the case of an alphabet
with two symbols and with weights p1 = p2 = 1/2 (if the coin is unbiased).
5.4.5 Topological properties
The shift transformation in both S and S + is continuous with respect to the topology
defined in subsection 5.4.1. The easy proof is left to the reader.
Let us consider the space S of two sided sequences. I will denote by Ω(s) the orbit
with initial point s ∈ S, namely $\Omega(s) = \bigcup_{t\in\mathbb{Z}} \varphi^t(s)$, where φ = σ is the shift. A point s will be said to be
periodic with period τ > 0 in case one has $\varphi^\tau(x) = x$ for every x ∈ Ω(s); in that
case we shall also say that the orbit is periodic. It is a trivial remark that a periodic
orbit contains a finite number of points, actually τ if τ is the minimal period.9 A
point will be said to be non periodic if it is not periodic, so that the orbit is formed
by a countable set of distinct points.10
The following properties show that the dynamics of the Bernoulli shift may be
very complicated.
(i) There are infinitely many periodic points, which form a dense subset of S.
(ii) There exist non periodic points s such that the orbit Ω(s) is dense in S; this is
expressed by saying that the Bernoulli shift is topologically transitive in S.
(iii) There exist non periodic points s such that the orbit Ω(s) is non dense in S.
Let us give examples. A periodic orbit is easily constructed by taking a finite set of
symbols, e.g., $\{s_1, \ldots, s_n\}$, and concatenating them indefinitely both in the left and
in the right direction, namely
$$s = \{\ldots , s_1 \ldots s_n\, s_1 \ldots s_n\, s_1 \ldots s_n \ldots\} .$$
With a common notation for periodic numbers we may write
$$s = \{\overline{s_1 \ldots s_n}\} .$$
For any open set $U_k(s^*)$ of the basis of the topology we may construct a periodic point
$s \in U_k(s^*)$ by taking
$$s = \{\overline{s^*_{-k} \ldots s^*_0 \ldots s^*_k}\} ,$$
which shows that periodic points are dense in S.
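The construction of a periodic point inside a basis set $U_k(s^*)$ may be sketched as follows, representing a two-sided sequence as a function from the integers to the symbols. The helper names `periodic_point` and `shift`, and the particular target sequence, are hypothetical choices made for illustration.

```python
def periodic_point(word):
    """Two-sided periodic sequence j -> symbol obtained by repeating `word`
    indefinitely, aligned so that the middle symbol of `word` sits at j = 0."""
    n = len(word)
    half = n // 2
    return lambda j: word[(j + half) % n]


def shift(s):
    """Left shift: (sigma(s))_j = s_{j+1}."""
    return lambda j: s(j + 1)


def s_star(j):
    return (j * j) % 3              # hypothetical target sequence over {0, 1, 2}


k = 2
word = [s_star(j) for j in range(-k, k + 1)]   # the block s*_{-k} ... s*_k
s = periodic_point(word)

# s agrees with s_star for |j| <= k, i.e. s belongs to U_k(s_star) ...
in_basis_set = all(s(j) == s_star(j) for j in range(-k, k + 1))

# ... and it is periodic: 2k + 1 shifts reproduce the same sequence
# (checked here on a finite window of indices only).
shifted = s
for _ in range(2 * k + 1):
    shifted = shift(shifted)
is_periodic = all(shifted(j) == s(j) for j in range(-20, 21))
print(in_basis_set, is_periodic)
```

The period 2k + 1 found here need not be the minimal one: it is minimal only if the block is not itself a repetition of a shorter block.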
9 An orbit with period τ has also periods 2τ, 3τ, . . ., and τ is the minimal period if all
periods of the orbit are multiples of τ.
10 If two points $\varphi^t(s)$ and $\varphi^{t'}(s)$ of the orbit with t ≠ t′ coincide, then clearly τ = |t′ − t| is a period,
and the orbit cannot be non periodic.
An example for the case (ii) is constructed as follows. Consider the finite sequences
of length 1, 2, 3, . . . constructed by taking all possible combinations of symbols in A .
More explicitly, take the sequences
{α0 } , . . . , {αN−1 }   of length 1 ,
{α0 α0 } , . . . , {α0 αN−1 } , . . . , {αN−1 α0 } , . . . , {αN−1 αN−1 }   of length 2 ,
{α0 α0 α0 } , . . . , {αN−1 αN−1 αN−1 }   of length 3 ,
... ... ... ... ...
and so on, the line of length n containing $N^n$ sequences. Then concatenate
all these sequences both on the left and the right sides, e.g., as
s = {. . . αN−1 αN−1 αN−1 . . . α0 α0 α0 | αN−1 αN−1 . . . α0 α0 | αN−1 . . . α0 |
α0 . . . αN−1 | α0 α0 . . . αN−1 αN−1 | α0 α0 α0 . . . αN−1 αN−1 αN−1 | . . .}
(I added the vertical bars only to help in recognizing the partial sequences.) The
orbit Ω(s) is clearly dense in S: every finite block of symbols appears somewhere in s,
so that a suitable shift of s falls in any prescribed open set of the basis of the topology.
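The construction for case (ii) can be checked on a finite truncation. The following is only a minimal sketch, under the assumption that the symbols are single characters; the names `all_words`, `right_half` and `contains_every_word` are hypothetical helpers introduced here for illustration. It builds the right half of s up to words of a given length and verifies that every shorter word occurs in it as a block, which is the property that makes the orbit dense.

```python
from itertools import product

def all_words(alphabet, n):
    """All words of length n over the alphabet, in lexicographic order."""
    return ["".join(w) for w in product(alphabet, repeat=n)]

def right_half(alphabet, max_len):
    """Concatenate all words of length 1, 2, ..., max_len.

    This is a finite truncation of the right half of the sequence s.
    """
    return "".join("".join(all_words(alphabet, n))
                   for n in range(1, max_len + 1))

def contains_every_word(text, alphabet, max_len):
    """Check that every word of length <= max_len occurs as a block in text."""
    return all(w in text
               for n in range(1, max_len + 1)
               for w in all_words(alphabet, n))

alphabet = "012"  # hypothetical alphabet with N = 3 one-character symbols
text = right_half(alphabet, 3)
print(len(text))                               # 3*1 + 9*2 + 27*3 symbols
print(contains_every_word(text, alphabet, 3))  # True if every block is found
```

Since an open set $U_k(s^*)$ of the basis is determined by a central block of length 2k + 1, finding that block somewhere in s means that some iterate of the shift brings s into $U_k(s^*)$.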
Concerning the case (iii), consider the finite sequences
{α0 } , . . . , {αN−1 }   of length 1 ,
{α0 α0 } , . . . , {α0 αN−1 }   of length 2 ,
{α0 α0 α0 } , . . . , {α0 α0 αN−1 }   of length 3 ,
... ... ... ... ...
and so on, each line containing N sequences. Then concatenate all these sequences
both on the left and the right sides, e.g., as
s = {. . . αN−1 α0 α0 . . . α0 α0 α0 | αN−1 α0 . . . α0 α0 | αN−1 . . . α0 |
α0 . . . αN−1 | α0 α0 . . . α0 αN−1 | α0 α0 α0 . . . α0 α0 αN−1 | . . .}
(I added again the vertical bars only to help in recognizing the partial sequences.)
The orbit is clearly non periodic and non dense in S. Other examples for the case (iii)
may be devised by just considering sequences that are non periodic and removing
everywhere one of the symbols, e.g., αN−1 . If N > 2 one can construct plenty of non
periodic, non dense points.
Exercise 5.5:
Produce other examples of non periodic orbits that illustrate the
properties (ii) and (iii).
For the set $S^+$ of one sided sequences a further situation occurs, due to the non
invertibility of the shift operation. The key remark is that if we have two sequences
s, s′ satisfying sj = s′j for j > n, with some n > 0, then we have $\sigma^t(s) = \sigma^t(s')$ for
t > n. That is, the evolution under the map σ makes the orbits coalesce together
into the same orbit after a transient. This introduces a new class of orbits, namely the
definitely periodic ones (often called eventually periodic), which become periodic after
a transient. A trivial example is a point
$s = \{\langle xxx \rangle\, \overline{s_k s_{k+1} \ldots s_{k+n}}\}$ ,
where ⟨xxx⟩ stands for any finite sequence of k symbols.
Exercise 5.6: Produce examples illustrating the properties (i), (ii) and (iii) above
for the set S + . Show in particular that definitely periodic points are dense in S + .
5.4.6 Mixing
The main result is
Proposition 5.22: The Bernoulli shift is mixing for the measure µ defined by any
arbitrary choice of the weights p0 , . . . , pN−1 .
Proof. We show that the definition (5.4) applies. To this end we proceed in three
steps:
(i) it is true for cylinders;
(ii) it is true for disjoint unions of cylinders;
(iii) it is true for any measurable sets A, B ⊂ S.
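Let us come to the proof. (i) Let $C^{\beta_1,\ldots,\beta_m}_{j_1,\ldots,j_m}$ and $C^{\gamma_1,\ldots,\gamma_n}_{k_1,\ldots,k_n}$ be two cylinders. We may
always assume that $j_1 < \ldots < j_m$ and $k_1 < \ldots < k_n$. The main remark is that
\[
\sigma^{-t} C^{\beta_1,\ldots,\beta_m}_{j_1,\ldots,j_m} = C^{\beta_1,\ldots,\beta_m}_{j_1+t,\ldots,j_m+t} .
\]
This is straightforward. Let now $j_1 + t > k_n$. Then the two cylinders constrain disjoint
sets of indices, so that the measure of their intersection factorizes, and we have
\[
\begin{aligned}
\mu\bigl(\sigma^{-t} C^{\beta_1,\ldots,\beta_m}_{j_1,\ldots,j_m} \cap C^{\gamma_1,\ldots,\gamma_n}_{k_1,\ldots,k_n}\bigr)
&= \mu\bigl(C^{\beta_1,\ldots,\beta_m}_{j_1+t,\ldots,j_m+t} \cap C^{\gamma_1,\ldots,\gamma_n}_{k_1,\ldots,k_n}\bigr) \\
&= \mu\bigl(C^{\beta_1,\ldots,\beta_m}_{j_1+t,\ldots,j_m+t}\bigr)\, \mu\bigl(C^{\gamma_1,\ldots,\gamma_n}_{k_1,\ldots,k_n}\bigr)
 = \mu\bigl(C^{\beta_1,\ldots,\beta_m}_{j_1,\ldots,j_m}\bigr)\, \mu\bigl(C^{\gamma_1,\ldots,\gamma_n}_{k_1,\ldots,k_n}\bigr) .
\end{aligned}
\]
The equality is true for any choice of the weights $p_0, \ldots, p_{N-1}$, the proof being independent of them.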
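(ii) *** to be completed, but it is fairly obvious *** In brief, for finite disjoint unions
of cylinders the set $\sigma^{-t}\tilde A \cap \tilde B$ is the disjoint union of the pairwise intersections of the
cylinders; for t big enough (i) applies to every pair, and the claim follows by additivity
of the measure.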
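(iii) Using lemma 5.21, let µ(A △ Ã) < ε and µ(B △ B̃) < ε with à and B̃ disjoint
unions of cylinders. By (ii) for t big enough we have
\[
\mu\bigl(\sigma^{-t}\tilde A \cap \tilde B\bigr) = \mu\bigl(\sigma^{-t}\tilde A\bigr)\, \mu(\tilde B) = \mu(\tilde A)\, \mu(\tilde B) .
\]
On the other hand we also have¹¹
\[
\bigl|\mu\bigl(\sigma^{-t}A \cap B\bigr) - \mu\bigl(\sigma^{-t}\tilde A \cap \tilde B\bigr)\bigr| < a\varepsilon , \qquad
\bigl|\mu(A)\, \mu(B) - \mu(\tilde A)\, \mu(\tilde B)\bigr| < a\varepsilon
\]
with some constant a > 1. Thus for t big enough we have
\[
\bigl|\mu\bigl((\sigma^{-t}A) \cap B\bigr) - \mu(A)\, \mu(B)\bigr| < 2a\varepsilon .
\]
Since ε is arbitrary, we conclude that the system is mixing for the measure µ. Again,
the proof is independent of the choice of the weights $p_0, \ldots, p_{N-1}$.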
Q.E.D.
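Step (i) of the proof can be illustrated by an exact computation. The following is a minimal sketch, assuming that µ is the product measure assigning weight p[a] to the symbol a, and representing a cylinder as a dictionary from indices to symbols; the function names and the sample weights are hypothetical choices made only for this illustration.

```python
from fractions import Fraction

def cylinder_measure(constraints, p):
    """Measure of the cylinder {s : s_j = b for every (j, b) in constraints}."""
    m = Fraction(1)
    for symbol in constraints.values():
        m *= p[symbol]
    return m

def shift_preimage(constraints, t):
    """The preimage sigma^{-t} C: every index j is replaced by j + t."""
    return {j + t: b for j, b in constraints.items()}

def intersection_measure(c1, c2, p):
    """Measure of the intersection of two cylinders (zero if they conflict)."""
    merged = dict(c1)
    for j, b in c2.items():
        if merged.get(j, b) != b:
            return Fraction(0)   # the same index is forced to two symbols
        merged[j] = b
    return cylinder_measure(merged, p)

# Hypothetical weights for an alphabet of N = 3 symbols.
p = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
c_beta = {0: 1, 2: 0}    # s_0 = 1, s_2 = 0
c_gamma = {1: 2, 2: 0}   # s_1 = 2, s_2 = 0

def mixing_defect(t):
    """mu(sigma^{-t} C_beta ∩ C_gamma) - mu(C_beta) mu(C_gamma)."""
    lhs = intersection_measure(shift_preimage(c_beta, t), c_gamma, p)
    return lhs - cylinder_measure(c_beta, p) * cylinder_measure(c_gamma, p)

for t in range(6):
    print(t, mixing_defect(t))
```

In this example the defect is different from zero for t = 0, 1, 2 and vanishes as soon as $j_1 + t > k_n$, i.e., for t ≥ 3, in agreement with the argument above.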
A last remark is concerned with the relation between different measures as generated by different weights. Actually, the following proposition holds true for ergodic
measures, independent of the mixing property of the dynamics. In rough terms, it
states that the measures µ and ν are concentrated on disjoint subsets.
*** this should be moved to the part on ergodicity ***
¹¹ Use A ⊂ (Ã ∪ (A △ Ã)) and the similar relation for B, B̃.
Proposition 5.23: Let two different measures µ and ν be given on a manifold M ,
and let both of them be invariant for the map φ. If φ is ergodic with respect to both
measures then there exist disjoint subsets Mµ and Mν such that
µ(Mµ ) = ν(Mν ) = 1 ,
µ(Mν ) = ν(Mµ ) = 0 .
Proof. Since the measures are different, there exists a measurable function f with
different phase averages with respect to the two measures, i.e., $\langle f \rangle_\mu \neq \langle f \rangle_\nu$, the lower
labels denoting the measure. On the other hand, by ergodicity, there exist two subsets
Mµ and Mν with measure µ(Mµ ) = ν(Mν ) = 1 such that the time averages satisfy
$\overline{f}(x) = \langle f \rangle_\mu$ for x ∈ Mµ and $\overline{f}(x) = \langle f \rangle_\nu$ for x ∈ Mν . We show that Mµ ∩ Mν = ∅.
For, a point x ∈ Mµ ∩ Mν would have two different time averages, contradicting
the ergodicity hypotheses. Thus, the sets being disjoint, we conclude that µ(Mν ) =
ν(Mµ ) = 0.
Q.E.D.
5.4.7 The Bernoulli scheme
The simplest case of a symbolic dynamical system is given by the alphabet A = {0, 1}
containing only the digits 0 and 1. We may consider either the set S + of one–sided
sequences
s = {s0 , s1 , s2 , . . .} = {sj }j∈Z+
or the set S of two sided sequences
s = {. . . , s−2 , s−1 , s0 , s1 , s2 , . . .} = {sj }j∈Z .
This model has a clear correspondence with the action of repeatedly tossing a coin:
just associate the symbols 0, 1 to the outcomes “head” and “tail”, respectively. A one
sided sequence s = {sj }j∈Z+ corresponds to a possible result of an infinite sequence of
tosses. If the coin is unbiased everybody will assign the same probability 1/2 to both
0 and 1. This results in the usual Bernoulli scheme denoted as $B\bigl(\tfrac12, \tfrac12\bigr)$. A biased coin
will generate the same symbolic dynamics, but with a different measure.
We show that the doubling map of the circle of sect. 5.1.7 is isomorphic to the
one sided Bernoulli shift $B\bigl(\tfrac12, \tfrac12\bigr)$. To this end, just define h by associating to every
x ∈ [0, 1) the sequence of the digits 0 , 1 of its binary representation. A minor problem
concerns the rational numbers of the form $m/2^k$ for some k > 0 . For, such numbers
possess two different binary representations, namely
0. ⟨xxx⟩ 1 0 0 0 0 0 0 0 0 . . .
0. ⟨xxx⟩ 0 1 1 1 1 1 1 1 1 . . .
where ⟨xxx⟩ stands for any finite sequence of binary digits. In order to avoid this unpleasant
lack of uniqueness, just remove from $S^+$ all sequences that end with an infinite string of ones. This
is clearly a countable set; hence it has measure zero. With this, h turns out to be
invertible, because the point x associated to the sequence s = {s0 , s1 , . . .} is simply
$x = \sum_{j \ge 0} s_j / 2^{j+1}$. The map h so defined is clearly one-to-one. It remains to prove
that it preserves the topology and/or the measure. The trivial proof is left to the
reader. The relevant geometrical property is that the map is strongly expanding: the
distance between two nearby points doubles at every step, so that the memory of the
initial condition is rapidly lost.

Figure 5.4. Illustrating the construction of the sequence that identifies the
baker transformation with the symbolic dynamics. The sequence that represents
the point x in the figure is {. . . 010100 . . .}, where the dots stand for the infinite
parts of the sequence that must be determined by iterating infinitely many times
the map, both in the direct and in the inverse direction.
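The correspondence between the doubling map and the one sided shift can be checked with exact rational arithmetic. The following is a minimal sketch, assuming the doubling map in the form x → 2x mod 1; the helper names `doubling` and `binary_digits` are hypothetical. It verifies on one sample point that applying the map amounts to dropping the first binary digit.

```python
from fractions import Fraction

def doubling(x):
    """Doubling map of the circle, x -> 2x mod 1 (assumed form)."""
    return (2 * x) % 1

def binary_digits(x, n):
    """First n binary digits s_0, ..., s_{n-1} of x in [0, 1).

    For dyadic rationals this yields the expansion ending with zeros,
    i.e., the one that is kept in the text.
    """
    digits = []
    for _ in range(n):
        x *= 2
        d = int(x)      # integer part, either 0 or 1
        digits.append(d)
        x -= d
    return digits

x = Fraction(5, 13)                    # arbitrary sample point
s = binary_digits(x, 12)
shifted = binary_digits(doubling(x), 11)
print(s)
print(shifted)                         # same digits with the first one dropped
print(shifted == s[1:])

# Partial sum of sum_j s_j / 2^(j+1): it approximates x within 2^(-12).
approx = sum(Fraction(d, 2 ** (j + 1)) for j, d in enumerate(s))
print(x - approx < Fraction(1, 2 ** 12))
```

Exact fractions are used instead of floating point numbers because the expanding character of the map doubles any rounding error at every step.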
The two sided Bernoulli shift is isomorphic to the baker transformation. A direct correspondence is constructed by considering the binary representation of the
coordinates x, y of a point of the square, namely
x = 0.a1 a2 a3 . . . ,
y = 0.b1 b2 b3 . . .
where the aj ’s and bj ’s are the binary digits 0, 1. A double sided sequence is constructed as
s = {. . . b3 b2 b1 | a1 a2 a3 . . .} ,
the vertical bar separating the y sequence (reversed) from the x sequence of the digits.
It is an easy exercise to check that the baker map (5.7) corresponds precisely to shifting
the bar right by one position. Again, in order to avoid different binary representations
of the same point we should remove all sequences that terminate with an infinite
sequence of ones either on the left or on the right, but this is a set of zero measure.
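The "easy exercise" may be checked on a sample point. The following is only a sketch: it assumes that the baker map (5.7) has the usual form (x, y) → (2x, y/2) for x < 1/2 and (x, y) → (2x − 1, (y + 1)/2) for x ≥ 1/2, which is not restated in this section, and the helper names are hypothetical.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def baker(x, y):
    """Baker map on the unit square (assumed standard form of (5.7))."""
    if x < HALF:
        return 2 * x, y / 2
    return 2 * x - 1, (y + 1) / 2

def binary_digits(x, n):
    """First n binary digits of x in [0, 1)."""
    digits = []
    for _ in range(n):
        x *= 2
        d = int(x)
        digits.append(d)
        x -= d
    return digits

def window(x, y, n):
    """Finite window of s: ([b_n, ..., b_1], [a_1, ..., a_n])."""
    return binary_digits(y, n)[::-1], binary_digits(x, n)

x, y = Fraction(5, 13), Fraction(2, 7)   # arbitrary non dyadic sample point
left, right = window(x, y, 8)
left1, right1 = window(*baker(x, y), 8)

print(left, "|", right)
print(left1, "|", right1)
# Shifting the bar right by one position: a_1 moves to the left of the bar.
print(left1[-1] == right[0], left1[:-1] == left[1:], right1[:-1] == right[1:])
```

The first digit a1 of x becomes the first digit of the new y, while the remaining digits of both coordinates keep their relative order, which is what moving the bar by one position means.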
Still working on the baker map, I also illustrate a general method that allows us
to construct an isomorphism of the mapping with the Bernoulli shift. This method produces in fact the same sequence as above, but it is interesting in view of its generality,
since it may be applied to many other situations.
Figure 5.5. Illustrating the construction that associates a single point x
to a given sequence s. Here the first three forward steps are represented (for which
the inverse map is used). The backward steps generate a sequence of horizontal
strips, as described in the text.

Consider the partition $Q = Q_0 \cup Q_1$ with $Q_0 = [0, 1/2) \times [0, 1)$ and $Q_1 = [1/2, 1) \times [0, 1)$,
and let x ∈ Q be any point, as represented in fig. 5.4. Using the alphabet {0, 1}
associate to x the sequence s ∈ S constructed as follows:
\[
s_j = \begin{cases} 0 & \text{if } \phi^j(x) \in Q_0 , \\ 1 & \text{if } \phi^j(x) \in Q_1 . \end{cases}
\]
This assigns to every x ∈ Q a unique sequence s ∈ S , and so it is a mapping of Q into
a subset of S .
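Doing the converse requires some more attention. Consider any sequence s ∈ S ,
and look at sj , j > 0 . We have
\[
\begin{aligned}
\phi^{j}(x) &\in Q_{s_j} \\
\phi^{j-1}(x) &\in Q_{s_{j-1}} \cap \phi^{-1}(Q_{s_j}) \\
\phi^{j-2}(x) &\in Q_{s_{j-2}} \cap \phi^{-1}\bigl(Q_{s_{j-1}} \cap \phi^{-1}(Q_{s_j})\bigr) \\
\phi^{j-3}(x) &\in Q_{s_{j-3}} \cap \phi^{-1}\Bigl(Q_{s_{j-2}} \cap \phi^{-1}\bigl(Q_{s_{j-1}} \cap \phi^{-1}(Q_{s_j})\bigr)\Bigr) \\
&\ \ \vdots
\end{aligned}
\]
as represented in figure 5.5. Hence for a generic j > 0 we have
\[
x \in Q_{s_0} \cap \phi^{-1}\bigl(Q_{s_1} \cap \phi^{-1}(Q_{s_2} \cap \ldots \cap \phi^{-1}(Q_{s_j}) \ldots)\bigr) .
\]

Figure 5.6. The horseshoe transformation.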
Hence, if we know s0 , . . . , sj then we also know that x belongs to a vertical strip of
width $2^{-j-1}$. Similarly, if we know s−j , . . . , s−1 then we also know that x belongs
to a horizontal strip of height $2^{-j}$. Remark that, by construction, the left (lower)
side belongs to the strip, while the right (upper) side does not (recall the definition of
Q0 , Q1 ). Hence x belongs to the intersection of the strips, namely the union of an open
rectangle with its left and lower sides. Letting j → ∞ we may prove that the sequence
s defines a unique point x ∈ Q provided it does not end with an infinite sequence of 1
on either side. That is, we exclude from S the sequences such that sj = 1 for |j| > J ,
for some J (Hint: use $I = \bigcap_{j \ge 1} (0, 1/j) = \emptyset$. For, $0 \notin I$ by definition, and given any
x > 0 there is a j such that x > 1/j). However, this is just a countable set¹², and so
it has zero measure; so we skip it.
This defines a one-to-one correspondence h between Q and (a subset of measure
1 of) S . If we identify S with $B\bigl(\tfrac12, \tfrac12\bigr)$ the measure of S actually coincides with the
Lebesgue measure on Q .
An easy remark is that this correspondence is, trivially, the binary representation
of the coordinates (x, y) of a point in Q . But the construction above is quite general,
and this is indeed the interesting part of the game.
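The remark that the partition method reproduces the binary digits can be tested numerically. The sketch below again assumes the usual form of the baker map (5.7), together with its inverse, and uses hypothetical helper names; it computes a finite piece of the itinerary of a point with respect to Q0 , Q1 and compares it with the binary digits of the coordinates.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def baker(x, y):
    """Baker map on the unit square (assumed standard form of (5.7))."""
    if x < HALF:
        return 2 * x, y / 2
    return 2 * x - 1, (y + 1) / 2

def baker_inverse(x, y):
    """Inverse of the map above."""
    if y < HALF:
        return x / 2, 2 * y
    return (x + 1) / 2, 2 * y - 1

def symbol(x, y):
    """0 if the point lies in Q0 (x < 1/2), 1 if it lies in Q1."""
    return 0 if x < HALF else 1

def itinerary(x, y, n):
    """Return ([s_{-n}, ..., s_{-1}], [s_0, ..., s_{n-1}])."""
    forward, point = [], (x, y)
    for _ in range(n):
        forward.append(symbol(*point))
        point = baker(*point)
    backward, point = [], (x, y)
    for _ in range(n):
        point = baker_inverse(*point)
        backward.append(symbol(*point))
    return backward[::-1], forward

def binary_digits(x, n):
    """First n binary digits of x in [0, 1)."""
    digits = []
    for _ in range(n):
        x *= 2
        d = int(x)
        digits.append(d)
        x -= d
    return digits

x, y = Fraction(5, 13), Fraction(2, 7)   # arbitrary non dyadic sample point
left, right = itinerary(x, y, 10)
print(right == binary_digits(x, 10))        # s_0, s_1, ... are the digits of x
print(left == binary_digits(y, 10)[::-1])   # s_{-1}, s_{-2}, ... are those of y
```

Only the membership of the iterates in Q0 or Q1 is used here, so the same procedure applies to any map equipped with a suitable partition; this is the sense in which the method is general.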
¹² Hint: For all J , the number of s ∈ S such that sj = 1 for |j| > J is finite. Hence we are
dealing with a countable union of finite sets.

5.5 The horseshoe
We now forget all problems related to the measure and concentrate only on the topological properties. We consider the following model, due to Smale. Take a rectangle Q
and do the following operations, illustrated in fig. 5.6:
(i) Shrink the rectangle by a factor less than 1/2 in the vertical direction.
(ii) Stretch the resulting rectangle by a factor greater than 2 in the horizontal
direction.
(iii) Bend the resulting strip into the shape of a horseshoe.
(iv) Superimpose the resulting horseshoe on the original rectangle so that both ends
and the curved part exceed the sides of the rectangle.

Figure 5.7. Illustrating the inverse map for the horseshoe transformation.

Figure 5.8. The intersection between the direct and the inverse image under
the horseshoe transformation.
This mapping is not invertible, of course, but we may consider the points that are
mapped out of the rectangle as lost, and concentrate only on the points that are
mapped inside through successive iterations of the map. To this end, let us consider
the sets U0 , U1 represented in figure 5.6, and let V0 , V1 be the preimages of U0 , U1 ,
respectively, i.e., $V_0 = \phi^{-1}(U_0)$, $V_1 = \phi^{-1}(U_1)$. Clearly, V0 and V1 are two vertical
strips as represented in figure 5.7.
With a little abuse of notation we shall write
\[
V_0 \cup V_1 = Q \cap \phi^{-1}(Q) ,
\]
where $\phi^{-1}(Q)$ is intended to be applied to the part of the horseshoe inside the rectangle.

Figure 5.9. The second iterate of the horseshoe transformation.

Figure 5.10. The first few steps of the construction of a symbolic dynamics for
the horseshoe transformation.
Let us draw the next iteration, thus constructing $\phi(U_0 \cup U_1) \cap Q = \phi^2(Q) \cap \phi(Q) \cap Q$.
This is the union of two disjoint horizontal strips in U0 and two strips in U1 , as
illustrated in fig. 5.9.
Similarly, the preimages of U0,0 , U0,1 , U1,0 and U1,1 are four vertical strips as in
figure 5.10. We shall denote again $V_{0,0} \cup V_{0,1} \cup V_{1,0} \cup V_{1,1} = \phi^{-2}(Q) \cap (V_0 \cup V_1) =
\phi^{-2}(Q) \cap \phi^{-1}(Q) \cap Q$. Actually, the points that remain inside the square for at least
2 iterations are the points that belong to the intersections of the vertical strips with
the horizontal ones, i.e., points belonging to
\[
\phi^{-2}(Q) \cap \phi^{-1}(Q) \cap Q \cap \phi(Q) \cap \phi^2(Q) .
\]
Extending this argument we see that the dynamics is well defined on the subset of Q
\[
\Lambda = \bigcap_{j \in \mathbb{Z}} \phi^j(Q) ,
\]
which is clearly invariant. We should prove that Λ is not empty; this is just a technical
matter that I leave to the reader (Hint: Λ is closed, being an infinite intersection of closed
sets, and φ restricted to Λ is invertible).
We prove now that the dynamics on Λ may be associated with the Bernoulli shift
on S. Here the method used for the baker transformation at the end of sect. 5.4.7
proves to be very useful. Let us associate to x ∈ Λ the sequence s ∈ S defined as
\[
s_j = \begin{cases} 0 & \text{if } \phi^j(x) \in U_0 , \\ 1 & \text{if } \phi^j(x) \in U_1 . \end{cases}
\]
This maps Λ to S . The correspondence is completed by proving that to every sequence
s ∈ S there corresponds a unique point x ∈ Λ . The existence of such a point may
be proved by following the arguments for the baker transformation. The uniqueness
follows from the expanding properties of the map φ.
Here are some further properties:
(i) Λ is a closed set similar to the Cantor set;
(ii) φ is invertible on Λ ;
(iii) the mapping h : Λ → S is continuous if one considers in S the topology induced
by the metric
\[
\mathrm{dist}(s, s') = \sum_{j \in \mathbb{Z}} \frac{|s_j - s'_j|}{2^{|j|}} .
\]
We conclude that the dynamics on Λ is conjugate to the symbolic dynamics.
The relevance of the horseshoe model for the dynamics in the neighbourhood of a
homoclinic orbit may be heuristically understood as follows. Consider a (curvilinear)
rectangle Q around the stable manifold $W^{(s)}$, which includes a homoclinic point P .
By continuity, the image of Q by the map φ will cover the areas that follow the
unstable manifold $W^{(u)}$, and by the area preserving property it will be stretched in
the direction of $W^{(u)}$ and become thinner and thinner in the transversal direction. At some
iteration k the image $\phi^k(Q)$ must intersect Q as in figure 5.11, thus giving rise to the
dynamics of the horseshoe.
Figure 5.11. Similarity between the homoclinic intersection and the horseshoe
transformation.

5.6 Poincaré’s recurrence theorem
We consider an abstract dynamical system (M, Ψ, µ) with Ψ invertible and µ(M ) < ∞ .
For any measurable set A ⊂ M we define the recurrence set of A as
\[
R_A = \{x \in A : \forall K_0 > 0 ,\ \exists k > K_0 \text{ such that } \Psi^k(x) \in A\} .
\]
The wandering set of A is defined as
$V_A = A \setminus R_A$ ,
i.e., the complement of $R_A$ in A .
Remark. A wandering point may return to A a finite number of times. A recurrent
point returns to A an infinite number of times.
It is also useful to introduce the set
\[
V_{A,K_0} = \{x \in A : \Psi^k(x) \notin A \text{ for all } k \ge K_0\} .
\]
The following relations apply:
(a) $V_{A,K_0} = \{x \in A : \Psi^k(x) \in M \setminus A \text{ for all } k \ge K_0\}$, thus
\[
V_{A,K_0} = A \cap \bigcap_{k \ge K_0} \Psi^{-k}(M \setminus A) ;
\]
(b)
\[
V_A = \bigcup_{K_0 > 0} V_{A,K_0} .
\]
These relations imply that $V_{A,K_0}$, and so also $V_A$, are measurable sets.
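Theorem 5.24: We have
\[
\mu(V_A) = 0 .
\]
Proof. By the definition of $V_{A,K_0}$ we have
\[
\Psi^{nK_0}(V_{A,K_0}) \cap A = \emptyset , \qquad \forall n > 0 .
\]
This implies that for every $n_1 > n_2 > 0$ we also have
\[
\Psi^{n_1 K_0}(V_{A,K_0}) \cap \Psi^{n_2 K_0}(V_{A,K_0}) = \emptyset .
\]
For, assume by contradiction that the latter set contains a point $x_1$; then we would
have
\[
\Psi^{-n_2 K_0}(x_1) \in V_{A,K_0} \cap \Psi^{(n_1 - n_2) K_0}(V_{A,K_0})
\subset \Psi^{(n_1 - n_2) K_0}(V_{A,K_0}) \cap A = \emptyset .
\]
Hence all sets $\Psi^{nK_0}(V_{A,K_0})$ are disjoint. Recalling that the map is measure preserving,
and that
\[
\mu\Bigl(\bigcup_{n>0} \Psi^{nK_0}(V_{A,K_0})\Bigr) = \sum_{n>0} \mu\bigl(\Psi^{nK_0}(V_{A,K_0})\bigr) = \sum_{n>0} \mu(V_{A,K_0}) ,
\]
from µ(M ) < ∞ we conclude $\mu(V_{A,K_0}) = 0$. Finally, by the relation (b) the set $V_A$ is a
countable union of sets of measure zero, so that $\mu(V_A) = 0$.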
Q.E.D.
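The definitions of recurrence set and wandering set may be made concrete on a toy model. The sketch below is an illustration under strong assumptions, not a test of the theorem: it takes as M a finite grid with the counting measure and as Ψ the linear map (x, y) → (2x + y, x + y) mod n, which is a bijection of the grid (its determinant is 1). In such a finite system every point is periodic, so the wandering set is empty and the statement is trivial. The names `cat_map` and `first_return_time` and the choice of A are hypothetical.

```python
def cat_map(point, n):
    """Invertible linear map of the n x n grid (determinant 1, so a bijection)."""
    x, y = point
    return ((2 * x + y) % n, (x + y) % n)

def first_return_time(point, in_a, n, max_steps):
    """Smallest k >= 1 with Psi^k(point) in A, or None if none up to max_steps."""
    q = cat_map(point, n)
    for k in range(1, max_steps + 1):
        if in_a(q):
            return k
        q = cat_map(q, n)
    return None

n = 12                                   # hypothetical grid size

def in_a(point):
    """The set A: a 3 x 3 block of grid points (arbitrary choice)."""
    return point[0] < 3 and point[1] < 3

grid = [(x, y) for x in range(n) for y in range(n)]
a_points = [p for p in grid if in_a(p)]

# A permutation of n*n points has cycles of length at most n*n,
# so n*n steps are enough for every point of A to come back to A.
times = {p: first_return_time(p, in_a, n, n * n) for p in a_points}
wandering = [p for p, t in times.items() if t is None]
print(times)
print(len(wandering))                    # size of the wandering set of A
```

Since every point of A returns to A, and the same argument applies again after each return, every point returns infinitely many times; the wandering set $V_A$ is empty in this example.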