Stochastic Evolution and Stationary Distributions

Evolution & Learning in Games
Econ 243B
Jean-Paul Carvalho
Lecture 11.
Stochastic Evolution and Stationary Distributions
Outline
▶ So far, we have focussed exclusively on the behavior of the mean dynamic, which we have treated as a (deterministic) approximation to the underlying stochastic process.

▶ The informal justification was that when the population is large, idiosyncratic noise will be averaged away.

▶ In this lecture, we shall formalize this notion for a large but finite population, in the finite-horizon case.
Outline
▶ When we want to analyze evolution over infinite horizons, we need a different set of concepts and tools, which we shall begin to introduce in the second part of the lecture.

▶ Eventually, we shall see that we can make sharp predictions about the long-run behavior of the stochastic dynamic, while characterizing its (approximate) medium-run behavior by the deterministic mean dynamic.

▶ Our stochastic analysis will focus (primarily) on the single-population case.
The Stochastic Evolutionary Process
▶ Let the population be large but finite, with N members.

▶ The set of feasible social states is then a discrete grid embedded in X (a small enumeration sketch follows below):

  X^N = X ∩ (1/N) Z^n = {x ∈ X : Nx ∈ Z^n}.
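Purely as an illustration (not part of the original slides), the following Python sketch enumerates the grid X^N for a small example with n = 3 strategies and N = 4 agents; the variable names are ours.

```python
from itertools import product

# A minimal sketch (illustrative names): enumerate X^N, the set of social states x
# in the simplex whose entries are multiples of 1/N, for n strategies and N agents.
n, N = 3, 4

# Each state assigns a count k_i of agents to strategy i, with sum_i k_i = N;
# the corresponding social state is x = k / N.
grid = [
    tuple(k / N for k in counts)
    for counts in product(range(N + 1), repeat=n)
    if sum(counts) == N
]

print(len(grid))   # number of feasible states: C(N + n - 1, n - 1) = C(6, 2) = 15
print(grid[:3])    # e.g. (0.0, 0.0, 1.0), (0.0, 0.25, 0.75), (0.0, 0.5, 0.5)
```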
The Stochastic Evolutionary Process
▶ Once again:

  ▶ A population game is denoted by F : X → R^n.
  ▶ The strategy space is S = {1, . . . , n}.
  ▶ Choices are described by the revision protocol ρ : R^n × X → R^{n×n}_+.

▶ Each agent is also equipped with a rate 1 Poisson clock, where a ring signals the arrival of a revision opportunity for the clock's owner.
Markov Property
The following independence assumptions are made:

▶ Different clocks ring independently of each other.

▶ Strategy choices are made independently of the clocks' rings.

▶ Choices are based exclusively on the current state of the process x and the current payoff vector π; beyond that, the history of play does not matter.

Together these assumptions mean that the current state summarizes all the information about the history of play required to characterize the future motion of the process.
Markov Process
▶ That is, the stochastic evolutionary process {X^N_t} is a continuous-time Markov process on the finite state space X^N.

▶ Explicitly, the process is fully characterized by its jump rates {λ^N_x}_{x∈X^N} and its transition probabilities {P^N_{xy}}_{x,y∈X^N}.

▶ A jump rate λ^N_x is the rate at which revision opportunities arrive in the society as a whole in state x.

▶ If the rate for an individual is one for all x ∈ X^N, then the jump rate is λ^N_x = N for all x ∈ X^N.
Transition Probabilities
▶ When an agent playing i ∈ S receives a revision opportunity, he switches to j ≠ i with probability ρ_ij.

▶ The probability that the next revision involves a switch from strategy i to j is then x_i ρ_ij.

▶ The transition involves one less agent playing i and one more agent playing j, i.e. the state is shifted by (1/N)(e_j − e_i).

▶ P^N_{xy}(t) is the probability of transiting from state x to y in exactly one revision.

▶ If P^N_{xy}(t) is independent of t, then the Markov process is time homogeneous, and we write P^N_{xy}. This is the case we will be dealing with.
Transition Matrix
▶ P^N is a |X^N| × |X^N| matrix, called the transition matrix, whose elements are the transition probabilities.

▶ For all x, ∑_{y∈X^N} P^N_{xy} = 1.

▶ This means that P^N is a row stochastic matrix, i.e. its row elements sum to one.

▶ For example, consider the following transition matrix for a two-state Markov process:

  P^N = [ 1/2  1/2 ]
        [ 1/4  3/4 ].
Representation
Another way to describe this process is via a graph, with one node per state and a directed edge for each positive-probability transition, labelled with its probability. (The transition diagram is not reproduced here.)
Markov Process
Observation 11.1. A population game F, a revision protocol ρ, a revision opportunity arrival rate of one, and a population size N define a Markov process {X^N_t} on the state space X^N.

This process is described by some initial state X^N_0 = x^N_0, the jump rates λ^N_x = N and the transition probabilities:

  P^N_{x,x+z} =
    x_i ρ_ij(F(x), x)                         if z = (1/N)(e_j − e_i), i, j ∈ S, i ≠ j,
    1 − ∑_{i∈S} ∑_{j≠i} x_i ρ_ij(F(x), x)     if z = 0,
    0                                         otherwise.
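A minimal sketch (ours, not the lecture's code) of how the transition probabilities in Observation 11.1 can be assembled for a given state; the helper name transition_probs and the interfaces assumed for F and ρ are illustrative.

```python
import numpy as np

def transition_probs(x, N, F, rho):
    """Nonzero transition probabilities P^N_{x, x+z} from Observation 11.1.

    x   : current state (length-n array of grid points in X^N),
    F   : payoff function, F(x) -> length-n payoff vector,
    rho : revision protocol, rho(pi, x) -> n-by-n matrix of switch probabilities.
    """
    n = len(x)
    R = rho(F(x), x)
    probs = {}                           # maps the displacement z (as a tuple) to P^N_{x, x+z}
    stay = 1.0
    for i in range(n):
        for j in range(n):
            if i != j and x[i] > 0:
                z = np.zeros(n)
                z[j] += 1.0 / N          # one more agent playing j ...
                z[i] -= 1.0 / N          # ... and one less playing i
                p = x[i] * R[i, j]
                probs[tuple(z)] = p
                stay -= p
    probs[tuple(np.zeros(n))] = stay     # z = 0: no switch occurs at this revision
    return probs
```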
Deterministic Approximation
Theorem 11.1. Let {{X^N_t}}_{N=N_0}^∞ be a sequence of continuous-time Markov processes on the state spaces X^N. Suppose that V, the vector field of the mean dynamic ẋ = V(x), is Lipschitz continuous. Let the initial conditions X^N_0 = x^N_0 converge to state x_0 ∈ X as N → ∞, and let {x_t}_{t≥0} be the solution to the mean dynamic starting from x_0. Then for all T < ∞ and ε > 0,

  lim_{N→∞} P( sup_{t∈[0,T]} |X^N_t − x_t| < ε ) = 1.
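To see Theorem 11.1 at work, here is a rough simulation sketch (ours): it couples one run of the stochastic process with an Euler approximation of the mean dynamic over a finite horizon. The two-strategy coordination game, the logit protocol, and all parameter values are illustrative assumptions, and revision times are approximated by their mean 1/N rather than drawn from the exponential distribution.

```python
import numpy as np

rng = np.random.default_rng(0)
eta = 0.2                                    # logit noise level (illustrative)

def F(x):                                    # simple coordination game: payoff to i is x_i
    return x

def logit(pi):                               # logit choice probabilities over strategies
    w = np.exp(pi / eta)
    return w / w.sum()

def mean_dynamic(x):                         # logit mean dynamic: V(x) = logit(F(x)) - x
    return logit(F(x)) - x

N, T = 1000, 10.0
counts = np.array([700, 300])                # agents per strategy, so X^N_0 = (0.7, 0.3)
x_det = counts / N                           # mean-dynamic solution x_t (Euler steps)
dt = 1.0 / N                                 # expected time between revision opportunities

for _ in range(int(T / dt)):
    # Stochastic step: a uniformly chosen agent revises via the logit protocol.
    x_sim = counts / N
    i = rng.choice(2, p=x_sim)               # the revising agent currently plays i
    j = rng.choice(2, p=logit(F(x_sim)))     # ... and switches to j
    counts[i] -= 1
    counts[j] += 1
    # Deterministic step over the same expected time increment.
    x_det = x_det + dt * mean_dynamic(x_det)

print(counts / N, x_det)                     # close to each other for large N over [0, T]
```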
Conditions
▶ Lipschitz continuity guarantees unique solutions to the mean dynamic, but does not apply to the best response dynamic.

▶ Still, deterministic approximation results for the best response dynamic are available.

▶ More importantly, this is a finite-horizon result.
Conditions
▶ To see why the result does not extend to an infinite horizon, consider the logit dynamic in a potential game.

▶ We have seen that this mean dynamic converges to a perturbed equilibrium from any initial state.

▶ Over a finite horizon, the underlying Markov process closely tracks the mean dynamic when the population is large.

▶ When the population is large but finite, however, there is always a positive probability path from each state to every other state, so that every state in X^N is visited infinitely often with probability one.
Long-Run Analysis
▶ Over finite horizons, the focus is on the mean dynamic.

▶ In the long run, however, the object of interest is the stationary distribution µ of the process {X_t} (we are now dropping the N superscript).

▶ The stationary distribution tells us the frequency distribution of visits to each state as t → ∞, i.e. almost surely the process spends proportion µ(x) of the time in state x.

▶ Before analyzing stationary distributions, let us introduce some definitions.
Communication
▶ State y is accessible from x if there exists a positive probability path from x to y, i.e. a sequence of states beginning in x and ending in y in which each one-step transition between states has positive probability under P.

▶ States x and y communicate if they are each accessible from the other.

▶ A set of states E is closed if the process cannot leave it, i.e. for all x ∈ E and y ∉ E, P_xy = 0.

▶ An absorbing state is a singleton closed set.

▶ Every state in X is either transient or recurrent; and a state is recurrent if and only if it is a member of a closed communication class.
Recurrence
Theorem 11.2. Let {X_t} be a Markov process on a finite set X. Then starting from any state x_0, the frequency distribution of visits to states converges to a stationary distribution µ, which solves µP = µ. In such a vector, µ(x) = 0 for all transient states.

▶ This is why closed communication classes are commonly called recurrence classes.
Stationary Distributions
▶ FACT: every non-negative row stochastic matrix has at least one left eigenvector with eigenvalue one, i.e. there exists a µ such that µP = µ.

▶ In other words, there exists at least one stationary distribution.

Theorem 11.3.
(i) If the stationary distribution µ is unique, it is the long-run frequency distribution, independent of the initial state.
(ii) If there are multiple stationary distributions, then the long-run frequency distribution is among these, and can be any one of them.
Stationary Distributions
▶ In our previous example, µ = (1/3, 2/3) (check by showing that µ solves µP = µ; a numerical check follows below).

▶ Consider the following four-state Markov process (transition diagram not reproduced here); the stationary distributions below indicate that the first two states form a recurrent class, the fourth state is absorbing, and the third state is transient.

▶ The solutions to the stationarity equation are:

  µ_1 = (1/3, 2/3, 0, 0),   µ_2 = (0, 0, 0, 1),

  or any convex combination of µ_1 and µ_2.
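A quick numerical check (ours) of the two-state example, using the fact from the previous slide that a stationary distribution is a left eigenvector of P with eigenvalue one:

```python
import numpy as np

# Sketch: recover µ = (1/3, 2/3) for the two-state transition matrix from earlier.
P = np.array([[0.5, 0.5],
              [0.25, 0.75]])

# A stationary distribution is a left eigenvector of P with eigenvalue 1,
# normalized to sum to one.
eigvals, eigvecs = np.linalg.eig(P.T)
mu = np.real(eigvecs[:, np.isclose(eigvals, 1.0)].ravel())
mu = mu / mu.sum()

print(mu)                        # [0.3333... 0.6666...]
print(np.allclose(mu @ P, mu))   # True: µ solves µP = µ
```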
Irreducibility

▶ A Markov process is irreducible if there is a positive probability path from each state to every other, i.e. if all states in X communicate, or equivalently if X forms a single recurrent class.

Theorem 11.4. If the Markov process {X_t} is irreducible, then it has a unique stationary distribution, and this stationary distribution is independent of the initial state.

In this case, we say that the process {X_t} is ergodic: its long-run behavior does not depend on initial conditions.
k-Step Ahead Probabilities
▶ We may not only want to know the proportion of time the process spends in each state (given by µ), but also the probability of being in a given state at some future point in time.

▶ Recall that P(X_1 = y | X_0 = x) = P_xy is a one-step transition probability.

▶ Two-step transition probabilities are computed by multiplying P by itself:

  P(X_2 = y | X_0 = x) = ∑_{z∈X} P(X_2 = y, X_1 = z | X_0 = x)
                       = ∑_{z∈X} P(X_1 = z | X_0 = x) P(X_2 = y | X_1 = z, X_0 = x)
                       = ∑_{z∈X} P_xz P_zy
                       = (P²)_xy.
Aperiodicity
▶ By induction, the t-step transition probabilities are given by the entries of the t-th power of the transition matrix:

  P(X_t = y | X_0 = x) = (P^t)_xy.

▶ Let T_x be the set of all positive integers T such that there is a positive probability of moving from x to x in exactly T periods.

▶ The process is aperiodic if for every x ∈ X, the greatest common divisor of T_x is 1.

This holds whenever the probability of remaining in each state is positive, i.e. P_xx > 0 for all x ∈ X.
Aperiodicity
▶ If {X_t} is irreducible and aperiodic, then for all x_0, x ∈ X:

  lim_{t→∞} (P^t)_{x_0 x} = µ(x).

▶ Therefore, from any initial state x_0, both the proportion of time the process spends in each state up through time t and the probability of being in each state at time t converge to the stationary distribution µ (illustrated numerically below).

▶ Hence the stationary distribution provides a lot of information about the long-run behavior of the process.
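A small numerical illustration (ours) for the two-state example: the entries of P^t are the t-step transition probabilities, and for large t every row approaches µ = (1/3, 2/3).

```python
import numpy as np

# Sketch: for an irreducible, aperiodic chain, every row of P^t converges to µ.
P = np.array([[0.5, 0.5],
              [0.25, 0.75]])

print(np.linalg.matrix_power(P, 2))    # two-step probabilities (P²)_xy
print(np.linalg.matrix_power(P, 50))   # both rows are close to [1/3, 2/3]
```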
Full Support Revision Protocols
▶ Irreducibility and aperiodicity are desirable properties of a Markov process.

▶ They are both generated by full support revision protocols, i.e. revision protocols in which all strategies are chosen with positive probability.

▶ Let us consider two examples which are extensions of best response protocols.
Full Support Revision Protocols
▶ Best response with mutations:

  ▶ A revising agent switches to his current best response with probability 1 − ε, and chooses a strategy uniformly at random (mutates) with probability ε > 0.

  ▶ Let us refer to this protocol as BRM(ε).

  ▶ In case of best response ties, it is often assumed that a non-mutating agent sticks with her current strategy if it is a best response; otherwise she chooses at random from the set of best responses. (A sketch of this protocol follows below.)
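A minimal sketch (ours) of the BRM(ε) protocol just described, written as a function returning a revising agent's choice probabilities; the tie-breaking convention follows the bullet above, and the function name and default ε are illustrative.

```python
import numpy as np

def brm(pi, x, current, eps=0.05):
    """BRM(eps): choice probabilities for an agent currently playing `current`.

    pi      : payoff vector (length n),
    x       : current state (unused by BRM, kept to mirror the form rho(pi, x)),
    current : the revising agent's current strategy,
    eps     : mutation probability.
    """
    n = len(pi)
    best = np.isclose(pi, pi.max())              # indicator of the best-response set
    choice = np.zeros(n)
    if best[current]:
        choice[current] = 1.0                    # stick with a current best response
    else:
        choice[best] = 1.0 / best.sum()          # tie-break uniformly among best responses
    return (1 - eps) * choice + eps * np.ones(n) / n   # mutate uniformly with prob. eps
```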
Full Support Revision Protocols
▶ Logit choice:

  ρ_ij(π) = exp(η⁻¹ π_j) / ∑_{k∈S} exp(η⁻¹ π_k).

▶ We can define ε⁻¹ = exp(η⁻¹), where ε is an increasing function of η. As η → 0, ε → 0.

▶ In this way, we can rewrite the revision protocol as follows (a short numerical check follows below):

  ρ_ij(π) = ε^(−π_j) / ∑_{k∈S} ε^(−π_k).
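A short sketch (ours) of logit choice together with a numerical check that the rewritten ε-form agrees with it; the payoff vector and noise level are arbitrary illustrations.

```python
import numpy as np

def logit(pi, eta=0.5):
    """Logit choice probabilities: rho_ij(pi) depends only on j, not on i."""
    w = np.exp(pi / eta)
    return w / w.sum()

# Check the rewritten form: with eps = exp(-1/eta), eps**(-pi_j) = exp(pi_j/eta).
pi, eta = np.array([1.0, 0.3, -0.5]), 0.5
eps = np.exp(-1.0 / eta)
alt = eps ** (-pi) / (eps ** (-pi)).sum()

print(np.allclose(logit(pi, eta), alt))   # True
```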
Stationary Distributions
▶ There are two problems:

  ▶ It may not be possible to compute the stationary distribution explicitly.

  ▶ Even if it is possible to do so, the stationary distribution may spread weight widely over the state space.
Analyzing Large-Dimensional Markov Processes
▶ In this lecture, we shall study a class of reversible Markov processes whose stationary distributions are easy to compute.

▶ In the next lecture, we shall introduce the concept of stochastic stability, which can drastically reduce the number of states which attract positive weight in the stationary distribution.
Reversible Markov Processes
▶ Reversible Markov processes permit easy computation even if the state space X, and hence the |X| × |X| transition matrix P, is large.

▶ A constant jump rate process {X_t} is reversible if it admits a reversible distribution, i.e. a probability distribution µ on X that satisfies the following detailed balance conditions:

  µ_x P_xy = µ_y P_yx   for all x, y ∈ X.   (1)

▶ Such a process is reversible in the sense that it looks the same whether time is run forward or backward.
Reversible Markov Processes
▶ Recall that a stationary distribution µ satisfies:

  ∑_{x∈X} µ_x P_xy = µ_y   for all y ∈ X.   (2)

▶ Summing (1) over x, we get:

  ∑_{x∈X} µ_x P_xy = ∑_{x∈X} µ_y P_yx = µ_y ∑_{x∈X} P_yx = µ_y.   (3)

▶ Therefore, a reversible distribution is also a stationary distribution.
Reversible Markov Processes
▶ There are two contexts in which the stochastic evolutionary process {X_t} is known to be reversible:

  1. two-strategy games (under arbitrary revision protocols),
  2. potential games under exponential protocols.

▶ We shall now study the first and leave the second to later.
Two-Strategy Games
▶ Let F : X → R^2 be a two-strategy game, with strategy set {0, 1}, full support revision protocol ρ : R^2 × X → R^{2×2}_+, and finite population size N.

▶ This defines an irreducible and aperiodic Markov process {X_t} on the state space X^N.

▶ For this class of games, let x ≡ x_1. The state of the process is fully described by x.

▶ Therefore, the state space is X^N = {0, 1/N, . . . , 1}, a uniformly spaced grid embedded in the unit interval.
Birth and Death Processes
▶ Because revision opportunities arrive at an exponential rate in continuous time, agents switch strategies sequentially.

▶ This means that transitions are always between adjacent states.

▶ If in addition the state space is linearly ordered (which it is in a two-strategy game), then we refer to the Markov process as a birth and death process.

▶ We shall now show that a stationary distribution for such processes can be calculated in a straightforward way.
Birth and Death Processes
▶ In a birth and death process, there are vectors p, q ∈ R^{|X|} with p_1 = q_0 = 0 such that the transition matrix takes the following form:

  P_xy = p_x              if y = x + 1/N,
         q_x              if y = x − 1/N,
         1 − p_x − q_x    if y = x,
         0                otherwise.

▶ The process is irreducible if p_x > 0 for all x < 1 and q_x > 0 for all x > 0, which is what we shall assume. (A construction sketch for such a matrix follows below.)
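A sketch (ours) of how such a tridiagonal transition matrix can be assembled from the vectors p and q, indexing states by k = Nx ∈ {0, 1, . . . , N}; the function name is illustrative.

```python
import numpy as np

def birth_death_matrix(p, q):
    """Build the (N+1)-by-(N+1) transition matrix of a birth and death process.

    p[k] is the probability of moving up from state k/N, q[k] of moving down;
    the convention p[N] = q[0] = 0 keeps the process on the grid {0, 1/N, ..., 1}.
    """
    K = len(p)                          # K = N + 1 states
    P = np.zeros((K, K))
    for k in range(K):
        if k + 1 < K:
            P[k, k + 1] = p[k]          # birth: x -> x + 1/N
        if k - 1 >= 0:
            P[k, k - 1] = q[k]          # death: x -> x - 1/N
        P[k, k] = 1.0 - p[k] - q[k]     # no change
    return P
```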
Birth and Death Processes
▶ Because of the "local" structure of transitions, the reversibility condition reduces to:

  µ_x q_x = µ_{x−1/N} p_{x−1/N}   for all x > 0.

▶ Applying the formula inductively, we have:

  µ_x q_x q_{x−1/N} · · · q_{1/N} = µ_0 p_0 p_{1/N} p_{2/N} · · · p_{x−1/N}.

▶ That is, the process running 'down' from state x to zero should look like the process running 'up' from zero to state x.
Stationary Distribution
▶ Rearranging, we see that the stationary distribution satisfies:

  µ_x / µ_0 = ∏_{j=1}^{Nx} p_{(j−1)/N} / q_{j/N}   for all x ∈ {1/N, . . . , 1}.   (4)

▶ For a full support revision protocol ρ and rate one Poisson alarm clocks, the upward and downward probabilities are given by:

  p_x = (1 − x) ρ_01(F(x), x),
  q_x = x ρ_10(F(x), x).   (5)
Stationary Distribution
▶ Substituting the expressions in (5) into (4) yields:

  µ_x / µ_0 = ∏_{j=1}^{Nx} p_{(j−1)/N} / q_{j/N}
            = ∏_{j=1}^{Nx} [ (1 − (j−1)/N) ρ_01(F((j−1)/N), (j−1)/N) ] / [ (j/N) ρ_10(F(j/N), j/N) ],   (6)

  for all x ∈ {1/N, . . . , 1}.
Stationary Distribution
Simplifying, we have the following result:

Theorem 11.5. Suppose that a population of N agents plays the two-strategy game F using the full support revision protocol ρ. Then the stationary distribution for the evolutionary process {X^N_t} on X^N is given by:

  µ_x / µ_0 = ∏_{j=1}^{Nx} [ (N − j + 1) / j ] · [ ρ_01(F((j−1)/N), (j−1)/N) / ρ_10(F(j/N), j/N) ]   for x ∈ {1/N, . . . , 1},   (7)

with µ_0 determined by the requirement that ∑_{x∈X^N} µ_x = 1.
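To close, a sketch (ours) that evaluates formula (7) for an illustrative two-strategy coordination game under logit choice and verifies detailed balance numerically; the game, noise level η, and population size N are assumptions made for the example.

```python
import numpy as np

N, eta = 100, 0.2                          # population size and logit noise (illustrative)

def F(x):                                  # payoffs in a simple coordination game:
    return np.array([1 - x, 2 * x])        # strategy 0 pays 1 - x, strategy 1 pays 2x

def rho(pi):                               # logit choice; rho_ij depends only on j
    w = np.exp(pi / eta)
    return w / w.sum()

# Upward/downward probabilities from (5): p_x = (1 - x) rho_01, q_x = x rho_10.
x_grid = np.arange(N + 1) / N
p = (1 - x_grid) * np.array([rho(F(x))[1] for x in x_grid])
q = x_grid * np.array([rho(F(x))[0] for x in x_grid])

# Formula (7): mu_x / mu_0 as a running product over j = 1, ..., Nx.
ratios = np.ones(N + 1)
for k in range(1, N + 1):
    ratios[k] = ratios[k - 1] * (N - k + 1) / k \
        * rho(F((k - 1) / N))[1] / rho(F(k / N))[0]
mu = ratios / ratios.sum()                 # normalize so that sum_x mu_x = 1

# Cross-check: mu satisfies detailed balance, hence stationarity, for the
# birth and death chain with these p and q.
print(np.allclose(mu[1:] * q[1:], mu[:-1] * p[:-1]))   # True
print(x_grid[np.argmax(mu)])                            # most of the mass sits near x = 1
```

One advantage of formula (7), illustrated here, is that µ is obtained from a single product over the grid, without ever forming or diagonalizing the full (N+1) × (N+1) transition matrix.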