Why are reflected random walks so hard to simulate?
Sean Meyn
Department of Electrical and Computer Engineering
and the Coordinated Science Laboratory
University of Illinois
Joint work with Ken Duffy - Hamilton Institute, National University of Ireland
NSF support: ECS-0523620
Outline
• Motivation
• Reflected Random Walks
• Why So Lopsided?
• Most Likely Paths
• Summary and Conclusions
[Figure: a "most likely path" comparison, theory vs. observed X, previewing the examples section]
MM1 queue - the nicest of Markov chains
For the stable queue, with load strictly less than one, it is
• Reversible
• Skip-free
• Monotone
• Marginal distribution geometric
• Geometrically ergodic
Why, then, can't I simulate my queue?
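To make the punchline concrete, here is a minimal simulation sketch (Python; the helper mm1_sample_mean and the ±1-increment parametrization are illustrative assumptions, not taken from the slides):

import numpy as np

def mm1_sample_mean(rho, n, seed=0):
    """Sample-path average of the reflected +/-1 walk (discrete-time MM1):
    X(t+1) = max(0, X(t) + Delta(t+1)), with P(Delta = +1) = rho/(1+rho)."""
    rng = np.random.default_rng(seed)
    p = rho / (1.0 + rho)            # up-probability; drift < 0 iff rho < 1
    x, total = 0, 0
    for d in rng.choice([1, -1], size=n, p=[p, 1 - p]):
        x = max(0, x + d)
        total += x
    return total / n

print([round(mm1_sample_mean(0.9, 100_000, seed=s), 2) for s in range(5)])

At ρ = 0.9 the steady-state mean of this chain is ρ/(1 − ρ) = 9, yet estimates from independent 100,000-step runs typically scatter widely around it.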
MM1 queue - the nicest of Markov chains (Ch 11 CTCN)
[Figure: histogram of the CLT-scaled error (1/√100,000) Σ_{t=1}^{100,000} (X(t) − η) for the MM1 queue at ρ = 0.9, plotted against the Gaussian density; the two do not match.]
Why, then, can't I simulate my queue?
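The histogram experiment can be reproduced in outline as follows (a sketch under the same ±1 model; 200 repetitions stands in for however many runs the original figure used):

import numpy as np

def clt_error(rho, n, rng):
    """One sample of the CLT-scaled error n^{-1/2} sum_t (X(t) - eta)."""
    eta = rho / (1 - rho)            # steady-state mean of this chain
    p = rho / (1 + rho)
    s = np.cumsum(rng.choice([1, -1], size=n, p=[p, 1 - p]))
    # Lindley representation of the reflected walk: X(t) = S(t) - min(0, min_k S(k))
    x = s - np.minimum(np.minimum.accumulate(s), 0)
    return (x - eta).sum() / np.sqrt(n)

rng = np.random.default_rng(1)
errors = [clt_error(0.9, 100_000, rng) for _ in range(200)]
# A histogram of `errors` against the fitted Gaussian density exhibits
# the mismatch shown on the slide, even at n = 100,000.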
II. Reflected Random Walk
Reflected Random Walk
A reflected random walk is given by the recursion
X(t + 1) = [X(t) + ∆(t + 1)]₊ = max(0, X(t) + ∆(t + 1)),
where X(0) is the initial condition, and the increments ∆ are i.i.d.
Basic stability assumption: negative drift and finite second moment,
E[∆(0)] < 0 and E[∆(0)²] < ∞
MM1 queue - the nicest of Markov chains
Reflected random walk with increments ∆(t) ∈ {−1, +1}
Load condition: ρ < 1
Reflected Random Walk
Basic question: Can I compute sample-path averages
η(n) = n⁻¹ Σ_{t=1}^{n} X(t)
to estimate the steady-state mean η?
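In code, the basic question is whether averages of the reflected recursion settle quickly (a minimal sketch; Gaussian increments with drift −0.1 are just one admissible choice):

import numpy as np

def eta_n(increments):
    """eta(n) = n^{-1} sum_{t=1}^n X(t) for the reflected walk
    X(t+1) = max(0, X(t) + Delta(t+1)), started from X(0) = 0."""
    x, total = 0.0, 0.0
    for d in increments:
        x = max(0.0, x + d)
        total += x
    return total / len(increments)

rng = np.random.default_rng(2)
print(eta_n(rng.normal(-0.1, 1.0, size=100_000)))   # E[Delta] < 0, E[Delta^2] < inf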
Simulating the RRW: Asymptotic Variance
The CLT requires a third moment for the increments:
E[∆(0)] < 0 and E[∆(0)³] < ∞
Asymptotic variance = O(1/(1 − ρ)⁴)
Whitt 1989; Asmussen 1992
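A crude batch-means check of the O((1 − ρ)⁻⁴) growth (a sketch; batch_means_var is a hypothetical helper and the run lengths are arbitrary):

import numpy as np

def batch_means_var(rho, n, batches=100, seed=3):
    """Batch-means estimate of the asymptotic variance of eta(n) for the
    reflected +/-1 walk: batch averages of length m on one long run, times m."""
    rng = np.random.default_rng(seed)
    p = rho / (1 + rho)
    m = n // batches
    x, means = 0, []
    for _ in range(batches):
        total = 0
        for d in rng.choice([1, -1], size=m, p=[p, 1 - p]):
            x = max(0, x + d)
            total += x
        means.append(total / m)
    return m * np.var(means, ddof=1)

for rho in (0.5, 0.7, 0.9):          # expect roughly (1 - rho)^{-4} growth
    print(rho, batch_means_var(rho, 1_000_000))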
Simulating the RRW: Lopsided Statistics
Assume only negative drift and finite second moment:
E[∆(0)] < 0 and E[∆(0)²] < ∞
Lower LDP asymptotics: For each r < η,
lim_{n→∞} (1/n) log P{η(n) ≤ r} = −I(r) < 0        [M 2006]
Upper LDP asymptotics are null: For each r ≥ η,
lim_{n→∞} (1/n) log P{η(n) ≥ r} = 0        [M 2006; CTCN]
even for the MM1 queue!
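A Monte Carlo illustration of the asymmetry (a sketch at ρ = 0.5, where η = 1 for this chain; the thresholds η/2 and 2η are arbitrary):

import numpy as np

rng = np.random.default_rng(4)

def eta_n(rho, n):
    p = rho / (1 + rho)
    s = np.cumsum(rng.choice([1, -1], size=n, p=[p, 1 - p]))
    x = s - np.minimum(np.minimum.accumulate(s), 0)   # reflected walk via Lindley
    return x.mean()

rho, eta = 0.5, 1.0
for n in (50, 100, 200, 400):
    runs = np.array([eta_n(rho, n) for _ in range(5000)])
    print(n, (runs <= 0.5 * eta).mean(), (runs >= 2.0 * eta).mean())
# The lower-tail frequency collapses geometrically in n;
# the upper-tail frequency decays far more slowly.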
III. Why So Lopsided?
Lower LDP
Construct the family of twisted transition laws
P̌(x, dy) = e^{θx − Λ(θ) − F̌(x)} P(x, dy) e^{F̌(y)},    θ < 0
Geometric ergodicity of the twisted chain implies multiplicative
ergodic theorems, and thence Bahadur-Rao LDP asymptotics.
Kontoyiannis & M 2003, 2005; M 2006
This is only possible when θ is negative.
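A scalar caricature shows why θ must be negative (a sketch: plain exponential tilting of the ±1 increment law, not the full state-dependent twisted kernel on the slide):

import numpy as np

def tilted_up_prob(p_up, theta):
    """Exponentially tilt the +/-1 increment law: P_theta(d) ∝ e^{theta d} P(d)."""
    w_up = p_up * np.exp(theta)
    w_dn = (1 - p_up) * np.exp(-theta)
    return w_up / (w_up + w_dn)

p = 0.9 / 1.9                        # load rho = 0.9
for theta in (0.0, -0.25, -0.5):
    p_t = tilted_up_prob(p, theta)
    print(theta, round(p_t, 4), round(2 * p_t - 1, 4))  # tilted drift stays negative
# Negative theta pushes the drift further down, so the twisted chain remains
# geometrically ergodic; a positive theta would destabilize it.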
Null Upper LDP: What is the area of a triangle?
A = ½ bh
[Figure: a triangular excursion of X(t) with height h over base b, inside the horizon [0, T]]
η(n) ≈ n⁻¹ A
Null Upper LDP: Area of a Triangle?
Large deviations for reflected random walk: consider the scaling
T = n,    h = γn
Base obtained from the most likely path to this height:
b = β*n        [e.g., Ganesh, O'Connell, and Wischik 2004]
η(n) ≈ n⁻¹ A
Null Upper LDP: Area of a Triangle
With T = n, h = γn, b = β*n:
A_n = ½ γβ* n²,    η(n) ≈ n⁻¹ A_n = ½ γβ* n
Upper LDP is null: for any γ > 0, the cost of the excursion is only linear in n while its area contribution grows like n², and the exponential rate vanishes as γ ↓ 0:
lim_{n→∞} (1/n) log P{η(n) ≥ r} = 0
M 2006; CTCN 2008 (see also Borovkov et al. 2003)
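The heuristic behind the null limit, written out schematically (c denotes the linear cost rate of the upward excursion):

\[
P\{\eta(n) \ge r\} \;\ge\; P\{\text{excursion to height } \gamma n\} \;\approx\; e^{-c\gamma n},
\qquad
\eta(n) \;\approx\; \tfrac{1}{n}A_n \;=\; \tfrac{1}{2}\gamma\beta^* n \;\ge\; r \ \text{ for all large } n,
\]
\[
\Longrightarrow\quad
\liminf_{n\to\infty}\, \tfrac{1}{n}\log P\{\eta(n) \ge r\} \;\ge\; -c\gamma
\quad\text{for every } \gamma > 0, \ \text{forcing the limit to be } 0.
\]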
IV. Most Likely Paths
What Is The Most Likely Area?
Are triangular excursions optimal? No. A concave perturbation gives greatly increased area at modest "cost".
[Figure: the triangular candidate ("Most likely path?") beside a concave excursion labeled "Better...", both over the horizon [0, T]]
Most Likely Paths
Scaled process: ψⁿ(t) = n⁻¹ X(nt)
LDP question translated to the scaled process:
η(n) ≈ An  ⟺  ∫₀¹ ψⁿ(t) dt ≈ A
Basic assumption: The sample paths of the unconstrained random walk with increments ∆ satisfy the LDP in D[0, 1] with good rate function I_X. For concave ψ with no downward jumps,
I_X(ψ) = ϑ̄⁺ ψ(0+) + ∫₀¹ I_∆(ψ̇(t)) dt
Duffy and M 2009
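For concreteness, with Gaussian increments ∆ ~ N(δ, σ²), δ < 0, Cramér's theorem gives the local rate in closed form (a worked instance; the jump term is carried along exactly as on the slide):

\[
\Lambda_\Delta(\theta) = \log E\bigl[e^{\theta\Delta}\bigr] = \delta\theta + \tfrac12\sigma^2\theta^2,
\qquad
I_\Delta(v) = \sup_{\theta}\bigl\{\theta v - \Lambda_\Delta(\theta)\bigr\} = \frac{(v-\delta)^2}{2\sigma^2},
\]
\[
I_X(\psi) = \bar\vartheta^{\,+}\psi(0+) + \int_0^1 \frac{(\dot\psi(t)-\delta)^2}{2\sigma^2}\,dt
\quad\text{for concave } \psi \text{ with no downward jumps.}
\]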
Most Likely Path Via Dynamic Programming
Dynamic programming formulation:
min  ϑ̄⁺ ψ(0+) + ∫₀¹ I_∆(ψ̇(t)) dt
s.t.  ∫₀¹ ψ(t) dt = A
Solution: Integration by parts + Lagrangian relaxation:
min  ϑ̄⁺ ψ(0+) + ∫₀¹ I_∆(ψ̇(t)) dt + λ( ψ(1) − A − ∫₀¹ t ψ̇(t) dt )
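The integration by parts behind the relaxed objective, for ψ absolutely continuous on (0, 1] with initial jump ψ(0+):

\[
\int_0^1 \psi(t)\,dt \;=\; \bigl[t\,\psi(t)\bigr]_{0}^{1} - \int_0^1 t\,\dot\psi(t)\,dt
\;=\; \psi(1) - \int_0^1 t\,\dot\psi(t)\,dt,
\]
so the area constraint enters the Lagrangian exactly as the term λ(ψ(1) − A − ∫₀¹ t ψ̇(t) dt).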
Most Likely Path Via Dynamic Programming
Variational arguments where ψ is strictly concave:
[Figure: the optimal ψ(t): initial jump to ψ(0+), strictly concave on (T₀, T₁), linear elsewhere on [0, 1]]
∇I(ψ̇(t)) = b − λ* t    for a.e. t ∈ (T₀, T₁)
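For Gaussian increments the first-order condition integrates in closed form, giving a parabolic arc (a sketch; the multipliers b and λ* below are illustrative stand-ins, in practice chosen so the area constraint holds):

import numpy as np

# Gaussian increments Delta ~ N(delta, sigma^2): grad I(v) = (v - delta)/sigma^2,
# so grad I(psi_dot(t)) = b - lam*t gives psi_dot(t) = delta + sigma^2*(b - lam*t),
# and psi is a parabola on the strictly concave interval (no initial jump here).
delta, sigma = -0.1, 1.0
b, lam = 0.6, 1.4                     # hypothetical multiplier values
t = np.linspace(0.0, 1.0, 201)
dt = t[1] - t[0]
psi_dot = delta + sigma**2 * (b - lam * t)
psi = np.concatenate([[0.0], np.cumsum(0.5 * (psi_dot[1:] + psi_dot[:-1]) * dt)])
area = np.sum(0.5 * (psi[1:] + psi[:-1]) * dt)     # the A achieved by this path
print(round(psi.max(), 3), round(area, 3))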
Most Likely Paths - Examples
Exponent: I_X(ψ*) = ϑ̄⁺ ψ*(0+) + ∫₀¹ I_∆(ψ̇*(t)) dt
[Figure: optimal paths ψ*(t) for various A, for four increment laws: ±1 (MM1 queue), Gaussian, Poisson, and Geometric. Each path follows the template above: linear outside (T₀, T₁) and strictly concave inside.]
Most Likely Paths - Examples
Selected Sample Paths:
[Figure: two panels comparing "Theory" (the optimal path) with "Observed: X", for the RRW with Gaussian increments and for the MM1 queue.]
The observed path has the largest simulated mean out of 10⁸ sample paths.
Conclusions
Summary
It is widely known that simulation variance is high in "heavy traffic" for queueing models.
Large deviation asymptotics are exotic, regardless of load.
Sample-path behavior is identified for the RRW when the sample mean is large. This behavior is very different from that previously seen in "buffer overflow" analysis of queues.
What about other Markov models?
MM1 queue - the nicest of Markov chains
• Reversible
• Skip-free
• Monotone
• Marginal distribution geometric
• Geometrically ergodic
The skip-free property makes the fluid model analysis possible.
Similar behavior arises in
• Fixed-gain stochastic approximation
• Some MCMC algorithms
What Next?
• Heavy-tailed and/or long-memory settings?
• Variance reduction, such as control variates (a minimal sketch follows below)
[Figure: the running estimate η(n) with upper and lower control-variate estimators η⁺(n) and η⁻(n), plotted against n over 0 to 1000.]
M 2006; CTCN
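A minimal control-variate sketch in the spirit of the smoothed estimator (assumptions: the ±1 chain, and h taken to be the fluid value function x²/(2|E∆|), which is not necessarily the slide's exact choice):

import numpy as np

# In steady state pi(h - Ph) = 0 for any h, so subtracting h - Ph is an
# asymptotically unbiased control variate; h close to a solution of
# Poisson's equation reduces the variance dramatically.
rho = 0.9
p = rho / (1 + rho)
drift = 2 * p - 1                                  # E[Delta] < 0

def h(x): return x * x / (2 * abs(drift))          # fluid value function (assumption)
def Ph(x): return p * h(x + 1) + (1 - p) * h(max(0, x - 1))

rng = np.random.default_rng(5)
n, x, std_sum, cv_sum = 100_000, 0, 0.0, 0.0
for d in rng.choice([1, -1], size=n, p=[p, 1 - p]):
    x = max(0, x + d)
    std_sum += x
    cv_sum += x - (h(x) - Ph(x))
print("standard:", std_sum / n, "smoothed:", cv_sum / n, "true:", rho / (1 - rho))

For this chain h solves Poisson's equation away from the boundary, so the corrected summand is constant for x ≥ 1 and almost all of the variance disappears.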
What Next?
• Heavy-tailed and/or long-memory settings?
• Variance reduction, such as control variates
Performance as a function of safety stocks:
[Figure: two surface plots of estimated performance over the safety-stock parameters w, each ranging over 10 to 60: (i) simulation using the smoothed estimator; (ii) simulation using the standard estimator.]
Smoothed estimator using the fluid value function in the control variate: g = h − Ph = −Dh, with ĥ = J.
Henderson, M., and Tadić 2003; CTCN
What Next?
• Heavy-tailed and/or long-memory settings?
• Variance reduction, such as control variates
• Original question? Conjecture (a crude numerical probe follows below):
lim_{n→∞} (1/√n) log P{η(n) ≥ r} = −J_η(r) < 0,    r > η,
for some range of r
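A crude numerical probe of the conjectured √n speed (a sketch; plain Monte Carlo underflows quickly, so the probability is floored at 1/runs and the output is only suggestive):

import numpy as np

# If log P(eta(n) >= r) is of order -J*sqrt(n), the printed ratios
# should roughly stabilize as n grows.
rng = np.random.default_rng(6)

def eta_n(rho, n):
    p = rho / (1 + rho)
    s = np.cumsum(rng.choice([1, -1], size=n, p=[p, 1 - p]))
    return (s - np.minimum(np.minimum.accumulate(s), 0)).mean()

rho, r = 0.5, 2.0                    # eta = 1 for this chain, r > eta
for n in (50, 100, 200, 400):
    runs = np.array([eta_n(rho, n) for _ in range(5000)])
    prob = max((runs >= r).mean(), 1.0 / len(runs))   # guard against log(0)
    print(n, np.log(prob) / np.sqrt(n))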
References
[1] S. P. Meyn. Large deviation asymptotics and control variates for simulating large functions. Ann. Appl. Probab., 16(1):310-339, 2006.
[2] S. P. Meyn. Control Techniques for Complex Networks. Cambridge University Press, Cambridge, 2007.
[3] K. R. Duffy and S. P. Meyn. Most likely paths to error when estimating the mean of a reflected random walk. http://arxiv.org/abs/0906.4514, June 2009.
[4] I. Kontoyiannis and S. P. Meyn. Spectral theory and limit theorems for geometrically ergodic Markov processes. Ann. Appl. Probab., 13:304-362, 2003.
[5] I. Kontoyiannis and S. P. Meyn. Large deviations asymptotics and the spectral theory of multiplicatively regular Markov processes. Electron. J. Probab., 10(3):61-123 (electronic), 2005.
[6] S. G. Henderson, S. P. Meyn, and V. B. Tadić. Performance evaluation and policy selection in multiclass networks. Discrete Event Dynamic Systems: Theory and Applications, 13(1-2):149-189, 2003.
[7] G. Fort, S. Meyn, E. Moulines, and P. Priouret. ODE methods for skip-free Markov chain stability with applications to MCMC. Ann. Appl. Probab., 18(2):664-707, 2008.
[8] V. S. Borkar and S. P. Meyn. The ODE method for convergence of stochastic approximation and reinforcement learning. SIAM J. Control Optim., 38(2):447-469, 2000.