Time Inconsistent Optimal Control and Mean Variance Optimization

Time Inconsistent Optimal Control and
Mean Variance Optimization
Tomas Björk
Stockholm School of Economics
Agatha Murgoci
Copenhagen Business School
Xunyu Zhou
Oxford University
Conference in honour of
Walter Schachermayer
Wien 2010
– Typeset by FoilTEX –
1
Contents
• Recap of DynP.
• Problem formulation.
• Discrete time.
• Continuous time.
• Example: Dynamic mean-variance optimization.
– Typeset by FoilTEX –
2
Standard problem
We are standing at time t = 0 in state X0 = x0.
"Z
max
u
#
T
h(s, Xs, us)dt + F (XT )
E
0
dXt = µ(t, Xt, ut)dt + σ(t, Xt, ut)dWt
For simplicty we assume that
• X is scalar.
• The adapted control ut is scalar with no restrictions.
We denote this problem by P
We restrict ourselves to feedback controls of the form
ut = u(t, Xt).
– Typeset by FoilTEX –
3
Dynamic Programming
We embed the problem P in a family of problems Ptx
Ptx :
"Z
max
u
Et,x
#
T
h(s, Xs, us)dt + F (XT )
t
dXs = µ(t, Xs, us)ds + σ(s, Xs, us)dWs,
Xt = x
The original problem corresponds to P0,x0 .
– Typeset by FoilTEX –
4
Bellman
We now have the Bellman optimality principle, which says that the family
{Pt,x; t ≥ 0, x ∈ R} are time consistent.
More precisely: If û is optimal on the time interval [t, T ], then it is also
optimal on the sub-interval [s, T ] for every s with t ≤ s ≤ T .
We can easily derive the Hamilton-Jacobi-Bellman equation
HJB:
1
= 0,
Vt(t, x) + sup h(t, x, u) + µ(t, x, u)Vx(t, x) + σ 2(t, x, u)Vxx(t, x)
2
u
V (T, x) = F (x)
– Typeset by FoilTEX –
5
Three Disturbing Examples
Hyperbolic discounting (Ekeland-Lazrak-Pirvu)
"Z
max
u
#
T
ϕ(s − t)h(cs)ds + ϕ(T − t)F (XT )
Et,x
t
Mean variance utility (Basak-Chabakauri)
max
u
γ
Et,x [XT ] − V art,x (XT )
2
Endogenous habit formation
XT
max Et,x ln
−β
u
x
dXt = [rXt + (α − r)ut]dt + σutdWt
– Typeset by FoilTEX –
6
Moral
• These types of problems are not time consistent.
• We cannot use DynP.
• In fact, in these cases it is unclear what we mean
by “optimality”.
Possible ways out:
• Easy way: Dismiss the problem as being silly.
• Pre-commitment: Solve (somehow) the problem
P0,x0 and ignore the fact that later on, your
“optimal” control will no longer be viewed as
optimal.
• Game theory:
Take the time inconsistency
seriously. View the problems as a game and look
for a Nash equilibrium point.
We use the game theoretic approach.
– Typeset by FoilTEX –
7
Our Basic Problem
max
u
Et,x [F (x, XT )] + G x, Et,x [XT ]
dXs = µ(Xs, us)ds + σ(Xs, us)dWs,
Xt = x
This can be extended considerably.
For simplicity we will consider the easier problem
max
u
– Typeset by FoilTEX –
Et,x [F (XT )] + G Et,x [XT ]
8
The Game Theoretic Approach
• This is a bit delicate to formalize in continuous time.
• Thus we turn to discrete time, and then go to the
limit.
– Typeset by FoilTEX –
9
Discrete Time
Given: A controlled Markov process {Xn : n = 0, 1, . . . T }
At any time n we can change the transition probabilities
for Xn → Xn+1 by choosing a control value u ∈ R.
Players:
• For each point in time n there is a player – “player
No n” or “Pn”.
• Pn chooses the feedback control law un(Xn).
• A sequence of control laws u0, . . . , uT −1 is denoted
by u.
• Given a sequence u of control laws, the value
function for Pn is defined by
Jn(x, u) =
– Typeset by FoilTEX –
En,x [F(XuT)]
+G
En,x [XuT]
10
Subgame Perfect Nash Equilibrium
The value function for Pn was defined by
Jn(x, u) =
En,x [F (XTu )]
+G
En,x [XTu ]
We see that Jn(x, u) depends on (n, x) and
un, un+1, . . . , uT −1.
Definition:
• The control law û is an equilibrium strategy if the
following hold for each fixed n.
– Assume that Pk use ûk (·) for k = n+1, . . . , T −1
.
– Then it is optimal for player No n to use ûn(·).
• The equilibrium value function is defined by
Vn(x) = Jn(x, û)
– Typeset by FoilTEX –
11
The infinitesimal operator
T
Let {fn(·)}n=0 be a sequence of real valued functions.
Def: For a fixed control value u ∈ R, the infinitesimal operator Au, is
defined by
(Auf )n (x) = E [fn+1(Xn+1) − fn(x)| Xn = x, un = u]
Def: For a fixed control law u, the Au, is defined by
(Auf )n (x) = E [fn+1(Xn+1) − fn(x)| Xn = x, un = un(x)]
– Typeset by FoilTEX –
12
Important Idea
It turns out that a fundamental role is played by the
function sequence fn defined by
û fn(x) = En,x XT
where û is the equilibrium strategy.
The process fn(Xn) is of course a martingale under
the equilibrium control û so we have
Aûfn(x) = 0,
fT (x) = x.
– Typeset by FoilTEX –
13
Extending HJB
Proposition: The equilibrium value function Vn(x) and the function fn(x)
satisfy the system
sup {AuVn(x) − Au (G ◦ f )n (x) + (Huf )n (x)} = 0,
u
VT (x) = F (x) + G(x)
Aûfn(x) = 0,
fT (x) = x.
u
u
(H f )n (x) = G En,x fn+1 Xn+1
− G (fn(x)) ,
– Typeset by FoilTEX –
û fn(x) = En,x XT
14
Continuous Time
The discrete time results extend immediately to
continuous time.
• Now X is a controlled continuous time Markov
process with controlled infinitesimal generator
1
u
Et,x g(t + h, Xt+h) − g(t, x)
A g(t, x) = lim
h→0 h
u
• The extended HJB is now an equation with time
step [t, t + h].
• Divide the discrete time HJB equations by h and let
h → 0.
– Typeset by FoilTEX –
15
Extended HJB Continuous Time
Conjecture: The equilibrium value function satisfies the system
sup {AuV (t, x) − Au (G ◦ f ) (t, x) + G0 (f (t, x)) · Auf (t, x)} = 0,
u
Aûf (t, x) = 0,
V (T, x) = F (x) + G(x)
f (T, x) = x.
Note the fixed point character of the extended HJB.
– Typeset by FoilTEX –
16
General Problem
"Z
max
u
Et,x
– Typeset by FoilTEX –
T
#
C(t, x, Xs, us)ds + F (t, x, XT ) + G t, x, Et,x [XT ]
t
17
The general case
Z T
Z T
“
”
u s
u s,x
u
sup { A V (t, x) + C(x, x, u) −
(A c )t (x, x)ds +
(A c
)t (x)ds
t
t
u∈U
“
”
“
”
“
”
u
u x
u
u
− A f (t, x, x) + A f
(t, x) − A (G g) (t, x) + H g (t, x)}
=
0,
û y
A f (t, x)
=
0,
û
A g(t, x)
=
0,
û s,y
(A c
)t (x)
=
0,
V (T, x)
=
F (x, x) + G(x, x),
s,y
cs (x)
=
C(x, y, ûs (x)),
f (T, x, y)
=
F (y, x),
g(T, x)
=
x.
– Typeset by FoilTEX –
0≤t≤s
18
Optimal for what?
• In continuous time, it is not immediately clear how
to define an equilibrium strategy.
• We follow Ekeland et al and define the equilibrium
using spike variations.
– Typeset by FoilTEX –
19
HJB as a Necessary Condition
Conjecture: Assume that there exists and equilibrium
control û and define V and f as above. The V and f
satisfies the extended HJB system.
Note: It is probably very hard to prove this, due to
technical problems.
We do however have a converse result.
– Typeset by FoilTEX –
20
Verification Theorem
Theorem: Assume that V , f and û satisfies the
extended HJB system. Then V is the equilibrium
value function and û is the equilibrium control.
Proof: Not very hard, but a bit harder than for
standard DynP.
– Typeset by FoilTEX –
21
A useful Lemma
Consider a functional
J(t, x, u) = Et,x [F (x, XTu )] + G (x, Et,x [XTu ]) .
and denote the equilibrium control and value function
by û and V respectively. Let ϕ(x() be a given
deterministic real valued function and consider the
functional
J ϕ(t, x, u) = ϕ(x) {Et,x [F (x, XTu )] + G (x, Et,x [XTu ])}
Denoting the equilibrium control and value function by
ûϕ and V ϕ respectively we have
ûϕ(t, x) = û(t, x),
V ϕ(t, x) = ϕ(x)V (t, x)
– Typeset by FoilTEX –
22
Practical handling of the theory
• Make a parameterized Ansatz for V .
• Make a parameterized Ansatz for f .
• Plug everything into the extended HJB system and
hope to obtain a system of ODEs for the parameters
in the Ansatz.
• Alternatively, compute Lie symmetry groups.
– Typeset by FoilTEX –
23
Basak’s Example (in a simple version)
dSt = αStdt + σStdWt,
dBt = rBtdt
Xt = portfolio value process
u = amount of money invested in risky asset
Problem:
max
u
γ
Et,x [XT ] − V art,x (XT )
2
dXt = [rXt + (α − r)ut]dt + σutdWt
This corresponds to our standard problem with
γ 2
F (x) = x − x ,
2
– Typeset by FoilTEX –
γ 2
G(x) = x
2
24
Extended HJB
1
γ
Vt + sup [rXt + (α − r)u]Vx + σ 2u2Vxx − σ 2u2fx2
2
2
u
= 0
V (T, x) = x
Aûf
= 0
f (T, x) = x
Ansatz:
V (t, x) = g(t)x + h(t)
f (t, x) = A(t)x + B(t)
– Typeset by FoilTEX –
25
Result
The equilibrium value function and strategy are given
by
2
(α
−
r)
1
(T − t)
V (t, x) = er(T −t)x +
2γ σ 2
û(t, x) =
1 α − r −r(T −t)
e
γ σ2
r(T −t)
f (t, x) = e
– Typeset by FoilTEX –
1 (α − r)2
x+
(T − t)
2
γ σ
26
A Closer Look at Naive Mean Variance
We recall that for the Basak problem we have
1 α − r −r(T −t)
û(t, x) =
e
γ σ2
where u is the dollar amount invested in the risky asset.
Is this economically meaningful?
– Typeset by FoilTEX –
27
NO!
– Typeset by FoilTEX –
28
A Closer Look at Naive Mean Variance
We recall that for the Basak problem we have
û(t, x) =
1 α − r −r(T −t)
e
2
γ σ
• The control u is the number of dollars invested in
the risky asset. In the Basak case this is independent
of the level of wealth.
• You thus invest the same number of dollars in the
risky asset regardless of whether your wealth is 100
dollars or 10 billion dollars.
• Not so realistic.
– Typeset by FoilTEX –
29
Realistic Mean Variance
Idea:
We let the risk aversion coefficient γ depend on current
wealth.
max
u
γ(x)
V art,x (XT )
Et,x [XT ] −
2
dXt = [rXt + (α − r)ut]dt + σutdWt
This case is covered by our general theory
– Typeset by FoilTEX –
30
General Solution
The equilibrium control is given by
β fx(t, x, x) + γ(x)g(t, x)gx(t, x)
û(t, x) = − 2
.
σ fxx(t, x, x) + γ(x)g(t, x)gxx(t, x)
The functions f and g are determined by the system
1 2 2
ft(t, x, y) + (rx + β û) fx(t, x, y) + σ û fxx(t, x, y) = 0,
2
1
gt(t, x) + (rx + β û) gx(t, x) + σ 2û2gxx(t, x) = 0,
2
with boundary conditions
f (T, x, y) = x −
γ(y) 2
x ,
2
g(T, x) = x.
The equilibrium value function V is given by
γ(x) 2
V (t, x) = f (t, x, x) +
g (t, x).
2
– Typeset by FoilTEX –
31
A natural choice of γ(x)
1. Dimension analysis:
J(t, x, u) = Et,x [XTu ] −
γ(x)
V art,x [XTu ]
2
• The term Et,x [XTu ] has the dimension (dollar).
• The term V art,x [XTu ] has the dimension (dollar)2.
• We we have to choose γ in such a way that γ(x)
has the dimension (dollar)−1.
• The most obvious way to accomplish this is of course
to specify γ as
γ
γ(x) = ,
x
– Typeset by FoilTEX –
32
A natural choice of γ(x)
2. Back to basics.
• The original mean variance problem is posed in
terms of returns:
u
u
XT
γ(x)
XT
−
J(t, x, u) = Et,x
V art,x
x
2
x
• We can write this as
o
γ
1n
u
u
J(t, x, u) =
Et,x [XT ] − V art,x [XT ]
x
2x
• It now follows from previous Lemma that this will
lead to the same equilibrium control
J(t, x, u) =
with
– Typeset by FoilTEX –
Et,x [XTu ]
γ(x)
−
V art,x [XTu ]
2
γ
γ(x) = .
x
33
The Case γ(x) = 1/x
The equilibrium control is given by
û(t, x) = c(t)x
where c solves the integral equation
o
RT
β n − RtT [r+βc(s)+σ2c2(s)]ds
− t σ 2 c2s ds
c(t) =
e
+ γe
γσ 2
This equation has a unique solution and we provide a
convergent numerical algorithm to solve it.
– Typeset by FoilTEX –
34
Open Questions
• Martingale approach to equilibrium control?
• Convex duality theory?
• More examples.
• Existence and uniqueness of the extended HJB
system.
• Extension to optimal stopping.
– Typeset by FoilTEX –
35
Optimal for what?
In continuous time, it is not immediately clear how to
define an equilibrium strategy. We follow Ekeland et
al.
• Consider a fixed control law u.
• Fix (t, x) and a “small” time increment h.
• Choose an arbitrary real number u.
• Consider the control law uh(t, x) defined by
uh(s, y) =
û(s, y) for t + h ≤ s ≤ T
u for t ≤ s ≤ t + h
Def: The control law û is an equilibrium control if
J (t, x, û) − J (t, x, uh)
lim
≥0
h→0
h
for all choices of t, x, h, u.
– Typeset by FoilTEX –
36
Connection to Standard Problems
sup {AuV (t, x) − Au (G ◦ f ) (t, x) + G0 (f (t, x)) · Auf (t, x)} = 0,
u
• Assume that we know the equilibrium strategy û.
û • Since ft(x) = Et,x XT , we can now compute f
• Now define the function h(t, x, u) by
h(t, x, u) = (Huf ) (t, x) − Au (G ◦ f ) (t, x)
The extended HJB takes the form
sup {AuV (t, x) + h(t, x, u)} = 0,
V (T, x) = F (x) + G(x)
u
– Typeset by FoilTEX –
37
Equivalent Standard Problems
We obtained
sup {AuV (t, x) + h(t, x, u)} = 0,
u
V (T, x) = F (x) + G(x)
This is the HJB for the time consistent problem
"Z
max
u
Et,x
– Typeset by FoilTEX –
T
#
h(s, Xs, us)dt + F (XT ) + G(XT )
t
38
Equivalent Standard Problem
The Basak problem has the same optimal control as
the time consistent problem
"
max Et,x XT −
u
γσ
2
2Z
#
T
e2r(T −s)u2s ds
t
dXt = [rXt + (α − r)ut]dt + σutdWt
We note in passing that
σ 2u2t dt = dhXit
– Typeset by FoilTEX –
39