Turnpike property in optimal control
Emmanuel Trélat
Université Pierre et Marie Curie (Paris 6), Laboratoire Jacques-Louis Lions
Séminaire Parisien d'Optimisation, 16 January 2017
E. Trélat
Turnpike in optimal control
Turnpike property
The solution of an optimal control problem in large time should spend most of its
time near a steady-state.
In infinite horizon the solution should converge to that steady-state.
Historically: discovered in econometrics (von Neumann points).
The first turnpike result was obtained in 1958 by Dorfman, Samuelson and Solow, with a view to deriving efficient programs of capital accumulation, in the context of a von Neumann model in which labor is treated as an intermediate product.
Paul Samuelson (1915–2009), Nobel Prize in Economic Sciences, 1970
Excerpt from: Dorfman - Samuelson - Solow (1958)
Thus in this unexpected way, we have found a real normative significance for steady growth
– not steady growth in general, but maximal von Neumann growth. It is, in a sense, the
single most effective way for the system to grow, so that if we are planning long-run growth,
no matter where we start, and where we desire to end up, it will pay in the intermediate
stages to get into a growth phase of this kind.
It is exactly like a turnpike paralleled by a network of minor roads. There is a fastest route
between any two points; and if the origin and destination are close together and far from the
turnpike, the best route may not touch the turnpike. But if origin and destination are far
enough apart, it will always pay to get on to the turnpike and cover distance at the best rate
of travel, even if this means adding a little mileage at either end.
The best intermediate capital configuration is one which will grow most rapidly, even if it is
not the desired one, it is temporarily optimal.
- Turnpike theorems were derived in the 1960s for discrete-time optimal control problems arising in econometrics (McKenzie, 1963).
- Continuous-time versions by Haurie for particular dynamics (economic growth models). See also Carlson Haurie Leizarowitz 1991, Zaslavski 2000, Faulwasser Bonvin 2015.
- More recently, in biology: Rapaport 2005, Coron Gabriel Shang 2014; human locomotion: Chitour Jean Mason 2012; MPC: Grüne 2012-2014.
- Linear heat and wave equations: Porretta Zuazua 2013.
- Rockafellar 1973, Samuelson 1972: saddle point feature of the extremal equations of optimal control.
- Different point of view by Anderson Kokotovic (1987), Wilde Kokotovic (1972): exponential dichotomy property → hyperbolicity phenomenon.
General nonlinear optimal control problem
$f : \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}^n$ (dynamics),
$R = (R^1, \dots, R^k) : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}^k$ (terminal conditions),
$f^0 : \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}$ (instantaneous cost),
all of class $C^2$.
Optimal control problem (OCP)_T
For $T > 0$ fixed, find $u_T(\cdot) \in L^\infty(0, T; \mathbb{R}^m)$ such that
\[ \dot x(t) = f(x(t), u(t)), \qquad R(x(0), x(T)) = 0, \qquad \min \int_0^T f^0(x(t), u(t)) \, dt. \]
Examples of terminal conditions $R$: point-to-point, point-to-free, periodic, ...
Optimal (assumed) solution: $(x_T(\cdot), u_T(\cdot))$.
General nonlinear optimal control problem
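Pontryagin maximum principle ⇒ $\exists (\lambda_T(\cdot), \lambda^0_T) \neq (0, 0)$ such that
\[ \dot x_T(t) = \frac{\partial H}{\partial \lambda}(x_T(t), \lambda_T(t), \lambda^0_T, u_T(t)), \]
\[ \dot \lambda_T(t) = -\frac{\partial H}{\partial x}(x_T(t), \lambda_T(t), \lambda^0_T, u_T(t)), \]
\[ \frac{\partial H}{\partial u}(x_T(t), \lambda_T(t), \lambda^0_T, u_T(t)) = 0, \]
where
\[ H(x, \lambda, \lambda^0, u) = \langle \lambda, f(x, u) \rangle + \lambda^0 f^0(x, u). \]
Moreover we have the transversality conditions
\[ \begin{pmatrix} -\lambda_T(0) \\ \lambda_T(T) \end{pmatrix} = \sum_{i=1}^k \gamma_i \nabla R^i(x_T(0), x_T(T)). \]
(generic...) assumption made throughout: no abnormal ⇒ $\lambda^0_T = -1$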
Static optimal control problem
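\[ \min_{\substack{(x,u) \in \mathbb{R}^n \times \mathbb{R}^m \\ f(x,u) = 0}} f^0(x, u) \]
Optimal (assumed) solution: $(\bar x, \bar u)$.
Lagrange multipliers ⇒ $(\bar\lambda, \bar\lambda^0) \neq (0, 0)$ such that
\[ f(\bar x, \bar u) = 0, \]
\[ \bar\lambda^0 \frac{\partial f^0}{\partial x}(\bar x, \bar u) + \Big\langle \bar\lambda, \frac{\partial f}{\partial x}(\bar x, \bar u) \Big\rangle = 0, \]
\[ \bar\lambda^0 \frac{\partial f^0}{\partial u}(\bar x, \bar u) + \Big\langle \bar\lambda, \frac{\partial f}{\partial u}(\bar x, \bar u) \Big\rangle = 0, \]
i.e.
\[ \frac{\partial H}{\partial \lambda}(\bar x, \bar\lambda, \bar\lambda^0, \bar u) = 0, \qquad -\frac{\partial H}{\partial x}(\bar x, \bar\lambda, \bar\lambda^0, \bar u) = 0, \qquad \frac{\partial H}{\partial u}(\bar x, \bar\lambda, \bar\lambda^0, \bar u) = 0, \]
with $H(x, \lambda, \lambda^0, u) = \langle \lambda, f(x, u) \rangle + \lambda^0 f^0(x, u)$.
(generic...) assumption made throughout: no abnormal ⇒ $\bar\lambda^0 = -1$ (Mangasarian-Fromovitz)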
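(OCP)_T versus static optimal control problem
(OCP)_T:
\[ \dot x(t) = f(x(t), u(t)), \qquad R(x(0), x(T)) = 0, \qquad \min \int_0^T f^0(x(t), u(t)) \, dt \]
\[ \dot x_T(t) = \frac{\partial H}{\partial \lambda}(x_T(t), \lambda_T(t), -1, u_T(t)), \quad \dot\lambda_T(t) = -\frac{\partial H}{\partial x}(x_T(t), \lambda_T(t), -1, u_T(t)), \quad \frac{\partial H}{\partial u}(x_T(t), \lambda_T(t), -1, u_T(t)) = 0 \]
Static optimal control problem:
\[ \min_{\substack{(x,u) \in \mathbb{R}^n \times \mathbb{R}^m \\ f(x,u) = 0}} f^0(x, u) \]
\[ \frac{\partial H}{\partial \lambda}(\bar x, \bar\lambda, -1, \bar u) = 0, \quad -\frac{\partial H}{\partial x}(\bar x, \bar\lambda, -1, \bar u) = 0, \quad \frac{\partial H}{\partial u}(\bar x, \bar\lambda, -1, \bar u) = 0 \]
with $H(x, \lambda, \lambda^0, u) = \langle \lambda, f(x, u) \rangle + \lambda^0 f^0(x, u)$.
$(\bar x, \bar\lambda, \bar u)$: equilibrium point of the extremal equations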
It is expected that, for large time $T$, the optimal extremal solution $(x_T(\cdot), \lambda_T(\cdot), u_T(\cdot))$ of (OCP)_T approximately consists of 3 pieces:
1. short-time: $(x_T(0), \lambda_T(0), u_T(0)) \to (\bar x, \bar\lambda, \bar u)$ (transient arc, on $[0, \varepsilon]$)
2. long-time, stationary: $(\bar x, \bar\lambda, \bar u)$ (on $[\varepsilon, T - \varepsilon]$)
3. short-time: $(\bar x, \bar\lambda, \bar u) \to (x_T(T), \lambda_T(T), u_T(T))$ (transient arc, on $[T - \varepsilon, T]$)
Exponential turnpike
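Notation: $H_{*\#} = \frac{\partial^2 H}{\partial * \, \partial \#}(\bar x, \bar\lambda, -1, \bar u)$, and
\[ A = H_{x\lambda} - H_{u\lambda} H_{uu}^{-1} H_{xu}, \qquad B = H_{u\lambda}, \qquad W = -H_{xx} + H_{ux} H_{uu}^{-1} H_{xu}. \]
Theorem (Trélat Zuazua, JDE 2015)
Assume that:
- $H_{uu} < 0$, $W > 0$
- $\mathrm{rank}(B, AB, \dots, A^{n-1}B) = n$ (Kalman condition)
- $(\bar x, \bar\lambda)$ "almost satisfies" the terminal + transversality conditions
Then for $T > 0$ large enough:
\[ \| x_T(t) - \bar x \| + \| \lambda_T(t) - \bar\lambda \| + \| u_T(t) - \bar u \| \le C_1 \big( e^{-\nu t} + e^{-\nu (T - t)} \big) \qquad \forall t \in [0, T]. \]
Moreover:
\[ E_- A + A^* E_- - E_- B H_{uu}^{-1} B^* E_- - W = 0 \qquad \text{(minimal solution of Riccati)} \]
\[ E_+ A + A^* E_+ - E_+ B H_{uu}^{-1} B^* E_+ - W = 0 \qquad \text{(maximal solution of Riccati)} \]
\[ \nu = -\max \{ \mathrm{Re}(\mu) \mid \mu \in \mathrm{Spec}(A - B H_{uu}^{-1} B^* E_-) \} > 0. \]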
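To see what the quantities in the theorem look like, here is a worked scalar example. It is an illustrative assumption (not taken from the slides): $n = m = 1$, $f(x, u) = ax + bu$ with $b \neq 0$, and $f^0(x, u) = \frac12 (q x^2 + r u^2)$ with $q, r > 0$.

```latex
% Hamiltonian with \lambda^0 = -1
H(x, \lambda, -1, u) = \lambda (a x + b u) - \tfrac12 (q x^2 + r u^2)
% second derivatives
H_{uu} = -r < 0, \quad H_{x\lambda} = a, \quad H_{u\lambda} = b, \quad
H_{xu} = H_{ux} = 0, \quad H_{xx} = -q
% hence
A = a, \qquad B = b, \qquad W = q > 0, \qquad
\operatorname{rank}(B) = 1 \ \text{(Kalman condition, since } b \neq 0)
% Riccati equation  E A + A^* E - E B H_{uu}^{-1} B^* E - W = 0
2 a E + \frac{b^2}{r} E^2 - q = 0
\quad \Longrightarrow \quad
E_\pm = \frac{r}{b^2} \Big( -a \pm \sqrt{a^2 + b^2 q / r} \Big)
% decay rate
A - B H_{uu}^{-1} B^* E_- = a + \frac{b^2}{r} E_- = -\sqrt{a^2 + b^2 q / r}
\quad \Longrightarrow \quad
\nu = \sqrt{a^2 + b^2 q / r} > 0
```

In this scalar setting the rate $\nu$ increases when the state is penalized more heavily (larger $q$) or when the control is cheaper (smaller $r$).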
In some sense, this result shows that the following two procedures are equivalent:
1) (dynamic) control, then 2) $T \to +\infty$ ⇔ 1) $T \to +\infty$, then 2) (static) control
Exponential turnpike
Extension to infinite dimension:
(generalization of Porretta Zuazua 2013: linear heat and wave equations with internal control)
$X$, $U$ Hilbert spaces
$A : D(A) \to X$ operator generating a $C_0$ semigroup on $X$
$f : X \times U \to X$ (dynamics), of class $C^2$
$f^0 : X \times U \to \mathbb{R}$ (instantaneous cost), of class $C^2$
Optimal control problem (OCP)_T
For $T > 0$ fixed, find $u_T(\cdot) \in L^\infty(0, T; U)$ such that
\[ \dot y(t) = A y(t) + f(y(t), u(t)), \qquad y(0) = y_0, \qquad \min \int_0^T f^0(y(t), u(t)) \, dt. \]
PMP, optimal steady-state: same as before.
Exponential turnpike
Theorem (Trélat Zhang Zuazua 2016)
Assume that:
- $H_{uu} < 0$, $W = C^* C > 0$
- $(A, B)$ exponentially stabilizable
- $(A, C)$ exponentially detectable
Then there exist $\varepsilon > 0$, $\nu > 0$, $c > 0$ such that, $\forall T > 0$, if $\| y_0 - \bar y \|_X + \| \bar\lambda \|_X \le \varepsilon$ then any optimal extremal triple $(y^T(\cdot), u^T(\cdot), \lambda^T(\cdot))$ of (OCP)_T satisfies
\[ \| y^T(t) - \bar y \|_X + \| u^T(t) - \bar u \|_U + \| \lambda^T(t) - \bar\lambda \|_X \le c \big( e^{-\nu t} + e^{-\nu (T - t)} \big) \qquad \forall t \in [0, T]. \]
$\nu$ = exponential stability rate for a $C_0$ semigroup resulting from the (operator) algebraic Riccati equation.
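Example
Minimize
\[ \frac12 \int_0^T \!\! \int_\Omega |y(x, t) - y_d(x)|^2 \, dx \, dt + \frac12 \int_0^T \!\! \int_\omega |u(x, t)|^2 \, dx \, dt \]
subject to
\[ \begin{cases} y_t - \Delta y + y^3 = \chi_\omega u & \text{in } \Omega \times (0, T), \\ y = 0 & \text{on } \partial\Omega \times (0, T), \\ y(0) = y_0 & \text{in } \Omega. \end{cases} \]
Turnpike if $\| y_d \|_{L^2}$ and $\| y_0 \|_{L^2}$ are small enough.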
Particular case: linear quadratic
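(OCP)_T:
\[ \dot x(t) = A x(t) + B u(t), \qquad x(0) = x_0, \qquad x(T) = x_1 \]
\[ \min \frac12 \int_0^T \Big( (x(t) - x^d)^* Q (x(t) - x^d) + (u(t) - u^d)^* U (u(t) - u^d) \Big) dt \]
Extremal equations:
\[ \dot x_T(t) = A x_T(t) + B U^{-1} B^* \lambda_T(t) + B u^d, \qquad \dot\lambda_T(t) = Q x_T(t) - A^* \lambda_T(t) - Q x^d \]
Static optimal control problem:
\[ \min_{\substack{(x,u) \in \mathbb{R}^n \times \mathbb{R}^m \\ Ax + Bu = 0}} \frac12 \Big( (x - x^d)^* Q (x - x^d) + (u - u^d)^* U (u - u^d) \Big) \]
Optimality conditions:
\[ A \bar x + B U^{-1} B^* \bar\lambda + B u^d = 0, \qquad Q \bar x - A^* \bar\lambda - Q x^d = 0 \]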
Theorem
Assume that:
- $U = U^* > 0$, $Q = Q^* > 0$
- $\mathrm{rank}(B, AB, \dots, A^{n-1}B) = n$ (Kalman condition)
Then
\[ \| x_T(t) - \bar x \| + \| \lambda_T(t) - \bar\lambda \| + \| u_T(t) - \bar u \| \le C_1 \big( e^{-\nu t} + e^{-\nu (T - t)} \big) \qquad \forall t \in [0, T]. \]
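The Kalman condition is easy to check numerically. The sketch below (Python with NumPy, our choice of tooling; the helper name `kalman_rank` is hypothetical) builds the matrix $(B, AB, \dots, A^{n-1}B)$ and computes its rank. The pair $(A, B)$ used here is illustrative data, a controlled harmonic oscillator.

```python
import numpy as np

def kalman_rank(A, B):
    """Rank of the controllability matrix (B, AB, ..., A^{n-1} B)."""
    n = A.shape[0]
    blocks = [B]
    for _ in range(n - 1):
        blocks.append(A @ blocks[-1])   # next block A^k B
    return np.linalg.matrix_rank(np.hstack(blocks))

# Illustrative data (an assumption): x1' = x2, x2' = -x1 + u.
A = np.array([[0.0, 1.0], [-1.0, 0.0]])
B = np.array([[0.0], [1.0]])

rank = kalman_rank(A, B)
print(rank == A.shape[0])   # True when the Kalman condition holds
```
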
Example in LQ case
Example ($x(T)$ free ⇒ $\lambda(T) = 0$)
\[ \dot x_1(t) = x_2(t), \quad x_1(0) = 0, \qquad \dot x_2(t) = -x_1(t) + u(t), \quad x_2(0) = 0 \]
\[ \min \frac12 \int_0^T \Big( (x_1(t) - 2)^2 + (x_2(t) - 7)^2 + u(t)^2 \Big) dt \]
Optimal solution of the static problem:
\[ \min_{\substack{x_2 = 0 \\ x_1 = u}} \Big( (x_1 - 2)^2 + (x_2 - 7)^2 + u^2 \Big) \]
i.e. $\bar x_2 = 0$, $\bar x_1 = \bar u$, whence
\[ \bar x = (1, 0), \qquad \bar u = 1, \qquad \bar\lambda = (-7, 1). \]
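The static solution of this example can be recovered from the stationary extremal equations of the linear quadratic case, $A \bar x + B U^{-1} B^* \bar\lambda + B u^d = 0$ and $Q \bar x - A^* \bar\lambda - Q x^d = 0$. The following is a minimal NumPy sketch (variable names are ours) that solves this linear system with $Q = I$, $U = 1$, $x^d = (2, 7)$, $u^d = 0$.

```python
import numpy as np

# Data of the example: x1' = x2, x2' = -x1 + u, weights Q = I, U = 1,
# targets x^d = (2, 7), u^d = 0.
A = np.array([[0.0, 1.0], [-1.0, 0.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
U = np.array([[1.0]])
xd = np.array([2.0, 7.0])
ud = np.array([0.0])

S = B @ np.linalg.inv(U) @ B.T             # B U^{-1} B^*
M = np.block([[A, S], [Q, -A.T]])          # matrix of the extremal system
rhs = np.concatenate([-B @ ud, Q @ xd])    # right-hand side (-B u^d, Q x^d)

sol = np.linalg.solve(M, rhs)
x_bar, lam_bar = sol[:2], sol[2:]
u_bar = ud + np.linalg.inv(U) @ B.T @ lam_bar   # from dH/du = 0
print("x_bar =", x_bar, " u_bar =", u_bar, " lambda_bar =", lam_bar)
```

The computed triple should agree with $\bar x = (1, 0)$, $\bar u = 1$, $\bar\lambda = (-7, 1)$.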
[Figure: oscillation of $(x_1(\cdot), x_2(\cdot))$ around the steady-state $(1, 0)$]
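The turnpike behavior of this LQ example can be observed numerically. The sketch below is an illustration under our own choices (horizon $T = 30$, matrix exponential computed by eigendecomposition, all names hypothetical), not the computation used for the original figure. It solves the linear two-point boundary value problem for the deviations from the steady state, with $x(0) = 0$ and $\lambda(T) = 0$, and prints the distance to $(\bar x, \bar\lambda)$ at a few times.

```python
import numpy as np

A = np.array([[0.0, 1.0], [-1.0, 0.0]])
B = np.array([[0.0], [1.0]])
M = np.block([[A, B @ B.T], [np.eye(2), -A.T]])   # Q = I, U = 1
x_bar = np.array([1.0, 0.0])
lam_bar = np.array([-7.0, 1.0])
T = 30.0                                           # illustrative horizon

def expm(M, t):
    """exp(tM) via eigendecomposition (M is diagonalizable here)."""
    mu, V = np.linalg.eig(M)
    return (V @ np.diag(np.exp(mu * t)) @ np.linalg.inv(V)).real

# Deviations Z = (dx, dl) from the steady state solve Z' = M Z with
# dx(0) = x(0) - x_bar and dl(T) = lambda(T) - lam_bar = -lam_bar.
dx0 = np.zeros(2) - x_bar
dlT = -lam_bar
Phi = expm(M, T)
# dl(T) = Phi21 dx0 + Phi22 dl0  =>  solve for the unknown dl0.
dl0 = np.linalg.solve(Phi[2:, 2:], dlT - Phi[2:, :2] @ dx0)
Z0 = np.concatenate([dx0, dl0])

for t in (0.0, T / 4, T / 2, 3 * T / 4, T):
    Zt = expm(M, t) @ Z0
    dist = np.linalg.norm(Zt[:2]) + np.linalg.norm(Zt[2:])
    print(f"t = {t:5.1f}   |x - x_bar| + |lambda - lam_bar| = {dist:.2e}")
```

The distance is of order one near $t = 0$ and $t = T$ and much smaller in between, which is the two-sided exponential estimate of the theorem.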
Example in control-affine case
Example
\[ \dot x_1(t) = x_2(t), \quad x_1(0) = 1, \qquad \dot x_2(t) = 1 - x_1(t) + x_2(t)^3 + u(t), \quad x_2(0) = 1 \]
\[ \min \frac12 \int_0^T \Big( (x_1(t) - 1)^2 + (x_2(t) - 1)^2 + (u(t) - 2)^2 \Big) dt \]
Optimal solution of the static problem:
\[ \min_{\substack{x_2 = 0 \\ 1 - x_1 + x_2^3 + u = 0}} \Big( (x_1 - 1)^2 + (x_2 - 1)^2 + (u - 2)^2 \Big) \]
i.e. $\bar x_2 = 0$, $1 - \bar x_1 + \bar x_2^3 + \bar u = 0$, whence
\[ \bar x = (2, 0), \qquad \bar u = 1, \qquad \bar\lambda = (-1, -1). \]
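The static solution of this control-affine example can be checked by hand from the stationary extremal equations with $\lambda^0 = -1$. The derivation below is a verification sketch added for the reader.

```latex
H = \lambda_1 x_2 + \lambda_2 (1 - x_1 + x_2^3 + u)
    - \tfrac12 \big( (x_1 - 1)^2 + (x_2 - 1)^2 + (u - 2)^2 \big)
% constraint f(\bar x, \bar u) = 0
\bar x_2 = 0, \qquad \bar x_1 = 1 + \bar u
% reduced static cost in the variable u
\min_u \; u^2 + 1 + (u - 2)^2
\;\Longrightarrow\; \bar u = 1, \quad \bar x = (2, 0)
% \partial H / \partial u = 0
\bar\lambda_2 - (\bar u - 2) = 0 \;\Longrightarrow\; \bar\lambda_2 = -1
% \partial H / \partial x_2 = 0
\bar\lambda_1 + 3 \bar x_2^2 \bar\lambda_2 - (\bar x_2 - 1) = 0
\;\Longrightarrow\; \bar\lambda_1 = -1
% \partial H / \partial x_1 = 0 (consistency check)
-\bar\lambda_2 - (\bar x_1 - 1) = 1 - 1 = 0
```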
[Figure: oscillation of $(x_1(\cdot), x_2(\cdot))$ around the steady-state $(2, 0)$]
Proof in the LQ case
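Static problem:
\[ A \bar x + B U^{-1} B^* \bar\lambda + B u^d = 0, \qquad Q \bar x - A^* \bar\lambda - Q x^d = 0, \]
i.e.
\[ \underbrace{\begin{pmatrix} A & B U^{-1} B^* \\ Q & -A^* \end{pmatrix}}_{M} \begin{pmatrix} \bar x \\ \bar\lambda \end{pmatrix} = \begin{pmatrix} -B u^d \\ Q x^d \end{pmatrix}. \]
Extremal equations:
\[ \dot x_T(t) = A x_T(t) + B U^{-1} B^* \lambda_T(t) + B u^d, \qquad \dot\lambda_T(t) = Q x_T(t) - A^* \lambda_T(t) - Q x^d. \]
Setting $\delta x(t) = x_T(t) - \bar x$ and $\delta\lambda(t) = \lambda_T(t) - \bar\lambda$:
\[ \begin{cases} \delta\dot x(t) = A \, \delta x(t) + B U^{-1} B^* \, \delta\lambda(t) \\ \delta\dot\lambda(t) = Q \, \delta x(t) - A^* \, \delta\lambda(t) \end{cases} \]
i.e.
\[ \dot Z(t) = M Z(t) \qquad \text{with } Z(t) = \begin{pmatrix} \delta x(t) \\ \delta\lambda(t) \end{pmatrix}. \]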
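Shooting problem
\[ \dot Z(t) = M Z(t) \quad \text{with } Z(t) = \begin{pmatrix} \delta x(t) \\ \delta\lambda(t) \end{pmatrix}, \qquad \delta x(0) = x_0 - \bar x, \quad \delta x(T) = x_1 - \bar x, \quad \delta\lambda(0) \text{ unknown} \]
Key lemma
$M$ is Hamiltonian, i.e. $M \in \mathrm{sp}(n, \mathbb{R})$ (Lie algebra of the group $\mathrm{Sp}(n, \mathbb{R})$ of symplectic matrices), implying that:
\[ \mu \in \mathrm{Spec}(M) \Rightarrow -\mu, \ \bar\mu, \ -\bar\mu \in \mathrm{Spec}(M). \]
Moreover, under Kalman $(A, B)$: $\mathrm{Re}(\mu) \neq 0$ for every $\mu \in \mathrm{Spec}(M)$ (no purely imaginary eigenvalue) ⇒ $M$ hyperbolic.
Similar property in infinite dimension (under exponential stabilizability and detectability).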
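The key lemma can be checked numerically on a small case. The sketch below uses NumPy and, as illustrative data, the harmonic-oscillator LQ example with $Q = I$ and $U = 1$ (variable names are ours). It tests that $JM$ is symmetric, which characterizes Hamiltonian matrices for $J = \begin{pmatrix} 0 & I \\ -I & 0 \end{pmatrix}$, that the spectrum is symmetric with respect to the imaginary axis, and that no eigenvalue lies on that axis.

```python
import numpy as np

A = np.array([[0.0, 1.0], [-1.0, 0.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
U = np.array([[1.0]])
n = A.shape[0]

M = np.block([[A, B @ np.linalg.inv(U) @ B.T], [Q, -A.T]])
J = np.block([[np.zeros((n, n)), np.eye(n)], [-np.eye(n), np.zeros((n, n))]])

# M is Hamiltonian if and only if J M is symmetric.
is_hamiltonian = np.allclose(J @ M, (J @ M).T)

mu = np.linalg.eigvals(M)
# Symmetry of the spectrum: for each eigenvalue m, -conj(m) is an eigenvalue.
is_symmetric = all(np.min(np.abs(mu + np.conj(m))) < 1e-9 for m in mu)
# Hyperbolicity: distance from the spectrum to the imaginary axis.
gap = np.min(np.abs(mu.real))

print(is_hamiltonian, is_symmetric, gap)
```

For this example the characteristic polynomial of $M$ is $s^4 + s^2 + 2$, so the gap is about $0.676$.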
Proof: "dichotomy transformation", borrowed from Wilde - Kokotovic 1972
(in infinite dimension: see Lukes 1969, Sakamoto 2002)
\[ E_- A + A^* E_- - E_- B H_{uu}^{-1} B^* E_- - W = 0 \qquad \text{minimal solution of Riccati (1)} \]
\[ E_+ A + A^* E_+ - E_+ B H_{uu}^{-1} B^* E_+ - W = 0 \qquad \text{maximal solution of Riccati (2)} \]
(in the LQ case, $H_{uu} = -U$ and $W = Q$)
\[ P = \begin{pmatrix} I_n & I_n \\ E_- & E_+ \end{pmatrix} \quad \Rightarrow \quad P^{-1} M P = \begin{pmatrix} A + B U^{-1} B^* E_- & 0 \\ 0 & A + B U^{-1} B^* E_+ \end{pmatrix} \]
Moreover:
\[ (2) - (1) \ \Rightarrow \ (E_+ - E_-)(A + B U^{-1} B^* E_+) + (A + B U^{-1} B^* E_-)^* (E_+ - E_-) = 0 \]
$E_+ - E_-$ invertible ⇒ $\mathrm{Spec}(A + B U^{-1} B^* E_+) = -\mathrm{Spec}(A + B U^{-1} B^* E_-)$
$\mathrm{Re} \big( \mathrm{Spec}(A + B U^{-1} B^* E_-) \big) < 0$ by the algebraic Riccati theory
⇒ conclusion
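Setting
\[ Z(t) = \begin{pmatrix} I_n & I_n \\ E_- & E_+ \end{pmatrix} Z_1(t), \qquad Z_1(t) = \begin{pmatrix} v(t) \\ w(t) \end{pmatrix}, \]
we get
\[ \dot Z_1(t) = \begin{pmatrix} A + B U^{-1} B^* E_- & 0 \\ 0 & A + B U^{-1} B^* E_+ \end{pmatrix} Z_1(t) \]
⇒ purely hyperbolic:
\[ v'(t) = (A + B U^{-1} B^* E_-) v(t) \quad \to \ \mathrm{Re}(\text{eigenvalues}) < 0 \]
\[ w'(t) = (A + B U^{-1} B^* E_+) w(t) \quad \to \ \mathrm{Re}(\text{eigenvalues}) > 0 \]
whence
\[ \| v(t) \| \le \| v(0) \| e^{-\nu t}, \qquad \| w(t) \| \le \| w(T) \| e^{-\nu (T - t)}, \]
where
\[ \nu = -\max \{ \mathrm{Re}(\mu) \mid \mu \in \mathrm{Spec}(A + B U^{-1} B^* E_-) \} > 0. \]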
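The dichotomy transformation can be reproduced numerically. The sketch below is one possible way to do it (an assumption on our side, not the method of the talk): it obtains $E_-$ and $E_+$ from the stable and unstable invariant subspaces of $M$, using NumPy only, on the harmonic-oscillator LQ example with $Q = I$, $U = 1$. The helper names are hypothetical.

```python
import numpy as np

A = np.array([[0.0, 1.0], [-1.0, 0.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
S = B @ B.T                        # B U^{-1} B^* with U = 1
n = A.shape[0]
M = np.block([[A, S], [Q, -A.T]])

mu, V = np.linalg.eig(M)

def riccati_from_subspace(select):
    """E = Y X^{-1}, where the columns of (X; Y) span the selected eigenspace."""
    basis = V[:, select]
    X, Y = basis[:n], basis[n:]
    return (Y @ np.linalg.inv(X)).real

E_minus = riccati_from_subspace(mu.real < 0)   # stable subspace of M
E_plus = riccati_from_subspace(mu.real > 0)    # unstable subspace of M

def residual(E):
    # In the LQ case H_uu = -U and W = Q, so the Riccati equation reads
    # E A + A^* E + E B U^{-1} B^* E - Q = 0.
    return np.linalg.norm(E @ A + A.T @ E + E @ S @ E - Q)

nu = -np.max(np.linalg.eigvals(A + S @ E_minus).real)
print(residual(E_minus), residual(E_plus), nu)
```

Both residuals should be at rounding level, and $\nu$ is the decay rate appearing in the turnpike estimate for this example.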
Consequences for the numerical computations
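Direct methods (full discretization): initialization with the solution of the static problem ⇒ successful convergence
Indirect method (shooting): solve
\[ \dot z(t) = F(z(t)), \qquad G(z(0), z(T)) = 0 \]
Usual implementation: $z(0)$ unknown, tuned such that $G(z(0), z(T)) = 0$.
Here we propose the following variant: $z(T/2)$ is the unknown; $z(0)$ is obtained by backward integration from $T/2$ and $z(T)$ by forward integration from $T/2$; $z(T/2)$ is tuned such that $G(z(0), z(T)) = 0$.
\[ z(0) \ \xleftarrow{\ \text{backward integration}\ } \ z(T/2) \ \xrightarrow{\ \text{forward integration}\ } \ z(T) \]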
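The benefit of shooting from the middle can be illustrated on a linear case, where the flow is a matrix exponential and the shooting equation is a linear system. The sketch below is such an illustration under our own assumptions (harmonic-oscillator LQ example with $Q = I$, $U = 1$, $x(0) = 0$, $\lambda(T) = 0$, $T = 30$; all names hypothetical), not the implementation of the talk. It compares the sensitivity of the flow over $[0, T]$ with the sensitivity over half the horizon, then solves for $z(T/2)$.

```python
import numpy as np

A = np.array([[0.0, 1.0], [-1.0, 0.0]])
B = np.array([[0.0], [1.0]])
M = np.block([[A, B @ B.T], [np.eye(2), -A.T]])
dx0 = np.array([-1.0, 0.0])     # x(0) - x_bar
dlT = np.array([7.0, -1.0])     # lambda(T) - lam_bar, with lambda(T) = 0
T = 30.0

def flow(t):
    """exp(tM) via eigendecomposition (M is diagonalizable here)."""
    mu, V = np.linalg.eig(M)
    return (V @ np.diag(np.exp(mu * t)) @ np.linalg.inv(V)).real

# Usual shooting: unknown z(0); the sensitivity of z(T) to z(0) is exp(TM).
sens_usual = np.linalg.norm(flow(T), 2)

# Variant: unknown z(T/2); integrate backward to 0 and forward to T.
back, fwd = flow(-T / 2), flow(T / 2)
sens_variant = max(np.linalg.norm(back, 2), np.linalg.norm(fwd, 2))

# Boundary conditions: x-part of z(0) and lambda-part of z(T) are prescribed.
G = np.vstack([back[:2, :], fwd[2:, :]])
z_mid = np.linalg.solve(G, np.concatenate([dx0, dlT]))

print(f"sensitivity, usual shooting : {sens_usual:.2e}")
print(f"sensitivity, variant        : {sens_variant:.2e}")
print(f"|z(T/2)|                    : {np.linalg.norm(z_mid):.2e}")
```

The unknown $z(T/2)$ is close to zero in deviation variables, i.e. close to the steady state, which is why the static solution is a natural initial guess for the variant.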
Example in control-affine case
Example
\[ \dot x_1(t) = x_2(t), \quad x_1(0) = 1, \qquad \dot x_2(t) = 1 - x_1(t) + x_2(t)^3 + u(t), \quad x_2(0) = 1 \]
\[ \min \frac12 \int_0^T \Big( (x_1(t) - 1)^2 + (x_2(t) - 1)^2 + (u(t) - 2)^2 \Big) dt \]
Impossible to make the usual shooting method converge if $T > 3$ (explosive term + too high sensitivity).
Easy convergence with the variant, $\forall T$.
Periodic turnpike
$X$, $U$, $V$ Hilbert spaces
$A : D(A) \to X$ operator generating a $C_0$ semigroup on $X$
$B \in L(U, X)$ linear bounded control operator
$C \in L(X, V)$ linear bounded observation operator
$Q \in L(U, U)$ positive definite
Tracking trajectory: $y_d(\cdot) \in C([0, +\infty); X)$, $u_d(\cdot) \in L^2_{loc}(0, +\infty; U)$, $\Pi$-periodic:
\[ y_d(t + \Pi) = y_d(t), \qquad u_d(t + \Pi) = u_d(t) \qquad \forall t \]
Optimal control problem (OCP)_T
For $T > 0$ fixed, find $u_T(\cdot) \in L^2(0, T; U)$ such that
\[ \dot y(t) = A y(t) + B u(t), \qquad y(0) = y_0, \]
\[ \min \frac12 \int_0^T \Big( \| C(y(t) - y_d(t)) \|_V^2 + \langle Q(u(t) - u_d(t)), u(t) - u_d(t) \rangle_U \Big) dt \]
We replace the steady-state optimal control problem with:
Periodic optimal control problem
\[ \min \frac12 \int_0^\Pi \Big( \| C(y(t) - y_d(t)) \|_V^2 + \langle Q(u(t) - u_d(t)), u(t) - u_d(t) \rangle_U \Big) dt \]
\[ \dot y(t) = A y(t) + B u(t), \qquad y(0) = y(\Pi) \]
Theorem (Trélat Zhang Zuazua 2016)
If $(A, B)$ is exponentially stabilizable and $(A, C)$ is exponentially detectable, then:
- The periodic optimal control problem has a unique solution $(y^\Pi(\cdot), u^\Pi(\cdot))$, which has a unique extremal lift $(y^\Pi(\cdot), u^\Pi(\cdot), \lambda^\Pi(\cdot))$ (of which we have explicit expressions).
- There exist $c > 0$, $\nu > 0$ such that $\forall T > 0$,
\[ \| y^T(t) - y^\Pi(t) \|_X + \| u^T(t) - u^\Pi(t) \|_U + \| \lambda^T(t) - \lambda^\Pi(t) \|_X \le c \big( e^{-\nu t} + e^{-\nu (T - t)} \big) \qquad \forall t \in [0, T]. \]
$\nu$ = exponential stability rate for a $C_0$ semigroup resulting from the (operator) algebraic Riccati equation.
⇒ The optimal extremal is almost $\Pi$-periodic (except at the beginning and at the end).
The proof uses in particular a kind of "periodic Riccati theory":
Lemma
If $(A, B)$ is exponentially stabilizable and $(A, C)$ is exponentially detectable, then the unique solution of the periodic optimal control problem is
\[ y^\Pi(t) = z(t) - E q(t), \qquad \lambda^\Pi(t) = -P z(t) + (I + PE) q(t), \qquad u^\Pi(t) = u_d(t) + Q^{-1} B^* \lambda^\Pi(t) \]
with:
- $P \ge 0$ unique solution of the algebraic Riccati equation $A^* P + P A - P B Q^{-1} B^* P + C^* C = 0$
- $E \le 0$ unique solution of the Lyapunov equation $2 (A - B Q^{-1} B^* P) E - B Q^{-1} B^* = 0$
\[ z(t) = S(t) (I - S(\Pi))^{-1} \int_0^\Pi S(\Pi - \tau) \big( (I + EP) B u_d(\tau) - E C^* C y_d(\tau) \big) d\tau + \int_0^t S(t - \tau) \big( (I + EP) B u_d(\tau) - E C^* C y_d(\tau) \big) d\tau \]
\[ q(t) = S(\Pi - t)^* (I - S(\Pi)^*)^{-1} \int_0^\Pi S(\Pi - \tau)^* \big( -P B u_d(\Pi - \tau) + C^* C y_d(\Pi - \tau) \big) d\tau + \int_0^{\Pi - t} S(\Pi - t - \tau)^* \big( -P B u_d(\Pi - \tau) + C^* C y_d(\Pi - \tau) \big) d\tau \]
Example of periodic turnpike
\[ \min \frac12 \int_0^T \Big( (x(t) - \cos(2\pi t))^2 + (y(t) - \sin(2\pi t))^2 + u(t)^2 \Big) dt \]
\[ \dot x(t) = y(t), \quad x(0) = 0.1, \qquad \dot y(t) = u(t), \quad y(0) = 0 \]
$T = 20$, $\Pi = 1$
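For this example the periodic extremal can be computed in closed form, because the extremal system is linear with a sinusoidal forcing. The sketch below is our own illustration (it does not use the explicit formulas of the lemma): it writes the extremal equations as $\dot Z = M Z + \mathrm{Re}(F e^{i \omega t})$ with $Z = (x, y, \lambda_1, \lambda_2)$, $u = \lambda_2$, $\omega = 2\pi$, and looks for the $\Pi$-periodic solution $Z(t) = \mathrm{Re}(\hat Z e^{i \omega t})$. All names are hypothetical.

```python
import numpy as np

# Example data: x' = y, y' = u, tracking (cos 2*pi*t, sin 2*pi*t), unit weights.
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
M = np.block([[A, B @ B.T], [np.eye(2), -A.T]])
omega = 2.0 * np.pi

# Forcing (0, 0, -cos(omega t), -sin(omega t)) = Re((0, 0, -1, i) e^{i omega t}).
F = np.array([0.0, 0.0, -1.0, 1.0j])
# Periodic solution Z(t) = Re(Zhat e^{i omega t}) with (i omega I - M) Zhat = F.
Zhat = np.linalg.solve(1j * omega * np.eye(4) - M, F)

def periodic_extremal(t):
    """Candidate periodic extremal (x, y, lambda1, lambda2) at time t."""
    return (Zhat * np.exp(1j * omega * t)).real

for t in (0.0, 0.25, 0.5):
    x, y, l1, l2 = periodic_extremal(t)
    print(f"t = {t:.2f}: x = {x:+.4f}, y = {y:+.4f}, u = {l2:+.4f}")
```

The linear system is solvable because $M$ has no purely imaginary eigenvalue; the optimal trajectory on $[0, T]$ is then expected to stay close to this periodic extremal away from $t = 0$ and $t = T$.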
Shape turnpike
(ongoing works with Can Zhang and Enrique Zuazua)
First toy model:
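$D \subset \mathbb{R}^2$ nonempty bounded open, $\omega \subset D$ open subset
\[ \mathcal{O}^\omega_N = \{ \Omega \in \mathcal{O} \mid \Omega \supset \omega, \ \# \Omega^c \le N \} \qquad \forall N \in \mathbb{N} \]
$\mathcal{O}^\omega_N$ is compact for the complementary Hausdorff topology, defined by the distance
\[ d_{H^c}(\Omega_1, \Omega_2) = \max \Big( \max_{x \in \Omega_1^c} \min_{y \in \Omega_2^c} \| x - y \|, \ \max_{x \in \Omega_2^c} \min_{y \in \Omega_1^c} \| x - y \| \Big) \qquad \forall \Omega_i \in \mathcal{O} \]
(Sverak 1993)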
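As a concrete reading of the distance formula, the sketch below evaluates it on finite point samples standing in for the two complements. This is an assumption made for illustration only (the actual complements are closed sets, not finite samples), and the helper names are hypothetical.

```python
from math import dist

def directed(P, Q):
    """max over x in P of the distance from x to the set Q."""
    return max(min(dist(x, y) for y in Q) for x in P)

def hausdorff(P, Q):
    """Hausdorff distance between two finite point sets P and Q."""
    return max(directed(P, Q), directed(Q, P))

# Hypothetical finite samples of two complements Omega_1^c and Omega_2^c.
comp1 = [(0.0, 0.0), (1.0, 0.0)]
comp2 = [(0.0, 0.0), (1.0, 0.5), (3.0, 0.0)]
print(hausdorff(comp1, comp2))   # 2.0: the point (3, 0) is at distance 2 from comp1
```
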
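$T > 0$, $y_0 \in L^2(D)$, $f \in L^2(D)$, $z \in H^1(\omega)$
(OCP_T)
\[ \min_{\Omega \in \mathcal{O}^\omega_N} J^T(\Omega) = \frac1T \int_0^T \!\! \int_\omega \Big( |y(x, t) - z(x)|^2 + |\nabla y(x, t) - \nabla z(x)|^2 \Big) dx \, dt \]
\[ \begin{cases} \partial_t y - \Delta y = f & \text{in } \Omega \times (0, T), \\ y = 0 & \text{on } \partial\Omega \times (0, T), \\ y(\cdot, 0) = y_0 & \text{in } \Omega. \end{cases} \]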
See also Henrot - Sokolowski 1998, Allaire - Münch - Periago 2010
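$T > 0$, $y_0 \in L^2(D)$, $f \in L^2(D)$, $z \in H^1(\omega)$
Optimal control problem (OCP_T)
\[ \min_{\Omega \in \mathcal{O}^\omega_N} J^T(\Omega) = \frac1T \int_0^T \!\! \int_\omega \Big( |y(x, t) - z(x)|^2 + |\nabla y(x, t) - \nabla z(x)|^2 \Big) dx \, dt \]
\[ \begin{cases} \partial_t y - \Delta y = f & \text{in } \Omega \times (0, T), \\ y = 0 & \text{on } \partial\Omega \times (0, T), \\ y(\cdot, 0) = y_0 & \text{in } \Omega. \end{cases} \]
Static (elliptic) optimal control problem
\[ \min_{\Omega \in \mathcal{O}^\omega_N} J^s(\Omega) = \int_\omega \Big( |y(x) - z(x)|^2 + |\nabla y(x) - \nabla z(x)|^2 \Big) dx \]
\[ \begin{cases} -\Delta y = f & \text{in } \Omega, \\ y = 0 & \text{on } \partial\Omega. \end{cases} \]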
Trélat Zhang Zuazua 2017
- There exists $C > 0$ such that $|J^T - J^s| \le C \Big( \frac{1}{\sqrt{T}} + \frac{1}{T} \Big)$ $\forall T > 0$.
- If $T \to +\infty$ then any closure point of minimizers of (OCP_T) (in complementary Hausdorff topology) is a minimizer of the static problem.
Comments:
- The proof uses Γ-convergence arguments, and the exponential decay of the energy of the heat equation without forcing term.
- It works only for time-independent forcing terms $f$.
- Compactness can be obtained for other classes of domains: uniform bound in BV, perimeter, cone condition, etc.
- No convergence rate as $T \to +\infty$ for the state.
- Done here only for 2D heat equations, with $y(T)$ free.
Further comments
Competition between turnpikes (see also Rapaport Cartigny)
Interaction with discretizations (see Grüne)
Turnpike for PDEs with unbounded control operator:
- No general result.
- Gugat Trélat Zuazua SCL 2016: 1D wave equation, Neumann boundary control
Ongoing works with Can Zhang and Enrique Zuazua:
- Weaker "measure" turnpike results for dissipative systems involving state and/or control constraints (see also Bonvin Faulwasser)
- Turnpikes in optimal design, adiabatic theory ($\sigma$: shape, for instance):
\[ \dot x(t) = A_{\sigma(t)} x(t) + b, \qquad \min_{\sigma \in \Sigma} \int_0^T \Big( \| x(t) - x^d \|^2 + \| A_{\sigma(t)} \|^2 \Big) dt \]
\[ \min_{\substack{\sigma \in \Sigma \\ A_\sigma x + b = 0}} \Big( \| x - x^d \|^2 + \| A_\sigma \|^2 \Big) \]
Competition between two global turnpikes
\[ \min \int_0^T \Big( (x(t) - 1)^2 + (u(t) - 3.47197)^2 \Big) dt \]
\[ \dot x(t) = -3 x(t) + 3 x(t)^3 + u(t), \qquad x(0) = x_0, \qquad x(T) = x_f \text{ or free} \]
[Figure: plot of $x \mapsto (x - 1)^2 + (3x - 3x^3 - 3.47197)^2$]
The value $u_d = 3.47197$ is chosen so that the static optimal control problem
\[ \min_{(x, u) \, \mid \, -3x + 3x^3 + u = 0} \Big( (x - 1)^2 + (u - 3.47197)^2 \Big) \]
has two global minima:
\[ \bar x_1 = -1.3473, \qquad \bar x_2 = 0.5939 \]
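The two minimizers can be recovered with a few lines of plain Python. The sketch below is an illustration with our own choices (search interval $[-2, 2]$, grid scan followed by golden-section refinement, hypothetical function names); it eliminates $u$ through the cubic constraint $u = 3x - 3x^3$ and minimizes the resulting function of $x$.

```python
def static_cost(x, u_d=3.47197):
    """Static cost with u eliminated through the constraint u = 3x - 3x^3."""
    return (x - 1.0) ** 2 + (3.0 * x - 3.0 * x ** 3 - u_d) ** 2

def local_minima(cost, lo=-2.0, hi=2.0, n=4000):
    """Grid scan for local minima, each refined by golden-section search."""
    h = (hi - lo) / n
    xs = [lo + i * h for i in range(n + 1)]
    g = (5 ** 0.5 - 1) / 2
    found = []
    for i in range(1, n):
        if cost(xs[i]) < cost(xs[i - 1]) and cost(xs[i]) <= cost(xs[i + 1]):
            a, b = xs[i - 1], xs[i + 1]      # bracket around the grid minimum
            while b - a > 1e-10:
                c, d = b - g * (b - a), a + g * (b - a)
                if cost(c) < cost(d):
                    b = d
                else:
                    a = c
            found.append(0.5 * (a + b))
    return found

minima = local_minima(static_cost)
for x in minima:
    print(f"x = {x:.4f}, u = {3 * x - 3 * x ** 3:.4f}, cost = {static_cost(x):.5f}")
```

With the rounded value $u_d = 3.47197$ the two costs agree to about three decimal places, which is why both steady states compete as turnpikes.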
Global solutions of the static problem: $\bar x_1 = -1.3473$ and $\bar x_2 = 0.5939$.
- $x_0 = -5$, $x_f = -1$, $T = 10$ → Turnpike around $\bar x_1 = -1.3473$
- $x_0 = 2$, $x_f = 1$, $T = 10$ → Turnpike around $\bar x_2 = 0.5939$
Local versus global turnpike
\[ \min \int_0^T \Big( (x(t) - 1)^2 + (u(t) - 1)^2 \Big) dt \]
\[ \dot x(t) = -3 x(t) + 3 x(t)^3 + u(t), \qquad x(0) = x_0, \qquad x(T) = x_f \text{ or free} \]
[Figure: plot of $x \mapsto (x - 1)^2 + (3x - 3x^3 - 1)^2$]
The static optimal control problem has
- a unique globally optimal solution $\bar x = 0.7815$
- a locally optimal solution $\bar x_{loc} = -1.1055$
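Since the static cost, after eliminating $u = 3x - 3x^3$, is a polynomial in $x$, its critical points can be obtained from polynomial root finding. The sketch below uses NumPy's `poly1d` (our choice of tooling; variable names are ours) and classifies the real critical points with the second derivative.

```python
import numpy as np

# Static cost with u eliminated through u = 3x - 3x^3, and u_d = 1.
u_of_x = np.poly1d([-3.0, 0.0, 3.0, 0.0])
cost = np.poly1d([1.0, -1.0]) ** 2 + (u_of_x - 1.0) ** 2

crit = cost.deriv().roots
crit = np.sort(crit[np.abs(crit.imag) < 1e-9].real)    # real critical points
minima = [x for x in crit if cost.deriv(2)(x) > 0]     # second-order test

x_glob = min(minima, key=cost)
for x in minima:
    tag = "global" if x == x_glob else "local"
    print(f"{tag:6s} minimum: x = {x:.4f}, cost = {cost(x):.4f}")
```
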
Same problem. Global solution $\bar x = 0.7815$. Local solution $\bar x_{loc} = -1.1055$.
$x_0 = -2$, $x_f = -1$, $T = 10$:
- Initialization with the constant trajectory $\bar x_{loc}$ → Local turnpike around $\bar x_{loc} = -1.1055$. Cost $C = 4.19$.
- Initialization with $\bar x$ → Global turnpike around $\bar x = 0.7815$. Cost $C = 2.61$.
Same problem, with $x_0 = -2$, $x_f = -1$, $T = 2$ (quite small):
- Initialization with the constant trajectory $\bar x_{loc}$: cost $C = 8.88$ → Globally optimal!
- Initialization with $\bar x$: cost $C = 9.64$ → Locally but not globally optimal!
Same problem, with $x_0 = -2$, $x_f = -1$, $T \in \{2.5, 2.7, 2.9, 3.1, 3.3\}$.
Global optimal trajectory: bifurcation at $T = T_0 \simeq 2.9$.
- $T < T_0$ ⇒ turnpike around $\bar x_{loc}$
- $T > T_0$ ⇒ turnpike around $\bar x$
→ in accordance with the global turnpike result for $T$ large enough