Stochastic Optimal Control Problems
Part I: Deterministic Case
Hasnaa Zidani
ENSTA-Paris, University Paris-Saclay
IMPA, June 20-24, 2016
H. Zidani (ENSTA ParisTech) · Stochastic Optimal Control Problems · SVAN'2016
Outline
1. Controlled differential systems
2. A Direct Numerical approach
3. Optimality conditions: Pontryagin principle
Outline
1. Controlled differential systems
   - Introduction and Examples
   - State equation
   - Existence of optimal solutions
2. A Direct Numerical approach
   - Discrete Optimal Control Problem
   - Example
   - State of the art
3. Optimality conditions: Pontryagin principle
y: state of the system
u: control input

Goal: find a control law and its corresponding trajectory that optimize some performance criterion of the system while complying with prescribed constraints (physical or economic constraints on the control and/or the state).
Consider the problem of minimizing the cost function

    ∫₀ᵀ ℓ(y_t, u_t) dt + φ(y_0, y_T)   subject to:  ẏ_t = f(y_t, u_t),  t ∈ (0, T),

and the constraints:
- Control constraints: c(u_t) ≤ 0, t ∈ (0, T),
- State constraints: g(y_t) ≤ 0, t ∈ (0, T),
- Mixed state and control constraints: c(u_t, y_t) ≤ 0, t ∈ (0, T),
- Initial-final equality and inequality constraints:

    Φ_i(y_0, y_T) = 0,   i = 1, ..., r₁,
    Ψ_i(y_0, y_T) ≤ 0,   i = r₁ + 1, ..., r.
Function spaces: Control and state spaces

    U := L^∞(0, T; R^m);    Y := W^{1,∞}(0, T; R^d).

Their extension to Hilbert spaces:

    U₂ := L²(0, T; R^m);    Y₂ := H¹(0, T; R^d).
The space race: Goddard problem

Example (Goddard)

    ḣ(t) = v(t),              h(0) = 0,
    v̇(t) = u(t)/m(t) − g,     v(0) = 0,
    ṁ(t) = −b u(t),           m(0) = m_o,

where h(t): altitude, v(t): velocity, m(t): mass, u(t): thrust.

- The thrust u(t) is subject to: 0 ≤ u(t) ≤ u_max.
- The rocket's mass satisfies the constraint: m₁ ≤ m(t) ≤ m₂(t).

The optimal control problem is the following:

    Max h(T)
    u(t) ∈ [0, u_max],  (h, v, m) satisfies the ODE,
    m₁ ≤ m(t) ≤ m₂(t),  t ≥ 0.
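As a quick numerical illustration, the Goddard dynamics can be integrated by forward Euler under a simple thrust law. The sketch below is not the optimal control: it burns at full thrust until the mass bound m₁ is reached, then coasts, and all parameter values are illustrative, not taken from the lecture.

```python
# Sketch: forward Euler simulation of the Goddard dynamics under a
# hypothetical bang-off control: full thrust while fuel remains
# (m > m1), then coast. Normalized, illustrative parameter values.

def simulate_goddard(u_max=3.5, g=1.0, b=0.5, m0=1.0, m1=0.6,
                     T=1.0, n_steps=1000):
    dt = T / n_steps
    h, v, m = 0.0, 0.0, m0
    for _ in range(n_steps):
        u = u_max if m > m1 else 0.0     # burn until the mass bound
        h += dt * v                      # altitude:  h' = v
        v += dt * (u / m - g)            # velocity:  v' = u/m - g
        m = max(m1, m - dt * b * u)      # mass:      m' = -b u, clamped at m1
    return h, v, m
```

With these values the rocket burns out early and then decelerates under gravity; the final mass sits exactly at the bound m₁.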
Launcher's problem: Ariane 5

- Steer the launcher from Kourou to the GEO
- State variables (r, v) ∈ R³ × R³:

      ṙ = v
      v̇ = P⃗ + F⃗_T(r, v, u) − F⃗_D(r, v, u);

  with u ∈ R³ the thrust force (control input).
- State constraints: heat flux, limited propellant capacity, target constraint (GEO)

Objective function: maximization of the payload.
Standing assumptions
Assume the set of admissible control inputs is:
Uad := {u ∈ U; ut ∈ U on (0, T )}.
(A0) U is a closed set in Rm .
(A1) f : Rd × Rm −→ Rd is loc. Lipschitz continuous.
(A2) For every x ∈ Rd , f (x, U) is a convex set of Rd .
Proposition

Assume (A0)-(A1). Let x ∈ R^d.
i) For every u ∈ U_ad, there exists y^u ∈ H¹([0, T]; R^d) solution of the equation:

    ẏ_t^u = f(y_t^u, u_t),    y_0^u = x.

ii) Moreover, the mapping defined by

    T(·) : L²(0, T; R^m) −→ H¹(0, T; R^d)
           u ↦ T(u) := y^u

is continuous.

Define the set of admissible trajectories:

    S_{[0,T]}(x) := { y | ∃u ∈ U_ad, ẏ_t = f(y_t, u_t), y_0 = x }
Under (A0)-(A2) and if U is a compact set,
- S_{[0,T]}(x) is a compact set in W^{1,1} endowed with the C⁰-topology.
  This result is a consequence of Filippov's theorem; see the books of Vinter (2010) or Aubin-Cellina (1984).
- the set-valued map x ↦ S_{[0,T]}(x) is Lipschitz continuous:

    ∃L > 0,  S_{[0,T]}(x) ⊂ S_{[0,T]}(z) + L|x − z| B_{W^{1,1}}   ∀x, z ∈ R^d.
Example (1)

    Min ∫₀¹ y²(t) dt
    ẏ(t) = u(t),  y(0) = 0,  u(t) ∈ {−1, 1}

Consider the chattering sequence

    u_n(t) = {  1   on (2k/2n, (2k+1)/2n)
             { −1   on ((2k+1)/2n, (2k+2)/2n)

    y_n(t) = {  t − k/n        on (2k/2n, (2k+1)/2n)
             { −t + (k+1)/n    on ((2k+1)/2n, (2k+2)/2n)

This simple problem doesn't admit a solution:

    y_n → 0,   but y ≡ 0 is not admissible!
    ‖u_n‖_{L^∞}, ‖u_n‖_{L²} = 1 ↛ 0
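The behaviour of the chattering sequence can be checked numerically: integrating ẏ = u_n by Euler on a grid that resolves every switch (the grid resolution below is an illustrative choice) shows the cost ∫₀¹ y_n² dt tending to 0 while the controls stay at |u_n| = 1.

```python
# Sketch: evaluate the cost of the chattering controls u_n, which
# alternate between +1 and -1 on 2n equal sub-intervals of (0, 1).

def chatter_cost(n, steps_per_interval=100):
    n_sub = 2 * n * steps_per_interval     # Euler steps, switch-aligned
    dt = 1.0 / n_sub
    y, cost = 0.0, 0.0
    for i in range(n_sub):
        k = i // steps_per_interval        # index of the sub-interval
        u = 1.0 if k % 2 == 0 else -1.0    # chattering control: +1, -1, ...
        cost += dt * y * y                 # accumulate ∫ y² dt
        y += dt * u                        # Euler step of y' = u
    return cost
```

The cost decays like 1/(12n²), so it vanishes as n → ∞ even though no admissible control achieves cost 0.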
Example (1')

    Min ∫₀¹ y²(t) dt
    ẏ(t) = u(t),  y(0) = 0,  u(t) ∈ [−1, 1]

with the same chattering sequence:

    u_n(t) = {  1   on (2k/2n, (2k+1)/2n)
             { −1   on ((2k+1)/2n, (2k+2)/2n)

    y_n(t) = {  t − k/n        on (2k/2n, (2k+1)/2n)
             { −t + (k+1)/n    on ((2k+1)/2n, (2k+2)/2n)

The relaxed control problem admits a solution!

    y_n → 0,   and y ≡ 0 is admissible
    ‖u_n‖_{L^∞}, ‖u_n‖_{L²} = 1 ↛ 0
"First discretize and then optimize"

Consider a general control problem:

    Min φ(y_T) + ∫₀ᵀ ℓ(y_t, u_t) dt
    subject to:
        ẏ_t = f(y_t, u_t),  t ∈ (0, T),  y_0 = x
        c(u_t) ≤ 0,  t ∈ (0, T),
        g(y_t) ≤ 0,  t ∈ (0, T),
        c(u_t, y_t) ≤ 0,  t ∈ (0, T),
        Φ_i(y_0, y_T) = 0,   i = 1, ..., r₁,
        Ψ_i(y_0, y_T) ≤ 0,   i = r₁ + 1, ..., r.
The Euler discretization

- N: number of time steps, h_k > 0 duration of the k-th time step
- Steps begin at time t₀ = 0, and for k = 1 to N, t_k = Σ_{j=0}^{k−1} h_j
- State equation: y_{k+1} = y_k + h_k f(y_k, u_k),  k = 0, ..., N − 1.
- Cost function: φ(y_N) + Σ_{k=0}^{N−1} h_k ℓ(y_k, u_k)
- Running constraints:

    c(u_k) ≤ 0;  g(y_k) ≤ 0;  c(u_k, y_k) ≤ 0,   k = 1, ..., N − 1.

- Final equality and inequality constraints:

    Φ_i(y_0, y_N) = 0,   i = 1, ..., r₁,
    Ψ_i(y_0, y_N) ≤ 0,   i = r₁ + 1, ..., r.
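The Euler transcription can be sketched in a few lines: given step sizes h_k and a control sequence u_k, roll out the discrete state equation and accumulate the discrete cost. Here f, ℓ, φ are placeholders supplied by the caller; the running-cost sum mirrors the integral in the continuous problem.

```python
# Sketch of the Euler transcription: roll out
#   y_{k+1} = y_k + h_k f(y_k, u_k)
# and evaluate the discrete cost  φ(y_N) + Σ_k h_k ℓ(y_k, u_k).

def euler_rollout(f, ell, phi, y0, controls, steps):
    y = y0
    cost = 0.0
    for u, h in zip(controls, steps):
        cost += h * ell(y, u)      # running cost, left-endpoint rule
        y = y + h * f(y, u)        # explicit Euler state update
    return y, cost + phi(y)        # add the final cost φ(y_N)
```

For example, with f(y, u) = u, ℓ = y², φ = 0, u ≡ 1 and uniform steps on [0, 1], the rollout reproduces y(1) = 1 and a Riemann-sum approximation of ∫₀¹ t² dt = 1/3.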
- Some control problems are "naturally" described by controlled discrete dynamics.
- Indeed, in some cases the control variable can be modified only at very specific dates (daily, monthly, ...).
- In this case, the time schedule is fixed and the control problem is already in the form of a finite-dimensional optimization problem.
Example: A production problem

- y_t: amount of steel produced at time t.
- 0 ≤ u_t ≤ 1 is the fraction of the steel produced at time t that is allocated to investment.
- The part of y_t allocated to investment is used to increase the production capacity according to:

      dy_t/dt = k u_t y_t,

  where y_0 = A is the initial production and k is the coefficient of increase in production.
- The optimal control problem consists in choosing u in an optimal way, so as to maximize the production allocated to consumption during a fixed time horizon T.
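A numerical sketch of this model, with illustrative values k = 1, A = 1, T = 2: simulate ẏ = k u y by Euler and accumulate the consumed output ∫₀ᵀ (1 − u_t) y_t dt for candidate policies. The switch policy below (invest fully, then consume fully) is only a plausible candidate for comparison, not a claimed optimum.

```python
# Sketch of the production model: y' = k u y, with the consumed
# output ∫ (1 - u_t) y_t dt as the quantity to maximize.
# Parameter values are illustrative.

def consumption(policy, k=1.0, A=1.0, T=2.0, n=2000):
    dt = T / n
    y, total = A, 0.0
    for i in range(n):
        u = policy(i * dt)
        total += dt * (1.0 - u) * y    # fraction (1-u) of output is consumed
        y += dt * k * u * y            # fraction u grows the capacity
    return total
```

Comparing a constant mixing policy u ≡ 0.5 with an invest-then-consume switch at t = 1 shows the switch policy consuming noticeably more over the horizon.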
Questions

For a continuous-time control problem:
- How is the discretized version related to the original continuous control problem?
- Given a nominal local solution (ū, ȳ) of the original problem:
  - Does the discretized problem have a solution (u_h, y_h) near (ū, ȳ)?
  - Can we expect an error of order ‖u_h − ū‖ + ‖y_h − ȳ‖ = O(h), where h := max_k h_k?
  - Is it reasonable to assume that the solution is (piecewise) smooth?
- How do we solve the discretized problem?
Example: double integrator (I)

Consider the very simple example with constraints on the control:
- Dynamics: ÿ_t = u_t ∈ [−1, 1]
- Optimization problem: reach the zero state in minimal time

[Figure: optimal trajectories in the (y, ẏ) phase plane; both axes range from −2.0 to 2.0.]
Example: double integrator (I)

- Solution: bang-bang optimal control, with at most one switching time
- The discretized solution is of the same nature (the costate is an affine function of time)
- Error only due to the switching time step
- Expected error: at most O(h)

Ref. Alt, Baier, Gerdts, Lempio, Error bounds for Euler approximation of linear-quadratic control problems with bang-bang solutions, 2012.
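The one-switch structure can be verified numerically. For an initial state (y₀, 0) with y₀ > 0, the candidate control u = −1 on [0, √y₀) followed by u = +1 on [√y₀, 2√y₀] steers the double integrator to the origin at time 2√y₀. The sketch below integrates this candidate with semi-implicit Euler (the step count is an illustrative choice) and returns the final state.

```python
# Sketch: simulate the bang-bang candidate for the time-optimal
# double integrator  y'' = u, |u| <= 1, starting from (y0, 0), y0 > 0:
# u = -1 until t = sqrt(y0), then u = +1 until T = 2 sqrt(y0).
import math

def bang_bang_final_state(y0, n=20000):
    t_switch = math.sqrt(y0)
    T = 2.0 * t_switch
    dt = T / n
    y, v = y0, 0.0
    for i in range(n):
        u = -1.0 if i * dt < t_switch else 1.0
        # semi-implicit Euler: update velocity first, then position
        v += dt * u
        y += dt * v
    return y, v
```

Both components of the final state come out at the origin up to floating-point error, confirming the single switching time.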
Example: double integrator (II)

Fuller's problem (work with J. Laurent-Varin)

Same dynamics: ẍ_t = u_t ∈ [−1, 1];  integral cost: ∫₀ᵀ x_t² dt.

[Figure 2: Fuller problem: optimal control, logarithmic penalty; control values in [−1, 1] plotted over t ∈ [0, 8].]

Ref. PhD work of J. Laurent-Varin, 2005.
PROS
- This method can integrate all types of constraints (state constraints, mixed constraints, etc.)
- The discrete problem is a finite-dimensional optimisation problem

CONS
- Local approach
- Huge number of variables
- Stability and convergence issues: in some cases, the discretized control problem doesn't have any feasible solution while the original control problem does have a solution!
- The discretization of the control problem should take into account the structure of the optimal trajectory
With a final state constraint.

    Min φ(y_T)
    subject to:
        ẏ_t = f(y_t, u_t),  t ∈ (0, T),  y_0 = x
        Ψ(y_T) = 0

The mapping T : u ↦ y^u is single-valued, so the OCP (P) can be re-written as:

    Min F(u) := J(u, y^u)
    u ∈ U_ad;  Ψ(T(u)(T)) = 0.

Reminder (a known result in optimization theory)

    ū ∈ U_ad is a minimum of (P)  =⇒
    ∃(λ_o, λ) ≠ 0,  [λ_o F′(ū) + [Ψ′(T(ū)(T)) · T′(ū)(T)]ᵀ λ] · (u − ū) ≥ 0   ∀u ∈ U_ad.
Differentiability of F

(A1') Assume f is of class C¹.

Theorem
Assume (A0)-(A1) and (A1'); then T is differentiable on L²(0, T; R^m). Moreover, we have:

    T′(u) · v = z_v^u    ∀u, v ∈ L²(0, T; R^m);

where z_v^u is the linearized state, solution of:

    ż_t = f_y′(y_t^u, u_t) z_t + f_u′(y_t^u, u_t) v_t   on (0, T),      (1)
    z_0 = 0,

where y_·^u := T(u) stands for the state associated with u.
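The theorem can be sanity-checked by finite differences on a toy scalar system. Taking f(y, u) = sin(y) + u (an illustrative smooth choice, as are the nominal control and perturbation below), the quotient (T(u + εv) − T(u))/ε at the final time should match the linearized state z solving (1).

```python
# Sketch: finite-difference check of T'(u)·v = z for the scalar system
#   y' = sin(y) + u, so the linearized state solves
#   z' = cos(y) z + v,  z(0) = 0.
import math

def check_linearization(T=1.0, n=4000, eps=1e-6):
    dt = T / n
    u = lambda t: math.cos(3 * t)    # nominal control (illustrative)
    v = lambda t: math.sin(2 * t)    # perturbation direction (illustrative)
    y, y_eps, z = 0.5, 0.5, 0.0
    for i in range(n):
        t = i * dt
        # the linearized state uses the current y, so update z first
        z += dt * (math.cos(y) * z + v(t))
        y += dt * (math.sin(y) + u(t))               # nominal state
        y_eps += dt * (math.sin(y_eps) + u(t) + eps * v(t))  # perturbed state
    return (y_eps - y) / eps, z
```

The two quantities agree up to the finite-difference and discretization error, which is what differentiability of T asserts.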
Theorem
We have:

    [λ_o F′(u) + [Ψ′(T(u)(T)) · T′(u)(T)]ᵀ λ] · v = ∫₀ᵀ ⟨p(t), f_u(y_t^u, u_t) · v_t⟩ dt,

where y^u = T(u), and p is the adjoint state associated with u, solution of:

    −ṗ(t) = [f_y(y_t^u, u_t)]ᵀ p(t),
    p(T) = λ_o φ′(y_T^u) + [Ψ′(y_T^u)]ᵀ λ.

Introduce the Hamiltonian H : R^d × R^m × R^d → R, defined by:

    H(x, v, q) = q · f(x, v).
Theorem (Under (A1)-(A3) and (A1'))
Let ū ∈ U_ad be a minimum of (P); then the triplet (ū, ȳ, p̄) satisfies:

    ȳ˙(t) = f(ȳ(t), ū(t)),   ȳ(0) = x_o,
    −p̄˙(t) = [f_y(ȳ(t), ū(t))]ᵀ p̄(t),
    ∂_u H(ȳ(t), ū(t), p̄(t)) · (u − ū(t)) ≥ 0,   ∀u ∈ U.

The triplet (ū, ȳ, p̄) is called a Pontryagin extremal.
Theorem (Under (A1)-(A3) and (A1'))
Let ū ∈ U_ad be a minimum of (P); then the triplet (ū, ȳ, p̄) satisfies:

    ȳ˙(t) = f(ȳ(t), ū(t)),   ȳ(0) = x_o,
    −p̄˙(t) = [f_y(ȳ(t), ū(t))]ᵀ p̄(t),
    H(ȳ(t), ū(t), p̄(t)) = min_{u∈U} H(ȳ(t), u, p̄(t)).

The triplet (ū, ȳ, p̄) is called a Pontryagin extremal.
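A classical way to exploit these conditions numerically is indirect shooting: guess the missing initial costate p(0) and adjust it until the terminal condition holds. A minimal sketch on an illustrative LQ problem, min ∫₀¹ (y² + u²) dt with ẏ = u, y(0) = 1 and free endpoint: minimizing the Hamiltonian gives u = −p/2, the adjoint is −ṗ = 2y, and the transversality condition is p(1) = 0.

```python
# Sketch: indirect shooting on the Pontryagin system of the LQ problem
#   min ∫₀¹ (y² + u²) dt,  y' = u,  y(0) = 1,  free endpoint.

def shoot(p0, n=4000):
    # integrate state and adjoint forward from a guessed costate p(0)
    dt = 1.0 / n
    y, p = 1.0, p0
    for _ in range(n):
        u = -p / 2.0           # minimizer of H = p*u + y**2 + u**2 over u
        y += dt * u            # state equation    y' = u
        p += dt * (-2.0 * y)   # adjoint equation -p' = 2y
    return p                   # shooting residual: we want p(1) = 0

def solve_shooting(lo=0.0, hi=5.0, iters=60):
    # for this linear system shoot(p0) is increasing in p0, so bisect
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if shoot(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)
```

For this problem the exact initial costate is p(0) = 2 tanh(1) ≈ 1.523; the bisection should land near that value, up to the Euler discretization error.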
More generally ...

    Min φ(y_T) + ∫₀ᵀ ℓ(y(t), u(t)) dt
    subject to:
        ẏ_t = f(y_t, u_t),  t ∈ (0, T),  y_0 = x,
        Ψ(y_T) = 0

Theorem (Under (A1)-(A3) and (A1'))
Let ū ∈ U_ad be a minimum of (P); then there exists (λ_0, λ) ∈ {0, 1} × R^d such that

    ȳ˙(t) = ∂_p H(ȳ(t), ū(t), p̄(t), λ_0),   ȳ(0) = x_o,
    −p̄˙(t) = ∂_y H(ȳ(t), ū(t), p̄(t), λ_0),
    ∂_u H(ȳ(t), ū(t), p̄(t), λ_0) · (u − ū(t)) ≥ 0,   ∀u ∈ U,

where H(x, v, q, μ) := ⟨q, f(x, v)⟩ + μ ℓ(x, v)   for x ∈ R^d, v ∈ U, q ∈ R^d, μ ∈ {0, 1}.

Moreover, λ_o = 1 if the problem is free of state constraints.