Numerical Optimal Control
Moritz Diehl
Simplified Optimal Control Problem in ODE
[Figure: states x(t) and controls u(t) on [0, T], with fixed initial value x0, path constraints h(x, u) ≥ 0, and terminal constraint r(x(T)) ≥ 0]

minimize over x(·), u(·):

    ∫_0^T L(x(t), u(t)) dt + E(x(T))

subject to

    x(0) − x0 = 0,                                 (fixed initial value)
    ẋ(t) − f(x(t), u(t)) = 0,   t ∈ [0, T],        (ODE model)
    h(x(t), u(t)) ≥ 0,          t ∈ [0, T],        (path constraints)
    r(x(T)) ≥ 0.                                   (terminal constraints)
More general optimal control problems
Many features left out here for simplicity of presentation:
• multiple dynamic stages
• differential algebraic equations (DAE) instead of ODE
• explicit time dependence
• constant design parameters
• multipoint constraints r(x(t_0), x(t_1), ..., x(t_end)) = 0
Optimal Control Family Tree
Three basic families:
• Hamilton-Jacobi-Bellman equation / dynamic programming
• Indirect Methods / calculus of variations / Pontryagin
• Direct Methods (control discretization)
Principle of Optimality
Any subarc of an optimal trajectory is also optimal.
[Figure: optimal states x(t) and controls u(t) on [0, T], with initial value x0 and intermediate value x̄ at time t̄. The subarc on [t̄, T] is an optimal solution for initial value x̄.]
Dynamic Programming Cost-to-go
IDEA:
• Introduce the optimal cost-to-go function on [t̄, T]:

    J(x̄, t̄) := min_{x,u} ∫_{t̄}^{T} L(x, u) dt + E(x(T))   s.t. x(t̄) = x̄, ...

• Introduce a grid 0 = t_0 < ... < t_N = T.
• Use the principle of optimality on the intervals [t_k, t_{k+1}]:

    J(x_k, t_k) = min_{x,u} ∫_{t_k}^{t_{k+1}} L(x, u) dt + J(x(t_{k+1}), t_{k+1})   s.t. x(t_k) = x_k, ...

[Figure: trajectory piece from x_k at t_k to x(t_{k+1}) at t_{k+1}]
Dynamic Programming Recursion
Starting from J(x, t_N) = E(x), compute recursively backwards, for k = N − 1, ..., 0:

    J(x_k, t_k) := min_{x,u} ∫_{t_k}^{t_{k+1}} L(x, u) dt + J(x(t_{k+1}), t_{k+1})   s.t. x(t_k) = x_k, ...

by solution of short horizon problems for all possible x_k and tabulation in state space.

[Figure: tabulated cost-to-go functions J(·, t_N), J(·, t_{N−1}), ..., J(·, t_0), built backwards over the state space]
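To make the recursion concrete, here is a minimal numeric sketch (my addition, not from the slides): backward dynamic programming on a state grid for the scalar test problem used later in the lecture, with an Euler step for the dynamics, terminal cost E ≡ 0, and linear interpolation of the tabulated cost-to-go. Grid sizes and the integrator are illustrative choices.

```python
import numpy as np

T, N = 3.0, 30                        # horizon and number of intervals
h = T / N                             # time step
xs = np.linspace(-1.0, 1.0, 201)      # state grid (tabulation points)
us = np.linspace(-1.0, 1.0, 51)       # control candidates

def f(x, u):                          # ODE right-hand side of the test problem
    return (1.0 + x) * x + u

def L(x, u):                          # stage cost
    return x**2 + u**2

J = np.zeros_like(xs)                 # J(x, t_N) = E(x) = 0
for k in range(N - 1, -1, -1):        # backward recursion over time
    Jnew = np.empty_like(xs)
    for i, x in enumerate(xs):
        xnext = x + h * f(x, us)      # one Euler step per candidate control
        # stage cost plus interpolated cost-to-go (np.interp clamps at the
        # grid boundary, a crude treatment of states leaving the grid)
        cand = h * L(x, us) + np.interp(xnext, xs, J)
        Jnew[i] = cand.min()          # minimize over the control candidates
    J = Jnew
# J now approximates the cost-to-go J(·, t_0) on the grid
```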
Hamilton-Jacobi-Bellman (HJB) Equation
• Dynamic programming with infinitely small timesteps leads to the Hamilton-Jacobi-Bellman (HJB) equation:

    −∂J/∂t (x, t) = min_u  L(x, u) + ∂J/∂x (x, t) f(x, u)   s.t. h(x, u) ≥ 0.

• Solve this partial differential equation (PDE) backwards for t ∈ [0, T], starting at the end of the horizon with J(x, T) = E(x).
• NOTE: Optimal controls for state x at time t are obtained from

    u*(x, t) = arg min_u  L(x, u) + ∂J/∂x (x, t) f(x, u)   s.t. h(x, u) ≥ 0.
Dynamic Programming / HJB
• "Dynamic Programming" applies to discrete time systems, "HJB" to continuous time systems.

Pros and Cons:
+ Searches whole state space, finds global optimum.
+ Optimal feedback controls precomputed.
+ Analytic solution to some problems possible (linear systems with quadratic cost → Riccati equation).
• "Viscosity solutions" (Lions et al.) exist for quite general nonlinear problems.
- But: in general intractable, because it is a partial differential equation (PDE) in a high dimensional state space: the "curse of dimensionality".
• Possible remedy: approximate J, e.g. in the framework of neuro-dynamic programming [Bertsekas 1996].
• Used for practical optimal control of small scale systems e.g. by Bonnans, Zidani, Lee, Back, ...
Indirect Methods
For simplicity, regard only the problem without inequality constraints:

[Figure: states x(t) and controls u(t) on [0, T], with initial value x0 and terminal cost E(x(T))]

minimize over x(·), u(·):

    ∫_0^T L(x(t), u(t)) dt + E(x(T))

subject to

    x(0) − x0 = 0,                                 (fixed initial value)
    ẋ(t) − f(x(t), u(t)) = 0,   t ∈ [0, T].        (ODE model)
Pontryagin’s Minimum Principle
OBSERVATION: In HJB, the optimal controls

    u*(t) = arg min_u  L(x, u) + ∂J/∂x (x, t) f(x, u)

depend only on the derivative ∂J/∂x (x, t), not on J itself!

IDEA: Introduce the adjoint variables

    λ(t) := ∂J/∂x (x(t), t)ᵀ ∈ ℝ^{n_x}

and get the controls from Pontryagin's Minimum Principle:

    u*(t, x, λ) = arg min_u  L(x, u) + λᵀ f(x, u),

where H(x, u, λ) := L(x, u) + λᵀ f(x, u) is called the Hamiltonian.

QUESTION: How to obtain λ(t)?
Adjoint Differential Equation
• Differentiate the HJB equation

    −∂J/∂t (x, t) = min_u H(x, u, ∂J/∂x (x, t)ᵀ)

with respect to x and obtain the adjoint differential equation:

    −λ̇ᵀ = ∂H/∂x (x(t), u*(t, x, λ), λ(t)).

• Likewise, differentiate J(x, T) = E(x) and obtain the terminal condition

    λ(T)ᵀ = ∂E/∂x (x(T)).
How to obtain explicit expression for controls?
• In the simplest case,

    u*(t) = arg min_u H(x(t), u, λ(t))

is defined by

    ∂H/∂u (x(t), u*(t), λ(t)) = 0

(calculus of variations, Euler-Lagrange).
• In the presence of path constraints, the expression for u*(t) changes whenever the active constraints change. This leads to state dependent switches.
• If the minimum of the Hamiltonian is locally not unique, "singular arcs" occur. Their treatment needs higher order derivatives of H.
Necessary Optimality Conditions
Summarize the optimality conditions as a boundary value problem:

    x(0) = x0,                                            (initial value)
    ẋ(t) = f(x(t), u*(t)),                 t ∈ [0, T],    (ODE model)
    −λ̇(t) = ∂H/∂x (x(t), u*(t), λ(t))ᵀ,    t ∈ [0, T],    (adjoint equations)
    u*(t) = arg min_u H(x(t), u, λ(t)),    t ∈ [0, T],    (minimum principle)
    λ(T) = ∂E/∂x (x(T))ᵀ.                                 (adjoint final value)

Solve with so called
• gradient methods,
• shooting methods, or
• collocation.
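As a minimal illustration (my addition, not from the slides), this boundary value problem can be handed to scipy's collocation-based solve_bvp for a simple scalar example with f(x, u) = (1 + x)x + u and L(x, u) = x² + u² (the test problem used later in the lecture), ignoring the inequality constraints and assuming a free terminal state with E ≡ 0. Then H = x² + u² + λ((1 + x)x + u), so ∂H/∂u = 2u + λ = 0 gives u* = −λ/2.

```python
import numpy as np
from scipy.integrate import solve_bvp

T, x0 = 3.0, 0.05

def ode(t, y):                     # y stacks x and λ, vectorized over t
    x, lam = y
    u = -lam / 2.0                 # minimum principle: ∂H/∂u = 2u + λ = 0
    xdot = (1.0 + x) * x + u
    lamdot = -(2.0 * x + lam * (1.0 + 2.0 * x))   # adjoint equation: -∂H/∂x
    return np.vstack([xdot, lamdot])

def bc(ya, yb):                    # boundary conditions of the BVP
    return np.array([ya[0] - x0,   # x(0) = x0
                     yb[1]])       # λ(T) = ∂E/∂x = 0 (free terminal state)

t = np.linspace(0.0, T, 50)
y_guess = np.zeros((2, t.size))    # crude initial guess: x ≡ 0, λ ≡ 0
sol = solve_bvp(ode, bc, t, y_guess)
u_opt = -sol.y[1] / 2.0            # optimal control along the trajectory
```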
Indirect Methods
• "First optimize, then discretize"

Pros and Cons:
+ Boundary value problem with only 2 × n_x ODEs.
+ Can treat large scale systems.
- Only necessary conditions for local optimality.
- Need explicit expression for u*(t); singular arcs difficult to treat.
- ODE strongly nonlinear and unstable.
- Inequalities lead to ODEs with state dependent switches.
• Possible remedy: use an interior point method in function space for the inequalities, e.g. Weiser and Deuflhard, Bonnans and Laurent-Varin.
• Used for optimal control e.g. in satellite orbit planning at CNES...
Direct Methods
• "First discretize, then optimize"
• Transcribe the infinite dimensional problem into a finite dimensional Nonlinear Programming Problem (NLP), and solve the NLP.

Pros and Cons:
+ Can use state-of-the-art methods for NLP solution.
+ Can treat inequality constraints and multipoint constraints much more easily.
- Obtains only a suboptimal/approximate solution.
• Nowadays the most commonly used methods due to their easy applicability and robustness.
Direct Methods Overview
We treat three direct methods:
• Direct Single Shooting (sequential simulation and optimization)
• Direct Collocation (simultaneous simulation and optimization)
• Direct Multiple Shooting (simultaneous resp. hybrid)
Direct Single Shooting
[Hicks 1971, Sargent 1978]
Discretize the controls u(t) on a fixed grid 0 = t_0 < t_1 < ... < t_N = T, and regard the states x(t) on [0, T] as dependent variables.

[Figure: states x(t; q) and discretized controls u(t; q) with values q_0, q_1, ..., q_{N−1} on [0, T]]

Use numerical integration to obtain the state as a function x(t; q) of finitely many control parameters q = (q_0, q_1, ..., q_{N−1}).
NLP in Direct Single Shooting
After control discretization and numerical ODE solution, obtain the NLP:

minimize over q:

    ∫_0^T L(x(t; q), u(t; q)) dt + E(x(T; q))

subject to

    h(x(t_i; q), u(t_i; q)) ≥ 0,   i = 0, ..., N,   (discretized path constraints)
    r(x(T; q)) ≥ 0.                                 (terminal constraints)

Solve with a finite dimensional optimization solver, e.g. Sequential Quadratic Programming (SQP).
Solution by Standard SQP
Summarize the problem as

    min_q F(q)   s.t.   H(q) ≥ 0.

Solve e.g. by Sequential Quadratic Programming (SQP), starting with a guess q^0 for the controls. Set k := 0.

1. Evaluate F(q^k), H(q^k) by ODE solution, and their derivatives!
2. Compute the correction Δq^k by solution of the QP:

    min_{Δq} ∇F(q^k)ᵀ Δq + ½ Δqᵀ A_k Δq   s.t.   H(q^k) + ∇H(q^k)ᵀ Δq ≥ 0.

3. Perform the step q^{k+1} = q^k + α_k Δq^k with step length α_k determined by line search.
ODE Sensitivities
How to compute the sensitivity ∂x(t; q)/∂q of a numerical ODE solution x(t; q) with respect to the controls q?

Four ways:
1. External Numerical Differentiation (END)
2. Variational Differential Equations
3. Automatic Differentiation
4. Internal Numerical Differentiation (IND)
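As a small sketch (my addition, not from the slides), the following compares way 1 and way 2 for a scalar example with one constant control on a short interval; the dynamics, interval, and tolerances are illustrative choices.

```python
import numpy as np
from scipy.integrate import solve_ivp

def aug(t, y, u):                 # state x together with sensitivity s = ∂x/∂u
    x, s = y
    fx = 1.0 + 2.0 * x            # ∂f/∂x for f(x, u) = (1 + x)x + u
    fu = 1.0                      # ∂f/∂u
    return [(1.0 + x) * x + u, fx * s + fu]   # variational differential equation

def x_end(u):                     # plain ODE solve, for the END comparison
    rhs = lambda t, y: [(1.0 + y[0]) * y[0] + u]
    return solve_ivp(rhs, [0.0, 0.5], [0.05], rtol=1e-10).y[0, -1]

sol = solve_ivp(aug, [0.0, 0.5], [0.05, 0.0], args=(0.3,), rtol=1e-10)
dxdu_vde = sol.y[1, -1]                               # way 2: variational ODE
dxdu_end = (x_end(0.3 + 1e-6) - x_end(0.3)) / 1e-6    # way 1: END, finite difference
```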
Numerical Test Problem
minimize over x(·), u(·):

    ∫_0^3 x(t)² + u(t)² dt

subject to

    x(0) = x0,                                           (initial value)
    ẋ = (1 + x)x + u,                      t ∈ [0, 3],   (ODE model)
    (1 − x(t), 1 + x(t), 1 − u(t), 1 + u(t))ᵀ ≥ 0,   t ∈ [0, 3],   (bounds)
    x(3) = 0.                                            (zero terminal constraint)

Remark: Uncontrollable growth for (1 + x0)x0 − 1 ≥ 0 ⇔ x0 ≥ 0.618.
Single Shooting Optimization for x0 = 0.05
• Choose N = 30 equal control intervals.
• Initialize with steady state controls u(t) ≡ 0.
• The initial value x0 = 0.05 is the maximum possible here, because the initial trajectory explodes otherwise.
[Figures: single shooting iterations on the test problem, from the initialization u ≡ 0 through seven SQP iterations to the solution]
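These iterations can be reproduced in spirit with a minimal direct single shooting sketch (my addition, not the slides' implementation): piecewise constant controls, a fixed-step RK4 rollout, and scipy's SLSQP in the role of the SQP solver. The integrator, substep count, and tolerances are illustrative choices.

```python
import numpy as np
from scipy.optimize import minimize

T, N, x0 = 3.0, 30, 0.05
h = T / N

def rollout(q):                   # integrate the ODE for piecewise constant controls q
    xs, x, cost = [x0], x0, 0.0
    for i in range(N):
        u, dt = q[i], h / 4
        f = lambda z: (1 + z) * z + u
        for _ in range(4):        # four RK4 substeps per control interval
            k1 = f(x); k2 = f(x + dt / 2 * k1)
            k3 = f(x + dt / 2 * k2); k4 = f(x + dt * k3)
            cost += dt * (x**2 + u**2)   # rectangle rule for the integral cost
            x += dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        xs.append(x)
    return np.array(xs), cost

obj = lambda q: rollout(q)[1]
cons = [
    {"type": "ineq", "fun": lambda q: 1 - np.abs(rollout(q)[0])},  # |x(t_i)| <= 1
    {"type": "eq",   "fun": lambda q: rollout(q)[0][-1]},          # x(3) = 0
]
res = minimize(obj, np.zeros(N), method="SLSQP",
               bounds=[(-1, 1)] * N, constraints=cons)             # |q_i| <= 1
```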
Direct Single Shooting: Pros and Cons
• Sequential simulation and optimization.
+ Can use state-of-the-art ODE/DAE solvers.
+ Few degrees of freedom even for large ODE/DAE systems.
+ Active set changes easily treated.
+ Need only initial guess for controls q.
- Cannot use knowledge of x in initialization (e.g. in tracking problems).
- ODE solution x(t; q) can depend very nonlinearly on q.
- Unstable systems difficult to treat.
• Often used in engineering applications, e.g. in the packages gOPT (PSE), DYOS (Marquardt), ...
Direct Collocation (Sketch)
[Tsang 1975]
• Discretize controls and states on a fine grid, with node values s_i ≈ x(t_i).
• Replace the infinite ODE

    0 = ẋ(t) − f(x(t), u(t)),   t ∈ [0, T]

by finitely many equality constraints c_i(q_i, s_i, s_{i+1}) = 0, i = 0, ..., N − 1, e.g.

    c_i(q_i, s_i, s_{i+1}) := (s_{i+1} − s_i)/(t_{i+1} − t_i) − f((s_i + s_{i+1})/2, q_i).

• Approximate also the integrals, e.g.

    ∫_{t_i}^{t_{i+1}} L(x(t), u(t)) dt ≈ l_i(q_i, s_i, s_{i+1}) := L((s_i + s_{i+1})/2, q_i) (t_{i+1} − t_i).
NLP in Direct Collocation
After discretization obtain a large scale, but sparse NLP:

minimize over s, q:

    Σ_{i=0}^{N−1} l_i(q_i, s_i, s_{i+1}) + E(s_N)

subject to

    s_0 − x0 = 0,                                     (fixed initial value)
    c_i(q_i, s_i, s_{i+1}) = 0,   i = 0, ..., N − 1,   (discretized ODE model)
    h(s_i, q_i) ≥ 0,              i = 0, ..., N,       (discretized path constraints)
    r(s_N) ≥ 0.                                       (terminal constraints)

Solve e.g. with an SQP method for sparse problems.
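A minimal sketch (my addition, not from the slides) of this collocation NLP for the test problem, with the midpoint discretization above and scipy's SLSQP as a stand-in for a structure-exploiting sparse solver:

```python
import numpy as np
from scipy.optimize import minimize

T, N, x0 = 3.0, 30, 0.05
h = T / N
f = lambda s, q: (1 + s) * s + q       # ODE right-hand side

def split(w):                          # w = (s_0, ..., s_N, q_0, ..., q_{N-1})
    return w[:N + 1], w[N + 1:]

def obj(w):                            # sum of the l_i, terminal cost E = 0
    s, q = split(w)
    smid = (s[:-1] + s[1:]) / 2
    return np.sum((smid**2 + q**2) * h)

def dyn(w):                            # c_i = (s_{i+1} - s_i)/h - f(midpoint, q_i)
    s, q = split(w)
    smid = (s[:-1] + s[1:]) / 2
    return (s[1:] - s[:-1]) / h - f(smid, q)

cons = [
    {"type": "eq", "fun": dyn},                            # discretized ODE model
    {"type": "eq", "fun": lambda w: split(w)[0][0] - x0},  # s_0 = x0
    {"type": "eq", "fun": lambda w: split(w)[0][-1]},      # s_N = 0
]
w0 = np.zeros(2 * N + 1)               # initialize all states and controls at 0
res = minimize(obj, w0, method="SLSQP",
               bounds=[(-1, 1)] * (2 * N + 1), constraints=cons)
```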
What is a sparse NLP?
General NLP:

    min_w F(w)   s.t.   G(w) = 0,   H(w) ≥ 0.

It is called sparse if the Jacobians (derivative matrices)

    ∇_w Gᵀ = ∂G/∂w = (∂G_i/∂w_j)_{ij}   and   ∇_w Hᵀ

contain many zero elements. In SQP methods, this makes the QPs much cheaper to build and to solve.
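A tiny numeric illustration (my addition, not from the slides): each collocation constraint c_i couples only s_i, s_{i+1}, and q_i, so a finite-difference Jacobian of the constraints from the sketch above is mostly zero.

```python
import numpy as np

N = 5                                   # small instance, just to show the pattern
w = np.zeros(2 * N + 1)                 # w = (s_0, ..., s_N, q_0, ..., q_{N-1})

def dyn(w):                             # collocation constraints, midpoint rule
    s, q = w[:N + 1], w[N + 1:]
    smid = (s[:-1] + s[1:]) / 2
    return (s[1:] - s[:-1]) - ((1 + smid) * smid + q)

J = np.zeros((N, 2 * N + 1))            # finite-difference Jacobian
for j in range(2 * N + 1):
    e = np.zeros(2 * N + 1)
    e[j] = 1e-6
    J[:, j] = (dyn(w + e) - dyn(w)) / 1e-6
print((np.abs(J) > 1e-9).astype(int))   # 1 marks a structurally nonzero entry
```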
Direct Collocation: Pros and Cons
• Simultaneous simulation and optimization.
+ Large scale, but very sparse NLP.
+ Can use knowledge of x in initialization.
+ Can treat unstable systems well.
+ Robust handling of path and terminal constraints.
- Adaptivity needs a new grid, which changes the NLP dimensions.
• Successfully used for practical optimal control e.g. by Biegler and Wächter (IPOPT), Betts, ...
Direct Multiple Shooting
[Bock 1984]
• Discretize the controls piecewise on a coarse grid:

    u(t) = q_i   for   t ∈ [t_i, t_{i+1}].

• Solve the ODE on each interval [t_i, t_{i+1}] numerically, starting from an artificial initial value s_i:

    ẋ_i(t; s_i, q_i) = f(x_i(t; s_i, q_i), q_i),   t ∈ [t_i, t_{i+1}],
    x_i(t_i; s_i, q_i) = s_i.

This yields the trajectory pieces x_i(t; s_i, q_i).
• Also numerically compute the integrals

    l_i(s_i, q_i) := ∫_{t_i}^{t_{i+1}} L(x_i(t; s_i, q_i), q_i) dt.
Sketch of Direct Multiple Shooting
[Figure: node values s_0, ..., s_N and piecewise constant controls q_0, ..., q_{N−1} on the grid t_0 < t_1 < ... < t_N, with a continuity violation x_i(t_{i+1}; s_i, q_i) ≠ s_{i+1} between neighbouring trajectory pieces]
NLP in Direct Multiple Shooting
[Figure: discretized states and controls in the multiple shooting parametrization]

minimize over s, q:

    Σ_{i=0}^{N−1} l_i(s_i, q_i) + E(s_N)

subject to

    s_0 − x0 = 0,                                              (initial value)
    s_{i+1} − x_i(t_{i+1}; s_i, q_i) = 0,   i = 0, ..., N − 1,  (continuity)
    h(s_i, q_i) ≥ 0,                        i = 0, ..., N,      (discretized path constraints)
    r(s_N) ≥ 0.                                                (terminal constraints)
Structured NLP
• Summarize all variables as w := (s_0, q_0, s_1, q_1, ..., s_N).
• Obtain the structured NLP

    min_w F(w)   s.t.   G(w) = 0,   H(w) ≥ 0.

• The Jacobian ∇G(w^k)ᵀ contains the dynamic model equations.
• The Jacobians and the Hessian of the NLP are block sparse, which can be exploited in the numerical solution procedure.
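A minimal sketch (my addition, not from the slides) of this multiple shooting NLP for the test problem, with an adaptive integrator per interval and scipy's SLSQP standing in for a structure-exploiting SQP; a coarser grid than the slides' N = 30 keeps this finite-difference sketch cheap.

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import minimize

T, N, x0 = 3.0, 10, 0.05                # coarser grid than the slides' N = 30
h = T / N

def integrate(si, qi):                  # returns x_i(t_{i+1}; s_i, q_i) and l_i(s_i, q_i)
    rhs = lambda t, y: [(1 + y[0]) * y[0] + qi,   # ODE model
                        y[0]**2 + qi**2]          # quadrature state for the cost
    sol = solve_ivp(rhs, [0, h], [si, 0.0], rtol=1e-8)
    return sol.y[0, -1], sol.y[1, -1]

def split(w):                           # w = (s_0, ..., s_N, q_0, ..., q_{N-1})
    return w[:N + 1], w[N + 1:]

def obj(w):
    s, q = split(w)
    return sum(integrate(s[i], q[i])[1] for i in range(N))

def continuity(w):                      # s_{i+1} - x_i(t_{i+1}; s_i, q_i) = 0
    s, q = split(w)
    return np.array([s[i + 1] - integrate(s[i], q[i])[0] for i in range(N)])

cons = [
    {"type": "eq", "fun": continuity},
    {"type": "eq", "fun": lambda w: split(w)[0][0] - x0},   # initial value
    {"type": "eq", "fun": lambda w: split(w)[0][-1]},       # x(3) = 0
]
res = minimize(obj, np.zeros(2 * N + 1), method="SLSQP",
               bounds=[(-1, 1)] * (2 * N + 1), constraints=cons)
```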
[Figures: multiple shooting iterations on the test example, from the initialization u(t) ≡ 0 to the solution after three SQP iterations]
Direct Multiple Shooting: Pros and Cons
• Simultaneous simulation and optimization.
+ Uses adaptive ODE/DAE solvers,
+ but the NLP has fixed dimensions.
+ Can use knowledge of x in initialization (here bounds; more important in online context).
+ Can treat unstable systems well.
+ Robust handling of path and terminal constraints.
+ Easy to parallelize.
- Not as sparse as collocation.
• Used for practical optimal control e.g. by Franke (ABB) ("HQP"), Terwen (Daimler); Bock et al. ("MUSCOD-II"); in the ACADO Toolkit; ...
Conclusions: Optimal Control Family Tree
Three basic families, with the direct methods splitting further:

• Hamilton-Jacobi-Bellman Equation: tabulation in state space.
• Indirect Methods, Pontryagin: solve a boundary value problem.
• Direct Methods: transform into a Nonlinear Program (NLP), then
  – Single Shooting: only discretized controls in the NLP (sequential),
  – Collocation: discretized controls and states in the NLP (simultaneous),
  – Multiple Shooting: controls and node start values in the NLP (simultaneous/hybrid).
Literature
• T. Binder, L. Blank, H. G. Bock, R. Bulirsch, W. Dahmen, M. Diehl, T. Kronseder, W. Marquardt, J. P. Schlöder, and O. v. Stryk: Introduction to Model Based Optimization of Chemical Processes on Moving Horizons. In Grötschel, Krumke, Rambau (eds.): Online Optimization of Large Scale Systems: State of the Art, Springer, 2001, pp. 295–340.
• John T. Betts: Practical Methods for Optimal Control Using Nonlinear Programming. SIAM, Philadelphia, 2001. ISBN 0-89871-488-5.
• Dimitri P. Bertsekas: Dynamic Programming and Optimal Control. Athena Scientific, Belmont, 2000 (Vol. I, ISBN 1-886529-09-4) & 2001 (Vol. II, ISBN 1-886529-27-2).
• A. E. Bryson and Y. C. Ho: Applied Optimal Control. Hemisphere/Wiley, 1975.