Optimal control 2

Optimal control
T. F. Edgar
Spring 2012
Optimal Control
• Static optimization (finite dimensions)
• Calculus of variations (infinite dimensions)
• Maximum principle (Pontryagin) / minimum principle
Based on state space models
$\min V(\mathbf{x}, \mathbf{u})$
s.t. $\dot{\mathbf{x}} = \mathbf{f}(\mathbf{x}, \mathbf{u}, t)$, with $\mathbf{x}(t_0)$ given
$V(\mathbf{x}, \mathbf{u}) = \Phi(\mathbf{x}(t_f)) + \int_{t_0}^{t_f} L(\mathbf{x}, \mathbf{u}, t)\, dt$
General nonlinear control problem
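Special Cases of $V$
• Minimum fuel: $\int_0^{t_f} |\mathbf{u}|\, dt$
• Minimum time: $\int_0^{t_f} 1\, dt$
• Max range: $x(t_f)$
• Quadratic loss: $\int_0^{t_f} (\mathbf{x}^T \mathbf{Q} \mathbf{x} + \mathbf{u}^T \mathbf{R} \mathbf{u})\, dt$
Analytical solution if state equation is linear, i.e., $\dot{\mathbf{x}} = \mathbf{A}\mathbf{x} + \mathbf{B}\mathbf{u}$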
“Linear Quadratic” problem - LQP
• Note: $ISE = \int_0^{t_f} x^2\, dt$ is not solvable in a realistic sense ($u$ is unbounded), thus need control weighting in $V$
• E.g., $V = \int_0^{t_f} (x^2 + r u^2)\, dt$
• $r$ is a tuning parameter (affects overshoot)
• $V$ = Profit?
Ex. Maximize conversion at the exit of a tubular reactor
$\max x_3(t_f)$
$x_3$: Concentration
$t$: Residence time parameter
In other cases, when $x$ and $u$ are deviation variables, $x^2 + \ldots$
• Initial conditions
(a) $x(0) \neq 0$, $x(t_f) \to x_d = 0$ or $V = \int_0^{t_f} (x - x_d)^2\, dt$
Set point change, $x_d$ is the desired $x$
(b) $x(0) \neq 0$, impulse disturbance, $x_d = 0$
(c) $x(0) = 0$, model includes disturbance term, $x_d = 0$
Other considerations:
“open loop” vs. “closed loop”
• “open loop”: optimal control is an explicit function of time and depends on $x(0)$ (“programmed control”)
• “closed loop”: feedback control, $u(t)$ depends on $x(t)$, but not on $x(0)$, e.g., $u(t) = -K(t)\, x(t)$
Feedback control is advantageous in the presence of noise and model errors.
Optimal feedback control arises from a specific optimal control problem, the LQP.
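The contrast between programmed and feedback control can be illustrated numerically. The sketch below is only an illustration under stated assumptions: the scalar plant dx/dt = -x + u + d, the gain K = sqrt(2) - 1, and the constant disturbance d are all chosen here for the example and are not taken from this slide. The open-loop input is precomputed from the nominal model (d = 0) and x(0); the closed-loop input uses the measured state.

```python
import math

def simulate(policy, d, x0=1.0, t_end=10.0, dt=0.001):
    """Euler-integrate dx/dt = -x + u + d and return the final state."""
    x, t = x0, 0.0
    for _ in range(int(round(t_end / dt))):
        u = policy(t, x)
        x += dt * (-x + u + d)
        t += dt
    return x

K = math.sqrt(2.0) - 1.0   # assumed feedback gain (illustrative only)
X0 = 1.0                   # initial state used to build the open-loop program

def open_loop(t, x):
    # Programmed control: a function of time and x(0) only.
    # With d = 0 it reproduces the nominal closed-loop trajectory.
    return -K * X0 * math.exp(-math.sqrt(2.0) * t)

def closed_loop(t, x):
    # Feedback control: depends on the current state x(t).
    return -K * x

d = 0.5                    # hypothetical unmodeled constant disturbance
x_ol = simulate(open_loop, d)
x_cl = simulate(closed_loop, d)
print(f"final x, open loop:   {x_ol:.4f}")
print(f"final x, closed loop: {x_cl:.4f}")
```

In this sketch the feedback law ends closer to zero because it reacts to the disturbed state, while the programmed input decays to zero regardless of what the plant does.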
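Derivation of Minimum Principle
$\min V(\mathbf{x}, \mathbf{u}) = \Phi(\mathbf{x}(t_f)) + \int_0^{t_f} L(\mathbf{x}(t), \mathbf{u}(t), t)\, dt$
$\dot{\mathbf{x}} = \mathbf{f}(\mathbf{x}, \mathbf{u}, t)$, with $\mathbf{x}$ of dimension $n \times 1$ and $\mathbf{u}$ of dimension $r \times 1$
$\Phi$, $L$, $\mathbf{f}$ have continuous 1st partial derivatives w.r.t. $\mathbf{x}, \mathbf{u}, t$
Form Lagrangian
$V(\mathbf{u}) = \Phi + \int_{t_0}^{t_f} [L + \boldsymbol{\lambda}^T (\mathbf{f} - \dot{\mathbf{x}})]\, dt$
Multipliers $\boldsymbol{\lambda}$: adjoint variables, costates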
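• Define $H = L + \boldsymbol{\lambda}^T \mathbf{f}$ (Hamiltonian)
$V(\mathbf{u}) = \Phi + \int_{t_0}^{t_f} (H - \boldsymbol{\lambda}^T \dot{\mathbf{x}})\, dt = \Phi - [\boldsymbol{\lambda}^T \mathbf{x}]_{t_0}^{t_f} + \int_{t_0}^{t_f} (H + \dot{\boldsymbol{\lambda}}^T \mathbf{x})\, dt$
(integration by parts: $\int_{t_0}^{t_f} \boldsymbol{\lambda}^T \dot{\mathbf{x}}\, dt = \boldsymbol{\lambda}^T \mathbf{x}|_{t_f} - \boldsymbol{\lambda}^T \mathbf{x}|_{t_0} - \int_{t_0}^{t_f} \dot{\boldsymbol{\lambda}}^T \mathbf{x}\, dt$)
• Since $V$ is a Lagrangian, we treat this as an unconstrained problem with variables $\mathbf{x}(t)$, $\boldsymbol{\lambda}(t)$, $\mathbf{u}(t)$
• Use variations $\delta\mathbf{x}(t)$, $\delta\mathbf{u}(t)$, $\delta V$ (the variation $\delta\boldsymbol{\lambda}(t)$ returns the original constraint, the state equation)
$\delta V = 0 = [(\partial\Phi/\partial x - \lambda^T)\, \delta x]_{t_f} + [\lambda^T \delta x]_{t_0} + \int_{t_0}^{t_f} [H_u\, \delta u + H_x\, \delta x + \dot{\lambda}^T \delta x]\, dt$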
• Since $\delta x(t)$, $\delta u(t)$ are arbitrary ($\neq 0$), then
$\partial H/\partial x + \dot{\lambda} = 0 \Rightarrow \dot{\lambda} = -\partial H/\partial x$ ($n$ equations, “adjoint equation”)
$\partial H/\partial u = 0$, “optimality equation” for weak minimum
At $t = t_f$: $\partial\Phi/\partial x - \lambda = 0 \Rightarrow \lambda(t_f) = \partial\Phi/\partial x|_{t_f}$ ($n$ boundary conditions)
If $x(t_0)$ is specified, then $\delta x(t_0) = 0$
Two point boundary value problem (“TPBVP”)
• Example:
$\frac{dx_1}{dt} = u - x_1$ (1st order transfer function)
$\min V = \frac{1}{2} \int_0^{t_f} (x_1^2 + u^2)\, dt$ (LQP)
$H = \frac{1}{2}(x_1^2 + u^2) + \lambda_1 (u - x_1)$
$\dot{\lambda}_1 = -x_1 + \lambda_1$, $\lambda_1(t_f) = 0$
$H_u = u + \lambda_1 = 0$
$u^{opt} = -\lambda_1$ (but don’t know $\lambda_1(t)$ yet)
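One numerical way to find the unknown costate is a shooting method: guess lambda_1(0), integrate the state and costate equations forward with u = -lambda_1 substituted, and adjust the guess until lambda_1(t_f) = 0. The sketch below is an illustration under assumptions that are not on the slide (x_1(0) = 1, t_f = 2, a hand-written RK4 integrator). Because the equations are linear, lambda_1(t_f) is an affine function of the guess, so two trial integrations are enough.

```python
def integrate(x0, lam0, tf, n=2000):
    """RK4 for dx/dt = -x - lam (u = -lam substituted), dlam/dt = -x + lam."""
    def rhs(x, lam):
        return (-x - lam, -x + lam)
    h = tf / n
    x, lam = x0, lam0
    for _ in range(n):
        k1 = rhs(x, lam)
        k2 = rhs(x + 0.5 * h * k1[0], lam + 0.5 * h * k1[1])
        k3 = rhs(x + 0.5 * h * k2[0], lam + 0.5 * h * k2[1])
        k4 = rhs(x + h * k3[0], lam + h * k3[1])
        x += h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
        lam += h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
    return x, lam

def shoot(x0, tf):
    """Return lam(0) such that lam(tf) = 0, using two trial guesses."""
    g0 = integrate(x0, 0.0, tf)[1]   # lam(tf) for guess lam(0) = 0
    g1 = integrate(x0, 1.0, tf)[1]   # lam(tf) for guess lam(0) = 1
    return -g0 / (g1 - g0)           # zero of the affine map guess -> lam(tf)

x0, tf = 1.0, 2.0                    # assumed values for illustration
lam0 = shoot(x0, tf)
u0 = -lam0                           # optimal control at t = 0
lam_tf = integrate(x0, lam0, tf)[1]  # should be (numerically) zero
print(f"lam1(0) = {lam0:.5f}, u(0) = {u0:.5f}, lam1(tf) = {lam_tf:.2e}")
```

For nonlinear problems the same idea needs an iterative root finder instead of a single affine solve.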
• Free canonical equations (eliminate $u$)
(1) $\dot{x}_1 = u - x_1 = -\lambda_1 - x_1$ ($x_1(0)$ is known)
(2) $\dot{\lambda}_1 = -x_1 + \lambda_1$, $\lambda_1(t_f) = 0$
Combine (1) and (2):
$\ddot{\lambda}_1 = 2\lambda_1 \Rightarrow \lambda_1 = k_1 e^{\sqrt{2} t} + k_2 e^{-\sqrt{2} t}$
$0 = k_1 e^{\sqrt{2} t_f} + k_2 e^{-\sqrt{2} t_f}$
$x_1 = \lambda_1 - \dot{\lambda}_1 = k_1 (1 - \sqrt{2}) e^{\sqrt{2} t} + k_2 (1 + \sqrt{2}) e^{-\sqrt{2} t}$
$x_1(0) = k_1 (1 - \sqrt{2}) + k_2 (1 + \sqrt{2})$
$u^{opt}(t) = c_1 e^{\sqrt{2} t} - c_2 e^{-\sqrt{2} t} = \frac{x(0)}{(\sqrt{2} - 1) + (\sqrt{2} + 1) e^{2\sqrt{2} t_f}} \left( e^{\sqrt{2} t} - e^{2\sqrt{2} t_f - \sqrt{2} t} \right)$
$u < 0\ \forall t$ for $x(0) > 0$, initially correct to reduce $x(t)$
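The closed-form result can be checked numerically. The sketch below evaluates the costate, state, and control expressions for assumed values x(0) = 1 and t_f = 2 (chosen only for illustration) and confirms the two boundary conditions, the relation u = -lambda_1, and the sign of u.

```python
import math

S2 = math.sqrt(2.0)

def u_opt(t, x0, tf):
    """Closed-form optimal control for dx1/dt = u - x1, V = 1/2 int (x1^2 + u^2) dt."""
    denom = (S2 - 1.0) + (S2 + 1.0) * math.exp(2.0 * S2 * tf)
    return x0 / denom * (math.exp(S2 * t) - math.exp(2.0 * S2 * tf - S2 * t))

def costate_constants(x0, tf):
    """k1, k2 from the conditions lambda1(tf) = 0 and x1(0) = x0."""
    k1 = x0 / ((1.0 - S2) - (1.0 + S2) * math.exp(2.0 * S2 * tf))
    k2 = -k1 * math.exp(2.0 * S2 * tf)
    return k1, k2

x0, tf = 1.0, 2.0                      # assumed values for illustration
k1, k2 = costate_constants(x0, tf)

def lam(t):
    return k1 * math.exp(S2 * t) + k2 * math.exp(-S2 * t)

def x1(t):
    return k1 * (1.0 - S2) * math.exp(S2 * t) + k2 * (1.0 + S2) * math.exp(-S2 * t)

for t in (0.0, 0.5, 1.0, 1.5):
    print(f"t={t:.1f}  x1={x1(t):+.4f}  lam1={lam(t):+.4f}  u={u_opt(t, x0, tf):+.4f}")
```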
• Another example:
$\dot{x}_1 = x_2$
$\dot{x}_2 = u$ (double integrator)
$V = \frac{1}{2} \int_0^{\infty} (x_1^2 + x_2^2 + u^2)\, dt$
$H = \frac{1}{2} x_1^2 + \frac{1}{2} x_2^2 + \frac{1}{2} u^2 + \lambda_1 x_2 + \lambda_2 u$
$\dot{\lambda}_1 = -\frac{\partial H}{\partial x_1} = -x_1$
$\dot{\lambda}_2 = -\frac{\partial H}{\partial x_2} = -x_2 - \lambda_1$
$H_u = 0 = u + \lambda_2 \Rightarrow u^{opt} = -\lambda_2$
• Free canonical equations
$\dot{x}_1 = x_2$
$\dot{x}_2 = -\lambda_2$
$\dot{\lambda}_1 = -x_1$
$\dot{\lambda}_2 = -x_2 - \lambda_1$ ($\mathbf{x}$, $\boldsymbol{\lambda}$ coupled)
$\Rightarrow \lambda_2^{(4)} - \ddot{\lambda}_2 + \lambda_2 = 0$
Char. equation: $r^4 - r^2 + 1 = 0 \rightarrow r'^2 - r' + 1 = 0$ with $r' = r^2$
$r' = 0.5 \pm 0.866j$
$r = \pm 0.866 \pm 0.5j$ (4 roots, apply boundary condition)
• Can motivate feedback control via discrete time, one step ahead
$x_{k+1} = e x_k + f u_k$
Set $k = 0$: $x_1 = e x_0 + f u_0$ ($x_0$ fixed)
$\min V = x_1^2 + a u_0^2$
$V = (e x_0 + f u_0)^2 + a u_0^2$
$\frac{\partial V}{\partial u_0} = 2f (e x_0 + f u_0) + 2a u_0 = 0$
$0 = e x_0 + f u_0 + \frac{a}{f} u_0 \Rightarrow u_0 = \frac{-e x_0}{f + a/f}$
Feedback control
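The one-step result is a proportional feedback law, u_0 = -gain * x_0 with gain = e / (f + a/f). The sketch below evaluates it for hypothetical parameter values (e, f, a, and x_0 are made up for illustration) and checks that nearby inputs give a higher cost.

```python
def one_step_control(x0, e, f, a):
    """Minimizer of V = (e*x0 + f*u0)**2 + a*u0**2 over u0."""
    return -e * x0 / (f + a / f)

def cost(u0, x0, e, f, a):
    """One-step-ahead objective V = x1^2 + a*u0^2 with x1 = e*x0 + f*u0."""
    return (e * x0 + f * u0) ** 2 + a * u0 ** 2

e, f, a, x0 = 0.9, 0.5, 0.1, 2.0     # hypothetical values
u0 = one_step_control(x0, e, f, a)
gain = -u0 / x0                      # feedback gain: u0 = -gain * x0

print(f"u0 = {u0:.4f}, gain = {gain:.4f}")
print(f"V(u0) = {cost(u0, x0, e, f, a):.4f}")
```

Increasing the control weight a lowers the gain, which is the same tuning role that r plays in the continuous-time objective.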
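Continuous Time LQP
$\dot{\mathbf{x}} = \mathbf{A}\mathbf{x} + \mathbf{B}\mathbf{u}$
$V = \frac{1}{2} \mathbf{x}^T(t_f)\, \mathbf{S}\, \mathbf{x}(t_f) + \frac{1}{2} \int_0^{t_f} (\mathbf{x}^T \mathbf{Q} \mathbf{x} + \mathbf{u}^T \mathbf{R} \mathbf{u})\, dt$
$\mathbf{S}, \mathbf{Q} \ge \mathbf{O}$, $\mathbf{R} \ge \mathbf{O}$
$H = \boldsymbol{\lambda}^T (\mathbf{A}\mathbf{x} + \mathbf{B}\mathbf{u}) + \frac{1}{2} \mathbf{x}^T \mathbf{Q} \mathbf{x} + \frac{1}{2} \mathbf{u}^T \mathbf{R} \mathbf{u}$
$\dot{\boldsymbol{\lambda}} = -\mathbf{Q}\mathbf{x} - \mathbf{A}^T \boldsymbol{\lambda}$, $\boldsymbol{\lambda}(t_f) = \mathbf{S}\mathbf{x}(t_f)$
$H_{\mathbf{u}} = \mathbf{O} = \mathbf{B}^T \boldsymbol{\lambda} + \mathbf{R}\mathbf{u}$
$\mathbf{u}^{opt} = -\mathbf{R}^{-1} \mathbf{B}^T \boldsymbol{\lambda}$ ($\mathbf{R} > \mathbf{O}$)
$H_{\mathbf{uu}} = \mathbf{R} > \mathbf{O}$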
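• Free canonical equations
$\dot{\mathbf{x}} = \mathbf{A}\mathbf{x} - \mathbf{B}\mathbf{R}^{-1}\mathbf{B}^T \boldsymbol{\lambda}$ ($\mathbf{x}(0)$ given)
$\dot{\boldsymbol{\lambda}} = -\mathbf{Q}\mathbf{x} - \mathbf{A}^T \boldsymbol{\lambda}$ ($\boldsymbol{\lambda}(t_f)$ given)
Let $\boldsymbol{\lambda} = \mathbf{P}\mathbf{x}$ (Riccati transformation)
$\mathbf{u}^{opt} = -\mathbf{R}^{-1}\mathbf{B}^T \mathbf{P}\mathbf{x}$, let $\mathbf{K} = \mathbf{R}^{-1}\mathbf{B}^T \mathbf{P}$ (feedback control)
Then we have an ODE in $\mathbf{P}$:
$\dot{\mathbf{x}} = \mathbf{A}\mathbf{x} - \mathbf{B}\mathbf{R}^{-1}\mathbf{B}^T \mathbf{P}\mathbf{x}$ (1)
$\dot{\boldsymbol{\lambda}} = -\mathbf{Q}\mathbf{x} - \mathbf{A}^T \boldsymbol{\lambda} \Rightarrow \dot{\mathbf{P}}\mathbf{x} + \mathbf{P}\dot{\mathbf{x}} = -\mathbf{Q}\mathbf{x} - \mathbf{A}^T \mathbf{P}\mathbf{x}$ (2)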
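Substitute Eq. (1) into Eq. (2):
$\dot{\mathbf{P}} + \mathbf{P}\mathbf{A} + \mathbf{A}^T \mathbf{P} - \mathbf{P}\mathbf{B}\mathbf{R}^{-1}\mathbf{B}^T \mathbf{P} + \mathbf{Q} = \mathbf{O}$ (Riccati ODE)
$\mathbf{P}(t_f) = \mathbf{S}$ (backward time integration)
At steady state, $\mathbf{P} \to \mathbf{P}_e$ for $t_f \to \infty$; solve the steady state equation.
$\mathbf{P}$ is symmetric, $\mathbf{P} = \mathbf{P}^T$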
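For a scalar plant the backward integration is easy to sketch. The example below assumes the first-order plant dx/dt = a x + b u with a = -1, b = 1 and weights q = r = 1, S = 0 (the same numbers as the earlier first-order example, used here as an illustration). It integrates the scalar Riccati ODE in reverse time tau = t_f - t with a hand-written RK4 step. For these numbers the steady-state equation reduces to P^2 + 2P - 1 = 0, whose positive root is sqrt(2) - 1.

```python
import math

def riccati_backward(a, b, q, r, S, tf, n=20000):
    """Integrate the scalar Riccati ODE from P(tf) = S back to t = 0 (RK4 in tau = tf - t)."""
    def dP_dtau(P):
        # dP/dtau = -dP/dt = 2 a P - (b^2 / r) P^2 + q
        return 2.0 * a * P - (b * b / r) * P * P + q
    h = tf / n
    P = S
    history = [P]                      # history[i] is P at t = tf - i*h
    for _ in range(n):
        k1 = dP_dtau(P)
        k2 = dP_dtau(P + 0.5 * h * k1)
        k3 = dP_dtau(P + 0.5 * h * k2)
        k4 = dP_dtau(P + h * k3)
        P += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
        history.append(P)
    return history

hist = riccati_backward(a=-1.0, b=1.0, q=1.0, r=1.0, S=0.0, tf=10.0)
P0 = hist[-1]                          # P at t = 0
P_inf = math.sqrt(2.0) - 1.0           # positive root of P^2 + 2P - 1 = 0
print(f"P(0) = {P0:.6f}, steady-state root = {P_inf:.6f}")
```

The gain K(t) = P(t) b / r varies with time only near t_f; far from the final time it is essentially constant.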
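• Example
$\mathbf{A} = \begin{bmatrix} -1 & 0 \\ 1 & 0 \end{bmatrix}$, $\mathbf{B} = \begin{bmatrix} 1 \\ 0 \end{bmatrix}$, $\mathbf{Q} = \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}$, $R = 0.1$, $t_f \to \infty$
Plug into Riccati equation (steady state):
$5P_{11}^2 + P_{11} - P_{12} = 0$
$10P_{12}^2 - 1 = 0$
$(1 + 10P_{11}) P_{12} - P_{22} = 0$
$\Rightarrow P_{11} = 0.1706$, $P_{12} = P_{21} = 0.3162$, $P_{22} = 0.8556$
Feedback matrix:
$\mathbf{K} = \mathbf{R}^{-1}\mathbf{B}^T \mathbf{P} = [1.706 \quad 3.162]$, so $u = -\mathbf{K}\mathbf{x} = -1.706\, x_1 - 3.162\, x_2$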
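The three scalar equations can be solved in sequence and checked against the matrix equation. The sketch below does this in plain Python, taking the positive roots (an assumption consistent with P being positive semidefinite), and writes out the residual of P A + A^T P - P B R^{-1} B^T P + Q entry by entry for the example matrices.

```python
import math

R_INV = 10.0                                        # 1 / R with R = 0.1

# Solve the element-wise steady-state equations in sequence
P12 = math.sqrt(1.0 / R_INV)                        # 10 P12^2 - 1 = 0
P11 = (-1.0 + math.sqrt(1.0 + 20.0 * P12)) / 10.0   # 5 P11^2 + P11 - P12 = 0
P22 = (1.0 + 10.0 * P11) * P12                      # (1 + 10 P11) P12 - P22 = 0

# K = R^{-1} B^T P with B = [1, 0]^T picks out the first row of P
K = [R_INV * P11, R_INV * P12]

def residual(p11, p12, p22):
    """Entries (1,1), (1,2), (2,2) of P A + A^T P - P B R^{-1} B^T P + Q."""
    r11 = 2.0 * (-p11 + p12) - R_INV * p11 ** 2
    r12 = -p12 + p22 - R_INV * p11 * p12
    r22 = 1.0 - R_INV * p12 ** 2
    return r11, r12, r22

print(f"P11 = {P11:.4f}, P12 = {P12:.4f}, P22 = {P22:.4f}")
print(f"K = [{K[0]:.3f}, {K[1]:.3f}]")
print("residual:", residual(P11, P12, P22))
```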
• Generally 3 ways to solve the steady state Riccati equation:
(1) integration of ODEs → steady state;
(2) Newton-Raphson (nonlinear equation solver);
(3) transition matrix (analytical solution).
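Method (1) can be sketched with a few list-based matrix helpers. The example below marches the matrix Riccati ODE in reverse time with plain Euler steps until the derivative is negligible. The matrices are those of the 2x2 example (A = [[-1, 0], [1, 0]], B = [1, 0]^T, Q = diag(0, 1), R = 0.1); the step size, tolerance, starting value S = 0, and helper names are assumptions made for this illustration, and R is taken to be a scalar.

```python
def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
            for i in range(len(X))]

def transpose(X):
    return [list(row) for row in zip(*X)]

def add(*Ms):
    return [[sum(M[i][j] for M in Ms) for j in range(len(Ms[0][0]))]
            for i in range(len(Ms[0]))]

def scale(c, X):
    return [[c * v for v in row] for row in X]

def riccati_rhs(P, A, B, Q, r_inv):
    """dP/dtau = P A + A^T P - P B R^{-1} B^T P + Q (scalar R, symmetric P)."""
    PB = matmul(P, B)
    quad = scale(-r_inv, matmul(PB, transpose(PB)))   # uses B^T P = (P B)^T
    return add(matmul(P, A), matmul(transpose(A), P), quad, Q)

def steady_state_riccati(A, B, Q, r_inv, S, h=0.001, tol=1e-10, max_steps=1_000_000):
    """Euler-march in reverse time from P = S until dP/dtau is negligible."""
    P = S
    for _ in range(max_steps):
        dP = riccati_rhs(P, A, B, Q, r_inv)
        if max(abs(v) for row in dP for v in row) < tol:
            break
        P = add(P, scale(h, dP))
    return P

A = [[-1.0, 0.0], [1.0, 0.0]]
B = [[1.0], [0.0]]
Q = [[0.0, 0.0], [0.0, 1.0]]
S = [[0.0, 0.0], [0.0, 0.0]]
P = steady_state_riccati(A, B, Q, r_inv=10.0, S=S)
print([[round(v, 4) for v in row] for row in P])
```

Euler stepping is the simplest choice here; a higher-order integrator or the transition matrix recursion would reach the same steady state with fewer steps.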
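• Transition matrix approach
$\frac{d}{dt} \begin{bmatrix} \mathbf{x} \\ \boldsymbol{\lambda} \end{bmatrix} = \dot{\boldsymbol{\gamma}} = \begin{bmatrix} \mathbf{A} & -\mathbf{B}\mathbf{R}^{-1}\mathbf{B}^T \\ -\mathbf{Q} & -\mathbf{A}^T \end{bmatrix} \boldsymbol{\gamma}$
Reverse time integration (boundary condition at $t = t_f$):
Let $\tau = t_f - t$; when $t = t_f$, $\tau = 0$
$\frac{d\boldsymbol{\gamma}}{d\tau} = \begin{bmatrix} -\mathbf{A} & \mathbf{B}\mathbf{R}^{-1}\mathbf{B}^T \\ \mathbf{Q} & \mathbf{A}^T \end{bmatrix} \boldsymbol{\gamma} = \mathbf{z}\boldsymbol{\gamma}$
$\boldsymbol{\gamma}(\tau) = e^{\mathbf{z}\tau} \boldsymbol{\gamma}(\tau = 0)$
Partition exponential:
$\begin{bmatrix} \mathbf{x} \\ \boldsymbol{\lambda} \end{bmatrix} = \boldsymbol{\gamma} = \begin{bmatrix} \theta_{11} & \theta_{12} \\ \theta_{21} & \theta_{22} \end{bmatrix} \boldsymbol{\gamma}(\tau = 0)$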
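$\mathbf{x}(\tau) = \theta_{11} \mathbf{x}(t_f) + \theta_{12} \boldsymbol{\lambda}(t_f) = \theta_{11} \mathbf{x}(t_f) + \theta_{12} \mathbf{P}(t_f) \mathbf{x}(t_f)$ (1)
$\boldsymbol{\lambda}(\tau) = \theta_{21} \mathbf{x}(t_f) + \theta_{22} \boldsymbol{\lambda}(t_f)$
$\mathbf{P}(\tau) \mathbf{x}(\tau) = \theta_{21} \mathbf{x}(t_f) + \theta_{22} \mathbf{P}(t_f) \mathbf{x}(t_f)$ (2)
Combine (1) and (2), factor out $\mathbf{x}(t_f)$:
$\mathbf{P}(\tau) [\theta_{11} + \theta_{12} \mathbf{P}(t_f)] = \theta_{21} + \theta_{22} \mathbf{P}(t_f)$
Fix integration step $\Delta t$, so $\theta_{ij}(\Delta t)$ is fixed:
$\mathbf{P}(t - \Delta t) [\theta_{11} + \theta_{12} \mathbf{P}(t)] = \theta_{21} + \theta_{22} \mathbf{P}(t)$
Boundary condition: $\mathbf{P}(t_f) = \mathbf{S}$
Backward time integration of $\mathbf{P}$, then forward time integration of
$\dot{\mathbf{x}} = \mathbf{A}\mathbf{x} + \mathbf{B}\mathbf{u}$
$\mathbf{u} = -\mathbf{R}^{-1}\mathbf{B}^T \mathbf{P}\mathbf{x}$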
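The fixed-step recursion can be sketched for a scalar plant, where each theta_ij is a single number and the recursion becomes a division. The example below assumes the plant dx/dt = a x + b u with a = -1, b = 1, q = r = 1, S = 0, and a step of 0.01 (all chosen for illustration), and computes the 2x2 matrix exponential with a truncated Taylor series, which is adequate for a small step.

```python
import math

def mat2_mul(X, Y):
    return [[X[0][0] * Y[0][0] + X[0][1] * Y[1][0], X[0][0] * Y[0][1] + X[0][1] * Y[1][1]],
            [X[1][0] * Y[0][0] + X[1][1] * Y[1][0], X[1][0] * Y[0][1] + X[1][1] * Y[1][1]]]

def mat2_expm(Z, dt, terms=20):
    """Truncated Taylor series for exp(Z*dt); adequate when |Z*dt| is small."""
    M = [[Z[0][0] * dt, Z[0][1] * dt], [Z[1][0] * dt, Z[1][1] * dt]]
    result = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]
    for n in range(1, terms):
        term = mat2_mul(term, M)
        term = [[v / n for v in row] for row in term]      # term = M^n / n!
        result = [[result[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return result

# Scalar plant and weights (assumed values)
a, b, q, r, S = -1.0, 1.0, 1.0, 1.0, 0.0
Z = [[-a, b * b / r], [q, a]]      # reverse-time matrix [[-A, B R^-1 B^T], [Q, A^T]]
dt = 0.01
(th11, th12), (th21, th22) = mat2_expm(Z, dt)

P = S                              # boundary condition P(tf) = S
for _ in range(1000):              # march back 10 time units
    P = (th21 + th22 * P) / (th11 + th12 * P)   # P(t - dt) from P(t)

P_inf = math.sqrt(2.0) - 1.0       # positive root of P^2 + 2P - 1 = 0
print(f"P after marching back: {P:.8f}, steady-state root: {P_inf:.8f}")
```

In the matrix case the division becomes a linear solve with the matrix theta_11 + theta_12 P(t), and the theta blocks are computed once because the step is fixed.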
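Integral Action (eliminate offset)
• Add terms $\dot{\mathbf{u}}^T \mathbf{R} \dot{\mathbf{u}}$ or $(\int \mathbf{x}_1\, dt)^T \mathbf{Q} (\int \mathbf{x}_1\, dt)$ to objective function
Example: $\dot{x}_1 = a x_1 + b u$
$V = \frac{1}{2} \int \left[ q x_1^2 + r u^2 + q \left( \frac{du}{dt} \right)^2 \right] dt$
Augment state equation:
$\dot{x}_1 = a x_1 + b u$ ($u$ becomes a new state variable)
$\frac{du}{dt} = w$ (new control variable)
Calculate feedback control:
$w^{opt} = -k_1 x_1 - k_2 u$
Substitute $u = \frac{1}{b} (\dot{x}_1 - a x_1)$:
$\frac{du}{dt} = -k_1 x_1 - \frac{k_2}{b} (\dot{x}_1 - a x_1)$
Integrate: $u = k'_1 \int x_1\, dt + k'_2 x_1$, with $k'_1 = -k_1 + a k_2 / b$ and $k'_2 = -k_2 / b$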
• Second method:
$x_0 = \int x_1\, dt$; $\dot{x}_0 = x_1$
$V = \frac{1}{2} \int (q x_1^2 + r u^2 + q x_0^2)\, dt$
$\dot{x}_0 = x_1$
$\dot{x}_1 = a x_1 + b u$
Optimal control:
$u = -k_1 x_1 - k_0 x_0 = -k_1 x_1 - k_0 \int x_1\, dt$
With more state variables $\Rightarrow$ PID controller
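The effect of the integral state can be seen in a short simulation. The sketch below is an illustration under assumptions not taken from the slides: a = -1, b = 1, a hypothetical constant disturbance d added to the state equation, and hand-picked gains k1 and k0 rather than gains computed from a Riccati equation. It compares proportional-only feedback (k0 = 0) with the augmented law u = -k1 x1 - k0 x0.

```python
def simulate(k1, k0, a=-1.0, b=1.0, d=0.5, t_end=40.0, dt=0.001):
    """Euler simulation of dx0/dt = x1, dx1/dt = a*x1 + b*u + d with u = -k1*x1 - k0*x0.

    Returns the value of x1 at t_end (the remaining offset).
    """
    x0, x1 = 0.0, 0.0                  # x0 is the integral of x1
    for _ in range(int(round(t_end / dt))):
        u = -k1 * x1 - k0 * x0
        x0, x1 = x0 + dt * x1, x1 + dt * (a * x1 + b * u + d)
    return x1

offset_p = simulate(k1=1.0, k0=0.0)    # proportional feedback only
offset_pi = simulate(k1=1.0, k0=1.0)   # proportional + integral feedback
print(f"offset without integral state: {offset_p:.4f}")
print(f"offset with integral state:    {offset_pi:.6f}")
```

With k0 = 0 the state settles at a nonzero value set by the disturbance, while the integral term keeps adjusting u until x1 returns to zero.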