Optimization Based Approaches to Autonomy
SAE Aerospace Control and Guidance Systems Committee (ACGSC) Meeting
Harvey's Resort, Lake Tahoe, Nevada
March 3, 2005
Cedric Ma, Northrop Grumman Corporation

Slide 2: Outline
- Introduction
- Level of Autonomy
- Optimization and Autonomy
- Autonomy Hierarchy and Applications
- Path Planning with Mixed Integer Linear Programming
- Optimal Trajectory Generation with Nonlinear Programming
- Summary and Conclusions

Slide 3: Autonomy in Vehicle Applications
[Figure: vehicle application areas - pack level coordination, team tactics, navigation, landing, formation flying, rendezvous & refueling, cooperative search, obstacle avoidance]

Slide 4: Autonomy: Boyd's OODA "Loop"
[Diagram: Boyd's Observe-Orient-Decide-Act loop. Orientation, shaped by genetic heritage, cultural traditions, previous experience, new information, and analyses & synthesis, drives observation, decision (hypothesis), and action (test) through implicit guidance & control and feed-forward paths, with feedback from the unfolding circumstances, outside information, and the unfolding interaction with the environment returning to the Observe step.]
"Note how orientation shapes observation, shapes decision, shapes action, and in turn is shaped by the feedback and other phenomena coming into our sensing or observing window. Also note how the entire 'loop' (not just orientation) is an ongoing many-sided implicit cross-referencing process of projection, empathy, correlation, and rejection."
From "The Essence of Winning and Losing," John R. Boyd, January 1996. Defense and the National Interest, http://www.d-n-i.net, 2001

Slide 5: Level of Autonomy
1) Ground Operation: activities performed off-line
2) Tele-Operation: awareness of sensor/actuator interfaces; executes commands uploaded from the ground
3) Reactive Control: awareness of the present situation; simple reflexes, i.e. no planning required; a condition triggers an associated action
4) Responsive Control: awareness of past actions; remembers previous actions, features of the environment, and goals
5) Deliberative Control: awareness of future possibilities; reasons about future consequences; chooses optimal paths/plans (the goal of optimization-based autonomy)

Slide 6: Optimization and Autonomy
[Diagram: the OODA loop, with the formulation and optimizer implementing the Orient and Decide steps]
- Formulation of the problem shapes the "Orient" mechanism
- Optimizer inputs: objective/reward function, constraints/rules (i.e. dynamics/goal), and vehicle state
- Optimizer output: optimal control/decision
- The optimizer determines the best course of action based on the current objective, while meeting constraints

Slide 7: Autonomy Hierarchy
- Mission Planning: planning & scheduling, resource allocation & sequencing; task sequencing, auto routing. Time scale: ~1 hr
- Cooperative Control: multi-agent coordination, pack level organization; formation flying, cooperative search & electronic warfare; conflict resolution, task negotiation, team tactics. Time scale: ~1 min
- Path Planning: "navigation," motion planning; obstacle/collision/threat avoidance. Time scale: ~10 s
- Trajectory Generation: "guidance," contingency handling; landing, rendezvous, refueling. Time scale: ~1 s
- Trajectory Following / Inner Loop: "control," disturbance rejection; applications: stabilization, adaptive/reconfigurable control, FDIR. Time scale: ~0.1 s

Slide 8: Path Planning with Mixed-Integer Linear Programming (MILP)

Slide 9: Overview: Path Planning
- Path planning bridges the gap between the mission planner/auto-router and individual vehicle guidance
- Acts on an "intermediate" time scale between that of the mission planner (minutes) and guidance (<seconds); short reaction time
- Applications: mission waypoints, nap-of-the-earth flight, multi-vehicle coordination, terrain navigation, obstacle avoidance, collision avoidance

Slide 10: Path Planning with MILP
- Mixed-Integer Linear Programming: linear programs (LP) with integer variables; COTS MILP solver: ILOG CPLEX
- Vehicle dynamics as linear constraints: limits on velocity, acceleration, climb/turn rate
- Resulting path is given to 4-D guidance
- Integer variables can model: obstacle collision constraints (binary), control modes, threat exposure, and nonlinear functions (RCS, dynamics)
- Objective function includes terms for: minimum time, acceleration, non-arrival, terminal conditions, altitude, and threat exposure

Slide 11: Basic Obstacle Avoidance Problem
- Vehicle dynamic constraints: double-integrator dynamics, max acceleration, max velocity
- Objective function (summed over each time step): acceleration (1-norm) in x, y, z; distance to destination (1-norm); altitude (if applicable)
- Obstacle constraints (integer), one set per obstacle per time step; no cost associated with obstacles:
    x - M b1 <= x1
    x + M b2 >= x2
    y - M b3 <= y1
    y + M b4 >= y2
    b1 + b2 + b3 + b4 <= 3
A small illustrative sketch of this formulation follows below.
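The slide above states the big-M obstacle constraints directly; the following is a minimal sketch of that formulation written with the open-source PuLP modeler rather than the CPLEX solver used in the presentation. The horizon length, speed limit, obstacle rectangle, and big-M value are illustrative assumptions, and the double-integrator dynamics are simplified to a per-axis speed bound.

```python
# Minimal sketch of the slide-11 big-M obstacle constraints, using the
# open-source PuLP modeler purely for illustration (the slides use ILOG CPLEX).
# Horizon, limits, obstacle geometry, and M are illustrative assumptions.
import pulp

T, dt, M = 10, 1.0, 1.0e4          # steps, step length [s], big-M constant
vmax = 15.0                        # per-axis speed limit (assumed)
start, goal = (0.0, 0.0), (50.0, 40.0)
ox1, ox2, oy1, oy2 = 20.0, 30.0, 15.0, 25.0   # rectangular obstacle bounds

prob = pulp.LpProblem("milp_path", pulp.LpMinimize)
x = [pulp.LpVariable(f"x{k}") for k in range(T + 1)]
y = [pulp.LpVariable(f"y{k}") for k in range(T + 1)]
ex = pulp.LpVariable("ex", lowBound=0)   # 1-norm slack on final x error
ey = pulp.LpVariable("ey", lowBound=0)   # 1-norm slack on final y error

prob += ex + ey                          # objective: 1-norm distance to destination
prob += x[0] == start[0]
prob += y[0] == start[1]
prob += ex >= x[T] - goal[0]
prob += ex >= goal[0] - x[T]
prob += ey >= y[T] - goal[1]
prob += ey >= goal[1] - y[T]

for k in range(T):
    # crude per-axis speed limit (the full formulation uses double-integrator
    # dynamics with acceleration bounds; omitted here for brevity)
    prob += x[k + 1] - x[k] <= vmax * dt
    prob += x[k] - x[k + 1] <= vmax * dt
    prob += y[k + 1] - y[k] <= vmax * dt
    prob += y[k] - y[k + 1] <= vmax * dt

    # one set of binaries per obstacle per time step: each b_i relaxes one
    # face constraint, and at most three may be relaxed simultaneously
    b = [pulp.LpVariable(f"b{k}_{i}", cat=pulp.LpBinary) for i in range(4)]
    prob += x[k + 1] - M * b[0] <= ox1   # stay left of the obstacle, or
    prob += x[k + 1] + M * b[1] >= ox2   # stay right of it, or
    prob += y[k + 1] - M * b[2] <= oy1   # stay below it, or
    prob += y[k + 1] + M * b[3] >= oy2   # stay above it
    prob += b[0] + b[1] + b[2] + b[3] <= 3

prob.solve()
path = [(x[k].value(), y[k].value()) for k in range(T + 1)]
```

Each binary b_i relaxes one face constraint of the rectangle; requiring b1 + b2 + b3 + b4 <= 3 forces at least one face constraint to hold, which keeps the position outside the obstacle at that time step.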
Slide 12: Receding Horizon MILP Path Planning
- Path is computed periodically, with the most current information
- Planning horizon and replan period are chosen based on problem type, computational requirements, and environment
- Only a subset of the current plan is executed before replanning
- Receding horizon reduces computation time: shorter planning horizon; does not plan all the way to the destination
- Receding horizon introduces robustness to path planning: pop-up obstacles, unexpected obstacle movement

Slide 13: Obstacle Avoidance
[Figures: nap-of-the-earth flight at treetop level; urban low-altitude operations]

Slide 14: Collision Avoidance
- Problem is formulated identically to obstacle avoidance in MILP; air vehicles are moving obstacles
- Path calculation is based on the expected future trajectory of other vehicles
- Dealing with uncertainty: vehicles of uncertain intent can be enlarged with time; the receding horizon provides frequent replanning
[Figure: change in planned path (blue) in response to changes in intruder movement]

Slide 15: Coordinated Conflict Resolution
- 3-D multi-vehicle path-planning problem (centralized version)
- "Decentralized Cooperative Trajectory Planning of Multiple Aircraft with Hard Safety Guarantees," MIT
- Loiter maneuvers can be used to produce provably safe trajectories
- Minimum separation distance is specified in the problem formulation
- No limit to the number of vehicles
- Non-cooperative vehicles are treated as moving obstacles

Slide 16: Threat Avoidance
- Purpose: to avoid detection by known threats by planning the trajectory behind opaque obstacles
- Shadow-like "safe zones": one per threat/obstacle pair; well defined for convex obstacles; nice topological properties
- [Figure: vehicle hiding behind a building from the threat, with on-time arrival at the destination]
- Patent pending: Docket No. 000535-030

Slide 17: Summary: MILP Path Planning
- MILP provides fast global optimization: no suboptimal local minima; branch & bound provides fast tree search; commercial solver on an RTOS
- Tractability trade-offs:
  - Time discretization: constraints are active only at discrete points in time; time-scale refinement
  - Linear dynamics/constraints: the formulation should properly capture the nonlinearity of the solution space, so that the true global minimum is in a neighborhood of the MILP optimal solution

Slide 18: Optimal Trajectory Generation with Nonlinear Programming (NLP)

Slide 19: Problems & Goal of Trajectory Generation
- Currently, the primary method is pre-generated waypoint routes with little or no adaptation or reaction to threats or condition changes
- Even the latest vehicles have low autonomy levels and do exactly what they are told, largely indifferent to the world around them
- What are the potential gains of near-real-time trajectory generation?
  - Improved effectiveness: reduced operator workload (a force multiplier); mission planning/re-planning; accounting for range and time delays
  - Improved survivability? The UAV trades success against risk; limp-home capability; autonomous threat mitigation (RCS, SAM, small arms, AA fire); air-to-air engagement; accurate release of cheap "dumb" ordnance
- Goal-driven autonomy: command "what," not "how"
- How best can we mimic (or improve on) human skill and speed at trajectory generation in complex environments?

Slide 20: Classical Trajectory Optimization Problem
- Minimize a trajectory cost subject to constraints from the system dynamics, environment, and actuation, with boundary conditions
- Issues: this becomes the traditional two-point constrained boundary value problem; it is computationally expensive due to the equality constraints from the system, environment, and actuation dynamics; it is currently intractable in the time required for effective control
- Hope? Perhaps our systems contain a structure which allows all solutions of the system (trajectories) to be smoothly mapped from a set of free trajectories in a reduced-dimensional space. Algebraic solutions in this reduced space would implicitly satisfy the dynamic constraints of the original system.

Slide 21: Trajectory Generation: Current Methods
- Brute-force numerical solution of the dynamic and constraint ODEs
- Solution method:
  1) Guess the control e(t)
  2) Propagate the dynamics from beginning to end (simulate)
  3) Propagate the constraints from beginning to end (simulate)
  4) Check for constraint violations
  5) Modify the guess e(t)
  6) Repeat until a feasible/optimal solution is obtained (optimize)
[Figure: the control guess e(t) at iteration 1 and iteration 2]
- The vast complexity and extremely long solution times are addressed by either or both of: very simple control curves; performing all calculations offline (selected/looked up online)
- Much of the previous work on this subject is devoted to improving the "wisdom" of the next guess

Slide 22: Differential Systems Suggest an Elegant Solution
- Perhaps our systems contain a structure which allows all solutions of the system (trajectories) to be smoothly mapped by a set of free trajectories in a reduced-dimensional space. Algebraic solutions in this reduced space would implicitly satisfy the original system's dynamics and constraint ODEs.
- Constraints are mapped into the flat space as well and also become time independent
- Direct solutions: we are modifying the same curve we are optimizing!
- Local support: every solution is only affected by the trajectory near it
- Basically a curve-fit problem (see the sketch below)
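To make the curve-fit view concrete, here is a small sketch using a kinematic unicycle, a textbook differentially flat system that is not taken from the slides; its flat outputs are simply the position coordinates. The boundary conditions, limits, and cubic parameterization below are illustrative assumptions.

```python
# A minimal sketch of flat-space trajectory generation for a kinematic
# unicycle (xdot = v cos(theta), ydot = v sin(theta), thetadot = omega),
# whose flat outputs are the position (x, y). All numbers and the polynomial
# parameterization are illustrative assumptions, not values from the slides.
import numpy as np

T = 10.0                          # maneuver time [s] (assumed)
t = np.linspace(0.0, T, 200)

def rest_to_rest(p0, pf):
    """Cubic with zero end velocities: p(0)=p0, p(T)=pf, pdot(0)=pdot(T)=0."""
    d = pf - p0
    return np.array([p0, 0.0, 3.0 * d / T**2, -2.0 * d / T**3])

def poly_eval(a, t):
    p   = a[0] + a[1] * t + a[2] * t**2 + a[3] * t**3
    dp  = a[1] + 2 * a[2] * t + 3 * a[3] * t**2
    ddp = 2 * a[2] + 6 * a[3] * t
    return p, dp, ddp

ax = rest_to_rest(0.0, 50.0)      # flat output 1: x(t)
ay = rest_to_rest(0.0, 20.0)      # flat output 2: y(t)
x, xd, xdd = poly_eval(ax, t)
y, yd, ydd = poly_eval(ay, t)

# Recover the full state and control algebraically from the flat outputs and
# their derivatives: no ODE integration ("simulation") is needed.
v = np.hypot(xd, yd)                                   # speed
theta = np.arctan2(yd, xd)                             # heading
omega = (xd * ydd - yd * xdd) / np.maximum(v**2, 1e-9) # turn rate

# Constraints (e.g. speed and turn-rate limits) become simple algebraic
# checks on the curve, evaluated pointwise along the trajectory.
v_max, omega_max = 10.0, 0.5                           # assumed limits
feasible = bool(np.all(v <= v_max) and np.all(np.abs(omega) <= omega_max))
print("peak speed %.2f m/s, peak turn rate %.3f rad/s, feasible: %s"
      % (v.max(), np.abs(omega).max(), feasible))
```

Because the state and control are recovered by differentiating the curve rather than integrating the ODE, changing a coefficient directly reshapes the candidate trajectory: we really are modifying the same curve we are optimizing.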
Slide 23: Differential Flatness
- Definition: a system is said to be differentially flat if there exist variables z1, ..., zm of the form z = w^-1(x, u, u', ..., u^(r)) such that (x, u) can be expressed in terms of z and its derivatives by an equation of the form (x, u) = w(z, z', ..., z^(q))
- Example: point-to-point trajectory problem
- Differential constraints are reduced to algebraic equations in the flat space!
- Note: dynamic feedback linearization via endogenous feedback is equivalent to differential flatness

Slide 24: Instinct Autonomy: Now Using Flatness
- Simply find any curve that satisfies the constraints in the flat space
- Solution method:
  1) Map the system to the flat space using w^-1
  2) Guess a trajectory of the flat output z
  3) Compare against the constraints (in the flat space)
  4) Optimize over control points
- When completed, apply the w function to convert back to the normal space
- Much simpler control space, no simulation required: very simple to manipulate curves; all calculations performed online on the vehicle
[Figure: z1, z2, e]
A small sketch of the "optimize over control points" step follows below.
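As a rough illustration of step 4, the sketch below optimizes the coefficients of a flat-output polynomial for a double integrator (flat output z = position, control u = z''), using SciPy's SLSQP routine as a stand-in for the SQP methods mentioned in the summary; the cost, limits, boundary conditions, and parameterization are assumptions made for the example.

```python
# Minimal sketch of "optimize over control points" in the flat space, using
# scipy's SLSQP solver as a stand-in SQP method. The system is a double
# integrator (flat output z = position, control u = z''); all numbers and the
# polynomial parameterization are illustrative assumptions.
import numpy as np
from scipy.optimize import minimize

Tf, umax = 5.0, 3.0                  # final time [s], accel limit (assumed)
t = np.linspace(0.0, Tf, 50)         # constraint-check grid

def z_derivs(c):
    """Flat output and its first two derivatives for coefficients c (degree 5)."""
    z   = sum(c[i] * t**i for i in range(6))
    zd  = sum(i * c[i] * t**(i - 1) for i in range(1, 6))
    zdd = sum(i * (i - 1) * c[i] * t**(i - 2) for i in range(2, 6))
    return z, zd, zdd

def cost(c):
    _, _, u = z_derivs(c)            # control recovered algebraically: u = z''
    return np.trapz(u**2, t)         # minimize control effort

def boundary(c):
    z, zd, _ = z_derivs(c)           # rest-to-rest: z(0)=0, z(Tf)=10, zd(0)=zd(Tf)=0
    return np.array([z[0], z[-1] - 10.0, zd[0], zd[-1]])

def accel_margin(c):
    _, _, u = z_derivs(c)            # pointwise |u| <= umax on the grid
    return umax - np.abs(u)

c0 = np.zeros(6)                     # (poor) initial guess for the control points
res = minimize(cost, c0, method="SLSQP",
               constraints=[{"type": "eq", "fun": boundary},
                            {"type": "ineq", "fun": accel_margin}])
z_opt, _, u_opt = z_derivs(res.x)    # map back to state/control (the "w" step)
print("converged:", res.success, " peak |u| = %.2f" % np.abs(u_opt).max())
```

All constraints are evaluated algebraically on a time grid in the flat space, so no simulation of the dynamics is needed inside the optimization loop.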
Slide 25: Too Good to be True? What Did We Lose?
- It seems reasonable that such a reduction in complexity would result in some sort of approximation
- Many systems lose nothing at all: linear models that are controllable (including non-minimum phase); fully flat nonlinear models
- Some systems make reasonable assumptions: conventional aircraft make the same assumptions as dynamic inversion
- Some systems are much less obvious and more complicated: this is one of the hardest questions of differential flatness; identifying the flat output can be very difficult, and modern configurations are very challenging
- After one stabilization loop, most systems become differentially flat (or very close to it)

Slide 26: SEC Autonomous Trajectory Generation
[Figure: flight demonstration with commanded waypoints: GO TO WP_D, GO TO WP_A, GO TO Rnwy_3, GO TO WP_C, GO TO WP_A, GO TO WP_S]

Slide 27: Summary
- MILP path planning
  - MILP provides fast global optimization: no suboptimal local minima; branch & bound provides fast tree search; commercial solver on an RTOS
  - Tractability trade-offs: time discretization (constraints active only at discrete points in time; time-scale refinement); linear dynamics/constraints (the formulation should properly capture the nonlinearity of the solution space, so that the true global minimum is in a neighborhood of the MILP optimal solution)
- Optimal trajectory generation (OTG): fast nonlinear optimization
  - Optimal control for full nonlinear systems
  - The differential flatness property allows the problem to be mapped to a lower-dimensional space for the NLP solver; the absence of dynamics in the new space speeds optimization and eases constraint propagation
  - Problem setup should focus on the right "basin of attraction": the NLP solver seeks locally optimal solutions via SQP methods, so a good initial guess is needed; use in conjunction with global methods, i.e. MILP

Slide 28: Conclusions
- Optimization-based approaches help achieve a higher level of autonomy by enabling autonomous decision making
- Cast autonomy applications into standard optimization problems, to be solved using existing optimization tools and frameworks
- Benefits: no need to build a custom solver, an existing body of theory, and continued improvement in solver technology
- Future: a broad range of complex autonomy applications is enabled by a wide, continuous spectrum of powerful optimization engines and approaches
- Challenge: advanced development of V&V, sensing, and fusion technology, leading to widespread certification and adoption

Thanks/Credits:
- NTG/OTG approach: Mark Milam/NGST, Prof. R. Murray/Caltech
- MILP approach: Prof. Jonathan How/MIT
- Autonomy slides: Jonathan Mead/NGST
- OTG slides: Travis Vetter/NGIS