Adaptive Guidance and Control
For Autonomous Launch Vehicles
Eric N. Johnson
Anthony J. Calise
[email protected]
[email protected]
School of Aerospace Engineering
Georgia Institute of Technology, Atlanta, GA 30332
(404) 894-3000
J. Eric Corban
[email protected]
Guided Systems Technologies, Inc.
McDonough, GA 30253-1453
(770) 898-9100
Abstract— Adaptive guidance technology is developed to
expand the potential of adaptive control when applied to
autonomous launch systems. Specifically, the technique of
pseudo-control hedging is applied to implement a fully
integrated approach to direct adaptive guidance and control.
For rocket powered launch vehicles, a recoverable failure
will generally lead to a reduction in total control authority.
Pseudo-control hedging was developed to prevent the
adaptive law from “seeing” and adapting to select vehicle
input characteristics such as actuator position limits, actuator
rate limits and linear input dynamics. In this work, a
previously developed adaptive inner-loop provides fault
tolerance using an inverting control system design
augmented with a neural network. An adaptive outer-loop is
introduced that provides closed-loop guidance for tracking
of a reference trajectory. The outer-loop adapts to force
perturbations, while the inner-loop adapts to moment
perturbations. The outer-loop is “hedged” to prevent
adaptation to inner-loop dynamics. The hedge also enables
adaptation while at control limits, and eliminates the need
for time-scale separation of inner and outer-loop dynamics,
which is potentially important for abort scenarios. The
paper develops the methodology for adaptive trajectory
following and control. Numerical simulation results in
representative failure scenarios for the X-33 reusable launch
vehicle demonstrator are then presented.
The paper
concludes with a brief summary of an autonomous guidance
and control system appropriate for future reusable launch
vehicles, and the application of the developed adaptive
components within such an architecture.
TABLE OF CONTENTS
1. INTRODUCTION
2. ADAPTIVE AUTOPILOT DESIGN
3. ADAPTIVE GUIDANCE LAW DESIGN
4. NUMERICAL SIMULATION RESULTS FOR X-33
5. SUMMARY OF AUTONOMOUS G&C SYSTEM

0-7803-6599-2/01/$10.00 © 2001 IEEE
Updated December 15, 2000
1. INTRODUCTION
Reliable and affordable access to space along with a global
engagement capability are now recognized as critical
requirements of the U.S. Air Force in the 21st century. New
initiatives are in place to create a truly integrated AeroSpace
force, and include technology developments that will enable
a hypersonic cruise reconnaissance/strike vehicle that could
reach any spot in the world within three hours. Such
technology will also support development of two stage
military spaceplanes in the coming decade, and eventually
lead to unmanned military single-stage-to-orbit vehicles.
NASA has similarly established aggressive goals for
reduction in both the cost of operations and turn-around time
of future Reusable Launch Vehicles (RLVs), and seeks to
obtain greatly enhanced safety in operations. Autonomous
guidance and control (G&C) technologies are recognized as
critical to the objective of achieving reliable, low-cost,
aircraft-like operations into space.
Specifically, next
generation G&C systems must be able to fly a variety of
vehicle types in multiple mission scenarios, as well as
handle dispersions, failures and abort requirements in a
robust fashion [1]. This paper is concerned with the
development of autonomous G&C systems for RLVs, and in
particular with the ability to handle large dispersions and
failures through adaptation.
Neural network-based direct adaptive control has recently
emerged as an enabling technology for practical
reconfigurable flight control systems. In the recent USAF
Reconfigurable Control for Tailless Fighter Aircraft
(RESTORE) program, adaptive nonlinear control was
combined with on-line real-time parameter identification and
on-line constrained optimization to demonstrate the
capability of a next generation aircraft control system with
redundant control actuation devices to successfully adapt to
unknown failures and damage. The reconfigurable flight
control system was based on a dynamic inversion control
law augmented by an on-line neural network. By means of
a bounded weight update law, the neural network
continuously learns (i.e., adapts) to produce an output that is
used to cancel the inversion error between the plant model
(used by the dynamic inversion control law) and the true
vehicle dynamics [2]. The program culminated in successful
flight demonstration of the adaptive controller on the X-36
[3].
This approach to failure and damage tolerant control has
recently been improved to handle control saturation,
unmodeled actuator dynamics, and quantized control, and
applied to autopilot design for the X-33. The X-33 is a suborbital aerospace vehicle intended to demonstrate
technologies necessary for future Reusable Launch Vehicles.
Features of the design include a linear aerospike rocket
engine, vertical take-off, and horizontal landing. For X-33,
it is desirable to provide stable recovery and performance
under anticipated and unanticipated failures, aborts, and
variations in the environment and vehicle dynamics. A
simulation study has shown that neural network (NN)
augmented non-linear adaptive flight control provides an
approach that maintains stable performance under large
variations in the vehicle and environment. This can have a
two-fold benefit, by increasing safety in the presence of
unanticipated failures and by reducing the tuning required per mission due to small changes in vehicle/environment/payload configuration. These improvements have the potential to directly reduce cost and increase the safety of future operational launch vehicles [4].
Adaptive guidance technology is required, however, to
realize the full potential of adaptive control in application to
RLVs. For some classes of failures, one would expect to be
able to continue to track the nominal trajectory (i.e.,
guidance commands). But for others, a loss in thrust and/or
control power will prevent successful tracking of the
nominal solution. The combination of adaptive guidance
and adaptive control is required to successfully manage a
wide class of potential failures in autonomous launch
systems. By adaptive guidance we mean both trajectory
regeneration (when needed), and successful trajectory
tracking despite the potential for significant force
perturbations. Possible failures lead to a large number of
abort scenarios that must be addressed. Ascent, reentry, and
abort trajectories can be complicated in failure scenarios by
constraints that result from a reduction in control power
(induced by the failure) and by the potential for control
saturation.
The capability for on-board trajectory
regeneration is thus essential to establish the ability to
overcome such in-flight failures. The further addition of
adaptive tracking of the trajectory will allow for successful
mission completion in an expanded set of failure scenarios,
and provides the time necessary for diagnosis of failures and
for the regeneration of the desired trajectory.
Neighboring optimal solutions based on real-time
identification of perturbations in control effectiveness have
been proposed as one possible means of accomplishing on-line trajectory regeneration. The neighboring optimal
approach attempts to circumvent the requirement to solve a
two-point-boundary-value problem using linear (although
time varying) optimal control theory. Implementation
typically involves a computationally intense solution for
time-varying gains, and its application is limited to small
perturbations from the nominal trajectory. Furthermore, for
the neighboring solution to be valid, it is required that the
nominal trajectory be optimal. This is no longer necessarily
true following a failure. In fact, until a model of the failed
system can be identified that is valid across the flight
envelope, attempts to optimize the trajectory for the failed
system will be of limited value.
There are a number of mature technologies for fault
detection and isolation that are appropriate for use in
detecting, isolating and identifying various classes of
failures. There are, as well, mature research efforts focused
on the task of on-line system identification [5,6]. On-line
system identification can be used to detect and model many
additional classes of failures, but will require flight time
following the failure to produce a valid result. The
proposed autonomous G&C architecture will draw upon this
technology base to produce on-line an approximate model of
the failed system. Note however, that for trajectory replanning, it is not sufficient to capture the local impact of
the failure. In many cases it will be necessary to fully isolate
the cause of degraded performance, so that its impact can be
correctly modeled across the flight envelope.
Recognizing that significant time is required to construct a
model of the failed system, it will be necessary to devise a
proper guidance strategy for this interim period. In this
paper, pseudo-control hedging [4] is used to locally modify
the nominal trajectory commands so that the vehicle follows
a feasible path with the same overall objectives as the
nominal trajectory. The modified (i.e. hedged) trajectory
commands are achievable despite a reduction in control
power and/or control saturation. This strategy does not
require the identification of a failed system model for
implementation, and the requirement for optimality can be
relaxed in the absence of a model reflecting the failure.
Once an approximate model of the failed system is
produced, the impact of the failure on the mission can be
assessed and the mission objectives can be intelligently
reassigned. At this time it may become necessary (and
possible) to generate a new optimal path.
Without the benefit of a valid neighboring optimal solution,
one must resort to regeneration of the optimal trajectory on-line. While great success has been achieved in numerically
solving complex nonlinear trajectory optimization problems
using either direct or indirect techniques, most of these
algorithms have proven ill-suited for on-board
implementation [7]. However, a recently developed hybrid
method for trajectory optimization has proven suitable for
on-board implementation. The hybrid designation in this
context refers to the combined use of analytical and
numerical methods of solution. Specifically, the approach
combines optimal control with collocation techniques, and
has the demonstrated potential of being able to determine
profiles with modified targeting in a closed-loop fashion on-board the vehicle [8-10]. The availability of this approach
circumvents any need to consider neighboring optimal
solutions, and also provides the capability to respond in
flight to redefined mission objectives and aborts. However,
as noted earlier, a full-envelope model of the failed system is
required.
We focus in this paper on the development of a strategy for
closed-loop guidance (i.e., trajectory following). This
strategy is designed to provide for (1) adaptive closed-loop
guidance during nominal operation; (2) adaptive closed-loop
guidance and guidance command modification (i.e., local
trajectory reshaping) to maintain feasible guidance
commands in the time period between the occurrence of a
failure and the completion of modeling the failure followed
(if necessary) by trajectory optimization; and (3) for tracking
of the trajectory in all cases when subject to potentially
significant variations in the force equations as well as
control saturation. This paper does not address on-line
modeling of the failed system, nor on-line trajectory
optimization given such a model.
The method of pseudo-control hedging, first introduced in
[4], is used to implement this fully integrated approach to
direct adaptive guidance and control. This approach is
depicted in block diagram form in Figure 1-1. An adaptive
inner-loop provides fault tolerance as in the X-33 and X-36
applications previously discussed. An adaptive outer-loop
is introduced that provides closed-loop guidance for tracking
of the reference trajectory. The outer-loop adapts to force
perturbations, while the inner-loop adapts to moment
perturbations. The outer-loop is “hedged” to prevent
adaptation to inner-loop dynamics. The hedge also enables
adaptation while at control limits, and eliminates the need
for time-scale separation of inner and outer-loop dynamics,
which is important for abort cases. This approach provides a
logical strategy for guided flight during the time required for
on-line system identification and trajectory regeneration, and
also deals with unidentified failure cases.
[Figure omitted: block diagram — reference trajectory feeding hedged outer and inner loops that share a single neural network]
Figure 1-1 – Integrated Adaptive Guidance and Control Using Neural Networks
The paper proceeds as follows. Section 2 describes pseudo-control hedging and the adaptive inner-loop design in detail. Section 3 describes the development of an adaptive outer-loop for trajectory following which commands the inner loop of Section 2. Section 3 also includes numerical results from a simple idealized launch vehicle simulation to illustrate the function of the adaptive outer loop. Section 4 summarizes application of the developed method to the X-33 reusable launch vehicle demonstrator, and presents
numerical simulation results for the system response in two
representative failure cases. Section 5 completes the paper
by summarizing an overall approach to autonomous
guidance and control for hypersonic vehicles that employs
the adaptive components.
2. ADAPTIVE AUTOPILOT DESIGN
First consider the method termed pseudo-control hedging.
The purpose of the method is to prevent the adaptive
element of a control system from trying to adapt to selected
system input characteristics (characteristics of plant or of the
controller). To do this, the adaptive law is prevented from
“seeing” selected system characteristics.
A plain-language conceptual description of the method is:
The reference model is moved in the opposite direction
(hedged) by an estimate of the amount the plant did not
move due to system characteristics the control designer
does not want the adaptive control element to see. To
formalize the language in the context of an adaptive control
law involving dynamic inversion, “movement” should be
replaced by some system signal. Preventing the adaptive
element from ‘seeing’ a system characteristic means to
prevent that adaptive element from seeing the system
characteristic as model tracking error.
Figure 2-1 is an illustration of conventional Model
Reference Adaptive Control (MRAC) with the addition of
pseudo-control hedging compensation. The pseudo-control
hedge compensator is designed to modify the response of the
reference model.
[Figure omitted: MRAC block diagram — command $x_c$, controller, plant, reference model with PCH input, tracking error $e$, and adaptation law]
Figure 2-1 – Model Reference Adaptive Control (MRAC) with pseudo-control hedge compensation

The more specific case of pseudo-control hedging applied to an adaptive control architecture employing approximate dynamic inversion is illustrated in Figure 2-2. The adaptive element shown in Figure 2-2 is any compensator attempting to correct for errors in the approximate dynamic inversion. This could be as simple as an integrator or something more powerful such as an on-line neural network.

[Figure omitted: block diagram — reference model, P-D compensator, adaptive element, approximate dynamic inversion, actuator, and plant, with the PCH block fed by the actuator position estimate $\hat{\delta}$]
Figure 2-2 – MRAC including an approximate dynamic inversion; the pseudo-control hedge component utilizes an estimate of actuator position

Designs of a suitable neural network architecture and its associated update law for the controller architecture illustrated in Figure 2-2 are well documented in the literature, as is the associated proof of boundedness [2,4].

The design of the pseudo-control hedge compensator for the controller architecture illustrated in Figure 2-2 is now described. For simplicity, consider the case of full model inversion, in which the plant dynamics are taken to be of the form

$$\ddot{x} = f(x, \dot{x}, \delta) \qquad (1)$$

where $x, \dot{x}, \delta \in \Re^n$. An approximate dynamic inversion element is developed to determine actuator commands of the form

$$\delta_{cmd} = \hat{f}^{-1}(x, \dot{x}, \nu) \qquad (2)$$

where $\nu$ is the pseudo-control signal, and represents a desired $\ddot{x}$ that is expected to be approximately achieved by $\delta_{cmd}$. That is, this dynamic inversion element is designed without consideration of the actuator model (i.e., "perfect" actuators). This command ($\delta_{cmd}$) will not equal actual control ($\delta$) due to actuator dynamics.

To get a pseudo-control hedge ($\nu_h$), an estimated actuator position ($\hat{\delta}$) is determined based on a model or a measurement. In cases where the actuator position is measured, it is regarded as known ($\hat{\delta} = \delta$). This estimate is then used to get the difference between commanded pseudo-control and the achieved pseudo-control

$$\nu_h = \hat{f}(x, \dot{x}, \delta_{cmd}) - \hat{f}(x, \dot{x}, \hat{\delta}) \qquad (3)$$

$$= \nu - \hat{f}(x, \dot{x}, \hat{\delta}) \qquad (4)$$

With the addition of pseudo-control hedge, the reference model has a new input, $\nu_h$. As introduced earlier, the pseudo-control hedge is to be subtracted from the reference model state update. For example, if the (stable) reference model dynamics without pseudo-control hedge were of the form

$$\ddot{x}_{rm} = f_{rm}(x_{rm}, \dot{x}_{rm}, x_c) \qquad (5)$$

where $x_c$ is the command signal, then the reference model dynamics with pseudo-control hedge become

$$\ddot{x}_{rm} = f_{rm}(x_{rm}, \dot{x}_{rm}, x_c) - \nu_h \qquad (6)$$

The instantaneous pseudo-control output of the reference model (if used) is not changed by the use of pseudo-control hedge, and remains

$$\nu_{rm} = f_{rm}(x_{rm}, \dot{x}_{rm}, x_c) \qquad (7)$$

In other words, the pseudo-control hedge signal $\nu_h$ affects reference model output $\nu_{rm}$ only through changes in reference model state.
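As a concrete illustration of Eqns (2) through (6), the hedge computation can be sketched for a scalar double integrator with a position-limited actuator. The models and numbers below are illustrative assumptions for this sketch only, not any design used later in the paper:

```python
import numpy as np

def f_hat(x, xdot, delta):
    # Approximate plant model: scalar double integrator, xddot = delta
    return delta

def f_hat_inv(x, xdot, nu):
    # Approximate dynamic inversion, Eqn (2)
    return nu

def actuator(delta_cmd, limit=1.0):
    # The actuator model the adaptive element should not "see": position saturation
    return float(np.clip(delta_cmd, -limit, limit))

def hedge(x, xdot, nu, delta_hat):
    # Eqn (4): nu_h = nu - f_hat(x, xdot, delta_hat), i.e. the portion of the
    # pseudo-control the plant is not expected to achieve
    return nu - f_hat(x, xdot, delta_hat)

# One control step with a command beyond the actuator position limit
x, xdot, nu = 0.0, 0.0, 2.5
delta_cmd = f_hat_inv(x, xdot, nu)     # 2.5, outside the +/-1.0 limit
delta_hat = actuator(delta_cmd)        # modeled actuator position: 1.0
nu_h = hedge(x, xdot, nu, delta_hat)   # 1.5; Eqn (6) subtracts this from
                                       # the reference model state update
```

When the command stays within the limit, $\nu_h = 0$ and the reference model is unaffected, consistent with the ideal-actuator case.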
The following sub-sections discuss the theory associated with this application of pseudo-control hedging, as well as its limitations. There are two fundamental changes associated with pseudo-control hedge that affect existing boundedness theorems for the NN architecture. First, there is a change to the model tracking error dynamics ($e$), which are the basis for the adaptation laws presented in the earlier work of references [2] and [4]. Second, the reference model is not necessarily stable, because it is now coupled with the rest of the system.
Model Tracking Error Dynamics

The complete pseudo-control signal for the system introduced earlier is

$$\nu = \nu_{rm} + \nu_{pd} - \nu_{ad} + \nu_r \qquad (8)$$

where the reference model signal $\nu_{rm}$ is given by Eqn (7), and the Proportional-Derivative (PD) compensator output ($\nu_{pd}$) is acting on tracking error

$$\nu_{pd} = [K_d \;\; K_p]\, e \qquad (9)$$

where $K_d$ and $K_p$ are diagonal matrices containing desired second-order linear error dynamics, and model tracking error is expressed as

$$e = \begin{bmatrix} \dot{x}_{rm} - \dot{x} \\ x_{rm} - x \end{bmatrix} \qquad (10)$$

The adaptation signal ($-\nu_{ad} + \nu_r$) is the output of the adaptive element, where the so-called robustifying term $\nu_r$ is dropped in the remainder of this section for clarity. The model tracking error dynamics are now found by differentiating Eqn (10) and utilizing the previous Eqns:

$$\dot{e} = Ae + B\left[\nu_{ad}(x, \dot{x}, \hat{\delta}) - f(x, \dot{x}, \delta) + \hat{f}(x, \dot{x}, \hat{\delta})\right] \qquad (11)$$

where

$$A = \begin{bmatrix} -K_d & -K_p \\ I & 0 \end{bmatrix} \qquad (12)$$

$$B = \begin{bmatrix} I \\ 0 \end{bmatrix} \qquad (13)$$

Model error to be compensated for by $\nu_{ad}$ is defined as

$$\Delta(x, \dot{x}, \delta, \hat{\delta}) = f(x, \dot{x}, \delta) - \hat{f}(x, \dot{x}, \hat{\delta}) \qquad (14)$$

If one assumes that $\delta$ is exactly known ($\hat{\delta} = \delta$), it follows from Eqn (11) that

$$\dot{e} = Ae + B\left[\nu_{ad}(x, \dot{x}, \hat{\delta}) - \Delta(x, \dot{x}, \hat{\delta})\right] \qquad (15)$$

Eqn (15) is of the same form as the model tracking error dynamics seen in previous work [2,3]. As a result, the boundedness of a NN adaptation law given by earlier results can be used with some modification.

Remark 1: If instead one makes the less restrictive assumption that the realization of $\hat{\delta}$ does not produce any additional dynamics (i.e., contains no internal states), then

$$\dot{e} = Ae + B\left[\nu_{ad}(x, \dot{x}, \hat{\delta}) - \Delta'(x, \dot{x}, \hat{\delta})\right] \qquad (16)$$

where

$$\Delta'(x, \dot{x}, \hat{\delta}) = \Delta(x, \dot{x}, \delta, \hat{\delta}) \qquad (17)$$

appears as model error to the adaptive law.

Remark 2: When the realization of $\hat{\delta}$ does contain additional dynamics, these dynamics will appear as unmodeled input dynamics to the adaptive law. Previous results that improve robustness to unmodeled input dynamics can be applied to address a residual model error ($\varepsilon'$), which comes about when Eqns (11) and (17) are applied to put the tracking error dynamics in the following form:

$$\dot{e} = Ae + B\left[\nu_{ad}(x, \dot{x}, \hat{\delta}) - \Delta'(x, \dot{x}, \hat{\delta}) + \varepsilon'(t)\right] \qquad (18)$$

Stability of the Reference Model

A significant difference from previous MRAC work is that the reference model is not necessarily stable. This occurs because the assumptions made up to this point allow the adaptive element to continue to function when the actual control signal has been replaced by any arbitrary signal. This completely arbitrary signal does not necessarily stabilize the plant.

However, stability and tracking are still of interest for closed-loop control. System characteristics to be removed from the adaptation must be limited to items that are a function of the commanded control, such as saturation, quantized control, linear input dynamics, and latency. This class of system characteristics will be referred to in this section as an actuator model. In general, this actuator model could also be a function of plant state.

System response for ($\hat{\delta} = \delta$) is now

$$\ddot{x} = \hat{f}(x, \dot{x}, \hat{\delta}) + \Delta(x, \dot{x}, \hat{\delta}) \qquad (19)$$

When the actuator is ideal, one obtains

$$\ddot{x} = \hat{f}(x, \dot{x}, \delta_{cmd}) + \Delta(x, \dot{x}, \hat{\delta}) \qquad (20)$$
Remark 3: When the actuator is “ideal” and the actual
position and the commanded position are equal, the addition
of pseudo-control hedge has no effect on any system signal.
Remark 4: When the actuator position and command differ,
the adaptation occurs as though the command had
corresponded to the actual. Also, the system response is as
close to the command as was permitted by the actuator
model.
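The structure of the error dynamics in Eqns (12) and (13) can be verified numerically; with positive diagonal gains each axis contributes the characteristic polynomial $s^2 + K_d s + K_p$, so $A$ is Hurwitz. The gain values in this sketch are arbitrary illustrations:

```python
import numpy as np

n = 3                               # number of controlled axes (illustrative)
Kd = np.diag([4.0, 5.0, 6.0])       # derivative gains
Kp = np.diag([4.0, 6.25, 9.0])      # proportional gains

# Eqns (12) and (13): e_dot = A e + B [nu_ad - Delta]
A = np.block([[-Kd, -Kp],
              [np.eye(n), np.zeros((n, n))]])
B = np.vstack([np.eye(n), np.zeros((n, n))])

# Each axis contributes s^2 + kd*s + kp, so A is Hurwitz for kd, kp > 0
assert np.all(np.linalg.eigvals(A).real < 0)
```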
Previous results suggest that pseudo-control hedging prevents interactions between the adaptive element and the actuator model. However, there can clearly be interactions between the actuator model and the reference model and linear compensation (PD control) that can degrade stability and tracking performance. It is through selection of desired dynamics that the control system designer should address the actuator, utilizing methods from non-adaptive control, independent of the adaptive law. This is the desired condition, because the limitations of the actuators are normally a primary driver in selection of desired system dynamics.
Neural Network

NNs are universal approximators in that they can approximate any smooth nonlinear function to within arbitrary accuracy, given a sufficient number of hidden layer neurons and input information. In this section, a NN is described for use as the adaptive element ($\nu_{ad}$). Figure 2-3 shows the structure of a Single Hidden Layer (SHL) perceptron NN.

[Figure omitted: SHL network with inputs $x_1, \ldots, x_{n_1}$, biases $b_v, b_w$, weight matrices $V, W$, hidden-layer sigmoids $\sigma_1, \ldots, \sigma_{n_2}$, and outputs $\nu_{ad_1}, \ldots, \nu_{ad_{n_3}}$]
Figure 2-3 – The Single Hidden Layer (SHL) Perceptron Neural Network

The input-output map can be expressed as

$$\nu_{ad_k} = b_w \theta_{w,k} + \sum_{j=1}^{n_2} w_{j,k} \sigma_j \qquad (21)$$

where $k = 1, \ldots, n_3$ and

$$\sigma_j = \sigma\!\left(b_v \theta_{v,j} + \sum_{i=1}^{n_1} v_{i,j} x_i\right) \qquad (22)$$

Here $n_1$, $n_2$, and $n_3$ are the number of input nodes, hidden layer nodes, and outputs respectively. The scalar function $\sigma$ is a sigmoidal activation function that represents the 'firing' characteristics of the neuron, e.g.

$$\sigma(z) = \frac{1}{1 + e^{-az}} \qquad (23)$$

The factor $a$ is known as the activation potential, and is normally a distinct value for each neuron. For convenience, define the two weight matrices

$$V = \begin{bmatrix} \theta_{v,1} & \cdots & \theta_{v,n_2} \\ v_{1,1} & \cdots & v_{1,n_2} \\ \vdots & \ddots & \vdots \\ v_{n_1,1} & \cdots & v_{n_1,n_2} \end{bmatrix} \qquad (24)$$

$$W = \begin{bmatrix} \theta_{w,1} & \cdots & \theta_{w,n_3} \\ w_{1,1} & \cdots & w_{1,n_3} \\ \vdots & \ddots & \vdots \\ w_{n_2,1} & \cdots & w_{n_2,n_3} \end{bmatrix} \qquad (25)$$

and define a new sigmoid vector

$$\bar{\sigma}(z) = \begin{bmatrix} b_w & \sigma(z_1) & \sigma(z_2) & \cdots & \sigma(z_{n_2}) \end{bmatrix}^T \qquad (26)$$

where $b_w \geq 0$ allows for the threshold $\theta_w$ to be included in the weight matrix $W$. Define

$$\bar{x} = \begin{bmatrix} b_v & x_1 & x_2 & \cdots & x_{n_1} \end{bmatrix}^T \qquad (27)$$

where $b_v \geq 0$ is an input bias that allows for the threshold $\theta_v$ to be included in the weight matrix $V$. With the above definitions, the input-output map of the SHL NN in the controller architecture can be written in matrix form as

$$\nu_{ad} = W^T \bar{\sigma}(V^T \bar{x}) \qquad (28)$$

Consider a SHL perceptron approximation of the nonlinear function $\Delta$, introduced in Eqn (15), over a domain $D$ of $\bar{x}$. There exists a set of ideal weights $\{W^*, V^*\}$ that bring the output of the NN to within an $\varepsilon$-neighborhood of the error $\Delta(x, \dot{x}, \delta) = \Delta(\bar{x})$. This $\varepsilon$-neighborhood is bounded by $\bar{\varepsilon}$, defined by

$$\bar{\varepsilon} = \sup_{\bar{x}} \left\| W^T \bar{\sigma}(V^T \bar{x}) - \Delta(\bar{x}) \right\| \qquad (29)$$

The universal approximation theorem implies that $\bar{\varepsilon}$ can be made arbitrarily small given enough hidden layer neurons. The matrices $W^*$ and $V^*$ can be defined as the values that minimize $\bar{\varepsilon}$. These values are not necessarily unique. The NN outputs are represented by $\nu_{ad}$, where $W$ and $V$ are estimates of the ideal weights. Define

$$Z = \begin{bmatrix} V & 0 \\ 0 & W \end{bmatrix} \qquad (30)$$

and let $\|\cdot\|$ imply the Frobenius norm.
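A minimal sketch of the SHL input-output map of Eqns (23) through (28), with the thresholds folded into the first rows of $V$ and $W$ via the biases $b_v$ and $b_w$; the sizes and random weights below are arbitrary illustrations:

```python
import numpy as np

def sigmoid(z, a=1.0):
    # Eqn (23): sigmoidal activation with activation potential a
    return 1.0 / (1.0 + np.exp(-a * z))

def shl_forward(x, V, W, bv=1.0, bw=1.0):
    """Input-output map of the SHL perceptron, Eqn (28):
    nu_ad = W^T sigma_bar(V^T x_bar), where the biases bv and bw fold the
    thresholds theta_v, theta_w into V and W (Eqns 24-27)."""
    x_bar = np.concatenate(([bv], x))               # Eqn (27)
    z = V.T @ x_bar                                 # hidden-layer pre-activations
    sigma_bar = np.concatenate(([bw], sigmoid(z)))  # Eqn (26)
    return W.T @ sigma_bar

# Illustrative sizes: n1 inputs, n2 hidden neurons, n3 outputs
n1, n2, n3 = 4, 8, 3
rng = np.random.default_rng(0)
V = 0.1 * rng.standard_normal((n1 + 1, n2))   # row 0 carries theta_v
W = 0.1 * rng.standard_normal((n2 + 1, n3))   # row 0 carries theta_w
nu_ad = shl_forward(rng.standard_normal(n1), V, W)
assert nu_ad.shape == (n3,)
```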
Assumption 1: The norm of the ideal NN weights is bounded by a known positive value,

$$\|Z^*\| \leq \bar{Z} \qquad (31)$$

Define the derivative of the sigmoids as

$$\bar{\sigma}_z(z) = \frac{\partial \bar{\sigma}(z)}{\partial z} = \begin{bmatrix} \frac{\partial \sigma(z_1)}{\partial z_1} & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & \frac{\partial \sigma(z_{n_2})}{\partial z_{n_2}} \end{bmatrix} \qquad (32)$$

From the tracking error dynamics described previously, Eqn (15), define the vector

$$r = (e^T P B)^T \qquad (33)$$

where $P \in \Re^{2n \times 2n}$ is the positive definite solution to the Lyapunov equation $A^T P + PA + Q = 0$, and where a reasonable positive definite choice for $Q$ is

$$Q = \begin{bmatrix} K_d K_p & 0 \\ 0 & K_d K_p \end{bmatrix} \qquad (34)$$

The robustifying signal is chosen to be

$$\nu_r = -\left[K_{r0} + K_{r1}\left(\|Z\| + \bar{Z}\right)\right] r \qquad (35)$$

with $K_{r0}, K_{r1} > 0, \in \Re^{n \times n}$.

The following theorem [4] guarantees uniform ultimate boundedness of tracking errors, weights, and plant states. With a non-ideal actuator, one must also apply Assumption 2.

Assumption 2: Reference model signals remain bounded.

Theorem 1: Consider the feedback linearizable system given by Eqn (1), where $\mathrm{sign}\!\left(\partial f_i / \partial \delta_j\right)$ is known $\forall i, j = 1, 2, 3$. The augmented feedback control law given by Eqn (2), with $\nu$ defined by Eqns (4), (5), (7), (8), (9), (28), and (35), where $\dot{W}$ and $\dot{V}$ satisfy

$$\dot{W} = -\left\{\left(\bar{\sigma} - \bar{\sigma}_z V^T \bar{x}\right) r^T + \lambda \|r\| W\right\} \Gamma_w \qquad (36)$$

$$\dot{V} = -\Gamma_v \left\{\bar{x}\, r^T W^T \bar{\sigma}_z + \lambda \|r\| V\right\} \qquad (37)$$

with $\Gamma_w, \Gamma_v > 0$ and $\lambda > 0$, guarantees that all signals in the closed-loop system remain bounded.

Quaternion-Based NN Adaptive Flight Control Architecture

The quaternion-based adaptive flight control architecture employed is illustrated in Figure 2-4. The flight control system determines a desired angular acceleration, or pseudo-control, which forms the input to a nominal dynamic inversion. The nominal dynamic inversion converts these desired angular accelerations into the required control torque commands and then actuator commands (utilizing a control allocator).

[Figure omitted: block diagram — guidance, reference model with PCH, P-D control, on-line NN, nominal dynamic inversion, control allocation, and navigation and sensors closing the loop on $q, \omega$]
Figure 2-4 – Quaternion-based NN adaptive flight control architecture with pseudo-control hedge
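The quantities entering Theorem 1 can be assembled numerically as follows; this sketch forms $P$ from the Lyapunov equation, the vector $r$ of Eqn (33), and a single evaluation of the update laws (36) and (37). The dimensions, gains, and the simplified choice $Q = I$ are illustrative assumptions only:

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

n, n1, n2, n3 = 3, 6, 10, 3                  # axes, NN inputs, hidden, outputs
Kd, Kp = 4.0 * np.eye(n), 4.0 * np.eye(n)

# Error-dynamics matrices, Eqns (12)-(13)
A = np.block([[-Kd, -Kp], [np.eye(n), np.zeros((n, n))]])
B = np.vstack([np.eye(n), np.zeros((n, n))])

# Positive definite P from the Lyapunov equation A^T P + P A + Q = 0
Q = np.eye(2 * n)
P = solve_continuous_lyapunov(A.T, -Q)
assert np.all(np.linalg.eigvalsh(P) > 0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(1)
V = 0.1 * rng.standard_normal((n1 + 1, n2))  # row 0 carries theta_v
W = 0.1 * rng.standard_normal((n2 + 1, n3))  # row 0 carries theta_w
e = 0.01 * rng.standard_normal(2 * n)        # model tracking error, Eqn (10)
x_bar = np.concatenate(([1.0], rng.standard_normal(n1)))

r = e @ P @ B                                # Eqn (33): r = (e^T P B)^T
z = V.T @ x_bar
s_bar = np.concatenate(([1.0], sigmoid(z)))  # sigma_bar, Eqn (26)
s_z = np.zeros((n2 + 1, n2))                 # sigma_z with a zero bias row
s_z[1:, :] = np.diag(sigmoid(z) * (1.0 - sigmoid(z)))   # Eqn (32), a = 1

Gamma_w, Gamma_v, lam = 1.0, 1.0, 0.1
r_norm = np.linalg.norm(r)
# Eqns (36)-(37): weight update laws with the damping term lam*||r||
W_dot = -(np.outer(s_bar - s_z @ z, r) + lam * r_norm * W) * Gamma_w
V_dot = -Gamma_v * (np.outer(x_bar, r) @ W.T @ s_z + lam * r_norm * V)
assert W_dot.shape == W.shape and V_dot.shape == V.shape
```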
νh
ωg
qg
Error
Angles
ωn
2ζ
+
+
2ζω n
-
+
ω rm
1
s
q& = q& (ω c )
Figure 2-5 – Quaternion-based attitude reference
model with pseudo-control hedge
1
s
qrm
for which ν rm has the analytic form
ν rm = K d (ω g − ω rm ) + K pξ (q g , q rm )
(39)
and the reference model dynamics are of the form
ω& rm = K d (ω g − ω rm ) + K pξ (q g , q rm ) − ν h
is not desired. In this work we use pseudo-control hedging
to remove the effect of inner-loop dynamics and any innerloop adaptation from the outer loop process. The result is a
guidance system (the outer-loop) that can respond to force
perturbations like the inner-loop responds to moment
perturbations.
(40)
signal defined previously.
A block diagram of the combined inner and outer loops is
shown in Figure 3-1. The outer-loop is enclosed by the
gray-bordered box on the left-hand side of the figure, and
provides direct force effector commands (such as engine
throttle commands), as well as attitude command
adjustments to the inner-loop. The inner-loop is enclosed in
the gray-bordered box on the right-hand side of the figure,
and uses moment-generating effectors to achieve the attitude
commands generated by the outer loop. A single NN is
employed to serve the needs of both the inner and outer
loops. The NN thus has six outputs which are used to
correct for force and moment model errors in each of the
axes. In Figure 3-1, the symbols p and v represent
position and velocity respectively. The pseudo-control has
been delineated as linear acceleration ( a ) and angular
Vehicle angular acceleration can be modeled by
angle corrections from the outer-loop.
PD gains are applied to the tracking error in a manner
similar to that used for reference model dynamics
ν pd = M {K d (ω rm − ω ) + K pξ (q rm , q )}
(41)
where M is chosen to be identity for X-33. It should given
larger values when the reference model dynamics are
intended to be the dominant, lower frequency, response.
The pseudo-control is selected as
ν = ν rm + ν pd + ν r − ν ad
Here,
(42)
ν ad is the NN output, and ?r is the robustifying
?& = f ( x,? , δ )
acceleration ( α ). The symbol
(43)
? ∈ ℜ 3 represents the angular rate of the vehicle,
δ ∈ ℜ m represents the control effectors, and x represents
where
other vehicle states that angular acceleration depends upon,
with m > 3 . The approximately feedback linearizing
control law is
δ = fˆ −1 ( x,? , ν )
(44)
The next section describes the novel use of pseudo-control
hedging to implement an outer-loop adaptive controller that
is insensitive to inner loop dynamics. Numerical results for
the combined (inner and outer-loop) adaptive control system
follow.
∆qOL represents attitude
As shown in the figure, the outer-loop reference model is
driven by a stored nominal trajectory prescribed in terms of
commanded position and velocity. The filtered trajectory
commands, modified by the hedge signal, are combined with
the output of proportional plus derivative control of the
trajectory following error and the appropriate neural
network outputs to produce the pseudo-control in each axis
of control. As described in Section 2 for the inner loop
design, the pseudo-control serves as the input to the model
inversion process, and the neural network output signal
serves to cancel the errors that result from inversion of an
approximate model of the plant. An equivalent structure is
used for the inner loop, where the inner-loop reference
model is driven by the attitude commands associated with
the stored nominal trajectory modified by the output of the
outer-loop.
3. ADAPTIVE GUIDANCE LAW DESIGN
It is common practice to approach the guidance and control
problem by independent design of inner and outer loops.
The purpose of the inner-loop is to use the control surfaces
to achieve a desired attitude and angular velocity with
respect to the Earth or to the relative wind (i.e., angle of
attack and sideslip angle). The outer-loop generates innerloop commands to achieve a desired trajectory.
The theory and results given thus far have pertained to the inner-loop portion of the problem only. Introduction of adaptation in a traditional outer-loop that is to be coupled with this adaptive inner-loop is problematic. In particular, adaptation of the outer loop to the dynamics of the inner loop (which will appear to the outer loop as inversion error) must be prevented.
A simple idealized six-degree-of-freedom model of a rocket-powered launch vehicle performing a gravity-turn trajectory is used to illustrate the function of this two-loop
Figure 3-1 – Inner and outer-loop adaptive flight control architecture that utilizes pseudo-control hedging to de-couple the adaptation process
design and the effect of the hedge in the outer loop. The
nominal trajectory is defined in terms of position, velocity
and attitude commands. The simple model exhibits lift and
side-force, with thrust aligned along the body x-axis. The
controls are the three components of torque and the thrust
magnitude. A plot of the reference trajectory downrange
position versus altitude is given in Figure 3-2. The
idealization of the simulation model is such that the nominal
inverting controllers (for both force and moment) are exact.
However, at 20 seconds into the trajectory, a pitch-axis moment error (−0.05 rad/sec²) and a z-axis force error (15 ft/sec²) are introduced for the purpose of representing a failure that causes a significant change to the aerodynamic forces and moments the vehicle experiences. As depicted in Figure 3-3, these changes are significant enough that control saturation occurs due to the failure in the pitch axis.
Also plotted in Figure 3-2 is the trajectory command
generated by the outer loop command filter (i.e. reference
model). This is the trajectory that is to be tracked by the
flight system. During the period of control saturation, the
outer loop hedging signal alters the reference model output
to produce feasible trajectory commands. That is, the
trajectory is locally reshaped, but only as much as is
required to be feasible given the failure condition.
Knowledge of the failure condition is not required by the
controller. At 30 seconds into the flight, the simulated
failure is removed. As evident in the figure, the system
ultimately brings the vehicle back onto the original reference
trajectory. Adaptation for pitch moment is shown in Figure
3-4. As evident in the figure, the NN does a good job of
capturing model error induced by the simulated failure. The
simultaneous z-force adaptation is shown in Figure 3-5.
Note that the system is able to adapt much more quickly to the moment error, and that coupling between moment and force adaptation is not evident.
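The local reshaping performed by the outer-loop hedge can be illustrated with a single-axis sketch; the saturation limit, inertia figure, and second-order reference-model form below are assumptions for illustration, not the paper's implementation. The hedge is the portion of the commanded pseudo-control that the saturated plant cannot achieve, and the reference model is retarded by that amount so the adaptive element never "sees" the deficit.

```python
def hedge_signal(nu_cmd, delta_cmd, delta_limit=1.0, inv_inertia=0.01):
    """Pseudo-control hedge: nu_h = nu_cmd - f_hat(x, sat(delta_cmd)).

    Estimates the pseudo-control actually achievable once the effector
    command is position-limited, and returns the shortfall."""
    delta_actual = max(-delta_limit, min(delta_limit, delta_cmd))
    nu_achieved = inv_inertia * delta_actual
    return nu_cmd - nu_achieved

def ref_model_step(x_rm, v_rm, x_cmd, nu_h, dt=0.01, wn=0.5):
    """One Euler step of a second-order reference model whose acceleration
    is retarded by the hedge signal, locally reshaping the command."""
    a_rm = wn**2 * (x_cmd - x_rm) - 2.0 * wn * v_rm - nu_h
    v_rm += a_rm * dt
    x_rm += v_rm * dt
    return x_rm, v_rm

# during saturation the hedge is nonzero and the reference model slows down;
# when the command is feasible the hedge vanishes
nu_h_saturated = hedge_signal(0.05, delta_cmd=200.0)   # effector clipped
nu_h_feasible = hedge_signal(0.01, delta_cmd=1.0)      # effector at limit exactly
```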
Figure 3-2 – Gravity turn trajectory with force/moment error introduced at 20 seconds
Figure 3-5 – Time history of simultaneous force adaptation
Results showing combined inner and outer-loop adaptation
for the X-33 are given in the next section. The primary
deviation from the above description and idealized example
is that the throttle will be open-loop. That is, the nominal
throttle command is employed, and linear force adaptation
will occur for horizontal and vertical deviations from the
nominal trajectory.
4. NUMERICAL SIMULATION RESULTS FOR X-33
Figure 3-3 – Time history of pitch angular acceleration (i.e. pitch control) illustrating saturation at a maximum of 0.05 rad/sec² from 10-30 seconds
The subject guidance and control architecture was tested in
the Marshall Aerospace Vehicle Representation in C
(MAVERIC), which is the primary guidance and control
simulation tool for X-33. The simulation extends from
launch to Main Engine Cut-Off (MECO). Typical missions
include vertical launch and peak Mach numbers of
approximately 8, altitudes of 180,000 feet, and dynamic
pressures of 500 Knots Equivalent Air Speed (KEAS).
During ascent, vehicle mass drops by approximately a factor
of 3, and vehicle inertia by a factor of 2 due to fuel
consumption. The inner and outer-loop flight control
architecture illustrated in Figure 3-1 was used to generate
the results that follow.
The inner-loop approximate inversion consisted of
multiplying desired angular acceleration by an estimate of
vehicle inertia, and using a fixed-gain control allocation
matrix based on the existing baseline X-33 control allocation
system [11]. The outer-loop approximate inversion is a
transformation of acceleration commands to attitude
commands, which included only an estimate of the effect of
thrust tilt and a fixed linear model for the relationship
between aerodynamic-angle changes and aerodynamic force
coefficients. This conversion involves estimated thrust,
vehicle mass, and dynamic pressure.
Figure 3-4 – Time history of moment adaptation
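One way this acceleration-to-attitude conversion could be sketched is below; it is a small-angle, single-axis illustration, and the reference area and normal-force slope values are hypothetical placeholders, not X-33 data.

```python
def accel_to_attitude_cmd(a_z_cmd, thrust_est, mass_est, qbar,
                          S_ref=1600.0, C_N_alpha=2.0):
    """Invert a linear force model: a small pitch-attitude change tilts the
    thrust vector and changes angle of attack, both of which contribute
    normal acceleration. Returns the attitude increment (rad) commanded
    for a desired body z-axis acceleration (ft/sec^2).

    thrust_est : estimated thrust (lbf)
    mass_est   : estimated vehicle mass (slug)
    qbar       : dynamic pressure (lbf/ft^2)
    """
    # normal force produced per radian of attitude change:
    # thrust tilt term + aerodynamic term (qbar * S * C_N_alpha)
    dF_dtheta = thrust_est + qbar * S_ref * C_N_alpha
    return mass_est * a_z_cmd / dF_dtheta
```

In vacuum (qbar = 0) the conversion reduces to tilting the thrust vector alone, which is why the estimate degrades gracefully across the ascent.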
NN inputs were angle-of-attack, side-slip angle, bank angle, vehicle angular rate, body-axis velocity, and estimated pseudo-control (ν̂). Four middle-layer neurons were used; learning rates on W were unity for all axes, and learning rates for V were 20 for all inputs. For the inner loop, K_p and K_d were chosen based on a natural frequency of 1.0, 1.5, and 1.0 rad/sec for the roll, pitch, and yaw axes respectively and a damping ratio of 0.7. For the outer loop, they corresponded to 0.5, 0.2, and 0.1 rad/sec for the x, y, and z body-axis directions respectively, all with a damping ratio of unity.
Two failure cases are now discussed. The first is a failure of
a single body flap. The second is a hypothetical failure that
involves a large change in aerodynamic normal force
coefficient.
Flap Failure
Here, the right-side body flap freezes during flight. This
introduces roll, yaw, pitch, and lift disturbances to the
vehicle. No direct knowledge of the failure is given to the
flight controller or guidance. The outer-loop controller will
then act to maintain the desired reference trajectory. The
failure is introduced 60 seconds after liftoff, which is near
the time instant of maximum dynamic pressure.
Figure 4-1 shows outer-loop adaptation. Here, the vertical
(body z-axis) linear acceleration output of the NN is shown
along with the corresponding actual model error. Tracking
of the reference trajectory is shown in Figure 4-2. For this
example, deviations are small because limited control
saturation occurs.
Figure 4-1 – Horizontal and vertical acceleration outputs of the NN, showing adaptation due to flap failure
The resulting flight control system has no scheduled gains.
Since base-aerodynamic moments were neglected when
selecting the approximate dynamic inversion, these must be
corrected by NN adaptation. This design represents an
extreme case of relying on adaptation. Design freedom
exists to use scheduled gains or a more complex dynamic
inversion if desired.
Aerodynamic surface actuator and main engine thrust vectoring position and rate limits are included in the inner-loop pseudo-control hedge signal. The pseudo-control hedge also has knowledge of axis priority logic within the control allocation system, which appears as input saturation. The implementation included pseudo-control hedge compensation for notch filters that could be designed to prevent excitation of specific aeroelastic modes, although these filters were not used for the results presented here. Closed-loop control is not used for main engine throttling (the nominal schedule is used). The outer-loop hedge design also includes inner-loop dynamics.
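The position and rate limits enter the hedge through an estimate of the deflection the actuator can actually achieve, which then replaces the raw command inside the approximate model. A minimal sketch of such a limit model follows; the limit values are illustrative, not X-33 actuator data.

```python
def limited_actuator(delta_cmd, delta_prev, dt, pos_limit, rate_limit):
    """Expected actuator response under rate and position limits; the hedge
    evaluates the approximate model with this estimate instead of the raw
    command, so the adaptive element does not adapt to the limits.

    delta_cmd  : commanded deflection this step
    delta_prev : deflection achieved last step
    dt         : step size (sec)
    pos_limit  : symmetric position limit
    rate_limit : maximum deflection rate (per sec)
    """
    # rate limit: clip the step taken toward the command
    max_step = rate_limit * dt
    delta = delta_prev + max(-max_step, min(max_step, delta_cmd - delta_prev))
    # position limit: clip the resulting deflection
    return max(-pos_limit, min(pos_limit, delta))
```

Applying the rate limit before the position limit mirrors a servo that slews toward its command and stops at its hard stop.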
Figure 4-2 – Comparison of actual and reference trajectories; peak deviation is approximately 322 feet
Failure that Affects Lift Force
This sub-section describes a hypothetical failure that
involves a large change in aerodynamic normal force
coefficient.
This could be caused by some vehicle
component failure, such as the loss of an aerodynamic
fairing. The failure is introduced 60 seconds after liftoff,
where normal force coefficient is reduced by 0.5.
Figure 4-3 shows outer-loop adaptation. Here, the body z-axis linear acceleration output of the NN and the
corresponding model error are shown. There is a large
change at 60 seconds due to the change in normal force.
This change (as well as the linear feedback of position and
velocity errors) causes a change in the attitude command.
Figure 4-3 – Body z-axis acceleration output of the NN and actual model error, showing adaptation to lift perturbation
Figure 4-4 shows outer-loop adaptation for the linear
acceleration component along the body x-axis. Adaptation
is correct even though no closed-loop control is occurring
along this axis. Here, the differences in thrust and drag
between the nominal dynamic inversion model (which
assumed constant thrust and no drag) and actual are tracked.
Initially, throttle setting is higher than that assumed in the
nominal model.
Figure 4-5 – Comparison of actual and reference trajectories
Figure 5-1 – Proposed Overall Architecture for Autonomous Guidance and Control (components: system health monitor with fault detection and real-time system ID; on-line system modeling; on-board mission planning; on-line optimal trajectory generation; closed-loop adaptive guidance; closed-loop adaptive autopilot; optimal control allocation)
Figure 4-4 – X-axis acceleration output of the NN, showing correct adaptation even though no closed-loop control is occurring on this axis
Finally, tracking of the reference trajectory is shown in
Figure 4-5.
5. SUMMARY OF AUTONOMOUS G&C SYSTEM
An architecture to enable autonomous ascent guidance and
control of reusable launch vehicles is proposed as follows
(see also Figure 5-1).
A nominal ascent profile is generated and loaded prior to the
mission. The subject closed-loop adaptive guidance law is
used to track the nominal trajectory, and provides robustness
to dispersions (as well as applicability to many
vehicle/mission variants). The guidance commands are
tracked by the subject adaptive autopilot that provides
further robustness to dispersions and parametric uncertainty.
A nominal control allocation strategy is employed for
distribution of the torque commands between the various
control effectors. Fault detection and on-line system
identification algorithms are continuously run in an effort to
detect degraded system performance, to isolate the source of
the anomaly, and to facilitate the on-line modeling of the
effect of a component fault, failure or damage across the
flight envelope.
Once approximately modeled, a
combination of in-flight simulation and hybrid on-line
optimal trajectory generation capability can be employed to
determine the impact of the failure on the planned mission,
and when necessary, to reshape the trajectory, or compute an
abort trajectory [8-10]. The model of the failed system may
also be used to alter the control allocation strategy. Time
will be required to detect, identify and model the impact of a
failure. In the time between occurrence of the fault and
successful trajectory regeneration, the subject approach to
adaptive guidance and control is used to track, and when
necessary, locally reshape, the nominal trajectory.
Specifically, hedging of the guidance commands is
employed to ensure the guidance commands remain feasible
in light of the failure.
ACKNOWLEDGEMENTS

This work was supported in part by the NASA Marshall Space Flight Center, Grant NAG3-1638, and in part by the U.S. Air Force Wright Laboratories, Contract F33615-00-C-3021.

REFERENCES

[1] Hanson, John M., "Advanced Guidance and Control Project for Reusable Launch Vehicles," AIAA-2000-3957, Presented at the AIAA Guidance, Navigation and Control Conference and Exhibit, 14-17 August 2000, Denver, CO.

[2] Calise, A., Lee, S., and Sharma, M., "Development of a Reconfigurable Flight Control Law for the X-36 Tailless Fighter Aircraft," AIAA-2000-3940, Presented at the AIAA Guidance, Navigation, and Control Conference and Exhibit, 14-17 August 2000, Denver, CO.

[3] Brinker, J., and Wise, K., "Flight Testing of a Reconfigurable Flight Control Law on the X-36 Tailless Fighter Aircraft," AIAA-2000-3941, Presented at the AIAA Guidance, Navigation, and Control Conference and Exhibit, 14-17 August 2000, Denver, CO.

[4] Johnson, E., Calise, A., Rysdyk, R., and El-Shirbiny, H., "Feedback Linearization with Neural Network Augmentation Applied to X-33 Attitude Control," AIAA-2000-4157, Presented at the AIAA Guidance, Navigation and Control Conference and Exhibit, 14-17 August 2000, Denver, CO.

[5] "Reconfigurable Systems for Tailless Fighter Aircraft – RESTORE," Final Report, September 1999, Boeing, AFRL-VA-WP-TP-1999-30XX.

[6] "Reconfigurable Systems for Tailless Fighter Aircraft – RESTORE," Final Report, September 1999, Lockheed Martin, AFRL-VA-WP-TR-1999-3078.

[7] Corban, J. Eric, "Real-Time Guidance and Propulsion Control for Single-Stage-to-Orbit Airbreathing Vehicles," Ph.D. Thesis, School of Aerospace Engineering, Georgia Institute of Technology, November 1989.

[8] Calise, A.J., Melamed, N., Lee, S., "Design and Evaluation of a 3-D Optimal Ascent Guidance Algorithm," AIAA J. of Guidance, Control and Dynamics, Vol. 21, No. 6, Nov.-Dec. 1998, pp. 867-875.

[9] Gath, P.F., Calise, A.J., "Optimization of Launch Vehicle Ascent Trajectories with Path Constraints and Coast Arcs," AIAA-99-4308 (to appear in Journal of Guidance, Control, and Dynamics).

[10] Calise, A.J., et al., "Further Improvements to a Hybrid Method for Launch Vehicle Ascent Trajectory Optimization," AIAA-2000-4261.

[11] Hanson, J., Coughlin, D., Dukeman, G., Mulqueen, J., and McCarter, J., "Ascent, Transition, Entry, and Abort Guidance Algorithm Design for X-33 Vehicle," Presented at the AIAA Guidance, Navigation, and Control Conference and Exhibit, 1998, Boston, MA.
Eric N. Johnson is an Assistant Professor in the Georgia Tech School of Aerospace Engineering. He also has five years of industry experience, including The Charles Stark Draper Laboratory and Lockheed Martin. He has a diverse background in guidance, navigation, and control, including applications such as airplanes, helicopters, submarines, munitions, and launch vehicles. He holds MS degrees in Aeronautics Engineering from MIT and The George Washington University, and a PhD from Georgia Tech. His research interests include estimation, control, and guidance; aerospace vehicle design; digital avionics systems; and simulation.
Anthony J. Calise is a Professor in the Georgia Tech School of Aerospace Engineering. Prior to joining the faculty at Georgia Tech, Dr. Calise was a Professor of Mechanical Engineering at Drexel University for 8 years. He also worked for 10 years in industry for the Raytheon Missile Systems Division and Dynamics Research Corporation, where he was involved with analysis and design of inertial navigation systems, optimal missile guidance, and aircraft flight path optimization. Since leaving industry he has worked continuously as a consultant for 19 years. He is the author of over 150 technical reports and papers. He was the recipient of the USAF Systems Command Technical Achievement Award and the AIAA Mechanics and Control of Flight Award. He is a Fellow of the AIAA and a former Associate Editor for the Journal of Guidance, Control, and Dynamics and for the IEEE Control Systems Magazine. The subject areas in which Dr. Calise has published include optimal control theory, aircraft flight control, optimal guidance of aerospace vehicles, adaptive control using neural networks, robust linear control, and control of flexible structures. In the area of adaptive control, Dr. Calise has developed a novel approach for employing neural-network-based control in combination with feedback linearization. Applications include flight control of fighter aircraft, helicopters, and missile autopilot design.
J. Eric Corban is president and founder of Guided Systems Technologies, Inc. (GST). There he has led efforts in patenting and applying neural-network-based adaptive control to a variety of systems. He led efforts to implement and flight test this technology on several unmanned helicopters for the U.S. Army. He supported GST's program to develop an adaptive autopilot for the USAF/Boeing RESTORE effort, and is currently directing the application of this technology to several variants of the Joint Direct Attack Munition for the USAF. Prior to his graduate studies he was a member of the technical staff at McDonnell Douglas Helicopters. He holds a BS in physics from Millsaps College, and BS, MS and PhD degrees in Aerospace Engineering from the Georgia Institute of Technology. He is a member of the AIAA, IEEE, and the Association for Unmanned Vehicle Systems International.