AIAA 2009-5972
AIAA Guidance, Navigation, and Control Conference
10 - 13 August 2009, Chicago, Illinois
Kalman Filtering and Smoothing to Estimate Real-Valued
States and Integer Constants
Mark L. Psiaki∗
Cornell University, Ithaca, N.Y. 14853-7501
New algorithms have been developed that solve linear Kalman filtering and fixed
interval smoothing problems that include dynamic real-valued states and static integer
unknowns. These algorithms are useful for estimation problems in which some of the
estimated quantities are known a priori to be constant integers, as in double-differenced
carrier-phase Global Positioning System (GPS) relative navigation. The new estimation
algorithms solve Integer Linear Least-Squares (ILLS) problems in order to derive estimates
from square-root information equations. The ILLS solver makes use of Least-squares
AMBiguity Decorrelation Adjustment (LAMBDA) methods. The true optimal solution of
each filtering or smoothing problem requires that all integers be included for the entire time
interval, even those that apply to measurements that are remote from the current time point
of interest. This fact causes the size of the ILLS problem to grow with the length of the time
interval, thereby greatly slowing the execution speed of the solution algorithm. Alternative
approximate methods for filtering and smoothing are proposed and tested that bound the
size of the ILLS problems. Bounded problem sizes are achieved by treating integers from
remote-in-time measurements as real-valued unknowns, which allows them to be dropped
from explicit consideration. The resulting algorithms have been tested using a truth-model
simulation. Their accuracies, although sub-optimal, can be very good, and they reduce the
computational costs of the filter and the smoother.
I. Introduction

Estimation problems that involve integer unknowns have become important with the advent of carrier-phase
differential GPS (CDGPS) techniques 1,2,3,4. Carrier-phase GPS measurements have a precision on the order of 0.5 cm, but they also have unknown biases. The method of double differencing can transform these biases into integer-valued unknowns if the receiver's carrier-phase measurement processing has been designed properly 3. If the estimation algorithm that uses these measurements can take advantage of the known integer nature of the double-differenced biases, then significant improvements in accuracy, convergence speed, or both can be achieved 4,5.
Problems other than CDGPS also involve the estimation of real-valued dynamic variables and integer-valued biases: other radio-navigation techniques that employ carrier-phase measurements sometimes give rise to estimation problems that include unknown integer constants. As in CDGPS, the ability to treat such constants exactly as integers can improve the performance of the estimation algorithm.
Various techniques have been presented within the CDGPS literature for generating estimates of constant integer
unknowns, e.g., see Refs. 2,4-10. These techniques offer relative position accuracies on the cm level over receiver
baselines on the order of 10 km or more, and some of these techniques can be used with dynamically moving
relative positions. Only the methods of Refs. 2 and 4 implement Kalman filters that involve estimation of both a
dynamic real-valued state and constant integers.
It is difficult to incorporate integers into an optimal estimation framework, and this difficulty may account for the rarity of Kalman filters in the CDGPS literature. The difficulty arises because the derivation of estimates involves more than the usual Kalman filter linear algebra. It requires the solution of an Integer Linear Least-Squares (ILLS)
∗Professor, Sibley School of Mechanical and Aerospace Engineering. Associate Fellow, AIAA.
1
American Institute of Aeronautics and Astronautics
Copyright © 2009 by Mark L. Psiaki. Published by the American Institute of Aeronautics and Astronautics, Inc., with permission.
problem 4,5. An ILLS problem must be solved by using complicated search techniques that can become very
expensive computationally as the dimension of the problem grows 6,9.
The goal of the present paper is to develop optimal and suboptimal techniques for Kalman filtering and fixed-interval smoothing in the presence of dynamic real-valued states and constant integer-valued states, also known as
integer ambiguities. This paper makes two contributions to the art of dynamic estimation with unknown integer
ambiguities. The first contribution is a pair of Kalman filters, one optimal and one sub-optimal. These filters
represent generalizations of techniques that have been developed in Ref. 4. The second contribution is to develop a
corresponding pair of optimal and suboptimal fixed-interval smoothers for the same class of mixed real/integer
estimation problems.
This is the first known work that deals with Kalman-type smoothing in the presence of constant, unknown
integer states. Smoothing is often mentioned in the GPS signal processing literature, but this term rarely refers to a
smoothing algorithm in the sense of Kalman filtering. Rather, GPS "smoothing" usually refers to an ad hoc
averaging method that tends to reduce the effects of random measurement errors.
The suboptimal Kalman filter and the suboptimal smoother both make approximations that reduce the
dimensions of their corresponding ILLS problems. Such reductions can decrease the algorithms' computational
complexity. The method of reducing each ILLS problem's dimensions is to treat some of the integers as real-valued
unknowns. These real-valued unknowns can be de-coupled from the remaining integers, thereby reducing the
dimension of the remaining ILLS problem, which reduces its computational burden. The integers that are
approximated as real-valued unknowns are integers that affect only measurements which are remote in time from
the given sample at which a solution is being calculated. This approach to relaxing integer constraints can yield
suboptimal solutions that are nearly optimal due to a well known property of Kalman filters and smoothers:
Measurements remote in time from a given sample instant tend not to have a strong effect on the optimal estimate at
that instant.
The six remaining sections of this paper define and evaluate its mixed real/integer Kalman filters and smoothers.
Section II defines a discrete-time system model that includes dynamic real-valued unknowns and constant integer
unknowns, and it defines Kalman filtering and fixed-interval smoothing problems for this system. Section III
develops the optimal and suboptimal Kalman filters for the given model form based on a combination of standard
square-root information filter (SRIF) 11 calculations and ILLS solutions. Section IV presents the optimal and
suboptimal fixed-interval smoothers. These algorithms combine an SRIF implementation of the Rauch-Tung-Striebel (RTS) smoother 12 with appropriate ILLS calculations to deal with the integer unknowns. Section
V develops an example problem that is used to test the new algorithms, and Section VI uses truth-model simulation
data for that example problem in order to evaluate the new estimation algorithms. Section VII summarizes the
paper's developments and presents its conclusions.
II. Kalman Filtering and Smoothing Problems for a System with Dynamic Real-Valued States and
Constant Integer Ambiguities
A. Mixed Real/Integer Dynamics Model
The discrete-time problem model includes a vector of dynamic real-valued states and a vector of constant
integer-valued ambiguities. The following linear model describes the time evolution of these two vectors:
\[ x_{k+1} = \Phi_k x_k + \Gamma_k w_k + \eta_k \qquad \text{for } k = 0, \ldots, K-1 \qquad \text{(1a)} \]

\[ n_{k+1} = \begin{bmatrix} n_k \\ \delta n_k \end{bmatrix} \qquad \text{for } k = 0, \ldots, K-1 \qquad \text{(1b)} \]
The vector xk is the real-valued part of the state at sample k, and the vector nk is the integer-valued part of the state.
The vector wk is the random process noise, and the vector ηk is a known non-homogeneous term in the xk dynamics
model. This latter term represents a slight generalization of the usual Kalman filter model form, one that can be
useful in a linear approximation of a nonlinear dynamic model. The matrix Φk is the usual state transition matrix,
and the matrix Γk is the process-noise influence matrix.
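As a concrete illustration of Eq. (1a), the real-state propagation can be sketched in a few lines of NumPy. The dimensions and model matrices below are hypothetical stand-ins for illustration only, not values from this paper's example problem.

```python
import numpy as np

# Hypothetical dimensions and model matrices for illustration only.
nx, nw = 4, 2
rng = np.random.default_rng(0)
Phi = np.eye(nx) + 0.01 * rng.standard_normal((nx, nx))  # state transition Phi_k
Gamma = rng.standard_normal((nx, nw))                    # process-noise influence Gamma_k
eta = rng.standard_normal(nx)                            # known non-homogeneous term eta_k

def propagate_real_state(x_k, w_k):
    """One step of Eq. (1a): x_{k+1} = Phi_k x_k + Gamma_k w_k + eta_k."""
    return Phi @ x_k + Gamma @ w_k + eta

# With x_k = 0 and w_k = 0, the propagated state reduces to the known term eta_k.
x1 = propagate_real_state(np.zeros(nx), np.zeros(nw))
assert np.allclose(x1, eta)
```

The non-homogeneous term ηk enters additively, which is what makes this form useful as a linearization of a nonlinear dynamics model.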
The process-noise vector is a Gaussian white-noise vector whose statistics are defined by the following square-root information equation:

\[ R_{wwk} w_k = -\nu_{wk} \qquad \text{for } k = 0, \ldots, K-1 \qquad \text{(2)} \]

The square matrix Rwwk is the a priori square-root information matrix for wk, and νwk is a sample from a discrete-time Gaussian white-noise sequence with statistics E{νwk} = 0 and E{νwkνwkT} = I.
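The square-root information form of Eq. (2) implies a process-noise covariance of (RwwkT Rwwk)-1. The following sketch, with a hypothetical Rwwk, checks that samples drawn consistently with Eq. (2) exhibit that covariance.

```python
import numpy as np

# Hypothetical upper-triangular square-root information matrix for w_k.
rng = np.random.default_rng(1)
Rww = np.triu(rng.standard_normal((3, 3))) + 5.0 * np.eye(3)

# Eq. (2): Rww w = -nu with nu ~ N(0, I), so w = -Rww^{-1} nu and
# cov(w) = Rww^{-1} Rww^{-T} = (Rww^T Rww)^{-1}.
nus = rng.standard_normal((20000, 3))
ws = np.linalg.solve(Rww, -nus.T).T

Q = np.linalg.inv(Rww.T @ Rww)   # implied process-noise covariance
assert np.allclose(np.cov(ws.T), Q, atol=0.02)
```

This is the reason the SRIF formulation never needs an explicit covariance matrix: the triangular factor Rwwk carries the same information.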
The integer state nk contains all of the ambiguities that affect any measurements for sample indices j = 1, ..., k, as
defined below in Subsection II.B. The incremental integer vector δnk represents a new set of ambiguities that enter
the measurement model at sample k+1 and possibly at samples beyond k+1, but not at any earlier samples. This
vector will be empty, i.e., of zero dimension, if no new ambiguities arise at sample k+1. Consistent with the
assumption that the first measurement occurs at sample 1, the initial vector n0 is modeled as having zero dimension
so that n1 = δn0 and nk = [δn0; …; δnk-1].
The integer elements of the state do not affect the dynamic model of the state's real part. This problem form is
consistent with CDGPS applications and other radio-navigation applications that include the estimation of integer
ambiguities. The integers typically arise purely in the measurement model. Each integer enters a particular type of
measurement, perhaps a particular set of CDGPS double-differenced receiver/satellite pairs. Each integer is a
constant over all measurement sample times for its corresponding type. The integers have no direct effect on the
time evolution of real-valued states such as receiver position or velocity. Thus, this model form, although restricted,
is applicable to a significant class of problems.
An important feature of the integer dynamic model in Eq. (1b) is that the dimension of the integer vector nk can grow with time, and it can become very large. This growth is typical of CDGPS and similar applications: when new signals are acquired from newly risen GPS satellites, new integer ambiguities, those in δnk, get added to the problem. New integers are normally introduced very slowly in terrestrial CDGPS applications because an
additional GPS satellite typically rises only about once every hour. In low-Earth orbit (LEO) space-based CDGPS
applications, however, the rising of new GPS satellites can happen rapidly due to the rapid motion of the
constellation of LEO satellites for which relative positions are being estimated 2,4,5.
One might suppose that the integer dynamic model in Eq. (1b) should also delete old integers that no longer
apply to any measurements after sample k. Such a model would be consistent with CDGPS problems because some
GPS satellites set behind the Earth and, therefore, their ambiguities stop affecting available measurements. Even if
the satellite were to rise again during the filtering/smoothing interval, a new ambiguity would be needed for the new
carrier-phase data from its signal. Therefore, it is reasonable to develop methods that can explicitly denote early
elements of nk that do not affect any measurements after sample k.
Suppose that one can re-arrange and partition nk as follows
\[ \begin{bmatrix} \Delta n_1 \\ \vdots \\ \Delta n_k \\ \partial n_k \end{bmatrix} = \Pi_k n_k \qquad \text{for } k = 0, \ldots, K-1 \qquad \text{(3)} \]
where the square matrix Πk is a permutation matrix (ΠkΠkT = I, and all of its elements are either 1 or 0). This matrix
is chosen to enforce the following restrictions: each component Δnj for j = 1, ..., k affects only the measurements in
the sample range 1 through j, but the component ∂nk is known to affect measurements in a range of samples that
extends above k. If one allows some of these component vectors to have zero dimension, then this rearrangement and partitioning of nk is consistent with standard CDGPS estimation problems. The integer ambiguities in Δnj correspond to GPS satellites that set some time between sample j and sample j+1.
Given Eq. (3), one might be tempted to modify both the definition of nk and the dynamic model in Eq. (1b).
One might try to redefine nk to consist only of Δnk and ∂nk . Such a change would necessitate a modification to Eq.
(1b) to form nk+1 only from the two components ∂nk and δnk. This modification, however, would not permit the
development of a truly optimal estimation algorithm because of the integer nature of the discarded unknowns Δn1,
..., Δnk-1. Nevertheless, the rearrangement and decomposition in Eq. (3) is useful for measurement model definition
and for developing an efficient suboptimal Kalman filter.
The partitioning of the integer states in Eq. (3) leads to a dynamic model for the generation of Δnk+1 and ∂nk+1 from ∂nk and δnk. It takes the form:

\[ \begin{bmatrix} \Delta n_{k+1} \\ \partial n_{k+1} \end{bmatrix} = \tilde{\Pi}_k \begin{bmatrix} \partial n_k \\ \delta n_k \end{bmatrix} \qquad \text{for } k = 0, \ldots, K-1 \qquad \text{(4)} \]
where the square permutation matrix Π̃k obeys the relationship

\[ \Pi_{k+1} \begin{bmatrix} \Pi_k^T & 0 \\ 0 & I \end{bmatrix} = \begin{bmatrix} I & 0 \\ 0 & \tilde{\Pi}_k \end{bmatrix} \qquad \text{for } k = 0, \ldots, K-1 \qquad \text{(5)} \]
The algebraic derivation of Eq. (5) relies on Eq. (3) for samples k and k+1 and on Eq. (4) for sample k. The
dimension of the identity matrix in the block matrix on the left-hand side of Eq. (5) equals the dimension of δnk.
The dimension of the identity matrix on the right-hand side equals the sum of the dimensions of Δn1, ..., Δnk.
One might be tempted to define one large constant vector of constant dimension, ntot = nK. This large vector
includes all of the ambiguities that affect any measurements during the entire filtering/smoothing interval. The use
of ntot in the problem definition would obviate the need for the integer dynamics model in Eq. (1b). This simplified
problem definition is not used because it causes problems when defining the Kalman filter. There is typically no a
priori information about the integer unknowns. They are fully determined by the measurements that involve them.
The use of the dynamic integer vector model in Eq. (1b) allows one to pose sensible Kalman filtering problems in
which xk and nk are observable based solely on data up through sample k. The same cannot be said about xk and ntot.
Figure 1 illustrates the relationship of integer ambiguities in ntot = nK to the sample times at which they affect
measurements. This figure relates to an example problem in which dim(ntot) = 40. Each independent measurement
type corresponds to a separate line segment in the figure. To each such line segment there corresponds one distinct
integer ambiguity element in ntot. The horizontal-axis time span of each line segment indicates which samples are
affected by the corresponding element of ntot. At any given sample time tk, the set of integers in Δnk corresponds to those line segments that extend up to time tk, but not above it. Similarly, each integer in δnk-1 corresponds to a line segment that extends down to time tk, but not below it. Each integer in ∂nk corresponds to a line segment that crosses tk and extends into the future, with the possibility that it also extends into the past. Given that Fig. 1 is typical of this type of problem, the vectors Δnk and δnk-1 are often empty vectors, but the vector ∂nk is usually of non-zero dimension. Of course, the set of integers in δnk-1 normally intersects the set in ∂nk.
[Figure 1 plot: ambiguity sensitivities h̃ (m, scale ×10-3, range 0-11) versus time (1000-9000 sec) for the 40 measurement types. Annotations mark the measurements used in the tk = 3000 sec sub-optimal filter & smoother, the additional measurements used in the tk = 3000 sec sub-optimal smoother, and the tk = 3000 sec ± i·Δt range for considering exact integers.]
Figure 1. Measurement availability time intervals and integer ambiguity sensitivities for the
simulated problem's 40 independent measurement "types".
B. Measurement Model
The measurement model at sample k+1 takes the form

\[ y_{k+1} = H_{xk} x_k + H_{wk} w_k + \tilde{H}_{nk+1} \begin{bmatrix} \Delta n_{k+1} \\ \partial n_{k+1} \end{bmatrix} + \nu_{yk+1} \qquad \text{for } k = 0, \ldots, K-1 \qquad \text{(6)} \]

where yk+1 is the measurement vector, Hxk, Hwk, and H̃nk+1 are measurement sensitivity matrices, and νyk+1 is the measurement noise. The measurements are normalized so that νyk+1 is a sample from a discrete-time Gaussian white-noise sequence with statistics E{νyk+1} = 0 and E{νyk+1νyk+1T} = I.
The use of xk and wk terms in this measurement model in place of an xk+1 term represents a modest departure
from traditional models. This model is consistent with some GPS applications with which the author is familiar. It
has the advantage of directly modeling the possibility of correlation between the process noise and the net
measurement error. One can easily transform to this model from a traditional model of the form:
\[ y_{k+1} = \breve{H}_{xk+1} x_{k+1} + \tilde{H}_{nk+1} \begin{bmatrix} \Delta n_{k+1} \\ \partial n_{k+1} \end{bmatrix} + \nu_{yk+1} \qquad \text{(7)} \]
One replaces xk+1 in this formula with the right-hand side of Eq. (1a). This substitution yields the model
\[ y_{k+1} - \breve{H}_{xk+1} \eta_k = \breve{H}_{xk+1} (\Phi_k x_k + \Gamma_k w_k) + \tilde{H}_{nk+1} \begin{bmatrix} \Delta n_{k+1} \\ \partial n_{k+1} \end{bmatrix} + \nu_{yk+1} \qquad \text{(8)} \]
which has the desired form.
The measurement model in Eq. (6) can be re-cast in the form

\[ y_{k+1} = H_{xk} x_k + H_{wk} w_k + H_{nk+1} n_{k+1} + \nu_{yk+1} \qquad \text{(9)} \]

by using Eq. (3) defined for sample k+1. The large integer sensitivity matrix in Eq. (9) can be derived from Eqs. (3) and (8):

\[ H_{nk+1} = \tilde{H}_{nk+1} \begin{bmatrix} 0 & \cdots & 0 & I & 0 \\ 0 & \cdots & 0 & 0 & I \end{bmatrix} \Pi_{k+1} \qquad \text{(10)} \]
Despite the fact that many of its columns are all zeros, this measurement sensitivity matrix is useful for purposes of
defining and solving the exact optimal filtering and smoothing problems.
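The embedding in Eq. (10) can be sketched numerically. In the sketch below the dimensions and the permutation are hypothetical, and the two identity blocks of Eq. (10) are collapsed into a single [0 I] selector for simplicity.

```python
import numpy as np

# Sketch of Eq. (10): embed the compact sensitivity H-tilde, which multiplies
# [Delta-n_{k+1}; dn_{k+1}], into the full-width H_n that multiplies n_{k+1}.
dim_n, dim_sel = 6, 3                    # dim(n_{k+1}) and dim of the last two blocks
Htil = np.arange(6.0).reshape(2, 3)      # compact sensitivity for 2 measurements
S = np.hstack([np.zeros((dim_sel, dim_n - dim_sel)), np.eye(dim_sel)])  # [0 ... 0 I]
Pi = np.eye(dim_n)[[1, 3, 5, 0, 2, 4]]   # hypothetical permutation Pi_{k+1}
Hn = Htil @ S @ Pi                       # Eq. (10)

# Only the ambiguities that land in the last dim_sel slots of Pi @ n matter;
# the remaining columns of Hn are exactly zero.
n = np.arange(10.0, 16.0)
assert np.allclose(Hn @ n, Htil @ (Pi @ n)[-dim_sel:])
assert np.allclose(Hn[:, [1, 3, 5]], 0.0)
```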
C. Kalman Filtering and Fixed-Interval Smoothing Problem Formulations
SRIF filtering and smoothing techniques are used throughout this paper. Filters and smoothers that are derived
using these techniques can conveniently be interpreted as being solutions to constrained least-squares problems that
have a dynamic programming structure. This interpretation of SRIF techniques has proven useful when developing
a Kalman filter that includes constant integer unknowns 4,5.
The optimal real/integer Kalman filter and smoother can be derived as the solutions to problems of the form:

find: x0, ..., xk+1, w0, ..., wk, and nk+1 = [δn0; ...; δnk]   (11a)

to minimize:

\[ J = \tfrac{1}{2} [\hat{R}_{xx0} x_0 - \hat{z}_{x0}]^T [\hat{R}_{xx0} x_0 - \hat{z}_{x0}] + \tfrac{1}{2} \sum_{j=0}^{k} [R_{wwj} w_j]^T [R_{wwj} w_j] + \tfrac{1}{2} \sum_{j=0}^{k} \left\{ \left[ H_{xj} x_j + H_{wj} w_j + H_{nj+1} \begin{bmatrix} \delta n_0 \\ \vdots \\ \delta n_j \end{bmatrix} - y_{j+1} \right]^T \left[ H_{xj} x_j + H_{wj} w_j + H_{nj+1} \begin{bmatrix} \delta n_0 \\ \vdots \\ \delta n_j \end{bmatrix} - y_{j+1} \right] \right\} \qquad \text{(11b)} \]

subject to:

\[ x_{j+1} = \Phi_j x_j + \Gamma_j w_j + \eta_j \qquad \text{for } j = 0, \ldots, k \qquad \text{(11c)} \]

nk+1 is an integer-valued vector
This problem formulation seeks to find the real-valued state time history, the real-valued process-noise time
history, and the integer-valued constant ambiguities that minimize half the sum of the squares of the noise terms in
Eqs. (2) and (9) and in the following initial real-state square-root information equation:
\[ \hat{R}_{xx0} x_0 = \hat{z}_{x0} - \hat{\nu}_{x0} \qquad \text{(12)} \]

The matrix R̂xx0 is the initial square-root information matrix for x0, and the vector ẑx0 stores information about the mean initial state vector if R̂xx0 is nonsingular: x̂0 = R̂xx0-1ẑx0. The error ν̂x0 is a sample from a discrete-time Gaussian white-noise sequence with statistics E{ν̂x0} = 0 and E{ν̂x0ν̂x0T} = I. The first term on the right-hand side of Eq. (11b) is half the sum of the squares of the errors in Eq. (12). The second term equals half the sum of the squared errors in Eq. (2) for j = 0, ..., k. Similarly, the third term is half the sum of the squared errors in Eq. (9) for the same range of the sample index j.
The Kalman filter state estimate at sample k+1 consists of x̂k+1 and n̂k+1, which equal the corresponding elements of the solution to Problem (11a)-(11c). This estimate constitutes the maximum a posteriori (MAP) estimate because the cost function in Eq. (11b) is the negative natural logarithm of the a posteriori joint probability/probability-density of the solution variables conditioned on the measurements y1, ..., yk+1. In other words, the properly normalized exponential of the negative of this cost constitutes the a posteriori joint probability/probability-density function for the discrete integer vector nk+1 and for the real-valued variables x0, ..., xk+1, w0, ..., wk.
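To make the cost in Eq. (11b) concrete, the sketch below evaluates it directly for candidate trajectories. The dimensions and model matrices are hypothetical, the supplied trajectory is assumed to satisfy Eq. (11c), and the paper's SRIF recursions never form this sum explicitly; the sketch only illustrates what they minimize.

```python
import numpy as np

def map_cost(xs, ws, dns, Rxx0, zx0, Rww, Hx, Hw, Hn, ys):
    """Evaluate the MAP cost of Eq. (11b) for candidate trajectories.

    xs: x_0..x_{k+1}; ws: w_0..w_k; dns: delta-n_0..delta-n_k (integer blocks).
    The dynamics constraint of Eq. (11c) is assumed to hold for xs and ws.
    """
    r0 = Rxx0 @ xs[0] - zx0                      # initial-condition residual, Eq. (12)
    J = 0.5 * (r0 @ r0)
    n_stack = np.array([])                       # n_{j+1} = [dn_0; ...; dn_j]
    for j in range(len(ws)):
        rw = Rww[j] @ ws[j]                      # process-noise residual, Eq. (2)
        J += 0.5 * (rw @ rw)
        n_stack = np.concatenate([n_stack, dns[j]])
        ry = Hx[j] @ xs[j] + Hw[j] @ ws[j] + Hn[j] @ n_stack - ys[j]
        J += 0.5 * (ry @ ry)                     # measurement residual, Eq. (9)
    return J

# Consistency check with hypothetical data: residual-free trajectories give J = 0.
rng = np.random.default_rng(2)
xs = [rng.standard_normal(2) for _ in range(3)]
ws = [np.zeros(1), np.zeros(1)]
dns = [np.array([1.0]), np.array([2.0])]
Rxx0 = np.eye(2)
zx0 = Rxx0 @ xs[0]
Rww = [np.eye(1), np.eye(1)]
Hx = [rng.standard_normal((3, 2)) for _ in range(2)]
Hw = [rng.standard_normal((3, 1)) for _ in range(2)]
Hn = [rng.standard_normal((3, 1)), rng.standard_normal((3, 2))]
ys = [Hx[0] @ xs[0] + Hn[0] @ dns[0],
      Hx[1] @ xs[1] + Hn[1] @ np.concatenate([dns[0], dns[1]])]
assert abs(map_cost(xs, ws, dns, Rxx0, zx0, Rww, Hx, Hw, Hn, ys)) < 1e-10
```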
Section III develops a mixed real/integer linear least-squares solution procedure that solves the Kalman filtering problem in order to determine x̂k+1 and n̂k+1. Note that the Kalman filter estimate discards the solution values of x0, ..., xk and w0, ..., wk. In fact, its calculations are designed in a way that does not even compute these latter quantities. This is standard procedure in the SRIF-type framework that is used to develop the Kalman filter.
If the entire solution of Problem (11a)-(11c) is computed, x0, …, xk+1, w0, …, wk, and nk+1, then it constitutes the
MAP smoother solution for the fixed smoothing interval from j = 0 to k+1. If k+1 is extended to the terminal
sample K, then this solution constitutes the desired optimal fixed-interval smoother for the entire data batch. The
SRIF solution to this optimal smoothing problem is presented in Section IV. It builds on the mixed real/integer
Kalman filter of Section III by adding an SRIF implementation of an RTS backwards smoother pass. This
backwards pass is specially tailored to incorporate the integer-valued ambiguities.
III. Optimal and Suboptimal Solutions of the Filtering Problem
A. Optimal Filter
The filtering problem in Eqs. (11a)-(11b) is solved by adapting the standard SRIF techniques of Ref. 11 to
include integer-valued estimates. This implementation uses the dynamic model of nk as a vector of expanding
dimension, as in Eq. (1b). The solution can be defined in terms of the operations for a single sample interval.
These operations are designed in a way that allows them to be applied recursively in order to solve the entire
problem.
The one-sample operations of the optimal Kalman filter implement dynamic propagation and a measurement
update, as in Ref. 11, followed by computation of the filtered estimates using an ILLS solution, as in Ref. 4. For the
interval starting at sample k and ending at sample k+1, these operations start from the following set of square-root
information equations
\[ \hat{R}_{xxk} x_k + \hat{R}_{xnk} n_k = \hat{z}_{xk} - \hat{\nu}_{xk} \qquad \text{(13a)} \]
\[ \hat{R}_{nnk} n_k = \hat{z}_{nk} - \hat{\nu}_{nk} \qquad \text{(13b)} \]

These equations store the a posteriori information that can be used to determine the filtered state estimates x̂k and n̂k. This information is stored in the square-root information matrices R̂xxk, R̂xnk, and R̂nnk and in the non-homogeneous vectors ẑxk and ẑnk. As in all of this paper's square-root information equations and consistent with Ref. 11, the vectors ν̂xk and ν̂nk are uncorrelated, zero-mean, identity-covariance Gaussian white noise processes.
The SRIF dynamic propagation and measurement update steps are combined in order to properly account for the
presence of the process-noise term in Eq. (9). The dynamic propagation part inverts the dynamics models in Eqs.
(1a) and (1b) in order to replace xk and nk in Eqs. (13a) and (13b) with expressions that depend on xk+1, wk, and nk+1.
This part also appends the process-noise information equation, Eq. (2), to the system of equations. The
measurement update uses the measurement model in Eq. (9) and inverts the dynamics model in Eq. (1a) to eliminate
xk from this equation. The resulting system of equations takes the form
\[ \begin{bmatrix} R_{wwk} & 0 & 0 \\ -\hat{R}_{xxk} \Phi_k^{-1} \Gamma_k & \hat{R}_{xxk} \Phi_k^{-1} & [\hat{R}_{xnk}, 0] \\ 0 & 0 & [\hat{R}_{nnk}, 0] \\ H_{wk} - H_{xk} \Phi_k^{-1} \Gamma_k & H_{xk} \Phi_k^{-1} & H_{nk+1} \end{bmatrix} \begin{bmatrix} w_k \\ x_{k+1} \\ n_{k+1} \end{bmatrix} = \begin{bmatrix} 0 \\ \hat{z}_{xk} + \hat{R}_{xxk} \Phi_k^{-1} \eta_k \\ \hat{z}_{nk} \\ y_{k+1} + H_{xk} \Phi_k^{-1} \eta_k \end{bmatrix} - \begin{bmatrix} \nu_{wk} \\ \hat{\nu}_{xk} \\ \hat{\nu}_{nk} \\ \nu_{yk+1} \end{bmatrix} \qquad \text{(14)} \]
where the first row in this block matrix/vector equation is Eq. (2) and where the second through fourth rows are,
respectively, transformed versions of Eqs. (13a), (13b), and (9). This procedure requires that the state transition
matrix Φk be invertible, which is typical for SRIF algorithms. If Φk is not invertible, then techniques like those in
Ref. 13 must be used. Note the trailing zero columns in the 2nd and 3rd rows of the block matrix on the left-hand
side of Eq. (14). These are needed if the dimension of nk+1 is larger than the dimension of nk, i.e., if δnk has nonzero dimension.
The dynamic-propagation/measurement-update procedure uses orthonormal/upper-triangular (QR)
factorization 14 in order to find an orthonormal matrix Qak that transforms the large block matrix on the left-hand
side of Eq. (14) into an upper triangular matrix. Multiplication of both sides of Eq. (14) by Qak yields the following
transformed system of square-root information equations.
\[ \begin{bmatrix} \hat{R}_{wwk} & \hat{R}_{wxk+1} & \hat{R}_{wnk+1} \\ 0 & \hat{R}_{xxk+1} & \hat{R}_{xnk+1} \\ 0 & 0 & \hat{R}_{nnk+1} \\ 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} w_k \\ x_{k+1} \\ n_{k+1} \end{bmatrix} = \begin{bmatrix} \hat{z}_{wk} \\ \hat{z}_{xk+1} \\ \hat{z}_{nk+1} \\ z_{rk+1} \end{bmatrix} - \begin{bmatrix} \hat{\nu}_{wk} \\ \hat{\nu}_{xk+1} \\ \hat{\nu}_{nk+1} \\ \nu_{rk+1} \end{bmatrix} \qquad \text{(15)} \]
where R̂wwk, R̂xxk+1, and R̂nnk+1 are square, upper-triangular matrices. The matrices R̂wxk+1, R̂wnk+1, and R̂xnk+1 and the vectors ẑwk, ẑxk+1, ẑnk+1, and zrk+1 are dense and have appropriate dimensions. The vector zrk+1 represents the residual error that remains at the end of this combined dynamic-propagation/measurement-update operation. It is the SRIF equivalent of the filter innovation.
The matrices R̂xxk+1, R̂xnk+1, and R̂nnk+1 and the vectors ẑxk+1 and ẑnk+1 contain all of the information necessary to enable recursive execution of these operations for the next sample interval. The matrices R̂wwk, R̂wxk+1, and R̂wnk+1 and the vector ẑwk can be discarded unless one wants to solve the smoothing problem after the forwards filtering recursion has completed at sample k+1 = K. These matrices and vectors have (^) overstrikes in order to indicate that they apply to the a posteriori estimates, those conditioned on data up through yk+1.
Up to this point, the SRIF filtering algorithm is largely identical to the algorithm that would be employed if all of the states were real-valued. The principal difference lies in the manner of computing the filtered estimates x̂k+1 and n̂k+1 based on the data in R̂xxk+1, R̂xnk+1, R̂nnk+1, ẑxk+1, and ẑnk+1. The computation starts by solving the following ILLS problem
following ILLS problem
find:
nk+1
(16a)
1 [ Rˆ
nnk +1 n k +1
2
T
− zˆ nk+1 ] [ Rˆ nnk+1 n k+1 − zˆ nk+1 ]
to minimize:
J =
subject to:
nk+1 is an integer-valued vector
(16b)
(16c)
If nk+1 were permitted to be real-valued, then the minimizing solution would just be n̂k+1 = R̂nnk+1-1ẑnk+1. The restriction to integer values necessitates the use of a complicated search procedure, such as the algorithm of Ref. 9. The integer-valued vector that minimizes the cost in Eq. (16b) constitutes the optimal estimate n̂k+1. Given this estimate, straightforward linear algebra is used to determine the real-valued part of the state estimate:

\[ \hat{x}_{k+1} = \hat{R}_{xxk+1}^{-1} [\hat{z}_{xk+1} - \hat{R}_{xnk+1} \hat{n}_{k+1}] \qquad \text{(17)} \]
B. Suboptimal Filter
There is a second significant difference between the optimal Kalman filter of Subsection III.A and the Kalman
filter that would be implemented if the elements of nk+1 were allowed to take on any real values. The restriction of
nk+1 to integer values forces the optimal Kalman filter to carry forward all of its elements, regardless of whether they
affect the measurements at sample k+1 or beyond. Recalling Eq. (3), these carried-forward integers are contained in
the components Δn1, ..., Δnk. A standard SRIF would permute the vector [wk; xk+1; nk+1] in Eq. (14), simultaneously
making consistent permutations to the columns of the block matrix on the left-hand side of Eq. (14). This
permutation would place droppable components of nk+1, components that do not affect any measurements beyond
yk+1, ahead of xk+1. This permutation would result in a variant form of Eq. (15) that would allow both wk and the
droppable components of nk+1 to be discarded before proceeding to the next sample period.
The dropping of unnecessary components of nk+1 would limit the growth in the number of retained elements of
nk+1 as per Eq. (1b). This limitation of retained dimension would reduce the amount of computation and memory
required by the SRIF algorithm.
Unfortunately, the integer nature of nk+1 implies that the standard SRIF techniques for dropping components
would cause the filter to be sub-optimal. This is true because the SRIF dropping procedure relies on the condition
that the dropped quantities are real-valued.
It is possible, even desirable, however, to develop a suboptimal filter that drops elements of nk+1. If done
properly, only a slight accuracy loss will be incurred by treating the dropped elements as real-valued unknowns.
It is helpful to define a new vector of retained integers in order to develop the suboptimal filter. Suppose that
the number of retained Δnj components in addition to Δnk is min(i,k-1). That is, nominally the i most recent
components are retained in addition to those that affect the present measurements, unless k is so small that there do
not exist i components Δnj that satisfy j < k. These retained components and the components that directly affect the
present measurements are formed into the following integer vector:
⎡ Δnmax (1,k −i ) ⎤
⎥
⎢
M
mk = ⎢
⎥
Δn k
⎥
⎢
∂nk
⎦
⎣
for k = 1, ..., K
(18)
with m0 being defined as an empty vector of dimension 0. Suppose, also, that the retained state square-root information equations at sample k take the form:

\[ \hat{R}_{xxk} x_k + \hat{R}_{xmk} m_k = \hat{z}_{xk} - \hat{\nu}_{xk} \qquad \text{(19a)} \]
\[ \hat{R}_{mmk} m_k = \hat{z}_{mk} - \hat{\nu}_{mk} \qquad \text{(19b)} \]

These equations are similar to Eqs. (13a) and (13b), the only difference being the substitution of the truncated vector of integers mk in place of the full vector nk.
A typical set of integers that comprise mk is illustrated in Fig. 1 by four corresponding data-type line segments.
The four segments in question are labeled as applying to the suboptimal filter and smoother calculations for tk =
3000 sec. These four line segments correspond to a sample lag of i = 40. They overlap the time span from tk-40 =
2400 sec to tk = 3000 sec, which is indicated by the left half of the bracketed bar at the bottom of the figure.
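The retained-integer bookkeeping of Eq. (18) amounts to keeping at most i of the already-"set" blocks Δnj in addition to Δnk and the still-active block. A sketch with hypothetical values (blocks as plain Python lists, the active block standing in for ∂nk):

```python
def retained_vector(delta_blocks, active_block, k, i):
    """delta_blocks[j-1] holds Delta-n_j for j = 1..k; returns m_k per Eq. (18)."""
    start = max(1, k - i)                 # index of the oldest retained Delta-n
    kept = delta_blocks[start - 1:k]
    return [x for b in kept for x in b] + list(active_block)

# Hypothetical example with k = 5 and lag i = 2: only Delta-n_3..Delta-n_5 are
# kept; Delta-n_1 and Delta-n_2 have been dropped (treated as real-valued).
deltas = [[1], [], [3], [4, 5], [6]]      # Delta-n_1 .. Delta-n_5 (some empty)
m5 = retained_vector(deltas, [9, 9], k=5, i=2)
assert m5 == [3, 4, 5, 6, 9, 9]
```

For k ≤ i the same function keeps every block, which is the regime covered by Eq. (20a) below.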
The suboptimal dynamic propagation operation requires that mk in Eqs. (19a) and (19b) be expressed in terms of
the components to be dropped, Δnk-i, and the retained components at the next sample, mk+1. This expression takes
the form:
\[ m_k = \begin{bmatrix} I & 0 \\ 0 & [I, 0] \tilde{\Pi}_k^T \end{bmatrix} m_{k+1} \qquad \text{for } k = 1, \ldots, i \qquad \text{(20a)} \]

\[ m_k = \begin{bmatrix} \Delta n_{k-i} \\ \begin{bmatrix} I & 0 \\ 0 & [I, 0] \tilde{\Pi}_k^T \end{bmatrix} m_{k+1} \end{bmatrix} \qquad \text{for } k = i+1, \ldots, K \qquad \text{(20b)} \]

The matrix [I, 0]Π̃kT in Eqs. (20a) and (20b) multiplies the last two components of the mk+1 vector, [Δnk+1; ∂nk+1], to yield ∂nk, consistent with Eq. (4).
Substitution of Eq. (20b) into Eqs. (19a) and (19b) yields the modified square-root information equations:

\[ \hat{R}_{xxk} x_k + \hat{R}_{xmak} \Delta n_{k-i} + \hat{R}_{xmbk} m_{k+1} = \hat{z}_{xk} - \hat{\nu}_{xk} \qquad \text{(21a)} \]
\[ \hat{R}_{mmak} \Delta n_{k-i} + \hat{R}_{mmbk} m_{k+1} = \hat{z}_{mk} - \hat{\nu}_{mk} \qquad \text{(21b)} \]

where the matrices R̂xmak and R̂xmbk are composed of subsets of the columns of R̂xmk, and the matrices R̂mmak and R̂mmbk are composed of subsets of the columns of R̂mmk, as defined by the formulas

\[ \hat{R}_{xmak} = \hat{R}_{xmk} \begin{bmatrix} I \\ 0 \\ 0 \end{bmatrix}, \qquad \hat{R}_{xmbk} = \hat{R}_{xmk} \begin{bmatrix} 0 & 0 \\ I & 0 \\ 0 & [I, 0] \tilde{\Pi}_k^T \end{bmatrix} \qquad \text{(22a)} \]

\[ \hat{R}_{mmak} = \hat{R}_{mmk} \begin{bmatrix} I \\ 0 \\ 0 \end{bmatrix}, \qquad \hat{R}_{mmbk} = \hat{R}_{mmk} \begin{bmatrix} 0 & 0 \\ I & 0 \\ 0 & [I, 0] \tilde{\Pi}_k^T \end{bmatrix} \qquad \text{(22b)} \]
Substitution of Eq. (20a) into Eqs. (19a) and (19b) yields equations that are similar to Eqs. (21a) and (21b), except
that the Δnk-i terms are not present, and the formulas for R̂ xmbk and R̂mmbk are slightly different.
Equations (21a) and (21b) can be used in conjunction with standard SRIF dynamic propagation and measurement update techniques to produce the suboptimal filter's equivalent of Eq. (14). After appending process-noise information Eq. (2) and measurement Eq. (6) to Eqs. (21a) and (21b) and after using the dynamics model Eq. (1a) to eliminate xk, the new dynamic-propagation/measurement-update equation becomes:
\[
\begin{bmatrix}
0 & R_{wwk} & 0 & 0 \\
\hat{R}_{xmak} & -\hat{R}_{xxk}\Phi_k^{-1}\Gamma_k & \hat{R}_{xxk}\Phi_k^{-1} & \hat{R}_{xmbk} \\
\hat{R}_{mmak} & 0 & 0 & \hat{R}_{mmbk} \\
0 & H_{wk} - H_{xk}\Phi_k^{-1}\Gamma_k & H_{xk}\Phi_k^{-1} & H_{mk+1}
\end{bmatrix}
\begin{bmatrix} \Delta n_{k-i} \\ w_k \\ x_{k+1} \\ m_{k+1} \end{bmatrix}
=
\begin{bmatrix} 0 \\ \hat{z}_{xk} + \hat{R}_{xxk}\Phi_k^{-1}\eta_k \\ \hat{z}_{mk} \\ y_{k+1} + H_{xk}\Phi_k^{-1}\eta_k \end{bmatrix}
-
\begin{bmatrix} \nu_{wk} \\ \hat{\nu}_{xk} \\ \hat{\nu}_{mk} \\ \nu_{yk+1} \end{bmatrix}
\qquad (23)
\]
where
\[
H_{mk+1} = \tilde{H}_{nk+1} \begin{bmatrix} 0 & \cdots & 0 & I & 0 \\ 0 & \cdots & 0 & 0 & I \end{bmatrix} \qquad (24)
\]
QR factorization of the large block matrix on the left-hand side of Eq. (23) yields an orthonormal matrix Qbk that can be used to transform Eq. (23) into the following block upper-triangular form:
\[
\begin{bmatrix}
\hat{R}_{\Delta n\Delta nk} & \hat{R}_{\Delta nwk} & \hat{R}_{\Delta nxk+1} & \hat{R}_{\Delta nmk+1} \\
0 & \hat{R}_{wwk} & \hat{R}_{wxk+1} & \hat{R}_{wmk+1} \\
0 & 0 & \hat{R}_{xxk+1} & \hat{R}_{xmk+1} \\
0 & 0 & 0 & \hat{R}_{mmk+1} \\
0 & 0 & 0 & 0
\end{bmatrix}
\begin{bmatrix} \Delta n_{k-i} \\ w_k \\ x_{k+1} \\ m_{k+1} \end{bmatrix}
=
\begin{bmatrix} \hat{z}_{\Delta nk} \\ \hat{z}_{wk} \\ \hat{z}_{xk+1} \\ \hat{z}_{mk+1} \\ z_{rk+1} \end{bmatrix}
-
\begin{bmatrix} \hat{\nu}_{\Delta nk} \\ \hat{\nu}_{wk} \\ \hat{\nu}_{xk+1} \\ \hat{\nu}_{mk+1} \\ \nu_{rk+1} \end{bmatrix}
\qquad (25)
\]
where the definitions of the R matrices and the z vectors are similar to those given after Eq. (15) for the optimal
filter.
Given Eq. (25), the Kalman filter can implement a suboptimal approximation. That approximation is to estimate
Δnk-i, wk, xk+1, and mk+1 by minimizing the sum of the squared errors in Eq. (25), the sum of the squares of the
components of the ν vectors, subject only to the constraint that the elements of mk+1 be integers. The elements of
Δnk-i are permitted to be real-valued. This constraint relaxation allows the suboptimal filter to drop the first line of
Eq. (25) from consideration because its contribution to the optimal cost can be set to zero by the following
procedure: Assign ν̂_Δnk = 0 and solve the first equation for Δn_k-i in terms of the optimal values of wk, xk+1, and
mk+1. The remaining four lines of Eq. (25) are in the form of Eq. (15), and the suboptimal filter can proceed like the
optimal filter from this point onwards.
The determination of m̂_k+1 still involves the solution of an ILLS problem, like the one defined in Eqs. (16a)-(16c), except that the cost function in Eq. (16c) is replaced by half the sum of the squares of the errors in the fourth line of Eq. (25). Given the optimal integer solution m̂_k+1, the suboptimal filter computes x̂_k+1 = R̂_xxk+1^-1 [ẑ_xk+1 − R̂_xmk+1 m̂_k+1], as in Eq. (17).
The advantage of the suboptimal filter is that the ILLS problem for determining mˆ k+1 can have a lower
dimension than the problem for determining nˆ k+1 in Eqs. (16a)-(16c). This reduction can save significant amounts
of computation time and memory.
The forms of Eqs. (23) and (25) will vary if k < i+1 due to the difference between Eq. (20a) and Eq. (20b): There will be no Δn_k-i component in the modified forms of Eqs. (23) and (25). The large block matrices on the left-hand sides of both equations will lose their first columns, and Eq. (25) will lose its first row. All other computations involving these equations will proceed as already defined.
The accuracy of the suboptimal filter is likely to approach that of the optimal filter as i increases. A Kalman
filter implicitly considers all past measurements, those in the range j = 1, ..., k, when computing its state estimate at
sample k+1 15. An increase in i increases the number of past samples whose measurement models are correctly
considered, albeit on an implicit basis, when computing the estimates xˆ k+1 and mˆ k+1 . The relaxation of the
constraints that earlier ambiguities take on integer values, however, causes approximation errors to enter the
suboptimal filter's implicit consideration of measurements at samples j ≤ k-i. If i is large enough, then the impact
of these old measurements on xˆ k+1 and mˆ k+1 will be small due to the usual decaying measurement influence that is
caused by the presence of process noise in the dynamic model and by the filter's 1/√k averaging effect. Therefore,
suboptimal approximation of these old measurements' effects should have a negligible impact on xˆ k+1 and mˆ k+1 .
C. Comparison of Suboptimal Filter with Alternate Approaches
The Kalman filter of Ref. 4 effectively implements a suboptimal filter that is similar to the one defined in
Subsection III.B. The main difference is that it always uses i = 0. That is, it retains in mk+1 only those integers that
enter the measurements at sample k+1 or beyond. Therefore, because it allows i > 0, the present suboptimal filter
represents a generalization of Ref. 4's filter. The use of i = 0 is reasonable for the CDGPS relative orbit
determination problems that are the subject of Ref. 4. Their rich measurements cause the filter to have a very short
memory. For filters with a long memory, it is probably wise to use this paper's generalization.
An alternative to the suboptimal filter of Subsection III.B is to validate old integers and set them to fixed values
that never change. Validation refers to a procedure whereby a decision is made that the integer-valued estimate of a
particular element of nk+1 is definitely correct and, therefore, that the estimation algorithm can cease to re-estimate
this quantity. In the context of Eq. (23), validation and fixing of the Δnk-i vector would allow the filter to substitute
the validated estimate into the equation. The validated Δnk-i terms would then be subtracted from both sides of the
equation, leaving a left-hand side that consisted of a block matrix multiplying the vector [wk; xk+1; mk+1], similar to
Eq. (14). The filter could then proceed much like the optimal filter does in order to derive Eq. (15) from Eq. (14).
The result would be an optimal filter, but one whose computational load had been reduced to that of the suboptimal
filter.
Although integer validation and fixing has these advantages, this approach is not adopted by the present filter
because validation may not always be possible due to insufficient information. If validation is likely to work well
on a given problem, then the approaches of the present paper should be modified significantly. Alternatively, a
compromise strategy might be to allow for partial validation in which some of the dropped integers were validated
and fixed because there existed sufficient information about them. Any remaining unvalidated integer unknowns
would be dropped using the real-valued assumption, but only after the filter had progressed i+1 samples past the last
sample that involved a given unvalidated integer.
IV. Optimal and Suboptimal Solutions of the Smoothing Problem
A. Optimal Smoother
The optimal fixed-interval smoother is the solution of problem (11a)-(11c) for k+1 = K. Given the optimal
Kalman filter solution for x̂ K and n̂ K from Subsection III.A, the optimal smoothing problem can be solved by
using the standard SRIF implementation of the RTS smoother backwards pass 11. The smoothed estimate of nK is
n*_K = n̂_K, where the * superscript indicates the smoothed estimate here and throughout the remainder of this paper. The smoothed estimates n*_1, …, n*_K-1 can be generated by simple backwards recursion of Eq. (1b) because that equation simply models the appending of new integer constants to nk in order to generate nk+1. Therefore
\[
n_k^* = [\,I\;\;0\,]\, n_{k+1}^* \qquad \text{for } k = K-1, ..., 1 \qquad (26)
\]
The remainder of the optimal smoothing calculations work by recursively deriving a smoothed square-root
information equation for xk for k = K, …, 0. It takes the form:
\[
R_{xxk}^*\, x_k = z_{xk}^* - \nu_{xk}^* \qquad (27)
\]
where R*_xxk is the smoothed square-root information matrix, and z*_xk is the smoothed non-homogeneous term that stores information about the smoothed mean state estimate: x*_k = (R*_xxk)^-1 z*_xk. Their values at sample K are R*_xxK = R̂_xxK and z*_xK = ẑ_xK − R̂_xnK n̂_K, which implies that x*_K = x̂_K.
The RTS backwards recursion determines the smoothed square-root information equations for w0, …, wK-1 and
for x0, …, xK-1. It uses the first line of Eq. (15) from the filter, which is the a posteriori wk square-root information,
and it uses the dynamics model in Eq. (1a) along with the k+1 version of Eq. (27). Its first operation uses Eq. (1a)
to eliminate xk+1 from the first line of Eq. (15) and from the k+1 version of Eq. (27). The resulting pair of equations
takes the form
\[
\begin{bmatrix}
\hat{R}_{wwk} + \hat{R}_{wxk+1}\Gamma_k & \hat{R}_{wxk+1}\Phi_k \\
R_{xxk+1}^*\Gamma_k & R_{xxk+1}^*\Phi_k
\end{bmatrix}
\begin{bmatrix} w_k \\ x_k \end{bmatrix}
=
\begin{bmatrix} \hat{z}_{wk} - \hat{R}_{wxk+1}\eta_k - \hat{R}_{wnk+1} n_{k+1}^* \\ z_{xk+1}^* - R_{xxk+1}^*\eta_k \end{bmatrix}
-
\begin{bmatrix} \hat{\nu}_{wk} \\ \nu_{xk+1}^* \end{bmatrix}
\qquad (28)
\]
The first line of Eq. (28) uses the smoothed integer estimate n*_k+1 in place of n_k+1 in its modified version of the first line of Eq. (15).
The RTS smoother recursion uses QR factorization in order to compute an orthonormal matrix that transforms
the block matrix on the left-hand side of Eq. (28) into upper-triangular form. Call this matrix Qck. Multiplication of
both sides of Eq. (28) by this matrix yields the following system of two smoothed square-root information equations:
\[
\begin{bmatrix} R_{wwk}^* & R_{wxk}^* \\ 0 & R_{xxk}^* \end{bmatrix}
\begin{bmatrix} w_k \\ x_k \end{bmatrix}
=
\begin{bmatrix} z_{wk}^* \\ z_{xk}^* \end{bmatrix}
-
\begin{bmatrix} \nu_{wk}^* \\ \nu_{xk}^* \end{bmatrix}
\qquad (29)
\]
The second line of Eq. (29) is the smoothed square-root information equation for xk, as in Eq. (27). It provides
the needed inputs for the next recursion of the smoother. This recursion implements the Eqs. (28)-(29) calculations
backwards starting at sample k = K-1 and ending at sample k = 0.
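A single backward step of this SRIF RTS recursion amounts to one QR factorization of the stacked coefficient matrix, as in the passage from Eq. (28) to Eq. (29). A sketch with random stand-in blocks, not values from the paper's example:

```python
import numpy as np

rng = np.random.default_rng(1)
nw, nx = 2, 3

# Stand-in for the block matrix on the left-hand side of Eq. (28):
# [[Rhat_wwk + Rhat_wxk+1*Gam, Rhat_wxk+1*Phi],
#  [Rstar_xxk+1*Gam,           Rstar_xxk+1*Phi]]
A = rng.standard_normal((nw + nx, nw + nx))
b = rng.standard_normal(nw + nx)   # stacked right-hand side (noise-free)

# QR factorization: Q_ck^T A is upper triangular, giving Eq. (29)'s form.
Q, Rut = np.linalg.qr(A)
b_new = Q.T @ b

# The orthonormal transformation preserves the least-squares solution.
sol_orig = np.linalg.solve(A, b)
sol_new = np.linalg.solve(Rut, b_new)
assert np.allclose(sol_orig, sol_new)
assert np.allclose(np.tril(Rut, -1), 0.0)   # upper-triangular result
```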
Standard SRIF techniques 11 use Eq. (29) to compute the smoothed estimates w*_k and x*_k. The computation proceeds by setting the noise terms ν*_wk and ν*_xk to zero and then solving the resultant linear system of equations for wk and xk. Additional standard calculations use the block square-root information matrix on the left-hand side of Eq. (29) in order to compute the covariance of the estimation error in the vector [w*_k; x*_k]:
\[
\begin{bmatrix} P_{wwk}^* & P_{wxk}^* \\ (P_{wxk}^*)^T & P_{xxk}^* \end{bmatrix}
=
\begin{bmatrix} R_{wwk}^* & R_{wxk}^* \\ 0 & R_{xxk}^* \end{bmatrix}^{-1}
\begin{bmatrix} R_{wwk}^* & R_{wxk}^* \\ 0 & R_{xxk}^* \end{bmatrix}^{-T}
\qquad (30)
\]
where the notation ( )^-T indicates the inverse of the transpose of the matrix in question. P*_wwk is the w*_k estimation error covariance, P*_xxk is the x*_k estimation error covariance, and P*_wxk is the cross correlation between the errors in these two quantities. The validity of Eq. (30) depends on the assumption that the integer estimates in n*_K = n̂_K are exactly correct. If there is any significant probability of another integer value being correct for any of the components of nK, then a much more complicated covariance calculation is needed, one that will produce "larger" P*_wwk and P*_xxk matrices.
B. Suboptimal Smoother
If the suboptimal Kalman filter of Subsection III.B has been used to generate estimates x̂_0, …, x̂_K and m̂_1, …, m̂_K, then the optimal smoother of Subsection IV.A cannot be used because the necessary data in n*_K and in the first line of Eq. (15) will not have been generated. In this situation, a sensible approach is to develop a suboptimal RTS smoother that is consistent with the derivation of the suboptimal Kalman filter.
Like the suboptimal filter, the suboptimal smoother generates its estimates of wk and xk by treating certain
elements of nK as exact integers and others as being real-valued. The elements that are treated as exact integers are
those that affect any measurements within ± i samples of sample k. The relevant integers are contained in the
vector
mk
⎡
⎤
⎢
⎥
δnk
lk = ⎢
⎥
M
⎢δn
⎥
⎣ min ( K −1,k +i −1) ⎦
for k = 0, ..., K
(31)
Recall from the discussion following Eq. (1a) that the integers in δnj affect the measurements only at samples j+1
and beyond. Note that lK = mK, consistent with Eq. (31).
Figure 1 indicates the integers in a typical lk vector via the six corresponding measurement-type line segments.
They are labeled as applying to the suboptimal filter and smoother or just to the smoother at sample time tk = 3000
sec. These are the only line segments whose time spans overlap the horizontal span of the bracketed bar at the
bottom of the figure. This bar extends from tk-i = 2400 sec to tk+i = 3600 sec.
The suboptimal smoother works with a coupled pair of square-root information equations for xk and lk that takes
the form:
\[
R_{xxk}^*\, x_k + R_{xlk}^*\, l_k = z_{xk}^* - \nu_{xk}^* \qquad (32a)
\]
\[
R_{llk}^*\, l_k = z_{lk}^* - \nu_{lk}^* \qquad (32b)
\]
with the usual definitions of the R matrices and of the z and ν vectors, as in Eqs. (13a) and (13b). The smoother initializes these equations at sample k = K by using the values R*_xxK = R̂_xxK, R*_xlK = R̂_xmK, z*_xK = ẑ_xK, R*_llK = R̂_mmK, and z*_lK = ẑ_mK, which come from the terminal sample of the suboptimal filter. The smoother then recursively computes Eqs. (32a) and (32b) for samples k = K-1, …, 0.
The backwards recursion of the suboptimal smoother requires an ability to express lk+1 in terms of lk and δnk+i.
The latter vector contains the ambiguities that get dropped from consideration as exact integers in propagating
backwards from sample k+1 to sample k. The required lk+1 dynamic model takes the form:
\[
l_{k+1} = \begin{bmatrix} I & 0 & 0 \\ 0 & \tilde{\Pi}_k & 0 \\ 0 & 0 & I \\ 0 & 0 & 0 \end{bmatrix} l_k
+ \begin{bmatrix} 0 \\ 0 \\ 0 \\ I \end{bmatrix} \delta n_{k+i}
\qquad \text{for } k = 0, ..., i \qquad (33a)
\]
\[
l_{k+1} = \begin{bmatrix} 0 & I & 0 & 0 \\ 0 & 0 & \tilde{\Pi}_k & 0 \\ 0 & 0 & 0 & I \\ 0 & 0 & 0 & 0 \end{bmatrix} l_k
+ \begin{bmatrix} 0 \\ 0 \\ 0 \\ I \end{bmatrix} \delta n_{k+i}
\qquad \text{for } k = i+1, ..., K-i-1 \qquad (33b)
\]
\[
l_{k+1} = \begin{bmatrix} 0 & I & 0 & 0 \\ 0 & 0 & \tilde{\Pi}_k & 0 \\ 0 & 0 & 0 & I \end{bmatrix} l_k
\qquad \text{for } k = K-i, ..., K-1 \qquad (33c)
\]
Equations (33a)-(33c) have been derived by consideration of Eqs. (4), (18), and (31). Equation (33b) is the standard case. The leading column of zeros in the block matrix coefficient of lk acts to drop the integers in Δn_k-i from l_k+1, and the first row in the block matrix acts to shift Δn_k-i+1, ..., Δn_k into their proper locations within l_k+1. The second row of the block matrix operates on the ∂n_k and δn_k components of lk, as per Eqs. (4), in order to produce the Δn_k+1 and ∂n_k+1 components of m_k+1 within the definition of l_k+1. The third row in the block matrix acts to shift δn_k+1, ..., δn_k+i-1 into their proper locations within l_k+1. The final row of zeros in the block matrix recognizes the fact that the integers in δn_k+i are not found in lk. In Eq. (33a), the leading column of zeros is missing from the block matrix because lk and l_k+1 both include the initial component Δn1. In Eq. (33c), the bottom row of zeros in the block matrix and the right-most δn_k+i term are missing because the integers in both lk and l_k+1 include those that affect the measurements at sample k = K. The proper dimensions of the various identity matrices in Eqs. (33a)-(33c) can be determined by a careful consideration of this description in conjunction with Eqs. (4), (18), and (31).
Substitution of Eq. (33b) into the sample k+1 versions of Eqs. (32a) and (32b) yields the modified square-root
information equations:
\[
R_{xxk+1}^*\, x_{k+1} + R_{xlak}^*\, l_k + R_{xlbk}^*\, \delta n_{k+i} = z_{xk+1}^* - \nu_{xk+1}^* \qquad (34a)
\]
\[
R_{llak}^*\, l_k + R_{llbk}^*\, \delta n_{k+i} = z_{lk+1}^* - \nu_{lk+1}^* \qquad (34b)
\]
The matrices R*_xlak and R*_xlbk consist of subsets of the columns of R*_xlk+1, and the matrices R*_llak and R*_llbk consist of subsets of the columns of R*_llk+1:
\[
R_{xlak}^* = R_{xlk+1}^* \begin{bmatrix} 0 & I & 0 & 0 \\ 0 & 0 & \tilde{\Pi}_k & 0 \\ 0 & 0 & 0 & I \\ 0 & 0 & 0 & 0 \end{bmatrix}, \qquad
R_{xlbk}^* = R_{xlk+1}^* \begin{bmatrix} 0 \\ 0 \\ 0 \\ I \end{bmatrix} \qquad (35a)
\]
\[
R_{llak}^* = R_{llk+1}^* \begin{bmatrix} 0 & I & 0 & 0 \\ 0 & 0 & \tilde{\Pi}_k & 0 \\ 0 & 0 & 0 & I \\ 0 & 0 & 0 & 0 \end{bmatrix}, \qquad
R_{llbk}^* = R_{llk+1}^* \begin{bmatrix} 0 \\ 0 \\ 0 \\ I \end{bmatrix} \qquad (35b)
\]
Substitution of Eq. (33a) or Eq. (33c) into Eqs. (32a) and (32b) yields equations that are similar to Eqs. (34a) and (34b), but with the following exceptions: The δn_k+i terms are not present when using Eq. (33c), and the formulas for R*_xlak and R*_llak change slightly in both cases.
The suboptimal smoother needs to use the first two lines of the suboptimal a posteriori square-root information
equations in Eq. (25). In order to use them, it first needs to re-express Δnk-i and mk+1 in terms of lk. This is possible
because of the relationships given in Eqs. (31) and (33a)-(33c) and because Δnk-i is the first component of lk by
virtue of being the first component of mk. The resulting modified versions of the first two lines of Eq. (25) are
\[
\hat{R}_{\Delta nwk}\, w_k + \hat{R}_{\Delta nxk+1}\, x_{k+1} + \hat{R}_{\Delta nlk}\, l_k = \hat{z}_{\Delta nk} - \hat{\nu}_{\Delta nk} \qquad (36a)
\]
\[
\hat{R}_{wwk}\, w_k + \hat{R}_{wxk+1}\, x_{k+1} + \hat{R}_{wlk}\, l_k = \hat{z}_{wk} - \hat{\nu}_{wk} \qquad (36b)
\]
where the two new matrices in these equations take the form
\[
\hat{R}_{\Delta nlk} = \hat{R}_{\Delta n\Delta nk}\, [\,I\;\;0\,] + \hat{R}_{\Delta nmk+1} \begin{bmatrix} 0 & I & 0 & 0 \\ 0 & 0 & \tilde{\Pi}_k & 0 \end{bmatrix}
\qquad \text{for } k = i+1, ..., K-1 \qquad (37a)
\]
\[
\hat{R}_{wlk} = \hat{R}_{wmk+1} \begin{bmatrix} 0 & I & 0 & 0 \\ 0 & 0 & \tilde{\Pi}_k & 0 \end{bmatrix}
\qquad \text{for } k = i+1, ..., K-i-1 \qquad (37b)
\]
in the nominal sample range where Eq. (33b) applies and in the later range where Eq. (33c) applies. In the earlier
sampling range of Eq. (33a), these formulas change to
\[
\hat{R}_{\Delta nlk} = \hat{R}_{\Delta nmk+1} \begin{bmatrix} I & 0 & 0 \\ 0 & \tilde{\Pi}_k & 0 \end{bmatrix}
\qquad \text{for } k = 0, ..., i \qquad (38a)
\]
\[
\hat{R}_{wlk} = \hat{R}_{wmk+1} \begin{bmatrix} I & 0 & 0 \\ 0 & \tilde{\Pi}_k & 0 \end{bmatrix}
\qquad \text{for } k = 0, ..., i \qquad (38b)
\]
The R̂_ΔnΔnk term disappears from Eq. (38a) in this case because the entire leading column disappears from the block matrix on the left-hand side of Eq. (25). These derivations presume i > 0. Modifications would have to be made for i = 0.
The backwards recursion of the suboptimal smoother is developed by using the dynamics model in Eq. (1) to
eliminate xk+1 from Eqs. (34a), (36a), and (36b). When coupled with Eq. (34b), the resulting system becomes
\[
\begin{bmatrix}
0 & \hat{R}_{\Delta nwk} + \hat{R}_{\Delta nxk+1}\Gamma_k & \hat{R}_{\Delta nxk+1}\Phi_k & \hat{R}_{\Delta nlk} \\
0 & \hat{R}_{wwk} + \hat{R}_{wxk+1}\Gamma_k & \hat{R}_{wxk+1}\Phi_k & \hat{R}_{wlk} \\
R_{xlbk}^* & R_{xxk+1}^*\Gamma_k & R_{xxk+1}^*\Phi_k & R_{xlak}^* \\
R_{llbk}^* & 0 & 0 & R_{llak}^*
\end{bmatrix}
\begin{bmatrix} \delta n_{k+i} \\ w_k \\ x_k \\ l_k \end{bmatrix}
=
\begin{bmatrix} \hat{z}_{\Delta nk} - \hat{R}_{\Delta nxk+1}\eta_k \\ \hat{z}_{wk} - \hat{R}_{wxk+1}\eta_k \\ z_{xk+1}^* - R_{xxk+1}^*\eta_k \\ z_{lk+1}^* \end{bmatrix}
-
\begin{bmatrix} \hat{\nu}_{\Delta nk} \\ \hat{\nu}_{wk} \\ \nu_{xk+1}^* \\ \nu_{lk+1}^* \end{bmatrix}
\qquad (39)
\]
where the order of the transformed equations from top to bottom is Eq. (36a), (36b), (34a) and (34b).
The suboptimal RTS smoother completes its one-sample recursion by using QR factorization to transform Eq.
(39) so that the block matrix on its left-hand side is upper triangular. The resulting system of square-root
information equations becomes:
\[
\begin{bmatrix}
R_{\delta n\delta nk}^* & R_{\delta nwk}^* & R_{\delta nxk}^* & R_{\delta nlk}^* \\
0 & R_{wwk}^* & R_{wxk}^* & R_{wlk}^* \\
0 & 0 & R_{xxk}^* & R_{xlk}^* \\
0 & 0 & 0 & R_{llk}^*
\end{bmatrix}
\begin{bmatrix} \delta n_{k+i} \\ w_k \\ x_k \\ l_k \end{bmatrix}
=
\begin{bmatrix} z_{\delta nk}^* \\ z_{wk}^* \\ z_{xk}^* \\ z_{lk}^* \end{bmatrix}
-
\begin{bmatrix} \nu_{\delta nk}^* \\ \nu_{wk}^* \\ \nu_{xk}^* \\ \nu_{lk}^* \end{bmatrix}
\qquad (40)
\]
The final two lines of Eq. (40) are in the form of Eqs. (32a) and (32b). They provide the inputs for the next
recursion of Eqs. (35a), (35b), (37a), (37b), (39) and (40) for backwards propagation from sample k to sample k-1.
Equation (40) enables the suboptimal smoother to compute its approximations of the smoothed values of wk, xk,
and lk. As with the suboptimal filter, it does this by minimizing the sum of the squares of the components of the ν
error vectors. This minimization constrains lk to be integer-valued, but it allows δnk+i to be real-valued. This
approach is consistent with its principle of retaining as exact integers only those elements of nK that affect
measurements within i samples of the current sample. The first step of this procedure determines l*_k by solving an ILLS problem like that in Eqs. (16a)-(16c), except that the cost function is
\[
J = 0.5\,[R_{llk}^*\, l_k - z_{lk}^*]^T [R_{llk}^*\, l_k - z_{lk}^*] \qquad (41)
\]
The integer result l*_k is substituted back into Eq. (40), and the smoothed estimates of wk and xk are computed by solving the 2nd and 3rd lines of Eq. (40) under the assumption that their ν error vectors equal 0. The resulting smoothed estimates are
\[
x_k^* = (R_{xxk}^*)^{-1} (z_{xk}^* - R_{xlk}^*\, l_k^*) \qquad (42a)
\]
\[
w_k^* = (R_{wwk}^*)^{-1} (z_{wk}^* - R_{wxk}^*\, x_k^* - R_{wlk}^*\, l_k^*) \qquad (42b)
\]
The smoothed estimate of δnk+i normally is not computed because it does not affect the other estimates and because
it is not likely to be very accurate due to the neglect of the constraint that it be integer-valued.
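The back-substitutions in Eqs. (42a) and (42b) exploit the triangular structure of Eq. (40). A minimal sketch with hypothetical matrices; generic linear solves are used in place of dedicated triangular solves:

```python
import numpy as np

rng = np.random.default_rng(2)
nw, nx, nl = 2, 3, 2

# Upper-triangular stand-ins for Rstar_wwk and Rstar_xxk, plus coupling blocks.
R_ww = np.triu(rng.standard_normal((nw, nw))) + 3 * np.eye(nw)
R_xx = np.triu(rng.standard_normal((nx, nx))) + 3 * np.eye(nx)
R_wx = rng.standard_normal((nw, nx))
R_wl = rng.standard_normal((nw, nl))
R_xl = rng.standard_normal((nx, nl))
z_w = rng.standard_normal(nw)
z_x = rng.standard_normal(nx)
l_star = np.array([5, -7])   # hypothetical integer estimate from the ILLS step

# Eq. (42a): x*_k = (Rstar_xxk)^-1 (z*_xk - Rstar_xlk l*_k)
x_star = np.linalg.solve(R_xx, z_x - R_xl @ l_star)
# Eq. (42b): w*_k = (Rstar_wwk)^-1 (z*_wk - Rstar_wxk x*_k - Rstar_wlk l*_k)
w_star = np.linalg.solve(R_ww, z_w - R_wx @ x_star - R_wl @ l_star)

# Both square-root information equations are satisfied with zero noise.
assert np.allclose(R_xx @ x_star + R_xl @ l_star, z_x)
assert np.allclose(R_ww @ w_star + R_wx @ x_star + R_wl @ l_star, z_w)
```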
The forms of Eqs. (39) and (40) will vary if k > K-i-1 due to the lack of a δn_k+i term in Eq. (33c). This lack
causes the δnk+i terms to drop out of Eqs. (39) and (40), and the large block matrices on the left-hand sides of these
equations lose their first columns. Otherwise, all computations involving Eqs. (39) and (40) proceed as described
above.
V. Example Mixed Real/Integer Estimation Problem
This section presents a model of a stochastic system that has real-valued dynamic states and constant integer-valued measurement ambiguities. It is a simplified abstraction of a CDGPS relative navigation problem, as in Ref. 4. It includes kinematics and measurement models that operate along a single spatial coordinate. This section also
discusses how this model has been used to develop a truth-model simulation. Data from this simulation have been
used to evaluate this paper's two new Kalman filters and smoothers, and those results are presented in Section VI.
A. Dynamic Model
The dynamic model of the real-valued unknowns is the discrete-time equivalent of a triple integrator kinematic
position/velocity/acceleration model that is driven by white-noise jerk. This model takes the form:
\[
x_{k+1} = \begin{bmatrix} 1 & \Delta t_k & 0.5\,\Delta t_k^2 \\ 0 & 1 & \Delta t_k \\ 0 & 0 & 1 \end{bmatrix} x_k
+ \sqrt{\Delta t_k\, q}
\begin{bmatrix} \Delta t_k^2/(2\sqrt{5}) & 0 & 0 & 0 \\ \Delta t_k \sqrt{5}/4 & \Delta t_k/(4\sqrt{3}) & 0 & 0 \\ \sqrt{5}/3 & 1/\sqrt{3} & 1/3 & 0 \end{bmatrix} w_k
\qquad \text{for } k = 0, ..., K-1 \qquad (43)
\]
where the three elements of xk are, respectively, position, velocity, and acceleration, where Δtk = tk+1 - tk is the time interval from sample k to sample k+1, where wk is the 4-element process-noise vector that constitutes the discrete-time equivalent of the continuous-time white-noise jerk process, and where q is the power spectral density of the continuous-time jerk in m2/sec5 units. The two matrices in Eq. (43) constitute the Φk and Γk matrices of the general model form given in Eq. (1a). The Γk matrix in Eq. (43) and the related measurement model matrix have been chosen so that the discrete-time process-noise statistics are E{wk} = 0 and E{wkwjT} = δkjI4x4. Therefore, Rwwk = I4x4 in Eq. (2). This model does not include a non-homogeneous term; in other words, ηk = 0.
As is evident in Eq. (43), only 3 elements of wk are needed in order to properly model the effect of process noise
on the real-valued part of the state at the sample times tk, tk+1, ... The inclusion of a 4th element of wk is needed in
order to properly model the effect of the continuous-time random jerk on the system's measurements, which are
modeled as occurring at the mid-point of each sample interval.
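The Γk matrix of Eq. (43) can be checked numerically: with E{wkwkT} = I, the state portion of ΓkΓkT must equal the standard discrete-time covariance of continuous white-noise jerk, Q = q·[Δt^5/20, Δt^4/8, Δt^3/6; Δt^4/8, Δt^3/3, Δt^2/2; Δt^3/6, Δt^2/2, Δt]. A sketch using the example's nominal sample interval and jerk PSD:

```python
import numpy as np

dt, q = 15.0, 9e-10   # nominal sample interval (sec) and jerk PSD (m^2/sec^5)

# Gamma_k from Eq. (43); with identity process-noise covariance,
# Gamma @ Gamma.T is the state process-noise covariance.
Gamma = np.sqrt(dt * q) * np.array([
    [dt**2 / (2 * np.sqrt(5)), 0.0,                   0.0,     0.0],
    [dt * np.sqrt(5) / 4,      dt / (4 * np.sqrt(3)), 0.0,     0.0],
    [np.sqrt(5) / 3,           1 / np.sqrt(3),        1.0 / 3, 0.0],
])

# Standard analytic discrete-time covariance of continuous white-noise jerk
# for a position/velocity/acceleration triple-integrator state.
Q = q * np.array([
    [dt**5 / 20, dt**4 / 8, dt**3 / 6],
    [dt**4 / 8,  dt**3 / 3, dt**2 / 2],
    [dt**3 / 6,  dt**2 / 2, dt],
])

assert np.allclose(Gamma @ Gamma.T, Q)
```

The fourth, zero column of Γk leaves room for the extra wk element that couples into the mid-interval measurements of Eq. (44).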
B. Measurement Model with Integer Ambiguities
The measurements are position at the mid-point of each sampling interval, but with integer ambiguities. The
measurements' sensitivities to the integer ambiguities vary with time in order to ensure that the ambiguities and the
position component of xk are independently observable. There are numerous different measurement "types" that
come on-line and that go off-line throughout the data batch. Each different "type" is distinguished by having its own unique integer ambiguity and its own unique time history of the sensitivity of its output to its ambiguity.
The measurement model takes the form
\[
y_{k+1} = \begin{bmatrix} 1/\sigma_{s_{k1}} & & 0 \\ & \ddots & \\ 0 & & 1/\sigma_{s_{kr_k}} \end{bmatrix}
\left\{
\begin{bmatrix} 1 & 0.5\,\Delta t_k & 0.125\,\Delta t_k^2 \\ \vdots & \vdots & \vdots \\ 1 & 0.5\,\Delta t_k & 0.125\,\Delta t_k^2 \end{bmatrix} x_k
+ \sqrt{\Delta t_k\, q}\,\Delta t_k^2
\begin{bmatrix} \tfrac{31\sqrt{5}}{1920} & -\tfrac{\sqrt{3}}{128} & \tfrac{1}{192} & \tfrac{\sqrt{5}}{320} \\ \vdots & \vdots & \vdots & \vdots \\ \tfrac{31\sqrt{5}}{1920} & -\tfrac{\sqrt{3}}{128} & \tfrac{1}{192} & \tfrac{\sqrt{5}}{320} \end{bmatrix} w_k
+ \begin{bmatrix} \tilde{h}_{k,s_{k1}} & & 0 \\ & \ddots & \\ 0 & & \tilde{h}_{k,s_{kr_k}} \end{bmatrix}
\begin{bmatrix} n_{s_{k1}} \\ \vdots \\ n_{s_{kr_k}} \end{bmatrix}
\right\} + \nu_{yk+1}
\qquad \text{for } k = 0, ..., K-1 \qquad (44)
\]
The set Sk = {s_k1, ..., s_krk} contains the absolute indices into the nK cumulative integer ambiguity vector of the rk elements that affect the rk measurements in the vector yk+1. Each of the rk elements of yk+1 is an independent measurement of the system's position at the mid-point time 0.5(tk+tk+1). The standard deviation σ_skp is the measurement error standard deviation of the pth measurement in yk+1, and the ratio h̃_k,skp/σ_skp is the sensitivity of this measurement to its corresponding integer ambiguity.
The subscript notation for σ s indicates that the measurement error standard deviation is constant for all
measurements of the "type" associated with the integer ambiguity whose index in nK is s. This is true regardless of
which sample index k applies for a given measurement, hence the dropping of the kp sub-subscript in this paragraph
and in the next.
The leading k subscript in h̃_k,s indicates variations with the sample count k of the given measurement "type's" sensitivity to its associated integer ambiguity. These variations are modeled as being linear in the sample count: h̃_k,s = h̃_0,s + (∂h̃/∂n)_s k, where h̃_0,s and (∂h̃/∂n)_s are constants for a given ambiguity index s. This linear variation of the ambiguities' measurement sensitivities is an acceptable approximation of how ambiguities enter a CDGPS estimation problem.
The index set Sk varies from sample to sample in such a way that a given ambiguity index s is an element of all measurements within a given range of samples. That is, s ∈ Sk for all kmins ≤ k ≤ kmaxs. These index sets can be used to determine the definitions of the partial ambiguity vectors nk, δnk, Δnk, and ∂nk and of the permutation matrices Π_k and Π̃_k, as defined in Section II. These definitions and the form of Eq. (44) serve to define the matrices of the measurement model form in Eq. (6): Hxk, Hwk, and H̃_nk+1.
The particular problem at hand encompasses K = 600 sample intervals spread out fairly evenly over 9000
seconds so that the mean value of Δtk is 15 seconds. There is some variation of the sample intervals over the range
13.8 seconds to 16.2 seconds.
There are 40 elements of the cumulative integer ambiguity vector nK. They correspond to 40 different
independent measurement "types". In a double-differenced CDGPS problem, this would correspond to 40
independent double-differenced integer ambiguities, which implies that 41 satellites would be visible over the 2.5
hour data batch. This number of available measurement "types" during this time span would be optimistic for a
terrestrial CDGPS application, but it would be reasonable or even pessimistic for an application in LEO 5. The
measurement error standard deviations, σs for s = 1, ..., 40, have been chosen randomly to lie in the range 0.0031 m
to 0.0098 m, with a mean of 0.0051 m. This is consistent with typical CDGPS measurement accuracies.
The unnormalized sensitivities h̃_k,s and the sample ranges kmins to kmaxs for s = 1, ..., 40 have also been chosen randomly. They are best indicated by showing the 40 h̃_k,s vs. tk curves over the 40 sample ranges k = kmins, ..., kmaxs, one curve and time span for each s = 1, ..., 40. The resulting 40 line segments are plotted in Fig. 1. This figure shows that the duration of the availability of any given measurement ranges from 250 seconds to 1280 seconds and that the unnormalized sensitivities h̃_k,s range from 0.0013 m to 0.011 m. These unnormalized sensitivities have been chosen to be relatively small. Their smallness forces the filter and smoother algorithms to use significant lengths of each measurement "type's" data arc in order to resolve its corresponding ambiguity as an exact integer.
C. Truth-Model Simulation
Truth-model simulation in the present SRIF framework starts by randomly sampling zero-mean, identity-covariance Gaussian distributions in order to generate the "truth" random vectors ν̂_x0, ν_wk for k = 0, ..., K-1, and ν_yk+1 for k = 0, ..., K-1, which define, respectively, the initial estimation error, the process noise, and the
measurement noise. This is accomplished by using a numerical random number generator. Given these random
vectors, the "truth" process-noise time history is generated by solving Eq. (2) for wk for k = 0, ..., K-1. The "truth" initial state is determined as x_0 = R̂_xx0^-1 [ẑ_x0 − ν̂_x0], and the dynamic model in Eq. (1a) is iterated for k = 0, ..., K-1 in
order to compute the "truth" values for x1, ..., xK. The resulting "truth" state and process-noise time histories are
used along with "truth" integer ambiguities and the "truth" measurement noise in order to synthesize the simulated
noisy measurements, as per Eq. (6). The "truth" cumulative integer ambiguity vector nK, which is needed in the
computation of the simulated measurements, is generated by independently sampling each of its 40 elements from
an equi-probable random integer distribution that covers the range from -10000 to +10000.
The following numerical values have been used in the truth-model simulation: q = 9×10^-10 m2/sec5, R̂_xx0 = diag[1/(1000 m), 1/(4.5 m/sec), 1/(0.00073575 m/sec2)], and x̂_0 = R̂_xx0^-1 ẑ_x0 = [30000 m; -4.5 m/sec; -0.0004905 m/sec2].
VI. Filter and Smoother Results for Truth-Model Data
Data from the truth-model simulation of Section V have been used to test the new filtering and smoothing
algorithms of Sections III and IV. In addition, a third filter/smoother pair has been applied to the data, one that
treats all of the integer ambiguities as real-valued unknowns for all samples. These latter algorithms can be
implemented using standard SRIF techniques. They have been included as baseline comparison cases.
Figure 2 plots the x1(t) position error time histories for three filters, and Fig. 3 plots the corresponding errors for
the three related smoothers. The blue solid curve in Fig. 2 is for the optimal filter of Subsection III.A, and Fig. 3's
blue solid curve is for the optimal smoother of Subsection IV.A. These two algorithms always treat all 40
ambiguities as exact integers. The two figures' green dotted curves are for the suboptimal filter and smoother of,
respectively, Subsections III.B and IV.B. They have been tuned to estimate as exact integers only those ambiguities
that affect measurements which lie within i = 40 samples of any given estimation time. This value of i translates
into a time range of approximately +0/-600 sec for the filter and ± 600 sec for the smoother. The two figures' red
dash-dotted curves are for the suboptimal standard SRIF filter and smoother that estimate all of the integers as real-valued quantities. Note that the vertical scales of the two figures are different by almost an order of magnitude, consistent with the fact that a smoother is more accurate than a filter.
[Figure: x1 filter error (m) vs. Time (sec); curves: Optimal; Suboptimal, i = 40; Suboptimal, no integers.]
Figure 2. Comparison of x1 errors of 3 filters.
[Figure: x1 smoother error (m) vs. Time (sec); curves: Optimal; Suboptimal, i = 40; Suboptimal, no integers.]
Figure 3. Comparison of 3 smoothers' x1 errors.
Figure 2 demonstrates that all 3 filters converge rapidly, reaching steady-state performance within the first 1000 sec. A very close examination of the steady-state parts of the curves reveals that the optimal filter is slightly more accurate than the suboptimal filter that uses i = 40. The least accurate filter is the one that treats all of the ambiguities as being real-valued quantities. The steady-state RMS errors of the latter two filters are, respectively, 1.2% and 3.0% larger than that of the optimal filter. Thus, use of the new optimal or suboptimal filter does not appear to make a significant difference for this example problem.
Considering Fig. 3, it is apparent that Subsection IV.B's suboptimal smoother produces results that are almost
identical to those of the exact optimal smoother. The only noticeable differences between the optimal blue solid
curve and the suboptimal green dotted curve occur during the short time span from t = 6210 sec to t = 6450 sec.
The all-reals suboptimal smoother, the red curve, is obviously much less accurate. Its RMS x1 error is 41% larger
than that of the optimal smoother, whereas Subsection IV.B's suboptimal smoother has an RMS error that is only
3% larger than the optimal value. The all-reals suboptimal smoother also has a low-frequency wander in its error
time history. The other two smoothers' error time histories display no such wander.
It is worthwhile to examine the noted brief discrepancy between the errors of the Subsection IV.B suboptimal
smoother and those of the optimal smoother. This discrepancy is the result of errors in the suboptimal smoother's
integer ambiguity estimates. Of course the suboptimal smoother makes erroneous estimates of those ambiguities
that it approximates as real-valued quantities, but in some cases the suboptimal smoother also makes errors in its
estimates of the ambiguities that it treats as exact integers. Recall that these ambiguities are the elements of the lk
vector in the ILLS problem in Eq. (41). Errors between the suboptimal smoothed estimates of lk and their
corresponding "truth" values arise between t = 5640 sec and t = 6450 sec. Note, however, that the erroneous
integers are never different from the "truth" values by more than 1. No integer ambiguity errors occur for the optimal smoother in this case, even though they are theoretically possible. The likelihood of such errors increases, however, for the suboptimal smoother, as illustrated by this example. It is surprising that there is almost no discrepancy
between the optimal and suboptimal smoothed x1 estimates during the time interval from t = 5640 sec to t = 6210
sec even though the suboptimal smoother experiences integer errors during this interval. The worst-case difference
between the optimal and suboptimal smoothers is only 0.0058 m. Thus, the suboptimal smoother of Subsection
IV.B achieves very good performance for this test case.
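The integer part of each such problem amounts to finding the integer vector that minimizes a quadratic form about the real-valued (float) solution. The sketch below is a toy brute-force version of that search, not the paper's ILLS solver or the LAMBDA method (which decorrelates the covariance before searching); all names and numbers are illustrative. It shows why a search is needed at all: with strongly correlated ambiguities, the minimizing integer vector can differ from the componentwise rounding of the float solution.

```python
import itertools
import numpy as np

def ills_bruteforce(n_float, Q, radius=2):
    """Toy integer least-squares solve: minimize (n - n_float)^T Q^-1
    (n - n_float) over integer n, by exhaustive search in a small box
    around the rounded float solution."""
    Qinv = np.linalg.inv(Q)
    base = np.round(n_float).astype(int)
    best, best_cost = None, np.inf
    for offs in itertools.product(range(-radius, radius + 1), repeat=len(base)):
        cand = base + np.array(offs)
        r = cand - n_float
        cost = r @ Qinv @ r
        if cost < best_cost:
            best, best_cost = cand, cost
    return best, best_cost

n_float = np.array([3.6, -1.55])
Q = np.array([[0.25, 0.2], [0.2, 0.25]])  # strongly correlated ambiguities
n_hat, cost = ills_bruteforce(n_float, Q)
print(n_hat)  # [ 4 -1], whereas naive rounding gives [ 4 -2]
```

For realistic ambiguity counts the brute-force box search is intractable, which is exactly why the LAMBDA decorrelation and tree-search machinery of Refs. 6-9 is used instead.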
Although they are slightly less accurate, the suboptimal filter and smoother of Sections III and IV have the advantage of lower computational cost. This advantage is illustrated by Fig. 4, which plots three metrics of the
cumulative computational burden of solving ILLS problems. One metric, the solid blue curve, is for the optimal
Kalman filter's ILLS problems, those in Eqs. (16a)-(16c). The second metric, the dotted green curve, is for the
suboptimal Kalman filter's ILLS problems, those associated with the 4th row of Eq. (25). The third metric, the
dashed red curve, is for the suboptimal smoother's ILLS problems, which minimize the cost function in Eq. (41).
The metric Ltot is used to characterize the burden associated with any one ILLS problem. It is defined in Section
III.C of Ref. 9. Ltot is a count of the number of solution algorithm minor search steps that are required in order to
exactly solve the ILLS problem. It is a fairly reliable indicator of the overall computation time that would be required to solve each problem.9 At any given time, each plotted performance metric in Fig. 4 represents the
cumulative sum of the Ltot values of all of the ILLS problems that have been solved up to and including that time.
The Ltot sums for the dashed red curve include the final value of the dotted green curve because the suboptimal filter
solutions are computed prior to running the suboptimal smoother.
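The bookkeeping behind these curves is a simple running sum. The sketch below illustrates it with hypothetical per-problem Ltot counts (the numbers are invented, not taken from the simulation); the smoother's curve is offset by the filter's terminal value, as described above, and a zero count models a backwards-pass step whose integer set is unchanged and therefore needs no new ILLS solve.

```python
import itertools

def cumulative_metric(ltot_per_solve):
    """Running sum of per-problem Ltot search-step counts."""
    return list(itertools.accumulate(ltot_per_solve))

# Hypothetical per-update Ltot counts for a short run:
filter_ltot = [5, 8, 12, 30, 55]
smoother_ltot = [3, 0, 0, 4]  # zero when the integer set is unchanged

filter_curve = cumulative_metric(filter_ltot)
# The smoother's curve starts from the filter's terminal value, as in Fig. 4.
smoother_curve = [filter_curve[-1] + c for c in cumulative_metric(smoother_ltot)]
print(filter_curve)    # [5, 13, 25, 55, 110]
print(smoother_curve)  # [113, 113, 113, 117]
```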
The dotted green curve on Fig. 4 is lower than the solid blue curve after t = 2000 sec because the suboptimal
filter's ILLS problems after this point are of significantly lower dimension than those of the optimal filter. The
largest dimension of the suboptimal filter's exact-integer mk vector is 12, whereas the largest dimension of the
optimal filter's nk vector is 40. These lower ILLS problem dimensions tend to yield lower Ltot values.
The dashed red curve for the suboptimal smoother rises very slowly from the terminal value of the green curve.
This rise is slow because the smoother's integer ambiguity set often does not change from sample to sample during
its backwards pass. The ILLS problem associated with Eq. (41) needs to be re-solved only when this integer set
changes.
There is no need to produce a similar backwards-pass curve for the optimal smoother. The optimal smoothed
values of the integer ambiguities are simply the solution to the ILLS problem in Eqs. (16a)-(16c) at the terminal
Kalman filtering sample k+1 = K. Thus, the optimal smoother's equivalent of the red dashed curve would be a
horizontal line extended backwards from the terminal maximum value of the solid blue optimal filter curve.
It is evident from Fig. 4 that the suboptimal filter and smoother achieve significant computational cost savings in
comparison to the optimal filter/smoother. The computational cost of solving an ILLS problem can increase very
rapidly with the number of integer unknowns. In the optimal filter, the number of integer unknowns increases
steadily as k increases. The potential dimension of its nk integer ambiguity vector is ultimately bounded only by the
finite duration of the data interval. Therefore, the computation time performance metric of the optimal Kalman
filter can increase dramatically towards the end of the data interval, as shown by the solid blue curve in Fig. 4. The
suboptimal filter and smoother, on the other hand, have ILLS dimension bounds that are implicitly determined by i,
the number of sample intervals over which ambiguities are treated as exact integers. This is why the dotted green
curve grows less rapidly than the solid blue curve after t = 2000 sec. These bounds on the ILLS problem
dimensions tend to translate into bounds on the rates of growth of the dotted green curve and the dashed red curve.
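The window rule that produces these bounds can be sketched as a simple partition: an ambiguity is kept as an exact integer only if it affects a measurement within i samples of the current sample k, and is otherwise demoted to a real-valued unknown and partitioned out. The function and data names below are illustrative, not the paper's notation.

```python
def partition_ambiguities(ambiguity_last_sample, k, i=40):
    """Split ambiguity indices into exact-integer and real-valued sets.

    ambiguity_last_sample maps each ambiguity index to the most recent
    sample index at which it affects a measurement; an ambiguity stays an
    exact integer only while that sample is within i samples of k.
    """
    exact, real = [], []
    for idx, last in ambiguity_last_sample.items():
        (exact if k - last <= i else real).append(idx)
    return exact, real

# Hypothetical bookkeeping at sample k = 100 with window i = 40:
last_seen = {0: 10, 1: 65, 2: 90, 3: 100}
exact, real = partition_ambiguities(last_seen, k=100, i=40)
print(exact, real)  # [1, 2, 3] [0]
```

Because the number of ambiguities that can affect measurements inside a fixed window of i samples is bounded, so is the dimension of each ILLS sub-problem.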
The suboptimal smoother entails added ILLS computational cost during its backwards pass. This differs from
the optimal smoother, as already discussed. Despite this added cost, the suboptimal smoother is less expensive
computationally than is the optimal smoother; the suboptimal smoother's added costs are not nearly as large as the
added costs of implementing an optimal Kalman filter versus a suboptimal filter. Even after its entire backwards
pass, the suboptimal smoother, in conjunction with the suboptimal Kalman filter, incurs less than 35% of the
computational cost that is incurred by the optimal Kalman filter.
Figure 4. Comparison of cumulative Ltot ILLS execution cost metric time histories for optimal and suboptimal filters and for a suboptimal smoother. (Plot: cumulative Ltot ILLS execution cost metric, x 10^4, vs. Time (sec), 0-9000 sec; curves: Optimal KF; Suboptimal KF, i = 40; Suboptimal KF & Smoother, i = 40.)
VII. Summary and Conclusions
Two new Kalman filters and their associated fixed-interval smoothers have been developed for a class of mixed
real/integer linear estimation problems in which the integers, also called ambiguities, are known to be constants and
to enter only the measurement equations. These estimation algorithms have been developed using square-root
information filter/smoothing techniques. One of the filter/smoother pairs is an optimal estimator that solves the
resulting maximum a posteriori estimation problem by using integer linear least-squares techniques to deal with its
integer part. Each optimal Kalman filter update must re-solve for all integer ambiguities that affect past and current
measurements. The optimal smoother uses the integer ambiguity estimates from the final Kalman filter sample time.
The computational cost of the optimal solution can be very large due to the linear growth of the dimension of its
integer least-squares problem as the number of samples grows.
The suboptimal filter and its associated smoother treat some of the ambiguities as exact integers and others as
being real-valued. The latter ambiguities can be conveniently partitioned out of the problem in order to bound the
size of the integer linear least-squares sub-problem that must be solved during each Kalman filter measurement
update and during some of the backwards recursion steps of the smoother. The suboptimal filter and smoother
choose which ambiguities to retain as exact integers based on whether they affect measurements within a certain
time offset from the current sample of interest. This strategy can save a significant amount of computational cost
without significantly affecting the accuracy of the estimates. The retention of accuracy arises because filters and smoothers tend to have "forgetting factors" whereby measurements distant in time from a given time point of interest do not have a significant impact on the state estimate at that time. Therefore, it should be allowable to mis-model some of the distant measurements by not treating their associated ambiguities as exact integers.
The new Kalman filtering and smoothing algorithms have been tested using data from a truth-model simulation
of an example estimation problem. The problem is a simplified 1-dimensional approximation of a system that uses
carrier-phase differential GPS techniques to estimate the relative position, velocity, and acceleration between two
receivers. The suboptimal filter achieved nearly optimal accuracy even though some of its measurement updates
retained only 12.5% of the original problem's ambiguities as exact integers. The suboptimal smoother was similarly
successful even though it sometimes treated only 5% of the original ambiguities as exact integers. These low
numbers of exact integers allowed the suboptimal filter and smoother to achieve dramatic computational time
savings. The suboptimal filter's cumulative computational burden was less than 31% of the burden of the optimal
filter. The suboptimal filter/smoother combination used less than 35% of the computation time that was used by
their optimal counterparts.
It must be noted that both new filters exhibited only marginal accuracy improvements in comparison to a simple
filter that treats all of the ambiguities as real-valued quantities. In the case of smoothing, however, the new optimal
and suboptimal algorithms were significantly more accurate than a simple algorithm that completely ignored the
integer nature of the ambiguities.
References
1. Goad, C., "Surveying with the Global Positioning System," in Global Positioning System: Theory and Applications, Vol. II, Parkinson, B.W., and Spilker, J.J., Jr., eds., American Institute of Aeronautics and Astronautics, (Washington, 1996), pp. 501-517.
2. Kroes, R., Montenbruck, O., Bertiger, W., and Visser, P., "Precise GRACE Baseline Determination Using GPS," GPS Solutions, Vol. 9, No. 1, April 2005, pp. 21-31.
3. Psiaki, M.L., and Mohiuddin, S., "Modeling, Analysis, and Simulation of GPS Carrier Phase for Spacecraft Relative Navigation," Journal of Guidance, Control, and Dynamics, Vol. 30, No. 6, Nov.-Dec. 2007, pp. 1628-1639.
4. Mohiuddin, S., and Psiaki, M.L., "Carrier-Phase Differential Global Positioning System Navigation Filter for High-Altitude Spacecraft," Journal of Guidance, Control, and Dynamics, Vol. 31, No. 4, July-Aug. 2008, pp. 801-814.
5. Mohiuddin, S., and Psiaki, M.L., "High-Altitude Satellite Relative Navigation Using Carrier-Phase Differential Global Positioning System Techniques," Journal of Guidance, Control, and Dynamics, Vol. 30, No. 5, Sept.-Oct. 2007, pp. 1427-1436.
6. Teunissen, P.J.G., "The Least-Squares Ambiguity Decorrelation Adjustment: A Method for Fast GPS Integer Ambiguity Estimation," Journal of Geodesy, Vol. 70, Nos. 1-2, Nov. 1995, pp. 65-82.
7. Teunissen, P.J.G., "GPS Carrier Phase Ambiguity Fixing Concepts," GPS for Geodesy, 2nd ed., P.J.G. Teunissen and A. Kleusberg, eds., Springer, (New York, 1998), pp. 319-388.
8. Verhagen, S., and Teunissen, P.J.G., "New Global Navigation Satellite System Ambiguity Resolution Method Compared to Existing Approaches," Journal of Guidance, Control, and Dynamics, Vol. 29, No. 4, July-Aug. 2006, pp. 981-991.
9. Psiaki, M.L., and Mohiuddin, S., "Global Positioning System Integer Ambiguity Resolution Using Factorized Least-Squares Techniques," Journal of Guidance, Control, and Dynamics, Vol. 30, No. 2, March-April 2007, pp. 346-356.
10. Wolfe, J.D., Speyer, J.L., Hwang, S., Lee, Y.J., and Lee, E., "Estimation of Relative Satellite Position Using Transformed Differential Carrier-Phase GPS Measurements," Journal of Guidance, Control, and Dynamics, Vol. 30, No. 5, Sept.-Oct. 2007, pp. 1217-1227.
11. Bierman, G.J., Factorization Methods for Discrete Sequential Estimation, Academic Press, (New York, 1977), pp. 69-76, 115-122, 214-217.
12. Rauch, H.E., Tung, F., and Striebel, C.T., "Maximum Likelihood Estimates of Linear Dynamic Systems," AIAA Journal, Vol. 3, No. 8, 1965, pp. 1445-1450.
13. Psiaki, M.L., "Null-Space Square-Root Information Filtering and Smoothing for Singular Problems," Journal of Guidance, Control, and Dynamics, Vol. 29, No. 3, May-June 2006, pp. 695-703.
14. Gill, P.E., Murray, W., and Wright, M.H., Practical Optimization, Academic Press, (New York, 1981), pp. 37-40.
15. Psiaki, M.L., "Backward-Smoothing Extended Kalman Filter," Journal of Guidance, Control, and Dynamics, Vol. 28, No. 5, Sept.-Oct. 2005, pp. 885-894.