
12th International Conference on Information Fusion
Seattle, WA, USA, July 6-9, 2009
Best Linear Unbiased State Estimation with Noisy and Noise-free
Measurements∗
Zhansheng Duan X. Rong Li
Department of Electrical Engineering
University of New Orleans
New Orleans, LA 70148, U.S.A.
{zduan,xli}@uno.edu
Abstract – Numerous state estimation problems (e.g., under linear or nonlinear equality constraints, or with correlated or singular measurement noise) can be formulated as the problem with both noisy and noise-free measurements. Under the assumption that the initial state and the noise are jointly Gaussian distributed and that the coefficient matrix of the linear equality constraint has full row rank, some optimal state estimation algorithms were obtained in the literature. Given only the first two moments and without any assumption on the rank of the measurement matrix of the noise-free measurement, an optimal batch form and two sequential forms of the state estimation algorithm in the sense of best linear unbiased estimation (BLUE) are obtained in this paper. Extension to nonlinear measurements is also discussed.
Keywords: State estimation, noisy measurement, noise-free measurement, recursibility, BLUE.
1 Introduction
Practical measurements are noisy most of the time, which imposes uncertainty on the quantities of interest. One major effort is then to remove the noise and obtain a better guess of those quantities. For instance, high-frequency components are usually treated as noise, and different filters are designed to filter them out. Received signals are usually noisy, so denoising tries to recover signals that are as close to the original signals as possible.
As will be shown in the following, however, numerous state estimation problems can be formulated as those with both noisy and noise-free measurements¹ (e.g., state estimation problems under linear or nonlinear equality constraints, or with correlated or singular measurement noise).
For the case with noise-free measurements only, one
heuristic way is to increase the zero diagonal elements
of the measurement noise covariance matrix artificially to
∗ Research supported in part by NSFC grant 60602026, Project 863
through grant 2006AA01Z126, ARO through grant W911NF-08-1-0409
and NAVO through Contract # N62306-09-P-3S01. Z. Duan is also with the
College of Electronic and Information Engineering, Xi’an Jiaotong University.
¹ A noise-free measurement is also called a perfect measurement.
978-0-9824438-0-4 ©2009 ISIF
a small positive number, but optimality is lost. Another well-established way, the so-called reduced-order filter, assumes that only part of the system state is measured directly and is driven by independent sources of noise [1].
For the case the noise-free measurement is due to equality constraints, numerous results and methods are available
[2, 3, 4, 5, 6, 7, 8, 9, 10]. For example, the reparameterization method simply reparameterizes the system model so
that the equality constraints are not required any more. It
has several disadvantages. First, the physical meaning of
the reparameterized system state may vary and be different
at different time instants. Second, the interpretation of the
reparameterized system state is less natural and more difficult [3]. Another popular method of equality constrained estimation is the projection method [3, 5, 6, 8, 9, 10], in which
the unconstrained estimate is projected onto the constraint
subspace. Unfortunately, it has serious problems, as analyzed below. Its main idea is to apply classical constrained
optimization techniques to the constrained estimation problem. But there is a significant difference between classical constrained optimization and constrained estimation and
this difference seems to be totally ignored in most of the existing work. Some other methods, e.g., maximum probability method and mean square method, were also discussed in
[3]. They are not free of the problems and confusion, either.
Also, all existing work processes the noisy measurement
first and then the equality constraint. Is it the only choice
or a good choice? If there are more than one choice, how
should the end user choose among them? Unfortunately,
such questions have not been answered in theory.
As shown later, the state estimation problem with both noisy and noise-free measurements is in essence one with singular measurement noise and is not a big deal in theory itself. What matters is the computational complexity of the solution. So a main focus of this paper is to find computationally efficient ways and analyze their applicability to different scenarios.
The paper is organized as follows. Sec. 2 formulates the problem. Sec. 3 gives some cases with noise-free measurements so as to show that our research in this direction is really useful. Sec. 4 presents the batch BLUE estimator. Sec. 5 presents two equivalent forms of the sequential BLUE estimator. Sec. 6 discusses extension to nonlinear measurements. Sec. 7 gives concluding remarks.
2 Problem formulation
Consider the following generic dynamic system
$$x_k = F_{k-1} x_{k-1} + G_{k-1} w_{k-1} \quad (1)$$
with zero-mean white noise $w_k$, $\mathrm{cov}(w_k) = Q_k \ge 0$, $x_k \in \mathbb{R}^n$, $E[x_0] = \bar{x}_0$, and $\mathrm{cov}(x_0) = P_0$.
Assume that two types of measurements of the system state are available at the same time. The first type is the noisy measurement
$$z_k^{(1)} = H_k^{(1)} x_k + v_k^{(1)} \quad (2)$$
with zero-mean white noise $v_k^{(1)}$, $\mathrm{cov}(v_k^{(1)}) = R_k^{(1)} > 0$, and $z_k^{(1)} \in \mathbb{R}^{m_1}$. $\langle w_k \rangle$, $\langle v_k^{(1)} \rangle$ and $x_0$ are uncorrelated with each other. The assumption $R_k^{(1)} > 0$ is explained in great detail later.
The second type of measurement is noise-free:
$$z_k^{(2)} = H_k^{(2)} x_k \quad (3)$$
where $z_k^{(2)} \in \mathbb{R}^{m_2}$.
One may think that $z_k^{(2)}$ is always random since it is a measurement of the state. As shown later, this is not necessarily the case.
In this paper, given only the first two moments, we try to obtain the optimal state estimate in the sense of BLUE with both noisy and noise-free measurements. That is,
$$\hat{x}_{k|k}^{\mathrm{BLUE}} \triangleq E^*[x_k \,|\, z^k] = \arg\min_{\hat{x}_{k|k} = a_k + B_k Z_k} \mathrm{MSE}(\hat{x}_{k|k})$$
where
$$z^k = \{z_1, \cdots, z_k\}, \quad Z_k = [z_1', \cdots, z_k']'$$
$$z_k = [(z_k^{(1)})', (z_k^{(2)})']' \quad (4)$$
$$\mathrm{MSE}(\hat{x}_{k|k}) = E[(x_k - \hat{x}_{k|k})(x_k - \hat{x}_{k|k})']$$
and $a_k$, $B_k$ do not depend on $Z_k$.
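The BLUE defined above can be computed directly from the first two joint moments of the estimand and the measurement. The following is a minimal NumPy sketch (the function name `blue_estimate` and the variable names are our own, not from the paper); the Moore-Penrose pseudoinverse is used so that a singular measurement covariance is handled gracefully:

```python
import numpy as np

def blue_estimate(x_bar, P_xx, z_bar, P_zz, P_xz, z):
    """BLUE of x given z from first two joint moments:
    x_hat = x_bar + P_xz P_zz^+ (z - z_bar),
    MSE   = P_xx - P_xz P_zz^+ P_xz'.
    np.linalg.pinv handles a singular P_zz (noise-free components)."""
    K = P_xz @ np.linalg.pinv(P_zz)       # BLUE gain
    x_hat = x_bar + K @ (z - z_bar)       # updated estimate
    mse = P_xx - K @ P_xz.T               # corresponding MSE matrix
    return x_hat, mse
```

For a linear measurement $z = Hx + v$ one would plug in $P_{xz} = P H'$ and $P_{zz} = H P H' + R$; the resulting MSE matrix is never larger than the prior covariance.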
3 Noise-free measurements
Before getting into the details of how to obtain the optimal state estimate, let us discuss some cases with both types of measurements to show that our work is not only theoretically meaningful but also useful for applications.
3.1 Linear equality constraints
In this case, a linear equality constraint is placed on the estimand (quantity to be estimated):
$$D x_k = d_k \quad (5)$$
where matrix $D$ and vector $d_k$ are both known. Let
$$z_k^{(2)} = d_k, \quad H_k^{(2)} = D$$
It can be easily seen that the above linear equality constraint is exactly a noise-free measurement. So state estimation with a linear equality constraint is indeed a special case of the problem with both noisy and noise-free measurements.
This case has already been widely studied [2, 3, 4, 5, 6,
7, 8]. Several assumptions were made in the derivation of
their results. For example, except for [4], they all made the Gaussian assumption. Under the Gaussian assumption, the result in [2] was claimed to be optimal in the sense of generalized maximum likelihood (GML), in which the correlation between the pseudo measurement noise and the estimand was totally ignored; the result in [7] and the maximum probability method in [3] were claimed to be optimal in the sense of maximum a posteriori probability (MAP); the mean square method in [3] was claimed to be optimal in the sense of minimum mean-squared error (MMSE). Necessary
conditions need to be satisfied by a constrained linear system, and one way to construct a homogeneously constrained
linear system based on the information of an unconstrained
linear system was introduced in [6]. [5, 6] formulated the
problem with the Gaussian assumption just to be in accordance with the assumptions in the standard Kalman filtering. [8] provided a geometric interpretation of the results in
[3], so the Gaussian assumption was maintained. Since our
optimality criterion is BLUE, we only assume the first two
moments to be known, which is the same as in [4]. Another
common assumption is that D has full row rank, because
when D does not have full row rank, we can make it full
row rank by simply removing linearly dependent rows. But
to obtain the rank of D and find its linearly dependent rows
may not be trivial. In addition, why did they need this assumption? What is the advantage of doing so? This will be analyzed later. Comparatively, we do not have any restrictions on the rank of $H_k^{(2)}$.
There were several common problems in the derivations
of most existing results. First, the objective functions are inconsistent before and after the use of the linear constraints, although some of the results are the same as form 1 of our sequential BLUE filter. Specifically, the objective function before using the linear constraints is the MSE (in the average sense), while afterwards it becomes the fitting error in the least-squares sense, in which the estimate is treated as data. One
hard question in theory is in what sense the final estimate is
optimal. Second, there seems to be a confusion between
constrained Bayesian estimation and traditional constrained
optimization, although it is well known that estimation is
a special type of optimization. Their difference lies in the
way the constraints are treated. In the traditional constrained
optimization, constraints limit the search space to the constraint subspace. In constrained estimation, constraints are
also useful posterior information about the estimand and can
be treated as a special type of measurement, as shown above,
so constraints should also be put in the conditioning. Condi-
2194
tioning is important and critical since our estimand (system
state) is random and what we are estimating is actually the
characteristics of its posterior distribution. Also in both constrained estimation and constrained optimization, we need
to double check whether the final result really satisfies the
constraints.
3.2 Nonlinear equality constraints
In this case, a nonlinear equality constraint is placed on the estimand:
$$c(x_k) = 0$$
where $c(\cdot)$ is some vector-valued nonlinear function. Let
$$z_k^{(2)} = 0, \quad h_k^{(2)}(x_k) = c(x_k)$$
We can easily see that state estimation with a nonlinear equality constraint is indeed a special case of the problem with a nonlinear measurement model (see Sec. 6, "Extension to nonlinear measurements," for details).
This case has also been widely studied [2, 3, 9, 10, 8]. Under the Gaussian assumption and linearization based on a first-order Taylor series expansion (TSE), the result in [2] was claimed to be approximately optimal in the sense of GML. [3] extended their state estimation results with linear equality constraints to the case with nonlinear equality constraints. Under the Gaussian assumption, a second-order TSE was utilized in [10, 8] to get better estimation results. [9] even put constraints on the statistics of the distribution of the estimate, and a two-step projection method was proposed therein. The same problems mentioned in the above subsection exist in most of these results.

3.3 Autocorrelated measurement noise
Suppose that only noisy measurements are available for the dynamic system (1):
$$z_k = H_k x_k + v_k$$
where the measurement noise $v_k$ is autocorrelated instead of white:
$$v_{k+1} = B_k v_k + v_k^w$$
with zero-mean white $v_k^w$, uncorrelated with $\langle w_k \rangle$ and $x_0$, and $\mathrm{cov}(v_k^w) = R_k^w > 0$.
A "brute force" solution to the optimal state estimation problem in this case is as follows. Let
$$x_k^a = [x_k' \; v_k']', \quad F_k^a = \begin{bmatrix} F_k & 0 \\ 0 & B_k \end{bmatrix}$$
$$G_k^a = \begin{bmatrix} G_k & 0 \\ 0 & I \end{bmatrix}, \quad w_k^a = [w_k' \; (v_k^w)']'$$
$$H_k^a = [H_k \;\; I_{m_2 \times m_2}]$$
Then the above dynamic system can be rewritten as
$$x_k^a = F_{k-1}^a x_{k-1}^a + G_{k-1}^a w_{k-1}^a$$
$$z_k = H_k^a x_k^a$$
As can be seen, the original noisy measurement $z_k$ is now noise-free with respect to (w.r.t.) the augmented state $x_k^a$. As such, we only have noise-free measurements. It was argued in, e.g., [11, 12], that due to the increased state dimension and zero measurement noise, numerical problems may arise if we still use the inverse as in standard Kalman filtering, which is undesirable. That is also why the "difference measurement" method is popular for this case. But if the inverse is replaced by the MP inverse in the singular cases, the optimal state estimate can still be obtained based on this augmented noise-free form. Also, one bonus is that we will have the optimal estimate of the measurement noise at the same time.

3.4 Singular measurement noise
Suppose that there are only noisy measurements for the dynamic system (1):
$$z_k = H_k x_k + v_k \quad (6)$$
where $R_k = \mathrm{cov}(v_k) \ge 0$. How do we get the optimal state estimate in this case? One answer is as follows.
If $R_k \ge 0$ is singular, then
$$\mathrm{rank}(R_k) = m_1 < m = \dim(v_k)$$
It then follows from singular value decomposition (SVD) that there must exist a unitary matrix $U_k$ such that
$$U_k R_k U_k' = \begin{bmatrix} \bar{R}_k^{(1)} & 0_{m_1 \times (m-m_1)} \\ 0_{(m-m_1) \times m_1} & 0_{(m-m_1) \times (m-m_1)} \end{bmatrix}$$
where $\bar{R}_k^{(1)}$ is an $m_1 \times m_1$ diagonal matrix and $\bar{R}_k^{(1)} > 0$. That is also why we assume $R_k^{(1)} > 0$ in our problem formulation.
Let $\bar{z}_k = U_k z_k$. It follows from Eq. (6) that
$$\bar{z}_k = U_k H_k x_k + U_k v_k = \bar{H}_k x_k + \bar{v}_k$$
where
$$\bar{H}_k = [(\bar{H}_k^{(1)})', (\bar{H}_k^{(2)})']' = U_k H_k$$
$$\bar{v}_k = [(\bar{v}_k^{(1)})', (\bar{v}_k^{(2)})']' = U_k v_k$$
$$\bar{z}_k = [(\bar{z}_k^{(1)})', (\bar{z}_k^{(2)})']'$$
$$\bar{z}_k^{(1)} \in \mathbb{R}^{m_1}, \; \bar{z}_k^{(2)} \in \mathbb{R}^{m_2}, \; \bar{H}_k^{(1)} \in \mathbb{R}^{m_1 \times n}, \; \bar{H}_k^{(2)} \in \mathbb{R}^{m_2 \times n}$$
$$\bar{v}_k^{(1)} \in \mathbb{R}^{m_1}, \; \bar{v}_k^{(2)} \in \mathbb{R}^{m_2}, \; m_2 = m - m_1$$
$$\mathrm{cov}(\bar{v}_k^{(1)}) = \bar{R}_k^{(1)}, \quad \mathrm{cov}(\bar{v}_k^{(1)}, \bar{v}_k^{(2)}) = 0, \quad \bar{v}_k^{(2)} = 0 \;\; \text{a.s.}$$
Since $U_k$ is a unitary matrix, which is certainly invertible, $\bar{z}_k = U_k z_k$ is sufficient in the sense that the BLUE based on $z_k$ is equivalent to the BLUE based on $\bar{z}_k$. That is, the original noisy-measurement-only equation (6) is equivalent to the following:
$$\bar{z}_k^{(1)} = \bar{H}_k^{(1)} x_k + \bar{v}_k^{(1)}$$
$$\bar{z}_k^{(2)} = \bar{H}_k^{(2)} x_k$$
which is exactly in the form of our formulation.
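The decomposition of Sec. 3.4 can be sketched numerically. Below is an illustrative NumPy implementation (the function name `decorrelate_singular_noise` is ours); since $R_k$ is real symmetric, an eigendecomposition gives the orthogonal (real unitary) $U_k$, and the rows of $U_k$ spanning the null space of $R_k$ yield the exactly noise-free part:

```python
import numpy as np

def decorrelate_singular_noise(H, R, z, tol=1e-10):
    """Split z = Hx + v with singular R = cov(v) into a noisy part
    (positive-definite noise covariance) and an exactly noise-free part,
    via R = U' diag(Rbar1, 0) U as in Sec. 3.4."""
    w, V = np.linalg.eigh(R)            # ascending eigenvalues
    order = np.argsort(w)[::-1]         # positive eigenvalues first
    w, V = w[order], V[:, order]
    U = V.T                             # orthogonal: U R U' = diag(w)
    m1 = int(np.sum(w > tol))           # rank of R
    z_bar, H_bar = U @ z, U @ H
    return (z_bar[:m1], H_bar[:m1], np.diag(w[:m1]),   # noisy part, Rbar1 > 0
            z_bar[m1:], H_bar[m1:])                    # noise-free part
```

Because the null-space rows of $U_k$ annihilate the noise realization itself (not just its covariance), the transformed bottom block satisfies $\bar{z}_k^{(2)} = \bar{H}_k^{(2)} x_k$ exactly.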
4 Batch BLUE
With the augmented notation for $z_k$ in Eq. (4), the stacked measurement equation can be written as
$$z_k = \begin{bmatrix} H_k^{(1)} \\ H_k^{(2)} \end{bmatrix} x_k + \begin{bmatrix} v_k^{(1)} \\ 0_{m_2 \times 1} \end{bmatrix}$$
Define
$$H_k = [(H_k^{(1)})', (H_k^{(2)})']', \quad v_k = [(v_k^{(1)})', 0_{m_2 \times 1}']'$$
Then the equation becomes
$$z_k = H_k x_k + v_k$$
where $z_k \in \mathbb{R}^{m_1+m_2}$ and $v_k$ is zero-mean white noise with $\mathrm{cov}(v_k) = R_k = \mathrm{diag}\{R_k^{(1)}, 0_{m_2 \times m_2}\}$, uncorrelated with $\langle w_k \rangle$ and $x_0$.
Given $\hat{x}_{k-1|k-1} = E^*[x_{k-1}|z^{k-1}]$, $P_{k-1|k-1} = \mathrm{MSE}(\hat{x}_{k-1|k-1})$ and $z_k$, it is well known (see, e.g., [13, 14]) that the batch BLUE estimator of $x_k$ is:
Prediction:
$$\hat{x}_{k|k-1} = E^*[x_k|z^{k-1}] = F_{k-1}\hat{x}_{k-1|k-1}$$
$$P_{k|k-1} = \mathrm{MSE}(\hat{x}_{k|k-1}) = F_{k-1} P_{k-1|k-1} F_{k-1}' + G_{k-1} Q_{k-1} G_{k-1}'$$
Update:
$$\hat{x}_{k|k} = E^*[x_k|z^{k-1}, z_k] = \hat{x}_{k|k-1} + P_{k|k-1} H_k' S_k^+ (z_k - H_k \hat{x}_{k|k-1})$$
$$P_{k|k} = \mathrm{MSE}(\hat{x}_{k|k}) = P_{k|k-1} - P_{k|k-1} H_k' S_k^+ H_k P_{k|k-1}$$
$$S_k = H_k P_{k|k-1} H_k' + R_k$$
Here $A^+$ stands for the unique Moore-Penrose pseudoinverse (MP inverse for short) of matrix $A$.
Compared with standard Kalman filtering, nothing is different except that the inverse, if it does not exist, is replaced by the MP inverse.
Since $R_k \ge 0$ in general, the batch BLUE estimator with both noisy and noise-free measurements needs to calculate the MP inverse $S_k^+$, which is $(m_1+m_2) \times (m_1+m_2)$.
Even if $x_0$, $w_k$ and $v_k^{(1)}$ are jointly Gaussian, $v_k$ will be singular Gaussian in this case. Since the singular Gaussian density is not defined, the MAP estimate of $x_k$ does not exist. This is also one of the reasons why we choose BLUE as our optimality criterion.

5 Sequential BLUE
To reduce the computational complexity of the batch BLUE, it is proved in the following that two optimal sequential forms can be obtained which process the noisy and noise-free measurements sequentially and thus can save computation.

5.1 Form 1
Theorem 1 (Sequential BLUE, form 1). Given $\hat{x}_{k-1|k-1} = E^*[x_{k-1}|z^{k-1}]$, $P_{k-1|k-1} = \mathrm{MSE}(\hat{x}_{k-1|k-1})$ and $z_k$, one form of the sequential BLUE estimate of $x_k$ is:
Prediction: same as in the batch BLUE.
Update by the noisy measurement:
$$\hat{x}_{k|k}^{(1)} = \hat{x}_{k|k-1} + P_{k|k-1} (H_k^{(1)})' (S_k^{(1)})^{-1} (z_k^{(1)} - H_k^{(1)} \hat{x}_{k|k-1}) \quad (7)$$
$$P_{k|k}^{(1)} = P_{k|k-1} - P_{k|k-1} (H_k^{(1)})' (S_k^{(1)})^{-1} H_k^{(1)} P_{k|k-1} \quad (8)$$
$$S_k^{(1)} = H_k^{(1)} P_{k|k-1} (H_k^{(1)})' + R_k^{(1)}$$
Update by the noise-free measurement:
$$\hat{x}_{k|k} = \hat{x}_{k|k}^{(1)} + P_{k|k}^{(1)} (H_k^{(2)})' (S_k^{(2)})^+ (z_k^{(2)} - H_k^{(2)} \hat{x}_{k|k}^{(1)}) \quad (9)$$
$$P_{k|k} = P_{k|k}^{(1)} - P_{k|k}^{(1)} (H_k^{(2)})' (S_k^{(2)})^+ H_k^{(2)} P_{k|k}^{(1)} \quad (10)$$
$$S_k^{(2)} = H_k^{(2)} P_{k|k}^{(1)} (H_k^{(2)})' \quad (11)$$
Proof: Given $z_k^{(1)}$, the updated BLUE estimate (7)-(8) follows directly from $\hat{x}_{k|k}^{(1)} = E^*[x_k|z^{k-1}, z_k^{(1)}]$ and $P_{k|k}^{(1)} = \mathrm{MSE}(\hat{x}_{k|k}^{(1)})$.
Since the BLUE estimator $E^*[x_k|z^{k-1}, z_k]$ always has the quasi-recursive form [14],
$$\hat{x}_{k|k} = E^*[x_k|z^{k-1}, z_k] = E^*[x_k|z^{k-1}, z_k^{(1)}, z_k^{(2)}] = \hat{x}_{k|k}^{(1)} + C_{1,2} C_{\tilde{z}_{2|1}^*}^+ \tilde{z}_{2|1}^*$$
where
$$\tilde{z}_{2|1}^* = z_k^{(2)} - E^*[z_k^{(2)}|z^{k-1}, z_k^{(1)}] = z_k^{(2)} - H_k^{(2)} \hat{x}_{k|k}^{(1)} = H_k^{(2)} (x_k - \hat{x}_{k|k}^{(1)})$$
and thus
$$C_{\tilde{z}_{2|1}^*} = \mathrm{cov}(\tilde{z}_{2|1}^*) = H_k^{(2)} P_{k|k}^{(1)} (H_k^{(2)})' \triangleq S_k^{(2)}$$
$$C_{1,2} = \mathrm{cov}(\tilde{x}_{k|k}^{(1)}, \tilde{z}_{2|1}^*) = P_{k|k}^{(1)} (H_k^{(2)})'$$
Thus, (9) follows. Also,
$$P_{k|k} = \mathrm{MSE}(\hat{x}_{k|k}) = P_{k|k}^{(1)} - C_{1,2} C_{\tilde{z}_{2|1}^*}^+ C_{1,2}' = P_{k|k}^{(1)} - P_{k|k}^{(1)} (H_k^{(2)})' (S_k^{(2)})^+ H_k^{(2)} P_{k|k}^{(1)}$$
Remark: This sequential BLUE estimator needs to calculate an $m_1 \times m_1$ inverse $(S_k^{(1)})^{-1}$ and an $m_2 \times m_2$ MP inverse $(S_k^{(2)})^+$, respectively. This is surely less demanding than computing the $(m_1+m_2) \times (m_1+m_2)$ MP inverse $S_k^+$ in the batch BLUE estimator.
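The equivalence of the batch update and the sequential form 1 update can be checked numerically. Below is an illustrative NumPy sketch (function names `batch_update` and `sequential_form1` are ours) for a single update step, with the noise-free measurement matrix deliberately rank-deficient to exercise the MP inverse:

```python
import numpy as np
from numpy.linalg import pinv, inv

def batch_update(xp, P, H1, R1, H2, z1, z2):
    """Batch BLUE update: stack the noisy and noise-free measurements and
    use the MP inverse of the (singular) innovation covariance S."""
    m1, m2 = H1.shape[0], H2.shape[0]
    H = np.vstack([H1, H2])
    z = np.concatenate([z1, z2])
    R = np.block([[R1, np.zeros((m1, m2))],
                  [np.zeros((m2, m1)), np.zeros((m2, m2))]])
    S = H @ P @ H.T + R
    K = P @ H.T @ pinv(S)
    return xp + K @ (z - H @ xp), P - K @ H @ P

def sequential_form1(xp, P, H1, R1, H2, z1, z2):
    """Theorem 1: noisy update with an ordinary m1 x m1 inverse, then
    noise-free update with an m2 x m2 MP inverse."""
    S1 = H1 @ P @ H1.T + R1
    K1 = P @ H1.T @ inv(S1)
    x1 = xp + K1 @ (z1 - H1 @ xp)
    P1 = P - K1 @ H1 @ P
    S2 = H2 @ P1 @ H2.T               # no noise term: S2 may be singular
    K2 = P1 @ H2.T @ pinv(S2)
    return x1 + K2 @ (z2 - H2 @ x1), P1 - K2 @ H2 @ P1
```

For measurements generated from the model, both functions return the same estimate and MSE matrix, and the resulting estimate satisfies the noise-free measurement exactly.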
Remark: If $P_{k|k}^{(1)} > 0$ and $H_k^{(2)}$ has full row rank, then $(S_k^{(2)})^+$ can be replaced by $(S_k^{(2)})^{-1}$ and this sequential BLUE will be exactly the same as the one in [3] for the linear equality constrained state estimation problem. But there is still a significant difference. This sequential BLUE is optimal with the constraints also in the conditioning and is free of the problems of the existing results discussed above.
Remark: If $H_k^{(2)}$ has full row rank, a nice equivalent weighted-average form of the BLUE estimate of $x_k$ was presented in [4]:
$$\hat{x}_{k|k} = (I - J_k)\hat{x}_{k|k}^{(1)} + J_k (H_k^{(2)})^+ z_k^{(2)} \quad (12)$$
where
$$J_k = P_{k|k}^{(1)} (H_k^{(2)})' (S_k^{(2)})^+ H_k^{(2)}$$
This can also be easily proved from our sequential BLUE since $H_k^{(2)} (H_k^{(2)})^+ = I$. This weighted-average form does provide a better understanding of the mathematical meaning behind $\hat{x}_{k|k}$, which is a weighted average of $\hat{x}_{k|k}^{(1)}$ and $(H_k^{(2)})^+ z_k^{(2)}$ (a particular solution to the linear equality constraint equation in (5)). But computationally, this form is not preferred for two reasons. First, the BLUE estimator (12) requires two MP inverses, one being $(H_k^{(2)})^+ = (H_k^{(2)})' (H_k^{(2)} (H_k^{(2)})')^{-1}$ and the other $(S_k^{(2)})^+$. Our sequential BLUE only needs one MP inverse, $(S_k^{(2)})^+$. Second, the BLUE estimator (12) is valid only when $H_k^{(2)}$ has full row rank, but our sequential BLUE does not have this limitation.
Remark: If the noise-free measurement is due to some linear equality constraint (5), this form of the sequential BLUE estimator is really contrary to the convention of processing the constraints later.
5.2 Form 2
Theorem 2 (Sequential BLUE, form 2). Given $\hat{x}_{k-1|k-1} = E^*[x_{k-1}|z^{k-1}]$, $P_{k-1|k-1} = \mathrm{MSE}(\hat{x}_{k-1|k-1})$ and $z_k$, an alternative form of the sequential BLUE estimator of $x_k$ is:
Prediction: same as in the batch BLUE.
Update by the noise-free measurement:
$$\hat{x}_{k|k}^{(2)} = \hat{x}_{k|k-1} + P_{k|k-1} (H_k^{(2)})' (S_k^{(2)})^+ (z_k^{(2)} - H_k^{(2)} \hat{x}_{k|k-1})$$
$$P_{k|k}^{(2)} = P_{k|k-1} - P_{k|k-1} (H_k^{(2)})' (S_k^{(2)})^+ H_k^{(2)} P_{k|k-1}$$
$$S_k^{(2)} = H_k^{(2)} P_{k|k-1} (H_k^{(2)})'$$
Update by the noisy measurement:
$$\hat{x}_{k|k} = \hat{x}_{k|k}^{(2)} + P_{k|k}^{(2)} (H_k^{(1)})' (S_k^{(1)})^{-1} (z_k^{(1)} - H_k^{(1)} \hat{x}_{k|k}^{(2)})$$
$$P_{k|k} = P_{k|k}^{(2)} - P_{k|k}^{(2)} (H_k^{(1)})' (S_k^{(1)})^{-1} H_k^{(1)} P_{k|k}^{(2)}$$
$$S_k^{(1)} = H_k^{(1)} P_{k|k}^{(2)} (H_k^{(1)})' + R_k^{(1)}$$
Proof: Parallel to that of Theorem 1.
Remark: Form 1 and form 2 of the sequential BLUE filter have the same computational complexity.
Remark: If $H_k^{(2)}$ has full row rank, then a nice equivalent weighted-average form of $\hat{x}_{k|k}^{(2)}$ is
$$\hat{x}_{k|k}^{(2)} = (I - L_k)\hat{x}_{k|k-1} + L_k (H_k^{(2)})^+ z_k^{(2)}$$
where
$$L_k = P_{k|k-1} (H_k^{(2)})' (S_k^{(2)})^+ H_k^{(2)}$$
which can be easily proved from the sequential BLUE (form 2) using $H_k^{(2)} (H_k^{(2)})^+ = I$. The same remarks as for form 1 can be made here.
Remark: Since both forms of the sequential BLUE estimator are equivalent to the batch BLUE estimator, it follows clearly that the sequential BLUE estimator does not depend on the order in which $z_k^{(1)}$ and $z_k^{(2)}$ are processed. And the two forms of the sequential BLUE filter have exactly the same performance.
If the noise-free measurement is from a linear equality constraint as in Eq. (5), and the BLUE estimator is initialized by $\bar{x}_0$ and $P_0$, the following theorem shows that the update by the noise-free measurement in both forms of the sequential BLUE can simply be skipped without performance loss. That is, both forms of the sequential BLUE reduce to one single form which only has the update by the noisy measurement.
Theorem 3. If the noise-free measurement is from a linear equality constraint as in Eq. (5), and $\hat{x}_{0|0} = \bar{x}_0$, $P_{0|0} = P_0$, then for sequential BLUE form 1, we have
$$\hat{x}_{k|k} = \hat{x}_{k|k}^{(1)}, \quad P_{k|k} = P_{k|k}^{(1)}$$
and for sequential BLUE form 2, we have
$$\hat{x}_{k|k}^{(2)} = \hat{x}_{k|k-1}, \quad P_{k|k}^{(2)} = P_{k|k-1}$$
Proof: If the noise-free measurement $z_k^{(2)}$ is from a linear equality constraint as in Eq. (5), then it is deterministically known. If $\hat{x}_{0|0} = \bar{x}_0$, $P_{0|0} = P_0$, it then follows that in sequential BLUE form 1
$$z_k^{(2)} = E^*[z_k^{(2)}|z^{k-1}, z_k^{(1)}] = H_k^{(2)} \hat{x}_{k|k}^{(1)}$$
and in sequential BLUE form 2
$$z_k^{(2)} = E^*[z_k^{(2)}|z^{k-1}] = H_k^{(2)} \hat{x}_{k|k-1}$$
The theorem can then be easily shown.
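The claimed order independence of the two sequential forms, and the fact that both estimates satisfy the noise-free measurement, can be illustrated with a short NumPy sketch (function names `form1` and `form2` are ours) comparing one update cycle of each form on the same data:

```python
import numpy as np
from numpy.linalg import pinv, inv

def form1(xp, P, H1, R1, H2, z1, z2):
    # noisy measurement first (Theorem 1)
    S1 = H1 @ P @ H1.T + R1
    K1 = P @ H1.T @ inv(S1)
    x1, P1 = xp + K1 @ (z1 - H1 @ xp), P - K1 @ H1 @ P
    S2 = H2 @ P1 @ H2.T
    K2 = P1 @ H2.T @ pinv(S2)
    return x1 + K2 @ (z2 - H2 @ x1), P1 - K2 @ H2 @ P1

def form2(xp, P, H1, R1, H2, z1, z2):
    # noise-free measurement first (Theorem 2)
    S2 = H2 @ P @ H2.T
    K2 = P @ H2.T @ pinv(S2)
    x2, P2 = xp + K2 @ (z2 - H2 @ xp), P - K2 @ H2 @ P
    S1 = H1 @ P2 @ H1.T + R1
    K1 = P2 @ H1.T @ inv(S1)
    return x2 + K1 @ (z1 - H1 @ x2), P2 - K1 @ H1 @ P2
```

Running both on a constraint matrix with redundant rows (so that the MP inverse is genuinely needed) yields the same estimate and MSE matrix, and the estimate reproduces the noise-free measurement exactly.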
Remark: It should be noted that the conclusion in Theorem 3 is obtained under the assumption that the constrained BLUE estimates can be obtained optimally. In practical applications, this optimality cannot be guaranteed due to the presence of model mismatch (e.g., in target tracking) or nonlinearity in the system model, so the update by the noise-free measurement should still be kept in order to improve estimation performance. Note also that if the noise-free measurement is random, the conclusion in Theorem 3 does not hold in general. Due to space limitations, numerical examples to show these points are not provided.
If the noise-free measurement is from a linear equality constraint as in Eq. (5), then we need to double check whether the estimate really satisfies the constraints. The following theorem shows that all estimates really do so.
Theorem 4. For estimates from all forms, we have
$$H_k^{(2)} \hat{x}_{k|k} = z_k^{(2)}, \quad H_k^{(2)} \hat{x}_{k|k}^{(2)} = z_k^{(2)}$$
$$H_k^{(2)} \hat{x}_{k|k}^{(1)} = z_k^{(2)}, \quad H_k^{(2)} \hat{x}_{k|k-1} = z_k^{(2)}$$
Proof: Since the batch BLUE and the two forms of the sequential BLUE are equivalent, only form 1 of the sequential BLUE is used to show that $H_k^{(2)} \hat{x}_{k|k} = z_k^{(2)}$, for simplicity.
It follows from Eq. (9) that
$$z_k^{(2)} - H_k^{(2)} \hat{x}_{k|k} = (I - S_k^{(2)} (S_k^{(2)})^+) H_k^{(2)} (x_k - \hat{x}_{k|k}^{(1)})$$
and thus, by the unbiasedness,
$$E[z_k^{(2)} - H_k^{(2)} \hat{x}_{k|k}] = 0$$
$$\mathrm{cov}(z_k^{(2)} - H_k^{(2)} \hat{x}_{k|k}) = (I - S_k^{(2)} (S_k^{(2)})^+) S_k^{(2)} (I - (S_k^{(2)})^+ S_k^{(2)}) = 0$$
So we can see that $z_k^{(2)} = H_k^{(2)} \hat{x}_{k|k}$ almost surely. $H_k^{(2)} \hat{x}_{k|k}^{(2)} = z_k^{(2)}$ can be proved similarly. $H_k^{(2)} \hat{x}_{k|k}^{(1)} = z_k^{(2)}$ and $H_k^{(2)} \hat{x}_{k|k-1} = z_k^{(2)}$ then follow from Theorem 3.
Remark: $H_k^{(2)} \hat{x}_{k|k} = z_k^{(2)}$ and $H_k^{(2)} \hat{x}_{k|k}^{(2)} = z_k^{(2)}$ actually hold for all noise-free measurement cases, e.g., when $z_k^{(2)}$ is random (i.e., when the noise-free measurement is not from a linear equality constraint as in Eq. (5)).
Since $\hat{x}_{k|k}$ is the BLUE estimate of $x_k$ given $z^k$ and $P_{k|k}$ is the corresponding MSE matrix, they have the following properties:
• $\hat{x}_{k|k}$ is an unbiased estimate of $x_k$: $E[x_k - \hat{x}_{k|k}] = 0$.
• $P_{k|k} \le P_{k|k}^{(1)} = \mathrm{MSE}(\hat{x}_{k|k}^{(1)}) \le P_{k|k-1}$ since $\hat{x}_{k|k}^{(1)} = E^*[x_k|z^{k-1}, z_k^{(1)}]$.
• $P_{k|k} \le P_{k|k}^{(2)} = \mathrm{MSE}(\hat{x}_{k|k}^{(2)}) \le P_{k|k-1}$ since $\hat{x}_{k|k}^{(2)} = E^*[x_k|z^{k-1}, z_k^{(2)}]$.
• If $x_0$, $w_k$ and $v_k^{(1)}$ are jointly Gaussian distributed, $\hat{x}_{k|k}$ and $P_{k|k}$ are also optimal in the sense of MMSE.

6 Extension to nonlinear measurements
Although the batch BLUE estimator is optimal, it is not preferred due to its heavier computational burden. The batch form processes everything in one shot, which is conceptually much clearer. It is the same as standard Kalman filtering except that the inverse is replaced by the MP inverse. If the computational burden is really an issue, the sequential forms are preferred.
Both forms of the sequential BLUE estimator are equivalent to the batch BLUE estimator, and they have the same performance and computational complexity. Since form 1 is already simple enough, what is the specific reason or advantage to derive and adopt form 2? What is the benefit of having these two forms? How should the end user choose between them? There should be no preference between the two forms if only linear measurements are involved. But if one or both of $z_k^{(1)}$ and $z_k^{(2)}$ are nonlinear, there is a preference. That is, with only linear measurements, the performance is independent of the order in which $z_k^{(1)}$ and $z_k^{(2)}$ are processed; with nonlinear measurements, the order really matters.
As was demonstrated in [15, 16], for nonlinear filtering, sequential processing can not only lead to smaller computational complexity but also improve the performance (accuracy). This is because when the measurement with lower nonlinearity is processed first, the measurement with higher nonlinearity processed later will have a better reference point for function-approximation based nonlinear filtering techniques [17], e.g., extended Kalman filtering (EKF) and DD2 [18], and better sigma points or quadrature points for moment-approximation based nonlinear filtering techniques [17], e.g., unscented filtering [19] and the Gauss-Hermite filter [20]. This will be used as our guideline for sequential processing in nonlinear filtering.

6.1 Nonlinear noisy measurements
In this case, the noisy measurement is nonlinear,
$$z_k^{(1)} = h_k^{(1)}(x_k, v_k^{(1)}) \quad (13)$$
while the noise-free measurement is still linear.
Since the noise-free measurement is still linear, we should use it first for the update. The update $\hat{x}_{k|k}^{(2)}$ and $P_{k|k}^{(2)}$ by the noise-free measurement right after the prediction is optimal. Then the update by the noisy measurement should have better accuracy than if the noise-free measurement were used after the noisy one.
For one cycle of nonlinear filtering, the prediction and the update by the noise-free measurement are exactly the same as in Theorem 2, and the update by the noisy measurement can be obtained as follows:
$$\hat{x}_{k|k} = \hat{x}_{k|k}^{(2)} + C_{2,1} C_{\tilde{z}_{1|2}^*}^+ \tilde{z}_{1|2}^* \quad (14)$$
$$P_{k|k} = P_{k|k}^{(2)} - C_{2,1} C_{\tilde{z}_{1|2}^*}^+ C_{2,1}' \quad (15)$$
$$\tilde{z}_{1|2}^* = z_k^{(1)} - E^*[z_k^{(1)}|z^{k-1}, z_k^{(2)}] \quad (16)$$
$$C_{\tilde{z}_{1|2}^*} = \mathrm{cov}(\tilde{z}_{1|2}^*) \quad (17)$$
$$C_{2,1} = \mathrm{cov}(\tilde{x}_{k|k}^{(2)}, \tilde{z}_{1|2}^*) \quad (18)$$
6.2 Nonlinear noise-free measurements
In this case, the noisy measurement is still linear while the noise-free measurement is nonlinear:
$$z_k^{(2)} = h_k^{(2)}(x_k) \quad (19)$$
Since the noisy measurement is still linear, we should use it first for the update. The update $\hat{x}_{k|k}^{(1)}$ and $P_{k|k}^{(1)}$ by the linear noisy measurement right after the prediction is optimal. Then the update by the noise-free measurement should have better accuracy than if the noisy measurement were used after the noise-free one.
For one cycle of nonlinear filtering, the prediction and the update by the noisy measurement are exactly the same as in Theorem 1, and the update by the noise-free measurement can be obtained as follows:
$$\hat{x}_{k|k} = \hat{x}_{k|k}^{(1)} + C_{1,2} C_{\tilde{z}_{2|1}^*}^+ \tilde{z}_{2|1}^*$$
$$P_{k|k} = P_{k|k}^{(1)} - C_{1,2} C_{\tilde{z}_{2|1}^*}^+ C_{1,2}'$$
$$\tilde{z}_{2|1}^* = z_k^{(2)} - E^*[z_k^{(2)}|z^{k-1}, z_k^{(1)}]$$
$$C_{\tilde{z}_{2|1}^*} = \mathrm{cov}(\tilde{z}_{2|1}^*), \quad C_{1,2} = \mathrm{cov}(\tilde{x}_{k|k}^{(1)}, \tilde{z}_{2|1}^*)$$
Remark: If the noise-free measurement is from a nonlinear equality constraint and the BLUE estimator is initialized by $\hat{x}_{0|0} = \bar{x}_0$, $P_{0|0} = P_0$, it can be easily shown as in Theorem 3 that the update by the noise-free measurement can be skipped.

6.3 Nonlinear noisy and noise-free measurements
In this case, both the noisy measurement (13) and the noise-free measurement (19) are nonlinear.
We first need to measure the nonlinearity of $h_k^{(i)}(\cdot)$ and $h_k^{(j)}(\cdot)$, $i, j = 1, 2$, $i \ne j$. If the nonlinearity of $h_k^{(j)}(\cdot)$ is lower than that of $h_k^{(i)}(\cdot)$, we can use the following general equations to obtain the sequential BLUE of the system state.
For one cycle of nonlinear filtering, the prediction step is exactly the same as in Theorems 1 or 2, and the update by $z_k^{(j)}$ can be obtained as:
$$\hat{x}_{k|k}^{(j)} = E^*[x_k|z^{k-1}, z_k^{(j)}] = \hat{x}_{k|k-1} + C_{\tilde{x}_{k|k-1}\tilde{z}_j^*} C_{\tilde{z}_j^*}^+ \tilde{z}_j^*$$
$$P_{k|k}^{(j)} = P_{k|k-1} - C_{\tilde{x}_{k|k-1}\tilde{z}_j^*} C_{\tilde{z}_j^*}^+ C_{\tilde{x}_{k|k-1}\tilde{z}_j^*}'$$
$$\tilde{z}_j^* = z_k^{(j)} - E^*[z_k^{(j)}|z^{k-1}]$$
$$C_{\tilde{z}_j^*} = \mathrm{cov}(\tilde{z}_j^*), \quad C_{\tilde{x}_{k|k-1}\tilde{z}_j^*} = \mathrm{cov}(\tilde{x}_{k|k-1}, \tilde{z}_j^*)$$
The update by $z_k^{(i)}$ can then be obtained as in (14) through (18).
Remark: Due to the nonlinearity of $z_k^{(i)}$ or $z_k^{(j)}$ or both, we do not have elegant analytical forms for $E^*[z_k^{(j)}|z^{k-1}]$, $E^*[z_k^{(i)}|z^{k-1}, z_k^{(j)}]$, $C_{\tilde{z}_j^*}$, $C_{\tilde{z}_{i|j}^*}$, $C_{\tilde{x}_{k|k-1}\tilde{z}_j^*}$ and $C_{j,i}$ in general, but they can be approximated by the EKF, UF, DD2, GHF, and even from the original definition of BLUE as in [21].

7 Conclusions
This paper targets the state estimation problem with both noisy and noise-free measurements. This more general framework has numerous real supports, e.g., state estimation problems under linear or nonlinear equality constraints, or with autocorrelated or singular measurement noise. Although state estimation with both noisy and noise-free measurements is not a big deal in theory itself, computationally efficient ways should be preferred. So two sequential forms of the BLUE estimator, which are equivalent to the batch BLUE estimator, have been proposed in this paper. How to extend the results to the nonlinear measurement case and how to choose between the two sequential forms have also been discussed.

References
[1] D. Haessig and B. Friedland, “Separate-bias estimation with reduced-order Kalman filters,” IEEE Transactions on Automatic Control, vol. 43, no. 7, pp. 983–987, July 1998.
[2] L. S. Wang, Y. T. Chiang, and F. R. Chang, “Filtering methods for nonlinear systems with constraints,” IEE Proceedings
Control Theory and Applications, vol. 149, no. 6, pp. 525–
531, November 2002.
[3] D. Simon and T. L. Chia, “Kalman filtering with state equality constraints,” IEEE Transactions on Aerospace and Electronic Systems, vol. 38, no. 1, pp. 128–136, January 2002.
[4] J. Zhou and Y. M. Zhu, “The linear minimum mean-square
error estimation with constraints and its applications,” in
Proceedings of 2006 International Conference on Computational Intelligence and Security, Guangzhou, China, November 2006, pp. 1801–1804.
[5] N. Gupta, “Kalman filtering in the presence of state space
equality constraints,” in Proceedings of the 26th Chinese
Control Conference, Zhangjiajie, Hunan, China, July 2007,
pp. 107–113.
[6] S. Ko and R. R. Bitmead, “State estimation for linear systems
with state equality constraints,” Automatica, vol. 43, no. 8,
pp. 1363–1368, August 2007.
[7] B. O. S. Teixeira, J. Chandrasekar, L. A. B. Torres, L. A.
Aguirre, and D. S. Bernstein, “State estimation for equality-constrained linear systems,” in Proceedings of the 46th IEEE
Conference on Decision and Control, New Orleans, LA,
USA, December 2007, pp. 6220–6225.
[8] C. Yang and E. Blasch, “Fusion of tracks with road constraints,” Journal of Advances in Information Fusion, vol. 3,
no. 1, pp. 14–32, June 2008.
[9] S. J. Julier and J. J. LaViola, “On Kalman filtering with
nonlinear equality constraints,” IEEE Transactions on Signal
Processing, vol. 55, no. 6, pp. 2774–2784, June 2007.
[10] C. Yang and E. Blasch, “Kalman filtering with nonlinear state
constraints,” IEEE Transactions on Aerospace and Electronic
Systems, vol. 45, no. 1, pp. 70–84, January 2009.
[11] X. R. Li, Applied Estimation and Filtering. Course Notes, University of New Orleans, February 2006.
[12] Y. Bar-Shalom, X. R. Li, and T. Kirubarajan, Estimation with
Applications to Tracking and Navigation: Theory Algorithms
and Software. Wiley Interscience, 2001.
[13] X. R. Li, Y. M. Zhu, J. Wang, and C. Z. Han, “Optimal linear
estimation fusion - part I: Unified fusion rules,” IEEE Transactions on Information Theory, vol. 49, no. 9, pp. 2192–2208,
September 2003.
[14] X. R. Li, “Recursibility and optimal linear estimation and filtering,” in Proceedings of the 43rd IEEE Conference on Decision and Control, Atlantis, Paradise Island, Bahamas, December 2004, pp. 1761–1766.
[15] Z. S. Duan, C. Z. Han, and X. R. Li, “Sequential nonlinear
tracking filter with range-rate measurements in spherical coordinates,” in Proceedings of the 7th International Conference on Information Fusion, vol. 1, Stockholm, Sweden, June
2004, pp. 599–605.
[16] Z. S. Duan, X. R. Li, C. Z. Han, and H. Y. Zhu, “Sequential
unscented Kalman filter for radar target tracking with range
rate measurements,” in Proceedings of the 8th International
Conference on Information Fusion, vol. 1, Philadelphia, PA,
USA, July 2005, pp. 130–137.
[17] X. R. Li and V. P. Jilkov, “A survey of maneuvering target
tracking - approximation techniques for nonlinear filtering,”
in Proceedings of 2004 SPIE Conference on Signal and Data
Processing of Small Targets, vol. 5428, San Diego, CA, USA,
April 2004, pp. 537–550.
[18] M. Norgaard, N. K. Poulsen, and O. Ravn, “New developments in state estimation for nonlinear systems,” Automatica,
vol. 36, no. 11, pp. 1627–1638, November 2000.
[19] S. Julier, J. Uhlmann, and H. F. Durrant-Whyte, “A new
method for nonlinear transformation of means and covariances in filters and estimators,” IEEE Transactions on Automatic Control, vol. 45, no. 3, pp. 477–482, March 2000.
[20] K. Ito and K. Q. Xiong, “Gaussian filters for nonlinear filtering problems,” IEEE Transactions on Automatic Control,
vol. 45, no. 5, pp. 910–927, May 2000.
[21] Z. L. Zhao, X. R. Li, and V. P. Jilkov, “Best linear unbiased filtering with nonlinear measurements for target tracking,” IEEE Transactions on Aerospace and Electronic Systems, vol. 40, no. 4, pp. 1324–1336, October 2004.