
Determining the initial states in forward-backward filtering
Fredrik Gustafsson
Department of Electrical Engineering, Linköping University, S-58183 Linköping, Sweden
Submitted to IEEE Trans. on Signal Processing
Abstract
Forward-backward filtering is a common tool in off-line filtering. Filtering first forwards and then backwards, or the other way around, does not in general give the same result. Here we propose a method to choose the initial state of the filter so as to eliminate this discrepancy. The objectives are to obtain uniqueness and to remove transients at both ends.
1 Introduction
In batchwise signal processing one has the possibility not only to filter a signal forwards in time as usual, but also to apply a filter backwards on the filtered signal. Forward-backward filtering is needed, for example, in the following applications:

- Zero-phase filtering using IIR filters.
- Implementation of non-causal Wiener filters or other filters with poles both inside and outside the unit circle.
The first trick to obtain zero-phase filters is used in many software packages, see for instance the function filtfilt in Matlab [2], although it is not much discussed in standard text books. The design of non-causal Wiener filters leads to a filter with one stable part and one unstable part, as discussed in for instance [1]. The unstable part must be implemented using backward filtering. In both cases, there might be a problem with transients when the filters are applied to finite signals.
A filter $G(q)$ becomes all-pass, or zero phase, when it is applied first forwards, $y_f(t) = G(q)u(t)$, and then backwards on the reversed filtered signal, $y_{fb}^R(t) = G(q)y_f^R(t)$, after which the output is reversed again. The superindex $R$ stands for reversed sequences, the indexes $f$ and $b$ stand for forward and backward, and $q$ denotes the shift operator, $qu(t) = u(t+1)$. We can also first filter backwards and then forwards and obtain $y_{bf}(t)$. In both cases, the total effect is a zero-phase filter with transfer function $|G(e^{i\omega})|^2$. However, in general $y_{fb}(t) \neq y_{bf}(t)$ due to unknown initial conditions, and this is not satisfactory from a logical point of view. Besides, the FB (forward-backward) filtered sequence has visible transients at the end and the BF (backward-forward) filtered sequence at the beginning.
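As a small illustration (not from the paper; the filter and the signal length below are arbitrary example choices), the discrepancy is easy to reproduce with standard filtering routines:

% Illustration only: forward-backward vs backward-forward filtering with
% zero initial states; butter requires the Signal Processing Toolbox.
[b, a] = butter(4, 0.1);                 % an arbitrary low-pass IIR filter
u = randn(100, 1);                       % a finite input signal

yf  = filter(b, a, u);                   % forward pass
yfb = flipud(filter(b, a, flipud(yf)));  % backward pass, output re-reversed

yb  = flipud(filter(b, a, flipud(u)));   % backward pass first
ybf = filter(b, a, yb);                  % then the forward pass

norm(yfb - ybf)                          % non-zero: the two orderings differ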
One way to obtain symmetry is to implement a filter with poles both inside and outside the unit circle by using the partial fractions of the stable poles and the unstable poles, respectively. This is the standard way in non-causal Wiener filtering. The stable filter is applied forwards on the input and the unstable filter backwards, and the filter output is the sum of these terms. Since both filters are applied to the input, the result is of course independent of the order of the computations. However, there is still a problem with transients, this time at both ends, due to unknown initial conditions.
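A minimal sketch of this scheme, assuming the stable and unstable partial fractions have already been computed (the coefficient vectors bs, as, bu, au are hypothetical placeholders, not from the paper):

ys = filter(bs, as, u);                   % stable part, run forwards
yu = flipud(filter(bu, au, flipud(u)));   % unstable part, run backwards
y  = ys + yu;                             % output is the sum of the two terms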
We will here determine the initial conditions that make $y_{bf}(t) = y_{fb}(t)$ and at the same time remove the transients. That is, the initial condition of one filter is chosen to match the end of the other filter. Instead of solving the system of equations arising from the condition $y_{bf}(t) = y_{fb}(t)$, we will work in a least squares framework, which immediately gives a feasible implementation. The method is illustrated on filtered white noise. The improvement for short data records or long filter impulse responses is obvious.
2 Determining the initial state
A linear filter can always be expressed as
$$Y = \mathcal{H} U + \mathcal{O} x_0,$$
where $Y = (y_0, y_1, \dots, y_{N-1})^T$ is the vector of outputs, $U = (u_0, u_1, \dots, u_{N-1})^T$ is the vector of inputs to the filter and $x_0$ is its initial state. $\mathcal{H}$ is a Toeplitz matrix of impulse response coefficients
$$\mathcal{H} = \begin{pmatrix} h_0 & 0 & \cdots & 0 \\ h_1 & h_0 & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ h_{N-1} & \cdots & h_1 & h_0 \end{pmatrix} \qquad (1)$$
where $N$ is the number of inputs, and $\mathcal{O}$ is an "observability" matrix. If the filter is written in state space form
$$x(t+1) = Fx(t) + Gu(t)$$
$$y(t) = Hx(t) + Du(t),$$
then
$$\mathcal{H} = \begin{pmatrix} D & 0 & \cdots & 0 \\ HG & D & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ HF^{N-2}G & \cdots & HG & D \end{pmatrix} \qquad (2)$$

$$\mathcal{O} = \begin{pmatrix} H \\ HF \\ \vdots \\ HF^{N-1} \end{pmatrix} \qquad (3)$$
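For illustration, the two matrices can be formed directly from a state space realization as in the following sketch (a hypothetical helper, intended only for small $N$; Hmat and Omat stand for $\mathcal{H}$ and $\mathcal{O}$):

function [Hmat, Omat] = filter_matrices(F, G, H, D, N)
% Builds the N-by-N Toeplitz matrix (1)-(2) and the N-by-n
% "observability" matrix (3) for a SISO state space model (F, G, H, D).
n = size(F, 1);
Omat = zeros(N, n);
x = eye(n);
for t = 1:N
    Omat(t, :) = H * x;                  % row t equals H*F^(t-1)
    x = x * F;
end
h = [D; Omat(1:N-1, :) * G];             % impulse response h_0, ..., h_{N-1}
Hmat = toeplitz(h, [h(1), zeros(1, N-1)]);  % lower triangular Toeplitz
end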
We introduce the row and column reversing operators, denoted by superscripts $R$ and $C$, with the following easily checked properties:
$$A = BC \;\Rightarrow\; A^R = B^R C, \qquad A^C = B C^C,$$
$$B C^R = B^C C, \qquad (B C^{RC})^{RC} = B^{RC} C,$$
$$A^{RC} = A^T \quad \text{if } A \text{ is Toeplitz.}$$
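These identities are easy to verify numerically, with flipud and fliplr playing the roles of $R$ and $C$ (a quick sanity check, not from the paper; all norms should be of the order of machine precision):

B = randn(4, 3);  C = randn(3, 5);        % arbitrary test matrices
norm(flipud(B*C) - flipud(B)*C)           % A^R = B^R C
norm(fliplr(B*C) - B*fliplr(C))           % A^C = B C^C
norm(B*flipud(C) - fliplr(B)*C)           % B C^R = B^C C
T = toeplitz(1:4, [1 5 6 7]);             % a non-symmetric Toeplitz matrix
norm(fliplr(flipud(T)) - T')              % A^RC = A^T for Toeplitz A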
The forward and backward filters are characterized by the pairs $(\mathcal{H}_f, \mathcal{O}_f)$ and $(\mathcal{H}_b, \mathcal{O}_b)$. Using the notation $x_0$ and $x_{N-1}$ for the initial conditions of the forward and backward filters, the result of forward-backward and backward-forward filtering can be written as
$$Y_{fb} = (\mathcal{H}_b Y_f^R + \mathcal{O}_b x_{N-1})^R = \mathcal{H}_b^R (\mathcal{H}_f U + \mathcal{O}_f x_0)^R + \mathcal{O}_b^R x_{N-1} = \mathcal{H}_b^R \mathcal{H}_f^R U + \mathcal{H}_b^R \mathcal{O}_f^R x_0 + \mathcal{O}_b^R x_{N-1}$$
and
$$Y_{bf} = \mathcal{H}_f Y_b^R + \mathcal{O}_f x_0 = \mathcal{H}_f (\mathcal{H}_b U^R + \mathcal{O}_b x_{N-1})^R + \mathcal{O}_f x_0 = \mathcal{H}_f^C \mathcal{H}_b^C U + \mathcal{H}_f \mathcal{O}_b^R x_{N-1} + \mathcal{O}_f x_0,$$
respectively. Note that $\mathcal{H}_b^R \mathcal{H}_f^R \neq \mathcal{H}_f^C \mathcal{H}_b^C$.
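As a concrete check at small $N$ (a sketch that reuses the hypothetical filter_matrices helper above and uses the same filter forwards and backwards, so $\mathcal{H}_f = \mathcal{H}_b$, with zero initial states):

N = 50;  u = randn(N, 1);
[b, a] = butter(2, 0.3);                  % arbitrary example filter
[F, G, H, D] = tf2ss(b, a);
[Hm, Om] = filter_matrices(F, G, H, D, N);
Yfb0 = flipud(Hm) * (flipud(Hm) * u);     % Hb^R Hf^R U with zero states
yfb0 = flipud(filter(b, a, flipud(filter(b, a, u))));
norm(Yfb0 - yfb0)                         % agreement up to rounding errors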
It turns out that there is always a solution to $Y_{fb} - Y_{bf} = 0$. We will not prove this statement. Instead we formulate the problem as minimizing $\|Y_{fb} - Y_{bf}\|_2^2$, which turns out to be easier to solve. We have
$$Y_{fb} - Y_{bf} = (\mathcal{H}_b^R \mathcal{H}_f^R - \mathcal{H}_f^C \mathcal{H}_b^C) U + (\mathcal{H}_b^R \mathcal{O}_f^R - \mathcal{O}_f) x_0 + (I - \mathcal{H}_f) \mathcal{O}_b^R x_{N-1}.$$
Since this is a linear expression in $x_0$ and $x_{N-1}$, the minimizing arguments are given by the least squares estimate
$$\begin{pmatrix} \hat{x}_0 \\ \hat{x}_{N-1} \end{pmatrix} = \left[ (\mathcal{H}_b^R \mathcal{O}_f^R - \mathcal{O}_f), \; (I - \mathcal{H}_f) \mathcal{O}_b^R \right]^{\#} (\mathcal{H}_f^C \mathcal{H}_b^C - \mathcal{H}_b^R \mathcal{H}_f^R) U.$$
Here $\#$ denotes the pseudo-inverse. By substituting these estimated initial states into the FB and BF filters we get
$$\hat{Y}_{fb} = \mathcal{H}_b^R \mathcal{H}_f^R U + \left[ \mathcal{H}_b^R \mathcal{O}_f^R, \; \mathcal{O}_b^R \right] \left[ (\mathcal{H}_b^R \mathcal{O}_f^R - \mathcal{O}_f), \; (I - \mathcal{H}_f) \mathcal{O}_b^R \right]^{\#} (\mathcal{H}_f^C \mathcal{H}_b^C - \mathcal{H}_b^R \mathcal{H}_f^R) U$$
$$\hat{Y}_{bf} = \mathcal{H}_f^C \mathcal{H}_b^C U + \left[ \mathcal{O}_f, \; \mathcal{H}_f \mathcal{O}_b^R \right] \left[ (\mathcal{H}_b^R \mathcal{O}_f^R - \mathcal{O}_f), \; (I - \mathcal{H}_f) \mathcal{O}_b^R \right]^{\#} (\mathcal{H}_f^C \mathcal{H}_b^C - \mathcal{H}_b^R \mathcal{H}_f^R) U$$
The expressions given above are not useful for implementation, since $N \times N$ matrices are involved. Next we seek a feasible implementation that does not compute $\mathcal{H}$ explicitly.
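For small $N$ the expressions can nevertheless be coded directly, which is useful for verifying a fast implementation. The following sketch continues the example above (same filter forwards and backwards; filter_matrices is the hypothetical helper sketched earlier):

[b, a] = butter(2, 0.3);  N = 50;  u = randn(N, 1);
[F, G, H, D] = tf2ss(b, a);  n = size(F, 1);
[Hf, Of] = filter_matrices(F, G, H, D, N);
Hb = Hf;  Ob = Of;                           % same filter in both directions
A   = [flipud(Hb)*flipud(Of) - Of, (eye(N) - Hf)*flipud(Ob)];
rhs = (fliplr(Hf)*fliplr(Hb) - flipud(Hb)*flipud(Hf)) * u;
z   = pinv(A) * rhs;                         % least squares estimate
x0  = z(1:n);  xN1 = z(n+1:end);             % estimated initial states
Yfb = flipud(Hb)*flipud(Hf)*u + flipud(Hb)*flipud(Of)*x0 + flipud(Ob)*xN1;
Ybf = fliplr(Hf)*fliplr(Hb)*u + Of*x0 + Hf*flipud(Ob)*xN1;
norm(Yfb - Ybf)                              % should be close to zero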
3 Fast implementation
In this section we describe how $Y_{fb}$, without proof claimed to be the same as $Y_{bf}$, can be computed efficiently. First an exact method is detailed, where the initial states are computed with a complexity that is linear in the data length, and then a constant complexity approximation is proposed.
We first compute the nominal values of $Y_{fb}$ and $Y_{bf}$ for zero initial conditions using standard filter routines. That is, we have
$$Y_{fb}^0 = \mathcal{H}_b^R \mathcal{H}_f^R U$$
$$Y_{bf}^0 = \mathcal{H}_f^C \mathcal{H}_b^C U.$$
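With standard routines this amounts to the following (sketch; b, a and u as in the earlier examples, and filter assumes zero initial states):

yfb0 = flipud(filter(b, a, flipud(filter(b, a, u))));   % Yfb^0
ybf0 = filter(b, a, flipud(filter(b, a, flipud(u))));   % Ybf^0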
We also need arbitrary state space realizations of the filters, $(F_f, G_f, H_f, D_f)$ and $(F_b, G_b, H_b, D_b)$. The "observability" matrices $\mathcal{O}_f$ and $\mathcal{O}_b$ are computed according to (3). Now we just have to compute the quantities $S_{bf} \triangleq \mathcal{H}_b \mathcal{O}_f^R$ and $S_{fb} \triangleq \mathcal{H}_f \mathcal{O}_b^R$, and we can express the least squares estimate as
$$\begin{pmatrix} \hat{x}_0 \\ \hat{x}_{N-1} \end{pmatrix} = \left[ (S_{bf}^R - \mathcal{O}_f), \; (\mathcal{O}_b^R - S_{fb}) \right]^{\#} (Y_{bf}^0 - Y_{fb}^0).$$
Thus, we have
$$Y_{fb} = Y_{fb}^0 + \left[ S_{bf}^R, \; \mathcal{O}_b^R \right] \left[ (S_{bf}^R - \mathcal{O}_f), \; (\mathcal{O}_b^R - S_{fb}) \right]^{\#} (Y_{bf}^0 - Y_{fb}^0). \qquad (4)$$
Next it is shown that $S_{fb}$ can be computed without forming the $N \times N$ matrix $\mathcal{H}_f$. The vector of impulse response coefficients $h_f$ can be expressed as
$$h_f = \begin{pmatrix} D_f \\ \mathcal{O}_f(1:N-1,:)\, G_f \end{pmatrix}.$$
Here $\mathcal{O}_f(1:N-1,:)$ means rows 1 to $N-1$ of $\mathcal{O}_f$. Note that $h_f$ is the first column of $\mathcal{H}_f$. Let $S_{fb}(t,:)$ denote the $t$'th row of $S_{fb}$ and $h_f(t)$ the $t$'th component of $h_f$. From (1), (2) and (3) we have
$$S_{fb}(N,:) = (h_f^R)^T \mathcal{O}_b^R = (h_f^T)^C \mathcal{O}_b^R = h_f^T \mathcal{O}_b$$
$$S_{fb}(N-1,:) = \left( S_{fb}(N,:) - h_f(N) H_b F_b^{N-1} \right) F_b$$
$$S_{fb}(t-1,:) = \left( S_{fb}(t,:) - h_f(t) H_b F_b^{N-1} \right) F_b$$
Thus, $S_{fb}$ can be computed by the recursion
$$S_{fb}(N,:) = h_f^T \mathcal{O}_b$$
$$S_{fb}(t-1,:) = \left( S_{fb}(t,:) - h_f(t) H_b F_b^{N-1} \right) F_b, \qquad t = N, N-1, \dots, 2,$$
and $S_{bf}$ is computed similarly. In this way, we do not have to compute the $N \times N$ matrix $\mathcal{H}$ explicitly, but only the $N \times 2n$ matrices $\mathcal{O}$ and $S$.
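A sketch of the recursion for $S_{fb}$ (it assumes that the impulse response vector hf, the backward filter's state matrix Fb and the matrix Ob built according to (3) are available; the last row of Ob equals $H_b F_b^{N-1}$):

Sfb = zeros(N, n);
Sfb(N, :) = hf' * Ob;                     % last row: hf^T * Ob
r = Ob(N, :);                             % the row Hb*Fb^(N-1)
for t = N:-1:2
    Sfb(t-1, :) = (Sfb(t, :) - hf(t) * r) * Fb;
end
% Sbf is obtained in the same way with the roles of the two filters swapped.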
4 Fixed complexity approximation
For long signals, it seems to be overkill to work with the complete impulse response $h$ and observability matrix $\mathcal{O}$. If we assume that the impulse response is approximately zero for $t > M$, a logical approximation in (4) is to replace $F^t$ by 0 for $t > M$. This leads to an $\hat{\mathcal{O}}$ having non-zero elements in its first $M$ rows only, and an $\hat{S}$ having non-zero elements in its last $M$ rows only. Thus, we are faced with a least squares problem with $2M$ equations and $2n$ unknowns, so all that is needed is to compute the pseudo-inverse of a $2M \times 2n$ matrix in (4).
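A sketch of the reduced problem (illustrative only; M is the assumed effective length of the impulse response, and for simplicity the full-size matrices from the previous section are merely restricted to the $2M$ relevant rows rather than being built in reduced form):

M = 30;                                   % assumed impulse response length
idx = [1:M, N-M+1:N];                     % the first M and the last M rows
A   = [flipud(Sbf) - Of, flipud(Ob) - Sfb];          % coefficient matrix of (4)
z   = pinv(A(idx, :)) * (ybf0(idx) - yfb0(idx));     % 2M equations, 2n unknowns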
5 Simulations
In this section, a band-pass Butterworth filter applied to a white noise sequence¹ is examined. The filter is of tenth order with (normalized) cut-off frequencies 0.2 and 0.25, and the noise sequence is of length 200. Figure 1 shows first $Y_{fb}$ and $Y_{bf}$ using the initial states in Matlab's filtfilt² function in the Signal Processing Toolbox, and second the result using zero initial states $x_0$ and $x_{N-1}$, i.e. $Y_{fb} = \mathcal{H}^R \mathcal{H}^R U$ and $Y_{bf} = \mathcal{H}^C \mathcal{H}^C U$. Then follow the signals (4) using optimized initial states. Finally, the result using partial fractions,
$$|H(q)|^2 = \frac{B(q)}{A(q)} \frac{B(q^{-1})}{A(q^{-1})} = \frac{C(q)}{A(q)} + \frac{D(q^{-1})}{A(q^{-1})},$$
is shown. As seen, the last two curves are identical, showing the symmetry of the proposed initialization method. The 2-norms are as follows:
$$\| Y_{fb}^{\mathrm{filtfilt}} - Y_{bf}^{\mathrm{filtfilt}} \|_2 = 1.6$$
$$\| Y_{fb}^{0} - Y_{bf}^{0} \|_2 = 1.0$$
$$\| \hat{Y}_{fb} - \hat{Y}_{bf} \|_2 = 8 \cdot 10^{-13}$$
Thus, there is a significant difference depending on the order of forward and backward filtering for the first two choices of initial conditions. The transient from the filter is clearly visible either at the beginning or at the end. The proposed method and the partial fraction method are perfectly symmetric.
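The zero-initial-state comparison above can be reproduced along the following lines (sketch; the exact norm depends on the realized noise sequence, and filtfilt denotes the toolbox routine referred to above):

[b, a] = butter(5, [0.2 0.25]);           % tenth order band-pass Butterworth
u = randn(200, 1);                        % white noise of length 200
y_ff = filtfilt(b, a, u);                 % the toolbox initialization, for comparison
yfb0 = flipud(filter(b, a, flipud(filter(b, a, u))));   % zero initial states
ybf0 = filter(b, a, flipud(filter(b, a, flipud(u))));
norm(yfb0 - ybf0)                         % non-zero due to the end transients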
To highlight the transients due to unknown initial conditions in the partial fraction method, a very simple zero-phase high-pass filter is used,
$$y(t) = \left| \frac{1}{q + 0.8} \right|^2 u(t) = \left( \frac{125/36}{q + 1.25} - \frac{20/9}{q + 0.8} \right) u(t).$$
¹The result can be reproduced using Matlab's Gaussian distribution with seed 0.
²Version 4.1 and earlier of Matlab has a bug. With the following definition of the initial state zi, the function works as intended:
zi = (eye(nfilt-1) + [aa(2:nfilt)' [-eye(nfilt-2); zeros(1,nfilt-2)]]) \ (bb(2:nfilt)-bb(1)*aa(2:nfilt))';
[Figure 1: four panels labelled "Forward-backward filtering using Matlab's filtfilt", "Forward-backward filtering using zero initial conditions", "Forward-backward filtering using optimized initial conditions" and "Forward-backward filtering using partial fractions"; sample index 0-200 on the horizontal axes.]
Figure 1: The forward-backward and backward-forward filtered signal using first Matlab's filtfilt, then zero initial conditions, then optimized initial conditions and finally partial fractions.
[Figure 2: two panels labelled "Forward-backward filtering using optimized initial conditions" and "Forward-backward filtering using partial fractions"; sample index 0-200 on the horizontal axes.]
Figure 2: The filtered signal using optimized initial conditions and partial fractions.
A DC offset of 100 is then added to the $u(t)$ used in the previous example. The result is shown in Figure 2. The transients for the partial fraction method are obvious. Of course, some ad hoc rules can be used to determine initial conditions that compensate for the DC offset in this example. The point is that the proposed method computes the initial conditions for each filter to match the end of the other filter response, when the transient hopefully is gone.
6 Conclusion
We have here detailed a method to choose the initial conditions for filters that are applied forwards and backwards to a signal. The objectives were to get the same result independently of the order of filtering and to minimize transients due to offsets and trends in the data. Zero initial conditions or other ad hoc initializations can easily be shown by examples to give undesired transients at the beginning or the end of the data sequence, depending on the order of application of the forward and backward filters. Transients are also a problem when using partial fractions of the stable and unstable parts. The proposed method gives no transients and is symmetric, which was exemplified by a simulation.
References
[1] T. Kailath. Lectures on Wiener and Kalman Filtering. Springer-Verlag, Wien, 1981.
[2] MATLAB Signal Processing Toolbox User's Guide. The MathWorks, Inc., 1993.