A Distributed Time-Difference of Arrival Algorithm
for Acoustic Bearing Estimation
Peter W. Boettcher and Gary A. Shaw
MIT Lincoln Laboratory
244 Wood Street
Lexington, Massachusetts 02420–9185
[email protected]
Abstract – An algorithm for estimating bearing-to-target
using time-difference of arrival (TDOA) in distributed
acoustic sensor networks is presented. This algorithm is advantageous in those cases where beamforming or other coherent processing is not feasible, such as when microphone
placement and time synchronization are uncertain, or when
the individual nodes communicate by energy-constrained
low-bandwidth links. The algorithm is applied to experimental, ground-truthed acoustic data, and full results are
presented.
Keywords: TDOA, bearing, target tracking.
1 Introduction
The distributed sensing community is exploring the utility of sensor platforms consisting of many small, low-cost sensors employing low-power, omni-directional sensors (e.g., acoustic and seismic). Such sensors are effective in classifying targets according to signature, but, due to the limited directionality of the sensors, do not individually provide geolocation or bearing information. Furthermore, a system of distributed, wireless, energy-constrained sensors must be designed to minimize energy utilization in processing and communication, as well as inter-sensor bandwidth.
The DARPA SensIT program is developing wireless sensor nodes and associated networking and collaborative processing algorithms for ad hoc networks of energy-constrained sensors [1]. In August 2000, a data collection experiment called SITEX00 was run at Twentyninepalms, CA, with the cooperation of the USMC. A total of 37 nodes were placed in three clusters along the roads at the test site (see Figures 1 and 2). Each node was configured with acoustic, seismic, and IR sensors, and the location of each node was determined using GPS. Over a period of two weeks, numerous collection runs were made, using various military vehicles containing GPS receivers to provide ground truth.
In this paper, we examine an algorithm for combining the measurements of individual sensors to collaboratively determine the bearing to a target vehicle. This algorithm estimates the difference in acoustic travel time using features from the dominant frequency, and it uses these time-of-arrival differences to estimate the bearing to the target. The algorithm is applied to the experimental data from SITEX00, and performance and computational complexity results are presented.
This work is sponsored by DARPA under A/F Contract #F19628-00-C-0002. Opinions, interpretations, recommendations and conclusions are those of the authors and are not necessarily endorsed by the Department of Defense.
To appear in FUSION 2001, Montreal, August 2001
Figure 1: SITEX00 test site layout: The solid line represents the roads along which the vehicles travel, and the square markers represent the locations of the sensor nodes. (Scale bar: 500 m.)
Figure 2: SITEX00 test site: Military vehicle passing through a field of sensor nodes.
2 Motivation for TDOA algorithm
2.1 Coherent processing
Over the years, unattended ground sensors for target tracking have evolved to employ arrays of high-sensitivity microphones and coherent beamforming to provide a bearing estimate to the target [2], [3]. However, recent concepts for very small, low-cost sensors have renewed interest in single-element acoustic sensors, since the sensing node can be very small and consume very little energy. Coherent beamforming using randomly-spaced single-element sensors is a possibility, but performance will be limited by 1) uncertainties in sensor location, 2) grating lobes due to undersampling of the acoustic wavefront, 3) inhomogeneities in the propagation medium, 4) errors in time synchronization, and 5) differences in the sensor response characteristics. Furthermore, coherent processing among distributed sensor nodes requires communication of the raw data samples between the nodes, which consumes precious energy and bandwidth. One must conclude that beamforming or other coherent processing across distributed sensor nodes is a difficult and energy-intensive task.
2.2 Closest point of approach
At the other end of the complexity spectrum lie algorithms based on closest point of approach (CPA) [4]. These algorithms use the average received energy at each sensor node to estimate the time at which the target vehicle passed closest to the node. A series of these CPA estimates from multiple nodes can be used to estimate the direction and speed of the target vehicle. In the trivial case of a single target traveling at a constant velocity, the problem can be reduced to a least-squares matrix solution.
CPA-based approaches require negligible bandwidth, since only the time and amplitude of each detected CPA event must be communicated. However, one of the major disadvantages of the technique is the difficulty in sorting out multiple simultaneous target tracks or complex target motion. As illustrated in Figure 3, CPA measurements can be used to establish heading and speed, but the solution is not unique. In addition, the target must pass directly past several nodes before an estimate can be made; consequently, this technique is only useful when the target is located within the field of sensors.
Figure 3: Illustration of a vehicle generating CPA events on a set of nodes. (Diagram labels: Node 1 to Node 4, true target path, erroneous parallel path.)
2.3 Time-difference of arrival
The time-difference of arrival (TDOA) approach presented in this paper shares some of the advantages of the two approaches of coherent processing and CPA-based techniques. Instead of using phase difference information, as beamforming does, TDOA techniques compare the travel time of sound through air over much larger baselines. In order to avoid transmitting time-series data from one node to another, the relative time-difference of arrival of a signal is estimated using the dominant frequency of the acoustic spectrum as a feature. This feature is quite compressible, and allows accurate determination of acoustic travel-time delay. The resultant time delay estimates can provide bearing estimates for targets outside the field of sensors; the range is limited only by the SNR of the strongest feature.
Figure 4: Block diagram of TDOA bearing estimation algorithm. The dashed boxes indicate the physical node on which the blocks are run. (Diagram blocks: each remote node contains Frequency Estimator, Data Reduction, and Time Delay Estimator; the central (reference) node contains Frequency Estimator, Data Reduction, Broadcast, and Bearing Solver.)
3 TDOA for distributed sensors
3.1 Frequency estimation
This essential stage of the algorithm (see Figure 4 for a block diagram) takes as input the raw time-series data and produces an estimate of the dominant instantaneous frequency. This frequency estimate serves as the basis for the delay estimation.
Figure 5: Lossless filter bank used for adaptive notch filter. (Diagram labels: input u(n), all-pass A(z), gains of 0.5, bandpass output, notch output, e(n).)
Figure 6: Normalized lattice filter (A(z)). (Diagram labels: input u(n), output y(n), internal signal x1(n), rotation gains cos(θ1), sin(θ1), −sin(θ1), cos(θ2), sin(θ2), −sin(θ2), and two z^−1 delays.)
Presently, Regalia's adaptive IIR notch filter [5] is used for the frequency estimation. The filter structure is a planar rotation lattice filter (Figures 5 and 6). The two free parameters θ1 and θ2 are related to the filter parameters as follows:
$$\theta_1 = \omega_0 - \pi/2, \qquad \omega_0 \in [0, \pi] \tag{1}$$
$$\sin\theta_2 = \frac{1 - \tan(B/2)}{1 + \tan(B/2)}, \tag{2}$$
where ω0 is the notch frequency and B is the 3-dB attenuation bandwidth. This separation of bandwidth and center frequency allows straightforward adaptation. In fact, the bandwidth B can be fixed, allowing only ω0 to be updated. Regalia proposes the following update rule for θ1:
$$\theta_1(n+1) = \theta_1(n) - \mu(n)\, e(n)\, x_1(n), \qquad \mu(n) > 0, \tag{3}$$
where µ(n) is the stepsize and e(n) and x1(n) are outputs from the filter structure. Finally, the stepsize µ(n) is updated by choosing a "forgetting factor" λ:
$$\mu(n) = 1 \Big/ \sum_{k=0}^{n} \lambda^{n-k} x_1^2(k), \qquad 0 \ll \lambda \le 1. \tag{4}$$
The raw time-series data is input into this filter structure, and θ1 is allowed to update with every input sample. This update occurs to minimize the energy on the output of the notch filter, so that the notch lies directly on the strongest frequency component present in the input. For each sample, ω0 is computed from θ1 using (1), and is output to the next block of the algorithm.
Figure 7 shows an example of the operation of the frequency tracker. The spectrogram of a portion of the experimental SITEX00 data is shown, with the overlaid frequency estimate.
Figure 7: Spectrogram of SITEX00 data with overlaid frequency estimate. The vehicle decelerates at 08:14, causing the SNR to drop and the filter to lose the frequency track. (Axes: emitter frequency in Hz, 0 to 100, versus time, 08:12 to 08:18.)
3.2 Data reduction
At this stage in the algorithm, the data rate has not changed, since the frequency tracker produces one estimate per input sample. Since one of the original goals was bandwidth reduction, this data must be reduced before transmission.
The most obvious approach is decimation. Intuition and inspection of the data suggest that the dominant frequency changes much more slowly than the original time-series sampling rate. The actual decimation factor remains an adjustable parameter of the system and drives a tradeoff between bandwidth and accuracy. The SITEX00 data was sampled at 256 Hz; decimation of the frequency estimate by a factor of 10 yields a rate of approximately 26 samples per second and has no noticeable impact on delay estimation accuracy. Even with a liberal encoding of 16 bits/sample, only 416 bits per second of bandwidth are required.
Censoring the transmitted information has further potential for bandwidth savings. When it is clear that a portion of the extracted frequency estimate has no distinguishing features or is noisy due to low SNR, that portion can be discarded before transmission. For example, a constant frequency cannot be used to estimate travel-time delay. Censoring has not been investigated sufficiently to establish performance bounds.
Following the data reduction step, the one elected reference node in the cluster transmits its frequency estimates to the surrounding nodes. If the network access scheme allows broadcast, the communication of frequency features requires only a single transmission per cluster (416 bps/cluster in this case, ignoring protocol overhead).
3.3 Delay estimation
In the following sections, an example of time delay and bearing estimation is given for the 6 nodes shown in Figure 8.
Figure 8: Locations of nodes used in delay estimation and bearing formation. (Scale bar: 100 m.)
Each of the neighboring nodes in the cluster has meanwhile performed the same frequency estimation. Upon receiving the frequency estimate from the reference node, each neighboring node can estimate the acoustic travel-time delay from its own sensor to the sensor of the reference node. Communication occurs only as valid features are detected and extracted. A publish/subscribe communication protocol can be employed to support data-adaptive communication [1]. Figure 9 shows the frequency tracks for several nodes. The good correlation from one node to the other is clearly visible, as is the travel-time delay.
Figure 9: Overlaid frequency estimates from 3 nodes. (Axes: frequency estimate in Hz, 54 to 59, versus time; scale annotation: 500 samples ≈ 1.9 seconds.)
The bearing to the target (and thus the time delay) is assumed to be constant over some short interval, typically 2–4 seconds. Both the local and the reference frequency estimates are therefore split into overlapping short-duration frames. Each pair of local and reference frames is correlated to estimate the delay, as further described below.
Because the variations in the frequency estimates are the key features, the mean and linear terms are removed from all frames. Each pair of frames is correlated; the location of the resulting peak indicates the travel-time delay, and the height and shape of the peak provide a simple confidence measure.
The decimation described in Section 3.2 affects system accuracy at this stage. With no frequency decimation, the delay time estimate would be quantized to the original acoustic sampling period. After decimation, the delay estimate is quantized to multiples of the decimated sampling period, that is, the original period multiplied by the decimation factor. To mitigate this effect, the three points surrounding the peak are used for a parabolic interpolation to determine the peak location more accurately.
Figure 10 shows the estimates of acoustic travel-time delay for each of 5 nodes, with respect to a sixth reference node. The predicted delay based on GPS ground-truth is overlaid.
Figure 10: Time delay estimates. The dots show the estimates of the time delay from the data. The solid lines are predicted from the ground-truth. (Axes: arrival lag in samples, −80 to 100, versus local time, 08:12 to 08:18; legend: Estimated Lag, Ground Truth; annotations: Leave SE, Intersection, Arrive N, Leave SW, Intersection.)
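The delay-estimation steps of Section 3.3 (remove the mean and linear terms, correlate a local frame against a reference frame, then refine the peak with a three-point parabolic fit) can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names, the use of NumPy, the sign convention for the lag, and the normalized peak height used as a confidence value are all assumptions made for this example.

```python
import numpy as np


def detrend(frame):
    """Remove the mean and linear terms from one frame of frequency estimates."""
    n = np.arange(len(frame))
    slope, intercept = np.polyfit(n, frame, 1)
    return frame - (slope * n + intercept)


def estimate_delay(local, reference, sample_period):
    """Return (delay_seconds, peak_height) for one pair of frames.

    Assumed convention: a positive delay means the feature appears at the
    local node later than at the reference node. sample_period is the
    period of the decimated frequency estimates, in seconds.
    """
    a = detrend(np.asarray(local, dtype=float))
    b = detrend(np.asarray(reference, dtype=float))

    # Full cross-correlation; index k corresponds to lag lags[k] in samples.
    corr = np.correlate(a, b, mode="full")
    lags = np.arange(-(len(b) - 1), len(a))
    k = int(np.argmax(corr))

    # Three-point parabolic interpolation around the peak (sub-sample offset).
    offset = 0.0
    if 0 < k < len(corr) - 1:
        y0, y1, y2 = corr[k - 1], corr[k], corr[k + 1]
        denom = y0 - 2.0 * y1 + y2
        if denom != 0.0:
            offset = 0.5 * (y0 - y2) / denom

    # Normalized peak height as a simple (hypothetical) confidence measure.
    norm = np.sqrt(np.dot(a, a) * np.dot(b, b))
    height = corr[k] / norm if norm > 0 else 0.0
    return (lags[k] + offset) * sample_period, height
```

With 10x decimation of 256 Hz data, `sample_period` would be 10/256 s, so the interpolation step is what recovers delay resolution finer than about 39 ms.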
3.4 Bearing solution
Each node transmits the estimated travel-time delays back to the reference node, for each successful correlation. Under a far-field assumption, specifically that the target is much farther from the cluster than the nodes are from each other, the following is true:
$$\Delta t = \frac{d \sin\phi}{c_s}, \tag{5}$$
where ∆t is the estimated delay, d is the internode spacing, cs is the speed of sound, and φ is the bearing measured from broadside. The bearing φ is obtained for each pair of nodes as
$$\phi = \arcsin\left(\frac{c_s \Delta t}{d}\right). \tag{6}$$
The arcsin produces two solutions, so several node pairs are necessary to disambiguate the solutions. Of the two possible bearing solutions for each pair of nodes, one is chosen, according to the criterion that the chosen bearings are as mutually consistent as possible (see Figure 11).
The results from the SITEX00 dataset are shown in Figure 12. Since 5 node-pairs are used to estimate the bearing, the median bearing estimate is shown, with the GPS ground-truth overlaid.
Figure 12: Bearing results, using the approximate center of the 6-node cluster as reference. Ground truth bearing is the solid line, and the dots represent the estimated bearing. (Axes: bearing to target in degrees versus local time.)
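The bearing solution of Section 3.4 can be illustrated with a short sketch. It applies (6) to each node pair, forms the two mirror-image candidate bearings, and picks the combination of candidates that is most mutually consistent. Several details are assumptions made for this example and are not taken from the paper: the reference node sits at the origin, angles are mathematical angles (counterclockwise from the +x axis), a positive delay means the sound reaches the remote node later than the reference node, consistency is scored by the length of the mean resultant vector, the speed of sound is a nominal 343 m/s, and all function names are hypothetical.

```python
import itertools

import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed nominal value


def wrap(angle):
    """Wrap an angle (or array of angles) to the interval [-pi, pi)."""
    return (angle + np.pi) % (2.0 * np.pi) - np.pi


def candidate_bearings(node_xy, delay, c=SPEED_OF_SOUND):
    """Two candidate target bearings (radians) for one reference/remote pair.

    node_xy is the remote node position relative to the reference node.
    """
    x, y = node_xy
    d = np.hypot(x, y)
    # Eq. (6); clipping guards against |c*dt/d| > 1 caused by noisy delays.
    phi = np.arcsin(np.clip(c * delay / d, -1.0, 1.0))
    baseline = np.arctan2(y, x)
    # phi is measured from broadside, so the angle from the baseline is
    # pi/2 + phi; the two candidates mirror each other about the baseline.
    half = np.pi / 2.0 + phi
    return baseline + half, baseline - half


def solve_bearing(nodes_xy, delays, c=SPEED_OF_SOUND):
    """Choose one candidate per pair so the set is as consistent as possible."""
    pairs = [candidate_bearings(p, dt, c) for p, dt in zip(nodes_xy, delays)]
    best, best_r = None, -1.0
    for combo in itertools.product(*pairs):  # 2**N combinations, N is small
        ang = np.array(combo)
        r = np.hypot(np.cos(ang).mean(), np.sin(ang).mean())
        if r > best_r:
            best, best_r = ang, r
    # Median of the chosen bearings, taken about their circular mean.
    center = np.arctan2(np.sin(best).mean(), np.cos(best).mean())
    return wrap(center + np.median(wrap(best - center)))
```

The exhaustive search over 2^N combinations is only practical because N is small (5 node pairs in the example of this paper). Node pairs whose baselines are parallel share the same mirror ambiguity, so the sketch relies on the baselines pointing in different directions.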
Figure 11: Each pair of nodes produces 2 possible solutions, one of which is correct. The consistent solutions from multiple node pairs determine the true solution.
4 Complexity and bandwidth
4.1 TDOA complexity
The frequency estimation stage requires very few CPU cycles, since it uses an extremely efficient filter structure. The CPU bottleneck is the correlation, but even this stage has modest requirements. For example, at a 256 Hz acoustic sampling rate, 10x decimation of the frequency estimate, and 5 second frames, the correlation is performed using approximately 12.8 Kops (kilo-operations) per frame. At a frame rate of 2 frames per second, 24 Kops per second are required.
As discussed in Section 3.2, the communication bottleneck occurs when transmitting the frequency estimate from the reference node to the neighboring nodes. Given system parameters similar to those listed in the previous paragraph, 400 bits/second could be required, with no compression beyond decimation. This requirement is low enough to admit the possibility of non-RF communication, such as non-line-of-sight UV communication [6]. However, if the network access scheme does not have a broadcast capability, or if the reference node is not able to use it, this number might have to be scaled by the number of neighboring nodes to be used in the computation, typically 3–5. Standard compression algorithms could lower this number significantly.
4.2 Coherent beamforming complexity
Although the distributed beamforming problems stemming from location and time uncertainties are difficult to overcome, the theoretical computational and bandwidth requirements of beamforming are compared here to those of the TDOA scheme. As a point of reference, the Remote Sentry Advanced Technology Demonstration is described in sufficient detail in [2] to approximate the computational requirements.
This system samples acoustic data at 512 Hz and uses 72 fixed beams over 160 narrowband frequencies to produce 2 bearing estimates per second. The Remote Sentry system is certainly more powerful than the TDOA algorithm proposed in this paper, since it can track multiple targets with better accuracy. However, this comparison provides a context for the discussion of the TDOA algorithm. Although the Remote Sentry uses 8 sensors, we will perform the calculations using 5 sensors for consistency with the TDOA algorithm.
With 5 acoustic sensors, the FFT calculations require approximately 230 Kops/s. For each of the 160 frequency bands, a 5x5 covariance matrix must be estimated, requiring 25 samples per estimate for accuracy. This estimate is formed twice per second, requiring a total of 200 Kops/s. Another 100 Kops/s are required for solving for the weight vectors for all 72 fixed beams. The final result for the beamforming stage of the system is then over 500 Kops/s.
The bandwidth required for the system is simple to estimate. Four of the remote sensor nodes must communicate their raw time-series data to the fifth node, which performs the beamforming. With an encoding of 16 bits/sample, 32 kbits/second is required for the data transmission.
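The bandwidth figures quoted in Sections 3.2, 4.1, and 4.2 follow from simple arithmetic, reproduced in the sketch below. The function and parameter names are hypothetical, and rounding the decimated rate up to a whole number of samples per second is an assumption chosen to match the "approximately 26 samples per second" stated in Section 3.2.

```python
import math

BITS_PER_SAMPLE = 16  # "liberal encoding" used in the text


def tdoa_bandwidth_bps(fs_hz=256, decimation=10, bits=BITS_PER_SAMPLE,
                       transmissions=1):
    """Bits per second to send the decimated frequency estimate.

    transmissions is 1 when the reference node can broadcast, or the number
    of neighboring nodes (typically 3 to 5) when it cannot.
    """
    rate = math.ceil(fs_hz / decimation)  # 25.6 rounded up to 26 samples/s
    return rate * bits * transmissions


def beamforming_bandwidth_bps(fs_hz=512, remote_nodes=4, bits=BITS_PER_SAMPLE):
    """Bits per second to ship raw time series from the remote nodes."""
    return fs_hz * remote_nodes * bits


print(tdoa_bandwidth_bps())                  # 416 (broadcast case)
print(tdoa_bandwidth_bps(transmissions=5))   # 2080 (no broadcast, 5 neighbors)
print(beamforming_bandwidth_bps())           # 32768, about 32 kbits/second
```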
5 Conclusions
The algorithm presented here demonstrates an example of a collaborative processing algorithm for distributed sensors. The form of the algorithm is driven by consideration of bandwidth and energy constraints, and uncertainties in node location and calibration. The algorithm provides bearing-to-target estimates in a distributed sensor network where coherent processing is impossible. The bearing estimates provide useful tracking information at ranges far beyond those possible with closest point of approach based algorithms, but the algorithm still has quite modest processing and bandwidth requirements. Algorithm performance was illustrated using distributed sensor data collected at the recent SensIT SITEX00 experiment. In comparison to coherent processing, the TDOA algorithm requires approximately 10x less computation and communication.
Areas for future work include extension to the multitarget case by replacing the frequency estimation module, implementation in distributed hardware, and integration with other collaborative algorithms.
Acknowledgments
This work was supported by Dr. Sri Kumar of DARPA. Many individuals contributed to the successful data collection at Twentyninepalms. We especially wish to thank Messrs. Arch Owen, Ken Theriault, and Richard McNeil of BBN Technologies for their tireless efforts in planning and executing the data collection experiment and providing the ground-truthed data.
References
[1] D. Coffin, D. Van Hook, S. Kolek, and S. P. McGarry, “Declarative ad hoc sensor networking,” in Proc. of the SPIE, San Diego, CA, July 2000, vol. 4126.
[2] J. A. Brooks, Jr. and M. A. Gallo, “Remote sentry advanced technology demonstration,” in Proc. of the SPIE, June 1996, vol. 2764, pp. 154–164.
[3] K. T. Malone, L. Riblett, and T. Essenmacher, “Acoustic/seismic identifications, imaging, and communications in Steel Rattler,” in Proc. of the SPIE, 1997, vol. 3081, pp. 158–165.
[4] F. M. Dommermuth, “The estimation of target motion parameters from CPA time measurements in a field of acoustic sensors,” J. Acoust. Soc. Am., vol. 83, no. 4, pp. 1476–1480, April 1988.
[5] P. A. Regalia, “An improved lattice-based adaptive IIR notch filter,” IEEE Trans. Sig. Proc., vol. 39, no. 9, pp. 2124–2128, September 1991.
[6] G. A. Shaw, M. Nischan, M. Iyengar, S. Kaushik, and M. K. Griffin, “NLOS UV communication for distributed sensor systems,” in Proc. of the SPIE, San Diego, CA, July 2000, vol. 4126.