Nonlinear Adaptive Filtering of Stimulus Artifact R. Grieve, P. Parker

Nonlinear Adaptive Filtering of Stimulus Artifact
R. Grieve, P. Parker, B. Hudgins and K. Englehart
Department of Electrical and Computer Engineering
and
Institute of Biomedical Engineering
University of New Brunswick, Fredericton, New Brunswick, Canada
Please address all correspondence to:
Philip A. Parker
Dept. of Electrical and Computer Engineering,
University of New Brunswick,
PO Box 4400,
Fredericton, New Brunswick,
Canada, E3B5A3
Abstract
Noninvasive measurements of somatosensory evoked potentials have both clinical and
research applications. The electrical artifact which results from the stimulus is an interference
which can distort the evoked signal, and introduce errors in response onset timing estimation.
Given that this interference is synchronous with the evoked signal, it cannot be reduced by the
conventional technique of ensemble averaging. The technique of adaptive noise cancelling has
potential in this regard however, and has been used effectively in other similar problems. An
adaptive noise cancelling filter which uses a neural network as the adaptive element is
investigated in this application. The filter is implemented and performance determined in the
cancelling of artifact for in vivo measurements on the median nerve. A technique of segmented
neural network training is proposed in which the network is trained on that segment of the record
time window which does not contain the evoked signal. The neural network is found to
generalize well from this training to include the segment of the window containing the evoked
signal. Both quantitative and qualitative measures show that significant stimulus artifact
reduction is achieved.
Keywords: evoked, response, stimulus, artifact, adaptive, filter, cancelling.
2
I Introduction
Somatosensory evoked potentials (SEPs) are used widely in clinical and research
applications, and include peripheral, spinal, and cortical potentials. For reasons of convenience,
comfort, and safety, the preferred measurement techniques are noninvasive using surface
electrical stimulation and recording electrodes. The recorded SEP is thus small in amplitude-typically on the order of a microvolt peak, [1]. While an electrical stimulus evokes the desired
response from the nervous process of interest, it also results in an interfering potential referred to
as the stimulus artifact (SA).
Due to a number of factors, the stimulus artifact is a particularly troublesome form of
interference. First, the amplitude of the SA is typically orders of magnitude larger than the SEP,
[2]. Second, the SA is synchronous or coherent with the SEP and thus cannot be reduced by
ensemble averaging--the common method for reducing other forms of interference, [3]. Third, in
most cases the SA overlaps the SEP in both the time and frequency domains, such that
conventional time windowing and frequency filtering are not capable of removing the SA
without distortion of the SEP, [4]. Fourth, the mechanisms by which the electrical stimulus is
coupled into the recording are varied, complex and distributed, making minimization of the SA
at the source problematic, [5].
Given that the conventional methods are not effective for SA reduction in SEP
measurements, alternative approaches have to be considered. A technique which has potential
in this regard is adaptive noise cancelling (ANC). In this approach an estimate of the SA is
3
obtained via an adaptive filter and the estimate subtracted from the composite SA plus SEP
signal. ANC filters have been used in other biological signal acquisition applications. Ye and
Choy, [6], used such a filter for the reduction of respiratory artifact in rheopneumography
measurements. Sadasivan and Narayana, [7], developed an ANC filter for the reduction of
electrooculogram in
electroencephalography measurements. Thakor and Zhu, [8], have used an ANC filter for the
removal of the QRS complex in ECG in order to improve detection of the P-wave. Suzuki et al.,
[9], developed a real-time ANC filter for suppression of ambient noise in lung sound
measurements.
In the application of noise cancelling to SA reduction, McGill et al., [4], estimated the
artifact waveform from subthreshold stimulation and subtracted the estimate from the
suprathreshold stimulus response. Nonlinear adaptive ANC filters were not considered in this
study. Work by one of the present authors investigated the application and performance of linear
and nonlinear finite impulse response (FIR) ANC filters for SA reduction in SEP measurements,
[10,11]. The performance of linear FIR ANC filters was found to be rather poor, [10],
suggesting that the input-output relationship of tissue to electrical stimulus is nonlinear. Indeed
the source of this nonlinearity is, at least in part, the nonlinear voltage-current characteristic of
the skin, [12]. A second-order truncated Volterra series nonlinear FIR ANC filter was shown to
have substantially improved performance, [10].
While this filter achieves good SA cancellation, it has two serious limitations: 1) it cannot
model nonlinearities of order higher than two, and 2) the number of filter coefficients grows as
the square of the FIR filter length. An alternative ANC filter that does not suffer from these
4
limitations is the neural network (NN) configuration. The objectives of this paper are to
investigate the application and performance of the NN ANC filter for the reduction of SA in SEP
measurement
II Neural Network ANC Filter
The NN configuration has been used in ANC filters for other applications requiring
cancellation of different types of interference. Yuwaraj et al., [13], applied the NN ANC filter
for cancellation of interference in otoacoustic emission measurements. In the first application to
SA reduction, Grieve et al., [14], investigated the pi-sigma NN configuration with some success
but concluded that a feedforward NN configuration would be more appropriate. The feedforward
multi layer perceptron (MLP) NN will be considered in this paper for the ANC filter and SA
cancellation.
Figure 1 about here
The general configuration for the ANC filter is given in Fig.1. The primary channel input Xp
is the addition of evoked response S, artifact Ap, and uncorrelated noise np. The reference
channel input Xr is the addition of artifact Ar, correlated with Ap through the system H, and
uncorrelated noise nr. The adaptive filter ha(t) estimates Ap from Xr and is adapted by the error e
through an appropriate training algorithm. Following convergence of ha(t), the ANC filter output
5
is the minimum mean square error estimate of S.
Different structures can be used for the adaptive filter, ha(t), and perhaps the most common
is the linear FIR filter. However the structure of interest here is the NN adaptive filter because of
its ability to generalise from training to test data, and to model nonlinear transfer functions of
arbitrary order. The general structure of a feedforward MLP neural network is given in Fig. 2.
For the implementation in this paper, samples, X1i, i=1,2,..,N, from X1 form the N-dimensional
input feature vector for the input layer. The input layer is fully connected to the hidden layer
through the weights Wji, j=1,2,..n, where n is the number of hidden nodes. The hidden layer is
connected to the output layer through the weights Wj, j=1,2,..,n, and hyperbolic tangents are used
for activation functions. The bias inputs and associated weights are included in practise in order
to accommodate dc offsets.
During the training phase, the NN weights are updated, from small randomly selected initial
values, using the back-propagation algorithm, [15]. The update equations are given by
Wj(k+1)= Wj(k) + µ1aje(k), and
Wji(k+1)= Wji(k) + µ2Wj(k)f'(sj(k))X1i(k), where
k is the time index, µ1 and µ2 are the learning rates for layers 1 and 2 respectively, f'(z) is the
derivative of the activation function evaluated at z, and e(k) is the cancellation error at time
index k. Appropriate values for N, n, µ1, and µ2 are determined by performance and stability
criteria to be considered in Section IV as part of two in vivo experiments.
These experiments require the acquisition of SEP and SA data, and the following section
describes the acquisition methodology.
6
Figure 2 about here
III Methodology
A block diagram of the measurement system for acquiring SEP from the median nerve is
shown in Fig. 3. Median nerve sensory fibres are stimulated at the finger and SEP acquired at
the wrist using surface Ag/AgCl electrodes. The resulting SEP and SA are amplified over the
bandwidth 10 Hz to 10 kHz, sampled at a rate of 25.6 kHz, and the digital data stored for
subsequent off-line processing. Following each stimulus, a data window of 10 ms or 256
samples is recorded. The Bruel and Kjaer 2032 signal analyser is used in an on-line ensemble
averaging mode in order to confirm the presence of SEP during the recording session. In order
to ensure no SEP crosstalk between the primary and reference inputs to the ANC filter, a
necessary condition for an ANC filter, a double-stimulus single-channel technique is used. In
this technique a series of alternating sub and supra threshold stimuli are applied to the finger.
The responses to the subthreshold stimuli are used as the reference data, and those to the
suprathreshold as the primary data. Thus no channel crosstalk is possible. This technique also
has the advantage of requiring only a single data acquisition channel and associated hardware.
Two Grass S11 stimulators with isolators are used to generate the series of stimuli. Figure 4
shows a typical series of stimuli and acquired data for this technique.
Figure 3 about here
7
Figure 4 about here
IV Experiment 1
In order to optimize the NN parameters and evaluate the NN ANC performance, the
first experiment acquired data without the presence of SEP. In this way the cancellation
performance can be examined over the full SA duration, including the time interval in which the
SA would otherwise overlap the SEP. If the SEP were present and unknown it would be
impossible to precisely evaluate the filter's performance in cancelling the SA. At the same time
the network parameters can be studied and optimised --in particular network size and training
data window will be investigated. The methodology of Section III was used to collect 300
primary and reference records. However, for this experiment both reference and primary stimuli
were subthreshold, thus ensuring no SEP in either channel.
Because the SA and SEP are correlated, it is essential to ensure that during NN training with
SEP present, the primary channel training data not contain SEP. Otherwise the NN filter will
adapt to cancel not only SA but also the SEP. Thus for training, the record time windows were
divided into two segments, P and P', such that P segment contains no SEP and P', which is the
8
remainder, contains the SEP. This training method is referred to as segmented-training. Fig.5
shows an example of the segments P and P'. Due to the finite conduction velocity of the nerve
Figure 5 about here
fibre, there is always a time delay between the stimulus application and the onset of the SEP.
This interval determines the magnitude of the time segment P. The value of P for practical
application can thus be based on the minimum expected nerve conduction delay between
stimulus and SEP arrival at the recording electrodes. For the median nerve and this
measurement setup, a value of 40 samples or 1.56 ms is appropriate. In other applications an
appropriate P value can always be estimated. It is important that this estimation and
segmentation be carefully done as the algorithm training is sensitive to the partitioning– if the P
segment is too short poor generalization of the filter will result and if too long some cancellation
of the SEP can occur.
The performance measures for the AN filter relate to residual SA in the filter output as a ratio
to the SA in the primary channel. Specifically the measures are ρ1, ρ2, and ρ3 as defined by
ρ1 = |SAp(k)|max / |SAo(k)|max , k∈P,
ρ2 = σSAp (k) / σSAo (k), k∈P, and
ρ3 = σSAp (k) / σSAo (k), k∈P',
where SAp(k) and SAo(k) are the SA components at time-index k in the primary and output
respectively, σX is the standard deviation of the random variable X, P is a time segment of the
record window, and P' is a time segment containing the remainder of the record window. ρ1 is a
9
measure of the reduction of the SA peak, and ρ2 and ρ3 are measures of the power reduction of
the SA in segments P and P' respectively. Since the NN was trained on the segment P, which
contains only a portion of the SA, the ability of the NN to generalise and accurately generate Ap
from Ar in both P and P' is critical to the AN filter performance. Thus the performance measure
ρ3 is of particular significance.
The 300 acquired response records were divided randomly into a 150-record training set and
a 150-record test set. The primary and reference inputs used to train the NN were generated by
averaging the training set data. These inputs were made up of the first 40 data points, preceded
by a 20 point zero pad. The training was continued until the incremental change in the mean
squared error with each iteration was not significant, indicating convergence of the network. The
NN size was optimized by varying the number of input and hidden nodes, training the network
for each combination, and determining the values of ρ1, ρ2, and ρ3 for the test set. The number of
input and hidden nodes were varied over 4 through 20, and 2 through 14 respectively. Data were
collected and processed for 6 subjects, and the results for one, which is typical of the 6, are
shown in Fig.6. After considering the results for all subjects, and keeping in mind that ρ3 is of
particular importance here, the combination of 12 input nodes and 6 hidden nodes was selected
as optimum for further implementations. The low ρ3 value is due to the fact that the SA power
in P' of the primary channel is comparable to the uncorrelated channel noise power, hence
limiting the maximum value that ρ3 can attain.
Figure 6 about here
10
To determine the generalisation ability of the filter, the length of the training section was
varied over 20 through 100 data points from a record length of 256 points, the NN trained, and
the performance measure ρ3 calculated (P= 40). The results are shown in Fig.7. Again ρ3 is the
measure of most significance since it measures the performance in P', and hence the ability of the
NN to generalise. The value of ρ3 is fairly constant over the training range tested indicating that
the NN is generalising well (this observation will be considered in more detail in Section VI).
Also shown in Fig.7 is the performance of a nonlinear FIR AN filter with 10 taps, [10], trained
with the Recursive Least Squares algorithm, (NRLS), and operating on the same data. For this
Figure 7 about here
filter, ρ3 generally continues to increase with P indicating that its generalising ability is not as
strong as the NN for P in the range necessary for practical application, ie P≤40 in this case.
V Experiment 2
In this experiment the NN ANC filter of Section IV was used to cancel SA in SA plus
11
SEP signal from 6 subjects. The methodology of Section III, using both sub and supra threshold
stimuli as per Fig. 4, was used to acquire the data. As in Section IV, 300 primary and reference
records were acquired, and these were randomly split into a 150-record training set and a 150record test set. In order to test the practical effectiveness of the filters, the final outputs were
obtained by filtering each of the 150 records in the test set, and averaging the 150 results.
The primary and reference inputs used to train the NN were generated by averaging the
training set data. These inputs were made up of the first 40 data points, preceded by a 20 point
zero pad. The training was continued until the incremental change in the mean squared error
with each iteration was not significant, indicating convergence of the network. The resulting
filter was then used to filter the 150 records in the test set, and the 150-record filter output was
averaged to obtain the final output. The training of the NRLS filter was slightly different. It was
found that when the NRLS filter is trained on averaged data, the kernel values upon which it
converges are large numbers. If a signal is filtered that is slightly different than that used to train
the filter, the output contains a very high residual interference. To prevent the NRLS filter from
converging on large kernel values, it was trained on raw data, rather than averaged data. All 150
primary and reference pairs in the training set were used to train the filter. Each record was
composed of a 20 point zero pad followed by 40 points of data. The resulting filter was then used
to filter the 150 records in the test set, and the 150-record filter output was averaged to obtain the
final output. It should be noted that the performance measure ρ3 cannot be used in this
experiment. Since the SEP is present and unknown, the residual SA following cancellation
cannot be evaluated quantitatively. Therefore performance evaluation in the segment P_ must be
a qualitative observation of the SA reduction in the overlap region, relative to the SEP.
12
Table 1 gives ρ1 and ρ2 values for the two adaptive filters, NN and NRLS FIR, used in the
AN filter over the 6 subjects. It is evident that both give substantial reductions in the SA over
the segment P. It is also evident that the NN gives a somewhat larger reduction which is to be
expected given that the NRLS is able to model 2nd order nonlinearities only while the NN can
model nonlinearities of arbitrary order. A paired t-test shows a significant difference in their
performances at the 5 % significance level.
NRLS
NN
NRLS
NN
Subject 1
73
88
72
70
Subject 2
11
53
12
72
Subject 3
27
50
20
46
Subject 4
30
67
24
54
Subject 5
31
44
39
52
Subject 6
24
40
31
37
Table 1: Performance measures ρ1 and ρ2 for Experiment 2 over 6 subjects.
Figure 8 shows the results of the SA cancellation for the NN and NRLS FIR adaptive filters
for 3 subjects which are typical of the 6 subjects. These figures allow a qualitative assessment of
the cancellation in the P_ segment (k>40) where the SA and SEP overlap, as well as in the P
segment (k≤40). The relative abilities of the NN and NRLS adaptive filters to generalise from P
to P_ are particularly evident in Fig. 8b.
13
Figure 8 about here
VI Conclusions and Discussion
The overlap of SA with SEP can cause significant signal distortion and onset timing errors. An
ANC filter with a NN as the adaptive element was shown to provide substantial SA reduction.
An MLP with 12 input and 6 hidden nodes, and segmented training achieved a SA reduction in
the P segment of 35 dB averaged over 6 subjects. Since the NN can model a nonlinearity of
arbitrary order it was able to achieve a larger reduction than an NRLS FIR AN filter based on the
truncated 2nd order Volterra model. It was also demonstrated that the NN is able to provide
better generalization from the P to the P_ segments. The SA cancellation in the P_ interval was
shown qualitatively to be excellent.
Figure 7 demonstrates the power of the NN in generalizing, and identifying/modelling the
nonlinear system between the primary and reference channels. The fact that ρ3 is relatively
constant over the length of P investigated indicates that it is able to capture and model the
nonlinear system well with as few as 20 samples per evoked response. For this reason, the NN
AN filter trained in the P segment is able to provide excellent SA cancellation in the P_ segment.
In contrast, the NRLS AN filter requires a P of approximately 70 samples to achieve the same
14
performance, by which time the SA is typically overlapping the SEP. The difference is no doubt
due to the 2nd order nonlinearity modelling limitation of the NRLS, as well as the structural
differences in the NN and FIR adaptive filters. The dip in ρ3 seen for both filters as P increases
is likely due to the fact that in the early part of P the SA is very large compared to the
uncorrelated noise, and hence the adaption converges well. However in the later part of P the
SA falls rapidly to values comparable to the uncorrelated noise and the adaption and
convergence are disturbed. A corresponding drop in performance occurs until P is sufficiently
large to allow the adaption to recover.
From the results of Table 1 and the small residual SA of Figure 8, it is clear that the NN
adaptive filter has converged well and has identified/modelled the nonlinear system ha(t) well.
When considering training and convergence times, it is noted that in the case where the SA is
stationary, the NN can be trained, the coefficients fixed, and SEP measurements made on-line or
off-line. In the case of nonstationary SA, the NN adaptive filter can be allowed to track SA
changes over time by using appropriate values for µ1 and µ2 in the update algorithm, [11].
Tracking and convergence properties of the filter in this mode need to be investigated further in a
continuation of this work.
VII References
[1] C. D. McGillem, J. I. Aunon, and D. G. Childers, "Signal processing in evoked potential
research: Applications of filtering and pattern recognition", CRC Crit. Rev. Bioeng., Vol. 9, pp.
15
225-265, Oct. 1981.
[2] S. B. Harrison and D. F. Lovely, “Identification of noise sources in surface recordings of
spinal somatosensory evoked potentials”, Med. & Biol. Eng. & Comp., Vol. 33, pp. 299-305,
1995.
[3] D. Regan, “Human Brain Electrophysiology”, Elsevier Science Publishing Co., New York,
1989.
[4] K. C. McGill, K. L. Cummins, B. B. Dorfman, L. J. Berlizot, K. Luetkemeyer, D. G.
Nishimura, and B. Widrow, " On the nature and elimination of stimulus artifact in nerve signals
evoked and recorded using surface electrodes", IEEE Trans. Biomed. Eng., Vol. BME-29, pp.
129-137, 1982.
[5] R. N. Scott, L. McLean, and P. A. Parker, "Stimulus artifact in somatosensory evoked
potential measurement”, Med. & Biol. Eng. & Comp., Vol. 35, pp. 1-5, 1997.
[6] J. Ye and T. T. Choy, " On-line respiratory artifact removal via adaptive FIR filters in
rheopneumographic measurements", Med. & Biol. Eng. & Comp., Vol. 32, pp. 620-624, 1994.
[7] P. K. Sadasivan and D. Narayana Dutt, " A non-linear estimation model for adaptive
minimization of EOG artefacts from EEG signals", Intl. Journal of Biomed. Comp., Vol. 36, pp.
199-207, 1994
16
[8] N. V. Thakor and Y. S. Shu, " Applications of adaptive filtering to ECG analysis: noise
cancellation and arrhythmia detection", IEEE Trans Biomed. Eng., Vol. BME-38, pp. 785-793,
1991.
[9] A. Suzuki, C. Sumi, K. Nakayama, and M. Mori, "Real-time adaptive cancelling of ambient
noise in lung sound measurements", Med. & Biol. Eng. & Comp., Vol. 33, pp. 704-708, 1995.
[10] V. Parsa, P. A. Parker and R. N. Scott, "Adaptive stimulus artifact reduction in noncortical
somatosensory evoked potential studies", IEEE Trans. Biomed. Eng., Vol. BME-45, pp. 165179, 1998.
[11] V. Parsa, P. A. Parker and R. N. Scott, " Convergence characteristics of two algorithms in
non-linear stimulus artifact cancellation for electrically evoked potential enhancement”, Med. &
Biol. Eng. & Comp., Vol. 36, pp. 1-13, 1998.
[12] W. G. Stevens, "The current-voltage relationship in human skin", Med. Electron. Biol.
Eng., Vol.1, pp. 389-399, 1963.
[13] M. Yuwaraj, A. Gluzmann, P. Madsen and H. Kunov, " Adaptive stimulus artifact
cancellation in otoacoustic emission testing", Proceedings of the 20th Canadian Medical and
Biological Engineering Conference, pp. 106-107, 1994.
17
[14] R. Grieve, P. A. Parker and B. Hudgins, "Adaptive stimulus artifact cancellation in
biological signals using neural networks", Proceedings of the 17th IEEE EMBES conference,
Montreal, Sept. 1995.
[15] S. Haykin, " Neural Networks: A Comprehensive Foundation", Macmillan College
Publishing Company, New York, 1994.
Acknowledgements
This work was supported in part by an NSERC Postgraduate Scholarship and NSERC grant #
OGP0004445.
18
Figure and Table Captions
Figure 1: Adaptive noise cancelling (AN) filter structure.
Figure 2: Feedforward neural network structure.
Figure 3: Stimulation and acquisition system.
Figure 4: Alternating sub and supra electrical stimuli (4a) and resulting SA and SEP(4b}.
Figure 5: The P and P_ segments for segmented training. The filter is
trained on the P segment where no SEP is present.
Figure 6: Network size test on subthreshold stimulus data showing
performance as a function of the number of input and hidden nodes.
Figure 7: Performance ρ3 versus length of training segment.
Figure 8: SA cancellation results from Experiment 2 for subjects a, b, and c.
Table 1: Performance measures ρ1 and ρ2 for Experiment 2 over 6 subjects.
19
Table 1
ρ1
ρ2
NRLS
NN
NRLS
NN
Subject 1
73
88
72
70
Subject 2
11
53
12
7
Subject 3
27
50
20
46
Subject 4
30
67
24
54
Subject 5
31
44
39
52
Subject 6
24
40
31
37
20
Xp=s+Ap+np
Âp
Reference
input
Xr=Ar+nr
Figure 1
21
Bias i/p
W b1
W b2
W bn
Figure 2.
22
Figure 3
23
S am p le N u m b er
S am p le N u m b er
Figure 4 :
24
Figure 5:
25
ρ1
ρ1
# of Hidden Nodes
# of Input Nodes
ρ2
ρ2
# of Input Nodes
ρ3
# of Hidden Nodes
ρ3
# of Input Nodes
# of Hidden Nodes
Figure 6:
26
Neural Network
ρ3
NRLS
Length of Training Section, P
Figure 7
27
Figure 8:
28