#1782
Efficient Nonparametric Spectral Estimation with
Varying Bandwidth
Yudianto Pawitan
University of Washington, Seattle
and
Ashis K. Gangopadhyay
University of North Carolina, Chapel Hill
Abstract.
We consider the estimation of a spectral density at a particular frequency. A linear model is developed based on a weak convergence result expressing the local behaviour of the kernel spectral estimates, and a second stage estimate is derived from this model. The estimate is shown to be asymptotically normal, and its ARE with respect to the kernel estimate with the optimal, but unknown, bandwidth can be greater than one. A small sample simulation study is conducted which shows good mean square characteristics of the second stage estimate. Practical implementation is discussed through examples.
Keywords.
Cumulant mixing, kernel smoothing, optimal bandwidth, weak convergence, spectral estimate.
1. Introduction.
Let $X_t$ be a stationary time series observed at discrete time $t = 0, \pm 1, \pm 2, \ldots$. The second order properties of $X_t$ are equivalently described by the covariance function
$$\gamma(m) = \mathrm{Cov}(X_t, X_{t+m})$$
and by the spectral density function
$$f(v) = \sum_m \gamma(m) \exp(-2\pi i v m).$$
There is a long history to the problem of estimating $f(v)$ from a finite sample $X_0, \ldots, X_{T-1}$ (see, e.g., Priestley, 1981). The traditional approach has been the nonparametric smoothing of the periodograms or the covariance transform, but, as pointed out by Wahba (1980), the choice of the important smoothing parameter (the window width) is largely left subjective. She then proposed to fit a cross-validated smoothing spline to the log spectrum. Beltrao and Bloomfield (1987) established a cross-validated choice of the bandwidth in kernel spectrum estimation. The approach is global in that it yields one value of bandwidth to use for the whole range of frequency. Cameron (1987) proposed an automatic step function fit to the spectrum based on some optimal partitioning of the frequency band. As he commented, the procedure seems to give a good exploratory estimate of the spectrum, but it does not give a smooth estimate.
The paper is motivated by the approach of Bhattacharya and Mack (1987) and Bhattacharya and Gangopadhyay (1988) in which the problem of choosing a (local) optimal bandwidth is bypassed by carrying out the estimation at several bandwidths and then finding a second stage estimate as a solution of an asymptotic linear model. Section 2 states the conditions and a weak convergence result that motivates the linear model. We develop the estimate in Section 3 and discuss its properties in Section 4. We show that the estimate is asymptotically efficient in mean square relative to the estimate using the optimal bandwidth. This is a good property since the optimal bandwidth is unknown. We then show some simulation results and examples in Section 5.

2. PRELIMINARY RESULT
Let $X_t$ be a zero mean stationary time series observed at $t = 0, 1, \ldots, T-1$. Define the periodogram at frequency $v$ as
$$I(v) = \frac{1}{T}\Big|\sum_t X_t \exp(-2\pi i v t)\Big|^2. \tag{1}$$
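The periodogram (1) can be evaluated directly from its definition. The following is a minimal sketch, assuming the $1/T$ normalisation; the function name `periodogram` is illustrative, and a direct sum is used here rather than a fast Fourier transform.

```python
import cmath
import math

def periodogram(x, v):
    """I(v) = |sum_t x_t exp(-2*pi*i*v*t)|^2 / T for a zero mean series x."""
    T = len(x)
    s = sum(xt * cmath.exp(-2j * math.pi * v * t) for t, xt in enumerate(x))
    return abs(s) ** 2 / T

# A series alternating in sign has all of its power at frequency 1/2.
x = [1.0, -1.0, 1.0, -1.0]
print(periodogram(x, 0.5), periodogram(x, 0.0))
```

For a series of length $T$ the direct sum costs $O(T)$ per frequency, which is adequate for the sample sizes considered here.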
A smooth periodogram estimate of the spectral density is
$$\hat f_T(v) = \int_{-.5}^{.5} K_T(\lambda)\, I(v-\lambda)\, d\lambda, \tag{2}$$
where $K_T(\lambda) = B_T^{-1} K(\lambda B_T^{-1})$ and $K(\lambda)$ is a symmetric kernel defined on $[-.5, .5]$ having bounded first derivative and $\int K(\lambda)\,d\lambda = 1$. $B_T$ is a smoothing parameter, called the bandwidth, and $K(\lambda)$ is called the spectral window. It is well-known that the shape of $K(\lambda)$ is not as important as the size of $B_T$. For later reference denote
$$\alpha = \int \lambda^2 K(\lambda)\, d\lambda. \tag{3}$$
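As a worked instance of (3), take the rectangular window $K(\lambda) = 1$ on $[-.5, .5]$, which is the window used later for the averaged periodogram, and write $\alpha$ for the constant defined in (3):

```latex
\alpha = \int_{-1/2}^{1/2} \lambda^2 \, d\lambda
       = \left[ \tfrac{1}{3}\lambda^3 \right]_{-1/2}^{1/2}
       = \tfrac{1}{12}.
```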
At this point we introduce the following assumptions:
[A1]. The time series $X_t$ is zero mean strictly stationary and cumulant mixing in the sense of Brillinger (1981), i.e.
$$\sum_{a_1,\ldots,a_{k-1}} |a_j\, \mathrm{cum}(a_1,\ldots,a_{k-1})| < \infty$$
for $j = 1, 2, \ldots, k-1$ and $k = 2, 3, \ldots$, where
$$\mathrm{cum}(a_1,\ldots,a_{k-1}) = \mathrm{cum}(X_t, X_{t+a_1}, \ldots, X_{t+a_{k-1}})$$
is the joint cumulant of order $k$ of $X_t$.
[A2]. The spectral density of $X_t$ satisfies
(i) $f(v) > 0$,
(ii) $f''$ is continuous at $v$.
Simple calculation of the mean square error of (2) (see, e.g., Brillinger, 1981, §5.6) shows that an optimal rate for $B_T$ is of order $T^{-1/5}$, so we consider $B_T = tT^{-1/5}$. Denote by $\hat f_t(v)$ the estimate (2) using $B_T = tT^{-1/5}$; we will investigate the behaviour of $\hat f_t(v)$, for fixed $v$, as a stochastic process with index $t$ when $T \to \infty$. The symbol $\Rightarrow$ indicates the weak convergence of distributions.
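Theorem 1. Under A1 and A2, for any $0 < a < b < \infty$, as $T \to \infty$ we have
$$T^{2/5}\{\hat f_t(v) - f(v)\} \Rightarrow \tfrac{1}{2}\alpha f''(v)\,t^2 + f(v)\,t^{-1}W(t) \tag{4}$$
for $t \in [a,b]$, where $W(t)$ is the standard Brownian motion.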
Proof. From (2) and Brillinger (1981),
$$E\hat f_t(v) = B_T^{-1}\int K(\lambda B_T^{-1})\, E I(v-\lambda)\, d\lambda = B_T^{-1}\int K(\lambda B_T^{-1})\,\{f(v-\lambda) + O(T^{-1})\}\, d\lambda.$$
Under A2, take a second order expansion of $f$ around $v$ and use (3) to get
$$E\hat f_t(v) = f(v) + \tfrac{1}{2}\alpha f''(v) B_T^2 + o(B_T^2). \tag{5}$$
Now write the LHS of (4) as
$$T^{2/5}(\hat f_t - f) = T^{2/5}(\hat f_t - E\hat f_t) + \tfrac{1}{2}\alpha f'' t^2, \tag{6}$$
where we have dropped $v$ and $o(B_T^2)$ for convenience. For a finite number of $t_1, \ldots, t_p$ the joint distribution of
$$\{T^{2/5}(\hat f_{t_i} - E\hat f_{t_i})\} \Rightarrow \{f\, t_i^{-1} W(t_i)\}, \tag{7}$$
which follows from Brillinger (1981).
The proof is completed by showing the tightness of
$$Y_T(t) \equiv \frac{\sqrt{B_T T}}{\sqrt{t}}\,(\hat f_t - E\hat f_t), \qquad t \in [a,b].$$
From Prokhorov (1956, Lemma 2.2) tightness is implied by
$$E|Y_T(s) - Y_T(t)|^A \le C|s-t|^B \tag{8}$$
for any $s, t \in [a,b]$ and all $T$ large, where $A > 0$, $B > 1$ and $C > 0$ are constants independent of $T$. It is straightforward to show from (7) that
$$\sqrt{\frac{st}{|s-t|}}\,\{Y_T(t) - Y_T(s)\} \Rightarrow f\,N(0,1). \tag{9}$$
Since the cumulants of $Y_T(t)$ converge, it follows that
$$E\Big|\sqrt{\frac{st}{|s-t|}}\,\{Y_T(t) - Y_T(s)\}\Big|^4 \to E|f\,N(0,1)|^4 = 3f^4. \tag{10}$$
By simple algebra, we can choose any $\varepsilon > 0$ and $T_0$ accordingly such that
$$E|Y_T(t) - Y_T(s)|^4 \le (3f^4 + \varepsilon)\frac{|s-t|^2}{(st)^2} \le a^{-4}(3f^4+\varepsilon)|s-t|^2 \tag{11}$$
for all $T > T_0$ and any $s, t \in [a,b]$. Thus (8) is satisfied and the proof is complete.

3. THE PROPOSED ESTIMATOR
Proceeding as in Bhattacharya and Mack (1987) we note that the weak convergence result in Theorem 1 suggests a linear model
$$\hat f_t = f + \tfrac{1}{2}\alpha f'' B_T^2 + f e_t \tag{12}$$
for every $t \in [a,b]$, where $e_t \approx t^{-1}T^{-2/5}W(t)$. Clearly $E e_t = 0$ and
$$\mathrm{Cov}(e_{t_1}, e_{t_2}) = T^{-4/5}\min(t_1^{-1}, t_2^{-1}).$$
The best linear estimate of the intercept in (12) is our second stage estimate of $f$. We will now develop this estimate in a practical form.
It was convenient to prove the result in Section 2 using the continuously smoothed periodogram (2), but in practice $I(\lambda)$ is computed at the discrete Fourier frequencies of the form $k/T$, $k = 0, 1, \ldots, [T/2]$, using the fast Fourier transform algorithm. Denote $L = [B_T T] = [tT^{4/5}]$, $0 < a \le t \le b$, $L_0 = [aT^{4/5}]$ and $L_1 = [bT^{4/5}]$, so $L$ takes values $L_0, L_0+1, \ldots, L_1$. We will consider the rectangular window, hence (2) becomes the simple averaged periodogram
$$\hat f_L(v) = \frac{1}{L}\sum_{\ell} I(\ell/T), \tag{13}$$
where the summation is over $\ell$ satisfying $|\ell/T - v| \le B_T/2$. $B_T$, then, is approximately $L/T$. Asymptotic equivalence, in the almost sure sense, between discrete and continuous smoothing is stated in Brillinger (1981, Thm. 5.9.1).
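The simple averaged periodogram (13) can be sketched as follows. This is a minimal illustration rather than the authors' code: the function names are hypothetical, a direct DFT stands in for the FFT, and wrapping indices circularly (relying on the periodogram having period 1 and being symmetric for a real series) is an assumed way to treat frequencies near 0 and 1/2.

```python
import cmath
import math

def fourier_periodogram(x):
    """I(k/T) for k = 0, ..., T-1, computed by a direct DFT."""
    T = len(x)
    out = []
    for k in range(T):
        s = sum(xt * cmath.exp(-2j * math.pi * k * t / T)
                for t, xt in enumerate(x))
        out.append(abs(s) ** 2 / T)
    return out

def averaged_periodogram(I, k0, L):
    """Mean of L consecutive ordinates roughly centred at index k0.

    Indices wrap modulo T, so ordinates below frequency 0 are taken
    from the mirror-image frequencies (an assumption, see lead-in).
    """
    T = len(I)
    start = k0 - (L - 1) // 2
    return sum(I[(start + j) % T] for j in range(L)) / L

I = fourier_periodogram([1.0, -1.0, 1.0, -1.0])
print(averaged_periodogram(I, 2, 3))
```

With $L$ odd the window is symmetric about $k_0/T$; with $L$ even it extends one ordinate further to the right.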
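The linear model (12) becomes
$$\hat f_L = f + \beta (L/T)^2 + f e_L, \tag{14}$$
where $L = L_0, L_0+1, \ldots, L_1$, $\beta = \tfrac{1}{2}\alpha f''$ and $\mathrm{Cov}(e_{L'}, e_{L''}) = \min(1/L', 1/L'')$. The second stage estimate of $f$ is the BLUE of $f$ in (14), which is
$$\hat{\hat f} = \hat f_{L_1} - \hat\beta (L_1/T)^2, \tag{15}$$
where $\hat\beta$ will be derived below and is shown in (20).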
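Define
$$\Delta_L = \{L(L+1)\}^{1/2}(e_{L+1} - e_L), \qquad L = L_0, \ldots, L_1 - 1,$$
and
$$\Delta_{L_1} = L_1^{1/2}\, e_{L_1}.$$
Thus $\{\Delta_L,\ L_0 \le L \le L_1\}$ are mutually uncorrelated with mean 0 and variance 1. From (14) we define the following
$$V_L = \{L(L+1)\}^{1/2}(\hat f_{L+1} - \hat f_L) = \beta u_L + f\Delta_L, \qquad L = L_0, \ldots, L_1-1, \tag{16}$$
$$V_{L_1} = L_1^{1/2}\hat f_{L_1} = L_1^{1/2} f + \beta u_{L_1} + f\Delta_{L_1}, \tag{17}$$
where
$$u_L = \{L(L+1)\}^{1/2}(2L+1)/T^2, \qquad L = L_0,\ldots,L_1-1, \tag{18}$$
$$u_{L_1} = L_1^{1/2}(L_1/T)^2. \tag{19}$$
The BLUE of $\beta$ in the linear model defined by (16)-(19) is
$$\hat\beta = \frac{\sum_{L=L_0}^{L_1-1} u_L V_L}{\sum_{L=L_0}^{L_1-1} u_L^2}. \tag{20}$$
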
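The second stage estimate can be sketched in a few lines. This is a minimal illustration under stated assumptions: the scaled differences are taken as $V_L = \{L(L+1)\}^{1/2}(\hat f_{L+1} - \hat f_L)$ with regressors $u_L = \{L(L+1)\}^{1/2}(2L+1)/T^2$, which is what (14) with $\mathrm{Cov}(e_{L'}, e_{L''}) = \min(1/L', 1/L'')$ implies; the function name `second_stage` and the dictionary input are hypothetical.

```python
import math

def second_stage(f_hat, L0, L1, T):
    """Return (second stage estimate of f, estimate of beta).

    f_hat maps each L in L0..L1 to the averaged periodogram with L
    ordinates at one fixed frequency.
    """
    num = 0.0
    den = 0.0
    for L in range(L0, L1):
        w = math.sqrt(L * (L + 1))          # makes the differenced errors unit variance
        u = w * (2 * L + 1) / T ** 2        # regressor for beta
        V = w * (f_hat[L + 1] - f_hat[L])   # scaled first difference
        num += u * V
        den += u * u
    beta = num / den                        # least squares slope through the origin
    return f_hat[L1] - beta * (L1 / T) ** 2, beta

# Noise-free check: f_hat[L] = f + beta * (L/T)^2 with f = 2, beta = 300.
T = 256
f_hat = {L: 2.0 + 300.0 * (L / T) ** 2 for L in range(10, 36)}
print(second_stage(f_hat, 10, 35, T))
```

Differencing removes the intercept $f$, so the slope is estimated from the differences alone and the intercept is then read off from the smoothest first stage estimate $\hat f_{L_1}$.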
4. ASYMPTOTIC PROPERTIES OF $\hat{\hat f}$
To study the asymptotic properties of $\hat{\hat f}$, note that
$$\sum_{L=L_0}^{L_1-1} u_L^2 = T^{-4}\sum_{L=L_0}^{L_1-1} L(L+1)(2L+1)^2 \to \tfrac{4}{5}(b^5 - a^5) \tag{21}$$
and
$$\sum_{L=L_0}^{L_1-1} u_L V_L \Rightarrow \tfrac{4}{5}(b^5-a^5)\,\beta + f\Big[-2a^2W(a) + 2b^2W(b) - 6\int_a^b tW(t)\,dt\Big] \tag{22}$$
by virtue of the weak convergence result in Theorem 1. Now, substituting (21) and (22) in (15) and (20) we get the following.
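Theorem 2: Under the regularity conditions A1 and A2,
(i) a locally asymptotically best linear unbiased estimate of the spectral density $f$ is given by
$$\hat{\hat f} = \hat f_{L_1} - \hat\beta (L_1/T)^2, \tag{23}$$
where
$$\hat\beta = \frac{\sum_{L=L_0}^{L_1-1} u_L V_L}{\sum_{L=L_0}^{L_1-1} u_L^2}, \tag{24}$$
where $u_L$ and $V_L$ are defined in (16)-(19).
(ii) We also have
$$\{T^{2/5}(\hat{\hat f} - f),\ (\hat\beta - \beta)\} \Rightarrow f(\varepsilon, \eta), \tag{25}$$
where $(\varepsilon, \eta)$ is bivariate normal with mean $(0,0)$ and covariance
$$\begin{bmatrix} (A+1)/b & -A b^{-3} \\ -A b^{-3} & A b^{-5} \end{bmatrix}, \qquad \text{where } A = \tfrac{5}{4}\{1 - (a/b)^5\}^{-1}. \tag{26}$$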
Remark 1. Note that by Theorem 1, the asymptotic MSE (AMSE) of $\hat f_t$ is
$$T^{-4/5}\{\beta^2 t^4 + f^2 t^{-1}\}, \qquad \beta = \tfrac{1}{2}\alpha f'', \tag{27}$$
and it is minimized at
$$t_0 = \{f^2/(4\beta^2)\}^{1/5}. \tag{28}$$
So, if we define the asymptotic relative efficiency (ARE) of a given estimator of $f$ with respect to $\hat f_{t_0}$ as the ratio of the AMSE of $\hat f_{t_0}$ to that of the given estimator, then the ARE of $\hat{\hat f}$ is given by
$$\tfrac{5}{4}\, t_0^{-1} / \{(A+1)/b\} = b\, t_0^{-1}\big[\tfrac{4}{5} + \{1-(a/b)^5\}^{-1}\big]^{-1} \tag{29}$$
using Theorem 2 (ii). This suggests that we choose $b$ sufficiently large so as to make the ARE large. However the choice of $b$ is limited since $L_1 = [bT^{4/5}] < T$ and by the fact that we only use a second order approximation.
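To see how the ARE in Remark 1 behaves as $b$ grows, it can be evaluated numerically. The sketch below assumes the forms $A = \tfrac{5}{4}\{1-(a/b)^5\}^{-1}$ and $\mathrm{ARE} = \tfrac{5}{4}\,b / \{t_0 (A+1)\}$; the function name `are` and the chosen values of $a$, $b$ and $t_0$ are purely illustrative.

```python
def are(a, b, t0):
    """ARE of the second stage estimate relative to the kernel estimate at t0.

    Assumes A = (5/4) / (1 - (a/b)^5) and ARE = (5/4) * b / (t0 * (A + 1)).
    """
    A = 1.25 / (1.0 - (a / b) ** 5)
    return 1.25 * b / (t0 * (A + 1.0))

# With a and t0 held fixed, the ARE increases with b and passes 1.
for b in (1.0, 2.0, 4.0):
    print(b, are(0.5, b, 1.0))
```

Under these assumptions the ARE grows roughly linearly in $b/t_0$ once $a/b$ is small, which is the sense in which a large $b$ is favourable.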
Remark 2. It should be noted that the second term of the estimator $\hat{\hat f}$ in (23) attempts to correct the bias of the first stage estimate at those frequencies where $f''$ is large.
Remark 3. A modification is necessary near the boundary of the frequency range.

5. SIMULATION AND EXAMPLES
In this section we will discuss some properties of $\hat{\hat f}$ by way of a Monte Carlo simulation of an ARMA model. It will also address the practical question of choosing $L_1$. The example shows that when $L_1$ is large $\hat{\hat f}$ achieves comparable MSE properties to those of the simple averaged periodograms with optimal (but, in practice, unknown) smoothing parameter $L$. In agreement with Remark 1 this suggests that $L_1$ be chosen large enough such that $\hat f_{L_1}$ oversmooths but $\hat{\hat f}$ still shows a stable appearance over successive $L_1$. This point will be elaborated later.
(a) ARMA(2,2) model.
We generate $x_t = -.25\,x_{t-1} - .5\,x_{t-2} + e_t - e_{t-1} - .75\,e_{t-2}$, for $t = 0, 1, \ldots, 255$, where $e_t$ are iid $N(0,1)$. This model is used in Beamish and Priestley (1981). The simulation is done in the GAUSS programming language (1986), which uses the fast acceptance-rejection algorithm proposed by Kinderman and Ramage (1976) to generate the white noise series. The process starts from zero and the first 200 points are dropped so as to get stationarity. The results are based on 100 replications. We use $L_0 = 10$ and consider various values for $L_1$.
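The data generation step described above can be sketched as follows. This is an illustrative reimplementation, not the original GAUSS program: it uses Python's `random.gauss` for the white noise rather than the Kinderman and Ramage algorithm, and the function name and `seed` argument are hypothetical.

```python
import random

def simulate_arma22(n=256, burn_in=200, seed=None):
    """Simulate x_t = -.25 x_{t-1} - .5 x_{t-2} + e_t - e_{t-1} - .75 e_{t-2}.

    The recursion starts from zero and the first burn_in points are dropped.
    """
    rng = random.Random(seed)
    total = n + burn_in
    e = [rng.gauss(0.0, 1.0) for _ in range(total)]
    x = [0.0] * total
    for t in range(total):
        x1 = x[t - 1] if t >= 1 else 0.0
        x2 = x[t - 2] if t >= 2 else 0.0
        e1 = e[t - 1] if t >= 1 else 0.0
        e2 = e[t - 2] if t >= 2 else 0.0
        x[t] = -0.25 * x1 - 0.5 * x2 + e[t] - e1 - 0.75 * e2
    return x[burn_in:]

# 100 replications of length 256, as in the simulation design.
replications = [simulate_arma22(seed=j) for j in range(100)]
print(len(replications), len(replications[0]))
```

Seeding each replication separately keeps the runs reproducible while still giving distinct noise sequences.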
Fig. 1. Integrated MSE (IMSE) of the simple averaged periodogram (solid line) as a function of $L$, and IMSE of $\hat{\hat f}$ as a function of $L_1$ (dashed line).
Figure 1 shows the integrated MSE (IMSE) of the simple averaged periodogram, calculated by
$$\mathrm{IMSE}(L) = \sum_{k=1} \Big\{ \frac{1}{100}\sum_{j=1}^{100} \Big(\hat f_{Lj}\big(\tfrac{k-1}{64}\big) - f\big(\tfrac{k-1}{64}\big)\Big)^2 \Big\}, \tag{30}$$
where $f$ is the true spectrum. The corresponding $\mathrm{IMSE}(L_1)$ for $\hat{\hat f}$ is computed similarly. The summands in (30) are the pointwise MSE. From Figure 1 we see that the optimal $L$ is around 20, which would not be obvious from a particular realization. Notice that as $L_1$ gets large the IMSE of $\hat{\hat f}$ becomes close to optimal. Hence choosing $L_1$ is relatively simple.
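The IMSE calculation can be sketched as below. The true spectrum is taken to be $|1 - z - .75z^2|^2 / |1 + .25z + .5z^2|^2$ with $z = \exp(-2\pi i v)$, which assumes unit innovation variance and the convention $f(v) = \sum_m \gamma(m)\exp(-2\pi i v m)$; the function names and the nested-list layout of the estimates are illustrative.

```python
import cmath
import math

def true_spectrum(v):
    """Spectrum of x_t = -.25x_{t-1} - .5x_{t-2} + e_t - e_{t-1} - .75e_{t-2}."""
    z = cmath.exp(-2j * math.pi * v)
    ma = 1 - z - 0.75 * z * z
    ar = 1 + 0.25 * z + 0.5 * z * z
    return abs(ma) ** 2 / abs(ar) ** 2

def imse(estimates, truth):
    """Return (IMSE, pointwise MSE list).

    estimates[j][k] is the estimate from replication j at grid point k,
    truth[k] is the true spectrum at the same grid point.
    """
    n_rep = len(estimates)
    pointwise = [
        sum((est[k] - truth[k]) ** 2 for est in estimates) / n_rep
        for k in range(len(truth))
    ]
    return sum(pointwise), pointwise

# Toy check with two replications on a two-point grid.
total, pointwise = imse([[1.0, 2.0], [3.0, 2.0]], [2.0, 2.0])
print(total, pointwise)
```

Returning the pointwise MSE alongside the total makes it easy to inspect where along the frequency axis the error is concentrated.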
Figure 2 (a), (b) and (c) show the pointwise characteristics of $\hat{\hat f}$ in terms of MSE, variance and bias, compared to those of $\hat f_L$ with $L = 20$, which is known to be optimal from Figure 1. Figure 3 (a), (b), (c) and (d) show $\hat{\hat f}$ and $\hat f_L$ for a particular realization with $T = 256$. We use $L_0 = 10$ and $L_1 = 35$, 45, 55 respectively in (b), (c) and (d).

The MSE of $\hat{\hat f}$ is very close to that of $\hat f_{20}$. With large $L_1$, $\hat{\hat f}$ has smaller MSE around the spectral peak. With moderate $L_1$, $\hat{\hat f}$ has smaller bias but larger variance compared to $\hat f_{20}$ or $\hat{\hat f}$ with larger $L_1$. Hence $\hat{\hat f}$ can achieve MSE comparable to that of the optimal simple averaged periodogram, but with a different bias-variance structure.

The example with a particular realization in Figure 3 illustrates the point that $L_1$ is to be chosen large enough but such that $\hat{\hat f}$ still shows a stable appearance. It is worth noting that the estimate stays similar for a wide range of $L_1$, but it gets smoother. The optimal $\hat f_{20}$ does not look very smooth, which would tempt one to try a larger $L$ and, consequently, oversmooth.

Figure 2. Pointwise characteristics of $\hat{\hat f}$ with $L_0 = 10$ and $L_1 = 35$ (dashed lines), $L_1 = 55$ (dotted lines) compared to that of the optimal simple averaged periodogram $\hat f_{20}$ (solid lines). (a) MSE, (b) Variance and (c) Bias, each plotted against frequency.

Figure 3. Spectral estimates from a realization of $x_t = -.25\,x_{t-1} - .5\,x_{t-2} + e_t - e_{t-1} - .75\,e_{t-2}$, $t = 0, \ldots, 255$. The plots show the true spectrum (dotted line) with (a) the optimal $\hat f_{20}$ (solid line); (b), (c) and (d) show $\hat{\hat f}$ (dashed lines) using $L_0 = 10$ and $L_1 = 35$, 45 and 55 respectively, and its corresponding $\hat f_{L_1}$ (solid lines).

(b) Other examples.
It is obvious that the choice of $L_1$ is limited by the range of frequency where the spectrum has a relatively constant second order derivative. Hence a blind choice of large $L_1$ will not work for a spectrum that has a lot of peaks. A thoughtful procedure like prefiltering is still useful, especially to handle sharp peaks. We will show two more examples to see how $\hat{\hat f}$ works in more difficult situations than (a).

Figure 4 (a) and (b) show the true spectrum of an AR(4) and $\hat{\hat f}$ with $L_0 = 5$ and $L_1 = 10$ and 15 respectively. This process is considered in Beamish and Priestley (1981). With $L_1 = 15$, $\hat f_{L_1}$ oversmooths the periodogram, so it is probably a good choice for $\hat{\hat f}$. With larger $L_1$, $\hat{\hat f}$ becomes erratic. It should be noted that it is possible for $\hat{\hat f}$ to be negative for some value of $L_1$ and some frequency. This will be an indication that the particular $L_1$ is not appropriate for the frequency.

Figure 5 (a) and (b) show the true spectrum of an AR(6) and $\hat{\hat f}$ with $L_1 = 15$ and 20 respectively. This example is considered in Lysne and Tjøstheim (1987). With $L_1 = 20$, $\hat{\hat f}$ is reasonably good at handling the peaks.
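The true spectra of the autoregressive examples can be computed from their coefficients. The sketch below assumes unit innovation variance and the convention $f(v) = \sum_m \gamma(m)\exp(-2\pi i v m)$; the function name `ar_spectrum` is illustrative, and the AR(4) coefficients are those given in the caption of Figure 4.

```python
import cmath
import math

def ar_spectrum(phi, v, sigma2=1.0):
    """f(v) = sigma2 / |1 - sum_j phi_j exp(-2*pi*i*v*j)|^2
    for x_t = phi_1 x_{t-1} + ... + phi_p x_{t-p} + e_t."""
    z = cmath.exp(-2j * math.pi * v)
    denom = 1 - sum(p * z ** (j + 1) for j, p in enumerate(phi))
    return sigma2 / abs(denom) ** 2

AR4 = [2.7607, -3.8106, 2.6535, -0.9238]
grid = [k / 512 for k in range(257)]          # frequencies 0, 1/512, ..., 1/2
spec = [ar_spectrum(AR4, v) for v in grid]
peak = max(range(len(grid)), key=lambda k: spec[k])
print(grid[peak], spec[peak])                 # location and height of the largest peak
```

Evaluating the spectrum on a fine grid like this shows how narrow the peaks are relative to a given window width $L/T$, which is what limits the usable range of $L_1$.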

Figure 4. $x_t = 2.7607\,x_{t-1} - 3.8106\,x_{t-2} + 2.6535\,x_{t-3} - .9238\,x_{t-4} + e_t$, $t = 0, \ldots, 511$. The plot shows the true spectrum (dotted lines), $\hat f_{L_1}$ (solid lines) and $\hat{\hat f}$ (dashed lines) with $L_0 = 5$ and (a) $L_1 = 10$; (b) $L_1 = 15$.

Figure 5. $x_t = 1.68\,x_{t-1} - 3.03\,x_{t-2} + 2.85\,x_{t-3} - 2.59\,x_{t-4} + 1.23\,x_{t-5} - .59\,x_{t-6} + e_t$, $t = 0, \ldots, 511$. The plot shows the true spectrum (dotted lines), $\hat f_{L_1}$ (solid lines) and $\hat{\hat f}$ (dashed lines) with $L_0 = 5$ and (a) $L_1 = 15$; (b) $L_1 = 20$.

6. CONCLUSION
We have considered the problem of estimating spectral density functions using a two-stage procedure. It is motivated by the local behavior of kernel estimates with varying bandwidths. The estimate is shown to achieve good mean square characteristics asymptotically and in the simulation, even compared to the kernel estimate with optimal bandwidth, thus reducing the need to find such an optimal bandwidth.

REFERENCES
BEAMISH, N. and PRIESTLEY, M.B. (1981). A study of autoregressive and window spectral estimation. Appl. Statist. 30, 41-58.
BELTRAO, K.I. and BLOOMFIELD, P. (1987). Determining the bandwidth of a kernel spectrum estimate. J. Time Series Anal. 8, 21-38.
BHATTACHARYA, P.K. and MACK, Y.P. (1987). Weak convergence of k-NN density and regression estimators with varying k and applications. Ann. Statist. 15, 976-994.
BHATTACHARYA, P.K. and GANGOPADHYAY, A.K. (1987). Technical Report Series of the Intercollege Division of Statistics, University of California at Davis, Technical Report #104.
BRILLINGER, D.R. (1981). Time Series: Data Analysis and Theory. New York: Holt, Rinehart and Winston.
CAMERON, M.A. (1987). An automatic non-parametric spectrum estimation. J. Time Series Anal. 8, 379-387.
GAUSS Programming Language Manual (1986). Aptech Inc.
LYSNE, D. and TJØSTHEIM, D. (1987). Loss of spectral peaks in autoregressive spectral estimation. Biometrika 74, 200-6.
KINDERMAN, A.J. and RAMAGE, J.G. (1976). Computer generation of normal random numbers. J. Amer. Statist. Assoc. 71, 893-6.
PRIESTLEY, M.B. (1981). Spectral Analysis and Time Series. New York: Academic Press.
PROKHOROV, Y.V. (1956). Convergence of random processes and limit theorems in probability theory. Theory Prob. Applic. 1, 157-214.
WAHBA, G. (1980). Automatic smoothing of the log periodogram. J. Amer. Statist. Assoc. 75, 122-132.

Department of Biostatistics
University of Washington
Seattle, WA 98105

Department of Statistics
University of NC at Chapel Hill
Chapel Hill, NC 27599-3260