Biometrika (1999), 86, 4, pp. 965-972
© 1999 Biometrika Trust
Printed in Great Britain

Statistical analysis of incomplete long-range dependent data

BY WILFREDO PALMA AND GUIDO DEL PINO
Departamento de Estadistica, Pontificia Universidad Catolica de Chile, Casilla 306, Santiago 22, Chile
[email protected]   gdelpino@mat.puc.cl
SUMMARY
This paper addresses both theoretical and methodological issues related to the prediction of long-memory models with incomplete data. Estimates and forecasts are calculated by means of state space models and the influence of data gaps on the performance of short and long run predictions is investigated. These techniques are illustrated with a statistical analysis of the minimum water levels of the Nile river, a time series exhibiting strong dependency.
Some key words: ARFIMA model; Incomplete data; Linear predictor; Long-memory; Maximum likelihood; Mean square prediction error; State space system.
1. INTRODUCTION
Long-range dependent data arise in a wide variety of scientific disciplines, from hydrology to economics; see for example Bloomfield (1992), Robinson (1993), Beran (1994) and Ray & Tsay (1997). A well-known class of long-memory processes are the autoregressive fractionally integrated moving average (ARFIMA) models, defined by the discrete-time equation

    Φ(B)(1 − B)^d y_t = Θ(B) ε_t,

for t = 1, ..., n, where |d| < 1/2, {ε_t} is a white noise sequence with zero mean and variance σ_ε², B is the backshift operator, B y_t = y_{t−1}, Φ(B) and Θ(B) are polynomials of degrees p and q respectively with no common zeros and all their roots outside the unit circle, and (1 − B)^d is the fractional difference operator. The ARFIMA models have long memory because their autocorrelations decay to zero at a hyperbolic rate, that is ρ_k ~ c|k|^{2d−1} (c > 0), for large k. Estimation of these long-range dependent models is discussed by Fox & Taqqu (1986), Dahlhaus (1989) and Sowell (1992), among others. The problem of data gaps in time series has received a great deal of attention; see for example Jones (1980), Ansley & Kohn (1983) and Penzer & Shea (1997). In particular, Palma & Chan (1997) and Chan & Palma (1998) develop state space methods for dealing with missing observations in the long-memory context.
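For illustration, the MA(∞) weights ψ_j, the AR(∞) weights π_j and the autocorrelations ρ_k of the simplest case, fractional noise ARFIMA(0, d, 0), satisfy the standard recursions ψ_j = ψ_{j−1}(j − 1 + d)/j, π_j = π_{j−1}(j − 1 − d)/j and ρ_k = ρ_{k−1}(k − 1 + d)/(k − d). A minimal numerical sketch in Python (an illustration only, not part of the original analysis):

import numpy as np

def arfima00_weights(d, n):
    """MA(inf) weights psi_j, AR(inf) weights pi_j and autocorrelations rho_k
    of a fractional noise, ARFIMA(0, d, 0), process."""
    psi, pi, rho = np.ones(n + 1), np.zeros(n + 1), np.ones(n + 1)
    pi[1] = d
    for j in range(1, n + 1):
        psi[j] = psi[j - 1] * (j - 1 + d) / j
        if j >= 2:
            pi[j] = pi[j - 1] * (j - 1 - d) / j
        rho[j] = rho[j - 1] * (j - 1 + d) / (j - d)
    return psi, pi, rho

psi, pi, rho = arfima00_weights(0.40, 1000)
# Hyperbolic decay: rho_k * k^(1 - 2d) stabilises at a constant for large k.
print(rho[100] * 100 ** 0.2, rho[1000] * 1000 ** 0.2)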
The main objective of this paper is to investigate the effects of missing values or data irregularities on the behaviour of prediction errors. If y_t denotes the value of the series at time t, the complete and the observed series are (y_t, t ∈ I_n = {1, ..., n}) and (y_t, t ∈ K_n = {k_1, ..., k_r}) ⊆ I_n respectively. Information may be available in the pattern of missing observations K_n, but this set is normally considered as fixed, or inference is performed conditional on it. For likelihood inference this may be justified by appealing to the 'missing at random' condition of Little & Rubin (1987, p. 10), which means here that K_n is not affected by the parameter θ specifying the time series model or by the unobserved values {y_t: t ∈ I_n, t ∉ K_n}. The likelihood function L(θ) is just the joint density of the observed values, which may be specified in many ways, leading to different formulae for the likelihood, e.g. integrated likelihood or recursive likelihood. Nevertheless, they all lead to equivalent functions. This paper focuses on the recursive likelihood for a Gaussian time series, which can be calculated by means of the Kalman filter. Let ŷ_t be the best linear predictor
of y_t given the observed values up to time t − 1 and let Δ_t be the variance of the one-step prediction error y_t − ŷ_t. With this notation the recursive likelihood is

    L(θ) = (2π)^{−r/2} ( Π_{t ∈ K_n} Δ_t )^{−1/2} exp{ −(1/2) Σ_{t ∈ K_n} (y_t − ŷ_t)²/Δ_t }.

Explicit state space formulae for calculating the predictions ŷ_t and Δ_t may be found in Palma & Chan (1997).
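As a rough illustration of how the quantities ŷ_t and Δ_t can be produced in practice, the following Python sketch runs a Kalman filter on a truncated MA(M) approximation of an ARFIMA(0, d, 0) process, skipping the update step and the likelihood contribution whenever an observation is missing. The truncation lag M, the NaN coding of gaps and the use of Python are assumptions of this illustration; the exact state space formulation of Palma & Chan (1997) differs.

import numpy as np

def arfima00_psi(d, M):
    """MA(inf) weights psi_0, ..., psi_M of an ARFIMA(0, d, 0) process (psi_0 = 1)."""
    psi = np.ones(M + 1)
    for j in range(1, M + 1):
        psi[j] = psi[j - 1] * (j - 1 + d) / j
    return psi

def kalman_loglik(y, d, sigma2, M=50):
    """Kalman filter for a truncated MA(M) approximation of fractional noise.

    y is a mean-corrected series with np.nan marking missing values. Returns the
    Gaussian log-likelihood, the one-step predictions yhat_t and their error
    variances Delta_t."""
    psi = arfima00_psi(d, M)              # observation weights Z = (1, psi_1, ..., psi_M)
    n, m = len(y), M + 1
    T = np.zeros((m, m))                  # shifts the stored innovations down by one
    T[1:, :-1] = np.eye(m - 1)
    a = np.zeros(m)                       # filtered state: recent innovations
    P = sigma2 * np.eye(m)                # exact stationary covariance of the state
    loglik = 0.0
    yhat, Delta = np.empty(n), np.empty(n)
    for t in range(n):
        a = T @ a                         # prediction step
        P = T @ P @ T.T
        P[0, 0] += sigma2                 # a new innovation enters the first slot
        yhat[t] = psi @ a
        Delta[t] = psi @ P @ psi
        if np.isnan(y[t]):                # data gap: no update, no likelihood term
            continue
        v = y[t] - yhat[t]                # innovation and Kalman gain
        K = (P @ psi) / Delta[t]
        a = a + K * v
        P = P - np.outer(K, psi @ P)
        loglik += -0.5 * (np.log(2 * np.pi * Delta[t]) + v * v / Delta[t])
    return loglik, yhat, Delta

Because ψ_j decays only hyperbolically, M has to be taken fairly large when d is near 1/2; maximising this log-likelihood over (d, σ_ε²) gives approximate maximum likelihood estimates in the presence of missing observations.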
Section 2 addresses the behaviour of short and long run forecasts of ARFIMA models in the presence of data gaps. Applications of these procedures to the analysis of the annual minimum water levels of the Nile river are discussed in § 3.
2. INFLUENCE OF MISSING VALUES ON PREDICTION

2.1. Preliminaries
This section studies the evolution of the one-step mean square prediction error, E(y_t − ŷ_t)², for ARFIMA models, during and after a block of missing data. The results are then specialised to the case of a single missing observation.
For mathematical convenience, the analysis is carried out by taking into account the full past, y_{t−1}, y_{t−2}, ..., instead of the finite past, y_{t−1}, y_{t−2}, ..., y_1, of the time series. Throughout this section we use the concepts of exponential and hyperbolic rates of convergence of a sequence {x_k} to its limit x as k → ∞. Exponential convergence corresponds to |x_k − x| ≤ C_1 a^{−k} for large k, for some a > 1 and a positive constant C_1, and hyperbolic convergence to |x_k − x| ≤ C_2 k^{−α}, for some C_2 > 0 and α > 0.
2.2. Influence of a block of missing values
When m consecutive observations, y_{t_0}, ..., y_{t_0+m−1}, are missing, the standard deviation of the prediction error increases during the data gap and then decreases, as new information is added. By stationarity, we may take t_0 = 0.
The following two propositions characterise this behaviour during and after the data gap and specify the convergence rates. Proofs are given in the Appendix.
THEOREM 1. Let y_t be a stationary invertible process with AR(∞) decomposition y_t = Σ_{j=1}^{∞} π_j y_{t−j} + ε_t and MA(∞) decomposition y_t = Σ_{j=0}^{∞} ψ_j ε_{t−j}. Suppose that the observations y_0, ..., y_{m−1} are missing and let 𝒴_{k,m} = (y_t: t < k, t ∉ {0, 1, ..., m − 1}). Denote by e(t, m, k) the error of the best linear predictor of y_t given 𝒴_{k,m}, that is given all the available information before time k, and by σ²(t, m, k) its variance. Then
(a) for k = 0, ..., m, σ²(k, m, k) = σ_ε² Σ_{j=0}^{k} ψ_j²;
(b) for k > m, σ²(k, m, k) − σ_ε² ≤ σ_y² m² max_{j>k−m} π_j², where σ_y² = var(y_t).
Theorem 1(a) shows that, as expected, during the data gap the mean square prediction error increases monotonically up to time m, since no new information is being added. In contrast, after time m the variance of the prediction error decreases, as new observations are incorporated. The next result specifies the rates at which these error variances increase or decrease during and after the gap.
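Part (a) of Theorem 1 can be evaluated directly for fractional noise, for which the ψ_j are available recursively; a short Python sketch (an illustration under the ARFIMA(0, d, 0) assumption, with notation as above):

import numpy as np

def gap_error_variance(d, sigma2, k_max):
    """sigma^2(k, m, k) = sigma_eps^2 * sum_{j=0}^{k} psi_j^2 for k = 0, ..., k_max:
    the one-step error variance during a data gap (Theorem 1(a)), ARFIMA(0, d, 0)."""
    psi = np.ones(k_max + 1)
    for j in range(1, k_max + 1):
        psi[j] = psi[j - 1] * (j - 1 + d) / j
    return sigma2 * np.cumsum(psi ** 2)

# The error variance climbs monotonically towards var(y_t) during a long gap:
print(gap_error_variance(0.40, 1.0, 200)[[0, 10, 50, 200]])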
THEOREM 2. With the notation of Theorem 1, for a long-memory ARFIMA model,
(a) σ_y² − σ²(k, m, k) ~ C_1 k^{2d−1}, for some constant C_1 > 0 and large k (k ≤ m);
(b) σ²(k, m, k) − σ_ε² ~ C_2 k^{−2d−2}, for some constant C_2 > 0 and large k (k > m).
Also, for a short-memory ARMA model,
(c) σ_y² − σ²(k, m, k) ~ C_3 a_1^{−k}, for some constants C_3 > 0, a_1 > 1 and large k (k ≤ m);
(d) σ²(k, m, k) − σ_ε² ~ C_4 a_2^{−k}, for some constants C_4 > 0, a_2 > 1 and large k (k > m).
According to Theorem 2, there are two different hyperbolic rates for the mean square prediction error during and after the data gap. For example, if d = 0.40, the variance of the prediction error during the gap increases at rate O(k^{−0.2}), whereas it decreases to σ_ε² at rate O(k^{−2.8}). Thus information is lost during the block of missing observations at a much slower rate than that at which it is gained after the data gap.
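The contrast behind Theorem 2 is visible in the weights themselves: for an AR(1) process the MA(∞) weights decay exponentially, ψ_j = φ^j, whereas for fractional noise they decay only hyperbolically, ψ_j ≈ j^{d−1}/Γ(d). A small comparison sketch with illustrative values d = φ = 0.40:

import numpy as np

d, phi, M = 0.40, 0.40, 200
psi_fn = np.ones(M + 1)                   # fractional noise: hyperbolic decay
for j in range(1, M + 1):
    psi_fn[j] = psi_fn[j - 1] * (j - 1 + d) / j
psi_ar1 = phi ** np.arange(M + 1)         # AR(1): exponential decay
# Tail sums of psi_j^2 control how much of var(y_t) is still unresolved at lag k:
for k in (10, 50, 200):
    print(k, psi_fn[k], psi_ar1[k])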
Theorems 1 and 2 assume the data gap length to be fixed. However, if the length of the gap increases to infinity, then the prediction process is statistically equivalent to that during the transition period at the beginning of the time series, with no previous observation. The following result characterises the convergence of the prediction error in this case.
THEOREM 3. Let σ²(k, ∞, k) be the variance of the prediction error e(k, ∞, k). Then, as k → ∞,

    σ²(k, ∞, k) − σ_ε² ~ C k^{−1}.
An interesting relationship between predicting with finite and infinite past may be drawn from Theorem 3. Let ê(t, m, k) denote the error of the best linear predictor of y_t based on the finite past (y_{t−1}, y_{t−2}, ..., y_1), and let σ̂²(k, ∞, k) be its variance. Then

    σ̂²(k, ∞, k) = Π_{i=1}^{k−1} (1 − φ_ii²) σ²(k, ∞, k),

where φ_ii denotes the lag-i partial autocorrelation. According to Theorem 3, this term converges hyperbolically to σ_ε², the error variance of the best linear predictor of y_t based on the infinite past (y_{t−1}, y_{t−2}, ...).
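For fractional noise the lag-k partial autocorrelation has the closed form φ_kk = d/(k − d), so the finite-past error variances can be generated with the Durbin–Levinson recursion ν_k = ν_{k−1}(1 − φ_kk²), ν_0 = var(y_t). The sketch below (an illustration with σ_ε² = 1 assumed) shows the roughly C/k convergence to σ_ε² described by Theorem 3:

import numpy as np
from scipy.special import gamma

def finite_past_error_variance(d, sigma2, n):
    """One-step error variances nu_k for prediction from the finite past
    (y_{t-1}, ..., y_{t-k}) of ARFIMA(0, d, 0), via Durbin-Levinson with
    phi_kk = d / (k - d)."""
    nu = np.empty(n + 1)
    nu[0] = sigma2 * gamma(1 - 2 * d) / gamma(1 - d) ** 2   # var(y_t)
    for k in range(1, n + 1):
        nu[k] = nu[k - 1] * (1 - (d / (k - d)) ** 2)
    return nu

nu = finite_past_error_variance(0.40, 1.0, 1000)
# (nu_k - sigma_eps^2) * k stabilises, i.e. the convergence is hyperbolic of order 1/k:
print((nu[[10, 100, 1000]] - 1.0) * np.array([10, 100, 1000]))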
2.3. Influence of a single missing value
THEOREM 4. Under the conditions of Theorem 1, the mean square prediction error of an isolated missing observation is as follows:
(a) for k ≥ 1, σ²(k, 1, k) − σ_ε² = π_k² σ²(0, 1, k);
(b) for k > 0, if π_k is a monotonically decreasing sequence then σ²(k, 1, k) − σ_ε² is a monotonically decreasing sequence converging to zero.
According to Theorem 4(a) there is a jump in the mean square prediction error after a missing value. Its magnitude is π_1² σ_ε², unless π_1 = 0, as is true for some ARFIMA models. For instance, in an ARFIMA(1, d, 1) process with Φ(B) = 1 − φB and Θ(B) = 1 − θB, π_1 = φ + d − θ, and so there is no jump in the one-step mean square prediction error if d = θ − φ. The monotonicity condition in Theorem 4(b) is shared by all fractional noise processes with long memory parameter d. In fact, π_k = π_{k−1}(k − 1 − d)/k > 0 and so π_k/π_{k+1} = (k + 1)/(k − d). However, for d ∈ (−1/2, 1/2) and k ≥ 1, (k + 1)/(k − d) > 1, proving that π_k/π_{k+1} > 1.
Figure 1 depicts the evolution of the mean square prediction error, σ²(k, 1, k), after a single missing value for ARFIMA(0, d, 0) models with parameters d = 0.10, d = 0.40 and d = 0.49, respectively. If we take σ_ε² = 1 it follows from π_1 = d that the magnitude of the jump is d². Observe that, after the first peak, the mean square prediction error decays to one monotonically. Furthermore, the difference between the mean square prediction error and one is not noticeable after about six steps from the missing value.

Fig. 1. Mean square prediction error for fractional noise processes.
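The exact curves of Fig. 1 require σ²(0, 1, k), but a simple upper envelope follows from Theorem 4(a) by bounding σ²(0, 1, k) by σ²(0, 1, 1) = σ_ε²; the envelope reproduces the jump of d² at k = 1 exactly and decays monotonically to one. A sketch (illustration only, with σ_ε² = 1):

import numpy as np

def jump_envelope(d, k_max):
    """Upper envelope 1 + pi_k^2 for sigma^2(k, 1, k) after an isolated missing
    value (Theorem 4(a) with sigma^2(0, 1, k) bounded by sigma_eps^2 = 1); exact
    at k = 1, where the jump equals pi_1^2 = d^2 for fractional noise."""
    pi = np.zeros(k_max + 1)
    pi[1] = d
    for k in range(2, k_max + 1):
        pi[k] = pi[k - 1] * (k - 1 - d) / k
    return 1.0 + pi[1:] ** 2

for d in (0.10, 0.40, 0.49):
    print(d, jump_envelope(d, 20)[:6].round(4))   # jump of d^2, then monotone decay to 1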
The results discussed in this section indicate a sharp contrast between ARMA and ARFIMA processes. For example, from Theorem 2, the influence of a data gap on the mean square prediction error vanishes at an exponential rate for ARMA models and at a hyperbolic rate for ARFIMA processes. Thus, the influence on the prediction errors persists longer in the latter processes. On the other hand, as a consequence of Theorem 1(a) the forecasting error variance after the end of the series grows more slowly for ARFIMA than for ARMA models, giving long-memory processes a clear advantage over short-memory processes.
3. APPLICATION: THE NILE RIVER DATA REVISITED
The annual minimum water levels of the Nile river measured at the Roda gorge is a well-known time series exhibiting long-range dependency; see for example Hosking (1984), Beran (1994, Ch. 1) and Hipel & McLeod (1994, Ch. 10). These measurements, available from StatLib at www.stat.cmu.edu, are displayed in Fig. 2(a), spanning a time period from AD 622 to AD 1921. Several blocks of repeated observations, i.e. consecutive years having exactly the same minimum water level, have been removed. Since the observations are specified by four digits, the repetitions are probably the result of a lack of new information.
From AD 622 to AD 1281, the period analysed by Beran (1994), there are 48 repeated values. This figure rises to 344 when the full period is considered, corresponding to roughly 27% of the sample size of 1297 observations. The problem is especially critical after the fifteenth century, when blocks of up to 55 consecutive repetitions can be found.
Fig. 2. Annual minimum water levels of the Nile river (AD 622-AD 1921): (a) data without repetitions; (b) one-step predictions.

Following Beran (1994, p. 11), we fitted an ARFIMA(0, d, 0) model to the Nile river data and the maximum likelihood estimates are presented in Table 1. The full period from AD 622 to AD 1921 was divided into two sub-periods, from AD 622 to AD 1281, which coincides with Beran's analysis, and from AD 1282 to AD 1921, a period in which 46% of the observations are missing. The means of the time series in Fig. 2(a), displayed in the third column of Table 1, are similar in the three periods, whereas the standard deviations of the time series, shown in the fourth column, present small changes from period to period.
Table 1. Nile river data. Maximum likelihood estimation of d and σ_ε²

Period              Missing     ȳ        σ_y      d         t_d       σ̂_ε
AD 622-AD 1281        7%       11.50     0.89     0.3712    11.96     0.921
AD 1282-AD 1921      46%       12.08     1.17     0.4385    14.92     1.095
AD 622-AD 1921       27%       11.71     1.04     0.4141    18.21     0.733
From Table 1, it can be observed that the maximum likelihood estimates of the long-memory parameter d in the three periods are similar. The estimate found by Beran (1994, p. 125) is d = 0.40, for the period AD 622-AD 1281 without removing the repeated values. The maximum likelihood estimate of d for the second and third periods with the repeated values is 0.49 in both cases, indicating almost nonstationary models.
The t-statistics for d in the studied periods are highly significant. The estimated standard deviations of the noise σ̂_ε are close to one in the first and second periods. However, when the full period is considered, this estimate drops to around 0.7.
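For a fully observed segment, the maximum likelihood estimate of d can be obtained from the exact Gaussian likelihood built on the Toeplitz autocovariance matrix of fractional noise, with σ_ε² profiled out. The Python sketch below is a complete-data illustration only: the estimates in Table 1 for the periods with data gaps come from the state space likelihood of § 1, and the data file name is a placeholder, not an actual StatLib path.

import numpy as np
from scipy.special import gammaln
from scipy.linalg import toeplitz, cho_factor, cho_solve
from scipy.optimize import minimize_scalar

def fn_autocov(d, n):
    """Autocovariances gamma_0, ..., gamma_{n-1} of ARFIMA(0, d, 0) with sigma_eps^2 = 1."""
    g = np.empty(n)
    g[0] = np.exp(gammaln(1 - 2 * d) - 2 * gammaln(1 - d))
    for k in range(1, n):
        g[k] = g[k - 1] * (k - 1 + d) / (k - d)
    return g

def neg_profile_loglik(d, y):
    """Minus the Gaussian log-likelihood of fractional noise, sigma_eps^2 profiled out."""
    z = y - y.mean()
    n = len(z)
    G = toeplitz(fn_autocov(d, n))
    c, low = cho_factor(G)
    quad = z @ cho_solve((c, low), z)
    logdet = 2.0 * np.sum(np.log(np.diag(c)))
    return 0.5 * (n * np.log(quad / n) + logdet)     # up to an additive constant

# Hypothetical usage on a complete segment of the series (placeholder file name):
# y = np.loadtxt("nile_minima_complete_segment.txt")
# fit = minimize_scalar(neg_profile_loglik, args=(y,), bounds=(0.01, 0.49), method="bounded")
# print(fit.x)                                       # maximum likelihood estimate of d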
The influence of the data gaps on the forecasts can be analysed from the Kalman filter output. Figure 2(b) depicts the one-step predictions. The residuals, e_t = y_t − ŷ_t, and the predictions' standard deviations are displayed in Fig. 3. The evolution of these standard deviations is explained by the theoretical results from § 2. Two typical situations are shown in Fig. 4. After a single missing value at time t = 791, see Fig. 4(a), the residual standard deviation jumps to roughly 0.8 ≈ σ̂_ε(1 + π_1²)^{1/2}, and then decays to σ̂_ε = 0.733 monotonically, in agreement with Theorem 4. It can also be observed in Fig. 4(a) that it takes approximately six steps after the missing value to reach the 0.733 level, analogously to the theoretical result displayed in Fig. 1 for d = 0.40.

Fig. 3. (a) Residuals y_t − ŷ_t and (b) standard deviations of one-step predictions, for the Nile river data.

For a string of missing values, see Fig. 4(b), the mean square prediction error approaches the upper limit σ_y² monotonically at a hyperbolic rate O(k^{−0.26}); see Theorem 2(a). Then, as new information is added, the error variance decays to σ_ε² at a hyperbolic rate O(k^{−2.74}), as indicated by Theorem 2(b). It takes 55 missing observations for the mean square prediction error to increase to about one, but it takes fewer than six observations to regain the original level, σ̂_ε.

Fig. 4. Nile river data. Evolution of one-step standard deviations: (a) AD 781-AD 801 period; (b) AD 1529-AD 1588 period.
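Reusing the kalman_loglik sketch from § 1 (an illustration, not the authors' Fortran implementation), a trace such as Fig. 3(b) can be approximated by the square roots of the returned Δ_t, with the removed years coded as NaN; the file name, truncation lag and plug-in parameter values (taken from Table 1) are assumptions:

import numpy as np

y = np.loadtxt("nile_minima_with_gaps.txt")        # placeholder file; NaN marks removed years
loglik, yhat, Delta = kalman_loglik(y - np.nanmean(y), d=0.4141, sigma2=0.733 ** 2, M=100)
sd = np.sqrt(Delta)                                # one-step prediction standard deviations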
Model fitting for the Nile river application has been performed using the state space formulation, and a Fortran program that implements this approach is available from the authors. As shown by this application, data irregularities may severely affect the estimation of long-memory processes, resulting in almost nonstationary models.
ACKNOWLEDGEMENT
Part of this work corresponds to the fourth chapter of W. Palma's Ph.D. dissertation at Carnegie Mellon University; he would like to thank N. H. Chan, J. B. Kadane and N. Terrin for their guidance and helpful comments and discussions. This research was supported in part by a grant from Fondecyt. The authors wish to thank the editor and two referees for their careful review of the paper.
APPENDIX
Proofs

Proof of Theorem 1. Part (a) is a standard result for the t − s steps ahead error variance; see e.g. Beran (1994, p. 167). To prove (b) take t = k in the AR(∞) decomposition and subtract from both sides the best linear predictor. Then all terms vanish, except for ε_k and those associated with the missing observations, which yields the useful identity

    e(k, m, k) = ε_k − Σ_{j=k−m+1}^{k} π_j e(k − j, m, k).     (A1)

By the orthogonality of ε_k to all previous observations,

    σ²(k, m, k) = σ_ε² + var{ Σ_{j=k−m+1}^{k} π_j e(k − j, m, k) }     (A2)

for m ≥ 1. Bounding the sum in (b) and σ²(j, m, k) by σ_y² ends the proof. □
Proof of Theorem 2. (a) For k ≤ m,

    σ_y² − σ²(k, m, k) = σ_ε² Σ_{j=k+1}^{∞} ψ_j² ~ C_1 k^{2d−1},

for large k, since ψ_j ~ C j^{d−1} for large j. (b) For k > m, from (A2),

    σ²(k, m, k) − σ_ε² = var{ Σ_{j=k−m+1}^{k} π_j e(k − j, m, k) }.

Since π_j ~ C j^{−d−1} for large j, for large k,

    var{ Σ_{j=k−m+1}^{k} π_j e(k − j, m, k) } ~ C² k^{−2d−2} var(Z_k),

where

    Z_k = Σ_{j=0}^{m−1} y_j − E( Σ_{j=0}^{m−1} y_j | y_m, ..., y_{k−1}, y_{−1}, y_{−2}, ... ).

However, 0 < var(Z_∞) ≤ var(Z_k) ≤ var(Z_m) < ∞. Therefore

    σ²(k, m, k) − σ_ε² ~ C_2 k^{−2d−2}  (k → ∞).

Parts (c) and (d) are proved analogously, by observing that for ARMA processes ψ_j ≤ C_3 a^{−j} and π_j ≤ C_4 a^{−j} for large j. □
Proof of Theorem 3. Let C be a constant which does not depend on k. Then, for large k,

    σ²(k, ∞, k) − σ_ε² = σ²(k, ∞, k){1 − σ_ε²/σ²(k, ∞, k)}
                       = σ²(k, ∞, k)[1 − exp{ Σ_{i≥k} log(1 − φ_ii²) }]
                       ≤ σ²(k, ∞, k){1 − exp( −C Σ_{i≥k} i^{−2} )} ~ C k^{−1},

as required. □
Proof of Theorem 4. Taking m = 1 in (A1) yields (a). Part (b) follows from the monotonic behaviour of σ²(0, 1, k), which decreases from σ_ε² to σ²(0, 1, ∞), the variance of the interpolation error given all observations but the missing one. □
REFERENCES
ANSLEY, C. F. & KOHN, R. (1983). Exact likelihood of vector autoregressive-moving average process with missing or aggregated data. Biometrika 70, 275-8.
BERAN, J. (1994). Statistics for Long-Memory Processes. New York: Chapman and Hall.
BLOOMFIELD, P. (1992). Trends in global temperature. Climatic Change 21, 1-16.
CHAN, N. H. & PALMA, W. (1998). State-space modelling of long-memory processes. Ann. Statist. 26, 719-40.
DAHLHAUS, R. (1989). Efficient parameter estimation of self similar processes. Ann. Statist. 17, 1749-66.
FOX, R. & TAQQU, M. S. (1986). Large sample properties of parameter estimates for strongly dependent stationary Gaussian time series. Ann. Statist. 14, 517-32.
HIPEL, K. W. & MCLEOD, A. I. (1994). Time Series Modelling of Water Resources and Environmental Systems. Amsterdam: Elsevier.
HOSKING, J. R. M. (1984). Modeling persistence in hydrological time series using fractional differencing. Water Resour. Res. 20, 1898-908.
JONES, R. H. (1980). Maximum likelihood fitting of ARMA models to time series with missing observations. Technometrics 22, 389-95.
LITTLE, R. J. A. & RUBIN, D. B. (1987). Statistical Analysis with Missing Data. New York: Wiley.
PALMA, W. & CHAN, N. H. (1997). Estimation and forecasting of long-memory processes. J. Forecasting 16, 395-410.
PENZER, J. & SHEA, B. (1997). The exact likelihood of an autoregressive-moving average model with incomplete data. Biometrika 84, 919-28.
RAY, B. K. & TSAY, R. S. (1997). Bandwidth selection for kernel regression with long-range dependent errors. Biometrika 84, 791-802.
ROBINSON, P. M. (1993). Time series with strong dependency. In Advances in Econometrics, 6th World Congress, Ed. C. A. Sims, pp. 47-95. Cambridge: Cambridge University Press.
SOWELL, F. (1992). Maximum likelihood estimation of stationary univariate fractionally integrated time series models. J. Econometrics 53, 165-88.

[Received July 1998. Revised May 1999]