Detrending moving-average cross-correlation coefficient: Measuring

Detrending moving-average cross-correlation coefficient:
Measuring cross-correlations between non-stationary series
arXiv:1311.0657v1 [q-fin.ST] 4 Nov 2013
Ladislav Kristoufek
Institute of Information Theory and Automation, Academy of Sciences of the Czech Republic, Pod
Vodarenskou Vezi 4, 182 08, Prague 8, Czech Republic
Institute of Economic Studies, Faculty of Social Sciences, Charles University, Opletalova 26, 110 00,
Prague 1, Czech Republic
Abstract
In the paper, we introduce a new measure of correlation between possibly non-stationary
series. As the measure is based on the detrending moving-average cross-correlation analysis
(DMCA), we label it as the DMCA coefficient ρDM CA (λ) with a moving average window
length λ. We analytically show that the coefficient ranges between -1 and 1 as a standard
correlation does. In the simulation study, we show that the values of ρDM CA (λ) very
well correspond to the true correlation between the analyzed series regardless the (non)stationarity level. Dependence of the newly proposed measure on other parameters –
correlation level, moving average window length and time series length – is discussed as
well.
Keywords: correlations, econophysics, non-stationarity
PACS codes: 05.10.Ln, 05.45.Pq
1. Introduction
Inspection of statistical properties of multivariate series has become a topic of increasing
importance in econophysics. For this purpose, various estimators of power-laws in crosscorrelations of a pair of series have been proposed [13, 19, 8, 6, 15]. Out of these, the
detrended cross-correlation analysis (DCCA) [13, 19] has become the most popular one.
Apart from the analysis of the power laws in the cross-correlation function itself, Zebende
[17] proposed the DCCA cross-correlation coefficient as a combination of DCCA and the
detrended fluctuation analysis (DFA) [10, 11, 7]. Even though its ability to uncover powerlaw cross-correlations has been somewhat disputed [12, 2, 3, 18], Kristoufek [9] shows that
the coefficient is able to estimate the correlation coefficient between non-stationary series
precisely and that it dominates the standardly used Pearson’s correlation coefficient.
Email address: [email protected] (Ladislav Kristoufek)
Preprint submitted to Physica A
November 5, 2013
In this paper, we propose an alternative but also a complementary coefficient to the
DCCA cross-correlation coefficient – detrending moving-average cross-correlation coefficient. In the following section, the coefficient is introduced. After that, results of the wide
Monte Carlo study are presented showing that the newly proposed coefficient estimates
the true correlation coefficient precisely even for strongly non-stationary series.
2. DMCA coefficient
We start with the detrending moving average procedure (DMA) proposed by Vandewalle & Ausloos [14] and further developed by Alessio et al. [1]. For (possibly asymptotPt
ically non-stationary)
series
{x
}
and
{y
},
we
construct
integrated
series
X
=
t
t
t
i=1 xi
Pt
and Yt = i=1 yi for t = 1, 2, . . . , T where T is the time series length which is common for
both series. Fluctuation functions Fx,DM A and Fy,DM A are then defined as
2
Fx,DM
A (λ)
1
=
T −λ+1
2
Fy,DM
A (λ)
bT −θ(λ−1)c
X
g
Xt − X
t,λ
2
,
(1)
i=bλ−θ(λ−1)c
1
=
T −λ+1
bT −θ(λ−1)c
X
g
Yt − Y
t,λ
2
(2)
i=bλ−θ(λ−1)c
where λ is the moving average window length and θ is a factor of moving average type
g
g
(forward, centered and backward for θ = 0, θ = 0.5 and θ = 1, respectively). X
t,λ and Yt,λ
then represent the specific moving averages with the window size λ at time t. Properties
of DMA have been frequently studied and the procedure is usually compared to DFA.
Both methods perform similarly but DMA is much less computationally demanding as the
procedure contains no box-splitting and regression fitting [4, 16, 5]. In a similar manner,
different types of moving averages have been studied and the centered one (θ = 0.5) shows
the best results [4]. Therefore, we apply θ = 0.5 in this study as well.
For the bivariate series, He & Chen [6] propose the detrending moving-average crosscorrelation analysis (DMCA) which can be seen as a detrended covariance. The bivariate
2
fluctuation FDM
CA is defined as
2
FDM
CA (λ)
1
=
T −λ+1
bT −θ(λ−1)c
X
g
g
Xt − X
Y
−
Y
t,λ
t
t,λ .
(3)
i=bλ−θ(λ−1)c
In the steps of Zebende [17], we propose the detrending moving-average cross-correlation
coefficient, or also the DMCA-based correlation coefficient, as
ρDM CA (λ) =
2
FDM
CA (λ)
Fx,DM A (λ)Fy,DM A (λ)
2
(4)
The coefficient can be rewritten as
ρDM CA (λ) =
2
FDM
CA (λ)
=
Fx,DM A (λ)Fy,DM A (λ)
PbT −θ(λ−1)c 1
g
g
X
−
X
Y
−
Y
t
t,λ
t
t,λ
i=bλ−θ(λ−1)c
T −λ+1
r
2 =
2
P
P
bT
−θ(λ−1)c
bT
−θ(λ−1)c
1
1
g
g
i=bλ−θ(λ−1)c Xt − Xt,λ
i=bλ−θ(λ−1)c Yt − Yt,λ
T −λ+1
T −λ+1
PbT −θ(λ−1)c
i=bλ−θ(λ−1)c x,t y,t
qP
PbT −θ(λ−1)c 2 (5)
bT −θ(λ−1)c
2
i=bλ−θ(λ−1)c x,t
i=bλ−θ(λ−1)c y,t
where {x,t } and {y,t } are the series {Xt } and {Yt }, respectively, detrended by the centered
moving average of length λ. From the last part of Eq. 5, it is visible that
− 1 ≤ ρDM CA (λ) ≤ 1
(6)
according to the Cauchy-Schwartz inequality. The DMCA-based correlation coefficient
thus has the same range as the standard correlation. In the following section, we show
that the values of the newly proposed coefficient very precisely describe the correlations
between two series and this is true even for strongly non-stationary series.
3. Simulations results
In this section, we show that the DMCA coefficient is able to describe correlations
between (even non-stationary) series very precisely. To do so, we present a wide Monte
Carlo simulation study1 for varying level of correlations and (non-)stationarity. For this
purpose, we utilize two ARFIMA(0,d,0) processes with correlated error terms
xt =
∞
X
an (d1 )εt−n
(7)
an (d2 )νt−n
(8)
n=0
yt =
∞
X
n=0
Γ(n+di )
, hεt i = hνt i = 0, hε2t i = hνt2 i = 1 and hεt νt i = ρεν .
where an (di ) = Γ(n+1)Γ(d
i)
Parameter d is important for the stationarity discussion. For d < 0.5, the series
are stationary while for 0.5 ≤ d < 1, the series are non-stationary but mean-reverting
whereas for d ≥ 1, the series are non-stationary and non-mean-reverting (explosive). To
see how the DMCA coefficient is able to quantify the correlation for (non-)stationary series, we select several levels of d, specifically d1 = d2 ≡ d = 0.1, 0.4, 0.6, 0.9, 1.1, 1.4.
1
The R-project codes for the DMCA coefficient are available at http://staff.utia.cas.cz/
kristoufek/Ladislav_Kristoufek/Codes.html or upon request from the author.
3
To cover a wide spectrum of possible correlation levels in a sufficient detail, we study
ρεν = −0.9, −0.8, . . . , 0.8, 0.9. To see the effect of different moving average lengths, we
study the cases λ = 5, 15, 31, 101. And finally, we examine two time series lengths which
are representative for usually analyzed series in econophysics – T = 1000, 5000.
In Figs. 1-4, all the results are summarized. In Figs. 1-2, the time series length of
T = 1000 is discussed, and in Figs. 3-4, the results for T = 5000 are illustrated. In the
figures, we present the 2.5%, 50% and 97.5% quantiles, i.e. we show the 95% confidence
intervals and the median value, based on 1000 simulations for a given parameter setting.
The main findings can be summarized as follows. Firstly, the DMCA coefficient is
an unbiased estimator of the true correlation coefficient of the series regardless the (non)stationarity setting, the correlation level, the time series length and the moving average
window size λ. Secondly, the confidence intervals get wider with an increasing λ. Nonetheless, the confidence intervals remain quite narrow for all inspected λs (when compared to
the DCCA coefficient or Pearson’s correlation examined by Kristoufek [9]). Thirdly, the
performance of the coefficient is symmetric around the zero correlation, i.e. there are no
evident difference between the performance for the positive and for the negative correlations. Fourthly, the DMCA coefficient gets more precise with an increasing absolute value
of the true correlation – for the true correlation of both ±0.9, the confidence intervals
are extremely narrow. Fifthly, the standard deviation of the coefficient is approximately
symmetric around the zero correlation and it increases with the parameter d. And sixthly,
the performance of the DMCA coefficient gets better with an increasing time series length
T.
4. Conclusions
In the paper, we introduce a new measure of correlation between possibly non-stationary
series. As the measure is based on the detrending moving-average cross-correlation analysis
(DMCA), we label it as the DMCA coefficient ρDM CA (λ) with a moving average window
size λ. We analytically show that the coefficient ranges between -1 and 1 as a standard
correlation does. In the simulation study, we show that the values of ρDM CA (λ) very
well correspond to the true correlation between the analyzed series regardless the (non)stationarity level.
As the ρDM CA (λ) coefficient can be seen as both an alternative and a complement to the
ρDCCA (s) coefficient [17], precision of these coefficients should be compared. As both this
study and our previous study of the statistical properties of the DCCA coefficient [9] are
constructed in a similar manner, the comparison is easy. The most important findings are
the following. Firstly, both coefficients provide an unbiased estimator even for highly nonstationary processes. Secondly, apart from the very strong non-stationarity case (d = 1.4),
both coefficients bring very precise estimates of the true correlation. And thirdly, turning
to the differences, we find that the DMCA coefficient has narrower confidence intervals
and lower standard deviations for given settings. However, it needs to be noted that the
crucial parameters of the coefficients are different. For the DMCA method, we have the
moving average window length λ, and for the DCCA method, we utilize the scale s which is
4
used for box-splitting and averaging of the fluctuations around the time trend. Therefore,
these two coefficients are not easily comparable. However, an indisputable advantage of
the newly proposed DMCA coefficient remains – the DMCA procedure is computationally
much less demanding than the DCCA method mainly due to the box-splitting and the
time-trend fitting in the DCCA procedure. This might become an issue for long financial
series and in any other branch of research. Overall, we suggest to use both methods in
empirical analyses dealing with potentially non-stationary series and each of the methods
can better control for different types of trends.
Acknowledgements
The support from the Grant Agency of Charles University (GAUK) under project
1110213, Grant Agency of the Czech Republic (GACR) under projects P402/11/0948 and
402/09/0965, and project SVV 267 504 are gratefully acknowledged.
References
[1] Alessio, E., A. Carbone, G. Castelli, and V. Frappietro (2002). Second-order moving
average and scaling of stochastic time series. European Physical Journal B 27, 197–200.
[2] Balocchi, R., M. Varanini, and A. Macerate (2013). Quantifying different degrees of
coupling in detrended cross-correlation analysis. EPL 101, 20011.
[3] Blythe, D. (2013). A rigorous and efficient asymptotic test for power-law crosscorrelation. Technical report, arXiv:1309.4073.
[4] Carbone, A. and G. Castelli (2003). Scaling properties of long-range correlated noisy
signals: application to financial markets. Proceedings to SPIE 5114, 406–414.
[5] Grech, D. and Z. Mazur (2005). Statistical properties of old and new techniques in
detrended analysis of time series. Acta Physica Polonica B 36, 2403–2413.
[6] He, L.-Y. and S.-P. Chen (2011). A new approach to quantify power-law crosscorrelation and its application to commodity markets. Physica A 390, 3806–3814.
[7] Kantelhardt, J., S. Zschiegner, E. Koscielny-Bunde, A. Bunde, S. Havlin, and H. E.
Stanley (2002). Multifractal Detrended Fluctuation Analysis of Nonstationary Time
Series. Physica A 316(1-4), 87–114.
[8] Kristoufek, L. (2011). Multifractal height cross-correlation analysis: A new method for
analyzing long-range cross-correlations. EPL 95, 68001.
[9] Kristoufek, L. (2013). Measuring correlations between non-stationary series with DCCA
coefficient. Physica A submitted, arXiv:1310.3984.
5
[10] Peng, C., S. Buldyrev, A. Goldberger, S. Havlin, M. Simons, and H. E. Stanley (1993).
Finite-size effects on long-range correlations: Implications for analyzing DNA sequences.
Physical Review E 47(5), 3730–3733.
[11] Peng, C., S. Buldyrev, S. Havlin, M. Simons, H. E. Stanley, and A. Goldberger (1994).
Mosaic organization of DNA nucleotides. Physical Review E 49 (2), 1685–1689.
[12] Podobnik, B., Z.-Q. Jiang, W.-X. Zhou, and H. E. Stanley (2011). Statistical tests for
power-law cross-correlated processes. Physical Review E 84, 066118.
[13] Podobnik, B. and H. E. Stanley (2008). Detrended cross-correlation analysis: A new
method for analyzing two nonstationary time series. Physical Review Letters 100, 084102.
[14] Vandewalle, N. and M. Ausloos (1998). Crossing of two mobile averages: A method
for measuring the roughness exponent. Physical Review E 58, 6832–6834.
[15] Wang, J., P. Shang, and W. Ge (2012). Multifractal cross-correlation analysis based
on statistical moments. Fractals 20(3), 271–279.
[16] Xu, L., P. Ivanov, K. Hu, Z. Chen, A. Carbone, and H. E. Stanley (2005). Quantifying signals with power-law correlations: A comparative study of detrended fluctuation
analysis and detrending moving average techniques. Physical Review E 71, 051101.
[17] Zebende, G. F. (2011). DCCA cross-correlation coefficient: Quantifying level of crosscorrelation. Physica A 390, 614–618.
[18] Zebende, G. F., M. F. da Silva, and A. Machado Filho (2013). DCCA cross-correlation
coefficient differentiation: Theoretical and practical approaches. Physica A 392, 1756–
1761.
[19] Zhou, W.-X. (2008). Multifractal detrended cross-correlation analysis for two nonstationary signals. Physical Review E 77, 066211.
6
Figure 1: Estimated DMCA correlation coefficients for different fractional integration parameters d I. Results for the time series of length T = 1000 are shown here. Separate figures represent
different parameters d – d = 0.1 (top left), d = 0.4 (top center), d = 0.6 (top right), d = 0.9 (bottom left),
d = 1.1 (bottom center), d = 1.4 (bottom right). Red lines represent the true value of ρεν . The solid lines
of shades of grey (mostly overlapping with the red line) represent the median values of 1000 simulations
for the given parameter setting. The dashed lines represent the 95% confidence intervals (the 2.5th and
the 97.5th quantiles of the simulations). Different shades of grey stand for different values of the moving
average window λ going from the lowest one (λ = 5, the darkest shade) to the highest one (λ = 101, the
lightest shade).
Figure 2: Standard deviations of DMCA correlation coefficients for different fractional integration parameters d I. Results for the time series of length T = 1000 are shown here. Separate figures
represent different parameters d – d = 0.1 (top left), d = 0.4 (top center), d = 0.6 (top right), d = 0.9
(bottom left), d = 1.1 (bottom center), d = 1.4 (bottom right). Solid lines represent the standard deviation of 1000 simulations for given parameter setting. Different shades of grey stand for different values
of the moving average window λ going from the lowest one (λ = 5, the darkest shade) to the highest one
(λ = 101, the lightest shade).
7
Figure 3: Estimated DMCA correlation coefficients for different fractional integration parameters d II. Results for the time series of length T = 5000 are shown here. Notation of Fig. 1 is
used.
Figure 4: Standard deviations of DMCA correlation coefficients for different fractional integration parameters d II. Results for the time series of length T = 5000 are shown here. Notation of
Fig. 2 is used.
8