ESTIMATING CONDITIONAL VOLATILITY WITH NEAREST NEIGHBOR PREDICTIONS

Acosta-González, Eduardo* (Universidad de Las Palmas de Gran Canaria)
Fernández-Rodríguez, Fernando** (Universidad de Las Palmas de Gran Canaria)
Pérez-Rodríguez, Jorge*** (Universidad de Las Palmas de Gran Canaria)

SUMMARY

We propose a new approach to measuring volatility. It differs from the GARCH family of models in that it is based on non-linear dynamical systems and non-parametric regression. Volatility is defined as the risk of a prediction, given a priori information about its past nearest neighbors (NNs). Our new measure of volatility is compared with GARCH on simulated and financial time series. The out-of-sample forecasting results indicate that the GARCH-based model forecasts were, in most cases, biased and exhibited no significant informational content. In contrast, the NN-based model forecasts generally exhibited less bias and, in most cases, had significant informational content.

JEL classification: C52; C53
Keywords: Nearest neighbor predictions; GARCH models

Universidad de Las Palmas de Gran Canaria
Campus de Tafira, Facultad de CC. Económicas y Empresariales
35017 Las Palmas de Gran Canaria, Spain
(*) e-mail: [email protected] Tfno/Fax: +34 928 451 820 / +34 928 451 829
(**) e-mail: [email protected] Tfno/Fax: +34 928 451 802 / +34 928 451 829
(***) e-mail: [email protected] Tfno/Fax: +34 928 458 222 / +34 928 451 829

1. INTRODUCTION

Market risk has become one of the buzzwords of financial markets.¹ Two facts are apparent. First, the role of uncertainty is central to much of modern finance theory, such as the capital asset pricing model (CAPM), the consumption-based CAPM (C-CAPM) or arbitrage pricing theory (APT), because there exists a feedback between risk and return. For example, according to the CAPM the risk premium is determined by the covariance between the future return on the asset and one or more benchmark portfolios. Theory also suggests that the price of an asset is a function of its volatility, or risk. Second, regulators, commercial and investment banks, and corporate and institutional investors are increasingly focused on measuring more precisely the level of market risk incurred by their investors, and have long recognized that asset returns exhibit volatility clustering.² Consequently, an understanding of how volatility evolves over time is central to the decision-making process.

In empirical finance we tend to be less interested in the level of an asset price or stock market index, since it is widely assumed that such time series can be described as a random walk. However, recent work in finance has demonstrated that financial markets are not perfectly characterized by the random-walk theory underlying the weak form of the efficient-market hypothesis (see Lo and MacKinlay, 1999), and that mean returns and volatility can be estimated using non-linear models. There is widespread agreement that the volatility of asset returns is, to some degree, forecastable. Only in the last two decades, however, have measures and statistical models been developed that can accommodate and account for this dependence and allow us to analyze whether volatility is stable over time.³

Traditionally, to estimate volatility from historical time series, most practitioners in the finance profession have relied on a moving average with fixed weights for all observations across the measurement sample. Volatility was defined as the standard deviation of changes over a specified period.
Most measurement samples also followed the rule that the longer the forecast horizon, the more historical data should be used. Typical studies of this nature are Black (1976), French et al. (1987) and Schwert (1989, 1990); these estimate volatility by using the sample standard deviation of stock price changes or moving averages of squared price changes. Traditional estimates are based on the assumption that volatility is necessary to construct the series, although this is difficult to defend theoretically. Moreover, the length of the interval greatly affects the measured persistence.

Using a simple moving average has not been very satisfactory, however, because of a unique characteristic of the measure. Since all points in the sample have equal weights, volatility tends to rise sharply when confronted with a shock, but then declines just as sharply once that particular observation falls out of the measurement sample. Using simple moving averages therefore creates measures of volatility that tend to look like plateaus when charted.

¹ JP Morgan defines risk as the degree of uncertainty of future net returns. Many participants in the financial markets are subject to a wide variety of risks. A common classification of risks is based on the source of the underlying uncertainty: for example, credit risk (the potential loss due to the inability of the counterparty to meet its obligations), operational risk (resulting from errors that may be made in instructing payments or settling transactions), liquidity risk (reflected in the inability of a firm to fund its non-liquid assets) and market risk (the uncertainty of future earnings resulting from market conditions, i.e. asset prices, interest rates and volatility).
² In particular, volatility clustering implies that big surprises of either sign increase the probability of future volatility.
³ The more stable the volatility, the more reliable is the prediction of future volatility from past observations.

One way to avoid this problem is to use an exponential moving average, in which the latest observations are assigned the greatest weight in estimating volatility. This approach has two conceptual advantages that are important to the practitioner. The first is that the volatility estimate reacts faster to market shocks, since recent data carry more weight in the estimation. The second is that, following a shock, volatility declines gradually as the weight of the shock observation falls. In contrast, the use of a simple moving average produces a change in volatility only once the shock observation falls out of the measurement sample, which in some cases can be several months after it occurred.
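To make the contrast between these two traditional estimators concrete, the following is a minimal sketch that computes an equally weighted rolling-window volatility and an exponentially weighted (EWMA) volatility for a simulated return series. The window length, the decay factor and all function and variable names are illustrative choices, not taken from the original study.

```python
import numpy as np

def rolling_volatility(returns, window=60):
    """Equally weighted moving estimate: the sample standard deviation over the
    most recent `window` observations (the 'plateau-shaped' estimator)."""
    returns = np.asarray(returns, dtype=float)
    out = np.full(returns.shape, np.nan)
    for t in range(window, len(returns)):
        out[t] = returns[t - window:t].std(ddof=1)
    return out

def ewma_volatility(returns, lam=0.94):
    """Exponentially weighted estimate: recent squared returns carry the largest
    weight, so the estimate reacts quickly to a shock and then decays gradually."""
    returns = np.asarray(returns, dtype=float)
    var = np.empty_like(returns)
    var[0] = returns[0] ** 2
    for t in range(1, len(returns)):
        var[t] = lam * var[t - 1] + (1.0 - lam) * returns[t - 1] ** 2
    return np.sqrt(var)

# A single large shock produces a plateau in the rolling estimate but a gradual
# decay in the EWMA estimate, as described in the text.
rng = np.random.default_rng(0)
x = rng.normal(0.0, 0.01, 500)
x[250] = 0.08  # one-off shock
vol_ma, vol_ewma = rolling_volatility(x), ewma_volatility(x)
```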
However, research in finance has devoted significant effort in the last two decades to developing better models for estimating volatility. Time series of returns often exhibit time-dependent volatility, and this fact allows an alternative volatility specification based on non-linear models. Several authors have fitted time series models to obtain estimates of conditional or expected volatility from return data. This idea was first formalized in Engle's (1982) ARCH model, which is based on the specification of conditional densities at successive periods of time with a time-dependent volatility process. The ARCH model rests on the assumption that forecasts of the variance at some future point in time can be improved by using recent information. Since the publication of the original ARCH paper in 1982, these methods have been used by many researchers.

Alternative formulations have been suggested and used, and the range of applications has continually widened (see Bollerslev et al., 1992a, and Bera and Higgins, 1993, for surveys of these models). In the ARCH model (Engle, 1982) and its extensions, the generalized ARCH (Bollerslev, 1986) and exponential GARCH (Nelson, 1991) approximations, time series volatility is measured by means of the conditional variance of the unexpected component, that is, a distributed lag over squared innovations. Fitting GARCH models to stock price data provides an alternative way to estimate conditional volatility and has become standard in recent empirical applications. However, as Pagan and Schwert (1990) have shown, ARCH models present some problems in the estimation of volatility, because there are important non-linearities in stock return behavior that are not captured by conventional ARCH or GARCH models.

Furthermore, evidence of non-linearity in financial time series has accumulated over the years, and new prediction techniques, such as chaotic dynamics and artificial neural networks, have been introduced. Nearest-neighbor predictions (NN hereafter) are a new non-parametric technique of short-run forecasting inspired by the literature on forecasting chaotic dynamical systems. The basic philosophy behind these predictors is that elements of a time series in the past might resemble elements in the future. In order to generate predictions, patterns with similar behavior are located in terms of nearest neighbors, and the time evolution of the NNs is used to yield the prediction. The NN prediction procedure makes no attempt to fit a global model to the whole time series, but uses only local information about the points to be predicted. NN methods fall within the framework of non-parametric methods (Härdle and Linton, 1994). Original ideas on NN were contributed by Stone (1977) (consistent non-parametric regression) and Cleveland (1979) (robust locally weighted regression). Farmer and Sidorowich (1987) gave an important impulse to this kind of prediction by applying the NN method to chaotic time series.

NN predictors in financial time series suggest a mixture of technical analysis and chaotic behavior. The chaos paradigm holds that non-linear behavior is capable of producing deterministic, apparently random series that are short-term predictable; chartism, on the other hand, holds that parts of a financial series in the past might resemble parts in the future. Clyde and Osler (1997) show that non-linear forecasting techniques, based on the literature on complex dynamic systems, can be viewed as a generalization of these chartist graphical methods; that is, the NN prediction method may be considered a developed and sophisticated chartism inspired by chaotic dynamics, in which, in order to yield predictions, present patterns of a time series are compared with past patterns. NN predictions have been applied several times to financial time series; see, for instance, Diebold and Nason (1990), Bajo-Rubio et al. (1992), Mizrach (1992), Fernández-Rodríguez et al. (1999), Soofi and Cao (1999) and Lisi and Medio (1997). Most of these studies show that NN predictors achieve higher forecasting efficiency than a random walk.

The purpose of the present paper is to propose a different approach to estimating conditional volatility, based on ideas from non-linear dynamical systems such as NN predictions.
In the ARCH (Engle, 1982) and GARCH (Bollerslev, 1986) approximations, time series volatility is measured by means of the conditional variance of its unexpected component. We define the non-predictable component of the volatility of an observation as the risk of its NN prediction, given a priori information about the past NNs.

There have been several attempts to measure the volatility of a financial time series using concepts related to non-linear dynamical systems. Philippatos and Wilson (1972, 1974) used entropy as a measure of uncertainty in the selection of efficient portfolios. Bajo-Rubio et al. (1992) proposed an indicator of global volatility based on the inverse of the maximum Lyapunov characteristic exponent in order to measure exchange rate volatility. Finally, a first indicator of daily exchange rate volatility based on NN forecasting was introduced in Sosvilla-Rivero et al. (1999).

In finance, volatility is associated with the risk of a specific prediction. For instance, for GARCH models, volatility is associated with the risk of an ARMA-GARCH prediction. In this study, the volatility of an observation in a series is associated with the risk of an NN prediction.

This paper proceeds as follows. Section 2 introduces the GARCH models, Section 3 presents the nearest-neighbor technique and Section 4 describes a new volatility measure based on NNs. Section 5 gives empirical results for some GARCH-family models and NN predictions and a forecast evaluation of the different models. The final section provides a brief summary and conclusions.

2. GARCH MODEL

Consider a stock market index $I_t$ and its continuous rate of return $x_t$, which we have constructed as $x_t = \log(I_t/I_{t-1})$. The index $t$ denotes daily closing observations. The conditional distribution of the series of disturbances, which follows a GARCH process, can be written as
$$\varepsilon_t \mid \Omega_{t-1} \sim N(0, h_t),$$
where $\Omega_{t-1}$ denotes all available information at time $t-1$.

The regression model for the series $x_t$ can be written as
$$\phi_s(L)\, x_t = \mu + \theta_r(L)\, \varepsilon_t, \qquad \phi_s(L) = 1 - \phi_1 L - \dots - \phi_s L^s, \qquad \theta_r(L) = 1 + \theta_1 L + \dots + \theta_r L^r,$$
$$\varepsilon_t = z_t \sqrt{h_t}, \qquad z_t \sim N(0,1), \qquad (1)$$
where $L$ is the backward shift operator. The parameter $\mu$ reflects a constant term, which in practice is typically estimated to be close or equal to zero. The orders $s$ and $r$ identify the terms of the ARMA($s$,$r$) stochastic process, and we assume that the error term is heteroskedastic. In this sense, the conditional variance $h_t$ can be written as
$$h_t = \alpha_0 + \sum_{i=1}^{q} \alpha_i \varepsilon_{t-i}^2 + \sum_{j=1}^{p} \beta_j h_{t-j}, \qquad (2)$$
where $p \ge 0$, $q > 0$ and $\alpha_0 > 0$, $\alpha_i \ge 0$, $\beta_j \ge 0$ for a non-negative GARCH($p$,$q$) process. The GARCH($p$,$q$) model reduces to the ARCH model when $p = 0$, and at least one of the ARCH parameters must be non-zero ($q > 0$). Also, the GARCH parameters are restricted by imposing a stationary unconditional variance. This occurs when $\sum_{i=1}^{q}\alpha_i + \sum_{j=1}^{p}\beta_j < 1$, which implies that the mean, variance and autocovariances are finite and constant over time. When $\sum_{i=1}^{q}\alpha_i + \sum_{j=1}^{p}\beta_j = 1$, the unconditional variance does not exist. It is nevertheless interesting that the integrated GARCH, or IGARCH, model can be strongly stationary.
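As a minimal sketch of the recursion in (2), assuming the GARCH(1,1) case with illustrative parameter values, the conditional variance path and the Gaussian log-likelihood used later in the estimation can be computed as follows. The function names and the initialization of $h_0$ at the unconditional variance are our own choices, not part of the paper.

```python
import numpy as np

def garch11_variance(eps, a0, a1, b1):
    """Conditional variance recursion h_t = a0 + a1*eps_{t-1}^2 + b1*h_{t-1};
    h_0 is started at the unconditional variance a0/(1 - a1 - b1), which
    requires a1 + b1 < 1 (the stationarity condition in the text)."""
    h = np.empty(len(eps))
    h[0] = a0 / (1.0 - a1 - b1)
    for t in range(1, len(eps)):
        h[t] = a0 + a1 * eps[t - 1] ** 2 + b1 * h[t - 1]
    return h

def gaussian_loglik(eps, h):
    """Gaussian log-likelihood, apart from a constant: -0.5*sum(log h_t + eps_t^2/h_t)."""
    return -0.5 * np.sum(np.log(h) + eps ** 2 / h)

# Simulate a GARCH(1,1) series and evaluate the likelihood at the true parameters.
rng = np.random.default_rng(1)
a0, a1, b1, T = 1e-4, 0.2, 0.7, 1000
eps, h_t = np.empty(T), a0 / (1 - a1 - b1)
for t in range(T):
    eps[t] = rng.normal() * np.sqrt(h_t)
    h_t = a0 + a1 * eps[t] ** 2 + b1 * h_t
print(gaussian_loglik(eps, garch11_variance(eps, a0, a1, b1)))
```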
Another specification is the exponential GARCH proposed by Nelson (1991). A simple specification of this model is the EGARCH(1,1),
$$\log h_t = \omega + \beta \log h_{t-1} + \alpha\left(\frac{|\varepsilon_{t-1}|}{\sqrt{h_{t-1}}} - \sqrt{2/\pi}\right) + \gamma \frac{\varepsilon_{t-1}}{\sqrt{h_{t-1}}}, \qquad (3)$$
where $\omega > 0$, $0 < \beta < 1$, $0 < \alpha < 1$ and $\gamma < 0$. The parameter $\beta$ is the volatility persistence and $\gamma$ reflects an asymmetric effect, or leverage. If $\gamma$ is strictly negative, positive return shocks produce less volatility than negative ones. Note that the left-hand side is the log of the conditional variance. This implies that the leverage effect is exponential, rather than quadratic, and that forecasts of the conditional variance are guaranteed to be non-negative. The presence of leverage effects can be tested by the hypothesis $\gamma < 0$; the impact is asymmetric if $\gamma \ne 0$.

Other models that we employ are the GARCH-M and EGARCH-M formulations. These are characterized by the introduction of the conditional standard deviation into model (1). Reformulating (1), we have
$$\phi_s(L)\, x_t = \mu + \delta \sqrt{h_t} + \theta_r(L)\, \varepsilon_t, \qquad \varepsilon_t = z_t\sqrt{h_t}, \qquad z_t \sim N(0,1), \qquad (4)$$
and by substituting (2) or (3) into equation (4) we obtain the GARCH($p$,$q$)-M and EGARCH($p$,$q$)-M specifications.

The estimation of (1) and (2), (1) and (3), or (4) is performed using the Berndt, Hall, Hall and Hausman (1974) algorithm (hereafter BHHH) and the Bollerslev and Wooldridge (1992b) procedure for heteroskedasticity-consistent covariances. Assuming a conditionally normal error distribution, the log-likelihood (apart from a constant) is
$$L = \sum_{t=1}^{T} L_t = -\frac{1}{2}\sum_{t=1}^{T} \log h_t - \frac{1}{2}\sum_{t=1}^{T} \frac{\varepsilon_t^2}{h_t}.$$
However, Nelson's log-likelihood specification for the log conditional variances differs slightly from the specification above, because in Nelson's model the error is assumed to follow a generalized error distribution, whereas we assume normally distributed errors. Bollerslev et al. (1992a) and Bera and Higgins (1993) provide excellent surveys of GARCH-family models.

3. NEAREST-NEIGHBOR PREDICTIONS

Nearest-neighbor predictions are a short-run forecasting technique inspired by the literature on forecasting non-linear dynamical systems. The basic tool for NN predictions is the embedding of the series in a phase space. Given a series $\{x_t\}$ ($t = 1, 2, \dots, T$), in order to detect behavioral patterns in the series, segments of equal length are considered as vectors $x_t^d$ of $d$ consecutive observations sampled from the original time series, that is,
$$x_t^d = (x_t, x_{t-1}, \dots, x_{t-(d-1)}), \qquad t = d, d+1, \dots, T.$$
These $d$-dimensional vectors are often called $d$-histories; the parameter $d$ is referred to as the embedding dimension, while the $d$-dimensional space $\mathbb{R}^d$ is the phase space of the time series. The Takens (1981) embedding theorem establishes that, for a large enough embedding dimension $d$, if the original time series is sampled from a deterministic (perhaps chaotic) dynamical system, the trajectories of the $d$-histories $x_t^d$ mimic the data generation process. The proximity of two $d$-histories in the phase space $\mathbb{R}^d$ may be interpreted as similar dynamic behavior and allows us to refer to the "nearest neighbors" of a particular segment $x_t^d$ of the series.

Given the series $\{x_t\}$ ($t = 1, 2, \dots, T$), the prediction of the observation $x_{T+1}$ is generated by analyzing the historical paths of the last available $d$-history $x_T^d = (x_T, x_{T-1}, \dots, x_{T-(d-1)})$. To that end, segments
$$x_{t_1}^d,\; x_{t_2}^d,\; \dots,\; x_{t_k}^d \qquad (5)$$
with dynamic behavior similar to the last one in the series, $x_T^d$, are detected by seeking the $k$ vectors in the phase space $\mathbb{R}^d$ that maximize the function $\rho(x_t^d, x_T^d)$, as we see in Figure 1.

[Figure 1]

Therefore, the $k$ $d$-histories in (5) present the highest serial correlation with respect to the last $d$-history $x_T^d$. Once the NNs of $x_T^d$ have been established, the future short-term evolution of the time series is obtained by estimating the next observation $x_{T+1}$. The prediction $\hat{x}_{T+1}$ of $x_{T+1}$ can be obtained by extrapolating the observations
$$x_{t_1+1},\; x_{t_2+1},\; \dots,\; x_{t_k+1}, \qquad (6)$$
subsequent to the $k$ NN $d$-histories in (5), that is,
$$\hat{x}_{T+1} = F(x_{t_1+1}, x_{t_2+1}, \dots, x_{t_k+1}).$$
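The following sketch illustrates the neighbor-search step just described: the series is embedded in $d$-histories, the $k$ histories most highly correlated with the last available history $x_T^d$ are selected, and the observations that followed each of them (the arguments of $F(\cdot)$ in (6)) are returned. The embedding dimension $d$, the number of neighbors $k$ and the use of the linear correlation coefficient as the similarity criterion $\rho$ are illustrative assumptions, not parameter choices reported by the authors.

```python
import numpy as np

def nearest_neighbor_successors(x, d=5, k=10):
    """Embed the series in d-histories, select the k past histories most highly
    correlated with the last history x_T^d, and return the observations that
    followed each of them, together with their correlation scores."""
    x = np.asarray(x, dtype=float)
    T = len(x) - 1                                # index of the last observation
    last = x[T - d + 1: T + 1]                    # last d-history, in chronological order
    scores, successors = [], []
    for t in range(d - 1, T):                     # candidate histories ending before T
        hist = x[t - d + 1: t + 1]
        rho = np.corrcoef(hist, last)[0, 1]       # similarity criterion rho(x_t^d, x_T^d)
        scores.append(rho)
        successors.append(x[t + 1])               # observation following the history
    order = np.argsort(scores)[::-1][:k]          # k highest-correlation neighbors
    return np.array(successors)[order], np.array(scores)[order]
```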
The simplest determination of the function $F(\cdot)$ is a projection on each argument, that is,
$$\hat{x}_{T+1}^{\,elem} = x_{t_r+1}, \qquad r = 1, 2, \dots, k. \qquad (7)$$
Henceforth, we term this kind of ingenuous prediction an elementary prediction $\hat{x}_{T+1}^{\,elem}$. A better determination of the function $F(\cdot)$ consists of using the mean of the elementary predictions, that is,
$$\hat{x}_{T+1}^{\,bar} = \frac{1}{k}\sum_{r=1}^{k} x_{t_r+1}; \qquad (8)$$
for geometrical reasons, such a predictor $\hat{x}_{T+1}^{\,bar}$ is called the barycentric predictor (Fernández-Rodríguez and Sosvilla-Rivero, 1998).

When generating NN predictions, in order to obtain greater accuracy than the elementary or barycentric predictors, locally adjusted linear autoregressive (LALA hereafter) predictions are usually employed to fit the function $F(\cdot)$. This procedure involves regressing, by ordinary least squares, the future evolution of the $k$ nearest neighbors on their preceding $d$-histories, that is, regressing $x_{t_r+1}$ from (6) on $x_{t_r}^d = (x_{t_r}, x_{t_r-1}, \dots, x_{t_r-(d-1)})$ from (5), for $r = 1, \dots, k$. The fitted coefficients are then used to generate the prediction of $x_{T+1}$ as follows:
$$\hat{x}_{T+1}^{\,lala} = \hat{a}_0 x_T + \hat{a}_1 x_{T-1} + \dots + \hat{a}_{d-1} x_{T-(d-1)} + \hat{a}_d, \qquad (9)$$
where the $\hat{a}_i$ are the values of $a_i$ that minimize the expression
$$\sum_{r=1}^{k}\left(x_{t_r+1} - a_0 x_{t_r} - a_1 x_{t_r-1} - \dots - a_{d-1} x_{t_r-(d-1)} - a_d\right)^2.$$
Sugihara and May (1990), Casdagli (1992) and Casdagli and Weigend (1994) offer detailed descriptions of this kind of predictor. Other implementations of NN estimation (especially that of Cleveland and Devlin, 1988) advocate local weighting schemes that place greater weight on nearby observations when estimating the local linear regression. Although such weighting schemes have certain theoretical attractions, there are practical difficulties in their implementation. Wayland et al. (1994) showed that unweighted algorithms tend to yield superior results.

4. A NEW VOLATILITY MEASURE

In this study, the volatility of an observation in a series is associated with unpredictability in the sense of NN predictions. In non-linear dynamical systems, the basic idea behind NN predictions is that parts of a time series in the past might resemble other parts in the future. How can we predict and measure the unpredictability of an observation of the series in the NN sense? Let us consider two kinds of NN predictions: on the one hand, the elementary predictions $\hat{x}_{T+1}^{\,elem} = x_{t_r+1}$, $r = 1, 2, \dots, k$, given in (7); on the other hand, the locally adjusted linear autoregressive prediction $\hat{x}_{T+1}^{\,lala}$ given in (9).

The simplest way of predicting the volatility of $x_{T+1}$ is by considering the variance of its elementary predictions (7), that is,
$$v_{T+1}^{\,bar} = \frac{1}{k}\sum_{r=1}^{k}\left(x_{t_r+1} - \frac{1}{k}\sum_{r=1}^{k} x_{t_r+1}\right)^2 = \frac{1}{k}\sum_{r=1}^{k}\left(x_{t_r+1} - \hat{x}_{T+1}^{\,bar}\right)^2. \qquad (10)$$
Note that $v_{T+1}^{\,bar}$ predicts the volatility of the observation $x_{T+1}$ by comparing different predictions of it, namely the elementary predictions in (6) and the barycentric prediction in (8). If the elementary predictions are similar, volatility is low; if they differ considerably, volatility is high, because the observation $x_{T+1}$ appears unpredictable. Observe that $v_{T+1}^{\,bar}$ is a measure of the risk of the NN barycentric prediction (8). If we wish to measure the risk of a more sophisticated prediction such as $\hat{x}_{T+1}^{\,lala}$ in (9), following (10) we must consider the new volatility predictor
$$v_{T+1}^{\,lala} = \frac{1}{k}\sum_{r=1}^{k}\left(x_{t_r+1} - \hat{x}_{T+1}^{\,lala}\right)^2. \qquad (11)$$
Our new methodology for predicting volatility thus seeks patterns in the past of the series, given by (5), that behave similarly to its recent past (the last $d$-history $x_T^d$). Observe that in (10) and (11) volatility is predicted as the mean square of the differences between NN predictors.
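A minimal sketch of how the barycentric and LALA predictors (8)-(9) and the associated volatility measures (10)-(11) could be computed from the $k$ selected neighbors. The array layout and the function name are assumptions made for illustration; note that $k$ should exceed $d+1$ for the local regression to be well determined.

```python
import numpy as np

def nn_predictions_and_volatility(neighbor_histories, successors, last_history):
    """neighbor_histories: (k, d) array of the k nearest d-histories (eq. 5), each row
    ordered the same way as `last_history`; successors: length-k array of the
    observations following them (eq. 6); last_history: length-d array x_T^d.
    Returns the barycentric and LALA predictions (eqs. 8-9) and the volatility
    measures v_bar and v_lala (eqs. 10-11)."""
    k, d = neighbor_histories.shape               # k should be larger than d + 1
    x_bar = successors.mean()                     # barycentric predictor (8)
    # Locally adjusted linear autoregression: OLS of successors on histories plus a constant (9)
    X = np.column_stack([neighbor_histories, np.ones(k)])
    coeffs, *_ = np.linalg.lstsq(X, successors, rcond=None)
    x_lala = np.append(last_history, 1.0) @ coeffs
    v_bar = np.mean((successors - x_bar) ** 2)    # risk of the barycentric prediction (10)
    v_lala = np.mean((successors - x_lala) ** 2)  # risk of the LALA prediction (11)
    return x_bar, x_lala, v_bar, v_lala
```

In practice this routine would be combined with a neighbor-search step such as the one sketched in the previous section, applied once per observation to be forecast.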
As a final intuitive explanation, Figure 1 shows that the NNs detected in the past, given by (5), have a volatility similar to that of the last $d$-history $x_T^d$ of the series. If $x_T^d$ has low (high) volatility, the NNs selected in the past have low (high) volatility.

Finally, observe that our new measure of volatility is conceptually similar to the GARCH philosophy. In finance, volatility is associated with the risk of a specific prediction. So, for a prediction $\hat{x}_{T+1}$ of the random variable $x_{T+1}$, volatility is defined as
$$E\left[(x_{T+1} - \hat{x}_{T+1})^2\right].$$
For instance, for the GARCH models, volatility is associated with the risk of an ARMA-GARCH prediction and defined as
$$h_{T+1} = E\left[(x_{T+1} - \hat{x}_{T+1}^{\,\text{arma-garch}})^2\right],$$
while $h_t$ is estimated as $h_t = \alpha_0 + \alpha_1 \varepsilon_{t-1}^2 + \beta_1 h_{t-1}$. In this paper, the volatility of a series observation is associated with the risk of an NN prediction, the locally adjusted linear autoregressive (LALA) predictor, that is,
$$v_{T+1}^{\,lala} = E\left[(x_{T+1} - \hat{x}_{T+1}^{\,lala})^2\right].$$
In this case $v_{T+1}^{\,lala}$ is estimated by the expression
$$\frac{1}{k}\sum_{r=1}^{k}\left(x_{t_r+1} - \hat{x}_{T+1}^{\,lala}\right)^2.$$

5. FORECAST EVALUATION IN SIMULATED AND FINANCIAL TIME SERIES

5.1. Data and estimation period

The data used in this paper to compare GARCH and our NN volatility measure are simulated and real financial time series. The simulated series are generated by Gaussian white noise (GWN), GARCH, GARCH-M, EGARCH and EGARCH-M processes. The real series are daily stock market indices observed on the New York Stock Exchange. Table 1 reports the notation and a brief description of each data item. The data were collected from 2nd January 1962 to 31st January 1996 and provided by Data Disk Plus; these data are copyrighted by Finance & Technology Publishing, where a purchaser is licensed only for personal use of the data contained therein. In this study we consider the daily closing prices as the daily observations for all indices, but we also consider the daily high and low of the S&P 500 index.

[Table 1]

Some characteristics of the rates of return $x_t$ are given in Table 2. The number of observations is 8579 for all seven indices. The means and variances are quite small. The high kurtosis indicates fat-tailed distributions for these variables. The estimated skewness is either positive or negative and is large in absolute value.

[Table 2]

The volatility forecast period is the last 500 observations, that is, from 8th February 1994 to 31st January 1996. The GARCH and EGARCH models and the LALA model are forecast recursively one step ahead. We estimate models (1) and (2), (1) and (3) for the GARCH and EGARCH models, (4) for the GARCH-M and EGARCH-M models, and (9) and (11) for the LALA model and its variants with mean effects, using observations 1 to T (where T is 7th February 1994), and then obtain the forecast for the next period, T+1 (8th February 1994). In the next step, we estimate the model on observations 1 to T+1, including the realized value of the series at T+1. We thus estimate the model on a sample that includes the real rate of return for 8th February 1994, and then forecast the next observation. We repeat this process until 31st January 1996. In this way, for each series we construct two predicted volatility series, $h_t$ for the GARCH-family models and $v_t$ for the LALA model, and two error series, $\varepsilon_t^{\text{arma-garch}}$ and $\varepsilon_t^{\,lala}$, for the forecast period ($t = 1, \dots, 500$).
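As a sketch of the recursive scheme just described, the fragment below runs an expanding-window, one-step-ahead exercise with a nearest-neighbor forecast. For brevity it uses the barycentric predictor and the volatility measure (10); the LALA variant follows the same pattern. The values of $d$ and $k$, the function names and the 500-observation horizon are illustrative assumptions.

```python
import numpy as np

def one_step_nn_volatility(sample, d=5, k=10):
    """One-step-ahead barycentric NN forecast of the next observation and of its
    volatility (eq. 10), using only the information contained in `sample`."""
    T = len(sample) - 1
    last = sample[T - d + 1: T + 1]
    cand = list(range(d - 1, T))
    rho = [np.corrcoef(sample[t - d + 1: t + 1], last)[0, 1] for t in cand]
    nn = np.argsort(rho)[::-1][:k]
    succ = np.array([sample[t + 1] for t in np.array(cand)[nn]])
    return succ.mean(), np.mean((succ - succ.mean()) ** 2)   # prediction, volatility

def expanding_window_forecasts(x, n_forecast=500, d=5, k=10):
    """Section 5.1 scheme: estimate on observations 1..T, forecast T+1, add the
    realized value to the sample, and repeat over the last `n_forecast` points."""
    x = np.asarray(x, dtype=float)
    vol, sq_err = [], []
    for T in range(len(x) - n_forecast, len(x)):
        pred, v = one_step_nn_volatility(x[:T], d=d, k=k)
        vol.append(v)
        sq_err.append((x[T] - pred) ** 2)        # squared forecast error for the tests below
    return np.array(vol), np.array(sq_err)
```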
5.2. Bias test

By definition, the predictable component of volatility in a series is the conditional variance of that series. Like Pagan and Schwert (1990), we use the regression
$$\varepsilon_t^2 = \beta_0 + \beta_1 v_t + u_t \qquad (12)$$
to compare the volatility forecasting performance of the models. If the forecasts are unbiased, $\beta_0 = 0$ and $\beta_1 = 1$; estimates of $\beta_0$ and $\beta_1$ different from these values indicate bias in the model's predictions. The purpose of a bias test is to determine whether the forecasts are unbiased estimates of the actual series (squared errors in our case). In other words, the bias test determines whether model forecasts are systematically higher or lower than the actual squared errors.

A natural way to examine this forecasting performance is to estimate $\beta_1$ from simulated series. This experiment is replicated 1000 times by generating 255 Gaussian random series with different variances $v_t^{(i)}$ ($t = 1, 2, \dots, 255$ and $i = 1, 2, \dots, 1000$).⁴ For each replication $i$, a random observation $\varepsilon_t^{(i)}$ is selected from each series; we then compute $\beta_1^{(i)}$ from
$$\varepsilon_t^{2(i)} = \beta_0^{(i)} + \beta_1^{(i)} v_t^{(i)} + u_t^{(i)}, \qquad t = 1, 2, \dots, 255. \qquad (13)$$

⁴ These variances have been generated from a GARCH process ($\alpha_0$ = 0.0001, $\alpha_1$ = 0.2 and $\beta_1$ = 0.7).

Figure 2(a) shows that the estimates of $\beta_1^{(i)}$ in (13) are centered around one. The same simulation with GARCH series instead of Gaussian random series produces Figure 2(b).⁵ In this case $\beta_1^{(i)}$ is not centered around one; in fact, its mean is 0.872225. Similar results are obtained when different conditional variance parameters are used. Table 3 summarizes the distribution of $\beta_1^{(i)}$ for both cases, revealing substantial bias in the slope coefficient estimates in the GARCH case. It is apparent that volatility is downward biased by GARCH models.

⁵ The GARCH series have been generated using the same parameters as those used to generate the variances of the Gaussian random series.

[Figure 2]
[Table 3]

In Tables 4 and 5 we report the results of this test applied to the estimated errors of the above models. The regression equation is (12), where $v_t$ and $\varepsilon_t^2$ are taken from an ARMA-GARCH model (the series $h_t$ and the squares of $\varepsilon_t^{\text{arma-garch}}$) and from the LALA model (the series $v_t$ and the squares of $\varepsilon_t^{\,lala}$), respectively.⁶

⁶ Computational problems arose in estimating the ARMA-GARCH model for the series DJBA, GWN and the simulated EGARCH-M series. However, we had no such problems with the NN-based procedures, which provide good results for these series.

[Table 4]

Table 4 shows that, in the case of the GARCH-family models, the hypothesis of unbiasedness is rejected in all cases. For the LALA model forecasts (Table 5), the hypothesis of absence of bias was not rejected at the 5% level for the series DJBA, DJIA, DJTA, DJUA and SP500C, or for any of the simulated series, but was rejected at the 5% level for the series SP500H and SP500L. In terms of bias, the LALA model's performance is clearly superior.

It is interesting to examine the shape of the bias in the cases where it is significant. In the GARCH-family models, the intercepts and slopes are significantly different from zero and one, respectively. The slope coefficients are significantly greater than one, which indicates that the GARCH-family models (Table 4) have a tendency to underestimate the magnitude of volatility. Finally, in the case of the LALA model (Table 5), the SP500L intercept is significantly different from zero but its slope is not significantly different from one. This indicates that the bias is constant and does not vary with the level of the forecast.

[Table 5]
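A sketch, under stated assumptions, of the bias regression (12) with Newey-West standard errors and a Wald test of $(\beta_0, \beta_1) = (0, 1)$, using the statsmodels OLS interface; the HAC lag length and the function name are illustrative choices rather than values taken from the paper.

```python
import numpy as np
import statsmodels.api as sm
from scipy import stats

def bias_test(sq_errors, vol_forecasts, maxlags=5):
    """Pagan-Schwert style regression eps_t^2 = b0 + b1*v_t + u_t with
    heteroskedasticity- and autocorrelation-consistent (Newey-West) standard
    errors, plus a Wald chi-square test of the unbiasedness restriction
    (b0, b1) = (0, 1)."""
    y = np.asarray(sq_errors, dtype=float)
    X = sm.add_constant(np.asarray(vol_forecasts, dtype=float))
    res = sm.OLS(y, X).fit(cov_type="HAC", cov_kwds={"maxlags": maxlags})
    b, V = res.params, res.cov_params()
    r = np.array([b[0] - 0.0, b[1] - 1.0])        # deviations from unbiasedness
    wald = float(r @ np.linalg.inv(V) @ r)        # chi-square with 2 degrees of freedom
    p_value = 1.0 - stats.chi2.cdf(wald, df=2)
    return res.params, res.bse, wald, p_value
```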
5.3. Informational content test

To determine whether the forecasts generated by the alternative models (GARCH family and LALA) contain additional information, we used the informational content test developed by Fair and Shiller (1989, 1990), which involves running regressions of the squared errors of each model on paired volatility forecasts. The regression is
$$\varepsilon_t^2 = \alpha_0 + \alpha_1 v_t + \alpha_2 h_t + u_t. \qquad (14)$$
We tested the null hypothesis of no information in either forecast, $H_0$: $\alpha_1 = 0$ and $\alpha_2 = 0$, against the alternative that both coefficients are non-zero or that at least one is non-zero. If both $\alpha_1$ and $\alpha_2$ are zero, then neither forecast, $v_t$ nor $h_t$, contains significant information with which volatility can be estimated. If only $\alpha_1$ ($\alpha_2$) is non-zero, then $v_t$ ($h_t$) contains significant information for estimating volatility and no additional independent information is found in $h_t$ ($v_t$). Finally, if both $\alpha_1$ and $\alpha_2$ are non-zero, significant information is revealed in both series and the two information sets are independent.

Results from this test are presented in Tables 6 and 7. In Table 6, equation (14) is estimated using the square of $\varepsilon_t^{\text{arma-garch}}$ as the endogenous variable, whereas in Table 7 it is estimated using the square of $\varepsilon_t^{\,lala}$ as the endogenous variable.

Table 6 shows that, for DJIA, DJTA, DJUA, SP500C, SP500H and all the simulated series, the coefficients associated with $v_t$ and $h_t$ are not significantly different from zero over the forecast period from 8th February 1994 to 31st January 1996. Hence, $v_t$ and $h_t$ do not contain information that explains the squared errors generated by the GARCH-family models. However, for SP500L, $\alpha_2$ is significantly negative over the 500-day forecast period, so the $h_t$ generated by the GARCH-family models does contain information that explains $\varepsilon_t^2$. The negative coefficient implies that $h_t$ is negatively correlated with $\varepsilon_t^2$, a perverse result in economic terms.

Table 7 shows that in all the real series the $\alpha_1$ coefficient is significantly positive, whereas $\alpha_2$ is not significantly different from zero. Hence, the LALA model dominated the GARCH-family models over the 500-day forecast period. In the case of the simulated series, $\alpha_1$ and $\alpha_2$ are significantly positive, except for the EGARCH series, where $\alpha_2$ is not significantly different from zero. This result should be interpreted carefully: if both coefficients are non-zero, significant information is revealed in $v_t$ and $h_t$ and the two information sets are independent. There is no statistical explanation for the fact that $h_t$ contains information beyond that found in $v_t$.

[Table 6]
[Table 7]
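Analogously, the encompassing regression (14) can be sketched as follows, again with HAC standard errors; a significant $\alpha_1$ ($\alpha_2$) points to informational content in the NN (GARCH) forecast. Function and variable names, and the lag length, are illustrative assumptions.

```python
import numpy as np
import statsmodels.api as sm

def informational_content_test(sq_errors, v_nn, h_garch, maxlags=5):
    """Fair-Shiller style encompassing regression (eq. 14):
    eps_t^2 = a0 + a1*v_t + a2*h_t + u_t, estimated by OLS with Newey-West
    standard errors. Returns the coefficients with their t-values and p-values."""
    y = np.asarray(sq_errors, dtype=float)
    X = sm.add_constant(np.column_stack([v_nn, h_garch]))
    res = sm.OLS(y, X).fit(cov_type="HAC", cov_kwds={"maxlags": maxlags})
    return res.params, res.tvalues, res.pvalues
```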
6. CONCLUSION

The purpose of this paper is to analyze the volatility forecasting accuracy of GARCH-family and NN models for indices obtained from the New York Stock Exchange (DJIA, DJTA, DJBA, DJUA, SP500H, SP500L, SP500C) and for simulated series. The NN models are based on the non-parametric methods developed originally by Stone (1977) and Cleveland (1979), and used by Farmer and Sidorowich (1987) to predict chaotic time series. For the financial series, the in-sample estimation period was 2 January 1962 to 7 February 1994 and the out-of-sample period was 8 February 1994 to 31 January 1996. One-step-ahead forecasts were generated for both the financial and the simulated series. The performance criteria included bias tests and informational content tests.

The out-of-sample forecasting results indicated a sharp difference between the performance of the GARCH-family models and the NN model. Forecasts from the GARCH-family models were, in most cases, biased and exhibited no significant informational content, except for the GARCH, GARCH-M and EGARCH simulated series when the squared errors were generated from the NN model. In contrast, the NN model forecasts were generally unbiased and in all cases showed significant informational content when the squared errors were generated from the NN model. Our results show that the NN model improves considerably on the out-of-sample volatility forecasting performance of the GARCH-family models.

REFERENCES

Bajo-Rubio, O., Fernández-Rodríguez, F. and Sosvilla-Rivero, S. (1992), 'Chaotic behaviour in exchange-rate series: First results for the Peseta-U.S. Dollar case', Economics Letters 39, 207-211.
Bera, A. and Higgins, M.L. (1993), 'ARCH models: Properties, estimation and testing', Journal of Economic Surveys 7, 305-366.
Berndt, E., Hall, B., Hall, R. and Hausman, J. (1974), 'Estimation and inference in nonlinear structural models', Annals of Economic and Social Measurement 4, 653-665.
Black, F. (1976), 'Studies in stock price volatility', American Statistical Association, Proceedings of the 1976 Business Meeting of the Business and Economic Statistics Section, 177-181.
Bollerslev, T. (1986), 'Generalized autoregressive conditional heteroskedasticity', Journal of Econometrics 31, 307-327.
Bollerslev, T., Chou, R. and Kroner, K. (1992a), 'ARCH modeling in finance: A review of the theory and empirical evidence', Journal of Econometrics 32, 5-59.
Bollerslev, T. and Wooldridge, J. (1992b), 'Quasi-maximum likelihood estimation and inference in dynamic models with time-varying covariances', Econometric Reviews 11, 143-172.
Casdagli, M. (1992), 'Chaos and deterministic versus stochastic non-linear modelling', Journal of the Royal Statistical Society, Series B, 54, 303-328.
Casdagli, M. and Weigend, A.S. (1994), 'Exploring the continuum between deterministic and stochastic modelling', in Weigend, A.S. and Gershenfeld, N.A. (eds), Time Series Prediction: Forecasting the Future and Understanding the Past, Addison-Wesley, Reading, MA.
Cleveland, W.S. (1979), 'Robust locally weighted regression and smoothing scatterplots', Journal of the American Statistical Association 74, 829-836.
Cleveland, W.S. and Devlin, S.J. (1988), 'Locally weighted regression: An approach to regression analysis by local fitting', Journal of the American Statistical Association 83, 596-610.
Clyde, W.C. and Osler, C.L. (1997), 'Charting: Chaos theory in disguise?', Journal of Futures Markets 17, 489-514.
Diebold, F.X. and Nason, J.A. (1990), 'Nonparametric exchange rate predictions', Journal of International Economics 28, 315-332.
Engle, R.F. (1982), 'Autoregressive conditional heteroskedasticity with estimates of the variance of U.K. inflation', Econometrica 50, 987-1007.
Fair, R.C. and Shiller, R.J. (1989), 'The informational content of ex ante forecasts', The Review of Economics and Statistics 27, 325-331.
Fair, R.C. and Shiller, R.J. (1990), 'Comparing information in forecasts from econometric models', The American Economic Review 80, 375-389.
Farmer, D. and Sidorowich, J. (1987), 'Predicting chaotic time series', Physical Review Letters 59, 845-848.
French, K., Schwert, G. and Stambaugh, R. (1987), 'Expected stock returns and volatility', Journal of Financial Economics 19, 3-29.
Fernández-Rodríguez, F. and Sosvilla-Rivero, S. (1998), 'Testing nonlinear forecastability in time series: Theory and evidence from the EMS', Economics Letters 59, 49-63.
Fernández-Rodríguez, F., Sosvilla-Rivero, S. and Andrada-Félix, J. (1999), 'Exchange-rate forecasts with simultaneous nearest-neighbour methods: Evidence from the EMS', International Journal of Forecasting 15, 383-392.
Härdle, W. and Linton, O. (1994), 'Applied nonparametric methods', in Engle, R.F. and McFadden, D. (eds), Handbook of Econometrics, Vol. 4, Elsevier, Amsterdam.
Lisi, F. and Medio, A. (1997), 'Is a random walk the best exchange rate predictor?', International Journal of Forecasting 13, 255-267.
Lo, A.W. and MacKinlay, A.C. (1999), A Non-Random Walk Down Wall Street, Princeton University Press, Princeton, NJ.
Mizrach, B. (1992), 'Multivariate nearest-neighbor forecasts of EMS exchange rates', Journal of Applied Econometrics 7, S151-S163.
Nelson, D.B. (1991), 'Conditional heteroskedasticity in asset returns: A new approach', Econometrica 59, 347-370.
Newey, W.K. and West, K.D. (1987), 'A simple, positive semi-definite, heteroskedasticity and autocorrelation consistent covariance matrix', Econometrica 55, 703-708.
Pagan, A.R. and Schwert, G.W. (1990), 'Alternative models for conditional stock volatility', Journal of Econometrics 45, 267-290.
Philippatos, G.C. and Wilson, C.J. (1972), 'Entropy, market risk, and the selection of efficient portfolios', Applied Economics 4, 209-220.
Philippatos, G.C. and Wilson, C.J. (1974), 'Entropy, market risk, and the selection of efficient portfolios: Reply', Applied Economics 6, 77-81.
Schwert, G. (1989), 'Why does stock market volatility change over time?', Journal of Finance 44, 1115-1153.
Schwert, G. (1990), 'Stock volatility and the crash of '87', Review of Financial Studies 3, 77-102.
Soofi, A.S. and Cao, L. (1999), 'Nonlinear deterministic forecasting of daily peseta-dollar exchange rate', Economics Letters 62, 175-180.
Sosvilla-Rivero, S., Fernández-Rodríguez, F. and Bajo-Rubio, O. (1999), 'Exchange rate volatility in the EMS before and after the fall', Applied Economics Letters 6, 717-722.
Stone, C.J. (1977), 'Consistent nonparametric regression', Annals of Statistics 5, 595-620.
Sugihara, G. and May, R.M. (1990), 'Nonlinear forecasting as a way of distinguishing chaos from measurement error in time series', Nature 344, 734-741.
Takens, F. (1981), 'Detecting strange attractors in turbulence', in Rand, D.A. and Young, L.S. (eds), Lecture Notes in Mathematics 898, Springer-Verlag, New York.
Wayland, R., Pickett, D., Bromley, D. and Passamante, A. (1994), 'Measuring spatial spreading in recurrent time series', Physica D 79, 320-334.

Table 1. Descriptions of the stock exchange index time series.

Index    Explanation
DJBA     Close for the Dow Jones 20 Bond Average.
DJIA     Close for the Dow Jones Industrial Average.
DJTA     Close for the Dow Jones Transportation Average.
DJUA     Close for the Dow Jones Utility Average.
SP500H   High for the S&P 500 index.
SP500L   Low for the S&P 500 index.
SP500C   Close for the S&P 500 index.

Table 2. Descriptive statistics.

Statistic        DJBA        DJIA        DJTA        DJUA        SP500H      SP500L      SP500C
Mean             0.000027    0.000234    0.000304    0.0000687   0.000254    0.000255    0.000256
Median           0.000000    0.000264    0.000180    0.0000000   0.000327    0.000432    0.000338
Maximum          0.164649    0.096662    0.072551    0.080016    0.057217    0.101400    0.087089
Minimum         -0.058005   -0.256509   -0.192361   -0.166481   -0.140601   -0.224859   -0.228997
Std. deviation   0.003092    0.009276    0.010383    0.007067    0.008111    0.008701    0.008864
Skewness        16.91432    -2.390666   -0.759212   -1.362277   -0.667659   -1.899351   -2.101220
Kurtosis       961.9340     76.24175    20.81961    46.13572    17.61075    62.60098    60.60433
Jarque-Bera      3.29e+08    1925480.0   114317.6    667695.0    76936.44    1274801.0   1192313.0
Probability      0.000000    0.000000    0.000000    0.000000    0.000000    0.000000    0.000000
Observations     8578        8578        8578        8578        8578        8578        8578
Period: 2-Jan-1962 to 31-Jan-1996 for all series.
Table 3. Descriptive statistics for β1(i).

Statistic      β1(i) (Gaussian case)   β1(i) (GARCH case)
Mean           1.003546                0.872225
Median         0.961091                0.871340
Maximum        2.738856                1.381036
Minimum        0.292862                0.438071
Std. Dev.      0.345485                0.137213
Skewness       0.705542                0.034237
Kurtosis       3.726891                2.891560
Observations   1000                    1000

Table 4. Pagan and Schwert (1990) bias test. GARCH models.

Panel (A): Financial series
Parameter   DJIA        DJTA        DJUA        SP500C      SP500H      SP500L
R2          0.17        0.22        0.25        0.24        0.20        0.25
β0 (a)     -0.000039   -0.00010    -0.000025   -0.00039    -0.000045   -0.000048
           (-3.97)     (-4.93)     (-1.93)     (-3.86)     (-5.19)     (-3.55)
           [0.000]     [0.000]     [0.027]     [0.000]     [0.000]     [0.000]
β1 (a)      1.6849      2.0301      1.3329      1.7162      2.1228      1.9412
           (2.19)      (3.93)      (1.25)      (2.53)      (3.75)      (2.53)
           [0.015]     [0.000]     [0.106]     [0.006]     [0.000]     [0.006]
Wald (b)   33.950      43.181      13.456      80.702     109.13      78.610
           [0.000]     [0.000]     [0.001]     [0.000]     [0.000]     [0.000]

Panel (B): Simulated series
Parameter   GARCH       GARCH-M     EGARCH
R2          0.53        0.54        0.0046
β0 (a)     -0.00475    -0.00427    -0.000642
           (-4.31)     (-6.36)     (-0.06)
           [0.000]     [0.000]     [0.476]
β1 (a)      1.9012      1.8568      0.02048
           (3.53)      (4.84)      (-64.09)
           [0.000]     [0.000]     [0.000]
Wald (b)   30.360      68.479      588769
           [0.000]     [0.000]     [0.000]

Note: To correct for heteroskedasticity and autocorrelation in the error term, a heteroskedasticity- and autocorrelation-consistent covariance matrix (Newey and West, 1987) is used to estimate the standard errors of the coefficients in equation (12). (a) t-values, in parentheses, are calculated for the null hypothesis that the corresponding coefficient equals zero; p-values are in brackets. (b) Wald χ2 statistics test the null hypothesis (β0, β1) = (0, 1); the p-value is in brackets.
Table 5. Pagan and Schwert (1990) bias test. NN method.

Panel (A): Financial series
Parameter   DJBA        DJIA        DJTA        DJUA        SP500C      SP500H      SP500L
npp         6           6           10          8           5           6           10
R2          0.98        0.80        0.89        0.99        0.67        0.85        0.92
β0 (a)      0.0000      0.0000      0.0000      0.0000      0.0000      0.0000      0.0000
           (-0.499)    (0.388)     (0.878)     (0.354)     (0.121)     (-2.403)    (-5.440)
           [0.618]     [0.700]     [0.380]     [0.724]     [0.904]     [0.017]     [0.000]
β1 (a)      0.9843      0.8997      0.9740      1.0004      0.9284      0.9984      1.0126
           (0.756)     (1.741)     (0.573)     (0.131)     (0.398)     (0.024)     (0.343)
           [0.450]     [0.082]     [0.567]     [0.896]     [0.691]     [0.981]     [0.731]
Wald (b)    0.620       3.751       0.997       0.243       0.584      10.738      30.699
           [0.734]     [0.153]     [0.607]     [0.885]     [0.747]     [0.005]     [0.000]

Panel (B): Simulated series
Parameter   GWN         GARCH       GARCH-M     EGARCH      EGARCH-M
npp         8           11          12          12          11
R2          0.99        0.68        0.24        0.41        0.42
β0 (a)     -7.4654      0.0004      0.0010      0.0025      0.0020
           (-1.666)    (0.361)     (0.734)     (0.581)     (0.340)
           [0.096]     [0.718]     [0.463]     [0.561]     [0.734]
β1 (a)      1.0001      1.0204      0.9214      0.9974      1.0135
           (0.074)     (0.165)     (0.369)     (0.011)     (0.045)
           [0.941]     [0.869]     [0.712]     [0.991]     [0.964]
Wald (b)    3.413       0.879       0.859       4.742       3.129
           [0.181]     [0.645]     [0.651]     [0.093]     [0.209]

Note: To correct for heteroskedasticity and autocorrelation in the error term, a heteroskedasticity- and autocorrelation-consistent covariance matrix (Newey and West, 1987) is used to estimate the standard errors of the coefficients in equation (12). (a) t-values, in parentheses, are calculated for the null hypothesis that the corresponding coefficient equals zero; p-values are in brackets. (b) Wald χ2 statistics test the null hypothesis (β0, β1) = (0, 1); the p-value is in brackets.

Table 6. Informational content test. GARCH models.

Panel (A): Financial series
Parameter   DJIA        DJTA        DJUA        SP500C      SP500H      SP500L
R2          0.00        0.01        0.01        0.00        0.00        0.02
α1 (a)      0.1980      0.6746      0.0005      0.9011     -0.0984      0.7509
           (0.349)     (1.319)     (0.335)     (1.546)     (-0.253)    (1.588)
           [0.727]     [0.188]     [0.738]     [0.123]     [0.801]     [0.113]
α2 (a)    -11.0102     18.588     -20.8763    -18.1718    -35.3970    -47.8027
           (-0.550)    (1.055)     (-1.305)    (-0.943)    (-1.029)    (-2.848)
           [0.583]     [0.292]     [0.193]     [0.346]     [0.304]     [0.005]
Wald (b)    0.470       3.616       1.793       2.780       1.071       8.790
           [0.790]     [0.164]     [0.408]     [0.249]     [0.585]     [0.012]

Panel (B): Simulated series
Parameter   GARCH       GARCH-M     EGARCH
R2          0.00        0.01        0.00
α1 (a)      0.1795     -0.9142     -0.0781
           (0.562)     (-1.902)    (-0.370)
           [0.574]     [0.058]     [0.712]
α2 (a)      0.2118      2.2362      0.0210
           (0.084)     (1.079)     (0.233)
           [0.933]     [0.281]     [0.816]
Wald (b)    1.629       3.629       0.228
           [0.443]     [0.163]     [0.892]

Note: To correct for heteroskedasticity and autocorrelation in the error term, a heteroskedasticity- and autocorrelation-consistent covariance matrix (Newey and West, 1987) is used to estimate the standard errors of the coefficients in equation (14). (a) t-values, in parentheses, are calculated for the null hypothesis that the corresponding coefficient equals zero; p-values are in brackets. (b) Wald χ2 statistics test the null hypothesis (α1, α2) = (0, 0); the p-value is in brackets.
Table 7. Informational content test. NN method.

Panel (A): Financial series
Parameter   DJIA        DJTA        DJUA        SP500C      SP500H      SP500L
R2          0.81        0.89        0.99        0.68        0.85        0.92
α1 (a)      0.8977      0.9728      1.0004      0.9196      0.9991      1.0150
           (15.153)    (21.469)    (341.509)   (5.102)     (15.182)    (28.984)
           [0.000]     [0.000]     [0.000]     [0.000]     [0.000]     [0.000]
α2 (a)      0.4883      0.1992      0.1695      1.1619      0.2455     -0.2995
           (0.638)     (0.402)     (0.066)     (1.580)     (0.4061)    (-0.540)
           [0.524]     [0.688]     [0.947]     [0.115]     [0.685]     [0.590]
Wald (b)  273.739     460.931    116721.8     32.645     248.499     962.836
           [0.000]     [0.000]     [0.000]     [0.000]     [0.000]     [0.000]

Panel (B): Simulated series
Parameter   GARCH       GARCH-M     EGARCH
R2          0.69        0.31        0.41
α1 (a)      0.9353      0.7036      0.9974
           (7.591)     (3.4338)    (4.100)
           [0.000]     [0.001]     [0.000]
α2 (a)      1.0234      1.4244     -0.0126
           (3.1110)    (5.032)     (-0.582)
           [0.002]     [0.000]     [0.561]
Wald (b)   87.775      46.583      18.084
           [0.000]     [0.000]     [0.000]

Note: To correct for heteroskedasticity and autocorrelation in the error term, a heteroskedasticity- and autocorrelation-consistent covariance matrix (Newey and West, 1987) is used to estimate the standard errors of the coefficients in equation (14). (a) t-values, in parentheses, are calculated for the null hypothesis that the corresponding coefficient equals zero; p-values are in brackets. (b) Wald χ2 statistics test the null hypothesis (α1, α2) = (0, 0); the p-value is in brackets.

Figure 1. Graphical interpretation of NNs.

Figure 2(a). Slopes β1(i) for the Gaussian random series.
Figure 2(b). Slopes β1(i) for the GARCH series.