Variation, jumps, market frictions and high frequency data in financial econometrics*

Ole E. Barndorff-Nielsen
The T.N. Thiele Centre for Mathematics in Natural Science, Department of Mathematical Sciences, University of Aarhus, Ny Munkegade, DK-8000 Aarhus C, Denmark
[email protected]

Neil Shephard
Nuffield College, Oxford OX1 1NF, UK
[email protected]

July 14, 2005

1 Introduction

We will review the econometrics of non-parametric estimation of the components of the variation of asset prices. This very active literature has been stimulated by the recent advent of complete records of transaction prices, quote data and order books. In our view the interaction of the new data sources with new econometric methodology is leading to a paradigm shift in one of the most important areas in econometrics: volatility measurement, modelling and forecasting. We will describe this new paradigm, which draws together econometrics with arbitrage-free financial economics theory. Perhaps the two most influential papers in this area have been Andersen, Bollerslev, Diebold, and Labys (2001) and Barndorff-Nielsen and Shephard (2002), but many other papers have made important contributions. This work is likely to have deep impacts on the econometrics of asset allocation and risk management. One of our observations will be that inferences based on these methods, computed from observed market prices and so under the physical measure, are also valid as inferences under all equivalent measures. This puts the subject at the heart of the econometrics of derivative pricing.

* Prepared for the invited symposium on Financial Econometrics, 9th World Congress of the Econometric Society, London, 20th August 2005. We are grateful to Tim Bollerslev, Eric Ghysels, Peter Hansen, Jean Jacod, Dmitry Kulikov, Jeremy Large, Asger Lunde, Andrew Patton, Mark Podolskij, Kevin Sheppard and Jun Yu for comments on an earlier draft.
Talks based on this paper were also given in 2005 as the Hiemstra Lecture at the 13th Annual Conference of the Society for Nonlinear Dynamics and Econometrics in London, the keynote address at the 3rd Nordic Econometric Meeting in Helsinki and as a Special Invited Lecture at the 25th European Meeting of Statisticians in Oslo. Ole Barndorff-Nielsen's work is supported by CAF (www.caf.dk), which is funded by the Danish Social Science Research Council. Neil Shephard's research is supported by the UK's ESRC through the grant "High frequency financial econometrics based upon power variation."

One of the most challenging problems in this context is dealing with various forms of market frictions, which obscure the efficient price from the econometrician. Here we will characterise four types of statistical models of frictions and discuss how econometricians have been attempting to overcome them.

In section 2 we will set out the basis of the econometrics of arbitrage-free price processes, focusing on the centrality of quadratic variation. In section 3 we will discuss central limit theorems for estimators of the QV process, while in section 4 the role of jumps in QV will be highlighted, with bipower and multipower variation being used to identify them and to test the hypothesis that there are no jumps in the price process. In section 5 we write about the econometrics of market frictions, while in section 6 we conclude.

2 Arbitrage-free, frictionless price processes

2.1 Semimartingales and quadratic variation

Given a complete record of transaction or quote prices it is natural to model prices in continuous time (e.g. Engle (2000)). This matches the vast continuous-time financial economic arbitrage-free theory based on a frictionless market. In this section and the next, we will discuss how to make inferences on the degree of variation in such frictionless worlds.
Section 5 will extend this by characterising the types of frictions seen in practice and discussing the strategies econometricians have been using to overcome these difficulties.

In its most general case the fundamental theory of asset prices says that a vector of log-prices at time t,

    Y_t = (Y_t^1, ..., Y_t^p)',

must obey a semimartingale process (written Y ∈ SM) on some filtered probability space (Ω, F, (F_t)_{t≥0}, P) in a frictionless market. A semimartingale is defined as a process which can be written as

    Y = A + M,                                                              (1)

where A is a local finite variation process (A ∈ FV_loc) and M is a local martingale (M ∈ M_loc). Compact introductions to the economics and mathematics of semimartingales are given in Back (1991) and Protter (2004), respectively.

The Y process can exhibit jumps. It is tempting to decompose Y = Y^ct + Y^d, where Y^ct and Y^d are the purely continuous and discontinuous sample path components of Y. However, technically this definition is not clear, as the jumps of the Y process can be so active that they cannot be summed up. Thus we will define Y^ct = A^c + M^c, where M^c is the continuous part of the local martingale component of Y and A^c is A minus the sum of its jumps.[1] Likewise, the continuous sample path subsets of classes of processes such as SM and M will be denoted by SM^c and M^c.

Crucial to semimartingales, and to the economics of financial risk, is the quadratic variation (QV) process of (Y', X')' ∈ SM. This can be defined as

    [Y, X]_t = plim_{n→∞} Σ_{j=1, t_j≤t} (Y_{t_j} − Y_{t_{j−1}})(X_{t_j} − X_{t_{j−1}})',    (2)

(e.g. Protter (2004, p. 66–77)) for any deterministic sequence[2] of partitions 0 = t_0 < t_1 < ... < t_n = T with sup_j {t_{j+1} − t_j} → 0 as n → ∞. The convergence is also locally uniform in time. It can be shown that this probability limit exists for all semimartingales. Throughout we employ the notation [Y]_t = [Y, Y]_t, while we will sometimes refer to √([Y^l]_t) as the quadratic volatility (QVol) process for Y^l. It is well known that[3]

    [Y] = [Y^ct] + [Y^d],   where   [Y^d]_t = Σ_{0≤u≤t} ΔY_u ΔY_u',          (3)

with ΔY_t = Y_t − Y_{t−} the jumps in Y, and noting that [A^ct] = 0. In the probability literature QV is usually defined in a different, but equivalent, manner (see, for example, Protter (2004, p. 66)):

    [Y]_t = Y_t Y_t' − 2 ∫_0^t Y_{u−} dY_u'.                                 (4)

[1] It is tempting to use the notation Y^c for Y^ct, but in the probability literature if Y ∈ SM then Y^c = M^c, so Y^c ignores A^c.
[2] The assumption that the times are deterministic can be relaxed to allow them to be any Riemann sequence of adapted subdivisions. This is discussed in, for example, Jacod and Shiryaev (2003, p. 51). Economically this is important, for it means that we can also think of the limiting argument as the result of a joint process of Y and a counting process N whose arrival times are the t_j. So long as Y and N are adapted to at least their bivariate natural filtration, the limiting argument holds as the intensity of N increases off to infinity with n.
[3] Although the sum of jumps of Y does not exist in general when Y ∈ SM, the sum of outer products of the jumps always does exist. Hence [Y^d] can be properly defined.

2.2 Brownian semimartingales

In economics the most familiar semimartingale is the Brownian semimartingale (Y ∈ BSM)

    Y_t = ∫_0^t a_u du + ∫_0^t σ_u dW_u,                                     (5)

where a is a vector of predictable drifts, σ is a matrix volatility process whose elements are càdlàg and W is a vector Brownian motion. The stochastic integral σ • W_t, where f • g_t is generic notation for the process ∫_0^t f_u dg_u, is said to be a stochastic volatility process (σ • W ∈ SV) — e.g. the reviews in Ghysels, Harvey, and Renault (1996) and Shephard (2005). This vector process has elements which are in M^c_loc. Doob (1953) showed that all continuous local martingales with absolutely continuous quadratic variation can be written in the form of an SV process (see Karatzas and Shreve (1991, p. 170–172)).[4]

The drift ∫_0^t a_u du has elements which are absolutely continuous — an assumption which looks ad hoc; however, arbitrage freeness plus the SV model implies this property must hold (Karatzas and Shreve (1998, p. 3) and Andersen, Bollerslev, Diebold, and Labys (2003, p. 583)). Hence Y ∈ BSM is a rather canonical model in the finance theory of continuous sample path processes. Its use is bolstered by the fact that Ito calculus for continuous sample path processes is relatively simple. If Y ∈ BSM then

    [Y]_t = ∫_0^t Σ_u du,   where   Σ_t = σ_t σ_t',

the integrated covariance process, while

    dY_t | F_t ∼ N(a_t dt, Σ_t dt),                                          (6)

where F_t is the natural filtration — that is, the information from the entire sample path of Y up to time t. Thus a_t dt and Σ_t dt have clear interpretations as the infinitesimal predictive mean and covariance of asset returns. This implies that A_t = ∫_0^t E(dY_u | F_u) du while, centrally to our interests, d[Y]_t = Cov(dY_t | F_t) and

    [Y]_t = ∫_0^t Cov(dY_u | F_u) du.

Thus A and [Y] are the integrated infinitesimal predictive mean and covariance of the asset prices, respectively.

[4] An example of a continuous local martingale which has no SV representation is a time-changed Brownian motion where the time-change takes the form of the so-called "devil's staircase," which is continuous and non-decreasing but not absolutely continuous (see, for example, Munroe (1953, Section 27)). This relates to the work of, for example, Calvet and Fisher (2002) on multifractals.

2.3 Jump processes

There is no plausible economic theory which says that prices must follow continuous sample path processes. Indeed we will see later that statistically it is rather easy to reject this hypothesis even for price processes drawn from very thickly traded markets. In this paper we will add a finite activity jump process (this means there are a finite number of jumps in a fixed time interval) J_t = Σ_{j=1}^{N_t} C_j, adapted to the filtration generated by Y, to the Brownian semimartingale model.
This yields

    Y_t = ∫_0^t a_u du + ∫_0^t σ_u dW_u + Σ_{j=1}^{N_t} C_j.                 (7)

Here N is a simple counting process and the C are the associated non-zero jumps (which we assume have a covariance), which happen at times 0 = τ_0 < τ_1 < τ_2 < ... . It is helpful to decompose J into J = J^A + J^M, where, assuming J has an absolutely continuous intensity, J^A_t = ∫_0^t c_u du, with c_t = E(dJ_t | F_t). Then J^M is the compensated jump process, so J^M ∈ M, while J^A ∈ FV^ct_loc. Thus Y has the decomposition as in (1), with A_t = ∫_0^t (a_u + c_u) du and

    M_t = ∫_0^t σ_u dW_u + Σ_{j=1}^{N_t} C_j − ∫_0^t c_u du.

It is easy to see that [Y^d]_t = Σ_{j=1}^{N_t} C_j C_j' and so

    [Y]_t = ∫_0^t Σ_u du + Σ_{j=1}^{N_t} C_j C_j'.

Again we note that E(dY_t | F_t) = (a_t + c_t) dt, but now

    Cov(σ_t dW_t, dJ_t | F_t) = 0,                                           (8)

so

    Cov(dY_t | F_t) = Σ_t dt + Cov(dJ_t | F_t) ≠ d[Y]_t.

This means that the QV process aggregates the components of the variation of prices and so is not sufficient to learn the integrated covariance process.

To identify the components of the QV process we can use the bipower variation (BPV) process introduced by Barndorff-Nielsen and Shephard (2006). So long as it exists, the p × p matrix BPV process {Y} has l, k-th element

    {Y^l, Y^k} = (1/4) ({Y^l + Y^k} − {Y^l − Y^k}),   l, k = 1, 2, ..., p,   (9)

where, so long as the limit exists and the convergence is locally uniform in t,[5]

    {Y^l}_t = plim_{δ↓0} Σ_{j=1}^{⌊t/δ⌋} |Y^l_{δ(j−1)} − Y^l_{δ(j−2)}| |Y^l_{δj} − Y^l_{δ(j−1)}|.   (10)

Here ⌊x⌋ is the floor function, the largest integer less than or equal to x. Combining the results in Barndorff-Nielsen and Shephard (2006) and Barndorff-Nielsen, Graversen, Jacod, Podolskij, and Shephard (2005), if Y is of the form (7) then, without any additional assumptions,

    µ_1^{−2} {Y}_t = ∫_0^t Σ_u du,

where µ_r = E|U|^r, U ∼ N(0, 1) and r > 0, which means that

    [Y]_t − µ_1^{−2} {Y}_t = Σ_{j=1}^{N_t} C_j C_j'.

At first sight the robustness of BPV looks rather magical, but it is a consequence of the fact that only a finite number of terms in the sum (10) are affected by jumps, while each return which does not have a jump goes to zero in probability. Therefore, since the probability of jumps in contiguous time intervals goes to zero as δ ↓ 0, those terms which do include jumps do not impact the probability limit. The extension of this result to the case where J is an infinite activity jump process is discussed in Section 4.4.

[5] In order to simplify some of the later results we consistently ignore end effects in variation statistics. This can be justified in two ways: either by (a) setting Y_t = 0 for t < 0, or (b) letting Y start being a semimartingale at zero at time C < 0, not at time 0. The latter seems realistic when dealing with markets open 24 hours a day, borrowing returns from small periods of the previous day. It means that there is a modest degree of wash-over from one day's variation statistics into the next day. There seem few econometric reasons why this should be a worry. Assumption (b) can also be used in equity markets when combined with some form of stochastic imputation, adding in artificial simulated returns for the missing period — see the related comments in Barndorff-Nielsen and Shephard (2002).

2.4 Forecasting

Suppose Y obeys (7) and introduce the generic notation

    y_{t+s,t} = Y_{t+s} − Y_t = a_{t+s,t} + m_{t+s,t},   t, s > 0.

So long as the covariance exists,

    Cov(y_{t+s,t} | F_t) = Cov(a_{t+s,t} | F_t) + Cov(m_{t+s,t} | F_t)
                           + Cov(a_{t+s,t}, m_{t+s,t} | F_t) + Cov(m_{t+s,t}, a_{t+s,t} | F_t).

Notice how complicated this expression is compared to the covariance in (6). This is due to the fact that s is not necessarily dt, and so a_{t+s,t} is no longer known given F_t — while ∫_t^{t+dt} a_u du was. However, in all likelihood, for small s, a makes a rather modest contribution to the predictive covariance of Y. This suggests using the approximation

    Cov(y_{t+s,t} | F_t) ≃ Cov(m_{t+s,t} | F_t).
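The jump-robustness of bipower variation can be seen in a minimal simulation. The sketch below (all parameter values are illustrative, not from the paper) discretises a constant-volatility path, adds a handful of jumps, and contrasts realised QV with scaled realised BPV: the latter recovers the continuous variation, and the difference recovers the sum of squared jumps.

```python
import numpy as np

rng = np.random.default_rng(1)

# Discretised toy model: constant spot volatility plus 5 rare jumps.
n = 200_000                # number of intra-period returns
dt = 1.0 / n
sigma = 0.2                # constant spot volatility, so integrated variance = 0.04
r = sigma * np.sqrt(dt) * rng.standard_normal(n)

jump_idx = rng.choice(n, size=5, replace=False)
jumps = rng.normal(0.0, 0.05, size=5)
r[jump_idx] += jumps       # finite activity jump component

mu1 = np.sqrt(2.0 / np.pi)                                       # E|U|, U ~ N(0,1)
rv = float(np.sum(r ** 2))                                       # realised QV
bpv = float(mu1 ** -2 * np.sum(np.abs(r[1:]) * np.abs(r[:-1])))  # scaled realised BPV

print(bpv)       # close to the integrated variance 0.04
print(rv - bpv)  # close to the sum of squared jumps
```

Only the two returns adjacent to each jump are contaminated, and each is multiplied by a vanishing diffusive return, which is exactly the mechanism described above.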
Now, using (8),

    Cov(m_{t+s,t} | F_t) = E([Y]_{t+s} − [Y]_t | F_t) − E[ (∫_t^{t+s} c_u du)(∫_t^{t+s} c_u du)' | F_t ].

Hence if c or s is small then we might approximate

    Cov(Y_{t+s} − Y_t | F_t) ≃ E([Y]_{t+s} − [Y]_t | F_t)
                             = E([σ • W]_{t+s} − [σ • W]_t | F_t) + E([J]_{t+s} − [J]_t | F_t).

Thus an interesting forecasting strategy for covariances is to forecast the increments of the QV process or its components. As the QV process and its components are themselves estimable, though with substantial possible error, this is feasible. This approach to forecasting has been advocated in a series of influential papers by Andersen, Bollerslev, Diebold, and Labys (2001), Andersen, Bollerslev, Diebold, and Ebens (2001) and Andersen, Bollerslev, Diebold, and Labys (2003), while the important earlier paper by Andersen and Bollerslev (1998a) was stimulating in the context of measuring the forecast performance of GARCH models. Forecasting using estimates of the increments of the components of QV was introduced by Andersen, Bollerslev, and Diebold (2003). We will return to it in section 3.9 when we have developed an asymptotic theory for estimating the QV process and its components.

2.5 Realised QV & BPV

The QV process can be estimated in many different ways. The most immediate is the realised QV estimator

    [Y_δ]_t = Σ_{j=1}^{⌊t/δ⌋} (Y_{jδ} − Y_{(j−1)δ})(Y_{jδ} − Y_{(j−1)δ})',

where δ > 0. This is the outer product of returns computed over fixed intervals of time of length δ. By construction, as δ ↓ 0, [Y_δ]_t →_p [Y]_t. Likewise

    {Y^l_δ}_t = Σ_{j=1}^{⌊t/δ⌋} |Y^l_{δ(j−1)} − Y^l_{δ(j−2)}| |Y^l_{δj} − Y^l_{δ(j−1)}|,   l = 1, 2, ..., p,   (11)

    {Y^l_δ, Y^k_δ} = (1/4) ({Y^l_δ + Y^k_δ} − {Y^l_δ − Y^k_δ}),

and {Y_δ} →_p {Y}. In practice, the presence of market frictions can potentially mean that this limiting argument is not really available as an accurate guide to the behaviour of these statistics for small δ.
Such difficulties with limiting arguments, which are present in almost all areas of econometrics and statistics, do not invalidate the use of asymptotics, for asymptotics are used to provide predictions about finite sample behaviour. Probability limits are, of course, coarse, and we will respond to this by refining our understanding through central limit theorems, in the hope that they will make good predictions when δ is moderately small. For very small δ these asymptotic predictions become poor guides as frictions bite hard; this will be discussed in section 5.

In financial econometrics the focus is often on the increments of the QV and realised QV over set time intervals, like one day. Let us define the daily QV

    V_i = [Y]_{hi} − [Y]_{h(i−1)},   i = 1, 2, ...,

which is estimated by the realised daily QV

    V̂_i = [Y_δ]_{hi} − [Y_δ]_{h(i−1)},   i = 1, 2, ....

Clearly V̂_i →_p V_i as δ ↓ 0. The l-th diagonal element of V̂_i, written V̂_i^{l,l}, is called the realised variance[6] of asset l, while its square root is its realised volatility. The latter estimates √(V_i^{l,l}), the daily QVol of asset l. The l, k-th element of V̂_i, V̂_i^{l,k}, is called the realised covariance between assets l and k. Off these objects we can define standard dependence measures, like the realised regression

    β̂_i^{l,k} = V̂_i^{l,k} / V̂_i^{k,k} →_p β_i^{l,k} = V_i^{l,k} / V_i^{k,k},

which estimates the QV regression, and the realised correlation

    ρ̂_i^{l,k} = V̂_i^{l,k} / √(V̂_i^{l,l} V̂_i^{k,k}) →_p ρ_i^{l,k} = V_i^{l,k} / √(V_i^{l,l} V_i^{k,k}),

which estimates the QV correlation. Similar daily objects can be calculated off the realised BPV process:

    B̂_i = µ_1^{−2} ({Y_δ}_{hi} − {Y_δ}_{h(i−1)}),   i = 1, 2, ...,

which estimates

    B_i = [Y^ct]_{hi} − [Y^ct]_{h(i−1)} = ∫_{h(i−1)}^{hi} Σ_u du,   i = 1, 2, ....

Realised volatility has a very long history in financial economics.
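The realised dependence measures above are ratios of elements of the realised QV matrix. A small sketch on simulated bivariate returns with known population values (all numbers illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)

# Bivariate high-frequency returns with constant volatilities 0.20 and 0.10
# and correlation 0.8 (illustrative), so the population regression of asset 1
# on asset 2 is beta = cov / var2 = (0.8 * 0.20 * 0.10) / 0.10**2 = 1.6.
n = 50_000
dt = 1.0 / n
z = rng.standard_normal((2, n))
r1 = 0.20 * np.sqrt(dt) * z[0]
r2 = 0.10 * np.sqrt(dt) * (0.8 * z[0] + 0.6 * z[1])

# Realised QV matrix: sum of outer products of the return vectors.
R = np.vstack([r1, r2])
V_hat = R @ R.T

beta_hat = V_hat[0, 1] / V_hat[1, 1]                        # realised regression
rho_hat = V_hat[0, 1] / np.sqrt(V_hat[0, 0] * V_hat[1, 1])  # realised correlation

print(beta_hat)   # near 1.6
print(rho_hat)    # near 0.8
```

With constant spot covariance, as here, the daily realised quantities converge to the constant population analogues; with stochastic Σ_u they converge to ratios of the integrated covariance elements, as in the definitions above.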
It appears in, for example, Rosenberg (1972), Officer (1973), Merton (1980), French, Schwert, and Stambaugh (1987), Schwert (1989) and Schwert (1998), with Merton (1980) making the implicit connection with the case where δ ↓ 0 in the pure scaled Brownian motion plus drift case. Of course, in probability theory QV was discussed as early as Wiener (1924) and Lévy (1937), and it appears as a crucial object in the development of the stochastic analysis of semimartingales which occurred in the second half of the last century. For more general financial processes a closer connection between realised QV and QV, and its use for econometric purposes, was made in a series of independent and concurrent papers by Comte and Renault (1998), Barndorff-Nielsen and Shephard (2001) and Andersen, Bollerslev, Diebold, and Labys (2001). The realised regressions and correlations were defined and studied in detail by Andersen, Bollerslev, Diebold, and Labys (2003) and Barndorff-Nielsen and Shephard (2004).

[6] Some authors call V̂_i^{l,l} the realised volatility, but throughout this paper we follow the tradition in finance of using volatility to mean standard deviation type objects.

A major motivation for Barndorff-Nielsen and Shephard (2002) and Andersen, Bollerslev, Diebold, and Labys (2001) was the fact that volatility in financial markets is highly and unstably diurnal within a day, responding to regularly timed macroeconomic news announcements, social norms such as lunch times and sleeping, or the opening of other markets. This makes estimating

    lim_{ε↓0} ([Y]_{t+ε} − [Y]_t) / ε

extremely difficult. The very stimulating work of Genon-Catalot, Larédo, and Picard (1992), Foster and Nelson (1996), Mykland and Zhang (2002) and Mykland and Zhang (2005) tries to tackle this problem using a double asymptotics, as δ ↓ 0 and ε ↓ 0. However, in the last five years many econometrics researchers have mostly focused on naturally diurnally robust quantities like the daily or weekly QV.
2.6 Changes in probability law, QV and BPV

The observed price process Y ∈ SM, governed by its data generating process or measure P, is not uniquely interesting. In financial economics the stochastic behaviour of Y under risk neutral versions P* (i.e. so-called equivalent martingale measures) is also important, for these determine the price of contingent assets based on Y. An interesting question is whether [Y] computed under P tells us anything about the behaviour of [Y] under P*.

To discuss this, recall that the notation P* << P means that the probability law P* is dominated by P. When P* << P and P << P*, then P and P* are said to be equivalent measures,[7] which is more general than P* being an equivalent martingale measure for P.

Clearly under P, [Y_δ] →_p [Y], so if P and P* are equivalent measures then, as the region where [Y_δ] − [Y] has substantial probability away from zero narrows as δ ↓ 0 under P, so it must under P* by equivalence. Consequently, in the limit we have that, almost surely,

    [Y_P] = [Y_{P*}],                                                        (12)

where [Y_P] is the QV of Y under P and [Y_{P*}] is the QV of Y under P*. Hence, potentially, [Y_δ] tells us a lot about [Y] under P*. This result appears in, for example, book-length treatments of semimartingales such as Jacod and Shiryaev (2003, p. 169) and Protter (2004, p. 95). However, its importance in econometrics seems to have gone largely unnoticed. We believe it is powerful. It means that inference based on [Y_δ] can be regarded as valid inference on [Y] under each and every equivalent martingale measure P* (i.e. it holds for incomplete markets).

[7] For those who are unfamiliar with these terms, imagine P and P* have discrete support. If this support exactly coincides, then the measures are equivalent. This is crucial for it means we can rescale the measures of P to produce P* and vice versa. Similar ideas hold in the continuous case; see for example Billingsley (1995).
Thus we have a way of making inferences on derivative prices. This is the only result we know of which allows inference under P to be transferred to make conclusions about P*. This, in our view, puts the concepts of realised QV and realised BPV at the centre of financial econometrics. Of course, the dual properties of [Y_{P*}] and of Y having to be a martingale under P* are not enough to identify P* (e.g. they give no hint at the degree of leverage nor drift), but they do tell us a great deal about reasonable risk neutral processes. To put this observation in context, Garcia, Ghysels, and Renault (2005) and Bates (2003) review the econometric literature on pricing derivatives, while Bollerslev, Gibson, and Zhou (2005) use realised volatility as an input into estimating parameters of SV models used to price options.

This result extends further, for (e.g. Jacod and Shiryaev (2003, p. 169))

    [Y^ct_P] = [Y^ct_{P*}],   which implies that   [Y^d_P] = [Y^d_{P*}],     (13)

which means that BPV can be used to make inference on the continuous and discontinuous components of Y under P*. Further, if there are jumps under P then there must be jumps, at the same times and of the same variation, under P*.

2.7 Derivatives based on realised QV and QVol

In the last ten years an over-the-counter market in realised QV and QVol has been rapidly developing. This has been stimulated by interest in hedging volatility risk — see Neuberger (1990), Carr and Madan (1998), Demeterfi, Derman, Kamal, and Zou (1999) and Carr and Lewis (2004). Examples of such options are those with payoffs

    max([Y_δ]_t − K_1, 0),   max(√([Y_δ]_t) − K_2, 0),                       (14)

where δ is typically taken as a day. Such options approximate, potentially poorly,

    max([Y]_t − K_1, 0),   max(√([Y]_t) − K_2, 0).                           (15)

The fair value of options of the type (15) has been studied by a number of authors, for various volatility models.
For example, Brockhaus and Long (1999) employ the Heston (1993) SV model, Javaheri, Wilmott, and Haug (2002) the GARCH diffusion, while Howison, Rafailidis, and Rasmussen (2004) study log-Gaussian OU processes. Carr, Geman, Madan, and Yor (2005) look at the same problem based upon pure jump processes. Carr and Lee (2003a) have studied how one might value such options based on replication without being specific about the volatility model. See also the overview of Branger and Schlag (2005). The common feature of these papers is that the calculations are based on replacing (14) by (15). These authors do not take into account, to our knowledge, the potentially large difference between using [Y_δ]_t and [Y]_t.

2.8 Empirical illustrations: measurement

To illustrate some of the empirical features of realised daily QV, and particularly their precision as estimators of daily QV, we have used a series which records the log of the number of German Deutsche Marks a single US Dollar buys (written Y^1) and the log of the Japanese Yen/Dollar rate (written Y^2). It covers 1st December 1986 until 30th November 1996 and was kindly supplied to us by Olsen and Associates in Zurich (see Dacorogna, Gencay, Müller, Olsen, and Pictet (2001)), although we have made slightly different adjustments to deal with some missing data (described in detail in Barndorff-Nielsen and Shephard (2002)). Capturing time-stamped indicative bid and ask quotes from a Reuters screen, they computed prices at each 5-minute point by linear interpolation, averaging the log bid and log ask for the two closest ticks.

Figure 1 provides some descriptive statistics for the exchange rates starting on 4th February, 1991. Figure 1(a) shows the first four active days of the dataset, displaying the bivariate 10-minute returns.[8] Figure 1(b) details the daily realised volatilities for the DM, √(V̂_i^1), together with 95% confidence intervals.
These confidence intervals are based on the log-version of the limit theory for the realised variance that we will develop in the next subsection. When the volatility is high, the confidence intervals tend to be very large as well. In Figure 1(c) we have drawn the realised covariance V̂_i^{1,2} against i, together with the associated confidence intervals. These terms move rather violently through this period. The corresponding realised correlations ρ̂_i^{1,2} are given in Figure 1(d). These are quite stable through time, with only a single realised correlation standing out from the others in the sample. The correlations are not particularly precisely estimated, with the confidence intervals typically being around 0.2 wide.

Table 1 provides some additional daily summary statistics for 100 times the daily data (the scaling is introduced to make the tables easier to read). It shows that the means of the squared daily returns (Y_i^1 − Y_{i−1}^1)^2 and of the estimated daily QVs V̂_i^1 are in line, but that the realised BPV B̂_i^1 is below them. The RV and BPV quantities are highly correlated, but the BPV has a smaller standard deviation. A GARCH(1,1) model is also fitted to the daily return data and its conditional, one-step ahead predicted variances h_i computed. These have similar means and lower standard deviations, but h_i is less strongly correlated with squared returns than the realised measures.

[8] This time resolution was selected so that the results are not very sensitive to market frictions.

Table 1: Daily statistics for 100 times the DM/Dollar return series: estimated QV, BPV, conditional variance for GARCH and squared daily returns. Reported are the means, standard deviations (diagonal entries) and correlations (off-diagonal entries).

                               Mean    Standard Dev / Correlations
    Daily QV: V̂_i^1           0.509    .50
    BPV: B̂_i^1                0.441    .95   .40
    GARCH: h_i                 0.512    .55   .57   .22
    (Y_i^1 − Y_{i−1}^1)^2      0.504    .54   .48   .39   1.05

[Figure 1 about here. Caption: DM and Yen against the Dollar. Data is 4th February 1991 onwards for 50 active trading days. (a) 10 minute returns on the two exchange rates for the first 4 days of the dataset. (b) Realised volatility for the DM series, marked with a cross, the bars denoting 95% confidence intervals. (c) Realised covariance. (d) Realised correlation.]

2.9 Empirical illustration: time series behaviour

Figure 2 shows summaries of the time series behaviour of daily raw and realised DM quantities. They are computed using the whole run of 10 years of 10-minute return data. Figure 2(a) shows the raw daily returns and 2(b) gives the corresponding correlogram of daily squared and absolute returns. As usual, absolute returns are moderately more autocorrelated than squared returns, with the degree of autocorrelation in these plots being modest, while the memory lasts a large number of lags.

Figure 2(c) shows a time series plot of the daily realised volatilities √(V̂_i^1) for the DM series, indicating bursts of high volatility and periods of rather tranquil activity. The correlogram for this series is given in Figure 2(d). This shows lag one correlations of around one half and is around 0.25 at 10 lags. The correlogram then declines irregularly at larger lags. Figure 2(e) shows √(B̂_i^1), using the lag two bipower variation measure. This series does not display the peaks and troughs of the realised QVol statistics, and its correlogram in Figure 2(d) is modestly higher, with its first lag being around 0.56 compared to 0.47. The corresponding estimated jump QVol measure √(max(0, V̂_i^1 − B̂_i^1)) is displayed in Figure 2(f), while its correlogram is given in Figure 2(d), which shows a very small degree of autocorrelation.
2.10 Empirical illustration: a more subtle example

2.10.1 Interpolation, last price, quotes and trades

So far we have not focused on the details of how we compute the prices used in these calculations. This is important if we wish to try to exploit information buried in returns recorded for very small values of δ, such as a handful of seconds. Our discussion will be based on data taken from the London Stock Exchange's electronic order book, called SETS, in January 2004. The market is open from 8am to 4.30pm, but we remove the first 15 minutes of each day following Engle and Russell (1998). Times are accurate up to one second. We will use three pieces of the database: transactions, best bid and best ask. Note the bid and ask are firm quotes, not indicative like the exchange rate data previously studied. We average the bid and ask to produce a mid-quote, which is taken to proxy the efficient price. We also give some results based on transaction prices. We will focus on four high value stocks: Vodafone (telecoms), BP (hydrocarbons), AstraZeneca (pharmaceuticals) and HSBC (banking).

The top row of Figure 3 shows the log of the mid-quotes, recorded every six seconds on the 2nd working day in January. The graphs indicate the striking discreteness of the price processes, which is particularly important for the Vodafone series. Table 2 gives the tick size, the number of mid-point updates and transactions for each asset.
It shows the usual result that as the tick size, as a percentage of the price, increases, the number of mid-quote price updates tends to fall: larger tick sizes mean that there is a larger cost to impatience, that is, to jumping the queue in the order book by offering a better price than the best current one and so updating the best quotes.

[Figure 2 about here. Caption: All graphs refer to the Olsen group's five minute changes data for the DM/Dollar. Top left: daily returns. Top right: ACF of squared and absolute returns. Middle left: estimated daily QVol √(V̂_i^1). Middle right: ACF of various realised estimators (realised QV, BPV and jump QV). Bottom left: estimated daily continuous QVol √(B̂_i^1). Bottom right: estimated daily jump QVol √(max(0, V̂_i^1 − B̂_i^1)). X-axis is marked off in days.]

The middle row of Figure 3 shows the corresponding daily realised QVol, computed using 0.015, 0.1, 1, 5 and 20 minute intervals based on mid-quotes. These are related to the signature plots of Andersen, Bollerslev, Diebold, and Labys (2000). As the times of the mid-quotes fall irregularly in time, there is the question of how to approximate the price at these time points. The Olsen method uses linear interpolation between the prices at the nearest observations before and after the relevant time point. Another method is to use the last datapoint before the relevant time — the last tick or raw method (e.g. Wasserfallen and Zimmermann (1985)).
Typically, the former leads to falls in realised QVol as δ falls; indeed in theory it converges to zero as δ ↓ 0, since the interpolated price process is of continuous bounded variation (Hansen and Lunde (2006)), while the latter increases modestly.

Figure 3: LSE's electronic order book on the 2nd working day in January 2004. Top graphs: mid-quote log-price every 6 seconds, from 8.15am to 4.30pm, for Vodafone, BP, AstraZeneca and HSBC; x-axis is in hours. Middle graphs: realised daily QVol computed using 0.015, 0.1, 1, 5 and 20 minute midpoint returns; x-axis is in minutes. Lower graphs: realised daily QVol computed using 0.1, 1, 5 and 20 minute transaction returns. Middle and lower graphs are computed using both interpolation and the last tick method.

The sensitivity to δ tends to be larger in cases where the tick size is large as a percentage of price, which is the case here. Overall the conclusion is that realised QVol does not change much when δ is 5 minutes or above, and that it is more stable for interpolation than for the last price method. When we use smaller time intervals there are large dangers lurking. We will formally discuss the effect of market frictions in section 5. The bottom row in Figure 3 shows the corresponding results for realised QVols computed using the transactions database. This ignores some very large over the counter trades.
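The two price-sampling schemes are easy to compare on simulated data. The sketch below is ours (function names and parameter values are illustrative, not taken from the LSE database); it confirms the theoretical point just made: on grids far finer than the quote arrivals, the linearly interpolated price is of bounded variation and its realised QV collapses towards zero, while the last-tick version does not.

```python
import numpy as np

def sample_price(times, prices, grid, method="last"):
    """Approximate the price at each grid time from an irregular record.

    method="last"   : last tick / raw method (previous observed price).
    method="interp" : linear interpolation between the surrounding ticks.
    """
    if method == "last":
        idx = np.searchsorted(times, grid, side="right") - 1
        return prices[np.clip(idx, 0, len(prices) - 1)]
    return np.interp(grid, times, prices)

def realised_qv(p):
    # sum of squared returns of the resampled log-price
    return np.sum(np.diff(p) ** 2)

rng = np.random.default_rng(0)
n = 5000
times = np.sort(rng.uniform(0.0, 1.0, n))       # irregular quote arrival times
prices = np.cumsum(rng.normal(0.0, 0.01, n))    # efficient log-price proxy, QV ~ 0.5

grid_coarse = np.arange(0.0, 1.0, 1 / 100)      # much coarser than the quote arrivals
grid_fine = np.arange(0.0, 1.0, 1 / 100_000)    # much finer than the quote arrivals
qv_int_coarse = realised_qv(sample_price(times, prices, grid_coarse, "interp"))
qv_int_fine = realised_qv(sample_price(times, prices, grid_fine, "interp"))
qv_last_fine = realised_qv(sample_price(times, prices, grid_fine, "last"))
print(qv_int_coarse, qv_int_fine, qv_last_fine)
```

On the coarse grid both methods give a realised QV near the true value; on the very fine grid the interpolated version collapses while the last-tick version stays put, mirroring the behaviour in the middle row of Figure 3.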
Realised QVol increases more strongly as δ falls when we use the last tick rather than mid-quote data. This is particularly the case for Vodafone, where bid/ask bounce has a large impact. Even the interpolation method has difficulties with transaction data. Overall, one gets the impression from this study that basing the analysis on mid-quote data is sound for the LSE data.9 A fundamental difficulty with equity data is that the equity markets are only open for a fraction of the whole day, so it is quite possible that a large degree of their variation occurs at times when there is little data. This is certainly true for the U.K. equity markets, which are closed during a high percentage of the time when U.S. markets are open. Table 2 gives daily volatility for open to close and open to open returns, as well as the correlation between the two return measures. It shows the open to close measures account for a high degree of the volatility in the prices, with high correlations between the two returns. The weakest relationship is for the Vodafone series, the strongest for AstraZeneca. Hansen and Lunde (2005c) have studied how one can use high-frequency information to estimate the QV throughout the day, taking into account closed periods.

                              Daily volatility
               open-close  open-open  Correlation  Tick size  # Mid-quotes/day  # Transactions/day
  Vodafone       0.00968     0.0159      0.861       0.25            333              3,018
  BP             0.00941     0.0140      0.851       0.25          1,434              2,995
  AstraZeneca    0.0143      0.0140      0.912       1.0           1,666              2,233
  HSBC           0.00730     0.00720     0.731       0.5             598              2,264

Table 2: Top part of table: average daily volatility. Open is the mid-price at 8.15am, close is the mid-price at 4.30pm. Open-open looks at daily returns. Reported are the sample standard deviations of the returns over 20 days and the sample correlation between the open-close and open-open daily returns. Bottom part of table: descriptive statistics about the size of the dataset.
2.10.2 Epps effects

Market frictions affect the estimation of realised QVol, but if the asset is highly active, the tick size is small as a percentage of the price, δ is well above a minute and the mid-quote/interpolation method is used, then the effects are modest. The situation is much less rosy when we look at estimating quadratic covariations, due to the so-called Epps (1979) effect. This has been documented in very great detail by Sheppard (2005), who provides various theoretical explanations. We will come back to them in sections 3.8.3 and 5. For the moment it suffices to look at Figure 4, which shows the average daily realised correlation computed in January 2004 for the four stocks looked at above. Throughout, prices are computed using mid-quotes and interpolation. The graph shows how this average varies with respect to δ. It trends downwards to zero as δ ↓ 0, with extremely low dependence measures for low values of δ. This is probably caused by the fact that asset prices tend not to move simultaneously, due to non-synchronous trading and the differential rate at which information of different types is absorbed into individual stock prices.

9 A good alternative would be to carry out the entire analysis on either all the best bids or all the best asks. This approach is used by Hansen and Lunde (2005c) and Large (2005).

Figure 4: LSE data during January 2004. Realised correlation for the six pairs Vodafone-BP, Vodafone-AstraZeneca, Vodafone-HSBC, BP-AstraZeneca, BP-HSBC and AstraZeneca-HSBC, computed daily and averaged over the month. Realised quantities are computed using data at the frequency on the x-axis (in minutes).

3 Measurement error when Y ∈ BSM

3.1 Infeasible asymptotics

Market frictions mean that it is not wise to use realised variation objects based on very small δ.
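A toy simulation, with all design choices ours, reproduces the non-synchronous trading mechanism behind the Epps effect and illustrates why very small δ is unreliable: two efficient prices with correlation 0.5 are observed only at their own random trade times, and the realised correlation computed from last-tick prices collapses as the sampling interval shrinks.

```python
import numpy as np

rng = np.random.default_rng(1)
n_fine = 100_000                                  # fine time steps in one "day"
rho = 0.5                                         # true correlation of efficient prices
z1 = rng.normal(size=n_fine)
z2 = rho * z1 + np.sqrt(1 - rho**2) * rng.normal(size=n_fine)
p1, p2 = 0.01 * np.cumsum(z1), 0.01 * np.cumsum(z2)

mask1 = rng.random(n_fine) < 0.01                 # each asset trades at its own random times
mask2 = rng.random(n_fine) < 0.01
mask1[0] = mask2[0] = True

def last_tick(p, mask, step):
    # previous observed price at each point of a regular grid of spacing `step`
    obs = np.flatnonzero(mask)
    grid = np.arange(0, len(p), step)
    return p[obs[np.searchsorted(obs, grid, side="right") - 1]]

def realised_corr(step):
    r1 = np.diff(last_tick(p1, mask1, step))
    r2 = np.diff(last_tick(p2, mask2, step))
    return np.sum(r1 * r2) / np.sqrt(np.sum(r1**2) * np.sum(r2**2))

corr_coarse = realised_corr(1000)                 # coarse sampling: near the true rho
corr_fine = realised_corr(20)                     # very fine sampling: Epps attenuation
print(corr_coarse, corr_fine)
```

At the coarse frequency the realised correlation is close to the true value, while at the fine frequency it is sharply attenuated, as in Figure 4.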
This suggests refining our convergence in probability arguments to give a central limit theorem which may provide reasonable predictions about the behaviour of RV statistics for moderate values of δ, such as 5 or 10 minutes, where frictions are less likely to bite hard. Such CLTs will be the focus of attention in this section. At the end of the section, in addition, we will briefly discuss various alternative measures of variation, such as the realised range, subsampling and kernels, which have recently been introduced to the literature. Finally we will also discuss how realised objects can contribute to the practical forecasting of volatility.

We will derive the central limit theorem for [Y_δ]_t, which can then be discretised to produce the CLT for V̂_i. Univariate results will be presented, since this involves less notational clutter. The generalised results were developed in a series of papers by Jacod (1994), Jacod and Protter (1998), Barndorff-Nielsen and Shephard (2002) and Barndorff-Nielsen and Shephard (2004).

Theorem 1 Suppose that Y ∈ BSM is one-dimensional and that ∫_0^t a_u^2 du < ∞ (for all t < ∞); then as δ ↓ 0

  δ^{−1/2} ([Y_δ]_t − [Y]_t) → √2 ∫_0^t σ_u^2 dB_u,    (16)

where B is a Brownian motion which is independent from Y and the convergence is in law stable as a process.

Proof. By Itô's lemma for continuous semimartingales, Y² = [Y] + 2Y • Y, so

  (Y_{jδ} − Y_{(j−1)δ})² = [Y]_{δj} − [Y]_{δ(j−1)} + 2 ∫_{δ(j−1)}^{δj} (Y_u − Y_{(j−1)δ}) dY_u.

This implies that

  δ^{−1/2} ([Y_δ]_t − [Y]_t) = 2 δ^{−1/2} Σ_{j=1}^{⌊t/δ⌋} ∫_{δ(j−1)}^{δj} (Y_u − Y_{(j−1)δ}) dY_u
                             = 2 δ^{−1/2} ∫_0^{δ⌊t/δ⌋} (Y_u − Y_{δ⌊u/δ⌋}) dY_u.

Jacod and Protter (1998, Theorem 5.5) show that for Y satisfying the conditions in Theorem 1,10

  δ^{−1/2} ∫_0^t (Y_u − Y_{δ⌊u/δ⌋}) dY_u → (1/√2) ∫_0^t σ_u^2 dB_u,

where B ⊥⊥ Y and the convergence is in law stable as a process. This implies δ^{−1/2} ([Y_δ] − [Y]) → √2 (σ² • B).

The most important point of this Theorem is that B ⊥⊥ Y. The appearance of the additional Brownian motion B is striking.
This means that Theorem 1 implies, for a single t,

  δ^{−1/2} ([Y_δ]_t − [Y]_t) →_L MN( 0, 2 ∫_0^t σ_u^4 du ),    (17)

where MN denotes a mixed Gaussian distribution. This result implies in particular that, for i ≠ j,

  δ^{−1/2} ( V̂_i − V_i , V̂_j − V_j )′ →_L MN( 0, 2 diag( ∫_{h(i−1)}^{hi} σ_u^4 du , ∫_{h(j−1)}^{hj} σ_u^4 du ) ),

so the V̂_i − V_i are asymptotically uncorrelated through time, so long as Var(V̂_i − V_i) < ∞.

10 At an intuitive level, if we ignore the drift then

  ∫_{δ(j−1)}^{δj} (Y_u − Y_{(j−1)δ}) dY_u ≃ σ²_{δ(j−1)} ∫_{δ(j−1)}^{δj} (W_u − W_{(j−1)δ}) dW_u,

which is a martingale difference sequence in j with zero mean and conditional variance (1/2) σ⁴_{δ(j−1)} δ². Applying a triangular martingale CLT one would expect this result, although formalising it requires a considerable number of additional steps.

Barndorff-Nielsen and Shephard (2002) showed that Theorem 1 can be used in practice, as the integrated quarticity ∫_0^t σ_u^4 du can be consistently estimated using (1/3) {Y_δ}^[4]_t, where

  {Y_δ}^[4]_t = δ^{−1} Σ_{j=1}^{⌊t/δ⌋} (Y_{jδ} − Y_{(j−1)δ})⁴.    (18)

In particular then

  δ^{−1/2} ([Y_δ]_t − [Y]_t) / √( (2/3) {Y_δ}^[4]_t ) →_L N(0, 1).    (19)

This is a nonparametric result as it does not require us to specify the form of a or σ.

The multivariate version of (16) has that as δ ↓ 0

  δ^{−1/2} ([Y_δ]^{(kl)} − [Y]^{(kl)}) → (1/√2) Σ_{b=1}^q Σ_{c=1}^q ( σ^{(kb)} σ^{(cl)} + σ^{(lb)} σ^{(ck)} ) • B^{(bc)},   k, l = 1, ..., q,    (20)

where B is a q × q matrix of independent Brownian motions, independent of Y, and the convergence is in law stable as a process. In the mixed normal version of this result, the asymptotic covariance is a q × q × q × q array with elements

  { ∫_0^t ( Σ^{(kk′)}_u Σ^{(ll′)}_u + Σ^{(kl′)}_u Σ^{(lk′)}_u + Σ^{(kl)}_u Σ^{(k′l′)}_u ) du }_{k,k′,l,l′=1,...,q}.    (21)

Barndorff-Nielsen and Shephard (2004) showed how to use high frequency data to estimate this array of processes. We refer the reader to that paper, and also Mykland and Zhang (2005), for details.

3.2 Finite sample performance & the bootstrap

Our analysis of [Y_δ]_t − [Y]_t has been asymptotic as δ ↓ 0.
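A quick Monte Carlo check of the feasible theory in (19) is easy to set up. The sketch below is ours: it simulates a day of returns from a Brownian semimartingale with a known (deterministic, purely illustrative) volatility path, forms the feasible standard error from the realised quarticity (18), and verifies that the studentised realised QV error is roughly standard normal at a moderate sample size.

```python
import numpy as np

rng = np.random.default_rng(2)

def one_day(n):
    """Simulate n intraday returns with a smooth spot volatility path."""
    delta = 1.0 / n
    sigma = 0.1 * (1.0 + 0.5 * np.sin(2.0 * np.pi * np.arange(n) / n))
    r = sigma * rng.normal(0.0, np.sqrt(delta), n)
    rv = np.sum(r**2)                          # realised QV
    iv = np.sum(sigma**2) * delta              # integrated variance (known here)
    se = np.sqrt((2.0 / 3.0) * np.sum(r**4))   # feasible standard error from (18)-(19)
    return rv, iv, se

rv, iv, se = one_day(288)                      # e.g. 5 minute returns round the clock
print("RV:", rv, " 95% CI for IV:", (rv - 1.96 * se, rv + 1.96 * se))

stats = []
for _ in range(500):
    rv_k, iv_k, se_k = one_day(288)
    stats.append((rv_k - iv_k) / se_k)
stats = np.array(stats)
print(stats.mean(), stats.std())               # roughly 0 and 1 by (19)
```

The standardised statistics have mean near zero and standard deviation near one, so the asymptotics are already informative at a few hundred intraday returns; the next subsection discusses how well this holds up more generally.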
Of course it is crucial to know if this analysis is informative for the kind of moderate values of δ we see in practice. A number of authors have studied the finite sample behaviour of the feasible limit theory given in (19) and of a log-version, derived using the delta-rule,

  δ^{−1/2} (log[Y_δ]_t − log[Y]_t) / √( (2/3) {Y_δ}^[4]_t / ([Y_δ]_t)² ) →_L N(0, 1).    (22)

We refer readers to Barndorff-Nielsen and Shephard (2005a), Meddahi (2002), Goncalves and Meddahi (2004), and Nielsen and Frederiksen (2005). The overall conclusion is that (19) is quite poorly sized, but that (22) performs pretty well. The asymptotic theory is challenged in cases where there are components in volatility which are very quickly mean reverting. In the multivariate case, Barndorff-Nielsen and Shephard (2004) studied the finite sample behaviour of realised regression and correlation statistics. They suggest various transformations which improve the finite sample behaviour of these statistics, including the use of the Fisher transformation for the realised correlation.

Goncalves and Meddahi (2004) have studied how one might try to bootstrap the realised daily QV estimator. Their overall conclusion is that the usual Edgeworth expansions, which justify the order improvement associated with the bootstrap, are not reliable guides to the finite sample behaviour of the statistics. However, it is possible to design bootstraps which provide very significant improvements over the limiting theory in (19). This seems an interesting avenue to follow up, particularly in the multivariate case.

3.3 Irregularly spaced data

Mykland and Zhang (2005) have recently generalised (16) to cover the case where prices are recorded at irregular time intervals. See also the related work of Barndorff-Nielsen and Shephard (2005c). Mykland and Zhang (2005) define a random sequence of times, independent of Y,11 over the interval t ∈ [0, T],

  G_n = {0 = t_0 < t_1 < ... < t_n = T},

then continue to have δ = T/n, and define the estimated QV process

  [Y_{G_n}]_t = Σ_{j: t_j ≤ t} (Y_{t_j} − Y_{t_{j−1}})² →_p [Y]_t.

They show that as n → ∞,12

  δ^{−1/2} ([Y_{G_n}]_t − [Y]_t) = 2 δ^{−1/2} Σ_{j: t_j ≤ t} ∫_{t_{j−1}}^{t_j} (Y_u − Y_{t_{j−1}}) dY_u →_L MN( 0, 2 ∫_0^t (∂H_u^G/∂u) σ_u^4 du ),

where

  H_t^G = lim_{n→∞} H_t^{G_n},  with  H_t^{G_n} = δ^{−1} Σ_{j: t_j ≤ t} (t_j − t_{j−1})²,

and we have assumed that σ follows a diffusion and that H^G, which is a bit like a QV process but is scaled by δ^{−1}, is differentiable with respect to time. The H^G function is non-decreasing and runs quickly when the sampling is slower than normal. For regularly spaced data, t_j = δj and so H_t^G = t, which reproduces (17). It is clear that

  [Y_{G_n}]^[4]_t = δ^{−1} Σ_{j: t_j ≤ t} (Y_{t_j} − Y_{t_{j−1}})⁴ →_p 3 ∫_0^t (∂H_u^G/∂u) σ_u^4 du,

which implies that the feasible distributional results in (19) and (22) also hold for irregularly spaced data, which was one of the results in Barndorff-Nielsen and Shephard (2005c).

11 It is tempting to think of the t_j as the time of the j-th trade or quote. However, it is well known that the processes generating the times of trades and of price movements in tick time are not statistically independent (e.g. Engle and Russell (2005) and Rydberg and Shephard (2000)). This would seem to rule out the direct application of the methods we use here in tick time, suggesting care is needed in that case.

12 At an intuitive level, if we ignore the drift then ∫_{t_{j−1}}^{t_j} (Y_u − Y_{t_{j−1}}) dY_u ≃ σ²_{t_{j−1}} ∫_{t_{j−1}}^{t_j} (W_u − W_{t_{j−1}}) dW_u, which is a martingale difference sequence in j with zero mean and conditional variance (1/2) σ⁴_{t_{j−1}} (t_j − t_{j−1})². Although this suggests the stated result, formalising it requires a considerable number of additional steps.

3.4 Multiple grids

Zhang (2004) extended the above analysis to the simultaneous use of multiple grids. In our exposition we will work with G_n(i) = {0 = t^i_0 < t^i_1 < ... < t^i_n = T} for i = 0, 1, 2, ..., I and δ_i = T/n_i. Then define the i-th estimated QV process

  [Y_{G_n(i)}]_t = Σ_{j: t^i_j ≤ t} (Y_{t^i_j} − Y_{t^i_{j−1}})².

Additionally we need a new cross term for the covariation between the time scales. The appropriate term is

  H_t^{G(i)∪G(k)} = lim_{n→∞} H_t^{G_n(i)∪G_n(k)},  where  H_t^{G_n(i)∪G_n(k)} = (δ_i δ_k)^{−1/2} Σ_{j: t^{i,k}_j ≤ t} (t^{i,k}_j − t^{i,k}_{j−1})²,

and where the t^{i,k}_j come from

  G_n(i) ∪ G_n(k) = {0 = t^{i,k}_0 < t^{i,k}_1 < ... < t^{i,k}_{2n} = T},  i, k = 0, 1, 2, ..., I.

Clearly, for all i,

  δ_i^{−1/2} ([Y_{G_n(i)}]_t − [Y]_t) = 2 δ_i^{−1/2} Σ_{j: t^i_j ≤ t} ∫_{t^i_{j−1}}^{t^i_j} (Y_u − Y_{t^i_{j−1}}) dY_u,

so the scaled (by δ_i^{−1/2} and δ_k^{−1/2} respectively) asymptotic covariance matrix of [Y_{G_n(i)}]_t and [Y_{G_n(k)}]_t has diagonal elements 2 ∫_0^t (∂H_u^{G(i)}/∂u) σ_u^4 du and 2 ∫_0^t (∂H_u^{G(k)}/∂u) σ_u^4 du, and off-diagonal element 2 ∫_0^t (∂H_u^{G(i)∪G(k)}/∂u) σ_u^4 du.

Example 1 Let t^0_j = δ(j + ε) and t^1_j = δ(j + η), where |ε − η| ∈ [0, 1] are temporal offsets; then

  H_t^{G(0)} = H_t^{G(1)} = t,   H_t^{G(0)∪G(1)} = t { (η − ε)² + (1 − |η − ε|)² }.

Thus

  δ^{−1/2} ( [Y_{G_n(0)}]_t − [Y]_t , [Y_{G_n(1)}]_t − [Y]_t )′ →_L MN( 0, 2 ( 1 , c ; c , 1 ) ∫_0^t σ_u^4 du ),  c = (η − ε)² + (1 − |η − ε|)².

The correlation between the two measures is minimised at 1/2 by setting |η − ε| = 1/2.

Example 1 extends naturally to t^k_j = δ( j + k/(K+1) ), k = 0, 1, 2, ..., K, which allows many equally spaced realised QV-like estimators to be defined based on returns measured over δ periods. The scaled asymptotic covariance of [Y_{G_n(i)}]_t and [Y_{G_n(k)}]_t is

  2 { ( (k − i)/(K + 1) )² + ( 1 − |k − i|/(K + 1) )² } ∫_0^t σ_u^4 du.

If K = 1 or K = 2 then the correlation between the estimates is 1/2 and 5/9, respectively. As the sampling points become more dense the correlation quickly escalates, which means that each new realised QV estimator brings less and less additional information.

3.5 Subsampling

The multiple grids allow us to create a pooled grid estimator of QV, which is a special case of subsampling a statistic based on a random field; see for example the review of Politis, Romano, and Wolf (1999, Ch. 5). A simple example of this is

  [Y_{G_n^+(K)}]_t = (1/(K + 1)) Σ_{i=0}^K [Y_{G_n(i)}]_t,    (23)

which was mentioned in this context by Müller (1993) and Zhou (1996, p. 48).
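The pooled estimator (23) is easy to mimic by simulation. In the sketch below (all design choices are ours), one day of fine-scale Brownian returns is measured on K + 1 = 10 offset sparse grids; averaging the resulting realised QVs visibly reduces the sampling error relative to using a single sparse grid.

```python
import numpy as np

rng = np.random.default_rng(4)
m, step, reps = 6000, 10, 300                  # fine returns per day; K + 1 = 10 grids
err_plain, err_sub = [], []
for _ in range(reps):
    r = rng.normal(0.0, 0.01, m)               # fine-scale efficient returns
    p = np.concatenate(([0.0], np.cumsum(r)))  # log-price path
    qv = np.sum(r**2)                          # QV of the simulated path
    # realised QV on each of the K + 1 offset sparse grids
    rvs = [np.sum(np.diff(p[k::step]) ** 2) for k in range(step)]
    err_plain.append(rvs[0] - qv)              # one sparse grid only
    err_sub.append(np.mean(rvs) - qv)          # pooled estimator (23)

var_plain, var_sub = np.var(err_plain), np.var(err_sub)
print(var_plain, var_sub)                      # pooling shrinks the sampling error
```

Consistent with Example 2 below, the variance reduction is real but bounded: the asymptotic constant falls from 2 towards 4/3, not to zero, as K grows.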
Clearly [Y_{G_n^+(K)}]_t →_p [Y]_t as δ ↓ 0, while the properties of this estimator were first studied when Y ∈ BSM by Zhang, Mykland, and Aït-Sahalia (2005). Zhang (2004) also studies the properties of unequally weighted pooled estimators, while additional insights are provided by Aït-Sahalia, Mykland, and Zhang (2005b).

Example 2 Let t^k_j = δ( j + k/(K+1) ), k = 0, 1, 2, ..., K. Then, for fixed K, as δ ↓ 0

  δ^{−1/2} ([Y_{G_n^+(K)}]_t − [Y]_t) →_L MN( 0, (2/(K+1)²) Σ_{i=0}^K Σ_{k=0}^K { ( (k − i)/(K + 1) )² + ( 1 − |k − i|/(K + 1) )² } ∫_0^t σ_u^4 du ).

This subsampler is based on a sample size K + 1 times the usual one, but returns are still recorded over intervals of length δ. When K = 1 the constant in front of the integrated quarticity is 1.5, while when K = 2 it drops to 1.4074. The next terms in the sequence are 1.3750, 1.3600, 1.3519 and 1.3469, while it asymptotes to 1.333, a result due to Zhang, Mykland, and Aït-Sahalia (2005). Hence the gain from using the entire sample path of Y via multiple grids is modest, and almost all the available gains occur by the time K reaches 2. However, we will see later that this subsampler has virtues when there are market frictions.

3.6 Serial covariances

Suppose we define the notation G_δ(ε, η) = {δ(ε + η), δ(2ε + η), ...}; then the above theory implies that

  δ^{−1/2} ( [Y_{G_n(2,0)}]_t − ∫_0^t σ_u^2 du , [Y_{G_n(2,−1)}]_t − ∫_0^t σ_u^2 du , [Y_{G_n(1,0)}]_t − ∫_0^t σ_u^2 du )′ →_L MN( 0, ( 4 2 2 ; 2 4 2 ; 2 2 2 ) ∫_0^t σ_u^4 du ).

Define the realised serial covariance as

  γ̂_s(Y_δ, X_δ) = Σ_{j=1}^{⌊t/δ⌋} (Y_{δj} − Y_{δ(j−1)}) (X_{δ(j−s)} − X_{δ(j−s−1)}),   s = 0, 1, 2, ..., S,

and say γ̂_{−s}(Y, X) = γ̂_s(Y, X), while γ̂_s(Y_δ) = γ̂_s(Y_δ, Y_δ). Derivatives on such objects have recently been studied by Carr and Lee (2003b). We have that

  2 γ̂_1(Y_δ) = [Y_{G_n(2,0)}]_t + [Y_{G_n(2,−1)}]_t − 2 [Y_{G_n(1,0)}]_t + o_p(δ^{1/2}).

Note that γ̂_0(Y_δ) = [Y_{G_n(1,0)}]_t. Then, clearly,

  δ^{−1/2} ( γ̂_0(Y_δ) − ∫_0^t σ_u^2 du , γ̂_1(Y_δ) , ... , γ̂_S(Y_δ) )′ →_L MN( 0, diag(2, 1, ..., 1) ∫_0^t σ_u^4 du ),    (24)

see Barndorff-Nielsen, Hansen, Lunde, and Shephard (2004, Theorem 2). Consequently

  δ^{−1/2} ( γ̂_1(Y_δ)/γ̂_0(Y_δ) , ... , γ̂_S(Y_δ)/γ̂_0(Y_δ) )′ →_L MN( 0, I ∫_0^t σ_u^4 du / ( ∫_0^t σ_u^2 du )² ),

which differs from the result of Bartlett (1946), inflating the usual standard errors as well as making the inference multivariate mixed Gaussian. There are some shared characteristics with the familiar Eicker (1967) robust standard errors, but the details are, of course, rather different.

3.7 Kernels

Following Bartlett (1950) and Eicker (1967), long run estimates of variances are often computed using kernels. We will see that this idea may be helpful when there are market frictions, so we take some time discussing it here. It was introduced in this context by Zhou (1996) and Hansen and Lunde (2006), while a thorough discussion was given by Barndorff-Nielsen, Hansen, Lunde, and Shephard (2004, Theorem 2). A kernel estimator takes the form

  RV_w(Y_δ) = w_0 [Y_δ] + 2 Σ_{i=1}^q w_i γ̂_i(Y_δ),    (25)

where the weights w_i are non-stochastic. It is clear from (24) that if the estimator is based on δ/K returns, so that it is compatible with (23), then

  { (δ/K) ( w_0² + 2 Σ_{i=1}^q w_i² ) }^{−1/2} ( RV_w(Y_{δ/K}) − w_0 ∫_0^t σ_u^2 du ) →_L MN( 0, 2 ∫_0^t σ_u^4 du ).    (26)

In order for this method to be consistent for the integrated variance as q → ∞ we need that w_0 = 1 + o(1) and Σ_{i=1}^q w_i²/K = O(1) as a function of q.

Example 3 The Bartlett kernel puts w_0 = 1 and w_i = (q + 1 − i)/(q + 1). When q = 1, w_1 = 1/2 and the constant in front of the integrated quarticity is 3, while when q = 2, w_1 = 2/3, w_2 = 1/3 and the constant becomes 4 + 2/9. For moderately large q this is well approximated by 4(q + 1)/3. This means that we need q/K → 0 for this method to be consistent. This result appears in Barndorff-Nielsen, Hansen, Lunde, and Shephard (2004, Theorem 2).
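The realised serial covariances γ̂_s and the Bartlett-weighted kernel (25) can be coded directly. The simulation below is a sketch under an assumed model, with all parameter values ours: i.i.d. measurement noise is added to Brownian returns (anticipating the frictions of section 5), so the plain realised QV is biased upwards while the kernel correction, which exploits the negative first-order serial covariance the noise induces, largely removes the bias.

```python
import numpy as np

def gamma_hat(x, s):
    """Realised serial covariance: sum_j r_j r_{j-s} of high-frequency returns."""
    return np.sum(x * x) if s == 0 else np.sum(x[s:] * x[:-s])

def kernel_rv(x, q):
    """Kernel estimator (25) with Bartlett weights w0 = 1, wi = (q+1-i)/(q+1)."""
    est = gamma_hat(x, 0)
    for i in range(1, q + 1):
        est += 2.0 * (q + 1 - i) / (q + 1) * gamma_hat(x, i)
    return est

rng = np.random.default_rng(6)
n = 23_400
r = 0.1 * rng.normal(0.0, np.sqrt(1.0 / n), n)   # efficient returns, IV = 0.01
eps = 5e-4 * rng.normal(size=n + 1)              # i.i.d. microstructure noise
x = r + np.diff(eps)                             # observed (contaminated) returns

rv_noisy = gamma_hat(x, 0)                       # plain realised QV: biased upwards
krv = kernel_rv(x, 20)                           # autocovariance-corrected estimate
print(rv_noisy, krv)                             # compare with IV = 0.01
```

Without noise the two estimators essentially coincide; with noise the kernel version is far closer to the integrated variance, which is why kernels reappear in section 5.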
3.8 Other measures

3.8.1 Realised range

Suppose Y = σW, a scaled Brownian motion; then

  E{ ( sup_{0≤s≤t} Y_s − inf_{0≤s≤t} Y_s )² } = φ_2 σ² t,  where  φ_r = E{ ( sup_{0≤s≤1} W_s − inf_{0≤s≤1} W_s )^r },

noting that φ_2 = 4 log 2 and φ_4 = 9ζ(3), where ζ is the Riemann zeta function. This observation led Parkinson (1980) to provide a simple estimator of σ² based on the highs and lows of asset prices. See also the work of Rogers and Satchell (1991), Alizadeh, Brandt, and Diebold (2002), Ghysels, Santa-Clara, and Valkanov (2004) and Brandt and Diebold (2004). One reason for the interest in ranges is the belief that they are quite informative and somewhat robust to market frictions. The problem with this analysis is that it does not extend readily when Y ∈ BSM. In independent work, Christensen and Podolskij (2005) and Martens and van Dijk (2005) have studied the realised range process. Christensen and Podolskij (2005) define the process as

  \Y\_t = p-lim_{δ↓0} Σ_{j=1}^{⌊t/δ⌋} ( sup_{s∈[(j−1)δ,jδ]} Y_s − inf_{s∈[(j−1)δ,jδ]} Y_s )²,    (27)

which is estimated by the obvious realised version, written \Y_δ\_t. Christensen and Podolskij (2005) have proved that if Y ∈ BSM, then φ_2^{−1} \Y\_t = ∫_0^t σ_u^2 du. They also show that under rather weak conditions

  δ^{−1/2} ( φ_2^{−1} \Y_δ\_t − [Y]_t ) →_L MN( 0, ((φ_4 − φ_2²)/φ_2²) ∫_0^t σ_u^4 du ),

where (φ_4 − φ_2²)/φ_2² ≃ 0.4. This shows that it is around five times as efficient as the usual realised QV estimator. Christensen and Podolskij (2005) suggest estimating the integrated quarticity using

  δ^{−1} φ_4^{−1} Σ_{j=1}^{⌊t/δ⌋} ( sup_{s∈[(j−1)δ,jδ]} Y_s − inf_{s∈[(j−1)δ,jδ]} Y_s )⁴,

which makes this limit theorem feasible. Martens and van Dijk (2005) have also studied the properties of \Y_δ\_t using simulation and empirical work. As far as we know no results are available about estimating [Y] using ranges when there are jumps in Y, although it is relatively easy to see that a bipower type estimator could be defined using contiguous ranges which would robustly estimate [Y^ct].
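A minimal simulation of the Parkinson-type calculation, with the interval count, sub-step count and volatility level chosen by us purely for illustration: the intraday high-low ranges, squared, summed and scaled by φ₂ = 4 log 2, estimate the integrated variance of a scaled Brownian motion.

```python
import numpy as np

PHI2 = 4.0 * np.log(2.0)            # phi_2 = E[(sup W - inf W)^2] over [0, 1]

rng = np.random.default_rng(7)
n_int, m = 78, 300                  # 78 "five minute" intervals, 300 sub-steps each
n = n_int * m
sigma = 0.1
y = np.concatenate(([0.0], np.cumsum(sigma * rng.normal(0.0, np.sqrt(1.0 / n), n))))

rr = 0.0
for j in range(n_int):
    seg = y[j * m : (j + 1) * m + 1]       # path over the j-th interval
    rr += (seg.max() - seg.min()) ** 2     # squared high-low range
range_iv = rr / PHI2                # realised-range estimate of integrated variance
rv = np.sum(np.diff(y[::m]) ** 2)   # ordinary realised QV on the same intervals
print(range_iv, rv)                 # both estimate sigma^2 = 0.01
```

Note the range computed from a discretely observed path slightly understates the continuous-time range, so in practice a small finite-sample correction is often applied; over many replications the range-based estimate is noticeably less variable than the ordinary realised QV, in line with the efficiency gain quoted above.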
3.8.2 Discrete sine transformation

Curci and Corsi (2003) have argued that before computing realised QV we should prefilter the data, applying a discrete sine transformation to the returns in order to reduce the impact of market frictions. This is efficient when the observed data X is a Gaussian random walk Y plus independent Gaussian noise ε, where we think of the noise as market frictions. The Curci and Corsi (2003) method is equivalent to calculating the realised QV process on the smoother E(Y|X; θ), where θ are the estimated parameters indexing the Gaussian model. This type of approach was also advocated in Zhou (1996, p. 112).

3.8.3 Fourier approach

Motivated by the problem of irregularly spaced data, where the spacing is independent of Y, Malliavin and Mancino (2002) showed that if Y ∈ BSM then

  [Y^l_J, Y^k_J]_{2π} = (π²/J) Σ_{j=1}^J ( a^l_j a^k_j + b^l_j b^k_j ) →_p [Y^l, Y^k]_{2π},

as J → ∞, where the Fourier coefficients of Y are

  a^l_j = (1/π) ∫_0^{2π} cos(ju) dY^l_u,   b^l_j = (1/π) ∫_0^{2π} sin(ju) dY^l_u.

The Fourier coefficients are computed by, for example, integration by parts,

  a^l_j = (1/π) (Y^l_{2π} − Y^l_0) + (j/π) ∫_0^{2π} sin(ju) Y^l_u du
        ≃ (1/π) (Y^l_{2π} − Y^l_0) + (1/π) Σ_{i=0}^{n−1} { cos(j t_i) − cos(j t_{i+1}) } Y^l_{t_i},
  b^l_j ≃ (1/π) Σ_{i=0}^{n−1} { sin(j t_i) − sin(j t_{i+1}) } Y^l_{t_i}.    (28)

This means that, in principle, one can use all the available data for all the series, even though prices for different assets appear at different points in time. Indeed each series has its Fourier coefficients computed separately, the multivariate aspect of the analysis only being performed at step (28). A similar type of analysis could be based on wavelets; see Hog and Lunde (2003). The performance of this Fourier estimator of QV is discussed by, for example, Barucci and Reno (2002b), Barucci and Reno (2002a), Kanatani (2004b), Precup and Iori (2005), Nielsen and Frederiksen (2005) and Kanatani (2004a), who carry out some extensive simulation and empirical studies of the procedure.
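The Fourier estimator can be sketched in a few lines for a univariate simulated path on [0, 2π] with irregular observation times. All design choices below are ours, and we compute the coefficients with the simple Itô-sum discretisation of the stochastic integrals rather than the integration-by-parts form in (28); the two agree to first order.

```python
import numpy as np

rng = np.random.default_rng(8)
n = 5000
t = np.concatenate(([0.0], np.sort(rng.uniform(0.0, 2 * np.pi, n)), [2 * np.pi]))
dt = np.diff(t)                                     # n + 1 irregular spacings
sigma = 0.1
dy = sigma * rng.normal(size=n + 1) * np.sqrt(dt)   # increments of Y at the irregular times

J = 300
j = np.arange(1, J + 1)[:, None]                    # frequencies 1..J
a = (np.cos(j * t[:-1]) * dy).sum(axis=1) / np.pi   # Ito-sum discretisation of a_j
b = (np.sin(j * t[:-1]) * dy).sum(axis=1) / np.pi   # and of b_j
qv_fourier = (np.pi**2 / J) * np.sum(a**2 + b**2)
qv_direct = np.sum(dy**2)
print(qv_fourier, qv_direct)        # both near sigma^2 * 2 pi ~ 0.0628
```

The multivariate version simply replaces a² + b² by the cross products of each series' coefficients, which is what allows non-synchronous records to be combined.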
Reno (2003) has used a multivariate version of this method to study the Epps effects, while Mancino and Reno (2005) use it to look at dynamic principal components. Kanatani (2004a, p. 22) has shown that in the univariate case the finite-J Fourier estimator can be written as a kernel estimator (25). For regularly spaced data he derived the weight function, noting that as J increases each of these weights declines, so that for fixed δ, [Y_J]_{2π} → [Y_δ]_{2π}. An important missing component in this analysis is any CLT for this estimator.

3.8.4 Generalised bipower variation

The realised bipower variation process suggests studying generic statistics of the form introduced by Barndorff-Nielsen, Graversen, Jacod, Podolskij, and Shephard (2005) and Barndorff-Nielsen, Graversen, Jacod, and Shephard (2005),

  Y_δ(g, h)_t = δ Σ_{j=1}^{⌊t/δ⌋} g( δ^{−1/2} (Y_{δ(j−1)} − Y_{δ(j−2)}) ) h( δ^{−1/2} (Y_{δj} − Y_{δ(j−1)}) ),    (29)

where the multivariate Y ∈ BSM and g, h are conformable matrices with elements which are continuous with at most polynomial growth in their arguments. Both QV and multivariate BPV can be cast in this form by the appropriate choice of g, h. Some of the choices of g, h will deliver statistics which will be robust to jumps. Barndorff-Nielsen, Graversen, Jacod, Podolskij, and Shephard (2005) have shown that as δ ↓ 0 the probability limit of this process is always the generalised BPV process

  ∫_0^t ρ_{σ_u}(g) ρ_{σ_u}(h) du,

where the convergence is locally uniform, ρ_σ(g) = E g(X) and X ∼ N(0, σσ′). They also provide a central limit theorem for the generalised power variation estimator. An example of the above framework which we have not covered yet is obtained by selecting h(y) = |y^{(l)}|^r for r > 0 and g(y) = 1; then (29) becomes

  δ^{1−r/2} Σ_{j=1}^{⌊t/δ⌋} | Y^{(l)}_{δ(j−1)} − Y^{(l)}_{δ(j−2)} |^r,    (30)

which is called the realised r-th order power variation.
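The statistic (29) is straightforward to compute for scalar g and h. The sketch below (our design, simulated constant-volatility data) recovers realised QV with g = 1, h(y) = y², and realised BPV, scaled by μ₁² where μ₁ = E|N(0,1)| = √(2/π), with g = h = |y|; adding a single jump to the path moves the former but barely affects the latter.

```python
import numpy as np

def generalised_bpv(r, g, h, delta):
    """Y_delta(g, h)_t in (29): delta * sum_j g(u_{j-1}) h(u_j), u_j = r_j / sqrt(delta)."""
    u = r / np.sqrt(delta)
    return delta * np.sum(g(u[:-1]) * h(u[1:]))

rng = np.random.default_rng(9)
n = 50_000
delta = 1.0 / n
r = 0.1 * rng.normal(0.0, np.sqrt(delta), n)     # BSM returns, integrated variance 0.01

mu1 = np.sqrt(2.0 / np.pi)                       # E|N(0,1)|

qv = generalised_bpv(r, lambda x: np.ones_like(x), lambda x: x**2, delta)  # g=1, h=y^2
bpv = generalised_bpv(r, np.abs, np.abs, delta) / mu1**2                   # g=h=|y|

r_jump = r.copy()
r_jump[n // 2] += 0.05                           # add a single jump to the returns
qv_jump = generalised_bpv(r_jump, lambda x: np.ones_like(x), lambda x: x**2, delta)
bpv_jump = generalised_bpv(r_jump, np.abs, np.abs, delta) / mu1**2
print(qv, bpv, qv_jump, bpv_jump)
```

The jump shifts the QV-type statistic by roughly the squared jump size, while the bipower statistic is essentially unchanged, which is the jump-robustness property mentioned above.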
When r is an integer it has been studied from a probabilistic viewpoint by Jacod (1994), while Barndorff-Nielsen and Shephard (2003) look at the econometrics of the case where r > 0. The increments of these types of high frequency volatility measures have been informally used in the financial econometrics literature for some time when r = 1, but until recently without a strong understanding of their properties. Examples of their use include Schwert (1990), Andersen and Bollerslev (1998b) and Andersen and Bollerslev (1997), while they have also been abstractly discussed by Shiryaev (1999, pp. 349-350) and Maheswaran and Sims (1993). Following the work by Barndorff-Nielsen and Shephard (2003), Ghysels, Santa-Clara, and Valkanov (2004) and Forsberg and Ghysels (2004) have successfully used realised power variation as an input into volatility forecasting competitions. It is unclear how the greater flexibility over the choice of g and h will help econometricians in the future to learn about new features of volatility and jumps, perhaps robustly to market frictions. It would also be attractive if one could generalise (29) to allow g and h to be functions of the path of the prices, not just returns.

3.9 Non-parametric forecasting

3.9.1 Background

We saw in section 2.4 that if s is small then

  Cov(Y_{t+s} − Y_t | F_t) ≃ E([Y]_{t+s} − [Y]_t | F_t).

This suggests: 1. estimating components of the increments of QV; 2. projecting these terms forward using a time series model. This separates out the task of historical measurement of past volatility (step 1) from the problem of forecasting (step 2). Suppose we wish to make a sequence of one-step or multi-step ahead predictions of V_i = [Y]_{hi} − [Y]_{h(i−1)} using their proxies V̂_i = [Y_δ]_{hi} − [Y_δ]_{h(i−1)}, raw returns y_i = Y_{hi} − Y_{h(i−1)} (to try to deal with leverage effects) and components B̂_i = {Y_δ}_{hi} − {Y_δ}_{h(i−1)}, where i = 1, 2, ..., T. For simplicity of exposition we set h = 1.
This setup exploits the high frequency information set, but is somewhat robust to the presence of complicated intraday effects. Clearly if Y ∈ BSM then the CLT for realised QV implies that as δ ↓ 0, so long as the moments exist,

  E(V_i | F_{i−1}) ≃ E(V̂_i | F_{i−1}) + o(δ^{1/2}).

It is compelling to choose to use the coarser information set, so

  Cov(Y_i − Y_{i−1} | V̂_{i−1}, ..., V̂_1, B̂_{i−1}, ..., B̂_1, y_{i−1}, ..., y_1)
    ≃ E(V_i | V̂_{i−1}, ..., V̂_1, B̂_{i−1}, ..., B̂_1, y_{i−1}, ..., y_1)
    ≃ E(V̂_i | V̂_{i−1}, ..., V̂_1, B̂_{i−1}, ..., B̂_1, y_{i−1}, ..., y_1).

Forecasting can be carried out using structural or reduced form models. The simplest reduced form approach is to forecast V̂_i using the past history V̂_{i−1}, ..., V̂_1, y_{i−1}, ..., y_1 and B̂_{i−1}, ..., B̂_1, based on standard forecasting methods such as autoregressions. The earliest modelling of this type that we know of was carried out by Rosenberg (1972), who regressed V̂_i on V̂_{i−1} to show, for the first time in the academic literature, that volatility is partially forecastable. This approach to forecasting is convenient but potentially inefficient, for it fails to use all the available high frequency data. In particular, if Y ∈ SV then accurately modelled high frequency data may allow us to accurately estimate the spot covariance Σ_{(i−1)h}, which would be a more informative indicator than V̂_{i−1}. However, the results in Andersen, Bollerslev, and Meddahi (2004) are reassuring on that front. They indicate that if Y ∈ SV there is only a small loss in efficiency from forgoing Σ_{(i−1)h} and using V̂_{i−1} instead. Further, Ghysels, Santa-Clara, and Valkanov (2004) and Forsberg and Ghysels (2004) have forcefully argued that very significant forecast gains can be achieved by additionally conditioning on low power variation statistics (30).

3.9.2 Univariate illustration

In this subsection we will briefly illustrate some of these suggestions in the univariate case.
Much more sophisticated studies are given in, for example, Andersen, Bollerslev, Diebold, and Labys (2001), Andersen, Bollerslev, Diebold, and Ebens (2001), Andersen, Bollerslev, Diebold, and Labys (2003), Bollerslev, Kretschmer, Pigorsch, and Tauchen (2005) and Andersen, Bollerslev, and Meddahi (2004), who look at various functional forms, differing asset types and more involved dynamics. Ghysels, Santa-Clara, and Valkanov (2004) suggest an alternative method, using high frequency data but exploiting more sophisticated dynamics through so-called MIDAS regressions.

Table 3 gives a simple example of this approach for 100 times the returns on the DM/Dollar series. It shows the result of regressing V̂_i on a constant and simple lagged versions of V̂_i and B̂_i. We dropped a priori the use of the y_i as regressors for this exchange rate, where leverage effects are usually not thought to be important. The unusual spacing, using 1, 5, 20 and 40 lags, mimics the approach used by Corsi (2003) and Andersen, Bollerslev, and Diebold (2003).

  Const           V̂_{i−1}         V̂_{i−5}         V̂_{i−20}        V̂_{i−40}        B̂_{i−1}        B̂_{i−5}        B̂_{i−20}       B̂_{i−40}        log L     Port_49
  0.503 (0.010)                                                                                                                                    -1751.42    4660
  0.170 (0.016)   0.413 (0.018)   0.153 (0.018)   0.061 (0.018)   0.030 (0.017)                                                                    -1393.41     199
  0.139 (0.017)  -0.137 (0.059)  -0.076 (0.059)  -0.017 (0.058)   0.116 (0.058)   0.713 (0.075)   0.270 (0.074)  0.091 (0.074)  -0.110 (0.073)     -1336.81     108
  0.139 (0.017)                                                                   0.551 (0.023)   0.180 (0.023)  0.071 (0.022)   0.027 (0.021)     -1342.03     122

Table 3: Prediction for 100 times returns on the DM/Dollar series. Dynamic regressions, predicting future daily realised QV V̂_i using lagged values of V̂_i and of the estimated realised BPV terms B̂_i. Subscripts denote the lag length in this table. Everything is computed using 10 minute returns. Figures in brackets are asymptotic standard errors. Port_49 denotes the Box-Ljung portmanteau statistic computed with 49 lags, while log L denotes the Gaussian likelihood. Software used was PcGive.

The results are quite striking. None of the models have satisfactory Box-Ljung portmanteau tests (this can be fixed by including a moving average error term in the model), but the inclusion of lagged information is massively significant. The lagged realised volatilities seem to do a reasonable job at soaking up the dependence in the data, but the effect of bipower variation is more important. This is in line with the results in Andersen, Bollerslev, and Diebold (2003), who first noted this effect. See also the work of Forsberg and Ghysels (2004) on the effect of including other power variation statistics in forecasting.

Table 4 shows some rather more sophisticated results. Here we model returns directly using a GARCH type model, but also include lagged explanatory variables in the conditional variance. This is in the spirit of the work of Engle and Gallo (2005). The results above the line show the homoskedastic fit and the improvement resulting from the standard GARCH(1,1) model. Below the line we include a variety of realised variables as explanatory variables; including longer lags of realised variables does not improve the fit. The best combination has a large coefficient on realised BPV and a negative coefficient on realised QV. This means that when there is evidence for a jump the impact of realised volatility is tempered, while when there is no sign of a jump the realised variables are seen with full force. What is interesting from these results is that the realised effects are very much more important than the lagged daily returns. In effect the realised quantities have basically tested out the traditional GARCH model. Overall this tiny empirical study confirms the results in the literature about the predictability of realised volatility, and shows that it is quite easy to outperform a simple autoregressive model for RV.
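The reduced-form regressions of Table 3 can be mimicked in a few lines. The sketch below is ours, not the paper's code: it simulates a persistent series standing in for the daily realised QV (the DM/Dollar data are not reproduced here) and runs an OLS regression on the first lag and on averages over the last 5, 20 and 40 days, a HAR-style variant of the 1/5/20/40 lag spacing used in the table.

```python
import numpy as np

rng = np.random.default_rng(10)
T, burn = 2000, 200
lv = np.zeros(T + burn)
for i in range(1, T + burn):                  # persistent log-variance, AR(1)
    lv[i] = 0.98 * lv[i - 1] + rng.normal(0.0, 0.2)
v = np.exp(lv[burn:])                         # stand-in for the daily realised QV series

def lag_mean(x, i, k):
    return np.mean(x[i - k : i])              # average of lags 1..k, HAR-style regressor

rows, target = [], []
for i in range(40, T):
    rows.append([1.0, v[i - 1], lag_mean(v, i, 5), lag_mean(v, i, 20), lag_mean(v, i, 40)])
    target.append(v[i])
X, yv = np.array(rows), np.array(target)
beta, *_ = np.linalg.lstsq(X, yv, rcond=None)  # OLS coefficients
r2 = 1.0 - np.var(yv - X @ beta) / np.var(yv)
print(beta, r2)
```

With a persistent simulated series the lagged realised quantities explain a large share of next-day variance, mirroring the strong in-sample fits reported in Table 3; the bipower and GARCH-type extensions of Table 4 simply add further columns to the regressor matrix or move the same terms into a conditional variance recursion.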
We can see how useful bipower variation is, and that, taken together, the realised quantities provide a coherent way of empirically forecasting future volatility.

   Const           Realised terms                     Standard GARCH terms                     Summary
                   $\hat{V}_{i-1}$  $\hat{B}_{i-1}$   $(Y_{i-1}-Y_{i-2})^2$  $h_{i-1}$        log L
   0.504 (0.021)                                                                              -2636.59
   0.008 (0.003)                                      0.053 (0.010)          0.930 (0.013)    -2552.10
   0.017 (0.009)   -0.115 (0.039)  0.253 (0.076)      0.019 (0.019)          0.842 (0.052)    -2533.89
   0.011 (0.008)   0.085 (0.042)                      0.015 (0.017)          0.876 (0.049)    -2537.49
   0.014 (0.009)                   0.120 (0.058)      0.013 (0.019)          0.853 (0.055)    -2535.10
   0.019 (0.010)   -0.104 (0.074)  0.282 (0.116)                             0.822 (0.062)    -2534.89

Table 4: Prediction for 100 times returns $Y_i - Y_{i-1}$ on the DM/Dollar series. GARCH type model of the conditional variance $h_i$ of daily returns, using lagged squared returns $(Y_{i-1}-Y_{i-2})^2$, realised QV $\hat{V}_{i-1}$, realised BPV $\hat{B}_{i-1}$ and lagged conditional variance $h_{i-1}$. Throughout a Gaussian quasi-likelihood is used. Robust standard errors are reported. Carried out using PcGive.

3.9.3 Multivariate illustration

TO BE ADDED

3.10 Parametric inference and forecasting

Throughout we have emphasised the non-parametric nature of the analysis. This is helpful due to the strong and complicated diurnal patterns we see in volatility. These effects tend also to be unstable through time and so are difficult to model parametrically. A literature which mostly avoids this problem is that on estimating parametric SV models from low frequency data. Much of this is reviewed in Shephard (2005, Ch. 1). Examples include the use of Markov chain Monte Carlo methods (e.g. Kim, Shephard, and Chib (1998)) and the efficient method of moments (e.g. Chernov, Gallant, Ghysels, and Tauchen (2003)). Both approaches are computationally intensive and intricate to code. Simpler method of moments procedures (e.g. Andersen and Sørensen (1996)) have the difficulty that they are sensitive to the choice of moments and can be rather inefficient.
Recently various researchers have used the time series of realised daily QV to estimate parametric SV models. These models ignore the intraday effects and so are theoretically misspecified. Typically the researchers use various simple types of method of moments estimators, relying on the great increase in information available from realised statistics to overcome the inefficiency caused by the use of relatively crude statistical methods. The first papers to do this were Barndorff-Nielsen and Shephard (2002) and Bollerslev and Zhou (2002), who studied the first two dynamic moments of the time series $\hat{V}_1, \hat{V}_2, \ldots, \hat{V}_T$ implied by various common volatility models and used these to estimate the parameters embedded within the SV models. More sophisticated approaches have been developed by Corradi and Distaso (2004) and Phillips and Yu (2005). Barndorff-Nielsen and Shephard (2002) also studied the use of these second order properties of the realised quantities to estimate $V_1, V_2, \ldots, V_T$ from the time series of $\hat{V}_1, \hat{V}_2, \ldots, \hat{V}_T$ using the Kalman filter. This exploited the asymptotic theory for the measurement error (17). See also the work of Meddahi (2002), Andersen, Bollerslev, and Meddahi (2004) and Andersen, Bollerslev, and Meddahi (2005).

3.11 Forecast evaluation

One of the main early uses of realised volatility was to provide an instrument for measuring the success of various volatility forecasting methods. Andersen and Bollerslev (1998a) studied the correlation between $V_i$ or $\hat{V}_i$ and $h_i$, the conditional variance from a GARCH model based on daily returns from time 1 up to time $i-1$. They used these results to argue that GARCH models were more successful than had been previously understood in the empirical finance literature. Hansen and Lunde (2005b) study a similar type of problem, but look at a wider class of forecasting models and carry out formal testing of the superiority of one modelling approach over another.
Hansen and Lunde (2005a) and Patton (2005) have focused on the delicate implications of the use of different loss functions to discriminate between competing forecasting models, where the object of the forecasting is $\mathrm{Cov}(Y_i - Y_{i-1}|\mathcal{F}_{i-1})$. They use $\hat{V}_i$ to proxy this unobserved covariance. See also the related work of Koopman, Jungbacker, and Hol (2005).

4 Jumps

4.1 Bipower variation

In this short section we will review some material which non-parametrically identifies the contribution of jumps to the variation of asset prices. A focus will be on using this method for testing for jumps from discrete data. We will also discuss some work by Cecilia Mancini which provides an alternative to BPV for splitting up QV into its continuous and discontinuous components.

Recall that $\mu_1^{-2}\{Y\}_t = \int_0^t \Sigma_u\,\mathrm{d}u$ when $Y$ is a BSM plus jump process given in (7). The BPV process is consistently estimated by the $p\times p$ matrix realised BPV process $\{Y_\delta\}$, defined in (10). This means that we can, in theory, consistently estimate $[Y^{ct}]$ and $[Y^{d}]$ by $\mu_1^{-2}\{Y_\delta\}$ and $[Y_\delta] - \mu_1^{-2}\{Y_\delta\}$, respectively.

One potential use of $\{Y_\delta\}$ is to test the hypothesis that a set of data is consistent with a null hypothesis of continuous sample paths. We can do this by asking if $[Y_\delta]_t - \mu_1^{-2}\{Y_\delta\}_t$ is statistically significantly bigger than zero — an approach introduced by Barndorff-Nielsen and Shephard (2006). This demands a distribution theory for realised BPV objects, calculated under the null that $Y \in BSM$ with $\sigma > 0$. Building on the earlier CLT of Barndorff-Nielsen and Shephard (2006), Barndorff-Nielsen, Graversen, Jacod, Podolskij, and Shephard (2005) have established a CLT which covers this situation when $Y \in BSM$. We will only present the univariate result, which has that as $\delta \downarrow 0$ so
$$\delta^{-1/2}\left(\{Y_\delta\}_t - \{Y\}_t\right) \to \mu_1^2\sqrt{2+\vartheta}\int_0^t \sigma_u^2\,\mathrm{d}B_u, \qquad (31)$$
where $B \perp\!\!\!\perp Y$, the convergence is in law stable as a process and $\vartheta = \pi^2/4 + \pi - 5 \simeq 0.6090$.
This result, unlike Theorem 1, has some quite technical conditions associated with it in order to control the degree to which the volatility process can jump; however, we will not discuss those issues here. Extending the result to cover the joint distribution of the estimators of the QV and BPV processes, they showed that
$$\delta^{-1/2}\begin{pmatrix}\mu_1^{-2}\{Y_\delta\}_t - \mu_1^{-2}\{Y\}_t\\ [Y_\delta]_t - [Y]_t\end{pmatrix} \stackrel{L}{\to} MN\left(0,\;\begin{pmatrix}2+\vartheta & 2\\ 2 & 2\end{pmatrix}\int_0^t\sigma_u^4\,\mathrm{d}u\right),$$
a Hausman (1978) type result, as the estimator of the QV process is, of course, fully asymptotically efficient when $Y \in BSM$. Consequently
$$\frac{\delta^{-1/2}\left([Y_\delta]_t - \mu_1^{-2}\{Y_\delta\}_t\right)}{\sqrt{\vartheta\int_0^t\sigma_u^4\,\mathrm{d}u}} \stackrel{L}{\to} N(0,1), \qquad (32)$$
which can be used as the basis of a test of the null of no jumps.

4.2 Multipower variation

The "standard" estimator of integrated quarticity, given in (18), is not robust to jumps. One way of overcoming this problem is to use a multipower variation (MPV) measure — introduced by Barndorff-Nielsen and Shephard (2006). This is defined as
$$\{Y^{[r]}\}_t = \operatorname*{p-lim}_{\delta\downarrow 0}\;\delta^{1-r_+/2}\sum_{j=1}^{\lfloor t/\delta\rfloor}\prod_{i=1}^{I}\left|Y_{\delta(j-i)} - Y_{\delta(j-1-i)}\right|^{r_i},$$
where $r_i > 0$ for all $i$, $r = (r_1, r_2, \ldots, r_I)'$ and $r_+ = \sum_{i=1}^I r_i$. The usual BPV process is the special case $\{Y^{[1,1]}\}_t = \{Y\}_t$. If $Y$ obeys (7) and $r_i < 2$ then
$$\{Y^{[r]}\}_t = \left(\prod_{i=1}^I \mu_{r_i}\right)\int_0^t \sigma_u^{r_+}\,\mathrm{d}u.$$
This process is approximated by the estimated MPV process
$$\{Y_\delta^{[r]}\}_t = \delta^{1-r_+/2}\sum_{j=1}^{\lfloor t/\delta\rfloor}\prod_{i=1}^{I}\left|Y_{\delta(j-i)} - Y_{\delta(j-1-i)}\right|^{r_i}.$$
In particular the scaled realised tri and quadpower variations, $\mu_{4/3}^{-3}\{Y_\delta^{[4/3,4/3,4/3]}\}_t$ and $\mu_1^{-4}\{Y_\delta^{[1,1,1,1]}\}_t$ respectively, both estimate $\int_0^t\sigma_u^4\,\mathrm{d}u$ consistently in the presence of jumps. Hence either of these objects can be used to replace the integrated quarticity in (32), so producing a non-parametric test for the presence of jumps in the interval $[0,t]$. The test is conditionally consistent, meaning that if there is a jump it will be detected, and it has asymptotically the correct size.
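As a concrete illustration, the following sketch (simulated data on a unit time interval; the function name and simulation design are our own) forms a feasible version of the linear jump statistic in (32), using quadpower variation for the integrated quarticity:

```python
import numpy as np

MU1 = np.sqrt(2.0 / np.pi)            # mu_1 = E|N(0,1)|
THETA = np.pi ** 2 / 4 + np.pi - 5    # vartheta, approximately 0.6090

def linear_jump_stat(y):
    """Linear jump test in the spirit of (32) for a path observed on [0,1]:
    delta^{-1/2}([Y_d]_t - mu1^{-2}{Y_d}_t) / sqrt(theta * IQ_hat),
    with IQ_hat the quadpower estimate of integrated quarticity."""
    r = np.abs(np.diff(y))
    n = r.size
    delta = 1.0 / n
    qv = np.sum(r ** 2)                                   # realised QV
    bpv = np.sum(r[1:] * r[:-1]) / MU1 ** 2               # scaled realised BPV
    iq = np.sum(r[3:] * r[2:-1] * r[1:-2] * r[:-3]) / (delta * MU1 ** 4)
    return (qv - bpv) / np.sqrt(delta * THETA * iq)

rng = np.random.default_rng(42)
n = 50_000
bm = np.cumsum(np.sqrt(1.0 / n) * rng.standard_normal(n))
jumpy = bm.copy()
jumpy[n // 2:] += 1.0                 # add a single jump of size 1
stat_null = linear_jump_stat(bm)      # approximately N(0,1) under no jumps
stat_jump = linear_jump_stat(jumpy)   # far out in the right tail
```

Under the Brownian null the statistic behaves like a standard normal, while the jump pushes the QV estimate well above the BPV estimate and the statistic far into the rejection region.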
Extensive small sample studies are reported in Huang and Tauchen (2005), who favour ratio versions of the statistic such as
$$\delta^{-1/2}\,\frac{\dfrac{\mu_1^{-2}\{Y_\delta\}_t}{[Y_\delta]_t} - 1}{\sqrt{\vartheta\,\dfrac{\{Y_\delta^{[1,1,1,1]}\}_t}{\left(\{Y_\delta\}_t\right)^2}}} \stackrel{L}{\to} N(0,1),$$
which has quite reasonable finite sample properties. They also show that this test tends to under-reject the null of no jumps in the presence of some forms of market frictions.

It is clearly possible to carry out jump testing on separate days or weeks. Such tests are asymptotically independent over these non-overlapping periods under the null hypothesis.

To illustrate this methodology we will apply the jump test to the DM/Dollar rate, asking if the hypothesis of a continuous sample path is consistent with the data we have. Our focus will mostly be on Friday January 15th 1988, although we will also give results for neighbouring days to provide some context. In Figure 5 we plot 100 times the change during the week of the discretised $Y_\delta$, so a one unit uptick represents a 1% change, for a variety of values of $n = 1/\delta$, as well as giving the ratio jump statistics $\hat{B}_i/\hat{V}_i$ with their corresponding 99% critical values.

In Figure 5 there is a large uptick in the D-mark against the Dollar, with a movement of nearly two percent in a five minute period. This occurred on the Friday and was a response to the news of a large fall in the U.S. balance of payments deficit, which led to a large strengthening of the Dollar. The data for January 15th had a large $\hat{V}_i$ but a much smaller $\hat{B}_i$. Hence the statistics are attributing a large component of $\hat{V}_i$ to the jump, with the adjusted ratio statistic being larger than the corresponding 99% critical value. When $\delta$ is large the statistic is on the borderline of being significant, while the situation becomes much clearer as $\delta$ becomes small.
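A sketch of a ratio-type statistic in this spirit on simulated data (the exact studentisation used by Huang and Tauchen may differ in detail; names are ours):

```python
import numpy as np

MU1 = np.sqrt(2.0 / np.pi)
THETA = np.pi ** 2 / 4 + np.pi - 5

def ratio_jump_stat(y):
    """Ratio jump statistic: delta^{-1/2}(mu1^{-2}{Y_d}/[Y_d] - 1)
    over sqrt(theta * {Y_d}^[1,1,1,1] / ({Y_d})^2)."""
    r = np.abs(np.diff(y))
    n = r.size
    delta = 1.0 / n
    qv = np.sum(r ** 2)                                   # [Y_d]
    bpv = np.sum(r[1:] * r[:-1])                          # {Y_d}, unscaled
    quad = np.sum(r[3:] * r[2:-1] * r[1:-2] * r[:-3]) / delta
    num = bpv / (MU1 ** 2 * qv) - 1.0                     # about 0 if no jumps
    return num / np.sqrt(delta * THETA * quad / bpv ** 2)

rng = np.random.default_rng(7)
n = 50_000
bm = np.cumsum(np.sqrt(1.0 / n) * rng.standard_normal(n))
jumpy = bm.copy()
jumpy[n // 2:] += 1.0
stat_null = ratio_jump_stat(bm)       # roughly standard normal
stat_jump = ratio_jump_stat(jumpy)    # large and negative
```

Jumps drive the BPV/QV ratio below one, so this one-sided test rejects for large negative values of the statistic.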
This illustration is typical of the results presented in Barndorff-Nielsen and Shephard (2006), which showed that many of the large jumps in this exchange rate correspond to macroeconomic news announcements. This is consistent with the recent economics literature documenting significant intraday announcement effects, e.g. Andersen, Bollerslev, Diebold, and Vega (2003).

Figure 5: Left hand side: change in $Y_\delta$ during a week, centred at 0 on Monday 11th January and running until the Friday of that week. Drawn every 20 and 5 minutes. An uptick of 1 indicates a strengthening of the Dollar by 1%. Right hand side shows an index plot of $\mu_1^{-2}\hat{B}_i/\hat{V}_i$, which should be around 1 if there are no jumps. The test is one sided, with critical values also drawn as a line.

4.3 Grids

It is clear that the martingale based CLT for irregularly spaced data for the estimator of the QV process can be extended to cover the BPV case. We define
$$\{Y_{G_n}\}_t = \sum_{t_j\le t}\left|Y_{t_{j-1}} - Y_{t_{j-2}}\right|\left|Y_{t_j} - Y_{t_{j-1}}\right| \stackrel{p}{\to} \{Y\}_t.$$
Using the same notation as before, we would expect the following result to hold, due to the fact that $H^G$ is assumed to be continuous:
$$\delta^{-1/2}\begin{pmatrix}\mu_1^{-2}\{Y_{G_n}\}_t - \mu_1^{-2}\{Y\}_t\\ [Y_{G_n}]_t - [Y]_t\end{pmatrix} \stackrel{L}{\to} MN\left(0,\;\begin{pmatrix}2+\vartheta & 2\\ 2 & 2\end{pmatrix}\int_0^t\frac{\partial H_u^G}{\partial u}\,\sigma_u^4\,\mathrm{d}u\right).$$
The integrated moderated quarticity can be estimated using $\mu_1^{-4}\{Y_\delta^{[1,1,1,1]}\}_t$, or a grid version, which again implies that the usual feasible CLT continues to hold for irregularly spaced data. This is the expected result from the analysis of power variation provided by Barndorff-Nielsen and Shephard (2005c).
Potentially there are modest efficiency gains to be had by computing the estimators of BPV on multiple grids and then averaging them. The extension along these lines is straightforward and will not be detailed here.

4.4 Infinite activity jumps

The probability limit of realised BPV is robust to finite activity jumps. A natural question to ask is: (i) is the CLT also robust to jumps, and (ii) is the probability limit also unaffected by infinite activity jumps, that is, jump processes with an infinite number of jumps in any finite period of time? Both issues are studied by Barndorff-Nielsen, Shephard, and Winkel (2004) in the case where the jumps are of Lévy type, while Woerner (2004) looks at the probability limit for more general jump processes. Barndorff-Nielsen, Shephard, and Winkel (2004) find that the CLT for BPV is affected by finite activity jumps, but this is not true of tripower and higher order measures of variation. The reason for the robustness of the tripower results is quite technical and we will not discuss it here. However, it potentially means that inference under the assumption of jumps can be carried out using tripower variation, which seems an exciting possibility.

Both Barndorff-Nielsen, Shephard, and Winkel (2004) and Woerner (2004) give results which prove that the probability limit of realised BPV is unaffected by some types of infinite activity jump processes. More work is needed on this topic to make these results definitive. It is somewhat related to the parametric study of Aït-Sahalia (2004). He shows that maximum likelihood estimation can disentangle a homoskedastic diffusive component from a purely discontinuous infinite activity Lévy component of prices. Outside the likelihood framework, the paper also studies the optimal combinations of moment functions for the generalised method of moments estimation of homoskedastic jump-diffusions.
Further insights can be found by looking at likelihood inference for Lévy processes, which is studied by Aït-Sahalia and Jacod (2005a) and Aït-Sahalia and Jacod (2005b).

4.5 Testing the null of no continuous component

In some stimulating recent papers, Carr, Geman, Madan, and Yor (2003) and Carr and Wu (2004) have argued that it is attractive to build SV models out of pure jump processes, with no Brownian aspect. This is somewhat related to the material we discuss in section 5.6. It is clearly important to be able to test this hypothesis, seeing if pure discreteness is consistent with observed prices. Barndorff-Nielsen, Shephard, and Winkel (2004) showed that
$$\delta^{-1/2}\left(\mu_{2/3}^{-3}\{Y_\delta^{[2/3,2/3,2/3]}\}_t - [Y^{ct}]_t\right)$$
has a mixed Gaussian limit and is robust to jumps. But this result is only valid if $\sigma > 0$, which rules out its use for testing for pure discreteness. However, we can artificially add a scaled Brownian motion, $U = \sigma B$, to the observed price process and then test if
$$\delta^{-1/2}\left(\mu_{2/3}^{-3}\{Y_\delta + U_\delta\}_t^{[2/3,2/3,2/3]} - \sigma^2 t\right)$$
is statistically significantly greater than zero. In principle this would be a consistent non-parametric test of the maintained hypothesis of Peter Carr and his coauthors.

4.6 Alternative methods for identifying jumps

Mancini (2001), Mancini (2004) and Mancini (2003) has developed robust estimators of $[Y^{ct}]$ in the presence of finite activity jumps. Her approach is to use truncation:
$$\sum_{j=1}^{\lfloor t/\delta\rfloor}\left(Y_{j\delta} - Y_{(j-1)\delta}\right)^2 I\left(\left|Y_{j\delta} - Y_{(j-1)\delta}\right| < r_\delta\right), \qquad (33)$$
where $I(\cdot)$ is an indicator function. The crucial function $r_\delta$ has to have the property that $\sqrt{\delta\log\delta^{-1}}\,r_\delta^{-1} \downarrow 0$. It is motivated by the modulus of continuity of Brownian motion paths, which says that almost surely
$$\limsup_{\delta\downarrow 0}\ \sup_{\substack{0\le s,t\le T\\ |t-s|<\delta}}\frac{|W_s - W_t|}{\sqrt{2\delta\log\delta^{-1}}} = 1.$$
This is an elegant theory, which works when $Y \in BSM$. It is not prescriptive about the tuning function $r_\delta$, which is both an advantage and a drawback.
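A minimal sketch of the truncation idea in (33), with an assumed threshold function $r_\delta = c\,\delta^{0.49}$; the constant and exponent are our illustrative choices (the exponent just below 1/2 makes $\sqrt{\delta\log\delta^{-1}}/r_\delta \downarrow 0$), not a recommendation from Mancini's papers:

```python
import numpy as np

def truncated_rv(y, delta, c=4.0):
    """Truncation estimator in the spirit of (33): keep squared returns whose
    absolute value is below r_delta. Here r_delta = c * delta**0.49, an
    admissible choice since sqrt(delta*log(1/delta))/r_delta -> 0; the
    constant c and the exponent are purely illustrative."""
    r = np.diff(y)
    r_delta = c * delta ** 0.49
    return np.sum(r ** 2 * (np.abs(r) < r_delta))

rng = np.random.default_rng(1)
n = 100_000
delta = 1.0 / n
y = np.cumsum(np.sqrt(delta) * rng.standard_normal(n))   # sigma = 1
y[n // 2:] += 0.5                                        # one jump
plain_rv = np.sum(np.diff(y) ** 2)       # picks up the jump, near 1.25
trunc_rv = truncated_rv(y, delta)        # near int_0^1 sigma^2 du = 1
```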
Given that the threshold in (33) is universal, this method will throw out more returns as jumps during a high volatility period than during a low volatility period. Aït-Sahalia and Jacod (2005b, Section 7 onwards) provide additional insights into these types of truncation estimators in the case where $Y$ is scaled Brownian motion plus a homogeneous pure jump process. They develop a two-step procedure which automatically selects the level of truncation. Their analysis is broader still, providing additional insights into a range of power variation type objects.

5 Mitigating market frictions

5.1 Background

The semimartingale model of the frictionless, arbitrage free market is a fiction. When we use high frequency data to perform inference on either transaction or quote data then various market frictions can become important. O'Hara (1995), Engle (2000), Hasbrouck (2003) and Engle and Russell (2005) review the detailed modelling of these effects. Inevitably such modelling is quite complicated.

With the exception of subsection 2.10, we have so far mostly ignored frictions by thinking of $\delta$ as being only moderately small. This is ad hoc and it is wise to try to identify the impact of frictions more formally. In this context the first econometric work was carried out by Fang (1996) and Andersen, Bollerslev, Diebold, and Labys (2000), who used so-called signature plots to assess the degree of bias caused by frictions using a variety of values of $\delta$. The signature plots we draw show the square root of the time series average of estimators of $V_i$ computed over many days, plotted against $\delta$. If the log-price process were a pure martingale then we would expect the plot to be roughly horizontal. Figure 6 shows the signature plot for the Vodafone series, discussed in section 2.10, calculated over 20 days in January 2004. Included in this plot is the standard deviation of daily returns $Y_i - Y_{i-1}$, which is around 0.01.
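A signature plot of this kind takes only a few lines to mimic on simulated data (all parameter values below are illustrative; the noise is of the iid F1W type discussed later in this section):

```python
import numpy as np

def signature(paths, strides):
    """Square root of the average realised variance across days, for each
    sampling stride, i.e. the quantity drawn in a volatility signature plot."""
    return {k: np.sqrt(np.mean([np.sum(np.diff(p[::k]) ** 2) for p in paths]))
            for k in strides}

rng = np.random.default_rng(2)
n_obs, kappa = 10_000, 0.01
days = []
for _ in range(20):
    eff = np.cumsum(np.sqrt(1.0 / n_obs) * rng.standard_normal(n_obs))
    days.append(eff + kappa * rng.standard_normal(n_obs))  # iid measurement noise
sig = signature(days, strides=(1, 10, 100))
# the noise inflates realised volatility as the sampling interval shrinks
```

With frictions present the curve rises as the sampling interval falls, instead of being roughly horizontal as a pure martingale would imply.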
The signature plot for realised volatility for Vodafone reinforces the results from section 2.10. The daily RV is most reliable in the case of interpolated mid-quote data, where the bias is rather moderate even with $\delta$ well below 5 minutes.13 Much larger biases appear when we base the statistics on transaction data, with raw transaction data being particularly vulnerable. The Figure also reports corresponding results for estimators we will discuss in this section: the kernel, two scale and alternation estimators. These statistics may be less influenced by frictions and so have the potential to exhibit less bias.

In order to deepen our understanding we will characterise four types of frictions: (i) liquidity effects, (ii) bid/ask bounce, (iii) discreteness, and (iv) Epps effects. We will not discuss other important frictions such as market closures. We will review some of the literature which has tried to mitigate the effects of market frictions on the estimation of the QV of the efficient semimartingale price. This is a very active area of research and some of the answers are less clear cut than in previous sections, as there is less agreement on how to model the frictions.

5.2 Four statistical models of frictions

The economics of market frictions involves a large number of traders competing against one another to maximise their expected utility. The econometrics literature on the effect of frictions on realised quantities almost entirely ignores the details of this economics and employs reduced form statistical models of the frictions. This is rather unsatisfactory but is typical of how

13 The reasonably flat signature plot for interpolated mid-quote data masks at least two offsetting effects: (i) an upward bias of RV for small $\delta$, which we will discuss in a moment, and (ii) a downward bias caused by the interpolation, which induces positive correlation amongst returns. In other datasets this offsetting may well not work.
A more thorough empirical study is provided by Hansen and Lunde (2006).

Figure 6: Signature plot for various estimators for the Vodafone stock during January 2004. Shows the square root of the long term average of estimators of $V_i$ based on data calculated over intervals of length $\delta$, recorded in minutes. The four panels show interpolated and raw versions of mid-quote and transaction returns. Estimators are squared daily returns (which do not depend on $\delta$), realised variance and the two scale, kernel and L (again does not depend upon $\delta$) estimators. Note that the two scale estimator could only be computed for $\delta \ge 0.1$; the results for $\delta < 0.1$ repeat the value recorded for $\delta = 0.1$. Code: lse RV.ox.

econometricians deal with measurement error in other areas of economics, reflecting the fact that frictions are nuisances to us as our scientific focus is on estimating $[Y]$.

Suppose $Y \in BSM$ is some latent efficient price obscured by frictions. Our desire is to estimate $[Y]$ from the distorted sample path $X$. Four basic statistical assumptions about the frictions seem worthy of study. Two use the notation $U = X - Y$, with $\rho_{\delta,j} = \mathrm{Cor}(U_{\delta i}, U_{\delta(i+j)})$, assuming this correlation does not depend upon $i$.

Assumption F1: stationary in observation time. For all $\delta$, $U_{\delta i}$ has a zero mean, variance of $\kappa^2$ and is a covariance stationary process with $\mathrm{Cor}(U_{\delta i}, U_{\delta(i+j)}) = \rho_{\delta,j} = \rho_j$. For simplicity of exposition we will also assume $U$ is Gaussian. Here the autocovariance function does not depend upon $\delta$ or $i$.14
This is not plausible empirically, as we can see from Figure 3. In particular it is strong to assume that the temporal behaviour of $U_\delta, U_{2\delta}, U_{3\delta}, \ldots$ does not vary with $\delta$. However, it is a simple structure for which we can give a relatively complete analysis.

The leading case of F1 is where $U_{\delta i} \sim NID(0, \kappa^2)$ over $i$ — the white noise assumption. We will write this as F1W. In this context it has been discussed by, for example, Zhou (1996), Bandi and Russell (2003), Hansen and Lunde (2006), Zhang, Mykland, and Aït-Sahalia (2005) and Zhang (2004). Bandi and Russell (2003), Hansen and Lunde (2006) and Aït-Sahalia, Mykland, and Zhang (2005b) have studied the more general case which allows autocorrelation in the errors. Note that Gloter and Jacod (2001a) and Gloter and Jacod (2001b) also study what happens if $\mathrm{Var}(U_{\delta i})$ varies with $\delta$.

F1 implies, for example, that $[U]_t = \infty$, which is inconvenient mathematically. Further, it means that we can very precisely estimate $Y_{\delta i}$ by simply repeatedly recording $X_{\delta i}$ (or versions of it very close in time to $\delta i$) and averaging, for as $\delta \downarrow 0$ the dependence pattern in the data peters out increasingly quickly in calendar time. This is most easily seen in the F1W case, but it holds more generally. In this setup we would expect to eventually be able to estimate $[Y]$ given enough data. This is indeed the case.

Assumption F2: diffusion and stationary in calendar time. $U \in BSM$, $E(U_t) = 0$. Throughout this discussion, for simplicity of exposition, we will assume $U$ is a Gaussian process.

F2 is very different from F1, for $[U]_t < \infty$ and $Y$ cannot be perfectly recovered by local averaging of $X$. Aït-Sahalia, Mykland, and Zhang (2005a) have studied this case when $Y$ is Brownian motion and $U$ is a Gaussian OU process which solves
$$\mathrm{d}U_t = -\lambda U_t\,\mathrm{d}t + \kappa\sqrt{2\lambda}\,\mathrm{d}B_t, \qquad (34)$$
where $B \perp\!\!\!\perp Y$ and is Brownian motion. Even if the full sample path of $U$ from time 0 to time $t$ is available, it is not possible to consistently estimate $\lambda$, as the log-likelihood for the path as a function of $\lambda$, given by the Radon-Nikodym derivative, is not degenerate. On the other hand, the value of $\kappa^2\lambda$ can, of course, be deduced from $[U]$. Thus $\mathrm{Var}(U_t)$ cannot be estimated just from the sample path of $U$ unless one knows either $\kappa$ or $\lambda$ a priori. This means it is impossible to consistently estimate $[Y]$ just from the sample path of $X$ if one allows F2. Hansen and Lunde (2006) have studied the properties of realised QV under more general conditions, which are closer to the full spirit of F2.

Assumption F3: purely discontinuous prices. $X$ is a pure point process and $X - Y$ is strictly stationary in business or transaction time. Oomens (2004) studies the properties of realised QV and Large (2005) provides a new type of estimator of $[Y]$ based on $X$.

Assumption F4: scrambled: (i) non-synchronous trading or quote updating, (ii) delays caused by reaction times, as information is absorbed into markets differentially quickly. F4(i) was the motivation for the work of Malliavin and Mancino (2002) discussed in Section 3.8.3. It also appears in the work of Hayashi and Yoshida (2005), which we will discuss at the end of this section.

Typically F1-F2 are strengthened by

Assumption IND: $U \perp\!\!\!\perp Y$.

IND is a strong assumption, which is studied at some length by Hansen and Lunde (2006), who argue that it is not reasonable in a number of empirical examples.

14 This assumption has some empirical attractions if it is applied to $\rho_{i-j} = \mathrm{Cor}(U_{t_i}, U_{t_j})$, the autocorrelation function in tick time. But we have already noted that this would require limit theorems for estimators of QV to hold for arbitrary stochastic $t_j$. Existing results for irregularly spaced data, such as Mykland and Zhang (2005), are proved under the assumption that the $t_j$ are independent of $Y$. The independence assumption is well known not to be innocuous.
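Under F2 the noise has finite QV: for the Gaussian OU process (34), $[U]_t = 2t\lambda\kappa^2$. A quick simulation check using the exact AR(1) discretisation of the OU process (all parameter values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)
lam, kappa, t, n = 50.0, 0.01, 1.0, 200_000   # illustrative values
d = t / n
phi = np.exp(-lam * d)                        # exact AR(1) autoregression
s = kappa * np.sqrt(1.0 - phi ** 2)           # innovation sd, so Var(U) = kappa^2
u = np.empty(n + 1)
u[0] = kappa * rng.standard_normal()
for i in range(n):
    u[i + 1] = phi * u[i] + s * rng.standard_normal()
qv_u = np.sum(np.diff(u) ** 2)                # close to [U]_t = 2*t*lam*kappa^2
```

Note that $\kappa^2\lambda$ is pinned down by the path through $[U]$, while $\kappa$ and $\lambda$ are not separately identified from it, exactly as the discussion above indicates.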
F1-F3 are all capable of impacting univariate realised analysis; Assumption F4 can only really matter in the multivariate case, for it hardly impacts univariate quadratic variations over moderately long stretches of time such as a day. Assumptions F1-F2 are purely statistical abstractions for measurement error, somewhat convenient for carrying out calculations. They are attempts to deal with some of the effects of liquidity and bid/ask bounce. F2 has the problem that it implies $X \in BSM$ and so needs additional assumptions in order to identify $[Y]$, such as $Y \in \mathcal{M}_{loc}$. F3 is motivated by the discreteness seen in most asset markets — see for example Figure 3. F4 tries to capture a feature of Epps (1979) effects, which were discussed in section 2.10.2.

5.3 Properties of $[U_\delta]$ in the univariate case under F1W and F2

Trivially under F1 and F2
$$[X_\delta] = [Y_\delta] + [U_\delta] + 2[U_\delta, Y_\delta].$$
It is useful to think of the conditional moments $E_{U|Y}([X_\delta] - [Y_\delta])$ and $\mathrm{Var}_{U|Y}([X_\delta] - [Y_\delta])$, for they give an impression of the effect of frictions. Given space constraints we will specialise the results for F1 to F1W — there is little technical loss in doing this as the results under the more general conditions of F1 are very similar.

Write $n = \lfloor t/\delta\rfloor$; then under IND we have that
$$E_U[U_\delta] = 2n\kappa^2,\quad E_{U|Y}[U_\delta, Y_\delta] = 0,\quad \mathrm{Var}_{U|Y}[U_\delta, Y_\delta] = 2\kappa^2[Y_\delta],$$
$$\mathrm{Cov}_{U|Y}([Y_\delta],[U_\delta,Y_\delta]) = 0,\quad \mathrm{Cov}_{U|Y}([U_\delta],[U_\delta,Y_\delta]) = 0,\quad [Y_\delta] \perp\!\!\!\perp [U_\delta]. \qquad (35)$$
To calculate $\mathrm{Var}[U_\delta]$, write $u_{a,b} = U_a - U_b$ and use the normality of $U$:
$$\mathrm{Cov}\left(u_{a,b}^2, u_{c,d}^2\right) = 2\,\mathrm{Cov}^2\left(u_{a,b}, u_{c,d}\right).$$
Taken together with the fact that $[U_\delta] = \sum_{i=1}^n u_{\delta i,\delta(i-1)}^2$, then under F1W
$$E_{U|Y}([X_\delta] - [Y_\delta]) = 2n\kappa^2,\qquad \mathrm{Var}[U_\delta] \simeq 12\kappa^4 n,$$
so
$$\mathrm{Var}_{U|Y}([X_\delta] - [Y_\delta]) \simeq 12\kappa^4 n + 8\kappa^2[Y_\delta]. \qquad (36)$$
This was reported by Fang (1996), Bandi and Russell (2003), Hansen and Lunde (2006) and Zhang, Mykland, and Aït-Sahalia (2005). Notice that as $\delta \downarrow 0$ both the bias and the variance go to infinity.
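The F1W moments in (35)-(36) are easy to verify by simulation (a sketch; all sizes are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(4)
kappa, n, reps = 0.05, 500, 4000
vals = np.empty(reps)
for j in range(reps):
    u = kappa * rng.standard_normal(n + 1)    # U_{di} ~ NID(0, kappa^2)
    vals[j] = np.sum(np.diff(u) ** 2)         # [U_delta]
mean_theory = 2 * n * kappa ** 2              # E[U_delta] = 2*n*kappa^2
var_theory = 12 * kappa ** 4 * n              # Var[U_delta] ~ 12*kappa^4*n
```

The simulated mean and variance of $[U_\delta]$ line up closely with the $2n\kappa^2$ and $12\kappa^4 n$ terms above.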
This formalises the well known result that $[X_\delta]$ is an inaccurate estimator of $[Y]$ under F1 when $\delta$ is very small. The generalisation of these results to F1 appears in the above papers and Aït-Sahalia, Mykland, and Zhang (2005b) — the key result being that the orders in $n$ do not change as we move to F1.

For mid-quote data $\kappa^2$ tends to be quite small and so the impact of terms which contain $\kappa^4$ tends to be tiny unless $n$ is massive. For transaction data this is typically not the case, for then we would expect $\kappa^2$ to be much bigger due to bid/ask bounce. Hansen and Lunde (2006) have studied the empirically important case where $[U_\delta, Y_\delta]$ is negative, which can happen when the IND assumption is dropped.

Under F2, if $U$ is a Gaussian OU process (34) then $E(U_t) = 0$ and $\mathrm{Var}(U_t) = \kappa^2$. Clearly for small $\delta$, under IND,
$$E_{U|Y}([X_\delta] - [Y_\delta]) = 2n\kappa^2\left(1 - \exp(-\lambda\delta)\right) \simeq 2t\lambda\kappa^2 = [U]_t, \qquad (37)$$
while from (17) the corresponding variance is approximately
$$2\delta\int_0^t\left(\sigma_u^2 + 2\lambda\kappa^2\right)^2\mathrm{d}u.$$
Notice that for large $\delta$, $[U]_t$ provides a poor approximation to the bias. As $\delta$ decreases, at first the bias increases and the variance falls, but eventually, as $\delta$ gets very small, the variance becomes small and the bias remains constant at $[U]_t$. Hence the results are materially different from the F1 case. The bias can be very large, as $\lambda$ is likely to be very large in practice, in order for the half life of $U$ to be very brief. This type of result appears in the parametric case where $Y$ is Brownian motion plus drift in Aït-Sahalia, Mykland, and Zhang (2005b, Section 9.2). Related work includes Gloter and Jacod (2001a) and Gloter and Jacod (2001b). The bias (37) continues to hold for non-Gaussian OU processes of the type discussed by Barndorff-Nielsen and Shephard (2001).

Under F1, Bandi and Russell (2003) and Hansen and Lunde (2006) have studied the problem of minimising $E\left([X_\delta] - [Y]\right)^2$ by choosing $\delta$.
A key issue is the estimation of $\kappa^2$, but under F1 this is consistently estimated by
$$\frac{1}{2n}\sum_{i=1}^{n}\left(X_{\delta i} - X_{\delta(i-1)}\right)^2.$$
For thickly traded assets they typically select $\delta$ to be between 1 and 15 minutes, but the results are sensitive to the choice of mid-quote or transaction data and to the use of interpolation or raw transforms. In recent work Bandi and Russell (2005) have extended their analysis to the multivariate case. See also the work of Martens (2003).

5.4 Subsampling

5.4.1 Raw subsampling

Recalling the definition of (23), we have that when $t_{kj} = \delta\left(j + \frac{k}{K+1}\right)$, then
$$[X_{G_{n+}(K)}] = \frac{1}{K+1}\sum_{i=0}^{K}[X_{G_n(i)}]_t,\quad\text{where}\quad [X_{G_n(i)}] = [Y_{G_n(i)}] + [U_{G_n(i)}] + 2[U_{G_n(i)}, Y_{G_n(i)}],$$
and so we can use (35) to calculate $E_{U|Y}[X_{G_{n+}(K)}]$. In particular under F1W
$$E_{U|Y}[X_{G_{n+}(K)}] = [Y_{G_{n+}(K)}] + 2n\kappa^2, \qquad (38)$$
so subsampling does not reduce the bias of the estimator. However, and crucially, under F1W we trivially have
$$\mathrm{Var}_{U|Y}[U_{G_{n+}(K)}] = \frac{1}{K+1}\mathrm{Var}[U_{G_n(i)}],\qquad \mathrm{Var}_{U|Y}[U_{G_{n+}(K)}, Y_{G_{n+}(K)}] = \frac{2\kappa^2}{K+1}[Y_{G_{n+}(K)}],$$
which means that, for fixed $\delta$, as $K \to \infty$ so
$$[X_{G_{n+}(K)}] \stackrel{p}{\to} \operatorname*{p\,lim}_{K\to\infty}[Y_{G_{n+}(K)}] + 2n\kappa^2.$$
This is a marked improvement over the realised QV, for in that case both the mean and variance explode as $\delta \downarrow 0$. This important point was first made in the F1W context by Zhang, Mykland, and Aït-Sahalia (2005). Aït-Sahalia, Mykland, and Zhang (2005b) extend this argument to the case of dependence in observation time.

Under F2, if $U$ is a Gaussian OU process (34) then for small $\delta$, as before,
$$E_{U|Y}[X_{G_{n+}(K)}] - [Y_{G_{n+}(K)}] = 2n\kappa^2\left(1 - \exp(-\lambda\delta)\right) \simeq [U]_t,$$
while from (17) the corresponding variance is approximately (see Example 2)
$$\frac{2}{(K+1)^2}\sum_{i=0}^{K}\sum_{k=0}^{K}\left\{\left(\frac{k-i}{K+1}\right)^2 + \left(1 - \frac{k-i}{K+1}\right)^2\right\}\,\delta\int_0^t\left(\sigma_u^2 + 2\lambda\kappa^2\right)^2\mathrm{d}u. \qquad (39)$$
Hence under F2 subsampling has broadly the same properties as the raw realised QV estimator studied in the previous subsection, but with a slightly smaller variance when $K$ is large.
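A sketch of raw subsampling on simulated F1W data (sizes illustrative), showing that the noise bias in (38) survives averaging across offset grids:

```python
import numpy as np

def subsampled_rv(x, stride):
    """Average of the realised variances computed on the `stride` offset
    grids, each sampling every stride-th observation (K + 1 = stride)."""
    return np.mean([np.sum(np.diff(x[i::stride]) ** 2) for i in range(stride)])

rng = np.random.default_rng(5)
fine_n, stride, kappa = 20_000, 20, 0.01
y = np.cumsum(np.sqrt(1.0 / fine_n) * rng.standard_normal(fine_n))
x = y + kappa * rng.standard_normal(fine_n)   # F1W noise
sub = subsampled_rv(x, stride)
# expectation is about [Y] + 2*n*kappa^2 with n = fine_n/stride: near 1.2 here
```

The variance of the estimator falls with the number of grids, but the $2n\kappa^2$ bias is untouched, which motivates the bias correction of the next subsection.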
5.4.2 Bias correction

Subsampling has the great virtue that under F1 or F2 its variance gets smaller as $\delta \downarrow 0$. This means that if we can bias correct these estimators then they will be consistent for $[Y]$. Under F1W, Zhang, Mykland, and Aït-Sahalia (2005) argued one should estimate $[Y]$ by
$$\langle X_{\delta,K}\rangle = [X_{G_{n+}(K)}] - \frac{1}{K+1}[X_{\delta/(K+1)}],$$
which they termed a "two scale estimator", one scale based on returns computed over intervals of length $\delta$, the other on intervals of length $\delta/(K+1)$. Clearly under F1W
$$E_U\left(\langle X_{\delta,K}\rangle\right) = [Y_{G_{n+}(K)}] - \frac{1}{K+1}[Y_{\delta/(K+1)}] + 2n\kappa^2 - \frac{(K+1)n}{K+1}2\kappa^2 \simeq [Y_{G_{n+}(K)}] - \frac{1}{K+1}[Y_{\delta/(K+1)}],$$
with the second term becoming negligible as $K$ increases. Hence the two scale estimator is consistent under F1W and IND. Zhang, Mykland, and Aït-Sahalia (2005) calculate the asymptotic distribution of $\langle X_{\delta,K}\rangle - [Y]$ and show it is mixed Gaussian, but with a slow rate of convergence. Zhang (2004) provides a multiscale estimator which has a faster rate of convergence, but exploits similar ideas. It is clear this argument also holds under F1 with time dependent errors which are stationary in observation time — for in calendar time the dependence in the data weakens as $\delta \downarrow 0$. A clear analysis of that setup is provided by Aït-Sahalia, Mykland, and Zhang (2005b).

Unfortunately this estimator falls over when we move to the F2 assumption, concording with the above comments that this is a much more difficult problem. In particular for finite $\delta$,
$$E_U\left(\langle X_{\delta,K}\rangle\right) = [Y_{G_{n+}(K)}] - \frac{1}{K+1}[Y_{\delta/(K+1)}] + 2n\kappa^2\left\{\left(1 - \exp(-\lambda\delta)\right) - \left(1 - \exp\left(-\frac{\lambda\delta}{K+1}\right)\right)\right\}$$
$$\simeq [Y_{G_{n+}(K)}] - \frac{1}{K+1}[Y_{\delta/(K+1)}] + 2t\lambda\kappa^2.$$
Thus, in this case, the two scale estimator does not really help. The problem here is that the bias correction is of the wrong form.

Figure 7 repeats Figure 6 but now plots the results for the two scale estimators for Vodafone on 2nd January 2004. Throughout, the subsampling was based on a time series of 2 second returns, setting $K = \delta/0.02$.
It is important to understand that the choice of K matters: the empirical properties of the two scale estimator may change quite significantly with this tuning parameter.¹⁵ We have little experience of how to select K well. The results in the top row, which are based on mid-quotes, show the two scale estimator does well for interpolated data and moderates the bias of the RV estimator for small δ. The two scale estimator is much more challenged (as all the estimators are!) by the transaction data, but it does have a positive impact in the case of the raw data. The signature plot for the two scale estimator for the Vodafone series is given in Figure 6 and produces qualitatively very similar results.

¹⁵ When δ is small we would like to use a lot of subsampling, as the theory of Zhang, Mykland, and Aït-Sahalia (2005) suggests, but we cannot take K to be very large as we run out of data; our dataset is recorded only up to one second. At the moment we cannot see how to overcome this problem. More empirical studies of the performance of the two scale estimator are given in Hansen and Lunde (2006).

[Figure 7 here: four panels (mid-quote returns, interpolated and raw; transaction returns, interpolated and raw), each plotting the RVol, kernel, 2 scale and alternation estimators against δ between 0.1 and 20 minutes.]

Figure 7: LSE's electronic order book on the 2nd working day in January 2004. Top 2 rows of graphs: subsampled realised daily QVol computed using 0.1, 1, 5 and 20 minute mid-point returns. X-axis is in minutes. First row corresponds to mid-points, the second row to transactions. Bottom 2 rows of graphs: 2 scale estimator of the QVol computed using 0.1, 1, 5 and 20 minute transaction returns.
5.5 Kernel

An alternative way of trying to remove the effect of frictions is by using kernels, discussed in this context by, for example, French, Schwert, and Stambaugh (1987) and Zhou (1996), and formalised by Hansen and Lunde (2006) and Barndorff-Nielsen, Hansen, Lunde, and Shephard (2004). We will study

$$RV_w(X_\delta) = w_0[X_\delta] + 2\sum_{i=1}^{q} w_i\hat\gamma_i(X_\delta) = RV_w(Y_\delta) + RV_w(U_\delta) + RV_w(Y_\delta, U_\delta) + RV_w(U_\delta, Y_\delta).$$

We already know from section 3.7 the properties of $RV_w(Y_\delta)$ when δ is small, while under the IND assumption

$$RV_w(Y_\delta) \perp\!\!\!\perp RV_w(Y_\delta, U_\delta),\ RV_w(U_\delta, Y_\delta), \qquad RV_w(U_\delta) \perp\!\!\!\perp RV_w(Y_\delta, U_\delta), \qquad RV_w(Y_\delta) \perp\!\!\!\perp RV_w(U_\delta).$$

The two cross terms have zero mean and so, writing $\rho^\circ_s = \mathrm{Cor}(U_t, U_{t-s})$,

$$E_U\left(RV_w(X_\delta)\right) = w_0[Y_\delta] + 2n\kappa^2\left\{w_0\left(1 - \rho^\circ_\delta\right) + \sum_{i=1}^{q} w_i\left(2\rho^\circ_{i\delta} - \rho^\circ_{(i-1)\delta} - \rho^\circ_{(i+1)\delta}\right)\right\}.$$

Under F1W Barndorff-Nielsen, Hansen, Lunde, and Shephard (2004, Theorem 2) note that

$$E_U\begin{pmatrix}[U_\delta]\\ 2\hat\gamma_1(U_\delta)\\ 2\hat\gamma_2(U_\delta)\\ \vdots\\ 2\hat\gamma_q(U_\delta)\end{pmatrix} = \begin{pmatrix}2n\kappa^2\\ -2n\kappa^2\\ 0\\ \vdots\\ 0\end{pmatrix}, \qquad \mathrm{Cov}_U\begin{pmatrix}[U_\delta]\\ 2\hat\gamma_1(U_\delta)\\ 2\hat\gamma_2(U_\delta)\\ \vdots\\ 2\hat\gamma_q(U_\delta)\end{pmatrix} = n\kappa^4\begin{pmatrix}12 & -16 & 4 & & \\ -16 & 28 & -16 & 4 & \\ 4 & -16 & 24 & -16 & \ddots\\ & 4 & -16 & 24 & \ddots\\ & & \ddots & \ddots & \ddots\end{pmatrix},$$

while $E_{U|Y}[U_\delta, Y_\delta] = E_{U|Y}\hat\gamma_1(U_\delta, Y_\delta) = 0$ and

$$\mathrm{Cov}_{U|Y}\begin{pmatrix}[U_\delta, Y_\delta]\\ \hat\gamma_1(U_\delta, Y_\delta)\\ \hat\gamma_1(Y_\delta, U_\delta)\\ \vdots\end{pmatrix} = \kappa^2\begin{pmatrix}2[Y_\delta] - 2\hat\gamma_1(Y_\delta) & & \\ 2\hat\gamma_1(Y_\delta) - \hat\gamma_2(Y_\delta) - [Y_\delta] & 2[Y_\delta] - 2\hat\gamma_1(Y_\delta) & \\ 2\hat\gamma_1(Y_\delta) - \hat\gamma_2(Y_\delta) - [Y_\delta] & 2\hat\gamma_2(Y_\delta) - \hat\gamma_1(Y_\delta) - \hat\gamma_3(Y_\delta) & 2[Y_\delta] - 2\hat\gamma_1(Y_\delta)\end{pmatrix},$$

which implies

$$E_U\left\{RV_w(X_\delta)\right\} = w_0[Y_\delta] + 2n\kappa^2(w_0 - w_1),$$

which is unbiased if and only if $w_0 = w_1 = 1$. The simplest unbiased estimator is thus

$$RV_{1,1}(X_\delta) = [X_\delta] + 2\hat\gamma_1(X_\delta). \qquad (40)$$

This was proposed by Zhou (1996) and studied in detail by Hansen and Lunde (2006). Table 5 summarises the properties of various estimators under F1 and IND using these results. As lags are added, the middle term in the variance falls considerably, while the first term increases. This means that when δ is small, $RV_{1,1,0.5}(X_\delta)$ could be more precise than $[X_\delta]$ in the presence of market friction.
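The flavour of Zhou's estimator (40) is easy to see in simulation. The following sketch is hypothetical (Y a scaled Brownian motion, U iid Gaussian noise, with purely illustrative values of sigma, kappa and n): the raw realised QV is swamped by the $2n\kappa^2$ bias, while adding twice the first realised autocovariance removes it.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical F1W setup: Y a Brownian motion with volatility sigma, U iid
# N(0, kappa^2), n observations of X = Y + U on [0, 1].
sigma, kappa, n = 0.2, 0.01, 1000
Y = np.cumsum(sigma * rng.standard_normal(n) / np.sqrt(n))
X = Y + kappa * rng.standard_normal(n)
r = np.diff(X)

rv = np.sum(r**2)                 # [X_delta]: biased upwards by 2*n*kappa^2
gamma1 = np.sum(r[1:] * r[:-1])   # first realised autocovariance, roughly -n*kappa^2
rv_11 = rv + 2 * gamma1           # estimator (40): unbiased for [Y] under F1W

print(rv, rv_11, sigma**2)
```

The bias correction works because, under iid noise, successive returns of X are negatively autocorrelated with $E\{\hat\gamma_1(X_\delta)\} \simeq -n\kappa^2$; the estimator remains inconsistent, though, as its variance contains the $8n\kappa^4$ term of Table 5.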
Estimator                                                                                  Bias            Variance
$[X_\delta]$                                                                               $2n\kappa^2$    $2\delta\int_0^t\sigma_u^4\,du + 12n\kappa^4 + 8\kappa^2\int_0^t\sigma_u^2\,du$
$RV_{1,1}(X_\delta) = [X_\delta] + 2\hat\gamma_1(X_\delta)$                                $0$             $6\delta\int_0^t\sigma_u^4\,du + 8n\kappa^4 + 8\kappa^2\int_0^t\sigma_u^2\,du$
$RV_{1,1,0.5}(X_\delta) = [X_\delta] + 2\hat\gamma_1(X_\delta) + \hat\gamma_2(X_\delta)$   $0$             $7\delta\int_0^t\sigma_u^4\,du + 2n\kappa^4 + 4\kappa^2\int_0^t\sigma_u^2\,du$

Table 5: Bias and variance of various kernel estimators under the F1 and IND assumptions.

Barndorff-Nielsen, Hansen, Lunde, and Shephard (2004, Theorem 2) show that by adding more lags and choosing the weights appropriately one can produce a consistent estimator, once small adjustments are made to deal with the intricate end effects we have ignored in this survey. An alternative would be to subsample (40), which transforms the unbiased statistic into a consistent one under the F1 assumption, although, as we saw in the previous subsection, such an argument fails under F2. Under F2 the kernel approach will continue to produce an unbiased estimator of [Y], but typically it will be inconsistent.

Consider the simple kernel weights $w_i = (q+1-i)/q$, where q is set to cover 5 minutes, whatever the value of δ. Then the bias is 0 under F1 and IND. When q = 1, $w_1 = 1$; if q = 2, then $w_1 = 1$, $w_2 = 1/2$; q = 3 implies $w_1 = 1$, $w_2 = 2/3$, $w_3 = 1/3$; when q = 4, then $w_1 = 1$, $w_2 = 3/4$, $w_3 = 2/4$, $w_4 = 1/4$. Figure 7 repeats Figure 3 but now plots the results for the kernel estimator. These seem pretty strong results, with the estimator being reasonably in line with the alternation estimator we will discuss in a moment. The results are fairly stable with respect to δ, although the problem of dealing with frictions is clearly harder in the case where we use transaction as opposed to mid-quote data. These results are confirmed by looking at the signature plots in Figure 6, which show the kernel still has bias in the transactions case, but the biases are rather modest. Barndorff-Nielsen, Hansen, Lunde, and Shephard (2004, Theorem 2) discuss more sophisticated choices of kernels.
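More generally, $RV_w$ with the simple weights $w_i = (q+1-i)/q$ above can be sketched as follows. The simulation setup is the same hypothetical F1W one as before, the function name rv_kernel and all parameter values are our own illustrative choices, and the end-effect adjustments mentioned above are deliberately ignored:

```python
import numpy as np

rng = np.random.default_rng(3)

# Same hypothetical F1W setup as before: Y a Brownian motion with
# volatility sigma, U iid N(0, kappa^2), n observations of X = Y + U.
sigma, kappa, n = 0.2, 0.01, 1000
Y = np.cumsum(sigma * rng.standard_normal(n) / np.sqrt(n))
X = Y + kappa * rng.standard_normal(n)
r = np.diff(X)

def rv_kernel(r, w):
    """RV_w = w_0*[X_delta] + 2 * sum_i w_i * gamma_hat_i(X_delta)."""
    out = w[0] * np.sum(r**2)
    for i in range(1, len(w)):
        out += 2 * w[i] * np.sum(r[i:] * r[:-i])
    return out

# Simple weights w_i = (q + 1 - i)/q with w_0 = 1: for q = 4 the lag
# weights are 1, 3/4, 2/4, 1/4, so w_0 = w_1 = 1 and the bias is 0
# under F1 and IND.
q = 4
w = [1.0] + [(q + 1 - i) / q for i in range(1, q + 1)]
rv_w = rv_kernel(r, w)

print(rv_w, sigma**2)
```

Since $w_0 = w_1 = 1$, the $2n\kappa^2(w_0 - w_1)$ bias term vanishes and the estimate is centred on $\sigma^2$, even though the raw realised QV would be far above it.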
5.6 F3 — pure point process

This assumption is attractive, as a great deal of literature moves econometricians towards modelling prices as point processes (e.g. Engle and Russell (1998), Engle (2000) and Bowsher (2003)). It appears in a number of guises. There is a literature on continuous sample path processes which are rounded to yield a point process; work on this includes Gottlieb and Kalay (1985), Jacod (1996) and Delattre and Jacod (1997). Another strand starts out with a model for a point process for X, introduced in this context by Oomens (2004) and Large (2005). Oomens (2004) has studied the sampling properties of realised QV estimators in the case where the data generating process is purely made up of jumps. Hasbrouck (1999) studies the problem where an underlying time series is rounded so that the prices live on discrete lattice points, whose width is a tick; his analysis is fully parametric. Closer to the content of this paper, Barndorff-Nielsen and Shephard (2005b) study the first two moments of realised QV when $Y_t = Z_{\tau_t}$, where τ is the integral of a non-negative covariance stationary process, that is a process with non-decreasing sample paths, and Z is a Lévy process. Throughout they assume that $Z \perp\!\!\!\perp \tau$. Such models have received some attention recently in mathematical finance due to the papers by, for example, Carr, Geman, Madan, and Yor (2003) and Carr and Wu (2004). Of course Lévy processes, outside the Brownian motion case, are pure jump processes and can be made to obey lattice structures if desired. Barndorff-Nielsen and Shephard (2005b) show that the realised QV estimator is an inconsistent estimator of τ, but are able to characterise the bias and variance, and so the realised QV estimator can be used to provide inference on underlying parameters if the time-change model is parametric.

A radical departure is provided by Large (2005), who introduced the alternation estimator. He looks at markets where prices almost always move by one tick.
An example of this is Vodafone in Figure 3. He uses solely one side of the quotes, say the best bid, and assumes that the pure jump process $X_t$ always jumps towards $Y_t$ when it moves. He then estimates $[Y]_t$ by

$$\hat L_t = [X]_t\,\frac{N_t - A_t}{A_t},$$

where $N_t$ is the number of price movements in the single side of the market up to time t and $A_t$ is the number of alternations, or immediate price reversals. We call this the alternation estimator. Under various assumptions he then shows that $\hat L_t \stackrel{p}{\rightarrow} [Y]_t$ as $N_t \to \infty$ and establishes a CLT for the estimator. This approach is an elegant combination of a market microstructure model with a BSM efficient price with which it is cointegrated.

For the Vodafone share price we computed the alternation estimator separately on the bid and ask sides of the market and averaged the two values to estimate [Y]. Figure 6 shows its value averaged over the 20 days in January 2004. It is in line with the estimated unconditional standard deviation of the open to close daily returns. Figure 7 shows the corresponding result for the 2nd of January 2004. Again it suggests the alternation estimator produces plausible values in practice. Of course the Vodafone example is a nice case for the alternation estimator, for it has a large tick size as a percentage of price and almost all of its moves are by one tick. It is potentially more challenged by stocks with more frequently updated quotes.

5.7 F4 — scrambled multivariate process

There is very little work on the effect of market frictions in the multivariate case. At a trivial level, all the univariate results apply to the multivariate case through the use of the polarisation identity

$$[Y^l, Y^m]_t = \frac{1}{4}\left\{[Y^l + Y^m]_t - [Y^l - Y^m]_t\right\},$$

and so friction-robust estimators can be applied to the two components. Although this approach has significant merits, and is used in the paper by Bandi and Russell (2005) to select δ for daily realised QV calculations, it misses out on adequately dealing with Epps type effects.
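As a minimal illustration of the polarisation route, the following sketch (a hypothetical, friction-free simulation of two correlated scaled Brownian motions, with illustrative parameter values) recovers the covariation from two univariate realised QV calculations:

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical setup: two correlated scaled Brownian motions observed
# synchronously n times on [0, 1]; the true covariation is rho*s_l*s_m.
n, rho, s_l, s_m = 10_000, 0.5, 0.2, 0.3
z1 = rng.standard_normal(n)
z2 = rho * z1 + np.sqrt(1 - rho**2) * rng.standard_normal(n)
Yl = np.cumsum(s_l * z1 / np.sqrt(n))
Ym = np.cumsum(s_m * z2 / np.sqrt(n))

def realised_qv(p):
    """Realised QV of a price path: the sum of its squared returns."""
    return np.sum(np.diff(p) ** 2)

# Polarisation: a univariate QV estimator applied to the sum and the
# difference of the two log-prices delivers their covariation.
cov_hat = (realised_qv(Yl + Ym) - realised_qv(Yl - Ym)) / 4

print(cov_hat, rho * s_l * s_m)
```

In practice realised_qv would be replaced by a friction-robust univariate estimator such as those of the previous subsections; the point of the sketch is only the algebraic identity, which does nothing about non-synchronous observation and hence about Epps type effects.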
In the multivariate case there are potentially two new problems: (i) non-synchronous trading or quote updating, and (ii) delays caused by reaction times, as information is absorbed into markets differentially quickly. The first of these has been studied for quite a long time in empirical finance by, for example, Scholes and Williams (1977) and Lo and MacKinlay (1990). Martens (2003) provides a review of some of this work and more modern papers, as well as making contributions of his own. If $Y \in BSM$ and the times of observations, $t_j$, are independent of Y, then this is a precisely specified statistical problem and a number of solutions are available. We discussed the Fourier method in section 3.8.3, but there is also the work of Hayashi and Yoshida (2005), which characterises the bias caused by random sampling and suggests methods for trying to overcome it. Hayashi and Yoshida (2005, Definition 3.1) introduced the estimator

$$\left\{Y^l, Y^m\right\}_t = \sum_{i=1}^{t_i \le t}\sum_{j=1}^{t_j \le t}\left(Y^l_{t_i} - Y^l_{t_{i-1}}\right)\left(Y^m_{t_j} - Y^m_{t_{j-1}}\right)I\left\{(t_{i-1}, t_i) \cap (t_{j-1}, t_j) \ne \emptyset\right\}. \qquad (41)$$

This multiplies returns together whenever the time intervals of the two returns overlap at all. It artificially includes products of returns which are approximately uncorrelated (inflating the variance of the estimator), but it excludes no terms and so misses none of the contributions to the quadratic covariation. They show under various assumptions that, as the times of observations become denser over the interval from time 0 to time t, this estimator converges to the desired quadratic covariation. Table 6 illustrates the effect of using estimator (41), estimating the average realised correlation during January 2004 for the London Stock Exchange data discussed earlier. These results are comparable with Figure 4 and show that the estimator goes a modest way towards tackling this issue. This suggests that other types of frictions are additionally important.
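The Hayashi–Yoshida estimator (41) can be sketched directly from its definition. The simulation below is hypothetical (two correlated scaled Brownian motions observed at independent random times; all parameter values illustrative), and the double loop is the literal, inefficient form of (41) rather than the linear-time implementation one would use in practice:

```python
import numpy as np

rng = np.random.default_rng(5)

# Hypothetical setup: two correlated Brownian prices on a fine grid of M
# points over [0, 1], each observed at its own random subset of times;
# the true covariation is rho*s_l*s_m.
M, rho, s_l, s_m = 100_000, 0.5, 0.2, 0.3
z1 = rng.standard_normal(M)
z2 = rho * z1 + np.sqrt(1 - rho**2) * rng.standard_normal(M)
Yl = np.cumsum(s_l * z1 / np.sqrt(M))
Ym = np.cumsum(s_m * z2 / np.sqrt(M))

# Independent, non-synchronous observation times for the two assets.
tl = np.sort(rng.choice(M, size=500, replace=False))
tm = np.sort(rng.choice(M, size=300, replace=False))

def hayashi_yoshida(t1, p1, t2, p2):
    """Sum products of returns over every pair of overlapping intervals."""
    out = 0.0
    for i in range(1, len(t1)):
        for j in range(1, len(t2)):
            # do the return intervals (t1[i-1], t1[i]) and (t2[j-1], t2[j]) overlap?
            if t1[i - 1] < t2[j] and t2[j - 1] < t1[i]:
                out += (p1[i] - p1[i - 1]) * (p2[j] - p2[j - 1])
    return out

hy = hayashi_yoshida(tl, Yl[tl], tm, Ym[tm])
print(hy, rho * s_l * s_m)
```

Because no overlapping pair is excluded, no contribution to the quadratic covariation is lost, which is why, unlike naive fixed-interval realised covariance, the estimate is centred on the true covariation despite the non-synchronous sampling.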
              Vodafone     BP      AstraZeneca    HSBC
Vodafone         -
BP             .0681        -
AstraZeneca    .0456      .0430         -
HSBC           .0776      .0602       .0495         -

Table 6: The average of the daily Hayashi–Yoshida estimator of the correlation amongst the returns on these asset prices. These statistics are based on mid-quotes.

6 Conclusions

This paper has reviewed the literature on the measurement and forecasting of uncertainty through quadratic variation type objects. The econometrics of this has focused on realised objects, estimating QV and its components. Such an approach has been shown to provide a leap forward in our understanding of time varying volatility and jumps, which are crucial in asset allocation, derivative pricing and risk assessment. A drawback of these types of methods is the potential for market frictions to complicate the analysis. Recent research has been trying to address this issue and has introduced various innovative methods. There is still much work to be carried through in that area.

7 Software

The calculations made for this paper were carried out using PcGive of Doornik and Hendry (2005) and software written by the authors using the Ox language of Doornik (2001).

References

Aït-Sahalia, Y. (2004). Disentangling diffusion from jumps. Journal of Financial Economics 74, 487–528.
Aït-Sahalia, Y. and J. Jacod (2005a). Fisher's information for discretely sampled Lévy processes. Unpublished paper: Department of Economics, Princeton University.
Aït-Sahalia, Y. and J. Jacod (2005b). Volatility estimators for discretely sampled Lévy processes. Unpublished paper: Department of Economics, Princeton University.
Aït-Sahalia, Y., P. Mykland, and L. Zhang (2005a). How often to sample a continuous-time process in the presence of market microstructure noise. Review of Financial Studies 18, 351–416.
Aït-Sahalia, Y., P. Mykland, and L. Zhang (2005b). Ultra high frequency volatility estimation with dependent microstructure noise. Unpublished paper: Department of Economics, Princeton University.
Alizadeh, S., M.
Brandt, and F. Diebold (2002). Range-based estimation of stochastic volatility models. Journal of Finance 57, 1047–1091.
Andersen, T. G. and T. Bollerslev (1997). Intraday periodicity and volatility persistence in financial markets. Journal of Empirical Finance 4, 115–158.
Andersen, T. G. and T. Bollerslev (1998a). Answering the skeptics: yes, standard volatility models do provide accurate forecasts. International Economic Review 39, 885–905.
Andersen, T. G. and T. Bollerslev (1998b). Deutsche mark-dollar volatility: intraday activity patterns, macroeconomic announcements, and longer run dependencies. Journal of Finance 53, 219–265.
Andersen, T. G., T. Bollerslev, and F. X. Diebold (2003). Some like it smooth, and some like it rough: untangling continuous and jump components in measuring, modeling and forecasting asset return volatility. Unpublished paper: Economics Dept, Duke University.
Andersen, T. G., T. Bollerslev, F. X. Diebold, and H. Ebens (2001). The distribution of realized stock return volatility. Journal of Financial Economics 61, 43–76.
Andersen, T. G., T. Bollerslev, F. X. Diebold, and P. Labys (2000). Great realizations. Risk 13, 105–108.
Andersen, T. G., T. Bollerslev, F. X. Diebold, and P. Labys (2001). The distribution of exchange rate volatility. Journal of the American Statistical Association 96, 42–55. Correction published in 2003, volume 98, page 501.
Andersen, T. G., T. Bollerslev, F. X. Diebold, and P. Labys (2003). Modeling and forecasting realized volatility. Econometrica 71, 579–625.
Andersen, T. G., T. Bollerslev, F. X. Diebold, and C. Vega (2003). Micro effects of macro announcements: real-time price discovery in foreign exchange. American Economic Review 93, 38–62.
Andersen, T. G., T. Bollerslev, and N. Meddahi (2004). Analytic evaluation of volatility forecasts. International Economic Review 45, 1079–1110.
Andersen, T. G., T. Bollerslev, and N. Meddahi (2005). Correcting the errors: a note on volatility forecast evaluation based on high-frequency data and realized volatilities. Econometrica 73, 279–296.
Andersen, T. G. and B. Sørensen (1996). GMM estimation of a stochastic volatility model: a Monte Carlo study. Journal of Business and Economic Statistics 14, 328–352.
Back, K. (1991). Asset pricing for general processes. Journal of Mathematical Economics 20, 371–395.
Bandi, F. M. and J. R. Russell (2003). Microstructure noise, realized volatility, and optimal sampling. Unpublished paper: Graduate School of Business, University of Chicago.
Bandi, F. M. and J. R. Russell (2005). Realized covariation, realized beta and microstructure noise. Unpublished paper: Graduate School of Business, University of Chicago.
Barndorff-Nielsen, O. E., S. E. Graversen, J. Jacod, M. Podolskij, and N. Shephard (2005). A central limit theorem for realised power and bipower variations of continuous semimartingales. In Y. Kabanov and R. Lipster (Eds.), From Stochastic Analysis to Mathematical Finance, Festschrift for Albert Shiryaev. Springer. Forthcoming.
Barndorff-Nielsen, O. E., S. E. Graversen, J. Jacod, and N. Shephard (2005). Limit theorems for realised bipower variation in econometrics. Unpublished paper: Nuffield College, Oxford.
Barndorff-Nielsen, O. E., P. R. Hansen, A. Lunde, and N. Shephard (2004). Regular and modified kernel-based estimators of integrated variance: the case with independent noise. Unpublished paper: Nuffield College, Oxford.
Barndorff-Nielsen, O. E. and N. Shephard (2001). Non-Gaussian Ornstein–Uhlenbeck-based models and some of their uses in financial economics (with discussion). Journal of the Royal Statistical Society, Series B 63, 167–241.
Barndorff-Nielsen, O. E. and N. Shephard (2002). Econometric analysis of realised volatility and its use in estimating stochastic volatility models. Journal of the Royal Statistical Society, Series B 64, 253–280.
Barndorff-Nielsen, O. E. and N. Shephard (2003). Realised power variation and stochastic volatility. Bernoulli 9, 243–265. Correction published in pages 1109–1111.
Barndorff-Nielsen, O. E. and N. Shephard (2004). Econometric analysis of realised covariation: high frequency covariance, regression and correlation in financial economics. Econometrica 72, 885–925.
Barndorff-Nielsen, O. E. and N. Shephard (2005a). How accurate is the asymptotic approximation to the distribution of realised volatility? In D. W. K. Andrews and J. H. Stock (Eds.), Identification and Inference for Econometric Models. A Festschrift in Honour of T.J. Rothenberg, pp. 306–331. Cambridge: Cambridge University Press.
Barndorff-Nielsen, O. E. and N. Shephard (2005b). Impact of jumps on returns and realised variances: econometric analysis of time-deformed Lévy processes. Journal of Econometrics. Forthcoming.
Barndorff-Nielsen, O. E. and N. Shephard (2005c). Power variation and time change. Theory of Probability and Its Applications. Forthcoming.
Barndorff-Nielsen, O. E. and N. Shephard (2006). Econometrics of testing for jumps in financial economics using bipower variation. Journal of Financial Econometrics 4. Forthcoming.
Barndorff-Nielsen, O. E., N. Shephard, and M. Winkel (2004). Limit theorems for multipower variation in the presence of jumps in financial econometrics. Unpublished paper: Nuffield College, Oxford.
Bartlett, M. S. (1946). On the theoretical specification of sampling properties of autocorrelated time series. Journal of the Royal Statistical Society, Supplement 8, 27–41.
Bartlett, M. S. (1950). Periodogram analysis and continuous spectra. Biometrika 37, 1–16.
Barucci, E. and R. Reno (2002a). On measuring volatility and the GARCH forecasting performance. Journal of International Financial Markets, Institutions and Money 12, 182–200.
Barucci, E. and R. Reno (2002b). On measuring volatility of diffusion processes with high frequency data. Economic Letters 74, 371–378.
Bates, D. S. (2003).
Empirical option pricing: a retrospective. Journal of Econometrics 116, 387–404.
Billingsley, P. (1995). Probability and Measure (3 ed.). New York: Wiley.
Bollerslev, T., M. Gibson, and H. Zhou (2005). Dynamic estimation of volatility risk premia and investor risk aversion from option-implied and realized volatilities. Unpublished paper: Economics Department, Duke University.
Bollerslev, T., U. Kretschmer, C. Pigorsch, and G. Tauchen (2005). The dynamics of bipower variation, realized volatility and returns. Unpublished paper: Department of Economics, Duke University.
Bollerslev, T. and H. Zhou (2002). Estimating stochastic volatility diffusion using conditional moments of integrated volatility. Journal of Econometrics 109, 33–65.
Bowsher, C. G. (2003). Modelling security market events in continuous time: intensity based, multivariate point process models. Unpublished paper: Nuffield College, Oxford.
Brandt, M. W. and F. X. Diebold (2004). A no-arbitrage approach to range-based estimation of return covariances and correlations. Journal of Business. Forthcoming.
Branger, N. and C. Schlag (2005). An economic motivation for variance contracts. Unpublished paper: Faculty of Economics and Business Administration, Goethe University.
Brockhaus, O. and D. Long (1999). Volatility swaps made simple. Risk 2, 92–95.
Calvet, L. and A. Fisher (2002). Multifractality in asset returns: theory and evidence. Review of Economics and Statistics 84, 381–406.
Carr, P., H. Geman, D. B. Madan, and M. Yor (2003). Stochastic volatility for Lévy processes. Mathematical Finance 13, 345–382.
Carr, P., H. Geman, D. B. Madan, and M. Yor (2005). Pricing options on realized variance. Finance and Stochastics. Forthcoming.
Carr, P. and R. Lee (2003a). Robust replication of volatility derivatives. Unpublished paper: Courant Institute, NYU.
Carr, P. and R. Lee (2003b). Trading autocorrelation. Unpublished paper: Courant Institute, NYU.
Carr, P. and K. Lewis (2004). Corridor variance swaps. Risk, 67–72.
Carr, P. and D. B. Madan (1998). Towards a theory of volatility trading. In R. Jarrow (Ed.), Volatility, pp. 417–427. Risk Publications.
Carr, P. and L. Wu (2004). Time-changed Lévy processes and option pricing. Journal of Financial Economics, 113–141.
Chernov, M., A. R. Gallant, E. Ghysels, and G. Tauchen (2003). Alternative models of stock price dynamics. Journal of Econometrics 116, 225–257.
Christensen, K. and M. Podolskij (2005). Asymptotic theory for range-based estimation of integrated volatility of a continuous semi-martingale. Unpublished paper: Aarhus School of Business.
Comte, F. and E. Renault (1998). Long memory in continuous-time stochastic volatility models. Mathematical Finance 8, 291–323.
Corradi, V. and W. Distaso (2004). Specification tests for daily integrated volatility, in the presence of possible jumps. Unpublished paper: Queen Mary College, London.
Corsi, F. (2003). A simple long memory model of realized volatility. Unpublished paper: University of Southern Switzerland.
Curci, G. and F. Corsi (2003). A discrete sine transformation approach to realized volatility measurement. Unpublished paper.
Dacorogna, M. M., R. Gencay, U. A. Müller, R. B. Olsen, and O. V. Pictet (2001). An Introduction to High-Frequency Finance. San Diego: Academic Press.
Delattre, S. and J. Jacod (1997). A central limit theorem for normalized functions of the increments of a diffusion process in the presence of round off errors. Bernoulli 3, 1–28.
Demeterfi, K., E. Derman, M. Kamal, and J. Zou (1999). A guide to volatility and variance swaps. Journal of Derivatives 6, 9–32.
Doob, J. L. (1953). Stochastic Processes. New York: John Wiley and Sons.
Doornik, J. A. (2001). Ox: Object Oriented Matrix Programming, 3.0. London: Timberlake Consultants Press.
Doornik, J. A. and D. F. Hendry (2005). PcGive, Version 10.4. London: Timberlake Consultants Press.
Eicker, F. (1967). Limit theorems for regressions with unequal and dependent errors. In Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, Volume 1, pp. 59–82. Berkeley: University of California.
Engle, R. F. (2000). The econometrics of ultra-high frequency data. Econometrica 68, 1–22.
Engle, R. F. and J. P. Gallo (2005). A multiple indicator model for volatility using intra daily data. Journal of Econometrics. Forthcoming.
Engle, R. F. and J. R. Russell (1998). Forecasting transaction rates: the autoregressive conditional duration model. Econometrica 66, 1127–1162.
Engle, R. F. and J. R. Russell (2005). Analysis of high frequency data. In Y. Ait-Sahalia and L. P. Hansen (Eds.), Handbook of Financial Econometrics. Amsterdam: North Holland. Forthcoming.
Epps, T. W. (1979). Comovements in stock prices in the very short run. Journal of the American Statistical Association 74, 291–296.
Fang (1996). Volatility modeling and estimation of high-frequency data with Gaussian noise. Unpublished Ph.D. thesis, Sloan School of Management, MIT.
Forsberg, L. and E. Ghysels (2004). Why do absolute returns predict volatility so well. Unpublished paper: Economics Department, UNC, Chapel Hill.
Foster, D. P. and D. B. Nelson (1996). Continuous record asymptotics for rolling sample variance estimators. Econometrica 64, 139–174.
French, K. R., G. W. Schwert, and R. F. Stambaugh (1987). Expected stock returns and volatility. Journal of Financial Economics 19, 3–29.
Garcia, R., E. Ghysels, and E. Renault (2005). The econometrics of option pricing. In Y. Ait-Sahalia and L. P. Hansen (Eds.), Handbook of Financial Econometrics. North Holland. Forthcoming.
Genon-Catalot, V., C. Larédo, and D. Picard (1992). Non-parametric estimation of the diffusion coefficient by wavelet methods. Scandinavian Journal of Statistics 19, 317–335.
Ghysels, E., A. C. Harvey, and E. Renault (1996). Stochastic volatility. In C. R. Rao and G. S. Maddala (Eds.), Statistical Methods in Finance, pp. 119–191. Amsterdam: North-Holland.
Ghysels, E., P.
Santa-Clara, and R. Valkanov (2004). Predicting volatility: getting the most out of return data sampled at different frequencies. Journal of Econometrics. Forthcoming.
Gloter, A. and J. Jacod (2001a). Diffusions with measurement errors. I — local asymptotic normality. ESAIM: Probability and Statistics 5, 225–242.
Gloter, A. and J. Jacod (2001b). Diffusions with measurement errors. II — measurement errors. ESAIM: Probability and Statistics 5, 243–260.
Goncalves, S. and N. Meddahi (2004). Bootstrapping realized volatility. Unpublished paper: CIRANO, Montreal.
Gottlieb, G. and A. Kalay (1985). Implications of the discreteness of observed stock prices. Journal of Finance 40, 135–153.
Hansen, P. R. and A. Lunde (2005a). Consistent ranking of volatility models. Journal of Econometrics. Forthcoming.
Hansen, P. R. and A. Lunde (2005b). A forecast comparison of volatility models: does anything beat a GARCH(1,1)? Journal of Applied Econometrics. Forthcoming.
Hansen, P. R. and A. Lunde (2005c). A realized variance for the whole day based on intermittent high-frequency data.
Hansen, P. R. and A. Lunde (2006). Realized variance and market microstructure noise (with discussion). Journal of Business and Economic Statistics 24. Forthcoming.
Hasbrouck, J. (1999). The dynamics of discrete bid and ask quotes. Journal of Finance 54, 2109–2142.
Hasbrouck, J. (2003). Empirical Market Microstructure: Economic and Statistical Perspectives on the Dynamics of Trade in Security Markets. Unpublished manuscript, Department of Finance, Stern Business School, NYU.
Hausman, J. A. (1978). Specification tests in econometrics. Econometrica 46, 1251–1271.
Hayashi, T. and N. Yoshida (2005). On covariance estimation of non-synchronously observed diffusion processes. Bernoulli 11, 359–379.
Heston, S. L. (1993). A closed-form solution for options with stochastic volatility, with applications to bond and currency options. Review of Financial Studies 6, 327–343.
Hog, E. and A. Lunde (2003). Wavelet estimation of integrated volatility. Unpublished paper: Aarhus School of Business.
Howison, S. D., A. Rafailidis, and H. O. Rasmussen (2004). On the pricing and hedging of volatility derivatives. Applied Mathematical Finance 11, 317–346.
Huang, X. and G. Tauchen (2005). The relative contribution of jumps to total price variation. Journal of Financial Econometrics 3. Forthcoming.
Jacod, J. (1994). Limit of random measures associated with the increments of a Brownian semimartingale. Preprint number 120, Laboratoire de Probabilités, Université Pierre et Marie Curie, Paris.
Jacod, J. (1996). La variation quadratique du brownien en présence d'erreurs d'arrondi. Astérisque 236, 155–162.
Jacod, J. and P. Protter (1998). Asymptotic error distributions for the Euler method for stochastic differential equations. Annals of Probability 26, 267–307.
Jacod, J. and A. N. Shiryaev (2003). Limit Theorems for Stochastic Processes (2 ed.). Berlin: Springer-Verlag.
Javaheri, A., P. Wilmott, and E. Haug (2002). GARCH and volatility swaps. Unpublished paper: available at www.wilmott.com.
Kanatani, T. (2004a). High frequency data and realized volatility. Ph.D. thesis, Graduate School of Economics, Kyoto University.
Kanatani, T. (2004b). Integrated volatility measuring from unevenly sampled observations. Economics Bulletin 3, 1–8.
Karatzas, I. and S. E. Shreve (1991). Brownian Motion and Stochastic Calculus (2 ed.), Volume 113 of Graduate Texts in Mathematics. Berlin: Springer-Verlag.
Karatzas, I. and S. E. Shreve (1998). Methods of Mathematical Finance. New York: Springer-Verlag.
Kim, S., N. Shephard, and S. Chib (1998). Stochastic volatility: likelihood inference and comparison with ARCH models. Review of Economic Studies 65, 361–393.
Koopman, S. J., B. Jungbacker, and E. Hol (2005). Forecasting daily variability of the S&P 100 stock index using historical, realised and implied volatility measurements. Journal of Empirical Finance 12, 445–475.
Large, J. (2005). Estimating quadratic variation when quoted prices jump by a constant increment. Unpublished paper: Nuffield College, Oxford.
Lévy, P. (1937). Théorie de l'Addition des Variables Aléatoires. Paris: Gauthier-Villars.
Lo, A. and C. MacKinlay (1990). The econometric analysis of nonsynchronous trading. Journal of Econometrics 45, 181–211.
Maheswaran, S. and C. A. Sims (1993). Empirical implications of arbitrage-free asset markets. In P. C. B. Phillips (Ed.), Models, Methods and Applications of Econometrics, pp. 301–316. Basil Blackwell.
Malliavin, P. and M. E. Mancino (2002). Fourier series method for measurement of multivariate volatilities. Finance and Stochastics 6, 49–61.
Mancini, C. (2001). Does our favourite index jump or not? Dipartimento di Matematica per le Decisioni, Università di Firenze.
Mancini, C. (2003). Statistics of a Poisson-Gaussian process. Dipartimento di Matematica per le Decisioni, Università di Firenze.
Mancini, C. (2004). Estimation of the characteristics of the jumps of a general Poisson-diffusion process. Scandinavian Actuarial Journal 1, 42–52.
Mancino, M. and R. Reno (2005). Dynamic principal component analysis of multivariate volatilities via Fourier analysis. Applied Mathematical Finance. Forthcoming.
Martens, M. (2003). Estimating unbiased and precise realized covariances. Unpublished paper.
Martens, M. and D. van Dijk (2005). Measuring volatility with the realized range. Unpublished paper: Econometric Institute, Erasmus University, Rotterdam.
Meddahi, N. (2002). A theoretical comparison between integrated and realized volatilities. Journal of Applied Econometrics 17, 479–508.
Merton, R. C. (1980). On estimating the expected return on the market: an exploratory investigation. Journal of Financial Economics 8, 323–361.
Müller, U. A. (1993). Statistics of variables observed over overlapping intervals. Unpublished paper, Olsen and Associates, Zurich.
Munroe, M. E. (1953). Introduction to Measure and Integration.
Cambridge, MA: Addison-Wesley Publishing Company, Inc.
Mykland, P. and L. Zhang (2005). ANOVA for diffusions. Annals of Statistics 33. Forthcoming.
Mykland, P. A. and L. Zhang (2002). Inference for volatility-type objects and implications for hedging. Unpublished paper: Department of Statistics, University of Chicago.
Neuberger, A. (1990). Volatility trading. Unpublished paper: London Business School.
Nielsen, M. O. and P. H. Frederiksen (2005). Finite sample accuracy of integrated volatility estimators. Unpublished paper: Department of Economics, Cornell University.
Officer, R. R. (1973). The variability of the market factor of the New York stock exchange. Journal of Business 46, 434–453.
O'Hara, M. (1995). Market Microstructure Theory. Oxford: Blackwell Publishers.
Oomens, R. A. A. (2004). Properties of bias corrected realized variance in calendar time and business time. Unpublished paper: Warwick Business School.
Parkinson, M. (1980). The extreme value method for estimating the variance of the rate of return. Journal of Business 53, 61–66.
Patton, A. (2005). Volatility forecast evaluation and comparison using imperfect volatility proxies. Unpublished paper: Department of Accounting and Finance, LSE.
Phillips, P. C. B. and J. Yu (2005). A two-stage realized volatility approach to the estimation for diffusion processes from discrete observations. Unpublished paper: Cowles Foundation for Research in Economics, Yale University.
Politis, D., J. P. Romano, and M. Wolf (1999). Subsampling. New York: Springer.
Precup, O. V. and G. Iori (2005). Cross-correlation measures in high-frequency domain. Unpublished paper: Department of Economics, City University.
Protter, P. (2004). Stochastic Integration and Differential Equations. New York: Springer-Verlag.
Reno, R. (2003). A closer look at the Epps effect. International Journal of Theoretical and Applied Finance 6, 87–102.
Rogers, L. C. G. and S. E. Satchell (1991). Estimating variance from high, low, open and close prices. Annals of Applied Probability 1, 504–512.
Rosenberg, B. (1972). The behaviour of random variables with nonstationary variance and the distribution of security prices. Working paper 11, Graduate School of Business Administration, University of California, Berkeley. Reprinted in Shephard (2005).
Rydberg, T. H. and N. Shephard (2000). A modelling framework for the prices and times of trades made on the NYSE. In W. J. Fitzgerald, R. L. Smith, A. T. Walden, and P. C. Young (Eds.), Nonlinear and Nonstationary Signal Processing, pp. 217–246. Cambridge: Isaac Newton Institute and Cambridge University Press.
Scholes, M. and J. Williams (1977). Estimating betas from nonsynchronous trading. Journal of Financial Economics 5, 309–327.
Schwert, G. W. (1989). Why does stock market volatility change over time? Journal of Finance 44, 1115–1153.
Schwert, G. W. (1990). Indexes of U.S. stock prices from 1802 to 1987. Journal of Business 63, 399–426.
Schwert, G. W. (1998). Stock market volatility: ten years after the crash. Brookings-Wharton Papers on Financial Services 1, 65–114.
Shephard, N. (2005). Stochastic Volatility: Selected Readings. Oxford: Oxford University Press.
Sheppard, K. (2005). Measuring realized covariance. Unpublished paper: Department of Economics, University of Oxford.
Shiryaev, A. N. (1999). Essentials of Stochastic Finance: Facts, Models and Theory. Singapore: World Scientific.
Wasserfallen, W. and H. Zimmermann (1985). The behaviour of intraday exchange rates. Journal of Banking and Finance 9, 55–72.
Wiener, N. (1924). The quadratic variation of a function and its Fourier coefficients. Journal of Mathematical Physics 3, 72–94.
Woerner, J. (2004). Power and multipower variation: inference for high frequency data. Unpublished paper.
Zhang, L. (2004). Efficient estimation of stochastic volatility using noisy observations: a multi-scale approach. Unpublished paper: Department of Statistics, Carnegie Mellon University.
Zhang, L., P. Mykland, and Y. Aït-Sahalia (2005). A tale of two time scales: determining integrated volatility with noisy high-frequency data. Journal of the American Statistical Association. Forthcoming.
Zhou, B. (1996). High-frequency data and volatility in foreign-exchange rates. Journal of Business and Economic Statistics 14, 45–52.