Dynamic Negative Binomial Difference Model for High Frequency Returns ∗ PRELIMINARY AND INCOMPLETE István Barra (a,b) and Siem Jan Koopman (a,c) (a) VU University Amsterdam and Tinbergen Institute (b) Duisenberg School of Finance (c) CREATES, Aarhus University January 19, 2015

Abstract

We introduce the dynamic ∆NB model for financial tick by tick data. Our model explicitly takes into account the discreteness of the observed prices, the fat tails of tick returns and the intraday patterns of volatility. We propose a Markov chain Monte Carlo estimation method which takes advantage of an auxiliary mixture representation of the ∆NB distribution. We illustrate our methodology using tick by tick data of several stocks from the NYSE in different periods. Using predictive likelihoods we find evidence in favour of the dynamic ∆NB model.

Keywords: high-frequency econometrics, Bayesian inference, Markov chain Monte Carlo, discrete distributions

1 Introduction

Stock prices are not continuous variables: they are multiples of the so-called tick size, which is the smallest possible price difference. For example, on US exchanges the tick size is set to no smaller than $0.01 for stocks with a price greater than $1 by the Securities and Exchange Commission in Rule 612 of Regulation National Market System. This has a serious impact on the distribution of trade by trade log returns, resulting in multimodality and discontinuity of the distribution. Alzaid and Omair (2010) and Barndorff-Nielsen et al. (2012) suggest modelling tick returns (price differences expressed in number of ticks) with integer valued distributions, as the distribution of these tick returns is more tractable. In this paper we propose a model in which tick returns have a ∆Negative Binomial (∆NB) distribution conditional on a Gaussian latent state. Our model provides a flexible framework to fit Author information: István Barra, Email: [email protected]. Address: VU University Amsterdam, De Boelelaan 1105, 1081 HV Amsterdam, The Netherlands.
István Barra thanks the Dutch National Science Foundation (NWO) for financial support. ∗ the empirically observed fat tails of the tick returns and other stylized facts of trade by trade returns; hence it is an attractive alternative to previously suggested models in the literature (see Koopman et al. (2014)). Moreover, our structural model allows us to decompose the log volatility into a periodic and a transient volatility component. We propose a Bayesian estimation procedure using Gibbs sampling. Our procedure is based on data augmentation and auxiliary mixtures, and it extends the auxiliary mixture sampling proposed by Frühwirth-Schnatter and Wagner (2006) and Frühwirth-Schnatter et al. (2009). In the empirical part of our paper we illustrate our methodology on six stocks from the NYSE in a volatile week of October 2008 and a calmer week from April 2010. We compare the in-sample and out-of-sample fit of the dynamic Skellam and dynamic ∆NB models using Bayesian information criteria and predictive likelihoods. We find that the ∆NB model is favoured for stocks with low relative tick size and in more volatile periods. Our paper is related to several strands of literature. Modelling discrete price changes with Skellam and ∆NB distributions was introduced by Alzaid and Omair (2010) and Barndorff-Nielsen et al. (2012), while the dynamic Skellam model was introduced by Koopman et al. (2014). Our paper is related to stochastic volatility models, see for example Chib et al. (2002), Kim et al. (1998), Omori et al. (2007) and recently Stroud and Johannes (2014). We extend the literature on trade by trade returns by explicitly accounting for price discreteness and the fat tails of the tick return distribution (see Engle (2000), Czado and Haug (2010), Dahlhaus and Neddermeyer (2014) and Rydberg and Shephard (2003)). The rest of the paper is organized as follows. In Section 2 we discuss the issues with trade-by-trade log returns and we describe the Skellam and ∆NB distributions.
Section 3 introduces the dynamic ∆NB model while Section 4 explains our Bayesian estimation procedure. In Section 5 we describe our dataset and cleaning procedure and Section 6 presents the empirical findings.

2 Tick returns and integer valued distributions

Stock prices can only be quoted as a multiple of the tick size. As a consequence prices are defined on a discrete grid, where the grid points are a tick size distance away from each other. We can write the price at time tj as

$$p(t_j) = n(t_j)\, g \quad (1)$$

where g is the tick size, which can be a function of the price on some exchanges, and n(tj) is a natural number denoting the location of the price on the grid. Modelling trade by trade returns can pose difficulties, as the effect of price discreteness at a few seconds frequency is pronounced compared to lower frequencies such as one hour or one day. As described in Münnix et al. (2010), the problem is that the return distribution is a mixture of return distributions ri, which correspond to fixed price changes ig,

$$r_i = \frac{p(t_j) - p(t_{j-1})}{p(t_{j-1})} \quad (2)$$

where

$$p(t_j) - p(t_{j-1}) = \big(n(t_j) - n(t_{j-1})\big)\, g = i(t_j)\, g = ig$$

and i(tj) is an integer which expresses the price change in terms of ticks. The ri distributions are concentrated on the intervals (for positive i)

$$\left[\frac{ig}{\max p_i},\; \frac{ig}{\min p_i}\right], \qquad p = p(t_{j-1}). \quad (3)\text{-}(4)$$

These intervals and their centers ci can be approximated by

$$\left[\frac{ig}{\bar p},\; \frac{ig}{\underline p}\right] \quad (5) \qquad \text{and} \qquad c_i \approx \frac{ig}{2}\left(\frac{1}{\underline p} + \frac{1}{\bar p}\right) \quad (6)$$

as max pi ≈ p̄ and min pi ≈ p̲ for i close to 0. First, note that the intervals corresponding to a zero price change and to one tick changes are always non-overlapping. Secondly, the centers of the intervals are approximately equally spaced; however, the intervals for changes of higher absolute value are wider, which means that the intervals are getting more and more overlapping as |i| is increasing.
Thirdly, the intervals are less overlapping when the price is lower, the volatility is higher or the tick size is bigger. Figure 1 shows the empirical trade by trade return distribution of several stocks from the New York Stock Exchange (NYSE).

[ insert Figure 1 here ]

Modelling this special feature of the trade by trade return distribution is difficult and often neglected in the literature (see e.g., Czado and Haug (2010) and Banulescu et al. (2013)). We use an alternative modelling framework. Following Alzaid and Omair (2010), Barndorff-Nielsen et al. (2012), and Koopman et al. (2014) we can define tick returns as

$$r(t_j) = \frac{p(t_j) - p(t_{j-1})}{g} = n(t_j) - n(t_{j-1}) = i(t_j) \quad (7)$$

which is obviously an integer. The advantage of this approach is that we can directly model the price changes expressed in terms of ticks. Although the distribution of tick returns is integer valued, it is still easier to model this distribution than the specific features of the log returns. One issue with tick returns is that they are not properly scaled; hence we expect higher variance at higher prices. However, on shorter time intervals this effect should be small, and moreover we can account for it in our model by introducing a time varying unconditional mean in the volatility equation. Figure 2 shows the empirical distribution of five stocks from the NYSE along with fitted Skellam densities.

[ insert Figure 2 here ]

In order to model integer returns we need a discrete distribution defined on the integers. One of the potential distributions is the Skellam distribution, which was suggested by Alzaid and Omair (2010). The Skellam distribution is defined as the difference of two Poisson distributed random variables. If P+ and P− are Poisson random variables with intensities λ+ and λ−, then

$$R = P^+ - P^-, \qquad R \sim \text{Skellam}(\lambda^+, \lambda^-) \quad (8)$$

with probability mass function given by

$$f(r; \lambda^+, \lambda^-) = \exp(-\lambda^+ - \lambda^-) \left(\frac{\lambda^+}{\lambda^-}\right)^{r/2} I_{|r|}\big(2\sqrt{\lambda^+ \lambda^-}\big) \quad (9)$$

where Ir is the modified Bessel function of the first kind.
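The pmf (9) can be evaluated stably through the exponentially scaled Bessel function (see also Appendix A on numerical issues). A minimal Python sketch; the choice of scipy and the parameter value λ = 2 are illustrative assumptions, not part of the paper:

```python
import numpy as np
from scipy.special import ive  # exponentially scaled Bessel: ive(v, z) = I_v(z) * exp(-|z|)
from scipy.stats import skellam

lam = 2.0  # zero mean case: lambda+ = lambda- = lam
rng = np.random.default_rng(0)
# Simulate the Skellam variable as a difference of two Poisson draws.
r = rng.poisson(lam, 100_000) - rng.poisson(lam, 100_000)

def skellam_pmf(k, lam):
    # For lambda+ = lambda-, eq. (9) collapses to exp(-2*lam) * I_|k|(2*lam),
    # which is exactly the scaled Bessel function evaluated at 2*lam.
    return ive(abs(k), 2 * lam)

for k in range(-3, 4):
    print(k, np.mean(r == k), skellam_pmf(k, lam), skellam.pmf(k, lam, lam))
```

The empirical frequencies, the scaled-Bessel evaluation and scipy's built-in `skellam.pmf` agree, and the sample moments match the zero mean, variance 2λ of the symmetric Skellam distribution.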
The Skellam distribution has the following first and second moments

$$E(R) = \lambda^+ - \lambda^- \quad (10)$$
$$\text{Var}(R) = \lambda^+ + \lambda^- \quad (11)$$

An important special case, the zero mean Skellam distribution, is obtained when λ = λ+ = λ−. In this case

$$E(R) = 0 \quad (12)$$
$$\text{Var}(R) = 2\lambda \quad (13)$$

[ insert Figure 3 here ]

One issue with the Skellam distribution is that it has exponentially decaying tails and it implies approximately normally distributed price changes as the tick size goes to zero. Keeping the variance of the price changes fixed at σ²,

$$\lim_{g \to 0} \big(p(t_j) - p(t_{j-1})\big) = \lim_{g \to 0} g\big(n(t_j) - n(t_{j-1})\big) \approx g\big(n^*(t_j) - n^*(t_{j-1})\big) \sim N(0, \sigma^2), \quad (14)$$

where n(tj) − n(tj−1) is the difference of two independent Poi(σ²/(2g²)) random variables and n∗(tj) − n∗(tj−1) is the difference of two independent N(σ²/(2g²), σ²/(2g²)) random variables; here we used the fact that a Poisson distribution with intensity λ can be approximated by a normal distribution with mean and variance equal to λ. The thin tailed trade by trade return assumption might be implausible as we know there is some evidence for jumps in stock prices.

[ insert Figure 4 here ]

The ∆NB distribution is an alternative integer valued distribution which was proposed by Barndorff-Nielsen et al. (2012). The ∆NB distribution is defined as the difference of two negative binomial random variables NB+ and NB− with numbers of failures λ+ and λ− and failure rates ν+ and ν−:

$$R = NB^+ - NB^-. \quad (15)$$

Then R is distributed as

$$R \sim \Delta NB(\lambda^+, \nu^+, \lambda^-, \nu^-) \quad (16)$$

with probability mass function given by

$$f_{\Delta NB}(r; \lambda^+, \nu^+, \lambda^-, \nu^-) = (\tilde\nu^+)^{\nu^+} (\tilde\nu^-)^{\nu^-} \times \begin{cases} \dfrac{(\tilde\lambda^+)^{r}\, (\nu^+)_r}{r!}\; F\big(\nu^+ + r, \nu^-; r + 1; \tilde\lambda^+ \tilde\lambda^-\big) & \text{if } r \ge 0 \\[1.5ex] \dfrac{(\tilde\lambda^-)^{-r}\, (\nu^-)_{-r}}{(-r)!}\; F\big(\nu^- - r, \nu^+; -r + 1; \tilde\lambda^+ \tilde\lambda^-\big) & \text{if } r < 0 \end{cases} \quad (17)$$

where

$$\tilde\nu^+ = \frac{\nu^+}{\lambda^+ + \nu^+}, \qquad \tilde\nu^- = \frac{\nu^-}{\lambda^- + \nu^-}, \qquad \tilde\lambda^+ = \frac{\lambda^+}{\lambda^+ + \nu^+}, \qquad \tilde\lambda^- = \frac{\lambda^-}{\lambda^- + \nu^-},$$

and

$$F(\alpha, \beta; \gamma; z) = \sum_{n=0}^{\infty} \frac{(\alpha)_n (\beta)_n}{(\gamma)_n} \frac{z^n}{n!} \quad (18)$$

is the Gauss hypergeometric function, with (x)n denoting the Pochhammer symbol (rising factorial)

$$(x)_n = x(x+1)(x+2)\cdots(x+n-1) = \frac{\Gamma(x+n)}{\Gamma(x)}. \quad (19)$$

The ∆NB distribution has the following first and second moments

$$E(R) = \lambda^+ - \lambda^- \quad (20)$$
$$\text{Var}(R) = \lambda^+ \left(1 + \frac{\lambda^+}{\nu^+}\right) + \lambda^- \left(1 + \frac{\lambda^-}{\nu^-}\right) \quad (21)$$

An important special case, the zero mean ∆NB distribution, is obtained when λ = λ+ = λ− and ν = ν+ = ν−:

$$f(r; \lambda, \nu) = \left(\frac{\nu}{\lambda + \nu}\right)^{2\nu} \left(\frac{\lambda}{\lambda + \nu}\right)^{|r|} \frac{\Gamma(\nu + |r|)}{\Gamma(\nu)\,\Gamma(|r| + 1)}\; F\left(\nu + |r|, \nu; |r| + 1; \left(\frac{\lambda}{\lambda + \nu}\right)^{2}\right) \quad (22)$$

In this case

$$E(R) = 0 \quad (23)$$
$$\text{Var}(R) = 2\lambda\left(1 + \frac{\lambda}{\nu}\right) \quad (24)$$

We can think about the zero mean ∆NB(λ, ν) distribution as the realization of a compound Poisson process

$$R = \sum_{i=1}^{N} M_i, \quad (25)$$

where N is Poisson distributed with intensity λ(z1 + z2),

$$z_1, z_2 \sim \text{Ga}(\nu, \nu), \quad (26)$$

and

$$M_i = \begin{cases} 1 & \text{with } P(M_i = 1) = \frac{z_1}{z_1 + z_2} \\ -1 & \text{with } P(M_i = -1) = \frac{z_2}{z_1 + z_2} \end{cases} \quad (27)$$

This representation will be useful later on.

3 Dynamic ∆NB model

In order to build a sensible model of trade by trade tick returns we have to account for several stylized facts. First of all, we model the tick returns with an integer valued distribution. We propose to use the ∆NB distribution to account for the potential fat tails of the distribution. In addition we use the zero inflated version of the ∆NB distribution, because a huge chunk of the trade by trade returns are zeros. The number of zero trade by trade returns is higher for more liquid stocks, as the available volumes at the best bid and ask prices are higher and consequently the price impact of one trade is lower. Taking the above considerations into account, our observation density can be written as

$$y_t = \begin{cases} r_t & \text{with probability } (1 - \gamma)\, f_{\Delta NB}(r_t; \lambda_t, \nu) \\ 0 & \text{with probability } \gamma + (1 - \gamma)\, f_{\Delta NB}(0; \lambda_t, \nu) \end{cases} \quad (28)$$

where γ is the zero inflation parameter, λt is the volatility parameter at time t and ν is the degrees of freedom parameter of the ∆NB distribution, which determines the thickness of the tails of the distribution. Besides explicitly accounting for the discreteness of prices in our model, we also model the daily volatility pattern and volatility clustering.
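The zero mean ∆NB pmf (22) and the compound Poisson representation (25)-(27) can be checked numerically. A minimal Python sketch; the parameter values are illustrative, and scipy's `nbinom` parametrization corresponds to success probability p = ν/(λ + ν) in the notation above:

```python
import numpy as np
from scipy.special import hyp2f1, gammaln
from scipy.stats import nbinom

def dnb_pmf(r, lam, nu):
    """Zero mean Delta-NB pmf of eq. (22), prefactor evaluated on the log scale."""
    lt, nt = lam / (lam + nu), nu / (lam + nu)      # lambda-tilde, nu-tilde
    a = abs(r)
    logc = (2 * nu * np.log(nt) + a * np.log(lt)
            + gammaln(nu + a) - gammaln(nu) - gammaln(a + 1))
    return np.exp(logc) * hyp2f1(nu + a, nu, a + 1, lt ** 2)

lam, nu, n = 1.0, 2.0, 200_000
rng = np.random.default_rng(1)

# Route 1: difference of two negative binomial draws, as in eq. (15).
p = nu / (lam + nu)
r1 = (nbinom.rvs(nu, p, size=n, random_state=rng)
      - nbinom.rvs(nu, p, size=n, random_state=rng))

# Route 2: compound Poisson representation (25)-(27).
z1, z2 = rng.gamma(nu, 1 / nu, n), rng.gamma(nu, 1 / nu, n)
N = rng.poisson(lam * (z1 + z2))
r2 = 2 * rng.binomial(N, z1 / (z1 + z2)) - N    # (#(+1) jumps) - (#(-1) jumps)

for k in (-2, -1, 0, 1, 2):
    print(k, dnb_pmf(k, lam, nu), np.mean(r1 == k), np.mean(r2 == k))
```

Both simulation routes reproduce the analytic pmf, and the sample variance matches 2λ(1 + λ/ν) from eq. (24).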
We introduce these features into our model by specifying the logarithm of the volatility as

$$\log \lambda_t = \mu_\theta + s_t + x_t \quad (29)$$

where µθ is the unconditional mean of the log intensity, st is a spline which is standardized such that it has mean zero, and xt is a zero mean AR(1) process. This specification implies a decomposition of the volatility into a deterministic daily pattern and a stochastic time varying component. The daily pattern of volatility is usually associated with frequent trading during the beginning of the day and lower activity during lunch. The xt process captures changes in volatility due to new firm specific or market information arriving during the day. We model the daily pattern with a periodic spline which has a daily periodicity (see e.g., Bos (2008), Stroud and Johannes (2014) and Weinberg et al. (2007)). The spline function is a continuous function built up from piecewise polynomials. Using the results of Poirier (1973) we can write a cubic spline st with K knots as a regression

$$s_t = w_t \beta \quad (30)$$

where wt is a 1 × K vector and β is a K × 1 vector. Details about the spline are in Appendix C. The latent state in our model is xt, an AR(1) process, which accounts for the variation in volatility on top of the daily pattern. For identification reasons we restrict the AR(1) process to have zero mean, which yields the following transition density

$$x_t = \phi x_{t-1} + \eta_t, \qquad \eta_t \sim N(0, \sigma_\eta^2), \quad (31)$$

where φ is the persistence parameter and ση² is the variance of the error term. The full model is

Tick return: yt = rt with probability (1 − γ) f∆NB(rt; λt, ν), and yt = 0 with probability γ + (1 − γ) f∆NB(0; λt, ν)
Total log volatility: log λt = µθ + st + xt
Daily volatility pattern: st = wt β
Transient volatility: xt = φxt−1 + ηt, ηt ∼ N(0, ση²)

4 Estimation

Our proposed estimation procedure relies on data augmentation and the auxiliary mixture sampling of Frühwirth-Schnatter and Wagner (2006) and Frühwirth-Schnatter et al. (2009).
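Before turning to the estimation details, the model above can be simulated forward directly. A minimal Python sketch; the cosine standing in for the zero mean periodic spline s_t and the parameter values are illustrative assumptions, not the paper's estimates:

```python
import numpy as np
from scipy.stats import nbinom

rng = np.random.default_rng(2)
T = 5_000
mu_theta, phi, sig_eta, gamma, nu = -1.7, 0.97, 0.02, 0.001, 15.0

# Transient volatility: zero mean AR(1) of eq. (31).
x = np.zeros(T)
for t in range(1, T):
    x[t] = phi * x[t - 1] + sig_eta * rng.standard_normal()

# Daily pattern: a cosine stands in for the zero mean periodic spline s_t.
s = 0.3 * np.cos(2 * np.pi * np.arange(T) / T)
lam = np.exp(mu_theta + s + x)                       # eq. (29)

# Zero inflated Delta-NB observations of eq. (28): draw NB+ - NB-,
# then force a zero with probability gamma.
p = nu / (lam + nu)
y = (nbinom.rvs(nu, p, random_state=rng)
     - nbinom.rvs(nu, p, random_state=rng))
y[rng.random(T) < gamma] = 0
print(y[:20])
```

The simulated series is integer valued, centred at zero, and its variance is governed by the time varying λ_t through eq. (24).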
First, for each observation yt we introduce the variable Nt, which is equal to the sum of NB+ and NB−. Conditional on the gamma mixing variables z1t and z2t and the intensity λt, we can think about Nt as the realization of a Poisson process on [0, 1] with intensity (z1t + z2t)λt for every t = 1, . . . , T, and we can introduce the latent arrival time of the Nt-th jump of the Poisson process, τt2, and the interarrival time between the Nt-th and (Nt + 1)-th jump of the process, τt1. Obviously the interarrival time τt1 has an exponential distribution with intensity (z1t + z2t)λt while the Nt-th arrival time has a Ga(Nt, (z1t + z2t)λt) distribution; hence we can write

$$\tau_{t1} = \frac{\xi_{t1}}{(z_{1t} + z_{2t})\lambda_t}, \qquad \xi_{t1} \sim \text{Exp}(1) \quad (32)$$
$$\tau_{t2} = \frac{\xi_{t2}}{(z_{1t} + z_{2t})\lambda_t}, \qquad \xi_{t2} \sim \text{Ga}(N_t, 1). \quad (33)$$

By taking the logarithm of the equations we can rewrite them as

$$-\log \tau_{t1} = \log(z_{1t} + z_{2t}) + \log \lambda_t + \xi^*_{t1}, \qquad \xi^*_{t1} = -\log \xi_{t1} \quad (34)$$
$$-\log \tau_{t2} = \log(z_{1t} + z_{2t}) + \log \lambda_t + \xi^*_{t2}, \qquad \xi^*_{t2} = -\log \xi_{t2}. \quad (35)$$

These equations are linear in the state, which would facilitate the use of Kalman filtering; however, the error terms ξ∗t1 and ξ∗t2 are non-normal. We can use the results of Frühwirth-Schnatter and Wagner (2006) and Frühwirth-Schnatter et al. (2009) to come up with a normal mixture approximation of these distributions

$$f_{\xi^*}(x; N_t) \approx \sum_{r=1}^{R(N_t)} \omega_r(N_t)\, \phi\big(x;\, m_r(N_t),\, v_r(N_t)\big). \quad (36)$$

Using the mixture of normals approximation of ξ∗t1 and ξ∗t2 allows us to build an efficient Gibbs sampling procedure in which we can sample the latent state paths in one block, efficiently using Kalman filtering and smoothing techniques. This is crucial as in our application the number of observations is large and updating the state time period by time period would make our estimation slow and inefficient.

The MCMC algorithm

1. Initialize µθ, φ, ση², γ, ν, R, τ, N, z1, z2, s and x

2.
Generate φ, ση², µθ, s and x from p(φ, ση², µθ, s, x | γ, ν, R, τ, N, z1, z2, y)

(a) Draw φ, ση² from p(φ, ση² | γ, ν, R, τ, N, z1, z2, s, y)
(b) Draw µθ, s and x from p(µθ, s, x | φ, ση², γ, ν, R, τ, N, z1, z2, y)

3. Generate γ from p(γ | ν, µθ, φ, ση², x, R, τ, N, z1, z2, s, y)

4. Generate R, τ, N, z1, z2, ν from p(R, τ, N, z1, z2, ν | γ, µθ, φ, ση², x, s, y)

(a) Draw ν from p(ν | γ, µθ, φ, ση², x, s, y)
(b) Draw z1, z2 from p(z1, z2 | ν, γ, µθ, φ, ση², x, s, y)
(c) Draw N from p(N | z1, z2, ν, γ, µθ, φ, ση², x, s, y)
(d) Draw τ from p(τ | N, z1, z2, ν, γ, µθ, φ, ση², x, s, y)
(e) Draw R from p(R | τ, N, z1, z2, ν, γ, µθ, φ, ση², x, s, y)

5. Go to 2

The detailed MCMC steps can be found in Appendix D.

4.1 Simulation exercise

To check our estimation procedure for the dynamic Skellam and ∆NB models we simulate 20 000 observations and run 100 000 iterations of our MCMC procedure. We set µ = −1.7, φ = 0.97, σ = 0.02, γ = 0.001 and ν = 15, which are sensible parameters based on the estimates on real data. Table 1 summarizes the results. We find that the algorithm successfully estimates the parameters, as the true parameters are in the HPD regions. Based on the simulations, the AR(1) coefficient and the volatility of the state are the hardest parameters to estimate.

[ insert Table 1 here ]
[ insert Figure 5 here ]
[ insert Figure 6 here ]

5 Data

We have quotes and trades data from the Thomson Reuters Sirca dataset. In this data set we have all the trades and quotes with millisecond time stamps for stocks from the NYSE. In our analysis we use Alcoa (AA), Coca-Cola (KO), International Business Machines (IBM), J.P. Morgan (JPM), Ford (F) and Xerox (XRX), which differ in liquidity and their price magnitude. In the paper we concentrate on two months, namely October 2008 and April 2010.
These months exhibit different market sentiments and volatility characteristics: October 2008 is in the middle of the 2008 financial crisis with record high realized volatility, and some markets experienced their worst week since 1929 in October 2008, while April 2010 is a calmer month with lower realized volatility. We carry out the following filtering steps to clean the data, following a procedure similar to the ones described in Boudt et al. (2012), Barndorff-Nielsen et al. (2008) and Brownlees and Gallo (2006). First we filter the trades from the trades and quotes data set by selecting rows where the 'Type' column equals 'Trade'. A large portion of the data in fact consists of quotes; hence by excluding the quotes we lose around 70-90% of the data set. In the next step we delete the trades with missing or zero prices or volumes. Furthermore we restrict our analysis to the trading period. The fourth step is to aggregate the trades which have the same time stamp: we use the trade with the last sequence number when there are multiple trades at the same millisecond. This choice is motivated by the fact that that is the price which we can observe with a millisecond fine resolution. Finally we filter the outliers using the rule suggested by Barndorff-Nielsen et al. (2008): we delete trades with a price smaller than the bid minus the bid-ask spread or higher than the ask plus the bid-ask spread. Table 2 and Table 3 summarize descriptive statistics for the data from 3rd to 10th October 2008 and from 23rd to 30th April 2010, respectively. A detailed summary of the cleaning can be found in Tables 6 and 7.

[ insert Table 2 here ]
[ insert Table 3 here ]

6 Empirical results

We estimate the dynamic Skellam and ∆NB models for six different stocks, Alcoa (AA), Coca-Cola (KO), International Business Machines (IBM), J.P. Morgan (JPM), Ford (F) and Xerox (XRX), in the periods of 3rd to 9th October 2008 and 23rd to 29th April 2010.
The results on the data from 2008 are reported in Table 4 while Table 5 shows the parameter estimates on the data from April 2010. The unconditional mean volatility differs across stocks and time periods. The unconditional mean volatility is higher for stocks with a higher price and it is higher in the more volatile period in 2008, which is in line with our intuition. The AR(1) coefficients are in the range of 0.88-0.99. This suggests persistence in the volatility even after accounting for the daily volatility pattern; however, the transient volatility is less persistent in the more volatile crisis period. The volatility parameter of the AR(1) process is higher during the 2008 financial crisis period. We only used the zero inflation parameter when some additional flexibility was needed in the observation density. This was the case for stocks with a higher price and in the more volatile periods. In the April 2010 period we used the zero inflation only for IBM, while in the October 2008 period we included the zero inflation for all stocks except for the two lowest priced stocks, F and XRX. The tail parameter of the ∆NB distribution is higher during the calmer 2010 period, which suggests that the distribution of the tick returns is closer to a thin tailed distribution in that period. In addition, the tail parameter is lower for stocks with a higher average price.

[ insert Table 4 here ]
[ insert Table 5 here ]

Using the output of our estimation procedure we can decompose the logarithm of the volatility by

$$E(\log \lambda_t) = E(\mu_\theta) + E(s_t) + E(x_t) \quad (37)$$

Figure 7 depicts the decomposition of the logarithm of the volatility from the Skellam model estimated on IBM tick returns from 23rd to 29th April 2010.

[ insert Figure 7 here ]

6.1 In-sample comparison

The computation of Bayes factors is infeasible in this setup as it requires sequential parameter estimation, which is computationally prohibitive given the large time dimension of our model.
Instead we follow Stroud and Johannes (2014) and calculate the Bayesian Information Criterion (BIC)

$$\mathrm{BIC}_T(\mathcal{M}_i) = -2 \sum_{t=1}^{T} \log p(y_t \mid \hat\theta, \mathcal{M}_i) + d_i \log T \quad (38)$$

where p(yt | θ, M) can be calculated with a particle filter, θ̂ is the posterior mean of the parameters and di is the number of parameters of model Mi. The BIC gives an asymptotic approximation to the Bayes factor via BIC_T(Mi) − BIC_T(Mj) ≈ −2 log BF_{i,j}. This approximation can be used for sequential model comparison. Figure 8 and Figure 10 show the Bayes factors on the samples from 3rd to 9th October 2008 and 23rd to 29th April 2010. The pictures indicate that in 2008 there is evidence in favour of the ∆NB model in the case of AA, F and XRX, while in 2010 only IBM favours the ∆NB distribution. This result is consistent with our prior conjecture that in the crisis period there are more jumps. Based on the sequential Bayes factors, the ∆NB model tends to be favoured in case of sudden big jumps in the data. Realizations of returns from the tails suggest the need for the ∆NB model mainly when they are not accompanied by high volatility. This is in line with the intuition that in models with time varying volatility the identification of the tails comes from observations of extreme realizations coupled with low volatility.

6.2 Out-of-sample comparison

In order to compare the dynamic Skellam and ∆NB models we can use predictive likelihoods. The one-step-ahead predictive likelihood for model M is defined as

$$p(y_{t+1} \mid y_{1:t}, \mathcal{M}) = \int\!\!\int p(y_{t+1} \mid y_{1:t}, x_{t+1}, \theta, \mathcal{M})\, p(x_{t+1}, \theta \mid y_{1:t}, \mathcal{M})\, dx_{t+1}\, d\theta = \int\!\!\int p(y_{t+1} \mid y_{1:t}, x_{t+1}, \theta, \mathcal{M})\, p(x_{t+1} \mid \theta, y_{1:t}, \mathcal{M})\, p(\theta \mid y_{1:t}, \mathcal{M})\, dx_{t+1}\, d\theta. \quad (39)$$

Notice that the h-step-ahead predictive likelihood can be decomposed into the product of one-step-ahead predictive likelihoods

$$p(y_{t+1:t+h} \mid y_{1:t}, \mathcal{M}) = \prod_{i=1}^{h} p(y_{t+i} \mid y_{1:t+i-1}, \mathcal{M}) = \prod_{i=1}^{h} \int\!\!\int p(y_{t+i} \mid y_{1:t+i-1}, x_{t+i}, \theta, \mathcal{M})\, p(x_{t+i} \mid \theta, y_{1:t+i-1}, \mathcal{M})\, p(\theta \mid y_{1:t+i-1}, \mathcal{M})\, dx_{t+i}\, d\theta. \quad (40)$$

The above formula shows that we have to calculate p(θ | y_{1:t+i−1}, M), the posterior of the parameters, using sequentially increasing data samples.
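The per-observation likelihood terms that enter both the BIC (38) and the predictive likelihoods (39)-(40) are estimated with a particle filter. A minimal bootstrap filter sketch for a stripped-down dynamic Skellam model (no spline, no zero inflation; these simplifications and the particle count are illustrative choices, not the paper's implementation):

```python
import numpy as np
from scipy.stats import skellam

def pf_loglik(y, mu, phi, sig_eta, n_part=500, seed=0):
    """Bootstrap particle filter estimate of sum_t log p(y_t | theta) for
    y_t ~ Skellam(lam_t, lam_t), log lam_t = mu + x_t, x_t a zero mean AR(1)."""
    rng = np.random.default_rng(seed)
    # Initialize particles from the stationary distribution of the AR(1) state.
    x = rng.normal(0.0, sig_eta / np.sqrt(1 - phi ** 2), n_part)
    ll = 0.0
    for yt in y:
        x = phi * x + sig_eta * rng.standard_normal(n_part)   # propagate
        lam = np.exp(mu + x)
        w = skellam.pmf(yt, lam, lam)                         # importance weights
        ll += np.log(w.mean() + 1e-300)                       # likelihood contribution
        x = rng.choice(x, n_part, p=w / w.sum())              # multinomial resampling
    return ll

# BIC of eq. (38) for a model with d parameters would then be
# bic = -2 * pf_loglik(y, mu_hat, phi_hat, sig_hat) + d * np.log(len(y))
```

With the posterior mean plugged in for (mu, phi, sig_eta), the returned log likelihood is the quantity whose difference across models approximates −2 log BF in the sequential comparison.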
This means that we have to run our MCMC procedure as many times as there are out-of-sample observations. Unfortunately, in our application this means several thousand runs if we would like to evaluate the predictive likelihood on an out-of-sample day, which is computationally impractical or even infeasible. However, we can exploit the vast amount of available data by using the following approximation

$$p(y_{t+1:t+h} \mid y_{1:t}, \mathcal{M}) \approx \prod_{i=1}^{h} \int\!\!\int p(y_{t+i} \mid y_{1:t+i-1}, x_{t+i}, \theta, \mathcal{M})\, p(x_{t+i} \mid \theta, y_{1:t+i-1}, \mathcal{M})\, p(\theta \mid y_{1:t}, \mathcal{M})\, dx_{t+i}\, d\theta. \quad (41)$$

This approximation can be motivated by the fact that, after observing a considerable amount of data, that is when t is sufficiently large, the posterior distribution of the static parameters should not change much; hence p(θ | y_{1:t+i−1}, M) ≈ p(θ | y_{1:t}, M). We carry out the following exercise. We thin our MCMC output to get a sample from the posterior distribution based on our in-sample observations. Then for each parameter draw we estimate the likelihood by running a particle filter through the out-of-sample period. Figure 9 and Figure 11 show the out-of-sample sequential predictive Bayes factors for 10th October 2008 and 30th April 2010, respectively. Based on the results we can say that in the more volatile period the ∆NB model is doing better in terms of Bayes factors. On 10th October 2008, AA, KO and XRX show evidence in favour of the ∆NB model in the out-of-sample comparison, while on 30th April 2010 the dynamic Skellam model fits the data better out-of-sample, except for IBM.

[ insert Figure 8 here ]
[ insert Figure 9 here ]
[ insert Figure 10 here ]
[ insert Figure 11 here ]

7 Conclusion

In this paper we introduced the dynamic ∆NB model for modelling trade by trade returns. We developed a Gibbs type MCMC procedure for the Bayesian estimation of the dynamic Skellam and ∆NB models. Moreover, we showed some empirical evidence in favour of the ∆NB model using different stocks and periods from the NYSE.

References

Alzaid, A. and M. A. Omair (2010).
On the Poisson difference distribution inference and applications. Bulletin of the Malaysian Mathematical Sciences Society 33, 17–45.

Banulescu, D., G. Colletaz, C. Hurlin, and S. Tokpavi (2013). High-Frequency Risk Measures.

Barndorff-Nielsen, O. E., P. R. Hansen, A. Lunde, and N. Shephard (2008). Realized Kernels in Practice: Trades and Quotes. Econometrics Journal 4, 1–32.

Barndorff-Nielsen, O. E., D. G. Pollard, and N. Shephard (2012). Integer-valued Lévy processes and low latency financial econometrics. Working Paper.

Bos, C. (2008). Model-based estimation of high frequency jump diffusions with microstructure noise and stochastic volatility. TI Discussion Paper.

Boudt, K., J. Cornelissen, and S. Payseur (2012). Highfrequency: Toolkit for the analysis of highfrequency financial data in R.

Brownlees, C. and G. Gallo (2006). Financial econometrics analysis at ultra-high frequency: Data handling concerns. Computational Statistics and Data Analysis 51, 2232–2245.

Chib, S., F. Nardari, and N. Shephard (2002). Markov Chain Monte Carlo methods for stochastic volatility models. Journal of Econometrics 108, 281–316.

Czado, C. and S. Haug (2010). An ACD-ECOGARCH(1,1) model. Journal of Financial Econometrics 8, 335–344.

Dahlhaus, R. and J. C. Neddermeyer (2014). Online Spot Volatility-Estimation and Decomposition with Nonlinear Market Microstructure Noise Models. Journal of Financial Econometrics 12, 174–212.

Engle, R. F. (2000). The econometrics of ultra-high-frequency data. Econometrica 68, 1–22.

Frühwirth-Schnatter, S., R. Frühwirth, L. Held, and H. Rue (2009). Improved auxiliary mixture sampling for hierarchical models of non-Gaussian data. Statistics and Computing 19, 479–492.

Frühwirth-Schnatter, S. and H. Wagner (2006). Auxiliary mixture sampling for parameter-driven models of time series of small counts with applications to state space modelling. Biometrika 93, 827–841.

Kim, S., N. Shephard, and S. Chib (1998). Stochastic volatility: Likelihood inference and comparison with ARCH models.
Review of Economic Studies 65, 361–393.

Koopman, S. J., A. Lucas, and R. Lit (2014). The Dynamic Skellam Model with Applications. TI Discussion Paper.

Münnix, M. C., R. Schäfer, and T. Guhr (2010). Impact of the tick-size on financial returns and correlations. Physica A: Statistical Mechanics and its Applications 389 (21), 4828–4843.

Omori, Y., S. Chib, N. Shephard, and J. Nakajima (2007). Stochastic volatility with leverage: fast likelihood inference. Journal of Econometrics 140, 425–449.

Poirier, D. J. (1973). Piecewise Regression Using Cubic Splines. Journal of the American Statistical Association 68, 515–524.

Rydberg, T. H. and N. Shephard (2003). Dynamics of trade-by-trade price movements: Decomposition and models. Journal of Financial Econometrics 1, 2–25.

Stroud, J. R. and M. S. Johannes (2014). Bayesian modeling and forecasting of 24-hour high-frequency volatility: A case study of the financial crisis.

Weinberg, J., L. D. Brown, and J. R. Stroud (2007). Bayesian forecasting of an inhomogeneous Poisson process with application to call center data. Journal of the American Statistical Association 102, 1185–1199.

A Numerical issues with the Skellam distribution

In general it is good to use the scaled version of the modified Bessel function of the first kind,

exp(−z) In(z).
(A1)

For the special case of the Skellam distribution we need

$$\exp(-2\lambda)\, I_{|k|}(2\lambda). \quad (A2)$$

For small and large 2λ this can still be unstable, but we can use the following approximations. For small 2λ,

$$\exp(-2\lambda)\, I_{|k|}(2\lambda) \approx \exp(-2\lambda)\, \frac{1}{\Gamma(|k| + 1)} \left(\frac{2\lambda}{2}\right)^{|k|} \approx 1 \times \frac{\lambda^{|k|}}{\Gamma(|k| + 1)}, \quad (A3)$$

while for large 2λ we can use

$$\exp(-2\lambda)\, I_{|k|}(2\lambda) \approx \exp(-2\lambda)\, \frac{\exp(2\lambda)}{\sqrt{2\pi \cdot 2\lambda}} = \frac{1}{2\sqrt{\pi\lambda}}. \quad (A4)$$

B NB distribution

A different parametrization of the NB distribution:

$$f(k; \nu, p) = \frac{\Gamma(\nu + k)}{\Gamma(\nu)\,\Gamma(k + 1)}\, p^k (1 - p)^{\nu}. \quad (A5)$$

Using

$$\lambda = \nu \frac{p}{1 - p} \;\Rightarrow\; p = \frac{\lambda}{\lambda + \nu} \quad (A6)$$

we obtain

$$f(k; \lambda, \nu) = \frac{\Gamma(\nu + k)}{\Gamma(\nu)\,\Gamma(k + 1)} \left(\frac{\lambda}{\nu + \lambda}\right)^{k} \left(\frac{\nu}{\nu + \lambda}\right)^{\nu}. \quad (A7)$$

Mean and variance:

$$\mu = \lambda \quad (A8)$$
$$\sigma^2 = \lambda\left(1 + \frac{\lambda}{\nu}\right) \quad (A9)$$

Dispersion index:

$$\frac{\sigma^2}{\mu} = 1 + \frac{\lambda}{\nu} \quad (A10)$$

The NB distribution is overdispersed, which means that there are more intervals with low counts and more intervals with high counts compared to a Poisson distribution. As we increase ν we get back the Poisson case. The Poisson distribution can be obtained from the NB distribution as follows:

$$\lim_{\nu \to \infty} f(k; \lambda, \nu) = \lim_{\nu \to \infty} \frac{\lambda^k}{k!}\, \frac{\Gamma(\nu + k)}{\Gamma(\nu)(\nu + \lambda)^k} \left(1 + \frac{\lambda}{\nu}\right)^{-\nu} = \lim_{\nu \to \infty} \frac{\lambda^k}{k!}\, \frac{(\nu + k - 1)\cdots\nu}{(\nu + \lambda)^k} \left(1 + \frac{\lambda}{\nu}\right)^{-\nu} = \frac{\lambda^k}{k!} \cdot 1 \cdot \frac{1}{e^{\lambda}} = \text{Poi}(\lambda). \quad (A11)$$

The NB distribution Y ∼ NB(λ, ν) can be written as a Poisson-Gamma mixture, that is, a Poisson distribution with Gamma heterogeneity where the Gamma heterogeneity has mean 1:

$$Y \sim \text{Poi}(\lambda U) \quad \text{where} \quad U \sim \text{Ga}(\nu, \nu), \quad (A12)$$

where the Ga(α, β) density is given by

$$f(x; \alpha, \beta) = \frac{\beta^{\alpha} x^{\alpha - 1} e^{-\beta x}}{\Gamma(\alpha)}. \quad (A13)$$

Then

$$f(k; \lambda, \nu) = \int_0^{\infty} f_{\text{Poisson}}(k; \lambda u)\, f_{\text{Gamma}}(u; \nu, \nu)\, du = \int_0^{\infty} \frac{(\lambda u)^k e^{-\lambda u}}{k!}\, \frac{\nu^{\nu} u^{\nu - 1} e^{-\nu u}}{\Gamma(\nu)}\, du = \frac{\lambda^k \nu^{\nu}}{k!\,\Gamma(\nu)} \int_0^{\infty} e^{-(\lambda + \nu)u}\, u^{k + \nu - 1}\, du.$$

Substituting (λ + ν)u = s we get

$$= \frac{\lambda^k \nu^{\nu}}{k!\,\Gamma(\nu)}\, \frac{1}{(\lambda + \nu)^{k + \nu}} \int_0^{\infty} e^{-s} s^{k + \nu - 1}\, ds = \frac{\lambda^k \nu^{\nu}}{k!\,\Gamma(\nu)}\, \frac{\Gamma(k + \nu)}{(\lambda + \nu)^{k + \nu}} = \frac{\Gamma(\nu + k)}{\Gamma(\nu)\,\Gamma(k + 1)} \left(\frac{\lambda}{\nu + \lambda}\right)^{k} \left(\frac{\nu}{\nu + \lambda}\right)^{\nu}. \quad (A14)$$

C Daily volatility patterns

We want to approximate a function f : R → R with a continuous function which is built up from piecewise polynomials of degree at most three. Let the set ∆ = {k0, . . . , kK} denote the set of knots kj, j = 0, . . . , K.
∆ is sometimes called a mesh on $[k_0, k_K]$. Let $y = \{y_0, \ldots, y_K\}$, where $y_j = f(k_j)$. We denote a cubic spline on ∆ interpolating to $y$ by $S_\Delta(x)$. $S_\Delta(x)$ has to satisfy

1. $S_\Delta(x) \in C^2[k_0, k_K]$;
2. $S_\Delta(x)$ coincides with a polynomial of degree at most three on each interval $[k_{j-1}, k_j]$, $j = 1, \ldots, K$;
3. $S_\Delta(k_j) = y_j$ for $j = 0, \ldots, K$.

By condition 2, $S_\Delta''(x)$ is a linear function on $[k_{j-1}, k_j]$, which means that we can write
\[
S_\Delta''(x) = M_{j-1}\,\frac{k_j - x}{h_j} + M_j\,\frac{x - k_{j-1}}{h_j}, \qquad x \in [k_{j-1}, k_j], \tag{A15}
\]
where $M_j = S_\Delta''(k_j)$ and $h_j = k_j - k_{j-1}$. Integrating $S_\Delta''(x)$ twice and solving for the two integration constants (using $S_\Delta(k_j) = y_j$), Poirier (1973) shows that
\[
S_\Delta'(x) = \left[\frac{h_j}{6} - \frac{(k_j - x)^2}{2h_j}\right] M_{j-1} + \left[\frac{(x - k_{j-1})^2}{2h_j} - \frac{h_j}{6}\right] M_j + \frac{y_j - y_{j-1}}{h_j}, \qquad x \in [k_{j-1}, k_j], \tag{A16}
\]
\[
S_\Delta(x) = \frac{k_j - x}{6h_j}\big[(k_j - x)^2 - h_j^2\big] M_{j-1} + \frac{x - k_{j-1}}{6h_j}\big[(x - k_{j-1})^2 - h_j^2\big] M_j + y_{j-1}\,\frac{k_j - x}{h_j} + y_j\,\frac{x - k_{j-1}}{h_j}, \qquad x \in [k_{j-1}, k_j]. \tag{A17}
\]
In the above expression only $M_j$, $j = 0, \ldots, K$, are unknown. We can use the continuity restrictions, which enforce equality of the one-sided first derivatives at the interior knots $k_j$, $j = 1, \ldots, K-1$:
\[
S_\Delta'(k_j^-) = h_j M_{j-1}/6 + h_j M_j/3 + (y_j - y_{j-1})/h_j, \tag{A18}
\]
\[
S_\Delta'(k_j^+) = -h_{j+1} M_j/3 - h_{j+1} M_{j+1}/6 + (y_{j+1} - y_j)/h_{j+1}, \tag{A19}
\]
which yields the $K - 1$ conditions
\[
(1 - \lambda_j) M_{j-1} + 2 M_j + \lambda_j M_{j+1} = \frac{6 y_{j-1}}{h_j(h_j + h_{j+1})} - \frac{6 y_j}{h_j h_{j+1}} + \frac{6 y_{j+1}}{h_{j+1}(h_j + h_{j+1})}, \tag{A20}
\]
where
\[
\lambda_j = \frac{h_{j+1}}{h_j + h_{j+1}}. \tag{A21}
\]
Adding two end conditions we have $K + 1$ unknowns and $K + 1$ equations, and we can solve the resulting linear system for the $M_j$. Using the end conditions $M_0 = \pi_0 M_1$ and $M_K = \pi_K M_{K-1}$ we can write
\[
\Lambda = \begin{bmatrix}
2 & -2\pi_0 & 0 & \cdots & 0 & 0 & 0 \\
1-\lambda_1 & 2 & \lambda_1 & \cdots & 0 & 0 & 0 \\
0 & 1-\lambda_2 & 2 & \cdots & 0 & 0 & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & 1-\lambda_{K-1} & 2 & \lambda_{K-1} \\
0 & 0 & 0 & \cdots & 0 & -2\pi_K & 2
\end{bmatrix}, \tag{A22}
\]
\[
\Theta = \begin{bmatrix}
0 & 0 & 0 & \cdots & 0 & 0 & 0 \\
\frac{6}{h_1(h_1+h_2)} & -\frac{6}{h_1 h_2} & \frac{6}{h_2(h_1+h_2)} & \cdots & 0 & 0 & 0 \\
0 & \frac{6}{h_2(h_2+h_3)} & -\frac{6}{h_2 h_3} & \cdots & 0 & 0 & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & \frac{6}{h_{K-1}(h_{K-1}+h_K)} & -\frac{6}{h_{K-1} h_K} & \frac{6}{h_K(h_{K-1}+h_K)} \\
0 & 0 & 0 & \cdots & 0 & 0 & 0
\end{bmatrix}, \tag{A23}
\]
both of dimension $(K+1) \times (K+1)$, together with the $(K+1) \times 1$ vectors
\[
m = (M_0, M_1, \ldots, M_{K-1}, M_K)', \tag{A24}
\]
\[
y = (y_0, y_1, \ldots, y_{K-1}, y_K)'. \tag{A25}
\]
The linear equation system is given by
\[
\Lambda m = \Theta y \tag{A26}
\]
and its solution is
\[
m = \Lambda^{-1} \Theta y. \tag{A27}
\]
Using this result and equation (A17) we can calculate the $N \times 1$ vector
\[
S_\Delta(\xi) = \big(S_\Delta(\xi_1), S_\Delta(\xi_2), \ldots, S_\Delta(\xi_{N-1}), S_\Delta(\xi_N)\big)'. \tag{A28}
\]
Let $P$ denote the $N \times (K+1)$ matrix whose $i$th row, $i = 1, \ldots, N$, given that $k_{j-1} \leq \xi_i \leq k_j$, can be written as
\[
p_i = \Big(0, \ldots, 0,\; \frac{k_j - \xi_i}{6h_j}\big[(k_j - \xi_i)^2 - h_j^2\big],\; \frac{\xi_i - k_{j-1}}{6h_j}\big[(\xi_i - k_{j-1})^2 - h_j^2\big],\; 0, \ldots, 0\Big), \tag{A29}
\]
where the two non-zero entries occupy the columns corresponding to $M_{j-1}$ and $M_j$. Moreover, let $Q$ denote the $N \times (K+1)$ matrix whose $i$th row, given that $k_{j-1} \leq \xi_i \leq k_j$, can be written as
\[
q_i = \Big(0, \ldots, 0,\; \frac{k_j - \xi_i}{h_j},\; \frac{\xi_i - k_{j-1}}{h_j},\; 0, \ldots, 0\Big), \tag{A30}
\]
with the non-zero entries in the same positions. Now using (A17) and (A27) we get
\[
S_\Delta(\xi) = P m + Q y = P \Lambda^{-1} \Theta y + Q y = (P \Lambda^{-1} \Theta + Q)\, y = W y, \tag{A31}
\]
where
\[
W = P \Lambda^{-1} \Theta + Q. \tag{A32}
\]
In practical situations we might only know the knots and observe the spline values with error. In this case we have
\[
s = S_\Delta(\xi) + \varepsilon = W y + \varepsilon, \tag{A33}
\]
where
\[
s = (s_1, s_2, \ldots, s_{N-1}, s_N)' \tag{A34}
\]
and
\[
\varepsilon = (\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_{N-1}, \varepsilon_N)', \tag{A35}
\]
with
\[
E(\varepsilon) = 0 \quad \text{and} \quad E(\varepsilon \varepsilon') = \sigma^2 I. \tag{A36}
\]
Notice that after fixing the knots we only have to estimate the values of the spline at the knots, and these determine the whole shape of the spline. We can do this by simple OLS:
\[
\hat{y} = (W'W)^{-1} W's. \tag{A37}
\]
For identification reasons we want
\[
\sum_{j:\,\text{unique}\ \xi_j} S_\Delta(\xi_j) = \sum_{j:\,\text{unique}\ \xi_j} w_j\, y = w^* y = 0, \tag{A38}
\]
where $w_j$ is the $j$th row of $W$ and
\[
w^* = \sum_{j:\,\text{unique}\ \xi_j} w_j \tag{A39}
\]
is a $1 \times (K+1)$ vector. The restriction can be enforced by eliminating one of the elements of $y$. This ensures that $E(s_t) = 0$, so that $s_t$ and $\mu_\theta$ can be identified.
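The mapping from knot values to spline evaluations in (A22)-(A32) and the OLS step (A37) can be checked numerically. Below is a minimal sketch, assuming natural end conditions $\pi_0 = \pi_K = 0$; the function names, the choice of knots and the test function sin(2πx) are illustrative choices of mine, not the paper's.

```python
import numpy as np

def spline_design(knots, xi):
    """Build W such that S_Delta(xi) = W @ y for a cubic spline with natural
    end conditions (pi_0 = pi_K = 0), following (A22)-(A32)."""
    k = np.asarray(knots, float)
    K = len(k) - 1
    h = np.diff(k)                          # h_j = k_j - k_{j-1}
    lam = h[1:] / (h[:-1] + h[1:])          # lambda_j, j = 1, ..., K-1

    Lam = np.zeros((K + 1, K + 1))
    Th = np.zeros((K + 1, K + 1))
    Lam[0, 0] = Lam[K, K] = 2.0             # end rows reduce to M_0 = M_K = 0
    for j in range(1, K):
        Lam[j, j - 1] = 1 - lam[j - 1]
        Lam[j, j] = 2.0
        Lam[j, j + 1] = lam[j - 1]
        Th[j, j - 1] = 6 / (h[j - 1] * (h[j - 1] + h[j]))
        Th[j, j] = -6 / (h[j - 1] * h[j])
        Th[j, j + 1] = 6 / (h[j] * (h[j - 1] + h[j]))

    xi = np.asarray(xi, float)
    P = np.zeros((len(xi), K + 1))
    Q = np.zeros((len(xi), K + 1))
    for i, x in enumerate(xi):
        j = min(max(int(np.searchsorted(k, x)), 1), K)   # interval [k_{j-1}, k_j]
        hj = h[j - 1]
        P[i, j - 1] = (k[j] - x) / (6 * hj) * ((k[j] - x) ** 2 - hj ** 2)
        P[i, j] = (x - k[j - 1]) / (6 * hj) * ((x - k[j - 1]) ** 2 - hj ** 2)
        Q[i, j - 1] = (k[j] - x) / hj
        Q[i, j] = (x - k[j - 1]) / hj
    return P @ np.linalg.solve(Lam, Th) + Q              # W = P Lam^{-1} Th + Q

# OLS estimate of the knot values from noisy spline observations, as in (A37)
knots = np.linspace(0.0, 1.0, 6)
xi = np.linspace(0.0, 1.0, 200)
W = spline_design(knots, xi)
y_true = np.sin(2 * np.pi * knots)
rng = np.random.default_rng(0)
s = W @ y_true + 0.01 * rng.standard_normal(len(xi))
y_hat = np.linalg.lstsq(W, s, rcond=None)[0]
print(np.max(np.abs(y_hat - y_true)) < 0.05)
```

Evaluating `spline_design` at the knots themselves returns the knot values exactly, which is a quick check that conditions 1-3 are respected.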
If we drop $y_K$ we can substitute
\[
y_K = -\sum_{i=0}^{K-1} (w_i^*/w_K^*)\, y_i, \tag{A40}
\]
where $w_i^*$ is the $i$th element of $w^*$. Substituting this into (A38) confirms that the restriction holds:
\[
\sum_{j:\,\text{unique}\ \xi_j} S_\Delta(\xi_j)
= \sum_{j} \sum_{i=0}^{K} w_{ji} y_i
= \sum_{j} \Big[ \sum_{i=0}^{K-1} w_{ji} y_i - w_{jK} \sum_{i=0}^{K-1} (w_i^*/w_K^*)\, y_i \Big]
= \sum_{i=0}^{K-1} \sum_{j} \big(w_{ji} - w_{jK}\, w_i^*/w_K^*\big) y_i
= \sum_{i=0}^{K-1} \big(w_i^* - w_K^*\, w_i^*/w_K^*\big) y_i
= \sum_{i=0}^{K-1} (w_i^* - w_i^*)\, y_i = 0. \tag{A41}
\]
Let us partition $W$ in the following way:
\[
W = [\,W_{-K} : W_K\,], \tag{A42}
\]
where $W_{-K}$ ($N \times K$) is equal to the first $K$ columns of $W$ and $W_K$ ($N \times 1$) is the last column of $W$. Moreover,
\[
w^* = [\,w_{-K}^* : w_K^*\,], \tag{A43}
\]
with $w_{-K}^*$ of dimension $1 \times K$ and $w_K^*$ a scalar. We can define the $N \times K$ matrix
\[
\widetilde{W} = W_{-K} - \frac{1}{w_K^*}\, W_K\, w_{-K}^*, \tag{A44}
\]
and we have
\[
s = S_\Delta(\xi) + \varepsilon = \widetilde{W} \tilde{y} + \varepsilon. \tag{A45}
\]

D MCMC estimation of the dynamic ∆NB model

D.1 Generating the parameters $x, \mu_\theta, \phi, \sigma_\eta^2$ (Step 2)

Notice that conditional on $R = \{r_{tj},\ t = 1, \ldots, T,\ j = 1, \ldots, \min(N_t+1, 2)\}$, $\tau$, $N$, $\gamma$ and $s$ we have
\[
-\log \tau_{t1} = \log(z_{1t} + z_{2t}) + \mu_\theta + s_t + x_t + m_{r_{t1}}(1) + \varepsilon_{t1}, \qquad \varepsilon_{t1} \sim N(0, v_{r_{t1}}^2(1)), \tag{A46}
\]
and
\[
-\log \tau_{t2} = \log(z_{1t} + z_{2t}) + \mu_\theta + s_t + x_t + m_{r_{t2}}(N_t) + \varepsilon_{t2}, \qquad \varepsilon_{t2} \sim N(0, v_{r_{t2}}^2(N_t)), \tag{A47}
\]
which implies the following state space form:
\[
\tilde{y}_t = \begin{bmatrix} 1 & w_t' & 1 \\ 1 & w_t' & 1 \end{bmatrix} \begin{bmatrix} \mu_\theta \\ \beta \\ x_t \end{bmatrix} + \varepsilon_t, \qquad \varepsilon_t \sim N(0, H_t), \tag{A48}
\]
where $\tilde{y}_t$ and $\varepsilon_t$ are $\min(N_t+1, 2) \times 1$ and the system matrix is $\min(N_t+1, 2) \times (K+2)$, with state transition
\[
\alpha_{t+1} = \begin{bmatrix} \mu_\theta \\ \beta \\ x_{t+1} \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & I_K & 0 \\ 0 & 0 & \phi \end{bmatrix} \begin{bmatrix} \mu_\theta \\ \beta \\ x_t \end{bmatrix} + \begin{bmatrix} 0 \\ 0 \\ \eta_{t+1} \end{bmatrix}, \qquad \eta_{t+1} \sim N(0, \sigma_\eta^2), \tag{A49--A50}
\]
and initial condition
\[
\begin{bmatrix} \mu_\theta \\ \beta \\ x_1 \end{bmatrix} \sim N\left( \begin{bmatrix} \mu_0 \\ \beta_0 \\ 0 \end{bmatrix}, \begin{bmatrix} \sigma_\mu^2 & 0 & 0 \\ 0 & \sigma_\beta^2 I_K & 0 \\ 0 & 0 & \sigma_\eta^2/(1 - \phi^2) \end{bmatrix} \right), \tag{A51}
\]
where $H_t = \mathrm{diag}\big(v_{r_{t1}}^2(1), v_{r_{t2}}^2(N_t)\big)$ and
\[
\tilde{y}_t = \begin{bmatrix} -\log \tau_{t1} - m_{r_{t1}}(1) - \log(z_{1t} + z_{2t}) \\ -\log \tau_{t2} - m_{r_{t2}}(N_t) - \log(z_{1t} + z_{2t}) \end{bmatrix}. \tag{A52}
\]

D.2 Generating γ (Step 3)

\[
p(\gamma \mid \nu, \mu_\theta, \phi, \sigma_\eta^2, x, R, s, \tau, N, z_1, z_2, y) = p(\gamma \mid \nu, \mu_\theta, s, x, y), \tag{A53}
\]
because given $\nu$, $\lambda$ and $y$ the variables $R, \tau, N, z_1, z_2$ are redundant. Moreover,
\[
p(\gamma \mid \nu, \mu_\theta, s, x, y) \propto p(y \mid \gamma, \nu, \mu_\theta, s, x)\, p(\gamma \mid \nu, \mu_\theta, s, x) = p(y \mid \gamma, \nu, \mu_\theta, s, x)\, p(\gamma), \tag{A54}
\]
as $\gamma$ is independent of $\nu$ and of $\lambda_t = \exp(\mu_\theta + s_t + x_t)$. With a Beta$(a, b)$ prior we obtain
\[
p(y \mid \gamma, \nu, \mu_\theta, s, x)\, p(\gamma) = \prod_{t=1}^T \left[ \gamma 1_{\{y_t = 0\}} + (1 - \gamma) \left(\frac{\nu}{\lambda_t + \nu}\right)^{2\nu} \left(\frac{\lambda_t}{\lambda_t + \nu}\right)^{|y_t|} \frac{\Gamma(\nu + |y_t|)}{\Gamma(\nu)\,\Gamma(|y_t| + 1)}\, F\!\left(\nu + |y_t|, \nu; |y_t| + 1; \Big(\frac{\lambda_t}{\lambda_t + \nu}\Big)^2\right) \right] \times \frac{\gamma^{a-1} (1 - \gamma)^{b-1}}{B(a, b)} \propto \gamma^{a-1} (1 - \gamma)^{b-1} \prod_{t=1}^T \big[ \gamma 1_{\{y_t = 0\}} + (1 - \gamma)\, f_{\Delta \mathrm{NB}}(y_t; \lambda_t, \nu) \big],
\]
where $F$ denotes the Gauss hypergeometric function and $f_{\Delta \mathrm{NB}}$ the ∆NB probability mass function. We can carry out an independence MH step to sample from this density using a truncated normal or normal proposal with mean equal to the mode of the above density and variance equal to the negative inverse Hessian of its logarithm at the mode.
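The independence MH step for γ described above can be sketched as follows, assuming a Beta(a, b) prior and building the normal proposal from a Laplace approximation (mode plus finite-difference curvature of the log posterior). The ∆NB mass function is evaluated through the Gauss hypergeometric function; all names (`dnb_logpmf`, `log_post_gamma`) and the simulated stand-in data are mine, not the paper's.

```python
import numpy as np
from scipy.special import gammaln, hyp2f1
from scipy.optimize import minimize_scalar

def dnb_logpmf(y, lam, nu):
    """log pmf of the Delta-NB distribution: the difference of two independent
    NB(nu, lam/(lam+nu)) counts, via the Gauss hypergeometric representation."""
    ay = np.abs(y)
    p = lam / (lam + nu)
    return (2 * nu * np.log(nu / (lam + nu)) + ay * np.log(p)
            + gammaln(nu + ay) - gammaln(nu) - gammaln(ay + 1.0)
            + np.log(hyp2f1(nu + ay, nu, ay + 1.0, p ** 2)))

def log_post_gamma(g, y, f, a=1.0, b=1.0):
    """log p(gamma | ...) up to a constant: Beta(a, b) prior times the
    zero-inflated ∆NB likelihood; f holds the precomputed ∆NB pmf values."""
    like = g * (y == 0) + (1.0 - g) * f
    return np.sum(np.log(like)) + (a - 1) * np.log(g) + (b - 1) * np.log(1 - g)

rng = np.random.default_rng(1)
T, nu = 500, 10.0
lam = np.full(T, 1.5)
y = rng.poisson(lam) - rng.poisson(lam)            # stand-in tick returns
y[rng.uniform(size=T) < 0.3] = 0                   # inflate the zeros artificially
f = np.exp(dnb_logpmf(y, lam, nu))                 # likelihood terms, gamma-free

# Laplace approximation: mode and curvature of the log posterior
res = minimize_scalar(lambda g: -log_post_gamma(g, y, f),
                      bounds=(1e-6, 1 - 1e-6), method="bounded")
mode, eps = res.x, 1e-5
curv = (log_post_gamma(mode + eps, y, f) - 2 * log_post_gamma(mode, y, f)
        + log_post_gamma(mode - eps, y, f)) / eps ** 2
sd = 1.0 / np.sqrt(max(-curv, 1e-8))

g = 0.5                                            # current value of gamma
for _ in range(500):                               # independence MH chain
    g_prop = mode + sd * rng.standard_normal()
    if not 0.0 < g_prop < 1.0:
        continue                                   # out-of-range proposal: reject
    log_acc = (log_post_gamma(g_prop, y, f) - log_post_gamma(g, y, f)
               + 0.5 * ((g_prop - mode) ** 2 - (g - mode) ** 2) / sd ** 2)
    if np.log(rng.uniform()) < log_acc:
        g = g_prop
print(0.0 < g < 1.0)
```

Because the likelihood terms do not depend on γ, they are computed once outside the chain; inside a full Gibbs sweep they would be refreshed whenever λ or ν changes.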
D.3 Generating the auxiliary variables R, τ, N, z1, z2, ν (Step 4)

\[
p(R, \tau, N, z_1, z_2, \nu \mid \gamma, \mu_\theta, \phi, \sigma_\eta^2, s, x, y) = p(R \mid \tau, N, z_1, z_2, \gamma, \nu, \mu_\theta, \phi, \sigma_\eta^2, s, x, y) \times p(\tau \mid N, z_1, z_2, \gamma, \nu, \mu_\theta, \phi, \sigma_\eta^2, s, x, y) \times p(N \mid z_1, z_2, \gamma, \nu, \mu_\theta, \phi, \sigma_\eta^2, s, x, y) \times p(z_1, z_2 \mid \gamma, \nu, \mu_\theta, \phi, \sigma_\eta^2, s, x, y) \times p(\nu \mid \gamma, \mu_\theta, \phi, \sigma_\eta^2, s, x, y). \tag{A55}
\]

Generating ν (Step 4a). Note that
\[
p(\nu \mid \gamma, \mu_\theta, \phi, \sigma_\eta^2, s, x, y) = p(\nu \mid \gamma, \lambda, y) \propto p(\nu, \gamma, \lambda, y) = p(y \mid \gamma, \lambda, \nu)\, p(\lambda \mid \gamma, \nu)\, p(\gamma \mid \nu)\, p(\nu) = p(y \mid \gamma, \lambda, \nu)\, p(\lambda)\, p(\gamma)\, p(\nu) \propto p(y \mid \gamma, \lambda, \nu)\, p(\nu), \tag{A56}
\]
where $p(y \mid \gamma, \lambda, \nu)$ is a product of zero-inflated ∆NB probability mass functions. We can draw ν using a Laplace approximation or an adaptive random walk Metropolis-Hastings procedure. An alternative way of drawing ν uses a discrete uniform prior $\nu \sim DU(2, 128)$ and a random walk proposal, as suggested by Stroud and Johannes (2014) for the degrees-of-freedom parameter of a $t$-density. We can write the posterior as a multinomial distribution $p(\nu \mid \mu_\theta, x, z_1, z_2) \sim M(\pi_2^*, \ldots, \pi_{128}^*)$ with probabilities
\[
\pi_\nu^* \propto \prod_{t=1}^T \big[ \gamma I_{\{y_t = 0\}} + (1 - \gamma)\, f_{\Delta \mathrm{NB}}(y_t; \lambda_t, \nu) \big] = \prod_{t=1}^T g_\nu(y_t). \tag{A57}
\]
To avoid the computationally intensive evaluation of all these probabilities we can use a Metropolis-Hastings update. We draw the proposal $\nu^*$ from the neighbourhood of the current value $\nu^{(i)}$ using a discrete uniform distribution $\nu^* \sim DU(\nu^{(i)} - \delta, \nu^{(i)} + \delta)$ and accept with probability
\[
\min\left\{ 1,\ \frac{\prod_{t=1}^T g_{\nu^*}(y_t)}{\prod_{t=1}^T g_{\nu^{(i)}}(y_t)} \right\}, \tag{A58}
\]
where δ is chosen such that the acceptance rate is reasonable.

Generating z1, z2 (Step 4b). Notice that the pairs $(z_{1t}, z_{2t})$ are independent over $t$ given $\gamma, \mu_\theta, s, x, y$.
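The discrete random-walk update of Step 4a can be sketched as follows. Since λ is held fixed within the step, the log target for each visited ν is cached; the data-generating settings, δ = 4 and all names are my illustrative choices, and γ = 0 is used for simplicity.

```python
import numpy as np
from scipy.special import gammaln, hyp2f1

def dnb_logpmf(y, lam, nu):
    """log pmf of the Delta-NB distribution with intensity lam and tail parameter nu."""
    ay = np.abs(y)
    p = lam / (lam + nu)
    return (2 * nu * np.log(nu / (lam + nu)) + ay * np.log(p)
            + gammaln(nu + ay) - gammaln(nu) - gammaln(ay + 1.0)
            + np.log(hyp2f1(nu + ay, nu, ay + 1.0, p ** 2)))

# simulated ∆NB data; sizes, seed and nu_true = 15 are my choices
rng = np.random.default_rng(2)
nu_true, T = 15, 2000
lam = np.full(T, 2.0)
p = lam / (lam + nu_true)
y = rng.negative_binomial(nu_true, 1 - p) - rng.negative_binomial(nu_true, 1 - p)

cache = {}                                   # lam is fixed here, so sum_t log g_nu(y_t)
def log_g(nu):                               # can be cached per value of nu
    if nu not in cache:
        cache[nu] = float(np.sum(dnb_logpmf(y, lam, float(nu))))  # gamma = 0
    return cache[nu]

def mh_step_nu(nu_cur, rng, delta=4, lo=2, hi=128):
    """One discrete random-walk MH update for nu on {lo, ..., hi}, cf. (A58)."""
    nu_prop = int(rng.integers(nu_cur - delta, nu_cur + delta + 1))
    if nu_prop < lo or nu_prop > hi or nu_prop == nu_cur:
        return nu_cur                        # proposals off the grid are rejected
    log_acc = log_g(nu_prop) - log_g(nu_cur)
    return nu_prop if np.log(rng.uniform()) < log_acc else nu_cur

nu, draws = 50, []
for _ in range(300):
    nu = mh_step_nu(nu, rng)
    draws.append(nu)
print(int(np.median(draws[150:])))
```

Caching makes this no more expensive than evaluating $g_\nu$ at the handful of ν values the chain actually visits, which is exactly the saving over computing all multinomial probabilities $\pi_2^*, \ldots, \pi_{128}^*$.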
\[
p(z_1, z_2 \mid \gamma, \nu, \mu_\theta, \phi, \sigma_\eta^2, s, x, y) = \prod_{t=1}^T p(z_{1t}, z_{2t} \mid \gamma, \nu, \mu_\theta, \phi, \sigma_\eta^2, s_t, x_t, y_t), \tag{A59}
\]
with
\[
p(z_{1t}, z_{2t} \mid \gamma, \nu, \mu_\theta, \phi, \sigma_\eta^2, s_t, x_t, y_t) \propto p(y_t \mid z_{1t}, z_{2t}, \gamma, \nu, \mu_\theta, \phi, \sigma_\eta^2, s_t, x_t)\; p(z_{1t}, z_{2t} \mid \gamma, \nu, \mu_\theta, \phi, \sigma_\eta^2, s_t, x_t), \tag{A60}
\]
so that
\[
p(z_{1t}, z_{2t} \mid \gamma, \nu, \mu_\theta, \phi, \sigma_\eta^2, s_t, x_t, y_t) \propto g(z_{1t}, z_{2t})\; \frac{\nu^\nu z_{1t}^{\nu-1} e^{-\nu z_{1t}}}{\Gamma(\nu)}\; \frac{\nu^\nu z_{2t}^{\nu-1} e^{-\nu z_{2t}}}{\Gamma(\nu)}, \tag{A61}
\]
where
\[
g(z_{1t}, z_{2t}) = \gamma 1_{\{y_t = 0\}} + (1 - \gamma) \exp\!\big(-\lambda_t (z_{1t} + z_{2t})\big) \left(\frac{z_{1t}}{z_{2t}}\right)^{y_t/2} I_{|y_t|}\!\big(2\lambda_t \sqrt{z_{1t} z_{2t}}\big), \tag{A62}
\]
with $\lambda_t = \exp(\mu_\theta + s_t + x_t)$ and $I_{|y|}(\cdot)$ the modified Bessel function of the first kind. We can carry out an independence MH step by sampling $z_{1t}^*, z_{2t}^*$ from $\mathrm{Ga}(\lambda_t, \nu)$ and accepting with probability
\[
\min\left\{ \frac{g(z_{1t}^*, z_{2t}^*)}{g(z_{1t}, z_{2t})},\ 1 \right\}. \tag{A63}
\]

Generating N (Step 4c). The numbers of jumps are independent over $t$ given $\gamma, \nu, \mu_\theta, \phi, \sigma_\eta^2, s, x, z_1, z_2, y$, which means
\[
p(N \mid \gamma, \nu, \mu_\theta, \phi, \sigma_\eta^2, s, x, z_1, z_2, y) = \prod_{t=1}^T p(N_t \mid \gamma, \nu, \mu_\theta, \phi, \sigma_\eta^2, s_t, x_t, z_{1t}, z_{2t}, y_t). \tag{A64}
\]
For a given $t$ we can draw $N_t$ from a discrete distribution with
\[
p(N_t = n \mid \ldots, y_t = k) = \frac{p(y_t = k \mid N_t = n, \ldots)\, p(N_t = n \mid \ldots)}{p(y_t = k \mid \ldots)} = \left[ \gamma 1_{\{k=0\}} + (1 - \gamma)\, p\Big(\textstyle\sum_{i=1}^n M_i = k \,\Big|\, N_t = n, z_{1t}, z_{2t}\Big) \right] \frac{p(N_t = n \mid \mu_\theta, s_t, x_t, z_{1t}, z_{2t})}{p(y_t = k \mid \ldots)}. \tag{A65}
\]
The denominator is easy to evaluate: it is a Skellam distribution at $k$ with intensities $\lambda_t z_{1t}$ and $\lambda_t z_{2t}$. The probability
\[
p\Big(\textstyle\sum_{i=1}^n M_i = k \,\Big|\, N_t = n, z_{1t}, z_{2t}\Big) \tag{A66}
\]
is not standard. Conditional on $z_1$ and $z_2$, $y_t = \sum_i M_i$ has a Skellam distribution, with
\[
M_i = \begin{cases} +1, & \text{with } P(M_i = +1) = \dfrac{z_{1t}}{z_{1t} + z_{2t}}, \\[1.2ex] -1, & \text{with } P(M_i = -1) = \dfrac{z_{2t}}{z_{1t} + z_{2t}}, \end{cases} \tag{A67}
\]
which implies that we can represent $\sum_{i=1}^n M_i$ with a tree structure and the binomial distribution: $\sum_{i=1}^n M_i = k$ requires $(n+k)/2$ successes in $n$ trials with success rate $z_{1t}/(z_{1t}+z_{2t})$. Note that an even $k$ can only occur in an even number of trials and an odd $k$ in an odd number of trials. Hence
\[
p\Big(\textstyle\sum_{i=1}^n M_i = k \,\Big|\, N_t = n, z_{1t}, z_{2t}\Big) = \begin{cases} 0, & \text{if } k > n \text{ or } |k \bmod 2| \neq |n \bmod 2|, \\[1.2ex] \dbinom{n}{\frac{n+k}{2}} \left(\dfrac{z_{1t}}{z_{1t}+z_{2t}}\right)^{\frac{n+k}{2}} \left(\dfrac{z_{2t}}{z_{1t}+z_{2t}}\right)^{\frac{n-k}{2}}, & \text{otherwise}. \end{cases} \tag{A68}
\]
The probability $p(N_t = n \mid \gamma, \mu_\theta, \phi, \sigma_\eta^2, s_t, x_t, z_{1t}, z_{2t})$ is equal to $p(N_t = n \mid \mu_\theta, s_t, x_t, z_{1t}, z_{2t})$, which is Poisson with intensity $\lambda_t (z_{1t} + z_{2t})$. In general, when $|y_t| \leq n$ we have the following expression for $p(N_t = n \mid \gamma, \nu, \mu_\theta, \phi, \sigma_\eta^2, s_t, x_t, z_{1t}, z_{2t}, y_t)$:
\[
\frac{ \gamma\, \dfrac{\big[\lambda_t (z_{1t}+z_{2t})\big]^n e^{-\lambda_t (z_{1t}+z_{2t})}}{\Gamma(n+1)}\, I_{\{y_t = 0\}} \;+\; I_{\{|y_t| \bmod 2 \,=\, n \bmod 2\}}\, (1-\gamma)\, \dfrac{\big[\lambda_t (z_{1t}+z_{2t})\big]^n e^{-\lambda_t (z_{1t}+z_{2t})}}{\Gamma\!\big(\frac{n+y_t}{2}+1\big)\, \Gamma\!\big(\frac{n-y_t}{2}+1\big)} \left(\dfrac{z_{1t}}{z_{1t}+z_{2t}}\right)^{\frac{n+y_t}{2}} \left(\dfrac{z_{2t}}{z_{1t}+z_{2t}}\right)^{\frac{n-y_t}{2}} }{ \gamma 1_{\{y_t=0\}} + (1-\gamma) \exp\!\big(-\lambda_t (z_{1t}+z_{2t})\big) \left(\dfrac{z_{1t}}{z_{2t}}\right)^{y_t/2} I_{|y_t|}\!\big(2\lambda_t \sqrt{z_{1t} z_{2t}}\big) }, \tag{A69}
\]
and it is zero otherwise. We can draw $N_t$ in parallel over $t = 1, \ldots, T$ by drawing uniform random variables $u_t \sim U[0,1]$ and setting
\[
N_t = \min\Big\{ n : u_t \leq \sum_{i=0}^n p(N_t = i \mid \gamma, \mu_\theta, \phi, \sigma_\eta^2, s_t, x_t, y_t, z_{1t}, z_{2t}) \Big\}. \tag{A70}
\]

Generating τ (Step 4d). Notice that $p(\tau \mid N, z_1, z_2, \gamma, \nu, \mu_\theta, \phi, \sigma_\eta^2, x, y) = p(\tau \mid N, \mu_\theta, z_1, z_2, s, x)$. Moreover,
\[
p(\tau \mid N, \mu_\theta, z_1, z_2, s, x) = \prod_{t=1}^T p(\tau_{1t}, \tau_{2t} \mid N_t, \mu_\theta, z_{1t}, z_{2t}, s_t, x_t) = \prod_{t=1}^T p(\tau_{1t} \mid \tau_{2t}, N_t, \mu_\theta, z_{1t}, z_{2t}, s_t, x_t)\; p(\tau_{2t} \mid N_t, \mu_\theta, z_{1t}, z_{2t}, s_t, x_t), \tag{A71}
\]
where we can sample from $p(\tau_{2t} \mid N_t, \mu_\theta, z_{1t}, z_{2t}, s_t, x_t)$ using the fact that, conditional on $N_t$, the arrival time $\tau_{2t}$ of the $N_t$th jump is the maximum of $N_t$ uniform random variables and therefore has a Beta$(N_t, 1)$ distribution. The arrival time of the $(N_t+1)$th jump after time 1 is exponentially distributed, hence
\[
\tau_{1t} = 1 + \xi_t - \tau_{2t}, \qquad \xi_t \sim \mathrm{Exp}\big(\lambda_t (z_{1t} + z_{2t})\big). \tag{A72}
\]
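The inverse-CDF draw of $N_t$ in (A68)-(A70) can be sketched as follows for the $(1-\gamma)$ branch with $y_t \neq 0$ (the zero-inflation term of (A69) is omitted for brevity); the truncation point `n_max`, the function names and the parameter values are my illustrative choices.

```python
import numpy as np
from scipy.special import gammaln

def log_p_sum_M(k, n, z1, z2):
    """log p(sum_i M_i = k | N_t = n, z1, z2) from (A68): a shifted binomial."""
    if abs(k) > n or (abs(k) % 2) != (n % 2):
        return -np.inf                       # parity or range violated
    q = z1 / (z1 + z2)                       # success probability of a +1 tick
    return (gammaln(n + 1) - gammaln((n + k) / 2 + 1) - gammaln((n - k) / 2 + 1)
            + (n + k) / 2 * np.log(q) + (n - k) / 2 * np.log(1 - q))

def draw_N(y, lam, z1, z2, rng, n_max=500):
    """Inverse-CDF draw of N_t from p(N_t = n | ...), proportional to
    Poisson(n; lam (z1 + z2)) * p(sum M = y | n), cf. (A69)-(A70)."""
    mu = lam * (z1 + z2)
    n = np.arange(n_max + 1)
    log_pois = n * np.log(mu) - mu - gammaln(n + 1)
    log_cond = np.array([log_p_sum_M(y, int(m), z1, z2) for m in n])
    w = np.exp(log_pois + log_cond)
    w /= w.sum()                             # normalization = Skellam denominator
    return int(np.searchsorted(np.cumsum(w), rng.uniform()))

rng = np.random.default_rng(3)
draws = [draw_N(y=2, lam=1.2, z1=1.0, z2=0.8, rng=rng) for _ in range(2000)]
print(np.mean(draws))
```

Every draw automatically satisfies $N_t \geq |y_t|$ and shares the parity of $y_t$, since the weights of all other $n$ are exactly zero.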
Generating R (Step 4e). Notice that
\[
p(R \mid \tau, N, z_1, z_2, \gamma, \nu, \mu_\theta, \phi, \sigma_\eta^2, s, x, y) = p(R \mid \tau, N, z_1, z_2, \nu, s, x). \tag{A73}
\]
Moreover,
\[
p(R \mid \tau, N, z_1, z_2, \nu, s, x) = \prod_{t=1}^T \prod_{j=1}^{\min(N_t+1,\,2)} p(r_{tj} \mid \tau_t, N_t, \mu_\theta, z_{1t}, z_{2t}, s_t, x_t). \tag{A74}
\]
We sample $r_{t1}$ from the discrete distribution
\[
p(r_{t1} \mid \tau_t, N_t, \mu_\theta, z_{1t}, z_{2t}, s_t, x_t) \propto w_k(1)\, \phi\!\big(-\log \tau_{1t} - \log[\lambda_t (z_{1t} + z_{2t})];\ m_k(1),\ v_k^2(1)\big), \qquad k = 1, \ldots, R(1). \tag{A75}
\]
If $N_t > 0$ we draw $r_{t2}$ from the discrete distribution
\[
p(r_{t2} \mid \tau_t, N_t, \mu_\theta, z_{1t}, z_{2t}, s_t, x_t) \propto w_k(N_t)\, \phi\!\big(-\log \tau_{2t} - \log[\lambda_t (z_{1t} + z_{2t})];\ m_k(N_t),\ v_k^2(N_t)\big), \qquad k = 1, \ldots, R(N_t). \tag{A76}
\]

Tables and Figures

Table 1: Estimation results from a dynamic Skellam and ∆NB model based on 20 000 observations and 100 000 iterations, of which 20 000 are used as a burn-in sample. The 95% HPD regions are in brackets. The true parameters are µ = −1.7, φ = 0.97, σ² = 0.02, γ = 0.001 and ν = 15.

         Skellam            ∆NB
µ        -1.72              -1.726
         [-1.797,-1.642]    [-1.804,-1.651]
φ        0.973              0.975
         [0.965,0.979]      [0.969,0.981]
σ²       0.018              0.015
         [0.013,0.023]      [0.011,0.02]
γ        0.005              0.003
         [0,0.017]          [0,0.01]
β1       1.139              1.128
         [0.884,1.392]      [0.875,1.38]
β2       -0.306             -0.297
         [-0.453,-0.158]    [-0.448,-0.151]
β3       -0.801             -0.793
         [-0.943,-0.657]    [-0.933,-0.65]
β4       0.091              0.099
         [-0.052,0.23]      [-0.04,0.24]
ν                           12.191
                            [8,16.4]

Table 2: Descriptive statistics of the data from 3rd to 10th October 2008.

             AA In    AA Out   F In     F Out    IBM In   IBM Out
Num. obs     64 807   14 385   32 756   14 313   68 002   20 800
Avg. price   16.75    11.574   3.077    2.112    96.796   87.583
Mean         -0.007   -0.004   -0.007   0        -0.02    -0.004
Std          1.63     2.126    0.745    0.601    6.831    7.09
Min          -33      -51      -18      -10      -197     -105
Max          38       39       21       9        186      140
% Zeros      48.76    48.76    77.08    77.08    39.9     39.9

             JPM In   JPM Out  KO In    KO Out   XRX In   XRX Out
Num. obs     142 867  43 230   70 356   25 036   26 020   8 623
Avg. price   42.773   38.889   49.203   41.875   9.049    7.768
Mean         -0.009   0.012    -0.012   0.005    -0.006   0.004
Std          2.368    2.779    1.758    2.734    0.816    1.285
Min          -48      -40      -33      -50      -17      -17
Max          74       55       30       63       19       12
% Zeros      43.78    43.78    34.39    34.39    54.98    54.98

Table 3: Descriptive statistics of the data from 23rd to 30th April 2010.

             AA In    AA Out   F In     F Out    IBM In   IBM Out
Num. obs     27 550   4 883    63 241   9 894    43 606   8 587
Avg. price   13.749   13.519   13.734   13.231   130.176  129.575
Mean         -0.001   -0.006   -0.001   -0.006   0.001    -0.019
Std          0.468    0.502    0.448    0.454    1.424    1.371
Min          -3       -2       -5       -2       -22      -15
Max          3        2        4        3        24       9
% Zeros      75.02    75.02    79.73    79.73    51.93    51.93

             JPM In   JPM Out  KO In    KO Out   XRX In   XRX Out
Num. obs     101 045  21 443   34 469   6 073    36 332   4 326
Avg. price   43.702   42.854   53.628   53.732   11.164   11.025
Mean         -0.001   -0.007   -0.003   -0.006   0        -0.007
Std          0.615    0.638    0.647    0.696    0.494    0.459
Min          -5       -10      -9       -5       -9       -2
Max          5        5        7        5        7        3
% Zeros      68.73    68.73    65.09    65.09    79.29    79.29

Table 4: Estimation results from a dynamic Skellam and ∆NB model during the period from 3rd to 9th October 2008.
The posterior mean estimates are based on 100 000 iterations, of which 20 000 are used as a burn-in sample. The 95% HPD regions are in brackets.

         AA                                    F                                     IBM
         Skellam          ∆NB                  Skellam          ∆NB                  Skellam          ∆NB
µ        -0.174           -0.262               -1.873           -1.861               1.935            1.246
         [-0.236,-0.112]  [-0.321,-0.204]      [-1.942,-1.803]  [-1.932,-1.791]      [1.865,2.008]    [1.198,1.294]
φ        0.929            0.941                0.939            0.945                0.881            0.935
         [0.921,0.936]    [0.935,0.946]        [0.931,0.95]     [0.934,0.955]        [0.873,0.888]    [0.93,0.94]
σ²       0.207            0.126                0.112            0.093                0.86             0.124
         [0.184,0.235]    [0.107,0.141]        [0.095,0.132]    [0.077,0.114]        [0.796,0.926]    [0.112,0.133]
γ        0.248            0.243                0.279            0.299
         [0.24,0.258]     [0.234,0.251]        [0.274,0.285]    [0.294,0.304]
β1       0.436            0.374                0.297            0.288                0.282            0.206
         [0.325,0.544]    [0.272,0.477]        [0.156,0.433]    [0.149,0.429]        [0.153,0.406]    [0.118,0.293]
β2       -0.185           -0.151               -0.117           -0.114               0.076            0.023
         [-0.274,-0.097]  [-0.234,-0.068]      [-0.224,-0.01]   [-0.223,-0.007]      [-0.034,0.186]   [-0.053,0.098]
ν                         8.701                                 14.315                                2
                          [6.6,11]                              [10.4,18.2]                           [2,2]

         JPM                                   KO                                    XRX
         Skellam          ∆NB                  Skellam          ∆NB                  Skellam          ∆NB
µ        0.229            0.239                0.18             0.148                -1.418           -1.417
         [0.188,0.272]    [0.193,0.284]        [0.138,0.222]    [0.107,0.189]        [-1.474,-1.363]  [-1.473,-1.361]
φ        0.897            0.905                0.937            0.943                0.928            0.941
         [0.893,0.902]    [0.901,0.911]        [0.932,0.943]    [0.937,0.948]        [0.912,0.944]    [0.928,0.954]
σ²       0.459            0.378                0.083            0.067                0.071            0.048
         [0.444,0.476]    [0.336,0.401]        [0.076,0.091]    [0.059,0.075]        [0.053,0.089]    [0.035,0.061]
γ        0.197            0.205                0.103            0.103
         [0.192,0.203]    [0.199,0.213]        [0.096,0.11]     [0.096,0.109]
β1       0.358            0.343                0.569            0.543                0.564            0.536
         [0.285,0.431]    [0.272,0.416]        [0.502,0.64]     [0.476,0.611]        [0.448,0.677]    [0.423,0.654]
β2       0.011            0.015                -0.209           -0.196               -0.142           -0.132
         [-0.056,0.077]   [-0.05,0.081]        [-0.277,-0.14]   [-0.262,-0.129]      [-0.23,-0.052]   [-0.22,-0.042]
ν                         87.27                                 34.756                                8.697
                          [75.2,98.8]                           [28.4,41.6]                           [5.8,11.6]

Table 5: Estimation results from a dynamic Skellam and ∆NB model during the period from 23rd to 29th April 2010.
The posterior mean estimates are based on 100 000 iterations, of which 20 000 are used as a burn-in sample. The 95% HPD regions are in brackets.

         AA                                    F                                     IBM
         Skellam          ∆NB                  Skellam          ∆NB                  Skellam          ∆NB
µ        -2.23            -2.227               -2.397           -2.393               -0.083           -0.224
         [-2.29,-2.17]    [-2.288,-2.167]      [-2.442,-2.351]  [-2.436,-2.348]      [-0.154,-0.008]  [-0.299,-0.146]
φ        0.956            0.958                0.942            0.944                0.975            0.983
         [0.944,0.968]    [0.947,0.971]        [0.933,0.951]    [0.936,0.953]        [0.968,0.981]    [0.976,0.988]
σ²       0.029            0.027                0.061            0.057                0.025            0.011
         [0.02,0.04]      [0.018,0.039]        [0.051,0.078]    [0.046,0.068]        [0.018,0.033]    [0.007,0.017]
γ        0.287            0.267
         [0.278,0.297]    [0.256,0.279]
β1       0.037            0.037                0.148            0.149                0.476            0.421
         [-0.052,0.13]    [-0.056,0.13]        [0.089,0.207]    [0.09,0.206]         [0.359,0.6]      [0.306,0.536]
β2       -0.041           -0.041               -0.188           -0.188               0.204            0.181
         [-0.138,0.057]   [-0.137,0.061]       [-0.259,-0.115]  [-0.26,-0.115]       [0.082,0.329]    [0.061,0.3]
ν                         20.367                                27.436                                6.101
                          [15,25.8]                             [21.4,33.8]                           [4.2,7.8]

         JPM                                   KO                                    XRX
         Skellam          ∆NB                  Skellam          ∆NB                  Skellam          ∆NB
µ        -1.674           -1.673               -1.636           -1.637               -2.334           -2.328
         [-1.716,-1.632]  [-1.716,-1.631]      [-1.693,-1.581]  [-1.693,-1.581]      [-2.393,-2.275]  [-2.387,-2.271]
φ        0.992            0.993                0.98             0.98                 0.943            0.947
         [0.99,0.994]     [0.991,0.994]        [0.973,0.987]    [0.973,0.987]        [0.929,0.959]    [0.934,0.959]
σ²       0.002            0.002                0.007            0.007                0.059            0.052
         [0.002,0.003]    [0.002,0.003]        [0.004,0.01]     [0.004,0.01]         [0.037,0.076]    [0.038,0.068]
β1       0.195            0.195                0.355            0.351                0.647            0.641
         [0.124,0.266]    [0.121,0.266]        [0.268,0.443]    [0.262,0.439]        [0.553,0.739]    [0.548,0.733]
β2       0.029            0.029                0.067            0.069                -0.457           -0.455
         [-0.039,0.1]     [-0.043,0.098]       [-0.032,0.164]   [-0.031,0.166]       [-0.545,-0.367]  [-0.544,-0.368]
ν                         36.288                                22.356                                17.029
                          [29.6,43.8]                           [16.6,28]                             [12.4,22.4]

Table 6: Summary of the cleaning and aggregation procedure on the data from October 2008 for Alcoa (AA), Ford (F), International Business Machines (IBM), J.P. Morgan (JPM), Coca-Cola (KO) and Xerox (XRX) from the NYSE. The % column gives the percentage of observations dropped at each step.

                                 AA        %      F        %      IBM      %      JPM      %      KO       %      XRX      %
Raw quotes and trades            511 185          311 914         688 805         984 526         541 616         371 065
Trades                           107 448  78.98    59 749  80.84  128 589  81.33  298 773  69.65  126 509  76.64   40 846  88.99
Non-missing price and volume     107 434  0.01     59 737  0.02   128 575  0.01   298 761  0      126 497  0.01    40 834  0.03
Trades between 9:35 and 15:55    107 421  0.01     59 724  0.02   128 561  0.01   298 744  0.01   126 484  0.01    40 820  0.03
Aggregated trades                 79 623  25.88    47 146  21.06   89 517  30.37  188 469  36.91   96 482  23.72   34 722  14.94
Without outliers                  79 198  0.53     47 075  0.15    88 808  0.79   186 103  1.26    95 398  1.12    34 649  0.21
Without opening trades            79 192  0.01     47 069  0.01    88 802  0.01   186 097  0       95 392  0.01    34 643  0.02

Table 7: Summary of the cleaning and aggregation procedure on the data from April 2010 for Alcoa (AA), Ford (F), International Business Machines (IBM), J.P. Morgan (JPM), Coca-Cola (KO) and Xerox (XRX) from the NYSE. The % column gives the percentage of observations dropped at each step.

                                 AA         %      F          %      IBM      %      JPM        %      KO       %      XRX        %
Raw quotes and trades            1 487 382         2 737 300         803 648         2 109 770         692 657         1 038 502
Trades                              33 684  97.74     77 778  97.16   53 346  93.36    126 153  94.02   41 184  94.05     43 170  95.84
Non-missing price and volume        33 675  0.03      77 765  0.02    53 332  0.03     126 142  0.01    41 173  0.03      43 155  0.03
Trades between 9:30 and 16:00       33 666  0.03      77 757  0.01    53 324  0.02     126 136  0       41 164  0.02      43 149  0.01
Aggregated trades                   32 446  3.62      73 160  5.91    52 406  1.72     122 579  2.82    40 573  1.44      40 673  5.74
Without outliers                    32 439  0.02      73 141  0.03    52 199  0.39     122 494  0.07    40 548  0.06      40 664  0.02
Without opening trades              32 433  0.02      73 135  0.01    52 193  0.01     122 488  0       40 542  0.01      40 658  0.01
[Figure 1 about here]

Figure 1: Empirical distribution of the tick by tick log returns during October 2008 for Alcoa (AA), Ford (F), International Business Machines (IBM), J.P. Morgan (JPM), Coca-Cola (KO) and Xerox (XRX).

[Figure 2 about here]

Figure 2: Empirical distribution of the tick returns along with the fitted Skellam density (log density scale) during October 2008 for Alcoa (AA), Ford (F), International Business Machines (IBM), J.P. Morgan (JPM), Coca-Cola (KO) and Xerox (XRX).

[Figure 3 about here]

Figure 3: The zero mean Skellam distribution with different parameters (γ = 0, λ = 1; γ = 0, λ = 2; γ = 0.1, λ = 2).

[Figure 4 about here]

Figure 4: The zero mean ∆NB distribution with different parameters (γ = 0, λ = 1, ν = 1; γ = 0, λ = 1, ν = 10; γ = 0, λ = 5, ν = 1; γ = 0, λ = 5, ν = 10; γ = 0.1, λ = 5, ν = 10).
[Figure 5 about here]

Figure 5: The posterior distribution of the parameters from a dynamic Skellam model based on 20 000 observations and 100 000 iterations, of which 20 000 are used as a burn-in sample. Each panel shows the histogram of the posterior draws, the kernel density estimate of the posterior distribution, the HPD region and the posterior mean. The true parameters are µ = −1.7, φ = 0.97, σ² = 0.02, γ = 0.001.

[Figure 6 about here]

Figure 6: The posterior distribution of the parameters from a dynamic ∆NB model based on 20 000 observations and 100 000 iterations, of which 20 000 are used as a burn-in sample. Each panel shows the histogram of the posterior draws, the kernel density estimate of the posterior distribution, the HPD region and the posterior mean. The true parameters are µ = −1.7, φ = 0.97, σ² = 0.02, γ = 0.001 and ν = 15.

[Figure 7 about here]

Figure 7: Decomposition of the log volatility of IBM tick returns from 23rd to 29th April 2010 (panels: tick returns, x_t, s_t and log λ_t).

[Figure 8 about here]

Figure 8: Sequential Bayes factor approximation based on BIC on data from 3rd to 9th October 2008 (in-sample BIC comparison, 2 log BF against tick returns, for AA, F, IBM, JPM, KO and XRX).
[Figure 9 about here]

Figure 9: Sequential predictive Bayes factors on 10th October 2008 (out-of-sample predictive likelihood comparison, 2 log BF against tick returns, for AA, F, IBM, JPM, KO and XRX).

[Figure 10 about here]

Figure 10: Sequential Bayes factor approximation based on BIC on data from 23rd to 29th April 2010 (in-sample BIC comparison, 2 log BF against tick returns, for AA, F, IBM, JPM, KO and XRX).
[Figure 11 about here]

Figure 11: Sequential predictive Bayes factors on 30th April 2010 (out-of-sample predictive likelihood comparison, 2 log BF against tick returns, for AA, F, IBM, JPM, KO and XRX).