Dynamic Negative Binomial Difference Model
for High Frequency Returns∗

PRELIMINARY AND INCOMPLETE

István Barra (a,b) and Siem Jan Koopman (a,c)

(a) VU University Amsterdam and Tinbergen Institute
(b) Duisenberg School of Finance
(c) CREATES, Aarhus University

January 19, 2015
Abstract

We introduce the dynamic ∆NB model for financial tick by tick data. Our model explicitly
takes into account the discreteness of the observed prices, the fat tails of tick returns and the
intraday pattern of volatility. We propose a Markov chain Monte Carlo estimation method which
takes advantage of an auxiliary mixture representation of the ∆NB distribution. We illustrate
our methodology using tick by tick data of several stocks from the NYSE in different periods.
Using predictive likelihoods we find evidence in favour of the dynamic ∆NB model.

Keywords: high-frequency econometrics, Bayesian inference, Markov chain Monte Carlo,
discrete distributions
1 Introduction
Stock prices are not continuous variables: they are multiples of the so-called tick size, which
is the smallest possible price difference. For example, on US exchanges the tick size is set by the
Securities and Exchange Commission in Rule 612 of Regulation National Market System to be
no smaller than $0.01 for stocks with a price greater than $1. This has a serious impact
on the distribution of trade by trade log returns, resulting in multimodality and discontinuity
of the distribution. Alzaid and Omair (2010) and Barndorff-Nielsen et al. (2012) suggest modelling
tick returns (price differences expressed in number of ticks) with integer valued distributions, as
the distribution of these tick returns is more tractable.

∗ Author information: István Barra, Email: [email protected]. Address: VU University Amsterdam, De Boelelaan 1105, 1081 HV Amsterdam, The Netherlands. István Barra thanks the Dutch National Science Foundation (NWO) for financial support.

In this paper we propose a model in which tick returns have a ∆Negative Binomial (∆NB)
distribution conditional on a Gaussian latent state. Our model provides a flexible framework to fit
the empirically observed fat tails of the tick returns and other stylized facts of trade by trade
returns; hence it is an attractive alternative to previously suggested models in the literature
(see Koopman et al. (2014)). Moreover, our structural model allows us to decompose the log
volatility into a periodic and a transient volatility component. We propose a Bayesian estimation
procedure using Gibbs sampling. Our procedure is based on data augmentation and auxiliary
mixtures, and it extends the auxiliary mixture sampling proposed by Frühwirth-Schnatter and
Wagner (2006) and Frühwirth-Schnatter et al. (2009). In the empirical part of our paper we
illustrate our methodology on six stocks from the NYSE in a volatile week of October 2008 and
a calmer week from April 2010. We compare the in-sample and out-of-sample fit of the dynamic
Skellam and dynamic ∆NB models using Bayesian information criteria and predictive likelihoods.
We find that the ∆NB model is favoured for stocks with low relative tick size and in more volatile
periods.
Our paper is related to several strands of literature. Modelling discrete price changes with
Skellam and ∆NB distributions was introduced by Alzaid and Omair (2010) and Barndorff-Nielsen
et al. (2012), while the dynamic Skellam model was introduced by Koopman et al. (2014).
Our paper is also related to stochastic volatility models, see for example Chib et al. (2002), Kim
et al. (1998), Omori et al. (2007) and more recently Stroud and Johannes (2014). We extend the
literature on trade by trade returns by explicitly accounting for price discreteness and the fat tails
of the tick return distribution (see Engle (2000), Czado and Haug (2010), Dahlhaus and
Neddermeyer (2014) and Rydberg and Shephard (2003)).
The rest of the paper is organized as follows. In Section 2 we discuss the issues with trade-by-trade
log returns and we describe the Skellam and ∆NB distributions. Section 3 introduces
the dynamic ∆NB model, while Section 4 explains our Bayesian estimation procedure. In
Section 5 we describe our dataset and cleaning procedure, and Section 6 presents the empirical
findings.
2 Tick returns and integer valued distributions
Stock prices are quoted as a multiple of the tick size. As a consequence, prices are defined on
a discrete grid, where the grid points are a tick size distance away from each other. We can write
the price at time tj as

    p(tj) = n(tj) g,                                                            (1)

where g is the tick size, which can be a function of the price on some exchanges, and n(tj) is a
natural number denoting the location of the price on the grid. Modelling trade by trade returns
can pose difficulties, as the effect of price discreteness at a frequency of a few seconds is pronounced
compared to lower frequencies such as one hour or one day. As described in Münnix et al.
(2010), the problem is that the return distribution is a mixture of return distributions ri, which
correspond to fixed price changes ig:

    ri = ( p(tj) − p(tj−1) ) / p(tj−1),                                         (2)

where i(tj) is an integer which expresses the price change in terms of ticks. The ri distributions
are concentrated on the intervals (for positive i)

    [ ig / max pi ,  ig / min pi ],                                             (3)

where the maximum and minimum are taken over the prices

    p = p(tj−1),    p(tj) − p(tj−1) = ( n(tj) − n(tj−1) ) g = i(tj) g = ig.     (4)

These intervals and their centers ci can be approximated by

    [ ig / p̄ ,  ig / p̲ ]                                                      (5)

and

    ci ≈ (ig / 2) ( 1/p̲ + 1/p̄ ),                                              (6)

as max pi ≈ p̄ and min pi ≈ p̲ for i close to 0, where p̄ and p̲ denote the largest and smallest
price in the sample.
First, note that the intervals corresponding to a zero price change and to one-tick changes are
always non-overlapping. Secondly, the centers of the intervals are approximately equally spaced;
however, the intervals for changes of higher absolute value are wider, which means that the
intervals overlap more and more as |i| increases. Thirdly, the intervals overlap less when the
price is lower, the volatility is higher or the tick size is bigger. Figure 1 shows the
empirical trade by trade return distribution of several stocks from the New York Stock Exchange
(NYSE).
[ insert Figure 1 here ]
Modelling this special feature of the trade by trade return distribution is difficult and often
neglected in the literature (see e.g., Czado and Haug (2010) and Banulescu et al. (2013)). We
use an alternative modelling framework. Following Alzaid and Omair (2010), Barndorff-Nielsen
et al. (2012), and Koopman et al. (2014) we define tick returns as

    r(tj) = ( p(tj) − p(tj−1) ) / g = n(tj) − n(tj−1) = i(tj),                  (7)

which is obviously an integer. The advantage of this approach is that we can directly model the
price changes expressed in terms of ticks. Although the distribution of tick returns is integer
valued, it is still easier to model this distribution than the specific features of the log returns.
One issue with tick returns is that they are not properly scaled, hence we expect higher variance
at higher prices; however, over shorter time intervals this effect should be small, and moreover
we can account for it in our model by introducing a time varying unconditional mean in the
volatility equation. Figure 2 shows the empirical distribution of five stocks from the NYSE along
with fitted Skellam densities.
[ insert Figure 2 here ]
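As a concrete illustration of equations (1) and (7), the mapping from observed prices to tick returns can be sketched as follows; the prices and the one-cent tick size are made-up values for illustration only:

```python
import numpy as np

g = 0.01                                  # tick size g, as in eq. (1)
prices = np.array([12.34, 12.35, 12.35, 12.33, 12.36])   # hypothetical trade prices

n = np.rint(prices / g).astype(int)       # grid position n(t_j), so p(t_j) = n(t_j) g
tick_returns = np.diff(n)                 # eq. (7): r(t_j) = n(t_j) - n(t_{j-1}) = i(t_j)
print(tick_returns.tolist())              # -> [1, 0, -2, 3]
```

Rounding to the grid (rather than dividing and truncating) avoids floating-point artifacts when recovering n(tj) from a quoted price.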
In order to model integer returns we need a discrete distribution defined on the integers. One of
the potential distributions is the Skellam distribution, which was suggested by Alzaid and Omair
(2010). The Skellam distribution is defined as the difference of two Poisson distributed random
variables. If P+ and P− are Poisson random variables with intensities λ+ and λ−, then

    R = P+ − P−,    R ∼ Skellam(λ+, λ−),                                        (8)

with probability mass function given by

    f(r; λ+, λ−) = exp(−λ+ − λ−) (λ+/λ−)^(r/2) I|r|( 2 sqrt(λ+ λ−) ),           (9)

where Ir is the modified Bessel function of the first kind. The Skellam distribution has the
following first and second moments:

    E(R) = λ+ − λ−,                                                             (10)

    Var(R) = λ+ + λ−.                                                           (11)

An important special case is the zero mean Skellam distribution, obtained when λ = λ+ = λ−.
In this case

    E(R) = 0,                                                                   (12)

    Var(R) = 2λ.                                                                (13)
[ insert Figure 3 here ]
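The definition in (8) gives a direct way to sample from the Skellam distribution and to check the moments (10)-(11); a small sketch using SciPy's skellam distribution, with arbitrary intensity values:

```python
import numpy as np
from scipy.stats import skellam

rng = np.random.default_rng(1)
lam_p, lam_m = 1.5, 1.2                   # lambda+ and lambda-, arbitrary values

# R = P+ - P-, eq. (8)
r = rng.poisson(lam_p, 100_000) - rng.poisson(lam_m, 100_000)
print(r.mean(), r.var())                  # close to lam_p - lam_m = 0.3 and lam_p + lam_m = 2.7

# the empirical frequency of any value matches the pmf in eq. (9)
print(np.mean(r == 0), skellam.pmf(0, lam_p, lam_m))
```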
One issue with the Skellam distribution is that it has exponentially decaying tails, and it
implies approximately normally distributed returns as the tick size goes to zero. Keeping
the variance of the price changes fixed at σ2, so that the tick returns are Skellam distributed
with λ+ = λ− = σ2/(2g2), we have

    lim_{g→0} ( p(tj) − p(tj−1) ) = lim_{g→0} g ( n(tj) − n(tj−1) )
                                  ≈ g ( n∗(tj) − n∗(tj−1) ) ∼ N(0, σ2),         (14)

where

    n(tj) − n(tj−1) ∼ Skellam( σ2/(2g2), σ2/(2g2) )

and n∗(tj) − n∗(tj−1) denotes its normal approximation; here we use the fact that a Poisson
distribution with intensity λ can be approximated by a normal distribution with mean and
variance equal to λ, so that the difference of the two Poisson components is approximately
N(0, σ2/g2). The thin tailed trade by trade return assumption might be implausible, as we know
there is some evidence for jumps in stock prices.
[ insert Figure 4 here ]
The ∆NB distribution is an alternative integer valued distribution, which was proposed by
Barndorff-Nielsen et al. (2012). The ∆NB distribution is defined as the difference of two negative
binomial random variables NB+ and NB− with mean parameters λ+ and λ− and dispersion
parameters ν+ and ν− (see Appendix B for the parametrization):

    R = NB+ − NB−.                                                              (15)

Then R is distributed as

    R ∼ ∆NB(λ+, ν+, λ−, ν−),                                                    (16)

with probability mass function given by

    f∆NB(r; λ+, ν+, λ−, ν−) =
        (ν̃+)^ν+ (ν̃−)^ν− (λ̃+)^r ( (ν+)r / r! ) F( ν+ + r, ν−, r + 1; λ̃+ λ̃− )          if r ≥ 0,
        (ν̃+)^ν+ (ν̃−)^ν− (λ̃−)^(−r) ( (ν−)−r / (−r)! ) F( ν+, ν− − r, −r + 1; λ̃+ λ̃− )  if r < 0,
                                                                                (17)
where

    ν̃+ = ν+ / (λ+ + ν+),    ν̃− = ν− / (λ− + ν−),
    λ̃+ = λ+ / (λ+ + ν+),    λ̃− = λ− / (λ− + ν−),

and

    F(α, β; γ; z) = Σ_{n=0}^{∞} (α)n (β)n z^n / ( (γ)n n! )                     (18)

is the Gauss hypergeometric function, with (x)n denoting the Pochhammer symbol (rising
factorial)

    (x)n = x (x + 1) (x + 2) · · · (x + n − 1) = Γ(x + n) / Γ(x).               (19)
The ∆NB distribution has the following first and second moments:

    E(R) = λ+ − λ−,                                                             (20)

    Var(R) = λ+ ( 1 + λ+/ν+ ) + λ− ( 1 + λ−/ν− ).                               (21)

An important special case is the zero mean ∆NB distribution, obtained when λ = λ+ = λ− and
ν = ν+ = ν−:

    f(r; λ, ν) = ( ν/(λ+ν) )^(2ν) ( λ/(λ+ν) )^|r| ( Γ(ν+|r|) / ( Γ(ν) Γ(|r|+1) ) )
                 × F( ν+|r|, ν, |r|+1; ( λ/(λ+ν) )^2 ).                         (22)

In this case

    E(R) = 0,                                                                   (23)

    Var(R) = 2λ ( 1 + λ/ν ).                                                    (24)
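The zero mean pmf (22) can be evaluated with SciPy's Gauss hypergeometric function. The sketch below, with arbitrary λ and ν, checks that the probabilities sum to one and reproduce the variance in (24):

```python
import numpy as np
from scipy.special import gammaln, hyp2f1

def dnb_pmf(r, lam, nu):
    """Zero mean Delta-NB pmf, eq. (22), evaluated on the log scale for stability."""
    p = lam / (lam + nu)
    r = abs(r)
    log_c = 2 * nu * np.log1p(-p) + r * np.log(p) \
            + gammaln(nu + r) - gammaln(nu) - gammaln(r + 1)
    return np.exp(log_c) * hyp2f1(nu + r, nu, r + 1, p ** 2)

lam, nu = 1.0, 5.0                        # arbitrary parameter values
support = np.arange(-60, 61)              # tails are negligible beyond this range here
probs = np.array([dnb_pmf(r, lam, nu) for r in support])
print(probs.sum())                        # ~1.0
print((probs * support ** 2).sum())       # ~ Var(R) = 2*lam*(1 + lam/nu) = 2.4
```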
We can think about the zero mean ∆NB(λ, ν) distribution as the realization of a compound
Poisson process,

    R = Σ_{i=1}^{N} Mi,                                                         (25)

where N is Poisson distributed with intensity

    λ (z1 + z2),    z1, z2 ∼ Ga(ν, ν),                                          (26)

and

    Mi = +1  with  P(Mi = +1) = z1 / (z1 + z2),
         −1  with  P(Mi = −1) = z2 / (z1 + z2).                                 (27)

This representation will be useful later on.
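The representation (25)-(27) can be checked by simulation: drawing R directly as a difference of two negative binomials and via the compound Poisson construction gives the same distribution. A sketch with arbitrary parameter values (note that NumPy's negative binomial sampler takes the success probability, here 1 − λ/(λ+ν)):

```python
import numpy as np

rng = np.random.default_rng(0)
lam, nu, n = 2.0, 4.0, 200_000            # arbitrary parameter values

# direct: R = NB+ - NB-, with p = lam/(lam+nu) as in Appendix B
p = lam / (lam + nu)
r_direct = rng.negative_binomial(nu, 1 - p, n) - rng.negative_binomial(nu, 1 - p, n)

# compound Poisson, eqs (25)-(27): N jumps of size +/-1
z1 = rng.gamma(nu, 1 / nu, n)             # Ga(nu, nu) mixing variables, mean 1
z2 = rng.gamma(nu, 1 / nu, n)
N = rng.poisson(lam * (z1 + z2))
up = rng.binomial(N, z1 / (z1 + z2))      # number of +1 jumps among the N
r_cp = 2 * up - N                         # (#up jumps) - (#down jumps)

print(r_direct.var(), r_cp.var())         # both ~ 2*lam*(1 + lam/nu) = 6.0
```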
3 Dynamic ∆NB model
In order to build a sensible model of trade by trade tick returns we have to account for several
stylized facts. First of all, we model the tick returns with an integer valued distribution. We
propose to use the ∆NB distribution to account for the potential fat tails of the distribution. In
addition, we use the zero inflated version of the ∆NB distribution, because a huge chunk of the
trade by trade returns are zeros. The number of zero trade by trade returns is higher for more
liquid stocks, as the available volumes at the best bid and ask prices are higher and in consequence
the price impact of one trade is lower. Taking the above considerations into account, our
observation density can be written as
    p(yt = rt) = (1 − γ) f∆NB(rt; λt, ν)        for rt ≠ 0,
    p(yt = 0) = γ + (1 − γ) f∆NB(0; λt, ν),                                     (28)

where γ is the zero inflation parameter, λt is the volatility parameter at time t, and ν is the
degrees of freedom parameter of the ∆NB distribution, which determines the thickness of the
tails of the distribution.
Besides explicitly accounting for the discreteness of prices in our model, we also model the
daily volatility pattern and volatility clustering. We introduce these features into our model by
specifying the logarithm of the volatility as

    log λt = µθ + st + xt,                                                      (29)

where µθ is the unconditional mean of the log intensity, st is a spline which is standardized
such that it has mean zero, and xt is a zero mean AR(1) process. This specification implies a
decomposition of the volatility into a deterministic daily pattern and a stochastic time varying
component. The daily pattern of volatility is usually associated with frequent trading during
the beginning of the day and lower activity during lunch. The xt process captures changes in
volatility due to new firm specific or market information arriving during the day. We model
the daily pattern with a periodic spline which has a daily periodicity (see e.g., Bos (2008),
Stroud and Johannes (2014) and Weinberg et al. (2007)). The spline function is a continuous
function built up from piecewise polynomials. Using the results of Poirier (1973) we can write a
cubic spline st with K knots as a regression

    st = wt β,                                                                  (30)

where wt is a 1 × K vector and β is a K × 1 vector. Details about the spline are in Appendix C.
The latent state in our model is xt, an AR(1) process, which accounts for the variation in
volatility on top of the daily pattern. For identification reasons we restrict the AR(1) process to
have zero mean, which yields the following transition density:

    xt = φ xt−1 + ηt,    ηt ∼ N(0, ση2),                                        (31)

where φ is the persistence parameter and ση2 is the variance of the error term.
The full model is then

    Tick return:               p(yt = rt) = (1 − γ) f∆NB(rt; λt, ν)  for rt ≠ 0,
                               p(yt = 0) = γ + (1 − γ) f∆NB(0; λt, ν),
    Total log volatility:      log λt = µθ + st + xt,
    Daily volatility pattern:  st = wt β,
    Transient volatility:      xt = φ xt−1 + ηt,    ηt ∼ N(0, ση2).
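To make the model concrete, here is a minimal simulation sketch of the system above, with the spline component st set to zero and made-up parameter values, so that only the zero inflated ∆NB observation density and the AR(1) log volatility are exercised:

```python
import numpy as np

rng = np.random.default_rng(2)
T = 5_000
mu, phi, sig_eta = -1.7, 0.97, 0.2        # made-up values; the spline s_t is 0 here
gamma_zi, nu = 0.05, 15.0                 # zero inflation and tail parameters

# transient volatility: x_t = phi x_{t-1} + eta_t
x = np.zeros(T)
for t in range(1, T):
    x[t] = phi * x[t - 1] + sig_eta * rng.normal()
lam = np.exp(mu + x)                      # log lam_t = mu + s_t + x_t with s_t = 0

# zero mean Delta-NB draws as a difference of two negative binomials
p = lam / (lam + nu)
y = rng.negative_binomial(nu, 1 - p) - rng.negative_binomial(nu, 1 - p)
y[rng.random(T) < gamma_zi] = 0           # zero inflation: force a zero with prob gamma
print(np.mean(y == 0), y.var())
```

With λt this small, most observations are zero even before inflation, which mirrors the empirical tick return distributions discussed above.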
4 Estimation
Our proposed estimation procedure relies on data augmentation and the auxiliary mixture
sampling of Frühwirth-Schnatter and Wagner (2006) and Frühwirth-Schnatter et al. (2009).
First, for each observation yt we introduce the variable Nt, which is equal to the sum of NB+
and NB−. Conditional on the gamma mixing variables z1t and z2t and the intensity λt, we can
think about Nt as the realization of a Poisson process on [0, 1] with intensity (z1t + z2t) λt for
every t = 1, . . . , T, and we can introduce the latent arrival time of the Nt-th jump of the Poisson
process, τt2, and the interarrival time between the Nt-th and (Nt + 1)-th jumps of the process,
τt1. Obviously the interarrival time τt1 has an exponential distribution with intensity
(z1t + z2t) λt, while the Nt-th arrival time has a Ga(Nt, (z1t + z2t) λt) distribution; hence we can
write

    τt1 = ξt1 / ( (z1t + z2t) λt ),    ξt1 ∼ Exp(1),                            (32)

    τt2 = ξt2 / ( (z1t + z2t) λt ),    ξt2 ∼ Ga(Nt, 1).                         (33)

By taking the logarithm of these equations we can rewrite them as

    − log τt1 = log(z1t + z2t) + log λt + ξt1∗,    ξt1∗ = − log ξt1,            (34)

    − log τt2 = log(z1t + z2t) + log λt + ξt2∗,    ξt2∗ = − log ξt2.            (35)
These equations are linear in the state, which would facilitate the use of Kalman filtering;
however, the error terms ξt1∗ and ξt2∗ are non normal. We can use the results of
Frühwirth-Schnatter and Wagner (2006) and Frühwirth-Schnatter et al. (2009) to come up with
a normal mixture approximation of these distributions:

    fξ∗(x; Nt) ≈ Σ_{r=1}^{R(Nt)} ωr(Nt) φ( x; mr(Nt), vr(Nt) ).                 (36)

Using the mixture of normals approximation of ξt1∗ and ξt2∗ allows us to build an efficient Gibbs
sampling procedure in which we can sample the latent state paths in one block, efficiently using
Kalman filtering and smoothing techniques. This is crucial, as in our application the number of
observations is large, and updating the state time period by time period would make our
estimation slow and inefficient.
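A quick simulation check of (32) and (34): after the log transform, the error ξ∗ = −log ξ with ξ ∼ Exp(1) has mean equal to the Euler-Mascheroni constant and variance π²/6, which is the non-normal distribution that the mixture (36) approximates. Parameter values below are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(3)
n, lam, nu = 200_000, 0.5, 10.0           # arbitrary values for lambda_t and nu

z1 = rng.gamma(nu, 1 / nu, n)             # gamma mixing variables, Ga(nu, nu)
z2 = rng.gamma(nu, 1 / nu, n)
tau1 = rng.exponential(1.0, n) / ((z1 + z2) * lam)     # eq. (32)

# eq. (34): -log tau1 = log(z1 + z2) + log(lam) + xi*, so recover xi*
xi_star = -np.log(tau1) - np.log(z1 + z2) - np.log(lam)
print(xi_star.mean(), xi_star.var())      # ~0.5772 (Euler-Mascheroni) and ~pi^2/6
```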
The MCMC algorithm

1. Initialize µθ, φ, ση2, γ, ν, R, τ, N, z1, z2, s and x

2. Generate φ, ση2, µθ, s and x from p(φ, ση2, µθ, s, x | γ, ν, R, τ, N, z1, z2, y)

   (a) Draw φ, ση2 from p(φ, ση2 | γ, ν, R, τ, N, z1, z2, y)

   (b) Draw µθ, s and x from p(µθ, s, x | φ, ση2, γ, ν, R, τ, N, z1, z2, y)

3. Generate γ from p(γ | ν, µθ, φ, ση2, x, R, τ, N, z1, z2, s, y)

4. Generate R, τ, N, z1, z2, ν from p(R, τ, N, z1, z2, ν | γ, µθ, φ, ση2, x, s, y)

   (a) Draw ν from p(ν | γ, µθ, φ, ση2, x, s, y)

   (b) Draw z1, z2 from p(z1, z2 | ν, γ, µθ, φ, ση2, x, s, y)

   (c) Draw N from p(N | z1, z2, ν, γ, µθ, φ, ση2, x, s, y)

   (d) Draw τ from p(τ | N, z1, z2, ν, γ, µθ, φ, ση2, x, s, y)

   (e) Draw R from p(R | τ, N, z1, z2, ν, γ, µθ, φ, ση2, x, s, y)

5. Go to 2
The detailed MCMC steps can be found in Appendix D.
4.1 Simulation exercise

To check our estimation procedure for the dynamic Skellam and ∆NB models we simulate 20,000
observations and run 100,000 iterations of our MCMC procedure. We set µ = −1.7, φ = 0.97,
σ = 0.02, γ = 0.001 and ν = 15, which are sensible parameters based on the estimates on real
data. Table 1 summarizes the results.

We find that the algorithm successfully estimates the parameters, as the true parameters are
in the HPD regions. Based on the simulations, the AR(1) coefficient and the volatility of the
state are the hardest parameters to estimate.
[ insert Table 1 here ]
[ insert Figure 5 here ]
[ insert Figure 6 here ]
5 Data
We have quotes and trades data from the Thomson Reuters Sirca dataset. In this data set we
have all the trades and quotes with millisecond time stamps for stocks from the NYSE. In our
analysis we use Alcoa (AA), Coca-Cola (KO), International Business Machines (IBM), J.P.
Morgan (JPM), Ford (F) and Xerox (XRX), which differ in liquidity and in the magnitude of
their price. In the paper we concentrate on two months, namely October 2008 and April 2010.
These months exhibit different market sentiments and volatility characteristics: October 2008 is
in the middle of the 2008 financial crisis, with record high realized volatility, and some markets
experienced their worst weeks since 1929 in October 2008, while April 2010 is a calmer month
with lower realized volatility. We carry out the following filtering steps to clean the data,
following a procedure similar to the ones described in Boudt et al. (2012), Barndorff-Nielsen
et al. (2008) and Brownlees and Gallo (2006).

First, we filter the trades from the trades and quotes data set by selecting rows where the
'Type' column equals 'Trade'. A large portion of the data in fact consists of quotes, hence by
excluding the quotes we lose around 70-90% of the data set. In the next step we delete the
trades with missing or zero prices or volumes. Furthermore, we restrict our analysis to the
trading period. The fourth step is to aggregate trades which have the same time stamp. We
decided to use the trade with the last sequence number when there are multiple trades at the
same millisecond. This choice is motivated by the fact that that is the price which we can
observe at a millisecond resolution. Finally, we filter the outliers using the rule suggested by
Barndorff-Nielsen et al. (2008): we delete trades with a price smaller than the bid minus the
bid-ask spread or higher than the ask plus the bid-ask spread. Table 2 and Table 3 summarize
descriptive statistics for the data from 3rd to 10th October 2008 and from 23rd to 30th April
2010, respectively. A detailed summary of the cleaning can be found in Table 6 and Table 7.
[ insert Table 2 here ]
[ insert Table 3 here ]
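The first, second and fourth cleaning steps can be sketched with pandas on a toy table; the column names 'Type', 'Time', 'Seq', 'Price' and 'Volume' are hypothetical, and the actual Sirca layout may differ:

```python
import pandas as pd

# toy trades-and-quotes table; column names are hypothetical
df = pd.DataFrame({
    "Type":   ["Quote", "Trade", "Trade", "Trade", "Trade"],
    "Time":   ["09:30:00.001", "09:30:00.001", "09:30:00.002",
               "09:30:00.002", "09:30:00.002"],
    "Seq":    [1, 2, 3, 4, 5],
    "Price":  [float("nan"), 10.01, 10.02, 0.0, 10.03],
    "Volume": [float("nan"), 100, 200, 300, 400],
})

trades = df[df["Type"] == "Trade"]                               # keep trades only
trades = trades[(trades["Price"] > 0) & (trades["Volume"] > 0)]  # drop missing/zero
# one observation per millisecond: keep the trade with the last sequence number
trades = trades.sort_values("Seq").groupby("Time", as_index=False).last()
print(trades["Price"].tolist())           # -> [10.01, 10.03]
```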
6 Empirical results
We estimate the dynamic Skellam and ∆NB models for six stocks, Alcoa (AA), Coca-Cola (KO),
International Business Machines (IBM), J.P. Morgan (JPM), Ford (F) and Xerox (XRX), in the
periods of 3rd to 9th October 2008 and 23rd to 29th April 2010. The results on the data from
2008 are reported in Table 4, while Table 5 shows the parameter estimates on the data from
April 2010.
The unconditional mean volatility differs across stocks and time periods. It is higher for stocks
with a higher price, and it is higher in the more volatile period in 2008, which is in line with our
intuition. The AR(1) coefficients are in the range 0.88-0.99. This suggests persistence in the
volatility even after accounting for the daily volatility pattern; however, the transient volatility
is less persistent in the more volatile crisis period. The volatility parameter of the AR(1)
process is higher during the 2008 financial crisis period. We only used the zero inflation
parameter when some additional flexibility was needed in the observation density. This was the
case for stocks with a higher price and in the more volatile periods. For the April 2010 period
we used the zero inflation only for IBM, while in the October 2008 period we included the zero
inflation for all stocks except for the two lowest priced stocks, F and XRX. The tail parameter
of the ∆NB distribution is higher during the calmer 2010 period, which suggests that the
distribution of the tick returns is closer to a thin tailed distribution in that period. In addition,
the tail parameter is lower for stocks with a higher average price.
[ insert Table 4 here ]
[ insert Table 5 here ]
Using the output of our estimation procedure we can decompose the logarithm of the volatility
as

    E(log λt) = E(µθ) + E(st) + E(xt).                                          (37)

Figure 7 depicts the decomposition of the logarithm of the volatility from the Skellam model
estimated on IBM tick returns from 23rd to 29th April 2010.
[ insert Figure 7 here ]
6.1 In-sample comparison
The computation of Bayes factors is infeasible in this setup, as it requires sequential parameter
estimation, which is computationally prohibitive with the large time dimension of our model.
Instead we follow Stroud and Johannes (2014) and calculate the Bayesian Information Criterion
(BIC),

    BICT(M) = −2 Σ_{t=1}^{T} log p(yt | θ̂, M) + dM log T,                       (38)

where p(yt | θ, M) can be calculated with a particle filter, θ̂ is the posterior mean of the
parameters and dM is the number of static parameters of model M. The BIC gives an
asymptotic approximation to the Bayes factor via BICT(Mi) − BICT(Mj) ≈ −2 log BFi,j. This
approximation can be used for sequential model comparison.
Figure 8 and Figure 10 show the Bayes factors on the samples from 3rd to 9th October 2008
and 23rd to 29th April 2010. The pictures indicate that in 2008 there is evidence in favour of
the ∆NB model in the case of AA, F and XRX, while in 2010 only IBM favours the ∆NB
distribution. This result is consistent with our prior conjecture that in the crisis period there
are more jumps. Based on the sequential Bayes factors, the ∆NB model tends to be favoured in
case of sudden big jumps in the data. Realizations of returns from the tail suggest the need for
the ∆NB model only in cases where the volatility is low. This is in line with the intuition that
in models with time varying volatility, identification of the tails comes from observing extreme
realizations coupled with low volatility.
6.2 Out-of-sample comparison
In order to compare the dynamic Skellam and ∆NB models we can use predictive likelihoods.
The one-step-ahead predictive likelihood for model M is defined as

    p(yt+1 | y1:t, M) = ∫∫ p(yt+1 | y1:t, xt+1, θ, M) p(xt+1, θ | y1:t, M) dxt+1 dθ
                      = ∫∫ p(yt+1 | y1:t, xt+1, θ, M) p(xt+1 | θ, y1:t, M) p(θ | y1:t, M) dxt+1 dθ.   (39)

Notice that the h-step-ahead predictive likelihood can be decomposed into the product of
one-step-ahead predictive likelihoods:

    p(yt+1:t+h | y1:t, M) = Π_{i=1}^{h} p(yt+i | y1:t+i−1, M)
                          = Π_{i=1}^{h} ∫∫ p(yt+i | y1:t+i−1, xt+i, θ, M)
                            × p(xt+i | θ, y1:t+i−1, M) p(θ | y1:t+i−1, M) dxt+i dθ.                  (40)

The above formula suggests that we have to calculate p(θ | y1:t+i−1, M), the posterior of the
parameters, using sequentially increasing data samples. This means that we would have to run
our MCMC procedure as many times as there are out-of-sample observations. Unfortunately, in
our application this means several thousands of runs if we would like to evaluate the predictive
likelihood on an out-of-sample day, which is computationally impractical or even infeasible.
However, we can exploit the vast amount of available data by using the following approximation:

    p(yt+1:t+h | y1:t, M) ≈ Π_{i=1}^{h} ∫∫ p(yt+i | y1:t+i−1, xt+i, θ, M)
                            × p(xt+i | θ, y1:t+i−1, M) p(θ | y1:t, M) dxt+i dθ.                      (41)
This approximation can be motivated by the fact that, after observing a considerable amount
of data, that is, when t is sufficiently large, the posterior distribution of the static parameters
should not change much, hence p(θ | y1:t+i−1, M) ≈ p(θ | y1:t, M). We carry out the following
exercise. We thin our MCMC output to get a sample from the posterior distribution based on
our in-sample observations. Then for each parameter draw we estimate the likelihood by
running a particle filter through the out-of-sample period.
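The likelihood evaluation p(yt | θ, M) with a particle filter can be sketched as a bootstrap filter. For brevity the sketch below uses the zero mean dynamic Skellam model without the spline, so the weights are Skellam probabilities, and the parameter values are made up; the ∆NB case only changes the observation density:

```python
import numpy as np
from scipy.stats import skellam

def pf_loglik(y, mu, phi, sig_eta, n_part=2_000, seed=0):
    """Bootstrap particle filter log-likelihood for y_t ~ Skellam(lam_t, lam_t),
    log lam_t = mu + x_t, x_t = phi x_{t-1} + eta_t (sketch, no spline term)."""
    rng = np.random.default_rng(seed)
    x = rng.normal(0.0, sig_eta / np.sqrt(1 - phi ** 2), n_part)  # stationary start
    loglik = 0.0
    for yt in y:
        x = phi * x + sig_eta * rng.normal(size=n_part)   # propagate particles
        lam = np.exp(mu + x)
        w = skellam.pmf(yt, lam, lam)                     # observation weights
        loglik += np.log(w.mean())                        # p(y_t | y_{1:t-1}, theta)
        x = rng.choice(x, size=n_part, p=w / w.sum())     # multinomial resampling
    return loglik

# simulate a short series from the observation density and evaluate it
rng = np.random.default_rng(4)
lam_true = np.full(200, np.exp(-0.5))
y = rng.poisson(lam_true) - rng.poisson(lam_true)
ll = pf_loglik(y, mu=-0.5, phi=0.95, sig_eta=0.1)
print(ll)
```

Averaging such log-likelihood estimates over thinned posterior draws of θ gives the approximation in (41).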
Figure 9 and Figure 11 show the out-of-sample sequential predictive Bayes factors for 10th
October 2008 and 30th April 2010, respectively. Based on the results we can say that in the
more volatile period the ∆NB model is doing better in terms of Bayes factors. On 10th October
2008, AA, KO and XRX show evidence in favour of the ∆NB model in the out-of-sample
comparison, while on 30th April 2010 the dynamic Skellam model fits the data better
out-of-sample, except for IBM.
[ insert Figure 8 here ]
[ insert Figure 9 here ]
[ insert Figure 10 here ]
[ insert Figure 11 here ]
7 Conclusion

In this paper we introduced the dynamic ∆NB model for modelling trade by trade returns. We
developed a Gibbs type MCMC procedure for the Bayesian estimation of the dynamic Skellam
and ∆NB models. Moreover, we showed some empirical evidence in favour of the ∆NB model
using different stocks and periods from the NYSE.
References

Alzaid, A. and M. A. Omair (2010). On the Poisson difference distribution inference and applications. Bulletin of the Malaysian Mathematical Sciences Society 33, 17-45.

Banulescu, D., G. Colletaz, C. Hurlin, and S. Tokpavi (2013). High-Frequency Risk Measures.

Barndorff-Nielsen, O. E., P. R. Hansen, A. Lunde, and N. Shephard (2008). Realized Kernels in Practice: Trades and Quotes. Econometrics Journal 4, 1-32.

Barndorff-Nielsen, O. E., D. G. Pollard, and N. Shephard (2012). Integer-valued Lévy processes and low latency financial econometrics. Working Paper.

Bos, C. (2008). Model-based estimation of high frequency jump diffusions with microstructure noise and stochastic volatility. TI Discussion Paper.

Boudt, K., J. Cornelissen, and S. Payseur (2012). Highfrequency: Toolkit for the analysis of high frequency financial data in R.

Brownlees, C. and G. Gallo (2006). Financial econometrics analysis at ultra-high frequency: Data handling concerns. Computational Statistics and Data Analysis 51, 2232-2245.

Chib, S., F. Nardari, and N. Shephard (2002). Markov Chain Monte Carlo methods for stochastic volatility models. Journal of Econometrics 108, 281-316.

Czado, C. and S. Haug (2010). An ACD-ECOGARCH(1,1) model. Journal of Financial Econometrics 8, 335-344.

Dahlhaus, R. and J. C. Neddermeyer (2014). Online Spot Volatility-Estimation and Decomposition with Nonlinear Market Microstructure Noise Models. Journal of Financial Econometrics 12, 174-212.

Engle, R. F. (2000). The econometrics of ultra-high-frequency data. Econometrica 68, 1-22.

Frühwirth-Schnatter, S., R. Frühwirth, L. Held, and H. Rue (2009). Improved auxiliary mixture sampling for hierarchical models of non-Gaussian data. Statistics and Computing 19, 479-492.

Frühwirth-Schnatter, S. and H. Wagner (2006). Auxiliary mixture sampling for parameter-driven models of time series of small counts with applications to state space modeling. Biometrika 93, 827-841.

Kim, S., N. Shephard, and S. Chib (1998). Stochastic volatility: Likelihood inference and comparison with ARCH models. Review of Economic Studies 65, 361-393.

Koopman, S. J., A. Lucas, and R. Lit (2014). The Dynamic Skellam Model with Applications. TI Discussion Paper.

Münnix, M. C., R. Schäfer, and T. Guhr (2010). Impact of the tick-size on financial returns and correlations. Physica A: Statistical Mechanics and its Applications 389 (21), 4828-4843.

Omori, Y., S. Chib, N. Shephard, and J. Nakajima (2007). Stochastic volatility with leverage: fast and efficient likelihood inference. Journal of Econometrics 140, 425-449.

Poirier, D. J. (1973). Piecewise Regression Using Cubic Splines. Journal of the American Statistical Association 68, 515-524.

Rydberg, T. H. and N. Shephard (2003). Dynamics of trade-by-trade price movements: Decomposition and models. Journal of Financial Econometrics 1, 2-25.

Stroud, J. R. and M. S. Johannes (2014). Bayesian modeling and forecasting of 24-hour high-frequency volatility: A case study of the financial crisis.

Weinberg, J., L. D. Brown, and J. R. Stroud (2007). Bayesian forecasting of an inhomogeneous Poisson process with application to call center data. Journal of the American Statistical Association 102, 1185-1199.
A Numerical issues with the Skellam distribution

In general it is good practice to use the scaled version of the modified Bessel function of the
first kind,

    exp(−z) In(z).                                                              (A1)

For the special case of the Skellam distribution this is

    exp(−2λ) I|k|(2λ).                                                          (A2)

For small and large 2λ this can still be unstable, but we can use the following approximations.
For small 2λ,

    exp(−2λ) I|k|(2λ) ≈ exp(−2λ) ( 1 / Γ(|k| + 1) ) (2λ/2)^|k| ≈ 1 × λ^|k| / Γ(|k| + 1),   (A3)

while for large 2λ we can use

    exp(−2λ) I|k|(2λ) ≈ exp(−2λ) exp(2λ) / sqrt(2π · 2λ) = 1 / ( 2 sqrt(πλ) ).  (A4)
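In SciPy the scaled Bessel function (A1) is available directly as `scipy.special.ive`, which avoids the 0 · ∞ problem of the naive product for large λ; a small check with an arbitrary large λ:

```python
import numpy as np
from scipy.special import iv, ive
from scipy.stats import skellam

lam, k = 400.0, 3                                # arbitrary values; 2*lam is large
naive = np.exp(-2 * lam) * iv(abs(k), 2 * lam)   # exp(-800) * inf -> nan
stable = ive(abs(k), 2 * lam)                    # ive(n, z) = exp(-|z|) iv(n, z)
print(naive, stable)
print(skellam.pmf(k, lam, lam))                  # the zero mean Skellam pmf, eq. (9)
```

For λ+ = λ−, the ratio term in (9) is one, so the pmf reduces exactly to the scaled Bessel value.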
B NB distribution

A different parametrization of the NB distribution is

    f(k; ν, p) = ( Γ(ν + k) / ( Γ(ν) Γ(k + 1) ) ) p^k (1 − p)^ν.                (A5)

Using

    λ = ν p / (1 − p)   ⇒   p = λ / (λ + ν),                                    (A6)

we obtain

    f(k; λ, ν) = ( Γ(ν + k) / ( Γ(ν) Γ(k + 1) ) ) ( λ/(ν+λ) )^k ( ν/(ν+λ) )^ν.  (A7)

Mean:

    µ = λ.                                                                      (A8)

Variance:

    σ2 = λ ( 1 + λ/ν ).                                                         (A9)

Dispersion index:

    σ2 / µ = 1 + λ/ν.                                                           (A10)

The NB distribution is overdispersed, which means that, compared to a Poisson distribution
with the same mean, there are more intervals with low counts and more intervals with high
counts. As we increase ν we get back the Poisson case.
The Poisson distribution can be obtained from the NB distribution as follows:

    lim_{ν→∞} f(k; λ, ν) = (λ^k / k!) lim_{ν→∞} [ Γ(ν + k) / ( Γ(ν) (ν + λ)^k ) ] [ 1 / (1 + λ/ν)^ν ]
                         = (λ^k / k!) lim_{ν→∞} [ (ν + k − 1) · · · ν / (ν + λ)^k ] [ 1 / (1 + λ/ν)^ν ]
                         = (λ^k / k!) · 1 · (1 / e^λ) = Poi(λ).                 (A11)

A random variable Y ∼ NB(λ, ν) can be written as a Poisson-Gamma mixture, that is, a
Poisson distribution with Gamma heterogeneity, where the Gamma heterogeneity has mean 1:

    Y ∼ Poi(λU)   where   U ∼ Ga(ν, ν),                                         (A12)

where the Ga(α, β) density is given by

    f(x; α, β) = β^α x^(α−1) e^(−βx) / Γ(α).                                    (A13)

Indeed,

    f(k; λ, ν) = ∫0∞ fPoisson(k; λu) fGamma(u; ν, ν) du
               = ∫0∞ [ (λu)^k e^(−λu) / k! ] [ ν^ν u^(ν−1) e^(−νu) / Γ(ν) ] du
               = ( λ^k ν^ν / ( k! Γ(ν) ) ) ∫0∞ e^(−(λ+ν)u) u^(k+ν−1) du.

Substituting (λ + ν) u = s we get

               = ( λ^k ν^ν / ( k! Γ(ν) ) ) ∫0∞ e^(−s) ( s^(k+ν−1) / (λ + ν)^(k+ν−1) ) ( 1 / (λ + ν) ) ds
               = ( λ^k ν^ν / ( k! Γ(ν) ) ) ( 1 / (λ + ν)^(k+ν) ) ∫0∞ e^(−s) s^(k+ν−1) ds
               = ( λ^k ν^ν / ( k! Γ(ν) ) ) ( Γ(k + ν) / (λ + ν)^(k+ν) )
               = ( Γ(ν + k) / ( Γ(ν) Γ(k + 1) ) ) ( λ/(ν+λ) )^k ( ν/(ν+λ) )^ν.  (A14)
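The mixture representation (A12) is easy to confirm by simulation; the sketch below, with arbitrary λ and ν, compares draws of Poi(λU) with the NB pmf in (A7), using SciPy's nbinom with success probability ν/(λ+ν):

```python
import numpy as np
from scipy.stats import nbinom

rng = np.random.default_rng(5)
lam, nu, n = 3.0, 2.5, 300_000            # arbitrary parameter values

u = rng.gamma(nu, 1 / nu, n)              # Ga(nu, nu) heterogeneity, mean 1
y = rng.poisson(lam * u)                  # Y ~ Poi(lam * U), eq. (A12)

print(y.mean(), y.var())                  # ~ lam = 3 and lam*(1 + lam/nu) = 6.6
# matches NB of eq. (A7); scipy's success probability is nu/(lam+nu)
print(np.mean(y == 0), nbinom.pmf(0, nu, nu / (lam + nu)))
```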
C Daily volatility patterns

We want to approximate a function f : R → R with a continuous function which is built up
from piecewise polynomials of degree at most three. Let the set ∆ = {k0, . . . , kK} denote the
set of knots kj, j = 0, . . . , K. ∆ is sometimes called a mesh on [k0, kK]. Let y = {y0, . . . , yK},
where yj = f(kj). We denote a cubic spline on ∆ interpolating to y by S∆(x). S∆(x) has to
satisfy:

1. S∆(x) ∈ C2[k0, kK];

2. S∆(x) coincides with a polynomial of degree at most three on the intervals [kj−1, kj] for
j = 1, . . . , K;

3. S∆(kj) = yj for j = 0, . . . , K.

By condition 2 we know that S∆''(x) is a linear function on [kj−1, kj], which means that we can
write S∆''(x) as

    S∆''(x) = ( (kj − x) / hj ) Mj−1 + ( (x − kj−1) / hj ) Mj    for x ∈ [kj−1, kj],   (A15)

where Mj = S∆''(kj) and hj = kj − kj−1. Integrating S∆''(x) and solving for the two integration
constants (using S∆(kj) = yj), Poirier (1973) shows that we get

    S∆'(x) = ( hj/6 − (kj − x)2/(2hj) ) Mj−1 + ( (x − kj−1)2/(2hj) − hj/6 ) Mj
             + ( yj − yj−1 ) / hj    for x ∈ [kj−1, kj],                        (A16)

    S∆(x) = ( (kj − x)/(6hj) ) [ (kj − x)2 − hj2 ] Mj−1 + ( (x − kj−1)/(6hj) ) [ (x − kj−1)2 − hj2 ] Mj
            + ( (kj − x)/hj ) yj−1 + ( (x − kj−1)/hj ) yj    for x ∈ [kj−1, kj].   (A17)

In the above expression only the Mj for j = 0, . . . , K are unknown. We can use the continuity
restrictions which enforce equality of the derivatives at the knots kj for j = 1, . . . , K − 1:

    S∆'(kj−) = hj Mj−1 / 6 + hj Mj / 3 + ( yj − yj−1 ) / hj,                    (A18)

    S∆'(kj+) = −hj+1 Mj / 3 − hj+1 Mj+1 / 6 + ( yj+1 − yj ) / hj+1,             (A19)

which yields K − 1 conditions

    (1 − λj) Mj−1 + 2 Mj + λj Mj+1 = 6 yj−1 / ( hj (hj + hj+1) ) − 6 yj / ( hj hj+1 )
                                     + 6 yj+1 / ( hj+1 (hj + hj+1) ),           (A20)

where

    λj = hj+1 / ( hj + hj+1 ).                                                  (A21)

Using two end conditions we have K + 1 unknowns and K + 1 equations, and we can solve the
linear equation system for the Mj. Using the end conditions M0 = π0 M1 and MK = πK MK−1 we
can write

2

 1-λ
1


 0

 ..
Λ
=
 .
|{z}

 0
(K+1)×(K+1)


 0

0








Θ
=

|{z}


(K+1)×(K+1)




-2 π0
0
...
0
0
0

2
λ1
...
0
0
0
1-λ2
..
.
2
..
.
...
0
..
.
0
..
.
0
..
.
0
0
...
2
λK−2
0
0
0
...
1-λK−1
2
λK−1
0
0
...
0
-2 πK
2














(A22)
0
0
0
...
0
0
0

6
h1 (h1 +h2 )
- h16h2
...
0
0
0
0
..
.
6
h2 (h2 +h3 )
6
h2 (h1 +h2 )
- h26h3
...
0
..
.
0
..
.
0
..
.
0
0
0
...
- hK−26hK−1
0
0
0
0
...
6
hK−1 (hK−1 +hK )
6
hK−1 (hK−2 +hK−1 )
6
- hK−1
hK
6
hK (hK−1 +hK )
0
0
0
...
0
0
0














..
.
..
.
(A23)

M0

 M
1


..

m =
.
|{z}

(K+1)×1
 MK−1

MK

y0

 y
1


..

y
=
.
|{z}

 yK−1
(K+1)×1

yK










(A24)










(A25)
The linear equation system is given by

\[
\Lambda m = \Theta y \tag{A26}
\]

and the solution is

\[
m = \Lambda^{-1} \Theta y \tag{A27}
\]
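To make the construction concrete, the following sketch (our own illustration, not the authors' code) assembles Λ and the right-hand side of (A20) for a given mesh and solves Λm = Θy by Gaussian elimination; for f(x) = x², whose second derivative is constant, the end conditions π₀ = π_K = 1 make every solved Mⱼ equal to 2:

```python
def spline_system(knots, y, pi0=1.0, piK=1.0):
    # Build Lambda (A22) and the right-hand side Theta*y (A20, A23) and solve
    # Lambda m = Theta y  (A26)-(A27) for the second derivatives M_j.
    K = len(knots) - 1
    h = [knots[j] - knots[j - 1] for j in range(1, K + 1)]   # h_1..h_K
    n = K + 1
    Lam = [[0.0] * n for _ in range(n)]
    rhs = [0.0] * n
    Lam[0][0], Lam[0][1] = 2.0, -2.0 * pi0                   # end condition M0 = pi0*M1
    Lam[K][K - 1], Lam[K][K] = -2.0 * piK, 2.0               # end condition MK = piK*M_{K-1}
    for j in range(1, K):
        hj, hj1 = h[j - 1], h[j]
        lam_j = hj1 / (hj + hj1)
        Lam[j][j - 1], Lam[j][j], Lam[j][j + 1] = 1.0 - lam_j, 2.0, lam_j
        rhs[j] = (6 * y[j - 1] / (hj * (hj + hj1))
                  - 6 * y[j] / (hj * hj1)
                  + 6 * y[j + 1] / (hj1 * (hj + hj1)))
    # Gaussian elimination with partial pivoting
    A = [row[:] + [b] for row, b in zip(Lam, rhs)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(c + 1, n):
            f = A[r][c] / A[c][c]
            for cc in range(c, n + 1):
                A[r][cc] -= f * A[c][cc]
    m = [0.0] * n
    for r in range(n - 1, -1, -1):
        m[r] = (A[r][n] - sum(A[r][cc] * m[cc] for cc in range(r + 1, n))) / A[r][r]
    return m

knots = [0.0, 0.7, 1.5, 2.0, 3.1]
y = [k * k for k in knots]              # f(x) = x^2 has constant second derivative 2
M = spline_system(knots, y)
assert all(abs(Mi - 2.0) < 1e-9 for Mi in M)
```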
Using this result and equation (A17) we can calculate

\[
\underbrace{S_\Delta(\xi)}_{N \times 1} = \begin{pmatrix} S_\Delta(\xi_1) \\ S_\Delta(\xi_2) \\ \vdots \\ S_\Delta(\xi_{N-1}) \\ S_\Delta(\xi_N) \end{pmatrix} \tag{A28}
\]
Let us denote by P the N × (K + 1) matrix whose ith row, i = 1, …, N, given that kⱼ₋₁ ≤ ξᵢ ≤ kⱼ, can be written as

\[
\underbrace{p_i}_{1 \times (K+1)} = \Big( \underbrace{0, \ldots, 0}_{\text{first } j-2}, \; \frac{k_j - \xi_i}{6h_j}\big[(k_j - \xi_i)^2 - h_j^2\big], \; \frac{\xi_i - k_{j-1}}{6h_j}\big[(\xi_i - k_{j-1})^2 - h_j^2\big], \; \underbrace{0, \ldots, 0}_{\text{last } K+1-j} \Big) \tag{A29}
\]
Moreover, denote by Q the N × (K + 1) matrix whose ith row, i = 1, …, N, given that kⱼ₋₁ ≤ ξᵢ ≤ kⱼ, can be written as

\[
\underbrace{q_i}_{1 \times (K+1)} = \Big( \underbrace{0, \ldots, 0}_{\text{first } j-2}, \; \frac{k_j - \xi_i}{h_j}, \; \frac{\xi_i - k_{j-1}}{h_j}, \; \underbrace{0, \ldots, 0}_{\text{last } K+1-j} \Big) \tag{A30}
\]
Now using (A17) and (A27) we get

\[
S_\Delta(\xi) = P m + Q y = P \Lambda^{-1} \Theta y + Q y = (P \Lambda^{-1} \Theta + Q) y = \underbrace{W}_{N \times (K+1)} \underbrace{y}_{(K+1) \times 1} \tag{A31}
\]

where

\[
W = P \Lambda^{-1} \Theta + Q \tag{A32}
\]
In practical situations we might only know the knots and observe the spline values with error. In this case we have

\[
s = S_\Delta(\xi) + \varepsilon = W y + \varepsilon, \tag{A33}
\]

where

\[
\underbrace{s}_{N \times 1} = \begin{pmatrix} s_1 \\ s_2 \\ \vdots \\ s_{N-1} \\ s_N \end{pmatrix} \tag{A34}
\]

and

\[
\underbrace{\varepsilon}_{N \times 1} = \begin{pmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_{N-1} \\ \varepsilon_N \end{pmatrix} \tag{A35}
\]

with

\[
E(\varepsilon) = 0 \quad \text{and} \quad E(\varepsilon \varepsilon') = \sigma^2 I \tag{A36}
\]
Notice that after fixing the knots we only have to estimate the value of the spline at the knots, and this determines the whole shape of the spline. We can do this by simple OLS:

\[
\hat y = (W^\top W)^{-1} W^\top s \tag{A37}
\]
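The OLS step (A37) amounts to solving the normal equations W'W y = W's; a tiny self-contained sketch with a hypothetical two-column design matrix:

```python
def ols(W, s):
    # OLS fit (A37): solve the normal equations (W'W) y = W' s for a 2-column design
    a = sum(r[0] * r[0] for r in W)
    b = sum(r[0] * r[1] for r in W)
    d = sum(r[1] * r[1] for r in W)
    u = sum(r[0] * si for r, si in zip(W, s))
    v = sum(r[1] * si for r, si in zip(W, s))
    det = a * d - b * b
    return [(d * u - b * v) / det, (a * v - b * u) / det]

W = [[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
y_true = [3.0, -1.0]
s = [sum(wc * yc for wc, yc in zip(row, y_true)) for row in W]   # noise-free "spline values"
assert all(abs(e - t) < 1e-12 for e, t in zip(ols(W, s), y_true))
```

With noise-free data the OLS estimate recovers the true spline values exactly; with measurement error it gives the least-squares fit.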
For identification reasons we want

\[
\sum_{j:\,\text{unique } \xi_j} S_\Delta(\xi_j) = \sum_{j:\,\text{unique } \xi_j} w_j y = w^* y = 0 \tag{A38}
\]

where wⱼ is the jth row of W and

\[
\underbrace{w^*}_{1 \times (K+1)} = \sum_{j:\,\text{unique } \xi_j} w_j \tag{A39}
\]

The restriction can be enforced by expressing one of the elements of y in terms of the others. This ensures that E(sₜ) = 0, so sₜ and µθ can be identified. If we drop y_K we can substitute

\[
y_K = -\sum_{i=0}^{K-1} (w^*_i / w^*_K)\, y_i \tag{A40}
\]
where wᵢ* is the ith element of w*. Substituting this into (A38) gives

\begin{align}
\sum_{j:\,\text{unique } \xi_j} S_\Delta(\xi_j)
&= \sum_{j:\,\text{unique } \xi_j} w_j y
= \sum_{j:\,\text{unique } \xi_j} \sum_{i=0}^{K} w_{ji} y_i
= \sum_{j:\,\text{unique } \xi_j} \left[ \sum_{i=0}^{K-1} w_{ji} y_i - w_{jK} \sum_{i=0}^{K-1} (w^*_i / w^*_K) y_i \right] \nonumber\\
&= \sum_{j:\,\text{unique } \xi_j} \sum_{i=0}^{K-1} (w_{ji} - w_{jK} w^*_i / w^*_K)\, y_i
= \sum_{i=0}^{K-1} \sum_{j:\,\text{unique } \xi_j} (w_{ji} - w_{jK} w^*_i / w^*_K)\, y_i \nonumber\\
&= \sum_{i=0}^{K-1} (w^*_i - w^*_K w^*_i / w^*_K)\, y_i
= \sum_{i=0}^{K-1} (w^*_i - w^*_i)\, y_i = 0 \tag{A41}
\end{align}
Let us partition W in the following way

\[
\underbrace{W}_{N \times (K+1)} = [\underbrace{W_{-K}}_{N \times K} : \underbrace{W_K}_{N \times 1}] \tag{A42}
\]

where W₋K is equal to the first K columns of W and W_K is the Kth column of W. Moreover

\[
\underbrace{w^*}_{1 \times (K+1)} = [\underbrace{w^*_{-K}}_{1 \times K} : \underbrace{w^*_K}_{1 \times 1}] \tag{A43}
\]

We can define

\[
\underbrace{\widetilde W}_{N \times K} = \underbrace{W_{-K}}_{N \times K} - \frac{1}{w^*_K} \underbrace{W_K}_{N \times 1} \underbrace{w^*_{-K}}_{1 \times K} \tag{A44}
\]

and we have

\[
s = S_\Delta(\xi) + \varepsilon = \underbrace{\widetilde W}_{N \times K} \underbrace{\tilde y}_{K \times 1} + \varepsilon. \tag{A45}
\]
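The point of (A44)–(A45) is that once y_K is set by (A40), the reduced regression reproduces the full fit exactly: W̃ỹ = Wy. A small numerical illustration (the 4×3 matrix and the choice of w* as plain column sums are ours, purely for illustration):

```python
W = [[1.0, 2.0, 0.5],
     [0.3, 1.0, 1.2],
     [2.0, 0.1, 0.7],
     [0.5, 0.5, 0.5]]
# for illustration take w* as the column sums of W (sum over all rows)
w_star = [sum(row[c] for row in W) for c in range(3)]
y0, y1 = 1.5, -0.4
yK = -(w_star[0] * y0 + w_star[1] * y1) / w_star[2]      # (A40): enforces w* . y = 0
y = [y0, y1, yK]
# (A44): W_tilde = W_{-K} - (1/w*_K) W_K w*_{-K}
W_tilde = [[W[r][c] - W[r][2] * w_star[c] / w_star[2] for c in range(2)] for r in range(4)]
full = [sum(W[r][c] * y[c] for c in range(3)) for r in range(4)]       # W y
reduced = [sum(W_tilde[r][c] * y[c] for c in range(2)) for r in range(4)]  # W_tilde y_tilde
assert all(abs(a - b) < 1e-12 for a, b in zip(full, reduced))
```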
D MCMC estimation of the dynamic ∆NB model

D.1 Generating the parameters x, µθ, φ, ση² (Step 2)
Notice that conditional on R = {r_{tj}, t = 1, …, T, j = 1, …, min(Nₜ + 1, 2)}, τ, N, γ and s we have

\[
-\log \tau_{t1} = \log(z_{1t} + z_{2t}) + \mu_\theta + s_t + x_t + m_{r_{t1}}(1) + \varepsilon_{t1}, \qquad \varepsilon_{t1} \sim N(0, v^2_{r_{t1}}(1)) \tag{A46}
\]

and

\[
-\log \tau_{t2} = \log(z_{1t} + z_{2t}) + \mu_\theta + s_t + x_t + m_{r_{t2}}(N_t) + \varepsilon_{t2}, \qquad \varepsilon_{t2} \sim N(0, v^2_{r_{t2}}(N_t)) \tag{A47}
\]

which implies the following state space form

\[
\underbrace{\tilde y_t}_{\min(N_t+1,2)\times 1} =
\underbrace{\begin{pmatrix} 1 & w_t' & 1 \\ 1 & w_t' & 1 \end{pmatrix}}_{\min(N_t+1,2)\times(K+2)}
\underbrace{\begin{pmatrix} \mu_\theta \\ \beta \\ x_t \end{pmatrix}}_{(K+2)\times 1}
+ \underbrace{\varepsilon_t}_{\min(N_t+1,2)\times 1},
\qquad \varepsilon_t \sim N(0, H_t), \tag{A48}
\]

where only the first row is present when Nₜ = 0, and

\[
\alpha_{t+1} = \underbrace{\begin{pmatrix} \mu_\theta \\ \beta \\ x_{t+1} \end{pmatrix}}_{(K+2)\times 1}
= \underbrace{\begin{pmatrix} 1 & 0 & 0 \\ 0 & I_K & 0 \\ 0 & 0 & \phi \end{pmatrix}}_{(K+2)\times(K+2)}
\underbrace{\begin{pmatrix} \mu_\theta \\ \beta \\ x_t \end{pmatrix}}_{(K+2)\times 1}
+ \underbrace{\begin{pmatrix} 0 \\ 0 \\ \eta_{t+1} \end{pmatrix}}_{(K+2)\times 1},
\qquad \eta_{t+1} \sim N(0, \sigma^2_\eta), \tag{A49--A50}
\]

where

\[
\underbrace{\begin{pmatrix} \mu_\theta \\ \beta \\ x_1 \end{pmatrix}}_{(K+2)\times 1}
\sim N\!\left( \underbrace{\begin{pmatrix} \mu_0 \\ \beta_0 \\ 0 \end{pmatrix}}_{(K+2)\times 1},
\underbrace{\begin{pmatrix} \sigma^2_\mu & 0 & 0 \\ 0 & \sigma^2_\beta I_K & 0 \\ 0 & 0 & \sigma^2_\eta/(1-\phi^2) \end{pmatrix}}_{(K+2)\times(K+2)} \right), \tag{A51}
\]

Hₜ = diag(v²_{r_{t1}}(1), v²_{r_{t2}}(Nₜ)) and

\[
\underbrace{\tilde y_t}_{\min(N_t+1,2)\times 1} =
\begin{pmatrix} -\log \tau_{t1} - m_{r_{t1}}(1) - \log(z_{1t} + z_{2t}) \\ -\log \tau_{t2} - m_{r_{t2}}(N_t) - \log(z_{1t} + z_{2t}) \end{pmatrix}. \tag{A52}
\]

D.2 Generating γ (Step 3)
\[
p(\gamma|\nu, \mu_\theta, \phi, \sigma^2_\eta, x, R, s, \tau, N, z_1, z_2, y) = p(\gamma|\nu, \mu_\theta, s, x, y) \tag{A53}
\]

because given ν, λ and y, the variables R, τ, N, z₁, z₂ are redundant. Moreover,

\[
p(\gamma|\nu, \mu_\theta, s, x, y) \propto p(y|\gamma, \nu, \mu_\theta, s, x)\, p(\gamma|\nu, \mu_\theta, s, x) = p(y|\gamma, \nu, \mu_\theta, s, x)\, p(\gamma) \tag{A54}
\]

as γ is independent of ν and λₜ = exp(µθ + sₜ + xₜ).
\begin{align}
p(y|\gamma, \nu, \mu_\theta, s, x)\, p(\gamma)
&= \prod_{t=1}^{T}\left[\gamma 1_{\{y_t=0\}} + (1-\gamma)\left(\frac{\nu}{\lambda_t+\nu}\right)^{2\nu}\left(\frac{\lambda_t}{\lambda_t+\nu}\right)^{|y_t|}\frac{\Gamma(\nu+|y_t|)}{\Gamma(\nu)\Gamma(|y_t|+1)}\, F\!\left(\nu+|y_t|, \nu, |y_t|+1; \left(\frac{\lambda_t}{\lambda_t+\nu}\right)^{2}\right)\right] \nonumber\\
&\qquad \times \frac{\gamma^{a-1}(1-\gamma)^{b-1}}{B(a,b)} \nonumber\\
&\propto \prod_{t=1}^{T}\left[\gamma^{a}(1-\gamma)^{b-1} 1_{\{y_t=0\}} + \gamma^{a-1}(1-\gamma)^{b}\left(\frac{\nu}{\lambda_t+\nu}\right)^{2\nu}\left(\frac{\lambda_t}{\lambda_t+\nu}\right)^{|y_t|}\frac{\Gamma(\nu+|y_t|)}{\Gamma(\nu)\Gamma(|y_t|+1)}\, F\!\left(\nu+|y_t|, \nu, |y_t|+1; \left(\frac{\lambda_t}{\lambda_t+\nu}\right)^{2}\right)\right] \nonumber
\end{align}

We can carry out an independence MH step to sample from this density using a truncated normal or normal proposal with mean equal to the mode of the above distribution and variance equal to the negative inverse Hessian of the log density at the mode.
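The ∆NB mass function appearing in the product above can be checked numerically: the difference of two i.i.d. NB(ν, ν/(λ+ν)) variables has exactly this form, with F the Gauss hypergeometric function, here evaluated by its power series (a sketch under our own naming, not the authors' code):

```python
import math

def nb_pmf(k, lam, nu):
    # NB(nu, p) pmf with p = nu/(lam + nu), so the mean is lam
    p = nu / (lam + nu)
    return math.exp(math.lgamma(nu + k) - math.lgamma(nu) - math.lgamma(k + 1)) * p ** nu * (1 - p) ** k

def dnb_pmf(k, lam, nu, terms=400):
    # Delta-NB pmf via the hypergeometric series F(nu+|k|, nu; |k|+1; (lam/(lam+nu))^2)
    k = abs(k)
    x = (lam / (lam + nu)) ** 2
    f, term = 1.0, 1.0
    for j in range(1, terms):
        term *= (nu + k + j - 1) * (nu + j - 1) / ((k + j) * j) * x
        f += term
    pref = math.exp(math.lgamma(nu + k) - math.lgamma(nu) - math.lgamma(k + 1))
    return (nu / (lam + nu)) ** (2 * nu) * (lam / (lam + nu)) ** k * pref * f

def dnb_pmf_conv(k, lam, nu, trunc=300):
    # brute-force convolution of two independent NB variables as a reference
    return sum(nb_pmf(abs(k) + j, lam, nu) * nb_pmf(j, lam, nu) for j in range(trunc))

for k in (-3, 0, 1, 4):
    assert abs(dnb_pmf(k, 2.0, 5.0) - dnb_pmf_conv(k, 2.0, 5.0)) < 1e-10
```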
D.3 Generating the auxiliary variables R, τ, N, z₁, z₂, ν (Step 4)

\begin{align}
p(R, \tau, N, z_1, z_2, \nu|\gamma, \mu_\theta, \phi, \sigma^2_\eta, s, x, y)
&= p(R|\tau, N, z_1, z_2, \gamma, \nu, \mu_\theta, \phi, \sigma^2_\eta, s, x, y) \nonumber\\
&\quad \times p(\tau|N, z_1, z_2, \gamma, \nu, \mu_\theta, \phi, \sigma^2_\eta, s, x, y) \nonumber\\
&\quad \times p(N|z_1, z_2, \gamma, \nu, \mu_\theta, \phi, \sigma^2_\eta, s, x, y) \nonumber\\
&\quad \times p(z_1, z_2|\gamma, \nu, \mu_\theta, \phi, \sigma^2_\eta, s, x, y) \nonumber\\
&\quad \times p(\nu|\gamma, \mu_\theta, \phi, \sigma^2_\eta, s, x, y) \tag{A55}
\end{align}
Generating ν (Step 4a)

Note that

\begin{align}
p(\nu|\gamma, \mu_\theta, \phi, \sigma^2_\eta, s, x, y) &= p(\nu|\gamma, \lambda, y) \nonumber\\
&\propto p(\nu, \gamma, \lambda, y) \nonumber\\
&= p(y|\gamma, \lambda, \nu)\, p(\lambda|\gamma, \nu)\, p(\gamma|\nu)\, p(\nu) \nonumber\\
&= p(y|\gamma, \lambda, \nu)\, p(\lambda)\, p(\gamma)\, p(\nu) \nonumber\\
&\propto p(y|\gamma, \lambda, \nu)\, p(\nu) \tag{A56}
\end{align}

where p(y|γ, λ, ν) is a product of zero-inflated ∆NB probability mass functions. We can draw ν using a Laplace approximation or an adaptive random walk Metropolis–Hastings procedure.
An alternative way of drawing ν is using a discrete uniform prior ν ∼ DU(2, 128) and a random walk proposal, as suggested by Stroud and Johannes (2014) for the degrees of freedom parameter of a t density. We can write the posterior as a multinomial distribution p(ν|µθ, x, z₁, z₂) ∼ M(π₂*, …, π₁₂₈*) with probabilities

\[
\pi^*_\nu \propto \prod_{t=1}^{T}\Big[\gamma I_{\{y_t=0\}} + (1-\gamma) f_{\Delta NB}(y_t; \lambda_t, \nu)\Big] = \prod_{t=1}^{T} g_\nu(y_t) \tag{A57}
\]
To avoid the computationally intensive evaluation of these probabilities we can use a Metropolis–Hastings update. We can draw the proposal ν* from the neighbourhood of the current value ν⁽ⁱ⁾ using a discrete uniform distribution ν* ∼ DU(ν⁽ⁱ⁾ − δ, ν⁽ⁱ⁾ + δ) and accept with probability

\[
\min\left\{1, \frac{\prod_{t=1}^{T} g_{\nu^*}(y_t)}{\prod_{t=1}^{T} g_{\nu^{(i)}}(y_t)}\right\} \tag{A58}
\]

δ is chosen such that the acceptance rate is reasonable.
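The discrete random-walk Metropolis–Hastings update for ν can be sketched as follows; the quadratic log target is a stand-in for log ∏ₜ g_ν(yₜ), used only to exercise the sampler:

```python
import math
import random

random.seed(1)

def rw_mh_discrete(logg, lo, hi, delta, start, iters):
    # random-walk MH on {lo,...,hi}: propose nu* ~ DU(nu-delta, nu+delta) and
    # accept with probability min{1, g(nu*)/g(nu)} (A58); proposals outside
    # the support are rejected
    nu = start
    draws = []
    for _ in range(iters):
        prop = random.randint(nu - delta, nu + delta)
        if lo <= prop <= hi and random.random() < math.exp(min(0.0, logg(prop) - logg(nu))):
            nu = prop
        draws.append(nu)
    return draws

# toy stand-in target on {2,...,128}: discretized normal centred at 50
logg = lambda nu: -(nu - 50.0) ** 2 / 50.0
draws = rw_mh_discrete(logg, 2, 128, 3, 50, 20000)
mean = sum(draws) / len(draws)
assert 48 < mean < 52 and min(draws) >= 2 and max(draws) <= 128
```

In the actual sampler, `logg` would be the log of ∏ₜ gν(yₜ) evaluated only at the current and proposed values of ν, which is the whole computational point of the update.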
Generating z₁, z₂ (Step 4b)

Notice that (z₁ₜ, z₂ₜ) are independent across t given γ, ν, µθ, s, x, y:

\[
p(z_1, z_2|\gamma, \nu, \mu_\theta, \phi, \sigma^2_\eta, s, x, y) = \prod_{t=1}^{T} p(z_{1t}, z_{2t}|\gamma, \nu, \mu_\theta, \phi, \sigma^2_\eta, s_t, x_t, y_t) \tag{A59}
\]

\begin{align}
p(z_{1t}, z_{2t}|\gamma, \nu, \mu_\theta, \phi, \sigma^2_\eta, s_t, x_t, y_t) &\propto p(z_{1t}, z_{2t}, \gamma, \nu, \mu_\theta, \phi, \sigma^2_\eta, s_t, x_t, y_t) \nonumber\\
&= p(y_t|z_{1t}, z_{2t}, \gamma, \nu, \mu_\theta, \phi, \sigma^2_\eta, s_t, x_t)\, p(z_{1t}, z_{2t}|\gamma, \nu, \mu_\theta, \phi, \sigma^2_\eta, s_t, x_t) \tag{A60}
\end{align}

so that

\[
p(z_{1t}, z_{2t}|\gamma, \nu, \mu_\theta, \phi, \sigma^2_\eta, s_t, x_t, y_t) \propto g(z_{1t}, z_{2t})\, \frac{\nu^\nu z_{1t}^{\nu-1} e^{-\nu z_{1t}}}{\Gamma(\nu)}\, \frac{\nu^\nu z_{2t}^{\nu-1} e^{-\nu z_{2t}}}{\Gamma(\nu)} \tag{A61}
\]

where

\[
g(z_{1t}, z_{2t}) = \gamma 1_{\{y_t=0\}} + (1-\gamma) \exp\big(-\lambda_t(z_{1t}+z_{2t})\big) \left(\frac{z_{1t}}{z_{2t}}\right)^{y_t/2} I_{|y_t|}\big(2\lambda_t\sqrt{z_{1t} z_{2t}}\big) \tag{A62}
\]

with λₜ = exp(µθ + sₜ + xₜ). We can carry out an independence MH step by sampling z*₁ₜ, z*₂ₜ from Ga(λₜ, ν) and accepting with probability

\[
\min\left\{\frac{g(z^*_{1t}, z^*_{2t})}{g(z_{1t}, z_{2t})}, 1\right\} \tag{A63}
\]
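For yₜ ≠ 0 the second term of g(z₁ₜ, z₂ₜ) in (A62) is the Skellam probability of yₜ with intensities λₜz₁ₜ and λₜz₂ₜ, so summing it over yₜ must give one. A sketch that checks this with a series evaluation of the modified Bessel function (our own helper names):

```python
import math

def bessel_i(n, x, terms=60):
    # modified Bessel function I_n(x) via its power series (x > 0, integer n >= 0)
    return sum(math.exp((2 * m + n) * math.log(x / 2.0)
                        - math.lgamma(m + 1) - math.lgamma(m + n + 1))
               for m in range(terms))

def g(z1, z2, lam, y, gamma):
    # unnormalised conditional of (z1, z2) given y_t, cf. (A62); for y != 0 the
    # second term is the Skellam pmf of y with intensities lam*z1 and lam*z2
    core = (math.exp(-lam * (z1 + z2)) * (z1 / z2) ** (y / 2.0)
            * bessel_i(abs(y), 2.0 * lam * math.sqrt(z1 * z2)))
    return gamma * (1.0 if y == 0 else 0.0) + (1.0 - gamma) * core

# with gamma = 0, g(., ., lam, y, 0) is a proper pmf in y
total = sum(g(1.3, 0.8, 2.0, y, 0.0) for y in range(-40, 41))
assert abs(total - 1.0) < 1e-9
```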
Generating N (Step 4c)

The numbers of jumps are independent across t given γ, ν, µθ, φ, ση², s, x, z₁, z₂, y, which means

\[
p(N|\gamma, \nu, \mu_\theta, \phi, \sigma^2_\eta, s, x, z_1, z_2, y) = \prod_{t=1}^{T} p(N_t|\gamma, \nu, \mu_\theta, \phi, \sigma^2_\eta, s_t, x_t, z_{1t}, z_{2t}, y_t). \tag{A64}
\]

For a given t we can draw Nₜ from a discrete distribution with

\begin{align}
p(N_t = n|\gamma, \nu, \mu_\theta, \phi, \sigma^2_\eta, s_t, x_t, z_{1t}, z_{2t}, y_t)
&= \frac{p(N_t = n, y_t = k|\gamma, \nu, \mu_\theta, \phi, \sigma^2_\eta, s_t, x_t, z_{1t}, z_{2t})}{p(y_t = k|\gamma, \nu, \mu_\theta, \phi, \sigma^2_\eta, s_t, x_t, z_{1t}, z_{2t})} \nonumber\\
&= p(y_t = k|N_t = n, \gamma, \nu, \mu_\theta, \phi, \sigma^2_\eta, s_t, x_t, z_{1t}, z_{2t})\, \frac{p(N_t = n|\gamma, \nu, \mu_\theta, \phi, \sigma^2_\eta, s_t, x_t, z_{1t}, z_{2t})}{p(y_t = k|\gamma, \nu, \mu_\theta, \phi, \sigma^2_\eta, s_t, x_t, z_{1t}, z_{2t})} \nonumber\\
&= \left[\gamma 1_{\{k=0\}} + (1-\gamma)\, p\Big(\textstyle\sum_{i=1}^{n} M_i = k \,\Big|\, N_t = n, z_{1t}, z_{2t}\Big)\right] \frac{p(N_t = n|\gamma, \mu_\theta, \phi, \sigma^2_\eta, s_t, x_t, z_{1t}, z_{2t})}{p(y_t = k|\gamma, \mu_\theta, \phi, \sigma^2_\eta, s_t, x_t, z_{1t}, z_{2t})} \tag{A65}
\end{align}

The denominator is easy to evaluate: it is a Skellam distribution at k with intensities λₜz₁ₜ and λₜz₂ₜ. The probability

\[
p\Big(\sum_{i=1}^{n} M_i = k \,\Big|\, N_t = n, z_{1t}, z_{2t}\Big) \tag{A66}
\]

is not standard. Conditional on z₁ and z₂, yₜ has a Skellam distribution, hence

\[
M_i = \begin{cases} \;\;\,1, & \text{with } P(M_i = 1) = \dfrac{z_{1t}}{z_{1t}+z_{2t}} \\[4pt] -1, & \text{with } P(M_i = -1) = \dfrac{z_{2t}}{z_{1t}+z_{2t}} \end{cases} \tag{A67}
\]

which implies we can represent ∑ᵢ₌₁ⁿ Mᵢ with a tree structure and the binomial distribution: the event ∑ᵢ₌₁ⁿ Mᵢ = k corresponds to (n + k)/2 successes in n trials with success probability z₁ₜ/(z₁ₜ + z₂ₜ). Note that even k can only occur with an even number of trials and odd k with an odd number of trials:

\[
p\Big(\sum_{i=1}^{n} M_i = k \,\Big|\, N_t = n, z_{1t}, z_{2t}\Big) =
\begin{cases}
0, & \text{if } |k| > n \text{ or } |k \bmod 2| \neq |n \bmod 2| \\[4pt]
\dbinom{n}{\frac{n+k}{2}} \left(\dfrac{z_{1t}}{z_{1t}+z_{2t}}\right)^{\frac{n+k}{2}} \left(\dfrac{z_{2t}}{z_{1t}+z_{2t}}\right)^{\frac{n-k}{2}}, & \text{otherwise}
\end{cases} \tag{A68}
\]
The probability p(Nₜ = n|γ, µθ, φ, ση², sₜ, xₜ, z₁ₜ, z₂ₜ) is equal to p(Nₜ = n|µθ, sₜ, xₜ, z₁ₜ, z₂ₜ), which is Poisson with intensity λₜ(z₁ₜ + z₂ₜ). In general we have the following expression for p(Nₜ = n|γ, ν, µθ, φ, ση², sₜ, xₜ, z₁ₜ, z₂ₜ, yₜ) when |yₜ| ≤ n:

\[
\frac{\gamma\, \dfrac{\big(\lambda_t(z_{1t}+z_{2t})\big)^n e^{-\lambda_t(z_{1t}+z_{2t})}}{\Gamma(n+1)}\, 1_{\{y_t=0\}}
+ 1_{\{|y_t| \bmod 2 \,=\, n \bmod 2\}}\, (1-\gamma)\, \dfrac{\big(\lambda_t(z_{1t}+z_{2t})\big)^n e^{-\lambda_t(z_{1t}+z_{2t})}}{\Gamma\big(\frac{n+y_t}{2}+1\big)\Gamma\big(\frac{n-y_t}{2}+1\big)} \left(\dfrac{z_{1t}}{z_{1t}+z_{2t}}\right)^{\frac{n+y_t}{2}} \left(\dfrac{z_{2t}}{z_{1t}+z_{2t}}\right)^{\frac{n-y_t}{2}}}
{\gamma 1_{\{y_t=0\}} + (1-\gamma)\, e^{-\lambda_t(z_{1t}+z_{2t})} \left(\dfrac{z_{1t}}{z_{2t}}\right)^{\frac{y_t}{2}} I_{|y_t|}\big(2\lambda_t\sqrt{z_{1t}z_{2t}}\big)}
\tag{A69}
\]

otherwise it is zero. We can draw Nₜ in parallel over t = 1, …, T by drawing a uniform random variable uₜ ∼ U[0, 1] and setting

\[
N_t = \min\Big\{n : u_t \le \sum_{i=0}^{n} p(N_t = i|\gamma, \mu_\theta, \phi, \sigma^2_\eta, s_t, x_t, y_t, z_{1t}, z_{2t})\Big\} \tag{A70}
\]

Generating τ (Step 4d)

Notice that p(τ|N, z₁, z₂, γ, ν, µθ, φ, ση², x, y) = p(τ|N, µθ, z₁, z₂, s, x). Moreover

\begin{align}
p(\tau|\mu_\theta, z_1, z_2, s, x) &= \prod_{t=1}^{T} p(\tau_{1t}, \tau_{2t}|N_t, \mu_\theta, z_{1t}, z_{2t}, s_t, x_t) \nonumber\\
&= \prod_{t=1}^{T} p(\tau_{1t}|\tau_{2t}, N_t, \mu_\theta, z_{1t}, z_{2t}, s_t, x_t)\, p(\tau_{2t}|N_t, \mu_\theta, z_{1t}, z_{2t}, s_t, x_t) \tag{A71}
\end{align}

where we can sample from p(τ₂ₜ|Nₜ, µθ, z₁ₜ, z₂ₜ, sₜ, xₜ) using the fact that, conditionally on Nₜ, the arrival time τ₂ₜ of the Nₜ-th jump is the maximum of Nₜ uniform random variables and has a Beta(Nₜ, 1) distribution. The arrival time of the (Nₜ + 1)-th jump after 1 is exponentially distributed with intensity λₜ(z₁ₜ + z₂ₜ), hence

\[
\tau_{1t} = 1 + \xi_t - \tau_{2t}, \qquad \xi_t \sim \text{Exp}\big(\lambda_t(z_{1t}+z_{2t})\big) \tag{A72}
\]
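Steps 4c–4d reduce to an inverse-CDF draw (A70) and a Beta/exponential draw (A71)–(A72); a minimal sketch (our own helper names):

```python
import random

random.seed(0)

def draw_nt(probs, u):
    # inverse-CDF draw (A70): the smallest n with u <= cumulative probability
    c = 0.0
    for n, p in enumerate(probs):
        c += p
        if u <= c:
            return n
    return len(probs) - 1

def draw_tau(nt, lam_total):
    # (A71)-(A72): tau_2t is the maximum of Nt uniforms, i.e. Beta(Nt, 1),
    # and tau_1t = 1 + xi_t - tau_2t with xi_t ~ Exp(lam_total)
    tau2 = max(random.random() for _ in range(nt)) if nt > 0 else 0.0
    xi = random.expovariate(lam_total)
    return 1.0 + xi - tau2, tau2

assert draw_nt([0.1, 0.2, 0.7], 0.25) == 1
tau2s = [draw_tau(3, 5.0)[1] for _ in range(20000)]
assert 0.72 < sum(tau2s) / len(tau2s) < 0.78   # Beta(3,1) has mean 3/4
```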
Generating R (Step 4e)

Notice that

\[
p(R|\tau, N, z_1, z_2, \gamma, \nu, \mu_\theta, \phi, \sigma^2_\eta, s, x, y) = p(R|\tau, N, z_1, z_2, \nu, s, x) \tag{A73}
\]

Moreover

\[
p(R|\tau, N, z_1, z_2, \nu, s, x) = \prod_{t=1}^{T} \prod_{j=1}^{\min(N_t+1,2)} p(r_{tj}|\tau_t, N_t, \mu_\theta, z_{1t}, z_{2t}, s_t, x_t) \tag{A74}
\]

Sample r_{t1} from the following discrete distribution

\[
p(r_{t1} = k|\tau_t, N_t, \mu_\theta, z_{1t}, z_{2t}, s_t, x_t) \propto w_k(1)\, \phi\big(-\log \tau_{1t} - \log[\lambda_t(z_{1t}+z_{2t})];\, m_k(1), v^2_k(1)\big) \tag{A75}
\]

where k = 1, …, R(1). If Nₜ > 0 then draw r_{t2} from the discrete distribution

\[
p(r_{t2} = k|\tau_t, N_t, \mu_\theta, z_{1t}, z_{2t}, s_t, x_t) \propto w_k(N_t)\, \phi\big(-\log \tau_{2t} - \log[\lambda_t(z_{1t}+z_{2t})];\, m_k(N_t), v^2_k(N_t)\big) \tag{A76}
\]

for k = 1, …, R(Nₜ).
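Step 4e is a standard draw of a mixture indicator with probabilities proportional to w_k times a normal density evaluated at the observation; a hedged sketch (names are ours) that works on the log scale for numerical stability:

```python
import math
import random

random.seed(2)

def draw_component(e, w, m, v2):
    # draw the indicator r with P(r = k) proportional to w_k * phi(e; m_k, v_k^2),
    # as in (A75)-(A76); computed on the log scale for stability
    logp = [math.log(wk) - 0.5 * (math.log(2 * math.pi * vk) + (e - mk) ** 2 / vk)
            for wk, mk, vk in zip(w, m, v2)]
    mx = max(logp)
    p = [math.exp(l - mx) for l in logp]
    u = random.random() * sum(p)
    c = 0.0
    for k, pk in enumerate(p):
        c += pk
        if u <= c:
            return k
    return len(p) - 1

# a component whose mean is far from e is essentially never chosen
assert draw_component(0.0, [0.5, 0.5], [0.0, 10.0], [1.0, 1.0]) == 0
```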
Tables and Figures
Table 1: Estimation results from a dynamic Skellam and ∆NB model based on 20 000 observations and 100 000 iterations, of which 20 000 were used as a burn-in sample. The 95% HPD regions are in brackets. The true parameters are µ = −1.7, φ = 0.97, σ = 0.02, γ = 0.001 and ν = 15.

        Skellam                    ∆NB
µ       -1.72  [-1.797,-1.642]     -1.726 [-1.804,-1.651]
φ       0.973  [0.965,0.979]       0.975  [0.969,0.981]
σ²      0.018  [0.013,0.023]       0.015  [0.011,0.02]
γ       0.005  [0,0.017]           0.003  [0,0.01]
β1      1.139  [0.884,1.392]       1.128  [0.875,1.38]
β2      -0.306 [-0.453,-0.158]     -0.297 [-0.448,-0.151]
β3      -0.801 [-0.943,-0.657]     -0.793 [-0.933,-0.65]
β4      0.091  [-0.052,0.23]       0.099  [-0.04,0.24]
ν       –                          12.191 [8,16.4]
Table 2: Descriptive statistics of the data from 3rd to 10th October 2008.

              AA                   F                    IBM
              In        Out        In        Out        In         Out
Num. obs      64 807    14 385     32 756    14 313     68 002     20 800
Avg. price    16.75     11.574     3.077     2.112      96.796     87.583
Mean          -0.007    -0.004     -0.007    0          -0.02      -0.004
Std           1.63      2.126      0.745     0.601      6.831      7.09
Min           -33       -51        -18       -10        -197       -105
Max           38        39         21        9          186        140
% Zeros       48.76     48.76      77.08     77.08      39.9       39.9

              JPM                  KO                   XRX
              In        Out        In        Out        In         Out
Num. obs      142 867   43 230     70 356    25 036     26 020     8 623
Avg. price    42.773    38.889     49.203    41.875     9.049      7.768
Mean          -0.009    0.012      -0.012    0.005      -0.006     0.004
Std           2.368     2.779      1.758     2.734      0.816      1.285
Min           -48       -40        -33       -50        -17        -17
Max           74        55         30        63         19         12
% Zeros       43.78     43.78      34.39     34.39      54.98      54.98
Table 3: Descriptive statistics of the data from 23rd to 30th April 2010.

              AA                   F                    IBM
              In        Out        In        Out        In         Out
Num. obs      27 550    4 883      63 241    9 894      43 606     8 587
Avg. price    13.749    13.519     13.734    13.231     130.176    129.575
Mean          -0.001    -0.006     -0.001    -0.006     0.001      -0.019
Std           0.468     0.502      0.448     0.454      1.424      1.371
Min           -3        -2         -5        -2         -22        -15
Max           3         2          4         3          24         9
% Zeros       75.02     75.02      79.73     79.73      51.93      51.93

              JPM                  KO                   XRX
              In        Out        In        Out        In         Out
Num. obs      101 045   21 443     34 469    6 073      36 332     4 326
Avg. price    43.702    42.854     53.628    53.732     11.164     11.025
Mean          -0.001    -0.007     -0.003    -0.006     0          -0.007
Std           0.615     0.638      0.647     0.696      0.494      0.459
Min           -5        -10        -9        -5         -9         -2
Max           5         5          7         5          7          3
% Zeros       68.73     68.73      65.09     65.09      79.29      79.29
Table 4: Estimation results from a dynamic Skellam and ∆NB model during the period from 3rd to 9th October 2008. The posterior mean estimates are based on 100 000 iterations, of which 20 000 were used as a burn-in sample. The 95% HPD regions are in brackets.

        AA                                    F                                     IBM
        Skellam           ∆NB                 Skellam           ∆NB                 Skellam          ∆NB
µ       -0.174            -0.262              -1.873            -1.861              1.935            1.246
        [-0.236,-0.112]   [-0.321,-0.204]     [-1.942,-1.803]   [-1.932,-1.791]     [1.865,2.008]    [1.198,1.294]
φ       0.929             0.941               0.939             0.945               0.881            0.935
        [0.921,0.936]     [0.935,0.946]       [0.931,0.95]      [0.934,0.955]       [0.873,0.888]    [0.93,0.94]
σ²      0.207             0.126               0.112             0.093               0.86             0.124
        [0.184,0.235]     [0.107,0.141]       [0.095,0.132]     [0.077,0.114]       [0.796,0.926]    [0.112,0.133]
γ       0.248             0.243               0.279             0.299               –                –
        [0.24,0.258]      [0.234,0.251]       [0.274,0.285]     [0.294,0.304]
β1      0.436             0.374               0.297             0.288               0.282            0.206
        [0.325,0.544]     [0.272,0.477]       [0.156,0.433]     [0.149,0.429]       [0.153,0.406]    [0.118,0.293]
β2      -0.185            -0.151              -0.117            -0.114              0.076            0.023
        [-0.274,-0.097]   [-0.234,-0.068]     [-0.224,-0.01]    [-0.223,-0.007]     [-0.034,0.186]   [-0.053,0.098]
ν       –                 8.701               –                 14.315              –                2
                          [6.6,11]                              [10.4,18.2]                          [2,2]

        JPM                                   KO                                    XRX
        Skellam           ∆NB                 Skellam           ∆NB                 Skellam          ∆NB
µ       0.229             0.239               0.18              0.148               -1.418           -1.417
        [0.188,0.272]     [0.193,0.284]       [0.138,0.222]     [0.107,0.189]       [-1.474,-1.363]  [-1.473,-1.361]
φ       0.897             0.905               0.937             0.943               0.928            0.941
        [0.893,0.902]     [0.901,0.911]       [0.932,0.943]     [0.937,0.948]       [0.912,0.944]    [0.928,0.954]
σ²      0.459             0.378               0.083             0.067               0.071            0.048
        [0.444,0.476]     [0.336,0.401]       [0.076,0.091]     [0.059,0.075]       [0.053,0.089]    [0.035,0.061]
γ       0.197             0.205               0.103             0.103               –                –
        [0.192,0.203]     [0.199,0.213]       [0.096,0.11]      [0.096,0.109]
β1      0.358             0.343               0.569             0.543               0.564            0.536
        [0.285,0.431]     [0.272,0.416]       [0.502,0.64]      [0.476,0.611]       [0.448,0.677]    [0.423,0.654]
β2      0.011             0.015               -0.209            -0.196              -0.142           -0.132
        [-0.056,0.077]    [-0.05,0.081]       [-0.277,-0.14]    [-0.262,-0.129]     [-0.23,-0.052]   [-0.22,-0.042]
ν       –                 87.27               –                 34.756              –                8.697
                          [75.2,98.8]                           [28.4,41.6]                          [5.8,11.6]
Table 5: Estimation results from a dynamic Skellam and ∆NB model during the period from 23rd to 29th April 2010. The posterior mean estimates are based on 100 000 iterations, of which 20 000 were used as a burn-in sample. The 95% HPD regions are in brackets.

        AA                                    F                                     IBM
        Skellam           ∆NB                 Skellam           ∆NB                 Skellam          ∆NB
µ       -2.23             -2.227              -2.397            -2.393              -0.083           -0.224
        [-2.29,-2.17]     [-2.288,-2.167]     [-2.442,-2.351]   [-2.436,-2.348]     [-0.154,-0.008]  [-0.299,-0.146]
φ       0.956             0.958               0.942             0.944               0.975            0.983
        [0.944,0.968]     [0.947,0.971]       [0.933,0.951]     [0.936,0.953]       [0.968,0.981]    [0.976,0.988]
σ²      0.029             0.027               0.061             0.057               0.025            0.011
        [0.02,0.04]       [0.018,0.039]       [0.051,0.078]     [0.046,0.068]       [0.018,0.033]    [0.007,0.017]
γ       0.287             0.267               –                 –                   –                –
        [0.278,0.297]     [0.256,0.279]
β1      0.037             0.037               0.148             0.149               0.476            0.421
        [-0.052,0.13]     [-0.056,0.13]       [0.089,0.207]     [0.09,0.206]        [0.359,0.6]      [0.306,0.536]
β2      -0.041            -0.041              -0.188            -0.188              0.204            0.181
        [-0.138,0.057]    [-0.137,0.061]      [-0.259,-0.115]   [-0.26,-0.115]      [0.082,0.329]    [0.061,0.3]
ν       –                 20.367              –                 27.436              –                6.101
                          [15,25.8]                             [21.4,33.8]                          [4.2,7.8]

        JPM                                   KO                                    XRX
        Skellam           ∆NB                 Skellam           ∆NB                 Skellam          ∆NB
µ       -1.674            -1.673              -1.636            -1.637              -2.334           -2.328
        [-1.716,-1.632]   [-1.716,-1.631]     [-1.693,-1.581]   [-1.693,-1.581]     [-2.393,-2.275]  [-2.387,-2.271]
φ       0.992             0.993               0.98              0.98                0.943            0.947
        [0.99,0.994]      [0.991,0.994]       [0.973,0.987]     [0.973,0.987]       [0.929,0.959]    [0.934,0.959]
σ²      0.002             0.002               0.007             0.007               0.059            0.052
        [0.002,0.003]     [0.002,0.003]       [0.004,0.01]      [0.004,0.01]        [0.037,0.076]    [0.038,0.068]
γ       –                 –                   –                 –                   –                –
β1      0.195             0.195               0.355             0.351               0.647            0.641
        [0.124,0.266]     [0.121,0.266]       [0.268,0.443]     [0.262,0.439]       [0.553,0.739]    [0.548,0.733]
β2      0.029             0.029               0.067             0.069               -0.457           -0.455
        [-0.039,0.1]      [-0.043,0.098]      [-0.032,0.164]    [-0.031,0.166]      [-0.545,-0.367]  [-0.544,-0.368]
ν       –                 36.288              –                 22.356              –                17.029
                          [29.6,43.8]                           [16.6,28]                            [12.4,22.4]
Table 6: Summary of the cleaning and aggregation procedure on the data from October 2008 for Alcoa (AA), Coca-Cola (KO), International Business Machines (IBM), J.P. Morgan (JPM), Ford (F) and Xerox (XRX) from the NYSE.

                                 AA                F                 IBM               JPM               KO                XRX
                                 #      % dropped  #      % dropped  #      % dropped  #       % dropped #       % dropped #      % dropped
Raw quotes and trades            511 185           311 914           688 805           984 526           541 616           371 065
Trades                           107 448  78.98    59 749   80.84    128 589  81.33    298 773  69.65    126 509  76.64    40 846   88.99
Non missing price and volume     107 434  0.01     59 737   0.02     128 575  0.01     298 761  0        126 497  0.01     40 834   0.03
Trades between 9:30 and 16:00    107 421  0.01     59 724   0.02     128 561  0.01     298 744  0.01     126 484  0.01     40 820   0.03
Aggregated trades                79 623   25.88    47 146   21.06    89 517   30.37    188 469  36.91    96 482   23.72    34 722   14.94
Without outliers                 79 198   0.53     47 075   0.15     88 808   0.79     186 103  1.26     95 398   1.12     34 649   0.21
Without opening trades           79 192   0.01     47 069   0.01     88 802   0.01     186 097  0        95 392   0.01     34 643   0.02

Table 7: Summary of the cleaning and aggregation procedure on the data from April 2010 for Alcoa (AA), Coca-Cola (KO), International Business Machines (IBM), J.P. Morgan (JPM), Ford (F) and Xerox (XRX) from the NYSE.

                                 AA                  F                  IBM               JPM                 KO                XRX
                                 #        % dropped  #        % dropped #      % dropped  #        % dropped  #      % dropped  #        % dropped
Raw quotes and trades            1 487 382           2 737 300          803 648           2 109 770           692 657           1 038 502
Trades                           33 684    97.74     77 778    97.16    53 346   93.36    126 153   94.02     41 184   94.05    43 170    95.84
Non missing price and volume     33 675    0.03      77 765    0.02     53 332   0.03     126 142   0.01      41 173   0.03     43 155    0.03
Trades between 9:35 and 15:55    33 666    0.03      77 757    0.01     53 324   0.02     126 136   0         41 164   0.02     43 149    0.01
Aggregated trades                32 446    3.62      73 160    5.91     52 406   1.72     122 579   2.82      40 573   1.44     40 673    5.74
Without outliers                 32 439    0.02      73 141    0.03     52 199   0.39     122 494   0.07      40 548   0.06     40 664    0.02
Without opening trades           32 433    0.02      73 135    0.01     52 193   0.01     122 488   0         40 542   0.01     40 658    0.01
[Figure 1 here: six histogram panels (AA, F, IBM, JPM, KO, XRX) under the title "Empirical distribution of the log returns in 10/2008".]

Figure 1: Empirical distribution of the tick by tick log returns during October 2008 for Alcoa (AA), Ford (F), International Business Machines (IBM), JP Morgan (JPM), Coca-Cola (KO) and Xerox (XRX).
[Figure 2 here: six panels (AA, F, IBM, JPM, KO, XRX) showing the empirical log density of tick returns together with the fitted Skellam log density, under the title "Empirical distribution of the tick returns in 10/2008".]

Figure 2: Empirical distribution of the tick returns along with the fitted Skellam density during October 2008 for Alcoa (AA), Ford (F), International Business Machines (IBM), JP Morgan (JPM), Coca-Cola (KO) and Xerox (XRX).
[Figure 3 here: Skellam probability mass functions for γ = 0, λ = 1; γ = 0, λ = 2; and γ = 0.1, λ = 2, under the title "Zero mean Skellam distribution".]

Figure 3: The picture shows the Skellam distribution with different parameters.
[Figure 4 here: ∆NB probability mass functions for γ = 0, λ = 1, ν = 1; γ = 0, λ = 1, ν = 10; γ = 0, λ = 5, ν = 1; γ = 0, λ = 5, ν = 10; and γ = 0.1, λ = 5, ν = 10, under the title "Zero Mean ∆NB Distribution".]

Figure 4: The picture shows the ∆NB distribution with different parameters.
[Figure 5 here: posterior density panels for µ, φ, σ², γ and β1–β4, under the title "Posterior densities of the parameters".]

Figure 5: The posterior distribution of the parameters from a dynamic Skellam model based on 20 000 observations and 100 000 iterations, of which 20 000 were used as a burn-in sample. Each picture shows the histogram of the posterior draws, the kernel density estimate of the posterior distribution, the HPD region and the posterior mean. The true parameters are µ = −1.7, φ = 0.97, σ = 0.02, γ = 0.001.
[Figure 6 here: posterior density panels for µ, φ, σ², γ, ν and β1–β4, under the title "Posterior densities of the parameters".]

Figure 6: The posterior distribution of the parameters from a dynamic ∆NB model based on 20 000 observations and 100 000 iterations, of which 20 000 were used as a burn-in sample. Each picture shows the histogram of the posterior draws, the kernel density estimate of the posterior distribution, the HPD region and the posterior mean. The true parameters are µ = −1.7, φ = 0.97, σ = 0.02, γ = 0.001 and ν = 15.
[Figure 7 here: four panels (tick returns, xₜ, sₜ and log λₜ) over 26/04–29/04, under the title "Volatility decomposition of IBM tick returns from 23rd to 29th April 2010".]

Figure 7: Decomposition of the log volatility of IBM.

[Figure 8 here: six panels (AA, F, IBM, JPM, KO, XRX), each showing tick returns and 2 log BF over 03/10–09/10, under the title "In-sample BIC comparison in October 2008".]

Figure 8: Sequential Bayes factors approximation based on BIC on data from 3rd to 9th October 2008.
[Figure 9 here: six panels (AA, F, IBM, JPM, KO, XRX), each showing tick returns and 2 log BF over 10 October 2008, under the title "Out of sample predictive likelihood comparison in October 2008".]

Figure 9: Sequential predictive Bayes factors on 10th October 2008.

[Figure 10 here: six panels (AA, F, IBM, JPM, KO, XRX), each showing tick returns and 2 log BF over 23/04–29/04, under the title "In-sample BIC comparison in April 2010".]

Figure 10: Sequential Bayes factors approximation based on BIC on data from 23rd to 29th April 2010.
[Figure 11 here: six panels (AA, F, IBM, JPM, KO, XRX), each showing tick returns and 2 log BF over 30 April 2010, under the title "Out of sample predictive likelihood comparison in April 2010".]

Figure 11: Sequential predictive Bayes factors on 30th April 2010.