
JOURNAL OF ECONOMICS AND FINANCE ∑ Volume 25 ∑ Number 3 ∑ Fall 2001
Recovery of Hidden Information from Stock
Price Data: A Semiparametric Approach
George Vachadze*
Abstract
This paper proposes a new methodology for measuring the announcement effect on stock returns. The methodology requires no prior specification of the event day or of the event and estimation windows and is therefore a generalization of the traditional event study methodology. The dummy variable, which indicates whether the event occurred, is treated as missing. The unconditional probability of an abnormal return is estimated by the EM algorithm. The probability that an announcement is effective and the average announcement effect are estimated by the Gibbs sampler. The method is demonstrated on simulated data and on IBM stock price returns. (JEL C13, G14)
Introduction
Event studies have become a useful and popular statistical application in finance. Their major focus is to examine the reaction of investors to new information and how quickly news is incorporated into prices. Accountants, lawyers, and insurers apply event studies to estimate the impact of different types of announcements on stock price performance and shareholder value. Accuracy of the predictions is what people of these different professions rely on and expect event studies to deliver.
Two main methodologies are widely applied in event studies. The first uses the concept of the estimation window (MacKinlay 1997). The estimation window contains data unaffected by the announcement. Within this window, either the constant mean return model or the market model is estimated in order to test whether the abnormal return is statistically different from zero. According to the second methodology (Salinger 1992), the event window and dummy variables are used to measure the effect of the announcement on the stock price, and OLS regression is applied to estimate the dummy variable coefficients. The established methodologies have several disadvantages:
1. The estimation and event windows, the number of dummy variables, and the length of the announcement effect are selected in an ad hoc and subjective manner.
*
George Vachadze, National Economic Research Associates, Inc. (NERA), 1166 Avenue of the Americas, New York, NY 10036, [email protected]. The author would like to thank Louis Guth, Fred Dunbar, Paul Hinton, Marcia Meyer, Eric Key, Algis Remeza, and David Tabak for insightful comments and stimulating discussions, David McKnight for help in GAUSS programming, and the anonymous referee and editor for interesting suggestions. I also acknowledge helpful comments from participants of NERA's monthly security meeting.
2. The returns' volatility is assumed to be constant and independent of the announcement effect.
3. The stock market movement on the day of the announcement is fully attributed to the announcement if the null hypothesis of normal return is rejected. However, the observed return is not necessarily an accurate measure of the announcement effect.
This paper resolves the above-described issues of traditional event studies. In particular, the presence of dummy variables is identified endogenously; the length of the announcement effect is determined by the data; and the volatility of stock returns is not constant but depends on whether the announcement occurred. The model in this paper is built on new concepts such as the effectiveness of the announcement and the average announcement effect.
I specify the statistical model in the second section. In the third section, I implement the Expectation Maximization (EM) algorithm to obtain the maximum likelihood estimates of the model's parameters. In the fourth section, I apply the Gibbs sampler to estimate the probability that an announcement is effective and to measure the average size of the announcement effect. The fifth section contains an application of the new methodology and gives estimates of the announcement effect for IBM stock. Finally, I conclude the paper and indicate directions for future research.
Model Specification
Consider the following model for the stock return, $R_t$:
$$R_t = \mu_0 + \sigma_0 \varepsilon_t + \vartheta_t \xi_t, \qquad (1)$$
where $\mu_0$ is the expected stock return under no announcement, $\sigma_0$ is the volatility of the stock return, and $\varepsilon_t \sim N(0,1)$ is the noise term. $\vartheta_t$ is a random variable that takes the values 0 and 1 depending on whether the announcement is effective, and $\xi_t$ is a random variable that measures the announcement effect. An announcement is called effective if $\vartheta_t = 1$. If the announcement is effective, then the announcement effect is a normally distributed random variable with mean $\mu$ and volatility $\sigma$. If the announcement is ineffective, $\vartheta_t = 0$, then the announcement effect is zero, $\xi_t = 0$. Formally, the announcement effect is defined as follows:
$$\xi_t = \begin{cases} \sim N(\mu, \sigma) & \text{if } \vartheta_t = 1 \\ 0 & \text{if } \vartheta_t = 0. \end{cases} \qquad (2)$$
Model (1) can be interpreted as a mixture of two regimes. When the announcement is ineffective, $\vartheta_t = 0$, the stock returns are generated from a normal distribution with parameters $\mu_0$ and $\sigma_0$. On the other hand, when the announcement is effective, $\vartheta_t = 1$, the stock returns are normally distributed with parameters $\mu_1 = \mu_0 + \mu$ and $\sigma_1^2 = \sigma_0^2 + \sigma^2$. If the announcement is effective, $\vartheta_t = 1$, then the average return is equal to $E(R_t \mid \vartheta_t = 1) = \mu_0 + \xi_t$. If the announcement is ineffective, $\vartheta_t = 0$, the average return is equal to $E(R_t \mid \vartheta_t = 0) = \mu_0$.
The abnormal return, $AR_t$, is defined as follows:
$$AR_t = E(R_t \mid \vartheta_t = 1) - E(R_t \mid \vartheta_t = 0) = \mu_0 + \xi_t - \mu_0 = \xi_t. \qquad (3)$$
The abnormal return, as defined in (3), has to be distinguished from the average announcement effect, $E(\vartheta_t \xi_t \mid R_t, \theta)$. Only if the announcement is effective, $\vartheta_t = 1$, is the average announcement effect equal to the average abnormal return. According to the traditional approach, $\Pr(\vartheta_t = 1 \mid R_t, \theta)$ is equal to zero or one. By ignoring the fact that $\Pr(\vartheta_t = 1 \mid R_t, \theta)$ belongs to the interval $[0,1]$, the traditional model wrongly concludes that the average announcement effect is equal to the abnormal return. In this work, $\Pr(\vartheta_t = 1 \mid R_t, \theta) \in [0,1]$ is estimated on a daily basis.
The parameters of model (1) are as follows:
$$\theta = \{\theta_0, \theta_1, p_0, p_1\}, \qquad (4)$$
where
$$\theta_0 = \{\mu_0, \sigma_0\}, \quad \theta_1 = \{\mu_1, \sigma_1\}, \quad p_0 = \Pr(\vartheta_t = 0), \quad p_1 = \Pr(\vartheta_t = 1).$$
Daily observations of stock returns can be used to make statistical inference on the probability that the announcement is effective, $\Pr(\vartheta_t = 1 \mid R_t, \theta)$. Once $\Pr(\vartheta_t = 1 \mid R_t, \theta)$ is determined, one can measure the average announcement effect.
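As a concrete illustration, the two-regime model (1)-(2) can be simulated directly. The sketch below is illustrative only: the parameter values are assumed for the example and are not the paper's estimates.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed, illustrative parameter values (not the paper's estimates)
mu0, sigma0 = 0.0008, 0.02   # mean and volatility with no effective announcement
mu, sigma = 0.006, 0.05      # mean and volatility of the announcement effect
p1 = 0.09                    # unconditional probability of an effective announcement

n = 10000
theta = rng.binomial(1, p1, size=n)                            # indicator in eq. (2)
xi = np.where(theta == 1, rng.normal(mu, sigma, size=n), 0.0)  # announcement effect
R = mu0 + sigma0 * rng.normal(size=n) + theta * xi             # returns, eq. (1)

# On effective-announcement days the return variance is sigma0^2 + sigma^2
print(R[theta == 1].std(), np.sqrt(sigma0**2 + sigma**2))
```

Simulated data of this kind is also what the precision study in the empirical section relies on.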
Maximum Likelihood Estimation
This section proposes a maximum likelihood estimation (MLE) methodology for estimating
the parameters of the event study model (1). The specific feature of this model is that
J = (J 1 , J 2 ,..., J n ) is unobservable. Therefore, one cannot maximize the complete data likelihood.
In order to obtain the ML estimates, I will implement the Expectation Maximization (EM)
algorithm. An extensive evaluation of the EM algorithm is given by Dempster at al. (1977), Green
(1984), Titterington at al. (1985), McLachlan and Basford (1988), and Lindsay (1995). The EM
algorithm provides a theoretical framework for iterative maximization of the observed likelihood
by maximizing the expected value of the full data likelihood.
The log-likelihood function of full data, X = (R , J ) , is given by:
n
log L(q | R, J ) = Â log f (R , J | q ) .
(5)
i =1
The observed data likelihood is defined as follows:
n
log L(q | R ) = Â log f (R | q ) .
(6)
i =1
The observed data likelihood is equal to:
$$\log L(\theta \mid R) = Q(\theta, \theta^{pr}) - I(\theta, \theta^{pr}), \qquad (7)$$
where $Q(\theta, \theta^{pr})$ denotes the average full data likelihood,
$$Q(\theta, \theta^{pr}) = \sum_{\vartheta \in \Theta} L(\theta \mid R, \vartheta) f(\vartheta \mid R, \theta^{pr}) = E_{\theta^{pr}}(L(\theta \mid R, \vartheta)), \qquad (8)$$
and $I(\theta, \theta^{pr})$ is the average likelihood of the missing data:
$$I(\theta, \theta^{pr}) = \sum_{\vartheta \in \Theta} \sum_{i=1}^{n} \log f(\vartheta \mid R, \theta) f(\vartheta \mid R, \theta^{pr}) = E_{\theta^{pr}}(\log f(\vartheta \mid R, \theta)), \qquad (9)$$
where $E_{\theta^{pr}}(\cdot)$ is the expectation operator and $\Theta$ is the set of all $n$-dimensional vectors of 1's and 0's. Derivation of expressions (7)-(9) is shown in Appendix A.
The observed data likelihood given in (7) is maximized when $Q(\theta, \theta^{pr})$ is maximized and $I(\theta, \theta^{pr})$ is minimized. A nice feature of the EM algorithm is that maximization of $Q(\theta, \theta^{pr})$ implies minimization of $I(\theta, \theta^{pr})$, and ultimately maximization of $\log L(\theta \mid R)$. Derivation of this property is illustrated in Appendix A.
The four steps of the EM algorithm are as follows:
1. Select a starting value $\theta^{pr}$ to initialize the algorithm;
2. (E-step) Compute the function $Q(\theta, \theta^{pr})$;
3. (M-step) Set the posterior value of $\theta$ as follows:
$$\theta^{ps} = \arg\max_{\theta} Q(\theta, \theta^{pr}); \qquad (10)$$
4. Use $\theta^{ps}$ as $\theta^{pr}$ and go to step 2.
The second and third steps are known as the expectation and maximization steps, respectively. The EM algorithm produces a sequence of likelihoods that increases monotonically as the number of iterations increases. Thus the likelihood must converge. The criterion for convergence is as follows:
$$\log L(\theta^{ps} \mid R) - \log L(\theta^{pr} \mid R) < \epsilon. \qquad (11)$$
Next I proceed to implement the EM algorithm to obtain the ML parameter estimates for the event study model (1). For initial values $p_k^{pr}, \mu_k^{pr}, \sigma_k^{pr}$, $k = 0, 1$, of the parameter $\theta$, the E-step is as follows:
Obtain the function $Q(\theta, \theta^{pr})$:
$$Q(\theta, \theta^{pr}) = \sum_{k=0}^{1} \sum_{i=1}^{n} \log f(R_i, \vartheta_i = k \mid \theta) f(\vartheta_i = k \mid R_i, \theta^{pr}) \qquad (12)$$
$$= \sum_{k=0}^{1} \sum_{i=1}^{n} \log\left(p_k f(R_i \mid \vartheta_i = k, \theta)\right) f(\vartheta_i = k \mid R_i, \theta^{pr}).$$
The M-step is to maximize the function $Q(\theta, \theta^{pr})$ with respect to the parameter $\theta$. Maximization yields:
$$p_k^{ps} = \frac{1}{n} \sum_{i=1}^{n} f(\vartheta_i = k \mid R_i, \theta^{pr}); \qquad (13)$$
$$\mu_k^{ps} = \frac{\sum_{i=1}^{n} R_i f(\vartheta_i = k \mid R_i, \theta^{pr})}{\sum_{i=1}^{n} f(\vartheta_i = k \mid R_i, \theta^{pr})}; \qquad (14)$$
$$(\sigma_k^{ps})^2 = \frac{\sum_{i=1}^{n} (R_i - \mu_k^{ps})^2 f(\vartheta_i = k \mid R_i, \theta^{pr})}{\sum_{i=1}^{n} f(\vartheta_i = k \mid R_i, \theta^{pr})}; \qquad (15)$$
for $k = 0, 1$. A full derivation of expressions (13)-(15) is presented in Appendix A.
Once the updated parameter values $p_k^{ps}, \mu_k^{ps}, \sigma_k^{ps}$, $k = 0, 1$, are obtained, the algorithm proceeds to the next iteration. This concludes the description of the ML estimation. In the next section, I turn to the estimation of the probability that the announcement is effective and measure the average announcement effect on a daily basis.
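The closed-form updates (13)-(15) admit a direct implementation. The sketch below is one possible implementation, with function and variable names of my own choosing; the weights $f(\vartheta_i = k \mid R_i, \theta^{pr})$ are the usual mixture-model responsibilities.

```python
import numpy as np

def normal_pdf(r, mu, sigma):
    return np.exp(-0.5 * ((r - mu) / sigma) ** 2) / (np.sqrt(2.0 * np.pi) * sigma)

def em_two_regime(R, p=(0.5, 0.5), mu=(0.0, 0.0), sigma=(0.01, 0.05),
                  eps=1e-8, max_iter=2000):
    """EM estimation of the two-component normal mixture implied by model (1),
    using the closed-form M-step updates (13)-(15)."""
    R = np.asarray(R, dtype=float)
    p, mu, sigma = np.array(p), np.array(mu), np.array(sigma)
    prev_ll = -np.inf
    for _ in range(max_iter):
        # Joint densities p_k * f(R_i | theta_i = k) for k = 0, 1
        dens = p * np.column_stack([normal_pdf(R, mu[k], sigma[k]) for k in (0, 1)])
        ll = np.log(dens.sum(axis=1)).sum()  # observed-data log likelihood
        if ll - prev_ll < eps:               # convergence criterion (11)
            break
        prev_ll = ll
        # E-step: posterior probabilities f(theta_i = k | R_i, theta^pr)
        w = dens / dens.sum(axis=1, keepdims=True)
        # M-step: updates (13), (14), (15)
        p = w.mean(axis=0)
        mu = (w * R[:, None]).sum(axis=0) / w.sum(axis=0)
        sigma = np.sqrt((w * (R[:, None] - mu) ** 2).sum(axis=0) / w.sum(axis=0))
    return p, mu, sigma, ll
```

Note that the two regimes are identified only up to label switching; initializing with $\sigma_0 < \sigma_1$ keeps component 0 as the no-announcement regime.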
Gibbs Sampling Algorithm
The Gibbs sampler is an iterative simulation scheme for generating random variables from a marginal distribution without having to calculate its density. The Gibbs sampler was introduced in the paper by Geman and Geman (1984). The basic idea behind the Gibbs sampler is to construct a Markov chain with the equilibrium distribution $f(x)$ by sampling from the conditional distributions $f(x \mid y)$ and $f(y \mid x)$. The Markov chain is the following sequence of random variables:
$$Y_0, X_0, Y_1, X_1, Y_2, X_2, \ldots, Y_n, X_n. \qquad (16)$$
Once the initial value $Y_0 = y_0$ is specified, the rest of (16) is obtained iteratively by alternating draws:
$$X_i \sim f(x \mid Y_i = y_i), \qquad Y_{i+1} \sim f(y \mid X_i = x_i). \qquad (17)$$
It can be shown that the distribution of $X_n$ converges to the true marginal distribution of $X$ as $n \to \infty$. Thus, for large $n$, the final observation in (16) is a sample point from the density $f(x)$. Convergence and uniqueness of the Gibbs iteration scheme are guaranteed by a fixed point argument demonstrated by Gelfand and Smith (1990). General convergence conditions needed for the Gibbs sampler can be found in Schervish and Carlin (1990), economic applications of the Gibbs sampler can be found in Tanizaki (1996), and discussions about the rate of convergence can be found in Roberts and Polson (1994).
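The scheme (16)-(17) can be illustrated with a toy example before applying it to model (1): sampling from a standard bivariate normal with correlation $\rho$, where both conditionals are known normals. The example is mine, not from the paper.

```python
import numpy as np

rho = 0.8                        # correlation of the bivariate normal target
rng = np.random.default_rng(0)
cond_sd = np.sqrt(1.0 - rho**2)  # standard deviation of each conditional

y = 0.0      # initial value Y_0
xs = []
for _ in range(20000):
    x = rng.normal(rho * y, cond_sd)  # X_i ~ f(x | Y_i = y_i)
    y = rng.normal(rho * x, cond_sd)  # Y_{i+1} ~ f(y | X_i = x_i)
    xs.append(x)

draws = np.array(xs[2000:])  # discard burn-in draws
# The retained draws approximate the marginal f(x) = N(0, 1)
print(draws.mean(), draws.std())
```

The chain only ever uses the two conditionals, yet the retained draws behave like samples from the marginal, which is exactly the property exploited below.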
The Gibbs sampler can be successfully implemented in the event study model defined in (1). In particular, based on the knowledge of:
• $f(\vartheta_t = 1 \mid R_t, \xi_t, \theta)$, the conditional probability that the announcement is effective given the observed return, the announcement effect, and the parameter values, and
• $f(\xi_t \mid R_t, \vartheta_t = 1, \theta)$, the conditional probability density of the abnormal return given the observed return, the fact that the announcement is effective, and the parameter values,
I implement the Gibbs sampler to obtain the probability that the announcement is effective, $f(\vartheta_t = 1 \mid R_t, \theta)$, and the probability distribution of the abnormal return, $f(\xi_t \mid R_t, \theta)$. Both conditional densities are used to determine the average announcement effect, $E(\vartheta_t \xi_t \mid R_t, \theta)$.
As shown in Appendix B, the conditional distribution of the indicator that the announcement is effective on day $t$ is binomial, with:
$$f(\vartheta_t = 1 \mid R_t, \xi_t, \theta) = \frac{1}{1 + \dfrac{1-p}{p} \exp\left\{-\dfrac{\xi_t (2R_t - 2\mu_0 - \xi_t)}{2\sigma_0^2}\right\}}. \qquad (18)$$
The conditional probability distribution of the announcement effect is as follows:
$$\xi_t = \begin{cases} \sim N\left(\dfrac{(R_t - \mu_0)\sigma^2 + \mu \sigma_0^2}{\sigma_0^2 + \sigma^2}, \; \dfrac{\sigma_0^2 \sigma^2}{\sigma_0^2 + \sigma^2}\right) & \text{if } \vartheta_t = 1 \\ 0 & \text{if } \vartheta_t = 0. \end{cases} \qquad (19)$$
Expressions (18) and (19) are used to obtain the conditional probability that the announcement is effective, $f(\vartheta_t = 1 \mid R_t, \theta)$, and to measure the average announcement effect, $E(\vartheta_t \xi_t \mid R_t, \theta)$.
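A sketch of the resulting sampler for a single day, alternating between a draw of the indicator from (18) and a draw of the announcement effect from (19); the function name, iteration counts, and burn-in length are my own choices.

```python
import numpy as np

def gibbs_day(r, mu0, sigma0, mu, sigma, p, n_iter=20000, burn=2000, seed=0):
    """For one day's return r, alternate draws of the effectiveness indicator
    from (18) and of the announcement effect from (19); return the estimated
    probability of an effective announcement and the average announcement effect."""
    rng = np.random.default_rng(seed)
    post_var = sigma0**2 * sigma**2 / (sigma0**2 + sigma**2)  # variance in (19)
    xi = mu          # initial value of the announcement effect
    ind, eff = [], []
    for _ in range(n_iter):
        # Indicator draw from the conditional probability (18)
        odds = (1.0 - p) / p * np.exp(-xi * (2*r - 2*mu0 - xi) / (2.0 * sigma0**2))
        j = 1 if rng.random() < 1.0 / (1.0 + odds) else 0
        # Announcement-effect draw from (19): normal if j = 1, zero otherwise
        if j == 1:
            post_mean = ((r - mu0) * sigma**2 + mu * sigma0**2) / (sigma0**2 + sigma**2)
            xi = rng.normal(post_mean, np.sqrt(post_var))
        else:
            xi = 0.0
        ind.append(j)
        eff.append(j * xi)
    ind, eff = np.array(ind[burn:]), np.array(eff[burn:])
    return ind.mean(), eff.mean()  # f(theta_t = 1 | R_t), E(theta_t xi_t | R_t)
```

Run over every trading day in the sample, this produces daily series of the kind plotted in Figures 6 and 7.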
Empirical Results
In this section, I implement the event study methodology for IBM stock returns. I describe
ML estimation results, evaluate the precision of the EM algorithm on simulated data, and present
the Gibbs sampler estimates of the average announcement effect and probability that
announcement is effective.
IBM Stock Return Data
For empirical analysis, I use daily data on IBM stock price from January 5, 1998, through
February 2, 2001. Figure 1 displays the data during the sample period.
FIGURE 1. IBM STOCK PRICE SERIES
[Figure 1 plots the daily IBM stock price against time from January 1998 through early 2001.]
Figure 2 shows the percentage changes in the IBM stock price series.
Table 1 displays the ML parameter estimates of model (1) for IBM stock returns. The parameter estimates are obtained from 2000 iterations of the EM algorithm. The convergence precision is less than $10^{-8}$.
TABLE 1. ML ESTIMATES FOR IBM STOCK RETURNS

Parameter                  Estimate
$\Pr(\vartheta_t = 1)$     0.0863
$\mu_0$                    0.0008
$\mu_1$                    0.0062
$\sigma_0$                 0.0208
$\sigma_1$                 0.057
log ML                     9391.21
FIGURE 2. IBM STOCK RETURN SERIES
[Figure 2 plots the daily percentage change in the IBM stock price from 1/5/98 through 1/5/01.]
As shown in Table 1, the estimated probability that an announcement is effective is equal to 8.63 percent; i.e., on average, 67 effective announcements are made during the three-year sample period. On days when effective announcements are made, IBM stock returns are normally
distributed with mean 0.62 percent and volatility 6 percent. Otherwise, IBM stock returns are
normal with mean 0.08 percent and volatility 2 percent. The logarithm of the maximum likelihood
is equal to 9391.21.
To evaluate the precision of the EM algorithm estimates, I use parameter values from Table 1
to simulate 40 sample return series with the sample size being equal to 1,000. Figures 3-5 show
the confidence intervals for the parameter values and their estimated means.
FIGURE 3. DISTRIBUTION OF ESTIMATED PROBABILITY OF JUMP
[Histogram of the estimated jump probability across the 40 simulated samples. True parameter value: 0.0863; mean of the estimates: 0.088; standard deviation of the estimates: 0.023.]
Figure 3 shows the sample distribution for the estimated probability that the announcement is
effective. A 95 percent confidence interval is 4.0 percent to 13.5 percent, while the true parameter
value is 8.6 percent. The simulated parameter’s mean is equal to 8.7 percent, which is very close
to the true mean of 8.6 percent.
Figure 4 displays the confidence interval for the stock return volatility when the announcement is ineffective. The confidence interval appears to be very tight, implying that the estimates of the volatility are very precise. A 95 percent confidence interval is 1.9 percent to 2.2 percent, while the true volatility is 2.07 percent. The simulated estimate of the volatility is 2.06 percent.
FIGURE 4. DISTRIBUTION OF ESTIMATED STANDARD DEVIATION WHEN ANNOUNCEMENT IS INEFFECTIVE
[Histogram across the 40 simulated samples. True parameter value: 0.0208; mean of the estimates: 0.0206; standard deviation of the estimates: 0.00075.]
FIGURE 5. DISTRIBUTION OF ESTIMATED STANDARD DEVIATION WHEN ANNOUNCEMENT IS EFFECTIVE
[Histogram across the 40 simulated samples. True parameter value: 0.0537; mean of the estimates: 0.0517; standard deviation of the estimates: 0.0052.]
Figure 5 depicts the distribution of the estimated stock return volatility when the announcement is effective. A 95 percent confidence interval is 4.1 percent to 6.2 percent, while the true volatility is 5.37 percent. The sample estimate of the volatility is 5.16 percent.
For the parameter values in Table 1, the Gibbs sampling procedure is applied to obtain the probability that the announcement is effective on a daily basis. Figure 6 shows the estimated probability.
FIGURE 6. ESTIMATED PROBABILITY OF ABNORMAL ANNOUNCEMENT
[Daily estimated probability that the announcement is effective, plotted from 1/5/98 through 1/5/01.]
Figure 7 shows the average announcement effect, $E(\vartheta_t \xi_t \mid R_t, \theta)$, and the observed return data for IBM stock.
FIGURE 7. ESTIMATED ANNOUNCEMENT EFFECT
[The average announcement effect $E(\vartheta_t \xi_t \mid R_t, \theta)$ and the observed IBM returns, plotted from January 1998 through early 2001.]
The mean of the announcement effect is equal to 0.54 percent, which indicates that the effective announcements are positive, on average. The volatility of the announcement effect is equal to 5.37 percent. The probability that an announcement is effective is 0.0863. The 95 percent confidence interval for the IBM stock return is $[-10\%, 10\%]$. Therefore, the ratio of the return variance due to effective announcements to the total return variance is as follows:
$$\frac{p \sigma^2}{\sigma_0^2 + p \sigma^2} = 36.58\%;$$
i.e., 36.58 percent of the IBM stock return variance is due to the effective announcements.
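The 36.58 percent figure can be reproduced arithmetically from the reported estimates (using the more precise parameter values quoted with Figures 3-5):

```python
# Reproducing the variance decomposition from the reported estimates
p = 0.086324        # Pr(announcement is effective), Figure 3
sigma0 = 0.020791   # no-announcement volatility, Figure 4
sigma = 0.053743    # announcement-effect volatility, Figure 5

share = p * sigma**2 / (sigma0**2 + p * sigma**2)
print(f"{share:.2%}")  # matches the 36.58 percent reported above
```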
Conclusion and Directions for Future Research
A new methodology for event studies is proposed in this paper. The methodology is a generalization of the traditional approach in that no prior assumption on the event day or on the event and estimation windows is required. The dummy variable, which indicates when the event occurred, is estimated endogenously. The ML estimates of the model parameters are obtained by applying the EM algorithm. The Gibbs sampler technique is implemented to measure the average announcement effect on the stock return and to estimate the probability that an announcement is effective. An application of the methodology is given for IBM stock return data.
The paper can be extended in two directions. First, the market return can be incorporated in the stock return equation as an exogenous variable. Second, the Bayesian Information Criterion can be applied to optimally group different announcements and identify how many partitions of the data are needed. (See Dasgupta and Raftery 1998 and Campbell et al. 1997 for the theory behind this application.)
References
Campbell, J. G., C. Fraley, F. Murtagh, and A. E. Raftery. 1997. "Linear Flaw Detection in Woven Textiles Using Model-Based Clustering." Pattern Recognition Letters 18: 1539-1548.
Dasgupta, A., and A. E. Raftery. 1998. "Detecting Features in Spatial Point Processes with Clutter via Model-Based Clustering." Journal of the American Statistical Association 93: 294-302.
Dempster, A. P., N. M. Laird, and D. B. Rubin. 1977. "Maximum Likelihood from Incomplete Data via the EM Algorithm." Journal of the Royal Statistical Society B 39: 1-38.
Gelfand, A. E., and A. F. M. Smith. 1990. "Sampling-Based Approaches to Calculating Marginal Densities." Journal of the American Statistical Association 85: 398-409.
Geman, S., and D. Geman. 1984. "Stochastic Relaxation, Gibbs Distributions and the Bayesian Restoration of Images." IEEE Transactions on Pattern Analysis and Machine Intelligence 6: 721-741.
MacKinlay, A. C. 1997. "Event Studies in Economics and Finance." Journal of Economic Literature 35: 13-39.
McLachlan, G. J., and K. E. Basford. 1988. Mixture Models: Inference and Applications to Clustering. New York: Marcel Dekker.
McLachlan, G. J., and T. Krishnan. 1997. The EM Algorithm and Extensions. New York: Wiley.
Roberts, G. O., and N. G. Polson. 1994. "A Note on the Geometric Convergence of the Gibbs Sampler." Journal of the Royal Statistical Society B 56: 377-384.
Salinger, M. 1992. "Standard Errors in Event Studies." Journal of Financial and Quantitative Analysis 27: 39-53.
Schervish, M. J., and B. P. Carlin. 1990. "On the Convergence of Successive Substitution Sampling." Journal of Computational and Graphical Statistics 1: 111-127.
Tanizaki, H. 1996. Nonlinear Filters. Heidelberg: Springer.
Appendix A: EM Algorithm
The log-likelihood function of the full data, $X = (R, \vartheta)$, is given by:
$$\log L(\theta \mid R, \vartheta) = \sum_{i=1}^{n} \log f(R_i, \vartheta_i \mid \theta). \qquad (A.1)$$
The observed data likelihood is defined as follows:
$$\log L(\theta \mid R) = \sum_{i=1}^{n} \log f(R_i \mid \theta). \qquad (A.2)$$
The relation between the full data likelihood and the observed data likelihood can be established through the Bayes rule, according to which
$$f(R_i, \vartheta_i \mid \theta) = f(\vartheta_i \mid R_i, \theta) f(R_i \mid \theta). \qquad (A.3)$$
Substituting (A.3) into (A.1), one obtains the following relation between the full data likelihood and the observed data likelihood:
$$\sum_{i=1}^{n} \log f(R_i, \vartheta_i \mid \theta) = \sum_{i=1}^{n} \log\left(f(\vartheta_i \mid R_i, \theta) f(R_i \mid \theta)\right) = \sum_{i=1}^{n} \log f(\vartheta_i \mid R_i, \theta) + \sum_{i=1}^{n} \log f(R_i \mid \theta). \qquad (A.4)$$
Equation (A.4) can be rewritten as follows:
$$\sum_{i=1}^{n} \log f(R_i \mid \theta) = \sum_{i=1}^{n} \log f(R_i, \vartheta_i \mid \theta) - \sum_{i=1}^{n} \log f(\vartheta_i \mid R_i, \theta). \qquad (A.5)$$
The first step in the EM algorithm is to integrate expression (A.5) with respect to the conditional density $f(\vartheta \mid R, \theta^{pr})$, for any prior value $\theta^{pr}$ of the parameter $\theta$, to obtain:
$$\log L(\theta \mid R) = \sum_{\vartheta \in \Theta} L(\theta \mid R, \vartheta) f(\vartheta \mid R, \theta^{pr}) - \sum_{\vartheta \in \Theta} \sum_{i=1}^{n} \log f(\vartheta \mid R, \theta) f(\vartheta \mid R, \theta^{pr}), \qquad (A.6)$$
where $\Theta$ is the set of all $n$-dimensional vectors of 1's and 0's.
Let $Q(\theta, \theta^{pr})$ denote the average full data likelihood and $I(\theta, \theta^{pr})$ the average likelihood of the missing data, i.e.,
$$Q(\theta, \theta^{pr}) = \sum_{\vartheta \in \Theta} L(\theta \mid R, \vartheta) f(\vartheta \mid R, \theta^{pr}) = E_{\theta^{pr}}(L(\theta \mid R, \vartheta)) \qquad (A.7)$$
and
$$I(\theta, \theta^{pr}) = \sum_{\vartheta \in \Theta} \sum_{i=1}^{n} \log f(\vartheta \mid R, \theta) f(\vartheta \mid R, \theta^{pr}) = E_{\theta^{pr}}(\log f(\vartheta \mid R, \theta)), \qquad (A.8)$$
where $E_{\theta^{pr}}(\cdot)$ is the expectation operator. From (A.6)-(A.8) it follows that the observed data likelihood is equal to:
$$\log L(\theta \mid R) = Q(\theta, \theta^{pr}) - I(\theta, \theta^{pr}). \qquad (A.9)$$
The incomplete data likelihood is monotonically increasing because maximization of $Q(\theta, \theta^{pr})$ yields minimization of $I(\theta, \theta^{pr})$. This can be shown by using the inequality $\log x \le x - 1$ for $x > 0$. In particular:
$$I(\theta^{ps}, \theta^{pr}) - I(\theta^{pr}, \theta^{pr}) = E_{\theta^{pr}}(\log f(\vartheta \mid R, \theta^{ps})) - E_{\theta^{pr}}(\log f(\vartheta \mid R, \theta^{pr}))$$
$$= E_{\theta^{pr}}\left(\log \frac{f(\vartheta \mid R, \theta^{ps})}{f(\vartheta \mid R, \theta^{pr})}\right) \le E_{\theta^{pr}}\left(\frac{f(\vartheta \mid R, \theta^{ps})}{f(\vartheta \mid R, \theta^{pr})} - 1\right) = \sum_{\vartheta \in \Theta} f(\vartheta \mid R, \theta^{ps}) - 1 = 1 - 1 = 0. \qquad (A.10)$$
Since $\theta^{ps} = \arg\max_{\theta} Q(\theta, \theta^{pr})$ and $I(\theta^{ps}, \theta^{pr}) - I(\theta^{pr}, \theta^{pr}) \le 0$, it follows that:
$$\theta^{ps} = \arg\max_{\theta} \left(Q(\theta, \theta^{pr}) - I(\theta, \theta^{pr})\right), \qquad (A.11)$$
or the log likelihood increases with each iteration of the EM algorithm. Since the sequence of observed likelihood values is monotonically increasing, it follows that the observed likelihood converges. The advantage of the EM algorithm over straightforward likelihood maximization is that an analytical solution of the maximization in (A.9) is possible. The rest of the appendix gives a detailed derivation of the E and M steps.
E-step. The function $Q(\theta, \theta^{pr})$ has the following form:
$$Q(\theta, \theta^{pr}) = \sum_{k=0}^{1} \sum_{i=1}^{n} \log(p_k) f(\vartheta_i = k \mid R_i, \theta^{pr}) + \sum_{k=0}^{1} \sum_{i=1}^{n} \log f(R_i \mid \vartheta_i = k, \theta) f(\vartheta_i = k \mid R_i, \theta^{pr}). \qquad (A.12)$$
To maximize expression (A.12), one can maximize each part of the sum separately. To maximize $Q(\theta, \theta^{pr})$ with respect to $p_k$, one has to perform a constrained optimization, since $p_0 + p_1 = 1$. After introducing the Lagrange multiplier $\lambda$ with the constraint $p_0 + p_1 = 1$, the maximization takes the following form:
$$\max_{p_0, p_1} \sum_{k=0}^{1} \sum_{i=1}^{n} \log(p_k) f(\vartheta_i = k \mid R_i, \theta^{pr}), \quad \text{s.t. } p_0 + p_1 = 1. \qquad (A.13)$$
The solution to problem (A.13) should satisfy the following system of equations:
$$\begin{cases} \dfrac{\partial}{\partial p_0}\left(\displaystyle\sum_{k=0}^{1}\sum_{i=1}^{n} \log(p_k) f(\vartheta_i = k \mid R_i, \theta^{pr}) + \lambda (p_0 + p_1 - 1)\right) = 0 \\ \dfrac{\partial}{\partial p_1}\left(\displaystyle\sum_{k=0}^{1}\sum_{i=1}^{n} \log(p_k) f(\vartheta_i = k \mid R_i, \theta^{pr}) + \lambda (p_0 + p_1 - 1)\right) = 0 \\ p_0 + p_1 = 1. \end{cases} \qquad (A.14)$$
From (A.14) it follows:
$$\begin{cases} \displaystyle\sum_{i=1}^{n} \frac{1}{p_0} f(\vartheta_i = 0 \mid R_i, \theta^{pr}) + \lambda = 0 \\ \displaystyle\sum_{i=1}^{n} \frac{1}{p_1} f(\vartheta_i = 1 \mid R_i, \theta^{pr}) + \lambda = 0 \\ p_0 + p_1 = 1. \end{cases} \qquad (A.15)$$
Multiplying the first equation of (A.15) by $p_0$ and the second equation by $p_1$, one obtains:
$$\begin{cases} \displaystyle\sum_{i=1}^{n} f(\vartheta_i = 0 \mid R_i, \theta^{pr}) + \lambda p_0 = 0 \\ \displaystyle\sum_{i=1}^{n} f(\vartheta_i = 1 \mid R_i, \theta^{pr}) + \lambda p_1 = 0 \\ p_0 + p_1 = 1. \end{cases} \qquad (A.16)$$
Summing the first and second equations of (A.16), it follows that:
$$\sum_{i=1}^{n} f(\vartheta_i = 0 \mid R_i, \theta^{pr}) + \lambda p_0 + \sum_{i=1}^{n} f(\vartheta_i = 1 \mid R_i, \theta^{pr}) + \lambda p_1 = 0. \qquad (A.17)$$
(A.17) implies that $\lambda = -n$, resulting in the following solution:
$$p_0 = \frac{1}{n} \sum_{i=1}^{n} f(\vartheta_i = 0 \mid R_i, \theta^{pr}), \qquad p_1 = \frac{1}{n} \sum_{i=1}^{n} f(\vartheta_i = 1 \mid R_i, \theta^{pr}). \qquad (A.18)$$
To optimize $Q(\theta, \theta^{pr})$ with respect to $\theta$, it is necessary to perform the following maximization:
$$\max_{\mu_0, \mu_1, \sigma_0, \sigma_1} \sum_{k=0}^{1} \sum_{i=1}^{n} \log f(R_i \mid \vartheta_i = k, \theta) f(\vartheta_i = k \mid R_i, \theta^{pr}). \qquad (A.19)$$
The expressions for the conditional densities are substituted in (A.19) to obtain the following:
$$\sum_{k=0}^{1} \sum_{i=1}^{n} \log f(R_i \mid \vartheta_i = k, \theta) f(\vartheta_i = k \mid R_i, \theta^{pr}) = \sum_{k=0}^{1} \sum_{i=1}^{n} \left(-\log \sqrt{2\pi} - \frac{1}{2} \log \sigma_k^2 - \frac{(R_i - \mu_k)^2}{2\sigma_k^2}\right) f(\vartheta_i = k \mid R_i, \theta^{pr}). \qquad (A.20)$$
Taking derivatives of expression (A.20) with respect to the parameters $(\mu_0, \mu_1, \sigma_0, \sigma_1)$ and setting the derivatives equal to zero, one obtains the following:
$$\begin{cases} \displaystyle\sum_{i=1}^{n} \frac{R_i - \mu_0}{\sigma_0^2} f(\vartheta_i = 0 \mid R_i, \theta^{pr}) = 0 \\ \displaystyle\sum_{i=1}^{n} \frac{R_i - \mu_1}{\sigma_1^2} f(\vartheta_i = 1 \mid R_i, \theta^{pr}) = 0 \\ \displaystyle\sum_{i=1}^{n} \left(-\frac{1}{\sigma_0} + \frac{(R_i - \mu_0)^2}{\sigma_0^3}\right) f(\vartheta_i = 0 \mid R_i, \theta^{pr}) = 0 \\ \displaystyle\sum_{i=1}^{n} \left(-\frac{1}{\sigma_1} + \frac{(R_i - \mu_1)^2}{\sigma_1^3}\right) f(\vartheta_i = 1 \mid R_i, \theta^{pr}) = 0. \end{cases} \qquad (A.21)$$
System (A.21) leads to the following solution:
$$\mu_0 = \frac{\sum_{i=1}^{n} R_i f(\vartheta_i = 0 \mid R_i, \theta^{pr})}{\sum_{i=1}^{n} f(\vartheta_i = 0 \mid R_i, \theta^{pr})} \quad \text{and} \quad \sigma_0^2 = \frac{\sum_{i=1}^{n} (R_i - \mu_0)^2 f(\vartheta_i = 0 \mid R_i, \theta^{pr})}{\sum_{i=1}^{n} f(\vartheta_i = 0 \mid R_i, \theta^{pr})};$$
$$\mu_1 = \frac{\sum_{i=1}^{n} R_i f(\vartheta_i = 1 \mid R_i, \theta^{pr})}{\sum_{i=1}^{n} f(\vartheta_i = 1 \mid R_i, \theta^{pr})} \quad \text{and} \quad \sigma_1^2 = \frac{\sum_{i=1}^{n} (R_i - \mu_1)^2 f(\vartheta_i = 1 \mid R_i, \theta^{pr})}{\sum_{i=1}^{n} f(\vartheta_i = 1 \mid R_i, \theta^{pr})}. \qquad (A.22)$$
Finally, equations (A.18) and (A.22) give the parameter values that maximize the function $Q(\theta, \theta^{pr})$. The estimates of the new parameters can be expressed in terms of the old parameters as follows:
$$p_k^{ps} = \frac{1}{n} \sum_{i=1}^{n} f(\vartheta_i = k \mid R_i, \theta^{pr}); \qquad (A.23)$$
$$\mu_k^{ps} = \frac{\sum_{i=1}^{n} R_i f(\vartheta_i = k \mid R_i, \theta^{pr})}{\sum_{i=1}^{n} f(\vartheta_i = k \mid R_i, \theta^{pr})}; \qquad (A.24)$$
$$(\sigma_k^{ps})^2 = \frac{\sum_{i=1}^{n} (R_i - \mu_k^{ps})^2 f(\vartheta_i = k \mid R_i, \theta^{pr})}{\sum_{i=1}^{n} f(\vartheta_i = k \mid R_i, \theta^{pr})}. \qquad (A.25)$$
With the newly derived parameters, the EM algorithm proceeds to the next iteration.
Appendix B: Gibbs Sampler
In order to obtain $f(\vartheta_t = 1 \mid R_t, \theta)$ and $E(\vartheta_t \xi_t \mid R_t, \theta)$ by applying the Gibbs sampler, the following conditional distributions are used:
$$f(\vartheta_t = 1 \mid R_t, \xi_t, \theta) \quad \text{and} \quad f(\xi_t \mid R_t, \vartheta_t = 1, \theta). \qquad (B.1)$$
I obtain these distributions as follows.
$$f(\vartheta_t = 1 \mid R_t, \xi_t, \theta) = \frac{f(\vartheta_t = 1, R_t, \xi_t \mid \theta)}{f(R_t, \xi_t \mid \theta)} \qquad (B.2)$$
$$= \frac{f(R_t \mid \vartheta_t = 1, \xi_t, \theta) f(\vartheta_t = 1, \xi_t \mid \theta)}{f(R_t \mid \vartheta_t = 1, \xi_t, \theta) f(\vartheta_t = 1, \xi_t \mid \theta) + f(R_t \mid \vartheta_t = 0, \xi_t, \theta) f(\vartheta_t = 0, \xi_t \mid \theta)}$$
$$= \frac{f(R_t \mid \vartheta_t = 1, \xi_t, \theta) f(\vartheta_t = 1 \mid \xi_t, \theta)}{f(R_t \mid \vartheta_t = 1, \xi_t, \theta) f(\vartheta_t = 1 \mid \xi_t, \theta) + f(R_t \mid \vartheta_t = 0, \xi_t, \theta) f(\vartheta_t = 0 \mid \xi_t, \theta)}.$$
Using the fact that $f(\vartheta_t = 1 \mid \xi_t, \theta) = f(\vartheta_t = 1 \mid \theta) = p$ and $f(\vartheta_t = 0 \mid \xi_t, \theta) = f(\vartheta_t = 0 \mid \theta) = 1 - p$, expression (B.2) can be rewritten as follows:
$$\frac{f(R_t \mid \vartheta_t = 1, \xi_t, \theta) \, p}{f(R_t \mid \vartheta_t = 1, \xi_t, \theta) \, p + f(R_t \mid \vartheta_t = 0, \xi_t, \theta)(1 - p)}. \qquad (B.3)$$
From (1) it follows that, for given values of $\xi_t$, $\theta$, and $\vartheta_t = 1$, stock returns are distributed normally with mean $\mu_0 + \xi_t$ and variance $\sigma_0^2$. On the other hand, for given values of $\xi_t$, $\theta$, and $\vartheta_t = 0$, stock returns are distributed normally with mean $\mu_0$ and variance $\sigma_0^2$. Thus expression (B.3) can be rewritten as follows:
$$\frac{\dfrac{1}{\sqrt{2\pi}\sigma_0} \exp\left\{-\dfrac{(R_t - \mu_0 - \xi_t)^2}{2\sigma_0^2}\right\} p}{\dfrac{1}{\sqrt{2\pi}\sigma_0} \exp\left\{-\dfrac{(R_t - \mu_0 - \xi_t)^2}{2\sigma_0^2}\right\} p + \dfrac{1}{\sqrt{2\pi}\sigma_0} \exp\left\{-\dfrac{(R_t - \mu_0)^2}{2\sigma_0^2}\right\}(1 - p)}. \qquad (B.4)$$
Expression (B.4) is simplified by dividing the numerator and the denominator by $\frac{1}{\sqrt{2\pi}\sigma_0} \exp\left\{-\frac{(R_t - \mu_0 - \xi_t)^2}{2\sigma_0^2}\right\} p$. From (B.4) it follows that:
$$f(\vartheta_t = 1 \mid R_t, \xi_t, \theta) = \frac{1}{1 + \dfrac{1-p}{p} \exp\left\{-\dfrac{\xi_t (2R_t - 2\mu_0 - \xi_t)}{2\sigma_0^2}\right\}}. \qquad (B.5)$$
In the rest of the appendix, I derive the conditional distribution function of the announcement effect for given values of $R_t$, $\theta$, and $\vartheta_t = 1$. From Bayes' formula it follows that: