Advances in Return Predictability
Andrea Buraschi
Paul Whelan
Imperial College
Imperial College
Efficient Market Hypothesis
- Bachelier (1900): the behaviour of prices is such that speculation should be a fair game; stock prices follow a random walk.
- Samuelson (1965): in an informationally efficient market, price changes must be unforecastable if they fully incorporate the expectations of all market participants.
- Fama (1970): a market in which prices always 'fully reflect' all available information is called 'efficient'.
- Malkiel (1992): a capital market is said to be efficient with respect to an information set if security prices would be unaffected by revealing that information to all participants. Moreover, efficiency with respect to an information set implies that it is impossible to make economic profits by trading on the basis of that information set.
Assumptions
1. Market equilibrium can (somehow) be stated in terms of expected returns:
$$E[p_{j,t+1} \mid \Omega_t] = (1 + E[r_{j,t+1} \mid \Omega_t])\, p_{j,t}$$
2. Equilibrium expected returns are projected on the information set $\Omega_t$.
Market efficiency rules out the possibility of trading systems based solely on $\Omega_t$ that have expected returns in excess of equilibrium returns. With $m_{t+1}$ the stochastic discount factor:
$$z_{j,t+1} = r_{j,t+1} - E[r_{j,t+1} \mid \Omega_t]$$
$$E[m_{t+1} \cdot z_{j,t+1} \mid \Omega_t] = 0$$
- The above suggests three approaches to testing market efficiency:
  1. Testing whether prices fully reflect all available information: empirically meaningless. No content.
  2. Revealing information to market participants and measuring the price response: empirically infeasible.
  3. Measuring profits generated by trading on information: testable!
Empirical Strategy
The third idea has been used in two main ways:
1. Researchers have studied the profits generated by market professionals. If superior profits are achieved (after adjusting for risk) then markets cannot be efficient.
  - Advantage: concentrates on actual (real) trading by market participants.
  - Disadvantage: one cannot directly observe the information sets used by managers.
2. One can ask whether hypothetical trading rules based on specified information sets earn superior returns. This approach requires a clearly defined information set and a model for risk.
Taxonomy of Information Sets
In order to implement trading based tests an information set must be
defined. The classic taxonomy of information sets is due to Roberts (1967):
1. Weak Form Efficiency: The information set includes only the history of
the prices or returns themselves
2. Semi-Strong Form Efficiency: The information set includes all
information known to all market participants (publicly available
information)
3. Strong Form Efficiency: The information set includes all information
known to any market participant (private information)
Abnormal Returns
- Abnormal returns are defined with respect to a model for risk: random walk, CAPM, APT, GE, DSGE, etc.
- Abnormal returns are then defined as the realised return minus the return expected under the model M:
$$R^{A}_{t+1} \equiv R_{t+1} - E^{M}_{t+1}[R_{t+1}]$$
- The null of market efficiency is then:
$$H_0 : E[m_{t+1} \cdot R^{A}_{t+1} \mid \Omega_t] = 0$$
- If abnormal returns are unforecastable then the hypothesis of market efficiency is not rejected.
- The older efficient-markets literature typically specified constant normal returns. However, risk premia vary over the business cycle in a predictable way, so predictability can be accommodated rationally within DSGE models.
Joint Hypothesis Problem
- Tests of market efficiency suffer from serious difficulties in terms of interpreting results.
- The null of market efficiency contains an implicit joint hypothesis that:
  - Markets are efficient.
  - The correct model for risk has been specified.
- The debate between rational expectations models and irrational behavioural models is captured by the tension implicit in the joint hypothesis problem. Are markets irrational, or is the test mis-specified?
- Grossman and Stiglitz (1980) show that abnormal returns exist if there are costs of gathering and processing information. Efficiency tests should therefore really ask whether deviations from efficiency can survive transaction costs.
Efficient Capital Markets
- Circa 1970 Fama concluded that weak and semi-strong tests of market efficiency held unambiguously: expected excess returns were roughly constant over time.
- By 1991 the literature reported negative results for market efficiency and the CAPM, together with evidence of return predictability.
- However, these results were not true rejections of the EMH but consequences of the joint hypothesis problem.
- Need to rethink models for risk / uncertainty: a new taxonomy was needed!
Modern Taxonomy of the Efficient Market Hypothesis
- Tests for predictability (Weak Form):
  - Time-Series
  - Cross-Sectional
- Event studies (Semi-Strong Form):
  - Investigate information-based studies after the release of public information (see Malkiel's definition) to test for abnormal returns.
  - Since time horizons are so short, risk adjustment is unimportant, so we avoid the joint hypothesis problem.
- Tests for private information / superior performance (Strong Form):
  - Mutual fund / hedge fund performance
  - Insider trading
Do Stock Prices Move Too Much to be Justified by
Subsequent Changes in Dividends?, Shiller (1981)
Framework:
- Define the gross (real) return as:
$$R_{i,t+1} = \frac{D_{i,t+1} + P_{i,t+1}}{P_{i,t}}$$
- Taking expectations and rearranging:
$$P_{i,t} = \frac{E_t[D_{i,t+1}] + E_t[P_{i,t+1}]}{E_t[R_{i,t+1}]}$$
Prices vary because conditional expected dividends vary or conditional expected returns vary.
- Imposing constant expectations over time, assuming no bubbles, and iterating forward:
$$P_{i,t} = \sum_{k=1}^{\infty} \frac{E[D_{i,t+k}]}{E[R]^{k}}$$
- Assuming the result also holds for the aggregate stock market:
$$P_t = \sum_{k=1}^{\infty} \frac{E[D_{t+k}]}{E[R]^{k}}$$
- The punchline: Shiller (1981) shows that the variance of the LHS is higher than the plausible variance of the RHS. Stock market volatility is too high to accord with rational expectations.
Shiller (1981): the math
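- Prices and dividends appear to grow at a constant exponential rate $\lambda$. Detrend both series by this rate:
$$\lambda^{t} d_t = D_t, \qquad \mathrm{var}(d_t) < \infty$$
$$p_t \equiv P_t / \lambda^{t} = \sum_{k=1}^{\infty} E[R]^{-k} \lambda^{k} E_t[d_{t+k}]$$
$$p_t = \sum_{k=1}^{\infty} E[\bar{R}]^{-k} E_t[d_{t+k}], \qquad E[\bar{R}] \equiv E[R]/\lambda$$
- Idea: ex-post rational prices
$$p^{*}_t = \sum_{k=1}^{\infty} E[\bar{R}]^{-k} d_{t+k}$$
- Decompose dividends into expected and unexpected components:
$$d_{t+k} = E_t[d_{t+k}] + \tilde{d}_{t+k}$$
- Then ex-post rational prices are related to actual prices via:
$$p^{*}_t = p_t + \sum_{k=1}^{\infty} E[\bar{R}]^{-k} \tilde{d}_{t+k}$$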
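- Variance relation: under rational expectations the forecast errors $\tilde{d}_{t+k}$ are uncorrelated with $p_t$, so
$$\mathrm{var}[p^{*}_t] \ge \mathrm{var}[p_t]$$
- Step 1: deflate the series by the CPI to get real prices and dividends.
- Step 2: estimate $\lambda$:
$$\log(P_t) = a + b t + \eta_t, \qquad \lambda = e^{b}$$
- Step 3: detrend $P_t$ and $D_t$.
- Step 4: take unconditional expectations to estimate the discount rate:
$$E[p_t] = \frac{E[d_t]}{E[\bar{R}] - 1}$$
- Step 5: construct ex-post rational prices using a terminal condition at the end of the sample, $T$:
$$p^{*}_t = \sum_{k=1}^{T-t} E[\bar{R}]^{-k} d_{t+k} + E[\bar{R}]^{-(T-t)} p^{*}_T$$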
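Step 5 can be sketched in a few lines. The example below is only an illustration under stated assumptions: the detrended dividend series is simulated (a hypothetical stationary AR(1)), the discount rate `r_bar` is an arbitrary value, and the terminal price is set from the Step 4 relation rather than from data. It builds $p^{*}_t$ by the backward recursion $p^{*}_t = (p^{*}_{t+1} + d_{t+1}) / E[\bar{R}]$ and checks it against the direct discounted sum.

```python
import numpy as np

def ex_post_rational_price(d, r_bar, p_terminal):
    """Backward recursion p*_t = (p*_{t+1} + d_{t+1}) / r_bar, anchored at p*_T."""
    T = len(d) - 1
    p_star = np.empty(T + 1)
    p_star[T] = p_terminal
    for t in range(T - 1, -1, -1):
        p_star[t] = (p_star[t + 1] + d[t + 1]) / r_bar
    return p_star

rng = np.random.default_rng(0)
T = 100

# Hypothetical detrended dividends: a stationary AR(1) around a mean of 1.0
d = np.empty(T + 1)
d[0] = 1.0
for t in range(1, T + 1):
    d[t] = 1.0 + 0.8 * (d[t - 1] - 1.0) + 0.05 * rng.standard_normal()

r_bar = 1.05                              # assumed value of E[R-bar]
p_terminal = d.mean() / (r_bar - 1.0)     # assumption: Step 4 relation as terminal value
p_star = ex_post_rational_price(d, r_bar, p_terminal)

# Direct evaluation of the Step 5 formula at t = 0
k = np.arange(1, T + 1)
direct = np.sum(r_bar ** (-k) * d[1:]) + r_bar ** (-T) * p_terminal
print(p_star[0], direct)
```

The recursion and the direct sum are the same object; the recursion is simply cheaper when the whole path of $p^{*}_t$ is needed.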
Shiller (1981): the picture
Shiller (1981): the result
- In the data $\mathrm{var}(p_t) \ge \mathrm{var}(p^{*}_t)$, the reverse of the variance relation: the bound is violated by a factor of 5 to 13.
- A single big picture delivers the punchline: a key moment condition is violated.
- LeRoy and Porter contemporaneously do the stats.
- Problem: de-trended dividends are non-stationary, so $\mathrm{var}(p_t)$ does not even exist.
Fama and French (1988): Permanent and Temporary
Components of Stock Prices
- Early tests: daily or weekly stock returns exhibit strong statistical evidence of non-zero autocorrelation.
- However, the implied predictability of daily or weekly returns is economically insignificant.
- Two stories for long horizon predictability:
  1. Models of irrational markets in which stock prices take long temporary swings away from fundamental values.
  2. Time-varying equilibrium expected returns generated by rational pricing in an efficient market.
- Both stories imply observationally equivalent price behaviour.
- If expected returns are stationary and vary over time, stock prices should exhibit mean reversion.
Intuition: holding expected future dividends constant, the only way to give investors higher expected future returns is for prices to fall contemporaneously.
- Fama and French ran the following long-horizon forecasting regression:
$$R_{t,t+T} = b_{0,T} + b_{1,T} R_{t-T,t} + \varepsilon_{t,t+T}$$
- Statistical issues (explained later):
  - Finite-sample bias in AR(1) regressions: coefficients are negatively biased.
  - OLS standard errors are wrong because of overlapping observations.
  - Nonetheless, robust standard errors are computed and Monte Carlo simulations are used to bias-correct the coefficients.
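A minimal sketch of the long-horizon regression on simulated data may help fix ideas. Everything numerical here is an assumption made for illustration (the shock sizes, the AR(1) coefficient `phi`, the sample length); the point is only that a pure random walk gives a slope near zero, while a pure stationary component gives a clearly negative slope at long horizons.

```python
import numpy as np

def long_horizon_slope(log_price, horizon):
    """OLS slope of r(t, t+T) on r(t-T, t), using overlapping observations."""
    p = np.asarray(log_price, dtype=float)
    T = horizon
    past = p[T:-T] - p[:-2 * T]       # r(t-T, t)
    future = p[2 * T:] - p[T:-T]      # r(t, t+T)
    x = past - past.mean()
    y = future - future.mean()
    return float(x @ y / (x @ x))

rng = np.random.default_rng(0)
n = 20_000

# Permanent component: random walk in the log price (hypothetical shock size)
random_walk = np.cumsum(0.02 * rng.standard_normal(n))

# Temporary component: slowly decaying AR(1) (hypothetical parameters)
phi = 0.9
shocks = 0.05 * rng.standard_normal(n)
temporary = np.empty(n)
temporary[0] = shocks[0]
for t in range(1, n):
    temporary[t] = phi * temporary[t - 1] + shocks[t]

slope_rw = long_horizon_slope(random_walk, 50)
slope_tmp = long_horizon_slope(temporary, 50)
slopes_mixed = {T: long_horizon_slope(random_walk + temporary, T)
                for T in (1, 10, 50, 200)}
print(slope_rw, slope_tmp, slopes_mixed)
```

The standard errors from such a regression are not reliable as they stand, because consecutive observations share most of their return window.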
Fama and French (1988): results
- Bias-adjusted slopes reach their minimum for horizons of 2 to 5 years.
- The U-shaped pattern is consistent with the hypothesis that stock prices contain both random walk and slowly decaying stationary components.
- Predictable variation is estimated at 40% for small firms for 3 to 5 year returns, while it drops to 25% for large firms.
Fama and French (1989): Business Conditions and
Expected Returns on Stocks and Bonds.
- Fama and French do some data mining to forecast excess returns on stocks and bonds at various horizons:
$$\mathrm{ExRet}(t, t+T) = \alpha(T) + \beta(T) X(t) + \varepsilon(t, t+T)$$
- No strong theoretical motivation.
- They identify forecasting variables that have been used extensively in subsequent work:
  1. Dividend / price ratio: D/P.
  2. Default premium: Baa − Aaa corporate bond spread.
  3. Slope of the term structure: $y(n,t) - y(1,t)$.
Predictability
We look to predictability as a way to address questions of 'market efficiency':
- Can we predict returns ahead of time?
- This does not mean with certainty, but is there a way to know that the odds are in your favour at some times and against you at other times?
We will run simple regressions of the type:
$$\mathrm{ExRet}_{t,t+k} = a(k) + b(k) x_t + \varepsilon_{t,t+k}$$
This regression attempts to answer the equivalent questions:
- Are returns predictable?
- Is there time-variation in the price of risk?
Why Excess Returns ?
Asset            α             β              R²     E(R) (%)   σ(E(R)) (%)
Stock Return     0.09 (3.58)   0.06 (0.52)    0.00   9.10       1.17
Equity Premium   0.05 (2.30)   0.06 (0.52)    0.00   4.99       1.14
T Bill Return    0.00 (1.75)   0.91 (22.13)   0.83   4.12       3.10
Table: Regressions of returns on lagged returns (t-statistics in parentheses). Sample Period: 1926 - 2010.
- b = 0.06: if returns go up 100% this year, you expect a rise of just 6% next year!
- $E_t(R_{t,t+1}) = a + b R_t$, which averages 0.091.
- The standard deviation of expected returns is $\sigma[E_t(R_{t,t+1})] = \sigma(b R_t) = 0.012$. Expected returns are almost constant through time!
Figure: Annualised Return to VW - NYSE / AMEX / NASDAQ (vwr-stock), and 90-day T Bill. Sample Period: 1929 - 2010.
- Stock returns can be decomposed into a default-free rate, which is predictable, and a risk premium that should not be predictable according to the random walk hypothesis: $R_t = R^{f}_t + R^{e}_t$.
- T bills are predictable: does this mean bond markets are inefficient?
  - b = 0.91 is huge, with an impressive t-stat and $R^2$.
  - The mean return is 4.12%, with the expected return varying by 3.1%, almost as much as its average level.
  - If you know stock returns will be high next year, you can borrow money and invest.
  - But if T bill returns will be high, you have to borrow at the same high rate, which is no good: all you can do is save more and consume less!
  - The right way to do it is to focus on excess returns, the return you achieve by borrowing a dollar and investing it: this takes no money from agents' pockets, and thus reveals willingness to bear risk, separating out intertemporal substitution.
A historical debate
The literature on predictability came originally from 2 strands:
1. The debate on permanent and transitory components in stock prices:
  - Looking at longer horizons, the evidence suggested that mean reversion existed in stock prices, which implied a transitory component in prices and predictability. This was initially interpreted as evidence of irrational pricing (Summers, 1986).
  - Fama & French (1988, 1989) looked at new variables and in particular suggested that stock prices are predictable using the dividend-to-price ratio.
Price Dividend Ratio
Figure: Price Dividend Ratio. VW - NYSE / AMEX / NASDAQ. Sample Period: 1929 - 2010.
Economic Growth
Figure: Real Annualised Dividend Growth. VW - NYSE / AMEX / NASDAQ. Sample Period: 1929 - 2010.
Present Value Identity
Campbell & Shiller's (1988) decomposition.
Starting from the definition of returns:
$$1 = R_{t+1}^{-1} R_{t+1} = R_{t+1}^{-1} \frac{P_{t+1} + D_{t+1}}{P_t}$$
Multiplying both sides by $P_t / D_t$ and rearranging:
$$\frac{P_t}{D_t} = R_{t+1}^{-1} \left(1 + \frac{P_{t+1}}{D_{t+1}}\right) \frac{D_{t+1}}{D_t}$$
Taking logs:
$$p_t - d_t = -r_{t+1} + \Delta d_{t+1} + \log\left(1 + e^{p_{t+1} - d_{t+1}}\right)$$
Taking a Taylor series expansion of the last term about a point $P/D = e^{p-d}$:
$$p_t - d_t = -r_{t+1} + \Delta d_{t+1} + k + \rho (p_{t+1} - d_{t+1}) \qquad (1)$$
where
$$k = \log(1 + P/D) - \rho (p - d), \qquad \rho = \frac{P/D}{1 + P/D}$$
Given the average dividend yield is about 4%, the average P/D is 25 and $\rho$ is about 0.96.
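How good is the first-order approximation in (1)? The sketch below compares the exact log return with the approximate one for a single hypothetical move in the price-dividend ratio. The expansion point P/D = 25 is taken from the text; the next-period ratio (27) and dividend growth (2%) are made-up inputs, and the constant `k` is the full first-order Taylor constant appropriate when $p - d$ is measured in levels.

```python
import numpy as np

pd_bar = 25.0                          # average P/D quoted in the text
rho = pd_bar / (1.0 + pd_bar)          # = 25/26
# First-order Taylor constant when p - d is measured in levels
k = np.log(1.0 + pd_bar) - rho * np.log(pd_bar)

def exact_log_return(pd_t, pd_next, dgrowth):
    """r_{t+1} = log[(1 + P_{t+1}/D_{t+1}) * (D_{t+1}/D_t) / (P_t/D_t)]."""
    return np.log((1.0 + pd_next) * np.exp(dgrowth) / pd_t)

def approx_log_return(pd_t, pd_next, dgrowth):
    """Equation (1) solved for r_{t+1}."""
    return k + rho * np.log(pd_next) - np.log(pd_t) + dgrowth

# Hypothetical inputs: P/D moves from 25 to 27, log dividend growth of 2%
r_exact = exact_log_return(25.0, 27.0, 0.02)
r_approx = approx_log_return(25.0, 27.0, 0.02)
print(rho, r_exact, r_approx, r_exact - r_approx)
```

The approximation error grows with the distance of P/D from the expansion point, which is one reason the choice of that point matters in long samples.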
Iterating (1) forward and taking conditional expectations we get:
$$p_t - d_t = \text{const.} + E_t\left[\sum_{j=1}^{\infty} \rho^{j-1} (\Delta d_{t+j} - r_{t+j})\right] \qquad (2)$$
Equation (2) is obtained by imposing $\lim_{j \to \infty} \rho^{j} (p_{t+j} - d_{t+j}) = 0$, which rules out explosive behaviour of stock prices. This is equivalent to ruling out bubbles.
Thus from (2), prices are high today relative to fundamentals if:
- Investors expect dividends to rise in the future.
- Investors expect returns (that is, discount rates) to be low in the future.
- (Investors expect prices to rise forever, that is, a bubble.)
Two equivalent statements cement what equation (2) means:
1. Price-dividend ratios can move if and only if there is news about current dividends, future dividend growth or future returns.
2. If $\Delta d_t$ and $r_t$ are totally unpredictable, i.e. if $E_t(\Delta d_{t+j})$ and $E_t(r_{t+j})$ are the same for every time t, then $p_t - d_t$ must be constant (which we know isn't true!).
The Variance of Price / Dividend Ratios
If we forget the constant, i.e. treat variables as deviations from the mean, then multiply both sides of (2) by $(p_t - d_t)$ and take the unconditional mean:
$$E[(p_t - d_t)(p_t - d_t)] = E\left[(p_t - d_t) \times \sum_{j=1}^{\infty} \rho^{j-1} (\Delta d_{t+j} - r_{t+j})\right]$$
Thus, the variance of price-dividend ratios must correspond to its covariance with future dividend growth and returns:
$$\mathrm{var}(p_t - d_t) \approx \mathrm{cov}\left[p_t - d_t, \sum_{j=1}^{\infty} \rho^{j-1} \Delta d_{t+j}\right] - \mathrm{cov}\left[p_t - d_t, \sum_{j=1}^{\infty} \rho^{j-1} r_{t+j}\right] \qquad (3)$$
Equation (3) tells us directly that $p - d$ varies if and only if either dividend growth or future returns are predictable!
Dividing equation (3) by $\mathrm{var}(p_t - d_t)$ we get:
$$1 \approx \sum_{j=1}^{\infty} \rho^{j-1} b_d^{(j)} - \sum_{j=1}^{\infty} \rho^{j-1} b_r^{(j)} \qquad (4)$$
where $b^{(j)}$ means the j-year ahead regression coefficient:
$$r_{t+j} = a_r^{(j)} + b_r^{(j)} (p_t - d_t) + \varepsilon^{r}_{t+j}$$
$$\Delta d_{t+j} = a_d^{(j)} + b_d^{(j)} (p_t - d_t) + \varepsilon^{d}_{t+j}$$
- The volatility of the price-dividend ratio thus corresponds to the ability to forecast returns and/or dividend growth not just one year ahead but many years ahead as well.
- Equation (4) says that if price-dividend ratios vary at all, the difference between the (discounted, summed) dividend-growth-forecasting coefficients and the return-forecasting coefficients must be one.
- What does this actually mean?
The story for predictability in a one-period model:
- From equation (4):
$$1 \approx b_d - b_r$$
- Suppose $b_r = 0$, thus $b_d \approx 1$ and:
$$\Delta d_{t+1} = a_d + 1.0 \times (p_t - d_t) + \varepsilon^{d}_{t+1} \qquad (5)$$
- In this case, if a decent job of forecasting was done, $\varepsilon^{d}_{t+1}$ must be unpredictable and hence uncorrelated with anything known at time t, including $p_t - d_t$. Equation (5) is therefore a regression forecast: if prices are moving on news of future dividend growth, even news we cannot see, then high prices should predict high dividend growth!
So what actually moves prices?
Figure: Cochrane, 2001 (Table 20.3)
- Although the evidence is statistically insignificant, it seems that a high price/dividend ratio forecasts a decline in future dividends! This is the wrong direction!
- How has this happened?
  - It could be sampling error and the true parameter is zero.
  - Positive correlation between dividend news and expected return news: $p - d$ is pushed down by higher expected returns but at the same time it is pushed up by positive dividend news.
- It seems that all variation in P/D ratios is due to the discount channel and none due to the cashflow channel!
- High prices reflect low risk premia and lower expected excess returns.
- There are doubts about the statistical significance of return predictability.
- But we must remember: volatility in prices implies that prices must predict something...
- Can we talk about irrational expectations of a price bubble to explain these observations? Or is there another explanation?
Statistical issues with forecasting regressions
- 'Predictive regressions', Stambaugh (1999): biases and incorrect standard errors in return regressions that use lagged endogenous variables.
- 'Dividend yields and expected stock returns: Alternative procedures for inference and measurement', Hodrick (1992): only partly solvable problems that arise when using overlapping observations.
Stambaugh (1999)
Typical lagged explanatory variables for stock-return regressions are correlated with contemporaneous stock returns:
- D/P is negatively correlated with contemporaneous stock returns (price in the denominator).
- cay is negatively correlated (wealth in the denominator).
- The slope of the term structure is positively correlated with stock returns.
Non-zero contemporaneous correlation biases forecasting regressions.
OLS bias of an AR(1)
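- Model:
$$x_t = \alpha + \rho x_{t-1} + \varepsilon_t$$
- Estimate:
$$\hat{\rho} = \frac{\mathrm{cov}(x_t, x_{t-1})}{\mathrm{var}(x_{t-1})}$$
- Bias: substituting the model into the estimator (with sample moments),
$$\hat{\rho} = \rho + \frac{\mathrm{cov}(x_{t-1}, \varepsilon_t)}{\mathrm{var}(x_{t-1})}$$
- Bias arises because in-sample $\varepsilon_t$ is negatively correlated with $x_{t-1}$.
- A high $\varepsilon_t$ means a high $x_t$ and high $x_{t+1}, \ldots, x_{t+T}$, which raises the sample mean; measured against that mean, $x_1, \ldots, x_{t-1}$ are low in-sample.
- Standard errors are also wrong because the distribution of the estimate is skewed.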
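The downward bias is easy to see in a small Monte Carlo. The sketch below uses assumed values (ρ = 0.9, 50 observations, 5000 replications) and compares the simulated bias with Kendall's well-known first-order approximation, $E[\hat{\rho} - \rho] \approx -(1 + 3\rho)/T$, which is brought in here only as a benchmark and is not part of the slides.

```python
import numpy as np

def ols_ar1_slope(x):
    """OLS slope from regressing x_t on a constant and x_{t-1}."""
    y, lag = x[1:], x[:-1]
    lag_c = lag - lag.mean()
    return float(lag_c @ (y - y.mean()) / (lag_c @ lag_c))

def simulate_ar1(rho, n, rng):
    """Stationary AR(1) with unit-variance shocks and zero mean."""
    eps = rng.standard_normal(n)
    x = np.empty(n)
    x[0] = eps[0] / np.sqrt(1.0 - rho ** 2)   # draw from the stationary distribution
    for t in range(1, n):
        x[t] = rho * x[t - 1] + eps[t]
    return x

rng = np.random.default_rng(1)
rho, n, n_sims = 0.9, 50, 5000                # assumed illustration values

estimates = np.array([ols_ar1_slope(simulate_ar1(rho, n, rng))
                      for _ in range(n_sims)])
mc_bias = estimates.mean() - rho
kendall_bias = -(1.0 + 3.0 * rho) / n
print(mc_bias, kendall_bias)
```

The more persistent the series and the shorter the sample, the larger the bias, which is exactly the situation with annual dividend yields.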
Bias with lagged correlated and persistent explanatory
variables
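- Consider the forecasting regression, with $x_t$ following the AR(1) model above with innovation $\varepsilon_t$:
$$R_t = \alpha + \beta x_{t-1} + u_t$$
- Assumptions:
$$\mathrm{cov}(\varepsilon_t, u_t) < 0, \qquad 0 < \rho < 1$$
- The intuition is the same as in the AR(1) case: a high $u_t$ means a low $\varepsilon_t$ and low $x_{t+1}, \ldots, x_{t+T}$, hence high in-sample $x_1, \ldots, x_{t-1}$ relative to the sample mean.
- In sample, a high $x_{t-1}$ therefore corresponds to a high $R_t$: $\hat{\beta}$ is biased upward.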
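The link between the two biases can be checked by simulation. The sketch below sets the true β to zero and uses assumed values for the persistence, sample size and shock correlation. It compares the simulated bias in $\hat{\beta}$ with $(\sigma_{u\varepsilon}/\sigma^2_{\varepsilon})\, E[\hat{\rho} - \rho]$, the relation derived in Stambaugh (1999); that formula is quoted here as background rather than taken from the slides.

```python
import numpy as np

def ols_slope(y, x):
    """OLS slope from regressing y on a constant and x."""
    xc = x - x.mean()
    return float(xc @ (y - y.mean()) / (xc @ xc))

rng = np.random.default_rng(2)
rho, beta, n, n_sims = 0.9, 0.0, 60, 4000     # assumed illustration values
sigma_e, sigma_u, corr = 1.0, 1.0, -0.9       # hypothetical shock moments
gamma = corr * sigma_u / sigma_e              # sigma_ue / sigma_e^2

beta_hat = np.empty(n_sims)
rho_hat = np.empty(n_sims)
for s in range(n_sims):
    eps = sigma_e * rng.standard_normal(n)
    v = rng.standard_normal(n)
    # Return shock: correlated with eps, variance sigma_u^2
    u = gamma * eps + sigma_u * np.sqrt(1.0 - corr ** 2) * v
    x = np.empty(n)
    x[0] = eps[0] / np.sqrt(1.0 - rho ** 2)
    for t in range(1, n):
        x[t] = rho * x[t - 1] + eps[t]
    ret = beta * x[:-1] + u[1:]               # R_t = beta * x_{t-1} + u_t
    beta_hat[s] = ols_slope(ret, x[:-1])
    rho_hat[s] = ols_slope(x[1:], x[:-1])

beta_bias = beta_hat.mean() - beta
implied_bias = gamma * (rho_hat.mean() - rho)
print(beta_bias, implied_bias)
```

With a negative shock correlation the downward bias in $\hat{\rho}$ turns into an upward bias in $\hat{\beta}$, so returns look predictable even when β is zero.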
Hodrick (1992)
- Many authors have used dividend yields and other variables to examine the predictability of returns.
- E.g. Campbell (1991) and Cochrane (1992) attribute a large fraction of the variance of the price-dividend ratio to variation in expected returns.
- A huge controversy in this literature is typified by the arguments of Jegadeesh (1990) and Richardson and Stock (1989).
- These authors argue that the case for predictability of stock returns is weak once one corrects for small-sample biases in test statistics.
- Hodrick shows that long-horizon return regressions have the same implications (under the null) as non-overlapping short-horizon regressions.
- Furthermore, short-horizon regressions are better behaved, so there is little reason to use long-horizon specifications.
Hodrick (1992): Math
- Predict returns from t to t + k using a period-t observable variable:
$$r_{t,t+k} = \beta_0 + \beta_1 (D_t / P_t) + \varepsilon_{t,t+k}$$
- Statistical inference: stack $\beta_0, \beta_1$ in a vector $\beta$:
$$\hat{b}_T - \beta \to N\left(0, \mathrm{Asy.var}(\hat{b}_T)\right)$$
$$\mathrm{var}(\hat{b}_T) = \frac{1}{T} \left[E(x_t x_t')\right]^{-1} S \left[E(x_t x_t')\right]^{-1}$$
- The period-t residual of the OLS moment vector is correlated with k − 1 previous lags and k − 1 future lags.
- Spectral density matrix:
$$S = E\left[\sum_{j=-(k-1)}^{k-1} (x_t \varepsilon_{t,t+k})(x_{t-j} \varepsilon_{t-j,t-j+k})'\right]$$
Estimated covariance matrix
- Non-robust Hansen-Hodrick
- Robust Hansen-Hodrick
- Newey-West
Big problem with all: effectively few observations with which to estimate the covariance.
- Suggestion: re-write the numerator of the regression coefficient (the third equality uses stationarity):
$$\mathrm{cov}(r_{t,t+k}, D_t/P_t) = \mathrm{cov}(r_t + \cdots + r_{t+k-1}, D_t/P_t)$$
$$= \sum_{j=0}^{k-1} \mathrm{cov}(r_{t+j}, D_t/P_t)$$
$$= \sum_{j=0}^{k-1} \mathrm{cov}(r_t, D_{t-j}/P_{t-j})$$
$$= \mathrm{cov}\left(r_t, \sum_{j=0}^{k-1} D_{t-j}/P_{t-j}\right)$$
- Recommended regression:
$$r_{t,t+1} = \beta_0 + \beta_1 \left(D_t/P_t + D_{t-1}/P_{t-1} + \cdots + D_{t-k+1}/P_{t-k+1}\right) + \varepsilon_{t,t+1}$$
- Monte Carlo analysis suggests the asymptotic formulas for this regression work reasonably well.
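The covariance identity behind the recommended regression can be verified numerically. The sketch below is an illustration on simulated data, with an assumed persistent predictor `x` standing in for the dividend yield and assumed values for its persistence and the forecasting slope. It computes the covariance of the k-period return with the current predictor, and the covariance of the one-period return with the k-period sum of lagged predictors.

```python
import numpy as np

rng = np.random.default_rng(3)
n, k = 200_000, 12
rho, beta = 0.9, 0.1                  # hypothetical persistence and slope

eps = rng.standard_normal(n)
u = rng.standard_normal(n)

# Persistent predictor (stand-in for D/P) and one-period returns
x = np.empty(n)
x[0] = eps[0]
for t in range(1, n):
    x[t] = rho * x[t - 1] + eps[t]
r = np.empty(n)
r[0] = u[0]
r[1:] = beta * x[:-1] + u[1:]

def rolling_sum(a, window):
    """Entry i equals a[i] + ... + a[i + window - 1]."""
    c = np.concatenate(([0.0], np.cumsum(a)))
    return c[window:] - c[:-window]

long_ret = rolling_sum(r, k)          # r_t + ... + r_{t+k-1}, for t = 0 .. n-k
x_sum = rolling_sum(x, k)             # x_{t-k+1} + ... + x_t, for t = k-1 .. n-1

lhs = np.cov(long_ret, x[: n - k + 1])[0, 1]   # cov(k-period return, x_t)
rhs = np.cov(r[k - 1:], x_sum)[0, 1]           # cov(r_t, sum of lagged x)
print(lhs, rhs)
```

The two numbers differ only through end-of-sample effects, yet the second regression has non-overlapping errors, which is the point of the rewrite.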
Do returns have a time-varying conditional expectation?
Still an open question:
- Unstable results: everyone
- Long horizon irrelevance: Hodrick (1992)
- Statistical biases: Stambaugh (1999), Valkanov (2003), Campbell and Yogo (2006)
- Weak out-of-sample performance: Goyal and Welch (2003), Goyal and Welch (2006), Campbell and Thompson (2007)
‘The Dog That Did Not Bark: A Defense of Return
Predictability’ , Cochrane (2007)
- This paper is about the statistics of return forecastability.
- Big contribution.
- Cochrane exploits economic significance (point estimates) to improve statistical power (ability to reject a false null).
- Point estimates of return forecasts have large economic significance: the volatility of expected returns is about 5%, almost as large as the 7.7% level of the equity premium in this sample!
- No evidence of dividend growth predictability, and the estimates have the wrong sign.
- The statistical significance, however, is weak:
  - The dividend yield is very persistent: the return regression inherits near unit root properties.
  - Return shocks are negatively correlated with dividend yield shocks: estimates are upward biased and the t-statistic is biased towards rejection.
- We therefore ask: is predictability dead?
- No... there are more powerful statistical tests if we exploit economic information.
- Remembering equation (3):
$$\mathrm{Var}(p_t - d_t) = \mathrm{Cov}\left[p_t - d_t, \sum_{j=1}^{\infty} \rho^{j-1} \Delta d_{t+j}\right] - \mathrm{Cov}\left[p_t - d_t, \sum_{j=1}^{\infty} \rho^{j-1} r_{t+j}\right]$$
- Given that the price-dividend ratio is volatile, anyone who argues that returns are unpredictable must accept that dividend growth is predictable.
- Hence, a null hypothesis in which returns are not forecastable must also specify that dividend growth is predictable:
$H_0$: excess returns are unpredictable, which implies dividend growth is predictable.
- Use a first-order VAR representation of log returns, log dividend growth and the log dividend yield.
- Under the null, $E_t[\Delta d_{t+1}] = x_t = \phi x_{t-1} + \delta^{x}_t$ and $r_{t+1} = \varepsilon^{r}_{t+1}$. Specifying an AR(1) process for dividend yields, we can derive the first-order VAR:
$$r_{t+1} = a_r + b_r (d_t - p_t) + \varepsilon^{r}_{t+1} \qquad (6)$$
$$\Delta d_{t+1} = a_d + b_d (d_t - p_t) + \varepsilon^{d}_{t+1} \qquad (7)$$
$$d_{t+1} - p_{t+1} = a_{dp} + \phi (d_t - p_t) + \varepsilon^{dp}_{t+1} \qquad (8)$$
- Projecting on $d_t - p_t$, identity (1) implies that the regression coefficients satisfy:
$$b_r = 1 - \rho\phi + b_d \qquad (9)$$
- Identity (1) also links the error terms:
$$\varepsilon^{r}_{t+1} = \varepsilon^{d}_{t+1} - \rho\, \varepsilon^{dp}_{t+1} \qquad (10)$$
- Sensible economic priors: $\phi$ is non-explosive, $\phi < 1/\rho \approx 1.04$ (to rule out bubbles), and $\phi < 1$ if dividend yields do not have unit or larger roots.
- A powerful test draws regions in $\{b_r, \phi\}$ space around the null $\{b_r = 0, |\phi| < \bar{\phi}\}$.
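- We have two variables to simulate: shocks to dividend growth $\varepsilon^{d}_{t+1}$ and shocks to dividend yields $\varepsilon^{dp}_{t+1}$.
- The null hypothesis then takes the form:
$$\begin{bmatrix} d_{t+1} - p_{t+1} \\ \Delta d_{t+1} \\ r_{t+1} \end{bmatrix} = \begin{bmatrix} \phi \\ \rho\phi - 1 \\ 0 \end{bmatrix} (d_t - p_t) + \begin{bmatrix} \varepsilon^{dp}_{t+1} \\ \varepsilon^{d}_{t+1} \\ \varepsilon^{d}_{t+1} - \rho\, \varepsilon^{dp}_{t+1} \end{bmatrix} \qquad (11)$$
- The parameters are estimated above.
- Draw the first observation from $d_0 - p_0 \sim N\left(0, \sigma^2(\varepsilon^{dp}) / (1 - \phi^2)\right)$.
- Then simulate forward by drawing $\varepsilon^{d}_{t+1}$ and $\varepsilon^{dp}_{t+1}$ from random normals.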
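The simulation under the null can be sketched as follows. All parameter values are placeholders chosen for illustration (only ρ ≈ 0.96 comes from the text); they are not the estimates used in the paper. Each replication draws the two shocks, builds the dividend yield, dividend growth and returns from system (11), and re-estimates the three regression slopes.

```python
import numpy as np

def ols_slope(y, x):
    """OLS slope from regressing y on a constant and x."""
    xc = x - x.mean()
    return float(xc @ (y - y.mean()) / (xc @ xc))

def simulate_null(rho, phi, sigma_d, sigma_dp, corr, n, rng):
    """One sample from the null system: returns unpredictable, b_d = rho*phi - 1."""
    cov = [[sigma_d ** 2, corr * sigma_d * sigma_dp],
           [corr * sigma_d * sigma_dp, sigma_dp ** 2]]
    shocks = rng.multivariate_normal([0.0, 0.0], cov, size=n)
    eps_d, eps_dp = shocks[:, 0], shocks[:, 1]
    dp = np.empty(n + 1)
    dp[0] = rng.normal(0.0, sigma_dp / np.sqrt(1.0 - phi ** 2))
    for t in range(n):
        dp[t + 1] = phi * dp[t] + eps_dp[t]
    dd = (rho * phi - 1.0) * dp[:-1] + eps_d      # dividend growth
    r = eps_d - rho * eps_dp                      # returns
    return dp, dd, r

rng = np.random.default_rng(4)
rho, phi = 0.96, 0.94                             # phi is a hypothetical value
sigma_d, sigma_dp, corr = 0.14, 0.15, 0.0         # hypothetical shock moments
n, n_sims = 80, 2000

b_r = np.empty(n_sims)
b_d = np.empty(n_sims)
phi_hat = np.empty(n_sims)
for s in range(n_sims):
    dp, dd, r = simulate_null(rho, phi, sigma_d, sigma_dp, corr, n, rng)
    b_r[s] = ols_slope(r, dp[:-1])
    b_d[s] = ols_slope(dd, dp[:-1])
    phi_hat[s] = ols_slope(dp[1:], dp[:-1])

print(b_r.mean(), b_d.mean(), phi_hat.mean())
```

Because returns are built from the identity, the estimated coefficients satisfy $\hat{b}_r = 1 - \rho\hat{\phi} + \hat{b}_d$ in every replication, so only two of the three slopes carry independent information.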
- The marginal distribution of the return-forecast coefficient $b_r$ gives weak evidence against the unpredictable null: Monte Carlo produces coefficients larger than our sample estimate 22% of the time.
- However, this null must assume dividend growth is forecastable.
- Almost all simulations give a large negative dividend growth forecast: a result of the implied $b_d$'s from identity (1) and the economic restrictions on $\phi$.
- Dividend growth forecasting coefficients larger than our in-sample estimate are seen only 1.77% of the time.
- Thus the strongest evidence against the null comes from the lack of dividend growth predictability: 'the dog that did not bark'!
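- Again, from
$$\mathrm{Var}(p_t - d_t) = \mathrm{Cov}\left[p_t - d_t, \sum_{j=1}^{\infty} \rho^{j-1} \Delta d_{t+j}\right] - \mathrm{Cov}\left[p_t - d_t, \sum_{j=1}^{\infty} \rho^{j-1} r_{t+j}\right]$$
we can derive another powerful test.
- Dividing by $\mathrm{Var}(p_t - d_t)$, and noting that $b_r$ and $b_d$ are coefficients on $d_t - p_t$ (hence the change of sign):
$$1 = \beta\left(p_t - d_t, \sum_{j=1}^{\infty} \rho^{j-1} \Delta d_{t+j}\right) - \beta\left(p_t - d_t, \sum_{j=1}^{\infty} \rho^{j-1} r_{t+j}\right)$$
$$= \sum_{j=1}^{\infty} \rho^{j-1} \phi^{j-1} b_r - \sum_{j=1}^{\infty} \rho^{j-1} \phi^{j-1} b_d$$
$$= \frac{b_r}{1 - \rho\phi} - \frac{b_d}{1 - \rho\phi}$$
$$= b_r^{L} - b_d^{L}$$
- It turns out almost 100% of dividend yield volatility comes from return forecasts.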
What is the source of this power?
- Monte Carlo shows large $b_r$ are not uncommon.
- However, since $b_r$ and $\phi$ are negatively correlated, high $b_r$ typically occur with low $\phi$.
- Look at what we need to generate large long-horizon forecasts:
$$b_r^{L} = \frac{b_r}{1 - \rho\phi}$$
- The power of this test comes from the observed negative correlation between dividend yield shocks and realised return shocks: $\varepsilon^{dp}_{t+1}$ and $\varepsilon^{r}_{t+1}$.
- In short, the economic source of this correlation is that shocks to dividend yields (expected returns) are uncorrelated with shocks to ex-post dividend growth (or expected dividend growth).
- Return shocks and dividend yield shocks are thus strongly negatively correlated since there is no offsetting dividend growth effect.
- You see this easily from:
$$R_{t+1} = \frac{1 + P_{t+1}/D_{t+1}}{P_t/D_t} \cdot \frac{D_{t+1}}{D_t}$$
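A small numerical example shows why a high $b_r$ paired with a low $\phi$ does not deliver a large long-horizon coefficient. The values of `b_r` and `phi` below are hypothetical, chosen only to illustrate the formula; ρ = 0.96 is the value quoted earlier.

```python
def long_run_coef(b, rho, phi):
    """Long-horizon coefficient b^L = b / (1 - rho * phi)."""
    return b / (1.0 - rho * phi)

rho = 0.96
b_r = 0.10                                    # hypothetical one-year return coefficient

# The same b_r implies very different long-run coefficients as phi changes
sensitivity = {phi: long_run_coef(b_r, rho, phi) for phi in (0.80, 0.90, 0.94, 0.99)}

# Identity check: with b_d = b_r - (1 - rho*phi), the long-run coefficients differ by one
phi = 0.94
b_d = b_r - (1.0 - rho * phi)
gap = long_run_coef(b_r, rho, phi) - long_run_coef(b_d, rho, phi)
print(sensitivity, gap)
```

Since $1 - \rho\phi$ is small when the dividend yield is persistent, the long-run coefficient is very sensitive to $\phi$; a test on $b_r^{L}$ therefore uses the joint behaviour of $b_r$ and $\phi$ rather than $b_r$ alone.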
’Consumption, Aggregate Wealth, and Expected Stock
Returns’, Lettau and Ludvigson (2001)
The Budget Constraint I
- Take the 2-period budget constraint
$$W_{t+1} = (1 + R_{t+1})(W_t - C_t) \qquad (12)$$
where $C_t$ and $W_t$ denote time t consumption and the (tradable) total wealth portfolio (real assets plus human capital) respectively; $(1 + R_{t+1})$ is the gross return on the wealth portfolio between time t and t + 1.
Iterating the 2-period budget constraint forward and imposing the transversality condition we obtain
$$W_t = C_t + \sum_{i=1}^{\infty} \frac{C_{t+i}}{\prod_{j=1}^{i} (1 + R_{t+j})} \qquad (13)$$
Observe that:
- The expression relates current wealth and consumption to future returns (it holds both ex-post and ex-ante).
- It is "model-free".
- ... But it is hard to estimate empirically: it is highly non-linear and involves an unobservable variable ($W_t$ is a function of human capital).
Solution:
- Use a proxy for human capital
- Loglinearize
Human Capital Proxy
- The wealth portfolio is the sum of asset holdings $A_t$ (observable) and human capital $H_t$ (unobservable):
$$W_t = A_t + H_t \qquad (14)$$
- How can we measure $H_t$?
  - $H_t$ is the stock of competencies, skills, and knowledge that enhance the productivity of labor.
  - ⇒ You can think of the wage $Y_t$ as the "dividend" paid on your human capital investment.
  - ⇒ $H_t = \sum_{i=1}^{\infty} Y_{t+i} / \prod_{j=1}^{i} (1 + R_{h,t+j})$
- A log-linearization of the above expression yields
$$h_t = k + y_t + z_t \qquad (15)$$
where lower-case letters denote the logs of the corresponding upper-case variables, k is a constant, and $z_t$ is a stationary random variable.
The Budget Constraint II
- Take the 2-period budget constraint (12)
- Loglinearize around the steady-state value of the consumption-wealth ratio (assumed to exist!)
- Solve the difference equation forward
- Impose the transversality condition
- Use the proxy for human capital introduced in the previous slide
- Drop all constants
and you obtain...
$$c_t - \omega a_t - (1 - \omega) y_t = E_t\left[\sum_{i=1}^{\infty} \rho^{i} \left(\omega r_{a,t+i} + (1 - \omega) r_{h,t+i} - \Delta c_{t+i}\right)\right] + (1 - \omega) z_t \qquad (16)$$
where, as usual, lower-case letters denote logs, $\omega$ equals the average share of asset holdings in total wealth, $A/W$, and $\rho$ is the steady-state ratio of new investment to total wealth, $(W - C)/W$.
Observe :
- Equation (16) is model-free.
- It is linear and relates current, observable variables to future consumption and total wealth growth.
- Given the stationarity of the RHS variables, it says that the LHS variables are co-integrated.
- The LHS (henceforth $cay_t$) gives the deviation from the common trend of $c_t$, $a_t$, and $y_t$.
- $cay_t$ is a good proxy for market expectations of future asset returns, future human capital returns, consumption growth, or any linear combination of these.
- Why does $cay_t$ have predictive power? Forward-looking investors try to maintain a flat consumption path inter-temporally and therefore smooth transitory movements in asset wealth arising from time variation in expected returns.
Observe the tight link with the Campbell and Shiller decomposition:
$$d_t - p_t = E_t\left[\sum_{i=1}^{\infty} \rho_a^{i} (r_{a,t+i} - \Delta d_{t+i})\right] \qquad (17)$$
where $\rho_a = P/(P + D)$.
- $d_t$ plays the same role as $c_t$ in (16): if dividends are high relative to prices, agents expect high future returns, low dividend growth, or both.
- Which linearisation should we use? What do we know better: the determinants of consumption or those of dividends?
- If we allow for labor income or a saving technology, the tight link between the two is broken.
Cointegration Estimates and $\widehat{cay}_t$
LL01 estimate the co-integration parameters following Stock and Watson (1993):
$$c_{n,t} = \underset{(7.96)}{0.61} + \underset{(11.70)}{0.31}\, a_t + \underset{(23.92)}{0.59}\, y_t$$
where values in parentheses are corrected t-statistics.
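Given the estimated cointegrating vector, the trend deviation is just the residual of that relation. The sketch below uses the point estimates quoted above; the function name `cay_hat` and the input series are hypothetical, and the inputs are assumed to be logs of consumption, asset wealth and labor income measured in the same units as in the estimation.

```python
import numpy as np

# Point estimates quoted in the text
CONST, COEF_A, COEF_Y = 0.61, 0.31, 0.59

def cay_hat(c, a, y):
    """Residual of the cointegrating relation: c_t - 0.61 - 0.31 a_t - 0.59 y_t."""
    c, a, y = (np.asarray(v, dtype=float) for v in (c, a, y))
    return c - CONST - COEF_A * a - COEF_Y * y

# Hypothetical log series: consumption sits above trend in the second period
a = np.array([10.0, 10.1, 10.3])
y = np.array([9.0, 9.05, 9.1])
c_on_trend = CONST + COEF_A * a + COEF_Y * y
c = c_on_trend + np.array([0.0, 0.02, -0.01])
deviation = cay_hat(c, a, y)
print(deviation)
```

A positive value means consumption is high relative to its common trend with assets and labor income, which in this framework signals high expected future returns.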
Quarterly Forecasting Regressions
- Lettau and Ludvigson perform a series of out-of-sample tests and verify that their findings hold out-of-sample (see section V of LL01 for details).
- They find that the same forecasting relationships hold for longer horizons: $\widehat{cay}_t$ has forecasting power for future asset returns and not for consumption/labor income growth.
- The predictive power of $\widehat{cay}_t$ is hump-shaped and peaks around one year.
Conclusion
- Fluctuations in the log consumption to wealth ratio predict stock market returns and excess returns over short and intermediate horizons.
- Why? If returns are expected to decline in the future, consumption-smoothing investors will allow consumption to dip temporarily from its long-term relationship with both assets and labor income (and vice versa).
- Important policy consequence: large swings in financial assets need not be associated with large subsequent movements in consumption (e.g. after a stock market boom, consumption might remain low relative to wealth if agents factor in the expectation of lower returns in the future).