Recursive Stock Picking with Decision Trees - SFB 649

Recursive Stock Picking with Decision Trees
Anton Andriyashin
C.A.S.E. — Center for Applied Statistics
and Economics
Chair of Statistics
Humboldt-Universität zu Berlin
http://ise.wiwi.hu-berlin.de
Motivation
1-2
Economic Modeling vs Object Recognition
Stock picking frequently comes as the result of rigorous asset
pricing models and simulations of dynamic general equilibria.
• Economic agent behavior is to be limited by a certain
objective functional and relevant budget constraints
• Agents are supposed to behave rationally on the market
Object recognition restricts the class of equilibrium trajectories and
may not impose additional modeling assumptions.
• No explicit model is assumed
• Solution class is restrained (e.g. quadratic functions class)
X1 < 0.028
(42,42)
Recursive Stock Picking with Decision Trees
CLASS -1
(13,35)
X2 < 0.057
(29,7)
CLASS -1
(16,6)
CLASS 1
(13,1)
Motivation
1-3
Basic Pricing Equation
u(·) – utility function, ct – consumption at time t, pt – stock price
at time t, dt+1 – dividends paid at time t, xt+1 – next period
payoff.
max u(ct ) + Et [βu(ct+1 )] s.t.
ξ
ct = et − pt ξ,
ct+1 = et+1 + xt+1 ξ,
xt+1 = pt+1 + dt+1
0
t+1 )
Solution: pt = Et [β uu(c
0 (c ) xt+1 ]
t
X1 < 0.028
(42,42)
Recursive Stock Picking with Decision Trees
CLASS -1
(13,35)
X2 < 0.057
(29,7)
CLASS -1
(16,6)
CLASS 1
(13,1)
Motivation
1-4
Bloomberg Report
The benchmark DAX Index slid 42.16, or 0.7 percent, to 6012.56
at 9:46 a.m. in Frankfurt. ”European stocks are set to take some
hits from the weakening dollar, prospects for interest rates and
oil prices near records”, said Christian Wrede. [...] Credit Suisse
said investors should cut their holdings in continental European
equities, citing the rising euro and higher interest rates. Oil
prices added 1.7 percent yesterday to $73.32 a barrel in New York
after climbing as high as $73.90. Futures slipped back to $72.76
today.
Factors: euro/dollar exchange rate, interest rates, and oil spot
and future prices.
X1 < 0.028
(42,42)
Recursive Stock Picking with Decision Trees
CLASS -1
(13,35)
X2 < 0.057
(29,7)
CLASS -1
(16,6)
CLASS 1
(13,1)
Motivation
1-5
Relevant Studies in the Area
SmithBarney (1999): average return – 19.62% (annualized),
st. deviation – 11.96% and Sharpe ratio – 1.23
JPMorgan (2003): average return – 14.60% (annualized),
st. deviation – 9.50% and Sharpe ratio – 1.54
X1 < 0.028
(42,42)
Recursive Stock Picking with Decision Trees
CLASS -1
(13,35)
X2 < 0.057
(29,7)
CLASS -1
(16,6)
CLASS 1
(13,1)
Problem Statement
2-6
What is a Decision Tree?
X1 ≤ 0.5
1
0.9
GREEN
0.8
Blue
0.7
X2 ≤ 0.5
X2
0.6
X1 ≤ 0.75
0.5
PURPLE
0.4
Green
BLACK
0.3
BLUE
Black
0.2
0.1
0
X2 ≤ 0.25
YELLOW
0
0.1
0.2
0.3
0.4
0.5
X1
0.6
0.7
0.8
0.9
1
Yellow
Purple
X1 < 0.028
(42,42)
Recursive Stock Picking with Decision Trees
CLASS -1
(13,35)
X2 < 0.057
(29,7)
CLASS -1
(16,6)
CLASS 1
(13,1)
Problem Statement
2-7
Defining Stock Classes
Three stock classes are assumed at the moment, each class
corresponds to one of possible stock positions:
• short,
• neutral,
• long
Let Pt be the current stock price, Rt =
positive threshold.
Pt −Pt−1
Pt−1 ,
let R̄ be some
If Rt > R̄, the stock is assumed to be undervalued at t − 1
If Rt < −R̄, the stock is assumed to be overvalued at t − 1
If −R̄ ≤ Rt ≤ R̄, the stock is assumed to be fairly priced at t − 1
X1 < 0.028
(42,42)
Recursive Stock Picking with Decision Trees
CLASS -1
(13,35)
X2 < 0.057
(29,7)
CLASS -1
(16,6)
CLASS 1
(13,1)
Problem Statement
2-8
Setting Up a Classification Problem – I
If t is the current time period, then Rt+1 – the stock price
movement at t + 1 – is of one’s interest.
For fixed t:
Let X ∈ Rp be the factor space of explanatory variables
(fundamental, technical, macroeconomic indicators and stock
prices).
Let Y ∈ R1 be the vector of stock classes.
∀t : Yt ∈ {Short, Neutral, Long }.
X1 < 0.028
(42,42)
Recursive Stock Picking with Decision Trees
CLASS -1
(13,35)
X2 < 0.057
(29,7)
CLASS -1
(16,6)
CLASS 1
(13,1)
Problem Statement
2-9
Setting Up a Classification Problem – II
From known market data one can obtain:
{X0 , X1 } → Y0
...
{Xt−2 , Xt−1 } → Yt−2
{Xt−1 , Xt } → Yt−1
Stock picking implies unknown next period price:
?
{Xt }→Yt
X1 < 0.028
(42,42)
Recursive Stock Picking with Decision Trees
CLASS -1
(13,35)
X2 < 0.057
(29,7)
CLASS -1
(16,6)
CLASS 1
(13,1)
Problem Statement
2-10
XETRA DAX Example: BMW
ΔDiv.Yield < 0.006
(28,32)
Div.Yield < 1.445
(9,23)
CPI < 103.350
(6,4)
Class BUY
(1,4)
Class SELL
(5,0)
Ext.Debt < 79177.000
(19,9)
ΔExt.Debt < 0.092
(3,19)
Class BUY
(0,16)
Class BUY
(0,3)
Class SELL
(3,3)
ΔNat.Reserves < -0.031
(19,6)
Class BUY
(0,2)
Ext.Debt < 103331.500
(19,4)
Div.Yield < 1.915
(17,1)
Class SELL
(16,0)
Class BUY
(2,3)
Class SELL
(1,1)
X1 < 0.028
(42,42)
Recursive Stock Picking with Decision Trees
CLASS -1
(13,35)
X2 < 0.057
(29,7)
CLASS -1
(16,6)
CLASS 1
(13,1)
Problem Statement
2-11
How to Pick the Stock?
Market data from the past along with stock prices constitute the
learning sample. Individual stock price movements are predicted
separately using different trees.
Major steps:
• use the data from the past to build the decision tree,
• input the available data from today (Xt ) into the tree,
• get the prediction of the class of Rt+1 =
Pt+1 −Pt
Pt
The estimation is performed in the class of orthogonal step
functions
X1 < 0.028
(42,42)
Recursive Stock Picking with Decision Trees
CLASS -1
(13,35)
X2 < 0.057
(29,7)
CLASS -1
(16,6)
CLASS 1
(13,1)
Introduction to Decision Trees
3-12
Parent and Child Nodes
Let tP be the parent node, tL and tR – children nodes.
Node tP
Node tL
Node tR
Let nP be the number of observations in tP , nL and nR – in tL and
tR respectively.
pL = 1 − pR =
p(j|t) =
nL
,
nP
nt (j)
nt
X1 < 0.028
(42,42)
Recursive Stock Picking with Decision Trees
CLASS -1
(13,35)
X2 < 0.057
(29,7)
CLASS -1
(16,6)
CLASS 1
(13,1)
Introduction to Decision Trees
3-13
Impurity Function
Node homogeneity is measured by the impurity function i(t).
∆i(t) = i(tP ) − E {i(tC )}
where E (·) is the expected value operator:
E {i(tC )} = pL i(tL ) + pR i(tR ) and C = {L, R}.
s ∗ – optimal split: question variable and its value
s ∗ = argmin {pL i(tL ) + pR i(tR )}
s
X1 < 0.028
(42,42)
Recursive Stock Picking with Decision Trees
CLASS -1
(13,35)
X2 < 0.057
(29,7)
CLASS -1
(16,6)
CLASS 1
(13,1)
Introduction to Decision Trees
3-14
Getting the Right Values for the Tree
Questions
To get s ∗ , i(t) must be explicitly defined.
Gini index:
J X
J
X
i(t) =
p(k|t)p(l|t)
k6=l l6=k
k=1 l=1
where k, l = 1, J are class indices.


J
J
J
 X

X
X
s ∗ = argmax −
p 2 (j|t) + pL
p 2 (j|tL ) + pR
p 2 (j|tR )


s
j=1
j=1
j=1
X1 < 0.028
(42,42)
Recursive Stock Picking with Decision Trees
CLASS -1
(13,35)
X2 < 0.057
(29,7)
CLASS -1
(16,6)
CLASS 1
(13,1)
Introduction to Decision Trees
3-15
The Challenge of Optimum Tree Size
Maximum tree TMAX – there are no observations left to split.
• easy to build,
• perfect in-sample prediction rate,
• poor out-of-sample rate: overfitting
Too small and primitive trees lead to underfitting.
Therefore, some tradeoff is required.
X1 < 0.028
(42,42)
Recursive Stock Picking with Decision Trees
CLASS -1
(13,35)
X2 < 0.057
(29,7)
CLASS -1
(16,6)
CLASS 1
(13,1)
Introduction to Decision Trees
3-16
Change of Impurity Rule
Since the change of impurity value is maximized at every step,
select those node that lead only to significant fluctuations of i(t).
Enable split s, if for a given node t
∆i(s, t) ≥ β̄
However, the following challenges do arise:
• ∆i(s, t) is isually not a monotone function
• how should one define β̄?
X1 < 0.028
(42,42)
Recursive Stock Picking with Decision Trees
CLASS -1
(13,35)
X2 < 0.057
(29,7)
CLASS -1
(16,6)
CLASS 1
(13,1)
Introduction to Decision Trees
3-17
V -fold Cross-Validation Measure of Tree
Quality – I
Define L\Lv , ∀v = 1, V as the training set and Lv as the test set
where L is the learning sample itself. For a given classification rule
dv derived from the learning set L\Lv , its quality can be
represented in the following way:
1
E 1 d (v ) =
Nv
X
I d (v ) (ln ) 6= jn
(ln ,jn )∈Lv
where ln is a test set observation with class jn , Nv – the number of
observations in Lv and E 1 d (v ) is the one-iteration internal
misclassification error estimate.
X1 < 0.028
(42,42)
Recursive Stock Picking with Decision Trees
CLASS -1
(13,35)
X2 < 0.057
(29,7)
CLASS -1
(16,6)
CLASS 1
(13,1)
Introduction to Decision Trees
3-18
V -fold Cross-Validation Measure of Tree
Quality – II
Note that none of the observations from Lv was engaged during
construction of decision rule d (v ) .
E CV (d) =
V
1 X 1 (v ) E d
V
v =1
where E CV (d) is the V -fold cross-validation measure of tree
quality.
This brute force approach is, however, rather slow.
X1 < 0.028
(42,42)
Recursive Stock Picking with Decision Trees
CLASS -1
(13,35)
X2 < 0.057
(29,7)
CLASS -1
(16,6)
CLASS 1
(13,1)
Introduction to Decision Trees
3-19
Internal Misclassification Error
Define internal misclassification error of an arbitrary observation at
node t as
e(t) = 1 − max p( j| t)
j
Let us define also
E (t) = e(t)p(t)
Then internal misclassification tree error is
X
E (T ) =
E (t)
t∈T̃
where T̃ is a set of terminal nodes.
X1 < 0.028
(42,42)
Recursive Stock Picking with Decision Trees
CLASS -1
(13,35)
X2 < 0.057
(29,7)
CLASS -1
(16,6)
CLASS 1
(13,1)
Introduction to Decision Trees
3-20
Cost-Complexity Function
Let T̃ be the number of terminal nodes – the measure of
complexity of tree T .
Cost-complexity function is defined as:
Eα (T ) = E (T ) + α T̃ where α ≥ 0 is the complexity parameter and α T̃ is the cost
component.
X1 < 0.028
(42,42)
Recursive Stock Picking with Decision Trees
CLASS -1
(13,35)
X2 < 0.057
(29,7)
CLASS -1
(16,6)
CLASS 1
(13,1)
Introduction to Decision Trees
3-21
Cost-Complexity Function Optimization
Although α can have infinite number of values, the number of
subtrees of TMAX resulting in minimization of Eα (T ) is finite.
Pruning of TMAX leads to the creation of subtree sequence
T1 , T2 , T3 , . . . with a decreasing number of terminal nodes.
It can be shown that ∀α ≥ 0 there exists an optimal tree T (α) in
the sense that
i
h
1. Eα {T (α)} = min Eα (T ) = min E (T ) + α T̃ T ≤TMAX
T ≤TMAX
2. If Eα (T ) = Eα (T (α)) then T (α) ≤ T .
X1 < 0.028
(42,42)
Recursive Stock Picking with Decision Trees
CLASS -1
(13,35)
X2 < 0.057
(29,7)
CLASS -1
(16,6)
CLASS 1
(13,1)
Introduction to Decision Trees
3-22
Cross-Validation and the Optimum Tree
This way one can get a sequence of optimal subtrees
TMAX T1 T2 T3 . . . {t0 }.
The sequence {αk } is increasing. For k ≥ 1: αk ≤ α < αk+1 and
T (α) = T (αk ) = Tk and α1 = 0.
Applying the method of V -fold cross-validation to the sequence
TMAX T1 T2 T3 . . . {t0 }, the optimal tree is then
determined.
Performance gain: sequence of optimal subtrees vs all possible
subtrees.
X1 < 0.028
(42,42)
Recursive Stock Picking with Decision Trees
CLASS -1
(13,35)
X2 < 0.057
(29,7)
CLASS -1
(16,6)
CLASS 1
(13,1)
Introduction to Decision Trees
3-23
Cross-Validation Live Example: BMW
ΔDiv.Yield < 0.006
(28,32)
Class BUY
(9,23)
CPI < 103.350
(6,4)
Class BUY
(1,4)
Class SELL
(5,0)
Class SELL
(19,9)
ΔExt.Debt < 0.092
(3,19)
Class BUY
(0,16)
Class BUY
(0,3)
Class SELL
(3,3)
ΔNat.Reserves < -0.031
(19,6)
Class BUY
(0,2)
Ext.Debt < 103331.500
(19,4)
Div.Yield < 1.915
(17,1)
Class SELL
(16,0)
Class BUY
(2,3)
Class SELL
(1,1)
X1 < 0.028
(42,42)
Recursive Stock Picking with Decision Trees
CLASS -1
(13,35)
X2 < 0.057
(29,7)
CLASS -1
(16,6)
CLASS 1
(13,1)
XETRA DAX Backtesting
4-24
Data
Two different setups – with and without Implied Volatility variable
– are considered.
• XETRA DAX companies;
• November 27, 2000 – October 20, 2003;
• normalized time scale: weekly observations;
• stock prices, technical and fundamental indicators from the
past
X1 < 0.028
(42,42)
Recursive Stock Picking with Decision Trees
CLASS -1
(13,35)
X2 < 0.057
(29,7)
CLASS -1
(16,6)
CLASS 1
(13,1)
XETRA DAX Backtesting
4-25
Available Variables
Indicator
Close Price
Momentum
Stochastic
Type
Technical
Technical
Frequency
1 day
1 day
1 day
Description
Closing price of a stock
Mt = Pt − Pt−T , T = 20
Pt −PL
PH −PL , PH = max(Pt ), PL = min(Pt )
MA
MA St. Error
MACD
Technical
Technical
Technical
1 day
1 day
1 day
ROC
TRIX
BV
CF
Dividends paid
EBITDA
Technical
Technical
Fundamental
Fundamental
Fundamental
Fundamental
1
1
1
1
1
1
EPS
Outst
Sales
ImplVola
Fundamental
Fundamental
Fundamental
Fundamental
1 month
3-6 months
1 month
1 day
i
MA(T ) = i=t−T
, T = 12
T
Standard deviation of MA
(1 − nn21 ){MA(n1 ) − MA(n2 − n1 )}
n1 = 12, n2 = 26
Pt
Pt−T , T = 10
Triple exponentially smoothed MA
Book Value
Cash Flow
Earnings Before Income Tax and
Depreciation
Earnings per Share
Number of outstanding stocks
Implied volatility
day
day
month
month
month
month
P
t
P
X1 < 0.028
(42,42)
Recursive Stock Picking with Decision Trees
CLASS -1
(13,35)
X2 < 0.057
(29,7)
CLASS -1
(16,6)
CLASS 1
(13,1)
XETRA DAX Backtesting
4-26
Major Steps
Model testing implies the following three stages:
• learning period (November 27, 2000 – June 24, 2001);
• calibration period – R̄ (June 24, 2001 – October 22, 2001);
• trading period (October 22, 2001 – October 20, 2003).
Transaction costs are accounted in the amount of 10 b.p. per
active transaction.
At the end of every period all active positions are closed.
X1 < 0.028
(42,42)
Recursive Stock Picking with Decision Trees
CLASS -1
(13,35)
X2 < 0.057
(29,7)
CLASS -1
(16,6)
CLASS 1
(13,1)
XETRA DAX Backtesting
4-27
Equally-Weighted Portfolio
Portfolio returns are calculated basing on individual stock returns.
If N is the number of stocks in the portfolio, then portfolio return
Π is defined as follows:
Π=
N
X
ωs R̂s
s=1
where ωs is the weight associated with s-th stock in the portfolio
and R̂s is the realized stock yield following the recommended
position.
Given that no obviously relevant weighting scheme is available in
this context, ∀s : ωs = N1 .
X1 < 0.028
(42,42)
Recursive Stock Picking with Decision Trees
CLASS -1
(13,35)
X2 < 0.057
(29,7)
CLASS -1
(16,6)
CLASS 1
(13,1)
XETRA DAX Backtesting
4-28
Wealth Curves – I: Vola Excluded
Active CART Portfolio (Vola Excluded) Performance Benchmarking
1.6
1.4
Cumulative Yield
1.2
1
0.8
0.6
0.4
0.2
28−Jan−2002
DAX30
LIBOR 90 days
DJJA
FTSE100
Active CART
08−Feb−2002
09−Dec−2002
Date
19−May−2003
20−Oct−2003
X1 < 0.028
(42,42)
Recursive Stock Picking with Decision Trees
CLASS -1
(13,35)
X2 < 0.057
(29,7)
CLASS -1
(16,6)
CLASS 1
(13,1)
XETRA DAX Backtesting
4-29
Wealth Curves – II: Vola Included
Active CART Portfolio (Vola Included) Performance Benchmarking
1.6
1.4
Cumulative Yield
1.2
1
0.8
0.6
0.4
DAX30
LIBOR 90 days
DJJA
FTSE100
Active CART
0.2
22−Oct−2001 06−May−2002 28−Oct−2002 29−Apr−2003 20−Oct−2003
Date
X1 < 0.028
(42,42)
Recursive Stock Picking with Decision Trees
CLASS -1
(13,35)
X2 < 0.057
(29,7)
CLASS -1
(16,6)
CLASS 1
(13,1)
XETRA DAX Backtesting
4-30
Weekly Returns – I: Vola Excluded
Active CART Strategy Weekly Yields: Vola Excluded
0.1
0.08
0.06
Weekly Yield
0.04
0.02
0
−0.02
−0.04
−0.06
−0.08
28−Jan−2002
08−Feb−2002
09−Dec−2002
Date
19−May−2003
20−Oct−2003
X1 < 0.028
(42,42)
Recursive Stock Picking with Decision Trees
CLASS -1
(13,35)
X2 < 0.057
(29,7)
CLASS -1
(16,6)
CLASS 1
(13,1)
XETRA DAX Backtesting
4-31
Weekly Returns – II: Vola Included
Active CART Strategy Weekly Yields: Vola Included
0.1
0.08
0.06
Weekly Yield
0.04
0.02
0
−0.02
−0.04
−0.06
−0.08
22−Oct−2001 06−May−2002 28−Oct−2002 29−Apr−2003 20−Oct−2003
Date
X1 < 0.028
(42,42)
Recursive Stock Picking with Decision Trees
CLASS -1
(13,35)
X2 < 0.057
(29,7)
CLASS -1
(16,6)
CLASS 1
(13,1)
XETRA DAX Backtesting
4-32
Sharpe Ratio Methodology
Annualized mean portfolio return and its standard deviation given
the time period of T weeks:
T
P
Π̄ann =
σ̄(Π)ann
Πi
t=1
T
52
v
u
T
u1 X
2 √
t
Πi − Π̄ann
=
52
T
t=1
Given the risk-free rate Rf , Sharpe ratio is then defined as follows:
SR =
Π̄ann − Rf
σ̄(Π)ann
X1 < 0.028
(42,42)
Recursive Stock Picking with Decision Trees
CLASS -1
(13,35)
X2 < 0.057
(29,7)
CLASS -1
(16,6)
CLASS 1
(13,1)
XETRA DAX Backtesting
4-33
Simulation Results – III: Financial Indicators
Vola Excluded
Sharpe ratio
Mean relative weekly
Skewness
Kurtosis
Risk-free rate (Avg. LIBOR)
Value
0.88
19.99%
0.63
6.69
0.04
Vola Included
Sharpe ratio
Mean relative weekly
Skewness
Kurtosis
Risk-free rate (Avg. LIBOR)
Value
1.59
25.55%
1.00
5.68
0.04
X1 < 0.028
(42,42)
Recursive Stock Picking with Decision Trees
CLASS -1
(13,35)
X2 < 0.057
(29,7)
CLASS -1
(16,6)
CLASS 1
(13,1)
BNS: Alternative Tree Pruning
5-34
Cross-Validation Live Example: BMW
ΔDiv.Yield < 0.006
(28,32)
Class BUY
(9,23)
CPI < 103.350
(6,4)
Class BUY
(1,4)
Class SELL
(5,0)
Class SELL
(19,9)
ΔExt.Debt < 0.092
(3,19)
Class BUY
(0,16)
Class BUY
(0,3)
Class SELL
(3,3)
ΔNat.Reserves < -0.031
(19,6)
Class BUY
(0,2)
Ext.Debt < 103331.500
(19,4)
Div.Yield < 1.915
(17,1)
Class SELL
(16,0)
Class BUY
(2,3)
Class SELL
(1,1)
X1 < 0.028
(42,42)
Recursive Stock Picking with Decision Trees
CLASS -1
(13,35)
X2 < 0.057
(29,7)
CLASS -1
(16,6)
CLASS 1
(13,1)
BNS: Alternative Tree Pruning
5-35
BNS Live Example: BMW
ΔDiv.Yield < 0.006
(28,32)
Div.Yield < 1.445
(9,23)
CPI < 103.350
(6,4)
Class BUY
(1,4)
Class SELL
(5,0)
Ext.Debt < 79177.000
(19,9)
ΔExt.Debt < 0.092
(3,19)
Class BUY
(0,16)
Class BUY
(0,3)
Class SELL
(3,3)
ΔNat.Reserves < -0.031
(19,6)
Class BUY
(0,2)
Ext.Debt < 103331.500
(19,4)
Div.Yield < 1.915
(17,1)
Class SELL
(16,0)
Class BUY
(2,3)
Class SELL
(1,1)
X1 < 0.028
(42,42)
Recursive Stock Picking with Decision Trees
CLASS -1
(13,35)
X2 < 0.057
(29,7)
CLASS -1
(16,6)
CLASS 1
(13,1)
BNS: Alternative Tree Pruning
5-36
BNS – Best Node Strategy
Divides nodes into pure/impure i.e. reliable/unreliable groups.
• allows to prune not only the whole branch, but one leaf if
necessary;
• does not employ cross-validation;
• needs two extra parameters to be calibrated
p̄ – the minimum allowed probability of the dominating class in a
pure node (usually 75% or more)
n̄ – the minimum allowed quantity of observations in a pure node
(usually 10% or more of the learning sample size)
X1 < 0.028
(42,42)
Recursive Stock Picking with Decision Trees
CLASS -1
(13,35)
X2 < 0.057
(29,7)
CLASS -1
(16,6)
CLASS 1
(13,1)