Project 4. Statistical Arbitrage
MS&E 444 Investment Practice
Spring 2010
Jeff Blokker
Emile Chamoun
Ibrahim Jreige
Paris Georgoudis
Sameh Galal
MS&E 444 Kay Giesecke, April 7 2010
Factor Model
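• Statistical Arbitrage
– A standard model for the dynamics of the stock price is
$$\frac{dS_t}{S_t} = \mu\,dt + d\varepsilon_t$$
– This model can be enhanced by expanding the noise term $\varepsilon_t$:
$$\frac{dS_t}{S_t} = \mu\,dt + \sum_{j=1}^{p} \beta_j F_t^{(j)} + d\varepsilon_t$$
$$d(\log S_t) = \mu\,dt + \sum_{j=1}^{p} \beta_j F_t^{(j)} + d\varepsilon_t = \mu\,dt + \boldsymbol{\beta}\mathbf{F}_t + d\varepsilon_t$$
– where $F_t^{(j)}$ are risk factors associated with the market
– In discrete time
$$r_i = \log P_i - \log P_{i-1} = \mu\,\Delta t + \boldsymbol{\beta}\mathbf{F}_i + \varepsilon_i$$
– Assume that $E(\mathbf{F}) = 0$, $\mathrm{cov}(\mathbf{F}) = I$, $E(\boldsymbol{\varepsilon}) = 0$, and that $\mathbf{F}$ and $\boldsymbol{\varepsilon}$ are independent.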
Covariance of Log Returns
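– If we have n observations and p factors:
$$\begin{aligned}
r_1 &= \mu_1 \Delta t + \beta_{11} F_t^{(1)} + \beta_{12} F_t^{(2)} + \dots + \beta_{1p} F_t^{(p)} + \varepsilon_1 \\
r_2 &= \mu_2 \Delta t + \beta_{21} F_t^{(1)} + \beta_{22} F_t^{(2)} + \dots + \beta_{2p} F_t^{(p)} + \varepsilon_2 \\
&\;\;\vdots \\
r_n &= \mu_n \Delta t + \beta_{n1} F_t^{(1)} + \beta_{n2} F_t^{(2)} + \dots + \beta_{np} F_t^{(p)} + \varepsilon_n
\end{aligned}$$
– Or in matrix form $\mathbf{r} = \boldsymbol{\mu}\Delta t + \boldsymbol{\beta}\mathbf{F} + \boldsymbol{\varepsilon}$
– Using $(\mathbf{r} - \boldsymbol{\mu}\Delta t)(\mathbf{r} - \boldsymbol{\mu}\Delta t)^T = \boldsymbol{\beta}\mathbf{F}(\boldsymbol{\beta}\mathbf{F})^T + \boldsymbol{\varepsilon}(\boldsymbol{\beta}\mathbf{F})^T + \boldsymbol{\beta}\mathbf{F}\boldsymbol{\varepsilon}^T + \boldsymbol{\varepsilon}\boldsymbol{\varepsilon}^T$:
$$\begin{aligned}
\mathrm{cov}(\mathbf{r}) &= E\left[(\mathbf{r} - \boldsymbol{\mu}\Delta t)(\mathbf{r} - \boldsymbol{\mu}\Delta t)^T\right] \\
&= \boldsymbol{\beta} E(\mathbf{F}\mathbf{F}^T) \boldsymbol{\beta}^T + E(\boldsymbol{\varepsilon}\mathbf{F}^T)\boldsymbol{\beta}^T + \boldsymbol{\beta} E(\mathbf{F}\boldsymbol{\varepsilon}^T) + E(\boldsymbol{\varepsilon}\boldsymbol{\varepsilon}^T) \\
&= \boldsymbol{\beta}\boldsymbol{\beta}^T + \boldsymbol{\Psi}
\end{aligned}$$
– The cross terms vanish because $\mathbf{F}$ and $\boldsymbol{\varepsilon}$ are independent with zero mean, $E(\mathbf{F}\mathbf{F}^T) = I$, and $\boldsymbol{\Psi} = E(\boldsymbol{\varepsilon}\boldsymbol{\varepsilon}^T)$.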
Principal Component Analysis
• Principal Component Analysis
– Spectral decomposition of the matrix
$$\mathrm{cov}(\mathbf{r}) = \boldsymbol{\beta}\boldsymbol{\beta}^T + \boldsymbol{\Psi} = \sum_{i=1}^{p} \lambda_i \mathbf{e}_i \mathbf{e}_i^T + \boldsymbol{\Psi}$$
where $(\lambda_i, \mathbf{e}_i)$ are the eigenvalue-eigenvector pairs
• Noise Reduction
– We can approximate the model with a limited set of $m < p$ eigenvectors, or principal components
$$r_i = \mu\,dt + \sum_{j=1}^{m} \beta_j F_t^{(j)} + d\hat{\varepsilon}_t$$
– Using the eigenvectors with the largest eigenvalues keeps the components that contribute most to the variance in the data
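The truncation to the m largest components can be illustrated with a small pure-Python sketch. It is not the method used in the project; it uses power iteration with deflation, which is only one simple way to obtain the leading eigenpairs of a symmetric covariance matrix. The function names, the iteration count and the 2x2 example matrix are assumptions made for illustration.

```python
import math
import random

def mat_vec(a, v):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) for row in a]

def top_eigenpairs(cov, m, iters=500, seed=0):
    """Return the m largest (eigenvalue, eigenvector) pairs of a symmetric
    positive semi-definite matrix via power iteration with deflation."""
    rng = random.Random(seed)
    n = len(cov)
    work = [row[:] for row in cov]          # working copy that gets deflated
    pairs = []
    for _ in range(m):
        # Random start vector, so it is almost surely not orthogonal
        # to the eigenvector we are looking for.
        v = [rng.gauss(0.0, 1.0) for _ in range(n)]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
        for _ in range(iters):
            w = mat_vec(work, v)
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        lam = sum(vi * wi for vi, wi in zip(v, mat_vec(work, v)))
        pairs.append((lam, v))
        # Deflation: remove the component just found from the matrix.
        for i in range(n):
            for j in range(n):
                work[i][j] -= lam * v[i] * v[j]
    return pairs

# Hypothetical 2x2 covariance matrix, for illustration only.
example_cov = [[2.0, 1.0], [1.0, 2.0]]
example_pairs = top_eigenpairs(example_cov, 2)
total_variance = sum(example_cov[i][i] for i in range(2))
explained_by_first = example_pairs[0][0] / total_variance
print(example_pairs[0][0], explained_by_first)
```

The ratio of the leading eigenvalues to the trace gives the fraction of variance retained by the first m components, which is one way to choose m.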
Stability of Principal Components
• Comparison of the Stability/Evolution of the PCA
– 30-day initial data sample, moved forward one day at a time.
– The 10 largest eigenvectors are compared to those of the first sample using the dot product
$$\cos\theta_n = \mathbf{e}_0^T \mathbf{e}_n$$
• Two Subtle Problems
– 1. The eigenvectors returned by PCA may have the opposite sign to those of the first set.
– 2. Since the eigenvectors are given in descending order of eigenvalue, a change in the relative magnitude of any components may swap their positions.
Therefore, comparisons must be made carefully.
• Results
– Eigenvectors are relatively stable over time.
– After 10 eigenvectors they become more unstable.
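Both problems can be handled when comparing two sets of eigenvectors. The sketch below is an illustration, not the project's code: it uses the absolute value of the cosine, so a sign flip does not count as a change, and a greedy best-match search, so a swap in ordering is detected rather than read as instability. The function names and the greedy matching rule are assumptions.

```python
import math

def cosine(u, v):
    """Cosine of the angle between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def match_eigenvectors(reference, current):
    """For each reference eigenvector, find the unused current eigenvector
    with the largest |cos(theta)|.

    Returns a list of (reference_index, current_index, abs_cosine).
    Assumes len(current) >= len(reference)."""
    unused = list(range(len(current)))
    matches = []
    for i, ref in enumerate(reference):
        j_best = max(unused, key=lambda j: abs(cosine(ref, current[j])))
        matches.append((i, j_best, abs(cosine(ref, current[j_best]))))
        unused.remove(j_best)
    return matches

# Hypothetical example: the second sample returns the same two directions,
# but swapped in order and with flipped signs.
reference_vectors = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
current_vectors = [[0.0, -1.0, 0.0], [-1.0, 0.0, 0.0]]
example_matches = match_eigenvectors(reference_vectors, current_vectors)
print(example_matches)
```

A naive position-by-position dot product on this example would report a cosine of 0 for both vectors, even though the subspace has not changed.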
Stability of Principal Components
[Figure: four histograms, "Distribution of Eigen Vector #1" to "Distribution of Eigen Vector #4". x-axis: Cos(theta); y-axis: Number of Vectors. Panel means of cos(theta), in extraction order: 0.99382, 0.9595, 0.91897, 0.89915.]
Stability of Principal Components
[Figure: four histograms, "Distribution of Eigen Vector #5" to "Distribution of Eigen Vector #8". x-axis: Cos(theta); y-axis: Number of Vectors. Panel means of cos(theta), in extraction order: 0.71333, 0.77994, 0.77283, 0.58858.]
Statistical Distance vs Time of Day
• Mahalanobis Distance
– The distance of a data point from the center of the distribution
$$D_M(\mathbf{x}) = \sqrt{(\mathbf{x} - \boldsymbol{\mu})^T \boldsymbol{\Sigma}^{-1} (\mathbf{x} - \boldsymbol{\mu})}$$
• Procedure
– The training set was 100 days of 15-minute log return data.
– The distance of each of the next 10 data points from the training set was calculated.
– The training set was then shifted forward and the next 10 points measured.
– The results were sorted by time of day to find the times of day that generated the most outliers.
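A minimal sketch of the distance calculation follows, assuming the mean and covariance are estimated from the training window and using the standard definition with the square root. It is illustrative only: the helper names are made up, and the linear system is solved by plain Gaussian elimination to keep the example free of dependencies.

```python
import math

def mean_vector(rows):
    """Column means of a list of observation rows."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def covariance_matrix(rows):
    """Sample covariance matrix (divides by n - 1)."""
    n = len(rows)
    mu = mean_vector(rows)
    d = len(mu)
    cov = [[0.0] * d for _ in range(d)]
    for r in rows:
        c = [x - m for x, m in zip(r, mu)]
        for i in range(d):
            for j in range(d):
                cov[i][j] += c[i] * c[j] / (n - 1)
    return cov

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b_i] for row, b_i in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def mahalanobis(x, mu, cov):
    """sqrt((x - mu)^T cov^{-1} (x - mu)), without forming the inverse."""
    diff = [xi - mi for xi, mi in zip(x, mu)]
    return math.sqrt(sum(d * s for d, s in zip(diff, solve(cov, diff))))

# Hypothetical training window of two-dimensional returns.
training = [[1.0, 0.0], [-1.0, 0.0], [0.0, 2.0], [0.0, -2.0]]
mu_hat = mean_vector(training)
cov_hat = covariance_matrix(training)
print(mahalanobis([1.0, 0.0], mu_hat, cov_hat))
```

With 500 stocks and a window of limited length the sample covariance can be singular or badly conditioned, in which case the sketch raises an error rather than returning a meaningless distance.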
Distance of New Test Data from the Training Data
Mahalanobis Distance $D_M(\mathbf{x}) = \sqrt{(\mathbf{x} - \boldsymbol{\mu})^T \boldsymbol{\Sigma}^{-1} (\mathbf{x} - \boldsymbol{\mu})}$
[Figure: "Mahalanobis Distance of new Data Throughout the day". x-axis: Number of 15 Minute Intervals in Day (0 to 30); y-axis: Magnitude of Distance (x 10^4).]
Conclusion – We can separate the market into two distinct time
periods where the returns are generated by two different processes.
Generation of Residuals
• Partial Least Squares
– If X is the data set and Y is the component to be regressed on the data,
– then PCA analyzes $E(X^T X)$
– and PLS analyzes $E(X^T Y)$
1. PLS finds the matrix information associated with the first eigenvector
2. Subtracts this information from the covariance matrix
3. Then finds the information for the second eigenvector, etc.
• Procedure
– Test data: 100-day sample of 15-minute log returns on 500 stocks
– Predict the next 10 data points using PLS with the 9 largest eigenvectors
– Test data moved forward
• Results
– Measure of fit
$$R^2 = 1 - \frac{\boldsymbol{\varepsilon}^T \boldsymbol{\varepsilon}}{(\mathbf{y} - \mu)^T (\mathbf{y} - \mu)}$$
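The measure of fit is straightforward to compute from actual and predicted returns. The sketch below assumes that ε is the vector of out-of-sample prediction errors and that μ is the sample mean of y; the function name and the numbers in the example are illustrative.

```python
def r_squared(actual, predicted):
    """R^2 = 1 - (sum of squared residuals) / (sum of squared deviations
    of the actual values from their mean)."""
    n = len(actual)
    mean_actual = sum(actual) / n
    ss_residual = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_total = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_residual / ss_total

# Hypothetical actual and predicted values.
example_actual = [1.0, 2.0, 3.0, 4.0]
example_predicted = [1.1, 1.9, 3.2, 3.8]
print(r_squared(example_actual, example_predicted))
```

Predicting the mean of y for every point gives R^2 = 0, and an out-of-sample R^2 can be negative when the model does worse than that.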
PLS First 45 Minutes of Market Removed
[Figure, three panels: "Out of Sample Residuals over time" (x-axis: Time; y-axis: Residuals); "Out of Sample Distribution of Residuals" (x-axis: Residuals; y-axis: Number of Samples); "Q-Q Plot Out of Sample Residuals" (x-axis: Standard Normal Quantiles; y-axis: Quantiles of Input Sample).]
Deviation of Residuals = 0.0016586, R^2 = 0.87011
PLS First 45 Minutes of the Market
[Figure, three panels: "Out of Sample Residuals over time" (x-axis: Time; y-axis: Residuals); "Out of Sample Distribution of Residuals" (x-axis: Residuals; y-axis: Number of Samples); "Q-Q Plot Out of Sample Residuals" (x-axis: Standard Normal Quantiles; y-axis: Quantiles of Input Sample).]
Deviation of Residuals = 0.004329, R^2 = 0.7535
Calibrating OU Process: Problem Setup
• Need to estimate κ, μ and σ in the OU process equation:
$$dX_t = \kappa(\mu - X_t)\,dt + \sigma\,dW_t$$
• The discrete form of the solution of the SDE can be written as:
$$X_{t+1} = a X_t + b + \epsilon$$
where:
$$a = e^{-\kappa\Delta}, \qquad b = \mu\left(1 - e^{-\kappa\Delta}\right), \qquad \epsilon = \sigma\sqrt{\frac{1 - e^{-2\kappa\Delta}}{2\kappa}}\; N(0,1)$$
κ: coefficient of mean reversion
∆: discretization time step
μ: long-term mean of the residuals
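The discrete form can be used directly to simulate an OU path, which is useful for checking any calibration routine against known parameters. This is a minimal sketch; the function names, the seed and the parameter values in the example are illustrative assumptions.

```python
import math
import random

def ou_discrete_coefficients(kappa, mu, sigma, delta):
    """Coefficients of X[t+1] = a * X[t] + b + noise_std * N(0, 1)."""
    a = math.exp(-kappa * delta)
    b = mu * (1.0 - a)
    noise_std = sigma * math.sqrt((1.0 - math.exp(-2.0 * kappa * delta)) / (2.0 * kappa))
    return a, b, noise_std

def simulate_ou(x0, kappa, mu, sigma, delta, n_steps, seed=0):
    """Simulate n_steps of the exact discretization, starting from x0."""
    rng = random.Random(seed)
    a, b, noise_std = ou_discrete_coefficients(kappa, mu, sigma, delta)
    path = [x0]
    for _ in range(n_steps):
        path.append(a * path[-1] + b + noise_std * rng.gauss(0.0, 1.0))
    return path

# Hypothetical parameters: kappa = 2, mu = 0.5, sigma = 0.3, delta = 0.1.
example_path = simulate_ou(1.0, 2.0, 0.5, 0.3, 0.1, 1000)
print(sum(example_path) / len(example_path))
```

Because the discretization is exact, the simulated path has the correct distribution for any step size ∆, not only for small steps.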
Calibrating OU Process: OLS and MLE
• Least Squares:
– Basic idea: Fit parameters by minimizing the sum of squared error terms.
• Maximum Likelihood Estimation:
– Basic idea: Find parameters by maximizing the log-likelihood of the data.
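For the least squares route, one common approach (assumed here, since the slide gives no details) is to regress X[t+1] on X[t] and then invert the relations a = exp(-κ∆), b = μ(1 - a) and Var(ε) = σ²(1 - a²)/(2κ). The function name and the example series are illustrative.

```python
import math

def calibrate_ou_ols(xs, delta):
    """Estimate (kappa, mu, sigma) of an OU process from a series xs
    sampled every delta, by regressing X[t+1] on X[t]."""
    x = xs[:-1]
    y = xs[1:]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    a = sxy / sxx                      # slope
    b = mean_y - a * mean_x            # intercept
    if not 0.0 < a < 1.0:
        raise ValueError("slope outside (0, 1): series does not look mean-reverting")
    ss_res = sum((yi - a * xi - b) ** 2 for xi, yi in zip(x, y))
    resid_std = math.sqrt(ss_res / (n - 2))
    kappa = -math.log(a) / delta
    mu = b / (1.0 - a)
    sigma = resid_std * math.sqrt(2.0 * kappa / (1.0 - a * a))
    return kappa, mu, sigma

# Hypothetical noiseless series X[t+1] = 0.9 * X[t] + 0.05, so the fit
# should recover kappa = -ln(0.9), mu = 0.5 and sigma close to 0.
series = [1.0]
for _ in range(50):
    series.append(0.9 * series[-1] + 0.05)
print(calibrate_ou_ols(series, 1.0))
```

With Gaussian noise the conditional maximum likelihood estimates of a and b coincide with these regression estimates, which is consistent with OLS and MLE producing similar results.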
Main Issue
• OLS and MLE tend to produce similar results.
• However, MLE is known for overestimating the mean reversion speed κ:
• Example: Johnson, Thomas. “Approximating Optimal Trading Strategies Under Parameter Uncertainty: A Monte Carlo Approach”. Kellogg Business School. 2009.
– Main idea: MLE typically overestimates the mean reversion speed and, as a result, underestimates the noise σ.
– Paper compares a filtering trading strategy to MLE.
– Filtering outperforms MLE every time.
• Reason: Boguslavsky, Boguslavskaya. “Arbitrage Under Power”. February 2009.
– The MLE model suggests overly aggressive positions that can quickly lead the trader to bankruptcy.
Kalman Filtering
• Idea: a mathematical method that uses noisy measurements to
produce estimates that tend to be closer to the true value of
the variable of interest.
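The filtering idea can be sketched for a scalar state. The sketch assumes a hidden state that follows x[t+1] = a x[t] + b + w with w ~ N(0, q), observed as y[t] = x[t] + v with v ~ N(0, r), and it assumes all parameters are known. It covers only the filtering recursion, not parameter estimation, and the function and argument names are illustrative.

```python
def kalman_filter_1d(observations, a, b, q, r, x0, p0):
    """Scalar Kalman filter.

    State:       x[t+1] = a * x[t] + b + w,  w ~ N(0, q)
    Observation: y[t]   = x[t] + v,          v ~ N(0, r)
    x0, p0 are the prior mean and variance of the state before the
    first observation. Returns the list of filtered state estimates."""
    x_est, p_est = x0, p0
    filtered = []
    for y in observations:
        # Predict step: propagate the estimate and its variance.
        x_pred = a * x_est + b
        p_pred = a * a * p_est + q
        # Update step: blend prediction and measurement via the gain.
        gain = p_pred / (p_pred + r)
        x_est = x_pred + gain * (y - x_pred)
        p_est = (1.0 - gain) * p_pred
        filtered.append(x_est)
    return filtered

# Hypothetical noisy observations of a state near 0.5.
noisy = [0.62, 0.41, 0.55, 0.47, 0.58]
print(kalman_filter_1d(noisy, a=0.9, b=0.05, q=0.01, r=0.04, x0=0.5, p0=1.0))
```

The gain moves toward 1 when the measurement noise r is small relative to the prediction variance, so the filter trusts the data, and toward 0 when r is large, so it trusts the model.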
Comparison of Estimation Methods
• Parameter estimation by Kalman filtering produces more
accurate estimates of the OU process parameters than
either MLE or OLS.
• Major disadvantage of EM Algorithm: Might take a long
time to converge, computationally intensive for large
window sizes.
• Solution: Use MLE/OLS to produce initial guesses then
use EM to refine estimation.
Optimal Trading of the Residuals-1
• Implement the Boguslavsky/Boguslavskaya strategy
described in “Optimal Arbitrage Trading” (2003).
• O-U process:
• Conditional Distribution:
• Utility Function
• Normalization Process : Let α be the control variable and
W the wealth at time t:
• Value Function:
Optimal Trading of the Residuals-2
• Solve for optimal control parameter using HJB equation:
• Reduces to the PDE:
• Solution: Let τ be the time left for trading,
Results on EvA residuals
• ∆ ~ 1 min, γ = -0.5, initial wealth = 100,000
– Cumulative Wealth: peak ~ 4,300,000, end ~ 3,700,000
[Figures: Cumulative Wealth; Optimal Trading Position]
Results on Our Residuals Using EvA’s Data (XOM)
• ∆ ~ 15 min, initial wealth = 100,000
– Cumulative Wealth, γ = 0: peak ~ 530,000, end ~ 490,000
– Cumulative Wealth, γ = -0.5: peak ~ 520,000, end ~ 450,000
Incorporating TC-Separate Fund Allocation
• All wealth curves will lie between the red and green curves.
• Blue curve = no fixed cost: peak = 530,000, end = 490,000
• Green curve: peak = 470,000, end = 420,000
• Legend: Blue = no cost, Green = 10*fixed cost, Red = 1*fixed cost
Trading Residuals in Practice
• Look at historical 15-minute data for ~500 stocks using a
100-day sliding window
• For every stock i at time t
– Generate a partial least squares representation with 10 components, using the
returns of the remaining 499 stocks over the last 100 days (sliding window)
– Generate a residual return by removing the PLS approximation from the
stock return
– Generate the residual-replicating portfolio weights
• $P_i = [-\beta_1, -\beta_2, \ldots, -\beta_{i-1}, 1, -\beta_{i+1}, \ldots, -\beta_n]$
Available Data at Time t
• Stock returns vector R(t)
• Residual returns vector Rresidue(t)
• Residual means vector μresidue(t)
• Residual standard deviations vector σresidue(t)
• Residual replication matrix P(t)
– Pij(t) is the weight of the jth stock in the portfolio replicating the ith residue
– If we have a residual positions vector V(t), the final investment portfolio
will be V(t)P(t)
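The mapping from residual positions to stock positions is a row vector times a matrix. The sketch below builds each row of P as +1 on the stock itself and -β on every other stock, then forms V(t)P(t). The function names and the three-stock numbers are hypothetical.

```python
def replication_row(i, betas):
    """Row i of P: weight 1 on stock i and -beta_j on every other stock j.
    betas[i] itself is ignored."""
    return [1.0 if j == i else -beta for j, beta in enumerate(betas)]

def stock_portfolio(positions, replication):
    """Row vector V times matrix P: the net holding in each stock."""
    n_stocks = len(replication[0])
    return [
        sum(v * row[j] for v, row in zip(positions, replication))
        for j in range(n_stocks)
    ]

# Hypothetical betas for three stocks (one list per residual).
P = [
    replication_row(0, [0.0, 0.5, 0.3]),
    replication_row(1, [0.6, 0.0, 0.1]),
    replication_row(2, [0.2, 0.4, 0.0]),
]
V = [100.0, 0.0, -50.0]   # long residual 0, flat residual 1, short residual 2
print(stock_portfolio(V, P))
```

Positions in different residuals partly offset each other at the stock level, so the net stock trades implied by V(t)P(t) can be smaller than the sum of the individual replicating portfolios.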
The Trading Strategy
• Evaluate the market every 15 minutes to look for strong
deviations of residuals from their mean
– Enter positions that exceed an entry threshold
– Leave positions that cross the exit threshold
– Allocate a defined percentage of capital equally among all open
opportunities, subject to a minimum cash position percentage
• The dynamic rebalancing of the portfolio is based on the
log-optimal portfolio growth strategy of volatility pumping
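The entry and exit rules can be sketched as a small state machine on the standardized residual z = (residual - mean) / standard deviation. The exact rule used in the project is not spelled out on the slide, so the semantics below are an assumption: go long when z falls to the long entry threshold, close the long when z rises back to the long exit threshold, and mirror this for shorts. Function names and threshold values are illustrative.

```python
def update_position(position, z, long_enter, long_exit, short_enter, short_exit):
    """Return the new position (+1 long, -1 short, 0 flat) for one residual."""
    if position == 0:
        if z <= long_enter:      # residual far below its mean: buy
            return 1
        if z >= short_enter:     # residual far above its mean: sell short
            return -1
        return 0
    if position == 1 and z >= long_exit:
        return 0                 # residual has reverted enough: close long
    if position == -1 and z <= short_exit:
        return 0                 # residual has reverted enough: close short
    return position

def run_signals(z_scores, long_enter, long_exit, short_enter, short_exit):
    """Apply the rule to a series of z-scores, starting flat."""
    position = 0
    positions = []
    for z in z_scores:
        position = update_position(
            position, z, long_enter, long_exit, short_enter, short_exit
        )
        positions.append(position)
    return positions

# Hypothetical z-scores and symmetric thresholds.
example_z = [-3.0, -2.0, -1.0, 0.5, 2.8, 2.0, 1.0]
print(run_signals(example_z, -2.7, -1.5, 2.7, 1.5))
```

Using an exit threshold closer to zero than the entry threshold creates a band that avoids opening and closing the same position on small fluctuations around a single level.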
The Secret Sauce: Trading Parameters
• 6 parameters
– Long enter threshold, short enter threshold
– Long exit threshold, short exit threshold
– Minimum cash percentage
– Maximum single position percentage
• The trading algorithm is robust to the trading parameters (at least as
far as I tested!)
• Divided the data set into a training period and a testing period, used the
MATLAB Optimization Toolbox to find the parameters that maximize the Sharpe
ratio over the training period, and applied the resulting parameters to the
testing period
• This strategy can be applied continuously to periodically
recalibrate the trading parameters
Long Enter threshold = -2.698336, Long Exit threshold = -1.500553,
Short Enter threshold = 2.698336, Short Exit threshold = 1.500553,
Minimum Cash Position = 84.418854%, Maximum Investment = 4.071198%,
Sharpe ratio = 0.349356
[Figure: Wealth Multiples over time (0 to 6000), with the training period and test period marked.]