An empirical study of stock portfolios based on diversification and innovative measures of risks

Master Thesis - Final Report
Supervised by Pr. Dr. Sornette
Chair of Entrepreneurial Risks at ETH Zurich
Thibaut Simon
10 February 2010

Abstract

When a measure of risk such as variance does not take into account the fact that distributions of returns are non-Gaussian and exhibit non-linear dependencies, it is ineffective for generating portfolios with standard optimization procedures. This study explores new measures of risk based on series and the new concept of level of risk. These ideas are back-tested over the long run and evaluated with random portfolio analyses. Persistent performance and risk management are achieved with the use of the Maximum DrawDown of returns and levels of diversification.

Contents

1 Introduction
2 Classical approach of Portfolio construction
  2.1 Markowitz, Mean-Variance
  2.2 Different estimates of the Covariance matrix to improve Mean-Variance approach
    2.2.1 Shrinkage
    2.2.2 Re-sampling
    2.2.3 Reverse engineering
  2.3 CAPM and extensions
3 Genetic algorithm
  3.1 Composition
  3.2 Focus on the generation of a new population
  3.3 Applications
4 Random Portfolios
  4.1 First attempt
  4.2 Second attempt
  4.3 Conclusion
5 Innovative idea on risks
  5.1 Measures of risks and dependencies
    5.1.1 Coherent Measure of risk
    5.1.2 Dependencies
    5.1.3 Some measure of risks
  5.2 Level of risks and diversification
  5.3 A winner mix
6 Simulation
  6.1 Methodology
    6.1.1 Data
    6.1.2 Test
    6.1.3 Set of strategies
  6.2 Results
    6.2.1 Tables
    6.2.2 Wealth evolution - first interpretation
    6.2.3 Random Portfolios - Luck or skill?
    6.2.4 Sharpe Ratio
    6.2.5 Transaction costs overview
    6.2.6 Impact of the diversification - concrete cases
7 What's next?
  7.1 Recommendation
  7.2 To go further
8 Annexes
  8.1 Maximum DrawDown - Matlab Code
  8.2 Average Maximum DrawDown - Matlab Code

Chapter 1
Introduction

Asset allocation is today a topic of real importance. All investors want to invest in the winning combination of assets. This combination should give them the maximum level of return for the level of risk they are able to take. The problem of asset allocation can be found across all industries and sectors of activity.
For example, an oil company can choose to invest in different fields, with different techniques, for different products. How should it choose the weight of each project in its investment portfolio? Answering this question requires taking into account the expected returns of the projects as well as the risks of failure or delay that could penalize future returns. Investing in several assets therefore means making trade-offs. However, making a decision is hard because investors do not know what will happen or what the consequences of their choices will be. The idea of portfolio optimization is thus to develop a decision tool that helps investors rationalize their choices.

A lot of research has been done in the area of asset allocation and portfolio optimization and construction. However, the topic is still wide open. Why is that? Harry Markowitz created modern portfolio theory based on historical distributions of returns considered to be Gaussian. As described in my thesis, this theory never performed well. However, Markowitz's idea of minimizing risk for a given level of return is still widely accepted as a good starting point for a new theory. In the 1950s means of computation were limited, but Markowitz formulated a quadratic programming problem that could easily be solved. Today new measures of risk and of dependence between assets can be used, since computing and optimizing complex problems has become more convenient. These new measures make it possible to create new strategies that perform better than the traditional one. I use a genetic algorithm (see chapter 3) to solve the optimization problems. The performance of these strategies is evaluated through random portfolios and Sharpe ratio analyses. This study covers stock portfolios drawn from a universe of 28 stocks from the CAC40. The question answered in my thesis is:

How to achieve persistent return for different levels of risks?
This report will help people starting from scratch in portfolio management and optimization to understand the problem of portfolio construction and evaluation. Hence, this thesis starts by explaining the classical approach, its limits and its extensions, before describing the methods and tools used to construct and evaluate portfolios. From the development of random portfolios emerges the innovative idea of separating measures of risk from levels of risk. This idea is tested in the chapter presenting the simulation. My thesis ends with some insights and ways to continue this work.

Chapter 2
Classical approach of Portfolio construction

Portfolio construction aims at producing portfolios that minimize risks for investors and maximize their wealth. Portfolios are constructed from a basis of selected assets. Determining the universe of stocks that seem worth holding in a portfolio is done by financial analysts. Once a basic set of stocks has been produced, the question is: how to weight each of these assets? That is what portfolio theory is about. Keep in mind that a good theory should produce well-diversified portfolios, in order to decrease the risks and the volatility of the portfolio. However, it should still be able to increase the wealth of investors more than a random allocation could. This chapter presents the state of the art of the traditional approach to investment.

2.1 Markowitz, Mean-Variance

Harry Markowitz is considered the father of modern portfolio theory [16]. During the 1950s he developed a theory based on the assumption that the utility function of investors is quadratic. This theory states that investors want to minimize their risk for a given level of return. In this quadratic framework the risk is modelled by the standard deviation of the portfolio. At the time this was a good model, because the problem has an analytical solution, which allowed people to use it with the limited computational means of that era.
This approach never gave full satisfaction, due to a lack of empirical evidence. In practice some limitations have been identified: returns do not follow a normal distribution, and samples are too noisy to provide good covariance estimates. Different solutions emerged to improve covariance estimates. I present three of them in the following section.

$$\min_{w} \; w^T V w \quad \text{subject to} \quad w^T \mu = r_0, \qquad \forall i,\ w_i \ge 0, \qquad \sum_{i=1}^{n} w_i = 1$$

Where:
• $w$ : portfolio weights
• $V$ : covariance matrix
• $r_0$ : the desired level of expected return of the portfolio
• $\mu$ : vector of expected returns

Figure 2.1: Formulation of Mean-Variance optimization

Figure 2.2: Instability of the EFFICIENT FRONTIER, which moves at each period. Efficient frontiers from 28 assets over 12 periods of 6 months.

Figure 2.3: DEVIATION of a portfolio from the EFFICIENT FRONTIER. Same frontiers as figure 2.2 with the evolution of a portfolio taken on the first efficient frontier. A portfolio built in the Mean-Variance framework rapidly becomes inefficient.

2.2 Different estimates of the Covariance matrix to improve Mean-Variance approach

The covariance matrix can be estimated with the sample covariance matrix. However, this simple estimator does not work well for portfolio construction, since samples are not sufficiently large and are too noisy to give a good estimate. In particular, sample covariance estimates are very sensitive to outliers. Nevertheless, this estimator is unbiased.

2.2.1 Shrinkage

The shrinkage method provides a better estimate of the covariance matrix when the number of assets p is bigger than the sample size n. In the case p > n, the sample covariance matrix becomes singular. The shrinkage method is an alternative. It consists in evaluating the shrinking parameter through cross-validation. The shrinking parameter is the weight determining the trade-off between the sample covariance estimator and the diagonal target matrix. This operation regularizes the covariance matrix.
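The regularizing effect of shrinkage can be illustrated with a minimal sketch. It is written in Python for illustration and is not the estimator used in this study: the intensity `delta` is a fixed, hypothetical value rather than a cross-validated one, the target is simply the diagonal of the sample covariance matrix, and the function names and the return numbers are assumptions made for the example.

```python
def sample_covariance(returns):
    """Unbiased sample covariance of a T x p list of return observations."""
    t, p = len(returns), len(returns[0])
    means = [sum(row[j] for row in returns) / t for j in range(p)]
    return [[sum((row[i] - means[i]) * (row[j] - means[j]) for row in returns) / (t - 1)
             for j in range(p)] for i in range(p)]


def shrink_to_diagonal(cov, delta):
    """Blend cov with its own diagonal D: delta * D + (1 - delta) * cov."""
    p = len(cov)
    # The diagonal is unchanged; off-diagonal terms are scaled by (1 - delta).
    return [[cov[i][j] if i == j else (1.0 - delta) * cov[i][j]
             for j in range(p)] for i in range(p)]


def det3(m):
    """Determinant of a 3 x 3 matrix (used only to check singularity)."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


# Hypothetical data: two observations of three assets, so p > n.
returns = [[0.01, 0.02, -0.01],
           [0.03, -0.01, 0.02]]
S = sample_covariance(returns)
shrunk = shrink_to_diagonal(S, delta=0.5)
print("det of sample covariance:", det3(S))       # numerically zero: singular
print("det of shrunk covariance:", det3(shrunk))  # strictly positive
```

With fewer observations than assets the sample matrix cannot be inverted, whereas the blended matrix can, which is what makes it usable in a Mean-Variance optimization.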
$$\hat{S} = \frac{k}{T} F + \left(1 - \frac{k}{T}\right) S$$

• $k$ : estimator of the optimal shrinkage constant
• $T$ : number of observations
• $F$ : single index covariance matrix
• $S$ : sample covariance matrix

Figure 2.4: SHRINKAGE Method - a trade off between sample covariance and single index

2.2.2 Re-sampling

The idea is to expand the sample by creating new samples from the original one. This method is well known in statistics as the bootstrap method. It reduces the variance uncertainty while increasing the bias of the measure. However, this technique seems to give some results, if we believe Michaud, who developed its application to portfolio construction. According to him, re-sampling decreases the sensitivity of the optimization to noise.

2.2.3 Reverse engineering

This method consists in assuming that the market portfolio is on the efficient frontier. The aim is to correct the covariance matrix estimate with the help of the market efficient portfolio. This idea is theoretically appealing. However, it lacks empirical use, since the market portfolio is hard to estimate.

Figure 2.5: REVERSE ENGINEERING - Measure of distance from the sample to the reverse engineering estimate. The factor alpha corresponds to the trade-off between the bias and the variance of our estimation.

Figure 2.6: REVERSE ENGINEERING - Minimizing the distance of the reverse engineering estimate to the sample estimate under the condition that the market efficient portfolio belongs to the efficient frontier.

2.3 CAPM and extensions

Capital Asset Pricing Model

Sharpe, Lintner and Mossin developed the CAPM theory in the 1960s [18]. This model complements the Markowitz portfolio theory by adding the possibility to lend or borrow money at a risk-free rate. The main assumptions are taken from the Markowitz portfolio theory: agents are risk averse and maximize their utility, stock returns are normally distributed, markets are perfect and informationally efficient, and supply and demand are in equilibrium.
It gives a clear relation between risk and return. The Capital Asset Pricing Model takes into account two risks: the unique risk, which can be cancelled by diversification, and the market risk, which corresponds to the risk of being in the market. The expected risk premium on a stock is equal to $r - r_f$, where $r_f$ is the risk-free rate of return and $r$ the expected return of the stock. Sharpe established the following relationship between risk and return, where $r_m$ is the expected return of the market:

$$r - r_f = \beta (r_m - r_f)$$

Beta represents the contribution of the market risk to the non-diversifiable stock risk. Beta is the relative covariance of stock to market returns. It measures the dependence between the stock and the market, weighted by the ratio of the stock volatility to the market volatility:

$$\beta_i = \frac{\mathrm{Cov}(r_i, r_m)}{\mathrm{Var}(r_m)}$$

This model is linear, and therefore easy to understand. However, this theory is not fully useful, since the market portfolio can only be approximated (Roll critique [17]). This criticism led researchers to improve the model.

3-factor model based on size of firms and book-to-market equity

Eugene Fama and Kenneth French extended the CAPM by adding two new factors: firm size and book-to-market equity [12]. With this improvement, the model becomes multi-linear. The cross-section of expected returns is better explained with this theory. However, this model is built on empirical study and the mechanisms are not deeply understood.

"Our preliminary work on economic fundamentals suggests that high-BE/ME¹ firms tend to be persistently poor earners relative to low-BE/ME firms. Similarly, small firms have a long period of poor earnings during the 1980s not shared with big firms. The systematic patterns in fundamentals give us some hope that size and book-to-market equity proxy for risk factors in returns, related to relative earning prospects, that are rationally priced in expected returns."
[12]

4-factor model based on momentum strategies

This model, developed by Mark Carhart, aims at taking into consideration the effect of time when pricing stocks [8]. Researchers detected a momentum effect in the data. Momentum strategies are founded on empirical and behavioural finance. They benefit from an inefficiency of the market, with a lag generated by investors. Stocks with the highest returns remain a good choice for about 6 months, and the opposite holds for stocks with low returns [13, 1]. Adding a one-year momentum in stock returns improves the predictive power of the model.

"First, note the relatively high variance of the SMB², HML³, and PR1YR⁴ zero-investment portfolios and their low correlations with each other and the market proxies. This suggests the 4-factor model can explain sizeable time-series variation."

"I find that the 4-factor model substantially improves on the average pricing errors of the CAPM and the 3-factor model" [8]

¹ ME refers to the size of the firm. It is equal to the market value. BE refers to the book value of the firm. BE/ME is called the book-to-market equity.
² Factor of size.
³ Factor of book-to-market equity.
⁴ Factor referring to the one-year momentum in stock returns.

2-factor model based on market concentration

This model has been developed by Malevergne, Santa-Clara and Sornette in their paper Professor Zipf goes to Wall Street (2009) [15]. It is based on the Zipf distribution of firm sizes. The idea is that the market portfolio is concentrated in very few companies (about 20) due to the heavy-tailed shape of the Zipf distribution. This market concentration can be measured with the Herfindahl Index, which is used widely in the following sections. This concentration leads to a new factor of systematic risk. The difference between the equally weighted and the value-weighted market portfolios is used as a proxy for the Zipf factor, and makes it possible to take this new risk into account in the pricing model.
This idea of market concentration influenced my work.

Application of these models

These models are used to evaluate the performance of investment strategies. The idea is to compare the average returns of the portfolio against portfolios sharing the same factors. The question of the persistence of these results then remains: are managers lucky or truly skilled? [11]. This question is fundamental. The random portfolio theory of P. Burns is preferred in the following for the evaluation of portfolio performance.

Chapter 3
Genetic algorithm

A genetic algorithm (GA) can optimize all kinds of functions. Genetic algorithms are part of the family of evolutionary algorithms [14, 9]. They come directly from Darwin's theory of evolution. The algorithm reproduces the selection process at work in nature. Generation after generation, the population adapts itself to its environment. In the case of a stable environment, the GA acts as an optimization solver that converges. The evolution of the population happens through small changes coming from the mix of parents or from mutation. The algorithm is really convenient to use once the parameters that govern the performance of the optimization are understood. Its name and the associated vocabulary come from genetics. The process of reproduction is actually a model of what is going on in our cells. The genotype evolves across generations through selection, crossover and mutation, which converge to a final population answering the problem. I use this algorithm to run the final simulation of this study.

3.1 Composition

The genetic algorithm is composed of 4 main steps:
1. initialisation of the population
2. evaluation of the population
3. selection
4. reproduction

Steps 2, 3 and 4 form a loop which stops when a stopping criterion is reached. The reproduction step is composed of three possibilities:
1. elite count
2. crossover
3. mutation

Figure 3.1 represents the reproduction process.
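The four steps and the three reproduction modes listed above can be sketched as follows. This is a minimal illustration written in Python, not the Matlab GA toolbox used in this study: the selection step is simplified to drawing parents from the better half of the population, the stopping criterion is a fixed number of generations, and all names and default values (`run_ga`, `sigma`, `pop_size`, ...) are assumptions made for the example.

```python
import random


def normalize(weights):
    """Scale non-negative weights so that they sum to 1."""
    total = sum(weights)
    return [w / total for w in weights]


def run_ga(fitness, n_assets, pop_size=50, elite_count=2,
           crossover_fraction=0.8, sigma=0.1, generations=100, seed=0):
    """Minimise `fitness` over portfolios and return the best normalised weights."""
    rng = random.Random(seed)
    # 1. Initialisation: random non-negative weights.
    population = [[rng.random() for _ in range(n_assets)] for _ in range(pop_size)]
    for _ in range(generations):
        # 2. Evaluation: rank individuals by the fitness of their normalised weights.
        population.sort(key=lambda ind: fitness(normalize(ind)))
        # 3. Selection (simplified assumption): parents come from the better half.
        parents = population[:pop_size // 2]
        # 4. Reproduction: elite children, then crossover children, then mutants.
        children = [ind[:] for ind in population[:elite_count]]
        n_cross = round(crossover_fraction * (pop_size - elite_count))
        while len(children) < pop_size:
            if len(children) < elite_count + n_cross:
                # Crossover: each gene is taken at random from one of two parents.
                a, b = rng.sample(parents, 2)
                child = [x if rng.random() < 0.5 else y for x, y in zip(a, b)]
            else:
                # Mutation: zero-mean Gaussian noise, kept strictly positive.
                parent = rng.choice(parents)
                child = [max(1e-12, x + rng.gauss(0.0, sigma)) for x in parent]
            children.append(child)
        population = children
    best = min(population, key=lambda ind: fitness(normalize(ind)))
    return normalize(best)


# Example objective: the sum of squared weights, minimal for equal weights.
best_weights = run_ga(lambda w: sum(x * x for x in w), n_assets=4)
print(best_weights)
```

Because the elite children are copied unchanged, the best fitness in the population can never get worse from one generation to the next.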
The genetic algorithm used in my work is described below. This algorithm already exists in the GA toolbox of Matlab. Here are the options and properties used for portfolio optimization. In order to obtain the expected result, it is good to understand how each function works and to have a global view of the process.

Population Options: The population is composed of portfolios represented as vectors of weights.
Population size specifies how many individuals there are in each generation. With a large population size, the genetic algorithm searches the solution space more thoroughly, thereby reducing the chance that the algorithm will return a local minimum that is not a global minimum. Here it can be chosen between 100 and 1000.
Initial population specifies an initial population for the genetic algorithm. It is set as random with the constraint $\forall i,\ 0 \le w_i$. The other constraint, normalization of the weights, is taken into account in the evaluation function. It is not treated as an optimization constraint, because the answer to the problem is a vector representing the proportions of the different assets. Hence the portfolio is normalized only during its evaluation. At the end, the final vector of weights is normalized for further use. This trick avoids using the constraint options of the GA toolbox. It saves computation time.

Evaluation function: The evaluation function quantifies the objective of the optimization. It makes it possible to rank portfolios according to their distance to the solution of the optimization problem. These ranks are used in the selection process.

Selection function: It selects parents for reproduction. The Stochastic uniform selection function lays out a line in which each parent corresponds to a section of the line of length proportional to its scaled value. The algorithm moves along the line in steps of equal size. At each step, the algorithm allocates a parent from the section it lands on.
The first step is a uniform random number less than the step size.

Reproduction:
Crossover fraction specifies the fraction of the next generation, other than elite children, that is produced by crossover.
Elite count specifies the number of individuals that are guaranteed to survive to the next generation.
Mutation uses the Gaussian mutation function, which adds a random number taken from a Gaussian distribution with mean 0 to each entry of the parent vector.
Crossover uses the scattered crossover function, which creates a random binary vector and selects the genes where the vector is a 1 from the first parent, and the genes where the vector is a 0 from the second parent, and combines the genes to form the child.

Stopping Criteria Options: The algorithm runs until the cumulative change in the fitness function value over Stall generations is less than or equal to Function Tolerance. Here the number of generations is limited to 200, and the stall generation limit can be chosen between 10 and 50.

3.2 Focus on the generation of a new population

The generation of the new population is composed of three operations: direct transfer from the old to the new generation, mix of two parents, and mutation of one individual. Figure 3.1 explains the process and gives the breakdown between the three operations. The parameters of the algorithm determine its performance. A large population (over 300) produces a more pertinent result in fewer generations, but increases the computation. Moreover, a small crossover rate increases the number of mutations and the chance of finding a local minimum. However, the convergence of the population will become slower and slower. The crossover decreases the variance of the population by making chromosomes more and more identical (figure 3.2). The crossover operation looks for the global minimum. In contrast, the mutation operation, which adds some random changes to the population, tends to diversify the population.
In the algorithm that I used in this study, the Gaussian mutation function adds a random number taken from a Gaussian distribution with mean 0 to each entry of the parent vector (figure 3.3). This operation helps the convergence process to find the local minimum.

Figure 3.1: REPRODUCTION PROCESS using elite count, crossover and mutation. In this example the crossover rate is 0.8 and the elite count is 2.

Figure 3.2: A view of the scattered CROSSOVER function process. A random vector of bits is generated, of length equal to the number of assets in the portfolio. A zero means we select the weight of parent 1 and a one the weight of parent 2.

Figure 3.3: Gaussian MUTATION function adds a random number taken from a Gaussian distribution with mean 0 to each entry of the parent vector.

3.3 Applications

With a good parametrization this algorithm performs well in the optimization of portfolios. First, this GA is used to construct portfolios following a strategy. In this case, the population is a population of random portfolios. These random portfolios evolve generation after generation until they optimize the fitness function. The fitness function is the objective function describing the investment strategy. For instance, a strategy could be to reduce the variance of portfolio returns while keeping a diversification of 10 stocks. All the conditions (diversification, ...) have to be integrated in the objective function.

The second use of this algorithm is the creation of sets of random portfolios. Since the initial population and the search process are random, the GA makes it possible to create random portfolios meeting the required constraints. These two applications are used for the simulation in this study.

Chapter 4
Random Portfolios

Introduction

The random portfolios technique is about the statistical simulation of portfolios. Random portfolios constitute a benchmark. This benchmark can be used to evaluate other portfolios.
Papers [7, 6] explain how to use this technique for testing trading strategies, assessing the skill exhibited by funds, and implementing investment mandates. Random portfolios are an application of Monte Carlo simulation. The result of a random portfolios simulation is the possibility to rank a portfolio against random ones. The main difficulty is to compare what is comparable. In this sense, the way random portfolios are generated is of great importance. The first section of this chapter is a naive approach to random portfolios. I explain why it does not work. The second section presents a good procedure to obtain results from random portfolios. I choose to present this part in a narrative way to show the dynamics of my approach.

Hypothesis

Random portfolios are used as a statistical test. The hypotheses defining the p-value are:

H0 : Strategy is not better than random portfolios
H1 : Strategy is better than random portfolios

Hence a p-value of ten percent means that the strategy performs better than 90 percent of random portfolios.

4.1 First attempt

Use of the function random

I generated random portfolios with the simplest method I knew: the function random. Each weight is assigned a random double between 0 and 1. Then the portfolio vector is normalized to satisfy the constraint that the weights sum to 1. I generated 1000 portfolios. Then I applied them to a test period of 6 months of return data. I obtained a Gaussian distribution of returns of the 1000 portfolios. These results are represented in figure 4.1. The 6-month mean return is about five percent and the standard deviation two percent. In figure 4.2 the returns of the thousand portfolios are plotted, sorted from the smallest return to the highest. This makes it possible to compare a strategy to random portfolios. The p-value of a strategy is directly readable on this graph, as soon as its return on the selected period of data and universe is known.
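The first attempt described above can be sketched as follows. This is a minimal illustration written in Python rather than the code used in this study: the asset returns are synthetic numbers drawn for the example, the strategy return of 8 percent is hypothetical, and the function names are assumptions.

```python
import random


def random_portfolio(n_assets, rng):
    """Uniform random weights in [0, 1), normalised so that they sum to 1."""
    raw = [rng.random() for _ in range(n_assets)]
    total = sum(raw)
    return [w / total for w in raw]


def portfolio_return(weights, asset_returns):
    """Return of a portfolio given the return of each asset over the test period."""
    return sum(w * r for w, r in zip(weights, asset_returns))


def p_value(strategy_return, random_returns):
    """Fraction of random portfolios doing at least as well as the strategy."""
    at_least_as_good = sum(1 for r in random_returns if r >= strategy_return)
    return at_least_as_good / len(random_returns)


rng = random.Random(1)
# Synthetic 6-month returns for a universe of 28 assets (illustrative only).
asset_returns = [rng.gauss(0.05, 0.20) for _ in range(28)]
random_returns = sorted(
    portfolio_return(random_portfolio(28, rng), asset_returns) for _ in range(1000)
)
# Hypothetical strategy return of 8 percent over the same period.
print("p-value:", p_value(0.08, random_returns))
```

Under this convention a p-value of 0.10 means that only 10 percent of the random portfolios did at least as well as the strategy, which matches the reading of the hypotheses H0 and H1 above.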
My first attempt to simulate random portfolios gave me surprising results. The returns of all the strategies that I tested were either above or below those of the simulated random portfolios. Hence their p-values were respectively 0 or 1.

Figure 4.1: Histogram of the 1000 RANDOM PORTFOLIOS generated with the random and the normalize function. Returns are computed on a period of 6 months. The universe of study is composed of 28 stocks of the CAC40 over 6 months.

What's wrong?

I first checked my code, then rechecked it, then changed my data several times, without any success. After such a defeat, one needs to step back in order to analyse the problem properly. When looking at the portfolios, things became clear. They were too diversified in comparison with the strategy I was looking at. The average Herfindahl index was close to 1/27 and its variance was really low. My universe consisted of 28 assets. Hence the conclusion was clear: my random portfolios were almost equally weighted portfolios. I needed to change the way I generated random portfolios, in order to obtain more diverse kinds of portfolios.

Idea

Starting from the idea of comparing what is comparable, a simple idea is to compare portfolios of the same level of risk. As shown in the section about "New Strategies", I think that in a universe made of about the same kind of stocks (here 28 stocks from the CAC40), a good indicator of the level of risk is the level of diversification of the portfolio. The Herfindahl Index, which is a concentration index, is a good estimator of diversity. Hence the problem is to generate random portfolios with the same Herfindahl index as the studied portfolio. This could not be done using only the random function.

Figure 4.2: Graph of RANDOM PORTFOLIOS sorted by increasing returns. The 1000 random portfolios were generated with the random and the normalize function. Returns are computed on a period of 6 months. The universe of study is composed of 28 stocks of the CAC40 over 6 months.
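The diagnosis above relies on the Herfindahl index, taken here as the sum of squared weights (the same quantity that appears in the fitness function of the second attempt). As a minimal sketch, written in Python with synthetic draws and illustrative names, the index can be computed for the two extreme portfolios and for naive random portfolios:

```python
import random


def herfindahl(weights):
    """Herfindahl index of a portfolio: the sum of its squared weights."""
    return sum(w * w for w in weights)


n_assets = 28
equal = [1.0 / n_assets] * n_assets          # fully diversified: index 1/28
single = [1.0] + [0.0] * (n_assets - 1)      # fully concentrated: index 1

rng = random.Random(0)
indices = []
for _ in range(1000):
    raw = [rng.random() for _ in range(n_assets)]
    total = sum(raw)
    indices.append(herfindahl([w / total for w in raw]))

print("equal weights  :", herfindahl(equal))
print("single asset   :", herfindahl(single))
print("naive portfolios: min", min(indices), "max", max(indices))
```

The index ranges from 1/n for equal weights to 1 for a single asset. In this sketch the naive draws all stay near the lower end of that range, so they cannot serve as a benchmark for a concentrated strategy.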
4.2 Second attempt

Using the Genetic Algorithm

The second attempt consisted in creating random portfolios with a constrained level of diversification. The Genetic Algorithm (GA) developed and parametrized for portfolio optimization turned out to be a good way of creating random portfolios. The fitness function of the GA is set equal to the constraint $\left|\sum_i w_i^2 - H_{\text{portfolio}}\right|$. The GA then selects a random portfolio which satisfies the constraint on the level of diversification. Hence the GA has to be run a thousand times to create the population of a thousand random portfolios. This is computationally time-consuming, but it gives the expected result.¹

¹ A computationally cheaper way to generate random portfolios with a Herfindahl constraint is to use hyper-spherical coordinates.

Result of the experiment

In figures 4.3 and 4.4 one can observe that the new distribution of random portfolios in terms of returns is wider. The range of returns is now from -20 percent to +30 percent. The calculated p-values now lie between 0 and 1. The evaluation of strategies is possible. The p-value represents the rank, in percentage, of the strategy against the random portfolios. However, one p-value for one period is not enough to decide whether the strategy is good or whether the result is due to luck.

Figure 4.3: Histogram of the 1000 RANDOM PORTFOLIOS generated with a Genetic Algorithm at a Herfindahl Index of 1/3. Same period and universe as figure 4.1. We can observe that the range of returns is much wider than that of figure 4.1.

Figure 4.4: Graph of RANDOM PORTFOLIOS sorted by increasing returns. The 1000 random portfolios were generated with a Genetic Algorithm at a Herfindahl Index of 1/3. Same period and universe as figure 4.2. This graph can be used to read the p-value of a strategy on this particular period and universe if its Herfindahl index is 1/3.

4.3 Conclusion

The random portfolios technique should be used carefully.
Always remember this common-sense rule: compare what is comparable. Figure 4.5 represents p-values for different values of the Herfindahl index. Random portfolios are specific to a universe. Based on the same level of risk, the use of random portfolios and p-values is a really powerful tool to assess the performance of a strategy. However, this evaluation should be done over a significant number of periods of time to ensure the validity of the result against luck. The use of statistics to create a confidence interval around the mean p-value observed should give insight into the fundamental question: skilled or not? One idea to increase the validity of the p-value test would be to cut a long period into different scales, for example cutting a period of 6 months in 60 periods of 3 months to compute more p-values. However, this cut could have an impact on the pertinence of the result by decorrelating the test from the strategy studied.

Figure 4.5: RANDOM PORTFOLIOS return for different levels of the Herfindahl Index. Each line corresponds to a different p-value. The universe of portfolios is composed of 28 stocks of the CAC40. Six months of daily returns are used to evaluate portfolio returns. A thousand random portfolios were generated with a Genetic Algorithm for each of the 28 levels of the Herfindahl index (1, 1/2, 1/3, ..., 1/28).

Chapter 5
Innovative idea on risks

The creation of new strategies should be based on new measures of risk. The use of the variance or the standard deviation as a measure of risk can no longer be accepted as valid by researchers. The general idea of an investment strategy should still be to minimize the risks and maximize the returns of the portfolio. Computation is no longer a problem with today's computers. The use of a genetic algorithm allows strategies based on non-linear measures of risk.
With the genetic algorithm, there is no need for an n-th improvement of the estimation of the covariance matrix, because optimization problems do not need to be expressed in a quadratic form to be solved. My innovative idea is to separate the measure of risk to be minimized from the level of risk taken. The level of risk represents the risk profile wanted by investors. The measure of risk gives information to select stocks in a safe way.

5.1 Measures of risks and dependencies

5.1.1 Coherent Measure of risk

In the paper "Coherent Measure of risk" [2], the authors develop a theory explaining which properties a good measure of risk should have in portfolio management.

"A risk measure satisfying the four axioms of translation invariance, subadditivity, positive homogeneity and monotonicity is called coherent." [2]

Translation invariance: If $a \in \mathbb{R}$ and $X \in L$, then $\rho(a + X) = \rho(X) - a$. The value $a$ is just cash added to your portfolio $X$, which acts like an insurance: the risk of $X + a$ is less than the risk of $X$, and the difference is exactly the added cash $a$. In particular, if $a = \rho(X)$ then $\rho(X + \rho(X)) = 0$.

Sub-additivity: If $X_1, X_2 \in L$, then $\rho(X_1 + X_2) \le \rho(X_1) + \rho(X_2)$. The risk of two portfolios together cannot get any worse than adding the two risks separately: this is the diversification principle.

Positive homogeneity: If $\alpha \ge 0$ and $X \in L$, then $\rho(\alpha X) = \alpha \rho(X)$. Loosely speaking, if you double your portfolio then you double your risk.

Monotonicity: If $X_1, X_2 \in L$ and $X_1 \le X_2$, then $\rho(X_2) \le \rho(X_1)$. That is, if portfolio $X_2$ has better values than portfolio $X_1$ under all scenarios, then the risk of $X_2$ should not exceed the risk of $X_1$. This is consistent with translation invariance, under which adding cash to a portfolio lowers its risk.

5.1.2 Dependencies

Measures of risk are tied to the universe of the study. Dependencies are included in the measure of risk. The properties of these dependencies depend on the measure of risk used.
For example, minimizing the variance of a portfolio is not the same as minimizing the weighted sum of the variances of the single assets of the same portfolio, because variance is non-linear (actually, it is quadratic). Minimizing a measure of risks on a whole portfolio takes into consideration the dependencies between assets. This compensation between the assets of a portfolio has a positive impact on the properties of the portfolio if the measure of risks is coherent. Hence dependencies are taken into account.

5.1.3 Some measure of risks

Below are listed some measures of risks and their main interests. These measures are used in the chapter Simulation to create portfolios. They are all well known, except the Maximum DrawDown of returns, of which I believe I am the creator. "If Maximum DrawDown of stock prices is the speed, Maximum DrawDown of returns is the acceleration."

Variance: This is the most commonly used measure of risk in portfolio optimization. It is convenient to calculate and to represent. One problem of the variance is that extreme risks are not taken into account.

Moments of higher even order (4, 6, 8): Moments of order 4, 6 and 8 take extreme risks into account. They indeed characterize fat-tailed distributions.

Maximum DrawDown of stock price: The maximum loss from peak to valley. The Maximum DrawDown has some nice properties: it is invariant by translation, homogeneous and convex. Thus, like the volatility, the MDD satisfies these three properties.

Average Maximum DrawDown of stock price: It takes the average of the Maximum DrawDowns of a time series of stock prices. See the code of the function in the annexes.

Maximum DrawDown of returns: Like the Maximum DrawDown of stock prices, but based on the series of returns. By analogy with physics, it represents the extreme acceleration of the system. As will be shown in the chapter Simulation, the MDD of returns performs well in portfolio optimization.

Figure 5.1: Illustration of two MAXIMUM DRAWDOWNS
Value at Risk: The VaR measures the loss of a portfolio for a given quantile. It is currently controversial since it does not take extreme risks into account. Moreover, this measure is not coherent for non-Gaussian distributions.

Conditional Value at Risk: The C-VaR is also known as the expected shortfall. This measure is known as a coherent measure of risks. It answers the criticisms made of the VaR. It is more sensitive to the shape of the loss distribution. Extreme risks are taken into account, because it focuses on the worst-scenario quantiles.

Semi-variance: This risk measure aims at measuring the variance of the negative returns. The logic is that the variance of positive returns is an opportunity, whereas the variance of negative returns represents a risk.

Deviation to the median: This risk measure is a kind of standard deviation. It takes into account the asymmetry of the distribution of returns.

5.2 Level of risks and diversification

Introduction. Risks are often associated with diversification. Using a coherent measure of risks implies that risk diminishes with diversification (sub-additivity property). A portfolio containing stocks of one company will lose everything if the company dies. However, if the portfolio contains stocks of two companies in equal proportion and one company dies, the portfolio will still have the value from the other company.
An easy example to understand the impact of diversification. Given:
• A and B, two independent companies,
• $P(A=1)$ the probability that company A survives until next year, and $P(A=0)$ the probability that company A dies before next year,
• the value of a company remains equal to 1 during the year in case of survival, and equal to 0 in case of death,
• $\Pi_1$ the portfolio containing only stocks of A for a total value of 100,
• $\Pi_2$ the portfolio containing stocks of A and B in equal proportion for a total value of 100,
• $P(A=1) = P(B=1) = 0.99$ and $P(A=0) = P(B=0) = 0.01$.

The expected value of the portfolios at the end of the year is the same in both cases:

$E(\Pi_1) = 100 \times (1 \times P(A=1) + 0 \times P(A=0)) = 99$
$E(\Pi_2) = 50 \times (1 \times P(A=1) + 0 \times P(A=0)) + 50 \times (1 \times P(B=1) + 0 \times P(B=0)) = 99$
$E(\Pi_1) = E(\Pi_2)$

However, the extreme risk that the portfolio is equal to 0 is:

$P(\Pi_1 = 0) = P(A=0) = 0.01$
$P(\Pi_2 = 0) = P(A=0 \cap B=0) = P(A=0) \times P(B=0) = 0.0001$

This example shows a reduction by a factor of one hundred of the risk of losing everything. Note that for non-independent companies, the equality $P(A=0 \cap B=0) = P(A=0) \times P(B=0)$ no longer holds. However, the result $P(\Pi_2 = 0) \le P(\Pi_1 = 0)$ is still true. In conclusion, diversification brings a benefit by lowering extreme risks.

Herfindahl Index. The Herfindahl index is a measure of diversification. It is used in economics to study the concentration of a market. If its value is close to zero, the market shares are distributed among a lot of companies. Conversely, if its value is close to one, the market shares are concentrated in a few companies. This indicator makes it possible to select the number of companies wanted in the portfolio during the optimisation process. For instance, if you want about 5 different assets under the assumption of an equally weighted portfolio, $H = \sum_{i=1}^{5} w_i^2 = \frac{1}{5}$. As shown in figure 5.2, risk decreases as the diversification of portfolios increases.
In this example, the minimum return of simulated portfolios goes from -17 percent to 2 percent when the average number of stocks in portfolios goes from 1 to 28. The same is true of the maximum return of the portfolios: over the period, it is divided by 6, from 30 percent to 5 percent. A small Herfindahl Index implies a small range of variation and lower risks. Moreover, the Herfindahl index is quite stable as a portfolio evolves. As you can see in figure 5.3, variations of the Herfindahl index are not significant over a period of 122 days of evolution. This holds for other values of the Herfindahl index and other periods of time. To be short, "High degree of diversification implies small range of performance available. Low degree of diversification implies high range of performance available".

Figure 5.2: ENVELOPE of returns taken by 1000 random portfolios calculated for H0 = [1, 1/2, 1/3, 1/4, 1/5, ..., 1/28] over a period of 121 days. H0 is the value of the Herfindahl Index on the first day of the period.

Figure 5.3: VARIATION of the HERFINDAHL Index of a portfolio over a period of 122 days. H0 is equal to 0.1. The second graphic is a zoom of the first one. We can observe that the variations of the Herfindahl Index are really small: Var(H) is about $10^{-7}$.

5.3 A winner mix

Combining Diversification and Innovative risks measures. The conclusion of the first two sections is that there are various definitions of risks and that the level of risks can be thought of in terms of level of diversification. My idea for creating new strategies is to minimize a measure of risks and to set the level of risks by choosing a value of diversification through the use of the Herfindahl index. Since the distribution of returns is unstable over time, I think that the measure of risks should not come from values related to the distribution. Instead, the measure of risks should come from time-series statistics, like the Maximum DrawDown.
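As an illustration only, the two ingredients of this idea can be sketched in a few lines of Python (the tool used in this study is written in Matlab and relies on a genetic algorithm, neither of which is reproduced here). The function names are hypothetical. The drawdown is taken as the largest absolute peak-to-valley drop of a series, which is an assumption about the exact definition, and the penalised objective follows the form F(R.w) + |H0 - h(w)| listed in section 6.1.3.

```python
def herfindahl(weights):
    """Herfindahl index: sum of squared weights (1/n for n equal weights)."""
    return sum(w * w for w in weights)


def max_drawdown(series):
    """Largest absolute peak-to-valley drop of a series (assumed definition).

    Applied to prices it gives the MDD of values, applied to a series of
    returns it gives the MDD of returns.
    """
    peak = series[0]
    worst = 0.0
    for x in series:
        peak = max(peak, x)            # highest point seen so far
        worst = max(worst, peak - x)   # deepest drop from that point
    return worst


def portfolio_returns(returns, weights):
    """Daily portfolio returns; `returns` holds one list of returns per asset."""
    days = len(returns[0])
    return [sum(w * r[t] for w, r in zip(weights, returns)) for t in range(days)]


def objective(returns, weights, h_target):
    """Risk measure plus the absolute gap to the target Herfindahl index."""
    risk = max_drawdown(portfolio_returns(returns, weights))
    return risk + abs(h_target - herfindahl(weights))


# Hypothetical data: 2 assets, 3 days, equal weights, target of about 2 stocks.
example_returns = [[0.01, -0.02, 0.03], [0.00, 0.01, -0.01]]
print(objective(example_returns, [0.5, 0.5], h_target=0.5))
```

An optimizer would then search over weights that are non-negative and sum to one for the smallest value of `objective`; any measure of risks can be substituted for `max_drawdown`.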
The mix of measure of risks and diversification should lead to safer and more profitable portfolios.

Active versus Passive Portfolio Management. The process of asset management is complex and has several components. A good asset manager should take into consideration the allocation between classes of assets, that is, the split of the portfolio between cash, bonds, stocks and others. The manager should select the good assets in each class, which means, for example, picking the right stocks or bonds. The manager should also apply a market-timing strategy, which means adapting the allocation strategy to the market cycle (bear and bull markets). An ideal asset manager should be able to do these three tasks in order to benefit from active or dynamic portfolio management. A lot of studies show that funds don't beat the market [11]. Hence the question of active versus passive management becomes pertinent. In the following, I choose to study a light active management: adjustment of the portfolio every 6 months.

Chapter 6 Simulation

Aim of the simulation. This simulation looks at the impact of new measures of risks in portfolio construction. The main question is: is it possible to create new allocation strategies, for a given level of risks, that give persistent and consistent results? For the purpose of this study, I developed an optimization tool and an analysis tool to perform this simulation, using Matlab and a genetic algorithm. Moreover, the analysis of performance estimates the p-values from a random portfolios analysis in order to compare the different strategies.

6.1 Methodology

6.1.1 Data

Stocks returns. The data used in this simulation are 28 stocks of the CAC 40 from 17/01/2003 to 03/08/2009. Data come from the website Yahoo Finance. Returns on stocks are calculated from daily adjusted closing prices. The adjusted closing price takes into account changes due to dividends and splits of the stock.
The large number of stocks has been chosen to create realistic portfolios. Figure 6.1 shows the evolution of the CAC 40 during the period used. It corresponds to the end of the 2001 crisis, followed by the subprime bubble, the crash of the stock market in 2008 and the rebound of 2009.

Survival and Look-ahead Bias. In empirical finance, results can be biased due to a survival effect coming from the selection of the data. The survival bias corresponds to the fact that the simulation is done on data of stocks which are still alive. Hence it ignores companies that died during the time span of the simulation. It has been shown that the survival bias can increase the return of a strategy during a simulation [5, 10]. I chose only 28 assets from the CAC40 instead of 40. Is my study subject to survival bias? First, no company of the CAC40 died during the time of the simulation. Companies of the CAC40 are generally big enough to survive. However, some companies disappear due to mergers and acquisitions (GDF-Suez), or appear due to new regulation, for example Suez environment detached from Suez or GDF detached from EDF. This did not increase the survival bias since these companies are not under-performing or over-performing the market. Second, I did not include every asset because of the lack of quality of certain data, where the adjusted price did not take splits into account. In conclusion, I think that my data are not much affected by survival bias.

Figure 6.1: Evolution of the CAC40.

Concerning the look-ahead bias, my test procedure uses one period to create the portfolio, which is then tested on the following period. Since I did not use future information in the past, my study should not be affected by this bias (see figure 6.2). However, it is necessary to be aware of the look-ahead benchmark bias, which appears in case of benchmark comparison.
It comes from the fact that the composition of the benchmark is permanently evolving and that the information on this composition is not easily available and usable [10].

6.1.2 Test

Strategy. The main idea of this simulation is to test the idea of level of risks based on the Herfindahl index for investment in stocks in a limited universe. The investment strategies take into account a diversification constraint and a measure of risks to be minimized. All these strategies are compared to the equally weighted portfolio, which provides a base of comparison. The strategies are back-tested with the data described below. The portfolios are constructed from the last year of data and kept during the 6 following months.

Period     Training from  Training to  Test from   Test to
period 1   17/01/2003     16/01/2004   19/01/2004  02/07/2004
period 2   04/07/2003     02/07/2004   05/07/2004  17/12/2004
period 3   19/12/2003     17/12/2004   20/12/2004  03/06/2005
period 4   04/06/2004     03/06/2005   06/06/2005  18/11/2005
period 5   19/11/2004     18/11/2005   21/11/2005  05/05/2006
period 6   06/05/2005     05/05/2005   08/05/2006  20/10/2006
period 7   21/10/2005     20/10/2006   23/10/2006  06/04/2007
period 8   07/04/2007     06/04/2007   09/04/2007  21/09/2007
period 9   22/09/2006     21/09/2007   24/09/2007  07/03/2008
period 10  09/03/2007     07/03/2008   10/03/2008  22/08/2008
period 11  24/08/2007     22/08/2008   25/08/2008  11/02/2009
period 12  11/02/2008     11/02/2009   12/02/2009  03/08/2009

Figure 6.2: This table contains the different PERIODS OF DATA used during the simulation. Training data are the data used to create a portfolio. Test data are the data used to evaluate the portfolio. Training data contain 260 days and test data 120 days.

Optimization. The optimization is carried out with a tool developed in Matlab, using a genetic algorithm. The optimization tool cuts the data into periods, on which it performs in sequence the optimization of all the strategies.
The result of this optimisation is stored in one file to be analysed next. It contains the portfolio selected for the next 6-month period and the data of this test period in order to enable the analysis later on.

Evaluation of performance. The analysis is done by another tool developed in Matlab. It makes it possible to look at each of the strategies for each sub-period and to compare them. It evaluates the transaction costs, the wealth and the evolution of the strategies over all the periods. Moreover, random portfolios p-values and Sharpe ratios of the strategies are estimated from test data at each sub-period.

6.1.3 Set of strategies

Here are the different sets of strategies analysed. The optimal portfolio for a period is calculated by the optimization software on the preceding period.

$w$ represents the vector of portfolio weights, of dimension $n$, the number of assets.
$R$ represents the matrix of returns, $n \times m$, with $n$ the number of assets and $m$ the number of trading days of the period.
$V$ represents the matrix of stock prices.
$w^*$ is the optimal portfolio such that $F(R.w^*) + Hc(w^*)$ is minimum, with $F$ the risk measure and $Hc$ the diversification constraint.
No short selling is allowed: $\forall i, \; 0 \le w_i \le 1$. The other constraint on weights is $\sum_{i=1}^{n} w_i = 1$.

Each set contains 12 sub-strategies: sub-strategies 1 to 10 target a Herfindahl index of $\frac{1}{k}$ for $k = 1, 2, \dots, 10$, sub-strategy 11 has no diversification constraint, and sub-strategy 12 only targets a Herfindahl index of $\frac{1}{28}$.

Minimum Variance Strategy
1 to 10. $\min \; \mathrm{var}(R.w) + |\frac{1}{k} - h(w)|$, for $k = 1, 2, \dots, 10$
11. $\min \; \mathrm{var}(R.w)$
12. $\min \; |\frac{1}{28} - h(w)|$

Minimum Moment of order 4 Strategy
1 to 10. $\min \; \mu_4(R.w) + |\frac{1}{k} - h(w)|$, for $k = 1, 2, \dots, 10$
11. $\min \; \mu_4(R.w)$
12. $\min \; |\frac{1}{28} - h(w)|$

Minimum Maximum DrawDown of returns Strategy
1 to 10. $\min \; MDD(R.w) + |\frac{1}{k} - h(w)|$, for $k = 1, 2, \dots, 10$
11. $\min \; MDD(R.w)$
12. $\min \; |\frac{1}{28} - h(w)|$

Minimum Maximum DrawDown of values Strategy
1 to 10. $\min \; MDD(V.w) + |\frac{1}{k} - h(w)|$, for $k = 1, 2, \dots, 10$
11. $\min \; MDD(V.w)$
12. $\min \; |\frac{1}{28} - h(w)|$

Minimum Average Maximum DrawDown of values Strategy
1 to 10. $\min \; AverageMDD(V.w) + |\frac{1}{k} - h(w)|$, for $k = 1, 2, \dots, 10$
11. $\min \; AverageMDD(V.w)$
12. $\min \; |\frac{1}{28} - h(w)|$

6.2 Results

6.2.1 Tables

These tables present indicators resulting from the simulation. The data were split into 12 test periods from 19/01/2004 to 03/08/2009. New portfolios are generated at each sub-period.
Herfindahl  Variance  µ4    MDD(r)  MDD(v)  A-MDD(v)
1           180       221   231     139     95
1/2         175       123   141     205     115
1/3         140       150   141     174     122
1/4         251       122   123     128     98
1/5         127       87,8  146     133     127
1/6         101       125   150     134     109
1/7         126       114   133     130     118
1/8         141       124   144     128     127
1/9         107       89    139     122     119
1/10        105       85    137     120     108
free        133       131   129     134     116
1/28        112       112   112     112     112

Figure 6.3: END WEALTH of the different strategies at the date of 03/08/2009. Starting value of 100 on 19/01/2004.

Herfindahl  Variance  µ4    MDD(r)  MDD(v)  A-MDD(v)
1           165       204   216     111     64
1/2         145       93    112     179     85
1/3         112       123   117     152     98
1/4         222       92    101     106     72
1/5         103       59    125     114     105
1/6         77        99    133     116     88
1/7         105       91    117     114     99
1/8         119       101   129     113     111
1/9         86        69    126     109     103
1/10        86        65    125     107     94
free        124       121   116     117     100
1/28        108       108   108     108     108

Figure 6.4: END WEALTH including TRANSACTION COSTS of 1 percent of the total value traded. Starting value of 100 on 19/01/2004.

Herfindahl  Variance  µ4    MDD(r)  MDD(v)  A-MDD(v)
1           0,38      0,35  0,36    0,40    0,52
1/2         0,38      0,46  0,45    0,29    0,47
1/3         0,40      0,41  0,38    0,35    0,45
1/4         0,24      0,46  0,41    0,46    0,51
1/5         0,45      0,51  0,38    0,41    0,43
1/6         0,56      0,43  0,35    0,41    0,45
1/7         0,43      0,51  0,37    0,40    0,44
1/8         0,37      0,45  0,36    0,40    0,40
1/9         0,41      0,64  0,35    0,43    0,44
1/10        0,48      0,60  0,36    0,43    0,50
free        0,32      0,31  0,34    0,40    0,43
1/28        0,49      0,50  0,50    0,49    0,49

Figure 6.5: Average RANDOM PORTFOLIOS P-VALUES of the different strategies for the period from 19/01/2004 to 03/08/2009.

Herfindahl  Variance  µ4    MDD(r)  MDD(v)  A-MDD(v)
1           0,82      1,01  0,85    0,65    0,22
1/2         0,74      0,56  0,60    1,08    0,53
1/3         0,98      0,71  1,12    1,03    0,75
1/4         1,28      0,48  1,01    0,75    0,60
1/5         0,78      0,55  1,25    1,09    0,89
1/6         0,68      0,67  1,27    1,08    0,85
1/7         0,92      0,61  1,16    1,05    0,89
1/8         1,12      0,73  1,23    1,07    1,02
1/9         1,03      0,37  1,20    0,96    0,99
1/10        0,83      0,45  1,21    0,96    0,75
free        1,34      1,32  1,24    0,97    0,88
1/28        0,73      0,73  0,73    0,73    0,73

Figure 6.6: Average yearly SHARPE RATIO of the different strategies for the period from 19/01/2004 to 03/08/2009.
Ranks                Variance  µ4  MDD(r)  MDD(v)  A-MDD(v)
Wealth generated     2         4   1       3       5
Sharpe ratio         3         5   1       2       4
Random Portfolios    3         5   1       3       4
Overall Performance  3         5   1       3       4

Figure 6.7: Result summary. Numbers represent the rank of the strategies against each other.

6.2.2 Wealth evolution - first interpretation

Some stylized facts on results. A first comparison of the wealth generated by each set of strategies during the back-testing period gives an idea of performance. Figure 6.3 shows the value of the portfolios at the end of the 12 periods of the simulation for every set of strategies. For each strategy, new portfolios are generated at the beginning of each period based on the previous year of data. The starting value (19/01/2004) is 100. The end of the simulation is 03/08/2009. At this date, the equally weighted portfolio achieved a performance of 12.2 percent. The equally weighted portfolio strategy represents the performance of the market. It is used as a benchmark. Looking at the generated wealth, strategies based on the Maximum DrawDown of returns and of values have results above the equally weighted portfolio for every level of diversification simulated. These two measures of risks seem to contain above-average information about the growth capacity of portfolios. Their power of prediction leads to high returns. Moreover, figures 6.10 and 6.11 represent the evolution of their wealth. These graphics confirm that portfolios based on the Maximum DrawDown of returns and of values outperform the benchmark at every level of diversification in the long run. Sets of strategies based on Variance and the 4th Moment do not give such robust and persistent results. They have high performance with poorly diversified portfolios (1-4 stocks) and low performance against the benchmark with diversified portfolios (5-10 stocks). Finally, strategies based on the Average Maximum DrawDown of values seem to be of no interest.
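The comparison with the benchmark can be re-derived from the figures of table 6.3. The short Python sketch below is only an illustration: the values are transcribed from that table for the levels 1, 1/2, ..., 1/10 and "free", the benchmark is the equally weighted portfolio (112), and the helper name `summary` is hypothetical.

```python
from statistics import median

BENCHMARK = 112  # end wealth of the equally weighted portfolio (H = 1/28)

# End wealth for H = 1, 1/2, ..., 1/10 and "free", as read from table 6.3.
END_WEALTH = {
    "Variance": [180, 175, 140, 251, 127, 101, 126, 141, 107, 105, 133],
    "mu4": [221, 123, 150, 122, 87.8, 125, 114, 124, 89, 85, 131],
    "MDD(r)": [231, 141, 141, 123, 146, 150, 133, 144, 139, 137, 129],
    "MDD(v)": [139, 205, 174, 128, 133, 134, 130, 128, 122, 120, 134],
    "A-MDD(v)": [95, 115, 122, 98, 127, 109, 118, 127, 119, 108, 116],
}


def summary(values, benchmark=BENCHMARK):
    """Number of sub-strategies above the benchmark, then min, median, max."""
    above = sum(1 for v in values if v > benchmark)
    return above, min(values), median(values), max(values)


for name, values in END_WEALTH.items():
    above, low, mid, high = summary(values)
    print(f"{name:9s} above benchmark: {above:2d}/{len(values)}  "
          f"min={low}  median={mid}  max={high}")
```

The minimum, median and maximum are the same quantities that a box plot of these end-wealth values summarises.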
Robustness to diversification. A measure of risks that produces good performance whatever the level of diversification provides a good base for an investment strategy. It proves its capacity to predict which stocks will produce high performance over the next period. The reliability of the measure of risks should be proved for different configurations of portfolios. Hence robustness to diversification is a good indication of reliability. Figure 6.13 shows the results of table 6.3 as a box plot. The box of the Maximum DrawDown of returns is smaller than the others and located above them. It means that on average its performance is above that of the other strategies. Moreover, its performance is robust to diversification since the size of the box is small. It means that strategies based on the Maximum DrawDown of returns provide, in the long run, high returns for every level of diversification. Strategies based on Variance provide a more risky investment since the box of results is wide. It means that this measure of risks is really sensitive to diversification. It could have unexpectedly low or high results. Are these results the fruit of luck? This question needs a random portfolios analysis to be answered. Finally, strategies based on the Average Maximum DrawDown of values are robust to diversification, but for bad returns: the box is small, but located below the others with a small mean. Hence this measure should not be used (same conclusion for the 4th Moment measure of risks).

Stock picking. The aim of an investment strategy in stocks is to select stocks that produce high returns and have low volatility. Stock picking is based on qualitative and quantitative analysis. The study of the different sets of strategies shows a lack of robustness for some measures of risks. However, 4th Moment and Variance strategies show extremely high returns for portfolios composed of 1 to 4 assets. One explanation could be luck!
Nevertheless, another explanation could be that selecting 1 to 4 stocks out of 28 requires less information and competency than selecting 10 good stocks. This could explain why the Variance and 4th Moment measures of risks can lead to the selection of some good stocks without being able to select an entire portfolio. Moreover, portfolios constructed without a diversification constraint (Herfindahl level free) seem to perform better and to lead to portfolios of lower volatility than portfolios of equivalent degree of diversification. The constraint negatively impacts the level of information contained in the measure of risks.

Profile of risks. In the universe of commercial finance, asset managers have to care about the profile of their investors in order to offer them products adapted to their risk acceptance. I think that, based on the results of figures 6.3 and 6.13, 3 profiles emerge. These profiles make sense for a fund that would use a measure of risks robust to diversification in a small universe of large stocks like the CAC40.
• The speculative profile, composed of 1 or 2 stocks.
• The risky profile, composed of 3 to 5 stocks.
• The secure profile, composed by the optimisation without any constraint.

Figure 6.8: Wealth evolution of VARIANCE based strategies from 19/01/2004 to 03/08/2009.

Figure 6.9: Wealth evolution of 4th MOMENT based strategies from 19/01/2004 to 03/08/2009.

Figure 6.10: Wealth evolution of MAXIMUM DRAWDOWN of RETURNS based strategies from 19/01/2004 to 03/08/2009.

Figure 6.11: Wealth evolution of MAXIMUM DRAWDOWN of VALUES based strategies from 19/01/2004 to 03/08/2009.

Figure 6.12: Wealth evolution of Average MAXIMUM DRAWDOWN of VALUES based strategies from 19/01/2004 to 03/08/2009.

Figure 6.13: Comparison of the strategies in terms of WEALTH at the end of the simulation from 19/01/2004 to 03/08/2009. All Herfindahl index levels mixed. The red horizontal line represents the median. The boundaries of the box represent the 25 and 75 percent quantiles.
The whiskers represent the minimum and maximum values. Maximum values generally correspond to a high Herfindahl index (see table 6.3 for the corresponding values plotted here). This plot shows that strategies based on different measures of risks have different responses to diversification.

6.2.3 Random Portfolios - Luck or skill?

Consistency of results. Random portfolios make it possible to rank a strategy. In this study, the p-value is calculated for each sub-strategy and each sub-period. Results are plotted in figures 6.14, 6.15, 6.16, 6.17 and 6.18. It is interesting to observe how variable the p-value is for each strategy. A consistent strategy should have a low mean p-value and a small range of variation. On the maps plotted below the 3D graphics, dark areas represent low p-values. Hence a map with a high dark coverage along the 2 axes, strategy and period, means that the measure of risks is robust to diversification and gives persistent added value over time.

4th Moment. This set of strategies does not show any robustness to diversification. The median p-value is really different for each sub-strategy (level of diversification). The boxes of the box plot are big. Hence the p-value of each sub-strategy is very different from one period to another. It means that this measure of risks does not provide much information. The map seems completely random, implying no consistency in results.

Variance. This set of strategies does not show any regularity of results. The map seems random. However, p-values are on average smaller than 0.5, meaning that these strategies perform better than the expected value of someone acting randomly (see table 6.5 for the average p-values over sub-periods). Is it due to luck? Sub-strategies 4 and 11 are especially good in this simulation.

MDD of returns. This set of strategies shows a good robustness to diversification. Boxes of sub-strategies are quite similar. Dark areas on the map are predominant.
P-values are almost the same for each sub-strategy. The map reveals a periodicity. For example, p-values are bad for all sub-strategies in periods 10 and 12. The mean and median p-values are low for most of the periods, meaning that strategies based on the Maximum DrawDown of returns add real value beyond luck. This strategy seems consistent.

MDD of values. This set of strategies shows a good robustness to diversification. As for the Maximum DrawDown of returns, dark areas are predominant, with a periodicity of bad p-values. The median p-value is quite good (less than 0.5) for all the sub-strategies, but not as good as those of the Maximum DrawDown of returns. On the box plot, sub-strategy 2 (with 2 stocks) is particularly good. However, the difference of p-value between periods implies a strong correlation of the performance with market conditions.

Average MDD of values. This set of strategies shows a kind of robustness to diversification, however bright areas are dominant. P-values are high, meaning no skill at all. This measure of risks does not provide predictive information. It is not effective at selecting interesting stocks.

Figure 6.14: p-values analysis for 4th MOMENT based strategies. Results come from a random portfolios analysis with 1000 random portfolios of the same Herfindahl index as the analysed portfolio. Each box corresponds to the p-values of the 12 sub-periods for one level of Herfindahl Index. In the box, the red horizontal line represents the median p-value over the 12 periods. The boundaries of the box represent the 25 and 75 percent quantiles. The whiskers represent the minimum and maximum p-values. The map represents p-values for a given sub-period and sub-strategy (the 12 sub-strategies correspond to the 12 different levels of Herfindahl index).

Figure 6.15: p-values analysis for VARIANCE based strategies. Results come from a random portfolios analysis with 1000 random portfolios of the same Herfindahl index as the analysed portfolio.
Each box corresponds to the p-values of the 12 sub-periods for one level of Herfindahl Index. In the box, the red horizontal line represents the median p-value over the 12 periods. The boundaries of the box represent the 25 and 75 percent quantiles. The whiskers represent the minimum and maximum p-values. The map represents p-values for a given sub-period and sub-strategy (the 12 sub-strategies correspond to the 12 different levels of Herfindahl index).

Figure 6.16: p-values analysis for MAXIMUM DRAWDOWN of RETURNS based strategies. Results come from a random portfolios analysis with 1000 random portfolios of the same Herfindahl index as the analysed portfolio. Each box corresponds to the p-values of the 12 sub-periods for one level of Herfindahl Index. In the box, the red horizontal line represents the median p-value over the 12 periods. The boundaries of the box represent the 25 and 75 percent quantiles. The whiskers represent the minimum and maximum p-values. The map represents p-values for a given sub-period and sub-strategy (the 12 sub-strategies correspond to the 12 different levels of Herfindahl index).

Figure 6.17: p-values analysis for MAXIMUM DRAWDOWN of VALUES based strategies. Results come from a random portfolios analysis with 1000 random portfolios of the same Herfindahl index as the analysed portfolio. Each box corresponds to the p-values of the 12 sub-periods for one level of Herfindahl Index. In the box, the red horizontal line represents the median p-value over the 12 periods. The boundaries of the box represent the 25 and 75 percent quantiles. The whiskers represent the minimum and maximum p-values. The map represents p-values for a given sub-period and sub-strategy (the 12 sub-strategies correspond to the 12 different levels of Herfindahl index).

Figure 6.18: p-values analysis for AVERAGE MAXIMUM DRAWDOWN of VALUES based strategies. Results come from a random portfolios analysis with 1000 random portfolios of the same Herfindahl index as the analysed portfolio.
Each box corresponds to the p-values of the 12 sub-periods for one level of Herfindahl Index. In the box, the red horizontal line represents the median p-value over the 12 periods. The boundaries of the box represent the 25 and 75 percent quantiles. The whiskers represent the minimum and maximum p-values. The map represents p-values for a given sub-period and sub-strategy (the 12 sub-strategies correspond to the 12 different levels of Herfindahl index).

Asymmetry of performance. As observed in the previous paragraph, the Maximum DrawDown of returns and especially of values show a performance that is sensitive to the period. Figures 6.19, 6.20 and 6.21 represent the performance of sub-strategies against the 10 percent best random portfolios (p-value of 0.1). The market falls sharply in period 11 and rises strongly in period 12. These graphics confirm the asymmetry of performance of strategies based on a measure of risks, especially in the case of the Maximum DrawDown. This asymmetry comes from the ability of these strategies to select the stocks that perform best in a bear market, but also the ones that perform worst in a strongly growing market. The leverage effect can be an explanation of this asymmetry. This effect corresponds to a negative correlation between past returns and future volatility [4]. Moreover, minimizing a measure of risks necessarily means minimizing extreme variations. Hence these measures fail to pick stocks that will grow strongly. However, this effect is interesting for creating a defensive strategy for bear markets.

Remark on the equally weighted portfolio. The equally weighted portfolio has a p-value of 0.5 since there is only one equally weighted portfolio per universe. It is a good estimator of the random portfolio of p-value 0.5 for every level of diversification. Hence the equally weighted portfolio is a good benchmark of performance, because it represents the expected return of a random strategy.
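The p-value used throughout this section can be sketched as follows in Python. This is a simplified illustration, not the tool used in the study: the random portfolios here are equally weighted subsets of k stocks (so that H = 1/k exactly), whereas the study generates them with a genetic algorithm, and the function names and the small universe of returns are hypothetical.

```python
import random


def p_value(strategy_return, random_returns):
    """Share of random portfolios whose return is at least the strategy's.

    A low value means that few random portfolios did as well as the strategy.
    """
    at_least = sum(1 for r in random_returns if r >= strategy_return)
    return at_least / len(random_returns)


def random_portfolio_returns(asset_returns, k, n_portfolios=1000, seed=0):
    """Period returns of random portfolios of k equally weighted stocks (H = 1/k)."""
    rng = random.Random(seed)  # fixed seed so that the sketch is reproducible
    return [sum(rng.sample(asset_returns, k)) / k for _ in range(n_portfolios)]


# Hypothetical 6-month returns for a small universe of 6 stocks.
universe = [0.12, -0.05, 0.03, 0.08, -0.10, 0.02]
benchmark = sum(universe) / len(universe)          # equally weighted portfolio
randoms = random_portfolio_returns(universe, k=2)  # same H as the strategy
strategy = (0.12 + 0.08) / 2                       # a strategy holding 2 stocks

print("strategy p-value: ", p_value(strategy, randoms))
print("benchmark p-value:", p_value(benchmark, randoms))
```

Comparing the strategy only with random portfolios of the same Herfindahl index is what keeps the comparison at a constant level of diversification.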
Figure 6.19: Return of strategies based on MAXIMUM DRAWDOWN of RETURNS, against random portfolios with a p-value of 0.1. In period 11 the market decreases, and in period 12 it increases.

Figure 6.20: Return of strategies based on 4th MOMENT, against random portfolios with a p-value of 0.1. In period 11 the market decreases, and in period 12 it increases.

Figure 6.21: Return of strategies based on VARIANCE, against random portfolios with a p-value of 0.1. In period 11 the market decreases, and in period 12 it increases.

6.2.4 Sharpe Ratio

The Sharpe ratio is defined as the ratio of the mean yearly excess return to the yearly standard deviation. Excess returns are the series of portfolio returns minus the risk-free rate. The risk-free rate is estimated with the US 3-month Treasury bill rate. This ratio indicates how large the performance is in comparison with its variability. It is a classical indicator of performance, computed in most studies. A ratio above one is considered good.

Analysis of the strategies

The comparison of these results confirms the excellent performance of portfolios based on the Maximum DrawDown of returns. These portfolios get on average a Sharpe ratio above one when they contain more than two stocks (see table 6.6). Almost all levels of diversification give a better result than the equally weighted portfolio. Portfolios based on the 4th Moment give particularly bad Sharpe ratios, often below that of the equally weighted portfolio. Surprisingly, portfolios based on Variance are not the best. This suggests that the one-year variance is not a good predictor for selecting stocks that will have a small variance. Indeed, Variance portfolios generate more wealth than Maximum DrawDown of Values portfolios, while the average Sharpe ratio of Maximum DrawDown of Values portfolios is higher (see table 6.7).
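The definition of the Sharpe ratio given in this section can be sketched in Python as follows. This is an illustration under stated assumptions, not the computation used in the study: the function name is hypothetical, a year is taken as 260 trading days (the length of the one-year training window), daily figures are annualized by the usual scaling (mean times the number of days, standard deviation times its square root, which presumes independent daily returns), and the standard deviation is taken on the excess returns.

```python
import math
from statistics import mean, stdev
from typing import Sequence


def yearly_sharpe_ratio(daily_returns: Sequence[float],
                        daily_risk_free: Sequence[float],
                        days_per_year: int = 260) -> float:
    """Mean yearly excess return divided by the yearly standard deviation."""
    if len(daily_returns) != len(daily_risk_free):
        raise ValueError("both series must have the same length")
    # Excess return: portfolio return minus the risk-free rate, day by day.
    excess = [r - rf for r, rf in zip(daily_returns, daily_risk_free)]
    yearly_mean = mean(excess) * days_per_year
    yearly_std = stdev(excess) * math.sqrt(days_per_year)  # sample std
    return yearly_mean / yearly_std


# Hypothetical four-day series with a zero risk-free rate.
returns = [0.01, -0.01, 0.02, 0.0]
risk_free = [0.0, 0.0, 0.0, 0.0]
ratio = yearly_sharpe_ratio(returns, risk_free)
```

Under these assumptions the yearly ratio is simply the daily ratio multiplied by the square root of the number of days per year, which is why the choice of 260 (or 252) days matters when comparing values across studies.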
Graphics representing the Sharpe ratio for sub-periods and sub-strategies show that its value depends strongly on the period, and moreover on the market performance during the period (see figures 6.22, 6.23, 6.24, 6.25 and 6.26).

Relationship between Sharpe Ratio and Random Portfolios

There is no formal relationship between these two indicators. However, their results point in the same direction. The Sharpe ratio assumes that variance is a reliable in-sample measure of risk. Since its results do not contradict those obtained with random portfolios, this assumption could hold. However, the Sharpe ratio is an absolute measure of performance, strongly linked to the market performance. In contrast, the random portfolio p-value is a relative measure of performance, which is uncorrelated with the market but depends on the universe of the study. Hence these indicators are complementary.

Figure 6.22: Yearly SHARPE ratio analysis for VARIANCE based strategies. Computed for 12 sub-strategies and 12 sub-periods.

Figure 6.23: Yearly SHARPE ratio analysis for µ4 based strategies. Computed for 12 sub-strategies and 12 sub-periods.

Figure 6.24: Yearly SHARPE ratio analysis for MAXIMUM DRAWDOWN of RETURNS based strategies. Computed for 12 sub-strategies and 12 sub-periods.

Figure 6.25: Yearly SHARPE ratio analysis for MAXIMUM DRAWDOWN of VALUES based strategies. Computed for 12 sub-strategies and 12 sub-periods.

Figure 6.26: Yearly SHARPE ratio analysis for AVERAGE MAXIMUM DRAWDOWN of VALUES based strategies. Computed for 12 sub-strategies and 12 sub-periods.

6.2.5 Transaction costs overview

Transaction costs are an important issue for investors. If a strategy costs more than the value it adds, it is useless. Here, transaction costs are evaluated as a function of the total amount traded. At the end of each sub-period, portfolios are reconfigured according to the investment strategy.
These changes in value are summed up to give the total amount traded (see figures 6.27, 6.28, 6.29, 6.30 and 6.31). Transaction costs are then a fraction of this amount. Note that I chose to exclude the creation cost of the first portfolio, which corresponds to the cost of the first period.

The total amount traded increases if a strategy performs well. Hence the cost of a good strategy is usually high. As shown in figures 6.27, 6.28 and 6.29, this proposition holds with one exception: the 1-stock strategy. This exception comes from the fact that this strategy selects the same stock during consecutive periods. In this case, transaction costs are equal to zero. The more diversified a portfolio is, the less its weights change, and hence the less costly the strategy is. Observe in the figures quoted above that the equally weighted portfolio is the most diversified and the cheapest.

To conclude, table 6.4 gives an idea of the impact of transaction costs on the strategies. In this example, transaction costs are equal to one percent of the total amount traded. This is certainly excessive for hedge funds, but it is more or less realistic for individual investors. Comparing tables 6.3 and 6.4, it is clear that the number of strategies underperforming the equally weighted portfolio increases a lot once transaction costs are counted. It goes from 0 to 1 for strategies based on Maximum DrawDown of returns, from 3 to 5 for strategies based on Variance, and from 3 to 8 for strategies based on µ4. This gives a clear advantage to strategies based on Maximum DrawDown of returns.
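The cost model described in this section can be sketched in Python as follows. It is only an illustration, not the code of the simulation: the function names and the sample holdings are hypothetical, and counting both purchases and sales in the amount traded is an assumption. The creation of the first portfolio is left out, as in the study.

```python
from typing import Sequence


def total_amount_traded(holdings_before: Sequence[Sequence[float]],
                        holdings_after: Sequence[Sequence[float]]) -> float:
    """Sum of the absolute changes in value per stock over all rebalancings.

    holdings_before[k][i] is the value held in stock i just before the k-th
    reconfiguration, holdings_after[k][i] the value held just after it.
    The creation of the first portfolio is not part of these lists.
    """
    total = 0.0
    for before, after in zip(holdings_before, holdings_after):
        if len(before) != len(after):
            raise ValueError("holdings must cover the same stocks")
        # Assumption: purchases and sales both count as traded value.
        total += sum(abs(a - b) for a, b in zip(after, before))
    return total


def transaction_costs(amount_traded: float, cost_rate: float = 0.01) -> float:
    """Costs as a fraction of the amount traded (one percent by default)."""
    return cost_rate * amount_traded


# Hypothetical two-stock portfolio reconfigured twice.
before = [[60.0, 50.0], [70.0, 40.0]]
after = [[55.0, 55.0], [40.0, 70.0]]
traded = total_amount_traded(before, after)  # 10 + 60 = 70.0
costs = transaction_costs(traded)
```

A strategy that keeps the same single stock from one period to the next has identical holdings before and after each reconfiguration, so its amount traded and its costs are zero, which is the exception noted above.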
Figure 6.27: Total transaction costs of the simulation for 4th MOMENT based strategies

Figure 6.28: Total transaction costs of the simulation for VARIANCE based strategies

Figure 6.29: Total transaction costs of the simulation for MAXIMUM DRAWDOWN of RETURNS based strategies

Figure 6.30: Total transaction costs of the simulation for MAXIMUM DRAWDOWN of VALUES based strategies

Figure 6.31: Total transaction costs of the simulation for AVERAGE MAXIMUM DRAWDOWN of VALUES based strategies

Figure 6.32: How many times a strategy was the best of a period in the simulation for VARIANCE based strategies

6.2.6 Impact of the diversification - concrete cases

Diversification has a positive and a negative impact. The positive impact is that it lowers risk by diluting the unique risks in the portfolio. The unique risk is the risk specific to one company, and it is diluted by the other companies in the portfolio. Hence a diversified portfolio should only bear market risk. The negative impact is that the portfolio loses opportunities for growth. Imagine the case where a few stocks perform far above the market. For example, in a universe of 28 stocks and during a period where only 2 stocks have a positive performance, a 10-stock portfolio will contain at least 8 stocks without positive performance. Figures 6.32, 6.33, 6.34, 6.35 and 6.36 represent the number of times that a strategy earns the highest return over a period. The results clearly show that exceptional performance comes from poorly diversified portfolios. In conclusion, there is a trade-off to make between exceptional performance and low risk.
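The level of diversification is tracked throughout this chapter with the Herfindahl index. As a reminder, and assuming the usual definition (the sum of the squared portfolio weights, for long-only weights that sum to one), a short Python sketch shows why the levels 1, 1/2, 1/3, ... correspond to equally weighted portfolios of 1, 2, 3, ... stocks. The function name is hypothetical and this is not the Matlab code of the study.

```python
from typing import Sequence


def herfindahl_index(weights: Sequence[float]) -> float:
    """Sum of squared weights: 1 for a single stock, 1/n for n equal weights."""
    if any(w < 0 for w in weights):
        raise ValueError("weights are assumed to be long-only")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights are assumed to sum to one")
    return sum(w * w for w in weights)


concentrated = herfindahl_index([1.0])       # one stock only
balanced = herfindahl_index([0.25] * 4)      # four equally weighted stocks
tilted = herfindahl_index([0.7, 0.1, 0.1, 0.1])
```

The inverse of the index can be read as an effective number of stocks: the tilted four-stock portfolio above has an index of about 0.52, so it behaves like a portfolio of roughly two equally weighted stocks.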
Figure 6.33: How many times a strategy was the best of a period in the simulation for 4th MOMENT based strategies

Figure 6.34: How many times a strategy was the best of a period in the simulation for MAXIMUM DRAWDOWN of RETURNS based strategies

Figure 6.35: How many times a strategy was the best of a period in the simulation for MAXIMUM DRAWDOWN of VALUES based strategies

Figure 6.36: How many times a strategy was the best of a period in the simulation for AVERAGE MAXIMUM DRAWDOWN of VALUES based strategies

Chapter 7
What's next?

7.1 Recommendation

A winning strategy

From this study, I recommend using the one-year Maximum DrawDown of returns as a measure of risk for portfolio optimization. This measure is coherent and provides good information for selecting the stocks of a portfolio for the next 6 months. This recommendation is valid for a small investment universe (about 40 stocks) of large capitalizations, since the study was based on the CAC. I think it could be extended to a similar universe such as the Dow Industrials, DAX, FTSE or SMI.

The Maximum DrawDown of returns measures how abrupt the changes of trend of a stock are from peak to valley. It therefore leads to selecting the most stable stocks. Moreover, minimizing this measure for a portfolio leads to finding stocks that are anti-correlated in periods of sharp market change. This ensures the robustness of these portfolios in times of crisis.

As seen before, the level of diversification of the portfolio drives the overall range of performance. Thus, the idea is to use diversification as a measure of the level of risk inherent in a strategy. Logically, for the same level of skill, a highly diversified portfolio will offer less performance than a poorly diversified one. Hence the optimization process and the measure of risk used to construct portfolios should be robust to diversification.
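The peak-to-valley idea behind the Maximum DrawDown can be illustrated with a short Python sketch. It uses the common definition on a series of values (the largest relative fall from a running peak) and is not the Matlab code of Annex 8.1; the exact variants used in this study (of returns, of values, and their average) are those defined in Chapter 5. The function name and the sample series are hypothetical.

```python
from typing import Sequence


def maximum_drawdown(values: Sequence[float]) -> float:
    """Largest relative fall from a running peak to a later valley.

    Assumes a non-empty series of strictly positive values
    (for example the wealth of a portfolio over time).
    """
    if not values:
        raise ValueError("need at least one value")
    peak = values[0]
    worst = 0.0
    for v in values:
        peak = max(peak, v)                    # highest value seen so far
        worst = max(worst, (peak - v) / peak)  # fall from that peak
    return worst


# Hypothetical wealth series: the peak at 120 is followed by a valley at 80.
wealth = [100.0, 120.0, 90.0, 110.0, 80.0, 130.0]
mdd = maximum_drawdown(wealth)  # (120 - 80) / 120, about 0.33
```

A series that only rises has a drawdown of zero, which is why minimizing this measure favours stocks with stable trends.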
As observed in the evaluation of the strategies with random portfolios, the Maximum DrawDown of returns is robust to diversification, since its mean p-values are good for all levels of diversification. From the results of the back-testing, from 17/01/2003 to 03/08/2009, I identify three investor profiles: a speculative profile with 1-2 stocks, a risky profile with 3-5 stocks and a neutral profile with no diversification constraint. The back-testing on this stressed period reveals a propensity of Maximum DrawDown of returns strategies to achieve high performance in bear markets and average performance in bull markets. Thus it ensures the safety of the investment in times of crisis, even for a speculative profile, and good performance in the long run.

A good back-testing

A back-test is relevant when applied to a period long enough to contain consecutive bull and bear markets. Indeed, it enables investors to stress test their investment strategies. Moreover, the evaluation of p-values with random portfolios gives a really deep understanding of the performance of a strategy over time. It is important to understand the performance of a strategy in detail to see whether it adds value. From a study of random portfolios and after many comparisons, I suggest that the equally weighted portfolio, which represents the expected value of investors acting randomly, is a reliable benchmark of performance. Indeed, comparing the wealth produced by a strategy with that of the equally weighted portfolio is equivalent to comparing the strategy with a 50 percent p-value random portfolio.

7.2 To go further

Exploring all my ideas would have required more time. Some of them are listed below as ways to go further in my study of investment strategies:

Impact of time: This study was done using one year of training data and a holding period of 6 months. Varying these durations may lead to other conclusions for the different indicators.
Impact of the universe: The universe of study was quite small. What if it included hundreds of stocks from emerging markets?

Maximum DrawDown in multi-asset optimization: What would be the result of an optimisation of the Maximum DrawDown of returns on a mixed universe of bonds and stocks?

Applying re-sampling for strategies based on moments of orders 2 and 4: Re-sampling may lead to better results for this kind of measure based on returns.

Combining a market timing tool with asset allocation: Some risk measures give better results in bear markets, others in bull markets. Predicting these market trends and adapting the optimized measure should produce exceptional performance.

Using hyper-spherical constraints in portfolio optimization and random portfolio generation: It should significantly decrease the computation time.

Short selling: Short selling is not covered in this study, because I do not believe that it is a very useful position for long-run investment. However, it could be interesting to observe its effects on measures such as the Maximum DrawDown.

List of Figures

2.1 Formulation of Mean-Variance optimization
2.2 Instability of the EFFICIENT FRONTIER moving at each period. Efficient frontiers from 28 assets over 12 periods of 6 months.
2.3 DEVIATION of a portfolio from the EFFICIENT FRONTIER. Same frontiers as figure 2.2 with the evolution of a portfolio taken on the first efficient frontier. A portfolio built in the Mean-Variance framework rapidly becomes inefficient.
2.4 SHRINKAGE Method - a trade-off between sample covariance and single index
2.5 REVERSE ENGINEERING - Measure of distance from the sample to the reverse engineering estimate. The factor alpha corresponds to the trade-off between the bias and the variance of our estimation
2.6 REVERSE ENGINEERING - Minimizing the distance of the reverse engineering estimate to the sample estimate under the condition that the market efficient portfolio belongs to the efficient frontier
3.1 REPRODUCTION PROCESS using elite count, crossover and mutation. In this example the crossover rate is 0.8 and the elite count is 2
3.2 A view of the scattered CROSSOVER function process. A random vector of bits is generated, of length equal to the number of assets in the portfolio. A zero means we select the weight of parent 1 and a one the weight of parent 2
3.3 Gaussian MUTATION function adds a random number taken from a Gaussian distribution with mean 0 to each entry of the parent vector.
4.1 Histogram of the 1000 RANDOM PORTFOLIOS generated with the random and the normalize function. Returns are computed on a period of 6 months. The universe of study is composed of 28 stocks of the CAC40 over 6 months.
4.2 Graphics of RANDOM PORTFOLIOS sorted by increasing returns. The 1000 random portfolios were generated with the random and the normalize function. Returns are computed on a period of 6 months. The universe of study is composed of 28 stocks of the CAC40 over 6 months.
4.3 Histogram of the 1000 RANDOM PORTFOLIOS generated with a Genetic Algorithm at a Herfindahl Index of 1/3. Same period and universe as figure 4.1. We can observe that the range of returns is much wider than that of figure 4.1
4.4 Graphics of RANDOM PORTFOLIOS sorted by increasing returns. The 1000 random portfolios were generated with a Genetic Algorithm at a Herfindahl Index of 1/3. Same period and universe as figure 4.2. This graphic can be used to read the p-value of your strategy on this particular period and universe if its Herfindahl index is 1/3.
4.5 RANDOM PORTFOLIOS return for different levels of the Herfindahl Index. Each line corresponds to a different p-value. The universe of portfolios is composed of 28 stocks of the CAC40. Six months of daily returns are used to evaluate portfolio returns. A thousand random portfolios were generated with a Genetic Algorithm for each of the 28 levels of the Herfindahl index (1, 1/2, 1/3, ..., 1/28)
5.1 Illustration of two MAXIMUM DRAWDOWNS
5.2 ENVELOPE of returns taken by 1000 random portfolios calculated for H0 = [1, 1/2, 1/3, 1/4, 1/5, ..., 1/28] over a period of 121 days. H0 is the value of the Herfindahl Index on the first day of the period.
5.3 VARIATION of the HERFINDAHL Index of a portfolio over a period of 122 days. H0 is equal to 0.1. The second graphic is a zoom of the first one. We can observe that the variations of the Herfindahl Index are really small: Var(H) is about equal to 10^-7.
6.1 Evolution of the CAC40.
6.2 This table contains the different PERIODS OF DATA used during the simulation. Training data are the data used to create a portfolio. Test data are the data used to evaluate the portfolio. Training data contained 260 days and test data 120 days.
6.3 END WEALTH of the different strategies on 03/08/2009. Starting value of 100 on 19/01/2004.
6.4 END WEALTH including TRANSACTION COSTS of 1 percent of the total value traded. Starting value of 100 on 19/01/2004.
6.5 Average RANDOM PORTFOLIOS P-VALUES of the different strategies for the period from 19/01/2004 to 03/08/2009.
6.6 Average yearly SHARPE RATIO of the different strategies for the period from 19/01/2004 to 03/08/2009.
6.7 Result summary. Numbers represent the rank of strategies against each other.
6.8 Wealth evolution of VARIANCE based strategies from 19/01/2004 to 03/08/2009.
6.9 Wealth evolution of 4th MOMENT based strategies from 19/01/2004 to 03/08/2009.
6.10 Wealth evolution of MAXIMUM DRAWDOWN of RETURNS based strategies from 19/01/2004 to 03/08/2009.
6.11 Wealth evolution of MAXIMUM DRAWDOWN of VALUES based strategies from 19/01/2004 to 03/08/2009.
6.12 Wealth evolution of Average MAXIMUM DRAWDOWN of VALUES based strategies from 19/01/2004 to 03/08/2009.
6.13 Comparison of the strategies in terms of WEALTH at the end of the simulation from 19/01/2004 to 03/08/2009. All Herfindahl indexes mixed. The red horizontal line represents the median. The boundaries of the box represent the 25% and 75% quantiles. The whiskers represent the minimum and maximum values. Maximum values generally correspond to a high Herfindahl index (see table 6.3 for the values plotted here). This plot shows that strategies based on different measures of risk respond differently to diversification.
6.14 p-values analysis for 4th MOMENT based strategies. Results come from a random portfolios analysis with 1000 random portfolios of the same Herfindahl index as the analysed portfolio. Each box corresponds to the p-values of the 12 sub-periods for one level of the Herfindahl index. In each box, the red horizontal line represents the median p-value over the 12 periods. The boundaries of the box represent the 25% and 75% quantiles. The whiskers represent the minimum and maximum p-values. The map represents the p-values for each sub-period and sub-strategy (the 12 sub-strategies correspond to the 12 different levels of the Herfindahl index).
6.15 p-values analysis for VARIANCE based strategies. Same analysis and layout as 6.14.
6.16 p-values analysis for MAXIMUM DRAWDOWN of RETURNS based strategies. Same analysis and layout as 6.14.
6.17 p-values analysis for MAXIMUM DRAWDOWN of VALUES based strategies. Same analysis and layout as 6.14.
6.18 p-values analysis for AVERAGE MAXIMUM DRAWDOWN of VALUES based strategies. Same analysis and layout as 6.14.
6.19 Return of strategies based on MAXIMUM DRAWDOWN of RETURNS, against random portfolios with a p-value of 0.1. In period 11 the market decreases, and in period 12 it increases
6.20 Return of strategies based on 4th MOMENT, against random portfolios with a p-value of 0.1. In period 11 the market decreases, and in period 12 it increases
6.21 Return of strategies based on VARIANCE, against random portfolios with a p-value of 0.1. In period 11 the market decreases, and in period 12 it increases
6.22 Yearly SHARPE ratio analysis for VARIANCE based strategies. Computed for 12 sub-strategies and 12 sub-periods.
6.23 Yearly SHARPE ratio analysis for µ4 based strategies. Computed for 12 sub-strategies and 12 sub-periods.
6.24 Yearly SHARPE ratio analysis for MAXIMUM DRAWDOWN of RETURNS based strategies. Computed for 12 sub-strategies and 12 sub-periods.
6.25 Yearly SHARPE ratio analysis for MAXIMUM DRAWDOWN of VALUES based strategies. Computed for 12 sub-strategies and 12 sub-periods.
6.26 Yearly SHARPE ratio analysis for AVERAGE MAXIMUM DRAWDOWN of VALUES based strategies. Computed for 12 sub-strategies and 12 sub-periods.
6.27 Total transaction costs of the simulation for 4th MOMENT based strategies
6.28 Total transaction costs of the simulation for VARIANCE based strategies
6.29 Total transaction costs of the simulation for MAXIMUM DRAWDOWN of RETURNS based strategies
6.30 Total transaction costs of the simulation for MAXIMUM DRAWDOWN of VALUES based strategies
6.31 Total transaction costs of the simulation for AVERAGE MAXIMUM DRAWDOWN of VALUES based strategies
6.32 How many times a strategy was the best of a period in the simulation for VARIANCE based strategies
6.33 How many times a strategy was the best of a period in the simulation for 4th MOMENT based strategies
6.34 How many times a strategy was the best of a period in the simulation for MAXIMUM DRAWDOWN of RETURNS based strategies
6.35 How many times a strategy was the best of a period in the simulation for MAXIMUM DRAWDOWN of VALUES based strategies
6.36 How many times a strategy was the best of a period in the simulation for AVERAGE MAXIMUM DRAWDOWN of VALUES based strategies

Bibliography

[1] G. Daniel, D. Sornette, and P. Woehrmann. Look-ahead benchmark bias in portfolio performance evaluation. Journal of Portfolio Management, 36(1):121–130, Fall 2009.
[2] M.B. Abbes, Y. Boujelbène, and A. Bouri. Les profits des stratégies momentum: sous et/ou surréaction ou phénomène rationnel? Cas du marché français.
[3] P. Artzner, F. Delbaen, J.M. Eber, and D. Heath. Coherent measures of risk. Mathematical Finance, 9(3):203–228, 1999.
[4] G. Baquero, J. Horst, and M. Verbeek. Survival, look-ahead bias and the performance of hedge funds. J. Fin. Quant. Anal., 40:493–504, 2005.
[5] J.P. Bouchaud and M. Potters. More stylized facts of financial markets: leverage effect and downside correlations. Physica A: Statistical Mechanics and its Applications, 299(1-2):60–70, 2001.
[6] S.J. Brown, W.N. Goetzmann, and S.A. Ross. Survival. The Journal of Finance, 50(3):853–873, 1995.
[7] P. Burns. Performance measurement via random portfolios. Newsletter.
[8] P. Burns. Random Portfolios for Evaluating Trading Strategies. Burns Statistics, 2006.
[9] M.M. Carhart. On persistence in mutual fund performance. The Journal of Finance, 52(1):57–82, 1997.
[10] S.C. Chiam, K.C. Tan, and A.A. Mamun. A memetic model of evolutionary PSO for computational finance applications. Expert Systems With Applications, 36(2P2):3695–3711, 2009.
[11] E.F. Fama and K.R. French. Luck versus Skill in the Cross Section of Mutual Fund Alpha Estimates (http://ssrn.com/abstract=1356021).
[12] E.F. Fama and K.R. French. Common risk factors in the returns on stocks and bonds. Journal of Financial Economics, 33(1):3–56, 1993.
[13] M. Grinblatt and S. Titman. The persistence of mutual fund performance. The Journal of Finance, 47(5):1977–1984, 1992.
[14] J. Liu, X. Jin, and K.C. Tsui. Autonomy oriented computing. Kluwer Academic.
[15] Y. Malevergne, P. Santa-Clara, and D. Sornette. Professor Zipf goes to Wall Street. NBER Working Paper No. 15295, 2009 (http://ssrn.com/abstract=1458280).
[16] H. Markowitz. Portfolio selection. The Journal of Finance, 7(1):77–91, 1952.
[17] R. Roll. A critique of the asset pricing theory's tests Part I: On past and potential testability of the theory. Journal of Financial Economics, 4(2):129–176, 1977.
[18] W.F. Sharpe. Capital asset prices: A theory of market equilibrium under conditions of risk. The Journal of Finance, 19(3):425–442, 1964.
Chapter 8
Annexes

8.1 Maximum DrawDown - Matlab Code

8.2 Average Maximum DrawDown - Matlab Code

Master Thesis - Software Documentation
Thibaut Simon
02/12/2009

Contents

1 Introduction
2 Optimization tool
2.1 How to use it? First use.
2.2 Data
2.2.1 Input Data
2.2.2 Output Data
2.3 Options
2.3.1 Genetic Algorithm Options
2.3.2 Duration of training data and test data for the portfolio construction
2.3.3 Fitness function for the portfolio construction
3 Analysis tool
3.1 User Interface
3.1.1 Aspect
3.1.2 Possibilities
3.2 Data
3.2.1 Input Data
3.2.2 Output Data
3.3 What do you want to analyse?
4 To go further
4.1 What could be done to improve these tools?
4.2 Tips
4.3 FAQ
4.3.1 How to change fitness functions?
4.3.2 How to add a functionality to the analysis or optimization tool?
5 Annexes
5.1 Optimization tool figures
5.2 Analysis tool figures

Chapter 1
Introduction

This work is part of my thesis on "Portfolio construction: How to achieve better strategy through the use of Genetic Algorithm and non-linear objective function based on new measures of risks and dependencies?". During my study of this subject I needed to solve optimization problems to construct portfolios and then analyse their performance. That is why I built two tools, with user-friendly interfaces, to help me in my work. These tools were at first very basic, without a user interface, but they soon became quite complex as the number of functions grew. Hence I decided to create a user interface that allows people other than me to use them easily in future work. For this reason, I chose to do it in MATLAB.

I will start with the optimization tool, which produces the input data used by the analysis tool. Then I will explain how to use the analysis tool to exploit the results of the optimization tool. Finally, I will tell you what could be done to enhance the tools and how to do it.

Chapter 2
Optimization tool

The optimization tool creates portfolios from different objective functions. These objective functions correspond to allocation strategies. The optimization is done by a genetic algorithm from a Matlab library. You can choose the precision of the genetic algorithm by setting different options. You can also choose the length of the data used to create your portfolio and how long you keep your portfolio before building another one. The optimiser uses a file containing the trading dates and the adjusted closing prices of the stocks (see the Data section below). This tool has the advantage of creating portfolios from 12 different strategies over the length of data you want. You can then analyse these portfolios with the analysis tool. The data I use to present this tool are 28 stocks from the French index CAC40 over about 1500 days from 2003.
2.1 How to use it? First use.

You will need to: first, launch Matlab; second, set the current directory to the "optimization directory"; then type "optimization" in the command window (see figure 5.1). The window shown in figure 5.2 appears. To create your first portfolio, check the fitness you want to use and click on the button "launch computation". The program will run the optimization with the default values for the time window and the genetic algorithm. You can follow the progress of the process in the command window, where you will see the period and the fitness currently being calculated. The results of these optimizations are saved into the Matlab file "portfolios.mat". This is the file required by the analysis tool. To understand how to choose the different parameters, I invite you to read the following parts.

2.2 Data

The aim of this section is to enable you to load the data you want into the application. To be read by the program and produce the expected result, the input data must respect a given format. The paragraph on the output data explains what is stored in the "portfolios.mat" file.

2.2.1 Input Data

The input data come originally from .cls files found on Yahoo Finance (see figure 5.5). Such a Yahoo file contains many fields. For our purpose we only need to keep the adjusted close price and the date of each price (see figure 5.6). In the end, everything should be gathered in a single file that is imported into Matlab and saved as "data.mat". The file "data.mat" is a structure containing the numerical array "data" and the text array "textdata". These two arrays are necessary for the proper execution of the program. Figure 5.3 shows the numerical array "data" and figure 5.4 the text array "textdata". In this file, data should be sorted from the most recent to the oldest, as in the file provided by Yahoo Finance. Each column corresponds to a stock.
The first column corresponds to the date of each price, as text. The last two columns contain the reference index (here the CAC40) and the risk-free rate (here the 13-week treasury bills). Look at the screenshots in the annexes to get a better view of these files and to understand the procedure to follow in order to use your own data.

2.2.2 Output Data

The optimization tool produces the file "portfolios.mat", a Matlab data file containing the number of periods, the name of the fitness function, and the portfolios structure. The portfolios structure stores the portfolio weights, the training data of the period, and the test data used to evaluate the portfolio over the following days, as well as other data useful for analysing the portfolio, such as the reference index data and the risk-free rate data over the test period. See figure 5.11 for details. This file can then be read by the analysis tool to plot the figures of interest and compare them.

2.3 Options

2.3.1 Genetic Algorithm Options

This program uses the Genetic Algorithm of Matlab. In the interface you can change three options to make the optimisation more precise or quicker: the population size, the crossover rate and the stall generation limit. It is also possible to change other, more fundamental options by editing the code of opt_portfolio3.m and changing the gaoptimset settings. See the Matlab help for more details about the GA toolbox. Here is the list of options I used for the GA.

Population Options:
Population type: the individuals in the population are represented with the data type double.
Population size specifies how many individuals there are in each generation. With a large population size, the genetic algorithm searches the solution space more thoroughly, thereby reducing the chance that the algorithm will return a local minimum that is not a global minimum. Here you can choose between 100 and 1000 with the slider.
Initial population specifies an initial population for the genetic algorithm. It is set to random.

Selection function: the Stochastic uniform selection function lays out a line in which each parent corresponds to a section of the line of length proportional to its scaled value. The algorithm moves along the line in steps of equal size. At each step, the algorithm allocates a parent from the section it lands on. The first step is a uniform random number less than the step size.

Reproduction:
Crossover fraction specifies the fraction of the next generation, other than elite children, that is produced by crossover.
Elite count specifies the number of individuals that are guaranteed to survive to the next generation.

Mutation Options: the Gaussian mutation function adds a random number taken from a Gaussian distribution with mean 0 to each entry of the parent vector.

Crossover Options: the scattered crossover function creates a random binary vector, selects the genes where the vector is a 1 from the first parent and the genes where the vector is a 0 from the second parent, and combines the genes to form the child.

Stopping Criteria Options: the algorithm runs until the cumulative change in the fitness function value over Stall generations is less than or equal to Function Tolerance. Here the number of generations is limited to 200, and the stall generation limit can be chosen between 10 and 50 with the slider.

2.3.2 Duration of training data and test data for the portfolio construction

To study the influence of time, it is interesting to vary the amount of data used to create a portfolio and the time the portfolio is held. You can do this with the two sliders in the window parameters panel. If you increase the holding duration of your portfolio, you decrease the final number of periods, and hence the number of portfolios to calculate.
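The effect of the two durations on the number of periods can be illustrated with a short sketch. It is written in Python rather than Matlab and does not reproduce the tool's code: the function name rolling_periods, the chronological indexing (oldest day = 0) and the choice to slide the window forward by exactly one holding period are all assumptions made for illustration.

```python
def rolling_periods(n_days, train_len, hold_len):
    """Return (train_start, train_end, test_end) index triples, oldest day = 0.

    Training uses days [train_start, train_end); the portfolio is then
    held (tested) over days [train_end, test_end).
    Assumption: the window slides forward by one holding period each time.
    """
    periods = []
    start = 0
    while start + train_len + hold_len <= n_days:
        train_end = start + train_len
        periods.append((start, train_end, train_end + hold_len))
        start += hold_len  # move on to the next holding period
    return periods


# Hypothetical settings: 1500 trading days and a 250-day training window.
short_hold = rolling_periods(1500, 250, 25)
long_hold = rolling_periods(1500, 250, 50)
print(len(short_hold), len(long_hold))
```

Under these assumptions, doubling the holding duration roughly halves the number of periods, and therefore the number of portfolios the genetic algorithm has to compute for each strategy.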
2.3.3 Fitness function for the portfolio construction

The fitness function is chosen by checking its name in the user interface. If you need to create or study a new fitness function, you will have to go into the code a little. The fitness function is the mathematical expression of your risk mitigation strategy. To implement your own innovative strategy, you will need to create your own fitness function. The fitness function is then used by the genetic algorithm to evaluate each portfolio. Fitness functions are stored in the files fitness1.m, fitness2.m, ..., fitness12.m. See section 4.3.1 for more details.

Chapter 3 Analysis tool

The analysis tool has been developed to analyse the portfolios produced by the optimization tool. It gives you a deep understanding of the performance of each portfolio against the others. As you will see in this chapter, the interface lets you analyse the portfolios from different perspectives.

3.1 User Interface

The analysis tool uses a Matlab user interface that you can launch by entering analyse_ui in the command window, once you have set the correct current directory.

3.1.1 Aspect

Figure 5.12 shows a screenshot of the application. On the left, you choose what you want to observe; on the right, the graphics are plotted.

3.1.2 Possibilities

The choices available for analysis are:
• Specific Fitness function
• Particular Period
• Global evolution of portfolios
• Which strategy for which period?
• What about transaction cost?
• Efficient frontier evolution
• Synthesis

Clicking the button of your choice plots the corresponding graphics in the right panel. In the bottom left panel you will find a command panel where you can change the parameters of the graphics and the data plotted. I advise you to try the different possibilities, which have all been selected for their pertinence in portfolio analysis.
3.2 Data

3.2.1 Input Data

The optimization tool produces the file "portfolios.mat", a Matlab data file containing the number of periods, the name of the fitness function, and the portfolios structure. The portfolios structure stores the portfolio weights, the training data of the period, and the test data used to evaluate the portfolio over the following days, as well as other data useful for analysing the portfolio, such as the reference index data and the risk-free rate data over the test period. See figure 5.11 for details. This file is the input data used by the analysis tool. Note that the fitness index and the period index in the portfolios structure must run from 1 to n without gaps for the tool to work.

3.2.2 Output Data

There is no output data for the moment. Export to Excel and images of the graphics could be produced in further development.

3.3 What do you want to analyse?

Specific Fitness function
Study a specific fitness function. A fitness function corresponds to an investment strategy, and each strategy leads to a different portfolio. You can choose which function you want to analyse in the pop-up menu. Observe the evolution of the composition at each period. Look at the evolution in value of the strategy. You can also look at the return in each period to see when the portfolio performs best (see figure 5.12).

Particular Period
Study all the different investment strategies for a selected period. You can select the period you want to analyse in the pop-up menu. For each period you can compare the composition of the portfolios, the evolution of their value during the period and the returns over the period (see figure 5.13).

Global evolution of portfolios
Plot the evolution of the portfolios across all periods, which gives a global view of the performance. The starting value of each portfolio is 100 (see figure 5.14).

Which strategy for which period?
You would like to know which strategy performs best, or performs best most often.
Through the different plots, look at the distribution of the best strategies, or at how the best strategy is distributed across periods. Portfolios are ranked by rate of return over the periods. You can also check the composition of the best portfolios to see whether they are diversified or not (see figure 5.15).

What about transaction cost?
It is interesting to have a look at the transaction costs. They are calculated in absolute value and correspond to the change in value of each asset needed to obtain the new portfolio. Compare the cumulative costs of each strategy and see which periods incur costs (see figure 5.16).

Efficient frontier evolution
This section shows the evolution of the efficient frontier over time, to show how large the changes in the markets are. The efficient frontier is calculated from Markowitz portfolio theory with the help of a quadratic solver. The changes are often large, which casts some doubt on the pertinence of Markowitz theory (see figure 5.17).

Synthesis
Finally, you need some figures to see how diversified each portfolio is (with the Herfindahl index), what its end value is, what its end value is once transaction costs are taken into account, and what its overall returns are. Transaction costs depend on who you are, so you can choose the level of transaction costs to apply in the pop-up menu (see figure 5.18).

Chapter 4 To go further

This program has been developed in a limited amount of time, and a lot can be done to simplify its use. I will propose some ideas of what can be done to improve its usefulness and ease of use. The program was written to construct portfolios and analyse their performance and behaviour for my master thesis. However, as it proved really powerful for me, I decided to make it accessible and available for further use. This explains why, in places, it is not as functional as it should be or as I would like it to be.

4.1 What could be done to improve these tools?
Here are some ideas to improve the functionality of this application. I will try to give some hints on how to develop them.
• Compare strategies with different durations
• Plot the evolution of the market index
• Plot the Sharpe ratio
• Export graphics to Excel
• Improve the process of data collection, which is not automatic
• Add a system to modify the fitness function easily

4.2 Tips

Tip 1: The user interface has been built using MATLAB GUIDE, which makes it possible to create nice interfaces. To launch this tool, type "guide" in the command window and load the optimization file "optimization.fig".

Tip 2: The optimisation tool uses the Genetic Algorithm toolbox of Matlab. To get a wider view of the possibilities of GA in optimisation, look in the help of the GA toolbox.

Tip 3: The analysis tool calls 7 functions named analyse1.m, analyse2.m, .... Each of these functions corresponds to one choice in the analysis user interface. If you want to modify one of these functions, you will see that they are all built in the same way. First, extract the data of interest from the portfolios structure. Second, make the calculations needed. Finally, plot the variables of interest in the interface with the use of handles.

4.3 FAQ

4.3.1 How to change fitness functions?

The fitness function is the mathematical expression of your risk mitigation strategy. To implement your own innovative strategy, you will need to create your own fitness function. The fitness function is then used by the genetic algorithm to evaluate each portfolio. Fitness functions are stored in the files fitness1.m, fitness2.m, ..., fitness12.m. The easiest way to create your own fitness function is to modify an existing one. The input variables of a fitness function are the portfolio weights w and the returns data returnsdata. You can basically evaluate whatever you want without worrying about linearity and so on.
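As an illustration of a fitness function that takes w and returnsdata, here is a minimal sketch in Python rather than Matlab. It is not a transcription of any of the fitness1.m to fitness12.m files: the penalty form, the parameters target_h and penalty, the daily-rebalancing shortcut used to compute portfolio returns, and the oldest-first layout of returnsdata are all assumptions chosen for illustration. It combines a maximum drawdown with a target level of the Herfindahl index, in the spirit of the strategy named in the caption of figure 4.1.

```python
def normalize(w):
    """Rescale weights so they sum to one (assumes a positive total)."""
    total = sum(w)
    return [x / total for x in w]


def max_drawdown(returns):
    """Largest peak-to-trough loss of the cumulative value, as a fraction."""
    value, peak, worst = 1.0, 1.0, 0.0
    for r in returns:
        value *= 1.0 + r
        peak = max(peak, value)
        worst = max(worst, (peak - value) / peak)
    return worst


def herfindahl(w):
    """Sum of squared weights: 1/n if equally weighted, 1 if one asset."""
    return sum(x * x for x in w)


def fitness(w, returnsdata, target_h=0.1, penalty=10.0):
    """Value z to minimise: drawdown plus a penalty on concentration.

    returnsdata[t][i] is the return of asset i on day t (oldest first).
    target_h and penalty are illustrative values, not the tool's settings.
    """
    w = normalize(w)
    # Assumption: the portfolio return is the weighted sum of asset returns.
    portfolio = [sum(wi * ri for wi, ri in zip(w, day)) for day in returnsdata]
    z = max_drawdown(portfolio) + penalty * abs(herfindahl(w) - target_h)
    return z


# Two hypothetical assets over three days.
returnsdata = [[0.10, 0.00], [-0.20, 0.00], [0.05, 0.00]]
print(fitness([1.0, 1.0], returnsdata, target_h=0.5))
print(fitness([2.0, 0.0], returnsdata, target_h=0.5))
```

Because the genetic algorithm only needs a number to minimise, nothing here has to be linear or differentiable; the absolute-value penalty is one simple way to steer the search towards a chosen level of diversification.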
Each fitness function starts with the command normalize(w), which keeps the sum of the weights equal to one. Then add z = .., the quantity you want to minimize. The names of the fitness functions are stored in the code file optimization.m, in the variable fit_name_choice, which you can change to add your own name. See figure 4.1 for an example of a fitness function.

4.3.2 How to add a functionality to the analysis or optimization tool?

Both applications can be modified. For example, it is possible to add a functionality to the analysis tool. This requires modifying the interface file analysis-ui.fig and the m-file analysis-ui.m. In the interface file you need, for example, to add a button for your functionality. This new button then has to be programmed in the m-file: a callback function is used to execute an action following a click on the new button. I advise you to create a new m-file analysis8.m and to write your analysis function inside it, on the model of the other analysis1.m ... files. See tip 3 for more information.

Figure 4.1: Fitness function code in Matlab - minimization of the single action maxdrawdown for a specific level of the Herfindahl index

Chapter 5 Annexes

The figures referred to above are grouped in this chapter. The optimization tool screenshots show the process of importing new data for the tool. The figures of the analysis section show the different possibilities of analysis.

List of Figures
4.1 Fitness function code in Matlab - minimization of the single action maxdrawdown for a specific level of the Herfindahl index
5.1 Command window
5.2 Optimization interface
5.3 Numerical array "data" in Matlab
5.4 Text array "textdata" in Matlab
5.5 Data file downloaded in Yahoo Finance
5.6 Yahoo file treated by a VBA macro to keep only dates and prices
5.7 The file which should be imported to Matlab
5.8 Import window 1 in Matlab
5.9 Import window 2 in Matlab
5.10 Save window in Matlab to produce "data.mat" which is needed by the program
5.11 Details of data contained in the portfolios structure
5.12 Study of a specific fitness function
5.13 Comparison of the different strategies for a specific period
5.14 Overview of the evolution of the strategies
5.15 Which strategy performs the best?
5.16 Study of the transaction cost
5.17 Plot the evolution of efficient frontier
5.18 Synthesis of the performance of the different strategies

5.1 Optimization tool figures

Figure 5.1: Command window
Figure 5.2: Optimization interface
Figure 5.3: Numerical array "data" in Matlab
Figure 5.4: Text array "textdata" in Matlab
Figure 5.5: Data file downloaded in Yahoo Finance
Figure 5.6: Yahoo file treated by a VBA macro to keep only dates and prices
Figure 5.7: The file which should be imported to Matlab
Figure 5.8: Import window 1 in Matlab
Figure 5.9: Import window 2 in Matlab
Figure 5.10: Save window in Matlab to produce "data.mat" which is needed by the program
Figure 5.11: Details of data contained in the portfolios structure

5.2 Analysis tool figures

Figure 5.12: Study of a specific fitness function
Figure 5.13: Comparison of the different strategies for a specific period
Figure 5.14: Overview of the evolution of the strategies
Figure 5.15: Which strategy performs the best?
Figure 5.16: Study of the transaction cost
Figure 5.17: Plot the evolution of efficient frontier
Figure 5.18: Synthesis of the performance of the different strategies