The use of extreme value theory and time series analysis to estimate risk measures for extreme events

Sofia Rydell
Master's Thesis in Engineering Physics, 30 hp
Department of Physics, Umeå University, 2013-01-27

Department of Physics
Linnaeus väg 20
901 87 Umeå
Sweden
www.physics.umu.se

Sofia Rydell, [email protected]
Examiner: Markus Ådahl, [email protected], Department of Mathematics and Mathematical Statistics, Umeå University.
Supervisor: Magnus Lundin, [email protected], Svenska Handelsbanken AB

Sammanfattning

The purpose of this work is to use extreme value theory and time series analysis to find models for estimating the two risk measures value at risk and expected shortfall, which are measures of potential losses. The focus is on which time horizon of historical data is needed to obtain predictions that are consistent with the actual outcome of an asset or a portfolio of assets. The extreme value theory based methods used are the Hill estimator and the so-called peak over threshold method. The Hill estimator is also combined with a time series model; the model then used is an AR(1)-GARCH(1,1).

For the extreme value theory models, the choice of the threshold that separates the extreme observations in the tail from the observations belonging to the central part of the distribution is important. In this work a threshold of 10% of the total sample size is used, since this is a common choice. Some additional methods for choosing the threshold are also presented in this report. For each model, different lengths of historical data are used when predictions of the risk measures are made, for different assets.

The main result of this work is that the best model, and the most suitable time horizon of historical data for estimating the two risk measures, differ from dataset to dataset. However, the methods that combine extreme value theory and time series models are the most flexible of those examined, and they are the ones most likely to capture extreme events. The Hill method that includes a time series model and uses shorter time horizons is preferable when risk measures for indices are estimated, while the Hill estimator on its own, with a time horizon of three or four years, is preferable when risk measures for foreign exchange rates are estimated.

In this work only models for a single asset have been studied, but the models can just as easily be applied to the time series of a portfolio. A multivariate version of extreme value theory exists, but its complexity makes it disadvantageous to implement. If the univariate extreme value models on their own are considered insufficient to capture all the relations in a portfolio, they can be used as a complement to the simpler models commonly in use and thereby improve the risk analysis.

Abstract

In this thesis the main purpose is to use extreme value theory and time series analysis to find models for estimating the two risk measures for potential losses, value at risk and expected shortfall. Focus is on the time horizon needed to obtain predictions that are consistent with the actual outcome of an asset or a portfolio of assets. The extreme value based methods used are the Hill estimator and the peak over threshold method. The Hill estimator is also combined with a time series model; the time series model used is an AR(1)-GARCH(1,1).
For extreme value theory based models the choice of threshold between the observations belonging to the tail and the observations belonging to the center of the distribution is crucial. In this study the threshold is set to 10% of the sample size, by conventional choice. There are additional methods for choosing the threshold and some of them are presented in this paper. For each model, different lengths of historical data are used when predictions of the risk measures are made for different assets.

The main result is that the best model and the appropriate time horizon of historical data to use for estimating value at risk and expected shortfall differ from dataset to dataset. However, the methods that combine extreme value theory and time series models are the most flexible ones and those are the ones most likely to capture extreme events. The conditional Hill method with shorter time horizons seems preferable when estimating the risk measures for indices, while the Hill estimator with a time horizon of three or four years is preferable for foreign exchange rates.

In this study only models for single assets are evaluated, but the models could easily be applied to the time series of a portfolio. A multivariate version of extreme value theory exists, but its complexity makes it disadvantageous to implement. So if, for example, the univariate extreme value models alone are considered inadequate to capture all the relations in a portfolio, the models could be used as a complement to the commonly used model based solely on historical simulation and thereby improve the risk analysis.

Contents

1. Background and introduction
2. Theory
 2.1 Return series
 2.2 Risk measures
 2.3 Likelihood estimates
 2.4 Time series models
  2.4.1 Autoregressive model
  2.4.2 Generalized autoregressive conditional heteroskedasticity model
 2.5 Extreme Value Theory
  2.5.1 Defining the tail
  2.5.2 Peak over Threshold method
  2.5.3 Hill estimator
  2.5.4 Conditional Hill estimator
 2.6 Historical simulation
3. Data
4. Analysis of the dataset
 4.1 Time series plot and descriptive statistics
 4.2 Autocorrelation and Ljung Box test
 4.3 Histogram and QQ plots
5. Method
 5.1 Overview threshold sensitivity
 5.2 Threshold choice
 5.3 Peak over Threshold
 5.4 Hill estimator
 5.5 Conditional Hill estimator
 5.6 Backtesting
  5.6.1 Backtesting VAR
  5.6.2 Backtesting ES
 5.7 Higher significance levels
6. Results
 6.1 Overview threshold sensitivity
 6.2 Prediction plots
  6.2.1 Hill and GPD simulation
  6.2.2 Conditional Hill simulations
  6.2.3 Comparisons
 6.3 Backtesting results, 10% threshold
  6.3.1 OMXS30
  6.3.2 FTSE
  6.3.3 GSPC
  6.3.4 N225
  6.3.5 EUR
  6.3.6 GBP
  6.3.7 USD
 6.4 Backtesting results, various thresholds
 6.5 Results higher significance levels
7. Discussion and conclusions
8. References
Appendix A
Appendix B
Appendix C
Appendix D

1. Background and introduction

Due to the turbulence on the financial markets during the last decades it has become more important for banks and financial institutions to try to foresee and monitor different risks. Above all, it is incidents of large losses that need to be prevented and avoided. For banks and other financial institutions, reliable methods and credible measures for estimating risk are essential, and they are becoming more important as new regulations and new market conditions develop. Considering today's market, and with banks and other institutions as large as several of them are today, the consequences of inaccurate evaluation and control of risks can be devastating. The collapse of the banking system in Iceland and the downfall of Lehman Brothers in 2008, and what followed, are two examples.

A widely used risk measure for the potential loss of an asset or a portfolio is the so-called value at risk. The pros of this risk measure are that it is intuitive, easy to understand, and that there exist easily implemented methods for estimating it. On the other hand, the cons lie in its inability to capture the most extreme losses and in the fact that it is not a coherent risk measure. Another risk measure that is up and coming in the financial sector, and is likely to be included in a future Basel framework, is the so-called expected shortfall. Like value at risk it considers the potential loss and is fairly intuitive, but in addition expected shortfall is based on the entire set of the most extreme observations and it is a coherent risk measure. There exist numerous methods for estimating value at risk. Many of those methods can be developed and used to estimate expected shortfall as well.
Value at risk is included in this study since it’s universally accepted and used, and its pros in many ways outweighs its cons. Why expected shortfall is chosen is because its close relationship to value at risk and that therefore it is relatively easy to implement estimation methods, but mainly because The Basel Committee on Banking Supervision indicates that it’s going to be included in the forthcoming framework. A theory that considers these types of extreme events is the so called Extreme Value Theory. It aims to monitor and predict extreme events, like financial crisis, in an efficient way. Therefore an investigation of how methods based on the extreme value theory can be used to estimate risk measures is performed in this study. The extreme value theory methods that are included are the so called Hill estimator and the peak over threshold method. For both risk measures the two major questions investigated in this study are which method to use for estimations and what length of calibration data that is needed to obtain reliable estimates, with that method. The aim of this study is to use Extreme Value Theory, time series analysis and Monte Carlo simulations to obtain reliable models for predicting the two risk measures, value at risk and expected shortfall. A reliable model is considered to be a model that behaves similar to the statistically expected. Statistically expected means that if a model generates a VAR estimate for a significance level of 99% the statistically expected is that this estimate will be accurate in 99 cases out of 100. Hence, when evaluation of the model is performed the outcome should be roughly the same. Focus in this study is on investigating which time horizon of the historical data that is needed in the calibration to obtain stable and reliable models. The time horizon needed may differ between methods. Hence, one goal is to identify which of all models that is preferable, and give the most 1 accurate estimate of value at risk and expected shortfall, if possible. The other is to provide guidelines of what time horizon is needed for the different methods to obtain stable models. For meeting the objectives of this study MATLAB is used to build the models and a backtesting procedure is used to evaluate the models. In the backtesting procedure the models are used to predict value at risk and expected shortfall over a certain time for which historical data of the actual outcome is available. The predictions are compared with the actual outcome of the data and a backtesting statistic for the respective risk measures is obtained. The statistic for expected shortfall depends on the value at risk estimation and is an average of the difference between the predicted and actual outcome. Therefore, in this study a model is considered to be stable and reliable if the backtesting statistic for value at risk is within 10% from the statistically expected. To obtain a reliable model for estimating expected shortfall two conditions need to be fulfilled. First of all the above should hold for the value at risk estimate given by that model and second the backtesting statistic should not deviate more than a 1% from zero. For an ideal model backtesting statistic for expected shortfall would be zero. 2 2. Theory In this study the loss distribution for the returns will be considered. Hence, the losses are given by positive values. In this section the definition of the return series and the risk measures are first presented. 
After that, a description of the likelihood method for estimating parameters is presented, as well as the definitions of the time series models that will be used in this study. Finally, some basics of extreme value theory are outlined, as well as the EVT methods that will be used in this study.

2.1 Return series

For a given daily price process $P_t$ the common returns are given by the percentage change from day $t-1$ to day $t$, i.e.

$r_t = \dfrac{P_t - P_{t-1}}{P_{t-1}}$    (1)

From that, the loss returns are given by

$l_t = -r_t$    (2)

2.2 Risk measures

Value at risk (VAR) and expected shortfall (ES) are two types of risk measures of the potential loss of a financial asset or a portfolio of assets. VAR is a measure that describes the minimal loss of a portfolio which can occur for a given probability, during a specific time. The mathematical definition is

$\mathrm{VaR}_{\alpha} = \inf\{\, l \in \mathbb{R} : F_L(l) \ge \alpha \,\}$    (3)

where $\alpha \in (0,1)$ and $F_L$ is the loss distribution (Rocco, 2011). This means that $\mathrm{VaR}_{\alpha}$ is given by the smallest value $l$ such that the actual loss of a portfolio exceeds $l$ with probability at most $1-\alpha$. For the discrete case, let $l_{(1)} \le l_{(2)} \le \dots \le l_{(n)}$ be the ordered sample of the losses $l_1,\dots,l_n$; then the VAR for the probability $\alpha$ is given by

$\widehat{\mathrm{VaR}}_{\alpha} = l_{(\lceil n\alpha \rceil)}$    (4)

In the case where $n\alpha$ isn't an integer, different approaches can be used. For instance, either the nearest lower or the nearest higher order statistic can be used, or an interpolation between the two can be made to obtain a satisfying estimate.

An example is a portfolio with a total value of 1 SEK and a one-day $\mathrm{VaR}_{0.95}$ of 0.60 SEK. This means that from today to tomorrow the loss of the portfolio will exceed 0.60 SEK, i.e. 60% of its value, with a probability of at most 5%. Let the historical data used to estimate the VAR be 200 days. Using equation (4), where $n\alpha = 200 \cdot 0.95 = 190$, the one-day $\mathrm{VaR}_{0.95}$ is given by the 190th observation in the ordered sample.

Figure 1. The loss distribution and the VAR value for the probability $\alpha$.

Note that the loss can be greater than $\mathrm{VaR}_{\alpha}$: of the $(1-\alpha)$ worst outcomes, this is the least you lose. Therefore, the ES measure is an interesting complement. ES is the average loss of an asset or a portfolio of assets, for a certain probability, during a specific time. A simple description is that ES is the average of the worst outcomes. The definition is

$\mathrm{ES}_{\alpha} = E[\, L \mid L \ge \mathrm{VaR}_{\alpha} \,]$    (5)

where $\mathrm{VaR}_{\alpha}$ is given by (3). In the discrete case, where $l_{(1)} \le \dots \le l_{(n)}$ is the ordered sample of the losses, the ES becomes a sum, given by

$\widehat{\mathrm{ES}}_{\alpha} = \dfrac{1}{n - \lceil n\alpha \rceil + 1} \sum_{i=\lceil n\alpha \rceil}^{n} l_{(i)}$    (6)

where $n$ is the number of observations in the entire sample, and in the case where $n\alpha$ isn't an integer the reasoning is the same as for (4).

2.3 Likelihood estimates

The maximum likelihood method for estimating the parameters of a statistical model maximizes the likelihood function, or the log-likelihood function, for the data and the specified model. In this method the true probability density function can be unknown, but the joint density function for the data is assumed to come from a known family of distributions, for example the normal family. For an independent and identically distributed sample of size $n$ the joint density function looks like

$f(x_1,\dots,x_n \mid \theta) = \prod_{i=1}^{n} f(x_i \mid \theta)$    (7)

where $\theta$ denotes the parameters of the model and $x_1,\dots,x_n$ are the observed variables. Thus the observed variables are known, whereas the parameters given by $\theta$ are to be estimated. The likelihood function is then given by

$L(\theta \mid x_1,\dots,x_n) = \prod_{i=1}^{n} f(x_i \mid \theta)$    (8)

and the often used log-likelihood function is given by

$\ell(\theta) = \ln L(\theta \mid x_1,\dots,x_n) = \sum_{i=1}^{n} \ln f(x_i \mid \theta)$    (9)

The estimated parameters are then given by the set $\hat{\theta}$ which maximizes the likelihood function, equation (8) or (9).
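As a simple illustration of the procedure, the sketch below fits a normal distribution to a vector of loss returns x by numerically maximizing the log-likelihood (9). It is only a minimal example under that assumption, not part of the models used later in the thesis; normpdf assumes MATLAB's Statistics Toolbox and the variable names are illustrative.

% Minimal example: maximum likelihood fit of a normal distribution to a
% vector of loss returns x by numerically maximizing the log-likelihood (9).
% The scale is parameterized as exp(p(2)) to keep it positive.
negLogL = @(p) -sum(log(normpdf(x, p(1), exp(p(2)))));
p0   = [mean(x), log(std(x))];        % starting values from sample moments
pHat = fminsearch(negLogL, p0);       % minimize the negative log-likelihood
muHat    = pHat(1);
sigmaHat = exp(pHat(2));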
When working with real data the underlying distribution can be quite complicated and is often unknown. Then the pseudo-maximum likelihood method is used. The difference from maximum likelihood is that the estimate of the parameters, $\hat{\theta}$, is obtained by maximizing a function that is related to, but is not exactly, the true log-likelihood. For example, a normal likelihood may be maximized even though the sample distribution has fatter tails than the normal. For financial time series, such as the ones used in this study, the underlying distribution is commonly fat tailed, but maximum likelihood can still be used since it provides consistent estimators (McNeil and Frey (2000), Gouriéroux (1997)).

2.4 Time series models

2.4.1 Autoregressive model

Given a strictly stationary time series $(X_t)$, where $X_t$ is measurable with respect to the information set up to time $t$, denoted $\mathcal{F}_t$, and where $(z_t)$ is a white noise process from an unknown distribution with mean zero and unit variance, an autoregressive model of order $p$, AR($p$), is given by

$X_t = c + \sum_{i=1}^{p} \phi_i X_{t-i} + \varepsilon_t$    (10)

where $c$ is a constant, $\phi_1,\dots,\phi_p$ are the coefficients of the model and $\varepsilon_t$, hereafter denoted the innovations of the process, is a white noise process.

2.4.2 Generalized autoregressive conditional heteroskedasticity model

The volatility process of a time series can be simulated via a generalized autoregressive conditional heteroskedasticity model, GARCH($p$,$q$). The GARCH model takes both previous observations in the time series and previous volatilities into consideration when predicting the coming volatility. A GARCH($p$,$q$) process for the innovations $\varepsilon_t = \sigma_t z_t$ is defined by

$\sigma_t^2 = \omega + \sum_{i=1}^{q} \alpha_i \varepsilon_{t-i}^2 + \sum_{j=1}^{p} \beta_j \sigma_{t-j}^2$    (11)

where $\omega$ is a constant and $\alpha_i$ and $\beta_j$ are the coefficients of the model. To estimate the coefficients of the models above, equations (10)-(11), pseudo-maximum likelihood is used, see section 2.3.
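As a sketch of how such a model can be calibrated and used for one-step-ahead predictions in practice, the lines below fit an AR(1)-GARCH(1,1) to a vector of loss returns y. This assumes MATLAB's Econometrics Toolbox (arima, garch, estimate, infer); the variable names are illustrative and not taken from the thesis code, and a constant is included in the mean equation for generality.

% Sketch: fit an AR(1)-GARCH(1,1), i.e. (10)-(11) with p = q = 1, to a vector
% of loss returns y and form one-step-ahead conditional mean and volatility.
mdl    = arima('ARLags',1,'Variance',garch(1,1));  % AR(1) mean, GARCH(1,1) variance
estMdl = estimate(mdl, y);                         % (pseudo-)maximum likelihood fit
[res, sig2] = infer(estMdl, y);                    % residuals and conditional variances

c     = estMdl.Constant;           phi  = estMdl.AR{1};
omega = estMdl.Variance.Constant;  a1   = estMdl.Variance.ARCH{1};
b1    = estMdl.Variance.GARCH{1};
muNext   = c + phi*y(end);                         % conditional mean for day t+1
sig2Next = omega + a1*res(end)^2 + b1*sig2(end);   % conditional variance for day t+1
sigNext  = sqrt(sig2Next);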
2.5 Extreme Value Theory

The extreme value theory (EVT) is based on the distribution of the extreme returns, the ones located in the tail of the distribution of a return series. Hence this method can be used to estimate extreme events more accurately. In this study EVT is used to obtain estimates of VAR and ES. This section outlines some of the basics of EVT described in Alexander (2008), Coronado (2000), Kourouma et al. (2012), McNeil (1997), McNeil and Saladin (1997), McNeil and Frey (2000), Nyström and Skoglund (2002) and Rocco (2011).

According to the Fisher-Tippett theorem, a set of maxima, if normalized appropriately, converges in distribution to a non-degenerate limiting distribution. This limiting distribution belongs to either the Fréchet, Gumbel or Weibull family. This means that the tail, which consists of the maxima of a sample, for in principle every probability distribution converges to one of the following three families:

Fréchet: $\Phi_{\alpha}(x) = \exp(-x^{-\alpha})$ for $x > 0$ (and $0$ otherwise), with $\alpha > 0$    (12)

Gumbel: $\Lambda(x) = \exp(-e^{-x})$ for $x \in \mathbb{R}$    (13)

Weibull: $\Psi_{\alpha}(x) = \exp(-(-x)^{\alpha})$ for $x \le 0$ (and $1$ otherwise), with $\alpha > 0$    (14)

where $\alpha$ is the shape parameter and gives an indication of the fatness of the tail. Notice that these are the standardized distributions; for the non-standardized versions insert $(x-\mu)/\sigma$ instead of $x$ in the formulas. These three distributions, given by equations (12)-(14), are called extreme value distributions and can be combined into the generalized extreme value distribution (GEV), given by

$H_{\xi}(x) = \begin{cases} \exp\!\left(-\left(1+\xi \frac{x-\mu}{\sigma}\right)^{-1/\xi}\right), & \xi \ne 0 \\[4pt] \exp\!\left(-e^{-(x-\mu)/\sigma}\right), & \xi = 0 \end{cases}$    (15)

where $1 + \xi (x-\mu)/\sigma > 0$. For the standardized GEV, set $\mu = 0$ and $\sigma = 1$.

There are various ways to make use of EVT; there exist parametric and semi-parametric as well as non-parametric methods. The most frequently used parametric methods are the so-called block-maxima method and the peak over threshold method. For the non-parametric methods there are several estimators that can be applied; the Hill estimator is the one most commonly used. For all EVT methods it is the excesses over a certain threshold that are studied. The marginal (excess) distribution, $F_u$, is used to denote the distribution of the observations above the threshold $u$. It can be expressed by

$F_u(y) = P(L - u \le y \mid L > u) = \dfrac{F(u+y) - F(u)}{1 - F(u)}$    (16)

where $0 \le y \le l_F - u$, $F$ is the loss distribution and $l_F \le \infty$ is its right endpoint. According to the Gnedenko-Pickands-Balkema-de Haan theorem, if $F$ is in the maximum domain of attraction, MDA, of the extreme value distribution with shape parameter $\xi$, the distribution $F_u$ converges to the generalized Pareto distribution (GPD), denoted $G_{\xi,\beta}$. The convergence can be described by the following expression:

$\lim_{u \to l_F} \; \sup_{0 \le y \le l_F - u} \left| F_u(y) - G_{\xi,\beta(u)}(y) \right| = 0$    (17)

for some positive measurable function $\beta(u)$, i.e. for a sufficiently high threshold $u$ the distribution of the observations above this threshold may be approximated with a GPD. The class of distributions which are in the MDA of the extreme value distributions is large and all commonly used distributions are included. The GPD is given by

$G_{\xi,\beta}(y) = \begin{cases} 1 - \left(1 + \xi\, y/\beta\right)^{-1/\xi}, & \xi \ne 0 \\[4pt] 1 - e^{-y/\beta}, & \xi = 0 \end{cases}$    (18)

where the scale parameter $\beta > 0$; the support is $y \ge 0$ when the shape parameter $\xi \ge 0$ and $0 \le y \le -\beta/\xi$ when $\xi < 0$. Note that when $\xi = 0$ the GPD becomes the exponential distribution.

The main assumption in EVT is that the extreme observations are independent and identically distributed. This is not always fulfilled when working with real data. A way to improve the EVT models is to combine them with a model of the volatility and an autoregressive model of some sort for the returns, and to use EVT on the residuals obtained via the time series model. By combining EVT and time series models the assumption of i.i.d. observations is more likely to be fulfilled and hence more reliable estimates are obtained. For a time series of returns an AR model is a reasonable choice, and a GARCH model for the volatility is often used to capture the heteroskedasticity. A low-order AR model is used because the percentage changes generally fluctuate around zero. Other model combinations are also used in the literature; the choice of model varies from article to article.

2.5.1 Defining the tail

When applying EVT, the choice of threshold is critical and not obvious. The threshold is the cut-off between observations belonging to the center of the distribution and those belonging to the tail. If it is set too low, observations belonging to the center of the distribution are classified as extreme events, and if it is set too high, extreme observations are excluded. There exist several methods for choosing the threshold and some are listed below (Rocco (2011)).

The conventional choice

How the choice of the threshold is made is not specified by all authors. A generally accepted rule of thumb is that the tail should consist of 5-10% of the entire sample, and hence the threshold is set in that region. It should not be higher than 10-15%; 10% seems to be a frequently used limit (Rocco (2011), Nyström and Skoglund (2002), McNeil and Frey (2000)).

Graphical methods

Even though graphical methods are somewhat arbitrary, they can still be a helpful tool in the selection process and are quite easily implemented. One useful plot is the mean excess plot. The mean excess function is defined as

$e(u) = E[\, L - u \mid L > u \,]$    (19)

i.e. the mean of the excesses over the threshold $u$. Plot the set $\{(x_{(i)},\, e_n(x_{(i)}))\}$, that is, the actual observation $x_{(i)}$ used as threshold together with the sample mean excess for that threshold. If the tail distribution is a GPD the plot is approximately linear for the higher order statistics, since for a GPD with parameters $\xi$ and $\beta$ the mean excess function at a threshold $v > u$ becomes

$e(v) = \dfrac{\beta + \xi\,(v - u)}{1 - \xi}$    (20)

where $\xi < 1$. The slope of the line indicates the sign of the shape parameter $\xi$; hence a positive slope corresponds to the Fréchet family etc., see equations (12)-(14) (Embrechts, Klüppelberg and Mikosch (1997)).
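A minimal MATLAB sketch of the sample mean excess plot is given below, assuming only that x is a vector of loss returns; no toolbox functions are needed and the variable names are illustrative.

% Sketch: sample mean excess plot used as a graphical aid for the threshold
% choice, following equation (19). The candidate thresholds are the order
% statistics of the sample.
xs = sort(x);                         % ascending order
u  = xs(1:end-1);                     % candidate thresholds (skip the largest value)
e  = zeros(size(u));
for i = 1:numel(u)
    exceed = x(x > u(i));             % observations above threshold u(i)
    e(i)   = mean(exceed - u(i));     % sample mean excess at u(i)
end
plot(u, e, '.')
xlabel('Threshold u'); ylabel('Mean excess e(u)')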
If the Hill estimator is the EVT method used, Hill plots are another way to determine the threshold. The Hill plot is a plot of the estimated tail index, $\hat{\xi}_k$, as a function of the number of observations above the threshold, $k$. For further information about the tail index and how it is obtained, see section 2.5.3 below. Hence, for the Hill plot the set $\{(k, \hat{\xi}_k)\}$ is plotted, where $k$ represents the number of observations in the tail and $\hat{\xi}_k$ is the corresponding estimated tail index. For other non-parametric methods corresponding plots can be made. When interpreting the plot, approximately horizontal stretches indicate that for those values of the threshold the tail index estimate is essentially stable with respect to the choice of threshold (Rocco (2011)).

Monte Carlo simulation and minimizing mean squared error (MSE)

The optimal threshold of this method is given by the one that minimizes the MSE of an estimator. Monte Carlo simulations are used to determine for which tail size $k$, conditional on sample size and EVT distribution, the MSE of the tail index is smallest. This method is useful since the trade-off between bias and efficiency is optimized (Halie and Pozo (2006)).

Data driven algorithms

There are several data driven algorithms that are designed to give the optimal threshold for a given dataset. In an article by Lux (2000) an overview and comparison of five different methods are presented. Drees and Kaufmann (1998) present a method based on a stopping rule for the sequence of Hill estimators. Hall (1990) develops a subsample bootstrap algorithm and Danielsson and de Vries (1997) lay out an improvement of it. Danielsson, De Haan, Peng and De Vries (2001) present another subsample bootstrap algorithm based on subsamples of different magnitude. Beirlant, Vynckier and Teugels (1996a and 1996b) present two iterative regression approaches.

2.5.2 Peak over Threshold method

One of the most frequently used parametric methods is the peak over threshold method, POT. As described in section 2.5 the GPD can be used as an approximation of the distribution of the tail. In POT the probability distribution of the GPD, given by (18), is used to derive expressions for different risk measures. For a sample and a choice of threshold $u$, a GPD is fitted to the exceedances above $u$ via maximum likelihood, see section 2.3. Hence estimates of the shape parameter $\xi$ and the scale parameter $\beta$ are obtained.

Let $y = l - u$ be the excesses over the threshold, whose distribution $F_u$ is defined by (16). Using the fact that $F_u(y) \approx G_{\xi,\beta}(y)$ for a sufficiently high threshold, an expression for the underlying loss distribution is

$F(l) = \left(1 - F(u)\right) G_{\xi,\beta}(l - u) + F(u)$    (21)

when $l > u$, where $G_{\xi,\beta}$ is given by (18) and where $F(u)$ can empirically be approximated by $(n - N_u)/n$; here $N_u$ is the number of observations above the threshold and $n$ is the total sample size. Hence the underlying distribution can be estimated by

$\hat{F}(l) = 1 - \dfrac{N_u}{n}\left(1 + \hat{\xi}\,\dfrac{l - u}{\hat{\beta}}\right)^{-1/\hat{\xi}}$    (22)

Since (22) is the cumulative distribution function for the loss distribution, given that an observation is above the threshold, and since $\mathrm{VaR}_{\alpha}$ is defined as in equation (3), equation (22) can for the observation $l = \mathrm{VaR}_{\alpha}$ be written as

$\alpha = 1 - \dfrac{N_u}{n}\left(1 + \hat{\xi}\,\dfrac{\mathrm{VaR}_{\alpha} - u}{\hat{\beta}}\right)^{-1/\hat{\xi}}$    (23)

From this an estimate of $\mathrm{VaR}_{\alpha}$ is obtained by using the inverse of equation (23), i.e. by solving for $\mathrm{VaR}_{\alpha}$. The following expression for the estimate is then obtained:

$\widehat{\mathrm{VaR}}_{\alpha} = u + \dfrac{\hat{\beta}}{\hat{\xi}}\left[\left(\dfrac{n}{N_u}\,(1-\alpha)\right)^{-\hat{\xi}} - 1\right]$    (24)

where $\hat{\xi}$ and $\hat{\beta}$ are the estimated GPD parameters, $n$ is the total number of observations in the sample and $N_u$ is the number of excesses over the given threshold (Alexander (2008), McNeil (1999), McNeil and Saladin (1997)). From the definition of ES given by (5) the estimate of $\mathrm{ES}_{\alpha}$ can be derived:

$\widehat{\mathrm{ES}}_{\alpha} = \dfrac{\widehat{\mathrm{VaR}}_{\alpha}}{1 - \hat{\xi}} + \dfrac{\hat{\beta} - \hat{\xi}\,u}{1 - \hat{\xi}}$    (25)

where $\widehat{\mathrm{VaR}}_{\alpha}$ is given by (24), $\hat{\xi} < 1$, and $\hat{\xi}$ and $\hat{\beta}$ are the estimated GPD parameters (McNeil (1999)).

2.5.3 Hill estimator

Non-parametric methods for estimating the tail or the tail index make no assumptions about the underlying distribution in the tail.
The most commonly used was introduced by Hill (1975), who derived the estimator under the assumption of i.i.d. observations via maximum likelihood. The only restriction for the Hill estimator is that it is only valid for Fréchet type distributions, i.e. distributions with fat tails (Longin (2000)). To obtain the Hill estimate of the tail index, arrange the sample in descending order, $x_{(1)} \ge x_{(2)} \ge \dots \ge x_{(n)}$. The tail index can then be estimated by the Hill estimator:

$\hat{\xi} = \dfrac{1}{k} \sum_{i=1}^{k} \left( \ln x_{(i)} - \ln x_{(k+1)} \right)$    (26)

where $k$ is the number of exceedances over a given threshold $u$, and the threshold value is given by the $(k+1)$th order statistic, i.e. $u = x_{(k+1)}$. Depending on which tail you want to study, the indexing in the sum changes. The procedure presented above is the one used in this study (Halie and Pozo (2006)).

As mentioned above, one assumption when using the Hill estimator is that the data is heavy tailed, i.e. the distribution of the data corresponds to the Fréchet family. Therefore, as explained in section 2.5, the GPD can be used as an approximation of the true marginal distribution $F_u$. In the heavy-tailed case the tail itself behaves like a Pareto distribution, and an approximation of it is given by

$1 - F(x) \approx C\, x^{-1/\xi}$    (27)

for some constant $C$. So by using equation (21), but with this approximation of the tail and the Hill estimate $\hat{\xi}$, an estimate of the underlying distribution can be derived. The constant $C$ is found from assuming that $F(x_{(k+1)}) = 1 - k/n$. Hence, the following expression for the estimated distribution is obtained:

$\hat{F}(x) = 1 - \dfrac{k}{n}\left(\dfrac{x}{x_{(k+1)}}\right)^{-1/\hat{\xi}}$ for $x > x_{(k+1)}$    (28)

where $n$ is the sample size, $k$ is the number of observations above the threshold and $\hat{\xi}$ is the shape parameter estimated by equation (26) (Rocco (2011), Danielsson and de Vries (1997)).

From equation (28) an expression for $\mathrm{VaR}_{\alpha}$ can be derived in the same way as in section 2.5.2, since it is the $\alpha$-quantile of the estimated distribution. Hence, when using the Hill estimator and letting $x_{(k+1)}$ be the threshold observation, the VAR is estimated by

$\widehat{\mathrm{VaR}}_{\alpha} = x_{(k+1)} \left( \dfrac{n}{k}\,(1-\alpha) \right)^{-\hat{\xi}}$    (29)

(Rocco (2011), Christoffersen and Goncalves (2005)). The derivation of the ES estimate is done in a similar way as in section 2.5.2, starting from its definition given by (5). Thus, when using the Hill estimator the ES is estimated by

$\widehat{\mathrm{ES}}_{\alpha} = \dfrac{\widehat{\mathrm{VaR}}_{\alpha}}{1 - \hat{\xi}}$    (30)

where $\widehat{\mathrm{VaR}}_{\alpha}$ is given by equation (29), $\hat{\xi} < 1$, and $\hat{\xi}$ is estimated by equation (26) (Christoffersen and Goncalves (2005)).
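A minimal MATLAB sketch of the Hill estimator and the corresponding risk measure estimates, following (26), (29) and (30), is given below. It assumes that x is a vector of loss returns whose largest values are positive; the variable names are illustrative.

% Sketch: Hill estimate of the tail index and the corresponding VAR and ES,
% following equations (26), (29) and (30).
x = sort(x, 'descend');                      % ordered sample, x(1) the largest loss
n = numel(x);
k = floor(0.10*n);                           % tail size, conventional 10% choice
a = 0.99;                                    % significance level
xi  = mean(log(x(1:k)) - log(x(k+1)));       % Hill estimator (26), requires x(1:k+1) > 0
VaR = x(k+1) * ((n/k)*(1-a))^(-xi);          % quantile estimate (29)
ES  = VaR / (1 - xi);                        % expected shortfall (30), valid for xi < 1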
2.5.4 Conditional Hill estimator

Since EVT is based on the assumption that the data is i.i.d., a filtering of the data can be necessary. A procedure commonly used is to first fit a time series model to the data and by that obtain the standardized residuals, and thereafter, in a second step, apply EVT methods to the residuals (McNeil and Frey (2000), Diebold et al. (1998)). The time series models that are considered in this study are presented in section 2.4. Both the normal and the Student's t distribution are used as the distribution for the innovations.

In this study an AR(1)-GARCH(1,1) model is used to model the time series. For that model the assumption on the dynamics of the loss-return series $(l_t)$ is that it is a strictly stationary time series with conditional mean $\mu_t$, conditional standard deviation $\sigma_t$ and innovations $z_t$. The innovations have the marginal distribution $F_Z$, and $F_L$ is the marginal distribution of $(l_t)$. The predictive distribution for the returns over the next $h$ days, given the information set $\mathcal{F}_t$, is denoted $F_{L_{t+1}+\dots+L_{t+h} \mid \mathcal{F}_t}$. For time series models a calibration period is needed to obtain an estimate of the parameters of the model; let the calibration data be of length $m$. The residuals are defined as

$\varepsilon_t = l_t - \hat{l}_t$    (31)

where $l_t$ is the observed return and $\hat{l}_t$ is the predicted one. From the fitted model the conditional mean, $\mu_t$, and standard deviation, $\sigma_t$, are obtained from

$\mu_t = \hat{\phi}\, l_{t-1}$    (32)

$\sigma_t^2 = \hat{\omega} + \hat{\alpha}\, \varepsilon_{t-1}^2 + \hat{\beta}\, \sigma_{t-1}^2$    (33)

where $\varepsilon_t$ are the residuals, given by equation (31). And from the equations above the standardized residuals for the calibration period can be obtained:

$z_t = \dfrac{l_t - \mu_t}{\sigma_t}, \qquad t = 1,\dots,m$    (34)

which by assumption are i.i.d. and do not depend on $t$ (McNeil and Frey (2000)).

In the second step an EVT method is applied to the standardized residuals to obtain the desired estimates. In this study the EVT method used is the Hill estimator, for details see section 2.5.3. The Hill estimator for the residuals is given by equation (26), where the ordered sample is given by the ordered residuals, for a sample size of $m$. The unconditional VAR for the residuals is defined as in equation (3) and obtained via equation (29) when using the Hill estimator. The conditional VAR is defined as

$\mathrm{VaR}_{\alpha}^{t} = \inf\{\, l \in \mathbb{R} : F_{L_{t+1}+\dots+L_{t+h} \mid \mathcal{F}_t}(l) \ge \alpha \,\}$    (35)

i.e. the $\alpha$-quantile of the predictive distribution for the next $h$ days, given $\mathcal{F}_t$. Similarly, the unconditional expected shortfall is defined as in equation (5) and given by equation (30) for the Hill estimator. The conditional ES is defined as

$\mathrm{ES}_{\alpha}^{t} = E\!\left[\, L_{t+1}+\dots+L_{t+h} \mid L_{t+1}+\dots+L_{t+h} > \mathrm{VaR}_{\alpha}^{t},\; \mathcal{F}_t \,\right]$    (36)

i.e. the average of the exceedances over the VAR of the predictive distribution for the next $h$ days, given $\mathcal{F}_t$, the information set up to time $t$.

In this study one-day VAR predictions are of interest, therefore the one-step predictive distribution is used, $F_{L_{t+1} \mid \mathcal{F}_t}$, and the notation for the conditional predictions is simplified to $\mathrm{VaR}_{\alpha}^{t}$ and $\mathrm{ES}_{\alpha}^{t}$. After some rewriting the conditional VAR and ES can in the end be given by

$\mathrm{VaR}_{\alpha}^{t} = \mu_{t+1} + \sigma_{t+1}\, \mathrm{VaR}_{\alpha}(Z)$    (37)

$\mathrm{ES}_{\alpha}^{t} = \mu_{t+1} + \sigma_{t+1}\, \mathrm{ES}_{\alpha}(Z)$    (38)

where $\mathrm{VaR}_{\alpha}(Z)$ and $\mathrm{ES}_{\alpha}(Z)$ are the unconditional VAR and ES for the residuals (Rocco (2011), McNeil and Frey (2000)).

2.6 Historical simulation

One commonly used approach when estimating risk measures is what in this study is called historical simulation. It is included in this study because it is easy to implement and widely used, and it therefore works as a reference to compare the more advanced and complex EVT methods with. The principle is that historical data for a time series, or a collection of several time series of different assets, is used as the empirical distribution for the future returns. Value at risk and expected shortfall are estimated from that empirical distribution via equations (4) and (6). What the empirical distribution looks like depends on the length of the historical data used as well as on how the returns have behaved during that period. If a period of low volatility is used, the most extreme events are likely to consist of quite small changes in the returns, while the most extreme events most probably are quite large changes if a period containing high volatility is used.
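A minimal MATLAB sketch of the historical-simulation estimates, following the empirical quantile (4) and the tail average (6), is given below; x is assumed to be a vector of loss returns and the variable names are illustrative.

% Sketch: historical-simulation estimates of VAR and ES via the empirical
% distribution, equations (4) and (6).
x = sort(x);                 % ascending order of the losses
n = numel(x);
a = 0.99;                    % significance level
j = ceil(n*a);               % index of the empirical a-quantile
VaR = x(j);                  % equation (4)
ES  = mean(x(j:n));          % equation (6): average of the worst outcomes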
3. Data

In this study two different types of risk factors are included, foreign exchange rates and indices. The time series all have a length of 18 years, from 1994-01-03 to 2011-12-30. The main analysis and model evaluation is made with data from the OMX Stockholm 30 index (OMXS30), the index of the 30 most traded stocks, by volume, on the Stockholm stock exchange. Further evaluations are made to investigate whether different types of risk factors, or even just different risk factors, give different results. Here three more indices and three foreign exchange rates are used. The indices are the FTSE 100 index (FTSE), the index of the 100 companies on the London stock exchange with the highest capitalization, and the Nikkei 225 average (N225), the index of the 225 Japanese companies with the highest rating in the first section of the Tokyo stock exchange. Also included is the Standard and Poor's 500 (GSPC), an index of 500 stocks, representative of the major industries in the USA, listed on either NASDAQ or the New York stock exchange. The foreign exchange rates are all against the Swedish krona, and the three included in this study are the British pound (GBP), the United States dollar (USD) and the euro (EUR). For all risk factors the time series are the loss returns in percentage change, see section 2.1.

4. Analysis of the dataset

EVT is based on the assumption that the data used is independent and identically distributed, i.i.d. Before applying EVT, an analysis of how well the data meets this criterion is essential in order to be able to draw meaningful conclusions and interpret the results obtained. Most financial time series come from a distribution with fatter tails than the normal distribution, and hence their tails have a Fréchet type of distribution. This assumption should be verified since the Hill estimator is conditioned on it. In this study autoregressive time series models will be used, and for these the stationarity of the time series must be checked. The analysis is made on the entire length of each time series, even though shorter time intervals will be used, for example as calibration data.

4.1 Time series plot and descriptive statistics

To investigate the i.i.d. condition the time series is plotted to get a first rough indication. Heteroskedasticity violates the assumption of identically distributed observations, and an upward or downward trend in the return series implies a non-stationary time series. To see whether the data comes from a distribution with heavy tails and whether it is asymmetric, the kurtosis and skewness measures are used. For the definitions and equations used to estimate kurtosis and skewness, see Appendix A.

Figure 2. The time series of the loss returns for the OMXS30 index. Note that the data starts at 1994-01-03 even though the first date on the axis label is a later date.

The most extreme losses seem to be clustered, and it seems like the volatility also is clustered, which makes the assumption of an i.i.d. time series doubtful. However, the time series seems to be stationary overall, but with a slightly shifted mean, see Table 1 below. The time series plots of the other risk factors used are found in Appendix A, and all display similar tendencies. Since financial time series often look like this, an AR model for dealing with the shift in mean and a GARCH model for handling the clustering of the volatility are used in this study as tools to satisfy the conditions of EVT.

Table 1. Statistics for the loss distributions of the data used; all time series consist of approximately 18 years of daily observations.

           Mean          Std. deviation   Kurtosis   Skewness
OMXS30     -3.85·10^-4   0.0155            6.67      -0.205
FTSE       -1.68·10^-4   0.0113            9.29      -0.036
N225        4.48·10^-5   0.0151            9.23       0.071
GSPC*      -2.95·10^-4   0.0119           15.27      -0.213
GBP         2.09·10^-5   0.0060            6.26       0.075
USD         2.20·10^-5   0.0072            5.95       0.126
EUR        -3.01·10^-6   0.0043            6.15      -0.190

*The 123 most recent observations are excluded since they are all zero.

The standard deviation listed in the table above should be taken for what it is, an approximation over the entire 18-year sample. As can be seen in Figure 2, the standard deviation varies quite a lot throughout the period. The values of the kurtosis imply that every time series has fat tails. The skewness is close to zero for all assets, hence the distributions can be assumed to be evenly distributed around the mean, with no tail markedly longer than the other.
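The descriptive statistics in Table 1, and the Ljung-Box tests used in the next subsection, can be reproduced along the following lines. This is only a sketch, assuming MATLAB's kurtosis and skewness (Statistics Toolbox) and lbqtest (Econometrics Toolbox), with x a vector of loss returns.

% Sketch: descriptive statistics (Table 1) and Ljung-Box Q tests (Table 2)
% for a vector of loss returns x.
stats = [mean(x) std(x) kurtosis(x) skewness(x)];
[hAuto, pAuto] = lbqtest(x - mean(x));          % test for autocorrelation in the returns
[hHet,  pHet ] = lbqtest((x - mean(x)).^2);     % heteroskedasticity via the squared series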
4.2 Autocorrelation and Ljung Box test

The autocorrelation plot is used to investigate the assumption of independence a bit more thoroughly. It is also used to investigate whether heteroskedasticity occurs in the time series, by testing the squared returns. The Ljung-Box test is another tool to investigate the autocorrelation, and hence the assumption of independence and the presence of heteroskedasticity in the time series. For definitions of autocorrelation and the Ljung-Box Q test, see Appendix A. In this study a significance level of 95% is set, by conventional choice.

Figure 3. Autocorrelation plot for the OMXS30 time series (left) and for the squared observations (right).

The plot indicates autocorrelation, since the autocorrelations for several lags exceed the 95% confidence interval for no autocorrelation. If more than 5% of the lags exceed the confidence bound, it implies the presence of autocorrelation or heteroskedasticity. Autocorrelation plots for the other time series can be found in Appendix A. In the table below a summary of the Ljung-Box Q test is presented. A test outcome equal to one implies rejection of the null hypothesis of no autocorrelation. The significance level here is 95% as well.

Table 2. Summary of the Ljung-Box Q test.

          Test outcome   P-value
OMXS30    1              4.6672·10^-5
FTSE      1              9.7791·10^-12
N225      1              0
GSPC      1              0.0293
GBP       1              9.1342·10^-5
USD       1              1.2308·10^-5
EUR       1              0.0066

The Ljung-Box Q test rejected, for all the time series, the null hypothesis that no autocorrelation occurs. The test on the squared observations gives a rejection of the null with a p-value of zero for all the time series, i.e. heteroskedasticity occurs. As can be seen in Table 2, it would not make a difference if the significance level for the test were slightly higher or slightly lower; autocorrelation occurs. The results presented above motivate the choice of time series models in combination with EVT to improve the estimates.

4.3 Histogram and QQ plots

To see whether the data comes from a distribution with heavy tails, a histogram against the normal distribution is constructed, as well as the QQ plot and the mean excess plot.

Figure 4. OMXS30 histogram (with a fitted normal distribution) and QQ plot against the standard normal distribution.

The histogram for OMXS30 in the figure above seems to be more peaked around the mean than the normal distribution. Since several observations are located far out in the tails, it can be assumed that the loss distribution for OMXS30 has heavier tails than a normal. The QQ plot implies leptokurtic behavior and hence that the data comes from a non-normal distribution. This appearance is noted for the other time series as well, more or less distinctly; see the histograms and QQ plots in Appendix A.
The above two conclusions, non-normal and fat tails, suggest that Fréchet distribution can be a good approximation for the tail and it can be estimated by the Hill estimator. 15 5. Method In this section an overview of the method is presented. In the subsections 5.1 to 5.7 a more detailed description of the methods is found. Remember, as mentioned in the previous section, that the negative returns are used in this study. In the following sections “year” is often used as a unit to describe the data length used. One year is set to be 252 observations, since there are 252 banking days in one year on average. First a threshold sensitivity analysis is made using the OMXS30 data and the POT method to get an idea of how much the threshold choice affects the result. Some of the methods presented in section 2.5.1 are also investigated. After that a simulation study of different methods is performed. For different length of the calibration data and are estimated via historical simulation, simulation based on the Hill method, the conditional Hill method and the POT method. In this step the threshold is set to be 10% of the total sample size, see section 2.5.1 under the subtitle The conventional choice for more information. The data used in this step is not only OMXS30 but also data from the other assets presented under section 3. However, a more extensive study is made for OMXS30. Focus is to investigate which time horizon that gives stable models and which of the models that gives most reliable estimate. To investigate where the models don’t manage to predict the risk measures accurately and how fast they adjust to change in volatility the OMXS30 time series is used. In a first step the entire prediction periods for the different models are illustrated. After that, to see how the models behave for periods of high volatility, a period from 2000 to 2002 is used since this was a turbulent time in the financial world, for example it includes the IT crash, crisis in Asia and South America and Nine Eleven. The financial crisis of 2008 and some other shorter time periods is used to see how the models adjust to fast and large volatility increases. Predictions for VAR and ES are illustrated, how the curves behave as well as where VAR exceedances occur and how the difference between the predicted ES and the actual outcome looks like for these events. To further investigate the models and determine which of them that gives reliable estimate, and for which time horizons that occur, backtesting is used for all the time series included. To backtest the models predictions are made for all sample sizes of calibration data. Since the data size in this study has a range of 18 years the prediction period varies widely. When one year is used to calibrate the models predictions can be made over 17 years while when 17 years is used for calibration only one year can be predicted. In the last part of the study only the OMXS30 data is used. The threshold is set to 5% and 7 % of the total sample size. For the length of the calibration period of one to four years and are calculated. This is done in the same way as in the previous simulation study, via historical simulation, via simulation based on the Hill method, the POT method and the conditional Hill method. Since EVT should be efficient for high quantile a test is made where estimations of VAR and ES for significance level of and are generated and compared to a hypothetic data for a threshold of 10%. 
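To make the backtesting procedure described above concrete, a schematic MATLAB sketch of the rolling evaluation is given below. predictVaRES is a placeholder for any of the models in sections 5.3-5.5 and is not a function from the thesis; loss is the full loss-return series and m the length of the calibration window, so the details are illustrative assumptions only.

% Schematic sketch of the backtesting loop: calibrate on a rolling window of
% m days, predict one day ahead at the 99% level and compare with the actual
% loss. U counts the VAR exceedances and V is the average difference between
% the actual loss and the predicted ES on those days (see section 5.6).
m = 4*252;                                   % e.g. four years of calibration data
n = numel(loss);
exceed = 0; esDiff = [];
for t = m:(n-1)
    window    = loss(t-m+1:t);               % calibration data up to day t
    [VaR, ES] = predictVaRES(window, 0.99);  % placeholder one-day-ahead forecasts
    if loss(t+1) > VaR                       % VAR exceedance on day t+1
        exceed = exceed + 1;
        esDiff(end+1) = loss(t+1) - ES;      %#ok<AGROW>
    end
end
U = exceed;
V = mean(esDiff);
expectedU = 0.01*(n - m);                    % statistically expected number of exceedances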
16 5.1 Overview threshold sensitivity As mentioned in section 2.5.1 the choice of threshold is critical. Therefore an initial analysis of the sensitivity is made to get an idea of how influential the choice of threshold is on the results. As presented in the section 2.5 the generalized Pareto distribution (GPD) is a god approximation of the underlying distributions of the tail. To investigate the sensitivity four different thresholds are specified, one, five, ten and twenty percentage of the total sample size. For these four thresholds a GPD is fitted to the tail via pseudo-maximum likelihood. After that equation (24) and (25) is used to calculate VAR and ES for four different significance levels. As a reference historical VAR and ES are calculated via equations (4) and (6), where a MATLAB interpolation method is used if is not an integer. The above procedure is done for two different sample sizes to see if the smaller one still gives reliable estimates. The two sample sizes used are 18 years (full sample size) and 4 years, the data used is OMXS30. These two lengths of the historical data used to estimate the GPD parameters, the VAR and the ES, are arbitrarily chosen. The use of the full sample size generates the largest number of observations in the tails and could hence give an indication of the number of observation needed in the tail. A smaller sample size is also interesting to study to obtain a lower limit of the number of observations needed in the tail. By common sense a parameter fit to a very small tail sizes is not possible. Four years is chosen since it gives at least some observations in the tail when the tail size is one and five percentage of the total four year sample size. 5.2 Threshold choice An investigation of a few of the methods presented in section 2.5.1 is made to see if one or several could be useful. Conventional choice is a starting point, see section 2.5.1. Therefore an upper bound of 10% is set when the investigation of the convenient threshold choice based on the other methods is made. First the graphical method mentioned under 2.5.1 are used. Both Hill plots and mean excess plots are constructed. Then a Monte Carlo simulation and minimizing the mean squared error (MSE) is performed. The graphical methods depend on subjective interpretations and the Monte Carlo method has no connection to the real data. Therefore the conventional choice of 10% will be used overall in this study. Some other choices will be tested, based on the result from this investigation, to see if and how they affect the results. For algorithm and more detailed results see Appendix B. 5.3 Peak over Threshold After the threshold is chosen the POT method can be used to obtain VAR and ES estimates. At first the threshold is set to 10% of the total sample size. Hence, the assumption is made that the 10 % most extreme observations form the tail of the distribution and the rest of the observations belongs to the center of the distribution. For OMXS30 a sample of one, two, three, four, ten and 17 years is used as calibration data. For each calibration sample the following simulation procedure is used. 1. For the choice of tail size =10%, fit a GPD, given by (18), to the exceedances, y, above the threshold based on the observations in the tail, , via pseudo maximum likelihood. Hence, the estimates of the GPD parameters and is obtained. 2. Use the parameter estimated, the tail size and the threshold that follows from to estimate VAR and ES via equation (24) and (25) for a significance level of 99%. 
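A minimal MATLAB sketch of these two steps is given below, assuming the Statistics Toolbox function gpfit and a vector x of loss returns from the calibration sample; the variable names are illustrative.

% Sketch of steps 1-2: fit a GPD to the 10% largest losses and compute the
% 99% VAR and ES via equations (24) and (25).
x   = sort(x, 'descend');
n   = numel(x);
k   = floor(0.10*n);                  % number of observations in the tail
u   = x(k+1);                         % threshold
y   = x(1:k) - u;                     % exceedances over the threshold (assumed > 0)
par = gpfit(y);                       % ML fit of the GPD: par = [shape xi, scale beta]
xi  = par(1);  beta = par(2);
a   = 0.99;
VaR = u + (beta/xi) * ( ((n/k)*(1-a))^(-xi) - 1 );   % equation (24)
ES  = VaR/(1 - xi) + (beta - xi*u)/(1 - xi);         % equation (25), valid for xi < 1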
17 This procedure is used for the OMXS30 data, for its respective calibration sample. 5.4 Hill estimator As in section 5.3 the threshold is initially set to 10% of the total sample size. Different sample sizes are used to derive the shape parameter via Hill simulation. For OMXS30 a sample of one, two, three, four, ten and 17 years is used as calibration data. For the other risk factors a calibration sample size of one, two, three and four years are used for calibration of the model. For each calibration sample the following simulation procedure is used. 1. For the choice of threshold at tail size =10%, derive the shape parameter via equation (26). 2. Using the shape parameter and the threshold the VAR and ES estimates can be obtain from equations (29) and (30) for a significance level of 99%. This procedure is used for all risk factors, for their respective calibration samples. 5.5 Conditional Hill estimator The condition of i.i.d. observations is violated when for example autocorrelation and heteroskedasticity occurs in returns series, see figure 2 in section 4.1. This is often the problem with financial time series and can be solved by fitting a GARCH model to the data from which the standardized residuals can be obtained. These residuals are in general i.i.d., so the conditions of EVT are fulfilled. With this tactic the dependency in the data are hopefully no longer a problem and hence more accurate estimates can be obtain. In this study an AR(1)-GARCH(1,1) is used since it’s one of the simpler time series models that provide a reasonable fit to the data used in this study. An AR(1)-GARCH(1,1) model is given by (39) (40) For more details on the time series model see section 2.5.4 Conditional Hill estimator and 2.4 Time series models. Hence, based on equations (32)-(33) the conditional mean and conditional standard deviation predictions for is given by (41) (42) where is the estimated coefficient of the model, is the sample observation at time , and is the sample standard deviation at time . As for the Hill estimator, see section 4.3, different sample sizes are used to calibrate the model. For OMXS30 a sample of one, two, three, four, ten and 17 years is studied. For the other risk factors a sample size of one, two, three and four years is considered for calibration of the model. For each calibration sample the following procedure is used. 18 1. Fit the AR(1)-GARCH(1,1) model as described by equations (39)-(40) via pseudo maximum likelihood with normal innovations to the sample size in question. 2. Use the estimated model to calculate and by equations (41)-(42) and compute standardized residuals for the calibration period via equation (34). The standardized residuals should be i.i.d. in order for step 3 to be applicable. 3. For the residuals, use the Hill estimator to obtain the shape parameter via equation (26) 4. Calculate VAR and ES for the residuals via equations (29) and (30), for a threshold of 10 % and a significance level of 99%. 5. Use the residual VAR and the residual ES and the calculated and to obtain the estimate of the conditional VAR and conditional ES for the return series, which are given by equation (37) and (38). Remember that these are the 1-day predictions. This procedure is used for all risk factors, for their respective calibration periods. Since the distributions of the data used are considered to be heavy tailed the procedure is repeated with the modification that in step 1 Student´s t innovations are used when fitting the model. 
The degrees of freedom used in the Student´s t are three. That is used because a Student´s t distribution with three degrees of freedom is a heavy tailed distribution and that is the type of distribution considered in this study. In step 2 the residuals are obtained and to fulfill the conditions of EVT they should be i.i.d. To test independence autocorrelation plots and Ljung Box Q test is made, as presented in section 4.2, stationarity of the models is verified by checking that . The analyzes are done on four years of historical data, even though shorter time horizons will be used, this is the maximum length used for most of the time series when estimating the risk measures. The time horizon is from 2008 to 2011. The autocorrelation plots can be found in appendix C. As can be seen in the autocorrelation plots in appendix C and in table 3 and 4 below the assumption of no autocorrelations between the residuals holds, both when normal innovations and t innovations are used in the models for all the time series. However heteroskedasticity occurs when t innovations are used. But since a GARCH model is used to monitor the volatility and it adjusts to the prevailing volatility when predicting the future, hopefully reasonable estimates for VAR and ES can be obtained. Notice that the results in the table are for the specified time period. For other four-year time periods, or shorter time periods, the non-heteroskedasticity assumption can be verified. Table 3. The table shows the Summary of Ljung Box Q test on the residuals from an AR-GARCH fit with normal innovations and with t innovations. Null hypothesis is that no autocorrelation occurs. Residuals OMXS30 FTSE N225 GSPC GBP USD EUR . Test outcome Normal innovations 0 0 0 0 0 0 0 P-value for test statistic 0.5481 0.8123 0.7373 0.7373 0.5504 0.6685 0.3958 Test outcome T innovations 0 0 0 0 0 0 0 P-value for test statistic 0.3162 0.7836 0.7732 0.2473 0.5956 0.6447 0.3064 19 Table 4. The shows the summary of Ljung Box Q test on the squared residuals from an AR-GARCH fit with normal innovations and with t innovations. Null hypothesis is that no heteroskedasticity occurs. Squared residuals Test outcome Normal innovations P-value for test statistic OMXS30 FTSE N225 GSPC GBP USD EUR 0 0 0 0 0 0 0 0.03249 0.4442 0.06823 0.1142 0.4617 0.1930 0.9632 Test outcome T innovations 1 1 1 1 0 1 0 P-value for test statistic 0.001341 0.01403 8.260e-6 7.671e-10 0.3429 0.007740 0.5420 5.6 Backtesting To verify the models a backtesting procedure is used. To be able to backtest a model both calibration- and prediction data is needed. In this study the full sample size is 18 years, so when 1 year is used to calibrate the model predictions can be made over 17 years, when 2 years is used for calibration, 16 can be used for predictions and so on. The predictions of the model are then used to evaluate the reliability of it. For the risk measures considered in this study the backtesting estimates are given in the sections below. 5.6.1 Backtesting VAR To backtest the VAR-estimating models the predicted loss is compared to the actual loss observed on next day. When the actual loss is greater than the estimated VAR an exceedance has occurred. For VAR a statistically expected number of exceedances over the VAR curve is expected. For the loss will not exceed that value 99 times out of 100. Hence one exceedance for every 100 observations is statistically expected. These statistically expected numbers of exceedances are used as the reference. 
5.6 Backtesting
To verify the models a backtesting procedure is used. To be able to backtest a model, both calibration data and prediction data are needed. In this study the full sample size is 18 years, so when 1 year is used to calibrate the model, predictions can be made over 17 years; when 2 years are used for calibration, 16 years can be used for predictions, and so on. The predictions of the model are then used to evaluate its reliability. For the risk measures considered in this study the backtesting statistics are given in the sections below.

5.6.1 Backtesting VAR
To backtest the VAR-estimating models the predicted loss is compared to the actual loss observed the next day. When the actual loss is greater than the estimated VAR, an exceedance has occurred. For VAR a statistically expected number of exceedances over the VAR curve is expected: for VAR at the 99% level the loss will not exceed that value 99 times out of 100, hence one exceedance for every 100 observations is statistically expected. These statistically expected numbers of exceedances are used as the reference. A model is considered reliable if the number of exceedances in the backtest is within 10% of the statistically expected number. To obtain the backtesting statistic, calculate the number of occasions where the actual loss return L_t is greater than the predicted VAR:

U = Σ_t 1{ L_t > VAR_t }.   (43)

Desirably, U should be very close to the statistically expected number. Hence, in this study a model is considered reliable when the backtesting statistic is within the acceptable limits

0.9 E ≤ U ≤ 1.1 E,   (44)

where E is the statistically expected number of exceedances (McNeil (1999)).

5.6.2 Backtesting ES
To obtain the backtest statistic, calculate the average difference between the actual loss return L_t and the forecasted ES, conditional on having a loss return exceeding the corresponding VAR estimate:

V = (1/U) Σ_{t: L_t > VAR_t} ( L_t - ES_t ).   (45)

A positive value of the statistic indicates an underestimation of the risk and a negative value indicates an overestimation. For the ES estimations the value of V should be as close to zero as possible for the model to be as reliable and accurate as possible (Kourouma et al. (2011), Embrechts, Kaufmann and Patie (2005)). Since the actual loss and the estimated ES are both percentage changes, the backtesting results for ES will be presented in percentage units. For example, a result presented as 1% means that, on average, the model underestimates the ES loss by 1 percentage point. Worth noticing is that the backtesting statistic depends on the VAR estimate. This implies that the accuracy of the VAR estimation must be taken into consideration when evaluating the ES backtesting statistic.

5.7 Higher significance levels
To test how the EVT models perform for VAR and ES predictions at higher significance levels, an AR(1)-GARCH(1,1) model is used to generate a hypothetical distribution of the loss returns for the day the predictions are made for. For the AR-GARCH model description see section 5.5. The test is made on OMXS30 data. The lengths of historical data used to fit the AR-GARCH model and to obtain the parameters for the EVT models are the last one, two, three and four years of the sample, i.e. data from 2011, 2010-2011 and so on. The threshold used is 10% of the sample size, by conventional choice, see sections 2.5.1 and 5.2. The prediction of the loss distribution for the next day, in this case 2012-01-01, is obtained from equations (39)-(40) with the predicted conditional mean and standard deviation, where the innovation Z_{t+1} comes from a Student's t distribution with three degrees of freedom. The VAR for the residuals is obtained from the quantile of the distribution of Z_{t+1}; the ES for the residuals is obtained via Monte Carlo simulation and the definition of ES in equation (5). To get sufficiently many observations in the tail for the ES estimations, 1 000 000 innovations are generated, so a hypothetical distribution of 1 000 000 predictions is used to obtain the estimate. The different models for estimating the risk measures are presented in sections 5.4 and 5.5. For each model, one to four years of historical data are used respectively to obtain the estimates of VAR and ES for the day 2012-01-01 at significance levels of 99.9% and 99.97%. The results from the models are compared with the outcome from the hypothetical loss distribution, for which the "actual" VAR and ES are obtained as described above.
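As a concrete illustration of the backtesting statistics defined in sections 5.6.1 and 5.6.2, the sketch below computes U and V from a series of realized losses and the corresponding 1-day VAR and ES predictions. It is a minimal illustration under the stated definitions; the function names are hypothetical and the acceptance band follows equation (44).

import numpy as np

def backtest_var(losses, var_pred, p=0.99):
    """Exceedance count U (eq. 43) and the 10% acceptance band around the
    statistically expected number of exceedances (eq. 44)."""
    losses = np.asarray(losses)
    exceed = losses > np.asarray(var_pred)
    u = int(exceed.sum())
    expected = (1.0 - p) * len(losses)
    accepted = 0.9 * expected <= u <= 1.1 * expected
    return u, expected, accepted, exceed

def backtest_es(losses, var_pred, es_pred):
    """Average difference V (eq. 45) between realized losses and predicted ES,
    conditional on the loss exceeding the predicted VAR. A positive V means the
    ES risk is underestimated on average."""
    losses = np.asarray(losses)
    exceed = losses > np.asarray(var_pred)
    if not exceed.any():
        return np.nan
    return float(np.mean(losses[exceed] - np.asarray(es_pred)[exceed]))

# Example with placeholder predictions (constant VAR/ES, simulated losses):
rng = np.random.default_rng(2)
losses = rng.standard_t(df=3, size=4324) * 0.01
var_pred = np.full_like(losses, 0.04)
es_pred = np.full_like(losses, 0.05)
u, expected, ok, _ = backtest_var(losses, var_pred, p=0.99)
print(u, expected, ok, backtest_es(losses, var_pred, es_pred))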
6. Results
First, the results from the sensitivity analysis of the threshold choice are presented. After that, illustrations of the models are presented; prediction plots over the entire available time period as well as over shorter periods, such as 2008, are included. Then the results from the backtesting of the models for all the different time series are outlined. Section 6.3 contains the backtesting results for a threshold of 10%, while the results for varying thresholds are found in section 6.4. The last section contains the results when VAR and ES are estimated at higher significance levels than 99%.

6.1 Overview threshold sensitivity
In tables 5 and 6 the results of the threshold sensitivity study are presented. The VAR and ES predictions are all for the same day, i.e. the predictions are for the first day after the last day in the calibration data. The total sample size is given above each table; under Parameters, the number of observations in the tail and the threshold value are given for each choice of tail size. The GPD parameters are, as mentioned in section 5.1, fitted via maximum likelihood. Notice that they do not converge for all tail sizes. This means that the conditions for the parameters, given in section 2.5, equation (18), are not fulfilled for the estimated parameters, or that the parameters did not converge during the maximum likelihood procedure. An example is that the parameters converge to some estimate that exceeds a predefined tolerance level and therefore cannot be considered reliable. The main reason for this is the low number of observations in the tail used in the maximum likelihood estimation. Notice also that the number of observations beyond the VAR is very small for the higher significance levels, and hence the historical VAR and ES are then just the one most extreme observation in the sample.

Table 5. Entire sample used to estimate the VAR for 2012-01-01. The sample consists of OMXS30 data that starts 1994-01-03 and ends 2011-12-30. Sample size: 4577 days (18 years).

Parameters
Threshold (% of sample)   Observations in tail   Threshold value   GPD shape   GPD scale
1%*                       45.77                  0.0416            -0.0806     0.0105
5%                        228.85                 0.0247            -0.0538     0.0110
10%                       457.7                  0.0172            -0.0423     0.0111
20%                       915.4                  0.0104             0.0045     0.0102

VAR estimates for different significance levels
Threshold            95%      99%      99.9%    99.97%
1%*                  -        0.0414   0.0636   0.0737
5%                   0.0247   0.0417   0.0636   0.0740
10%                  0.0248   0.0416   0.0637   0.0745
20%                  0.0246   0.0412   0.0653   0.0779
Historical VAR(1)    0.0247   0.0416   0.0663   0.0735

ES estimates for different significance levels
Threshold            95%      99%      99.9%    99.97%
1%*                  -        0.0512   0.0717   0.0811
5%                   0.0351   0.0513   0.0721   0.0820
10%                  0.0351   0.0513   0.0725   0.0828
20%                  0.0349   0.0517   0.0758   0.0885
Historical ES(1)     0.0352   0.0514   0.0730   0.0817(2)

Observations beyond the historical VAR: 228.8 obs (5%), 45.77 obs (1%), 4.577 obs (0.1%), 1.373 obs (0.03%).
* ML estimates did not converge
(1) Calculated with "prctile" in MATLAB
(2) The one most extreme observation

Table 6. A 4-year sample is used to estimate the VAR for 2012-01-01. The sample consists of OMXS30 data that starts 2007-12-28 and ends 2011-12-30.
Sample size: 1008 days (4 years).

Parameters
Threshold (% of sample)   Observations in tail   Threshold value   GPD shape   GPD scale
1%*                       10.08                  0.0515            -0.6501     0.0147
5%*                       50.4                   0.0302            -0.2492     0.0155
10%                       100.8                  0.0214            -0.1178     0.0140
20%                       201.6                  0.0131            -0.0114     0.0122

VAR estimates for different significance levels
Threshold            95%      99%      99.9%      99.97%
1%*                  -        0.0513   0.0690     0.0717
5%*                  0.0300   0.0506   0.0687     0.0748
10%                  0.0306   0.0495   0.0710     0.0801
20%                  0.0299   0.0491   0.0759     0.0897
Historical VAR(1)    0.0302   0.0516   0.0703     0.0723(2)

ES estimates for different significance levels
Threshold            95%      99%      99.9%      99.97%
1%*                  -        0.0603   0.0710     0.0726
5%*                  0.0424   0.0589   0.0734     0.0783
10%                  0.0421   0.0590   0.0783     0.0864
20%                  0.0418   0.0607   0.0873     0.1009(2)
Historical ES(1)     0.0424   0.0598   0.0723(2)  -

Observations beyond the historical VAR: 50.4 obs (5%), 10.08 obs (1%), 1.008 obs (0.1%), 0.3024 obs (0.03%).
* ML estimates did not converge
(1) Calculated with "prctile" in MATLAB
(2) The one most extreme observation

In the tables above it can be seen that the different choices of threshold generate quite different results. Here only one value each of VAR and ES is estimated; to make a more thorough investigation, several estimates for each confidence level could be made. The major problem seems to be the number of observations in the tail. For both sample sizes the GPD parameter estimates do not converge for all choices of threshold. In Alexander (2008) it is concluded that for a 10% tail a sample must have at least 2000 observations to get a good estimate of the GPD parameters, i.e. there must be at least 200 observations in the tail. However, as can be seen in tables 5 and 6, this limit does not hold for the OMXS30 data. Thus, the threshold choice is neither obvious nor arbitrary.
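The GPD parameters in tables 5 and 6 come from a maximum likelihood fit of the generalized Pareto distribution to the excesses over the chosen threshold, after which VAR and ES follow from the standard POT formulas. The sketch below is a minimal illustration of this kind of calculation, assuming scipy's genpareto parameterization; the function name is hypothetical and the exact equations used in the thesis (section 2.5.2) may differ in form.

import numpy as np
from scipy.stats import genpareto

def pot_var_es(losses, p=0.99, tail_fraction=0.10):
    """Fit a GPD to the excesses over the threshold implied by tail_fraction
    and return (xi, beta, VAR_p, ES_p) from the usual POT formulas."""
    x = np.sort(np.asarray(losses))
    n = len(x)
    k = int(np.floor(tail_fraction * n))              # observations in the tail
    u = x[n - k - 1]                                   # threshold
    excesses = x[n - k:] - u
    xi, _, beta = genpareto.fit(excesses, floc=0)      # ML fit, location fixed at 0
    var = u + beta / xi * ((n / k * (1.0 - p)) ** (-xi) - 1.0)
    es = var / (1.0 - xi) + (beta - xi * u) / (1.0 - xi)   # valid for xi < 1
    return xi, beta, var, es

# Example on simulated heavy-tailed losses, for the tail sizes in table 5:
rng = np.random.default_rng(3)
losses = np.abs(rng.standard_t(df=3, size=4577)) * 0.01
for frac in (0.01, 0.05, 0.10, 0.20):
    print(frac, pot_var_es(losses, p=0.99, tail_fraction=frac))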
6.2 Prediction plots
In the figures below, predictions of VAR and ES based on calibration periods of different length are illustrated. For example, when one year of calibration data is used, the first prediction is for the first day of 1995, the second for the second day of 1995 based on the preceding one-year calibration period, and so on. The length of the calibration period is fixed and the period moves as the predictions are made. This is done to get an idea of how the different methods behave, how they adjust to changes in volatility and how they adjust over time. The threshold is set to 10%. Only data and results for OMXS30 are shown in the plots; the other time series give very similar appearances. For the conditional Hill estimator, the models based on normal innovations are used in the figures, since that model and the one based on t innovations have very similar appearances.

6.2.1 Hill and GPD simulation
[Figure 5: four panels with calibration periods of 1, 4, 10 and 17 years, showing the loss series (PF) together with VAR and ES predictions from historical simulation and the Hill estimator.]
Figure 5. The figure shows the Hill predictions of VAR and ES and the actual negative return series of OMXS30 for calibration periods of different length. The length of the calibration period for each window is found in the title of the window. The threshold for extreme observations is set to 10% of the sample.

[Figure 6: four panels with calibration periods of 1, 4, 10 and 17 years, showing the loss series (PF) together with VAR and ES predictions from historical simulation and the GPD approach.]
Figure 6. The figure shows the GPD predictions of VAR and ES and the actual negative return series of OMXS30 for calibration periods of different length. The length of the calibration period for each window is found in the title of the window. The threshold for extreme observations is set to 10% of the sample.

In figure 5 the predictions based on historical simulation are included as a comparison. It can be seen that the longer the calibration period, the more immovable the model gets. The same can be seen in figure 6, where the GPD-based predictions are illustrated; they follow the historical simulation more and more closely the longer the calibration period gets. A model with a calibration period of ten years or more is stable, but it greatly overestimates the risk as long as nothing happens. When, for example, the volatility increases it cannot adapt, and hence the actual return exceeds the predicted VAR and ES; see the plots in the bottom half of figure 6, around year 2008 (left) and August to October 2011 (right). In the top right plot the models also seem somewhat inflexible, except for the ES predictions of the Hill model. The one-year calibration period is the most flexible. In appendix D the graphs for two and three years of calibration data can be seen for both the Hill estimator and the GPD approach. The fact that flexibility decreases when the calibration period becomes longer is evident there as well. However, the shorter the calibration period the better is not entirely true: if it is too short the risk is constantly underestimated, see section 6.3. Since the GPD predictions follow the predictions based on historical simulation so closely, no further investigation is made of the GPD models. The GPD model is a parametric model which requires parameter estimation and model assumptions, which can be inaccurate. In the estimations based on one and two years of historical data the maximum likelihood estimates of the GPD parameters do not converge, just as for some of the threshold choices described in section 6.1. Therefore the plot of predictions based on one year of historical data is not reliable and should be ignored. Based on the above mentioned shortcomings of the GPD approach, the historical simulation is preferable of the two, since they in principle generate the same predictions. Historical simulation will be included in the further study while the GPD approach will be excluded.
6.2.2 Conditional Hill simulations
[Figure 7: four panels with calibration periods of 1, 4, 10 and 17 years, showing the loss series (PF) together with VAR and ES predictions from historical simulation and the conditional Hill estimator with normal innovations.]
Figure 7. The figure shows the predictions of VAR and ES by conditional Hill simulation and the actual negative return series, for calibration periods of different length. The length of the calibration period for each window is found in the title of the window. The threshold for extreme observations is set to 10% of the sample.

As can be seen in figure 7, the conditional Hill predictions are much more flexible than those based on historical simulation, and than those based on the Hill estimator alone. The length of the calibration period seems to have little influence on the predictions; all models adjust quickly to changes in volatility. To see how the different calibration periods perform in terms of over- or underestimation of risk, see section 6.3. Figures showing predictions based on two and three years of calibration data can be found in appendix D, as well as one figure containing VAR predictions from both conditional Hill methods and another containing the ES predictions. Those two are included to illustrate the similarities between them, and the fact that the difference is small.

6.2.3 Comparisons
For all models the predictions of VAR and ES are not always as good as desired. In figure 8 the exceedances over the predicted VAR are illustrated. The predictions are based on calibration periods of one and four years.

[Figure 8: two panels (calibration periods of one and four years) marking where the loss series exceeds the predicted VAR; total exceedances in the legend: one year - VAR hist (70), VAR hill (54), VAR cond Hill (46); four years - VAR hist (62), VAR hill (44), VAR cond Hill (32).]
Figure 8. The figure shows where the negative return series exceeds the predicted VAR, for the different methods. The symbols are the predicted VAR values at that time. The legend specifies which symbol belongs to which simulation method, and the numbers inside the parentheses are the total number of exceedances during the prediction period. Threshold is 10% of the sample.

[Figure 9: two panels (calibration periods of one and four years) marking VAR exceedances for the period December 2005 to July 2006.]
Figure 9. The figure shows the marks where the exceedances of VAR occur for the different models, as well as the actual negative return series, for a time period around May-June 2006. In the left figure a calibration period of one year is used, in the right figure four years. Threshold is 10% of the sample.

As can be seen in both the lower and the upper plot of figure 8, the exceedances often occur at the same time.
However, the total number of exceedances differs, and the main reason is that the models react to volatility changes more or less quickly. In figures 9 and 10 below this is clear. The conditional Hill models have their exceedances more evenly distributed over the periods than the historical and Hill models. As can be seen in figure 9, all models fail to capture the great increase in volatility around May 2006. When longer calibration data is used the exceedances are slightly fewer and not as large. However, neither is able to capture the first big jump, and even the conditional Hill model is not able to adjust to the drastic change in volatility, so several exceedances occur close to one another.

[Figure 10: two panels (calibration periods of one and four years) marking VAR exceedances for the period June to December 2008.]
Figure 10. The figure shows the marks where the exceedances of VAR occur for the different models, as well as the actual negative return series, for the time period around the crisis of 2008. In the left figure a calibration period of one year is used, in the right figure four years. Threshold is 10% of the sample.

In figure 10 the period around the 2008 financial crisis and some other turbulence is illustrated. The conditional Hill model is able to adjust to the sudden volatility increase and hence avoids many exceedances. Both the historical and the Hill models are too rigid, and hence several exceedances follow one another when the models are not able to adapt to the change in volatility. In figures 11 and 12 below a section around the financial crisis of 2008 is shown; one year of calibration data is used. It is obvious that the conditional Hill estimator is the best when it comes to adjusting to emerging volatility increases. The other methods are, as mentioned earlier, stiffer and hence fail to capture fast and large changes.

[Figure 11: loss series with VAR and ES predictions from historical simulation, the Hill estimator and the conditional Hill estimator (normal innovations), June to December 2008, one-year calibration period.]
Figure 11. The figure shows the predictions of VAR and ES as well as the actual negative return series, for the time period around the crisis of 2008. Threshold is 10% of the sample.

[Figure 12: ES predictions and marked VAR exceedances for historical simulation, the Hill estimator and the conditional Hill estimator (normal innovations), June 2008 to June 2009, one-year calibration period.]
Figure 12. The figure shows the predictions of ES as well as marks where the actual negative return series exceeds the predicted VAR. The period is from June 2008 to June 2009, to show both where the exceedances occur and the flexibility of the different models.

As shown in figure 12, the number of VAR exceedances for the conditional Hill model is smaller than for the others. However, it can be seen that the conditional Hill estimator for the most part generates higher predictions of the ES risk during the period of high volatility. Since ES should be the average of the observations exceeding VAR, the Hill estimator or even the historical simulation seems to be the more accurate model in that respect.
Notice, however, that the conditional Hill predictions are more flexible and adjust rather quickly to both increases and decreases in volatility. The same plots are made for the turbulent time from the middle of 1999 to 2002, including the Asian crisis and the burst of the IT bubble among other events. In figure 13 below the flexibility of the conditional Hill model versus the stiffness of the others is visible.

[Figure 13: loss series with VAR and ES predictions from historical simulation, the Hill estimator and the conditional Hill estimator (normal innovations), June 1999 to June 2002, one-year calibration period.]
Figure 13. The figure shows the predictions of VAR and ES as well as the actual negative return series, for the time period around the IT crisis and September 11. Threshold is 10% of the sample.

[Figure 14: marked VAR exceedances for historical simulation, the Hill estimator and the conditional Hill estimator, June 1999 to June 2002, one-year calibration period.]
Figure 14. Shows the marks where the exceedances occurred for the different models. Threshold is 10% of the sample.

[Figure 15: ES predictions and marked VAR exceedances for historical simulation, the Hill estimator and the conditional Hill estimators, June 1999 to June 2002, one-year calibration period.]
Figure 15. The figure shows the predictions of ES as well as marks where the actual negative return series exceeds the predicted VAR.

In figures 14 and 15 the conditional Hill method gives the fewest exceedances over the estimated VAR, and the ES predictions seem more accurate than those shown in figure 12 above. However, no method can foresee a single large price change; an example is seen between August and November 2001 in figures 13 and 14. Both the historical simulation, and hence the GPD, and the Hill estimator seem to remember history for a longer time. When the volatility starts to increase the memory is still of a calm period, and therefore the models do not react quickly. On the other hand, when the volatility decreases after a more turbulent period, the memory is still of a turbulent period, and that is reflected in the stiffness of the predictions, which do not adjust to the now prevailing calm period.

6.3 Backtesting results, 10% threshold
In this section the backtesting results for each of the time series are presented. For the different simulation methods (historical, Hill, etc.) different lengths of calibration period have been used. For each period a comparison is made between the statistically expected number of exceedances and the number of exceedances from the simulations. The simulation method that is closest to the statistically expected is considered the best method for that length of calibration period. In each simulation the threshold for the EVT-based methods is set to 10% of the sample used, i.e. when one year (252 days) is used as calibration period the tail consists of 25.2 observations. For the conditional Hill approach one model is based on the assumption of normal innovations (Conditional Hill N in the tables) and the other on the assumption of Student's t innovations with three degrees of freedom (Conditional Hill T in the tables). For the backtesting of VAR, the number of exceedances for each model is presented, with the difference from the statistically expected number given in parentheses, in percent. This difference needs to be 10% or less in magnitude for the model to be considered acceptable.
A positive value indicates more exceedances than expected, and hence that the model underestimates the VAR risk; a negative value indicates the opposite. For the backtesting of ES the average difference is presented, in percent. It needs to be within 1% of zero for the model to be acceptable, conditional on the acceptance of the VAR estimate. A positive value of the statistic indicates an underestimation of the ES risk and a negative value indicates an overestimation. The outcomes that fall within the acceptable limits are highlighted in bold. For ES, the outcome closest to zero for each calibration period is also marked, even though the VAR estimate may be unacceptable.

Summary
In tables 7 and 8 a short summary of the backtesting results is presented. For all the time series the historical simulation gave the ES statistic closest to zero, but it always indicated an underestimation of the risk (the statistic is positive) and the VAR estimate never fell within the acceptable limits. Hence, one of the EVT methods is always preferable, but which one differs between the time series.

Table 7. A summary of the best VAR prediction model for the different time series. The difference from the statistically expected number of exceedances is given in percent; a positive value indicates more exceedances than expected and a negative value indicates the opposite.

Time series   Best VAR model within        Difference from statistically   Calibration period
              limits for each method       expected (%)                    (in years)
OMXS30        Hill                         8%                              2
              Conditional Hill N           6%                              1
              Conditional Hill T           2%                              1
FTSE          Conditional Hill N           -3%                             1
              Conditional Hill T           -2%                             2
GSPC          Hill                         7%                              1
N225          Hill                         1%                              4
              Conditional Hill N           -3%                             1
              Conditional Hill T           -5%                             1
EUR           Hill                         -2%                             4
GBP           Hill                         1%                              4
              Conditional Hill N           -5%                             1
              Conditional Hill T           -8%                             3 and 4
USD           Hill                         -1%                             3
              Conditional Hill N           1%                              2
              Conditional Hill T           1%                              3 (+) and 4 (-)

It is clear that the preferable model differs among the time series, but common to the best models is that they are based on extreme value theory. The historical simulation approach never generates a model that falls within the limits of acceptance for a reliable model, i.e. the number of exceedances is never within 10% of the statistically expected. In sections 6.3.1 to 6.3.7 below it can be seen that the models that fall within the limits of acceptance vary from time series to time series. The number of models that are considered reliable also differs considerably. Notice that for some time series not all methods generate models that fall within the limits of acceptability; for more details see sections 6.3.1 to 6.3.7. However, the Hill estimator and the conditional Hill with t innovations are the two most prominent. Worth mentioning is that the best models highlighted in the table above are closely followed by others, as can be seen in table 7 as well as in the following sections. The length of calibration data needed to obtain the most accurate model varies, but generally longer periods are needed for the Hill estimator and shorter for the conditional Hill.

Table 8. A summary of the best ES prediction model for the different time series. The average difference is the difference between the estimated ES and the actual outcome for those days where the actual loss exceeded the estimated VAR. A negative value indicates an overestimation of the risk, i.e. the estimated ES is larger than the actual outcome, on average.
Time series   Best ES model within         Average difference (%)   Calibration period
              limits for each method                                (in years)
OMXS30        Conditional Hill N           -0.53%                   1
              Conditional Hill T           -0.84%                   1
FTSE          Conditional Hill N           -0.71%                   1
              Conditional Hill T           -0.86%                   1
GSPC          Hill                         -0.54%                   1
N225          Conditional Hill N           -0.70%                   1
              Conditional Hill T           -0.82%                   1
EUR           Hill                         -0.26%                   3
GBP           Hill                         -0.50%                   4
              Conditional Hill N           -0.33%                   1
              Conditional Hill T           -0.37%                   4
USD           Hill                         -0.39%                   3
              Conditional Hill N           -0.56%                   4
              Conditional Hill T           -0.52%                   2

The conditional Hill methods are on average superior. For the majority of the time series tested they generated reliable models, and of the two, the one with normal innovations is preferable. However, for both GSPC and EUR the Hill estimator is the only method that generates models that meet the requirements for a reliable model. On average, one year of calibration data seems to provide the most reliable models, but no clear conclusion can be drawn from the summary table; for some time series longer time horizons seem desirable. From the following sections as well as tables 7 and 8, a general rule of thumb can be formulated: extreme value theory based models for indices need shorter calibration periods to generate reliable models for VAR and ES estimation, while models for foreign exchange rates need longer periods. However, exceptions exist.

6.3.1 OMXS30
Table 9. Simulation results for OMXS30, showing how many times the actual percentage change was greater than the VAR prediction. Since VAR at the 99% level is studied, the expected number of exceedances is given by 1% of the total prediction period. The ES statistic measures the average difference between the estimated ES and the observations exceeding VAR.

OMXS30
Calibration period                       1 year     2 years    3 years    4 years    10 years   17 years
Length prediction period (in days)       4324       4072       3820       3568       2056       292
Expected number of exceedances (days)    43.24      40.72      38.20      35.68      20.56      2.92
Exceedances over estimated VAR, U:
  Historical                             70 (62%)   60 (47%)   64 (68%)   62 (74%)   23 (12%)   5 (71%)
  Hill                                   54 (25%)   44 (8%)    43 (13%)   44 (23%)   16 (-22%)  4 (37%)
  Conditional Hill N                     46 (6%)    36 (-12%)  35 (-8%)   32 (-10%)  17 (-17%)  3 (3%)
  Conditional Hill T                     44 (2%)    34 (-17%)  35 (-8%)   24 (-33%)  14 (-32%)  2 (-32%)
  GPD estimate                           76 (76%)   64 (57%)   63 (65%)   64 (79%)   23 (12%)   5 (71%)
Statistic for ES, V:
  Historical                             0.11%      0.29%      0.08%      0.04%      0.10%      0.06%
  Hill                                   -1.00%     -1.25%     -1.55%     -2.00%     -2.34%     -2.70%
  Conditional Hill N                     -0.53%     -0.81%     -0.73%     -1.05%     -0.87%     -1.15%
  Conditional Hill T                     -0.84%     -0.97%     -1.10%     -0.96%     -1.19%     -0.95%
  GPD estimate                           1.30%*     0.23%*     0.14%      0.06%      0.06%      0.06%
*ML estimates did not converge

Remember that a positive value of the ES statistic, V, indicates an underestimation of the risk and a negative value indicates an overestimation. Notice that the GPD parameters do not converge for all tail sizes. As mentioned in section 6.1, this means that the conditions for the parameters, given in section 2.5, equation (18), are not fulfilled for the estimated parameters, or that the parameters did not converge during the maximum likelihood procedure. Because of this, the GPD models that use one and two years of historical data to estimate VAR and ES are unreliable, and so are the results from those models. The calibration period of 17 years can only be backtested over one year, hence the low number of statistically expected exceedances. Because of this low number, the limits for a reliable model end up at roughly 2.6 to 3.2 exceedances, so unless the number of exceedances is exactly three the model is rejected as unreliable.
To properly investigate this length of calibration data a longer backtesting period would be needed. It is because of this that the best conditional Hill model with normal innovations in table 9 is considered to be the one based on one year of calibration data. For ES the conditional Hill methods perform best for almost every length of calibration period. Since the backtesting statistic for ES depends on the estimation of VAR, only the conditional Hill methods fulfil the conditions for a reliable model. As can be seen in table 9, only one and three years of calibration data generate models that fall within the acceptable limits; of those, the calibration period of one year with the conditional Hill model with normal innovations gives the best predictions. If the restriction that an accurate estimate of VAR is needed to obtain a good ES model is ignored, several more models are within the 1% limit. Historical simulation always gives an underestimation of the risk, since the statistic is positive for all calibration periods, but it is almost always the one closest to zero; the exception is the GPD method when ten years of historical data is used to calibrate the model. The Hill estimator never generates an acceptable model for OMXS30. For the conditional Hill methods the length of calibration data affects the reliability; overall, shorter time horizons seem preferable to longer ones.

Considering the VAR models, the conditional Hill with t innovations based on one year of calibration data gives the number of exceedances closest to the statistically expected for the OMXS30 loss series. However, several other models fall within the limits for a reliable model. The conditional Hill methods are preferable for most calibration periods; the conditional Hill with normal innovations seems to be the most accurate for longer calibration periods. For a calibration period of two years the Hill estimate seems to be slightly better, but the difference is small. Even though the model that performs closest to the statistically expected is the conditional Hill model with t innovations, the one with normal innovations generates models within the limits of acceptability for almost all lengths of calibration data. Therefore, if a model that is robust regardless of the length of the calibration period is wanted, that one is preferable. For calibration periods of two years or longer the conditional Hill methods overestimate the risk while historical, Hill and GPD underestimate it. The conditional Hill with t innovations gives fewer and fewer exceedances over the estimated VAR, compared to the one with normal innovations, the longer the calibration period used. To summarize, one of the conditional Hill methods should be used to obtain reasonable estimates of VAR and ES, preferably with one year of historical data as the calibration period.

6.3.2 FTSE
Table 10. Simulation results and predictions based on the time series of FTSE.
FTSE
Calibration period                       1 year     2 years    3 years    4 years
Length prediction period (in days)       4324       4072       3820       3568
Expected number of exceedances (days)    43.24      40.72      38.20      35.68
Exceedances over estimated VAR, U:
  Historical                             71 (64%)   71 (74%)   71 (86%)   67 (88%)
  Hill                                   52 (20%)   60 (47%)   60 (57%)   56 (57%)
  Conditional Hill N                     42 (-3%)   34 (-17%)  33 (-14%)  30 (-16%)
  Conditional Hill T                     38 (-12%)  40 (-2%)   37 (-3%)   33 (-8%)
Statistic for ES, V:
  Historical                             0.04%      0.21%      0.20%      0.27%
  Hill                                   -1.25%     -1.20%     -1.47%     -1.70%
  Conditional Hill N                     -0.71%     -0.83%     -0.90%     -1.07%
  Conditional Hill T                     -0.82%     -0.86%     -0.96%     -1.13%

For the VAR predictions, one of the conditional Hill methods is preferable; which one depends on the length of the calibration period. The one based on t innovations is the only one that generates models within the limits for a reliable model when two years or more are used. When one year is used it gives 12% fewer exceedances than expected, which is close to the limit. Hence, that method can be seen as the preferable one of the two if a choice has to be made. Noticeable is that both conditional Hill methods always overestimate the risk. As can be seen in table 10, the time horizon does not really matter; reliable models can be obtained for each length of calibration data tested. For two years or more the conditional Hill with t innovations is a convenient choice; for one year the one based on normal innovations is slightly better. For ES, the historical simulation gives an underestimation of the risk, although, as for the other time series, it is the method with the statistic closest to zero. But since the conditions for an acceptable model include a reasonable estimate of VAR, only the conditional Hill methods should be considered for estimating ES. The length of calibration data can be between one and three years, depending on which distribution the innovations are assumed to come from: one year for normal innovations, two or three years for t innovations.

6.3.3 GSPC
Note that the GSPC time series had zeros on the latest 123 days; these observations are therefore excluded from the analysis.

Table 11. Simulation results and predictions based on the time series of GSPC.

GSPC
Calibration period                       1 year     2 years    3 years    4 years
Length prediction period (in days)       4201       3949       3697       3445
Expected number of exceedances (days)    42.01      39.49      36.96      34.45
Exceedances over estimated VAR, U:
  Historical                             58 (38%)   63 (60%)   59 (60%)   58 (68%)
  Hill                                   45 (7%)    48 (22%)   46 (24%)   39 (13%)
  Conditional Hill N                     36 (-14%)  31 (-21%)  24 (-35%)  24 (-30%)
  Conditional Hill T                     31 (-26%)  27 (-32%)  23 (-38%)  17 (-51%)
Statistic for ES, V:
  Historical                             0.14%      0.17%      0.14%      0.11%
  Hill                                   -0.54%     -0.57%     -0.57%     -0.53%
  Conditional Hill N                     -0.50%     -0.68%     -0.66%     -0.88%
  Conditional Hill T                     -0.26%     -0.52%     -0.56%     -0.60%

The only VAR prediction model that falls within the limits of acceptability is obtained by the Hill estimator based on a one-year calibration period. All other models are quite far off: the historical and Hill methods underestimate the risk quite substantially, while the conditional Hill methods overestimate it. Since only one VAR-estimating model is acceptable, it is also the only candidate for ES estimation. As can be seen in table 11, the backtesting statistic is smaller than 1% in magnitude, and hence the Hill estimator based on one year of calibration data is the model that gives trustworthy estimates of VAR and ES for time series similar to GSPC.
If the condition of an accurate VAR estimate is ignored, several more models can be seen as appropriate; hence, an incorrect estimation of VAR can still generate an acceptable estimate of ES. If an underestimation of the ES risk is acceptable, the historical simulation based on four years of calibration data is the best model; if not, any of the EVT methods can be used.

6.3.4 N225
Table 12. Simulation results and predictions based on the time series of N225.

N225
Calibration period                       1 year     2 years    3 years    4 years
Length prediction period (in days)       4324       4072       3820       3568
Expected number of exceedances (days)    43.24      40.72      38.20      35.68
Exceedances over estimated VAR, U:
  Historical                             60 (39%)   56 (38%)   54 (41%)   44 (23%)
  Hill                                   48 (11%)   48 (18%)   45 (18%)   36 (1%)
  Conditional Hill N                     42 (-3%)   29 (-29%)  28 (-27%)  23 (-36%)
  Conditional Hill T                     41 (-5%)   32 (-21%)  29 (-24%)  23 (-36%)
Statistic for ES, V:
  Historical                             0.24%      0.30%      0.12%      0.25%
  Hill                                   -1.21%     -1.24%     -1.30%     -1.50%
  Conditional Hill N                     -0.70%     -0.69%     -0.76%     -0.69%
  Conditional Hill T                     -0.82%     -0.96%     -0.97%     -0.93%

The Hill model with four years of calibration data is the preferable model for VAR estimation, followed closely by the conditional Hill models with normal and t innovations and a one-year calibration period. These are the three models that fall within the 10% limit from the expected number of exceedances. If other calibration periods are used, the historical and Hill methods underestimate the risk, while the conditional Hill methods overestimate it quite substantially. For the time series of N225, one or four years of calibration data has to be used for the VAR models to generate trustworthy predictions. For one year, one of the conditional Hill models generates the most reliable estimates; for four years the simpler Hill method is preferable. For estimating ES, the conditional Hill models based on one year of historical data are the only acceptable ones. As for the other datasets in this study, the historical simulation generates the backtesting statistic closest to zero but underestimates the VAR risk quite substantially. So for a good ES-estimating model the VAR estimate would have to be incorrect; to fulfil the conditions for a reliable model for both VAR and ES estimation, one of the conditional Hill methods based on one year of historical data should be used.

6.3.5 EUR
Table 13. Simulation results and predictions based on the time series of EUR.

EUR
Calibration period                       1 year     2 years    3 years    4 years
Length prediction period (in days)       4324       4072       3820       3568
Expected number of exceedances (days)    43.24      40.72      38.20      35.68
Exceedances over estimated VAR, U:
  Historical                             59 (36%)   50 (23%)   44 (15%)   42 (18%)
  Hill                                   50 (16%)   36 (-12%)  37 (-3%)   35 (-2%)
  Conditional Hill N                     37 (-14%)  27 (-34%)  28 (-27%)  28 (-22%)
  Conditional Hill T                     36 (-17%)  25 (-39%)  24 (-37%)  22 (-38%)
Statistic for ES, V:
  Historical                             0.02%      0.03%      0.05%      0.05%
  Hill                                   -0.23%     -0.27%     -0.26%     -0.35%
  Conditional Hill N                     -0.19%     -0.21%     -0.26%     -0.27%
  Conditional Hill T                     -0.23%     -0.27%     -0.28%     -0.25%

For EUR the Hill estimator performs best for VAR estimation; the models with calibration periods of three and four years perform closest to the statistically expected and are the only two that fall within the limits for a reliable model. Both models overestimate the risk slightly. All other models are considered unreliable since they either under- or overestimate the risk substantially. So for the EUR time series, longer calibration periods are needed and only the Hill estimator generates reliable estimates.
The same reasoning holds for the preferable model for estimating ES: the Hill estimator based on three or four years of historical data should be used. In contrast to the results for the indices in the sections above, the four-year period of historical data seems preferable for obtaining good estimates of the risk measures when the time series for the EUR exchange rate is used. This can be seen in the two following sections as well, even though the models there also perform well when shorter periods are used.

6.3.6 GBP
Table 14. Simulation results and predictions based on the time series of GBP.

GBP
Calibration period                       1 year     2 years    3 years    4 years
Length prediction period (in days)       4324       4072       3820       3568
Expected number of exceedances (days)    43.24      40.72      38.20      35.68
Exceedances over estimated VAR, U:
  Historical                             50 (16%)   51 (25%)   56 (47%)   47 (32%)
  Hill                                   37 (-14%)  32 (-21%)  34 (-11%)  36 (1%)
  Conditional Hill N                     41 (-5%)   30 (-26%)  33 (-14%)  33 (-8%)
  Conditional Hill T                     36 (-17%)  35 (-14%)  35 (-8%)   33 (-8%)
Statistic for ES, V:
  Historical                             0.11%      0.07%      0.01%      0.11%
  Hill                                   -0.61%     -0.45%     -0.53%     -0.50%
  Conditional Hill N                     -0.33%     -0.42%     -0.39%     -0.44%
  Conditional Hill T                     -0.48%     -0.50%     -0.48%     -0.37%

Every method apart from the historical simulation overestimates the VAR throughout, except the Hill estimator with four years of calibration data. That is the model that performs closest to the expected. However, several conditional Hill models can also be considered reliable. All reliable models give an overestimation of the ES, but within the acceptable limit. The choice of length of the calibration data affects the choice of model, but four years of data seems to be the suitable choice, since any of the models based on extreme value theory can then be used to obtain dependable results. The conditional Hill model with normal innovations and one year of calibration data is the model that seems to estimate ES most accurately.

6.3.7 USD
Table 15. Simulation results and predictions based on the time series of USD.

USD
Calibration period                       1 year     2 years    3 years    4 years
Length prediction period (in days)       4324       4072       3820       3568
Expected number of exceedances (days)    43.24      40.72      38.20      35.68
Exceedances over estimated VAR, U:
  Historical                             60 (39%)   56 (38%)   46 (20%)   52 (46%)
  Hill                                   55 (27%)   45 (11%)   38 (-1%)   40 (12%)
  Conditional Hill N                     48 (11%)   41 (1%)    35 (-8%)   37 (4%)
  Conditional Hill T                     46 (6%)    40 (-2%)   38 (-1%)   36 (1%)
Statistic for ES, V:
  Historical                             0.08%      0.11%      0.14%      0.07%
  Hill                                   -0.52%     -0.42%     -0.39%     -0.33%
  Conditional Hill N                     -0.57%     -0.57%     -0.61%     -0.56%
  Conditional Hill T                     -0.60%     -0.52%     -0.66%     -0.54%

As for all the other time series, the historical simulation provides the backtesting statistic closest to zero, but with a small underestimation of the risk; the statistic is by far the closest to zero of all the models tested. However, as mentioned before, the VAR estimates are not even close to being reasonable, and therefore the models cannot be considered acceptable. For the VAR estimation, all extreme value theory based models generate reliable estimates. The conditional Hill method with t innovations gives dependable models for all lengths of calibration data tested. The model with normal innovations needs two years or more, while the Hill estimator only gives reliable estimates when three years of calibration data is used.
The results from the backtesting of the ES predictions show that the models that are acceptable for VAR estimation are all within the limits of acceptability for ES estimation as well. The EVT models that give more exceedances than desired are not far off: as can be seen in table 15, the differences from the statistically expected are only 11-12%, except for the Hill estimator based on one year of calibration data. This indicates that all the EVT models are strong candidates when choosing a model for estimating the risk measures, and that the time horizon used is more or less arbitrary. Some restrictions exist depending on which model is used, and the outcome of this backtesting procedure is that the conditional Hill method with t innovations is preferable, since the choice of length of the calibration data is then arbitrary.

6.4 Backtesting results, various thresholds
OMXS30 is used to investigate threshold choices other than 10%. Calibration periods of one to four years are used to estimate VAR and ES. The thresholds investigated are based on the results presented in section 5.2 Threshold choice.

Table 16. Summary of Hill simulation for different thresholds, for the OMXS30 data.

Hill estimator
Calibration period                       1 year     2 years    3 years    4 years
Length prediction period (in days)       4324       4072       3820       3568
Expected number of exceedances (days)    43.24      40.72      38.20      35.68
Exceedances over estimated VAR, U:
  Hill T=0.1                             54 (25%)   44 (8%)    43 (13%)   44 (23%)
  Hill T=0.07                            70 (62%)   58 (42%)   57 (49%)   57 (60%)
  Hill T=0.05                            76 (76%)   62 (52%)   62 (62%)   63 (77%)
  Hill T=0.03                            81 (87%)   68 (67%)   71 (86%)   68 (91%)
Statistic for ES, V:
  Hill T=0.1                             -1.00%     -1.25%     -1.55%     -2.00%
  Hill T=0.07                            -0.70%     -0.81%     -1.08%     -1.48%
  Hill T=0.05                            -0.48%     -0.30%     -0.56%     -0.95%
  Hill T=0.03                            -0.25%     -0.17%     -0.35%     -0.44%

The VAR predictions based on the Hill estimator become less and less accurate as the tail size diminishes, for all calibration periods from one up to four years. So if the Hill estimator is used, the threshold should not be less than 10% for data similar to the OMXS30 data, and the length of the calibration period has to be two years to obtain a reliable model. This model slightly underestimates the VAR risk, since it gives a few more exceedances than statistically expected. For expected shortfall the backtesting statistic gets closer to zero the lower the threshold is set; however, the statistic is then based on an inaccurate VAR estimate and is therefore somewhat misleading. If the ES model should be combined with a VAR model, to avoid several different simulations, a model based on two years of historical data and a threshold of 10% should be used. This would lead to an overestimation of the ES risk of 1.25% on average, i.e. if the estimated ES is 10% the actual loss is around 8.75%.

Table 17. Summary of conditional Hill simulation, with normal innovations, for different thresholds, for the OMXS30 data.

Conditional Hill, normal innovations
Calibration period                       1 year     2 years    3 years    4 years
Length prediction period (in days)       4324       4072       3820       3568
Expected number of exceedances (days)    43.24      40.72      38.20      35.68
Exceedances over estimated VAR, U:
  CH N T=0.1                             46 (6%)    36 (-12%)  35 (-8%)   32 (-10%)
  CH N T=0.07                            60 (39%)   45 (11%)   46 (20%)   36 (1%)
  CH N T=0.05                            66 (53%)   54 (33%)   50 (31%)   41 (15%)
  CH N T=0.03                            73 (69%)   60 (47%)   54 (41%)   44 (23%)
Statistic for ES, V:
  CH N T=0.1                             -0.53%     -0.81%     -0.73%     -1.05%
  CH N T=0.07                            -0.45%     -0.44%     -0.52%     -0.57%
  CH N T=0.05                            -0.34%     -0.36%     -0.35%     -0.35%
  CH N T=0.03                            -0.22%     -0.21%     -0.15%     -0.15%
Table 18. Summary of conditional Hill simulation, with t innovations, for different thresholds, for the OMXS30 data.

Conditional Hill, t innovations
Calibration period                       1 year     2 years    3 years    4 years
Length prediction period (in days)       4324       4072       3820       3568
Expected number of exceedances (days)    43.24      40.72      38.20      35.68
Exceedances over estimated VAR, U:
  CH T T=0.1                             44 (2%)    34 (-17%)  35 (-8%)   24 (-33%)
  CH T T=0.07                            55 (27%)   46 (13%)   42 (10%)   36 (1%)
  CH T T=0.05                            63 (46%)   53 (30%)   49 (28%)   43 (21%)
Statistic for ES, V:
  CH T T=0.1                             -0.84%     -0.97%     -1.10%     -0.96%
  CH T T=0.07                            -0.28%     -0.68%     -0.60%     -0.70%
  CH T T=0.05                            -0.24%     -0.45%     -0.35%     -0.47%

For the VAR estimations the best model varies with the calibration period, but the threshold should not be lower than approximately 7%, see tables 17 and 18. The choice of threshold affects the appropriate length of the calibration data. Overall, two years should be avoided; either a shorter period with a 10% threshold or a longer period with a 7-10% threshold should be used. For both conditional Hill methods a threshold of 7% and four years of calibration data seem to perform closest to the statistically expected. As can be seen in tables 17 and 18, the ES backtesting statistic seems to approach zero for smaller tail sizes; when the threshold for the tail is set to 5% of the sample, the statistic closest to zero is obtained. Nevertheless, the same reasoning as for the Hill estimator holds for these methods. Since the ES backtesting statistic depends on the VAR estimate, that has to be considered when interpreting the results of the backtesting. If the models preferable for VAR estimation are used, the overestimation of the ES risk will be slightly larger than if a threshold of 5% is used.

6.5 Results higher significance levels
As described in section 5.7, an investigation of how the EVT models perform at higher significance levels is made. A hypothetical loss distribution consisting of 1 000 000 observations for the day 2012-01-01 is generated via an AR(1)-GARCH(1,1) model with t innovations in order to obtain the ES; the VAR is obtained from the properties of the innovation distribution. The results are presented in tables 19 and 20 below.

Table 19. The table shows the predictions of VAR and ES at the 99.9% level for 2012-01-01 from the different models. The outcome from the hypothetical data is also included as a reference; the closer to that outcome the better.

VAR (99.9%)                     1 year     2 years    3 years    4 years
Outcome from hypothetical data  0.1818     0.1666     0.1792     0.1909
Historical                      0.0682*    0.0682*    0.0640     0.0703
Hill                            0.1351     0.1227     0.1136     0.1410
Conditional Hill N              0.0964     0.0761     0.0831     0.0781
Conditional Hill T              0.1090     0.0866     0.0833     0.0900

ES (99.9%)                      1 year     2 years    3 years    4 years
Outcome from hypothetical data  0.2766     0.2538     0.2731     0.2906
Historical                      -          -          -          0.0723*
Hill                            0.2280     0.2132     0.1862     0.2392
Conditional Hill N              0.1475     0.1103     0.1227     0.1106
Conditional Hill T              0.1771     0.1337     0.1242     0.1349
* The one most extreme observation in the historical data used.

Table 20. The table shows the predictions of VAR and ES at the 99.97% level for 2012-01-01 from the different models. The outcome from the hypothetical data is also included as a reference; the closer to that outcome the better.
VAR (99.97%)                    1 year     2 years    3 years    4 years
Outcome from hypothetical data  0.2730     0.2504     0.2695     0.2868
Historical*                     0.0682     0.0682     0.0682     0.0723
Hill                            0.2207     0.2045     0.1817     0.2311
Conditional Hill N              0.1462     0.1105     0.1226     0.1112
Conditional Hill T              0.1731     0.1323     0.1239     0.1343

ES (99.97%)                     1 year     2 years    3 years    4 years
Outcome from hypothetical data  0.4117     0.2780     0.4068     0.4327
Historical                      -          -          -          -
Hill                            0.3724     0.3554     0.2978     0.3921
Conditional Hill N              0.2239     0.1602     0.1810     0.1576
Conditional Hill T              0.2814     0.2041     0.1848     0.2015
* The estimated VAR is just the one most extreme observation in the historical data used, for all four time periods.

For the VAR predictions the Hill estimator seems to be the most accurate, even though it gives a lower VAR than the outcome from the hypothetical data and hence underestimates the risk. The result is the same for the ES predictions: the Hill estimator provides the most accurate predictions, but it consistently underestimates the ES risk. Notice that the predictions in the tables above are only made for one day. To establish the accuracy of the models, predictions must be made over a longer period of time and then used to analyze and backtest the separate models.

7. Discussion and conclusions
In this study the time series used are just the returns of a few indices and foreign exchange rates over an 18-year period, and the time series consist only of the daily closing prices of the assets. In reality, when a portfolio or an asset performs poorly, changes are made to avoid large losses, for example by selling off some of the asset or changing the positions in the portfolio. If this is taken into consideration, the appearance of the time series may be different.

In the literature it is stated that EVT should be good for estimating higher quantiles; that is the main idea of using the theory. The largest part of this study investigates models for estimating the risk measures at a significance level of 99%, and a smaller analysis is made for higher levels of significance. The outcome of the latter is that for higher significance levels the Hill estimator seems to perform best. However, the estimates of both VAR and ES are quite far from the outcome of the hypothetical data. One explanation for why the predictions from the models are so far off can be that when the hypothetical outcome is generated, the residuals are assumed to come from a Student's t distribution with three degrees of freedom. In reality the actual residuals of the time series may correspond to another number of degrees of freedom. So even if the AR-GARCH model is based on the time series and the assumption about the residuals seems reasonable, it may not be optimal. To properly investigate how the models perform for higher significance levels, a more extensive study should be made, with a substantial data material so that proper backtesting procedures can be carried out.

There seems to be no clear answer as to which method is preferable when estimating the two risk measures VAR and ES. The preferable model seems to vary not only between different types of assets but between assets of the same type as well. In this study indices and foreign exchange rates are considered; if other types of assets were studied, a clearer picture of which model to prefer might be obtained. In addition, other aspects than just a model's ability to produce satisfying predictions may affect the choice of model.
The conditional Hill models demand both a fit to a time series model and an estimation of a shape parameter, while the Hill estimator only demands an estimation of the shape parameter before predicting VAR and ES. The Hill estimator has the advantage that it does not need any additional parameter estimates and is therefore simpler to implement. Also, the computational burden to obtain the desired estimate is not as heavy as for the methods that combine EVT and time series analysis. The advantage of the conditional Hill methods is that they are more flexible than the others, both to volatility increases and decreases. When combining EVT and time series models the possible combinations are many. In this study I chose the conditional Hill since the error from model misspecification should be kept as small as possible. If the POT method is used, a GPD needs to be fitted to the residuals as well, and hence the computational burden increases even more and the possibility of error increases. The one big drawback with the Hill estimator is that it only applies to Fréchet types of distributions, and even though financial data most often can be assumed to come from that family, that is not always the case. That is the main reason for considering the POT method. The advantage of the POT method is that it can handle all types of distributions, and the GPD can be fitted no matter whether the data seem to correspond to a Gumbel, Fréchet or Weibull distribution. The fact that the GPD parameters have to be fitted to the data is one drawback of the method already mentioned. The main problem, as shown in this study, is that the time horizon needed for a stable model is not obvious. There need to be enough observations in the tail to obtain parameter estimates that converge, and the number depends on how well the GPD fits the data: if the data are a very good fit to the GPD, fewer observations are needed than if the fit is poor.

The threshold choice is one important part of EVT. In this study some different methods for finding an appropriate threshold are examined and a few choices of threshold are investigated. Since the choices based on graphical methods and on minimizing the MSE give worse models than when 10% is used, those methods are obviously not optimal here. If the conventional choice seems too arbitrary, the data-driven algorithms presented in section 2.5.1 Defining the tail should, in my opinion, be considered. However, the conventional choice seems to be good enough as a first step, and a threshold choice around 10% seems to be reasonable.

As mentioned above, the preferable model depends on the data. For ES the historical simulation is the simplest alternative. It can easily be implemented where historical simulation of VAR already exists, it often gives a backtesting statistic very close to zero, and in comparison with the other methods tested in this study it always gives the smallest statistic (for a threshold of 10%). However, the statistic is obtained from an incorrect estimation of VAR, which makes the result debatable. Since the ES backtesting statistic depends on the estimated VAR, it seems reasonable that the preferable model should be the same for both. This can be seen in section 6.5, where the models are tested for higher significance levels, but it is not always the case, as seen in section 6.3. The main reason for this is that the ES backtesting statistic is then based on an incorrect VAR estimate.
In most cases this means that the actual number of exceedances for the models is substantially larger than the statistically expected number; hence the ES backtesting statistic is based on several more observations than it would be if the VAR estimate were closer to the expected level. This is one important result of this study: an incorrect estimate of VAR can still generate an apparently acceptable estimate of ES. But since VAR and ES are connected, one of the EVT methods is preferable when an adequate model for estimating ES is desired. Which one, as well as the length of the calibration period, depends on the time series in question.

As for ES, both the choice of model and the length of the calibration period seem to differ among the time series for the VAR estimations. In general, shorter time horizons seem to generate the most stable models if one of the conditional Hill methods is applied, while longer calibration periods should be used if one wants to implement the Hill estimator. Logically a somewhat longer calibration period should be preferable, so that the observations in the tail on which the ES estimate is based are not too few, but as can be seen in section 6.3 that is not always the case. Consequently, the time horizon needed is somewhat arbitrary and depends on the chosen model, but mainly on the data used.

No matter which model is used, they all fail to capture large, quick changes in the volatility. If the market goes from calm one day to very turbulent the next, no model can predict that. This is clear in the plots in section 6.2. When a change in volatility develops over some time, the conditional Hill models adjust quickly, both to increases and decreases. These are the models that, for the majority of the time series tested, generate models within the limits of acceptability. The models with normal innovations seem to generate the most accurate ES estimates, while the models with t innovations generate the most accurate VAR estimates. The Hill estimator is the only one that generates acceptable models for GSPC and EUR, and for several others it gives the VAR estimates closest to the expected. The foreign exchange rates seem to be more adaptable, and all the EVT methods generate reliable estimates of both VAR and ES, except for EUR. The indices on the other hand seem to be more sensitive to the choice of method: either the Hill estimator is the only method that generates acceptable models, or it is one of the conditional Hill methods.

A drawback when using historical simulation is that you are restricted to the historical outcome of an asset; in a sense you assume that history will repeat itself. Since the financial market is constantly changing, EVT models or other semi-parametric or parametric methods can be useful in the field of risk analysis. The conditional Hill models in this study have performed quite well, and with such methods you take into account what has happened recently and adapt your model to that. If you are in a period of high volatility the model will take that into account, and if the volatility is low that will also affect the model. Even though the first exceedance cannot be avoided, the model quickly adapts to the increase in volatility and hence avoids further exceedances. Even the Hill estimator on its own is more flexible and gives better backtesting results than historical simulation. The big problem with the EVT models presented in this paper is that they can be hard to implement on large portfolios, and the computational burden of the estimations can be quite heavy.
However, for ES there is a way around this problem. Since ES is a coherent risk measure, one of the properties it satisfies is sub-additivity. This means that the risk of a portfolio cannot be greater than the sum of the individual risks of the assets in the portfolio; this property is also known as the diversification effect. Hence, EVT can be applied to the individual assets, and an upper limit for the ES of a portfolio is given by the sum of the ES of the individual assets. One way to make use of the advantages of the EVT models for VAR is either to use them on significantly smaller portfolios or directly on the time series of the portfolio, since the EVT methods in the univariate case are quite easy to implement and fairly fast to simulate; this can of course be done for ES as well. These estimates can then be used in combination with historical simulation, which is easier to implement on large portfolios.

The conclusion of this study is that EVT methods can be useful in the field of risk analysis and contribute to improved predictions. Which model to use depends on the data as well as on the level of ambition. The models that combine EVT and time series analysis are harder to implement and the computational burden is higher; the Hill estimator is much simpler. The time horizon of the historical data used varies between models and between assets. To summarize: for indices a shorter horizon of one to two years is preferable and one of the conditional Hill methods should be used, with GSPC as the exception. The Hill estimator with longer horizons of up to four years should be used on time series similar to the foreign exchange rates included in this study.

8. References

Books
Alexander, C (2008): Market Risk Analysis IV, Value-at-Risk Models. John Wiley & Sons Ltd., England.
Embrechts, P, Klüppelberg, C and Mikosch, T (1997): Modelling Extremal Events for Insurance and Finance. Springer-Verlag, New York.
Gouriéroux, C (1997): ARCH-Models and Financial Applications, Springer Series in Statistics. Springer, New York.

Articles
Basel Committee on Banking Supervision (2012): Fundamental review of the trading book. Basel Committee on Banking Supervision Consultative Document.
Christoffersen, P and Goncalves, S (2005): Estimation risk in financial risk management. Journal of Risk, 7, p. 1-28.
Danielsson, J and de Vries, C.G. (1997): Tail index and quantile estimation with very high frequency data. Journal of Empirical Finance, 4, p. 241-257.
Danielsson, J, de Haan, L, Peng, L and de Vries, C.G. (2001): Using a bootstrap method to choose the sample fraction in tail index estimation. Journal of Multivariate Analysis, 76, No. 2, p. 226-248.
Drees, H and Kaufmann, E (1998): Selecting the optimal sample fraction in univariate extreme value estimation. Stochastic Processes and their Applications, 75, p. 149-172.
Beirlant, J, Vynckier, P and Teugels, J.L. (1996a): Tail index estimation, Pareto quantile plots and regression diagnostics. Journal of the American Statistical Association, 91, p. 1659-1667.
Beirlant, J, Vynckier, P and Teugels, J.L. (1996b): Excess functions and estimation of the extreme value index. Bernoulli, 2, p. 293-318.
Embrechts, P, Kaufmann, R and Patie, P (2005): Strategic long-term financial risks: single risk factors. Computational Optimization and Applications, 32, issue 1-2, p. 61-90.
Haile, F.D. and Pozo, S (2006): Exchange rate regimes and currency crises: an evaluation using extreme value theory. Review of International Economics, 14, No. 4, p. 554-570.
Hall, P (1990): Using the bootstrap to estimate mean squared error and select smoothing parameter in nonparametric problems. Journal of Multivariate Analysis, 32, No. 2, p. 177-203.
Hill, B.M. (1975): A simple general approach to inference about the tail of a distribution. The Annals of Statistics, 3, p. 1163-1174.
Kourouma, L, Dupre, D, Sanfilippo, G and Taramasco, O (2011): Extreme Value at Risk and Expected Shortfall during Financial Crisis. Working paper, HAL: halshs-00658495, version 1.
Longin, F.M. (2000): From Value at Risk to stress testing: the extreme value approach. Journal of Banking & Finance, 24, No. 7, p. 1097-1130.
Lux, T (2000): On moment condition failure in German stock returns: an application of recent advances in extreme value statistics. Empirical Economics, 25, p. 641-652.
McNeil, A.J. (1997): Estimating the tails of loss severity distributions using extreme value theory. ASTIN Bulletin, 27, p. 117-137.
McNeil, A.J. (1999): Extreme value theory for risk managers. Internal Modelling and CAD II, RISK Books, p. 93-113.
McNeil, A.J. and Frey, R (2000): Estimation of tail-related risk measures for heteroscedastic financial time series: an extreme value approach. Journal of Empirical Finance, 7, p. 271-300.
McNeil, A.J. and Saladin, T (1997): The peaks over thresholds method for estimating high quantiles of loss distributions. Proceedings of the 28th International ASTIN Colloquium.
Nyström, K and Skoglund, J (2002): Univariate extreme value theory, GARCH and measures of risk. Swedbank, Group Financial Risk Control.
Rocco, M (2011): Extreme value theory for finance: a survey. Bank of Italy Occasional Paper No. 99.
Coronado, M (2000): Extreme value theory (EVT) for risk managers: pitfalls and opportunities in the use of EVT in measuring VaR. Proceedings of the VIII Foro de Finanzas.
Fisher, R.A. and Tippett, L.H.C. (1928): Limiting forms of the frequency distribution of the largest or smallest member of a sample. Proceedings of the Cambridge Philosophical Society, 24, p. 180-190.

Appendix A

Kurtosis
Kurtosis is defined as
$$k_0 = \frac{\frac{1}{n}\sum_{i=1}^{n}\left(x_i-\bar{x}\right)^4}{\left(\frac{1}{n}\sum_{i=1}^{n}\left(x_i-\bar{x}\right)^2\right)^{2}}.$$
In this study MATLAB is used for calculation of the kurtosis. MATLAB uses the following equation for the kurtosis, which is corrected for bias:
$$k_1 = \frac{n-1}{(n-2)(n-3)}\left((n+1)\,k_0 - 3(n-1)\right) + 3,$$
where $k_0$ is the equation above for calculating the kurtosis not corrected for bias. The kurtosis gives an indication of how probable extreme events are for a distribution. The normal distribution has a kurtosis equal to 3, and a kurtosis larger than 3 indicates fat tails and a slimmer, more peaked center. Such distributions are called leptokurtic, and when the kurtosis is smaller than 3 the distribution is said to be platykurtic.

Skewness
The skewness is defined as
$$s_0 = \frac{\frac{1}{n}\sum_{i=1}^{n}\left(x_i-\bar{x}\right)^3}{\left(\frac{1}{n}\sum_{i=1}^{n}\left(x_i-\bar{x}\right)^2\right)^{3/2}}.$$
In this study MATLAB is used for calculation of the skewness. MATLAB uses the following equation for the skewness, which is corrected for bias:
$$s_1 = \frac{\sqrt{n(n-1)}}{n-2}\, s_0,$$
where $s_0$ is the equation above for calculating the skewness not corrected for bias. Skewness is a measure that indicates the asymmetry of the probability distribution. A zero value implies that the data are evenly spread around the sample mean, as for the normal distribution for example. A negative value indicates that the left tail is longer than the right, while a positive value implies the opposite. If the probability distribution is skewed, a time series model that takes this into consideration can be useful to obtain dependable results.

Autocorrelation
The lag-h autocorrelation estimate is obtained by
$$\hat{\rho}_h = \frac{\sum_{t=1}^{T-h}\left(x_t - \bar{x}\right)\left(x_{t+h} - \bar{x}\right)}{\sum_{t=1}^{T}\left(x_t - \bar{x}\right)^2},$$
where $x_t$ is the observation at time t and $\bar{x}$ is the estimated mean of the sample.
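The lag-h autocorrelation above is straightforward to compute directly. The sketch below also returns the approximate 95% bound 1.96/sqrt(T) that is commonly drawn in autocorrelation plots; the exact bounds produced by MATLAB's plotting routines may differ slightly. The thesis calculations were made in MATLAB; the Python sketch is for illustration only.

```python
import numpy as np

def sample_autocorrelation(x, max_lag=20):
    """Lag-h sample autocorrelation for h = 1..max_lag, as defined above."""
    x = np.asarray(x, dtype=float)
    T = len(x)
    xc = x - x.mean()
    denom = np.sum(xc ** 2)
    rho = np.array([np.sum(xc[:T - h] * xc[h:]) / denom
                    for h in range(1, max_lag + 1)])
    return rho

def confidence_bound(T):
    """Approximate 95% bound under the hypothesis of no autocorrelation."""
    return 1.96 / np.sqrt(T)
```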
Often a 95% confidence interval is included in the plot; if no autocorrelation estimate crosses the bounds, the assumption of no autocorrelation holds and the data can be considered independent.

Ljung-Box Q test
For the Ljung-Box Q-test the following statistic is used:
$$Q(L) = T(T+2)\sum_{k=1}^{L}\frac{\hat{\rho}_k^2}{T-k},$$
where T is the sample size, L is the number of lags at which autocorrelation is tested and $\hat{\rho}_k$ is the autocorrelation at lag k, for a definition see under Autocorrelation above. Hence, to obtain the statistic Q(L), the squared autocorrelation at each lag is weighted and then summed; the weight at lag k involves the difference between the total sample size and the current lag, T - k. The test asks whether the statistic Q(L) belongs to a chi-squared distribution with L degrees of freedom or not, at the significance level alpha, and the hypotheses are formulated as
$H_0$: no autocorrelation,
$H_1$: autocorrelation occurs.
The test does not distinguish at which lags the autocorrelation occurs; it tests the overall autocorrelation.

Figure 16. Time series for the loss returns of FTSE; note that the data start at 1994-01-03 even though the first date on the axis label is a later date.
Figure 17. Time series for the loss returns of N225; note that the data start at 1994-01-03 even though the first date on the axis label is a later date.
Figure 18. Time series for the loss returns of GSPC; note that the data start at 1994-01-03 even though the first date on the axis label is a later date.
Figure 19. Time series for the loss returns of GBP; note that the data start at 1994-01-03 even though the first date on the axis label is a later date.
Figure 20. Time series for the loss returns of USD; note that the data start at 1994-01-03 even though the first date on the axis label is a later date.
Figure 21. Time series for the loss returns of EUR; note that the data start at 1994-01-03 even though the first date on the axis label is a later date.
Figure 22. Autocorrelation plot for the residuals of the FTSE time series (left) and for the squared residuals (right).
Figure 23. Autocorrelation plot for the residuals of the GSPC time series (left) and for the squared residuals (right).
Figure 24. Autocorrelation plot for the residuals of the N225 time series (left) and for the squared residuals (right).
Figure 25. Autocorrelation plot for the residuals of the EUR time series (left) and for the squared residuals (right).
Figure 26. Autocorrelation plot for the residuals of the GBP time series (left) and for the squared residuals (right).
Figure 27. Autocorrelation plot for the residuals of the USD time series (left) and for the squared residuals (right).
Figure 28. Histogram of FTSE and GSPC with fitted normal distribution (red line).
Figure 29. Histogram of N225 and EUR with fitted normal distribution (red line).
Figure 30. Histogram of GBP and USD with fitted normal distribution (red line).
Figure 31. QQ plot against the standard normal distribution for FTSE and GSPC.
Figure 32. QQ plot against the standard normal distribution for N225 and EUR.
Figure 33. QQ plot against the standard normal distribution for GBP and USD.

Appendix B

Threshold choice
To investigate different methods for setting the threshold, the OMXS30 data is used. First, Hill plots and mean excess plots are constructed; then a Monte Carlo simulation is implemented, for algorithm and results see below. In the Hill plots the estimated tail index is plotted as a function of the threshold. The plots are based on the entire data set as well as on just one year, since one year is used when the prediction models are implemented. As mentioned in section 2.5.1 Defining the tail, approximately horizontal lines indicate that for those values of the threshold the tail index estimate is essentially stable with respect to the choice of threshold.
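A Hill plot of the kind shown in figures 34 and 35 can be produced with a few lines of code. The sketch below computes the Hill estimate of the tail index for a range of threshold fractions, using the standard form of the estimator; plotting the resulting curve and looking for a roughly flat region is the graphical procedure described above. It is an illustrative Python sketch with simulated data standing in for the OMXS30 losses (the thesis itself used MATLAB).

```python
import numpy as np
import matplotlib.pyplot as plt

def hill_plot_values(losses, fractions):
    """Hill estimate of the tail index for a range of threshold fractions."""
    x = np.sort(np.asarray(losses, dtype=float))[::-1]   # descending order statistics
    n = len(x)
    alphas = []
    for f in fractions:
        k = max(int(np.floor(f * n)), 2)                  # number of tail observations
        alphas.append(1.0 / np.mean(np.log(x[:k]) - np.log(x[k])))
    return np.array(alphas)

# Simulated heavy-tailed data (hypothetical, stands in for 18 years of loss returns)
rng = np.random.default_rng(1)
losses = np.abs(rng.standard_t(df=4, size=4500)) * 0.01
fracs = np.linspace(0.01, 0.45, 80)

plt.plot(fracs, hill_plot_values(losses, fracs))
plt.xlabel("Threshold as share of total sample size")
plt.ylabel("Estimated tail index")
plt.show()
```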
Figure 34. Hill plots of the estimated tail index as a function of the threshold, where m is the tail size, based on 18 years of data and on the average of 1 year of data.
Figure 35. Hill plots based on 1 year of data, year 1994 (left figure) and year 2011 (right figure).

In figure 34 the left plot is based on the average of all one-year estimates. That is, the tail index is estimated for every non-overlapping one-year interval (one interval is 1994, the next is 1995, and so on), and the average estimate for each threshold is plotted. As can be seen in both figures 34 and 35, it is not clear for which thresholds the estimate is stable, and this can shift from sample to sample. A conclusion that can be drawn is that the threshold should be larger than 5%, which is in line with the conventional choice method presented in section 2.5.1 Defining the tail.

In the mean excess function plot the sample mean excess is plotted for each observation used as threshold. As mentioned in section 2.5.1 Defining the tail, an approximately linear plot for the higher order statistics implies that the tail can be assumed to come from a GPD; the direction of the line indicates the sign of the shape parameter. The plot is made based on the entire data set as well as on just one year.

Figure 36. Mean excess plots based on 18 years of data (left) and on the average of 1 year periods (right).

How the data used affect the mean excess plots is clear in figure 36. But even when just one year is used to generate the plot, there is no clear answer as to where a suitable threshold should be set. The slope declines a little around 7-10%, but that is a rather vague motivation. The figures from both the Hill and mean excess methods show the difficulties and drawbacks of the graphical methods. Even though they are easily generated, a subjective assessment is needed, which is neither practical nor efficient, since the threshold choice based on these methods must be adjusted to the data at hand. A conventional choice of, say, 10% is hence more consistent and time efficient. To use a more robust method that does not rely on human judgement, a Monte Carlo simulation procedure was used, in which the threshold is taken to be the one for which the RMSE between the true and the estimated quantile is smallest.
The procedure used in this study is a Monte Carlo simulation that finds a suitable threshold by minimizing the mean squared error. It is described by the following algorithm:

1. Generate n = 1000 samples from a Student's t distribution (4 degrees of freedom); the true distribution is then known, so the true quantile can easily be calculated.
2. For various values of m (the threshold), restrict the sample so that the target quantile is beyond the threshold.
3. For the Hill estimator, calculate the quantile and the MSE and bias using Monte Carlo simulation based on 1000 independent samples.
4. Plot the MSE and BIAS against m for the 99th percentile.
5. Choose the m which gives the smallest MSE and BIAS.

The procedure is repeated for 30 degrees of freedom as well, a distribution closer to the normal. The procedure generates the following plots.

Figure 37. Plot of MSE and BIAS as a function of the threshold, based on a t distribution with four degrees of freedom (left) and 30 degrees of freedom (right).

For thresholds between 1% and 7-8% the RMSE seems to be stable, while the BIAS increases from a threshold of approximately 3%. The threshold that generated the minimal RMSE was 4.4% when four degrees of freedom were used and 5.1% for 30 degrees of freedom. However, if a short sample is used and the threshold is set too low, there will be too few observations in the tail to obtain reliable estimates. Note that the above procedure is based on a prescribed distribution, not on the actual data, which is a major drawback; the only possible connection would be if the OMXS30 data actually came from a Student's t distribution with exactly four degrees of freedom. Another observation is that the plot looks much the same for a t distribution with higher degrees of freedom. This method takes a bit more time than the graphical ones but gives a specific threshold, the one that corresponds to the minimal RMSE. However, it has no connection to any specific data, such as a return series, but is entirely based on a known probability distribution, a Student's t in this case. Therefore, as mentioned in section 4.2, the conventional choice of 10% is used overall in this study, see section 2.5.1 Defining the tail. Some other choices are tested, based on the results in this section, to see if and how they affect the results.

Appendix C

Autocorrelation plots
Figure 38. Autocorrelation plot for the residuals from the AR-GARCH fit of the OMXS30 time series, with normal innovations (left upper), t innovations (left lower) and for the squared residuals, based on normal innovations (right upper) and t innovations (right lower).
Figure 39. Autocorrelation plot for the residuals from the AR-GARCH fit of the FTSE time series, with normal innovations (left upper), t innovations (left lower) and for the squared residuals, based on normal innovations (right upper) and t innovations (right lower).
Figure 40. Autocorrelation plot for the residuals from the AR-GARCH fit of the GSPC time series, with normal innovations (left upper), t innovations (left lower) and for the squared residuals, based on normal innovations (right upper) and t innovations (right lower).
Figure 41. Autocorrelation plot for the residuals from the AR-GARCH fit of the N225 time series, with normal innovations (left upper), t innovations (left lower) and for the squared residuals, based on normal innovations (right upper) and t innovations (right lower).
Figure 42. Autocorrelation plot for the residuals from the AR-GARCH fit of the EUR time series, with normal innovations (left upper), t innovations (left lower) and for the squared residuals, based on normal innovations (right upper) and t innovations (right lower).
Figure 43. Autocorrelation plot for the residuals from the AR-GARCH fit of the GBP time series, with normal innovations (left upper), t innovations (left lower) and for the squared residuals, based on normal innovations (right upper) and t innovations (right lower).
Figure 44. Autocorrelation plot for the residuals from the AR-GARCH fit of the USD time series, with normal innovations (left upper), t innovations (left lower) and for the squared residuals, based on normal innovations (right upper) and t innovations (right lower).

Appendix D

Figure 45. The figure shows the Hill predictions of VAR and ES and the actual negative return series of OMXS30 for calibration periods of different length (1, 2, 3 and 4 years). The length of the calibration period for each window is found in the title of the window. The threshold for extreme observations is set to 10% of the sample.
Figure 46. The figure shows the GPD predictions of VAR and ES and the actual negative return series of OMXS30 for calibration periods of different length (3 and 4 years). The length of the calibration period for each window is found in the title of the window. The threshold for extreme observations is set to 10% of the sample. Predictions based on one and two years of historical data are excluded since the parameter estimates did not converge for those time horizons, as described in section 6.2.1.
Figure 47. The figure shows the predictions from conditional Hill with normal innovations of VAR and ES and the actual negative return series of OMXS30 for calibration periods of different length (1, 2, 3 and 4 years). The length of the calibration period for each window is found in the title of the window. The threshold for extreme observations is set to 10% of the sample.
Figure 48. In the two windows on the upper half, VAR predictions for both t and normal innovations are shown (calibration periods of 1 and 4 years); in the two windows on the lower half, ES predictions are illustrated. The predictions from historical simulation are included to visualize the flexibility of both conditional Hill models.