Time Series introduction in R - Iñaki Puigdollers
10th of March 2016

TIME SERIES MODELLING AND FORECASTING

Quick Overview
• A time series is:
  o A sequence of observations
  o Generated from a stochastic process (random variables)
  o Ordered in time
• Modelling a time series: finding a mathematical expression for it
• Time series forecasting: using a model to predict its future values
• Examples of models:
  o Linear regression
  o SARIMA

Linear regression
• Simple linear regression: the dependent (response) variable is modelled as a linear function of the independent variable (regressor), with a regression coefficient and an error term
• Multiple linear regression: the same model with n regressors X1, …, Xn
• Smaller n’s are preferable (parsimony principle, a.k.a.
Occam’s razor)
• Assumptions:
  o Linear relation between Y and X (in multiple linear regression, between Y and each Xi)
  o In multiple linear regression, the regressors (Xi) are pairwise independent
  o The errors (𝜀t) are Gaussian white noise: independent and normally distributed

SARIMA (Seasonal Autoregressive Integrated Moving Average)
• SARIMA(p,d,q)x(P,D,Q)s:
  o s: the season period (12 for a year, 7 for a week, …)
  o Non-seasonal component (p,d,q): autoregressive and moving average terms
  o Seasonal component (P,D,Q): autoregressive and moving average terms
  o Error term
• Assumptions:
  o Weak (2nd order) stationarity*: the behaviour of the time series does not vary over time
  o Errors are Gaussian white noise: independent and normally distributed
• *Stationarity:
  o A time series is called stationary if all its moments are constant over time
  o A time series is called 2nd order stationary if its mean (1st order moment) and variance (2nd order moment) are constant over time (strictly, its autocovariance must also depend only on the lag)

Box & Jenkins Methodology
• A methodology to model SARIMA time series
• The methodology is composed of 3 stages:
  o Model selection
  o Parameter estimation
  o Model checking

B&J: Model Selection
• Detect stationarity
  o To detect mean stationarity: use stationarity tests like Kwiatkowski-Phillips-Schmidt-Shin (KPSS) or Augmented Dickey-Fuller (ADF), inspect the autocorrelation function (ACF), or simply look at the plot of the data
  Quite often you can make a time series stationary by applying a difference filter of order 1, i.e.
to every point Xt in the time series apply the following transformation: Yt = Xt – Xt-1
  o To detect variance stationarity: use the Priestley-Subba Rao test (PSR) or the ACF, or look at the plot of the data
• You can try to achieve stationarity in variance by applying the Box-Cox transformation to the series (for positive time series) or the Yeo-Johnson transformation (for non-positive time series). Please check the appendix for the closed-form formulae.
Remember: if you can’t make the time series stationary, it can’t be modelled as a SARIMA!

Stationarity?
[Figure: a stationary time series next to a non-stationary time series]

B&J: Model Selection
• Detect seasonality
  o Use the ACF and the partial autocorrelation function (PACF) to detect whether there is any seasonal pattern
  If there is seasonality you have to seasonal-difference the time series before modelling it, i.e. difference it at lag S, where S is the seasonal period: Yt = Xt – Xt-S. For example, for quarterly data with yearly seasonality you apply a lag-4 difference. See the appendix for a closed-form formula.

Seasonality?
[Figure: a seasonal time series next to a non-seasonal time series]

B&J: Model Selection
• Identify the parameters p & q (same for P & Q but at seasonal level)
  [Table of ACF/PACF patterns; the only surviving cell reads "exponentially decreases"]

B&J: Parameter Estimation
• The most common practice is to use numerical methods to estimate the parameters by finding the maximum likelihood estimator.
To assess which set of parameters fits best, the most common practice is to minimise an information criterion, one of:
  o Akaike information criterion (AIC)
  o Akaike information criterion corrected (AICc); this one tends to be preferred because it corrects AIC's bias in small samples
  o Bayesian information criterion (BIC)

B&J: Model Checking
• Once the model is built we need to make sure that its errors are Gaussian white noise (i.e. independent and normally distributed). If they are not, the errors may have a SARIMA structure themselves, so the most common practice is to apply the same methodology (Box-Jenkins) again to the errors of the series
• To assess whether the errors are uncorrelated you can use, among others, the Ljung-Box test (LB) or the Durbin-Watson test (DW); you can also check it by plotting the ACF/PACF of the errors (if they are white noise, nearly all the values of the ACF/PACF at non-zero lags will be inside the confidence band). Normality can be checked separately, e.g. with a normal Q-Q plot

Demo time!

Appendix
• Box-Cox transformation (for trying to turn POSITIVE time series that are not variance-stationary into variance-stationary ones)
• Yeo-Johnson transformation (for trying to turn time series that are not variance-stationary, positive or not, into variance-stationary ones)
• The difference filter
  o The difference operator (or order 1 difference filter) is Yt = Xt – Xt-1
  o In general an order r difference filter can be calculated recursively as a sequence of r difference operators
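
The transformations named in the appendix have well-known closed forms. The block below writes out the usual textbook definitions; treating them as what the slides showed is an assumption, and the symbols λ (transformation parameter), ∇ (difference operator) and ∇_S (seasonal difference at lag S) are notation introduced here.

```latex
% Box-Cox transformation, defined for X_t > 0
Y_t =
\begin{cases}
  \dfrac{X_t^{\lambda} - 1}{\lambda}, & \lambda \neq 0 \\[6pt]
  \log X_t, & \lambda = 0
\end{cases}

% Yeo-Johnson transformation, defined for any real X_t
Y_t =
\begin{cases}
  \dfrac{(X_t + 1)^{\lambda} - 1}{\lambda}, & \lambda \neq 0,\ X_t \geq 0 \\[6pt]
  \log(X_t + 1), & \lambda = 0,\ X_t \geq 0 \\[6pt]
  -\dfrac{(1 - X_t)^{2 - \lambda} - 1}{2 - \lambda}, & \lambda \neq 2,\ X_t < 0 \\[6pt]
  -\log(1 - X_t), & \lambda = 2,\ X_t < 0
\end{cases}

% Difference operator, its recursive order-r version, and the seasonal difference
\nabla X_t = X_t - X_{t-1}, \qquad
\nabla^{r} X_t = \nabla\!\left(\nabla^{r-1} X_t\right), \qquad
\nabla_{S} X_t = X_t - X_{t-S}
```

For example, the order 2 filter expands to ∇²Xt = Xt – 2Xt-1 + Xt-2, which is the order 1 filter applied twice.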
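
The three Box-Jenkins stages can be sketched in a few lines of base R. This is a minimal illustration rather than the demo given in the talk: the choice of the built-in AirPassengers series, the log transform (Box-Cox with λ = 0), the three candidate orders and the fixed seasonal part (0,1,1) with period 12 are all assumptions made for the example.

```r
# Box-Jenkins sketch on the built-in AirPassengers series (monthly, period 12).
# Dataset, candidate orders and seasonal part are illustrative assumptions.
y <- log(AirPassengers)             # Box-Cox with lambda = 0 to stabilise the variance

# Model selection: seasonal (lag 12) difference, then an order 1 difference,
# and the ACF/PACF of the differenced series to suggest p, q, P and Q.
y_diff    <- diff(diff(y, lag = 12))
acf_vals  <- acf(y_diff, lag.max = 36, plot = FALSE)
pacf_vals <- pacf(y_diff, lag.max = 36, plot = FALSE)

# Parameter estimation: fit each hypothetical candidate (p, d, q) numerically
# and keep the fit with the smallest AIC.
candidates <- list(c(0, 1, 1), c(1, 1, 0), c(1, 1, 1))
fits <- lapply(candidates, function(ord) {
  arima(y, order = ord, seasonal = list(order = c(0, 1, 1), period = 12))
})
aics <- sapply(fits, AIC)
best <- fits[[which.min(aics)]]

# Model checking: Ljung-Box test for leftover autocorrelation in the residuals;
# fitdf removes one degree of freedom per estimated coefficient.
lb <- Box.test(residuals(best), lag = 24, type = "Ljung-Box",
               fitdf = length(coef(best)))

# Forecasting: 12 months ahead on the log scale, then a rough back-transform.
fc <- predict(best, n.ahead = 12)
forecast_passengers <- exp(fc$pred)
```

A large Ljung-Box p-value (`lb$p.value`) means there is no evidence of remaining autocorrelation; calling `acf(residuals(best))` without `plot = FALSE` draws the same check as a plot with its confidence band.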