TIME SERIES MODELLING AND FORECASTING SARIMA

Time Series introduction in R - Iñaki Puigdollers
10th of March 2016
TIME SERIES MODELLING AND FORECASTING
Quick Overview
• A time series is:
o Sequence of observations
o Generated from a stochastic process (random variables)
o Ordered in time
• Model a time series: find a mathematical expression for a time series
• Time series forecasting: use a model to predict its future values
• Examples of models:
o Linear Regression
o SARIMA
Linear regression
• Simple linear regression
o The equation has four labelled parts: the dependent (response) variable, the regression coefficient, the independent variable (regressor) and the error term
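The roles of the four parts can be illustrated with a small least-squares fit. This is only a sketch in Python (the slides themselves contain no code), it assumes the model also has an intercept, which the slide labels do not mention, and the data values are made up for illustration.

```python
def fit_simple_regression(x, y):
    """Least-squares estimates (intercept, slope) for y = a + b*x + error.

    The intercept `a` is an assumption of this sketch.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx                      # regression coefficient
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical data: x is the regressor, y the response.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

intercept, slope = fit_simple_regression(x, y)

# The residuals are the estimated error terms.
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
print("intercept:", round(intercept, 4), "slope:", round(slope, 4))
```

With an intercept in the model, the least-squares residuals sum to zero up to rounding, which is a quick sanity check on the fit.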
• Multiple linear regression
• Smaller n’s, i.e. fewer regressors, are preferable (parsimony principle, a.k.a. Occam’s razor)
• Assumptions:
o Linear relation between Y and X (in case of multiple linear regression with each Xi)
o In multiple linear regression, the regressors (Xi) are pairwise independent
o The errors (𝜀t) are Gaussian white noise: independent and normally distributed with zero mean and constant variance
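SARIMA (Seasonal Autoregressive Integrated Moving Average)
• SARIMA (p,d,q)x(P,D,Q)s:
o Non-seasonal component (p,d,q): autoregressive order p, differencing order d, moving average order q (the moving average acts on the error terms)
o Seasonal component (P,D,Q): seasonal autoregressive order P, seasonal differencing order D, seasonal moving average order Q
o s: season period (e.g. 12 for a yearly pattern in monthly data, 7 for a weekly pattern in daily data, …)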
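For reference, a SARIMA model is commonly written in backshift notation as below. This is the standard textbook form rather than something recovered from the slide. The symbols are assumptions of this sketch: B is the backshift operator (B X_t = X_{t-1}), the lower-case polynomials have degrees p and q, the upper-case ones have degrees P and Q in B^s, and ε_t is the error term.

```latex
\phi_p(B)\,\Phi_P(B^{s})\,(1-B)^{d}\,(1-B^{s})^{D}\,X_t
  \;=\; \theta_q(B)\,\Theta_Q(B^{s})\,\varepsilon_t
```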
SARIMA
• Assumptions:
o Weak (2nd order) stationarity*: the behaviour of the time series does not vary over time
o Errors are Gaussian white noise: independent and normally distributed
• *Stationarity:
o A time series is called (strictly) stationary if its joint distribution, and hence all of its moments, does not change over time
o A time series is called 2nd order stationary if its mean (1st order moment) and variance (2nd order moment) are constant over time and its autocovariance depends only on the lag
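A rough, informal way to see what "constant mean and variance" means is to compare the two halves of a series. The sketch below is not a formal test. The helper name `half_moments`, the seed and the simulated series are all assumptions made for illustration.

```python
import random
from statistics import mean, pvariance

def half_moments(series):
    """Return ((mean, variance) of first half, (mean, variance) of second half)."""
    mid = len(series) // 2
    first, second = series[:mid], series[mid:]
    return (mean(first), pvariance(first)), (mean(second), pvariance(second))

rng = random.Random(0)  # arbitrary fixed seed so the run is repeatable
noise = [rng.gauss(0.0, 1.0) for _ in range(2000)]

# White noise is stationary by construction.
# A random walk (cumulative sum of the same noise) is not.
walk = []
level = 0.0
for e in noise:
    level += e
    walk.append(level)

print("white noise:", half_moments(noise))
print("random walk:", half_moments(walk))
```

For the white noise the two halves should give similar means and variances, while for the random walk they typically differ noticeably.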
Box & Jenkins Methodology
• A methodology to model SARIMA time series
• The methodology consists of 3 stages:
o Model selection
o Parameter estimation
o Model Checking
B&J: Model Selection
• Detect stationarity
o To detect mean stationarity: use a stationarity test such as Kwiatkowski-Phillips-Schmidt-Shin (KPSS) or Augmented Dickey-Fuller (ADF), inspect the autocorrelation function (ACF), or simply look at the plot of the data
▪ Quite often you can make a time series stationary by applying a difference filter of order 1, i.e. to every point Xt in the time series apply the following transformation: Yt = Xt – Xt-1
o To detect variance stationarity: use the Priestley-Subba Rao test (PSR) or the ACF, or look at the plot of the data
▪ You can try to achieve stationarity in variance by applying the Box-Cox transformation to the series (for positive time series) or the Yeo-Johnson transformation (for non-positive time series). Please check the appendix for the closed-form formulae.
Remember: if you can’t make the time series stationary, it can’t be modelled as a SARIMA!
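The effect of the order 1 difference filter on the ACF can be sketched as follows. The helper names, the seed and the simulated random walk with drift are assumptions chosen for illustration. A real analysis would use a statistics package and the formal tests named above.

```python
import random

def sample_acf(series, max_lag):
    """Sample autocorrelations for lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    c0 = sum((v - mean) ** 2 for v in series)
    return [
        sum((series[t] - mean) * (series[t - k] - mean) for t in range(k, n)) / c0
        for k in range(max_lag + 1)
    ]

def difference(series):
    """Order 1 difference filter: y[t] = x[t] - x[t-1]."""
    return [series[t] - series[t - 1] for t in range(1, len(series))]

# Hypothetical data: a random walk with drift, non-stationary in the mean.
rng = random.Random(42)  # arbitrary fixed seed
x = [0.0]
for _ in range(499):
    x.append(x[-1] + 0.5 + rng.gauss(0.0, 1.0))

y = difference(x)

print("ACF of x, lags 1-3:", [round(r, 2) for r in sample_acf(x, 3)[1:]])
print("ACF of y, lags 1-3:", [round(r, 2) for r in sample_acf(y, 3)[1:]])
```

An ACF that stays close to 1 and decays very slowly, as for x here, is the usual visual hint of non-stationarity in the mean. After differencing, the autocorrelations should be close to zero.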
Stationarity?
[Figure: example plots of a stationary time series and a non-stationary time series]
B&J: Model Selection
• Detect seasonality
o Use ACF and partial autocorrelation function (PACF) to detect if there is any seasonal pattern
▪ If there is seasonality you have to seasonally difference the time series before modelling it, i.e. apply a difference at lag S: Yt = Xt – Xt-S, where S is the period of the seasonality (e.g. for quarterly data with a yearly pattern, S = 4). See the appendix for a closed-form formula.
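Seasonal differencing is read here as a single difference at lag S, Yt = Xt – Xt-S. A minimal sketch under that reading, with a made-up quarterly series (the function name and the data are illustrative assumptions):

```python
def seasonal_difference(series, period):
    """Lag-`period` difference: y[t] = x[t] - x[t - period]."""
    if period < 1 or period >= len(series):
        raise ValueError("period must be between 1 and len(series) - 1")
    return [series[t] - series[t - period] for t in range(period, len(series))]

# Hypothetical quarterly data: a pattern repeating every 4 points plus a linear trend.
pattern = [10.0, 20.0, 15.0, 5.0]
x = [pattern[t % 4] + 2.0 * t for t in range(12)]

y = seasonal_difference(x, 4)
print(y)  # [8.0, 8.0, 8.0, 8.0, 8.0, 8.0, 8.0, 8.0]
```

The repeating pattern cancels out and only the trend increment over one season (2.0 × 4) remains, so the seasonally differenced series is constant in this toy case.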
Seasonality?
[Figure: example plots of a seasonal time series and a non-seasonal time series]
B&J: Model Selection
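• Identify the parameters p & q from the ACF and PACF (same for P & Q but at seasonal level)
o AR(p): the ACF exponentially decreases; the PACF cuts off after lag p
o MA(q): the ACF cuts off after lag q; the PACF exponentially decreases
o ARMA(p,q): both the ACF and the PACF exponentially decrease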
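The usual rule of thumb is that a pure AR(p) process has an exponentially decreasing ACF and a PACF that cuts off after lag p, and the reverse holds for a pure MA(q) process. The sketch below checks this on theoretical autocorrelations, using the Durbin-Levinson recursion to obtain the PACF. The coefficient values 0.6 and 0.5 and the function name are illustrative assumptions. The MA(1) is written as X_t = ε_t + θ ε_{t-1}.

```python
def pacf_from_acf(rho, max_lag):
    """Durbin-Levinson recursion: PACF values for lags 1..max_lag.

    `rho` is a list of autocorrelations with rho[0] == 1.
    """
    pacf = []
    prev = []  # coefficients phi_{k-1,1}, ..., phi_{k-1,k-1}
    for k in range(1, max_lag + 1):
        num = rho[k] - sum(prev[j] * rho[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(prev[j] * rho[j + 1] for j in range(k - 1))
        phi_kk = num / den
        prev = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)] + [phi_kk]
        pacf.append(phi_kk)
    return pacf

max_lag = 5

# AR(1) with coefficient 0.6: theoretical ACF is 0.6**k (exponential decay).
ar_acf = [0.6 ** k for k in range(max_lag + 1)]
ar_pacf = pacf_from_acf(ar_acf, max_lag)

# MA(1) with theta = 0.5: ACF is theta / (1 + theta**2) at lag 1, zero afterwards.
theta = 0.5
ma_acf = [1.0, theta / (1 + theta ** 2)] + [0.0] * (max_lag - 1)
ma_pacf = pacf_from_acf(ma_acf, max_lag)

print("AR(1) PACF:", [round(v, 4) for v in ar_pacf])
print("MA(1) PACF:", [round(v, 4) for v in ma_pacf])
```

The AR(1) PACF is 0.6 at lag 1 and essentially zero afterwards, while the MA(1) PACF shrinks gradually with alternating sign. With real data the same reading is done on sample ACF/PACF plots, where the cut-offs are only approximate.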
B&J: Parameter Estimation
• The most common practice is to estimate the parameters numerically by maximum likelihood. To assess which candidate model fits best, the most common practice is to minimise an information criterion, either:
o Akaike information criterion (AIC)
o Akaike information criterion corrected (AICc); this one tends to be preferred as it corrects the AIC for small sample sizes
o Bayesian information criterion (BIC)
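The three criteria can be computed from the maximised log-likelihood, the number of estimated parameters k and the sample size n. The sketch below uses the standard definitions. The log-likelihood values and model labels are made up for illustration and do not come from any fitted model.

```python
import math

def information_criteria(log_likelihood, k, n):
    """Return AIC, AICc and BIC for a model with k parameters fitted on n points."""
    if n - k - 1 <= 0:
        raise ValueError("AICc needs n > k + 1")
    aic = 2 * k - 2 * log_likelihood
    aicc = aic + (2 * k * (k + 1)) / (n - k - 1)  # small-sample correction
    bic = k * math.log(n) - 2 * log_likelihood
    return {"AIC": aic, "AICc": aicc, "BIC": bic}

# Hypothetical candidates: (label, maximised log-likelihood, number of parameters).
candidates = [("model A", -100.0, 3), ("model B", -98.5, 6)]
n = 50

for label, loglik, k in candidates:
    print(label, information_criteria(loglik, k, n))

# Pick the candidate with the smallest AICc.
best = min(candidates, key=lambda c: information_criteria(c[1], c[2], n)["AICc"])
print("preferred by AICc:", best[0])
```

In this toy comparison the larger model has the higher likelihood, but the penalty for its extra parameters outweighs the gain, so the smaller model is preferred.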
B&J: Model Checking
• Once the model is built we need to make sure that the errors of the model are Gaussian white noise (i.e. independent and normally distributed). If they are not, the errors may have a SARIMA structure themselves, so the most common practice is to apply the same methodology (Box-Jenkins) again to the errors of the series
• To assess the independence of the errors you can use, among others, the Ljung-Box test (LB) or the Durbin-Watson test (DW); both test for autocorrelation rather than normality. You can also check it by plotting the ACF/PACF of the errors (if they are white noise, almost all the values of the ACF/PACF beyond lag 0 will be inside the confidence interval)
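The Ljung-Box statistic is Q = n(n+2) Σ r_k² / (n−k), summed over lags k = 1..h, where r_k is the lag-k sample autocorrelation of the residuals. The sketch below computes Q only. The residuals are simulated, and the function name and seed are assumptions. Turning Q into a p-value needs a chi-square distribution, which a statistics package would provide.

```python
import random

def ljung_box_q(residuals, max_lag):
    """Ljung-Box Q statistic over lags 1..max_lag."""
    n = len(residuals)
    mean = sum(residuals) / n
    c0 = sum((e - mean) ** 2 for e in residuals)
    total = 0.0
    for k in range(1, max_lag + 1):
        r_k = sum(
            (residuals[t] - mean) * (residuals[t - k] - mean) for t in range(k, n)
        ) / c0
        total += r_k ** 2 / (n - k)
    return n * (n + 2) * total

# Hypothetical residuals: simulated white noise standing in for model errors.
rng = random.Random(1)  # arbitrary fixed seed
residuals = [rng.gauss(0.0, 1.0) for _ in range(300)]

q = ljung_box_q(residuals, max_lag=10)
print("Ljung-Box Q over 10 lags:", round(q, 3))
```

Under the white-noise hypothesis Q is approximately chi-square distributed. For example, the 95% quantile with 10 degrees of freedom is about 18.31, and for residuals of a fitted model the degrees of freedom are usually reduced by the number of estimated ARMA parameters. A Q well above the relevant quantile suggests remaining autocorrelation.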
DEMO TIME !!!
Appendix
• Box-Cox transformation (to try to turn a POSITIVE time series that is not variance-stationary into a variance-stationary one)
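The slide's formula is not reproduced here, so the sketch below uses the standard one-parameter Box-Cox definition: (x^λ − 1)/λ for λ ≠ 0 and ln(x) for λ = 0. The function name and the sample values are illustrative assumptions. In practice λ is usually chosen by a statistics package.

```python
import math

def box_cox(x, lam):
    """Standard one-parameter Box-Cox transform of a strictly positive value."""
    if x <= 0:
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        return math.log(x)
    return (x ** lam - 1.0) / lam

# Hypothetical series whose spread grows with its level.
series = [1.0, 4.0, 9.0, 16.0, 25.0]

print([box_cox(v, 0.5) for v in series])            # square-root-like scale
print([round(box_cox(v, 0), 4) for v in series])    # log scale
```

With λ = 1 the transform only shifts the series by 1, so values of λ below 1 are the ones that compress large observations.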
• Yeo-Johnson transformation (to try to turn a time series that is not variance-stationary, and may take zero or negative values, into a variance-stationary one)
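As with Box-Cox, the slide's formula is not reproduced here. The sketch below follows the standard Yeo-Johnson definition, which treats non-negative and negative values with separate branches. The function name and the test values are illustrative assumptions.

```python
import math

def yeo_johnson(y, lam):
    """Standard Yeo-Johnson transform of a single value (any sign)."""
    if y >= 0:
        if lam == 0:
            return math.log1p(y)                      # ln(y + 1)
        return ((y + 1.0) ** lam - 1.0) / lam
    if lam == 2:
        return -math.log1p(-y)                        # -ln(1 - y)
    return -(((1.0 - y) ** (2.0 - lam)) - 1.0) / (2.0 - lam)

# Hypothetical values of mixed sign.
values = [-3.0, -1.0, 0.0, 1.0, 3.0]

print([yeo_johnson(v, 1.0) for v in values])              # lam = 1 leaves values unchanged
print([round(yeo_johnson(v, 0.0), 4) for v in values])
```

For non-negative values this is the Box-Cox transform applied to y + 1, which is why it extends naturally to series containing zeros.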
• The difference filter
o The difference operator (or order 1 difference filter) is ∇Xt = Xt – Xt-1
o In general an order r difference filter can be calculated recursively as a sequence of difference operators: ∇^r Xt = ∇(∇^(r-1) Xt)
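The recursive definition translates directly into code. A minimal sketch (the function name and the quadratic example series are assumptions for illustration):

```python
def difference(series, order=1):
    """Order-`order` difference filter, applied recursively one difference at a time."""
    if order < 0:
        raise ValueError("order must be non-negative")
    if order == 0:
        return list(series)
    once = [series[t] - series[t - 1] for t in range(1, len(series))]
    return difference(once, order - 1)

# Hypothetical series with a quadratic trend: 0, 1, 4, 9, ...
x = [float(t * t) for t in range(8)]

print(difference(x, 1))  # [1.0, 3.0, 5.0, 7.0, 9.0, 11.0, 13.0]
print(difference(x, 2))  # [2.0, 2.0, 2.0, 2.0, 2.0, 2.0]
```

One difference still leaves a linear trend here, and a second one removes it. Each application shortens the series by one point.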