ESSE 4020 / ESS 5020
Time Series and Spectral Analysis
Nov 28 2016
More remarks on models and AR forecasting.
Good material from: Time-series forecasting / Chris Chatfield, 2000
http://evalenzu.mat.utfsm.cl/Docencia/2014/Primer%20semestre/Series%20de%20Tiempo%20I/Apunte%202.pdf
3 Univariate Time-Series Modelling
3.1 ARIMA models and related topics
3.2 State space models
3.3 Growth curve models
3.4 Non-linear models
3.5 Time-series model building
4 Univariate Forecasting Methods
4.1 The prediction problem
4.2 Model-based forecasting
4.3 Ad hoc forecasting methods
4.4 Some interrelationships and combinations
3.1.3 ARMA and AR processes
A mixed autoregressive moving average model with p autoregressive terms and
q moving average terms is abbreviated ARMA(p, q) and may be written as
φ(B)Xt = θ(B)Zt    (3.1.11)
where φ(B), θ(B) are polynomials in B of finite order p, q, respectively, and {Zt} denotes a purely random process with zero mean and variance σ².
A time series {Xt} is said to be an autoregressive process of order p (AR(p)) if it is a weighted linear sum of the past p values, so θ(B) = 1. Here B is the backward shift operator, BXt = Xt−1. Note that
φ(B) = 1 − φ1B − φ2B^2 − ... − φpB^p is a polynomial in B of order p.
Personally, and for the type of data that I look at, modelling and forecasting with AR models seems the most natural approach. We will have sample values of Xt but not usually of Zt.
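As a quick illustration (not in the original notes), here is a minimal MATLAB sketch that simulates an AR(2) process of this form using filter; the coefficient values and series length are arbitrary choices for demonstration.

% Simulate an AR(2) process  X(t) = phi1*X(t-1) + phi2*X(t-2) + Z(t)
% (illustrative coefficients only, not fitted to any data)
phi   = [0.9 -0.2];            % AR coefficients phi1, phi2
sigma = 1;                     % white-noise standard deviation
N     = 720;                   % e.g. one November of hourly values
Z = sigma*randn(N,1);          % purely random process {Z(t)}
X = filter(1, [1 -phi], Z);    % phi(B)X(t) = Z(t)
plot(X); xlabel('t'); ylabel('X(t)')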
Power Demand data, from IESO
http://www.ieso.ca/Pages/Power-Data/default.aspx
2.1 Demand Forecasting System
Ontario Demand is the sum of coincident loads plus the losses on the IESO-controlled grid. Ontario
Demand is calculated by taking the sum of injections by registered generators, plus the imports into
Ontario, minus the exports from Ontario. Ontario Demand does not include loads that are supplied by
generation not participating in the market (embedded generation).
The IESO forecasting system uses multivariate econometric equations to estimate the relationships
between electricity demand and a number of drivers. These drivers include weather effects, economic
and demographic data, calendar variables, conservation and embedded generation. Using regression
techniques, the model estimates the relationship between these factors and energy and peak demand.
Calibration routines within the system ensure the integrity of the forecast with respect to energy and
peak demand, and zonal and system wide projections.
We produce a forecast of hourly demand by zone. From this forecast the following information is
available:
Hourly peak demand
Hourly minimum demand
Hourly coincident and non-coincident peak demand by zone
Energy demand by zone
These forecasts are generated based on a set of assumptions for the various model drivers. We use a
number of different weather scenarios to forecast demand. The appropriate weather scenarios are
determined by the purpose and underlying assumptions of the analysis. An explanation of the
weather scenarios follows in section 2.3.
2.2 Demand Forecast Drivers
Consumption of electricity is modelled using six sets of forecast drivers: calendar variables, weather
effects, economic and demographic conditions, load modifiers (time of use and critical peak pricing),
conservation impacts and embedded generation output. Each of these drivers plays a role in shaping the
results.
Calendar variables include the day of the week and holidays, both of which impact energy consumption.
Electricity consumption is higher during the week than on weekends and there is a pattern determined by
the day of the week. Much like weekends, holidays have lower energy consumption as fewer
businesses and facilities are operating. Hours of daylight are instrumental in shaping the demand profile
through lighting load. This is particularly important in the winter when sunset coincides with increases in
load associated with cooking load and return to home activities. Hours of daylight are included with
calendar variables.
Weather effects include temperature, cloud cover, wind speed and dew point (humidity). Both energy
and peak demand are weather sensitive. The length and severity of a season’s weather contributes to the
level of energy consumed. Weather effects over a longer time frame tend to be offsetting, resulting in a muted impact. Acute weather conditions underpin peak demands.
For the Ontario Demand forecast, weather is not forecasted but weather scenarios based on historical data
are used in place of a weather forecast. Load Forecast Uncertainty (LFU) is used as a measure of the
variation in demand due to weather volatility. For resource adequacy assessments a Monthly Normal
weather forecast is used in conjunction with LFU to consider a full range of peak demands that can occur
under various weather conditions with a varying probability of occurrence. This is discussed further in
Section 2.3.
Economic and demographic conditions contribute to growth in both peak and energy demand. An
economic forecast is required to produce the demand forecast. We use a consensus of four major,
publicly available provincial forecasts to generate the economic drivers used in the model. Additionally,
we purchase forecast data from several service providers to enable further analysis and insight.
Population projections, labour market drivers and industrial indicators are utilized to generate the
forecast of demand.
Population projections are based on the Ministry of Finance’s Ontario Population Projections.
Conservation acts to reduce the need for electricity at the end-user. The IESO includes demand
reductions due to energy efficiency, fuel switching and conservation behaviour under the category of
conservation. Information on programs' targets and impacts, both past and future, is incorporated into the demand forecast.
Embedded generation reduces the need for grid supplied electricity by generating electricity on the
distribution system. Since the majority of embedded generation is solar powered, embedded generation
is divided into two separate components – solar and non-solar. Non-solar embedded generation includes
generation fuelled by biogas and natural gas, hydroelectric power and wind. Contract information is
used to estimate both the historical and future output of embedded generation. This information is
incorporated into the demand model.
Load modifiers account for the impact of prices. The Industrial Conservation Initiative (ICI) and time of
use prices (TOU) put downward pressure on demand during peak demand periods. These impacts are
incorporated into the model.
Suppose we consider the hourly demand data for the Novembers of 2002-2015 (30*24*14 = 720*14 values).
[Figure: MATLAB plot of the hourly November demand data (scale x 10^4 MW, roughly 1.0-2.4, over 720 hours).]
[Figure: 2002 data plotted by ITSM (roughly 12000-22000 MW).]
[Figure: MATLAB plot of the 14-year average of hourly demand (roughly 1.2-1.9 x 10^4 MW over 720 hours).]
% PROGRAM NOVPWR, Peter Taylor
clc;
% Hourly November demands: 720 hours (rows) x 14 years (columns)
X = csvread('E:\4020-2016\IESO\NovHourlyDemands_2002-2015.csv',2,2);
[m,n] = size(X);
AVGYRS = sum(X')/n;            % average over the 14 years (1 x 720)
plot(X); figure; plot(AVGYRS,'r');
AVGYRSP = AVGYRS';             % column vector for export
csvwrite('E:\4020-2016\IESO\NHDAVGT.csv',AVGYRSP)
[Figure: ITSM2000 plot of the 14-year average (roughly 13000-20000 MW over 720 hours).]
[Figure: Nov 15 - Average (hourly values); differences roughly 0 to -4000 MW over the month.]
==========
ITSM::(INFO)
==========
# of Data Points = 720
Sample Mean = -.1561E+04: Sample Variance = .537490E+06
Std.Error(Sample Mean) = 96.885217
Polynomial Fit:
X(t) = .29050 * t - .16656E+04
Slow increase through the month 0.29 MW/hr (208 MW over month)
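A minimal MATLAB sketch of the same calculation, assuming the variables X and AVGYRS from NOVPWR are in the workspace and that the 2015 series is the last column of X; the slope should come out near the 0.29 MW/hr reported above.

% Difference between Nov 2015 and the 14-year average, then a linear trend fit.
% Assumes X (720 x 14) and AVGYRS (1 x 720) from NOVPWR are in the workspace.
D = X(:,end) - AVGYRS(:);        % "Nov 15 - Average" (assumes 2015 is the last column)
t = (1:numel(D))';
p = polyfit(t, D, 1);            % p(1) = slope (MW/hr), p(2) = intercept
R = D - polyval(p, t);           % mean and trend removed
fprintf('slope = %.4f MW/hr, rise over the month = %.0f MW\n', p(1), p(1)*numel(D));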
[Figure: the Nov 15 - Average series with mean and trend removed (roughly ±2500 MW over 720 hours).]
With mean and trend removed: # of Data Points = 720
Sample Mean = .1029E-11; Sample Variance = .533844E+06; Std.Error(Sample Mean) = 96.534548
There may be some weekly signal left, since the averages are based on calendar dates in the month rather than on the day of the week.
[Figure: Sample ACF and Sample PACF of the detrended series, lags 0-40.]
Sample Partial Autocorrelations:
1.0000  .9559  -.5669  .0159  .0392  .0050  .1032  .0359  -.0399  .0746  ..........
Sample Autocorrelations: Sample Variance = .53384447E+06
1.0000  .9559  .8649  .7596  .6545  .5572  .4731  .4035  .3485  .3102  ...................
There is some concern over multiple contributions and the 24-hour peak in the ACF plot, but the PACF is encouraging.
Where to from here? We could try to separate the diurnal cycle and trend. We can look at the periodogram and also try to build a prediction model (to predict a few hours ahead), though the IESO predicts weeks ahead by other means.
Periodogram:
[Figure: Periodogram/2π of the detrended series, frequencies 0 to 0.5, amplitudes up to about 5x10^6.]
===============
ITSM::(SPECTRUM)
===============
Number of frequencies in periodogram = 360
Fundamental Fourier frequency 2*pi/n = .0087266 (1 cycle per month)
The peak at 0.26 corresponds approximately to 2π/24, i.e. one cycle per day, since the basic time unit is the hour. One cycle per week has frequency 2π/(24x7) = 0.0374, where there is a peak, and a harmonic at twice that frequency.
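A minimal MATLAB sketch of the same periodogram calculation, assuming the detrended series is in a column vector R (as in the sketch above); the printed reference frequencies are the diurnal and weekly cycles discussed here.

% Periodogram / 2*pi of the detrended series R (hourly data, frequency in rad/hr)
n = numel(R);
I = abs(fft(R - mean(R))).^2 / (2*pi*n);   % periodogram divided by 2*pi
w = 2*pi*(0:n-1)'/n;                       % Fourier frequencies, radians per hour
half = 2:floor(n/2)+1;                     % keep positive frequencies only
plot(w(half), I(half)); xlabel('frequency (rad/hr)'); ylabel('periodogram / 2\pi')
fprintf('diurnal frequency 2*pi/24  = %.4f rad/hr\n', 2*pi/24)
fprintf('weekly  frequency 2*pi/168 = %.4f rad/hr\n', 2*pi/168)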
A Prediction model
Method: Maximum Likelihood
ARMA Model: AR(3)
X(t) = 1.595 X(t-1) - .7935 X(t-2) + .1398 X(t-3) + Z(t)
WN Variance = .292424E+05 SD = 171.0
Method: Maximum Likelihood
ARMA Model: AR(2)
X(t) = 1.513 X(t-1) - .5815 X(t-2) + Z(t)
WN Variance = .298242E+05
AR Coefficients: 1.513121  -.581454
ARMA Model: AR(1)
X(t) = .9562 X(t-1) + Z(t)
WN Variance = .451282E+05
AR Coefficients: .956214
=================================
ITSM::(Preliminary estimates)
=================================
Method: Yule-Walker
ARMA Model: AR(5)
X(t) = 1.561 X(t-1) - .7529 X(t-2) + .1772 X(t-3) - .06474 X(t-4)
+ .01591 X(t-5) + Z(t)
WN Variance = .291841E+05
AR Coefficients: 1.561037  -.752851  .177247  -.064739  .015906
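As a rough cross-check of these preliminary estimates (a sketch only, not the ITSM algorithm), the Yule-Walker equations can be solved directly from the sample autocorrelations in MATLAB; rho is assumed to be a vector holding the sample autocorrelations at lags 0, 1, 2, ..., and gamma0 the sample variance.

% Yule-Walker estimates for an AR(p) model from sample autocorrelations.
% Assumes rho = sample autocorrelations at lags 0,1,2,... and gamma0 = sample variance.
rho = rho(:);                       % force column vector
p = 3;                              % AR order to fit
Rm  = toeplitz(rho(1:p));           % p x p matrix with entries rho(|i-j|)
r   = rho(2:p+1);                   % right-hand side: rho(1),...,rho(p)
phi = Rm \ r;                       % AR coefficients phi_1,...,phi_p
sigma2 = gamma0 * (1 - phi'*r);     % estimated white-noise variance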
Let's try the model to predict 1 hour ahead:
X(t) = 1.595 X(t-1) - .7935 X(t-2) + .1398 X(t-3)
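A minimal sketch of the one-hour-ahead prediction with this AR(3) model, assuming x holds the series the model was fitted to (the Nov 15 - Average values with mean and trend removed); each predicted value uses only values from previous hours.

% One-hour-ahead prediction with the fitted AR(3) model.
% Assumes x is the series the model was fitted to (mean and trend removed).
phi = [1.595 -0.7935 0.1398];                  % AR(3) coefficients from ITSM
xhat = nan(size(x));
for t = 4:numel(x)
    xhat(t) = phi(1)*x(t-1) + phi(2)*x(t-2) + phi(3)*x(t-3);
end
plot(1:numel(x), x, 'k', 1:numel(x), xhat, 'r')
xlabel('Time EST (hours into November)')

To compare against actual demand in MW, the removed mean, trend and base pattern have to be added back to both series.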
[Figure: Hourly Electricity Demand (MW) vs Time EST (hours into November), full month. Black is actual, red is the 1 hr prediction.]
[Figure: Hourly Electricity Demand (MW) vs Time EST (hours into November), first 40 hours.]
PnXn+h approach (see Notes 2)
Suppose we consider forecasting several hours (h) ahead, based on a small
number (n) of previous values, but have a long record from which we can
determine mean and autocovariances.
Is the time series stationary? Should we remove trend and seasonality, then add them back in for the forecast values? The basic data are clearly not stationary.
[Figure: ITSM plot of the hourly demand series, roughly 11000-20000 MW over 720 hours.]
Use "classical fit" with 168 hr (7 day) seasonal cycle and linear trend.
Polynomial Fit:
X(t) = 2.2694 * t + .14038E+05
Seasonal fit of period = 168
Seasonal components: 1) -.22297E+04, 2) -.27015E+04, 3) -.29574E+04 ...
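A minimal MATLAB sketch of a classical fit of this kind (linear trend plus a period-168 seasonal component estimated by averaging), assuming the hourly series is in a column vector y; ITSM's algorithm differs in detail, so the numbers will not match exactly.

% Classical decomposition: linear trend + period-168 (weekly) seasonal component.
% Assumes y is the hourly demand series (720 x 1 column vector).
d = 168;  n = numel(y);  t = (1:n)';
p = polyfit(t, y, 1);                 % linear trend
trend = polyval(p, t);
resid = y - trend;
s = zeros(d,1);
for k = 1:d
    s(k) = mean(resid(k:d:n));        % average residual at each position in the weekly cycle
end
s = s - mean(s);                      % make the seasonal components sum to zero
seasonal = s(mod(t-1,d) + 1);
remainder = y - trend - seasonal;     % remaining signal to be modelled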
[Figure: Hourly Electricity Demand (MW) vs Time EST (hours into November). Graph check of data: black - data, red - seasonal fit + trend.]
[Figure: remaining signal after removing the trend and seasonal fit, roughly ±1200 MW over 720 hours.]
Remaining signal: # of Data Points = 720; Sample Mean = .6384E-11
Sample Variance = .141874E+06; Std.Error(Sample Mean) = 57.598617
[Figure: Periodogram/2π of the remaining signal, frequencies 0 to 1.0, amplitudes up to about 2x10^6.]
The periodogram shows some low-frequency components, then small amplitudes.
[Figure: Sample ACF and Sample PACF of the remaining signal, lags 0-40.]
The ACF and PACF are similar to those found earlier with a different base pattern (the 14-year average).
===============
ITSM::(ACF/PACF)
===============
# of Lags = 40
Sample Autocorrelations: Sample Variance = .14187437E+06 (= 141874.4)
1.0000  .9542  .8849  .8112  .7433  .6811  .6330  .5960  .5684  .5515
.5443  .5349  .5188  .5001  .4855  .4757  .4755  .4784  .4830  .4845
.4845  .4833  .4801  .4715  .4584  .4343  .4106  .3894  .3696  .3511
.3363  .3209  .3086  .3028  .2962  .2872  .2752  .2633  .2497  .2347
Sample Partial Autocorrelations:
1.0000  .9542  -.2851  -.0224  .0384  -.0076  .1068  .0296  .0404  .0824
.0653  -.0569  -.0396  .0263  .0747  .0526  .0977  -.0019  .0285  -.0117
.0086  .0284  .0171  -.0170  -.0086  -.1146  .0450  .0013  -.0312  .0120
.0093  -.0404  .0230  .0425  -.0692  .0036  -.0174  -.0048  -.0332  -.0271
The PACF: http://www.itl.nist.gov/div898/handbook/pmc/section4/pmc4463.htm
6.4.4.6.3. Partial Autocorrelation Plot
Purpose: Model Identification for Box-Jenkins Models
Partial autocorrelation plots (Box and Jenkins, pp. 64-65, 1970) are a commonly used tool for model identification in Box-Jenkins models.
The partial autocorrelation at lag k is the autocorrelation
between Xt and Xt−k that is not accounted for by lags 1
through k−1.
There are algorithms, not discussed here, for computing the
partial autocorrelation based on the sample autocorrelations.
See (Box, Jenkins, and Reinsel 1970) or (Brockwell, 1991)
for the mathematical details.
Specifically, partial autocorrelations are useful in identifying
the order of an autoregressive model. The partial
autocorrelation of an AR(p) process is zero at lag p+1 and
greater. If the sample autocorrelation plot indicates that an
AR model may be appropriate, then the sample partial
autocorrelation plot is examined to help identify the order.
We look for the point on the plot where the partial
autocorrelations essentially become zero. Placing a 95 %
confidence interval for statistical significance is helpful for
this purpose. The approximate 95 % confidence intervals for the partial autocorrelations are at ±2/√N.
The partial autocorrelation plot can help provide answers to the following questions:
1. Is an AR model appropriate for the data?
2. If an AR model is appropriate, what order should we use?
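A minimal MATLAB sketch of this identification step, assuming rho holds the sample autocorrelations at lags 0, 1, 2, ... of the remaining signal and N = 720; the PACF at lag k is taken as the last coefficient of a Yule-Walker AR(k) fit, and lags exceeding the ±2/√N bound are flagged.

% Sample PACF via Yule-Walker fits of increasing order, with the 2/sqrt(N) bound.
% Assumes rho = sample autocorrelations at lags 0,1,2,...,maxlag; N = series length.
rho = rho(:);  N = 720;  maxlag = 40;
pacf = zeros(maxlag,1);
for k = 1:maxlag
    phi = toeplitz(rho(1:k)) \ rho(2:k+1);   % AR(k) Yule-Walker coefficients
    pacf(k) = phi(end);                      % PACF at lag k = last coefficient
end
bound = 2/sqrt(N);                           % approximate 95% significance bound
significant = find(abs(pacf) > bound)        % lags with PACF outside the bound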
Based on the partial autocorrelations we will take n = 2.
We then use P2 X2+h = a1 X2 + a2 X1, noting that the ai values will change with h.
The ai values are solutions of the matrix equation Γ2 a = γ2(h) (see Notes 2), where Γ2 is the 2x2 matrix of autocovariances and γ2(h) = (γ(h), γ(h+1))'. The γ values are autocovariances = autocorrelations * variance, but the variance will cancel on both sides of the equation, so we can use autocorrelations. With that normalization, γ(0) = 1.
The autocorrelations we are concerned with, if we want to go 4 hours into the future, are γ(0) to γ(5), with values
1.0000  0.9542  0.8849  0.8112  0.7433  0.6811
The 2x2 Γ matrix has diagonal entries 1 and off-diagonal entries 0.9542, while for h = 1, γ2(1) = (0.9542, 0.8849)'. We could write MATLAB code, but by hand

Γ^(-1) = 11.173 * |  1       -0.9542 |
                  | -0.9542   1      |

and a = (1.2271, -0.286)' for h = 1.
ITSM2000 autofit gives a best-fit model (Maximum Likelihood) as
X(t) = 1.225 X(t-1) - .2845 X(t-2) + Z(t) with WN Variance = .116642E+05
while the Yule-Walker method gives
X(t) = 1.226 X(t-1) - .2851 X(t-2) + Z(t) with WN Variance = .116641E+05,
and Least Squares gives similar values.
Being a little more careful ....... MATLAB gives
GAMMA =
    1.0000    0.9542
    0.9542    1.0000
GAMMAI =
   11.1729  -10.6612
  -10.6612   11.1729
a(n) for h = 1, 2, 3, 4:
   1.2271   -0.2860    (same as computed by hand)
   1.2385   -0.3706
   1.1390   -0.3435
   1.0435   -0.3146
The pattern of coefficients seems odd, but ... maybe we should not use h > n?
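A minimal MATLAB sketch of the calculation behind this table, assuming rho holds the sample autocorrelations of the remaining signal at lags 0, 1, 2, ...; the coefficients give the residual forecast P2 Xt+h = a1 Xt + a2 Xt-1, and the trend and seasonal fit must be added back to get a demand forecast in MW.

% h-step prediction coefficients for n = 2 (PnXn+h approach).
% Assumes rho = sample autocorrelations of the remaining signal at lags 0,1,2,...
rho = rho(:);
GAMMA  = toeplitz(rho(1:2));           % [1 rho(1); rho(1) 1]
GAMMAI = inv(GAMMA);
for h = 1:4
    g = [rho(1+h); rho(2+h)];          % (rho(h), rho(h+1))'
    a = GAMMAI * g;                    % prediction coefficients (a1, a2)
    fprintf('h = %d:  a1 = %.4f  a2 = %.4f\n', h, a(1), a(2));
end
% Residual forecast h hours beyond time t:  xhat = a(1)*x(t) + a(2)*x(t-1);
% add the trend and seasonal fit back to obtain a demand forecast in MW.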
[Figure: Hourly Electricity Demand (MW) vs Time EST (hours into November), first 100 hours. Curves: 2015 data, seasonal+trend, 1 hr forecast, 2 hr forecast.]
[Figure: Hourly Electricity Demand (MW) vs Time EST (hours into November), first 100 hours. Curves: 2015 data, seasonal+trend, 1 hr forecast, 2 hr forecast, 4 hr forecast.]