Proceedings of Applied International Business Conference 2008
SHORT-TERM FORECASTING MODELS OF BAGAN DATOH COCOA BEAN PRICE
Assis Kamu ψ, Amran Ahmed and Remali Yusoff
Universiti Malaysia Sabah, Malaysia
Abstract
The purpose of this paper is to identify the best short-term forecasting model (12 months ahead) of the
Bagan Datoh cocoa bean price. The monthly average data of Bagan Datoh cocoa bean prices graded
SMC 1B for the period of January 1992 - December 2006 was used. Four different types of univariate
time series models were utilized, namely the exponential smoothing, autoregressive integrated moving
average (ARIMA), generalized autoregressive conditional heteroskedasticity (GARCH) and the mixed
ARIMA/GARCH models. Root mean squared error (RMSE), mean absolute percentage error (MAPE),
mean absolute error (MAE) and Theil's inequality coefficient (U-STATISTICS) were used as selection
criteria to determine the best forecasting model. This study revealed that the time series data were
influenced by a positive linear trend factor while a regression test result showed the non-existence of
seasonal factors. Moreover, based on the autocorrelation function (ACF) and the Augmented Dickey-Fuller (ADF) test, the time series data were not stationary but became stationary after first-order
differencing was carried out. Based on the results of the ex-post forecasting (starting
from January until December 2006), the GARCH(1,1) model was found to be the best short-term
forecasting model for the Bagan Datoh cocoa bean price graded SMC 1B.
Keywords: ARIMA; Exponential Smoothing; GARCH; Univariate Time Series Model.
JEL classification codes: C22; C53.
1. Introduction
Cocoa, scientifically known as Theobroma cacao L., is the third-largest agricultural commodity in
Malaysia after oil palm and rubber. Malaysia now exports cocoa products to sixty-six countries
(Report on Malaysia Plantation Industries and Commodities 2001-2005, 2006). Domestic cocoa bean
prices change over time and are very volatile (Mohammed Yusof & Mohamad Salleh, 1987;
Fatimah & Zainal Abidin, 1994). Instability of cocoa prices creates significant risks for producers,
suppliers, consumers, and other parties involved in the marketing and production of cocoa
beans, particularly in Malaysia. Under such risk and price instability, forecasting is very
important as an aid to decision making.
Fatimah and Roslan (1986) conducted the only research intended to develop a short-term
forecasting model of Malaysian cocoa bean prices. That research used monthly
prices from July 1975 until December 1984. It focused, however, only
on ARIMA models. Many other forecasting methods can be used to forecast the short-term
cocoa bean price; one of the more recently introduced is the GARCH model. Aishah and Anton
Abdulbasah (2006) developed a time series model of Malaysian palm oil prices by using ARCH
models. Liew et al. (2000) studied black pepper prices in Sarawak, using ARIMA
models for price forecasting.
2. Objectives of Research
The objective of this research is to develop a short-term forecasting model (12 months ahead) of Bagan
Datoh cocoa bean price by utilizing four different types of univariate time series models, namely the
exponential smoothing, ARIMA, GARCH, and the mixed ARIMA/GARCH models.
ψ Corresponding author. Assis Kamu, Universiti Malaysia Sabah, Locked Bag 2073, 88999 Kota
Kinabalu, Sabah, Malaysia. E-Mail: [email protected]
3. Research Methodology
The monthly Bagan Datoh cocoa bean price graded SMC 1B was used for this study. The data were
collected from the official website of the Malaysian Cocoa Board
(http://www.koko.gov.my/lkmbm/loader.cfm?page=statisticsFrm.cfm). The time series data are
measured in Ringgit Malaysia per tonne (RM/tonne) and range from January 1992
until December 2006. The coefficient of variation ($V$) was used to measure the instability of
the time series data. It is defined as

$$V = \frac{\sigma}{\bar{Y}},$$

where $\sigma$ is the standard deviation and

$$\bar{Y} = \frac{1}{n}\sum_{t=1}^{n} Y_t$$

is the mean of Bagan Datoh cocoa bean price changes. Completely stable data
have $V = 1$, but unstable data are characterized by $V > 1$ (Telesca et al., 2008).
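As a minimal sketch of the definition above, the coefficient of variation can be computed as follows. The function name and the sample values are hypothetical, and the use of the population standard deviation (dividing by n) is an assumption, since the text does not state which divisor was used.

```python
import math


def coefficient_of_variation(series):
    """Return V = sigma / mean for a sequence of observations."""
    n = len(series)
    mean = sum(series) / n
    # Population standard deviation (divides by n). Assumption: the paper
    # does not say whether n or n - 1 was used.
    sigma = math.sqrt(sum((y - mean) ** 2 for y in series) / n)
    return sigma / mean


# Hypothetical values for illustration only (not the Bagan Datoh data).
changes = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
v = coefficient_of_variation(changes)
print(round(v, 3))  # mean is 5 and sigma is 2, so V = 0.4
```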
Regression analysis was used to determine whether trend and seasonal factors exist in the time series
data. The existence of a linear trend factor was examined with the regression

$$Y = \beta_0 + \beta_1\,\mathrm{Trend} + \varepsilon, \qquad \varepsilon \sim WN(0, \sigma^2),$$

where $Y$ is the time series of the study, Trend is the linear trend factor, $\beta_0$ and $\beta_1$ are parameters, and
$\varepsilon$ is the error of the model, assumed to be white noise (WN). The hypotheses of the model are

$H_0: \beta_1 = 0$ (no linear trend factor exists)
$H_1: \beta_1 \neq 0$ (a linear trend factor exists)

With January as the base month, the existence of a seasonal factor was tested with the regression

$$Y = \beta_0 + \beta_1\,\mathrm{Trend} + \beta_2\,\mathrm{Feb} + \beta_3\,\mathrm{Mar} + \beta_4\,\mathrm{Apr} + \beta_5\,\mathrm{May} + \beta_6\,\mathrm{Jun} + \beta_7\,\mathrm{Jul} + \beta_8\,\mathrm{Aug} + \beta_9\,\mathrm{Sep} + \beta_{10}\,\mathrm{Oct} + \beta_{11}\,\mathrm{Nov} + \beta_{12}\,\mathrm{Dec} + \varepsilon,$$

and the hypotheses are defined as

$H_0: \beta_2 = \beta_3 = \beta_4 = \dots = \beta_{12} = 0$ (no seasonal factor exists)
$H_1$: at least one of $\beta_2, \beta_3, \dots, \beta_{12} \neq 0$ (a seasonal factor exists)
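The trend test can be illustrated with a short sketch that fits the linear trend regression by ordinary least squares and returns the t-statistic of the slope. It covers only the trend regression, not the seasonal dummy regression; the function name and the sample prices are hypothetical, and the closed-form simple-regression formulas are used in place of whatever software the authors employed.

```python
import math


def trend_t_statistic(y):
    """Fit Y = b0 + b1 * Trend + e by OLS; return (b0, b1, t-statistic of b1)."""
    n = len(y)
    trend = list(range(1, n + 1))  # Trend = 1, 2, ..., n
    t_bar = sum(trend) / n
    y_bar = sum(y) / n
    sxx = sum((t - t_bar) ** 2 for t in trend)
    sxy = sum((t - t_bar) * (v - y_bar) for t, v in zip(trend, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * t_bar
    # Error sum of squares of the fitted line.
    ess = sum((v - (b0 + b1 * t)) ** 2 for t, v in zip(trend, y))
    # Standard error of the slope with n - 2 degrees of freedom.
    se_b1 = math.sqrt(ess / (n - 2) / sxx)
    return b0, b1, b1 / se_b1


# Hypothetical series with an upward drift (illustrative numbers only).
prices = [10.0, 12.5, 13.0, 15.5, 17.0, 18.5]
b0, b1, t_stat = trend_t_statistic(prices)
print(f"slope={b1:.4f}, t={t_stat:.2f}")
```

A large absolute t-statistic leads to rejecting H0 (no linear trend); the sign of the slope indicates whether the trend is positive or negative.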
Two stationarity tests, the correlogram and the Augmented Dickey-Fuller (ADF) test, were applied to
the time series data. Four different types of forecasting methods were utilized in this study:
exponential smoothing, ARIMA, GARCH, and the mixed ARIMA/GARCH.
Exponential smoothing
The h-periods-ahead forecast is given by

$$\hat{Y}_{t+h} = a + bh,$$

where $a$ and $b$ are the permanent components (the level and the trend). Both are computed recursively by the
following equations:

$$a_t = \alpha Y_t + (1 - \alpha)(a_{t-1} + b_{t-1})$$
$$b_t = \beta(a_t - a_{t-1}) + (1 - \beta)b_{t-1}$$

with $0 < \alpha, \beta < 1$.
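The two recursions can be sketched in a few lines. The initial values (level set to the first observation, trend set to the first difference) and the helper names are assumptions made for illustration; statistical packages use their own initialisation, so the error sums from this sketch would not be expected to reproduce published values exactly.

```python
def holt_smooth(y, alpha, beta):
    """Run the a_t and b_t recursions; return (a, b, ESS of one-step forecasts)."""
    # Assumed initialisation: level = first observation, trend = first difference.
    a, b = y[0], y[1] - y[0]
    ess = 0.0
    for obs in y[1:]:
        forecast = a + b  # one-step-ahead forecast made before seeing obs
        ess += (obs - forecast) ** 2
        a_new = alpha * obs + (1 - alpha) * (a + b)
        b = beta * (a_new - a) + (1 - beta) * b
        a = a_new
    return a, b, ess


def holt_forecast(a, b, h):
    """h-periods-ahead forecast a + b * h from the final level and trend."""
    return a + b * h


def grid_search(y, grid):
    """Return (ESS, alpha, beta) with the smallest ESS over a parameter grid."""
    return min((holt_smooth(y, al, be)[2], al, be) for al in grid for be in grid)


# Hypothetical series (illustrative numbers only).
series = [100.0, 104.0, 103.0, 109.0, 112.0, 111.0, 118.0]
grid = [round(0.1 * k, 1) for k in range(1, 10)]  # 0.1, 0.2, ..., 0.9
best_ess, best_alpha, best_beta = grid_search(series, grid)
level, trend, _ = holt_smooth(series, best_alpha, best_beta)
print(best_alpha, best_beta, holt_forecast(level, trend, 1))
```

Searching the grid 0.1 to 0.9 for both parameters mirrors the way a best (α, β) pair is selected by the smallest error sum of squares.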
ARIMA
This study followed the Box-Jenkins methodology, which involves four steps: identification,
estimation, model checking, and forecasting. ARMA(p,q) processes can be expressed by the
following two equations:

$$Y_t = x_t\gamma + e_t \qquad (1)$$

$$e_t = \sum_{i=1}^{p}\phi_i e_{t-i} + \sum_{j=1}^{q}\theta_j\varepsilon_{t-j} + \varepsilon_t \qquad (2)$$

where $x_t$ denotes the explanatory variables, $e_t$ the disturbance term, $\varepsilon_t$ the innovation in the disturbance, $p$
the order of the AR term, and $q$ the order of the MA term. In equation (2), the disturbance term ($e_t$)
consists of three parts: the AR terms, the MA terms, and a
white-noise innovation term. If the data ($Y_t$) are replaced with the differenced data ($\Delta Y_t = Y_t - Y_{t-1}$), with differencing applied $d$ times, then
the ARMA models become ARIMA(p,d,q) models.
GARCH
The standard form of GARCH(p,q) models can be specified by the following three equations:

$$Y_t = x_t\gamma + \varepsilon_t \qquad (3)$$

$$\varepsilon_t = v_t\sqrt{\sigma_t^2} \qquad (4)$$

$$\sigma_t^2 = \delta + \sum_{i=1}^{q}\alpha_i\varepsilon_{t-i}^2 + \sum_{j=1}^{p}\beta_j\sigma_{t-j}^2 \qquad (5)$$

where $p$ is the order of the GARCH term, $q$ is the order of the ARCH term, and $\sigma_v^2 = 1$.
Equations (3) and (5) are called the mean equation and the conditional variance equation, respectively. The mean equation is written as a
function of exogenous variables ($x_t$) with an error term ($\varepsilon_t$). The variance equation is a function of
a constant term ($\delta$), the ARCH terms ($\varepsilon_{t-i}^2$), and the GARCH terms ($\sigma_{t-j}^2$).
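To make the conditional variance equation (5) concrete, the sketch below evaluates the recursion for a given residual series. The function name, the treatment of pre-sample values (both squared residuals and variances set to a supplied starting variance), and the numbers are assumptions for illustration, not the estimation procedure used in the study.

```python
def garch_variance_path(residuals, delta, alphas, betas, initial_variance):
    """Evaluate sigma2_t = delta + sum_i alpha_i * eps_{t-i}^2 + sum_j beta_j * sigma2_{t-j}."""
    q, p = len(alphas), len(betas)  # q = ARCH order, p = GARCH order
    sigma2 = []
    for t in range(len(residuals)):
        value = delta
        for i in range(1, q + 1):
            # Assumption: pre-sample squared residuals equal initial_variance.
            eps2 = residuals[t - i] ** 2 if t - i >= 0 else initial_variance
            value += alphas[i - 1] * eps2
        for j in range(1, p + 1):
            # Assumption: pre-sample variances equal initial_variance.
            prev = sigma2[t - j] if t - j >= 0 else initial_variance
            value += betas[j - 1] * prev
        sigma2.append(value)
    return sigma2


# Hypothetical GARCH(1,1) inputs (illustrative numbers only).
path = garch_variance_path([1.0, -2.0, 0.5], delta=0.1,
                           alphas=[0.2], betas=[0.7], initial_variance=1.0)
print([round(s, 4) for s in path])
```

The large residual at the second position raises the conditional variance at the third position, which is the volatility clustering that the ARCH term is designed to capture.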
ARIMA/GARCH
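The combination of ARIMA(p,d,q) and GARCH(p,q) is written as:

$$\Delta^d Y_t = \sum_{i=1}^{p}\phi_i\,\Delta^d Y_{t-i} + \varepsilon_t + \sum_{j=1}^{q}\theta_j\varepsilon_{t-j}, \qquad \varepsilon_t \sim WN(0, \sigma_t^2) \qquad (6)$$

$$\sigma_t^2 = \delta + \sum_{i=1}^{q}\alpha_i\varepsilon_{t-i}^2 + \sum_{j=1}^{p}\beta_j\sigma_{t-j}^2 \qquad (7)$$

In equation (7), $p$ and $q$ denote the GARCH and ARCH orders; they are chosen separately from the ARIMA orders in equation (6).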
Table 1: Model selection criteria (Ramanathan, 2002)
No.  Criteria  Formula
1    AIC       $(ESS/n)\,e^{2f/n}$
2    FPE       $(ESS/n)\,\frac{n+f}{n-f}$
3    GCV       $(ESS/n)\,(1 - f/n)^{-2}$
4    HQ        $(ESS/n)\,(\ln n)^{2f/n}$
5    RICE      $(ESS/n)\,(1 - 2f/n)^{-1}$
6    SCHWARZ   $(ESS/n)\,n^{f/n}$
7    SGMASQ    $(ESS/n)\,(1 - f/n)^{-1}$
8    SHIBATA   $(ESS/n)\,\frac{n+2f}{n}$
Note: n = Number of observations; f = Number of parameters; ESS = Error sum of squares
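The eight criteria are all the average error sum of squares, ESS/n, multiplied by a penalty that grows with the number of parameters. A minimal sketch, with a hypothetical function name and made-up inputs, is:

```python
import math


def selection_criteria(ess, n, f):
    """Return the eight criteria of Table 1 for given ESS, n, and f."""
    base = ess / n  # average error sum of squares
    return {
        "AIC": base * math.exp(2 * f / n),
        "FPE": base * (n + f) / (n - f),
        "GCV": base * (1 - f / n) ** -2,
        "HQ": base * math.log(n) ** (2 * f / n),
        "RICE": base * (1 - 2 * f / n) ** -1,
        "SCHWARZ": base * n ** (f / n),
        "SGMASQ": base * (1 - f / n) ** -1,
        "SHIBATA": base * (n + 2 * f) / n,
    }


# Hypothetical inputs (illustrative numbers only).
crit = selection_criteria(ess=100.0, n=10, f=2)
for name, value in crit.items():
    print(f"{name:8s} {value:.4f}")
```

For each criterion a smaller value indicates a preferred model, so candidate models can be compared by computing this dictionary for each and counting the criteria on which each model is lowest.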
Table 2: Forecast accuracy criteria
No.  Criteria     Formula
1    RMSE         $\sqrt{ESS/n}$
2    MAE          $\frac{1}{n}\sum_{t=1}^{n}\left|Y_t - \hat{Y}_t\right|$
3    MAPE         $\frac{1}{n}\sum_{t=1}^{n}\left|\frac{Y_t - \hat{Y}_t}{Y_t}\right| \times 100\%$
4    U-statistic  $\frac{RMSE}{\sqrt{\sum_{t=1}^{n}\hat{Y}_t^2/n} + \sqrt{\sum_{t=1}^{n}Y_t^2/n}}$
Note: $Y_t$ = The actual value at time t; $\hat{Y}_t$ = The forecast value at time t; n = The number of observations;
ESS = The error sum of squares
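The four accuracy measures can be computed directly from paired actual and forecast values. The sketch below is illustrative only; the function name and the two-point example are hypothetical.

```python
import math


def forecast_accuracy(actual, forecast):
    """Return RMSE, MAE, MAPE (in percent), and Theil's U for paired series."""
    n = len(actual)
    errors = [y - f for y, f in zip(actual, forecast)]
    rmse = math.sqrt(sum(e ** 2 for e in errors) / n)
    mae = sum(abs(e) for e in errors) / n
    mape = sum(abs(e / y) for e, y in zip(errors, actual)) / n * 100
    # Theil's inequality coefficient: RMSE scaled by the root mean squares
    # of the forecast and actual series, so it lies between 0 and 1.
    u = rmse / (math.sqrt(sum(f ** 2 for f in forecast) / n)
                + math.sqrt(sum(y ** 2 for y in actual) / n))
    return {"RMSE": rmse, "MAE": mae, "MAPE": mape, "U": u}


# Hypothetical actual and forecast values (illustrative numbers only).
acc = forecast_accuracy([100.0, 200.0], [110.0, 190.0])
print(acc)
```

Because the U-statistic is scale-free, it allows comparison across series measured in different units, whereas RMSE and MAE are in the units of the data (here RM/tonne).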
The eight model selection criteria suggested by Ramanathan (2002) were used to choose the best
forecasting models among the ARIMA and GARCH models (see Table 1). The best forecasting model
among the four different types of forecasting methods was then chosen according to the values of four
criteria, namely RMSE, MAE, MAPE, and the U-statistic (see Table 2). Finally, the selected model was
used to perform short-term forecasting of the Bagan Datoh cocoa bean price for the next twelve months,
from January 2007 until December 2007.
4. Results and discussions
The results showed that the coefficient of variation ($V$) of the time series data was 0.998 ($V < 1$).
Because the $V$ value was close to 1, this study concluded that the time series data were stable
(Telesca et al., 2008). The regression analysis showed that a positive linear trend
factor exists in the time series data but a seasonal factor does not. According to the correlogram and
Augmented Dickey-Fuller test results, the time series data were not stationary, but they became
stationary after first-order differencing was carried out (see Figure 1).
Figure 1: Time series data (After first order of differencing)
Exponential Smoothing
The double exponential smoothing method was used because the regression results showed that a
positive linear trend factor existed in the time series data. Double exponential smoothing models
have two parameters, denoted α for the mean and β for the trend. Among the combinations of α and β
satisfying 0 < α, β < 1, the model with the lowest MSE (mean squared error) was chosen as the best
double exponential smoothing forecasting model. The results showed
that the combination α = 0.9 and β = 0.1 gave the best forecasting model of the double exponential
smoothing method (see Table 3). Referring to Table 4, the double exponential smoothing model is written in
equation form as

$$F_{T+h} = a + bh = 4337.292 + (-49.11296)\,h$$
Table 3: Error sum of squares (ESS) for each combination of α (rows) and β (columns)
α \ β   0.1         0.2         0.3         0.4         0.5         0.6         0.7         0.8         0.9
0.1     148574494   148089497   119506562   111432010   110413625   99118908    82089304    69046203    61520834
0.2     60780095    53009132    46290637    40011203    35644209    33322485    32620680    33216629    34765722
0.3     35967343    31511913    28206803    26500856    26183982    26752761    27706387    28645428    29338707
0.4     26067863    23584886    22285672    22044141    22388117    22906942    27706387    23736262    24034747
0.5     21171464    19811357    19352556    19491565    19867461    20285055    20691928    21099517    21526024
0.6     18416856    17698855    17631819    17928248    18359472    18833953    19333711    19860841    20417178
0.7     16768235    16458187    16634131    17065146    17604439    18201394    18846891    19543802    20299223
0.8     15782220    15770767    16145314    16727264    17418829    18191436    19044923    19989248    21038783
0.9     15241640    15480149    16047802    16807669    17693899    18692544    19810004    21059009    22452919
Table 4: EViews output of double exponential smoothing
Sample: 1992M01 2005M12
Included observations: 168
Method: Holt-Winters No Seasonal
Original Series: BAGANDATOH
Forecast Series: BAGANDATOHSM
Parameters:                 Alpha    0.9000
                            Beta     0.1000
Sum of Squared Residuals             15241640
Root Mean Squared Error              301.2043
End of Period Levels:       Mean     4337.292
                            Trend    -49.11296
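Given the end-of-period level and trend reported in Table 4, the twelve forecasts follow from the linear formula a + bh. The short sketch below evaluates it for h = 1 to 12; since the estimation sample ends in 2005M12, h = 1 corresponds to January 2006. The variable names are illustrative.

```python
# End-of-period level (Mean) and Trend as reported in Table 4.
a, b = 4337.292, -49.11296

# h-periods-ahead forecasts for the twelve months after the sample end.
forecasts = {h: a + b * h for h in range(1, 13)}

for h, value in forecasts.items():
    print(f"h={h:2d}  forecast={value:.2f} RM/tonne")
```

Because the fitted trend is negative at the end of the sample, the forecasts decline by about RM 49 per tonne each month over the horizon.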
ARIMA
Chatfield (1996) suggested that the number of parameters in an ARIMA(p,d,q) model should not be
too large. Therefore, this study considered all models that fulfil the criterion p + q ≤ 5.
This study found 20 ARIMA(p,d,q) models which fulfilled the criterion. The parameters of
the models were estimated with the least squares method. Parameters which were not significant at the 5%
significance level were dropped from the model. Using the eight model selection criteria suggested by
Ramanathan (2002), the ARIMA(3,1,2) model was selected as the best model among the ARIMA
models. However, AR(1) was found not to be significant and was thus dropped from the model. Referring to
Table 5, the ARIMA(3,1,2) model is written in equation form as

$$\hat{z}_t = 16.53833 - 0.936073\,z_{t-2} + 0.188524\,z_{t-3} + 0.105330\,\varepsilon_{t-1} + 0.972281\,\varepsilon_{t-2}$$
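As an illustration of how the fitted equation produces a one-step-ahead forecast, the sketch below plugs lagged values into it. It assumes that z_t is the first-differenced price series (consistent with d = 1, although the text does not define z_t explicitly), so the price forecast is the last observed price plus the forecast change. The function name and the lagged inputs are hypothetical.

```python
def arima312_one_step(z_hist, eps_hist, last_price):
    """One-step forecast from the fitted ARIMA(3,1,2) equation.

    z_hist holds past differences ending with z_{t-1};
    eps_hist holds past residuals ending with eps_{t-1}.
    """
    c, phi2, phi3 = 16.53833, -0.936073, 0.188524
    theta1, theta2 = 0.105330, 0.972281
    z_hat = (c
             + phi2 * z_hist[-2]      # z_{t-2}
             + phi3 * z_hist[-3]      # z_{t-3}
             + theta1 * eps_hist[-1]  # eps_{t-1}
             + theta2 * eps_hist[-2]) # eps_{t-2}
    # Assumption: z_t is the first difference, so add it back to the last price.
    return z_hat, last_price + z_hat


# Hypothetical lagged values (illustrative numbers only).
z_hat, price_hat = arima312_one_step(z_hist=[10.0, -20.0, 30.0],
                                     eps_hist=[5.0, -5.0],
                                     last_price=4000.0)
print(round(z_hat, 4), round(price_hat, 4))
```

For forecasts more than two steps ahead, future innovations are unknown and are replaced by their expected value of zero, so the MA terms drop out of the recursion.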
Table 5: Estimation of ARIMA(3,1,2)
Variables   Coefficient   Standard error   Z-statistic   p-value
Constant    16.53833      26.37156         0.627128      0.5315
AR(2)       -0.936073     0.022439         -41.71719     <0.0001*
AR(3)       0.188524      0.023478         8.029818      <0.0001*
MA(1)       0.105330      0.020494         5.139572      <0.0001*
MA(2)       0.972281      0.015673         62.03491      <0.0001*
Note: * p<0.05
GARCH
Identification and estimation of the GARCH(p,q) models in this study followed four steps:
ARCH effect checking, estimation, model checking, and forecasting. Four GARCH(p,q)
models were compared: GARCH(1,1), GARCH(1,2), GARCH(2,1), and GARCH(2,2). Using
the eight model selection criteria suggested by Ramanathan (2002), the GARCH(1,1) model was
selected as the best of the four GARCH models.
Table 6: Estimation of GARCH(1,1)
Variables                 Coefficient   Standard error   Z-statistic   p-value
Mean equation
Constant                  -0.4290       18.7540          -0.0229       0.9818
Conditional variance equation
Constant                  8467.87       4029.91          2.1013        0.0356*
$\varepsilon_{t-1}^2$     0.2838        0.1000           2.8383        0.0045*
$\sigma_{t-1}^2$          0.6244        0.1195           5.2238        <0.0001*
Note: * p<0.05
Referring to Table 6, the GARCH(1,1) model is written in equation form as

$$\hat{z}_t = -0.4290 \qquad \text{(Mean equation)}$$

$$\hat{\sigma}_t^2 = 8467.87 + 0.2838\,\varepsilon_{t-1}^2 + 0.6244\,\sigma_{t-1}^2 \qquad \text{(Conditional variance equation)}$$
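Two quantities implied by the fitted variance equation are often useful when reading a GARCH(1,1) model: the persistence (the sum of the ARCH and GARCH coefficients) and the long-run variance δ / (1 - persistence). These are standard properties of GARCH(1,1) rather than results reported in this study; the sketch below computes them from the Table 6 coefficients, and the starting variance used for the multi-step path is a hypothetical value.

```python
# Coefficients of the conditional variance equation from Table 6.
delta, alpha1, beta1 = 8467.87, 0.2838, 0.6244

# Persistence below 1 implies that variance shocks die out over time.
persistence = alpha1 + beta1
long_run_variance = delta / (1 - persistence)


def variance_forecast(next_variance, h):
    """h-step-ahead conditional variance given the one-step-ahead value."""
    gap = next_variance - long_run_variance
    return long_run_variance + persistence ** (h - 1) * gap


# Hypothetical one-step-ahead variance (illustrative number only).
start = 50000.0
path = [variance_forecast(start, h) for h in range(1, 13)]
print(round(persistence, 4), round(long_run_variance, 2))
```

With a persistence of about 0.91, the forecast variance moves roughly 9% of the remaining distance towards its long-run level each month.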
ARIMA/GARCH
First, this study checked whether an ARCH effect exists in the ARIMA(3,1,2) model by using a
regression test. The result showed that ARIMA(3,1,2) has an ARCH effect. ARIMA(3,1,2) was then
combined with the best GARCH model, GARCH(1,1).
Table 7: Estimation of ARIMA(3,1,2)/GARCH(1,1)
Variables                 Coefficient   Standard error   Z-statistic   p-value
ARIMA(3,1,2)
Constant                  16.53833      26.37156         0.627128      0.5315
AR(2)                     -0.936073     0.022439         -41.71719     <0.0001*
AR(3)                     0.188524      0.023478         8.029818      <0.0001*
MA(1)                     0.105330      0.020494         5.139572      <0.0001*
MA(2)                     0.972281      0.015673         62.03491      <0.0001*
GARCH(1,1)
C                         3303.235      1427.419         2.31413       0.0207*
$\varepsilon_{t-1}^2$     0.143294      0.048339         2.964356      0.003*
$\sigma_{t-1}^2$          0.821894      0.05859          14.02778      <0.0001*
Note: * p<0.05
Referring to Table 7, the ARIMA(3,1,2)/GARCH(1,1) model is written in equation form as

$$\hat{z}_t = 16.53833 - 0.936073\,z_{t-2} + 0.188524\,z_{t-3} + 0.105330\,\varepsilon_{t-1} + 0.972281\,\varepsilon_{t-2}$$

$$\hat{\sigma}_t^2 = 3303.235 + 0.143294\,\varepsilon_{t-1}^2 + 0.821894\,\sigma_{t-1}^2$$
Model selection
Four model selection criteria were used to select the best forecasting model from the four different types of
models: the double exponential smoothing model, ARIMA(3,1,2), GARCH(1,1), and
ARIMA(3,1,2)/GARCH(1,1). Based on the results of the ex-post forecasting (from January
until December 2006), the GARCH(1,1) model was selected as the best short-term forecasting model
of the Bagan Datoh cocoa bean price graded SMC 1B (see Table 8).
Table 8: Four model selection criteria
Criteria       Double Exponential Smoothing   ARIMA(3,1,2)   GARCH(1,1)     ARIMA(3,1,2)/GARCH(1,1)
RMSE           515.1244909                    238.5100782    219.0348756    236.426303
MAE            433.8590833                    198.9448333    142.3456667    195.6330833
MAPE           9.556988755                    4.510056702    3.054184854    4.437227514
U-Statistics   0.060762006                    0.02643127     0.024875292    0.026185129
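The selection step amounts to finding, for each criterion, the model with the smallest value. The sketch below does this with the Table 8 figures; it assumes that the four groups of numbers in the table follow the order of the column headings, and the dictionary layout is illustrative.

```python
# Values transcribed from Table 8. Assumption: the four number groups
# follow the order of the column headings.
table8 = {
    "Double exponential smoothing": {
        "RMSE": 515.1244909, "MAE": 433.8590833,
        "MAPE": 9.556988755, "U": 0.060762006},
    "ARIMA(3,1,2)": {
        "RMSE": 238.5100782, "MAE": 198.9448333,
        "MAPE": 4.510056702, "U": 0.02643127},
    "GARCH(1,1)": {
        "RMSE": 219.0348756, "MAE": 142.3456667,
        "MAPE": 3.054184854, "U": 0.024875292},
    "ARIMA(3,1,2)/GARCH(1,1)": {
        "RMSE": 236.426303, "MAE": 195.6330833,
        "MAPE": 4.437227514, "U": 0.026185129},
}

criteria = ["RMSE", "MAE", "MAPE", "U"]

# For each criterion, pick the model with the smallest value.
best_by_criterion = {
    c: min(table8, key=lambda model: table8[model][c]) for c in criteria
}

for c, model in best_by_criterion.items():
    print(f"{c:5s} -> {model}")
```

Under this reading of the table, GARCH(1,1) has the smallest value on all four criteria, which agrees with the selection reported in the text.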
Ex-ante forecasting
Based on ex-ante forecasting by using the GARCH(1,1) model, Figure 2 shows that the short-term
forecasting indicated an upward trend of Bagan Datoh cocoa bean price for the period January –
December 2007.
Figure 2: Short-term forecasting of Bagan Datoh cocoa bean prices
5. Conclusion
This study investigated four forecasting methods: exponential smoothing, ARIMA, GARCH, and the
mixed ARIMA/GARCH. It then applied them to forecast the Bagan Datoh cocoa
bean price graded SMC 1B twelve months ahead. The GARCH model was found to outperform exponential smoothing,
ARIMA, and the mixed ARIMA/GARCH. Ex-ante forecasting with the GARCH model
indicated an upward trend of the Bagan Datoh cocoa
bean price for the period January – December 2007. With an accurate estimate, the Malaysian
government as well as buyers (e.g. exporters and millers) and sellers (e.g. farmers and dealers) can
perform better strategic planning, especially in the domestic cocoa bean market. It would also help
them to maximize their revenue and minimize the costs arising from price changes.
References
Aishah, M. N. and Anton Abdulbasah, K. (2006) Time series modeling of Malaysian raw palm oil
price: Autoregressive conditional heteroskedasticity (ARCH) model approach. Discovering
Mathematics, 28, 2, 19-32.
Fatimah, M. A. and Roslan, A. G. (1986) Univariate approach towards cocoa price forecasting.
Malaysian Agricultural Economics, 3, 1-11.
Fatimah, M. A. and Zainal Abidin, M. (1994) Price discovery through crude palm oil futures market:
an economic evaluation, in Erdener Kaynak and Mohamed Sulaiman, Proceedings
on Third Annual Congress on Capitalising the Potentials of Globalisation - Strategies and
Dynamics of Business. Penang: International Management Development Association (IMDA)
and Universiti Sains Malaysia, 73-92.
Liew, K. S., Shitan, M. and Huzaimi, H. (2000) Time series modelling and forecasting of Sarawak
black pepper price. Mathematics Department, Universiti Putra Malaysia, Serdang, Malaysia.
Mohammed Yusof and Mohamad Salleh. (1987) The elasticities of supply and demand for Malaysian
primary commodity exports. Malaysian Journal of Agricultural Economics, 4, 59-72.
Ramanathan, R. (2002) Introductory Econometrics with Applications. 5th Edition. Fort Worth: Harcourt.
Report on Malaysia Plantation Industries and Commodities 2001-2005. (2006) Putrajaya. Ministry of
Plantation Industries and Commodities.
Telesca, L., Bernardi, M. and Rovelli, C. (2008) Time-scaling analysis of lightning in Italy.
Communications in Nonlinear Science and Numerical Simulation, 13, 7, 1384-1396.