Adam Pilz
Forecasting
Grade: A-

Copper Demand

Low interest rates resulting from the 2001 recession fueled a boom in the demand for credit from industrial, retail, and institutional borrowers [1]. These customers used the funds to expand their businesses, borrow on credit cards, take out new mortgages, and purchase and securitize loans for resale, which allowed borrowing to continue. The expansion of business facilities and housing was very raw-material intensive. Copper is used in nearly all facets of expansion, in industrial facilities as well as housing. Both require wiring, telecommunications equipment, electrical and electronic products, and plumbing. Once power transmission and generation are included, which not only require large amounts of copper but also supply the power that fuels the expansion, about three quarters of copper use is accounted for [2].

Before any wire or piping is sold, copper must be removed from the ground and refined. Therefore, if we wanted to “see” economic expansion before it shows up in corporate profits or GDP, we could look at the demand for copper, because there is a lag between the time copper is mined, turned into products, and sold. Since we are currently in another recession [3], knowing when demand is starting up again is important, as it will help determine both monetary and fiscal policy. Since copper seems to be a decent proxy for future economic growth and for the fate of the ailing housing market, knowing its demand should give us an idea of what is to come.

Section II of this research contains a review of the literature. Section III describes the data used. Section IV reviews the methodology. Section V presents the results and Section VI the conclusions of this research.

[1] http://research.stlouisfed.org/publications/net/20000801/netpub.pdf
[2] http://www.copper.org/education/c-facts/c-electrical.html
[3] http://www.nber.org/cycles/dec2008.html

II.
Literature Review

The literature attempting to estimate copper demand is very limited, and there is some debate about what kind of model should be used. Both Ordinary Least Squares (OLS) models and techniques for handling censored regressions (i.e., Tobit models) are used and debated. MacKinnon and Olewiler (1980) suggest that OLS models should not be used. This claim is based on the findings of McNichol (1975), who speculates that there were five periods between 1947 and 1974 in which the copper market was in disequilibrium: 1947-48, 1951-53, late 1954-1955, 1964-1970, and 1973-1974. Disequilibrium, McNichol suggests, can be seen where the differential between the US producer price, which is the price prevalent in the US, and the price on the London Metal Exchange is large. He asserts that the large differential persists because US producers keep the US price, which they set, low and ration supplies to consumers. This drives up demand on the London Metal Exchange and causes the large differential. The problem, as asserted by MacKinnon and Olewiler (1980), is that during contiguous quarters of disequilibrium OLS tends to produce errors of the same sign, so the errors will be serially correlated. With their Tobit model, MacKinnon and Olewiler (1980) find that the market for copper is not always in equilibrium.

Fisher, Cootner, and Baily (1972) estimate a world copper market model with very complex methods. They estimated supply and demand equations for the US, Europe, Japan, and the rest of the world separately and then used a closed model with “a net input equation.” They fit their model to data from 1948-1968, which is then used for forecasting and simulation experiments. They find that the short-run forecasts are satisfactory and the long-run forecasts are not.

III.
Data

Data for this research were obtained from the “United States Geological Survey (USGS) Historical Statistics for Mineral and Material Commodities” at the USGS web page. The USGS keeps historical records of copper consumption and unit pricing back to 1900. Reported copper consumption is recorded directly from industry sources and is measured in metric tons. The phrase “reported consumption” is used because there may be some discrepancy between what is reported and what is actually consumed. Unless there is some incentive to misreport, I would expect these discrepancies to be rather small. Reported consumption, hereafter referred to as “consumption,” is defined as “the quantity of refined copper used by the domestic industry (brass mills, wire-rod mills, foundries, etc.), as measured by direct survey of the copper consuming industries, in the production of semi fabricates, castings, chemicals, etc. in the United States.” Descriptive statistics are included below as well as a chart in Appendix A.

Variable: Consumption (metric tons)
Mean      1536329
SE Mean   107752
StDev     1140339
Minimum   118000
Q1        765500
Median    1360000
Q3        2117500
Maximum   8767403

Unit value is a measure of the price of a physical unit of consumption (in this case, a metric ton) in nominal dollars. Unit value, hereafter referred to as “price,” is estimated from the “Annual Average U.S. Producer Copper Price” as reported in Metal Prices in the United States through 1998 (MP98) and the 2006 Minerals Yearbook (MY06). Descriptive statistics are included below as well as a chart in Appendix A.

Variable: Unit value ($/metric ton)
N         108
Mean      1069
SE Mean   112
StDev     1167
Minimum   128
Q1        296
Median    642
Q3        1587
Maximum   7231

IV. Methodology

As a precursor to the modeling methodology, one important issue must be specified. Whichever model I use assumes that consumption and demand are exactly equal, that is, that the copper market is always in equilibrium.
In evaluating the different modeling techniques available for constructing a copper demand forecast, I concluded that an OLS model is more prone to error than the other choices. This error could arise from the disequilibrium problem identified by MacKinnon and Olewiler (1980), or from the breadth of copper’s uses. It would require so many different variables to account for what is influencing demand in the United States, let alone what is happening abroad that also influences domestic consumption, that I don’t feel an accurate model would be produced. Also, in order to forecast using an OLS model, I would need forecasts for the exogenous variables, which, again, are numerous.

After trying various naive, moving average, and smoothing models, I tried an ARIMA model. I determined this to be the best model I could use for several reasons. Under the Box-Jenkins methodology, an ARIMA model does not assume any particular pattern in the data to be forecast, as the other models do. Also, not only did the other models perform poorly, but the ARIMA model makes use of the information in the series itself to produce a forecast. Therefore the influence of exogenous factors, such as trends in new uses for copper, changes in preferences, and the prices of substitutes and complements, will be taken into account by this model to the extent that it is already reflected in the history of the series.

I began with the consumption data as reported by the USGS. The consumption data were clearly trending upward.

[Figure: Time Series Plot of Consumption. Consumption against Year.]

I checked an autocorrelation correlogram to confirm.

[Figure: Autocorrelation Function for Consumption (with 5% significance limits for the autocorrelations). Autocorrelation against Lag.]

I used a first difference to remove the trend and continued.
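The differencing and correlogram steps described above can be sketched in a few lines. This is only an illustration, not the Minitab procedure used in the paper: the function names `first_difference` and `autocorrelation` are hypothetical, and the series `toy` is made-up data, not the USGS consumption figures.

```python
def first_difference(series):
    """Return the series of year-over-year changes, one element shorter than the input."""
    return [later - earlier for earlier, later in zip(series, series[1:])]


def autocorrelation(series, lag):
    """Sample autocorrelation at the given lag (the usual correlogram estimator)."""
    n = len(series)
    mean = sum(series) / n
    denominator = sum((x - mean) ** 2 for x in series)
    numerator = sum(
        (series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n)
    )
    return numerator / denominator


# Made-up upward-trending series, standing in for the consumption data.
toy = [100.0, 110.0, 125.0, 130.0, 150.0, 160.0, 175.0, 190.0]
diffs = first_difference(toy)

# A trending series tends to show large positive low-lag autocorrelations;
# differencing is meant to pull them back toward zero.
print("lag-1 autocorrelation, levels:     ", autocorrelation(toy, 1))
print("lag-1 autocorrelation, differences:", autocorrelation(diffs, 1))
```

A correlogram is simply this autocorrelation evaluated over a range of lags and plotted against significance limits.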
The autocorrelation and partial autocorrelation correlograms seemed to suggest either an MA(1) model or an ARIMA(1,1,1) model. Both models provided horrendous results. The MS in both models was about 37,000,000,000, which may not have been a significant problem given the magnitude of the consumption values. My real problem was that no matter what variation I tried, I could not get the LBQ statistics to be insignificant. Results and additional commentary on these models are provided in Appendix B.

I then decided to try a slight transformation of the data. I divided consumption by price to see if that would stabilize the data. In essence, I would be forecasting the amount of copper consumed in a year adjusted for its nominal price, hereafter referred to as “adjusted consumption.” The interesting thing about this approach is that once I had my forecasts, I could multiply each forecast by the expected future price to get the forecast tonnage consumed. In outline:

Consumption ÷ Price = Adj. Cons.

This stabilizes the data. Once the forecast is done, the forecasts are transformed back using:

Exp. Price ∙ Adj. Cons. = Consumption

A convenient expected future price for copper is supplied by the price of a copper futures contract. The only adjustment needed is that copper on the NYMEX trades in $/pound, while all price figures used thus far are in $/metric ton. A simple conversion to $/metric ton, using what is hereafter referred to as the “Conversion Factor,” was needed:

($ ÷ Pound) ÷ (Metric Ton ÷ Pound) = ($ ÷ Metric Ton), where (Metric Ton ÷ Pound) is the Conversion Factor

[Figure: Time Series Plot of Adjusted Consumption. Adjusted Consumption against Year.]

There was clearly still some trending going on, so I tried a first difference to see if the trend could be removed.
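The transformation and its inverse can be checked with a small sketch. The function names are hypothetical and chosen for illustration only. The conversion factor (0.00045359 metric tons per pound), the 2008 price of $3.20 per pound, and the adjusted-consumption forecast of 323.4019481 are the figures reported in Section V.

```python
# Conversion factor quoted in the paper: metric tons per pound.
METRIC_TONS_PER_POUND = 0.00045359


def price_per_metric_ton(price_per_pound):
    """Convert a $/pound futures price to $/metric ton."""
    return price_per_pound / METRIC_TONS_PER_POUND


def adjusted_consumption(consumption_tons, unit_price_per_ton):
    """Consumption divided by its nominal price ($/metric ton)."""
    return consumption_tons / unit_price_per_ton


def forecast_tonnage(adjusted_forecast, expected_price_per_pound):
    """Transform an adjusted-consumption forecast back to metric tons."""
    return adjusted_forecast * price_per_metric_ton(expected_price_per_pound)


# Figures taken from the paper's 2008 forecast row.
price_2008 = price_per_metric_ton(3.2)
tonnage_2008 = forecast_tonnage(323.4019481, 3.2)
print(round(price_2008, 2), round(tonnage_2008))
```

Because the back-transformation multiplies by the expected price, any change in the futures price feeds directly and proportionally into the tonnage forecast.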
[Figure: Time Series Plot of Adjusted Consumption diff1. Adjusted Consumption diff1 against Year.]

It appears that the trend has been removed and the data fluctuate around a mean of zero. The autocorrelation and partial autocorrelation correlograms provided below seem to imply the use of either an MA(1), MA(2), or any variation of an ARIMA model.

[Figure: Autocorrelation Function for Adjusted Consumption diff1 (with 5% significance limits for the autocorrelations). Autocorrelation against Lag.]

[Figure: Partial Autocorrelation Function for Adjusted Consumption diff1 (with 5% significance limits for the partial autocorrelations). Partial Autocorrelation against Lag.]

V. Empirical Results

The results that were presented in the rough draft have been moved to Appendix C. After careful reexamination of my AIC and BIC table, I realized that I had previously left out one model, which ended up being the best in terms of those measures. Therefore the results presented below are different than before. Here is the amended AIC and BIC table:

                AIC       BIC
ARIMA(0,1,1)    11.9812   16.6881
ARIMA(0,1,2)    11.9951   16.7269
ARIMA(1,1,2)    11.9915   16.7048

The rough draft was written with the ARIMA(1,1,2) model (highlighted in green in the original table), and the results presented below use the ARIMA(0,1,1) model (highlighted in orange). The two are very close, yet the ARIMA(0,1,1) measures better on both criteria, is simpler, and provides a seemingly better forecast.

Final Estimates of Parameters, ARIMA(0,1,1)

Type   Coef      SE Coef   T       P
MA 1   -0.3449   0.0912    -3.78   0.000

MS = 159749

From the table we see that the moving average coefficient is significant above the 99% level. This tells us that the error of the past value is important in determining the next value.
Even though the MS has been reduced, we cannot compare it to the MS of the models in Appendix B, since the transformation has been done.

Modified Box-Pierce (Ljung-Box) Chi-Square statistic

Lag          12      24      36      48
Chi-Square   12.6    24.3    26.5    36.1
DF           11      23      35      47
P-Value      0.319   0.386   0.847   0.875

The LBQ statistics do not raise any flags, so we know the residual autocorrelations as a group are insignificant. Confirmation of this is also provided in the four in one plot and the residual autocorrelation correlogram below.

[Figure: Residual Plots for Adjusted Consumption. Four panels: Normal Probability Plot (Percent against Residual), Versus Fits (Residual against Fitted Value), Histogram (Frequency against Residual), Versus Order (Residual against Observation Order).]

The errors seem to be normally distributed according to the normal probability plot. The residual histogram appears to be normal. The “versus order” plot gives us a time series of the residuals, and it has the appearance of white noise. The last check of residuals is provided below in the correlogram of the residuals.
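As a rough illustration of what the LBQ check measures, the sketch below computes the Ljung-Box statistic, Q = n(n+2) Σ r_k² / (n − k), from a list of residuals. The function name `ljung_box_q` and the residual values are made up for illustration; they are not the model's residuals or Minitab's implementation. In the table above, DF equals the lag minus the number of estimated ARIMA parameters (12 − 1 = 11 in the first column), and Q is compared against a chi-square distribution with that many degrees of freedom.

```python
def ljung_box_q(residuals, max_lag):
    """Ljung-Box Q statistic over lags 1..max_lag for a residual series."""
    n = len(residuals)
    mean = sum(residuals) / n
    denominator = sum((e - mean) ** 2 for e in residuals)
    total = 0.0
    for k in range(1, max_lag + 1):
        # Sample autocorrelation of the residuals at lag k.
        r_k = sum(
            (residuals[t] - mean) * (residuals[t - k] - mean) for t in range(k, n)
        ) / denominator
        total += r_k ** 2 / (n - k)
    return n * (n + 2) * total


# Hypothetical residuals, for illustration only.
toy_residuals = [120.0, -80.0, 45.0, -210.0, 160.0, 30.0, -95.0, 75.0, -40.0, 10.0]
print(ljung_box_q(toy_residuals, max_lag=3))
```

A small Q relative to the chi-square critical value (19.675 at the 5% level for 11 degrees of freedom) is what "insignificant" means here; the reported 12.6 at lag 12 is well below it.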
[Figure: Autocorrelation Function for RESI of Adjusted Consumption (with 5% significance limits for the autocorrelations). Autocorrelation against Lag.]

Therefore our model is:

\[ \hat{Y}_t = \varepsilon_t + 0.3449\,\varepsilon_{t-1} \]

where Y_t is the first difference of adjusted consumption. The sign follows Minitab's convention, \( Y_t = \varepsilon_t - \theta_1 \varepsilon_{t-1} \), with the reported \( \theta_1 = -0.3449 \).

[Figure: Time Series Plot for Adjusted Consumption with Forecasts (with forecasts and their 95% confidence limits). Adjusted Consumption against Time.]

Forecasts for Adjusted Consumption

Year   Lower          Forecast      Upper
2008   -460.1399517   323.4019481   1106.943848
2009   -989.7340626   323.4019481   1636.537959
2010   -1360.25908    323.4019481   2007.062976
2011   -1662.826319   323.4019481   2309.630215
2012   -1925.039958   323.4019481   2571.843854

Something to note is that the model predicts that adjusted consumption will be exactly the same for the next 5 years. This is expected, since an ARIMA(0,1,1) model without a constant produces a flat forecast beyond the first step while the confidence limits widen. We must keep in mind that these forecasts are of adjusted consumption until we convert them back to original form. Therefore:

Forecasts for Consumption

Year   Average Price [4]   Conversion Factor [5]   Price/Metric Ton   Adjusted consumption   Forecast Tonnage
2008   3.2                 0.00045359              7054.829251        323.4019481            2281545.523
2009   2.048278            0.00045359              4515.703609        323.4019481            1460387.344
2010   2.027125            0.00045359              4469.068983        323.4019481            1445305.615
2011   2.0155              0.00045359              4443.440111        323.4019481            1437017.188

This same table for the lower and upper bounds is included in Appendix D. Below is the time series of copper consumption including both the upper and lower forecasts. Since I had to calculate the consumption numbers using the expected prices, Minitab would not produce the typical forecast graph. Therefore I had to overlay three data sets, which is the reason for the different colors.

[4] Average Price: the 2008 value is from the USGS website. The 2009-2011 values are the average of the most recent closes of all the contract months available for the respective years, derived from the NYMEX website.
http://www.nymex.com/cop_fut_csf.aspx
[5] The conversion factor necessary to convert $/pound to $/metric ton. http://www.metricconversions.org/weight/pounds-to-metric-tons.htm

[Figure: Time Series Plot of Consumption with Forecast, Lower, Upper. Variables: Forecast_1, Lower_2, Upper_2. Data against Year.]

There is a slight problem with the lower forecasts: they reflect negative consumption, which is not possible. Therefore I provide another chart. It is basically the same graph with a minimum value of zero. As you can see, the model predicts a fall in the demand for copper.

[Figure: Time Series Plot of Consumption with Forecast, Upper. Variables: Forecast_1, Lower_2, Upper_2. Data against Year, with the vertical axis starting at zero.]

[Figure: Time Series Plot of Consumption with forecasts. Consumption against Year.]

In fact, when we look at the time series plot of consumption by itself, the model predicts a steep fall in the demand for copper in the next few years. The USGS predicted consumption for 2008 was 2,000,000 metric tons, and the model predicts 2,281,545 metric tons followed by a sharp fall, which may be the case if this worldwide recession continues.

Year   USGS Forecast   Forecast Tonnage
2008   2000000         2281545.523
2009                   1460387.344
2010                   1445305.615
2011                   1437017.188

Now that we have a satisfactory model, a check of an Ex Post as well as an Ex Post Historical forecast is in order.

Ex Post: ARIMA(0,1,1)

Final Estimates of Parameters

Type   Coef      SE Coef   T       P
MA 1   -0.3432   0.1128    -3.04   0.003

MS = 216154

The moving average coefficient is significant at the 99% level, and the MS appears to be close to that of the former model.
Modified Box-Pierce (Ljung-Box) Chi-Square statistic

Lag          12      24      36      48
Chi-Square   11.4    24.4    29.7    34.0
DF           11      23      35      47
P-Value      0.409   0.38    0.72    0.922

The LBQ stats are insignificant. The four in one plot below does not raise any red flags, and neither does the correlogram of the residuals.

[Figure: Ex Post Residual Plots for Adjusted Consumption. Four panels: Normal Probability Plot (Percent against Residual), Versus Fits (Residual against Fitted Value), Histogram (Frequency against Residual), Versus Order (Residual against Observation Order).]

[Figure: Autocorrelation Function for RESI1_Ex Post (with 5% significance limits for the autocorrelations). Autocorrelation against Lag.]

[Figure: Ex Post Time Series Plot for Adjusted Consumption (with forecasts and their 95% confidence limits). Adjusted Consumption against Time.]

[Figure: Time Series Plot of Ex Post Adjusted Consumption and Adjusted Consumption. Variables: Ex Post Adjusted Consumption, Actual Adjusted Consumption. Data against Year.]

When the Ex Post forecast is plotted against adjusted consumption, we see that it does not predict the downturn that happens, because it predicts the same value into the future.

Ex Post Historical: ARIMA(0,1,1)

Final Estimates of Parameters

Type   Coef      SE Coef   T       P
MA 1   -0.3449   0.0912    -3.78   0.000

MS = 159749

The moving average coefficient is significant above the 99% level, and the MS is below that of the Ex Post model. Nothing appears wrong with the LBQ stats below; they are all insignificant.
Modified Box-Pierce (Ljung-Box) Chi-Square statistic

Lag          12      24      36      48
Chi-Square   12.6    24.3    26.5    36.1
DF           11      23      35      47
P-Value      0.319   0.386   0.847   0.875

[Figure: Residual Plots for Ex Post Historical Adjusted Consumption. Four panels: Normal Probability Plot (Percent against Residual), Versus Fits (Residual against Fitted Value), Histogram (Frequency against Residual), Versus Order (Residual against Observation Order).]

The four in one plot appears to be ok, as does the correlogram of the residuals below.

[Figure: Autocorrelation Function for RESI2_Ex Post Historical (with 5% significance limits for the autocorrelations). Autocorrelation against Lag.]

[Figure: Ex Post Historical Time Series Plot for Adjusted Consumption (with forecasts and their 95% confidence limits). Adjusted Consumption_1 against Time.]

Here is the Ex Post Historical forecast against adjusted consumption. It doesn’t seem to forecast the downturn that happens toward the end, although it seems reasonable prior to that.

VI. Conclusion

In the beginning of this paper, I asserted that copper seems to be a decent proxy for future economic growth and for the fate of the ailing housing market, and that knowing its demand should give us an idea of what is to come. I used an ARIMA(0,1,1) model to forecast the demand for copper in hopes of understanding when we can expect a turnaround in the economy. I am very pleased with this model and the forecasts generated by it. Considering that the data only go through 2007, when the housing market and the economy were in better shape, it provides an astonishingly low forecast for the demand for copper.
This was undoubtedly helped by the adjustment for price, which allowed the forecast to take into account copper prices falling by half; that would not have happened without the adjustment. If this model is correct, and if it is agreed that copper consumption is a good proxy for future economic growth, we should not expect a turnaround in the economy any time soon.

The Ex Post and Ex Post Historical forecasts performed rather poorly, although less so when we consider the length of time over which the model is asked to predict. Thirty-six years into the future is a long time to assume that the fundamentals of the data set will be the same as in the period before it. One reason why domestic consumption patterns would be vastly different after the early seventies is the enormous expansion of computer technology. Copper is used in many components that serve this sector, whose demand patterns were not reflected in the data prior to the nineties.

BIBLIOGRAPHY

Copper Development Association Inc. (2009). COPPER.ORG. Retrieved 4/1, 2009, from http://www.copper.org/resources/market_data/homepage.html

Fisher, F. M., Cootner, P. H., & Baily, M. N. (1972). “An econometric model of the world copper industry.” The Bell Journal of Economics and Management Science, 3(2), 568-609. Retrieved from http://www.jstor.org/stable/3003038

MacKinnon, J. G., & Olewiler, N. D. (1980). “Disequilibrium estimation of the demand for copper.” The Bell Journal of Economics, 11(1), 197-211. Retrieved from http://www.jstor.org/stable/3003408

McNichol, D. L. (1975). “The two price system in the copper industry.” The Bell Journal of Economics, 6(1), 50-73.

New York Mercantile Exchange, I. (4/3/2009). Copper. Retrieved 4/4, 2009, from http://www.nymex.com/cop_opt_cso.aspx

rcallaghan. (Oct. 1, 2008). Retrieved 4/1, 2009, from http://minerals.usgs.gov/ds/2005/140/#copper

TheOptionsGuide.com. (2009). Copper futures explained.
Retrieved 4/4, 2009, from http://www.theoptionsguide.com/copper-futures.aspx

Appendix A.

Summary for Consumption

Anderson-Darling Normality Test
A-Squared                          1.18
P-Value                            < 0.005
Mean                               1385843
StDev                              761342
Variance                           5.79641E+11
Skewness                           0.14072
Kurtosis                           -1.02552
N                                  108
Minimum                            118000
1st Quartile                       745000
Median                             1335000
3rd Quartile                       2045000
Maximum                            3020000
95% Confidence Interval for Mean   1240613 to 1531072
95% Confidence Interval for Median 1140000 to 1620000
95% Confidence Interval for StDev  671574 to 879030

Summary for Unit value ($/t)

Anderson-Darling Normality Test
A-Squared                          8.10
P-Value                            < 0.005
Mean                               1068.9
StDev                              1167.1
Variance                           1362208.3
Skewness                           2.8882
Kurtosis                           11.7192
N                                  108
Minimum                            128.0
1st Quartile                       296.0
Median                             642.0
3rd Quartile                       1587.3
Maximum                            7231.0
95% Confidence Interval for Mean   846.3 to 1291.5
95% Confidence Interval for Median 433.9 to 789.0
95% Confidence Interval for StDev  1029.5 to 1347.6

Summary for Adjusted Consumption

Anderson-Darling Normality Test
A-Squared                          3.87
P-Value                            < 0.005
Mean                               1918.2
StDev                              1193.3
Variance                           1423851.4
Skewness                           1.68183
Kurtosis                           3.49480
N                                  108
Minimum                            295.9
1st Quartile                       1087.6
Median                             1626.5
3rd Quartile                       2482.3
Maximum                            6113.2
95% Confidence Interval for Mean   1690.6 to 2145.8
95% Confidence Interval for Median 1443.4 to 1786.6
95% Confidence Interval for StDev  1052.6 to 1377.7

Appendix B.
Final Estimates of Parameters

Type   Coef      SE Coef   T       P
MA 1   -0.0206   0.0971    -0.21   0.833

MS = 37846422882

Modified Box-Pierce (Ljung-Box) Chi-Square statistic

Lag          12      24      36      48
Chi-Square   25.1    32.5    37.4    52.8
DF           11      23      35      47
P-Value      0.009   0.09    0.361   0.259

The MA(1) coefficient has no explanatory power in this model. The normal probability plot and the residual frequency histogram presented by Minitab gave no reason to question the normality of the errors. However, the LBQ statistic at lag 12 is significant (p = 0.009), which implies that the residuals are serially correlated, so this model should be discarded.

Final Estimates of Parameters

Type   Coef     SE Coef   T      P
AR 1   0.7170   0.2051    3.50   0.001
MA 1   0.8393   0.1561    5.38   0.000

MS = 37092726953

Modified Box-Pierce (Ljung-Box) Chi-Square statistic

Lag          12      24      36      48
Chi-Square   22.2    28.6    34.3    49.4
DF           10      22      34      46
P-Value      0.014   0.158   0.454   0.34

Both the autoregressive and the moving average coefficients are significant and should be included. The MS is extremely large and about the same as in the model above. The normal probability plot and the residual frequency histogram presented by Minitab gave no reason to question the normality of the errors. However, the LBQ statistic at lag 12 is again significant (p = 0.014), which implies that the residuals are serially correlated, so this model should be discarded.

Appendix C.

[Figure: Time Series Plot of Adjusted Consumption diff1. Adjusted Consumption diff1 against Index.]

The chart resembles the white noise look that we are searching for. The data seem to fluctuate around a mean of zero. The autocorrelation and partial autocorrelation correlograms provided below seem to imply the use of either an MA(1), MA(2), or any variation of an ARIMA model.
[Figure: Autocorrelation Function for Adjusted Consumption diff1 (with 5% significance limits for the autocorrelations). Autocorrelation against Lag.]

[Figure: Partial Autocorrelation Function for Adjusted Consumption diff1 (with 5% significance limits for the partial autocorrelations). Partial Autocorrelation against Lag.]

After receiving these results, I viewed the four in one plot to check for the normality of errors.

The corrected model suggested the use of an ARIMA(1,1,1) model, and, after some variations were tried, an ARIMA(1,1,2) performed extremely well. The results:

Final Estimates of Parameters

Type   Coef     SE Coef   T      P
AR 1   0.8116   0.1612    5.03   0.000
MA 1   0.5083   0.1666    3.05   0.003
MA 2   0.3665   0.0919    3.99   0.000

MS = 158531

Model:

\[ \hat{Y}_t = 0.8116\,Y_{t-1} + \varepsilon_t - 0.5083\,\varepsilon_{t-1} - 0.3665\,\varepsilon_{t-2} \]

From the table we see that the autoregressive coefficient and both moving average coefficients are significant above the 99% level. This tells us that not only is the past value important for determining the next one, but the errors of the past two values are important as well. Even though the MS has been reduced, we cannot compare it to the MS of the models in Appendix B, since the transformation has been done.

Modified Box-Pierce (Ljung-Box) Chi-Square statistic

Lag          12      24      36      48
Chi-Square   9.1     19.8    21.9    31.8
DF           9       21      33      45
P-Value      0.427   0.537   0.929   0.932

The LBQ statistics do not raise any flags, so we know the residual autocorrelations as a group are insignificant. Confirmation of this is also provided in the four in one plot and the residual autocorrelation correlogram below.
[Figure: Residual Plots for Adjusted Consumption. Four panels: Normal Probability Plot (Percent against Residual), Versus Fits (Residual against Fitted Value), Histogram (Frequency against Residual), Versus Order (Residual against Observation Order).]

The errors seem to be normally distributed according to the normal probability plot. The residual histogram appears to be normal when we consider that the last bar, all the way to the right in the histogram, represents a frequency of one. That is, about 1% of the 107 values taken into consideration lies that far out. The “versus order” plot gives us a time series of the residuals, and it has the appearance of white noise. The last check of residuals is provided below in the correlogram of the residuals.

[Figure: Autocorrelation Function for RES of Adjusted Consumption (with 5% significance limits for the autocorrelations). Autocorrelation against Lag.]

Here are the forecasts from the ARIMA(1,1,2) model along with the copper prices in the respective time periods. I could not find any futures contracts farther into the future than 2011.

Forecasts of Adj. Cons. from period 2007

Period   Forecast   Lower      Upper
2008     388.36     -392.19    1168.91
2009     454.17     -828.08    1736.41
2010     507.58     -1072.54   2087.69
2011     550.93     -1241.94   2343.80
2012     586.11     -1371.19   2543.41

Copper Pricing

Year   Price/pound
2008   3.2*
2009   2.01**
2010   2.0247**
2011   2.0285**

* USGS
** COMEX

The next table contains the forecasts calculated using the formulas:

(Price in Dollars ÷ Pound) ÷ (Metric Ton ÷ Pound) = Price in Dollars / Metric Ton = Exp. Price

Exp. Price × Adj. Cons. = Consumption

Forecasts of Consumption in metric tons from period 2007

Period   Forecast   Lower     Upper
2008     8767403    -799052   26388673
2009     4045266    -944343   15466103
2010     4587354    -209224   18867915
2011     4997849    -405955   21262145

As we can see, even though the model performed well in terms of significance, the forecasts seem to be extremely wrong. Here is the amended time series plot including forecasts of Adjusted Consumption:

[Figure: Time Series Plot of Adjusted Consumption. Forecast cons/dollar against Index.]

It seems to be predicting that copper demand will pick up in the near future. But when we see how this translates into the amended time series of Consumption…

[Figure: Time Series Plot of Consumption. Consumption against Index.]

We see that there is obviously some flaw in the model. In fact the USGS estimates total 2008 US copper consumption to be 2,000,000 metric tons, whereas this model predicts 8,767,403 metric tons.

Theory of model: Even though the model predicts copper demand at roughly 4 times the 2008 figure predicted by the USGS, there may be some merit to it. The data used in the model only go through 2007. This means that at the end of the data the US economy is in a steady climb upward and, most notably, the housing market is on fire. The model also accounts for copper prices being around $4 per pound in 2007. This model could be seen as a prediction of what would have happened if the economy, and especially the housing market, had continued upward while copper prices fell by half, since I used $2 copper to calculate the forecasted demand. Revisiting the plot above, it tells us that if the market had kept up its acceleration and copper prices had then suddenly fallen dramatically, demand for copper would have skyrocketed, which is what we would expect.

Appendix D.
Lower bound

Year   Average Price   Conversion Factor   Price/Metric Ton   Adjusted consumption   Lower Forecast Tonnage
2008   3.2             0.00045359          7054.829251        -460.1399517           -3246208.791
2009   2.048278        0.00045359          4515.703609        -989.7340626           -4469345.678
2010   2.027125        0.00045359          4469.068983        -1360.25908            -6079091.662
2011   2.0155          0.00045359          4443.440111        -1662.826319           -7388669.165

Upper bound

Year   Average Price   Conversion Factor   Price/Metric Ton   Adjusted consumption   Upper Forecast Tonnage
2008   3.2             0.00045359          7054.829251        1106.943848            7809299.837
2009   2.048278        0.00045359          4515.703609        1636.537959            7390120.367
2010   2.027125        0.00045359          4469.068983        2007.062976            8969702.892
2011   2.0155          0.00045359          4443.440111        2309.630215            10262703.54
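The lower and upper adjusted-consumption bounds in the tables above can be approximately reproduced from the reported ARIMA(0,1,1) estimates. The sketch below is an illustration under stated assumptions, not Minitab's own routine: it assumes the reported MA coefficient follows the convention y_t = e_t − θ·e_(t−1) on the differenced series, that the limits use a normal multiplier of 1.96, and that the h-step forecast variance is MS·(1 + (h − 1)(1 − θ)²), the standard result for an ARIMA(0,1,1) without a constant. The function names are hypothetical.

```python
import math

# Reported estimates from Section V.
THETA = -0.3449               # MA(1) coefficient (assumed convention: e_t - THETA * e_{t-1})
MS = 159749.0                 # residual mean square
POINT_FORECAST = 323.4019481  # flat forecast of adjusted consumption
Z_95 = 1.96                   # assumed normal multiplier for 95% limits


def forecast_std_error(horizon):
    """Standard error of the h-step-ahead forecast for ARIMA(0,1,1), no constant."""
    psi = 1.0 - THETA  # every psi-weight after the first equals 1 - THETA
    return math.sqrt(MS * (1.0 + (horizon - 1) * psi ** 2))


def forecast_interval(horizon):
    """Approximate 95% lower and upper limits around the flat point forecast."""
    half_width = Z_95 * forecast_std_error(horizon)
    return POINT_FORECAST - half_width, POINT_FORECAST + half_width


for horizon, year in enumerate(range(2008, 2013), start=1):
    lower, upper = forecast_interval(horizon)
    print(year, round(lower, 1), round(upper, 1))
```

The point forecast stays flat while the interval widens with the horizon, which is why the lower bound, and therefore the lower forecast tonnage, becomes increasingly negative.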