Using Weighted Least Squares to Deflate Input-Output Tables§

Published in Economic Systems Research, September 2008

Giorgio Rampa
Dipartimento “Casaregi” – Sezione Economica
Via Balbi, 30
16126 Genova
University of Genoa
Italy
[email protected]

Abstract

This article proposes a balancing procedure for the deflation of I-O tables from the viewpoint of users. It is a “Subjective” variant of the Weighted Least Squares (WLS) method, already known in the literature. It is argued that it is more flexible than other methods, and it is shown that SWLS subsumes the first-order approximation of RAS as a special case. Flexibility is due to the facts that (a) users can attach differential “reliability” weights to first (unbalanced) estimates, depending on the confidence they have in the different parts of their pre-balancing work, (b) unlike with RAS, one is not bound to take any row or column total as exogenously given, and (c) additional constraints can be added to it. The article also describes how SWLS was utilised to estimate a yearly (1959-2000) series of constant-price I-O tables for the Italian economy.

KEYWORDS: I-O Tables, Deflation, Balancing, Weighted Least Squares, Data reliability

§ The author is indebted to Italo Lavanda for helpful observations. Many thanks are also due to the Editor of the Journal and to two anonymous referees for comments and suggestions that helped to hopefully clarify many previously obscure points. Remaining errors are mine.

1. I-O table deflation: squaring circles?

To be meaningful, almost all analyses of structural change^1 need constant-price Input-Output tables^2 pertaining to years not too near each other in time. Therefore, constant-price tables spanning long periods of time should be made available to I-O users. I-O tables, however, are not regularly published in both current and constant prices, even if this is suggested by current National Accounting Systems (including the European System of Accounts, ESA).
It follows that I-O users are often forced to compute deflated tables themselves. This article adopts the user’s viewpoint, assuming that the only data available for home-made deflation are those usually published, namely:

a. value added and output at current and constant prices (and thus price indexes of both): the classification of sectors is usually more aggregated than that appearing in I-O tables;
b. household consumption at current and constant prices, classified however by consumption categories and not by producing sectors;
c. highly aggregated figures of gross fixed capital formation, again at current and constant prices;
d. external trade in current values and physical quantities, aggregated differently from I-O sectors, and excluding services;
e. finally, and obviously, the current-price I-O tables to be deflated, referring possibly to only a few years, and aggregated in such a way (usually 40 to 60 sectors) that heterogeneous products are grouped on single rows.

1 By this we mean, as usual, studying changes in input (both domestic and imported) and labour coefficients, changes in the composition of total output and final demand, and changes in all indicators derived from those coefficients/shares.

2 There are some exceptions to this. For instance, if one is prepared to assume product homogeneity along each row of the I-O tables, it is not difficult to show that vertically integrated labour requirements (i.e. labour contents in final demand items) do not depend on the price-index vector, that is, current-price and constant-price computations give the same results.

Given the long-period requirement, it might be necessary to derive type a-d data from different vintages of national accounts, often characterised by revisions of accounting figures and by changes in sector classification and aggregation.
In addition, sector aggregation may differ between I-O tables and national accounts, and it is often the case that previously published tables are not revised according to subsequent national account revisions. All this means that even obtaining a coherent series of current-price tables might cause trouble.

Supposing however that current-price tables are available for the period of interest, users wishing to deflate tables must use price and quantity indexes to start with, and such indexes must be extracted from the different accounting vintages, whose problems were described above. Moreover, users have very little or no information on the product mix of each cell of I-O tables and are bound to start with a homogeneity assumption, i.e. price uniqueness along any row. All this implies that the data readily available, or roughly estimated, for home-made deflation cannot be deemed to form an internally coherent set: in particular, price and quantity indexes might not agree with each other for certain periods and/or sectors, leading to unbalanced estimates.

So users could choose between two opposing routes: either they “believe firmly” in chained price indexes of resources (domestic and imported), as computed from available data, transferring residual unbalances to quantities; or they believe firmly in the chained time series of official constant-price value added, transferring residual unbalances to prices. Both routes leave the user with possibly important errors: in the first case the deflated value added diverges from real value added as recorded in published national accounts; in the second case computed output deflators diverge from those implicit in published data^3. The meaning of such tension between price and “real value added” indexes will be made precise in section 2. It is as if users, wishing to comply with all published data, were vainly trying to “square a circle”.
3 It would be a different matter, of course, if a user had privileged and very reliable information on some of the items a-d above for the whole period and for all sectors to be treated: see e.g. Dietzenbacher-Hoen (1998).

This paper therefore proposes an intermediate procedure for deflating I-O tables, one that takes into account both sides (price and quantity indexes computed from published data). The result of this procedure is certainly not a perfect square, but could be a satisfactory approximation to some fairly regular polygon. The procedure (a WLS method with subjective reliability weights, SWLS) is to some extent known to I-O practitioners, and is reviewed in section 3. It is shown that the RAS method, up to first-order approximation, can be seen as a special case of SWLS, since the latter imposes weaker constraints than the former does (section 4). Section 5 describes the use of the proposed procedure to deflate a series of Italian I-O tables (1959-2000), and reports some summary statistics of the resulting ex-post errors. Section 6 concludes.

Before continuing, let us comment briefly on the meaning of “real value added”. It might be argued, in fact, that the circle-squaring problem hides more than a lack of information: the argument is not new, and various authors have already raised some of the points that I am going to discuss, starting from Arrow (1974); see also Durand (2001) and Febrero-De Juan (2002) for surveys and critical discussions. The notion of “real value added” is indeed ambiguous, especially when one tries to interpret it as an “output” concept. Value added is in theory the balance of the production account: value of output minus value of intermediate inputs. The same meaning should be retained in the constant-price version, giving rise to double deflation: output and intermediate inputs are evaluated in the base-year prices, and value added is their balance.
It can happen that the movements of doubly deflated value added do not reflect the movements of constant-price output, especially in periods (e.g. the Seventies and Eighties, after the oil shocks) when relative prices undergo wide fluctuations and hence firms significantly modify their I-O coefficients: for a precise statement of this point, see Rampa (2002). This should come as no surprise: imagine what the revenues from production (roughly, wages plus profits) would have been for steel producers in the Sixties, had oil prices been those of the Eighties while they continued to use the same type and amount of intermediate energy products. If the current year is far (in time and in relative prices) from the base year, doubly deflated value added could even be negative in some cases, simply because of the very notion of constant-price value added.

Many applied economists tend however to interpret real value added directly as an output concept, and want to see its time series behaving smoothly: they would not accept a doubly deflated value added that did not keep in line with other output concepts (gross output or final sales). Indeed, for many of them, and for almost all of them in past decades, constant-price value added was the only information about sectors’ output, and there was (is?) a tendency to forget that the identity between value added and (final) output holds true only for the aggregated economy. As often happens in social affairs, expectations can affect reality: since economists were expecting “meaningful” real value added series, in some cases national accountants might have been tempted to comply with this expectation. In fact, one hardly finds official real value added series that are not very smooth even at a very disaggregated level, and even in periods of high relative-price fluctuations.
This is a puzzle, and it can be suggested that applied economists should eventually refrain from interpreting constant-price value added as an output concept. If the arguments put forward above contain a grain of truth, it follows that additional caution is needed when inserting published long-period real value added figures into deflated I-O tables. This brings us back to our more practical matters.

2. Two opposite, single-sided, ways of deflating I-O tables

Consider a current-price I-O table of the “symmetric” form, referring to any of the years belonging to the time span the user is interested in, with n products and sectors and k different types of final demand. In matrix form, one can write:

$$\begin{array}{ccc} X & Y & t \\ v' & - & - \\ m' & - & - \\ t' & - & - \end{array}$$

where $X$ is the n by n matrix of intermediate flows, $Y$ is the n by k matrix of final uses, $v'$ is the n row-vector of value added (primes denote vector transposition), $m'$ is the n row-vector of imports, and $t$ is the n-vector of total resources (that is, domestic output plus imports). One might want to distinguish flows explicitly according to their origin, domestic or imported (assuming equality of the number of products for both), so writing:

$$\begin{array}{ccc} X^d & Y^d & x \\ X^m & Y^m & m \\ v' & - & - \\ m' & - & - \\ t' & - & - \end{array} \qquad (1)$$

The indexes “d” and “m” refer to domestic and imported goods, respectively. One has $X^d + X^m = X$, $Y^d + Y^m = Y$, and $x + m = t$. These quantities fulfil the usual national accounting identities. If $u_r$ is the sum vector, composed of r ones, we write such identities as:

a) $u_n' X^d + u_n' X^m + v' = x'$
b) $X^d u_n + Y^d u_k = x$
c) $X^m u_n + Y^m u_k = m$
d) $x + m = t$

If a user wishes to deflate this table, which is usually a highly aggregated version of the virtual “true” I-O table of the economy, lack of privileged information binds her/him to assume that each row of the table contains a single homogeneous product, or a fixed product mix: a single deflator is taken for each of the first 2n rows of matrix (1) above. Call $\delta$ and $\mu$ the n-vectors of the deflators (i.e.
reciprocal of price indexes) of domestic output and of imports, respectively. By assumption, the accounting constraints (a-d) must be fulfilled by deflated data as well.

Two methods might be envisaged for deflating the current-price table (1): these two methods are in a sense polar to each other. The first one, call it “Route I”, is double deflation^4. Take the officially published deflators as given, and call $\underline{\delta}$ and $\underline{\mu}$ the output and import deflators, respectively. The doubly deflated constant-price value added, say $\bar{v}_r$, is obtained simply as

$$\bar{v}_r' = \underline{\delta}' \hat{x} - \underline{\delta}' X^d - \underline{\mu}' X^m \qquad (2)$$

where a hat over a vector stands for diagonalisation (notice that an under-bar denotes official items, while an over-bar denotes computed ones). If one deflates final demand as well, treating domestic and imported products separately with deflators $\underline{\delta}$ and $\underline{\mu}$ respectively and retaining row-homogeneity, one finds that the resulting deflated table fulfils constraints (a-d). However, sectoral real value added estimated under Route I is usually different, and possibly much so, from officially published constant-price value added. This is due mainly to the row-homogeneity assumption (see Appendix 1 for a simple 2x2 example).

4 Alternative routes to double deflation were proposed e.g. by David (1962), Sato (1976) and Durand (1994); the first one is more simple-minded, assuming equality of output and value added price indexes. From our current perspective, they all share with double deflation the feature that one starts with exogenously given price indexes, and “solves for” deflated value added.

The second route for home-made deflation (“Route II”) goes the other way around, being an “inverse” method with respect to the first one. The starting point is the “quantity” side, that is sectoral constant-price value added as officially published, which is now taken as given. One then “solves for” output price deflators, requiring that these are coherent with official real value added and with other published data, and that the resulting deflated table fulfils constraints (a-d).

Route II would go as follows. Take the real value added vector, say $\underline{v}_r$, and the deflator vector of imports, $\underline{\mu}$, both derived from official sources and considered as given. Compute afterwards the vector of implicit value added deflators, say $\upsilon$, such that $\upsilon' \hat{v} = \underline{v}_r'$ (recall that $v$ is the current-price value added, known to the user by assumption). The problem, now, is finding a vector of domestic output deflators, say $\bar{\delta}$, such that the following accounting equality, that is constraint (a) above, holds true:

$$\bar{\delta}' X^d + \underline{\mu}' X^m + \upsilon' \hat{v} = \bar{\delta}' \hat{x} \qquad (3)$$

Solving for $\bar{\delta}$ is easily accomplished. Define in fact the current-price coefficient matrices (vectors) of domestic and imported inputs, and of value added, as $A^d = X^d \hat{x}^{-1}$, $A^m = X^m \hat{x}^{-1}$, and $w' = v' \hat{x}^{-1}$ respectively. Post-multiply expression (3) by $\hat{x}^{-1}$. After a couple of passages, and observing that $\hat{w} = \hat{v} \hat{x}^{-1}$, one obtains

$$\bar{\delta}' = \left( \underline{\mu}' A^m + \upsilon' \hat{w} \right) \left( I - A^d \right)^{-1}. \qquad (4)$$

Notice, in passing, a property of result (4). The columns of input and value added coefficients sum up to one: this can be written as $u' A^m + u' \hat{w} = u' - u' A^d = u' (I - A^d)$, whence one derives $(u' A^m + u' \hat{w})(I - A^d)^{-1} = u'$. Therefore expression (4) can be interpreted as saying that the estimated domestic output deflators, $\bar{\delta}$, are a sort of “weighted average” of import and value added deflators. Since the weights change in time, being the current-price coefficients of the table to be deflated, these are Paasche-type deflators.

The deflators obtained with (4) are finally used to compute the constant-price intermediate and final flows, under row-homogeneity, and the domestic outputs, while the imported ones are deflated by means of the official $\underline{\mu}$. These deflated data clearly fulfil constraints (a-d)^5.
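The two routes can be illustrated numerically. The sketch below is only a toy illustration of formulas (2) and (4) on an invented 2-sector table: all figures, the "official" deflators, and the function names `route_one` and `route_two` are assumptions made for the example, not data or code from the paper. It also checks the weighted-average property of (4) and shows that feeding Route I's value added into Route II returns the deflators one started from.

```python
import numpy as np

# Toy current-price data for n = 2 sectors (all figures are invented).
Xd = np.array([[20.0, 30.0], [10.0, 40.0]])   # domestic intermediate flows X^d
Xm = np.array([[5.0, 10.0], [5.0, 10.0]])     # imported intermediate flows X^m
v = np.array([60.0, 110.0])                   # current-price value added
x = Xd.sum(axis=0) + Xm.sum(axis=0) + v       # output, so that identity (a) holds

# Current-price coefficients: each column j is divided by x[j].
Ad, Am, w = Xd / x, Xm / x, v / x
I = np.eye(len(x))

def route_one(delta, mu):
    """Eq. (2): real value added implied by given output and import deflators."""
    return delta * x - delta @ Xd - mu @ Xm

def route_two(v_real, mu):
    """Eq. (4): output deflators implied by given real value added."""
    upsilon = v_real / v                      # implicit value added deflators
    rhs = mu @ Am + upsilon * w
    # delta' (I - A^d) = rhs'  is solved as  (I - A^d)' delta = rhs
    return np.linalg.solve((I - Ad).T, rhs)

# Hypothetical "official" figures, chosen so that the two sides disagree.
delta_off = np.array([0.80, 0.90])
mu_off = np.array([0.70, 0.85])
v_real_off = np.array([50.0, 100.0])

v_bar = route_one(delta_off, mu_off)          # Route I: differs from v_real_off
delta_bar = route_two(v_real_off, mu_off)     # Route II: differs from delta_off
delta_round_trip = route_two(v_bar, mu_off)   # consistent inputs: back to delta_off
weights = np.linalg.solve((I - Ad).T, Am.sum(axis=0) + w)  # should be all ones

print(v_bar, delta_bar, delta_round_trip, weights)
```

With these made-up numbers, neither route reproduces the other side's official figures, which is the tension the text describes.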
5 All the methods quoted when introducing double deflation above, apart from David (1962), could in principle be “inverted” to obtain something similar to the second route we are discussing. Route II is more in the vein of, but still different from, a linear version of Sato (1976), fixed capital not being considered. See also Febrero-De Juan (2002) for critical comments on Sato (1976), which are however outside the scope of the present paper.

Now, using Route II for deflation one faces a problem that is symmetric to that faced when using Route I: sectoral output deflators estimated in this way are usually different, and possibly much so, from those implicit in published data, i.e. one has now $\bar{\delta} \neq \underline{\delta}$, while one had $\bar{v}_r \neq \underline{v}_r$ under Route I. This depends again on having assumed, among other things, row-homogeneity (see again Appendix 1 for an example).

Concluding: if the user firmly believes in official output deflators (respectively, real value added), then the resulting deflation procedure leaves her/him with important errors in sectoral real value added (respectively, output deflators). A way out of this difficulty lies in considering all official indexes (price and quantity) as subject to possible, but less pronounced, modification under deflation, instead of taking one of the two sides as exogenously given and transferring all the errors to the other side. In other terms, one looks for a method lying midway between Routes I and II described above: a method that, while preserving the accounting constraints, does not bind the user to deviate extremely from either official real value added or official price indexes. To this we turn now.

3. A Subjective Weighted Least Squares approach

The procedure we are about to describe goes through two steps. In the first step the user, using available official sources of information, obtains a provisional estimate of the items to be inserted in the deflated table.
For instance, intermediate flows are deflated provisionally assuming row-homogeneity, separately for domestic and imported products. Final demand deflation can exploit additional information on consumption, investment and export specific deflators, again separately for domestic and imported products. Total output and imports may either be deflated by means of chained price indexes, or computed via chained quantity indexes, or through some average of the two. Real value added can be computed using published chained constant-price data (i.e. quantity indexes). The user may insert additional information, coming from various sources, to estimate single parts, or even single cells, of the constant-price table. Notice that deflated row and column sums also appear among the items provisionally deflated.

Call “first estimate” the provisional result of the first step of the procedure. Using a tilde to denote first estimates (recall that we are speaking of deflated figures), define the matrix

$$\tilde{Z} = \begin{bmatrix} \tilde{X}^d & \tilde{Y}^d \\ \tilde{X}^m & \tilde{Y}^m \\ \tilde{v}' & 0_k' \\ \tilde{x}' & 0_k' \\ \tilde{m}' & 0_k' \end{bmatrix} \qquad (5)$$

where $0_k$ is the null k-vector, and all other symbols were defined above. Because of the reasons discussed in section 1, first estimates (5) generally do not obey the accounting constraints; such constraints, if fulfilled, would be written as:

$$\begin{aligned} u_n' X^d + u_n' X^m + v' - x' &= 0_n' \\ X^d u_n + Y^d u_k - x &= 0_n \\ X^m u_n + Y^m u_k - m &= 0_n \end{aligned} \qquad (6)$$

The number of cells in $\tilde{Z}$ is $m = (2n + 3)(n + k)$, including the null vectors appearing in (5); the number of linear constraints (6) is instead 3n. We pass now to a more compact notation. Define the m-vector $z = \mathrm{vec}(Z)$, that is $z' = (z_1', z_2', \ldots, z_{n+k}')$, $z_j$ being the j-th column of $Z$, $j = 1, \ldots, n + k$; define similarly $\tilde{z} = \mathrm{vec}(\tilde{Z})$. After this, we can write the constraints (6) compactly as

$$G z = 0 \qquad (7)$$

where $G$ is a 3n by m matrix whose elements are all 1, −1 and 0 in proper positions; the size of the null vector on the right-hand side of (7) is 3n.
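One possible way of assembling such a matrix is sketched below. It is only a hedged illustration that assumes the row layout of (5) and column stacking for vec; the function name `build_G`, the helper `idx` and the random test table are hypothetical and are not taken from the paper or from Rampa (2007b).

```python
import numpy as np

def build_G(n, k):
    """Constraint matrix for z = vec(Z), with Z laid out as in (5), columns stacked."""
    rows, cols = 2 * n + 3, n + k
    row_v, row_x, row_m = 2 * n, 2 * n + 1, 2 * n + 2

    def idx(i, j):
        return j * rows + i                    # position of cell (i, j) in vec(Z)

    G = np.zeros((3 * n, rows * cols))
    for j in range(n):
        # column identity: sum_i Xd[i,j] + sum_i Xm[i,j] + v[j] - x[j] = 0
        for i in range(2 * n):
            G[j, idx(i, j)] = 1.0
        G[j, idx(row_v, j)] = 1.0
        G[j, idx(row_x, j)] = -1.0
    for i in range(n):
        for j in range(cols):
            G[n + i, idx(i, j)] = 1.0          # row i of [Xd, Yd]
            G[2 * n + i, idx(n + i, j)] = 1.0  # row i of [Xm, Ym]
        G[n + i, idx(row_x, i)] = -1.0         # minus domestic output x[i]
        G[2 * n + i, idx(row_m, i)] = -1.0     # minus imports m[i]
    return G

# Check on a small table that is balanced by construction (invented numbers).
n, k = 2, 1
rng = np.random.default_rng(0)
Xd, Xm = rng.uniform(1, 5, (n, n)), rng.uniform(1, 5, (n, n))
Yd, Ym = rng.uniform(20, 30, (n, k)), rng.uniform(1, 5, (n, k))
x = Xd.sum(axis=1) + Yd.sum(axis=1)            # identity (b)
m = Xm.sum(axis=1) + Ym.sum(axis=1)            # identity (c)
v = x - Xd.sum(axis=0) - Xm.sum(axis=0)        # identity (a)

Z = np.zeros((2 * n + 3, n + k))
Z[:n, :n], Z[:n, n:] = Xd, Yd
Z[n:2 * n, :n], Z[n:2 * n, n:] = Xm, Ym
Z[2 * n, :n], Z[2 * n + 1, :n], Z[2 * n + 2, :n] = v, x, m

z = Z.flatten(order="F")                       # column stacking, i.e. vec(Z)
G = build_G(n, k)
print(G.shape, np.abs(G @ z).max())
```

Under this layout G has full row rank, because each block of constraints contains cells that appear in no other constraint.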
We do not write down the precise form of G, which is logically very simple but at the same time tedious and space consuming. The interested reader can contact the present author for the details, or see Rampa (2007b, Appendix 1).

As we said, first estimates do not fulfil the constraints (7), i.e. one has $G \tilde{z} \neq 0$. This leads to the second step of the procedure. The balancing of first estimates in order to obtain final estimates satisfying $G z = 0$ can be implemented using a technique, a sort of Weighted Least Squares (WLS) method, developed initially by Stone et al. (1942); see also Stone (1961), Byron (1978), Van der Ploeg (1982) and Weale (1988). This technique has been applied mainly in order to estimate current-price I-O tables starting from recent national account data and from tables of previous years^6. Here the use of WLS is proposed, as suggested also by Weale (1988), to balance the first estimates (5) of constant-price tables in order to fulfil constraints (7).

This technique is a constrained least-squares estimation, the constraints coming from the accounting identities. The “least squares” aspect enters the picture because one is looking for final estimates that are not “too far” from initial estimates in the Euclidean-norm sense: the loss function to be minimised is then the sum of the squared changes made in all cells with respect to first estimates in order to attain $G z = 0$. The “weighted” side of the story, finally, comes from the fact that one wants to attach differential weights to the squares appearing in the loss function. Indeed, it is reasonable to think that the (squared) change prompted by the balancing procedure upon a first-estimated item should be “important” (i.e. weighted more heavily) in the loss function, if the user is highly confident in such first estimate; and vice-versa.
In other terms, the weight to be attached to a squared change should be inversely related to the uncertainty characterising the corresponding cell: this brings us to the variances of first estimates $\tilde{z}$, where $\tilde{z}$ are seen as the result of an imprecise first estimation (measurement errors, hidden disaggregated flows, etc.). In principle, variances should be known to national accountants, since they repeatedly estimate all items of the accounts by means of questionnaires and privileged sample information. Hence, while publishing means only, they should have information on variances as well. Actually, variances are not known to users: more on this will be said below, when introducing the subjective WLS.

6 Applications exist also in the field of estimation of social accounting matrices (Robinson et al. 2001) and of multiregional I-O models (Canning-Wang 2005).

This given, WLS can be described as follows. Let $V$ be the m by m diagonal matrix^7 containing the variances of the elements of $\tilde{z}$, and assume, for the time being, that this matrix is known. The problem to be solved, i.e. finding final estimates that are “weighted-next” to first estimates and fulfil the accounting constraints, can be written as:

$$\min_{z} \; (z - \tilde{z})' V^{-1} (z - \tilde{z}) \quad \text{s.t.} \quad G z = 0 \qquad (8)$$

The solution to this problem, call it $z^*$, is given by the Aitken estimator

$$z^* = \tilde{z} - V G' (G V G')^{-1} (G \tilde{z}). \qquad (9)$$

We said “the” solution, since it is clearly unique given strict convexity of the loss function and linearity of the constraint; indeed, the solution is linear in first estimates. Heuristically, the final estimates $z^*$ are equal to the initial ones, minus a term which takes into account initial unbalances $G \tilde{z} \neq 0$ and relative variances: in fact, the errors $G \tilde{z}$ appearing in the last round brackets of (9) are subtracted from first estimates according to the share of their variances in the “total variance” of each constraint, expressed by the term $V G' (G V G')^{-1}$.
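As a numerical illustration of the estimator (9), the following minimal sketch balances four invented first estimates subject to two made-up linear identities. The function name `wls_balance`, the constraint matrix and all numbers are assumptions chosen for the example. The sketch also verifies that the balanced vector satisfies the constraints, and that rescaling all variances by a common factor leaves the result unchanged.

```python
import numpy as np

def wls_balance(z_tilde, variances, G):
    """Eq. (9): z* = z~ - V G' (G V G')^{-1} G z~, with V = diag(variances)."""
    VGt = variances[:, None] * G.T             # V G', without forming the full V
    lam = np.linalg.solve(G @ VGt, G @ z_tilde)
    return z_tilde - VGt @ lam

# Two illustrative identities on four items: z1 + z2 = z3 and z3 = z4.
G = np.array([[1.0, 1.0, -1.0, 0.0],
              [0.0, 0.0, 1.0, -1.0]])
z_tilde = np.array([3.0, 4.0, 8.0, 7.5])       # unbalanced first estimates
variances = np.array([1.0, 2.0, 4.0, 1.0])     # assumed known, as in problem (8)

z_star = wls_balance(z_tilde, variances, G)
z_star_scaled = wls_balance(z_tilde, 10.0 * variances, G)

print("initial unbalances:", G @ z_tilde)
print("balanced estimates:", z_star)
print("residual unbalances:", G @ z_star)
```

Solving the small system in `G V G'` with `np.linalg.solve`, rather than inverting it explicitly, is a numerical choice of this sketch and not something prescribed by the text.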
From (9) one can easily check that indeed $G z^* = 0$, and that multiplication of matrix $V$ by any scalar leaves the result unaffected: only relative variances matter.

7 More generally, one could take co-variances into account as well, so passing from a “weighted” to a “generalized” version of the method. We will however limit ourselves to the weighted case.

If the constraint of problem (8) were of the type $G z = h$ instead of $G z = 0$, where $h$ is some fixed vector, the solution would be $z^* = \tilde{z} - V G' (G V G')^{-1} (G \tilde{z} - h)$. Notice the different roles played by $\tilde{z}$ and $h$ in this last expression: the variables $\tilde{z}$ are subject to the balancing procedure, while $h$ is taken as exogenously given. This case can be subsumed under problem (8): in fact one can write the constraints $G z = h$ as $G^+ z^+ = 0$, where $z^+ = (z', h')'$ and $G^+$ is formed from $G$ by adding columns composed of 0’s and −1’s in proper positions. The variance matrix $V$ must be extended accordingly. As regards the loss function, one requires that the distance between the last part of vector $z^+$ and vector $h$ is not only minimised by the solution, but is zero. This is accomplished by taking the variances of the part $h$ of vector $z^+$ equal to zero^8, coherently with the assumption that $h$ is given. This point will be exploited in the next section.

The WLS procedure can be given an alternative interpretation, suggested also by Stone (1982)^9. The $V$ matrix is usually unknown to the user; however she/he may have subjective opinions about the reliabilities of first estimates, that is about the inverses of their variances: this is coherent with a Bayesian approach to the problem, where the term “reliability” would mean “precision”. The user maintains subjective opinions because she/he knows how first estimates were built, and knows the flaws of adapting certain official data to the I-O deflation objective, as discussed in section 1 above.
The user might attach, e.g., lower reliabilities to items that she/he was bound to disaggregate somehow in order to pass from the national-accounts classification to the I-O one; a similar problem arises when different items are provisionally deflated with common price indexes. On the contrary, high reliabilities are in order when there is a one-to-one correspondence between national accounting and I-O items. As a further example, low reliabilities should be attached to sectoral inventory changes, which are not published in a disaggregated fashion by national accounts^10. And so on. Of course, if privileged information (coming e.g. from national accountants) is available about the reliability of some items, it should be exploited.

8 From the numerical point of view, in order to avoid terms going to infinity in the loss function, these variances should be set equal to some very small number ε.

9 See also Lahr (2001) and Dalgaard-Gysting (2004).

Now, reliabilities pertain to the perceived “quality” of first-estimated items, and should be interpreted as scale-free indicators. In order to turn them into something similar to variances, one has to couple them with the levels of the items to be balanced: indeed, it is reasonable to assume that larger terms are allowed to change more heavily under balancing, for given reliability. A similar idea was suggested by Hendrickson-McNeil (1984) and Bartholdy (1987); see also Round (2003). Define, thus, $\hat{r}$ as the diagonal matrix of inverse reliability indicators, that is, higher figures indicate lower reliabilities. The “subjective” variance matrix of the loss function can then be defined as $V = \hat{r} \hat{\tilde{z}}$.^11 After this, one proceeds as described by (8) and (9) above^12.

This subjective interpretation of WLS should be the preferred one from the point of view of users, given their lack of privileged and detailed information on objective variances. One can call “subjective WLS” (SWLS) the resulting procedure.
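The role of the inverse reliability indicators can be seen in a minimal one-sector sketch, in which intermediate inputs plus value added must equal output. Everything in it is an assumption made for illustration: the numbers, the function name `swls_balance`, the use of absolute values so that negative items (e.g. inventory changes) would still get a positive variance, and the small floor `eps` standing in for zero variances in the spirit of footnote 8.

```python
import numpy as np

def swls_balance(z_tilde, r, G, eps=1e-12):
    """SWLS sketch: variances r_i * |z~_i| (floored at eps), then estimator (9)."""
    variances = np.maximum(r * np.abs(z_tilde), eps)
    VGt = variances[:, None] * G.T
    lam = np.linalg.solve(G @ VGt, G @ z_tilde)
    return z_tilde - VGt @ lam

# Items: intermediate inputs, real value added, real output (invented figures).
# Identity: inputs + value added - output = 0; first estimates miss it by 5.
G = np.array([[1.0, 1.0, -1.0]])
z_tilde = np.array([60.0, 45.0, 100.0])

r_equal = np.array([1.0, 1.0, 1.0])            # same reliability everywhere
r_trust_v = np.array([1.0, 0.01, 1.0])         # value added deemed very reliable

z_equal = swls_balance(z_tilde, r_equal, G)
z_trust_v = swls_balance(z_tilde, r_trust_v, G)

print("equal reliabilities :", z_equal)
print("value added trusted :", z_trust_v)
```

With equal indicators the unbalance is spread in proportion to the levels of the three items; when value added is given a much smaller inverse reliability it is left almost unchanged, and the adjustment falls on the other two items.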
Using SWLS enables the user to choose the degree of belief she/he maintains in each of the provisionally deflated items $\tilde{z}$, without any commitment to “single-sided” hypotheses: this intermediate route goes some way towards solving the problems raised in the previous sections. Indeed, Routes I and II described in section 2 boil down to attaching zero variance, in problem (8), either to flows deflated with official price indexes, or to official real value added. SWLS, on the contrary, does not compel the user to make such extreme hypotheses, unless she/he is endowed with very reliable information on one of the two sides.

10 As regards this point, if a “generalised” method were used considering co-variances as well, one might assume positive (resp. negative) co-variances between sectoral inventory changes and total output (resp. total demand).

11 V is diagonal, so we are using weighted, not generalized, LS. But neither $\hat{r}$ nor $V$ is a scalar multiple of the identity matrix, which would lead to OLS instead of WLS. Recall also that the solution (9) is homogeneous of degree zero in V: only relative reliabilities and variances matter. Finally, the variances of null elements are set equal to zero (see however footnote 8 above): this prevents the procedure from producing unwanted negative results.

12 Interpreting V as a Bayesian “prior” on variances, one can show that the corresponding “posterior”, i.e. the covariance matrix of balanced estimates, is the positive semi-definite matrix $V - V G' (G V G')^{-1} G V$, which is “lower” (in the positive-definite-matrix ordering) than V.

4. SWLS compared with RAS

The bi-proportional, or RAS, approach (see Stone, 1961 and Bacharach, 1970; for a recent review of bi-proportional methods, refer to De Mesnard, 2004) is well known and widely used (more than the WLS one) by I-O practitioners, including some of those engaged in table deflation, like Dietzenbacher-Hoen (1998): this last paper suggests that RAS can be used as a way to overcome some of the difficulties we raised in section 1, linked to row-homogeneity^13.

It is well known that, by means of iterative row and column multiplications, the RAS method purports to modify the first estimates of an I-O table^14, in order to obtain balanced estimates that satisfy certain given column and row sums. Let us write the RAS constraints in matrix notation. Call $\tilde{Z}^R$ the first estimate of the table to be RAS-balanced, with elements $\tilde{z}^R_{ij}$, not including row and column sums; form the vector $\tilde{z}^R = \mathrm{vec}(\tilde{Z}^R)$; define $h' = [\rho', \kappa']$, $\rho$ and $\kappa$ being the given row and column sums. Then, the constraints can be expressed as $G^R z^R = h$, where $z^R$ has the same dimension as $\tilde{z}^R$, and $G^R$ is a properly defined matrix of zeros and ones in due positions. This given, as shown by Bacharach (1970) the RAS-balanced table solves the following minimum problem:

$$\min_{z^R} \; \sum_{i,j} z^R_{ij} \ln \frac{z^R_{ij}}{\tilde{z}^R_{ij}} \quad \text{s.t.} \quad G^R z^R = h \qquad (10)$$

The loss function of (10), while criticized by some for lack of clarity (Lecomber 1975), can be interpreted as the “information gain” obtained passing from first estimates to RAS-balanced ones (see Theil 1967, Bacharach 1970 and the survey in Robinson et al. 2001). Some call it “information discrimination”, after Kullback-Leibler (1951); referring to the latter paper, Robinson et al. (2001) speak of “cross entropy”. Its meaning is that one wants to minimise the “information gain”, i.e. to stay as near as possible to first estimates. Jackson-Murray (2004) describe alternative loss functions to be compared with RAS: their “Model 6” corresponds to WLS, whose loss function can be attributed to Friedlander (1961); their “Model 8” is RAS.

13 Dietzenbacher-Hoen (1999) prove that these difficulties are, as expected, grounded on aggregation problems.

14 In the RAS tradition, the “first estimate” might be directly a table of an adjacent year, or one built using the coefficients thereof and the gross outputs of the target year.

Now, one shows easily (see e.g. Bacharach, 1970; also Hewings-Janson, 1980) that, up to first-order approximation, the loss function appearing in (10) is equivalent to that of (8). In fact, one can prove that

$$\sum_{i,j} z^R_{ij} \ln \frac{z^R_{ij}}{\tilde{z}^R_{ij}} \approx \frac{1}{2} \sum_{i,j} \frac{\left( z^R_{ij} - \tilde{z}^R_{ij} \right)^2}{\tilde{z}^R_{ij}} + C,$$

where C is constant over the feasible set (the margins being given) and can thus be neglected in the minimum problem, as can the positive factor 1/2. Using matrix notation, the function $\sum_{i,j} (z^R_{ij} - \tilde{z}^R_{ij})^2 / \tilde{z}^R_{ij}$ (known also as “Pearson’s χ²”) can be written $(z^R - \tilde{z}^R)' (\hat{\tilde{z}}^R)^{-1} (z^R - \tilde{z}^R)$. Hence one sees a formal equivalence between the loss functions of (8) and (10), setting $V = \hat{\tilde{z}}^R$ in (8).

As regards the equivalence of constraints, we saw in section 3 that there exists a formal identity between writing them in the form $G^R z^R = h$ and in the form $G z = 0$, if one sets $z' = ((z^R)', h')$ and defines properly the matrix $G$. The form $G^R z^R = h$ appearing in problem (10) signals that vector $h$ is taken as given. When passing to the alternative form $G z = 0$, and hence to problem (8), this fact is formalised by attaching a zero variance to the elements of $h$. To avoid terms going to infinity in the loss function of (8), one sets the variance of $h$ equal to $\varepsilon I$, where the identity matrix has the same dimension as $h$ and $\varepsilon$ is a very small number.

Collecting the above material, we see that up to first-order approximation the RAS problem (10) can be written as follows:

$$\min_{z} \; (z - \tilde{z})' (V^R)^{-1} (z - \tilde{z}) \quad \text{s.t.} \quad G z = 0, \qquad z' = ((z^R)', h'), \qquad V^R = \begin{bmatrix} \hat{\tilde{z}}^R & 0 \\ 0' & \varepsilon I \end{bmatrix} \qquad (11)$$

where $0$ is a rectangular null matrix of proper dimension.

Comparing now (11) with (8), i.e. $V^R$ with $V$, and recalling that in the SWLS procedure we defined $V = \hat{r} \hat{\tilde{z}}$, one recognises that RAS can be seen as a special case of SWLS in a twofold sense.

First, suppose for a while that all SWLS reliability indexes are equal, that is matrix $\hat{r}$ is a scalar multiple of the identity matrix. In this case the (first-order approximated) RAS method is a particular instance of SWLS, in that it assumes the same weights (variances) as SWLS for the $z^R$ part of vector $z$, but adds the further constraint that the variances of part $h$ are equal to “zero”. Hence, having restricted the feasible set of the minimisation problem, RAS cannot do better than SWLS in terms of the value reached by the loss function to be minimised.

Second, going back to the more general hypothesis of differentiated reliabilities (i.e. $\hat{r}$ is a general diagonal matrix), SWLS appears to be more flexible than RAS. Indeed, one can attach different degrees of confidence to the different items to be balanced.

We have, then, two implications.

Implication I: unlike what happens with RAS, with SWLS the user is not bound to take any row and/or column sum as exogenously given, as if it were free of first-estimation errors; all first estimates that the user is uncertain about can be revised by the SWLS procedure, attaching of course possibly high reliabilities to some of them. This aspect is particularly interesting precisely when facing the problem of deflation of a long series of tables, where “margins” (i.e. column and row sums) are not known with certainty.

Implication II: greater flexibility derives from not assuming $\hat{r}$ proportional to the identity matrix, so that there is room for differentiated judgements on the quality of first estimates.
Indeed, it might well be the case that some high number appearing among first estimates is characterised by low uncertainty, thus deserving a high reliability: the (first-order approximation of) RAS, where all weights are proportional to first estimates, cannot manage this case15. In addition, as we shall see in the next section, further constraints can be added to the SWLS procedure if one wishes to incorporate other information bits in first estimates (this was suggested also by Robinson et al., 2001, with respect to RAS, or “cross entropy” minimization).

SWLS, like RAS (as observed also by Dietzenbacher-Hoen, 1998), has the advantage that different flows on the same row of a table are actually deflated by means of different price indexes: in fact the balancing procedure (9) modifies first estimates on any single row in a non-scalar way. Cell-specific price indexes are justified not only by cell-specific product mixes due to aggregation16, but possibly also by different market characteristics of the buying sectors.

Notice: we did not prove that SWLS achieves better results than other methods do in terms of nearness of balanced data to a known target17. In fact the SWLS loss function, like that of other balancing methods, considers distances from initial estimates. Owing to Implication II above, however, one expects good performance in all cases where large items are well first-estimated due to low uncertainty and this is accompanied by a high reliability: if the weights depended only on first estimates, balanced estimates could end up too far from true values.

15 Of course, if the user has reason to believe that row and column sums are perfectly known, and if reliabilities are all equal among themselves, then SWLS collapses to the first-order approximation of RAS.
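The equal-reliability case of footnote 15 can be illustrated numerically. The following is a minimal sketch under stated assumptions, not the author's routine: the 3x3 first estimate and the margins are hypothetical, the margins are imposed exactly (the limit ε → 0 of problem (11)), and the helper names `ras`, `solve` and `wls` are introduced here for illustration only. It may help to recall that a second-order expansion of z ln(z/z̃) around z̃ gives (z − z̃) + (z − z̃)²/(2z̃): once the margins are fixed the linear terms sum to a constant, and the factor 1/2 does not affect the minimiser.

```python
# Hypothetical 3x3 first estimate and target margins (not taken from the paper).
Z0 = [[40.0, 30.0, 30.0],
      [20.0, 50.0, 30.0],
      [30.0, 20.0, 50.0]]
ROWS = [104.0, 98.0, 102.0]   # given row sums (rho)
COLS = [92.0, 103.0, 109.0]   # given column sums (kappa), same grand total

def ras(z0, rows, cols, iters=500):
    """Classical RAS: alternate row and column scalings."""
    z = [r[:] for r in z0]
    for _ in range(iters):
        for i, r in enumerate(z):
            s = sum(r)
            z[i] = [x * rows[i] / s for x in r]
        for j in range(len(cols)):
            s = sum(r[j] for r in z)
            for r in z:
                r[j] *= cols[j] / s
    return z

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [a[i][:] + [b[i]] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x

def wls(z0, rows, cols):
    """Minimise sum (z - z0)^2 / z0 subject to the given margins."""
    n, m = len(z0), len(z0[0])
    zf = [x for r in z0 for x in r]   # first estimates stacked row by row
    v = zf[:]                         # diagonal of V: weights = first estimates
    # Constraints: every row sum plus all but one column sum
    # (the last column sum is implied by the others).
    g = [[1.0 if k // m == i else 0.0 for k in range(n * m)] for i in range(n)]
    g += [[1.0 if k % m == j else 0.0 for k in range(n * m)] for j in range(m - 1)]
    h = list(rows) + list(cols[:-1])
    gap = [h[c] - sum(g[c][k] * zf[k] for k in range(n * m)) for c in range(len(g))]
    gvg = [[sum(gi[k] * v[k] * gj[k] for k in range(n * m)) for gj in g] for gi in g]
    lam = solve(gvg, gap)
    # z = z0 + V G' (G V G')^-1 (h - G z0)
    z = [zf[k] + v[k] * sum(g[c][k] * lam[c] for c in range(len(g)))
         for k in range(n * m)]
    return [z[i * m:(i + 1) * m] for i in range(n)]

z_ras = ras(Z0, ROWS, COLS)
z_wls = wls(Z0, ROWS, COLS)
max_gap = max(abs(a - b) for ra, rb in zip(z_ras, z_wls) for a, b in zip(ra, rb))
print("largest |RAS - WLS| cell difference:", round(max_gap, 4))
```

With margin revisions of a few per cent, as here, the two balanced tables should differ only by second-order amounts; larger revisions would widen the gap, since the equivalence holds only up to first order.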
16 De Mesnard (2006), being interested in a different problem, namely measuring structural change by means of biproportional filters (among which RAS), proves that if no aggregation problem is present, then results are unaffected by using data in currency units instead of physical ones (or current-price data instead of constant-price ones). This is the “theoretical” side of the story: “…Nevertheless, a slight complication occurs because of aggregation” (ibid. p. 467). Since users live just in these slightly more complicated worlds, it follows that, as recommended by Dietzenbacher-Hoen (1998 and 1999), RAS is to be preferred to double deflation to compute constant-price tables. And, if the arguments put forward above are acceptable, SWLS should be a welcome route.
17 For instance, De Mesnard (2004), Jackson-Murray (2004), and Oosterhaven (2005) compare balanced tables with “true” ones, obtaining performance indicators of different balancing methods. This exercise was prevented in our case by the unavailability of official constant-price Italian I-O tables (to be considered in the next section).

5. A 1959-2000 series of constant-price Italian I-O tables

We will now describe an application of SWLS to the deflation of a yearly series (1959-2000) of I-O tables for Italy. The construction of the current-price series is described in Lavanda et al. (1999) and Rampa (2001); the dataset is organized according to the ESA1979 standard. Since its first publication, the series was updated and extended to the year 2000: see Rampa (2007a). The interest in a long series stems from the desire to undertake a long-term analysis of structural change in the Italian economy requiring, as argued in section 1, constant-price tables.

Various available data were used to compute price indexes (output, import, consumption, investment, export) and quantity indexes (output, value added).
The problem with these indexes was that they were extracted from different vintages of national accounting series. The older ones (1959-1970) came substantially from pre-ESA1979 accounting, even if the Italian Statistical Office managed to publish many 1960-1970 (not 1959) accounts revised according to ESA1979. This was not done for output in constant prices, nor hence for price indexes thereof. The construction of price indexes for years 1959-1970 is described in Bertoletti et al. (1987). The 1970-1995 price indexes came from ESA1979 national accounts, and the output and external trade indexes for years 1970-1995 were kindly supplied by the Department of National Accounting and Economic Analysis of ISTAT. Recent indexes (1996-2000) came from series compiled according to the new ESA1995 standard, and were computed making heroic assumptions on the correspondences between published ESA1979 and ESA1995 data.

The different vintages of national accounts provided sector aggregations that differed among themselves and from the one of current-price I-O tables18. External trade in goods was classified in a still different manner; external trade in services came from very aggregated data published by the Italian Central Bank. Therefore, the price and quantity indexes to be used in deflation cannot be considered as an internally coherent set of data. As discussed in the previous sections, a SWLS procedure might be useful in this case.

18 The aggregation adopted in the I-O dataset counts 42 sectors, i.e. the standard NACE-44 classification, with the three sectors of non-market services summed together; from 1959 to 1964 the number of sectors is 38, some private service sectors being aggregated.

Here we give a sketchy report of the steps followed in the deflation procedure. The deflation was implemented on tables in producers’ prices (“Depart Usine-Depart Douane”), allowing greater homogeneity along rows, which are net of trade and transport margins.
Imputed Banking Services were allocated to sectors’ intermediate costs (see Rampa, 2007b). The base year was chosen to be 1978 because that year lies in between the two major oil shocks of 1974 and 1981, thus avoiding the summation of the price effects of both shocks in any of the two time directions. In addition the business cycle is neither too high nor too low in 1978.

First estimates of the items appearing in expression (5) of section 3 were computed. In addition to the items appearing in (5), we inserted also an estimate of constant-price product “transfers” between sectors, as defined in ESA1979, which are due to similar products sold by different sectors. This is necessary if one wants to find agreement between row and column sums. To maintain a higher coherence with national accounts, we inserted also the items of the aggregate chained constant-price account of resources/uses equilibrium (GDP, imports of goods and services, household consumption, expenditure of non-market institutions [General Government and non-profit ones], investment, change in stocks, exports of goods and services), and incorporated the corresponding constraints to be fulfilled.

First estimates were built as follows. Intermediate and final flows of domestic products were deflated by means of output price indexes, the only exceptions being exports and consumption, deflated by means of their own price indexes. Intermediate, final and total flows of imported products were deflated by means of import price indexes. Constant-price value added was computed by means of quantity indexes, multiplied by the value added of the current-price table of the base year. Constant-price domestic output was computed as an average between two different estimates: deflation through output price indexes, and computation by means of quantity indexes. Finally, the aggregated account of resources/uses equilibrium was computed chaining quantity indexes.
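The construction of first estimates just described can be sketched as follows. This is only an illustration under stated assumptions: all figures are hypothetical, the function names are introduced here, the quantity-based output estimate is taken to be base-year output times a quantity index, and the "average" of the two output estimates is taken to be a simple arithmetic mean, which the text does not specify.

```python
def deflate(current_value, price_index):
    """Constant-price value = current-price value / price index (base year = 1)."""
    return current_value / price_index

def value_added_first_estimate(base_year_value_added, quantity_index):
    """Base-year current-price value added scaled by a quantity index."""
    return base_year_value_added * quantity_index

def output_first_estimate(current_output, output_price_index,
                          base_year_output, quantity_index):
    """Average of the price-deflated and the quantity-extrapolated output.
    The arithmetic mean is an assumption; the text only says 'an average'."""
    by_price = deflate(current_output, output_price_index)
    by_quantity = base_year_output * quantity_index
    return 0.5 * (by_price + by_quantity)

# Hypothetical figures for one sector in some year t (base year 1978 = 1.0).
intermediate_sale = deflate(30.0, 1.25)   # domestic flow: output price index
export_flow = deflate(12.0, 1.20)         # exports: their own price index
imported_input = deflate(9.0, 1.50)       # imported flow: import price index
value_added = value_added_first_estimate(40.0, 1.05)
output = output_first_estimate(120.0, 1.25, 100.0, 0.98)
print(intermediate_sale, export_flow, imported_input, value_added, output)
```

Nothing forces these separately built items to satisfy the accounting identities, which is why a balancing step is needed afterwards.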
The deflation of the whole series utilised a sort of “quantity index chaining”. That is, starting from 1978 and moving in the two time directions the following steps were implemented: (a) the table of each year t was first of all deflated to the prices of year “t ± 1” (read “year t−1” when moving forward to 2000, and “year t+1” when moving backward to 1959); (b) quantity indexes were computed dividing all cells of this “t ± 1”-constant-price table by the corresponding cells of the current-price table of year “t ± 1”; (c) these quantity indexes were multiplied by the cells of the 1978-constant-price table of year “t ± 1”: this gives the first estimate of the 1978-constant-price table of year t; (d) the first estimate was finally balanced applying formula (9) of section 3. Notice that in step (c) the balanced 1978-constant-price table of year “t ± 1” must be known, i.e. deflated and balanced: that is, the deflation procedure is performed sequentially and separately for each year. A more sophisticated route might be followed: each table could be balanced considering the unbalances of previous years, which amounts to considering autocorrelated errors. See e.g. Van Der Ploeg (1982) and Antonello (1990) for such an alternative.

Regarding the variance matrix V of first estimates $\tilde{z}$, it was computed as described at the end of section 3, namely $V = \hat{r}\,\hat{\tilde{z}}$, r̂ being the diagonal matrix of inverse reliability indexes. Inverse reliability indexes were in turn defined as follows. A minimum positive value was chosen, meaning maximum reliability (recall that the result of the balancing procedure (9) is zero-degree homogeneous in V).
This minimum value was attached to the items coming from the aggregate resources/uses equilibrium account, excluding change in stocks; 50% more than that minimum was attached to sectoral outputs and imports; one and a half times the minimum was attached to sectoral value added; twice the minimum was attached to the disaggregated items of final demand, excluding changes in stocks; three times the minimum to all intermediate flows; five times the minimum to sectoral changes in stocks. The variances of null first estimates (which are expected to remain null after balancing) were set equal to $10^{-5}$ instead of zero: this is to avoid possible singularities in the matrix inversion appearing in (9). As can be seen, while still having a simple structure, the reliability diagonal matrix r̂ is not equal to a scalar multiple of the identity matrix, reinforcing the “weighted” aspect of the present exercise.

The computation of expression (9) was implemented by means of the MATLAB© package for Windows. Scripts of the routine are available on request. The computation, using sparse-matrix facilities, requires 0.11 seconds on a Pentium II, 1 GHz, 64 MB RAM PC. Compare this run-time with the 78 to 122 minutes reported by Jackson-Murray (2004), table 5 on p. 145, for their “Model 6” applied to 23 (instead of 42) sectors and using a more advanced machine.

The result of the exercise can be found at www.giuri.unige.it/iotables and is available from the present author. This website contains also national accounts in current and constant prices, and employment, for the whole period 1959-2000. Recall that data are coherent with ESA1979, and with the 1997 revision of Italian national accounts, and are not comparable with more recent revisions. Table 2.1 of Appendix 2 offers some summary indicators of how the balancing procedure affected ex-ante yearly indexes.
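Steps (b) to (d) of the chaining procedure and the weighting scheme of this section can be sketched for a deliberately tiny, one-sector set of accounts. This is a sketch under several assumptions and not the MATLAB routine mentioned above: all figures are hypothetical; formula (9) is assumed to take the standard WLS form z = z̃ − VG′(GVG′)⁻¹Gz̃; the minimum inverse reliability `R_MIN` is set to 1; absolute values are used when building variances; and every name in the code is introduced here for illustration.

```python
# Items: intermediate use, final demand, change in stocks, value added, output.
ITEMS = ["x", "f", "s", "va", "q"]

# Hypothetical one-sector accounts (not from the paper).
table_t_adj_prices = {"x": 52.0, "f": 71.5, "s": 0.0, "va": 72.0, "q": 125.0}  # step (a)
current_adj = {"x": 50.0, "f": 70.0, "s": 0.0, "va": 70.0, "q": 120.0}
const78_adj = {"x": 40.0, "f": 56.0, "s": 0.0, "va": 56.0, "q": 96.0}  # balanced

# Inverse reliability multipliers of the text, relative to the minimum value.
R_MIN = 1.0   # hypothetical level; only ratios matter
MULT = {"x": 3.0, "f": 2.0, "s": 5.0, "va": 1.5, "q": 1.5}
NULL_VARIANCE = 1e-5   # variance given to null first estimates

# Accounting constraints G z = 0.
G = [[1.0, 1.0, 1.0, 0.0, -1.0],   # row balance:    x + f + s - q = 0
     [1.0, 0.0, 0.0, 1.0, -1.0]]   # column balance: x + va - q = 0

def chain(table_t_adj_prices, current_adj, const78_adj):
    """Steps (b)-(c): quantity index times the balanced 1978-price table."""
    first = []
    for k in ITEMS:
        if current_adj[k] == 0.0:
            first.append(0.0)   # null cells are kept null (assumption)
        else:
            first.append(table_t_adj_prices[k] / current_adj[k] * const78_adj[k])
    return first

def variances(first, scale=1.0):
    """Diagonal of V = r_hat * z_hat, with a small value for null estimates.
    abs() is an assumption, relevant only for negative items."""
    return [scale * (R_MIN * MULT[k] * abs(z) if z != 0.0 else NULL_VARIANCE)
            for k, z in zip(ITEMS, first)]

def solve2(a, b):
    """Solve a 2x2 linear system by Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [(b[0] * a[1][1] - a[0][1] * b[1]) / det,
            (a[0][0] * b[1] - a[1][0] * b[0]) / det]

def balance(first, v):
    """Step (d): z = z0 - V G' (G V G')^-1 G z0 (assumed form of (9))."""
    n = len(first)
    gz = [sum(g[k] * first[k] for k in range(n)) for g in G]
    gvg = [[sum(gi[k] * v[k] * gj[k] for k in range(n)) for gj in G] for gi in G]
    lam = solve2(gvg, gz)
    return [first[k] - v[k] * sum(G[c][k] * lam[c] for c in range(2))
            for k in range(n)]

first = chain(table_t_adj_prices, current_adj, const78_adj)
balanced = balance(first, variances(first))
rescaled = balance(first, variances(first, scale=10.0))   # same result expected
for name, z0, z1 in zip(ITEMS, first, balanced):
    print(name, round(z0, 3), round(z1, 3))
```

Multiplying every variance by the same factor leaves the balanced figures unchanged, which is the zero-degree homogeneity in V recalled in the text, and the null item stays essentially null because of its very small variance.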
The indicators are distinguished into those pertaining to quantity indexes of value added, and those relating to price indexes of output, imports and exports. It will be remembered that the main problem faced by users, when deflating I-O tables, comes from the compatibility of these items. It can be seen that on average (whole period 1959-2000) the balancing procedure affected real value added less than it affected price indexes, and it affected domestic output price indexes less than external trade ones. This derives from the structure of reliability indexes described above. There is also a small peak in domestic indexes (value added and prices) in 1964, due to the change in sector aggregation and the ensuing modification in the structure of some sectors oriented mainly to domestic markets in that period. More important “outliers” appear in external trade price index revisions: as regards exports, the 1959 and 1968 peaks must be related to breaks in the national accounting system (see the second paragraph in this section), which induced ill-computed ex-ante indexes. A similar argument holds for the 1959 peak appearing in the column of import price indexes. The outliers of the latter indexes observable in 1973, 1980, 1986 and 2000 are explained by oil (counter)shocks, and hence by the poor quality of the first estimate of price indexes of disaggregated energy products.

6. Conclusion

In this paper a balancing procedure for the deflation of I-O tables was proposed from the viewpoint of I-O users. This Subjective Weighted Least Squares (SWLS) method has been shown to be more flexible than other methods, and subsumes the first-order approximation of RAS as a special case. Flexibility is due to the facts that (a) users can attach differential reliability weights to their first estimates, depending on the confidence they have in the different parts of their work, and (b) one is not bound to take any row or column total as exogenously given.
An application to a yearly series of Italian I-O tables (1959-2000) has been described.

Appendix 1
A 2x2 example of errors made under Routes I and II of section 3

Suppose that the user knows the following current-price 2x2 I-O table, where Di and Mi (i = 1, 2) are the flows of domestic and imported products, respectively:

Table 1.1
               Sector 1   Sector 2   Final demand   Total resources
D1                100        155         145             400
D2                200        150         100             450
M1                 20         65          10              95
M2                 15         35          20              70
Value added        65         45
Production        400        450

The true deflated flows are reported in Table 1.2 whose highlighted items, however, are not published and are unknown to the user:

Table 1.2
               Sector 1   Sector 2   Final demand   Total resources
D1                 85        135         120             340
D2                180        130          70             380
M1                 15         45          15              75
M2                 10         32          16              58
Value added        50         38
Production        340        380

Starting from publicly known information, the implicit deflators (δ) of domestic (d) and imported (m) products, and of value added (v), can be computed as follows by the user: $\delta_1^d = 0.850$; $\delta_2^d = 0.844$; $\delta_1^m = 0.789$; $\delta_2^m = 0.829$; $\delta_1^v = 0.769$; $\delta_2^v = 0.844$.

Applying double deflation (Route I), the user would deflate the first four rows, and the last one, of the current-price table under the row-homogeneity assumption, using the implicit domestic-product and import deflators, and obtaining value added by difference. The deflated I-O table would be the following:

Table 1.3
               Sector 1   Sector 2   Final demand   Total resources
D1                85.0      131.8       123.3            340
D2               168.9      126.7        84.4            380
M1                15.8       51.3         7.9             75
M2                12.4       29.0        16.6             58
Value added       57.9       41.3
Production         340        380

The deflated value added of the two sectors would diverge from the true ones by 15.8% and 8.6%; deflated total resources are of course identical to the true ones. Applying Route II, it turns out that, after proper computation of expression (4), the estimated domestic-product deflators are 0.799 and 0.807, diverging from published ones by −6.0% and −4.4%.
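The figures quoted so far in this appendix can be re-derived from Table 1.1 and the published constant-price totals. The sketch below is an illustration, not the author's computation. In particular, expression (4) is not reproduced in this appendix, so Route II is read here as keeping published real value added and deflated imports, and solving the two column balances for the domestic-product deflators. That reading is an assumption, and the variable names are introduced here.

```python
# Table 1.1, current prices. Columns: sector 1, sector 2, final demand.
CUR = {"D1": [100.0, 155.0, 145.0],
       "D2": [200.0, 150.0, 100.0],
       "M1": [20.0, 65.0, 10.0],
       "M2": [15.0, 35.0, 20.0]}
VA_CUR = [65.0, 45.0]
# Totals and value added of Table 1.2, taken here to be the published part.
TOT_CONST = {"D1": 340.0, "D2": 380.0, "M1": 75.0, "M2": 58.0}
VA_CONST = [50.0, 38.0]

# Implicit deflators: constant-price total over current-price total.
delta = {k: TOT_CONST[k] / sum(CUR[k]) for k in CUR}
delta_v = [VA_CONST[j] / VA_CUR[j] for j in range(2)]

# Route I (double deflation): deflate each row by its own deflator and
# obtain value added as the residual of constant-price production.
defl = {k: [x * delta[k] for x in CUR[k]] for k in CUR}
prod_const = [TOT_CONST["D1"], TOT_CONST["D2"]]
va_route1 = [prod_const[j] - sum(defl[k][j] for k in CUR) for j in range(2)]
va_error_pct = [100.0 * (va_route1[j] / VA_CONST[j] - 1.0) for j in range(2)]

# Route II (as read here): solve the two column balances for p1, p2, where
#   p_j * production_j = D1_j * p1 + D2_j * p2 + deflated imports_j + real VA_j
prod_cur = [sum(CUR[k][j] for k in CUR) + VA_CUR[j] for j in range(2)]
rhs = [defl["M1"][j] + defl["M2"][j] + VA_CONST[j] for j in range(2)]
a11 = prod_cur[0] - CUR["D1"][0]
a12 = -CUR["D2"][0]
a21 = -CUR["D1"][1]
a22 = prod_cur[1] - CUR["D2"][1]
det = a11 * a22 - a12 * a21
p1 = (rhs[0] * a22 - a12 * rhs[1]) / det
p2 = (a11 * rhs[1] - a21 * rhs[0]) / det
p_gap_pct = [100.0 * (p1 / delta["D1"] - 1.0), 100.0 * (p2 / delta["D2"] - 1.0)]

print([round(v, 1) for v in va_route1], [round(e, 1) for e in va_error_pct])
print(round(p1, 3), round(p2, 3), [round(g, 1) for g in p_gap_pct])
```

Under this reading the two routes trade one error for another: Route I reproduces published total resources but misses real value added, while Route II reproduces real value added but revises the domestic-product deflators.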
The deflated I-O table is the following, where deflated value added is of course identical to the true one:

Table 1.4
               Sector 1   Sector 2   Final demand   Total resources
D1                79.9      123.8       115.8           319.5
D2               161.4      121.1        80.7           363.2
M1                15.8       51.3         7.9            75.0
M2                12.4       29.0        16.6            58.0
Value added       50.0       38.0
Production       319.5      363.2

Appendix 2

Table 2.1. Mean absolute percentage differences between revised and ex-ante yearly indexes (Highlighted figures signal outliers)

year                value added   output prices   import prices   export prices
1959                   0.19%          0.83%           5.95%          11.94%
1960                   0.17%          0.93%           0.13%           0.57%
1961                   0.25%          0.24%           0.96%           0.22%
1962                   0.17%          0.21%           0.07%           0.74%
1963                   0.18%          0.16%           0.58%           0.53%
1964                   0.68%          2.36%           1.06%           1.42%
1965                   0.12%          0.11%           0.30%           0.75%
1966                   0.11%          0.23%           0.98%           0.31%
1967                   0.03%          0.22%           0.30%           0.63%
1968                   0.08%          0.25%           0.94%           9.74%
1969                   0.27%          1.07%           0.29%           0.43%
1970                   0.00%          0.23%           1.08%           0.02%
1971                   0.11%          0.48%           1.38%           1.53%
1972                   0.21%          0.93%           3.77%           2.12%
1973                   0.10%          0.17%           6.47%           2.26%
1974                   0.02%          0.60%           2.06%           0.68%
1975                   0.12%          0.56%           0.05%           2.24%
1976                   0.02%          0.07%           0.59%           1.36%
1977                   0.09%          0.07%           0.43%           0.69%
1979                   0.11%          0.61%           1.97%           0.04%
1980                   0.03%          0.06%           3.66%           1.86%
1981                   0.02%          0.34%           3.41%           0.23%
1982                   0.01%          0.02%           0.78%           0.44%
1983                   0.13%          0.03%           1.12%           0.16%
1984                   0.02%          0.04%           1.00%           0.16%
1985                   0.01%          0.05%           0.64%           0.32%
1986                   0.11%          0.22%          10.58%           0.39%
1987                   0.14%          0.28%           0.42%           0.49%
1988                   0.02%          0.29%           2.10%           0.44%
1989                   0.01%          0.16%           1.31%           0.46%
1990                   0.08%          0.43%           0.14%           0.64%
1991                   0.12%          0.04%           1.44%           0.74%
1992                   0.04%          0.09%           1.33%           0.12%
1993                   0.30%          0.81%           1.65%           0.60%
1994                   0.00%          0.41%           0.09%           1.53%
1995                   0.01%          0.09%           0.85%           0.56%
1996                   0.09%          0.14%           1.45%           0.71%
1997                   0.09%          0.48%           0.64%           1.64%
1998                   0.01%          1.10%           2.31%           0.29%
1999                   0.16%          0.37%           0.72%           0.80%
2000                   0.12%          0.39%           5.55%           0.85%
average 1959-2000      0.11%          0.39%           1.63%           1.27%

References

ANTONELLO, P. (1990) Simultaneous Balancing of Input-Output Tables at Current and Constant Prices with First-Order Vector Auto-correlated Errors, Economic Systems Research, 2, pp.
157-171
ARROW, K.J. (1974) The Measurement of Real Value Added, in P. David & D.M. Reder (eds) Nations and Households in Economic Growth (New York, Academic Press)
BACHARACH, M.L. (1970) Bi-proportional Matrices and Input-Output Change (Cambridge UK, Cambridge University Press)
BARTHOLDY, K. (1987) A New Method for Balancing the National Accounts, IMF Working Paper, 87/66, International Monetary Fund, European Department
BERTOLETTI, P., RAMPA, G. & SILVA, V. (1987) Analisi strutturale e tavole interindustriali dell’economia italiana, 1959-1985, Economia e Politica Industriale, 54, pp. 65-122
BYRON, R.P. (1978) The Estimation of Large Social Account Matrices, Journal of The Royal Statistical Society A, 141, pp. 359-367
CANNING, P. & WANG, Z. (2005) A Flexible Mathematical Programming Model to Estimate Interregional Input-Output Accounts, Journal of Regional Science, 45, pp. 539-563
DALGAARD, E. & GYSTING, C. (2004) An Algorithm for Balancing Commodity-flow Systems, Economic Systems Research, 16, pp. 169-190
DAVID, P. (1962) The Deflation of Value Added, Review of Economics and Statistics, 44, pp. 148-155
DE MESNARD, L. (2004) Bi-proportional Methods of Structural Change Analysis: A Typological Survey, Economic Systems Research, 16, pp. 205-230
DE MESNARD, L. (2006) Measuring Structural Change in the I-O Production Function by Biproportional Methods: A Theorem of Price Invariance, Papers in Regional Science, 85, pp. 459-469
DIETZENBACHER, E. & HOEN, A.R. (1998) Deflation of Input-Output Tables from the User’s Point of View: A Heuristic Approach, Review of Income and Wealth, 44, pp. 111-122
DIETZENBACHER, E. & HOEN, A.R. (1999) Double Deflation and Aggregation, Environment and Planning A, 31, pp. 1695-1704
DURAND, R. (1994) An Alternative to Double Deflation for Measuring Real Industry Value Added, Review of Income and Wealth, 40, pp. 303-316
DURAND, R. (2001) On the Meaning of Real Value Added and Quantity Indices, Journal of Economic and Social Measurement, 27, pp.
155-165
FEBRERO, E. & DE JUAN, O. (2002) The Meaning of Real Value Added: A Critical Comment, paper presented at the XIV International Conference on Input-Output Techniques, Montreal 10-15 October 2002
FRIEDLANDER, D. (1961) A Technique for Estimating Contingency Tables, Given Marginal Totals and Some Supplementary Data, Journal of The Royal Statistical Society A, 124, pp. 412-420
HENDRICKSON, C. & McNEIL, S. (1984) Matrix entry estimation errors, in Volmuller J. & Hamerslag R. (eds.) Proceedings of the Ninth International Symposium on Transportation and Traffic Theory (Utrecht, VNU Science Press)
HEWINGS, G.J.D. & JANSON, B.N. (1980) Exchanging Regional Input-Output Coefficients: A Reply and Further Comments, Environment and Planning A, 10, pp. 843-854
JACKSON, R.W. & MURRAY, A.T. (2004) Alternative Input-Output Matrix Updating Formulations, Economic Systems Research, 16, pp. 135-148
KULLBACK, S. & LEIBLER, R.A. (1951) On Information and Sufficiency, Annals of Mathematical Statistics, 22, pp. 79-86
LAHR, M. L. (2001) A Strategy for Producing Hybrid Regional Input-Output Tables, in Lahr, M. L. & Dietzenbacher, E. (eds.), Input-output analysis: Frontiers and extensions (New York, Palgrave)
LAVANDA, I., RAMPA, G. & SORO, B. (1999) La revisione delle tavole intersettoriali 1970-90: metodo e procedure, Rivista di Statistica Ufficiale, 1, pp. 23-80
LECOMBER, J.R.C. (1975) A Critique of Methods of Adjusting Updating and Projecting Matrices, in Allen R.I.G. & Gossling W.F. (eds.), Estimating and Projecting Input-Output Coefficients (London, Input-Output Publishing Company)
OOSTERHAVEN, J. (2005) GRAS versus Minimizing Absolute and Squared Differences: A Comment, Economic Systems Research, 17, pp. 327-331
RAMPA, G. (2001) Yearly Series of IO Tables for the Italian Economy, 1959-1997. Method and First Results, Rivista Internazionale di Scienze Sociali, 99, pp. 449-478
RAMPA, G. (2002) Verdoorn’s Law: Some Notes on Output Measurement and the Role of Demand, in J.
McCombie, M. Pugno & B. Soro (eds), Productivity, Growth and Economic Performance: Essays on Verdoorn’s Law (London, Palgrave)
RAMPA, G. (2007a) Yearly series of current-price I-O tables for the Italian economy: updating the 1993-1997 tables, and extension to year 2000, www.giuri.unige.it/iotables, Related Paper #3
RAMPA, G. (2007b) Deflation of the 1959-2000 I-O series. Some technical aspects, www.giuri.unige.it/iotables, Related Paper #2
ROBINSON, S., CATTANEO, A. & EL-SAID M. (2001) Updating and Estimating a Social Accounting Matrix Using Cross Entropy Methods, Economic Systems Research, 13, pp. 47-64
ROUND, J. (2003) Constructing SAMs for Development Policy Analysis: Lessons Learned and Challenges Ahead, Economic Systems Research, 15, pp. 161-183
SATO, K. (1976) The Meaning and Measurement of the Real Value Added Index, Review of Economics and Statistics, 58, pp. 434-442
STONE, R.A. (1961) Input-Output Accounts and National Accounts (Paris, OEEC)
STONE, R.A. (1982) Balancing the National Accounts, in A. Ingham & A. Ulph (eds.), Demand, Equilibrium and Trade (London, MacMillan)
STONE, R.A., MEADE, J.E. & CHAMPERNOWNE D.G. (1942) The Precision of National Income Estimates, Review of Economic Studies, 9, pp. 111-25
THEIL, H. (1967) Economics and Information Theory (Amsterdam, North Holland)
VAN DER PLOEG, F. (1982) Reliability and the Adjustment of Sequences of Large Economic Accounting Matrices, Journal of The Royal Statistical Society A, 145, pp. 169-94
WEALE, M. (1988) The Reconciliation of Values, Volumes and Prices in National Accounts, Journal of The Royal Statistical Society A, 151, pp. 211-221