Computational Statistics & Data Analysis 41 (2002) 211-229
www.elsevier.com/locate/csda

Seemingly unrelated regression model with unequal size observations: computational aspects

Paolo Foschi*, Erricos J. Kontoghiorghes
Institut d'informatique, Université de Neuchâtel, Emile-Argand 11, Case Postale 2, CH-2007 Neuchâtel, Switzerland

Received 1 December 2001; received in revised form 1 March 2002

Abstract

The computational solution of the seemingly unrelated regression model with unequal size observations is considered. Two algorithms to solve the model when treated as a generalized linear least-squares problem are proposed. The algorithms have as a basic tool the generalized QR decomposition (GQRD) and efficiently exploit the block-sparse structure of the matrices. One of the algorithms reduces the computational burden of the estimation procedure by not computing explicitly the RQ factorization of the GQRD. The maximum likelihood estimation of the model when the covariance matrix is unknown is also considered. © 2002 Elsevier Science B.V. All rights reserved.

Keywords: SUR model; Least squares; QR decomposition; Maximum likelihood

1. Seemingly unrelated regression with unequal size observations

The seemingly unrelated regression (SUR) model is defined by the set of regressions

$$y_i = X_i \beta_i + u_i, \qquad i = 1, \dots, G,$$

(This work is in part supported by the Swiss National Foundation Grants 1214-056900.99/1 and 2000-061875.00/1. Part of the work of the second author was done while he was visiting INRIA-IRISA, Rennes, France under the support of the host institution and the Swiss National Foundation Grant 83R-065887. * Corresponding author. Fax: 327182701. E-mail addresses: [email protected] (P. Foschi), [email protected] (E.J. Kontoghiorghes).)
where $X_i \in \mathbb{R}^{t \times k_i}$, $y_i \in \mathbb{R}^{t}$ and the disturbance vector $u_i \in \mathbb{R}^{t}$ has zero mean and variance-covariance matrix $\sigma_{i,i} I_t$. Furthermore, the disturbances are contemporaneously correlated across the equations, i.e. $E(u_i u_j^T) = \sigma_{i,j} I_t$. In compact form the SUR model can be written as

$$\begin{pmatrix} y_1 \\ \vdots \\ y_G \end{pmatrix} = \begin{pmatrix} X_1 & & \\ & \ddots & \\ & & X_G \end{pmatrix} \begin{pmatrix} \beta_1 \\ \vdots \\ \beta_G \end{pmatrix} + \begin{pmatrix} u_1 \\ \vdots \\ u_G \end{pmatrix}$$

or

$$\operatorname{vec}(Y) = \Bigl(\bigoplus_{i=1}^{G} X_i\Bigr)\operatorname{vec}(\{\beta_i\}_G) + \operatorname{vec}(U), \qquad (1)$$

where $Y = (y_1 \cdots y_G)$, $U = (u_1 \cdots u_G)$, the direct sum of matrices $\bigoplus_{i=1}^{G} X_i \equiv \bigoplus_i X_i \equiv \operatorname{diag}(X_1, \dots, X_G)$, $\{\beta_i\}_G$ (abbreviated to $\{\beta_i\}$) denotes the set of vectors $\beta_1, \dots, \beta_G$, and $\operatorname{vec}(\cdot)$ is the column-stack operator with $\operatorname{vec}(\{\beta_i\}) = (\beta_1^T, \dots, \beta_G^T)^T$. The disturbance term $\operatorname{vec}(U)$ has zero mean and dispersion matrix $\Sigma \otimes I_t$, where $\Sigma = [\sigma_{i,j}] \in \mathbb{R}^{G \times G}$ is symmetric and positive semidefinite (Srivastava and Giles, 1987).

Computationally efficient methods for solving SUR models have been proposed (Foschi et al., forthcoming; Foschi and Kontoghiorghes, 2002; Kontoghiorghes, 2000a,b). These methods formulate the SUR model as a generalized linear least squares problem (GLLSP) and use the generalized QR decomposition (GQRD) to solve it (Kourouklis and Paige, 1981; Paige, 1978). Often it is assumed that each regression equation has the same number of observations, but this might not always be the case (Srivastava and Giles, 1987). The solution of SUR models with unequal size observations (abbreviated to SUR-USO) has been considered previously (Schmidt, 1977; Sharma, 1993; Srivastava and Zaatar, 1973), with emphasis on the statistical properties of the estimators. The SUR-USO model assumes that the observations for the $i$th ($i > 1$) regression match in time with those for the $(i-1)$th regression.

Here, computational strategies for solving SUR-USO models are provided. Firstly, recent methods for solving SUR models are extended to the numerical solution of the SUR-USO model when this is considered as a GLLSP.
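As an illustration of the compact form (1), the following sketch builds the stacked system for a small simulated two-equation SUR model and computes the GLS estimator that the GLLSP formulation below is designed to deliver. All data, sizes and variable names are invented for the example; this is not the proposed algorithm, which precisely avoids forming $\Sigma \otimes I_t$ and its inverse explicitly.

```python
import numpy as np
from scipy.linalg import block_diag

rng = np.random.default_rng(0)
t, G = 20, 2                      # equal-size case of model (1), invented dimensions
X1, X2 = rng.standard_normal((t, 3)), rng.standard_normal((t, 2))
Sigma = np.array([[1.0, 0.6], [0.6, 1.0]])    # contemporaneous covariance
beta = np.concatenate([np.ones(3), -np.ones(2)])

X = block_diag(X1, X2)                        # direct sum  X1 (+) X2
Omega = np.kron(Sigma, np.eye(t))             # dispersion  Sigma (x) I_t
u = rng.multivariate_normal(np.zeros(2 * t), Omega)
y = X @ beta + u

# GLS estimator: solve (X' Omega^-1 X) b = X' Omega^-1 y
Oi = np.linalg.inv(Omega)
beta_hat = np.linalg.solve(X.T @ Oi @ X, X.T @ Oi @ y)
```

For small $G$ and $t$ this dense computation is adequate; the point of the factorization-based methods discussed next is to obtain the same estimator stably without inverting the dispersion matrix.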
A method based on the GQRD is proposed for solving the GLLSP by exploiting the block-sparse and recursive structures of the exogenous matrix and the Cholesky factor, respectively. A recursive strategy to reduce the computational burden of this method is presented. Finally, maximum likelihood expressions that can be used in the iterative solution of the SUR-USO model are derived.

2. Numerical solution of the SUR-USO model

In the SUR-USO model each regression has a different number of observations. That is, $y_i, u_i \in \mathbb{R}^{t_i}$, $X_i \in \mathbb{R}^{t_i \times k_i}$ and the covariance matrices, for $i < j$, are given by

$$E(u_i u_j^T) = \sigma_{i,j}\,(I_{t_i} \;\; 0_{t_i \times (t_j - t_i)}), \qquad (2)$$

where it has been assumed that $t_i \le t_{i+1}$. The compact form of the SUR-USO model is given by

$$\operatorname{vec}(\{y_i\}) = (\oplus_i X_i)\operatorname{vec}(\{\beta_i\}) + \operatorname{vec}(\{u_i\}). \qquad (3)$$

The dispersion of $\operatorname{vec}(\{u_i\})$ has a block-matrix structure, where the $(i,j)$th block is given by (2). Consider partitioning and reordering the observations of each regression by

$$y_i = \begin{pmatrix} y_{1,i} \\ y_{2,i} \\ \vdots \\ y_{i,i} \end{pmatrix}, \qquad X_i = \begin{pmatrix} X_{1,i} \\ X_{2,i} \\ \vdots \\ X_{i,i} \end{pmatrix} \qquad \text{and} \qquad u_i = \begin{pmatrix} u_{1,i} \\ u_{2,i} \\ \vdots \\ u_{i,i} \end{pmatrix}, \qquad (4)$$

where the $j$th blocks have $h_j$ rows, $h_1 = t_1$ and $h_i = t_i - t_{i-1}$ for $i = 2, 3, \dots, G$. The SUR-USO model can be formulated as the set of regression equations

$$y_{i,j} = X_{i,j}\,\beta_j + u_{i,j} \qquad \text{for } i, j = 1, 2, \dots, G \text{ and } i \le j, \qquad (5)$$

where $u_{i,j}$ has zero mean and dispersion matrix given by $\sigma_{j,j} I_{h_i}$. Furthermore, the cross-equation covariances are given by

$$E(u_{k,i} u_{l,j}^T) = \begin{cases} \sigma_{i,j} I_{h_k} & \text{for } l = k, \\ 0_{h_k \times h_l} & \text{for } l \ne k, \end{cases}$$

where $k \le i$ and $l \le j$. Regressions (5) are also equivalent to the general linear model (GLM)

$$\begin{pmatrix} \bar y_1 \\ \bar y_2 \\ \vdots \\ \bar y_G \end{pmatrix} = \begin{pmatrix} \bar X_1 \\ \bar X_2 \\ \vdots \\ \bar X_G \end{pmatrix} \operatorname{vec}(\{\beta_i\}) + \begin{pmatrix} \bar u_1 \\ \bar u_2 \\ \vdots \\ \bar u_G \end{pmatrix}, \qquad (6)$$
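The reordering in (4)-(5) splits each series into blocks of $h_1, \dots, h_G$ time periods. A minimal sketch of how the block heights and block boundaries are formed, with invented sample sizes $t_i$:

```python
import numpy as np

t = [4, 6, 9]                                         # t1 <= t2 <= t3, G = 3 (invented)
h = [t[0]] + [t[i] - t[i - 1] for i in range(1, 3)]   # h1 = t1, hi = ti - t(i-1)
offsets = np.cumsum([0] + h)

y3 = np.arange(t[2], dtype=float)                     # observations of the longest regression
# blocks y_{1,3}, y_{2,3}, y_{3,3} of (4), with heights h1, h2, h3
blocks = [y3[offsets[j]:offsets[j + 1]] for j in range(3)]
```

Concatenating the blocks recovers the original series, so the partitioning is a pure reindexing of the data.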
which, after appropriate substitutions, can be written as $\bar y = \bar X\beta + \bar u$, where

$$\bar y_i^T = (y_{i,i}^T \; y_{i,i+1}^T \cdots y_{i,G}^T) \in \mathbb{R}^{\tau_i}, \qquad \bar u_i^T = (u_{i,i}^T \; u_{i,i+1}^T \cdots u_{i,G}^T) \in \mathbb{R}^{\tau_i}, \qquad \beta = \operatorname{vec}(\{\beta_i\}),$$

$$\bar X_i = \bigl(\underbrace{0 \;\cdots\; 0}_{k_1 + \cdots + k_{i-1}} \;\; \bigoplus_{j=i}^{G} X_{i,j}\bigr) \qquad (7)$$

and $\tau_i = (G-i+1)h_i$. The disturbance vector $\bar u_i$ has zero mean and covariance matrix $\Sigma^{(i)} \otimes I_{h_i}$, where $\Sigma^{(i)} \equiv \Sigma_{i:,i:}$ denotes the $(G-i+1)\times(G-i+1)$ submatrix of $\Sigma$ starting at position $(i,i)$ (Golub and Van Loan, 1996). Furthermore, the vectors $\bar u_i$ and $\bar u_j$ are uncorrelated for $i \ne j$. Thus, the covariance matrix of $\bar u$ is given by $\bar P = \oplus_i (\Sigma^{(i)} \otimes I_{h_i}) \in \mathbb{R}^{T \times T}$, where $T = \sum_i t_i = \sum_i \tau_i$. Without loss of generality it is assumed that $\Sigma^{(i)}$ is non-singular and $t_1 \ge k_i$ for $i = 1, \dots, G$.

[Fig. 1. Examples of models (3), (6) and (8) for G = 3.]

As in the case of the SUR model, the best linear unbiased estimator (BLUE) of the SUR-USO model derives from the solution of the generalized linear least squares problem (GLLSP)

$$\mathop{\arg\min}_{\beta,\,\bar v} \|\bar v\|^2 \qquad \text{subject to} \qquad \bar y = \bar X\beta + \bar C\bar v, \qquad (8)$$

where $\bar u = \bar C\bar v$, $\bar P = \bar C\bar C^T$ and the upper-triangular matrix $\bar C$ has full rank. Thus, the random vector $\bar v$ has zero mean and dispersion matrix $I_T$. Note that the matrix $\bar C$ is block diagonal with the $i$th ($i = 1, \dots, G$) block given by $\bar C_{i,i} = C_{i:,i:} \otimes I_{h_i}$, where $\Sigma = CC^T$ and $C$ is upper triangular. Fig. 1 shows the structure of the SUR-USO model (3), GLM (6) and GLLSP (8) for $G = 3$.

For the solution of (8) consider the GQRD

$$\bar Q^T \bar X = \begin{pmatrix} \bar R \\ 0 \end{pmatrix} \begin{matrix} K \\ T-K \end{matrix} \qquad (9a)$$

and

$$\bar Q^T \bar C \bar P = \begin{pmatrix} W_{1,1} & W_{1,2} \\ 0 & W_{2,2} \end{pmatrix} \begin{matrix} K \\ T-K \end{matrix}, \qquad (9b)$$

where $K = \sum_i k_i$, $\bar R$ and $W_{2,2}$ are upper triangular, and $\bar Q, \bar P \in \mathbb{R}^{T \times T}$ are orthogonal.
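The GQRD (9) of a pair $(\bar X, \bar C)$ can be computed generically in two steps: a QR decomposition of $\bar X$, followed by an RQ decomposition of $\bar Q^T\bar C$. A dense sketch using SciPy on random data is given below; the algorithms of this paper instead exploit the block structure rather than calling dense factorizations, so this is only a reference computation.

```python
import numpy as np
from scipy.linalg import qr, rq

rng = np.random.default_rng(1)
T, K = 12, 5
Xbar = rng.standard_normal((T, K))
Cbar = np.triu(rng.standard_normal((T, T))) + 5 * np.eye(T)  # full-rank upper triangular

Q, R = qr(Xbar)            # (9a): Q^T Xbar = (R; 0), with R upper triangular on top
W, Prot = rq(Q.T @ Cbar)   # (9b): Q^T Cbar = W Prot, so Q^T Cbar Prot^T = W
```

Note the convention: `scipy.linalg.rq` returns factors with `A = W @ Prot`, so the orthogonal matrix $\bar P$ of (9b) corresponds to `Prot.T`.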
Using (9) the GLLSP (8) can be written as

$$\mathop{\arg\min}_{\beta,\,\bar v_A,\,\bar v_B} \|\bar v_A\|^2 + \|\bar v_B\|^2 \qquad \text{subject to} \qquad \begin{pmatrix} \bar y_A \\ \bar y_B \end{pmatrix} = \begin{pmatrix} \bar R \\ 0 \end{pmatrix}\beta + \begin{pmatrix} W_{1,1} & W_{1,2} \\ 0 & W_{2,2} \end{pmatrix}\begin{pmatrix} \bar v_A \\ \bar v_B \end{pmatrix},$$

where

$$\bar Q^T\bar y = \begin{pmatrix} \bar y_A \\ \bar y_B \end{pmatrix} \qquad \text{and} \qquad \bar P^T\bar v = \begin{pmatrix} \bar v_A \\ \bar v_B \end{pmatrix}.$$

It follows that $\bar v_B = W_{2,2}^{-1}\bar y_B$ and $\bar v_A = 0$. Thus, the solution of the SUR-USO model comes from solving the triangular system

$$\begin{pmatrix} \bar R & W_{1,2} \\ 0 & W_{2,2} \end{pmatrix}\begin{pmatrix} \beta \\ \bar v_B \end{pmatrix} = \begin{pmatrix} \bar y_A \\ \bar y_B \end{pmatrix}. \qquad (10)$$

The main operations in solving the SUR-USO model are the computation of the GQRD (9) and, to some extent, the solution of the triangular system (10). Clearly, the computational burden of solving the SUR-USO model will be reduced if the GQRD (9) is computed efficiently. Furthermore, the efficient computation of (9) will have a greater impact on the overall computational complexity if the iterative feasible estimator of the SUR-USO model is required (Srivastava and Giles, 1987). In such a case, at each iteration an estimator is used in the place of the unknown $\Sigma$. Thus, the QRD (9a) is computed once, while (9b), and consequently (10), need to be solved at each iteration for a different $\bar C$.

3. Efficient solution of the GLLSP

For the efficient solution of the GLLSP (8) using the GQRD (9), the block-sparse structure of the matrices needs to be exploited. Consider first the GQRD

$$Q_0^T \bar X_1 = \begin{pmatrix} \bar R^{(0)} \\ 0 \end{pmatrix} \qquad (11a)$$

and

$$Q_0^T \bar C_{1,1} P_0 = \begin{pmatrix} \bar C^{(0)}_{1,1} & \hat W_{1,1} \\ 0 & \tilde W_{1,1} \end{pmatrix} \begin{matrix} K \\ \tau_1 - K \end{matrix}, \qquad (11b)$$

where $\bar C^{(0)}_{1,1}$ and $\tilde W_{1,1}$ are upper triangular and $P_0$ is orthogonal. Furthermore, $\bar R^{(0)} = \oplus_i R^{(0)}_i$ and

$$Q_0 = (\oplus_i \hat Q_{0,i} \quad \oplus_i \tilde Q_{0,i}),$$

where

$$X_{1,i} = (\hat Q_{0,i} \; \tilde Q_{0,i})\begin{pmatrix} R^{(0)}_i \\ 0 \end{pmatrix} = \hat Q_{0,i} R^{(0)}_i$$

is the QRD of $X_{1,i}$ for $i = 1, 2, \dots, G$. Using (11), the GLLSP (8) can be equivalently written as

$$\mathop{\arg\min}_{\beta,\,\hat v_1,\,\tilde v_1,\,\bar v_2,\dots,\bar v_G} \|\hat v_1\|^2 + \|\tilde v_1\|^2 + \sum_{j=2}^{G}\|\bar v_j\|^2 \qquad \text{subject to}$$

$$\begin{pmatrix} \hat y_1 \\ \bar y_2 \\ \vdots \\ \bar y_G \\ \tilde y_1 \end{pmatrix} = \begin{pmatrix} \bar R^{(0)} \\ \bar X_2 \\ \vdots \\ \bar X_G \\ 0 \end{pmatrix}\beta + \begin{pmatrix} \bar C^{(0)}_{1,1} & 0 & \cdots & 0 & \hat W_{1,1} \\ 0 & \bar C_{2,2} & \cdots & 0 & 0 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & \cdots & \bar C_{G,G} & 0 \\ 0 & 0 & \cdots & 0 & \tilde W_{1,1} \end{pmatrix}\begin{pmatrix} \hat v_1 \\ \bar v_2 \\ \vdots \\ \bar v_G \\ \tilde v_1 \end{pmatrix}, \qquad (12)$$
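Abstracting from the block structure, the solution path (9)-(10) can be verified on a small dense example: compute the GQRD, back-substitute for $\bar v_B$ and $\beta$, and check that $\beta$ coincides with the textbook GLS estimator for $\Omega = \bar C\bar C^T$. All data are random and the code is purely illustrative.

```python
import numpy as np
from scipy.linalg import qr, rq, solve_triangular

rng = np.random.default_rng(2)
T, K = 12, 5
Xbar = rng.standard_normal((T, K))
Cbar = np.triu(rng.standard_normal((T, T))) + 5 * np.eye(T)  # full-rank triangular factor
y = rng.standard_normal(T)

# GQRD of the pair (Xbar, Cbar)
Q, R = qr(Xbar)                          # (9a)
W, Prot = rq(Q.T @ Cbar)                 # (9b), W upper triangular

yA, yB = (Q.T @ y)[:K], (Q.T @ y)[K:]
W12, W22 = W[:K, K:], W[K:, K:]

# Back-substitution for the triangular system (10): first vB, then beta
vB = solve_triangular(W22, yB)
beta = solve_triangular(R[:K], yA - W12 @ vB)

# Reference: textbook GLS estimator with Omega = Cbar Cbar^T
Omega = Cbar @ Cbar.T
beta_gls = np.linalg.solve(Xbar.T @ np.linalg.solve(Omega, Xbar),
                           Xbar.T @ np.linalg.solve(Omega, y))
```

The two estimators agree to machine precision, which is the content of the GLLSP/BLUE equivalence; the GQRD route, however, never forms or inverts $\Omega$.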
where $\hat y_1 = \hat Q_0^T\bar y_1$, $\tilde y_1 = \tilde Q_0^T\bar y_1$ and $P_0^T\bar v_1$ is conformably partitioned as $P_0^T\bar v_1 = (\hat v_1^T \; \tilde v_1^T)^T$. Here, $\tilde v_1$ can be computed by $\tilde v_1 = \tilde W_{1,1}^{-1}\tilde y_1$ and thus (12) can be reduced to

$$\mathop{\arg\min}_{\beta,\,\hat v_1,\,\bar v_2,\dots,\bar v_G} \|\hat v_1\|^2 + \sum_{j=2}^{G}\|\bar v_j\|^2 \qquad \text{subject to}$$

$$\begin{pmatrix} \bar y_1^{(0)} \\ \bar y_2 \\ \vdots \\ \bar y_G \end{pmatrix} = \begin{pmatrix} \bar R^{(0)} \\ \bar X_2 \\ \vdots \\ \bar X_G \end{pmatrix}\beta + \begin{pmatrix} \bar C^{(0)}_{1,1} & 0 & \cdots & 0 \\ 0 & \bar C^{(0)}_{2,2} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \bar C^{(0)}_{G,G} \end{pmatrix}\begin{pmatrix} \hat v_1 \\ \bar v_2 \\ \vdots \\ \bar v_G \end{pmatrix}, \qquad (13)$$

where

$$\bar y_1^{(0)} = \hat y_1 - \hat W_{1,1}\tilde v_1 = \hat y_1 - \hat W_{1,1}\tilde W_{1,1}^{-1}\tilde y_1$$

and $\bar C^{(0)}_{i,i} \equiv \bar C_{i,i}$ for $i = 2, \dots, G$.

The blocks $\bar X_i$ ($i = 2, \dots, G$) can be annihilated by using a block generalization of a Givens sequence. Starting from the bottom to the top, $\bar R^{(0)}$ is used as a pivot in order to annihilate $\bar X_G, \dots, \bar X_2$ one at a time. That is, for $i = G, G-1, \dots, 2$, the QRD

$$Q_i^T \begin{pmatrix} \bar R^{(G-i)} \\ \bar X_i \end{pmatrix} = \begin{pmatrix} \bar R^{(G-i+1)} \\ 0 \end{pmatrix} \begin{matrix} K \\ \tau_i \end{matrix}, \qquad (14a)$$

$$Q_i^T \begin{pmatrix} \bar y_1^{(G-i)} \\ \bar y_i \end{pmatrix} = \begin{pmatrix} \bar y_1^{(G-i+1)} \\ \bar y_i^{(G-i+1)} \end{pmatrix} \begin{matrix} K \\ \tau_i \end{matrix} \qquad (14b)$$

and

$$Q_i^T \begin{pmatrix} \bar C^{(G-i)}_{1,1} & 0 & \bar C^{(G-i)}_{1,i+1} & \cdots & \bar C^{(G-i)}_{1,G} \\ 0 & \bar C^{(0)}_{i,i} & 0 & \cdots & 0 \end{pmatrix} = \begin{pmatrix} \bar C^{(G-i+1)}_{1,1} & \bar C^{(G-i+1)}_{1,i} & \bar C^{(G-i+1)}_{1,i+1} & \cdots & \bar C^{(G-i+1)}_{1,G} \\ \bar C^{(G-i+1)}_{i,1} & \bar C^{(G-i+1)}_{i,i} & \bar C^{(G-i+1)}_{i,i+1} & \cdots & \bar C^{(G-i+1)}_{i,G} \end{pmatrix} \qquad (14c)$$

are computed, where $Q_i \in \mathbb{R}^{(K+\tau_i)\times(K+\tau_i)}$ is orthogonal, $\bar R^{(i)} = \oplus_{j=1}^{G} R^{(i)}_j$ and $R^{(i)}_j \in \mathbb{R}^{k_j \times k_j}$ is upper triangular. Notice that at each step $\bar C^{(G-i+1)}_{1,i}$, $\bar C^{(G-i+1)}_{i,1}$ and $\bar C^{(G-i+1)}_{i,i+1}, \dots, \bar C^{(G-i+1)}_{i,G}$ are filled in. This results in filling the block-superdiagonals and first block-column of $\oplus_i \bar C^{(0)}_{i,i}$.

Let $W$ denote the modified $\oplus_i \bar C^{(0)}_{i,i}$, that is,

$$W \equiv \begin{pmatrix} W^{(0)}_{1,1} & W^{(0)}_{1,2} & W^{(0)}_{1,3} & \cdots & W^{(0)}_{1,G} \\ W^{(0)}_{2,1} & W^{(0)}_{2,2} & W^{(0)}_{2,3} & \cdots & W^{(0)}_{2,G} \\ W^{(0)}_{3,1} & 0 & W^{(0)}_{3,3} & \cdots & W^{(0)}_{3,G} \\ \vdots & \vdots & \ddots & \ddots & \vdots \\ W^{(0)}_{G,1} & 0 & \cdots & 0 & W^{(0)}_{G,G} \end{pmatrix}$$
$$= \begin{pmatrix} \bar C^{(G-1)}_{1,1} & \bar C^{(G-1)}_{1,2} & \bar C^{(G-1)}_{1,3} & \cdots & \bar C^{(G-1)}_{1,G} \\ \bar C^{(G-1)}_{2,1} & \bar C^{(G-1)}_{2,2} & \bar C^{(G-1)}_{2,3} & \cdots & \bar C^{(G-1)}_{2,G} \\ \bar C^{(G-2)}_{3,1} & 0 & \bar C^{(G-2)}_{3,3} & \cdots & \bar C^{(G-2)}_{3,G} \\ \vdots & \vdots & \ddots & \ddots & \vdots \\ \bar C^{(1)}_{G,1} & 0 & \cdots & 0 & \bar C^{(1)}_{G,G} \end{pmatrix}. \qquad (15)$$

The RQD of $W$ can be derived by a sequence of $G-1$ orthogonal factorizations which annihilate, from the bottom to the top, the submatrices $W^{(0)}_{2,1}, \dots, W^{(0)}_{G,1}$. The $i$th ($i = 1, \dots, G-1$) factorization computes

$$\begin{pmatrix} W^{(i-1)}_{1,1} & W^{(0)}_{1,G-i+1} \\ \vdots & \vdots \\ W^{(i-1)}_{G-i,1} & W^{(0)}_{G-i,G-i+1} \\ W^{(i-1)}_{G-i+1,1} & W^{(0)}_{G-i+1,G-i+1} \end{pmatrix} P_i = \begin{pmatrix} W^{(i)}_{1,1} & W^{(i)}_{1,G-i+1} \\ \vdots & \vdots \\ W^{(i)}_{G-i,1} & W^{(i)}_{G-i,G-i+1} \\ 0 & W^{(i)}_{G-i+1,G-i+1} \end{pmatrix}, \qquad (16)$$

where $P_i \in \mathbb{R}^{(K+\tau_{G-i+1})\times(K+\tau_{G-i+1})}$ is orthogonal and $W^{(i)}_{G-i+1,G-i+1}$ is upper triangular. Thus, the upper-triangular factor in the RQD of $W$ is given by

$$\begin{pmatrix} W^{*}_{1,1} & W^{*}_{1,2} \\ 0 & W^{*}_{2,2} \end{pmatrix} \begin{matrix} K \\ l^{*} \end{matrix} = \begin{pmatrix} W^{(G-1)}_{1,1} & W^{(G-1)}_{1,2} & \cdots & W^{(1)}_{1,G} \\ 0 & W^{(G-1)}_{2,2} & \cdots & W^{(1)}_{2,G} \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & W^{(1)}_{G,G} \end{pmatrix}, \qquad (17)$$

where $l^{*} = \sum_{i=2}^{G}\tau_i$. Fig. 2 shows, at each step of the procedure, the annihilation and fill-in of $\bar X_2, \dots, \bar X_G$ and $\oplus_i \bar C^{(0)}_{i,i}$, respectively, and the retriangularization of $W$. From (14a), (14b) and (17), it follows that the solution of the GLLSP (13) is given by the solution of the triangular system

$$\begin{pmatrix} \bar R^{(G-1)} & W^{*}_{1,2} \\ 0 & W^{*}_{2,2} \end{pmatrix}\begin{pmatrix} \beta \\ \bar v^{*}_B \end{pmatrix} = \begin{pmatrix} \bar y_1^{(G-1)} \\ \bar y^{*} \end{pmatrix}, \qquad \text{where} \qquad \bar y^{*} = \begin{pmatrix} \bar y_2^{(G-1)} \\ \vdots \\ \bar y_G^{(1)} \end{pmatrix}.$$

[Fig. 2. Annihilation of $\bar X_i$, fill-in of $\oplus_i \bar C^{(0)}_{i,i}$ and retriangularization of $W$, where G = 5.]

The matrices in (14a), (14c) and (16) have block-sparse structures which can facilitate the development of fast factorization algorithms.
From (7) and the block-diagonal structure of $\bar R^{(G-i)}$ it follows that the QRD (14a) can be derived by computing the $G-i+1$ updating QRDs (UQRDs)

$$Q_{i,j}^T \begin{pmatrix} R^{(G-i)}_j \\ X_{i,j} \end{pmatrix} = \begin{pmatrix} R^{(G-i+1)}_j \\ 0 \end{pmatrix} \begin{matrix} k_j \\ h_i \end{matrix} \qquad (j = i, \dots, G). \qquad (18)$$

Thus, in (14a) $R^{(G-i)}_s \equiv R^{(G-i+1)}_s$ for $s = 1, \dots, i-1$ and

$$Q_i = \begin{pmatrix} I_{\kappa_i} & 0 & 0 \\ 0 & \oplus_{j=i}^{G} Q^{(1,1)}_{i,j} & \oplus_{j=i}^{G} Q^{(1,2)}_{i,j} \\ 0 & \oplus_{j=i}^{G} Q^{(2,1)}_{i,j} & \oplus_{j=i}^{G} Q^{(2,2)}_{i,j} \end{pmatrix}, \qquad (19)$$

where $\kappa_i = \sum_{j=1}^{i-1} k_j$ and

$$Q_{i,j} = \begin{pmatrix} Q^{(1,1)}_{i,j} & Q^{(1,2)}_{i,j} \\ Q^{(2,1)}_{i,j} & Q^{(2,2)}_{i,j} \end{pmatrix} \begin{matrix} k_j \\ h_i \end{matrix}.$$

Note that when $Q_i^T$ in (19) is used to compute (14c), then in (15)

$$W^{(0)}_{i,1} = (\,0 \quad \tilde W^{(0)}_{i,1}\,) \;\begin{matrix} \kappa_i & K-\kappa_i \end{matrix}, \qquad (20a)$$

$$W^{(0)}_{1,j} = \begin{pmatrix} 0 \\ \tilde W^{(0)}_{1,j} \end{pmatrix} \begin{matrix} \kappa_j \\ K-\kappa_j \end{matrix} \qquad (20b)$$

and

$$W^{(0)}_{i,j} = \begin{pmatrix} 0 \\ \tilde W^{(0)}_{i,j} \end{pmatrix} \begin{matrix} (j-i)h_i \\ (G-j+1)h_i \end{matrix} \qquad \text{if } i < j, \qquad (20c)$$

where $W^{(0)}_{i,i}$, $\tilde W^{(0)}_{i,1}$, $\tilde W^{(0)}_{1,i}$ and $\tilde W^{(0)}_{i,j}$ are block upper triangular ($i, j = 2, \dots, G$).

[Fig. 3. Annihilation of $\bar X_i$ and fill-in of $\oplus_i \bar C^{(0)}_{i,i}$, where G = 5 and i = 1, ..., G.]

Fig. 3 shows the structure of $W$ after the UQRDs (18) have been computed, where $G = 5$. A numeral $i$ in $\bar X_{i,j}$ and $W$ denotes, respectively, the annihilated and filled-in submatrices which resulted from the UQRDs at step $i$ ($i = 1, \dots, G$).

[Fig. 4. Computing (16), where i = 3 and G = 5.]

The RQD of $W$ using (16) needs to take into account the sparse structure of the submatrices. Sequential and parallel strategies for computing similar factorizations have been proposed (Kontoghiorghes, 1999; Kontoghiorghes, 2000a,b). The block-diagonals of $\tilde W^{(i-1)}_{G-i+1,1}$ in (16) can be annihilated one at a time with a series of factorizations which preserve the sparse and triangular structure of $\tilde W^{(i-1)}_{1,1}, \dots, \tilde W^{(i-1)}_{G-i,1}$.
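Each updating QRD in (18) retriangularizes a factor after a block of $h_i$ new rows is appended below it. A generic dense sketch (random data, invented sizes) simply refactorizes the stacked matrix; a structure-exploiting implementation would instead apply Givens or Householder transformations only where needed. The cross-product identity $R_{\text{new}}^T R_{\text{new}} = R_{\text{old}}^T R_{\text{old}} + B^T B$ characterizes the update independently of sign conventions.

```python
import numpy as np
from scipy.linalg import qr

rng = np.random.default_rng(3)
kj, hi = 4, 3
R_old = np.triu(rng.standard_normal((kj, kj)))   # current triangular factor R_j
B = rng.standard_normal((hi, kj))                # appended block X_{i,j}

# UQRD (18): Q^T [R_old; B] = [R_new; 0]
Qfac, R_stack = qr(np.vstack([R_old, B]))
R_new = R_stack[:kj]
```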
The orthogonal matrix $P_i$ is defined as $P_i = \tilde P_{i,1} \cdots \tilde P_{i,i}$, where $\tilde P_{i,j} = \hat P^{(1)}_{i,j} \cdots \hat P^{(i-j+1)}_{i,j}$ and $\hat P^{(s)}_{i,j}$ annihilates the $s$th block of the $(G-i+j)$th block-diagonal of $\tilde W^{(i-1)}_{G-i+1,1}$ ($i = 1, \dots, G-1$, $j = 1, \dots, i$ and $s = 1, \dots, i-j+1$). Fig. 4 illustrates this strategy for computing (16), where $i = 3$ and $G = 5$. Arcs connecting the blocks and block-columns indicate those affected by the orthogonal factorization $\hat P^{(s)}_{i,j}$.

4. A recursive strategy for solving the SUR-USO model

In the GQRD (9) the computations of the QRD (9a) and RQD (9b) can be interleaved. The orthogonal matrix $Q_i^T$ in (14a), when applied from the left of $(\bar X \; \bar C)$ to annihilate $\bar X_i$, will fill in a block in the lower part of $\bar C$. This fill-in is eliminated by the application of an orthogonal transformation from the right of the modified $\bar C$. That is, following (14a) and (14b),

$$Q_i^T \begin{pmatrix} \bar C^{(G-i)}_{1,1} & 0 & \bar C^{(G-i)}_{1,i+1} & \cdots & \bar C^{(G-i)}_{1,G} \\ 0 & \bar C^{(0)}_{i,i} & 0 & \cdots & 0 \end{pmatrix} = \begin{pmatrix} \hat C^{(G-i)}_{1,1} & \hat C^{(G-i)}_{1,i} & \bar C^{(G-i+1)}_{1,i+1} & \cdots & \bar C^{(G-i+1)}_{1,G} \\ \hat C^{(G-i)}_{i,1} & \hat C^{(G-i)}_{i,i} & \bar C^{(G-i+1)}_{i,i+1} & \cdots & \bar C^{(G-i+1)}_{i,G} \end{pmatrix} \qquad (21)$$

and the RQD

$$\begin{pmatrix} \hat C^{(G-i)}_{1,1} & \hat C^{(G-i)}_{1,i} \\ \hat C^{(G-i)}_{i,1} & \hat C^{(G-i)}_{i,i} \end{pmatrix} P_i = \begin{pmatrix} \bar C^{(G-i+1)}_{1,1} & \bar C^{(G-i+1)}_{1,i} \\ 0 & \bar C^{(G-i+1)}_{i,i} \end{pmatrix} \qquad (22)$$

are computed, where $\bar C^{(G-i+1)}_{1,1}$ and $\bar C^{(G-i+1)}_{i,i}$ are upper triangular, $P_i \in \mathbb{R}^{(K+\tau_i)\times(K+\tau_i)}$ is orthogonal, $\hat C^{(G-i)}_{i,1}$ and $\hat C^{(G-i)}_{1,i}$ have, respectively, the same structure as $W^{(0)}_{i,1}$ and $W^{(0)}_{1,i}$ in (20), and $i = G, G-1, \dots, 2$. The orthogonal matrices $\bar Q$ and $\bar P$ in (9) are defined as the products of the left and right transformations, respectively. Furthermore, note that (22) involves only 4 blocks of $\bar C$, instead of the $2i$ blocks of its corresponding factorization (16).

[Fig. 5. Annihilation of $\bar X_2, \dots, \bar X_G$ and retriangularization of $\oplus_i \bar C^{(0)}_{i,i}$, where G = 5.]
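The interleaving of (21) and (22) can be illustrated generically: an orthogonal transformation from the left destroys the triangularity of the block-diagonal factor, and an RQ factorization from the right restores it while leaving the implied covariance $CC^T$ of the transformed system unchanged. A dense sketch with invented sizes follows; a real implementation operates only on the four affected blocks.

```python
import numpy as np
from scipy.linalg import qr, rq

rng = np.random.default_rng(5)
n1, n2 = 4, 3                         # invented block sizes (playing the roles of K and tau_i)
C11 = np.triu(rng.standard_normal((n1, n1))) + 4 * np.eye(n1)
Cii = np.triu(rng.standard_normal((n2, n2))) + 4 * np.eye(n2)
Cblk = np.block([[C11, np.zeros((n1, n2))],
                 [np.zeros((n2, n1)), Cii]])          # block-diagonal triangular factor

Qi = qr(rng.standard_normal((n1 + n2, n1 + n2)))[0]   # stand-in for Q_i of (14a)
Chat = Qi.T @ Cblk                    # (21): the left transformation creates fill-in
Cnew, Pi = rq(Chat)                   # (22): an RQ step restores the triangular form
```

Since `Pi` is orthogonal, `Cnew @ Cnew.T` equals `Chat @ Chat.T`, so the dispersion represented by the factor is preserved by the right transformation.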
This results in an algorithm which has lower computational complexity and lower memory usage. The annihilations and fill-ins occurring at each step of this procedure are shown in Fig. 5, where $G = 5$. Note that after the $(i+1)$th ($i = G-1, G-2, \dots, 1$) step of the above strategy the GLLSP (13) can be written as

$$\mathop{\arg\min}_{\beta,\,\bar v_1^{(G-i)},\,\bar v_*^{(G-i)},\,\bar v_2,\dots,\bar v_i} \|\bar v_1^{(G-i)}\|^2 + \|\bar v_*^{(G-i)}\|^2 + \sum_{j=2}^{i}\|\bar v_j\|^2 \qquad \text{subject to}$$

$$\begin{pmatrix} \bar y_1^{(G-i)} \\ \bar y_2 \\ \vdots \\ \bar y_i \\ \bar y_*^{(G-i)} \end{pmatrix} = \begin{pmatrix} \bar R^{(G-i)} \\ \bar X_2 \\ \vdots \\ \bar X_i \\ 0 \end{pmatrix}\beta + \begin{pmatrix} \bar C^{(G-i)}_{1,1} & 0 & \cdots & 0 & \bar C^{(G-i)}_{1,i+1:G} \\ 0 & \bar C^{(0)}_{2,2} & \cdots & 0 & 0 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & \cdots & \bar C^{(0)}_{i,i} & 0 \\ 0 & 0 & \cdots & 0 & \bar C^{(G-i)}_{*} \end{pmatrix}\begin{pmatrix} \bar v_1^{(G-i)} \\ \bar v_2 \\ \vdots \\ \bar v_i \\ \bar v_*^{(G-i)} \end{pmatrix}, \qquad (23)$$

where $\bar C^{(G-i)}_{*}$ is $(\sum_{j=i+1}^{G}\tau_j)\times(\sum_{j=i+1}^{G}\tau_j)$, upper triangular and non-singular, $\bar y_*^{(G-i)} = \operatorname{vec}(\{\bar y_{i+1}^{(G-i)}, \bar y_{i+2}^{(G-i-1)}, \dots, \bar y_G^{(1)}\})$ and $\bar v_*^{(G-i)} = \operatorname{vec}(\{\bar v_{i+1}, \dots, \bar v_G\})$. Thus, (23) is equivalent to

$$\mathop{\arg\min}_{\beta,\,\hat v_1^{(G-i)},\,\bar v_2,\dots,\bar v_i} \|\hat v_1^{(G-i)}\|^2 + \sum_{j=2}^{i}\|\bar v_j\|^2 \qquad \text{subject to}$$

$$\begin{pmatrix} \hat y_1^{(G-i)} \\ \bar y_2 \\ \vdots \\ \bar y_i \end{pmatrix} = \begin{pmatrix} \bar R^{(G-i)} \\ \bar X_2 \\ \vdots \\ \bar X_i \end{pmatrix}\beta + \begin{pmatrix} \bar C^{(G-i)}_{1,1} & 0 & \cdots & 0 \\ 0 & \bar C^{(0)}_{2,2} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \bar C^{(0)}_{i,i} \end{pmatrix}\begin{pmatrix} \hat v_1^{(G-i)} \\ \bar v_2 \\ \vdots \\ \bar v_i \end{pmatrix}, \qquad (24)$$

where

$$\hat y_1^{(G-i)} = \bar y_1^{(G-i)} - \bar C^{(G-i)}_{1,i+1:G}\bar v_*^{(G-i)} \qquad \text{and} \qquad \bar v_*^{(G-i)} = (\bar C^{(G-i)}_{*})^{-1}\bar y_*^{(G-i)}.$$

The latter suggests a recursive strategy which solves a sequence of smaller GLLSPs and requires less computational effort in computing the RQD. At the $i$th ($i = G, G-1, \dots, 2$) step the recursive algorithm solves the GLLSP (24) by computing the QRD in (14a),

$$Q_i^T \begin{pmatrix} \hat y_1^{(G-i)} \\ \bar y_i \end{pmatrix} = \begin{pmatrix} \tilde y_1^{(G-i)} \\ \hat y_i \end{pmatrix}, \qquad (25)$$

$$Q_i^T \begin{pmatrix} \bar C^{(G-i)}_{1,1} & 0 \\ 0 & \bar C^{(0)}_{i,i} \end{pmatrix} = \begin{pmatrix} \hat C^{(G-i)}_{1,1} & \hat C^{(G-i)}_{1,i} \\ \hat C^{(G-i)}_{i,1} & \hat C^{(G-i)}_{i,i} \end{pmatrix} \qquad (26)$$

and the RQD in (22). As in the case of (13), the GLLSP (24) is reduced to

$$\mathop{\arg\min}_{\beta,\,\hat v_1^{(G-i+1)},\,\bar v_2,\dots,\bar v_{i-1}} \|\hat v_1^{(G-i+1)}\|^2 + \sum_{j=2}^{i-1}\|\bar v_j\|^2 \qquad \text{subject to}$$
[Fig. 6. Annihilation of $\bar X_2, \dots, \bar X_G$ and retriangularization of $\oplus_i \bar C^{(0)}_{i,i}$, where G = 5.]

$$\begin{pmatrix} \hat y_1^{(G-i+1)} \\ \bar y_2 \\ \vdots \\ \bar y_{i-1} \end{pmatrix} = \begin{pmatrix} \bar R^{(G-i+1)} \\ \bar X_2 \\ \vdots \\ \bar X_{i-1} \end{pmatrix}\beta + \begin{pmatrix} \bar C^{(G-i+1)}_{1,1} & 0 & \cdots & 0 \\ 0 & \bar C^{(0)}_{2,2} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \bar C^{(0)}_{i-1,i-1} \end{pmatrix}\begin{pmatrix} \hat v_1^{(G-i+1)} \\ \bar v_2 \\ \vdots \\ \bar v_{i-1} \end{pmatrix}, \qquad (27)$$

where

$$\hat y_1^{(G-i+1)} = \tilde y_1^{(G-i)} - \bar C^{(G-i+1)}_{1,i}(\bar C^{(G-i+1)}_{i,i})^{-1}\hat y_i. \qquad (28)$$

The structure of (27) is the same as that of (24), but smaller in size. The GLLSP (13) is equivalent to (24) when $i = G$ and $\hat y_1^{(G-i)} \equiv \bar y_1^{(0)}$. Thus, this process can be applied recursively to solve (13) and derive the BLUE of the SUR-USO model. Algorithm 1 summarizes the steps of this recursive procedure and Fig. 6 illustrates the factorization steps for $G = 5$.

Algorithm 1. Recursive estimation of the SUR-USO model.
1: Compute the GQRD (11), $\hat y_1 = \hat Q_0^T\bar y_1$ and $\tilde y_1 = \tilde Q_0^T\bar y_1$.
2: Solve the upper-triangular system $\tilde W_{1,1}\tilde v_1 = \tilde y_1$.
3: Compute $\bar y_1^{(0)} = \hat y_1 - \hat W_{1,1}\tilde v_1$.
4: for $i = G, G-1, \dots, 2$ do
5:   Compute the UQRD (14a).
6:   Compute (25) and (26).
7:   Compute the RQD (22).
8:   Compute (28).
9: end for
10: Solve the upper-triangular system $\bar R^{(G-1)}\beta = \hat y_1^{(G-1)}$.

5. Maximum likelihood estimation

Under normality assumptions, the maximum likelihood (ML) estimators for $\beta_i$ and $\Sigma$ derive from the solution of the non-linear equations

$$\frac{\partial L}{\partial \beta} = 0 \qquad (29a)$$

and

$$\frac{\partial L}{\partial \Sigma} = 0, \qquad (29b)$$

where $L$ is the log-likelihood function for the SUR-USO model (3). The non-linear equations (29) are solved by using the EM algorithm. An initial estimator for $\Sigma$ is chosen in order to obtain an estimator for $\beta_i$ from (29a), which in turn is used to provide a new estimator for $\Sigma$. This process is repeated until convergence (Dempster et al., 1977). The solution of (29a) is equivalent to the GLS estimator of (3) and can be computed using the previously derived methods.
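The alternating scheme just described can be sketched generically. For brevity the sketch below uses the equal-observations SUR model and forms the dense GLS matrices directly on simulated data; the point of the paper is precisely to replace these dense computations by the GQRD-based steps, so this is an illustration of the iteration, not of the proposed numerics.

```python
import numpy as np
from scipy.linalg import block_diag

rng = np.random.default_rng(4)
t, G = 60, 2
Xs = [rng.standard_normal((t, 2)) for _ in range(G)]
Sigma_true = np.array([[1.0, 0.5], [0.5, 1.0]])
U = rng.multivariate_normal(np.zeros(G), Sigma_true, size=t)   # t draws of the G disturbances
ys = [Xs[i] @ np.ones(2) + U[:, i] for i in range(G)]

X, y = block_diag(*Xs), np.concatenate(ys)
Sigma = np.eye(G)                       # initial guess: the first pass is OLS
for _ in range(20):                     # alternate the beta step and the covariance step
    Oi = np.kron(np.linalg.inv(Sigma), np.eye(t))
    beta = np.linalg.solve(X.T @ Oi @ X, X.T @ Oi @ y)   # GLS step, cf. (29a)
    E = (y - X @ beta).reshape(G, t)    # residuals, one row per equation
    Sigma = E @ E.T / t                 # updated covariance estimate, cf. (29b)
```

A fixed iteration count stands in for a convergence test on successive estimates of $\Sigma$.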
Thus, only the numerical solution of (29b) will be considered. Note that, when the disturbances are not normally distributed, this approach can be considered as a quasi-maximum likelihood estimation procedure.

The SUR-USO model (3) is equivalent to the set of equations

$$\begin{aligned} Y_1 &= (X_{1,1}\beta_1 \;\; X_{1,2}\beta_2 \;\cdots\; X_{1,G}\beta_G) + U_1, \\ Y_2 &= (X_{2,2}\beta_2 \;\cdots\; X_{2,G}\beta_G) + U_2, \\ &\;\;\vdots \\ Y_G &= X_{G,G}\beta_G + U_G, \end{aligned} \qquad (30)$$

where $Y_i = (y_{i,i} \cdots y_{i,G}) \in \mathbb{R}^{h_i\times(G-i+1)}$ and $U_i = (u_{i,i} \cdots u_{i,G}) \in \mathbb{R}^{h_i\times(G-i+1)}$ has a multivariate distribution with zero mean and covariance matrix given by $\Sigma^{(i)}$. That is, $\operatorname{vec}(U_i)$ has zero mean and covariance matrix given by $\Sigma^{(i)} \otimes I_{h_i}$. Furthermore, the elements of $U_i$ and $U_j$ are uncorrelated for $i \ne j$. The log-likelihood function of the $i$th equation in (30) and that of the whole set are given by

$$L_i = -\tfrac12\bigl(\tau_i + h_i\log(\det(\Sigma^{(i)})) + \operatorname{tr}(U_i^T U_i (\Sigma^{(i)})^{-1})\bigr)$$

and $L = \sum_{i=1}^{G} L_i$, respectively. Now, from $\Sigma^{(i)} = C_{(i)}C_{(i)}^T$ and $C_{(i)} = C_{i:,i:}$, it follows that

$$\frac{\partial L_i}{\partial C_{(i)}^{-1}} = h_i C_{(i)}^T - C_{(i)}^{-1}U_i^T U_i. \qquad (31)$$

Furthermore, since $C_{(i)}^{-1}$ is a submatrix of $C^{-1}$, the derivative of the log-likelihood function of the SUR-USO model (30) with respect to $C^{-1}$ is given by

$$\frac{\partial L}{\partial C^{-1}} = \sum_{i=1}^{G}\begin{pmatrix} 0_{(i-1)\times(i-1)} & 0_{(i-1)\times(G-i+1)} \\ 0_{(G-i+1)\times(i-1)} & \partial L_i/\partial C_{(i)}^{-1} \end{pmatrix}. \qquad (32)$$

Substituting (31) into (32) and considering only the non-zero elements of $C^{-1}$ (the elements in its upper triangle) gives

$$\frac{\partial L}{\partial\operatorname{vech}(C^{-T})} = \operatorname{vech}\Bigl(D_T D_C - \sum_{i=1}^{G}\begin{pmatrix} 0 & 0 \\ 0 & U_i^T U_i C_{(i)}^{-T} \end{pmatrix}\Bigr) = \operatorname{vech}\Bigl(\Bigl(D_T - \sum_{i=1}^{G}\begin{pmatrix} 0 & 0 \\ 0 & U_i^T U_i C_{(i)}^{-T} \end{pmatrix}D_C^{-1}\Bigr)D_C\Bigr), \qquad (33)$$

where $D_T = \operatorname{diag}(t_1, t_2, \dots, t_G)$, $D_C = \operatorname{diag}(C_{1,1}, C_{2,2}, \dots, C_{G,G})$ and $\operatorname{vech}$ is the half-vectorization operator which stacks the columns of its matrix argument from the principal diagonal downwards (Lütkepohl, 1996). That is, if $A = [a_{i,j}] \in \mathbb{R}^{n\times n}$, then $\operatorname{vech}(A) = (a_{1:n,1}^T \; a_{2:n,2}^T \cdots a_{n,n})^T$.
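The $\operatorname{vech}$ operator and the elimination matrix $L_G$ used in (33)-(35) can be written out directly. The dense construction of $L_G$ below is for illustration only, since in practice only its action on a vector is needed.

```python
import numpy as np

def vech(A):
    """Half-vectorization: stack the columns of A from the diagonal downwards."""
    n = A.shape[0]
    return np.concatenate([A[j:, j] for j in range(n)])

def elimination_matrix(n):
    """L_n such that vech(X) = L_n vec(X) for any n-by-n matrix X."""
    L = np.zeros((n * (n + 1) // 2, n * n))
    k = 0
    for j in range(n):          # vec stacks columns; entry (i, j) sits at position j*n + i
        for i in range(j, n):
            L[k, j * n + i] = 1.0
            k += 1
    return L
```

For a 3-by-3 matrix, $\operatorname{vech}$ keeps the 6 elements on and below the diagonal and $L_3$ is $6\times 9$, matching the stated $G(G+1)/2 \times G^2$ dimensions.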
From

$$C^{-1}\begin{pmatrix} 0 \\ I_{G-i+1} \end{pmatrix} = \begin{pmatrix} 0 \\ C_{(i)}^{-1} \end{pmatrix}$$

it follows that, with $\tilde C = CD_C$,

$$\operatorname{vech}\Bigl(\sum_{i=1}^{G}\begin{pmatrix} 0 & 0 \\ 0 & U_i^T U_i C_{(i)}^{-T} \end{pmatrix}D_C^{-1}\Bigr) = \bar A\operatorname{vech}(\tilde C^{-T}), \qquad (34)$$

where

$$\bar A = L_G\Bigl(\sum_{i=1}^{G}\begin{pmatrix} 0 & 0 \\ 0 & I_{G-i+1} \end{pmatrix}\otimes\begin{pmatrix} 0 & 0 \\ 0 & U_i^T U_i \end{pmatrix}\Bigr)L_G^T \qquad (35)$$

and the $G(G+1)/2 \times G^2$ elimination matrix $L_G$ is defined by $\operatorname{vech}(X) = L_G\operatorname{vec}(X)$, for any matrix $X \in \mathbb{R}^{G\times G}$ (Lütkepohl, 1996). Now, if $\bar U_i = (0_{h_i\times(i-1)} \; U_i) \in \mathbb{R}^{h_i\times G}$, then $\bar A$ in (35) can be written as

$$\bar A = L_G\Bigl(\sum_{i=1}^{G}\begin{pmatrix} 0 & 0 \\ 0 & I_{G-i+1} \end{pmatrix}\otimes\bar U_i^T\bar U_i\Bigr)L_G^T = \bigoplus_{j=1}^{G}\bar A_j,$$

where $\bar A_j \in \mathbb{R}^{(G-j+1)\times(G-j+1)}$ is given by

$$\bar A_j = (0_{(G-j+1)\times(j-1)} \;\; I_{G-j+1})\Bigl(\sum_{i=j}^{G}\bar U_i^T\bar U_i\Bigr)\begin{pmatrix} 0_{(j-1)\times(G-j+1)} \\ I_{G-j+1} \end{pmatrix}.$$

Note that if $G > t_1$, then $\bar A_1$, and thus $\bar A$, is semidefinite.

From (33) and (34) it follows that the solution of the non-linear equation (29b) derives from the solution of the symmetric linear system

$$\bar A\operatorname{vech}(M) = \operatorname{vech}(D_T),$$

or, equivalently, from solving the set of symmetric linear systems

$$\bar A_i M_{i:G,i} = t_i e_1 \qquad (i = 1, 2, \dots, G), \qquad (36)$$

where $M \equiv \tilde C^{-T}$ and $e_1$ denotes the first column of the identity matrix. Once $\tilde C = M^{-T}$ is computed, from the definition of $\tilde C$ it follows that the elements of $C$ are given by

$$C_{i,i} = \sqrt{\tilde C_{i,i}} \qquad \text{and} \qquad C_{j,i} = \tilde C_{j,i}/C_{i,i} \quad \text{for } j = 1, 2, \dots, i-1,$$

where it has been assumed that $\bar A$ is positive definite and thus $\tilde C_{i,i} > 0$. Note that when $t_1 < G$, $\bar A_1$, and thus $\bar A$, is positive semidefinite, which implies that (36) may not have a solution.

6. Conclusions

Computationally efficient methods to solve the SUR model with unequal size observations (SUR-USO), treated as a GLLSP, have been proposed.
The algorithms use the GQRD to solve the GLLSP by exploiting the block-sparse structure of the matrices. The first algorithm initially computes the QRD of the exogenous matrix by annihilating, from bottom to top, blocks of observations which consist of a non-zero block-superdiagonal. The annihilation of the blocks is obtained by orthogonal transformations which do not create any fill-in. These transformations are also applied from the left of the Cholesky factor, and then a sequence of orthogonal factorizations is applied to retriangularize it from the right. The second, recursive algorithm interleaves the QRD and RQD of the exogenous and modified Cholesky factors, respectively. This avoids the explicit computation of the RQD and thus reduces the computational burden of the estimation procedure.

The algorithms presented here assumed for simplicity that $t_1 \ge k_i$ ($i = 1, \dots, G$). This implies that $\bar R^{(G-i)}$ in (14a) is upper triangular and not trapezoidal. Generally, this assumption should be relaxed and the algorithms modified to deal with cases where the QRD (14a) yields a trapezoidal factor. This generalization will allow the investigation of alternative block-generalizations of Givens sequences to compute the QRD (9a) without imposing additional assumptions so that $\bar R^{(G-i)}$ is triangular.

For the case of normally distributed disturbances the maximum likelihood estimation has been considered. A closed-form solution for the Cholesky factor of the covariance matrix has been derived by solving the first-order conditions (29). This resulted in an iterative procedure to estimate the SUR-USO model when the variance-covariance matrix is unknown. Furthermore, this procedure never yields a non-definite estimator for $\Sigma$.

The extension of the proposed methods to solve SUR models with missing observations will be investigated (Hocking and Smith, 1968).
Currently, the adaptation and (parallel) implementation of the recursive algorithm to solve the standard SUR model (with equal size observations) is being considered.

Acknowledgements

The authors are grateful to Jesse Barlow and the anonymous referees for their constructive comments and suggestions.

References

Dempster, A.P., Laird, N.M., Rubin, D.B., 1977. Maximum likelihood from incomplete data via the EM algorithm (with discussion). J. Roy. Statist. Soc. Ser. B 39, 1-38.
Foschi, P., Kontoghiorghes, E.J., 2002. Estimation of VAR(p) models: computational aspects. Computat. Econom., to appear.
Foschi, P., Belsley, D.A., Kontoghiorghes, E.J. A comparative study of algorithms for solving seemingly unrelated regressions models. Computat. Statist. Data Anal., to appear.
Golub, G.H., Van Loan, C.F., 1996. Matrix Computations, 3rd Edition. Johns Hopkins University Press, Baltimore, MD.
Hocking, R.R., Smith, W.B., 1968. Estimation of parameters in the multivariate normal distribution with missing observations. J. Amer. Statist. Assoc. 63, 159-173.
Kontoghiorghes, E.J., 1999. Parallel strategies for computing the orthogonal factorizations used in the estimation of econometric models. Algorithmica 25, 58-74.
Kontoghiorghes, E.J., 2000a. Parallel Algorithms for Linear Models: Numerical Methods and Estimation Problems. Advances in Computational Economics, Vol. 15. Kluwer Academic Publishers, Boston, MA.
Kontoghiorghes, E.J., 2000b. Parallel strategies for solving SURE models with variance inequalities and positivity of correlations constraints. Computat. Econom. 15 (1-2), 89-106.
Kourouklis, S., Paige, C.C., 1981. A constrained least squares approach to the general Gauss-Markov linear model. J. Amer. Statist. Assoc. 76 (375), 620-625.
Lütkepohl, H., 1996. Handbook of Matrices. Wiley, New York.
Paige, C.C., 1978. Numerically stable computations for general univariate linear models. Comm. Statist. Simulation Comput. 7 (5), 437-453.
Schmidt, P., 1977.
Estimation of seemingly unrelated regressions with unequal numbers of observations. J. Econometrics 5, 365-377.
Sharma, V.K., 1993. Estimation of seemingly unrelated regressions with unequal numbers of observations. Sankhyā: Indian J. Statist. 55, 135-138.
Srivastava, V.K., Giles, D.E.A., 1987. Seemingly Unrelated Regression Equations Models: Estimation and Inference. Statistics: Textbooks and Monographs, Vol. 80. Marcel Dekker, New York.
Srivastava, J.N., Zaatar, M.K., 1973. A Monte Carlo comparison of four estimators of the dispersion matrix of a bivariate normal population, using incomplete data. J. Amer. Statist. Assoc. 68 (341), 180-183.