Computational Statistics & Data Analysis 44 (2003) 3 – 35
www.elsevier.com/locate/csda
A comparative study of algorithms for solving
seemingly unrelated regressions models
Paolo Foschi^a,*, David A. Belsley^b, Erricos J. Kontoghiorghes^a

a Institut d'informatique, Université de Neuchâtel, Rue Emile Argand 11, Case Postale 2, CH-2007 Neuchâtel, Switzerland
b Department of Economics, Boston College, Chestnut Hill, MA 02467, USA
Received 1 February 2002; received in revised form 9 January 2003; accepted 9 January 2003
Abstract
The computational efficiency of various algorithms for solving seemingly unrelated regressions (SUR) models is investigated. Some of the algorithms adapt known methods; others are new. The first transforms the SUR model to an ordinary linear model and uses the QR decomposition to solve it. Three others employ the generalized QR decomposition to solve the SUR model formulated as a generalized linear least-squares problem. Strategies to exploit the structure of the matrices involved are developed. The algorithms are reconsidered for solving the SUR model after it has been transformed to one of smaller dimensions.
© 2003 Elsevier B.V. All rights reserved.
Keywords: SUR models; Least-squares; QR decomposition
1. Introduction
The seemingly unrelated regressions (SUR) model is defined by the set of regressions
\[ y_i = X_i\beta_i + u_i, \qquad i = 1,\dots,G, \]
where $X_i \in \mathbb{R}^{M\times k_i}$ has full column rank, $y_i \in \mathbb{R}^M$, and the $M$-element disturbance vector $u_i \sim (0, \sigma_{i,i} I_M)$ and is contemporaneously correlated across the equations so
This work is in part supported by the Swiss National Foundation Grants 1214-056900.99/1 and 2000-061875.00/1.
* Corresponding author. Department of Mathematics, University of Bologna, Via Sacchi, 3, 47023 Cesena, Italy. Tel.: +39-0547-642-806; fax: +39-0547-610-054.
E-mail addresses: [email protected] (P. Foschi), [email protected] (D.A. Belsley), [email protected] (E.J. Kontoghiorghes).
0167-9473/03/$ - see front matter © 2003 Elsevier B.V. All rights reserved. doi:10.1016/S0167-9473(03)00028-8
$E(u_iu_j^T) = \sigma_{i,j} I_M$ (Srivastava and Dwivedi, 1979; Srivastava and Giles, 1987; Telser, 1964; Zellner, 1962, 1963; Zellner and Theil, 1962). Compactly, the SUR model is written
\[
\begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_G \end{pmatrix}
=
\begin{pmatrix} X_1 & & & \\ & X_2 & & \\ & & \ddots & \\ & & & X_G \end{pmatrix}
\begin{pmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_G \end{pmatrix}
+
\begin{pmatrix} u_1 \\ u_2 \\ \vdots \\ u_G \end{pmatrix}
\]
or
\[
\operatorname{vec}(Y) = \Bigl(\bigoplus_{i=1}^{G} X_i\Bigr)\operatorname{vec}(\{\beta_i\}_G) + \operatorname{vec}(U), \tag{1}
\]
where $Y = (y_1\;\cdots\;y_G)$, $U = (u_1\;\cdots\;u_G)$, $\oplus_{i=1}^G X_i \equiv \oplus_i X_i \equiv \operatorname{diag}(X_1,\dots,X_G)$ denotes the direct sum of matrices, $\{\beta_i\}_G$ denotes the set of vectors $\beta_1,\dots,\beta_G$, and $\operatorname{vec}(\cdot)$ is the column-stack operator. The disturbance term $\operatorname{vec}(U) \sim (0, \Sigma\otimes I_M)$, where $\Sigma = [\sigma_{i,j}] \in \mathbb{R}^{G\times G}$ is symmetric and non-negative definite and $\otimes$ denotes the Kronecker product. In this treatment the following properties of the Kronecker product will be used: $(A\otimes B)(C\otimes D) = AC\otimes BD$, $(A\otimes B)^{-1} = A^{-1}\otimes B^{-1}$ and $\operatorname{vec}(ABC) = (C^T\otimes A)\operatorname{vec}(B)$. For notational convenience, the subscript $G$ in the set operator $\{\cdot\}$ is omitted, and $\oplus_{i=1}^G$ is abbreviated as $\oplus_i$. Also, $\operatorname{vec}(\{\beta_i\})$ is denoted simply $\beta$, so $\beta \equiv (\beta_1^T\;\cdots\;\beta_G^T)^T$ (Regalia and Mitra, 1989). The notation is consistent with that employed in Foschi and Kontoghiorghes (2002). The standard colon notation will be used to denote subvectors and submatrices (Golub and Van Loan, 1996).
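These three Kronecker-product identities are easy to verify numerically; the following sketch (ours, not from the paper) uses NumPy, where flattening in Fortran (column-major) order plays the role of the column-stack operator vec(.):

```python
import numpy as np

# Numerical check of the three identities used in the text:
# (A x B)(C x D) = AC x BD,  (A x B)^{-1} = A^{-1} x B^{-1},
# and vec(ABC) = (C^T x A) vec(B).
rng = np.random.default_rng(0)
A, B, C, D = (rng.standard_normal((3, 3)) for _ in range(4))

vec = lambda M: M.flatten(order="F")  # column-stack operator

prod_ok = np.allclose(np.kron(A, B) @ np.kron(C, D),
                      np.kron(A @ C, B @ D))
inv_ok = np.allclose(np.linalg.inv(np.kron(A, B)),
                     np.kron(np.linalg.inv(A), np.linalg.inv(B)))
vec_ok = np.allclose(vec(A @ B @ C), np.kron(C.T, A) @ vec(B))
```

Random Gaussian matrices are invertible with probability one, so the inverse identity can be checked directly.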
When $\Sigma$ is non-singular, a best linear unbiased estimator (BLUE) of $\beta$ results from solving the generalized (linear) least-squares (GLS) problem
\[ \operatorname*{argmin}_{\beta_1,\dots,\beta_G}\;\|\operatorname{vec}(Y) - \operatorname{vec}(\{X_i\beta_i\})\|_{\Sigma^{-1}\otimes I_M}, \tag{2} \]
which can be obtained from the normal equations
\[ \bigl(\oplus_i X_i^T\bigr)(\Sigma^{-1}\otimes I_M)\bigl(\oplus_i X_i\bigr)\hat\beta = \bigl(\oplus_i X_i^T\bigr)\operatorname{vec}(Y\Sigma^{-1}). \tag{3} \]
This solution, however, can be unstable when the matrices are ill-conditioned and explicit matrix inversions are used (Björck, 1996; Lawson and Hanson, 1974). Alternatively, multiplying (1) by $(C^{-1}\otimes I_M)$ gives the ordinary linear model (OLM)
\[ \operatorname{vec}(YC^{-T}) = (C^{-1}\otimes I_M)(\oplus_i X_i)\beta + \operatorname{vec}(UC^{-T}), \tag{4} \]
where $C \in \mathbb{R}^{G\times G}$ is a Cholesky factor of $\Sigma \equiv CC^T$ and is upper triangular. Computing the least-squares estimator of (4) derives the BLUE of the SUR model (1) (Pollock, 1979).
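The equivalence between the GLS solution of (3) and ordinary least squares on the whitened model (4) can be checked on a small randomly generated SUR instance. The sketch below is illustrative only: any factor L with LL^T = Sigma whitens, while the paper works with an upper-triangular C, and all names are ours:

```python
import numpy as np
from scipy.linalg import block_diag, cholesky, solve, lstsq

# A small SUR instance: GLS from the normal equations (3) coincides
# with OLS on the whitened model (4).
rng = np.random.default_rng(1)
G, M, k = 3, 20, 2
X = [rng.standard_normal((M, k)) for _ in range(G)]
Xd = block_diag(*X)                               # direct sum of the X_i
Sigma = np.cov(rng.standard_normal((G, 100)))     # SPD contemporaneous covariance
L = cholesky(Sigma, lower=True)                   # Sigma = L L^T (lower factor)
y = Xd @ rng.standard_normal(G * k) \
    + np.kron(L, np.eye(M)) @ rng.standard_normal(G * M)

# GLS via the normal equations (3)
Om_inv = np.kron(np.linalg.inv(Sigma), np.eye(M))
b_gls = solve(Xd.T @ Om_inv @ Xd, Xd.T @ Om_inv @ y)

# OLS on the whitened model (4)
T = np.kron(np.linalg.inv(L), np.eye(M))
b_olm = lstsq(T @ Xd, T @ y)[0]

agree = np.allclose(b_gls, b_olm)
```

The whitened normal equations reproduce (3) because (L^{-T} x I)(L^{-1} x I) = Sigma^{-1} x I.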
The SUR model can also be formulated as a generalized linear least-squares problem (GLLSP)
\[ \operatorname*{argmin}_{\beta,V}\;\|V\|_F^2 \qquad\text{subject to}\qquad \operatorname{vec}(Y) = (\oplus_i X_i)\beta + (C\otimes I_M)\operatorname{vec}(V), \tag{5} \]
where $VC^T = U$, $\operatorname{vec}(V) \sim (0, I_{GM})$ and $\|\cdot\|_F$ denotes the Frobenius norm (Kontoghiorghes, 2000a; Kontoghiorghes and Clarke, 1995). This approach allows the derivation of algorithms that are numerically more stable than those based on (4). Furthermore, the GLLSP allows solution of the BLUE of (1) even when $C$ is not of full rank, that is, when $\Sigma$ is singular (Kourouklis and Paige, 1981; Paige, 1978, 1979b; Söderkvist, 1996).
Often, $\Sigma$ is unknown and an iterative procedure is used to obtain the feasible GLS estimator (Telser, 1964). Given a consistent estimator of $\Sigma$, the solution of the model (2) derives an estimator for $\beta$. From the residuals of the estimated coefficients, another estimator of $\Sigma$ is obtained. This procedure is repeated until convergence. Thus, the GLS problem (2), or the corresponding GLLSP (5), is solved a number of times for different $\Sigma$. Here, the computational cost of deriving the estimator during a single iteration is considered. The particular properties of the SUR model that affect the convergence of the iterative estimation procedure are not investigated.
In this work, the computational efficiencies of various methods for computing the BLUE of the SUR model are considered. Some of the algorithms are well known while others are new. All of the algorithms are based on an orthogonal factorization obtained through the QR decomposition. In the next section the solutions of the SUR model using the QR and generalized QR decompositions are considered. Recursive estimation algorithms are presented in Section 3. Size reduction of large-scale SUR models is shown in Section 4. The computational results are discussed in Section 5. Section 6 provides summary comments.
2. Numerical estimation of the SUR model
2.1. Estimating the OLM using the QR decomposition
The OLM (4) can be written as
\[ \bar y = \bar X\beta + \bar\varepsilon, \tag{6} \]
where $\bar y = \operatorname{vec}(YC^{-T})$, $\bar X = (C^{-1}\otimes I_M)(\oplus_i X_i)$, and $\bar\varepsilon = \operatorname{vec}(UC^{-T})$. Let the QR decomposition (QRD) of $\bar X$ be given by
\[ \bar Q^T\bar X = \begin{pmatrix} \bar R \\ 0 \end{pmatrix}\!\!\begin{matrix} K \\ GM-K \end{matrix} \qquad\text{and}\qquad \bar Q^T\bar y = \begin{pmatrix} \bar y_A \\ \bar y_B \end{pmatrix}\!\!\begin{matrix} K \\ GM-K \end{matrix}, \tag{7} \]
where $\bar R$ is upper triangular, $\bar Q \in \mathbb{R}^{GM\times GM}$ is orthogonal, and $K = \sum_i k_i$ (Björck, 1996; Golub and Van Loan, 1996). The least-squares estimator of $\beta$ is given by solving the triangular system
\[ \bar R\beta = \bar y_A. \tag{8} \]
This straightforward solution of (4) is computationally inefficient since it computes $\bar X$ explicitly and ignores its sparsity.
To solve (4) efficiently, consider the QRD of $X_i$:
\[ Q_i^T X_i = \begin{pmatrix} R_i \\ 0 \end{pmatrix}, \qquad\text{with}\qquad Q_i^T = \begin{pmatrix} \tilde Q_i^T \\ \hat Q_i^T \end{pmatrix}\!\!\begin{matrix} k_i \\ M-k_i \end{matrix}, \tag{9} \]
where $Q_i \in \mathbb{R}^{M\times M}$ is orthogonal and $R_i \in \mathbb{R}^{k_i\times k_i}$ is upper triangular. From (9) it follows that the QRD of $\oplus_i X_i$ is given by
\[ Q^T(\oplus_i X_i) = \begin{pmatrix} \oplus_i R_i \\ 0 \end{pmatrix}\!\!\begin{matrix} K \\ GM-K \end{matrix}, \tag{10} \]
where $Q = (\oplus_i\tilde Q_i\;\;\oplus_i\hat Q_i)$. Premultiplying (4) by $Q^T$ gives
\[ Q^T\operatorname{vec}(YC^{-T}) = Q^T(C^{-1}\otimes I_M)(\oplus_i X_i)\beta + Q^T\operatorname{vec}(UC^{-T}), \]
or
\[ \begin{pmatrix} \operatorname{vec}(\{\tilde{\bar y}_i\}) \\ \operatorname{vec}(\{\hat{\bar y}_i\}) \end{pmatrix} = \begin{pmatrix} \tilde W \\ \hat W \end{pmatrix}\beta + \begin{pmatrix} \operatorname{vec}(\tilde V) \\ \operatorname{vec}(\hat V) \end{pmatrix}, \tag{11} \]
where $\tilde{\bar y}_i = \tilde Q_i^T\bar y_i$ and $\hat{\bar y}_i = \hat Q_i^T\bar y_i$ (with $\bar y_i$ the $i$th column of $\bar Y = YC^{-T}$), and $\tilde W = [\tilde W_{i,j}]$ and $\hat W = [\hat W_{i,j}]$ are the $K\times K$ and $(GM-K)\times K$ block matrices with $\tilde W_{i,j} \in \mathbb{R}^{k_i\times k_j}$ and $\hat W_{i,j} \in \mathbb{R}^{(M-k_i)\times k_j}$ given by
\[ \tilde W_{i,j} = \begin{cases} \sigma^{i,j}\,\tilde Q_i^T X_j & \text{if } i < j, \\ \sigma^{i,i} R_i & \text{if } i = j, \\ 0 & \text{if } i > j, \end{cases} \qquad\qquad \hat W_{i,j} = \begin{cases} \sigma^{i,j}\,\hat Q_i^T X_j & \text{if } i < j, \\ 0 & \text{if } i \ge j, \end{cases} \tag{12} \]
and $\sigma^{i,j}$ is the $(i,j)$th element of $C^{-1}$. Notice that $\tilde W$ and $\hat W$ are block upper-triangular and strictly block upper-triangular matrices, respectively.
Now, compute a row-updating QRD (hereafter abbreviated to UQRD)
\[ \bar Q^T\begin{pmatrix} \tilde W \\ \hat W \end{pmatrix} = \begin{pmatrix} \bar R \\ 0 \end{pmatrix}\!\!\begin{matrix} K \\ GM-K \end{matrix} \tag{13a} \]
and
\[ \bar Q^T\begin{pmatrix} \operatorname{vec}(\{\tilde{\bar y}_i\}) \\ \operatorname{vec}(\{\hat{\bar y}_i\}) \end{pmatrix} = \begin{pmatrix} \bar y_A \\ \bar y^* \end{pmatrix}\!\!\begin{matrix} K \\ GM-K \end{matrix}. \tag{13b} \]
It follows that the least-squares solution of (11), and thus the BLUE of $\beta$, is given by (8). Algorithm 1 summarizes these steps for solving (4). Two block strategies, the column- and diagonally-based methods, that can be used to compute (13) (step 8) are described in Appendix A.
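As a concrete illustration of the procedure summarized in Algorithm 1 below, the following NumPy sketch builds the blocks of W-tilde and W-hat from the small QRDs (9) instead of factorizing the GM x GM matrix X-bar explicitly, and checks the result against a dense least-squares solve. All variable names are ours, and the upper-triangular Cholesky factor is obtained by a "reverse" Cholesky via flips:

```python
import numpy as np
from scipy.linalg import block_diag, solve_triangular

rng = np.random.default_rng(2)
G, M, k = 3, 12, 2
K = G * k
X = [rng.standard_normal((M, k)) for _ in range(G)]
Y = rng.standard_normal((M, G))             # columns y_1, ..., y_G
Sigma = np.cov(rng.standard_normal((G, 50)))

# Upper-triangular C with Sigma = C C^T (reverse Cholesky via double flip)
C = np.flip(np.linalg.cholesky(np.flip(Sigma)))
S = np.linalg.inv(C)                        # sigma^{i,j} = (C^{-1})_{ij}, upper triangular
Ybar = Y @ S.T                              # \bar Y = Y C^{-T}

# Small QRDs (9) and the transformed right-hand sides
Qs = [np.linalg.qr(Xi, mode="complete") for Xi in X]
Wt = np.zeros((K, K))                       # \tilde W, block upper-triangular
Wh = np.zeros((G * (M - k), K))             # \hat W, strictly block upper-triangular
yt, yh = np.zeros(K), np.zeros(G * (M - k))
for i, (Qi, Ri) in enumerate(Qs):
    Qt, Qh = Qi[:, :k], Qi[:, k:]
    yt[i*k:(i+1)*k] = Qt.T @ Ybar[:, i]
    yh[i*(M-k):(i+1)*(M-k)] = Qh.T @ Ybar[:, i]
    Wt[i*k:(i+1)*k, i*k:(i+1)*k] = S[i, i] * Ri[:k]   # diagonal blocks of (12)
    for j in range(i + 1, G):               # blocks below the diagonal stay zero
        Wt[i*k:(i+1)*k, j*k:(j+1)*k] = S[i, j] * (Qt.T @ X[j])
        Wh[i*(M-k):(i+1)*(M-k), j*k:(j+1)*k] = S[i, j] * (Qh.T @ X[j])

# UQRD (13) of the stacked matrix and triangular solve (8)
Qb, Rb = np.linalg.qr(np.vstack([Wt, Wh]))
beta = solve_triangular(Rb, Qb.T @ np.concatenate([yt, yh]), lower=False)

# Reference: dense least squares on \bar X = (C^{-1} x I_M)(direct sum of X_i)
Xbar = np.kron(S, np.eye(M)) @ block_diag(*X)
beta_ref = np.linalg.lstsq(Xbar, Ybar.T.reshape(-1), rcond=None)[0]
ok = np.allclose(beta, beta_ref)
```

The stacked system is just an orthogonal (row-permuted) transformation of the dense whitened system, so the least-squares solutions coincide.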
Algorithm 1. Ordinary least squares estimation of the OLM (4).
1: Compute $\Sigma = CC^T$
2: Compute $C^{-1} = [\sigma^{i,j}]$ and $\bar Y = (\bar y_1\;\cdots\;\bar y_G) = YC^{-T}$
3: for $i = 1,\dots,G$ do
4:   Compute the QRD (9)
5:   Compute $\tilde{\bar y}_i = \tilde Q_i^T\bar y_i$ and $\hat{\bar y}_i = \hat Q_i^T\bar y_i$
6: end for
7: Compute $\tilde W$ and $\hat W$ as in (12)
8: Compute the UQRD (13a) and (13b)
9: Solve the triangular system (8) for $\beta$

2.2. The GLLSP and generalized QRD
The solution of the GLLSP (5) can be obtained by computing the generalized QRD (GQRD) of $\oplus_i X_i$ and $(C\otimes I_M)$ (Björck, 1996; Kontoghiorghes, 2000; Kontoghiorghes and Dinenis, 1997; Kourouklis and Paige, 1981; Paige, 1990), that is, by computing the QRD (10) and the RQ decomposition
\[ Q^T(C\otimes I_M)P = W \equiv \begin{pmatrix} W_{AA} & W_{AB} \\ 0 & W_{BB} \end{pmatrix}\!\!\begin{matrix} K \\ GM-K \end{matrix}, \tag{14} \]
where $K = \sum_i k_i$, $P \in \mathbb{R}^{GM\times GM}$ is orthogonal, and $W_{BB}$ is upper triangular. Premultiplying the constraints in (5) by $Q^T$ and using $\operatorname{vec}(V) \equiv PP^T\operatorname{vec}(V)$, the GLLSP can
be written as
\[ \operatorname*{argmin}_{\beta,\{\tilde v_i\},\{\hat v_i\}}\;\sum_{i=1}^G\bigl(\|\tilde v_i\|^2 + \|\hat v_i\|^2\bigr) \qquad\text{subject to} \]
\[ \begin{pmatrix} \operatorname{vec}(\{\tilde y_i\}) \\ \operatorname{vec}(\{\hat y_i\}) \end{pmatrix} = \begin{pmatrix} \oplus_i R_i \\ 0 \end{pmatrix}\beta + \begin{pmatrix} W_{AA} & W_{AB} \\ 0 & W_{BB} \end{pmatrix}\begin{pmatrix} \operatorname{vec}(\{\tilde v_i\}) \\ \operatorname{vec}(\{\hat v_i\}) \end{pmatrix}, \tag{15} \]
where $\tilde Q_i^Ty_i = \tilde y_i$, $\hat Q_i^Ty_i = \hat y_i$, $P^T\operatorname{vec}(V) = (\operatorname{vec}(\{\tilde v_i\})^T\;\operatorname{vec}(\{\hat v_i\})^T)^T$, $\tilde y_i, \tilde v_i \in \mathbb{R}^{k_i}$ and $\hat y_i, \hat v_i \in \mathbb{R}^{M-k_i}$. The solution of (15) is given by $\operatorname{vec}(\{\tilde v_i\}) = 0$ and
\[ \begin{pmatrix} \oplus_i R_i & W_{AB} \\ 0 & W_{BB} \end{pmatrix}\begin{pmatrix} \beta \\ \operatorname{vec}(\{\hat v_i\}) \end{pmatrix} = \begin{pmatrix} \operatorname{vec}(\{\tilde y_i\}) \\ \operatorname{vec}(\{\hat y_i\}) \end{pmatrix}. \tag{16} \]
The RQD (14) is computed in two stages. The first computes the permutation
\[ Q^T(C\otimes I_M)\Pi = \begin{pmatrix} \tilde W_{AA} & \tilde W_{AB} \\ \tilde W_{BA} & \tilde W_{BB} \end{pmatrix}\!\!\begin{matrix} K \\ GM-K \end{matrix}, \tag{17} \]
where $\Pi = (\oplus_i(I_{k_i}\;0)^T\;\;\oplus_i(0\;I_{M-k_i})^T)$. This results in $\tilde W_{AA}$, $\tilde W_{AB}$, $\tilde W_{BA}$ and $\tilde W_{BB}$ being block upper-triangular. The second stage computes the RQD
\[ (\tilde W_{BA}\;\;\tilde W_{BB})\tilde P = (0\;\;W_{BB}) \tag{18a} \]
and
\[ (\tilde W_{AA}\;\;\tilde W_{AB})\tilde P = (W_{AA}\;\;W_{AB}), \tag{18b} \]
where $\tilde P \in \mathbb{R}^{GM\times GM}$ is orthogonal. Notice that $P$ in (14) is given by $\Pi\tilde P$ and that (18) does not compute the RQD of the whole matrix in (17). The leading submatrix $W_{AA}$, which is not used in the solution of the GLLSP, is not triangularized. Furthermore, the RQD (18a) is equivalent to the QL decomposition
\[ \tilde P^T\begin{pmatrix} \tilde W_{BA}^T \\ \tilde W_{BB}^T \end{pmatrix} = \begin{pmatrix} 0 \\ W_{BB}^T \end{pmatrix}. \]
This indicates that (18a) can be computed using adaptations of the diagonally-based and column-based strategies (see Appendix A) that are used for the computation of (13a) (Kontoghiorghes, 1999, 2000b). Furthermore, these strategies produce a $W_{AA}$ in (18b) that is block upper-triangular.

Fig. 1. Structure of the matrices $\tilde W = Q^T(C\otimes I_M)\Pi$ and $\tilde W^{(0)} = Q^T(C\otimes I_M)Q$, where $G = 5$.

Fig. 2. Annihilation steps (Steps 1-4), where $G = 5$.
The first step of the diagonally-based strategy annihilates the main block-diagonal of $\tilde W_{BA}$. However, the permutation in (17) along with this step is equivalent to applying $Q$ to the right of $Q^T(C\otimes I_M)$; that is,
\[ Q^T(C\otimes I_M)Q = \tilde W^{(0)} \equiv \begin{pmatrix} \tilde W^{(0)}_{AA} & \tilde W^{(0)}_{AB} \\ \tilde W^{(0)}_{BA} & \tilde W^{(0)}_{BB} \end{pmatrix}\!\!\begin{matrix} K \\ GM-K \end{matrix}, \tag{19} \]
where $\tilde W^{(0)}_{AA}$ and $\tilde W^{(0)}_{BB}$ are block upper-triangular with the $i$th blocks of their main diagonals given by $C_{i,i}I_{k_i}$ and $C_{i,i}I_{M-k_i}$, respectively. Furthermore, the matrices $\tilde W^{(0)}_{AB}$ and $\tilde W^{(0)}_{BA}$ are strictly block upper-triangular. Fig. 1 shows the structure of the matrices $\tilde W$ and $\tilde W^{(0)}$, where $G = 5$.
Thus the remaining steps of the diagonally-based method annihilate the strictly block upper-triangular matrix $\tilde W^{(0)}_{BA}$ while preserving the block-triangular structure of $\tilde W^{(0)}_{AA}$ and $\tilde W^{(0)}_{BB}$. This annihilation strategy is illustrated in Fig. 2, where an arc denotes an updating RQD (URQD). Algorithm 2 summarizes the steps of this estimation procedure.
Algorithm 2. Solution of the GLLSP (5) using the GQRD.
1: Compute $\Sigma = CC^T$
2: for $i = 1,\dots,G$ do
3:   Compute the QRD (9)
4:   Compute $\tilde y_i = \tilde Q_i^Ty_i$ and $\hat y_i = \hat Q_i^Ty_i$
5: end for
6: Compute $Q^T(C\otimes I_M)Q$ as in (19)
7: Compute the URQD $(\tilde W^{(0)}_{BA}\;\;\tilde W^{(0)}_{BB})\tilde P = (0\;\;W_{BB})$
8: Compute $(\tilde W^{(0)}_{AA}\;\;\tilde W^{(0)}_{AB})\tilde P = (W_{AA}\;\;W_{AB})$
9: Solve the triangular system (16) for $\beta$ and $\operatorname{vec}(\{\hat v_i\})$
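Ignoring the block structure for the moment, the GQRD route of Algorithm 2 can be sketched generically with SciPy: a QR of the regressor matrix followed by an RQ of the transformed covariance factor, then the two triangular solves of (16). This is an illustrative reimplementation on a small dense problem, not the authors' code:

```python
import numpy as np
from scipy.linalg import qr, rq, solve_triangular, solve

# GLLSP: argmin ||v||^2 subject to y = X b + B v (B the covariance factor).
rng = np.random.default_rng(3)
n, p = 12, 4
X = rng.standard_normal((n, p))
B = np.triu(rng.standard_normal((n, n))) + 4 * np.eye(n)  # nonsingular factor
y = X @ rng.standard_normal(p) + B @ rng.standard_normal(n)

# Generalized QRD: QR of X, then RQ of Q^T B
Q, R = qr(X)                     # full QR: Q is n x n, R is n x p
W, P = rq(Q.T @ B)               # Q^T B = W P with W upper triangular
z = Q.T @ y

# Partition as in (16): the minimum-norm part v_tilde is zero;
# solve for v_hat, then for b
WAB, WBB = W[:p, p:], W[p:, p:]
v_hat = solve_triangular(WBB, z[p:])
b = solve_triangular(R[:p], z[:p] - WAB @ v_hat)

# Reference: GLS with Omega = B B^T
Om = B @ B.T
b_gls = solve(X.T @ solve(Om, X), X.T @ solve(Om, y))
ok = np.allclose(b, b_gls)
```

With B nonsingular the GLLSP and GLS solutions coincide, which the final check verifies.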
2.3. An interleaving approach to solving the GLLSP

The RQD (14) is the most expensive operation in computing the GQRD of $\oplus_i X_i$ and $C\otimes I_M$ (see Appendix B). An iterative procedure that does not compute (14) can be employed (Paige, 1979b). At each iteration a smaller problem is solved. Let $\bar X^{(0)} = \oplus_{i=1}^G X_i$, $\bar W^{(0)} = C\otimes I_M$, $\bar y^{(0)} = \operatorname{vec}(Y)$, and $\bar v^{(0)} = \operatorname{vec}(V)$. The $s$th $(s = 0,\dots,G-1)$ iteration deals with the GLLSP
\[ \operatorname*{argmin}_{\beta,\bar v^{(s)}}\;\|\bar v^{(s)}\|^2 \qquad\text{subject to}\qquad \bar y^{(s)} = \bar X^{(s)}\beta + \bar W^{(s)}\bar v^{(s)}, \tag{20} \]
computing the factorizations
\[ \bar Q_s^T\bar X^{(s)} = \begin{pmatrix} \bar X^{(s+1)} \\ 0 \end{pmatrix}\!\!\begin{matrix} \rho_{s+1} \\ M-k_{G-s} \end{matrix} \tag{21a} \]
and
\[ (\bar Q_s^T\bar W^{(s)}\bar Q_s)\bar P_s = \begin{pmatrix} \bar W^{(s+1)} & \tilde W^{(s)}_{AB} \\ 0 & \tilde W^{(s)}_{BB} \end{pmatrix}\!\!\begin{matrix} \rho_{s+1} \\ M-k_{G-s} \end{matrix}, \tag{21b} \]
where $\bar Q_s, \bar P_s \in \mathbb{R}^{\rho_s\times\rho_s}$ are orthogonal, $\rho_s = (G-s)M + \kappa_{G-s+1}$, $\kappa_i = \sum_{j=i}^G k_j$,
\[ \bar X^{(s+1)} = \begin{pmatrix} \oplus_{i=1}^{G-s-1} X_i & 0 \\ 0 & \oplus_{i=G-s}^{G} R_i \end{pmatrix}\!\!\begin{matrix} (G-s-1)M \\ \kappa_{G-s} \end{matrix}, \tag{22} \]
\[ \bar W^{(s+1)} = \begin{pmatrix} \bar W^{(s+1)}_{AA} & \bar W^{(s+1)}_{AB} \\ 0 & \bar W^{(s+1)}_{BB} \end{pmatrix}\!\!\begin{matrix} (G-s-1)M \\ \kappa_{G-s} \end{matrix}, \tag{23} \]
where $R_{G-s},\dots,R_G$ and $\bar W^{(s+1)}_{BB}$ are upper triangular, and $\bar W^{(s+1)}_{AA} = C_{1:G-s-1,1:G-s-1}\otimes I_M$. Furthermore,
\[ \bar Q_s = \begin{pmatrix} I_{(G-s-1)M} & 0 & 0 & 0 \\ 0 & \tilde Q_{G-s} & 0 & \hat Q_{G-s} \\ 0 & 0 & I_{\kappa_{G-s+1}} & 0 \end{pmatrix} \tag{24} \]
and
\[ \bar Q_s^T\bar W^{(s)}\bar Q_s = \begin{pmatrix} \bar W^{(s+1)}_{AA} & \hat W^{(s)}_{AB} & \hat W^{(s)}_{AC} & \hat W^{(s)}_{AD} \\ 0 & C_{G-s,G-s}I_{k_{G-s}} & \hat W^{(s)}_{BC} & 0 \\ 0 & 0 & \hat W^{(s)}_{CC} & 0 \\ 0 & 0 & \hat W^{(s)}_{DC} & C_{G-s,G-s}I_{M-k_{G-s}} \end{pmatrix}, \]
with block sizes $(G-s-1)M$, $k_{G-s}$, $\kappa_{G-s+1}$ and $M-k_{G-s}$, where $\hat W^{(s)}_{AC} = (I_{(G-s-1)M}\;\;0)\,\bar W^{(s)}_{AB}$, $\hat W^{(s)}_{CC} = \bar W^{(s)}_{BB}$, and $\bar W^{(0)}_{BB}$, and consequently $\hat W^{(0)}_{CC}$, has zero dimension.
Note that (21a) computes the QRD of $X_{G-s}$, while the RQD (21b) is equivalent to the URQD
\[ (\hat W^{(s)}_{DC}\;\;\;C_{G-s,G-s}I_{M-k_{G-s}})\,\tilde P_s = (0\;\;\;\tilde W^{(s)}_{BB}) \tag{25a} \]
and
\[ \begin{pmatrix} \hat W^{(s)}_{AC} & \hat W^{(s)}_{AD} \\ \hat W^{(s)}_{BC} & 0 \\ \hat W^{(s)}_{CC} & 0 \end{pmatrix}\tilde P_s = \begin{pmatrix} \tilde W^{(s)}_{AC} & \tilde W^{(s)}_{AD} \\ \tilde W^{(s)}_{BC} & \tilde W^{(s)}_{BD} \\ \tilde W^{(s)}_{CC} & \tilde W^{(s)}_{CD} \end{pmatrix}, \tag{25b} \]
where
\[ \bar P_s = \begin{pmatrix} I_{\rho_{s+1}-\kappa_{G-s+1}} & 0 \\ 0 & \tilde P_s \end{pmatrix}, \qquad \bar W^{(s+1)}_{BB} = \begin{pmatrix} C_{G-s,G-s}I_{k_{G-s}} & \tilde W^{(s)}_{BC} \\ 0 & \tilde W^{(s)}_{CC} \end{pmatrix}, \]
\[ \bar W^{(s+1)}_{AB} = (\hat W^{(s)}_{AB}\;\;\;\tilde W^{(s)}_{AC}) \qquad\text{and}\qquad \tilde W^{(s)}_{AB} = \begin{pmatrix} \tilde W^{(s)}_{AD} \\ \tilde W^{(s)}_{BD} \\ \tilde W^{(s)}_{CD} \end{pmatrix}. \]
Let
\[ \bar Q_s^T\bar y^{(s)} = \begin{pmatrix} \bar y^{(s)}_A \\ \bar y^{(s)}_B \end{pmatrix}\!\!\begin{matrix} \rho_{s+1} \\ M-k_{G-s} \end{matrix} \qquad\text{and}\qquad \bar P_s^T\bar Q_s^T\bar v^{(s)} = \begin{pmatrix} \bar v^{(s+1)} \\ \bar v^{(s)}_B \end{pmatrix}\!\!\begin{matrix} \rho_{s+1} \\ M-k_{G-s} \end{matrix}. \]
Premultiplying the constraints in (20) by $\bar Q_s^T$ and using (21), it follows that the GLLSP is equivalent to
\[ \operatorname*{argmin}_{\beta,\bar v^{(s+1)},\bar v^{(s)}_B}\;\|\bar v^{(s+1)}\|^2 + \|\bar v^{(s)}_B\|^2 \qquad\text{subject to}\qquad \begin{pmatrix} \bar y^{(s)}_A \\ \bar y^{(s)}_B \end{pmatrix} = \begin{pmatrix} \bar X^{(s+1)} \\ 0 \end{pmatrix}\beta + \begin{pmatrix} \bar W^{(s+1)} & \tilde W^{(s)}_{AB} \\ 0 & \tilde W^{(s)}_{BB} \end{pmatrix}\begin{pmatrix} \bar v^{(s+1)} \\ \bar v^{(s)}_B \end{pmatrix}, \]
or, again, the smaller GLLSP
\[ \operatorname*{argmin}_{\beta,\bar v^{(s+1)}}\;\|\bar v^{(s+1)}\|^2 \qquad\text{subject to}\qquad \bar y^{(s+1)} = \bar X^{(s+1)}\beta + \bar W^{(s+1)}\bar v^{(s+1)}, \tag{26} \]
where $\bar v^{(s)}_B = (\tilde W^{(s)}_{BB})^{-1}\bar y^{(s)}_B$ and $\bar y^{(s+1)} = \bar y^{(s)}_A - \tilde W^{(s)}_{AB}\bar v^{(s)}_B$.
The solution to (26) can be obtained iteratively by employing the method used for the GLLSP (20). At the end of iteration $(G-1)$ the GLLSP becomes
\[ \operatorname*{argmin}_{\beta,\bar v^{(G)}}\;\|\bar v^{(G)}\|^2 \qquad\text{subject to}\qquad \bar y^{(G)} = (\oplus_i R_i)\beta + \bar W^{(G)}\bar v^{(G)}, \]
which has solution $\bar v^{(G)} = 0$ and $\beta = (\oplus_i R_i^{-1})\bar y^{(G)}$.
Fig. 3. Computation of (25) at step $s = 4$.

Fig. 4. Computation of (21b), where $G = 5$ and $s = 2$.
Now consider the computation of (25a). Let
\[ (\hat W^{(s)}_{DC}\;\;\;C_{G-s,G-s}I_{M-k_{G-s}}) = (A_1\;\;\cdots\;\;A_s\;\;\;A^{(0)}_{s+1}), \tag{27} \]
where $A_i$ has $k_{G-s+i}$ columns $(i = 1,\dots,s)$ and $A^{(0)}_{s+1}$ has $M-k_{G-s}$ columns. The submatrices $A_1,\dots,A_s$ are annihilated one at a time by computing the URQDs
\[ (A_i\;\;\;A^{(i-1)}_{s+1})P_i = (0\;\;\;A^{(i)}_{s+1}), \qquad i = 1,\dots,s, \tag{28} \]
where $A^{(i)}_{s+1}$ is upper triangular and $P_i$ is orthogonal. Thus, in (25a) $\tilde W^{(s)}_{BB} = A^{(s)}_{s+1}$. This produces $\tilde W^{(s)}_{CC}$ with a block upper-triangular structure. Fig. 3 shows the steps for annihilating $\hat W^{(s)}_{DC}$ and the fill-ins induced in $\tilde W^{(s)}_{CC}$ and $\tilde W^{(s)}_{CD}$ in (25), where $s = 4$. Algorithm 3 summarizes the steps of the interleaving procedure for solving the GLLSP (5). Fig. 4 illustrates the computations of the second step ($s = 2$), where $G = 5$.
Algorithm 3. Solution of the GLLSP (5) using the interleaving approach.
1: Compute $\Sigma = CC^T$
2: Let $\bar X^{(0)} = \oplus_{i=1}^G X_i$, $\bar W^{(0)} = C\otimes I_M$ and $\bar y^{(0)} = \operatorname{vec}(Y)$
3: for $s = 0, 1, \dots, G-1$ do
4:   Compute the QRD (9), where $i = G-s$, and let $\bar Q_s$ be given by (24)
5:   Compute $(\bar y^{(s)T}_A\;\;\bar y^{(s)T}_B)^T = \bar Q_s^T\bar y^{(s)}$
6:   Compute $\bar Q_s^T\bar W^{(s)}\bar Q_s$
7:   Compute the URQDs (25a) and (25b)
8:   Solve the triangular system $\tilde W^{(s)}_{BB}\bar v^{(s)}_B = \bar y^{(s)}_B$ for $\bar v^{(s)}_B$
9:   Compute $\bar y^{(s+1)} = \bar y^{(s)}_A - \tilde W^{(s)}_{AB}\bar v^{(s)}_B$
10: end for
11: Solve the triangular system $(\oplus_i R_i)\beta = \bar y^{(G)}$ for $\beta$

3. A recursive algorithm for the estimation of the SUR model
The BLUE of the SUR model can be computed recursively (Bolstad, 1987; Kontoghiorghes, 2003). Consider the partitioning
\[ X_i = \begin{pmatrix} X_i^{(1)} \\ \vdots \\ X_i^{(p)} \end{pmatrix}\!\!\begin{matrix} M_1 \\ \vdots \\ M_p \end{matrix}, \qquad Y = \begin{pmatrix} Y^{(1)} \\ \vdots \\ Y^{(p)} \end{pmatrix}\!\!\begin{matrix} M_1 \\ \vdots \\ M_p \end{matrix}, \qquad U = \begin{pmatrix} U^{(1)} \\ \vdots \\ U^{(p)} \end{pmatrix}\!\!\begin{matrix} M_1 \\ \vdots \\ M_p \end{matrix} \qquad\text{and}\qquad V = \begin{pmatrix} V^{(1)} \\ \vdots \\ V^{(p)} \end{pmatrix}\!\!\begin{matrix} M_1 \\ \vdots \\ M_p \end{matrix} \tag{29} \]
for $i = 1,\dots,G$. The SUR model (1) and the GLLSP (5) can be respectively expressed equivalently as
\[ \begin{pmatrix} \operatorname{vec}(Y^{(1)}) \\ \operatorname{vec}(Y^{(2)}) \\ \vdots \\ \operatorname{vec}(Y^{(p)}) \end{pmatrix} = \begin{pmatrix} \oplus_i X_i^{(1)} \\ \oplus_i X_i^{(2)} \\ \vdots \\ \oplus_i X_i^{(p)} \end{pmatrix}\beta + \begin{pmatrix} \operatorname{vec}(U^{(1)}) \\ \operatorname{vec}(U^{(2)}) \\ \vdots \\ \operatorname{vec}(U^{(p)}) \end{pmatrix} \]
and
\[ \operatorname*{argmin}_{\beta,V^{(1)},\dots,V^{(p)}}\;\sum_{j=1}^p\|V^{(j)}\|_F^2 \qquad\text{subject to}\qquad \begin{pmatrix} \operatorname{vec}(Y^{(1)}) \\ \operatorname{vec}(Y^{(2)}) \\ \vdots \\ \operatorname{vec}(Y^{(p)}) \end{pmatrix} = \begin{pmatrix} \oplus_i X_i^{(1)} \\ \oplus_i X_i^{(2)} \\ \vdots \\ \oplus_i X_i^{(p)} \end{pmatrix}\beta + \begin{pmatrix} C\otimes I_{M_1} & & & \\ & C\otimes I_{M_2} & & \\ & & \ddots & \\ & & & C\otimes I_{M_p} \end{pmatrix}\begin{pmatrix} \operatorname{vec}(V^{(1)}) \\ \operatorname{vec}(V^{(2)}) \\ \vdots \\ \operatorname{vec}(V^{(p)}) \end{pmatrix}. \tag{30} \]
Assume that $M_1 \ge \max(k_1,\dots,k_G)$, and let the GQRD of $\oplus_i X_i^{(1)}$ and $C\otimes I_{M_1}$ be given by
\[ Q_{(1)}^T\bigl(\oplus_i X_i^{(1)}\bigr) = \begin{pmatrix} \oplus_i R_i^{(1)} \\ 0 \end{pmatrix}\!\!\begin{matrix} K \\ GM_1-K \end{matrix} \tag{31a} \]
and
\[ Q_{(1)}^T(C\otimes I_{M_1})P_{(1)} = W^{(1)} \equiv \begin{pmatrix} W^{(1)}_{AA} & W^{(1)}_{AB} \\ 0 & W^{(1)}_{BB} \end{pmatrix}\!\!\begin{matrix} K \\ GM_1-K \end{matrix}, \tag{31b} \]
where $R_i^{(1)}$ and $W^{(1)}$ are upper triangular. Furthermore, let
\[ Q_{(1)}^T\operatorname{vec}(Y^{(1)}) = \begin{pmatrix} \tilde y^{(1)} \\ \hat y^{(1)} \end{pmatrix}\!\!\begin{matrix} K \\ GM_1-K \end{matrix} \qquad\text{and}\qquad P_{(1)}^T\operatorname{vec}(V^{(1)}) = \begin{pmatrix} \tilde v^{(1)} \\ \hat v^{(1)} \end{pmatrix}\!\!\begin{matrix} K \\ GM_1-K \end{matrix}. \tag{32} \]
Using (31) and (32) it follows that the GLLSP (30) can be written as
\[ \operatorname*{argmin}_{\beta,\tilde v^{(1)},\hat v^{(1)},V^{(2)},\dots,V^{(p)}}\;\|\tilde v^{(1)}\|^2 + \|\hat v^{(1)}\|^2 + \sum_{j=2}^p\|V^{(j)}\|_F^2 \qquad\text{subject to} \]
\[ \begin{pmatrix} \tilde y^{(1)} \\ \operatorname{vec}(Y^{(2)}) \\ \vdots \\ \operatorname{vec}(Y^{(p)}) \\ \hat y^{(1)} \end{pmatrix} = \begin{pmatrix} \oplus_i R_i^{(1)} \\ \oplus_i X_i^{(2)} \\ \vdots \\ \oplus_i X_i^{(p)} \\ 0 \end{pmatrix}\beta + \begin{pmatrix} W^{(1)}_{AA} & 0 & \cdots & 0 & W^{(1)}_{AB} \\ 0 & C\otimes I_{M_2} & \cdots & 0 & 0 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & \cdots & C\otimes I_{M_p} & 0 \\ 0 & 0 & \cdots & 0 & W^{(1)}_{BB} \end{pmatrix}\begin{pmatrix} \tilde v^{(1)} \\ \operatorname{vec}(V^{(2)}) \\ \vdots \\ \operatorname{vec}(V^{(p)}) \\ \hat v^{(1)} \end{pmatrix}. \tag{33} \]
This is equivalent to
\[ \operatorname*{argmin}_{\beta,\tilde v^{(1)},V^{(2)},\dots,V^{(p)}}\;\|\tilde v^{(1)}\|^2 + \sum_{j=2}^p\|V^{(j)}\|_F^2 \qquad\text{subject to} \]
\[ \begin{pmatrix} y^{(1)} \\ \operatorname{vec}(Y^{(2)}) \\ \vdots \\ \operatorname{vec}(Y^{(p)}) \end{pmatrix} = \begin{pmatrix} \oplus_i R_i^{(1)} \\ \oplus_i X_i^{(2)} \\ \vdots \\ \oplus_i X_i^{(p)} \end{pmatrix}\beta + \begin{pmatrix} W^{(1)}_{AA} & 0 & \cdots & 0 \\ 0 & C\otimes I_{M_2} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & C\otimes I_{M_p} \end{pmatrix}\begin{pmatrix} \tilde v^{(1)} \\ \operatorname{vec}(V^{(2)}) \\ \vdots \\ \operatorname{vec}(V^{(p)}) \end{pmatrix}, \tag{34} \]
where $\hat v^{(1)} = (W^{(1)}_{BB})^{-1}\hat y^{(1)}$ and $y^{(1)} = \tilde y^{(1)} - W^{(1)}_{AB}\hat v^{(1)}$.
The solution to the GLLSP (34) can be obtained in $(p-1)$ iterations. The $s$th $(s = 2,\dots,p)$ iteration solves the GLLSP
\[ \operatorname*{argmin}_{\beta,\tilde v^{(s-1)},V^{(s)},\dots,V^{(p)}}\;\|\tilde v^{(s-1)}\|^2 + \sum_{j=s}^p\|V^{(j)}\|_F^2 \qquad\text{subject to} \]
\[ \begin{pmatrix} y^{(s-1)} \\ \operatorname{vec}(Y^{(s)}) \\ \vdots \\ \operatorname{vec}(Y^{(p)}) \end{pmatrix} = \begin{pmatrix} \oplus_i R_i^{(s-1)} \\ \oplus_i X_i^{(s)} \\ \vdots \\ \oplus_i X_i^{(p)} \end{pmatrix}\beta + \begin{pmatrix} W^{(s-1)}_{AA} & 0 & \cdots & 0 \\ 0 & C\otimes I_{M_s} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & C\otimes I_{M_p} \end{pmatrix}\begin{pmatrix} \tilde v^{(s-1)} \\ \operatorname{vec}(V^{(s)}) \\ \vdots \\ \operatorname{vec}(V^{(p)}) \end{pmatrix}, \tag{35} \]
where $y^{(s-1)}, \tilde v^{(s-1)} \in \mathbb{R}^K$, and $W^{(s-1)}_{AA} \in \mathbb{R}^{K\times K}$ and $R_i^{(s-1)} \in \mathbb{R}^{k_i\times k_i}$ are upper triangular.
For the solution of (35) consider the updating GQRD (UGQRD)
\[ Q_{(s)}^T\begin{pmatrix} \oplus_i R_i^{(s-1)} \\ \oplus_i X_i^{(s)} \end{pmatrix} = \begin{pmatrix} \oplus_i R_i^{(s)} \\ 0 \end{pmatrix}\!\!\begin{matrix} K \\ GM_s \end{matrix} \tag{36a} \]
and
\[ Q_{(s)}^T\begin{pmatrix} W^{(s-1)}_{AA} & 0 \\ 0 & C\otimes I_{M_s} \end{pmatrix}P_{(s)} = W^{(s)} \equiv \begin{pmatrix} W^{(s)}_{AA} & W^{(s)}_{AB} \\ 0 & W^{(s)}_{BB} \end{pmatrix}\!\!\begin{matrix} K \\ GM_s \end{matrix}, \tag{36b} \]
where $R_i^{(s)}$ and $W^{(s)}$ are upper triangular and, $Q_{(s)}$ and $P_{(s)}$ are orthogonal. Let
\[ Q_{(s)}^T\begin{pmatrix} y^{(s-1)} \\ \operatorname{vec}(Y^{(s)}) \end{pmatrix} = \begin{pmatrix} \tilde y^{(s)} \\ \operatorname{vec}(\hat Y^{(s)}) \end{pmatrix}\!\!\begin{matrix} K \\ GM_s \end{matrix} \tag{37a} \]
and
\[ P_{(s)}^T\begin{pmatrix} \tilde v^{(s-1)} \\ \operatorname{vec}(V^{(s)}) \end{pmatrix} = \begin{pmatrix} \tilde v^{(s)} \\ \operatorname{vec}(\hat V^{(s)}) \end{pmatrix}\!\!\begin{matrix} K \\ GM_s \end{matrix}. \tag{37b} \]
Strategies for computing the UGQRD (36) have been discussed in the context of
updating the SUR model (Kontoghiorghes, 2003).
Using (36) and (37), the GLLSP (35) becomes the smaller GLLSP
\[ \operatorname*{argmin}_{\beta,\tilde v^{(s)},V^{(s+1)},\dots,V^{(p)}}\;\|\tilde v^{(s)}\|^2 + \sum_{j=s+1}^p\|V^{(j)}\|_F^2 \qquad\text{subject to} \]
\[ \begin{pmatrix} y^{(s)} \\ \operatorname{vec}(Y^{(s+1)}) \\ \vdots \\ \operatorname{vec}(Y^{(p)}) \end{pmatrix} = \begin{pmatrix} \oplus_i R_i^{(s)} \\ \oplus_i X_i^{(s+1)} \\ \vdots \\ \oplus_i X_i^{(p)} \end{pmatrix}\beta + \begin{pmatrix} W^{(s)}_{AA} & 0 & \cdots & 0 \\ 0 & C\otimes I_{M_{s+1}} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & C\otimes I_{M_p} \end{pmatrix}\begin{pmatrix} \tilde v^{(s)} \\ \operatorname{vec}(V^{(s+1)}) \\ \vdots \\ \operatorname{vec}(V^{(p)}) \end{pmatrix}, \tag{38} \]
where
\[ W^{(s)}_{BB}\operatorname{vec}(\hat V^{(s)}) = \operatorname{vec}(\hat Y^{(s)}) \tag{39} \]
and
\[ y^{(s)} = \tilde y^{(s)} - W^{(s)}_{AB}\operatorname{vec}(\hat V^{(s)}). \tag{40} \]
At the last iteration, when $s = p$, the GLLSP reduces to
\[ \operatorname*{argmin}_{\beta,\tilde v^{(p)}}\;\|\tilde v^{(p)}\|^2 \qquad\text{subject to}\qquad y^{(p)} = \bigl(\oplus_i R_i^{(p)}\bigr)\beta + W^{(p)}_{AA}\tilde v^{(p)}, \]
which has solution $\tilde v^{(p)} = 0$ and $(\oplus_i R_i^{(p)})\beta = y^{(p)}$. Algorithm 4 summarizes the steps of this recursive estimation procedure for computing the BLUE of the SUR model. Note that at the $s$th iteration, the matrix retriangularized in (36b) is of order $(K + GM_s)$. This results in less computational cost than does the RQD of the $GM\times GM$ matrix in (14). Algorithm 4 also requires less memory to store the smaller-dimensioned matrices involved in the factorizations.
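A deliberately simplified sketch of this row-block recursion: here each observation block is first whitened with the inverse covariance factor (as in the OLM route) and the triangular factor is updated by a QR of the stacked matrix, rather than by the UGQRD (36). It is not the paper's algorithm, but it shows why only a matrix with K + GM_s rows needs to be factorized per step; all names and block sizes are our own choices:

```python
import numpy as np
from scipy.linalg import block_diag, cholesky, solve_triangular

rng = np.random.default_rng(4)
G, k, blocks = 3, 2, [8, 5, 5]            # M_1, ..., M_p observation blocks
K = G * k
Sigma = np.cov(rng.standard_normal((G, 60)))
L = cholesky(Sigma, lower=True)           # whitening factor (paper uses upper C)
M = sum(blocks)
X = [rng.standard_normal((M, k)) for _ in range(G)]
Y = rng.standard_normal((M, G))

R_acc = np.zeros((0, K))                  # current triangular factor (starts empty)
z_acc = np.zeros(0)
row = 0
for Mj in blocks:
    # whiten the new GM_j rows and append them under the current factor
    Xj = block_diag(*[Xi[row:row + Mj] for Xi in X])
    yj = Y[row:row + Mj].T.reshape(-1)    # vec of the block, column-stacked
    Tw = np.kron(np.linalg.inv(L), np.eye(Mj))
    Qj, R_acc = np.linalg.qr(np.vstack([R_acc, Tw @ Xj]))
    z_acc = Qj.T @ np.concatenate([z_acc, Tw @ yj])
    row += Mj

beta_rec = solve_triangular(R_acc, z_acc, lower=False)

# Reference: one-shot whitened least squares on all M observations
Tw = np.kron(np.linalg.inv(L), np.eye(M))
beta_ref = np.linalg.lstsq(Tw @ block_diag(*X), Tw @ Y.T.reshape(-1),
                           rcond=None)[0]
ok = np.allclose(beta_rec, beta_ref)
```

Each pass re-triangularizes only K + GM_j rows, mirroring the memory and cost argument made for Algorithm 4.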
Algorithm 4. Solution of the GLLSP (5) using the recursive algorithm.
1: Compute $\Sigma = CC^T$
2: Compute the GQRD (31) and the vectors $\tilde y^{(1)}$ and $\hat y^{(1)}$ from (32)
3: Solve the triangular system $W^{(1)}_{BB}\hat v^{(1)} = \hat y^{(1)}$
4: Compute $y^{(1)} = \tilde y^{(1)} - W^{(1)}_{AB}\hat v^{(1)}$
5: for $s = 2,\dots,p$ do
6:   Compute the UGQRD (36) and (37a)
7:   Solve the triangular system (39) for $\operatorname{vec}(\hat V^{(s)})$
8:   Compute (40)
9: end for
10: Solve the triangular system $(\oplus_i R_i^{(p)})\beta = y^{(p)}$ for $\beta$

4. Size reduction of large scale SUR models
When $M \gg k$, the SUR model can be transformed to one of smaller dimension (Foschi et al., 2002; Kontoghiorghes, 2000a, b). Solving the transformed model results in a computationally efficient algorithm. Let $X^* = (X_1\;\cdots\;X_G) \in \mathbb{R}^{M\times K}$, $K = \sum_{i=1}^G k_i$, and $M > K$. Consider the QRD
\[ \begin{pmatrix} Q_R^{*T} \\ Q_N^{*T} \end{pmatrix}X^* = \begin{pmatrix} R^* \\ 0 \end{pmatrix}, \tag{41} \]
where $R^* = (R_1^*\;\cdots\;R_G^*) \in \mathbb{R}^{K\times K}$, $R_i^* \in \mathbb{R}^{K\times k_i}$, and the matrix $(Q_R^*\;\;Q_N^*) \in \mathbb{R}^{M\times M}$ is orthogonal. Now, premultiplying the SUR model (1) by $(I_G\otimes Q_R^*\;\;I_G\otimes Q_N^*)^T$ results in the transformed SUR (TSUR) model
\[ \begin{pmatrix} \operatorname{vec}(Y_R^*) \\ \operatorname{vec}(Y_N^*) \end{pmatrix} = \begin{pmatrix} \oplus_i R_i^* \\ 0 \end{pmatrix}\beta + \begin{pmatrix} \operatorname{vec}(U_R^*) \\ \operatorname{vec}(U_N^*) \end{pmatrix}, \tag{42} \]
where $Y_R^* = Q_R^{*T}Y$, $Y_N^* = Q_N^{*T}Y$, $U_R^* = Q_R^{*T}U$, and $U_N^* = Q_N^{*T}U$. Furthermore,
\[ \begin{pmatrix} \operatorname{vec}(U_R^*) \\ \operatorname{vec}(U_N^*) \end{pmatrix} \sim \left(0,\;\begin{pmatrix} \Sigma\otimes I_K & 0 \\ 0 & \Sigma\otimes I_{M-K} \end{pmatrix}\right), \tag{43} \]
and thus the SUR model (1) is equivalent to the smaller TSUR model
\[ \operatorname{vec}(Y_R^*) = (\oplus_i R_i^*)\beta + \operatorname{vec}(U_R^*). \tag{44} \]
Note that
\[ R_i^* = \begin{pmatrix} R_{1,i}^* \\ \vdots \\ R_{i,i}^* \\ 0 \end{pmatrix}\!\!\begin{matrix} k_1 \\ \vdots \\ k_i \\ \kappa_{i+1} \end{matrix}, \qquad i = 1,\dots,G, \]
where $\kappa_i = \sum_{j=i}^G k_j$. However, the direct implementation of Algorithms 1-4 to solve this reduced model does not exploit the special structure of the (transformed) exogenous matrices $R_1^*,\dots,R_G^*$.
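The equivalence of the SUR and TSUR models can be verified numerically: with Q_R^* from the QRD of X^* = (X_1 ... X_G), GLS on the K-observation transformed model reproduces the GLS estimator of the original M-observation model. A sketch under our own naming:

```python
import numpy as np
from scipy.linalg import block_diag, solve

rng = np.random.default_rng(5)
G, M, k = 3, 40, 2
K = G * k                                  # M > K: size reduction pays off
X = [rng.standard_normal((M, k)) for _ in range(G)]
Y = rng.standard_normal((M, G))
Sigma = np.cov(rng.standard_normal((G, 60)))

def gls(Xd, yv, Sig, nobs):
    # GLS estimator from the normal equations with covariance Sig x I_nobs
    Oi = np.kron(np.linalg.inv(Sig), np.eye(nobs))
    return solve(Xd.T @ Oi @ Xd, Xd.T @ Oi @ yv)

# Original model: G equations, M observations each
b_full = gls(block_diag(*X), Y.T.reshape(-1), Sigma, M)

# TSUR model (44): premultiply by Q_R^{*T} from the QRD of X^*
QR_, _ = np.linalg.qr(np.hstack(X))        # reduced QR: Q_R^* is M x K
Rst = [QR_.T @ Xi for Xi in X]             # R_i^* = Q_R^{*T} X_i, each K x k
Yst = QR_.T @ Y                            # Y_R^* = Q_R^{*T} Y
b_small = gls(block_diag(*Rst), Yst.T.reshape(-1), Sigma, K)

ok = np.allclose(b_full, b_small)
```

The estimators coincide because the columns of each X_i lie in the range of Q_R^*, so both the cross-product blocks and the right-hand sides of the normal equations are unchanged by the transformation.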
Let (41) be replaced with the QRD of $\tilde X = (X_G\;\cdots\;X_1)$ and partition
\[ Q_R^{*T}X_i \equiv \tilde R_i = \begin{pmatrix} \tilde R_{1,i} \\ \vdots \\ \tilde R_{G-i+1,i} \\ 0 \end{pmatrix}\!\!\begin{matrix} k_1 \\ \vdots \\ k_{G-i+1} \\ \kappa_{G-i+2} \end{matrix}, \qquad Y_R^* = \begin{pmatrix} \tilde Y_1 \\ \tilde Y_2 \\ \vdots \\ \tilde Y_G \end{pmatrix}\!\!\begin{matrix} k_1 \\ k_2 \\ \vdots \\ k_G \end{matrix} \qquad\text{and}\qquad U_R^* = \begin{pmatrix} \tilde U_1 \\ \tilde U_2 \\ \vdots \\ \tilde U_G \end{pmatrix}\!\!\begin{matrix} k_1 \\ k_2 \\ \vdots \\ k_G \end{matrix}. \tag{45} \]
Also let $\tilde V_iC^T = \tilde U_i$, and let $\tilde Y_i$, $\tilde V_i$ and $C$ $(i = 1,\dots,G)$ be partitioned, respectively, as
\[ \tilde Y_i = (\tilde Y_{iA}\;\;\tilde Y_{iB}), \qquad \tilde V_i = (\tilde V_{iA}\;\;\tilde V_{iB}) \qquad\text{and}\qquad C = \begin{pmatrix} C^{(i)}_{AA} & C^{(i)}_{AB} \\ 0 & C^{(i)}_{BB} \end{pmatrix}\!\!\begin{matrix} G-i+1 \\ i-1 \end{matrix}, \]
where $\tilde Y_{iA}$ and $\tilde V_{iA}$ have $G-i+1$ columns, and $\tilde Y_{iB}$ and $\tilde V_{iB}$ have $i-1$ columns.
Then, the GLLSP formulation of the TSUR model (44) can be written as
\[ \operatorname*{argmin}_{\beta,\tilde V_j}\;\sum_{j=1}^G\|\tilde V_j\|_F^2 \qquad\text{subject to}\qquad \operatorname{vec}(\tilde Y_i) = (\oplus_j\tilde R_{j,i})\beta + \operatorname{vec}(\tilde V_iC^T), \quad i = 1,\dots,G, \]
or, equivalently, as
\[ \operatorname*{argmin}_{\beta,\tilde V_{jA},\tilde V_{jB}}\;\sum_{j=1}^G\bigl(\|\tilde V_{jA}\|_F^2 + \|\tilde V_{jB}\|_F^2\bigr) \qquad\text{subject to} \tag{46a} \]
\[ \tilde Y_{iA} = (\tilde R_{i,1}\beta_1\;\cdots\;\tilde R_{i,G-i+1}\beta_{G-i+1}) + \tilde V_{iA}(C^{(i)}_{AA})^T + \tilde V_{iB}(C^{(i)}_{AB})^T, \quad i = 1,\dots,G, \tag{46b} \]
\[ \tilde Y_{iB} = \tilde V_{iB}(C^{(i)}_{BB})^T, \quad i = 1,\dots,G. \tag{46c} \]
From (46c), it follows that $\tilde V_{iB} = \tilde Y_{iB}(C^{(i)}_{BB})^{-T}$, and thus the GLLSP (46) can be written as
\[ \operatorname*{argmin}_{\beta,\tilde V_{jA}}\;\sum_{j=1}^G\|\tilde V_{jA}\|_F^2 \qquad\text{subject to}\qquad \bar Y_{iA} = (\tilde R_{i,1}\beta_1\;\cdots\;\tilde R_{i,G-i+1}\beta_{G-i+1}) + \tilde V_{iA}(C^{(i)}_{AA})^T, \quad i = 1,\dots,G, \tag{47} \]
Fig. 5. The structure of the transformed exogenous matrix $(\bar R_1^T\;\cdots\;\bar R_G^T)^T$ and Cholesky factor $\oplus_i(C^{(i)}_{AA}\otimes I_{k_i})$ of the GLLSP (48), where $G = 4$.

where $\bar Y_{iA} = \tilde Y_{iA} - \tilde V_{iB}(C^{(i)}_{AB})^T$. This is equivalent to
\[ \operatorname*{argmin}_{\beta,\tilde V_{iA}}\;\sum_{i=1}^G\|\tilde V_{iA}\|_F^2 \qquad\text{subject to} \]
\[ \begin{pmatrix} \operatorname{vec}(\bar Y_{1A}) \\ \operatorname{vec}(\bar Y_{2A}) \\ \vdots \\ \operatorname{vec}(\bar Y_{GA}) \end{pmatrix} = \begin{pmatrix} \bar R_1 \\ \bar R_2 \\ \vdots \\ \bar R_G \end{pmatrix}\beta + \begin{pmatrix} C^{(1)}_{AA}\otimes I_{k_1} & & & \\ & C^{(2)}_{AA}\otimes I_{k_2} & & \\ & & \ddots & \\ & & & C^{(G)}_{AA}\otimes I_{k_G} \end{pmatrix}\begin{pmatrix} \operatorname{vec}(\tilde V_{1A}) \\ \operatorname{vec}(\tilde V_{2A}) \\ \vdots \\ \operatorname{vec}(\tilde V_{GA}) \end{pmatrix}, \tag{48} \]
where
\[ \bar R_i = \bigl(\oplus_{j=1}^{G-i+1}\tilde R_{i,j}\;\;\;0\bigr), \tag{49} \]
the direct-sum block having $K_{G-i+1}$ columns and the zero block $\kappa_{G-i+2}$ columns, with $K_i = \sum_{j=1}^i k_j$. Notice that the first block of the constraints
\[ \operatorname{vec}(\bar Y_{1A}) = \bar R_1\beta + (C^{(1)}_{AA}\otimes I_{k_1})\operatorname{vec}(\tilde V_{1A}) \]
is analogous to the constraint of the GLLSP (5).
is analogous to the constraint of the GLLSP (5).
The GLLSP (48) corresponds to a GLLSP formulation of a SUR model with unequal
numbers of observations (Foschi and Kontoghiorghes, 2002). Fig. 5 shows the structure
(i)
of (RR T1 · · · RR TG )T and ⊕i (CAA
⊗Iki ), where G =4. The recursive algorithm in Foschi and
Kontoghiorghes (2002) that solves the unequal-numbers-of-observations problem is
similar to Algorithm 4 and can therefore be employed to compute the solution of
(48).
Table 1
Execution times (in seconds) of solving the SUR model, where k1 = ... = kG = 5 (columns 3-4: OLM algorithms; columns 5-8: GLLSP algorithms)

M     G     LAPACK     Alg. 1      LAPACK       Alg. 2      Alg. 3      Alg. 4      Ratio GLLSP/OLM
51    5     0.0007     0.0012        0.0968      0.0159      0.0116      0.0133      16.57
51    10    0.0070     0.0070        0.4430      0.0994      0.0776      0.0547       7.81
51    15    0.0389     0.0180        1.4195      0.2993      0.2412      0.1404       7.80
51    20    0.1274     0.0382        3.4158      0.7061      0.6279      0.2801       7.33
51    30    0.3471     0.1111       11.4602      2.4468      2.2541      0.7788       7.01
100   5     0.0014     0.0022        0.5553      0.0507      0.0409      0.0270      19.28
100   10    0.0213     0.0119        2.8841      0.3215      0.2906      0.1091       9.17
100   15    0.1135     0.0326       10.4153      1.0539      0.9871      0.2853       8.75
100   20    0.3039     0.0687       25.5424      2.6647      2.4102      0.5697       8.29
100   30    0.7402     0.2122       81.9765      9.4783      8.9455      1.5567       7.34
400   5     0.0162     0.0093       25.6388      1.2633      0.7686      0.1092      11.74
400   10    0.1675     0.0498      183.3260      8.2393      6.0076      0.4480       9.00
400   15    0.5750     0.1598      586.3929     28.0438     22.2265      1.1694       7.32
400   20    1.3203     0.4207      n/a           n/a         n/a         2.4639       5.86
400   30    n/a        1.5331      n/a           n/a         n/a         6.6609       4.34
5. Computational comparison

The algorithms are implemented in double precision on a PC with a single 1.7 GHz Intel Pentium IV processor and 512 MB of RAM. The matrix factorizations have been computed using LAPACK subroutines (Anderson et al., 1992). The diagonally-based method (see Appendix A) has been used to compute the factorizations (13), (18) and (36b) (Foschi et al., 2002). Furthermore, in the case of the recursive Algorithm 4, the block sizes used are M1 = max(k1, ..., kG) and M2 = ... = Mp = 10. This was found experimentally to be the best choice for the specific architecture.
Tables 1 and 2 show the execution times (in seconds) of the algorithms. Three classes of models (M = 51, M = 100 and M = 400) are reported, where each regression equation is assumed to have the same number of variables and no common regressors. The elements of the exogenous matrices, the response vectors, and the Cholesky factor of the covariance matrix are generated randomly from a uniform distribution. Notice that the computational complexity of the factorization procedures does not depend on the specific values of the exogenous and covariance matrices. Thus, the performance of the algorithms is the same for matrices that have been generated using different statistical assumptions. Table 1 shows the performance of the algorithms when the number of equations changes (G = 5, 10, 15, 20, 30), while the number of variables in each equation remains fixed at 5. Table 2 shows the execution times when G = 10 is held constant and the number of variables in each regression is k (k = 5, 10, 15, 20, 30); that is, k1 = ... = kG = k.
Table 2
Execution times (in seconds) of solving the SUR model, where G = 10 and ki = 5, 10, 15, 20, 30 for i = 1, ..., G (columns 3-4: OLM algorithms; columns 5-8: GLLSP algorithms)

M     ki    LAPACK     Alg. 1      LAPACK       Alg. 2      Alg. 3      Alg. 4      Ratio GLLSP/OLM
51    5     0.0060     0.0069        0.4446      0.0989      0.0756      0.0556       9.26
51    10    0.0450     0.0187        0.5103      0.1383      0.1013      0.0868       4.64
51    15    0.0853     0.0334        0.5746      0.1688      0.1239      0.1327       3.71
51    20    0.0942     0.0507        0.5736      0.1887      0.1580      0.1773       3.50
51    30    0.1450     0.0898        0.7835      0.1990      0.1803      0.2610       2.91
100   5     0.0227     0.0121        2.9766      0.3384      0.2963      0.1122       9.27
100   10    0.1208     0.0327        3.1151      0.4663      0.4511      0.1897       5.80
100   15    0.1951     0.0726        3.4684      0.6214      0.5865      0.3085       4.25
100   20    0.1979     0.0989        3.6348      0.7296      0.7224      0.4416       4.46
100   30    0.3181     0.2138        4.5518      1.0656      1.0375      0.8022       3.75
400   5     0.1812     0.0499      185.3506      8.3259      6.0854      0.4527       9.07
400   10    0.6433     0.1731      191.6706     13.1758      9.2547      0.7791       4.50
400   15    1.0295     0.3869      198.3687     18.3041     13.0657      1.2969       3.35
400   20    1.1602     0.7879      204.5498     23.9272     17.4742      2.0352       2.71
400   30    1.8623     1.2734      200.3455     34.8307     25.7010      3.8604       2.07
The execution times for solving the OLM (6) using the LAPACK routine DGELS and Algorithm 1 are shown in columns 3 and 4, respectively. Columns 5-8 give the results for the GLLSP approach. Specifically, the 5th column refers to the LAPACK routine DGGGLM, which solves the GLLSP (5) without exploiting the sparse structure of the matrices. Columns 6-8 show the execution times for Algorithms 2-4, respectively. The best times for solving the OLM (6) and the GLLSP (5) are used to calculate the performance ratio in the last column. Computational results for the LAPACK routine DGGGLM and Algorithms 2-3 are not available (n/a) for the largest problems because the algorithms run out of memory. Analogous results for the estimation of the TSUR model (44) are given in Tables 3 and 4, where the execution times include the initial step of transforming the SUR model (1) to the TSUR model (44). The cost of this step is negligible compared to the overall execution time.
Table 5 shows the theoretical complexities in terms of floating point operations (flops) of Algorithms 1-4 and that of solving the TSUR model (44), where k = k1 = ... = kG. A detailed derivation of these complexities can be found in Appendix B. The second column reports the approximate complexity of each algorithm for large values of M, G, and k. The third column shows the same complexities for large-scale models, i.e., M >> k. Finally, the last column gives the number of flops required by each algorithm to solve the reduced-sized model (44). Computations that have a small marginal cost compared to the overall complexities have not been taken into account. Furthermore, the transformation (41) that derives the TSUR model has complexity 2G^2 k^2 (M - Gk/3) and has not been included.
P. Foschi et al. / Computational Statistics & Data Analysis 44 (2003) 3 – 35
Table 3
Execution times of solving the TSUR model (44), where k1 = · · · = kG = 5

M      G     OLM algorithms          GLLSP algorithms                                    Ratio
             LAPACK      Alg. 1      LAPACK       Alg. 2      Alg. 3      Alg. 4        GLLSP/OLM
100     5    0.0007      0.0012       0.0055       0.0049      0.0034      0.0050        4.86
100    10    0.0068      0.0079       0.3980       0.0834      0.0608      0.0463        6.81
100    15    0.0665      0.0277       4.1706       0.5426      0.5045      0.1887        6.81
100    20    0.2908      0.0729      25.1177       2.3531      2.1953      0.5194        7.21
400     5    0.0015      0.0019       0.0059       0.0056      0.0040      0.0059        3.93
400    10    0.0088      0.0110       0.3973       0.0866      0.0637      0.0496        5.64
400    15    0.0746      0.0396       4.3138       0.5676      0.5066      0.1987        5.02
400    20    0.3059      0.1010      26.2893       2.5442      2.3426      0.5566        5.51
400    30    1.1623      0.4447      79.9213      23.6311     21.8987      2.3073        5.19
Table 4
Execution times of solving the TSUR model (44), where G = 10 and ki = 5, 10, 15, 20, 30 for i = 1, ..., G

M      ki    OLM algorithms          GLLSP algorithms                                    Ratio
             LAPACK      Alg. 1      LAPACK       Alg. 2      Alg. 3      Alg. 4        GLLSP/OLM
100     5    0.0052      0.0081       0.4015       0.0881      0.0601      0.0477        9.17
100    10    0.1088      0.0366       3.0925       0.4421      0.3805      0.1672        4.57
400     5    0.0093      0.0111       0.4249       0.0873      0.0643      0.0491        5.28
400    10    0.1236      0.0556       3.1201       0.4482      0.5142      0.1872        3.37
400    15    0.3700      0.1443      10.9318       1.2860      1.2046      0.4675        3.24
400    20    0.5482      0.3031      27.2676       3.5127      2.9855      0.9567        3.16
400    30    1.3749      1.2734      87.4721      13.9188     11.3183      2.7661        2.17
From the theoretical and computational results a number of conclusions can be drawn:
• Theoretical and experimental results confirm that the OLM algorithms outperform the GLLSP algorithms. In theory, the ratio between Algorithm 1 and Algorithms 2–3 is linear in k/M, while that between Algorithms 1 and 4 is constant. In practice, this performance difference decreases as the number of regressors or equations increases.
• The direct use of the standard LAPACK routine DGGGLM to solve the GLLSP is not feasible for large-scale models.
• The discrepancies between the theoretical and experimental results are due to implementation overheads and memory usage.
• The complexities of the OLM algorithms and of the recursive Algorithm 4 are linear functions of the sample size. It follows that, in practice, the performance of these algorithms does not deteriorate when the number of observations increases, and thus they can solve large-scale problems.
• The algorithms for solving the TSUR model (44) outperform the corresponding algorithms for solving the initial SUR model (1). For the largest problems, the cost of transforming the SUR model to one of smaller dimensions and solving it is negligible compared to the cost of solving the original one.

Table 5
Complexity of Algorithms 1–4, where k = k1 = · · · = kG

Algorithm   Complexity                                                    Approx. for M ≫ k    Compl. for the TSUR model (44)

OLM algorithms
LAPACK      2G^3 k^2 (M − k/3)                                            2G^3 k^2 M           2G^4 k^3
Alg. 1      G^2 k^2 (M + 2G(M − k + 1)/3)                                 2G^3 k^2 M/3         2G^4 k^3/3

GLLSP algorithms
LAPACK      G^3 (4M^3/3 + 4M^2 k − 2k^3/3)                                4G^3 M^3/3           4G^6 k^3/3
Alg. 2      G^2 kM (M + 2G(M − k + 1)/3)                                  2G^3 kM^2/3          2G^5 k^3/3
Alg. 3      G^2 kM (M + G(M − k + 2)/3) + G^2 k^2 (M + G(M − k + 1)/3)    G^3 kM^2/3           G^5 k^3/3
Alg. 4      G^2 k^3 + 4G^3 k^2 (M − k + 1/2)/3                            4G^3 k^2 M/3         4G^4 k^3/3
6. Summary
Algorithms for solving the seemingly unrelated regressions (SUR) model have been
considered. The algorithms use as a basic component the QR decomposition. Initially
the SUR model is transformed to an ordinary linear model (OLM). This transformation
results in a regressor matrix having a block triangular structure. The best linear unbiased
estimator (BLUE) of the SUR model results from the least-squares (LS) solution of the
OLM. A computationally e1cient strategy (Algorithm 1) produces the LS estimator by
exploiting the sparse structure of the matrices. This strategy outperforms the LAPACK
DGELS subroutine, which treats the matrices as full, when the problem is not very
small.
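The OLM route can be illustrated numerically. The following sketch (hypothetical numpy code written for this summary, not the paper's Algorithm 1; it treats the matrices as full) stacks a tiny SUR model, whitens it with the Cholesky factor of the dispersion Σ ⊗ I_M, and solves the resulting ordinary LS problem; the result coincides with the textbook GLS/BLUE estimator from the normal equations.

```python
import numpy as np

# A hedged numeric sketch (not the paper's Algorithm 1): the BLUE of a small
# SUR model obtained by whitening the stacked OLM and solving it by LS.
rng = np.random.default_rng(0)
G, M, k = 2, 6, 1                       # 2 equations, 6 observations, 1 regressor
X1, X2 = rng.normal(size=(M, k)), rng.normal(size=(M, k))
beta_true = np.array([1.0, -2.0])
Sigma = np.array([[1.0, 0.6],
                  [0.6, 2.0]])          # contemporaneous covariance

# Stacked model: y = X beta + u, with X = diag(X1, X2), cov(u) = Sigma kron I_M
X = np.block([[X1, np.zeros((M, k))],
              [np.zeros((M, k)), X2]])
y = X @ beta_true + rng.normal(size=G * M)
Omega = np.kron(Sigma, np.eye(M))

# Whiten with the Cholesky factor C of Omega, then ordinary least squares
C = np.linalg.cholesky(Omega)
beta_gls, *_ = np.linalg.lstsq(np.linalg.solve(C, X),
                               np.linalg.solve(C, y), rcond=None)

# Identical to the GLS estimator computed from the normal equations
Oi = np.linalg.inv(Omega)
beta_ne = np.linalg.solve(X.T @ Oi @ X, X.T @ Oi @ y)
assert np.allclose(beta_gls, beta_ne)
```

The block-triangular structure that Algorithm 1 exploits is visible in `X`; the dense solve above ignores it and therefore scales poorly.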
The remaining three algorithms compute the BLUE by formulating the SUR model as a generalized linear least-squares problem (GLLSP). The solution of the GLLSP is obtained using the generalized QR decomposition (GQRD). The first method (Algorithm 2) computes the GQRD by exploiting the block-diagonal structure of the matrix of exogenous variables and the Kronecker structure of the Cholesky factor of the dispersion matrix. This method is computationally more efficient than the corresponding LAPACK routine (DGGGLM) that solves the general linear model. The second method (Algorithm 3) solves the GLLSP iteratively. Each iteration solves a smaller-sized GLLSP. The main advantage of this method is that it avoids the formation of the computationally expensive RQ decomposition (14). This allows Algorithm 3 to outperform Algorithm 2.
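The GLLSP formulation itself can be sketched in a few lines. The code below is an illustrative assumption: it uses a minimum-norm least-squares solve where the paper's algorithms organise an RQ decomposition, but it follows the same idea of rotating the constraint y = Xβ + Cv with the QRD of X, eliminating β, and minimising ||v||.

```python
import numpy as np

# Hedged sketch of the GLLSP: minimise ||v|| subject to y = X beta + C v,
# where C is the Cholesky factor of the dispersion matrix.
rng = np.random.default_rng(1)
m, n = 8, 3
X = rng.normal(size=(m, n))
beta_true = np.array([1.0, -1.0, 0.5])
A = rng.normal(size=(m, m))
Omega = A @ A.T + m * np.eye(m)         # SPD dispersion matrix
C = np.linalg.cholesky(Omega)
y = X @ beta_true + C @ rng.normal(size=m)

# Rotate the constraint with the (complete) QRD of X
Q, R = np.linalg.qr(X, mode="complete")
z = Q.T @ y
W = Q.T @ C
# The last m-n rotated rows involve only v: take the minimum-norm solution
v, *_ = np.linalg.lstsq(W[n:, :], z[n:], rcond=None)
# Recover beta from the triangular top block
beta = np.linalg.solve(R[:n, :], z[:n] - W[:n, :] @ v)

# The GLLSP solution is the BLUE, i.e. the GLS estimator
Oi = np.linalg.inv(Omega)
beta_gls = np.linalg.solve(X.T @ Oi @ X, X.T @ Oi @ y)
assert np.allclose(beta, beta_gls)
```

The design point is that the whole computation works with the well-conditioned factor C rather than forming Ω or its inverse.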
Finally, a recursive estimation strategy (Algorithm 4) is proposed. This is found to be the most efficient when the model is not very small. Furthermore, this strategy requires less memory and can thus solve larger problems. Algorithms 1, 3 and 4 are new designs, while Algorithm 2 was originally proposed in Kontoghiorghes and Clarke (1995).
The algorithms are reassessed after an initial orthogonal transformation is made to reduce the SUR model to one of smaller size. The matrix of exogenous variables of the transformed (TSUR) model (44) has dimensions GK × K compared with GM × K for the original model (1). This transformation is significant for large-scale models, where the number of observations in each equation is much larger than the total number of regressors, i.e., M ≫ K. The solution of the SUR, and consequently the TSUR, model when the regressions have common exogenous factors is currently under investigation. In this case, X_i = X S_i, where X ∈ R^{M × K^d} is the matrix of the K^d distinct regressors, and S_i ∈ R^{K^d × k_i} is the selection matrix comprised of the relevant columns of the K^d × K^d identity matrix. The computation of the QRD of X̃ = X(S_G · · · S_1) produces matrices R̃_i (i = 1, ..., G) in (45) that have a sparse structure able to be exploited by the various algorithms (Foschi and Kontoghiorghes, 2003b; Kontoghiorghes, 2000b; Kontoghiorghes and Dinenis, 1996).
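The selection-matrix notation can be verified with a toy example (illustrative code; the dimensions and the chosen columns are assumptions made here for concreteness):

```python
import numpy as np

# With common exogenous factors, X_i = X S_i, where S_i consists of columns
# of the K^d x K^d identity matrix and simply selects the relevant regressors.
rng = np.random.default_rng(2)
M, Kd = 5, 4
X = rng.normal(size=(M, Kd))            # the K^d distinct regressors
cols = [0, 2]                           # regressors entering equation i
S_i = np.eye(Kd)[:, cols]               # selection matrix in R^{K^d x k_i}
X_i = X @ S_i
assert np.array_equal(X_i, X[:, cols])  # multiplication = column selection
```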
Often SUR models exhibit special properties and characteristics (Foschi and Kontoghiorghes, 2003a, c; Kontoghiorghes, 2000b; Orbe et al., 2003). For the efficient solution of these models the proposed algorithms need to be modified. The structures of the matrices and their properties should be exploited. Iterative algorithms for computing the estimators of models with sparse exogenous matrices merit investigation. Although Algorithm 1 is computationally the most efficient, it is numerically less stable than Algorithms 2–4 (Söderkvist, 1996). Algorithm 1 may provide a poor solution when C is ill-conditioned, and it fails when the dispersion matrix is singular, i.e., when C is not of full rank (Kontoghiorghes, 2000a; Kourouklis and Paige, 1981; Paige, 1978, 1979a). In such cases, the GLLSP approach should be used (Foschi et al., 2002; Kontoghiorghes, 2000b; Kontoghiorghes and Clarke, 1995). The numerical stability of the algorithms needs to be investigated.
The algorithms for solving the TSUR model (44) can be adapted to solve simultaneous equations models (SEMs) (Belsley, 1992; Chavas, 1982; Dhrymes, 1994; Kontoghiorghes and Dinenis, 1997; Zellner and Theil, 1962). Similarly to the SUR model (1), the SEM can be expressed as
\[
\mathrm{vec}(Y) = (\oplus_i W_i)\,\mathrm{vec}(\{\delta_i\}) + \mathrm{vec}(U),
\tag{50}
\]
where $W_i = (X_i \; Y_i)$, $\delta_i \in \mathbb{R}^{k_i + g_i}$, and $Y_i \in \mathbb{R}^{M \times g_i}$ consists of $g_i$ endogenous variables from $Y$, excluding $y_i$. The endogeneity in the SEM can be eliminated by a transformation identical to that employed to derive the TSUR model (Kontoghiorghes and Dinenis, 1997). The transformed SEM can be written as
\[
\mathrm{vec}(\bar{Y}^{*}) = (\oplus_i W_i^{*})\,\mathrm{vec}(\{\delta_i\}) + \mathrm{vec}(\bar{U}^{*}),
\tag{51}
\]
where $W_i^{*} = (R_i \; Y_i^{*})$, $Y_i^{*} = \bar{Q}^{*T} Y_i$, and $\bar{Y}^{*}$, $R_i^{*}$ and $\bar{U}^{*}$ are defined in (44). Efficient algorithms for solving the SEM are currently under investigation.
Acknowledgements
The authors are grateful to the four referees for their constructive comments and
suggestions.
Appendix A. The column- and diagonally-based methods
Consider the updating QR decomposition (UQRD)
\[
Q^T \begin{pmatrix} A \\ B \end{pmatrix} = \begin{pmatrix} R \\ 0 \end{pmatrix},
\tag{A.1}
\]
where $A, R \in \mathbb{R}^{k^{(G)} \times k^{(G)}}$ are upper-triangular, $B \in \mathbb{R}^{q^{(G)} \times k^{(G)}}$ and $Q$ is orthogonal of order $k^{(G)} + q^{(G)}$. Now, let
\[
A \equiv A^{(0)} = \begin{pmatrix}
A^{(0)}_{1,1} & A^{(0)}_{1,2} & \cdots & A^{(0)}_{1,G} \\
0 & A^{(0)}_{2,2} & \cdots & A^{(0)}_{2,G} \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & A^{(0)}_{G,G}
\end{pmatrix}
\]
be block upper-triangular, where the block $A^{(0)}_{i,j}$ has $k_i$ rows and $k_j$ columns, and let
\[
B \equiv B^{(0)} = \begin{pmatrix}
B^{(0)}_{1,1} & B^{(0)}_{1,2} & \cdots & B^{(0)}_{1,G} \\
0 & B^{(0)}_{2,2} & \cdots & B^{(0)}_{2,G} \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & B^{(0)}_{G,G}
\end{pmatrix},
\]
where the block $B^{(0)}_{i,j}$ has $q_i$ rows and $k_j$ columns, $A_{i,i}$ is upper-triangular, $k^{(G)} = \sum_{i=1}^{G} k_i$ and $q^{(G)} = \sum_{i=1}^{G} q_i$. Two block strategies can be used to compute the UQRD (A.1). The first, a diagonally-based strategy, annihilates the block-superdiagonals of $B$ one at a time. The second, a column-based strategy, annihilates the non-zero blocks of $B$ column-by-column (Foschi et al., 2002; Kontoghiorghes, 1999).
The diagonally-based strategy computes the UQRDs
\[
Q_{i,j}^T \begin{pmatrix} A^{(i-1)}_{i+j,i+j} \\ B^{(i-1)}_{j,i+j} \end{pmatrix}
= \begin{pmatrix} A^{(i)}_{i+j,i+j} \\ 0 \end{pmatrix}
\tag{A.2}
\]
(the two row blocks having $k_{i+j}$ and $q_j$ rows) and
\[
Q_{i,j}^T \begin{pmatrix} A^{(i-1)}_{i+j,i+j+1:G} \\ B^{(i-1)}_{j,i+j+1:G} \end{pmatrix}
= \begin{pmatrix} A^{(i)}_{i+j,i+j+1:G} \\ B^{(i)}_{j,i+j+1:G} \end{pmatrix},
\tag{A.3}
\]
where $i = 1, \ldots, G-1$, $j = 1, \ldots, G-i$ and
\[
\begin{pmatrix} A^{(i)}_{i+j,i+j+1:G} \\ B^{(i)}_{j,i+j+1:G} \end{pmatrix}
= \begin{pmatrix}
A^{(i)}_{i+j,i+j+1} & A^{(i)}_{i+j,i+j+2} & \cdots & A^{(i)}_{i+j,G} \\
B^{(i)}_{j,i+j+1} & B^{(i)}_{j,i+j+2} & \cdots & B^{(i)}_{j,G}
\end{pmatrix}
\]
with column-block sizes $k_{i+j+1}, k_{i+j+2}, \ldots, k_G$. Thus, $R$ in (A.1) is given by
\[
R = \begin{pmatrix}
A^{(0)}_{1,1} & A^{(0)}_{1,2} & \cdots & A^{(0)}_{1,G} \\
0 & A^{(1)}_{2,2} & \cdots & A^{(1)}_{2,G} \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & A^{(G-1)}_{G,G}
\end{pmatrix}.
\]
Notice that the UQRDs (A.2) and (A.3) can be computed simultaneously for $j = 1, \ldots, G-i$.
The column-based strategy computes the QRDs
\[
\tilde{Q}_i^T \begin{pmatrix} A^{(i-2)}_{i,i} \\ B^{(i-2)}_{1:i-1,i} \end{pmatrix}
= \begin{pmatrix} A^{(i-1)}_{i,i} \\ 0 \end{pmatrix}
\]
(the two row blocks having $k_i$ and $q^{(i-1)}$ rows) and
\[
\tilde{Q}_i^T \begin{pmatrix} A^{(i-2)}_{i,i+1:G} \\ B^{(i-2)}_{1:i-1,i+1:G} \end{pmatrix}
= \begin{pmatrix} A^{(i-1)}_{i,i+1:G} \\ B^{(i-1)}_{1:i-1,i+1:G} \end{pmatrix},
\]
where $i = 2, \ldots, G$ and $q^{(i-1)} = \sum_{j=1}^{i-1} q_j$. Fig. 6 shows the annihilation patterns of the two strategies in the case where $G = 4$.
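A small numerical check of the UQRD (A.1) can be made with a dense QR factorization (a sketch only; a dense Householder QR ignores the triangularity of A and the block pattern of B that the two strategies above are designed to exploit):

```python
import numpy as np

# Sanity check of the UQRD (A.1): an orthogonal Q maps the stacked matrix
# [A; B], with A already upper-triangular, back to triangular form [R; 0].
rng = np.random.default_rng(3)
kG, qG = 4, 3                           # orders k^(G) and q^(G)
A = np.triu(rng.normal(size=(kG, kG)))  # upper-triangular A
B = rng.normal(size=(qG, kG))           # dense update block
stacked = np.vstack([A, B])

Q, R_full = np.linalg.qr(stacked, mode="complete")
R = R_full[:kG, :]                      # updated triangular factor

assert np.allclose(R_full[kG:, :], 0)   # the block below R is annihilated
assert np.allclose(Q @ R_full, stacked) # Q^T [A; B] = [R; 0]
assert np.allclose(R, np.triu(R))       # R is upper-triangular
```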
Fig. 6. Computation of the UQRD (13a) using the diagonally- and column-based strategies, G = 4.

Now, the computation of (18) and part of (36b) are equivalent to
\[
Q^T \begin{pmatrix} C \\ D \end{pmatrix} = \begin{pmatrix} \tilde{C} \\ \tilde{D} \end{pmatrix},
\tag{A.4}
\]
where
\[
C \equiv C^{(0)} = \begin{pmatrix}
C^{(0)}_{1,1} & C^{(0)}_{1,2} & \cdots & C^{(0)}_{1,G} \\
0 & C^{(0)}_{2,2} & \cdots & C^{(0)}_{2,G} \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & C^{(0)}_{G,G}
\end{pmatrix}
\quad\text{and}\quad
D \equiv D^{(0)} = \begin{pmatrix}
D^{(0)}_{1,1} & D^{(0)}_{1,2} & \cdots & D^{(0)}_{1,G} \\
0 & D^{(0)}_{2,2} & \cdots & D^{(0)}_{2,G} \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & D^{(0)}_{G,G}
\end{pmatrix},
\]
the blocks $C^{(0)}_{i,j}$ and $D^{(0)}_{i,j}$ having $k_i$ and $q_i$ rows, respectively, and $p_j$ columns, and $p^{(G)} = \sum_{i=1}^{G} p_i$. If the diagonally-based strategy is used, then (A.4) is equivalent to computing

\[
Q_{i,j}^T \begin{pmatrix}
C^{(i-1)}_{i+j,j} & C^{(i-1)}_{i+j,j+1} & \cdots & C^{(i-1)}_{i+j,G} \\
D^{(i-1)}_{j,j} & D^{(i-1)}_{j,j+1} & \cdots & D^{(i-1)}_{j,G}
\end{pmatrix}
= \begin{pmatrix}
C^{(i)}_{i+j,j} & C^{(i)}_{i+j,j+1} & \cdots & C^{(i)}_{i+j,G} \\
D^{(i)}_{j,j} & D^{(i)}_{j,j+1} & \cdots & D^{(i)}_{j,G}
\end{pmatrix},
\tag{A.5}
\]
the two row blocks having $k_{i+j}$ and $q_j$ rows and the column blocks $p_j, p_{j+1}, \ldots, p_G$ columns, where $i = 1, \ldots, G-1$ and $j = 1, \ldots, G-i$. Notice that the upper-triangular structure of $D^{(0)}$ is preserved throughout the computation.
Now, if a column-based algorithm is used, then (A.4) is computed as
\[
\tilde{Q}_i^T \begin{pmatrix}
C^{(i-2)}_{i,1} & C^{(i-2)}_{i,2} & \cdots & C^{(i-2)}_{i,G} \\
D^{(i-2)}_{1:i-1,1} & D^{(i-2)}_{1:i-1,2} & \cdots & D^{(i-2)}_{1:i-1,G}
\end{pmatrix}
= \begin{pmatrix}
C^{(i-1)}_{i,1} & C^{(i-1)}_{i,2} & \cdots & C^{(i-1)}_{i,G} \\
D^{(i-1)}_{1:i-1,1} & D^{(i-1)}_{1:i-1,2} & \cdots & D^{(i-1)}_{1:i-1,G}
\end{pmatrix},
\]
the two row blocks having $k_i$ and $q^{(i-1)}$ rows and the column blocks $p_1, p_2, \ldots, p_G$ columns. Using this strategy, the block upper-triangular structure of $D^{(0)}$ is destroyed. This can be avoided if, at the $i$th step, the blocks of $B^{(i-2)}_{1:i-1,i}$ are annihilated one at a time and from bottom to top.
Appendix B. Complexity analysis
The theoretical complexities of the algorithms in terms of the number of flops (floating point operations) are derived in line with Golub and Van Loan (1996). Initially the computational costs of the main factorizations are calculated. These are then used to determine the complexity of Algorithms 1–4. For simplicity the complexities are approximated for large values of G, M and k, where it is assumed that k = k1 = · · · = kG.
B.1. Main factorizations

The number of flops required to compute the Cholesky factorization of an $n \times n$ symmetric and positive definite matrix is given by $n^3/3$ (Golub and Van Loan, 1996). The complexities of computing the QRD of an $m \times n$ matrix using Householder transformations and of applying the resulting orthogonal transformation to an $m$-element vector are given by $T^1_{QR}(m,n) = 2n^2(m - n/3)$ and $T^2_{QR}(m,n) = 2n(2m - n + 1)$, respectively (Golub and Van Loan, 1996, pp. 224–225). Analogously, the complexity of computing the UQRD
\[
Q^T \begin{pmatrix} A \\ B \end{pmatrix} = \begin{pmatrix} R \\ 0 \end{pmatrix}
\tag{B.1}
\]
(the row blocks $R$ and $0$ having $n$ and $m-n$ rows) is $T^1_{UQR}(m,n) = 2n^2(m - n + 1)$, where $Q \in \mathbb{R}^{m \times m}$ is orthogonal, $R$ and $A$ are upper triangular of order $n$ and $B \in \mathbb{R}^{(m-n) \times n}$. The flops required to apply $Q^T$ to a vector are $T^2_{UQR}(m,n) = 4n(m - n + 1)$. Notice that the RQD and URQD have the same complexities as those of the QRD and UQRD, respectively.
Now, consider the computation of the UQRDs (A.1) and (A.4) using the diagonally-based strategy. To simplify the analysis, assume that $k_i = k$, $p_i = p$ and $q_i = q$, for $i = 1, \ldots, G$. Thus, the flops required to compute the UQRDs (A.2), (A.3) and (A.5) are given, respectively, by
\[
T^1_{UQR}(k+p,\,k) = 2k^2(p + 1),
\]
\[
((G-i-j)k + 1)\,T^2_{UQR}(k+p,\,k) = 4(1 + (G-i-j)k)\,k(p+1)
\]
and
\[
(G-i-j+1)\,q\,T^2_{UQR}(k+p,\,k) = 4(G-i-j+1)\,k(p+1)\,q.
\]
The complexities of the diagonally-based methods are given by
\[
T_{diag}(G,k,p,q) = \sum_{i=1}^{G-1} \sum_{j=1}^{G-i}
\Big( \big((G-i-j)k + 1 + (G-i-j+1)q\big)\,T^2_{UQR}(k+p,\,k) + T^1_{UQR}(k+p,\,k) \Big)
\]
\[
= G(G+1)k(p+1)\big(2(k+q)(G+1) - 3k + 6\big)/3
\approx 2G(G+1)^2 k(p+1)(k+q)/3.
\]
Now, if the first block diagonal of $B^{(0)}$ is already zero, then the latter becomes
\[
T^{*}_{diag}(G,k,p,q) = \sum_{i=2}^{G-1} \sum_{j=1}^{G-i}
\Big( \big((G-i-j)k + 1 + (G-i-j+1)q\big)\,T^2_{UQR}(k+p,\,k) + T^1_{UQR}(k+p,\,k) \Big)
\]
\[
= (G-1)(G-2)k(p+1)\big(2G(k+q) - 3k + 6\big)/3
\approx 2G(G-1)(G-2)k(p+1)(k+q)/3.
\]
B.2. Algorithm 1

The complexity of Algorithm 1 is dominated by that of steps 7 and 8. Specifically, the complexity of step 7 is given by
\[
\sum_{i=1}^{G-1} 2(G-i)k\,T^2_{QR}(M,k) = G(G-1)k^2(2M - k + 1)
\]
and that of step 8, i.e., computing the UQRDs (13a) and (13b), by
\[
T_{diag}(G-1,\,k,\,M-k,\,0) = (G-1)(G-2)k(M-k+1)\big((2G-5)k + 6\big)/3
\approx 2(G-1)(G-2)\big(G - \tfrac{5}{2}\big)k^2(M-k+1)/3.
\]
Thus, the number of flops required by Algorithm 1 is approximately
\[
T_1(G,M,k) \approx (G-1)k^2\big(GM + 2(M-k+1)(G^2 - 3G + 5)/3\big)
\approx G^2k^2\big(M + 2G(M-k+1)/3\big).
\]
Table 6 reports the complexity of each step of Algorithm 1.

Table 6
Complexity of each step of Algorithm 1

Step    Complexity
1       G^3/3
2       MG^2
4       2k^2(M − k/3)
5       2k(2M − k + 1)
7       G(G−1)k^2(2M − k + 1)
8       2(G−1)(G−2)(G − 5/2)k^2(M − k + 1)/3
9       G^2 k^2
B.3. Algorithm 2

The complexity of Algorithm 2 is approximately that of steps 6–8. The flops required by step 6 are
\[
G(G-1)M\,T^2_{QR}(M,k)/2 = G(G-1)kM(2M - k + 1).
\]
Notice that steps 7 and 8 can be computed using an adaptation of the diagonally-based algorithm. Furthermore, since the diagonal blocks of $\tilde{W}^{(0)}_{BA}$ are zero, the flops of these steps are
\[
T^{*}_{diag}(G,\,k,\,M-k,\,M-k) = G(G-1)k(M-k+1)(2GM - 3k + 6)/3
\approx 2G^2(G-1)k(M-k+1)M/3.
\]
Therefore, the complexity of Algorithm 2 is
\[
T_2(G,M,k) \approx G(G-1)kM\big(M + 2(G + 3/2)(M-k+1)/3\big)
\approx G^2kM\big(M + 2G(M-k+1)/3\big).
\]
B.4. Algorithm 3

The complexity of the interleaving Algorithm 3 is determined by that of steps 6, 7 and 9. Step 6 applies $Q_i^T$ from the left to an $M \times ks$ matrix and $Q_i$ from the right to an $M(G-s-1) \times M$ matrix. The complexity of this step is thus
\[
T^2_{QR}(M,k)\,\big(ks + M(G-s-1)\big) = 2k(2M-k+1)\big(M(G-1) - s(M-k)\big).
\]
Over all the iterations $s = 0, \ldots, G-1$ the complexity evaluates to
\[
\sum_{s=0}^{G-1} 2k(2M-k+1)\big(M(G-1) - s(M-k)\big) = G(G-1)k(M+k)(2M-k+1)
\approx G^2k(M+k)(2M-k+1).
\]
Now, the complexity of step 7 is given by that of the URQDs (28) and of (25b), i.e.,
\[
\sum_{i=1}^{s} \Big( T^1_{UQR}(M,k) + \big(k(s-i+1) + M(G-s+1)\big)\,T^2_{UQR}(M,k) \Big)
= 2k(M-k+1)\big(ks(s+4) + 2Ms(G-s+1)\big).
\]
Therefore, over all the iterations $s = 0, \ldots, G-1$ the complexity becomes
\[
\sum_{s=0}^{G-1} 2k(M-k+1)\big(ks(s+4) + 2Ms(G-s+1)\big)
= G(G+1)k(M-k+1)\big(G(k+M) + 13k + 4M\big)/3
\approx G^3k(M+k)(M-k+1)/3.
\]
Step 9 consists of multiplying an $M(G-s-1) \times sk$ matrix with a vector, using $2M(G-s-1)sk$ flops. Over all the iterations $s = 0, \ldots, G-1$ the number of flops required is
\[
\sum_{s=0}^{G-1} 2Mks(G-s-1) = G(G-1)(G-2)kM/3 \approx G^3kM/3.
\]
Thus, the total complexity of steps 6, 7 and 9, and hence of Algorithm 3, is given by
\[
T_3(G,M,k) \approx G^2kM\big(M + G(M-k+2)/3\big) + G^2k^2\big(M + G(M-k+1)/3\big).
\]
B.5. Algorithm 4

For the complexity analysis of Algorithm 4 it is assumed that $M_1 = k$ and that $M_s = \delta = (M-k)/(p-1)$, $s = 2, \ldots, p$. Under these assumptions the complexity of step 2 is approximately $T_2(G,k,k) = G^2k^3 + 2G^3k^2/3$. Now, the complexities of computing the UQRD (36a), the transformation
\[
Q^{(s)T} \begin{pmatrix} W^{(s-1)}_{AA} & 0 \\ 0 & C \otimes I_{M_s} \end{pmatrix}
\]
and the RQD (36b) are, respectively,
\[
G\,T^1_{UQR}(k+\delta,\,k) = 2Gk^2(\delta+1),
\]
\[
G(G-1)(k+\delta)\,T^2_{UQR}(k+\delta,\,k)/2 = 2G(G-1)k(\delta+1)(k+\delta)
\approx 2G^2k(\delta+1)(k+\delta)
\]
and
\[
T_{diag}(G,k,\delta,\delta) = G(G+1)k(\delta+1)\big(2(k+\delta)(G+1)/3 - k + 2\big)
\approx 2G^3k(\delta+1)(k+\delta)/3.
\]
It follows that the complexity of step 6 is dominated by that of computing the RQD (36b), i.e. $T_{diag}(G,k,\delta,\delta)$.

Finally, the complexities of steps 7 and 8 are, respectively, $\delta^3/3$ and $2G^2k\delta$, which are marginal with respect to that of step 6. Thus, the complexity of Algorithm 4 is given by
\[
T_4(G,M,k,\delta) \approx G^2k^3 + 2G^3k^2/3 + 2G^3k(\delta+1)(k+\delta)(M-k)/(3\delta).
\]
For $\delta = 1$, this reduces to
\[
T_4(G,M,k,1) \approx G^2k^3 + 2G^3k^2(2M-2k+1)/3.
\]
Notice that, if $M \gg k$, then $T_4(G,M,k,1) \approx 4G^3k(k+1)M/3 \approx 4G^3k^2M/3$, which is a linear function of the sample size.
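As a rough numerical illustration of these counts, the approximate expressions for T1, T2 and T4 (taking δ = 1 for Algorithm 4) can be evaluated directly; the T2/T1 ratio reduces to exactly M/k because the two formulas share the same parenthesised factor:

```python
# Approximate flop counts from Appendix B (leading-order sketches only).
def T1(G, M, k):
    """OLM Algorithm 1: G^2 k^2 (M + 2G(M - k + 1)/3)."""
    return G**2 * k**2 * (M + 2 * G * (M - k + 1) / 3)

def T2(G, M, k):
    """GLLSP Algorithm 2: G^2 k M (M + 2G(M - k + 1)/3)."""
    return G**2 * k * M * (M + 2 * G * (M - k + 1) / 3)

def T4(G, M, k):
    """Recursive Algorithm 4 with delta = 1."""
    return G**2 * k**3 + 2 * G**3 * k**2 * (2 * M - 2 * k + 1) / 3

G, M, k = 10, 400, 5
# The GLLSP/OLM ratio is exactly M/k for these approximations
assert abs(T2(G, M, k) / T1(G, M, k) - M / k) < 1e-9
# T4 is linear in M: doubling M adds a fixed 4 G^3 k^2 M / 3 flops
assert abs((T4(G, 2 * M, k) - T4(G, M, k)) - 4 * G**3 * k**2 * M / 3) < 1e-3
```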
References
Anderson, E., Bai, Z., Bischof, C., Demmel, J., Dongarra, J., Du Croz, J., Greenbaum, A., Hammarling, S., McKenney, A., Ostrouchov, S., Sorensen, D., 1992. LAPACK Users' Guide. SIAM, Philadelphia.
Belsley, D.A., 1992. Paring 3SLS calculations down to manageable proportions. Computer Science in
Economics and Management 5, 157–169.
Björck, Å., 1996. Numerical Methods for Least Squares Problems. SIAM, Philadelphia.
Bolstad, W.M., 1987. An estimation of seemingly unrelated regression-model with contemporaneous covariances based on an efficient recursive algorithm. Comm. Statist. Simulation Comput. 16 (3), 689–698.
Chavas, J.-P., 1982. Recursive estimation of simultaneous equation models. J. Econometrics 18, 207–217.
Dhrymes, P.J., 1994. Topics in Advanced Econometrics. In: Linear and Nonlinear Simultaneous Equations,
Vol. 2. Springer, New York.
Foschi, P., Kontoghiorghes, E.J., 2002. Estimation of seemingly unrelated regression models with unequal
size of observations: computational aspects. Comput. Statist. Data Analysis 41, 211–229.
Foschi, P., Kontoghiorghes, E.J., 2003a. Estimating SUR models with orthogonal regressors: computational aspects. Linear Algebra Appl., in press.
Foschi, P., Kontoghiorghes, E.J., 2003b. Estimation of VAR models: computational aspects. Comput.
Economics 21, 3–22.
Foschi, P., Kontoghiorghes, E.J., 2003c. Estimating seemingly unrelated regression models with vector
autoregressive disturbances. J. Economic Dynamics Control, in press.
Foschi, P., Garin, L., Kontoghiorghes, E.J., 2002. Numerical and computational methods for solving
SUR models. In: Kontoghiorghes, E.J., Rustem, B., Siokos, S. (Eds.), Computational Methods in
Decision-Making, Economics and Finance, Applied Optimization. Kluwer Academic Publishers, Dordrecht,
pp. 405 – 427.
Golub, G.H., Van Loan, C.F., 1996. Matrix Computations, 3rd Edition. Johns Hopkins University Press,
Baltimore, MD.
Kontoghiorghes, E.J., 1999. Parallel strategies for computing the orthogonal factorizations used in the
estimation of econometric models. Algorithmica 25, 58–74.
Kontoghiorghes, E.J., 2000. Inconsistencies and redundancies in SURE models: computational aspects.
Comput. Economics 16 (1+2), 63–70.
Kontoghiorghes, E.J., 2000a. Parallel Algorithms for Linear Models: Numerical Methods and Estimation
Problems. In: Advances in Computational Economics, Vol. 15. Kluwer Academic Publishers, Boston,
MA.
Kontoghiorghes, E.J., 2000b. Parallel strategies for solving SURE models with variance inequalities and
positivity of correlations constraints. Comput. Economics 15 (1+2), 89–106.
Kontoghiorghes, E.J., 2003. Computational methods for modifying seemingly unrelated regressions models.
J. Comput. Appl. Math., forthcoming.
Kontoghiorghes, E.J., Clarke, M.R.B., 1995. An alternative approach for the numerical solution of seemingly
unrelated regression equations models. Comput. Statist. Data Anal. 19 (4), 369–377.
Kontoghiorghes, E.J., Dinenis, E., 1996. Solving triangular seemingly unrelated regression equations models
on massively parallel systems. In: Gilli, M. (Ed.), Computational Economic Systems: Models, Methods &
Econometrics, Advances in Computational Economics, Vol. 5. Kluwer Academic Publishers, Dordrecht,
pp. 191–201.
Kontoghiorghes, E.J., Dinenis, E., 1997. Computing 3SLS solutions of simultaneous equation models with
a possible singular variance-covariance matrix. Comput. Economics 10, 231–250.
Kourouklis, S., Paige, C.C., 1981. A constrained least squares approach to the general Gauss–Markov linear
model. J. Amer. Statist. Assoc. 76 (375), 620–625.
Lawson, C.L., Hanson, R.J., 1974. Solving Least Squares Problems. Prentice-Hall, Englewood Cliffs, NJ.
Orbe, S., Ferreira, E., Rodriguez-Poo, J., 2003. An algorithm to estimate time varying parameter SUR models under different type of restrictions. Comput. Statist. Data Anal. 42, 363–383.
Paige, C.C., 1978. Numerically stable computations for general univariate linear models. Comm. Statist.
Simulation Comput. 7 (5), 437–453.
Paige, C.C., 1979a. Computer solution and perturbation analysis of generalized linear least squares problems.
Math. Comput. 33 (145), 171–183.
Paige, C.C., 1979b. Fast numerically stable computations for generalized linear least squares problems. SIAM
J. Numer. Anal. 16 (1), 165–171.
Paige, C.C., 1990. Some aspects of generalized QR factorizations. In: Cox, M.G., Hammarling, S.J. (Eds.),
Reliable Numerical Computation. Clarendon Press, Oxford, UK, pp. 71–91.
Pollock, D.S.G., 1979. The Algebra of Econometrics (Wiley Series in Probability and Mathematical
Statistics). Wiley, New York.
Regalia, P.A., Mitra, S.K., 1989. Kronecker products, unitary matrices and signal processing applications.
SIAM Rev. 31 (4), 586–613.
Söderkvist, I., 1996. On algorithms for generalized least-squares problems with ill-conditioned covariance matrices. Comput. Statist. 11 (3), 303–313.
Srivastava, V.K., Dwivedi, T.D., 1979. Estimation of seemingly unrelated regression equations models: a
brief survey. J. Econometrics 10, 15–32.
Srivastava, V.K., Giles, D.E.A., 1987. Seemingly Unrelated Regression Equations Models: Estimation and
Inference (Statistics: Textbooks and Monographs), Vol. 80. Marcel Dekker, New York.
Telser, L.G., 1964. Iterative estimation of a set of linear regression equations. J. Amer. Statist. Assoc. 59,
845–862.
Zellner, A., 1962. An efficient method of estimating seemingly unrelated regression equations and tests for aggregation bias. J. Amer. Statist. Assoc. 57, 348–368.
Zellner, A., 1963. Estimators for seemingly unrelated regression equations: some exact finite sample results. J. Amer. Statist. Assoc. 58, 977–992.
Zellner, A., Theil, H., 1962. Three-stage least squares: simultaneous estimation of simultaneous equations.
Econometrica 30 (1), 54–78.