Biometrika (2004), 91, 1, pp. 141–151
© 2004 Biometrika Trust
Printed in Great Britain
On identification of multi-factor models with correlated residuals
B MICHEL GRZEBYK
Department of Pollutants Metrology, INRS, Avenue de Bourgogne,
F-54501 Vandœuvre lès Nancy Cedex, France
[email protected]
PASCAL WILD AND DOMINIQUE CHOUANIÈRE
Department of Epidemiology, INRS, Avenue de Bourgogne,
F-54501 Vandœuvre lès Nancy Cedex, France
[email protected] [email protected]
SUMMARY
We specify some conditions for the identification of a multi-factor model with correlated
residuals, uncorrelated factors and zero restrictions in the factor loadings. These conditions
are derived from the results of Stanghellini (1997) and Vicard (2000) which deal with
single-factor models with zero restrictions in the concentration matrix. Like these authors,
we make use of the complementary graph of residuals and the conditions build on the role
of odd cycles in this graph. However, in contrast to these authors, we consider the case
where the conditional dependencies of the residuals are expressed in terms of a covariance
matrix rather than its inverse, the concentration matrix. We first derive the corresponding
condition for identification of single-factor models with structural zeros in the covariance
matrix of the residuals. This is extended to the case where some factor loadings are
constrained to be zero. We use these conditions to obtain a sufficient and a necessary
condition for identification of multi-factor models.
Some key words: Complementary graph; Covariance graph; Odd cycle; Structural constraint.
1. INTRODUCTION
Our work is motivated by neurotoxicology, where several test variables coming from different
neurobehavioural tests are referred to a limited number of unmeasured mental functions,
or factors, each influencing some but not all of these variables. However, other factors of
no scientific interest, such as the experimental conditions, may also influence some test
variables. This situation can be expressed as a multi-factor model with some factor loadings
constrained to zero, expressing the conditional independence between some test variables
and some mental functions represented as latent factors. These zero factor loadings are
termed structural zero factor loadings. For example, if variables exploring long-term memory
have been measured along with others, like the simple reaction time which explores nervous
motor function, the reaction time performance may be considered independent of long-term
memory, given the latent factor characterising nervous motor function. We suppose here
that, if all factors are included, the test variables are independent conditionally on the factors.
However, we may want to marginalise over the factors with no scientific content, such as
the level of mental concentration of the experimental subject on the neurobehavioural tests.
This implies some marginal correlations, conditional on the remaining factors, among the
test variables that share a common factor over which we have marginalised.
The problem we consider in this paper is that of identification. We consider neither the
problem of the 'existence of model', in the words of Anderson & Rubin (1956), nor the
problem of estimation.
Anderson & Rubin (1956) gave some conditions for identification of multi-factor models
with uncorrelated residuals and Giudici & Stanghellini (2001) gave a sufficient condition
for identification of a subclass of such multi-factor models in which each observed variable
is a response to one factor only and the dependencies between the residuals are expressed
in terms of zero coefficients in the concentration matrix, i.e. the inverse of the covariance
matrix. In this paper we build mostly on the work by Vicard (2000), who gave a necessary
and sufficient condition for identification of such single-factor models. The main difference
is that the independence structure in the residuals we shall consider is expressed in terms
of zero restrictions on the correlation coefficients. We further restrict ourselves to the case
of uncorrelated factors.
After we introduce notation in § 2, § 3 gives a necessary and sufficient condition for
identification of single-factor models with correlated residuals when the dependencies between
the residuals are expressed in terms of zero covariance coefficients. Section 4 gives both
a necessary and a sufficient condition for the identification of the considered multi-factor
models with uncorrelated factors and correlated errors. These results are illustrated by a
series of examples showing when our results can or cannot be applied.
2. NOTATION AND DEFINITIONS
2·1. The problem
We consider the multi-factor model

$$X = \lambda\xi + \delta, \qquad (1)$$

where $X$ is a vector of $q$ observed variables $X_u$ $(u=1,\ldots,q)$ with mean 0 and covariance
matrix $\Sigma$, $\xi$ is a vector of $p$ independent normally distributed factors $\xi_i$ $(i=1,\ldots,p)$, each
with mean 0 and variance 1, $\lambda$ is a $q\times p$ matrix of factor loadings and $\delta$ is a vector of $q$
normally distributed residuals with mean 0. The covariance matrix of $\delta$, denoted by $\Theta$,
is the covariance matrix of $X$ conditional on $\xi$. It is assumed that $\mathrm{cov}(\xi,\xi)=I_p$ and
$\mathrm{cov}(\xi,\delta)=0$. Furthermore, the factor loadings $\lambda$ contain structural zeros, $\lambda_{u,i}=0$ for some
pairs $(X_u,\xi_i)$. These structural constraints are the expression of the substantive knowledge
about the independence of some $X_u$'s from some factors, conditional on the other factors. We
further assume that $\Theta$ comprises structural zeros, $\mathrm{cov}(X_u,X_v\,|\,\xi)=0$, according to substantive
knowledge. All the other parameters are assumed nonzero.

According to (1), the implied covariance model of the observed variables, expressed as
a function of the parameters $\lambda$ and $\Theta$, is then

$$\Sigma(\lambda,\Theta) = \lambda\lambda^{\mathrm{T}} + \Theta. \qquad (2)$$

The model is said to be identified if, for two sets of parameters $(\lambda,\Theta)$ and $(\lambda',\Theta')$ satisfying
the structural constraints, $\Sigma(\lambda,\Theta)=\Sigma(\lambda',\Theta')$ implies that $\lambda=\lambda'$ and $\Theta=\Theta'$. A well-known
first necessary condition for identification of such a model is that the number of parameters
be less than or equal to the number of nonnull relationships in equation (2).
As the identification issues are based solely on the moments, normality assumptions
are not central to our considerations.
2·2. Graphical representation
Such factor models can be considered as chain graphs, denoted by $G=(K,E)$, organised
in two 'boxes'. The set of vertices, $K$, consists of the factors $\xi$ in the first box and the
observed variables $X$ in the second box. The set of edges, $E$, consists of the subset of
directed edges $E_{X\leftarrow\xi}=\{(u,i):\lambda_{ui}\neq 0\}$ from the box of factors to the box of variables,
and the subset of undirected edges $E_{X\cdot\xi}=\{(u,v):\theta_{uv}\neq 0 \text{ and } u\neq v\}$ within the box of
variables.

The subgraph $G_{X\cdot\xi}=(X,E_{X\cdot\xi})$ is the covariance graph (Cox & Wermuth, 1996, p. 30)
of the observed variables conditional on the factors, that is the covariance graph of the
errors $\delta$, which from now on we call the conditional covariance graph. We define its
complementary graph as $\bar{G}_{X\cdot\xi}=(X,\bar{E}_{X\cdot\xi})$ with $\bar{E}_{X\cdot\xi}=\{(u,v):\theta_{uv}=0\}$.

Following the convention defined by Cox & Wermuth (1996), the covariance graph $G_{X\cdot\xi}$
is visualised with undirected dashed lines within the box of the observed variables, since
an absence of edge means a zero covariance between observed variables, conditional on
the factors $\xi$. Likewise, chain graphs are read from right to left; directed edges are visualised
as directed full arrows pointing from the factors on the right to the observed variables on
the left, indicating a dependency of a variable on a factor conditional on all other factors
but not on the other variables.
Examples of chain graphs and their graphical representation are given throughout the text.
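
As a computational aside, these edge sets can be read directly off the zero patterns of $\lambda$ and $\Theta$. The following sketch is ours, not part of the original development; it assumes the structural constraints are encoded as exact zeros in a $q\times p$ loading matrix and a $q\times q$ residual covariance matrix.

```python
import numpy as np

def chain_graph_edges(lam, theta):
    """Derive the edge sets of the chain graph G = (K, E) from the zero
    patterns of the loading matrix `lam` (q x p) and the residual
    covariance matrix `theta` (q x q): the directed edges E_{X<-xi},
    the undirected edges E_{X.xi} of the conditional covariance graph,
    and the edges of its complementary graph."""
    q, p = lam.shape
    directed = {(u, i) for u in range(q) for i in range(p)
                if lam[u, i] != 0}
    undirected = {(u, v) for u in range(q) for v in range(u + 1, q)
                  if theta[u, v] != 0}
    complement = {(u, v) for u in range(q) for v in range(u + 1, q)
                  if theta[u, v] == 0}
    return directed, undirected, complement
```

The complementary edge set is the one on which the identification conditions below operate.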
3. SINGLE-FACTOR MODELS
3·1. Preamble
Vicard (2000) gave a necessary and sufficient condition for identification of single-factor
models with correlated errors. This condition is based on the complementary graph of the
conditional concentration graph. The result holds also when the dependencies are expressed
in terms of a covariance graph, and if structural zero constraints on the factor loadings
are allowed.
3·2. Structural zeros in the covariance matrix
In this subsection, we suppose that all the factor loadings are nonzero.
T 1. A necessary and suYcient condition for a single-factor model to be identified
is that the complementary graph of the conditional covariance graph G
satisfies the
9
X.j
following two conditions:
(i) each connectivity component contains at least one odd cycle,
(ii) the sign of one factor loading per connectivity component is given.
Proof. Let $(\Lambda,\Theta)$ be the set of parameters representing the factor loadings and the conditional covariance matrix. The parameters $\Lambda_u$ are nonzero, whereas $\Theta_{uv}$ is zero if
and only if $(u,v)\in\bar{E}_{X\cdot\xi}$. According to (2), to establish the identification of the model, we
have to check for the uniqueness of the solution to the system

$$\Lambda_u\Lambda_v = \lambda_u\lambda_v \quad \big((u,v)\in\bar{E}_{X\cdot\xi}\big), \qquad (3)$$
$$\Theta_{uv} + \Lambda_u\Lambda_v = \theta_{uv} + \lambda_u\lambda_v \quad \big((u,v)\in E_{X\cdot\xi} \text{ or } u=v\big), \qquad (4)$$

in which $(\Lambda,\Theta)$ are constants and $(\lambda,\theta)$ are the unknowns.

According to the structure of the system, each solution for $\lambda$ leads to a unique solution
for $\theta$. If $\bar{G}_{X\cdot\xi}$ contains more than one connectivity component, the system (3) can be split into
disjoint subsystems, each one corresponding to a single connectivity component. The
identification of $\lambda$ can be established independently in each connectivity component.
Subsequently, we suppose that $\bar{G}_{X\cdot\xi}$ is connected, and the proof applies to each connectivity
component.

Let $X_1$ and $X_2$ be two distinct but arbitrary variables. Then there exists at least one
path connecting the corresponding nodes in $\bar{G}_{X\cdot\xi}$, and (3) leads to

$$\lambda_2 = \begin{cases}\Lambda_1\Lambda_2/\lambda_1, & \text{if the number of edges in the path is odd}, \qquad (5)\\ \lambda_1\Lambda_2/\Lambda_1, & \text{if the number of edges in the path is even}. \qquad (6)\end{cases}$$

We first prove sufficiency. Suppose there exists an odd cycle in $\bar{G}_{X\cdot\xi}$ and suppose that
$X_1$ and $X_2$ belong to this cycle. Then there exist two distinct paths from $X_1$ to $X_2$ through
this cycle. As the cycle is odd, one of the paths is odd and the other is even, so that both
(5) and (6) apply and give

$$\lambda_1^2 = \Lambda_1^2. \qquad (7)$$

Thus $\lambda_1 = a\Lambda_1$ with $a=\pm 1$ for any node in the odd cycle.

It follows that, if $X_1$ is in the odd cycle and $X_2$ is not in the odd cycle, then both
equations (5) and (6) reduce to $\lambda_2 = a\Lambda_2$, whatever the length of the path.

If the sign of one factor loading is fixed a priori, then $a=1$ and $\lambda_u = \Lambda_u$ for all vertices.
Substituting $\lambda_u$ by $\Lambda_u$ in (4) gives $\theta_{uv} = \Theta_{uv}$ for all $(u,v)$ in $E_{X\cdot\xi}$ or $u=v$. Hence the model
is identified.

We next prove the necessity. Suppose that there is no odd cycle in $\bar{G}_{X\cdot\xi}$. Two cases are
possible. First, if there is no cycle, there exists only a single path between two vertices.
Therefore, for any choice of $\lambda_1$, either (5) or (6) provides a solution for any $\lambda_2$, which means
that the model is not identified. Secondly, suppose that $\bar{G}_{X\cdot\xi}$ contains even cycles, and
assign an arbitrary value to $\lambda_1$. Given any other distinct node $X_2$, all the paths connecting
$X_1$ and $X_2$ have the same parity, because $\bar{G}_{X\cdot\xi}$ does not contain an odd cycle. Then all the
paths connecting $X_1$ and $X_2$ lead to the same equation, which expresses $\lambda_2$ as a function
of $\lambda_1$ either through (5) or through (6). Thus the model is not identified. □
Note that the necessary condition has been derived before by Marchetti & Stanghellini
(1996).
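
For instance, if $X_1$, $X_2$ and $X_3$ form a triangle in $\bar{G}_{X\cdot\xi}$, then (3) gives $\lambda_1\lambda_2=\Lambda_1\Lambda_2$, $\lambda_1\lambda_3=\Lambda_1\Lambda_3$ and $\lambda_2\lambda_3=\Lambda_2\Lambda_3$, whence

$$\lambda_1^2 = \frac{(\Lambda_1\Lambda_2)(\Lambda_1\Lambda_3)}{\Lambda_2\Lambda_3} = \Lambda_1^2,$$

which is (7) for the shortest odd cycle. Condition (i) can also be checked mechanically, since a connected graph contains an odd cycle if and only if it is not bipartite. The following sketch is ours, not from the paper; it two-colours each connectivity component of the complementary graph by breadth-first search, with `adj` mapping each node to its set of neighbours.

```python
from collections import deque

def has_odd_cycle(adj, nodes):
    """Condition (i) of Theorem 1: every connectivity component of the
    complementary graph must contain an odd cycle, i.e. must fail to
    be bipartite."""
    colour = {}
    for start in nodes:
        if start in colour:
            continue
        colour[start] = 0
        queue = deque([start])
        bipartite = True            # assume two-colourable until refuted
        while queue:
            u = queue.popleft()
            for v in adj.get(u, ()):
                if v not in colour:
                    colour[v] = 1 - colour[u]
                    queue.append(v)
                elif colour[v] == colour[u]:
                    bipartite = False   # odd cycle found in this component
        if bipartite:
            return False   # a component without an odd cycle: not identified
    return True
```

For the triangle above, `has_odd_cycle({1: {2, 3}, 2: {1, 3}, 3: {1, 2}}, [1, 2, 3])` returns `True`; for a simple path it returns `False`, in line with the necessity argument.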
3·3. Structural zeros in the factor loadings of a single-factor model
We suppose that the first $q^*$ factor loadings are nonzero and that the last $q-q^*$ loadings
are zero. This situation is obtained after marginalising over some factors of no substantive
interest. Furthermore, the results presented in this section will be used for the identification
of multi-factor models.

Let $X(\xi)=\{X_1,\ldots,X_{q^*}\}$ be the subset of children of $\xi$; this is the subset of observed
variables on which the latent factor loads. Let $G^*_{X(\xi)\cdot\xi}$ denote the subgraph of $G_{X\cdot\xi}$ induced
by $X(\xi)$ and let $\bar{G}^*_{X(\xi)\cdot\xi}$ denote its complement.
The following corollary results from Theorem 1.
C 1. A necessary and suYcient condition for a single-factor model with zero
factor loadings to be identified is that the graph G
satisfies the following conditions:
9*
X(j).j
(i) each connectivity component contains at least one odd cycle,
(ii) the sign of one factor loading per connectivity component is given.
Proof. The vector of factor loadings l can be written lT=(l*T, 0, . . . , 0), where l* is
the q*-vector of nonzero factor loadings. Then llT is a block-diagonal matrix,
llT=
A
l*l*T 0
0
0
B
.
The expression of S(l, H) in (2) leads to the system of equations
G
h ,
for u>q* or v>q*,
(8)
uv
s = l l ,
for (u, v)µE9 * ,
(9)
uv
u v
X(j).j
h +l l , for (u, v)µE*
or u=v,
(10)
uv
u v
X(j).j
where s is the (u, v) entry in S(l, H).
uv
Equation (8) identifies h for u>q* or v>q*.
uv
Equations (9) and (10) correspond to the equations of identification of the single-factor
model without zero constraints on the factor loadings, induced by the children of j. The
complementary graph of the conditional covariance graph of this single-factor model is
G
%
9 * , to which Theorem 1 applies.
X(j).j
A similar result was proved by Anderson & Rubin (1956, Theorem 5·4) for uncorrelated
residuals.
4. MULTI-FACTOR MODELS
4·1. Introduction
In this section, we give conditions for identification of multi-factor models using the
conditions for identification of single-factor models. Submodels with fewer factors can be
derived from a multi-factor model by marginalising over and conditioning on subsets of
factors. The associated graphs of these submodels are called summary graphs (Cox &
Wermuth, 1996, p. 196).
Let $\xi_s$, $\xi_m$ and $\xi_c$ be a partition of the set of factors $\xi$. We consider the submodel derived
from (1) by conditioning on $\xi_c$ and marginalising over $\xi_m$.

Thus, only the observed variables $X$ and the subset of factors $\xi_s$ remain. The edges of the
graph of this model can easily be deduced from those of the complete graph: the set of
the directed edges $E_{X\leftarrow\xi_s}$ is obtained by removing from $E_{X\leftarrow\xi}$ the edges coming from a
factor either over which we marginalise or on which we condition; its set of undirected
edges $E_{X\cdot\xi_s}$ is obtained by adding to $E_{X\cdot\xi}$ the edges connecting the pairs of nodes with a
common factor in $\xi_m$.
In particular, we focus on submodels where $\xi_s$ is a single factor in the sequence. As
indicated in §3·3, the conditional covariance graphs induced by the children of the single
factor are of particular interest; they are termed induced conditional covariance graphs
of $X(\xi_s)$ conditional on $\xi_c$, marginalised over $\xi_m$.
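
The marginalising rule just described is purely graphical, so it can be mechanised. The sketch below is ours; `children` maps each factor to the set of observed variables it loads on, and undirected edges are held as pairs `(u, v)` with `u < v`, as in the sketch of §2·2. Conditioning removes only directed edges, so the undirected set is changed by marginalisation alone.

```python
from itertools import combinations

def submodel_undirected_edges(undirected, children, marginalised):
    """Undirected edges E_{X.xi_s} of a submodel: start from E_{X.xi}
    and add an edge between every pair of observed variables sharing
    a common factor in xi_m, the set of marginalised factors."""
    edges = set(undirected)
    for factor in marginalised:
        for u, v in combinations(sorted(children[factor]), 2):
            edges.add((u, v))
    return edges
```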
4·2. A necessary condition for identification of a multi-factor model
T 2. A necessary condition for a multi-factor model with uncorrelated factors to be
identified is that all the single-factor models be identified which are obtained by conditioning
on all factors but one.
Equivalently, if at least one single-factor model, obtained by conditioning on all factors but
one, is not identified then the multi-factor model is not identified.
Proof. As these two conditions are equivalent, we only prove the second. Let $G$ be a
multi-factor model with uncorrelated factors and parameters $\{\Lambda,\Theta\}$. Suppose that the
single-factor model obtained by conditioning on all but factor $\xi_i$ is not identified. Let $\Lambda_i$
be the vector of factor loadings of $\xi_i$ and let $\Lambda^i$ be the $q\times(p-1)$ submatrix of the factor
loadings of the remaining factors. Then $\{\Lambda_i,\Theta\}$ is a set of parameters of a single-factor
model. As it is not identified, there exists another set of parameters $\{\lambda_i,\theta\}$ such that
$\Lambda_i\Lambda_i^{\mathrm{T}}+\Theta = \lambda_i\lambda_i^{\mathrm{T}}+\theta$. Then $\Lambda^i\Lambda^{i\mathrm{T}} + \Lambda_i\Lambda_i^{\mathrm{T}} + \Theta = \Lambda^i\Lambda^{i\mathrm{T}} + \lambda_i\lambda_i^{\mathrm{T}} + \theta$, which means that $\{\lambda_i,\Lambda^i,\theta\}$
is another solution for the set of parameters of the initial multi-factor model. □
This condition is illustrated in the following example. Figure 1(a) represents the chain
graph of a multi-factor model with seven observed variables and two factors. The induced
conditional covariance graphs of the two single-factor models are represented in Figs 1(b), (c).
As the properties of the complementary graphs are the most important, we draw the edges
of the complementary graph with full lines together with the dashed lines of the covariance
graph. With this convention, we can easily check the existence of odd cycles in the full-line
graph.
Fig. 1. A multi-factor model with 7 observed variables and 2 factors. The chain graph is shown in (a).
(b) and (c) show the induced conditional covariance graphs (dashed lines) and their complementary
graphs (full lines) of the two single-factor models obtained by conditioning on $\xi_1$ and $\xi_2$ respectively.
The single-factor model induced by conditioning on $\xi_1$ is identified but the single-factor model induced
by conditioning on $\xi_2$ is not identified. Thus this model is not identified according to Theorem 2.
For example, in Fig. 1(c), the complementary graph of the induced conditional covariance
graph of $X(\xi_1)$ does not contain an odd cycle, so that the multi-factor model is not identified.
This condition is not sufficient, as the following example shows. The chain graph of a
multi-factor model with six observed variables and two factors is presented in Fig. 2(a). Both
complementary graphs of the induced conditional covariance graphs of $X(\xi_1)$ conditional
on $\{\xi_1,\xi_2\}$ and of $X(\xi_2)$ conditional on $\{\xi_1,\xi_2\}$ contain odd cycles, see Figs 2(b), (c), but
the multi-factor model is not identified. To show this, we compare the number of unknowns
and the number of equations. The number of nonzero covariances in $\Sigma(\lambda,\Theta)$ is called the
apparent number of equations. This is not always the actual number of independent
equations, as some equations may be redundant.
Fig. 2. A multi-factor model with 6 observed variables and 2 factors. The chain graph is shown in (a).
(b) and (c) show the induced conditional covariance graphs (dashed lines) and their complementary
graphs (full lines) of the two single-factor models obtained by conditioning on $\xi_1$ and $\xi_2$ respectively.
Both single-factor models obtained by conditioning on one factor are identified, but the multi-factor
model is not identified.
The model in Fig. 2 has 17 parameters and the apparent number of equations is 18.
However, two equations are redundant because the model imposes the two tetrad conditions

$$\frac{\mathrm{cov}(X_1,X_3)\,\mathrm{cov}(X_2,X_4)}{\mathrm{cov}(X_1,X_4)\,\mathrm{cov}(X_2,X_3)}=1, \qquad \frac{\mathrm{cov}(X_3,X_6)\,\mathrm{cov}(X_4,X_5)}{\mathrm{cov}(X_3,X_5)\,\mathrm{cov}(X_4,X_6)}=1.$$

Thus, the actual number of equations is 16, which is lower than the number of unknowns.
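
To see why such identities carry no information about the parameters, note that tetrads of this form arise in model (2) when each covariance in the ratios factors as a product of loadings on one common factor, $\mathrm{cov}(X_u,X_v)=\lambda_u\lambda_v$; the first ratio then reduces to

$$\frac{(\lambda_1\lambda_3)(\lambda_2\lambda_4)}{(\lambda_1\lambda_4)(\lambda_2\lambda_3)} = 1$$

identically in $\lambda$, and similarly for the second, so the two corresponding equations hold whatever the parameter values.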
C 2. A necessary condition for identification of a multi-factor model with
uncorrelated factors is that each factor have at least three children.
Furthermore, if there exists at least a residual correlation within the set of the children of
any given factor, the number of children must be at least four.
Proof. Suppose that a factor has fewer than three children. Then the single-factor model
obtained by conditioning on all but this factor is not identified, since the complementary
graph of its conditional covariance graph has only one or two nodes and thus cannot
contain any odd cycle. If there exists a correlation within the children of a factor, the
complementary graph of the induced conditional covariance graph of the single-factor
model obtained by conditioning on all but this factor cannot have any odd cycle if the
number of children is 1, 2 or 3: with three children and at least one residual edge, the
complement has at most three vertices and is missing an edge, so it contains no triangle. □
This result was proved by Anderson & Rubin (1956) in the case of uncorrelated errors.
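
As with Theorem 1, this necessary condition lends itself to an automatic screen: for each factor, condition on all the others, which by the rule of §4·1 leaves the undirected edges unchanged, restrict attention to that factor's children, and apply condition (i) of Theorem 1 to the complementary graph; condition (ii), the sign fixing, is a modelling choice and is not checked. A sketch of ours, reusing `has_odd_cycle` from §3·2, with `children` and `undirected` as in the earlier sketches:

```python
def theorem2_necessary(children, undirected):
    """Screen the necessary condition of Theorem 2: each single-factor
    model obtained by conditioning on all factors but one must satisfy
    condition (i) of Theorem 1 on the children of the remaining factor."""
    for factor, kids in children.items():
        kids = sorted(kids)
        # Complementary graph restricted to this factor's children:
        # an edge wherever the residual covariance is structurally zero.
        adj = {u: set() for u in kids}
        for i, u in enumerate(kids):
            for v in kids[i + 1:]:
                if (u, v) not in undirected:
                    adj[u].add(v)
                    adj[v].add(u)
        if not has_odd_cycle(adj, kids):
            return False        # Theorem 2: the model is not identified
    return True
```

Corollary 2 falls out of the same check: with fewer than three children the restricted complement has no cycle at all, and with a residual edge among only three children it cannot contain a triangle.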
4·3. A sufficient condition for identification of a multi-factor model
T 3. A suYcient condition for identification of a multi-factor model with uncorrelated
factors is that there exists at least one sequence of factors, j , such that each single-factor
p(i)
model obtained by conditioning on {j , j<i} and marginalising on {j , j>i} is identified.
p(j)
p(j)
Proof. We show that, if such a permutation exists, identification is reached successively
for each vector of factor loadings $\lambda_{\pi(i)}$. In order to simplify notation, the factors are reordered
such that $\pi(i)=i$. Let $\{\Lambda,\Theta\}$ be a set of parameters of a multi-factor model. Let $\{\lambda,\theta\}$
be a second set of parameters of the same model. By (2), the equation of identification of
the model is

$$\Lambda\Lambda^{\mathrm{T}} + \Theta = \lambda\lambda^{\mathrm{T}} + \theta. \qquad (11)$$

In the proof, we denote by $\Lambda_i$ the $i$th column of the matrix $\Lambda$ and make use of the
general property that $\Lambda\Lambda^{\mathrm{T}} = \sum_{i=1}^{p}\Lambda_i\Lambda_i^{\mathrm{T}}$.

First we consider identification of $\lambda_1$. Let $\Theta' = \sum_{i=2}^{p}\Lambda_i\Lambda_i^{\mathrm{T}} + \Theta$ and $\theta' = \sum_{i=2}^{p}\lambda_i\lambda_i^{\mathrm{T}} + \theta$.
Equation (11) can be rewritten as

$$\Lambda_1\Lambda_1^{\mathrm{T}} + \Theta' = \lambda_1\lambda_1^{\mathrm{T}} + \theta'. \qquad (12)$$

This equation corresponds to the identification equation of the single-factor model obtained
by marginalising over all but factor $\xi_1$. As it is supposed to be identified, $\lambda_1 = \Lambda_1$.

Next we consider identification of $\lambda_j$. We suppose that $\lambda_i = \Lambda_i$ for all $i<j$. Then (11) is
reduced to

$$\sum_{i=j}^{p}\Lambda_i\Lambda_i^{\mathrm{T}} + \Theta = \sum_{i=j}^{p}\lambda_i\lambda_i^{\mathrm{T}} + \theta. \qquad (13)$$

This equation is the identification equation of the factor model with $p-j+1$ factors
obtained by conditioning on the $j-1$ first factors. As for the identification of $\lambda_1$, we let
$\Theta' = \sum_{i=j+1}^{p}\Lambda_i\Lambda_i^{\mathrm{T}} + \Theta$ and $\theta' = \sum_{i=j+1}^{p}\lambda_i\lambda_i^{\mathrm{T}} + \theta$. Then (13) can be rewritten as

$$\Lambda_j\Lambda_j^{\mathrm{T}} + \Theta' = \lambda_j\lambda_j^{\mathrm{T}} + \theta'. \qquad (14)$$

This equation corresponds to the identification equation of the single-factor model obtained
by conditioning on $\{\xi_i, i<j\}$ and marginalising over $\{\xi_i, i>j\}$. Again, as this single-factor
model is supposed to be identified, $\lambda_j = \Lambda_j$.

The identification process continues until $j=p-1$. Then $\lambda_i = \Lambda_i$ for $i=1,\ldots,p-1$
and (11) is reduced to

$$\Lambda_p\Lambda_p^{\mathrm{T}} + \Theta = \lambda_p\lambda_p^{\mathrm{T}} + \theta. \qquad (15)$$

This equation is the equation of identification of the single-factor model obtained by
conditioning on all but factor $\xi_p$, which is supposed to be identified. Thus $\Lambda_p = \lambda_p$ and
$\Theta = \theta$. □
This condition is illustrated by the following example. We consider a multi-factor model
with six observed variables and three factors whose chain graph is given in Fig. 3(a).
This multi-factor model is identified by Theorem 3; the sequence $(\xi_2,\xi_1,\xi_3)$ identifies the
parameters. The induced conditional covariance graphs of the successive single-factor
models resulting from the sequence are given in Fig. 3(b).
Fig. 3. A multi-factor model with 6 observed variables and 3 factors. The chain graph is shown in (a).
(b) shows the induced conditional covariance graphs (dashed lines) and their complementary graphs
(full lines) of the sequence of single-factor models, illustrating the operations that prove the identification
of the model using the sequence $(\xi_2,\xi_1,\xi_3)$.
Fig. 4. A multi-factor model with 7 observed variables and 2 factors. The chain graph is shown in (a).
(b) shows the induced conditional covariance graph (dashed lines) and its complementary graph
(full lines) of the single-factor model obtained by marginalising over $\xi_1$ for the tested sequence $(\xi_2,\xi_1)$.
(c) shows the corresponding graph of the single-factor model obtained by marginalising over $\xi_2$ for the
tested sequence $(\xi_1,\xi_2)$. Neither of these single-factor models is identified. However, this multi-factor
model is identified.
The following example proves that this condition is not necessary. We consider a
multi-factor model with seven observed variables and two factors. The structural constraints are presented in the chain graph in Fig. 4(a). Neither the sequence $(\xi_1,\xi_2)$ nor $(\xi_2,\xi_1)$
satisfies the condition of Theorem 3, since neither of the single-factor models obtained by
marginalising over $\xi_1$ or $\xi_2$ is identified. However, the identification has been checked
algebraically. Interested readers can contact M. Grzebyk for the full proof.
5. DISCUSSION
An issue with these conditions is their computational complexity. We note first that Vicard's
(2000) algorithm is linear in $|E^*_{X\cdot\xi}|$, where $|E^*_{X\cdot\xi}|$ is the number of edges in $G^*_{X\cdot\xi}$. For the
necessary condition of identification of a multi-factor model, the algorithm builds the induced
conditional covariance graphs of all the single-factor submodels obtained by conditioning
on all but one factor and checks that they satisfy the conditions of identification of Theorem 1,
using the core algorithm. Thus the computational complexity of the algorithm is linear
in $p\times(|E_{X\cdot\xi}|+K)$, where $K$ is the complexity of the algorithm that derives the induced
conditional covariance graph of a single-factor submodel obtained by conditioning on
all but one factor. In the case of the sufficient condition for identification of a multi-factor
model, the algorithm explores the set of permutations to look for a sequence of the latent
variables which identifies the model. Thus, if the sufficient condition is not fulfilled, the
parsing of all sequences involves $p!$ individual searches, each of which consists of at most $p$
generations of induced conditional covariance graphs of a single-factor model, followed by
checks for odd cycles.
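
In the same spirit, the sufficient condition of Theorem 3 can be screened by brute force over the $p!$ sequences. The sketch below is ours; it reuses `has_odd_cycle` from §3·2 and `submodel_undirected_edges` from §4·1, and again checks only condition (i) of Theorem 1, the sign constraints being a modelling choice.

```python
from itertools import permutations

def theorem3_sufficient(children, undirected):
    """Search for a sequence of factors satisfying Theorem 3: at step i
    we condition on the factors already handled, marginalise over those
    still to come, and check condition (i) of Theorem 1 on the children
    of the current factor.  Returns an identifying sequence, or None."""
    for seq in permutations(children):
        if all(step_identified(children, undirected, seq, i)
               for i in range(len(seq))):
            return seq
    return None

def step_identified(children, undirected, seq, i):
    # Marginalising over the factors after position i adds edges between
    # all pairs of their children; conditioning adds no undirected edges.
    edges = submodel_undirected_edges(undirected, children, seq[i + 1:])
    kids = sorted(children[seq[i]])
    adj = {u: set() for u in kids}
    for a, u in enumerate(kids):
        for v in kids[a + 1:]:
            if (u, v) not in edges:
                adj[u].add(v)
                adj[v].add(u)
    return has_odd_cycle(adj, kids)
```

If `theorem3_sufficient` returns a sequence, Theorem 3 identifies the model; if it returns `None` while `theorem2_necessary` succeeds, the identification status remains undecided, as discussed below.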
If a particular graph is to be checked for identifiability on the basis of the present results,
the necessary condition should be checked first, as it is simpler and faster. If all these
single-factor submodels are identified and no permutation of the factors is found by which
the identification can be shown through Theorem 3, the identification status of the model
is still pending, since it was shown that none of these conditions is both necessary and
sufficient. Note that, in the example presented in Fig. 1, the apparent number of equations
was sufficient for identification but the model was shown, through Theorem 2, not to be
identified; at least in this case, Theorem 2 is the stronger criterion.
The second part of the condition in Theorem 1 indicates that fixing one sign constraint
for each factor is not sufficient for solving the invariance problem when there is more than
one connectivity component in the graph $\bar{G}_{X\cdot\xi}$ of a single-factor model. In this case, the
sign of the factor loadings can be fixed independently in each connectivity component; these
alternative choices lead to different solutions for the covariance matrix $\Theta$, more precisely
for the residual covariances between each pair of elements of $X$ belonging to different connectivity components. Note that, in contrast to the factor loadings, these different solutions
for $\Theta$ are not equal up to a sign change: they may correspond to truly different values.
Thus the sign constraints have to be chosen carefully. This phenomenon is crucial too in
the case of a multi-factor model. Suppose there exists a sequence such that Theorem 3
applies. If the complementary graph of the induced conditional covariance graph of
some single-factor model obtained by conditioning on $\{\xi_{\pi(j)}, j<i\}$ and marginalising over
$\{\xi_{\pi(j)}, j>i\}$ contains more than one connectivity component, choosing different sign
constraints leads to different solutions for the factor loadings of the subsequent factors as
well as different solutions for the residual covariances. Note that this problem does not
arise in factor models with uncorrelated residuals, since then the complementary graph of
the conditional covariance graph is complete and hence connected.
Beyond the diagnosis of the identification of a given model, we saw in Corollary 2
that a first necessary condition for the identification of a latent variable in a particular
study is that each such latent variable be characterised by at least three observed
variables; that is, if one wants to explore a latent trait in a real-world study, for instance
in neurotoxicology, the trait must be explored through at least three observed variables.
Furthermore, if these observed variables are still correlated conditionally on the latent
trait, more than three observed variables are needed.
ACKNOWLEDGEMENT
We acknowledge the help of the editor and of the two anonymous referees whose advice
improved our paper significantly.
REFERENCES
A, T. W. & R, H. (1956). Statistical inference in factor analysis. In Proc. 3rd Berkeley Symp.
Math. Statist. Prob. 5, Ed. J. Neyman, pp. 111–50. Berkeley, CA: Univ. of California Press.
C, D. & W, N. (1996). Multivariate Dependencies. Models, Analysis and Interpretation. London:
Chapman and Hall.
G, P. & S, E. (2001). Bayesian inference for graphical factor analysis models. Psychometrika
66, 577–92.
M, G. M. & S, E. (1996). Alcune osservazioni sui modelli grafici in presenza di variabili
latenti. In Atti della XXXV III Riunione Scientifica 2, Ed. Società Italiana di Statistica, pp. 496–504. Rome,
Italy: Maggioli Editore.
S, E. (1997). Identification of a single-factor model using graphical Gaussian rules. Biometrika
84, 241–4.
V, P. (2000). On identification of a single-factor model with correlated residuals. Biometrika 87, 199–205.
[Received April 2001. Revised September 2003]