Issues of Existence and Uniqueness for I × J × 2 Arrays
Heather M. Bush
PROPOSAL SUMMARY
In this dissertation the theory of Kronecker canonical forms for matrix pencils will be used to
study decompositions of the I × J × K array into a minimal sum of rank-one tensors. These
decompositions are of fundamental importance in the field of three-way analysis, often called
“trilinear” analysis, where a rich collection of theories and applications have developed over the
last 40 years. Three- and higher-way models for analyzing data emerged primarily from within
psychology (e.g. seminal papers by Carroll & Chang, 1970; and Harshman, 1970), and then later
in chemistry (e.g. seminal paper by Appellof & Davidson, 1981), although work by Tucker
(1966; 3MPCA) predates these manuscripts. Indeed, if one is willing to define what one means
by a “multiway” model somewhat loosely, such models can be found in statistical work dating
back at least as far as Fisher and Mackenzie (1923).
Details concerning the I × J × K model and some of the associated decompositions are
discussed below. For purposes of this summary, suffice it to say that the most popular
decompositions studied fall broadly into two categories – those that exhibit so-called “parallel”
factors and those that exhibit more general “mixed” factors. This broad distinction will be
important in understanding the direction of the research being proposed.
One of the primary advantages claimed by trilinear theory over more common bilinear methods
(e.g. principal components analysis) is the rotational uniqueness of the solutions. This
uniqueness is attached to a particular way in which the decomposition problem is posed,
although this is not always the problem that is actually solved in practice. Regardless, a great
deal of importance is attached to these claims of uniqueness, claims which began with papers by
Harshman (1970) and Kruskal (1977) and have since led to a plethora of activity that will be
reviewed below. In brief, the uniqueness theory that has grown up around trilinear models is
primarily – not exclusively – for I × J × 2 models of parallel factors. Some results are available
for I × J × K models of parallel factors and even for I × J × 2 models of limited types of mixed
factors. In some cases a solution is presumed to exist and, conditional on the truth of that
presumption, shown to be unique.
In spite of the fact that the theory of Kronecker canonical forms for matrix pencils deals
precisely with these kinds of decompositions, and predates formal trilinear theory by nearly a
century, there has been little cross-fertilization of ideas. This dissertation will undertake to
provide the first rigorous look at trilinear and, perhaps, multilinear models from the perspective
of matrix pencils. Both the I × J × 2 and the I × J × K models will be studied for parallel and
mixed factor models.
The primary endpoint will always be the derivation of necessary and sufficient conditions for
unique solutions. It is anticipated that this work will result in:
1. The unification of many of the uniqueness results that are currently in the literature.
2. A tightening of conditions for uniqueness (e.g. some of the necessary and sufficient
conditions that are currently well-known in the literature are necessary and sufficient only
in the presence of certain other presumptions).
3. Discovery of general conditions for the specification of mixed factor models that admit
unique solutions.
4. A better understanding of the role of “existence” in the trilinear uniqueness literature. In
particular, it is anticipated that the presumption of existence, on which a subsequent proof
of uniqueness is based, will be shown to be a vacuous presumption in some important
cases.
5. A recalibration of the popular intuition about what one means by “uniqueness” in trilinear
theory.
MULTILINEAR MODELS
Theoretical Background
Extensions of and analogies to the well-known bilinear paradigms have created a substantial
literature on so-called multiway structure-seeking methods. The conceptual similarity to bilinear
methods is indicated in the schematic below, wherein a cube of data is decomposed into two
[Schematic: a space × time × task Data cube decomposed as First Component + Second
Component + Error, each component an outer product of space, time, and task profiles.]
components, each of which consists of three “profiles” (e.g. space, time, and task), which, taken
as triples, are assumed to characterize the cube. As was mentioned above, such techniques
originated in psychology, but some of the most important contributions to understanding and
application have come from within chemometrics, in particular from within fluorescence
spectroscopy (see e.g. Bro, 1997; Burdick, 1995; Leurgans & Ross, 1992; Mitchell and Burdick,
1994; Rayens & Mitchell, 1997; Sanchez & Kowalski, 1988, 1990; Wold et al., 1987; Bro,
1999; Bro & Heimdal, 1996). An extensive reference list is available courtesy of Professor
Rasmus Bro of the Chemometrics Group at the Royal Veterinary and Agricultural University in
Denmark on the web at http://www.optimax.dk/chemobro.html and from the Three-Mode
Company at http://www.fsw.leidenuniv.nl/~kroonenb.
Although the field is far from unified, with several different multiway models and even
different methods of implementing those models, the literature is growing in statistical
sophistication and the successes of many applications are undeniable and intriguing.
Just as linear algebra is the mathematics that underlies bilinear methods, tensor algebra is needed
to describe the structure of trilinear and higher-way models. Users of trilinear models have been
slow to embrace this mathematical language that helps to provide a framework for interpreting
multilinear models. The approach adopted herein, however, consistent with Burdick (1995), is
that fully avoiding this language is a mistake. Hence, the technical overview of multiway
methods will be presented by discussing trilinear methods from this abstract perspective.
Extensions to higher-way arrays are straightforward but notationally cumbersome. Burdick’s
notation is used in the following.
First, we define the tensor product of two vectors and the tensor product of a vector and a
matrix. The arrangement of coordinates in the definition is somewhat arbitrary. However, the
following will suffice:
Definition 1

• Let x be a vector in R^I and y a vector in R^J. A tensor product of x and y is given by
x ⊗ y = xy^t.

• Let z = (z_1, …, z_K) be a vector in R^K and X be an I by J matrix. A tensor product of X
and z is given by X ⊗ z = [z_1 X | z_2 X | … | z_K X], an I × JK matrix.

• If, in fact, X = x ⊗ y, then X ⊗ z = [z_1 xy^t | z_2 xy^t | … | z_K xy^t], the I × JK matrix
whose entries are x_i y_j z_k.
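Definition 1 can be sketched numerically (an illustrative numpy sketch; the data and dimensions are arbitrary):

```python
import numpy as np

# x in R^I and y in R^J: the tensor product x (x) y is the I x J matrix x y^t.
x = np.array([1.0, 2.0])               # I = 2
y = np.array([3.0, 4.0, 5.0])          # J = 3
X = np.outer(x, y)                     # x (x) y = x y^t, shape (2, 3)

# z in R^K: X (x) z is the I x JK matrix [z_1 X | z_2 X | ... | z_K X].
z = np.array([1.0, 10.0])              # K = 2
Xz = np.hstack([zk * X for zk in z])   # shape (2, 6)

# When X = x (x) y, the entry of slab k at position (i, j) is x_i y_j z_k.
assert Xz[1, 1 * 3 + 2] == x[1] * y[2] * z[1]   # i = 1, j = 2, k = 1
```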
Definition 2

• Let U ⊆ R^I and V ⊆ R^J. A tensor product of U and V, denoted by U ⊗ V, is the vector
space consisting of all linear combinations of x ⊗ y where x ∈ U and y ∈ V.

• This idea is easily extendible to more than two vector spaces.

• It isn’t hard to check that if dim(U) = R and dim(V) = S, then dim(U ⊗ V) = RS.
Typically a bilinear errors-in-variables model employed to extract structure from A_{I×J} has the
form A_{I×J} = S_{I×J} + N_{I×J}, interpreted as a signal matrix added to a noise, or error,
matrix. The issue becomes one of how one decides to model the structure in the signal matrix.
Cosmetically different perspectives lead to identical bilinear models, but to very different
multilinear models. To see this, assume that S can be written as the sum of R rank-one matrices.
That is, one might assume that there exist vectors x_r ∈ R^I and y_r ∈ R^J such that:

(1) S = Σ_{r=1}^{R} x_r ⊗ y_r

If the sets {x_r} and {y_r} are each linearly independent, then S will have rank R.
Similarly, one might adopt the perspective that there exist subspaces U ⊆ R^I and V ⊆ R^J,
with dim(U) = dim(V) = R, such that:

(2) S ∈ U ⊗ V

Models (1) and (2) are equivalent, and each suffers equally from a well-known lack of
uniqueness. For instance in (1), there are uncountably infinitely many vectors x_r and y_r that can
describe S equally well from the point of view of fit. Hence, interpretations of the vectors x_r and
y_r become as much an act of faith as an analytical exercise. This problem is popularly referred to
as the “rotation problem”.
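This non-uniqueness is easy to exhibit numerically: for any nonsingular R × R matrix T, replacing the factors X by XT and Y by YT^{-t} leaves S unchanged (an illustrative sketch with arbitrary data):

```python
import numpy as np

rng = np.random.default_rng(0)
I, J, R = 5, 4, 2
X = rng.standard_normal((I, R))       # columns are the x_r
Y = rng.standard_normal((J, R))       # columns are the y_r
S = X @ Y.T                           # S = sum_r x_r (x) y_r

# "Rotate" the factors: S is reproduced exactly by entirely different vectors.
T = np.array([[2.0, 1.0], [0.0, 3.0]])
X2 = X @ T
Y2 = Y @ np.linalg.inv(T).T
assert np.allclose(S, X2 @ Y2.T)      # identical fit, different factors
```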
When (1) is extended to higher-way data structures, the so-called parallel factor model
(“PARAFAC”) emerges.
Definition 3

The PARAFAC model presumes there exist vectors x_r ∈ R^I, y_r ∈ R^J and z_r ∈ R^K such
that:

(3) S = Σ_{r=1}^{R} x_r ⊗ y_r ⊗ z_r
One can think of these R tensor algebra products as representing the relative influences of R
underlying latent characteristics, or factors, that define the array. For instance, if the array were
structured as space, time, and task, then the vectors x_r and y_r represent the relative influence of
factor r on the space and task modes, while the vector z_r contains the weights of the r-th factor
for each of the k time periods. From (3) it can easily be seen that each element of S can be
written as the sum of the relative influences of each of the factors on the i-th space, the j-th task,
and the k-th time period, or s_ijk = Σ_{r=1}^{R} x_ir y_jr z_kr. Notice that for each element, regardless of the
point in time, x_ir y_jr represents the contribution of the r-th factor to the i-th space and the j-th task.
For the k-th time period, this product is multiplied by z_kr. Thus, the whole influence of a factor
on space and task is proportionally adjusted for the influence on time. In other words, the
influence of the factors is adjusted in parallel proportion by the elements of z_r.
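The elementwise identity s_ijk = Σ_r x_ir y_jr z_kr can be checked directly (an illustrative sketch; dimensions and data are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
I, J, K, R = 3, 4, 2, 2
X = rng.standard_normal((I, R))       # columns x_r
Y = rng.standard_normal((J, R))       # columns y_r
Z = rng.standard_normal((K, R))       # columns z_r

# S = sum_r x_r (x) y_r (x) z_r, one rank-one term per factor
S = np.zeros((I, J, K))
for r in range(R):
    S += np.einsum('i,j,k->ijk', X[:, r], Y[:, r], Z[:, r])

# s_ijk = sum_r x_ir y_jr z_kr
i, j, k = 2, 1, 0
assert np.isclose(S[i, j, k], sum(X[i, r] * Y[j, r] * Z[k, r] for r in range(R)))
```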
The extension of (2) defines the so-called Tucker3 model.
Definition 4
The Tucker3 model presumes that there exist subspaces U ⊆ R^I, V ⊆ R^J, and W ⊆ R^K, with
dim(U) = R_U, dim(V) = R_V, and dim(W) = R_W, such that:

(4) S ∈ U ⊗ V ⊗ W
Notice, these two models are quite different. In particular, under rather general conditions, the
PARAFAC model lays claim to a useful form of uniqueness (Kruskal, 1989), while the Tucker3
model cannot (Comments by Burdick on paper by Leurgans & Ross, 1992).
This representation of S is useful in the more complex case when R_U = M factors can be
extracted from the space mode, R_V = P factors exist in the task mode, and R_W = Q factors can
be found in the time mode. As in the case of parallel factors, the factors in each of the modes
contribute a relative influence on the elements. Unlike parallel factors, however, the existence of
factors within each of the modes will necessarily require that the factors be interrelated.
Continue the representation as before, with x_m ∈ U, y_p ∈ V, and z_q ∈ W. The vector x_m
represents the relative influence of the m th space factor on the elements of the space mode,
y p corresponds to the relative influence of the p th task on the elements of the task mode, while
the vector z_q contains the weights of the q-th time factor for each of the k time periods. From (4),
S is any linear combination of vectors of the form x_m ⊗ y_p ⊗ z_q, and an element from this array
can be written as s_ijk = Σ_{m=1}^{M} Σ_{p=1}^{P} Σ_{q=1}^{Q} x_im y_jp z_kq g_mpq, where the coefficient g_mpq represents the
relative weights of the relationships among the factors. In this form, it is obvious that the whole
influence of a particular factor will not merely change proportionally as time periods vary, but
will be dependent on the influences of factors from each of the other two modes. In general there
are many analogous extensions that lead to slightly different mixed factors models.
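The mixed-factor elementwise form can be sketched the same way; here G holds the core weights g_mpq, and a superdiagonal G recovers the parallel factor model as a special case (illustrative data only):

```python
import numpy as np

rng = np.random.default_rng(2)
I, J, K = 4, 3, 2
M, P, Q = 2, 2, 2
X = rng.standard_normal((I, M))       # space factors x_m
Y = rng.standard_normal((J, P))       # task factors y_p
Z = rng.standard_normal((K, Q))       # time factors z_q
G = rng.standard_normal((M, P, Q))    # core weights g_mpq

# s_ijk = sum_{m,p,q} x_im y_jp z_kq g_mpq
S = np.einsum('im,jp,kq,mpq->ijk', X, Y, Z, G)

# A superdiagonal core reduces the mixed model to the parallel factor model.
Gd = np.zeros((M, P, Q))
Gd[0, 0, 0] = Gd[1, 1, 1] = 1.0
S_parafac = np.einsum('im,jp,kq,mpq->ijk', X, Y, Z, Gd)
assert np.allclose(S_parafac,
                   sum(np.einsum('i,j,k->ijk', X[:, r], Y[:, r], Z[:, r])
                       for r in range(2)))
```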
In conclusion to the theoretical background, some further definitions will help to more clearly
formulate the research problems that will be posed.
Definition 5
The rank of a three-way array A is defined to be the smallest value of R for which the
R-component PARAFAC model fits A exactly.
Definition 6
The relative influences of the r-th factor form the vector x_r. These weights can be referred to as
loadings as a way to describe the variations of relative influence from one point in space to the
next. If the vectors of loadings for each factor are combined to form a matrix, X = [x_1 … x_R],
then X is called a loading matrix. Likewise, the vectors of Y = [y_1 … y_R] describe the
variations of relative influence from one task to another, and the vectors of
Z = [z_1 … z_R] describe the variations of relative influence from one time period to the
next.
Definition 7
Let M be a matrix. The k-rank of M is the largest value of k such that every collection of k
columns in M is linearly independent.
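Definition 7 can be implemented by brute force from the definition (`krank` below is an illustrative helper, not an existing library routine):

```python
import numpy as np
from itertools import combinations

def krank(M):
    """Largest k such that EVERY set of k columns of M is linearly independent."""
    n_cols = M.shape[1]
    k = 0
    for size in range(1, n_cols + 1):
        if all(np.linalg.matrix_rank(M[:, list(cols)]) == size
               for cols in combinations(range(n_cols), size)):
            k = size
        else:
            break
    return k

# k-rank can be strictly smaller than rank: two proportional columns
# cap the k-rank at 1 even though the rank is 2.
M = np.array([[1.0, 2.0, 0.0],
              [0.0, 0.0, 1.0]])
assert np.linalg.matrix_rank(M) == 2 and krank(M) == 1
assert krank(np.eye(3)) == 3
```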
Eigenstructure Representation
The PARAFAC multilinear model is usually implemented in one of two ways: using eigenbased
methods (Sanchez & Kowalski, 1988, 1990; Leurgans & Ross, 1992) or by using an alternating
least squares routine, also called PARAFAC (Harshman & Lundy, 1984; Harshman, 1972;
Appellof & Davidson, 1981; Rayens & Mitchell, 1996). The PARAFAC routine exploits the
conditional linearity of the model bearing the same name. Two of the so-called factor matrices,
say X and Y, are fixed and linear regression is used to obtain the other factor matrix Z. Then Y
and Z are fixed and X is estimated, and similarly for Y. This continues iteratively until some
convergence criterion is met. Recent work by Bro and De Jong (1997), by Bro and
Andersson (1998), and by others, directed toward speeding up the convergence of PARAFAC, has
helped make this recursive solution the most popular.
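A bare-bones sketch of the alternating least squares idea just described (illustrative only, not the actual PARAFAC code of Harshman & Lundy): with two factor matrices fixed, the model is linear in the third, so each update is a least-squares solve against an unfolding of the array.

```python
import numpy as np

def khatri_rao(A, B):
    # Columnwise Kronecker product: column r is kron(A[:, r], B[:, r]).
    return np.einsum('ir,jr->ijr', A, B).reshape(-1, A.shape[1])

def parafac_als(T, R, n_iter=500, seed=0):
    """Fit T (I x J x K) ~ sum_r x_r (x) y_r (x) z_r by alternating least squares."""
    I, J, K = T.shape
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((I, R))
    Y = rng.standard_normal((J, R))
    Z = rng.standard_normal((K, R))
    T1 = T.reshape(I, J * K)                      # mode-1 unfolding
    T2 = np.moveaxis(T, 1, 0).reshape(J, I * K)   # mode-2 unfolding
    T3 = np.moveaxis(T, 2, 0).reshape(K, I * J)   # mode-3 unfolding
    for _ in range(n_iter):
        X = T1 @ np.linalg.pinv(khatri_rao(Y, Z)).T   # solve for X given Y, Z
        Y = T2 @ np.linalg.pinv(khatri_rao(X, Z)).T   # solve for Y given X, Z
        Z = T3 @ np.linalg.pinv(khatri_rao(X, Y)).T   # solve for Z given X, Y
    return X, Y, Z

# Noise-free rank-2 example: the fit error is driven toward zero.
rng = np.random.default_rng(3)
T = np.einsum('ir,jr,kr->ijk', rng.standard_normal((4, 2)),
              rng.standard_normal((5, 2)), rng.standard_normal((3, 2)))
Xh, Yh, Zh = parafac_als(T, R=2)
```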
The eigenbased methods allow for an exact solution, in a sense. That is, Sanchez and Kowalski
(1988) modified some ideas from Ho et al. (1978) and were able to solve the PARAFAC model
by solving a generalized eigenanalysis problem. This solution, exact for K = 2 and under the
assumption of perfect signal, was later adapted to the case of K > 2 and an approximate
eigensolution derived (Sanchez & Kowalski, 1990). The eigenbased methods are attractive
because no iterative scheme is apparent to the user. However, these methods have been found to
yield complex eigenstructures if there are significant deviations of the data from the underlying
trilinear assumptions. Most often, the eigensolutions are used as intelligent starting points for the
iterative PARAFAC.
It is important to emphasize that both the parallel factors model and the mixed factors models
can be formulated as generalized eigenanalysis problems. Practical issues of fitting aside, this
eigenanalysis perspective is the one that is necessarily adopted when uniqueness results are
discussed, and is the perspective that allows Kronecker theory to be applied. It will be helpful to
first express the PARAFAC model in terms of a matrix expression.
Recall the model S_{I×J×K} = Σ_{r=1}^{R} x_r ⊗ y_r ⊗ z_r, where x_r ∈ R^I, y_r ∈ R^J and z_r ∈ R^K. If one
thinks of S as composed of K I×J matrix slabs, M_k (see graphic below), then it is easy to show
that this representation of S is equivalent to the presumption that M_k = X D_k Y^t, for k = 1, …, K,
where D_k = diag(z_k1, z_k2, …, z_kR). The D_k matrices are often called “core”
matrices. Similar representations are available for the Tucker model but are not going to be
discussed at this moment.
[Graphic: the I × J × K array S partitioned into K frontal slabs M_k, each of dimension I × J.]
A rank R solution amounts to the specification of X (I × R), Y (J × R), and D_1, D_2, …, D_K. Intuitively,
using PCA-like language, one can think of the columns of X as the common “scores”, the
columns of Y as the common “directions”, and the columns of Z as the relative weights that
distinguish the slabs in the Z direction. It is important to note that the model presumes that such
matrices exist, that is, that the specified decomposition is possible. Typically, uniqueness results
in the literature have been derived in the presence of this presumption.
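The slab representation can be verified directly (an illustrative sketch with arbitrary data):

```python
import numpy as np

rng = np.random.default_rng(4)
I, J, K, R = 4, 3, 2, 2
X = rng.standard_normal((I, R))
Y = rng.standard_normal((J, R))
Z = rng.standard_normal((K, R))

# S = sum_r x_r (x) y_r (x) z_r
S = np.einsum('ir,jr,kr->ijk', X, Y, Z)

# Each frontal slab satisfies M_k = X D_k Y^t with D_k = diag(k-th row of Z).
for k in range(K):
    assert np.allclose(S[:, :, k], X @ np.diag(Z[k]) @ Y.T)
```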
LITERATURE REVIEW
Uniqueness of the PARAFAC Decomposition
As mentioned above, one of the differentiating features of parallel factor decomposition is that it
provides a unique solution when certain conditions are met. As of yet, attempts at reducing these
conditions to a minimal, necessary and sufficient set have been unsuccessful. However, many
different results are available in the literature and these will be briefly reviewed in this section.
Mathematical insights into the uniqueness properties of parallel factor decomposition were first
found by Robert Jennrich and published in Harshman, 1970. In his proof, Jennrich showed that a
unique solution would exist if, for an array with R underlying factors, there were R space
measurements, R task measurements, and R time measurements. However, Harshman was able
to find empirical evidence to suggest that although these conditions were sufficient, they were
not minimal. Operating in the more general I × J × 2 case, with the requirement that no two
columns of the loading matrix Z be proportional, he was able to prove that the solution
was unique for any number of factors.
As evidenced by these first approaches, the task of developing conditions for uniqueness would
involve a discussion of the relationship between the rank of the loading matrices and the number of
factors. Kruskal (1977) developed the notion of k-rank (although the actual term was coined by
Harshman and Lundy), defined earlier. He found that the loading matrices (X, Y, and Z)
obtained from the parallel factor decomposition would be uniquely identified if
k_X + k_Y + k_Z ≥ 2(R + 1), where k_X is the k-rank of the matrix X, and so on. To date, his work
has been the most extensive in the search for criteria, and his resulting condition is generally
accepted as the high point of uniqueness research. Consequently, other uniqueness results have
stemmed from his constraint on the k-ranks of the loading matrices in the hopes of developing a
set of necessary conditions. Leurgans, Ross, and Abel (1993) reduced Kruskal’s original
conditions to a set of requirements on the linear independence of the columns of the loading
matrices, which generated identifiable results. Even so, the conditions did not yield a reduction
adequate to produce necessity.
In the last few years, a resurgence of interest in uniqueness results and Kruskal’s condition has
produced even more conditions and further understanding. In 2000, Sidiropoulos and Bro
expanded Kruskal’s result by showing that it held for I × J × K arrays of complex numbers.
Additionally, they generalized the result to include multiway arrays. The development of
Kruskal’s condition was continued by ten Berge and Sidiropoulos in 2002, as they were able to
counter the notion that Kruskal’s condition was necessary and sufficient for arrays with more
than 1 factor. By producing alternative solutions when Kruskal’s condition was not met,
necessity was shown when the number of factors was two or three. In the case of four factors,
however, uniqueness was achieved even when Kruskal’s condition was not met. Hence, Kruskal’s
condition could not be necessary. From their results, it was conjectured that the answer to
uniqueness might lie in the association of rank and k-rank.
The developments in the area of uniqueness have demonstrated that it is possible to uniquely
identify the loading matrices in parallel factor decomposition. However, the empirical and
mathematical evidence presented has only hinted at the requirements necessary for unique
solutions.
Matrix Pencil Theory
To date, the consolidation of uniqueness results into a necessary and sufficient set has been
hindered by the algebraic subtleties found in the common proof technique. Kronecker canonical
forms of matrix pencils, however, may provide an untapped answer that avoids some of these
obstacles. Theoretically, using Kronecker canonical forms of matrix pencils will result in a
decomposition of the original triad into two parts, singular and regular. The regular part is of
particular interest since it is unique and may contain the important structure (factors) common to
both slabs.
In the case where K = 2, and the array consists of two matrix slabs, M_1 and M_2, recall that the
PARAFAC decomposition can be expressed as:

(5) M_1 = Σ_{r=1}^{R} z_1r (x_r ⊗ y_r) and M_2 = Σ_{r=1}^{R} z_2r (x_r ⊗ y_r),
where the tensors vary in parallel proportion depending on the value of k. As mentioned above,
the system of equations given in (5) is equivalent to the following matrix representation:
(6) M_1 = X D_1 Y^t and M_2 = X D_2 Y^t,

where

D_1 = diag(z_11, …, z_1R), D_2 = diag(z_21, …, z_2R),

(7) X = [x_1 x_2 … x_R] (I × R), Y = [y_1 y_2 … y_R] (J × R), and
Z = [z_11 … z_1R; z_21 … z_2R] (2 × R).
Suppose M_1 and M_2 are defined as above, with the restriction that I = J and that
det(M_1 + λM_2) is not identically zero in λ.
Definition 8 (Gantmacher, 1959)
A pencil of matrices M_1 + λM_2 is termed a regular pencil if

1) M_1 and M_2 are square matrices of the same order; and

2) det(M_1 + λM_2) is not identically equal to zero.

For all other cases, the pencil of matrices is termed a singular pencil.
Consider the special case where M_2 is nonsingular. It is easily seen that, with this requirement,
M_1 + λM_2 will be a regular pencil. That is, since M_2 is nonsingular, M_1 + λM_2 can be
written as M_2(M_2^{-1}M_1 + λI) and the determinant of this expression is given by:

det(M_2(M_2^{-1}M_1 + λI)) = det(M_2) det(M_2^{-1}M_1 + λI).

Because M_2 is nonsingular, det(M_2) ≠ 0. Also, det(M_2^{-1}M_1 + λI) is, up to sign, the
characteristic polynomial of M_2^{-1}M_1 and, being a polynomial of degree n in λ with leading
coefficient 1, cannot vanish identically. Hence, in the case of two square
matrices, where one of the matrices is nonsingular, the resulting pencil will be regular.
Therefore, if both M_2 and D_2 are nonsingular, the pencils M_1 + λM_2 and D_1 + λD_2 will be
regular. Simple substitution, using (6), results in

(8) M_1 + λM_2 = X(D_1 + λD_2)Y^t.
Definition 9 (Gantmacher, 1959)
Two pencils of matrices M_1 + λM_2 and D_1 + λD_2 of the same dimensions, connected by
equation (8) in which X and Y are constant nonsingular matrices, will be called strictly
equivalent.
Theorem (Gantmacher, 1959)
Two pencils of square matrices of the same order, M_1 + λM_2 and D_1 + λD_2, for which M_2 and
D_2 are both nonsingular, are strictly equivalent if and only if the pencils have the same
elementary divisors.
Elementary divisors, for a matrix pencil P, can be found by reducing P to a “quasi-diagonal
matrix” that consists of polynomials (λ + α)^p (Gantmacher, 1959). Polynomials with power
greater than zero are the elementary divisors of the pencil. When a pencil is composed of
diagonal matrices, D_1 and D_2, the resulting elementary divisors are linear and are the entries of
D_1 + λD_2.
It follows that (6) can hold when and only when M_1 + λM_2 and D_1 + λD_2 share the
same elementary divisors. To see this, suppose that the solution exists and is expressed in the
form M_1 = X D_1 Y^t, M_2 = X D_2 Y^t. The two equations can be connected by equation (8) by
simple substitution, or M_1 + λM_2 = X(D_1 + λD_2)Y^t. From Definition 9, this implies that the
pencils M_1 + λM_2 and D_1 + λD_2 are strictly equivalent. Since M_2 and D_2 are nonsingular,
square matrices, the resulting pencils are regular and, using the theorem from Gantmacher
(1959), will have the same elementary divisors.
Now suppose that the two regular pencils M_1 + λM_2 and D_1 + λD_2 have the same elementary
divisors; then the pencils are strictly equivalent (Gantmacher, 1959). By definition, whenever
two pencils are strictly equivalent, they can be written as equation (8),
M_1 + λM_2 = X(D_1 + λD_2)Y^t, ∀λ ∈ C. Let λ = 0; then the equation reduces to M_1 = X D_1 Y^t.
If λ = 1, then M_1 + M_2 = X(D_1 + D_2)Y^t, but M_1 = X D_1 Y^t. Therefore, M_2 = X D_2 Y^t, and a
solution does exist.
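When M_2 is also invertible, this argument is constructive: M_1 M_2^{-1} = X (D_1 D_2^{-1}) X^{-1}, so an ordinary eigendecomposition recovers the diagonal of D_1 D_2^{-1}. A sketch in the square, noise-free case, in the spirit of the eigenbased methods cited above (illustrative data only):

```python
import numpy as np

rng = np.random.default_rng(5)
n = 3                                   # square slabs, R = n factors
X = rng.standard_normal((n, n))
Y = rng.standard_normal((n, n))
z1 = np.array([1.0, 2.0, 3.0])          # first row of Z (distinct ratios)
z2 = np.array([1.0, 1.0, 1.0])          # second row of Z (D_2 nonsingular)
M1 = X @ np.diag(z1) @ Y.T
M2 = X @ np.diag(z2) @ Y.T

# M1 M2^{-1} = X diag(z1/z2) X^{-1}: its eigenvalues are the ratios z_1r/z_2r,
# and its eigenvectors recover the columns of X up to permutation and scale.
evals, evecs = np.linalg.eig(M1 @ np.linalg.inv(M2))
assert np.allclose(np.sort(evals.real), np.sort(z1 / z2))
```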
It should be noted that one must be careful about how regular pencils are defined. Earlier in the
section a broader definition was used to define a regular pencil. In the case where M_1 and M_2
are square and det(M_1 + λM_2) is not identically zero, but M_2 is allowed to be singular,
Gantmacher’s theorem is not valid. Consider the following example (Gantmacher, 1959):
For square matrices A, B, A′, and B′, where det(A + λB) and det(A′ + λB′) are not identically
zero, let

A = [2 1 3; 3 2 5; 3 2 6],  B = [1 1 2; 1 1 2; 1 1 3],
A′ = [2 1 1; 1 2 1; 1 1 1],  B′ = [1 1 1; 1 1 1; 1 1 1].

It can be shown that the elementary divisor for each of the pencils A + λB and A′ + λB′ is
λ + 1. This would imply that the two pencils are strictly equivalent. However, the rank of B is 2
and the rank of B′ is 1. It cannot possibly be true that the pencils are equivalent.
cannot possibly be true that the pencils are equivalent. Therefore, the theorem for the
equivalence of pencils does not hold for the broader definition of regular pencils. In order to
salvage the theorem, it is necessary to introduce infinite elementary divisors. Initially, however,
the focus of this paper will remain on the situation where M 2 and D 2 are nonsingular, and the
issue of infinite elementary divisors will be left to later.
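The example can be checked numerically with the matrices as reconstructed from Gantmacher (a sketch; the pencil determinants are evaluated at a few values of λ):

```python
import numpy as np

A  = np.array([[2., 1., 3.], [3., 2., 5.], [3., 2., 6.]])
B  = np.array([[1., 1., 2.], [1., 1., 2.], [1., 1., 3.]])
Ap = np.array([[2., 1., 1.], [1., 2., 1.], [1., 1., 1.]])   # A'
Bp = np.ones((3, 3))                                        # B'

# Both pencils have determinant lam + 1, so each has the single finite
# elementary divisor lam + 1 ...
for lam in [0.0, 1.0, 2.0, -0.5]:
    assert np.isclose(np.linalg.det(A + lam * B), lam + 1.0)
    assert np.isclose(np.linalg.det(Ap + lam * Bp), lam + 1.0)

# ... yet rank(B) = 2 while rank(B') = 1, so the pencils cannot be strictly
# equivalent; the discrepancy lies in the infinite elementary divisors.
assert np.linalg.matrix_rank(B) == 2 and np.linalg.matrix_rank(Bp) == 1
```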
REGULAR PENCILS AND ISSUES OF UNIQUENESS
Elementary divisors provide a platform for the discussion of the existence of solutions. Before
the question of whether or not a solution is unique can be answered, the question of when
solutions, and therefore alternative solutions, exist must be addressed. Previously, it was noted
that sharing elementary divisors is a necessary and sufficient condition for the existence of
solutions. However, elementary divisors offer a much richer bank of information in that they are
fundamental in determining the Jordan canonical form of a pencil. For a polynomial (λ + α)^p,
the power p determines the size of each Jordan block, and the value of α corresponds to the
entry, or eigenvalue. Therefore, claiming that two pencils share the same elementary divisors is
equivalent to stating that the two pencils share the same Jordan form.
In terms of uniqueness, then, the question becomes “under what conditions will another pencil
have the same Jordan form?” This surely includes permutation and scaling, but does it include
any other types of matrices? Thus, the elementary divisors enable the creation of an “if and only
if” condition for when alternative solutions exist, and provide conditions, through the derivation
of the Jordan form, for whether the solutions are essentially the same or different. Finally, the
theory behind the sharing of elementary divisors and Jordan forms does not require the
PARAFAC condition of the diagonality of the matrices built from Z. Thus, Kronecker canonical
forms for matrix pencils suggest that existence and uniqueness results are available for pencils
composed of more complex D_1 and D_2 matrices.
Initial Result
The first result from this dissertation is a partial clarification of Harshman’s famous 1972
uniqueness theorem and is included as an example of how pencil theory is intended to be used to
clarify existing results and produce new ones.
Theorem (Harshman, 1972):

Suppose M_1 = X D_1 Y^t and M_2 = X D_2 Y^t, where X and Y are n × l matrices which are
“non-horizontal” (n ≥ l) and “basic” (of rank l), and D_1 and D_2 are nonsingular matrices such
that D_1 D_2^{-1} = D_p, where D_p has distinct diagonal elements. Suppose also that there exists some
alternative representation of M_1 and M_2, such as M_1 = G C_1 H^t, M_2 = G C_2 H^t. Then
G = XΠΛ_1, H = YΠΛ_2, and C = ZΠΛ_3, where Π is a permutation matrix, each Λ_i is a diagonal
matrix, and C_k is the diagonal matrix with the k-th row of C on the diagonal.
Harshman’s theorem gives conditions for when an alternative solution will be a permuted-scaled
version of the original. It is important to note that ten Berge and Sidiropoulos (2002)
argued that the conditions posed in Harshman’s theorem were equivalent to Kruskal’s k-rank
condition when R = 2. In the theorem below, elementary divisors are used to show that
permuted-scaled versions of an existing solution constitute another solution.
Theorem 1: The two pencils D_1 + λD_2 and C_1 + λC_2, as defined above, have the same Jordan
form.

Proof:

It is necessary to show that the pencils D_1 + λD_2 and C_1 + λC_2 have the same elementary
divisors.

First, let C = ZΛ_3, so that the (k, r) entry of C is λ_3r z_kr, where Λ_3 = diag(λ_31, …, λ_3R). Let C_k
represent the diagonal matrix that results when the k-th row of C is placed on the diagonal, or
C_k = diag(λ_31 z_k1, …, λ_3R z_kR). Consider k = 2. Then C_1 + λC_2 = Λ_3 D_1 + λΛ_3 D_2. Because
D_2, and likewise C_2, are diagonal and nonsingular, one can consider the elementary divisor
problem to consist of finding the elements of C_1 C_2^{-1}. However, C_1 C_2^{-1} = Λ_3 D_1 D_2^{-1} Λ_3^{-1}. Therefore,
the elements on the diagonal of C_1 C_2^{-1} are simply the elements on the diagonal of D_1 D_2^{-1}.
Consequently, the elementary divisors must be the same.

Now consider the case when C = ZΠΛ_3. In this case the columns are also permuted. However,
permuting the columns merely permutes the diagonal entries of C_1 C_2^{-1} and so has no effect
on the elementary divisors. Therefore, in this case, the elementary divisors are also equal.
Hence, permutation and scaling will have no effect on the elementary divisors and so the Jordan
forms will be the same.
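Theorem 1 can be illustrated numerically: permuting and rescaling the columns of Z leaves the diagonal of C_1 C_2^{-1}, and hence the elementary divisors, unchanged up to order (a sketch with arbitrary numbers):

```python
import numpy as np

z1 = np.array([1.0, 2.0, 3.0])            # first row of Z
z2 = np.array([4.0, 5.0, 6.0])            # second row of Z
D1, D2 = np.diag(z1), np.diag(z2)

# Alternative solution C = Z Pi Lambda_3: permute columns, then scale them.
Pi = np.eye(3)[:, [2, 0, 1]]              # permutation matrix
L3 = np.diag([2.0, -1.0, 0.5])            # Lambda_3, nonsingular diagonal
C = np.vstack([z1, z2]) @ Pi @ L3
C1, C2 = np.diag(C[0]), np.diag(C[1])

# The diagonal of C1 C2^{-1} is a permutation of that of D1 D2^{-1}:
# the pencils share elementary divisors and hence a Jordan form.
assert np.allclose(np.sort(np.diag(C1 @ np.linalg.inv(C2))),
                   np.sort(np.diag(D1 @ np.linalg.inv(D2))))
```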
PROPOSED PLAN
Arising from the above discussion are several immediate problems that will form the first part of
this dissertation. These are briefly detailed below. Where the dissertation goes, if and when
these problems are successfully resolved, will have to be decided later. It is fair to point out that
all three of the problems mentioned would be significant contributions to the theory of I × J × 2
trilinear models, paving the way for research innovations in the area of uniqueness and
I × J × K trilinear models (PARAFAC and non-parallel factor models) as well as general
multilinear models.
Harshman’s Theorem (1972)
This theorem was stated at the end of the previous section. The first goal of this dissertation is to
state and prove Harshman’s theorem from the perspective of elementary divisors. The
expectation is that this new perspective will allow necessary and sufficient conditions to be
developed for uniqueness up to permutation and scale.
Kruskal’s Conditions
The conditions on k-rank are by now classic in the trilinear literature. As mentioned, these
conditions are not necessary and sufficient for an I × J × K array of general rank R, in spite of
some early conjectures to this effect in the literature. The second goal of this dissertation is to
understand k-rank for I × J × 2 arrays from the perspective of matrix pencils and elementary
divisors. It should then be clear why the conditions are necessary for R = 2 and 3, but not for
R = 4 and above. That is, it should be clear why ten Berge and Sidiropoulos (2002) were able to
show Kruskal’s conditions were not necessary for R = 4. The hope is that elementary divisors
and the theory of equivalent matrix pencils will be just what is needed to tighten Kruskal’s
conditions to become necessary and sufficient.
Mixed Models
Recall that in so-called “mixed” trilinear models the influence of a particular factor does not
merely change proportionally as the variable denoting the third way varies, but depends on the
influences of factors from each of the other two modes. Very little is known about the
uniqueness of the structures that result from these models. This is unfortunate, since these more
general models are in many ways much more useful than PARAFAC models, simply because
they are less constrained. In fact, mixed models remain the models of choice for
psychometricians, despite the general belief that no rational claims to uniqueness can be made
for them. There is an excellent chance that matrix pencils will allow for the development of
significant uniqueness results for mixed decompositions of the I × J × 2 array. These results
would be shown to incorporate the very few, very specific uniqueness results that are available
for mixed decompositions, while at the same time producing necessary and sufficient conditions
for when mixed arrays admit unique solutions; this should, in turn, become a mechanism for
generating such models.
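The connection between I × J × 2 arrays and matrix pencils is concrete in the parallel-factor case: if the two frontal slabs satisfy X1 = A D1 Bᵀ and X2 = A D2 Bᵀ with D1, D2 diagonal, the eigenstructure of the pencil X2 − λX1 exposes the factors. A minimal numerical sketch of this idea, under the simplifying assumptions that A and B are square and invertible and D1 is nonsingular (all variable names are illustrative):

```python
import numpy as np

# Build an I x J x 2 array with "parallel" factors: both frontal slabs
# share the same A and B, differing only by diagonal weights d1, d2.
rng = np.random.default_rng(1)
R = 3
A = rng.standard_normal((R, R))
B = rng.standard_normal((R, R))
d1 = np.array([1.0, 2.0, 3.0])
d2 = np.array([5.0, 1.0, -2.0])
X1 = A @ np.diag(d1) @ B.T
X2 = A @ np.diag(d2) @ B.T

# X2 @ inv(X1) = A @ diag(d2/d1) @ inv(A), so its eigenvalues are the
# ratios d2/d1 and its eigenvectors recover the columns of A up to
# permutation and scaling -- the essential uniqueness of the model.
evals, evecs = np.linalg.eig(X2 @ np.linalg.inv(X1))
print(np.sort(evals.real))
```

When the pencil has repeated eigenvalues or nontrivial elementary divisors this simple eigen-decomposition breaks down, which is precisely the regime the proposed Kronecker-canonical-form analysis is meant to address.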
REFERENCES
Appellof, C.J., and Davidson, E.R. 1981. Strategies for analyzing data from video fluorometric
monitoring of liquid chromatographic effluents. Analytical Chemistry 53: 2053-2056.
Bro, R. 1997. PARAFAC: Tutorial and applications. Chemometrics and Intelligent Laboratory Systems 38: 149-171.
Bro, R. 1999. Exploratory study of sugar production using fluorescence spectroscopy and multi-way analysis. Chemometrics and Intelligent Laboratory Systems 46: 133-147.
Bro, R., and De Jong, S. 1997. A fast non-negativity-constrained least squares algorithm.
Journal of Chemometrics 11: 393-401.
Bro, R., and Andersson, C.A. 1998. Improving the speed of multiway algorithms. Part II:
Compression. Chemometrics and Intelligent Laboratory Systems 42: 105-113.
Bro, R., and Heimdal, H. 1996. Enzymatic browning of vegetables. Calibration and analysis of
variance by multiway methods. Chemometrics and Intelligent Laboratory Systems 34: 85-102.
Burdick, D. 1995. An introduction to tensor products with applications to multiway data
analysis. Chemometrics and Intelligent Laboratory Systems 28: 229-237.
Carroll, J. D., and Chang, J. J. 1970. Analysis of individual differences in multidimensional
scaling via an N-way generalization of “Eckart-Young” decomposition. Psychometrika 35: 283-319.
Fisher, R.A., and Mackenzie, W.A. 1923. Studies in crop variation II. The manurial response of
different potato varieties. Journal of Agricultural Science 13: 311-320.
Gantmacher, F.R. 1959. The Theory of Matrices, Vols. I and II. Chelsea Publishing Company,
New York.
Harshman, R.A. 1970. Foundations of the PARAFAC procedure: Models and conditions for an
“explanatory” multi-modal factor analysis. UCLA Working Papers in Phonetics 16: 1-84.
Harshman, R.A. 1972. Determination and proof of minimum uniqueness conditions for
PARAFAC1. UCLA Working Papers in Phonetics 22: 111-117.
Harshman, R.A., and Lundy, M.E. 1984a. The PARAFAC model for three-way factor analysis
and multidimensional scaling. In Research Methods for Multimode Data Analysis (H.G. Law,
C.W. Snyder Jr., J.A. Hattie and R.P. Mcdonald, eds.) Praeger, New York.
Harshman, R.A., and Lundy, M.E. 1984b. Data preprocessing and the extended PARAFAC
model. In Research Methods for Multimode Data Analysis (H.G. Law, C.W. Snyder Jr., J.A.
Hattie and R.P. Mcdonald, eds.) Praeger, New York.
Ho, C.N., Christian, G.D., and Davidson, E.R. 1978. Application of the method of rank
annihilation to quantitative analyses of multicomponent fluorescence data from the video
fluorometer. Analytical Chemistry 50: 1108-1113.
Kiers, H.A.L. 2000. Towards a standardized notation and terminology in multiway analysis.
Journal of Chemometrics 14: 105-122.
Kroonenberg, P.M. 1983. Three-Mode Principal Component Analysis. DSWO Press, Leiden,
The Netherlands.
Kroonenberg, P.M. 1989. Singular value decompositions of interactions in three-way
contingency tables. In Multiway Data Analysis (R. Coppi and S. Bolasco, eds.), North-Holland.
Kruskal, J.B. 1977. Three-way arrays: Rank and uniqueness of trilinear decompositions with
application to arithmetic complexity and statistics. Linear Algebra Appl. 18: 95-138.
Kruskal, J.B. 1984. Multilinear methods. In Research Methods for Multimode Data Analysis.
(H.G. Law, C.W. Snyder, Jr., J.A. Hattie and R.P. McDonald, eds.), Praeger, New York.
Kruskal, J.B. 1989. Rank decomposition and uniqueness for 3-way and N-way arrays. In
Multiway Data Analysis (R. Coppi and S. Bolasco, eds.), North-Holland.
Leurgans, S. and Ross, R. 1992. Multilinear Models: Applications in spectroscopy. Statistical
Science 7(3): 289-319.
Leurgans, S., Ross, R., and Abel, R. 1993. A decomposition for three-way arrays. SIAM Journal
on Matrix Analysis and Applications 14: 1064-1083.
Martin, A., Wiggs, C. L., Ungerleider, L. G., and Haxby, J. V. 1996. Neural correlates of
category-specific knowledge. Nature 379: 649-652.
Mitchell, B. and Burdick, D. 1994. Slowly converging PARAFAC sequences: swamps and two
factor degeneracies. Journal of Chemometrics 8: 155-168.
Rayens, W.S. and Mitchell, B. 1997. Two-factor degeneracies and a stabilization of PARAFAC.
Chemometrics and Intelligent Laboratory Systems 38: 173-181.
Sanchez, E., and Kowalski, B.R. 1988. Tensorial calibration: II. Second-order calibration.
Journal of Chemometrics 2: 265-280.
Sanchez, E., and Kowalski, B.R. 1990. Tensorial resolution: A direct trilinear decomposition.
Journal of Chemometrics 4: 29-45.
Sidiropoulos, N.D. and Bro, R. 2000. On the uniqueness of multilinear decomposition of N-way
arrays. Journal of Chemometrics 14: 229-239.
Smilde, A.K. and Doornbos, D.A. 1992. Simple validatory tools for judging the predictive
performance of PARAFAC and three-way PLS. Journal of Chemometrics 6: 11-28.
Smilde, A.K. 1992. Three-way analysis. Problems and prospects. Chemometrics and Intelligent
Laboratory Systems 15: 143-157.
ten Berge, J.M. and Sidiropoulos, N.D. 2002. On uniqueness in CANDECOMP/PARAFAC.
Psychometrika 67: 399-409.
Tucker, L.R. 1966. Some mathematical notes on three-mode factor analysis. Psychometrika 31:
279-311.
Wold, S., Geladi, P., Esbensen, K., and Ohman, J. 1987. Multi-way principal components- and
PLS-analysis. Journal of Chemometrics 1: 41-56.