Issues of Existence and Uniqueness for I × J × 2 Arrays

Heather M. Bush

PROPOSAL SUMMARY

In this dissertation the theory of Kronecker canonical forms for matrix pencils will be used to study decompositions of the I × J × K array into a minimal sum of rank-one tensors. These decompositions are of fundamental importance in the field of three-way analysis, often called "trilinear" analysis, where a rich collection of theories and applications has developed over the last 40 years. Three- and higher-way models for analyzing data emerged primarily from within psychology (e.g. the seminal papers by Carroll & Chang, 1970, and Harshman, 1970), and later in chemistry (e.g. the seminal paper by Appellof & Davidson, 1981), although work by Tucker (1966; 3MPCA) predates these manuscripts. Indeed, if one is willing to define "multiway" model somewhat loosely, such models can be found in statistical work dating back at least as far as Fisher and Mackenzie (1923). Details concerning the I × J × K model and some of the associated decompositions are discussed below. For purposes of this summary, suffice it to say that the most popular decompositions fall broadly into two categories: those that exhibit so-called "parallel" factors and those that exhibit more general "mixed" factors. This broad distinction will be important in understanding the direction of the research being proposed. One of the primary advantages claimed for trilinear theory over more common bilinear methods (e.g. principal components analysis) is the rotational uniqueness of the solutions. This uniqueness is attached to a particular way in which the decomposition problem is posed, although this is not always the problem that is actually solved in practice. Regardless, a great deal of importance is attached to these claims of uniqueness, claims which began with papers by Harshman (1970) and Kruskal (1977) and have since led to a plethora of activity that will be reviewed below.
In brief, the uniqueness theory that has grown up around trilinear models is primarily, though not exclusively, for I × J × 2 models of parallel factors. Some results are available for I × J × K models of parallel factors, and even for I × J × 2 models of limited types of mixed factors. In some cases a solution is presumed to exist and, conditional on the truth of that presumption, shown to be unique. In spite of the fact that the theory of Kronecker canonical forms for matrix pencils deals precisely with these kinds of decompositions, and predates formal trilinear theory by nearly a century, there has been little cross-fertilization of ideas. This dissertation will undertake to provide the first rigorous look at trilinear and, perhaps, multilinear models from the perspective of matrix pencils. Both the I × J × 2 and the I × J × K models will be studied for parallel and mixed factor models. The primary endpoint will always be the derivation of necessary and sufficient conditions for unique solutions. It is anticipated that this work will result in:

1. The unification of many of the uniqueness results that are currently in the literature.

2. A tightening of conditions for uniqueness (e.g. some of the necessary and sufficient conditions that are well known in the literature are necessary and sufficient only in the presence of certain other presumptions).

3. The discovery of general conditions for the specification of mixed factor models that admit unique solutions.

4. A better understanding of the role of "existence" in the trilinear uniqueness literature. In particular, it is anticipated that the presumption of existence, on which a subsequent proof of uniqueness is based, will be shown to be a vacuous presumption in some important cases.

5. A recalibration of the popular intuition about what one means by "uniqueness" in trilinear theory.
MULTILINEAR MODELS

Theoretical Background

Extensions of and analogies to the well-known bilinear paradigms have created a substantial literature on so-called multiway structure-seeking methods. The conceptual similarity to bilinear methods is indicated in the schematic below, wherein a cube of data is decomposed into two components plus error, each component consisting of three "profiles" (e.g. space, time, and task) which, taken as triples, are assumed to characterize the cube.

[Schematic: Data cube = First Component (space, time, and task profiles) + Second Component (space, time, and task profiles) + Error]

As was mentioned above, such techniques originated in psychology, but some of the most important contributions to understanding and application have come from within chemometrics, in particular from within fluorescence spectroscopy (see e.g. Bro, 1997; Burdick, 1995; Leurgans & Ross, 1992; Mitchell & Burdick, 1994; Rayens & Mitchell, 1997; Sanchez & Kowalski, 1988, 1990; Wold et al., 1987; Bro, 1999; Bro & Heimdal, 1996). An extensive reference list is available courtesy of Professor Rasmus Bro of the Chemometrics Group at the Royal Veterinary and Agricultural University in Denmark at http://www.optimax.dk/chemobro.html and from the Three-Mode Company at http://www.fsw.leidenuniv.nl/~kroonenb. Although the field is far from unified, with several different multiway models and even different methods of implementing those models, the literature is growing in statistical sophistication and the successes of many applications are undeniable and intriguing. Just as linear algebra is the mathematics that underlies bilinear methods, tensor algebra is needed to describe the structure of trilinear and higher-way models. Users of trilinear models have been slow to embrace this mathematical language, which helps to provide a framework for interpreting multilinear models.
The approach adopted herein, consistent with Burdick (1995), is that avoiding this language altogether is a mistake. Hence, the technical overview of multiway methods will be presented by discussing trilinear methods from this abstract perspective. Extensions to higher-way arrays are straightforward but notationally cumbersome. Burdick's notation is used in the following. First, a definition of the tensor product of two vectors and of a vector and a matrix. The arrangement of coordinates in the definition is somewhat arbitrary; the following will suffice:

Definition 1. Let x be a vector in R^I and y a vector in R^J. A tensor product of x and y is given by x ⊗ y = xy^t. Let z = (z_k) be a vector in R^K and X an I × J matrix. A tensor product of X and z is given by

X ⊗ z = [z_1 X | z_2 X | ... | z_K X] ∈ R^{I×JK}.

If, in fact, X = x ⊗ y, then

X ⊗ z = [z_1 xy^t | z_2 xy^t | ... | z_K xy^t] ∈ R^{I×JK},

whose (i, (k−1)J + j) entry is x_i y_j z_k.

Definition 2. Let U ⊆ R^I and V ⊆ R^J be subspaces. A tensor product of U and V, denoted by U ⊗ V, is the vector space consisting of all linear combinations of x ⊗ y where x ∈ U and y ∈ V. This idea is easily extendible to more than two vector spaces. It is not hard to check that if dim(U) = R and dim(V) = S, then dim(U ⊗ V) = RS.

Typically a bilinear errors-in-variables model employed to extract structure from A_{I×J} has the form A_{I×J} = S_{I×J} + N_{I×J}, interpreted as a signal matrix added to a noise, or error, matrix. The issue becomes one of how to model the structure in the signal matrix. Cosmetically different perspectives lead to identical bilinear models, but to very different multilinear models. To see this, assume that S can be written as the sum of R rank-one matrices. That is, one might assume that there exist vectors x_r ∈ R^I and y_r ∈ R^J such that:

(1)  S = Σ_{r=1}^R x_r ⊗ y_r

If the sets {x_r} and {y_r} are each linearly independent, then S will have rank R.
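Definition 1 can be made concrete with a small numerical sketch (numpy is used purely for illustration; the sizes, seed, and zero-based indexing are choices made here, not part of the text):

```python
import numpy as np

# Tensor product of vectors: x ⊗ y = x y^t, an I x J matrix.
I, J, K = 4, 3, 2
rng = np.random.default_rng(0)
x, y, z = rng.normal(size=I), rng.normal(size=J), rng.normal(size=K)

xy = np.outer(x, y)                    # x ⊗ y, shape (I, J)

# Tensor product of a matrix and a vector: X ⊗ z = [z_1 X | z_2 X | ... | z_K X],
# an I x JK matrix built from scaled copies of X placed side by side.
Xz = np.hstack([zk * xy for zk in z])  # shape (I, J*K)

# Entry check (0-indexed): (X ⊗ z)[i, k*J + j] = x_i * y_j * z_k
assert np.isclose(Xz[2, 1 * J + 1], x[2] * y[1] * z[1])
```

The hstack arrangement matches the coordinate convention of Definition 1; any fixed rearrangement would serve equally well.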
Similarly, one might adopt the perspective that there exist subspaces U ⊆ R^I and V ⊆ R^J, with dim(U) = dim(V) = R, such that:

(2)  S ∈ U ⊗ V

Models (1) and (2) are equivalent, and each suffers equally from a well-known lack of uniqueness. For instance, in (1) there are uncountably many pairs of vectors x_r and y_r that describe S equally well from the point of view of fit. Hence, interpretation of the vectors x_r and y_r becomes as much an act of faith as an analytical exercise. This problem is popularly referred to as the "rotation problem". When (1) is extended to higher-way data structures, the so-called parallel factor model ("PARAFAC") emerges.

Definition 3. The PARAFAC model presumes there exist vectors x_r ∈ R^I, y_r ∈ R^J, and z_r ∈ R^K such that:

(3)  S = Σ_{r=1}^R x_r ⊗ y_r ⊗ z_r

One can think of these R tensor products as representing the relative influences of R underlying latent characteristics, or factors, that define the array. For instance, if the array is structured as space, time, and task, then the vectors x_r and y_r represent the relative influence of factor r on the space and task modes, while the vector z_r contains the weights of the r-th factor for each of the K time periods. From (3) it is easily seen that each element of S can be written as the sum of the relative influences of each of the factors on the i-th space, the j-th task, and the k-th time period:

s_ijk = Σ_{r=1}^R x_ir y_jr z_kr.

Notice that for each element, regardless of the point in time, x_ir y_jr represents the contribution of the r-th factor to the i-th space and the j-th task. For the k-th time period, this product is multiplied by z_kr. Thus, the whole influence of a factor on space and task is proportionally adjusted for the influence on time. In other words, the influence of the factors is adjusted in parallel proportion by the elements of z_r. The extension of (2) defines the so-called Tucker3 model.
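Model (3) and its elementwise form can be checked numerically; the sketch below (illustrative sizes and seed chosen here) builds a rank-R array two ways and confirms they agree:

```python
import numpy as np

# Build a rank-R PARAFAC array with s_ijk = sum_r x_ir * y_jr * z_kr  (model (3)).
I, J, K, R = 5, 4, 3, 2
rng = np.random.default_rng(1)
X = rng.normal(size=(I, R))   # columns are the factors x_1, ..., x_R
Y = rng.normal(size=(J, R))
Z = rng.normal(size=(K, R))

S = np.einsum('ir,jr,kr->ijk', X, Y, Z)

# The same array as an explicit sum of R rank-one tensors x_r ⊗ y_r ⊗ z_r.
S_sum = sum(np.einsum('i,j,k->ijk', X[:, r], Y[:, r], Z[:, r]) for r in range(R))
assert np.allclose(S, S_sum)
```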
Definition 4. The Tucker3 model presumes that there exist subspaces U ⊆ R^I, V ⊆ R^J, and W ⊆ R^K, with dim(U) = R_U, dim(V) = R_V, and dim(W) = R_W, such that:

(4)  S ∈ U ⊗ V ⊗ W

Notice that these two models are quite different. In particular, under rather general conditions the PARAFAC model lays claim to a useful form of uniqueness (Kruskal, 1989), while the Tucker3 model cannot (comments by Burdick on the paper by Leurgans & Ross, 1992). This representation of S is useful in the more complex case when R_U = M factors can be extracted from the space mode, R_V = P factors exist in the task mode, and R_W = Q factors can be found in the time mode. As in the case of parallel factors, the factors in each of the modes contribute a relative influence on the elements. Unlike parallel factors, however, the existence of factors within each of the modes will necessarily require that the factors be interrelated. Continuing the representation as before, let x_m ∈ U, y_p ∈ V, and z_q ∈ W. The vector x_m represents the relative influence of the m-th space factor on the elements of the space mode, y_p corresponds to the relative influence of the p-th task factor on the elements of the task mode, while the vector z_q contains the weights of the q-th time factor for each of the K time periods. From (4), S is any linear combination of vectors of the form x_m ⊗ y_p ⊗ z_q, and an element of this array can be written as

s_ijk = Σ_{m=1}^M Σ_{p=1}^P Σ_{q=1}^Q x_im y_jp z_kq g_mpq,

where the coefficient g_mpq represents the relative weight of the relationships among the factors. In this form it is obvious that the whole influence of a particular factor will not merely change proportionally as time periods vary, but will depend on the influences of factors from each of the other two modes. In general there are many analogous extensions that lead to slightly different mixed factor models.
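The elementwise Tucker3 form above can also be sketched numerically. The superdiagonal-core comparison at the end is an illustration added here (a standard way to see PARAFAC sitting inside Tucker3), not a claim from the text:

```python
import numpy as np

# Tucker3: s_ijk = sum_{m,p,q} x_im * y_jp * z_kq * g_mpq, with a core array G
# mixing factors across modes.  Sizes and seed are illustrative.
I, J, K = 5, 4, 3
RU, RV, RW = 2, 3, 2                 # factor counts per mode
rng = np.random.default_rng(2)
X = rng.normal(size=(I, RU))
Y = rng.normal(size=(J, RV))
Z = rng.normal(size=(K, RW))
G = rng.normal(size=(RU, RV, RW))    # core: g_mpq weights the factor interactions

S = np.einsum('im,jp,kq,mpq->ijk', X, Y, Z, G)

# With equal mode ranks and a superdiagonal core (g_mpq = 1 iff m = p = q),
# the Tucker3 form collapses to the PARAFAC form of model (3).
R = 2
G_diag = np.zeros((R, R, R))
for r in range(R):
    G_diag[r, r, r] = 1.0
S_parafac = np.einsum('im,jp,kq,mpq->ijk', X[:, :R], Y[:, :R], Z[:, :R], G_diag)
S_direct = np.einsum('ir,jr,kr->ijk', X[:, :R], Y[:, :R], Z[:, :R])
assert np.allclose(S_parafac, S_direct)
```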
To conclude the theoretical background, some further definitions will help to formulate the research problems more clearly.

Definition 5. The rank of a three-way array A is defined to be the smallest value of R for which the R-component PARAFAC model fits A exactly.

Definition 6. The relative influences of the r-th factor form the vector x_r. These weights can be referred to as loadings as a way to describe the variation in relative influence from one point in space to the next. If the vectors of loadings for each factor are combined to form a matrix, X = [x_1 ... x_R], then X is called a loading matrix. Likewise, the vectors of Y = [y_1 ... y_R] describe the variation in relative influence from one task to another, and the vectors of Z = [z_1 ... z_R] describe the variation in relative influence from one time period to the next.

Definition 7. Let M be a matrix. The k-rank of M is the largest value of k such that every collection of k columns of M is linearly independent.

Eigenstructure Representation

The PARAFAC multilinear model is usually implemented in one of two ways: using eigenbased methods (Sanchez & Kowalski, 1988, 1990; Leurgans & Ross, 1992) or using an alternating least squares routine, also called PARAFAC (Harshman & Lundy, 1984; Harshman, 1972; Appellof & Davidson, 1981; Rayens & Mitchell, 1996). The PARAFAC routine exploits the conditional linearity of the model bearing the same name. Two of the so-called factor matrices, say X and Y, are fixed and linear regression is used to obtain the other factor matrix Z. Then Y and Z are fixed and X is estimated, and similarly for Y. This continues iteratively until some convergence criterion is met. Recent work by Bro and De Jong (1997), Bro and Andersson (1998), and others, directed toward speeding up the convergence of PARAFAC, has helped make this recursive solution the most popular. The eigenbased methods allow for an exact solution, in a sense.
That is, Sanchez and Kowalski (1988) modified some ideas from Ho et al. (1978) and were able to solve the PARAFAC model by solving a generalized eigenanalysis problem. This solution, exact for K = 2 and under the assumption of perfect signal, was later adapted to the case of K > 2 and an approximate eigensolution derived (Sanchez & Kowalski, 1990). The eigenbased methods are attractive because no iterative scheme is apparent to the user. However, these methods have been found to yield complex eigenstructures if there are significant deviations of the data from the underlying trilinear assumptions. Most often, the eigensolutions are used as intelligent starting points for the iterative PARAFAC. It is important to emphasize that both the parallel factors model and the mixed factors models can be formulated as generalized eigenanalysis problems. Practical issues of fitting aside, this eigenanalysis perspective is the one that is necessarily adopted when uniqueness results are discussed, and is the perspective that allows Kronecker theory to be applied.

It will be helpful first to express the PARAFAC model as a matrix expression. Recall the model S_{I×J×K} = Σ_{r=1}^R x_r ⊗ y_r ⊗ z_r, where x_r ∈ R^I, y_r ∈ R^J, and z_r ∈ R^K. If one thinks of S as composed of K matrix slabs M_k, each I × J, then it is easy to show that this representation of S is equivalent to the presumption that

M_k = X D_k Y^t, for k = 1, ..., K, where D_k = diag(z_k1, z_k2, ..., z_kR).

The D_k matrices are often called "core" matrices.

[Schematic: the I × J × K array S sliced into K frontal slabs M_k]

A rank-R solution amounts to the specification of X_{I×R}, Y_{J×R}, and D_1, D_2, ..., D_K. Intuitively, using PCA-like language, one can think of the columns of X as the common "scores", the columns of Y as the common "directions", and the rows of Z as the relative weights that distinguish the slabs in the third direction.
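The slab representation, and the eigenanalysis perspective behind it, can be checked on a synthetic array. This is a numpy sketch under assumptions made here (square slabs, nonsingular M_2, illustrative sizes and seed), not the Sanchez-Kowalski algorithm itself:

```python
import numpy as np

# Synthetic rank-R PARAFAC array; square case (I = J = R) so the eigen
# argument below applies directly.
I = J = R = 3
K = 2
rng = np.random.default_rng(3)
X = rng.normal(size=(I, R))
Y = rng.normal(size=(J, R))
Z = rng.normal(size=(K, R))
S = np.einsum('ir,jr,kr->ijk', X, Y, Z)

# Slab representation: the k-th frontal slab satisfies M_k = X D_k Y^t,
# with D_k = diag(z_k1, ..., z_kR).
for k in range(K):
    assert np.allclose(S[:, :, k], X @ np.diag(Z[k]) @ Y.T)

# Eigen perspective: with M_2 (hence D_2) nonsingular, M_2^{-1} M_1 is similar
# to D_2^{-1} D_1, so its eigenvalues are the ratios z_1r / z_2r.
M1, M2 = S[:, :, 0], S[:, :, 1]
eigvals = np.linalg.eigvals(np.linalg.solve(M2, M1))
assert np.allclose(np.sort(eigvals.real), np.sort(Z[0] / Z[1]), atol=1e-6)
```

Recovering the z-ratios from an eigenproblem on the slabs is the sense in which the eigenbased methods give an "exact" solution for K = 2 under perfect signal.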
It is important to note that the model presumes that such matrices exist, that is, that the specified decomposition is possible. Typically, uniqueness results in the literature have been derived in the presence of this presumption.

LITERATURE REVIEW

Uniqueness of the PARAFAC Decomposition

As mentioned above, one of the differentiating features of the parallel factor decomposition is that it provides a unique solution when certain conditions are met. As yet, attempts at reducing these conditions to a minimal necessary and sufficient set have been unsuccessful. However, many different results are available in the literature, and these are briefly reviewed in this section. Mathematical insights into the uniqueness properties of the parallel factor decomposition were first found by Robert Jennrich and published in Harshman (1970). In his proof, Jennrich showed that a unique solution would exist if, for an array with R underlying factors, there were R space measurements, R task measurements, and R time measurements. However, Harshman was able to find empirical evidence to suggest that although these conditions were sufficient, they were not minimal. Operating in the more general I × J × 2 case, with the requirement that no two columns of the loading matrix Z be proportional, he was able to prove that the solution was unique for any number of factors. As evidenced by these first approaches, the task of developing conditions for uniqueness would involve the relationship between the ranks of the loading matrices and the number of factors. Kruskal (1977) developed the notion of k-rank (although the actual term was coined by Harshman and Lundy), defined earlier. He found that the loading matrices X, Y, and Z obtained from the parallel factor decomposition would be uniquely identified if

k_X + k_Y + k_Z ≥ 2(R + 1),

where k_X is the k-rank of the matrix X, and so on.
To date, his work has been the most extensive in the search for criteria, and his resulting condition is generally regarded as the high-water mark of uniqueness research. Consequently, other uniqueness results have stemmed from his constraint on the k-ranks of the loading matrices, in the hope of developing a set of necessary conditions. Leurgans, Ross, and Abel (1993) reduced Kruskal's original conditions to a set of requirements on the linear independence of the columns of the loading matrices, which generated identifiable results. Even so, the conditions did not yield a reduction adequate to produce necessity. In the last few years, a resurgence of interest in uniqueness results and Kruskal's condition has produced further conditions and further understanding. In 2000, Sidiropoulos and Bro expanded Kruskal's result by showing that it holds for I × J × K arrays of complex numbers. Additionally, they generalized the result to multiway arrays. The development of Kruskal's condition was continued by ten Berge and Sidiropoulos in 2002, who countered the notion that Kruskal's condition is necessary and sufficient for arrays with more than one factor. By producing alternative solutions when Kruskal's condition was not met, they showed necessity when the number of factors is two or three. In the case of four factors, however, uniqueness was achieved even when Kruskal's condition was not met. Hence, Kruskal's condition cannot be necessary. From their results, it was conjectured that the answer to uniqueness might lie in the association of rank and k-rank. The developments in the area of uniqueness have demonstrated that it is possible to uniquely identify the loading matrices in the parallel factor decomposition. However, the empirical and mathematical evidence presented has only hinted at the requirements necessary for unique solutions.
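To make the review concrete, the k-rank of Definition 7 and Kruskal's sum condition can be computed directly. The helper names and sizes below are chosen here for illustration (a brute-force k-rank check, exponential in the number of columns, fine for small matrices):

```python
import numpy as np
from itertools import combinations

def k_rank(M, tol=1e-10):
    # Largest k such that EVERY set of k columns of M is linearly independent
    # (Definition 7).  Checked by brute force over column subsets.
    for size in range(M.shape[1], 0, -1):
        if all(np.linalg.matrix_rank(M[:, list(c)], tol=tol) == size
               for c in combinations(range(M.shape[1]), size)):
            return size
    return 0

def kruskal_sufficient(X, Y, Z, R):
    # Kruskal (1977): k_X + k_Y + k_Z >= 2(R + 1) suffices for uniqueness.
    return k_rank(X) + k_rank(Y) + k_rank(Z) >= 2 * (R + 1)

rng = np.random.default_rng(4)
R = 3
X = rng.normal(size=(5, R))   # generic random matrices have full k-rank = R
Y = rng.normal(size=(4, R))
Z = rng.normal(size=(3, R))
assert kruskal_sufficient(X, Y, Z, R)   # 3 + 3 + 3 = 9 >= 8
```

Note that k-rank can fall below ordinary rank: a matrix whose third column is the sum of the first two has rank 2 and k-rank 2, but adding a repeated column drops the k-rank to 1 while leaving the rank unchanged.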
Matrix Pencil Theory

To date, consolidation of the uniqueness results into a necessary and sufficient set has been hindered by the algebraic subtleties found in the common proof technique. Kronecker canonical forms of matrix pencils, however, may provide an untapped answer that avoids some of these obstacles. Theoretically, using Kronecker canonical forms of matrix pencils will result in a decomposition of the original triad into two parts, singular and regular. The regular part is of particular interest since it is unique and may contain the important structure (factors) common to both slabs. In the case where K = 2, so that the array consists of two matrix slabs M_1 and M_2, recall that the PARAFAC decomposition can be expressed as:

(5)  M_1 = Σ_{r=1}^R z_1r (x_r ⊗ y_r)  and  M_2 = Σ_{r=1}^R z_2r (x_r ⊗ y_r),

where the tensors vary in parallel proportion depending on the value of k. As mentioned above, the system of equations given in (5) is equivalent to the following matrix representation:

(6)  M_1 = X D_1 Y^t,  M_2 = X D_2 Y^t,

where

(7)  D_1 = diag(z_11, ..., z_1R),  D_2 = diag(z_21, ..., z_2R),
     X = [x_1 x_2 ... x_R] ∈ R^{I×R},  Y = [y_1 y_2 ... y_R] ∈ R^{J×R},
     Z = [z_11 ... z_1R; z_21 ... z_2R] ∈ R^{2×R}.

Suppose M_1 and M_2 are defined as above, with the restriction that I = J and det(M_1 + λM_2) ≢ 0.

Definition 8 (Gantmacher, 1959). A pencil of matrices M_1 + λM_2 is termed a regular pencil if (1) M_1 and M_2 are square matrices of the same order, and (2) det(M_1 + λM_2) is not identically equal to zero. In all other cases, the pencil of matrices is termed a singular pencil.

Consider the special case where M_2 is nonsingular. It is easily seen that, with this requirement, M_1 + λM_2 is a regular pencil. That is, since M_2 is nonsingular, M_1 + λM_2 can be written as M_2(M_2^{-1}M_1 + λI), and the determinant of this expression is given by:

det(M_2(M_2^{-1}M_1 + λI)) = det(M_2) det(M_2^{-1}M_1 + λI).

Because M_2 is nonsingular, det(M_2) ≠ 0. Also, det(M_2^{-1}M_1 + λI) is (up to sign) the characteristic polynomial of M_2^{-1}M_1, a monic polynomial of degree I in λ, which cannot be identically zero.
Hence, in the case of two square matrices where one of the matrices is nonsingular, the resulting pencil is regular. Therefore, if both M_2 and D_2 are nonsingular, the pencils M_1 + λM_2 and D_1 + λD_2 are regular. Simple substitution, using (6), gives

(8)  M_1 + λM_2 = X(D_1 + λD_2)Y^t.

Definition 9 (Gantmacher, 1959). Two pencils of matrices M_1 + λM_2 and D_1 + λD_2 of the same dimensions, connected by an equation of the form (8) in which X and Y are constant nonsingular matrices, are called strictly equivalent.

Theorem (Gantmacher, 1959). Two pencils of square matrices of the same order, M_1 + λM_2 and D_1 + λD_2, for which M_2 and D_2 are both nonsingular, are strictly equivalent if and only if the pencils have the same elementary divisors.

The elementary divisors of a matrix pencil P can be found by reducing P to a "quasi-diagonal matrix" consisting of polynomials of the form (λ − α)^p (Gantmacher, 1959). The polynomials with power greater than zero are the elementary divisors of the pencil. When a pencil is composed of diagonal matrices D_1 and D_2, the resulting elementary divisors are linear and are the entries of D_1 + λD_2.

It follows that (6) can hold when and only when M_1 + λM_2 and D_1 + λD_2 share the same elementary divisors. To see this, suppose first that the solution exists and is expressed in the form M_1 = X D_1 Y^t, M_2 = X D_2 Y^t. The two equations can be combined into equation (8) by simple substitution: M_1 + λM_2 = X(D_1 + λD_2)Y^t. By Definition 9, this implies that the pencils M_1 + λM_2 and D_1 + λD_2 are strictly equivalent. Since M_2 and D_2 are nonsingular square matrices, the resulting pencils are regular and, by the theorem from Gantmacher (1959), have the same elementary divisors. Now suppose that the two regular pencils M_1 + λM_2 and D_1 + λD_2 have the same elementary divisors; then the pencils are strictly equivalent (Gantmacher, 1959). By definition, whenever two pencils are strictly equivalent, they can be written as in equation (8): M_1 + λM_2 = X(D_1 + λD_2)Y^t for all λ ∈ C.
Setting λ = 0, the equation reduces to M_1 = X D_1 Y^t. Setting λ = 1 gives M_1 + M_2 = X(D_1 + D_2)Y^t, and subtracting the first equation yields M_2 = X D_2 Y^t. Therefore a solution does exist.

It should be noted that one must be careful about how regular pencils are defined. Earlier in the section a broader definition was used to define a regular pencil. In the case where M_1 and M_2 are square and det(M_1 + λM_2) ≢ 0 but M_2 is singular, Gantmacher's theorem is not valid. Consider the following example (Gantmacher, 1959). For square matrices A, B, A′, and B′ with det(A + λB) ≢ 0 and det(A′ + λB′) ≢ 0, let

A = [2 1 3; 3 2 5; 3 2 6],   B = [1 1 2; 1 1 2; 1 1 3],
A′ = [2 1 1; 1 2 1; 1 1 1],  B′ = [1 1 1; 1 1 1; 1 1 1].

It can be shown that the only elementary divisor of each of the pencils A + λB and A′ + λB′ is λ + 1. This would imply that the two pencils are strictly equivalent. However, the rank of B is 2 and the rank of B′ is 1; since strict equivalence would require B′ = PBQ for nonsingular P and Q, it cannot possibly be true that the pencils are equivalent. Therefore, the theorem for the equivalence of pencils does not hold under the broader definition of regular pencils. In order to salvage the theorem, it is necessary to introduce infinite elementary divisors. Initially, however, the focus of this work will remain on the situation where M_2 and D_2 are nonsingular, and the issue of infinite elementary divisors will be left to later.

REGULAR PENCILS AND ISSUES OF UNIQUENESS

Elementary divisors provide a platform for the discussion of the existence of solutions. Before the question of whether a solution is unique can be answered, the question of when solutions, and therefore alternative solutions, exist must be addressed. Previously, it was noted that sharing elementary divisors is a necessary and sufficient condition for the existence of solutions. However, elementary divisors offer a much richer bank of information in that they are fundamental in determining the Jordan canonical form of a pencil.
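Gantmacher's example admits a quick numerical check (a numpy sketch; only the verification strategy, sampling the determinant polynomial at a few points, is chosen here):

```python
import numpy as np

# Gantmacher's example: the pencils A + λB and A' + λB' share the single
# elementary divisor λ + 1, yet cannot be strictly equivalent because
# rank(B) = 2 while rank(B') = 1.
A  = np.array([[2., 1., 3.], [3., 2., 5.], [3., 2., 6.]])
B  = np.array([[1., 1., 2.], [1., 1., 2.], [1., 1., 3.]])
Ap = np.array([[2., 1., 1.], [1., 2., 1.], [1., 1., 1.]])
Bp = np.array([[1., 1., 1.], [1., 1., 1.], [1., 1., 1.]])

# det(A + λB) and det(A' + λB') both equal the polynomial λ + 1; a degree-3
# determinant polynomial is pinned down by agreement at four sample points.
for lam in (0.0, 1.0, -2.0, 3.5):
    assert np.isclose(np.linalg.det(A + lam * B), lam + 1.0)
    assert np.isclose(np.linalg.det(Ap + lam * Bp), lam + 1.0)

# ... but strict equivalence A' + λB' = P(A + λB)Q would force equal ranks.
assert np.linalg.matrix_rank(B) == 2
assert np.linalg.matrix_rank(Bp) == 1
```

Both pencils are regular under the broader definition (the determinant polynomial λ + 1 is not identically zero), which is exactly why the rank mismatch breaks the theorem in that setting.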
For an elementary divisor (λ − α)^p, the power p determines the size of the corresponding Jordan block, and the value α is the associated eigenvalue. Therefore, claiming that two pencils share the same elementary divisors is equivalent to stating that the two pencils share the same Jordan form. In terms of uniqueness, then, the question becomes: under what conditions will another pencil have the same Jordan form? This surely includes permutation and scaling, but does it include any other types of transformations? Thus, the elementary divisors enable the creation of an "if and only if" condition for when alternative solutions exist, and provide conditions, through the derivation of the Jordan form, for whether the solutions are essentially the same or different. Finally, the theory behind the sharing of elementary divisors and Jordan forms does not require the PARAFAC condition that the core matrices be diagonalizations of the rows of Z. Thus, Kronecker canonical forms for matrix pencils suggest that existence and uniqueness results are available for pencils composed of more complex D_1 and D_2 matrices.

Initial Result

The first result from this dissertation is a partial clarification of Harshman's famous 1972 uniqueness theorem, and is included as an example of how pencil theory is intended to be used to clarify existing results and produce new ones.

Theorem (Harshman, 1972). Suppose M_1 = X D_1 Y^t and M_2 = X D_2 Y^t, where X and Y are n × l matrices which are "nonhorizontal" (n ≥ l) and "basic" (of rank l), and D_1 and D_2 are nonsingular diagonal matrices such that D_p = D_1 D_2^{-1} has distinct diagonal elements. Suppose also that there exists some alternative representation of M_1 and M_2, say M_1 = G C̄_1 H^t and M_2 = G C̄_2 H^t. Then G = XΠΛ_1, H = YΠΛ_2, and C = ZΠΛ_3, where Π is a permutation matrix, each Λ_i is a diagonal matrix, and C̄_k is the diagonal matrix with the k-th row of C on its diagonal.

Harshman's theorem gives conditions under which an alternative solution will be a permuted and rescaled version of the original.
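That permutation and column scaling leave the pencil's eigenvalue ratios, and hence its linear elementary divisors, unchanged can be illustrated directly. This numpy sketch uses sizes and a seed chosen here for illustration:

```python
import numpy as np

# C = Z Π Λ3: columns of Z permuted (Π) and rescaled (Λ3).  The diagonal slab
# pencils built from C and from Z then have the same eigenvalue ratios, i.e.
# the same linear elementary divisors and the same Jordan form.
R = 4
rng = np.random.default_rng(7)
Z = rng.normal(size=(2, R))                # K = 2 slabs
perm = rng.permutation(R)                  # Π
scales = rng.uniform(0.5, 2.0, size=R)     # diagonal of Λ3
C = Z[:, perm] * scales                    # columnwise permutation, then scaling

# The ratios d_1r / d_2r are the eigenvalues of the diagonal pencil; the common
# column scale cancels in each ratio, and permutation only reorders the set.
ratios_Z = np.sort(Z[0] / Z[1])
ratios_C = np.sort(C[0] / C[1])
assert np.allclose(ratios_Z, ratios_C)
```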
It is important to note that ten Berge and Sidiropoulos (2002) argued that the conditions posed in Harshman's theorem are equivalent to Kruskal's k-rank condition when R = 2. In the theorem below, elementary divisors are used to show that permuted and rescaled versions of an existing solution constitute another solution.

Theorem 1. The two pencils D_1 + λD_2 and C̄_1 + λC̄_2, as defined above, have the same Jordan form.

Proof. It is necessary to show that the pencils D_1 + λD_2 and C̄_1 + λC̄_2 have the same elementary divisors. First, let C = ZΛ_3, where Λ_3 = diag(λ_31, ..., λ_3R). This means that the (k, r) entry of C is λ_3r z_kr. Let C̄_k represent the diagonal matrix that results when the k-th row of C is placed on the diagonal,

C̄_k = diag(λ_31 z_k1, ..., λ_3R z_kR) = D_k Λ_3.

Because C̄_2, like D_2, is diagonal and nonsingular, the elementary divisor problem reduces to finding the diagonal elements of C̄_1 C̄_2^{-1}. However, C̄_1 C̄_2^{-1} = Λ_3 D_1 D_2^{-1} Λ_3^{-1} = D_1 D_2^{-1}, since diagonal matrices commute. Therefore the elements on the diagonal of C̄_1 C̄_2^{-1} are simply the elements on the diagonal of D_1 D_2^{-1}, and consequently the elementary divisors must be the same. Now consider the case C = ZΠΛ_3, in which the columns are also permuted. As mentioned above, permuting the columns has no effect on the set of elementary divisors, so in this case the elementary divisors are again equal. Hence, permutation and scaling have no effect on the elementary divisors, and the Jordan forms are the same.

PROPOSED PLAN

Arising from the above discussion are several immediate problems that will form the first part of this dissertation. These are briefly detailed below. Where the dissertation goes, if and when these problems are successfully resolved, will have to be decided later.
It is fair to point out that resolving all three of the problems mentioned would be a significant contribution to the theory of I × J × 2 trilinear models, paving the way for research innovations in the area of uniqueness for I × J × K trilinear models (PARAFAC and non-parallel factor models) as well as general multilinear models.

Harshman's Theorem (1972)

This theorem was stated at the end of the previous section. The first goal of this dissertation is to state and prove Harshman's theorem from the perspective of elementary divisors. The expectation is that this new perspective will allow necessary and sufficient conditions to be developed for uniqueness up to permutation and scale.

Kruskal's Conditions

The conditions on k-rank are by now classic in the trilinear literature. As mentioned, these conditions are not necessary and sufficient for an I × J × K array of general rank R, in spite of some early conjectures to this effect in the literature. The second goal of this dissertation is to understand k-rank for I × J × 2 arrays from the perspective of matrix pencils and elementary divisors. It should then be clear why the conditions are necessary for R = 2 and R = 3 but not for R = 4 and above; that is, it should be clear why ten Berge and Sidiropoulos (2002) were able to show that Kruskal's conditions are not necessary for R = 4. The hope is that elementary divisors and the theory of equivalent matrix pencils will be just what is needed to tighten Kruskal's conditions into a necessary and sufficient set.

Mixed Models

Recall that in so-called "mixed" trilinear models the influence of a particular factor will not merely change proportionally as the variable denoting the third way varies, but will depend on the influences of factors from each of the other two modes. Very little is known about the uniqueness of structures that result from these models. This is unfortunate, since these more general models are in many ways much more useful, simply owing to their lack of constraint, than PARAFAC models.
In fact, mixed models are still the models of choice for psychometricians, in spite of the general belief that there are no rational claims to uniqueness. There is an excellent chance that matrix pencils will allow for the development of significant uniqueness results for mixed decompositions of the I × J × 2 array. These results would be shown to incorporate the very few, very specific uniqueness results that are available for mixed decompositions, while at the same time producing necessary and sufficient conditions for when mixed arrays admit unique solutions, which should, in turn, become a mechanism for generating such models.

REFERENCES

Appellof, C.J., and Davidson, E.R. 1981. Strategies for analyzing data from video fluorometric monitoring of liquid chromatographic effluents. Analytical Chemistry 53: 2053-2056.

Bro, R. 1997. PARAFAC: Tutorial & applications. Chemometrics and Intelligent Laboratory Systems 38: 149-171.

Bro, R. 1999. Exploratory study of sugar production using fluorescence spectroscopy and multiway analysis. Chemometrics and Intelligent Laboratory Systems 46: 133-147.

Bro, R., and De Jong, S. 1997. A fast non-negativity-constrained least squares algorithm. Journal of Chemometrics 11: 393-401.

Bro, R., and Andersson, C.A. 1998. Improving the speed of multiway algorithms. Part II: Compression. Chemometrics and Intelligent Laboratory Systems 42: 105-113.

Bro, R., and Heimdal, H. 1996. Enzymatic browning of vegetables. Calibration and analysis of variance by multiway methods. Chemometrics and Intelligent Laboratory Systems 34: 85-102.

Burdick, D. 1995. An introduction to tensor products with applications to multiway data analysis. Chemometrics and Intelligent Laboratory Systems 28: 229-237.

Carroll, J.D., and Chang, J.J. 1970. Analysis of individual differences in multidimensional scaling via an N-way generalization of "Eckart-Young" decomposition. Psychometrika 35: 283-319.

Fisher, R.A., and Mackenzie, W.A. 1923. Studies in crop variation II.
The manurial treatment of different potato varieties. Journal of Agricultural Science 13: 311-320.

Gantmacher, F.R. 1959. The Theory of Matrices, Vols. I and II. Chelsea Publishing Company, New York.

Harshman, R.A. 1970. Foundations of the PARAFAC procedure: Models and conditions for an “explanatory” multi-modal factor analysis. UCLA Working Papers in Phonetics 16: 1-84.

Harshman, R.A. 1972. Determination and proof of minimum uniqueness conditions for PARAFAC1. UCLA Working Papers in Phonetics 22: 111-117.

Harshman, R.A., and Lundy, M.E. 1984a. The PARAFAC model for three-way factor analysis and multidimensional scaling. In Research Methods for Multimode Data Analysis (H.G. Law, C.W. Snyder Jr., J.A. Hattie and R.P. McDonald, eds.), Praeger, New York.

Harshman, R.A., and Lundy, M.E. 1984b. Data preprocessing and the extended PARAFAC model. In Research Methods for Multimode Data Analysis (H.G. Law, C.W. Snyder Jr., J.A. Hattie and R.P. McDonald, eds.), Praeger, New York.

Ho, C.N., Christian, G.D., and Davidson, E.R. 1978. Application of the method of rank annihilation to quantitative analyses of multicomponent fluorescence data from the video fluorometer. Analytical Chemistry 50: 1108-1113.

Kiers, H.A.L. 2000. Towards a standardized notation and terminology in multiway analysis. Journal of Chemometrics 14: 105-122.

Kroonenberg, P.M. 1983. Three-Mode Principal Component Analysis. DSWO Press, Leiden, The Netherlands.

Kroonenberg, P.M. 1989. Singular value decompositions of interactions in three-way contingency tables. In Multiway Data Analysis (R. Coppi and S. Bolasco, eds.), North-Holland.

Kruskal, J.B. 1977. Three-way arrays: Rank and uniqueness of trilinear decompositions with application to arithmetic complexity and statistics. Linear Algebra and its Applications 18: 95-138.

Kruskal, J.B. 1984. Multilinear methods. In Research Methods for Multimode Data Analysis (H.G. Law, C.W. Snyder Jr., J.A. Hattie and R.P. McDonald, eds.), Praeger, New York.

Kruskal, J.B. 1989.
Rank decomposition and uniqueness for 3-way and N-way arrays. In Multiway Data Analysis (R. Coppi and S. Bolasco, eds.), North-Holland.

Leurgans, S., and Ross, R. 1992. Multilinear models: Applications in spectroscopy. Statistical Science 7(3): 289-319.

Leurgans, S., Ross, R., and Abel, R. 1993. A decomposition for three-way arrays. SIAM Journal on Matrix Analysis and Applications 14: 1064-1083.

Martin, A., Wiggs, C.L., Ungerleider, L.G., and Haxby, J.V. 1996. Neural correlates of category-specific knowledge. Nature 379: 649-652.

Mitchell, B., and Burdick, D. 1994. Slowly converging PARAFAC sequences: swamps and two-factor degeneracies. Journal of Chemometrics 8: 155-168.

Rayens, W.S., and Mitchell, B. 1997. Two-factor degeneracies and a stabilization of PARAFAC. Chemometrics and Intelligent Laboratory Systems 38: 173-181.

Sanchez, E., and Kowalski, B.R. 1988. Tensorial calibration: II. Second-order calibration. Journal of Chemometrics 2: 265-280.

Sanchez, E., and Kowalski, B.R. 1990. Tensorial resolution: A direct trilinear decomposition. Journal of Chemometrics 4: 29-45.

Sidiropoulos, N.D., and Bro, R. 2000. On the uniqueness of multilinear decomposition of N-way arrays. Journal of Chemometrics 14: 229-239.

Smilde, A.K., and Doornbos, D.A. 1992. Simple validatory tools for judging the predictive performance of PARAFAC and three-way PLS. Journal of Chemometrics 6: 11-28.

Smilde, A.K. 1992. Three-way analysis: Problems and prospects. Chemometrics and Intelligent Laboratory Systems 15: 143-157.

ten Berge, J.M., and Sidiropoulos, N.D. 2002. On uniqueness in CANDECOMP/PARAFAC. Psychometrika 67: 399-409.

Tucker, L.R. 1966. Some mathematical notes on three-mode factor analysis. Psychometrika 31: 279-311.

Wold, S., Geladi, P., Esbensen, K., and Ohman, J. 1987. Multi-way principal components- and PLS-analysis. Journal of Chemometrics 1: 41-56.