Chemometrics and Intelligent Laboratory Systems 58 (2001) 261–273
www.elsevier.com/locate/chemometrics

Multiple factor analysis combined with PLS path modelling. Application to the analysis of relationships between physicochemical variables, sensory profiles and hedonic judgements

Jérôme Pagès a, Michel Tenenhaus b,*
a ENSA–INSFA, Rennes, France
b HEC School of Management, Jouy-en-Josas, France
* Corresponding author. Tel.: +33-1-3967-7249; fax: +33-1-3967-7109. E-mail address: [email protected] (M. Tenenhaus).
© 2001 Elsevier Science B.V. All rights reserved. PII: S0169-7439(01)00165-4

Abstract

Multiple Factor Analysis (MFA) highlights the structures common to a set of J groups (or blocks) of variables observed for the same individuals. PLS path modelling allows a search for latent variables, each summarising as far as possible a one-dimensional block of manifest variables while taking account of causal links between the blocks. These two methods can be combined: MFA, as an exploratory analysis, helps to define blocks that are both one-dimensional and as well correlated as possible, on which PLS path modelling is then performed. In this paper, we present MFA in detail and PLS path modelling more briefly. We also mention some links between MFA, PLS path modelling and PLS regression. A detailed presentation of a sensory analysis example illustrates the proposed methodology.

Keywords: Generalised canonical analysis; Multiple factor analysis; Hierarchical factor analysis; PLS path modelling; PLS regression; Structural equation modelling

1. Introduction

Multiple factor analysis (MFA) was proposed by Escofier and Pagès [3–5] to highlight the structures common to a set of J groups (or blocks) of variables observed for the same individuals. This method allows a graphical display of these common structures with respect to variables and individuals.

When each data table represents a set of manifest (or observable) variables relating to one (unobservable) latent variable and when there are explicit causal relationships between the latent variables, it is useful to apply the multi-block Partial Least Squares (PLS) path modelling approach proposed by Wold [21–23]. This approach has been adopted and extended by Lohmöller [9,10] at both the theoretical and the software level. In France, PLS path modelling was studied closely by Valette-Florence [17–19] and Tenenhaus [15,16].

PLS path modelling can still be adopted when there are no causal relationships between the blocks. Wold [22] proposed forming a supplementary block, by juxtaposing all the blocks, and connecting each initial block to this supplementary block. PLS path modelling then makes it possible to recover various methods such as the generalised canonical correlation analysis of Horst [8], that of Carroll [1], principal components analysis and multiple factor analysis. The links between PLS path modelling and other methods dealing with multiple tables are presented in Guinot et al. [7].

PLS regression [24] can be considered as a development of PLS path modelling specifically dedicated to relating a set of responses Y to a set of predictors X. The link between PLS regression and PLS path modelling is studied in Tenenhaus [16]. A detailed comparison between PLS path modelling and MFA can be found in Pagès and Tenenhaus [13].

In the present paper, we describe the application of a methodology combining MFA and PLS path modelling to a significant problem for the food industry: predicting hedonic judgements on the basis of sensory and physicochemical characteristics of a set of products.

2. The orange juice example

2.1. Data

Six pure orange juices (P1 to P6) were selected from the main brands on the French market.
These juices were pasteurised in two ways: three of them must be stored in refrigerated conditions, while the others can be stored at room temperature. The six orange juices are: Pampryl at room temperature (P1), Tropicana at room temperature (P2), refrigerated Fruivita (P3), Joker at room temperature (P4), refrigerated Tropicana (P5) and refrigerated Pampryl (P6).

Ninety-six students of food science, both trained to evaluate foodstuffs and consumers of orange juice, described each of these six products on the basis of seven attributes: intensity and typical nature of its smell, intensity of the taste, pulp content, sweetness, sourness and bitterness. Moreover, they expressed an overall hedonic judgement. By using this particular panel, it was possible to collect descriptive and hedonic data from the same judges. This characteristic of the data has not been exploited here, in order to simulate the usual application context, where the sensory properties are assessed by trained panellists while the hedonic judgements are given by naive consumers (in our data, the students play both roles). This confers a wider scope on the methodology presented. The serving order design was a juxtaposition of Latin squares balanced for carry-over effects [12].

In addition to the sensory investigation, the following chemical measurements were carried out: pH, citric acid, overall acidity, saccharose, fructose, glucose, vitamin C and sweetening power (defined as saccharose + 0.6 × glucose + 1.4 × fructose).

The data are gathered in a table using the format shown in Table 1.

Table 1. Source data. For product $i$: $x_{ik}$ (panel average of the sensory attribute $k$), $y_{il}$ (instrumental measurement $l$), $z_{ij}$ (hedonic score of judge $j$).

2.2. Problems

The analysis of relationships between physicochemical variables, sensory profiles (according to a trained panel) and hedonic judgements (by consumers) is a classic problem for the food industry. The objective is to predict hedonic judgements on the basis of the sensory and physicochemical characteristics of the products. This problem, usually called preference mapping, has been studied extensively (see, for example, the synthesis given by Greenhoff and MacFie [6]). We show here how the methodology combining MFA and PLS path modelling brings a new point of view to this classical problem.

Hedonic judgements can be considered in various ways. When all the hedonic judgements correlate positively, we have a consensus and it is legitimate to summarise them by a compromise judgement, the average judgement being the simplest form. Otherwise, in the absence of a consensus, a natural approach is to divide the consumers into homogeneous clusters (from the point of view of their judgements) and to relate each cluster, which can now legitimately be summarised by a compromise, to the sensory and physicochemical data.

A priori, one might think of dividing consumers on the basis of hedonic judgements alone. But, in this case, nothing ensures that the clusters obtained will be predictable on the basis of variables characterising the products. Hence the idea of building clusters of consumers using the hedonic data, of course, but also taking other data into account, so that, from one cluster to another, the differences in liking correspond to physicochemical and/or sensory differences. Poulsen et al. [14] and Courcoux [2] have already implemented a similar idea.

To summarise, the aim of the analysis is to obtain clusters of consumers with similar patterns of hedonic assessments, but also whose averaged judgements are well predicted by the sensory and physicochemical properties of the products.
From the point of view of prediction, the difficulty lies in defining a summary for each cluster; a one-dimensional summary is reasonable because each cluster is homogeneous. The simplest approach is to give the same weight to each individual, in which case the summary is the usual average. But, given the wish for predictability, it is also possible to give preference to the most predictable individuals when constructing the summary. This is what we do in this paper.

Summary

- The initial assumption is that hedonic judgements depend on the physicochemical and sensory characteristics of the products. It would have been possible to use the sensory data alone. In fact, the presence of physicochemical data can be seen as contributing to the solidity of product characterisation: (1) the presence of physicochemical variables relating to the sensory variables reinforces the latter, and (2) the presence of physicochemical variables not dependent on the sensory variables protects against the possible shortcomings of the attributes list.
- The objective is to obtain clusters of consumers which are homogeneous in their hedonic judgements, these clusters being predictable on the basis of product characterisation.

2.3. Methodology

2.3.1. Descriptive approach of the three data sets

First, we apply MFA to Table 1, each of the three groups playing an active role. This analysis makes it possible to highlight the structures common to the three groups of variables. Taking into account the limited number of products, special attention must be paid to the percentages of variability expressed by the common structures as well as to their interpretability. In particular, relationships between physicochemical and sensory variables will be analysed.

2.3.2. Construction of consumer clusters

MFA provides a graphical display of the consumers (here the ninety-six students), which shows the main variability of these consumers and also of the physicochemical and sensory variables. This graphical display is thus suitable for building clusters of consumers suited to our problem. Several approaches can be considered, which depend on:

- the number of clusters which it is reasonable to manage, taking into account the number of consumers (much smaller in sensory investigations than in public opinion polls);
- the structure of the data: to obtain clusters of hedonic judgements relating to variables of the other groups, we will use the common factors of MFA, which are related to the hedonic judgements.

To ensure the predictability of the clusters, we propose the following process:

- each hedonic judgement is initially associated with the common factor with which it is the most correlated;
- each common factor defines two clusters of related hedonic judgements: those which are positively correlated with it, and the others.

Fig. 1 illustrates this process in the case of two common factors, that is to say, when the first two common factors provided by MFA are closely related to each block of variables.

Fig. 1. Definition of clusters of hedonic judgements based on two common factors of MFA ("Axis 1" and "Axis 2"). Each judgement is a vector inside the correlation circle. The two bisecting lines (dotted lines) define four zones.

2.3.3. Use of PLS path modelling

In PLS path modelling, it is advantageous that each group of variables is essentially one-dimensional, because each group should reflect one latent variable and these latent variables can be related by regression equations. To obtain this property, we use the results of MFA.
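Both the cluster construction of Section 2.3.2 and the search for one-dimensional blocks rest on the same elementary operation: attaching each variable to the common factor with which it is most correlated, and recording the sign of that correlation. The sketch below is a minimal illustration of that operation, not the authors' software. The function name `assign_to_factors` is hypothetical, the data are synthetic, and "most correlated" is assumed to mean the largest absolute correlation, which is how the four zones of Fig. 1 are read here.

```python
import numpy as np

def assign_to_factors(variables, factors):
    """Attach each variable to its most correlated common factor.

    variables: array (n_products, n_variables), e.g. one column per judge.
    factors:   array (n_products, n_factors), e.g. the first two MFA factors.
    Returns (best, positive): the index of the retained factor for each
    variable, and whether its correlation with that factor is positive.
    """
    V = np.asarray(variables, dtype=float)
    F = np.asarray(factors, dtype=float)
    n = V.shape[0]
    # Standardise columns so that cross-products divided by n are correlations.
    Vs = (V - V.mean(axis=0)) / V.std(axis=0)
    Fs = (F - F.mean(axis=0)) / F.std(axis=0)
    cor = Vs.T @ Fs / n                      # shape (n_variables, n_factors)
    # Assumption: "most correlated" means largest absolute correlation.
    best = np.argmax(np.abs(cor), axis=1)
    positive = cor[np.arange(cor.shape[0]), best] > 0
    return best, positive

# Synthetic illustration (not the orange juice data): 6 products, 2 factors.
f1 = np.array([1.0, -1.0, 1.0, -1.0, 1.0, -1.0])
f2 = np.array([1.0, 1.0, -1.0, -1.0, 0.0, 0.0])
factors = np.column_stack([f1, f2])
judges = np.column_stack([f1, -f1, f2, -f2])  # four idealised judges
best, positive = assign_to_factors(judges, factors)
```

With two factors, the pair `(best, positive)` takes four possible values, one per zone of Fig. 1. Applied to physicochemical or sensory variables, only `best` would be needed.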
With each common factor of MFA, we associate four groups of variables which are more correlated to this factor than to the others: the most positively correlated hedonic judgements, the most negatively correlated hedonic judgements, the most correlated physicochemical variables, and the most correlated sensory variables. For each common factor $s$, we prepare the arrow diagram shown in Fig. 2.

Fig. 2. Causal model for hedonic judgements.

It is possible to establish a model for each cluster of hedonic judgements, which would improve the quality of fit of each one. The choice of simultaneously explaining two clusters which have opposite judgements is consistent with the process adopted in building these groups and makes interpretation easier. This approach does not take into account non-linear relationships between hedonic data and sensory or physicochemical data.

2.3.4. Standardisation of the variables

The physicochemical data, being expressed in different units, are systematically standardised. The descriptive sensory data are also standardised. We have also checked that each attribute presented a significant product effect in the analysis of variance performed on individual data.

The hedonic judgements may or may not be centred or standardised. This depends on the meaning we give to the average and to the standard deviation of one consumer. The majority of users of sensory analysis consider that the hedonic judgements put forward by a consumer for a series of products are primarily relative; thus, they measure the relationship between two consumers by the coefficient of correlation between their judgements. We adopt this point of view by standardising these variables. The clusters of consumers are thus made up on this basis.

3. Multiple factor analysis

3.1. Data and notations

The present data set has a classic structure: several groups of variables are measured on the same set of individuals (in the statistical sense). These data can be presented in a single table (cf. Fig. 3) using a sub-table structure.

Fig. 3. Data table.

We denote by:

- $X$ the complete data table;
- $I$ the set of individuals;
- $K$ the set of variables (including all groups) or the set of indices for the variables;
- $J$ the set of sub-tables (or groups of variables);
- $K_j$ the set of variables in group $j$ or the set of indices for the variables in group $j$;
- $X_j$ the sub-table associated with group $j$.

Moreover, the symbols $I$, $J$, $K$ and $K_j$ designate both the set and its cardinality. A variable of group $K_j$ is denoted by $v_k$ ($k \in K_j$).

To simplify the presentation, without any loss of generality, the variables are assumed to be standardised and to have the same a priori weights. In the same way, the individuals are assumed to have the same weight $1/I$.

3.2. Problems

Many problems are associated with this kind of data. They include many aspects that all derive, more or less directly, from the following question: which structures are common to the various groups?

To answer it, we summarise each group using linear combinations of its variables. These linear combinations have various names: canonical variables, components, latent variables, factors, dimensions or constructs. A common structure is highlighted by a $J$-tuple of canonical variables (one per group) that are correlated. The highlighting of several common structures involves searching in each group $j$ for a succession of $S$ linear combinations of the variables of this group, $\{F_s^j;\ s = 1, \ldots, S\}$, so that, between the groups, the combinations having the same rank $s$, $\{F_s^j;\ j = 1, \ldots, J\}$, are as closely correlated as possible. We define $F_s^j$ as a linear combination of the input variables $v_k$ of group $K_j$ by writing:

$$F_s^j = \sum_{k \in K_j} a_{ks}\, v_k .$$

Various methods deal with these problems, each one from a particular point of view: Carroll's [1] generalised canonical correlation analysis, Horst's [8] generalised canonical correlation analysis, hierarchical factor analysis [11], PLS path modelling and finally MFA.

MFA [3–5] is a method deriving from principal component analysis (search for directions of maximum inertia) and from canonical correlation analysis (search for common factors). Here we use this second point of view to present MFA.

3.3. A priori balancing of the input variable blocks

In MFA, each input variable $v_k$, $k = 1, \ldots, K_j$, of the group $K_j$ has the weight $m_k = 1/\lambda_1^j$, where $\lambda_1^j$ is the first eigenvalue of the separate principal component analysis of group $K_j$. This is equivalent to replacing each data table $X_j$ by the table:

$$\frac{1}{\sqrt{\lambda_1^j}}\, X_j .$$

Thus, if we perform a separate principal component analysis for group $K_j$ with these weights $m_k$, the first eigenvalue is 1. In other words, with this weighting scheme, for each group $j$, the maximum axial inertia of the cloud of individuals is 1. This weighting induces the balancing between the various blocks $j$ ($j = 1, \ldots, J$). It leads, for example, to a down-weighting of the variables of a block in which 25 variables were used to express one real dimension, as compared to another block in which only 10 variables were used for one dimension.

Notations: $M_j$ is the diagonal matrix of the weights $1/\lambda_1^j$ for the variables of group $K_j$, and $M$ the diagonal matrix of the weights of all variables (including all groups).

3.4. Measurement of the relationship between a variable z and a group K_j

In MFA, the measurement of the relationship between any variable $z$ and a group $K_j$ is defined in the following way when the variables $v_k$ are standardised:

$$L_g(z, K_j) = \sum_{k \in K_j} m_k \operatorname{cor}^2(z, v_k) = \frac{1}{\lambda_1^j} \sum_{k \in K_j} \operatorname{cor}^2(z, v_k) .$$

This measurement is related to the redundancy index [20], which, with our notations, is written as:

$$R_d(z, K_j) = \frac{1}{K_j} \sum_{k \in K_j} \operatorname{cor}^2(z, v_k) .$$

These two measurements vary between 0 and 1. They are zero when the variable $z$ does not correlate with any of the variables of group $K_j$. The value of 1 corresponds to a maximum for the two measurements, but this maximum does not have the same meaning for $R_d$ and $L_g$. For $R_d$,

$$R_d(z, K_j) = 1 \iff \operatorname{cor}^2(z, v_k) = 1 \ \text{for all } k \in K_j .$$

This case, where all the variables of group $K_j$ correlate perfectly, does not correspond, of course, to any real situation. For $L_g$,

$$L_g(z, K_j) = 1 \iff z \ \text{is the first principal component of } K_j .$$

We find here the basic idea of PCA, according to which the first principal component is a synthetic variable related as far as possible to the initial variables.

3.5. Identifying general variables

Like Carroll's generalised canonical correlation analysis, MFA provides a series of latent "general variables" $z_1, z_2, \ldots$ related as far as possible to the various groups of input variables $K_j$, $j = 1, \ldots, J$. It is natural to measure the relationship between a variable $z$ and the set of groups $K_j$ by:

$$\text{Relationship}\Big(z, \bigcup_j K_j\Big) = \sum_j L_g(z, K_j) .$$

In MFA, the general variable $z_s$ is thus defined by:

$$\sum_j L_g(z_s, K_j) \ \text{maximum},$$

with the constraints $\operatorname{Var}[z_s] = 1$ and $\operatorname{cor}(z_s, z_t) = 0$ for any $t < s$.

By expressing

$$L_g(z_s, K_j) = \frac{1}{I^2}\, z_s' X_j M_j X_j' z_s ,$$

the quantity to be maximised turns out to be:

$$\sum_j L_g(z_s, K_j) = \frac{1}{I^2} \sum_j z_s' X_j M_j X_j' z_s = \frac{1}{I^2}\, z_s' X M X' z_s .$$

From this equation, we deduce that the variables $z_s$ are the standardised principal components of the complete table $X$, the variables $v_k$, $k \in K_j$, $j = 1, \ldots, J$, being weighted according to the diagonal matrix $M$.
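The construction of Sections 3.3 to 3.5 can be sketched numerically. The code below is a minimal NumPy illustration under stated assumptions, not the AFMULT or any other published implementation: the function names (`standardise`, `mfa_general_variables`, `lg`) are hypothetical, the data are random and serve only to exercise the formulas, and population variances (division by $I$) are used throughout so that cross-products divided by $I$ are correlations.

```python
import numpy as np

def standardise(X):
    """Centre each column and scale it to unit population variance."""
    X = np.asarray(X, dtype=float)
    return (X - X.mean(axis=0)) / X.std(axis=0)

def mfa_general_variables(blocks, n_components=2):
    """General variables z_s as standardised components of the weighted PCA.

    blocks: list of arrays, each (I, K_j), all describing the same I individuals.
    Returns Z (I, n_components), the eigenvalues lambda_s, the standardised
    blocks X_j and the first eigenvalues lambda_1^j of the separate PCAs.
    """
    I = blocks[0].shape[0]
    Xs = [standardise(B) for B in blocks]
    # First eigenvalue of each separate PCA (eigvalsh sorts in ascending order).
    lam1 = [np.linalg.eigvalsh(Xj.T @ Xj / I)[-1] for Xj in Xs]
    # Dividing X_j by sqrt(lambda_1^j) is equivalent to the weights m_k.
    Xw = np.hstack([Xj / np.sqrt(l) for Xj, l in zip(Xs, lam1)])
    # Eigen-decomposition of (1/I) X M X'.
    evals, evecs = np.linalg.eigh(Xw @ Xw.T / I)
    order = np.argsort(evals)[::-1][:n_components]
    Z = evecs[:, order] * np.sqrt(I)         # rescaled so that Var[z_s] = 1
    return Z, evals[order], Xs, lam1

def lg(z, Xj, lam1_j):
    """L_g(z, K_j) = (1 / lambda_1^j) * sum_k cor^2(z, v_k), z standardised."""
    cors = Xj.T @ z / len(z)
    return float(np.sum(cors ** 2) / lam1_j)

# Random data, used only to check the identities of Section 3.5.
rng = np.random.default_rng(0)
blocks = [rng.normal(size=(20, 4)),
          rng.normal(size=(20, 3)),
          rng.normal(size=(20, 10))]
Z, lam, Xs, lam1 = mfa_general_variables(blocks)
lg_first = [lg(Z[:, 0], Xj, l) for Xj, l in zip(Xs, lam1)]
```

In this sketch, `sum(lg_first)` coincides with `lam[0]` up to rounding, which is the maximised criterion $\sum_j L_g(z_1, K_j)$, and each element of `lg_first` lies between 0 and 1. Working with the $I \times I$ matrix $(1/I)\,XMX'$ is a convenience when there are few individuals and many variables, as in the orange juice example (6 products, more than 100 variables).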
Finally, MFA can be viewed both as a particular canonical correlation analysis and as a weighted PCA of the complete data matrix (with the weights $1/\lambda_1^j$). The associated principal components $F_s$ are obtained by multiplying the general variable $z_s$ by the square root of the eigenvalue $\lambda_s$:

$$F_s = z_s \sqrt{\lambda_s} .$$

This process presents two features in comparison with Carroll's generalised canonical correlation analysis:

- it leads to general variables related to the initial variables while ensuring a balance between the groups of variables due to the weights $m_k$;
- deriving from a weighted PCA, the general variables have all the properties of principal components.

3.6. Identifying canonical variables

We associate with each general variable $z_s$ a canonical variable $F_s^j$ in each group $j$. In MFA, this variable is defined in the following way. We deduce from:

$$z_s = \frac{1}{\lambda_s}\,\frac{1}{I}\, X M X' z_s$$

the equation:

$$F_s = \frac{1}{\lambda_s}\,\frac{1}{I} \sum_{j=1}^{J} \frac{1}{\lambda_1^j}\, X_j X_j' F_s .$$

Each canonical variable $F_s^j$ is then defined as the fragment of $F_s$ corresponding to the group $j$:

$$F_s^j = \frac{1}{\lambda_s \lambda_1^j}\,\frac{1}{I}\, X_j X_j' F_s ,$$

so that the following decomposition of $F_s$ is obtained:

$$F_s = \sum_{j=1}^{J} F_s^j .$$

Up to a multiplication coefficient, we find that the canonical variable $F_s^j$ coincides with the first PLS component in the PLS regression between the principal component $F_s$ and the data table $X_j$. This convergence between the two approaches reinforces them.

This process presents two features in comparison with Carroll's method:

- here again, due to the properties of PLS regression, it leads to canonical variables that are more closely related to the initial variables;
- each $F_s^j$ provides a representation of the individuals, which can be superimposed on that of the PCA ($F_s$), having the two properties described below.

Property 1: up to the coefficient $1/J$, the value $F_s(i)$ of the principal component $F_s$ for the individual $i$ is the mean of the values $F_s^j(i)$ of the canonical variables $F_s^j$ for the individual $i$. In practice, the canonical variables $F_s^j$ are dilated by the coefficient $J$, so that $F_s(i)$ is an exact mean. That is to say,

$$F_s(i) = \frac{1}{J} \sum_{j} J\, F_s^j(i) .$$

In MFA terminology, we say that the global image (i.e. from the point of view of the set of groups) of an individual is at the centre of gravity of its partial images (i.e. from the point of view of the various groups).

Property 2: from the definition of the canonical variable, and because $(1/I)\, X_j' F_s$ is the vector of the covariances $\sqrt{\lambda_s}\operatorname{cor}(v_k, F_s)$, we obtain:

$$F_s^j(i) = \frac{1}{\sqrt{\lambda_s}\,\lambda_1^j} \sum_{k \in K_j} \operatorname{cor}(v_k, F_s)\, v_k(i) .$$

Thus, the partial individual $i^j$ (individual $i$ described by the variables $v_k$ of block $j$) is attracted by the variables $v_k$ for which $\operatorname{cor}(v_k, F_s)\, v_k(i)$ has a high positive value and, conversely, repelled by those for which it has a strongly negative value.

3.7. Link between MFA and the PLS path modelling

The components of MFA can be obtained by performing a PLS path modelling according to the arrow diagram of Fig. 4.

Fig. 4. MFA arrow diagram.

It can be shown that, by using PLS path modelling with the options Mode A for the outer estimate of the latent variables and the path weighting scheme for the inner estimate of the latent variables, we obtain the general component $z_1$ and the standardised canonical components $F_1^j$ ([16], section 4.6 and [7]). The next component, $z_2$, and the standardised canonical components $F_2^j$ are obtained by replacing table $XM^{1/2}$ in Fig. 4 by the residuals of the regression of $XM^{1/2}$ on $z_1$. More generally, we obtain the component $z_s$ and the standardised canonical components $F_s^j$ by replacing table $XM^{1/2}$ in Fig. 4 by the residuals of the regression of $XM^{1/2}$ on the general components $z_1, \ldots, z_{s-1}$.

4. Application to the orange juice example

4.1. Results from separate analyses

Some results from separate analyses of the physicochemical measurements and sensory description groups are presented in Table 2. They highlight for each group a main direction of inertia, though these groups cannot be considered as one-dimensional. The hedonic judgements group is clearly multidimensional, although its first factor stands out clearly.

Table 2. Eigenvalues (= inertia) associated with separate PCA

| Axis | Physico-chemistry: eigenvalue | (%) | Sensory attributes: eigenvalue | (%) | Hedonic judgements: eigenvalue | (%) |
|---|---|---|---|---|---|---|
| 1 | 6.2135 | 69.04 | 4.7437 | 67.77 | 34.0281 | 35.45 |
| 2 | 1.4102 | 15.67 | 1.3333 | 19.05 | 19.3692 | 20.18 |
| 3 | 1.0457 | 11.62 | 0.8198 | 11.71 | 15.8922 | 16.55 |
| 4 | 0.3173 | 3.53 | 0.0840 | 1.20 | 13.8795 | 14.46 |
| 5 | 0.0133 | 0.15 | 0.0192 | 0.27 | 12.8311 | 13.37 |

The principal components (of same rank) of the three separate PCAs are closely correlated (Table 3). This correlation is remarkable in the case of hedonic and sensory data (−0.94) and is not usual in practice, even when the number of products is small.

Table 3. Correlations between separate PCA factors

| | Sensory attributes F1 | Sensory attributes F2 | Hedonic judgements F1 | Hedonic judgements F2 |
|---|---|---|---|---|
| Physico-chemistry F1 | −0.78 | 0.08 | 0.74 | −0.31 |
| Physico-chemistry F2 | −0.25 | −0.74 | 0.35 | 0.86 |
| Sensory attributes F1 | | | −0.94 | 0.09 |
| Sensory attributes F2 | | | −0.01 | −0.92 |

The marked differences between the first eigenvalues of the separate analyses (6.21, 4.74 and 34.03) mean that it is essential to weight the groups within MFA.

4.2. Results from MFA

4.2.1. Global indicators

The correlations in Table 4 show that the first two factors of MFA correspond to structures common to the three groups ($F_1$ and $F_2$ are highly correlated with the corresponding canonical variables of each group, $F_1^1, F_1^2, F_1^3$ and $F_2^1, F_2^2, F_2^3$). The analysis that follows will be based on them. Moreover, according to the $L_g$ measurements, the first axis corresponds to a direction of very significant inertia for each group.

Table 4. Correlations and $L_g$ measurements of relationship between the canonical variables of each group and the general variable of the same rank

| | Correlation F1 | Correlation F2 | Correlation F3 | $L_g$ F1 | $L_g$ F2 | $L_g$ F3 |
|---|---|---|---|---|---|---|
| Physico-chemistry | 0.93 | 0.82 | 0.62 | 0.84 | 0.24 | 0.22 |
| Sensory attributes | 0.97 | 0.89 | 0.41 | 0.93 | 0.23 | 0.07 |
| Hedonic judgements | 0.99 | 0.99 | 0.96 | 0.94 | 0.58 | 0.49 |

4.2.2. Representation of individuals (= products) and variables

This MFA builds a product space starting from factors common to the sensory, instrumental and hedonic data, in which the influences of these three groups of variables are a priori balanced. These MFA representations (of products and variables) can be read like those of a PCA: the co-ordinates of a product are its values for the common factors; the co-ordinates of a variable are its correlations with these factors.

Fig. 5. First map from MFA of Table 1. (A) Representation of physicochemical and sensory variables. (B) Representation of the 96 hedonic judgements. (C) Representation of the products (where refr. stands for refrigerated and r.t. for at room temperature); P2, P3 and P5 come from Florida.

From the representations in Fig. 5, it follows that:

- the products P1 and P4 have a high level of acidity (and a low pH), a rather low sugar content with a high (glucose + fructose)/saccharose ratio; they were perceived as sour, bitter, not very sweet, and as not containing very much pulp;
- the products P3, P5 and P6 have a low level of acidity (and a high pH), a rather high sugar content with a low (glucose + fructose)/saccharose ratio; they were perceived as being not very sour or bitter, sweet and rich in pulp;
- the product P2 shows roughly the same characteristics as the three above, except for a small quantity of sugars and a quantity of pulp considered to be very limited.
The individual hedonic judgements are widely scattered, showing a total absence of consensus and thus the need for a segmented approach to these judgements. In particular, there is no preference for refrigerated juices, which are appreciably more expensive (their "soft" pasteurisation is more difficult).

Let us also point out:

- the opposition between fructose and glucose on the one hand and saccharose and pH on the other hand, connected with the hydrolysis of saccharose, which is facilitated in an acid medium;
- the correlation between acidity and sourness;
- the absence of correlation between sweetening power and sweetness: a high level of sweetness is associated with a low level of acidity (this refers to the concept of gustatory balance). Thus, the strong correlation between saccharose and sweetness is not due to the direct influence of saccharose but to a high pH.

4.2.3. Preparation for PLS modelling

Two factors being common, the hedonic judgements are now divided into four clusters according to the rule presented in the Methodology (cf. Fig. 1). The physicochemical and sensory variables are subdivided into two groups according to their correlations with the first two factors.

4.3. Results of PLS path modelling

Using the exploratory results of MFA, we now want to relate the hedonic judgements to the physicochemical and sensory variables.

4.3.1. Causality models

The first axis of MFA suggests a "correlation" between the physicochemical block (acidity, pH before processing, pH after centrifugation, saccharose, citric acid), the sensory block (typical smell, sweetness, bitterness, sourness), the block of the 16 hedonic judgements positively correlated with $F_1$, and the block of the 44 hedonic judgements negatively correlated with $F_1$. The causality links between these blocks are described in Fig. 6.
To estimate the coefficients of this model, we used the program LVPLS 1.8 proposed by Lohmöller [9] with the following options: variables are standardised, Mode A for the outer estimates of the latent variables, factor-weighting scheme for the inner estimates of the latent variables. In this diagram, the numbers located on the arrows connecting the observable variables to the latent variables are the correlations. The numbers appearing between the latent variables are the regression coefficients in the regressions relating one dependent variable to the independent variable(s). The numbers appearing under the endogenous latent variables are the $R^2$ of the regression, simple or multiple, as applicable.

Fig. 6. Model for the first two clusters of hedonic judgements.

The second axis of the MFA suggests a relationship between the physicochemical block (glucose, fructose, sweetening power (defined above)), the sensory block (intensity of smell, intensity of taste, pulp content), the block of the 17 hedonic judgements correlated positively with $F_2$, and the block of the 19 hedonic judgements correlated negatively with $F_2$. The causality links between these blocks are described in Fig. 7. The numbers which appear in this diagram have the same definition as in Fig. 6.

Fig. 7. Model for the third and fourth clusters of hedonic judgements.

4.3.2. Interpretation of latent variables summarising the clusters of hedonic judgements

Fig. 8 illustrates the latent variables summarising the four clusters. The characterisation of these clusters is quite clear, namely:

- cluster 2 (44 panel members) preferred the Florida orange juices (P2, P3, P5), which were not too sour, and were generally perceived to be sweet, not very sour or bitter; on the other hand, cluster 1 (16 panel members) rejected these Florida juices and preferred the products with a low pH, which were generally perceived to be more sour, more bitter and less sweet;
- cluster 3 (17 panel members) preferred the refrigerated juices, with more pulp, whatever the origin of these juices; on the other hand, cluster 4 (19 panel members) preferred the orange juices at room temperature (containing less pulp).

Fig. 8. Latent variables summarising the four clusters of hedonic judgements (where refr. stands for refrigerated and r.t. for at room temperature).

5. Conclusion

Simultaneous analysis of several groups of variables defined for the same individuals should always be organised around the concept of structures common to the groups of variables. These structures are highlighted using linear combinations of the variables of each group, called canonical variables or latent variables.

One can look for these canonical variables using a criterion depending only on the correlations between them; this is the case for canonical correlation analysis. On the other hand, one can look for these variables by introducing into the criterion the amount of explained variance of each one within its own group: this is the case for MFA and PLS path modelling.

The main difference between these two approaches lies in the existence of a causality model, which is taken into account by PLS path modelling. This PLS model:

- looks for common structures, taking into account causality links between the blocks, whereas all the groups play the same role in MFA;
- leads to prediction equations, whereas MFA is a purely descriptive method.

As a result of these common points and differences, the two methods are highly complementary in building both a descriptive and a modelling approach to a problem.
We have proposed an application which offers an original solution to the traditional problem of the connection between physicochemical variables, sensory attributes and hedonic judgements. The process consists of:
• using the common factors of MFA to subdivide the groups of variables into one-dimensional subgroups;
• connecting the new subgroups by a causal model.

This leads to clusters of hedonic judgements which can be very simply explained by physicochemical and/or sensory variables.

Acknowledgements

The authors are very grateful for the reviewers' suggestions, which have been most useful. They would also like to express their thanks to the students Cécile Lavanant, of the University of Rennes 2, and Sébastien Le Dien, of the University of Paris (ISUP), who, as part of their training at ENSAR, undertook all data-processing operations used in this paper.

References

[1] J.D. Carroll, A generalization of canonical correlation analysis to three or more sets of variables, Proc. 76th Conv. Am. Psychol. Assoc., 1968, pp. 227–228.
[2] P. Courcoux, Un modèle en classes latentes en cartographie des préférences, 6e Journées Européennes Agro-industries et Méthodes Statistiques, Pau, 2000.
[3] B. Escofier, J. Pagès, Méthode pour l'analyse de plusieurs groupes de variables, Application à la caractérisation des vins rouges du Val de Loire, Revue de Statistique Appliquée XXXI (2) (1983) 43–59.
[4] B. Escofier, J. Pagès, Multiple factor analysis (AFMULT package), Computational Statistics and Data Analysis 18 (1994) 121–140.
[5] B. Escofier, J. Pagès, Analyses factorielles simples et multiples; objectifs, méthodes et interprétation, 3rd edn., Dunod, Paris, 1998, 284 pp.
[6] K. Greenhoff, H.J.H. MacFie, Preference mapping in practice, in: H.J.H. MacFie, D.M.H. Thompson (Eds.), Measurement of Food Preferences, Blackie Academic & Professional, Glasgow, 1994, pp. 137–166.
[7] C. Guinot, J. Latreille, M. Tenenhaus, PLS approach and multiple table analysis, Application to the study of cosmetic habits of women in Ile de France, Chemometrics and Intelligent Laboratory Systems (2001) this issue.
[8] P. Horst, Relations among m sets of variables, Psychometrika 26 (1961) 129–149.
[9] J.-B. Lohmöller, LVPLS Program Manual, Version 1.8, Zentralarchiv für Empirische Sozialforschung, Köln, 1987.
[10] J.-B. Lohmöller, Latent Variables Path Modeling with Partial Least Squares, Physica-Verlag, Heidelberg, 1989.
[11] R.P. McDonald, Factor Analysis and Related Methods, Lawrence Erlbaum Associates, Hillsdale, NJ, 1985.
[12] H.J. MacFie, N. Bratchell, K. Greenhoff, L.V. Vallis, Designs to balance the effect of order of presentation and first-order carry-over effects in hall tests, Journal of Sensory Studies 4 (1989) 129–148.
[13] J. Pagès, M. Tenenhaus, Analyse factorielle multiple et approche PLS, Revue de Statistique Appliquée (2001) (in press).
[14] C.S. Poulsen, P.M.B. Brockhoff, L. Erichsen, Heterogeneity in consumer preference data - a combined approach, Food Quality and Preference 8 (5/6) (1997) 409–417.
[15] M. Tenenhaus, La régression PLS, Théorie et Pratique, Technip, Paris, 1998.
[16] M. Tenenhaus, L'Approche PLS, Revue de Statistique Appliquée XLVII (2) (1999) 5–40.
[17] P. Valette-Florence, Analyse structurelle comparative des composantes des systèmes de valeurs selon Kahle et Rokeach, Recherche et Applications en Marketing III (1) (1988) 15–34.
[18] P. Valette-Florence, Spécificité et apports des méthodes d'analyse multivariée de la deuxième génération, Recherche et Applications en Marketing III (4) (1988) 23–56.
[19] P. Valette-Florence, Analyse structurelle et analyse typologique: illustration d'une démarche complémentaire, Recherche et Applications en Marketing V (1) (1990) 73–91.
[20] A.L. Van den Wollenberg, Redundancy analysis: an alternative for canonical correlation, Psychometrika 42 (1977) 207–219.
[21] H. Wold, Modeling in complex situations with soft information, Third World Congress of Econometric Society, August 21–26, Toronto, Canada, 1975.
[22] H. Wold, Soft modeling: the basic design and some extensions, in: K.G. Jöreskog, H. Wold (Eds.), Systems Under Indirect Observation, vol. 2, North-Holland, Amsterdam, 1982, pp. 1–54.
[23] H. Wold, Partial least squares, in: S. Kotz, N.L. Johnson (Eds.), Encyclopedia of Statistical Sciences, vol. 6, Wiley, New York, 1985, pp. 581–591.
[24] S. Wold, H. Martens, H. Wold, The multivariate calibration problem in chemistry solved by the PLS method, in: A. Ruhe, B. Kågström (Eds.), Proc. Conf. Matrix Pencils, March 1982, Lecture Notes in Mathematics, Springer-Verlag, Heidelberg, 1983, pp. 286–293.