Multiple Factor Analysis

Data
Common Structure
Groups Study
Partial Analyses
To go further
Example
Multiple Factor Analysis
Julie Josse
Applied Mathematics Department, Agrocampus Ouest
Stat 300
Stanford, July 2015
1 / 58
Data
Common Structure
Groups Study
Partial Analyses
To go further
Example
Multiple Factor Analysis
1
Data - Issues
2
Common Structure
3
Groups Study
4
Partial Analyses
5
Example
2 / 58
Data
Common Structure
Groups Study
Partial Analyses
To go further
Example
Multi-blocks data set
Groups of variables (MFA)
Groups of
variables are
quantitative and/
or qualitative
Objectives: - study the link between the sets of variables
• Sensory analysis:- balance
products
sensorial,
physico-chemical
the -influence
of each
group of variables
• Survey: individuals
- give
the classical graphsthemes
but also (students
specific graphs:
- questionnaires
health:
groups
of variables -conditions,
partial representation
addicted consumptions,
psychological
sleep, id)
• Economy: countries - economic indicators each year
Examples: - Genomic: DNA, protein
• Biology: samples - -Omics
(brain
tumors:
CGH, transcriptome;
Sensorydata
analysis:
sensorial,
physico-chemical
mouse: transcriptome,
hepatic
fatty
acid
measurements)
- Comparison of coding (quantitative / qualitative)
⇒ Generalized Canonical Correlation, Procrustes, Statis, etc.
⇒ MFA (Escofier & Pagès, 1998)
⇒ Continuous / categorical / contingency sets of variables
3
3 / 58
Common Structure
Groups Study
Partial Analyses
To go further
Example
The data
Example: gliomas brain tumors
<Experiment>
Gliomas: Brain tumors, WHO classification
- astrocytoma (A)……….……… x5
- oligodendroglioma (O)……… x8
- oligo-astrocytoma (OA)……
x6
43 tumor samples
- glioblastoma (GBM)………… x24
(Bredel et al.,2005)
• Transcriptional
modification
(RNA), microarrays: 489 variables
- damage to DNA,
CGH arrays
transcriptional modification (RNA), Microarrays
• Damage -to
DNA (CGHThe
array):
variables
data,113
the expectations
<Merged data tables>
<Transcriptome>
1
j1
<Genome alteration>
J1
1
j2
J2
1
Tumors
Data
i
I
‘-omics’ data
4 / 58
Data
Common Structure
Groups Study
Partial Analyses
To go further
Example
Objectives
• Study the similarities between individuals with respect to all
the variables
• Study the linear relationships between variables
⇒ taking into account the structure on the data (balance the
influence of each group)
• Find the common structure with respect to all the groups -
highlight the specificities of each group
• Compare the typologies obtained from each group of variables
(separate analyses)
5 / 58
Data
Common Structure
Groups Study
Partial Analyses
To go further
Example
Principal component methods
The core of principal component methods is PCA on particular
matrices
"Doing a data analysis, in good mathematics, is simply searching
eigenvectors, all the science of it (the art) is just to find the right
matrix to diagonalize"
Benzécri
MFA is a particular weighted PCA!
6 / 58
Data
Common Structure
Groups Study
Partial Analyses
To go further
Example
Balancing the groups of variables
MFA is a weighted PCA:
• compute the first eigenvalue λj1 of each group of variables
• perform a global PCA on the weighted data table:


X1
X2
XJ
q
; q ; ...; q 
λ11
λ21
λJ1
⇒ Same idea as in PCA when variables are standardized: variables
are weighted to compute distances between individuals i and i 0
i
8 variables
highly
correlated
2 var
i′
7 / 58
Data
Common Structure
Groups Study
Partial Analyses
To go further
Example
Balancing the groups of variables
λ1
λ2
λ3
Transcriptome
162
35
21
Genome
12
10
5
This weighting allows that:
• Same weight for all the variables of one group: the structure of
the group is preserved
• For each group the variance of the main dimension of
variability (first eigenvalue) is equal to 1
• No group can generate by itself the first global dimension
• A multidimensional group will contribute to the construction
of more dimensions than a one-dimensional group
8 / 58
Data
Common Structure
Groups Study
Partial Analyses
To go further
Example
Individuals and variables representations
Correlation circle
O4
●
●
GBM22
O1
●
1
1.0
2
Individual factor map
GBM30
O5
GBM28
GBM31
O3
GS2
JPA2
GBM21
GBM6 O2 O
GBM25
GBM23
GBM27
GBM15 sGBM3
GBM4GBM9
GBM11
AO1
GBM
AOA6 AO3
GBM5
LGG1
GBM26
GS1 AOA4
sGBM1
●
GBM29
AOA3
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
AO2
GBM1
OA
●
●
GBM24
JPA3
●
A
●
GBM16
●
AOA7
AOA2
GBM3
●
●
AA3
AOA1
−2
●
−0.5
●
●
●
−1
0
1
MGC39606
CEAL1
IGSF3
GYPA
HNRNPG.T
LU
EGLN2
KIAA1543
ZNF226
EDG1
ZNF549 ACAD8
GPSM2
TLL2
MGC39581
KLRC3
MGP PPP1R14A
FBXO32 D21S2089E
AKR1C2
TAP1
SERPINA3
IFI30
NDUFB11
KCNK6
CACNB2
TMEM49
DSCR1L1
RPS3A
GPNMB
T51726
SOCS3
MGC42367
KIAA0934
CYR61 LUM
ITGA5
FAM84A
GADD45B
NPD014
DDEF1
BGN
RAB11FIP4
AKR1C1
T97457
HPRT1
LOC400451
CDKN1A
NNMT
SERPING1
TIMP1
YWHAH
RAB30
CA12
PRG1
AI822135
VEGF
SERPINI1
PSG1
AA598555
R52960
HMOX1
H20822
HLA.DRB1
BLVRB
COL1A2
HLA.G
UBE4B
LOC541471
ITGA3
L3MBTL4
USP6NL
MB
ADM
ANK3
SNRPN
CHI3L2
CAV1
PLAUR
H24428
HLA.F
OSR1
AI871056
AI262682
DPYD
YWHAG
ANXA1
IGFBP3
LYN
LGALS3
FN1
VSNL1
LY96
COL3A1
PRKCG
RGS16
DNASE1L1
AA281932
MICAL2
AA490257
MSTP9
CCL2
SP100
CDKN2D
NR4A1
SYT7
ABCC3
ABCA5
S100A11
H86813
X38595
H08563
GBP2
ZFHX1B
EPB49
HLA.A
PLA2G2A
L1CAM
PRKCB1
FAS
KLRC1
SPINT2
IQSEC1
CD47
FGF13
IL1RAP
HSPG2
CLIC1
COL4A2
PYGL
H41096
KCNMA1
BCL2A1
TNFRSF12A
SAA2
AA479357
PRKAR1B
GBP1
UGCG
COL1A1
ADFP
PRSS1
GOT1
C6orf12
HPCA
CD53
PDLIM7
TncRNA
R61377
CLTB
LTF
C8orf4
STC1
S100A1
NCF2
IL32
KLRC2
NPL
CYBA
PDE4DIP
MST150
AI335002
URP2
ASPA
KIAA1644
LOC283130
AA489629
MAL2
F13A1
RAB20
NSF
CHN1
LAPTM5
KCNJ1
COL6A2
FCGR2B
INPP5F
PDE2A
RBP4
LAMB1
CCR1
PRSS3
IGHG1
LSP1
CBLN2
AA873230
CD58
NDST1
AI357047
CD44
APCS
STXBP1
PLAU
ESM1
C1R
PRKCZ
R70506
SYNGR3
SNCG
HAMP
IMAGE.33267
NPC2
EFHB
GYPC
RFXAP
SERPINH1
H78560
MGC26694
CDC20B
SCN2B
CLSTN3
SLC16A3
DDN
PCP4
CKMT1A
RTN3
CAMKK2
C16orf5
TNFAIP3
KNS2
●
CALM1
PLP2
HBA1
CBX6
A2BP1
CTHRC1
AA669383
SLC15A2
SORT1
AA975768
RUNX1
CAMK2A
APOC2
MSN
S100A10
OSBPL1A
CA11.1
CLCA3
RAB27B
AIF1
LOC388610
VAMP2
EMP3
N98591
CAPG
WASF1
AA598631
AA401952
FBXL16
FAM46A
TSPAN4
KIAA0963
KLRC3.1
FBXO2
LOC613212
C1orf187
MYBPC2
COX6A2
W93688
PDPN
WDR7
T62491
AEBP1
PYHIN1
POSTN
NY.SAR.48
STK17A
ZNF217
TGFBI
TBC1D7
NPTX1
CAP2
PPP3CB
NCF1
DES TOMM40
NMNAT2
PLTP
FREQ
PDXP
FABP3
TSPYL1
ITPKA
AA424849
DDIT4
ARPP.19
SETD5
H10054
IGFBP5
RALYHCLS1
FLJ35740
FBL
AA181288
MED11
MOAP1
PNCK
PARP14
ADCY1
SCAMP5
STMN1
LMO7
NLK
PSD3
LOC57228
NEFH
MMP2
LRBA
UBA52
C9orf48
HBXIP
ADARB2
LHFPL2
DKFZp313A2432
FAM84B
MAPKBP1
FBXW7
ATP6V1C1
NUAK1
IGF2BP3
RND3
PRRX1
PPP3CA
DLL3
MYC
MDK
X37864
MTHFD2
STEAP3
AA702986
PHLDA1
TNCLAMA4
FCGRT
FLJ38984
C4A
DMN
EPHB1
MYCBP2
ITGA9
HEY1
PEG3
RPS19BP1
GAS1
PCDHGC3
COX7A1
R70684
LOC146795
HK2
EXT1AA398420
AP4B1
SLC35A2
FLJ38944
DEF6
MRC2
AI005038
NLGN1
PALLD
AA906888
ANXA5
CASP1
MST1
LOC389831
AA029415
MASP1
TIMP3
MVP
MGC18216
PYCR1
PARD3
SOX4
CCDC3
SPAG6
DYNLT1
CSPG2
LOC283070
KCNQ2
ID4 AA131320
CACNB3
FZD7
AI350724
EDG3
PLOD2
GABBR1
SUV39H1
ID3
BNIP2 PTPRZ1
WWTR1
TCF7L1
SPARC
EDNRB
CA11 PSG5 AI263051
NOTCH1
SRPX
COL9A2 H87106
TXNDC
IGF1R
DECR2
SVIL
MDFI
AI002301
USP3
TFDP2
FLJ12572
H91845
RBP7
VASP
FLJ40873
MGC4728
LOC56931
ZNF160
APOC1
LILRA1
ZNF233
SBP1
DCLRE1B
FLJ12586
−1.0
−3
GNN1
−2
0.0
0
●
●
−1
Dim 2 (13.51 %)
●
●●
●
Dim 2 (13.51 %)
●
●
●
●
0.5
●
●
2
3
Dim 1 (20.99 %)
−1.0
−0.5
0.0
0.5
1.0
Dim 1 (20.99 %)
⇒ What can be added to interpret?
9 / 58
Data
Common Structure
Groups Study
Partial Analyses
To go further
Example
Individuals and variables representations
2
Individual factor map
A
GBM
O
OA
GBM30
O4
●
●
GBM22
O1
0
O5
GBM28
GBM31
O3
GS2
●
● ●
●
●
JPA2
GBM21
GBM6 O2 O
●
●
●
GBM25
●
GBM23
GBM27
●
GBM15
● sGBM3
GBM4
●
GBM11
AO1
GBM
GBM9
●
●
AO3
●
AOA6
●
GBM5
LGG1
●
●
GBM26
●
GS1
●
AOA4
sGBM1 ●●
●
●
GBM29
●
●
AOA3
●
●
●
●
AO2
GBM1
●
●
OA
GBM24
JPA3
−1
Dim 2 (13.51 %)
1
●
●
A
●
GBM16
●
AOA7
AOA2
GBM3
●
●
AA3
●
AOA1
−2
●
●
−3
GNN1
●
−2
−1
0
1
2
3
Dim 1 (20.99 %)
Figure 4: Multi-way glioma data set: Characteristics of oligodendrogliomas are linked to modifica
the genomic status of genes located on 1p and 19q positions.
⇒ Do the means of the tumors coordinates per stage on each
dimension significantly differ from each other?
10 / 58
Data
Common Structure
Groups Study
Partial Analyses
To go further
Example
Groups study
⇒ Synthetic comparison of the groups
⇒ Are the relative positions of individuals globally similar from one
group to another? Are the partial clouds similar?
⇒ Do the groups bring the same information?
11 / 58
Data
Common Structure
Groups Study
Partial Analyses
To go further
Example
Principal component in MFA
MFA = weighted PCA ⇒ first principal component of MFA
maximizes


J X
J
X
X
x
.k
cov 2  q , F1  =
Lg (F1 , Kj )
j=1 k∈Kj
j=1
λj1
Lg (F1 , Kj ) =<
WKj
, F1 F10 >= trace(WK0 j F1 F10 )
λ1
⇒ F1 the most related to the groups in the Lg sense
12 / 58
Data
Common Structure
Groups Study
Partial Analyses
To go further
Example
Representation of the groups
1.0
Groupscoordinates
representation
Group j has the
(Lg (F1 , Kj ), Lg (F2 , Kj ))
CGH
0.6
close that they induce the
same structure
0.4
• The 1st dimension is
common to all the groups
0.2
Dim 2 (13.51 %)
0.8
• 2 groups are all the more
WHO
• 2nd dimension mainly due
expr
0.0
to CGH
0.0
0.2
0.4
0.6
0.8
1.0
Dim 1 (20.99 %)
0 ≤ Lg (F1 , Kj ) =
1 X
λj1 k∈Kj
|
cov 2 (x.k , F1 ) ≤ 1
{z
≤λj1
}
⇒ Could you predict the results of the PCA for each group?
13 / 58
Data
Common Structure
Groups Study
Partial Analyses
To go further
Example
The RV coefficient
Xj(I ×K ) and Xm(I ×Km ) not directly comparable
j
Wj(I ×I ) = Xj Xj0 and Wm(I ×I ) = Xm Xm0 can be compared
Inner product matrices = relative position of the individuals
Covariance between two groups:
X X
< Wj , Wm >=
cov 2 (x.k , x.l )
k∈Kj l∈Km
Correlation between two groups (Escoufier, 1973):
RV (Kj , Km ) =
< Wj , Wm >
kWj k kWm k
0 ≤ RV ≤ 1
RV = 0: variables of Kj are uncorrelated with variables of Km
RV = 1: the two clouds of points are homothetic
⇒ Extension of the notion of correlation matrix
14 / 58
Data
Common Structure
Groups Study
Partial Analyses
To go further
Example
Similarity between two groups
Measure of similarity between groups Kj and Km :
X X
2 x.k x.l
,
Lg (Kj , Km ) =
cov
λk1 λl1
k∈K l∈K
j
m
Ramsay (1984): "Matrices may be similar or dissimilar in a many
ways"
Canonical correlation (Hotteling, 1936), Mantel (1967), Procrustes
(Gower, 1971), dCov (Szekely et al., 2007), kernel based HSIC
(Gretton et al., 2005), etc...
15 / 58
Data
Common Structure
Groups Study
Partial Analyses
To go further
Example
Numeric indicators
> res.mfa$group$Lg
CGH expr WHO
CGH 2.51 0.60 0.46
expr 0.60 1.10 0.36
WHO 0.46 0.36 0.50
> res.mfa$group$RV
CGH expr WHO
CGH 1.00 0.36 0.41
expr 0.36 1.00 0.48
WHO 0.41 0.48 1.00
PKj
Lg (Kj , Kj ) =
j 2
k=1 (λk )
(λj1 )2
PKj
= 1+
j 2
k=2 (λk )
(λj1 )2
• CGH gives richer description (Lg greater)
• RV: a standardized Lg
• CGH and expr are not linked (RV=0.36)
Contribution of each group to each component of the MFA
• Similar contribution of the 2 groups to
> res.mfa$group$contrib
Dim.1 Dim.2 Dim.3
the first dimension
CGH
45.8 93.3 78.1
• Second dimension only due to CGH
expr 54.2
6.7 21.9
16 / 58
Data
Common Structure
Groups Study
Partial Analyses
To go further
Example
Partial analyses
⇒ Comparison of the groups through the individuals
⇒ Comparison of the typologies provided by each group in a
common space
⇒ Are there individuals very particular with respect to one group?
⇒ Comparison of the separate PCA
17 / 58
Data
Common Structure
Groups Study
Partial Analyses
To go further
Example
Projection of partial points
G1
xxxxxxxx
xxxxxxxx
xxxxxxxx
Data xxxxxxxx
i
G3
xxxxx
xxxxx
xxxxx
xxxxx
xxxxxxxx xxxxx xxxxx
xxxxxxxx xxxxx xxxxx
xxxxxxxx xxxxx xxxxx
xxxxxxxx
xxxxxxxx
xxxxxxxx
Projection of group 1 xxxxxxxx
i1
G2
xxxxx
xxxxx
xxxxx
xxxxx
00000
00000
00000
00000
RK = ⊕ RK j
i
00000
00000
00000
00000
xxxxxxxx 00000 00000
xxxxxxxx 00000 00000
xxxxxxxx 00000 00000
00000000
00000000
00000000
Projection of group 2 00000000
xxxxx
xxxxx
xxxxx
xxxxx
00000
00000
00000
00000
00000000 xxxxx 00000
2 00000000 xxxxx 00000
00000000 xxxxx 00000
i
00000000
00000000
00000000
Projection of group 3 00000000
i3
MFA individuals configuration
00000
00000
00000
00000
xxxxx
xxxxx
xxxxx
xxxxx
00000000 00000 xxxxx
00000000 00000 xxxxx
00000000 00000 xxxxx
Partial point 1
Partial point 2
Mean point
Partial point 3
18 / 58
Data
Common Structure
Groups Study
Partial Analyses
To go further
Example
Partial points
opinion
attitude
individuals
F2
What you think
behavioral conflict
individual i
What you do
F1
19 / 58
Data
Common Structure
Groups Study
Partial Analyses
To go further
Example
Tutorial participants
Partial points
What you expected
for the tutorial
What you have learned
during the tutorial
F2
What you expected
for the tutorial
What you have learned
during the tutorial
What you expected
for the tutorial
What you have learned
during the tutorial
F1
20 / 58
Data
Common Structure
Groups Study
Partial Analyses
To go further
Example
Tutorial participants
Partial points
What you expected
for the tutorial
What you have learned
during the tutorial
F2
What you expected
for the tutorial
What you have learned
during the tutorial
Disappointed
learner
What you expected
for the tutorial
Happy learner
What you have learned
during the tutorial
F1
20 / 58
Data
Common Structure
Groups Study
Partial Analyses
To go further
Example
Representation of the partial points
CGH
expr
Individual factor map
1.5
4
Individual factor map
●
CGH
expr
1.0
●
●
GBM30
O4
●
●
GBM22
O1
O5
GBM28
●
GBM31
O3 ●
GS2
JPA2GBM6
GBM21
●
O2O
●●
●
●GBM23
GBM25
GBM27
●
GBM15
●
●
sGBM3
●
GBM4
GBM11
AO1
GBM
GBM9
AO3
●●
AOA6
GBM5
LGG1
●
GBM26
●GS1
●sGBM1
AOA4
●
●
●
GBM29
●
●
●●
AOA3
●
● ● ●
●
●
●
AO2
●
GBM1
●
GBM24
JPA3 A ●OA
GBM16 ● ●
AOA7
AOA2
GBM3●
●
AA3 ● AOA1
●
●
● ●
●
●
●
●
●
●
●●
●
●●
●
●
●●
●
●●
●●
●
●
●●●
●
●
●
●●
●
●
●● ●
●
●
●
●
●
−2
●
●●
●
●
●
GNN1
●
●
●
●
OA
A
−1.5
−4
●
●
−1.0
●
GBM
0.0
0
● ●●●
●
●
●
●
−2.0
●
−6
Dim 2 (13.51 %)
● ●
●
●
●
●
●
−0.5
●
●
●
●
0.5
●
●
●
●
●
O
●
●
●
Dim 2 (13.51 %)
2
●
●
●
−4
−2
0
2
4
6
Dim 1 (20.99 %)
−1
0
1
2
Dim 1 (20.99 %)
• an individual is at the barycentre of its partial points
• an individual is all the more "homogeneous" that its
superposed representations are close
21 / 58
Common Structure
Groups Study
Partial Analyses
To go further
Example
Identify particular individuals
2
Individual factor map
CGH
expr
GBM30
O4
●
●
GBM22
O1
1
●
O5
GBM28
GBM31
O3
GS2
JPA2
GBM21
GBM6 O2 O
GBM25
GBM23
GBM27
GBM15 sGBM3
GBM4GBM9
GBM11
AO1
GBM
AOA6 AO3
GBM5
LGG1
GBM26
GS1 AOA4
sGBM1
GBM29
AOA3
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
0
●
●
●
●
●
●
●
●
●
●
●
AO2
GBM1
OA
●
●
●
−1
GBM24
JPA3
●
A
●
GBM16
AOA7
AOA2
●
GBM3
●
●
AA3
●
AOA1
−2
●
●
GNN1
−3
Dim 2 (13.51 %)
Data
●
−2
−1
0
1
2
3
Dim 1 (20.99 %)
22 / 58
Data
Common Structure
Groups Study
Partial Analyses
To go further
Example
Numeric indicators
J
I X
X
i=1 j=1
(Fi j q )2 =
I X
J
X
(Fiq )2 +
i=1 j=1
I X
J
X
(Fi j q − Fiq )2
i=1 j=1
Total inertia = Between indiv. inertia + Within indiv. inertia
> res.mfa$inertia.ratio
Dim.1 Dim.2 Dim.3 Dim.4 Dim.5
0.84 0.56 0.44 0.59 0.43
• For the first dimension, the coordinates of each partial points
are close (0.84 close to 1)
• The within inertia can be decomposed by individuals
res.mfa$ind$within.inertia
23 / 58
Data
Common Structure
Groups Study
Partial Analyses
To go further
Example
Representation of the partial components
Do the separate analyses give similar dimensions as MFA?
1 q Q
PCA
1 q Q
1
i
I
24 / 58
Data
Common Structure
Groups Study
Partial Analyses
To go further
Example
Representation of the partial components
CGH
expr
WHO
Dim1.CGH
0.0
• The first dimension of
Dim1.WHO
Dim1.expr
●
Dim3.CGH
−0.5
Dim3.expr
Dim3.WHO
Dim2.expr
Dim2.WHO
Dim2.CGH
each group is well
projected
• CGH has same
dimensions as MFA
−1.0
Dim 2 (13.51 %)
0.5
1.0
Partial axes
−1.0
−0.5
0.0
0.5
1.0
Dim 1 (20.99 %)
25 / 58
Data
Common Structure
Groups Study
Partial Analyses
To go further
Example
Use of biological knowledge
Genes can be grouped by gene ontology (GO) biological process
GO:0006928
cell motility
GO:0009966
regulation of signal
transduction
ANXA1
CALD1
EGFR
ENPP2
FN1
FPRL2
LSP1
MSN
PDPN
PLAUR
PRSS3
SAA2
SPINT2
TNFRSF12A
VEGF
WASF1
YARS
CASP1
EDG2
F2R
HCLS1
HMOX1
IGFBP3
IQSEC1
LYN
MALT1
TCF7L1
TNFAIP3
TRIO
VEGF
YWHAG
YWHAH
GO:0052276
chromosome
organisation and
biogenesis
CBX6
NUSAP1
PCOLN3
PTTG1
SUV39H1
TCF7L1
TSPYL1
26 / 58
Data
Common Structure
Groups Study
Partial Analyses
To go further
Example
Use of biological knowledge
• Biological processes considered as supplementary groups of
Modules
variables
<MODULES of GENES>
Modular approach
1
j1
J1 1
j2
J2
Tumors
1
M1 M2
i
M3
…..
I
‘-omics’ data
Modules
=> Integration of the modules as groups of supplementary variables
27 / 58
Data
Common Structure
Groups Study
Partial Analyses
To go further
Example
Use of biological knowledge
1.0
Groups representation
0.8
CGH
●
●
0.6
●
●
●
●
●
●
●
●
●
●
●
0.4
Many biological processes
induce the same structure
on the individuals than
MFA
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
0.2
●
●
●
●
●
●
●
●
●
●
●
●
●
●
● ●
●
●
●
●
●
●
●
●
●
●
●
0.0
●
●
●
●
●
●
●
●
0.2
●
●
●
●
● ●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
● ●
●●
● ●●
●
●
●
●
●● ●
● ●● ●● ●●●● ● ●●
● ● ●
●●
●
●
●
●
●● ●
●●
● ● ●● ●
● ●
●●
●●●
●
●●
●
●
●● ●
●
● ●
●●●
● ●●
●
●
●
● ● ●● ● ● ●
● ●
●● ●
●
●● ● ●
●● ●●●
●
●
●
●
●●
●● ●●
●
●● ● ● ●● ● ● ●●●
●●
●●
●
●
●● ●
●●
●
●
●●●●
● ●
●
●
●●
●
●
●
● ●
● ●
●
● ●●
●
●
●
●●● ●●●
● ● ● ●●●
●
●●
●● ●● ●
●
●●
●
● ● ● ●● ●
● ●
●●
● ●
● ●●●
●
●●
●
●●
●
● ● ●
● ●●
●
●●
●
●●
●
●● ●
● ●
●●
●
●
●
●●
●●
●● ●
●●
●
●
●
●
●
●●
● ● ●
●
●
●●
●
●
●
●
●
WHO
●
0.0
Dim 2 (13.51 %)
●
0.4
expr
●
●
0.6
0.8
1.0
Dim 1 (20.99 %)
28 / 58
Data
Common Structure
Groups Study
Partial Analyses
To go further
Example
To go further
• Mixed data: MFA with 1 group = 1 variable
continuous variables: PCA is recovered; categorical variables:
MCA is recovered
mixed: FAMD
• MFA used for methodological purposes:
• comparison of coding (continuous or categorical)
• comparison between preprocessing (standardized PCA and
unstandardized PCA)
• comparison of results from different analyses
• Hierarchical Multiple Factor Analysis:
Takes into account a hierarchy on the variables: variables are
grouped and subgrouped (like in questionnaires structured in
topics and subtopics)
29 / 58
Data
Common Structure
Groups Study
Partial Analyses
To go further
Example
Clustering: MFA as a preprocessing
X1
X2
i
i’
MFA balances the influence of the groups when computing
distances between individuals
Kj
J
X
1 X
p
d (i, i ) =
(xik − xi 0 k )2
λ
j
j=1
k=1
2
0
AHC or k-means onto the first principal components (F.1 , ..., F.Q )
obtained from MFA allows to
• take into account the groups structure in the clustering
• make the clustering more robust by deleting the last dimensions
30 / 58
Common Structure
Groups Study
Partial Analyses
To go further
Example
Clustering
0.0
Cluster Dendrogram
inertia gain
0.0
0.2
0.4
0.6
0.8
1.0
Hierarchical clustering
0.4
0.8
AHC onto the first 5 principal components from MFA
O4
AO2
sGBM1
AOA6
AO1
AO3
GBM1
LGG1
AOA3
O3
O1
GBM6
O2
O5
AOA4
GBM25
GS1
GBM29
GBM31
GBM15
GBM26
JPA2
GBM9
GBM21
GBM22
GBM23
GBM27
GBM11
GBM4
GBM5
GBM30
GBM28
GS2
sGBM3
AOA1
GNN1
AOA2
AOA7
GBM24
JPA3
AA3
GBM3
GBM16
Data
Individuals are sorted according to their coordinate on F.1
31 / 58
0.0
0.2
0.4
0.6
0.8
1.0
Hierarchical clustering
Cluster Dendrogram
0.4
0.8
Groups Study
0.0
Common Structure
O4
AO2
sGBM1
AOA6
AO1
AO3
GBM1
LGG1
AOA3
O3
O1
GBM6
O2
O5
AOA4
GBM25
GS1
GBM29
GBM31
GBM15
GBM26
JPA2
GBM9
GBM21
GBM22
GBM23
GBM27
GBM11
GBM4
GBM5
GBM30
GBM28
GS2
sGBM3
AOA1
GNN1
AOA2
AOA7
GBM24
JPA3
AA3
GBM3
GBM16
Data
Partial Analyses
To go further
Example
Partition from the tree
An empirical number of clusters is suggested
inertia gain
32 / 58
Common Structure
Groups Study
Partial Analyses
To go further
Example
Partition on the principal component map
Factor map
2
cluster 1
GBM30
cluster 2
cluster 3
0
O1
O5
GBM31 GBM28
GS2
JPA2
GBM21
O3
GBM6
cluster 1
O2
GBM23
sGBM3
GBM25
AO1
GBM4
AO3
cluster 3
GBM15
GBM9
GBM26
GBM5
LGG1AOA6
GS1
sGBM1 AOA4
GBM29
AOA3
AO2
GBM27
Dim 2 (13.51%)
O4
GBM22
GBM11
GBM1
JPA3
GBM24
GBM16
AOA7
cluster 2
AOA2
−2
GBM3
AA3
AOA1
GNN1
−4
Data
−2
−1
0
1
2
3
Dim 1 (20.99%)
Continuous vision (principal component) and discontinuous
(clusters)
33 / 58
Data
Common Structure
Groups Study
Partial Analyses
To go further
Example
Cluster description by variables
v.test = r
x̄` − x̄
H0 : random sampling of I` values from I
s2
I`
I −I`
I −1
with x̄` the mean of variable x in cluster `, x̄ (s) the mean
(standard deviation) of the variable x in the data set, I` the cardinal
of cluster `
$desc.var$quanti$‘1‘
v.test Mean in category Overall mean sd in category Overall sd p.value
TMEM49
4.488
-0.430
-1.424
0.722
1.277
0.000
TNFRSF12A
4.433
-0.794
-1.838
0.789
1.357
0.000
LGALS3
4.369
-0.222
-1.216
0.861
1.312
0.000
S100A11
4.300
-0.737
-1.500
0.525
1.024
0.000
BGN
4.273
2.105
1.106
0.697
1.348
0.000
IFI30
4.264
0.987
0.026
0.979
1.300
0.000
....
....
C9orf48
-4.411
-0.686
-0.037
0.540
0.848
0.000
PSD3
-4.594
-1.684
-1.024
0.419
0.829
0.000
AA398420
-4.635
0.324
1.134
0.635
1.007
0.000
34 / 58
Data
Common Structure
Groups Study
Partial Analyses
To go further
Example
Cluster description by observations
• parangon: the closest observations to the centroid of the cluster
min d (xi. , C` ) with C` the centroid of cluster `
i∈`
• specific observations: the furthest observations to the centroids
of the other clusters (the observations sorted according to their
distance from the highest to the smallest to the closest centroid)
max min
d (xi. , C`0 )
0
i∈` ` 6=`
desc.ind$para
cluster: 1
GBM11
GBM28
GBM5
GBM25
GBM31
0.6649847 0.7001998 0.7973604 0.8869271 0.9674042
--------------------------------------------------------------desc.ind$dist
cluster: 1
GBM30
GS2
GBM21
GBM22
GBM27
3.227968 3.096048 3.031256 2.904327 2.778950
--------------------------------------------------------------35 / 58
Data
Common Structure
Groups Study
Partial Analyses
To go further
Example
Cluster description
• by the principal components (observations coordinates): same
description than for continuous variables
$desc.axes$quanti$‘1‘
v.test Mean in category Overall mean sd in category Overall sd p.value
Dim.2 2.919
0.511
0
0.465
1.010
0.004
Dim.1 -4.458
-0.974
0
0.560
1.259
0.000
• by categorical variables: chi-square and hypergeometric test
$test.chi2
p.value df
type 8.433474e-06
6
⇒ Active and supplementary elements are used
⇒ Only significant results are presented
36 / 58
Data
Common Structure
Groups Study
Partial Analyses
To go further
Example
Cluster description
$‘1‘
type=GBM
type=OA
type=O
Cla/Mod Mod/Cla
Global
p.value
v.test
75 94.73684 55.81395 3.300966e-06 4.651145
0 0.00000 13.95349 2.207775e-02 -2.289028
0 0.00000 18.60465 5.071916e-03 -2.802430
$‘2‘
type=A
Cla/Mod Mod/Cla
Global
p.value
v.test
60
37.5 11.62791 0.0398214 2.055597
$‘3‘
type=O
type=GBM
Cla/Mod Mod/Cla
Global
p.value
v.test
100.0
50.00 18.60465 8.875341e-05 3.919444
12.5
18.75 55.81395 2.319983e-04 -3.681354
37 / 58
Data
Common Structure
Groups Study
Partial Analyses
To go further
Example
Complementarity between hierarchical clustering and
partitioning
• Partitioning after AHC: the k-means algorithm is initialized
from the centroids of the partition obtained from the tree
• consolidate the partition
• loss of the hierarchy
• AHC with many individuals: time-consuming
⇒ partitioning before AHC
• compute k-means with approximately 100 clusters
• AHC on the weighted centroids obtained from the k-means
⇒ top of the tree is approximately the same
38 / 58
Common Structure
Groups Study
Partial Analyses
To go further
Example
Other methods: ade4
Two-table analysis
Available methods
variables
variables
individuals
Data
Function name
between
within
discrimin
coinertia
cca
pcaiv
pcaivortho
procuste
niche
Stéphane Dray (Univ. Lyon 1)
Analysis name
Between-class analysis
Within-class analysis
Discriminant analysis
Coinertia analysis
Canonical correspondence analysis
PCA on Instrumental Variables
Orthogonal PCAIV
Procustes analysis
Niche (OMI) analysis
CARME 2011, Rennes
22 / 31
39 / 58
Common Structure
Groups Study
Partial Analyses
To go further
Example
Other methods: ade4
Other functions
K-table
variables
individuals
Data
Function name
sepan
pta
foucart
statis
mfa
mcoa
statico
Stéphane Dray (Univ. Lyon 1)
Analysis name
K-table separate analyses
Partial triadic analysis
Foucart analysis
STATIS analysis
Multiple factor analysis
Multiple coinertia analysis
2 K-table analysis
CARME 2011, Rennes
26 / 31
40 / 58
Data
Common Structure
Groups Study
Partial Analyses
To go further
Example
Other methods
Predict one block with others:
• Multi-block PLS regression
• Multi-block PCA on instrumental variables..
41 / 58
Data
Common Structure
Groups Study
Partial Analyses
To go further
Example
RV Tests
Is there any (linear) relationship between the 2 sets? H0 : ρV = 0
Asymptotic tests: distributions normal, elliptical - rank P
(Robert et
nRV ∼
λi Zi2
⇒ sensitive to the departure from the distribution and to n
al, 1985, Cléroux, 1995, Cléroux & Ducharme, 1989)
42 / 58
Data
Common Structure
Groups Study
Partial Analyses
To go further
Example
RV Tests
Is there any (linear) relationship between the 2 sets? H0 : ρV = 0
Asymptotic tests: distributions normal, elliptical - rank P
(Robert et
nRV ∼
λi Zi2
⇒ sensitive to the departure from the distribution and to n
al, 1985, Cléroux, 1995, Cléroux & Ducharme, 1989)
Permutation tests:
permute one matrix’s rows - compute the RV for n! permutations
p-value: proportion of the values greater than the observed one
⇒ computationally costly (“old fashion" argument?)
42 / 58
Data
Common Structure
Groups Study
Partial Analyses
To go further
Example
RV Tests
Is there any (linear) relationship between the 2 sets? H0 : ρV = 0
Asymptotic tests: distributions normal, elliptical - rank P
(Robert et
nRV ∼
λi Zi2
⇒ sensitive to the departure from the distribution and to n
al, 1985, Cléroux, 1995, Cléroux & Ducharme, 1989)
Permutation tests:
permute one matrix’s rows - compute the RV for n! permutations
p-value: proportion of the values greater than the observed one
⇒ computationally costly (“old fashion" argument?)
Approximation of the permutation distribution
• sampling from the permutations - package ade4 (RV.rtest)
• moment matching: Pearson family, Edgeworth expansion
42 / 58
Data
Common Structure
Groups Study
Partial Analyses
To go further
Example
Moments matching
The first three moments under H0 (Kazi-Aoual et al., 1995)
p
P
βx × βy
(tr (X 0 X ))2
( λi )2
EH0 (RV ) =
βx =
= P 2 .
n−1
tr ((X 0 X )2 )
λi
βx a measure of complexity 1 ≤ βx ≤ p
RV large: n small and many orthogonal variables per group
⇒ Normal approximation:
RVstd =
RV − EH0 (RV )
p
VH0 (RV )
43 / 58
Data
Common Structure
Groups Study
Partial Analyses
To go further
Example
Moments matching
Problem: the exact distribution of the RVstd is often skewed
0.5
Histogram of the standardized RV
0.3
0.4
Normal
Gamma
Edgeworth
⇒ Pearson type III
0.2
Density
f(x) (skewness= γ):
2
2+γx
γ
(4−γ 2 )/γ 2
e −2(2+xγ)/γ
0.0
0.1
(2/γ)4/γ
Γ(4/γ 2 )
−1
0
1
2
3
4
⇒ package FactoMineR (coeffRV) (Josse et al., 2008)
44 / 58
2
Common Structure
Groups Study
Partial Analyses
To go further
Example
Back to the wine example!
• 10 white wines from Val de Loire (5 Vouvray - 5 Sauvignon)
7.1
7.2
6.1
4.9
6.1
5.9
6.3
6.7
7.0
7.3
6.7
6.6
6.1
5.1
5.1
5.6
6.7
6.3
6.1
6.6
5.0
3.4
3.0
4.1
3.6
4.0
6.0
6.4
7.4
6.3
Label
1.4
2.3
2.4
3.0
3.1
2.4
4.0
2.5
3.1
4.3
Odor.preferene
4.1
3.8
4.1
2.5
5.0
3.0
5.0
4.0
4.0
4.3
Overall.preference
5.9
6.8
6.1
5.6
6.6
4.4
6.4
5.7
5.4
5.1
Visual.intensity
3.5
3.3
3.0
3.9
3.4
7.9
3.5
3.0
3.9
3.8
Aroma.persistency
…
…
…
…
…
…
…
…
…
…
Astringency
5.7
5.3
5.3
3.6
3.5
3.3
1.0
2.5
3.8
2.7
Aroma.intensity
2.4
3.1
4.0
2.4
3.1
0.7
0.7
0.5
0.8
0.9
Acidity
4.3
4.4
5.1
4.3
5.6
3.9
2.1
5.1
5.1
4.1
Bitterness
…
Sweetness
Michaud
Renaudie
Trotignon
Buisse Domaine
Buisse Cristal
Aub Silex
Aub Marigny
Font Domaine
Font Brûlés
Font Coteaux
O.citrus
S
S
S
S
S
V
V
V
V
V
O.fruity
• 27 continuous variables: sensory descriptors
O.passion
Data
6.0
5.4
5.0
5.3
6.1
5.0
5.1
4.4
4.4
6.0
5.0
5.5
5.5
4.6
5.0
5.5
4.1
5.1
6.4
5.7
Sauvignon
Sauvignon
Sauvignon
Sauvignon
Sauvignon
Vouvray
Vouvray
Vouvray
Vouvray
Vouvray
45 / 58
Data
Common Structure
Groups Study
Partial Analyses
To go further
Example
Back to the wine example!
• 3 panels (oenologists, naive consumers, our students!)
• 60 preference scores: taste evaluation 1 - 10
Continuous variables
Categorical
Expert
(27)
Consu
mer
(15)
Student
(15)
Preference
(60)
Label
(1)
wine 1
wine 2
…
wine 10
• How are the products described by the panels?
• Do the panels describe the products in a same way? Is there a
specific description done by one panel?
46 / 58
Data
Common Structure
Groups Study
Partial Analyses
To go further
Example
Practice with R
1
Define groups of active and supplementary variables
2
Scale or not the variables
3
Perform MFA
4
Choose the number of dimensions to interpret
5
Simultaneously interpret the individuals and variables graphs
6
Study the groups of variables
7
Study the partial representations
8
Use indicators to enrich the interpretation
47 / 58
Data
Common Structure
Groups Study
Partial Analyses
To go further
Example
Practice with R
library(FactoMineR)
Expert <- read.table("http://factominer.free.fr/docs/Expert_wine.csv",
header=TRUE, sep=";", row.names=1)
Consu <- read.table(".../Consumer_wine.csv",header=T,sep=";",row.names=1)
Stud <- read.table(".../Student_wine.csv",header=T,sep=";",row.names=1)
Pref <- read.table(".../Preference_wine.csv",header=T,sep=";",row.names=1)
palette(c("black","red","blue","orange","darkgreen","maroon","darkviolet"))
complet <- cbind.data.frame(Expert[,1:28],Consu[,2:16],Stud[,2:16],Pref)
res.mfa <- MFA(complet,group=c(1,27,15,15,60),type=c("n",rep("s",4)),
num.group.sup=c(1,5),graph=FALSE,
name.group=c("Label","Expert","Consumer","Student","Preference"))
plot(res.mfa,choix="group",palette=palette())
plot(res.mfa,choix="var",invisible="quanti.sup",hab="group",palette=palette())
plot(res.mfa,choix="ind",partial="all",habillage="group",palette=palette())
plot(res.mfa,choix="axes",habillage="group",palette=palette())
dimdesc(res.mfa)
write.infile(res.mfa,file="my_wine_results.csv") #to export a list
48 / 58
Data
Common Structure
Groups Study
Partial Analyses
To go further
Example
Representation of the individuals
S Trotignon
S Renaudie
S Michaud
1
Sauvignon
Vouvray
Sauvignon
V Aub Marigny
0
V Font Coteaux
S Buisse Domaine
well separated
• Vouvray are
Vouvray
sensorially more
different
V Font Domaine
V Font Brûlés
-1
• The two labels are
-2
• Several groups of
V Aub Silex
wines, ...
-3
Dim 2 (24.42 %)
S Buisse Cristal
-2
-1
0
1
2
3
Dim 1 (42.52 %)
49 / 58
Common Structure
Groups Study
Partial Analyses
To go further
Example
0.0
Expert
Consumer
Student
O.Intensity.before.shaking_S
Attack.intensity Expression
O.Intensity.before.shaking
Acidity
Freshness O.Intensity.after.shaking
O.passion
O.Intensity.before.shaking_C
O.Intensity.after.shaking_S
O.plante_C
O.Intensity.after.shaking_C
O.passion_S
Astringency_S
O.passion_CO.flower
A.persistency Bitterness
O.citrus
Astringency_CBitterness_C
Balance_S
A.intensity
Acidity_S
O.plante
Smoothness
Acidity_C A.intensity_C
O.mushroom_S Bitterness_S
A.alcohol_C
O.fruity
O.Typicity_S
A.intensity_S
O.vanilla
O.wooded
Typical_S Balance_C
O.mushroom_C A.alcohol_S
O.alcohol
Typical_C
Astringency
O.Typicity_C
Visual.intensity
Grade
Surface.feeling
O.candied.fruit
Sweetness_C
-0.5
Dim 2 (24.42 %)
0.5
1.0
Representation of the active variables
Sweetness_S
O.mushroom
Oxidation
Typicity O.alcohol_C
O.plante_S O.alcohol_S
Sweetness
-1.0
Data
-1.5
-1.0
-0.5
0.0
0.5
1.0
1.5
Dim 1 (42.52 %)
50 / 58
Common Structure
Groups Study
Partial Analyses
To go further
Example
Expert
Consumer
Student
0.0
Acidity_S
Acidity_C
Sweetness_C
-0.5
Dim 2 (24.42 %)
Acidity
O.passion
O.passion_S
O.passion_C
0.5
1.0
Representation of the active variables
Sweetness_S
Sweetness
-1.0
Data
-1.5
-1.0
-0.5
0.0
0.5
1.0
1.5
Dim 1 (42.52 %)
50 / 58
Data
Common Structure
Groups Study
Partial Analyses
To go further
Example
Representation of the groups
more close that they
induce the same
structure
Expert
0.6
Preference
Student
Label
0.4
Consumer
• The 1st dimension is
common to all the
panels
0.2
• 2nd dimension mainly
due to the experts
0.0
Dim 2 (24.42 %)
0.8
1.0
• 2 groups are all the
0.0
0.2
0.4
0.6
Dim 1 (42.52 %)
0.8
1.0
• Preference linked to
sensory description
51 / 58
Common Structure
Groups Study
Partial Analyses
To go further
Example
Representation of the partial points
2
Expert
Consumer
Student
S Trotignon
S Renaudie
1
S Michaud
V Aub Marigny V Font Coteaux
0
S Buisse Cristal
S Buisse Domaine
Vouvray
-1
V Font Domaine
V Font Brûlés
-2
Dim 2 (24.42 %)
Sauvignon
V Aub Silex
-3
Data
-4
-2
0
2
4
Dim 1 (42.52 %)
52 / 58
Data
Common Structure
Groups Study
Partial Analyses
To go further
Example
Representation of the partial dimensions
Dim2.Student
Dim2.Consumer
Dim2.Expert
Dim1.Label
0.5
1.0
Expert
Consumer
Student
Preference
Label
0.0
-0.5
Dim1.Student
Dim1.Expert
Dim2.Preference
Dim1.Preference
• The two first
dimensions of each
group are well projected
• Consumer has same
dimensions as MFA
-1.0
Dim 2 (24.42 %)
Dim1.Consumer
-1.5
-1.0
-0.5
0.0
0.5
1.0
Dim 1 (42.52 %)
53 / 58
Common Structure
Groups Study
Partial Analyses
To go further
Example
0.0
-0.5
Dim 2 (24.42 %)
0.5
1.0
Representation of supplementary continuous variables
-1.0
Data
-1.0
-0.5
0.0
0.5
1.0
Dim 1 (42.52 %)
⇒ Preferences do not participated to the construction of the
dimensions
⇒ Preferences are linked to sensory description
54 / 58
Data
Common Structure
Groups Study
Partial Analyses
To go further
Example
Representation of supplementary continuous variables
1.0
S Trotignon
S Renaudie
S Michaud
Expert
Consumer
Student
0.0
-0.5
Sweetness_C
Sweetness_S
0
-2
-1
V Font Domaine
V Font Brûlés
Acidity_S
Acidity_C
Sweetness
-1.0
V Aub Silex
-3
Dim 2 (24.42 %)
V Font Coteaux
Vouvray
Dim 2 (24.42 %)
V Aub Marigny
S Buisse Cristal
S Buisse Domaine
0.5
Sauvignon
Acidity
O.passion
O.passion_S
O.passion_C
1
Sauvignon
Vouvray
-2
-1
0
1
Dim 1 (42.52 %)
2
3
-1.5
-1.0
-0.5
0.0
0.5
1.0
1.5
Dim 1 (42.52 %)
54 / 58
Data
Common Structure
Groups Study
Partial Analyses
To go further
Example
Representation of supplementary continuous variables
⇒ Main information: the favourite is Vouvray Aubuisières Silex
54 / 58
Data
Common Structure
Groups Study
Partial Analyses
To go further
Example
Helps to interpret
• Contribution of each group of variables to each component of
the MFA
> res.mfa$group$contrib
Dim.1 Dim.2 Dim.3
Expert
30.5 46.0 33.7
Consumer 33.2 23.1 31.2
Student
36.3 30.9 35.1
• Similar contribution of the 3 groups
to the first dimension
• Second dimension mainly due to the
expert
• Correlation between the global cloud and each partial cloud
> res.mfa$group$correlation
Dim.1 Dim.2 Dim.3
Expert
0.95 0.95 0.96
Consumer 0.95 0.83 0.87
Student
0.99 0.99 0.84
First components are highly linked to
the 3 groups: the 3 clouds of points
are nearly homothetic
55 / 58
Data
Common Structure
Groups Study
Partial Analyses
To go further
Example
Similarity measures between groups
> res.mfa$group$Lg
Expert Consumer Student Preference Label MFA
Expert
1.45
0.94
1.17
1.01 0.89 1.33
Consumer
0.94
1.25
1.04
1.11 0.28 1.21
Student
1.17
1.04
1.29
1.03 0.62 1.31
Preference
1.01
1.11
1.03
1.47 0.37 1.18
Label
0.89
0.28
0.62
0.37 1.00 0.67
MFA
1.33
1.21
1.31
1.18 0.67 1.44
> res.mfa$group$RV
Expert Consumer Student Preference Label MFA
Expert
1.00
0.70
0.85
0.69 0.74 0.92
Consumer
0.70
1.00
0.82
0.82 0.25 0.90
Student
0.85
0.82
1.00
0.75 0.55 0.96
Preference
0.69
0.82
0.75
1.00 0.31 0.81
Label
0.74
0.25
0.55
0.31 1.00 0.56
MFA
0.92
0.90
0.96
0.81 0.56 1.00
• Expert gives a richer description (Lg greater)
• Groups Student and Expert are linked (RV = 0.85)
• Group Student is the closest to the overall (RV = 0.96)
56 / 58
Common Structure
Groups Study
Partial Analyses
To go further
Example
Partition from the tree
An empirical number of clusters is suggested (minq
Wq −Wq+1
Wq−1 −Wq )
0.0 0.5 1.0 1.5 2.0
Hierarchical Clustering
2.0
Hierarchical Classification
V Font Coteaux
V Aub Marigny
V Font Domaine
V Font Brûlés
S Buisse Cristal
S Buisse Domaine
S Michaud
S Renaudie
S Trotignon
0.0
0.5
1.0
1.5
inertia gain
V Aub Silex
Data
57 / 58
Common Structure
Groups Study
Partial Analyses
To go further
Example
1
2
Partition on the principal component map
cluster
cluster
cluster
cluster
cluster
1
2
3
4
5
S Trotignon
S Renaudie
S Michaud
S Buisse Cristal
V Aub Marigny
0
S Buisse Domaine
-1
V Font Domaine
V Font Brûlés
-2
Dim 2 (24.42%)
V Font Coteaux
V Aub Silex
-3
Data
-2
-1
0
1
Dim 1 (42.52%)
2
3
4
Continuous vision (principal component) and discontinuous
(clusters)
58 / 58