Multivariate_Cholesky_DeMoor_Boulder2012

Introduction to Multivariate Genetic Analysis (2)
Marleen de Moor, Kees-Jan Kan & Nick Martin
March 7, 2012
M. de Moor, Twin Workshop Boulder
Outline
• 11.00-12.30
– Lecture Bivariate Cholesky Decomposition
– Practical Bivariate analysis of IQ and attention problems
• 12.30-13.30 LUNCH
• 13.30-15.00
– Lecture Multivariate Cholesky Decomposition
– PCA versus Cholesky
– Practical Tri- and Four-variate analysis of IQ, educational attainment and attention problems
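Bivariate Cholesky
[Path diagram: twin 1, phenotypes 1 and 2. Latent factors A1, A2, C1, C2, E1, E2, each with variance 1. Paths: a11, a21, a22; c11, c21, c22; e11, e21, e22.]
Path coefficient matrices (rows: phenotypes P1, P2; columns: factors 1, 2):
\[
a = \begin{pmatrix} a_{11} & 0 \\ a_{21} & a_{22} \end{pmatrix}, \quad
c = \begin{pmatrix} c_{11} & 0 \\ c_{21} & c_{22} \end{pmatrix}, \quad
e = \begin{pmatrix} e_{11} & 0 \\ e_{21} & e_{22} \end{pmatrix}
\]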
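Adding more phenotypes…
[Path diagram: twin 1, phenotypes 1 to 3. Latent factors A1 to A3, C1 to C3, E1 to E3, each with variance 1. Paths: a11, a21, a22, a31, a32, a33; c11, c21, c22, c31, c32, c33; e11, e21, e22, e31, e32, e33.]
Path coefficient matrices (rows: phenotypes P1 to P3; columns: factors 1 to 3):
\[
a = \begin{pmatrix} a_{11} & 0 & 0 \\ a_{21} & a_{22} & 0 \\ a_{31} & a_{32} & a_{33} \end{pmatrix}, \quad
c = \begin{pmatrix} c_{11} & 0 & 0 \\ c_{21} & c_{22} & 0 \\ c_{31} & c_{32} & c_{33} \end{pmatrix}, \quad
e = \begin{pmatrix} e_{11} & 0 & 0 \\ e_{21} & e_{22} & 0 \\ e_{31} & e_{32} & e_{33} \end{pmatrix}
\]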
Adding more phenotypes…
[Path diagram: twin 1, phenotypes 1 to 4. Latent factors A1 to A4, C1 to C4, E1 to E4, each with variance 1. Paths: a11 to a44, c11 to c44, e11 to e44.]
Path coefficient matrices (rows: phenotypes P1 to P4; columns: factors 1 to 4):
\[
a = \begin{pmatrix} a_{11} & 0 & 0 & 0 \\ a_{21} & a_{22} & 0 & 0 \\ a_{31} & a_{32} & a_{33} & 0 \\ a_{41} & a_{42} & a_{43} & a_{44} \end{pmatrix}, \quad
c = \begin{pmatrix} c_{11} & 0 & 0 & 0 \\ c_{21} & c_{22} & 0 & 0 \\ c_{31} & c_{32} & c_{33} & 0 \\ c_{41} & c_{42} & c_{43} & c_{44} \end{pmatrix}, \quad
e = \begin{pmatrix} e_{11} & 0 & 0 & 0 \\ e_{21} & e_{22} & 0 & 0 \\ e_{31} & e_{32} & e_{33} & 0 \\ e_{41} & e_{42} & e_{43} & e_{44} \end{pmatrix}
\]
Trivariate Cholesky
[Path diagram: twin 1 and twin 2, phenotypes 1 to 3. Each twin has latent factors A1 to A3, C1 to C3 and E1 to E3, each with variance 1, with paths a11 to a33, c11 to c33 and e11 to e33. Cross-twin correlations: 1/0.5 (MZ/DZ) between corresponding A factors, 1 between corresponding C factors.]
What to change in OpenMx script?
OpenMx
Vars <- c('varx', 'vary', 'varz')
nv <- 3
# or, even more efficiently: nv <- length(Vars)
…
# Matrices a, c, and e to store a, c, and e path coefficients
mxMatrix( type="Lower", nrow=nv, ncol=nv, free=TRUE, values=.6, name="a" ),
mxMatrix( type="Lower", nrow=nv, ncol=nv, free=TRUE, values=.6, name="c" ),
mxMatrix( type="Lower", nrow=nv, ncol=nv, free=TRUE, values=.6, name="e" ),
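A minimal base-R sketch (no OpenMx required) of what the three lower-triangular matrices imply for the expected covariance matrices. The numeric path coefficients and the helper name `lower_matrix` are hypothetical, chosen only for illustration; the MZ/DZ cross-twin blocks assume the 1/0.5 correlation between A factors and the correlation of 1 between C factors shown in the path diagrams.

```r
# Hypothetical path coefficients for nv = 3 phenotypes (illustrative values only).
nv <- 3

# Helper (hypothetical name): fill a lower-triangular nv x nv matrix column by column,
# i.e. in the order a11, a21, a31, a22, a32, a33.
lower_matrix <- function(vals, nv) {
  m <- matrix(0, nv, nv)
  m[lower.tri(m, diag = TRUE)] <- vals
  m
}

a_path <- lower_matrix(c(0.7, 0.3, 0.2, 0.6, 0.1, 0.5), nv)
c_path <- lower_matrix(c(0.4, 0.2, 0.1, 0.3, 0.1, 0.2), nv)
e_path <- lower_matrix(c(0.5, 0.1, 0.1, 0.6, 0.2, 0.7), nv)

# Variance components: each is the lower-triangular matrix times its transpose.
A <- a_path %*% t(a_path)
C <- c_path %*% t(c_path)
E <- e_path %*% t(e_path)
V <- A + C + E  # expected within-person covariance matrix

# Expected covariance matrices for twin pairs (twin 1 variables, then twin 2 variables).
expCovMZ <- rbind(cbind(V, A + C),       cbind(A + C, V))
expCovDZ <- rbind(cbind(V, 0.5 * A + C), cbind(0.5 * A + C, V))
```

Changing `nv` and the number of starting values is all that is needed to move from the bivariate to the tri- or four-variate case in this sketch, which mirrors the point of the slide.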
Standardized solution – 3 pheno’s
[Path diagram: twin 1 and twin 2, phenotypes 1 to 3. Each phenotype loads only on its own A, C and E factor (paths a11, a22, a33; c11, c22, c33; e11, e22, e33), each factor with variance 1. Cross-twin correlations: 1/0.5 (MZ/DZ) between corresponding A factors, 1 between corresponding C factors.]
Genetic correlations
OpenMx
corA <- mxAlgebra(name ="rA", expression = solve(sqrt(I*A))%*%A%*%solve(sqrt(I*A)))
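2x2 case:
\[
r_{g,12} = \frac{a_{21}a_{11}}{\sqrt{a_{11}^2 \,(a_{21}^2 + a_{22}^2)}}
\]
3x3 case:
\[
\begin{pmatrix} 1 & r_{g,12} & r_{g,13} \\ r_{g,12} & 1 & r_{g,23} \\ r_{g,13} & r_{g,23} & 1 \end{pmatrix}
=
\begin{pmatrix} \frac{1}{\sqrt{a_{11}^2}} & 0 & 0 \\ 0 & \frac{1}{\sqrt{a_{21}^2 + a_{22}^2}} & 0 \\ 0 & 0 & \frac{1}{\sqrt{a_{31}^2 + a_{32}^2 + a_{33}^2}} \end{pmatrix}
\begin{pmatrix} a_{11}^2 & a_{11}a_{21} & a_{11}a_{31} \\ a_{11}a_{21} & a_{21}^2 + a_{22}^2 & a_{21}a_{31} + a_{22}a_{32} \\ a_{11}a_{31} & a_{21}a_{31} + a_{22}a_{32} & a_{31}^2 + a_{32}^2 + a_{33}^2 \end{pmatrix}
\begin{pmatrix} \frac{1}{\sqrt{a_{11}^2}} & 0 & 0 \\ 0 & \frac{1}{\sqrt{a_{21}^2 + a_{22}^2}} & 0 \\ 0 & 0 & \frac{1}{\sqrt{a_{31}^2 + a_{32}^2 + a_{33}^2}} \end{pmatrix}
\]
\[
r_{g,13} = \frac{a_{31}a_{11}}{\sqrt{a_{11}^2 \,(a_{31}^2 + a_{32}^2 + a_{33}^2)}}
\]
\[
r_{g,23} = \frac{a_{21}a_{31} + a_{22}a_{32}}{\sqrt{(a_{31}^2 + a_{32}^2 + a_{33}^2)\,(a_{21}^2 + a_{22}^2)}}
\]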
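The mxAlgebra expression for rA can be checked outside OpenMx with plain matrix algebra. The sketch below uses hypothetical path coefficients (illustrative values only) and compares the matrix expression with base R's `cov2cor()` and with the closed-form expression for the genetic correlation between phenotypes 1 and 2.

```r
# Hypothetical lower-triangular genetic path coefficients (illustrative values only).
# matrix() fills column by column, so the columns are (a11, a21, a31), (0, a22, a32), (0, 0, a33).
a_path <- matrix(c(0.7, 0.3, 0.2,
                   0.0, 0.6, 0.1,
                   0.0, 0.0, 0.5), nrow = 3)

A <- a_path %*% t(a_path)  # additive genetic covariance matrix
I <- diag(3)               # identity matrix; I * A keeps only the diagonal of A

# Same expression as in the mxAlgebra: standardize A by the genetic standard deviations.
rA <- solve(sqrt(I * A)) %*% A %*% solve(sqrt(I * A))

# Base R's cov2cor() performs the same standardization.
rA_check <- cov2cor(A)

# Closed-form genetic correlation between phenotypes 1 and 2.
a11 <- a_path[1, 1]; a21 <- a_path[2, 1]; a22 <- a_path[2, 2]
rg12 <- (a21 * a11) / sqrt(a11^2 * (a21^2 + a22^2))
```

With these inputs `rA[1, 2]`, `rA_check[1, 2]` and `rg12` agree up to floating-point error.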
The order of variables
• Order of variables does not matter for the solution!
– Fit is identical, just different parameterization
– Standardized solutions are identical in terms of fit and parameter estimates!
• But interpretation of A/C/E variance components is different!
– A2 refers to those genetic factors that are not shared with phenotype 1
• Sometimes there is a natural ordering:
– Temporal ordering (IQ at 2 time points)
– Neuroticism and MDD symptoms
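A small base-R sketch of why the order of variables changes the parameterization but not the solution. The genetic covariance matrix below is hypothetical (illustrative values only), and `chol_lower` is a helper name introduced here; base R's `chol()` returns the upper-triangular factor, so it is transposed to get the lower-triangular path matrix.

```r
# Hypothetical genetic covariance matrix for three phenotypes (illustrative values only).
A <- matrix(c(0.49, 0.21, 0.14,
              0.21, 0.45, 0.12,
              0.14, 0.12, 0.30), nrow = 3)

# Helper (hypothetical name): lower-triangular Cholesky factor L with S = L %*% t(L).
chol_lower <- function(S) t(chol(S))

a_123 <- chol_lower(A)              # variables in order 1, 2, 3
perm  <- c(3, 2, 1)                 # reversed order
a_321 <- chol_lower(A[perm, perm])  # variables in order 3, 2, 1

# Both orderings reproduce exactly the same covariance matrix ...
A_from_123 <- a_123 %*% t(a_123)
A_from_321 <- (a_321 %*% t(a_321))[order(perm), order(perm)]

# ... and therefore the same genetic correlations,
rA_123 <- cov2cor(A_from_123)
rA_321 <- cov2cor(A_from_321)

# but the path coefficients themselves differ, e.g. the first diagonal element:
first_path_123 <- a_123[1, 1]  # square root of the genetic variance of phenotype 1
first_path_321 <- a_321[1, 1]  # square root of the genetic variance of phenotype 3
```

In the first ordering the second factor carries the genetic variance of phenotype 2 not shared with phenotype 1; in the reversed ordering it carries what phenotype 2 does not share with phenotype 3, which is why the interpretation of the factors depends on the order.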
Cholesky decomposition is not a model…
• No constraints on covariance matrices
• Just reparameterization…
• …But very useful to explore the data!
• Observed statistics = Number of parameters
Cholesky decomposition is not a model…
• Bivariate constrained saturated model: 9 observed statistics
– 2 variances, 1 within-twin covariance MZ=DZ
– 2 within-trait cross-twin covariances MZ
– 1 cross-trait cross-twin covariance MZ
– 2 within-trait cross-twin covariances DZ
– 1 cross-trait cross-twin covariance DZ
• Bivariate Cholesky decomposition: 9 parameters
– a11, a21, a22
– c11, c21, c22
– e11, e21, e22
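The bivariate count generalizes to any number of phenotypes. The sketch below is a hedged illustration, assuming (as on the slide) that means are left out of the count and that the cross-twin blocks are constrained to be symmetric; the function name `count_statistics` is hypothetical.

```r
# Count observed (co)variance statistics of the constrained saturated model and
# free parameters of the ACE Cholesky decomposition for nv phenotypes.
# Assumption: means are not counted; cross-twin covariance blocks are symmetric.
count_statistics <- function(nv) {
  per_matrix <- nv * (nv + 1) / 2  # unique elements of a symmetric or lower-triangular nv x nv matrix
  observed <- c(within_twin   = per_matrix,  # variances and within-twin covariances (MZ = DZ)
                cross_twin_MZ = per_matrix,  # cross-twin (co)variances, MZ
                cross_twin_DZ = per_matrix)  # cross-twin (co)variances, DZ
  parameters <- c(a = per_matrix, c = per_matrix, e = per_matrix)
  c(observed_statistics = sum(observed), parameters = sum(parameters))
}

# Bivariate, trivariate and four-variate cases.
counts <- sapply(c(bivariate = 2, trivariate = 3, fourvariate = 4), count_statistics)
```

For every `nv` the two counts are equal, which is the sense in which the Cholesky decomposition only reparameterizes the covariance matrices.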
Comparison with other models
Cholesky decomposition models
Principal component analysis → Sanja, now
Genetic factor models → Hermine, after coffee break
Confirmatory factor models → Dorret, Sanja, Michel, this morning
Further reading
Four classic papers:
• Martin NG, Eaves LJ: The genetical analysis of covariance structure. Heredity 38:79-95, 1977
• Carey G: Inference about genetic correlations. Behavior Genetics, 1988
• Loehlin J: The Cholesky approach: a cautionary note. Behavior Genetics, 1996
• Carey G: Cholesky problems. Behavior Genetics, 2005
SEE ALSO:
http://genepi.qimr.edu.au/staff/classicpapers/
Practical
• Trivariate ACE Cholesky model
• 126 MZ and 126 DZ twin pairs from Netherlands Twin Register
• Age 12
• Educational achievement (EA)
• FSIQ
• Attention Problems (AP) [mother-report]
Practical
• Script CholeskyTrivariate.R
• Dataset Cholesky.dat
Exercise
• Add Educational Achievement as the first of the 3 variables
• Run the saturated model, ACE model and AE model
• Question: Can we drop C?

            -2LL   df   chi2   ∆df   P-value
ACE model                -      -     -
AE model
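The chi2, ∆df and P-value columns of such a table come from a likelihood-ratio test between nested models. The sketch below shows one way to compute them in base R; the function name `lr_test` and all numbers in the usage line are hypothetical placeholders, not results from the practical.

```r
# Likelihood-ratio test between a full model and a nested submodel.
# Inputs are the -2 log-likelihoods and degrees of freedom reported for each model.
lr_test <- function(m2ll_full, df_full, m2ll_sub, df_sub) {
  chi2  <- m2ll_sub - m2ll_full  # difference in -2LL
  delta <- df_sub - df_full      # number of parameters dropped
  p     <- pchisq(chi2, df = delta, lower.tail = FALSE)
  data.frame(chi2 = chi2, delta_df = delta, p_value = p)
}

# Hypothetical numbers for illustration only: dropping the 6 elements of the
# 3 x 3 lower-triangular c matrix gives a difference of 6 degrees of freedom.
example <- lr_test(m2ll_full = 4000.0, df_full = 1500,
                   m2ll_sub  = 4005.2, df_sub  = 1506)
```

A non-significant P-value would suggest that the dropped parameters can be removed without a significant loss of fit.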
Exercise
• Run 4 submodels
– Submodel 1: drop rg between EA and AP
– Submodel 2: drop rg between FSIQ and AP
– Submodel 3: drop re between EA and AP
– Submodel 4: drop re between FSIQ and AP
• Compare fit of each submodel with full AE model
Exercise
• Questions:
– Can we drop rg between EA and AP?
– Can we drop rg between FSIQ and AP?
– Can we drop re between EA and AP?
– Can we drop re between FSIQ and AP?

            -2LL   df   chi2   ∆df   P-value
AE model                 -      -     -
No a31
No a32
No e31
No e32
Extra exercise
• Replace FSIQ by VIQ and PIQ, and run a four-variate Cholesky model.
• Questions:
– Is AP differentially related to VIQ and PIQ, phenotypically and genotypically?