lecture - Wellcome Trust Centre for Human Genetics

Current Research Areas in Complex Trait Genetics
Richard Mott
December 2014
Overview
- Heritability and Liability
- Polygenic Scores
- Estimating SNP effects jointly
- Genomic Prediction
Genetic Relationship Matrices (Reminder)
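- At SNP $q$, the genotype of individual $i$ is $g_{iq} = 0, \tfrac{1}{2}, 1$; this coding assumes effects are additive at $q$
- Genetic correlation across all loci:
  $k_{ij} = \frac{\sum_q (g_{iq} - \bar{g}_q)(g_{jq} - \bar{g}_q)}{\sqrt{\sum_q (g_{iq} - \bar{g}_q)^2 \, \sum_q (g_{jq} - \bar{g}_q)^2}}$
- $K = (k_{ij}) = ZZ'$, where $Z$ has one row per individual and one column per SNP
- $Z_{iq} = \frac{g_{iq} - \bar{g}_q}{\sqrt{\sum_q (g_{iq} - \bar{g}_q)^2}}$ is the standardised genotype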
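A minimal sketch of the relationship matrix defined above, in plain Python. The function names and the toy genotype table are illustrative assumptions, not part of the lecture; real analyses would use a dedicated tool and far more SNPs.

```python
import math


def standardise_genotypes(genotypes):
    """Return Z with Z[i][q] = (g_iq - gbar_q) / sqrt(sum_q (g_iq - gbar_q)^2).

    genotypes: list of rows (one per individual) of dosages coded 0, 0.5, 1.
    """
    n = len(genotypes)
    n_snps = len(genotypes[0])
    # Mean genotype at each SNP, taken across individuals.
    means = [sum(row[q] for row in genotypes) / n for q in range(n_snps)]
    z = []
    for row in genotypes:
        centred = [g - m for g, m in zip(row, means)]
        norm = math.sqrt(sum(c * c for c in centred))
        if norm == 0.0:
            raise ValueError("individual has no deviation from the SNP means")
        z.append([c / norm for c in centred])
    return z


def relationship_matrix(genotypes):
    """K = Z Z': k_ij is the correlation of centred genotypes of i and j."""
    z = standardise_genotypes(genotypes)
    n = len(z)
    return [[sum(a * b for a, b in zip(z[i], z[j])) for j in range(n)]
            for i in range(n)]


# Hypothetical toy data: 4 individuals, 4 SNPs.
genotypes = [
    [0.0, 0.5, 1.0, 0.5],
    [0.0, 0.5, 1.0, 0.0],
    [1.0, 0.0, 0.0, 0.5],
    [0.5, 1.0, 0.5, 1.0],
]
K = relationship_matrix(genotypes)
for row in K:
    print(["%.3f" % k for k in row])
```

With this standardisation every diagonal entry of K is 1 and every off-diagonal entry lies between -1 and 1, as expected for a correlation.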
Heritability and Liability
- We know how to compute the heritability of a quantitative trait as $h^2 = \frac{\sigma_g^2}{\sigma_g^2 + \sigma_e^2}$
- How do we compute the heritability of a dichotomous trait?
- We could simply treat the disease status 0/1 as if it were a quantitative trait, using the mixed model machinery to estimate heritability. But this results in problems:
  - Scale. For quantitative traits the scale of measurement is the same as the scale on which heritability is expressed. For dichotomous traits, phenotypes are measured on the 0/1 scale, but heritability is most interpretable on the liability scale.
  - Ascertainment. In case-control studies the proportion P of cases is usually (much) larger than the prevalence K in the population, yet estimates of genetic variation are most interpretable if they are not biased by this ascertainment.
Liability
[Figure 1 from Lee et al. (2011): The Liability Threshold Model for a Disease Prevalence of K. An underlying continuous random variable determines disease status. If liability exceeds the threshold t, then individuals are affected.]

Recoverable steps of the derivation shown on the slide:
- Liability model: $l = \mu 1_N + g + e$ (Equation 10), where $l$ is a vector of liability phenotypes
- Covariance between disease status $y$ (0/1) and liability $l$: $\mathrm{cov}(y, l) = E(y l) - E(y)E(l) = K i = z$, where $i$ is the mean liability of cases and $z$ is the height of the standard normal probability density function at the truncation threshold $t$
- Genetic value on the observed 0-1 scale: $u = c + bg = c + zg$, where $c$ is a constant and $b = \mathrm{cov}(y, g)/\sigma_g^2 = z$
- Heritability on the observed scale: $h_o^2 = \sigma_u^2/[K(1-K)] = \sigma_g^2 b^2/[K(1-K)] = h_l^2 z^2/[K(1-K)]$, since $\sigma_u^2 = \mathrm{var}(zg) = z^2 \sigma_g^2$ and the variance of the 0-1 observations is the Bernoulli variance $K(1-K)$
- Rearranged to transform the heritability on the observed scale to that on the liability scale: $h_l^2 = h_o^2 K(1-K)/z^2$
ARTICLE
Estimating Missing Heritability for Disease from Genome-wide Association Studies
Sang Hong Lee,1 Naomi R. Wray,1 Michael E. Goddard,2,3 and Peter M. Visscher1,*
The American Journal of Human Genetics 88, 294–305, March 11, 2011. DOI 10.1016/j.ajhg.2011.02.002. ©2011 The American Society of Human Genetics. All rights reserved.
1 Queensland Institute of Medical Research, 300 Herston Rd, Herston, Queensland 4006, Australia; 2 Biosciences Research Division, Department of Primary Industries, Melbourne, Victoria 3086, Australia; 3 Department of Agriculture and Food Systems, University of Melbourne, Melbourne, Victoria 3010, Australia

Genome-wide association studies are designed to discover SNPs that are associated with a complex trait. Employing strict significance thresholds when testing individual SNPs avoids false positives at the expense of increasing false negatives. Recently, we developed a method for quantitative traits that estimates the variation accounted for when fitting all SNPs simultaneously. Here we develop this method further for case-control studies. We use a linear mixed model for analysis of binary traits and transform the estimates to a liability scale by adjusting both for scale and for ascertainment of the case samples. We show by theory and simulation that the method is unbiased. We apply the method to data from the Wellcome Trust Case Control Consortium and show that a substantial proportion of variation in liability for Crohn disease, bipolar disorder, and type I diabetes is tagged by common SNPs.

Introduction (excerpt; the page image is cropped, so some sentences are incomplete)
Heritability is a general and key population parameter that can help understand the genetic architecture of complex traits. It is usually defined as the proportion of total phenotypic variation that is due to additive genetic factors [1]. Methods of obtaining unbiased estimates of heritability from pedigree data are well established for continuous phenotypes, for example (restricted) maximum likelihood for linear mixed models (LMM) [2–5]. For binary traits, such as disease, familial resemblance is usually parameterized on an unobserved continuous liability scale so that the heritability is independent of disease prevalence [6]. With genome-wide genotype data, we can derive estimates of genetic variance tagged by the SNPs from samples of individuals who are unrelated in the conventional sense [7].

[...] explanations for these observations are that either the effect sizes at individual SNPs are so small that they do not reach genome-wide significance in GWAS or that causal variants are not in sufficient LD with SNPs on the commercial arrays to be detected by association [7,12]. For example, insufficient LD could arise if causal variants have lower minor allele frequency (MAF) than genotyped SNPs. To test these hypotheses, we recently developed a method to estimate the proportion of variance explained by all SNPs in GWAS for a quantitative trait [7]. We showed that a substantial proportion of genetic variation for human height was associated with common SNPs. For complex diseases it would be very useful to apply the same estimation procedure to case-control GWAS data. However, there are three issues that need to be overcome to be able to estimate genetic variance for disease without bias and with computationally fast algorithms:
(1) Scale. For quantitative traits the scale of measurement is the same as the scale on which heritability is expressed. For disease traits, the phenotypes (case-control status) are measured on the 0–1 scale, but heritability is most interpretable on a scale of liability.
(2) Ascertainment. In case-control studies the proportion of cases is usually (much) larger than the prev[...]
[...] artificial case-control differences could be partitioned as "heritability" in methods that utilize genome-wide similarity within and differences between cases and controls.

- We distinguish between heritability on the 0-1 observed scale, $h_o^2$, and on the Normally distributed liability scale, $h_l^2$
- We relate these two heritabilities by $h_l^2 = h_o^2 K(1-K)/z^2$
- $K$ is disease prevalence in the population
- Individual is a case if unobserved liability score exceeds $t(K)$
- $z(K)$ is height of N(0, 1) distribution at threshold $t(K)$
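The transformation $h_l^2 = h_o^2 K(1-K)/z^2$ can be sketched in a few lines of Python using only the standard library. The function names are illustrative assumptions; the threshold $t(K)$ is taken as the upper-tail quantile of the standard normal and $z(K)$ as the density at that point, as described above.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist(mu=0.0, sigma=1.0)


def liability_threshold(prevalence):
    """t(K): liability above which an individual is a case."""
    return STANDARD_NORMAL.inv_cdf(1.0 - prevalence)


def observed_to_liability(h2_observed, prevalence):
    """h2_l = h2_o * K(1 - K) / z^2, with no ascertainment (P = K)."""
    if not 0.0 < prevalence < 1.0:
        raise ValueError("prevalence must lie strictly between 0 and 1")
    z = STANDARD_NORMAL.pdf(liability_threshold(prevalence))
    return h2_observed * prevalence * (1.0 - prevalence) / z ** 2


# Hypothetical example: observed-scale heritability 0.1, prevalence 1%.
print(observed_to_liability(0.1, 0.01))
```

The scaling factor $K(1-K)/z^2$ is smallest at $K = 0.5$, where it equals $\pi/2$, and grows as the disease becomes rarer, so a modest observed-scale heritability can correspond to a much larger liability-scale value.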
Ascertainment Bias
[Figure 2 from Lee et al. (2011): The Distribution of Liability When Cases Are Oversampled as in a Case-Control Study]

Recoverable results from the page shown on the slide:
- $\mathrm{var}(y_{cc}) = P(1-P)$ is the phenotypic variance on the observed scale in the case-control sample
- $E(l_{cc}) = P i + (1-P) i_2 = i\lambda$, where $i$ and $i_2$ are the mean liabilities of cases and controls and $\lambda = (P-K)/(1-K)$
- $u_{cc} = c + b_{cc} g_{cc} = c + z \frac{P(1-P)}{K(1-K)} \frac{\sigma_g^2}{\sigma_{g_{cc}}^2} g_{cc}$; the term $\frac{P(1-P)}{K(1-K)} \frac{\sigma_g^2}{\sigma_{g_{cc}}^2}$ quantifies the change of the regression coefficient due to ascertainment

- Let $P \sim 0.5$ be the fraction of cases in the case-control sample
- Usually $P > K$, so the sample is enriched for cases
- Let $h_{oc}^2$ be the heritability on the observed scale in a case-control study, estimated in the usual way, e.g. with GCTA
- Adjustment for Ascertainment Bias is $h_l^2 \sim h_{oc}^2 \, \frac{K(1-K)}{z^2} \, \frac{K(1-K)}{P(1-P)}$
- If $P = K$ this reduces to the previous relationship.
- Care must be taken to clean the data of SNPs with different missing rates between cases and controls, which may inflate the apparent heritability
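The ascertainment adjustment above can be sketched as follows, again with only the Python standard library. The function name and the numerical inputs are illustrative assumptions (a case fraction of 0.37 and a prevalence of 0.1% are chosen only to show the order of magnitude of the correction).

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist(mu=0.0, sigma=1.0)


def case_control_to_liability(h2_observed_cc, prevalence, case_fraction):
    """h2_l ~ h2_oc * [K(1-K)/z^2] * [K(1-K)/(P(1-P))].

    prevalence    : K, disease prevalence in the population
    case_fraction : P, proportion of cases in the case-control sample
    """
    for name, value in (("prevalence", prevalence),
                        ("case_fraction", case_fraction)):
        if not 0.0 < value < 1.0:
            raise ValueError(name + " must lie strictly between 0 and 1")
    threshold = STANDARD_NORMAL.inv_cdf(1.0 - prevalence)
    z = STANDARD_NORMAL.pdf(threshold)
    k_var = prevalence * (1.0 - prevalence)
    p_var = case_fraction * (1.0 - case_fraction)
    scale_term = k_var / z ** 2          # observed scale -> liability scale
    ascertainment_term = k_var / p_var   # corrects for over-sampled cases
    return h2_observed_cc * scale_term * ascertainment_term


# Hypothetical inputs: observed-scale estimate 0.61, K = 0.1%, P = 0.37.
print(round(case_control_to_liability(0.61, 0.001, 0.37), 3))
```

When the case fraction equals the prevalence, the ascertainment term is 1 and the function reduces to the unascertained transformation.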
Example: Crohn's Disease

Excerpt from Lee et al. (2011), cropped on the slide:
[...] we conclude that there is no need to make the missing threshold more stringent than 20. On the liability scale, the heritability estimate (i.e., the variance in liability explained by the SNPs) is 0.22 (SE = 0.04), which is much higher than that explained by genome-wide significant SNPs [27]. Similar results are obtained if the SNPs with MAF > 0.05 are used (Table 3). This indicates that common SNPs (MAF > 0.05) are in substantial LD with causal variants for Crohn disease.
For type I diabetes, some SNPs on chromosome 6 had extremely significant associations; for example, WTCCC [13] reported a p value of 5.47e-134 for rs9272346 in the region of the major histocompatibility complex (MHC). We performed an analysis without chromosome 6 or with chromosome 6 only when we used SNPs with an MAF > 0.01 (Table 6). [...]

Table 3. Estimated Genetic Variance on the Observed and Liability Scale Explained by All SNPs for Crohn Disease in WTCCC Data

Threshold(a)  No. SNP(b)  Estimate(c) (SE)  LR     Adjusted(d) (SE)  Transformed(e) (SE)
MAF > 0.01
200           322,142     0.56 (0.07)       63.16  0.64 (0.08)       0.24 (0.03)
20            294,850     0.53 (0.07)       57.48  0.61 (0.08)       0.22 (0.03)
7             248,791     0.52 (0.07)       57.30  0.61 (0.08)       0.22 (0.03)
4             195,977     0.50 (0.07)       54.94  0.60 (0.08)       0.22 (0.03)
MAF > 0.05
200           293,269     0.56 (0.07)       69.00  0.63 (0.08)       0.23 (0.03)
20            266,843     0.53 (0.07)       63.27  0.60 (0.08)       0.22 (0.03)
7             225,043     0.52 (0.07)       63.94  0.60 (0.08)       0.22 (0.03)
4             177,615     0.50 (0.07)       62.14  0.60 (0.08)       0.22 (0.03)

(a) Excluding SNPs with more than the listed number of missing genotypes.
(b) After filtering on the basis of SNP missing rate.
(c) Estimate of genetic variance proportional to the total phenotypic variance on the observed scale.
(d) Estimate adjusted for reduced number of SNPs.
(e) Transformed genetic variance proportional to the total phenotypic variance on the liability scale under the assumption that the population prevalence is 0.1%, i.e. the heritability on the liability scale explained by the SNPs.
Source: The American Journal of Human Genetics 88, 294–305, March 11, 2011, page 300.
Estimating Liability to improve GWAS power
- See LEAP (Weissbrod et al, http://arxiv.org/pdf/1409.2448.pdf)
- Basic idea - estimate the unobserved liability in each individual and use this in a GWAS:
  - Estimate heritability on the liability scale, taking into account ascertainment
  - Estimate the effect of each SNP
  - Using a probit model, compute a liability estimate for every individual
  - Test SNPs for association with estimated liability via a standard LMM
Polygenic Scores
- What can one do when a GWAS finds few or no genome-wide significant associations?
- There may still be some genuine signal among those SNPs with p-values in the range (say) $< 10^{-1}$, but also many false positives
- One solution - compute the genome-wide heritability
- Polygenic Scores are another way forward
Polygenic Scores
- Let $X(t)$ be the set of SNPs with p-values $< t$, thinned to remove duplicate SNPs in strong LD with each other
- Let $g_{ix}$ be the genotype dosage of SNP $x$ in individual $i$
- Let $\beta_x$ be the estimated coefficient of SNP $x$ in the GWAS
- $S_i(t) = \sum_{x \in X(t)} g_{ix} \beta_x$ is the Polygenic Score
- Expect $S_i(t)$ to be high when individual $i$ is affected, low when $i$ is not
- Polygenic Score can predict the phenotype in individuals not used to train it (genomic prediction)
- Polygenic Score can be used to assess genetic correspondence between different diseases measured in different cohorts.
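The score $S_i(t)$ defined above can be sketched directly. The SNP names, effect sizes, p-values and dosages below are invented for illustration, and the sketch assumes the SNP list has already been thinned for LD, which is not implemented here.

```python
def select_snps(p_values, threshold):
    """X(t): SNPs whose discovery p-value is below the threshold t."""
    return [snp for snp, p in p_values.items() if p < threshold]


def polygenic_score(dosages, betas, selected):
    """S_i(t) = sum over x in X(t) of g_ix * beta_x for one individual."""
    return sum(dosages[snp] * betas[snp] for snp in selected)


# Hypothetical discovery-GWAS summary statistics (already LD-thinned).
betas = {"snp1": 0.20, "snp2": -0.10, "snp3": 0.05}
p_values = {"snp1": 0.001, "snp2": 0.30, "snp3": 0.04}

# Hypothetical genotype dosages for one target individual.
dosages = {"snp1": 1.0, "snp2": 0.5, "snp3": 0.0}

for t in (0.05, 0.5):
    chosen = select_snps(p_values, t)
    print(t, chosen, polygenic_score(dosages, betas, chosen))
```

Relaxing the threshold adds more SNPs to the score, which is why scores are usually evaluated at several values of t.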
Vol 460 | 6 August 2009 | doi:10.1038/nature08185
LETTERS
Common polygenic variation contributes to risk of schizophrenia and bipolar disorder
The International Schizophrenia Consortium*

Schizophrenia is a severe mental disorder with a lifetime risk of about 1%, characterized by hallucinations, delusions and cognitive deficits, with heritability estimated at up to 80% [1,2]. We performed a genome-wide association study of 3,322 European individuals with schizophrenia and 3,587 controls. Here we show, using two analytic approaches, the extent to which common genetic variation underlies the risk of schizophrenia. First, we implicate the major histocompatibility complex. Second, we provide molecular genetic evidence for a substantial polygenic component to the risk of schizophrenia involving thousands of common alleles of very small effect. We show that this component also contributes to the risk of bipolar disorder, but not to several non-psychiatric diseases.

We genotyped the International Schizophrenia Consortium (ISC) case-control sample for up to ~1 million single nucleotide polymorphisms (SNPs), augmented by imputed common HapMap SNPs. In the genome-wide association study (GWAS; genomic control $\lambda_{GC} = 1.09$; Supplementary Table 1 and Supplementary Figs 1–3), the most associated genotyped SNP ($P = 3.4 \times 10^{-7}$) was located in the first intron of myosin XVIIIB (MYO18B) on chromosome 22. The second strongest association comprised more than 450 SNPs on chromosome 6p spanning the major histocompatibility complex (MHC; Fig. 1). [...]
The best imputed SNP, which reached genome-wide significance (rs3130297, $P = 4.79 \times 10^{-8}$, T allele odds ratio = 0.747, minor allele frequency (MAF) = 0.114, 32.3 megabases (Mb)), was also in the MHC, 7 kilobases (kb) from NOTCH4, a gene with previously reported associations with schizophrenia [4]. We imputed classical human leukocyte antigen (HLA) alleles; six were significant at $P < 10^{-3}$, found on the ancestral European haplotype [5] (Table 1, Supplementary Table 3 and section 3 in Supplementary Information). However, it was not possible to ascribe the association to a specific HLA allele, haplotype or region (Supplementary Table 3 and Supplementary Fig. 4).
We exchanged GWAS summary results with the Molecular Genetics of Schizophrenia (MGS) and SGENE consortia for genotyped SNPs with $P < 10^{-3}$. There were 8,008 cases and 19,077 controls of European descent in the combined sample (see refs 6, 7 and section 7 in Supplementary Information). Our top genotyped MHC SNP (rs3130375) had P = 0.086 and P = 0.14 in MGS and SGENE, respectively. Considering the combined results for genotyped and imputed SNPs across the MHC region more broadly, rs13194053 had a genome-wide significant combined $P = 9.5 \times 10^{-9}$ [...]

- 3322 cases, 3587 controls
- Most significant SNP $P = 3.4 \times 10^{-7}$, only weakly associated
- Cluster of associated SNPs in MHC, $P < 6 \times 10^{-7}$
- 74,062 SNPs in linkage equilibrium used as basis from which to select sets $X(t)$ for polygenic score analysis.
- Used males (2,176 cases, 1,642 controls) to train polygenic score, females (1,146 cases, 1,945 controls) to test
[Figure 2 from the International Schizophrenia Consortium (2009): bar chart of variance explained (R2, y axis) for thresholds PT < 0.1, 0.2, 0.3, 0.4 and 0.5, in three panels: Schizophrenia, Bipolar disorder, and Non-psychiatric (WTCCC). P values above the bars for the six non-psychiatric diseases range from 0.05 to 0.71.]

Figure 2 | Replication of the ISC-derived polygenic component in independent schizophrenia and bipolar disorder samples. Variance explained in the target samples on the basis of scores derived in the entire ISC for five significance thresholds (PT < 0.1, 0.2, 0.3, 0.4 and 0.5, plotted left to right in each study). The y axis indicates Nagelkerke's pseudo R2; the number above each set of bars is the P value for the PT < 0.5 target sample analysis. CAD, coronary artery disease; CD, Crohn's disease; HT, hypertension; RA, rheumatoid arthritis; T1D, type I diabetes; T2D, type II diabetes. Numbers for cases/controls: MGS-EA 2,687/2,656; MGS-AA 1,287/973; O'Donovan 479/2,938; STEP-BD 955/1,498; WTCCC 1,829/2,935; CD 1,748/2,935; CAD 1,926/2,935; HT 1,952/2,935; RA 1,860/2,935; T1D 1,963/2,935; and T2D 1,924/2,935.

- The score on the basis of all SNPs with male discovery $t < 0.5$ (37,655 SNPs) was highly correlated with schizophrenia in target females, $P < 9 \times 10^{-19}$
- Logistic regression on polygenic score: $P(\mathrm{case}) = e^{\mu + \gamma S}/(1 + e^{\mu + \gamma S})$
- Explains 3% of the variance using Nagelkerke's pseudo $R^2$
Nagelkerke's R2
- Liability is one way to define heritability for a dichotomous trait
- How do we define heritability on the observed scale?
- How do we define variance explained in a logistic model?
- Concept of variance explained does not exist in a logistic model
- But the Deviance (twice the log likelihood ratio) is a natural generalisation
- $\rho^2 = 1 - \left( \frac{L(\theta_0)}{L(\hat{\theta})} \right)^{2/n}$
- $\rho^2_{max} = 1 - L(\theta_0)^{2/n}$
- Nagelkerke's $R^2 = \rho^2 / \rho^2_{max}$
- Same as Pearson squared correlation $R^2$ when data are Normally distributed
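The formulas above translate into a short Python sketch working on log-likelihoods, which avoids numerical underflow for large n. The function names and the toy fitted probabilities are illustrative assumptions; in practice the two log-likelihoods would come from fitting a null (intercept-only) and a full logistic model.

```python
import math


def bernoulli_log_likelihood(y, probabilities):
    """Log-likelihood of 0/1 outcomes given fitted case probabilities."""
    return sum(math.log(p) if outcome == 1 else math.log(1.0 - p)
               for outcome, p in zip(y, probabilities))


def nagelkerke_r2(loglik_null, loglik_model, n):
    """rho^2 / rho^2_max, with rho^2 = 1 - (L0 / L1)^(2/n)."""
    rho2 = 1.0 - math.exp(2.0 * (loglik_null - loglik_model) / n)
    rho2_max = 1.0 - math.exp(2.0 * loglik_null / n)
    return rho2 / rho2_max


# Hypothetical data: 2 cases, 2 controls.
y = [1, 1, 0, 0]
ll_null = bernoulli_log_likelihood(y, [0.5] * 4)              # intercept only
ll_model = bernoulli_log_likelihood(y, [0.8, 0.8, 0.2, 0.2])  # assumed fit
print(nagelkerke_r2(ll_null, ll_model, len(y)))
```

Dividing by $\rho^2_{max}$ rescales the statistic so that a model that predicts every outcome perfectly scores 1.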
Estimating SNP effects Jointly
- Conventional GWAS estimate the effect of each SNP independently
- There are some reasons for considering all SNPs jointly:
  - Variability explained jointly is handled better
  - Genomic Prediction
  - Computational reasons
- But... Need to impose shrinkage constraints on SNP estimates
- We describe four methods: BLUP, ridge regression, LASSO, Prior SNP Distributions
BLUP: Best Linear Unbiased Predictions
- BLUPs are the predictions of the random (SNP) effects in a mixed model
- They are linear functions of the observations, are unbiased and have minimum variance
- All random effects are estimated jointly by BLUP
- There are usually many more SNPs than individuals so BLUPs are another way of getting SNP estimates
BLUP for random SNP effects
- NOTE: we are using non-standard notation where random SNP effects are β, fixed covariates are α (in most descriptions, the fixed effects are β and the random effects are u).
- Mixed Model $y = X\alpha + Z\beta + e$
- α are fixed effects (e.g. age, sex) with design matrix X
- β are the random SNP effects with scaled genotype matrix Z
- $\beta \sim MVN(0, G)$, $e \sim MVN(0, R)$, $\mathrm{cov}(\beta, e) = 0$
- $\hat{\beta} = G Z' V^{-1} (y - X\hat{\alpha})$ is the BLUP
- $V = ZGZ' + R$
- $\hat{y} = X\hat{\alpha} + Z\hat{\beta} = Cy$ for a matrix $C$
BLUP
- BLUP is good for genomic prediction, but...
- BLUP SNP estimates tend to underestimate large effect SNPs and to over-estimate small effects
- BLUP SNP estimates resemble a sample from a Gaussian distribution
- But can be useful as a first approximation to identify candidate SNPs e.g. Verbyla et al (2007) Theor Appl Genet 116:95–111
- Can make more complex versions of BLUP where different genome regions are treated differently e.g. Speed and Balding (2014) Genome Research 24:1550–1557
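A worked special case may help show why BLUP shrinks SNP estimates. It assumes (this is an assumption, not something stated on the slides) that all SNP effects share one variance and residuals are independent, $G = \sigma_\beta^2 I_p$ and $R = \sigma_e^2 I_n$, and writes $\lambda = \sigma_e^2 / \sigma_\beta^2$. Substituting into $\hat{\beta} = G Z' V^{-1} (y - X\hat{\alpha})$ with $V = ZGZ' + R$:

```latex
\begin{align*}
\hat{\beta}
  &= \sigma_\beta^2 Z' \left( \sigma_\beta^2 ZZ' + \sigma_e^2 I_n \right)^{-1} (y - X\hat{\alpha}) \\
  &= Z' \left( ZZ' + \lambda I_n \right)^{-1} (y - X\hat{\alpha}) \\
  &= \left( Z'Z + \lambda I_p \right)^{-1} Z' (y - X\hat{\alpha}),
\end{align*}
% The last step uses the identity
%   Z'(ZZ' + \lambda I_n)^{-1} = (Z'Z + \lambda I_p)^{-1} Z',
% which follows from (Z'Z + \lambda I_p) Z' = Z' (ZZ' + \lambda I_n).
```

Under these assumptions the BLUP of the SNP effects has the same form as a ridge regression estimate with penalty $\lambda$, and the first form only requires inverting an $n \times n$ matrix, which is convenient when there are many more SNPs than individuals.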
Ridge Regression and The Lasso
- Least Squares regression: minimise $\Omega = (y - X\beta)'(y - X\beta)$
- Ridge regression: minimise $\Omega$ subject to $\beta'\beta < k$
- Lasso regression: minimise $\Omega$ subject to $\sum_j |\beta_j| < k$
- Both Ridge and Lasso are shrinkage methods but have different properties - often many of the lasso estimates of $\beta_j$ are exactly 0, while ridge estimates are non-zero but small.
- Lasso is potentially useful for fitting all SNPs in a model simultaneously, forcing most to be zero
- Many extensions to Lasso for QTL mapping have been published, e.g. Yi and Xu 2008, Genetics 179:1045-1055
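The different behaviour of the two penalties is easiest to see in a special case. The sketch below assumes an orthonormal design ($X'X = I$), which real SNP data do not satisfy, and uses the penalised form of each problem rather than the constrained form written above (the two are equivalent for a suitable choice of penalty $\lambda$). Under those assumptions both estimators are simple functions of the least squares estimates; the function names and numbers are illustrative.

```python
def ridge_shrink(beta_ols, lam):
    """Minimiser of 0.5*||y - Xb||^2 + 0.5*lam*b'b when X'X = I.

    Every coefficient is scaled by 1 / (1 + lam): small but never zero.
    """
    return [b / (1.0 + lam) for b in beta_ols]


def lasso_shrink(beta_ols, lam):
    """Minimiser of 0.5*||y - Xb||^2 + lam*sum|b_j| when X'X = I.

    Soft thresholding: coefficients with |b| <= lam become exactly zero.
    """
    shrunk = []
    for b in beta_ols:
        magnitude = max(abs(b) - lam, 0.0)
        shrunk.append(magnitude if b >= 0 else -magnitude)
    return shrunk


# Hypothetical least squares estimates for four SNPs.
beta_ols = [2.0, -0.3, 0.1, -1.5]
print("ridge:", ridge_shrink(beta_ols, 0.5))
print("lasso:", lasso_shrink(beta_ols, 0.5))
```

In this toy case the lasso sets the two small effects exactly to zero and pulls the large ones towards zero by a fixed amount, while ridge scales every effect by the same factor and leaves all of them non-zero.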
The Lasso
Prior Distributions on SNP Effects
Verbyla et al. BMC Proceedings 2010, 4(Suppl 1):S5
http://www.biomedcentral.com/1753-6561/4/S1/S5
Sensitivity of genomic selection to using different prior distributions
Klara L Verbyla, Philip J Bowman, Ben J Hayes, Michael E Goddard
From 13th European workshop on QTL mapping and marker assisted selection, Wageningen, The Netherlands. 20-21 April 2009

Abstract
Genomic selection describes a selection strategy based on genomic estimated breeding values (GEBV) predicted from dense genetic markers such as single nucleotide polymorphism (SNP) data. Different Bayesian models have been suggested to derive the prediction equation, with the main difference centred around the specification of the prior distributions.
Methods: The simulated dataset of the 13th QTL-MAS workshop was analysed using four Bayesian approaches to predict GEBV for animals without phenotypic information. Different prior distributions were assumed to assess their effect on the accuracy of the predicted GEBV.
Conclusion: All methods produced GEBV that were highly correlated with the true breeding values. The models appear relatively insensitive to the choice of prior distributions for QTL-MAS data set and this is consistent with uniformity of performance of different methods found in real data.
Prior Distributions on SNP Effects
- Mixed model: $y = X\alpha + Z\beta + g + e$ (NOTE different notation)
- $\mathrm{var}(g) = K\sigma_g^2$, $\mathrm{var}(e) = I\sigma_e^2$
- SNP effects β, fixed effects α, polygenic effects g
Prior Distributions on SNP Effects
Table 1 Prior Distribution Specifications (Verbyla et al. 2010; reconstructed from a cropped page image)

Method              Prior Distribution
Bayes BLUP          $b_i \mid \sigma^2 \sim N(0, \sigma^2)$; prior on $\sigma^2$ with parameters $(r, s)$
Bayes A             $b_i \mid \sigma_i^2 \sim N(0, \sigma_i^2)$; prior on $\sigma_i^2$ with parameters $(r, s)$
Bayes A/B (Hybrid)  $b_i \mid \sigma_i^2 \sim N(0, \sigma_i^2)$; $\sigma_i^2 = 0$ with probability $1 - \pi$; prior on $\sigma_i^2$ with parameters $(r, s)$ with probability $\pi$
Bayes C             $b_i \mid \gamma_i, \sigma_i^2 \sim (1 - \gamma_i) N(0, \sigma_i^2/100) + \gamma_i N(0, \sigma_i^2)$; prior on $\sigma_i^2$ with parameters $(r, s)$; $\gamma_i \sim \mathrm{bernoulli}(\pi)$, $1 - p(\gamma_i = 0) = p(\gamma_i = 1) = \pi$

$b_i$ is the effect for the ith SNP and $\gamma_i$ is the indicator variable for the ith SNP.
GEBV: Genomic Estimated Breeding Values
Table 2 Correlations Between Estimated GEBV for unphenotyped animals at t=600

             Bayes C   Bayes A/B   Bayes BLUP
Bayes A      0.999     0.991       0.860
Bayes C      1         0.993       0.863
Bayes A/B              1           0.893

Table 3 Comparison of True and Estimated GEBV

Method       Correlation   MSE     Rank    Regression
Bayes.BLUP   0.885         5.479   0.691   0.979
BayesA       0.857         7.092   0.696   1.162
BayesA/B     0.889         5.435   0.73    1.081
BayesC       0.861         6.561   0.71    1.024

Correlation coefficient between the true and predicted GEBV, Mean Square Error (MSE), Rank (accuracy of predicting the best 100 animals) and the Regression Coefficient of the true breeding value on the estimated GEBV.
Conclusions
- Research in Complex Trait Analysis leads in several profitable directions:
  - Identification of individual SNP effects
  - Estimates of Heritability in the entire population
  - Genomic Prediction
- Progress in Human Complex Trait research is accelerated by considering work in animals and plants