Necessary advances in exercise genomics and likely pitfalls

J Appl Physiol 110: 1150–1151, 2011;
doi:10.1152/japplphysiol.00172.2011.
Invited Editorial
Necessary advances in exercise genomics and likely pitfalls
Yannis Pitsiladis and Guan Wang
College of Medicine, Veterinary and Life Sciences, Institute of Cardiovascular and Medical Sciences, University of Glasgow,
Glasgow, United Kingdom
Address for reprint requests and other correspondence: Y. Pitsiladis, College
of Medicine, Veterinary and Life Sciences, Institute of Cardiovascular and
Medical Sciences, Univ. of Glasgow, Glasgow G12 8QQ, United Kingdom
(e-mail: [email protected]).
1150
studies [e.g., n ⬃ 30,000 with 47 identified loci (7)]. A large
sample size is indeed needed in the example of human height,
and the occurrence of rare variants that are not well captured by
GWAS may partly explain this limited success in determining
the genomics of adult height. The application of the genomewide linkage analysis approach has been successful in identifying disease genes related to monogenic disorders (4) but only
partially successful in detecting complex genetic traits related
to multiple genes. The lack of greater success is probably due
to the low heritability of the examined complex traits (common
variants with modest effect).
Despite numerous reports of genetic associations with
health-related fitness phenotypes, there has been limited progress in discovering and characterizing the genetic contribution
to these phenotypes due to few coordinated research efforts
involving major funding initiatives and the use primarily of the
candidate gene approach. It is timely that exercise genetic
research has moved into the genomics era with a paper by
Bouchard and colleagues (2) in this issue of the Journal of
Applied Physiology. In this first study to apply the GWAS
approach to exercise genetics, these authors report the results
of an investigation aimed at identifying the genetic variants
associated with gains in maximal oxygen uptake (V̇O2max)
using the resources of the subsample of whites of HERITAGE
(3). A total of 324,611 SNPs were genotyped and the most
significant SNPs tested for replication in the subsample of
blacks from HERITAGE, the women of DREW (9), and the
men and women of STRRIDE (5), who were all exposed to
different but standardized and supervised exercise training
programs. Based on single-SNP analysis, 39 SNPs were significantly associated with the gains in V̇O2max. Stepwise multiple regression analysis of the 39 SNPs identified a panel of 21
SNPs that accounted for 49% of the variance in V̇O2max
trainability. Intriguingly, subjects who carried 9 or fewer favorable alleles at these 21 SNPs improved their V̇O2max by 221
ml/min, while those who carried 19 or more of these alleles
gained on average 604 ml/min. Notably, the strongest association of the identified SNPs was with rs6552828 located in the
ACSL1 gene, accounting for ⬃6% of the training response of
Fig. 1. Popular gene discovery methods for polygenic traits.
8750-7587/11 Copyright © 2011 the American Physiological Society
http://www.jap.org
Downloaded from http://jap.physiology.org/ by 10.220.33.5 on June 16, 2017
of methodological approaches within the field of
genetic epidemiology have been utilized to unravel the genetic
basis of physical performance. Popular gene discovery methods for polygenic traits are summarized in Fig. 1. The basic
family/twin study approach has provided useful and reliable
genetic data. A recent twin study, which comprised 37,051
twin pairs from seven European countries, suggested additive
genetic variants contribute significantly to exercise participation among the twin pairs (10). No similar studies of such
magnitude have been conducted in the area of human performance. Because of the development of more advanced gene
discovery techniques, genetic studies are no longer restricted to
family/twin studies but expanded to include the assessment of
genetic variants within populations. Population-based studies
are extensively being used, particularly involving two groups:
cases and controls. Population case-control studies can be
further differentiated into hypothesis-free and the more commonly used hypothesis-driven approaches. The most extensively used candidate gene association study approach requires
a prior hypothesis that the genetic polymorphisms of interest
are causal variants or in strong linkage disequilibrium with a
causal variant. This population-based genetic approach aims to
define alleles or markers that segregate with a particular phenotype or disease at a significantly higher rate than predicted
by chance alone, by genotyping the variants in both affected
and unaffected individuals. This approach is effective in detecting genetic variants with small or modest influence on
common disease or complex traits. Functional single-nucleotide polymorphisms (SNPs) with tag SNPs that would cover
the entire candidate gene have been used in many candidate
gene association studies. Further advances in molecular technologies have enabled researchers to apply genomewide approaches to the field. The genomewide association study
(GWAS) is a hypothesis-free approach used to detect the
common variants underlying complex diseases and traits so as
to help predict the disease risk and develop targeted therapy.
GWAS has been successful in identifying novel genetic variants for Type 2 diabetes mellitus (1) and the interleukin 23
pathway in Crohn’s disease (8). The GWAS approach is not
without important limitations. For example, human height is a
highly heritable quantitative trait as well as stable and easy to
measure. In theory, the application of GWAS would be suitable
in finding height-related genes. However, despite significant
investment in large sample numbers, GWAS results have been
disappointing as only 10% of phenotypic variation in height
could be explained from the 180 associated loci to adult height
in the largest (n ⫽ 183,272) study published to date (6) and
typically 5% or less of the phenotypic variation in other smaller
A NUMBER
Invited Editorial
1151
J Appl Physiol • VOL
of time before the entire human genome can be sequenced for
$1,000. Recently, the newest Illumina sequencing machine
HiSeq 2000 costs less than $10,000 in a single run (2 human
genomes and 30⫻ coverage); this cost has dramatically
dropped from $60,000 in 2008. No matter the success or failure
of the GWAS approach, this approach is certainly providing
the insight into genetic architecture and the molecular basis
underlying human diseases and complex traits. In the near
future, large cohorts will be routinely studied by GWAS and
will provide good resources for all scientific fields including
exercise genomics. This development will require a move
away from the traditional method of researching in exercise
science/medicine (i.e., predominantly single laboratory studies) to large well-funded laboratory collaborations and therefore substantial statistical/technological power and know-how.
Only with such resources can the most strongly acting genes be
identified with confidence, gene ⫻ environment interactions be
studied accurately, and clinically meaningful gene ⫻ gene
interactions revealed.
DISCLOSURES
No conflicts of interest, financial or otherwise, are declared by the author(s).
REFERENCES
1. Amos CI. Successful design and conduct of genome-wide association
studies. Hum Mol Genet 16: R220 –R225, 2007.
2. Bouchard C, Sarzynski MA, Rice TK, Kraus WE, Church TS, Sung
YJ, Rao DC, Rankinen T. Genomic predictors of maximal oxygen
uptake response to standardized exercise training programs. J Appl Physiol
(December 23, 2010). doi:10.1152/japplphysiol.00973.2010.
3. Bouchard C, Leon AS, Rao DC, Skinner JS, Wilmore JH, Gagnon J.
The HERITAGE family study. Med Sci Sports Exerc 27: 721–729, 1995.
4. Dean M. Approaches to identify genes for complex human diseases:
lessons from Mendelian disorders. Hum Mutat 22: 261–274, 2003.
5. Kraus WE, Torgan CE, Duscha BD, Norris J, Brown SA, Cobb FR,
Bales CW, Annex BH, Samsa GP, Houmard JA, Slentz CA. Studies
of a targeted risk reduction intervention through defined exercise
(STRRIDE). Med Sci Sports Exerc 33: 1774 –1784, 2001.
6. Lango Allen H, Estrada K, Lettre G, Berndt SI, Weedon MN, Rivadeneira F, et al. Hundreds of variants clustered in genomic loci and
biological pathways affect human height. Nature 467: 832–838, 2010.
7. Maher B. Personal genomes: the case of the missing heritability. Nature
456: 18 –21, 2008.
8. Massey DC, Parkes M. Genome-wide association scanning highlights
two autophagy genes, ATG16L1 and IRGM, as being significantly associated with Crohn’s disease. Autophagy 3: 649 –651, 2007.
9. Morss GM, Jordan AN, Skinner JS, Dunn AL, Church TS, Earnest
CP, Kampert JB, Jurca R, Blair SN. Dose response to exercise in
women aged 45–75 yr (DREW): design and rationale. Med Sci Sports
Exerc 36: 336 –344, 2004.
10. Stubbe JH, Boomsma DI, Vink JM, Cornes BK, Martin NG, Skytthe
A, Kyvik KO, Rose RJ, Kujala UM, Kaprio J, Harris JR, Pedersen
NL, Hunkin J, Spector TD, de Geus EJ. Genetic influences on exercise
participation in 37,051 twin pairs from seven countries. PLoS One 1: e22,
2006.
110 • MAY 2011 •
www.jap.org
Downloaded from http://jap.physiology.org/ by 10.220.33.5 on June 16, 2017
V̇O2max. Generally, this is a well-designed study in terms of
cohort selection, replication, use of predictor SNP score, and
reliable determination of V̇O2max. However, like all studies,
especially those pioneering in any field, this study is not
without an important limitation i.e., the small sample size (n ⫽
483). The three replication studies (HERITAGE blacks,
DREW, and STRRIDE) given their even smaller sample size
(n ⫽ 247, n ⫽ 112, and n ⫽ 183, respectively) are not ideal and
hence result in limited replication (i.e., 5 SNPs in total). Most
of the knowledge in exercise genetics has been generated
primarily using classical genetic methods such as SNPs
and applied to cohorts with small sample sizes. The data
generated therefore from most published studies in exercise
genetics need to be examined in light of the view held by some
“hard core” geneticists that a study of any complex phenotype
in humans is futile unless a cohort size of between 20,000 and
100,000 is used. Bouchard and colleagues (2) attempt to
overcome this major limitation of observational and crosssectional designs by conducting a carefully controlled intervention study that significantly reduces the number of confounders and therefore the sample size required to detect SNPs
with a significant effect size. With the present sample size
including replication, Bouchard and colleagues (2) appear to
detect SNPs with an effect size of ⱖ2% with some confidence
and ⱖ4% with greater confidence. Contrast this with the
perceived requirement for ⬃4,000 subjects to detect a SNP
with an effect size of ⬃1% in a classic GWAS with a continuous trait. As a result, the newly identified genomic predictors
of the response of V̇O2max to regular exercise provide new
targets for the study of the biology of aerobic fitness and
adaptation to regular exercise. More importantly, however, this
study demonstrates the unique capabilities of applying whole
genome technologies to the classic intervention study for gene
discovery, in particular, the ability of this approach to circumvent the need for very large (and typically impractical) cohort
sizes and the currently prohibitive cost to subject large cohorts
to whole genome analysis. This study has also reaffirmed the
importance of well-phenotyped cohorts in reducing the required sample size.
It is accepted that there will be many interacting genes
involved in exercise-related traits, including sporting performance, and hence it is timely that genetic research has moved
to the genomics era, i.e., the simultaneous testing of multiple
genes. The approaches and technologies used by Bouchard and
colleagues (2) will no doubt be increasingly applied to searching the whole human genome instead of studying single genes
or indeed SNPs as the cost of using whole genome methods
becomes more affordable. Particularly, the cost of large-scale
sequencing will become cheaper over the next years. Seemingly reputable claims have been made that it is only a matter