Genetic epidemiology – principles relevant to epigenetics

Nic Timpson
Epigenetic Epidemiology 2012
Objectives:
To reconsider the measurement of complex biological systems
To consider epidemiology as a whole and the special nature of genetics within epidemiological analyses
To cover the epidemiological properties of genetic variation
To consider the principles in genetic epidemiology relevant to epigenetics in 2012
You should be able to:
Place the study of epigenetics in the context of epidemiology broadly
Explain how the nature of epidemiological measurement/information varies
Describe the properties of genetic variation in particular
Transfer lessons learnt in the study of genetics to that of epigenetic data where relevant
Measurements are proxies (most of the time…)
For the human body is so designed by nature that the face, from the chin to the top of the forehead and the lowest roots of the hair, is a tenth part of the whole height; the open hand from the wrist to the tip of the middle finger is just the same; the head from the chin to the crown is an eighth, and with the neck and shoulder from the top of the breast to the lowest roots of the hair is a sixth; from the middle of the breast to the summit of the crown is a fourth. If we take the height of the face itself, the distance from the bottom of the chin to the under side of the nostrils is one third of it; the nose from the under side of the nostrils to a line between the eyebrows is the same; from there to the lowest roots of the hair is also a third, comprising the forehead. The length of the foot is one sixth of the height of the body; of the forearm, one fourth; and the breadth of the breast is also one fourth. The other members, too, have their own symmetrical proportions…. Then again, in the human body the central point is naturally the navel. For if a man be placed flat on his back, with his hands and feet extended, and a pair of compasses centred at his navel, the fingers and toes of his two hands and feet will touch the circumference of a circle described therefrom…. the outstretched arms, the breadth will be found to be the same as the height….
Vitruvius De architectura c.15BC
“Vitruvian Man” Da Vinci c.1487
Epigenetics
“epi”, from Greek: “above”.
Historically, the word “epigenetics” was used to describe events that could not be explained by genetic principles.
This was in the context of the “gene number” arguments and included seemingly “unrelated” processes, such as paramutation in maize, position effect variegation in the fruit fly, genomic imprinting and X-inactivation/Lyonisation…
It is now a rapidly expanding field with the uncovering of common
molecular mechanisms.
Epidemiology – “epi” & “demos” – should be considered in the general context of measurement…
[Figure: schematic of measurement running from DISTAL to PROXIMAL. Labels, in the order extracted: niche construction; environment; genotype; process; regulation; intermediate; environmental; proxy; realised phenotype; genotype by history; natural selection. Fields marked: “epigenetic epidemiology”, “epidemiology”, “genetic epidemiology”. Measurement: directly measured; measured by observation/best available tool. Translation: not easily translated; easily translated/clinical.]
Genetic Epidemiology
Why special?
Properties of the measurement of genotype
Properties of the information contained within genotypes (leading on to Mendelian randomization, MR)
Requires some extra understanding of the mechanisms of variation and the genomic landscape of traits…
Importance of the apparent independence of heritable units within the human genome
“Genetics is indeed in a peculiarly favoured condition in that Providence has shielded the geneticist from many of the difficulties of a reliably controlled comparison. The different genotypes possible from the same mating have been beautifully randomized by the meiotic process… Generally speaking the geneticist, even if he foolishly wanted to, could not introduce systematic errors into the comparison of genotypes, because for most of the relevant time he has not recognized them.”
Fisher RA. Statistical Methods in Genetics. Heredity (1952) 6, 1-12
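Fisher's point about randomization at meiosis can be illustrated with a small simulation. This is only a sketch under stated assumptions: both parents are taken to be heterozygous (Aa), each transmits either allele with probability 1/2, and the family "environment" score is a hypothetical variable introduced purely for illustration. The function names `pearson` and `simulate_offspring` are not from the lecture.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def simulate_offspring(n_families=20000, seed=1):
    """Simulate one child per family from two heterozygous (Aa) parents.

    Each family also gets an environment score that plays no part in meiosis.
    Returns (genotypes, environments); genotype = number of 'a' alleles (0-2).
    """
    rng = random.Random(seed)
    genotypes, environments = [], []
    for _ in range(n_families):
        environment = rng.gauss(0.0, 1.0)  # hypothetical family-level exposure
        from_mother = rng.randint(0, 1)    # each parent passes 'a' with probability 1/2
        from_father = rng.randint(0, 1)
        genotypes.append(from_mother + from_father)
        environments.append(environment)
    return genotypes, environments


genotypes, environments = simulate_offspring()
het_fraction = genotypes.count(1) / len(genotypes)
r_geno_env = pearson(genotypes, environments)
print(f"heterozygote fraction: {het_fraction:.3f}")
print(f"corr(genotype, environment): {r_geno_env:.3f}")
```

Under these assumptions about half of the offspring should be heterozygous, and the genotype should show essentially no correlation with the environment score, which is the "beautifully randomized" comparison the quotation describes.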
Genetic Epidemiology
As yet unsolved…
Principles from genetic epidemiology relevant for epigenetics
(i) Candidate research versus genomewide approaches
(ii) Array based methods and the importance of QC
(iii) Properties of measurements – interpretation and use
(iv) Sample sizes, power & replication
(v) Collaboration, sharing of data & repositories
(vi) Translation of effects & Integration of multiple data sources
(i) Candidate research versus genomewide approaches
Hypothesis-driven approaches have been (or had been) the mainstay of both genetic epidemiology and epigenetic epidemiology.
Based on the measurement of specific genotypes (or methylation profiles) under reasonable hypotheses of gene effect.
Biological plausibility/Association strength/Dose response/Replication
(Tabor, Risch, Myers, NRG 2002)
Loci reproducibly associated with type 2 diabetes (Oct. 2011)
[Figure: y-axis: risk of diabetes (odds ratio, 1.00 to 1.40); x-axis: years 1997 to 2011 (2012??). Legend: biologic candidate approach; positional cloning; hypothesis-free approach. Loci labelled on the plot include PPARG, KCNJ11, TCF7L2, WFS1, HNF1B, CDKN2A, FTO, HHEX, SLC30A8, CDKAL1, IGF2BP2, MTNR1B, KCNQ1, THADA, NOTCH2, CAMK1D, ADAM30, JAZF1, ADAMTS9, IRS1, TSPAN8, HMGA2, ADCY5, TLE4, UBE2E2, ARAP1, ZBED3, GCK, PROX1, HNF1A, CDC123, BCL11A, GCKR, C2CD4A, DGKB, SPRY2, RBMS1, ZFAND6, KLF14, SRR and PTPRD.]
Loci and effect sizes are from the DIAGRAM+ consortium
Slides courtesy of Paul Franks
(ii) Array based methods and the importance of QC
Genotype arrays
Sample based QC (DNA quality):
Overt and cryptic relatedness
Ethnicity
Missingness
Heterogeneity
Variant based QC (e.g. SNPs):
Frequency
Basic expectations – e.g. Hardy-Weinberg equilibrium
Missingness (and possibly biased patterns with respect to case/control status)
Data quality (as per plots!)
Methylation arrays
Sample based QC (DNA quality):
Bisulfite conversion
Batch effects
Missingness
Probe based QC (e.g. CpG sites):
Noise/signal ratio
Probe concordance (although what metrics are there to compare against?)
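Two of the variant based checks listed above, Hardy-Weinberg equilibrium and missingness, can be sketched in a few lines. This is a minimal illustration rather than a production QC pipeline: it assumes a single biallelic SNP, uses the 1 degree of freedom chi-square approximation (exact tests are usually preferred for rare alleles), and the function names and example counts are made up for illustration.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test (1 df) of Hardy-Weinberg proportions for one biallelic SNP.

    Arguments are observed genotype counts. Returns (chi2, p_value).
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    # Skip zero-expectation cells (monomorphic SNP) to avoid dividing by zero.
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    p_value = math.erfc(math.sqrt(chi2 / 2.0))  # upper tail of chi-square with 1 df
    return chi2, p_value


def call_rate(calls):
    """Fraction of non-missing genotype calls (None marks a missing call)."""
    return sum(c is not None for c in calls) / len(calls)


# Illustrative counts only: one SNP close to HWE, one with no heterozygotes.
chi2_ok, p_ok = hwe_chi_square(298, 489, 213)
chi2_bad, p_bad = hwe_chi_square(500, 0, 500)
rate = call_rate([0, 1, None, 2, 1, 0, None, 1, 2, 1])
print(f"near HWE: chi2={chi2_ok:.3f}, p={p_ok:.3f}")
print(f"no heterozygotes: chi2={chi2_bad:.1f}, p={p_bad:.3g}")
print(f"call rate: {rate:.2f}")
```

A SNP that departs strongly from Hardy-Weinberg proportions, or whose call rate differs between cases and controls, would typically be flagged for inspection of its cluster plot before analysis.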
(iii) Properties of measurements – interpretation and use
Genotypic data
Afforded the “luxury” of independent segregation and direct measurement
Beyond the natural properties of genotypic data, this allows relatively simple statistical analysis and the possibility of informative inference (i.e. pathway dissection)
Some complications – Linkage disequilibrium, genomic landscape, structure…
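One way genotypes can inform inference is Mendelian randomization (MR, mentioned earlier in these slides). The sketch below is a toy simulation, not the lecturer's analysis: the effect sizes, allele frequency and variable names are all assumptions chosen for illustration. It compares a naive exposure-outcome slope, which is distorted by an unmeasured confounder, with the Wald ratio cov(G, Y) / cov(G, X) that uses genotype as an instrument.

```python
import random


def covariance(xs, ys):
    """Sample covariance of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)


def simulate(n=50000, true_effect=0.3, seed=7):
    """Simulate genotype G, exposure X and outcome Y with a shared confounder U."""
    rng = random.Random(seed)
    g, x, y = [], [], []
    for _ in range(n):
        gi = (rng.random() < 0.3) + (rng.random() < 0.3)  # allele count 0-2, freq 0.3
        u = rng.gauss(0, 1)                               # unmeasured confounder
        xi = 0.5 * gi + u + rng.gauss(0, 1)               # exposure depends on G and U
        yi = true_effect * xi + u + rng.gauss(0, 1)       # outcome depends on X and U
        g.append(gi)
        x.append(xi)
        y.append(yi)
    return g, x, y


g, x, y = simulate()
naive_slope = covariance(x, y) / covariance(x, x)  # confounded by U
wald_ratio = covariance(g, y) / covariance(g, x)   # genotype used as instrument
print(f"true effect 0.30, naive slope {naive_slope:.2f}, Wald ratio {wald_ratio:.2f}")
```

In this simulated setting the naive slope overstates the effect while the Wald ratio lands close to the simulated value. The result relies on the genotype being independent of the confounder and affecting the outcome only through the exposure, which are assumptions built into the simulation.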
Novel Pathways
[Figure: energy intake (kcal/day/BW). Church et al. Nature Genetics 2010;42:1086–1092]
Phenotypic heterogeneity
Sampling frame
What happens if we look at the association between PC1 and our phenotype? Is this likely to be a problem?
Other types of possible confounding?
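The question about PC1 can be explored with a toy example of population structure. This is a sketch under assumptions that are not from the slides: two populations with different allele frequencies and different phenotype means, no causal effect of genotype, and a known population label standing in for what a first principal component would capture in real data.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def simulate_two_populations(n_per_pop=4000, seed=3):
    """Return rows of (population, genotype, phenotype); genotype has no effect."""
    rng = random.Random(seed)
    rows = []
    for pop, allele_freq, pheno_mean in (("A", 0.2, 0.0), ("B", 0.8, 1.0)):
        for _ in range(n_per_pop):
            genotype = (rng.random() < allele_freq) + (rng.random() < allele_freq)
            phenotype = rng.gauss(pheno_mean, 1.0)  # depends on population only
            rows.append((pop, genotype, phenotype))
    return rows


def centre_within(rows):
    """Subtract each population's mean from genotype and phenotype."""
    sums = {}
    for pop, g, y in rows:
        s = sums.setdefault(pop, [0, 0.0, 0.0])
        s[0] += 1
        s[1] += g
        s[2] += y
    g_res = [g - sums[pop][1] / sums[pop][0] for pop, g, _ in rows]
    y_res = [y - sums[pop][2] / sums[pop][0] for pop, _, y in rows]
    return g_res, y_res


rows = simulate_two_populations()
crude_r = pearson([g for _, g, _ in rows], [y for _, _, y in rows])
g_res, y_res = centre_within(rows)
adjusted_r = pearson(g_res, y_res)
print(f"crude correlation: {crude_r:.3f}")
print(f"correlation after adjusting for population: {adjusted_r:.3f}")
```

Here the crude genotype-phenotype correlation is clearly non-zero even though genotype has no effect, and it largely disappears once population is accounted for. If PC1 tracks such structure and is also associated with the phenotype, unadjusted associations are likely to be confounded.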
(iii) Properties of measurements – interpretation and use
Epigenetic data
[Figure: diagram with labels: socio-economic status; diet; dietary folate; smoking; ???; methylation; target expression; off target expression; phenotype; off target phenotype.]
Assuming data are clean and reliable…
(iv) Sample sizes, power & replication
Deducing “true numerical ratios” requires “the greatest possible
number of individual values; and the greater the number of these
the more effectively will mere chance be eliminated”.
Gregor Mendel 1865/6
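Mendel's remark about needing "the greatest possible number of individual values" can be made concrete with an approximate power calculation. The sketch below makes several assumptions not stated on the slides: a simple allelic case-control comparison, a normal approximation, and the conventional genome-wide threshold of 5e-8. The function name `allelic_power` and the example odds ratio of 1.2 are purely illustrative.

```python
from statistics import NormalDist


def allelic_power(odds_ratio, control_freq, n_cases, n_controls, alpha=5e-8):
    """Approximate power of a two-sided allelic z-test (normal approximation).

    control_freq is the risk-allele frequency in controls; the frequency in
    cases is derived from the per-allele odds ratio.
    """
    odds_cases = odds_ratio * control_freq / (1.0 - control_freq)
    case_freq = odds_cases / (1.0 + odds_cases)
    # Each person contributes two alleles to the comparison.
    se = (case_freq * (1.0 - case_freq) / (2 * n_cases)
          + control_freq * (1.0 - control_freq) / (2 * n_controls)) ** 0.5
    z_effect = abs(case_freq - control_freq) / se
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    # The contribution of the opposite tail is negligible and is ignored here.
    return NormalDist().cdf(z_effect - z_crit)


power_small = allelic_power(1.2, 0.3, n_cases=1000, n_controls=1000)
power_large = allelic_power(1.2, 0.3, n_cases=20000, n_controls=20000)
print(f"1,000 cases and 1,000 controls: power = {power_small:.4f}")
print(f"20,000 cases and 20,000 controls: power = {power_large:.4f}")
```

For an odds ratio of this size, a study of a thousand cases has very little power at a genome-wide threshold, while tens of thousands of cases give power close to one, which is one reason replication and large collaborative samples matter.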
[Figure: y-axis: % (0 to 20); x-axis: odds ratio (1.05 to 1.95). Citations on slide: Nature, June 7, 2007; Wan et al, HMG (2012).]
(v) Collaboration, sharing of data & repositories
(vi) Translation of effects & Integration of multiple data sources
Epigenomics
Genomics
Metabolomics
Transcriptomics
Overall…
• Much to be learnt from genetic data, but there are clear differences in the properties of genetic and epigenetic data
• A favoured standpoint is that of a united epi.demi.ology and recognition that these approaches are just different points of measurement on the same overarching scheme
• Experiences from the handling of large scale genetic data will be valuable for the arrival of array based epigenetic data
• Will need a convergence of techniques from:
– Observational epidemiology
– Array based, high throughput, molecular analysis
References
- Baldwin, B. The date, identity and career of Vitruvius. Latomus 49, 425-434 (1990).
- Zuk, O., Hechter, E., Sunyaev, S.R. & Lander, E.S. The mystery of missing heritability: Genetic interactions create phantom heritability. Proceedings of the National Academy of Sciences of the United States of America 109, 1193-8 (2012).
- Maher, B. Personal genomes: The case of the missing heritability. Nature 456, 18-21 (2008).
- Antequera, F. & Bird, A. Number of CpG islands and genes in human and mouse. Proceedings of the National Academy of Sciences of the United States of America 90, 11995-9 (1993).
- McCarthy, M.I. et al. Genome-wide association studies for complex traits: consensus, uncertainty and challenges. Nature Reviews Genetics 9, 356-69 (2008).
- Fisher, R.A. Statistical methods in genetics. Heredity 6, 1-12 (1952).
- Zeggini, E. et al. Meta-analysis of genome-wide association data and large-scale replication identifies additional susceptibility loci for type 2 diabetes. Nature Genetics 40, 638-645 (2008).
- Zeggini, E. et al. Replication of genome-wide association signals in UK samples reveals risk loci for type 2 diabetes. Science 316, 1336-41 (2007).
- WTCCC Consortium. Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls. Nature 447, 661-678 (2007).
- Frayling, T.M. et al. A common variant in the FTO gene is associated with body mass index and predisposes to childhood and adult obesity. Science 316, 889-94 (2007).
- Church, C. et al. A mouse model for the metabolic effects of the human fat mass and obesity associated FTO gene. PLoS Genetics 5, e1000599 (2009).
- Gerken, T. et al. The obesity-associated FTO gene encodes a 2-oxoglutarate-dependent nucleic acid demethylase. Science 318, 1469-1472 (2007).
- Timpson, N.J. et al. The FTO/obesity associated locus and dietary intake in children. American Journal of Clinical Nutrition 88, 971-978 (2008).
- Cecil, J.E., Tavendale, R., Watt, P., Hetherington, M.M. & Palmer, C.N.A. An obesity-associated FTO gene variant and increased energy intake in children. New England Journal of Medicine 359, 2558-66 (2008).
- Stratigopoulos, G. et al. Regulation of Fto/Ftm gene expression in mice and humans. AJP - Regulatory, Integrative and Comparative Physiology 294, R1185-1196 (2008).
- Cauchi, S. et al. The genetic susceptibility to type 2 diabetes may be modulated by obesity status: implications for association studies. BMC Medical Genetics 9, 45 (2008).
- Timpson, N.J. et al. Adiposity-related heterogeneity in patterns of type 2 diabetes susceptibility observed in genome-wide association data. Diabetes 58, 505-10 (2009).
- Heath, S.C. et al. Investigation of the fine structure of European populations with applications to disease association studies. European Journal of Human Genetics 16, 1413-29 (2008).
- Davey Smith, G. et al. Clustered environments and randomized genes: a fundamental distinction between conventional and genetic epidemiology. PLoS Medicine 4, e352 (2007).
- Wan, E.S. et al. Cigarette smoking behaviors and time since quitting are associated with differential DNA methylation across the human genome. Human Molecular Genetics AOP (2012).
- The International HapMap Consortium. A second generation human haplotype map of over 3.1 million SNPs. Nature 449, 851-861 (2007).
- Speliotes, E.K. et al. Association analyses of 249,796 individuals reveal 18 new loci associated with body mass index. Nature Genetics 42, 937-948 (2010).
- Dupuis, J. et al. New genetic loci implicated in fasting glucose homeostasis and their impact on type 2 diabetes risk. Nature Genetics 42, 105-16 (2010).
- Tabor, H.K., Risch, N.J. & Myers, R.M. Candidate-gene approaches for studying complex genetic traits: practical considerations. Nature Reviews Genetics 3, 391-7 (2002).