Null Graph Procedure - UNC School of Medicine

A systematic assessment of the
population genetic evidence for
selection across twenty brain-related
phenotypes
Lea K. Davis, PhD
Assistant Professor of Medicine
Vanderbilt Genetics Institute
Vanderbilt University Medical Center
Fantastic students and collaborators
Katya Khramtsova
Evan Beiter
Tony Capra
Barbara Stranger
Dan Stein
Corinne Simonti
Jim Knowles
Emile Chimusa
Celia Van Der Merwe
Presentation Overview
1. Research questions and study motivation
2. Signatures of selection
3. Methods and Results
4. Integrating eQTLs to further understand the biology driving selection
5. Future directions
Research question and study
motivation
How do psychiatric traits with reduced fecundity persist
in the population and demonstrate such high
heritability?
Early age of onset
Reduced fecundity
Moderate to high prevalence
High heritability
An old question
• Several explanations have been offered
• Ancestral neutrality
• Perhaps reduced fecundity is a modern phenomenon?
• Khalifeh et al., 2015, Psych. Medicine
• Balancing selection
• Heterozygote advantage (e.g., sickle-cell anemia and malaria resistance)
• Pleiotropy
• Negative selection of a negatively correlated trait
• Positive selection of a positively correlated trait
• Stabilizing selection
• Polygenic mutation-selection balance
• Vg = Vm/s
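The relation in the last bullet can be written out as below. The reading of the symbols is an assumption, since the slide does not define them: V_g is taken as the equilibrium genetic variance for the trait, V_m as the new mutational variance entering the population each generation, and s as the strength of selection against trait-affecting alleles.

```latex
V_g \approx \frac{V_m}{s}
```

With purely hypothetical values V_m = 0.001 and s = 0.01, this gives V_g ≈ 0.1: weak selection lets a small per-generation mutational input accumulate into substantial standing variance.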
Viewing the paradox through the lens of
genetic architecture
• Highly polygenic
• Consistent with liability threshold
model
• High level of genetic correlation
• Large effect variants very rare
Sullivan, Daly, O’Donovan, 2012, Nat. Rev. Genet.
• Majority of SNP-based heritability
for OCD accounted for by SNPs
with high MAF
• Replicated this finding in a
subsequent OCD sample (under
review)
…meanwhile in evolutionary genomics labs
Efforts to refine detection of recent
(~25,000 years) positive selection
across the genome
Methods developed to test very recent
(~2,000 years) selection
Increased interest in detecting signals
of polygenic selection
Improved sequencing of Neanderthal
and Denisovan genomes
Hypothesis: Polygenic selection has
acted on neuropsychiatric trait-associated
alleles through selection
acting on genetically correlated
phenotypes
Signatures of selection
Hard Sweeps and Soft Sweeps
Novembre and Han (2012)
1. Integrated Haplotype Score (iHS) uses haplotype length as a signature of
positive selection (Voight et al., 2006)
2. Large negative values indicate unusually long haplotypes carrying the derived
allele; large positive values indicate long haplotypes carrying the ancestral allele
Fixation index – measuring population
differentiation
• F statistics describe the deviation in heterozygosity compared to the
expectation under Hardy-Weinberg equilibrium
• F = 1 - (observed number of heterozygotes / expected number of
heterozygotes)
• Fst compares heterozygosity between two subpopulations (e.g., CEU
and ASN)
John Novembre, and Eunjung Han Phil. Trans. R. Soc. B
2012;367:878-886
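A minimal sketch of the two quantities on this slide. The first function is the F formula exactly as written above. The second is the simple textbook form Fst = (Ht - Hs) / Ht for one biallelic SNP in two equally weighted subpopulations; the talk does not say which Fst estimator was used, so that choice and all function names here are assumptions.

```python
def inbreeding_f(observed_het, expected_het):
    """F = 1 - (observed heterozygotes / expected heterozygotes)."""
    if expected_het <= 0:
        raise ValueError("expected_het must be positive")
    return 1.0 - observed_het / expected_het


def expected_het_hwe(p):
    """Expected heterozygosity 2p(1-p) at allele frequency p under Hardy-Weinberg."""
    return 2.0 * p * (1.0 - p)


def fst_two_pops(p1, p2):
    """Textbook Fst for one biallelic SNP, two equally weighted subpopulations.

    Assumption: simple (Ht - Hs) / Ht form, not a sample-size-corrected estimator.
    """
    p_bar = (p1 + p2) / 2.0
    h_total = expected_het_hwe(p_bar)  # heterozygosity if the populations were pooled
    h_sub = (expected_het_hwe(p1) + expected_het_hwe(p2)) / 2.0  # mean within-population value
    if h_total == 0.0:
        return 0.0  # allele fixed or absent in both populations
    return (h_total - h_sub) / h_total
```

Identical allele frequencies give Fst = 0, and alleles fixed for opposite states in the two populations give Fst = 1, which is why high-Fst SNPs are read as strongly differentiated.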
Polygenic Adaptation
Scheinfeldt and Tishkoff, 2013, Nat Rev Genet
Signatures of Polygenic Selection
• Coordinated shifts in frequency across many trait-associated variants
• Tests over-dispersion of risk variants compared to models of drift that account for
population structure
• Reduction in density of singleton events at trait-associated loci
Methods and Results
Summary of Analyses
1. Test for enrichment of ‘hard sweeps’ among trait-associated SNPs
compared to a null distribution of matched SNPs
   1. SNPs with extreme haplotype score (iHS)
   2. SNPs with extreme population differentiation (Fst)
2. Test for ‘signature of polygenic selection’ among trait-associated
SNPs compared to model of neutral genetic drift
3. Test for enrichment and direction of SDS
4. Test for enrichment of trait-associated SNPs in regions of the
genome depleted of Neanderthal alleles
5. In silico functional analyses to derive potential biological drivers of
selection
GWAS Summary Statistics
Complete GWAS summary statistics for twenty-four phenotypes were obtained
from consortium websites (i.e., PGC, IAGP, ENIGMA, T2D, GIANT, IBD).
• Psychiatric Disorders (10): Attention-deficit/hyperactivity disorder, anorexia
nervosa, autism spectrum disorders, bipolar disorder, major depression, schizophrenia,
anxiety disorder, Alzheimer’s disease, Tourette syndrome, obsessive-compulsive
disorder
• Personality Traits (2): Extraversion, neuroticism
• Brain Structure Volumes (8): Putamen, nucleus accumbens, amygdala, caudate nucleus,
hippocampus, pallidum, thalamus, intracranial volume
• Non-psychiatric complex traits (4): Type 2 diabetes, inflammatory bowel disease,
height, body mass index
We selected SNPs modestly associated with each trait at multiple nominal p-value
thresholds (p < 10^-3 and p < 10^-4) for subsequent analysis.
iHS and Fst Enrichment Analysis Workflow
Matched by:
• minor allele frequency (± 3%)
• gene density (± 50%)
• distance to nearest gene (± 50%)
Empirical p-value
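The workflow above can be sketched roughly as follows. This is an illustration under stated assumptions, not the pipeline used in the study: the `Snp` record, the function names, the reading of "± 3%" as an absolute MAF difference of 0.03, the reading of "± 50%" as a relative tolerance, the default of 500 null sets, and the (k + 1) / (n + 1) form of the empirical p-value are all hypothetical choices.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Snp:
    maf: float            # minor allele frequency
    gene_density: float   # genes per window around the SNP
    dist_to_gene: float   # distance to nearest gene
    extreme: bool         # e.g. |iHS| > 2.0 or Fst > 0.30


def is_match(trait, cand, maf_tol=0.03, rel_tol=0.5):
    """True if cand matches trait on MAF (absolute) and on gene density/distance (relative)."""
    return (abs(cand.maf - trait.maf) <= maf_tol
            and abs(cand.gene_density - trait.gene_density) <= rel_tol * trait.gene_density
            and abs(cand.dist_to_gene - trait.dist_to_gene) <= rel_tol * trait.dist_to_gene)


def empirical_enrichment_p(trait_snps, background, n_sets=500, seed=0):
    """Empirical p-value for an excess of extreme SNPs among trait-associated SNPs."""
    rng = random.Random(seed)
    # One pool of matched background SNPs per trait-associated SNP.
    pools = [[b for b in background if is_match(t, b)] for t in trait_snps]
    if any(not pool for pool in pools):
        raise ValueError("a trait SNP has no matched background SNP")
    observed = sum(t.extreme for t in trait_snps)
    hits = 0
    for _ in range(n_sets):
        # Build one null set by drawing a matched SNP for every trait SNP.
        null_count = sum(rng.choice(pool).extreme for pool in pools)
        if null_count >= observed:
            hits += 1
    return (hits + 1) / (n_sets + 1)
```

Drawing each null SNP from a pool matched to a specific trait SNP keeps the null sets comparable in allele frequency and genic context, so an excess of extreme values is less likely to reflect those properties alone.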
iHS and Fst Enrichment Analysis Results

                              Population Differentiation (Fst)      Linkage Disequilibrium (iHS)
Phenotypes                    #SNPs    Fst > 0.30   Fst > 0.56      #SNPs    |iHS| > 2.0   |iHS| > 2.5
Neuropsychiatric Traits
ADHD                          1036     0.048        0.376           514      0.448         0.240
Alzheimer’s                   3863     0.482        0.362           1613     0.086         0.032
Anorexia                      4247     0.132        0.168           1512     0.056         0.286
Anxiety                       2504     0.066        0.332           1317     0.026         0.368
Autism                        3467     0.208        0.390           1383     0.464         0.120
Bipolar Disorder              1847     <0.002*      0.036           924      0.132         0.090
Extraversion                  3316     0.186        0.028           1586     0.286         0.242
MDD                           1162     0.146        0.368           595      0.034         0.110
Neuroticism                   3306     0.030        0.162           1566     0.138         0.190
OCD                           3271     0.490        0.362           1262     0.276         0.204
Schizophrenia                 8759     0.378        0.140           3845     0.026         0.028
TS                            4246     0.122        0.250           1694     0.406         0.226
Non-neuropsychiatric Traits#
BMI                           179      0.260        0.124           107      0.032         0.422
Height                        838      0.150        0.394           405      0.480         0.028
Inflammatory Bowel Disease    1250     0.314        0.388           482      0.150         0.452
Type 2 Diabetes               2037     <0.002*      0.030           1040     0.474         0.332

# Multiple p-value thresholds imposed to roughly equal the number of SNPs included in analysis of
neuropsychiatric phenotypes and determine how results change with number of SNPs
iHS and Fst Analysis Summary
• Expectations of non-neuropsych phenotypes:
• Amato et al., 2011 (very modest Fst differences in height-associated alleles)
• Lohmueller et al., 2006 (no differences in Fst in height-associated alleles)
• Polimanti et al., 2016 (functional networks instead of GWAS results)
• Consistent with expectations for polygenic phenotypes, no trait
exhibited significant enrichment of recent strong positive selection
(i.e., hard sweeps) as measured by the integrated haplotype score
• Significant evidence of population differentiation for bipolar disorder
and type 2 diabetes at Fst > 0.30, trend at Fst > 0.56
• Residual population stratification unlikely cause
• LD-score regression intercept low for both bipolar and T2D
• Type 2 Diabetes (1.0088)
• Bipolar disorder (1.027)
Important caveats for iHS and Fst
• Recently introgressed haplotypes (i.e., Neanderthal or Denisovan)
also introduce unusually large haplotypes
• SNPs with high iHS
• Can be mistaken for positive selection
• *Potential pitfall for enrichment analysis*
• Recommended to remove SNPs falling in known regions of introgression
• Made a big difference in our results!
Polygenic Analysis Pipeline
Null SNPs matched on MAF and B-value*
Empirical p-value
*McVicker G, Gordon D, Davis C, Green P (2009) Widespread Genomic Signatures of Natural Selection in Hominid Evolution. PLoS Genetics.
Comparison phenotypes for context
[Figure: comparison phenotypes, with panels labeled N = 180, N = 4, N = 32, N = 65, N = 140, N = 135]
Berg and Coop, 2014, PLoS Genet.
Results of polygenic adaptation analysis on neuropsychiatric phenotypes

                              P < 5.0 x 10^-3                P < 5.0 x 10^-4
Phenotypes                    #SNPs    Qx       P(Qx)        #SNPs    Qx       P(Qx)
Neuropsychiatric Traits
Alzheimer’s                   1,777    83.80    0.014        259      46.81    0.688
Anorexia                      1,196    68.96    0.112        166      51.41    0.556
Anxiety                       1,420    66.41    0.141        183      59.21    0.235
Autism                        1,149    58.02    0.334        138      42.69    0.833
Bipolar Disorder              2,634    57.26    0.310        449      54.30    0.356
Extraversion                  1,485    88.04    0.001*       196      49.71    0.545
MDD                           1,806    63.31    0.146        224      70.11    0.065
Neuroticism                   1,617    77.52    0.156        205      75.69    0.029
OCD                           1,126    52.30    0.482        141      54.69    0.565
Schizophrenia                 3,307    208.36   <0.001*      1,029    101.30   <0.001*
Tourette Syndrome             1,441    73.51    0.082        209      42.79    0.828
Non-neuropsychiatric Traits
                              P < 5.0 x 10^-4                P < 5.0 x 10^-6
Inflammatory Bowel Disease    423      101.13   <0.001*      81       77.25    0.017
Type 2 Diabetes               478      117.77   <0.001*      42       50.61    0.525
                              P < 5.0 x 10^-6                P < 5.0 x 10^-8
BMI                           246      78.20    0.010        97       82.61    0.003*
Height                        2,002    303.29   <0.001*      1,087    209.27   <0.001*

[Figure panels: Extraversion, Schizophrenia]
Results of polygenic adaptation analysis on brain structure volume phenotypes

                              P < 5.0 x 10^-3                P < 5.0 x 10^-4
Phenotypes                    #SNPs    Qx       P(Qx)        #SNPs    Qx       P(Qx)
Brain Structure Volumes
Accumbens                     1,152    61.61    0.203        134      63.86    0.162
Amygdala                      1,164    55.05    0.378        137      53.34    0.470
Caudate Nucleus               1,210    48.33    0.626        153      56.18    0.348
Hippocampus                   1,237    108.98   <0.001*      177      79.65    0.010
Intracranial Volume           1,249    54.37    0.386        175      50.79    0.526
Pallidum                      1,183    52.76    0.559        176      48.78    0.625
Putamen                       1,203    115.76   <0.001*      155      72.69    0.029
Thalamus                      1,123    70.98    0.088        153      57.51    0.288

[Figure panels: Hippocampus, Putamen]
Increased stringency of clumping thresholds (R2 < 0.1, 1000 Kb)

                              P < 5.0 x 10^-3                P < 5.0 x 10^-4
Phenotypes                    #SNPs    Qx       P(Qx)        #SNPs    Qx       P(Qx)
Neuropsychiatric Traits
Alzheimer’s                   1,471    73.11    0.052        223      51.29    0.518
Anorexia                      971      57.09    0.372        149      47.70    0.649
Anxiety                       1,184    66.30    0.153        171      64.52    0.105
Autism                        980      47.11    0.705        126      47.38    0.727
Bipolar Disorder              2,235    55.04    0.369        395      47.10    0.662
Extraversion                  1,261    84.67    0.004*       184      54.63    0.367
MDD                           1,629    52.10    0.466        209      62.10    0.163
Neuroticism                   1,382    66.70    0.442        186      72.90    0.051
OCD                           985      49.58    0.572        133      47.36    0.654
Schizophrenia                 2,336    172.77   <0.001*      747      80.06    0.012*
Tourette Syndrome             1,215    76.75    0.042        192      45.23    0.781
Brain Regions
Accumbens                     999      55.25    0.355        129      57.94    0.335
Amygdala                      1,004    56.37    0.317        130      57.08    0.318
Caudate Nucleus               1,040    53.79    0.390        140      56.02    0.354
Hippocampus                   1,043    90.99    <0.001*      161      75.86    0.017
Intracranial Volume           1,045    52.12    0.459        157      47.37    0.626
Pallidum                      1,004    51.03    0.577        162      47.41    0.680
Putamen                       1,020    106.94   <0.001*      137      64.33    0.125
Thalamus                      958      70.24    0.083        144      55.49    0.348
Controls
                              P < 5.0 x 10^-4                P < 5.0 x 10^-6
Inflammatory Bowel Disease    310      74.38    0.020        56       62.98    0.146
Type 2 Diabetes               433      110.42   <0.001*      37       52.97    0.422
                              P < 5.0 x 10^-6                P < 5.0 x 10^-8
BMI                           184      70.77    0.048        72       82.53    0.007*
Height                        1,360    245.80   <0.001*      738      159.36   <0.001*
Singleton Density Score Analysis Results
• Compared mean SDS of trait associated alleles against distribution of
mean SDS of 500 sets of matched null SNPs
• Empirical p-value
• Height (p=0.008), schizophrenia (p=0.004), hippocampus (p=0.002)
• tSDS analysis demonstrates direction of effect
• Replicated height finding
[Figure panels: Height, SCZ, Hippocampus]
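The comparison described on this slide can be sketched as follows, assuming the SDS values for the trait-associated alleles and for each matched null set are already available as plain lists. The function names, the two-sided form of the test, and the (k + 1) / (n + 1) p-value convention are assumptions; the slide states only that the mean SDS was compared against 500 matched null sets. The sign flip in `trait_polarized_sds` reflects the usual reading of tSDS (SDS re-expressed relative to the trait-increasing allele) and should be treated as an assumption as well.

```python
from statistics import mean


def trait_polarized_sds(sds, derived_allele_effect):
    """Return SDS oriented to the trait-increasing allele (assumed tSDS convention).

    SDS is taken to be defined for the derived allele, so the sign is flipped
    when the derived allele lowers the trait.
    """
    return sds if derived_allele_effect >= 0 else -sds


def mean_sds_empirical_p(trait_sds, null_sets):
    """Two-sided empirical p-value for the mean SDS of trait-associated alleles."""
    observed = mean(trait_sds)
    null_means = [mean(s) for s in null_sets]
    center = mean(null_means)
    # Count null sets whose mean deviates from the null center at least as far as observed.
    extreme = sum(abs(m - center) >= abs(observed - center) for m in null_means)
    return (extreme + 1) / (len(null_means) + 1)
```

With 500 null sets, the smallest p-value this convention can return is 1/501, which is on the same scale as the p = 0.002 reported for hippocampus.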
Summary of Polygenic Adaptation Results
• Over-dispersion of risk alleles
  • Initial evidence of polygenic adaptation in:
    • Schizophrenia
    • Extraversion
    • Hippocampus volume
    • Putamen volume
    • T2D
    • IBD
    • BMI
    • Height
• Singleton density score
  • Secondary evidence of polygenic adaptation in:
    • Schizophrenia
    • Hippocampus
    • Height
    • BMI
  • Direction of effect – SCZ protective (hippocampus volume reducing)
    alleles demonstrate evidence of very recent selection
• Robust to clumping thresholds
Integrating eQTLs to further
understand the biology driving
selection
Data integration
Scheinfeldt and Tishkoff, 2013, Nat Rev Genet
Enrichment of eQTLs from brain and immune
tissues among trait-associated SNPs
(excluding HLA)
Conclusions – future testable hypotheses
1. We find no evidence of strong, sweeping selection driving allele
frequencies of neuropsychiatric-associated alleles
2. Convergent evidence of polygenic selection in schizophrenia,
extraversion, hippocampus, and putamen volume
3. Immune adaptation may be driving findings in brain structure
volumes
4. Brain-specific adaptation may be driving selection in schizophrenia
Study Limitations and Caveats
1. Residual population stratification
2. Differential power between traits makes comparisons difficult
3. Pleiotropy – both a feature and a bug
4. One piece of a complex story
Future directions
• Developing approaches to test predictions consistent with polygenic
stabilizing or directional selection in biobank data
[Figure: axes labeled Density, Number of disease codes, and PRS]
Acknowledgements
• The individuals who have selflessly participated in genetic research
• The investigators and consortia who have graciously shared their data
• The students and collaborators who have worked with us!
• Dr. Jeremy Berg