Original Article
Shared Molecular Pathways and Gene Networks for
Cardiovascular Disease and Type 2 Diabetes Mellitus in
Women Across Diverse Ethnicities
Kei Hang K. Chan, PhD; Yen-Tsung Huang, MD, ScD; Qingying Meng, PhD;
Chunyuan Wu, MS; Alexander Reiner, MD; Eric M. Sobel, PhD; Lesley Tinker, PhD;
Aldons J. Lusis, PhD; Xia Yang, PhD; Simin Liu, MD, ScD
Downloaded from http://circgenetics.ahajournals.org/ by guest on June 17, 2017
Background—Although cardiovascular disease (CVD) and type 2 diabetes mellitus (T2D) share many common risk factors,
potential molecular mechanisms that may also be shared for these 2 disorders remain unknown.
Methods and Results—Using an integrative pathway and network analysis, we performed genome-wide association
studies in 8155 blacks, 3494 Hispanic American, and 3697 Caucasian American women who participated in the national
Women’s Health Initiative single-nucleotide polymorphism (SNP) Health Association Resource and the Genomics and
Randomized Trials Network. Eight top pathways and gene networks related to cardiomyopathy, calcium signaling, axon
guidance, cell adhesion, and extracellular matrix seemed to be commonly shared between CVD and T2D across all 3
ethnic groups. We also identified ethnicity-specific pathways, such as cell cycle (specific for Hispanic American and
Caucasian American) and tight junction (CVD and combined CVD and T2D in Hispanic American). In network analysis
of gene–gene or protein–protein interactions, we identified key drivers that included COL1A1, COL3A1, and ELN in
the shared pathways for both CVD and T2D. These key driver genes were cross-validated in multiple mouse models of
diabetes mellitus and atherosclerosis.
Conclusions—Our integrative analysis of American women of 3 ethnicities identified multiple shared biological pathways
and key regulatory genes for the development of CVD and T2D. These prospective findings also support the notion that
ethnicity-specific susceptibility genes and processes are involved in the pathogenesis of CVD and T2D. (Circ Cardiovasc
Genet. 2014;7:911-919.)
Key Words: cardiovascular diseases ◼ diabetes mellitus ◼ ethnology ◼ genetics ◼ genome-wide association study
◼ women
It has long been known that cardiovascular disease (CVD)
and type 2 diabetes mellitus (T2D) share many common
risk factors and pathophysiological intermediaries including
obesity, dyslipidemia, insulin resistance, and proinflammatory
and prothrombotic states.1–3 However, the key molecular drivers underlying these highly correlated phenotypes as well as
the potential regulatory networks shared in the pathogenesis
of CVD and T2D remain poorly understood.4
Of the ≈60 genetic loci identified for CVD and T2D in
large-scale genome-wide association studies (GWAS),5 only 2
significant loci (TCF7L2 and VEGFA) are shared between the
2 diseases. There are ≥3 fundamentally important questions
that remain to be answered. First, are there additional genetic
risks, either alone or concentrated in specific regulatory networks, that may explain the pathophysiological connections
between CVD and T2D? Second, are there any ethnicity-specific genetic mechanisms for these 2 common vascular
diseases, given that there are drastic ethnicity-specific differences in their risk and linkage disequilibrium patterns?6–8
Third, although common genetic loci have been detected
across different populations (eg, the Chr9p21 locus for CVD
and the TCF7L2 locus for T2D), supporting the presence of
common pathological paths across ethnicity, to what degree
are molecular mechanisms shared across ethnicities?
Clinical Perspective on p 919
To answer these 3 questions, we performed 2 GWAS for
both CVD and T2D using an integrative pathway and network
analysis in the national Women’s Health Initiative SNP Health
Association Resource (WHI-SHARe) and the Genomics and
Received March 18, 2014; accepted September 23, 2014.
From the Department of Epidemiology (K.H.K.C., Y.-T.H., S.L.) and Division of Endocrinology, Department of Medicine (S.L.), Warren Alpert Medical
School of Brown University, Providence, RI; Department of Integrative Biology and Physiology (K.H.K.C., Q.M., X.Y.), Department of Human Genetics
(E.M.S.), Department of Medicine/Division of Cardiology, David Geffen School of Medicine (A.J.L.), and Departments of Medicine and Obstetrics and
Gynecology, David Geffen School of Medicine (S.L.), University of California Los Angeles; Biostatistics Division (C.W.), Public Health Sciences Division
(L.T.), Fred Hutchinson Cancer Research Center, Seattle, WA; and Department of Epidemiology, University of Washington, Seattle (A.R.).
The Data Supplement is available at http://circgenetics.ahajournals.org/lookup/suppl/doi:10.1161/CIRCGENETICS.114.000676/-/DC1.
Correspondence to Xia Yang, PhD, Department of Integrative Biology and Physiology, University of California Los Angeles, Los Angeles, CA 90095.
E-mail [email protected] or Simin Liu, MD, ScD, Department of Epidemiology, Brown University, Providence, RI 02912. E-mail [email protected]
© 2014 American Heart Association, Inc.
Circ Cardiovasc Genet is available at http://circgenetics.ahajournals.org
DOI: 10.1161/CIRCGENETICS.114.000676
912 Circ Cardiovasc Genet December 2014
Randomized Trials Network (WHI-GARNET). These cohorts
provide unique opportunities to examine both CVD and T2D,
alone or in combination, across multiple ethnicities to allow
interdisease and interethnicity comparisons.
Methods
Study Participants
A detailed description of study participants of both WHI-SHARe and
WHI-GARNET is given in the Data Supplement and Table I in the
Data Supplement. In brief, the WHI-SHARe included 8155 blacks
and 3494 Hispanic American (HA) women. The WHI-GARNET involved 3697 Caucasian American (CA) women. The research protocol was approved by the institutional review board and all human
participants gave written informed consent.
Definition of Clinical End Points
In WHI-SHARe, incident cases of CVD were classified based on any
event of myocardial infarction, stroke, deep vein thrombosis, and
pulmonary embolism during follow-up. Incident cases of T2D were
identified on the basis of those clinical cases that had no history of
T2D at baseline and diagnosed during the follow-up period. Those
women in the cohort who were free of T2D or CVD were used as
controls. In WHI-GARNET, CVD cases were identified during the
Hormone Therapy (HT) trial based on clinical diagnosis of acute
myocardial infarction that required overnight hospitalization, silent
myocardial infarction determined from serial electrocardiograms obtained every 3 years, or death because of coronary heart disease. Cases
of T2D were also identified during the HT trial. Controls were free of
coronary heart disease, stroke, venous thromboembolism, and T2D
by the end of the HT trial.
Genetic Data
Genome-wide genotyping of the WHI-SHARe participants was performed using the Affymetrix 6.0 array (Affymetrix, Inc, Santa Clara,
CA). Genotyping for WHI-GARNET participants was performed using the Illumina HumanOmni1-Quad SNP platform (Illumina, Inc,
San Diego, CA). Details about genotyping methods and quality control are given in the Data Supplement.
Standard SNP Analysis
We performed standard SNP association analysis for 3 end points,
that is, CVD, T2D, and combined CVD+T2D adjusting for principal components for global ancestry and matching factors (detailed
in the Data Supplement). Demographic and lifestyle factors do not
influence germline genetic variants and as such were not treated as
confounders and were not adjusted for in these models. Given that
WHI-SHARe included 8155 blacks and 3494 HA, statistical power
seems excellent (>80%) to detect an odds ratio of 1.25 for minor allele frequency >0.25 in blacks and an odds ratio of 1.5 for minor
allele frequency >0.13 in HA. Power estimate among 3697 WHI-GARNET CA is almost identical to that in HA.
Pathway and Network-Based Integrative Analysis
Accumulating evidence supports that multiple genes involved in biological pathways or gene networks, rather than individual isolated
genes, coordinate together to contribute to disease risks.9–14 To uncover the hidden mechanisms that are not obvious from the individual
top GWAS hits alone, we augmented the standard GWAS analysis
with pathway and network approaches. Functionally related genes
involved in metabolic and signaling pathways were obtained from
Kyoto Encyclopedia of Genes and Genomes (KEGG) and Reactome.
We tested each pathway for enrichment of genetic signals for CVD
and T2D, alone or in combination, using 5 well-established methodologies (Meta-Analysis Gene-set Enrichment of VariaNT Associations
[MAGENTA],12 gene set analysis-SNP [GSA-SNP],11 Network
Interface Miner for Multigenic Interactions [NIMMI],9 Pathway and
Network-Oriented GWAS Analysis [PANOGA],10,15 and expression
SNP [eSNP]14; detailed in the Data Supplement and Table II in the
Data Supplement) that investigate whether functionally related genes
are enriched for both strong and subtle genetic risks (ie, not limited to
top, genome-wide significant loci) of diseases. We chose to use multiple methodologies to avoid potential bias from any particular method. We ran the 5 methodologies separately and a statistical cutoff of
false-discovery rate16 <5% (implemented in MAGENTA, GSA-SNP,
NIMMI, and eSNP approaches) or Bonferroni-corrected P<0.05 (implemented in PANOGA) as provided by each method was considered
significant. Pathways that showed significance in ≥2 methods were
chosen as the top pathways to report.
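The selection rule above (a per-method multiple-testing cutoff, then a requirement of significance in ≥2 methods) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes each method yields raw pathway P values, it assumes the Benjamini-Hochberg procedure for the false-discovery rate (the text cites a reference but does not name the procedure here), and the function names and toy P values are hypothetical.

```python
from collections import Counter


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted q-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so the q-values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues


def bonferroni(pvalues):
    """Return Bonferroni-adjusted P values, capped at 1."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]


def top_pathways(results, bonferroni_methods=("PANOGA",), alpha=0.05, min_methods=2):
    """results: {method: {pathway: raw P}}.

    Returns the sorted pathways that pass the cutoff in >= min_methods methods.
    """
    hits = Counter()
    for method, pvals in results.items():
        names = list(pvals)
        raw = [pvals[name] for name in names]
        if method in bonferroni_methods:
            adjusted = bonferroni(raw)
        else:
            adjusted = benjamini_hochberg(raw)
        for name, adj in zip(names, adjusted):
            if adj < alpha:
                hits[name] += 1
    return sorted(name for name, count in hits.items() if count >= min_methods)


# Hypothetical raw P values, for illustration only (not study results).
toy = {
    "MAGENTA": {"focal adhesion": 0.001, "cell cycle": 0.04, "axon guidance": 0.20},
    "GSA-SNP": {"focal adhesion": 0.002, "cell cycle": 0.60, "axon guidance": 0.01},
    "PANOGA": {"focal adhesion": 0.30, "cell cycle": 0.50, "axon guidance": 0.004},
}
print(top_pathways(toy))
```

In this toy input, a pathway is reported only when two different methods agree, which mirrors the stated motivation of avoiding bias from any single method.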
Identification of Key Regulatory Genes for
the Disease-Associated Pathways Using Gene
Regulatory Networks and Protein–Protein
Interaction Networks
As hundreds of genes are involved in the biological pathways, we sought
to identify important regulators of the top significant pathways as a
means to prioritize genes and uncover novel regulatory mechanisms.
We integrated the 169 shared genes involved in the top 8 pathways
with graphical networks (Bayesian networks and protein–protein interaction17; sources detailed in Table III in the Data Supplement) to
identify key regulators of the 169 shared genes using a key driver
(KD) analysis method.18,19 KD analysis takes a set of genes (G) and
a gene network N as input. For every node K in N, the subnetwork
(Nk of K) was determined by 3-edge expansion and then tested for
enrichment of genes in G using Fisher exact test. Nodes whose neighborhood subnetwork shows significant enrichment at Bonferroni-corrected P<0.05 were termed KDs. As multiple KD lists were
generated using multiple networks, we ranked the KDs using a normalized rank score (NRS) to summarize the consistency and strength:

$$\mathrm{NRS} = \frac{C_{KD}}{N_{BN+PPI}} \times \sum_{i=1}^{C_{KD}} \mathrm{Rank}_{KD_i}$$

where C_KD is the count of network
models from which a KD was identified among all networks used, including 8 BNs from the 8 tissue types (ie, adipose, liver, blood, heart,
brain, islet, kidney, and muscle) and protein–protein interaction; C_KD
is then normalized by the total number of networks (N_BN+PPI) to represent the consistency of a KD across all networks tested; the KD
strength is represented by summing the normalized statistical rank
in each network i (Rank_KDi) across all networks from which the KD is
identified; Rank_KDi was calculated by dividing the rank of a KD, based
on the P values of the Fisher exact test in descending order, by
the total number of KDs identified from network i. KDs with high
normalized rank score were those with high network enrichment for
pathway genes and high consistency across networks tested.
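The key driver procedure and the normalized rank score described above can be sketched in a few functions. This is a simplified illustration under stated assumptions, not the published KD implementation: the network is treated as an undirected adjacency dictionary, the test universe is taken to be all nodes in the network, the Fisher exact test is one-sided (over-representation), and all function names are hypothetical.

```python
from collections import deque
from math import comb


def neighborhood(network, node, max_edges=3):
    """Nodes reachable from `node` within `max_edges` edges (excluding `node`)."""
    distance = {node: 0}
    queue = deque([node])
    while queue:
        current = queue.popleft()
        if distance[current] == max_edges:
            continue  # do not expand beyond the edge limit
        for neighbor in network.get(current, ()):
            if neighbor not in distance:
                distance[neighbor] = distance[current] + 1
                queue.append(neighbor)
    return set(distance) - {node}


def fisher_enrichment_p(overlap, subnetwork_size, gene_set_size, universe_size):
    """One-sided Fisher exact P: chance of >= `overlap` set genes in the subnetwork."""
    upper = min(subnetwork_size, gene_set_size)
    tail = sum(
        comb(gene_set_size, k) * comb(universe_size - gene_set_size, subnetwork_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / comb(universe_size, subnetwork_size)


def key_drivers(network, gene_set, alpha=0.05, max_edges=3):
    """Return {node: P} for nodes passing a Bonferroni cutoff over all nodes tested."""
    nodes = set(network) | {n for nbrs in network.values() for n in nbrs}
    genes_in_network = set(gene_set) & nodes
    drivers = {}
    for node in sorted(nodes):
        sub = neighborhood(network, node, max_edges)
        if not sub:
            continue
        p = fisher_enrichment_p(
            len(sub & genes_in_network), len(sub), len(genes_in_network), len(nodes)
        )
        if p * len(nodes) < alpha:
            drivers[node] = p
    return drivers


def rank_kds(p_by_kd):
    """Rank_KDi: rank by P in descending order, divided by the number of KDs.

    The smallest P value therefore receives the highest normalized rank (1.0).
    """
    ordered = sorted(p_by_kd, key=p_by_kd.get, reverse=True)
    return {kd: (i + 1) / len(ordered) for i, kd in enumerate(ordered)}


def normalized_rank_score(rank_by_network, n_networks):
    """NRS = (C_KD / N_networks) * sum of Rank_KDi over networks where the gene is a KD."""
    c_kd = len(rank_by_network)
    return c_kd / n_networks * sum(rank_by_network.values())
```

With 9 networks (8 tissue Bayesian networks plus protein–protein interaction), a gene that is a KD in 3 of them with normalized ranks 1.0, 0.5, and 1.0 would score (3/9) × 2.5 under this sketch.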
To cross-validate the top KD genes from top disease pathways
identified, we searched multiple mouse databases that include (1)
genes tested causal for CVD and T2D phenotypes,20 (2) the phenotypic changes in genetically modified mouse models with individual
genes perturbed,21 and (3) genes identified for CVD and T2D phenotypes in the hybrid mouse diversity panel (>100 strains of inbred or
recombinant inbred mice).22
Results
The descriptive statistics on demographics and lifestyle factors of each study population are shown in Table 1. The blacks,
HA, and CA women in WHI-SHARe and WHI-GARNET differed significantly in age, body mass index, current smoking,
alcohol drinking, hormone usage, physical activity, and family
history of T2D (P<0.001).
Identification of Significant Genetic Loci Using
Standard GWAS Analysis
Four genomic loci reached genome-wide significance
(P<5e−8) in the standard GWAS analysis, including
Chan et al Pathways of CVD and T2D in Multiple Ethnicities 913
Table 1. Baseline Characteristics* of Participants in WHI-SHARe and WHI-GARNET Stratified by Ethnicity

Characteristic                                                        Blacks in WHI-SHARe   HA in WHI-SHARe   CA in WHI-GARNET
n                                                                     8155                  3494              3697
Age (mean±SD), y                                                      61.6±7.03             60.3±6.69         65.7±6.90
BMI (mean±SD)                                                         31.0±6.37             28.9±5.60         29.7±6.13
Current smoking, %                                                    11.7                  6.95              11.1
Current alcohol drinking, %                                           54.4                  66.8              73.3
Current hormone user, %                                               25.5                  35.4              7.87
Physical activity: total metabolic equivalents (METS)/wk (mean±SD)    9.67±12.7             10.8±13.8         10.2±12.7
T2D case, %                                                           19.4                  17.8              32.4
Family history of T2D, %                                              51.3                  45.0              36.6
CVD case, %                                                           7.59                  4.33              20.4

BMI indicates body mass index; CA, Caucasian American; CVD, cardiovascular disease; HA, Hispanic American; T2D, type 2 diabetes
mellitus; WHI-GARNET, Women’s Health Initiative-Genomics and Randomized Trials Network; and WHI-SHARe, Women’s Health
Initiative SNP Health Association Resource.
*Significant difference (P<0.001) seen between the 3 populations for every above variable. P value is calculated using ANOVA test
for continuous variables (ie, age, BMI, and physical activity) and χ2 test for categorical variables (ie, current smoking, current alcohol
drinking, current hormone user, family history of T2D, CVD case in WHI-SHARe or CVD case in WHI-GARNET, and T2D case).
rs11885576 (KLHL29 at 2p24) for CVD in CA, rs2805429
(RYR2 at 1q43) for T2D in blacks, and rs17591786
(FLJ45721 at 4p15) and rs7825609 (NAT2 at 8p22) for combined CVD+T2D in CA (summarized in Table 2). In addition, several previously established loci including Chr9p21,
TCF7L2, and CDKAL1 for CVD or T2D reached P<5e−3
in our study (summarized in Table 3; Table IV in the Data
Supplement).
Identification of Biological Pathways Using
Integrative Pathway and Network Analysis
Thirty-six of the 1501 pathways from the KEGG and Reactome
databases were found to be associated with ≥1 of the 3 end
points (ie, CVD, T2D, and combined CVD+T2D) in ≥1 ethnic
group (Tables V and VI in the Data Supplement). Of these 36
pathways identified, 8 including focal adhesion, hypertrophic
cardiomyopathy (HCM), extracellular matrix–receptor interaction signaling, dilated cardiomyopathy, arrhythmogenic
right ventricular cardiomyopathy (ARVC), calcium signaling,
axon guidance, and cell adhesion molecules were commonly
enriched for genetic signals for all 3 end points (CVD, T2D,
and combined CVD+T2D) across all 3 ethnic groups. These
pathways were also significantly enriched in the C4D and
CARDIOGRAM GWAS (Table VII in the Data Supplement).
There were 638 unique genes involved in these common pathways (Figure 1; genes listed in Table V in the Data
Supplement). Forty-five of these genes have been implicated
previously in CVD (24 genes)24–26 and T2D (21 genes).14,27,28
Five of the 8 commonly enriched pathways, namely, HCM,
dilated cardiomyopathy, ARVC, focal adhesion, and extracellular matrix–receptor interaction signaling, were also found to
be highly interconnected as demonstrated by a shared common set of 117 genes among them (Figure 1).
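The interconnection between pathways described above comes down to set operations on pathway gene memberships. The sketch below is illustrative only: the gene identifiers `G1` to `G5` and their memberships are placeholders rather than actual KEGG or Reactome annotations, and the function names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def multi_pathway_genes(pathways, min_pathways=2):
    """Genes that belong to at least `min_pathways` pathways."""
    counts = Counter(gene for genes in pathways.values() for gene in set(genes))
    return {gene for gene, count in counts.items() if count >= min_pathways}


def pairwise_overlap(pathways):
    """Number of shared genes for every pair of pathways."""
    return {
        (a, b): len(set(pathways[a]) & set(pathways[b]))
        for a, b in combinations(sorted(pathways), 2)
    }


def shared_by_all(pathways, names):
    """Genes common to every pathway listed in `names`."""
    return set.intersection(*(set(pathways[name]) for name in names))


# Placeholder memberships, for illustration only.
toy_pathways = {
    "HCM": {"G1", "G2", "G3"},
    "ARVC": {"G1", "G3", "G4"},
    "focal adhesion": {"G1", "G2", "G5"},
}
print(sorted(multi_pathway_genes(toy_pathways)))
print(pairwise_overlap(toy_pathways))
print(shared_by_all(toy_pathways, ["HCM", "ARVC", "focal adhesion"]))
```

Applied to the real memberships of the 8 shared pathways, the first function would give the genes drawn with larger labels in Figure 1, and the last would give a core set common to a chosen group of pathways.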
Besides the shared pathways across diseases and ethnicities,
we also identified disease- and ethnicity-specific pathways
(Table VIII in the Data Supplement). By disease, the apoptosis pathway was significantly associated with CVD+T2D, but
not T2D; acute myeloid leukemia was associated with T2D,
but not CVD. By ethnicity, the cell cycle pathway was significant for all 3 end points for HA and CA, but not blacks; the
WNT signaling pathway was significant for HA; pathways in
cancer was not significant for CA; and dorsoventral axis formation and prion diseases were only specific to blacks. Certain
pathways demonstrated both disease and ethnicity specificity.
For instance, the adipocytokine signaling pathway was only
significant for T2D in CA; the prostate cancer and melanogenesis pathways were only significant for T2D or CVD+T2D
in HA or blacks, but not for CVD in any of the 3 ethnicities.
Table 2. Genome-Wide Significant SNPs for CVD, T2D, and Combined CVD+T2D in Women’s Health Initiative Women

End Point   Population   Chromosomal Region   Top SNP in Region*   Position Hg18†   Candidate Gene   Minor/Major Allele‡   Minor Allele Frequency   Odds Ratio (95% CI)   P Value
CVD         CA           2p24                 rs11885576           23526223         KLHL29           G/T                   0.04                     0.43 (0.32–0.58)      3.5e−8
T2D         Blacks       1q43                 rs2805429            235750901        RYR2             G/C                   0.47                     1.23 (1.15–1.33)      4.0e−8
CVD+T2D     CA           4p15                 rs17591786           26828037         FLJ45721         G/A                   0.40                     1.29 (1.18–1.41)      1.8e−8
CVD+T2D     CA           8p22                 rs7825609            18290501         NAT2             C/T                   0.01                     0.35 (0.25–0.49)      6.0e−10

CA indicates Caucasian American; CVD, cardiovascular disease; and T2D, type 2 diabetes mellitus.
*The top SNP with the smallest P values among the genotyped SNP for each locus. These are novel associations (SNPs not found in the Catalog of published genome-wide association studies, however, the KLHL29 and NAT2 loci have been reported before).
†Positions of the SNPs were derived from dbSNP build 136.
‡The coded allele used to calculate the effect size was underlined.
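As a rough consistency check on the reported odds ratios, confidence intervals, and P values in Table 2, a test statistic can be back-calculated from an odds ratio and its 95% CI. The sketch below assumes a Wald-type interval on the log odds ratio scale and a two-sided normal test; the source does not state how its intervals were constructed, so this is an assumption, and the function name is hypothetical. Because the published values are rounded, the recovered P value only approximates the reported one.

```python
from math import erfc, log, sqrt


def wald_p_from_or_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Approximate z statistic and two-sided P from an odds ratio and its 95% CI.

    Assumes the CI is exp(log(OR) +/- z_crit * SE), i.e. a Wald interval.
    """
    se = (log(ci_high) - log(ci_low)) / (2 * z_crit)
    z = log(odds_ratio) / se
    p = erfc(abs(z) / sqrt(2))  # two-sided tail of the standard normal
    return z, p


# RYR2 row of Table 2: odds ratio 1.23 (1.15-1.33), reported P = 4.0e-8.
z, p = wald_p_from_or_ci(1.23, 1.15, 1.33)
print(f"z = {z:.2f}, P = {p:.1e}")
```

Under these assumptions the recovered P value falls below the genome-wide threshold of 5e−8 and within the same order of magnitude as the reported value, which is the level of agreement one can expect from rounded inputs.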
Table 3. Replication of Previously Identified CVD and T2D Loci

                                                     Observed P Value in WHI    Association With CVD+T2D in WHI
Disease  Genes       Region   Reference SNP (RS) ID  Blacks   HA       CA       Blacks   HA       CA
T2D      CDKAL1      6p22.3   rs10946398             0.1      0.0002*  0.09     0.34     0.0006*  0.13
T2D      CDKAL1      6p22.3   rs7754840              0.08     0.0002*  0.10     0.28     0.0006*  0.13
T2D      CDKAL1      6p22.3   rs7756992              0.59     0.0001*  0.04     0.75     0.0001*  0.05
T2D      TCF7L2      10q25.2  rs4506565              0.001*   0.63     0.04     0.08     0.94     0.08
T2D      TCF7L2      10q25.2  rs7901695              0.002*   0.47     0.02     0.14     0.82     0.06
CVD      Intergenic  9p21.3   rs1333042              0.64     0.46     0.001*   0.54     0.99     0.0003*
CVD      Intergenic  9p21.3   rs4977574              0.84     0.28     0.005*   0.01     0.74     0.002*

CA indicates Caucasian American; CVD, cardiovascular disease; HA, Hispanic American; T2D, type 2 diabetes mellitus; and WHI, Women’s Health Initiative.
*P values <5×10−3.
Identification and Validation of Putative Key
Regulatory Genes for the Shared Pathways Across
Diseases and Ethnicities
To identify potential KD genes among the significant pathways shared between diseases and ethnicities, we integrated
the pathway genes with 9 different regulatory or interaction
networks that capture gene–gene or protein–protein interactions. These KD genes represent central network genes which,
when perturbed, can potentially affect a large number of
genes involved in the CVD and T2D pathways and thus exert
stronger impact on diseases. The 10 top KD genes included
COL1A1, COL3A1, ELN, COL4A1, CD93, FN1, MMP2,
SPARC, COL2A1, and THBS2 in multiple networks (Figure 2;
Table IX in the Data Supplement). These KD genes were also
confirmed in multiple mouse data sets that documented their
modulating impact on risk of T2D and CVD (Table IX in the
Data Supplement). For example, the gene expression levels
or the SNPs regulating the expression levels of the COL4A1
gene were tested causal for 14 CVD and T2D traits in 4 different tissues in 7 mouse F2 cross data sets and a mouse data set
comprising >100 inbred or recombinant inbred strains.
Interestingly, the KDs themselves are not among the GWAS
hits from the current and previous GWAS for CVD and T2D,
although the genes within the pathways that these KDs seem
to regulate are enriched for disease risk SNPs. We speculate
that genetic polymorphisms that strongly perturb KDs may
impose evolutionary constraints, which may explain the lack
of strong GWAS hits in the KDs, whereas subtle genetic polymorphisms that affect KDs may still be enriched for disease
risks. To this end, we analyzed the risk enrichment for the top
10, 30, and 100 KDs, respectively. Our results indeed indicated that the top KDs, especially the top 30 and 100 KDs,
were significantly enriched for genetic risks of CVD, T2D,
and combined CVD+T2D (Table X in the Data Supplement).
Discussion
In this genome-wide assessment of 8155 blacks, 3494 HA, and
3697 CA women who participated in the WHI-SHARe and the
WHI-GARNET, we identified 4 independent genetic loci and
36 pathways to be significantly associated with CVD, T2D, and
CVD+T2D in one or more ethnicities. Among the significant
signals, the FLJ45721 and NAT2 loci were associated with the
combined CVD+T2D end point in CA and 8 pathways were
consistently associated with both types of vascular diseases
across all 3 ethnicities. These results suggest the presence of
core mechanisms underlying both CVD and T2D. Ethnicity- and disease-specific pathways were also identified. We further
uncovered potential novel regulators of these shared pathways
supporting their pleiotropic and causal impact on CVD and T2D.
Our standard GWAS analysis of 3 ethnic populations identified several biologically plausible disease loci including 2
previously implicated loci (KLHL29 and NAT2) for cardiometabolic diseases and 2 novel loci (RYR2 and FLJ45721).
The KLHL29 locus was found to be associated with CVD
in CA in our study and it was also previously implicated in
CVD in blacks29 and obesity in HA.30 These lines of evidence
support its importance for multiple cardiometabolic diseases.
The KLHL29 locus is highly complex and seems to encode
multiple proteins containing BTB and kelch motifs but with
poorly annotated functions. NAT2 was significant for the joint
CVD+T2D end point in our study and was previously detected
as a GWAS signal for several important cardiometabolic traits
including total cholesterol, triglyceride,31 and insulin sensitivity (GENESIS consortium, personal communication) that are
relevant for both CVD and T2D. NAT2 (N-acetyltransferase
2) is a well-known pharmacogenetic gene responsible for
O- and N-acetylation of arylamine and hydrazine drugs and
carcinogens but the mechanisms linking NAT2 to cardiometabolic traits are unknown. Along with NAT2, the FLJ45721
locus was a novel signal for the combined CVD+T2D end
point in our study. However, there is currently limited knowledge about this locus and the candidate genes in this region.
An additional novel locus, RYR2, was found to be associated
with T2D in blacks in our analysis. RYR2 encodes ryanodine
receptor 2, a calcium channel. As calcium is critical for insulin
secretion and sensitivity,32 RYR2 may contribute to T2D by
affecting insulin levels and activities. Indeed, a recent study
supported a role of RYR2 in islet β-cell function, insulin secretion, and glucose tolerance, all key processes in T2D.33
Our findings also add to the body of literature supporting
the involvement of multiple regulatory gene networks in the
pathogenesis of complex cardiometabolic diseases, although
individual genes may only exert subtle effects.14,24,34 Our integrative pathway-based analysis revealed 8 consistent pathways between CVD and T2D across the 3 ethnic groups. Four
pathways, calcium signaling, axon guidance, focal adhesion,
Figure 1. Network of 8 top pathways enriched for cardiovascular disease (CVD), type 2 diabetes mellitus (T2D), and combined CVD+T2D
among blacks, Hispanic American, and Caucasian American Women’s Health Initiative women. The color codes are: salmon, hypertrophic cardiomyopathy (HCM); green, dilated cardiomyopathy (DC); yellow, arrhythmogenic right ventricular cardiomyopathy (ARVC); light
blue, calcium signaling pathway (Ca+); orange, axon guidance (axon); magenta, cell adhesion molecules (CAMs); brown, focal adhesion
(FA); and purple, extracellular matrix–receptor interaction (ECM). The diamond nodes represent pathways and the round nodes denote
genes, and each edge shows the association between a gene and a pathway. Genes involved in ≥2 pathways are
shown with a larger font label, and their nodes are in light green. The figure was created using Cytoscape.23
and extracellular matrix–receptor interaction signaling, have
been implicated previously in CAD and T2D.14,24 Although
the axon guidance pathway is mainly involved in localization and neuronal extension during embryogenesis,35 genes
within the axon guidance pathway have been connected to
both CVD and T2D. For instance, a family of secreted proteins known as repulsive axon guidance cues (SLIT) and
roundabout axon guidance receptors in the pathway have been
found to reduce cytokine and thapsigargin-induced cell death
under hyperglycemic conditions. Particularly, SLIT also triggered a release of endoplasmic reticulum luminal Ca2+, which
suggested a molecular mechanism that defends β cells from
endoplasmic reticulum stress–induced apoptosis. Therefore,
local SLIT secretion may play a role in the survival and
function of pancreatic β cells. Because T2D
results from a deficiency in functional β-cell mass, the axon
Figure 2. Network key drivers and gene subnetworks of the top 8 cardiovascular disease (CVD)/type 2 diabetes mellitus (T2D) pathways.
Top 10 ranked multitissue key drivers (bigger nodes in yellow) of the top 8 CVD/T2D pathways in the protein–protein interaction network
(edge color, green) and Bayesian network of adipose (orange), liver (yellow), blood (red), heart (brown), brain (blue), islet (pink), kidney
(purple), and muscle (light blue) tissues. The genes within the top 8 CVD/T2D pathways were highlighted in pink.
guidance pathway, especially SLIT, may contribute to therapeutic approaches for improving β-cell survival and function.36 However, netrins, a family of proteins involved in axon
guidance during embryogenesis, were found to be involved
in angiogenesis and ischemia–reperfusion injury.37 Four additional pathways, including HCM, dilated cardiomyopathy,
ARVC, and cell adhesion molecules, were not reported previously. The HCM pathway involves genes that increase the calcium sensitivity of cardiac myofilaments leading to imbalance
in energy supply and demand in the heart under severe stress,
which may contribute to the development of CVD.38 Calcium
sensitivity is also important for T2D as discussed earlier. The
dilated cardiomyopathy pathway involves genes that, when altered,
cause defects residing within the cytoskeleton or sarcomere,
within the mitochondria (causing deficient energy generation), or in calcium cycling (resulting in inefficient force
activation and insulin secretion), which are processes related
to CVD and T2D.39 Genes involved in the ARVC pathway include RYR2, which is involved in calcium and insulin
activities as discussed above, processes important for both
cardiac function and insulin activities. A common mechanism
among the HCM, dilated cardiomyopathy, and
ARVC pathways seems to be calcium homeostasis and sensitivity. The cell adhesion molecules are glycoproteins expressed
on the cell surface and play an important role in a wide range
of biological processes that include homeostasis, immune
response, and inflammation. Soluble intercellular adhesion
molecules and vascular cell adhesion molecules have been
associated with the development of coronary heart disease
in the Health Professional Follow-up Study.40 Higher levels
of intercellular adhesion molecule-1 were also consistently
associated with increased T2D risk in the WHI-Observational
Study.3 Therefore, these pathways appear to link to CVD and
T2D via diverse mechanisms. The fact that these pathways
were consistently identified across multiple ethnicities in our
study highlights their central role in the joint mechanisms
between CVD and T2D.
Importantly, the significant pathways were found to be
highly connected through a large number of shared genes
involved in extracellular matrix (collagens and laminins),
cytoskeleton (actins), cell adhesion (integrins), calcium channels, and adenylate cyclases. In our further investigation of
these genes using the KD network analysis approach, the top 10
KDs were found to be expressed in almost all tissues or cell
types involved in CVD and T2D including islet, liver, adipose,
muscle, and kidney in our mouse tissue-specific data (details
shown in Table VI in the Data Supplement). Perturbations of
these genes and pathways that are critical for cell integrity and
cellular communications in multiple tissues will likely affect
vascular functions and subsequently CVD and T2D.
In addition to these shared mechanisms, we also identified
disease- and ethnicity-specific pathways. For instance, several
cancer-related pathways including melanoma, bladder cancer,
and pathways in cancer were found to be specific for HA.
However, the robustness of these signals awaits further validation in independent, ethnicity-specific cohorts.
In the current study, only knowledge-driven biological
pathways from existing pathway databases were used for disease risk signal enrichment analysis and we did not include
data-driven networks from protein–protein interaction experiments or large-scale genomic data sets. Although data-driven
networks may represent a more unbiased source to uncover
previously unknown functional pathways, we focused on
knowledge-driven pathways for the following reasons. First,
these pathways represent a straightforward means to clearly
define functionally related gene sets. In contrast, implementing data-driven networks requires more sophisticated considerations such as how to handle species, tissue, and sex
specificity, how to clearly define gene sets of reasonable
size based on large networks, and how to deal with network
inconsistencies between data sets. Second, the results from
canonical pathways are easily interpretable as they are largely
derived from experimentally tested biochemical reactions,
signaling cascades, and functional categories. In contrast,
interpreting the results from the data-driven networks requires
extra steps of annotation. Some of the gene subnetworks cannot be easily annotated using existing knowledge, further complicating the result interpretation. Third, limiting the analysis to
canonical pathways reduces the number of gene sets tested
and thus helps reduce statistical penalty from multiple testing. Nonetheless, we acknowledge the power of data-driven
approaches to detect novel insights, as demonstrated by our
recent comprehensive investigation of coronary artery disease
where a sophisticated analytic pipeline was used to include
both knowledge-driven and data-driven approaches.34 We will
further pursue additional novel biological insights in the WHI
cohorts using data-driven approaches in the future.
As hundreds of genes are likely involved in the core pathways identified, it is important to prioritize KD genes,
which, when perturbed, should have major impact on the
pathways and hence the eventual vascular outcome of interest. Indeed, multiple KD genes, such as COL4A1, CD93,
MMP2, and SPARC, were found in our network analysis
reflecting gene–gene regulatory relations or protein–protein
interactions. Interestingly, the KD genes were not identified
in standard GWAS analysis of single SNPs, suggesting that
important regulatory genes may not harbor common susceptibility polymorphisms because of evolutionary constraints.41
Our further investigations of the top 10 KD genes yielded convincing evidence in support of the notion that perturbations of
these KD genes in multiple mouse studies affect both CVD
and T2D phenotypes. Their actions seem to be important in
multiple tissues and consistent across multiple mouse studies of diabetes mellitus and CVD. In addition, these genes or
their proteins have also been associated with obesity,30 diabetes mellitus,42 CAD,43 and CVD44 traits in the literature. Of
note, all of the top 10 KDs either encode extracellular matrix
proteins or are involved in cell–matrix interactions, which
places extracellular matrix at the central intersection of CVD
and T2D pathogenesis.
Observations by us and others that important regulators are rarely GWAS hits and that GWAS candidate genes are
mostly peripheral nodes in gene networks support the view that GWAS
SNPs may serve as subtle modifiers of disease predisposition
over a lifetime. Such subtle effects may help explain (1)
their low selection pressure and hence commonality in the
general population and (2) why each GWAS locus explains only
a small fraction of the genetic heritability of complex diseases.
These lines of available evidence suggest that the GWAS candidates may not serve as the best candidates for therapeutic
interventions, although we acknowledge their importance in
informing the biological pathways and processes involved
in complex diseases and the possibility that rare mutations
with strong effects in these genes may exist. However, KD or
regulatory genes, although they lack genetic variation that can be
detected through GWAS, have hub properties in the networks
and may behave like master switches that exert strong effects
on disease networks; they may therefore be better candidates
from a therapeutic perspective.
Our study has several unique features. First, the comparison across multiple ethnicities allowed detection of both
robust, shared mechanisms across populations, and potential
ethnicity-specific signals. Second, apart from studying CVD
and T2D separately, we also treated CVD and T2D together
as a combined end point to increase the sensitivity to capture
shared risk and pathology. Indeed, we identified genome-wide
significant loci as well as pathways to be significant only for
the combined end point. Finally, using a systems biology
framework that integrates GWAS, pathways, gene expression,
networks, and phenotypic information from both human and
mouse populations, we were able to derive novel mechanistic
insights and identify potential therapeutic targets.
Through a multiethnic GWAS augmented with comprehensive screening of ≈1500 curated biological pathways to capture
processes that are genetically related to CAD and T2D, we
identified many of the previously implicated biological pathways as well as novel genetic loci and pathways. We regard
the identification of many suspected signals as encouraging and
confirmatory. Importantly, our analyses imply causal involvement of the identified pathways or processes using genetics as
the anchor, which represents an important step forward. Such
causal inference is generally not possible with classic epidemiological studies of biomarkers, which may reflect consequences
of disease rather than causal mechanisms underlying diseases.
Furthermore, to our knowledge, this is the first time that these
processes are found to be genetically linked to both CVD and
T2D in multiple ethnicities, which points to shared mechanisms
of vascular diseases that can be targeted for future therapeutic
interventions. Moreover, we further explored the gene–gene
interactions as well as the potential regulatory mechanisms to
better understand the relationships between genes within and
between pathways. The network topology revealed and the
potential novel regulators identified provide deeper insights into
the close connections, coordinated actions, and regulation of the
pathways. The novel regulators identified may serve as more
effective drug targets because of their central role in the regulatory networks. We believe that the progress made through this
study is important not only for improving our understanding
of the causal disease mechanisms but also for the future development of more effective therapies.
In conclusion, our integrative analysis of American women
of 3 ethnicities identified multiple shared biological pathways and key regulatory genes for the development of CVD
and T2D. These prospective findings also support the notion
that ethnicity-specific susceptibility genes and processes are
involved in the complex pathogenesis of CVD and T2D.
Acknowledgments
We thank Yi-Hsiang Hsu, ScD, Yiqing Song, MD, ScD, and Andrea
Hevener, PhD, for reviewing the draft of the article; Ville-Petteri
Mäkinen, DSc, for his help with the eSNP methodology, particularly
for generating the SNP set enrichment analysis pipeline;
and Calvin Pan for his help with putting together the mouse in silico
validation data sets. We thank the Women’s Health Initiative (WHI)
investigators and staff for their dedication and the study participants
for making the program possible. A listing of WHI investigators can
be found at http://www.whiscience.org/publications/WHI_investigators_shortlist_2010-2015.pdf.
Sources of Funding
The Women’s Health Initiative (WHI) program is funded by the
National Heart, Lung, and Blood Institute; National Institutes of Health;
and the United States Department of Health and Human Services
through contracts HHSN268201100046C, HHSN268201100001C,
HHSN268201100002C, HHSN268201100003C, HHSN268201100004C,
and HHSN271201100004C. Dr Yang is funded by the American Heart
Association and the Leducq Foundation.
Disclosures
None.
References
1. Stern MP. Do non-insulin-dependent diabetes mellitus and cardiovascular disease share common antecedents? Ann Intern Med. 1996;124(1 Pt
2):110–116.
2. Liu S, Tinker L, Song Y, Rifai N, Bonds DE, Cook NR, et al. A prospective
study of inflammatory cytokines and diabetes mellitus in a multiethnic cohort of postmenopausal women. Arch Intern Med. 2007;167:1676–1685.
3. Song Y, Manson JE, Tinker L, Rifai N, Cook NR, Hu FB, et al. Circulating
levels of endothelial adhesion molecules and risk of diabetes in an ethnically diverse cohort of women. Diabetes. 2007;56:1898–1904.
4. Manolio TA. Cohort studies and the genetics of complex disease. Nat
Genet. 2009;41:5–6.
5. Hindorff LA, Sethupathy P, Junkins HA, Ramos EM, Mehta JP, Collins
FS, et al. Potential etiologic and functional implications of genome-wide
association loci for human diseases and traits. Proc Natl Acad Sci USA.
2009;106:9362–9367.
6. Mosca L, Barrett-Connor E, Wenger NK. Sex/gender differences in cardiovascular disease prevention: what a difference a decade makes. Circulation. 2011;124:2145–2154.
7. Ding EL, Song Y, Malik VS, Liu S. Sex differences of endogenous sex
hormones and risk of type 2 diabetes: a systematic review and meta-analysis.
JAMA. 2006;295:1288–1299.
8. Rasmussen-Torvik LJ, Guo X, Bowden DW, Bertoni AG, Sale MM, Yao
J, et al. Fasting glucose GWAS candidate region analysis across ethnic
groups in the Multiethnic Study of Atherosclerosis (MESA). Genet Epidemiol. 2012;36:384–391.
9. Akula N, Baranova A, Seto D, Solka J, Nalls MA, Singleton A, et al; Bipolar Disorder Genome Study (BiGS) Consortium; Wellcome Trust Case-Control Consortium. A network-based approach to prioritize results from
genome-wide association studies. PLoS One. 2011;6:e24220.
10. Bakir-Gungor B, Sezerman OU. A new methodology to associate SNPs
with human diseases according to their pathway related context. PLoS
One. 2011;6:e26277.
11. Nam D, Kim J, Kim SY, Kim S. GSA-SNP: a general approach for gene
set analysis of polymorphisms. Nucleic Acids Res. 2010;38(Web Server
issue):W749–W754.
12. Segrè AV, Groop L, Mootha VK, Daly MJ, Altshuler D; DIAGRAM Consortium; MAGIC Investigators. Common inherited variation in mitochondrial genes is not enriched for associations with type 2 diabetes or related
glycemic traits. PLoS Genet. 2010;6:e1001058.
13. Wang K, Li M, Hakonarson H. Analysing biological pathways in genome-wide association studies. Nat Rev Genet. 2010;11:843–854.
14. Zhong H, Yang X, Kaplan LM, Molony C, Schadt EE. Integrating pathway analysis and genetics of gene expression for genome-wide association studies. Am J Hum Genet. 2010;86:581–591.
15. Bakir-Gungor B, Egemen E, Sezerman OU. PANOGA: a web server
for identification of SNP-targeted pathways from genome-wide association study data. Bioinformatics. 2014;30:1287–1289.
16. Benjamini Y, Hochberg Y. Controlling the false discovery rate: a practical
and powerful approach to multiple testing. J R Stat Soc Series B Methodol.
1995;57:289–300.
17. Peri S, Navarro JD, Amanchy R, Kristiansen TZ, Jonnalagadda CK,
Surendranath V, et al. Development of human protein reference database
as an initial platform for approaching systems biology in humans. Genome
Res. 2003;13:2363–2371.
18. Wang IM, Zhang B, Yang X, Zhu J, Stepaniants S, Zhang C, et al. Systems
analysis of eleven rodent disease models reveals an inflammatome signature and key drivers. Mol Syst Biol. 2012;8:594.
19. Yang X, Zhang B, Molony C, Chudin E, Hao K, Zhu J, et al. Systematic
genetic and genomic analysis of cytochrome P450 enzyme activities in
human liver. Genome Res. 2010;20:1020–1036.
20. Yang X, Peterson L, Thieringer R, Deignan JL, Wang X, Zhu J, et al. Identification and validation of genes affecting aortic lesions in mice. J Clin
Invest. 2010;120:2414–2422.
21. Maddatu TP, Grubb SC, Bult CJ, Bogue MA. Mouse Phenome
Database (MPD). Nucleic Acids Res. 2012;40(Database issue):
D887–D894.
22. Parks BW, Nam E, Org E, Kostem E, Norheim F, Hui ST, et al. Genetic
control of obesity and gut microbiota composition in response to high-fat,
high-sucrose diet in mice. Cell Metab. 2013;17:141–152.
23. Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, et al.
Cytoscape: a software environment for integrated models of biomolecular
interaction networks. Genome Res. 2003;13:2498–2504.
24. de las Fuentes L, Yang W, Dávila-Román VG, Gu C. Pathway-based genome-wide association analysis of coronary heart disease identifies biologically important gene sets. Eur J Hum Genet. 2012;20:1168–1173.
25. Yang X. Use of functional genomics to identify candidate genes underlying human genetic association studies of vascular diseases. Arterioscler
Thromb Vasc Biol. 2012;32:216–222.
26. Gottesman O, Drill E, Lotay V, Bottinger E, Peter I. Can genetic pleiotropy replicate common clinical constellations of cardiovascular disease
and risk? PLoS One. 2012;7:e46419.
27. Lee I, Blom UM, Wang PI, Shim JE, Marcotte EM. Prioritizing candidate
disease genes by network-based boosting of genome-wide association
data. Genome Res. 2011;21:1109–1121.
28. Perry JR, McCarthy MI, Hattersley AT, Zeggini E, Weedon MN, Frayling TM; Wellcome Trust Case Control Consortium. Interrogating type 2
diabetes genome-wide association data using a biological pathway-based
approach. Diabetes. 2009;58:1463–1467.
29. Lettre G, Palmer CD, Young T, Ejebe KG, Allayee H, Benjamin EJ, et al.
Genome-wide association study of coronary heart disease and its risk factors in 8,090 African Americans: the NHLBI CARe Project. PLoS Genet.
2011;7:e1001300.
30. Comuzzie AG, Cole SA, Laston SL, Voruganti VS, Haack K, Gibbs RA,
et al. Novel genetic loci identified for the pathophysiology of childhood
obesity in the Hispanic population. PLoS One. 2012;7:e51954.
31. Suhre K, Wallaschofski H, Raffler J, Friedrich N, Haring R, Michael K,
et al. A genome-wide association study of metabolic traits in human urine.
Nat Genet. 2011;43:565–569.
32. Ma B, Lawson AB, Liese AD, Bell RA, Mayer-Davis EJ. Dairy,
magnesium, and calcium intake in relation to insulin sensitivity: approaches to modeling a dose-dependent association. Am J Epidemiol.
2006;164:449–458.
33. Dixit SS, Wang T, Manzano EJ, Yoo S, Lee J, Chiang DY, et al. Effects
of CaMKII-mediated phosphorylation of ryanodine receptor type 2 on islet calcium handling, insulin secretion, and glucose tolerance. PLoS One.
2013;8:e58655.
34. Mäkinen VP, Civelek M, Meng Q, Zhang B, Zhu J, Levian C, et al;
Coronary ARtery DIsease Genome-Wide Replication And Meta-Analysis (CARDIoGRAM) Consortium. Integrative genomics reveals novel
molecular pathways and gene networks for coronary artery disease. PLoS
Genet. 2014;10:e1004502.
35. Tang H, Wei P, Duell EJ, Risch HA, Olson SH, Bueno-de-Mesquita HB,
et al. Axonal guidance signaling pathway interacting with smoking in
modifying the risk of pancreatic cancer: a gene- and pathway-based interaction analysis of GWAS data. Carcinogenesis. 2014;35:1039–1045.
36. Yang YH, Manning Fox JE, Zhang KL, MacDonald PE, Johnson JD. Intraislet SLIT-ROBO signaling is required for beta-cell survival and potentiates insulin secretion. Proc Natl Acad Sci USA. 2013;110:16480–16485.
37. Bongo JB, Peng DQ. The neuroimmune guidance cue netrin-1: a new
therapeutic target in cardiovascular disease. J Cardiol. 2014;63:95–98.
38. Sorajja P, Elliott PM, McKenna WJ. The molecular genetics of hypertrophic cardiomyopathy: prognostic implications. Europace. 2000;2:4–14.
39. Osterziel KJ, Perrot A. Dilated cardiomyopathy: more genes means more
phenotypes. Eur Heart J. 2005;26:751–754.
40. Shai I, Pischon T, Hu FB, Ascherio A, Rifai N, Rimm EB. Soluble intercellular adhesion molecules, soluble vascular cell adhesion molecules, and risk of coronary heart disease. Obesity (Silver Spring).
2006;14:2099–2106.
41. Goh KI, Cusick ME, Valle D, Childs B, Vidal M, Barabási AL. The human
disease network. Proc Natl Acad Sci USA. 2007;104:8685–8690.
42. Kos K, Wilding JP. SPARC: a key player in the pathologies associated
with obesity and diabetes. Nat Rev Endocrinol. 2010;6:225–235.
43. Schunkert H, König IR, Kathiresan S, Reilly MP, Assimes TL, Holm H,
et al; Cardiogenics; CARDIoGRAM Consortium. Large-scale association
analysis identifies 13 new susceptibility loci for coronary artery disease.
Nat Genet. 2011;43:333–338.
44. Yamada Y, Kato K, Oguri M, Fujimaki T, Yokoi K, Matsuo H,
et al. Genetic risk for myocardial infarction determined by polymorphisms of candidate genes in a Japanese population. J Med Genet.
2008;45:216–221.
CLINICAL PERSPECTIVE
Cardiovascular disease (CVD) and type 2 diabetes mellitus (T2D) are highly heritable, share many common risk factors,
and demonstrate ethnic-specific prevalence, yet a comprehensive molecular-level understanding of these observations is
currently lacking. In this study, we seek to explore 3 clinically relevant questions: (1) whether there are additional genetic
risks on top of the 60 identified genetic loci for CVD and T2D that may explain the pathophysiological link between CVD
and T2D; (2) whether there are any ethnicity-specific genetic mechanisms for the 2 diseases; and (3) to what extent molecular mechanisms are shared across ethnicities for CVD and T2D. Using integrative pathway and network approaches, we
conducted genome-wide association studies for both CVD and T2D in 3 ethnic populations, blacks, Caucasian Americans,
and Hispanic Americans, in the national Women’s Health Initiative. We identified 8 pathways and gene networks related
to cardiomyopathy, calcium signaling, axon guidance, cell adhesion, and extracellular matrix that seemed to be commonly
shared between CVD and T2D across all 3 ethnic groups. Potential key drivers of these shared pathways, such as COL1A1,
COL3A1, and ELN, were also unraveled and cross-validated. We also identified ethnicity-specific pathways such as cell cycle
(specific for Hispanic Americans and Caucasian Americans) and tight junction (specific for Hispanic Americans). These
findings not only suggest the existence of major mechanistic pathways and key regulatory genes underlying the development
of both CVD and T2D but also support the notion that ethnicity-specific mechanisms play a role in the complex pathogenesis
of CVD and T2D.
Shared Molecular Pathways and Gene Networks for Cardiovascular Disease and Type 2
Diabetes Mellitus in Women Across Diverse Ethnicities
Kei Hang K. Chan, Yen-Tsung Huang, Qingying Meng, Chunyuan Wu, Alexander Reiner, Eric
M. Sobel, Lesley Tinker, Aldons J. Lusis, Xia Yang and Simin Liu
Circ Cardiovasc Genet. 2014;7:911-919; originally published online November 4, 2014;
doi: 10.1161/CIRCGENETICS.114.000676
Circulation: Cardiovascular Genetics is published by the American Heart Association, 7272 Greenville Avenue,
Dallas, TX 75231
Copyright © 2014 American Heart Association, Inc. All rights reserved.
Print ISSN: 1942-325X. Online ISSN: 1942-3268
The online version of this article, along with updated information and services, is located on the
World Wide Web at:
http://circgenetics.ahajournals.org/content/7/6/911
Data Supplement (unedited) at:
http://circgenetics.ahajournals.org/content/suppl/2014/11/04/CIRCGENETICS.114.000676.DC1
SUPPLEMENTAL MATERIAL
Supplemental Methods
Study participants
All women enrolled in the Women’s Health Initiative (WHI) were from 40 clinical centers in 24
states and the District of Columbia; enrollment began in 1993 and ended in 1998.1 All participants
were between 50 and 79 years old, postmenopausal at the time of enrollment, and expected to remain in
the area for at least 3 years. Enrollment of ethnic or racial minority groups proportionate to the total
minority population of women between 50 and 79 years of age was a high priority of the WHI. At the end
of the recruitment period, 161,808 women had joined the WHI, and about 17% represented ethnic or
racial minority groups. Clinical information was collected by self-report and physical examination.
12,151 self-identified AA and 5,469 self-identified HA WHI participants had consented to genetic
research and were eligible for WHI-SHARe. Due to budget constraints, a subsample of 12,157 of these
women (i.e. 8,515 AA and 3,642 HA women) were randomly selected for genetic study. DNA was
extracted by the Specimen Processing Laboratory at the Fred Hutchinson Cancer Research Center from
specimens that were collected at the time of enrollment.
The WHI-GARNET participants were women who enrolled in the WHI Hormone Therapy (HT)
trial, met eligibility requirements for this study and eligibility for submission to dbGaP, and provided
DNA samples. Of the approximately 27,000 women who participated in the HT trial, incident diabetes
cases, incident coronary heart disease cases and matched controls free of prevalent or incident diabetes
and/or coronary heart disease, stroke, and venous thrombosis were included. The matching criteria were
age (± 5 years), race/ethnicity, hysterectomy status, enrollment date (± 1.5 years), and length of follow-up
(±48 months). Controls were also prioritized on the basis of plasma biomarker availability
(glucose, high density lipoprotein cholesterol, low density lipoprotein cholesterol, total cholesterol,
insulin, triglycerides, C-reactive protein, and fibrinogen). All incident cases of type 2 diabetes that had
been self-reported during follow-up in the WHI HT trial and were included in the August 14, 2009,
database were selected as potential cases, using a cutoff date of July 7, 2002, for the estrogen plus
progestin trial and February 29, 2004, for the estrogen-alone trial because of the possibility of more harm
than benefit in the HT trials.
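The matching criteria listed above can be expressed as a simple eligibility check. The following is a minimal sketch only, not the WHI-GARNET selection code: the `Participant` record and the function name are hypothetical, and the biomarker-based prioritization of controls is not modeled.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Participant:
    """Hypothetical record holding only the fields used for matching."""
    age: float               # age at enrollment, years
    ethnicity: str           # race/ethnicity label
    hysterectomy: bool       # hysterectomy status
    enrolled: date           # enrollment date
    followup_months: float   # length of follow-up, months


def is_eligible_control(case: Participant, ctrl: Participant) -> bool:
    """True when the control meets every matching tolerance stated in the text."""
    enroll_gap_years = abs((case.enrolled - ctrl.enrolled).days) / 365.25
    return (
        abs(case.age - ctrl.age) <= 5                  # age within 5 years
        and case.ethnicity == ctrl.ethnicity           # same race/ethnicity
        and case.hysterectomy == ctrl.hysterectomy     # same hysterectomy status
        and enroll_gap_years <= 1.5                    # enrollment within 1.5 years
        and abs(case.followup_months - ctrl.followup_months) <= 48
    )
```

In practice, a case would be compared against the pool of disease-free women and the eligible candidates would then be ranked, for example by biomarker availability as described above.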
All participants provided written informed consent as approved by local human subjects
committees. A total of 1765 African American, 689 Hispanic American, and 1525 Caucasian American
women had combined T2D + CVD.
Genotyping and Quality Control (QC)
Genome-wide genotyping of the WHI-SHARe participants was performed on the Affymetrix 6.0
array (Affymetrix, Inc., Santa Clara, CA) with 2 µg of DNA at a concentration of 100 ng/µl. We excluded
samples on the basis of genotyping failure and quality control (n = 149), relatedness (n = 56), discordance
between self-identified race and genetic ancestry (n = 56), and missing phenotypic information (n = 40).
In case of related individuals, the relative with the highest call rate was retained, while other family
members were excluded. After applying all these exclusions, 8,155 AA and 3,494 HA women were
included in our analysis.
Genotyping for WHI-GARNET participants was performed at the Broad Institute (Cambridge,
MA) by using the Illumina HumanOmni1-Quad SNP platform (Illumina
Inc., San Diego, CA). Sample- and SNP-level processing quality control were performed using a
standardized protocol2 at the GARNET Data Coordinating Center at the University of Washington. After
applying exclusions for insufficient DNA volume (<2 µg), unsuccessful genotyping, lack of IRB
approval for data submission into dbGaP, and sample identity issues, 3,697 CA women (1,022 T2D cases,
545 CVD cases, and 2,130 controls) were included in this study.
GWAS analysis
We first ran GWAS on CVD, T2D, and combined CVD and T2D using Cox models adjusted
for age, region, and the first four principal components (PCs) for global ancestry (based on relations with
disease outcomes) separately for WHI-SHARe AA and HA participants. For WHI-GARNET CA women,
GWAS was conducted on CVD and T2D using logistic regression adjusted for the first three PCs for
global ancestry and matching factors (age, baseline case status [CVD, stroke, and/or venous
thromboembolism], hormone therapy use [placebo vs. active], and hysterectomy status). For the
combined phenotype CVD + T2D, we meta-analyzed the GWAS results within WHI-GARNET for CVD
and T2D using METAL3.
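To illustrate the kind of per-SNP calculation a meta-analysis performs, the sketch below combines two effect estimates with fixed-effect inverse-variance weighting. This is an assumption made for illustration: the text does not state which METAL weighting scheme was used, the function name is hypothetical, and the sketch treats the inputs as independent.

```python
import math
from statistics import NormalDist


def inverse_variance_meta(betas, ses):
    """Fixed-effect inverse-variance meta-analysis for one SNP.

    betas: per-study effect estimates (e.g. log odds ratios)
    ses:   matching standard errors
    Returns (pooled beta, pooled SE, Z score, two-sided p-value).
    """
    weights = [1.0 / se ** 2 for se in ses]        # weight = 1 / variance
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    z = beta / se
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))     # two-sided normal p-value
    return beta, se, z, p
```

Two studies with identical estimates and standard errors yield the same pooled estimate with a standard error reduced by a factor of the square root of 2.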
Pathway-based and network-based analysis
We tested whether any of the known biological pathways were enriched for genetic signals of
CVD, T2D, or combined CVD + T2D using five methodologies: 1) MAGENTA4, 2) GSA-SNP5, 3)
NIMMI6, 4) PANOGA7, 8, and 5) eSNP9, 10. We used two databases of biological pathways: the Kyoto
Encyclopedia of Genes and Genomes (KEGG) and Reactome. Briefly, using MAGENTA, we used the
most significant SNP of each gene to represent the gene association p-value. After adjustment for
confounders of the gene association scores, we used a non-parametric statistical test to assess whether the
best SNPs of all genes in a gene set were more enriched for disease risk SNPs than would have been expected by
chance, compared with randomly sampled gene sets of identical size from the genome4. Using GSA-SNP,
we adopted a P-value based gene set approach using the kth best SNP (k=2 in the current study)5. NIMMI
built biological networks weighted by connectivity, which is estimated by a modified Google PageRank
algorithm, by combining GWAS results with human protein-protein interaction (PPI) data. These weights
were then combined with genetic association p-values from GWAS using the Liptak-Stouffer method to
produce trait-prioritized sub-networks, which are in turn annotated with KEGG and Reactome pathways6.
PANOGA combined genetic association from GWAS, current knowledge of biochemical pathways, PPI
networks, and functional and genetic information of selected SNPs to integrate functional properties of
SNPs (such as protein coding, splicing and transcriptional regulation) identified from GWAS for
prioritizing trait-related pathways7. In the last approach, i.e. “eSNP”, pathway genes were assigned to
SNPs based on the expression quantitative trait loci (eQTLs) identified from multiple human tissues. We
then ran SNP set enrichment analysis (SSEA)9, 10 using the Kolmogorov-Smirnov test and Fisher exact test to
calculate the enrichment score of a gene set. The top 30 pathways ranked by the enrichment score were
selected and FDR was used to correct for multiple testing.
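Three of the building blocks named above (a kth best SNP gene score, a Fisher exact enrichment test, and FDR correction) can be sketched in a few lines. This is an illustrative sketch and not the code of MAGENTA, GSA-SNP, or SSEA: the function names are hypothetical, the fallback for genes with fewer than k SNPs is an assumption, and the Benjamini-Hochberg procedure is assumed as the FDR method.

```python
from math import comb  # requires Python 3.8+


def kth_best_p(snp_pvalues, k=2):
    """Gene score: the k-th smallest SNP p-value.

    Assumption: genes with fewer than k SNPs use their largest p-value.
    """
    ranked = sorted(snp_pvalues)
    return ranked[min(k, len(ranked)) - 1]


def fisher_enrichment_p(hits_in_set, set_size, hits_total, universe):
    """One-sided Fisher exact p-value, P(X >= hits_in_set), hypergeometric null."""
    upper = min(set_size, hits_total)
    tail = sum(
        comb(hits_total, x) * comb(universe - hits_total, set_size - x)
        for x in range(hits_in_set, upper + 1)
    )
    return tail / comb(universe, set_size)


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):          # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

A pathway would then be called significant when its adjusted p-value falls below the chosen FDR threshold.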
We then compared the top pathways obtained from each of the five methods and summarized the
overall top pathways for combined CVD + T2D, CVD, and T2D across the three ethnic groups in the
WHI cohort.
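One simple way to summarize top pathways across methods is to count how many methods report each pathway. The sketch below is an assumption about how such a summary could be made, not the rule used in this study: the function name, the `min_methods` threshold, and the placeholder pathway labels are all hypothetical.

```python
from collections import Counter


def consensus_pathways(top_by_method, min_methods=2):
    """Pathways reported by at least `min_methods` methods.

    top_by_method maps a method name to its list of top pathways.
    Results are ordered by decreasing method count, then by name.
    """
    counts = Counter(
        pathway
        for pathways in top_by_method.values()
        for pathway in set(pathways)      # count each method at most once
    )
    kept = [p for p, c in counts.items() if c >= min_methods]
    return sorted(kept, key=lambda p: (-counts[p], p))
```

For example, with placeholder labels, `consensus_pathways({"m1": ["pathway_A", "pathway_B"], "m2": ["pathway_A"]})` returns `["pathway_A"]`.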
Investigation of the relationship between top enriched pathways
We used Cytoscape11 to plot the network formed by all the genes along each of the eight top
pathways enriched for CVD, T2D, and combined CVD + T2D among AA, HA, and CA WHI women.
The diamond nodes represent pathways and the round nodes denote genes, and each edge shows the
“interaction”, i.e. the association between a gene and a pathway. The sizes of the pathway (diamond)
nodes are inversely proportional to their corresponding p-value representing the enrichment with CVD,
T2D, and combined CVD + T2D among AA, HA, and CA WHI women.
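The pathway-gene network described above is bipartite, with one edge per gene-pathway membership. The sketch below shows one way such an edge list could be prepared for import into Cytoscape as tab-delimited simple interaction format (SIF) lines. The function names, the `member` relation label, and the `scale` parameter are hypothetical and are not taken from the study.

```python
def to_sif_lines(pathways, relation="member"):
    """One 'pathway<TAB>relation<TAB>gene' line per gene-pathway membership."""
    return [
        f"{pathway}\t{relation}\t{gene}"
        for pathway in sorted(pathways)
        for gene in sorted(pathways[pathway])
    ]


def genes_shared_across_pathways(pathways):
    """Genes belonging to more than one pathway, with the pathways they link."""
    membership = {}
    for pathway, genes in pathways.items():
        for gene in genes:
            membership.setdefault(gene, set()).add(pathway)
    return {g: sorted(p) for g, p in membership.items() if len(p) > 1}


def node_size(p_value, scale=1.0):
    """Pathway node size inversely proportional to the enrichment p-value."""
    if p_value <= 0:
        raise ValueError("p_value must be positive")
    return scale / p_value
```

Genes returned by `genes_shared_across_pathways` are the round nodes that connect two or more diamond pathway nodes in the plot.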
In silico validation of the top key drivers
We selected the top ten key drivers with highest NRS and explored their connection with CVD
and T2D phenotypes by intersecting with multiple databases: 1) candidate causal genes identified through
the integration of DNA variation, gene transcription and phenotypic data in segregating mouse
populations by applying a likelihood-based causality model selection method12-14; 2) the public mouse
phenome database that curates genetically modified mouse models and the disease phenotypes affected in
each model15; and 3) the hybrid mouse diversity panel (HMDP), which examined >100 inbred parental strains and
recombinant inbred strains to capture genetic loci and genes associated with complex cardiometabolic
traits in mice16. The complex traits studied in HMDP include over 70 clinical traits such as diet-induced
obesity, heart failure, atherosclerosis, lipoprotein metabolism, vascular injury, and diabetic complications.
Gene expression microarrays were used to quantify mRNA levels in liver, bone, adipose, brain, peritoneal
macrophages, aorta, and heart16.
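The intersection step described above amounts to checking each key driver against several gene lists. A minimal sketch follows, assuming each resource has already been reduced to a set of gene symbols; the function name, source labels, and gene symbols in the example are placeholders, not study data.

```python
def evidence_table(key_drivers, sources):
    """Map each key driver to the evidence sources that contain it.

    key_drivers: ordered list of gene symbols
    sources:     dict of source name -> set of gene symbols
    """
    return {
        gene: [name for name, genes in sources.items() if gene in genes]
        for gene in key_drivers
    }
```

For example, `evidence_table(["GENE_A"], {"causal": {"GENE_A"}, "hmdp": set()})` returns `{"GENE_A": ["causal"]}`; a key driver with an empty list has no support in any of the supplied sources.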
Supplemental Tables
Supplemental Table 1a. Baseline characteristics of participants in WHI-SHARe and WHI-GARNET stratified by ethnicity and CVD
status.

| CVD | AA in WHI-SHARe: Cases | AA in WHI-SHARe: Controls | HA in WHI-SHARe: Cases | HA in WHI-SHARe: Controls | CA in WHI-GARNET: Cases | CA in WHI-GARNET: Controls | P-value* (cases) | P-value* (controls) |
|---|---|---|---|---|---|---|---|---|
| N | 483 | 5,880 | 131 | 2,893 | 545 | 2,130 | | |
| Age (mean±SD) | 64.2±7.26 | 60.8±6.80 | 62.2±7.19 | 59.8±6.55 | 67.1±6.81 | 65.6±6.87 | <0.001 | <0.001 |
| BMI (mean±SD) | 31.7±6.42 | 30.8±6.29 | 29.1±5.62 | 28.7±5.45 | 29.4±5.79 | 28.5±5.89 | <0.001 | <0.001 |
| Current smoking (%) | 13.9 | 11.4 | 9.16 | 6.67 | 17.4 | 10.4 | 0.04 | <0.001 |
| Current alcohol drinking (%) | 52.0 | 57.0 | 57.7 | 68.2 | 67.7 | 76.6 | <0.001 | <0.001 |
| Current hormone user (%) | 19.8 | 26.7 | 29.8 | 36.3 | 6.06 | 8.78 | <0.001 | <0.001 |
| Physical Activity: total METS / week (mean±SD) | 8.38±11.5 | 10.1±13.2 | 7.66±9.85 | 11.0±14.0 | 9.11±10.8 | 11.2±13.1 | 0.34 | 0.001 |
| Family history of T2D (%) | 53.7 | 49.8 | 49.6 | 45.0 | 35.3 | 30.3 | <0.001 | <0.001 |

* P-value is calculated using Analysis of Variance (ANOVA) test for continuous variables (i.e. age, BMI, physical activity) and chi-square test for
categorical variables (i.e. current smoking, current alcohol drinking, current hormone user, and family history of T2D).
Supplemental Table 1b. Baseline characteristics of participants in WHI-SHARe and WHI-GARNET stratified by ethnicity and T2D
status.

| T2D | AA in WHI-SHARe: Cases | AA in WHI-SHARe: Controls | HA in WHI-SHARe: Cases | HA in WHI-SHARe: Controls | CA in WHI-GARNET: Cases | CA in WHI-GARNET: Controls | P-value* (cases) | P-value* (controls) |
|---|---|---|---|---|---|---|---|---|
| N | 1,381 | 5,739 | 581 | 2,681 | 1,022 | 2,130 | | |
| Age (mean±SD) | 61.0±6.50 | 61.5±7.13 | 59.8±6.42 | 60.3±6.72 | 64.0±6.90 | 65.6±6.87 | <0.001 | <0.001 |
| BMI (mean±SD) | 32.8±6.29 | 30.2±6.18 | 31.1±5.51 | 28.1±5.33 | 32.5±6.23 | 28.5±5.89 | <0.001 | <0.001 |
| Current smoking (%) | 11.7 | 11.9 | 8.80 | 6.62 | 9.94 | 10.4 | 0.12 | <0.001 |
| Current alcohol drinking (%) | 54.3 | 58.6 | 59.7 | 70.6 | 67.2 | 76.6 | <0.001 | <0.001 |
| Current hormone user (%) | 24.5 | 27.0 | 30.1 | 37.6 | 8.62 | 8.78 | <0.001 | <0.001 |
| Physical Activity: total METS / week (mean±SD) | 8.60±11.4 | 10.3±13.2 | 8.35±11.9 | 11.4±14.3 | 8.69±11.7 | 11.2±13.1 | 0.86 | <0.001 |
| Family history of T2D (%) | 61.0 | 44.9 | 55.1 | 40.2 | 51.9 | 30.3 | <0.001 | <0.001 |

* P-value is calculated using Analysis of Variance (ANOVA) test for continuous variables (i.e. age, BMI, physical activity) and chi-square test for
categorical variables (i.e. current smoking, current alcohol drinking, current hormone user, and family history of T2D).
9 Supplemental Table 2. Key differences of the five methodologies (MAGENTA, GSA-SNP, NIMMI, PANOGA, and eSNP).
Methodology
Main Feature
Advantage
Limitation
Account for confounders on the
association scores of genes, which
A pathway-based method that
Potential pathways or signals from
includes various gene properties
applies a modified GSEA
MAGENTA4
transcriptional regulatory elements that lie in
consisting of gene size, number of SNPs
approach to GWAS results and
distant region of a gene may have been missed
per kilobase (kb), number of
allows adjustment for confounders
because it only considers variants within a
recombination hotspots per kb, linkage
on genes association scores.
given distance around each gene.
disequilibrium units per kb, and genetic
distance.
A pathway analysis that adopts a
Provide a fast, secure, and easy-to-use
kth best P-value based analytical
computation by using a stand-alone
approach to summarize the
JAVA platform.
association of SNPs in a gene to
It chooses the kth best (k is determined
remove randomly associated
by user) p-value to combine the
signals.
information of each gene.
GSA-SNP5: In the process of combining randomized scores of different gene sets to re-standardize the gene set analysis, the generation of simulations of P-values may be time-consuming.
NIMMI6 | A network-based method combining GWAS data with human protein-protein interaction (PPI) data. | Map all the genes in a set of GWAS results (without screening out SNPs according to GWAS p-values) to human interactome data using a modified Google PageRank algorithm. | Due to the availability of biological databases and publication bias, less studied genes have less connectivity while better studied genes show greater connectivity. PPI data are tissue dependent and may be inconsistent.
PANOGA7, 8 | A pathway and network based GWAS tool that unites evidence from functional properties of SNPs, genetic association of a SNP with disease trait, and PPI network. | Integrate functional information of SNPs in addition to genetic association information from GWAS, PPI networks, and current knowledge of biochemical pathways. | The active network search algorithm by jActive Modules may output extensively overlapping genes for certain sub-networks. The current setting only outputs results from the highest scoring sub-network. Due to current knowledge of gene and protein databases, query results are biased towards well-studied genes.
eSNP9, 10 | This method integrates the full set of GWAS results with SNPs of functional implications and groups genes by disease relevance. | Map genes to corresponding functional SNPs; it uses eSNPs (SNPs associated with the expression levels of genes) from disease-relevant tissues. Facilitate the discovery of functional categories, biological pathways, or networks. | A data-driven functional eSNP dataset was used to map GWAS SNPs to pathway genes. In the eSNP dataset, the correlations between gene expression and polymorphism depend on the experimental design (the source of samples, tissue type, various methods for detecting gene expression and genotypes). Therefore, gene mapping may become difficult if very few or no associations between SNPs and gene expression have been established (i.e., no eSNPs were found).
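The connectivity bias noted for network-based tools such as NIMMI can be illustrated with a minimal sketch of plain PageRank on a toy undirected graph. This is not NIMMI's modified algorithm, which is not specified here; the function name `pagerank`, the damping value, and the node names are hypothetical placeholders, and the edges are not real protein-protein interactions.

```python
def pagerank(edges, damping=0.85, tol=1e-10, max_iter=200):
    """Plain PageRank by power iteration on an undirected graph.

    `edges` is a list of (node, node) pairs. Every node appears in at
    least one edge, so there are no dangling nodes to handle.
    """
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    nodes = sorted(neighbors)
    n = len(nodes)
    rank = {g: 1.0 / n for g in nodes}
    for _ in range(max_iter):
        new_rank = {}
        for g in nodes:
            # Each neighbor h passes on an equal share of its current rank.
            incoming = sum(rank[h] / len(neighbors[h]) for h in neighbors[g])
            new_rank[g] = (1 - damping) / n + damping * incoming
        delta = sum(abs(new_rank[g] - rank[g]) for g in nodes)
        rank = new_rank
        if delta < tol:
            break
    return rank


# Toy graph (illustrative only): "hub" has three partners, the others fewer.
toy_edges = [("hub", "a"), ("hub", "b"), ("hub", "c"), ("c", "d")]
scores = pagerank(toy_edges)
for gene, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{gene}\t{score:.3f}")
```

In this toy graph the best-connected node receives the highest score, which is the same mechanism by which well-studied genes with many recorded interactions can dominate a network-based ranking.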
Supplemental Table 3. Data sets used for Bayesian network analysis*.
Study | Tissue | Reference
C57BL/6J x A/J mouse cross | Adipose, brain, heart, kidney, liver, muscle | 17
C57BL/6J x C3H ApoE -/- mouse cross | Adipose, brain, liver, muscle | 18
C57BL/6J x C3H wildtype mouse cross | Adipose, liver, muscle | 19
C57BL/6J x BTBR Lep^ob mouse cross | Adipose, brain, islet cells, liver, muscle | 20
Bx129 cross | Adipose, liver, hypothalamus, muscle | 21
BxA cross | Adipose, liver, hypothalamus, muscle | 21
BxD cross | Adipose, liver, hypothalamus, muscle | 21
* Protein-protein interaction (PPI) data were obtained from the Human Protein Reference Database (HPRD; reference 22).
Supplemental Table 4. Previously identified T2D and CVD loci in NHGRI GWAS Catalog.
Disease | Genes | Region | RS ID | Reported P-value | Ethnicity | Observed P-value in WHI (AA) | Observed P-value in WHI (HA) | Observed P-value in WHI (CA) | Association with CVD+T2D in WHI (AA) | Association with CVD+T2D in WHI (HA) | Association with CVD+T2D in WHI (CA)
T2D | NOTCH2, ADAM30 | 1p12 | rs10923931 | 4 x 10^-8 | Caucasian | 0.03 | 0.62 | 0.35 | 0.04 | 0.73 | 0.92
CVD | PSRC1 | 1p13.3 | rs599839 | 4 x 10^-9 | Caucasian | 0.04 | 0.16 | 0.12 | 0.08 | 0.67 | 0.05
CVD | SORT1 | 1p13.3 | rs599839 | 3 x 10^-10 | Caucasian | 0.04 | 0.16 | 0.12 | 0.08 | 0.67 | 0.05
CVD | MIA3 | 1q41 | rs17465637 | 1 x 10^-8 | Caucasian | 0.57 | 0.65 | 0.45 | 0.59 | 0.07 | 0.52
T2D | ADAMTS9 | 3p14.1 | rs4607103 | 1 x 10^-8 | Caucasian | 0.61 | 0.75 | 0.47 | 0.53 | 0.69 | 0.62
CVD | MRAS | 3q22.3 | rs2306374 | 3 x 10^-8 | Caucasian | 0.58 | 0.37 | 0.81 | 0.71 | 0.91 | 0.19
CVD | MRAS | 3q22.3 | rs9818870 | 7 x 10^-13 | Caucasian | 0.42 | 0.51 | 0.85 | 0.67 | 0.96 | 0.23
T2D | IGF2BP2 | 3q27.2 | rs1470579 | 2 x 10^-9 | Caucasian | 0.13 | 0.07 | 0.78 | 0.10 | 0.02 | 0.84
0.13
0.07
0.78
0.10
0.02
0.84
0.13
0.07
0.78
0.10
0.02
0.84
2 x 10-19
AsianT2D
IGF2BP2
3q27.2
rs1470579
(South
Indian
Asian
2 x 10-13
T2D
IGF2BP2
3q27.2
(South
Asian-
rs1470579
Indian
Asians
4 x 10-9
AsianT2D
IGF2BP2
3q27.2
rs1470579
(South
0.13
0.07
0.78
0.10
0.02
0.84
Caucasian
0.46
0.16
0.78
0.10
0.12
0.82
Caucasian
0.46
0.16
0.78
0.10
0.12
0.82
Caucasian
0.46
0.16
0.78
0.10
0.12
0.82
0.31
0.90
0.11
0.67
0.14
0.21
0.40
0.33
0.98
0.25
0.62
0.09
0.34
0.0006
0.13
0.10
0.28
0.0006
0.13
Indian
Asians)
T2D
IGF2BP2
3q27.2
rs4402960
2 x 10-9
9 x 10-16
(DGI+F
T2D
IGF2BP2
3q27.2
rs4402960
USION+
WTCCC)
3 x 10-9
T2D
IGF2BP2
3q27.2
rs4402960
(Obese)
AsianCVD
C6orf10
6p21.32
rs9268402
3 x 10-15
Chinese
HCG27
CVD
6p21.33
rs3869109
1 x 10-9
Caucasian
0.47
6p22.3
rs10946398
1 x 10-8
Caucasian
0.10
USP8P1
0.000
T2D
CDKAL1
2
T2D
CDKAL1
6p22.3
rs7754840
4 x 10-11
Caucasian
0.08
0.000
(DGI+F
(Finn)
2
USION+
WTCCC)
AsianT2D
CDKAL1
6p22.3
rs7754840
7 x 10-10
0.000
0.08
Chinese
0.10
0.28
0.0006
0.13
0.04
0.75
0.0001
0.05
2
0.000
T2D
CDKAL1
6p22.3
rs7756992
8 x 10-9
Caucasian
0.59
1
AsianT2D
C6orf57
6q13
3 x 10-8
Chinese,Mal
(Indian)
aysian,Asian
rs1048886
1.00
0.82
0.68
0.75
0.95
0.88
Indian
CVD
TCF21
6q23.2
rs12190287
1 x 10-12
Caucasian
0.27
0.38
0.57
0.11
0.86
0.93
CVD
MTHFD1L
6q25.1
rs6922269
3 x 10-8
Caucasian
0.69
0.09
0.62
0.67
0.47
0.23
0.37
0.44
0.61
0.61
0.96
0.35
2 x 10-10
FSCN3
T2D
Asian7q32.1
rs10229583
(East
PAX4
Chinese
Asian)
CVD
ZC3HC1
7q32.2
rs11556924
9 x 10-18
Caucasian
0.12
0.38
0.91
0.05
0.55
0.70
T2D
CDKN2B-
9p21.3
rs10811661
5 x 10-8
Caucasian
0.39
0.71
0.24
0.46
0.70
0.61
AS1
DMRTA1
8 x 10-15
CDKN2B(DGI+F
T2D
AS1
9p21.3
Caucasian
rs10811661
USION+
0.39
0.71
0.24
0.46
0.70
0.64
0.46
0.001
0.54
0.99
0.61
(Finn)
DMRTA1
WTCCC)
CDKN2BCVD
Asian9p21.3
rs1333042
1 x 10-9
AS1
0.000
Korean
3
CDKN2BAsianT2D
AS1
9p21.3
rs2383208
3 x 10-17
0.54
0.73
0.39
0.78
0.93
0.81
0.54
0.73
0.39
0.78
0.93
0.81
0.84
0.28
0.005
0.01
0.74
0.002
0.84
0.28
0.005
0.01
0.74
0.002
Chinese
DMRTA1
CDKN2BAsianT2D
AS1
9p21.3
rs2383208
2 x 10-29
Japanese
DMRTA1
European,So
CVD
Intergenic
9p21.3
rs4977574
2 x 10-25
uth Asian
CDKN2ACVD
9p21.3
rs4977574
1 x 10-22
Caucasian
CDKN2B
CDKN2BT2D
AS1
9p21.3
rs7018475
3 x 10-8
Caucasian
0.78
0.24
0.18
0.32
0.41
0.01
9p21.3
rs7865618
2 x 10-27
Caucasian
0.68
0.23
0.21
0.65
0.72
0.08
rs3739998
1 x 10-11
Caucasian
0.93
0.76
0.33
0.51
0.20
0.40
0.37
0.01
0.72
0.93
0.05
0.94
0.22
0.15
0.94
0.15
0.29
0.82
0.13
0.53
0.66
0.23
0.89
0.73
DMRTA1
CDKN2BCVD
AS1
10p11.2
CVD
KIAA1462
3
AsianCDC123
T2D
10p13
rs10906115
1 x 10-8
Chinese
MIR4480
(female)
T2D
VPS26A
CVD
LIPA
10q22.1
rs1802295
4 x 10-8
rs1412444
3 x 10-13
10q23.3
South Asian
European,So
1
uth Asian
10q23.3
CVD
LIPA
rs1412444
4 x 10-8
Caucasian
0.13
0.53
0.66
0.23
0.89
0.73
rs5015480
1 x 10-15
Caucasian
0.20
0.03
0.72
0.43
0.04
0.75
rs5015480
2 x 10-9
Caucasian
0.20
0.03
0.72
0.43
0.04
0.75
1
HHEX
10q23.3
IDE
3
HHEX
10q23.3
T2D
T2D
3
(Obese)
Asian-
HHEX
10q23.3
EXOC6
3
T2D
rs5015480
9 x 10-6
Chinese
0.20
0.03
0.72
0.43
0.04
0.75
Caucasian
0.0009
0.63
0.04
0.08
0.94
0.08
Caucasian
0.002
0.47
0.02
0.14
0.82
0.06
Caucasian
0.78
0.85
0.28
0.45
0.61
0.53
(female)
T2D
TCF7L2
10q25.2
rs4506565
5 x 10-12
1 x 10-48
(DGI+F
T2D
TCF7L2
10q25.2
rs7901695
USION+
WTCCC)
5 x 10-11
(DGI+F
T2D
KCNJ11
11p15.1
rs5215
USION+
WTCCC)
T2D
HMGA2
12q14.3
rs1531343
4 x 10-9
Caucasian
0.49
0.80
0.06
0.48
0.55
0.04
12q21.1
rs7961581
1 x 10-9
Caucasian
0.27
0.26
0.30
0.03
0.50
0.43
0.30
0.14
0.50
0.36
0.32
0.06
TSPAN8
T2D
LGR5
ATP2B1
12q21.3
MRPL2P1
3
CVD
Asianrs7136259
6 x 10-10
Chinese
12q24.1
CVD
MYL2
Asianrs3782889
4 x 10-14
1
0.94
0.05
0.57
0.59
0.49
0.90
0.11
0.95
0.33
0.19
0.90
0.07
0.05
0.93
0.78
0.26
0.63
0.94
Korean
RPL12P33
12q24.3
T2D
HNF1A
rs7305618
2 x 10-8
rs7403531
4 x 10-9
Hispanic
1
AS1
AsianT2D
RASGRP1
15q14
Chinese
15q22.3
CVD
SMAD3
rs17228212
2 x 10-7
Caucasian
0.52
1.00
0.01
0.89
0.28
0.00
3
T2D
ZFAND6
15q25.1
rs11634397
2 x 10-9
Caucasian
0.26
0.27
0.36
0.67
0.48
0.04
T2D
PRC1
15q26.1
rs8042680
2 x 10-10
Caucasian
0.92
0.18
0.17
0.48
0.11
0.05
0.24
0.68
0.15
0.02
0.95
0.06
Caucasian
0.24
0.68
0.15
0.02
0.95
0.06
Caucasian
0.24
0.68
0.15
0.02
0.95
0.06
1 x 10-12
T2D
FTO
16q12.2
(DGI+F
Caucasian
USION+
(Finn)
rs8050136
WTCCC)
T2D
FTO
16q12.2
rs8050136
7 x 10-14
2 x 10-17
T2D
FTO
16q12.2
rs8050136
(obese)
2 x 10-11
CVD
CVD
LCAT
ZFHX3
16q22.1
16q22.3
African
rs3729639
(HDL-C)
American
5 x 10-6
African
(LDL-C)
American
rs16971384
* P-values < 5 x 10^-3 are in bold font.
0.22
0.15
0.80
0.81
0.69
0.57
0.87
0.63
0.16
0.71
0.34
0.22
Supplemental Table 5. Characteristics of the top eight pathways* identified by pathway and network based analytical methodologies for CVD+T2D, CVD,
and T2D among African American (AA) (n=8,155), Hispanic American (HA) (n=3,494), and Caucasian American (CA) (n=3,697) in the WHI-SHARe and WHI-GARNET cohorts.
Columns: Pathway; Genes†; No. of Genes; No. of SNPs‡; Min. P-value§; Min. FDR q-value||
Hypertrophic cardiomyopathy (HCM)
No. of Genes: 78; No. of SNPs‡: 4,246; Min. P-value§: 2.25e-7; Min. FDR q-value||: <0.001
Genes†: TGFB3, LAMA2, RYR2, MYL2τ, ITGA11, CACNA2D3, MYL3, CACNA2D1, TTN,
CACNA1C, ITGA4, PRKAG3, ITGA8, CACNA1D, CACNG6, ITGA9, PRKAG2,
SLC8A1, SGCD, TNNC1, ATP2A2, MYH7, CACNA1S, ITGA1, CACNG7, ACTC1,
MYH6, SGCG, CACNG5, CACNB2, ITGB5, ITGB8, TNNT2, IGF1, TNNI3,
MYBPC3, ITGB4, TPM1, CACNG3, ITGA3, CACNG1, ACE, ITGB6, ITGB7,
CACNG2, ITGA5, CACNA2D2, CACNB4, DAG1, ITGA6, CACNA2D4, ITGA10,
LMNA, TPM3, TNF, TGFB2, ITGAV, SGCB, ITGA7, IL6, ITGA2, ITGB3, PRKAB2,
TGFB1, PRKAG1, ACTB, TPM4, ITGB1, SGCA, ITGA2B, TPM2, DES, PRKAB1,
PRKAA2, ACTG1, CACNB3, CACNB1, PRKAA1

Dilated cardiomyopathy
No. of Genes: 84; No. of SNPs‡: 4,493; Min. P-value§: 7.14e-9; Min. FDR q-value||: <0.001
Genes†: TGFB3, LAMA2, RYR2, MYL2τ, ITGA11, CACNA2D3, MYL3, PRKACG,
CACNA2D1, TTN, CACNA1C, ITGA4, ITGA8, CACNA1D, CACNG6, ITGA9,
SLC8A1, ADCY5, SGCD, PRKACB, ADCY9, TNNC1, ATP2A2, MYH7,
CACNA1S, ITGA1, CACNG7, ACTC1, MYH6, SGCG, CACNG5, CACNB2,
ITGB5, ITGB8, ADCY4, TNNT2, IGF1, TNNI3, ADCY3, MYBPC3, ITGB4,
ADRB1τ, TPM1, ADCY8, CACNG3, ITGA3, CACNG1, ITGB6, ITGB7,
ADCY2, CACNG2, GNASτ, ITGA5, ADCY1, CACNA2D2, CACNB4, DAG1,
ITGA6, CACNA2D4, ITGA10, LMNA, TPM3, TNF, TGFB2, ITGAV, SGCB,
ITGA7, ITGA2, PLN, ITGB3, ADCY7, TGFB1, ACTB, TPM4, ITGB1, SGCA,
ITGA2B, TPM2, DES, PRKACAτ, ADCY6, ACTG1, CACNB3, CACNB1

Arrhythmogenic right ventricular cardiomyopathy (ARVC)
No. of Genes: 69; No. of SNPs‡: 5,313; Min. P-value§: 2.25e-8; Min. FDR q-value||: <0.001
Genes†: LAMA2, RYR2, ITGA11, CACNA2D3, CTNNA3, CACNA2D1, CTNNA2, CACNA1C,
ITGA4, ITGA8, CACNA1D, TCF7L2φ, CACNG6, ITGA9, SLC8A1, SGCD, ATP2A2,
PKP2, CACNA1S, ITGA1, CACNG7, SGCG, CACNG5, CACNB2, ACTN1, ITGB5,
ITGB8, CTNNB1φ, ITGB4, LEF1, CACNG3, DSP, ITGA3, CACNG1, ITGB6,
ITGB7, TCF7L1, CACNG2, ACTN2, ITGA5, CACNA2D2, CACNB4, DSC2, CDH2,
DAG1, ITGA6, CACNA2D4, ITGA10, LMNA, CTNNA1, ACTN4, ITGAV, SGCB,
GJA1, ITGA7, ACTN3, ITGA2, ITGB3, DSG2, TCF7, ACTB, ITGB1, SGCA,
ITGA2B, JUP, DES, ACTG1, CACNB3, CACNB1

Calcium signaling pathway
No. of Genes: 165; No. of SNPs‡: 8,040; Min. P-value§: 1.08e-9; Min. FDR q-value||: <0.001
Genes†: ITPR1φ, RYR2, ATP2B2, CALM2, PDE1Aφ, ATP2B4, PRKACG, CHRM2, MYLK,
PPP3R2, GNA14, CACNA1C, PLCG2, HRH1, CACNA1D, HTR6, PLCB4φ, DRD1,
PLCB1, SLC8A2, RYR1, ERBB4, GNAQ, SLC8A1, OXTR, HTR4, PDE1C, CALML5,
PRKACB, ADCY9, TNNC1, ATP2A2, PLCZ1, NOS2, AGTR1φ, VDAC3, CHRNA7,
CAMK4, TACR1, CACNA1S, RYR3, SLC25A31, CD38, PRKCB, PLCE1, PRKCG,
CHP2, TRPC1, GRIN2A, HTR7, ITPR2, PLCD3, ITPR3, EDNRB, PPP3R1, PTAFR,
PTK2B, CHRM1, CAMK2A, CHRM3, HRH2, HTR2A, ADCY4, EGFR, GRM5,
CACNA1G, CALM1, GRM1, PDGFRB, GNALτ, ADCY3, ADRA1A, PPP3CC,
ADRB1τ, PDGFRAφ, LHCGR, TNNC2, PTGER3, GRIN2Cφ, ADCY8, LTB4R2,
MYLK2, ITPKB, CAMK2Dφ, EDNRA, NTSR1, PLCD1, CACNA1B, BST1, VDAC1,
PPP3CA, F2R, PTGER1, NOS1, CACNA1E, PRKCA, CCKBR, GRIN2D, ADCY2,
BDKRB2, BDKRB1, P2RX1, SPHK2, GNASτ, ATP2B1, PPID, ADCY1, ADRA1D,
MYLK3, ADORA2B, CACNA1A, NOS3, HTR5A, CACNA1I, CAMK2B, PDE1B,
P2RX6, CCKARτ, CAMK2G, CALML3, P2RX5, SLC8A3, TRHR, ADRA1B, P2RX3,
P2RX4, P2RX7, ADRB2, CHRM5, ERBB2, PLCB2, CYSLTR2, VDAC2, TACR3,
ADORA2A, PLN, TACR2, PHKB, ADCY7, GRIN1, CACNA1H, ITPKA, PLCG1,
PPP3CB, PTGFR, CALM3, P2RX2, SLC25A4, PLCB3, DRD5, PLCD4, PRKACAτ,
PHKG2, ATP2A3, AVPR1A, ERBB3, ATP2A1, GNA15, CALML6, PHKG1,
TBXA2R, ADRB3, GNA11, SPHK1, HTR2B

Axon guidance
No. of Genes: 121; No. of SNPs‡: 6,303; Min. P-value§: 4.91e-8; Min. FDR q-value||: <0.001
Genes†: RGS3, UNC5C, CXCL12, SEMA6A, NTN1, SRGAP1, EPHA7τ, EFNA5τ, SEMA5B,
PLXNB1, SEMA5A, PPP3R2, ABLIM1, EPHA2, DCCτ, SEMA4F, NCK2, SLIT3,
ROCK1, SEMA6D, PAK7, ROBO1τ, SEMA3Aτ, RAC2, SRGAP3, ROBO2τ, KRAS,
SEMA4D, NGEF, GNAI1, SLIT2, GNAI3, LRRC4C, ABLIM3, EPHA6, CHP2,
NRP1τ, EPHB1, UNC5D, DPYSL2, SLIT1, PPP3R1, MET, EFNA2, NTNG1,
NFATC4, ABL1, EPHA4, EPHB2, PPP3CC, SEMA3E, NFATC2, EPHA8, DPYSL5,
ABLIM2, PLXNC1, RAC1, UNC5B, NTN4, PPP3CA, RHOA, GSK3B, PLXNA2,
UNC5A, PLXNA1, NFATC1, SEMA3D, SEMA4B, PTK2τ, MAPK1, SEMA7A,
CDK5, FYN, NFAT5φ, SRGAP2, SEMA3G, EPHA5, LIMK1, SEMA6C, SEMA4A,
ROBO3, CFL1, PAK1, SEMA6B, EPHA1, NRAS, NCK1, EPHB6, EPHB4, SEMA3C,
CDC42φ, ARHGEF12, PAK6, PAK2, EFNB2, ROCK2, CFL2, FES, RND1, EFNB3,
PAK4, EPHA3, SEMA3F, PPP3CB, LIMK2, ITGB1, CXCR4τ, RASA1, SEMA4G,
GNAI2, SEMA4C, HRAS, EFNA1, NFATC3, EFNA3, RAC3, EPHB3, RHOD,
PLXNB2, NTN3, EFNA4

Cell adhesion molecules (CAMs)
No. of Genes: 122; No. of SNPs‡: 5,547; Min. P-value§: 3.60e-10; Min. FDR q-value||: <0.001
Genes†: SELP, ALCAM, NRCAM, CD8B, SIGLEC1, NEO1, CNTNAP2, JAM3, NCAM2,
VCAN, ITGA4, CADM3, PTPRM, ITGA8, GLG1, CLDN17, ITGA9, HLA-Bφ, ESAM,
CDH4τ, NRXN1, CD274, CD80, PDCD1LG2, NRXN3, OCLN, PVRL1, HLA-F,
HLA-A, ICAM2, NEGR1τ, CLDN10, CADM1, CLDN18, NFASC, HLA-DOB,
HLA-DPB1, ITGB8, HLA-DMA, HLA-DQA2, NLGN1, MPZL1, SDC4, CD28, CNTN2,
HLA-DPA1, NCAM1τ, ITGAL, CLDN1, CDH1, SDC2, HLA-Cφ, ITGB7, HLA-DOA,
HLA-Gφ, SDC3, PVRL2, VCAM1, CNTN1τ, HLA-DRB1φ, ICOS, CD4, ITGB2,
PECAM1, JAM2, HLA-DRA, HLA-Eφ, MAG, F11R, CDH15, PVR, CD2, CLDN5,
CD58, CD22, ICOSLG, NLGN2, CDH2, NRXN2, ITGA6, CDH5, CD34, SELE,
SELL, SELPLG, CLDN20, HLA-DRB3, CNTNAP1, HLA-DRB5, ITGAV, CLDN16,
PTPRF, HLA-DQB1, CTLA4, CLDN14, ICAM3, CD6, CD226, HLA-DQA1φ,
PTPRC, SPN, SDC1, CLDN22, CD40, ICAM1, ITGAM, ITGB1, CD276, CLDN7,
CLDN8, CDH3, CD86, CLDN23φ, PVRL3, MPZ, CLDN3, CLDN11, CLDN4,
CLDN19, MADCAM1, CLDN15, CLDN6

Focal adhesion
No. of Genes: 189; No. of SNPs‡: 6,750; Min. P-value§: 1.60e-8; Min. FDR q-value||: <0.001
Genes†: IGF1R, LAMA2, PDGFC, MYL2τ, ITGA11, VAV2, THBS2φ, ARHGAP5, LAMB1,
MYLK, FLT1τ, COL4A4, BCL2, MAP2K1, CCND2φ, GRB2τ, DIAPH1, ITGA4,
HGF, DOCK1, ITGA8, COL4A2, COL6A3, ITGA9, ROCK1, PARVB, CAV3,
RAP1A, PAK7, FLNB, RAC2, TNN, VCL, VAV3, VWF, SRC, PIK3CA, ITGA1,
PRKCB, COL3A1, PRKCG, LAMA3, TNR, MAPK8, COL4A1, IBSP, SHC4,
COL5A1, SHC3, BCAR1, VEGFAυ, ACTN1, PARVA, ITGB5, LAMA1, PTEN,
ITGB8, SOS1, MET, IGF1, EGFR, COL5A2, CTNNB1φ, VEGFB, TNXB, MYL10,
AKT2, PDGFRB, ITGB4, PIK3R5, THBS4, PDGFRAφ, CRK, COL11A1, PARVG,
PIK3R1τ, MYLK2, TLN2, RELN, COL11A2, ITGA3, FN1, LAMA4, TNC, RAC1,
ITGB6, RAPGEF1, ITGB7, RHOA, MAPK9, BRAF, RASGRF1φ, COL1A2, PRKCA,
PIP5K1C, GSK3B, COL2A1, COL5A3, PDGFDτ,υ, VAV1, MAPK10, PIK3R2,
LAMB3, RAF1, PIK3CD, MYL12A, AKT1, ACTN2, MYL9, ITGA5, CAPN2, LAMC1,
LAMB4, PTK2τ, MYLK3, ZYX, LAMA5, MAPK1, LAMC3, FLT4, CRKL, FYN,
AKT3, CAV1, COL6A1, ITGA6, ITGA10, PPP1CA, COL1A1, PAK1, CCND3,
MYLPF, ACTN4, VEGFC, ITGAV, EGF, PDPK1, ITGA7, COL6A6, ERBB2, BIRC2,
CDC42φ, PAK6, PAK2, LAMC2, ACTN3, PPP1CB, PIK3CG, KDR, ITGA2, ITGB3,
COL6A2, ROCK2, JUN, COMP, FLNC, THBS1, MYL5, PAK4, ILK, THBS3, ACTB,
PPP1CC, PIK3CB, RAP1B, SOS2, PDGFB, ITGB1, BIRC3, PPP1R12A, ITGA2B,
SHC2, BAD, MYL7, TLN1, CCND1, HRAS, VTN, PIK3R3, SPP1, RAC3, VASP,
LAMB2, ACTG1, SHC1, CAV2, CHAD, PXN, PGF

ECM-receptor interaction
No. of Genes: 83; No. of SNPs‡: 3,404; Min. P-value§: 2.74e-7; Min. FDR q-value||: <0.001
Genes†: LAMA2, ITGA11, THBS2φ, LAMB1, COL4A4, ITGA4, ITGA8, COL4A2, COL6A3,
ITGA9, CD44, TNN, VWF, HSPG2, ITGA1, COL3A1, SV2B, LAMA3, TNR,
COL4A1, IBSP, SV2C, COL5A1, GP6, ITGB5, LAMA1, ITGB8, SDC4, AGRN,
COL5A2, TNXB, ITGB4, THBS4, COL11A1, CD47, RELN, COL11A2, ITGA3, FN1,
LAMA4, SDC2, TNC, ITGB6, ITGB7, SDC3, COL1A2, COL2A1, CD36τ, COL5A3,
LAMB3, ITGA5, LAMC1, LAMB4, LAMA5, LAMC3, GP5, HMMR, COL6A1, DAG1,
ITGA6, ITGA10, SV2A, COL1A1, ITGAV, ITGA7, COL6A6, LAMC2, ITGA2,
ITGB3, COL6A2, COMP, THBS1, SDC1, THBS3, ITGB1, GP1BA, ITGA2B,
GP1BB, GP9, VTN, SPP1, LAMB2, CHAD
* Pathways identified by ≥ 2 methods in all three populations (AA, HA, and CA). We used KEGG and/or Reactome pathway databases in our
pathway and network based analyses.
† Genes listed in the GSA-SNP output. Genes previously associated with one of the three endpoints are in bold font (τ: CHD; υ:
CVD; φ: T2D). Underlined genes were found to be significant using GSA-SNP, i.e., with nominal p-values ≤ 0.05.
‡ SNPs located within the genomic regions of the genes listed in the Genes column, according to the Affymetrix database.
§,|| P-values and false discovery rate (FDR) q-values computed by GSA-SNP. The minimum values across the three populations (AA, HA, and CA)
are presented.
Supplemental Table 6. Common top pathways* identified by pathway and network based analytical methodologies† for CVD+T2D, CVD,
and T2D among African American (AA)(n=8,155), Hispanic American (HA)(n=3,494), and Caucasian American (CA)(n=3,697) in the
WHI-SHARe and WHI-GARNET cohorts.
AA
CVD + T2D
HA
CA
AA
CVD
HA
Hypertrophic
cardiomyopathy
G,P
G,P
G,P,E
G,P
Dilated
cardiomyopathy
G,P
G,P
G,P,E
Arrhythmogenic
right ventricular
cardiomyopathy
G,P
G,P
Calcium
signaling
pathway
G,P
Axon guidance
Pathway
T2D
HA
CA
Evidence from
previous studies
CA
AA
G,P
M,G,P,E
M,G,P
G,P
G,P,E
G,P
G,P
G,P,E
G,P,E
G,P
G,P
G,P,E
G,P
G,P
G,P,E
G,P
G,P
G,P,E
G,P
G,P
G,P
G,P
G,P
G,P
G,P
G,P
CAD23, T2D10, 23
G,P
G,P
G,P
G,N,P
G,P
G,P
G,P
G,P
G,P
T2D24
Cell adhesion
molecules
G,P
G,P
G,P
G,P
G,P
G,P
G,P
G,P
G,P
Focal adhesion
G,P
G,P
G,P
G,P
G,P
G,P
G,P
G,P
G,P
CAD23, T2D23, 24
ECM-receptor
interaction
G,P
G,P
G,P
G,P
G,P
G,P
G,P
G,P
G,P
CAD, T2D23
* To avoid potential biases from individual methods and further reduce false discovery, the top significant pathways identified by each method
were compared to yield the overall top pathways that were consistently identified by two or more of the five methods for each of the three
disease endpoints in each population. The pathways listed were identified by ≥ 2 methods in at least one of the three populations (AA, HA, and CA). We used
KEGG and/or Reactome pathway databases in our pathway and network based analyses. Pathways are ranked by counts of enrichment in each of
the five methods.
† M denotes MAGENTA, G represents GSA-SNP, N denotes NIMMI, P represents PANOGA, and E denotes eSNP.
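The consensus rule in the footnote above (keep a pathway only if two or more of the five methods report it) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name `consensus_pathways` and the toy input are hypothetical, and the input lists are placeholders rather than study results.

```python
from collections import defaultdict


def consensus_pathways(hits_by_method, min_methods=2):
    """Keep pathways reported by at least `min_methods` methods.

    `hits_by_method` maps a method code (M, G, N, P, E as in the table
    footnote) to the pathways that method called significant.
    Returns {pathway: sorted method codes}, ordered by support.
    """
    support = defaultdict(set)
    for method, pathways in hits_by_method.items():
        for pathway in pathways:
            support[pathway].add(method)
    kept = {p: sorted(m) for p, m in support.items() if len(m) >= min_methods}
    # Rank by number of supporting methods, then alphabetically.
    return dict(sorted(kept.items(), key=lambda kv: (-len(kv[1]), kv[0])))


# Hypothetical per-method hit lists for one endpoint in one population.
toy_hits = {
    "M": ["Cell cycle"],
    "G": ["Focal adhesion", "Axon guidance"],
    "N": ["Axon guidance"],
    "P": ["Focal adhesion", "Axon guidance"],
    "E": [],
}
result = consensus_pathways(toy_hits)
for pathway, methods in result.items():
    print(pathway, ",".join(methods))
```

The comma-joined method codes printed for each pathway mirror the cell notation used in the table (for example `G,P`), and a pathway reported by a single method is dropped.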
Supplemental Table 7. Assessment of top eight pathways in C4D and CARDIOGRAM GWAS.
Pathways | Enrichment P value* (C4D) | Enrichment P value* (CARDIOGRAM)
Hypertrophic cardiomyopathy | 5.63e-04 | 3.50e-04
Dilated cardiomyopathy | 4.13e-04 | 2.51e-04
Arrhythmogenic right ventricular cardiomyopathy | 4.80e-05 | 7.40e-04
Calcium signaling pathway | 5.89e-04 | 1.87e-04
Axon guidance | 4.96e-09 | 7.31e-04
Cell adhesion molecules | 3.0e-03† | 6.00e-05
Focal adhesion | 2.15e-05 | 9.03e-06
ECM-receptor interaction | 2.29e-06 | 3.7e-03
* GSA-SNP was used; the FDR was <0.001 for all P values except the one marked †.
† FDR was 0.05.
Supplemental Table 8. Pathways*,‡ specific for ethnicity and disease and identified by pathway and network based analyses for African
American (AA; n=8,155), Hispanic American (HA; n=3,494), and Caucasian American (CA; n=3,697) in the WHI-SHARe and WHI-GARNET cohorts.
By Ethnicity
By Disease
Pathway
T2D +
AA
HA
CA
T2D
CVD†
CVD†
Cell Cycle
X
O
O
Apoptosis
X
Wnt signaling pathway
Melanoma
O
X
O
Pathways in cancer
X
Bladder cancer
O
Acute myeloid leukemia
O
Dorso-ventral axis formation
O
Prion diseases
O
O
X
* We used KEGG and/or Reactome pathway databases in our pathway and network based analyses.
† Among CA in WHI-GARNET, CHD was investigated instead of CVD.
‡ "O" represents "significantly enriched" (i.e., found to be significant [with nominal or adjusted p-values ≤ 0.05] by two or more of the five
methods), while "X" represents "not enriched".
Supplemental Table 9. Validation of the top key drivers via intersection with various mouse datasets.
Key
Official
Causality
Causality
Causality
Causality
Mouse
HMDP
HMDP
Literature
Drivers
Gene
No. of datasets
No. of
Traits
Trait Type
Phenome
Traits
Traits
support
(Locatio
Name
Tissue
(P-values
database –
correlated
correlated
<4.1e-6)*
related
with eSNP of
with
phenotypes
KD
expression of
(P-values <
(P-values
KD
0.001)
<4.1e-6)†
(P-values
(No. of tissues)
<4.1e-6)‡
n)
(No. of
tissues)
COL1A1
Collagen
5
4
Glucose,
Diabetes
CVD-
Body fat,
(17q21.3
, type 1,
(C57BL/6J x
(islet, liver,
MCPI, UC,
(including
atherosclerosis
esterified
compliance
3)
alpha 1
BTBR ob/ob
adipose,
leptin, insulin,
inflammatory
cholesterol
25
mouse cross20,
muscle)
weight,
cytokine/che
(2: adipose,
Bx129 cross21,
HOMA-IR,
mokine, and
heart)
BxA cross21,
resistin,
cytokine),
---
Arterial
BxH wildtype
subcutaneous,
cross19, BxD
LDL
cross21)
cholesterol
CVD, obesity
COL3A1
Collagen
6
2
MCP1,
Diabetes
(2q31)
, type III,
(C57BL/6J x
(Liver,
weight, fat
(including
related26,
alpha 1
BTBR ob/ob
adipose)
mass, HOMA-
inflammatory
CAD27,
mouse cross20,
%B,
cytokine/che
cardiovascu
JAXLONG_200
subcutaneous,
mokine),
lar related28
821, Bx129
LDL
obesity, CVD
cross21, BxA
cholesterol,
cross21, BxD
DEXA fat
cross21,
%/tissue,
C57BL/6J x A/J
insulin, bw,
mouse cross17)
mesenteric
ELN
(7q11.23)
Elastin
Body weight
---
1
Glucose/insuli
Diabetes,
Lipids - LDL,
(C57BL/6J x
(adipose)
n, fat mass,
obesity
non-HDL, and
on29, blood
pressure30
weight, fat
total
mouse cross20,
mass, DEXA
cholesterol,
---
Diabetes
4
BTBR ob/ob
---
---
Hypertensi
JAXLONG_200
fat tissue
phospholipid,
821, BxA cross21,
body
C57BL/6J x A/J
composition –
mouse cross17)
fat, body
weight,
cardiovascular
– ECG
parameters,
heart rate and
heart weight,
immune system
– cell count in
peripheral
blood
COL4A1
Collagen
7
4
HOMA-%B,
Diabetes,
Cardiovascular
(13q34)
, type IV,
(C57BL/6J x
(kidney,
weight,
CVD, obesity
alpha 1
BTBR ob/ob
liver,
triglyceride,
mouse cross20,
muscle,
insulin,
---
Glucose
Childhood
-ECG
(1 tissue:
obesity31,
parameters
liver)
CAD32,
arterial
glucose,
stiffness33,
cross19,
leptin, fat
MI34, blood
JAXLONG_200
mass,
pressure35,
821, Bx129
mesenteric,
cerevascula
cross21, BxA
subcutaneous,
r related36
cross21, BxD
gonadal fat,
cross21,
HDL
C57BL/6J x A/J
cholesterol,
mouse cross17)
total
BxH wildtype
adipose)
cholesterol,
LDL
cholesterol,
bw
CD93
CD93
6
5
Glucose,
Diabetes
Body fat pads,
Body fat
(20p11.2
molecule
(C57BL/6J x
(kidney,
CD40, insulin,
(including
body weight,
(1 tissue:
system37,
BTBR ob/ob
liver,
osteopontin,
inflammation
immune system
striatum)
CHD38,
mouse cross20,
adipose,
triglyceride,
-related),
– cell counts
JAXLONG_200
muscle,
num islets, C
CVD, obesity
1)
---
Immune
CAD39
821, Bx129
islet)
peptide,
cross21, BxA
HOMA-%B,
cross21, BxD
HOMA-IR,
cross21,
mesenteric,
C57BL/6J x A/J
weight, fat
mouse cross17)
mass,
subcutaneous,
gonadal fat,
leptin, LDL
cholesterol,
bw
FN1
Fibronect
4
3
Insulin,
Diabetes,
Lipids – total
Fat mass
(2q34)
in 1
(C57BL/6J x
(kidney,
weight, leptin,
obesity
cholesterol,
(1 tissue:
ular
BTBR ob/ob
liver,
gonadal
body
macrophage)
related40,
mouse cross20,
adipose)
weight,
composition –
rheumatoid
Bx129 cross21,
DEXA fat
fat, body
arthritis41
BxA cross21,
%/tissue,
weight,
C57BL/6J x A/J
mesenteric
cardiovascular
---
Cardiovasc
mouse cross17)
– ECG
parameters,
heart rate
Matrix
7
3
Haptoglobin,
Diabetes
(16q13-
metallop
(C57BL/6J x
(adipose,
insulin, fat
(including
retroperitoneal
cardiovascu
q21)
eptidase
BTBR ob/ob
liver,
mass, weight,
acute-phase),
fat pads
lar &
2
mouse cross20,
muscle)
HDL
obesity
(1 tissue:
diabetes43,
adipose)
glucose44,
Body weight
Fat mass,
---
MI42,
MMP2
C57BL/6J x
cholesterol,
C3H ApoE -/-
OGTT
adipogenesi
mouse cross18,
glucose,
s45
BxH wildtype
IPIST,
cross19,
glucose,
JAXLONG_200
HOMA-IR,
821, Bx129
mesenteric,
cross21, BxA
subcutaneous,
cross21,
leptin, gonadal
C57BL/6J x A/J
fat, bw,
mouse cross17)
DEXA fat
%/tissue
SPARC
Secreted
7
4
Triglyceride,
CVD,
HDL
Gonadal fat
Fat mass, fat
Obesity &
(5q31.3-
protein,
(C57BL/6J x
(Kidney,
insulin, leptin,
diabetes,
cholesterol,
pads, body fat,
pads
diabetes46,
q32)
acidic,
BTBR ob/ob
liver,
fat mass,
obesity
body weight
weight, fat
(1 tissue:
insulin
cysteine-
mouse cross20,
adipose,
weight,
mass, insulin
adipose)
secretion47,
rich
BxH wildtype
muscle)
subcutaneous,
(2 tissues:
collagen in
(osteonec
cross19,
gonadal fat,
adipose,
heart48,
tin)
JAXLONG_200
leptin, LDL
macrophage)
adipogeneis
821, Bx129
cholesterol,
& wnt
cross21, BxD
bw,
signaling
cross21,
mesenteric
pathway49
C57BL/6J x A/J
mouse cross17,
BxA cross21)
COL2A1
Collagen
4
1
Fat mass,
Obesity,
Glucose, body
(12q13.1
, type II,
(JAXLONG_20
(adipose)
HDL
CVD
weight, CVD-
1)
alpha 1
0821, Bx129
cholesterol,
ECG
cross21, BxA
weight,
parameters
---
---
---
cross21,
subcutaneous
C57BL/6J x A/J
mouse cross17)
THBS2
Thrombo
2
3
VCAM1,
Diabetes
Lipids – LDL,
Weight, fat
HDL,
Childhood
(6q27)
spondin
(C57BL/6J x
(adipose,
insulin,
(including
non-HDL, and
mass
triglyceride,
obesity31,
2
BTBR ob/ob
kidney,
triglyceride,
inflammation
total
(1 tissue: aorta)
LDL
hypertensio
mouse cross20,
muscle)
APO A1,
-related,
cholesterol,
(2 tissues:
n50,
MCP1,
acute-phase,
cardiovascular
adipose, aorta)
diabetes51,
glucose,
and
– ECG
cystatin,
inflammatory
parameters and
weight
cytokine/che
heart rate
Bx129 cross21)
mokine),
CVD
(including
cystatin C),
obesity
CAD52
* Correlation p-value range (in -log10 scale): 5.40-300 (COL1A1), 5.57-300 (COL3A1), 5.43-300 (ELN), 5.38-300 (COL4A1), 5.38-300 (CD93),
5.60-14.7 (FN1), 5.42-300 (MMP2), 5.44-300 (SPARC), 6.01-300 (COL2A1), 5.77-300 (THBS2). P-value cutoff was determined by: [formula not recoverable from the source].
† Trait p-value range (in -log10 scale): 5.50-8.45 (COL1A1), 5.88 (CD93), 5.39-5.81 (FN1), 5.55-7.71 (MMP2), 5.44-6.30 (SPARC), 6.32-6.46
(THBS2). P-value cutoff was determined by: [formula not recoverable from the source].
‡ Trait p-value range (in -log10 scale): 5.80 (COL4A1), 7.12-11.1 (SPARC) in adipose tissue, 5.92-7.51 (THBS2). P-value cutoff was determined
by: [formula not recoverable from the source].
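Because the table headers state the cutoff as a raw P-value (<4.1e-6) while the footnotes report ranges on the -log10 scale, a short conversion sketch may help when reading them side by side. The helper names below are hypothetical, and the snippet only converts between the two scales; it does not reproduce how the cutoff itself was derived.

```python
import math


def neg_log10(p):
    """Convert a P-value to the -log10 scale used in the footnotes."""
    return -math.log10(p)


def p_from_neg_log10(x):
    """Convert a -log10 value back to a raw P-value."""
    return 10.0 ** (-x)


CUTOFF = 4.1e-6  # raw P-value cutoff stated in the table headers


def passes_cutoff(p, cutoff=CUTOFF):
    """True if a raw P-value is below the stated cutoff."""
    return p < cutoff


cutoff_on_log_scale = neg_log10(CUTOFF)
print(f"-log10({CUTOFF}) = {cutoff_on_log_scale:.3f}")
print(passes_cutoff(p_from_neg_log10(8.45)))  # upper end of the COL1A1 trait range
```

On this scale the cutoff sits near 5.39, close to the lower ends of the ranges listed in the footnotes, and larger values indicate stronger correlations.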
Supplemental Table 10. Assessment of top key drivers (KDs) for enrichment of genetic risk signals.
Enrichment P value
Gene Set | CVD+T2D (AA) | CVD+T2D (HA) | CVD+T2D (CA) | CVD (AA) | CVD (HA) | CVD (CA) | T2D (AA) | T2D (HA) | T2D (CA)
Top 10 KDs | 1.0e-03 | NS* | NS* | NS* | NS* | NS* | 2.0e-03 | 1.0e-03 | NS*
Top 30 KDs | 2.0e-04 | 1.0e-03 | 4.0e-03 | 2.0e-02 | NS | NS | 1.0e-03 | 1.2e-04 | 2.3e-04
Top 100 KDs | 2.7e-05 | 1.4e-05 | 3.1e-04 | 1.9e-05 | 4.4e-04 | 1.9e-04 | 4.0e-05 | 1.8e-07 | 7.6e-04
* NS: not significant at p<0.05 using GSA-SNP.
Supplemental References:
1.
Hays J, Hunt JR, Hubbell FA, Anderson GL, Limacher M, Allen C, Rossouw JE. The women's
health initiative recruitment methods and results. Ann Epidemiol. 2003;13:S18-77
2.
Laurie CC, Doheny KF, Mirel DB, Pugh EW, Bierut LJ, Bhangale T, Boehm F, Caporaso NE,
Cornelis MC, Edenberg HJ, Gabriel SB, Harris EL, Hu FB, Jacobs KB, Kraft P, Landi MT,
Lumley T, Manolio TA, McHugh C, Painter I, Paschall J, Rice JP, Rice KM, Zheng X, Weir BS,
Investigators G. Quality control and quality assurance in genotypic data for genome-wide
association studies. Genet Epidemiol.34:591-602
3.
Willer CJ, Li Y, Abecasis GR. Metal: Fast and efficient meta-analysis of genomewide association
scans. Bioinformatics.26:2190-2191
4.
Segre AV, Groop L, Mootha VK, Daly MJ, Altshuler D. Common inherited variation in
mitochondrial genes is not enriched for associations with type 2 diabetes or related glycemic
traits. PLoS Genet. 2010;6:e1001058
5.
Nam D, Kim J, Kim SY, Kim S. Gsa-snp: A general approach for gene set analysis of
polymorphisms. Nucleic Acids Res.38:W749-754
6.
Akula N, Baranova A, Seto D, Solka J, Nalls MA, Singleton A, Ferrucci L, Tanaka T, Bandinelli
S, Cho YS, Kim YJ, Lee JY, Han BG, Bipolar Disorder Genome Study C, Wellcome Trust Case-Control C, McMahon FJ. A network-based approach to prioritize results from genome-wide
association studies. PLoS One.6:e24220
7.
Bakir-Gungor B, Sezerman OU. A new methodology to associate snps with human diseases
according to their pathway related context. PLoS One. 2011;6:e26277
8.
Bakir-Gungor B, Sezerman OU. Identification of snp targeted pathways from genome-wide
association study (gwas) data. Nature Protocol Exchange. DOI:10.1038/protex.2012.019. 2012
9.
Huan T, Zhang B, Wang Z, Joehanes R, Zhu J, Johnson AD, Ying S, Munson PJ, Raghavachari
N, Wang RL, P., Courchesne P, Hwang SJ, Assimes TL, McPherson R, Samani NJ, Schunkert H,
Consortium CAGwRaM-aC, (ICBP) ICfBPG, Meng Q, Suver C, O'Donnell CJ, Derry J, Yang X,
Levy D. A systems biology framework identifies molecular underpinnings of coronary heart
disease. Arterioscler Thromb Vasc Biol. 2013;33:1427-1434. doi:
10.1161/ATVBAHA.112.300112. Epub 2013 Mar 28.
10.
Zhong H, Yang X, Kaplan LM, Molony C, Schadt EE. Integrating pathway analysis and genetics
of gene expression for genome-wide association studies. Am J Hum Genet. 2010;86:581-591
11.
Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, Amin N, Schwikowski B,
Ideker T. Cytoscape: A software environment for integrated models of biomolecular interaction
networks. Genome Res. 2003;13:2498-2504
12.
Schadt EE, Lamb J, Yang X, Zhu J, Edwards S, Guhathakurta D, Sieberts SK, Monks S, Reitman
M, Zhang C, Lum PY, Leonardson A, Thieringer R, Metzger JM, Yang L, Castle J, Zhu H, Kash
SF, Drake TA, Sachs A, Lusis AJ. An integrative genomics approach to infer causal associations
between gene expression and disease. Nat Genet. 2005;37:710-717
13.
Yang X, Deignan JL, Qi H, Zhu J, Qian S, Zhong J, Torosyan G, Majid S, Falkard B, Kleinhanz
RR, Karlsson J, Castellani LW, Mumick S, Wang K, Xie T, Coon M, Zhang C, Estrada-Smith D,
Farber CR, Wang SS, van Nas A, Ghazalpour A, Zhang B, Macneil DJ, Lamb JR, Dipple KM,
Reitman ML, Mehrabian M, Lum PY, Schadt EE, Lusis AJ, Drake TA. Validation of candidate
causal genes for obesity that affect shared metabolic pathways and networks. Nat Genet.
2009;41:415-423
14.
Yang X, Peterson L, Thieringer R, Deignan JL, Wang X, Zhu J, Wang S, Zhong H, Stepaniants S,
Beaulaurier J, Wang IM, Rosa R, Cumiskey AM, Luo JM, Luo Q, Shah K, Xiao J, Nickle D,
Plump A, Schadt EE, Lusis AJ, Lum PY. Identification and validation of genes affecting aortic
lesions in mice. J Clin Invest. 2010;120:2414-2422
15.
Maddatu TP, Grubb SC, Bult CJ, Bogue MA. Mouse phenome database (mpd). Nucleic Acids
Res. 2012;40:D887-894
16.
Ghazalpour A, Rau CD, Farber CR, Bennett BJ, Orozco LD, van Nas A, Pan C, Allayee H,
Beaven SW, Civelek M, Davis RC, Drake TA, Friedman RA, Furlotte N, Hui ST, Jentsch JD,
Kostem E, Kang HM, Kang EY, Joo JW, Korshunov VA, Laughlin RE, Martin LJ, Ohmen JD,
Parks BW, Pellegrini M, Reue K, Smith DJ, Tetradis S, Wang J, Wang Y, Weiss JN,
Kirchgessner T, Gargalovic PS, Eskin E, Lusis AJ, LeBoeuf RC. Hybrid mouse diversity panel:
A panel of inbred mouse strains suitable for analysis of complex genetic traits. Mamm Genome.
2012;23:680-692
17.
Derry JM, Zhong H, Molony C, MacNeil D, Guhathakurta D, Zhang B, Mudgett J, Small K, El
Fertak L, Guimond A, Selloum M, Zhao W, Champy MF, Monassier L, Vogt T, Cully D,
Kasarskis A, Schadt EE. Identification of genes and networks driving cardiovascular and
metabolic phenotypes in a mouse f2 intercross. PLoS One. 2010;5:e14319
18.
Yang X, Schadt EE, Wang S, Wang H, Arnold AP, Ingram-Drake L, Drake TA, Lusis AJ. Tissue-specific expression and regulation of sexually dimorphic genes in mice. Genome Res.
2006;16:995-1004
19.
Schadt EE, Molony C, Chudin E, Hao K, Yang X, Lum PY, Kasarskis A, Zhang B, Wang S,
Suver C, Zhu J, Millstein J, Sieberts S, Lamb J, GuhaThakurta D, Derry J, Storey JD, Avila-Campillo I, Kruger MJ, Johnson JM, Rohl CA, van Nas A, Mehrabian M, Drake TA, Lusis AJ,
Smith RC, Guengerich FP, Strom SC, Schuetz E, Rushmore TH, Ulrich R. Mapping the genetic
architecture of gene expression in human liver. PLoS Biol. 2008;6:e107
20.
Tu Z, Keller MP, Zhang C, Rabaglia ME, Greenawalt DM, Yang X, Wang IM, Dai H, Bruss MD,
Lum PY, Zhou YP, Kemp DM, Kendziorski C, Yandell BS, Attie AD, Schadt EE, Zhu J.
Integrative analysis of a cross-loci regulation network identifies app as a gene regulating insulin
secretion from pancreatic islets. PLoS Genet. 2012;8:e1003107
21.
https://www.synapse.org/#!Synapse:syn47391.
22.
Peri S, Navarro JD, Amanchy R, Kristiansen TZ, Jonnalagadda CK, Surendranath V, Niranjan V,
Muthusamy B, Gandhi TK, Gronborg M, Ibarrola N, Deshpande N, Shanker K, Shivashankar
HN, Rashmi BP, Ramya MA, Zhao Z, Chandrika KN, Padma N, Harsha HC, Yatish AJ, Kavitha
MP, Menezes M, Choudhury DR, Suresh S, Ghosh N, Saravana R, Chandran S, Krishna S, Joy
M, Anand SK, Madavan V, Joseph A, Wong GW, Schiemann WP, Constantinescu SN, Huang L,
Khosravi-Far R, Steen H, Tewari M, Ghaffari S, Blobe GC, Dang CV, Garcia JG, Pevsner J,
Jensen ON, Roepstorff P, Deshpande KS, Chinnaiyan AM, Hamosh A, Chakravarti A, Pandey A.
Development of human protein reference database as an initial platform for approaching systems
biology in humans. Genome Res. 2003;13:2363-2371
23.
Torkamani A, Topol EJ, Schork NJ. Pathway analysis of seven common diseases assessed by
genome-wide association. Genomics. 2008;92:265-272
24.
Elbers CC, van Eijk KR, Franke L, Mulder F, van der Schouw YT, Wijmenga C, Onland-Moret
NC. Using genome-wide pathway analysis to unravel the etiology of complex diseases. Genet
Epidemiol. 2009;33:419-431
25.
Brull DJ, Murray LJ, Boreham CA, Ralston SH, Montgomery HE, Gallagher AM, McGuigan FE,
Davey Smith G, Savage M, Humphries SE, Young IS. Effect of a col1a1 sp1 binding site
polymorphism on arterial pulse wave velocity: An index of compliance. Hypertension.
2001;38:444-448
26.
Gaikwad AB, Gupta J, Tikoo K. Epigenetic changes and alteration of fbn1 and col3a1 gene
expression under hyperglycaemic and hyperinsulinaemic conditions. Biochem J. 2010;432:333-341
27.
Muckian C, Fitzgerald A, O'Neill A, O'Byrne A, Fitzgerald DJ, Shields DC. Genetic variability in
the extracellular matrix as a determinant of cardiovascular risk: Association of type iii collagen
col3a1 polymorphisms with coronary artery disease. Blood. 2002;100:1220-1223
28.
Liu X, Wu H, Byrne M, Krane S, Jaenisch R. Type iii collagen is crucial for collagen i
fibrillogenesis and for normal cardiovascular development. Proc Natl Acad Sci U S A.
1997;94:1852-1856
29.
Goergen CJ, Li HH, Francke U, Taylor CA. Induced chromosome deletion in a williams-beuren
syndrome mouse model causes cardiovascular abnormalities. J Vasc Res. 2011;48:119-129
30.
Iwai N, Kajimoto K, Kokubo Y, Tomoike H. Extensive genetic analysis of 10 candidate genes for
hypertension in japanese. Hypertension. 2006;48:901-907
31.
Comuzzie AG, Cole SA, Laston SL, Voruganti VS, Haack K, Gibbs RA, Butte NF. Novel genetic
loci identified for the pathophysiology of childhood obesity in the Hispanic population. PLoS
One. 2012;7:e51954
32.
Schunkert H, Konig IR, Kathiresan S, Reilly MP, Assimes TL, Holm H, Preuss M, Stewart AF,
Barbalic M, Gieger C, Absher D, Aherrahrou Z, Allayee H, Altshuler D, Anand SS, Andersen K,
Anderson JL, Ardissino D, Ball SG, Balmforth AJ, Barnes TA, Becker DM, Becker LC, Berger
K, Bis JC, Boekholdt SM, Boerwinkle E, Braund PS, Brown MJ, Burnett MS, Buysschaert I,
Carlquist JF, Chen L, Cichon S, Codd V, Davies RW, Dedoussis G, Dehghan A, Demissie S,
Devaney JM, Diemert P, Do R, Doering A, Eifert S, Mokhtari NE, Ellis SG, Elosua R, Engert JC,
Epstein SE, de Faire U, Fischer M, Folsom AR, Freyer J, Gigante B, Girelli D, Gretarsdottir S,
Gudnason V, Gulcher JR, Halperin E, Hammond N, Hazen SL, Hofman A, Horne BD, Illig T,
Iribarren C, Jones GT, Jukema JW, Kaiser MA, Kaplan LM, Kastelein JJ, Khaw KT, Knowles
JW, Kolovou G, Kong A, Laaksonen R, Lambrechts D, Leander K, Lettre G, Li M, Lieb W,
Loley C, Lotery AJ, Mannucci PM, Maouche S, Martinelli N, McKeown PP, Meisinger C,
Meitinger T, Melander O, Merlini PA, Mooser V, Morgan T, Muhleisen TW, Muhlestein JB,
Munzel T, Musunuru K, Nahrstaedt J, Nelson CP, Nothen MM, Olivieri O, Patel RS, Patterson
CC, Peters A, Peyvandi F, Qu L, Quyyumi AA, Rader DJ, Rallidis LS, Rice C, Rosendaal FR,
Rubin D, Salomaa V, Sampietro ML, Sandhu MS, Schadt E, Schafer A, Schillert A, Schreiber S,
Schrezenmeir J, Schwartz SM, Siscovick DS, Sivananthan M, Sivapalaratnam S, Smith A, Smith
TB, Snoep JD, Soranzo N, Spertus JA, Stark K, Stirrups K, Stoll M, Tang WH, Tennstedt S,
Thorgeirsson G, Thorleifsson G, Tomaszewski M, Uitterlinden AG, van Rij AM, Voight BF,
Wareham NJ, Wells GA, Wichmann HE, Wild PS, Willenborg C, Witteman JC, Wright BJ, Ye S,
Zeller T, Ziegler A, Cambien F, Goodall AH, Cupples LA, Quertermous T, Marz W,
Hengstenberg C, Blankenberg S, Ouwehand WH, Hall AS, Deloukas P, Thompson JR,
Stefansson K, Roberts R, Thorsteinsdottir U, O'Donnell CJ, McPherson R, Erdmann J, Samani
NJ. Large-scale association analysis identifies 13 new susceptibility loci for coronary artery
disease. Nat Genet. 2011;43:333-338
33.
Tarasov KV, Sanna S, Scuteri A, Strait JB, Orru M, Parsa A, Lin PI, Maschio A, Lai S, Piras
MG, Masala M, Tanaka T, Post W, O'Connell JR, Schlessinger D, Cao A, Nagaraja R, Mitchell
BD, Abecasis GR, Shuldiner AR, Uda M, Lakatta EG, Najjar SS. Col4a1 is associated with
arterial stiffness by genome-wide association scan. Circ Cardiovasc Genet. 2009;2:151-158
34.
Yamada Y, Kato K, Oguri M, Fujimaki T, Yokoi K, Matsuo H, Watanabe S, Metoki N, Yoshida
H, Satoh K, Ichihara S, Aoyagi Y, Yasunaga A, Park H, Tanaka M, Nozawa Y. Genetic risk for
myocardial infarction determined by polymorphisms of candidate genes in a Japanese population.
J Med Genet. 2008;45:216-221
35.
Van Agtmael T, Bailey MA, Schlotzer-Schrehardt U, Craigie E, Jackson IJ, Brownstein DG,
Megson IL, Mullins JJ. Col4a1 mutation in mice causes defects in vascular function and low
blood pressure associated with reduced red blood cell volume. Hum Mol Genet. 2010;19:1119-1128
36.
Gould DB, Phalan FC, van Mil SE, Sundberg JP, Vahedi K, Massin P, Bousser MG, Heutink P,
Miner JH, Tournier-Lasserve E, John SW. Role of col4a1 in small-vessel disease and
hemorrhagic stroke. N Engl J Med. 2006;354:1489-1496
37.
Zekavat G, Mozaffari R, Arias VJ, Rostami SY, Badkerhanian A, Tenner AJ, Nichols KE, Naji
A, Noorchashm H. A novel cd93 polymorphism in non-obese diabetic (nod) and nzb/w f1 mice is
linked to a cd4+ inkt cell deficient state. Immunogenetics. 2010;62:397-407
38.
van der Net JB, Oosterveer DM, Versmissen J, Defesche JC, Yazdanpanah M, Aouizerat BE,
Steyerberg EW, Malloy MJ, Pullinger CR, Kastelein JJ, Kane JP, Sijbrands EJ. Replication study
of 10 genetic polymorphisms associated with coronary heart disease in a specific high-risk
population with familial hypercholesterolemia. Eur Heart J. 2008;29:2195-2201
39.
Malarstig A, Silveira A, Wagsater D, Ohrvik J, Backlund A, Samnegard A, Khademi M,
Hellenius ML, Leander K, Olsson T, Uhlen M, de Faire U, Eriksson P, Hamsten A. Plasma cd93
concentration is a potential novel biomarker for coronary artery disease. J Intern Med.
2011;270:229-236
40.
Stoynev N, Dimova I, Rukova B, Hadjidekova S, Nikolova D, Toncheva D, Tankova T. Gene
expression in peripheral blood of patients with hypertension and patients with type 2 diabetes. J
Cardiovasc Med (Hagerstown). 2013
41.
Hua L, Zhou P, Liu H, Li L, Yang Z, Liu ZC. Mining susceptibility gene modules and disease
risk genes from snp data by combining network topological properties with support vector
regression. J Theor Biol. 2011;289:225-236
42.
Perez-Hernandez N, Vargas-Alarcon G, Martinez-Rodriguez N, Martinez-Rios MA, Pena-Duque
MA, Pena-Diaz Ade L, Valente-Acosta B, Posadas-Romero C, Medina A, Rodriguez-Perez JM.
The matrix metalloproteinase 2-1575 gene polymorphism is associated with the risk of
developing myocardial infarction in Mexican patients. J Atheroscler Thromb. 2012;19:718-727
43.
Bhatt LK, Veeranjaneyulu A. A therapeutic approach to treat cardiovascular dysfunction of
diabetes. Exp Toxicol Pathol. 2012;64:847-853
44.
Wang P, Li HW, Wang YP, Chen H, Zhang P. Effects of recombinant human relaxin upon
proliferation of cardiac fibroblast and synthesis of collagen under high glucose condition. J
Endocrinol Invest. 2009;32:242-247
45.
Dubois SG, Tchoukalova YD, Heilbronn LK, Albu JB, Kelley DE, Smith SR, Fang X, Ravussin
E. Potential role of increased matrix metalloproteinase-2 (mmp2) transcription in impaired
adipogenesis in type 2 diabetes mellitus. Biochem Biophys Res Commun. 2008;367:725-728
46.
Kos K, Wilding JP. Sparc: A key player in the pathologies associated with obesity and diabetes.
Nat Rev Endocrinol. 2010;6:225-235
47.
Harries LW, McCulloch LJ, Holley JE, Rawling TJ, Welters HJ, Kos K. A role for sparc in the
moderation of human insulin secretion. PLoS One. 2013;8:e68253
48.
Harris BS, Zhang Y, Card L, Rivera LB, Brekken RA, Bradshaw AD. Sparc regulates collagen
interaction with cardiac fibroblast cell surfaces. Am J Physiol Heart Circ Physiol.
2011;301:H841-847
49.
Nie J, Sage EH. Sparc functions as an inhibitor of adipogenesis. J Cell Commun Signal.
2009;3:247-254
50.
Oguri M, Kato K, Yokoi K, Watanabe S, Metoki N, Yoshida H, Satoh K, Aoyagi Y, Nishigaki Y,
Nozawa Y, Yamada Y. Association of polymorphisms of thbs2 and hspa8 with hypertension in
Japanese individuals with chronic kidney disease. Mol Med Rep. 2009;2:205-211
51.
Yamaguchi S, Yamada Y, Matsuo H, Segawa T, Watanabe S, Kato K, Yokoi K, Ichihara S,
Metoki N, Yoshida H, Satoh K, Nozawa Y. Gender differences in the association of gene
polymorphisms with type 2 diabetes mellitus. Int J Mol Med. 2007;19:631-637
52.
McCarthy JJ, Parker A, Salem R, Moliterno DJ, Wang Q, Plow EF, Rao S, Shen G, Rogers WJ,
Newby LK, Cannata R, Glatt K, Topol EJ. Large scale association analysis for identification of
genes underlying premature coronary heart disease: Cumulative perspective from analysis of 111
candidate genes. J Med Genet. 2004;41:334-341