DOI: 10.1161/CIRCULATIONAHA.112.000604

Genome- and Phenome-Wide Analysis of Cardiac Conduction Identifies Markers of Arrhythmia Risk

Running title: Ritchie et al.; QRS GWAS and PheWAS in electronic records

Downloaded from http://circ.ahajournals.org/ by guest on June 16, 2017

Marylyn D. Ritchie, PhD1*; Joshua C. Denny, MD, MS2,3*; Rebecca L. Zuvich, PhD4*; Dana C. Crawford, PhD4,5; Jonathan S. Schildcrout, PhD6; Lisa Bastarache, MS2; Andrea H. Ramirez, MD3; Jonathan D. Mosley, MD, PhD3; Jill M. Pulley, MBA7; Melissa A. Basford, MBA7; Yuki Bradford, MS5; Luke V. Rasmussen8; Jyotishman Pathak, PhD9; Christopher G. Chute, MD, DrPH9; Iftikhar J. Kullo, MD10; Catherine A. McCarty, PhD11; Rex L. Chisholm, PhD12; Abel N. Kho, MD, MS13; Christopher S. Carlson, PhD14; Eric B. Larson, MD, MPH15; Gail P. Jarvik, MD, PhD16,19; Nona Sotoodehnia, MD, MPH17,18 on behalf of the CHARGE QRS Group; Teri A. Manolio, MD, PhD21; Rongling Li, PhD21; Daniel R. Masys, MD20; Jonathan L. Haines, PhD4,5; Dan M. Roden, MD3,22

1 Dept of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA; 2 Dept of Biomedical Informatics; 3 Dept of Medicine; 4 Dept of Molecular Physiology and Biophysics; 5 Center for Human Genetics Research; 6 Dept of Biostatistics; 7 Office of Research; 22 Dept of Pharmacology, Vanderbilt University School of Medicine, Nashville, TN; 8 Dept of Preventive Medicine; 12 Dept of Cell and Molecular Biology; 13 Dept of Medicine, Northwestern University School of Medicine, Chicago, IL; 9 Divs of Biomedical Informatics and Statistics; 10 Div of Cardiology, Mayo Clinic, Rochester, MN; 11 Essentia Institute of Rural Health, Duluth, MN; 14 Fred Hutchinson Cancer Research Center, Seattle, WA; 15 Group Health Research Institute, Seattle, WA; 16 Div of Medical Genetics; 17 Div of Cardiology; 18 Cardiovascular Health Research Unit, Dept of Medicine; 19 Dept of Genome Sciences; 20 Dept of Biomedical and Health Informatics, University of Washington, Seattle, WA; 21 National Human Genome Research Institute, National Institutes of Health, Bethesda, MD

*Contributed equally

Address for Correspondence: Dan M.
Roden, MD
Professor of Medicine and Pharmacology
Director, Oates Institute for Experimental Therapeutics
Assistant Vice-Chancellor for Personalized Medicine
Vanderbilt University School of Medicine
1285 Medical Research Building IV
Nashville, TN 37232-0575
Tel: 615-322-0067
Fax: 615-343-4522
Email: [email protected]

Journal Subject Codes: Basic science research:[132] Arrhythmias - basic studies, Diagnostic testing:[106] Electrophysiology, Etiology:[5] Arrhythmias, clinical electrophysiology, drugs

Abstract:

Background: Electrocardiographic QRS duration, a measure of cardiac intraventricular conduction, varies ~2-fold in individuals without cardiac disease. Slow conduction may promote reentrant arrhythmias.

Methods and Results: We performed a genome-wide association study (GWAS) to identify genomic markers of QRS duration in 5,272 individuals without cardiac disease selected from electronic medical record (EMR) algorithms at five sites in the Electronic Medical Records and Genomics (eMERGE) network. The most significant loci were evaluated within the CHARGE consortium QRS GWAS meta-analysis. Twenty-three single nucleotide polymorphisms (SNPs) in 5 loci, previously described by CHARGE, were replicated in the eMERGE samples; 18 SNPs were in the chromosome 3 SCN5A and SCN10A loci, where the most significant SNPs were rs1805126 in SCN5A with p=1.2×10⁻⁸ (eMERGE) and p=2.5×10⁻²⁰ (CHARGE) and rs6795970 in SCN10A with p=6×10⁻⁶ (eMERGE) and p=5×10⁻²⁷ (CHARGE). The other loci were in NFIA, near CDKN1A, and near C6orf204.
We then performed phenome-wide association studies (PheWAS) on variants in these five loci in 13,859 European Americans to search for diagnoses associated with these markers. PheWAS identified atrial fibrillation and cardiac arrhythmias as the most common associated diagnoses with SCN10A and SCN5A variants. SCN10A variants were also associated with subsequent development of atrial fibrillation and arrhythmia in the original 5,272 “heart-healthy” study population.

Conclusions: We conclude that DNA biobanks coupled to EMRs not only provide a platform for GWAS but may also allow broad interrogation of the longitudinal incidence of disease associated with genetic variants. The PheWAS approach implicated sodium channel variants modulating QRS duration in subjects without cardiac disease as predictors of subsequent arrhythmias.

Key words: cardiac conduction, QRS duration, atrial fibrillation, genome-wide association study, phenome-wide association study, electronic medical records

Introduction

Electrocardiographic (ECG) parameters of cardiac conduction and repolarization (PR, QRS, and QT intervals) are widely used in clinical medicine and display substantial variability when measured across large populations.
QRS duration represents activation time in the cardiac ventricle, and prolongation of this interval (representing global or regional slow conduction) has been associated with adverse outcomes such as sudden cardiac death.1 This variability reflects modulators such as abnormal electrolytes, underlying heart disease, or concomitant drug therapy, as well as heritable components; published estimates suggest that up to 40% of variability in the QRS interval is heritable.2–4 Using automated methods described further below, we have shown that 99% of QRS durations in over 30,000 normal subjects not receiving confounding medications fall between 65 and 108 msec.5

DNA repositories linked to Electronic Medical Record (EMR) systems have been proposed as one potential source of subjects for analyzing the relationship between genetic variation and a range of human traits.6–10 The advantages of this approach may include rapid generation of patient sets for study (since electronic data are already in place), and the ability to study large numbers of subjects accrued without bias with respect to factors such as disease or age.
The National Human Genome Research Institute’s electronic MEdical Records and GEnomics (eMERGE) Network11 has, as one of its primary goals, the evaluation of the utility of EMR systems coupled to DNA repositories as a tool for genome science. Initial studies from eMERGE sites support the potential utility of EMR systems for discovery and validation of genotype-phenotype associations.12–16

We report here a GWAS of QRS duration in European-descent subjects whose first ECG in an EMR system was normal, and who at the time of the ECG lacked evidence of heart disease, potentially confounding medications, and electrolyte abnormalities. We then used a phenome-wide association study (PheWAS)16,17 to demonstrate that polymorphisms associated with QRS variability in subjects without cardiac disease were also associated with subsequent diagnoses of cardiac arrhythmias. This coupling of GWAS and PheWAS further validates the relationship between abnormal conduction and arrhythmias, and points to the development of genome-based predictors of arrhythmia susceptibility.

Methods

Normal QRS algorithm

We developed and deployed an algorithm to identify individuals with normal ECGs and without any cardiac disease, abnormal electrolyte values, or QRS-active medications. Applied across the five eMERGE-I sites, it identified 5,272 Caucasian patients (2,488 males and 2,784 females; Table 1).
The algorithm was developed and validated in the Synthetic Derivative (SD), a de-identified image of the Vanderbilt EMR that currently contains over 120 million documents on about 2 million patients.7 The SD is refreshed regularly to add new clinical information from the EMR as it is accrued. The study population consisted of subjects with a normal ECG who had no evidence of cardiac disease at any time before or within one month following the ECG, no concurrent use of medications that interfere with ventricular conduction, and no abnormal potassium, calcium, or magnesium lab values at the time of the ECG. The algorithm has been described in detail previously.13 The algorithm used natural language processing (NLP)18,19 to analyze narrative text, billing code queries, and lab queries to exclude any subjects with evidence of arrhythmia, heart failure, cardiomyopathy, myocardial ischemia/infarct, or cardiac conduction defect. The algorithm considered all physician-generated clinical documentation, including clinical notes and cardiologist-generated ECG impressions. Patients with family histories of cardiac disease were allowed by the NLP algorithm. In addition, ECGs had normal Bazett’s corrected QT intervals (<450 ms), heart rates (between 50-100 bpm), and QRS durations (60-120 ms).
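The numeric ECG criteria in the preceding paragraph can be illustrated with a short sketch. This is not the published eMERGE algorithm, which also relies on NLP, billing codes, and lab queries; it only shows the three interval checks. Bazett's correction divides the QT interval by the square root of the RR interval in seconds. The function names are hypothetical, and treating the heart rate and QRS bounds as inclusive is an assumption, since the text does not say.

```python
import math


def bazett_qtc_ms(qt_ms: float, heart_rate_bpm: float) -> float:
    """Bazett-corrected QT in ms: QT / sqrt(RR), with RR in seconds."""
    rr_seconds = 60.0 / heart_rate_bpm
    return qt_ms / math.sqrt(rr_seconds)


def ecg_in_normal_range(qt_ms: float, heart_rate_bpm: float, qrs_ms: float) -> bool:
    """Apply the three numeric ECG criteria stated in the text.

    Assumption: heart rate (50-100 bpm) and QRS (60-120 ms) bounds are
    inclusive; the QTc limit (<450 ms) is strict, as written in the text.
    """
    if not 50 <= heart_rate_bpm <= 100:
        return False
    if not 60 <= qrs_ms <= 120:
        return False
    return bazett_qtc_ms(qt_ms, heart_rate_bpm) < 450


# Illustrative values only, not study data.
print(ecg_in_normal_range(qt_ms=400, heart_rate_bpm=60, qrs_ms=88))
print(ecg_in_normal_range(qt_ms=400, heart_rate_bpm=60, qrs_ms=130))
```

At 60 bpm the RR interval is exactly one second, so the corrected and uncorrected QT coincide; at faster rates the same raw QT yields a longer QTc.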
The algorithm was reviewed by two physicians not involved in algorithm development, and achieved a PPV of 97% to identify patients with normal ECGs who did not have known exclusions on a random selection of 100 subjects.13 Analysis of clinical covariates in ~30,000 records with algorithm-defined normal ECGs identified gender and ancestry as modulators of QRS duration.5 Complete details of the algorithm are available from PheKB (http://phekb.org/).

The final algorithm was applied in BioVU, the Vanderbilt DNA databank that links DNA extracted from discarded blood samples to the SD.7 Patients at Vanderbilt were genotyped specifically for the purpose of studying normal QRS duration. The algorithm was then deployed across the DNA repositories at the other four eMERGE-I sites (Marshfield Clinic, Northwestern University, Mayo Clinic, Group Health Research Institute) to identify subjects with extant eMERGE-based genotyping data (based on other phenotypes; as shown in Table 1) who met algorithm-defined criteria for normal QRS. The eMERGE cohorts are described in more detail in McCarty et al. (2011)11 and at the Phenotype Knowledge Base (PheKB.org).
Thus, all eMERGE individuals used in the analysis underwent the same algorithm to select those with normal ECGs and without prior heart disease, interfering medications, or abnormal electrolytes.

To assess the performance of the algorithm when applied within external EMR systems, trained chart abstracters at Northwestern and Marshfield reviewed randomly selected subsets of 100 subjects at Marshfield and 45 subjects at Northwestern. Northwestern’s evaluation also included an independent review by a board-certified internal medicine physician, with discrepancies resolved by consensus.

This study included only subjects designated as “non-Hispanic white” European American in the EMR from each site. We have previously shown that EMR-recorded ancestry performs similarly to self-report.20 This study was approved by each site’s Institutional Review Board. Because BioVU is de-identified and accrues individuals through left-over blood remaining after routine clinical testing, it operates as non-human subjects research according to the provisions of 45 CFR 46, as described previously.7 Individuals at other eMERGE sites were consented as part of each site’s DNA biobank.11

Genotyping and data analysis

Genotyping was performed at the Center for Genotyping and Analysis at the Broad Institute and the Center for Inherited Disease Research (CIDR) at Johns Hopkins University. Samples of European ancestry or unknown ancestry were analyzed using the Illumina Human660W-Quadv1_A genotyping platform, consisting of 561,490 SNPs and 95,876 intensity-only probes.
Data were cleaned using the quality control (QC) pipeline developed by the eMERGE Genomics Working Group.21 This process includes evaluation of sample and marker call rate, gender mismatch and anomalies, duplicate and HapMap concordance, batch effects, Hardy-Weinberg equilibrium (HWE), sample relatedness, and population stratification. After QC, 528,508 SNPs were used for analysis based on the following QC criteria: SNP call rate >99%, sample call rate >99%, minor allele frequency >0.0001, unrelated samples only (removing all parent-offspring, full and half siblings), and individuals of European descent only (based on STRUCTURE22 analysis of >90% probability of being in the CEU cluster).

Each eMERGE site used the QC pipeline to clean their initial datasets prior to merging all the samples. QC procedures were then performed on the merged eMERGE dataset in which data from all five sites were combined, and no significant differences across sites or genotyping center were identified. As well, all sites had comparable QC results including similar SNP and sample call rates, HWE p-values overall, and minor allele frequencies. The detailed QC report on the merged dataset will be deposited in dbGaP along with the merged dataset.

Single-locus tests of association were performed using linear regression assuming an additive genetic model for all 528,508 SNPs in a total of 5,272 individuals with a normal QRS duration.
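As a rough illustration of what an additive single-SNP test estimates, the sketch below regresses QRS duration on allele count (0, 1, or 2 copies), so the slope is the change in QRS per allele copy. It covers only the unadjusted case with ordinary least squares in plain Python; it is not the software used in the study, it omits covariates and p-values, and the function name and toy data are hypothetical.

```python
from statistics import mean


def additive_snp_effect(genotypes, qrs_ms):
    """Least-squares slope and intercept of QRS (ms) on allele count.

    genotypes: copies of the coded allele per subject (0, 1, or 2).
    Returns (beta, intercept), where beta is the estimated change in QRS
    duration per additional allele copy under an additive model.
    """
    g_bar = mean(genotypes)
    y_bar = mean(qrs_ms)
    sxx = sum((g - g_bar) ** 2 for g in genotypes)
    if sxx == 0:
        raise ValueError("SNP is monomorphic in this sample")
    sxy = sum((g - g_bar) * (y - y_bar) for g, y in zip(genotypes, qrs_ms))
    beta = sxy / sxx
    return beta, y_bar - beta * g_bar


# Toy data for illustration only, not study data.
toy_genotypes = [0, 0, 1, 1, 2, 2]
toy_qrs = [86, 88, 87, 89, 88, 90]
beta, intercept = additive_snp_effect(toy_genotypes, toy_qrs)
print(beta, intercept)
```

In a genome-wide scan this fit would be repeated once per SNP, which is why a multiple-testing correction is applied to the resulting p-values.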
Our studies of ECG intervals in 32,949 normal individuals identified sex as a major modulator of normal QRS duration, with minor effects of age and ancestry.5 All analyses were performed unadjusted and then adjusted for age, sex, BMI, and the first principal component from Eigenstrat23 to adjust for potential population stratification, without significantly changing the key results. Because only sex is significantly associated with QRS duration in the literature, we report the sex-adjusted analysis here. Analyses were also performed adjusting for height and/or BMI, but these did not change the results.

Associations with the lowest P values (p<10⁻⁴) were then submitted to the recent QRS Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) meta-analysis group,24 and a table of P-values from that analysis was generated. The CHARGE meta-analysis of QRS duration has been described in detail previously24; briefly, it involved 40,407 individuals selected from 15 sites restricted to those of European ancestry. Individuals with prior myocardial infarction, heart failure, arrhythmias, pacemakers, antiarrhythmic medication use, or QRS durations >120 ms were excluded.

To calculate the variance explained by all SNPs in the dataset, we analyzed the data using GCTA.25 Only those SNPs with minor allele frequency >0.01, genotyping efficiency >99.9 and HWE p-value >0.001 were included in the analysis (n=505,502 SNPs).
The genetic relationship matrix (GRM) was computed for all 5,272 subjects and all SNPs using GCTA. In order to eliminate possible cryptic relationships, subjects with GRM>0.25 were pruned from the analysis, which removed 310 subjects. The proportion of variance explained by either all SNPs or all SNPs excluding the subset of 23 SNPs significant in the CHARGE GWAS was computed on the remaining subjects for QRS duration. We compared this to a linear regression analysis using the five SNPs in Table 2 to estimate the proportion of variance explained by these loci. All analyses were adjusted for age, gender and the first principal component (previously computed).

Phenome-wide association study of QRS-associated SNPs

We selected the most significant SNP associations for analysis by PheWAS.16,17 For this analysis, we combined the entire eMERGE cohort of European American individuals (n=13,859) identified across the five eMERGE sites. These individuals represent a superset of the 5,272 individuals with normal ECGs and without heart disease used for the GWAS. To define diseases, we queried all International Classification of Disease (ICD), 9th edition, codes from the respective EMRs of the five eMERGE sites.
The PheWAS software uses occurrences of ICD codes to classify each person as having one or more of 778 possible clinical phenotypes (typically diseases). For each disease, the PheWAS algorithm constructs a control population by selecting all patients that do not have the case disease or closely related diseases (e.g., a patient with a bundle branch block cannot serve as a control for complete heart block). The PheWAS methodology has previously been validated through rediscovery of known associations.16,17 Analysis of each phenotype then proceeds using a pairwise analysis of all case and control groups for each tested SNP (n=23). We have observed that positive predictive values increase when individual codes are present more than once in the EMR, and here we required each case to have at least four instances of the same ICD code in a PheWAS case group. In addition, we did not analyze phenotypes occurring in fewer than 50 patients (a prevalence of 0.36% in the dataset). Association analyses were performed with PLINK using logistic regression adjusted for age, gender, and the first three principal components as calculated by Eigenstrat, since in this larger population the third principal component was statistically significant.23 Analyses with and without adjustment for principal components did not substantively differ.
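The case, control, and minimum-count rules described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the PheWAS software itself: the function names are hypothetical, the ICD-9 codes in the usage lines are illustrative rather than the actual PheWAS groupings, and reading the rule as "four instances of one code belonging to the phenotype's case group" is an interpretation of the text. Patients with some related code but too few instances are treated here as neither case nor control.

```python
from collections import Counter

MIN_CODE_INSTANCES = 4  # from the text: at least four instances of the same code
MIN_CASES = 50          # from the text: phenotypes with fewer than 50 cases skipped


def classify_patient(icd_codes, case_codes, exclusion_codes):
    """Label one patient for one phenotype as 'case', 'control', or 'excluded'.

    icd_codes: every ICD-9 code occurrence in the patient's record.
    case_codes: set of codes defining the phenotype (hypothetical grouping).
    exclusion_codes: set of codes for closely related diseases that bar a
    patient from serving as a control (hypothetical grouping).
    """
    counts = Counter(icd_codes)
    if any(counts[code] >= MIN_CODE_INSTANCES for code in case_codes):
        return "case"
    if any(counts[code] > 0 for code in case_codes | exclusion_codes):
        return "excluded"
    return "control"


def phenotype_is_analyzable(labels):
    """A phenotype is tested only if it has at least MIN_CASES cases."""
    return sum(1 for label in labels if label == "case") >= MIN_CASES


# Illustrative codes only; the real phenotype groupings are not given here.
af_codes = {"427.31"}
related_codes = {"427.9"}
print(classify_patient(["427.31"] * 4, af_codes, related_codes))
print(classify_patient(["427.9"], af_codes, related_codes))
print(classify_patient(["250.00"], af_codes, related_codes))

# The stated prevalence floor: 50 of 13,859 subjects, as a percentage.
print(round(50 / 13859 * 100, 2))
```

Keeping a separate "excluded" label mirrors the bundle branch block example in the text: a patient with a closely related diagnosis is removed from the control pool rather than counted against the association.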
After identification of PheWAS case and control groups using the PheWAS software, the association analyses were performed using PLINK.26

Survival analysis of QRS population

Following PheWAS analysis, we analyzed the original set of 5,272 patients that met our algorithm definition for normal cardiac conduction/normal heart for subsequent development of atrial fibrillation and cardiac arrhythmias with the SCN5A rs1805126 and SCN10A rs6795970 SNPs. Phenotype definitions were drawn from the PheWAS analysis using billing codes. Kaplan-Meier analysis and Cox proportional hazard models were calculated, with the initial normal ECG as the starting time for the time-to-event analysis. Cox proportional hazard models were adjusted for age, sex, principal components as calculated above, and QRS duration.

Results

Population identification

We identified 5,272 Caucasian patients (2,488 males and 2,784 females; Table 1) across the five eMERGE-I sites.
The positive predictive value (PPV) of the automated phenotype algorithm for identifying study subjects with normal ECGs and without exclusions at the development site, Vanderbilt, was 97% (95% confidence interval [CI] 91-99%).13 The PPVs at Northwestern University and Marshfield Clinic were 97% (95% CI 83%-100%) and 100% (95% CI 96%-100%), respectively. Combining all reviewed samples across the three sites, the PPV would be 98% (95% CI 96%-100%). The mean QRS duration was 87.9 msec (standard deviation 9.5 msec; median 88.0 msec; Figure 1A).

GWAS results

A total of 528,508 SNPs passed quality control of eMERGE-supported Illumina 660Quad genotyping data in these subjects. Figure 1B shows the genome-wide association analysis for QRS duration adjusted for sex; the findings were near-identical for the unadjusted analysis. There was a single association between QRS duration and a SNP (rs1805126) in SCN5A, the gene encoding the cardiac sodium channel, that survived Bonferroni correction (beta=1.002 msec per copy of the T allele, p=1.45×10⁻⁸). The set taken forward to the CHARGE QRS meta-analysis consortium included 108 SNPs with P-values <10⁻⁴. The retrieved P-values for this set divided into two distinct groups: 23 SNPs with P-values in the CHARGE set from 10⁻⁸ to 10⁻²⁷, and 85 with P-values >0.003.
These 23 associations (Supplementary Table 1) are located in the five loci with the lowest P values reported by the CHARGE consortium: 18/23 are in the chromosome 3 locus that includes SCN5A and SCN10A, as well as other genes (e.g. EXOG and XYLB1). The other three loci are near SLC35F1 and C6orf204 (chromosome 6), near CDKN1A (chromosome 6) and in NFIA (chromosome 1). The most significant SNP for each locus is presented in Table 2. The locus zoom plot (Supplementary Figure 1) shows little linkage disequilibrium (LD) in the chromosome 3 region in HapMap Phase III (CEU), consistent with the suggestion that the SCN5A-10A finding may actually indicate multiple independent associations.24 Specifically, the most significant variants in SCN5A (rs1805126) and SCN10A (rs6795970) are not in LD (r²<0.20).

Using the GCTA25 approach, we estimated heritability for QRS at 31.1% (standard error [SE] 6.9%, p=5.7×10⁻⁷) using all SNPs in the dataset. Conducting the analysis without the 23 SNPs significant in CHARGE decreased the estimated heritability to 30.3%, a decrease of 0.8%. This was somewhat conservative compared to a linear regression model, which estimated an adjusted r-square value of 1.6% for the five loci in Table 2.

PheWAS analysis

The PheWAS dataset consisted of 13,859 European-American subjects in the entire genotyped eMERGE cohort. The analysis focused on the most significant SNPs in each of the five loci associated with QRS (Table 3).
While no associations survived a strict Bonferroni correction for significance (p=0.05/778/5=1.3×10⁻⁵), the most significant associations were particularly relevant to cardiac disease and demonstrated significantly different patterns of associations for the five QRS-associated loci. The strongest associations for the SNPs in both SCN5A and SCN10A were with the diagnoses of cardiac arrhythmias (p=7.21×10⁻⁴ for SCN10A and p=1.1×10⁻³ for SCN5A) and, for SCN10A, atrial fibrillation (p=8.5×10⁻⁴, Figure 2). Table 3 lists associations for the most significant SCN5A (rs1805126) and SCN10A (rs6795970) SNPs, as well as those at the other QRS-associated loci, chromosome 1 (rs2207790) and the two chromosome 6 loci (rs6906287 and rs1321313), also graphed in Supplementary Figures 2-4. The CDKN1A and C6orf204 loci were not associated with cardiac arrhythmias (p>0.3, with 80% power to detect an OR>1.12 at p=0.05), and NFIA was weakly associated with cardiac arrhythmias (OR=0.91, p=0.02). Supplementary Table 2 presents PheWAS association data for all 23 SNPs significant in CHARGE.
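The Bonferroni threshold quoted above is simple arithmetic, and a short check makes clear why none of the reported PheWAS associations pass it. The sketch below divides a significance level of 0.05 by the number of tests (778 phenotypes times 5 loci, as in the text) and compares the three p-values reported in this paragraph. The function name and dictionary labels are hypothetical.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


# 778 phenotypes tested against the lead SNP at each of 5 loci.
phewas_threshold = bonferroni_threshold(0.05, 778 * 5)

# P-values reported in the text for the strongest PheWAS associations.
reported_p = {
    "SCN10A, cardiac arrhythmias": 7.21e-4,
    "SCN5A, cardiac arrhythmias": 1.1e-3,
    "SCN10A, atrial fibrillation": 8.5e-4,
}

survivors = [name for name, p in reported_p.items() if p < phewas_threshold]
print(f"threshold = {phewas_threshold:.2e}")
print(f"associations below threshold: {survivors}")
```

The smallest reported p-value is still more than an order of magnitude above the threshold, which matches the statement that no association survived the strict correction.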
While most SNPs in a given gene displayed similar patterns of PheWAS associations, rs11129801, the strongest SCN10A SNP in our adjusted analysis but a lesser association in CHARGE, had a very different PheWAS pattern, with the strongest associations being epilepsy, uterine cancer, and migraines; atrial fibrillation was not associated (OR=1.07, p=0.18). In agreement with these data, rs11129801 was only in weak linkage disequilibrium with the other SCN10A SNPs, such as rs6800541 (r²=0.20). Likewise, the EXOG locus had a very different PheWAS pattern of associations (prostatic hyperplasia, sexual and gender identity disorders, liver disease, kidney disease, cerebral degenerations, diarrhea) despite being near the SCN5A-10A region.

Analysis of QRS population for arrhythmias

After PheWAS analysis, we analyzed the original set of 5,272 patients that met our algorithm definition for normal cardiac conduction/normal heart for subsequent development of atrial fibrillation and cardiac arrhythmias with the SCN5A rs1805126 and SCN10A rs6795970 SNPs. In this population, 173 (3%) developed atrial fibrillation or atrial flutter at some point at least one month following the normal ECG, and 605 (11%) were coded as having any arrhythmia. QRS duration itself was associated with future development of atrial fibrillation (p=0.015) by logistic regression.
As with the eMERGE PheWAS, SCN10A rs6795970 was associated with both arrhythmias (hazard ratio [HR]=0.81 per copy of the A allele, p=0.002) and atrial fibrillation/flutter (HR=0.67 per copy of the A allele, p=0.001). Moreover, this association was essentially unchanged when QRS was also included in the model (HR=0.68), indicating that the association between rs6795970 and atrial fibrillation is independent of the association between QRS and rs6795970. Similarly, the association between rs6795970 and cardiac arrhythmias was independent of QRS (HR=0.80 without QRS). Our analysis did not demonstrate an association between SCN5A rs1805126 and either atrial fibrillation (HR=1.2, p=0.14) or cardiac arrhythmias (HR=1.03, p=0.66) in the normal QRS population. Figure 3 presents a Kaplan-Meier plot for the relationship between rs6795970 and development of atrial fibrillation.

Discussion

The current study demonstrates that common variants in the SCN5A-SCN10A locus are associated with QRS duration in subjects without clinical evidence of prior heart disease. These patients were derived from clinical practice settings, adding to the growing body of evidence supporting the utility of EMR-based genomic analysis.12–16,27 The data replicate findings from a large meta-analysis24 drawn from multiple community populations, where information on potential QRS modulators such as heart disease status and medications was not as precisely controlled or excluded from all included studies.
The major new finding here is that, using the PheWAS study paradigm, we were able to examine the longitudinal associations of these genomic variants with disease in a hypothesis-free manner. This analysis revealed that SNPs in SCN10A (rs6795970) and SCN5A (rs1805126), which are strongly associated with QRS duration, are also associated with subsequent cardiac arrhythmias. SCN10A specifically is associated with atrial fibrillation. These associations between rs6795970 and atrial fibrillation and cardiac arrhythmias were also seen specifically in the original “heart healthy” study population, and were independent of the SNP’s association with QRS duration. The latter finding suggests that while variants at the SCN5A-SCN10A locus determine QRS and subsequent arrhythmia susceptibility, they may do so by divergent (pleiotropic) pathways, or that conduction slowing occurs not only in the ventricle but also in the atrium, where it contributes to susceptibility to atrial fibrillation. Importantly, the selection logic for the case selection algorithm in our GWAS required the absence of cardiovascular disease at the time of the ECG. Therefore, these associations represent subsequent development of cardiac arrhythmias in subjects with these variants.
This result highlights the potential of the EMR, with multiple diagnoses and longitudinal follow-up, to identify not only variants associated with disease susceptibility or trait variability, but also subsequent outcomes associated with these variants.

Drugs that block SCN5A-encoded sodium channels slow ventricular conduction, prolong QRS duration,28 and increased mortality following myocardial infarction in the Cardiac Arrhythmia Suppression Trial (CAST).29,30 Available evidence supports the view that slow conduction, particularly in the setting of scarred or ischemic myocardium, promotes reentrant excitation that leads to fatal arrhythmias,31–33 and in CAST, longer QRS durations also predicted increased mortality among patients treated with placebo.1 In addition, a genetic disease caused by SCN5A loss-of-function mutations (Brugada Syndrome) is characterized by slowed ventricular conduction and an increased risk for fatal arrhythmias.34 Interestingly, previous analyses of variable PR duration have also identified strong associations with variants in SCN10A,13,35–37 and multiple mechanisms are currently being examined to explain this effect: expression of SCN10A-encoded channels in cardiomyocytes and/or cardiac neurons, or regulation of SCN5A-10A expression.38–40
Our PheWAS analysis suggests that SCN10A, SCN5A, and EXOG variants are associated with cardiac arrhythmia billing codes (entered either by physicians or professional coders); these include atrial fibrillation and flutter, supraventricular and ventricular tachycardia, cardiac arrest, and other unspecified arrhythmias. SCN10A rs6795970 was specifically associated with atrial fibrillation and flutter, which was noted by Pfeufer et al.36 but not Holm et al.35 Chambers et al.37 previously demonstrated associations between SCN10A rs6795970 and heart block and ventricular fibrillation; we also noted an association with first-degree atrioventricular block (p=0.009, Supplementary Table 2). Mouse studies have demonstrated expression of Nav1.8 in vagal and spinal afferents in gastrointestinal mucosa and myenteric plexuses,41 and interestingly SCN10A was also associated with cholecystitis in the PheWAS. Variants in CDKN1A and C6orf204, however, were not associated with cardiac arrhythmias, although we cannot exclude that weak associations may exist. In contrast, the C6orf204 locus seems most associated with neoplastic disorders (colorectal cancer, prostatic hypertrophy, and melanoma); its strongest cardiovascular disease association was atherosclerosis (odds ratio 1.094, p=0.03). Thus, PheWAS suggests that although all these regions may be associated with QRS interval, only those SNPs in the SCN5A-10A region seem significantly associated with subsequent development of arrhythmias.

Recent growth in large GWAS meta-analysis has shown the power of large numbers to find genomic influences of given traits and disease.
In this study, the development of the phenotype algorithm at one site was followed by its use at four other sites with different EMR systems. Algorithm performance in finding patients meeting inclusion and exclusion criteria was similar at the three sites that evaluated the algorithm, providing further validation of the transportability of EMR phenotype algorithms.16,42 Estimates of heritability of QRS using all SNPs are consistent with other reports; the heritability associated with the very limited subset of 23 SNPs implicated here appears to explain a surprisingly large proportion of this heritability estimate. This analysis also indicates that, as with other traits, there is extensive “missing heritability” when QRS duration is analyzed as a function of common genomic variation.

This report highlights both limitations and real and potential advantages of EMR-based genomic research. The present study involved analysis of subjects accrued in the initial stages of the eMERGE network, and as such a major limitation was the relatively small size of the study set. Large consortia such as CHARGE have used meta-analysis to aggregate individual datasets across many sites and have demonstrated the power of large numbers to generate highly significant results by this approach.
One of the key lessons in the eMERGE experience to date has been that algorithms to identify cases and controls for genomic or other study can be successfully deployed across multiple EMR systems.16,42,43 Thus, as the number of subjects with dense genomic information across multiple EMR systems grows, this and other EMR-based studies highlight the potential for accrual of increasingly large sample sets to identify genomic predictors of variability in phenotypes such as physiologic traits or disease susceptibility. Further, EMRs hold the promise, as suggested here, of examining longitudinal healthcare outcomes, such as disease complications or response to drug therapies. Identifying cases for such studies requires especially large resources, since subsets of subsets (e.g. drug response X in disease Y) are required. Current efforts demonstrate the feasibility of accrual of DNA collections coupled to EMRs large enough to support statistically valid analyses of rare variants and/or rare clinical events. In the current eMERGE network, there are 9 sites with EMR-based biobanks that include >250,000 subjects, and >50,000 have been genotyped on a genome-wide platform. Other resources that should expand the reach of EMR-based genomic research include the Kaiser Northern California biobank (>100,000 individuals),44 the Million Veteran Program (currently >100,000 individuals),45 and biobanks that will be coupled to national healthcare systems (e.g. the UK Biobank46).

Studies in the EMR environment enable PheWAS analyses: a PheWAS experiment cannot be executed in the absence of diverse diagnoses across many diagnostic classes in study subjects. The PheWAS-based associations reported here did not achieve significance using a strict Bonferroni correction; however, SCN10A rs6795970 was strongly associated with both atrial fibrillation and cardiac arrhythmias in the normal QRS population as well. The PheWAS approach is still in development and it is likely that the strategy of using only standard diagnostic codes is a limitation. The refinement of disease classifications and abstraction methods will enable more granular phenotypic subsetting, particularly when applied to the increasingly large datasets described above. A fundamental limitation of the PheWAS technique is that not every diagnosis is explicitly included or excluded in each record. Another potential limitation of EMR-based phenotyping is the accuracy of the information contained in the record. Extracting phenotypes from the EMR can result in errors if the data in the source EMR are incorrect (e.g., the EMR specifies a disease the patient does not have).
Gender and ancestry testing suggest these demographic features are rarely incorrect in an EMR.7,20 Similarly, the electronic phenotyping experience in eMERGE indicates that recurrent mentions of specific diagnoses or combinations of diverse data types (such as medications plus free text plus diagnostic codes) greatly improve diagnostic accuracy of electronic phenotyping algorithms when assessed as positive predictive value using hand curation as the gold standard.47 EMRs contain data on subjects exposed to a healthcare system, and as such EMR-based studies may not be generalizable to a broad population. The conduct of EMR-based studies across a variety of geographical locations and practice settings (as in eMERGE) potentially mitigates this issue.

In summary, a genome-wide association study conducted across multiple EMR systems replicated known associations for a readily available index of cardiac conduction, the QRS duration. The algorithm deployed allowed us to analyze subjects with normal electrocardiograms and no evidence of heart disease or confounding drugs or electrolyte abnormalities, and the phenome-wide association analysis established that genomic variants predicting slower conduction in this population are also associated with subsequent development of arrhythmias.
Thus, the present findings are consonant with a view that individual susceptibility to serious arrhythmias is determined in part by genetically-determined variability in cardiac electrophysiologic behaviors. Furthermore, this study highlights the advantages of a genotyped EMR population to explore the subsequent emergence of clinically-important phenotypes not ascertained in the original study design.

Funding Sources: This work was supported by the electronic MEdical Records and GEnomics (eMERGE) Network, initiated and funded by the National Human Genome Research Institute, with additional funding from the National Institute of General Medical Sciences, through the following grants: U01-HG-004610 (Group Health Cooperative); U01-HG-004608 (Marshfield Clinic); U01-HG-04599 (Mayo Clinic); U01-HG-004609 (Northwestern University); U01-HG-04603 (Vanderbilt University, also serving as the Administrative Coordinating Center). BioVU also receives support through the National Center for Research Resources UL1 RR024975, which is now at the National Center for Advancing Translational Sciences, 2 UL1 TR000445. Dr. Sotoodehnia was supported by R01-HL088456. The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIH.

Conflict of Interest Disclosures: None.

References:

1. Capone RJ, Pawitan Y, el-Sherif N, Geraci TS, Handshaw K, Morganroth J, Schlant RC, Waldo AL. Events in the cardiac arrhythmia suppression trial: baseline predictors of mortality in placebo-treated patients. J Am Coll Cardiol. 1991;18:1434-1438.
2. Hanson B, Tuna N, Bouchard T, Heston L, Eckert E, Lykken D, Segal N, Rich S. Genetic factors in the electrocardiogram and heart rate of twins reared apart and together. Am J Cardiol. 1989;63:606-609.
3. Busjahn A, Knoblauch H, Faulhaber HD, Boeckel T, Rosenthal M, Uhlmann R, Hoehe M, Schuster H, Luft FC. QT interval is linked to 2 long-QT syndrome loci in normal subjects. Circulation. 1999;99:3161-3164.
4. Russell MW, Law I, Sholinsky P, Fabsitz RR. Heritability of ECG measurements in adult male twins. J Electrocardiol. 1998;30 Suppl:64-68.
5. Ramirez AH, Schildcrout JS, Blakemore DL, Masys DR, Pulley JM, Basford MA, Roden DM, Denny JC. Modulators of normal electrocardiographic intervals identified in a large electronic medical record. Heart Rhythm. 2011;8:271-277.
6. McCarty CA, Wilke RA, Giampietro PF, Wesbrook SD, Caldwell MD. Marshfield Clinic Personalized Medicine Research Project (PMRP): design, methods and recruitment for a large population-based biobank. Personalized Medicine. 2005;2:49-79.
7. Roden DM, Pulley JM, Basford MA, Bernard GR, Clayton EW, Balser JR, Masys DR. Development of a large-scale de-identified DNA biobank to enable personalized medicine. Clin Pharmacol Ther. 2008;84:362-369.
8. McCarty CA, Chapman-Stone D, Derfus T, Giampietro PF, Fost N. Community consultation and communication for a population-based DNA biobank: the Marshfield clinic personalized medicine research project. Am J Med Genet A. 2008;146A:3026-3033.
9. Ginsburg GS, Burke TW, Febbo P. Centralized biorepositories for genetic and genomic research. JAMA. 2008;299:1359-1361.
10. Lemke AA, Wolf WA, Hebert-Beirne J, Smith ME. Public and biobank participant attitudes toward genetic research participation and data sharing. Public Health Genomics. 2010;13:368-377.
11. McCarty CA, Chisholm RL, Chute CG, Kullo IJ, Jarvik GP, Larson EB, Li R, Masys DR, Ritchie MD, Roden DM, Struewing JP, Wolf WA. The eMERGE Network: A consortium of biorepositories linked to electronic medical records data for conducting genomic studies. BMC Med Genomics. 2011;4:13.
12. Kullo IJ, Ding K, Shameer K, McCarty CA, Jarvik GP, Denny JC, Ritchie MD, Ye Z, Crosslin DR, Chisholm RL, Manolio TA, Chute CG. Complement receptor 1 gene variants are associated with erythrocyte sedimentation rate. Am J Hum Genet. 2011;89:131-138.
13. Denny JC, Ritchie MD, Crawford DC, Schildcrout JS, Ramirez AH, Pulley JM, Basford MA, Masys DR, Haines JL, Roden DM. Identification of genomic predictors of atrioventricular conduction: using electronic medical records as a tool for genome science. Circulation. 2010;122:2016-2021.
14. Kullo IJ, Ding K, Jouni H, Smith CY, Chute CG. A genome-wide association study of red blood cell traits using the electronic medical record. PLoS ONE. 2010;5.
15. Ritchie MD, Denny JC, Crawford DC, Ramirez AH, Weiner JB, Pulley JM, Basford MA, Brown-Gentry K, Balser JR, Masys DR, Haines JL, Roden DM. Robust replication of genotype-phenotype associations across multiple diseases in an electronic medical record. Am J Hum Genet. 2010;86:560-572.
16. Denny JC, Crawford DC, Ritchie MD, Bielinski SJ, Basford MA, Bradford Y, Chai HS, Bastarache L, Zuvich R, Peissig P, Carrell D, Ramirez AH, Pathak J, Wilke RA, Rasmussen L, Wang X, Pacheco JA, Kho AN, Hayes MG, Weston N, Matsumoto M, Kopp PA, Newton KM, Jarvik GP, Li R, Manolio TA, Kullo IJ, Chute CG, Chisholm RL, Larson EB, McCarty CA, Masys DR, Roden DM, De Andrade M. Variants Near FOXE1 Are Associated with Hypothyroidism and Other Thyroid Conditions: Using Electronic Medical Records for Genome- and Phenome-wide Studies. Am J Hum Genet. 2011;89:529-542.
17. Denny JC, Ritchie MD, Basford MA, Pulley JM, Bastarache L, Brown-Gentry K, Wang D, Masys DR, Roden DM, Crawford DC. PheWAS: demonstrating the feasibility of a phenome-wide scan to discover gene-disease associations. Bioinformatics. 2010;26:1205-1210.
18. Denny JC, Spickard A, Johnson KB, Peterson NB, Peterson JF, Miller RA. Evaluation of a method to identify and categorize section headers in clinical documents. J Am Med Inform Assoc. 2009;16:806-815.
19. Denny JC, Miller RA, Waitman LR, Arrieta MA, Peterson JF. Identifying QT prolongation from ECG impressions using a general-purpose Natural Language Processor. Int J Med Inform. 2009;78 Suppl 1:S34-42.
20. Dumitrescu L, Ritchie MD, Brown-Gentry K, Pulley JM, Basford M, Denny JC, Oksenberg JR, Roden DM, Haines JL, Crawford DC. Assessing the accuracy of observer-reported ancestry in a biorepository linked to electronic medical records. Genet Med. 2010;12:648-650.
21. Turner S, Armstrong LL, Bradford Y, Carlson CS, Crawford DC, Crenshaw AT, De Andrade M, Doheny KF, Haines JL, Hayes G, Jarvik G, Jiang L, Kullo IJ, Li R, Ling H, Manolio TA, Matsumoto M, McCarty CA, McDavid AN, Mirel DB, Paschall JE, Pugh EW, Rasmussen LV, Wilke RA, Zuvich RL, Ritchie MD. Quality control procedures for genome-wide association studies. Curr Protoc Hum Genet. 2011;Chapter 1:Unit1.19.
22. Pritchard JK, Stephens M, Donnelly P. Inference of population structure using multilocus genotype data. Genetics. 2000;155:945-959.
23. Price AL, Patterson NJ, Plenge RM, Weinblatt ME, Shadick NA, Reich D. Principal components analysis corrects for stratification in genome-wide association studies. Nat Genet. 2006;38:904-909.
24. Sotoodehnia N, Isaacs A, De Bakker PIW, Dörr M, Newton-Cheh C, Nolte IM, Van der Harst P, Müller M, Eijgelsheim M, Alonso A, Hicks AA, Padmanabhan S, Hayward C, Smith AV, Polasek O, Giovannone S, Fu J, Magnani JW, Marciante KD, Pfeufer A, Gharib SA, Teumer A, Li M, Bis JC, Rivadeneira F, Aspelund T, Köttgen A, Johnson T, Rice K, Sie MPS, Wang YA, Klopp N, Fuchsberger C, Wild SH, Mateo Leach I, Estrada K, Völker U, Wright AF, Asselbergs FW, Qu J, Chakravarti A, Sinner MF, Kors JA, Petersmann A, Harris TB, Soliman EZ, Munroe PB, Psaty BM, Oostra BA, Cupples LA, Perz S, De Boer RA, Uitterlinden AG, Völzke H, Spector TD, Liu F-Y, Boerwinkle E, Dominiczak AF, Rotter JI, Van Herpen G, Levy D, Wichmann H-E, Van Gilst WH, Witteman JCM, Kroemer HK, Kao WHL, Heckbert SR, Meitinger T, Hofman A, Campbell H, Folsom AR, Van Veldhuisen DJ, Schwienbacher C, O’Donnell CJ, Volpato CB, Caulfield MJ, Connell JM, Launer L, Lu X, Franke L, Fehrmann RSN, Te Meerman G, Groen HJM, Weersma RK, Van den Berg LH, Wijmenga C, Ophoff RA, Navis G, Rudan I, Snieder H, Wilson JF, Pramstaller PP, Siscovick DS, Wang TJ, Gudnason V, Van Duijn CM, Felix SB, Fishman GI, et al. Common variants in 22 loci are associated with QRS duration and cardiac ventricular conduction. Nat Genet. 2010;42:1068-1076.
25. Yang J, Lee SH, Goddard ME, Visscher PM. GCTA: A Tool for Genome-wide Complex Trait Analysis. Am J Hum Genet. 2011;88:76-82.
26. Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MAR, Bender D, Maller J, Sklar P, De Bakker PIW, Daly MJ, Sham PC. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet. 2007;81:559-575.
27. Kurreeman F, Liao K, Chibnik L, Hickey B, Stahl E, Gainer V, Li G, Bry L, Mahan S, Ardlie K, Thomson B, Szolovits P, Churchill S, Murphy SN, Cai T, Raychaudhuri S, Kohane I, Karlson E, Plenge RM. Genetic basis of autoantibody positive and negative rheumatoid arthritis risk in a multi-ethnic cohort derived from electronic health records. Am J Hum Genet. 2011;88:57-69.
28. Johnson EA, McKinnon MG. The differential effect of quinidine and pyrilamine on the myocardial action potential at various rates of stimulation. J Pharmacol Exp Ther. 1957;120:460-468.
29. Preliminary report: effect of encainide and flecainide on mortality in a randomized trial of arrhythmia suppression after myocardial infarction. The Cardiac Arrhythmia Suppression Trial (CAST) Investigators. N Engl J Med. 1989;321:406-412.
30. Epstein AE, Hallstrom AP, Rogers WJ, Liebson PR, Seals AA, Anderson JL, Cohen JD, Capone RJ, Wyse DG. Mortality following ventricular arrhythmia suppression by encainide, flecainide, and moricizine after myocardial infarction. The original design concept of the Cardiac Arrhythmia Suppression Trial (CAST). JAMA. 1993;270:2451-2455.
31. Akiyama T, Pawitan Y, Greenberg H, Kuo CS, Reynolds-Haertle RA. Increased risk of death and cardiac arrest from encainide and flecainide in patients after non-Q-wave acute myocardial infarction in the Cardiac Arrhythmia Suppression Trial. CAST Investigators. Am J Cardiol. 1991;68:1551-1555.
32. Coromilas J, Saltman AE, Waldecker B, Dillon SM, Wit AL. Electrophysiological effects of flecainide on anisotropic conduction and reentry in infarcted canine hearts. Circulation. 1995;91:2245-2263.
33. Greenberg HM, Dwyer EM Jr, Hochman JS, Steinberg JS, Echt DS, Peters RW. Interaction of ischaemia and encainide/flecainide treatment: a proposed mechanism for the increased mortality in CAST I. Br Heart J. 1995;74:631-635.
34. Antzelevitch C, Brugada P, Borggrefe M, Brugada J, Brugada R, Corrado D, Gussak I, LeMarec H, Nademanee K, Perez Riera AR, Shimizu W, Schulze-Bahr E, Tan H, Wilde A. Brugada syndrome: report of the second consensus conference: endorsed by the Heart Rhythm Society and the European Heart Rhythm Association. Circulation. 2005;111:659-670.
35. Holm H, Gudbjartsson DF, Arnar DO, Thorleifsson G, Thorgeirsson G, Stefansdottir H, Gudjonsson SA, Jonasdottir A, Mathiesen EB, Njølstad I, Nyrnes A, Wilsgaard T, Hald EM, Hveem K, Stoltenberg C, Løchen M-L, Kong A, Thorsteinsdottir U, Stefansson K. Several common variants modulate heart rate, PR interval and QRS duration. Nat Genet. 2010;42:117-122.
36. Pfeufer A, Van Noord C, Marciante KD, Arking DE, Larson MG, Smith AV, Tarasov KV, Müller M, Sotoodehnia N, Sinner MF, Verwoert GC, Li M, Kao WHL, Köttgen A, Coresh J, Bis JC, Psaty BM, Rice K, Rotter JI, Rivadeneira F, Hofman A, Kors JA, Stricker BHC, Uitterlinden AG, Van Duijn CM, Beckmann BM, Sauter W, Gieger C, Lubitz SA, Newton-Cheh C, Wang TJ, Magnani JW, Schnabel RB, Chung MK, Barnard J, Smith JD, Van Wagoner DR, Vasan RS, Aspelund T, Eiriksdottir G, Harris TB, Launer LJ, Najjar SS, Lakatta E, Schlessinger D, Uda M, Abecasis GR, Müller-Myhsok B, Ehret GB, Boerwinkle E, Chakravarti A, Soliman EZ, Lunetta KL, Perz S, Wichmann H-E, Meitinger T, Levy D, Gudnason V, Ellinor PT, Sanna S, Kääb S, Witteman JCM, Alonso A, Benjamin EJ, Heckbert SR. Genome-wide association study of PR interval. Nat Genet. 2010;42:153-159.
37. Chambers JC, Zhao J, Terracciano CMN, Bezzina CR, Zhang W, Kaba R, Navaratnarajah M, Lotlikar A, Sehmi JS, Kooner MK, Deng G, Siedlecka U, Parasramka S, El-Hamamsy I, Wass MN, Dekker LRC, De Jong JSSG, Sternberg MJE, McKenna W, Severs NJ, De Silva R, Wilde AAM, Anand P, Yacoub M, Scott J, Elliott P, Wood JN, Kooner JS. Genetic variation in SCN10A influences cardiac conduction. Nat Genet. 2010;42:149-152.
38. London B. Whither Art Thou, SCN10A, and What Art Thou Doing? Circ Res. 2012;111:268-270.
39. Yang T, Atack TC, Stroud DM, Zhang W, Hall L, Roden DM. Blocking Scn10a channels in heart reduces late sodium current and is antiarrhythmic. Circ Res. 2012;111:322-332.
40. Verkerk AO, Remme CA, Schumacher CA, Scicluna BP, Wolswinkel R, De Jonge B, Bezzina CR, Veldkamp MW. Functional Nav1.8 channels in intracardiac neurons: the link between SCN10A and cardiac electrophysiology. Circ Res. 2012;111:333-343.
41. Gautron L, Sakata I, Udit S, Zigman JM, Wood JN, Elmquist JK. Genetic tracing of Nav1.8-expressing vagal afferents in the mouse. J Comp Neurol. 2011;519:3085-3101.
42. Carroll RJ, Thompson WK, Eyler AE, Mandelin AM, Cai T, Zink RM, Pacheco JA, Boomershine CS, Lasko TA, Xu H, Karlson EW, Perez RG, Gainer VS, Murphy SN, Ruderman EM, Pope RM, Plenge RM, Kho AN, Liao KP, Denny JC. Portability of an algorithm to identify rheumatoid arthritis in electronic health records. J Am Med Inform Assoc. 2012;19:e162-e169.
43. Kho AN, Hayes MG, Rasmussen-Torvik L, Pacheco JA, Thompson WK, Armstrong LL, Denny JC, Peissig PL, Miller AW, Wei W-Q, Bielinski SJ, Chute CG, Leibson CL, Jarvik GP, Crosslin DR, Carlson CS, Newton KM, Wolf WA, Chisholm RL, Lowe WL. Use of diverse electronic medical record systems to identify genetic risk for type 2 diabetes within a genome-wide association study. J Am Med Inform Assoc. 2012;19:212-218.
44. Kaiser Permanente, UCSF Scientists Complete NIH-Funded Genomics Project Involving 100,000 People [Internet]. [cited 2011 Sep 13]; Available from: http://www.dor.kaiser.org/external/news/press_releases/Kaiser_Permanente,_UCSF_Scientists_Complete_NIH-Funded_Genomics_Project_Involving_100,000_People/.
45. Million Veteran Program (MVP) [Internet]. [cited 2012 Jun 20]; Available from: http://www.research.va.gov/mvp/.
46. Collins R. What makes UK Biobank special? Lancet. 2012;379:1173-1174.
47. Kho AN, Pacheco JA, Peissig PL, Rasmussen L, Newton KM, Weston N, Crane PK, Pathak J, Chute CG, Bielinski SJ, Kullo IJ, Li R, Manolio TA, Chisholm RL, Denny JC. Electronic Medical Records for Genetic Research: Results of the eMERGE Consortium. Sci Transl Med. 2011;3:79re1.

Table 1. eMERGE sites contributing samples. All patients were of European ancestry.

                                                                  Genotyped samples meeting normal QRS
                                                                  phenotype algorithm definition
Site           Primary site genotyped phenotype   Mean age (SD)   Males   Females   Total
Group Health   Dementia                           72.58 (7.23)    187     351       538
Marshfield     Cataract                           56.73 (9.73)    69      149       218
Mayo Clinic    Peripheral artery disease          58.81 (8.52)    1056    730       1786
Northwestern   Type 2 diabetes                    52.49 (10.54)   99      118       217
Vanderbilt     Normal QRS                         49.95 (14.6)    1077    1436      2513
Total                                                             2488    2784      5272

Table 2. Replicated SNPs in the QRS GWAS analysis. Reported betas and p-values for eMERGE analysis are adjusted for sex, and all betas for CHARGE and eMERGE analyses are for the coded allele. Each SNP represents the strongest association in the region, and was the target of a PheWAS (as shown in Table 3).
CHR | SNP | Location | eMERGE Coded Allele | eMERGE Coded Allele Frequency | eMERGE BETA (msec) | eMERGE p-value | CHARGE Coded Allele | CHARGE Coded Allele Frequency | CHARGE BETA (msec) | CHARGE p-value | Nearest Gene
3 | rs1805126 | 38567410 | T* | 0.665 | -1.002 | 1.45E-08 | A | 0.655 | -0.6568 | 2.52E-20 | SCN5A
3 | rs6795970 | 38741679 | A | 0.401 | 0.765 | 6.60E-06 | A | 0.396 | 0.7476 | 5.08E-27 | SCN10A
6 | rs6906287 | 119069433 | C | 0.454 | 0.717 | 2.26E-05 | C | 0.451 | 0.5383 | 5.56E-16 | C6orf204
6 | rs1321313 | 36726799 | G* | 0.760 | -0.793 | 6.13E-05 | C | 0.742 | -0.8129 | 4.60E-25 | CDKN1A
1 | rs2207790 | 61670555 | T* | 0.482 | -0.622 | 2.07E-04 | A | 0.461 | -0.5956 | 6.31E-18 | NFIA

*This SNP was coded on the opposite strand from that in the CHARGE study. Therefore, while the allele is not identical, the direction of effect is the same.

Table 3. Most significant associations between SNPs predicting normal QRS duration and diagnostic codes. For each SNP displayed, all PheWAS associations p<0.01 are displayed.

Associated Phenotype | Case count | Odds ratio | P

SCN5A (Chr 3, rs1805126)
Cardiac arrhythmias | 3075 | 0.877 | 1.10E-03
Other diseases of upper respiratory tract | 509 | 0.8131 | 3.83E-03
Benign prostatic hypertrophy | 1615 | 0.8527 | 5.54E-03
Chronic kidney disease | 1212 | 0.8772 | 6.02E-03
Melanoma | 151 | 0.7033 | 8.18E-03
Dermatophytosis | 1229 | 0.8791 | 8.77E-03

SCN10A (Chr 3, rs6795970)
Cardiac arrhythmias | 3075 | 0.8781 | 7.21E-04
Atrial fibrillation and flutter | 1758 | 0.8519 | 8.45E-04
Arterial embolism and thrombosis | 150 | 0.6928 | 3.20E-03
Cholecystitis | 121 | 0.6627 | 3.44E-03
Convulsions | 311 | 1.249 | 7.01E-03
Aphasia | 69 | 0.5999 | 7.25E-03

C6orf204 (Chr 6, rs6906287)
Colorectal cancer | 316 | 0.7592 | 9.90E-04
Benign prostatic hypertrophy | 1615 | 0.8576 | 4.19E-03
Penicillin allergy | 65 | 1.68 | 4.01E-03
Melanoma | 151 | 1.383 | 5.68E-03
Disorders of pancreatic secretion | 84 | 0.6421 | 6.22E-03
Nasal polyps | 127 | 1.413 | 7.15E-03
Prostatitis | 150 | 0.7166 | 8.07E-03

NFIA (Chr 1, rs2207790)
Postinflammatory pulmonary fibrosis | 173 | 0.7048 | 1.76E-03
Multiple myeloma | 54 | 0.5425 | 2.80E-03
Neutropenia and leukopenia | 238 | 0.7545 | 2.81E-03
Facial nerve disorders | 67 | 1.677 | 3.70E-03
Cardiomyopathy | 420 | 0.8033 | 4.40E-03
Pneumothorax | 80 | 0.6391 | 6.59E-03
Disorders of function of stomach | 355 | 1.23 | 7.73E-03
Prostatitis | 150 | 0.7239 | 8.85E-03
Toxic diffuse goiter | 76 | 1.541 | 9.47E-03

CDKN1A (Chr 6, rs1321313)
Anorexia | 53 | 0.3649 | 1.57E-03
Bacteremia | 118 | 0.5684 | 2.01E-03
Anal fissure and fistula | 108 | 1.552 | 3.04E-03
Abnormal loss of weight and underweight | 398 | 0.7642 | 3.39E-03
Dementias | 841 | 0.833 | 5.69E-03
Overweight, obesity and other hyperalimentation | 2501 | 1.105 | 7.39E-03
Spondylosis and allied disorders | 1244 | 1.156 | 9.39E-03

Figure Legends:

Figure 1. Panel A. Distribution of QRS durations in 5,272 normal ECGs. Panel B. Genome-wide association analysis of QRS duration, using sex-adjusted linear regression. The red line indicates genome-wide significance (p = 5 × 10⁻⁸). The points in green are those SNPs that were also identified at genome-wide significance in the CHARGE consortium QRS meta-analysis as described in the text.
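As a rough illustration of the Panel B analysis, the sketch below fits a sex-adjusted linear regression of QRS duration on allele dosage for a single SNP and compares the resulting p-value with the 5 × 10⁻⁸ threshold. It is a minimal sketch, not the authors' pipeline: the function names (`solve`, `sex_adjusted_beta`), the 0/1/2 dosage coding, the 0/1 sex coding, and the normal approximation to the t test are assumptions made for illustration, and the data are synthetic.

```python
import math

GENOME_WIDE_ALPHA = 5e-8  # threshold drawn as the red line in Figure 1, Panel B


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def sex_adjusted_beta(qrs_ms, dosage, sex):
    """OLS fit of qrs_ms ~ intercept + dosage + sex.

    Returns (beta, standard error, two-sided p) for the dosage term.
    Assumed coding (hypothetical): dosage in {0, 1, 2}, sex in {0, 1}.
    """
    rows = [[1.0, float(g), float(s)] for g, s in zip(dosage, sex)]
    k = 3
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, qrs_ms)) for i in range(k)]
    coef = solve(xtx, xty)
    resid = [y - sum(c * v for c, v in zip(coef, r)) for r, y in zip(rows, qrs_ms)]
    sigma2 = sum(e * e for e in resid) / (len(rows) - k)
    # Second column of (X'X)^-1; its entry [1] scales the dosage variance.
    inv_col = solve(xtx, [0.0, 1.0, 0.0])
    se = math.sqrt(sigma2 * inv_col[1])
    z = coef[1] / se
    # Normal approximation to the t test; reasonable only for large samples.
    p = math.erfc(abs(z) / math.sqrt(2))
    return coef[1], se, p


# Synthetic data: a -1 msec per-allele effect, an 8 msec sex difference,
# and a deterministic noise pattern that is uncorrelated with both covariates.
n = 660
dosage = [i % 3 for i in range(n)]
sex = [(i // 3) % 2 for i in range(n)]
noise = [((i * 7) % 11 - 5) * 0.5 for i in range(n)]
qrs_ms = [90.0 - 1.0 * g + 8.0 * s + e for g, s, e in zip(dosage, sex, noise)]

beta, se, p = sex_adjusted_beta(qrs_ms, dosage, sex)
is_genome_wide_significant = p < GENOME_WIDE_ALPHA
```

In a real GWAS this fit would be repeated for every SNP, usually with a dedicated tool and additional covariates; the sketch only shows what "sex-adjusted beta for the coded allele" means for one variant.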
Figure 2. Phenome-wide association study plots of most significant SNPs associated with QRS duration. Panel A. SCN5A (rs1805126). Panel B. SCN10A (rs6795970). The blue lines indicate p<0.05.

Figure 3. Kaplan-Meier estimate of development of atrial fibrillation per SCN10A rs6795970 genotype. All patients were judged free of cardiac disease (including arrhythmias) at the time of the initial normal ECG.

Genome- and Phenome-Wide Analysis of Cardiac Conduction Identifies Markers of Arrhythmia Risk
Marylyn D. Ritchie, Joshua C. Denny, Rebecca L. Zuvich, Dana C. Crawford, Jonathan S. Schildcrout, Lisa Bastarache, Andrea H. Ramirez, Jonathan D. Mosley, Jill M. Pulley, Melissa A. Basford, Yuki Bradford, Luke V. Rasmussen, Jyotishman Pathak, Christopher G. Chute, Iftikhar J. Kullo, Catherine A. McCarty, Rex L. Chisholm, Abel N. Kho, Christopher S. Carlson, Eric B. Larson, Gail P. Jarvik, Nona Sotoodehnia, Teri A. Manolio, Rongling Li, Daniel R. Masys, Jonathan L. Haines and Dan M. Roden
Circulation. Published online March 5, 2013.
Circulation is published by the American Heart Association, 7272 Greenville Avenue, Dallas, TX 75231. Copyright © 2013 American Heart Association, Inc. All rights reserved. Print ISSN: 0009-7322. Online ISSN: 1524-4539.
The online version of this article, along with updated information and services, is located on the World Wide Web at: http://circ.ahajournals.org/content/early/2013/03/05/CIRCULATIONAHA.112.000604
Data Supplement (unedited) at: http://circ.ahajournals.org/content/suppl/2013/03/05/CIRCULATIONAHA.112.000604.DC1

Supplemental Material

Supplemental Table 1: Replicated SNPs in the QRS GWAS analysis. Reported betas and p-values for eMERGE analysis are adjusted for sex, and all betas for CHARGE and eMERGE analyses are for the coded allele.
CHR | SNP | Location | eMERGE Coded Allele | eMERGE Coded Allele Frequency | eMERGE BETA (msec) | eMERGE p-value | CHARGE Coded Allele | CHARGE Coded Allele Frequency | CHARGE BETA (msec) | CHARGE p-value | Nearest Gene
3 | rs1805126* | 38567410 | T** | 0.665 | -1.002 | 1.45E-08 | A | 0.655 | -0.6568 | 2.52E-20 | SCN5A*
3 | rs9832895 | 38636537 | C | 0.518 | 0.895 | 1.18E-07 | C | 0.497 | 0.6576 | 8.68E-18 | SCN5A
3 | rs6771881 | 38553268 | A | 0.756 | -1.003 | 2.23E-07 | A | 0.745 | -0.7659 | 2.54E-22 | EXOG
3 | rs12053903 | 38568397 | C | 0.329 | 0.881 | 7.54E-07 | C | 0.330 | 0.646 | 1.87E-19 | SCN5A
3 | rs7374138 | 38580736 | C | 0.737 | 0.912 | 1.49E-06 | C | 0.718 | 0.5041 | 5.59E-10 | SCN5A
3 | rs11129801 | 38725379 | A | 0.294 | -0.877 | 1.82E-06 | A | 0.301 | -0.6296 | 2.60E-17 | SCN10A
3 | rs6800541 | 38749836 | C | 0.409 | 0.803 | 1.97E-06 | C | 0.398 | 0.7394 | 1.88E-27 | SCN10A
3 | rs12636358 | 38430905 | A | 0.047 | 1.867 | 2.30E-06 | A | 0.070 | 1.185 | 2.46E-12 | XYLB
3 | rs7430439 | 38778643 | A | 0.597 | -0.794 | 4.03E-06 | A | 0.588 | -0.5516 | 8.32E-16 | SCN10A
3 | rs6798015 | 38773840 | C | 0.374 | 0.793 | 4.25E-06 | C | 0.370 | 0.7187 | 5.88E-24 | SCN10A
3 | rs4130467 | 38586708 | G** | 0.279 | 0.846 | 5.88E-06 | C | 0.289 | 0.5274 | 5.98E-12 | SCN5A
3 | rs6795970* | 38741679 | A | 0.401 | 0.765 | 6.60E-06 | A | 0.396 | 0.7476 | 5.08E-27 | SCN10A*
3 | rs7374540 | 38609146 | A | 0.609 | -0.758 | 1.04E-05 | A | 0.598 | -0.5246 | 7.67E-14 | SCN5A
3 | rs6797133 | 38631037 | A | 0.394 | -0.750 | 1.14E-05 | A | 0.395 | -0.667 | 2.61E-21 | SCN5A
3 | rs9861242 | 38584338 | A | 0.250 | 0.848 | 1.20E-05 | A | 0.262 | 0.5273 | 7.43E-11 | SCN5A
3 | rs7373102 | 38655632 | C | 0.496 | 0.680 | 1.65E-05 | C | 0.494 | 0.4128 | 1.10E-08 | SCN5A
6 | rs6906287* | 119069433 | C | 0.454 | 0.717 | 2.26E-05 | C | 0.451 | 0.5383 | 5.56E-16 | C6orf204*
6 | rs11970286 | 118787067 | C | 0.549 | -0.715 | 2.33E-05 | C | 0.534 | -0.5502 | 4.89E-16 | SLC35F1
6 | rs4307206 | 118920013 | A | 0.451 | 0.712 | 2.59E-05 | A | 0.450 | 0.5399 | 4.56E-16 | C6orf204
6 | rs1321313* | 36726799 | G** | 0.760 | -0.793 | 6.13E-05 | C | 0.742 | -0.8129 | 4.60E-25 | CDKN1A*
3 | rs7430477 | 38740494 | C | 0.530 | 0.661 | 8.18E-05 | C | 0.524 | 0.6061 | 2.72E-19 | SCN10A
3 | rs6599257 | 38779592 | C | 0.313 | 0.705 | 9.58E-05 | C | 0.306 | 0.5508 | 2.23E-14 | SCN10A
1 | rs2207790* | 61670555 | T** | 0.482 | -0.622 | 2.07E-04 | A | 0.461 | -0.5956 | 6.31E-18 | NFIA*

*The phenome-wide association analysis results for these variants (best or near-best association at each locus/gene in replication analysis) are shown in Table 3.
**This SNP was coded on the opposite strand from that in the CHARGE study. Therefore, while the allele is not identical, the direction of effect is the same.

Supplemental Figure 1: LocusZoom plot of the region in chromosome 3 in which most of the replicated associations (18/23) were identified. All variants with p<10⁻⁴ in the eMERGE data set were replicated at p<5×10⁻⁸ in the CHARGE data set.

Supplemental Table 2: PheWAS associations for replicated SNPs (N=23). Data for all associations with P<0.01 are shown. ICD9 codes for given associations can be downloaded from http://knowledgemap.mc.vanderbilt.edu/research/content/phewas.

SNP | Chr | Nearest gene | Disease association | Cases | Odds ratio | P-value
rs11129801 | 3 | SCN10A | Epilepsy and recurrent seizures | 181 | 0.71 | 6.23E-03
rs11129801 | 3 | SCN10A | Malignant neoplasm of uterus | 103 | 1.49 | 6.31E-03
rs11129801 | 3 | SCN10A | Migraine | 420 | 0.81 | 9.81E-03
rs6599257 | 3 | SCN10A | Atrial fibrillation and flutter | 1758 | 0.84 | 6.53E-04
rs6599257 | 3 | SCN10A | Drug dependence | 71 | 1.68 | 2.88E-03
rs6599257 | 3 | SCN10A | Cardiac arrhythmias | 3075 | 0.89 | 3.03E-03
rs6599257 | 3 | SCN10A | Lupus erythematosus | 93 | 0.59 | 4.31E-03
rs6599257 | 3 | SCN10A | Cholecystitis | 121 | 0.65 | 4.77E-03
rs6599257 | 3 | SCN10A | Migraine | 420 | 1.23 | 5.78E-03
rs6795970 | 3 | SCN10A | Cardiac arrhythmias | 3075 | 0.88 | 7.21E-04
rs6795970 | 3 | SCN10A | Atrial fibrillation and flutter | 1758 | 0.85 | 8.45E-04
rs6795970 | 3 | SCN10A | Arterial embolism and thrombosis | 150 | 0.69 | 3.20E-03
rs6795970 | 3 | SCN10A | Cholecystitis | 121 | 0.66 | 3.44E-03
rs6795970 | 3 | SCN10A | Convulsions | 311 | 1.25 | 7.01E-03
rs6795970 | 3 | SCN10A | Aphasia | 69 | 0.60 | 7.25E-03
rs6798015 | 3 | SCN10A | Cholecystitis | 121 | 0.56 | 1.32E-04
rs6798015 | 3 | SCN10A | Arterial embolism and thrombosis | 150 | 0.67 | 1.73E-03
rs6798015 | 3 | SCN10A | Atrial fibrillation and flutter | 1758 | 0.86 | 2.39E-03
rs6798015 | 3 | SCN10A | Genital prolapse | 511 | 0.82 | 8.87E-03
rs6800541 | 3 | SCN10A | Cardiac arrhythmias | 3075 | 0.89 | 1.90E-03
rs6800541 | 3 | SCN10A | Atrial fibrillation and flutter | 1758 | 0.86 | 2.15E-03
rs6800541 | 3 | SCN10A | Cholecystitis | 121 | 0.66 | 2.64E-03
rs6800541 | 3 | SCN10A | Arterial embolism and thrombosis | 150 | 0.69 | 3.05E-03
rs6800541 | 3 | SCN10A | Aphasia | 69 | 0.61 | 8.15E-03
rs6800541 | 3 | SCN10A | Gingival and periodontal diseases | 229 | 0.77 | 9.66E-03
rs7430439 | 3 | SCN10A | Intestinal malabsorption | 156 | 1.42 | 2.34E-03
rs7430439 | 3 | SCN10A | Allergic rhinitis | 865 | 1.16 | 4.83E-03
rs7430439 | 3 | SCN10A | Disturbance of skin sensation | 664 | 0.85 | 7.85E-03
rs7430439 | 3 | SCN10A | First degree AV block | 181 | 1.34 | 9.09E-03
rs7430439 | 3 | SCN10A | Cholecystitis | 121 | 0.70 | 9.44E-03
rs7430477 | 3 | SCN10A | Cholecystitis | 121 | 1.71 | 5.85E-05
rs7430477 | 3 | SCN10A | Epilepsy and recurrent seizures | 181 | 0.72 | 2.90E-03
rs7430477 | 3 | SCN10A | Tinnitus | 159 | 1.39 | 4.46E-03
rs7430477 | 3 | SCN10A | Gingival and periodontal diseases | 229 | 1.30 | 6.12E-03
rs7430477 | 3 | SCN10A | Convulsions | 311 | 0.80 | 7.36E-03
rs7430477 | 3 | SCN10A | Migraine | 420 | 0.83 | 7.87E-03
rs7430477 | 3 | SCN10A | Inflammatory disease of cervix, vagina, and vulva | 266 | 0.79 | 8.21E-03
rs11970286 | 6 | SLC35F1 | Colorectal cancer | 316 | 0.77 | 1.37E-03
rs11970286 | 6 | SLC35F1 | Nasal polyps | 127 | 1.48 | 2.32E-03
rs11970286 | 6 | SLC35F1 | Malignant neoplasm of colon | 226 | 0.75 | 3.23E-03
rs11970286 | 6 | SLC35F1 | Personal history of malignant neoplasm of colon | 130 | 0.69 | 3.80E-03
rs11970286 | 6 | SLC35F1 | Personal history of malignant melanoma of skin | 151 | 1.37 | 7.37E-03
rs11970286 | 6 | SLC35F1 | Hyperplasia of prostate | 1615 | 0.87 | 9.66E-03
rs12053903 | 3 | SCN5A | Chronic kidney disease | 1212 | 0.85 | 9.16E-04
rs12053903 | 3 | SCN5A | Atherosclerosis | 1511 | 0.87 | 1.60E-03
rs12053903 | 3 | SCN5A | Cardiac arrhythmias | 3075 | 0.89 | 2.80E-03
rs12053903 | 3 | SCN5A | Other peripheral vascular disease | 1228 | 0.88 | 5.84E-03
rs12053903 | 3 | SCN5A | Disorders resulting from impaired renal function | 173 | 0.71 | 5.92E-03
rs12053903 | 3 | SCN5A | Osteomyelitis, periostitis, and other infections involving bone | 192 | 0.73 | 7.56E-03
rs12053903 | 3 | SCN5A | Other paralytic syndromes | 65 | 1.60 | 8.78E-03
rs12053903 | 3 | SCN5A | Thyrotoxicosis with or without goiter | 189 | 0.74 | 9.64E-03
rs12053903 | 3 | SCN5A | Fracture of other and unspecified parts of femur | 85 | 0.62 | 9.96E-03
rs1805126 | 3 | SCN5A | Cardiac arrhythmias | 3075 | 0.88 | 1.10E-03
rs1805126 | 3 | SCN5A | Diseases of upper respiratory tract | 509 | 0.81 | 3.83E-03
rs1805126 | 3 | SCN5A | Hyperplasia of prostate | 1615 | 0.85 | 5.54E-03
rs1805126 | 3 | SCN5A | Chronic kidney disease | 1212 | 0.88 | 6.02E-03
rs1805126 | 3 | SCN5A | Personal history of malignant melanoma of skin | 151 | 0.70 | 8.18E-03
rs1805126 | 3 | SCN5A | Dermatophytosis | 1229 | 0.88 | 8.77E-03
rs4130467 | 3 | SCN5A | Late effects of musculoskeletal and connective tissue injuries | 81 | 1.67 | 1.55E-03
rs4130467 | 3 | SCN5A | Hyperplasia of prostate | 1615 | 0.85 | 6.97E-03
rs4130467 | 3 | SCN5A | Systemic lupus erythematosus | 81 | 1.56 | 7.07E-03
rs7373102 | 3 | SCN5A | Postinflammatory pulmonary fibrosis | 173 | 1.50 | 1.31E-04
rs7373102 | 3 | SCN5A | Irritable Bowel Syndrome | 498 | 0.81 | 1.54E-03
rs7373102 | 3 | SCN5A | Diseases of white blood cells | 219 | 1.31 | 3.98E-03
rs7373102 | 3 | SCN5A | Abnormal glucose | 837 | 0.87 | 4.46E-03
rs7373102 | 3 | SCN5A | Functional digestive disorders | 553 | 0.84 | 4.95E-03
rs7373102 | 3 | SCN5A | Elevated blood pressure | 829 | 0.88 | 7.67E-03
rs7373102 | 3 | SCN5A | Abnormal cardiovascular function test | 838 | 0.88 | 7.87E-03
rs7373102 | 3 | SCN5A | Anxiety, phobic and dissociative disorders | 412 | 0.84 | 9.09E-03
rs7374138 | 3 | SCN5A | Malignant neoplasm of female breast | 645 | 1.22 | 2.10E-03
rs7374138 | 3 | SCN5A | Chronic pharyngitis and nasopharyngitis | 462 | 1.24 | 3.14E-03
rs7374540 | 3 | SCN5A | Dermatophytosis | 1229 | 0.87 | 2.05E-03
rs7374540 | 3 | SCN5A | Spondylosis and allied disorders | 1244 | 0.86 | 2.54E-03
rs7374540 | 3 | SCN5A | Fracture of patella | 65 | 0.56 | 3.58E-03
rs7374540 | 3 | SCN5A | Deviated nasal septum | 275 | 0.77 | 3.86E-03
rs7374540 | 3 | SCN5A | Emphysema | 138 | 0.69 | 5.01E-03
rs7374540 | 3 | SCN5A | Osteopenia | 2161 | 0.89 | 5.77E-03
rs7374540 | 3 | SCN5A | Candidiasis | 201 | 0.75 | 6.29E-03
rs7374540 | 3 | SCN5A | Sprains and strains | 2928 | 0.91 | 6.34E-03
rs7374540 | 3 | SCN5A | Late effects of musculoskeletal and connective tissue injuries | 81 | 1.54 | 6.47E-03
rs7374540 | 3 | SCN5A | Urinary tract infection | 2094 | 0.89 | 8.31E-03
rs7374540 | 3 | SCN5A | Polymyalgia Rheumatica | 214 | 0.75 | 9.38E-03
rs9832895 | 3 | SCN5A | Acute reaction to stress | 99 | 1.56 | 2.14E-03
rs9832895 | 3 | SCN5A | Mitral valve disorders | 551 | 1.20 | 5.79E-03
rs9832895 | 3 | SCN5A | Debility | 309 | 1.25 | 6.76E-03
rs9832895 | 3 | SCN5A | Disorders of tooth development and eruption | 55 | 1.69 | 7.48E-03
rs9832895 | 3 | SCN5A | Peptic ulcers | 427 | 1.20 | 9.29E-03
rs9861242 | 3 | SCN5A | Late effects of musculoskeletal and connective tissue injuries | 81 | 1.66 | 2.18E-03
rs9861242 | 3 | SCN5A | Deficiency of B-complex components | 284 | 1.27 | 9.71E-03
rs6797133 | 3 | SCN5A | Sleep disorders | 52 | 0.50 | 2.48E-03
rs6797133 | 3 | SCN5A | Other diseases of appendix | 59 | 0.59 | 9.88E-03
rs12636358 | 3 | XYLB | Sleep disorders | 52 | 2.46 | 5.26E-03
rs12636358 | 3 | XYLB | Diseases of hard tissues of teeth | 209 | 1.65 | 9.73E-03
rs1321313 | 6 | CDKN1A | Anorexia | 53 | 0.36 | 1.57E-03
rs1321313 | 6 | CDKN1A | Bacteremia | 118 | 0.57 | 2.01E-03
rs1321313 | 6 | CDKN1A | Anal fissure and fistula | 108 | 1.55 | 3.04E-03
rs1321313 | 6 | CDKN1A | Abnormal loss of weight | 398 | 0.76 | 3.39E-03
rs1321313 | 6 | CDKN1A | Dementias | 841 | 0.83 | 5.69E-03
rs1321313 | 6 | CDKN1A | Overweight, obesity and other hyperalimentation | 2501 | 1.11 | 7.39E-03
rs1321313 | 6 | CDKN1A | Spondylosis and allied disorders | 1244 | 1.16 | 9.39E-03
rs2207790 | 1 | NFIA | Postinflammatory pulmonary fibrosis | 173 | 0.70 | 1.76E-03
rs2207790 | 1 | NFIA | Abnormal serum enzyme levels (acid phosphatase, alkaline phosphatase, amylase, lipase) | 205 | 1.37 | 1.97E-03
rs2207790 | 1 | NFIA | Multiple myeloma and immunoproliferative neoplasms | 54 | 0.54 | 2.80E-03
rs2207790 | 1 | NFIA | Neutropenia and leukopenia | 238 | 0.75 | 2.81E-03
rs2207790 | 1 | NFIA | Facial nerve disorders | 67 | 1.68 | 3.70E-03
rs2207790 | 1 | NFIA | Cardiomyopathy | 420 | 0.80 | 4.40E-03
rs2207790 | 1 | NFIA | Pneumothorax | 80 | 0.64 | 6.59E-03
rs2207790 | 1 | NFIA | Disorders of function of stomach | 355 | 1.23 | 7.73E-03
rs2207790 | 1 | NFIA | Prostatitis | 150 | 0.72 | 8.85E-03
rs2207790 | 1 | NFIA | Toxic diffuse goiter | 76 | 1.54 | 9.47E-03
rs4307206 | 6 | C6orf204 | Colorectal cancer | 316 | 0.76 | 1.01E-03
rs4307206 | 6 | C6orf204 | Malignant neoplasm of colon | 226 | 0.74 | 2.76E-03
rs4307206 | 6 | C6orf204 | Hyperplasia of prostate | 1615 | 0.86 | 3.92E-03
rs4307206 | 6 | C6orf204 | Personal history of malignant neoplasm of colon | 130 | 0.69 | 4.14E-03
rs4307206 | 6 | C6orf204 | Penicillin allergy | 65 | 1.64 | 6.07E-03
rs4307206 | 6 | C6orf204 | Disorders of pancreatic internal secretion | 84 | 0.65 | 7.40E-03
rs4307206 | 6 | C6orf204 | Nasal polyps | 127 | 1.40 | 8.87E-03
rs4307206 | 6 | C6orf204 | Personal history of malignant melanoma of skin | 151 | 1.36 | 9.30E-03
rs6906287 | 6 | C6orf204 | Colorectal cancer | 316 | 0.76 | 9.90E-04
rs6906287 | 6 | C6orf204 | Malignant neoplasm of colon | 226 | 0.75 | 3.03E-03
rs6906287 | 6 | C6orf204 | Penicillin allergy | 65 | 1.68 | 4.01E-03
rs6906287 | 6 | C6orf204 | Hyperplasia of prostate | 1615 | 0.86 | 4.19E-03
rs6906287 | 6 | C6orf204 | Personal history of malignant neoplasm of colon | 130 | 0.70 | 5.04E-03
rs6906287 | 6 | C6orf204 | Personal history of malignant melanoma of skin | 151 | 1.38 | 5.68E-03
rs6906287 | 6 | C6orf204 | Disorders of pancreatic internal secretion | 84 | 0.64 | 6.22E-03
rs6906287 | 6 | C6orf204 | Nasal polyps | 127 | 1.41 | 7.15E-03
rs6906287 | 6 | C6orf204 | Prostatitis | 150 | 0.72 | 8.07E-03
rs6771881 | 3 | EXOG | Hyperplasia of prostate | 1615 | 0.81 | 8.10E-04
rs6771881 | 3 | EXOG | Chronic liver disease | 254 | 1.33 | 3.55E-03
rs6771881 | 3 | EXOG | Cardiac arrhythmias | 3075 | 0.88 | 3.86E-03
rs6771881 | 3 | EXOG | Sexual and gender identity disorders | 76 | 0.51 | 4.20E-03
rs6771881 | 3 | EXOG | Chronic kidney disease | 1212 | 0.86 | 5.07E-03
rs6771881 | 3 | EXOG | Cerebral degenerations | 137 | 0.63 | 5.29E-03
rs6771881 | 3 | EXOG | Diarrhea | 880 | 0.85 | 9.06E-03

Supplemental Figure 2: Phenome-wide association study plot for CDKN1A (rs1321313).

Supplemental Figure 3: Phenome-wide association study plot for C6orf204 (rs6906287).

Supplemental Figure 4: Phenome-wide association study plot for NFIA (rs2207790).
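The strand footnote to Table 2 and Supplemental Table 1 can be made concrete with a small check. The sketch below takes the five Table 2 rows, labels each eMERGE/CHARGE coded-allele pair as the same allele or its complement on the opposite strand, and confirms that the betas share a sign. The names `strand_relation`, `same_direction`, and `TABLE_2` are hypothetical, and the check assumes that complementary alleles denote the same variant allele; that assumption would not hold for strand-ambiguous A/T or C/G SNPs, which this sketch does not handle.

```python
# Base-pair complements used to compare alleles reported on opposite strands.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

# (SNP, eMERGE coded allele, eMERGE beta, CHARGE coded allele, CHARGE beta),
# copied from Table 2.
TABLE_2 = [
    ("rs1805126", "T", -1.002, "A", -0.6568),
    ("rs6795970", "A", 0.765, "A", 0.7476),
    ("rs6906287", "C", 0.717, "C", 0.5383),
    ("rs1321313", "G", -0.793, "C", -0.8129),
    ("rs2207790", "T", -0.622, "A", -0.5956),
]


def strand_relation(emerge_allele, charge_allele):
    """Classify two coded alleles as 'same', 'opposite' (complement), or 'mismatch'."""
    if emerge_allele == charge_allele:
        return "same"
    if COMPLEMENT[emerge_allele] == charge_allele:
        return "opposite"
    return "mismatch"


def same_direction(beta_a, beta_b):
    """True when both effect estimates have the same sign."""
    return (beta_a > 0) == (beta_b > 0)


# Map each SNP to (strand relation, whether effect directions agree).
report = {
    snp: (strand_relation(e_allele, c_allele), same_direction(e_beta, c_beta))
    for snp, e_allele, e_beta, c_allele, c_beta in TABLE_2
}
```

For the Table 2 rows, the three starred SNPs come out as opposite-strand pairs and the other two as identical alleles, with concordant effect direction in every case.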