Supplemental materials synaptic groups and SCZ

Supplemental Materials for: Functional gene group analysis identifies synaptic gene groups as risk factor for schizophrenia

Esther S Lips1, L Niels Cornelisse1, Ruud F Toonen1, Josine L Min1, Christina M Hultman2,3, the International Schizophrenia Consortium#, Peter A. Holmans4, Michael C. O'Donovan4, Shaun M. Purcell5,6,7,8, August B Smit9, Matthijs Verhage1, Patrick F Sullivan10, Peter M Visscher11, Danielle Posthuma1,12

1 Department of Functional Genomics, Center for Neurogenomics and Cognitive Research, Neuroscience Campus Amsterdam VU University, The Netherlands
2 Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
3 Department of Neuroscience, Psychiatry, Ulleråker, Uppsala University, Uppsala, Sweden
4 School of Medicine, Department of Psychological Medicine, School of Medicine, Cardiff University, Cardiff, United Kingdom
5 Department of Psychiatry, Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts, United States of America
6 Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
7 Broad Institute of Harvard and Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
8 Center for Human Genetic Research, Massachusetts General Hospital, Boston, Massachusetts, United States of America
9 Molecular & Cellular Neurobiology, Center for Neurogenomics and Cognitive Research, Neuroscience Campus Amsterdam VU University, The Netherlands
10 Department of Genetics, University of North Carolina, Chapel Hill, United States of America
11 Queensland Statistical Genetics Laboratory, Queensland Institute of Medical Research, Brisbane, Australia
12 Department of Medical Genomics, VU Medical Center, Neuroscience Campus Amsterdam, The Netherlands
# Please see Acknowledgements for consortium authorship

Correspondence to: Danielle Posthuma, Center
for Neurogenomics and Cognitive Research (CNCR), De Boelelaan 1085, 1081 HV Amsterdam, The Netherlands Tel: +31 20 598 2823, Fax: +31 20 598 6926, E-mail: [email protected]

Supplemental Materials Contents
1. Genome-wide association analyses
2. Testing synaptic genes versus groups of randomly drawn genes
2.1 Competitive test matched for the number of genes - control method 1.
2.2 Competitive test matched for the effective number of SNPs: control method 2 - genic and non-genic SNPs.
2.3 Competitive test matched for the effective number of SNPs: control method 3 - genic SNPs only.
2.4 Competitive test matched for the effective number of SNPs: control method 4 - non-genic SNPs only.
2.5 Competitive test matched for the effective number of SNPs: control method 5 - SNPs annotated to brain-expressed genes.
2.6 Overlap across samples and across draws
3. Systematic differences between synaptic genes and other genes
4. Functional gene group analysis for synaptic subgroups
5. Enrichment analysis of genes previously implicated in schizophrenia
5.1 Signals from GWAS studies
5.2 Signals from large-scale CNV studies
5.3 Enrichment analysis
5.4 Gene-group analysis using previously implicated genes
6. Genetic heterogeneity and gene group robustness
7. Graphical representation of significant functional gene groups
8. Web resources
9. References

1. Genome-wide association analyses

We first conducted a single SNP analysis on the ISC_AFFY5, ISC_AFFY6 and GAIN_AFFY6 datasets, using all SNPs that passed QC. For the ISC samples we clustered the analysis by collection site to correct for possible confounding due to population stratification, following Purcell et al. (2009). Figure S1 shows the three Manhattan plots, while Figure S2 shows the corresponding QQ plots. Red dots indicate SNPs with a P-value < 1 × 10^-4. None of the SNPs reached the threshold of genome-wide significance (< 1 × 10^-8).

Figure S1: Manhattan plots of all SNPs that passed QC for the ISC affy5, affy6 and GAIN samples.

Figure S2: Quantile-quantile plots of all SNPs that passed QC for the ISC affy5, affy6 and GAIN samples.

2. 
Testing synaptic genes versus groups of randomly drawn genes

We randomly drew control groups of genes/SNPs that were not necessarily functionally related, to test whether the group of synaptic genes was significantly more related to the risk of schizophrenia than any other randomly drawn group of genes/SNPs. Randomly drawn groups were compiled following two strategies: matched for the number of genes and matched for the effective number of SNPs. For both strategies we conducted 100 random draws.

When creating groups matched for the effective number of SNPs we first derived the effective number of SNPs in the synaptic gene group based on the empirical distribution of the Σ-log(P) under the null hypothesis of no association of the 10,000 permutations, following Purcell et al. (2009). Briefly, under the null hypothesis of no association, $-2\ln(P)$ is distributed as a $\chi^2$ with 2 degrees of freedom, and hence $-\log_{10}(P)$ is distributed as $1/(2\ln(10)) = 0.217$ times a $\chi^2$ with 2 degrees of freedom. If all $M$ SNPs are independent then $\sum -\log_{10}(P)$ has a mean of $(0.217)(2M)$ and a variance of $(0.217)^2(4M) = 0.189M$. We define the effective number of SNPs ($M_{eff}$) as

$$
M_{eff}
= \frac{M_{obs}\,\sigma^2_{exp,\,SNPs_{ind},\,\sum -\log_{10}(p)}}{\sigma^2_{emp,\,\sum -\log_{10}(p)}}
= \frac{M_{obs}\left[(0.217)^2\,(4 M_{obs})\right]}{\sigma^2_{emp,\,\sum -\log_{10}(p)}}
= \frac{0.189\,M_{obs}^2}{\sigma^2_{emp,\,\sum -\log_{10}(p)}}
$$

Here $\sigma^2_{exp,\,SNPs_{ind}}$ is the variance expected if all $M_{obs}$ observed SNPs were independent, and $\sigma^2_{emp}$ is the variance observed across the permutations. The expected mean and variance are calculated based on the number of SNPs that are summed to obtain the Σ-log(P), and larger variance of the observed distribution than expected indicates dependency (i.e. due to LD) between included SNPs.
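The definition of the effective number of SNPs can be illustrated with a few lines of Python. This is a minimal sketch of the formula as stated above, not the code used for the analysis; the function and constant names are illustrative. The inputs in the usage example are the observed SNP counts and permutation variances reported in Table S1 for ISC_AFFY6 and GAIN_AFFY6.

```python
import math

# Scaling constant from the text: -log10(P) = [1 / (2 ln 10)] * chi-square(2 df)
SCALE = 1.0 / (2.0 * math.log(10.0))  # approximately 0.217


def expected_variance_independent(m_obs: int) -> float:
    """Variance of sum(-log10 P) expected if all m_obs SNPs were independent."""
    return SCALE ** 2 * 4 * m_obs  # approximately 0.189 * m_obs


def effective_number_of_snps(m_obs: int, empirical_variance: float) -> float:
    """M_eff = M_obs * expected variance / empirical (permutation) variance."""
    return m_obs * expected_variance_independent(m_obs) / empirical_variance


if __name__ == "__main__":
    # (label, nSNPs observed, average variance of sum(-log10 P) over permutations)
    table_s1 = [
        ("ISC_AFFY6", 34860, 59022.80),
        ("GAIN_AFFY6", 35412, 62465.73),
    ]
    for label, m_obs, var_emp in table_s1:
        print(label, round(effective_number_of_snps(m_obs, var_emp)))
```

With these inputs the sketch gives approximately 3,883 and 3,786 effective SNPs, in line with the "nSNPs effective" row of Table S1 for those two samples.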
For each sample, we started with a list of all SNPs that passed QC. We then retrieved a list of all independent SNPs (using the Plink option --indep-pairwise, with a window size of 200 SNPs, a step of 5 SNPs and an r² threshold of 0.25), deleted all SNPs from synaptic genes, and created four pools: one pool included all genic and non-genic SNPs (control method 2), one pool included only SNPs in genes (control method 3), one pool included only SNPs outside genes (control method 4), and one pool included SNPs in genes known to be expressed in brain (control method 5). A SNP was assigned to a gene when located between the transcription start site and transcription end site of the gene. Expression in brain was determined using the Unigene Homo sapiens repository (Build #221).

For each draw we carried out an association analysis for all SNPs in the group, calculated the Σ-log(P) and then carried out 10,000 permutations of the dataset to determine the empirical P-value. We then calculated per dataset (i.e. ISC_AFFY5, ISC_AFFY6, GAIN_AFFY6) how often the empirical P-value from the random draw was lower than the empirical P-value of all synaptic genes and divided that by the total number of draws to obtain the 'empirical P-value of the empirical P-value'. Since there were 100 draws, the lowest empirical P-value of the empirical P-value that we could obtain was <.01, which occurs when none of the random draws had a lower P-value than the P-value of the synaptic gene group. We then combined the empirical P-values across the three different samples and within each draw to obtain 100 combined empirical P-values. We finally determined how often the combined empirical P-value was lower than the combined empirical P-value obtained from the synaptic genes, to calculate the 'empirical P-value of the combined P-value'.

2.1 Competitive test matched for the number of genes - control method 1.
For this method we started off with randomly drawing 1026 genes (i.e. the total number of synaptic genes) from the total pool of genes covered on the AFFY 6.0 platform minus the synaptic genes (Ngenes in pool = 16351). This was done 100 times. The average number of genes across the different draws was less than 1026 due to genes not being covered (AFFY5) or genotyped in a sample (AFFY5 or AFFY6) (see Table S1).

Table S1: Results from control method 1 to test whether synaptic genes are more significantly related to the risk of schizophrenia than randomly drawn, matched groups of genes.

 | ISC AFFY5 (N=3353) | ISC AFFY6 (N=3556) | GAIN AFFY6 (N=2729) | Combined (N=9638)
Synaptic gene group (N=1026 genes)
nSNPs observed | 15105 | 34860 | 35412 |
nGenes observed | 795 | 906 | 908 |
Average number exons/gene | 15.12 | 14.62 | 14.61 |
Σ-log(p) | 7102.19 | 16070.97 | 16348.01 |
Number of permutations | 10000 | 10000 | 10000 |
Average Σ-log(p)_perm | 6564.67 | 15141.25 | 15380.92 |
Average var Σ-log(p)_perm | 15380.92 | 59022.80 | 62465.73 |
nSNPs effective | 2830 | 3883 | 3786 |
Empirical P-value (emp_p from synaptic genes) | <1.00E-04 | <1.00E-04 | <1.00E-04 | 7.61E-11
Matched for N_Genes
Number of independent draws | 100 | 100 | 100 |
Average nSNPs observed | 6703 | 15610 | 15899 |
Average nGenes observed | 775 | 959 | 967 |
Average nSNPs effective | 1493 | 2064 | 2076 |
Average number exons/gene | 12.34 | 11.50 | 11.46 |
Average Σ-log(p) | 3118.94 | 7015.83 | 7334.84 |
Variance Σ-log(p) | 58759.19 | 289574.80 | 346796.49 |
Number of permutations per draw | 10000 | 10000 | 10000 |
Average Σ-log(p) of perms across draws | 2912.46 | 6777.58 | 6905.25 | -
Average variance Σ-log(p) of perms across draws | 48208.22 | 260652.96 | 265796.11 | -
Average emp_p across draws | 0.0239 | 0.1364 | 0.0286 | -
Variance of emp_p across draws | 0.0020 | 0.0319 | 0.0039 | -
Emp_p of emp_p from synaptic genes | 0.11 | 0.02 | 0.12 | -
Average combined emp_p | - | - | - | 2.00E-03
Emp_p of combined emp_p from synaptic genes | - | - | - | <0.01

The average empirical P-value obtained from the 100 draws was 239 times as large as the empirical P-value
obtained in the original analysis for all synaptic genes in the ISC_AFFY5 sample. For ISC_AFFY6 and GAIN this was 1364 times and 286 times, respectively. In 11 out of 100 draws in the ISC_AFFY5 sample, the empirical P-value from a random draw was lower (more significant) than the empirical P-value from the original analysis for all synaptic genes; for the ISC_AFFY6 and GAIN samples this occurred 2 times and 12 times, respectively. The average combined empirical P-value from the 100 draws across the three samples was .002, and in none of the 100 draws was the combined P-value smaller than the combined P-value from the original analysis for all synaptic genes (i.e. the empirical P-value of the combined P-value was < .01) (see Table S1).

Although this method matches for the number of genes, it allows differences in the effective number of independent SNPs between the randomly drawn groups and the original synaptic gene group. For all draws, the number of genotyped SNPs was lower than in the original analysis on all synaptic genes. On average, the effective number of SNPs was ~50% lower in the draws than in the original analysis, which renders the draws matched for the number of genes sub-optimal. That is, given the findings previously reported by the ISC, which suggest that the predicted risk of schizophrenia increases with the effective number of SNPs in the model, a larger number of effective SNPs in the original analysis as compared to the control groups may lead to an inflation of type I errors. We therefore also conducted control methods 2-5, where we created randomly drawn control groups of SNPs matched for the effective number of SNPs in the original analysis.

2.2 Competitive test matched for the effective number of SNPs: control method 2 - genic and non-genic SNPs.

Control method 2 is used to determine whether SNPs in synaptic genes are more significantly related to the risk of schizophrenia than any other SNP.
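The two counting steps that underlie all control methods, the permutation-based empirical P-value and the 'empirical P-value of the empirical P-value', can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, and the convention of dividing the raw count by the number of permutations (so that zero hits would be reported as "< 1/n") is an assumption inferred from the "<1.00E-04" entries in Table S1, not something the text states explicitly.

```python
from typing import Sequence


def empirical_p(observed_stat: float, permuted_stats: Sequence[float]) -> float:
    """Fraction of permuted sum(-log10 P) values at least as large as the observed one.

    Assumption: plain count / n. A result of 0.0 corresponds to what the
    tables would report as "< 1/n" (e.g. <1.00E-04 for 10,000 permutations).
    """
    hits = sum(1 for s in permuted_stats if s >= observed_stat)
    return hits / len(permuted_stats)


def empirical_p_of_empirical_p(synaptic_p: float, draw_ps: Sequence[float]) -> float:
    """Fraction of random draws whose empirical P-value is lower than the
    empirical P-value of the synaptic gene group, as described in the text."""
    lower = sum(1 for p in draw_ps if p < synaptic_p)
    return lower / len(draw_ps)


if __name__ == "__main__":
    # Toy numbers for illustration only; these are not values from the study.
    print(empirical_p(10.0, [1.0, 2.0, 3.0, 12.0]))
    print(empirical_p_of_empirical_p(0.01, [0.5, 0.001, 0.2, 0.03]))
```

With 100 draws, a return value of 0.0 from the second function corresponds to the "<0.01" entries reported in Tables S1 and S2.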
The empirical P-value of the empirical P-value for each sample ranged between 0.01 and 0.10, while the empirical P-value of the combined empirical P-value was < .01, i.e. not once was the combined empirical P-value of the random draws lower than the combined empirical P-value from the original analysis with all synaptic genes (see Table S2). We should note, however, that in contrast to control method 1, SNPs included in a draw may not completely overlap across the three samples (both in terms of the actual SNPs and in terms of genomic location), which complicates the comparison of P-values across samples. From Table S2 we also see that the total number of independent SNPs available in the pool for control method 2 varied between 77,702 and 109,551 for the three samples, while the effective number of SNPs varied between 2,830 and 3,883. Drawing 100 sets of 2,830-3,883 SNPs from pools of sizes 77,702-109,551 is not ideal, and results in some overlap in SNPs across different draws, rendering the 100 draws non-independent. In addition, this method may not serve as the most informative control method as we now compare synaptic genes with both genic and non-genic SNPs, while it can be expected that genic SNPs have a larger contribution to the risk of schizophrenia than non-genic SNPs. We therefore also applied control methods 3 (genic SNPs) and 4 (non-genic SNPs).

2.3 Competitive test matched for the effective number of SNPs: control method 3 - genic SNPs only.

This method was used to investigate whether SNPs in synaptic genes are more strongly related to the risk of schizophrenia than SNPs in other genes. The pool available for 100 draws was considerably smaller than the pool available for control method 2, as all non-genic SNPs were excluded.
The number of SNPs in the pool ranged from 28,558 to 39,866, roughly 10 times the size of each draw. Although this allows comparison with genic SNPs, it is far from optimal and creates non-independent sets of drawn SNPs. The empirical P-value of the combined empirical P-value was <.01.

2.4 Competitive test matched for the effective number of SNPs: control method 4 - non-genic SNPs only.

Control method 4 was used to test whether SNPs in synaptic genes are more strongly related to the risk of schizophrenia than non-genic SNPs. This method has slightly larger pools than method 3, and therefore suffers less from dependency between random draws. Results reported previously by the ISC (Purcell et al., 2009) showed that genic SNPs do better in predicting risk of schizophrenia than non-genic SNPs. However, not once was the combined empirical P-value lower than that of the original analysis with all synaptic genes, indicating that the association between synaptic genes and schizophrenia is stronger than with other genes. The empirical P-value of the (combined) empirical P-value for this method was <.01.

2.5 Competitive test matched for the effective number of SNPs: control method 5 - SNPs annotated to brain-expressed genes.

With this method we tested whether SNPs in synaptic genes are more strongly related to the risk of schizophrenia than SNPs in genes expressed in brain. This method has the smallest pools, so the 100 draws will not be independent. Although the average combined empirical P-value of random draws composed of SNPs annotated to genes expressed in brain was the lowest of the control methods, again, the empirical P-value of the (combined) empirical P-value for this method was <.01. This suggests that SNPs in synaptic genes are more strongly related to the risk of schizophrenia than any other set of SNPs in genes expressed in brain.
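The dependency between draws noted for control methods 2-5 can be made concrete with a short calculation. The sketch below assumes that each draw is a simple random sample without replacement from the pool, which ignores the matching on the effective number of SNPs used in the actual analysis; under that assumption the expected number of SNPs shared by two draws is the hypergeometric mean. Function names are illustrative, and the pool sizes are those listed for ISC_AFFY5 in Table S2.

```python
def expected_shared_snps(draw_size: int, pool_size: int) -> float:
    """Expected number of SNPs shared by two independent draws of draw_size
    SNPs from the same pool (mean of a hypergeometric distribution)."""
    return draw_size * draw_size / pool_size


def expected_shared_fraction(draw_size: int, pool_size: int) -> float:
    """Expected fraction of one draw that also appears in another draw."""
    return draw_size / pool_size


if __name__ == "__main__":
    draw_size = 2830  # average nSNPs observed per draw, ISC_AFFY5
    pools = {
        "method 2 (genic and non-genic)": 77702,
        "method 3 (genic)": 28558,
        "method 4 (non-genic)": 49221,
        "method 5 (brain-expressed)": 25646,
    }
    for label, pool_size in pools.items():
        shared = expected_shared_snps(draw_size, pool_size)
        frac = expected_shared_fraction(draw_size, pool_size)
        print(f"{label}: ~{shared:.0f} shared SNPs per pair of draws ({frac:.1%})")
```

Under this simplification the smaller the pool, the larger the expected overlap between any two draws, which matches the ordering of the concerns raised for methods 2 to 5.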
2.6 Overlap across samples and across draws

The overlap of SNPs included in the draws based on the ISC_AFFY5, ISC_AFFY6, and GAIN samples is not 100%. This is due to differences between the Affymetrix 5.0 and Affymetrix 6.0 platforms and to the fact that LD structure may differ between the samples, which causes different SNPs to be selected as independent. For control methods 2-5 we also conducted analyses where we forced the SNPs included in the draws to be maximally overlapping, i.e. the draws for ISC_AFFY6 and GAIN included all SNPs included in the ISC_AFFY5 draw, the ISC_AFFY6 and GAIN draws overlapped completely, and SNPs classified as independent in ISC_AFFY5 were also classified as independent in ISC_AFFY6 and GAIN. However, this resulted in very small pool sizes for ISC_AFFY5 (a reduction on average of 35% across methods 2-5), resulting in pool sizes for control methods 3 and 4 that were only 6 times as large as the total number of SNPs to be included in a draw. Since 100 draws need to be obtained, these pools are far too small.

In summary, results from the applied control methods suggest that SNPs in synaptic genes are more strongly associated with the risk of schizophrenia than any other set of randomly drawn genes. Although all applied control methods have their own pros and cons, all indicated a strong association of synaptic genes versus other genes or SNPs. Determining the correct control method is complicated: matching for genes (method 1) seems most appropriate as it allows comparison across different platforms (i.e. considering the gene as the unit of association signal, as is also done in the original analysis), but there is a lot of fluctuation in the effective number of SNPs included in the draws for that method.
This seems mainly due to an artifact of the AFFY 5.0 and AFFY 6.0 platforms, probably in combination with brain genes (including synaptic genes) in general being larger than other genes: on the AFFY 5.0 platform there are on average 28 SNPs for synaptic genes and 12 SNPs for non-synaptic genes, whereas on the AFFY 6.0 platform there are on average 48 SNPs for synaptic genes and 20 SNPs for non-synaptic genes.

As a final check we generated three draws in which we controlled for both the gene size and the genotyped number of SNPs by finding a matched-control gene for each synaptic gene. Matching was based on gene size (+/- 10%) and the number of genotyped SNPs (+/- 10%). In those few cases where a synaptic gene could not be matched with a control gene on these two matching criteria we randomly selected a gene matched for gene size only. Only three complete draws could be made on this basis. These three draws were analyzed following the same procedure as the synaptic gene group. None of the combined empirical P-values for these 3 draws were lower than or equal to the combined empirical P-value for the group of synaptic genes.

Table S2: Results from control methods 2-5

2. Matched for n effective SNPs, genic and non-genic
 | ISC AFFY5 | ISC AFFY6 | GAIN AFFY6 | Combined
Number of independent SNPs in pool | 77702 | 109551 | 109431 |
Number of independent draws | 100 | 100 | 100 |
Average nSNPs observed | 2830 | 3883 | 3786 |
Average nGenes observed | n.a | n.a | n.a |
Average nSNPs effective | 2789 | 3740 | 3711 |
Average Σ-log(p) | 1292.82 | 1728.28 | 1713.52 |
Variance Σ-log(p) | 569.05 | 791.07 | 602.35 |
Number of permutations per draw | 10000 | 10000 | 10000 |
Average Σ-log(p) of perms across draws | 1229.51 | 1686.04 | 1711.82 |
Variance Σ-log(p) of perms across draws | 541.77 | 760.64 | 728.74 |
Average emp_p across draws | 0.0269 | 0.1411 | 0.0288 |
Variance of emp_p across draws | 0.0048 | 0.0313 | 0.0028 |
Emp_p of emp_p from synaptic genes | 0.10 | 0.01 | 0.07 |
Average combined emp_p | - | - | - | 2.79E-03
Emp_p of combined emp_p from synaptic genes | - | - | - | <0.01

3. Matched for n effective SNPs, genic
 | ISC AFFY5 | ISC AFFY6 | GAIN AFFY6 | Combined
Number of independent SNPs in pool | 28558 | 39866 | 39853 |
Number of independent draws | 100 | 100 | 100 |
Average nSNPs observed | 2830 | 3883 | 3786 |
Average nGenes observed | 1930 | 2407 | 2358 |
Average nSNPs effective | 2765 | 3702 | 3664 |
Average Σ-log(p) | 1306.12 | 1719.78 | 1711.82 |
Variance Σ-log(p) | 471.51 | 800.43 | 740.22 |
Number of permutations per draw | 10000 | 10000 | 10000 |
Average Σ-log(p) of perms across draws | 1229.52 | 1685.92 | 1644.29 |
Variance Σ-log(p) of perms across draws | 546.44 | 768.41 | 737.90 |
Average emp_p across draws | 0.0102 | 0.1992 | 0.0370 |
Variance of emp_p across draws | 0.0011 | 0.0446 | 0.0049 |
Emp_p of emp_p from synaptic genes | 0.29 | 0.01 | 0.11 |
Average combined emp_p | - | - | - | 1.89E-03
Emp_p of combined emp_p from synaptic genes | - | - | - | <0.01

4. Matched for n effective SNPs, non-genic
 | ISC AFFY5 | ISC AFFY6 | GAIN AFFY6 | Combined
Number of independent SNPs in pool | 49221 | 69784 | 69677 |
Number of independent draws | 100 | 100 | 100 |
Average nSNPs observed | 2830 | 3883 | 3786 |
Average nGenes observed | 0 | 0 | 0 |
Average nSNPs effective | 2781 | 3742 | 3694 |
Average Σ-log(p) | 1285.21 | 1732.59 | 1715.67 |
Variance Σ-log(p) | 539.77 | 561.99 | 839.19 |
Number of permutations per draw | 10000 | 10000 | 10000 |
Average Σ-log(p) of perms across draws | 1229.45 | 1686.05 | 1644.23 |
Variance Σ-log(p) of perms across draws | 543.24 | 760.22 | 732.04 |
Average emp_p across draws | 0.0436 | 0.1032 | 0.0350 |
Variance of emp_p across draws | 0.0089 | 0.0193 | 0.0049 |
Emp_p of emp_p from synaptic genes | 0.06 | <0.01 | 0.12 |
Average combined emp_p | - | - | - | 1.85E-03
Emp_p of combined emp_p from synaptic genes | - | - | - | <0.01

5. Matched for n effective SNPs, brain genes
 | ISC AFFY5 | ISC AFFY6 | GAIN AFFY6 | Combined
Number of independent SNPs in pool | 25646 | 35713 | 35684 |
Number of independent draws | 100 | 100 | 100 |
Average nSNPs observed | 2830 | 3883 | 3786 |
Average nGenes observed | 1837 | 2267 | 2227 |
Average nSNPs effective | 2767 | 3681 | 3658 |
Average Σ-log(p) | 1312.82 | 1715.00 | 1720.03 |
Variance Σ-log(p) | 484.34 | 537.70 | 664.31 |
Number of permutations per draw | 10000 | 10000 | 10000 |
Average Σ-log(p) of perms across draws | 1229.42 | 1685.90 | 1644.36 |
Variance Σ-log(p) of perms across draws | 546.06 | 772.86 | 739.23 |
Average emp_p across draws | 0.0041 | 0.2104 | 0.0212 |
Variance of emp_p across draws | 0.0002 | 0.0351 | 0.0033 |
Emp_p of emp_p from synaptic genes | 0.34 | <0.01 | 0.12 |
Average combined emp_p | - | - | - | 1.47E-03
Emp_p of combined emp_p from synaptic genes | - | - | - | <0.01

3. Systematic differences between synaptic genes and other genes

The average number of SNPs observed across all 100 draws from control method 1 was significantly lower than that in the original analysis of all synaptic genes: the mean number of SNPs per gene was 8.6 in the 100 random draws, while it was 19.0 in the original analysis of all synaptic genes in the ISC_AFFY5 sample, and similar discrepancies were seen for the other two samples. This difference could not be ascribed to 'unlucky' draws, as new draws gave similar results. This discrepancy is most likely due to the larger gene length known to characterize genes expressed in brain (e.g. Jia et al., 2010). To investigate whether other systematic differences between synaptic genes and other genes may explain the obtained results for synaptic genes, we looked at possible differences in minor allele frequency, gene size, and the number of exons in synaptic genes and non-synaptic genes (see Table S3).
Table S3: Characteristics of synaptic genes and non-synaptic genes

 | ISC AFFY5 | ISC AFFY6 | GAIN AFFY6
Synaptic genes (N=1026)
Number of genes on platform | 795* | 906* | 908*
Mean MAF | 0.239 | 0.237 | 0.236
Mean nSNPs/gene** | 19.0 | 38.5 | 39.0
Mean gene size (bp) | 151,821 | 136,495 | 136,180
Mean number of exons | 15.12 | 14.62 | 14.61
Non-synaptic genes (N=22,655)
Number of genes on platform | 12,355* | 15,366* | 15,430*
Mean MAF | 0.239 | 0.234 | 0.234
Mean nSNPs/gene** | 8.6 | 16.2 | 16.4
Mean gene size (bp) | 79,886 | 68,300 | 68,169
Mean number of exons | 12.32 | 11.46 | 11.43
* max # genes genotyped on platform
** based on genotyped SNPs

The most notable difference between synaptic genes and other genes is the gene size (and, relatedly, the number of exons and the number of SNPs/gene), which is a known difference between genes expressed in brain and genes not expressed in brain. Although larger genes are more likely to contain SNPs that show a significant P-value (simply because more tests are conducted), we do not think this affected our results, because the permutation procedure we used corrects for any systematic effects of gene size or the number of SNPs; the combined empirical P-value is therefore unlikely to be biased because of these systematic differences.

4. Functional gene group analysis for synaptic subgroups

Seventeen functional synaptic subgroups were tested as well as one subgroup that contained synaptic genes for which the specific synaptic function is not yet known.
The 17 groups are: Cell adhesion and transsynaptic signaling molecules; Cell metabolism (synaptic metabolic enzymes and their co-factors, excluding mitochondrial proteins); Endocytosis (proteins involved in endocytosis); Excitability (voltage gated ion channels); Exocytosis (proteins involved in regulated secretion); G-protein relay (G-protein subunits); GPCR signaling (G-protein coupled receptors); Intracellular signal transduction (enzymes downstream of G-protein/TK signaling); Intracellular trafficking (vesicle adaptors, sorting- and motor proteins); Ion balance/transport (ion-/solute-carriers and exchangers); Ligand gated ion channel signaling; Neurotransmitter metabolism (metabolizing enzymes); Peptide/neurotrophin signaling (neuropeptide, trophic factors, hormones); Protein clustering (scaffolding proteins); RNA and protein synthesis, folding and breakdown; Structural plasticity (cytoskeletal proteins and their regulators); and Tyrosine kinase (TK) signaling (tyrosine receptor kinases). Table S4 lists the genes assigned to each of the functional gene groups.
Table S4: Genes assigned to tested synaptic functional gene groups Intracellular Signal Transduction ADCY1, ADCY2, ADCY3, ADCY4, ADCY5, ADCY8, ADCY9, ATM, BAIAP2, BASP1, BEGAIN, BRSK1, CALM1, CALM2, CALM3, CALML3, CALML4, CALR, CAMK1, CAMK1D, CAMK1G, CAMK2A, CAMK2B, CAMK2D, CAMK2G, CAMK2N1, CAMK4, CAP1, CAP2, CHP, CIT, CKB, CNP, CSNK2A1, CSNK2B, CTNNA1, CTNNA2, CTNNB1, CTNND1, CTNND2, DAPK1, DBC1, DCLK1, DIRAS2, DOCK4, ENSA, GDAP1, GRB2, GSK3B, GUCY1A2, GUCY1A3, GUCY1B3, HINT1, HOMER1, HOMER3, HPCA, HPCAL1, HPCAL4, HRAS, INPP4A, IQSEC1, IQSEC2, ITPKA, KIAA1688, KRAS, LANCL1, LANCL2, LINGO1, MAP2K1, MAPK1, MAPK3, MAPK8IP1, MAPK8IP2, MAPK8IP3, MARCKS, MARCKSL1, MINK1, MRAS, NCALD, NCKIPSD, NDRG2, NPTX1, NPTXR, NRGN, PAFAH1B1, PAFAH1B2, PCP4, PDE2A, PDXP, PEBP1, PGRMC1, PHB, PLCB1, PLCB3, PLCB4, PLCG1, PLD1, PPAP2B, PPP1CB, PPP1R9B, PPP2CB, PPP2R1A, PPP2R4, PPP3CA, PPP3CB, PPP3CC, PPP3R1, PRKACA, PRKACB, PRKAR1A, PRKAR1B, PRKAR2A, PRKAR2B, PRKCA, PRKCB1, PRKCD, PRKCE, PRKCG, PRKCZ, PSD3, PTPN9, RAC1, RALA, RAP1B, RAP1GDS1, RAP2A, RAP2B, RAPGEF4, RASAL1, RHOA, RHOB, ROCK1, ROCK2, RYR1, RYR2, RYR3, SIRT2, SKP1, SMPD3, SNCB, SPG3A, THY1, VSNL1, WASF1, YWHAB, YWHAE, YWHAG, YWHAH, YWHAQ, YWHAZ Excitability CACNA1A, CACNA1B, CACNA1C, CACNA1D, CACNA1E, CACNA1F, CACNA2D1, CACNA2D2, CACNA2D3, CACNA2D4, CACNB1, CACNB3, CACNB4, CACNG2, CACNG3, CACNG4, CACNG5, CACNG8, KCNA1, KCNA2, KCNA4, KCNAB1, KCNAB2, KCNC1, KCNC2, KCNC4, KCND2, KCNE1, KCNE1L, KCNE2, KCNE3, KCNE4, KCNJ10, KCNJ12, KCNJ15, KCNJ3, KCNJ4, KCNJ5, KCNJ6, KCNJ8, KCNJ9, KCNMA1, KCNMB1, KCNMB2, KCNMB4, KCTD12, KCTD16, SCN1A, SCN1B, SCN2A, SCN2B, SCN3A, SCN3B, SCN4B, SCN5A, SCN8A, SCN9A, VDAC1, VDAC3 CAT signaling AGRN, ALCAM, BCAN, BSG, C1QBP, CADM1, CADM2, CADM4, CD200, CD47, CDH1, CDH10, CDH11, CDH12, CDH13, CDH15, CDH16, CDH18, CDH19, CDH2, CDH20, CDH22, CDH23, CDH26, CDH3, CDH4, CDH5, CDH6, CDH7, CDH8, CDH9, CNTN1, CNTN2, CNTN3, CNTN4, CNTN5, CNTN6, CNTNAP1, CNTNAP2, CRMP1, CSPG5, CTTN, DCHS1, DCHS2, ERBB2, 
ERBB2IP, GPC1, GPM6A, GPM6B, HAPLN1, HAPLN4, HNT, ICAM5, IGSF8, L1CAM, LSAMP, LY6H, MBP, MOG, NCAM1, NCAM2, NCAN, NEO1, NFASC, NLGN1, NLGN2, NLGN3, NLGN4X, NPTN, NRXN1, NRXN2, NRXN3, OMG, OPCML, PCDH1, PKP4, PLP1, PVRL1, ROBO1, ROBO2, TNR Endocytosis AAK1, AP1B1, AP1M1, AP1M2, AP2A1, AP2A2, AP2B1, AP2M1, AP2S1, AP3D1, AP3M2, AP3S2, CLTC, DNM1, DNM1L, DNM2, DNM3, GIT1, NECAP1, PICALM, SH3GL2, SH3GLB1, SH3GLB2, SNAP91, SYNJ1, SYNJ2 Structural plasticity ABI1, ABI2, ABLIM1, ACTB, ACTG1, ACTN1, ACTR2, ACTR3, ADD1, ADD2, ADD3, ANK2, ANK3, ANXA1, ANXA2, ARPC1A, ARPC3, ARPC4, ARPC5, ASTN1, CAPZA2, CAPZB, CDC42, CEND1, CFL1, CLASP2, CORO1A, CORO2B, COTL1, CPNE4, CPNE6, CRYM, CSRP1, DBN1, DBNL, DPYSL2, DPYSL3, DPYSL4, DPYSL5, DSTN, DYNC1I1, EPB41L1, EPB41L2, EPB41L3, EZR, FSCN1, GAS7, GPRIN1, INA, KIF2A, KIF5C, KPNB1, LASP1, LGI1, LPPR4, MACF1, MAP1A, MAP1B, MAP2, MAP6, MAPRE3, MAPT, MYH10, MYH9, MYL6, MYL9, NEBL, NEFH, NEFL, NEFM, NF1, NUMBL, PALM, PALM2, PFN1, PFN2, PLEC1, PRICKLE2, SPTAN1, SPTB, SPTBN1, SPTBN2, STMN1, TAGLN3, TCP1, TLN2, TMOD2, TMSB4X, TUBA1A, TUBA4B, TUBB, TUBB2A, TUBB2B, TUBB2C, TUBB3, TUBB4, WASL, WDR1 GPCR signaling ADORA1, ADORA2A, ADRA1A, ADRA2A, ADRB2, ADRB3, BAI3, BZRAP1, CHRM1, CHRM2, CNR1, CNR2, CNRIP1, CRHR1, CRHR2, DRD1, DRD2, GPR158, GRM1, GRM3, GRM4, GRM5, GRM7, GRM8, HTR1A, HTR1B, HTR1D, HTR2A, HTR2B, HTR7, LPHN1, OPRD1, OPRK1, OPRM1, P2RX1, P2RX3, P2RX6, P2RX7, P2RY1, P2RY2, RGS7 Protein cluster BSN, CASK, CASKIN1, CASKIN2, CNKSR2, CTBP1, CTBP2, DLG1, DLG2, DLG3, DLG4, DLGAP1, DLGAP2, DLGAP3, DLGAP4, ERC1, ERC2, KIAA1045, LIN7A, LIN7B, LRRC7, MPP2, MPP3, MPP4, MPP5, MPP6, 11 Supplemental materials synaptic groups and SCZ PACSIN1, PCLO, PICK1, PPFIA1, PPFIA2, PPFIA3, PPFIA4, PPFIBP1, PPFIBP2, SEPT11, SEPT2, SEPT3, SEPT5, SEPT6, SEPT7, SEPT8, SEPT9, SHANK2, SHANK3, SORBS1, WDR37 Tyrosine kinase signaling EPHA4, NTRK2, PTPRA, PTPRG, PTPRN2, PTPRS, SIRPA Cell metabolism ACSL1, ACSL3, AGK, APOE, CA2, DBI, ENO1, ENO2, FASN, GAPDH, GDA, GK, 
GLO1, GOT1, GPD1, GSTM3, GSTM5, GSTP1, HK1, HSD17B4, IDH1, LDHA, LRP1, MDH1, MGLL, NME1, NME2, PFKL, PFKM, PFKP, PGAM1, PGK1, PGM2, PHGDH, PI4KA, PIP4K2B, PIP4K2C, PIP5K1C, PITPNA, POR, PRDX1, PRDX2, PYGB, SCCPDH, SLC25A4, SLC25A5, SLC2A1, SLC2A3, SLC35d3, SLC38A3, SLC3A2, SLC6A6, SLC6A7, SLC6A8, SLC7A5, TKT, TMEM30A Neurotransmitter metabolism CHAT, DBH, DDAH1, GLS, MAOA, MAOB, MOXD1, SLC17A6, SLC17A7, SLC17A8, SLC18A1, SLC18A2, SLC18A3, SLC1A1, SLC1A2, SLC1A3, SLC1A4, SLC1A6, SLC1A7, SLC32A1, SLC6A1, SLC6A17, SLC6A2, SLC6A3, SLC6A4, SLC6A5, SLC6A9, TPH1, TPH2 Intracellular trafficking AGAP2, ARFIP2, ARHGAP1, ARL8B, DCTN1, DCTN2, DYNC1H1, DYNC2H1, DYNLL1, DYNLL2, DYNLT1, EHD3, GDI1, GDI2, GOLSYN, KIDINS220, KIF1A, KIF1B, KIF2B, KIF4A, KIF5B, KLC1, KLC2, KLC4, KTN1, MYO5A, RAB10, RAB11B, RAB12, RAB14, RAB15, RAB18, RAB1A, RAB1B, RAB21, RAB24, RAB25, RAB26, RAB2A, RAB30, RAB31, RAB33A, RAB35, RAB3A, RAB3B, RAB3C, RAB3D, RAB3GAP1, RAB4A, RAB4B, RAB5A, RAB5B, RAB5C, RAB6A, RAB6B, RAB6IP1, RAB7A, RAB8A, RAB8B, RAB9B, RABAC1, RABEP1, RABEP2, RABGEF1, RABIF, SEC22C, SNX5, STX12, STX16, STX5, STX6, STX7, STX8, VAMP7, VAPA, VPS33A, VPS33B, VPS45, VTI1A, ZFYVE20 LGIC signaling CHRNA4, CHRNA6, CHRNA7, CHRNB2, GABARAPL2, GABBR1, GABBR2, GABRA1, GABRA2, GABRA3, GABRA4, GABRA5, GABRA6, GABRB1, GABRB2, GABRB3, GABRE, GABRG1, GABRG2, GRIA1, GRIA2, GRIA3, GRIA4, GRIK1, GRIK2, GRIK3, GRIK4, GRIK5, GRIN1, GRIN2A, GRIN2B, GRIN2C, GRIN2D, HCN1, HTR3A, HTR3B Exocytosis AMPH, ARF4, ARFGAP1, ARHGDIA, C1orf142, CADPS, CPLX1, CPLX2, DNAJC5, DOC2A, DOC2B, EXOC1, EXOC2, EXOC3, EXOC4, EXOC5, EXOC6, EXOC7, EXOC8, MYRIP, NAPA, NAPG, NSF, NSFL1C, RAB27B, RAB3D, RIMS1, RIMS2, RIMS3, RPH3A, SCAMP1, SCAMP2, SCAMP3, SCAMP5, SCRN1, SNAP23, SNAP25, SNAP29, SNAPIN, SNIP, STX1A, STX1B, STX2, STX3, STX4, STXBP1, STXBP2, STXBP3, STXBP4, STXBP5, STXBP5L, STXBP6, SV2A, SV2B, SV2C, SVOP, SYN1, SYN2, SYN3, SYNGR1, SYNGR2, SYNGR3, SYNPR, SYP, SYPL1, SYT1, SYT12, SYT2, SYT3, SYT5, SYT6, SYT7, SYT9, SYTL1, 
SYTL2, SYTL3, SYTL4, SYTL5, UNC13A, UNC13B, UNC13C, UNC13D, VAMP1, VAMP2, VAMP3, VAMP4

RPSFB
ADAM22, ADAM23, AHSA1, ALG2, BCAP31, CAND1, CANX, CAPN5, CCT2, CCT3, CCT4, CCT6A, CCT7, CCT8, CST3, CYFIP1, DNAJA1, DNAJC11, DNAJC6, DPP6, EEF1A1, EEF1A2, EEF2, EIF4B, ENDOD1, FBXO41, FKBP1A, GANAB, HSP90AA1, HSP90B1, HSPA12A, HSPA4, HSPA4L, HSPA5, HSPA8, HSPH1, MGST3, NEDD8, NPEPPS, NT5DC3, OTUB1, P4HB, PACS1, PDIA3, PDIA6, PPIA, PPIB, PURA, QDPR, RPL10, RPL13, RPL29, RPL4, RPL6, RPS16, RPS18, RPS27A, RPS3A, RPS9, RRBP1, SACS, SAMM50, STIP1, TMPRSS9, TXN, UBA1, UBE2N, UBE2V2, UCHL1, USP5, VCP

Ion balance/transport
AQP4, ATP1A1, ATP1A2, ATP1A3, ATP1B1, ATP1B2, ATP2A2, ATP2B1, ATP2B2, ATP2B3, ATP2B4, ATP6AP1, ATP6V0A1, ATP6V0C, ATP6V0D1, ATP6V1A, ATP6V1B1, ATP6V1B2, ATP6V1C1, ATP6V1D, ATP6V1E1, ATP6V1F, ATP6V1G2, ATP6V1H, ATP8A1, GJA1, SLC12A5, SLC30A1, SLC30A3, SLC30A4, SLC30A5, SLC30A6, SLC30A7, SLC30A9, SLC4A1, SLC4A10, SLC4A4, SLC8A1, SLC8A2, SLC8A3, SLC9A3R1, TTYH1, TTYH3

Peptide/Neurotrophin signals
APH1A, APH1B, AVP, BDNF, CCK, CHGA, CRH, CRHBP, FGF1, MIF, NCSTN, NPY, NRG1, NRG2, NRG3, NRG4, NTS, OXT, PCSK1, PCSK2, PDYN, PENK, PSEN1, PSEN2, PSENEN, SCG2, SCG5, SST

G-protein relay
GNA11, GNA12, GNA13, GNA14, GNA15, GNAI1, GNAI2, GNAI3, GNAL, GNAO1, GNAQ, GNAS, GNAT1, GNAZ, GNB1, GNB2, GNB3, GNB4, GNB5, GNG10, GNG11, GNG12, GNG2, GNG3, GNG4, GNG5, GNG7

Unknown
AADACL1, ABHD12, ANXA11, ANXA5, ANXA6, ANXA7, APBA1, APBA2, APBA3, APP, APPBP2, ARL6IP5, ATCAY, BCAS1, C10orf58, C2orf55, C6orf174, C9orf126, DMXL2, DVL1, FLJ45455, FLOT1, FLOT2, FRMPD4, GAP43, GBAS, GDAP1L1, HBA2, HBG1, HRB, KIAA0513, LAMP1, LAMP2, LOC389813, LRRTM1, MAL2, NBEA, NCDN, NCKAP1, NIPSNAP1, OLFM1, PHYHIP, PPP1R1B, PRNP, PRRT1, PRRT3, RIMBP2, RPH3AL, RTN1, RTN3, RTN4, SBF1, SNCA, TMED10, TMEM65, TRAPPC1, TRAPPC3, TRAPPC5, WDR7, WFS1, WNK1

Figures S3a, b and c provide Q-Q plots for all eighteen groups of genes for the ISC_AFFY5, ISC_AFFY6 and GAIN samples.
Figure S3a. Q-Q plots of P-values per synaptic subgroup - ISC_AFFY5

Figure S3b. Q-Q plots of P-values per synaptic subgroup - ISC_AFFY6

Figure S3c. Q-Q plots of P-values per synaptic subgroup - GAIN

Figures S4a, b and c show the distribution of the Σ-log(P) of all permuted datasets, with the red bar denoting the Σ-log(P) of the actual analysis for each functional gene group.

Figure S4a. Distribution of Σ-log(P) of permuted and actual datasets - ISC_AFFY5

Figure S4b. Distribution of Σ-log(P) of permuted and actual datasets - ISC_AFFY6

Figure S4c. Distribution of Σ-log(P) of permuted and actual datasets - GAIN

5. Enrichment analysis of genes previously implicated in schizophrenia

We retrieved a list of all previously implicated genes and tested whether these were enriched in the synaptic groups. In addition, as a positive control, we tested whether the previously implicated genes, taken together as a gene set, were significantly associated with the risk for schizophrenia in the current data. The list was compiled from whole-genome studies and thus did not include genes implicated in earlier candidate gene studies (except when these were replicated in the whole-genome studies).
We used the following criteria for compiling the list of previously implicated genes:

5.1 Signals from GWAS studies

We used the online GWAS catalogue (Hindorff et al, 2009), accessed 14 February 2011, to obtain a list of published GWA studies for schizophrenia only (Lencz et al, 2007; Walsh et al, 2008; Sullivan et al, 2008; Shifman et al, 2008; Kirov et al, 2009; Shi et al, 2009; Stefansson et al, 2009; Need et al, 2009; Purcell et al, 2009; O'Donovan et al, 2008; Athanasiu et al, 2010). Subsequently, we identified all SNPs with P ≤ 1.0 × 10⁻⁵ reported in these GWA studies and mapped these SNPs to protein-coding genes (NCBI build v36.3). SNPs that were not located between the TSS and TES of a gene were mapped to genes located within a 20 kb window, or were excluded from the subsequent analyses if no gene fell within this window. This procedure resulted in the mapping of 97 GWAS signals to 78 unique protein-coding genes.

5.2 Signals from large-scale CNV studies

We used genome-wide CNV studies for schizophrenia in which CNVs implicated a single gene. Most CNV studies report CNVs that overlap or disrupt multiple genes, without providing clear evidence as to which gene is disease-causing. Including all of these genes would introduce considerable error into our analysis. We thus selected only those CNV studies that implicated a single gene per CNV. This resulted in the selection of five genes from two recent genome-wide CNV studies (Levinson et al, 2011; Vacic et al, 2011).

Combining the lists of implicated genes from the genome-wide SNP and CNV studies resulted in a total of 83 unique genes previously associated with schizophrenia. This list is given in Table S5.

Table S5 Complete list of 83 previously implicated genes for schizophrenia obtained from GWAS and CNV studies.
Genes (N=78) implicated from GWAS studies for SCZ published until 14/02/2011
ACSM1; ADAMTSL3; ADIPOR2; AGBL1; AGTRAP; ANK3; ATXN1; C10orf59; C16orf5; C1orf187; C1orf51; C1orf54; C1orf91; C9orf82; CBX2; CCDC60; CDC42; CDH13; CENTD2; CENTG2; CSMD1; DCDC2B; DDX31; DOCK3; EIF3I; EML5; ERBB4; ESAM; FMO3; FXR1; GRID1; GTF3C4; HIST1H2AG; HIST1H2BJ; HIST1H2BK; HIST1H4I; HLA-DQA1; IFT74; ITGB1; LCK; LOC100128797; LOC100131289; LOC440302; LOC646993; MAD1L1; MAST4; MRPS21; MXRA5; MYO18B; NLGN4X; NOS1; NOTCH4; NRGN; NTRK3; OPCML; OR5M1; PCLO; PGBD1; PLAA; POM121L2; PROX2; PTGS2; PTPN21; RAD54L2; RBM15B; RELN; RPGRIP1L; RPP21; TCF4; TMEM16D; TRIM39; TRPA1; TXNRD2; VSIG2; WNT7A; YLPM1; ZNF452; ZNF804A.

Genes (N=5) implicated from CNV studies published until April 2011
NRXN1, C16orf72, VIPR2, BCR, OTUD7A

5.3 Enrichment analysis

Of the 83 previously implicated genes, 72 (86.7%) are expressed in brain (total N brain-expressed genes = 18,029) and 8 (9.6%) belong to the synaptic gene group. We used a Fisher exact test to test for enrichment. We note that there is some overlap between previously published GWAS studies and the data currently used (the current study includes data used in Purcell et al., 2009 and Shi et al., 2009), which introduces some bias into the enrichment analyses. There was a significant overrepresentation of previously implicated genes among brain-expressed genes (P=0.04) and synaptic genes (P=0.03). Subsequently, we tested whether previously implicated genes were enriched in any of the synaptic subgroups.
The Fisher exact test statistic was calculated using three different a priori expectations (see Table S6):

1) Enrichment of previously implicated genes in a subgroup, given 8 observed implicated genes in 1026 synaptic genes: Penr_SYN
2) Enrichment of previously implicated genes in a subgroup, given 72 observed implicated genes in 18029 brain expressed genes: Penr_BRAIN
3) Enrichment of previously implicated genes in a subgroup, given 83 observed implicated genes in 23681 total genes: Penr_ALL

We do note that this enrichment analysis does not take into account the strength of the evidence (i.e. a gene is either associated or not; actual P-values are not taken into account), but merely provides an indication of whether the number of previously implicated genes is higher than expected.

Results indicate that there is enrichment of previously implicated genes in one synaptic subgroup (CAT signaling), which was the third most significant gene group in our original analyses. However, since larger genes have a higher chance of being significantly related to a trait in a GWAS, the selection of previously implicated genes may be biased towards larger genes. Previously implicated genes are indeed much larger than genes not implicated previously (Table S7; P=1.958e-07, Wilcoxon rank sum test). Given that brain-expressed genes, including synaptic genes, are generally larger than non-brain-expressed genes, and that gene sizes of synaptic genes are not significantly different from gene sizes of previously implicated genes (P=0.5661, Wilcoxon rank sum test), the enrichment we see may thus be biased by gene size.

Three previously implicated synaptic genes have large gene sizes. These three genes are all part of the CAT signaling group and we thus note that the significant enrichment in that group needs to be interpreted with caution. Five previously implicated synaptic genes have normal to small gene sizes (see Figure S5).
These are unlikely to have been implicated due to gene size.

Table S6 - Results of enrichment test per functional group. 'Total N GENES' denotes the number of genes within the functional group, N_IMP the number of associated genes that is observed. Functional groups indicated in bold are the functional groups that showed significant association to schizophrenia in the functional gene group analysis. Significant enrichment (after Bonferroni correction) is indicated with an asterisk.

Functional group                    Total N GENES*  N_IMP  Penr_SYN  Penr_BRAIN  Penr_ALL
Intracellular signal transduction   150             1      0.7189    0.4527      0.4104
Excitability                        59              0      1         1           1
CAT signaling                       81              4      0.0020*   0.0003*     0.0002*
Endocytosis                         26              0      1         1           1
Structural plasticity               98              2      0.1732    0.0585      0.0465
GPCR signaling                      41              0      1         1           1
'Unknown'                           61              0      1         1           1
Protein cluster                     47              1      0.3137    0.1716      0.1523
Tyrosine kinase signaling           7               0      1         1           1
Cell metabolism                     57              0      1         1           1
Neurotransmitter metabolism         29              0      1         1           1
Intracellular trafficking           80              0      1         1           1
LGIC signaling                      36              0      1         1           1
Exocytosis                          86              0      1         1           1
RPSFB                               71              0      1         1           1
Ion balance/transport               43              0      1         1           1
Peptide/Neurotrophin signals        28              0      1         1           1
G-protein relay                     27              0      1         1           1

*Please note that although the totals in the Table add up to 1027, the total number of synaptic genes is 1026, because RAB3D is included in both the exocytosis and the intracellular trafficking group.

Table S7 Characteristics of previously implicated and non-implicated genes

                                   Number of genes*  Mean gene size (bp)  Median gene size (bp)  SD gene size  Mean number of exons  SD number of exons
Synaptic genes                     1,026             121,553              46,662                 210,955       13.66                 12.05
Non-synaptic genes                 22,655            47,535               16,346                 102,451       8.82                  9.54
Previously implicated genes        83                213,965              58,048                 356,585       15.02                 14.16
Non-previously implicated genes    23,598            50,192               17,127                 108,227       9.01                  9.69

*Although the total N genes is 23681, we here used 23463 genes, as gene size or exon information was incomplete for 218 genes. None of these were synaptic genes or previously implicated genes.
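The per-subgroup enrichment test of Table S6 can be illustrated with a short sketch. It assumes the Fisher exact test was one-sided (over-representation only), in which case the P-value equals the upper tail of a hypergeometric distribution; the text does not state the sidedness, so this is an assumption, as is the treatment of all subgroup genes as members of each background set. The function and variable names are hypothetical, and the counts are those given for the CAT signaling group.

```python
from math import comb

def enrichment_p(n_hits, group_size, n_implicated, n_background):
    """One-sided Fisher exact P-value: probability of seeing at least
    n_hits implicated genes in a group of group_size genes drawn from a
    background of n_background genes, n_implicated of which are implicated.
    (One-sidedness is an assumption, not stated in the source.)"""
    upper = min(group_size, n_implicated)
    total = comb(n_background, group_size)
    tail = sum(
        comb(n_implicated, i) * comb(n_background - n_implicated, group_size - i)
        for i in range(n_hits, upper + 1)
    )
    return tail / total

# Background sets as described in section 5.3: (implicated genes, total genes)
backgrounds = {
    "Penr_SYN": (8, 1026),
    "Penr_BRAIN": (72, 18029),
    "Penr_ALL": (83, 23681),
}

# CAT signaling group: 81 genes, 4 of them previously implicated
cat_signaling = {
    name: enrichment_p(4, 81, n_imp, n_bg)
    for name, (n_imp, n_bg) in backgrounds.items()
}

for name, p in cat_signaling.items():
    print(f"{name}: {p:.4f}")
```

Under these assumptions the three values come out close to those listed for CAT signaling in Table S6, and a group with no implicated genes yields P = 1, as in the table.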
Figure S5: Gene sizes of synaptic genes and previously implicated synaptic genes.

5.4 Gene-group analysis using previously implicated genes

As a positive control we conducted a gene-group analysis using the total set of 83 previously implicated genes as a group. As the current dataset overlaps with the GWA studies on which the list was based, we expected to find a significant association with this gene group. This was confirmed, with a combined P-value of 3.74E-5 for the self-contained test and combined P-values all < 0.05 for the competitive tests (Table S8). Figure S6 provides Q-Q plots for all SNPs in previously implicated genes across the three samples.

Table S8: Results of testing for total set of previously implicated genes (N=83)

                                   ISC AFFY5  ISC AFFY6  GAIN AFFY6  ALL (PCOMB)
N genes                            63         71         69          -
N SNPs                             2833       6766       6838        -
Σ-log(P)                           1340       3134       3716        -
PEMP                               0.0344     0.0489     <0.0001*    3.74E-05
Competitive Pemp Control Method1   0.12       0.09       <0.01**     0.0036
Competitive Pemp Control Method2   0.26       0.15       <0.01**     0.0147
Competitive Pemp Control Method3   0.26       0.13       <0.01**     0.0127
Competitive Pemp Control Method4   0.13       0.23       <0.01**     0.0114
Competitive Pemp Control Method5   0.37       0.16       0.01        0.0247

* limited at 0.0001 due to 10000 permutations; ** limited at 0.01 due to 100 draws

Figure S6: Q-Q plots for all SNPs in previously implicated genes across three samples.
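To make the Σ-log(P) and PEMP entries of Table S8 concrete, the following is a minimal sketch of how an empirical P-value can be derived from a permutation distribution. Several details are assumptions rather than the authors' stated procedure: the logarithm is taken to be base 10, the empirical P-value is taken to be the fraction of permuted statistics at least as large as the observed one, and the permuted statistics here are simulated from uniform P-values purely as a stand-in for real permuted datasets. All function names are hypothetical.

```python
import math
import random

def sum_neg_log_p(p_values):
    """Sum of -log10(P) over the SNPs of a gene group (base 10 is an assumption)."""
    return sum(-math.log10(p) for p in p_values)

def empirical_p(observed_stat, permuted_stats):
    """Fraction of permuted statistics >= the observed statistic.

    Returns (p, at_floor). When no permuted statistic reaches the observed
    value, p is reported as 1/n, the smallest value resolvable with n
    permutations, and at_floor is True (to be read as 'p < 1/n')."""
    n = len(permuted_stats)
    exceed = sum(1 for s in permuted_stats if s >= observed_stat)
    if exceed == 0:
        return 1.0 / n, True
    return exceed / n, False

# Toy data: four SNP P-values for the 'actual' analysis
observed = sum_neg_log_p([0.001, 0.02, 0.3, 0.04])

# Stand-in for 10000 permuted datasets: uniform P-values for the same 4 SNPs
rng = random.Random(0)
permuted = [
    sum_neg_log_p([rng.uniform(1e-6, 1.0) for _ in range(4)])
    for _ in range(10000)
]

p_emp, at_floor = empirical_p(observed, permuted)
print(f"sum -log10(P) = {observed:.2f}, "
      f"empirical P {'<' if at_floor else '='} {p_emp:.4f}")
```

With 10000 permutations the smallest reportable value is 0.0001, which corresponds to the first footnote of Table S8; with 100 draws the floor is 0.01.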
6. Genetic heterogeneity and gene group robustness

Schematic view of functional gene group configurations in eight individuals, four unaffected and four affected. The nodes represent the gene products, while the edges between the nodes indicate the physical interactions between them. Unaffected individuals carry mutations (red nodes) in only a few genes. These do not lead to a notable difference in the functioning of the group, owing to the robustness of the functional group. In the four affected individuals many genes carry a mutation, and in each individual this can be a different set of genes. However, as the set of genes carrying a mutation is large, other genes can no longer act as a backup. As a consequence the robustness of the functional group is affected, and dysfunction of the gene group occurs.

Figure S7 Heterogeneity at the gene level underlies robust functional gene groups: a model

7. Graphical representation of significant functional gene groups

We used the STRING database to graphically represent the known physical and functional interactions between genes included in the subgroups 'intracellular signal transduction', 'excitability' and 'cell adhesion and transsynaptic signaling molecules'. The lines in the graphs (Figures S8-S10) represent the confidence of the interaction between the two connected genes, where thicker lines represent a higher confidence.

Figure S8: Graphical view of known associations between the genes within the intracellular signal transduction group.

Figure S9: Graphical view of known associations between the genes within the excitability group.

Figure S10: Graphical view of known associations between the genes within the CAT signalling group.

8. Web resources

The URLs for data presented herein are as follows:
Genetics Cluster Computer: http://www.geneticcluster.org
dbSNP: http://www.ncbi.nlm.nih.gov/SNP
R Software: http://www.r-project.org
Plink software: http://pngu.mgh.harvard.edu/~purcell/plink/
String interaction database: http://string-db.org/
GAIN QA/QC software package: http://www.sph.umich.edu/csg/abecasis/GainQC/

9. References

Athanasiu L, Mattingsdal M, Kahler AK, Brown A, Gustafsson O, Agartz I et al. Gene variants associated with schizophrenia in a Norwegian genome-wide study are replicated in a large European cohort. J Psychiatr Res 2010; 44: 748-753.
Hindorff LA, Sethupathy P, Junkins HA, Ramos EM, Mehta JP, Collins FS et al. Potential etiologic and functional implications of genome-wide association loci for human diseases and traits. Proc Natl Acad Sci U S A 2009; 106: 9362-9367.
Jia P, Wang L, Meltzer HY, Zhao Z. Common variants conferring risk of schizophrenia: a pathway analysis of GWAS data. Schizophr Res 2010; 122: 38-42.
Kirov G, Zaharieva I, Georgieva L, Moskvina V, Nikolov I, Cichon S et al. A genome-wide association study in 574 schizophrenia trios using DNA pooling. Mol Psychiatry 2009; 14: 796-803.
Lencz T, Morgan TV, Athanasiou M, Dain B, Reed CR, Kane JM et al. Converging evidence for a pseudoautosomal cytokine receptor gene locus in schizophrenia. Mol Psychiatry 2007; 12: 572-580.
Levinson DF, Duan J, Oh S, Wang K, Sanders AR, Shi J et al. Copy number variants in schizophrenia: confirmation of five previous findings and new evidence for 3q29 microdeletions and VIPR2 duplications. Am J Psychiatry 2011; 168: 302-316.
Need AC, Ge D, Weale ME, Maia J, Feng S, Heinzen EL et al. A genome-wide investigation of SNPs and CNVs in schizophrenia. PLoS Genet 2009; 5: e1000373.
O'Donovan MC, Craddock N, Norton N, Williams H, Peirce T, Moskvina V et al. Identification of loci associated with schizophrenia by genome-wide association and follow-up. Nat Genet 2008; 40: 1053-1055.
Purcell SM, Wray NR, Stone JL, Visscher PM, O'Donovan MC, Sullivan PF et al. Common polygenic variation contributes to risk of schizophrenia and bipolar disorder. Nature 2009; 460: 748-752.
Raychaudhuri S, Korn JM, McCarroll SA, Altshuler D, Sklar P, Purcell S et al. Accurately assessing the risk of schizophrenia conferred by rare copy-number variation affecting genes with brain function. PLoS Genet 2010; 6.
Shi J, Levinson DF, Duan J, Sanders AR, Zheng Y, Pe'er I et al. Common variants on chromosome 6p22.1 are associated with schizophrenia. Nature 2009; 460: 753-757.
Shifman S, Johannesson M, Bronstein M, Chen SX, Collier DA, Craddock NJ et al. Genome-wide association identifies a common variant in the reelin gene that increases the risk of schizophrenia only in women. PLoS Genet 2008; 4: e28.
Stefansson H, Ophoff RA, Steinberg S, Andreassen OA, Cichon S, Rujescu D et al. Common variants conferring risk of schizophrenia. Nature 2009; 460: 744-747.
Sullivan PF, Lin D, Tzeng JY, van den Oord E, Perkins D, Stroup TS et al. Genomewide association for schizophrenia in the CATIE study: results of stage 1. Mol Psychiatry 2008; 13: 570-584.
Vacic V, McCarthy S, Malhotra D, Murray F, Chou HH, Peoples A et al. Duplications of the neuropeptide receptor gene VIPR2 confer significant risk for schizophrenia. Nature 2011; 471: 499-503.
Walsh T, McClellan JM, McCarthy SE, Addington AM, Pierce SB, Cooper GM et al. Rare structural variants disrupt multiple genes in neurodevelopmental pathways in schizophrenia. Science 2008; 320: 539-543.