Supplementary materials

Title: EigenGWAS: finding loci under selection through genome-wide association studies of eigenvectors in structured populations

Guo-Bo Chen1, Sang Hong Lee1,2, Zhi-Xiang Zhu3, Beben Benyamin1, Matthew R. Robinson1

1Queensland Brain Institute, The University of Queensland, Brisbane, QLD 4072, Australia;
2School of Environmental and Rural Science, The University of New England, Armidale, NSW 2351, Australia;
3SPLUS Game, Guangzhou, Guangdong 510665, China

Figure S1 The distribution of the entries in the genetic relatedness matrix for HapMap3 and POPRES.
Figure S2 Correlation for the SNP effects estimated using EigenGWAS and BLUP for POPRES samples.
Figure S3 The correlation between EigenGWAS chi-square tests and $F_{st}$ for HapMap3 samples.
Figure S4 EigenGWAS for simulated data. The simulated data has 2,000 samples and 500,000 biallelic markers.
Figure S5 Distribution of p-values for EigenGWAS for simulated data (Scheme I).
Figure S6 The correlation between chi-square test statistics for EigenGWAS SNP effects and their $F_{st}$.
Figure S7 Validation of the adjustment for the test statistic with the largest eigenvalue.
Figure S8 The statistical power for EigenGWAS.
Figure S9 Eigenvector 1 against eigenvector 2 for 88 TSI samples and 112 CEU samples.
Figure S10 QQ plot for EigenGWAS $\chi^2$ statistics against the $\chi^2$ observed from the null for the CEU&TSI cohort.
Figure S11 Eigenvector 1 against eigenvector 2 for POPRES European samples.
Figure S12 QQ plot for EigenGWAS $\chi^2$ statistics against the $\chi^2$ observed from the null for POPRES.
Table S1 The correlation between $F_{st}$ and chi-square test for SNP effects on each EigenGWAS.
Table S2 Gene discovery using EigenGWAS for POPRES using 234,127 SNPs.
Table S3 EigenGWAS results for POPRES based on 234,127 SNPs.

Figure S1 The distribution of the entries in the genetic relatedness matrix for HapMap3 and POPRES.
a) HapMap3 is a mix of various ethnicities, so many of its off-diagonal elements are large.
b) POPRES Europeans form a relatively homogeneous population, so the off-diagonal elements are very close to zero.

Figure S2 Correlation for the SNP effects estimated using EigenGWAS and BLUP for POPRES samples. The x-axis represents the EigenGWAS estimates of the SNP effects, and the y-axis represents the BLUP estimates. The left panel shows $E_1$ to $E_5$; the right panel shows $E_6$ to $E_{10}$. In every case the correlation is nearly 1.

Figure S3 The correlation between EigenGWAS chi-square tests and $F_{st}$ for HapMap3 samples. The correlations between EigenGWAS chi-square tests and $F_{st}$ for $E_1$ to $E_{10}$ (left panel, from top to bottom, $E_1$ to $E_5$; right panel, from top to bottom, $E_6$ to $E_{10}$) were 0.965, 0.819, 0.094, 0.146, 0.155, 0.186, 0.244, 0.159, 0.142, and 0.156, respectively.

Figure S4 EigenGWAS for simulated data (Scheme I). The simulated data has 2,000 samples and 500,000 biallelic markers. Markers are in linkage equilibrium, and the minor allele frequency is drawn from a uniform distribution between 0.01 and 0.5. The eigenvalues associated with the first ten eigenvectors are 1.12935, 1.12895, 1.12842, 1.12763, 1.12725, 1.12622, 1.12605, 1.12595, 1.12546, and 1.1253, respectively.

Figure S5 Distribution of p-values for EigenGWAS for simulated data (Scheme I). The simulated data has 2,000 samples and 500,000 biallelic markers. Markers are in linkage equilibrium, and the minor allele frequency is drawn from a uniform distribution between 0.01 and 0.5. The eigenvalues associated with the first ten eigenvectors are 1.12935, 1.12895, 1.12842, 1.12763, 1.12725, 1.12622, 1.12605, 1.12595, 1.12546, and 1.1253, respectively. The p-values have been corrected for $\lambda_{GC}$.

Figure S6 The correlation between chi-square test statistics for EigenGWAS SNP effects and their $F_{st}$. The samples were split into two groups according to whether $E_i > 0$ or $E_i \le 0$, and $F_{st}$ was calculated between these two groups. The average correlation between the chi-square tests and $F_{st}$ was around 0.66.
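The comparison described for Figure S6 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' pipeline: the genotypes come from a small two-population Balding-Nichols simulation (40 + 40 samples, 300 SNPs, hypothetical values chosen for speed), the leading eigenvector is obtained by power iteration, the chi-square statistic is the 1-d.f. test from regressing the eigenvector on each SNP, and $F_{st}$ uses a Nei-style estimator between the two groups defined by the sign of the eigenvector. All function names are made up for this example.

```python
import math
import random


def simulate_genotypes(n1, n2, m, fst, rng):
    """Two subpopulations under a Balding-Nichols model (illustrative assumption)."""
    geno = [[0] * m for _ in range(n1 + n2)]
    for l in range(m):
        p = rng.uniform(0.1, 0.5)  # ancestral allele frequency
        a, b = p * (1 - fst) / fst, (1 - p) * (1 - fst) / fst
        freqs = [rng.betavariate(a, b) for _ in range(2)]
        for i in range(n1 + n2):
            q = freqs[0] if i < n1 else freqs[1]
            geno[i][l] = (rng.random() < q) + (rng.random() < q)
    return geno


def standardise(geno):
    """Centre and scale each SNP column; monomorphic SNPs are dropped."""
    n = len(geno)
    z_cols, kept = [], []
    for l in range(len(geno[0])):
        col = [row[l] for row in geno]
        mean = sum(col) / n
        var = sum((g - mean) ** 2 for g in col) / n
        if var > 0:
            sd = math.sqrt(var)
            z_cols.append([(g - mean) / sd for g in col])
            kept.append(l)
    return z_cols, kept


def leading_eigenvector(z_cols, n, rng, iters=30):
    """Power iteration on the relatedness matrix Z Z^T / m without forming it."""
    v = [rng.gauss(0, 1) for _ in range(n)]
    for _ in range(iters):
        new = [0.0] * n
        for col in z_cols:
            u = sum(c * x for c, x in zip(col, v))
            for i, c in enumerate(col):
                new[i] += c * u
        norm = math.sqrt(sum(x * x for x in new))
        v = [x / norm for x in new]
    return v


def chi2_and_fst(z_cols, geno, kept, e):
    """Per-SNP regression chi-square on e, and Fst between the groups e > 0 and e <= 0."""
    n = len(e)
    e_mean = sum(e) / n
    e_c = [x - e_mean for x in e]
    e_ss = sum(x * x for x in e_c)
    grp1 = [i for i in range(n) if e[i] > 0]
    grp2 = [i for i in range(n) if e[i] <= 0]
    if not grp1 or not grp2:
        raise ValueError("both groups must be non-empty")
    w = len(grp1) / n
    chi2, fst = [], []
    for col, l in zip(z_cols, kept):
        r = sum(c * x for c, x in zip(col, e_c)) / math.sqrt(n * e_ss)
        r2 = min(r * r, 1 - 1e-12)
        chi2.append((n - 2) * r2 / (1 - r2))  # squared t-statistic of the slope
        p1 = sum(geno[i][l] for i in grp1) / (2 * len(grp1))
        p2 = sum(geno[i][l] for i in grp2) / (2 * len(grp2))
        p_bar = w * p1 + (1 - w) * p2
        fst.append(w * (1 - w) * (p1 - p2) ** 2 / (p_bar * (1 - p_bar)))
    return chi2, fst


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(1)
geno = simulate_genotypes(40, 40, 300, 0.1, rng)
z_cols, kept = standardise(geno)
e1 = leading_eigenvector(z_cols, len(geno), rng)
chi2, fst = chi2_and_fst(z_cols, geno, kept, e1)
corr = pearson(chi2, fst)
print("correlation between chi-square and Fst:", round(corr, 3))
```

Because this toy data set contains real population structure, the correlation it prints should not be compared directly with the value of about 0.66 reported for the Scheme I simulation, which has no subdivision.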
Figure S7 Validation of the adjustment for the test statistics. Two subdivisions were generated, and the average $F_{st}$ between them was 0.02 across 10,000 loci. For the top panel, subdivisions 1 and 2 had 1,000 and 200 individuals, respectively; $\hat{\lambda}_1 = 14.34$, and its expected value was 14.46. After the adjustment for $\lambda_1$, the distribution of the test statistics follows the null distribution very well. For the bottom panel, subdivisions 1 and 2 had 1,000 and 500 samples, respectively; $\hat{\lambda}_1 = 26.96$, and its expected value was 27.50. The black squares are the test statistics after the adjustment for $\hat{\lambda}_1$, whereas the red ones are the test statistics calculated from theory, $E(\chi^2_{1.l}) = 4nw(1-w)F_{st}$, in which $n$ is the total sample size of the two subdivisions, $w = n_1/n$ is the proportion of samples in subdivision 1 ($w = 0.83$ for the top panel and $0.67$ for the bottom panel), and $F_{st}$ is Nei's measure of $F_{st}$. The agreement between the red and black points validates our theory. For more details please refer to the main text.

[Figure S8: two panels. x-axis: Selection ($F_{st}$), from 0.0 to 1.0; y-axis: Power, from 0.0 to 1.0; legend: $F_{st}$ = 0.01, 0.02, 0.05, 0.1.]

Figure S8 The statistical power for EigenGWAS. The x-axis represents the strength of selection at a locus, in terms of $F_{st}$. $\bar{F}_{st}$ indicates population stratification. The y-axis represents the statistical power evaluated from $\chi^2_1$ with non-centrality parameter $\frac{F_{st}}{\bar{F}_{st}} - 1$. The p-value thresholds for significance at the 0.05 level were 0.05/10,000 for the left panel and 0.05/500,000 for the right panel.

Figure S9 Eigenvector 1 against eigenvector 2 for 88 TSI samples and 112 CEU samples. Eigenvectors were generated directly from 919,313 SNPs over 88 TSI and 112 CEU samples from HapMap.

Figure S10 QQ plot for EigenGWAS $\chi^2$ statistics against the $\chi^2$ observed from the null for the CEU&TSI cohort.
The left panel shows the original EigenGWAS for the CEU&TSI cohort, which had $\lambda_{GC} = 1.725$, and the right panel shows the results corrected for $\lambda_{GC}$.

Figure S11 Eigenvector 1 against eigenvector 2 for POPRES European samples.

Figure S12 QQ plot for EigenGWAS $\chi^2$ statistics against the $\chi^2$ observed from the null for POPRES. The left panel shows the original EigenGWAS for the POPRES cohort, which had $\lambda_{GC} = 5.00$, and the right panel shows the results corrected for $\lambda_{GC}$.

Table S1 The correlation between $F_{st}$ and chi-square test for SNP effects on each EigenGWAS

Phenotype  $E_1$     $E_2$     $E_3$     $E_4$     $E_5$     $E_6$     $E_7$     $E_8$     $E_9$     $E_{10}$
POPRES     0.887     0.933     0.904     0.987     0.765     0.994     0.977     0.963     0.917     0.927
p-value    <1e-16    <1e-16    <1e-16    <1e-16    <1e-16    <1e-16    <1e-16    <1e-16    <1e-16    <1e-16
HapMap     0.965     0.819     0.094     0.146     0.155     0.186     0.244     0.159     0.142     0.156
p-value    <1e-16    <1e-16    <1e-16    <1e-16    <1e-16    <1e-16    <1e-16    <1e-16    <1e-16    <1e-16

Table S2 Gene discovery using EigenGWAS for POPRES using 234,127 SNPs

Gene   Phenotype  Lead SNP   Alleles  p-value    Allele frequency (samples)  $F_{st}$  Position (CHR:BP)
LCT    $E_1$      rs2304371  C/T      4.19e-46   0.817 (1373):0.676 (1062)   0.110     2:135817629
MCM6   $E_1$      rs309180   G/A      5.223e-93  0.706 (1386):0.483 (1080)   0.102     2:135856685
HERC2  $E_1$      rs8039195  A/G      8.902e-36  0.694 (1385):0.765 (1080)   0.055     15:28268218

Table S3 EigenGWAS results for POPRES based on 234,127 SNPs

$E_i$  Eigenvalue  $\lambda_{GC}$  #GWAS hits
1      5.104       5.282           5373
2      2.207       2.118           544
3      2.157       1.703           713
4      2.077       1.467           481
5      1.971       1.726           447
6      1.871       1.303           464
7      1.843       1.469           513
8      1.818       1.660           295
9      1.807       1.637           309
10     1.798       1.584           240
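The captions for Figures S10 and S12 refer to test statistics corrected for $\lambda_{GC}$. A minimal sketch of that kind of correction is given below, assuming the conventional median-based definition of the genomic inflation factor (observed median chi-square divided by the median of a 1-d.f. chi-square, about 0.4549); the supplement itself does not state which estimator was used. The input statistics are simulated null values inflated by 1.725, the value quoted for the CEU&TSI cohort, and the function names are hypothetical.

```python
import math
import random
import statistics

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def lambda_gc(chi2):
    """Median-based genomic inflation factor (assumed definition)."""
    return statistics.median(chi2) / CHI2_1DF_MEDIAN


def chi2_pvalue(x):
    """Upper-tail p-value of a 1-d.f. chi-square: P(X > x) = erfc(sqrt(x / 2))."""
    return math.erfc(math.sqrt(x / 2.0))


def gc_correct(chi2):
    """Divide every statistic by lambda_GC and recompute the p-values."""
    lam = lambda_gc(chi2)
    adjusted = [x / lam for x in chi2]
    return lam, adjusted, [chi2_pvalue(x) for x in adjusted]


# Toy input: null 1-d.f. chi-square statistics inflated by a constant factor.
rng = random.Random(7)
chi2 = [1.725 * rng.gauss(0, 1) ** 2 for _ in range(50000)]

lam, adjusted, pvals = gc_correct(chi2)
print("estimated lambda_GC:", round(lam, 3))
print("lambda_GC after correction:", round(lambda_gc(adjusted), 3))
```

After the correction the median of the adjusted statistics matches the null median by construction, which is what the right-hand QQ panels are meant to show.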