snpEff: Evaluation of Available Versions

David Roazen
Genome Sequencing and Analysis
Medical and Population Genetics
January 3, 2012

Missense/Silent Ratio in a 1000G Gencode-Annotated VCF, and with snpEff Run on the Same Variants

Annotation source                                           Missense   Silent   Missense/Silent
1000G Phase 1 SNP calls with Gencode 7 coding annotations [1] 299367   208171   1.44
snpEff 2.0.2 + GRCh37.63                                      341742   146079   2.34
snpEff 2.0.4 RC3 + GRCh37.64                                  297106   202174   1.47
snpEff 2.0.5 + GRCh37.64                                      297106   202174   1.47
snpEff 2.0.5 + GRCh37.65                                      341486   150938   2.26

[1] ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/technical/working/20111220_coding_annotation_phase1/

Overall Concordance with the 1000G Gencode SNP Annotations

Annotation source               Silent    Missense   Nonsense
snpEff 2.0.2 + GRCh37.63        70.13%    97.21%     98.77%
snpEff 2.0.4 RC3 + GRCh37.64    97.11%    99.0%      98.40%
snpEff 2.0.5 + GRCh37.64        97.11%    99.0%      98.40%
snpEff 2.0.5 + GRCh37.65        72.48%    98.05%     98.34%

Summary:

• Both snpEff 2.0.2 + GRCh37.63 and snpEff 2.0.5 + GRCh37.65 produce an abnormally high Missense:Silent ratio, with elevated levels of Missense mutations across the entire spectrum of allele counts. They also have a relatively low (~70%) level of concordance with the 1000G Gencode annotations for Silent mutations.

• This suggests that these combinations of snpEff and database versions incorrectly annotate many Silent mutations as Missense.

• snpEff 2.0.4 RC3 + GRCh37.64 and snpEff 2.0.5 + GRCh37.64 produce a Missense:Silent ratio in line with expectations, and have a very high (~97%-99%) level of concordance with the 1000G Gencode annotations across all categories.
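
As a rough illustration of the comparison behind the summary, the sketch below recomputes each Missense/Silent ratio from the reported counts and flags the snpEff/database combinations whose ratio departs from the 1000G Gencode baseline. The counts are copied from the table; the function names and the 25% relative tolerance are assumptions chosen for illustration, not part of the original evaluation.

```python
# Missense and Silent counts as reported in the Missense/Silent table.
BASELINE = "1000G Phase 1 + Gencode 7"
COUNTS = {
    BASELINE: (299367, 208171),
    "snpEff 2.0.2 + GRCh37.63": (341742, 146079),
    "snpEff 2.0.4 RC3 + GRCh37.64": (297106, 202174),
    "snpEff 2.0.5 + GRCh37.64": (297106, 202174),
    "snpEff 2.0.5 + GRCh37.65": (341486, 150938),
}


def missense_silent_ratio(missense, silent):
    """Return the Missense/Silent ratio for one annotation source."""
    return missense / silent


def flag_suspect(counts, baseline_label, tolerance=0.25):
    """List sources whose ratio differs from the baseline ratio by more
    than `tolerance` (relative). The tolerance is an illustrative
    assumption, not a threshold used in the original evaluation."""
    base = missense_silent_ratio(*counts[baseline_label])
    suspects = []
    for label, (missense, silent) in counts.items():
        if label == baseline_label:
            continue
        ratio = missense_silent_ratio(missense, silent)
        if abs(ratio - base) / base > tolerance:
            suspects.append(label)
    return suspects


for label, (missense, silent) in COUNTS.items():
    print(f"{label}: {missense_silent_ratio(missense, silent):.2f}")

print("Suspect combinations:", flag_suspect(COUNTS, BASELINE))
```

Under this assumed tolerance, the two combinations flagged are the same two that the summary identifies as over-calling Missense mutations.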