Final Results
Genome Assembly Team
Kelley Bullard, Henry Dewhurst, Kizee Etienne, Esha Jain, VivekSagar KR, Benjamin Metcalf, Raghav Sharma, Charles Wigington, Juliette Zerick

Original Pipeline
[Flowchart: original assembly pipeline. Labels recovered from the diagram, grouped by stage.]
• Inputs: 454 raw reads; Illumina raw reads
• PRE-PROCESSING: FastQC, PRINSEQ, NGS QC, samstats; read stats; statistical analysis
• DENOVO ASSEMBLY (Illumina / 454 / hybrid de novo assembly, with parameter optimization)
  • Illumina DeNovo: ALLPATHS-LG, SOAPdenovo, Velvet, Taipan, SUTTA
  • 454 DeNovo: Newbler, CABOG, SUTTA
  • Hybrid DeNovo: Ray, MIRA
  • Evaluation: GAGE, Hawkeye
• CONTIG MERGING: all possible combinations of the best 3 contigs * 3; align Illumina reads against 454 contigs; MacVector, CLC Workbench, Minimus, MAIA; GAGE evaluation
• REFERENCE SELECTION: published genomes from public databases (V. vulnificus YJ016, V. vulnificus CMCP6, V. vulnificus MO6-24/O); align Illumina reads against the reference (bwa); compare mapping statistics; chosen reference
• REFERENCE BASED ASSEMBLY: AMOScmp; Illumina/(454?) reference based assembly; reference genome; unmapped reads; MUMmer; reference evaluation (dnadiff)
• GENOME FINISHING: scaffolds (GRASS, built-in); gap filling; PAGIT; Mauve; MUMmer; nucleotide identity; draft / finished genome
• LEGEND: Info.; Process

Read Visualization – spot the differences
Comparison of 454 reads for 08-2462 (low coverage) and 2541-90 (improved coverage).

Pipeline / Read Processing / Assembler Results / Contig Merging / Assembler Review / Pipeline / Final Results

Read Visualization – more is better!
Nav 08-2462 454 reads compared to Nav 08-2462 Illumina reads.

Read Visualization – cousins or siblings?
Nav_2541-90 and Vul_06-2432 (454 and Illumina reads) coverage comparison.
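The read-visualization slides compare strains by coverage. As a rough illustration of what "low" versus "improved" coverage means numerically, the sketch below computes a per-base depth profile, the mean depth, and the breadth of coverage from a list of alignment intervals. The function names, the half-open `(start, end)` interval convention, and the toy numbers are assumptions made for this example; they are not taken from the team's pipeline.

```python
def depth_profile(alignments, genome_length):
    """Per-base read depth from half-open (start, end) alignment intervals.

    Hypothetical helper: coordinates are 0-based and must lie within the genome.
    """
    diff = [0] * (genome_length + 1)
    for start, end in alignments:
        if not 0 <= start <= end <= genome_length:
            raise ValueError(f"interval out of range: {(start, end)}")
        diff[start] += 1   # a read begins covering here
        diff[end] -= 1     # and stops covering here
    profile, running = [], 0
    for i in range(genome_length):
        running += diff[i]
        profile.append(running)
    return profile


def coverage_summary(profile):
    """Return (mean depth, fraction of positions covered by at least one read)."""
    if not profile:
        raise ValueError("empty depth profile")
    mean_depth = sum(profile) / len(profile)
    breadth = sum(1 for d in profile if d > 0) / len(profile)
    return mean_depth, breadth


# Toy example: two reads on a 10 bp "genome".
profile = depth_profile([(0, 5), (3, 8)], genome_length=10)
mean_depth, breadth = coverage_summary(profile)
print(profile, mean_depth, breadth)
```

In practice the intervals would come from a read mapper's output, and breadth of coverage is the kind of figure reported as a percentage of the reference covered.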
Data Quality
Effect of pre-processing data (using prinseq)

[Four quality-report tables, each comparing non-preprocessed | pre-processed reads per strain. The per-cell results were graphical and are not recoverable.]
Metrics (rows) in every table: Per Base Seq. Quality; Per Seq. Quality Score; Per Base Seq. Content; Per Base GC Content; Per Seq. GC Content; Per Base N Content; Seq. Length Dist.; Seq. Dup. Levels; Overrepresented Seqs.; Kmer Content
• V. navarrensis (454): strains 2423-01, 08-2462, 2541-90, 2756-81
• V. vulnificus (454): strains 2009V_1368, 06-2432, 08-2435, 08-2439, 07-2444
• V. navarrensis (Illumina): strains 2423-01, 08-2462, 2541-90, 2756-81
• V. vulnificus (Illumina): strains 2009V_1368, 06-2432, 08-2435, 08-2439, 07-2444

Assembly
Reference-guided and de-Novo

Reference guided assembly
Comparison of reference guided assembly vs de-novo assembly
ARE – Assembly Score

Reference-guided vs de-Novo assembly
[Bar chart. y-axis: ARE (0 to 90). Series: 454 (Vul_06-2432), 454 (Nav_2541-90), Illumina (Vul_06-2432), Illumina (Nav_2541-90).]

Summary of Reference-guided assembly
• Using the V. vulnificus (CMCP6) reference strain: 84% coverage
• De novo assemblers overall gave higher assembly scores than reference-based assembly

De Novo Assembly: Newbler (denovo)
[Chart. y-axis: ARE (0 to 100). x-axis: K-MER SIZE (40, 50, 100). Series: Nav_2541-90, Vul_06-2432.]

De Novo Assembly: CABOG
[Chart. y-axis: ARE (0 to 50). x-axis: K-MER Size (15, 22, 25). Series: Nav_2541-90, Vul_06-2432.]

De Novo Assembly: SOAPdenovo
[Chart. y-axis: ARE (-0.5 to 4). x-axis: K-MER Size (20 to 70). Series: Nav_2541-90, Vul_06-2432.]

De Novo Assembly: Velvet
[Chart. y-axis: ARE (0 to 7). x-axis: K-MER Size (19, 25, 31). Series: Nav_2541-90, Vul_06-2432.]

De-Novo Assembler Comparison (Optimal Parameters)
[Bar chart. y-axis: ARE (0 to 100). Series: 454 (Vul_06-2432), Illumina (Vul_06-2432), 454 (Nav_2541-90), Illumina (Nav_2541-90), Hybrid (Vul_06-2432), Hybrid (Nav_2541-90).]
Assemblers on the x-axis: CABOG, Newbler (dn), Ray, Ray (hybrid), SOAPdn, Velvet.

Final Results – V. vulnificus
[Chart. Assemblers plotted: Velvet, CABOG, Ray (hybrid), SOAPdenovo, Newbler (ref;Illumina), Ray (Illumina), Newbler (ref;454), Ray (454), AMOScmp. Axis tick values are not reliably recoverable.]
Graph comparing assemblers on 3 criteria: Assembly Score, Span Ratio, 1/(Break Points). Higher scores on all criteria are preferable. Newbler (dn) has been removed to show variance in other tools.

Final Results – V. vulnificus
[Chart: the same comparison, shown without the note about Newbler (dn) being removed.]
Graph comparing assemblers on 3 criteria: Assembly Score, Span Ratio, 1/(Break Points). Higher scores on all criteria are preferable.

Summary of de-Novo results
• OLC assemblers (CABOG, Newbler) differed considerably in ARE from de Bruijn graph assemblers (SOAPdenovo, Velvet)
• The hybrid assembler, Ray, did not perform as well in terms of assembly score

Merging: Vul_06-2432
[Pairwise matrix: value for each merge of two assemblers' contigs. Column numbers refer to the numbered rows. ND as in the original.]
                                 (1)     (2)     (3)     (4)     (5)     (6)     (7)     (8)     (9)    (10)
(1) AMOScmp                        -  164.00  234.69    6.35    4.69   63.51   55.13   64.51   44.38   67.22
(2) CABOG                     164.00       -  225.12  101.30   62.66   73.23   93.88   98.11   75.98  113.08
(3) Newbler (dn;454)          234.69  221.89       -    5.48      ND  311.98      ND  419.76  104.46  127.01
(4) Newbler (ref;454)           6.35   99.30    5.48       -    1.44   67.72   64.99   72.79   35.07   72.34
(5) Newbler (ref;Illumina)      4.69   62.66      ND    1.44       -   35.28      ND      ND      ND      ND
(6) Ray (454)                  63.50   72.56  311.99   67.72   35.28       -   33.81   49.94   22.92   37.68
(7) Ray (Illumina)             55.13   93.88      ND   64.99      ND   33.81       -      ND      ND      ND
(8) Ray (hybrid)               64.51   97.17  419.76   72.79      ND   49.94      ND       -      ND      ND
(9) SOAPdn                     44.38   75.98  104.46   35.07      ND   22.92      ND      ND       -      ND
(10) Velvet                    67.22  113.08  127.01   72.34      ND   37.68      ND      ND      ND       -

Merging: Nav_2541-90
[Pairwise matrix, same layout as above.]
                                 (1)     (2)     (3)     (4)     (5)     (6)     (7)     (8)     (9)    (10)
(1) AMOScmp                        -  133.95      ND    0.03    0.03   15.26   14.00   15.77   11.23   45.32
(2) CABOG                     133.95       -      ND  107.60  114.60   82.62   92.44   92.53   80.73  123.02
(3) Newbler (dn)                  ND      ND       -      ND      ND   54.21   59.81   60.47   33.17   94.89
(4) Newbler (ref;454)           0.03  107.60   59.94       -    0.11    11.6   11.78   11.86   10.17    39.2
(5) Newbler (ref;Illumina)      0.03  114.60      ND    0.28       -   12.66   12.15   12.41     9.6   39.60
(6) Ray (454)                  15.26   82.62   54.21   11.60   12.66       -   59.19   76.36   13.65   63.75
(7) Ray (Illumina)             14.01   92.44   59.81   11.78   12.15   33.79       -   24.21   11.54   39.84
(8) Ray (hybrid)               15.77   92.53   60.47   11.86   12.41   40.33   36.79       -   14.06      ND
(9) SOAPdenovo                 11.22   80.73   33.17   10.04    9.54   13.61   11.40   13.91       -    8.47
(10) Velvet                    45.32  123.02   94.89   39.20   39.84   64.54   39.84      ND    8.31       -

Assembler Review
[Table. The marks in the 454 / Illumina / Hybrid support columns were graphical and are not recoverable.]
Assembler      Status               Algorithm
ALLPATHS-LG    Paired-end only      DBG
AMOScmp                             BB
CABOG                               OLC
MIRA                                ZEBRA
Newbler                             OLC
Ray                                 DBG
SOAPdenovo                          DBG
SUTTA          Unresolved errors    BB
Velvet                              DBG
MIRA performed as well as our merged contigs but is impractical: 40 hr run time.
BB = branch-and-bound; OLC = overlap-layout-consensus; DBG = de Bruijn graph; ZEBRA

Final Pipeline
[Flowchart: final assembly pipeline. Labels recovered from the diagram.]
• Inputs: 454 raw reads; Illumina raw reads
• Illumina DeNovo: Velvet
• Hybrid DeNovo: Ray, MIRA
• 454 DeNovo: Newbler, CABOG
• Other labels: 454; Illumina; hybrid; 454 reads; statistical analysis; Process; Info.
• DENOVO ASSEMBLY: assemblers; Illumina / 454 / hybrid de novo assembly
• PRE-PROCESSING: FastQC, PRINSEQ; read stats
• CONTIG MERGING: merge Ray-hyb / Newbler; merge CABOG / Velvet; MIRA-hyb; align Illumina reads against 454 contigs; Minimus
• Outputs: contigs; draft genome
• LEGEND

Splinter

Pipeline 1
Strain            NUM   AVG (bp)   N50 (bp)   Assembly Size (Mb)   Assembly Score
Nav_2423-01       106   42657.2    156064     4.52                 136.53
Nav_08-2462       149   25736.8     51230     3.83                  19.48
Nav_2541-90       166   26172.5    130386     4.34                  62.57
Nav_2756-81       107   42939.4    131591     4.59                 122.31
Vul_2009V-1368     83   57787.2    401973     4.80                 345.03
Vul_06-2432        57   85122.7    322525     4.85                 419.76
Vul_08-2435       111   42872.9    230373     4.76                 144.01
Vul_08-2439        98   50885.7    250789     4.99                 210.94
Vul_07-2444        70   73255.1    492706     5.13                 656.10

Pipeline 2
Strain            NUM   AVG (bp)   N50 (bp)   Assembly Size (Mb)   Assembly Score
Nav_2423-01       125   35357.0    164305     4.42                 111.36
Nav_08-2462       451     311.9      2253     0.14                   0.09
Nav_2541-90       106   40547.5    169781     4.30                 123.02
Nav_2756-81       111   41840.8    132119     4.64                 124.55
Vul_2009V-1368     97   49705.8    228408     4.82                 170.81
Vul_06-2432       167   28489.7     78353     4.76                  32.53
Vul_08-2435       193   24903.7    204178     4.85                  75.19
Vul_08-2439       114   44047.9    180889     5.02                 134.64
Vul_07-2444       143   35905.1    130942     5.13                  85.93

Visualization
[Figure panels: Newbler; Ray Hybrid; Merged.]

Demo
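The per-strain tables report contig count (NUM), average contig size (AVG), N50, and assembly size. A minimal sketch of how such summary statistics can be computed from a list of contig lengths is given below. It uses the common definition of N50 (the length of the contig at which the running total of lengths, taken from longest to shortest, first reaches half the assembly size). The function names and toy lengths are assumptions for illustration, and the slides do not say which tool produced the reported values.

```python
def n50(contig_lengths):
    """N50: length of the contig that brings the cumulative sum
    (longest contigs first) to at least half of the total assembly size."""
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("need at least one contig length")
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length


def assembly_stats(contig_lengths):
    """Summary statistics in the same spirit as the table columns."""
    total = sum(contig_lengths)
    return {
        "num": len(contig_lengths),
        "avg_bp": total / len(contig_lengths),
        "n50_bp": n50(contig_lengths),
        "size_mb": total / 1_000_000,
    }


# Toy contig lengths in bp (hypothetical, not from the slides).
stats = assembly_stats([100, 200, 300, 400, 500])
print(stats)
```

As a consistency check on the tables themselves, NUM multiplied by AVG reproduces the assembly size: for Nav_2423-01 in Pipeline 1, 106 × 42657.2 bp is about 4.52 Mb.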