Standards-derived management of specimen-derived DNA sequences
Satpal Bilkhu, Nazir El-Kayssi, James A. Macklin, Christopher T. Lewis
Agriculture and Agri-Food Canada, Ottawa, Canada

Core Biological Resources
Biodiversity Collections:
• Insects (CNC) – 16 million
• Plants (DAO) – 1.5 million
• Fungi (DAOM) – 350,000
• Nematodes – 40,000
• Fungi – 18,000
• Bacteria – 2,000
• Viruses – 450
Diagram labels: maintained alive; Taxonomy library; reference collection; non-living specimens

Multi-departmental “shared” priorities initiative
• AAFC lead department
• Bioinformatics experience crucial
• CRTI projects helped to build up expertise and capacity on high-risk organisms

Environmental Monitoring
• Metagenomics approach to Environmental Monitoring
  – Canadian samples (2007-11)
  – Foreign samples (2010-12)
  – Samples collected weekly
• Develop baseline profiles of microbial biodiversity
• Identify “Bioindicators” of climate change
• Made possible by a nationwide Collection Network

Sanger Sequence Management
Mixed Specimen (e.g. Tree Bark) -> Specimen (e.g. Pure Culture) -> Sample (DNA Extraction) -> PCR Reactions -> Sequence Reactions (Sanger)
Database Information
• 358,000 Sequences (mycology group, last 10 years)
• Database Size = 1 GB

Metagenome Sequence Management
Mixed Specimen (e.g. Air/Rain Samples) -> Sample (DNA Extraction) -> PCR Reactions (Pooled) -> Sequence Reactions (454)
Identification Pipeline
• Integrate Specimen-based sequences into identification process
Database Information
• 50 Million Sequences
• Database Size = 100 GB

Genome / Transcriptome Management
Specimen (e.g. Pure Culture) -> Sample (e.g. DNA / RNA Extraction) -> Sequencing Library
Database Information
• ???
Assembly / Annotation Pipelines

Specimen as source material

NGS Sequencing Infrastructure
• Specimen -> Sequences
• Downstream Management
• File / Metadata Management
• Network Attached Storage
• Sequence Analysis
• Workflow Design and Execution
• Cluster Computing Resources

SeqDB Technologies

SeqDB Components / Frameworks
• Clients:
  – Client using Web Service (Programmatic Access) -> JSON Result
  – Client using Browser (Chrome / Firefox / Opera / IE) -> HTML Page
  – Client using Command Line
  – Client using Barcode Printer -> Barcode Label
  – Android Client
  – Barcode Scanner
• Legend: External; Commercial; SeqDB; Deprecated; Scheduled / Manually Triggered; Minimal Dev; Prototype; (Missing)
• Components: middleware, reader, printing, webservices, web, loader, util, processor, dbi
• Frameworks: Appfuse, Struts 2, Spring 3, BioJava, Hibernate
• Platform: Java; RDBMS (MySQL); OS (Windows, Linux, OSX)

Home Page

Specimen Collections
• Mixed Specimen Collection
• Public
• Private
• Specimen Collection

Query Builder
• Search on any database field
• Range Searches for Specimen Number
• Partial Date Searches
• Range Searches for Dates

Search Filters

Records

Storage System

Sequence Quality Colouring and Trimming

Barcoding and Label Printing
• Extracted Fields for Label
• Customizable Label Output

Many Export Formats

Galaxy Integration

AAFC-DINA Partnership
• AAFC has a complex set of databases for resource management which we would like to reconcile and integrate using the DINA platform.
• AAFC will contribute a configurable DNA module based on the SeqDB platform via web services API.
• Flexibility and sustainability through community support!
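
The Query Builder slide lists partial date searches and range searches for dates. One common way to support both is to expand a partial date into an inclusive date range and then test each record against that range. The Python sketch below illustrates that idea only; it is not taken from SeqDB, which the slides describe as Java-based. The function names `expand_partial_date` and `in_date_range` and the `YYYY[-MM[-DD]]` input format are assumptions made for illustration.

```python
import calendar
from datetime import date


def expand_partial_date(text):
    """Expand 'YYYY', 'YYYY-MM' or 'YYYY-MM-DD' into an inclusive (start, end) pair.

    Hypothetical helper: the input format is an assumption, not SeqDB's.
    """
    parts = [int(p) for p in text.split("-")]
    if len(parts) == 1:
        year = parts[0]
        return date(year, 1, 1), date(year, 12, 31)
    if len(parts) == 2:
        year, month = parts
        # monthrange returns (weekday of first day, number of days in month)
        last_day = calendar.monthrange(year, month)[1]
        return date(year, month, 1), date(year, month, last_day)
    if len(parts) == 3:
        exact = date(*parts)
        return exact, exact
    raise ValueError(f"unsupported date format: {text!r}")


def in_date_range(value, start_text, end_text=None):
    """Return True if `value` falls within the range spanned by the partial dates.

    With only `start_text`, this is a partial date search; with both
    arguments, it is a range search whose bounds may themselves be partial.
    """
    start, _ = expand_partial_date(start_text)
    _, end = expand_partial_date(end_text or start_text)
    return start <= value <= end
```

For example, `in_date_range(collected, "2010-05")` would match any record collected in May 2010, while `in_date_range(collected, "2007", "2011")` would cover the whole span from the start of 2007 to the end of 2011. Treating a partial date as a range keeps both search types on one code path.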