Applying MiSeq to Pathogen Tracing

Susan Knowles
Sr. Manager, Market Development
Illumina, Inc.
March 2014
© 2013 Illumina, Inc. All rights reserved.
Illumina, IlluminaDx, BaseSpace, BeadArray, BeadXpress, cBot, CSPro, DASL, DesignStudio, Eco, GAIIx, Genetic Energy, Genome Analyzer, GenomeStudio, GoldenGate, HiScan, HiSeq, Infinium,
iSelect, MiSeq, Nextera, NuPCR, SeqMonitor, Solexa, TruSeq, TruSight, VeraCode, the pumpkin orange color, and the Genetic Energy streaming bases design are trademarks or registered trademarks
of Illumina, Inc. All other brands and names contained herein are the property of their respective owners.
Goals and Objectives
Introduction to Illumina
Next Generation Sequencing for Food Pathogens
Supporting the FDA GenomeTrakr Network
Public Health 2.0
Introduction to Illumina
Founded in 1998
Initial Public Offering on July 27, 2000
Headquarters in San Diego, CA
3,000 employees worldwide, San Diego HQ
>$1.4B annual sales in 2013
90% of the world's DNA sequencing
Instruments generate approximately 1 PB/week of sequence data
IP portfolio of 135+ issued patents and 168 pending applications
Global Organization
Expanded Manufacturing, R&D, Sales, Service & Support
Locations shown on map:
Illumina Global Headquarters (San Diego, CA)
Illumina Hayward (Hayward, CA)
Illumina Cambridge
Illumina BV (The Netherlands)
Illumina China (Beijing)
Illumina KK (Tokyo)
Illumina Singapore
Russia, Greece, Turkey, Israel, Middle East, India, South Africa
Korea, Taiwan, Thailand, Vietnam, Malaysia
Jinan, Chengdu, and Shanghai, China
Australia, New Zealand
Map legend: Commercial, Mfg/R&D, Partners
Innovation: Making Sequencing Faster & Cheaper
[Figure: cost per gigabase (axis $0 to $3,000) and output per day in gigabases (axis 0 to 600), plotted by instrument release from Dec-07 to Mar-14. Instruments labeled: GAI, GAII, GAIIx, GAIIx 2x100, GAIIx 50Gb, GAIIx 95Gb, HiSeq 2000, HiSeq 2000 v3, HiSeq 2500, HiSeq 2500 1Tb, HiSeq X Ten]
The New Illumina Portfolio
Sequencing Power for Every Scale
Regulated Power: MiSeqDx. The world’s first CE-IVD and FDA-cleared NGS platform.
Focused Power: MiSeq. Speed and simplicity for targeted and small-genome sequencing.
Flexible Power: NextSeq 500. Speed and simplicity for whole-genome, exome, and transcriptome sequencing.
Production Power: HiSeq 2500. Power and efficiency for large-scale genomics.
Population Power: HiSeq X Ten. $1,000 human genome and extreme throughput for population-scale sequencing.
Throughput to Match Microbiology Applications
(Applications ordered from high throughput to low throughput)
Shotgun metagenomics
– Microbial diversity
– Gene content and discovery
rRNA metagenomics
– Relative abundance of microbial diversity
– 16S for bacteria and archaea
– 18S for eukaryotes
Microbial genomics
– Detection
– Identification
– Antibiotic sensitivity testing
– Molecular epidemiology
Meet MiSeq
Integrating three concepts
On-board clustering
Fast SBS
On-board analysis
MiSeq: A Closer Look
[Figure: MiSeq instrument with a 2 ft scale marker]
MiSeq
Simple workflow
VERY SIMPLE USER INTERACTION: LOAD AND GO
Preloaded single-use reagent cartridge contains cluster generation, SBS, and PE reagents
RFID-based reagent and flow cell tracking
Auto flow cell positioning
Walkaway automation
MiSeq Instrument
Options for output and read length
FLEXIBLE DATA OUTPUT
Multiple flow cell options
from 1M to 25M reads
from 300 Mb to 15 Gb of output
[Figure: read length options (1x50, 2x75, 2x150, 2x250, 2x300) and read count in millions by flow cell (Nano: 1, Micro: 4, v2: 15, v3: 25)]
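The output range quoted above follows from simple arithmetic: bases per run are roughly read count times read length, doubled for paired-end runs. The sketch below is illustrative only. It assumes that the slide's read counts refer to clusters that each yield two reads in paired-end mode, and the pairing of flow cell with read length (Nano at 2x150, v3 at 2x300) is an assumption chosen to reproduce the 300 Mb and 15 Gb endpoints, not a statement from the slide.

```python
def run_yield_bases(reads: int, read_length: int, paired_end: bool = True) -> int:
    """Approximate bases per run: reads x read length x (2 if paired-end)."""
    reads_per_cluster = 2 if paired_end else 1
    return reads * read_length * reads_per_cluster


def format_bases(bases: int) -> str:
    """Render a base count using the largest fitting unit (Gb, Mb, kb)."""
    for unit, size in (("Gb", 10**9), ("Mb", 10**6), ("kb", 10**3)):
        if bases >= size:
            return f"{bases / size:g} {unit}"
    return f"{bases} bases"


# Assumed pairings of flow cell and read length (not stated on the slide).
nano = run_yield_bases(1_000_000, 150)   # Nano flow cell, 2x150
v3 = run_yield_bases(25_000_000, 300)    # v3 flow cell, 2x300

print(format_bases(nano))  # 300 Mb
print(format_bases(v3))    # 15 Gb
```

Actual yields depend on cluster density and the fraction of reads passing filter, so these figures are upper-level estimates rather than guaranteed output.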
Simplify Analysis: MiSeq Reporter and BaseSpace
MSR workflows: De Novo Assembly, Enrichment, Generate FASTQ, LibraryQC, PCR Amplicon, Metagenomics, Small RNA, Resequencing, TruSeq Amplicon
MiSeq Reporter (MSR)
– Streamlined on-board analysis workflows
– No intervention from sample loading to report
– All workflows generate FASTQ files that can be analyzed by most third-party apps
BaseSpace
– Illumina’s cloud computing environment
– Most MSR workflows available on BaseSpace
– Free data storage
– Data sharing with collaborators
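Because every workflow emits FASTQ, downstream tools only need to handle that one format. As a rough illustration, the sketch below reads four-line FASTQ records and computes a mean quality score per read. The function names `read_fastq` and `mean_phred` are hypothetical, the example reads are made up, and the Phred+33 quality encoding is an assumption that should be checked against the files actually produced.

```python
import io
from typing import Iterator, TextIO, Tuple


def read_fastq(handle: TextIO) -> Iterator[Tuple[str, str, str]]:
    """Yield (read_id, sequence, quality) from a four-line-per-record FASTQ stream."""
    while True:
        header = handle.readline()
        if not header:
            return  # end of file
        sequence = handle.readline().rstrip("\n")
        handle.readline()  # '+' separator line, ignored
        quality = handle.readline().rstrip("\n")
        if not header.startswith("@"):
            raise ValueError(f"malformed FASTQ header: {header!r}")
        yield header[1:].strip(), sequence, quality


def mean_phred(quality: str, offset: int = 33) -> float:
    """Mean Phred score of a non-empty quality string (assumes Phred+33)."""
    return sum(ord(char) - offset for char in quality) / len(quality)


# Made-up records for illustration only.
example = io.StringIO("@read1\nACGT\n+\nIIII\n@read2\nGGCA\n+\n!!II\n")
records = list(read_fastq(example))
for read_id, seq, qual in records:
    print(read_id, len(seq), mean_phred(qual))
```

For real runs, an established parser such as those in common bioinformatics libraries is a safer choice; this sketch does not handle multi-line records or compressed files.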
Most Widely Used Desktop NGS System
Greater than 85% of desktop data generated on MiSeq*
*Analysis of data submissions to the NCBI Sequence Read Archive (SRA), as of January 02, 2014
Adopted by Worldwide Public Health Agencies
NGS Networks
Food-borne pathogen outbreaks
Public health
Genomic epidemiology
NGS for Food Pathogens
PROOF OF PRINCIPLE STUDIES
PILOT PROGRAM
IMPACT
Food Safety Testing
Leveraging NGS data to revolutionize pathogen analysis
Outbreak Detection
– Cluster determination
– Is the strain known, related to a known strain, or novel?
Outbreak Management
– Epidemiology
– Traceback and source attribution
– Recalls
In-depth Analysis
– Pathogenicity: identify genes associated with toxicity and virulence
– Taxonomy
Proof of Principle Retrospective Study
NGS Analysis of Listeria Outbreak Samples*
Listeria outbreak in cantaloupes
In July 2011, a Listeria-contaminated cantaloupe outbreak spread to 28 states, infecting 146 people and killing 30.
The outbreak was tracked by PulseNet US, the national molecular subtyping surveillance system for foodborne pathogens.
Pulsed-field gel electrophoresis (PFGE) was used to subtype Listeria isolates from human cases and cantaloupe samples and to track the outbreak.
Is NGS a more effective way of performing bacterial typing?
*Collaboration with US CDC PulseNet
Ease of Use: Bacterial Genome Sequencing from Isolate
Efficient workflow and quality data for resequencing
gDNA
Nextera XT library prep: 2.5 hours
Prepped library through sequencing: 27 hours (20 minutes hands-on)
Resequencing alignment and variant calling: 2 hours (fully automated)
Total: 31:30
MiSeq Reporter Resequencing Workflow
SNPs accurately measure variation from the reference genome
MiSeq Reporter: on-board analysis workflows
Resequencing: reconstruction of a genome sequence from reads mapped to a previously sequenced reference genome.

Sample   Homozygous SNPs
A01      194
A02      131
A03      14,312
A04      55,857
A05      136
A06      57,526
A07      14,775
A08      193

Outbreak samples yielded > 92% alignment with the reference genome.
A04 and A06: highly divergent from the reference.
A03 and A07: divergent (to a lesser extent) from the reference.
Assess Concordance with PFGE Data
SNP-Based clustering illustrates genomic relatedness
Hierarchical clustering of samples based on SNP calls.
Samples 1, 2, 5, and 8 appear closely related.
Samples 4 and 6, and samples 3 and 7, each appear to form an outlier group.
The results matched what PulseNet obtained from PFGE.
SNP-based analysis differentiates isolates by as little as one SNP.
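To make the idea of SNP-based grouping concrete, here is a minimal sketch that counts SNP differences between samples and groups those within a chosen threshold. The slides do not state which distance measure or clustering algorithm was used, so this substitutes a simple threshold-based single-linkage grouping for illustration. The sample names, SNP calls, and threshold are all made up.

```python
from itertools import combinations
from typing import Dict, List, Set, Tuple

Snp = Tuple[int, str]  # (reference position, alternate base)


def snp_distance(a: Set[Snp], b: Set[Snp]) -> int:
    """Number of SNP calls present in exactly one of the two samples."""
    return len(a ^ b)


def single_linkage_clusters(samples: Dict[str, Set[Snp]], max_distance: int) -> List[Set[str]]:
    """Group samples linked by chains of pairs within max_distance SNPs."""
    clusters = {name: {name} for name in samples}
    for x, y in combinations(samples, 2):
        if snp_distance(samples[x], samples[y]) <= max_distance:
            merged = clusters[x] | clusters[y]
            for name in merged:
                clusters[name] = merged  # every member points at the merged group
    unique = {frozenset(group) for group in clusters.values()}
    return sorted((set(group) for group in unique), key=lambda group: sorted(group))


# Made-up SNP calls for four hypothetical samples.
samples = {
    "S1": {(100, "A"), (250, "T")},
    "S2": {(100, "A"), (250, "T"), (900, "G")},
    "S3": {(100, "A"), (400, "C"), (500, "G"), (610, "T"), (720, "A")},
    "S4": {(100, "A"), (400, "C"), (500, "G"), (610, "T")},
}
print(single_linkage_clusters(samples, max_distance=2))
```

With these toy inputs, S1 and S2 differ by one SNP and fall into one group, while S3 and S4 form a second group. A real analysis would typically build a full hierarchical tree or phylogeny from validated variant calls rather than apply a fixed cutoff.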
Pilot Network: FDA CFSAN Selects MiSeq to Identify Foodborne Pathogens
GenomeTrakr Network [1]
A pilot network and coordinated effort across state and federal labs.
Sequencing pathogens collected from foodborne outbreaks, contaminated food products, and environmental sources.
Illumina’s role
– MiSeq instruments
– Library prep and sequencing reagents
– Installation and training
– Service and support
7 state health departments and 10 FDA-ORA labs
[1] Source: http://www.fda.gov/Food/FoodScienceResearch/WholeGenomeSequencingProgramWGS/ucm363134.htm
FDA Protocol
Application: Bacterial WGS from culture
Sample Prep
• Grow culture
• Lyse cells from cultured isolate
• Genomic DNA extraction
Nextera Library Prep
• Nextera XT
• ~12 samples per run
• Sequencing kit: 500-cycle kit, 2 x 250 bp paired-end sequencing
• 20x-30x coverage
MiSeq & Primary Analysis
• MiSeq Reporter workflow: Generate FASTQ
• Data sent to FDA or BaseSpace for storage, sharing with FDA, upload to the NCBI SRA database, and analysis
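The coverage target and samples-per-run figures in the protocol are linked by a simple relation: mean coverage is the total bases sequenced divided by genome size times the number of pooled samples. The sketch below shows that arithmetic. The 5 Mb genome size is an assumed round number for a bacterial genome, not a value from the protocol, and the calculation assumes bases are split evenly across samples.

```python
def bases_needed(genome_size: int, target_coverage: int, samples: int) -> int:
    """Total bases a run must deliver to reach the target mean coverage per sample."""
    return genome_size * target_coverage * samples


def mean_coverage(total_bases: float, genome_size: int, samples: int) -> float:
    """Expected mean depth per sample when a run's bases are split evenly."""
    return total_bases / (samples * genome_size)


GENOME_SIZE = 5_000_000  # assumption: roughly 5 Mb bacterial genome
SAMPLES = 12             # from the protocol: ~12 samples per run

needed = bases_needed(GENOME_SIZE, target_coverage=30, samples=SAMPLES)
print(needed)                                       # 1800000000
print(mean_coverage(needed, GENOME_SIZE, SAMPLES))  # 30.0
```

In practice, uneven pooling, duplicate reads, and reads that fail quality filters reduce usable depth, so a run is usually planned with headroom above the minimum computed this way.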
Impact: NGS Used to Assess Food Pathogen Outbreak in
Food and Clinical Samples
Compared with pulsed-field gel
electrophoresis (PFGE), WGS provides
clearer distinction between cases and
foods that are likely part of a given
outbreak and those that are not.
Whole-genome sequences of the Listeria strains
isolated from Roos Foods cheese products were
available after the recall and were found to be
highly related to sequences of the Listeria strains
isolated from the patients.
Supporting the GenomeTrakr Network
Illumina’s service and support infrastructure
Technical Applications Scientists
Project Management
Field Application Scientists
Territory Account Managers
Field Service Engineers
“Customers don’t expect you to be perfect. They do expect you to fix things when they go wrong.” (Donald Porter)
Supporting the GenomeTrakr Network
Tech Support Group (TS)
TAS – Technical Applications Scientists
First line call
Email and phone support
All network accounts flagged
FDA CFSAN Network Illumina Support Team
Field Application Specialists (FAS)
Deliver on-site training and troubleshooting
Typically brought into the picture by TS
Help with chemistry- and software-based troubleshooting
Escalate complex cases
FDA CFSAN Network Illumina Support Team
Project Management
Track outstanding support issues via dashboard
Meet regularly with FDA management
Organize new network lab trainings
Provide advance warning of significant changes to software and hardware
Public Health 2.0
Changing the paradigm
Update
Detect
Methods
Simplify
Analyze/Report
Analysis
Share
Connectivity
Faster diagnosis and response times – detection, identification and containment
Improved accuracy and methods – high resolution, high throughput, automated
Cloud-based data storage/exchange and sharing
Functional and geographic connectivity for analysis and communications
Thank You