Application Brief: Long-read RNA sequencing

LONG-READ RNA SEQUENCING BEST PRACTICES
PLANT AND ANIMAL SCIENCES
HUMAN BIOMEDICAL RESEARCH
With Single Molecule, Real-Time (SMRT®) Sequencing and the Sequel™ System, you can easily and affordably sequence
transcript isoforms of up to 10 kb in their entirety. The Iso-Seq™ method allows users to generate full-length cDNA
sequences - with no assembly required - in order to confidently characterize the full complement of transcript isoforms
within targeted genes, or across an entire transcriptome.
SAMPLE PREPARATION RECOMMENDATIONS
-- Prepare full-length transcripts using the Clontech® SMARTer® PCR cDNA Synthesis Kit with as little as 1 ng
   of poly A+ RNA or 2 ng of total RNA [1]
-- Sequel System loading protocols reduce the need for size selection for transcripts <4 kb [2]
-- Optional size-selection protocols to enrich for transcripts >4 kb [3]
-- Compatible with standard target enrichment methods, such as NimbleGen SeqCap EZ [4] or IDT xGen Lockdown
   Probes [5]
-- Multiplex transcripts or full transcriptomes with sample barcoding [6]
-- Scalable throughput
   -- Profile transcripts from multiplexed samples in a single SMRT Cell
   -- Survey transcriptomes in 1–2 SMRT Cells on the Sequel System
   -- Increase sequencing depth for more comprehensive transcriptome characterization
FROM RNA TO ACCURATE GENE MODELS
[Figure: workflow. Total RNA → optional poly-A selection → poly A+ RNA → reverse transcription → full-length
1st strand cDNA → large-scale amplification → amplified cDNA → optional size selection (>4 kb) → combined
SMRTbell library → SMRT Sequencing on Sequel System → analyze with SMRT Analysis Software Suite]
MORE CONSISTENT LOADING ON SEQUEL SYSTEM REDUCES NEED FOR SIZE SELECTION
[Figure: left panel, non-size-selected Iso-Seq libraries on the PacBio RS II and the Sequel System, plotted by
full-length transcript size; right panel, non-size-selected SMRTbell library]
Depicted on the left is a histogram of the number of full-length sequences by transcript length for a
MagBead-loaded, non-size-selected Iso-Seq library sequenced on both the PacBio RS II and the Sequel System. The
full-length cDNA sequences run on the Sequel System closely resemble the size distribution of the input
SMRTbell library (shown on the right).
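A comparison like the one in the histogram can be approximated from a plain list of full-length read lengths. The sketch below is illustrative only: the function names, the 1 kb bin size, and the example lengths are assumptions made for this example and are not part of SMRT Analysis. The 4 kb threshold is taken from the size-selection guidance above.

```python
from collections import Counter


def length_histogram(read_lengths, bin_size=1000):
    """Count reads per length bin; each key is the bin start in bases."""
    counts = Counter((length // bin_size) * bin_size for length in read_lengths)
    return dict(sorted(counts.items()))


def fraction_longer_than(read_lengths, threshold=4000):
    """Fraction of reads strictly longer than `threshold` bases."""
    if not read_lengths:
        return 0.0
    return sum(1 for length in read_lengths if length > threshold) / len(read_lengths)


# Made-up read lengths, for illustration only (not measured data).
example_lengths = [850, 1200, 1900, 2300, 2750, 3100, 4200, 5600]

hist = length_histogram(example_lengths)
long_fraction = fraction_longer_than(example_lengths)

for bin_start, count in hist.items():
    print(f"{bin_start:>5}-{bin_start + 999:<5} {'#' * count}")
print(f"fraction >4 kb: {long_fraction:.2f}")
```

Running the same two functions on the length profile of the input library and on the sequenced full-length reads gives a quick check of whether longer transcripts are under-represented after loading.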
www.pacb.com/isoseq
DATA ANALYSIS SOLUTIONS WITH PACBIO SMRT ANALYSIS
-- Use the Iso-Seq Algorithm in SMRT Analysis to output high quality, full-length transcript sequences, with no assembly
required, to characterize transcripts and splice variants and map transcripts back to a reference genome
-- Run Iso-Seq analysis in either de novo (no genome reference required) or reference-based mode
-- Install SMRT Analysis locally [7] or access it via Amazon Cloud [8]
-- View the tutorial [9] for running the Iso-Seq Algorithm in SMRT Analysis via SMRT Link
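The Iso-Seq pipeline identifies full-length reads by detecting the 5’ primer, the poly-A sequence, and the 3’ primer. The sketch below illustrates that idea only and is not the SMRT Analysis implementation. The primer sequences are hypothetical placeholders, not the kit's actual primers. The exact-match search, the 30-base search window, and the 10-base minimum poly-A run are simplifying assumptions, and a production tool would tolerate sequencing errors.

```python
# Hypothetical placeholder primers; substitute the real kit sequences.
FIVE_PRIME_PRIMER = "GGCCTTGACA"
THREE_PRIME_PRIMER = "TCAGGCTAGC"

_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq.upper()))


def is_full_length(read, five=FIVE_PRIME_PRIMER, three=THREE_PRIME_PRIMER,
                   window=30, min_polya=10):
    """True if the read, in either orientation, has the 5' primer near its
    start, the 3' primer near its end, and a poly-A run directly before
    the 3' primer. Exact matching only (an assumption of this sketch)."""
    for oriented in (read.upper(), reverse_complement(read)):
        start = oriented[:window].find(five)
        end = oriented.rfind(three, max(0, len(oriented) - window))
        if start == -1 or end == -1:
            continue
        transcript = oriented[start + len(five):end]
        if transcript.endswith("A" * min_polya):
            return True
    return False


# Synthetic read built for illustration: 5' primer + insert + poly-A + 3' primer.
example_read = FIVE_PRIME_PRIMER + "GATTACAGATTACA" + "A" * 12 + THREE_PRIME_PRIMER

print(is_full_length(example_read))                      # complete read
print(is_full_length(reverse_complement(example_read)))  # opposite strand
print(is_full_length(example_read[10:]))                 # 5' primer missing
```

Both orientations are checked because a cDNA read may come from either strand of the molecule.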
INFORMATICS PIPELINE FOR ISO-SEQ ANALYSIS
[Figure: informatics pipeline, steps 1–5. Labels: PacBio raw sequence reads; remove adapters; remove artifacts;
clean sequence reads; classify sequence reads; reads clustering; isoform clusters; consensus calling;
nonredundant transcript isoforms; quality filtering; final isoforms; map to reference genome; evidence-based
gene models]
The Iso-Seq informatics pipeline, available in SMRT Analysis, generates consensus sequences and determines
those transcripts that are full-length by detecting and identifying the 5’ primer, poly-A sequence, and the 3’
primer of the reads.
DETERMINATION OF TRANSCRIPT ISOFORMS
[Figure: a gene and its mRNA isoforms. Short-read technologies: reads spanning splice junctions, insufficient
connectivity, splice isoform uncertainty. Iso-Seq solution: full-length cDNA sequence reads, splice isoform
certainty, no assembly required]
The Iso-Seq method allows you to make evidence-based genome
annotations, discover novel genes and isoforms, identify promoters and
splice sites to understand gene regulation, improve accuracy of RNA-seq
quantification for gene expression studies, and distinguish important
stress response, developmental, or tissue-specific isoforms.
SIGNIFICANTLY IMPROVE EXISTING GENOME ANNOTATIONS
Splice isoform analysis in the sorghum transcriptome using the Iso-Seq
method greatly improved genome annotation, with >11,000 novel splice
isoforms and >2,100 novel genes identified. In this example, a gene was
discovered to produce 13 novel alternatively spliced isoforms, where the
previous gene model contained only a single isoform [10].
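One simple way to flag a mapped isoform as novel is to compare its chain of splice junctions with the chains already present in the annotation. The sketch below assumes exons are available as (start, end) coordinate pairs on one strand of one gene. The function names and coordinates are invented for illustration, and this is not the method used in the cited study or in SMRT Analysis.

```python
def junction_chain(exons):
    """Return the ordered splice junctions (intron boundaries) of an isoform.

    `exons` is a list of (start, end) pairs; each junction is the end of one
    exon paired with the start of the next.
    """
    ordered = sorted(exons)
    return tuple(
        (ordered[i][1], ordered[i + 1][0]) for i in range(len(ordered) - 1)
    )


def find_novel_isoforms(annotated, observed):
    """Names of observed isoforms whose junction chain is not annotated."""
    known_chains = {junction_chain(exons) for exons in annotated.values()}
    return sorted(
        name for name, exons in observed.items()
        if junction_chain(exons) not in known_chains
    )


# Invented coordinates, for illustration only.
annotated_isoforms = {
    "gene1.1": [(100, 200), (300, 400), (500, 600)],
}
observed_isoforms = {
    "iso_a": [(100, 200), (300, 400), (500, 600)],  # matches the annotation
    "iso_b": [(100, 200), (500, 600)],              # skips the middle exon
    "iso_c": [(100, 200), (300, 450), (500, 600)],  # alternative donor site
}

novel = find_novel_isoforms(annotated_isoforms, observed_isoforms)
print(novel)
```

Comparing whole junction chains rather than individual junctions depends on each read covering the transcript end to end. Single-exon transcripts all share an empty chain under this scheme and would need separate handling.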
KEY REFERENCES
1. Procedure & Checklist – Isoform Sequencing (Iso-Seq) using the Clontech® SMARTer® PCR cDNA Synthesis Kit and No Size Selection
2. Clark, T. et al. (2017) Full-Length cDNA Sequencing on the PacBio Sequel Platform. Poster presented at Plant and Animal Genome Conference. San
Diego, CA.
3. User Bulletin: Guidelines for Preparing cDNA Libraries for Isoform Sequencing (Iso-Seq)
4. Full-length cDNA Target Sequence Capture Using SeqCap® EZ Libraries
5. Full-length cDNA Target Sequence Capture Using IDT xGen® Lockdown® Probes
6. Barcoding Samples for Isoform Sequencing (Iso-Seq Analysis)
7. SMRT Analysis Software Installation (v2.3.0)
8. Running SMRT Analysis on Amazon
9. Tutorial: Iso-Seq Analysis Application
10. Abdel-Ghany, S.E. et al. (2016) A survey of the sorghum transcriptome using single-molecule long reads. Nature Communications. 7, e11706.
For Research Use Only. Not for use in diagnostic procedures. © Copyright 2017, Pacific Biosciences of California, Inc. All rights reserved. Information in
this document is subject to change without notice. Pacific Biosciences assumes no responsibility for any errors or omissions in this document. Certain
notices, terms, conditions and/or use restrictions may pertain to your use of Pacific Biosciences products and/or third party products. Please refer to the
applicable Pacific Biosciences Terms and Conditions of Sale and to the applicable license terms at
http://www.pacb.com/legal-and-trademarks/terms-and-conditions-of-sale/.
Pacific Biosciences, the Pacific Biosciences logo, PacBio, SMRT, SMRTbell, Iso-Seq, and Sequel are trademarks of Pacific Biosciences. BluePippin and
SageELF are trademarks of Sage Science. NGS-go and NGSengine are trademarks of GenDx. FEMTO Pulse and Fragment Analyzer are trademarks of
Advanced Analytical Technologies. All other trademarks are the sole property of their respective owners.
PN: BP103-020917