Drive Discovery with the Most Complete View of Cancer Complexity

DISCOVER THE HIDDEN LANDSCAPE
OF CANCER VARIANTS
CANCER RESEARCH
To bring precision medicine to every patient, cancer researchers need a more comprehensive
view of all the somatic variants in genes, transcripts and whole genomes that drive cancer
biology. Single Molecule, Real-Time (SMRT®) Sequencing delivers the read lengths, uniform
coverage, and accuracy needed to access the complete size spectrum of driver mutations —
from rare single nucleotide variants to complex structural variants. Full-length transcript
sequencing brings clarity to tumor-specific isoform and splice variant expression, enabling the
discovery of novel biomarkers for early detection, tumor stratification, treatment response, and
drug resistance. With SMRT Sequencing, scientists gain new insight into the most pressing
questions in cancer research.
SEE THE TRUE EXTENT OF STRUCTURAL VARIANTS IN CANCER GENOMES
-- Move beyond CNVs to develop a complete picture of all types of structural variants
-- Precisely map driver mutations hidden in insertions, deletions, translocations and inversions, and characterize
their impact on oncogene expression and function
PacBio® assembly of HER2+ breast cancer cell line SK-BR-3 identified 9 gene fusions, 94 high-confidence interchromosomal translocations,
and ~1,200 intrachromosomal rearrangements [1]. Shown above is the chromosomal fusion linking CYTH1 and EIF3H, which is invisible to
short-read sequencing [2].
REVEAL HIDDEN GENE FUSIONS WITH FULL-LENGTH ISOFORM SEQUENCING
-- Uncover or validate gene fusion candidates missed by Sanger or short-read sequencing
-- Detect fusions involving unannotated exons or genes with high precision and a low false positive rate
-- Reveal multi-translocation events in a single long read
By adding SMRT Sequencing to their study, scientists detected a fusion between AIB1 and an unannotated region in chromosome 1 in the
MCF-7 breast cancer cell line. Eight fusion isoforms with six fusion sites were identified and validated by PCR and Sanger sequencing. None
of the fusion sites are detectable with short-read sequencing, and two were previously unreported [3].
pacb.com/cancer
THE LEADER IN LONG-READ SEQUENCING
TARGET STRUCTURAL VARIATION HOTSPOTS THAT DRIVE CANCER BIOLOGY
-- Determine penetrance and the exact breakpoints of de novo rearrangements arising in known hotspots
-- Cost-effectively multiplex and sequence long, single molecules for high-throughput screening of
heterogeneous cancer samples
FULLY RESOLVE ISOFORM DIVERSITY WITH THE ISO-SEQ™ METHOD
-- Eliminate ambiguity around cancer-specific isoform variants by sequencing full-length cDNA and
producing ‘no assembly required’ whole transcriptome isoform data
-- Use existing panels and capture workflows to target isoforms of interest
-- Reveal hidden biology and discover novel biomarkers for early detection, cancer stratification,
and drug resistance
2.5. Inhibition of androgen signalling
Short-term androgen deprivation assay. LNCaP cells were seeded into T25 flasks, and incubated in
RPMI1640 + 5% FBS for 3 days. The medium was then changed to RPMI 1640 supplemented with 5%
charcoal stripped serum (CSS) and incubated for 2 days, then changed to RPMI-1640 supplemented with
10 nM DHT or 1 nM R1881, and incubated for 48 h with DHT top-up at 24 h. The reference group was
kept in CSS. Long-term androgen deprivation assay. LNCaP cells were seeded into T25 flasks and
incubated in RPMI 1640 + 5% FBS for 3 days. The medium was then changed to RPMI 1640 supplemented
with 5% CSS, and incubated for 10 days with medium changes every 3 days. Inducible AR knock-down.
filtered through 45 µm filters. Target cells (LNCaP) were infected with the virus supernatant and
8 µg/ml protamine sulphate (Sigma-Aldrich, Castle Hill, NSW, AU). Transfected cells were then
selected with 2 µg/ml Puromycin Dihydrochloride (Life Technologies, Mulgrave, VIC, Australia). We
observed some cells expressing red fluorescent protein (RFP) indicating inducible promoter leakage;
therefore we used an Astrios EQ cell sorter (Beckman Coulter, Lane Cove, QLD, Australia) to remove
cells with leaky inducible promoter.
Results are representative of at least three independent experiments with triplicate samples
generating similar findings. Differences between experimental groups were statistically evaluated by
multiple t-tests, followed by the Holm-Sidak test for multiple comparisons. p ≤ 0.05 was considered
statistically significant. Statistical analysis was performed using Prism 6 (GraphPad Software Inc.).
3. Results
3.1. Identification of novel RLN1-RLN2 fusions in PCa
To comprehensively identify RLN1 and RLN2 transcript variants, we used long cDNA-Cap and SMRT
sequencing in LNCaP cells, and queried for circular consensus sequences (CCSs) that mapped to the
RLN1-RLN2 locus. We found CCSs identical to the annotated RLN1 gene, but interestingly, no CCS
corresponded to the annotated RLN2 gene (GENCODE Version 19). Instead, the search retrieved
sequences of two longer RLN2 transcript variants which were fused to the RLN1 gene, generating two
novel fusion transcripts RLN1-RLN2-1 and RLN1-RLN2-2 (Fig. 1, Supplementary Fig. S1). The inherent
error-proneness of the SMRT sequencing technology hindered the determination of the
Fig. 1. Identification of the RLN1-RLN2 fusion in LNCaP cells using SMRT sequencing. The figure was
extracted from the UCSC Genome Browser where the circular consensus sequences (CCS) of RLN1 and RLN2
identified by SMRT sequencing were aligned to the RLN1/RLN2 genomic locus using BLAT tool. GENCODE
Version 19 annotated RLN1 and RLN2 transcript variants are shown in black. Combined CCS sequences of
RLN1 are shown in orange and RLN1-RLN2 fusion CCS transcripts in blue. The arrowed-line represents
introns and shows directionality of the transcripts. The golden rectangles show an overlay between
the annotated RLN1 and RLN2 and the transcripts identified by SMRT sequencing.
followed by standard phenol-chloroform extraction. The former method led to shearing of the obtained
DNA, but using the latter protocol, high-molecular weight DNA was obtained (Supporting Information
Fig. 1). If working with other tumor or tissue types for which homogenization is less challenging, it
may be possible to apply commercial DNA extraction kits; however, integrity of the extracted DNA
should be verified by agarose gel electrophoresis before proceeding to subsequent steps of the assay.
After examination of the breakpoint region in its “native” chromosomal location (the intact,
nonrearranged configuration) in silico, PstI was chosen for the detection of Chr8-Chr12
rearrangement. PstI restriction fragments were self-ligated to generate circularized molecules that
then provided the template for LDI-PCR. Two distinct PCR products were observed by agarose gel
electrophoresis in the tumor sample (Fig. 2). The same products were detected consistently in more
than three independent experiments, indicating high reproducibility. Occasionally seen spurious bands
of various sizes were not reproducible from experiment to experiment and may represent sporadic
amplification of multi-molecule ligation products. One of the distinctly visible PCR products
corresponded in size to the native DNA configuration and was present in both tumor and normal DNA.
The size of the other PCR product corresponded to the predicted rearrangement (Mehine et al., 2013)
and was observed only in the tumor sample, indicating tumor-specificity. Similarly, LDI-PCR using
SacI restriction enzyme, with inverse PCR primers on the other fusion partner chromosome
(chromosome 12), generated robust tumor-specific amplification of the rearranged PCR product
(Supporting Information Fig. 4).
Mapping the Breakpoint Locations
By Sanger sequencing, the breakpoint was confirmed to map to the position previously identified by
NGS (Mehine et al., 2013) (Fig. 2). The breakpoint junction contained microhomology at the fusion
junction (Fig. 2), consistent with earlier findings on translocation junctions (Lee et al., 2007;
Mehine et al., 2013). As a complementary approach to Sanger sequencing for the analysis of LDI-PCR
products, we employed PacBio sequencing (Eid et al., 2009), a method recently used by others for the
detection of recurrent somatic structural variations and germline rearrangements (Okoniewski et al.,
2013; Patel et al., 2014). PacBio sequencing has the advantage over Sanger sequencing that it
provides information about individual DNA molecules in the pool of amplified DNA molecules, rather
than on the population average. The breakpoint position identified by PacBio sequencing (Supporting
Information Tables 5 and 6) was consistent with the result from Sanger sequencing (Fig. 2). Using
PacBio sequencing, we were further able to estimate the percentage of self-circularized molecules
versus molecules with no or multiple ligation sites (Supporting Information Fig. 5). The majority of
reads (69%) contained only one PstI site, confirming that under the ligation conditions used,
self-ligation is favored, although some intermolecular multimeric ligation products were also
generated. Southern blotting and hybridization revealed the presence of sub-visible PCR products,
many of higher mobility than the rearrangement, in
Figure 2. Both native and rearrangement configurations are detected by LDI-PCR. The assayed region
(Chr8-Chr12), its chromosomal location, and resulting PCR products are shown as a schematic drawing
and after agarose gel electrophoresis. Sanger sequencing results obtained from gel-extracted PCR
products around the breakpoint junction are shown in red and blue typeface. Underlined bases indicate
microhomology at the breakpoint junction. In blue typeface, the corresponding “native” sequence from
the fusion partner chromosome is shown. M, marker (GeneRuler 1 kb DNA Ladder). Drawings not to scale.
To resolve the native and rearrangement configurations in heterogeneous uterine cancer samples,
scientists used LDI-PCR and SMRT Sequencing to amplify and sequence the targeted rearrangement
(Chr8-Chr12). The rearrangement can be detected robustly when as little as 2% of the genomic DNA is
of tumor origin. They applied this validated method to screen 33 patient samples for a recurrent
rearrangement upstream of the HMGA2 gene and at the RAD51B locus [5].
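The LDI-PCR excerpt reports that 69% of PacBio reads contained only one PstI site, which the authors take as evidence that self-ligation was favored. The following is a minimal sketch of how such a tally could be computed from read sequences. It is not the authors' pipeline: the function names and the toy reads are hypothetical, and the only outside fact assumed is the PstI recognition sequence CTGCAG, which is palindromic, so a single-strand search is sufficient.

```python
from collections import Counter

PSTI_SITE = "CTGCAG"  # PstI recognition sequence (palindromic)


def count_sites(read: str, site: str = PSTI_SITE) -> int:
    """Count occurrences of a restriction site in one read.

    CTGCAG cannot overlap itself, so str.count gives the exact number.
    """
    return read.upper().count(site)


def classify_reads(reads):
    """Tally reads by number of PstI sites: none, single, or multiple."""
    tally = Counter()
    for read in reads:
        n = count_sites(read)
        if n == 0:
            tally["none"] += 1
        elif n == 1:
            tally["single"] += 1
        else:
            tally["multiple"] += 1
    return tally


def fraction_single(reads):
    """Fraction of reads with exactly one site (0.0 for an empty input)."""
    tally = classify_reads(reads)
    total = sum(tally.values())
    return tally["single"] / total if total else 0.0


# Toy reads, invented for illustration only (not real data).
toy_reads = [
    "AAACTGCAGTTT",          # one site
    "GGGCTGCAGCCCCTGCAGAA",  # two sites
    "ACGTACGT",              # no site
    "ttctgcagaa",            # one site, lowercase input
]

if __name__ == "__main__":
    print(classify_reads(toy_reads))
    print(fraction_single(toy_reads))
```

Because each long read represents a single amplified molecule, the per-read tally describes the composition of the ligation pool directly rather than a population average.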
Combining target capture of RLN1 and RLN2 transcripts with PacBio
sequencing revealed a previously unknown RLN1-RLN2 fusion
isoform. A key insight was that previous studies on relaxin
expression had relied on qPCR primers that could not distinguish
among all the isoforms present in prostate cancer samples.
Retesting of tumor, normal, and several commonly used prostate cell
lines with redesigned primers revealed that only one commonly
used cell line recapitulates the relaxin isoform expression patterns
found in biopsied tissues [4].
GET A MORE FAITHFUL VIEW OF THE IMMUNE
RESPONSE WITH FULL-LENGTH BCR SEQUENCING
-- Achieve uniform high accuracy across the full length of immune receptor amplicons with long,
single-molecule reads
-- Reduce multiplex primer failure and primer bias
by anchoring amplicons in the constant region,
away from highly polymorphic regions subject to
somatic hypermutation
-- Identify BCR isotype and subclass along with
antigen recognition domains
IDENTIFY THE DRIVERS OF DRUG RESISTANCE WITH
HIGH SENSITIVITY
-- Detect rare drug resistance mutations (1% abundance)
and isoforms
-- Chart the evolving complexity of a tumor in response
to treatment
-- Distinguish compound mutations from independent
alterations arising in different molecules
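The difference between a compound mutation and independent alterations comes down to whether two variants are observed on the same molecule. The sketch below illustrates that idea under simple assumptions: each full-length read has already been reduced to the set of mutations called on it, and the mutation labels (`mut_A`, `mut_B`) and function names are hypothetical placeholders, not output of any PacBio tool.

```python
from collections import Counter
from itertools import combinations


def mutation_frequencies(reads):
    """Fraction of reads (single molecules) carrying each mutation."""
    n = len(reads)
    if n == 0:
        return {}
    counts = Counter(m for muts in reads for m in set(muts))
    return {m: c / n for m, c in counts.items()}


def compound_pairs(reads):
    """Count mutation pairs seen together on the same read.

    A pair with a nonzero count is a candidate compound mutation;
    two mutations that never share a read are independent alterations.
    """
    pairs = Counter()
    for muts in reads:
        for a, b in combinations(sorted(set(muts)), 2):
            pairs[(a, b)] += 1
    return pairs


def above_threshold(freqs, min_fraction=0.01):
    """Keep mutations at or above a minimum abundance (default 1%)."""
    return {m: f for m, f in freqs.items() if f >= min_fraction}


# Toy input, invented for illustration: one set of calls per read.
toy_reads = [
    {"mut_A"},
    {"mut_A", "mut_B"},
    {"mut_B"},
    set(),
    {"mut_A", "mut_B"},
]

if __name__ == "__main__":
    freqs = mutation_frequencies(toy_reads)
    print(above_threshold(freqs))
    print(compound_pairs(toy_reads))
```

Pooled allele frequencies alone would not separate these cases, because reads carrying both mutations and reads carrying only one contribute to the same per-mutation totals.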
Multiplex PCR for BCR amplification is subject to frequent failure
due to mismatches between consensus primers and real-life
samples. In some cases, the multiplex reaction omits IGV alleles
present in the sample; in other cases, somatic hypermutation
creates mismatches that impede primer binding [6].
BCR-ABL1 resistance mutations in patient samples, including rare
SNP and isoform mutants undetectable by routine methods
(marked with an asterisk) [7].
KEY REFERENCES
1. McCombie W. R. (2015, February) PacBio long read sequencing and structural analysis of a breast cancer cell line, Presented at
Advances in Genome Biology & Technology Conference. Marco Island, FL.
2. Nattestad M. et al. (2016) Ribbon: Visualizing complex genome alignments and structural variation. bioRxiv
https://doi.org/10.1101/082123.
3. Weirather J. L. et al. (2015) Characterization of fusion genes and the significantly expressed fusion isoforms in breast cancer by hybrid
sequencing. Nucleic Acids Research. 43(18), e116.
4. Tevz G. et al. (2016) Identification of a novel fusion transcript between human relaxin-1 (RLN1) and human relaxin-2 (RLN2) in prostate
cancer. Molecular and Cellular Endocrinology. 420, 159-168.
5. Pradhan B. et al. (2016) Detection and screening of chromosomal rearrangements in uterine leiomyomas by long-distance inverse PCR.
Genes Chromosomes Cancer. 55(3), 215-226.
6. Koning M. et al. (2016) ARTISAN PCR: rapid identification of full-length immunoglobulin genes without primer bias. British Journal of
Haematology. doi: 10.1111/bjh.14180.
7. Cavelier L. et al. (2015) Clonal distribution of BCR-ABL1 mutations and splice isoforms by single-molecule long-read RNA sequencing.
BMC Cancer. 15, 45.
For Research Use Only. Not for use in diagnostic procedures. © Copyright 2017, Pacific Biosciences of California, Inc. All rights reserved. Information in this
document is subject to change without notice. Pacific Biosciences assumes no responsibility for any errors or omissions in this document. Certain notices, terms,
conditions and/or use restrictions may pertain to your use of Pacific Biosciences products and/or third party products. Please refer to the applicable Pacific
Biosciences Terms and Conditions of Sale and to the applicable license terms at http://www.pacb.com/legal-and-trademarks/terms-and-conditions-of-sale/.
Pacific Biosciences, the Pacific Biosciences logo, PacBio, SMRT, SMRTbell, Iso-Seq, and Sequel are trademarks of Pacific Biosciences. BluePippin and SageELF are
trademarks of Sage Science. NGS-go and NGSengine are trademarks of GenDx. FEMTO Pulse and Fragment Analyzer are trademarks of Advanced Analytical
Technologies. All other trademarks are the sole property of their respective owners.
PN: VM104-032217