Introduction to LOH and Allele Specific Copy Number User Forum

Jonathan Gerstenhaber
Copyright © 2009 Partek Incorporated.
All rights reserved.
Introduction to LOH and ASCN User Forum Contents
1. Loss of heterozygosity
• Analysis procedure
• Types of baselines
2. Merging LOH with copy number
• Why?
• Tips
3. Allele Specific Copy Number
• Paired vs Unpaired?
• Detecting Imbalance
• Visualization
4. Q & A
Data Sets used
• 20 paired tumor/normal samples
– Kindly provided by Ian Campbell, Peter MacCallum Cancer Centre
– Run on Affymetrix Human SNP 6.0 arrays
• 20 HapMap samples run on Illumina 1M
Why study alleles?
• Interest in copy number is based on the idea that the deletion or
duplication of DNA can promote tumorigenesis
– Deletion of tumor suppressor genes
– Duplication of oncogenes
• Fundamentally, deletion or duplication is not the important
issue; it’s the loss or amplification of functionality, or mutation.
– If everyone were homozygous everywhere, CGH would be all we need.
– In the heterozygous case, if one allele is deleted or amplified, it can alter
the expression of the functional or mutant phenotype.
Focus on heterozygosity
• In areas of heterozygosity we are not interested merely in CGH
changes; instead we want to see how the alleles are changing
• There are two tools Partek puts at our disposal:
– LOH: Loss of Heterozygosity
– ASCN: Allele Specific Copy Number
• Fundamentally, the difference is that LOH is a state: a region either has it or it
doesn’t. ASCN is used to find magnitude and to help simplify images.
Data types and import
• To analyze for LOH, Partek requires genotype calls
– Partek does not have a genotyping algorithm
– For Affymetrix data, import from CHP files or create them from within Partek
– For Illumina data, import from a BeadStudio project using the Partek exporter
• To analyze for ASCN, Partek also requires allele intensities
– For Affymetrix data, import CEL files into Partek
– For Illumina data, import from a BeadStudio project using the Partek exporter
Loss of Heterozygosity
• What is it?
– Looking across many contiguous markers, find genomic regions that contain
stretches of SNP markers called heterozygous (AB) in the “normal” samples but
called homozygous (AA or BB) in the “cancer” samples.
• How is it determined?
– Partek uses a hidden Markov model (HMM) to find these regions.
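As a rough illustration of the definition above, the per-SNP evidence for paired LOH can be sketched in Python. This is only a sketch under stated assumptions, not Partek's implementation: the function name `paired_loh_observations` and the genotype strings ("AA", "AB", "BB", "NC") are hypothetical, and the HMM step that joins markers into regions is not shown.

```python
def paired_loh_observations(normal_calls, tumor_calls):
    """Per-SNP evidence for paired LOH (hypothetical helper).

    Returns one value per SNP:
      True  - heterozygous in the normal, homozygous in the tumor
      False - heterozygous in both
      None  - not informative (normal is not AB, or tumor is a no-call)
    """
    homozygous = {"AA", "BB"}
    observations = []
    for normal, tumor in zip(normal_calls, tumor_calls):
        if normal != "AB" or tumor not in homozygous | {"AB"}:
            observations.append(None)
        else:
            observations.append(tumor in homozygous)
    return observations


normal = ["AB", "AA", "AB", "AB", "BB", "AB"]
tumor = ["AA", "AA", "BB", "AB", "BB", "NC"]
print(paired_loh_observations(normal, tumor))
# [True, None, True, False, None, None]
```

Only the SNPs that are heterozygous in the normal carry information here, which is why long homozygous stretches in the normal leave few markers to work with.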
[Figure: genotype calls in paired blood and tumor samples, with markers labeled homozygous, heterozygous (normal het), and LOH.]
Why/When is a baseline needed?
• If you do not have paired data, the baseline file is used to
determine the expected rate of heterozygosity for each SNP
• This use case is not accurately described as Loss of Heterozygosity;
rather, it is the detection of runs of homozygosity (ROH),
specifically unexpected ones
• But you have a baseline, right?
– Baselines drawn from HapMap 2 are available for download.
If your samples are not a random mix of these populations, the
baseline won’t give a good estimate of the expected frequency!
[Figure: heterozygosity rate (0% to 50%) for a European population, a Japanese population, and a Japanese sample with cancer. Annotations: “ROH, but is it LOH?”, “Common haplotype block”, “False positive”.]
LOH
• Paired analysis is preferred when possible
• For baselines, use a normal population similar to your samples
– Gives better expected genotype frequencies
– Avoids LOH calls caused by common haplotype blocks within populations
LOH Set Up Dialog - Paired
• HMM parameters are difficult to tune. Partek suggests keeping the
default parameters.
• A quick primer
– Max probability: the chance that, if a probe is LOH, the next will be as well
– Genomic decay: how quickly the effect of one SNP’s LOH status on
neighboring SNPs decays
– Genotype error: the chance that a genotype call is incorrect
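To make the three parameters more concrete, here is a generic two-state HMM decoded with the Viterbi algorithm. It is a sketch under loud assumptions and is not Partek's algorithm: the function `viterbi_loh`, its default values, and in particular the formula that shrinks the stay probability toward 0.5 with distance are all hypothetical choices made for illustration.

```python
import math


def viterbi_loh(obs, positions, max_prob=0.99, decay=1e-6, geno_err=0.01):
    """Most likely state path (0 = normal, 1 = LOH) for a generic two-state HMM.

    obs[i] is 1 if the tumor call is homozygous and 0 if heterozygous, at SNPs
    that are heterozygous in the normal. positions[i] is the base-pair position.
    """
    if not obs:
        return []

    def log(p):
        return math.log(max(p, 1e-12))  # clamp so geno_err=0 does not give log(0)

    # emit[state][observation]: a mismatch is explained by genotype error
    emit = [[1 - geno_err, geno_err], [geno_err, 1 - geno_err]]
    score = [log(0.5) + log(emit[s][obs[0]]) for s in (0, 1)]
    back = []
    for i in range(1, len(obs)):
        dist = positions[i] - positions[i - 1]
        # ASSUMPTION: chance of keeping the same state decays toward 0.5 with distance
        stay = 0.5 + (max_prob - 0.5) * math.exp(-decay * dist)
        new_score, pointers = [], []
        for s in (0, 1):
            cands = [score[p] + log(stay if p == s else 1 - stay) for p in (0, 1)]
            best = 0 if cands[0] >= cands[1] else 1
            pointers.append(best)
            new_score.append(cands[best] + log(emit[s][obs[i]]))
        score = new_score
        back.append(pointers)
    # trace back from the best final state
    state = 0 if score[0] >= score[1] else 1
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]


obs = [0, 0, 0, 1, 1, 1, 1, 0, 1, 1, 0, 0, 0]
positions = [i * 1000 for i in range(len(obs))]
print(viterbi_loh(obs, positions))
```

In this toy input the single heterozygous call inside the homozygous run is absorbed into the LOH segment, because switching state twice costs more than one genotype error. That trade-off is what the max probability and genotype error settings control.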
LOH Set Up Dialog - Unpaired
Nearly the same, except:
• Input baseline
• Default frequency
– If there is no baseline at all, or all the baseline samples were no-calls (NC), this is
used as the default het frequency for the SNPs
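The role of the baseline and the default frequency can be sketched as a per-SNP calculation. This is an illustrative assumption about how such a rate could be computed, not Partek's code: the function name `expected_het_frequency`, the genotype strings, and the default value of 0.3 are all hypothetical.

```python
def expected_het_frequency(baseline_calls, default_freq=0.3):
    """Expected heterozygosity rate for one SNP (hypothetical helper).

    baseline_calls holds the genotype calls for this SNP across the baseline
    samples. No-calls are ignored. If nothing usable remains, fall back to
    default_freq (0.3 is an arbitrary illustrative value).
    """
    called = [c for c in baseline_calls if c in ("AA", "AB", "BB")]
    if not called:
        return default_freq
    return called.count("AB") / len(called)


print(expected_het_frequency(["AB", "AA", "AB", "BB"]))  # 0.5
print(expected_het_frequency(["NC", "NC"]))              # 0.3
print(expected_het_frequency([]))                        # 0.3
```

A baseline drawn from a different population than the samples would feed the wrong rates into this calculation, which is the false-positive risk described on the baseline slide.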
LOH creates a segment table
One row per sample per LOH region.
Paired LOH question
Q: I ran paired LOH, and yet in some of my “LOH” regions the
het rate is quite large, even when I set genotype error to 0!
A: When looking for paired LOH, we look only at the SNPs that
are heterozygous in the normal tissue. In long regions of
normal homozygosity, LOH may be detected using just the
few heterozygous SNPs. The many homozygous SNPs may, due to
genotyping error, be called heterozygous in the tumor
sample; enough to noticeably affect the het rate in the
region.
• If genotype error is nonzero, this actually happens more rarely.
Common LOH – Sig-Regions
Number of samples and heterozygous rate are possible filters.
Automatically generated with LOH when the box is checked in setup.
LOH and Copy Number Overlap
• Using the LOH workflow, we can find regions of LOH and common
LOH, and, with a little elbow grease, even places where LOH differs
between groups.
• Using the copy number workflow, we can find regions of
amplification and deletion.
• Additionally, we can overlap the two to find regions that
intersect.
[Figure: CN regions and LOH regions shown as tracks, with their overlap.]
LOH and Copy Number Overlap
• When overlapping, you must use the same sample IDs for the LOH analysis
and the CN analysis
• The report is at the sample level; each region falls into one of 6 categories:
– Amplification, Deletion
– Amplification + LOH, Deletion + LOH
– Copy Neutral LOH
Why do I care about cnLOH? Because allelic imbalance can occur
without alteration of chromosomal abundance, and a shift in the
allelic balance within a sample can lead to phenotypic alterations.
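The category labels on this slide can be sketched as a small lookup on a region's copy number state and LOH status. This is a hypothetical illustration, not the Partek report logic: the function `classify_region` and the state strings are made up here, and only the five categories named on the slide are covered, with `None` returned for a region that has neither a copy number change nor LOH.

```python
def classify_region(cn_state, has_loh):
    """Label one region of one sample (hypothetical helper).

    cn_state: "amplification", "deletion", or "neutral"
    has_loh:  True if the region overlaps an LOH call for the same sample ID
    """
    if cn_state == "amplification":
        return "Amplification + LOH" if has_loh else "Amplification"
    if cn_state == "deletion":
        return "Deletion + LOH" if has_loh else "Deletion"
    if cn_state == "neutral":
        # copy number unchanged: only interesting if heterozygosity was lost
        return "Copy Neutral LOH" if has_loh else None
    raise ValueError(f"unknown copy number state: {cn_state!r}")


print(classify_region("neutral", True))        # Copy Neutral LOH
print(classify_region("amplification", False)) # Amplification
print(classify_region("deletion", True))       # Deletion + LOH
```

Matching sample IDs matter because the lookup is done per sample: a region is only labeled with LOH when both calls come from the same sample.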
Copy Neutral LOH is prevalent in Cancer
• 67% of LOH events in 26 pancreatic cancer cell lines occur in
copy-neutral or copy-gain regions
– Calhoun et al. Cancer Res. 2006 66:7290
• 75% of LOH events in cervical cell lines did not show a copy
number change
– Kloth et al. BMC Genomics. 2007 8:53
• 56% of LOH events in glioblastomas were copy neutral
– Kuga et al. Neuro-oncology. 2008 10:995
Why Allele Specific copy number (AsCN)?
• Cancer Studies w/ primary specimens
• Pure normal population required
• Does not require paired samples or large reference population
• Can be run paired (recommended)
What is allele specific copy number?
• Like LOH, pay attention only to heterozygous SNPs
• At a heterozygous SNP, we expect balance between the two SNP
signals (A and B). In fact, in diploid organisms, each signal becomes an
analogue of one allele!
• If we find that one of our alleles has become larger, the other smaller,
or both, then we have imbalance.
– LOH is a case of severe imbalance, but in mixed tissue perfect LOH is hard
to come by
[Figure: A and B allele signals for three cases: normal, in-between imbalance, and LOH.]
Mixing Tumor & Normal Cell Lines on 50K SNP
Allele Specific CN
• The workflow
Allele Specific CN Setup
• No prebuilt references
• Requires “normals” to be included in the project
• Normal samples do not have to come from the same experiment;
if necessary, download some normal samples from GEO
and merge them in
How does ASCN use the baseline?
• ASCN only makes estimates in areas of heterozygosity
– In paired data, areas of heterozygosity are defined by the normals
– In unpaired data, areas of heterozygosity are defined from the genotypes
• Yes, unpaired ASCN will have no estimates in areas of LOH as defined by
the LOH workflow. The two complement each other
– Areas that are not heterozygous are given “?”s
• ASCN must estimate what intensity maps to 1 copy
of an allele
– In paired data, this is the intensity of the allele in the normal samples
– In unpaired data, this is the average intensity of the allele in
heterozygous samples
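The idea of mapping intensity to copies can be sketched as a simple ratio against the one-copy reference. This is a simplified illustration under stated assumptions, not Partek's ASCN algorithm, which may normalize or smooth the data in ways not described here. The function names `unpaired_reference` and `allele_specific_cn` and all intensity values are hypothetical.

```python
def unpaired_reference(het_intensities):
    """One-copy reference for unpaired data (hypothetical helper).

    het_intensities: (A, B) intensity pairs for one SNP, taken from the
    samples called heterozygous at that SNP. Returns the mean of each allele.
    """
    n = len(het_intensities)
    if n == 0:
        raise ValueError("no heterozygous samples for this SNP")
    mean_a = sum(a for a, _ in het_intensities) / n
    mean_b = sum(b for _, b in het_intensities) / n
    return mean_a, mean_b


def allele_specific_cn(sample, reference):
    """Copies of each allele: sample intensity over the one-copy intensity."""
    (sample_a, sample_b), (ref_a, ref_b) = sample, reference
    if ref_a <= 0 or ref_b <= 0:
        raise ValueError("reference intensities must be positive")
    return sample_a / ref_a, sample_b / ref_b


# Paired: the matched normal (heterozygous here) supplies the one-copy intensity.
print(allele_specific_cn(sample=(1800, 450), reference=(900, 900)))  # (2.0, 0.5)

# Unpaired: average over heterozygous samples instead.
ref = unpaired_reference([(800, 1000), (1000, 1100), (1200, 900)])
print(ref)                                   # (1000.0, 1000.0)
print(allele_specific_cn((1500, 500), ref))  # (1.5, 0.5)
```

Because the reference only exists where samples are heterozygous, SNPs outside those areas have nothing to divide by, which matches the “?” entries described above.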
Allele Specific Copy Number (AsCN)
Two rows per sample
Detect Allelic Imbalances
Proportion = (Max-Min) / (Max + Min)
Using Imbalance to Drive Copy # Discovery
• Sort descending on the Proportion column in the imbalance spreadsheet
• If Max = 1.05 and Min = 0.95, then Proportion = 0.1 / 2 = 0.05
• If Max = 1.9 and Min = 0.1, then Proportion = 1.8 / 2 = 0.90
• Large Proportion values represent allelic imbalance
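The Proportion formula and the descending sort can be reproduced in a few lines of Python. This is a minimal sketch: the function name `imbalance_proportion` and the region labels are hypothetical, and the Max/Min values are the two examples from this slide.

```python
def imbalance_proportion(cn_max, cn_min):
    """Proportion = (Max - Min) / (Max + Min) for the two allele copy numbers."""
    return (cn_max - cn_min) / (cn_max + cn_min)


# (label, Max, Min); labels are made up, values come from the slide
regions = [("region_1", 1.05, 0.95), ("region_2", 1.9, 0.1)]

# Rank regions by Proportion, largest first, as in the imbalance spreadsheet
ranked = sorted(regions, key=lambda r: imbalance_proportion(r[1], r[2]), reverse=True)
for label, cn_max, cn_min in ranked:
    print(label, round(imbalance_proportion(cn_max, cn_min), 2))
# region_2 0.9
# region_1 0.05
```

The measure is 0 when both alleles are equal and reaches 1 when one allele is absent, so it ranks regions by imbalance regardless of total copy number.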
Copy Number vs Allele Specific Copy #
Common questions when viewing ASCN
• Why is the image sparse?
– In paired data, homozygous SNPs in the normal are not used. It’s possible
the region is predominantly homozygous
– In unpaired data, NC and homozygous SNPs in the sample are not tested.
In areas of great aberration or LOH, fewer calls are made.
• Why doesn’t it always agree with the LOH?
– LOH uses the genotype calls. If one allele is strongly dominant,
the region will often be called homozygous
• Why doesn’t it always agree with copy number or allele ratio?
– These are all separate algorithms. They are usually very close in
most ways; when they don’t match up, there should still be a consistent
story
Interplay between Copy Number Measures
Amplification on p arm.
Four clusters in allele ratio.
Separation in AsCN.