Microsatellites

What is a microsatellite?
• Simple Sequence Repeats (SSR)
• Repeat unit 1-6 bp long
Classification of Microsatellites
• Simple microsatellites contain only one kind of repeat sequence: (GT)n, (AC)n, (AG)n
• Composite microsatellites contain more than one type of repeat
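The definition above (a 1-6 bp unit repeated in tandem) can be sketched as a small search routine. This is only an illustration: the function name `find_ssrs`, the `min_repeats` threshold of 4, and the example sequence are assumptions, not part of the slides, and real SSR-mining tools apply more refined rules.

```python
import re

def find_ssrs(seq, min_repeats=4, max_unit=6):
    """Return (motif, repeat_count, start) for each perfect tandem repeat."""
    # A lazily matched unit of 1..max_unit bases, followed by at least
    # (min_repeats - 1) further copies of that same unit.
    pattern = re.compile(
        "([ACGT]{1," + str(max_unit) + "}?)\\1{" + str(min_repeats - 1) + ",}"
    )
    hits = []
    for m in pattern.finditer(seq.upper()):
        motif = m.group(1)
        hits.append((motif, len(m.group(0)) // len(motif), m.start()))
    return hits

# Hypothetical sequence: a (GT)8 and a (CAG)5 repeat with short flanks.
example = "TTAGC" + "GT" * 8 + "CCATT" + "CAG" * 5 + "TTA"
for motif, count, start in find_ssrs(example):
    print(f"({motif}){count} at position {start}")
```

The sketch reports each simple repeat separately; a composite microsatellite would show up as two or more adjacent hits with different motifs.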
Molecular Basis of Microsatellite Polymorphism
(Figure: two alleles that differ by 3 repeats)
• Slippage of DNA polymerase is believed to be the major cause of
microsatellite variation
• The mutation rate can be as high as 0.1 to 0.2% per generation
Abundant and Even Distribution
Abundant
• Abundance varies with species, but all species studied to date have microsatellites
• In well-studied mammalian species, one microsatellite exists in every 30-40 kb of DNA
Even distribution
• On all chromosomes
• On all segments of chromosomes
• Within genes
• Often in introns
• In exons as well
Trinucleotide repeats and human diseases:
Huntington disease, fragile X, and other mental
retardation-related human diseases
Small locus sizes make them well suited to PCR
Microsatellites are co-dominant markers
(Figure: gel showing alleles A, B, C, and D; lane genotypes AB, BC, CD, BC, AD, BD and CC, CD, AC, AB, BD, AC, BD, AB)
Mendelian Inheritance of Microsatellites
Microsatellites are inherited as codominant markers according
to Mendelian laws
Liu et al. 1999. Biochem. Biophys. Res. Commun. 259: 190-194.
Liu et al. 1999. J. Heredity 90: 307-311.
Advantages of Microsatellite Markers
• Abundant
• Evenly distributed
• Highly polymorphic
• Co-dominant
• Small loci
Development of microsatellite markers
Need
• SSR-containing clones
• Sequences of the flanking regions of SSR
Microsatellite-enriched Small-insert DNA Libraries (I)
• Genomic DNA
• Digest with several blunt-end cutters that recognize 4-bp sites
• Gel-purify the 300-600 bp fraction
• Ligate to a phagemid vector
(Figure: small-insert phagemids of 3.4-3.5 kb; only a few inserts carry a microsatellite, labeled "micro")
Microsatellite-enriched Libraries (II)
• Small-insert plasmids (3.5 kb), a few carrying a microsatellite ("micro") insert
• Using the dut/ung strain CJ236
• Conversion into single-stranded phagemids using helper phage
• Single-stranded phagemids (3.5 kb), labeled "u" in the figure
• Phagemids that won't be converted to ds will be degraded in the WT host
Microsatellite-enriched Libraries (III)
• Single-stranded phagemids (3.5 kb), labeled "u" in the figure
• Convert into ds using (CA)15 (e.g.)
• ds plasmids (3.5 kb) carrying the microsatellite ("micro")
• Transform into WT E. coli
According to Ostrander et al., 1992: PNAS 89:3419
Microsatellite-enriched Libraries
• 2 bp: CA, GA, TA, CG, CT, GT
• 3 bp: CAA, CAT, CAG, CAC, CGG, CGT, CGC, CGA, ...
• 4 bp
• 5 bp
Characterization of Microsatellites
• Isolate plasmid DNA
• Sequence clones
• Identify clones with enough flanking sequence for primer design
PCR Optimization and PIC Analysis
• PCR products are best kept under 200 bp
• PCR conditions: annealing temperature, Mg2+, pH, DMSO, etc.
• Polymorphic information content
• Polymorphism in reference families
Disadvantages of microsatellites
• Previous genetic information is needed
• Substantial upfront work is required
• Problems associated with PCR of microsatellites
The concept of Polymorphic Information Content
• Measures the usefulness of a marker
• Informativeness in specific families
Microsatellite Genotyping
1. AA x AA: not polymorphic
2. AA x BB: no segregation
3. AØ x ØØ: only 1 allele, segregating 1:1
4. AA x AB: B segregates 1:1; A segregates with intensity 1:1
5. AA x BØ: A does not segregate; B segregates 1:1
6. AØ x AB: A segregates 3:1; B segregates 1:1
7. AB x AB: A segregates 3:1; B segregates 3:1
8. AØ x BØ: A segregates 1:1; B segregates 1:1
9. AB x ØØ: A segregates 1:1; B segregates 1:1; A and B alternating
10. AA x BC: 2 of the 3 alleles segregating 1:1
11. AØ x BC: all 3 alleles segregating 1:1; 2 types with only 1 allele
12. AB x AC: 2 of the 3 alleles segregating 1:1, the other 3:1, with a single allele present in some individuals
13. AB x CD: all 4 alleles segregating 1:1
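The segregation ratios listed for these crosses can be checked by enumerating the four equally likely gamete combinations. The sketch below is illustrative only: the function names `cross` and `band_ratio` are assumptions, Ø is treated as a null allele that gives no PCR product, and the ratio returned is band present to band absent among offspring.

```python
from collections import Counter
from itertools import product
from math import gcd

NULL = "Ø"  # null allele: assumed to produce no band on the gel

def cross(parent1, parent2):
    """Count offspring genotypes from a cross such as cross("AB", "AC")."""
    offspring = Counter()
    # Each parent passes on one of its two alleles with equal probability.
    for a, b in product(parent1, parent2):
        offspring["".join(sorted((a, b)))] += 1
    return offspring

def band_ratio(parent1, parent2, allele):
    """Return the reduced (present, absent) ratio for one allele's band."""
    counts = cross(parent1, parent2)
    present = sum(n for genotype, n in counts.items() if allele in genotype)
    absent = sum(counts.values()) - present
    divisor = gcd(present, absent)
    return present // divisor, absent // divisor

# Case 6: A should segregate 3:1 and B 1:1.
print(band_ratio("A" + NULL, "AB", "A"), band_ratio("A" + NULL, "AB", "B"))
```

A ratio of (1, 0) means the band appears in every offspring, which corresponds to the "no segregation" cases.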
Polymorphic Information Content (PIC)
• PIC refers to the value of a marker for detecting polymorphism within a population
• PIC depends on the number of detectable alleles and the distribution of their frequencies
• Botstein et al. (1980) Am. J. Hum. Genet. 32: 314-331.
• Anderson et al. (1993) Genome 36: 181-186.
$$\mathrm{PIC}_i = 1 - \sum_{j=1}^{n} P_{ij}^{2}$$
where $\mathrm{PIC}_i$ is the polymorphic information content of marker $i$, $P_{ij}$ is the frequency of the $j$th pattern for marker $i$, and the summation extends over $n$ patterns.
Example: Marker A has two alleles; the first allele has a frequency of 30%, the second a frequency of 70%.
$\mathrm{PIC}_a = 1 - (0.3^2 + 0.7^2) = 1 - (0.09 + 0.49) = 0.42$
Example: Marker B has two alleles; the first allele has a frequency of 50%, the second a frequency of 50%.
$\mathrm{PIC}_b = 1 - (0.5^2 + 0.5^2) = 1 - (0.25 + 0.25) = 0.5$
Example: Marker C has two alleles; the first allele has a frequency of 90%, the second a frequency of 10%.
$\mathrm{PIC}_c = 1 - (0.9^2 + 0.1^2) = 1 - (0.81 + 0.01) = 0.18$
Example: Marker D has 10 alleles; each allele has a frequency of 10%.
$\mathrm{PIC}_d = 1 - (10 \times 0.1^2) = 1 - 0.1 = 0.9$
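The four worked examples can be reproduced with a few lines of code. This is a minimal sketch of the formula as given on these slides (one minus the sum of squared pattern frequencies); the function name `pic` and the check that frequencies sum to 1 are assumptions added for illustration.

```python
def pic(frequencies):
    """PIC = 1 - sum of squared pattern frequencies, as defined above."""
    # Guard against input that is not a full frequency distribution.
    if abs(sum(frequencies) - 1.0) > 1e-9:
        raise ValueError("frequencies must sum to 1")
    return 1.0 - sum(p * p for p in frequencies)

markers = {
    "A": [0.3, 0.7],
    "B": [0.5, 0.5],
    "C": [0.9, 0.1],
    "D": [0.1] * 10,
}
for name, freqs in markers.items():
    print(f"Marker {name}: PIC = {pic(freqs):.2f}")
```

The comparison shows both effects named earlier: for a fixed number of alleles, PIC is highest when frequencies are equal, and it rises as the number of equally frequent alleles grows.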
Allele frequency and Forensics
• Say we have 10 marker loci
• We have done adequate population genetics to know that the tested allele at each locus has a frequency of 10%
• Testing each locus gives a certain level of confidence, expressed as the probability of obtaining the observed result by chance
Allele frequency and Forensics
• Locus 1, positive
• You are included, but one out of every 10 people would also be positive
• Locus 2, positive
• You are included, but only one out of every 100 people would be positive at both locus 1 and locus 2
• …
• Locus 10, also positive
• ...
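The reasoning above multiplies the per-locus frequencies, which is valid only if the loci are inherited independently; that independence is an assumption of this sketch, as is the function name `cumulative_match_probability`. With 10 loci at 10% each, the chance of a random person matching at all of them is 0.1 to the tenth power.

```python
from itertools import accumulate
from operator import mul

def cumulative_match_probability(frequencies):
    """Running product of per-locus frequencies (assumes independent loci)."""
    return list(accumulate(frequencies, mul))

# Ten loci, each with a 10% frequency for the matching allele.
probabilities = cumulative_match_probability([0.1] * 10)
for locus, p in enumerate(probabilities, start=1):
    print(f"Positive through locus {locus}: about 1 in {round(1 / p):,}")
```

Each additional positive locus narrows the pool of possible matches by a further factor of 10 in this example.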