Introduction to Pedigree Analysis II

Biostatistics 666
Elements of Pedigree Likelihoods
- Prior probabilities
  - For founder genotypes
- Segregation probabilities
  - For offspring genotypes
- Penetrances
  - For individual phenotypes
Parameters in Pedigree Likelihood
- Prior probabilities
  - Allele frequencies
- Segregation probabilities
  - Genetic map and recombination fractions
  - Relationships among individuals
- Penetrance
  - Model for relating genes and disease
  - Error model for genotype data
Prior Probabilities for Founders
- P(G_founder)
- Assume Hardy-Weinberg equilibrium
  - Based on allele frequencies
- May be multilocus frequencies
  - Assume linkage equilibrium
  - Frequencies at multiple loci are products of frequencies at individual loci
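A minimal sketch of these two assumptions: under Hardy-Weinberg equilibrium a founder's genotype prior follows from allele frequencies alone, and under linkage equilibrium a multilocus prior is the product of the single-locus priors. The allele frequencies and the second locus are hypothetical values chosen for illustration, not data from the lecture.

```python
from itertools import combinations_with_replacement

def hwe_genotype_priors(allele_freqs):
    """P(G) for unordered genotypes under Hardy-Weinberg equilibrium."""
    priors = {}
    for a, b in combinations_with_replacement(sorted(allele_freqs), 2):
        p = allele_freqs[a] * allele_freqs[b]
        # Heterozygotes can arise in two orders, so their frequency is 2pq
        priors[(a, b)] = p if a == b else 2 * p
    return priors

def multilocus_prior(genotypes, priors_per_locus):
    """Prior for a multilocus genotype under linkage equilibrium."""
    result = 1.0
    for g, priors in zip(genotypes, priors_per_locus):
        result *= priors[g]
    return result

# Hypothetical allele frequencies (illustration only)
abo = hwe_genotype_priors({"A": 0.3, "B": 0.1, "O": 0.6})
snp = hwe_genotype_priors({"1": 0.8, "2": 0.2})

# Joint prior for genotype AO at the first locus and 1/2 at the second
joint = multilocus_prior([("A", "O"), ("1", "2")], [abo, snp])
```

The genotype priors at each locus sum to 1, which is a quick sanity check on any allele-frequency input.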
Segregation Probabilities
- P(G_o | G_f, G_m)
  - Probability of offspring genotype conditional on parental genotypes
- Follows from Mendel’s laws
- If two loci are considered, depends on recombination fraction
Penetrances
- P(X_i | G_i)
  - Probability of observed phenotype conditional on genotype
- If penetrances are 1, the genotypes are known
- Generally, assume that phenotypes are independent within families, given the genotypes
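As an illustration, a penetrance function for the ABO blood group (the system used in the example pedigree of this lecture) can be sketched as below. It assumes error-free typing, so every penetrance is 0 or 1, and it treats an untyped person (phenotype `None`) as compatible with every genotype. The function names are hypothetical.

```python
GENOTYPES = ["AA", "AO", "BB", "BO", "AB", "OO"]

def abo_phenotype(genotype):
    """Blood type implied by a genotype: A and B codominant, O recessive."""
    expressed = set(genotype) - {"O"}
    return "".join(sorted(expressed)) or "O"

def penetrance(phenotype, genotype):
    """P(X | G) assuming no typing error; None means phenotype not observed."""
    if phenotype is None:
        return 1.0
    return 1.0 if abo_phenotype(genotype) == phenotype else 0.0

# Genotypes with non-zero penetrance for each possible observation
compatible = {
    x: [g for g in GENOTYPES if penetrance(x, g) > 0]
    for x in ["A", "B", "AB", "O", None]
}
```

A model with genotyping error or incomplete penetrance would return values strictly between 0 and 1 instead.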
Overall Pedigree Likelihood
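$$
\begin{aligned}
L &= \sum_{G_1} \cdots \sum_{G_n} P(X_1, \ldots, X_n \mid G_1, \ldots, G_n)\, P(G_1, \ldots, G_n) \\
  &= \sum_{G_1} \cdots \sum_{G_n} \prod_{i} P(X_i \mid G_i)\, P(G_1, \ldots, G_n) \\
  &= \sum_{G_1} \cdots \sum_{G_n} \prod_{i} P(X_i \mid G_i) \prod_{\text{founder}} P(G_{\text{founder}}) \prod_{\{o,f,m\}} P(G_o \mid G_f, G_m)
\end{aligned}
$$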
Typical calculation
- List all possible genotypes
- Create reduced lists
  - Eliminate those where P(X | G) = 0
  - Eliminate those where P(G_o | G_f, G_m) = 0
- Iterate over all possibilities
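The steps above can be sketched as a brute-force calculation for a small pedigree. This is an illustration under stated assumptions, not an efficient algorithm: the ABO allele frequencies are hypothetical, penetrances are 0 or 1, and the pedigree is a made-up trio. Only the phenotype-based elimination is done explicitly; combinations with zero segregation probability simply contribute zero to the sum.

```python
from itertools import product

GENOTYPES = ["AA", "AO", "BB", "BO", "AB", "OO"]
FREQ = {"A": 0.3, "B": 0.1, "O": 0.6}  # hypothetical allele frequencies

def prior(g):
    """Founder prior under Hardy-Weinberg equilibrium."""
    p = FREQ[g[0]] * FREQ[g[1]]
    return p if g[0] == g[1] else 2 * p

def penetrance(x, g):
    """P(X | G) for ABO with no typing error; None means untyped."""
    if x is None:
        return 1.0
    pheno = "".join(sorted(set(g) - {"O"})) or "O"
    return 1.0 if pheno == x else 0.0

def segregation(go, gf, gm):
    """P(G_o | G_f, G_m): each parent transmits either allele with prob. 1/2."""
    total = 0.0
    for a in gf:
        for b in gm:
            if "".join(sorted((a, b))) == go:
                total += 0.25
    return total

def likelihood(pedigree):
    """pedigree: list of (name, father, mother, phenotype); founders have None parents."""
    names = [person[0] for person in pedigree]
    # Reduced genotype lists: drop genotypes with P(X | G) = 0
    lists = [[g for g in GENOTYPES if penetrance(person[3], g) > 0]
             for person in pedigree]
    total = 0.0
    for combo in product(*lists):  # iterate over all remaining possibilities
        g = dict(zip(names, combo))
        term = 1.0
        for name, father, mother, x in pedigree:
            term *= penetrance(x, g[name])
            if father is None:
                term *= prior(g[name])
            else:
                term *= segregation(g[name], g[father], g[mother])
        total += term
    return total

# Hypothetical trio: type A father, type O mother, type O child
trio = [("father", None, None, "A"),
        ("mother", None, None, "O"),
        ("child", "father", "mother", "O")]
L = likelihood(trio)
```

With these frequencies only the father genotype AO is compatible with an O child, so the sum collapses to a single term.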
Example Pedigree
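[Figure: three-generation pedigree labeled with ABO blood types. Two individuals are untyped (?), one is type O, one is type AB, and five are type A.]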
Condition on Phenotype
Person   Genotypes                   #Genotypes
I-1      {AA, AO, BB, BO, AB, OO}    6
I-2      {OO}                        1
II-1     {AA, AO, BB, BO, AB, OO}    6
II-2     {AA, AO}                    2
II-3     {AA, AO}                    2
II-4     {AA, AO}                    2
III-1    {AA, AO}                    2
III-2    {AB}                        1
III-3    {AA, AO}                    2
1152 possibilities to consider
Condition on Family Members
Person   Genotypes       #Genotypes
I-1      {AA, AO, AB}    3
I-2      {OO}            1
II-1     {BO, AB}        2
II-2     {AO}            1
II-3     {AO}            1
II-4     {AA, AO}        2
III-1    {AA, AO}        2
III-2    {AB}            1
III-3    {AA, AO}        2
48 possibilities
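The counts of 1152 and 48 are the products of the per-person list sizes in the two tables. A short check, with the genotype lists copied from the tables:

```python
from math import prod

# Genotype lists after conditioning on each person's own phenotype
after_phenotype = {
    "I-1": ["AA", "AO", "BB", "BO", "AB", "OO"], "I-2": ["OO"],
    "II-1": ["AA", "AO", "BB", "BO", "AB", "OO"], "II-2": ["AA", "AO"],
    "II-3": ["AA", "AO"], "II-4": ["AA", "AO"],
    "III-1": ["AA", "AO"], "III-2": ["AB"], "III-3": ["AA", "AO"],
}

# Genotype lists after also conditioning on family members
after_family = {
    "I-1": ["AA", "AO", "AB"], "I-2": ["OO"],
    "II-1": ["BO", "AB"], "II-2": ["AO"],
    "II-3": ["AO"], "II-4": ["AA", "AO"],
    "III-1": ["AA", "AO"], "III-2": ["AB"], "III-3": ["AA", "AO"],
}

def count_combinations(lists):
    """Number of joint genotype assignments to iterate over."""
    return prod(len(genotypes) for genotypes in lists.values())

n_phenotype = count_combinations(after_phenotype)  # 6*1*6*2*2*2*2*1*2
n_family = count_combinations(after_family)        # 3*1*2*1*1*2*2*1*2
```

Conditioning on relatives reduces the work by a factor of 24 in this example.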
Segregation Probabilities
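- P(G_o | G_f, G_m)
  - G_o = (H_o1, H_o2)
  - G_f = (H_f1, H_f2)
  - G_m = (H_m1, H_m2)

$$
P(G_o \mid G_f, G_m) = P(H_{o1} \mid H_{f1}, H_{f2})\, P(H_{o2} \mid H_{m1}, H_{m2}) + P(H_{o2} \mid H_{f1}, H_{f2})\, P(H_{o1} \mid H_{m1}, H_{m2})
$$

When the offspring is homozygous (H_o1 = H_o2), the two terms describe the same event and only one of them is counted.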
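A minimal single-locus sketch of this formula, assuming Mendelian transmission (each parental allele is passed on with probability 1/2) and unordered genotypes written as two-character strings. The guard that adds the second term only for heterozygous offspring is an assumption about how the formula applies to unordered genotypes: without it a homozygous offspring would be counted twice. Function names are hypothetical.

```python
def transmit(allele, parent):
    """P(H_o | H_p1, H_p2): probability that a parent transmits the given allele."""
    return parent.count(allele) / 2

def segregation(go, gf, gm):
    """P(G_o | G_f, G_m) for unordered single-locus genotypes."""
    h1, h2 = go
    # First allele from the father, second from the mother
    p = transmit(h1, gf) * transmit(h2, gm)
    if h1 != h2:
        # Heterozygous offspring: the alleles may also arrive the other way round
        p += transmit(h2, gf) * transmit(h1, gm)
    return p

# Offspring genotype distribution for an AO x BO mating
mating = {go: segregation(go, "AO", "BO") for go in ["AB", "AO", "BO", "OO"]}
```

For two loci the transmission probabilities would also depend on the recombination fraction; that extension is not shown here.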
Genetic Map
- Set of n ordered loci
- Set of n-1 distances between consecutive loci

[Figure: ordered loci A, B, C, D, E, F with distances x_AB, x_BC, x_CD, x_DE, x_EF between consecutive loci.]
The Morgan
- Distance along which one crossover is expected per generation
- The basic unit of genetic distance
- Usually, distances are reported in centimorgans (cM)
  - 1 cM = 0.01 M
  - Total human genetic map is about 33 M (roughly 3,300 cM)
Recombination
- Non-recombinant gametes: probability 1 - θ
  - Result from an even number of crossover events between the two loci
- Recombinant gametes: probability θ
  - Result from an odd number of crossover events between the two loci
Recombination and Map Distance
[Figure: observed recombination fraction (vertical axis, 0.00 to 1.00) plotted against map distance (horizontal axis, 0.00 to 1.00).]
Haldane Map Function
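- Assume crossovers are independent
  - Follow a Poisson distribution in each interval

$$
x = \begin{cases} -\tfrac{1}{2} \ln(1 - 2\theta) & \text{if } 0 \le \theta < \tfrac{1}{2} \\ \infty & \text{otherwise} \end{cases}
$$

$$
\theta = \tfrac{1}{2}\left(1 - e^{-2x}\right)
$$

- Haldane (1919)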
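A short sketch of the map function in both directions, with distances in Morgans. It also checks numerically that θ equals the probability of an odd number of crossovers when the count in an interval of length x is Poisson with mean x, which is the independence assumption stated above. The function names are hypothetical, and θ = 0 is treated as distance 0.

```python
import math

def haldane_distance(theta):
    """Map distance x (Morgans) from recombination fraction theta."""
    if 0 <= theta < 0.5:
        return -0.5 * math.log(1 - 2 * theta)
    return math.inf

def haldane_theta(x):
    """Recombination fraction theta from map distance x (Morgans)."""
    return 0.5 * (1 - math.exp(-2 * x))

def prob_odd_crossovers(x, kmax=60):
    """P(odd number of crossovers) when the count is Poisson with mean x."""
    return sum(math.exp(-x) * x**k / math.factorial(k)
               for k in range(1, kmax, 2))

# Recombination fractions for a few distances, in Morgans
thetas = {x: haldane_theta(x) for x in (0.01, 0.1, 0.5, 1.0)}
```

For small distances θ is close to x (1 cM gives θ of about 0.0099), and θ approaches but never reaches 1/2 as the distance grows.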
Overall Pedigree Likelihood
$$
L = \sum_{G_1} \cdots \sum_{G_n} \prod_{i} P(X_i \mid G_i) \prod_{f} P(G_f) \prod_{\{o,f,m\}} P(G_o \mid G_f, G_m)
$$

- Computation rises exponentially with the number of people
- Computation rises exponentially with the number of markers
- G: genotypes, X: phenotypes
- Products run over everyone (i), founders (f), or offspring, father, mother trios {o, f, m}
Simplification for Nuclear Families
$$
L = \sum_{G_m} P(X_m \mid G_m)\, P(G_m) \sum_{G_f} P(X_f \mid G_f)\, P(G_f) \prod_{o} \sum_{G_o} P(X_o \mid G_o)\, P(G_o \mid G_f, G_m)
$$

- Linear in the number of offspring
- G: genotypes, X: phenotypes
- Indexes are m and f for mother and father, and o for iterating over offspring
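A sketch of the nuclear-family form, showing why the cost is linear in the number of offspring: for each pair of parental genotypes, every child contributes one independent sum over its own genotypes, and these sums are multiplied rather than enumerated jointly. The ABO model, the 0/1 penetrances, and the allele frequencies are hypothetical choices for illustration.

```python
GENOTYPES = ["AA", "AO", "BB", "BO", "AB", "OO"]
FREQ = {"A": 0.3, "B": 0.1, "O": 0.6}  # hypothetical allele frequencies

def prior(g):
    """Founder prior under Hardy-Weinberg equilibrium."""
    p = FREQ[g[0]] * FREQ[g[1]]
    return p if g[0] == g[1] else 2 * p

def penetrance(x, g):
    """P(X | G) for ABO with no typing error; None means untyped."""
    if x is None:
        return 1.0
    pheno = "".join(sorted(set(g) - {"O"})) or "O"
    return 1.0 if pheno == x else 0.0

def segregation(go, gf, gm):
    """P(G_o | G_f, G_m) under Mendelian transmission."""
    return sum(0.25 for a in gf for b in gm
               if "".join(sorted((a, b))) == go)

def nuclear_family_likelihood(x_father, x_mother, x_offspring):
    total = 0.0
    for gm in GENOTYPES:
        pm = penetrance(x_mother, gm) * prior(gm)
        if pm == 0:
            continue  # genotype excluded by the mother's phenotype
        for gf in GENOTYPES:
            pf = penetrance(x_father, gf) * prior(gf)
            if pf == 0:
                continue  # genotype excluded by the father's phenotype
            kids = 1.0
            for xo in x_offspring:  # one inner sum per child
                kids *= sum(penetrance(xo, go) * segregation(go, gf, gm)
                            for go in GENOTYPES)
            total += pm * pf * kids
    return total

# Hypothetical family: type A father, type O mother, two type O children
L2 = nuclear_family_likelihood("A", "O", ["O", "O"])
```

Adding a child adds one more inner sum per parental genotype pair, so the work grows in proportion to the number of offspring rather than exponentially.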
Typical Interesting Pedigree
Elston and Stewart’s (1971) insight…
- Special pedigree
  - Every person is either:
    - Related to someone in the previous generation
    - Marrying into the pedigree
  - No consanguineous marriages
- Condition the probability of one parent on their offspring