Maximum-likelihood estimation of admixture proportions from genetic data
Jinliang Wang
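[Figure: the admixture model]
Ancestral population P0 (allele frequencies w) splits into parental populations P1 and P2 (sizes n1, n2)
P1 and P2 drift for ξ generations, reaching allele frequencies x1 and x2
P1 and P2 contribute proportions p1 and p2 to the admixed population Ph (allele frequencies xh)
P1, Ph and P2 (sizes N1, Nh, N2) drift for ψ generations, reaching allele frequencies y1, yh and y2
Samples of size S1, Sh and S2 give allele counts c1, ch and c2
t1 = ξ/2n1
t2 = ξ/2n2
T1 = ψ/2N1
Th = ψ/2Nh
T2 = ψ/2N2
Ω = {p1, t1, t2, T1, Th, T2}
C = (c1, c2, ch)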
Likelihood function
\[ \Pr(C) = \int \Pr(c_1, c_2, c_h \mid y_1, y_2, y_h) \, \Pr(y_1, y_2, y_h \mid x_1, x_2, p_1, T_1, T_2, T_h) \, \Pr(x_1, x_2 \mid t_1, t_2, w) \, \Pr(w) \; dw \, dx \, dy \]
The integral is over the unobserved allele frequencies w, x = (x1, x2) and y = (y1, y2, yh)
Random sampling: Pr(c1, c2, ch | y1, y2, yh)
Admixture and genetic drift: Pr(y1, y2, yh | x1, x2, p1, T1, T2, Th)
Genetic drift: Pr(x1, x2 | t1, t2, w)
Prior on w: Pr(w)
Allele frequencies in P0
Allele frequencies w in the ancestral population P0, with prior Pr(w)
Genetic drift after population split
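P0 (allele frequencies w) splits into P1 and P2 (sizes n1, n2)
After ξ generations of drift, P1 and P2 have allele frequencies x1 and x2
Pr(x1, x2 | t1, t2, w)
t1 = ξ/2n1
t2 = ξ/2n2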
Genetic drift in independent populations
Genetic drift: the diffusion approximation
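\[ \Pr(x_1, x_2 \mid t_1, t_2, w) = \prod_{i=1}^{2} \Pr(x_i \mid t_i, w) \]
\[ \Pr(x_i \mid t_i, w) = w(1-w) \sum_{a=1}^{\infty} a(a+1)(2a+1) \, H(1-a, a+2, 2, w) \, H(1-a, a+2, 2, x_i) \, \exp\!\left( -\frac{a(a+1)\,\xi}{4 n_i} \right) \]
H is the hypergeometric function
ti = ξ/2ni, so ξ/4ni = ti/2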
Crow and Kimura (1970) p. 382
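The series above can be evaluated numerically. The following is a minimal sketch, not the implementation used in the work presented here. The function names `hyp2f1_poly` and `drift_density` and the truncation settings `max_terms` and `tol` are illustrative assumptions. The sketch relies on the fact that H(1-a, a+2, 2, z) is a finite polynomial when a is a positive integer, and it is intended for moderate scaled times t; for very small t many terms are needed and the alternating polynomial sums lose precision.

```python
import math


def hyp2f1_poly(n, b, c, z):
    """Evaluate 2F1(-n, b; c; z) for integer n >= 0 (a finite polynomial in z)."""
    term = 1.0
    total = 1.0
    for k in range(n):
        # Ratio of consecutive terms of the hypergeometric series.
        term *= (-n + k) * (b + k) / ((c + k) * (k + 1)) * z
        total += term
    return total


def drift_density(x, w, t, max_terms=60, tol=1e-18):
    """Density of allele frequency x after scaled time t = xi/(2n), starting at w.

    Only the interior 0 < x < 1 is covered; probability mass already
    absorbed at 0 or 1 (loss or fixation) is not included.
    """
    total = 0.0
    for a in range(1, max_terms + 1):
        decay = math.exp(-a * (a + 1) * t / 2.0)  # exp(-a(a+1) xi / (4n))
        if decay < tol:  # assumed cut-off: later terms are negligible
            break
        total += (a * (a + 1) * (2 * a + 1)
                  * hyp2f1_poly(a - 1, a + 2, 2, w)   # H(1-a, a+2, 2, w)
                  * hyp2f1_poly(a - 1, a + 2, 2, x)   # H(1-a, a+2, 2, x)
                  * decay)
    return w * (1.0 - w) * total


def segregating_mass(w, t, grid=1000):
    """Midpoint-rule integral of the density over (0, 1)."""
    h = 1.0 / grid
    return sum(drift_density((k + 0.5) * h, w, t) for k in range(grid)) * h
```

For long scaled times only the a = 1 term survives, so the density flattens to roughly 6w(1-w)exp(-t). `segregating_mass` gives the probability that the locus is still polymorphic, which falls below 1 as alleles are lost or fixed.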
The admixture event
P1 (allele frequencies x1) and P2 (allele frequencies x2) contribute proportions p1 and p2 to the admixed population Ph
xh = p1 x1 + p2 x2
Pr(y1, y2, yh | x1, x2, p1, T1, T2, Th)
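The mixing step xh = p1 x1 + p2 x2 can be sketched in a few lines. This is an illustration only: the function name `admixed_frequencies` is hypothetical, the example frequencies are made up, and the sketch assumes p2 = 1 - p1 (only p1 appears in the parameter set Ω) and that x1 and x2 list frequencies for the same alleles in the same order.

```python
def admixed_frequencies(x1, x2, p1):
    """Allele frequencies in Ph right after admixture: xh = p1*x1 + p2*x2.

    Assumes p2 = 1 - p1 and that x1, x2 are aligned per-allele frequency lists.
    """
    if not 0.0 <= p1 <= 1.0:
        raise ValueError("p1 must lie in [0, 1]")
    if len(x1) != len(x2):
        raise ValueError("x1 and x2 must list the same alleles")
    p2 = 1.0 - p1
    return [p1 * a + p2 * b for a, b in zip(x1, x2)]


# Made-up frequencies at one biallelic locus.
x1 = [0.9, 0.1]
x2 = [0.2, 0.8]
xh = admixed_frequencies(x1, x2, p1=0.25)
```

Because xh is a weighted average, it always lies between x1 and x2 for each allele, which is why loci that are strongly differentiated between the parental populations carry the most information about p1.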
Genetic drift since admixture event
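Immediately after the admixture event, Ph has allele frequencies xh = p1 x1 + p2 x2
P1, Ph and P2 (sizes N1, Nh, N2) then drift for ψ generations
Allele frequencies change from x1, xh, x2 to y1, yh, y2
Pr(y1, y2, yh | x1, x2, p1, T1, T2, Th)
T1 = ψ/2N1
Th = ψ/2Nh
T2 = ψ/2N2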
Random sampling
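\[ \Pr(c_1, c_2, c_h \mid y_1, y_2, y_h) = \prod_{i \in \{1, 2, h\}} \Pr(c_i \mid y_i) \]
Samples of size S1, Sh and S2 are drawn from P1, Ph and P2
Allele frequencies y1, yh and y2 give allele counts c1, ch and c2
C = (c1, c2, ch)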
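The sampling term factorises over the three populations, so its logarithm is a sum of per-population terms. The sketch below assumes that each Pr(ci | yi) is a multinomial probability of the observed allele counts given the population allele frequencies; that choice, the function names and the example numbers are assumptions made for illustration.

```python
import math


def log_multinomial(counts, freqs):
    """Log multinomial probability of allele counts given allele frequencies."""
    n = sum(counts)
    logp = math.lgamma(n + 1)
    for c, y in zip(counts, freqs):
        logp -= math.lgamma(c + 1)
        if c > 0:
            if y <= 0.0:
                return float("-inf")  # observed an allele with zero frequency
            logp += c * math.log(y)
    return logp


def log_sampling_likelihood(counts_by_pop, freqs_by_pop):
    """log Pr(c1, c2, ch | y1, y2, yh) as a sum over populations 1, 2 and h."""
    return sum(log_multinomial(counts_by_pop[k], freqs_by_pop[k])
               for k in ("1", "2", "h"))


# Made-up counts and frequencies at one biallelic locus.
counts = {"1": [3, 1], "2": [0, 2], "h": [1, 1]}
freqs = {"1": [0.5, 0.5], "2": [0.1, 0.9], "h": [0.375, 0.625]}
loglik = log_sampling_likelihood(counts, freqs)
```

Working on the log scale avoids underflow when the terms for many loci are added together.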
Likelihood function
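\[ \Pr(C) = \int \Big[ \prod_{j \in \{1, 2, h\}} \Pr(c_j \mid y_j) \Big] \Big[ \prod_{j \in \{1, 2, h\}} \Pr(y_j \mid x_j, T_j) \Big] \Big[ \prod_{i=1}^{2} \Pr(x_i \mid w, t_i) \Big] \Pr(w) \; dw \, dx \, dy \]
The integral is over the unobserved allele frequencies w, x = (x1, x2) and y = (y1, y2, yh)
Random sampling: Pr(cj | yj), j = 1, 2, h
Admixture and genetic drift: Pr(yj | xj, Tj), j = 1, 2, h
Genetic drift: Pr(xi | w, ti), i = 1, 2
Prior on w: Pr(w)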
African-American Admixture Proportions
[Figure: bar chart of European ancestry (axis from 0 to 30) for New Orleans, New York, Pittsburgh, Maywood nr Chicago, Houston, Detroit, Baltimore, Philadelphia 1, Philadelphia 2, Charleston (South Carolina) and Jamaica]
Profile log-likelihoods for New York
[Figure: profile log-likelihood panels for drift before the admixture event, proportion of European ancestry, and drift since the admixture event]
Application to canid populations:
Grey wolf and coyote in North America
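[Figure: tree relating a common ancestor to Grey wolf, Coyote, Grey wolf-like hybrid and Coyote-like hybrid populations]
[Figure: bar chart, axis labelled "Wolverine ancestry" (0 to 70), for Grey wolf, Grey wolf-like hybrid, Coyote-like hybrid and Coyote]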
Discussion
Suitable data
Assumptions of the method given the model
Comparing the model to other scenarios
Aspects of the data used for inference
Discussion
Suitable data
Human data
Genotypes at 10 nuclear loci, chosen because they are either African- or European-specific or highly differentiated between the two.
Canid data
10 microsatellite loci, neither species-specific nor highly differentiated between wolves and coyotes.
Discussion
Assumptions of method given the model
Alleles are inherited independently across loci in the admixture event
Drift acts independently on alleles across loci
Alleles in a sampled individual are independent across loci
Discussion
Assumptions of method given the model
The prior distribution on w is flat, not U-shaped
Admixture occurs instantaneously
Mutation has a negligible effect on allele frequencies
Discussion
Comparing the model to other scenarios
Modern ‘pure’ populations need to be sampled
Thus the ‘structure’ of the population is assumed to be known
If modern ‘pure’ populations cannot be sampled, no inference can be made on the admixture proportions
Discussion
Aspects of the data used for inference
Inference proceeds solely on the basis of allele frequencies
Linkage disequilibrium (LD) is, first, not used for inference and, second, assumed to be negligible
LD might be exploited to enhance inference when modern ‘pure’ populations are sampled
LD might also relax the need to sample modern ‘pure’ populations at all