Kinship coefficient

4. Genetic Identity Coefficients
§4.1. Kinship and inbreeding coefficients
• Some definitions
Identity by state (ibs): Two alleles are ibs if they are functionally the same.
Identity by descent (ibd): Two alleles are ibd if one is a physical copy of the other, or if they are both physical copies of the same ancestral allele.
Kinship coefficient Φij : The kinship coefficient Φij between two individuals i and j
is the probability that an allele selected randomly from i and an allele selected randomly
from the same autosomal locus of j are ibd.
Inbreeding coefficient fi: The inbreeding coefficient fi of an individual i is the probability that the individual's two alleles at any autosomal locus are ibd. If fi > 0, i is said to be inbred.
Relation between kinship and inbreeding coefficients:
\[
\Phi_{ii} = \tfrac{1}{2}(1 + f_i), \qquad f_i = \Phi_{kl},
\]
where k and l are the parents of i.
• Calculation of kinship coefficients
Simple examples
Parent-offspring: Φij = 1/4.
Full sibs: 3 and 4.
1 × 2: parents;
3, 4: children of 1 and 2.
\[
\Phi_{34} = \tfrac{1}{2}\Phi_{31} + \tfrac{1}{2}\Phi_{32} = \tfrac{1}{4}.
\]
Half sibs: 4 and 5.
1 × 2: husband and wife;
2 × 3: husband and wife;
4: child of 1 and 2;
5: child of 2 and 3.
\[
\Phi_{45} = \tfrac{1}{2}\Phi_{42} + \tfrac{1}{2}\Phi_{43}
          = \tfrac{1}{2} \times \tfrac{1}{4} + 0 = \tfrac{1}{8}.
\]
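First cousins: 7 and 8.
1 × 2: husband and wife;
3, 4: children of 1 and 2;
3 × 5: husband and wife;
4 × 6: husband and wife;
7: child of 3 and 5;
8: child of 4 and 6.
\[
\begin{aligned}
\Phi_{78} &= \tfrac{1}{2}\Phi_{74} + \tfrac{1}{2}\Phi_{76} \\
          &= \tfrac{1}{2}\Phi_{74} + 0 \\
          &= \tfrac{1}{2}\left(\tfrac{1}{2}\Phi_{34} + \tfrac{1}{2}\Phi_{54}\right) \\
          &= \tfrac{1}{2}\left(\tfrac{1}{2}\Phi_{34} + 0\right) \\
          &= \tfrac{1}{4} \times \tfrac{1}{4} = \tfrac{1}{16}.
\end{aligned}
\]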
General algorithm for calculating kinship coefficients between members of a pedigree:
(i) Any person should have either both or neither of his or her parents in the pedigree.
(ii) Members of the pedigree are numbered in such a way that every parent precedes his or her children.
(iii) The kinship coefficients between any two persons in the pedigree are computed recursively in a symmetric matrix, starting from the top left corner and working downwards.
(iv) The recursive calculation formulas are:
– For Φii: if i is a founder, Φii = 1/2; otherwise, Φii = 1/2 + (1/2)Φkl, where k and l are the parents of i.
– For Φij (i > j): if i is a founder, Φij = 0; otherwise, Φij = (1/2)Φjk + (1/2)Φjl, where k and l are the parents of i.
Basic rule: substitution of parental alleles for the allele of the child.
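The recursion in steps (i) to (iv) can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the original notes: the function name `kinship_matrix` and the dictionary encoding of the pedigree (person number mapped to `None` for a founder, or to a pair of parents) are assumptions made for the example, and exact fractions are used so the results can be compared with hand calculations.

```python
from fractions import Fraction

def kinship_matrix(parents):
    """Kinship coefficients for a pedigree numbered 1..n (hypothetical helper).

    parents[i] is None for a founder, or a pair (k, l) of parents,
    both of whom must be numbered lower than i.
    Returns a dict phi with phi[i, j] == phi[j, i].
    """
    half = Fraction(1, 2)
    phi = {}
    for i in range(1, len(parents) + 1):
        if parents[i] is None:
            # Founder: unrelated to everyone numbered before, not inbred.
            for j in range(1, i):
                phi[i, j] = phi[j, i] = Fraction(0)
            phi[i, i] = half
        else:
            k, l = parents[i]
            # Substitute the parental alleles for the allele of child i.
            for j in range(1, i):
                phi[i, j] = phi[j, i] = half * (phi[j, k] + phi[j, l])
            phi[i, i] = half + half * phi[k, l]
    return phi

# First cousins 7 and 8 from the simple examples above.
cousins = {1: None, 2: None, 3: (1, 2), 4: (1, 2),
           5: None, 6: None, 7: (3, 5), 8: (4, 6)}
print(kinship_matrix(cousins)[7, 8])  # 1/16
```

Because row i is filled only after all lower-numbered rows, the substitution always acts on the higher-numbered person of each pair.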
A brother-sister mating example:
1 × 2: parents;
3, 4: children of 1 and 2;
3 × 4: brother-sister mating;
5, 6: children of 3 and 4.
Kinship matrix (Φij):
       1     2     3     4     5     6
1     1/2    0    1/4   1/4   1/4   1/4
2      0    1/2   1/4   1/4   1/4   1/4
3     1/4   1/4   1/2   1/4   3/8   3/8
4     1/4   1/4   1/4   1/2   3/8   3/8
5     1/4   1/4   3/8   3/8   5/8   3/8
6     1/4   1/4   3/8   3/8   3/8   5/8
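A remark on the substitution rule
The substitution of parental alleles in the calculation of the kinship coefficient between two persons should always operate on the higher numbered person.
A counterexample, in the brother-sister mating pedigree: substituting for 5 (parents 3 and 4) is correct, whereas substituting for 3 (parents 1 and 2) is not:
\[
\Phi_{35} = \tfrac{1}{2}\Phi_{33} + \tfrac{1}{2}\Phi_{34} = \tfrac{3}{8},
\qquad
\Phi_{35} \neq \tfrac{1}{2}\Phi_{15} + \tfrac{1}{2}\Phi_{25} = \tfrac{1}{4}.
\]
Note: While the parental allele passed to 3 is randomly chosen, once this choice is made, it limits what can be passed to 5. Sampling from 5 depends on what has been sampled for 3.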
§4.2. Identity states and identity coefficients
• Detailed identity states
[Figure: the 15 detailed identity states S*1, ..., S*15 for i's genes and j's genes.]
• Condensed identity states
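\[
\begin{aligned}
S_1 &= S^*_1, & S_2 &= S^*_6, & S_3 &= S^*_2 \cup S^*_3, \\
S_4 &= S^*_7, & S_5 &= S^*_4 \cup S^*_5, & S_6 &= S^*_8, \\
S_7 &= S^*_9 \cup S^*_{12}, & S_8 &= S^*_{10} \cup S^*_{11} \cup S^*_{13} \cup S^*_{14}, & S_9 &= S^*_{15}.
\end{aligned}
\]
[Figure: the 9 condensed identity states S1, ..., S9 for i's genes and j's genes.]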
• Identity coefficients Δk
Definition:
Δk = P (Sk ).
If a person is not inbred, the two alleles of the
person cannot be IBD. Therefore
Δk = 0, for k = 1, 2, 3, 4, if i is not inbred;
Δk = 0, for k = 1, 2, 5, 6, if j is not inbred;
Δk = 0 except for k = 7, 8, 9, if neither i nor j is inbred.
• Relation between identity coefficients
and kinship coefficient
\[
\Phi_{ij} = \Delta_1 + \tfrac{1}{2}(\Delta_3 + \Delta_5 + \Delta_7) + \tfrac{1}{4}\Delta_8.
\]
The relation can be obtained by conditioning on the identity states. For example, given any of S2, S4, S6 and S9, no alleles of i and j can be IBD; given S7, the probability that two randomly chosen alleles, one from i and one from j, are IBD is 1/2; etc.
• Calculation of identity coefficients for
simple pedigrees
By conditioning on the identity states of parental
relatives, the identity coefficients for simple
pedigrees can be easily calculated as given in
the following table.
Relationship            Δ7      Δ8      Δ9      Φ
Parent-offspring        0       1       0       1/4
Full siblings           1/4     1/2     1/4     1/4
Half siblings           0       1/2     1/2     1/8
First cousins           0       1/4     3/4     1/16
Double first cousins    1/16    6/16    9/16    1/8
Second cousins          0       1/16    15/16   1/64
Uncle-nephew            0       1/2     1/2     1/8
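The last column of the table follows from the other three through the relation Φij = Δ1 + (Δ3 + Δ5 + Δ7)/2 + Δ8/4, since Δ1, ..., Δ6 vanish for non-inbred pairs. The short Python sketch below checks this row by row with exact fractions. The names `kinship_from_identity`, `TABLE` and `check_table` are made up for this illustration; the numbers are copied from the table.

```python
from fractions import Fraction as F

def kinship_from_identity(delta):
    """delta = (Δ1, ..., Δ9); returns Δ1 + (Δ3 + Δ5 + Δ7)/2 + Δ8/4."""
    d1, _, d3, _, d5, _, d7, d8, _ = delta
    return d1 + F(1, 2) * (d3 + d5 + d7) + F(1, 4) * d8

# (Δ7, Δ8, Δ9, Φ) for each relationship, as listed in the table.
TABLE = {
    "parent-offspring":     (F(0),     F(1),     F(0),      F(1, 4)),
    "full siblings":        (F(1, 4),  F(1, 2),  F(1, 4),   F(1, 4)),
    "half siblings":        (F(0),     F(1, 2),  F(1, 2),   F(1, 8)),
    "first cousins":        (F(0),     F(1, 4),  F(3, 4),   F(1, 16)),
    "double first cousins": (F(1, 16), F(6, 16), F(9, 16),  F(1, 8)),
    "second cousins":       (F(0),     F(1, 16), F(15, 16), F(1, 64)),
    "uncle-nephew":         (F(0),     F(1, 2),  F(1, 2),   F(1, 8)),
}

def check_table():
    """Raise AssertionError if any row is inconsistent; otherwise return True."""
    for name, (d7, d8, d9, phi) in TABLE.items():
        delta = (0, 0, 0, 0, 0, 0, d7, d8, d9)   # non-inbred pair
        assert d7 + d8 + d9 == 1, name            # coefficients sum to one
        assert kinship_from_identity(delta) == phi, name
    return True

print(check_table())
```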
§4.3. Generalized kinship coefficients
• Definition
Let n alleles G1, . . . , Gn be selected at random from n persons (who are not necessarily all different persons). Let the n alleles be partitioned into non-overlapping blocks whose constituent genes are ibd. A generalized kinship coefficient is defined as the probability that a particular partition occurs.
Example: Let Gi, Gj, Gk, Gl be 4 alleles selected at random from 4 persons. There are 15 different partitions of these 4 alleles, which correspond to 15 detailed identity states $S^{r*}_1, \ldots, S^{r*}_{15}$. Thus there are 15 generalized kinship coefficients: $\Phi(S^{r*}_k)$, k = 1, . . . , 15.
• Random identity states and their probabilities ψk
If the 4 alleles are selected from two persons,
two from each, the 15 detailed random identity states can be collapsed into 9 condensed
random identity states (the order of the two
alleles from the same person becomes irrelevant).
The probability of the condensed random state $S^r_k$ is denoted by ψk, i.e., $\psi_k = P(S^r_k)$.
• Condensed identity states
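\[
\begin{aligned}
\psi_1 &= \Phi(S^{r*}_1), \qquad \psi_2 = \Phi(S^{r*}_6), \\
\psi_4 &= \Phi(S^{r*}_7), \qquad \psi_6 = \Phi(S^{r*}_8), \qquad \psi_9 = \Phi(S^{r*}_{15}), \\
\psi_3 &= \Phi(S^{r*}_2) + \Phi(S^{r*}_3) = 2\Phi(S^{r*}_2), \\
\psi_5 &= \Phi(S^{r*}_4) + \Phi(S^{r*}_5) = 2\Phi(S^{r*}_4), \\
\psi_7 &= \Phi(S^{r*}_9) + \Phi(S^{r*}_{12}) = 2\Phi(S^{r*}_9), \\
\psi_8 &= \Phi(S^{r*}_{10}) + \Phi(S^{r*}_{11}) + \Phi(S^{r*}_{13}) + \Phi(S^{r*}_{14}) = 4\Phi(S^{r*}_{10}).
\end{aligned}
\]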
• Relationship between ψk and Δk
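\[
\begin{aligned}
\psi_1 &= \Delta_1 + \tfrac{1}{4}\Delta_3 + \tfrac{1}{4}\Delta_5 + \tfrac{1}{8}\Delta_7 + \tfrac{1}{16}\Delta_8, \\
\psi_2 &= \Delta_2 + \tfrac{1}{4}\Delta_3 + \tfrac{1}{2}\Delta_4 + \tfrac{1}{4}\Delta_5 + \tfrac{1}{2}\Delta_6
          + \tfrac{1}{8}\Delta_7 + \tfrac{3}{16}\Delta_8 + \tfrac{1}{4}\Delta_9, \\
\psi_3 &= \tfrac{1}{2}\Delta_3 + \tfrac{1}{4}\Delta_7 + \tfrac{1}{8}\Delta_8, \\
\psi_4 &= \tfrac{1}{2}\Delta_4 + \tfrac{1}{8}\Delta_8 + \tfrac{1}{4}\Delta_9, \\
\psi_5 &= \tfrac{1}{2}\Delta_5 + \tfrac{1}{4}\Delta_7 + \tfrac{1}{8}\Delta_8, \\
\psi_6 &= \tfrac{1}{2}\Delta_6 + \tfrac{1}{8}\Delta_8 + \tfrac{1}{4}\Delta_9, \\
\psi_7 &= \tfrac{1}{4}\Delta_7, \qquad \psi_8 = \tfrac{1}{4}\Delta_8, \qquad \psi_9 = \tfrac{1}{4}\Delta_9.
\end{aligned}
\]
\[
\begin{aligned}
\Delta_1 &= \psi_1 - \tfrac{1}{2}\psi_3 - \tfrac{1}{2}\psi_5 + \tfrac{1}{2}\psi_7 + \tfrac{1}{4}\psi_8, \\
\Delta_2 &= \psi_2 - \tfrac{1}{2}\psi_3 - \psi_4 - \tfrac{1}{2}\psi_5 - \psi_6
            + \tfrac{1}{2}\psi_7 + \tfrac{3}{4}\psi_8 + \psi_9, \\
\Delta_3 &= 2\psi_3 - 2\psi_7 - \psi_8, \\
\Delta_4 &= 2\psi_4 - \psi_8 - 2\psi_9, \\
\Delta_5 &= 2\psi_5 - 2\psi_7 - \psi_8, \\
\Delta_6 &= 2\psi_6 - \psi_8 - 2\psi_9, \\
\Delta_7 &= 4\psi_7, \qquad \Delta_8 = 4\psi_8, \qquad \Delta_9 = 4\psi_9.
\end{aligned}
\]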
The expressions for the ψk's in terms of the Δk's can be obtained by conditioning on the (non-random) condensed identity states of the two persons' alleles.
For example, given any of S2, S4, S6 and S9, the four randomly sampled alleles cannot all be IBD; given S1, the four randomly sampled alleles, two from each person, are IBD with probability 1; etc.
The expressions for the Δk's in terms of the ψk's can easily be obtained by backward substitution.
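The two systems of equations relating ψk and Δk are inverse to each other, which can be checked mechanically. The Python sketch below encodes the forward coefficients as a matrix and the back-substituted formulas as a function, then verifies the round trip on each unit vector. The names `A`, `psi_from_delta` and `delta_from_psi` are assumptions made for this illustration only.

```python
from fractions import Fraction as F

# Row k (0-based) holds the coefficients of Δ1..Δ9 in ψ(k+1).
A = [
    [1, 0, F(1, 4), 0,       F(1, 4), 0,       F(1, 8), F(1, 16), 0],
    [0, 1, F(1, 4), F(1, 2), F(1, 4), F(1, 2), F(1, 8), F(3, 16), F(1, 4)],
    [0, 0, F(1, 2), 0,       0,       0,       F(1, 4), F(1, 8),  0],
    [0, 0, 0,       F(1, 2), 0,       0,       0,       F(1, 8),  F(1, 4)],
    [0, 0, 0,       0,       F(1, 2), 0,       F(1, 4), F(1, 8),  0],
    [0, 0, 0,       0,       0,       F(1, 2), 0,       F(1, 8),  F(1, 4)],
    [0, 0, 0,       0,       0,       0,       F(1, 4), 0,        0],
    [0, 0, 0,       0,       0,       0,       0,       F(1, 4),  0],
    [0, 0, 0,       0,       0,       0,       0,       0,        F(1, 4)],
]

def psi_from_delta(delta):
    """Condition on the condensed identity states: ψ = A Δ."""
    return [sum(F(a) * F(d) for a, d in zip(row, delta)) for row in A]

def delta_from_psi(psi):
    """Back-substitution formulas; p is 1-based to match the text."""
    p = [None] + [F(x) for x in psi]
    return [
        p[1] - p[3] / 2 - p[5] / 2 + p[7] / 2 + p[8] / 4,
        p[2] - p[3] / 2 - p[4] - p[5] / 2 - p[6]
            + p[7] / 2 + F(3, 4) * p[8] + p[9],
        2 * p[3] - 2 * p[7] - p[8],
        2 * p[4] - p[8] - 2 * p[9],
        2 * p[5] - 2 * p[7] - p[8],
        2 * p[6] - p[8] - 2 * p[9],
        4 * p[7], 4 * p[8], 4 * p[9],
    ]

# Round trip on each unit vector: Δ -> ψ -> Δ must return the input.
for k in range(9):
    unit = [F(int(i == k)) for i in range(9)]
    assert delta_from_psi(psi_from_delta(unit)) == unit
```

Each column of `A` sums to 1, reflecting that the ψk's are probabilities of mutually exclusive and exhaustive random states whichever condensed state holds.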
§4.4. Calculation of generalized kinship coefficients
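• An example: generalized kinship coefficient for first cousins 7 and 8.
\[
\begin{aligned}
\tfrac{1}{4}\psi_8
&= \Phi(\{G_7^1, G_8^1\}, \{G_7^2\}, \{G_8^2\}) \\
&= \tfrac{1}{4}\left[\Phi(\{G_3^1, G_8^1\}, \{G_5^1\}, \{G_8^2\}) + \Phi(\{G_5^1, G_8^1\}, \{G_3^1\}, \{G_8^2\})\right] \\
&= \tfrac{1}{16}\left[\Phi(\{G_3^1, G_4^1\}, \{G_5^1\}, \{G_6^1\}) + \Phi(\{G_3^1, G_6^1\}, \{G_5^1\}, \{G_4^1\})\right] \\
&= \tfrac{1}{16}\cdot\tfrac{1}{2}\left[\Phi(\{G_3^1, G_1^1\}, \{G_5^1\}, \{G_6^1\}) + \Phi(\{G_3^1, G_2^1\}, \{G_5^1\}, \{G_6^1\})\right] \\
&= \tfrac{1}{16}\cdot\tfrac{1}{4}\left[\Phi(\{G_1^1, G_1^2\}, \{G_5^1\}, \{G_6^1\}) + \Phi(\{G_2^1, G_2^2\}, \{G_5^1\}, \{G_6^1\})\right] \\
&= \tfrac{1}{16}\cdot\tfrac{1}{4}\left[\tfrac{1}{2} + \tfrac{1}{2}\right] = \tfrac{1}{64}.
\end{aligned}
\]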
• General rules
– Recurrence rules
Recurrence rule 1:
Assume only one allele $G_i$ is sampled from i, whose parents are j and k. Then
\[
\Phi(\{G_i, \ldots\}\{\}\ldots\{\})
= \tfrac{1}{2}\left[\Phi(\{G_j, \ldots\}\{\}\ldots\{\}) + \Phi(\{G_k, \ldots\}\{\}\ldots\{\})\right].
\]
Recurrence rule 2:
Assume that the alleles $G_i^1, \ldots, G_i^s$ are sampled from i for s > 1. If these alleles occur in one block, then
\[
\begin{aligned}
\Phi(\{G_i^1, \ldots, G_i^s, \ldots\}\{\}\ldots\{\})
={}& \left[1 - 2\left(\tfrac{1}{2}\right)^s\right]\Phi(\{G_j, G_k, \ldots\}\{\}\ldots\{\}) \\
&+ \left(\tfrac{1}{2}\right)^s\Phi(\{G_j, \ldots\}\{\}\ldots\{\})
 + \left(\tfrac{1}{2}\right)^s\Phi(\{G_k, \ldots\}\{\}\ldots\{\}).
\end{aligned}
\]
Recurrence rule 3:
Assume that the alleles $G_i^1, \ldots, G_i^s, G_i^{s+1}, \ldots, G_i^{s+t}$ are sampled from i. If the first s alleles occur in one block and the remaining t alleles occur in another block, then
\[
\begin{aligned}
\Phi(\{G_i^1, \ldots, G_i^s, \ldots\}\{G_i^{s+1}, \ldots, G_i^{s+t}, \ldots\}\{\}\ldots\{\})
= \left(\tfrac{1}{2}\right)^{s+t}\big[&\Phi(\{G_j, \ldots\}\{G_k, \ldots\}\{\}\ldots\{\}) \\
&+ \Phi(\{G_k, \ldots\}\{G_j, \ldots\}\{\}\ldots\{\})\big].
\end{aligned}
\]
– Boundary rules
Boundary rule 1:
If any person is involved in three or more
blocks then Φ = 0.
Boundary rule 2:
If two founders appear in the same block
then Φ = 0.
Boundary rule 3:
If only founders contribute sampled alleles and neither the condition in boundary rule 1 nor the condition in boundary rule 2 pertains, then Φ = (1/2)^(m1−m2), where m1 is the total number of sampled founder alleles and m2 is the total number of founders sampled.
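The recurrence and boundary rules above can be combined into a short recursive procedure. The Python sketch below is one possible reading of the rules, not a reference implementation: the factory name `make_phi`, the encoding of a partition as a list of tuples of person numbers (one entry per sampled allele), and the choice to always substitute for the highest-numbered non-founder are assumptions made for this illustration. Recurrence rule 1 is handled as the s = 1 case of rule 2, where the first term has coefficient zero.

```python
from fractions import Fraction
from functools import lru_cache

HALF = Fraction(1, 2)

def make_phi(parents):
    """Return a function computing generalized kinship coefficients.

    parents maps each person to None (founder) or to a pair (j, k) of
    parents; every parent must be numbered lower than his or her children.
    """
    def canon(blocks):
        # Blocks, and the alleles inside a block, are unordered.
        return tuple(sorted(tuple(sorted(b)) for b in blocks))

    @lru_cache(maxsize=None)
    def phi(blocks):
        people = {p for b in blocks for p in b}
        # Boundary rule 1: nobody can contribute to three or more blocks.
        if any(sum(p in b for b in blocks) >= 3 for p in people):
            return Fraction(0)
        nonfounders = [p for p in people if parents[p] is not None]
        if not nonfounders:
            # Boundary rule 2: two founders in the same block.
            if any(len(set(b)) > 1 for b in blocks):
                return Fraction(0)
            # Boundary rule 3: (1/2)^(m1 - m2).
            m1 = sum(len(b) for b in blocks)
            return HALF ** (m1 - len(people))
        i = max(nonfounders)              # no sampled person descends from i
        j, k = parents[i]
        hit = [tuple(x for x in b if x != i) for b in blocks if i in b]
        counts = [b.count(i) for b in blocks if i in b]
        rest = [b for b in blocks if i not in b]
        if len(hit) == 1:
            # Recurrence rules 1 (s = 1) and 2 (s > 1).
            w = HALF ** counts[0]
            total = w * (phi(canon(rest + [hit[0] + (j,)]))
                         + phi(canon(rest + [hit[0] + (k,)])))
            if counts[0] > 1:
                total += (1 - 2 * w) * phi(canon(rest + [hit[0] + (j, k)]))
            return total
        # Recurrence rule 3: the two blocks trace back to different parents.
        a, b = hit
        return HALF ** sum(counts) * (
            phi(canon(rest + [a + (j,), b + (k,)]))
            + phi(canon(rest + [a + (k,), b + (j,)])))

    return lambda blocks: phi(canon(blocks))

# First cousins 7 and 8: the partition {G7, G8}, {G7}, {G8}.
cousins = {1: None, 2: None, 3: (1, 2), 4: (1, 2),
           5: None, 6: None, 7: (3, 5), 8: (4, 6)}
print(make_phi(cousins)([(7, 8), (7,), (8,)]))  # 1/64
```

With a single block of two alleles, one from each of two persons, the same function returns the ordinary kinship coefficient, which gives a convenient cross-check against the matrix algorithm of §4.1.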
– The brother-sister mating example
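\[
\begin{aligned}
\tfrac{1}{4}\psi_8
&= \Phi(\{G_5^1, G_6^1\}, \{G_5^2\}, \{G_6^2\}) \\
&= \tfrac{1}{4}\left[\Phi(\{G_3^1, G_6^1\}, \{G_4^1\}, \{G_6^2\}) + \Phi(\{G_4^1, G_6^1\}, \{G_3^1\}, \{G_6^2\})\right] \\
&= \tfrac{1}{2}\Phi(\{G_3^1, G_6^1\}, \{G_4^1\}, \{G_6^2\}) \\
&= \tfrac{1}{8}\left[\Phi(\{G_3^1, G_3^2\}, \{G_4^1\}, \{G_4^2\}) + \Phi(\{G_3^1, G_4^2\}, \{G_4^1\}, \{G_3^2\})\right] \\
&= \tfrac{1}{8}[A + B],
\end{aligned}
\]
where A and B denote the first and second terms in the last bracket.
\[
\begin{aligned}
A &= \tfrac{1}{2}\Phi(\{G_1^1, G_2^1\}, \{G_4^1\}, \{G_4^2\})
   + \tfrac{1}{4}\left[\Phi(\{G_1^1\}, \{G_4^1\}, \{G_4^2\}) + \Phi(\{G_2^1\}, \{G_4^1\}, \{G_4^2\})\right] \\
  &= 0 + \tfrac{1}{2}\Phi(\{G_1^1\}, \{G_4^1\}, \{G_4^2\}) \\
  &= \tfrac{1}{8}\left[\Phi(\{G_1^1\}, \{G_1^2\}, \{G_2^1\}) + \Phi(\{G_1^1\}, \{G_2^1\}, \{G_1^2\})\right] \\
  &= \tfrac{1}{4}\Phi(\{G_1^1\}, \{G_1^2\}, \{G_2^1\}) = \tfrac{1}{8}.
\end{aligned}
\]
\[
\begin{aligned}
B &= \tfrac{1}{4}\left[\Phi(\{G_1^1, G_4^2\}, \{G_4^1\}, \{G_2^1\}) + \Phi(\{G_2^1, G_4^2\}, \{G_4^1\}, \{G_1^1\})\right] \\
  &= \tfrac{1}{2}\Phi(\{G_1^1, G_4^2\}, \{G_4^1\}, \{G_2^1\}) \\
  &= \tfrac{1}{8}\left[\Phi(\{G_1^1, G_1^2\}, \{G_2^2\}, \{G_2^1\}) + \Phi(\{G_1^1, G_2^2\}, \{G_1^2\}, \{G_2^1\})\right] \\
  &= \tfrac{1}{8} \times \tfrac{1}{4} = \tfrac{1}{32}.
\end{aligned}
\]
Hence
\[
\tfrac{1}{4}\psi_8 = \tfrac{1}{8}\left(\tfrac{1}{8} + \tfrac{1}{32}\right) = \tfrac{5}{256},
\qquad \psi_8 = \tfrac{5}{64}.
\]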
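The result for the brother-sister mating example can be cross-checked without the recurrence rules. The Python sketch below uses a different technique, exhaustive gene dropping: it labels the four founder alleles, enumerates all 2^8 equally likely transmission patterns through the eight meioses of the pedigree, and classifies the genotypes of 5 and 6 into the condensed identity states S1, ..., S9. The function names are hypothetical, and the pedigree (3 and 4 children of 1 and 2; 5 and 6 children of 3 and 4) is hard-coded for the example.

```python
from fractions import Fraction
from itertools import product

def condensed_state(gi, gj):
    """Index k (1..9) of the condensed identity state of genotypes gi, gj."""
    i_ibd = gi[0] == gi[1]
    j_ibd = gj[0] == gj[1]
    shared = len(set(gi) & set(gj))   # number of distinct alleles in common
    if i_ibd and j_ibd:
        return 1 if shared else 2
    if i_ibd:
        return 3 if shared else 4
    if j_ibd:
        return 5 if shared else 6
    return {2: 7, 1: 8, 0: 9}[shared]

def identity_coefficients_sib_mating():
    """Exact Δ1..Δ9 for persons 5 and 6 by enumerating all transmissions."""
    g1, g2 = (0, 1), (2, 3)           # founders carry four distinct alleles
    counts = [0] * 10                 # counts[k] for state S_k, k = 1..9
    for m in product((0, 1), repeat=8):
        g3 = (g1[m[0]], g2[m[1]])     # child of 1 and 2
        g4 = (g1[m[2]], g2[m[3]])     # child of 1 and 2
        g5 = (g3[m[4]], g4[m[5]])     # child of 3 and 4
        g6 = (g3[m[6]], g4[m[7]])     # child of 3 and 4
        counts[condensed_state(g5, g6)] += 1
    return [Fraction(c, 2 ** 8) for c in counts[1:]]

delta = identity_coefficients_sib_mating()
print(delta[7])  # Δ8; 5/16
```

The printed value agrees with Δ8 = 4ψ8 when ψ8 = 5/64, and inserting the nine coefficients into Φij = Δ1 + (Δ3 + Δ5 + Δ7)/2 + Δ8/4 gives the kinship coefficient Φ56 = 3/8.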