Journal of Heredity, 2015, 306–309
doi:10.1093/jhered/esv019
Brief Communication
Advance Access publication April 17, 2015

Relationships Between Wright’s $F_{ST}$ and $F_{IS}$ Statistics in a Context of Wahlund Effect

Lev A. Zhivotovsky

From the Institute of General Genetics, The Russian Academy of Sciences, Moscow 119991, Russia.

Address correspondence to Lev A. Zhivotovsky, Institute of General Genetics, 3 Gubkin Street, Moscow 119991, Russia, or e-mail: [email protected].

Received 4 January 2015; First decision 6 February 2015; Accepted 20 March 2015.

Corresponding Editor: Robin Waples

Abstract

Waples (2015) has suggested a formula for the Wahlund effect in a case of unequal contribution of samples from genetically different populations that relates Wright’s inbreeding coefficient, $F_{IS}$, and the normalized variance in allele frequencies between populations, $F_{ST}$. I generalize this relationship to a case of multiple alleles and multiple populations, without assuming Hardy–Weinberg ratios prior to mixing. This can help to evaluate the impact of a Wahlund effect on heterozygote deficiency relative to other factors such as null alleles, nonrandom mating, or selection. It is suggested that the Wahlund effect cannot be an important cause of deviations from Hardy–Weinberg proportions in natural populations in the majority of instances, but it can contribute substantially to heterozygote deficiency in a population that has low genetic diversity compared to that among immigrants, or in mixed samples that contain comparable fractions of individuals from genetically different populations.
Subject areas: Population structure and phylogeography, Bioinformatics and computational genetics

Key words: fixation index, gene diversity, heterozygote deficiency, inbreeding coefficient, mixture rate, population differentiation

In a recent article, Waples (2015) found a relationship between Wright’s inbreeding coefficient, $F_{IS}$, caused by the Wahlund effect (Wahlund 1928), that is, by mixing individuals from genetically different populations, and the normalized variance in allele frequencies between populations, $F_{ST}$. Specifically, if a “recipient” population has a fraction $m$ of individuals that have arrived from a “donor” population with different allele frequencies at a biallelic autosomal locus, then

$$F_{IS} = \frac{4m(1-m)}{C}\,F_{ST},$$

where $C$ is a known function of allele frequencies and migration (mixture) rate $m$. Here, I generalize the Waples relationship to a case of multiple alleles and multiple populations that are not necessarily at Hardy–Weinberg equilibria, and consider how strongly Wahlund effects can influence the inbreeding coefficient compared to other factors; for example, null alleles or nonrandom mating.

Results

A General Formula

Let us consider 2 infinite populations, recipient and donor, with allele frequencies $p_{r1}, p_{r2}, \ldots$ and $p_{d1}, p_{d2}, \ldots$ ($\sum_j p_{rj} = 1$, $\sum_j p_{dj} = 1$), respectively. Their gene diversities, $H_r$ and $H_d$, are defined as the expected heterozygosities, $H_r = 1 - \sum_j p_{rj}^2$, $H_d = 1 - \sum_j p_{dj}^2$ (Nei 1977). Wright’s $F_{IS}$ (Wright 1951), or inbreeding coefficient, was introduced to measure excess of homozygotes, or heterozygote deficiency. Based on Wright’s definition, $F_{IS} = \dfrac{H_{exp} - H_{obs}}{H_{exp}}$, where $H_{exp}$ and $H_{obs}$ are the expected and observed heterozygosities (Nei 1977). In general, the recipient and donor populations are not assumed to be at Hardy–Weinberg equilibrium: their inbreeding coefficients are denoted by $F_{r,IS}$ and $F_{d,IS}$, respectively.
Therefore, the observed heterozygosities in these populations are $H_{r,obs} = (1 - F_{r,IS})H_r$ and $H_{d,obs} = (1 - F_{d,IS})H_d$.

Let the recipient population receive a fraction $m$ of immigrants from the donor population. What is the value of the inbreeding coefficient in the recipient population after the event of immigration (mixing), $F'_{r,IS}$? Hereafter, the prime stands for the values of population statistics in the mixture.

Define Wright’s $F_{ST}$ statistic as $F_{ST} = \dfrac{H_T - \bar{H}}{H_T}$, where $\bar{H} = \frac{1}{2}(H_r + H_d)$ is the average within-population gene diversity and $H_T$ is the total gene diversity of an equal blend of the 2 populations (Nei 1977). As shown in the Appendix, the inbreeding coefficient in the recipient population, $F'_{r,IS}$, after immigration at rate $m$, is as follows:

$$F'_{r,IS} = 4m(1-m)\frac{H_T}{H'_r}F_{ST} + (1-m)\frac{H_r}{H'_r}F_{r,IS} + m\frac{H_d}{H'_r}F_{d,IS}, \tag{1}$$

or, using an expansion of $H'_r$ (see the Appendix),

$$F'_{r,IS} = \frac{4m(1-m)F_{ST} + (1-F_{ST})\{(1-m)(1-\Delta)F_{r,IS} + m(1+\Delta)F_{d,IS}\}}{4m(1-m)F_{ST} + (1-F_{ST})(1-\Delta+2m\Delta)}, \tag{1a}$$

where $H'_r$ is the gene diversity in the recipient population after immigration and $\Delta = \dfrac{H_d - H_r}{H_d + H_r}$ is a relative difference in gene diversity between the recipient and donor populations prior to mixing; the value of $\Delta$ lies between −1 and 1. Equation (1) can be extended to a case of multiple donor populations (see the Appendix, equations A5–A6).

© The American Genetic Association 2015. All rights reserved. For permissions, please e-mail: [email protected]
Journal of Heredity, 2015, Vol. 106, No. 3

A Linear Regression of F-Statistics Transforms

If the expected and observed frequencies of heterozygotes coincide, that is, $F_{r,IS} = 0$ and $F_{d,IS} = 0$, Equation (1) simplifies to

$$F'_{r,IS} = 4m(1-m)\frac{H_T}{H'_r}F_{ST},$$

which generalizes Waples’ equation to a case of multiple alleles.
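Equations (1) and (1a) can be checked numerically against a direct computation from the mixed allele frequencies. The following is a minimal Python sketch and not part of the original analysis: the function names (`gene_diversity`, `f_is_after_mixing`) and the example allele frequencies are hypothetical, chosen only for illustration. It assumes that both populations are scored for the same set of alleles and that $H_T > 0$ and $H_r + H_d > 0$.

```python
from typing import Sequence, Tuple


def gene_diversity(p: Sequence[float]) -> float:
    """Expected heterozygosity, H = 1 - sum_j p_j^2."""
    return 1.0 - sum(x * x for x in p)


def f_is_after_mixing(
    p_r: Sequence[float],
    p_d: Sequence[float],
    m: float,
    f_r: float = 0.0,
    f_d: float = 0.0,
) -> Tuple[float, float, float]:
    """Return F'_IS in the recipient after mixing, computed three ways:
    directly from the mixed frequencies, by Equation (1), by Equation (1a)."""
    h_r, h_d = gene_diversity(p_r), gene_diversity(p_d)
    h_bar = 0.5 * (h_r + h_d)
    # Total diversity of an equal blend of the two populations
    h_t = gene_diversity([0.5 * (a + b) for a, b in zip(p_r, p_d)])
    f_st = (h_t - h_bar) / h_t
    delta = (h_d - h_r) / (h_d + h_r)

    # Direct computation: expected and observed heterozygosity in the mixture
    p_mix = [(1 - m) * a + m * b for a, b in zip(p_r, p_d)]
    h_mix = gene_diversity(p_mix)
    h_obs = (1 - m) * (1 - f_r) * h_r + m * (1 - f_d) * h_d
    direct = (h_mix - h_obs) / h_mix

    # Equation (1)
    eq1 = (4 * m * (1 - m) * h_t * f_st
           + (1 - m) * h_r * f_r
           + m * h_d * f_d) / h_mix

    # Equation (1a)
    wahlund = 4 * m * (1 - m) * f_st
    within = (1 - m) * (1 - delta) * f_r + m * (1 + delta) * f_d
    eq1a = (wahlund + (1 - f_st) * within) / (
        wahlund + (1 - f_st) * (1 - delta + 2 * m * delta)
    )
    return direct, eq1, eq1a


if __name__ == "__main__":
    # Hypothetical three-allele example with 10% immigrants
    print(f_is_after_mixing([0.7, 0.2, 0.1], [0.2, 0.3, 0.5], m=0.1,
                            f_r=0.02, f_d=0.05))
```

The three returned values should agree up to floating-point rounding, since Equations (1) and (1a) are algebraic rearrangements of the direct definition.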
Inverting and taking logarithms, a linear regression between $\ln\left(\dfrac{1}{F'_{r,IS}} - 1\right)$ and $\ln\left(\dfrac{1}{F_{ST}} - 1\right)$ holds with a slope of 1:

$$\ln\left(\frac{1}{F'_{r,IS}} - 1\right) = \ln\left(\frac{1}{F_{ST}} - 1\right) + \ln\frac{1 - \Delta(1-2m)}{4m(1-m)}. \tag{2}$$

Waples (2015) found a relationship between untransformed $F$ values that is linear with a slope of 1 only if $m = 0.5$ or if the populations are fixed for alternative alleles.

Wahlund Effect Versus Other Factors

The level of heterozygote deficiency in a recipient population after mixing depends on both the Wahlund effect itself and the heterozygote deficiencies in the recipient and donor populations prior to migration. The strength of the Wahlund effect is determined by both $F_{ST}$ and $m$. It follows from Equation (1a) that the Wahlund effect dominates over other factors that cause positive within-population $F_{IS}$ values if

$$4m(1-m)\tilde{F}_{ST} > (1-m)(1-\Delta)F_{r,IS} + m(1+\Delta)F_{d,IS}, \tag{3}$$

and vice versa; here $\tilde{F}_{ST} = \dfrac{F_{ST}}{1 - F_{ST}}$ is a linearized $F_{ST}$ value. Failure of the inequality will mean that other factors of heterozygote deficiency dominate and obscure Wahlund effects.

Discussion

The Wahlund effect is frequently invoked to explain heterozygote deficiencies in samples that presumably include a mixture of individuals from genetically different populations of the same or different species. In this context, $F'_{r,IS}$ is an inbreeding coefficient in a mixture with fractions $1 - m$ and $m$ from populations denoted in Equations (1)–(3) with indexes $r$ and $d$, respectively. Alternatively, $F'_{r,IS}$ can be interpreted as heterozygote deficiency in a recipient population due to immigration of (mixing with) individuals from a genetically different donor population. The latter context can be useful in studies of population genetic processes.

The analytical expressions obtained in the current study can be used in 2 ways.

1. Equation (2) might serve as a test for Wahlund effects based on the regression across loci.
However, sampling bias, sampling error, and other statistical properties of the regression parameters are not known; thus, such a test cannot be developed without careful statistical analyses. Nevertheless, this equation can be used as a transformation of Waples’ relationship between the corresponding F-statistics that maintains the linearity.

2. Inequality (3) might be useful to find the bounds within which heterozygote deficiency can be explained, at least partly, by Wahlund effects. Let us assume for simplicity that the inbreeding coefficients in the recipient and donor populations prior to migration are equal to each other, $F_{r,IS} = F_{d,IS}$; denote by $F_{IS}$ their common value. Then, Inequality (3) simplifies to

$$4m(1-m)\tilde{F}_{ST} > \{1 - \Delta(1-2m)\}F_{IS}. \tag{3a}$$

Now, we should compute the value of alternative factors that presumably contribute to $F_{IS}$. One such factor is null alleles (Waples 2015). Null alleles are alleles that go undetected with a given method of genotyping. For example, polymerase chain reaction amplification of a DNA locus can fail due to mutation in the flanking regions for primer hybridization (Callen et al. 1993). Null alleles lead to the false discovery of homozygotes and cause heterozygote deficiency; null alleles can be distributed widely across populations and reach high frequencies (Zhivotovsky et al. 2015, and references therein). As follows from Chakraborty et al. (1992), $F_{IS} = \dfrac{2p_{null}}{1 + p_{null}}$, where $p_{null}$ is the population frequency of null alleles at a target locus. In a case of similar gene diversities ($\Delta = 0$), the Wahlund effect dominates over the contribution of null alleles to heterozygote deficiency if $2m(1-m)\tilde{F}_{ST} > \dfrac{p_{null}}{1 + p_{null}}$. Therefore, if migration rates are not very high, $\tilde{F}_{ST}$ values should be much greater than the frequency of null alleles for the Wahlund effect to contribute significantly to heterozygote deficiency.
For example, even if $p_{null}$ is as low as 0.05 (a frequency that is almost impossible to detect with small sample sizes), the inequality holds if $\tilde{F}_{ST}$ exceeds 0.26 ($F_{ST} > 0.2$) for $m = 0.1$, or exceeds 0.50 ($F_{ST} > 0.33$) for $m = 0.05$. Even if the populations are evenly mixed ($m = 0.5$), $F_{ST}$ values need to exceed 0.09. Analogous comparisons can be made for other factors that decrease heterozygosity and obscure Wahlund effects: selfing, inbreeding as mating of relatives, assortative mating, and diversifying selection. For example, if there is partial selfing with rate $s$ in both populations, then $F_{IS} = \dfrac{s}{2 - s}$ at an equilibrium between outcrossing and selfing (Weir 1996, p. 263). Then, we can use the same arguments as above and conclude that even low rates of selfing may contribute to heterozygote deficiency to a greater extent than Wahlund effects.

The Wahlund effect, as a cause of heterozygote deficiency, is distinguishable if $4m(1-m)\tilde{F}_{ST}$ in Inequality (3) is not small. In a mixture context, large values of $4m(1-m)\tilde{F}_{ST}$ can occur in mixtures with similar fractions of individuals from genetically different populations, for example, if the sample is collected from migration routes or feeding areas, where individuals from more than 1 population often mix but do not interbreed. In a migration context, however, $4m(1-m)\tilde{F}_{ST}$ does not seem to be great in natural populations, as immigration rate and genetic differentiation are usually inversely related. Indeed, $F_{ST} = \dfrac{1}{1 + 4mN_e}$ at migration-drift equilibrium, where $N_e$ is the effective size of populations that exchange migrants at rate $m$. Therefore, since $\tilde{F}_{ST} = F_{ST}/(1 - F_{ST}) = 1/(4mN_e)$, the quantity $4m(1-m)\tilde{F}_{ST}$ equals $\dfrac{1-m}{N_e}$, which is simply negligible. A strong Wahlund effect would mostly occur in nonequilibrium situations where a large fraction of genetically divergent immigrants is present. This cannot last for long in nature, or $F_{ST}$ will quickly decline, but it could easily happen over the short term in human-altered landscapes.
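The thresholds quoted above can be reproduced from Inequality (3a) together with the null-allele and selfing expressions for $F_{IS}$. The sketch below is an illustration only: the function names are hypothetical and do not come from the original article. It assumes equal inbreeding coefficients in both populations, as in (3a), and uses the linearized value $F_{ST}/(1 - F_{ST})$ defined with Inequality (3).

```python
def f_is_null(p_null: float) -> float:
    """F_IS produced by null alleles at frequency p_null: 2p / (1 + p)."""
    return 2.0 * p_null / (1.0 + p_null)


def f_is_selfing(s: float) -> float:
    """Equilibrium F_IS under partial selfing at rate s: s / (2 - s)."""
    return s / (2.0 - s)


def min_linearized_fst(m: float, f_is: float, delta: float = 0.0) -> float:
    """Linearized F_ST above which the Wahlund term in (3a) dominates."""
    return (1.0 - delta * (1.0 - 2.0 * m)) * f_is / (4.0 * m * (1.0 - m))


def delinearize(f_tilde: float) -> float:
    """Convert a linearized value F/(1 - F) back to F."""
    return f_tilde / (1.0 + f_tilde)


def wahlund_term_at_equilibrium(m: float, n_e: float) -> float:
    """4m(1-m) * linearized F_ST when F_ST = 1 / (1 + 4 m N_e)."""
    f_st = 1.0 / (1.0 + 4.0 * m * n_e)
    return 4.0 * m * (1.0 - m) * f_st / (1.0 - f_st)


if __name__ == "__main__":
    for m in (0.1, 0.05, 0.5):
        t = min_linearized_fst(m, f_is_null(0.05))
        print(f"m={m}: linearized F_ST > {t:.3f}, F_ST > {delinearize(t):.3f}")
    # Hypothetical equilibrium example: m = 0.01, N_e = 500
    print(wahlund_term_at_equilibrium(0.01, 500.0), (1 - 0.01) / 500.0)
```

For $p_{null} = 0.05$ and $\Delta = 0$ this gives linearized thresholds of roughly 0.265, 0.501, and 0.095 for $m$ = 0.1, 0.05, and 0.5 (about 0.209, 0.334, and 0.087 on the untransformed $F_{ST}$ scale), in line with the rounded values in the text. The last line illustrates that the Wahlund term reduces to $(1 - m)/N_e$ at migration-drift equilibrium.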
Another case in which a Wahlund effect may have a substantial impact on heterozygote deficiency is low gene diversity in a recipient population relative to that in the donor population. As an extreme, let us assume that a recipient population is monomorphic, whereas the donor population is polymorphic; that is, $\Delta = 1$. It follows from Inequality (3) that the Wahlund effect contributes significantly if $2(1-m)\tilde{F}_{ST}$ exceeds $F_{d,IS}$, the inbreeding coefficient in the donor population prior to migration. This might be used for conservation biology purposes, for example, when a population with low genetic variation is at risk of invasion from populations with higher genetic diversity.

This brief note does not concern statistical aspects such as estimation procedures for Equation (2) or tests of Inequality (3), although this is an important issue for practical applications, which include estimates of the sampling biases and sampling errors for the basic parameters of this study, $F_{IS}$ and $F_{ST}$, and their transforms. For example, the logarithm transforms of the inverse of F-statistics in Equation (2) seem to be greatly biased for small $F$ values and small sample sizes. Also, testing Inequality (3) requires knowledge of the joint sampling errors of $F$ values. Both analytical approaches and resampling procedures should be used to develop a statistical basis for estimating relationships between F-statistics.

Funding

The Russian Foundation for Basic Research (grants 14-04-92005NNS and 15-04-02511) and RAS Program “Biodiversity in Life Systems” to L.A.Zh.

Acknowledgments

I am grateful to Dr. Robin Waples and 2 anonymous reviewers for their valuable comments on the manuscript.

Appendix. A Model of the Wahlund Effect

One Donor Population

Let us consider a recipient population that has received a fraction $m$ of migrants from a genetically distinct donor population. Hereafter, the prime stands for the values of population statistics after mixing.

Let $p_{r1}, p_{r2}, \ldots$ and $p_{d1}, p_{d2}, \ldots$ ($\sum_j p_{rj} = 1$, $\sum_j p_{dj} = 1$) be allele frequencies at an autosomal locus in the recipient and donor populations, respectively, prior to migration. Their allele diversities, $H_r$ and $H_d$, are defined as $H_r = 1 - \sum_j p_{rj}^2$, $H_d = 1 - \sum_j p_{dj}^2$. In general, the recipient and donor populations are not assumed to be at Hardy–Weinberg equilibrium: their inbreeding coefficients prior to migration are denoted by $F_{r,IS}$ and $F_{d,IS}$, respectively. Therefore, the observed heterozygosities in these populations prior to migration are $H_{r,obs} = (1 - F_{r,IS})H_r$ and $H_{d,obs} = (1 - F_{d,IS})H_d$.

The strength of differentiation between the 2 populations prior to migration, Wright’s $F_{ST}$ value, is defined as $F_{ST} = \dfrac{H_T - \bar{H}}{H_T}$, where $\bar{H} = \frac{1}{2}(H_r + H_d)$ is the average within-population allele diversity, and $H_T = 1 - \sum_j \left(\dfrac{p_{rj} + p_{dj}}{2}\right)^2$ is the total diversity of an equal blend of the 2 populations. Obviously,

$$H_T = \tfrac{1}{4}H_r + \tfrac{1}{4}H_d + \tfrac{1}{2}\Bigl(1 - \sum_j p_{rj}p_{dj}\Bigr). \tag{A1}$$

Let $\Delta_r = \dfrac{H_r - \bar{H}}{\bar{H}}$ and $\Delta_d = \dfrac{H_d - \bar{H}}{\bar{H}}$ be normalized deviations of the within-population allele diversities prior to migration from their average value. Obviously, $\Delta_d = -\Delta_r = \Delta$, where $\Delta = \dfrac{H_d - H_r}{H_d + H_r}$; the $\Delta$s are equal to 0 if $H_d = H_r$, and $\Delta$ lies between −1 and 1. Further,

$$H_r = \bar{H}(1 + \Delta_r), \quad H_d = \bar{H}(1 + \Delta_d), \quad \bar{H} = (1 - F_{ST})H_T. \tag{A2}$$

After immigration into the recipient population, at rate $m$, the allele frequencies and the observed heterozygosity in the recipient population change to

$$p'_{rj} = (1-m)p_{rj} + mp_{dj}, \quad H'_{r,obs} = (1-m)(1 - F_{r,IS})H_r + m(1 - F_{d,IS})H_d, \tag{A3}$$

and the expected heterozygosity becomes

$$H'_r = 1 - \sum_j (p'_{rj})^2 = (1-m)^2 H_r + m^2 H_d + 2m(1-m)\Bigl(1 - \sum_j p_{rj}p_{dj}\Bigr).$$

Using equations (A1)–(A3), obtain

$$\begin{aligned}
H'_r &= (1-m)^2 H_r + m^2 H_d + 2m(1-m)(2H_T - \bar{H}) \\
&= 4m(1-m)F_{ST}H_T + (1-m)H_r + mH_d \\
&= 4m(1-m)F_{ST}H_T + (1 - F_{ST})(1 - \Delta_d + 2m\Delta_d)H_T,
\end{aligned}$$

$$\begin{aligned}
H'_r - H'_{r,obs} &= 4m(1-m)F_{ST}H_T + (1-m)F_{r,IS}H_r + mF_{d,IS}H_d \\
&= 4m(1-m)F_{ST}H_T + \{(1-m)(1 + \Delta_r)F_{r,IS} + m(1 + \Delta_d)F_{d,IS}\}(1 - F_{ST})H_T.
\end{aligned}$$

This implies

$$F'_{r,IS} = 4m(1-m)\frac{H_T}{H'_r}F_{ST} + (1-m)\frac{H_r}{H'_r}F_{r,IS} + m\frac{H_d}{H'_r}F_{d,IS}, \tag{A4}$$

which is Equation (1) of the main text.

Multiple Donor Populations

Let us denote by $k$ the number of donor populations; $m_i$ is the fraction of migrants in the recipient population from donor population $i$ ($i = 1, 2, \ldots, k$) and $m$ is the total fraction of migrants ($m = m_1 + m_2 + \ldots + m_k$); $p_{ij}$, $H_i$, and $F_{i,IS}$ are the allele frequencies, allele diversity, and inbreeding coefficient in the $i$th donor population prior to migration; $H_{T(ri)}$ is the total diversity of an equal blend of the recipient and the $i$th donor population and $H_{T(is)}$ is that for donor populations $i$ and $s$; $F_{ST(ri)}$ is the $F_{ST}$ value between the recipient population and the $i$th donor population and $F_{ST(is)}$ is that between donor populations $i$ and $s$; $H'_r$ and $F'_{r,IS}$ are the allele diversity and the inbreeding coefficient in the recipient population after migration from all donor populations. Then

$$p'_{rj} = (1-m)p_{rj} + \sum_{i=1}^{k} m_i p_{ij}, \quad H'_{r,obs} = (1-m)(1 - F_{r,IS})H_r + \sum_{i=1}^{k} m_i(1 - F_{i,IS})H_i,$$

$$H'_r = 1 - \sum_j (p'_{rj})^2 = (1-m)^2 H_r + \sum_{i=1}^{k} m_i^2 H_i + \sum_{i=1}^{k} 2m_i(1-m)\Bigl(1 - \sum_j p_{rj}p_{ij}\Bigr) + \sum_{i=1}^{k-1}\sum_{s=i+1}^{k} 2m_i m_s\Bigl(1 - \sum_j p_{ij}p_{sj}\Bigr).$$

As in Equation (A1),

$$1 - \sum_j p_{ij}p_{sj} = 2H_{T(is)} - \bar{H}_{is} = 2H_{T(is)} - (1 - F_{ST(is)})H_{T(is)} = (1 + F_{ST(is)})H_{T(is)},$$

where $\bar{H}_{is}$ and $H_{T(is)}$ are the average of allele diversities and the total diversity in an equal blend of populations $i$ and $s$, and $F_{ST(is)} = \dfrac{H_{T(is)} - \bar{H}_{is}}{H_{T(is)}}$ is the $F_{ST}$ value between these populations; the same relationships hold between the recipient population and the $i$th population. Therefore,

$$H'_r = (1-m)^2 H_r + \sum_{i=1}^{k} m_i^2 H_i + \sum_{i=1}^{k} 2m_i(1-m)(1 + F_{ST(ri)})H_{T(ri)} + \sum_{i=1}^{k-1}\sum_{s=i+1}^{k} 2m_i m_s(1 + F_{ST(is)})H_{T(is)},$$

and

$$\begin{aligned}
H'_r - H'_{r,obs} &= \sum_{i=1}^{k} 2m_i(1-m)(1 + F_{ST(ri)})H_{T(ri)} + \sum_{i=1}^{k-1}\sum_{s=i+1}^{k} 2m_i m_s(1 + F_{ST(is)})H_{T(is)} \\
&\quad - m(1-m)H_r - \sum_{i=1}^{k} m_i(1 - m_i)H_i + (1-m)F_{r,IS}H_r + \sum_{i=1}^{k} m_i F_{i,IS}H_i \\
&= \sum_{i=1}^{k} 2m_i(1-m)F_{ST(ri)}H_{T(ri)} + \sum_{i=1}^{k-1}\sum_{s=i+1}^{k} 2m_i m_s F_{ST(is)}H_{T(is)} + (1-m)F_{r,IS}H_r + \sum_{i=1}^{k} m_i F_{i,IS}H_i \\
&\quad + \sum_{i=1}^{k} 2m_i(1-m)H_{T(ri)} + \sum_{i=1}^{k-1}\sum_{s=i+1}^{k} 2m_i m_s H_{T(is)} - m(1-m)H_r - \sum_{i=1}^{k} m_i(1 - m_i)H_i.
\end{aligned}$$

After simple algebra, the following relationship holds for $F'_{r,IS} = (H'_r - H'_{r,obs})/H'_r$:

$$F'_{r,IS} = 4\sum_{i=1}^{k} m_i(1-m)\frac{H_{T(ri)}}{H'_r}F_{ST(ri)} + 4\sum_{i=1}^{k-1}\sum_{s=i+1}^{k} m_i m_s\frac{H_{T(is)}}{H'_r}F_{ST(is)} + (1-m)\frac{H_r}{H'_r}F_{r,IS} + \sum_{i=1}^{k} m_i\frac{H_i}{H'_r}F_{i,IS}, \tag{A5}$$

or, in a shorter form,

$$F'_{r,IS} = 4\sum_{i=0}^{k-1}\sum_{s=i+1}^{k} m_i m_s\frac{H_{T(is)}}{H'_r}F_{ST(is)} + \sum_{i=0}^{k} m_i\frac{H_i}{H'_r}F_{i,IS}, \tag{A6}$$

where index “0” stands for the recipient population; that is, $F_{0,IS} = F_{r,IS}$, $H_0 = H_r$, $H_{T(0s)} = H_{T(rs)}$, $F_{ST(0s)} = F_{ST(rs)}$, and $m_0 = 1 - m$.

References

Callen DF, Thompson AD, Shen Y, Phillips HA, Richards RI, Mulley JC, Sutherland GR. 1993. Incidence and origin of “null” alleles in the (AC)n microsatellite markers. Am J Hum Genet. 52:922–927.

Chakraborty R, De Andrade M, Daiger SP, Budowle B. 1992. Apparent heterozygote deficiencies observed in DNA typing data and their implications in forensic applications. Ann Hum Genet. 56:45–57.

Nei M. 1977. F-statistics and analysis of gene diversity in subdivided populations. Ann Hum Genet. 41:225–233.

Wahlund S. 1928. Zusammensetzung von Populationen und Korrelationserscheinungen vom Standpunkt der Vererbungslehre aus betrachtet. Hereditas. 11:65–106.

Waples RS. 2015. Testing for Hardy-Weinberg proportions: have we lost the plot? J Hered. 106:1–19.

Weir BS. 1996. Genetic data analysis II: Methods for discrete population genetic data. Sunderland (MA): Sinauer Associates.

Wright S. 1951. The genetical structure of populations. Ann Eugen. 15:323–354.

Zhivotovsky LA, Kordicheva SY, Shaikhaev EG, Rubtsova GA, Afanasiev KI, Shitova MV, Fuller SA, Shaikhaev GO, Gharrett AJ. 2015. Efficiency of the inbreeding coefficient f and other estimators in detecting null alleles, as revealed by empirical data of locus oke3 across 65 populations of chum salmon Oncorhynchus keta. J Fish Biol. 86:402–408.