PLoS One. 2015 Aug 28;10(8):e0135368.
doi: 10.1371/journal.pone.0135368. eCollection 2015.

An Improved F(st) Estimator

Guanjie Chen et al. PLoS One. 2015.

Abstract

The fixation index F(st) plays a central role in ecological and evolutionary genetic studies. The estimators of Wright ($\hat{F}_{st}^{1}$), Weir and Cockerham ($\hat{F}_{st}^{2}$), and Hudson et al. ($\hat{F}_{st}^{3}$) are widely used to measure genetic differences among populations, but all have limitations. We propose a minimum variance estimator $\hat{F}_{st}^{m}$ using [Formula: see text] and [Formula: see text]. We tested $\hat{F}_{st}^{m}$ in simulations and applied it to 120 unrelated East African individuals from Ethiopia and 11 subpopulations in HapMap 3 with 464,642 SNPs. Our simulation study showed that $\hat{F}_{st}^{m}$ has smaller bias than [Formula: see text] for small sample sizes and smaller bias than [Formula: see text] for large sample sizes. Also, $\hat{F}_{st}^{m}$ has smaller variance than [Formula: see text] for small F(st) values and smaller variance than [Formula: see text] for large F(st) values. We demonstrated that approximately 30 subpopulations and 30 individuals per subpopulation are required to accurately estimate F(st).
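To make the quantities discussed in the abstract concrete, the following is a minimal sketch of two textbook-style F(st) calculations for a single biallelic SNP: a Wright-style ratio of the between-subpopulation variance in allele frequency to p(1 - p), and a Hudson-style two-population estimator with a finite-sample correction. The function names, the equal weighting of subpopulations, and the convention that n1 and n2 count sampled allele copies are assumptions made for illustration; the exact definitions used in the paper, and the proposed minimum variance estimator, are not given in this record and are not reproduced here.

```python
from statistics import fmean, pvariance


def wright_fst(freqs):
    """Wright-style F(st) at one biallelic locus.

    freqs: allele frequency in each subpopulation (equal weights assumed).
    Returns var(p) / (p_bar * (1 - p_bar)) with no sample-size correction.
    """
    p_bar = fmean(freqs)
    if p_bar <= 0.0 or p_bar >= 1.0:
        raise ValueError("locus is monomorphic; F(st) is undefined")
    return pvariance(freqs) / (p_bar * (1.0 - p_bar))


def hudson_fst(p1, n1, p2, n2):
    """Hudson-style F(st) for two subpopulations at one biallelic locus.

    p1, p2: sample allele frequencies.
    n1, n2: number of sampled allele copies (assumed convention), each > 1.
    The two subtracted terms correct the squared frequency difference
    for sampling noise, so the result can be negative.
    """
    if n1 < 2 or n2 < 2:
        raise ValueError("need at least two sampled allele copies per group")
    num = (p1 - p2) ** 2 - p1 * (1 - p1) / (n1 - 1) - p2 * (1 - p2) / (n2 - 1)
    den = p1 * (1 - p2) + p2 * (1 - p1)
    if den == 0:
        raise ValueError("locus is fixed for the same allele in both groups")
    return num / den


if __name__ == "__main__":
    # Two subpopulations with allele frequencies 0.1 and 0.3.
    print(wright_fst([0.1, 0.3]))
    # Identical sample frequencies with small samples give a negative value.
    print(hudson_fst(0.5, 11, 0.5, 11))
```

With identical sample frequencies the Wright-style ratio is exactly zero, whereas the Hudson-style correction yields a slightly negative value for small samples. This illustrates in a simple setting why bias and variance depend on sample size and on the choice of estimator.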

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

Fig 1. The relationship between $\hat{F}_{st}$ and $\hat{\vartheta}$ for simulated data.
The x-axis shows the difference of allele frequencies between two subpopulations, $\hat{\vartheta}$ (left plots) and $\hat{\vartheta}^2$ (right plots); the y-axis shows $\hat{F}_{st}$ values for Wright's (top row), Weir and Cockerham's (second row), the modified (third row), and Hudson et al.'s (bottom row) estimators; the legend indicates the sample sizes $n_1$ (before the hyphen) and $n_2$ (after the hyphen).
Fig 2. Bias as a function of total sample size.
The x-axis shows the total sample size ($n_1 + n_2$). The y-axis shows $\hat{F}_{st}^{1}$ (red), $\hat{F}_{st}^{2}$ (blue), $\hat{F}_{st}^{m}$ (green), and $\hat{F}_{st}^{3}$ (orange) for r = 2.
Fig 3. Effect of the number of subpopulations on bias.
The x-axis shows the number of subpopulations. The y-axis shows the mean (left) and variance (right) of $\hat{F}_{st}^{1}$ (red), $\hat{F}_{st}^{2}$ (blue), $\hat{F}_{st}^{m}$ (green), and $\hat{F}_{st}^{3}$ (orange) values, given $F_{st} = 0.5$ and an average allele frequency p = 0.2. The top plot represents 5 individuals per subpopulation and the bottom plot represents 1000 individuals per subpopulation.
Fig 4. Effect of the number of individuals per subpopulation on bias.
The x-axis shows the number of individuals per subpopulation. The y-axis shows the mean (left) and variance (right) of $\hat{F}_{st}^{1}$ (red), $\hat{F}_{st}^{2}$ (blue), $\hat{F}_{st}^{m}$ (green), and $\hat{F}_{st}^{3}$ (orange) values, given $F_{st} = 0.5$ and an average allele frequency p = 0.2. From top to bottom, the plots represent the number of subpopulations r = 10, 20, and 40, respectively.

References

    1. Wright S. Genetical structure of populations. Nature 1950; 166(4215): 247–249. doi: 10.1038/166247a0
    2. Wright S. The genetical structure of populations. Ann Eugen 1951; 15(4): 323–354.
    3. Cockerham CC. Variance of Gene Frequencies. Evolution 1969; 23(1): 72–84. doi: 10.2307/2406485
    4. Weir BS, Cockerham CC. Estimating F-Statistics for the Analysis of Population Structure. Evolution 1984; 38(6): 1358–1370. doi: 10.2307/2408641
    5. Willing E-M, Dreyer C, van Oosterhout C. Estimates of Genetic Differentiation Measured by F(st) Do Not Necessarily Require Large Sample Sizes When Using Many SNP Markers. PLOS ONE 2012; 7(8): e42649. doi: 10.1371/journal.pone.0042649
