Estimating and interpreting FST: the impact of rare variants

Gaurav Bhatia¹, Nick Patterson, Sriram Sankararaman, Alkes L Price

PMID: 23861382
PMCID: PMC3759727
DOI: 10.1101/gr.154831.113

Gaurav Bhatia et al. Genome Res. 2013 Sep;23(9):1514-21. doi: 10.1101/gr.154831.113. Epub 2013 Jul 16.

Affiliation

¹ Harvard-Massachusetts Institute of Technology (MIT), Division of Health, Science, and Technology, Cambridge, Massachusetts 02139, USA. gbhatia@mit.edu

Abstract

In a pair of seminal papers, Sewall Wright and Gustave Malécot introduced FST as a measure of structure in natural populations. In the decades that followed, a number of papers provided differing definitions, estimation methods, and interpretations beyond Wright's. While this diversity in methods has enabled many studies in genetics, it has also introduced confusion regarding how to estimate FST from available data. Considering this confusion, wide variation in published estimates of FST for pairs of HapMap populations is a cause for concern. These estimates changed, in some cases more than twofold, when comparing estimates from genotyping arrays to those from sequence data. Indeed, changes in FST from sequencing data might be expected due to population genetic factors affecting rare variants. While rare variants do influence the result, we show that this is largely through differences in estimation methods. Correcting for this yields estimates of FST that are much more concordant between sequence and genotype data. These differences relate to three specific issues: (1) estimating FST for a single SNP, (2) combining estimates of FST across multiple SNPs, and (3) selecting the set of SNPs used in the computation. Changes in each of these aspects of estimation may result in FST estimates that are highly divergent from one another. Here, we clarify these issues and propose solutions.
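Issues (1) and (2) can be illustrated with a small numerical sketch. The abstract does not name an estimator, so the code below assumes a Hudson-style single-SNP estimator written as a numerator and denominator, and compares two ways of combining SNPs: a ratio of averages and an average of ratios. The function names, the allele frequencies, and the sample sizes are all hypothetical and chosen only for illustration.

```python
# Minimal sketch, not the authors' implementation.
# Assumption: a Hudson-style estimator expressed as numerator / denominator,
# with p1, p2 the sample allele frequencies and n1, n2 the numbers of
# sampled chromosomes in each population.

def hudson_terms(p1, p2, n1, n2):
    """Return (numerator, denominator) of the single-SNP FST estimate."""
    num = (p1 - p2) ** 2 - p1 * (1 - p1) / (n1 - 1) - p2 * (1 - p2) / (n2 - 1)
    den = p1 * (1 - p2) + p2 * (1 - p1)
    return num, den


def ratio_of_averages(snps, n1, n2):
    """Sum numerators and denominators over SNPs, then divide once."""
    terms = [hudson_terms(p1, p2, n1, n2) for p1, p2 in snps]
    return sum(num for num, _ in terms) / sum(den for _, den in terms)


def average_of_ratios(snps, n1, n2):
    """Compute FST per SNP, then average (SNPs with zero denominator skipped)."""
    terms = [hudson_terms(p1, p2, n1, n2) for p1, p2 in snps]
    ratios = [num / den for num, den in terms if den > 0]
    return sum(ratios) / len(ratios)


# Made-up allele frequencies: two common SNPs followed by three rare ones.
snps = [(0.50, 0.30), (0.40, 0.60), (0.01, 0.00), (0.00, 0.02), (0.02, 0.01)]
n1 = n2 = 200  # hypothetical number of sampled chromosomes per population

roa = ratio_of_averages(snps, n1, n2)
aor = average_of_ratios(snps, n1, n2)
print(f"ratio of averages: {roa:.4f}")
print(f"average of ratios: {aor:.4f}")
```

In this toy input the rare SNPs have small per-SNP estimates, so they pull the average of ratios well below the ratio of averages, even though the underlying frequencies are identical. Dropping the three rare SNPs from `snps` (issue 3) would change both values again, which is one way the choice of SNP set can shift a reported estimate.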

Figures

**Figure 1.**
Allele frequency dependence of F_ST under different ascertainment schemes. This shows F_ST for CEU and CHB as a function of allele frequency when ascertaining in either CEU, CHB, or YRI. The increased F_ST for rare variants is consistent with bottlenecks being a stronger force on F_ST for CEU and CHB than recent expansion. In fact, this is consistent with a stronger bottleneck in the population history of CHB. We note that this frequency dependence disappears when ascertaining in YRI, suggesting that YRI is a reasonable outgroup for the comparison of CEU and CHB.
