Genetics. 2024 Nov 6;228(3):iyae144.
doi: 10.1093/genetics/iyae144.

An explanation for the sister repulsion phenomenon in Patterson's f-statistics

Gözde Atağ et al. Genetics. 2024.

Abstract

Patterson's f-statistics are among the most heavily utilized tools for analyzing genome-wide allele frequency data for demographic inference. Beyond studying admixture, f3- and f4-statistics are also used for clustering populations to identify groups with similar histories. However, previous studies have noted an unexpected behavior of f-statistics: multiple populations from a certain region systematically show higher genetic affinity to a more distant population than to their neighbors, a pattern that is mismatched with alternative measures of genetic similarity. We call this counter-intuitive pattern "sister repulsion". We first present a novel instance of sister repulsion, where genomes from Bronze Age East Anatolian sites show higher affinity toward Bronze Age Greece than toward each other. This is observed using both f3- and f4-statistics, contrasts with archaeological/historical expectation, and also contradicts genetic affinity patterns captured using principal components analysis or multidimensional scaling on genetic distances. We then propose a simple demographic model to explain this pattern, where sister populations receive gene flow from a genetically distant source. We calculate f3- and f4-statistics using simulated genetic data with varying population genetic parameters, confirming that low-level gene flow from an external source into populations from one region can create sister repulsion in f-statistics. Unidirectional gene flow between the studied regions (without an external source) can likewise create repulsion. Meanwhile, similar to our empirical observations, multidimensional scaling analyses of genetic distances still cluster sister populations together. Overall, our results highlight the impact of low-level admixture events when inferring demographic history using f-statistics.

Keywords: f-statistics; admixture; ancient DNA; deep ancestry; demographic inference.
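
As a rough illustration of the f3- and f4-statistics discussed in the abstract, the sketch below computes them directly from per-SNP population allele frequencies. It is a minimal sketch, not the authors' pipeline: the function names and the toy frequency vectors are hypothetical, and the finite-sample bias corrections and block-jackknife standard errors used in real analyses are omitted.

```python
from statistics import mean


def f4(p_a, p_b, p_c, p_d):
    """f4(A, B; C, D): mean over SNPs of (a - b) * (c - d)."""
    return mean((a - b) * (c - d) for a, b, c, d in zip(p_a, p_b, p_c, p_d))


def outgroup_f3(p_o, p_a, p_b):
    """f3(O; A, B): mean over SNPs of (o - a) * (o - b).

    Larger values mean A and B share more drift since diverging from O.
    """
    return mean((o - a) * (o - b) for o, a, b in zip(p_o, p_a, p_b))


# Hypothetical allele frequencies at four SNPs (illustration only, not real data).
freq_o = [0.0, 0.0, 1.0, 0.0]  # outgroup O
freq_a = [0.5, 0.0, 1.0, 0.5]  # sister population A
freq_b = [0.5, 0.5, 1.0, 0.0]  # sister population B
freq_x = [0.5, 0.0, 0.5, 0.5]  # more distant population X

# Positive f4(O, A; B, X) would indicate that A shares more drift with X than with B.
repulsion_f4 = f4(freq_o, freq_a, freq_b, freq_x)

# Outgroup-f3 affinities of A to its sister B and to the distant X,
# and the corresponding 1 - outgroup-f3 distance.
f3_ab = outgroup_f3(freq_o, freq_a, freq_b)
f3_ax = outgroup_f3(freq_o, freq_a, freq_x)
dist_ab = 1 - f3_ab
```

In this toy input the sign convention matches the one used in the figure captions: a positive f4(O, A; B, X) corresponds to higher affinity between A and X, a negative one to higher affinity between A and B.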

Conflict of interest statement

The author(s) declare no conflicts of interest.

Figures

Fig. 1.
a) Geographical distribution of the studied Bronze Age genomes from the regions of modern-day Greece and East Anatolia. b) Genetic clustering of the populations in line with geography revealed by multidimensional scaling (MDS) analysis, using 1 − outgroup-f3 values as genetic distances. Supplementary Fig. 1 presents a PCA, which likewise shows the absence of visible structure in these gene pools. c) Intra-regional diversity of the populations measured as pairwise 1 − outgroup-f3 values within regions. Each point represents comparisons of settlements from the same region. d) f3-Statistics of the form f3(Yoruba; East Anatolia, X) where X corresponds to populations from Greece and East Anatolia. The x-axis shows East Anatolian populations while the y-axis shows Greece and East Anatolia. e) f4-Statistics of the form f4(Yoruba, East Anatolia; East Anatolia, Greece), where positive values depict higher affinity of BA East Anatolian populations toward contemporary populations from Greece. f) Geographic vs genetic distances (1 − outgroup-f3) between and within regions. g) Distribution of f3 values between and within regions.
Fig. 2.
Demographic model used for the population genetic simulations. The ancestral populations are named as combinations of the descending population names. Population O is the outgroup used in the calculation of f-statistics. Dashed lines represent equal per-generation rates of continuous migration from Z to A and B between the split of A and B and the present day. The vertical line on the right side of the AB branch corresponds to the parameter ΔTX-AB,A-B.
Fig. 3.
Expected F4 values under distant admixture. The 4 trees with arrows show 4 possible drift paths that can contribute to F4(O, A; B, X). Here, sister populations A or B have independently received gene flow (dashed arrow) from the distant branch Z with admixture proportions αA and αB, respectively, after their split from each other. The probability of each path and its value as a product of its probability and the drift magnitude (indicated by letters e, f, and g) are shown below each tree. The expected average F4 value (Patterson et al. 2012) is shown at the bottom, with the positive component in bold. The small arrows indicate the direction of the drift paths from O → A and B → X, which determine the sign of that F4 path.
Fig. 4.
Change in f4-statistics of the form f4(O, A; B, X) depending on the varying parameters in population genetic simulations, where negative values represent higher affinity between A and B, and positive values higher affinity between A and X. Each parameter's panel pools all values of the remaining studied parameters, which are explained in Table 1. The multimodality arises because each distribution is collected from a range of simulations run under a range of parameters with diverse effects on the f4-statistics.
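
The path-weighting logic described in the Fig. 3 caption can be sketched numerically. The code below is an illustrative reconstruction, not the formula printed in the figure: it assumes Z diverges before the split between X and the AB ancestor, and it assumes a particular meaning for the drift magnitudes (e on the branch ancestral to A and B only, f on the branch shared by X, A and B after Z splits off, g on the Z lineage shared by migrants into both A and B). The function name and the parameter values are hypothetical.

```python
def expected_f4(alpha_a, alpha_b, e, f, g):
    """Expected F4(O, A; B, X) as a probability-weighted sum over drift paths.

    Assumed topology (not confirmed by the source): O, then Z, then X,
    then the sister pair A and B; A and B each receive ancestry from Z
    with proportions alpha_a and alpha_b after their split.
    """
    # Neither lineage traces through Z: the paths O -> A and B -> X overlap
    # on the AB branch in opposite directions, giving -e.
    neither_from_z = (1 - alpha_a) * (1 - alpha_b) * (-e)
    # Only B's lineage traces through Z: the paths overlap in the same
    # direction on the branch shared by X and AB, giving +f.
    only_b_from_z = (1 - alpha_a) * alpha_b * f
    # Only A's lineage traces through Z: no overlap between the paths.
    only_a_from_z = alpha_a * (1 - alpha_b) * 0.0
    # Both lineages trace through Z: the paths overlap on the Z lineage
    # in opposite directions, giving -g.
    both_from_z = alpha_a * alpha_b * (-g)
    return neither_from_z + only_b_from_z + only_a_from_z + both_from_z


# Hypothetical drift magnitudes: sisters A and B split recently (small e),
# while Z is deeply diverged (large f).
no_admixture = expected_f4(0.0, 0.0, e=0.01, f=0.2, g=0.05)
low_admixture = expected_f4(0.0, 0.1, e=0.01, f=0.2, g=0.05)
```

Under these assumptions, the statistic is negative without admixture and turns positive once the single positive term outweighs the two negative ones, which is more easily reached when the sister split is recent (small e) and the external source is deeply diverged (large f).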
