[Preprint]. 2024 Dec 2:2024.12.02.626332.
doi: 10.1101/2024.12.02.626332.

Ancient genomics support deep divergence between Eastern and Western Mediterranean Indo-European languages

Fulya Eylem Yediay 1,2; Guus Kroonen 3,4; Serena Sabatini 2; Karin Margarita Frei 5; Anja B Frank 6,7; Thomaz Pinotti 1,8; Andrew Wigman 3; Rasmus Thorsø 3; Tharsika Vimala 1; Hugh McColl 1,2; Ioanna Moutafi 9,10; Isin Altinkaya 1; Abigail Ramsøe 1; Charleen Gaunitz 1; Gabriel Renaud 11; Alfredo Mederos Martin 12; Fabrice Demeter 1,13; Gabriele Scorrano 1,14; Alessandro Canci 15; Peter Fischer 2; Izzet Duyar 16; Claude Serhal 17; Alexander Varzari 18,19; Murat Türkteki 20; John O'Shea 21; Lorenz Rahmstorf 22; Gürcan Polat 23; Derya Atamtürk 24; Lasse Vinner 1; Sachihiro Omura 25; Kimiyoshi Matsumura 25; Jialu Cao 1; Frederik Valeur Seersholm 1; Jose Miguel Morillo Leon 26; Sofia Voutsaki 27; Raphaël Orgeolet 28,29; Brendan Burke 30; Nicholas P Herrmann 31; Giulia Recchia 32; Susi Corazza 15; Elisabetta Borgna 15; Mirella Cipolloni Sampò 33; Flavia Trucco 34; Ana Pajuelo Pando 35; Marie Louise Schjellerup Jørkov 36; Patrice Courtaud 37; Rebecca Peake 38,39; Juan Francisco Gibaja Bao 40; Györgyi Parditka 21; Jesper Stenderup 1; Karl-Göran Sjögren 2; Jacqueline Staring 1; Line Olsen 1; Igor V Deyneko 19; György Pálfi 41; Pedro Manuel López Aldana 35; Bryan Burns 42; László Paja 41; Christian Mühlenbock 43; Claudio Cavazzuti 44; Alberto Cazzella 32; Anna Lagia 45; Vassilis Lambrinoudakis 46; Lazaros Kolonas 47; Jörg Rambach 48,49; Eugen Sava 18; Sergey Agulnikov 50; Vicente Castañeda Fernández 51; Mia Broné 52; Victoria Peña Romo 53; Fernando Molina González 54; Juan Antonio Cámara Serrano 54; Sylvia Jiménez Brobeil 55; Trinidad Nájera Molino 54; María Oliva Rodríguez Ariza 56; Catalina Galán Saulnier 57; Armando González Martín 58; Nicolas Cauwe 59; Claude Mordant 39; Mafalda Roscio 60; Luc Staniaszek 38,39; Mary Anne Tafuri 32; Tayfun Yıldırım 61; Luciano Salzani 62; Thorfinn Sand Korneliussen 1; J Víctor Moreno-Mayar 1; Morten Erik Allentoft 1,63; Martin Sikora 1; Rasmus Nielsen 1,64; Kristian Kristiansen 1,2; Eske Willerslev 1,65

Fulya Eylem Yediay et al. bioRxiv.

Abstract

The Indo-European languages are among the most widely spoken in the world, yet their early diversification remains contentious [1-5]. It is widely accepted that the spread of this language family across Europe from the 5th millennium BP correlates with the expansion and diversification of steppe-related genetic ancestry from the onset of the Bronze Age [6,7]. However, multiple steppe-derived populations co-existed in Europe during this period, and it remains unclear how these populations diverged and which provided the demographic channels for the ancestral forms of the Italic, Celtic, Greek, and Armenian languages [8,9]. To investigate the ancestral histories of Indo-European-speaking groups in Southern Europe, we sequenced genomes from 314 ancient individuals from the Mediterranean and surrounding regions, spanning from 5,200 BP to 2,100 BP, and co-analysed these with published genome data. We additionally conducted strontium isotope analyses on 224 of these individuals. We find a deep east-west divide of steppe ancestry in Southern Europe during the Bronze Age. Specifically, we show that the arrival of steppe ancestry in Spain, France, and Italy was mediated by Bell Beaker (BB) populations of Western Europe, likely contributing to the emergence of the Italic and Celtic languages. In contrast, Armenian and Greek populations acquired steppe ancestry directly from Yamnaya groups of Eastern Europe. These results are consistent with the linguistic Italo-Celtic [10,11] and Graeco-Armenian [1,12,13] hypotheses accounting for the origins of most Mediterranean Indo-European languages of Classical Antiquity. Our findings thus align with specific linguistic divergence models for the Indo-European language family while contradicting others. This underlines the power of ancient DNA in uncovering prehistoric diversifications of human populations and language communities.

Figures

Extended Data Fig. 1.
Geographical distribution of the main IBD clusters, split into three time ranges: before 5,000 BP, 5,000–4,000 BP, and after 4,000 BP.
Extended Data Fig. 2.
PCA plot showing the distribution of the main IBD clusters across 2,228 ancient individuals.
Extended Data Fig. 3.
PCA plot showing the distribution of subclusters of “Farmer-related (0_1)” and “Steppe-related (0_4)”. New genomes presented in this study are marked with black circles.
Extended Data Fig. 4.
Ancestry bar plots generated for each individual using source population proportions of IBD admixture modelling sorted by time BP and divided into two time series, before and after 4,400 BP, illustrating a Southern and Central Eastern Europe split (Italy, France, Spain and Hungary vs Greece).
Extended Data Fig. 5.
Pie charts generated by using the proportions of the applied IBD admixture model for each individual from Anatolia, Cyprus, Iran, Caucasus and Levant, divided into five time periods to avoid overlapping.
Extended Data Fig. 6.
Bar plots generated using source proportions from the IBD admixture modelling, showing the similarity of steppe ancestry in Greece and Armenia (Middle/Late Bronze Age) and in Urartians, modelled with Yamnaya and local populations.
Fig. 1.
Distribution of ancient individuals by country (A), where N = number of individuals in this study / total number of individuals in the dataset, and by locality on the map (B). Only individuals within the 6,000–1,000 cal BP time frame are shown, to avoid overlapping.
Fig. 2.
Geographical distribution of the IBD clusters of individuals in the 5th and 4th millennia BP.
Fig. 3.
Distribution of Bell Beaker-derived and Yamnaya-derived ancestry proportions obtained from the IBD admixture model. The proportion of each steppe source is standardized by the total steppe contributions, i.e. the sum of Corded Ware, Bell Beaker and Yamnaya_Samara contributions.
Fig. 4.
Ancestry bar plots generated for each individual using source population proportions of IBD admixture modelling sorted by time BP and divided into two time series, before and after 4,400 BP, illustrating a Southern Europe split (Italy, France, and Spain vs Greece).
Fig. 5.
Phylogeography of Y-chromosome haplogroup R-Z2103. Phylogram of haplogroup R-Z2103, with branch lengths proportional to SNP number, built from a dataset of unique variants in private datasets. The haplogroup resolves into a four-way polytomy. We plotted all occurrences older than 2,000 years ago in the ancient DNA record in Western Eurasia on an Albers Equal Area map, colour-coded by clades downstream of R-Z2103. Only haplogroup R-Z2106 extends beyond the Caucasus and Northern Iran, and we indicate the phylogenetic position of individuals in the tree. Crosses mark R-Z2103 individuals with uncertain clade assignment due to low coverage, and squares indicate individuals from Greece, Moldova and Hungary generated in this study. Dotted lines denote split time estimates of key haplogroups, calculated using rho statistics.
Fig. 6.
Distribution of Iran Chalcolithic (Iran_C), Caucasus Chalcolithic (Caucasus_C) and Yamnaya-related ancestry proportions obtained from the IBD admixture model.
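
The standardization described in the Fig. 3 caption (each steppe source divided by the sum of the Corded Ware, Bell Beaker and Yamnaya_Samara contributions) can be sketched as follows. This is only an illustration of the arithmetic, not the authors' pipeline: the function name, the dictionary keys `Corded_Ware`, `Bell_Beaker` and `Farmer`, and the example proportions are hypothetical.

```python
# Hypothetical labels for the three steppe sources named in the Fig. 3 caption.
STEPPE_SOURCES = ("Corded_Ware", "Bell_Beaker", "Yamnaya_Samara")


def standardize_steppe(proportions, steppe_sources=STEPPE_SOURCES):
    """Rescale each steppe source by the total steppe contribution.

    `proportions` maps source labels to admixture proportions for one
    individual; non-steppe sources are ignored.
    """
    total = sum(proportions.get(s, 0.0) for s in steppe_sources)
    if total == 0:
        # No steppe ancestry at all: return zeros rather than divide by zero.
        return {s: 0.0 for s in steppe_sources}
    return {s: proportions.get(s, 0.0) / total for s in steppe_sources}


# Made-up proportions for a single individual (not data from the study).
example = {
    "Corded_Ware": 0.10,
    "Bell_Beaker": 0.30,
    "Yamnaya_Samara": 0.10,
    "Farmer": 0.50,
}
shares = standardize_steppe(example)
```

With these made-up inputs, the Bell Beaker share is 0.30 out of a total steppe contribution of 0.50, so the standardized shares of the three steppe sources sum to one regardless of how much non-steppe ancestry the individual carries.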

References

    1. Ringe D., Warnow T. & Taylor A. Indo-European and computational cladistics. Trans. Philol. Soc. 100, 59–129 (2002).
    2. Anthony D. W. The Horse, the Wheel, and Language. (Princeton University Press, 2010).
    3. Chang W., Hall D., Cathcart C. & Garrett A. Ancestry-constrained phylogenetic analysis supports the Indo-European steppe hypothesis. Language 91, 194–244 (2015).
    4. Bouckaert R. et al. Mapping the origins and expansion of the Indo-European language family. Science 337, 957–960 (2012).
    5. Heggarty P. et al. Language trees with sampled ancestors support a hybrid model for the origin of Indo-European languages. Science 381, eabg0818 (2023).
