Mol Biol Evol. 2011 Oct;28(10):2905-20.
doi: 10.1093/molbev/msr126. Epub 2011 May 13.

Parallel evolution of genes and languages in the Caucasus region

Oleg Balanovsky et al. Mol Biol Evol. 2011 Oct.

Abstract

We analyzed 40 single nucleotide polymorphism and 19 short tandem repeat Y-chromosomal markers in a large sample of 1,525 indigenous individuals from 14 populations in the Caucasus and 254 additional individuals representing potential source populations. We also employed a lexicostatistical approach to reconstruct the history of the languages of the North Caucasian family spoken by the Caucasus populations. We found a different major haplogroup to be prevalent in each of four sets of populations that occupy distinct geographic regions and belong to different linguistic branches. The haplogroup frequencies correlated with geography and, even more strongly, with language. Within haplogroups, a number of haplotype clusters were shown to be specific to individual populations and languages. The data suggested a direct origin of Caucasus male lineages from the Near East, followed by high levels of isolation, differentiation, and genetic drift in situ. Comparison of genetic and linguistic reconstructions covering the last few millennia showed striking correspondences between the topology and dates of the respective gene and language trees and with documented historical events. Overall, in the Caucasus region, unmatched levels of gene-language coevolution occurred within geographically isolated populations, probably due to its mountainous terrain.

Figures

Figure 1. Geographic location, linguistic affiliation and genetic composition of the studied populations
Each population is designated by a pie chart representing frequencies of the major haplogroups in it. Areas of the linguistic groups of the Caucasus (except for Turkic groups) are shown by semi-transparent color zones. Black dotted lines indicate genetic boundaries identified in the barrier analysis (thick lines – most important boundaries A, B and C; thin lines – other boundaries D, E and F).
Figure 2. Frequency maps of major Caucasus, Near Eastern and East European haplogroups.
Figure 3. MDS plot depicting genetic relationships between Caucasus, Near Eastern and European populations
The plot is based on Nei’s pairwise genetic distances calculated from frequencies of thirteen Y chromosomal haplogroups (C, E, G, I, J1, J2, L, N1c, O, R1a1, R1b1, Q, other) in populations of North Caucasus (this study), Transcaucasus (Georgians, Battaglia et al., 2009), Near East (this study; Cinnioglu et al., 2004; Flores et al., 2005), and some other European, African and Asian populations (data from Y-base, compiled in our lab from published sources). Caucasus populations are shown by squares, Near Eastern populations by circles and European populations by diamonds.
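As a rough illustration of the distance step behind this plot, the sketch below computes a pairwise distance from haplogroup frequencies. It assumes Nei's (1972) standard genetic distance applied to a single locus; the caption does not say which of Nei's measures was used, so this choice is an assumption. The function name and the two frequency tables are hypothetical and are not data from the study.

```python
import math


def nei_standard_distance(p, q):
    """Nei's (1972) standard genetic distance for a single locus.

    p, q: dicts mapping a haplogroup label to its frequency in one population.
    Assumption: this is the variant of Nei's distance meant in the caption.
    """
    labels = set(p) | set(q)
    # Probability that two chromosomes, one from each population, match
    jxy = sum(p.get(h, 0.0) * q.get(h, 0.0) for h in labels)
    # Probability that two chromosomes from the same population match
    jx = sum(v * v for v in p.values())
    jy = sum(v * v for v in q.values())
    if jxy == 0:
        # No shared haplogroups: the distance is unbounded
        return math.inf
    identity = jxy / math.sqrt(jx * jy)
    return -math.log(identity)


# Hypothetical frequencies, invented for illustration only
pop_a = {"G": 0.6, "J1": 0.1, "J2": 0.2, "other": 0.1}
pop_b = {"G": 0.1, "J1": 0.6, "J2": 0.2, "other": 0.1}

d_ab = nei_standard_distance(pop_a, pop_b)
d_aa = nei_standard_distance(pop_a, pop_a)
```

A full matrix of such pairwise values is what a multidimensional scaling routine would take as input; the distance of a population to itself is zero up to floating-point error.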
Figure 4. Comparison of the genetic and linguistic trees of North Caucasus populations
The genetic tree was constructed from frequencies of 28 Y-chromosomal haplogroups in North Caucasus populations (data from Table 2). Populations speaking the same language (three Chechen populations and two Ossetian ones) were pooled to make the genetic dataset compatible with the linguistic classification. The weighted pair-group method was used as a clustering algorithm. The linguistic tree represents the classification of the North Caucasian languages from classical work (Ruhlen, 1987). Kubachi and Kaitak (languages of small populations) were not listed in Ruhlen’s classification, but most linguists agree that they are most closely related to the Dargin language.
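The weighted pair-group method mentioned in this caption can be sketched as follows. This is a minimal pure-Python version under stated assumptions: the caption does not say which distance measure fed the clustering, so the input here is simply a table of pairwise distances. The function name `wpgma`, the labels P1 to P4 and all distance values are hypothetical and are not taken from the study.

```python
def wpgma(names, dist):
    """Weighted pair-group clustering (WPGMA) on a pairwise distance table.

    names: list of population labels.
    dist:  dict mapping frozenset({a, b}) to the distance between a and b.
    Returns the tree as nested tuples plus a list of (cluster, merge distance).
    """
    clusters = list(names)
    d = dict(dist)
    merges = []
    while len(clusters) > 1:
        # Pick the closest pair of current clusters
        a, b = min(
            ((x, y) for i, x in enumerate(clusters) for y in clusters[i + 1:]),
            key=lambda pair: d[frozenset(pair)],
        )
        height = d[frozenset((a, b))]
        merged = (a, b)
        rest = [c for c in clusters if c not in (a, b)]
        for c in rest:
            # WPGMA rule: plain average of the two merged clusters' distances,
            # ignoring how many populations each cluster already contains
            d[frozenset((merged, c))] = (
                d[frozenset((a, c))] + d[frozenset((b, c))]
            ) / 2
        clusters = rest + [merged]
        merges.append((merged, height))
    return clusters[0], merges


# Hypothetical distances, invented for illustration only
labels = ["P1", "P2", "P3", "P4"]
table = {
    frozenset(("P1", "P2")): 2, frozenset(("P3", "P4")): 4,
    frozenset(("P1", "P3")): 8, frozenset(("P1", "P4")): 10,
    frozenset(("P2", "P3")): 8, frozenset(("P2", "P4")): 10,
}
tree, merges = wpgma(labels, table)
```

With these invented values, P1 and P2 join first, then P3 and P4, and the two pairs join last. The recorded values are the distances at which clusters merge, not halved branch lengths.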
Figure 5. Phylogenetic networks of the haplogroup G2a1a-P18 in the Caucasus and haplogroup C-M208 in Polynesia
A. Reduced median network of haplogroup G2a1a-P18 was constructed using all available (worldwide) STR haplotypes for this haplogroup, with a reduction threshold r=1.00 based on non-weighted data from 15 STRs (DYS19, DYS389I, DYS389b, DYS390, DYS391, DYS392, DYS393, DYS437, DYS438, DYS439, DYS448, DYS456, DYS458, DYS635, GATA_H4). Black dotted lines designate clusters selected in our study for age estimations. B. Reduced median network of haplogroup C-M208 in Polynesia (modified from Zhivotovsky et al., 2004). Red dotted lines designate clusters selected by Zhivotovsky et al. (2004) for estimating the evolutionary effective mutation rate.
Figure 6. Model of the evolution of Caucasus populations combining genetic and linguistic evidence
The grey background outlines the linguistic tree, obtained by the lexicostatistical method. Each colored line near the tips of the tree marks a haplotype cluster that is specific to a given population. If the cluster is shared between two populations, then both populations carry this color on their branches. Standard errors of a cluster’s age are shown by dotted colored lines. Each colored line near the root of the tree marks one of four major haplogroups. These lines stop 3,300 years BP. The root of the population tree indicates an initial migration from the Near East carrying four major haplogroups. This proto-population then separated into the West Caucasus, proto-Ossets, Nakh and Dagestan branches, differing by language and predominant haplogroup. The subsequent evolution (occurring independently in each of these four groups) consisted of the diversification of their languages and the emergence of branch-specific or population-specific haplotype clusters.

References

    1. Abdushelishvili MG. Antropology of the ancient and contemporary population of Georgia. Metsniereba; Tbilisi: 1964.
    2. Abramova MP. The Central Caucasus in the Sarmatian epoch. In: Rybakov BA, editor. The steppes of the European part of the USSR in the Scythian-Sarmatian time. Nauka; Moscow: 1989. pp. 268–281. (Series Archaeology of the USSR).
    3. Ageeva RA. Which tribe we are? Ethnic groups of Russia: ethnonims and fortunes. Ethnolinguistic dictionary. Academia Press; Moscow: 2000.
    4. Alexeev VP. The origin of Caucasus peoples. Mysl; Moscow: 1974.
    5. Alexeev ME. Languages of the world: Caucasus languages. Academia; Moscow: 1999. Nakh-Dagestan languages; pp. 156–165.
