Sci Rep. 2020 Jun 16;10(1):9702.
doi: 10.1038/s41598-020-66232-1.

Genome Diversity and the Origin of the Arabian Horse

Elissa J Cosgrove et al. Sci Rep. 2020.

Abstract

The Arabian horse, one of the world's oldest breeds of any domesticated animal, is characterized by natural beauty, graceful movement, athletic endurance, and, as a result of its development in the arid Middle East, the ability to thrive in a hot, dry environment. Here we studied 378 Arabian horses from 12 countries using equine single nucleotide polymorphism (SNP) arrays and whole-genome re-sequencing to examine hypotheses about genomic diversity, population structure, and the relationship of the Arabian to other horse breeds. We identified a high degree of genetic variation and complex ancestry in Arabian horses from the Middle East region. Also, contrary to popular belief, we could detect no significant genomic contribution of the Arabian breed to the Thoroughbred racehorse, including Y chromosome ancestry. However, we found strong evidence for recent interbreeding of Thoroughbreds with Arabians used for flat-racing competitions. Genetic signatures suggestive of selective sweeps across the Arabian breed contain candidate genes for combating oxidative damage during exercise, and within the "Straight Egyptian" subgroup, for facial morphology. Overall, our data support an origin of the Arabian horse in the Middle East, no evidence for reduced global genetic diversity across the breed, and unique genetic adaptations for both physiology and conformation.

Conflict of interest statement

The authors declare no competing interests.

Figures

Figure 1
The Arabian is a distinct breed with diverse lineages, having little apparent relationship to the Thoroughbred. Principal component analysis of 378 Arabian horses sampled in this study (A) among a reference set including samples from 18 additional global breeds taken from previously published data, with symbol shape indicating data source and symbol color indicating breed (data set: 917 samples across 30,967 SNPs); and (B) with 71 Persian Arabian and 11 Turkemen samples from previously published data, and 17 Thoroughbred samples collected in this study, with symbol shape indicating breed, and symbol color indicating Arabian breed lineage, except for the Thoroughbred and Turkemen groups (data set: 477 samples across 56,239 SNPs). Percent variance explained: (A) PC1 = 5.6%, PC2 = 2.2%; (B) PC1 = 4.8%, PC2 = 2.5%. PC3 is plotted in Supplemental Figs. S2 and S3. See also Supplemental Fig. S4.
Figure 2
The Arabian breed shows genetic differentiation associated with performance use. Principal components in Fig. 1B were replotted to reflect known competition types for a subset of Arabian horses. Arabian samples were color-coded by use (type of competition): endurance competition, flat course racing, or show. Racing horses (orange) plot in the PCA space between the Thoroughbred racehorse and the other subgroups of Arabian horse.
Figure 3
Individual lineages of the Arabian breed display complex ancestry. STRUCTURE cluster assignments are plotted for number of clusters K = 2, 5, 8, and 11 (top to bottom panels). Each cluster in a given analysis (panel) is represented by a separate color. The plotted 312 samples represent Arabian breed subgroups as well as Turkemen, Icelandic and Thoroughbred breeds. Sample order is the same in each panel. 47,436 SNPs were included in the analysis. The purple bar and asterisk mark the cluster of multi-origin samples showing shared ancestry with Thoroughbred samples.
Figure 4
Admixture analysis indicates presence of Thoroughbred ancestry in Arabian horses used for flat course racing. RFMix was applied to the set of 34 “racing Arabian” samples. Segments assigned to one of four reference groups with posterior probability ≥ 0.99 are colored as indicated in the legend. No assignment was made if all posterior probabilities were <0.99 (gray segments). Each row is a haplotype, with individuals (pairs of haplotypes) separated by horizontal white lines. Samples were ordered based on the first principal component (PC1) of the PCA in Fig. 2. See also Supplemental Fig. S5.
Figure 5
Y chromosome haplotyping supports use of Thoroughbred sire lines in the Racing Arabian subgroup. (A) Reduced haplotype network map of the male-specific region of the Y chromosome, following a previously published network, with SNPs genotyped in this study shown in red. Detected haplotypes are colored, and haplotypes in white circles were not detected in the 39 Arabian males genotyped. Haplotypes in green were previously detected in Arabian horses in other studies. The ‘Whalebone’ haplotype, Tb-dW1, is colored in red. (B) Y haplotype frequencies in racing versus non-racing Arabian horses shown in absolute numbers.
Figure 6
Selection scan statistics highlight a candidate region on ECA 16 containing GPX1, a gene important for protection from exercise-induced oxidative damage across Arabian lineages. H, H12, and reflected Tajima’s D statistics were all scaled to 0 – 1. 99th percentile threshold values are plotted as dashed horizontal lines for each statistic. The selection scan peak with the highest count of overlapping peaks (“multi-strain” group, H statistic, overlapping with 11 other peaks) is shown as the vertical gray region. Colored asterisks indicate which test statistics had peaks exceeding the indicated threshold within the region of interest.
Figure 7
Selection scan statistics reveal a putative selective sweep unique to the Straight Egyptian subgroup on ECA 18. (A) H, H12, and reflected Tajima’s D statistics were all scaled to 0 – 1. 99th percentile threshold values are plotted as dashed horizontal lines for each statistic. The vertical gray region indicates the genomic region encompassing all 6 peaks observed for the Straight Egyptian group. Colored asterisks indicate which test statistics had peaks exceeding the indicated threshold within the region of interest. (B) Representative profile of a Straight Egyptian Arabian Horse and (C) of an Arabian horse used for Racing and with no Straight Egyptian ancestry.
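
The scaling and thresholding mentioned in the captions for Figures 6 and 7 can be illustrated with a short sketch. It is not the authors' pipeline: the function names are hypothetical, "reflected" is assumed here to mean sign-flipped (so that strongly negative Tajima's D values appear as peaks), and the nearest-rank percentile is only one of several common percentile definitions.

```python
import math


def reflect(values):
    """Flip the sign of each value (assumed meaning of 'reflected')."""
    return [-v for v in values]


def scale_01(values):
    """Min-max scale a list of numbers onto the range 0 to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant series carries no peaks; map everything to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def percentile_threshold(values, pct=99.0):
    """Nearest-rank percentile: the value at rank ceil(pct * n / 100)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100.0))
    return ordered[rank - 1]


def windows_above(values, pct=99.0):
    """Indices of windows whose scaled statistic exceeds the threshold."""
    scaled = scale_01(values)
    cutoff = percentile_threshold(scaled, pct)
    return [i for i, v in enumerate(scaled) if v > cutoff]


if __name__ == "__main__":
    # Hypothetical per-window Tajima's D values, not data from the study.
    tajima_d = [0.4, -0.1, 0.2, -2.3, 0.1, 0.3]
    print(windows_above(reflect(tajima_d), pct=80.0))
```

Scaling each statistic to a common range is what lets H, H12, and Tajima's D share one axis; the percentile cutoff is then computed separately for each statistic.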

References

    1. Olsen, S. & Culbertson, C. A gift from the desert: the art, history, and culture of the Arabian horse. (Kentucky Horse Park, 2010).
    2. Forbis, J. The classic Arabian horse. 1st edn, (Liveright, 1976).
    3. Olsen, S. L. & Bryant, R. T. Stories in the Rocks: Exploring Saudi Arabian Rock Art. (Carnegie Museum of Natural History, 2013).
    4. Fages A, et al. Tracking Five Millennia of Horse Management with Extensive Ancient Genome Time Series. Cell. 2019;177:1419–1435.e1431. doi: 10.1016/j.cell.2019.03.049.
    5. Khadka, R. Global Horse Population with respect to Breeds and Risk Status. Master’s Degree thesis, Swedish University of Agricultural Sciences (2010).
