Sci Rep. 2024 Oct 2;14(1):22930.
doi: 10.1038/s41598-024-73645-9.

Analyses of whole-genome sequences from 185 North American Thoroughbred horses, spanning 5 generations

Ernie Bailey et al. Sci Rep.

Abstract

Whole genome sequences (WGS) of 185 North American Thoroughbred horses were compared to quantify the number and frequency of variants, diversity of mitotypes, and autosomal runs of homozygosity (ROH). Of the samples, 82 horses were born between 1965 and 1986 (Group 1); the remaining 103, selected to maximize pedigree diversity, were born between 2000 and 2020 (Group 2). Over 14.3 million autosomal variants were identified with 4.5-5.0 million found per horse. Mitochondrial sequences associated the North American Thoroughbreds with 9 of 17 clades previously identified among diverse breeds. Individual coefficients of inbreeding, estimated from ROH, averaged 0.266 (Group 1) and 0.283 (Group 2). When SNP arrays were simulated using subsets of WGS markers, the arrays over-estimated lengths of ROH. WGS-based estimates of inbreeding were highly correlated (r > 0.98) with SNP array-based estimates, but only moderately correlated (r = 0.40) with inbreeding based on 5-generation pedigrees. On average, Group 1 horses had more heterozygous variants (P < 0.001), more total variants (P < 0.001), and lower individual inbreeding (FROH; P < 0.001) than horses in Group 2. However, the distribution of numbers of variants, allele frequency, and extent of ROH overlapped among all horses such that it was not possible to identify the group of origin of any single horse using these measures. Consequently, the Thoroughbred population would be better monitored by investigating changes in specific variants, rather than relying on broad measures of diversity. The WGS for these 185 horses is publicly available for comparison to other populations and as a foundation for modeling changes in population structure, breeding practices, or the appearance of deleterious variants.

Keywords: Equus caballus; Genetic diversity; Inbreeding; Population genetics; Variant discovery; mtDNA.
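
The ROH-based inbreeding coefficient reported above (FROH) is commonly defined as the fraction of the autosomal genome that lies within runs of homozygosity. The abstract does not give the exact formula or ROH-calling parameters used in the study, so the following is only a minimal sketch under that common definition. The function name, the segment coordinates, and the genome length are hypothetical illustration values, not data from the paper.

```python
from typing import Iterable, Tuple


def f_roh(roh_segments: Iterable[Tuple[int, int]], autosome_length_bp: int) -> float:
    """Return the fraction of the autosomal genome covered by ROH.

    roh_segments: (start, end) base-pair coordinates of non-overlapping ROH,
                  pooled across all autosomes (assumed already called elsewhere).
    autosome_length_bp: total autosomal length considered, in base pairs.
    """
    if autosome_length_bp <= 0:
        raise ValueError("autosome_length_bp must be positive")
    total_roh_bp = sum(end - start for start, end in roh_segments)
    return total_roh_bp / autosome_length_bp


# Hypothetical toy input: two ROH totalling 2.5 Mb in a 25 Mb "genome".
toy_segments = [(1_000_000, 3_000_000), (10_000_000, 10_500_000)]
toy_genome_length = 25_000_000
toy_value = f_roh(toy_segments, toy_genome_length)
print(f"FROH = {toy_value:.3f}")
```

Under this definition, a value of 0.266 would mean that roughly 26.6% of a horse's autosomal sequence falls inside ROH. The result depends on how ROH are called (minimum length, marker density), which is consistent with the abstract's observation that simulated SNP arrays over-estimated ROH lengths relative to WGS.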

Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1
Count of heterozygous (blue) or homozygous (orange) variants (SNPs and indels) found per horse, ordered by year of birth. On average, the total number of variants and the count of heterozygous variants were greater (P < 0.001) in Group 1 than in Group 2. The count of homozygous variants, however, did not differ (P = 0.40) by Group.
Fig. 2
The number of variants in each individual, classified by the minor allele frequency (MAF) of the locus in the total population. Individuals are ordered by year of birth. On average, Group 1 horses had a greater proportion of variants with MAF < 0.01, 0.01 to 0.1, and 0.1 to 0.2 (P < 0.001) than horses in Group 2. Group 2 horses had a greater proportion of variants with MAF 0.4 to 0.5 than horses in Group 1 (P = 0.001).
Fig. 3
Neighbor joining network of the mtDNA sequences (D-loop excluded) of all 185 horses. Numbers indicate the contig to which they were assigned for this study (also see Table S1), and letter designations as identified in Achilli et al.
Fig. 4
Individual inbreeding estimates (FROH) and the average by year of birth (red diamonds) for 185 Thoroughbreds. Mean FROH was lower (P < 0.001) in Group 1 (0.266) than in Group 2 (0.283).
Fig. 5
Inbreeding by length of ROH per horse ordered by year of birth for data based upon markers in the GGP (top) and HD (middle) arrays as well as considering the entire WGS (bottom).
Fig. 6
Individual inbreeding (ROH) based upon the whole-genome sequence data (x-axis) compared to that estimated from markers represented on the GGP (Black) and HD (Red) arrays. Pearson’s correlations between the SNP and WGS estimates were 0.98 and 0.99 for the GGP and HD array, respectively.
Fig. 7
Inbreeding estimates of each individual from their 5-generation pedigree (x-axis) vs that estimated from genome-wide ROH (y-axis).
Fig. 8
ROH for Group 2 based upon loci in the GGP (top) and HD (middle) arrays, as well as that identified from the WGS (bottom). Loci in ROH that are shared by 75% or more of the horses in the sample are shown in red.

References

    1. Willett, P. The Classic Racehorse (University Press of Kentucky, 1982).
    2. Bower, M. A. et al. The genetic origin and history of speed in the Thoroughbred racehorse. Nat. Commun. 3, 643 (2012).
    3. Binns, M. M. & Morris, T. Thoroughbred Breeding: Pedigree Theories and the Science of Genetics (J.A. Allen, 2011).
    4. Cunningham, E. P., Dooley, J. J., Splan, R. K. & Bradley, D. G. Microsatellite diversity, pedigree relatedness and the contributions of founder lineages to thoroughbred horses. Anim. Genet. 32, 360–364 (2001).
    5. Petersen, J. L. et al. Genetic diversity in the modern horse illustrated from genome-wide SNP data. PLoS One. 8, e54997 (2013).