Genome Biol. 2023 Aug 15;24(1):187.
doi: 10.1186/s13059-023-03023-7.

Genome sequencing of 2000 canids by the Dog10K consortium advances the understanding of demography, genome function and architecture

Jennifer R S Meadows et al. Genome Biol. 2023.

Erratum in

  • Author Correction: Genome sequencing of 2000 canids by the Dog10K consortium advances the understanding of demography, genome function and architecture.
    Meadows JRS, Kidd JM, Wang GD, Parker HG, Schall PZ, Bianchi M, Christmas MJ, Bougiouri K, Buckley RM, Hitte C, Nguyen AK, Wang C, Jagannathan V, Niskanen JE, Frantz LAF, Arumilli M, Hundi S, Lindblad-Toh K, Ginja C, Agustina KK, André C, Boyko AR, Davis BW, Drögemüller M, Feng XY, Gkagkavouzis K, Iliopoulos G, Harris AC, Hytönen MK, Kalthof DC, Liu YH, Lymberakis P, Poulakakis N, Pires AE, Racimo F, Ramos-Almodovar F, Savolainen P, Venetsani S, Tammen I, Triantafyllidis A, vonHoldt B, Wayne RK, Larson G, Nicholas FW, Lohi H, Leeb T, Zhang YP, Ostrander EA. Genome Biol. 2023 Nov 7;24(1):255. doi: 10.1186/s13059-023-03101-w. PMID: 37936157.

Abstract

Background: The international Dog10K project aims to sequence and analyze several thousand canine genomes. Incorporating 20× data from 1987 individuals, including 1611 dogs (321 breeds), 309 village dogs, 63 wolves, and four coyotes, we identify genomic variation across the canid family, setting the stage for detailed studies of domestication, behavior, morphology, disease susceptibility, and genome architecture and function.

Results: We report the analysis of >48 M single-nucleotide, indel, and structural variants spanning the autosomes, X chromosome, and mitochondria. We discover more than 75% of variation for 239 sampled breeds. Allele sharing analysis indicates that 94.9% of breeds form monophyletic clusters and 25 major clades. German Shepherd Dogs and related breeds show the highest allele sharing with independent breeds from multiple clades. On average, each breed dog differs from the UU_Cfam_GSD_1.0 reference at 26,960 deletions and 14,034 insertions greater than 50 bp, with wolves having 14% more variants. Discovered variants include retrogene insertions from 926 parent genes. To aid functional prioritization, single-nucleotide variants were annotated with SnpEff and Zoonomia phyloP constraint scores. Constrained positions were negatively correlated with allele frequency. Finally, the utility of the Dog10K data as an imputation reference panel is assessed, generating high-confidence calls across varied genotyping platform densities, including for breeds not included in the Dog10K collection.

Conclusions: We have developed a dense dataset of 1987 sequenced canids that reveals patterns of allele sharing, identifies likely functional variants, informs breed structure, and enables accurate imputation. Dog10K data are publicly available.

Keywords: Canine; Demographic history; Dog; Genetic diversity; Genomics; Mitochondrial DNA; Variation.

Conflict of interest statement

BWD is Director of Veterinary Research at Volition Veterinary LLC; ARB is a Co-Founder and CSO of Embark Veterinary Inc. and serves on the Board of EMBARK and the Morris Animal Foundation. HL has consulted with Wisdom Panel Kinship, is an owner and chairman of the board of Petbiomics Ltd. and Petmeta Labs Ltd., and is an owner and advisor of DeepScan Diagnostics Ltd. IT is the curator of Online Mendelian Inheritance in Animals (OMIA), which is supported by the University of Sydney.

Figures

Fig. 1
Overview of the Dog10K collection. a Sample collection and sample filtering for (b) the varied demographic, genome function, and architecture examined in the program. QC, quality control. ROH, runs of homozygosity. OMIA, Online Mendelian Inheritance in Animals
Fig. 2
Variant distribution across the genome. a SNV density in 50-kb windows, drawn from 1987 samples. Increased SNV density is observed at the X-PAR region and the ends of most autosomes. b Median copy number (CN) for 1824 dogs reveals a large, common duplication on chr9 relative to the reference genome. c Principal component (PC) analysis separates dog and wolf samples along the first axis while axis two separates dogs from Eastern and Western Eurasia. d SNV sharing between the three categories of samples
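The windowed SNV density in Fig. 2a can be illustrated with a minimal sketch. It assumes SNVs are given as (chromosome, 1-based position) pairs and uses fixed, non-overlapping 50-kb windows; the function name `snv_density` and the input layout are hypothetical and are not taken from the Dog10K pipeline.

```python
from collections import Counter

def snv_density(positions, window_bp=50_000):
    """Count SNVs per fixed, non-overlapping window.

    positions: iterable of (chrom, pos) with 1-based positions (assumed layout).
    Returns a dict mapping (chrom, window_index) to the SNV count, where
    window_index 0 covers positions 1..window_bp.
    """
    counts = Counter((chrom, (pos - 1) // window_bp) for chrom, pos in positions)
    return dict(counts)

# Toy input, not real data
example = [("chr1", 1), ("chr1", 50_000), ("chr1", 50_001), ("chrX", 120_000)]
density = snv_density(example)
```

Windows with no SNVs are simply absent from the result; a plotting step would need to fill them with zero.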
Fig. 3
Genetic relationships between 1563 samples spanning 292 breeds. The plot is annotated for major and minor breed groups, breed purpose, and key morphological features. The fraction of haplotype sharing is indicated by the heatmap
Fig. 4
Proportion of the genome covered by ROH (FROH). Mean and standard deviation are plotted for breed groups and are colored as per Fig. 3. Red dashed line shows mean FROH for breed dogs; gray dashed line shows mean FROH for wolves. Breeds containing individuals with the highest and lowest FROH are labeled
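As a rough illustration of the FROH statistic in Fig. 4, the sketch below computes the proportion of a genome covered by ROH segments. It assumes segments are half-open (chromosome, start, end) intervals in base pairs and that the denominator is a caller-supplied genome length; the ROH-calling parameters and the exact denominator used by the authors are not stated here, so both are assumptions. The function name `froh` is hypothetical.

```python
from collections import defaultdict

def froh(segments, genome_length_bp):
    """Fraction of the genome covered by runs of homozygosity.

    segments: iterable of (chrom, start, end), half-open, in bp (assumed layout).
    Overlapping segments on the same chromosome are merged so that no base
    is counted twice.
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in segments:
        by_chrom[chrom].append((start, end))

    covered = 0
    for intervals in by_chrom.values():
        intervals.sort()
        cur_start, cur_end = intervals[0]
        for start, end in intervals[1:]:
            if start <= cur_end:          # overlaps or touches the current run
                cur_end = max(cur_end, end)
            else:                         # gap: close the current run
                covered += cur_end - cur_start
                cur_start, cur_end = start, end
        covered += cur_end - cur_start
    return covered / genome_length_bp

# Toy input, not real data: 150 bp on chr1 (after merging) plus 50 bp on chr2
value = froh([("chr1", 0, 100), ("chr1", 50, 150), ("chr2", 0, 50)], 1_000)
```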
Fig. 5
Genotype imputation accuracy of the Dog10K reference panel. a NRC rates of imputed genotypes across autosomes and the PAR segment of chromosome X. Variant sites are filtered according to GLIMPSE and IMPUTE5 imputation quality scores (INFO > 0.9). b NRC rates of imputed genotypes across the non-PAR segment of chromosome X. Variants are not filtered by imputation quality score, as imputation software does not provide scores for haploid genotypes. c NRC rates of imputed genotypes across autosomes and the PAR segment of chromosome X prior to filtering on imputation quality. d NRC rates and total number of imputed sites for each platform. Sites were filtered according to imputation quality score > 0.9 and reference MAF > 1%. e NRC rates for downsampled and full chromosome 38 reference panels for sites with reference MAF > 1%. Results show both quality and non-quality filtered sites. Data points show NRC rates for a single downsampled reference panel. Horizontal bars indicate mean NRC rates for each reference panel population size. f Number of imputed variants for downsampled and full chromosome 38 reference panels for sites with reference MAF > 1%. Results show both quality and non-quality filtered sites. Data points show the number of imputed variants for a single downsampled reference panel. Horizontal bars indicate the mean number of variants for each reference panel population size
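The filtering and scoring described in Fig. 5 can be sketched as follows. The caption does not expand NRC; this sketch assumes it denotes non-reference concordance, computed as the share of matching genotypes after excluding sites where both the true and imputed genotypes are homozygous reference. Genotypes are encoded as alternate-allele counts (0, 1, 2), and the thresholds mirror the INFO > 0.9 and reference MAF > 1% filters from panel d. The function name `nrc` and the tuple layout are hypothetical.

```python
def nrc(sites, min_info=0.9, min_maf=0.01):
    """Non-reference concordance over filtered sites (assumed definition).

    sites: iterable of (truth, imputed, info, maf), where truth and imputed
    are alternate-allele counts (0, 1, or 2).
    Sites failing the INFO or MAF filter are skipped, as are sites where
    both genotypes are homozygous reference.
    """
    matches = 0
    total = 0
    for truth, imputed, info, maf in sites:
        if info <= min_info or maf <= min_maf:
            continue                      # fails quality or frequency filter
        if truth == 0 and imputed == 0:
            continue                      # concordant hom-ref is not counted
        total += 1
        if truth == imputed:
            matches += 1
    return matches / total if total else float("nan")

# Toy input, not real data
toy_sites = [
    (0, 0, 0.99, 0.20),   # both hom-ref: excluded
    (1, 1, 0.95, 0.10),   # counted, match
    (2, 1, 0.95, 0.10),   # counted, mismatch
    (1, 1, 0.50, 0.10),   # fails INFO filter
    (0, 1, 0.95, 0.30),   # counted, mismatch
    (2, 2, 0.95, 0.005),  # fails MAF filter
]
rate = nrc(toy_sites)
```

Setting `min_info` to a negative value disables the quality filter, which corresponds loosely to the unfiltered comparison in panel c.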
Fig. 6
Assignment and distribution of haplogroups in breed and village dogs. a Each breed and village dog was assigned to an existing mitochondrial haplotype. b Broad geographic categories based on sample origin
Fig. 7
Structural variation detected across 1879 samples. a Boxplots of the number of deletion, duplication, insertion, and inversion variants, broken down by sample category. b Histograms of the size distribution of each class of detected structural variants. An increase in variant count at the ~200 bp size bin is apparent for deletions and insertions. This corresponds to the size of SINEC elements
Fig. 8
Signatures of selection inferred with ancestry components. a Population structure inferred from Ohana for the nine selected dog groups using K = 5. b Manhattan plots for four of the five ancestral components. Top to bottom: Spitz, Mastiffs, Scenthounds, Pointers, and Spaniels. The red dotted line represents the Bonferroni cutoff and genes either overlapping or within 100 kb from a significant site are indicated at each peak. The asterisk within the Manhattan plot for the Mastiff component contains 88 candidate genes listed in Table S7. c Population tree connecting the ancestral components, with colors corresponding to the ancestries shown in the admixture plot. Each ancestral component is labeled based on the dog group(s) for which it is maximized
Fig. 9
Comparison among variant data sets. a Number of variable positions shared between major databases. b Fraction of genome spaces under constraint (5% FDR, phyloP > 2.56). c Enrichment of constrained bases in OMIA trait (blue), and disease (red) sets compared to the genome as a whole (gray). d Relationship between allele count (AC), allele frequency (AF), and phyloP score for the coding (CDS, red) and non-coding (non-CDS, green) bases in the whole genome (gray)
Fig. 10
Structural and point variants impacting CYP1A2. a Distribution of copy number variation within all breed dogs, and the major clades. b Location of SNVs across CYP1A2 inclusive of filtering or impact status. c Mammalian phyloP scores (bounded by 10 and −10). d Illustration of the region from the reference genome perspective
Fig. 11
SLC28A3 locus copy number expansion. a Location and span of the copy number element in polymorphic Grand Basset Griffon Vendéen (GBGV) and basset hound (BASS) individuals contrasted with copy number two samples from the German Shepherd Dog (GRSD) and BASS breeds. b Distribution of the median copy number count across the 1879 samples assayed
