Anim Genet. 2021 Oct;52(5):703-713.
doi: 10.1111/age.13117. Epub 2021 Jul 12.

Improving the resolution of canine genome-wide association studies using genotype imputation: A study of two breeds

Christopher A Jenkins et al. Anim Genet. 2021 Oct.

Abstract

Genotype imputation using a reference panel that combines high-density array data and publicly available whole genome sequence consortium variant data is potentially a cost-effective method to increase the density of extant lower-density array datasets. In this study, three datasets (two Border Collie; one Italian Spinone) generated using a legacy array (Illumina CanineHD, 173 662 SNPs) were utilised to assess the feasibility and accuracy of this approach and to gather additional evidence for the efficacy of canine genotype imputation. The cosmopolitan reference panels used to impute genotypes comprised dogs of 158 breeds, mixed breed dogs, wolves and Chinese indigenous dogs, as well as breed-specific individuals genotyped using the Axiom Canine HD array. The two Border Collie reference panels comprised 808 individuals including 79 Border Collies and 426 326 or 426 332 SNPs; and the Italian Spinone reference panel comprised 807 individuals including 38 Italian Spinoni and 476 313 SNPs. A high accuracy for imputation was observed, with the lowest accuracy observed for one of the Border Collie datasets (mean R² = 0.94) and the highest for the Italian Spinone dataset (mean R² = 0.97). This study's findings demonstrate that imputation of a legacy array study set using a reference panel comprising both breed-specific array data and multi-breed variant data derived from whole genomes is effective and accurate. The process of canine genotype imputation, using the valuable growing resource of publicly available canine genome variant datasets alongside breed-specific data, is described in detail to facilitate and encourage use of this technique in canine genetics.

Keywords: Border Collie; Italian Spinone; genome-wide association study; imputation accuracy; whole genome sequencing.
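
The accuracy measure reported in the abstract can be illustrated with a minimal sketch. It assumes that R² is the squared Pearson correlation between known and imputed genotypes, each coded as an alternate-allele count (0, 1 or 2); the abstract does not state the exact computation, and the function name and toy data below are hypothetical.

```python
def squared_correlation(known, imputed):
    """Squared Pearson correlation between two equal-length genotype vectors.

    Assumption: genotypes are coded as alternate-allele counts (0, 1, 2).
    Returns NaN when either vector has no variation, because the
    correlation is undefined in that case.
    """
    if len(known) != len(imputed) or len(known) < 2:
        raise ValueError("need two vectors of equal length (at least 2 values)")

    n = len(known)
    mean_known = sum(known) / n
    mean_imputed = sum(imputed) / n

    # Sums of cross-products and squared deviations from the means.
    cov = sum((k - mean_known) * (i - mean_imputed) for k, i in zip(known, imputed))
    var_known = sum((k - mean_known) ** 2 for k in known)
    var_imputed = sum((i - mean_imputed) ** 2 for i in imputed)

    if var_known == 0 or var_imputed == 0:
        return float("nan")
    return (cov * cov) / (var_known * var_imputed)


# Toy example (made-up genotypes, not data from the study):
# one of four imputed calls disagrees with the known genotype.
known_genotypes = [0, 1, 2, 1]
imputed_genotypes = [0, 1, 2, 2]
print(round(squared_correlation(known_genotypes, imputed_genotypes), 3))
```

A mean R² such as those reported would then be the average of this value over the units being compared, for example over chromosomes or over concordance-tested individuals.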


Figures

Figure 1. Flowchart to illustrate dataset processing for imputation study sets and reference panels. WGS, whole genome sequencing.

Figure 2. Accuracy of imputation for each chromosome in Italian Spinone and Border Collie datasets. The graph shows the R² of imputed calls and known genotypes. Boxes are 25th–75th percentiles, with lines for the median. Whiskers indicate upper and lower adjacent values; outliers are shown using dots. Truncated y-axis starts at 0.7.

Figure 3. Accuracy of imputation for each concordance-tested individual (n = 8 for each set) in Italian Spinone and Border Collie datasets. The graph shows the R² of imputed calls and known genotypes. Boxes are 25th–75th percentiles, with lines for the median. Whiskers indicate upper and lower adjacent values; outliers are shown using dots. Truncated y-axis starts at 0.7.

Figure 4. Accuracy of imputation for each concordance tested dog from Border Collie Set 1 and each of three reference panels containing decreasing numbers of Border Collies. The graph shows the R² of imputed calls and known genotypes. Boxes are 25th–75th percentiles, with lines for the median. Whiskers indicate upper and lower adjacent values; outliers are shown using dots. Lines show mean R² for each reference panel. Truncated y-axis starts at 0.7.

Figure 5. A comparison of imputation accuracy and predicted certainty. Top: percent of concordant genotypes for SNPs with heterozygous or homozygous known genotypes grouped by impute2’s Info metric (imputation certainty). Bottom: percent of total imputed calls within each Info group.
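
The grouping described for Figure 5 can be sketched as follows. This is an illustrative outline only: the Info bin edges, the function name and the toy calls are hypothetical, and the split into heterozygous and homozygous known genotypes shown in the figure is omitted for brevity.

```python
def summarise_by_info(calls, edges=(0.0, 0.5, 0.9, 1.0)):
    """Group imputed calls by Info score and summarise each group.

    calls: iterable of (info, known_genotype, imputed_genotype) tuples.
    edges: hypothetical bin boundaries; each bin is [low, high), except the
           last bin, which also includes its upper edge.
    Calls whose Info score falls outside the edges are ignored.
    """
    n_bins = len(edges) - 1
    counts = [[0, 0] for _ in range(n_bins)]  # [concordant, total] per bin

    for info, known, imputed in calls:
        for b in range(n_bins):
            is_last = b == n_bins - 1
            in_bin = edges[b] <= info < edges[b + 1]
            if in_bin or (is_last and info == edges[-1]):
                counts[b][1] += 1
                counts[b][0] += int(known == imputed)
                break

    total = sum(n for _, n in counts)
    summary = []
    for b, (concordant, n) in enumerate(counts):
        summary.append({
            "info_range": (edges[b], edges[b + 1]),
            # Top panel: percent of calls matching the known genotype.
            "pct_concordant": 100.0 * concordant / n if n else None,
            # Bottom panel: share of all binned calls in this Info group.
            "pct_of_calls": 100.0 * n / total if total else 0.0,
        })
    return summary


# Toy example (made-up calls, not data from the study).
toy_calls = [(0.95, 1, 1), (0.97, 0, 0), (0.60, 2, 1), (0.30, 1, 1)]
for row in summarise_by_info(toy_calls):
    print(row)
```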
