BMC Genomics. 2025 Mar 4;26(1):215.
doi: 10.1186/s12864-025-11250-4.

Using genotype imputation to integrate Canola populations for genome-wide association and genomic prediction of blackleg resistance

Huanhuan Zhao et al. BMC Genomics.

Abstract

Background: Integrating germplasm populations genotyped on different platforms via genotype imputation is a way to utilize accumulated genetic resources. In this study, we used 278 canola samples genotyped via whole-genome sequencing (WGS) at 10× coverage to evaluate the imputation accuracy of three imputation approaches. The optimal imputation methods were used to impute and integrate two canola genotype datasets: a diverse canola collection genotyped by genotyping-by-sequencing via transcriptome (GBS-t) and a double haploid (DH) line collection genotyped with low-coverage WGS (skim-WGS). The genomic predictive ability (GP) and detection power of marker-trait association (GWAS) of the combined population for blackleg resistance were evaluated.

Results: The empirical imputation accuracy (r2), measured as the squared correlation between observed and imputed genotypes, was moderate for Minimac3 when imputing from the GBS-t density to the WGS. The accuracy dramatically improved from 0.64 to 0.82 after removing SNPs with a poor Minimac3-reported Rsq quality statistic (Rsq < 0.2). The r2 for GLIMPSE was higher than that for Beagle when imputing from different low-coverage to full-coverage WGS. We imputed and integrated the diverse canola collection and the DH lines, and with ~921 K SNPs the combined population showed similar or slightly greater predictive ability (PA) for blackleg resistance traits than did each of the single populations. The combined population showed higher detection power for marker-trait associations (MTAs); however, a similar number of MTAs was discovered when the single-population results were combined in a meta-GWAS.
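As an illustration of the accuracy measure described above, the following minimal sketch computes the empirical r2 for one SNP as the squared Pearson correlation between observed and imputed genotypes, and then averages r2 over the SNPs that pass an Rsq filter. The function names and the toy inputs are hypothetical and are not taken from the study; only the definition of r2 and the Rsq < 0.2 removal rule come from the abstract.

```python
def empirical_r2(observed, imputed):
    """Squared Pearson correlation between observed and imputed genotypes
    for one SNP across samples. Returns None if either vector is constant,
    because the correlation is undefined in that case."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_i = sum(imputed) / n
    cov = sum((o - mean_o) * (i - mean_i) for o, i in zip(observed, imputed))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_i = sum((i - mean_i) ** 2 for i in imputed)
    if var_o == 0 or var_i == 0:
        return None
    return cov * cov / (var_o * var_i)


def mean_r2_after_rsq_filter(r2_values, rsq_values, threshold=0.2):
    """Mean empirical r2 over SNPs whose reported Rsq is at least `threshold`
    (SNPs with Rsq below the threshold are removed, as in the abstract).
    SNPs with undefined r2 are skipped. Returns None if nothing is kept."""
    kept = [
        r2
        for r2, rsq in zip(r2_values, rsq_values)
        if r2 is not None and rsq >= threshold
    ]
    return sum(kept) / len(kept) if kept else None
```

Imputed genotypes can be passed either as best-guess calls (0, 1, 2) or as dosages; which of the two the authors used is not stated in the abstract.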

Conclusion: It is feasible to impute and integrate germplasms from different sequencing platforms for downstream analyses. However, genetic heterogeneity across populations could add complexity to the analysis. Increasing the sample size by combining datasets showed slightly greater predictive ability and greater detection power in GWASs in the present study.

Keywords: Blackleg resistance; Canola; GBS-t; GP; GWAS; Imputation; Whole-genome sequencing (WGS); skim-WGS.

Conflict of interest statement

Declarations. Ethics approval and consent to participate: Not applicable. Consent for publication: Not applicable. Competing interests: The authors declare no competing interests.

Figures

Fig. 1
Boxplot of the empirical imputation accuracy (represented by concordance and r2) (a) and the mean r2 values across different Minimac3 Rsq quality statistic bins (b). Both panels show the performance of imputing from GBS-t density to WGS using Minimac3. We further compared the r2 of Beagle and GLIMPSE by imputing from different skim coverages to 10× WGS (Fig. 2a). GLIMPSE outperformed Beagle in all low-coverage datasets, and the greatest differences were at 0.5× and 1.0×, where GLIMPSE (0.88 and 0.9) achieved 16% greater accuracy than Beagle (0.74 and 0.66). GLIMPSE also had a slightly greater r2 than Beagle in all MAF bins (Fig. 2b). The r2 was lower for SNPs with low MAFs (MAF ≤ 0.06) for both methods; however, the concordance did not change dramatically across MAF bins in either Beagle or GLIMPSE.
Fig. 2
The average empirical imputation accuracy (r2) for imputing from different sequencing coverage to 10× WGS using Beagle and GLIMPSE (a) and the average r2 at different MAF bins with the full SNP dataset using Beagle and GLIMPSE (b)
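A per-MAF-bin summary like the one in panel (b) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the helper names and all bin edges other than the 0.06 cut-off mentioned in the text are hypothetical. The sketch also includes concordance, since a rare SNP imputed as all major-allele homozygotes can still score high concordance, which is one common reason concordance varies less across MAF bins than r2 does.

```python
def minor_allele_frequency(genotypes):
    """MAF of one SNP from 0/1/2 allele counts across diploid samples."""
    freq = sum(genotypes) / (2 * len(genotypes))
    return min(freq, 1 - freq)


def concordance(observed, imputed):
    """Fraction of samples whose imputed genotype call equals the observed call."""
    matches = sum(1 for o, i in zip(observed, imputed) if o == i)
    return matches / len(observed)


def mean_by_maf_bin(values, mafs, edges):
    """Average a per-SNP statistic (e.g. r2) within MAF bins (lo, hi].
    `edges` is an increasing list of bin boundaries; SNPs falling outside
    every bin (e.g. MAF == 0) are ignored. Empty bins are omitted."""
    sums = {}
    for value, maf in zip(values, mafs):
        for lo, hi in zip(edges, edges[1:]):
            if lo < maf <= hi:
                total, count = sums.get((lo, hi), (0.0, 0))
                sums[(lo, hi)] = (total + value, count + 1)
                break
    return {bin_: total / count for bin_, (total, count) in sums.items()}
```

For example, a SNP with a single heterozygote among ten samples has a MAF of 0.05; imputing every sample as the major homozygote gives a concordance of 0.9 even though the imputed genotypes carry no information about the minor allele.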
Fig. 3
Dot plots of the first two principal components (PC1, PC2) of the genomic relationship matrix (GRM) with all canola lines included in the study, with color-coded groups
Fig. 4
Manhattan plots of the diverse, DH, and combined population and meta-GWAS for survival rate (SurvRt, left) and average internal infection (AveInf, right). The SNPs shown in red are all significant SNPs from all four GWAS
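The abstract does not say which meta-analysis method was used for the meta-GWAS. As one common possibility, the sketch below combines per-population SNP effects with a fixed-effect, inverse-variance-weighted meta-analysis. The function name and the inputs are hypothetical, and the choice of method is an assumption rather than a description of the study.

```python
import math


def fixed_effect_meta(betas, standard_errors):
    """Inverse-variance-weighted fixed-effect meta-analysis for one SNP.

    `betas` and `standard_errors` hold the effect estimate and its standard
    error from each population's GWAS (assumed method, not from the paper).
    Returns the pooled effect, its standard error, the z-score, and a
    two-sided p-value under a normal approximation.
    """
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    se = math.sqrt(1.0 / total_weight)
    z = beta / se
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, z, p_value
```

When two populations give the same effect estimate with the same standard error, the pooled standard error shrinks by a factor of √2, which is how a meta-analysis can reach significance for SNPs that fall short in each single-population GWAS.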

