Using genotype imputation to integrate Canola populations for genome-wide association and genomic prediction of blackleg resistance

Huanhuan Zhao¹, Iona M MacLeod²,³, Gabriel Keeble-Gagnere², Denise M Barbulescu², Josquin F Tibbits², Sukhjiwan Kaur², Matthew Hayden⁴,⁵

Affiliations

¹ Agriculture Victoria, AgriBio, Centre for AgriBioscience, Bundoora, VIC, 3083, Australia. huan.zhao@agriculture.vic.gov.au.
² Agriculture Victoria, AgriBio, Centre for AgriBioscience, Bundoora, VIC, 3083, Australia.
³ School of Applied Systems Biology, La Trobe University, Bundoora, VIC, 3083, Australia.
⁴ Agriculture Victoria, AgriBio, Centre for AgriBioscience, Bundoora, VIC, 3083, Australia. matthew.hayden@agriculture.vic.gov.au.
⁵ School of Applied Systems Biology, La Trobe University, Bundoora, VIC, 3083, Australia. matthew.hayden@agriculture.vic.gov.au.

PMID: 40038585
PMCID: PMC11877698
DOI: 10.1186/s12864-025-11250-4

Huanhuan Zhao et al. BMC Genomics. 2025 Mar 4;26(1):215. doi: 10.1186/s12864-025-11250-4.


Abstract

Background: Integrating germplasm populations genotyped on different platforms via genotype imputation is a way to utilize accumulated genetic resources. In this study, we used 278 canola samples genotyped via whole-genome sequencing (WGS) at 10× coverage to evaluate the imputation accuracy of three imputation approaches. The best-performing imputation methods were used to impute and integrate two canola genotype datasets: a diverse canola collection genotyped by genotyping-by-sequencing via transcriptome (GBS-t) and a doubled haploid (DH) line collection genotyped with low-coverage WGS (skim-WGS). For blackleg resistance, we then evaluated the combined population's predictive ability in genomic prediction (GP) and its power to detect marker-trait associations in genome-wide association studies (GWAS).

Results: The empirical imputation accuracy (r²), measured as the squared correlation between observed and imputed genotypes, was moderate for Minimac3 when imputing from the GBS-t density to WGS. The accuracy rose from 0.64 to 0.82 after removing SNPs with a poor Minimac3-reported Rsq quality statistic (Rsq < 0.2). The r² for GLIMPSE was higher than that for Beagle when imputing from various low-coverage WGS levels to full-coverage WGS. We imputed and integrated the diverse canola collection and the DH lines, and with ~921 K SNPs the combined population showed similar or slightly greater predictive ability (PA) for blackleg resistance traits than either single population. The combined population showed greater power to detect marker-trait associations (MTAs); however, a similar number of MTAs was discovered when the single-population results were combined in a meta-GWAS.

Conclusion: It is feasible to impute and integrate germplasm from different sequencing platforms for downstream analyses. However, genetic heterogeneity across populations could add complexity to the analysis. In the present study, increasing the sample size by combining datasets gave slightly greater predictive ability and greater detection power in GWAS.

Keywords: Blackleg resistance; Canola; GBS-t; GP; GWAS; Imputation; Whole-genome sequencing (WGS); skim-WGS.


Conflict of interest statement

Declarations. Ethics approval and consent to participate: Not applicable. Consent for publication: Not applicable. Competing interests: The authors declare no competing interests.

Figures

**Fig. 1**
Boxplot of the empirical imputation accuracy (represented by concordance and r²) (a) and the mean r² values across different Minimac Rsq quality statistic bins (b). Both panels visualize the performance of imputing from GBS-t density to WGS using Minimac3. We further compared the r² of Beagle and GLIMPSE by imputing from different skim coverages to 10× WGS (Fig. 2a). GLIMPSE outperformed Beagle in all low-coverage datasets, and the greatest differences were at 0.5× and 1.0×, where GLIMPSE (0.88 and 0.9) achieved 16% greater accuracy than Beagle (0.74 and 0.66). GLIMPSE also had a slightly greater r² than Beagle in all MAF bins (Fig. 2b). The r² was lower for SNPs with low MAFs (MAF ≤ 0.06) for both methods; however, the concordance did not change dramatically across MAF bins for either Beagle or GLIMPSE.
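The two empirical accuracy measures named in this caption can be illustrated with a minimal sketch. It assumes genotypes are coded as 0/1/2 allele dosages in sample-by-SNP arrays and that concordance means the fraction of samples whose rounded imputed genotype equals the observed one; the function names `imputation_accuracy` and `mean_r2_after_rsq_filter` are hypothetical and are not taken from the paper or from any imputation tool.

```python
import numpy as np

def imputation_accuracy(observed, imputed):
    """Per-SNP squared Pearson correlation (r2) and concordance.

    observed, imputed: arrays of shape (n_samples, n_snps) holding
    0/1/2 dosages (imputed values may be fractional).
    """
    observed = np.asarray(observed, dtype=float)
    imputed = np.asarray(imputed, dtype=float)
    obs_c = observed - observed.mean(axis=0)
    imp_c = imputed - imputed.mean(axis=0)
    cov = (obs_c * imp_c).sum(axis=0)
    denom = (obs_c ** 2).sum(axis=0) * (imp_c ** 2).sum(axis=0)
    # r2 is undefined when either column is monomorphic; report NaN there.
    with np.errstate(divide="ignore", invalid="ignore"):
        r2 = np.where(denom > 0, cov ** 2 / denom, np.nan)
    # Concordance: share of samples whose best-guess genotype matches.
    concordance = (np.rint(imputed) == observed).mean(axis=0)
    return r2, concordance

def mean_r2_after_rsq_filter(r2, rsq, threshold=0.2):
    """Mean empirical r2 over SNPs whose reported Rsq is >= threshold."""
    keep = np.asarray(rsq, dtype=float) >= threshold
    return float(np.nanmean(np.asarray(r2, dtype=float)[keep]))
```

Comparing the mean r² before and after such a filter corresponds to the kind of improvement reported in the abstract when SNPs with Rsq < 0.2 were removed; the threshold default here simply mirrors that value.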

**Fig. 2**
The average empirical imputation accuracy (r²) for imputing from different sequencing coverage to 10× WGS using Beagle and GLIMPSE (a) and the average r² at different MAF bins with the full SNP dataset using Beagle and GLIMPSE (b)
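A summary like panel (b) can be sketched as a mean of per-SNP r² within minor allele frequency (MAF) bins. The sketch below assumes 0/1/2 dosage coding and half-open bins of the form (lower, upper]; the bin edges are supplied by the caller and the function names are hypothetical, since the source does not state how the bins were constructed.

```python
import numpy as np

def minor_allele_frequency(genotypes):
    """MAF per SNP from a (n_samples, n_snps) array of 0/1/2 dosages."""
    g = np.asarray(genotypes, dtype=float)
    p = g.mean(axis=0) / 2.0           # frequency of the counted allele
    return np.minimum(p, 1.0 - p)      # fold to the minor allele

def mean_r2_by_maf_bin(maf, r2, bin_edges):
    """Mean r2 per MAF bin; bins are (edge[i-1], edge[i]]."""
    maf = np.asarray(maf, dtype=float)
    r2 = np.asarray(r2, dtype=float)
    # With right=True, index b means edge[b-1] < maf <= edge[b].
    idx = np.digitize(maf, bin_edges, right=True)
    result = {}
    for b in range(1, len(bin_edges)):
        values = r2[idx == b]
        values = values[~np.isnan(values)]   # skip undefined r2 values
        key = (bin_edges[b - 1], bin_edges[b])
        result[key] = float(values.mean()) if values.size else float("nan")
    return result
```

SNPs with a MAF of exactly zero fall outside the first bin under this convention, which is a deliberate choice here because r² is undefined for monomorphic sites.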

**Fig. 3**
Dot plots of the first two principal components (PC1, PC2) of the genomic relationship matrix (GRM) with all canola lines included in the study, with color-coded groups
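As an illustration of how the principal components of a GRM might be obtained, the sketch below builds a GRM with VanRaden's first method and takes its leading eigenvectors. This is an assumption for illustration only: the source does not state which GRM formulation or software was used, and the function names are hypothetical.

```python
import numpy as np

def vanraden_grm(genotypes):
    """GRM from a (n_samples, n_snps) array of 0/1/2 dosages.

    Assumes VanRaden's first method: G = Z Z' / (2 * sum(p * (1 - p))),
    where Z holds dosages centered by twice the allele frequency.
    """
    m = np.asarray(genotypes, dtype=float)
    p = m.mean(axis=0) / 2.0
    z = m - 2.0 * p
    scale = 2.0 * np.sum(p * (1.0 - p))
    return z @ z.T / scale

def top_principal_components(grm, k=2):
    """Return the top-k PC scores and eigenvalues of a symmetric GRM."""
    eigvals, eigvecs = np.linalg.eigh(grm)     # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:k]      # indices of the k largest
    top_vals = eigvals[order]
    # Scale eigenvectors by sqrt(eigenvalue); clip tiny negatives from rounding.
    scores = eigvecs[:, order] * np.sqrt(np.clip(top_vals, 0.0, None))
    return scores, top_vals
```

Plotting the two columns of the returned scores against each other, colored by population group, gives a plot of the kind described in this caption.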

**Fig. 4**
Manhattan plots of the diverse, DH, and combined population and meta-GWAS for survival rate (SurvRt, left) and average internal infection (AveInf, right). The SNPs shown in red are all significant SNPs from all four GWAS


