Strategies for Obtaining and Pruning Imputed Whole-Genome Sequence Data for Genomic Prediction
- PMID: 31379929
- PMCID: PMC6650575
- DOI: 10.3389/fgene.2019.00673
Abstract
Genomic prediction with imputed whole-genome sequencing (WGS) data is an attractive approach to improve predictive ability at low cost. However, high accuracy has not been realized using this method in livestock. In this study, we imputed 435 individuals from 600K single nucleotide polymorphism (SNP) chip data to WGS data using different reference panels. We also investigated the prediction accuracy of genomic best linear unbiased prediction (GBLUP) using imputed WGS data from different reference panels, linkage disequilibrium (LD)-based marker pruning, and pre-selected variants based on genome-wide association study (GWAS) results. Results showed that the imputation accuracies from 600K to WGS data were 0.873 ± 0.038, 0.906 ± 0.036, and 0.979 ± 0.010 for the internal, external, and combined reference panels, respectively. For most traits in chickens, the prediction accuracy of imputed WGS data obtained from the internal reference panel was greater than or equal to that of the combined reference panel; the external reference panel had the lowest prediction accuracy. Compared with 600K chip data, GBLUP with imputed WGS data had only a small increase (1-3%) in prediction accuracy. Using only variants selected from imputed WGS data based on GWAS results resulted in almost no increase for most traits and even increased the bias of the regression coefficient. The degree of LD of the selected variants and that of the remaining variants affected prediction accuracy differently. For average daily gain (ADG), residual feed intake (RFI), intestine length (IL), and body weight at 91 days (BW91), the accuracy of GBLUP increased as the degree of LD of selected variants decreased, but the opposite relationship occurred for the remaining variants. However, for breast muscle weight (BMW) and average daily feed intake (ADFI), the accuracy of GBLUP increased as the degree of LD of selected variants increased, and the degree of LD of remaining variants had a small effect on prediction accuracy.
Overall, the optimal imputation strategy to obtain WGS data for genomic prediction should consider the relationship between selected individuals and target population individuals to avoid heterogeneity of imputation. LD-based marker pruning can be used to improve the accuracy of genomic prediction using imputed WGS data.
Keywords: GWAS; LD-based marker pruning; chickens; genomic prediction; imputed WGS data.
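As a rough illustration of what LD-based marker pruning involves, the following minimal Python sketch greedily drops markers that are in high LD with a marker already kept. The abstract does not state the LD measure, threshold, window size, or software the authors used; the squared genotype correlation (r²), the 0.9 threshold, the 50-marker window, and the function names `r_squared` and `ld_prune` are all assumptions made here for illustration only.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two genotype vectors (0/1/2 allele counts)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # monomorphic marker: correlation undefined, treated as 0 here
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy * sxy / (sxx * syy)


def ld_prune(genotypes, threshold=0.9, window=50):
    """Greedy LD pruning (illustrative, not the authors' procedure).

    genotypes: list of marker vectors in map order, one value per individual.
    A marker is kept unless its r^2 with an already kept marker located
    within the preceding `window` markers exceeds `threshold`.
    Returns the indices of the kept markers.
    """
    kept = []
    for j, g in enumerate(genotypes):
        recent = [k for k in kept if j - k <= window]
        if all(r_squared(genotypes[k], g) <= threshold for k in recent):
            kept.append(j)
    return kept


# Toy data: 4 markers genotyped on 6 individuals (hypothetical values).
markers = [
    [0, 1, 2, 1, 0, 2],
    [0, 1, 2, 1, 0, 2],  # identical to marker 0, so r^2 = 1
    [2, 0, 1, 1, 2, 0],
    [0, 1, 2, 1, 0, 1],
]

print(ld_prune(markers, threshold=0.9))  # [0, 2, 3]
print(ld_prune(markers, threshold=0.5))  # [0]
```

Lowering the threshold keeps fewer, less correlated markers; in a setting like the one described above, the retained set would then be used to build the genomic relationship matrix for GBLUP.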