Genomic prediction of pig growth traits based on machine learning
- PMID: 37872114
- DOI: 10.16288/j.yczz.23-120
Abstract
This study compared the performance of machine learning models obtained through automated machine learning in predicting selected pig growth traits and their genomic estimated breeding values (GEBV), with the aim of identifying suitable models and optimizing whole-genome evaluation methods in pig breeding. Genomic information, pedigree matrices, fixed effects, and phenotype data from 9968 pigs from multiple companies were used to obtain the best model of each of four types: deep learning (DL), random forest (RF), gradient boosting machine (GBM), and extreme gradient boosting (XGB). Using 10-fold cross-validation, each model predicted the GEBV and the phenotype of four traits: backfat corrected to 100 kg (B100), backfat corrected to 115 kg (B115), age in days corrected to 100 kg (D100), and age in days corrected to 115 kg (D115). The machine learning models predicted GEBV more accurately than phenotypes. In GEBV prediction, GBM achieved accuracies of 0.683, 0.710, 0.866, and 0.871 for B100, B115, D100, and D115, respectively, slightly higher than the other methods. In phenotype prediction, the best-performing model was GBM for B100 (0.547), DL for B115 (0.547), and XGB for D100 and D115 (0.672 and 0.670, respectively). RF required far more training time than the other three models, GBM and DL were intermediate, and XGB required the least. In summary, machine learning models obtained through automated machine learning predicted GEBV more accurately than phenotypes. GBM showed the highest overall prediction accuracy together with a relatively short training time, and XGB trained reasonably accurate models in the shortest time. RF took far longer to train than the other three models and was insufficiently accurate, making it unsuitable for predicting pig growth trait phenotypes and GEBV.
Keywords: automated machine learning; genomic estimated breeding values; growth traits; performance comparison.
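The evaluation scheme described in the abstract (10-fold cross-validation, a per-model prediction accuracy, and a training-time comparison) can be sketched roughly as follows. This is a minimal illustration under several assumptions, not the study's pipeline: the genotypes and phenotypes are synthetic, accuracy is taken to be the Pearson correlation between predicted and observed values (the abstract does not define its metric), and `fit_gbm` is a toy gradient booster over single-split stumps that merely stands in for the GBM model. All function names here are hypothetical.

```python
import random
import statistics
import time


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def fit_stump(X, resid):
    """Return the (snp, threshold, left_mean, right_mean) split with the lowest squared error."""
    best = None
    for j in range(len(X[0])):
        for t in (0.5, 1.5):  # genotypes are coded 0/1/2 (an assumption of this sketch)
            left = [r for x, r in zip(X, resid) if x[j] < t]
            right = [r for x, r in zip(X, resid) if x[j] >= t]
            if not left or not right:
                continue
            ml, mr = statistics.fmean(left), statistics.fmean(right)
            sse = sum((r - ml) ** 2 for r in left) + sum((r - mr) ** 2 for r in right)
            if best is None or sse < best[0]:
                best = (sse, j, t, ml, mr)
    return None if best is None else best[1:]


def fit_gbm(X, y, n_trees=30, lr=0.1):
    """Toy gradient boosting: each stump is fitted to the current residuals."""
    base = statistics.fmean(y)
    pred = [base] * len(y)
    stumps = []
    for _ in range(n_trees):
        stump = fit_stump(X, [yi - pi for yi, pi in zip(y, pred)])
        if stump is None:  # no usable split left
            break
        j, t, ml, mr = stump
        stumps.append(stump)
        pred = [p + lr * (ml if x[j] < t else mr) for x, p in zip(X, pred)]
    return base, lr, stumps


def predict_gbm(model, X):
    base, lr, stumps = model
    return [base + lr * sum(ml if x[j] < t else mr for j, t, ml, mr in stumps) for x in X]


def cross_validate(X, y, k=10, seed=0):
    """Return (mean accuracy over k folds, total training seconds)."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k disjoint validation sets
    accs, seconds = [], 0.0
    for test in folds:
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        start = time.perf_counter()
        model = fit_gbm([X[i] for i in train], [y[i] for i in train])
        seconds += time.perf_counter() - start
        pred = predict_gbm(model, [X[i] for i in test])
        accs.append(pearson(pred, [y[i] for i in test]))
    return statistics.fmean(accs), seconds


# Synthetic data: 200 animals, 20 SNPs, the first 5 SNPs have an additive effect.
rng = random.Random(42)
genotypes = [[rng.choice((0, 1, 2)) for _ in range(20)] for _ in range(200)]
phenotypes = [sum(g[:5]) + rng.gauss(0.0, 1.0) for g in genotypes]

accuracy, train_seconds = cross_validate(genotypes, phenotypes)
print(f"mean 10-fold accuracy: {accuracy:.3f}, training time: {train_seconds:.2f} s")
```

Running the same `cross_validate` loop once per candidate model, with the fit and predict functions swapped, gives the kind of accuracy and training-time comparison the abstract reports. In practice a library implementation (for example scikit-learn's `GradientBoostingRegressor` with `KFold`) would replace the toy booster.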