Plant Genome. 2023 Dec;16(4):e20390.
doi: 10.1002/tpg2.20390. Epub 2023 Sep 20.

Genomic prediction with machine learning in sugarcane, a complex highly polyploid clonally propagated crop with substantial non-additive variation for key traits

Free article

Chensong Chen et al. Plant Genome. 2023 Dec.

Abstract

Sugarcane has a complex, highly polyploid genome with multi-species ancestry. Additive models for genomic prediction of clonal performance might not capture interactions between genes and alleles from different ploidies and ancestral species. As such, genomic prediction in sugarcane presents an interesting case for machine learning (ML) methods, which are purportedly able to deal with high levels of complexity in prediction. Here, we investigated deep learning (DL) neural networks, including multilayer perceptrons (MLP) and convolutional neural networks (CNN), and an ensemble machine learning approach, random forest (RF), for genomic prediction in sugarcane. The data set comprised 2912 sugarcane clones, scored for 26,086 genome-wide single nucleotide polymorphism markers, with final assessment trial data for total cane harvested (TCH), commercial cane sugar (CCS), and fiber content (Fiber). The clones in the latest trial (2017) were used as a validation set. We compared the prediction accuracy of these methods with that of genomic best linear unbiased prediction (GBLUP) extended to include dominance and epistatic effects. The prediction accuracies from GBLUP models were up to 0.37 for TCH, 0.43 for CCS, and 0.48 for Fiber, while the optimized ML models had prediction accuracies of 0.35 for TCH, 0.38 for CCS, and 0.48 for Fiber. Both the RF and DL neural network models had predictive ability comparable to that of the additive GBLUP model but were less accurate than the extended GBLUP model.
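To make the comparison in the abstract concrete, the following is a minimal sketch of an additive GBLUP baseline and of one common way to score prediction accuracy. It is not the authors' pipeline. Several things here are assumptions: the data are simulated, markers are coded 0/1/2 as for a diploid (sugarcane is highly polyploid, so real marker coding would differ), the relationship matrix follows the VanRaden construction, the heritability `h2` is fixed rather than estimated, accuracy is taken to be the Pearson correlation between predicted and observed values in the validation set (the abstract does not define it), and all function names are hypothetical.

```python
import numpy as np


def vanraden_grm(M):
    """Additive genomic relationship matrix from a clones x markers 0/1/2 matrix.

    Assumption: diploid-style allele dosage coding, used only for illustration.
    """
    p = M.mean(axis=0) / 2.0              # estimated allele frequency per marker
    Z = M - 2.0 * p                       # centre each marker column
    denom = 2.0 * np.sum(p * (1.0 - p))   # scales G to an average diagonal near 1
    return Z @ Z.T / denom


def gblup_predict(G, y_train, train_idx, valid_idx, h2=0.5):
    """Predict validation clones from training phenotypes with additive GBLUP.

    Assumption: h2 is supplied by the user; in practice variance components
    would be estimated (for example by REML).
    """
    lam = (1.0 - h2) / h2                 # residual-to-genetic variance ratio
    mu = y_train.mean()
    G_tt = G[np.ix_(train_idx, train_idx)]
    G_vt = G[np.ix_(valid_idx, train_idx)]
    alpha = np.linalg.solve(G_tt + lam * np.eye(len(train_idx)), y_train - mu)
    return mu + G_vt @ alpha


def prediction_accuracy(y_obs, y_pred):
    """Pearson correlation between observed and predicted values."""
    return float(np.corrcoef(y_obs, y_pred)[0, 1])


# Simulated data (hypothetical sizes, far smaller than the study's data set).
rng = np.random.default_rng(0)
n_clones, n_markers = 400, 200
M = rng.binomial(2, 0.3, size=(n_clones, n_markers)).astype(float)
effects = rng.normal(0.0, 1.0, n_markers)
g = M @ effects
g = (g - g.mean()) / g.std()              # standardised genetic values
y = g + rng.normal(0.0, 1.0, n_clones)    # phenotype with heritability near 0.5

# Hold out the last block of clones, loosely mimicking a "latest trial" split.
train_idx = np.arange(0, 320)
valid_idx = np.arange(320, n_clones)

G = vanraden_grm(M)
y_hat = gblup_predict(G, y[train_idx], train_idx, valid_idx)
acc = prediction_accuracy(y[valid_idx], y_hat)
print(f"validation accuracy (Pearson r): {acc:.2f}")
```

The same `prediction_accuracy` function could score predictions from a random forest or a neural network on the same validation clones, which is the kind of like-for-like comparison the abstract reports. Extending the baseline to dominance or epistatic effects would require additional relationship matrices that this sketch does not construct.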
