Skip to main page content
U.S. flag

An official website of the United States government

Dot gov

The .gov means it’s official.
Federal government websites often end in .gov or .mil. Before sharing sensitive information, make sure you’re on a federal government site.

Https

The site is secure.
The https:// ensures that you are connecting to the official website and that any information you provide is encrypted and transmitted securely.

Access keys NCBI Homepage MyNCBI Homepage Main Content Main Navigation
Review
. 2022 May;106(9-10):3507-3530.
doi: 10.1007/s00253-022-11963-6. Epub 2022 May 16.

Machine learning: its challenges and opportunities in plant system biology

Affiliations
Review

Machine learning: its challenges and opportunities in plant system biology

Mohsen Hesami et al. Appl Microbiol Biotechnol. 2022 May.

Abstract

Sequencing technologies are evolving at a rapid pace, enabling the generation of massive amounts of data in multiple dimensions (e.g., genomics, epigenomics, transcriptomic, metabolomics, proteomics, and single-cell omics) in plants. To provide comprehensive insights into the complexity of plant biological systems, it is important to integrate different omics datasets. Although recent advances in computational analytical pipelines have enabled efficient and high-quality exploration and exploitation of single omics data, the integration of multidimensional, heterogenous, and large datasets (i.e., multi-omics) remains a challenge. In this regard, machine learning (ML) offers promising approaches to integrate large datasets and to recognize fine-grained patterns and relationships. Nevertheless, they require rigorous optimizations to process multi-omics-derived datasets. In this review, we discuss the main concepts of machine learning as well as the key challenges and solutions related to the big data derived from plant system biology. We also provide in-depth insight into the principles of data integration using ML, as well as challenges and opportunities in different contexts including multi-omics, single-cell omics, protein function, and protein-protein interaction. KEY POINTS: • The key challenges and solutions related to the big data derived from plant system biology have been highlighted. • Different methods of data integration have been discussed. • Challenges and opportunities of the application of machine learning in plant system biology have been highlighted and discussed.

Keywords: Big data; Data integration; Epigenomics; Multi-omics; Plant molecular biology; Prediction; Protein function; Transcription factor.

PubMed Disclaimer

Similar articles

Cited by

References

    1. Acharjee A, Kloosterman B, Visser RGF, Maliepaard C (2016) Integration of multi-omics data for prediction of phenotypic traits using random forest. BMC Bioinform 17(5):180. https://doi.org/10.1186/s12859-016-1043-4 - DOI
    1. Aghbashlo M, Peng W, Tabatabaei M, Kalogirou SA, Soltanian S, Hosseinzadeh-Bandbafha H, Mahian O, Lam SS (2021) Machine learning technology in biodiesel research: a review. Prog Energy Combust Sci 85:100904. https://doi.org/10.1016/j.pecs.2021.100904 - DOI
    1. Alipanahi B, Delong A, Weirauch MT, Frey BJ (2015) Predicting the sequence specificities of DNA- and RNA-binding proteins by deep learning. Nat Biotechnol 33(8):831–838. https://doi.org/10.1038/nbt.3300 - DOI - PubMed
    1. Alizadeh M, Hoy R, Lu B, Song L (2021) Team effort: Combinatorial control of seed maturation by transcription factors. Curr Opin Plant Biol 63:102091. https://doi.org/10.1016/j.pbi.2021.102091 - DOI - PubMed
    1. Amodio M, van Dijk D, Srinivasan K, Chen WS, Mohsen H, Moon KR, Campbell A, Zhao Y, Wang X, Venkataswamy M, Desai A, Ravi V, Kumar P, Montgomery R, Wolf G, Krishnaswamy S (2019) Exploring single-cell data with deep multitasking neural networks. Nat Methods 16(11):1139–1145. https://doi.org/10.1038/s41592-019-0576-7 - DOI - PubMed

LinkOut - more resources