Compositionality, sparsity, spurious heterogeneity, and other data-driven challenges for machine learning algorithms within plant microbiome studies
- PMID: 36538837
- PMCID: PMC9925409
- DOI: 10.1016/j.pbi.2022.102326
Abstract
The plant-associated microbiome is a key component of plant systems, contributing to their health, growth, and productivity. The application of machine learning (ML) in this field promises to help untangle the relationships involved. However, measurements of microbial communities by high-throughput sequencing pose challenges for ML. Noise from low sample sizes, soil heterogeneity, and technical factors can impact the performance of ML. Additionally, the compositional and sparse nature of these datasets can impact the predictive accuracy of ML. We review recent literature from plant studies to illustrate that these properties often go unmentioned. We expand our analysis to other fields to quantify the degree to which mitigation approaches improve the performance of ML and describe the mathematical basis for this. With the advent of accessible analytical packages for microbiome data including learning models, researchers must be familiar with the nature of their datasets.
Keywords: Compositional data analysis; Deep learning; Machine learning; Plant-associated microbiome.
Copyright © 2022 Elsevier Ltd. All rights reserved.
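The abstract does not say which mitigation approaches the review evaluates. As a hedged illustration only, the sketch below shows one transform commonly used in compositional data analysis, the centered log-ratio (CLR), applied to a sparse count vector. The function name `clr`, the pseudocount of 0.5 used to replace zeros, and the example counts are assumptions made for this sketch and are not taken from the paper.

```python
import math


def clr(counts, pseudocount=0.5):
    """Centered log-ratio transform of one sample's taxon counts.

    `pseudocount` is an illustrative choice for handling zeros (sparsity);
    it is an assumption of this sketch, not a recommendation from the paper.
    """
    # Shift every count so that zeros do not produce log(0).
    shifted = [c + pseudocount for c in counts]
    # Close the data to proportions: only relative abundances are informative.
    total = sum(shifted)
    logs = [math.log(s / total) for s in shifted]
    # Subtract the mean log (the log of the geometric mean) from each log value.
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


# Hypothetical sparse sample: two of five taxa were not observed.
sparse_sample = [120, 0, 30, 0, 850]
clr_sparse = clr(sparse_sample)

# Without zeros and without a pseudocount, CLR values do not depend on
# sequencing depth: the same community at 10x depth gives the same result.
shallow = clr([10, 20, 70], pseudocount=0)
deep = clr([100, 200, 700], pseudocount=0)
```

CLR values always sum to zero within a sample, and for strictly positive counts they are unchanged when every count is multiplied by the same factor. Once a fixed pseudocount is added, that invariance holds only approximately, which is one reason the treatment of zeros matters for sparse datasets.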
Conflict of interest statement
Declaration of competing interest: The authors declare the following financial interests/personal relationships, which may be considered potential competing interests. Cranos Williams reports that financial support was provided by Novo Nordisk Inc. Max Gordon reports that financial support was provided by the National Science Foundation. Sebastiano Busato reports that financial support was provided by the National Institutes of Health. Stig Andersen reports that financial support was provided by Novo Nordisk Inc. Meenal Chaudhari reports that financial support was provided by Novo Nordisk Inc. Ib Jensen reports that financial support was provided by Novo Nordisk Inc. Turgut Akyol reports that financial support was provided by Novo Nordisk Inc.
