A Multi-Label Learning Framework for Predicting Chemical Classes and Biological Activities of Natural Products from Biosynthetic Gene Clusters
- PMID: 37779180
- DOI: 10.1007/s10886-023-01452-z
Abstract
Natural products (NPs), or secondary metabolites, are small molecules synthesized by enzymes or enzyme complexes that are encoded by chromosomally clustered biosynthesis genes (biosynthetic gene clusters, BGCs). They mediate the bioecological interactions between host and microbiota and provide a natural reservoir for screening drug-like therapeutic compounds. In this work, we propose a multi-label learning framework to functionally annotate natural products solely from their biosynthetic gene clusters, without experimentally resolving NP structures. All chemical classes and bioactivities constitute the label space, and the sequence domains of the biosynthetic gene clusters that catalyse natural product biosynthesis constitute the feature space. In this framework, a joint representation of features (BGC domains) and labels (natural product annotations) is efficiently learnt in a single low-dimensional space, which helps define the inter-class boundaries accurately and scales to learning problems with many imbalanced labels. Computational results on experimental data show that the proposed framework achieves satisfactory multi-label learning performance, and that the learnt patterns of BGC domains are transferable across bacteria and even across kingdoms, for instance from bacteria to Arabidopsis thaliana. Lastly, taking Arabidopsis thaliana and its rhizosphere microbiome as an example, we propose a pipeline that combines existing BGC identification tools with the proposed framework to find and functionally annotate novel natural products for downstream bioecological studies of plant-microbiota-soil interactions and plant environmental adaptation.
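To make the feature space and label space concrete, the following is a minimal sketch of the problem setup only. It is not the joint low-dimensional embedding described in the abstract; a simple nearest-neighbour vote over domain sets is swapped in as a stand-in baseline. The domain names, labels, training examples, and the functions `build_vocab`, `encode`, `jaccard`, and `predict_labels` are all hypothetical and chosen for illustration.

```python
from collections import Counter

# Toy, made-up training data (an assumption, not from the paper): each BGC
# is a set of protein-domain names (features) paired with a set of
# chemical-class and bioactivity annotations (labels).
TRAIN = [
    ({"PKS_KS", "PKS_AT", "ACP"}, {"polyketide", "antibacterial"}),
    ({"PKS_KS", "PKS_AT", "Thioesterase"}, {"polyketide", "antifungal"}),
    ({"Condensation", "AMP-binding", "PCP"}, {"NRP", "antibacterial"}),
    ({"Condensation", "AMP-binding", "Thioesterase"}, {"NRP", "antibacterial"}),
]


def build_vocab(sets):
    """Return a sorted list of every item that occurs in any of the sets."""
    return sorted(set().union(*sets))


def encode(items, vocab):
    """Encode a set as a binary presence/absence vector over vocab."""
    return [1 if v in items else 0 for v in vocab]


def jaccard(a, b):
    """Jaccard similarity between two sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def predict_labels(domains, train, k=2, threshold=0.5):
    """Baseline multi-label prediction: take the k training BGCs whose
    domain sets are most similar, then keep every label carried by at
    least `threshold` of those neighbours."""
    neighbours = sorted(
        train, key=lambda ex: jaccard(domains, ex[0]), reverse=True
    )[:k]
    votes = Counter(label for _, labels in neighbours for label in labels)
    return {label for label, n in votes.items() if n / k >= threshold}


domain_vocab = build_vocab([domains for domains, _ in TRAIN])
label_vocab = build_vocab([labels for _, labels in TRAIN])

query = {"PKS_KS", "PKS_AT", "ACP", "Thioesterase"}
feature_vector = encode(query, domain_vocab)
predicted = predict_labels(query, TRAIN)
label_vector = encode(predicted, label_vocab)
```

Because one BGC can receive several labels at once (here a chemical class together with one or more bioactivities), the output is a set of labels rather than a single class, which is what distinguishes the multi-label setting from ordinary classification.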
Keywords: Biosynthetic gene clusters; Machine learning; Multi-label learning; Natural products; Plant and soil microbiome; Synthetic biology; Transfer learning.
© 2023. The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature.