Predicting novel substrates for enzymes with minimal experimental effort with active learning
- PMID: 29030274
- PMCID: PMC7055960
- DOI: 10.1016/j.ymben.2017.09.016
Abstract
Enzymatic substrate promiscuity is more ubiquitous than previously thought, with significant consequences for understanding metabolism and its application to biocatalysis. This realization has given rise to the need for efficient characterization of enzyme promiscuity. Enzyme promiscuity is currently characterized with a limited number of human-selected compounds that may not be representative of the enzyme's versatility. While testing large numbers of compounds may be impractical, computational approaches can exploit existing data to determine the most informative substrates to test next, thereby more thoroughly exploring an enzyme's versatility. To demonstrate this, we used existing studies and tested compounds for four different enzymes, developed support vector machine (SVM) models using these datasets, and selected additional compounds for experiments using an active learning approach. SVMs trained on a chemically diverse set of compounds were discovered to achieve maximum accuracies of ~80% using ~33% fewer compounds than datasets based on all compounds tested in existing studies. Active learning-selected compounds for testing resolved apparent conflicts in the existing training data, while adding diversity to the dataset. The application of these algorithms to wide arrays of metabolic enzymes would result in a library of SVMs that can predict high-probability promiscuous enzymatic reactions and could prove a valuable resource for the design of novel metabolic pathways.
Keywords: Active learning; Enzyme promiscuity; Machine learning.
Copyright © 2017 International Metabolic Engineering Society. Published by Elsevier Inc. All rights reserved.
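The loop described in the abstract (train an SVM on the compounds tested so far, then choose the next compound to test) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not state the query strategy, kernel, or molecular features the authors used. The sketch uses uncertainty sampling (query the compound closest to the decision boundary), a linear SVM fitted by stochastic subgradient descent on the hinge loss, toy two-dimensional feature vectors, and a hypothetical `oracle` function standing in for a wet-lab assay.

```python
import random

def train_linear_svm(X, y, lam=0.01, lr=0.05, epochs=200, seed=0):
    """Fit a linear soft-margin SVM by stochastic subgradient descent.
    X: list of feature vectors; y: labels in {-1, +1}. Returns (w, b)."""
    rng = random.Random(seed)
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        order = list(range(len(X)))
        rng.shuffle(order)
        for i in order:
            margin = y[i] * (sum(wj * xj for wj, xj in zip(w, X[i])) + b)
            # The L2 regularisation term always shrinks the weights.
            w = [wj - lr * lam * wj for wj in w]
            if margin < 1:  # hinge loss is active for this example
                w = [wj + lr * y[i] * xj for wj, xj in zip(w, X[i])]
                b += lr * y[i]
    return w, b

def decision(w, b, x):
    """Signed score of x; its sign is the predicted class."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b

def most_uncertain(w, b, pool):
    """Index of the pool item with the smallest absolute score."""
    return min(range(len(pool)), key=lambda i: abs(decision(w, b, pool[i])))

def active_learning(X_lab, y_lab, pool, oracle, n_queries):
    """Repeatedly retrain, query the most uncertain item, and label it."""
    X_lab, y_lab, pool = list(X_lab), list(y_lab), list(pool)
    queried = []
    for _ in range(n_queries):
        w, b = train_linear_svm(X_lab, y_lab)
        x = pool.pop(most_uncertain(w, b, pool))
        X_lab.append(x)
        y_lab.append(oracle(x))  # hypothetical stand-in for an experiment
        queried.append(x)
    return train_linear_svm(X_lab, y_lab), queried

# Toy data (hypothetical): one known non-substrate, one known substrate.
X_lab = [[0.0, 0.0], [1.0, 1.0]]
y_lab = [-1, 1]
pool = [[0.1, 0.2], [0.9, 0.8], [0.5, 0.5], [0.4, 0.7], [0.2, 0.9]]
oracle = lambda x: 1 if x[0] + x[1] > 1.0 else -1

model, queried = active_learning(X_lab, y_lab, pool, oracle, n_queries=3)
```

In practice the feature vectors would be molecular descriptors or fingerprints, and a kernel SVM from an established library would likely replace the hand-written trainer; the selection step would be unchanged.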
Conflict of interest statement
The authors declare that they have no conflict of interest.
Similar articles
- Towards creating an extended metabolic model (EMM) for E. coli using enzyme promiscuity prediction and metabolomics data. Microb Cell Fact. 2019 Jun 13;18(1):109. doi: 10.1186/s12934-019-1156-3. PMID: 31196115 Free PMC article.
- Metabolic In Silico Network Expansions to Predict and Exploit Enzyme Promiscuity. Methods Mol Biol. 2019;1927:11-21. doi: 10.1007/978-1-4939-9142-6_2. PMID: 30788782
- Enzyme Promiscuity Prediction Using Hierarchy-Informed Multi-Label Classification. Bioinformatics. 2021 Aug 4;37(14):2017–2024. doi: 10.1093/bioinformatics/btab054. Epub 2021 Jan 30. PMID: 33515234 Free PMC article.
- Data-driven modeling and prediction of blood glucose dynamics: Machine learning applications in type 1 diabetes. Artif Intell Med. 2019 Jul;98:109-134. doi: 10.1016/j.artmed.2019.07.007. Epub 2019 Jul 26. PMID: 31383477 Review.
- ENZPRED-enzymatic protein class predicting by machine learning. Curr Top Med Chem. 2013;13(14):1674-80. doi: 10.2174/15680266113139990118. PMID: 23889047 Review.
Cited by
- A multimodal Transformer Network for protein-small molecule interactions enhances predictions of kinase inhibition and enzyme-substrate relationships. PLoS Comput Biol. 2024 May 20;20(5):e1012100. doi: 10.1371/journal.pcbi.1012100. eCollection 2024 May. PMID: 38768223 Free PMC article.
- Data-driven discovery of cardiolipin-selective small molecules by computational active learning. Chem Sci. 2022 Mar 2;13(16):4498-4511. doi: 10.1039/d2sc00116k. eCollection 2022 Apr 20. PMID: 35656132 Free PMC article.
- Protein structure-function continuum model: Emerging nexuses between specificity, evolution, and structure. Protein Sci. 2024 Apr;33(4):e4968. doi: 10.1002/pro.4968. PMID: 38532700 Free PMC article. Review.
- Specifics of Metabolite-Protein Interactions and Their Computational Analysis and Prediction. Methods Mol Biol. 2023;2554:179-197. doi: 10.1007/978-1-0716-2624-5_12. PMID: 36178627 Review.
- A semi-supervised machine learning framework for microRNA classification. Hum Genomics. 2019 Oct 22;13(Suppl 1):43. doi: 10.1186/s40246-019-0221-7. PMID: 31639051 Free PMC article.