Advances in machine learning for directed evolution
- PMID: 33647531
- DOI: 10.1016/j.sbi.2021.01.008
Abstract
Machine learning (ML) can expedite directed evolution by allowing researchers to move expensive experimental screens in silico. Gathering sequence-function data for training ML models, however, can still be costly. In contrast, raw protein sequence data is widely available. Recent advances in ML approaches use protein sequences to augment limited sequence-function data for directed evolution. We highlight contributions in a growing effort to use sequences to reduce or eliminate the amount of sequence-function data needed for effective in silico screening. We also highlight approaches that use ML models trained on sequences to generate new functional sequence diversity, focusing on strategies that use these generative models to efficiently explore vast regions of protein space.
Copyright © 2021 Elsevier Ltd. All rights reserved.