Enhancing the reverse transcriptase function in Taq polymerase via AI-driven multiparametric rational design
- PMID: 39720166
- PMCID: PMC11666352
- DOI: 10.3389/fbioe.2024.1495267
Abstract
Introduction: Modification of natural enzymes to introduce new properties and enhance existing ones is a central challenge in bioengineering. This study focuses on the development of Taq polymerase mutants that show enhanced reverse transcriptase (RTase) activity while retaining other desirable properties such as fidelity, 5'-3' exonuclease activity, effective deoxyuracil incorporation, and tolerance to locked nucleic acid (LNA)-containing substrates. Our objective was to use AI-driven rational design combined with multiparametric wet-lab analysis to identify and validate Taq polymerase mutants with an optimal combination of these properties.
Methods: The experimental procedure was conducted in several stages: 1) On the basis of a foundational paper, we selected 18 candidate mutations known to affect RTase activity across six sites. These candidates, along with the wild type, were assessed in the wet lab for multiple properties to establish an initial training dataset. 2) Using embeddings of Taq polymerase variants generated by a protein language model, we trained a Ridge regression model to predict multiple enzyme properties. This model guided the selection of 14 new candidates for experimental validation, expanding the dataset for further refinement. 3) To better manage risk by assessing confidence intervals on predictions, we transitioned to Gaussian process regression and trained this model on an expanded dataset comprising 33 data points. 4) With this enhanced model, we conducted an in silico screen of over 18 million potential mutations, narrowing the field to 16 top candidates for comprehensive wet-lab evaluation.
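The role of Gaussian process regression in stage 3 can be illustrated with a minimal, dependency-free sketch. It is not the authors' code: the 2-D "embeddings", the activity values, the RBF kernel, its hyperparameters, and the idea of ranking candidates by a lower confidence bound (mean minus 1.96 standard deviations) are all assumptions chosen for illustration. The point it shows is that a Gaussian process returns an uncertainty alongside each prediction, so a candidate far from the measured variants can be penalized for being a risky guess.

```python
import math

def rbf_kernel(a, b, length_scale=1.0):
    """Squared-exponential similarity between two embedding vectors."""
    sq_dist = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-sq_dist / (2.0 * length_scale ** 2))

def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x

def gp_predict(train_x, train_y, query, noise=1e-2, length_scale=1.0):
    """Return the GP posterior mean and standard deviation at one query point."""
    n = len(train_x)
    mean_y = sum(train_y) / n  # constant prior mean: average measured value
    gram = [[rbf_kernel(train_x[i], train_x[j], length_scale)
             + (noise if i == j else 0.0) for j in range(n)]
            for i in range(n)]
    k_star = [rbf_kernel(x, query, length_scale) for x in train_x]
    alpha = solve(gram, [y - mean_y for y in train_y])
    mean = mean_y + sum(k * a for k, a in zip(k_star, alpha))
    weights = solve(gram, k_star)
    var = rbf_kernel(query, query, length_scale) - sum(
        k * w for k, w in zip(k_star, weights))
    return mean, math.sqrt(max(var, 0.0))

def rank_by_lower_bound(train_x, train_y, candidates, z=1.96):
    """Sort candidates by mean - z * std, most promising first."""
    scored = []
    for name, embedding in candidates.items():
        mean, std = gp_predict(train_x, train_y, embedding)
        scored.append((name, mean, std, mean - z * std))
    return sorted(scored, key=lambda item: item[3], reverse=True)

# Hypothetical toy data: four "measured variants" with made-up 2-D embeddings
# and made-up activity scores. Real embeddings would come from a protein
# language model and have many more dimensions.
train_x = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
train_y = [0.1, 0.8, 0.3, 1.0]

# Hypothetical candidates: one close to the best measured variant, one far
# from all training data (so its prediction is highly uncertain).
candidates = {"near_best": [0.9, 0.9], "far_away": [4.0, 4.0]}

ranked = rank_by_lower_bound(train_x, train_y, candidates)
for name, mean, std, lower in ranked:
    print(f"{name}: mean={mean:.3f} std={std:.3f} lower_bound={lower:.3f}")
```

In this toy setup the candidate far from the training data reverts to the prior mean with a large standard deviation, so its lower bound is poor even though its predicted mean is unremarkable rather than bad. A production workflow would more likely use an established implementation such as scikit-learn's GaussianProcessRegressor rather than a hand-written solver.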
Results and discussion: This iterative, data-driven strategy ultimately led to the identification of 18 enzyme variants that exhibited markedly improved RTase activity while maintaining a favorable balance of other key properties. These enhancements were generally accompanied by lower Kd, moderately reduced fidelity, and greater tolerance to noncanonical substrates, thereby illustrating a strong interdependence among these traits. Several enzymes validated via this procedure were effective in single-enzyme real-time reverse-transcription PCR setups, implying their utility for the development of new tools for real-time reverse-transcription PCR technologies, such as pathogen RNA detection and gene expression analysis. This study illustrates how AI can be effectively integrated with experimental bioengineering to enhance enzyme functionality systematically. Our approach offers a robust framework for designing enzyme mutants tailored to specific biotechnological applications. The results of our biological activity predictions for mutated Taq polymerases can be accessed at https://huggingface.co/datasets/nerusskikh/taqpol_insilico_dms.
Keywords: Taq polymerase; bioengineering; function enhancement; machine learning; protein language model; rational design; reverse transcription.
Copyright © 2024 Tomilova, Russkikh, Yi, Shaburova, Tomilov, Pyrinova, Brezhneva, Tikhonyuk, Gololobova, Popichenko, Arkhipov, Bryzgalov, Brenner, Artyukh, Shtokalo, Antonets and Ivanov.
Conflict of interest statement
Authors YT, GP, SB, OT, NG, DP, MA, LB, EB, AA, MI were employed by AO Vector-Best. Authors NR, IY, and DS were employed by AcademGene LLC. Author VT was employed by SibEnzyme Ltd. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.