IECata: interpretable bilinear attention network and evidential deep learning improve the catalytic efficiency prediction of enzymes
- PMID: 40548541
- PMCID: PMC12205960
- DOI: 10.1093/bib/bbaf283
Abstract
Enzyme catalytic efficiency (kcat/Km) is a key parameter for identifying high-activity enzymes. Recently, deep learning techniques have demonstrated the potential for fast and accurate kcat/Km prediction. However, three challenges remain: (i) the limited size of the available kcat/Km datasets hinders the development of deep learning models; (ii) model predictions lack reliable confidence estimates; and (iii) models offer little interpretable insight into enzyme-catalyzed reactions. To address these challenges, we propose IECata, a kcat/Km prediction model that provides uncertainty estimation and interpretability. For IECata, we collected a dataset of 11 815 kcat/Km entries from the BRENDA and SABIO-RK databases, along with an out-of-domain test dataset of 806 entries from the literature. By introducing evidential deep learning, IECata provides uncertainty estimates for its kcat/Km predictions. Moreover, it uses a bilinear attention mechanism to learn crucial local interactions, which makes it possible to interpret the key residues and substrate atoms in enzyme-catalyzed reactions. Test results indicate that IECata outperforms state-of-the-art benchmark models. More importantly, it provides a reliable confidence assessment for its predictions. Case studies further show that incorporating uncertainty when screening for highly active enzymes can effectively increase the hit ratio, thereby improving the efficiency of experimental validation and accelerating directed enzyme evolution. To facilitate researchers' use of IECata, we have developed an online prediction platform: http://mathtc.nscc-tj.cn/cataai/.
Keywords: kcat/Km prediction; bilinear attention mechanism; evidential deep learning; interpretability; uncertainty.
© The Author(s) 2025. Published by Oxford University Press.
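The abstract does not state which evidential formulation IECata uses, so the following is only a minimal sketch of how uncertainty-aware screening could look, assuming the Normal-Inverse-Gamma (NIG) parameterization commonly used for evidential regression. The function names, the `max_uncertainty` threshold, and the example values are hypothetical and are not taken from the paper.

```python
import numpy as np


def nig_uncertainty(gamma, nu, alpha, beta):
    """Split an NIG output into prediction, aleatoric and epistemic uncertainty.

    Assumption: the model emits NIG parameters (gamma, nu, alpha, beta) per
    sample, with nu > 0, alpha > 1 and beta > 0.
    """
    gamma, nu, alpha, beta = (
        np.asarray(x, dtype=float) for x in (gamma, nu, alpha, beta)
    )
    if np.any(nu <= 0) or np.any(alpha <= 1) or np.any(beta <= 0):
        raise ValueError("require nu > 0, alpha > 1 and beta > 0")
    aleatoric = beta / (alpha - 1.0)         # expected data noise, E[sigma^2]
    epistemic = beta / (nu * (alpha - 1.0))  # variance of the mean, Var[mu]
    return gamma, aleatoric, epistemic


def screen(pred, epistemic, top_k, max_uncertainty):
    """Return indices of the top_k highest predictions among confident ones.

    Hypothetical screening rule: drop candidates whose epistemic uncertainty
    exceeds max_uncertainty, then rank the rest by predicted value.
    """
    pred = np.asarray(pred, dtype=float)
    epistemic = np.asarray(epistemic, dtype=float)
    keep = np.flatnonzero(epistemic <= max_uncertainty)
    ranked = keep[np.argsort(-pred[keep], kind="stable")]
    return ranked[:top_k]


# Three illustrative candidates; the second has the highest prediction
# but very little evidence (small nu), so it is filtered out.
pred, ale, epi = nig_uncertainty(
    gamma=[2.0, 3.0, 1.0],
    nu=[2.0, 0.1, 4.0],
    alpha=[3.0, 2.0, 3.0],
    beta=[1.0, 1.0, 2.0],
)
selected = screen(pred, epi, top_k=2, max_uncertainty=1.0)
```

Filtering on epistemic rather than total uncertainty is one possible design choice: it targets predictions the model has little evidence for, which is the situation where experimental validation is most likely to miss.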