ChemSpaceAL: An Efficient Active Learning Methodology Applied to Protein-Specific Molecular Generation
- PMID: 38287889
- DOI: 10.1021/acs.jcim.3c01456
Abstract
The incredible capabilities of generative artificial intelligence models have inevitably led to their application in the domain of drug discovery. Within this domain, the vastness of chemical space motivates the development of more efficient methods for identifying regions with molecules that exhibit desired characteristics. In this work, we present a computationally efficient active learning methodology and demonstrate its applicability to targeted molecular generation. When applied to c-Abl kinase, a protein with FDA-approved small-molecule inhibitors, the model learns to generate molecules similar to the inhibitors without prior knowledge of their existence and even reproduces two of them exactly. We also show that the methodology is effective for a protein without any commercially available small-molecule inhibitors, the HNH domain of the CRISPR-associated protein 9 (Cas9) enzyme. To facilitate implementation and reproducibility, we made all of our software available through the open-source ChemSpaceAL Python package.
Update of
ChemSpaceAL: An Efficient Active Learning Methodology Applied to Protein-Specific Molecular Generation. ArXiv [Preprint]. 2023 Dec 4:arXiv:2309.05853v2. Update in: J Chem Inf Model. 2024 Feb 12;64(3):653-665. doi: 10.1021/acs.jcim.3c01456. PMID: 37744464.