This is a preprint.
Activity Cliff-Informed Contrastive Learning for Molecular Property Prediction
- PMID: 39678335
- PMCID: PMC11643338
- DOI: 10.21203/rs.3.rs-2988283/v2
Abstract
Modeling molecular activity and quantitative structure-activity relationships of chemical compounds is critical in drug design. Graph neural networks, which operate directly on molecular structures, have shown success in assessing the biological activity of chemical compounds, guiding the selection and optimization of candidates for further development. However, current models often overlook activity cliffs (ACs), cases where structurally similar molecules exhibit different bioactivities, because their latent spaces are primarily optimized for structural features. Here, we introduce AC-awareness (ACA), an inductive bias designed to enhance molecular representation learning for activity modeling. ACA jointly optimizes metric learning in the latent space and task performance in the target space, making models more sensitive to ACs. We develop ACANet, an AC-informed contrastive learning approach that can be integrated with any graph neural network. Experiments on 39 benchmark datasets demonstrate that AC-informed representations of chemical compounds consistently outperform standard models in bioactivity prediction across both regression and classification tasks. AC-informed models also show strong performance in predicting pharmacokinetic and safety-relevant molecular properties. ACA paves the way toward activity-informed molecular representations, providing a valuable tool for the early stages of lead compound identification, refinement, and virtual screening.
Keywords: Activity Cliff; Activity Cliff Awareness; Contrastive Learning; Graph Neural Networks.
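The joint objective described in the abstract (a task loss in the target space plus a metric-learning term in the latent space) can be illustrated with a minimal sketch. This is not the authors' ACANet implementation: the function names, the label-distance mining rule, the hinge form of the triplet term, and the parameters `cliff_threshold`, `margin`, and `alpha` are all assumptions chosen for illustration. The idea shown is that molecules with similar activity should sit closer in latent space than molecules whose activity differs by at least a threshold.

```python
import math
from itertools import permutations


def euclidean(u, v):
    """Euclidean distance between two latent vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def mine_triplets(labels, cliff_threshold):
    """Hypothetical mining rule: for anchor a, a positive p has a label within
    cliff_threshold of a's label, and a negative n has a label at least
    cliff_threshold away. Returns (a, p, n) index triplets."""
    triplets = []
    for a, p, n in permutations(range(len(labels)), 3):
        if abs(labels[a] - labels[p]) < cliff_threshold <= abs(labels[a] - labels[n]):
            triplets.append((a, p, n))
    return triplets


def joint_loss(embeddings, predictions, labels,
               cliff_threshold=1.0, margin=1.0, alpha=0.1):
    """Assumed joint objective: mean absolute error in the target space plus
    alpha times a hinge triplet loss in the latent space."""
    # Task term: mean absolute error between predictions and labels.
    task = sum(abs(p - y) for p, y in zip(predictions, labels)) / len(labels)

    # Metric term: penalize triplets where the positive is not closer to the
    # anchor than the negative by at least `margin`.
    triplets = mine_triplets(labels, cliff_threshold)
    if triplets:
        metric = sum(
            max(0.0,
                euclidean(embeddings[a], embeddings[p])
                - euclidean(embeddings[a], embeddings[n])
                + margin)
            for a, p, n in triplets
        ) / len(triplets)
    else:
        metric = 0.0

    return task + alpha * metric, task, metric


# Three molecules: two with similar activity, one differing by about 3 units.
labels = [5.0, 5.2, 8.0]
well_separated = [[0.0, 0.0], [0.1, 0.0], [5.0, 0.0]]
poorly_separated = [[0.0, 0.0], [3.0, 4.0], [0.0, 1.0]]

# Predictions are set equal to the labels so only the metric term differs.
print(joint_loss(well_separated, labels, labels))
print(joint_loss(poorly_separated, labels, labels))
```

In this toy setup the metric term vanishes when the latent space already separates the dissimilar-activity molecule, and it is positive when a structurally "close" embedding hides the activity difference, which is the situation an AC-aware objective is meant to penalize.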
Conflict of interest statement
Competing interests: The authors declare no competing interests.