Impact of Applicability Domains to Generative Artificial Intelligence

Maxime Langevin et al.

ACS Omega. 2023 Jun 12;8(25):23148-23167. doi: 10.1021/acsomega.3c00883. eCollection 2023 Jun 27.

Abstract

Molecular generative artificial intelligence is drawing significant attention in the drug design community, with several experimentally validated proofs of concept already published. Nevertheless, generative models are known for sometimes generating unrealistic, unstable, unsynthesizable, or uninteresting structures. This calls for methods to constrain those algorithms to generate structures in drug-like portions of the chemical space. While the concept of applicability domains for predictive models is well studied, its counterpart for generative models is not yet well-defined. In this work, we empirically examine various possibilities and propose applicability domains suited for generative models. Using both public and internal data sets, we use generative methods to generate novel structures that are predicted to be active by a corresponding quantitative structure-activity relationship (QSAR) model while constraining the generative model to stay within a given applicability domain. Our work looks at several applicability domain definitions, combining various criteria such as structural similarity to the training set, similarity of physicochemical properties, unwanted substructures, and quantitative estimate of drug-likeness. We assess the generated structures from both qualitative and quantitative points of view and find that the applicability domain definitions have a strong influence on the drug-likeness of generated molecules. An extensive analysis of our results allows us to identify applicability domain definitions that are best suited for generating drug-like molecules with generative models. We anticipate that this work will help foster the adoption of generative models in an industrial context.


Conflict of interest statement

The authors declare the following competing financial interest(s): All authors are or have been employed by Sanofi and may hold shares and/or stock options in the company.

Figures

Figure 1
Overview of the workflow used for evaluating an AD. During generation, the AD is taken into account in the reward function as a multiplicative term that yields 0 if the generated molecule is outside the AD and 1 otherwise. The arrow colors correspond to the different subsets of the data set used at different stages of the process: green for the activity model training set, blue for the set used for AD definition and generative model pretraining, and orange for the test set.
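
As an illustration of how such a gated reward can be composed, here is a minimal sketch; the names used (reward, qsar_model.predict, in_applicability_domain) are hypothetical placeholders and not taken from the paper.

    # Minimal sketch of an AD-gated reward for a generative model (hypothetical names).
    def reward(molecule, qsar_model, in_applicability_domain):
        """Return the optimization reward for a generated molecule."""
        predicted_activity = qsar_model.predict(molecule)  # assumed QSAR model interface
        ad_term = 1.0 if in_applicability_domain(molecule) else 0.0
        # The AD acts as a multiplicative gate: out-of-domain molecules score 0.
        return predicted_activity * ad_term
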
Figure 2
From left to right: chemical structures of clopidogrel, medroxyprogesterone acetate (a steroid), and ivermectin (a drug derived from a natural product). The diversity in chemical features of these three drugs (e.g., presence of macrocycles, number of cycles, and number of chiral centers) shows how much drug-likeness is context-dependent.
Figure 3
Limitations of binary fingerprints in discriminating unusual chemical moieties. (Top) Binary fingerprints. (Bottom) Count fingerprints.
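
The binary/count distinction can be illustrated with RDKit; a minimal sketch, using an arbitrary example molecule rather than the structures from the figure:

    # Binary vs. count-based Morgan (ECFP4-like) fingerprints with RDKit.
    # Repeated occurrences of the same moiety set a bit only once in the binary
    # representation, while the count representation keeps the multiplicity.
    from rdkit import Chem
    from rdkit.Chem import AllChem

    mol = Chem.MolFromSmiles("OCCOCCOCCOCCO")  # illustrative repetitive polyether

    binary_fp = AllChem.GetMorganFingerprintAsBitVect(mol, 2, nBits=2048)
    count_fp = AllChem.GetHashedMorganFingerprint(mol, 2, nBits=2048)

    print(binary_fp.GetNumOnBits())                     # distinct atom environments
    print(sum(count_fp.GetNonzeroElements().values()))  # total environment counts
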
Figure 4
JAK2: Projection of molecules generated with the LSTM-HC model on the original data set using the first two dimensions of the PCA of their Morgan fingerprints. Blue dots represent generated molecules, green dots represent actives from the test set, and red dots represent inactives. The three AD metrics in the top row lead to more diverse molecules than the ones in the middle and bottom rows.
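
A minimal sketch of such a projection, using small illustrative molecule lists in place of the actual data set and generated molecules:

    # Project Morgan fingerprints onto their first two principal components.
    import numpy as np
    from rdkit import Chem
    from rdkit.Chem import AllChem
    from sklearn.decomposition import PCA

    def fingerprint_matrix(mols, radius=2, n_bits=2048):
        """Stack binary Morgan fingerprints into an (n_molecules, n_bits) array."""
        return np.array([list(AllChem.GetMorganFingerprintAsBitVect(m, radius, nBits=n_bits))
                         for m in mols])

    # Illustrative placeholders for the data set and the generated molecules.
    dataset_mols = [Chem.MolFromSmiles(s) for s in ("c1ccccc1O", "CCN(CC)CC", "CC(=O)Nc1ccc(O)cc1")]
    generated_mols = [Chem.MolFromSmiles(s) for s in ("c1ccccc1N", "CCOC(C)=O")]

    pca = PCA(n_components=2).fit(fingerprint_matrix(dataset_mols))
    generated_2d = pca.transform(fingerprint_matrix(generated_mols))
    print(generated_2d)
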
Figure 5
Entropy as a measure of the coverage of the training set diversity. The more evenly the generated molecules are distributed across the clusters, the higher the entropy. Generated molecules too far from the training set are set apart. This example displays molecules from the JAK2 data set.
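
A minimal sketch of the entropy computation, assuming generated molecules have already been assigned to clusters of the training set (the clustering and assignment steps are omitted):

    # Shannon entropy of the distribution of generated molecules over clusters;
    # a more even spread across clusters gives a higher entropy.
    import numpy as np

    def cluster_entropy(cluster_labels):
        """Entropy (in nats) of the cluster assignments of generated molecules."""
        _, counts = np.unique(cluster_labels, return_counts=True)
        probabilities = counts / counts.sum()
        return float(-(probabilities * np.log(probabilities)).sum())

    print(cluster_entropy([0, 0, 0, 0, 1, 2]))  # uneven coverage -> lower entropy
    print(cluster_entropy([0, 1, 2, 0, 1, 2]))  # even coverage -> higher entropy
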
Figure 6
QED distribution of generated molecules for different applicability domain definitions on the JAK2 data set. The vertical green lines mark the minimum and maximum QED values found in the training set; higher QED is better.
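
QED values of the kind plotted here can be computed with RDKit; a minimal sketch on arbitrary example molecules:

    # Quantitative estimate of drug-likeness (QED) with RDKit.
    from rdkit import Chem
    from rdkit.Chem import QED

    smiles = ["CC(=O)Nc1ccc(O)cc1", "O=C(O)c1ccccc1OC(C)=O"]  # illustrative molecules
    qed_values = [QED.qed(Chem.MolFromSmiles(s)) for s in smiles]
    print(qed_values)  # each value lies between 0 (least) and 1 (most drug-like)
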
Figure 7
Comparison of data set and generated molecules on the JAK2 test case (here with the maximum similarity on atom-pair descriptors) with the distributions for properties identified as important through qualitative analysis.
Figure 8
Comparison of scores reached by different algorithms when optimizing JAK2 predicted activity while staying in the “range physchem + range ECFP4 counts” AD. LSTM-HC reaches higher optimization scores and recovers slightly more active compounds than Graph GA and SMILES GA.
Figure 9
Comparison of scores reached by the LSTM-HC under the constraint of different AD definitions on the ChEMBL 11βHSD data set.
Figure 10
Results of the molecular Turing test for each of four different AD definitions (“range QED”, “maxsim ECFP4”, “range physchem + maxsim ECFP4”, and “range physchem + range ECFP4 counts”) and for the JAK2 training set. The black bar denotes the mean, and the box denotes an interval with 90% of the values. Results were obtained with 15 different participants.
Figure 11
Tree map plot of the Renin test set (in green), molecules generated with a good applicability domain (“range physchem + range ECFP4 counts”, in blue), and molecules generated by an applicability domain showing poor results (“maxsim ECFP4”, in purple). The molecules from the good AD and the test data set (blue and green, respectively) mainly fall on the same trees and share connections, suggesting that the two sets could be similar and the generated molecules relevant. In contrast, the molecules from the bad AD are all located on a separate tree, suggesting that they are dissimilar from the test set and probably irrelevant. The molecule generated with the bad AD that is closest to the test data set (purple arrow) is still dissimilar to the closest molecules from the test data set (green arrow) or from those generated with a good AD (blue arrow). The tree map is generated using the TMAP library with default settings.
Figure 12
Fraction of actives and inactives among the Renin training set molecules close to the generated molecules at different similarity thresholds. Results are shown for the “range physchem + range ECFP4 counts” and “maxsim ECFP4” applicability domains. They show that good applicability domains generate molecules closer to the actual training set, with a clear enrichment toward active molecules.
Figure 13
Renin data set: evolution of generated molecules’ scores throughout the optimization epochs for a good applicability domain (“range physchem + range ECFP4 counts”) and for an applicability domain showing poor results (“maxsim ECFP4”). The scores of the molecules generated with the good applicability domain are more spread out and lower than those generated with the other applicability domain (while most are still in the correct range, between the predicted active threshold and the maximum score among the test set molecules). This illustrates that poorly performing ADs leave room for reward hacking by the generator.
Figure 14
Average Tanimoto similarities (computed on ECFP4 fingerprints) for generated sets of molecules using a good applicability domain definition on the JAK2 data set. Different applicability domains can lead to the exploration of different portions of chemical space.
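
A minimal sketch of the underlying diversity measure, using a small illustrative generated set; averaging over all fingerprint pairs is one common convention and may differ in detail from the paper's exact procedure.

    # Average pairwise Tanimoto similarity on ECFP4-like Morgan fingerprints.
    from itertools import combinations
    from rdkit import Chem, DataStructs
    from rdkit.Chem import AllChem

    generated = [Chem.MolFromSmiles(s) for s in ("c1ccccc1O", "c1ccccc1N", "CCN(CC)CC")]
    fps = [AllChem.GetMorganFingerprintAsBitVect(m, 2, nBits=2048) for m in generated]

    similarities = [DataStructs.TanimotoSimilarity(a, b) for a, b in combinations(fps, 2)]
    print(sum(similarities) / len(similarities))  # lower average -> more diverse set
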
