Analyzing Learned Molecular Representations for Property Prediction

Kevin Yang et al. J Chem Inf Model. 2019 Aug 26;59(8):3370-3388. doi: 10.1021/acs.jcim.9b00237. Epub 2019 Aug 13.

Erratum in

  • Correction to Analyzing Learned Molecular Representations for Property Prediction.
    Yang K, Swanson K, Jin W, Coley C, Eiden P, Gao H, Guzman-Perez A, Hopper T, Kelley B, Mathea M, Palmer A, Settels V, Jaakkola T, Jensen K, Barzilay R. J Chem Inf Model. 2019 Dec 23;59(12):5304-5305. doi: 10.1021/acs.jcim.9b01076. Epub 2019 Dec 9. PMID: 31814400.

Abstract

Advancements in neural machinery have led to a wide range of algorithmic solutions for molecular property prediction. Two classes of models in particular have yielded promising results: neural networks applied to computed molecular fingerprints or expert-crafted descriptors, and graph convolutional neural networks that construct a learned molecular representation by operating on the graph structure of the molecule. However, recent literature has yet to clearly determine which of these two methods is superior when generalizing to new chemical space. Furthermore, prior research has rarely examined these new models in industrial research settings in comparison to the models already deployed there. In this paper, we benchmark models extensively on 19 public and 16 proprietary industrial data sets spanning a wide variety of chemical end points. In addition, we introduce a graph convolutional model that consistently matches or outperforms models using fixed molecular descriptors, as well as previous graph neural architectures, on both public and proprietary data sets. Our empirical findings indicate that while approaches based on learned representations have yet to reach the level of experimental reproducibility, our proposed model nevertheless offers significant improvements over models currently used in industrial workflows.

Conflict of interest statement

The authors declare no competing financial interest.

Figures

Figure 1
Illustration of bond-level message passing in our proposed D-MPNN. (a) Messages from the orange directed bonds are used to inform the update to the hidden state of the red directed bond. By contrast, in a traditional MPNN, messages are passed between atoms (for example, from atoms 1, 3, and 4 to atom 2) rather than between directed bonds. (b) Similarly, a message from the green bond informs the update to the hidden state of the purple directed bond. (c) Illustration of the update function for the hidden representation of the red directed bond from diagram (a).
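
To make the bond-level update concrete, below is a minimal PyTorch sketch of one D-MPNN message-passing step, following the update rule the figure illustrates: sum the hidden states of the directed bonds feeding into a bond, excluding its reverse bond, apply a learned linear map, and add a skip connection to the initial bond state. The class name and the incoming/reverse adjacency structures are illustrative assumptions, not Chemprop's actual implementation.

    import torch
    import torch.nn as nn

    class DirectedBondMessagePassing(nn.Module):
        """One message-passing step over directed bonds (sketch, not Chemprop's API)."""

        def __init__(self, hidden_size: int):
            super().__init__()
            self.W_m = nn.Linear(hidden_size, hidden_size, bias=False)

        def step(self, h0, h, incoming, reverse):
            # h0, h: (num_directed_bonds, hidden) initial and current bond states.
            # incoming[b]: indices of directed bonds k->v that feed bond b = v->w.
            # reverse[b]: index of the reverse bond w->v, which is excluded so a
            # message never flows straight back along the bond it arrived on.
            messages = torch.zeros_like(h)
            for b in range(h.shape[0]):
                neighbors = [k for k in incoming[b] if k != reverse[b]]
                if neighbors:
                    messages[b] = h[neighbors].sum(dim=0)
            # Skip connection to the initial bond state, then a nonlinearity.
            return torch.relu(h0 + self.W_m(messages))
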
Figure 2
Four example distributions fit to a random sample of 100,000 compounds used for biological screening at Novartis. Note that some distributions for discrete-valued descriptors, such as fr_pyridine, are not fit especially well; this remains an active area for improvement.
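
The caption does not say how the fitted distributions are used downstream; one common use is to normalize a descriptor by mapping raw values through the fitted CDF. The SciPy sketch below illustrates that idea under the assumption of a normal fit; the distribution families actually fit in Figure 2, and their use in the paper, may differ.

    import numpy as np
    from scipy import stats

    # Stand-in for 100,000 sampled values of one descriptor (e.g., MolWt).
    descriptor_values = np.random.lognormal(mean=5.0, sigma=0.3, size=100_000)

    # Fit a normal distribution, then map raw values through its CDF to [0, 1].
    loc, scale = stats.norm.fit(descriptor_values)
    normalized = stats.norm.cdf(descriptor_values, loc=loc, scale=scale)
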
Figure 3
Comparison of our D-MPNN with RDKit features to the best models from Wu et al.
Figure 4
Comparison of our best single model (i.e., optimized hyperparameters and RDKit features) to the model from Mayr et al.
Figure 5
Comparison of our unoptimized D-MPNN against several baseline models. The random forest baseline is omitted on PCBA, MUV, ToxCast, and ChEMBL due to its large computational cost, and on ClinTox due to numerical instability. The D-MPNN significantly outperforms each baseline on at least 8 data sets.
Figure 6
Comparison of our D-MPNN against baseline models on Amgen internal data sets on a chronological data split. The D-MPNN outperforms all of the baselines. Note that for the Amgen data sets only, the ensembles contained 3 models rather than 5, and that RF on Morgan fingerprints and the Mayr et al. FFN were each run only once on RLM.
Figure 7
Comparison of our D-MPNN against baseline models on BASF internal regression data sets on a scaffold data split (higher = better). Our D-MPNN outperforms all baselines.
Figure 8
Comparison of our D-MPNN against baseline models on the Novartis internal regression data set on a chronological data split (lower = better). Our D-MPNN outperforms all baseline models.
Figure 9
Comparison of Amgen’s internal model and our D-MPNN (evaluated using a single run on a chronological split) to experimental error (higher = better). Note that the experimental error is not evaluated on exactly the same time split as the two models, since it can only be measured on molecules that were tested more than once; even so, the difference in performance is striking.
Figure 10
Overlap of molecular scaffolds between the train and test sets for a random or chronological split of four Amgen regression data sets. Overlap is defined as the percent of molecules in the test set that share a scaffold with a molecule in the train set.
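
This overlap metric can be computed with RDKit’s Murcko scaffolds; a minimal sketch follows (the exact scaffold settings used in the paper, e.g., chirality handling, are not stated here).

    from rdkit.Chem.Scaffolds import MurckoScaffold

    def scaffold_overlap(train_smiles, test_smiles):
        """Percent of test molecules whose Murcko scaffold also occurs in the train set."""
        train_scaffolds = {
            MurckoScaffold.MurckoScaffoldSmiles(smiles=s) for s in train_smiles
        }
        shared = sum(
            MurckoScaffold.MurckoScaffoldSmiles(smiles=s) in train_scaffolds
            for s in test_smiles
        )
        return 100.0 * shared / len(test_smiles)
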
Figure 11
Performance of D-MPNN on four Amgen regression data sets according to three methods of splitting the data (lower = better). The chronological split is significantly harder than both random and scaffold on Sol and hPXR, while the scaffold split is significantly harder than the random split on Sol only.
Figure 12
Performance of D-MPNN on the Novartis regression data set according to three methods of splitting the data (lower = better). The chronological split is significantly harder than the random split while the scaffold split is not.
Figure 13
Performance of D-MPNN on the full (F), core (C), and refined (R) subsets of the PDBbind data set according to three methods of splitting the data (lower = better). The chronological and scaffold splits are significantly harder than the random split in all cases except for the PDBbind-C scaffold split.
Figure 14
Performance of D-MPNN on random and scaffold splits for several public data sets. Only the results on PDBbind-C, HIV, ClinTox, and ChEMBL are not statistically significant.
Figure 15
Comparison of performance of different message passing paradigms.
Figure 16
Effect of adding molecule-level features generated with RDKit to our model.
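
As one way to reproduce this kind of feature augmentation, the sketch below computes RDKit’s built-in 2D descriptors for a molecule; the resulting vector would then be concatenated with the learned representation before the feed-forward layers. The exact descriptor subset and normalization used in the paper may differ.

    import numpy as np
    from rdkit import Chem
    from rdkit.Chem import Descriptors

    def rdkit_descriptor_vector(smiles: str) -> np.ndarray:
        # Descriptors.descList is RDKit's list of (name, function) pairs
        # covering its ~200 built-in 2D descriptors.
        mol = Chem.MolFromSmiles(smiles)
        return np.array([fn(mol) for _, fn in Descriptors.descList])

    features = rdkit_descriptor_vector("c1ccncc1")  # pyridine
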
Figure 17
Effect of performing Bayesian hyperparameter optimization on the depth, hidden size, number of fully connected layers, and dropout of the D-MPNN.
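
A minimal sketch of such an optimization with Hyperopt’s TPE algorithm over the four hyperparameters named in the caption; the search ranges and the train_and_validate helper are illustrative placeholders, not the paper’s exact configuration.

    from hyperopt import fmin, hp, tpe

    space = {
        "depth": hp.quniform("depth", 2, 6, 1),
        "hidden_size": hp.quniform("hidden_size", 300, 2400, 100),
        "ffn_num_layers": hp.quniform("ffn_num_layers", 1, 3, 1),
        "dropout": hp.quniform("dropout", 0.0, 0.4, 0.05),
    }

    def objective(params):
        # train_and_validate is a hypothetical helper that trains a D-MPNN
        # with these hyperparameters and returns the validation error.
        return train_and_validate(
            depth=int(params["depth"]),
            hidden_size=int(params["hidden_size"]),
            ffn_num_layers=int(params["ffn_num_layers"]),
            dropout=float(params["dropout"]),
        )

    best = fmin(fn=objective, space=space, algo=tpe.suggest, max_evals=20)
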
Figure 18
An illustration of ensembling models. On the left is a single model, which takes input and makes a prediction. On the right is an ensemble of 3 models. Each model takes the same input and makes a prediction independently, and then the predictions are averaged to generate the ensemble’s prediction.
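
In code, ensembling of this kind reduces to averaging independent predictions. A minimal sketch, assuming model objects that expose a predict method:

    import numpy as np

    def ensemble_predict(models, inputs):
        # Each model predicts independently; the ensemble output is the mean.
        return np.mean([model.predict(inputs) for model in models], axis=0)
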
Figure 19
Effect of using an ensemble of five models instead of a single model.
Figure 20
Effect of data size on the performance of the model from Mayr et al. and of our D-MPNN model (higher = better). All comparisons besides the first are statistically significant.

References

    1. Duvenaud D. K.; Maclaurin D.; Iparraguirre J.; Bombarell R.; Hirzel T.; Aspuru-Guzik A.; Adams R. P. Convolutional Networks on Graphs for Learning Molecular Fingerprints. Advances in Neural Information Processing Systems 2015, 2224–2232.
    2. Wu Z.; Ramsundar B.; Feinberg E.; Gomes J.; Geniesse C.; Pappu A. S.; Leswing K.; Pande V. MoleculeNet: A Benchmark for Molecular Machine Learning. Chem. Sci. 2018, 9, 513–530. 10.1039/C7SC02664A.
    3. Kearnes S.; McCloskey K.; Berndl M.; Pande V.; Riley P. Molecular Graph Convolutions: Moving Beyond Fingerprints. J. Comput.-Aided Mol. Des. 2016, 30, 595–608. 10.1007/s10822-016-9938-8.
    4. Gilmer J.; Schoenholz S. S.; Riley P. F.; Vinyals O.; Dahl G. E. Neural Message Passing for Quantum Chemistry. Proceedings of the 34th International Conference on Machine Learning 2017, 70, 1263–1272.
    5. Li Y.; Tarlow D.; Brockschmidt M.; Zemel R. Gated Graph Sequence Neural Networks. 2015, arXiv preprint arXiv:1511.05493. https://arxiv.org/abs/1511.05493 (accessed Aug 6, 2019).
