Accurate Multiobjective Design in a Space of Millions of Transition Metal Complexes with Neural-Network-Driven Efficient Global Optimization
- PMID: 32342001
- PMCID: PMC7181321
- DOI: 10.1021/acscentsci.0c00026
Abstract
The accelerated discovery of materials for real-world applications requires the achievement of multiple design objectives. The multidimensional nature of the search necessitates exploration of multimillion-compound libraries over which even density functional theory (DFT) screening is intractable. Machine learning (e.g., artificial neural network, ANN, or Gaussian process, GP) models for this task are limited by training data availability and predictive uncertainty quantification (UQ). We overcome such limitations by using efficient global optimization (EGO) with the multidimensional expected improvement (EI) criterion. EGO balances exploitation of a trained model with acquisition of new DFT data at the Pareto front, the region of chemical space that contains the optimal trade-off between multiple design criteria. We demonstrate this approach for the simultaneous optimization of redox potential and solubility in candidate M(II)/M(III) redox couples for redox flow batteries (RFBs), drawn from a space of 2.8 million transition metal complexes designed for stability in practical RFB applications. We show that a multitask ANN with latent-distance-based UQ surpasses the generalization performance of a GP in this space. With this approach, ANN prediction and EI scoring of the full space are achieved in minutes. Starting from ca. 100 representative points, EGO improves both properties by over 3 standard deviations in only five generations. Analysis of lookahead errors confirms rapid ANN model improvement during the EGO process, achieving suitable accuracy for predictive design in the space of transition metal complexes. The ANN-driven EI approach achieves at least 500-fold acceleration over random search, identifying a Pareto-optimal design in around 5 weeks instead of 50 years.
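The two ideas the abstract relies on, the Pareto front and expected improvement, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names are hypothetical, both objectives are assumed to be maximized, and the EI shown is the textbook single-objective closed form for a Gaussian prediction, a simplification of the multidimensional EI criterion used in the paper.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]  # (objective 1, objective 2), both to be maximized (assumption)


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is at least as good as b in every objective and strictly better in one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(points: List[Point]) -> List[Point]:
    """Return the points that no other point dominates (O(n^2) scan)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def expected_improvement(mu: float, sigma: float, best: float) -> float:
    """Single-objective EI for maximization, given a Gaussian prediction N(mu, sigma^2).

    Simplified stand-in for the multidimensional EI named in the abstract.
    """
    if sigma <= 0.0:
        # No predictive uncertainty: improvement is deterministic.
        return max(mu - best, 0.0)
    z = (mu - best) / sigma
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))           # standard normal CDF
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)    # standard normal PDF
    return (mu - best) * cdf + sigma * pdf


# Hypothetical property values, for illustration only.
candidates = [(1.0, 5.0), (2.0, 4.0), (1.5, 3.0), (3.0, 1.0), (0.5, 0.5)]
front = pareto_front(candidates)
```

In this sketch, a candidate whose predicted mean is no better than the current best can still score a positive EI when its predictive uncertainty is large. That is how an EI-based search balances exploiting the model against acquiring new data.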
Copyright © 2020 American Chemical Society.
Conflict of interest statement
The authors declare no competing financial interest.
