ACS Cent Sci. 2021 Nov 24;7(11):1821-1830.
doi: 10.1021/acscentsci.1c00435. Epub 2021 Nov 11.

Discovering New Chemistry with an Autonomous Robotic Platform Driven by a Reactivity-Seeking Neural Network


Dario Caramelli et al. ACS Cent Sci. 2021.

Abstract

We present a robotic chemical discovery system that navigates a chemical space using a learned general association between molecular structure and reactivity. It incorporates a neural network model that processes data from online analytics and assesses reactivity without knowing the identity of the reagents. Working with this learned knowledge, the robotic platform autonomously explores a large number of potential reactions and assesses the reactivity of mixtures, including in unknown chemical spaces, regardless of the identity of the starting materials. Using the system, we identified a range of chemical reactions and products: some well known, some new but predictable from known pathways, and some unpredictable reactions that yielded new molecules. The system was validated within a budget of 15 inputs combined in 1018 reactions; further analysis of these reactions led to the discovery of not only a new photochemical reaction but also a new reactivity mode for a well-known reagent (p-toluenesulfonylmethyl isocyanide, TosMIC). The latter involves the reaction of 6 equiv of TosMIC in a "multistep, single-substrate" cascade that yields a trimeric product in high yield (47%, unoptimized) and forms five new C-C bonds involving sp-sp² and sp-sp³ carbon centers. An analysis reveals that this transformation is intrinsically unpredictable, demonstrating that reactivity-first robotic discovery of unknown reaction methodologies is possible without human input.

Conflict of interest statement

The authors declare no competing financial interest.

Figures

Figure 1
Closed-loop framework for chemical space exploration. A liquid handling robot performs an experiment and collects NMR and MS spectra. These data are processed to assess reactivity and create a model of the chemical space that is queried to formulate the next experiment to be performed.
Figure 2
Liquid handling platform. (a) Schematic of the platform. A series of reagents are added by dedicated pumps to a mixer flask. A pump expanded with two extra valves (obtained by removing the syringe from a normal pump) transfers the reaction mixture into one of the six reactors (red); another pump with the same setup (blue) connects the reactors to the benchtop NMR. Finally, a third expanded pump performs an in-line dilution prior to injection into the MS (green). (b) Picture of the platform. The pumps are visible on the shelves, in two rows. At the bottom is the NMR instrument, equipped with a flow probe. The MS is on the left and the reactors in the center, while the reagents, the solvent drum, and the waste container are on the left. (c) Six parallel reactions were started with a time offset so that the platform could perform physical operations continuously.
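The time-offset idea in panel (c) can be illustrated with a small scheduling sketch. This is not the platform's control code: the function name `staggered_schedule`, the `Slot` record, and all durations are hypothetical. The sketch assumes that one shared pump fills the reactors and one shared line serves the analytics, so the offset between reactors is chosen to keep each shared resource busy with only one reactor at a time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slot:
    reactor: int           # reactor number, starting at 1
    fill_start: float      # minutes; shared transfer pump becomes busy
    analysis_start: float  # minutes; shared NMR/MS line becomes busy


def staggered_schedule(n_reactors, fill_min, react_min, analysis_min):
    """Start reactors one offset apart so shared steps never overlap.

    The offset is the longer of the two shared operations, which keeps
    consecutive fills apart and consecutive analyses apart.
    """
    offset = max(fill_min, analysis_min)
    slots = []
    for i in range(n_reactors):
        fill_start = i * offset
        analysis_start = fill_start + fill_min + react_min
        slots.append(Slot(i + 1, fill_start, analysis_start))
    return slots
```

Calling `staggered_schedule(6, 5.0, 60.0, 10.0)` with these made-up durations starts the six reactors 10 minutes apart, so each analysis finishes before the next reactor is ready to be sampled.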
Figure 3
CNN assessment of reactivity in two different chemical spaces. (a) Structure of the neural network used to assign reactivity to the NMR spectra. Data of the mixture and the sum of the starting materials are used as input. The network is trained using 440 reactions from one chemical space (b) and tested on 1018 reactions performed from combinations of 15 different molecules (c). (d) The accuracy on the test set, plotted as a confusion matrix, shows that the network learned to generalize reactivity assessment beyond the reagents in the training set. (e) All data were manually classified into four classes.
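Panel (a) states that the network receives both the mixture data and the sum of the starting materials. A minimal sketch of how such a paired input might be assembled is given below. The two-channel layout, the peak normalization, and the function names are assumptions made for illustration; the caption does not specify the preprocessing, and the spectra are assumed to be sampled on the same grid.

```python
def sum_spectra(spectra):
    """Point-wise sum of spectra sampled on the same grid."""
    length = len(spectra[0])
    if any(len(s) != length for s in spectra):
        raise ValueError("all spectra must have the same number of points")
    return [sum(points) for points in zip(*spectra)]


def normalise(spectrum):
    """Scale a spectrum so its largest absolute intensity is 1."""
    peak = max(abs(x) for x in spectrum)
    return [x / peak for x in spectrum] if peak else list(spectrum)


def build_network_input(mixture, starting_materials):
    """Return [mixture channel, summed-starting-material channel]."""
    reference = sum_spectra(starting_materials)
    if len(reference) != len(mixture):
        raise ValueError("mixture and reference must share one grid")
    return [normalise(mixture), normalise(reference)]
```

If nothing has reacted, the two channels should look alike; a classifier can then learn to read reactivity from how the mixture channel departs from the reference channel, without being told which reagents were used.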
Figure 4
Aspects of the autonomous chemical space exploration algorithm. (a) Structure of the neural network used for reactivity prediction. A junction-tree neural network encodes the molecular structure of each reactant into a 56-dimensional vector. (b) Scheme of the algorithm used to simulate the chemical space exploration. (c) Correlation of predicted versus observed reactivity (assigned via Reactify) for the test set. The correlation is demonstrated by fitting a linear regression model; the shaded area represents the 99% confidence interval obtained by bootstrapping. Prediction uncertainties (calculated as standard deviations) are shown as error bars (see Section S3.1 for the connection between uncertainty and the error lower bound). (d) Results of the chemical space exploration simulation. After the initial selection of 100 random reactions (orange), the algorithm starts to build a model that correlates parameters with reactivity. By prioritizing combinations that are predicted to be reactive, the space is explored more efficiently. The error bars show the standard deviation.
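The explore-then-prioritize strategy described in panel (d) can be sketched as a simple loop: measure a random initial batch, fit a model, then repeatedly run the untested combination with the highest predicted reactivity. The sketch below is a toy stand-in, not the authors' method. It replaces the junction-tree neural network with a per-reagent average score, and `measure` is a hypothetical callback standing in for the robot plus the reactivity classifier.

```python
import random


def fit_reagent_scores(observed):
    """Toy model: mean observed reactivity of reactions containing each reagent."""
    totals, counts = {}, {}
    for combo, reactivity in observed.items():
        for reagent in combo:
            totals[reagent] = totals.get(reagent, 0.0) + reactivity
            counts[reagent] = counts.get(reagent, 0) + 1
    return {r: totals[r] / counts[r] for r in totals}


def predict(scores, combo, default=0.0):
    """Predicted reactivity of a combination: mean of its reagent scores."""
    return sum(scores.get(r, default) for r in combo) / len(combo)


def explore(candidates, measure, n_random, n_total, seed=0):
    """Measure n_random random combinations, then pick greedily by prediction."""
    rng = random.Random(seed)
    pool = list(candidates)
    rng.shuffle(pool)
    observed = {combo: measure(combo) for combo in pool[:n_random]}
    remaining = pool[n_random:]
    while remaining and len(observed) < n_total:
        scores = fit_reagent_scores(observed)  # refit after every experiment
        best = max(remaining, key=lambda c: predict(scores, c))
        remaining.remove(best)
        observed[best] = measure(best)
    return observed
```

For example, with `candidates = list(itertools.combinations(range(8), 2))` and a `measure` function that returns 1.0 only when reagent 0 is present, `explore(candidates, measure, n_random=5, n_total=12)` returns 12 measured combinations. A purely greedy rule like this never explores on purpose, which is one reason a model that also reports prediction uncertainty, as in panel (c), is useful.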
Figure 5
Five reactions showing high reactivity have been found and characterized. Reactions a and b have been previously reported in the literature with the exact same product. Reaction c is known in the literature but has never been used to make 27. Reactions d and e are unreported in the literature.
Figure 6
Reaction of diethyl bromomalonate and TosMIC discovered with the automated platform. (a) General scheme of the reaction. 6 equiv of isocyanide is consumed in the presence of an activator, water, and DMSO to give 29; on the right-hand side are the X-ray structure of 29 and its tube-shaped supramolecular structure. (b) Analogous products obtained with variations of the isocyanide. (c) The reaction has been carried out using variations of diethyl bromomalonate, yielding the same product. (d) By performing the reaction in the presence of an amine, we observed variations of the product, suggesting the mechanism reported in the next figure.
Figure 7
Reaction mechanism and cheminformatic analysis. (a) Scheme of the proposed mechanism. Two of the intermediates have been found by HPLC-MS analysis, while the isocyanate group, CO₂, DMS, and ammonium have been detected by IR and NMR, respectively. (b) The isotopic labeling of carbon atoms C1 and C2 of TosMIC and of the DMSO oxygen helped determine the source of the core atoms in the product molecule. (c) Comparison of the structural change between reagents and products among known reactions of TosMIC (red line indicates the discovered reaction). (d) Estimation of reaction unpredictability from the size of the relevant reaction network. A full simulation could be carried out only for the first eight steps, as the combinatorial explosion produces vastly more molecules than could feasibly be analyzed.
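The combinatorial explosion mentioned in panel (d) can be made concrete with a toy count. The model below is a hypothetical simplification and not the authors' network simulation: it assumes that at every step each reaction template can act on every single molecule and on every pair of molecules in the pool, and that each such event yields one new, distinct molecule. The function name and parameters are illustrative only.

```python
from math import comb


def network_size_estimate(n_start, n_templates, n_steps):
    """Pool size after each step under the toy growth assumption.

    Each step adds n_templates new molecules for every single molecule
    (unimolecular events) and for every pair (bimolecular events).
    """
    sizes = [n_start]
    for _ in range(n_steps):
        n = sizes[-1]
        sizes.append(n + n_templates * (n + comb(n, 2)))
    return sizes
```

With two starting molecules and a single template, `network_size_estimate(2, 1, 2)` gives `[2, 5, 20]`. Because the pair term grows with the square of the pool size, the count escalates rapidly with each further step, which illustrates why an exhaustive enumeration becomes infeasible after only a few steps.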
