Chem Sci. 2022 Sep 13;13(41):12087-12099.
doi: 10.1039/d2sc03016k. eCollection 2022 Oct 26.

Automated stopped-flow library synthesis for rapid optimisation and machine learning directed experimentation

Claudio Avila et al. Chem Sci.

Abstract

For the discovery of new candidate molecules in the pharmaceutical industry, library synthesis is a critical step, in which library size, diversity, and time to synthesise are fundamental. In this work we propose stopped-flow synthesis as an intermediate alternative to traditional batch and flow chemistry approaches, suited for small molecule pharmaceutical discovery. This method exploits the advantages of both techniques, enabling automated experimentation with access to high pressures and temperatures and flexible reaction times, with minimal use of reagents (μmol scale per reaction). In this study, we integrate a stopped-flow reactor into a high-throughput continuous platform designed for the synthesis of combinatorial libraries with at-line reaction analysis. This approach allowed ∼900 reactions to be conducted in an accelerated timeframe (192 hours). The stopped-flow approach used ∼10% of the reactants and solvents compared to a fully continuous approach. This methodology demonstrates a significantly improved synthesis success rate for smaller libraries by simplifying the implementation of cross-reaction optimisation strategies. The experimental datasets were used to train a feed-forward neural network (FFNN) model, providing a framework to guide further experiments; the model showed good predictability and success when tested against an external set with fewer experiments. As a result, this work demonstrates that combining experimental automation with machine learning strategies can deliver optimised analyses and enhanced predictions, enabling more efficient drug discovery investigations across the design, make, test and analysis (DMTA) cycle.

Conflict of interest statement

There are no conflicts to declare.

Figures

Fig. 1. Stopped-flow reactor integrated into a continuous high-throughput platform: (a) stopped-flow system concept. The reactor is built from a small-diameter coil, with the substrates injected simultaneously. The flow is stopped when the material reaches the reaction point, and the material is output for analysis/recovery once the desired ‘reaction time’ has elapsed; (b) high-throughput platform: (1) liquid handler; (2) multi-selection valve; (3) array of sampling loops, each connected to a (4) respective HPLC pump; (5) digitally controlled reactor temperature; (6) digitally controlled cooling jacket; (7) stopped-flow reactor coil; (8) back pressure regulator; (9) NIR flow cell; (10) 2 μL sampling loop connected to HPLC-MS. (c) Calibration and control of the stopped-flow reactor using the NIR signal: (c1) 3D NIR signal of a single reaction; (c2) 2D NIR signal showing the relative location of Rv (reaction volume) and Rp (reacting point); (c3) time position of the Rv along the system from injection (t1) to reaction (Rt) and sampling (t5).
Fig. 2. Simplified exploration of the reaction conditions (a) using a fast design of experiments (DoE) pattern applied to two reactions (b and e). The fast DoE explores nine reaction conditions (c and f); a red circle indicates the value of the minimisation function, defined as the ratio of the internal standard peak area to the product peak area (calculated from the HPLC DAD 254 nm signal), which decreases as the product area increases. (d) and (g) show the strong temperature dependency of the optimum reaction conditions (where the function value is at its minimum), with a negligible influence of the reaction time.
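The minimisation function described in the Fig. 2 caption can be sketched in a few lines of Python. This is only an illustration, not the authors' code: the 3 x 3 grid of temperatures and times, the internal standard area, and the `fake_product_area` function are all hypothetical placeholders, chosen so that the temperature dominates the outcome as the caption reports.

```python
from itertools import product


def objective(internal_standard_area: float, product_area: float) -> float:
    """Ratio of internal-standard to product peak area; smaller is better."""
    if product_area <= 0:
        return float("inf")  # no product detected: worst possible value
    return internal_standard_area / product_area


# Hypothetical nine-point grid (temperature in deg C, reaction time in min).
# The source only states that nine conditions are explored.
temperatures = [50, 100, 150]
times = [1, 5, 10]
conditions = list(product(temperatures, times))


def fake_product_area(temperature: float, time: float) -> float:
    """Made-up response: strong temperature effect, weak time effect."""
    return 10.0 * temperature + 1.0 * time


internal_standard_area = 1000.0  # assumed constant across runs

scores = {
    cond: objective(internal_standard_area, fake_product_area(*cond))
    for cond in conditions
}
best = min(scores, key=scores.get)  # condition with the lowest ratio
```

With real data, `fake_product_area` would be replaced by the measured product peak area for each run.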
Scheme 1. Carboxylic acids (A), amines (B), and coupling agents (CA) used in this study.
Fig. 3. Heatmap for the amide library synthesised. Five carboxylic acids were combined with five amines, using four coupling agents and tested under nine different reaction conditions (900 individual reactions). Colours indicate the relative abundance of the target molecule (UV peak area of the target molecule signal relative to the total chromatogram area (%), measured at 254 nm). White and crossed boxes indicate absence of analytical data due to instrument failure.
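The library size follows directly from the combinatorial design: 5 acids x 5 amines x 4 coupling agents x 9 conditions = 900 reactions. A minimal sketch of that enumeration is given below; the labels (`A1` to `A5`, `B1` to `B5`, `cond1` to `cond9`) and the `enumerate_library` helper are assumptions made for illustration, since the caption does not state which Scheme 1 entries were used.

```python
from itertools import product


def enumerate_library(acids, amines, coupling_agents, conditions):
    """Return every (acid, amine, coupling agent, condition) combination."""
    return list(product(acids, amines, coupling_agents, conditions))


# Hypothetical labels; only the counts (5, 5, 4, 9) come from the caption.
acids = [f"A{i}" for i in range(1, 6)]
amines = [f"B{i}" for i in range(1, 6)]
coupling_agents = [f"CA{i}" for i in range(1, 5)]
conditions = [f"cond{i}" for i in range(1, 10)]

library = enumerate_library(acids, amines, coupling_agents, conditions)
print(len(library))  # 5 * 5 * 4 * 9 = 900
```

The same helper reproduces the 240 reactions of the second library in Fig. 6 (6 acids, 5 amines, 4 coupling agents, 2 conditions).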
Fig. 4. Effectiveness of each coupling agent relative to each of the substrates used, counting as successful those reactions in which the desired product gave a signal >10% of the total chromatogram area (measured from the UV DAD 254 nm signal).
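The success criterion in the Fig. 4 caption (target peak above 10% of the total chromatogram area) can be expressed as a simple threshold test. The sketch below is an assumption-laden illustration: the function names, the record layout `(coupling_agent, target_area, total_area)`, and the peak areas are all hypothetical.

```python
from collections import defaultdict

SUCCESS_THRESHOLD_PCT = 10.0  # from the caption: signal >10% of total area


def relative_abundance(target_area: float, total_area: float) -> float:
    """Target peak area as a percentage of the total chromatogram area."""
    if total_area <= 0:
        raise ValueError("total chromatogram area must be positive")
    return 100.0 * target_area / total_area


def is_successful(target_area: float, total_area: float) -> bool:
    """A reaction counts as successful only if strictly above the threshold."""
    return relative_abundance(target_area, total_area) > SUCCESS_THRESHOLD_PCT


def success_rate_by_agent(records):
    """Fraction of successful reactions for each coupling agent."""
    counts = defaultdict(lambda: [0, 0])  # agent -> [successes, total]
    for agent, target_area, total_area in records:
        counts[agent][1] += 1
        if is_successful(target_area, total_area):
            counts[agent][0] += 1
    return {agent: s / n for agent, (s, n) in counts.items()}


# Made-up peak areas, for illustration only.
records = [
    ("CA1", 30.0, 100.0),
    ("CA1", 5.0, 100.0),
    ("CA2", 10.0, 100.0),  # exactly 10%: not counted as a success
    ("CA2", 2.0, 100.0),
]
rates = success_rate_by_agent(records)
```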
Fig. 5. Performance measures for the model and random models. TP (true positive) and TN (true negative) correspond to successful and failed experiments, respectively, that were correctly predicted by the model; conversely, FP (false positive) and FN (false negative) correspond to wrong outcome predictions. The ‘precision’ is the ratio of relevant instances (here, the true positives) among all the retrieved instances (true or false positives). The ‘recall’ is the fraction of relevant instances that were retrieved. The ‘accuracy’ can be seen as a measure of how often the model makes a correct prediction (whether for a successful or a failed experiment).
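For reference, the three measures defined in the Fig. 5 caption follow the standard confusion-matrix formulas: precision = TP / (TP + FP), recall = TP / (TP + FN), accuracy = (TP + TN) / (TP + FP + TN + FN). A minimal sketch follows; the function name and the example counts are hypothetical and are not results from the paper.

```python
def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Precision, recall and accuracy from confusion-matrix counts."""

    def ratio(numerator: int, denominator: int) -> float:
        # Return 0.0 instead of dividing by zero when a class is empty.
        return numerator / denominator if denominator else 0.0

    return {
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
    }


# Made-up counts, for illustration only.
metrics = classification_metrics(tp=40, fp=10, tn=30, fn=20)
```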
Fig. 6. Experimental results obtained for a second amide library synthesised for external model cross-validation. Six carboxylic acids (Scheme 1, A1 to A6) were combined with five new amines (Scheme 1, B6 to B10), using the four coupling agents (Scheme 1, CA1 to CA4) and under two different reaction conditions totalling 240 individual reactions. Colours indicate the relative abundance of the target molecule (UV peak area of the target molecule relative to the total chromatogram area (%), measured at 254 nm). White and crossed boxes indicate absence of analytical data due to instrument failure.
Fig. 7. Bar plot of the number of experiments and products with respect to the model score and the reaction outcome in the ‘temporal test set’. On the X-axis, ‘Highest Score’ was computed for each product and corresponds to the experiment that obtained the highest model score for a given product. ‘Score > 0.0’ means that no threshold has been applied to the model score (all experiments are considered). ‘Score > 0.3’ to ‘Score > 0.9’ mean that only the experiments for which the model reports a score equal to or higher than the corresponding threshold value were considered. Red bars correspond to the number of experiments that failed, and blue bars to the number of experiments that succeeded experimentally. Yellow bars correspond to the number of unique products that were targeted by experiments that subsequently failed, while green bars correspond to the number of unique products whose reaction succeeded.
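The counting behind the bars in Fig. 7 can be sketched as a filter on the model score followed by tallies per experiment and per unique product. Everything below is illustrative: the record layout `(product, score, succeeded)`, the function names, and the data are hypothetical, and the yellow-bar count uses one literal reading of the caption (products with at least one failed experiment among those kept).

```python
def summarise(records, threshold=0.0):
    """Tally experiments and unique products with score >= threshold."""
    kept = [r for r in records if r[1] >= threshold]
    return {
        "experiments_succeeded": sum(1 for _, _, ok in kept if ok),
        "experiments_failed": sum(1 for _, _, ok in kept if not ok),
        "products_succeeded": len({p for p, _, ok in kept if ok}),
        "products_failed": len({p for p, _, ok in kept if not ok}),
    }


def highest_score_per_product(records):
    """For each product, keep (score, outcome) of its top-scoring experiment."""
    best = {}
    for product_id, score, ok in records:
        if product_id not in best or score > best[product_id][0]:
            best[product_id] = (score, ok)
    return best


# Made-up records: (product, model score, experimental success).
records = [
    ("P1", 0.95, True),
    ("P1", 0.40, False),
    ("P2", 0.35, False),
    ("P2", 0.10, False),
    ("P3", 0.80, True),
    ("P3", 0.20, True),
]

all_runs = summarise(records, threshold=0.0)
confident_runs = summarise(records, threshold=0.9)
top_runs = highest_score_per_product(records)
```

Raising the threshold keeps fewer experiments, which is the trade-off the bar plot visualises.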
Fig. 8. Flow chart of the best-model selection strategy applied to the 12 different feature sets used in this study.