Chem Sci. 2022 Sep 13;13(41):12087-12099.

doi: 10.1039/d2sc03016k. eCollection 2022 Oct 26.

Automated stopped-flow library synthesis for rapid optimisation and machine learning directed experimentation

Claudio Avila¹,², Carlo Cassani³, Thierry Kogej⁴, Javier Mazuela⁴, Sunil Sarda¹, Adam D Clayton², Michael Kossenjans³, Clive P Green¹, Richard A Bourne²

Affiliations

¹ Sample Management, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca, Cambridge CB4 0WG, UK. Claudio.avila@science.cl, R.A.Bourne@leeds.ac.uk
² Institute of Process Research and Development, School of Chemistry and School of Chemical and Process Engineering, University of Leeds, Leeds LS2 9JT, UK.
³ Research and Early Development, Medicinal Chemistry Respiratory & Immunology, Göteborg, Sweden.
⁴ MolecularAI, Discovery Sciences, R&D, AstraZeneca, Göteborg, Sweden.

PMID: 36349112
PMCID: PMC9601405
DOI: 10.1039/d2sc03016k

Claudio Avila et al. Chem Sci. 2022.

Abstract

Library synthesis is a critical step in the discovery of new candidate molecules in the pharmaceutical industry, where library size, diversity, and synthesis time are fundamental. In this work we propose stopped-flow synthesis as an intermediate alternative to traditional batch and flow chemistry approaches, suited to small-molecule pharmaceutical discovery. The method combines the advantages of both techniques: automated experimentation, access to high pressures and temperatures, flexible reaction times, and minimal use of reagents (μmol scale per reaction). We integrate a stopped-flow reactor into a high-throughput continuous platform designed for the synthesis of combinatorial libraries with at-line reaction analysis. This approach allowed ∼900 reactions to be conducted in an accelerated timeframe (192 hours). The stopped-flow approach used ∼10% of the reactants and solvents required by a fully continuous approach. The methodology significantly improves the synthesis success rate of smaller libraries by simplifying the implementation of cross-reaction optimisation strategies. The experimental datasets were used to train a feed-forward neural network (FFNN) model that provides a framework to guide further experiments; the model showed good predictability and success when tested against an external set with fewer experiments. This work therefore demonstrates that combining experimental automation with machine learning strategies can deliver optimised analyses and enhanced predictions, enabling more efficient drug discovery investigations across the design, make, test and analysis (DMTA) cycle.

This journal is © The Royal Society of Chemistry.

Conflict of interest statement

There are no conflicts to declare.

Figures

Fig. 1. Stopped-flow reactor integrated into a continuous high-throughput platform. (a) Stopped-flow system concept. The reactor is built from a small-diameter coil, with the substrates injected simultaneously. The flow is stopped when the material reaches the reaction point, and the mixture is output for analysis/recovery once the desired ‘reaction time’ has elapsed. (b) High-throughput platform: (1) liquid handler; (2) multi-selection valve; (3) array of sampling loops, each connected to (4) its respective HPLC pump; (5) digitally controlled reactor temperature; (6) digitally controlled cooling jacket; (7) stopped-flow reactor coil; (8) back pressure regulator; (9) NIR flow cell; (10) 2 μL sampling loop connected to HPLC-MS. (c) Calibration and control of the stopped-flow reactor using the NIR signal: (c1) 3D NIR signal of a single reaction; (c2) 2D NIR signal showing the relative location of Rv (reaction volume) and Rp (reacting point); (c3) time position of the Rv along the system from injection (t₁) to reaction (Rt) and sampling (t₅).

Fig. 2. Simplified exploration of the reaction conditions (a) using a fast design of experiments (DoE) pattern applied to two reactions (b and e). The fast DoE explores nine reaction conditions (c and f); a red circle indicates the value of the minimisation function, defined as the ratio of the internal standard peak area to the product peak area and calculated from the HPLC DAD 254 nm signal (*i.e.* the value decreases as the product area increases). (d) and (g) show the strong temperature dependency of the optimum reaction conditions (where the function value is at its minimum), with a negligible influence of the reaction time.
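
The minimisation function in this caption can be sketched in a few lines. This is only an illustration under stated assumptions: the function names, the peak areas, and the 3 × 3 temperature/time grid are hypothetical placeholders and are not taken from the paper, which specifies only that nine conditions were explored.

```python
from itertools import product


def objective(internal_standard_area: float, product_area: float) -> float:
    """Ratio of internal-standard to product peak area (lower is better)."""
    if product_area <= 0:
        return float("inf")  # no product detected: worst possible value
    return internal_standard_area / product_area


def best_condition(results: dict) -> tuple:
    """Return the (temperature, time) key with the smallest objective value."""
    return min(results, key=results.get)


# Hypothetical 3 x 3 grid (assumption: three temperatures x three times).
temperatures_c = [50, 100, 150]
times_min = [1, 5, 10]
conditions = list(product(temperatures_c, times_min))

# Made-up peak areas in which the product area depends on temperature only.
results = {
    (temp, rt): objective(internal_standard_area=100.0, product_area=float(temp))
    for temp, rt in conditions
}
print(len(conditions), best_condition(results))
```

With these invented areas the minimum falls at the highest temperature regardless of reaction time, which mirrors the qualitative trend the caption describes.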

Scheme 1. Carboxylic acids (A), amines (B), and coupling agents (CA) used in this study.

Fig. 3. Heatmap for the amide library synthesised. Five carboxylic acids were combined with five amines, using four coupling agents and tested under nine different reaction conditions (900 individual reactions). Colours indicate the relative abundance of the target molecule (UV peak area of the target molecule signal relative to the total chromatogram area (%), measured at 254 nm). White and crossed boxes indicate absence of analytical data due to instrument failure.
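
The library size follows directly from the combinatorial design. As a minimal sketch, the enumeration below uses placeholder labels (the actual compounds are defined in Scheme 1, and the condition labels `C1` to `C9` are hypothetical):

```python
from itertools import product

# Placeholder labels only; Scheme 1 defines the actual compounds.
acids = [f"A{i}" for i in range(1, 6)]             # five carboxylic acids
amines = [f"B{i}" for i in range(1, 6)]            # five amines
coupling_agents = [f"CA{i}" for i in range(1, 5)]  # four coupling agents
conditions = [f"C{i}" for i in range(1, 10)]       # nine reaction conditions

# Every acid/amine/coupling-agent/condition combination is one reaction.
library = list(product(acids, amines, coupling_agents, conditions))

# Each acid/amine pair corresponds to one target amide.
n_products = len({(acid, amine) for acid, amine, _, _ in library})
print(len(library), n_products)
```

The 900 reactions therefore target 25 distinct amides, each attempted with 36 agent/condition combinations.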

Fig. 4. Effectiveness of the coupling agent relative to each of the substrates used, considering successful reactions those in which the desired product produced a signal >10% of the total chromatogram area (measured at UV DAD 254 nm signal).
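
The success criterion in this caption reduces to a simple threshold test. The sketch below assumes the peak areas are already integrated; the function names and the example values are illustrative and do not come from the paper.

```python
def is_successful(product_area: float, total_area: float,
                  threshold_pct: float = 10.0) -> bool:
    """True when the product peak exceeds threshold_pct of the total area."""
    if total_area <= 0:
        return False  # no usable chromatogram
    return 100.0 * product_area / total_area > threshold_pct


def success_rate(outcomes: list) -> float:
    """Fraction of successful reactions in a list of booleans."""
    return sum(outcomes) / len(outcomes) if outcomes else 0.0


# Invented (product area, total area) pairs for one coupling agent.
areas = [(15.0, 100.0), (10.0, 100.0), (60.0, 100.0), (0.0, 100.0)]
outcomes = [is_successful(p, t) for p, t in areas]
print(outcomes, success_rate(outcomes))
```

Note that the comparison is strict, so a product at exactly 10% of the total area is counted as a failure under this reading of the caption.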

Fig. 5. Performance measures for the model and random models. TP (true positive) and TN (true negative) correspond to successful and failed experiments, respectively, that were correctly predicted by the model; conversely, FP (false positive) and FN (false negative) correspond to wrong outcome predictions. The ‘precision’ is the ratio of relevant instances (here, the true positives) among all the retrieved instances (true or false positives). The ‘recall’ is the fraction of relevant instances that were retrieved. The ‘accuracy’ can be seen as a measure of how often the model makes a correct prediction (whether for a successful or a failed experiment).
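
These three measures follow the standard confusion-matrix definitions. A minimal sketch is given below; the counts in the example are invented for illustration and are not results from the paper.

```python
def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Precision, recall and accuracy from confusion-matrix counts."""
    total = tp + fp + tn + fn
    return {
        # Share of predicted successes that were real successes.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        # Share of real successes that the model retrieved.
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
        # Share of all predictions (success or failure) that were right.
        "accuracy": (tp + tn) / total if total else 0.0,
    }


# Hypothetical counts, for illustration only.
metrics = classification_metrics(tp=40, fp=10, tn=30, fn=20)
print(metrics)
```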

Fig. 6. Experimental results obtained for a second amide library synthesised for external model cross-validation. Six carboxylic acids (Scheme 1, A1 to A6) were combined with five new amines (Scheme 1, B6 to B10), using the four coupling agents (Scheme 1, CA1 to CA4) and under two different reaction conditions totalling 240 individual reactions. Colours indicate the relative abundance of the target molecule (UV peak area of the target molecule relative to the total chromatogram area (%), measured at 254 nm). White and crossed boxes indicate absence of analytical data due to instrument failure.

Fig. 7. Bar plot of the number of experiments and products with respect to the model score and the reaction outcome in the ‘temporal test set’. On the X-axis, ‘Highest Score’ is computed per product and corresponds to the experiment that obtained the highest model score for that product. ‘Score > 0.0’ means that no threshold was applied to the model score (all experiments are considered). ‘Score > 0.3’ to ‘Score > 0.9’ mean that only experiments for which the model reports a score equal to or higher than the corresponding threshold were considered. Red bars show the number of experiments that failed and blue bars the number that succeeded experimentally. Yellow bars show the number of unique products targeted by experiments that subsequently failed, while green bars show the number of unique products that were successfully synthesised.
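
The counting behind such a bar plot can be sketched as follows. This is an illustration only: the record layout, the function names, and the example scores are hypothetical, and the code takes the caption literally, so a product with both a failed and a successful experiment is counted in both product tallies.

```python
def summarise(experiments: list, threshold: float) -> dict:
    """Count outcomes among experiments whose score is >= threshold."""
    kept = [e for e in experiments if e["score"] >= threshold]
    succeeded = [e for e in kept if e["success"]]
    failed = [e for e in kept if not e["success"]]
    return {
        "experiments_succeeded": len(succeeded),
        "experiments_failed": len(failed),
        "products_succeeded": len({e["product"] for e in succeeded}),
        "products_failed": len({e["product"] for e in failed}),
    }


def highest_score_per_product(experiments: list) -> list:
    """Keep, for each product, only its highest-scoring experiment."""
    best = {}
    for e in experiments:
        current = best.get(e["product"])
        if current is None or e["score"] > current["score"]:
            best[e["product"]] = e
    return list(best.values())


# Invented records: product label, model score, experimental outcome.
experiments = [
    {"product": "A1-B6", "score": 0.95, "success": True},
    {"product": "A1-B6", "score": 0.40, "success": False},
    {"product": "A2-B7", "score": 0.35, "success": False},
    {"product": "A2-B7", "score": 0.20, "success": False},
    {"product": "A3-B8", "score": 0.60, "success": True},
]

for threshold in (0.0, 0.3, 0.9):
    print(threshold, summarise(experiments, threshold))
print(summarise(highest_score_per_product(experiments), 0.0))
```

Raising the threshold trades coverage for reliability: fewer experiments are retained, but a larger share of those retained succeed in this toy data.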

Fig. 8. Flow chart of the best model selection strategy applied to the 12 different feature sets used in this study.

Cited by

Continuous flow synthesis of pyridinium salts accelerated by multi-objective Bayesian optimization with active learning.
Dunlap JH, Ethier JG, Putnam-Neeb AA, Iyer S, Luo SL, Feng H, Garrido Torres JA, Doyle AG, Swager TM, Vaia RA, Mirau P, Crouse CA, Baldwin LA. Chem Sci. 2023 Jul 12;14(30):8061-8069. doi: 10.1039/d3sc01303k. eCollection 2023 Aug 2. PMID: 37538827
A Brief Introduction to Chemical Reaction Optimization.
Taylor CJ, Pomberger A, Felton KC, Grainger R, Barecka M, Chamberlain TW, Bourne RA, Johnson CN, Lapkin AA. Chem Rev. 2023 Mar 22;123(6):3089-3126. doi: 10.1021/acs.chemrev.2c00798. Epub 2023 Feb 23. PMID: 36820880. Review.
Deep Reinforcement Learning-Based Self-Optimization of Flow Chemistry.
Yewale A, Yang Y, Nazemifard N, Papageorgiou CD, Rielly CD, Benyahia B. ACS Eng Au. 2025 May 13;5(3):247-266. doi: 10.1021/acsengineeringau.5c00004. eCollection 2025 Jun 18. PMID: 40556644
Navigating the Expansive Landscapes of Soft Materials: A User Guide for High-Throughput Workflows.
Day EC, Chittari SS, Bogen MP, Knight AS. ACS Polym Au. 2023 Dec 5;3(6):406-427. doi: 10.1021/acspolymersau.3c00025. eCollection 2023 Dec 13. PMID: 38107416. Review.
Parallel multi-droplet platform for reaction kinetics and optimization.
Eyke NS, Schneider TN, Jin B, Hart T, Monfette S, Hawkins JM, Morse PD, Howard RM, Pfisterer DM, Nandiwale KY, Jensen KF. Chem Sci. 2023 Aug 4;14(33):8798-8809. doi: 10.1039/d3sc02082g. eCollection 2023 Aug 23. PMID: 37621435
