Chem Sci. 2024 Mar 13;15(17):6331-6348.
doi: 10.1039/d3sc06133g. eCollection 2024 May 1.

Streamlining the automated discovery of porous organic cages

Annabel R Basford et al. Chem Sci.

Abstract

Self-assembly through dynamic covalent chemistry (DCC) can yield a range of multi-component organic assemblies. The reversibility and dynamic nature of DCC have made prediction of reaction outcomes particularly difficult, which slows the discovery rate of new organic materials. In addition, traditional experimental processes are time-consuming and often rely on serendipity. Here, we present a streamlined hybrid workflow that combines automated high-throughput experimentation, automated data analysis, and computational modelling to accelerate the discovery of one particular subclass of molecular organic materials, porous organic cages. We demonstrate how the design and implementation of this workflow aid in the identification of organic cages with desirable properties. The curation of a library of 55 tri- and di-topic aldehyde and amine precursors enabled the experimental screening of 366 imine condensation reactions and the computational modelling of 1464 hypothetical organic cage outcomes. From the screen, 225 cages were identified experimentally using mass spectrometry, 54 of which were cleanly formed as a single topology, as determined by both turbidity measurements and 1H NMR spectroscopy. Integration of these characterisation methods into a fully automated Python pipeline, named cagey, led to an over 350-fold decrease in the time required for data analysis. This work highlights the advantages of combining automated synthesis, characterisation, and analysis for large-scale data curation towards an accessible data-driven approach to materials discovery.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Fig. 1. Range of possible outcomes from relatively simple building blocks, such as tri- and di-topic precursor combinations, in dynamic covalent reactions. When targeting an organic cage, a range of other supramolecular species may also be accessible. Even if an organic cage forms, further complexity arises from the range of potential topologies, denoted TrinDim or [n + m], where n is the number of tri-topic building blocks and m is the number of di-topic building blocks in the cage structure (for example, Tri4Di6 for a [4 + 6] cage). Additionally, conversion in dynamic systems can vary: the building blocks may assemble fully into a single targeted organic cage, or smaller, partially assembled cage oligomers may form instead. The final target cage also needs to have the desired properties, such as shape persistence, which is key in porous organic cages.
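The [n + m] notation can be made concrete with a little bookkeeping. The following is a minimal sketch, not taken from the paper or from cagey: it assumes that every reactive group in a fully assembled imine cage has reacted (so 3n = 2m) and that each imine bond releases one water molecule. The function names and the example precursor masses are hypothetical.

```python
# Average molar mass of water in g/mol (standard value).
WATER_MASS = 18.015


def imine_bond_count(n_tri: int, m_di: int) -> int:
    """Number of imine bonds in a fully assembled [n + m] cage.

    Each tri-topic block offers 3 reactive groups and each di-topic block
    offers 2; in a closed cage the two totals must match.
    """
    if 3 * n_tri != 2 * m_di:
        raise ValueError(f"[{n_tri} + {m_di}] is not stoichiometrically balanced")
    return 3 * n_tri


def cage_mass(n_tri: int, m_di: int, tri_mass: float, di_mass: float) -> float:
    """Expected cage mass: sum of the precursors minus one water per imine bond."""
    bonds = imine_bond_count(n_tri, m_di)
    return n_tri * tri_mass + m_di * di_mass - bonds * WATER_MASS


# Hypothetical precursor masses, for illustration only.
print(imine_bond_count(4, 6))             # 12 imine bonds in a [4 + 6] cage
print(cage_mass(2, 3, 100.0, 50.0))       # mass of a [2 + 3] cage from placeholder masses
```

A mass computed this way is the kind of value one would compare against a mass spectrum, although ionisation and adducts would need to be handled separately.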
Fig. 2. Precursor library containing 55 molecules: tri-topic precursors of both amines and aldehydes (top) were screened against di-topic precursors of aldehydes and amines (bottom), respectively, leading to 366 imine condensation reactions. Precursors were synthesised (orange) or commercially available (blue). If a precursor has previously been reported in the literature as a building block for POC formation it is marked with an asterisk (*).
Fig. 3. (Top) High-throughput (HT) experimental workflow, including the setup on the automated liquid-handling OT-2 platform. Rapid parallel solvent evaporation was achieved using an EquaVAP, and sample preparation for characterisation was undertaken on the OT-2 before the HT measurements were carried out. (Middle) HT experimental data (turbidity, 1H NMR and high-resolution mass spectrometry) were automatically analysed to assign the reaction outcomes of the 366 imine condensations by species type, conversion, and topological outcome. In combination with the HT computational data (pore size analysis), this narrowed the range of successfully formed organic cages down further to identify shape-persistent POCs. (Bottom) HT computational workflow for the optimisation and loading of each precursor building block, followed by assembly into organic cages of the four most common topologies ([2 + 3] Tri2Di3, [4 + 6] Tri4Di6, [6 + 9] Tri6Di9, [8 + 12] Tri8Di12), and an optimisation process giving the lowest-energy conformer.
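The count of hypothetical cages follows directly from pairing each of the 366 precursor combinations with each of the four topologies. A minimal sketch of that enumeration is given here; the precursor labels are placeholders and the function name is hypothetical, not part of the published workflow.

```python
from itertools import product

# The four topologies named in the caption, as (n tri-topic, m di-topic).
TOPOLOGIES = {
    "Tri2Di3": (2, 3),
    "Tri4Di6": (4, 6),
    "Tri6Di9": (6, 9),
    "Tri8Di12": (8, 12),
}


def enumerate_candidates(combinations):
    """Pair every (tri, di) precursor combination with every topology."""
    return [
        (tri, di, topology)
        for (tri, di), topology in product(combinations, TOPOLOGIES)
    ]


# Placeholder labels standing in for the 366 screened combinations.
combos = [(f"tri_{i}", f"di_{i}") for i in range(366)]
candidates = enumerate_candidates(combos)
print(len(candidates))  # 366 combinations x 4 topologies = 1464
```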
Fig. 4. Examples of DCC reaction outcomes from the high-throughput screen targeting organic cages. The three circles show the ‘pass’ (blue) or ‘fail’ (yellow) of each experimental analysis (computer vision turbidity analysis, 1H NMR spectroscopy, and high-resolution mass spectrometry (HRMS)), which indicate the species type, conversion, and cage topology, respectively. The splitting of the inner ring indicates the number of cage topologies identified by HRMS. The observed experimental data for each characterisation method are shown in the boxes, coloured by their characterisation pass/fail check. Each reaction outcome was categorised as follows (identified by a coloured bar on the right-hand side): single topology observed to form cleanly (dark red) or incompletely (orange); mixture of topologies observed to form cleanly (red) or incompletely (coral); and no topology observed (grey). The turbidity reference is shown as a red dashed line and the observed turbidity as a purple line. If the observed line is below the reference, the sample passes and is in solution; if it is above the reference, the sample fails and insoluble precipitate is present. In the 1H NMR spectra, the orange boxes mark the region between 9 and 11 ppm, which is searched for residual aldehyde precursor, and the purple boxes mark the region searched for imine peaks. The HRMS spectra are shown to the right; the peak for an observed cage topology is indicated by a coloured dot, and the spectra are zoomed in accordingly to show the splitting: green for [2 + 3] Tri2Di3, orange for [4 + 6] Tri4Di6, blue for [6 + 9] Tri6Di9, and pink for [8 + 12] Tri8Di12 cage topologies. The predicted cage structures for the observed topologies are shown on the right with the carbons in the same colour; hydrogens are omitted for clarity.
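The turbidity and 1H NMR checks described in this caption can be sketched as simple threshold and window tests. This is an illustrative sketch only, not the cagey implementation: the function names are hypothetical, peak picking is assumed to have happened already, and the pass rule for NMR (imine present, no residual aldehyde) is an assumption. The imine search window is not stated in the caption, so it is left as a required argument.

```python
# Aldehyde search window from the caption, in ppm.
ALDEHYDE_REGION_PPM = (9.0, 11.0)


def passes_turbidity(observed: float, reference: float) -> bool:
    """Pass when the observed turbidity lies below the reference line."""
    return observed < reference


def peaks_in_region(peaks_ppm, region):
    """Return the picked peaks (chemical shifts in ppm) inside a window."""
    low, high = region
    return [p for p in peaks_ppm if low <= p <= high]


def has_residual_aldehyde(peaks_ppm) -> bool:
    """True if any picked peak falls in the 9 to 11 ppm aldehyde window."""
    return bool(peaks_in_region(peaks_ppm, ALDEHYDE_REGION_PPM))


def passes_nmr(peaks_ppm, imine_region) -> bool:
    """Assumed rule: imine peaks present and no residual aldehyde.

    `imine_region` must be supplied by the caller; the caption does not
    give its bounds.
    """
    has_imine = bool(peaks_in_region(peaks_ppm, imine_region))
    return has_imine and not has_residual_aldehyde(peaks_ppm)
```

Keeping each check as a separate boolean function mirrors the three pass/fail circles in the figure and makes it easy to report which check a given sample failed.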
Fig. 5. Experimental results for all 366 precursor combinations from the automated analysis, based on either ‘pass’ (blue) or ‘fail’ (yellow) of the computer vision turbidity analysis, 1H NMR spectroscopy, and high-resolution mass spectrometry (HRMS), to indicate the species type, conversion, and cage topology, respectively. The splitting of the inner ring indicates the number of cage topologies identified by HRMS.
Fig. 6. Sankey diagram illustrating the overall reaction outcomes of the high-throughput screen based on categorisation. Each reaction outcome was categorised as follows: single topology observed to form cleanly (dark red) or incompletely (orange); mixture of topologies observed to form cleanly (red) or incompletely (coral); and no topology observed (grey). The thickness of each connecting line is proportional to the number of precursor combinations leading to each outcome, and the numbers inside or next to each rectangle show the number of precursor combinations assigned to that outcome.
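The five outcome categories can be derived from the three checks and then tallied, which is the information a Sankey diagram of this kind displays. The sketch below is a hedged reconstruction rather than the authors' code: it assumes that "cleanly" means both the turbidity and 1H NMR checks passed and that anything else with at least one topology counts as "incompletely". The class name, function name, and example data are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Checks:
    turbidity_pass: bool
    nmr_pass: bool
    n_topologies: int  # number of cage topologies identified by HRMS


def categorise(checks: Checks) -> str:
    """Map the three checks onto one of the five outcome categories."""
    if checks.n_topologies == 0:
        return "no topology observed"
    kind = "single topology" if checks.n_topologies == 1 else "mixture of topologies"
    # Assumption: 'cleanly' requires both turbidity and NMR passes.
    clean = checks.turbidity_pass and checks.nmr_pass
    return f"{kind}, formed {'cleanly' if clean else 'incompletely'}"


# Made-up results for four reactions, for illustration only.
results = [
    Checks(True, True, 1),
    Checks(True, False, 1),
    Checks(True, True, 2),
    Checks(False, False, 0),
]
tally = Counter(categorise(r) for r in results)
for category, count in tally.items():
    print(f"{count:3d}  {category}")
```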
Fig. 7. Distribution of cavity diameters for the computationally predicted shape-persistent cages, with a cavity size greater than 0.1 Å and the correct number of windows, for all 366 precursor combinations across four topologies (top left Tri2Di3, top right Tri4Di6, bottom left Tri6Di9, and bottom right Tri8Di12) and for precursor combinations where a specific topology or mixture of topologies was experimentally observed by HRMS (hatched). The cages with the largest cavity sizes of each topology are shown both for the computationally predicted and experimentally realised structures, labelled according to the precursor combination. Carbon atoms in cages with the Tri2Di3 topology are shown in green, Tri4Di6 in orange, Tri6Di9 in blue, and Tri8Di12 in pink; in addition, nitrogen atoms are shown in dark blue and oxygen in red. Hydrogens have been omitted for clarity.
Fig. 8. Precursor trends for a subset of the reaction outcomes: (a) and (d) are cages that have formed either as a clean single topology or as a mixture of topologies, where the turbidity, 1H NMR and HRMS checks have been passed; (b) and (e) are cages that have formed either as a clean single topology or as a mixture of topologies, which have passed the 1H NMR and HRMS checks but failed the turbidity check, showing insoluble precipitate; (c) and (f) are where no cage topology was found, failing the HRMS check. Tri-topic and di-topic precursor combinations are categorised as either flexible–flexible (FF), flexible–rigid (FR), rigid–flexible (RF) or rigid–rigid (RR), and the total number of aromatic rings in the precursor combination is counted. Panels (a) and (b) both include an overlay of the percentage of the total that were also computationally predicted to be shape-persistent (hatched pink and green).
Fig. 9. Computationally modelled structures of the precursor combinations that passed the computer vision turbidity, NMR and HRMS experimental checks, and were therefore assigned as resulting in the ‘clean formation of a single topology’. Cage cavity diameters are given under the precursor combination label. No cavity is reported if the cage does not have the correct number of windows and/or no cavity diameter could be calculated computationally from the predicted structure, indicating the absence of a shape-persistent internal cavity. Hits for which no cavity diameter could be calculated are shown first, in alphabetical order, followed by the remaining hits in order of increasing cavity diameter. Carbon atoms in cages with the Tri2Di3 topology are shown in green, and Tri4Di6 in orange; in addition, nitrogen atoms are shown in dark blue and oxygen in red. Hydrogens have been omitted for clarity.
