Review
ACS Polym Au. 2023 Dec 5;3(6):406-427.
doi: 10.1021/acspolymersau.3c00025. eCollection 2023 Dec 13.

Navigating the Expansive Landscapes of Soft Materials: A User Guide for High-Throughput Workflows

Erin C Day et al. ACS Polym Au. 2023.

Abstract

Synthetic polymers are highly customizable with tailored structures and functionality, yet this versatility generates challenges in the design of advanced materials due to the size and complexity of the design space. Thus, exploration and optimization of polymer properties using combinatorial libraries has become increasingly common, which requires careful selection of synthetic strategies, characterization techniques, and rapid processing workflows to obtain fundamental principles from these large data sets. Herein, we provide guidelines for strategic design of macromolecule libraries and workflows to efficiently navigate these high-dimensional design spaces. We describe synthetic methods for multiple library sizes and structures as well as characterization methods to rapidly generate data sets, including tools that can be adapted from biological workflows. We further highlight relevant insights from statistics and machine learning to aid in data featurization, representation, and analysis. This Perspective acts as a "user guide" for researchers interested in leveraging high-throughput screening toward the design of multifunctional polymers and predictive modeling of structure-property relationships in soft materials.

Conflict of interest statement

The authors declare no competing financial interest.

Figures

Figure 1
High-throughput screening workflow design. First, a scientific objective must be established to optimize a material or explore a structure–property relationship. Then, variables or features of interest must be chosen and discretized appropriately to result in a design space that can be feasibly sampled. A library can then be generated, screened, and the resulting characterization can be used in designing a new library and further material discovery. The relevant sections of this perspective are highlighted below the flowchart.
Figure 2
The objective of a high-throughput screen over a complex design surface (gray) falls into one of two categories: optimization (left) or exploration (right). Optimization involves identifying a high-performing “champion material” (blue star), whereas exploration involves mapping a structure–property relationship over the entire surface (blue dots).
Figure 3
Featurization strategies and estimating library size. (a) Common variables for macromolecule libraries include composition, molecular weight, architecture, sequence patterning, and extrinsic factors. (b) Variables must be bounded (left) and discretized (right) based on physical limitations. (c) Estimation of the size of the design space for sequence-defined materials (left) or polydisperse materials (right).
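As a rough illustration of the library-size estimate in panel (c), the sketch below counts the design space under two simple assumptions that are not taken from the figure itself: a sequence-defined chain of fixed length in which every position can hold any monomer, and a set of independent variables that have each been discretized into a fixed number of levels. The function names and the example variable names are hypothetical.

```python
from math import prod
from typing import Mapping


def sequence_space_size(n_monomers: int, length: int) -> int:
    """Count linear sequences of a fixed length when order matters.

    Assumes every position can hold any of the n_monomers building blocks.
    """
    return n_monomers ** length


def grid_space_size(levels_per_variable: Mapping[str, int]) -> int:
    """Count combinations when each variable is discretized independently."""
    return prod(levels_per_variable.values())


if __name__ == "__main__":
    # 4 monomers in a 10-mer: 4**10 sequences
    print(sequence_space_size(4, 10))  # 1048576

    # Hypothetical discretization: 11 compositions x 5 molecular weights x 3 architectures
    print(grid_space_size({"composition": 11, "molecular_weight": 5, "architecture": 3}))  # 165
```

Even these small examples show why bounding and coarse discretization matter: adding one more position to the sequence multiplies the space by the number of monomers.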
Figure 4
Potential sampling challenges and harnessing data outputs. (a) Potential sampling challenges arise with insufficient library size (left), overrepresentation of certain data classes in imbalanced data sets (center), and practical constraints on samples such as insolubility in water (right). (b) Outputs of screening workflows include development of structure–property regression models (left), iterative library design using adaptive sampling (center), and database generation of screening outputs, feature information, and protocol metadata (right).
Figure 5
Methods for efficient library generation. (a) Library types organized by the size of space (x-axis) and amount of material (y-axis) that can be screened. (b) High-throughput synthesis methods categorized by how the libraries can be screened, how the material is produced, and what macromolecular properties are varied.
Figure 6
Methods for efficient library synthesis. Schematic of (a) split-and-pool synthesis of one-bead one-compound libraries, (b) large library synthesis with barcodes, (c) modification or separation of a single polymerization batch into a library, and (d) options for automated library generation.
Figure 7
Overview of molecular descriptors. (a) String representation for polymer materials can use existing SMILES-type notation (left). BigSMILES notation also supports architecture representation unique to polydisperse materials (right, adapted from ref (141). Copyright 2019 American Chemical Society). (b) Graph representation uses nodes and edges to represent atoms and bonds in a molecule (left). Hypergraph (e.g., PolyGrammar) and ECFP are graph-driven representation techniques that preserve information on the connectivity of atoms in a polymer (right). (c) Representation learning is a powerful tool that can take a diverse set of inputs, including MD simulation trajectory data, electronic structure calculations from DFT, spectroscopic inputs, and other types of molecular descriptors (left). These inputs can be converted to a feature vector using deep learning (autoencoder-decoder neural net), and the feature vectors can be used to fit a predictive model (right).
Figure 8
Characterization methods organized by data acquisition and processing times (x-axis) and material required (y-axis). Italicized are the properties that are evaluated by each technique.
Figure 9
Biological techniques adaptable to synthetic macromolecule characterization. (a) Overview of the structural and functional characterization techniques described herein. (b) Surface plasmon resonance (SPR): the target flows through a channel over a ligand immobilized on a metal sensor surface. Changes in the refractive index upon binding provide real-time information regarding kinetics and affinities between binding partners. (c) Affinity selection mass spectrometry (AS-MS): unbound target molecules or ions are isolated physically or through dialysis techniques from a macromolecular binding partner, then quantified to determine the bound fraction across a range of target concentrations. (d) Isothermal titration calorimetry (ITC): one binding partner, typically the target, is titrated gradually into a dilute solution of the other, while the resulting heat change is measured against a reference cell. Peak integration of each binding event and subsequent curve fitting yields ΔH, binding stoichiometry (N), and association constant (Ka). (e) Ion mobility spectrometry-mass spectrometry (IMS-MS): ions are separated based on size, charge state, and collisional cross section, resolving differences in polymer architecture or conformation. (f) Förster resonance energy transfer (FRET): the transfer of energy from excitation of a donor group to an acceptor moiety in close proximity enables the measurement of through-space interactions, and this can probe properties such as conformation or end-to-end distance (Adapted from ref (234). Copyright 2023 American Chemical Society).
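The quantities fitted from an ITC experiment in panel (d) can be converted into the remaining thermodynamic parameters with the standard relations ΔG = −RT ln Ka and ΔG = ΔH − TΔS. The sketch below applies them, assuming Ka is expressed relative to a 1 M standard state and ΔH is in J/mol. The function name and the example values are hypothetical and are not taken from the article.

```python
import math

R = 8.314462618  # molar gas constant, J/(mol*K)


def binding_thermodynamics(ka_per_molar: float,
                           delta_h_j_per_mol: float,
                           temperature_k: float = 298.15) -> dict:
    """Derive dG, dS, and Kd from a fitted Ka and dH (1 M standard state assumed)."""
    delta_g = -R * temperature_k * math.log(ka_per_molar)   # J/mol
    delta_s = (delta_h_j_per_mol - delta_g) / temperature_k  # J/(mol*K)
    return {
        "delta_g": delta_g,
        "delta_s": delta_s,
        "kd": 1.0 / ka_per_molar,  # dissociation constant, M
    }


if __name__ == "__main__":
    # Hypothetical fit: Ka = 1e6 M^-1, dH = -50 kJ/mol at 25 degrees C
    result = binding_thermodynamics(1e6, -50e3)
    print(f"dG = {result['delta_g'] / 1000:.1f} kJ/mol")
    print(f"dS = {result['delta_s']:.1f} J/(mol*K)")
    print(f"Kd = {result['kd']:.1e} M")
```

In this hypothetical case the binding is enthalpically driven: ΔH is more negative than ΔG, so the entropy term is unfavorable.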
Figure 10
Data visualization and regression workflow. (a) A high-dimensional data set presents challenges for both analysis and visualization. (b) Dimension reduction techniques such as principal component analysis (PCA) can represent the data on new reduced axes. (c) Low-dimensional data can then be clustered into different classes using strategies such as k-means clustering or hierarchical clustering. (d) Models can be fit and interpreted in a variety of ways, including regression analysis and calculating SHAP values, to determine the importance of various features.
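The clustering step in panel (c) can be illustrated with a minimal k-means sketch (Lloyd's algorithm) in plain Python. This is not the authors' implementation: it assumes the data have already been reduced to a few dimensions, and it seeds the centroids with the first k points, a simplification chosen here for reproducibility. Practical workflows usually rely on a library implementation with better initialization.

```python
from math import dist


def k_means(points, k, n_iter=100):
    """Cluster points (tuples of floats) into k groups with Lloyd's algorithm.

    Assumption for this sketch: the first k points seed the centroids.
    Returns (labels, centroids).
    """
    centroids = [tuple(p) for p in points[:k]]
    labels = []
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centroids


if __name__ == "__main__":
    # Two well-separated groups in a hypothetical 2D reduced space
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    labels, centroids = k_means(data, k=2)
    print(labels)  # [0, 0, 0, 1, 1, 1]
    print(centroids)
```

Because k-means depends on Euclidean distance, features on very different scales are normally standardized before dimension reduction and clustering.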

References

    1. Namazi H. Polymers in Our Daily Life. Bioimpacts 2017, 7 (2), 73–74. 10.15171/bi.2017.09.
    2. Schaefer S.; Pham T. T. P.; Brunke S.; Hube B.; Jung K.; Lenardon M. D.; Boyer C. Rational Design of an Antifungal Polyacrylamide Library with Reduced Host-Cell Toxicity. ACS Appl. Mater. Interfaces 2021, 13 (23), 27430–27444. 10.1021/acsami.1c05020.
    3. Bannigan P.; Bao Z.; Hickman R. J.; Aldeghi M.; Häse F.; Aspuru-Guzik A.; Allen C. Machine Learning Models to Accelerate the Design of Polymeric Long-Acting Injectables. Nat. Commun. 2023, 14 (1), 35. 10.1038/s41467-022-35343-w.
    4. Pereira D. A.; Williams J. A. Origin and Evolution of High Throughput Screening: Origin and Circumscribed History of HTS. Br. J. Pharmacol. 2007, 152 (1), 53–61. 10.1038/sj.bjp.0707373.
    5. Blay V.; Tolani B.; Ho S. P.; Arkin M. R. High-Throughput Screening: Today’s Biochemical and Cell-Based Approaches. Drug Discovery Today 2020, 25 (10), 1807–1821. 10.1016/j.drudis.2020.07.024.