This is a preprint.
Expansion of DNA-Encoded Library Hits Using Generative Chemistry and Ultra-Large Compound Catalogs
- PMID: 41256572
- PMCID: PMC12621893
- DOI: 10.1101/2025.09.30.679600
Abstract
DNA-encoded libraries (DELs) are powerful tools for initial hit identification, yet the combinatorial chemistries and building block choices used in their construction can restrict chemical space coverage and hit drug-likeness, limiting efficient hit expansion. Generative artificial intelligence (AI), by contrast, can in principle explore drug-like chemical space around any given compound, but it often struggles with the synthesizability of generated molecules and requires a set of validated hits to initiate exploration. Here, we present a synergistic methodology that overcomes these mutual limitations by leveraging experimentally validated DEL data to initialize and bias an AI-powered virtual screening pipeline, expanding initial DEL hits with both de novo and purchasable compounds from ultra-large chemical libraries. Using this approach, we identified novel, commercially available hits from the Enamine REAL Space for the chromatin reader protein 53BP1 and validated them in a time-resolved fluorescence resonance energy transfer (TR-FRET) displacement assay. Three compounds demonstrated TR-FRET IC50 values ≤50 μM, while 11 exhibited IC50 values ≤100 μM. Critically, compared with compounds from the initial DEL selection, the AI-nominated hits exhibited greater chemical diversity and improved drug-likeness, and were readily purchasable off the shelf. This work demonstrates a streamlined platform in which empirical DEL data and generative chemistry models are combined to enable rapid hit expansion from initially screened libraries into diverse, commercially available chemical matter.
Conflict of interest statement
The authors declare no competing financial interest.