J Chem Inf Model. 2023 Jun 12;63(11):3423-3437.
doi: 10.1021/acs.jcim.3c00276. Epub 2023 May 25.

Fragment Merging Using a Graph Database Samples Different Catalogue Space than Similarity Search

Stephanie Wills et al. J Chem Inf Model.

Abstract

Fragment merging is a promising approach to progressing fragments directly to on-scale potency: each designed compound incorporates the structural motifs of overlapping fragments in a way that ensures compounds recapitulate multiple high-quality interactions. Searching commercial catalogues provides one useful way to quickly and cheaply identify such merges and circumvents the challenge of synthetic accessibility, provided they can be readily identified. Here, we demonstrate that the Fragment Network, a graph database that provides a novel way to explore the chemical space surrounding fragment hits, is well-suited to this challenge. We use an iteration of the database containing >120 million catalogue compounds to find fragment merges for four crystallographic screening campaigns and contrast the results with a traditional fingerprint-based similarity search. The two approaches identify complementary sets of merges that recapitulate the observed fragment-protein interactions but lie in different regions of chemical space. We further show our methodology is an effective route to achieving on-scale potency by retrospective analyses for two different targets; in analyses of public COVID Moonshot and Mycobacterium tuberculosis EthR inhibitors, potential inhibitors with micromolar IC50 values were identified. This work demonstrates the use of the Fragment Network to increase the yield of fragment merges beyond that of a classical catalogue search.

Conflict of interest statement

The authors declare no competing financial interest.

Figures

Figure 1
Pipeline for identifying fragment merges. Fragment hits from crystallographic fragment screens are used for finding fragment merges. All possible pairs of compounds are enumerated for merging (removing those with high similarity). Both the Fragment Network and similarity search are used to identify fragment merges. The Fragment Network enumerates all possible substructures of one of the fragments in the merge while the other fragment is regarded as the seed fragment. A series of optional hops are made away from the seed fragment (up to a maximum of two), after which an expansion is made by incorporating a substructure from the other fragment. The similarity search finds merges by calculating the Tversky (Tv) similarity against every compound in the database using the Morgan fingerprint (2048 bits and radius 2). The Tversky calculation uses α and β values of 0.7 and 0.3, respectively. All compounds with a mean similarity ≥0.4 are retained. The merges pass through a series of 2D and 3D filters, including pose generation with Fragmenstein, to result in scored poses.
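The similarity-search step in the caption above can be illustrated with a minimal sketch. It assumes each fingerprint is represented as a Python set of "on" bit indices (in practice these would come from a 2048-bit, radius-2 Morgan fingerprint produced by a cheminformatics toolkit), that the α weight applies to bits found only in the fragment and the β weight to bits found only in the catalogue compound, and that the "mean similarity" is the average over the two fragments in the pair. The function names and the toy bit sets are hypothetical and are not taken from the paper.

```python
def tversky(fragment_bits, compound_bits, alpha=0.7, beta=0.3):
    """Tversky index between two fingerprints given as sets of on-bit indices.

    Assumption: alpha weights bits unique to the fragment and beta weights
    bits unique to the catalogue compound.
    """
    common = len(fragment_bits & compound_bits)
    only_fragment = len(fragment_bits - compound_bits)
    only_compound = len(compound_bits - fragment_bits)
    denominator = common + alpha * only_fragment + beta * only_compound
    return common / denominator if denominator else 0.0


def mean_tversky(fragment_a_bits, fragment_b_bits, compound_bits):
    """Average the Tversky index of one compound against both fragments."""
    score_a = tversky(fragment_a_bits, compound_bits)
    score_b = tversky(fragment_b_bits, compound_bits)
    return (score_a + score_b) / 2


def is_retained(fragment_a_bits, fragment_b_bits, compound_bits, threshold=0.4):
    """Keep a catalogue compound if its mean similarity meets the threshold."""
    return mean_tversky(fragment_a_bits, fragment_b_bits, compound_bits) >= threshold


if __name__ == "__main__":
    # Toy bit sets standing in for real Morgan fingerprints.
    fragment_a = {1, 2, 3, 4}
    fragment_b = {7, 8, 9}
    compound = {3, 4, 5, 6, 7, 8}
    print(round(mean_tversky(fragment_a, fragment_b, compound), 3))
    print(is_retained(fragment_a, fragment_b, compound))
```

With α greater than β, bits that the fragment has and the compound lacks are penalized more heavily than extra bits in the compound, so under this assignment the score favors larger compounds that contain most of a fragment's features.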
Figure 2
The Fragment Network identifies pure merges. (a) A fragment-merging opportunity for the main protease (Mpro) data set. Interactions are predicted using the protein–ligand interaction profiler (PLIP). Hydrogen bonds and π-stacking interactions are shown by cyan and magenta dotted lines, respectively. The PanDDA density is provided for the purple fragment in the Supporting Information (owing to the unusual conformation). (b) The linker-like merge (pose generated using Fragmenstein) joins substructures from partially overlapping fragments by a “linker-like” region, maintaining the hydrogen bond with THR-45; a change in orientation of the thiazole ring (with respect to the thiophene ring in the fragment) enables an additional π-stacking interaction with HIS-41. (c) A fragment-linking opportunity for Mpro. (d) The proposed compound maintains a hydrogen bond with PHE-140 and makes an additional bond with SER-144. It is worth noting that the linker group proposed by the Fragment Network is present in thioacetazone, an oral antibiotic. (e) The fragments and merge in a and b in 2D. (f) The fragments and merge in c and d in 2D.
Figure 3
The Fragment Network and similarity searches identify filtered compounds for different fragment pairs. The numbers of filtered compounds for each fragment pair found using the Fragment Network (blue) or similarity search (orange) are shown across targets (a) dipeptidyl peptidase 11 (DPP11), (b) poly(ADP-ribose) polymerase 14 (PARP14), (c) nonstructural protein 13 (nsp13), and (d) main protease (Mpro). Only pairs that resulted in filtered compounds are shown. Pairs are ordered from right to left according to the number of Fragment Network compounds found. The data show that each search technique was able to identify filtered compounds for pairs where the other technique identified none.
Figure 4
Fragment Network and similarity search-derived compound sets populate different regions of chemical space. The chemical space occupied by the filtered compound sets is projected into two dimensions using the t-SNE algorithm across targets (a) dipeptidyl peptidase 11 (DPP11), (b) poly(ADP-ribose) polymerase 14 (PARP14), (c) nonstructural protein 13 (nsp13), and (d) main protease (Mpro). Fragment Network compounds are shown in blue, and similarity search compounds are shown in orange. The two compound sets are shown to occupy distinct areas of chemical space.
Figure 5
The Fragment Network identifies a known binder against Mpro. A Fragment Network search using (a) two fragment hits against the SARS-CoV-2 main protease (Mpro) identifies (b) a known binder against Mpro (LON-WEI-b2874fec-25; RapidFire mass spectrometry (RF-MS) IC50 value of 59.6 μM) and compounds similar to known binders: (c) JAN-GHE-83b26c96-22 (fluorescence and RF-MS IC50 values of 96.9 μM and 24.5 μM) and (d) TRY-UNI-714a760b-18 (fluorescence and RF-MS IC50 values of 26.2 μM and 13.0 μM). Fragmenstein-predicted merge poses are shown in white, and crystal poses are in cyan. Interactions are predicted using the protein–ligand interaction profiler (PLIP), and key interaction residues are shown. Hydrogen bonds are shown in cyan, and π-stacking interactions are shown in magenta.
Figure 6
The Fragment Network identifies a known binder against EthR. A Fragment Network search using (a) two fragment hits (each of which binds in two different positions) against the Mycobacterium tuberculosis transcriptional repressor protein EthR identifies (b) a known binder (compound 4; IC50 value of >100 μM). (c) An alternative crystallographic arrangement of the equivalent fragments identifies (d) a compound similar to a known binder, compound 21 (IC50 value of 22 μM). Fragmenstein-predicted merge poses are shown in white, and crystal poses are in cyan. Hydrogen bonds are shown in cyan.
