J Cheminform. 2022 Dec 28;14(1):87.
doi: 10.1186/s13321-022-00664-x.

Visualizing chemical space networks with RDKit and NetworkX

Vincent F Scalfani et al. J Cheminform. 2022.

Abstract

This article demonstrates how to create Chemical Space Networks (CSNs) using a Python workflow built on RDKit and NetworkX. A CSN is a network visualization that depicts compounds as nodes connected by edges, where each edge represents a pairwise relationship such as a 2D fingerprint similarity value. A step-by-step approach is presented for creating two different CSNs: one based on RDKit 2D fingerprint Tanimoto similarity values, and another based on maximum common substructure similarity values. The tutorial covers several CSN visualization features, including coloring nodes by bioactivity attribute value, drawing edges with different line styles based on similarity value, and replacing the circle nodes with 2D structure depictions. Finally, some common network property and analysis calculations are presented, including the clustering coefficient, degree assortativity, and modularity. All code is provided as Jupyter Notebooks and is available on GitHub under a permissive BSD-3 open-source license: https://github.com/vfscalfani/CSN_tutorial.
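
The fingerprint-based variant described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' notebook code: the toy SMILES dictionary, the `build_csn` helper name, and the 0.3 threshold are assumptions chosen for the example, and `Chem.RDKFingerprint` is used here as one plausible RDKit 2D fingerprint.

```python
from itertools import combinations

import networkx as nx
from rdkit import Chem, DataStructs

# Hypothetical toy input; the article itself works with a glucocorticoid dataset.
smiles = {
    "ethanol": "CCO",
    "propanol": "CCCO",
    "butanol": "CCCCO",
    "benzene": "c1ccccc1",
    "toluene": "Cc1ccccc1",
}


def build_csn(smiles_by_name, threshold):
    """Build a graph whose edges link compounds with Tanimoto similarity >= threshold."""
    # One RDKit 2D fingerprint per compound (an assumed fingerprint choice).
    fps = {
        name: Chem.RDKFingerprint(Chem.MolFromSmiles(smi))
        for name, smi in smiles_by_name.items()
    }
    graph = nx.Graph()
    graph.add_nodes_from(fps)  # keep compounds that end up with no edges
    for a, b in combinations(fps, 2):
        sim = DataStructs.TanimotoSimilarity(fps[a], fps[b])
        if sim >= threshold:
            graph.add_edge(a, b, weight=sim)
    return graph


csn = build_csn(smiles, threshold=0.3)
print("nodes:", csn.number_of_nodes(), "edges:", csn.number_of_edges())
print("average clustering coefficient:", nx.average_clustering(csn))
```

Raising the threshold thins the network, so in practice the value would be tuned to the dataset rather than fixed in advance.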

Keywords: CSN; Chemical similarity network; Chemical space network; Maximum common substructure; Molecular similarity; NetworkX; RDKit.

Conflict of interest statement

The authors declare they have no competing interests.

Figures

Fig. 1
A basic spring layout CSN (Tanimoto Similarity variant) with the glucocorticoid dataset compounds. The Tanimoto Similarity threshold was set to >= 0.68.
Fig. 2
A spring layout CSN (Tanimoto Similarity variant) with the glucocorticoid dataset compounds. The Tanimoto Similarity threshold was set to >= 0.68. Node color represents pKi value.
Fig. 3
A spring layout CSN component (MCS Similarity variant) with the glucocorticoid dataset compounds. The nodes are labeled with index values, node color represents pKi value, and line style depends on MCS-based similarity value.
Fig. 4
A spring layout CSN component (MCS Similarity variant) with the glucocorticoid dataset compounds. The nodes are plotted as 2D compound images and line style depends on MCS-based similarity value.
Fig. 5
A spring layout CSN component (MCS Similarity variant) with the glucocorticoid dataset compounds. The nodes are plotted as 2D compound images with color highlighting for pKi values, and line style depends on MCS-based similarity value.
Fig. 6
A spring layout CSN (MCS Similarity variant) with the glucocorticoid dataset compounds. The nodes are colored by community cluster, detected using the NetworkX greedy_modularity_communities function.
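
The community coloring named in the Fig. 6 caption can be illustrated with a small NetworkX-only sketch. The toy graph below (two four-node cliques joined by one bridge edge) is an assumption standing in for a real CSN, and the variable names are hypothetical. Only `greedy_modularity_communities` is named in the caption; `modularity` is a companion NetworkX function used here to score the partition.

```python
import networkx as nx
from networkx.algorithms.community import (
    greedy_modularity_communities,
    modularity,
)

# Toy stand-in for a CSN: two 4-node cliques connected by a single bridge edge.
graph = nx.Graph()
graph.add_edges_from(nx.complete_graph(range(0, 4)).edges)
graph.add_edges_from(nx.complete_graph(range(4, 8)).edges)
graph.add_edge(3, 4)

# Each community is returned as a frozenset of node labels.
communities = greedy_modularity_communities(graph)

# Map every node to the index of its community; the indices can serve as
# color values (for example via node_color in nx.draw with a colormap).
community_of = {
    node: idx for idx, members in enumerate(communities) for node in members
}
node_colors = [community_of[node] for node in graph.nodes]

score = modularity(graph, communities)
print("communities:", [sorted(c) for c in communities])
print("modularity:", round(score, 3))
```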
