ACS Med Chem Lett. 2011 Jan 5;2(3):201-6.
doi: 10.1021/ml100240z. eCollection 2011 Mar 10.

Extracting SAR Information from a Large Collection of Anti-Malarial Screening Hits by NSG-SPT Analysis

Mathias Wawer et al. ACS Med Chem Lett.

Abstract

We combine two graphical SAR analysis methods, Network-like Similarity Graphs (NSGs) and Similarity-Potency Trees (SPTs), to search for SAR information in a large and heterogeneous compound data set containing more than 13,000 antimalarial screening hits that was recently released by GlaxoSmithKline (GSK). The NSG-SPT approach first identifies subsets of compounds inducing local SAR discontinuity in data sets and then extracts available SAR information from these subsets in a graphically intuitive manner. Applying the NSG-SPT analysis scheme, we have identified in the GSK collection compound subsets of high local SAR information content including both known and previously unknown antimalarial chemotypes, which yielded interpretable SAR patterns. This information should be helpful to prioritize and select antimalarial candidate compounds for further chemical exploration. Furthermore, the NSG-SPT tools are publicly available, and our study also shows how to practically apply these SAR analysis methods to study large compound data sets.

Keywords: Anti-malaria screening hits; data mining; graphical SAR analysis; network-like similarity graphs; similarity-potency trees; structure−activity relationship (SAR) information.

Figures

Figure 1
NSG-SPT analysis scheme. On the left, an exemplary NSG is shown calculated for a set of known thrombin inhibitors. This NSG consists of several components that display different local SARs. Nodes represent individual compounds that are connected by edges if they share 2D similarity above a predefined threshold. The color and size of a node reflect the potency and the contribution to the local SAR discontinuity of the corresponding compound, respectively, as indicated below the graph. The highlighted region (compound subset) forms the most discontinuous local SAR and was subjected to SPT analysis. For this purpose, each compound is selected once as the root to build a set of overlapping trees. In each SPT, the remaining compounds are connected to the root on the basis of nearest neighbor similarity relationships. Two exemplary SPTs are shown on the right. These SPTs reveal horizontal and vertical SAR patterns that are highlighted. SPTs are ranked based on the occurrence of such patterns.
Figure 2
NSG-SPT analysis of the GSK data set. (a) NSG of the complete hit set. Two prominent regions of local SAR discontinuity are highlighted and shown in detail in part b. For these regions, corresponding highly ranked SPTs are provided in part c. Selected compounds are shown, and patterns that reflect significant SAR information are highlighted.
Figure 3
NSG-SPT analysis after removal of known antimalarial chemotypes. The NSG of the GSK data set is shown with known antimalarial chemotypes removed. The positions of the remaining nodes correspond to the layout in Figure 2a. To account for the removal of highly potent compounds, the potency-based coloring was adjusted to range from 1 μM (green) to 10 nM (red).
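
The two graph constructions described in the Figure 1 caption can be illustrated with a minimal Python sketch. It is not the authors' implementation or the published NSG-SPT tools. Several things here are assumptions made for illustration only: fingerprints are modelled as plain sets of feature identifiers, similarity is the Tanimoto coefficient, the threshold of 0.45 and the five toy compounds are invented, and the tree-building rule (visit compounds in order of decreasing similarity to the root and attach each to its most similar compound already in the tree) is one possible reading of "nearest neighbor similarity relationships". Potency coloring, discontinuity scoring, and SPT ranking are left out.

```python
from itertools import combinations


def tanimoto(a, b):
    """Tanimoto similarity of two feature sets (stand-ins for 2D fingerprints)."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def build_nsg(fingerprints, threshold):
    """Adjacency sets: an edge joins two compounds whose similarity exceeds the threshold."""
    graph = {name: set() for name in fingerprints}
    for u, v in combinations(fingerprints, 2):
        if tanimoto(fingerprints[u], fingerprints[v]) > threshold:
            graph[u].add(v)
            graph[v].add(u)
    return graph


def connected_components(graph):
    """Split the graph into its components (candidate local SAR regions)."""
    seen, components = set(), []
    for start in graph:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        components.append(component)
    return components


def build_spt(root, members, fingerprints):
    """Return {child: parent} for a tree rooted at `root`.

    Assumed rule: compounds are visited in order of decreasing similarity
    to the root, and each one attaches to its nearest neighbor among the
    compounds already placed in the tree.
    """
    others = sorted(
        (m for m in members if m != root),
        key=lambda m: (-tanimoto(fingerprints[root], fingerprints[m]), m),
    )
    placed, parent = [root], {}
    for compound in others:
        parent[compound] = max(
            placed, key=lambda p: tanimoto(fingerprints[compound], fingerprints[p])
        )
        placed.append(compound)
    return parent


# Hypothetical toy data: five compounds described by invented feature sets.
fingerprints = {
    "A": {1, 2, 3, 4},
    "B": {1, 2, 3, 5},
    "C": {1, 2, 5, 6},
    "D": {7, 8, 9},
    "E": {7, 8, 10},
}

nsg = build_nsg(fingerprints, threshold=0.45)
components = connected_components(nsg)
subset = next(c for c in components if "A" in c)

# One tree per compound in the subset, each compound serving once as the root.
spts = {root: build_spt(root, subset, fingerprints) for root in sorted(subset)}
```

With this toy data the graph splits into two components, and `spts["A"]` places B directly under A and C under B. In practice the fingerprints would come from a cheminformatics toolkit, and each tree would additionally be annotated with potency values before it is inspected for SAR patterns.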