J Comput Aided Mol Des. 2013 May;27(5):455-68.
doi: 10.1007/s10822-013-9641-y. Epub 2013 Apr 13.

An informatic pipeline for managing high-throughput screening experiments and analyzing data from stereochemically diverse libraries

Carol A Mulrooney et al. J Comput Aided Mol Des. 2013 May.

Abstract

Integration of flexible data-analysis tools with cheminformatics methods is a prerequisite for successful identification and validation of "hits" in high-throughput screening (HTS) campaigns. We have designed, developed, and implemented a suite of robust yet flexible cheminformatics tools to support HTS activities at the Broad Institute, three of which are described herein. The "hit-calling" tool allows a researcher to set a hit threshold that can be varied during downstream analysis. The results from the hit-calling exercise are reported to a database for record keeping and further data analysis. The "cherry-picking" tool enables creation of an optimized list of hits for confirmatory and follow-up assays from an HTS hit list. This tool allows filtering by computed chemical property and by substructure. In addition, similarity searches can be performed on hits of interest and sets of related compounds can be selected. The third tool, an "S/SAR viewer," has been designed specifically for the Broad Institute's diversity-oriented synthesis (DOS) collection. The compounds in this collection are rich in chiral centers and the full complement of all possible stereoisomers of a given compound are present in the collection. The S/SAR viewer allows rapid identification of both structure/activity relationships and stereo-structure/activity relationships present in HTS data from the DOS collection. Together, these tools enable the prioritization and analysis of hits from diverse compound collections, and enable informed decisions for follow-up biology and chemistry efforts.
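To make the "full complement of all possible stereoisomers" concrete, the sketch below enumerates the configurations a core with n stereocenters would contribute. It assumes each center is independently R or S and ignores symmetry (meso forms), so it gives an upper bound of 2^n. The function name `stereo_configurations` is hypothetical and is not part of the tools described in the abstract.

```python
from itertools import product


def stereo_configurations(n_centers):
    """Return every R/S label string for n independent stereocenters.

    Assumption: centers are independent and no two label strings describe
    the same molecule, so the result has 2**n_centers entries.
    """
    return ["".join(labels) for labels in product("RS", repeat=n_centers)]


if __name__ == "__main__":
    for config in stereo_configurations(3):
        print(config)
```

A core with three stereocenters would therefore appear as up to eight stereoisomers, which is the set a stereo-structure/activity comparison would range over.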


Figures

Fig. 1
An overview of activities following HTS highlighting corresponding tools
Fig. 2
Hit calling in the Spotfire workflow. The screen is divided into sections a–d. (a) Drop-down menus permit the user to display various statistical measurements (e.g., mean or median activity) as solid lines in the scatter plot (b). Deviation thresholds selected by the user will be displayed as dotted lines in the scatter plot. Threshold testing allows for quick selection of wells meeting the input criterion to facilitate decision making on large subsets of data. (b) Data points representing individual wells are color-coded by well contents for easier analysis. The solid lines represent the testing threshold, statistical measurement, and associated deviations input by the user. The solid green line is the activity threshold. The remaining solid lines are the statistical median for the various data subsets: red: median test compound activity; orange: median neutral control compound activity; blue: median inhibitor control compound activity. The colored dotted lines show the median absolute deviation for the corresponding median values. (c) Individual wells or groups of wells can be masked, discarding these points from further analysis. The user can also override the experimental result and impose a decision (e.g., active, inactive, or inconclusive) on compounds that have a replicate in a masked well. (d) Tabulated summaries of well contents are listed for all unmasked wells and selected wells
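The median and median-absolute-deviation (MAD) thresholds described in this caption can be sketched in a few lines. This is only an illustration, not the Spotfire workflow itself: the function names, the multiplier `k`, the choice of the neutral-control wells as the reference population, and the convention that lower values mean more activity are all assumptions.

```python
from statistics import median


def mad(values):
    """Unscaled median absolute deviation of a sequence of numbers."""
    values = list(values)
    center = median(values)
    return median(abs(v - center) for v in values)


def call_hits(activities, neutral_controls, k=3.0, masked=frozenset()):
    """Label each unmasked well 'active' or 'inactive'.

    activities: dict mapping well id -> measured activity.
    neutral_controls: activities of the neutral-control wells.
    k: hypothetical number of MADs below the control median that
       defines the activity threshold (lower value = more active).
    masked: well ids excluded from analysis.
    """
    threshold = median(neutral_controls) - k * mad(neutral_controls)
    calls = {
        well: ("active" if value <= threshold else "inactive")
        for well, value in activities.items()
        if well not in masked
    }
    return calls, threshold


if __name__ == "__main__":
    controls = [0.0, 1.0, -1.0, 2.0, -2.0]
    wells = {"A1": -5.0, "A2": -1.0, "A3": -3.0}
    print(call_hits(wells, controls, k=3.0))
```

Because the threshold is recomputed from `k` on each call, it can be varied during downstream analysis without touching the raw well data.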
Fig. 3
Substructure filtering facilitates the removal of compounds bearing undesirable functional groups. The screen is divided into sections a–e. (a) Master list of 52 functional group filters that can be individually applied by the user. (b) Display of currently selected filter(s). (c) Tabular summary of how many unique compounds contain the corresponding functional group. (d) Summary of the decisions made (pick/discard). (e) Structure viewer to display specific compounds associated with the selected functional group filter
Fig. 4
Physical properties can be readily analyzed and correlated to bioactivity. The screen is divided into sections a–e. (a) Convenient drop-down lists allow the user to change the horizontal and vertical axes to display bioactivity or any of 7 different physical properties. In this example, c-logP is plotted against exact mass. (b) Data points are displayed in multiple colors and unique shapes, quickly summarizing cherry-pick decisions made across multiple pages. Here, activity-based decisions (pick or force pick) are displayed. Green squares represent compounds with acceptable bioactivity and physical properties. Red crosses are compounds with acceptable bioactivity but that fail one or more physical property filters. Blue stars are compounds that were force-picked on the bioactivity page with acceptable physical properties. Green pluses are force-picked compounds that do not meet the c-logP criterion. (c) A summary of the decisions made (pick/discard) from this page is listed here. (d) Tool tips displaying bioactivity, hit-calling and cherry-pick decisions, selected physical properties and structure can be opened by placing the cursor over individual data points. (e) Detailed compound information is displayed in a table for highlighted data points
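The four marker categories in panel b amount to crossing an activity decision (pick or force pick) with a pass/fail on the property filters. A minimal sketch of that bookkeeping follows. The `Compound` fields, the category strings, and the numeric limits are hypothetical; the caption does not state the actual cutoffs, and it mentions only the c-logP criterion for force-picked failures, whereas this sketch treats any failed filter the same way.

```python
from dataclasses import dataclass


@dataclass
class Compound:
    name: str
    active: bool          # met the bioactivity threshold
    force_picked: bool    # manually picked on the bioactivity page
    properties: dict      # e.g. {"clogp": 2.1, "exact_mass": 340.2}


def failed_filters(compound, limits):
    """Names of properties falling outside their (low, high) limits."""
    return sorted(
        name
        for name, (low, high) in limits.items()
        if not low <= compound.properties[name] <= high
    )


def classify(compound, limits):
    """Map a compound to one of the decision categories."""
    failed = failed_filters(compound, limits)
    if compound.force_picked:
        return "force-pick, fails filters" if failed else "force-pick, passes filters"
    if compound.active:
        return "active, fails filters" if failed else "pick"
    return "discard"


if __name__ == "__main__":
    # Illustrative limits only; not taken from the paper.
    limits = {"clogp": (-1.0, 5.0), "exact_mass": (150.0, 500.0)}
    example = Compound("cpd-1", True, False, {"clogp": 6.5, "exact_mass": 350.0})
    print(classify(example, limits), failed_filters(example, limits))
```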
Fig. 5
Hit-calling/cherry-picking workflow
Fig. 6
(a) Structure with R-group and stereocenters labeled for a core (from the Head-to-Tail Library [6c] within the DOS collection). (b) ChemAxon extended SMILES encoding the structure, R-group, and stereocenter labels
Fig. 7
An example heat map, axes labeled with R-group structures and stereocenter configuration
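A heat map of this kind is essentially a pivot of per-compound activity onto two axes, one for R-group and one for stereocenter configuration. The sketch below shows one way such a grid could be assembled. The record layout `(r_group, stereo_config, activity)` and the function name `build_heatmap` are assumptions for illustration, not the S/SAR viewer's actual data model.

```python
def build_heatmap(records):
    """Pivot (r_group, stereo_config, activity) records into a grid.

    Returns (row_labels, column_labels, grid) where grid[i][j] is the
    activity for row_labels[i] and column_labels[j], or None when that
    combination was not measured.
    """
    rows = sorted({r_group for r_group, _, _ in records})
    cols = sorted({stereo for _, stereo, _ in records})
    cell = {(r_group, stereo): activity for r_group, stereo, activity in records}
    grid = [[cell.get((r, s)) for s in cols] for r in rows]
    return rows, cols, grid


if __name__ == "__main__":
    data = [("R1", "RR", 0.9), ("R1", "SS", 0.1), ("R2", "RR", 0.2)]
    rows, cols, grid = build_heatmap(data)
    print(cols)
    for label, values in zip(rows, grid):
        print(label, values)
```

Reading along a row compares stereoisomers that share an R-group, and reading down a column compares R-groups at a fixed stereochemical configuration.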
Fig. 8
(a) An example view of hits with selective SAR and SSAR. (b) A view of selective SAR but non-selective SSAR
Fig. 9
(a) View of data from the β-cell apoptosis HTS. (b) A magnified view of the hits including R-group structure. (c) Key to core structure and stereochemical assignments

References

    1. Mayr LM, Fuerst P. The future of high-throughput screening. J Biomol Screen. 2008;13:443–448.
    2. Seiler KP, George GA, Happ MP, Bodycombe NE, Carrinski HA, Norton S, Brudz S, Sullivan JP, Muhlich J, Serrano M, Ferraiolo P, Tolliday NJ, Schreiber SL, Clemons PA. ChemBank: a small-molecule screening and cheminformatics resource database. Nucleic Acids Res. 2008;36:D351–D359.
    3. Agrafiotis DK, Alex S, Dai H, Derkinderen A, Farnum M, Gates P, Izrailev S, Jaeger EP, Konstant P, Leung A, Lobanov VS, Marichal P, Martin D, Rassokhin DN, Shemanarev M, Skalkin A, Stong J, Tabruyn T, Vermeiren M, Wan J, Xu XY, Yao X. Advanced biological and chemical discovery (ABCD): centralizing discovery knowledge in an inherently decentralized world. J Chem Inf Model. 2007;47:1999–2014.
    4. Sander T, Freyss J, von Korff M, Reich JR, Rufener C. OSIRIS, an entirely in-house developed drug discovery informatics system. J Chem Inf Model. 2009;49:232–246.
    5. MLPCN website. http://mli.nih.gov/mli/mlpcn. Accessed 7 Jan 2013.
