Nucleic Acids Res. 2008 Jan;36(Database issue):D351-9.
doi: 10.1093/nar/gkm843. Epub 2007 Oct 18.

ChemBank: a small-molecule screening and cheminformatics resource database

Kathleen Petri Seiler et al. Nucleic Acids Res. 2008 Jan.

Abstract

ChemBank (http://chembank.broad.harvard.edu/) is a public, web-based informatics environment developed through a collaboration between the Chemical Biology Program and Platform at the Broad Institute of Harvard and MIT. This knowledge environment includes freely available data derived from small molecules and small-molecule screens and resources for studying these data. ChemBank is unique among small-molecule databases in its dedication to the storage of raw screening data, its rigorous definition of screening experiments in terms of statistical hypothesis testing, and its metadata-based organization of screening experiments into projects involving collections of related assays. ChemBank stores an increasingly varied set of measurements derived from cells and other biological assay systems treated with small molecules. Analysis tools are available and are continuously being developed that allow the relationships between small molecules, cell measurements, and cell states to be studied. Currently, ChemBank stores information on hundreds of thousands of small molecules and hundreds of biomedically relevant assays that have been performed at the Broad Institute by collaborators from the worldwide research community. The goal of ChemBank is to provide life scientists unfettered access to biomedically relevant data and tools heretofore available primarily in the private sector.

Figures

Figure 1.
Conceptual summary of ChemBank schema. Logical illustration of ChemBank data model, in which 95 tables are organized into groups representing components of the chemical biology research enterprise. Each box represents several actual database tables, as indicated, and pseudocardinality relationships between boxes are meant to convey conceptual relationships, rather than the more complex cardinality relationships that relate the actual tables.
Figure 2.
ChemBank offers multiple routes to find chemical information. Search tools allowing structure drawing (28) for substructure or similarity searches (a), selection of calculated molecular descriptors (b) and selection of term-based bioactivity annotations (c), each provide avenues to find individual molecules or sets of molecules in ChemBank. The ChemBank ‘Molecule Display’ webpage (background) provides detailed information about each molecule, including structure, names, molecular descriptors, biological annotations, sample information and screening instances.
Figure 3.
Relationship of ChemBank ‘View Project’ and ‘View Assay’ webpages. Screenshots of representative screening project and assay (inset) webpages. Emphasis (red boxes, arrow) has been added to highlight key information, including project description and motivation (a), individual assays within project (b), detailed description (shared by both webpages) of assay protocol (c) and individual screening plates within assay (d).
Figure 4.
ChemBank standard data-analysis model for high-throughput small-molecule screens. All raw small-molecule assay results in ChemBank are further processed by comparing each measurement with the collection of mock-treatment well measurements performed in the same screening experiment. Median values from mock-treatment wells on the same plate are used in an initial zero-centering step (a), after which the distribution of mock-treatment measurements for the entire experiment is trimmed to eliminate systematic artifacts (b). Trimmed mock-treatment measurements are used to normalize assay performance by first subtracting the mean of trimmed mock-treatment measurements on the same plate to give ‘background-subtracted values’ (c), then dividing by twice the standard deviation of trimmed mock-treatment measurements for the entire experiment to give ‘dimensionless Z-score values’ (d). Replicate handling is performed by cosine correlation of the replicate pair (for screens done in duplicate) of ‘dimensionless Z-score values’ for each compound with a simple prior model of ‘perfect reproducibility’, to yield a ‘Composite Z-score value’ (e) that represents the final primary screening result. The ChemBank web interface provides access to raw and processed data types appropriate for each of its visualization tools (f).
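The normalization steps (a) to (e) in this caption can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not ChemBank's actual implementation: the caption does not give the trimming rule, so the sketch drops a symmetric fraction from each tail (`trim_fraction` is a hypothetical parameter); the well records with `plate`, `value` and `is_mock` keys and the function names are likewise invented for illustration. The composite score is read here as the projection of the replicate pair onto the ‘perfect reproducibility’ direction (1, 1), with the cosine reported alongside it; that reading is also an assumption.

```python
import math
import statistics
from collections import defaultdict


def normalize_screen(wells, trim_fraction=0.05):
    """Return one Z-score per well, in input order.

    wells: list of dicts with keys 'plate', 'value', 'is_mock' (hypothetical layout).
    Assumes every plate keeps at least one mock well after trimming.
    """
    # (a) zero-center: median of the mock-treatment wells on each plate
    mock_by_plate = defaultdict(list)
    for w in wells:
        if w["is_mock"]:
            mock_by_plate[w["plate"]].append(w["value"])
    plate_median = {p: statistics.median(v) for p, v in mock_by_plate.items()}

    # (b) pool zero-centered mock wells over the whole experiment and
    #     trim both tails (assumed rule; the caption does not specify one)
    centered = sorted(
        (w["value"] - plate_median[w["plate"]], i)
        for i, w in enumerate(wells)
        if w["is_mock"]
    )
    k = int(len(centered) * trim_fraction)
    kept = [i for _, i in centered[k:len(centered) - k]]

    # (c) background subtraction: per-plate mean of the surviving mock wells
    kept_by_plate = defaultdict(list)
    for i in kept:
        kept_by_plate[wells[i]["plate"]].append(wells[i]["value"])
    plate_mean = {p: statistics.mean(v) for p, v in kept_by_plate.items()}
    background_subtracted = [w["value"] - plate_mean[w["plate"]] for w in wells]

    # (d) divide by twice the experiment-wide SD of the trimmed mock wells
    sd = statistics.stdev([background_subtracted[i] for i in kept])
    return [b / (2.0 * sd) for b in background_subtracted]


def composite_z(z_a, z_b):
    """(e) Project a replicate pair onto the (1, 1) direction.

    Returns (projection, cosine); cosine is 1.0 for identical replicates.
    """
    projection = (z_a + z_b) / math.sqrt(2.0)
    norm = math.hypot(z_a, z_b)
    cosine = projection / norm if norm else 0.0
    return projection, cosine
```

With this scaling, trimmed mock-treatment wells end up with a standard deviation of 0.5 in Z-score units, and a compound whose two replicates agree exactly has a cosine of 1.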
Figure 5.
Illustration of ChemBank visualizations and linking activities with chemical information. Screening data, including raw measurements, in ChemBank are addressable by exact plate and well position in assay plates (a), and statistical data representing outcomes (b) can be reviewed at the level of raw or normalized data. A multi-assay analysis capability takes advantage of the standard analysis procedure (Figure 4) to display the performance of similar compounds across the multiple assays to which each has been exposed (c). Each of these capabilities can be combined with structure- and annotation-based search capabilities to provide cheminformatic analysis of molecules scoring as ‘hits’ in biological assays (d).
