2014 Feb 25:2014:bau012.
doi: 10.1093/database/bau012. Print 2014.

COMPARTMENTS: unification and visualization of protein subcellular localization evidence

Janos X Binder et al. Database (Oxford).

Abstract

Information on protein subcellular localization is important to understand the cellular functions of proteins. Currently, such information is manually curated from the literature, obtained from high-throughput microscopy-based screens and predicted from primary sequence. To get a comprehensive view of the localization of a protein, it is thus necessary to consult multiple databases and prediction tools. To address this, we present the COMPARTMENTS resource, which integrates all sources listed above as well as the results of automatic text mining. The resource is automatically kept up to date with source databases, and all localization evidence is mapped onto common protein identifiers and Gene Ontology terms. To further improve comparability, we assign confidence scores based on the type and source of the localization evidence. Finally, we visualize the unified localization evidence for a protein on a schematic cell to provide a simple overview. Database URL: http://compartments.jensenlab.org.

Figures

Figure 1.
Visualization of localization evidence. When querying the database for a protein, its localization is visualized on a schematic of a cell. When the user hovers the cursor over a compartment, we also graphically summarize the types of evidence supporting this localization. The confidence of the evidence is color coded, ranging from light green for low confidence to dark green for high confidence. White indicates an absence of localization evidence.
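The color coding described in this caption can be illustrated with a small sketch. The function name `confidence_color`, the RGB endpoints and the maximum score of 5 are all assumptions made for illustration; the caption only states that colors range from light green (low confidence) to dark green (high confidence), with white for no evidence.

```python
# Hypothetical color endpoints; the caption does not give exact values.
LIGHT_GREEN = (204, 255, 204)
DARK_GREEN = (0, 100, 0)
WHITE = "#ffffff"


def confidence_color(score, max_score=5.0):
    """Map a confidence score to a hex color.

    max_score is an assumed upper bound of the confidence scale.
    A missing or non-positive score is treated as "no evidence" (white).
    """
    if score is None or score <= 0:
        return WHITE
    # Fraction of the way from light green to dark green, capped at 1.
    t = min(score, max_score) / max_score
    rgb = [round(lo + (hi - lo) * t) for lo, hi in zip(LIGHT_GREEN, DARK_GREEN)]
    return "#{:02x}{:02x}{:02x}".format(*rgb)
```

A linear interpolation is the simplest choice here; the actual resource may use discrete color steps instead.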
Figure 2.
Overlap between the knowledge, experimental and text-mining evidence for human proteins. The Venn diagram shows the number of proteins with localization evidence from one or more of the three types of evidence. The two sequence-based prediction methods are not included as they are able to provide a prediction for any protein sequence.
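As a rough sketch of how the counts behind such a Venn diagram could be derived, the snippet below tallies proteins by the combination of evidence types that support them. The function name `venn3_counts` and the protein identifiers in the usage example are hypothetical and not taken from the resource.

```python
def venn3_counts(knowledge, experiments, textmining):
    """Count proteins in each region of a three-set Venn diagram.

    Each argument is an iterable of protein identifiers that have
    localization evidence of that type. The result maps a sorted tuple
    of evidence-type names to the number of proteins supported by
    exactly those types.
    """
    sets = {
        "knowledge": set(knowledge),
        "experiments": set(experiments),
        "textmining": set(textmining),
    }
    counts = {}
    for protein in set().union(*sets.values()):
        region = tuple(sorted(name for name, members in sets.items()
                              if protein in members))
        counts[region] = counts.get(region, 0) + 1
    return counts


# Toy usage with made-up identifiers.
toy = venn3_counts({"P1", "P2", "P3"}, {"P2", "P3", "P4"}, {"P3", "P5"})
```

Regions with no proteins are simply absent from the returned dictionary.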
Figure 3.
Benchmark of text-mining results. The performance of the text-mining pipeline on human and yeast proteins is shown as receiver operating characteristic (ROC) curves for each of 11 compartments. The curves do not reach the point where sensitivity = 1.0 and false-positive rate (FPR) = 1.0 because many of the protein–compartment pairs in the benchmark set are never mentioned together in Medline and therefore have no text-mining score.
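The reason the curves stop short of the upper-right corner can be seen in a minimal sketch of a ROC computation in which unscored pairs count toward the benchmark totals but can never be called at any threshold. The function name `roc_points` and the toy numbers are assumptions for illustration, not the benchmark used in the paper.

```python
def roc_points(scored, n_pos_total, n_neg_total):
    """Compute ROC points as (FPR, sensitivity) tuples.

    scored: list of (score, is_positive) for pairs that have a score.
    n_pos_total, n_neg_total: sizes of the positive and negative
    benchmark sets, INCLUDING pairs that have no score at all.
    """
    points = [(0.0, 0.0)]
    tp = fp = 0
    ordered = sorted(scored, key=lambda item: -item[0])
    i = 0
    while i < len(ordered):
        threshold = ordered[i][0]
        # Process all pairs tied at this score before emitting a point.
        while i < len(ordered) and ordered[i][0] == threshold:
            if ordered[i][1]:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / n_neg_total, tp / n_pos_total))
    return points


# Toy usage: 3 scored pairs out of 4 positives and 5 negatives in total.
curve = roc_points([(0.9, True), (0.8, False), (0.7, True)], 4, 5)
```

In this toy case the last point is (0.2, 0.5): two positives and four negatives have no score, so neither axis can reach 1.0.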
Figure 4.
Compartment relationships derived from shared proteins. To illustrate the usefulness of COMPARTMENTS for global analysis of protein localization, we studied relationships between compartments. Each node represents a single compartment, which is highlighted in green. The number of proteins in the compartment is shown in parentheses. We show an edge between two compartments whenever they share more proteins than expected at random (false discovery rate <0.1%). The number of proteins co-localized to the two compartments is shown next to the edge.
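One way such edges could be derived is sketched below. The caption does not say which statistical test was used, so this sketch assumes a one-sided hypergeometric test for the overlap and Benjamini-Hochberg correction for the false discovery rate; all function names and the toy data are hypothetical.

```python
from itertools import combinations
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k) when n proteins are drawn from N, of which K are 'marked'."""
    total = comb(N, n)
    return sum(comb(K, i) * comb(N - K, n - i)
               for i in range(k, min(K, n) + 1)) / total


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values (q-values)."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues


def enriched_edges(compartments, n_proteins, fdr=0.001):
    """Map each significant compartment pair to its shared-protein count.

    compartments: dict of compartment name -> set of protein identifiers.
    n_proteins: size of the assumed background protein universe.
    """
    pairs = list(combinations(sorted(compartments), 2))
    pvals = [hypergeom_sf(len(compartments[a] & compartments[b]),
                          n_proteins, len(compartments[a]), len(compartments[b]))
             for a, b in pairs]
    qvals = benjamini_hochberg(pvals)
    return {pair: len(compartments[pair[0]] & compartments[pair[1]])
            for pair, q in zip(pairs, qvals) if q < fdr}


# Toy usage: A and B overlap heavily, C is disjoint from both.
toy_edges = enriched_edges(
    {"A": set(range(0, 50)), "B": set(range(25, 75)), "C": set(range(500, 550))},
    n_proteins=1000,
)
```

The choice of background size matters: using all proteins in the organism versus only proteins with some localization evidence would change which overlaps count as more than expected at random.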
