Database (Oxford). 2012 Feb 1;2012:bar068.
doi: 10.1093/database/bar068. Print 2012.

Manual GO annotation of predictive protein signatures: the InterPro approach to GO curation

Sarah Burge et al. Database (Oxford). 2012.

Abstract

InterPro amalgamates predictive protein signatures from a number of well-known partner databases into a single resource. To aid interpretation of results, InterPro entries are manually annotated with terms from the Gene Ontology (GO). The InterPro2GO mappings comprise the cross-references between these two resources and are the largest source of GO annotation predictions for proteins. Here, we describe the protocol by which InterPro curators integrate GO terms into the InterPro database. We discuss the unique challenges involved in integrating specific GO terms with entries that may describe a diverse set of proteins, and we illustrate, with examples, how InterPro hierarchies reflect GO terms of increasing specificity. We describe a revised protocol for GO mapping that enables us to assign GO terms to domains based on the function of the individual domain, rather than the function of the families in which the domain is found. We also discuss how taxonomic constraints are handled and the cases where we are unable to add any appropriate GO terms. Expert manual annotation of InterPro entries with GO terms enables users to infer function, process or subcellular information for uncharacterized sequences based on sequence matches to predictive models. Database URL: http://www.ebi.ac.uk/interpro. The complete InterPro2GO mappings are available at: ftp://ftp.ebi.ac.uk/pub/databases/GO/goa/external2go/interpro2go.
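
As an illustration of how the InterPro2GO mappings might be consumed, the following is a minimal Python sketch that groups GO terms by InterPro accession. It assumes the file follows the usual external2go text layout, with comment lines starting with `!` and mapping lines of the form `InterPro:IPRnnnnnn entry name > GO:term name ; GO:nnnnnnn`. That layout is an assumption here and is not described in the abstract. The function name `parse_interpro2go` and the sample lines are hypothetical placeholders, not real InterPro or GO records.

```python
import re
from collections import defaultdict

# Assumed shape of one mapping line (not specified in the abstract):
#   InterPro:IPRnnnnnn <entry name> > GO:<term name> ; GO:nnnnnnn
LINE_RE = re.compile(
    r"^InterPro:(IPR\d{6})\s+(.*?)\s+>\s+GO:(.*?)\s+;\s+(GO:\d{7})\s*$"
)


def parse_interpro2go(lines):
    """Group (GO id, GO term name) pairs by InterPro accession.

    Blank lines, comment lines starting with '!', and lines that do not
    match the assumed layout are skipped.
    """
    mappings = defaultdict(list)
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("!"):
            continue
        match = LINE_RE.match(line)
        if match is None:
            continue
        accession, _entry_name, go_name, go_id = match.groups()
        mappings[accession].append((go_id, go_name))
    return dict(mappings)


# Hypothetical sample lines: accessions, names and GO ids are placeholders.
SAMPLE = [
    "!placeholder header comment",
    "InterPro:IPR999901 Hypothetical kinase family > GO:hypothetical kinase activity ; GO:9999901",
    "InterPro:IPR999901 Hypothetical kinase family > GO:hypothetical signalling process ; GO:9999902",
    "InterPro:IPR999902 Hypothetical binding domain > GO:hypothetical binding ; GO:9999903",
]

if __name__ == "__main__":
    for accession, terms in parse_interpro2go(SAMPLE).items():
        print(accession, terms)
```

To run this against the real file, the lines of a locally downloaded copy could be passed in place of `SAMPLE`, for example via `open(path, encoding="utf-8")`. Whether every line of the published file matches the pattern above would need checking against that file.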

Figures

Figure 1. Flowchart outlining the decision process taken by InterPro curators in order to assign GO terms.
Figure 2. Application of GO molecular function terms to IPR002201 and its child entries. IPR002201 is a more general entry, which encompasses the proteins matched by its three child entries, IPR011908, IPR011910 and IPR011916. The increased specificity of the child entry can be reflected in the GO annotation; IPR011908 has a more specific Molecular Function term than the parent entry IPR002201.
Figure 3. Complementary domain and family GO mapping for InterPro entries that match the human cellular tumour antigen p53. Domain GO annotation enables the function(s) of the family to be attributed to individual domains within the protein.

