Review
J R Soc Interface. 2009 Feb 6;6(31):129-47.
doi: 10.1098/rsif.2008.0341.

Genome and proteome annotation: organization, interpretation and integration

Gabrielle A Reeves et al. J R Soc Interface.

Abstract

Recent years have seen a huge increase in the generation of genomic and proteomic data, driven by improvements in existing biological methodologies, the development of new experimental techniques and the use of computers as support tools. These raw data are useless unless they can be properly analysed, annotated, stored and displayed. Consequently, a vast number of resources have been created to present the data to the wider community. Annotation tools and databases provide the means to disseminate these data and to comprehend their biological importance. This review examines three aspects of annotation: type, methodology and availability. It pays special attention to novel annotation fields, such as phenotype annotation, and highlights recent efforts to integrate annotations.

Figures

Figure 1
A selection of resources showing data for the human epidermal growth factor receptor (EGFR). (a) Ensembl (genomic information), (b) ASAP II (alternative splicing information), (c) Dasty2 protein DAS client (protein kinase domain), (d) InterProScan (functional annotation), (e) OMIM (disease information), (f) Reactome (EGFR signalling information): signalling by EGFR (Homo sapiens), (g) CATH domain database (protein kinase domain), (h) PDBsum (structural information) and (i) GO (MAP/ERK kinase kinase activity)
