Review
Trends Microbiol. 2009 Jul;17(7):286-94.
doi: 10.1016/j.tim.2009.04.005. Epub 2009 Jul 2.

Functional annotations for the Saccharomyces cerevisiae genome: the knowns and the known unknowns

Karen R Christie et al. Trends Microbiol. 2009 Jul.

Abstract

The quest to characterize each of the genes of the yeast Saccharomyces cerevisiae has propelled the development and application of novel high-throughput (HTP) experimental techniques. To handle the enormous amount of information generated by these techniques, new bioinformatics tools and resources are needed. Gene Ontology (GO) annotations curated by the Saccharomyces Genome Database (SGD) have facilitated the development of algorithms that analyze HTP data and help predict functions for poorly characterized genes in S. cerevisiae and other organisms. Here, we describe how published results are incorporated into GO annotations at SGD and why researchers can benefit from using these resources wisely to analyze their HTP data and predict gene functions.

Figures

Figure 1
GO annotation types at SGD and sources of information. At SGD, GO annotations are based on a wide range of published literature. Each GO annotation is further categorized by annotation type: manually curated, high-throughput or computational [13].
(a) Manually curated GO annotations are made individually for each gene by curators who read the published literature describing experimental characterizations of that gene. We attempt to find experimental evidence whenever it is available. However, in our first pass through the genome, which aimed to generate at least one annotation in each GO vocabulary for all genes, we sometimes made annotations from reviews using the TAS (traceable author statement) code (Box 2). We are working to replace these with annotations from the primary experimental papers, using appropriate experimental evidence codes.
(b) Sequence-based predictions can be classified as either manually curated or computational GO annotations. Sequence similarity comparisons from published papers are categorized as manually curated GO annotations because an expert in the field generated the comparison and a curator read the publication to determine the appropriate annotation. Predictions generated by the Gene Ontology Annotation group at the European Bioinformatics Institute are categorized as computational annotations because they are not reviewed by curators.
(c) HTP annotations are made from published papers describing the results of HTP experimental techniques.
(d) Computational annotations are based on a variety of computational techniques, including sequence similarity and integrative analysis of experimental data. Users of computational methods that incorporate HTP data and sequence analysis should take care to remove GO annotations derived from the same source data, so that the information is not included more than once.
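The caution in panel (d) can be illustrated with a minimal sketch. It assumes a hypothetical in-memory representation in which each annotation is a dict with `gene`, `go_term` and `annotation_type` keys. The three type labels follow the caption; the field names, the placeholder records and the function name `exclude_annotation_types` are illustrative assumptions, not SGD's file format or API.

```python
# Annotation types named in the Figure 1 caption.
MANUAL = "manually curated"
HTP = "high-throughput"
COMPUTATIONAL = "computational"


def exclude_annotation_types(annotations, excluded_types):
    """Return only the annotations whose type is not in excluded_types."""
    excluded = set(excluded_types)
    return [a for a in annotations if a["annotation_type"] not in excluded]


# Hypothetical records: gene names and GO terms are placeholders, not real data.
annotations = [
    {"gene": "GENE_A", "go_term": "GO:term_1", "annotation_type": MANUAL},
    {"gene": "GENE_A", "go_term": "GO:term_2", "annotation_type": HTP},
    {"gene": "GENE_B", "go_term": "GO:term_3", "annotation_type": COMPUTATIONAL},
]

# A method that already takes the HTP data sets as direct input would drop the
# HTP and computational annotations, so the same evidence is not counted twice.
independent = exclude_annotation_types(annotations, [HTP, COMPUTATIONAL])
print(len(independent))  # 1
```

Which types to exclude depends on what the method already uses as input; the choice shown here is only one possibility.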
Figure 2
Computational predictions for the uncharacterized protein-coding genes in S. cerevisiae. Of the 5796 protein-coding genes, 1134 have no published information on either their molecular function (MF) or their biological process (BP). A biological process can be predicted for only 416 (36.7%) of these 1134 genes, and a molecular function for even fewer, only 211 (18.6%). For the majority (654, or 57.7%), no prediction can be made for either molecular function or biological process, leaving biologists without a hypothesis to test experimentally.
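The percentages in the Figure 2 caption are all relative to the 1134 uncharacterized genes, and the counts overlap, which is easy to miss. The short check below reproduces the percentages and derives the overlap by inclusion-exclusion. The derived figures (genes with any prediction, genes with both) are not stated in the caption; they follow from the reported counts under the assumption that every uncharacterized gene has a BP prediction, an MF prediction, both, or neither.

```python
# Counts as reported in the Figure 2 caption.
uncharacterized = 1134  # genes with no published MF or BP information
bp_predicted = 416      # genes with a biological process prediction
mf_predicted = 211      # genes with a molecular function prediction
no_prediction = 654     # genes with neither prediction


def pct(part, whole):
    """Percentage of whole, rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Derived by inclusion-exclusion (not stated in the caption).
any_prediction = uncharacterized - no_prediction        # BP or MF or both
both = bp_predicted + mf_predicted - any_prediction     # BP and MF

print(pct(bp_predicted, uncharacterized))   # 36.7
print(pct(mf_predicted, uncharacterized))   # 18.6
print(pct(no_prediction, uncharacterized))  # 57.7
print(any_prediction, both)                 # 480 147
```

The three reported percentages sum to more than 100% because the 147 genes with both kinds of prediction are counted in both the BP and the MF totals.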
Figure I
Examples of S. cerevisiae GO annotations. Each row is an example of a GO annotation, which includes a protein or RNA gene product, a GO term, a reference and an evidence code (Box 2). The ribbon diagrams of URA3 [70] and URA6 [71] were contributed to PDB [72].
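The four parts of a GO annotation listed in this caption map naturally onto a small record type. The following is a minimal sketch only: the class name `GOAnnotation`, its field names and the placeholder values are assumptions made for illustration and do not reflect SGD's actual data model. TAS is the evidence code mentioned in the Figure 1 caption.

```python
from dataclasses import dataclass, astuple


@dataclass(frozen=True)
class GOAnnotation:
    """One annotation row: the four components named in the caption."""
    gene_product: str   # protein or RNA gene product
    go_term: str        # term from one of the GO vocabularies
    reference: str      # publication supporting the annotation
    evidence_code: str  # e.g. "TAS" (traceable author statement)


# Placeholder values, not a real annotation.
row = GOAnnotation(
    gene_product="GENE_A",
    go_term="GO:term_1",
    reference="reference_1",
    evidence_code="TAS",
)

# Render the record as one tab-separated table row.
print("\t".join(astuple(row)))
```

Making the record immutable (`frozen=True`) is a design choice for the sketch: an annotation that changes, for example when a TAS annotation is replaced by one with experimental evidence, would be modeled as a new record.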

References

    1. Goffeau A, et al. Life with 6000 genes. Science. 1996;274:563–567.
    2. Jones GM, et al. A systematic library for comprehensive overexpression screens in Saccharomyces cerevisiae. Nat Methods. 2008;5:239–241.
    3. Huh WK, et al. Global analysis of protein localization in budding yeast. Nature. 2003;425:686–691.
    4. Winzeler EA, et al. Functional characterization of the S. cerevisiae genome by gene deletion and parallel analysis. Science. 1999;285:901–906.
    5. DeRisi JL, et al. Exploring the metabolic and genetic control of gene expression on a genomic scale. Science. 1997;278:680–686.