Nucleic Acids Res. 2007 Jan;35(Database issue):D322-7.
doi: 10.1093/nar/gkl799. Epub 2006 Nov 10.

GO PaD: the Gene Ontology Partition Database


Gil Alterovitz et al. Nucleic Acids Res. 2007 Jan.

Abstract

Gene Ontology (GO) has been widely used to infer the functional significance of sets of genes in order to automate discoveries within large-scale genetic studies. A term's level in GO's directed acyclic graph structure is often assumed to indicate its specificity, although other work has suggested that this assumption does not hold. Unfortunately, quantitative analysis of biological functions based on nodes at the same level (as is common in gene enrichment analysis tools) can lead to incorrect conclusions as well as missed discoveries due to inefficient use of available information. This paper addresses these problems using an information-theoretic approach, encoded in the GO Partition Database, that is guaranteed to maximize information for gene enrichment analysis. The GO Partition Database was designed to feature ontology partitions whose GO terms have similar specificity. The GO partitions comprise varying numbers of nodes and present relevant information-theoretic statistics, so researchers can choose to analyze datasets at arbitrary levels of specificity. The GO Partition Database, featuring GO partition sets for functional analysis of genes from human and 10 other commonly studied organisms with a total of 131,972 genes, is available on the internet at: bcl.med.harvard.edu/proj/gopart. The site also includes an online tutorial.
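
To illustrate how a term's specificity might be expressed in bits, the sketch below uses the standard Shannon self-information, -log2(p), where p is the fraction of genes annotated by a term. The abstract does not state the database's exact formula, so this definition, the function name information_bits, and the annotation counts are assumptions chosen only for illustration.

```python
import math


def information_bits(genes_with_term, total_genes):
    """Self-information, in bits, of a term annotating a fraction of genes.

    Assumption: specificity is modeled as -log2(p); the database's own
    statistic may be defined differently.
    """
    if not 0 < genes_with_term <= total_genes:
        raise ValueError("need 0 < genes_with_term <= total_genes")
    return -math.log2(genes_with_term / total_genes)


# Hypothetical annotation counts out of 1024 genes (not real GO data).
total = 1024
counts = {"broad_term": 512, "mid_term": 64, "narrow_term": 4}
bits = {term: information_bits(n, total) for term, n in counts.items()}

for term, value in bits.items():
    print(f"{term}: {value:.1f} bits")
```

Under this assumption, a term that annotates fewer genes carries more bits, so two terms at the same graph level can differ widely in specificity if their annotation counts differ.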

Figures

Figure 1
The specificity of GO terms can be captured in terms of bits of information. Heterocycle metabolism image courtesy of: Dr. Brent P. Krueger (14).
Figure 2
The GO Partition Database has an array of features from customized queries to several export options.
Figure 3
(a) GO term partition with six GO terms selected including: regulation of metabolism (19222), response to stimulus (50896), transcription (6350), transport (6810), biopolymer metabolism (43283) and organismal physiological process (50874). (b) Visual gene enrichment for transport is evident in these GenMAPP proteins involved in oxidative phosphorylation. Green circles represent proteins (displaying UniProtKB accessions) and rectangles contain the GO terms of the 6-node partition. An arrow going from a protein to a GO term indicates that the protein is annotated by that GO term. (c) Visual enrichment is shown based on GO graphical structure—leading to potentially misleading interpretations.
Figure 4
Histogram of GO level 3 versus GO partitions level 3 term information. This shows a tighter distribution for the GO partition-based information compared to that of graphical structure-derived GO level node information.
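
The "tighter distribution" described for Figure 4 can be quantified as a smaller spread of per-term information. The sketch below compares population standard deviations for two sets of values; both lists are hypothetical numbers invented for illustration and are not taken from the paper or the database.

```python
import statistics

# Hypothetical per-term information (bits); not data from the paper.
level_bits = [1.0, 3.0, 6.0, 10.0]      # terms sharing a GO graph level
partition_bits = [4.0, 5.0, 5.0, 6.0]   # terms sharing a GO partition

# A smaller standard deviation means terms of more similar specificity.
level_spread = statistics.pstdev(level_bits)
partition_spread = statistics.pstdev(partition_bits)

print(f"level spread:     {level_spread:.2f} bits")
print(f"partition spread: {partition_spread:.2f} bits")
```

Standard deviation is only one possible measure of spread; the figure itself shows the full histograms.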

References

    1. The GO Consortium. The Gene Ontology (GO) project in 2006. Nucleic Acids Res. 2006;34:D322–D326.
    2. Dennis G., Jr, Sherman B.T., Hosack D.A., Yang J., Gao W., Lane H.C., Lempicki R.A. DAVID: Database for Annotation, Visualization, and Integrated Discovery. Genome Biol. 2003;4:P3.
    3. Al-Shahrour F., Diaz-Uriarte R., Dopazo J. FatiGO: a web tool for finding significant associations of Gene Ontology terms with groups of genes. Bioinformatics. 2004;20:578–580.
    4. Zhou M., Cui Y. GeneInfoViz: constructing and visualizing gene relation networks. In Silico Biol. 2004;4:323–333.
    5. MacKay D.J.C. Information Theory, Inference, and Learning Algorithms. Cambridge, UK, New York: Cambridge University Press; 2003.