BMC Res Notes. 2011 Jul 28;4:267.
doi: 10.1186/1756-0500-4-267.

GO Trimming: Systematically reducing redundancy in large Gene Ontology datasets

Stuart G Jantzen et al. BMC Res Notes. 2011.

Abstract

Background: The increased accessibility of gene expression tools has enabled a wide variety of experiments utilizing transcriptomic analyses. As these tools increase in prevalence, the need for improved standardization in processing and presentation of data increases, as does the need to guard against interpretation bias. Gene Ontology (GO) analysis is a powerful method of interpreting and summarizing biological functions. However, while there are many tools available to investigate GO enrichment, there remains a need for methods that directly remove redundant terms, which often provide little, if any, additional information, from lists of enriched GO terms.

Findings: Here we present a simple yet novel method called GO Trimming that utilizes an algorithm designed to reduce redundancy in lists of enriched GO categories. Depending on the needs of the user, this method can be performed with variable stringency. In the example presented here, an initial list of 90 terms was reduced to 54, eliminating 36 largely redundant terms. We also compare this method to existing methods and find that GO Trimming, while simple, performs well at eliminating redundant terms in a large dataset throughout the depth of the GO hierarchy.
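
The idea of trimming with variable stringency can be illustrated with a minimal Python sketch. The abstract does not state the exact rule, so the following is an assumption: a parent term is treated as redundant when the fraction of its genes that are not shared with an enriched child term (its "uniqueness") is at or below a chosen threshold. The function names, the term labels and the gene sets are hypothetical and are not taken from the published implementation.

```python
def uniqueness(parent_genes, child_genes):
    """Fraction of the parent's genes that are not annotated to the child."""
    if not parent_genes:
        return 0.0
    return len(parent_genes - child_genes) / len(parent_genes)


def trim(enriched, parent_child_pairs, threshold):
    """Return the enriched terms that survive trimming, in input order.

    enriched: dict mapping a term label to its set of genes (hypothetical data).
    parent_child_pairs: (parent, child) tuples, both present in `enriched`.
    threshold: maximum uniqueness at which a parent is dropped (assumed rule).
    """
    removed = set()
    for parent, child in parent_child_pairs:
        if uniqueness(enriched[parent], enriched[child]) <= threshold:
            removed.add(parent)
    return [term for term in enriched if term not in removed]


# Hypothetical example: term_A is the parent of term_B, which is the parent of term_C.
enriched = {
    "term_A": {"g1", "g2", "g3", "g4", "g5"},
    "term_B": {"g1", "g2", "g3", "g4"},
    "term_C": {"g1", "g2", "g3", "g4"},
}
pairs = [("term_A", "term_B"), ("term_B", "term_C")]

strict = trim(enriched, pairs, threshold=0.0)  # drop parents with no unique genes
soft = trim(enriched, pairs, threshold=0.4)    # also drop mostly redundant parents
```

Under these assumptions, a threshold of 0.0 removes only parents whose genes are fully covered by a child, while a higher threshold removes more parents and produces a shorter list.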

Conclusions: The GO Trimming method provides an alternative to other procedures, some of which involve removing large numbers of terms prior to enrichment analysis. This method should free up the researcher from analyzing overly large, redundant lists, and instead enable the concise presentation of manageable, informative GO lists. The implementation of this tool is freely available at: http://lucy.ceh.uvic.ca/go_trimming/cbr_go_trimming.py.

Figures

Figure 1
GO Trimming algorithm flowchart. A) Phase 1 of GO Trimming: identification of parent-child relationships in GO hierarchy. B) Phase 2 of GO Trimming: strict and soft trimming using 0% and 40% uniqueness thresholds. Green boxes represent start and endpoints for the algorithm. Blue parallelograms represent input and output steps. Red rectangles represent an action required by the user and yellow diamonds represent questions that determine the flow of the algorithm. Input for the algorithm is the query list (list of enriched GO terms) and the GO tree (hierarchy of all GO terms). Output is a list of GO terms with soft and strict trimmed terms removed.
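
Phase 1 as described in the caption, finding which terms in the query list are related through the GO hierarchy, could be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the GO tree is represented as a hypothetical dictionary mapping each term to its direct parents, and every ancestor found in the query list is reported, not only direct parents.

```python
def ancestors(term, parents_of):
    """Collect every ancestor of `term` by walking parent links upward."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for parent in parents_of.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def related_pairs(query_terms, parents_of):
    """Return (ancestor, descendant) pairs where both terms are in the query list."""
    query = set(query_terms)
    pairs = []
    for child in query_terms:
        # Sorting keeps the output order deterministic.
        for parent in sorted(ancestors(child, parents_of) & query):
            pairs.append((parent, child))
    return pairs


# Hypothetical hierarchy: a is the root, b and d are children of a, c is a child of b.
parents_of = {"c": {"b"}, "b": {"a"}, "d": {"a"}}
pairs = related_pairs(["a", "c", "d"], parents_of)
```

Here the term "b" is not in the query list, so "c" is paired with its nearest enriched ancestor "a". Whether the published tool treats such indirect relationships the same way is not stated in the caption.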
