Appl Plant Sci. 2023 Aug 11;11(4):e11536.
doi: 10.1002/aps3.11536. eCollection 2023 Jul-Aug.

GOgetter: A pipeline for summarizing and visualizing GO slim annotations for plant genetic data

Emily B Sessa et al. Appl Plant Sci.

Erratum in

Abstract

Premise: The functional annotation of genes is a crucial component of genomic analyses. A common way to summarize functional annotations is with hierarchical gene ontologies, such as the Gene Ontology (GO) Resource. GO includes information about the cellular location, molecular function(s), and products/processes that genes produce or are involved in. For a set of genes, summarizing GO annotations using pre-defined, higher-order terms (GO slims) is often desirable in order to characterize the overall function of the data set, and it is impractical to do this manually.

Methods and results: The GOgetter pipeline consists of bash and Python scripts. From an input FASTA file of nucleotide gene sequences, it outputs text and image files that list (1) the best hit for each input gene in a set of reference gene models, (2) all GO terms and annotations associated with those hits, and (3) a summary and visualization of GO slim categories for the data set. These output files can be queried further and analyzed statistically, depending on the downstream need(s).
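The first and third outputs described above (best hit per gene, then a GO slim summary) can be illustrated with a minimal Python sketch. This is not the GOgetter code: the function names `best_hits` and `slim_counts`, the three-column row layout, the use of bit score to rank hits, and the toy gene-to-slim mapping are all assumptions made for illustration.

```python
from collections import Counter


def best_hits(rows):
    """Keep the highest-scoring hit per query.

    `rows` is an iterable of (query, subject, bitscore) tuples; this layout
    is a hypothetical simplification of a tabular similarity-search result.
    """
    best = {}
    for query, subject, bitscore in rows:
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {query: subject for query, (subject, _) in best.items()}


def slim_counts(hits, gene_to_slims):
    """Count GO slim categories across the best-hit reference genes."""
    counts = Counter()
    for subject in hits.values():
        # Reference genes without any slim annotation contribute nothing.
        for slim in gene_to_slims.get(subject, ()):
            counts[slim] += 1
    return counts


# Toy, made-up data for illustration only.
rows = [
    ("tx1", "AT1G01010", 120.0),
    ("tx1", "AT1G01020", 95.5),
    ("tx2", "AT1G01020", 200.0),
    ("tx3", "AT5G99999", 50.0),
]
gene_to_slims = {
    "AT1G01010": ["nucleus", "DNA binding"],
    "AT1G01020": ["membrane"],
}

hits = best_hits(rows)
counts = slim_counts(hits, gene_to_slims)
for slim, n in sorted(counts.items()):
    print(f"{slim}\t{n}")
```

A tab-separated count table like this is the kind of output that could then be queried or tested statistically downstream.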

Conclusions: GO annotations are a widely used "universal language" for describing gene functions and products. GOgetter is a fast and easy-to-implement pipeline for obtaining, summarizing, and visualizing GO slim categories associated with a set of genes.

Keywords: annotation; data mining; gene function; gene ontology.

Figures

Figure 1
The GOgetter pipeline. Inputs, primary pipeline activity, and output file names (.tsv tables for raw and frequency count data) from GOgetter.sh are shown inside the dotted‐line box. The bottom of the figure shows the merged table and example figures produced from several frequency count tables using merge_and_viz.py, after multiple input files have been run through the pipeline.
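The merge step in this caption, which combines several frequency count tables into one table, might look roughly like the sketch below. It is not the actual `merge_and_viz.py`: the `merge_counts` function, the in-memory dictionaries standing in for .tsv files, the sample names, and the choice to convert counts to per-sample proportions are all assumptions.

```python
def merge_counts(tables):
    """Merge per-sample GO slim counts into one table of proportions.

    `tables` maps a sample name to a {slim_term: count} dictionary
    (a hypothetical stand-in for one frequency count .tsv per input file).
    Returns {slim_term: {sample: proportion of that sample's annotations}}.
    """
    slims = sorted({slim for counts in tables.values() for slim in counts})
    merged = {}
    for slim in slims:
        merged[slim] = {}
        for sample, counts in tables.items():
            total = sum(counts.values())
            # Terms absent from a sample are reported as a proportion of 0.
            merged[slim][sample] = counts.get(slim, 0) / total if total else 0.0
    return merged


# Toy, made-up counts for illustration only.
tables = {
    "fern": {"membrane": 30, "nucleus": 10},
    "moss": {"membrane": 5, "nucleus": 10, "plastid": 5},
}

merged = merge_counts(tables)
samples = list(tables)
print("\t".join(["GO_slim"] + samples))
for slim, row in merged.items():
    print("\t".join([slim] + [f"{row[s]:.2f}" for s in samples]))
```

Using proportions rather than raw counts keeps samples with different numbers of annotated sequences comparable in a single plot.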
Figure 2
Results of case studies using GOgetter: (A, B) Case Study 1, (C, D) Case Study 3. (A) Proportion of transcripts annotated to a GO slim term using GOgetter with BLAST (circle) and DIAMOND (triangle) for five transcriptomes representing angiosperms (red), gymnosperms (purple), ferns (green), lycophytes (orange), and bryophytes (blue). Vertical lines connecting points show the difference in the proportion of sequences annotated between BLAST and DIAMOND searches. (B) Bubble graph generated from GOgetter for the five transcriptomes analyzed, scaled to the proportion of total annotations that associate with that term. (C) The average proportion of sequences annotated to a GO slim term using GOgetter for full transcriptomes (green) and paralogs (purple) for five species of vittarioid ferns, using either Arabidopsis thaliana (left) or Ceratopteris richardii (right) as a reference. Error bars show ±1 standard error. (D) Heatmap of significantly over‐ (red) and under‐represented (blue) GO slim categories in paralog sequences relative to full transcriptomes. Non‐significant differences between the paralog set compared to the full transcriptome are depicted as gray.
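Panel C reports an average proportion with error bars of ±1 standard error. As a small worked example of that arithmetic, the sketch below computes a mean and standard error across species for one GO slim term. The caption does not say how the standard error was calculated; the sample standard deviation divided by the square root of the number of species is assumed here, and the input values are made up.

```python
from math import sqrt
from statistics import mean, stdev


def mean_and_se(proportions):
    """Return (mean, standard error) for a list of per-species proportions.

    Standard error is taken as sample standard deviation / sqrt(n),
    which is an assumption rather than the paper's stated method.
    """
    m = mean(proportions)
    se = stdev(proportions) / sqrt(len(proportions))
    return m, se


# Hypothetical proportions of sequences annotated to one GO slim term
# in five species (illustrative values only).
props = [0.10, 0.12, 0.08, 0.11, 0.09]
m, se = mean_and_se(props)
print(f"mean={m:.3f} SE={se:.4f}")
```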
