BioData Min. 2019 Aug 6;12:16.
doi: 10.1186/s13040-019-0204-1. eCollection 2019.

ViSEAGO: a Bioconductor package for clustering biological functions using Gene Ontology and semantic similarity

Aurélien Brionne et al. BioData Min.

Abstract

The main objective of the ViSEAGO package is to mine biological functions and establish links between the genes involved in a study. We developed ViSEAGO in R to facilitate functional Gene Ontology (GO) analysis of complex experimental designs with multiple comparisons of interest. It allows users to study large-scale datasets together and to visualize GO profiles in order to capture biological knowledge. The acronym stands for the three major concepts of the analysis: Visualization, Semantic similarity and Enrichment Analysis of Gene Ontology. The package provides access to the latest GO annotations, which are retrieved from one of the NCBI EntrezGene, Ensembl or UniProt databases for several species. Using available R packages and novel developments, ViSEAGO extends classical functional GO analysis to focus on functional coherence by aggregating closely related biological themes while studying multiple datasets at once. It provides both a synthetic and a detailed view through interactive functionalities that respect the GO graph structure and ensure functional coherence by means of semantic similarity. ViSEAGO has been successfully applied to several datasets from different species with a variety of biological questions. Results can be easily shared between bioinformaticians and biologists, enhancing reporting capabilities while maintaining reproducibility. ViSEAGO is publicly available at https://bioconductor.org/packages/ViSEAGO.
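To make the "Enrichment Analysis" part of the acronym concrete, here is a minimal sketch of a one-sided hypergeometric (Fisher-type) over-representation test for a single GO term, together with the -log10(p-value) score shown in the heatmaps. This is an illustration in Python rather than the package's own R code; the abstract does not state which test ViSEAGO applies, and the function names and gene counts below are hypothetical.

```python
from math import comb, log10


def enrichment_pvalue(n_universe, n_annotated, n_selected, n_overlap):
    """One-sided hypergeometric p-value, P(X >= n_overlap).

    n_universe  : number of genes in the background set
    n_annotated : background genes annotated to the GO term
    n_selected  : genes in the list of interest
    n_overlap   : genes of interest annotated to the GO term
    """
    upper = min(n_annotated, n_selected)
    total = comb(n_universe, n_selected)
    # Sum the probability mass of every outcome at least as extreme.
    tail = sum(
        comb(n_annotated, k) * comb(n_universe - n_annotated, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


def enrichment_score(p_value):
    """-log10(p-value): larger values mean stronger enrichment."""
    return -log10(p_value)


if __name__ == "__main__":
    # Hypothetical counts, chosen only for illustration.
    p = enrichment_pvalue(n_universe=20000, n_annotated=100,
                          n_selected=200, n_overlap=8)
    print(f"p = {p:.3g}, -log10(p) = {enrichment_score(p):.2f}")
```

In a multi-list analysis, such a test would be repeated for every GO term and every gene list, giving one column of scores per list.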

Keywords: Annotation; Cluster analysis; Enrichment test; Functional genomics; Gene ontology; Semantic similarity; Visualization.

Conflict of interest statement

Competing interests: The authors declare that they have no competing interests.

Figures

Fig. 1
Database impacts on GO annotation. Bar plot of the number of GO annotations available in the Molecular Function, Biological Process and Cellular Component categories for protein-coding genes in two major databases (NCBI, Ensembl), for three gold-standard models (Human, Mouse, Zebrafish) and seven livestock animals (Chicken, Cow, Pig, Rabbit, Salmon, Sheep and Trout). Computational (blue) and Experimental (orange) evidence are represented.
Fig. 2
Illustrated ViSEAGO package. A complete ViSEAGO analysis is presented, from annotation of lists of features and enrichment tests to organization and visualization of GO terms using semantic similarity. In italics: illustration of ViSEAGO features using the case 1 study.
Fig. 3
Visualization of ViSEAGO’s functional analysis from mouse RNA-seq with three different transcriptomic datasets. Clustering heatmap plot that combines a dendrogram based on Wang’s semantic similarity distance and the ward.D2 aggregation criterion, a heatmap of -log10(p-value) from functional enrichment tests, and information content (IC). The focus is on the cholesterol biosynthetic process, a major pathway involved in the study.
Fig. 4
Visualization of ViSEAGO’s functional analysis from chicken RNA-seq with seven different transcriptomic datasets. (a) Upset plot representing overlaps between lists of enriched GO terms. (b) Clustering heatmap plot combining a dendrogram based on Wang’s semantic similarity distance and the ward.D2 aggregation criterion, a heatmap of -log10(p-value) from functional enrichment tests of the seven lists of genes, and information content (IC).
Fig. 5
Visualization of ViSEAGO’s functional analysis from cattle with three MeDiP datasets. (a) Clustering heatmap plot combining a dendrogram based on Wang’s semantic similarity distance and the ward.D2 aggregation criterion, a heatmap of -log10(p-value) from functional enrichment tests, and information content (IC). (b) MDS plot based on BMA distance representing the proximities of the groups obtained by cutting the dendrogram in (a). Dot size depends on the number of GO terms within each cluster. (c) Heatmap plot of functional sets of GO terms combining a description of the first common GO ancestor of each set of GO terms, a heatmap with the number of GO terms in each set, and a dendrogram based on BMA semantic similarity distance and the ward.D2 aggregation criterion.
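The BMA (best-match average) distance between groups of GO terms mentioned in Fig. 5 can be sketched as follows. This is a minimal Python illustration of the commonly used BMA definition (the average of each term's best match in the other group), not the package's own code. The term identifiers and pairwise similarity values are hypothetical, and taking the distance as 1 minus the similarity is an assumption.

```python
def pair_similarity(sim, term_a, term_b):
    """Look up a symmetric term-to-term similarity stored once per pair."""
    if term_a == term_b:
        return 1.0
    return sim[frozenset((term_a, term_b))]


def bma_similarity(group_a, group_b, sim):
    """Best-match average between two groups of GO terms."""
    # Best match in group_b for each term of group_a, and vice versa.
    best_a = [max(pair_similarity(sim, a, b) for b in group_b) for a in group_a]
    best_b = [max(pair_similarity(sim, a, b) for a in group_a) for b in group_b]
    return (sum(best_a) + sum(best_b)) / (len(group_a) + len(group_b))


def bma_distance(group_a, group_b, sim):
    """Assumed conversion from similarity to distance: 1 - similarity."""
    return 1.0 - bma_similarity(group_a, group_b, sim)


# Hypothetical term-to-term similarities (for example, Wang's measure).
toy_sim = {
    frozenset(("GO:A", "GO:C")): 0.8,
    frozenset(("GO:A", "GO:D")): 0.2,
    frozenset(("GO:B", "GO:C")): 0.4,
    frozenset(("GO:B", "GO:D")): 0.6,
}

if __name__ == "__main__":
    cluster_1 = ["GO:A", "GO:B"]
    cluster_2 = ["GO:C", "GO:D"]
    print(bma_distance(cluster_1, cluster_2, toy_sim))
```

A matrix of such distances between all pairs of groups is the kind of input that an MDS plot or a second-level dendrogram would take.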

