Source Code Biol Med. 2011 Apr 7;6:7.
doi: 10.1186/1751-0473-6-7.

WordCloud: a Cytoscape plugin to create a visual semantic summary of networks

Layla Oesper et al. Source Code Biol Med. 2011.

Abstract

Background: When biological networks are studied, it is common to look for clusters, i.e. sets of nodes that are highly inter-connected. To understand the biological meaning of a cluster, the user usually has to sift through many textual annotations that are associated with biological entities.

Findings: The WordCloud Cytoscape plugin generates a visual summary of these annotations by displaying them as a tag cloud, where more frequent words are displayed using a larger font size. Word co-occurrence in a phrase can be visualized by arranging words in clusters or as a network.
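The frequency-to-font-size idea can be illustrated with a minimal Python sketch. This is not the plugin's code (WordCloud is a Cytoscape plugin, and the abstract does not state its scaling rule); the linear scaling, the point-size range, and the helper names `word_counts` and `font_sizes` are assumptions made for illustration only.

```python
from collections import Counter
import re


def word_counts(annotations):
    """Count how often each lowercase word occurs across all annotations."""
    counts = Counter()
    for text in annotations:
        counts.update(re.findall(r"[a-z]+", text.lower()))
    return counts


def font_sizes(counts, min_pt=10, max_pt=40):
    """Map counts to font sizes by linear scaling (an assumed rule, not the plugin's)."""
    lo, hi = min(counts.values()), max(counts.values())
    span = (hi - lo) or 1  # avoid division by zero when all counts are equal
    return {
        word: min_pt + (max_pt - min_pt) * (count - lo) / span
        for word, count in counts.items()
    }


# Hypothetical annotations, loosely modelled on the protein names in Figure 1.
annotations = [
    "Origin recognition complex component",
    "Minichromosome maintenance complex component",
    "Minichromosome maintenance complex component",
]

counts = word_counts(annotations)
sizes = font_sizes(counts)

# Print words from most to least frequent with their assumed font size.
for word, count in counts.most_common():
    print(f"{word:15s} count={count} size={sizes[word]:.0f}pt")
```

In this toy input the most frequent words receive the largest size and the rarest the smallest, which is the behaviour the abstract describes for the tag cloud.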

Conclusions: WordCloud provides a concise visual summary of annotations which is helpful for network analysis and interpretation. WordCloud is freely available at http://baderlab.org/Software/WordCloudPlugin.

Figures

Figure 1
Tag cloud for a protein interaction cluster. The network consists of physical interactions between S. cerevisiae proteins involved in DNA replication (A). A group of highly inter-connected proteins was selected (blue circle) and their full names were mined using WordCloud. The results are shown for the three layouts: network (B), simple (C) and clustered (D). "Origin recognition complex component" and "Minichromosome maintenance complex component" are the dominating themes. The corresponding words are ranked on top in the simple cloud layout, but only the clustered and network layouts reconstruct the correct connections between them, based on word co-occurrence patterns. Since clustering is non-overlapping, the words "complex" and "component" are forced to appear in only one cluster (with "minichromosome maintenance"), whereas the network layout also displays their association with "origin recognition".
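The difference between the clustered and network layouts described in this caption can be sketched in Python. The sketch assumes that two words co-occur when they are adjacent in a phrase; the source says only that co-occurrence within a phrase is used, so this definition and the helper names `cooccurrence_edges` and `neighbours` are hypothetical.

```python
from collections import Counter


def cooccurrence_edges(phrases):
    """Count undirected edges between adjacent words (assumed co-occurrence rule)."""
    edges = Counter()
    for phrase in phrases:
        words = phrase.lower().split()
        for a, b in zip(words, words[1:]):
            if a != b:
                edges[frozenset((a, b))] += 1
    return edges


def neighbours(edges, word):
    """Return the sorted words that share at least one edge with `word`."""
    return sorted(w for pair in edges if word in pair for w in pair if w != word)


# The two dominating themes named in the caption.
phrases = [
    "Origin recognition complex component",
    "Minichromosome maintenance complex component",
]

edges = cooccurrence_edges(phrases)

# "complex" is linked to words from both themes, which a network can show
# but a non-overlapping clustering cannot.
print(neighbours(edges, "complex"))
```

Because "complex" keeps edges to both "recognition" and "maintenance", a network view can show both associations, while a non-overlapping clustering must place the word in a single group.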
Figure 2
Application of WordCloud to gene-set enrichment analysis results. The transcriptional response of breast cancer cells to estrogen treatment was analyzed for gene-set enrichment, as described in [11]. Gene-sets were then arranged as a network using the Enrichment Map visualization technique [11]; edges represent gene-set overlap and clusters correspond to functional groups. A sub-network (A) was selected and analyzed using the WordCloud network layout (B). The most frequent words in gene-set names are "Mitotic Cell Cycle", "DNA Replication", "Ubiquitin Ligase Activity/Regulation", "Chromosome", "Microtubule"; this suggests that the sub-network consists of gene-sets involved in the control of cell proliferation. Specific parts of the sub-network (purple circles) relate to specific functional groups, as suggested by clustered word clouds (C,D).

References

    1. Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, Amin N, Schwikowski B, Ideker T. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res. 2003;13:2498–2504. doi: 10.1101/gr.1239303.
    2. Merico D, Gfeller D, Bader GD. How to visually interpret biological data using networks. Nat Biotechnol. 2009;27:921–924. doi: 10.1038/nbt.1567.
    3. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, Li J, Pu S, Datta N, Tikuisis AP, Punna T, Peregrín-Alvarez JM, Shales M, Zhang X, Davey M, Robinson MD, Paccanaro A, Bray JE, Sheung A, Beattie B, Richards DP, Canadien V, Lalev A, Mena F, Wong P, Starostine A, Canete MM, Vlasblom J, Wu S, Orsi C, Collins SR, Chandran S, Haw R, Rilstone JJ, Gandi K, Thompson NJ, Musso G, St Onge P, Ghanny S, Lam MHY, Butland G, Altaf-Ul AM, Kanaya S, Shilatifard A, O'Shea E, Weissman JS, Ingles CJ, Hughes TR, Parkinson J, Gerstein M, Wodak SJ, Emili A, Greenblatt JF. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440:637–643. doi: 10.1038/nature04670.
    4. Gene Ontology Consortium. Gene Ontology: tool for the unification of biology. Nat Genet. 2000;25:25–29. doi: 10.1038/75556.
    5. Isserlin R, Merico D, Alikhani-Koupaei R, Gramolini A, Bader GD, Emili A. Pathway Analysis of Dilated Cardiomyopathy using Global Proteomic Profiling and Enrichment Maps. Proteomics. 2010;10:1316–1327. doi: 10.1002/pmic.200900412.