PLoS One. 2010 Nov 15;5(11):e13984.

doi: 10.1371/journal.pone.0013984.

Enrichment map: a network-based method for gene-set enrichment visualization and interpretation

Daniele Merico¹, Ruth Isserlin, Oliver Stueker, Andrew Emili, Gary D Bader

PMID: 21085593
PMCID: PMC2981572
DOI: 10.1371/journal.pone.0013984

Daniele Merico et al. PLoS One. 2010.

Affiliation

¹ Department of Molecular Genetics, Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario, Canada.

Abstract

Background: Gene-set enrichment analysis is a useful technique to help functionally characterize large gene lists, such as the results of gene expression experiments. This technique finds functionally coherent gene-sets, such as pathways, that are statistically over-represented in a given gene list. Ideally, the number of resulting sets is smaller than the number of genes in the list, thus simplifying interpretation. However, the increasing number and redundancy of the gene-sets used by many current enrichment analysis tools work against this ideal.

Principal findings: To overcome gene-set redundancy and help in the interpretation of large gene lists, we developed "Enrichment Map", a network-based visualization method for gene-set enrichment results. Gene-sets are organized in a network, where each set is a node and edges represent gene overlap between sets. Automated network layout groups related gene-sets into network clusters, enabling the user to quickly identify the major enriched functional themes and more easily interpret the enrichment results.
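The network construction described above can be sketched in a few lines. This is a minimal illustration, not the plug-in's implementation: the Jaccard coefficient, the `threshold` value of 0.25, the function names, and the toy gene-sets are all assumptions made for the example.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard coefficient: shared genes divided by all genes in either set."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def build_enrichment_map(gene_sets, threshold=0.25):
    """Return (nodes, edges) for a dict mapping gene-set name -> set of genes.

    nodes: name -> gene-set size (could drive node size in a drawing)
    edges: (name1, name2) -> similarity score (could drive edge thickness)
    The threshold is a hypothetical cutoff chosen for illustration.
    """
    nodes = {name: len(genes) for name, genes in gene_sets.items()}
    edges = {}
    for (n1, g1), (n2, g2) in combinations(gene_sets.items(), 2):
        score = jaccard(g1, g2)
        if score >= threshold:
            edges[(n1, n2)] = score
    return nodes, edges


# Toy gene-sets (illustrative membership only, not real annotations).
gene_sets = {
    "spindle": {"TUBB", "KIF11", "AURKA", "PLK1"},
    "microtubule-based process": {"TUBB", "KIF11", "AURKA", "DYNC1H1"},
    "tRNA processing": {"TRMT1", "RTCB"},
}
nodes, edges = build_enrichment_map(gene_sets)
print(edges)
```

In this toy input the two cytoskeleton-related sets share three of five genes and become connected, while the unrelated set stays isolated. A layout algorithm applied to such a graph would then place connected sets close together.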

Conclusions: Enrichment Map is a significant advance in the interpretation of enrichment analysis. Any research project that generates a list of genes can take advantage of this visualization framework. Enrichment Map is implemented as a freely available and user-friendly plug-in for the Cytoscape network visualization software (http://baderlab.org/Software/EnrichmentMap/).

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

**Figure 1. From ranked gene lists to the enrichment map.**
High-throughput genomic experiments often output large gene lists, which are typically ranked according to a statistic measuring difference in one experimental condition versus another. Ranked lists are analyzed for enrichment in known sets of functionally related genes (e.g. pathways) from publicly accessible databases. An enrichment map is drawn, representing the enrichment results as a network of gene-sets (nodes) related by their similarity (edges), with enrichment significance encoded by the node color gradient, where color intensity represents significance and color hue (red / blue) represents the class (i.e. biological condition) of interest. Node size represents the gene-set size and edge thickness represents the degree of overlap between two gene-sets.
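The visual encoding in this caption (hue for class, intensity for significance, edge thickness for overlap) could be expressed roughly as follows. The linear scaling, the `p_cutoff` of 0.05, the width range, and the function names are assumptions for illustration; the caption does not specify how the software computes these mappings.

```python
def node_color(p_value, phenotype, p_cutoff=0.05):
    """Map an enrichment result to an (R, G, B) tuple.

    phenotype > 0 -> red (enriched in the first class),
    otherwise     -> blue (enriched in the second class).
    Smaller p-values give a more intense color; p >= p_cutoff gives white.
    The linear fade is an assumed mapping, chosen for simplicity.
    """
    strength = max(0.0, min(1.0, 1.0 - p_value / p_cutoff))
    fade = round(255 * (1.0 - strength))
    if phenotype > 0:
        return (255, fade, fade)
    return (fade, fade, 255)


def edge_width(similarity, min_width=1.0, max_width=5.0):
    """Scale a similarity score in [0, 1] linearly to a line width."""
    similarity = max(0.0, min(1.0, similarity))
    return min_width + similarity * (max_width - min_width)


print(node_color(0.0, +1))   # most intense red
print(node_color(0.0, -1))   # most intense blue
print(edge_width(0.5))
```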

**Figure 2. Hierarchical visualization of enrichment results for estrogen treatment of breast cancer cells.**
Hierarchical organization of GO gene-set enrichment results for estrogen-treated compared to untreated breast cancer cells at 24 hours of culture. Nodes represent gene-sets and edges represent GO defined relations (*Is-a*, *Part-of*, *Regulates*). Gene-sets that did not pass the enrichment significance threshold are not shown. Nodes are colored according to enrichment results: red represents enrichment in estrogen-treated cells (i.e. up-regulation after estrogen treatment), whereas blue represents enrichment in untreated cells (i.e. down-regulation after estrogen treatment). Color intensity is proportional to enrichment significance. Since conservative thresholds were used to select gene-sets, most of the node colors are intense (corresponding to highly significant gene-sets). Subnetworks (i.e. clusters) are annotated according to the corresponding function. The acronym in brackets represents the specific GO ontology the gene-sets belong to: Molecular Function (MF), Cellular Component (CC), Biological Process (BP). Microtubule cytoskeleton (purple labels) and tRNA processing (green labels) were highlighted to show absence of connections between related gene-sets.

**Figure 3. Enrichment map for estrogen treatment of breast cancer cells at 24 hours of culture.**
The map displays the enriched gene-sets in estrogen-treated vs. untreated breast cancer cells at 24 hours of culture. As in Figure 2, red node color represents enrichment in estrogen-treated cells (i.e. up-regulation after estrogen treatment), whereas blue represents enrichment in untreated cells (i.e. down-regulation after estrogen treatment); color intensity is proportional to enrichment significance. Clusters of functionally related gene-sets were manually circled and assigned a label.

**Figure 4. Zoom-in on the microtubule cytoskeleton cluster in the 24-hour estrogen treatment enrichment map.**
Cytoskeleton-related gene-sets from different GO partitions, such as *Spindle* (CC) and *Microtubule-based process* (BP) have been grouped together, unlike in the purely hierarchical visualization in Figure 2. As in the previous figures, red node color represents enrichment in estrogen-treated cells (i.e. up-regulation after estrogen treatment), whereas blue represents enrichment in untreated cells (i.e. down-regulation after estrogen treatment); color intensity is proportional to enrichment significance.

**Figure 5. Enrichment map for estrogen treatment of breast cancer cells at 12 and 24 hours of culture.**
The map displays the enriched gene-sets in estrogen-treated vs. untreated breast cancer cells at 12 and 24 hours of culture. Enrichments were mapped to the inner node area and to the node borders, respectively. As in the previous figures, red represents enrichment in estrogen-treated cells (i.e. up-regulation after estrogen treatment), whereas blue represents enrichment in untreated cells (i.e. down-regulation after estrogen treatment); color intensity is proportional to enrichment significance. Clusters of functionally related gene-sets were manually circled and assigned a label.

**Figure 6. Heat-maps displaying gene-set expression patterns in the estrogen treatment experiment.**
Two gene-sets displaying different enrichment patterns at 12 and 24 hours of the estrogen treatment experiment were selected from the enrichment map in Figure 5, and their expression patterns were explored using heat maps within the Enrichment Map software. For *APC-dependent protein degradation* (left), there is a marked increase of gene expression in estrogen-treated cells at 24 hours compared to 12 hours, whereas the gene levels for untreated cells are substantially the same at the two time points; the pattern observed is consistent with the presence of significant enrichment only at 24 hours. On the other hand, for *Replication fork* (right), gene expression in estrogen-treated cells at 12 and 24 hours is globally at similar levels, whereas there is an increase of gene levels in untreated cells. This suggests an explanation of why *Replication fork* is enriched only at 12 hours.

**Figure 7. Enrichment map for early-onset colon cancer and overlap with known disease genes.**
The map displays the enriched gene-sets in early-onset colon cancer patients vs. normal controls. The yellow triangle represents the set of known colon cancer genes from the DiseaseHub database, which integrates disease gene lists from several genotype-phenotype association resources. Purple edges indicate overlap between the disease signature and enriched gene-sets; thickness represents significance. Only edges with a Fisher's Exact Test nominal p-value smaller than 10⁻⁴ were visualized.
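The overlap filter mentioned in this caption can be illustrated with a one-sided Fisher's Exact Test, which is equivalent to the upper tail of a hypergeometric distribution. This is a sketch under stated assumptions: the caption does not say whether the test was one- or two-sided, and the background size and set sizes in the usage example are invented for illustration.

```python
from math import comb


def overlap_p_value(overlap, size_a, size_b, universe):
    """One-sided Fisher's Exact Test p-value for the overlap of two gene-sets.

    Computes P(X >= overlap), where X is the number of genes shared when a
    set of size_b is drawn at random from `universe` genes, size_a of which
    belong to the other set (hypergeometric upper tail).
    """
    max_k = min(size_a, size_b)
    total = comb(universe, size_b)
    tail = sum(
        comb(size_a, k) * comb(universe - size_a, size_b - k)
        for k in range(overlap, max_k + 1)
    )
    return tail / total


# Hypothetical numbers: 100 disease genes, a 50-gene enriched set,
# 10 genes in common, and an assumed background of 20,000 genes.
p = overlap_p_value(overlap=10, size_a=100, size_b=50, universe=20000)
keep_edge = p < 1e-4  # the cutoff stated in the caption
print(p, keep_edge)
```

Exact integer arithmetic via `math.comb` avoids floating-point underflow in the tail sum; only the final ratio is converted to a float.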
