PeerJ. 2013 Dec 19:1:e229.
doi: 10.7717/peerj.229.

Pathway-GPS and SIGORA: identifying relevant pathways based on the over-representation of their gene-pair signatures

Amir B K Foroushani et al. PeerJ.

Abstract

Motivation. Predominant pathway analysis approaches treat pathways as collections of individual genes and consider all pathway members as equally informative. As a result, spurious and misleading pathways are at times identified as statistically significant solely because of components that they share with the more relevant pathways.

Results. We introduce the concept of Pathway Gene-Pair Signatures (Pathway-GPS) as pairs of genes that, as a combination, are specific to a single pathway. We devised and implemented a novel approach to pathway analysis, Signature Over-representation Analysis (SIGORA), which focuses on the statistically significant enrichment of Pathway-GPS in a user-specified gene list of interest. In a comparative evaluation of several published datasets, SIGORA outperformed traditional methods by delivering biologically more plausible and relevant results.

Availability. An efficient implementation of SIGORA, as an R package with precompiled GPS data for several human and mouse pathway repositories, is available for download from http://sigora.googlecode.com/svn/.

Keywords: Functional analysis; High-throughput data; Over-representation analysis; Pathway analysis; Shared components; Systems biology.

Figures

Figure 1. Not all genes have the same power to distinguish between different pathways.
In this example, all current KEGG annotations of seven selected genes are shown. Red: annotated in pathway; white: not annotated in this pathway.
Figure 2. SIGORA’s two phases.
In the off-line phase (left), a pathway repository is transformed into disjoint sets of weighted GPS. These precompiled signatures are used in the on-line phase (right) to evaluate a user-specified input gene list.
Figure 3. Overview of the signature transformation.
(A) A schematic pathway repository as a bipartite graph (B); G1…G5: genes; P1…P3: pathways. (B) A pathway unique gene. (C) A Gene-Pair Signature (GPS): G3 and G4 co-occur only in P2. (D) Each GPS is associated with a single pathway and has a weight equal to the average inverse degree (in B) of its constituent genes.
Figure 4. Results of six different pathway analysis methods applied to a gene expression dataset measuring the host transcriptional response to M. tuberculosis infection of human macrophages.
The heatmap shows all pathways that were identified as statistically significant by at least one of the six different pathway analysis methods. The redder the color, the higher the rank of that pathway for a particular method. The heatmap is sorted by the number of methods identifying a particular pathway as significant.
Figure 5. Number of differentially expressed genes that are shared between SIGORA’s pathways (vertical axis, ordered by rank) and additional pathways identified as significant by other methods on the TB dataset.
Figure 6. Comparison of results of six different methods on a mouse experimental cerebral malaria dataset.
The heatmap shows all pathways that were identified as statistically significant in at least one of five different pathway analysis methods. The more red the color the higher the rank of that pathway for a particular method. The heatmap is sorted by the number of methods identifying a particular pathway as significant.
Figure 7. Number of differentially expressed genes that are shared between SIGORA’s pathways (rows, ordered by rank) and additional pathways identified as significant by other methods (columns) in the analysis of the Dengue dataset.
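
The signature transformation described in the Figure 3 caption can be illustrated with a minimal Python sketch. This is not the SIGORA implementation (which is an R package). The function name `gene_pair_signatures`, the toy repository, and the decision to treat every gene pair that co-occurs in exactly one pathway as a signature are assumptions made for illustration, and any special handling of pathway unique genes is left out. Only the weighting rule (the average inverse degree of the two genes in the bipartite graph) is taken from the caption.

```python
from collections import defaultdict
from itertools import combinations


def gene_pair_signatures(repository):
    """Return {(gene_a, gene_b): (pathway, weight)} for a toy repository.

    `repository` maps a pathway name to the set of genes annotated in it.
    Hypothetical helper for illustration only, not part of SIGORA.
    """
    # Degree of a gene in the bipartite graph = number of pathways containing it.
    degree = defaultdict(int)
    for genes in repository.values():
        for gene in genes:
            degree[gene] += 1

    # Record every pathway in which each gene pair co-occurs.
    pair_pathways = defaultdict(set)
    for pathway, genes in repository.items():
        for a, b in combinations(sorted(genes), 2):
            pair_pathways[(a, b)].add(pathway)

    # Keep pairs that co-occur in exactly one pathway and weight each one
    # by the average inverse degree of its two genes.
    signatures = {}
    for (a, b), pathways in pair_pathways.items():
        if len(pathways) == 1:
            weight = (1 / degree[a] + 1 / degree[b]) / 2
            signatures[(a, b)] = (next(iter(pathways)), weight)
    return signatures


# Toy repository (assumed; not the graph drawn in Figure 3).
toy_repository = {
    "P1": {"G1", "G2", "G3"},
    "P2": {"G2", "G3", "G4"},
    "P3": {"G4", "G5"},
}
signatures = gene_pair_signatures(toy_repository)
```

In this toy repository, G3 and G4 co-occur only in P2, so the pair becomes a signature of P2, whereas G2 and G3 co-occur in both P1 and P2 and are discarded. Pairs that include a gene annotated in few pathways receive a higher weight.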
