BMC Bioinformatics. 2024 Jan 12;25(1):23.
doi: 10.1186/s12859-024-05632-w.

Robustness evaluations of pathway activity inference methods on gene expression data

Tay Xin Hui et al. BMC Bioinformatics.

Abstract

Background: With the exponential growth of high-throughput technologies, multiple pathway analysis methods have been proposed to estimate pathway activities from gene expression profiles. These pathway activity inference methods can be divided into two main categories: non-Topology-Based (non-TB) and Pathway Topology-Based (PTB) methods. Although some review and survey articles discussed the topic from different aspects, there is a lack of systematic assessment and comparisons on the robustness of these approaches.

Results: Thus, this study presents comprehensive robustness evaluations of seven widely used pathway activity inference methods using six cancer datasets based on two assessments. The first assessment seeks to investigate the robustness of pathway activity in pathway activity inference methods, while the second assessment aims to assess the robustness of risk-active pathways and genes predicted by these methods. The mean reproducibility power and total number of identified informative pathways and genes were evaluated. Based on the first assessment, the mean reproducibility power of pathway activity inference methods generally decreased as the number of pathway selections increased. Entropy-based Directed Random Walk (e-DRW) distinctly outperformed other methods in exhibiting the greatest reproducibility power across all cancer datasets. On the other hand, the second assessment shows that no methods provide satisfactory results across datasets.

Conclusion: However, PTB methods generally appear to perform better in producing greater reproducibility power and identifying potential cancer markers compared to non-TB methods.

Keywords: Cancer classification; Literature validation; Pathway activity inference; Pathway analysis; PubMed text data mining; Reproducibility power; Robustness.
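
The abstract reports reproducibility power as a function of the number of selected pathways, and Fig. 2 reports a coefficient of variation, but this page does not give the exact formulas. The following is a minimal sketch under stated assumptions, not the authors' definition: it treats reproducibility as the fraction of top-k pathways shared by two rankings, and uses the standard coefficient of variation (sample standard deviation divided by the mean). The function names, pathway labels, and rankings are hypothetical and chosen only for illustration.

```python
from statistics import mean, stdev


def top_k_overlap(ranking_a, ranking_b, k):
    """Fraction of the top-k pathways shared by two rankings.

    ASSUMPTION: a stand-in for "reproducibility power"; the paper's
    actual metric is not defined in the abstract.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    top_a = set(ranking_a[:k])
    top_b = set(ranking_b[:k])
    return len(top_a & top_b) / k


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean."""
    m = mean(values)
    if m == 0:
        raise ValueError("mean is zero; coefficient of variation is undefined")
    return stdev(values) / m


# Hypothetical pathway rankings from two datasets (illustrative only).
run_1 = ["p53", "MAPK", "Wnt", "Notch", "mTOR"]
run_2 = ["MAPK", "p53", "mTOR", "Wnt", "Notch"]

# Overlap at several selection sizes k.
overlaps = {k: top_k_overlap(run_1, run_2, k) for k in (1, 2, 3, 5)}

# Spread of a hypothetical set of per-dataset scores.
cv = coefficient_of_variation([0.8, 0.9, 1.0])
```

A lower coefficient of variation would indicate that a method's scores vary less across datasets relative to their mean, which is one common way to read stability from such a summary.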

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Fig. 1. Comparison of mean reproducibility power for seven pathway activity inference methods
Fig. 2. Coefficient variation of pathway activity inference methods
Fig. 3. Comparison of number of identified informative pathways
Fig. 4. Comparison of number of identified informative genes
Fig. 5. PubMed text data mining automation based on pathways and genes [52]
Fig. 6. Workflow of evaluating pathway activity inference methods based on the robustness of pathway activity
Fig. 7. Workflow of evaluating pathway activity inference methods based on the robustness of predicted risk-active pathways and genes

References

    1. Schena M, Shalon D, Davis RW, Brown PO. Quantitative monitoring of gene expression patterns with a complementary DNA microarray. Science. 1995;270(5235):467–470. doi: 10.1126/science.270.5235.467.
    2. DeRisi JL, Iyer VR, Brown PO. Exploring the metabolic and genetic control of gene expression on a genomic scale. Science. 1997;278(5338):680–686. doi: 10.1126/science.278.5338.680.
    3. Mathur R, Rotroff D, Ma J, Shojaie A, Motsinger-Reif A. Gene set analysis methods: a systematic comparison. BioData Mining. 2018;11(1):1–19. doi: 10.1186/s13040-018-0166-8.
    4. Tian L, Greenberg SA, Kong SW, Altschuler J, Kohane IS, Park PJ. Discovering statistically significant pathways in expression profiling studies. Proc Natl Acad Sci. 2005;102(38):13544–13549. doi: 10.1073/pnas.0506577102.
    5. Kim SY, Volsky DJ. PAGE: parametric analysis of gene set enrichment. BMC Bioinformatics. 2005;6(1):1–12. doi: 10.1186/1471-2105-6-144.