PLoS One. 2012;7(5):e37510.
doi: 10.1371/journal.pone.0037510. Epub 2012 May 18.

Assessment method for a power analysis to identify differentially expressed pathways

Shailesh Tripathi et al. PLoS One. 2012.

Abstract

Gene expression data can provide a very rich source of information for elucidating biological function at the pathway level, provided the experimental design considers the needs of the statistical analysis methods. The purpose of this paper is to provide a comparative analysis of statistical methods for detecting the differential expression of pathways (DEP). In contrast to many other studies conducted so far, we use three novel simulation types, producing a more realistic correlation structure than previous simulation methods. This also includes the generation of surrogate data from two large-scale microarray experiments on prostate cancer and acute lymphoblastic leukemia (ALL). As a result of our comprehensive analysis of 41,004 parameter configurations, we find that each method should only be applied if certain conditions of the data from a pathway are met. Further, we provide method-specific estimates for the optimal sample size for microarray experiments aiming to identify DEP in order to avoid an underpowered design. Our study highlights the sensitivity of the studied methods to the parameters of the system.
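
As a rough illustration of what a simulation-based power estimate for one of the compared statistics might look like, the sketch below computes the sum of squared per-gene t-statistics for a single pathway, obtains a p-value by permuting sample labels, and reports the fraction of simulated data sets in which the pathway is called significant. It is not the authors' procedure: the function names, the mean shift `shift`, the pathway size, and the number of permutations are all assumptions chosen for illustration, and genes are simulated as independent, which ignores the correlation structure the paper emphasizes.

```python
import random


def mean_var(values):
    """Return the mean and unbiased sample variance of a list of numbers."""
    n = len(values)
    m = sum(values) / n
    return m, sum((v - m) ** 2 for v in values) / (n - 1)


def t_stat(x, y):
    """Welch two-sample t-statistic for a single gene."""
    mx, vx = mean_var(x)
    my, vy = mean_var(y)
    return (mx - my) / (vx / len(x) + vy / len(y)) ** 0.5


def sum_t_square(a, b):
    """Sum of squared per-gene t-statistics over one pathway.

    a and b are lists of samples; each sample is a list of gene values.
    """
    n_genes = len(a[0])
    return sum(
        t_stat([s[g] for s in a], [s[g] for s in b]) ** 2 for g in range(n_genes)
    )


def permutation_p_value(a, b, rng, n_perm=200):
    """P-value from shuffling sample labels (hypothetical default of 200 permutations)."""
    observed = sum_t_square(a, b)
    pooled = a + b  # new list, so shuffling does not modify a or b
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if sum_t_square(pooled[: len(a)], pooled[len(a):]) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


def estimate_power(n_samples, shift, n_genes=10, alpha=0.05, n_sim=20, seed=1):
    """Fraction of simulated data sets in which the pathway is detected.

    Assumption: independent unit-variance Gaussian genes, with every gene
    in the second group shifted by `shift`.
    """
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sim):
        a = [[rng.gauss(0.0, 1.0) for _ in range(n_genes)] for _ in range(n_samples)]
        b = [[rng.gauss(shift, 1.0) for _ in range(n_genes)] for _ in range(n_samples)]
        if permutation_p_value(a, b, rng) <= alpha:
            rejections += 1
    return rejections / n_sim
```

Calling `estimate_power(n, shift=0.0)` gives an empirical false positive rate, while a nonzero `shift` gives an empirical power; repeating the call over a range of `n` shows how detection improves with sample size under these simplified assumptions.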

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

Figure 1. Simulation type I (A, B) and II (C): Power, FPR and number of significant pathways for GSEA (red), sum of t-square (blue) and Hotelling's test (green).
DC = formula image (light color), formula image (medium color), formula image (dark color).
Figure 2. Simulation type III (A) and IV (B): Power, FPR and number of significant pathways for GSEA (red), sum of t-square (blue) and Hotelling's test (green).
DC = formula image (light color), formula image (medium color), formula image (dark color). Simulated data are from the protein network of yeast.
Figure 3. Simulation type III (A) and IV (B): Power, FPR and number of significant pathways for GSEA (red), sum of t-square (blue) and Hotelling's test (green).
DC = formula image (light color), formula image (medium color), formula image (dark color). Simulated data are from the transcriptional regulatory network of yeast.
Figure 4. Left column: prostate cancer. Right column: ALL. Power, false positive rate and number of significant pathways for GSEA (red), sum of t-square (blue) and Hotelling's test (green).
Figure 5. Left: Hotelling's test, middle: sum of t-square, right: GSEA.
The regression line is used to predict the optimal sample size (red cross) found from the intersection of the regression line with the horizontal dashed line corresponding to a ‘zero distance to convergence’.
Figure 6. Average correlations for individual pathways for ALL (blue) and prostate cancer (violet) are shown by horizontal dashed lines.
The two curves correspond to the rank-ordered correlation values for ALL (blue) and prostate cancer (violet). For ST I (green - formula image), ST II (orange - formula image), ST III (purple - formula image) and ST IV (brown - formula image), the projection of the range of correlation values is shown on the right-hand side.
Figure 7. Distribution of the detection call (DC) values for gene expression data from prostate cancer (left) and ALL (right).
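
The extrapolation described in the Figure 5 caption, where a regression line is intersected with the horizontal line at zero distance to convergence, can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes an ordinary least-squares fit on untransformed values, and the function names and the sample sizes and distances in the usage note are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predicted_sample_size(sample_sizes, distances):
    """Sample size at which the fitted line reaches zero distance.

    Assumption: distance to convergence falls roughly linearly with
    sample size over the fitted range.
    """
    slope, intercept = fit_line(sample_sizes, distances)
    if slope == 0:
        raise ValueError("a flat regression line never reaches zero distance")
    return -intercept / slope
```

For example, with hypothetical sample sizes `[10, 20, 30]` and distances `[0.6, 0.4, 0.2]`, `predicted_sample_size` returns the point where the fitted line crosses zero. If the original analysis used transformed axes, the same transform would need to be applied before fitting.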
