Unite and conquer: univariate and multivariate approaches for finding differentially expressed gene sets
- PMID: 19574285
- PMCID: PMC2735665
- DOI: 10.1093/bioinformatics/btp406
Abstract
Motivation: Recently, many univariate and several multivariate approaches have been suggested for testing differential expression of gene sets between different phenotypes. However, despite a wealth of literature studying their performance on simulated and real biological data, there is still a need to quantify their relative performance when they test different null hypotheses.
Results: In this article, we compare the performance of univariate and multivariate tests on both simulated and biological data. In the simulation study, we demonstrate that high correlations equally affect the power of both univariate and multivariate tests. In addition, for most of the tests, power is similarly affected by the dimensionality of the gene set and by the percentage of genes in the set whose expression changes between the two phenotypes. Applying different test statistics to biological data reveals that three statistics (sum of squared t-tests, Hotelling's T², N-statistic), which test different null hypotheses, find some common but also some complementary differentially expressed gene sets under specific settings. This demonstrates that, because their null hypotheses are complementary, each test captures different aspects of the data, and that for the analysis of biological data it is beneficial to use all three tests simultaneously instead of focusing exclusively on just one.
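To make the three statistics named above concrete, the following is a minimal sketch in Python with NumPy. It is not the authors' implementation: the abstract gives no formulas, so the pooled-variance t statistic, the pseudo-inverse in Hotelling's T² (used so the sketch still runs when the gene set has more genes than samples), the energy-distance form of the N-statistic with Euclidean distance, and the permutation p-value are all assumptions made for illustration. The function names are hypothetical. Rows are samples and columns are the genes of one gene set.

```python
import numpy as np

def sum_sq_t(x, y):
    """Sum over genes of squared two-sample t statistics (pooled variance assumed)."""
    n1, n2 = len(x), len(y)
    pooled = ((n1 - 1) * x.var(axis=0, ddof=1)
              + (n2 - 1) * y.var(axis=0, ddof=1)) / (n1 + n2 - 2)
    t = (x.mean(axis=0) - y.mean(axis=0)) / np.sqrt(pooled * (1 / n1 + 1 / n2))
    return float(np.sum(t ** 2))

def hotelling_t2(x, y):
    """Two-sample Hotelling's T^2 with a pooled covariance matrix.

    Assumption: a pseudo-inverse replaces the inverse so that a singular
    covariance matrix (more genes than samples) does not raise an error.
    """
    n1, n2 = len(x), len(y)
    d = x.mean(axis=0) - y.mean(axis=0)
    s = ((n1 - 1) * np.cov(x, rowvar=False)
         + (n2 - 1) * np.cov(y, rowvar=False)) / (n1 + n2 - 2)
    s = np.atleast_2d(s)  # np.cov returns a 0-d array for a single gene
    return float(n1 * n2 / (n1 + n2) * d @ np.linalg.pinv(s) @ d)

def n_statistic(x, y):
    """Energy-distance form of an N-type statistic (Euclidean distance assumed)."""
    def mean_dist(a, b):
        # Mean pairwise Euclidean distance between rows of a and rows of b.
        return np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2).mean()
    return float(2 * mean_dist(x, y) - mean_dist(x, x) - mean_dist(y, y))

def permutation_pvalue(stat, x, y, n_perm=1000, seed=0):
    """Permutation p-value: shuffle phenotype labels and recompute the statistic."""
    rng = np.random.default_rng(seed)
    pooled = np.vstack([x, y])
    n1 = len(x)
    observed = stat(x, y)
    hits = 0
    for _ in range(n_perm):
        idx = rng.permutation(len(pooled))
        hits += stat(pooled[idx[:n1]], pooled[idx[n1:]]) >= observed
    return (hits + 1) / (n_perm + 1)

# Toy data: 20 samples per phenotype, 10 genes, mean shift in 3 of the genes.
rng = np.random.default_rng(42)
x = rng.normal(size=(20, 10))
y = rng.normal(size=(20, 10))
y[:, :3] += 1.0
for stat in (sum_sq_t, hotelling_t2, n_statistic):
    print(stat.__name__, permutation_pvalue(stat, x, y, n_perm=200))
```

Running all three statistics on the same gene set, as in the loop at the end, mirrors the abstract's recommendation to use the tests together rather than rely on one. The toy data only shift gene means, so it does not reproduce the correlation or dimensionality effects studied in the simulations.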
Similar articles
- Extracting the Strongest Signals from Omics Data: Differentially Expressed Pathways and Beyond. Methods Mol Biol. 2017;1613:125-159. doi: 10.1007/978-1-4939-7027-8_7. PMID: 28849561. Free PMC article.
- Gene set analysis for self-contained tests: complex null and specific alternative hypotheses. Bioinformatics. 2012 Dec 1;28(23):3073-80. doi: 10.1093/bioinformatics/bts579. Epub 2012 Oct 7. PMID: 23044539. Free PMC article.
- Comparison of univariate and multivariate gene set analysis in acute lymphoblastic leukemia. Asian Pac J Cancer Prev. 2013;14(3):1629-33. doi: 10.7314/apjcp.2013.14.3.1629. PMID: 23679247.
- Comparative study of gene set enrichment methods. BMC Bioinformatics. 2009 Sep 2;10:275. doi: 10.1186/1471-2105-10-275. PMID: 19725948. Free PMC article.
- Detecting Differentially Coexpressed Genes from Labeled Expression Data: A Brief Review. IEEE/ACM Trans Comput Biol Bioinform. 2014 Jan-Feb;11(1):154-67. doi: 10.1109/TCBB.2013.2297921. PMID: 26355515. Review.
Cited by
- Crosstalk analysis of dysregulated pathways in preeclampsia. Exp Ther Med. 2019 Mar;17(3):2298-2304. doi: 10.3892/etm.2019.7178. Epub 2019 Jan 16. PMID: 30867714. Free PMC article.
- Structural influence of gene networks on their inference: analysis of C3NET. Biol Direct. 2011 Jun 22;6:31. doi: 10.1186/1745-6150-6-31. PMID: 21696592. Free PMC article.
- Pathway Cross-Talk Analysis in Detecting Significant Pathways in Barrett's Esophagus Patients. Med Sci Monit. 2017 Mar 6;23:1165-1172. doi: 10.12659/msm.899623. PMID: 28263955. Free PMC article.
- Analytical strategies for studying stem cell metabolism. Front Biol (Beijing). 2015 Apr;10(2):141-153. doi: 10.1007/s11515-015-1357-z. PMID: 26213533. Free PMC article.
- Network and Pathway-Based Analyses of Genes Associated with Parkinson's Disease. Mol Neurobiol. 2017 Aug;54(6):4452-4465. doi: 10.1007/s12035-016-9998-8. Epub 2016 Jun 27. PMID: 27349437.