Unequal group variances in microarray data analyses
- PMID: 18344518
- DOI: 10.1093/bioinformatics/btn100
Abstract
Motivation: In searching for differentially expressed (DE) genes in microarray data, we often observe a fraction of the genes to have unequal variability between groups. This is not an issue in large samples, where a valid test exists that uses individual variances separately. The problem arises in the small-sample setting, where the approximately valid Welch test lacks sensitivity, while the more sensitive moderated t-test assumes equal variance.
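As a point of reference for the two classical tests contrasted above, the following is a minimal Python sketch of the equal-variance (pooled) t statistic and the Welch statistic for a single gene. The function names `pooled_t` and `welch_t` are illustrative and are not taken from the paper or its R package; the formulas are the textbook ones, with the Welch degrees of freedom given by the Welch-Satterthwaite approximation.

```python
import math
from statistics import mean, variance


def pooled_t(x, y):
    """Standard two-sample t statistic assuming equal group variances.

    Returns (t, df), where df = n1 + n2 - 2.
    """
    n1, n2 = len(x), len(y)
    v1, v2 = variance(x), variance(y)  # sample variances (n - 1 denominator)
    pooled_var = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean(x) - mean(y)) / se, n1 + n2 - 2


def welch_t(x, y):
    """Welch statistic: each group contributes its own variance.

    Returns (t, df), where df is the Welch-Satterthwaite approximation.
    """
    n1, n2 = len(x), len(y)
    a = variance(x) / n1
    b = variance(y) / n2
    se = math.sqrt(a + b)
    df = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
    return (mean(x) - mean(y)) / se, df
```

With equal group sizes the two statistics coincide and only the degrees of freedom differ. In small samples the Welch degrees of freedom can be much smaller than n1 + n2 - 2, which is one way to see the loss of sensitivity mentioned above.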
Methods: We introduce a moderated Welch test (MWT) that allows unequal variance between groups. It is based on (i) weighting of pooled and unpooled standard errors and (ii) improved estimation of the gene-level variance that exploits the information from across the genes.
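The abstract does not give the MWT formulas, so the following Python sketch only illustrates the two ingredients it names and should not be read as the authors' method. It assumes (a) a generic shrinkage of each group variance toward a prior value, in the style of the moderated t-test, and (b) a convex combination of the pooled and unpooled squared standard errors. The weight `w`, the prior variance `s0_sq`, the prior degrees of freedom `d0`, and both function names are hypothetical inputs chosen for illustration; the paper estimates its own quantities from the data.

```python
import math
from statistics import mean, variance


def shrink_variance(s2, d, s0_sq, d0):
    """Shrink a sample variance s2 (d degrees of freedom) toward a prior
    variance s0_sq carrying d0 prior degrees of freedom (assumed form)."""
    return (d0 * s0_sq + d * s2) / (d0 + d)


def weighted_se_statistic(x, y, w, s0_sq, d0):
    """Illustrative statistic mixing pooled and unpooled standard errors.

    w = 1 gives a purely pooled standard error, w = 0 a purely unpooled
    (Welch-type) one. All tuning inputs are hypothetical.
    """
    n1, n2 = len(x), len(y)
    v1 = shrink_variance(variance(x), n1 - 1, s0_sq, d0)
    v2 = shrink_variance(variance(y), n2 - 1, s0_sq, d0)
    pooled_var = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
    se2_pooled = pooled_var * (1 / n1 + 1 / n2)
    se2_unpooled = v1 / n1 + v2 / n2
    se2 = w * se2_pooled + (1 - w) * se2_unpooled
    return (mean(x) - mean(y)) / math.sqrt(se2)
```

Setting `d0 = 0` and `w = 0` recovers the ordinary Welch statistic, which gives a quick sanity check of the sketch.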
Results: When a non-trivial proportion of genes has unequal variability, false discovery rate (FDR) estimates based on the standard t and moderated t-tests are often too optimistic, while the standard Welch test has low sensitivity. The MWT is shown to (i) perform better than the standard t, the standard Welch and the moderated t-tests when the variances are unequal between groups and (ii) perform similarly to the moderated t, and better than the standard t and Welch tests, when the group variances are equal. These results mean that the MWT is more reliable than other existing tests over a wider range of data conditions.
Availability: R package to perform MWT is available at http://www.meb.ki.se/~yudpaw