A note on using permutation-based false discovery rate estimates to compare different analysis methods for microarray data
- PMID: 16188930
- DOI: 10.1093/bioinformatics/bti685
Abstract
Motivation: The false discovery rate (FDR) is defined as the expected proportion of false positives among all claimed positives. In practice the true FDR is unknown, so an estimated FDR can serve as a criterion for evaluating the performance of statistical methods, provided that the estimate approximates the true FDR well or, at least, does not improperly favor or disfavor any particular method. Permutation methods have become popular for estimating FDR in genomic studies. The purpose of this paper is twofold. First, we investigate theoretically and empirically whether the standard permutation-based FDR estimator is biased and, if so, whether the bias inappropriately favors or disfavors any method. Second, we propose a simple modification of the standard permutation procedure that yields a better FDR estimator, which can in turn serve as a fairer criterion for evaluating statistical methods.
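A generic permutation-based FDR estimate of the kind discussed here can be sketched as below. This is a minimal illustration under stated assumptions, not the paper's implementation: it assumes a one-sample design (for example, log-ratios with genes in rows and replicates in columns), uses random sign flips of whole samples as the permutation scheme, and the names `t_stat` and `permutation_fdr` are hypothetical. The estimate is the average number of null statistics exceeding the threshold divided by the number of observed statistics exceeding it.

```python
import numpy as np

def t_stat(x):
    """One-sample t-statistic per gene (rows = genes, columns = replicates)."""
    n = x.shape[1]
    return x.mean(axis=1) / (x.std(axis=1, ddof=1) / np.sqrt(n))

def permutation_fdr(x, stat, threshold, n_perm=200, seed=0):
    """Estimate FDR at |statistic| > threshold using sign-flip permutations.

    Assumption: flipping the sign of an entire sample (column) gives a valid
    null data set for a one-sample design.
    """
    rng = np.random.default_rng(seed)
    observed = np.abs(stat(x))
    total_positives = int((observed > threshold).sum())
    if total_positives == 0:
        return 0.0  # nothing is claimed positive, so report 0 by convention
    false_counts = []
    for _ in range(n_perm):
        # One random sign per sample, shared by all genes, so that
        # gene-gene correlation is preserved in the null data.
        signs = rng.choice([-1.0, 1.0], size=x.shape[1])
        null = np.abs(stat(x * signs))
        false_counts.append((null > threshold).sum())
    # Average null exceedances over observed exceedances, capped at 1.
    return min(1.0, float(np.mean(false_counts)) / total_positives)
```

Note that every gene, including any that are truly differentially expressed, contributes to the null counts in this sketch.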
Results: Both simulated and real data examples are used for illustration and comparison. Three commonly used test statistics are considered: the sample mean, the SAM statistic and Student's t-statistic. The results show that the standard permutation method overestimates FDR. The overestimation is most severe for the sample mean statistic and least severe for the t-statistic, with the SAM statistic lying between the two extremes. This suggests that one has to be cautious when using standard permutation-based FDR estimates to evaluate statistical methods. In addition, our proposed FDR estimation method is simple and outperforms the standard method.
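For orientation, the three statistics named above can be written as follows. This is a sketch assuming a one-sample design (genes in rows, replicates in columns). The function names are hypothetical, and the default fudge constant `s0` (the median standard error) is a simplifying assumption, not the rule used by SAM itself.

```python
import numpy as np

def sample_mean_stat(x):
    """Per-gene sample mean."""
    return x.mean(axis=1)

def t_stat(x):
    """Per-gene one-sample Student's t-statistic."""
    n = x.shape[1]
    se = x.std(axis=1, ddof=1) / np.sqrt(n)
    return x.mean(axis=1) / se

def sam_stat(x, s0=None):
    """SAM-style statistic: mean / (standard error + s0)."""
    n = x.shape[1]
    se = x.std(axis=1, ddof=1) / np.sqrt(n)
    if s0 is None:
        s0 = np.median(se)  # assumption: a simple stand-in for SAM's choice
    return x.mean(axis=1) / (se + s0)
```

With `s0 = 0` the SAM-style statistic reduces to the t-statistic, and as `s0` grows its gene ranking approaches that of the sample mean, so in this form it sits between the other two.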
Comment in
- Comments on 'On correcting the overestimation of the permutation-based false discovery rate estimator'. Bioinformatics. 2008 Oct 15;24(20):2420. doi: 10.1093/bioinformatics/btn456. Epub 2008 Sep 1. PMID: 18762484. No abstract available.