The false discovery rate: a key concept in large-scale genetic studies
- PMID: 20010520
- DOI: 10.1177/107327481001700108
Abstract
Background: In experimental research, a statistical test is often used to decide whether to reject a null hypothesis, for example that mean gene expression is equal in the normal and tumor groups. Typically, a test statistic and its corresponding P value are calculated to measure the extent of the difference between the two groups. The null hypothesis is rejected and a discovery is declared when the P value is less than a prespecified significance level. When more than one test is conducted, applying a significance level intended for a single test typically leads to a large chance of false-positive findings.
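To make the inflation concrete, the following is a short worked calculation rather than a result from the paper. It assumes m independent tests, all with true null hypotheses, each run at significance level α; the symbols m and α and the choice m = 20 are illustrative assumptions.

```latex
\[
P(\text{at least one false positive}) = 1 - (1 - \alpha)^{m}
\]
\[
\alpha = 0.05,\; m = 20:\qquad 1 - 0.95^{20} \approx 1 - 0.358 = 0.642
\]
```

Under these assumptions, 20 tests at the usual 0.05 level already give roughly a 64% chance of at least one false-positive finding.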
Methods: This paper presents an overview of the multiple testing framework and describes the false discovery rate (FDR) approach to determining the significance cutoff when a large number of tests are conducted.
Results: The FDR is the expected proportion of falsely rejected null hypotheses among all rejected null hypotheses. An FDR-controlling procedure is described and illustrated with a numerical example.
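The abstract does not name the FDR-controlling procedure it describes. A minimal sketch, assuming the widely used Benjamini-Hochberg step-up procedure, might look like the following. The function name `benjamini_hochberg`, the parameter `q` (the target FDR level), and the P values are hypothetical and are not the paper's numerical example.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans: True where the null hypothesis is rejected.

    Sketch of the Benjamini-Hochberg step-up procedure (an assumed choice;
    the abstract does not say which procedure the paper uses).
    """
    m = len(p_values)
    # Indices of the P values, ordered from smallest to largest P value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest 1-based rank k whose ordered P value is <= (k / m) * q.
    largest_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            largest_k = rank
    # Reject every hypothesis ranked at or below largest_k.
    rejected = [False] * m
    for idx in order[:largest_k]:
        rejected[idx] = True
    return rejected


# Hypothetical P values from 10 tests (illustrative only).
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.060, 0.074, 0.205, 0.212, 0.216]
rejected = benjamini_hochberg(p_values, q=0.05)
print(sum(rejected), "of", len(p_values), "null hypotheses rejected")
```

With these values, five P values fall below an unadjusted 0.05 cutoff, but only the two smallest are rejected at q = 0.05. The standard guarantee for this procedure holds for independent or positively dependent tests.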
Conclusions: In multiple testing, a classical "family-wise error rate" (FWE) approach is commonly used when the number of tests is small. When a study involves a large number of tests, the FDR is a more useful error measure for determining a significance cutoff, because the FWE approach is too stringent. The FDR approach allows more claims of significant differences to be made, provided the investigator is willing to accept a small fraction of false-positive findings.
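The difference in stringency can be illustrated with a small comparison. The sketch below assumes the Bonferroni correction as the FWE-controlling method and the Benjamini-Hochberg procedure as the FDR-controlling method; neither is named in the abstract, and the function name `count_rejections` and the P values are hypothetical.

```python
def count_rejections(p_values, alpha=0.05):
    """Return (Bonferroni rejections, Benjamini-Hochberg rejections).

    Both methods are assumed stand-ins for the FWE and FDR approaches.
    """
    m = len(p_values)
    ranked = sorted(p_values)
    # Bonferroni: one fixed cutoff alpha / m applied to every test.
    bonferroni = sum(p <= alpha / m for p in ranked)
    # Benjamini-Hochberg: the cutoff (k / m) * alpha grows with rank k;
    # reject the hypotheses up to the largest rank that meets its cutoff.
    bh = max(
        (k for k, p in enumerate(ranked, start=1) if p <= k / m * alpha),
        default=0,
    )
    return bonferroni, bh


# Hypothetical P values from 10 tests (illustrative only).
example = [0.001, 0.004, 0.009, 0.012, 0.018, 0.024, 0.2, 0.4, 0.6, 0.9]
n_bonferroni, n_bh = count_rejections(example, alpha=0.05)
print("Bonferroni:", n_bonferroni, "Benjamini-Hochberg:", n_bh)
```

For these values the fixed Bonferroni cutoff of 0.005 rejects two hypotheses, while the rank-dependent cutoffs reject six, at the cost of tolerating a small expected fraction of false positives among those six.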