Spinning convincing stories for both true and false association signals
- PMID: 30657194
- PMCID: PMC6590226
- DOI: 10.1002/gepi.22189
Abstract
When interpreting genome-wide association peaks, it is common to annotate each peak by searching for genes with plausible relationships to the trait. However, "all that glitters is not gold": one might interpret apparent patterns in the data as plausible even when the peak is a false positive. Accordingly, we sought to see how human annotators interpreted association results containing a mixture of peaks from both the original trait and a genetically uncorrelated "synthetic" trait. Two of us prepared a mix of original and synthetic peaks of three significance categories from five different scans, along with relevant literature search results, and then we all annotated these regions. Three annotators also scored the strength of evidence connecting each peak to the scanned trait and the likelihood of further studying that region. While annotators found original peaks to have stronger evidence (Bonferroni-adjusted p = 0.017) and higher likelihood of further study (Bonferroni-adjusted p = 0.006) than synthetic peaks, annotators often made convincing connections between the synthetic peaks and the original trait, finding these connections 55% of the time. These results show that it is not difficult for annotators to make convincing connections between synthetic association signals and genes found in those regions.
Keywords: association peaks; false positives; genome-wide association studies; literature review.
© 2019 The Authors. Genetic Epidemiology Published by Wiley Periodicals, Inc.
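The abstract reports Bonferroni-adjusted p-values for the two score comparisons. As a minimal sketch of what that adjustment does in general (not the authors' code): each raw p-value is multiplied by the number of tests and capped at 1. The function name `bonferroni_adjust` and the raw p-values below are hypothetical and purely illustrative; the abstract gives neither the raw values nor the number of tests.

```python
def bonferroni_adjust(p_values):
    """Multiply each raw p-value by the number of tests, capping at 1.0."""
    m = len(p_values)  # number of tests being corrected for
    return [min(1.0, p * m) for p in p_values]


# Hypothetical raw p-values for two comparisons (illustrative numbers only,
# not taken from the study).
raw = [0.01, 0.04]
adjusted = bonferroni_adjust(raw)
print(adjusted)
```

An adjusted value below the chosen significance level (commonly 0.05) is then read the same way as an unadjusted one, with the family-wise error rate controlled across all tests.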