Target SNP selection in complex disease association studies
- PMID: 15248903
- PMCID: PMC487897
- DOI: 10.1186/1471-2105-5-92
Abstract
Background: The massive amount of SNP data stored at public internet sites provides unprecedented access to human genetic variation. Target SNPs for disease-gene association studies are currently selected more or less at random, because decision rules for selecting functionally relevant SNPs are not available.
Results: We implemented a computational pipeline that retrieves the genomic sequence of target genes, collects information about sequence variation, and selects functional motifs that contain SNPs. The motifs considered are gene promoters, exon-intron structure, AU-rich mRNA elements, transcription factor binding motifs, and cryptic and enhancer splice sites, together with expression in the target tissue. As a case study, 396 genes on chromosome 6p21 in the extended HLA region were selected, which together contributed nearly 20,000 SNPs. Computer annotation identified ~2,500 SNPs in functional motifs. Most of these SNPs disrupt transcription factor binding sites, but only those introducing new sites had a significant depressing effect on SNP allele frequency. Other decision rules concern position within motifs, the validity of SNP database entries, unique occurrence in the genome, and conserved sequence context in other mammalian genomes.
Conclusion: Only 10% of all gene-based SNPs have sequence-predicted functional relevance, which makes them a primary target for genotyping in association studies.
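The decision rules named in the Results can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the `Snp` record, its field names, the motif labels, and the choice to treat every rule as a hard yes/no filter are all assumptions made for illustration, and the example records are invented placeholders rather than real database entries.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Snp:
    """Hypothetical annotation record for one SNP (field names are assumptions)."""
    snp_id: str
    motif: Optional[str]      # functional motif the SNP falls in, or None
    validated: bool           # database entry considered valid
    unique_in_genome: bool    # flanking sequence occurs once in the genome
    conserved: bool           # sequence context conserved in other mammals


# Assumed labels for the motif classes listed in the abstract.
FUNCTIONAL_MOTIFS = {
    "promoter",
    "exon_intron_boundary",
    "au_rich_element",
    "tf_binding_site",
    "splice_site",
}


def select_target_snps(snps: List[Snp]) -> List[Snp]:
    """Keep SNPs that sit in a functional motif and pass every other rule."""
    return [
        s for s in snps
        if s.motif in FUNCTIONAL_MOTIFS
        and s.validated
        and s.unique_in_genome
        and s.conserved
    ]


# Invented placeholder records, only to exercise the filter.
example = [
    Snp("demo_1", "tf_binding_site", True, True, True),
    Snp("demo_2", None, True, True, True),            # outside any motif
    Snp("demo_3", "promoter", False, True, True),     # unvalidated entry
    Snp("demo_4", "splice_site", True, True, False),  # context not conserved
]
picked = select_target_snps(example)
print([s.snp_id for s in picked])
```

A real pipeline would more likely score or rank SNPs than apply strict filters, since the abstract does not state how the individual rules are combined.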
Similar articles
- An efficient computational method for screening functional SNPs in plants. J Theor Biol. 2010 Jul 7;265(1):55-62. doi: 10.1016/j.jtbi.2010.04.017. Epub 2010 Apr 18. PMID: 20406646
- Familial adenomatous polyposis: aberrant splicing due to missense or silent mutations in the APC gene. Hum Mutat. 2004 Nov;24(5):370-80. doi: 10.1002/humu.20087. PMID: 15459959
- LS-SNP: large-scale annotation of coding non-synonymous SNPs based on multiple information sources. Bioinformatics. 2005 Jun 15;21(12):2814-20. doi: 10.1093/bioinformatics/bti442. Epub 2005 Apr 12. PMID: 15827081
- Single nucleotide polymorphism in transcriptional regulatory regions and expression of environmentally responsive genes. Toxicol Appl Pharmacol. 2005 Sep 1;207(2 Suppl):84-90. doi: 10.1016/j.taap.2004.09.024. PMID: 16002116 Review.
- Advances in the Exon-Intron Database (EID). Brief Bioinform. 2006 Jun;7(2):178-85. doi: 10.1093/bib/bbl003. Epub 2006 Mar 9. PMID: 16772261 Review.
Cited by
- Association of a functional cytochrome P450 4F2 haplotype with urinary 20-HETE and hypertension. J Am Soc Nephrol. 2008 Apr;19(4):714-21. doi: 10.1681/ASN.2007060713. Epub 2008 Jan 30. PMID: 18235092 Free PMC article.
- SNPs in Multi-species Conserved Sequences (MCS) as useful markers in association studies: a practical approach. BMC Genomics. 2007 Aug 6;8:266. doi: 10.1186/1471-2164-8-266. PMID: 17683615 Free PMC article.
- Genetic basis of interindividual susceptibility to cancer cachexia: selection of potential candidate gene polymorphisms for association studies. J Genet. 2014 Dec;93(3):893-916. doi: 10.1007/s12041-014-0405-9. PMID: 25572253 Review.
- Identification of possible genetic polymorphisms involved in cancer cachexia: a systematic review. J Genet. 2011 Apr;90(1):165-77. doi: 10.1007/s12041-011-0027-4. PMID: 21677406
- Selection of SNP subsets for association studies in candidate genes: comparison of the power of different strategies to detect single disease susceptibility locus effects. BMC Genet. 2006 Apr 5;7:20. doi: 10.1186/1471-2156-7-20. PMID: 16597333 Free PMC article.