Comparative Study
BMC Bioinformatics. 2004 Jul 12;5:92.
doi: 10.1186/1471-2105-5-92.

Target SNP selection in complex disease association studies

Matthias Wjst. BMC Bioinformatics.

Abstract

Background: The massive amount of SNP data stored at public internet sites provides unprecedented access to human genetic variation. Target SNPs for disease-gene association studies are currently selected more or less at random, because decision rules for selecting functionally relevant SNPs are not available.

Results: We implemented a computational pipeline that retrieves the genomic sequence of target genes, collects information about sequence variation, and selects functional motifs containing SNPs. The motifs considered are gene promoters, exon-intron structure, AU-rich mRNA elements, transcription factor binding motifs, and cryptic and enhancer splice sites, together with expression in the target tissue. As a case study, 396 genes on chromosome 6p21 in the extended HLA region were selected, which contributed nearly 20,000 SNPs. Computer annotation identified ~2,500 SNPs in functional motifs. Most of these SNPs disrupt transcription factor binding sites, but only those introducing new sites had a significant depressing effect on SNP allele frequency. Other decision rules concern the position within motifs, the validity of SNP database entries, unique occurrence in the genome, and conserved sequence context in other mammalian genomes.
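The distinction between SNPs that disrupt a motif and SNPs that introduce a new one can be illustrated with a short Python sketch. This is not the pipeline described in the paper: the function name `motif_effect`, its parameters, and the use of a regular expression to represent a motif are all assumptions made for illustration. The example motif is the ATTTA pentamer (the DNA form of the AUUUA core of AU-rich mRNA elements). A real annotation would more likely rely on curated motif databases or position weight matrices.

```python
import re


def motif_effect(seq, pos, alt, motif):
    """Classify how a SNP changes motif occurrences that overlap it.

    Hypothetical helper for illustration only.
    seq   -- reference sequence (string)
    pos   -- 0-based position of the SNP in seq
    alt   -- alternative allele (single base)
    motif -- regular expression describing the motif
    Returns "creates", "destroys", "alters", or "none".
    """
    # A lookahead lets the search report overlapping motif matches.
    pattern = re.compile("(?=(" + motif + "))")
    mutated = seq[:pos] + alt + seq[pos + 1:]

    def hits(s):
        # Start offsets of all motif matches that cover the SNP position.
        return {
            m.start()
            for m in pattern.finditer(s)
            if m.start() <= pos < m.start() + len(m.group(1))
        }

    ref_hits, alt_hits = hits(seq), hits(mutated)
    if ref_hits == alt_hits:
        return "none"      # no overlapping motif, or motif unchanged
    if not ref_hits:
        return "creates"   # motif present only with the alternative allele
    if not alt_hits:
        return "destroys"  # motif present only with the reference allele
    return "alters"        # motif occurrences shift position


# Example: a T>C change inside an ATTTA element, and the reverse change.
print(motif_effect("GGATTTAGG", 4, "C", "ATTTA"))
print(motif_effect("GGATCTAGG", 4, "T", "ATTTA"))
```

Separating "creates" from "destroys" mirrors the abstract's observation that only SNPs introducing new binding sites were associated with lower allele frequencies.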

Conclusion: Only 10% of all gene-based SNPs have sequence-predicted functional relevance, making them a primary target for genotyping in association studies.

Figures

Figure 1
The SNP context view consists of 7 panes with a flexible number of sequence lanes in pane "E". The sequence pane may be exploded by the interlinear display of splice variants or conserved sequences among species.
Figure 2
Allele frequencies of 1,633 SNPs in the Caucasian population (a random subset of the 19,495 annotated SNPs on chromosome 6p21) by type of functional change. In the two bottom tertiles of the frequency distribution, allele frequencies were on average 4% lower (P = 0.004) for SNPs that insert a new transcription factor binding site (red line) than for SNPs that destroy a binding site or are not found in any motif (black line).
