Skip to main page content
U.S. flag

An official website of the United States government

Dot gov

The .gov means it’s official.
Federal government websites often end in .gov or .mil. Before sharing sensitive information, make sure you’re on a federal government site.

Https

The site is secure.
The https:// ensures that you are connecting to the official website and that any information you provide is encrypted and transmitted securely.

Access keys NCBI Homepage MyNCBI Homepage Main Content Main Navigation
. 2009 Mar 15;53(5):1794-1801.
doi: 10.1016/j.csda.2008.04.013.

A Fast Implementation of a Scan Statistic for Identifying Chromosomal Patterns of Genome Wide Association Studies

Affiliations

A Fast Implementation of a Scan Statistic for Identifying Chromosomal Patterns of Genome Wide Association Studies

Yan V Sun et al. Comput Stat Data Anal. .

Abstract

In order to take into account the complex genomic distribution of SNP variations when identifying chromosomal regions with significant SNP effects, a single nucleotide polymorphism (SNP) association scan statistic was developed. To address the computational needs of genome wide association (GWA) studies, a fast Java application, which combines single-locus SNP tests and a scan statistic for identifying chromosomal regions with significant clusters of significant SNP effects, was developed and implemented. To illustrate this application, SNP associations were analyzed in a pharmacogenomic study of the blood pressure lowering effect of thiazide-diuretics (N=195) using the Affymetrix Human Mapping 100K Set. 55,335 tagSNPs (pair-wise linkage disequilibrium R(2)<0.5) were selected to reduce the frequency correlation between SNPs. A typical workstation can complete the whole genome scan including 10,000 permutation tests within 3 hours. The most significant regions locate on chromosome 3, 6, 13 and 16, two of which contain candidate genes that may be involved in the underlying drug response mechanism. The computational performance of ChromoScan-GWA and its scalability were tested with up to 1,000,000 SNPs and up to 4,000 subjects. Using 10,000 permutations, the computation time grew linearly in these datasets. This scan statistic application provides a robust statistical and computational foundation for identifying genomic regions associated with disease and provides a method to compare GWA results even across different platforms.

PubMed Disclaimer

Figures

Figure 1
Figure 1
Region 12 on chromosome 3 associated with the drug response overlaps with SUCLG2 gene. The upper portion shows the arrangement of the genic regions within SUCLG2 and the locations of tagSNPs. The middle portion shows the single SNP p-values of the ten tagSNPs within the identified regions. Significant single SNP tests (chi-square pvalue < 0.05) are shaded in dark color. The lower portion shows pairwise linkage disequilibrium R2 (numbers in the cells) between the tagSNPs.
Figure 2
Figure 2
The scalability of the computation speed of ChromoScan-GWA regarding to the sample size and the number of SNPs.

References

    1. Wellcome Trust Case Control Consortium. Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls. Nature. 2007;447:661–678. - PMC - PubMed
    1. Scott LJ, Mohlke KL, Bonnycastle LL, Willer CJ, Li Y, Duren WL, Erdos MR, Stringham HM, Chines PS, Jackson AU, Prokunina-Olsson L, Ding CJ, Swift AJ, Narisu N, Hu T, Pruim R, Xiao R, Li XY, Conneely KN, Riebow NL, Sprau AG, Tong M, White PP, Hetrick KN, Barnhart MW, Bark CW, Goldstein JL, Watkins L, Xiang F, Saramies J, Buchanan TA, Watanabe RM, Valle TT, Kinnunen L, Abecasis GR, Pugh EW, Doheny KF, Bergman RN, Tuomilehto J, Collins FS, Boehnke M. A genome-wide association study of type 2 diabetes in finns detects multiple susceptibility variants. Science. 2007;316:1341–1345. - PMC - PubMed
    1. Duerr RH, Taylor KD, Brant SR, Rioux JD, Silverberg MS, Daly MJ, Steinhart AH, Abraham C, Regueiro M, Griffiths A, Dassopoulos T, Bitton A, Yang H, Targan S, Datta LW, Kistner EO, Schumm LP, Lee AT, Gregersen PK, Barmada MM, Rotter JI, Nicolae DL, Cho JH. A genome-wide association study identifies IL23R as an inflammatory bowel disease gene. Science. 2006;314:1461–1463. - PMC - PubMed
    1. Karlin S, Macken C. Assessment of inhomogeneities in an E. coli physical map. Nucleic Acids Res. 1991;19:4241–4246. - PMC - PubMed
    1. Wagner A. A computational genomics approach to the identification of gene networks. Nucleic Acids Res. 1997;25:3594–3604. - PMC - PubMed

LinkOut - more resources