Comparative Study
Hum Hered. 2010;70(3):167-76.
doi: 10.1159/000308456. Epub 2010 Jul 30.

Fast and robust association tests for untyped SNPs in case-control studies

Andrew S Allen et al. Hum Hered. 2010.

Abstract

Genome-wide association studies (GWASs) aim to genotype enough single nucleotide polymorphisms (SNPs) to effectively capture common genetic variants across the genome. Even though the number of SNPs genotyped in such studies can exceed a million, there is still interest in testing association with SNPs that were not genotyped in the study sample. Analyses of such untyped SNPs can assist in signal localization, permit cross-platform integration of samples from separate studies, and can improve power - especially for rarer SNPs. External information on a larger collection of SNPs from an appropriate reference panel, comprising both SNPs typed in the sample and the untyped SNPs we wish to test for association, is necessary for an untyped variant analysis to proceed. Linkage disequilibrium patterns observed in the reference panel are then used to infer the likely genotype at the untyped SNPs in the study sample. We propose here a novel statistical approach for testing untyped SNPs in case-control GWAS, based on an efficient score function derived from a prospective likelihood, that automatically accounts for the variability in the process of estimating the untyped variant. Computationally efficient methods of phasing can be used without affecting the validity of the test, and simple measures of haplotype sharing can be used to infer genotypes at the untyped SNPs, making our approach computationally much faster than existing approaches for untyped analysis. At the same time, we show, using simulated data, that our approach often has performance nearly equivalent to hidden Markov methods of untyped analysis. The software package 'untyped' is available to implement our approach.
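As a rough illustration of the kind of score-based case-control test the abstract refers to, the sketch below computes a standard logistic-regression score statistic from an expected allele count (dosage) per subject. It is a simplified stand-in, not the authors' efficient score: it treats the inferred dosages as fixed and so does not account for the variability in estimating the untyped variant, which is the feature the abstract emphasizes. The function name `score_test` and its arguments are hypothetical and are not part of the 'untyped' software package.

```python
import math


def score_test(status, dosage):
    """Score test of no association between case status and allele dosage.

    status: sequence of 0/1 values (control/case).
    dosage: sequence of expected allele counts in [0, 2], assumed to have
            been inferred beforehand (e.g. from a reference panel).
    Returns (chi-square statistic on 1 df, p-value).

    Assumption: dosages are treated as known constants, so the extra
    uncertainty from inferring them is ignored here.
    """
    n = len(status)
    if n == 0 or n != len(dosage):
        raise ValueError("status and dosage must be non-empty and equal length")

    y_bar = sum(status) / n
    m_bar = sum(dosage) / n

    # Score: derivative of the logistic log-likelihood at beta = 0.
    u = sum((y - y_bar) * m for y, m in zip(status, dosage))

    # Variance of the score under the null hypothesis of no association.
    v = y_bar * (1.0 - y_bar) * sum((m - m_bar) ** 2 for m in dosage)
    if v == 0.0:
        raise ValueError("no variation in status or dosage")

    stat = u * u / v
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


if __name__ == "__main__":
    # Made-up toy data, for illustration only.
    status = [1, 1, 1, 1, 0, 0, 0, 0]
    dosage = [1.8, 1.2, 0.9, 1.5, 0.3, 0.6, 0.1, 0.8]
    print(score_test(status, dosage))
```

A score test needs only quantities evaluated under the null hypothesis, so no model has to be fitted per SNP; this is one reason such tests are cheap to run genome-wide.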

Figures

Fig. 1
Example showing analysis of a simulated dataset. Black dots represent efficient score-based testing (using sharing method to estimate m_i) of untyped loci. Red dots represent tests of observed genotypes. Black and red solid horizontal lines represent significance thresholds used in simulation study for untyped- and typed-only analyses, respectively. Black and red dashed horizontal lines represent significance thresholds for untyped- and typed-only analyses, respectively, established by permutation applied to this dataset. Dashed vertical line denotes location of true disease locus.
