Review
Front Genet. 2013 Dec 3:4:266.
doi: 10.3389/fgene.2013.00266.

Discovering epistasis in large scale genetic association studies by exploiting graphics cards

Gary K Chen et al. Front Genet.

Abstract

Despite the enormous investments made in collecting DNA samples and generating germline variation data across thousands of individuals in modern genome-wide association studies (GWAS), progress has been frustratingly slow in explaining much of the heritability in common disease. Today's paradigm of testing independent hypotheses on each single nucleotide polymorphism (SNP) marker is unlikely to adequately reflect the complex biological processes in disease risk. Alternatively, modeling risk as an ensemble of SNPs that act in concert in a pathway and/or interact non-additively on log risk, for example, may be a more sensible way to approach gene mapping in modern studies. Implementing such analyses genome-wide can quickly become intractable because even modest-sized SNP panels on modern genotype arrays (500k markers) pose a combinatorial nightmare, requiring tens of billions of models to be tested for evidence of interaction. In this article, we provide an in-depth analysis of programs that have been developed explicitly to overcome these enormous computational barriers through the use of processors on graphics cards known as Graphics Processing Units (GPUs). We include tutorials on GPU technology, which convey why GPUs are growing in appeal with today's numerical scientists. One obvious advantage is the impressive density of microprocessor cores available on a single GPU. Whereas high-end servers feature up to 24 Intel or AMD CPU cores, the latest GPU offerings from NVIDIA feature over 2600 cores, and each compute node may be outfitted with up to 4 GPU devices. Success on GPUs varies across problems; however, epistasis screens fare well because of the high degree of parallelism exposed in these problems. Papers that we review routinely report GPU speedups of over two orders of magnitude (>100x) over standard CPU implementations.

Keywords: CUDA tutorial; GPU programming; epistasis; gene–gene interactions; high performance computing.

Figures

Figure 1. Comparison of CPU and GPU architecture.
Figure 2. Parallel execution work-items on a GPU.
Figure 3. Improvements from GPU optimizations (figure from Hemani et al., 2011).
Figure 4. Parallelization in GLIDE (figure from Kam-Thong et al., 2012).
Figure 5. Candidates from BOOST first stage screen (figure from Wan et al., 2010).
Figure 6. Runtime as a function of number of active threads across different optimization levels (figure from Yung et al., 2011).
Figure 7. Overview of MDR (figure from Oki and Motsinger-Reif, 2011).
Figure 8. ROC for methods to analyze a continuous trait.
Figure 9. ROC for methods to analyze a binary trait.
Figure A1. Parallel reduction for computing summary statistics.
Figure A2. Thrust code for computing means across different SNPs.
Figure A3. CUDA code for computing means across different SNPs.
Figure A4. OpenCL kernel for computing means across different SNPs.
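
The appendix figures themselves are not reproduced on this page. As a rough illustration of the idea named in their captions (a parallel reduction used to compute per-SNP means), the following is a minimal sequential sketch in plain Python. It is not the authors' Thrust, CUDA, or OpenCL code. The function names `tree_reduce_sum` and `snp_means` and the toy genotype data are hypothetical, and the inner loop only emulates what independent GPU threads would do in one step.

```python
def tree_reduce_sum(values):
    """Sum a sequence by pairwise (tree) reduction.

    Assumption: this mimics, sequentially, the halving pattern a single
    GPU work-group might use; it is an illustration, not the paper's code.
    """
    buf = list(values)
    n = len(buf)
    if n == 0:
        return 0.0
    # Pad with zeros to the next power of two so every step halves cleanly.
    size = 1
    while size < n:
        size *= 2
    buf.extend([0.0] * (size - n))
    stride = size // 2
    while stride > 0:
        # On a GPU, each index i below would be handled by its own thread,
        # with a synchronization barrier before the stride is halved.
        for i in range(stride):
            buf[i] += buf[i + stride]
        stride //= 2
    return buf[0]


def snp_means(genotypes):
    """Mean genotype per SNP.

    genotypes: list of non-empty lists, one per SNP, holding hypothetical
    0/1/2 allele counts across subjects.
    """
    return [tree_reduce_sum(snp) / len(snp) for snp in genotypes]


# Toy data (made up for illustration): three SNPs, differing sample sizes.
genotypes = [[0, 1, 2, 1], [2, 2, 0, 0, 2], [0, 0, 0]]
means = snp_means(genotypes)
print(means)
```

The reduction takes a number of steps that grows with the logarithm of the number of subjects, rather than linearly, provided enough threads are available. This is one reason summary statistics of this kind map well onto GPUs.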
