Am J Hum Genet. 2023 Jan 5;110(1):23-29.
doi: 10.1016/j.ajhg.2022.11.010. Epub 2022 Dec 7.

LDAK-GBAT: Fast and powerful gene-based association testing using summary statistics

Takiy-Eddine Berrandou et al. Am J Hum Genet. 2023.

Abstract

We present LDAK-GBAT, a tool for gene-based association testing using summary statistics from genome-wide association studies that is computationally efficient, produces well-calibrated p values, and is significantly more powerful than existing tools. LDAK-GBAT takes approximately 30 min to analyze imputed data (2.9M common, genic SNPs), requiring less than 10 Gb memory. It shows good control of type 1 error given an appropriate reference panel. Across 109 phenotypes (82 from the UK Biobank, 18 from the Million Veteran Program, and nine from the Psychiatric Genetics Consortium), LDAK-GBAT finds on average 19% (SE: 1%) more significant genes than the existing tool sumFREGAT-ACAT, with even greater gains in comparison with MAGMA, GCTA-fastBAT, sumFREGAT-SKAT-O, and sumFREGAT-PCA.

Keywords: UK Biobank; complex traits; gene-based association testing; genome-wide association study; statistical genetics.

Conflict of interest statement

Declaration of interests: The authors declare no competing interests.

Figures

Figure 1. Number of significant genes. We analyze each of the 109 phenotypes by using LDAK-GBAT and five existing tools. Bars report the relative number of genome-wide significant genes ($p \le 2.5 \times 10^{-6}$) from each tool, with LDAK-GBAT as the reference. Segments mark 95% confidence intervals for the ratios. The total number of genome-wide significant genes from each tool is reported above each bar.
Figure 2. Clumping results from LDAK-GBAT. Segments provide p values from LDAK-GBAT when analyzing the MVP phenotype type 2 diabetes; red segments indicate significant genes that are selected by clumping, gold segments indicate significant genes that are discarded by clumping, and gray segments indicate non-significant genes. Blue points report p values for genic SNPs from single-SNP analysis. The gold and blue horizontal lines mark $p = 2.5 \times 10^{-6}$ and $p = 5 \times 10^{-8}$, respectively (the significance thresholds used with LDAK-GBAT and single-SNP analyses).
Figure 3. Varying the heritability model. We analyze each of the 109 phenotypes by using LDAK-GBAT with seven heritability models, defined by $\mathbb{E}[h_j^2] \propto [p_j(1-p_j)]^{1+\alpha}$, where $p_j$ is the MAF of SNP $j$ and $\alpha$ is −1.25, −1, −0.75, −0.5, −0.25, 0, or 0.25. Bars report the total number of genome-wide significant genes ($p \le 2.5 \times 10^{-6}$) for each heritability model. For comparison, the red horizontal line reports the total number of genome-wide significant genes from sumFREGAT-ACAT (the best existing tool).
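
To make the role of α in the Figure 3 heritability model concrete, the following minimal sketch computes relative expected per-SNP heritabilities for a few example MAFs. It is an illustration only, not LDAK-GBAT code: the function name `heritability_weights`, the example MAF values, and the choice to normalize the weights so they sum to 1 are all assumptions made here.

```python
def heritability_weights(mafs, alpha):
    """Relative expected per-SNP heritabilities, assuming
    E[h_j^2] is proportional to [p_j * (1 - p_j)] ** (1 + alpha).

    Normalizing to sum to 1 is an illustrative choice, not part of the model.
    """
    for p in mafs:
        if not 0.0 < p <= 0.5:
            raise ValueError(f"MAF must lie in (0, 0.5], got {p}")
    raw = [(p * (1.0 - p)) ** (1.0 + alpha) for p in mafs]
    total = sum(raw)
    return [r / total for r in raw]


# The seven alpha values listed in the Figure 3 caption.
ALPHAS = [-1.25, -1.0, -0.75, -0.5, -0.25, 0.0, 0.25]

# Hypothetical MAFs: one rare, one intermediate, one common SNP.
example_mafs = [0.01, 0.10, 0.50]

for alpha in ALPHAS:
    weights = heritability_weights(example_mafs, alpha)
    print(alpha, [round(w, 3) for w in weights])
```

Under this form, α = −1 gives every SNP the same expected heritability regardless of MAF, values below −1 favor rarer SNPs, and values above −1 favor more common SNPs.
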
Figure 4. Comparing LDAK-GBAT and single-SNP analysis for the first ten UK Biobank phenotypes. The gold bar reports the total number of genome-wide significant genes ($p \le 2.5 \times 10^{-6}$) from LDAK-GBAT (using 50,000 individuals). The blue bars report how many of these genes contain a SNP with Bonferroni-corrected $p \le 2.5 \times 10^{-6}$ from single-SNP analysis as the sample size is increased from 50,000 to 200,000.
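
The per-gene check described in the Figure 4 caption can be sketched as follows. This is a hedged illustration: the caption does not say how the Bonferroni correction is applied, so the code assumes the smallest single-SNP p value in a gene is multiplied by the number of SNPs in that gene (capped at 1). The function name and the example p values are hypothetical.

```python
GENE_THRESHOLD = 2.5e-6  # gene-level significance threshold from the caption


def gene_has_significant_snp(snp_pvalues, threshold=GENE_THRESHOLD):
    """Return True if the gene's best SNP passes the threshold after
    Bonferroni correction.

    Assumption: the correction factor is the number of SNPs in the gene.
    """
    if not snp_pvalues:
        raise ValueError("gene must contain at least one SNP")
    n_snps = len(snp_pvalues)
    corrected = min(1.0, min(snp_pvalues) * n_snps)
    return corrected <= threshold


# Hypothetical single-SNP p values for two genes with three SNPs each.
print(gene_has_significant_snp([1e-8, 0.2, 0.5]))
print(gene_has_significant_snp([1e-6, 0.3, 0.4]))
```

In the first example the corrected p value is 3 × 10⁻⁸, which passes the threshold; in the second it is 3 × 10⁻⁶, which does not.
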
