Commun Biol. 2022 Aug 11;5(1):806.
doi: 10.1038/s42003-022-03738-6.

GAWMerge expands GWAS sample size and diversity by combining array-based genotyping and whole-genome sequencing

Ravi Mathur et al. Commun Biol.

Abstract

Genome-wide association studies (GWAS) have made impactful discoveries for complex diseases, often by amassing very large sample sizes. Yet, GWAS of many diseases remain underpowered, especially for non-European ancestries. One cost-effective approach to increase sample size is to combine existing cohorts, which may have limited sample size or be case-only, with public controls, but this approach is limited by the need for a large overlap in variants across genotyping arrays and the scarcity of non-European controls. We developed and validated a protocol, Genotyping Array-WGS Merge (GAWMerge), for combining genotypes from arrays and whole-genome sequencing, ensuring complete variant overlap, and allowing for diverse samples like Trans-Omics for Precision Medicine to be used. Our protocol involves phasing, imputation, and filtering. We illustrated its ability to control technology-driven artifacts and type-I error, as well as recover known disease-associated signals across technologies, independent datasets, and ancestries in smoking-related cohorts. GAWMerge enables genetic studies to leverage existing cohorts to validly increase sample size and enhance discovery for understudied traits and ancestries.
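
As an illustration of the filtering and variant-overlap idea described in the abstract, the following is a minimal sketch, not the GAWMerge implementation. It assumes each dataset has already been phased and imputed, and that every variant carries an imputation quality score. The function names, the variant key layout, and the 0.8 threshold are hypothetical choices made for this example only.

```python
from typing import Dict, Set, Tuple

# Hypothetical variant key: (chromosome, position, reference allele, alternate allele).
Variant = Tuple[str, int, str, str]


def filter_by_quality(r2_by_variant: Dict[Variant, float], min_r2: float) -> Set[Variant]:
    """Keep variants whose imputation quality score is at least min_r2."""
    return {variant for variant, r2 in r2_by_variant.items() if r2 >= min_r2}


def shared_variants(
    array_r2: Dict[Variant, float],
    wgs_r2: Dict[Variant, float],
    min_r2: float = 0.8,  # assumed threshold, not taken from the paper
) -> Set[Variant]:
    """Return variants that pass the quality filter in BOTH datasets.

    Restricting the merged data to this intersection means every retained
    variant is present for array-genotyped and WGS samples alike.
    """
    return filter_by_quality(array_r2, min_r2) & filter_by_quality(wgs_r2, min_r2)
```

Restricting the analysis to the intersection is one simple way to guarantee that cases and controls are compared on the same variant set; the published protocol may apply additional or different filters.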

Trial registration: ClinicalTrials.gov NCT00292552.

Conflict of interest statement

E.K.S. has received institutional grant support from GlaxoSmithKline and Bayer. M.H.C. has received grant support from GSK and Bayer, and consulting or speaking fees from Illumina, Genentech, and AstraZeneca. All other authors have no competing interests.

Figures

Fig. 1. Overview of the protocol to use whole-genome sequencing (WGS) data as public controls in GWAS.
*The quality control (QC) of the case and public control data is conducted independently according to the steps outlined in the methods.
Fig. 2. Evaluation design for GAWMerge.
Evaluation design for (a) technical comparison, (b) type-I error assessment, and (c) known GWAS hits. *The samples with European ancestry in COPDGene were evenly divided into two subsets. EA1 includes all COPD cases and some COPD controls to match the COPD prevalence in ECLIPSE. EA2 contains the remaining COPD-free samples.
Fig. 3. Meta-analysis results from evaluation for type-I error.
The Manhattan plot (a) shows no association signal, as expected, while the QQ-plot (b) shows no inflation.
Fig. 4. Meta-analysis results for replication of GWAS hits for COPD.
The Manhattan plot (a) shows the replicated signals, while the QQ-plot (b) shows inflation due to the true signal.
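
The inflation that a QQ-plot displays, as in Figs. 3 and 4, is commonly summarized by the genomic inflation factor: the median observed 1-degree-of-freedom chi-square statistic divided by its expected median under the null (about 0.4549). The sketch below shows that standard calculation using only the Python standard library. It is offered as an illustration; the source does not state how, or whether, the authors computed this quantity, and the function name is hypothetical.

```python
from statistics import NormalDist, median
from typing import Iterable

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_inflation(p_values: Iterable[float]) -> float:
    """Genomic inflation factor from two-sided p-values in (0, 1].

    Each p-value is converted to a 1-df chi-square statistic via the squared
    normal quantile; values near 1.0 indicate no inflation.
    """
    normal = NormalDist()
    # inv_cdf(p / 2) is the negative z-score; squaring gives the chi-square statistic.
    chi2_stats = [normal.inv_cdf(p / 2.0) ** 2 for p in p_values]
    return median(chi2_stats) / CHI2_1DF_MEDIAN
```

For p-values that are uniformly distributed, as expected when there is no true signal and no technical artifact, the result is close to 1; p-values skewed toward zero give a value above 1.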
