Am J Hum Genet. 2019 Aug 1;105(2):373-383.
doi: 10.1016/j.ajhg.2019.07.001. Epub 2019 Jul 25.

Phenome-wide Burden of Copy-Number Variation in the UK Biobank

Matthew Aguirre et al. Am J Hum Genet. 2019.

Abstract

Copy-number variations (CNVs) represent a significant proportion of the genetic differences between individuals, and many CNVs associate causally with syndromic disease and clinical outcomes. Here, we characterize the landscape of copy-number variation and its phenome-wide effects in a sample of 472,228 array-genotyped individuals from the UK Biobank. In addition to population-level selection effects against genic loci conferring high mortality, we describe genetic burden from potentially pathogenic and previously uncharacterized CNV loci across more than 3,000 quantitative and dichotomous traits, with separate analyses for common and rare classes of variation. Specifically, we highlight the effects of CNVs at two well-known syndromic loci, 16p11.2 and 22q11.2, previously uncharacterized variation at 9p23, and several genic associations in the context of acute coronary artery disease and high body mass index. Our data constitute a deeply contextualized portrait of population-wide burden of copy-number variation, as well as a series of dosage-mediated genic associations across the medical phenome.

Keywords: UK Biobank; association testing; copy number variation; genetics; genomics; microdeletion/microduplication syndrome; population database; selection bias; structural variation.

Conflict of interest statement

The authors declare no competing interests.

Figures

Figure 1
Burden and Distribution of Copy-Number Variation in UK Biobank. (A) Log-scale histogram of CNV lengths. Mean length (dashed line) is 226.5 kb. (B) Cumulative density of CNV allele count (AC), displayed in log-log axes. Average AC is 5.5, but average frequency as experienced by the population (weighted by count, hence AC²) is ∼1.6%. (C and D) Histogram of CNV counts (C) and log-scale base pairs affected by CNV per individual (D). Sample-level burden is heavy-tailed, with the average individual carrying 4.2 variants (dashed line), affecting a mean of ∼207.6 kb of genomic sequence. (E) Genome-wide density of CNVs, defined as the number of unique CNVs overlapping 10-megabase (Mb) windows tiling each chromosome. Hotspots of structural variation are labeled by cytogenetic band.
Figure 2
Genome-wide CNV Associations for Acute Coronary Artery Disease (CAD). (A and B) Manhattan plots for (A) genome-wide association of common copy-number variants and (B) genome-wide burden test of rare variants for genes with at least ten individuals observed with CNVs. (C) Locus inset of 9p23 CNV and summary statistics from GWAS of coronary artery disease using variants imputed on the same study population used in the CNV analysis. Variants are colored by marker LD with the lead regional GWAS SNP (rs145879274) from the analysis. This marker is highly stratified by continental ancestry and does not show significant correlation with any other variant in the region. (D) Quantile-quantile plots for genome-wide summary statistics from CNV associations.
Figure 3
Genome-wide CNV Associations for Body Mass Index (BMI). (A and B) Manhattan plots for (A) genome-wide association of common copy-number variants and (B) genome-wide burden test of rare variants for genes with at least five individuals observed with CNVs. (C) Locus inset of 16p11.2 CNVs and summary statistics from GWAS of BMI using variants imputed on the same study population used in the CNV analysis. Variants are colored by marker LD with lead regional GWAS SNPs overlapping each CNV (rs62037365 in SH2B1; rs12716975 in non-coding BOLA2). (D) Quantile-quantile plots for genome-wide summary statistics from CNV associations.
Figure 4
PheWAS of 16p11.2 CNVs. Selected genome-wide significant (p < 9 × 10⁻⁶) associations for 220 kb (top) and 593 kb (bottom) 16p11.2 CNVs, with n > 500 binary cases or 15,000 quantitative values. Traits are grouped by type (binary/quantitative) then sorted by p value (left). Log-odds ratios and standardized betas (right) align with trait names on the y axis, with the horizontal dashed line separating positive and negative direction of association.
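
Two quantities in the Figure 1 caption can be illustrated with a short sketch: the count-weighted average frequency in panel B (each CNV's frequency weighted by its own allele count, which is where the squared AC term comes from) and the windowed CNV density in panel E. The Python below is a minimal illustration, not the authors' code. The function names, the use of the sample count as the frequency denominator, the half-open coordinate convention, and the toy inputs are all assumptions made for this example.

```python
from collections import Counter


def mean_allele_count(allele_counts):
    """Unweighted mean allele count (AC) across CNVs."""
    return sum(allele_counts) / len(allele_counts)


def count_weighted_frequency(allele_counts, n_samples):
    """Average CNV frequency weighted by allele count.

    Each CNV has frequency AC / n_samples (assumed denominator) and is
    weighted by AC, giving sum(AC^2) / (n_samples * sum(AC)).
    """
    total = sum(allele_counts)
    return sum(ac * ac for ac in allele_counts) / (n_samples * total)


def window_density(cnvs, window=10_000_000):
    """Count unique CNVs overlapping each fixed-size window.

    cnvs: iterable of (chrom, start, end) tuples with half-open
    coordinates [start, end) -- an assumed convention.
    Returns a Counter keyed by (chrom, window_index).
    """
    density = Counter()
    for chrom, start, end in set(cnvs):  # set() keeps unique CNVs only
        first = start // window
        last = (end - 1) // window
        for idx in range(first, last + 1):
            density[(chrom, idx)] += 1
    return density


# Toy data, not taken from the study.
toy_counts = [1, 1, 1, 1, 16]
toy_cnvs = [
    ("chr1", 5_000_000, 12_000_000),
    ("chr1", 9_500_000, 9_900_000),
    ("chr1", 25_000_000, 26_000_000),
]

print(mean_allele_count(toy_counts))              # 4.0
print(count_weighted_frequency(toy_counts, 100))  # 0.13
print(window_density(toy_cnvs))
```

In the toy data, four singletons and one CNV with 16 copies give a mean AC of 4, yet a randomly chosen CNV observation comes from a variant at 13% frequency on average. This is the same effect the caption describes, where a few common variants dominate what carriers actually experience.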

