Am J Hum Genet. 2019 Aug 1;105(2):373-383.

doi: 10.1016/j.ajhg.2019.07.001. Epub 2019 Jul 25.

Phenome-wide Burden of Copy-Number Variation in the UK Biobank

Matthew Aguirre¹, Manuel A Rivas², James Priest³

Affiliations

¹ Department of Biomedical Data Science, School of Medicine, Stanford University, Stanford, CA 94305, USA; Department of Pediatrics, School of Medicine, Stanford University, Stanford, CA 94305, USA.
² Department of Biomedical Data Science, School of Medicine, Stanford University, Stanford, CA 94305, USA.
³ Department of Pediatrics, School of Medicine, Stanford University, Stanford, CA 94305, USA; Stanford Cardiovascular Institute, Stanford University, Stanford, CA 94305, USA. Electronic address: jpriest@stanford.edu.

PMID: 31353025
PMCID: PMC6699064
DOI: 10.1016/j.ajhg.2019.07.001

Matthew Aguirre et al. Am J Hum Genet. 2019.

Abstract

Copy-number variations (CNVs) represent a significant proportion of the genetic differences between individuals, and many CNVs associate causally with syndromic disease and clinical outcomes. Here, we characterize the landscape of copy-number variation and its phenome-wide effects in a sample of 472,228 array-genotyped individuals from the UK Biobank. In addition to population-level selection effects against genic loci conferring high mortality, we describe genetic burden from potentially pathogenic and previously uncharacterized CNV loci across more than 3,000 quantitative and dichotomous traits, with separate analyses for common and rare classes of variation. Specifically, we highlight the effects of CNVs at two well-known syndromic loci, 16p11.2 and 22q11.2; previously uncharacterized variation at 9p23; and several genic associations in the context of acute coronary artery disease and high body mass index. Our data constitute a deeply contextualized portrait of the population-wide burden of copy-number variation, as well as a series of dosage-mediated genic associations across the medical phenome.

Keywords: UK Biobank; association testing; copy number variation; genetics; genomics; microdeletion/microduplication syndrome; population database; selection bias; structural variation.

Conflict of interest statement

The authors declare no competing interests.

Figures

**Figure 1**
Burden and Distribution of Copy-Number Variation in UK Biobank. (A) Log-scale histogram of CNV lengths. Mean length (dashed line) is 226.5 kb. (B) Cumulative density of CNV allele count (AC), displayed in log-log axes. Average AC is 5.5, but average frequency as experienced by the population (weighted by count, hence AC²) is ∼1.6%. (C and D) Histogram of CNV counts (C) and log-scale base-pairs affected by CNV per individual (D). Sample-level burden is heavy-tailed, with the average individual carrying 4.2 variants (dashed line), affecting a mean of ∼207.6 kb of genomic sequence. (E) Genome-wide density of CNV, defined as the number of unique CNVs overlapping 10 megabase (Mb) windows tiling each chromosome. Hotspots of structural variation are labeled by cytogenetic band.
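The contrast in panel (B) between the plain average AC and the count-weighted average frequency can be illustrated with a small calculation. The sketch below is only an illustration under stated assumptions: the function names, the toy allele counts, and the choice to divide by the number of samples (rather than the number of chromosomes) are hypothetical and are not taken from the paper.

```python
def mean_allele_count(allele_counts):
    """Plain average: every distinct CNV contributes equally."""
    return sum(allele_counts) / len(allele_counts)


def count_weighted_mean_frequency(allele_counts, n_samples):
    """Average frequency 'as experienced by the population'.

    Each CNV is weighted by its own allele count, so the weighted mean AC
    is sum(AC^2) / sum(AC). Dividing by n_samples to obtain a frequency is
    an assumption made for this sketch.
    """
    total = sum(allele_counts)
    weighted_ac = sum(ac * ac for ac in allele_counts) / total
    return weighted_ac / n_samples


# Toy data (hypothetical): four singletons and one common CNV.
toy_counts = [1, 1, 1, 1, 96]
toy_n = 1000

plain_mean = mean_allele_count(toy_counts)
weighted_freq = count_weighted_mean_frequency(toy_counts, toy_n)

print(f"plain mean AC: {plain_mean}")
print(f"count-weighted mean frequency: {weighted_freq:.4f}")
```

In the toy data most CNVs are singletons, yet most observed CNV alleles belong to the one common variant, so the weighted figure is dominated by it. This is the same effect the caption describes with a heavy-tailed AC distribution.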

**Figure 2**
Genome-wide CNV Associations for Acute Coronary Artery Disease (CAD). (A and B) Manhattan plots for (A) genome-wide association of common copy-number variants and (B) genome-wide burden test of rare variants for genes with at least ten individuals observed with CNVs. (C) Locus inset of 9p23 CNV and summary statistics from GWAS of coronary artery disease using variants imputed on the same study population used in the CNV analysis. Variants are colored by marker LD with the lead regional GWAS SNP (rs145879274) from the analysis. This marker is highly stratified by continental ancestry and does not show significant correlation with any other variant in the region. (D) Quantile-quantile plots for genome-wide summary statistics from CNV associations.
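The gene-level burden test in panel (B) collapses rare CNVs into a per-gene carrier indicator and only tests genes with enough carriers. A minimal sketch of that collapsing and filtering step follows. The caption does not state which statistical model was used, so the continuity-corrected 2x2 log-odds ratio here is a stand-in chosen for illustration, and the function name, input layout, and toy data are all hypothetical.

```python
import math
from collections import defaultdict


def gene_burden_table(cnv_calls, case_ids, all_ids, min_carriers=10):
    """Collapse rare CNV calls per gene and summarize case/control burden.

    cnv_calls: iterable of (sample_id, gene) pairs (hypothetical layout).
    case_ids, all_ids: sets of sample IDs, with case_ids a subset of all_ids.
    min_carriers: genes with fewer distinct carriers are skipped.
    """
    carriers = defaultdict(set)
    for sample_id, gene in cnv_calls:
        carriers[gene].add(sample_id)

    n_cases = len(case_ids)
    n_controls = len(all_ids) - n_cases
    results = {}
    for gene, samples in carriers.items():
        if len(samples) < min_carriers:
            continue
        a = len(samples & case_ids)  # carriers among cases
        b = len(samples) - a         # carriers among controls
        c = n_cases - a              # non-carrier cases
        d = n_controls - b           # non-carrier controls
        # Adding 0.5 to each cell (Haldane-Anscombe) avoids division by zero.
        log_or = math.log(((a + 0.5) * (d + 0.5)) / ((b + 0.5) * (c + 0.5)))
        results[gene] = {"carriers": len(samples), "log_odds_ratio": log_or}
    return results


# Toy cohort (hypothetical): 100 samples, the first 20 are cases.
all_ids = set(range(100))
case_ids = set(range(20))
calls = [(s, "GENE_A") for s in (0, 1, 2, 3, 4, 5, 20, 21, 22, 23)]
calls += [(s, "GENE_B") for s in (6, 30, 31)]

results = gene_burden_table(calls, case_ids, all_ids, min_carriers=10)
print(results)
```

Lowering `min_carriers` to 5 would mirror the less stringent filter described for the BMI analysis in Figure 3. A real analysis would typically also adjust for covariates such as age, sex, and ancestry, which this sketch omits.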

**Figure 3**
Genome-wide CNV Associations for Body Mass Index (BMI). (A and B) Manhattan plots for (A) genome-wide association of common copy-number variants and (B) genome-wide burden test of rare variants for genes with at least five individuals observed with CNVs. (C) Locus inset of 16p11.2 CNVs and summary statistics from GWAS of BMI using variants imputed on the same study population used in the CNV analysis. Variants are colored by marker LD with lead regional GWAS SNPs overlapping each CNV (rs62037365 in *SH2B1*; rs12716975 in non-coding *BOLA2*). (D) Quantile-quantile plots for genome-wide summary statistics from CNV associations.

**Figure 4**
PheWAS of 16p11.2 CNVs. Selected genome-wide significant (p < 9 × 10⁻⁶) associations for 220 kb (top) and 593 kb (bottom) *16p11.2* CNVs, with n > 500 binary cases or 15,000 quantitative values. Traits are grouped by type (binary/quantitative) then sorted by p value (left). Log-odds ratio and standardized betas (right) align with trait names on the y axis, with the horizontal dashed line separating positive and negative direction of association.
