Cell Genom. 2024 Jul 10;4(7):100602.
doi: 10.1016/j.xgen.2024.100602. Epub 2024 Jun 28.

Exome-wide evidence of compound heterozygous effects across common phenotypes in the UK Biobank

Frederik H Lassen et al. Cell Genom.

Abstract

The phenotypic impact of compound heterozygous (CH) variation has not been investigated at the population scale. We phased rare variants (MAF ∼0.001%) in the UK Biobank (UKBB) exome-sequencing data to characterize recessive effects in 175,587 individuals across 311 common diseases. A total of 6.5% of individuals carry putatively damaging CH variants, 90% of which are only identifiable upon phasing rare variants (MAF < 0.38%). We identify six recessive gene-trait associations (p < 1.68 × 10⁻⁷) after accounting for relatedness, polygenicity, nearby common variants, and rare variant burden. Of these, just one is discovered when considering homozygosity alone. Using longitudinal health records, we additionally identify and replicate a novel association between bi-allelic variation in ATP2C2 and an earlier age at onset of chronic obstructive pulmonary disease (COPD) (p < 3.58 × 10⁻⁸). Genetic phase contributes to disease risk for gene-trait pairs: ATP2C2-COPD (p = 0.000238), FLG-asthma (p = 0.00205), and USH2A-visual impairment (p = 0.0084). We demonstrate the power of phasing large-scale genetic cohorts to discover phenome-wide consequences of compound heterozygosity.
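To make the role of phasing concrete, the following is a minimal sketch of how one sample's damaging variants in a single gene could be sorted into the carrier classes used in this study. It assumes VCF-style phased genotype strings ("0|1", "1|0", "1|1"); the function name `gene_status` is hypothetical, and the phasing-confidence filter applied by the authors is not modeled here.

```python
def gene_status(genotypes):
    """Classify one sample's damaging variants in one gene.

    `genotypes` is a list of phased genotype strings, one per variant,
    where the left allele is haplotype 1 and the right is haplotype 2.
    This is an illustrative simplification, not the authors' pipeline.
    """
    # Any variant present on both haplotypes disrupts both gene copies.
    if any(gt == "1|1" for gt in genotypes):
        return "homozygous"

    # Keep only heterozygous calls; "0|0" carries no alternate allele.
    hets = [gt for gt in genotypes if gt in ("0|1", "1|0")]
    if not hets:
        return "wild type"

    # Variants on both haplotypes (trans) form a compound heterozygote.
    if len(set(hets)) == 2:
        return "compound heterozygous"

    # Two or more variants on the same haplotype (cis) leave one copy intact.
    if len(hets) >= 2:
        return "cis"

    return "heterozygous"
```

Without phase, the inputs `["0|1", "1|0"]` and `["0|1", "0|1"]` would both look like two heterozygous calls; only the phased representation separates the trans configuration from the cis one.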

Keywords: bi-allelic; compound heterozygosity; longitudinal; phasing; population genetics; recessive.

Conflict of interest statement

Declaration of interests: B.M.N. is a member of the scientific advisory board at Deep Genomics and Neumora.

Figures

Graphical abstract
Figure 1
CH variants composed of at least 1 ultra-rare variant (MAC ≤ 10) can be robustly identified in large-scale biobanks. (A) Trio SER depicted on the y axis as a function of MAC bin (x axis) for phased variants with MAF ≤ 5%, stratified by phasing confidence score PP ≥ 0.5 or PP ≥ 0.9. Error bars display 95% binomial confidence intervals. (B) Counts of samples harboring different classes of variation with at least 2 variants in UKBB. Each set of 3 bars depicts the number of individuals with at least 1 CH variant, homozygous variant, or multi-hit (cis) variant, respectively. Here, we define a CH pLoF + damaging missense variant as any combination of pLoF and/or damaging missense variation on opposite haplotypes. A qualifying carrier for each bar occurs according to the configuration displayed above the bars and is grouped by variant consequence according to the color legend. (C and D) Number of CH or homozygous carriers per gene. (E) One minus cumulative fraction (y axis) of homozygous (dashed line) and CH carriers as a function of lowest MAF (x axis) in bi-allelic variant pairs for which both variants phased at PP ≥ 0.9 (solid line), stratified by variant consequence according to the color key.
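As a rough illustration of the error bars in panel (A), the sketch below computes a 95% binomial confidence interval for an error rate from a count of errors and a count of opportunities. The caption does not say which binomial interval was used; the Wilson score interval here is an assumption, and the function name `wilson_interval` is hypothetical.

```python
import math


def wilson_interval(errors, total, z=1.96):
    """Wilson score interval for a binomial proportion errors/total.

    z = 1.96 corresponds to a two-sided 95% interval. The choice of the
    Wilson method is an assumption, not taken from the paper.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    p = errors / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    # Clamp to [0, 1] to absorb floating-point overshoot at the boundaries.
    return max(0.0, centre - half), min(1.0, centre + half)
```

Rare MAC bins contain few variant pairs, so their intervals are wide; the Wilson form stays within [0, 1] even when the observed error count is zero.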
Figure 2
Conditional recessive and additive modeling of gene copy disruption in 311 phenotypes across 176,587 participants. (A) Recessive Manhattan plot depicting log10-transformed gene-trait association p values against chromosomal location. Associations are colored red if they are Bonferroni (p < 1.68 × 10⁻⁷) significant. Transparent coloring represents the resulting p value when conditioning only on PRS, whereas solid coloring with black outline represents the p value derived after conditioning on off-chromosome PRS, nearby (500 kb) common variant association signal, and rare variants within the gene when applicable (STAR Methods). The Bonferroni significance threshold is also displayed as a red dashed line. A gene may appear multiple times if it is associated with >1 phenotype. A qualifying example of the recessive inheritance pattern is shown at the top right of the panel: disruption of both gene copies results in an effect on the phenotype. (B) Quantile-quantile (Q-Q) plot for genes with bi-allelic damaging variants after conditioning on off-chromosome PRS. The shaded area depicts the 95% CI under the null. Gene-trait associations passing Bonferroni significance are labeled accordingly. (C and D) Additive Manhattan plot and corresponding Q-Q plot for genes with mono- and bi-allelic damaging variants associated with at least 1 phenotype after conditioning on off-chromosome PRS when applicable (STAR Methods). No additional conditioning was performed in this analysis. Gene-trait associations are colored red if they are Bonferroni (p < 9.8 × 10⁻⁹) significant. The additive inheritance model is depicted at the top right of the panel; each affected haplotype results in an incremental effect on the phenotype.
Figure 3
In silico permutation of genetic phase provides evidence for CH-specific effects. (A) Overview of the permutation pipeline. To be sufficiently powered to detect effects, we considered 5 significant (p < 0.01) gene-trait pairs from the genome-wide analysis that have at least 10 individuals harboring pLoF or damaging missense/protein-altering variants on the same haplotypes or CH carriers. Then, we shuffled CH trans and cis labels across samples and re-ran the association analysis, resulting in a null distribution of permuted score statistics corresponding to the association strength in the absence of phase information. We derive the 1-tailed empirical p value by comparing the observed score statistics with the empirical null distribution. (B) The resulting distributions of permuted (white and black boxplots) and observed score statistic (red dot) for each gene-trait and the resulting empirical p value. p values shown in bold indicate Bonferroni significance (p < 0.05/6 = 0.0083). Box and whisker plots display the quartiles of the empirical null distribution.
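The permutation scheme in panel (A) can be sketched as follows. This is a toy version under stated assumptions: the statistic is a simple excess-case count among trans carriers rather than the score statistic from the authors' association model, and the names `score_statistic` and `phase_permutation_p` are hypothetical. Only the 0.05/6 threshold is taken from the caption.

```python
import random

# Bonferroni threshold quoted in the caption (approximately 0.0083).
BONFERRONI_ALPHA = 0.05 / 6


def score_statistic(is_trans, is_case):
    """Toy statistic: observed minus expected cases among trans carriers."""
    rate = sum(is_case) / len(is_case)
    return sum(y - rate for t, y in zip(is_trans, is_case) if t)


def phase_permutation_p(is_trans, is_case, n_perm=10_000, seed=0):
    """One-tailed empirical p value from shuffling trans/cis labels.

    `is_trans` flags samples whose variants are in trans (CH) rather than
    in cis; `is_case` flags disease status. Shuffling the labels removes
    any phase information while keeping carrier and case counts fixed.
    """
    rng = random.Random(seed)
    observed = score_statistic(is_trans, is_case)
    labels = list(is_trans)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)
        if score_statistic(labels, is_case) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)
```

A gene-trait pair would count as showing a phase-specific effect in this sketch when the returned value falls below `BONFERRONI_ALPHA`.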
Figure 4
Age at diagnosis modeling reveals novel recessive effects driven by damaging bi-allelic variants. (A) Flow diagram of our approach. To investigate whether homozygous and/or CH effects are associated with a difference in lifetime risk of disease development, we performed Cox proportional hazards modeling for gene-trait combinations in which ≥5 samples are 2-hit carriers (CH or homozygotes) and ≥100 samples are heterozygotes. Among Bonferroni (Bonf.) significant associations (p < 1.89 × 10⁻⁷), we filter to gene-trait pairs for which at least 5 samples carry multiple variants disrupting the same haplotype and test for an association between CH or homozygous carrier status and lifetime disease risk (corresponding to HRs ≥1). (B) HRs when comparing CH and homozygous status versus heterozygous carrier status. Throughout, we display hazard ratios and corresponding p values after taking the polygenic contribution into account by conditioning on off-chromosome PRSs for heritable traits that pass our quality control cutoffs. p values following inclusion of polygenic contribution to disease status are provided where PRSs are predictive. HRs for gene-traits with ≥2 individuals with multiple cis variants on the same haplotype are displayed in pink. Only associations that pass the stringent Bonferroni significance threshold (p < 1.89 × 10⁻⁷) are illustrated. (C) HRs when comparing wild-type, heterozygous, CH, and homozygous carrier status against individuals that harbor ≥2 putatively damaging variants on the same haplotype. 95% CIs are shown in the figure.
Figure 5
Trajectories of haplotype disruption in common diseases. (A and B) Kaplan-Meier survival curves for CH (red), homozygous (orange), heterozygous carriers (blue), and single disruption of haplotypes (pink) due to pLoF or damaging missense/protein-altering mutations. Shaded regions indicate 95% confidence intervals for risk estimates. Wild types and bi-allelic variants (CH or homozygous) are shown with green and black lines, respectively. Both CH and homozygous MUTYH-variant carriers are at elevated lifetime risk of developing benign neoplasm of the colon compared to heterozygous carriers and wild types. (C and D) Kaplan-Meier survival curves for ATP2C2 mono- and bi-allelic variant carriers. Carriers of CH variants develop COPD earlier compared to heterozygous carriers and wild types. Moreover, individuals who harbor a single putatively disrupted haplotype due to ≥2 damaging variants develop COPD at the same frequency as heterozygotes and wild types. (E) Gene plots for ATP2C2, displaying protein coding variants for samples that carry ≥2 pLoF or damaging missense/protein-altering variants stratified by exon or intron. CH variants, multiple variants in cis, and homozygous variants are highlighted by lines joining the positions of co-occurring variants in a sample. Lines are colored by the number of cases for the shown variant configurations, with gray lines indicating no observed samples are cases, orange lines indicating some samples are cases, and red lines indicating that all observed samples are cases. Variants are labeled by position (GRCh38) and according to inferred consequence (missense, stop gain, splice acceptor/donor). Protein domains are highlighted accordingly.
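For readers unfamiliar with the curves in panels (A) to (D), the following is a minimal sketch of the Kaplan-Meier product-limit estimator for one carrier group. It assumes each sample contributes an age (at diagnosis or at censoring) and an event flag; the function name `kaplan_meier` is hypothetical, and confidence bands are omitted.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one event.

    times:  age at diagnosis, or age at last follow-up if censored.
    events: 1 if the sample was diagnosed at that age, 0 if censored.
    """
    order = sorted(zip(times, events))
    at_risk = len(order)
    survival = 1.0
    curve = []
    i = 0
    while i < len(order):
        t = order[i][0]
        diagnosed = 0
        leaving = 0
        # Group all samples sharing this time point.
        while i < len(order) and order[i][0] == t:
            diagnosed += order[i][1]
            leaving += 1
            i += 1
        if diagnosed > 0:
            # Samples censored at t still count as at risk at t.
            survival *= 1 - diagnosed / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve
```

Computing one such curve per carrier class (wild type, heterozygous, cis, CH, homozygous) gives the kind of stratified comparison shown in the figure, where an earlier drop in the CH curve corresponds to earlier disease onset.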
