Cell. 2022 Aug 4;185(16):3041-3055.e25.
doi: 10.1016/j.cell.2022.06.036. Epub 2022 Aug 1.

A cross-disorder dosage sensitivity map of the human genome

Ryan L Collins et al. Cell. 2022.

Abstract

Rare copy-number variants (rCNVs) include deletions and duplications that occur infrequently in the global human population and can confer substantial risk for disease. In this study, we aimed to quantify the properties of haploinsufficiency (i.e., deletion intolerance) and triplosensitivity (i.e., duplication intolerance) throughout the human genome. We harmonized and meta-analyzed rCNVs from nearly one million individuals to construct a genome-wide catalog of dosage sensitivity across 54 disorders, which defined 163 dosage sensitive segments associated with at least one disorder. These segments were typically gene dense and often harbored dominant dosage sensitive driver genes, which we were able to prioritize using statistical fine-mapping. Finally, we designed an ensemble machine-learning model to predict probabilities of dosage sensitivity (pHaplo & pTriplo) for all autosomal genes, which identified 2,987 haploinsufficient and 1,559 triplosensitive genes, including 648 that were uniquely triplosensitive. This dosage sensitivity resource will provide broad utility for human disease research and clinical genetics.

Keywords: copy-number variation; developmental disorders; disease association; dosage sensitivity; genomics; haploinsufficiency; statistical genetics; structural variation; triplosensitivity.

Conflict of interest statement

Declaration of interests: M.E.T. receives research funding and/or reagents from Levo Therapeutics, Microsoft Inc., and Illumina Inc. R.B., C. Lauricella, A.J., L.M., S.W., and J.M. are employees of GeneDx, Inc. S.A. is an employee of Invitae Corp.

Figures

Figure 1. The contribution of rCNVs to 54 disease phenotypes.
(A) Phenotype categorization for 950,278 samples using Human Phenotype Ontology. (B) ORs per phenotype from meta-analyses of rCNVs matching 95 GDs reported in the literature (top) and of rCNVs impacting PTV-constrained genes outside of known GDs (bottom). See also Tables S1-2.
Figure 2. Characteristics of disease-associated rCNV segments.
(A) rCNV association statistics for one example phenotype, neurodevelopmental abnormalities. (B) Relationship between effect size and strength of association for the 163 segments from our discovery meta-analyses as well as 42 GDs reported in the literature that did not reach FDR<1% in our discovery analysis. (C) Our consensus set of 178 disease-relevant rCNV segments overlapped 44% more genes than expected based on 100,000 random permutations. (D) rCNV segments overlapped genes under 34% greater constraint against PTVs than expected by permutations. (E) Segments in the top third of all effect sizes overlapped genes under stronger constraint than segments in the bottom third of effect sizes (two-tailed Wilcoxon tests). (F) The number of genes per segment was related to the number of phenotypes associated with each segment. Trend lines are outlier-robust linear fits with 95% confidence intervals. See also Figures S2-5 and Table S3.
Figure 3. Coding DNMs pinpoint dominant driver genes within rCNVs.
(A) We aggregated de novo mutations (DNMs) from two exome sequencing studies of neurodevelopmental disorders (NDDs) (Fu et al., 2021; Kaplanis et al., 2020). (B) NDD-associated deletion segments were enriched for PTV DNMs beyond expectations based on permutations adjusted for gene-specific mutation rates. This enrichment was ablated after excluding the 270 NDD-associated genes identified in the studies from (A). (C) Duplication segments associated with NDDs were enriched for missense DNMs, and this enrichment persisted after excluding the 270 known NDD-associated genes. (D-E) The distributions of (D) PTV DNMs per deletion segment and (E) missense DNMs per duplication segment were highly nonuniform. Genes in each segment have been ranked and colored according to their excess PTV DNMs; percentages indicate what fraction of total excess PTV DNMs is attributable to each gene rank across all segments. (F-G) The deletion (F) and duplication (G) segments with the greatest total excess of PTV or missense DNMs usually featured a single, prominent driver gene accounting for most of that segment’s mutational excess. See also Figures S4-5.
Figure 4. Fine-mapping prioritizes candidate genes in disease-associated rCNVs.
(A) Gene-based rCNV-disease association & fine-mapping workflow. (B) Fine-mapping reduced the number of candidate genes per locus by 48%. Trend lines are outlier-robust linear fits with 95% confidence intervals. (C) Distribution of credible set sizes after fine-mapping. (D) Distribution of posterior inclusion probabilities (PIPs) for all genes across all credible sets. (E) Comparison of per-gene probabilities across three gene sets of interest for each stage of our fine-mapping approach: naïve uniform prior, genetics-only posterior, and functionally-informed PIP (i.e., “Full Model”). (F) Summary of fine-mapping for all genes associated with one of 17 phenotypes with stronger rCNV effects (see Figure 1A) stratified by whether the gene had the highest PIP (i.e., “top gene”) among all genes in at least one credible set. See also Figure S2 and Tables S5-6.
Figure 5. Example triplosensitive disease genes nominated by fine-mapping.
(A) We identified a 95% credible set of five genes on chr20 where rare duplications were associated with nervous system abnormalities (OR=35.2; 95% CI=11.9–103.5). Fine-mapping prioritized GMEB2 as the top candidate for this association (PIP=0.40). (B) We identified an association (P=1.79×10⁻⁵; FDR Q=0.004) between rare duplications and personality disorders (OR=16.8; 95% CI=5.7–49.4) on chr6, which fine-mapping reduced to just one gene, KIF13A (PIP=0.98). (C) We identified a three-gene credible set on chr16 where rare duplications were associated with growth abnormalities (OR=25.0; 95% CI=8.5–73.7). Fine-mapping nominated the known haploinsufficient gene, ANKRD11 (PIP=0.62), as the most likely causal gene. (D) We identified a four-gene credible set on chr15 where rare duplications were associated with skeletal abnormalities (OR=21.3; 95% CI=6.1–75.2). Fine-mapping prioritized IGF1R (PIP=0.44), a biologically plausible candidate gene (Abuzzahab et al., 2003), as one of two candidates for this association. For all panels, meta-analysis P-values and ORs are provided for the more specific (smaller N) of the two phenotypes listed at the bottom of the panel. See also Figure S6.
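The odds ratios and confidence intervals in this caption can be sanity-checked on the log scale. The sketch below assumes each interval is a Wald-type interval of the form exp(log(OR) ± z·SE); that construction is an assumption made for illustration and is not confirmed by the caption. The function name `log_scale_summary` is hypothetical.

```python
import math
from statistics import NormalDist


def log_scale_summary(lower, upper, level=0.95):
    """Return the OR midpoint and SE(log OR) implied by a CI.

    Assumes the interval was built as exp(log(OR) +/- z * SE); the
    study's meta-analysis may have used a different construction.
    """
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% interval
    log_lower, log_upper = math.log(lower), math.log(upper)
    # Under the assumption above, the point estimate is the geometric
    # mean of the two bounds.
    midpoint_or = math.exp((log_lower + log_upper) / 2)
    se_log_or = (log_upper - log_lower) / (2 * z)
    return midpoint_or, se_log_or


# Bounds reported in panel (A) for the chr20 duplication association.
midpoint, se = log_scale_summary(11.9, 103.5)
print(f"implied OR = {midpoint:.1f}, SE(log OR) = {se:.2f}")
```

For panel (A), the implied midpoint lands close to the reported OR of 35.2, which is consistent with (but does not prove) a symmetric log-scale interval.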
Figure 6. Predicting dosage sensitivity at single-gene resolution.
(A) The probabilities of haploinsufficiency (pHaplo) and triplosensitivity (pTriplo) were moderately correlated per gene (Pearson R²=0.30; P<10⁻¹⁰⁰). (B) We calibrated thresholds for pHaplo and pTriplo to define 2,987 haploinsufficient and 1,559 triplosensitive genes where the effect sizes of deletions or duplications were comparable to loss-of-function of PTV-constrained genes (Karczewski et al., 2020). (C) We observed clear shifts in the distributions of pHaplo and pTriplo across gene sets with prior biological evidence of being dosage sensitive or insensitive. Asterisks indicate gene sets that were considered when training our models and are therefore not fully independent test sets. (D) pHaplo and pTriplo stratified risk for ASD conferred by de novo protein-truncating deletions and whole-gene copy-gain (CG) duplications outside of GDs in an independent dataset of 13,786 affected children and 5,098 unaffected siblings (Fu et al., 2021). Baseline indicates the overall OR for all de novo deletions or duplications. (E) pHaplo and pTriplo were inversely correlated with rates of protein-truncating deletions and CG duplications in the general population (Collins et al., 2020). (F) The top decile of genes when ranked by pHaplo and pTriplo were enriched for damaging DNMs (PTVs and missense) in 46,094 probands affected by NDDs (Fu et al., 2021; Kaplanis et al., 2020). See also Figure S6 and Table S7.
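Thresholding two per-gene scores, as described in panel (B), can be sketched as follows. The cutoff values, gene names, and scores below are hypothetical placeholders, not the calibrated thresholds or results from the study, and the four mutually exclusive labels (HI, TS, DS, NS) are an assumed reading of the categories named in Figure 7. The count arithmetic at the end assumes that "uniquely triplosensitive" in the abstract means triplosensitive but not haploinsufficient.

```python
def classify_gene(p_haplo, p_triplo, haplo_cutoff, triplo_cutoff):
    """Assign one of four labels from two dosage sensitivity scores.

    HI = haploinsufficient only, TS = triplosensitive only,
    DS = both (bidirectionally dosage sensitive), NS = neither.
    """
    is_hi = p_haplo >= haplo_cutoff
    is_ts = p_triplo >= triplo_cutoff
    if is_hi and is_ts:
        return "DS"
    if is_hi:
        return "HI"
    if is_ts:
        return "TS"
    return "NS"


# Hypothetical cutoffs and scores, for illustration only.
HAPLO_CUTOFF = 0.9
TRIPLO_CUTOFF = 0.9
example_scores = {
    "GENE_A": (0.97, 0.95),
    "GENE_B": (0.93, 0.20),
    "GENE_C": (0.10, 0.96),
    "GENE_D": (0.05, 0.08),
}
labels = {
    gene: classify_gene(ph, pt, HAPLO_CUTOFF, TRIPLO_CUTOFF)
    for gene, (ph, pt) in example_scores.items()
}
print(labels)

# Counts implied by the abstract, under the stated assumption about
# what "uniquely triplosensitive" means.
n_hi, n_ts, n_ts_only = 2987, 1559, 648
n_both = n_ts - n_ts_only   # genes passing both thresholds
n_hi_only = n_hi - n_both   # genes passing only the pHaplo threshold
print(n_both, n_hi_only)
```

Keeping the cutoffs as explicit arguments makes it easy to substitute the calibrated values from the paper once they are looked up.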
Figure 7. Insights into the biological basis of genic dosage sensitivity.
(A) We identified the features most correlated with bidirectionally dosage sensitive (DS) genes by comparing the minimum pHaplo & pTriplo per gene to 145 gene-level features. The 16 gene-level features with the largest absolute correlation coefficients are shown here. (B) Distributions of selected gene-level features for subsets of genes classified as haploinsufficient (HI), DS, triplosensitive (TS), and not dosage sensitive (NS). For clarity, all features have been transformed into Z-scores. (C) We also identified features predictive of genes uniquely HI or TS (but not both) using a Spearman correlation approach similar to (A). (D-E) See (B). See also Figure S7.

Comment in

  • The gene dose makes the disease.
    Smolen C, Girirajan S. Cell. 2022 Aug 4;185(16):2850-2852. doi: 10.1016/j.cell.2022.07.005. PMID: 35931018.
  • Mapping dosage.
    Clyde D. Nat Rev Genet. 2022 Oct;23(10):583. doi: 10.1038/s41576-022-00528-y. PMID: 36042286.

References

    1. Abel HJ, Larson DE, Regier AA, Chiang C, Das I, Kanchi KL, Layer RM, Neale BM, Salerno WJ, Reeves C, et al. (2020). Mapping and characterization of structural variation in 17,795 human genomes. Nature 583, 83–89.
    2. Abuzzahab MJ, Schneider A, Goddard A, Grigorescu F, Lautier C, Keller E, Kiess W, Klammt J, Kratzsch J, Osgood D, et al. (2003). IGF-I receptor mutations resulting in intrauterine and postnatal growth retardation. The New England journal of medicine 349, 2211–2222.
    3. Aguirre M, Rivas MA, and Priest J (2019). Phenome-wide Burden of Copy-Number Variation in the UK Biobank. American journal of human genetics 105, 373–383.
    4. Albers CA, Paul DS, Schulze H, Freson K, Stephens JC, Smethurst PA, Jolley JD, Cvejic A, Kostadima M, Bertone P, et al. (2012). Compound inheritance of a low-frequency regulatory SNP and a rare null mutation in exon-junction complex subunit RBM8A causes TAR syndrome. Nat Genet 44, 435–439, s431–432.
    5. Almarri MA, Bergström A, Prado-Martinez J, Yang F, Fu B, Dunham AS, Chen Y, Hurles ME, Tyler-Smith C, and Xue Y (2020). Population Structure, Stratification, and Introgression of Human Structural Variation. Cell 182, 189–199.e115.