[Preprint]. 2024 Aug 28:2024.08.27.24312158.
doi: 10.1101/2024.08.27.24312158.

Genetic modifiers and ascertainment drive variable expressivity of complex disorders

Matthew Jensen (1,2), Corrine Smolen (1,2), Anastasia Tyryshkina (1), Lucilla Pizzo (1), Deepro Banerjee (1), Matthew Oetjens (3), Hermela Shimelis (3), Cora M Taylor (3), Vijay Kumar Pounraja (1,2), Hyebin Song (4), Laura Rohan (1), Emily Huber (1), Laila El Khattabi (5), Ingrid van de Laar (6), Rafik Tadros (6), Connie Bezzina (6), Marjon van Slegtenhorst (6), Janneke Kammeraad (6), Paolo Prontera (7), Jean-Hubert Caberg (8), Harry Fraser (9), Siddhartha Banka (9,10), Anke Van Dijck (11), Charles Schwartz (12), Els Voorhoeve (13), Patrick Callier (14), Anne-Laure Mosca-Boidron (14), Nathalie Marle (14), Mathilde Lefebvre (15), Kate Pope (16), Penny Snell (16), Amber Boys (16), Paul J Lockhart (16,17), Myla Ashfaq (18), Elizabeth McCready (19), Margaret Nowacyzk (19), Lucia Castiglia (20), Ornella Galesi (20), Emanuela Avola (20), Teresa Mattina (20), Marco Fichera (20,21), Maria Grazia Bruccheri (20), Giuseppa Maria Luana Mandarà (22), Francesca Mari (23), Flavia Privitera (23), Ilaria Longo (23), Aurora Curró (23), Alessandra Renieri (23), Boris Keren (24), Perrine Charles (24), Silvestre Cuinat (25), Mathilde Nizon (25), Olivier Pichon (25), Claire Bénéteau (25), Radka Stoeva (25), Dominique Martin-Coignard (26), Sophia Blesson (27), Cedric Le Caignec (28,29), Sandra Mercier (27), Marie Vincent (27), Christa Martin (3), Katrin Mannik (30,31), Alexandre Reymond (32), Laurence Faivre (14,15), Erik Sistermans (13), R Frank Kooy (11), David J Amor (13), Corrado Romano (20,21), Joris Andrieux (33), Santhosh Girirajan (1,2,34)

Matthew Jensen et al. medRxiv. 2024 Aug 28.

Abstract

Variable expressivity of disease-associated variants implies a role for secondary variants that modify clinical features. We assessed the effects of modifier variants towards clinical outcomes of 2,252 individuals with primary variants. Among 132 families with the 16p12.1 deletion, distinct rare and common variant classes conferred risk for specific developmental features, including short tandem repeats for neurological defects and SNVs for microcephaly, while additional disease-associated variants conferred multiple genetic diagnoses. Within disease and population cohorts of 773 individuals with the 16p12.1 deletion, we found opposing effects of secondary variants towards clinical features across ascertainments. Additional analysis of 1,479 probands with other primary variants, such as 16p11.2 deletion and CHD8 variants, and 1,084 without primary variants, showed that phenotypic associations differed by primary variant context and were influenced by synergistic interactions between primary and secondary variants. Our study provides a paradigm to dissect the genomic architecture of complex disorders towards personalized treatment.

Conflict of interest statement

The authors declare no competing interests.

Figures

Figure 1. Overview of variant-phenotype analyses in 2,252 individuals with primary pathogenic variants.
We assessed associations between variant classes and clinical phenotypes in six cohorts of individuals and families with primary variants. We directly recruited and assessed 132 families with the 16p12.1 deletion primarily ascertained for children with developmental delay (DD) (also including ten individuals from eight families from the Estonian Biobank not ascertained for DD). We further assessed 16p12.1 deletion carriers from cohorts with different ascertainments, including healthy volunteer-biased (UK Biobank), clinically-derived (MyCode), and single-disorder (SPARK, for autism) ascertainments. We finally assessed probands ascertained for autism with various primary pathogenic variants, including the 16p11.2 deletion or duplication (Simons Searchlight) and other large CNVs or rare SNVs in neurodevelopmental genes (SSC). We note that 100 probands in SSC have both pathogenic SNVs and CNVs and are included in both categories. Within and across these cohorts, we identified associations between up to 17 classes of rare and common variants (identified from WGS, WES, and microarrays) with phenotypic features from deep clinical datasets and electronic health records.
Figure 2. Variably expressive phenotypes of family members with the 16p12.1 deletion.
(A) Distribution of complexity scores for six phenotypic domains in probands (n=69–109), carrier siblings and cousins (n=28–35), and noncarrier siblings and cousins (n=13–20) in 16p12.1 deletion families (numbers vary due to clinical data availability). Complexity scores were determined by identifying the number of clinical features manifested within each phenotypic domain (see Methods). (B) Distribution of complexity scores for four phenotypic domains in carrier parents (orange, n=46–51) and noncarrier parents (blue, n=58–61) of 16p12.1 deletion probands. (C) Distributions of quantitative phenotypes observed in 16p12.1 deletion probands. Top plots show the distribution of non-verbal IQ (HRS-MAT) and social responsiveness scores (SRS) in probands (green, n=10–27) compared to carrier (red, n=17–21) and noncarrier parents (blue, n=20–26). Middle plots compare the same scores in probands to the scores for probands in the SSC cohort (SRS n=2,844, yellow; HRS-MAT mean derived from ) and probands with the 16p11.2 deletion or duplication from Simons Searchlight (n=139, purple). Bottom plots show the distribution of head circumference (n=64) and BMI z-scores (n=67) in deletion probands; red vertical lines represent the general population mean (i.e. z-score=0). P-values from Mann-Whitney tests or one-sample t-tests. Individual scores for 16p12.1 deletion probands and parents are listed in Table S1A. (D) Distribution of the age of attainment for developmental milestones in probands (n=13–33), carrier siblings and cousins (n=16–18), and noncarrier siblings and cousins (n=11–15). One-tailed t-test, *p≤0.05, **Benjamini-Hochberg FDR≤0.05.
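
The complexity score in panel A is described as the number of clinical features manifested within each phenotypic domain. A minimal sketch of that count is given here for illustration. The domain names, the feature names, and the `complexity_scores` helper are hypothetical placeholders, not the study's feature lists; the actual scoring is defined in the Methods.

```python
from typing import Dict, Iterable, Set

# Hypothetical domain-to-feature map (assumption); the study's real
# feature lists are defined in its Methods, not here.
DOMAIN_FEATURES: Dict[str, Set[str]] = {
    "neurological": {"seizures", "hypotonia", "microcephaly"},
    "behavioral": {"autism", "adhd", "anxiety"},
}


def complexity_scores(
    observed: Iterable[str],
    domains: Dict[str, Set[str]] = DOMAIN_FEATURES,
) -> Dict[str, int]:
    """Count how many of an individual's observed features fall in each domain."""
    seen = set(observed)
    # Features outside every domain (e.g. "scoliosis" below) are ignored.
    return {name: len(features & seen) for name, features in domains.items()}


scores = complexity_scores(["seizures", "hypotonia", "anxiety", "scoliosis"])
print(scores)
```

Under this reading, a higher score simply means more recorded features in that domain, so scores are only comparable between individuals with similar clinical data availability.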
Figure 3. Secondary variants contribute to phenotypic variability within 16p12.1 deletion families.
(A) Cohen’s D effect sizes (top) for changes in secondary variant burden (i.e. rare variant burden or PRS) between probands and their carrier or noncarrier parents (n=49–54 pairs). *p≤0.05, paired one-tailed (rare variant classes) or two-tailed (PRS) t-test. Red indicates increased burden in probands relative to their parents. Boxplot (bottom) highlights increased burden of missense (LF) variants between probands and carrier parents. (B) Increased burden of rare variants corresponds with more severe clinical features across successive generations of 16p12.1 deletion carriers in a multi-generational family. (C) Distribution of genes by average connectivity (degree) within a brain-specific interaction network, binned into quartiles from 1000 simulations of randomly selected gene sets. Black lines represent the observed number of genes with secondary variants in 16p12.1 deletion probands in each degree quartile. Empirical p-values derived from simulation distributions. (D) Enrichment of genes with secondary SNVs in 16p12.1 deletion probands for genes preferentially expressed in neuronal classes (excitatory and inhibitory) and sub-classes (colored by main class) in the adult motor cortex. Fisher’s exact test, **Benjamini-Hochberg FDR≤0.05. Full results are listed in Table S2E. (E) Enrichment of genes with secondary variants in probands for six gene co-expression modules identified from WGCNA analysis of lymphoblastoid cell lines (LCL) from individuals with the 16p12.1 deletion. Fisher’s exact test, *p≤0.05, **Benjamini-Hochberg FDR≤0.05.
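
Panel C reports empirical p-values derived from 1000 simulations of randomly selected gene sets. One common way to compute such a p-value is sketched below, under stated assumptions: the `empirical_p_value` helper, the toy gene universe of 400 IDs, and the add-one correction are illustrative choices, not details confirmed by the caption.

```python
import random
from typing import Sequence


def empirical_p_value(observed: float, simulated: Sequence[float]) -> float:
    """One-sided empirical p-value: share of simulations at least as large as observed.

    Adding 1 to numerator and denominator is a common convention that avoids
    reporting p = 0; whether the study applied it is an assumption.
    """
    at_least = sum(1 for value in simulated if value >= observed)
    return (at_least + 1) / (len(simulated) + 1)


# Toy null distribution (hypothetical data): draw 20 gene IDs from 400 and
# count how many land in the "top-degree quartile" (IDs 0-99), 1000 times.
rng = random.Random(0)
null_counts = [
    sum(1 for gene in rng.sample(range(400), 20) if gene < 100)
    for _ in range(1000)
]
p = empirical_p_value(observed=11, simulated=null_counts)
print(f"empirical p = {p:.4f}")
```

With 1000 simulations and the add-one correction, the smallest p-value this approach can report is 1/1001.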
Figure 4. Secondary variant associations for phenotype domains of the 16p12.1 deletion.
(A) Forest plots show log-scaled odds ratios from logistic regression models for secondary variant burden in 16p12.1 deletion probands with higher complexity scores for five phenotypic domains, compared with probands with lower complexity scores (n=47–71). *p≤0.05. Model results for variants (LF) are shown in Fig. S3A. (B) Forest plots show β coefficients from linear regression models for secondary variant burden in genes under evolutionary constraint (LF genes) towards quantitative phenotypes in deletion probands (n=43–76). *p≤0.05. Model results for variants without LF filter are shown in Fig. S3A. (C) Gene Ontology (GO) biological process terms enriched among secondary variants in probands with each phenotypic domain. Circles represent individual GO terms, clustered based on semantic similarity into broad categories (green ovals, as defined in the “legend” plot). Size of each circle represents the number of genes in each term, such that broader terms are larger. Colored circles in each plot indicate significant enrichment of the GO terms for the given phenotype. (D) Changes in burden of secondary variants disrupting sets of genes involved with neurodevelopmental disease and related functions (see Methods) in probands with phenotypic domains (n=23–67) compared to probands without each domain (n=12–36). *p≤0.05, one-tailed t-test.
Figure 5. Effects of ascertainment on associations of 16p12.1 deletion.
(A) Prevalence of phenotypes among adults and children with 16p12.1 deletion from four ascertainments: DD cohort (adults n=38, children n=93–151), SPARK (n=51–56), UK Biobank (UKB; questionnaire n=50–53, ICD10 n=217), and MyCode (n=160). Fisher’s exact test, *p≤0.05. (B) Distribution of rare secondary SNVs in UKB individuals with 16p12.1 deletion (n=240, left) and age- and sex-matched controls without large rare (>500kb) CNVs (n=2,640, right). P-value from two-tailed t-test. (C) Associations of secondary variant burden with select psychiatric phenotypes derived from clinical questionnaires in 16p12.1 deletion adults from DD cohort (n=24–31) and UKB (n=46–249). Two-tailed t-test, *p<0.05. (D) Associations of secondary variant burden with select clinical phenotypes derived from EHR data (ICD10 codes) in 16p12.1 deletion individuals from UKB (n=187–218) and MyCode (n=143–159). Two-tailed t-test, *p≤0.05, **Benjamini-Hochberg FDR≤0.05. (E) Associations of secondary variant burden and select developmental phenotypes in children with 16p12.1 deletion from the DD cohort (n=67–125) and SPARK (n=27–56). Two-tailed t-test, *p≤0.05. (F) Associations of secondary variant burden and developmental phenotypes from joint logistic models of 16p12.1 deletion children from the DD and SPARK cohorts (n=98–125). Joint models for non-LF are shown in Fig. S4J, and joint models for adults are shown in Fig. S4H–I. *p≤0.05.
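
Several captions mark results with both a nominal threshold (*p≤0.05) and a Benjamini-Hochberg threshold (**FDR≤0.05). The standard Benjamini-Hochberg adjustment can be sketched as follows. The `benjamini_hochberg` helper and the four example p-values are illustrative assumptions; the captions do not state which tests were grouped together for correction.

```python
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values, smallest first.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


raw = [0.01, 0.04, 0.03, 0.005]  # hypothetical nominal p-values
adj = benjamini_hochberg(raw)
significant = [q <= 0.05 for q in adj]
print(adj, significant)
```

A test can pass the nominal threshold yet fail the FDR threshold once many tests are corrected together, which is why the captions distinguish the single and double asterisks.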
Figure 6. Secondary variant associations in probands with primary variants.
(A) Heatmap shows β coefficients from select linear regression models for secondary variant burden (y-axis, third column) towards quantitative developmental phenotypes (y-axis, first column) in probands from SSC and Simons Searchlight cohorts with different classes of primary variants (x-axis) (n=21–660). *p≤0.05. (B) β coefficients from linear regression models examining interactions between primary variants (pie chart slices) and specific secondary variant classes (x-axis) towards quantitative phenotypes (y-axis) in SSC probands (n=1,597–2,591). Color of pie chart slices indicates interaction coefficients, and size of pie chart slices indicates p-value for strength of interaction coefficient. Red highlights indicate Benjamini-Hochberg FDR≤0.05. (C–D) Gene Ontology (GO) biological process terms enriched among secondary variants observed in (C) probands with different classes of primary variants from the SSC cohort and (D) probands with 16p11.2 deletions and duplications from the Searchlight cohort. Circles represent individual GO terms, clustered based on semantic similarity into broad categories (green ovals, as defined in the two “legend” plots). Size of each circle represents the number of genes in each term, such that broader terms are larger. Colored circles in each plot indicate significant enrichment of the GO terms for the given primary variant.
