Nature. 2024 Aug;632(8026):832-840.
doi: 10.1038/s41586-024-07773-7. Epub 2024 Jul 11.

De novo variants in the RNU4-2 snRNA cause a frequent neurodevelopmental syndrome

Yuyang Chen, Ruebena Dawes, Hyung Chul Kim, Alicia Ljungdahl, Sarah L Stenton, Susan Walker, Jenny Lord, Gabrielle Lemire, Alexandra C Martin-Geary, Vijay S Ganesh, Jialan Ma, Jamie M Ellingford, Erwan Delage, Elston N D'Souza, Shan Dong, David R Adams, Kirsten Allan, Madhura Bakshi, Erin E Baldwin, Seth I Berger, Jonathan A Bernstein, Ishita Bhatnagar, Ed Blair, Natasha J Brown, Lindsay C Burrage, Kimberly Chapman, David J Coman, Alison G Compton, Chloe A Cunningham, Precilla D'Souza, Petr Danecek, Emmanuèle C Délot, Kerith-Rae Dias, Ellen R Elias, Frances Elmslie, Care-Anne Evans, Lisa Ewans, Kimberly Ezell, Jamie L Fraser, Lyndon Gallacher, Casie A Genetti, Anne Goriely, Christina L Grant, Tobias Haack, Jenny E Higgs, Anjali G Hinch, Matthew E Hurles, Alma Kuechler, Katherine L Lachlan, Seema R Lalani, François Lecoquierre, Elsa Leitão, Anna Le Fevre, Richard J Leventer, Jan E Liebelt, Sarah Lindsay, Paul J Lockhart, Alan S Ma, Ellen F Macnamara, Sahar Mansour, Taylor M Maurer, Hector R Mendez, Kay Metcalfe, Stephen B Montgomery, Mariya Moosajee, Marie-Cécile Nassogne, Serena Neumann, Michael O'Donoghue, Melanie O'Leary, Elizabeth E Palmer, Nikhil Pattani, John Phillips, Georgia Pitsava, Ryan Pysar, Heidi L Rehm, Chloe M Reuter, Nicole Revencu, Angelika Riess, Rocio Rius, Lance Rodan, Tony Roscioli, Jill A Rosenfeld, Rani Sachdev, Charles J Shaw-Smith, Cas Simons, Sanjay M Sisodiya, Penny Snell, Laura St Clair, Zornitza Stark, Helen S Stewart, Tiong Yang Tan, Natalie B Tan, Suzanna E L Temple, David R Thorburn, Cynthia J Tifft, Eloise Uebergang, Grace E VanNoy, Pradeep Vasudevan, Eric Vilain, David H Viskochil, Laura Wedd, Matthew T Wheeler, Susan M White, Monica Wojcik, Lynne A Wolfe, Zoe Wolfenson, Caroline F Wright, Changrui Xiao, David Zocche, John L Rubenstein, Eirene Markenscoff-Papadimitriou, Sebastian M Fica, Diana Baralle, Christel Depienne, Daniel G MacArthur, Joanna M M Howson, Stephan J Sanders, Anne O'Donnell-Luria, Nicola Whiffin

Yuyang Chen et al. Nature. 2024 Aug.

Abstract

Around 60% of individuals with neurodevelopmental disorders (NDD) remain undiagnosed after comprehensive genetic testing, primarily of protein-coding genes (ref. 1). Large genome-sequenced cohorts are improving our ability to discover new diagnoses in the non-coding genome. Here we identify the non-coding RNA RNU4-2 as a syndromic NDD gene. RNU4-2 encodes the U4 small nuclear RNA (snRNA), which is a critical component of the U4/U6.U5 tri-snRNP complex of the major spliceosome (ref. 2). We identify an 18 base pair region of RNU4-2 mapping to two structural elements in the U4/U6 snRNA duplex (the T-loop and stem III) that is severely depleted of variation in the general population, but in which we identify heterozygous variants in 115 individuals with NDD. Most individuals (77.4%) have the same highly recurrent single base insertion (n.64_65insT). In 54 individuals in whom it could be determined, the de novo variants were all on the maternal allele. We demonstrate that RNU4-2 is highly expressed in the developing human brain, in contrast to RNU4-1 and other U4 homologues. Using RNA sequencing, we show how 5′ splice-site use is systematically disrupted in individuals with RNU4-2 variants, consistent with the known role of this region during spliceosome activation. Finally, we estimate that variants in this 18 base pair region explain 0.4% of individuals with NDD. This work underscores the importance of non-coding genes in rare disorders and will provide a diagnosis to thousands of individuals with NDD worldwide.

Conflict of interest statement

N.W. receives research funding from Novo Nordisk and has consulted for ArgoBio studio. S.J.S. receives research funding from BioMarin Pharmaceutical. A.O.’D.-L. is on the scientific advisory board for Congenica, was a paid consultant for Tome Biosciences, Ono Pharma USA Inc. and at present for Addition Therapeutics, and received reagents from PacBio to support rare disease research. H.L.R. has received support from Illumina and Microsoft to support rare disease gene discovery and diagnosis. M.W. has consulted for Illumina and Sanofi and received speaking honoraria from Illumina and GeneDx. S.B.M. is an advisor for BioMarin, Myome and Tenaya Therapeutics. S.M.S. has received honoraria for educational events or advisory boards from Angelini Pharma, Biocodex, Eisai, Zogenix/UCB and institutional contributions for advisory boards, educational events or consultancy work from Eisai, Jazz/GW Pharma, Stoke Therapeutics, Takeda, UCB and Zogenix. The Department of Molecular and Human Genetics at Baylor College of Medicine receives revenue from clinical genetic testing completed at Baylor Genetics Laboratories. J.M.M.H. is a full-time employee of Novo Nordisk and holds shares in Novo Nordisk A/S. D.G.M. is a paid consultant for GlaxoSmithKline, Insitro and Overtone Therapeutics and receives research support from Microsoft. All other authors declare no competing interests.

Figures

Fig. 1. A highly structured 18 bp region of RNU4-2 that is critical for BRR2 helicase activity is enriched for variants in NDD and depleted in population cohorts.
a, Allele counts of de novo variants in 8,841 undiagnosed NDD probands in GEL (top; teal) and the UK Biobank cohort (bottom; grey) across RNU4-2. The 18 bp critical region, which is depleted of variants in the UK Biobank, is marked by a horizontal bar at the top of the plot. b, Allele counts of further variants identified in individuals with NDD in the critical 18 bp region. This includes 16 individuals with seven variants without sequencing data for both parents in GEL and variants identified in individuals from the following extra cohorts (Methods): NHS GMS (n = 19); MSSNG (n = 2); SSC (n = 1); GREGoR (n = 10) and UDN (n = 6); from personal communication or Matchmaker Exchange (n = 16). c, Schematic of U4 (teal) binding to U6 snRNA (grey). The 18 bp critical region is underlined. Nucleotides 142 to 145 of U4 (in blue) are not within the GENCODE transcript of RNU4-2 but are included in previous figures of the U4/U6 duplex in the literature on which this depiction is based and are present in the RNA-seq reads from human prefrontal cortex in BrainVar. d, The structure of U4 and U6 snRNAs resolved by cryogenic electron microscopy. U4 residues in the critical region are labelled with the reference nucleotide and numbered according to the position along the RNA (for example, U62 indicates a uracil residue in the reference sequence at position 62). Created using publicly accessible coordinates from the RCSB Protein Data Bank (PDB structure 6QW6). In both c and d, single base insertions identified in individuals with NDD are shown by black arrows and positions of SNVs by orange nucleotides.
Fig. 2. Individuals with RNU4-2 variants have systematic changes in 5′ splice-site use.
a, Boxplots of the number of abnormal splicing events (detected by FRASER2) at unannotated 5′ splice sites. The individuals with RNU4-2 variants (n = 5 individuals) have significantly more outlier events than both controls with non-NDD phenotypes (n = 378 individuals) and controls matched on genetic ancestry, sex and age at consent (n = 10 individuals, two per case; Wilcoxon P = 4.0 × 10^−5 (W test statistic, 1,766) and P = 5.7 × 10^−3 (W test statistic, 45.5), respectively). Centre line, median; box limits, upper and lower quartiles; whiskers, 1.5× interquartile range; maxima and minima represented as points. FDR, false discovery rate. b, The distribution of the number of abnormal splicing events at unannotated 5′ splice sites shared between two or more out of five randomly selected control individuals over 10,000 permutations (grey histogram). The number of shared events in individuals with RNU4-2 variants is indicated as a dotted teal vertical line (n = 11). c, DNA sequence motifs around 5′ splice sites with increased and decreased use in individuals with RNU4-2 variants. Each plot shows the proportion of sites with each base at each position. 5′ splice sites with increased use (top) have an increase in T at the +3 position (eight out of 19 versus zero out of 36; Fisher’s P = 6.2 × 10^−5; OR = Inf; 95% CI: 5.92-Inf) and an increase in C at the +4 (four out of 19 versus zero out of 36; Fisher’s P = 0.011; OR = Inf; 95% CI: 1.37-Inf) and +5 (six out of 19 versus one out of 36; Fisher’s P = 0.0051; OR = 15.3; 95% CI: 2.09-Inf) positions compared to decreased 5′ splice sites (bottom). The consensus sequence at 5′ splice sites in matched annotation from NCBI and EMBL-EBI (MANE) transcripts is shown in Supplementary Fig. 4. d, The structure of the U6 snRNA paired with the 5′ splice site after 5′ splice-site transfer. The three bases of the U6 ACAGAGA that directly pair with the 5′ splice site are shown in pink. The paired positions of the 5′ splice site (5′SS) are shown in green (A + 3 and A + 4) and yellow (G + 5). Statistical tests in a and c are one-sided with unadjusted P values.
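The one-sided Fisher's exact P values quoted in panel c can be recomputed from the counts given in the caption. The following is a minimal sketch, assuming each comparison is a 2×2 table of sites with increased use (19) versus decreased use (36) against presence or absence of the named base; the function name and table layout are illustrative and are not taken from the authors' analysis code.

```python
from math import comb


def fisher_one_sided_greater(a, b, c, d):
    """One-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Returns P(X >= a) under the hypergeometric null, where X is the count
    in the top-left cell with all row and column totals held fixed.
    """
    row1 = a + b          # e.g. sites with increased use
    col1 = a + c          # e.g. sites carrying the base of interest
    total = a + b + c + d
    upper = min(row1, col1)
    tail = sum(
        comb(row1, k) * comb(total - row1, col1 - k)
        for k in range(a, upper + 1)
    )
    return tail / comb(total, col1)


# Counts from the caption (increased-use sites first, decreased-use second).
p_t_plus3 = fisher_one_sided_greater(8, 11, 0, 36)   # T at +3: 8/19 vs 0/36
p_c_plus4 = fisher_one_sided_greater(4, 15, 0, 36)   # C at +4: 4/19 vs 0/36
p_c_plus5 = fisher_one_sided_greater(6, 13, 1, 35)   # C at +5: 6/19 vs 1/36

print(f"{p_t_plus3:.2g} {p_c_plus4:.2g} {p_c_plus5:.2g}")
```

Under these assumptions the three values agree with the reported P values (6.2 × 10^−5, 0.011 and 0.0051) to the precision given in the caption. The odds ratios and confidence intervals are not reproduced here, since the caption does not say how they were estimated.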
Fig. 3. Clinical photographs showing facial features of affected individuals with variants in RNU4-2.
All individuals shown have the n.64_65insT variant, except for individual 44 in o (n.68_69insA), individual 45 in p (n.64_65insG) and individual 48 in q (n.76C>T). a, Individual 1 at 12 years old. b, Individual 4 at 9 years old. c, Individual 7 at 13 years old. d, Individual 15 at 8 years old. e, Individual 21 at 3.5 years old. f, Individual 22 at 8 years old. g, Individual 23 at 13 years old. h, Individual 28 at 5 years old (left) and 9 years old (right). i, Individual 32 at 3 years old (left) and 12 years old (right). j, Individual 36 at 11 months old (left) and 8 years old (right). k, Individual 37 at 22 months old (left) and 16 years old (right). l, Individual 38 at 2.5 years old (left) and 10 years old (right). m, Individual 39 at 2 years old (left) and 12 years old (right). n, Individual 43 at 8 years old (left) and 12 years old (right). o, Individual 44 at 6 years old (left) and 19 years old (right). p, Individual 45 at 6 years old (left and centre) and 27 years old (right). q, Individual 48 at 22 months old.
Fig. 4. Many snRNA genes have regions that are depleted of variation in the population.
The proportion of observed SNVs in 490,640 genome-sequenced individuals in the UK Biobank, in sliding windows of 18 bp across each snRNA gene, normalized to the median value for each gene.
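The sliding-window depletion measure described in this caption can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the input is a hypothetical per-position count of distinct SNVs observed (0 to 3), and "normalized to the median" is taken here to mean subtracting the per-gene median, following the "distance to the median" wording of Extended Data Fig. 2. Dividing by the median would be an equally plausible reading.

```python
from statistics import median

WINDOW = 18        # window length in bp, as stated in the caption
SNVS_PER_SITE = 3  # each reference base has three possible substitutions


def window_proportions(observed_per_site, window=WINDOW):
    """Proportion of all possible SNVs observed in each sliding window."""
    n = len(observed_per_site)
    if n < window:
        return []
    return [
        sum(observed_per_site[i:i + window]) / (SNVS_PER_SITE * window)
        for i in range(n - window + 1)
    ]


def distance_to_median(proportions):
    """Centre window proportions on the gene-wide median (assumed normalization)."""
    m = median(proportions)
    return [p - m for p in proportions]


# Toy gene (hypothetical data): fully saturated flanks around an
# 18 bp stretch in which no SNVs are observed.
toy_gene = [3] * 20 + [0] * 18 + [3] * 20
props = window_proportions(toy_gene)
scores = distance_to_median(props)
most_depleted_start = min(range(len(scores)), key=scores.__getitem__)
print(most_depleted_start, props[most_depleted_start])
```

In the toy input the most depleted window starts at index 20, which is exactly the variant-free stretch. Real data would also need to account for per-site mutability, which this sketch ignores.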
Fig. 5. RNU4-2 is more highly expressed than RNU4-1 in the prefrontal cortex.
a, Levels of RNU4-1 (grey) and RNU4-2 (teal) expression in human dorsolateral prefrontal cortex at different developmental stages from BrainVar. Coloured lines correspond to the Loess smoothed average with the shaded regions representing 95% CIs. Developmental stages are labelled with periods (1 to 12), spanning from embryonic development to late adulthood, that were defined previously. b, ATAC-seq data from human prenatal prefrontal cortex shows substantially higher peaks of chromatin accessibility around RNU4-2 than RNU4-1. Data for both 18 and 19 gestational weeks (GW) are shown to demonstrate replication.
Extended Data Fig. 1. HPO terms for individuals in GEL.
(a) The proportion of individuals with human phenotype ontology (HPO) terms corresponding to phenotypes observed in ≥ 5 individuals with the n.64_65insT variant compared to all other individuals with NDD. Multiple HPO terms are significantly enriched in individuals with the n.64_65insT variant after Bonferroni adjustment (marked with a *) indicating that individuals with the n.64_65insT variant have more phenotypic similarity than the GEL NDD cohort as a whole. Multiple terms relating to global developmental delay, intellectual disability, hypotonia, seizure, microcephaly, autism, and short stature have been collapsed into single phenotypes. Of note, this figure relates only to HPO terms entered for each individual into GEL, which may be incomplete. Error bars indicate ±1 standard error. (b) Data plotted in panel (a) including statistics from two-sided Fisher’s exact tests. A P-value threshold of 2.94 × 10^−3 was used to assess statistical significance (Bonferroni adjusted for 17 tests).
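The Bonferroni threshold quoted in this caption follows from dividing a family-wise significance level by the number of tests. The sketch below assumes a family-wise level of 0.05, which the caption does not state explicitly but which is consistent with the reported threshold; the helper name is illustrative.

```python
ALPHA = 0.05   # assumed family-wise significance level (not stated in the caption)
N_TESTS = 17   # number of HPO-term tests, from the caption

# Bonferroni adjustment: each individual test is held to alpha / m.
threshold = ALPHA / N_TESTS


def is_significant(p_value, alpha=ALPHA, n_tests=N_TESTS):
    """Return True if p_value passes the Bonferroni-adjusted threshold."""
    return p_value < alpha / n_tests


print(f"{threshold:.3g}")
```

With these inputs the threshold is about 2.94 × 10^−3, matching the value in the caption.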
Extended Data Fig. 2. Depletion of variants in the population in RNU4-2 and RNU4-1.
(a; top) Distance to the median proportion of all possible SNVs that are observed in the UK Biobank in 18 bp sliding windows across the length of RNU4-2. A clear region of depletion compared to the rest of the gene is observed in the centre. (bottom) Log transformation of the mean Roulette mutability across the 3 possible SNVs within a site. (b) Total allele frequency at each site of RNU4-1 in undiagnosed NDD probands in GEL (teal) and the UK Biobank cohort (grey). In contrast to RNU4-2, variants in RNU4-1 have higher allele frequencies. A similar region of depletion is seen in the centre of RNU4-1 (quantified in Fig. 4), but this is not enriched for variants in GEL NDD or non-NDD individuals.
Extended Data Fig. 3. Sequencing coverage in exome sequencing data.
The number of sequencing reads covering the position of the n.64_65insT variant in 13,450 probands with exome sequencing in the DDD cohort. 3,408/13,450 probands (25.3%) have at least one read at the position.
Extended Data Fig. 4. Comparison of parental age.
Comparison of (a) paternal age for probands with fathers and (b) maternal age for probands with mothers recruited into GEL for individuals with variants in RNU4-2 (teal) and all other NDD probands (grey). Centre line, median; box limits, upper and lower quartiles; whiskers, 1.5× interquartile range. Individual data points, including outliers, are not shown due to Genomics England restrictions. NS: not significant. Paternal age: mean 33.1 vs 33.4 in individuals with RNU4-2 variants and other NDD probands, respectively (two-sided t-test P-value = 0.771; t = −0.29 (−2.41 to 1.80)). Maternal age: mean 30.2 vs 29.7 in individuals with RNU4-2 variants and other NDD probands, respectively (two-sided t-test P-value = 0.505; t = −0.67 (−1.07 to 2.15)).
Extended Data Fig. 5. Assessing variant density in the UK Biobank.
Median proportion of possible SNVs observed in UK Biobank per 18 bp window across 1,000 intergenic regions on chromosome 12 (grey) and across RNU4-1 and RNU4-2 (teal). A median of 76% of all possible SNVs in RNU4-2 are observed, compared with 13% on average in intergenic sequences of the same length (141 bp; P < 0.001, Monte Carlo Fisher-Pitman test).

Update of

  • De novo variants in the non-coding spliceosomal snRNA gene RNU4-2 are a frequent cause of syndromic neurodevelopmental disorders.
    Chen Y, Dawes R, Kim HC, Stenton SL, Walker S, Ljungdahl A, Lord J, Ganesh VS, Ma J, Martin-Geary AC, Lemire G, D'Souza EN, Dong S, Ellingford JM, Adams DR, Allan K, Bakshi M, Baldwin EE, Berger SI, Bernstein JA, Brown NJ, Burrage LC, Chapman K, Compton AG, Cunningham CA, D'Souza P, Délot EC, Dias KR, Elias ER, Evans CA, Ewans L, Ezell K, Fraser JL, Gallacher L, Genetti CA, Grant CL, Haack T, Kuechler A, Lalani SR, Leitão E, Fevre AL, Leventer RJ, Liebelt JE, Lockhart PJ, Ma AS, Macnamara EF, Maurer TM, Mendez HR, Montgomery SB, Nassogne MC, Neumann S, O'Leary M, Palmer EE, Phillips J, Pitsava G, Pysar R, Rehm HL, Reuter CM, Revencu N, Riess A, Rius R, Rodan L, Roscioli T, Rosenfeld JA, Sachdev R, Simons C, Sisodiya SM, Snell P, Clair L, Stark Z, Tan TY, Tan NB, Temple SE, Thorburn DR, Tifft CJ, Uebergang E, VanNoy GE, Vilain E, Viskochil DH, Wedd L, Wheeler MT, White SM, Wojcik M, Wolfe LA, Wolfenson Z, Xiao C, Zocche D, Rubenstein JL, Markenscoff-Papadimitriou E, Fica SM, Baralle D, Depienne C, MacArthur DG, Howson JM, Sanders SJ, O'Donnell-Luria A, Whiffin N. medRxiv [Preprint]. 2024 Apr 9:2024.04.07.24305438. doi: 10.1101/2024.04.07.24305438. Update in: Nature. 2024 Aug;632(8026):832-840. doi: 10.1038/s41586-024-07773-7. PMID: 38645094. Free PMC article. Preprint.

References

    1. Wright, C. F. et al. Genomic diagnosis of rare pediatric disease in the United Kingdom and Ireland. N. Engl. J. Med. 388, 1559–1571 (2023). doi: 10.1056/NEJMoa2209046
    2. Nguyen, T. H. D. et al. The architecture of the spliceosomal U4/U6.U5 tri-snRNP. Nature 523, 47–52 (2015). doi: 10.1038/nature14548
    3. Ellingford, J. M. et al. Recommendations for clinical interpretation of variants found in non-coding regions of the genome. Genome Med. 14, 73 (2022). doi: 10.1186/s13073-022-01073-3
    4. 100,000 Genomes Project Pilot Investigators. 100,000 Genomes Pilot on rare-disease diagnosis in health care—preliminary report. N. Engl. J. Med. 385, 1868–1880 (2021). doi: 10.1056/NEJMoa2035790
    5. Aspden, J. L., Wallace, E. W. J. & Whiffin, N. Not all exons are protein coding: addressing a common misconception. Cell Genom. 3, 100296 (2023). doi: 10.1016/j.xgen.2023.100296