Genome Med. 2022 Jul 26;14(1):79.
doi: 10.1186/s13073-022-01087-x.

A systematic analysis of splicing variants identifies new diagnoses in the 100,000 Genomes Project

Alexander J M Blakes et al. Genome Med.

Abstract

Background: Genomic variants which disrupt splicing are a major cause of rare genetic diseases. However, variants which lie outside of the canonical splice sites are difficult to interpret clinically. Improving the clinical interpretation of non-canonical splicing variants offers a major opportunity to increase diagnostic yields from whole-genome sequencing data.

Methods: Here, we examine the landscape of splicing variants in whole-genome sequencing data from 38,688 individuals in the 100,000 Genomes Project and assess the contribution of non-canonical splicing variants to rare genetic diseases. We use a variant-level constraint metric (the mutability-adjusted proportion of singletons) to identify constrained functional variant classes near exon-intron junctions and at putative splicing branchpoints. To find new diagnoses for individuals with unsolved rare diseases in the 100,000 Genomes Project, we selected those with de novo single-nucleotide variants near exon-intron boundaries and at putative splicing branchpoints in known disease genes. We then identified candidate diagnostic variants through manual phenotype matching and confirmed new molecular diagnoses through clinical variant interpretation and functional RNA studies.
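The constraint metric named above can be illustrated with a minimal sketch. It assumes the commonly described form of the mutability-adjusted proportion of singletons (MAPS): a linear model of the proportion of singletons against mutability is fitted on synonymous variants, and the MAPS of a variant class is its observed proportion of singletons minus the proportion that model predicts. The `ContextBin` type, the function names, and the toy numbers are hypothetical and are not taken from the study, whose exact implementation may differ.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass
class ContextBin:
    """Variants sharing one mutational context (hypothetical container)."""
    mutability: float   # assumed per-context mutation rate, arbitrary units
    n_variants: int     # number of distinct variants observed in this bin
    n_singletons: int   # how many of those were seen in exactly one individual


def fit_singleton_model(synonymous: Iterable[ContextBin]) -> Tuple[float, float]:
    """Weighted least-squares fit of proportion of singletons on mutability.

    Each bin is weighted by its variant count. Returns (slope, intercept).
    """
    bins = list(synonymous)
    total = sum(b.n_variants for b in bins)
    mean_x = sum(b.mutability * b.n_variants for b in bins) / total
    mean_y = sum(b.n_singletons for b in bins) / total
    sxx = sum(b.n_variants * (b.mutability - mean_x) ** 2 for b in bins)
    sxy = sum(
        b.n_variants * (b.mutability - mean_x) * (b.n_singletons / b.n_variants - mean_y)
        for b in bins
    )
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def maps(variant_class: Iterable[ContextBin], slope: float, intercept: float) -> float:
    """Observed minus expected proportion of singletons for a variant class."""
    bins = list(variant_class)
    total = sum(b.n_variants for b in bins)
    observed = sum(b.n_singletons for b in bins) / total
    expected = sum(b.n_variants * (intercept + slope * b.mutability) for b in bins) / total
    return observed - expected


# Toy data only: synonymous variants used as the calibration set.
synonymous_bins = [
    ContextBin(mutability=1.0, n_variants=1000, n_singletons=600),
    ContextBin(mutability=2.0, n_variants=1000, n_singletons=500),
    ContextBin(mutability=3.0, n_variants=1000, n_singletons=400),
]
slope, intercept = fit_singleton_model(synonymous_bins)

# Toy data only: a hypothetical near-splice variant class.
near_splice_bins = [
    ContextBin(mutability=1.0, n_variants=100, n_singletons=70),
    ContextBin(mutability=3.0, n_variants=100, n_singletons=50),
]
print(f"MAPS (toy near-splice class): {maps(near_splice_bins, slope, intercept):.3f}")
```

Under this formulation, synonymous variants have a MAPS of zero by construction, and a class with an excess of singletons relative to its mutability receives a positive score, which is read as evidence of stronger purifying selection.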

Results: We show that near-splice positions and splicing branchpoints are highly constrained by purifying selection and harbour potentially damaging non-coding variants which are amenable to systematic analysis in sequencing data. From 258 de novo splicing variants in known rare disease genes, we identify 35 new likely diagnoses in probands with an unsolved rare disease. To date, we have confirmed a new diagnosis for six individuals, including four in whom RNA studies were performed.

Conclusions: Overall, we demonstrate the clinical value of examining non-canonical splicing variants in individuals with unsolved rare diseases.

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Fig. 1
Conservation, predicted splice disruption, and constraint at near-splice and branchpoint positions across 207,548 CDS features in protein-coding genes. (A) Sequence logos and schematic indicating the position of conserved splicing motifs relative to exon/intron boundaries. Positional weight matrices were derived from the human reference sequence at our positions of interest (defined in the “Methods” section). (B) The mean phyloP 100-way scores at splicing positions. Error bars indicate 95% confidence intervals. (C) SpliceAI scores for all possible near-splice SNVs. Scores represent the mean probability that any variant at this position disrupts splicing, as predicted by SpliceAI (see the “Methods” section). Error bars represent the 95% confidence interval. (D) Mutability-adjusted proportion of singletons (MAPS) for both coding and near-splice SNVs. Error bars indicate 95% confidence intervals. Positions with a significantly higher MAPS than synonymous variants are indicated with open circles (see the “Methods” section). For branchpoint positions, dark blue points represent all putative branchpoints, whereas light blue points represent the branchpoints with a LaBranchoR score > 0.85.
Fig. 2
Participant outcomes for rare disease probands with de novo splicing variants in known monoallelic loss-of-function rare disease genes. Each point represents a de novo variant (DNV) in a rare disease proband. Points are coloured by the clinical outcome for that individual. Crosses indicate variants which were identified as likely new diagnoses in this study. Where a variant overlaps both a branchpoint and a splice acceptor position, only the splice acceptor annotation is given.

