Genome Res. 2019 Feb;29(2):159-170.
doi: 10.1101/gr.238444.118. Epub 2018 Dec 26.

Pathogenicity and selective constraint on variation near splice sites

Jenny Lord et al. Genome Res. 2019 Feb.

Abstract

Mutations that perturb normal pre-mRNA splicing are significant contributors to human disease. We used exome sequencing data from 7833 probands with developmental disorders (DDs) and their unaffected parents, as well as more than 60,000 aggregated exomes from the Exome Aggregation Consortium, to investigate selection around the splice sites and quantify the contribution of splicing mutations to DDs. Patterns of purifying selection, a deficit of variants in highly constrained genes in healthy subjects, and excess de novo mutations in patients highlighted particular positions within and around the consensus splice site of greater functional relevance. By using mutational burden analyses in this large cohort of proband-parent trios, we could estimate in an unbiased manner the relative contributions of mutations at canonical dinucleotides (73%) and flanking noncanonical positions (27%), and calculate the positive predictive value of pathogenicity for different classes of mutations. We identified 18 patients with likely diagnostic de novo mutations in dominant DD-associated genes at noncanonical positions in splice sites. We estimate 35%-40% of pathogenic variants in noncanonical splice site positions are missing from public databases.
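The burden-based estimates described in the abstract can be illustrated with a minimal sketch. It assumes the common convention that the excess of de novo mutations is observed minus expected, and that the positive predictive value is the excess divided by the observed count; the abstract itself does not spell out the formula. The function names and the counts below are hypothetical and are not taken from the study.

```python
from typing import Dict, Tuple


def excess_and_ppv(observed: int, expected: float) -> Tuple[float, float]:
    """Return (excess, ppv) for one class of de novo mutations.

    Assumption: excess = observed - expected (floored at zero) and
    ppv = excess / observed. This is an illustrative convention, not a
    formula quoted from the abstract.
    """
    if observed <= 0:
        return 0.0, 0.0
    excess = max(observed - expected, 0.0)
    return excess, excess / observed


def relative_contributions(excess_by_class: Dict[str, float]) -> Dict[str, float]:
    """Share of the total excess attributable to each mutation class."""
    total = sum(excess_by_class.values())
    if total == 0:
        return {name: 0.0 for name in excess_by_class}
    return {name: value / total for name, value in excess_by_class.items()}


if __name__ == "__main__":
    # Hypothetical counts, chosen only to make the arithmetic easy to follow.
    canonical_excess, canonical_ppv = excess_and_ppv(observed=40, expected=10.0)
    flanking_excess, flanking_ppv = excess_and_ppv(observed=50, expected=40.0)
    shares = relative_contributions(
        {"canonical": canonical_excess, "noncanonical": flanking_excess}
    )
    print(canonical_ppv, flanking_ppv, shares)
```

With these made-up inputs, `excess_and_ppv(40, 10.0)` returns `(30.0, 0.75)`: 30 of the 40 observed mutations exceed the null expectation, so 75% are attributed to pathogenic variation under this convention.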

Figures

Figure 1.
Signals of purifying selection around splice sites. (A) Selective constraint across splicing region in 13,750 unaffected parents of DDD probands and more than 60,000 aggregated exomes from ExAC. Mutability-adjusted proportion of singletons (MAPS) with 95% confidence intervals (CIs) shown for Ensembl's Variant Effect Predictor (VEP) annotated exonic sites, extended splice acceptor and splice donor regions, the last base of the exon, split by reference nucleotide, and grouped sites in the polypyrimidine tract (PolyPy) region, split by changes from a pyrimidine to a purine (PyPu) versus all other changes. (B) Proportion of variants with 95% CI in 13,750 unaffected parents of DDD probands that fall within genes with high probability of loss-of-function intolerance (pLI > 0.9) across VEP annotated exonic sites, extended splice acceptor and splice donor regions, the last base of the exon, split by reference nucleotide, and grouped sites in the PolyPy region, split by changes from a pyrimidine to a purine (PyPu) versus all other changes. Lower panel shows splice acceptor and splice donor consensus sequences, based on our exons of interest.
Figure 2.
De novo mutations (DNMs) around splice sites. Enrichment of DNMs across the splicing region in 7833 DDD probands. (A) Numbers of observed and expected DNMs across the splicing region in known dominant and recessive DD genes, as well as in non-DD–associated genes, with FDR-corrected Poisson P-values. Splice acceptor and splice donor consensus sequences are shown below, as in Figure 1. (B) Aggregation of observed and expected numbers of DNMs in the PolyPy region, with changes from a pyrimidine to a purine (PyPu) and all other changes shown separately for known dominant and recessive DD genes, as well as non-DD–associated genes. (C) Positive predictive values (PPVs) for DNMs in dominant DD-associated genes in positions across the splicing region, as well as VEP annotated stop gained and missense changes, calculated from observed and expected numbers of DNMs. (D) Enrichment (observed/expected) of DNMs by gene probability of pLI split into sextiles for donor+5, pyrimidine to purine PolyPy, and synonymous sites. pLI scores encompassed by each sextile: 1 = 5.36 × 10^−91–0.000000605, 2 = 0.000000609–0.000558185, 3 = 0.000559475–0.027905143, 4 = 0.027908298–0.377456159, 5 = 0.377491926–0.919495985, 6 = 0.91955878–1.
Figure 3.
Clinical classifications of noncanonical near-splice DNMs. Relationship between clinical classifications of 38 splice region DNMs in undiagnosed DDD probands and PPVs calculated using observed and expected numbers of DNMs in 7833 probands.
Figure 4.
Selective constraint and pathogenicity scores. MAPS, with 95% CI, calculated for pathogenicity score brackets (least to most severe) in 13,750 unaffected parents from the DDD project, with Spearman's rank correlation coefficient.
Figure 5.
Pathogenicity scores for observed near-splice site DNMs. Cumulative percentage of DNMs in known dominant DD genes with decreasing pathogenicity score bracket, shown with canonical splice site positions included (left) and excluded (right). (AUC*) area under curve.
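
The Figure 2 caption refers to FDR-corrected Poisson P-values for observed versus expected DNM counts. A minimal sketch of that kind of test follows, assuming a one-sided upper-tail Poisson test and the Benjamini-Hochberg procedure; the caption does not state which tail or which FDR method was used, so both are assumptions. The function names and input counts are hypothetical.

```python
import math
from typing import List


def poisson_upper_tail(observed: int, expected: float) -> float:
    """P(X >= observed) for X ~ Poisson(expected).

    Assumption: enrichment is tested one-sided (more DNMs than expected).
    """
    if observed <= 0:
        return 1.0
    term = math.exp(-expected)  # P(X = 0)
    cdf = 0.0
    for i in range(observed):   # accumulate P(X = 0) .. P(X = observed - 1)
        cdf += term
        term *= expected / (i + 1)
    return max(0.0, min(1.0, 1.0 - cdf))


def benjamini_hochberg(pvalues: List[float]) -> List[float]:
    """Benjamini-Hochberg adjusted P-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical (observed, expected) pairs for three splice-region positions.
    counts = [(12, 3.0), (5, 4.5), (9, 2.0)]
    raw = [poisson_upper_tail(obs, exp) for obs, exp in counts]
    print(benjamini_hochberg(raw))
```

Each position contributes one raw P-value, and the adjustment is applied across all positions tested together, which is why the two steps are kept as separate functions.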

