Nat Commun. 2025 Jan 27;16(1):1068.
doi: 10.1038/s41467-024-55607-x.

Splicing accuracy varies across human introns, tissues, age and disease

S García-Ruiz et al. Nat Commun. 2025.

Abstract

Alternative splicing impacts most multi-exonic human genes. Inaccuracies during this process may have an important role in ageing and disease. Here, we investigate splicing accuracy using RNA-sequencing data from >14k control samples and 40 human body sites, focusing on split reads partially mapping to known transcripts in annotation. We show that splicing inaccuracies occur at different rates across introns and tissues and are affected by the abundance of core components of the spliceosome assembly and its regulators. We find that age is positively correlated with a global decline in splicing fidelity, mostly affecting genes implicated in neurodegenerative diseases. We find support for the latter by observing a genome-wide increase in splicing inaccuracies in samples affected with Alzheimer's disease as compared to neurologically normal individuals. In this work, we provide an in-depth characterisation of splicing accuracy, with implications for our understanding of the role of inaccuracies in ageing and neurodegenerative disorders.
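As a rough illustration of what "partially mapping to known transcripts" can mean in practice, the sketch below labels a junction by checking which of its two splice sites coincide with an annotated intron. The class names follow the figure legends on this page (annotated, novel donor, novel acceptor). The coordinate convention, the `classify_junction` helper and the toy coordinates are assumptions made for this example and are not the authors' pipeline.

```python
from typing import Iterable, Tuple

# Hypothetical convention: an intron or junction is (donor_position, acceptor_position)
# on a single chromosome and strand.
Junction = Tuple[int, int]


def classify_junction(junction: Junction, annotated: Iterable[Junction]) -> str:
    """Label a split-read junction relative to a set of annotated introns."""
    annotated_set = set(annotated)
    known_donors = {donor for donor, _ in annotated_set}
    known_acceptors = {acceptor for _, acceptor in annotated_set}
    donor, acceptor = junction

    if junction in annotated_set:
        return "annotated"
    # Acceptor (3'ss) matches annotation, donor (5'ss) does not.
    if acceptor in known_acceptors and donor not in known_donors:
        return "novel donor"
    # Donor (5'ss) matches annotation, acceptor (3'ss) does not.
    if donor in known_donors and acceptor not in known_acceptors:
        return "novel acceptor"
    # Neither site is annotated, or both are annotated but never paired together.
    return "other"


# Toy annotation with two introns (made-up coordinates).
toy_annotation = [(100, 200), (300, 400)]
labels = [classify_junction(j, toy_annotation)
          for j in [(100, 200), (110, 200), (100, 190), (100, 400)]]
```

Junctions labelled "other" in this sketch share no splice site with the annotation, or combine two annotated sites in an unannotated pairing; they fall outside the three classes named in the legends.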

Conflict of interest statement

Competing interests: S.G. is a current employee of Verge Genomics. All work performed for this publication was performed in his own time, and not as a part of his duties as an employee. R.H.R. and D.Z. are current employees of CoSyne Therapeutics. All work performed for this publication was performed in their own time, and not as a part of their duties as employees. The remaining authors declare no competing interests.

Figures

Fig. 1. Overview of the analyses performed in this study.
a We studied splicing accuracy through three classes of split reads spanning exon-exon junctions: annotated, novel donor and novel acceptor split reads. The RNA-sequencing dataset used originated from the Genotype-Tissue Expression (GTEx) project v8. In all 40 GTEx tissues studied, unique novel acceptor junctions outnumbered unique novel donor junctions. b Novel splice sites from the novel donor and novel acceptor categories present high sequence similarity to annotated splice sites. High sequence fidelity in the vicinity of exon-intron junctions is required to accomplish accurate splicing. c Novel junctions associated with protein-coding transcripts are predicted to be deleterious in 2/3 of cases. d Reduced expression levels of the RNA-binding proteins responsible for sequence recognition appear to change splice site selection, which reduces the overall accuracy of the splicing process. Age is positively correlated with increases in splicing inaccuracies across multiple human tissues. Splicing inaccuracies are significantly higher in autopsy-confirmed Alzheimer’s cases as compared to neurologically normal age-matched controls.
Fig. 2. Splicing accuracy can be measured using short-read RNA-sequencing data.
a Re-classification rate of novel split reads in Ensembl v97 compared to Ensembl v105 per GTEx tissue. Bars in black represent the ratio of split reads classified as novel junctions using Ensembl v97 that entered annotation as annotated introns in Ensembl v105. Bars in light grey represent the ratio of novel junctions in Ensembl v97 that maintained the novel annotation category in Ensembl v105. b Percentage of unique novel donor and novel acceptor junctions detected across the samples of each GTEx tissue (Ensembl v105). The crossing lines link the percentage of unique novel donor and novel acceptor junctions found within the same tissue. c Percentage of cumulative read counts accounted for by the novel donor and novel acceptor categories across the samples of each GTEx tissue (Ensembl v105). The crossing lines link the percentage of novel donor and novel acceptor split read counts detected within the same tissue.
Fig. 3. Splicing inaccuracies can be explained by high sequence similarity between novel splice sites and their annotated pairs.
a MaxEntScan (MES) Delta scores between the 5’ss of the annotated introns and the 5’ss of their novel donor pairs across all tissues. b MES Delta scores between the 3’ss of the annotated introns and the 3’ss of their novel acceptor pairs across all tissues. c, d Distances between the novel splice site of each novel junction and its annotated intron pair in (c) protein-coding transcripts and (d) non-coding transcripts in frontal cortex tissue. e Modulo3 of the distances between each novel junction and its linked annotated intron, up to a maximum distance of 100 bp, within MANE transcripts across all body sites.
Fig. 4. Splicing inaccuracies vary across introns and are impacted by local sequence properties.
a Mis-splicing rates (MSRs) at the 5’ and 3’ss of the annotated introns (n = 251,042) from frontal cortex samples (n = 186). Bottom right: MSRs from inaccurately spliced introns across binned values. Bottom left: a zoomed-in view of the bottom right panel. b, c MSRs at the (b) 5’ and (c) 3’ss of the annotated introns from protein-coding (n = 55,358) and non-coding (n = 55,358) transcripts in samples from frontal cortex tissue. The black dashed vertical line separates the bars displayed under the two y-axes. Right y-scale: a zoomed-in view of the left y-axis. d Exponentiated beta coefficients from the count model of two zero-inflated Poisson (ZIP) regression models (Poisson family, log link function) to predict MSRs at the donor and acceptor splice sites, respectively, from the annotated introns (n = 224,189) in frontal cortex samples (n = 186). P-values from each ZIP model were corrected for multiple testing using the Benjamini-Hochberg method, resulting in q-values (error bars represent adjusted standard errors from each estimated coefficient; statistical tests were two-sided, with significance assessed at q < 0.05; n = 186 biologically independent replicates).
Fig. 5. Splicing inaccuracies vary across tissues and this could be explained by variable RNA-binding protein expression.
a, b Distribution of beta coefficient variation across the zero-inflated Poisson (ZIP) regression models built to predict mis-splicing rates (MSRs) at the (a) donor (5’ss) and (b) acceptor (3’ss) splice sites of the annotated introns across the samples of each GTEx tissue (n = 40). P-values from the ZIP models were corrected for multiple testing using the Benjamini-Hochberg method, resulting in q-values. Only beta coefficient values with significant q-values were considered for display. All statistical tests were two-sided, with significance assessed at q < 0.05. Box plots indicate the median (middle line), 25th and 75th percentiles (box) and 5th and 95th percentiles (whiskers), as well as outliers (single points), of the distribution of the exponentiated beta coefficient values obtained across the n = 40 ZIP models built per MSR measure (one ZIP model per tissue and MSR measure, n = 80 ZIP models built in total). c Probability of superior MSRs at the 5’ss and 3’ss of the annotated introns in samples with the shRNA knockdown of each RBP as compared to untreated samples. The top heatmap track contains the knockdown efficiency of the associated protein.
Fig. 6. shRNA knockdown of RNA-binding proteins (RBPs) produces different patterns of mis-splicing ratios (MSRs) across introns, predominantly affecting annotated introns with higher RBP binding densities.
a Distances in base pairs from each novel donor and acceptor junction to their annotated intron pairs in shRNA knockdown experiments of AQR and U2AF2, respectively, as compared to samples from untreated controls. b MaxEntScan (MES) Delta scores between the novel 3’ss of each novel acceptor junction and its annotated intron pair in shRNA knockdown experiments of AQR and U2AF2, respectively, as compared to untreated controls. Dashed vertical lines represent the median value of each distribution, and p-values are from a one-sided Wilcoxon rank-sum test for differences between the two density distributions. c Log2 fold change in the MSRs of unique annotated introns following RBP knockdown, subclassified on the basis of their RBP binding densities as derived from CLIP-seq data. The top heatmap track contains the knockdown efficiency of the associated protein.
Fig. 7. Splicing inaccuracies increase with age and affect genes involved in neuronal function.
a Probability of superior mis-splicing rates (MSRs) at the 5’ss and 3’ss of the annotated introns in samples from individuals aged 60-79 years as compared to those aged 20-39 years. b Gene Ontology and KEGG enrichment analysis of the genes containing introns with increasing levels of MSR values with age (i.e. 20-39 years < 60-79 years) at their 5’ss and/or 3’ss in samples from brain tissues (one-sided over-representation analysis test). P-values were corrected for multiple testing using the Benjamini-Hochberg method, resulting in q-values. c Cell-type specific expression of 111 splicing-regulator and spliceosomal RBPs (Van Nostrand et al.) in cell types derived from multiple cortical regions of the human brain (Shen et al.). The dashed grey horizontal lines represent the minimum level of significance, with dots displayed above the line showing significant specific expression for a given cell type. P-values were corrected for multiple testing using the Benjamini-Hochberg method, resulting in q-values.
Fig. 8. Splicing inaccuracies increase in samples affected with Alzheimer’s disease and affect genes involved in synaptic functions.
a Percentage of unique annotated, novel donor and novel acceptor splicing events across AD samples as compared to controls. b Percentage of cumulative number of annotated, novel donor and novel acceptor split read counts across AD samples as compared to controls. c Percentage of novel junctions that are located at each modulo3 value of the distance to their annotated pairs. d KEGG enrichment analysis of the genes containing introns with higher frequencies of MSRs at either of their two splice sites (i.e. 5’ss and 3’ss) in AD samples as compared to control samples. e GO enrichment analysis of the genes containing introns with higher frequencies of MSRs at either of their two splice sites in AD samples as compared to controls.
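The legends above refer to MSRs (written as both "mis-splicing rates" and "mis-splicing ratios") and to the modulo3 of the distance between a novel splice site and its annotated pair, but this page does not define either quantity. The sketch below is one plausible reading, not the authors' code. It assumes that the MSR at a splice site is the fraction of split reads at that site that support novel junctions, and it relies on the general fact that an insertion or deletion whose length is a multiple of 3 preserves the reading frame. Both function names and all numbers are hypothetical.

```python
def mis_splicing_ratio(novel_reads: int, annotated_reads: int) -> float:
    """Assumed definition: novel reads / (novel reads + annotated reads) at one splice site."""
    total = novel_reads + annotated_reads
    if total == 0:
        # No split reads cover this site; report 0.0 rather than dividing by zero.
        return 0.0
    return novel_reads / total


def distance_modulo3(novel_site: int, annotated_site: int) -> int:
    """Modulo 3 of the absolute distance (bp) between a novel site and its annotated pair."""
    return abs(novel_site - annotated_site) % 3


# Toy example: 5 novel donor reads and 95 annotated reads at the same 5'ss.
msr_donor = mis_splicing_ratio(novel_reads=5, annotated_reads=95)

# A novel site 6 bp from the annotated one gives modulo 3 == 0 (frame preserved);
# sites 4 bp or 5 bp away give 1 or 2 (frame shifted).
frames = [distance_modulo3(1006, 1000),
          distance_modulo3(1004, 1000),
          distance_modulo3(995, 1000)]
```

Under these assumptions, an MSR of 0 means no novel split reads were observed at the site, and values closer to 1 mean that most reads at that site come from novel junctions.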
