BMC Genomics. 2023 Apr 12;24(1):198.
doi: 10.1186/s12864-023-09247-y.

Splicing complexity as a pivotal feature of alternative exons in mammalian species


Feiyang Zhao et al. BMC Genomics. 2023.

Abstract

Background: As a significant process of post-transcriptional gene expression regulation in eukaryotic cells, alternative splicing (AS) of exons greatly contributes to the complexity of the transcriptome and indirectly enriches the protein repertoire. Many studies have focused on the inclusion of alternative exons and have revealed roles for AS in organ development and maturation. Notably, AS acts by changing the relative abundance of the transcript isoforms produced by a single gene, so exons can have complex splicing patterns. However, the commonly used percent spliced-in (Ψ) value only captures the usage rate of an exon and loses information about the complexity of its linkage pattern. To date, the extent and functional consequences of the splicing complexity of alternative exons in development and evolution are poorly understood.

Results: By comparing the splicing complexity of exons in six tissues (brain, cerebellum, heart, liver, kidney, and testis) from six mammalian species (human, chimpanzee, gorilla, macaque, mouse, opossum) and an outgroup species (chicken), we found that exons with high splicing complexity are prevalent in mammals and are closely related to features of genes. Using traditional machine learning and deep learning methods, we found that the splicing complexity of exons can be moderately predicted with features derived from exons, among which the length of flanking exons and the strength of downstream/upstream splice sites are the top predictors. Comparative analysis among human, chimpanzee, gorilla, macaque, and mouse revealed that alternative exons tend to evolve toward increased splicing complexity and higher tissue specificity in splicing complexity. During organ development, not only developmentally regulated exons but also 10-15% of non-developmentally regulated exons show dynamic splicing complexity.

Conclusions: Our analysis revealed that splicing complexity is an important metric to characterize the splicing dynamics of alternative exons during the development and evolution of mammals.

Keywords: Alternative splicing; Development and evolution; Machine learning; Splicing complexity.
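
The contrast drawn in the Background, that Ψ records how often an exon is used but not how varied its linkage pattern is, can be illustrated with a small sketch. The abstract does not give the formula for splicing entropy, so the code below assumes it is the Shannon entropy (in bits) of the read distribution over the splice junctions linked to an exon. The function names `psi` and `splicing_entropy` and the read counts are hypothetical.

```python
import math


def psi(inclusion_reads, exclusion_reads):
    """Percent spliced-in: fraction of reads that support exon inclusion."""
    return inclusion_reads / (inclusion_reads + exclusion_reads)


def splicing_entropy(junction_reads):
    """Shannon entropy (bits) over junction read proportions.

    Assumption: this stands in for the paper's splicing entropy, whose
    exact definition is not stated in the abstract.
    """
    total = sum(junction_reads)
    probs = [r / total for r in junction_reads if r > 0]
    return -sum(p * math.log2(p) for p in probs)


# Exon A: 50 inclusion reads, 50 reads on a single skipping junction.
psi_a = psi(50, 50)
entropy_a = splicing_entropy([50, 50])

# Exon B: 50 inclusion reads, skipping reads split over two junctions.
psi_b = psi(50, 25 + 25)
entropy_b = splicing_entropy([50, 25, 25])

print(psi_a, entropy_a)  # same psi as exon B, lower entropy
print(psi_b, entropy_b)
```

Under this assumed definition both exons have Ψ = 0.5, yet exon B has the higher entropy because its reads are spread over more junctions, which is the information Ψ alone discards.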


Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1
Splicing complexity of alternative exons. (A) Distribution of splicing entropy for all alternative CE events in protein-coding genes in brain. (B) Splicing entropy for conserved CE events across seven species in brain. Red arrows indicate the positions of the two peaks. (C) Frequencies of events with high splicing entropy (≥ 1.0) for each type of event in human. (D) Density plot of Ψ values (x-axis) and splicing entropy (y-axis) in human brain
Fig. 2
Splicing entropy of events from different types of genes. The bean charts display differences in the splicing entropy of AS events between gene categories. (A) Housekeeping genes (n = 16,387) vs. non-housekeeping genes (n = 35,205). (B) Gene age. Young: human-specific genes (n = 315); Old: non-human-specific genes (n = 40,342). (C) Expression level. Low: TPM < 50 (n = 39,838); High: TPM ≥ 50 (n = 11,754). (D) Tissue specificity of gene expression. Low tissue specificity: tau < 0.3 (n = 24,229); High tissue specificity: tau ≥ 0.3 (n = 27,363). (E) Evolutionary rate. Low (slow-evolving): 0.000923 ≤ dN/dS ≤ 0.0993 (n = 45,272); High (fast-evolving): 0.0993 < dN/dS ≤ 29.3 (n = 45,256). (F) Degree in the protein–protein interaction (PPI) network. Low degree: 2 ≤ degree ≤ 1.34 × 10³ (n = 25,442); High degree: 1.34 × 10³ ≤ degree ≤ 1.38 × 10⁴ (n = 25,350). Significance was evaluated with the Wilcoxon rank-sum test. Dashed lines indicate the overall median across both groups of CE events; solid vertical lines indicate the median of each group
Fig. 3
Prediction of splicing entropy with machine learning. (A) Simplified diagram of the deep learning model used to predict splicing entropy. For each event, 59 features were used as input and processed with two one-dimensional convolutions. A Squeeze-and-Excitation Network (SENet) was then applied to process the features. Next comes the recurrent layer, which contains LSTM units connected end to end in both directions to capture dependencies between features. The recurrent outputs are fed to a fully connected (FC) layer to predict the splicing entropy of events in the test data. (B) Comparison of the average performance of different methods on the test data. PCC: Pearson product-moment correlation coefficient; SCC: Spearman’s rank correlation coefficient; R²: explained variation. (C) Scatter plots showing the predictive power of the xgboost and deep learning models, respectively; the red line in each graph indicates the linear fit between the predicted and measured splicing entropies. (D) Ranking of feature importance (top 10) for predicting splicing entropy with the xgboost model
Fig. 4
Comparison of splicing complexity and Ψ values between paired tissues in human. (A) Sankey plots showing how events move between groups across tissues. Events were categorized by their splicing complexity and Ψ values in each tissue, and each row represents one tissue. In K{n}_{m}, n is the splicing complexity category and m is the Ψ category. K1_Low: complexity = K1 and 0 < Ψ ≤ 0.2; K1_Middle: complexity = K1 and 0.2 < Ψ < 0.8; K1_High: complexity = K1 and 0.8 ≤ Ψ < 0.97; K2_Low: complexity = K2 and 0 < Ψ ≤ 0.2; K2_Middle: complexity = K2 and 0.2 < Ψ < 0.8; K2_High: complexity = K2 and 0.8 ≤ Ψ < 0.97; K3_Low: complexity = K3 and 0 < Ψ ≤ 0.2; K3_Middle: complexity = K3 and 0.2 < Ψ < 0.8; K3_High: complexity = K3 and 0.8 ≤ Ψ < 0.97; Others: not in the above categories (NA in one tissue, but not NA in the other). (B) Bar plot showing the frequency of events from different groups in brain and cerebellum, with transcript annotation from the Ensembl database
Fig. 5
Evolution of splicing complexity. (A) Spearman correlation between human and other species when comparing splicing entropy pairwise for orthologous events in each tissue. For each pair of species, the correlation is calculated from the splicing entropy of all orthologous exons in the seven species. (B) Numbers of alternative exons conserved in the seven species that show their highest splicing entropy in each species, for each tissue. (C) Line chart showing the splicing entropy of events with a decrease/increase in splicing entropy during evolution in each tissue. (D) Bar plot displaying the ratio of alternative exons with maximum splicing entropy ≥ 1.0 for each group. (E) Bar plot displaying the ratio of alternative exons with maximum changes in splicing entropy ≥ 1.0 among tissues for each age group of events. VCA: vertebrate conserved alternative exons; MCA: mammalian conserved alternative exons
Fig. 6
Splicing complexity changes during development. (A) Hierarchical clustering of splicing complexity for events with a confident change in splicing entropy larger than 0.5 (at least 20 supporting reads) and host-gene expression above 10 TPM. The red arrow marks events that have stable and high splicing entropy during development. (B) 2D kernel density plots showing the largest change in splicing entropy (y-axis) and Ψ (x-axis). (C) Number of events that change in splicing entropy. Dev-events: events with largest splicing entropy ≥ 0.5 during development; non-Dev-events: events with largest splicing entropy < 0.5. (D) Relative ratio of events with splicing entropy changes larger than 0.5 among events that are not regulated in Ψ (max ΔΨ < 0.1). (E) Distribution of splicing entropy changes for events in the ‘humangain’ group
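
The K{n}_{m} grouping spelled out in the Fig. 4 legend can be written as a short labeling function. This is a minimal sketch that uses only the Ψ thresholds given in that legend. How an event is assigned to complexity category K1, K2, or K3 is not described on this page, so the category is taken as an input. The function names `psi_bin` and `event_group` are hypothetical.

```python
def psi_bin(psi):
    """Map a Ψ value to the Low/Middle/High bins from the Fig. 4 legend.

    Returns None for NA values and for Ψ outside the listed ranges.
    """
    if psi is None:
        return None
    if 0 < psi <= 0.2:
        return "Low"
    if 0.2 < psi < 0.8:
        return "Middle"
    if 0.8 <= psi < 0.97:
        return "High"
    return None


def event_group(complexity, psi):
    """Return a label such as 'K2_Middle', or 'Others' when no bin applies.

    `complexity` is assumed to be one of 'K1', 'K2', 'K3', assigned
    upstream by a rule this page does not describe.
    """
    bin_name = psi_bin(psi)
    if complexity not in ("K1", "K2", "K3") or bin_name is None:
        return "Others"
    return f"{complexity}_{bin_name}"


# Label the same hypothetical event in two tissues to see whether it
# changes group, which is what the Sankey plots in panel A visualize.
brain_label = event_group("K1", 0.15)
cerebellum_label = event_group("K2", 0.55)
print(brain_label, "->", cerebellum_label)
```

Note that the listed bins stop at Ψ < 0.97, so events with Ψ at or above 0.97 fall into "Others" in this sketch, as do events that are NA in a tissue.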
