J Evol Biol. 2023 Sep;36(9):1295-1312.
doi: 10.1111/jeb.14205. Epub 2023 Aug 11.

Purifying selection against spurious splicing signals contributes to the base composition evolution of the polypyrimidine tract

Burçin Yıldırım et al. J Evol Biol. 2023 Sep.

Abstract

Among eukaryotes, the major spliceosomal pathway is highly conserved. While long introns may contain additional regulatory sequences, the ones in short introns seem to be nearly exclusively related to splicing. Although these regulatory sequences involved in splicing are well-characterized, little is known about their evolution. At the 3' end of introns, the splice signal nearly universally contains the dimer AG, which consists of purines, and the polypyrimidine tract upstream of this 3' splice signal is characterized by over-representation of pyrimidines. If the over-representation of pyrimidines in the polypyrimidine tract is also due to avoidance of a premature splicing signal, we hypothesize that AG should be the most under-represented dimer. Through the use of DNA-strand asymmetry patterns, we confirm this prediction in fruit flies of the genus Drosophila and by comparing the asymmetry patterns to a presumably neutrally evolving region, we quantify the selection strength acting on each motif. Moreover, our inference and simulation method revealed that the best explanation for the base composition evolution of the polypyrimidine tract is the joint action of purifying selection against a spurious 3' splice signal and the selection for pyrimidines. Patterns of asymmetry in other eukaryotes indicate that avoidance of premature splicing similarly affects the nucleotide composition in their polypyrimidine tracts.
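The idea of a DNA-strand asymmetry pattern can be illustrated with a minimal sketch. The abstract does not give the paper's exact asymmetry statistic, so the score used here, (forward − reverse complement) / (forward + reverse complement), is an assumption chosen for illustration, and the function names are hypothetical. Under strand-symmetric evolution a motif and its reverse complement are expected to be about equally frequent, so a negative score for AG (whose reverse complement is CT) would indicate under-representation of AG.

```python
from collections import Counter

# Complement table for unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(motif: str) -> str:
    """Return the reverse complement of a DNA motif (e.g. AG -> CT)."""
    return motif.translate(_COMPLEMENT)[::-1]


def count_kmers(sequences, k):
    """Count overlapping k-mers across sequences, skipping ambiguous bases."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if set(kmer) <= set("ACGT"):
                counts[kmer] += 1
    return counts


def asymmetry_score(counts, motif):
    """Illustrative score (an assumption, not the paper's definition):
    (forward - reverse complement) / (forward + reverse complement).
    Returns 0.0 when neither the motif nor its reverse complement occurs."""
    fwd = counts[motif]
    rev = counts[reverse_complement(motif)]
    total = fwd + rev
    if total == 0:
        return 0.0
    return (fwd - rev) / total
```

For example, calling `count_kmers` on a set of polypyrimidine tract sequences with `k=2` and then `asymmetry_score(counts, "AG")` gives a value between -1 and 1; self-complementary motifs such as AT always score 0 under this definition.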

Keywords: Drosophila; intron evolution; polypyrimidine tract; selective constraint; short intron; splicing motifs.

Conflict of interest statement

The authors declare no competing interests.

Figures

FIGURE 1
Representation of the pre‐mRNA splicing pathway (adapted from Green, 1986).
FIGURE 2
(a) Nucleotide composition around exon‐intron junctions. Positions depicted as 0 correspond to the junctions between introns and exons. (b) AG (purines, blue lines) and CT (pyrimidines, red lines) content per position. Only three representative length classes (55, 70 and 85 bp) were chosen to visualize the patterns with increasing length. The total numbers of sequences used for these length classes are 1044, 474 and 117, respectively.
FIGURE 3
Representation of the regions analysed. Base numbering within each region ascends from 5′ to 3′.
FIGURE 4
Tests of strand symmetric evolution in the 5LR. (a) Ordered p‐values from chi‐square tests, for the equality of forward and reverse complement trimers. (b) Confidence intervals for the ratios of forward and reverse complement trimers. Red dashed lines correspond to the tolerance range to assume equivalence.
FIGURE 5
Per‐position mononucleotide asymmetry scores in 5LR, 3PT and the 3′ junction. TA and CG asymmetries are shown in blue and red, respectively. Dashed horizontal lines mark symmetry at 0. The vertical line in the rightmost plot marks the intron‐exon boundary. The inset plots for 5LR and 3PT show the regressions between position and asymmetry scores.
FIGURE 6
(a) Asymmetry scores per trimer, per region. (b) Correlation between the observed asymmetry of trimers and that expected from the base composition for each region. Orange dots correspond to 3PT and green dots to 5LR.
FIGURE 7
Scaled selection coefficients of each trimer (γ(M)) in autosomal 3PT, calculated by Equation (4). Error bars represent 95% CIs from 1000 bootstraps of the datasets.
FIGURE 8
Inferred scaled selection coefficients γ of the monomers and dimer AG for each position in the autosomal 3PT under the best fit model HIII. Error bars represent 95% CIs from 1000 bootstraps of the datasets.
FIGURE 9
Deviations of the three hypotheses from the autosomal empirical joint frequency data of the four bases. Values are calculated as χ² statistics, and high‐to‐low deviation is represented by a red‐to‐white colour gradient. Matrices for three positions were chosen to visualize the pattern along the 3PT.
FIGURE 10
Asymmetry scores of dimers (top row) and trimers (bottom row) from the pyrimidine‐enriched regions in the introns of eight eukaryotic species. Species are shown with different shapes; trimer motifs containing AG are depicted with red symbols and non‐AG motifs with blue symbols.
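The Figure 4 caption mentions chi‐square tests for the equality of forward and reverse complement trimers. The sketch below shows one common way such a test can be set up; it is an assumption, not necessarily the authors' procedure. It treats the two counts as a one‐degree‐of‐freedom goodness‐of‐fit test against equal expected counts, and the function name is hypothetical.

```python
import math


def strand_symmetry_chi2(fwd_count, rev_count):
    """Chi-square goodness-of-fit test (1 degree of freedom) for equal counts
    of a motif and its reverse complement.

    Assumption for illustration: under strand symmetry each of the two counts
    has expectation (fwd_count + rev_count) / 2, which reduces the statistic
    to (fwd - rev)^2 / (fwd + rev).
    """
    total = fwd_count + rev_count
    if total == 0:
        raise ValueError("at least one occurrence is needed for the test")
    stat = (fwd_count - rev_count) ** 2 / total
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value
```

A large p‐value is compatible with strand‐symmetric evolution for that motif pair; when many trimers are tested at once, the p‐values would normally need a multiple‐testing correction.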
