Cells. 2020 Feb 18;9(2):458.
doi: 10.3390/cells9020458.

Animal, Fungi, and Plant Genome Sequences Harbor Different Non-Canonical Splice Sites

Katharina Frey et al. Cells. 2020.

Abstract

Most protein-encoding genes in eukaryotes contain introns, which are interwoven with exons. Introns need to be removed from initial transcripts in order to generate the final messenger RNA (mRNA), which can be translated into an amino acid sequence. Precise excision of introns by the spliceosome requires conserved dinucleotides, which mark the splice sites. However, there are variations of the highly conserved combination of GT at the 5' end and AG at the 3' end of an intron in the genome. GC-AG and AT-AC are two major non-canonical splice site combinations, which have been known for years. Recently, various minor non-canonical splice site combinations were detected with numerous dinucleotide permutations. Here, we expand systematic investigations of non-canonical splice site combinations in plants across eukaryotes by analyzing fungal and animal genome sequences. Comparisons of splice site combinations between these three kingdoms revealed several differences, such as an apparently increased CT-AC frequency in fungal genome sequences. Canonical GT-AG splice site combinations in antisense transcripts are a likely explanation for this observation, thus indicating annotation errors. In addition, high numbers of GA-AG splice site combinations were observed in Eurytemora affinis and Oikopleura dioica. A variant in one U1 small nuclear RNA (snRNA) isoform might allow the recognition of GA as a 5' splice site. In-depth investigation of splice site usage based on RNA-Seq read mappings indicates a generally higher flexibility of the 3' splice site compared to the 5' splice site across animals, fungi, and plants.

Keywords: RNA-Seq; gene structure; introns; mRNA processing; sequence conservation; splice site analysis pipeline; spliceosome; splicing.
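
The antisense explanation for CT-AC in the abstract can be checked with a few lines of code: a canonical GT-AG intron read from the opposite strand appears as CT-AC. The following is a minimal sketch, not the authors' pipeline; the function names and the toy intron sequence are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: helper names and the toy intron are assumptions,
# not part of the published analysis pipeline.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def splice_site_combination(intron: str) -> str:
    """Join the first and last two bases of an intron, e.g. 'GT-AG'."""
    intron = intron.upper()
    if len(intron) < 4:
        raise ValueError("intron must be at least 4 bases long")
    return f"{intron[:2]}-{intron[-2:]}"


def classify(combination: str) -> str:
    """Classify a combination using the categories named in the abstract."""
    if combination == "GT-AG":
        return "canonical"
    if combination in {"GC-AG", "AT-AC"}:
        return "major non-canonical"
    return "minor non-canonical"


# Toy intron with canonical GT...AG borders (invented sequence).
intron = "GTAAGTTTTCTAACTTTCAG"

sense = splice_site_combination(intron)
antisense = splice_site_combination(reverse_complement(intron))

print(sense, classify(sense))
print(antisense, classify(antisense))
```

An intron annotated on the wrong strand would therefore be counted as a minor non-canonical CT-AC combination even though the underlying splice sites are canonical.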

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Figure 1
Frequencies of non-canonical splice site combinations in animals, fungi, and plants. The frequency of non-canonical splice site combinations across the 489 animal (red, a), 130 fungal (blue, b), and 121 plant (green, c) genome sequences is shown. Normalization of the absolute number of each splice site combination was performed per species based on the total number of annotated splice site combinations in representative transcripts. The frequency of the respective splice site combination of each species is shown on the left-hand side and the percentage of the respective splice site combination is shown on top of each box plot. The dashed line represents the mean frequency of the respective splice site combination over all investigated species. The box plots are ordered (from left to right) according to the mean frequency.
Figure 2
Flanking positions of GA-AG splice site combinations in Eurytemora affinis (a,b) and Oikopleura dioica (c,d). All splice site combinations (a,c) as well as all 5795 splice site combinations supported by RNA-Seq data (b,d) of these two species were investigated. Seven exonic and seven intronic positions are displayed at the 5′ and 3′ splice sites. Underlined bases represent the terminal dinucleotides of the intron, i.e., the 5′ and 3′ splice site.
Figure 3
CT-AC frequency exceeds AT-AC frequency in the annotation of fungal genome sequences. (a) Number of the minor non-canonical CT-AC splice site combination in comparison to the major non-canonical splice site combination AT-AC in each kingdom (Mann-Whitney U-Test; fungi: p ≈ 0.00035, animals: p ≈ 9.560 × 10^−10, plants: p ≈ 5.464 × 10^−24). The dashed line represents the mean frequency of the respective splice site combination over all investigated species. (b) Sequence logo for the splice site combination CT-AC in four selected fungal species (Alternaria alternata, Aspergillus brasiliensis, Fomitopsis pinicola and Zymoseptoria tritici). In total, 67 supported splice sites with this combination were used to generate the sequence logo.
Figure 4
Usage of non-canonical splice site combinations in plant species. (a) Comparison of the transcript abundance (FPKMs) of genes with non-canonical splice site combinations to genes with only canonical GT-AG splice site combinations. Among genes containing GC-AG or AT-AC splice site combinations, the proportion with low FPKMs is especially small. (b) Comparison of the usage of 5′ and 3′ splice sites. The x-axis shows the difference between the usage of the 5′ splice site and the usage of the 3′ splice site. A rapid drop in values toward the negative side of the x-axis indicates that the 3′ splice site is probably more flexible than the 5′ splice site.
Figure 5
Hypothetical binding of the U1 snRNA to the pre-mRNA. (a) Binding sequence of the canonical U1 snRNA to the canonical 5′ splice site GU (GT on DNA). (b) Hypothetical binding sequence of the non-canonical U1 snRNA (C > T) to the non-canonical 5′ splice site GA.
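
The per-species normalization described in the Figure 1 caption (absolute counts divided by the total number of annotated splice site combinations, then averaged over species for the dashed line) can be sketched as follows. This is an illustrative assumption about how such a calculation might look, not the authors' code; the species names and counts are invented toy data.

```python
# Illustrative sketch only: function names and all counts are invented.
from collections import Counter
from statistics import mean


def combination_frequencies(combinations):
    """Normalize counts by the total number of combinations in one species."""
    counts = Counter(combinations)
    total = sum(counts.values())
    return {combo: n / total for combo, n in counts.items()}


def mean_frequency(per_species, combination):
    """Mean frequency of one combination over all species (0 if absent)."""
    return mean(freqs.get(combination, 0.0) for freqs in per_species.values())


# Toy annotation: one splice site combination per annotated intron.
species_combinations = {
    "species_A": ["GT-AG"] * 97 + ["GC-AG"] * 2 + ["AT-AC"] * 1,
    "species_B": ["GT-AG"] * 48 + ["GC-AG"] * 2,
}

per_species = {
    name: combination_frequencies(combos)
    for name, combos in species_combinations.items()
}

for combo in ("GC-AG", "AT-AC"):
    print(combo, mean_frequency(per_species, combo))
```

Normalizing per species before averaging keeps species with many annotated introns from dominating the mean.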
