PLoS One. 2009 Nov 16;4(11):e7853.
doi: 10.1371/journal.pone.0007853.

The peculiarities of large intron splicing in animals

Samuel Shepard et al. PLoS One.

Abstract

In mammals, 92% of genes contain introns, and hundreds of these introns exceed 50,000 nucleotides in length. These "large introns" must be spliced out of the pre-mRNA in a timely fashion, which requires bringing together distant 5' donor and 3' acceptor splice sites. In invertebrates, especially Drosophila, larger introns have been shown to be spliced efficiently through recursive splicing: consecutive splicing from the 5'-end at a series of combined donor-acceptor splice sites called RP-sites. Using a computational analysis of genomic sequences, we show that vertebrates lack the proper enrichment of RP-sites in their large introns and therefore require some other method to aid splicing. We analyzed over 15,000 non-redundant large introns from six mammals, 1,600 from chicken and zebrafish, and 560 from five invertebrates. Our bioinformatic investigation demonstrates that, unlike the studied invertebrates, the studied vertebrate genomes consistently contain abundant direct and complementary-strand interspersed repetitive elements (mainly SINEs and LINEs) that may form stems with each other in large introns. This examination showed that predicted stems are indeed abundant and stable in the large introns of mammals. We hypothesize that such stems with long loops within large introns allow intron splice sites to find each other more quickly by folding the intronic RNA upon itself at smaller intervals, thus reducing the distance between donor and acceptor sites.

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

Figure 1. Ratcheting point consensus sequence (RP-site).
The RP-site consensus was obtained from our purged sample of 11,315 non-redundant human gene sequences (with <50% sequence identities between each other) from the human Exon-Intron Database, release 35p1. The top row contains the consensus sequence derived from the frequency information below. Each nucleotide in the consensus sequence is a column in the matrix whose rows show the frequencies found for each given nucleotide at that position. The first column gives the nucleotides corresponding to the frequency information.
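
The relationship between a frequency matrix and its consensus row, as described in this caption, can be illustrated with a minimal Python sketch. The helper names `frequency_matrix` and `consensus` and the toy input sites are hypothetical and chosen only for illustration; they are not the RP-site data or the procedure used by the authors.

```python
from collections import Counter

def frequency_matrix(sites):
    """Per-position nucleotide frequencies for equal-length aligned sites.

    Returns one dict per position (a matrix column), mapping each of
    A, C, G, T to its frequency at that position.
    """
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")
    matrix = []
    for pos in range(length):
        counts = Counter(site[pos] for site in sites)
        matrix.append({nt: counts.get(nt, 0) / len(sites) for nt in "ACGT"})
    return matrix

def consensus(matrix):
    """Most frequent nucleotide at each position (ties go to the first of A, C, G, T)."""
    return "".join(max(column, key=column.get) for column in matrix)

# Toy aligned sites (hypothetical, NOT taken from the Exon-Intron Database).
toy_sites = ["CAGGT", "TAGGT", "CAGGA", "CAGGT"]
print(consensus(frequency_matrix(toy_sites)))
```

A simple majority rule is only one way to call a consensus; published consensus rows often use degenerate symbols where no single nucleotide dominates.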
Figure 2. Human and Drosophila large intron dot-plots.
(A) Dot-plot of human intron 21 from the CNTNAP2 gene versus its complementary sequence. (B) Dot-plot of Drosophila intron 1 from the luna gene versus its complementary sequence. The dot-plot window size is 19 and the mismatch limit is 0. Low complexity repeats were filtered out using RepeatMasker before performing the dot-plot. The diagonal lines on the graph represent base pairing between different sections of the large introns, which may be interpreted as potential stem structures. The dot-plot conveys all possible combinations of stems in the sequence.
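
The kind of comparison such a dot-plot encodes can be sketched in Python as a search for exact complementary windows. This sketch rests on assumptions: "complementary sequence" is taken to mean the reverse complement, masked positions are assumed to appear as non-ACGT characters such as N, and the function name `complementary_windows` is hypothetical. It is not the dot-plot software used in the paper.

```python
from collections import defaultdict

_COMPLEMENT = str.maketrans("ACGT", "TGCA")
_VALID = set("ACGT")

def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string (non-ACGT characters are kept)."""
    return seq.translate(_COMPLEMENT)[::-1]

def complementary_windows(seq, window=19):
    """Find pairs of windows that could base-pair with zero mismatches.

    Returns (i, j) start positions with i + window <= j such that
    seq[i:i+window] is the exact reverse complement of seq[j:j+window].
    Windows containing masked or ambiguous characters are skipped.
    """
    seq = seq.upper()
    n = len(seq)
    rc = reverse_complement(seq)
    # Index every window of the reverse complement by its start position
    # in the coordinates of the original sequence.
    index = defaultdict(list)
    for k in range(n - window + 1):
        word = rc[k:k + window]
        if set(word) <= _VALID:
            index[word].append(n - k - window)
    hits = []
    for i in range(n - window + 1):
        for j in index.get(seq[i:i + window], ()):
            if i + window <= j:  # keep each non-overlapping pair once
                hits.append((i, j))
    return hits

# Toy sequence: two 7-nt arms separated by a 5-nt loop.
toy = "GATTACA" + "TTTTT" + "TGTAATC"
print(complementary_windows(toy, window=7))
```

Each reported pair corresponds to one dot; runs of adjacent pairs would form the diagonal lines interpreted as potential stems. Indexing the windows in a dictionary avoids the quadratic all-against-all comparison, which matters for introns longer than 50,000 nucleotides.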
Figure 3. Repetitive elements within species.
Percentage of repeats in the complete set of large introns for various species. Light gray bars show the total percentage of repeats in large introns (percentage of nucleotides), medium gray bars show the percentage of nucleotides made up by short interspersed element (SINE) repeats, and dark gray bars show the percentage made up by long interspersed element (LINE) repeats. *Note: Mosquito contains an ambiguous SINE element called “SINEX-1_AG”.
Figure 4. Beetle large intron dot-plot and secondary structure.
(A) Dot-plot of a beetle large second intron of the predicted gene XP_968205.1. The window size for the dot-plot was 19 and the mismatch limit was 0. (B) An example stem from the same intron, created using RNAcofold (it is not associated with any known repeat).
