Mol Biol Evol. 2011 Oct;28(10):2949-59.
doi: 10.1093/molbev/msr127. Epub 2011 May 6.

The origins, evolution, and functional potential of alternative splicing in vertebrates

Jonathan M Mudge et al. Mol Biol Evol. 2011 Oct.

Abstract

Alternative splicing (AS) has the potential to greatly expand the functional repertoire of mammalian transcriptomes. However, few variant transcripts have been characterized functionally, making it difficult to assess the contribution of AS to the generation of phenotypic complexity and to study the evolution of splicing patterns. We have compared the AS of 309 protein-coding genes in the human ENCODE pilot regions against their mouse orthologs in unprecedented detail, utilizing traditional transcriptomic and RNAseq data. The conservation status of every transcript has been investigated, and each has been functionally categorized as coding (separated into coding sequence [CDS] or nonsense-mediated decay [NMD] linked) or noncoding. In total, 36.7% of human and 19.3% of mouse coding transcripts are species specific, and we observe a 3.6-fold excess of human NMD transcripts compared with mouse; in contrast to previous studies, the majority of species-specific AS is unlinked to transposable elements. We observe one conserved CDS variant and one conserved NMD variant per 2.3 and 11.4 genes, respectively. Subsequently, we identify and characterize equivalent AS patterns for 22.9% of these CDS or NMD-linked events in nonmammalian vertebrate genomes, and our data indicate that functional NMD-linked AS is more widespread and ancient than previously thought. Furthermore, although we observe an association between conserved AS and elevated sequence conservation, as previously reported, we emphasize that 30% of conserved AS exons display sequence conservation below the average score for constitutive exons. In conclusion, we demonstrate the value of detailed comparative annotation in generating a comprehensive set of AS transcripts, increasing our understanding of AS evolution in vertebrates. Our data support a model whereby the acquisition of functional AS has occurred throughout vertebrate evolution and is considered alongside amino acid change as a key mechanism in gene evolution.

Figures

FIG. 1.
Summary of interior and terminal modification alternative splicing biotypes. This figure summarizes the most common biotype categories in the data set (interior and terminal modifications); a complete set of 68 biotypes is represented in supplementary file 3 (Supplementary Material online). The total numbers highlighted in blue relate to the full 68 biotypes, whereas the total numbers of AS events within each biotype category are highlighted within the gray bars. Each entry represents the categorization of a particular AS event described within the context of the reference transcript structure and not the complete transcript itself; for this reason, the total counts differ slightly from those in table 1 (see supplementary file 1, Supplementary Material online, for further information). From left to right, “conserved Hs/Mm” lists events supported in human and mouse, “Hs-specific” and “Mm-specific” events that are species specific, “cross sp. Hs” and “cross sp. Mm” events that could be aligned cross-species, and “conserved non-mam.” events for which AS is also supported in nonmammals (this is thus a subset of “conserved Hs/Mm”; see supplementary file 2, Supplementary Material online). Although cassette events are pictorially represented by the insertion of an additional exon, these categories also include the “skipping” of exons from the reference transcript. Cassette counts do not include multiple exon events or mutually exclusive exon pairs. For splice site shifts, the counts for shifts in both the 5′ and 3′ directions have been combined. Furthermore, these counts do not distinguish between alternative STOPs and PTCs found within the AS region and those that appear downstream out of frame with the reference CDS. In all cases, these biotypes are distinguished in supplementary file 3 (Supplementary Material online).
FIG. 2.
The percent identity of human and mouse AS cassette exons plotted against exon size. Only AS single cassette events that are transcriptionally supported in both species are included. “% exonic similarity” represents the percentage similarity of each orthologous AS cassette pair at the base pair level. Cassettes linked to CDS are blue, whereas those linked to NMD are red. For comparison, the 1,118 constitutive internal exons for those genes displaying conserved AS are plotted in green. A cut-off of 300 bp has been used for exon size. The vertical dotted line marks the first quartile value of the median exon size, whereas the horizontal dotted line marks the upper standard deviation limit of the average exon percentage identity. AS cassettes linked to both CDS and NMD tend to be shorter and more similar at the sequence level than the control exons. However, this does not hold true for all AS cassettes; for example, although 14 have exon scores of 100%, 17 have scores falling below the average score for the control set exons.
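As an illustration of the “% exonic similarity” measure, the following is a minimal sketch of a base-pair-level percent identity calculation for one pre-aligned orthologous exon pair. It is not the authors' pipeline: the function name `percent_identity`, the toy sequences, and the choice to count a gap in one sequence as a mismatch are all assumptions made for this example.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of alignment columns in which both sequences carry the same base.

    Assumptions (not taken from the paper): the inputs are already aligned
    to equal length, '-' marks a gap, a gap in only one sequence counts as a
    mismatch, and columns that are gaps in both sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    # Compare case-insensitively and drop columns that are gaps in both rows.
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")

    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


# Hypothetical toy exon pair: 7 of 8 columns match.
human_exon = "ATGGCCTA"
mouse_exon = "ATGGCTTA"
score = percent_identity(human_exon, mouse_exon)
```

A score computed this way for each cassette pair could then be plotted against exon length, as in the figure; how the published scores treat gaps and alignment ends is not stated in the caption.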
FIG. 3.
The conservation of four alternative splicing events within RBM39. The RBM39 gene for RNA binding motif protein 39 contains six AS events conserved between human and mouse, four of which are highlighted here. The central panel shows the structure of these four splice variants in human (5′ and 3′ UTRs shown in red; CDS in green; NMD region in purple) and the Phastcons conservation plot. The peripheral panels contain genomic alignments taken initially from the conservation track resources at the UCSC genome browser (Siepel et al. 2005; Pollard et al. 2010). Splice acceptor sites of the form [N/GT] are represented by black triangles, splice donor sites of the form [NAG] by gray triangles, and termination codons by blue triangles. Two of the AS events are “poison” exons, introducing PTCs predicted to induce NMD. (a) Poison exon 1 is 98.6% identical at the base pair level between human and Xenopus and has transcriptional support in zebrafish. The UCSC alignment of the zebrafish splice donor site has been corrected based on manual analysis, whereas the PTC used in this genome is found in the subsequent exon. (b) In contrast, poison exon 2 cannot be aligned in any genomes beyond that of opossum. (c) Two AS acceptor sites are found for this exon in human and mouse; the first (i) is limited to mammalian genomes, whereas the second (ii) exists (often with a “wobble” on the first base pair) in all genomes back to Xenopus (where there is transcriptional support). (d) The alternative final exon uses a splice site found in mammalian genomes only, although the STOP codon seen in mouse and other mammalian genomes is absent in apes; the human and mouse CDS are thus significantly different.

References

    1. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local alignment search tool. J Mol Biol. 1990;215:403–410.
    2. Amit M, Sela N, Keren H, Melamed Z, Muler I, Shomron N, Izraeli S, Ast G. Biased exonization of transposed elements in duplicated genes: a lesson from the TIF-IA gene. BMC Mol Biol. 2007;8:109.
    3. Benson DA, Karsch-Mizrachi I, Lipman DJ, Ostell J, Sayers EW. GenBank. Nucleic Acids Res. 2010;38:D46–D51.
    4. Birney E, Stamatoyannopoulos JA, Dutta A, et al. Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project. Nature. 2007;447:799–816. 420 co-authors.
    5. Boguski MS, Lowe TM, Tolstoshev CM. dbEST–database for "expressed sequence tags". Nat Genet. 1993;4:332–333.
