Genome Res. 2012 Jun;22(6):1184-95.
doi: 10.1101/gr.134106.111. Epub 2012 Mar 5.

Transcriptome survey reveals increased complexity of the alternative splicing landscape in Arabidopsis

Yamile Marquez et al. Genome Res. 2012 Jun.

Abstract

Alternative splicing (AS) is a key regulatory mechanism that contributes to transcriptome and proteome diversity. As very few genome-wide studies analyzing AS in plants are available, we have performed high-throughput sequencing of a normalized cDNA library which resulted in a high coverage transcriptome map of Arabidopsis. We detect ∼150,000 splice junctions derived mostly from typical plant introns, including an eightfold increase in the number of U12 introns (2069). Around 61% of multiexonic genes are alternatively spliced under normal growth conditions. Moreover, we provide experimental validation of 540 AS transcripts (from 256 genes coding for important regulatory factors) using high-resolution RT-PCR and Sanger sequencing. Intron retention (IR) is the most frequent AS event (∼40%), but many IRs have relatively low read coverage and are less well-represented in assembled transcripts. Additionally, ∼51% of Arabidopsis genes produce AS transcripts which do not involve IR. Therefore, the significance of IR in generating transcript diversity was generally overestimated in previous assessments. IR analysis allowed the identification of a large set of cryptic introns inside annotated coding exons. Importantly, a significant fraction of these cryptic introns are spliced out in frame, indicating a role in protein diversity. Furthermore, we show extensive AS coupled to nonsense-mediated decay in AFC2, encoding a highly conserved LAMMER kinase which phosphorylates splicing factors, thus establishing a complex loop in AS regulation. We provide the most comprehensive analysis of AS to date which will serve as a valuable resource for the plant community to study transcriptome complexity and gene regulation.

Figures

Figure 1.
Read alignment using Bowtie and TopHat. (A) Table showing statistics of the aligned reads to the Arabidopsis genome (TAIR9). (B) Aligned single reads to Arabidopsis chromosomes (left) and schematic representation of Arabidopsis chromosomes (right). (C) Log2 scale of median read density in windows of 1 kb by chromosome.
Figure 2.
Features of splice junctions and predicted introns. (A) Distribution of splice junctions along gene features in protein-coding genes, according to TAIR9. (B) Distribution of intron sizes defined by splice junctions (SJ) predicted by TopHat (left circle). Splice junctions were classified as annotated if they were already in TAIR9 (middle circle); otherwise, they were classified as new (right circle). (C) Distribution of the AT content (%) in introns generated by annotated splice junctions and new splice junctions.
Figure 3.
Classification of introns defined by predicted splice junctions. (A) Splice sites of predicted introns. (B) Classification of introns according to Sheth et al. 2006 (see Methods). (NC) Not classified.
Figure 4.
Alternative splicing events and validation of assembled transcripts. (A) Top 10 most frequent types of AS in the predicted transcripts according to ASTALAVISTA. The first column illustrates the intron-exon structure of the AS event, followed by its description, the raw number of events found in the sample, and their frequency. (IR) Intron retention, (Alt 5′ss) alternative 5′ splice site, (Alt 3′ss) alternative 3′ splice site, (ES) exon skipping. (B) Venn diagram of the number of fragments obtained in the HR RT-PCR panel, putative assembled transcripts of RNA-seq, and TAIR9-annotated transcripts according to primer pairs of the HR RT-PCR panel (see text and Methods).
Figure 5.
Features of retained introns. (A) Size distribution of different intron classes. [I(All)] All introns in our sample, (IC) the constitutive introns, (IC+IA−IR) and (IA−IR) the categories without retained introns either including or excluding constitutive introns, respectively, (IR+IA) retained introns that are also involved in other AS events, (IR) introns that are only retained. (B) GC content distribution of constitutive introns (IC) and retained introns (IR). (C) Splice site score distribution of constitutive introns (IC) and retained introns (IR). (D) Histogram of number of introns in each intron retention ratio (IRR) category. (E) Boxplots of splice site score distribution by IRR category. (F) Boxplots of GC content distribution by IRR category. (G) Boxplots of stop codons distribution by IRR category. For the splice site score distributions in C and E, the means of 5′ and 3′ splice sites scores (calculated according to Sheth et al. 2006) were used.
Figure 6.
Alternative splicing in the AFC2 gene. (A) The black arrow represents the chromosome location of AFC2 and its orientation indicates the direction of transcription. (Diamonds) Two transcription start sites (TSS) reported by TAIR. (Black) TAIR models for the AFC2 gene; (gray) new putative assembled transcripts (PATs). The triangles located over the TAIR transcripts and PATs indicate the start codons, while the asterisks or double daggers represent the stop codons. For PAT1 and PAT2, two putative start codons (full and empty triangles) and two putative stop codons (asterisk and double dagger) are depicted. Protein isoforms predicted using the shown start and stop codons are illustrated at the end of each splice variant followed by their sizes in amino acids. (Black) Noncatalytic domain; (gray) kinase catalytic domain. (Star) Location of the LAMMER motif. (Arrows) The primer pair 226 designed for the HR RT-PCR panel. The amplified region is highlighted by a box (region A). The AS events in the region A are shown after each transcript, and the number in parentheses denotes the size of the amplified product in nucleotides. Abbreviations for AS events: (FS) fully spliced, (IRn) intron retention where “n” is the number of the intron, (ESn) exon skipping of exon number “n.” (B) Electropherograms of RT-PCR products for the primer pair indicated in A for wild type and upf mutants generated by GeneMapper. Numbers on the x-axis represent the size markers in bp; numbers on the y-axis represent relative fluorescence, reflecting transcript abundance. The AS event associated with each product is shown in the respective peak. (Ovals) The peaks which increase in abundance in the upf1-5 and upf3-1 mutants.
