Nature. 2015 May 21;521(7552):376-9.
doi: 10.1038/nature14475. Epub 2015 May 13.

Genome-wide identification of zero nucleotide recursive splicing in Drosophila

Michael O Duff et al. Nature.

Abstract

Recursive splicing is a process in which large introns are removed in multiple steps by re-splicing at ratchet points, which are 5' splice sites recreated after splicing. Recursive splicing was first identified in the Drosophila Ultrabithorax (Ubx) gene, and only three additional Drosophila genes have since been experimentally shown to undergo recursive splicing. Here we identify 197 zero nucleotide exon ratchet points in 130 introns of 115 Drosophila genes from total RNA sequencing data generated from developmental time points, dissected tissues and cultured cells. The sequential nature of recursive splicing was confirmed by identification of lariat introns generated by splicing to and from the ratchet points. We also show that recursive splicing is a constitutive process, that depletion of U2AF inhibits recursive splicing, and that the sequence and function of ratchet points are evolutionarily conserved in Drosophila. Finally, we identify four recursively spliced human genes, one of which is also recursively spliced in Drosophila. Together, these results indicate that recursive splicing is commonly used in Drosophila, occurs in humans, and provides insight into the mechanisms by which some large introns are removed.

Conflict of interest statement

The authors declare no competing financial interests.

Figures

Extended Data Figure 1. Two approaches for identifying recursive splice sites
a, Identification of recursive splice sites by parsing alignments. RNA-Seq reads were mapped to the genome using TopHat in a manner that allowed for novel splice junctions to be predicted. The alignments were then parsed for splice junction reads where the 5′ splice site mapped to an annotated 5′ splice site, but the 3′ splice site was unannotated. b, de novo identification of recursive splice sites. A database was generated in which each annotated 5′ splice site was spliced to all AG/GT sequences in an intron that did not correspond to an annotated 3′ splice site. All RNA-Seq reads were aligned to this database and the alignments parsed to find cases where reads mapped perfectly with at least 3 distinct offsets and at least an 8 nt overhang.
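The de novo approach in panel b can be illustrated with a minimal sketch. It assumes a plain-string intron sequence, 0-based coordinates, and reads already aligned perfectly to a junction database. The function names (`candidate_ratchet_points`, `supports_junction`) and the `(start, length)` read representation are hypothetical and are not the authors' pipeline; only the thresholds (at least 3 distinct offsets, at least an 8 nt overhang) come from the caption. Requiring the overhang on both sides of the junction is an assumption.

```python
def candidate_ratchet_points(intron_seq, annotated_3ss=frozenset()):
    """Return 0-based positions p where intron_seq[p-2:p] == 'AG' and
    intron_seq[p:p+2] == 'GT', i.e. p is the boundary inside an AG/GT motif.

    annotated_3ss: boundary positions (same coordinates) of annotated
    3' splice sites, which are excluded (hypothetical input).
    """
    seq = intron_seq.upper()
    hits = []
    for p in range(2, len(seq) - 1):
        if seq[p - 2:p] == "AG" and seq[p:p + 2] == "GT" and p not in annotated_3ss:
            hits.append(p)
    return hits


def supports_junction(reads, junction_pos, min_offsets=3, min_overhang=8):
    """Decide whether perfectly aligned reads support a candidate junction.

    reads: iterable of (start, length) in junction-database coordinates.
    junction_pos: index of the first base after the junction.
    A read counts only if it extends at least min_overhang nt on each side
    of the junction (assumption); distinct start positions are the offsets.
    """
    offsets = set()
    for start, length in reads:
        left = junction_pos - start
        right = start + length - junction_pos
        if left >= min_overhang and right >= min_overhang:
            offsets.add(start)
    return len(offsets) >= min_offsets
```

Counting distinct start positions rather than raw reads guards against a single duplicated fragment being mistaken for independent support.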
Extended Data Figure 2. Characteristics of Drosophila ratchet points
a, Distribution of the number of ratchet points per recursive intron. b, Size distribution (log10(bp)) of all (red) and recursive (blue) introns. c, Size distribution (in kbp) of the individual intron segments removed by recursive splicing binned by the number of segments per intron.
Extended Data Figure 3. RT-PCR validations of Drosophila recursive splicing events
RT-PCR validation of ratchet points (red dots) from the indicated genes using primers in the upstream constitutive exon and flanking the putative ratchet points. The RP primers are expected to yield RT-PCR products if the constitutive exon is spliced to the ratchet point. The URP primers, which are upstream of each ratchet point, serve as negative controls. The identity of all RT-PCR products was verified by Sanger sequencing. Though the URP control RT-PCR reactions yielded a product for hppy RP1 and pum RP2, we were not able to generate sequence from them and therefore consider them to be amplification artifacts.
Extended Data Figure 4. Number of mapped reads per sample used for gene expression analysis
Extended Data Figure 5. Chromatin marks associated with recursive splice sites
a, Examples of chromatin marks at the luna gene locus, which contains 5 recursive splice sites (red triangles) within a single long intron. b, Heatmaps show relative ChIP-seq enrichment for H3K4me3 (top, red), H3K79me2 (middle, green), and H3K36me3 (bottom, blue), within 2 kb of the indicated gene features from 171 genes containing at least one ratchet point. Heatmaps are centered around gene features, which include the transcription start site of the first exon (First exon, arrow), the 5′ ss of the exon upstream of the recursive splice site (Upstream exon, black rectangle), the ratchet point (red triangle), the 3′ ss of the exon downstream of the recursive splice site (downstream exon, black rectangle), and the poly(A) site of the last exon (last exon, red octagon); the average exon of each gene feature is drawn to scale. Genes are sorted from top to bottom by decreasing expression level. For genes containing more than one ratchet point, the first, upstream, downstream, and last exons are represented multiple times. c, Histogram illustrating the intron positions the ratchet points reside in based on RefSeq annotations.
Figure 1. Identification and validation of recursive splice sites in Drosophila
a, Schematic diagram of nascent pre-mRNA transcripts during co-transcriptional splicing and the corresponding read density that would be observed in total RNA-Seq data. Note the sawtooth pattern created by the 5′ to 3′ gradient of RNA-Seq read density from the exon to the downstream ratchet point and splice site. b, Example of total RNA-Seq data for the Ubx gene which is known to contain three recursive splice sites. Also shown are the splice junction reads supporting recursive splicing at each site. c, Example of five recursive splice sites identified in luna. Shown are the recursive junctions identified and the overall RNA-Seq read density from all samples (blue). d, RT-PCR validation of the luna ratchet points (red dots) using primers in the upstream constitutive exon and flanking the putative ratchet points (UP). The RP primers are expected to yield RT-PCR products if the constitutive exon is spliced to the ratchet point. The URP primers, which are upstream of each ratchet point, serve as negative controls.
Figure 2. Identification of recursive lariat introns in Drosophila
a, RNA-Seq reads (red and orange indicate the first and second half of an individual read) that traverse a 5′ splice site-branchpoint junction would align to the linear intron as out-of-order split-reads. b, Example of recursive lariat introns in cpo. Shown are the recursive junctions identified (blue), the lariat junction reads (orange), and the overall RNA-Seq read density from all samples (red). A magnification of each branch point region is also shown along with the conservation among 16 insects. The positions of the branch points are indicated by the vertical arrows. c, Distribution of the distance of the recursive lariat intron branch points from the 3′ splice sites. d, Sequence logo of the recursive lariat intron branch point sequences.
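The out-of-order split-read signature in panel a can be sketched as follows. This is a minimal illustration, assuming plus-strand coordinates, half-open `(start, end)` segments, and a read whose two halves have already been aligned separately. The function name `lariat_branch_point` and the `five_prime_sites` set (first intron base of each annotated 5′ splice site or ratchet point) are hypothetical, not part of the authors' method.

```python
def lariat_branch_point(first_seg, second_seg, five_prime_sites):
    """Return the inferred branch point position, or None.

    first_seg, second_seg: (start, end) half-open genomic intervals for the
    first and second half of one read, plus strand assumed.
    five_prime_sites: set of positions of the first intron base at each
    5' splice site or ratchet point (hypothetical input).

    A read crossing a lariat junction runs up to the branch point and then
    continues from the 5' end of the intron, so on the linear intron its
    first half lies downstream of its second half.
    """
    first_start, first_end = first_seg
    second_start, second_end = second_seg
    out_of_order = first_start >= second_end
    if out_of_order and second_start in five_prime_sites:
        # The last base of the first half is the candidate branch point.
        return first_end - 1
    return None
```

For example, with a 5′ splice site at position 100, a read whose first half aligns to 950-980 and whose second half aligns to 100-120 yields a candidate branch point at 979, while the same two segments in collinear order are rejected.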
Figure 3. Characteristics of Drosophila ratchet points
a, Sequence logos of 5′ splice sites, 3′ splice sites, ratchet points, and non-ratchet point AG/GT sequences located in the same introns as ratchet points (top to bottom). b, Sequence conservation of ratchet points. Average PhastCons scores of ratchet points (green) and non-ratchet points (blue). Solid line indicates the average PhastCons score, shaded regions indicate the 95% confidence interval. c, d, Normalized recursive junction reads (c) and percent non-recursive junctions (d) observed in untreated S2 cells and cells treated with the indicated dsRNAs.
Figure 4. Expression characteristics of recursively spliced Drosophila genes
a, Heatmap representation of Z-scores of mRNA expression levels of the recursively spliced genes among the samples examined. Male (M), female (F), imaginal discs (ID), ovaries (OV), accessory gland (A), testes (T), digestive tract (Dig), salivary gland (Saliv). b, Distribution of the Spearman correlations of mRNA expression levels and recursive indexes of each ratchet point for the cell line (red), developmental (green), and tissue (blue) samples. c, Example of the correlation of mRNA expression levels and recursive indexes for four ratchet points in Antp in the developmental (green), and tissue (blue) samples.

