Nucleic Acids Res. 2015 May 26;43(10):5052-64.
doi: 10.1093/nar/gkv333. Epub 2015 Apr 21.

Sequencing the cap-snatching repertoire of H1N1 influenza provides insight into the mechanism of viral transcription initiation

David Koppstein et al. Nucleic Acids Res. 2015.

Abstract

The influenza polymerase cleaves host RNAs ∼10–13 nucleotides downstream of their 5′ ends and uses this capped fragment to prime viral mRNA synthesis. To better understand this process of cap snatching, we used high-throughput sequencing to determine the 5′ ends of A/WSN/33 (H1N1) influenza mRNAs. The sequences provided clear evidence for nascent-chain realignment during transcription initiation and revealed a strong influence of the viral template on the frequency of realignment. After accounting for the extra nucleotides inserted through realignment, analysis of the capped fragments indicated that the different viral mRNAs were each prepended with a common set of sequences and that the polymerase often cleaved host RNAs after a purine and often primed transcription on a single base pair to either the terminal or penultimate residue of the viral template. We also developed a bioinformatic approach to identify the targeted host transcripts despite limited information content within snatched fragments and found that small nuclear RNAs and small nucleolar RNAs contributed the most abundant capped leaders. These results provide insight into the mechanism of viral transcription initiation and reveal the diversity of the cap-snatched repertoire, showing that noncoding transcripts as well as mRNAs are used to make influenza mRNAs.

Figures

Figure 1.
High-throughput sequencing of the heterogeneous 5′ ends of influenza mRNAs. (A) Schematic of the sequencing method and subsequent mapping. (B) Diagram of the relevant regions of the influenza mRNA and vRNA template. The heterogeneous sequence is in blue, and the remainder of the mRNA is in black. An orange nucleotide denotes the SNP in the 3′ region of vRNA that differs between viral genes. (C) Length distributions of the heterogeneous sequences grouped by influenza mRNA. (D) Number of reads corresponding to the same heterogeneous sequences in two biological replicates for the NS1 gene at 4 h.p.i. r_s, Spearman r coefficient; n_x and n_y, number of reads in each dataset; u_x and u_y, number of unique sequences in each dataset; u_merged, number of unique sequences in the intersection of the compared datasets.
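The replicate statistics listed for panel (D) can be illustrated with a minimal Python sketch. It assumes that each dataset is a mapping from heterogeneous sequence to read count and that the Spearman coefficient is computed only over sequences present in both datasets; the caption does not state this, so treat it as an assumption. The function names and the toy counts are hypothetical and are not taken from the paper.

```python
from collections import Counter


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(a, b):
    """Pearson correlation of two equal-length lists (NaN if either is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return float("nan")
    return cov / (var_a * var_b) ** 0.5


def compare_replicates(x, y):
    """Summarize two sequence -> read-count mappings.

    Assumption: the Spearman coefficient is the Pearson correlation of the
    ranks, computed over the sequences shared by both datasets.
    """
    shared = sorted(set(x) & set(y))
    if len(shared) >= 2:
        r_s = pearson(average_ranks([x[s] for s in shared]),
                      average_ranks([y[s] for s in shared]))
    else:
        r_s = float("nan")
    return {
        "r_s": r_s,
        "n_x": sum(x.values()), "n_y": sum(y.values()),
        "u_x": len(x), "u_y": len(y),
        "u_merged": len(shared),
    }


# Toy read counts, invented purely for illustration.
rep_x = Counter({"AAG": 10, "ACG": 5, "ATA": 1, "GGG": 2})
rep_y = Counter({"AAG": 20, "ACG": 8, "ATA": 3})
stats = compare_replicates(rep_x, rep_y)
```

Ranking before correlating makes the coefficient insensitive to the large dynamic range of read counts, which is presumably why a rank-based statistic is reported for these scatter plots.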
Figure 2.
Nucleotide distributions at the 3′ ends of heterogeneous sequences and trimmed host leaders. (A) Nucleotide frequencies at the last eight positions of heterogeneous sequences before trimming residues attributed to prime-and-realign. At each position, enrichment was normalized to the overall nucleotide frequencies within 51-nt windows centered on all Gencode 17 TSSs. (B) Nucleotide frequencies after trimming residues attributed to prime-and-realign; otherwise, as in (A).
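The normalization described in this caption can be sketched as follows. The sketch assumes that enrichment is a plain ratio of the observed per-position frequency to the background frequency (the caption does not say whether a ratio or a log ratio was plotted) and that each sequence is weighted by its read count. The function names, the DNA alphabet of the reads, and the toy inputs are illustrative assumptions, and the real background would come from 51-nt windows centered on Gencode 17 TSSs rather than from the short strings used here.

```python
from collections import Counter


def background_frequencies(windows):
    """Overall nucleotide frequencies pooled across all background windows."""
    counts = Counter()
    for window in windows:
        counts.update(window)
    total = sum(counts.values())
    return {nt: c / total for nt, c in counts.items()}


def positional_enrichment(weighted_seqs, background, n_last=8):
    """Enrichment at each of the last n_last positions of the sequences.

    weighted_seqs maps sequence -> weight (assumed here to be a read count).
    Keys of the result are offsets from the 3' end (-n_last .. -1); values map
    each background nucleotide to observed frequency / background frequency.
    """
    result = {}
    for offset in range(-n_last, 0):
        counts = Counter()
        for seq, weight in weighted_seqs.items():
            if len(seq) >= -offset:  # skip sequences too short for this offset
                counts[seq[offset]] += weight
        total = sum(counts.values())
        if total:
            result[offset] = {nt: (counts[nt] / total) / background[nt]
                              for nt in background}
        else:
            result[offset] = {}
    return result


# Toy inputs, invented purely for illustration.
bg = background_frequencies(["AACG", "TTCG"])
enrichment = positional_enrichment({"ACG": 3, "TTA": 1}, bg, n_last=2)
```

In this toy case a value above 1 at offset -1 for G means that G occurs at the last position more often than the background composition would predict.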
Figure 3.
The potential contribution of prime-and-realign to the heterogeneous sequences. (A) A class of sequences most frequently prepended to NS1 mRNAs. These sequences begin with one of two related fragments (two shades of blue residues) both matching the 5′ end of U2. The U2-derived leaders are extended variable numbers of residues matching the vRNA (red residues), at frequencies indicated in the histograms at the left (blue bars indicating the fraction without any added nucleotides, red bars indicating the fraction of each species with added nucleotides). The sequences then continue with a contiguous match to the vRNA template (black residues, with the orange nucleotide indicating the SNP at position +2). (B) A class of sequences most frequently prepended to PB2 mRNAs; otherwise, as in (A). (C) Fraction of mRNAs with the indicated number of inserted nucleotides matching the vRNA. Frequencies from HA, NA, NP and NS1, which have a +2 U in the vRNA, are averaged, as are those from MP, PA, PB1 and PB2, which have a +2 C (red and blue, respectively); error bars, SD. (D) A prime-and-realign model that attributes the different numbers of inserted nucleotides to the influence of the +2 SNP in the vRNA.
Figure 4.
Different influenza mRNAs are prepended with similar sets of sequences that include leaders from snRNAs and snoRNAs. (A) Length distributions of host leaders after trimming nucleotides attributed to prime-and-realign. (B) Number of reads corresponding to the same host leader sequences in two biological replicates of NS1, 4 h.p.i., after trimming nucleotides attributed to prime-and-realign. Shapes indicate values for leaders mapping to the annotated 5′ ends of snRNAs and snoRNAs, colored as in panel (D). Otherwise, as in Figure 1D. (C) Number of reads corresponding to the same host leader sequences from NS1 and PB2 mRNAs, 4 h.p.i., after trimming prime-and-realigned nucleotides. Otherwise, as in (B). (D) Abundant host leaders corresponding to the annotated 5′ ends of snRNAs and abundant snoRNAs. The last nucleotide of each abundant trimmed host leader highlighted in (B) and (C) is indicated (colored shape), as is the presumed cleavage site for each of these host leaders (black arrows showing unambiguous sites or two gray arrows showing alternative cleavage sites for the same host leader).
Figure 5.
Inferred nucleotides near the cleavage sites of host transcripts. (A) Host transcript nucleotide composition immediately downstream of the trimmed host leaders. Analysis was limited to trimmed host leaders mapping precisely to Gencode 17 TSSs. The contribution of each downstream sequence was weighted in proportion to the rank of the corresponding host-leader abundance. Otherwise, as in Figure 2A. (B) Dinucleotide content at positions –1 and 0. Dinucleotide content statistics were collected from mapped host leaders from (A), using the same weighting as in (A). Enrichments were normalized to the dinucleotide composition of 51-nt windows centered on all Gencode 17 TSSs.
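The rank weighting mentioned in this caption can be illustrated with a short Python sketch. It rests on one reading of the caption: each mapped host leader contributes its dinucleotide with a weight equal to its abundance rank, where the least abundant leader has rank 1 and ties are broken by input order. The paper may have defined ranks or ties differently. The function name, the record layout, and the toy values are hypothetical.

```python
from collections import defaultdict


def rank_weighted_dinucleotides(records):
    """Rank-weighted dinucleotide fractions.

    records is a list of (dinucleotide, abundance) pairs, one per mapped host
    leader. Assumption: weight = abundance rank, with rank 1 for the least
    abundant leader; sorted() is stable, so ties keep their input order.
    """
    order = sorted(range(len(records)), key=lambda i: records[i][1])
    weights = defaultdict(float)
    for rank, i in enumerate(order, start=1):
        weights[records[i][0]] += rank
    total = sum(weights.values())
    return {dinuc: w / total for dinuc, w in weights.items()}


# Toy records, invented purely for illustration.
fractions = rank_weighted_dinucleotides([("GC", 100), ("AG", 10), ("GC", 1)])
```

Weighting by rank rather than by raw read count keeps a handful of extremely abundant leaders from dominating the composition while still giving abundant leaders more influence than rare ones. The resulting fractions could then be divided by background dinucleotide frequencies to obtain enrichments.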
Figure 6.
Summary model of influenza cap-snatching. The viral RNP binds to the CTD of Pol II (57) and PB2 captures a capped nascent transcript. Cleavage by the PA subunit occurs primarily after either a G (left pathway) or an A (right pathway), with possible preferences for flanking Cs (in brackets). Following cleavage, PB2 swivels to enable the 3′-terminus of the cleaved fragment to interact with the template in the PB1 subunit (62,63), allowing base pairing to the penultimate or last nucleotide of the vRNA template (left and right pathways, respectively), with potential pairing to both nucleotides if a purine (R in brackets) precedes a terminal G in the cleaved fragment (left pathway). Fragments ending in G can also pair to the terminal U of the template (not shown). One or more rounds of prime-and-realign can occur before processive transcription. Although fragments ending in U or C are sometimes used (not shown), they are not used as frequently as those ending in A or G, suggesting either that the inability of these fragments to pair with the relevant template residues might favor their dissociation prior to productive priming (not shown), or that the lack of G or A within the optimal window (∼10–13 nt from the 5′-terminus) might favor dissociation of the nascent transcript prior to cleavage (not shown). Dotted lines indicate base pairs; a gray dotted line indicates a potential base pair to the penultimate purine.
