PLoS Pathog. 2016 Feb 26;12(2):e1005473.
doi: 10.1371/journal.ppat.1005473. eCollection 2016 Feb.

High-Resolution Analysis of Coronavirus Gene Expression by RNA Sequencing and Ribosome Profiling

Nerea Irigoyen et al. PLoS Pathog. 2016.

Abstract

Members of the family Coronaviridae have the largest genomes of all RNA viruses, typically in the region of 30 kilobases. Several coronaviruses, such as Severe acute respiratory syndrome-related coronavirus (SARS-CoV) and Middle East respiratory syndrome-related coronavirus (MERS-CoV), are of medical importance, with high mortality rates and, in the case of SARS-CoV, significant pandemic potential. Other coronaviruses, such as Porcine epidemic diarrhea virus and Avian coronavirus, are important livestock pathogens. Ribosome profiling is a technique which exploits the capacity of the translating ribosome to protect around 30 nucleotides of mRNA from ribonuclease digestion. Ribosome-protected mRNA fragments are purified, subjected to deep sequencing and mapped back to the transcriptome to give a global "snap-shot" of translation. Parallel RNA sequencing allows normalization by transcript abundance. Here we apply ribosome profiling to cells infected with Murine coronavirus, mouse hepatitis virus, strain A59 (MHV-A59), a model coronavirus in the same genus as SARS-CoV and MERS-CoV. The data obtained allowed us to study the kinetics of virus transcription and translation with exquisite precision. We studied the timecourse of positive and negative-sense genomic and subgenomic viral RNA production and the relative translation efficiencies of the different virus ORFs. Virus mRNAs were not found to be translated more efficiently than host mRNAs; rather, virus translation dominates host translation at later time points due to high levels of virus transcripts. Triplet phasing of the profiling data allowed precise determination of translated reading frames and revealed several translated short open reading frames upstream of, or embedded within, known virus protein-coding regions. Ribosome pause sites were identified in the virus replicase polyprotein pp1a ORF and investigated experimentally. 
Contrary to expectations, ribosomes were not found to pause at the ribosomal frameshift site. To our knowledge this is the first application of ribosome profiling to an RNA virus.

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Fig 1. MHV RNA synthesis and translation.
(A) Transcript map of the 31335-nt MHV-A59 genome. Polyproteins pp1a and pp1b are translated from the genomic RNA, with pp1b being expressed as a transframe fusion with pp1a (i.e. pp1ab) via −1 programmed ribosomal frameshifting (−1 PRF). The 3′ ORFs are expressed from a series of subgenomic RNAs produced during infection. Each subgenomic RNA contains a 5′ leader sequence that is identical to the 5′ leader of the genome, appended via polymerase “jumping” between body transcription regulatory sequences (TRSs) (green diamonds) and the leader TRS (orange diamond) during negative-strand synthesis. Due to mutations present in this laboratory-adapted strain, the hemagglutinin-esterase and ORF4 gene fragments (HE and 4; grey boxes) are not expected to be translated. (B) RiboSeq CHX (red) and RNASeq (green) densities at 5 h p.i. (repeat 1) in reads per million mapped reads (RPM). Read densities are plotted on a log(1+x) scale to cover the wide range in expression across the genome. Histograms show the positions of the 5′ ends of reads with a +12 nt offset to map (for RPFs) approximate P-site positions. Negative-sense reads are shown in dark blue below the horizontal axis. (C) The positive-sense RiboSeq/RNASeq ratio, after first applying a 15-nt running mean (RM) filter to each individual distribution.
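The read-density processing described for panel (B) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function names, the toy read coordinates and the library size are hypothetical, and the natural logarithm is assumed because the caption does not state a base. Only the +12 nt offset, the RPM normalization and the log(1+x) scale come from the caption.

```python
import math
from collections import Counter

P_SITE_OFFSET = 12  # nt added to each read's 5' end, as stated in the caption


def psite_density_rpm(five_prime_ends, total_mapped_reads):
    """Return {position: RPM} after shifting read 5' ends by the P-site offset."""
    counts = Counter(pos + P_SITE_OFFSET for pos in five_prime_ends)
    scale = 1_000_000 / total_mapped_reads  # reads per million mapped reads
    return {pos: n * scale for pos, n in counts.items()}


def log1p_scale(density):
    """Apply log(1+x) so that zero counts stay at zero (natural log assumed)."""
    return {pos: math.log1p(value) for pos, value in density.items()}


# Hypothetical toy data: 5' ends of four reads from a library of 2 million reads.
toy_ends = [100, 100, 103, 250]
rpm = psite_density_rpm(toy_ends, total_mapped_reads=2_000_000)
log_rpm = log1p_scale(rpm)
```

In this toy case the two reads starting at coordinate 100 are assigned to position 112 and contribute 1.0 RPM.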
Fig 2. RNA synthesis and translation in the 5′ UTR and 3′ ORF regions.
RiboSeq HAR (dark red), RiboSeq CHX (red) and RNASeq (green) densities at 5 h p.i. (repeat 1) in reads per million mapped reads (RPM), smoothed with a 15-nt running mean filter and plotted on a linear scale. Histograms show the positions of the 5′ ends of reads with a +12 nt offset to map (for RPFs) approximate P-site positions. Negative-sense reads are shown in dark blue below the horizontal axis.
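The 15-nt running mean filter used for smoothing can be sketched as below. The caption does not say how the ends of the genome are handled, so the centered window that is truncated at the sequence ends is an assumption, as are the function name and the toy profile.

```python
def running_mean(values, window=15):
    """Centered running mean; the window is truncated at the sequence ends."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


# Hypothetical profile: a single-nucleotide peak of height 15 flanked by zeros.
toy = [0.0] * 7 + [15.0] + [0.0] * 7
smooth = running_mean(toy)
```

At the peak position the smoothed value is 1.0, which illustrates why a 15-nt filter lowers the height of an isolated single-nucleotide peak about 15-fold.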
Fig 3. Time course of MHV total RNA synthesis and translation.
(A) Time course of total virus RNA accumulation (left) and total virus translation (right). To normalize for differing library sizes, read counts are expressed relative to the total number of mapped virus RNA (positive and negative-sense) and mapped host mRNA reads for the library. Grey symbols with downward pointing arrows correspond to contaminated samples (see text) and represent upper bounds on the virus fraction. (B) Similar data represented on a linear scale; hatched bars—repeat 1, solid bars—repeat 2. (C) 17Cl-1 cells were infected with MHV-A59 (MOI 10) and harvested at 1, 2.5, 5 and 8 h p.i. Cell lysates were separated by 10% (for N and S westerns) or 17% (for nsp9 western) SDS-PAGE and immunoblotted using monoclonal anti-N, anti-S and anti-nsp9 sera. Molecular masses (in kDa) are indicated on the left. GAPDH was used as a loading control. All viral proteins were detected with a green fluorescent secondary antibody, and GAPDH with a red fluorescent secondary antibody.
Fig 4. Time course of RNA synthesis and translation for different MHV genes.
(A) Upper left: Time course of mean positive-sense raw RNASeq densities in each of six genome regions defined by the region between the TRS for a given mRNA and the next downstream TRS. Upper right: Estimated mean positive-sense RNASeq densities for each of mRNAs 1, 2, 3, 5, 6 and 7. Raw RNASeq densities represent the cumulative sum of densities for all mRNAs that cover a given genome region. Subtraction of the density for the immediately upstream inter-TRS region gives an estimate of the RNASeq density for a specific mRNA, herein referred to as the “decumulated” density. RNASeq densities for mRNA4 are omitted as it is not expressed at a sufficiently high level relative to mRNA3 to apply the “decumulation” procedure. Lower right: Estimated mean negative-sense RNASeq densities for each of the negative-sense subgenomic RNAs 2, 3, 5, 6, 7 and (anti)-gRNA. Lower left: Mean RiboSeq densities for each of ORFs 1a, 1b, 2, S, 5, E, M and N. The density for N includes any RPFs deriving from the overlapping I ORF. RiboSeq densities for the defective ORFs HE and 4 are omitted. Circles and solid lines correspond to repeat 2; crosses and dotted lines correspond to repeat 1. Due to low levels of reads and contamination (see text), values for 1 h p.i. and 2.5 h p.i. RiboSeq, and 1 h p.i. repeat 1 RNASeq should be considered as upper bounds and the 1 h p.i. repeat 1 RiboSeq values have been omitted. Densities are expressed in reads per kb per million mapped reads (RPKM). (B) Estimated translational efficiencies of different virus ORFs based on the quotient of the RiboSeq density for an ORF and the estimated positive-sense RNASeq density for the corresponding mRNA. Efficiencies are relative to mean host plus virus efficiencies and the calculation does not account for the presence of non-translated gRNA.
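The "decumulation" step and the translational efficiency quotient can be illustrated with a short sketch. The function names and all numbers are hypothetical. The sketch assumes the inter-TRS regions are listed in 5′ to 3′ order and leaves out the normalization to mean host plus virus efficiency mentioned in the caption.

```python
def decumulate(raw_densities):
    """Subtract the density of the immediately upstream inter-TRS region.

    raw_densities is ordered 5' to 3'. The first region is covered only by
    the genomic RNA, so its density is returned unchanged.
    """
    decumulated = []
    upstream = 0.0
    for density in raw_densities:
        decumulated.append(density - upstream)
        upstream = density
    return decumulated


def translation_efficiency(riboseq_density, mrna_density):
    """Quotient of RiboSeq density and decumulated RNASeq density."""
    return riboseq_density / mrna_density


# Hypothetical raw RNASeq densities (RPKM) for four consecutive regions.
raw = [10.0, 25.0, 40.0, 100.0]
per_mrna = decumulate(raw)
# Hypothetical RiboSeq density of 30.0 RPKM for the ORF on the second mRNA.
te_second = translation_efficiency(30.0, per_mrna[1])
```

Because each subgenomic mRNA also covers every region 3′ of its own TRS, the raw density rises towards the 3′ end, and the difference between adjacent regions isolates the contribution of one mRNA.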
Fig 5. Comparison of estimators of relative mRNA abundance.
Relative abundances of the different mRNA species (positive-sense) at 5 h p.i. were estimated either from mean RNASeq density (decumulated as described in the caption to Fig 4) or from the abundance of leader/body “chimeric” RNASeq reads spanning the corresponding TRS junction site. RNASeq densities are expressed in reads per kb per million mapped reads (RPKM). Chimeric TRS read counts are expressed in reads per million mapped reads (RPM).
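The two units in this caption differ only in whether the count is divided by the region length. A minimal sketch, with hypothetical function names and toy numbers:

```python
def rpm(read_count, total_mapped_reads):
    """Reads per million mapped reads (used here for chimeric TRS reads)."""
    return read_count / (total_mapped_reads / 1_000_000)


def rpkm(read_count, region_length_nt, total_mapped_reads):
    """Reads per kb of region per million mapped reads (used for densities)."""
    return rpm(read_count, total_mapped_reads) / (region_length_nt / 1_000)


# Hypothetical library of 5 million mapped reads.
junction_rpm = rpm(read_count=30, total_mapped_reads=5_000_000)
region_rpkm = rpkm(read_count=500, region_length_nt=2_000,
                   total_mapped_reads=5_000_000)
```

A junction-spanning read count has no natural length to divide by, which is presumably why the chimeric reads are reported in RPM while the region densities are reported in RPKM.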
Fig 6. Comparison of host and virus translation efficiencies.
The translation efficiencies of virus mRNAs were calculated as described in the caption to Fig 4. Host mRNA translation efficiencies are based on the ratio (after normalization for library size) of all RiboSeq or RNASeq reads mapping to any annotated coding region of any splice form of a given gene (see Methods). Host data are shown only for genes with >100 mapped RNASeq coding-region reads (prior to normalization for library size). Horizontal dashed lines indicate the mean values for host cell genes.
Fig 7. MHV RNA synthesis and translation at an early time point.
(A) RiboSeq HAR (dark red), RiboSeq CHX (red) and RNASeq (green) density at 1 h p.i. in reads per million mapped reads (RPM), smoothed with a 15-nt running mean filter and plotted on a linear scale. To obtain sufficient reads at an early time point, cells were infected at an MOI of 200. Histograms show the positions of the 5′ ends of reads with a +12 nt offset to map (for RPFs) approximate P-site positions. (B) Comparison of the read length distributions of 5′ (ORF1a; orange) and 3′ (N ORF; red) virus reads with host mRNA reads (green) from the same samples. In the RNASeq graph (right), read length distributions are also shown for virus RNA from the 5 h p.i. and 8 h p.i. repeat 2 samples (grey).
Fig 8. Frameshifting efficiency.
(A) Schematic of the MHV frameshifting signal comprising a slippery heptanucleotide, U_UUA_AAC, and downstream pseudoknot stimulatory structure. (B) Frameshifting efficiencies estimated from the ratio of RiboSeq density in ORF1b to that in ORF1a (red). For comparison, the same calculation was done for RNASeq (green). ORF1a and ORF1b are both present only on the genomic RNA so the ratio of RNASeq densities in the two ORFs is expected to approximate unity. (C) Frameshifting efficiencies for MHV, IBV and HIV-1 frameshift cassettes determined using dual luciferase assays in 17 Cl-1 and BHK-21 cells. Cells were transfected with pDLuc-MHV, pDLuc-IBV or pDLuc-HXB2, and 24 h later, lysates were prepared and assayed for Renilla and firefly luciferase activity.
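The density-ratio estimate in panel (B) amounts to the following calculation. The function names, read counts and ORF lengths are invented for illustration and are not values from the paper.

```python
def mean_density(read_count, orf_length_nt):
    """Mean number of reads per nucleotide of an ORF."""
    return read_count / orf_length_nt


def density_ratio(orf1b_density, orf1a_density):
    """Ratio of ORF1b to ORF1a density."""
    return orf1b_density / orf1a_density


# Hypothetical RiboSeq counts: only frameshifted ribosomes reach ORF1b.
ribo_1a = mean_density(read_count=9_000, orf_length_nt=3_000)
ribo_1b = mean_density(read_count=3_000, orf_length_nt=2_000)
frameshift_estimate = density_ratio(ribo_1b, ribo_1a)

# Hypothetical RNASeq counts: both ORFs sit on the same genomic RNA.
rna_1a = mean_density(read_count=6_000, orf_length_nt=3_000)
rna_1b = mean_density(read_count=4_000, orf_length_nt=2_000)
rna_control = density_ratio(rna_1b, rna_1a)
```

With these toy numbers the RiboSeq ratio is 0.5 and the RNASeq control is 1.0, matching the expectation stated in the caption that the RNASeq ratio approximates unity.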
Fig 9. Ribosome pause sites in ORF1a.
(A) Histograms of log fold-change from the mean in ORF1a (5 h p.i., repeat 1) showing that RiboSeq densities are more variable than RNASeq densities. RNASeq and RiboSeq counts in ORF1a were first smoothed with a 3-nt running mean filter to average out the intra-codon variability (i.e. triplet periodicity) present in RiboSeq data. (B) Blue triangles indicate selected sites of RPF accumulation in ORF1a, indicative of ribosomal pausing (see text). Histograms show the positions of the 5′ ends of reads with a +12 nt offset to map the approximate P-site. RPF distributions were smoothed with a 15-nt running-mean filter (which, incidentally, reduces the peak height ~15-fold, cf. part C). (C) Enlarged view of the two pause sites without smoothing. The 3′ pause corresponds to reads with 5′ ends mapping to genomic coordinate 11366 while the positions of the 5′ ends of reads at the 5′ pause site differ by 5 nt between the two repeats (genomic coordinate 4704 and 4699, respectively). Reads whose 5′ ends map to the first, second or third positions of codons are indicated in purple, blue or orange, respectively.
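The colour coding by codon position in panel (C) corresponds to a simple phase calculation, sketched below. The function names, the ORF start coordinate and the read positions are hypothetical. Because the +12 nt offset is a multiple of three, applying it does not change the phase of a read.

```python
from collections import Counter


def codon_phase(position, orf_start):
    """Return 0, 1 or 2 for the first, second or third codon position."""
    return (position - orf_start) % 3


def phase_fractions(five_prime_ends, orf_start):
    """Fraction of read 5' ends falling in each of the three codon positions."""
    counts = Counter(codon_phase(pos, orf_start) for pos in five_prime_ends)
    total = sum(counts.values())
    return [counts.get(phase, 0) / total for phase in range(3)]


# Hypothetical ORF starting at coordinate 100 and six read 5' ends.
toy_ends = [100, 103, 106, 101, 110, 112]
fractions = phase_fractions(toy_ends, orf_start=100)
```

A strong excess of one phase, as in this toy example where four of six reads fall on the first codon position, is the triplet periodicity that allows the translated reading frame to be identified.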
Fig 10. Determination of MHV-nsp3 pausing site.
(A) Nsp3 is the largest protein encoded in the MHV replicase gene and contains two ubiquitin-related domains (green), a hypervariable acidic domain (red), two papain-like cysteine proteinase domains (PLP1 and PLP2; blue), a poly (ADP-ribose) binding activity (ADRP) domain (orange), the recently described “domain preceding Ubl2 and PLP2” (DPUP; fuchsia) [76], the nucleic-acid binding domain (NAB; salmon), the betacoronavirus marker (G2M; lavender), a transmembrane domain (TM; orchid) and domain Y (plum). The site of ribosomal pausing, DVKFVTNAC (P-site at pause underlined), is indicated. (B) Time course of translation of pcDNA.3 mRNA, containing sequence coding for nsp3* (first 1,125 residues excluding NAB, G2M, TM and Y domains), in RRL. Translation was allowed to proceed at 26°C in the presence of [35S]methionine for 3 min prior to addition of edeine to a final concentration of 5 μM. Samples were withdrawn at the indicated times after edeine addition, and translation products were separated on a 10% SDS-PAGE gel and detected by autoradiography. MW indicates 14C-labelled molecular weight standards and H2O a negative control. The predicted position of the pause product was determined from the “pause control” lane (see text). (C) Time course of translation of pPS0/nsp3-derived mRNA in RRL as above. pPS0 contains, under the control of the SP6 promoter, a copy of the influenza virus PB1 gene into which has been inserted cDNA encoding the nsp3 pause region (red) plus 30 upstream residues. (D) Ribosomal pausing assays of pPS0-nsp3 mutant mRNAs in RRL (20 min at 26°C). In each case, positively charged or aromatic amino acids were changed to alanine. In mutant 1, Lys-Phe at the pausing site was changed to Ala-Ala, and subsequent mutants were prepared sequentially from this clone; thus mutant 5 contains six substitutions (see text). (E) Ribosomal pausing assay of pPS0/nsp3 Mut3 mRNA in RRL as described above. In all panels, the pause product is indicated by a red asterisk.
Fig 11. RiboSeq and RNASeq densities in the leader region and 5′ end of the genomic and N mRNAs.
(A) A map of the 5′ end of the genomic RNA is shown at the top indicating a 1-codon non-AUG potential uORF (turquoise; sequence UUG UAG) in the leader, the leader TRS (orange), an 8-codon AUG-initiated uORF (lilac) present only in the genomic RNA, and the 5′ end of ORF1a (light blue). RiboSeq (CHX and HAR) and RNASeq (RNA) counts are shown for repeat 1 at 5 h p.i. Histograms show the positions of the 5′ ends of reads with a +12 nt offset to map the approximate P-site (the +12 nt offset means that genome coordinates 1 to 12 register zero counts). Reads whose 5′ ends map to the first, second or third positions of codons relative to the reading frames of the two uORFs and ORF1a (which are all in phase) are indicated in purple, blue or orange, respectively. (B) Comparison of the read length distributions of 5′-end-of-leader RPFs (5′ end of RPF maps at or 5′ of genome coordinate 32) (orange), all virus RPFs (red), and host mRNA RPFs (green) from the same samples. (C) Mapping of reads specifically to mRNA7. After subtracting rRNA (see Methods), reads were mapped to mRNA7 instead of the MHV genome. The 5′ region of mRNA7 is shown.
Fig 12. Translation of the genomic RNA uORF.
(A) RiboSeq (CHX and HAR) and RNASeq (RNA) counts are shown for repeat 1 at 5 h p.i.; RiboSeq HAR counts are also shown for the high MOI infection. Histograms show the positions of the 5′ ends of reads with a +12 nt offset to map the approximate P-site. Reads whose 5′ ends map to the first, second or third positions of codons relative to the reading frames of the uORFs and ORF1a (which are in phase) are indicated in purple, blue or orange, respectively. Note that the illustrated region does not extend to the genomic 5′ terminus. (B) Comparison of RiboSeq CHX densities summed over all host NCBI RefSeq mRNAs and summed over mRNAs whose annotated coding sequences begin with AUG-CCN (Met-Pro). Histograms show the positions of the 5′ ends of reads, e.g. RPFs of ribosomes paused during initiation (AUG in the P-site at position 0 to 2) have 5′ ends that map predominantly to −12 or −13.
Fig 13. Analysis of translation upstream of other annotated ORFs.
RiboSeq (CHX and HAR) and RNASeq (RNA) counts are shown for repeat 1 at 5 h p.i. Histograms show the positions of the 5′ ends of reads with a +12 nt offset to map the approximate P-site. Reads whose 5′ ends map to the first, second or third positions of codons relative to the reading frame of the main annotated ORF (i.e. HE, 4b or 5, respectively) are indicated in purple, blue or orange, respectively. (A) 5′ of the HE ORF. A defective TRS for a very low abundance HE mRNA is annotated with an open green box. In MHV-A59, the HE ORF is disrupted with a premature termination codon (red diamond). Out-of-frame AUG codons that would inhibit ribosomal access via leaky scanning to the next HE-frame AUG codon downstream of the premature termination codon are indicated in green. (B) 5′ of ORF4. In MHV-A59, ORF4 is split by a frameshift mutation into ORF 4b (grey) and a very short ORF4a (pale yellow). An upstream AUU-initiated short ORF and a short out-of-frame AUG-initiated ORF are shown in orange. (C) 5′ of ORF5. A CUG codon in the same frame as the upstream ORF4, and a short out-of-frame AUG-initiated ORF are indicated.
Fig 14. Translation of the N and I proteins.
(A) RiboSeq (CHX and HAR) and RNASeq (RNA) counts are shown for repeat 1 at 5 h p.i. Histograms show the positions of the 5′ ends of reads with a +12 nt offset to map the approximate P-site. Reads whose 5′ ends map to the first, second or third positions of codons relative to the reading frame of the N ORF are indicated in purple, blue or orange, respectively. The I ORF is in the +1 reading frame relative to the N ORF. (B) I is expressed in infected cells. 17 Cl-1 cells were infected with MHV-A59 and harvested at 1, 2.5, 5 and 8 h p.i. Cell lysates were separated on a 12% SDS-PAGE gel and immunoblotted using monoclonal anti-N and polyclonal anti-I sera. Protein molecular weight markers (MW, kDa) are indicated on the left. N and I were detected with green and red fluorescent secondary antibodies, respectively. (C) Time course of translation of pcDNA.3 N-ORF-derived mRNA in RRL. Translation was at 26°C and samples were collected at the indicated times prior to separation on a 10% SDS-PAGE gel. Labelled polypeptides were detected by autoradiography. Products migrating at the expected sizes for N (50 kDa) and I (23 kDa) are indicated. (D) The pcDNA.3 N-ORF-derived mRNA was translated in RRL and immunoprecipitated with specific anti-N, anti-I or anti-S sera. In the H2O control, water replaces mRNA template. Immunoprecipitated products were separated on a 10% SDS-PAGE gel and detected as above. (E) Top: Phasing of RPFs (CHX, 5 h p.i.) mapping to the region of the N ORF that is overlapped by the I ORF and the region of the N ORF downstream of the I ORF. Bottom: Phasing as a function of position within the N ORF smoothed with a 55-codon running-mean filter. The bar indicates a length of 55 codons.

References

    1. Han HJ, Wen HL, Zhou CM, Chen FF, Luo LM, Liu JW, Yu XJ (2015) Bats as reservoirs of severe emerging infectious diseases. Virus Res 205: 1–6. doi: 10.1016/j.virusres.2015.05.006
    2. Chan JF, Lau SK, To KK, Cheng VC, Woo PC, Yuen KY (2015) Middle East respiratory syndrome coronavirus: another zoonotic betacoronavirus causing SARS-like disease. Clin Microbiol Rev 28: 465–522. doi: 10.1128/CMR.00102-14
    3. Corman VM, Baldwin HJ, Fumie Tateno A, Melim Zerbinati R, Annan A, Owusu M, Nkrumah EE, Maganga GD, Oppong S, Adu-Sarkodie Y, Vallo P, da Silva Filho LV, Leroy EM, Thiel V, van der Hoek L, Poon LL, Tschapka M, Drosten C, Drexler JF (2015) Evidence for an ancestral association of human coronavirus 229E with bats. J Virol 89: 11858–11870. doi: 10.1128/JVI.01755-15
    4. Brierley I, Boursnell ME, Binns MM, Bilimoria B, Blok VC, Brown TD, Inglis SC (1987) An efficient ribosomal frame-shifting signal in the polymerase-encoding region of the coronavirus IBV. EMBO J 6: 3779–3785.
    5. Bredenbeek PJ, Pachuk CJ, Noten AF, Charité J, Luytjes W, Weiss SR, Spaan WJ (1990) The primary structure and expression of the second open reading frame of the polymerase gene of the coronavirus MHV-A59; a highly conserved polymerase is expressed by an efficient ribosomal frameshifting mechanism. Nucleic Acids Res 18: 1825–1832.
