Nat Commun. 2021 Aug 25;12(1):5120.
doi: 10.1038/s41467-021-25361-5.

A high-resolution temporal atlas of the SARS-CoV-2 translatome and transcriptome

Doyeon Kim et al. Nat Commun.

Abstract

COVID-19 is caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), which infected >200 million people, resulting in >4 million deaths. However, the temporal landscape of the SARS-CoV-2 translatome and its impact on the human genome remain unexplored. Here, we report a high-resolution atlas of the translatome and transcriptome of SARS-CoV-2 for various time points after infecting human cells. Intriguingly, a substantial amount of SARS-CoV-2 translation initiates at a novel translation initiation site (TIS) located in the leader sequence, termed TIS-L. Since TIS-L is included in all the genomic and subgenomic RNAs, the SARS-CoV-2 translatome may be regulated by a sophisticated interplay between TIS-L and downstream TISs. TIS-L functions as a strong translation enhancer for ORF S, and as a translation suppressor for most of the other ORFs. Our global temporal atlas provides compelling insight into the unique regulation of the SARS-CoV-2 translatome and helps comprehensively evaluate its impact on the human genome.

Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1. Experimental design and generation of massive-scale datasets.
a Overall design of the study. A fine-scale temporal atlas of the SARS-CoV-2 translatome and transcriptome was constructed for early (0, 1, 2, and 4) and late (12, 16, 24, and 36) hours post infection (hpi) at multiplicity of infection (MOI) of 10 and at 48 hpi (MOI = 0.1) in Calu-3 and Caco-2 cell lines and at 24 hpi (MOI = 0.1) in the Vero cell line. The temporal atlas of the human translatome and transcriptome in response to SARS-CoV-2 invasion was also constructed. Ribosome-protected mRNA fragment sequencing (RPF-seq), quantitative profiling of initiating ribosomes sequencing (QTI-seq), mRNA sequencing (mRNA-seq), and small RNA sequencing (sRNA-seq) were performed for each time point. For more details, see “Methods”. b High reproducibility between the replicates of sequenced data. Correlation coefficient (Spearman’s ρ) was calculated by comparing host gene expression levels between replicates. Both x- and y-axes represent log10(RPKM + 1). Representative examples from early and late hpi are displayed. For a full version, see Supplementary Fig. 1b, c. c The total number of mapped reads is shown on a log10 scale (top) with the relative fraction of reads mapped to the human and SARS-CoV-2 genomes (bottom). The upward and downward directions of the y-axis indicate the fraction of the reads mapped to the SARS-CoV-2 and human genomes, respectively. d A growth dynamics curve of SARS-CoV-2 in Calu-3 and Vero cell lines after infection. The mean values ±95% confidence intervals are displayed (n = 3 biologically independent experiments). e Distribution of mRNA-seq (left), RPF-seq (middle), and QTI-seq (right) reads with respect to the relative position near the start of ORF. For 4 and 36 hpi, the 13th nucleotide position (12-nt offset from the 5′ end) of the reads mapped to human (green) or SARS-CoV-2 (red) was counted for each sequencing dataset.
The relative fraction to the amount of reads mapped to the entire ORF was calculated for each position, and the y-axis represents the average of the relative fractions for ORFs with >50 reads mapped. Open reading frames are depicted as three different colored bars with the darkest bars indicating in frame and the others out of frames.
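The P-site counting described in panel e can be illustrated with a minimal sketch. It assumes plus-strand reads given as 0-based 5′-end coordinates; the function names and the example coordinates are hypothetical and are not taken from the authors' pipeline.

```python
from collections import Counter

# 12-nt offset from the 5' end, i.e. the 13th nucleotide of the read
P_SITE_OFFSET = 12


def p_site_position(read_start, offset=P_SITE_OFFSET):
    """Return the inferred P-site coordinate of a plus-strand read."""
    return read_start + offset


def frame_fractions(read_starts, orf_start):
    """Return the fraction of reads whose P-site lies in frame 0, 1, or 2.

    Frame 0 means the P-site is in frame with the ORF start codon.
    """
    counts = Counter((p_site_position(s) - orf_start) % 3 for s in read_starts)
    total = sum(counts.values())
    if total == 0:
        return [0.0, 0.0, 0.0]
    return [counts[frame] / total for frame in range(3)]


# Hypothetical reads around an ORF whose start codon begins at position 100
fractions = frame_fractions([88, 91, 94, 89], orf_start=100)
```

In this toy input three of the four P-sites fall in frame with the start codon, which is the kind of periodicity the darkest bars in the panel represent.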
Fig. 2. Temporal landscape of the SARS-CoV-2 translatome and transcriptome.
a Coverage of mRNA-seq, RPF-seq, QTI-seq, and sRNA-seq reads across the SARS-CoV-2 genome in early phase (4 hours post infection (hpi), top) and late phase (36 hpi, bottom) after viral infection (multiplicity of infection = 10). The y-axis represents the number of reads per million mapped reads (RPM) on a log10 scale, and blue and orange bars indicate the number of reads mapped to positive and negative strands of the genome, respectively. b Expression level changes of SARS-CoV-2 ORFs over 0–36 hpi at the transcriptome (mRNA-seq, left) and translatome (RPF-seq and QTI-seq, middle) levels and in translation efficiency (right). Expression levels of each ORF were measured as RPM for each time point (see “Methods”) and displayed as log10(RPM + 1). Translation efficiency was calculated by dividing the translation level by the mRNA level for each ORF. ORF 10* indicates the translation efficiency of ORF 10 as the RPF expression level divided by mRNA expression level of N sgRNA (see “Methods”). c Two examples of potential miRNA candidates detected on the SARS-CoV-2 genome. sRNA-seq reads mapped on the SARS-CoV-2 genome are displayed as blue bars and the corresponding H-scores, which summarize the folding degree of the RNA hairpin structure centered on the nucleotide position (see “Methods”), are depicted as a red line below. Predicted RNA secondary structures of the two miRNA candidates are also illustrated where predicted mature miRNAs are shown in blue and known determinants for miRNA processing are indicated. d Repression of human mRNAs targeted by SARS-CoV-2 miRNA candidates identified in c. Expression fold changes of each mRNA after viral infection were measured by mRNA-seq for Calu-3 cells at 36 hpi. Human mRNAs containing a single 7mer or 8mer target site of the identified candidates in their 3′UTRs were selected (see “Methods”).
Cumulative distribution of log2(mRNA fold change) of the target mRNAs was plotted (red) and compared with that of nontargets (“no site,” black) by two-sided Wilcoxon’s rank-sum test.
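The quantities named in panel b (RPM, log10(RPM + 1), and translation efficiency as translation level divided by mRNA level) can be sketched as follows. This is a minimal illustration under the definitions given in the caption only; the function names and read counts are hypothetical, and any further normalization described in the paper's Methods is not reproduced here.

```python
import math


def rpm(read_count, total_mapped_reads):
    """Reads per million mapped reads."""
    return read_count * 1_000_000 / total_mapped_reads


def log10_rpm(value):
    """Display transform used in the caption: log10(RPM + 1)."""
    return math.log10(value + 1)


def translation_efficiency(translation_rpm, mrna_rpm):
    """Translation level divided by mRNA level; None if there is no mRNA signal."""
    if mrna_rpm == 0:
        return None
    return translation_rpm / mrna_rpm


# Hypothetical counts for one ORF at one time point
rpf_level = rpm(500, 2_000_000)    # RPF-seq reads for the ORF
mrna_level = rpm(250, 2_000_000)   # mRNA-seq reads for the ORF
te = translation_efficiency(rpf_level, mrna_level)
```

Returning None for a zero mRNA level is a design choice of this sketch, made to avoid a division by zero; the caption does not say how such cases were handled.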
Fig. 3. Extensive translation initiation by the translation initiation site located in the leader (TIS-L) for both gRNA and sgRNAs.
a Coverage of RPF-seq reads across the SARS-CoV-2 genome at 48 hours post infection (hpi) (multiplicity of infection = 0.1). Otherwise as in Fig. 2a. b Enrichment of RPF-seq reads at the TIS-L, which is a CUG codon located at the 59th nt position in the leader sequence. The 13th position (12-nt offset from the 5′ end) of the reads, indicating the ribosome P-site position, was counted and calculated as the number of reads per million mapped reads (RPM). Open reading frames are depicted as three different colored bars with the dark blue bars indicating in frame with TIS-L and the others out of frames. c RPF-seq reads mapped to TIS-L categorized by whether their 3′ ends are mapped to the sgRNAs or the gRNA. For 48 hpi, a subset of RPF-seq reads around the TIS-L that were long enough to be uniquely mapped to the SARS-CoV-2 genome was collected (see “Methods”). The alignments of those uniquely mapped reads to the gRNA or sgRNAs are displayed and the corresponding read counts with the relative fraction of reads mapped to each ORF are shown in parentheses. d The relative fraction of TIS-L reads uniquely mapped to each ORF was compared between RPF-seq reads with high (5U, c) and low (0.008U, e) RNase I concentration (right). e An independent dataset of RPF-seq reads of TIS-L with reduced RNase I concentration (0.008U) at 48 hpi in Calu-3 cells, which consists of longer RPF-seq reads, was collected to obtain a larger number of reads uniquely mapped to TIS-L. Otherwise as in c. f For each ORF, the level of translation initiation at TIS-L is compared with that at the annotated translation initiation site using the RPF-seq dataset at 36 hpi. Otherwise as in d. g Using RPF-seq datasets, RPF expression level (left) and translation efficiency (right) of ORF S initiated from the annotated ORF start codon (red dashed line) alone were measured and compared with those estimated by the number of RPF-seq reads from both the annotated ORF start codon and TIS-L (see “Methods”) (red, solid line).
Translation levels or translation efficiencies of other ORFs are depicted with gray lines. Otherwise as in Fig. 2b. h Enrichment of RPF-seq reads at the TIS-L for Calu-3 and Caco-2 cell lines at 48 hpi (left). The relative fractions of TIS-L reads mapped to each of gRNA and sgRNAs are compared between Calu-3 and Caco-2 (right). Otherwise as in b and d.
Fig. 4. Translation initiation site located in the leader (TIS-L) functions as a global regulator of the SARS-CoV-2 translatome.
a Overview of the position of TIS-L in relation to ORF start codons of SARS-CoV-2 gRNA and sgRNAs. For each of gRNA and sgRNAs, positions of TIS-L (orange), transcription regulatory sequence (TRS, yellow), and start codon of ORFs (green) are displayed. The reading frame of each TIS-L-initiating hypothetical ORF is shown as a green (in frame) or red (out of frame) dashed line compared to the reading frame of the annotated ORF. RPF-seq reads mapped on TIS-L compared to those of the annotated ORFs 1a (b), S (c), 6 (d), and 7a (e), measured at 16, 24, and 36 hpi with their nucleotide sequence, annotation of each ORF, TRS, TIS-L, and a predicted ORF initiated from TIS-L shown below. For reading frames, the dark blue and the other blue bars indicate RPF-seq reads that are in frame and out of frame with the annotated ORF, respectively. The x-axis represents SARS-CoV-2 genomic position, and the y-axis represents log10(RPM + 1). The dashed lines indicate junction positions of each sgRNA. The ORFs that start from CUG or TIS-L are depicted with their calculated protein sizes. Whether the CUG-derived hypothetical ORFs are in frame or out of frame with respect to the annotated downstream ORFs is also shown for each viral ORF. RPF-seq and QTI-seq reads mapped on the other ORFs are shown in Supplementary Figs. 7 and 8. f Experimental validation of TIS-L functions. A schematic diagram of RLuc reporters is displayed with the mutated sequences shown in red, designed to disrupt TIS-L-initiated uORF. Calu-3 cells transiently transfected with a RLuc reporter and a plasmid expressing FLuc mRNA were either infected with SARS-CoV-2 or left uninfected, and the relative RLuc activities were measured. RLuc and FLuc activities were normalized to RLuc and FLuc mRNAs, respectively (two-tailed, equal-sample variance Student’s t tests, *P < 0.05, **P < 0.01, ***P < 0.001). The mean values ± s.d. are displayed (n = 3 biologically independent experiments). P values are provided in Source Data.
Fig. 5. Early and late responding human genes to SARS-CoV-2 infection.
a Differentially expressed genes (DEGs) identified from RPF-seq data are shown for each time point. From 0 to 36 hpi, RPF-seq levels of human genes were compared with those of the uninfected condition. Log2(expression fold change) (x-axis) and statistical significance (y-axis; −log10-scale FDR-corrected q value) of each gene are displayed (see “Methods”). For a subset of DEGs with q < 0.01, highly upregulated and downregulated DEGs with |log2(expression fold change)| ≥ 2.0 are indicated in red and blue, respectively, and moderately upregulated and downregulated DEGs with |log2(expression fold change)| < 2.0 are indicated in pink and light blue, respectively. b Hierarchical clustering of the DEGs displayed in a. The genes identified as DEGs at least for one time point were clustered with respect to their temporal expression patterns (see “Methods”). Upregulation and downregulation in comparison to the uninfected condition are indicated in red and blue, respectively. c Temporal expression changes of the identified DEGs color-coded for the five clusters determined in b (left). Overall expression changes for each cluster were also depicted by mean log2(expression fold changes) (right). The light-colored shades for each line indicate the standard deviations, and the gray lines represent the expression changes of individual DEGs (left). For the DEGs included in each cluster determined in b, Gene Ontology (GO) enrichment analysis was performed and GO terms associated with early (d, e) and late (f, g) responding clusters in response to the viral infection are shown. For each cluster, the top five GO terms chosen based on statistical significance and their temporal expression patterns are displayed (see “Methods”).
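The color categories in panel a follow two thresholds stated in the caption (q < 0.01 and |log2(expression fold change)| of 2.0). The sketch below applies only those two thresholds; the function name, the label strings, and the handling of a fold change of exactly zero are assumptions made for illustration, and how DEGs were called in the first place is described in the paper's Methods, not here.

```python
def classify_deg(log2_fc, q_value, q_cutoff=0.01, fc_cutoff=2.0):
    """Assign the volcano-plot category described in the caption.

    Genes with q >= q_cutoff, or with no change at all, are treated as
    not highlighted (an assumption of this sketch).
    """
    if q_value >= q_cutoff or log2_fc == 0:
        return "not highlighted"
    direction = "upregulated" if log2_fc > 0 else "downregulated"
    strength = "highly" if abs(log2_fc) >= fc_cutoff else "moderately"
    return f"{strength} {direction}"


# Hypothetical genes: (log2 fold change, FDR-corrected q value)
examples = [(3.1, 0.001), (-2.0, 0.005), (0.8, 0.002), (-1.2, 0.2)]
labels = [classify_deg(fc, q) for fc, q in examples]
```

Under the caption's color scheme, "highly upregulated" corresponds to red, "highly downregulated" to blue, and the two moderate categories to pink and light blue.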
Fig. 6. Associated functions and pathways of human genes responding to SARS-CoV-2 infection.
a Temporal expression changes of the human genes whose protein products were detected to interact with SARS-CoV-2 proteins (brown) with ACE2 and TMPRSS2 highlighted in blue. For mRNA-seq (left), RPF-seq (middle), and QTI-seq (right) data, log2(expression fold changes) for the genes at each time point were measured. Otherwise as in Fig. 5c. b Temporal expression changes of the host factors required for SARS-CoV-2 infection (brown), identified by CRISPR screening. Host factors whose targeting drugs were found in the Drug Gene Interaction database (DGIdb) are highlighted in magenta. The gray lines represent the expression changes of individual differentially expressed genes (DEGs) identified in Fig. 5. For mRNA-seq (left), RPF-seq (middle), and QTI-seq (right) data, log2(expression fold changes) for the genes at each time point were measured. c Association between the magnitude of differential expression and the impact on SARS-CoV-2 infectivity for the host factors investigated in b. The x-axis represents the maximum of the absolute values of RPF-seq log2(expression fold changes) across the time points. The y-axis represents the ranks of the CRISPR screening enrichment for the host factors. Spearman’s correlation ρ was measured between the two values, and the P value was obtained by using two-sided Student’s t test. Otherwise as in b. d Temporal expression changes of the type I and III interferon (IFN-α, β, ε, κ, λ, and ω) (green) and previously reported DEGs involved in type I interferon response (brown). Otherwise as in a. e Temporal expression changes of the previously reported DEGs involved in cytokine and chemokine activities (brown). Otherwise as in a. f Gene Ontology (GO) terms associated with DEGs identified from mRNA-seq (left) and RPF-seq (right) data are shown from 0 to 36 hpi. At each time point, statistical significance of the GO terms is visualized as a heat map, color-coded by −log10(FDR-corrected q values). 
g DAVID KEGG pathways associated with DEGs identified from mRNA-seq (left) and RPF-seq (right) data are shown from 0 to 36 hpi. Association significance of each KEGG pathway with mRNA-seq data (left) and RPF-seq data (right) is shown. Otherwise as in f. h Expression levels of human miRNAs upon SARS-CoV-2 infection at each time point with upregulated (left) and downregulated (right) miRNAs highlighted. Expression levels of each miRNA were measured as RPM for each time point (see “Methods”) and displayed as log10(RPM + 1). i Temporal expression changes of the genes reported to enhance (red) or suppress (blue) the translation initiation at non-AUG codons. Otherwise as in a. j Multiple sequence alignments for the 5′ leader region of betacoronaviruses. Two representative viruses from each lineage are displayed. See Supplementary Fig. 11i for multiple sequence alignments for a full list of betacoronaviruses.
Fig. 7. A hypothetical therapeutic strategy for SARS-CoV-2 treatment by blocking TIS-L with antisense oligonucleotide (ASO).
By blocking the TIS-L with ASO, the viral infectivity of SARS-CoV-2 could be reduced by the disruption of the translation of the viral ORFs.