Nat Commun. 2021 Jun 24;12(1):3917.
doi: 10.1038/s41467-021-22785-x.

The architecture of the SARS-CoV-2 RNA genome inside virion

Changchang Cao et al. Nat Commun. 2021.

Abstract

SARS-CoV-2 carries the largest single-stranded RNA genome and is the causal pathogen of the ongoing COVID-19 pandemic. How the SARS-CoV-2 RNA genome is folded in the virion remains unknown. To fill this knowledge gap and facilitate structure-based drug development, we develop a virion RNA in situ conformation sequencing technology, named vRIC-seq, for unbiased probing of viral RNA genome structure. Using vRIC-seq data, we reconstruct the tertiary structure of the SARS-CoV-2 genome and reveal a surprisingly "unentangled globule" conformation. We uncover many long-range duplexes and higher-order junctions, both of which are under purifying selection and contribute to the sequential packaging of the SARS-CoV-2 genome. Unexpectedly, the D614G mutation and the two accompanying mutations may remodel duplexes into more stable forms. Lastly, structure-guided design yields potent small interfering RNAs that can obliterate SARS-CoV-2 in Vero cells. Overall, our work provides a framework for studying the genome structure, function, and dynamics of emerging deadly RNA viruses.

Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1. Overview and evaluation of vRIC-seq technology.
a Scheme of vRIC-seq technology. Concanavalin A (ConA) beads were used to capture the virion for diverse enzyme treatments in subsequent steps. b The proportions of chimeric reads mapped to SARS-CoV-2. c Scatter plots showing the correlation between two biological replicates for the number of chimeric reads (interaction strength). R, Pearson correlation coefficient. d Circos plot showing the distribution of chimeric reads along the SARS-CoV-2 genome. The inner red circle shows the fraction of adenine or uracil within 100 nt windows, and the outer blue circle shows the coverage of chimeric reads. FSE, frame-shifting element. e, f vRIC-seq confirmed known coronavirus RNA structures in the 5′ UTR (1–480 nt, e) and 3′ UTR (29,546–29,870 nt, f) of the SARS-CoV-2 RNA genome. Connection scores, shown in different colors, were used to assess the base-pairing probability. The dashed lines illustrate the pseudoknot.
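The inner track of panel d (fraction of adenine or uracil within 100 nt windows) can be illustrated with a short sketch. This is not the authors' pipeline: the function name au_fraction_windows is hypothetical, the windows are assumed to be consecutive and non-overlapping, and T is counted as U so that cDNA-style sequences are handled too.

```python
def au_fraction_windows(seq: str, window: int = 100) -> list[float]:
    """Return the A/U fraction of each consecutive, non-overlapping window.

    Assumption: windows do not overlap; the final window may be shorter
    than `window` and is normalized by its own length.
    """
    seq = seq.upper()
    fractions = []
    for start in range(0, len(seq), window):
        chunk = seq[start:start + window]
        # Count A and U; T is treated as U for DNA-alphabet input.
        au_count = sum(base in "AUT" for base in chunk)
        fractions.append(au_count / len(chunk))
    return fractions


if __name__ == "__main__":
    # Toy sequence, not real SARS-CoV-2 data.
    print(au_fraction_windows("AAUUGGCC", window=4))
```

On a full genome the result would hold one value per 100 nt window, which is the kind of series a circular track plots.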
Fig. 2. Global view of SARS-CoV-2 genome organization.
a Spanning distance of pairwise interacting RNAs. P1, P2, and P3 mark three peaks corresponding to chimeric interactions spanning 810, 1360, and 2090 nucleotides. b RNA interaction map of the SARS-CoV-2 genome. The black triangles represent RNA topological domains. The frame-shifting element (FSE), the transcription-regulatory sequence in the leader (TRS-L), and the body (TRS-B) are marked as black lines. c The global configuration of the SARS-CoV-2 RNA genome in virions, modeled by the miniMDS software. The 30 kb RNA genome of SARS-CoV-2 is presented as a rope, and each coding region and UTR are marked with different colors. The RNA contact frequencies detected by vRIC-seq were used for the modeling. The solid red lines represent chimeric signals that support the local interactions, whereas the dashed red lines depict long-range interactions. The same picture rotated by 180 degrees is shown at the bottom.
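The spanning-distance profile in panel a can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' method: each chimeric read is assumed to be summarized by one genomic coordinate per arm, and the function names and the 50 nt bin size are hypothetical.

```python
from collections import Counter


def spanning_distances(pairs):
    """Absolute genomic distance between the two arms of each chimeric read.

    Assumption: each read is given as (position_arm1, position_arm2).
    """
    return [abs(b - a) for a, b in pairs]


def distance_histogram(distances, bin_size=50):
    """Count distances per bin; keys are the lower edge of each bin."""
    counts = Counter((d // bin_size) * bin_size for d in distances)
    return dict(sorted(counts.items()))


if __name__ == "__main__":
    # Toy coordinates chosen for illustration only.
    toy_pairs = [(100, 910), (5000, 3640), (10, 2100)]
    dists = spanning_distances(toy_pairs)
    print(dists)
    print(distance_histogram(dists))
```

Peaks such as P1, P2, and P3 would appear as local maxima in a histogram of this kind built from real chimeric reads.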
Fig. 3. The secondary structure of the SARS-CoV-2 genome.
The known structural elements in the 5′ UTR, the frame-shifting element (FSE), and the 3′ UTR are labeled or marked in blue. The pairwise interaction strength was quantified and shown in different colors. Black arrows highlight highly confident long-range duplexes measured by vRIC-seq signals. Green and cyan boxes mark the start and stop codons, respectively. Purple boxes and arrows outline the core sequence (CS) of each transcription-regulatory sequence (TRS).
Fig. 4. Comparison of SARS-CoV-2’s structure in virions and host cells.
a Schematic diagram showing the life cycle of SARS-CoV-2. b Single-stranded nucleotides (n = 8269) identified by vRIC-seq in virions have higher SHAPE-MaP reactivities than base-paired nucleotides (n = 19055) in cells. P-value was determined by two-tailed, unpaired Student’s t-test. The center line of the box plot represents the median, the box borders represent the first (Q1) and third (Q3) quartiles, and the whiskers are the most extreme data points within 1.5× the interquartile range (from Q1 to Q3). c Venn diagram showing the overlap between duplexes revealed in virions and cells. d The percentage of in-virion interactions supported by the COMRADES chimeric reads decreases with the spanning distance. The control datasets are randomly selected pairwise loci (repeated 1000 times, n = 1000). Data are mean ± s.e.m., ***P < 0.001, one-sided permutation test. e The positive predictive value (PPV) of the in-virion duplexes compared to the in-cell predicted duplexes along the SARS-CoV-2 genome in sliding 1 kb windows. The dashed line indicates the average percentage. f In-virion (top) and in-cell (bottom) duplexes in the region surrounding the FSE. Arc lines colored in gray indicate base pairs spanning more than 500 nt, arc lines in red indicate base pairs shared by duplexes in virions and cells, while arc lines in blue indicate base pairs specific to virions or cells. g Diagram of the pseudoknot structure of the FSE. The dashed lines represent base pairs in Stem 2. Different colors in bases stand for the strength of vRIC signals. The scale is shown at the bottom right. An alternative duplex can form at the gray shadowed regions. h vRIC-seq data preferably supports an elongated duplex in the FSE region. i In-virion (top) and in-cell (bottom) duplexes in the region surrounding the 3′ UTR. j Dual-Luciferase Reporters for functional characterization of the alternative duplex. Before the slippery site, a 33 nt sequence (orange) can form a duplex with the Stem 1 (green) of FSE-PK. In-frame is denoted as “0 frame”, while “−1 frame” stands for programmed −1 ribosomal frameshift (−1 PRF). The dashed orange lines stand for the deleted 33 nt sequence. Ctrl control, WT wild type, Del deletion, Rluc renilla luciferase, Fluc firefly luciferase. k The alternative duplex can stimulate −1 PRF activity. Data are mean ± s.e.m., P-value was determined by a two-tailed, unpaired Student’s t-test (n = 6 biologically independent experiments). Source data are provided as a Source Data file.
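The box-plot convention described for panel b (median, Q1, Q3, and whiskers at the most extreme points within 1.5× the interquartile range) can be written out as a small sketch. The function name box_plot_stats is hypothetical, and the "inclusive" quartile method is an assumption, since the caption does not say which quartile definition was used.

```python
from statistics import quantiles


def box_plot_stats(values):
    """Compute box-plot statistics with 1.5 x IQR whiskers.

    Assumption: quartiles use the 'inclusive' method of
    statistics.quantiles; other definitions give slightly different values.
    """
    data = sorted(values)
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    # Whiskers end at the most extreme observations inside the fences.
    inside = [v for v in data if low_fence <= v <= high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        "outliers": [v for v in data if v < low_fence or v > high_fence],
    }


if __name__ == "__main__":
    # Toy values, not SHAPE-MaP reactivities from the paper.
    print(box_plot_stats([1, 2, 3, 4, 5, 6, 7, 8, 9, 30]))
```

In the toy input the value 30 lies beyond the upper fence, so it is reported as an outlier rather than extending the whisker.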
Fig. 5. Co-variation among duplexes in different SARS-CoV-2 strains.
a Profile of structural elements and variants in the SARS-CoV-2 genome. The black arc lines stand for base pairs, the blue line indicates the density of base-paired nucleotides, and the red arc lines denote co-variant base pairs among different SARS-CoV-2 strains. b, c Co-variant base pairs in the 5′ UTR (b) and 3′ UTR (c), respectively. Arc lines and nucleotides colored in red indicate co-variant base pairs. EPI_ISL_402125 is the reference SARS-CoV-2 sequence.
Fig. 6. D614G and the accompanying mutations on structure remodeling.
a Line plot showing that regions harboring point mutations tend to form local interactions. The C241U, C3037U, C14408U, and A23403G (D614G) mutants are marked as solid lines. b The A-to-G transition (D614G mutant) at 23,403 nt remodels two bulge structures into a single six-nucleotide bulge. Red stars mark the mutated nucleotides. ΔG, free energy. c–e The D614G accompanying mutations have no influence on duplexes except for the C14408U transition (e).
Fig. 7. Structure-guided design of potent siRNAs as a cleavage agent to restrict SARS-CoV-2 infection.
a Diagram of strategy against SARS-CoV-2 infection in Vero cells. b The SARS-CoV-2 copies in the supernatant were reduced to background level upon transfection with siRNAs targeting single-stranded regions (si-1 to si-6). Mock, uninfected cells; si-NC, non-targeting siRNA (control). c qPCR showing the abundance of viral RNA in infected Vero cells. Data in (b) and (c) are mean ± s.e.m.; n = 3 biological replicates, two-tailed, unpaired Student’s t-test. Source data are provided as a Source Data file. d Number of single-stranded bases within the siRNA target regions identified in cells and virions. The color intensity denotes the number of siRNAs. The six siRNAs targeting single-stranded regions and three siRNAs targeting duplex regions are labeled as s-1 to s-6 and d-1 to d-3, respectively. R, Pearson correlation coefficient. P-value was determined by two-sided correlation test.
