Review
Heliyon. 2020 Aug;6(8):e04743.
doi: 10.1016/j.heliyon.2020.e04743. Epub 2020 Aug 17.

Molecular biology of coronaviruses: current knowledge

I Made Artika et al. Heliyon. 2020 Aug.

Abstract

The emergence of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) in late December 2019 in Wuhan, China, marked the third introduction of a highly pathogenic coronavirus into the human population in the twenty-first century. The constant spillover of coronaviruses from natural hosts to humans has been linked to human activities and other factors. The seriousness of this infection and the lack of effective, licensed countermeasures clearly underscore the need for a more detailed and comprehensive understanding of coronavirus molecular biology. Coronaviruses are large, enveloped viruses with a positive-sense, single-stranded RNA genome. Coronaviruses are currently recognized as among the most rapidly evolving viruses, owing to their high genomic nucleotide substitution rates and recombination. At the molecular level, coronaviruses employ complex strategies to accomplish genome expression, virus particle assembly and release of progeny virions. Because the health threats from coronaviruses are constant and long-term, understanding their molecular biology and controlling their spread have significant implications for global health and economic stability. This review provides an overview of current basic knowledge of the molecular biology of coronaviruses, which underpins the development of coronavirus countermeasures.

Keywords: Biochemistry; Cell biology; Coronaviruses; Covid-19; Genetics; MERS-CoV; Microbiology; Molecular biology; SARS-CoV; SARS-CoV-2; Virology.

Figures

Figure 1
Schematic diagram of the coronavirus virion. Together with membrane (M) and envelope (E) transmembrane proteins, the spike (S) glycoprotein projects from a host cell-derived lipid bilayer, giving the virion a distinctive appearance. The haemagglutinin esterase (HE) forms small spikes which appear under the tall S protein spikes. The positive-sense viral genomic RNA is associated with the nucleocapsid phosphoprotein (N) forming the ribonucleoprotein with a helical structure (Masters, 2006; de Wit et al., 2016).
Figure 2
Map and membrane topology model of coronavirus spike (S) protein. a). Map of coronavirus spike (S) protein. The S protein can be divided into two functionally distinct subunits: the S1 and S2 subunits. The S1 subunit consists of two major domains: the N-terminal domain (S1-NTD) and the C-terminal domain (S1-CTD). The S1 subunit contains a receptor-binding domain (RBD), which in turn contains a receptor-binding motif (RBM). The arrowheads mark the sites at which the S protein is cleaved by cellular protease(s). In S1, the signal peptide (SP), N-terminal domain (NTD) and the RBD and RBM regions are shown. In S2, the heptad repeat regions (HR1 and HR2), fusion peptide (FP), transmembrane domain (TM) and intracellular tail (IC) are shown. b). Model for coronavirus spike (S) trimer and its membrane topology. The S protein is a transmembrane protein which assembles into a homotrimer. The S1 subunits constitute the bulb portion of the spike, on the virion exterior. The S2 subunits anchor the S proteins in the viral membrane. The S2 subunits contain segments which include the fusion peptides (FP), HR1, HR2 and the highly conserved transmembrane domains. The HR2 regions are located close to the C-terminal end of the S ectodomain on the virion exterior. The intracellular tails (ICs) and the C-terminal ends of the S proteins are located in the virion interior (Masters, 2006; Li, 2016).
Figure 3
Schematic domain organization and membrane topology of the coronavirus membrane (M) protein. a). The coronavirus M protein has three transmembrane (TM) domains flanked by the amino-terminal domain and the carboxy-terminal domain. The carboxy-terminal endodomain contains a conserved domain (CD) following the third transmembrane (TM) domain. b). The transmembrane topology of the coronavirus M protein. The M protein spans the viral membrane three times. The three transmembrane (TM) domains are flanked by the amino-terminal glycosylated domain (on the virion exterior) and the carboxy-terminal endodomain (in the virion interior). The conserved domain (CD) in the long carboxy-terminal endodomain is indicated (Arndt et al., 2010; Perrier et al., 2019).
Figure 4
Schematic domain organization and membrane topology of the coronavirus envelope (E) protein. a). Schematic domain organization of the coronavirus E protein. The protein has a hydrophobic domain predicted to span the viral membrane. The conserved cysteine and proline residues are indicated. b). Membrane topology of the coronavirus E protein. The protein spans the viral membrane once, with the N-terminal end on the virion exterior and the C-terminal end in the virion interior. The transmembrane domain is indicated by a bar (Ruch and Machamer, 2012).
Figure 5
Schematic domain organization of the coronavirus nucleocapsid (N) protein. The coronavirus N protein is a phosphoprotein of 422 amino acid residues (in SARS-CoV). The protein has three distinct and highly conserved domains: the N-terminal domain (NTD), the linker region (LKR) and the C-terminal domain (CTD). The NTD is separated from the CTD by the LKR. All three domains have been shown to bind viral RNA. The LKR contains a Ser/Arg-rich region (SR) with a number of putative phosphorylation sites. The nuclear localization signal (NLS) motifs, the N-terminal arm (NA) and the C-terminal tail (CT) are shown (McBride et al., 2014; Chang et al., 2014).
Figure 6
Schematic diagram of the genome structure of human-infecting coronaviruses. Each bar represents the genomic organization of one coronavirus. The genomic regions or open reading frames (ORFs) are compared. The structural proteins, including the spike (S), envelope (E), membrane (M) and nucleocapsid (N) proteins, as well as the non-structural proteins translated from ORF 1a and ORF 1b and the accessory proteins, are indicated. The tags give the names of the ORFs. 5′UTR = 5′ untranslated region, 3′UTR = 3′ untranslated region, An = poly(A) tail (Masters, 2006; Chen et al., 2020; Wang et al., 2020b).
Figure 7
Schematic diagram of the coronavirus life cycle. Coronavirus infection is initiated by the binding of virus particles to cellular receptors, leading to viral entry followed by fusion of the viral and host cell membranes. After membrane fusion, the viral RNA is uncoated in the host cell cytoplasm. ORF1a and ORF1ab are translated to produce pp1a and pp1ab, which are subsequently processed by the proteases encoded by ORF1a to produce 16 non-structural proteins (nsps) which form the RNA replicase–transcriptase complex (RTC). This complex localizes to modified intracellular membranes which are derived from the rough endoplasmic reticulum (ER) in the perinuclear region, and it drives the generation of negative-sense RNAs ((–)RNAs) through both replication and transcription. During replication, full-length (–)RNA copies of the genome are synthesized and used as templates for the production of full-length (+)RNA genomes. During transcription, a subset of 7–9 subgenomic RNAs, including those encoding all structural proteins, is produced through discontinuous transcription. In this process, subgenomic (–)RNAs are synthesized by combining varying lengths of the 3′ end of the genome with the 5′ leader sequence necessary for translation. These subgenomic (–)RNAs are then transcribed into subgenomic (+)mRNAs, which are then translated. The resulting structural proteins are assembled into the ribonucleocapsid and viral envelope at the ER–Golgi intermediate compartment (ERGIC), followed by release of the newly produced coronavirus particle from the infected cell (Masters, 2006; de Wit et al., 2016).
Figure 8
Schematic diagram of the domain organization of SARS-CoV-2 nsp12, a typical coronavirus nsp12. The C-terminal region of coronavirus nsp12 contains an RNA-dependent RNA polymerase (RdRp) domain. The polymerase domain consists of a fingers domain, a palm domain and a thumb domain. All viral polymerases possess seven conserved motif regions (A, B, C, D, E, F, G) involved in template and nucleotide binding and catalysis (Kirchdoerfer and Ward, 2019). The nsp12 of SARS-CoV-2 has a nidovirus-unique N-terminal extension domain that adopts a nidovirus RdRp-associated nucleotidyltransferase (NiRAN) architecture. The polymerase domain and NiRAN domain are connected by an interface (Gao et al., 2020).
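The discontinuous transcription step described in the Figure 7 caption can be illustrated with a toy model. The Python sketch below is not taken from the review. The leader, the junction motif (called `TRS` here, for transcription-regulating sequence) and every sequence in the toy genome are invented placeholders. The function names are hypothetical, and the model ignores everything about the real process except the joining of a 3′ portion of the genome to the 5′ leader.

```python
# Toy model of discontinuous transcription. All sequences are invented
# placeholders (assumptions), not real coronavirus sequences.

COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

LEADER = "GAUUCA"  # hypothetical 5' leader sequence
TRS = "ACGAAC"     # hypothetical junction motif preceding each gene
# Toy genome: leader, then three "genes" each preceded by the motif, then a 3' end.
GENOME = LEADER + TRS + "GGGGUUUU" + TRS + "CCAAGG" + TRS + "UUGGCC" + "AAUU"


def reverse_complement(rna):
    """Return the complementary strand, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def find_trs_sites(genome, trs):
    """Return the start index of every occurrence of the motif in the genome."""
    sites = []
    start = genome.find(trs)
    while start != -1:
        sites.append(start)
        start = genome.find(trs, start + 1)
    return sites


def subgenomic_minus_strand(genome, leader_site, body_site):
    """Copy the genome from its 3' end back to a body motif, then jump to
    the leader and copy it, giving one subgenomic (-)RNA."""
    body_copy = reverse_complement(genome[body_site:])
    leader_copy = reverse_complement(genome[:leader_site])
    return body_copy + leader_copy


def subgenomic_mrnas(genome, trs):
    """Transcribe each subgenomic (-)RNA back into a (+)mRNA."""
    leader_site, *body_sites = find_trs_sites(genome, trs)
    return [
        reverse_complement(subgenomic_minus_strand(genome, leader_site, site))
        for site in body_sites
    ]
```

Calling `subgenomic_mrnas(GENOME, TRS)` on the toy genome returns a nested set of transcripts that all begin with the same leader and all end at the same 3′ end, differing only in how much of the genome body they include. That shared-ends pattern is the feature the caption describes.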

References

    1. Adedeji A.O., Lazarus H. Biochemical characterization of Middle East respiratory syndrome coronavirus helicase. mSphere. 2016;1:1–14. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5014916/
    2. Anand K., Ziebuhr J., Wadhwani P., Mesters J.R., Hilgenfeld R. Coronavirus main proteinase (3CLpro) structure: basis for design of anti-SARS drugs. Science. 2003;300:1763–1767. https://science.sciencemag.org/content/300/5626/1763
    3. Andersen K.G., Rambaut A., Lipkin W.I., Holmes E.C., Garry R.F. The proximal origin of SARS-CoV-2. Nat. Med. 2020.
    4. Angelini A.M., Akhlaghpour M., Neuman B.W., Buchmeier M.J. Severe acute respiratory syndrome coronavirus nonstructural proteins 3, 4, and 6 induce double-membrane vesicles. mBio. 2013;4:1–10. https://mbio.asm.org/content/4/4/e00524-13
    5. Anindita P.D., Sasaki M., Setiyono A., Handharyani E., Orba Y., Kobayashi S., Rahmadani I., Taha S., Adiani S., Subangkit M., Nakamura I., Sawa H., Kimura T. Detection of coronavirus genomes in Moluccan naked-backed fruit bats in Indonesia. Arch. Virol. 2015;160:1113–1118. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7086880/