Review
J Basic Microbiol. 2021 Mar;61(3):180-202.
doi: 10.1002/jobm.202000537. Epub 2021 Jan 18.

SARS-CoV-2, the pandemic coronavirus: Molecular and structural insights

Swapnil B Kadam et al. J Basic Microbiol. 2021 Mar.

Abstract

The outbreak of a novel coronavirus associated with acute respiratory disease, called COVID-19, marked the third spillover of an animal coronavirus (CoV) to humans in the last two decades. Genome analysis with various bioinformatics tools revealed that the causative pathogen (SARS-CoV-2) belongs to the subgenus Sarbecovirus of the genus Betacoronavirus; its genome is highly similar to that of bat coronavirus, and the receptor-binding domain (RBD) of its spike glycoprotein is highly similar to that of Malayan pangolin coronavirus. Based on this genetic proximity, SARS-CoV-2 is likely to have originated from a bat-derived CoV and been transmitted to humans via an unknown intermediate mammalian host, probably the Malayan pangolin. Further, the spike protein S1/S2 cleavage site of SARS-CoV-2 has acquired a polybasic furin cleavage site that is absent in bat and pangolin CoVs, suggesting natural selection either in an animal host before zoonotic transfer or in humans following zoonotic transfer. In this review, we summarize the preliminary understanding of the disease, the origin and life cycle of SARS-CoV-2, the roles of viral proteins in pathogenesis, and the commonalities and differences among coronaviruses. Moreover, the crystal structures of SARS-CoV-2 proteins with unique characteristics that differentiate it from other CoVs are discussed. Our review also provides comprehensive information on the molecular aspects of SARS-CoV-2, including secondary structures in the genome and protein-protein interactions, which can help explain the aggressive spread of SARS-CoV-2. The mutations and haplotypes reported in the SARS-CoV-2 genome are summarized to understand the evolution of the virus.

Keywords: ACE2; SARS-CoV-2; angiotensin-converting enzyme 2; pandemic coronavirus; severe acute respiratory syndrome coronavirus 2.

Conflict of interest statement

The authors declare that there are no conflicts of interest.

Figures

Figure 1
(a) Genome structure of SARS‐CoV‐2 and other coronaviruses. The genome of CoVs comprises 5ʹ and 3ʹ untranslated regions (UTRs) and open reading frame (ORF) 1a/b (blue boxes). The structural genes at the 3ʹ terminus encode the structural proteins, namely spike (S; white boxes), envelope (E; yellow boxes), membrane (M; red boxes), and nucleocapsid (N; green boxes), which are common to all CoVs. In addition, the accessory genes interspersed between the structural genes encode accessory proteins. Comparison of the coding regions of SARS‐CoV‐2 with those of different CoVs showed a genome organization similar to that of SARS‐CoV, bat SL‐CoVZXC21, and pangolin CoV GX/P2V. ORF1 does not differ remarkably among the CoVs, although it encodes NSPs of variable length; the accessory genes, however, differ. Red dotted line: notable variation between SARS‐CoV‐2 and SARS‐CoV. (b) SARS‐CoV‐2 spike (S) glycoprotein. The S1 region of the spike protein contains an N‐terminal domain (NTD; red box) and a C‐domain or receptor‐binding domain (RBD; yellow box). The S2 subunit contains the fusion peptide (FP; gray box), heptad repeat 1 (HR1; white box), central helix (CH; green), connector domain (CD; blue box), heptad repeat 2 (HR2; brown box), and transmembrane domain (TM). Black arrows: cleavage sites at the S1/S2 boundary (R685) and S2ʹ (R815)
Figure 2
SARS‐CoV‐2 nonstructural proteins (nsps), replication cycle, and host pathogenesis. The multiple interactions or complexes that act as activators are essential for virus replication. Numbers on polypeptide pp1a/b label the nsps; white dots: nsp3 cleavage sites; orange dots: nsp5 (Mpro/3CLpro) cleavage sites. Brown arrows: nsp–nsp interactions. The portion of the figure with a light blue background specifies the role of nsps in host pathogenesis. Figure created with BioRender.com
Figure 3
Multiple sequence alignment of the spike (S) glycoprotein, showing the S1/S2 and S2ʹ cleavage sites. The proprotein convertase (PPC) or furin motif RRAR with a leading proline insertion is unique to SARS‐CoV‐2 (PRRA insertion highlighted in red box), although NL63 and MERS have a proline without additional downstream basic residues. Such a polybasic cleavage site is absent in other beta CoVs, including bat, Chinese and Malayan pangolin, and even the previous human SARS‐CoV. The S2ʹ cleavage site at R815 is conserved across all the sequences analyzed; however, SARS‐CoV‐2, bat, and pangolin CoVs have KPSKR, whereas civet and hSARS‐CoV have KPTKR
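The motif described in the Figure 3 caption can be located with a simple pattern search. The following is a minimal sketch, not the authors' method: it scans a sequence for the minimal furin-like pattern R-X-X-R and reports the position of the final arginine. The function name `find_furin_like_sites`, the 11-residue fragment, and its assumed starting position (residue 679) are illustrative assumptions and should be checked against a reference spike sequence before use.

```python
import re

# Minimal furin-like pattern R-X-X-R. The lookahead lets overlapping
# matches be reported as well.
FURIN_LIKE = re.compile(r"(?=(R..R))")


def find_furin_like_sites(seq, offset=1):
    """Return (position of the last R, matched motif) for each R-X-X-R hit.

    `offset` is the residue number of the first character of `seq`.
    """
    return [(m.start() + offset + 3, m.group(1)) for m in FURIN_LIKE.finditer(seq)]


# Hypothetical illustration: a short fragment around the S1/S2 boundary,
# assumed here to begin at residue 679 of the SARS-CoV-2 spike protein.
fragment = "NSPRRARSVAS"
print(find_furin_like_sites(fragment, offset=679))
```

Under these assumptions the single hit is RRAR ending at position 685, which agrees with the S1/S2 cleavage site (R685) named in the Figure 1 caption. The S2ʹ pentapeptide KPSKR contains no R-X-X-R pattern and yields no hit.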
Figure 4
Multiple sequence alignment of the SARS‐CoV‐2 receptor‐binding domain (RBD) of the spike glycoprotein (S). The contact amino acid residues of the RBD that interact with the ACE2 receptor are marked with red boxes. All six amino acid residues exactly match those of Malayan pangolin CoV strains MP789 (NCBI acc no: MT084071) [62] and GD/P2S (GISAID acc no: EPI_ISL_410544) [63]; both samples originated from the Guangdong Wildlife Rescue Center. These Malayan pangolins were rescued by the Anti‐Smuggling Customs Bureau in March 2019. This suggests that an ancestral strain of SARS‐CoV‐2 might have infected Malayan pangolins
Figure 5
Phylogenetic relationship of various coronavirus spike (S) glycoproteins. The sequences downloaded from the UniProtKB and GenBank websites were clustered according to genera, namely, Alphacoronavirus, Betacoronavirus, Gammacoronavirus, and Deltacoronavirus. The multiple sequence alignment (MSA) was built using the MUSCLE tool of MEGA X software, and the phylogeny was inferred using the maximum likelihood method with the substitution model WAG + F + I + G4 and 1000 bootstraps on the IQ‐Tree webserver (http://iqtree.cibiv.univie.ac.at/)
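The caption reports that the tree was inferred on the IQ‐Tree webserver. For readers who prefer a local run, the sketch below assembles a roughly equivalent command line. It rests on several assumptions: the alignment file name `spike_aligned.fasta` and the helper `iqtree_command` are hypothetical, the executable is assumed to be named `iqtree` (IQ-TREE 1.x), and the caption does not say whether the 1000 replicates were ultrafast or standard bootstraps, so both options are exposed. IQ-TREE 2 uses `-B` rather than `-bb` for ultrafast bootstrap.

```python
import shlex


def iqtree_command(alignment, model="WAG+F+I+G4", replicates=1000, ultrafast=True):
    """Build an IQ-TREE 1.x argument list for a maximum likelihood tree.

    -s   input alignment
    -m   substitution model
    -bb  ultrafast bootstrap replicates (IQ-TREE 1.x)
    -b   standard nonparametric bootstrap replicates
    """
    bootstrap_flag = "-bb" if ultrafast else "-b"
    return ["iqtree", "-s", alignment, "-m", model, bootstrap_flag, str(replicates)]


# Hypothetical alignment file exported from the MSA step.
cmd = iqtree_command("spike_aligned.fasta")
print(shlex.join(cmd))
```

The argument list can be passed to `subprocess.run(cmd, check=True)` once IQ-TREE is installed; the sketch only prints the command so that it can be inspected first.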
Figure 6
Replication cycle of SARS‐CoV‐2 and potential therapeutic target sites. SARS‐CoV‐2 enters the human body through the nasal–oral route, and in response the body initiates an innate response by producing interferons (IFNs); however, IFN activates expression of the ACE2 protein, which acts as the receptor for virus attachment to the host cell. The receptor‐binding domain (RBD) in the S1 region of the S protein interacts with ACE2, which leads to TMPRSS2‐mediated proteolytic cleavage at the S1–S2 boundary and at the S2ʹ R815 site and induces fusion of the viral and host cell plasma membranes. The viral genomic single‐stranded RNA is translated by the host machinery to produce the viral polyproteins pp1a and pp1a/b, which undergo proteolytic cleavage by Mpro (3CLpro). These polyproteins encode the replication–transcription complex (RTC), which continuously replicates the genome and produces a series of subgenomic messenger RNAs encoding the accessory and structural proteins. The viral genomic RNA and proteins are assembled into virus particles, which bud in the ER and Golgi. Later, the virus‐containing vesicles fuse with the plasma membrane of the host cell and release the viral particles out of the cell. The antiviral molecules and their target sites are highlighted in red

References

    1. Shereen MA, Khan S, Kazmi A, Bashir N, Siddique R. COVID‐19 infection: origin, transmission, and characteristics of human coronaviruses. J Adv Res. 2020;24:91‐8.
    2. Cui J, Li F, Shi ZL. Origin and evolution of pathogenic coronaviruses. Nat Rev Microbiol. 2019;17:181‐92.
    3. Corman VM, Lienau J, Witzenrath M. Coronaviruses as the cause of respiratory infections. Internist (Berl). 2019;60:1136‐45.
    4. Zhu N, Zhang D, Wang W, Li X, Yang B, Song J, et al. A novel coronavirus from patients with pneumonia in China, 2019. N Engl J Med. 2020;382:727‐33.
    5. Gorbalenya AE, Baker SC, Baric RS, de Groot RJ, Gulyaeva AA, Haagmans BL, et al. The species and its viruses—a statement of the coronavirus study group. bioRxiv. 2020. doi: 10.1101/2020.02.07.937862
