Review
Bioinform Biol Insights. 2021 Jun 22:15:11779322211025876.
doi: 10.1177/11779322211025876. eCollection 2021.

Structure and Function of Major SARS-CoV-2 and SARS-CoV Proteins

Ritesh Gorkhali et al.

Abstract

SARS-CoV-2, the causative agent of the COVID-19 pandemic, has a genome encoding 16 nonstructural proteins (nsps), 4 structural proteins, and 9 accessory proteins. Its relative SARS-CoV has a very similar genomic organization. In this article, the function and structure of the proteins of SARS-CoV-2 and SARS-CoV are described in detail. The nsps are expressed as one or two polyproteins, which are then cleaved into individual proteins by two viral proteases, a chymotrypsin-like protease and a papain-like protease. The released proteins serve as centers of virus replication and transcription. Some of these nsps modulate the host's translation and immune systems, while others help the virus evade the host immune system. Some nsps help form the replication-transcription complex at double-membrane vesicles. Others, including an RNA-dependent RNA polymerase and an exonuclease, help polymerize newly synthesized viral RNA and minimize the mutation rate by proofreading. After synthesis, the viral RNA is capped. Capping consists of adding GMP and a methylation mark (cap 0) and then adding a methyl group to the terminal ribose (cap 1). Capping is accomplished with the help of a helicase, which helps remove a phosphate, two methyltransferases, and a scaffolding factor. Among the structural proteins, the S protein is the receptor-binding protein of the virus, which latches onto the angiotensin-converting enzyme 2 receptor of the host, and the N protein binds and protects the genomic RNA of the virus. The accessory proteins found in these viruses are small proteins with immune modulatory roles. Besides the functions of these proteins, the article describes solved X-ray and cryogenic electron microscopy structures related to protein function, along with comparisons to other coronavirus homologs. Finally, the mutation rate of residues across the SARS-CoV-2 proteome during the 2020 pandemic is described. Some proteins are mutated more often than others, but the significance of these mutation rates is not fully understood.
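As a rough illustration of how a per-protein mutation rate could be tabulated, the sketch below counts the fraction of residue positions that differ from a reference sequence in at least one sample. The sequences, the protein labels, and the function name `mutated_fraction` are hypothetical placeholders, not data or methods taken from the article, and the sketch assumes the sequences are already aligned to equal length.

```python
def mutated_fraction(reference, variants):
    """Fraction of positions where at least one variant differs from the reference."""
    if not reference:
        raise ValueError("reference must be non-empty")
    for v in variants:
        if len(v) != len(reference):
            raise ValueError("variants must be pre-aligned to the reference length")
    mutated = sum(
        1
        for i, ref_res in enumerate(reference)
        if any(v[i] != ref_res for v in variants)
    )
    return mutated / len(reference)


# Toy, made-up sequences (NOT real SARS-CoV-2 data).
toy = {
    "protein_A": ("MKTLLVAG", ["MKTLLVAG", "MKTLIVAG", "MRTLLVAG"]),
    "protein_B": ("MSDNGPQN", ["MSDNGPQN", "MSDNGPQN"]),
}
rates = {name: mutated_fraction(ref, vs) for name, (ref, vs) in toy.items()}
```

In this toy input, `protein_A` has two variable positions out of eight and `protein_B` has none, which mirrors the observation that some proteins are mutated more often than others.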

Keywords: SARS-CoV; SARS-CoV-2; function; proteins; structure.

Conflict of interest statement

Declaration of conflicting interests: The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Figures

Figure 1.
Genome organization of SARS-CoV and SARS-CoV-2. The upper panel shows the genomic organization of SARS-CoV. It contains two polyproteins, pp1a and pp1b, synthesized from ORF1a and ORF1b. These polypeptides undergo a series of proteolytic cleavages to form 16 nonstructural proteins, which are encoded by the first two thirds of the genome (figure not drawn to scale). The remaining one third of the genome encodes four structural proteins: S, E, M, and N. Interspersed among these genes are genes for the accessory proteins. SARS-CoV-2, a more recently discovered virus, has a genomic organization almost identical to that of SARS-CoV. Accessory protein 8 of SARS-CoV-2 is not divided into 8a and 8b as in SARS-CoV, and two ORFs for spike proteins are not present in this virus. In addition, SARS-CoV-2 contains accessory protein 10, which is potentially not present in SARS-CoV. ORF indicates open reading frame.
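The proteolytic processing of a polyprotein into individual proteins can be sketched as a string-splitting problem. The example below assumes a simplified consensus for the chymotrypsin-like protease, cleavage after Gln in Leu-Gln followed by Ser, Ala, or Gly, which is a commonly cited approximation and is not stated in this text. The toy sequence and the function name `cleave` are hypothetical. Real cleavage sites tolerate more variation than this pattern allows.

```python
import re

# Simplified, assumed consensus: cut between "LQ" and a following S, A, or G.
SITE = re.compile(r"(?<=LQ)(?=[SAG])")


def cleave(polyprotein):
    """Split a sequence at every position matching the assumed consensus site."""
    cuts = [m.start() for m in SITE.finditer(polyprotein)]
    bounds = [0] + cuts + [len(polyprotein)]
    return [polyprotein[a:b] for a, b in zip(bounds, bounds[1:])]


# Made-up toy sequence (NOT a real viral polyprotein).
toy = "MKVALQSGFTKLQAEEPLQGW"
fragments = cleave(toy)
```

Each fragment ends with the residues upstream of the cut, so joining the fragments reproduces the input sequence.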
Figure 2.
Cap0 and Cap1 activity. Methylation is carried out at the 5ʹ end of newly synthesized RNA using nsp14 (cap0). A second methylation is then carried out at the 2ʹ oxygen of the terminal ribose using nsp10 and nsp16 (cap1).
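The capping sequence can be written down as an ordered list of state changes at the 5ʹ end of the RNA. The sketch below is a bookkeeping aid, not a biochemical model. The shorthand strings (`pppN`, `GpppN`, `m7GpppN`, `m7GpppNm`) are conventional cap notation and do not appear in this text. Assigning the cap 1 step to nsp16 with nsp10 as scaffold follows the nsp10-nsp16 complex described for Figure 11 and is an assumption of this sketch. The factor that adds GMP is left unspecified because the text does not name it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CapStep:
    name: str
    factor: str
    before: str
    after: str


STEPS = [
    CapStep("remove terminal phosphate", "nsp13 (helicase)", "pppN", "ppN"),
    CapStep("add GMP", "not specified in the text", "ppN", "GpppN"),
    CapStep("N7 methylation (cap 0)", "nsp14", "GpppN", "m7GpppN"),
    # Assumption: nsp16 with nsp10 as scaffold performs the 2'-O methylation.
    CapStep("2'-O methylation (cap 1)", "nsp16 + nsp10", "m7GpppN", "m7GpppNm"),
]


def cap(five_prime_end):
    """Apply the steps in order and return every intermediate 5' end state."""
    state = five_prime_end
    history = [state]
    for step in STEPS:
        if state != step.before:
            raise ValueError(f"cannot apply '{step.name}' to state {state}")
        state = step.after
        history.append(state)
    return history


history = cap("pppN")
```

Writing the steps as data makes the ordering explicit: each step can only run on the product of the previous one.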
Figure 3.
Hypothetical SARS-CoV-2 entry and replication inside a human host cell. (A) Human host cell with angiotensin-converting enzyme 2 (ACE2) receptors, which bind the virus and aid its entry into the cell. (B) Positive-sense single-stranded viral RNA within the cell. (C) Translation of the RNA into 1ab proteins (pp1a and pp1b). (D) Cleavage of pp1a and pp1b into 16 nonstructural proteins by the virally encoded chymotrypsin-like protease and papain-like protease. (E) Induction of double-membrane vesicles (DMVs) and localization of cleaved nsps with the help of nsp3, nsp4, and nsp6. (F) nsp8 acts as a primase for RNA replication. (G) nsp7, nsp8, nsp12, and nsp14 assist polymerase and exonuclease activities. (H) mRNA capping is assisted by nsp10, nsp13, nsp14, and nsp16. (I) Finally, replicated RNA and other translated viral proteins are assembled into a new virus.
Figure 4.
Crystal structure of the SARS-CoV-2 nsp1 globular domain (cartoon representation), comprising residues 13 to 127, showing helices in red, sheets in yellow, and loops in green (PDB 7K3N). The structure of SARS-CoV-2 nsp1, like that of SARS-CoV, has a six-stranded β-barrel (yellow) and additionally has an α1 helix (red) and a large number of flexible loops (green).
Figure 5.
SARS-CoV-2 3CL protease (3CL pro) in complex with a novel inhibitor in cartoon representation showing red helix, yellow sheet, and green loop (PDB 2M2N). 3CL pro has two chains (A and B), each with three domains (I, II, III). A long loop connects domains II and III. The β-barrels of domains I and II are each composed of six-stranded β-sheets (yellow), and domain III is composed mainly of α-helices (red).
Figure 6.
Crystal structure of SARS-CoV-2 nsp3 macrodomain (Cartoon representation) in complex with ADP ribose and showing five (A, B, C, D, E) different chains in different colors where ligands and water are shown in ball and stick representation (PDB 6YWL). SARS-CoV-2 encodes a large, multidomain nsp3 with an ADP ribose phosphate (ADRP) domain (also known as macrodomain), which is thought to interfere with the host immune response.
Figure 7.
Crystal structure of RNA-dependent RNA polymerase of SARS-CoV-2 consisting of four chains represented by nsp12 (chain A), nsp8 (chain B, D), and nsp7 (chain C) in four different colors green, sky-blue, yellow, and purple, respectively (PDB 7W4Y).
Figure 8.
RNA-dependent RNA polymerase activity and exonuclease proofreading activity. The figure shows, at a biochemical level, what reactions take place due to RNA-dependent RNA polymerase and 3ʹ to 5ʹ exonuclease.
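The interplay of polymerization and 3ʹ to 5ʹ exonuclease proofreading can be illustrated with a toy simulation. In the sketch below, a polymerase step adds the complementary base with some error probability, and a proofreading step replaces a mismatched base with the correct one with some probability. The function name `synthesize`, the parameters `error_rate` and `catch_prob`, and the exaggerated error rate are all assumptions chosen for visibility, not measured values. Strand directionality is ignored.

```python
import random

BASES = "ACGU"
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def synthesize(template, error_rate, catch_prob, rng):
    """Copy a template base by base; return the product and the error count."""
    product = []
    errors = 0
    for base in template:
        correct = PAIR[base]
        added = correct
        # Polymerase step: occasionally incorporate a wrong nucleotide.
        if rng.random() < error_rate:
            added = rng.choice([b for b in BASES if b != correct])
        # Exonuclease step: a caught mismatch is excised and replaced.
        if added != correct and rng.random() < catch_prob:
            added = correct
        if added != correct:
            errors += 1
        product.append(added)
    return "".join(product), errors


template = "AUGC" * 50  # made-up template, not a viral sequence
_, errors_without = synthesize(template, 0.5, 0.0, random.Random(0))
_, errors_with = synthesize(template, 0.5, 1.0, random.Random(0))
```

With `catch_prob` set to 1.0 every mismatch is corrected, so the error count drops to zero, while the same run without proofreading leaves many errors in the product.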
Figure 9.
Cartoon representation of SARS-CoV nsp10-nsp14 complex structure showing loops (green), β-sheets (yellow), and helices (red) (PDB 5C8U). The nsp14 ExoN domain is stabilized by nsp10. The ExoN domain features a core, twisted β-sheet consisting of five β-strands and the N7-MTase domain consists of five β-strands.
Figure 10.
Cartoon representation of SARS-CoV-2 nsp13 structure showing loops (green), β-sheets (yellow), and helices (red) (PDB 6ZSL).
Figure 11.
Cartoon representation of SARS-CoV-2 nsp10-nsp16 complex structure showing loops (green), β-sheets (yellow), and helices (red) (PDB 6W4H). Nsp10’s positively charged and hydrophobic surface interacts with a hydrophobic pocket and a negatively charged nsp16 surface, which helps to stabilize the SAM binding site.
Figure 12.
Cartoon representation of SARS-CoV-2 nsp15 structure showing loops (green), β-sheets (yellow), and helices (red) (PDB 6VWW). SARS-CoV-2 nsp15 forms a dimer of trimers, which assembles into a hexamer; each subunit of nsp15 contains 10 α-helices and 21 β-strands.
Figure 13.
Cartoon representation of crystal structure of SARS-CoV-2 nsp9 dimer structure showing loops (green), β-sheets (yellow), and helices (red) (PDB 6WXD). The inter-subunit interactions to form a dimer are due to van der Waals interactions between the interfacing copies of α1 helix C-terminal as a result of self-association of GxxxG protein-protein binding motif.
Figure 14.
(A) Surface representation of a closed trimer of SARS-CoV-2 S protein composed of three chains shown in purple, green, and cyan. (B) Cartoon representation of the side view of the SARS-CoV-2 S trimer. (C) Cartoon representation of the top view of the SARS-CoV-2 S trimer showing the (closed) hACE2-binding S-B domain (PDB 6VXX).
Figure 15.
A cartoon representation of hACE2-binding S-B domain (A) of SARS-CoV-2 S protein. A cartoon representation of SARS-CoV-2 S-B domain (purple) bound with an hACE2 (red) (B). A SARS-CoV-2 S-B domain (C) showing receptor-binding motif (RBM) comprising amino acids 438 to 506 in red and the core in cyan (PDB 6M0J).
Figure 16.
A cartoon representation showing a trimer of the S2 subunit of SARS-CoV-2 S protein (PDB 6LXT).
Figure 17.
Structure of SARS-CoV-2 nucleocapsid protein: (A) Structure of SARS-CoV-2-N-NTD (cartoon representation), monomers in one asymmetric unit. There are a total of four monomers, which are represented by different colors (PDB 6M3M). (B) Structure of SARS-CoV-2-N-NTD (cartoon representation) showing green loops, yellow β-sheets, and red 3₁₀ helices (η) (PDB 6M3M). (C) Structure of SARS-CoV-2-N-NTD showing (loop)-(β-sheet core)-(loop) with a β-hairpin sticking out from the β2 and β5 regions. Here η1 represents the 3₁₀ helix.
Figure 18.
Top and side views, respectively, of a SARS-CoV E protein pentameric ion channel (A and B) in ribbon diagram representation. Ribbon diagram representation of a single monomer of E protein (C) that forms the ion channel pentamer (PDB 5X29).
Figure 19.
Transmembrane domains predicted by the TMHMM web server, showing three domains connected by two linker peptides.
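TMHMM itself is a hidden Markov model and is not reproduced here. As a much simpler stand-in, the sketch below flags candidate membrane-spanning segments with a Kyte-Doolittle hydropathy sliding window, using the conventional window of 19 residues and a threshold of 1.6. The function name `hydrophobic_segments` and the toy sequence are hypothetical, and the output should not be expected to match TMHMM predictions.

```python
# Kyte-Doolittle hydropathy values per amino acid.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydrophobic_segments(seq, window=19, threshold=1.6):
    """Return (start, end) ranges, end exclusive, covered by windows at or above threshold."""
    if len(seq) < window:
        return []
    scores = [
        sum(KD[a] for a in seq[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]
    segments = []
    start = None
    for i, score in enumerate(scores):
        if score >= threshold and start is None:
            start = i
        elif score < threshold and start is not None:
            # The last qualifying window began at i - 1.
            segments.append((start, i - 1 + window))
            start = None
    if start is not None:
        segments.append((start, len(scores) - 1 + window))
    return segments


# Made-up sequence: two hydrophobic stretches separated by charged linkers.
toy = "DDDDDKKKKK" + "L" * 20 + "KKKKKDDDDD" + "I" * 20 + "DDDDDKKKKK"
segments = hydrophobic_segments(toy)
```

On this toy input the two hydrophobic stretches are reported as separate segments, with the charged stretch between them acting as a linker.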
Figure 20.
Structure of SARS-CoV-2 ORF3a protein (Cartoon representation). (A) Showing green chain A, blue chain B. (B) Showing green loops, yellow β sheets, and red alpha helices (PDB 6XDC). ORF indicates Open Reading Frames.
Figure 21.
Structure of SARS-CoV-2 ORF7a protein (cartoon representation) showing green loops, yellow β sheets (PDB 7CI3). ORF indicates Open Reading Frames.
Figure 22.
Structure of SARS-CoV-2 ORF8 protein (cartoon representation) showing green loops, yellow β sheets (PDB 7JTL). ORF indicates Open Reading Frames.
Figure 23.
Structure of SARS-CoV-2 ORF9b protein (Cartoon representation). (A) Showing green chain A, blue chain B. (B) Showing green loops, yellow β sheets, and red alpha helices (PDB 6Z4U). ORF indicates Open Reading Frames.
