Microbiol Spectr. 2021 Dec 22;9(3):e0109621.
doi: 10.1128/Spectrum.01096-21. Epub 2021 Nov 17.

SARS-CoV-2 Variants and Their Relevant Mutational Profiles: Update Summer 2021

Mohammad Alkhatib et al. Microbiol Spectr.

Abstract

Since the beginning of the coronavirus disease 2019 (COVID-19) pandemic, its causative agent, severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), has been undergoing genetic diversification, leading to the emergence of new variants. Nevertheless, a clear definition of the genetic signatures underlying the circulating variants is still missing. Here, we provide a comprehensive insight into the mutational profiles characterizing each SARS-CoV-2 variant, focusing on spike mutations known to modulate viral infectivity and/or antigenicity. We focused on variants and on specific relevant mutations reported by the GISAID, Nextstrain, Outbreak.info, Pango, and Stanford database websites that were associated with any clinical/diagnostic impact, according to published manuscripts. Furthermore, 1,223,338 full-length high-quality SARS-CoV-2 genome sequences were retrieved from GISAID and used to accurately define the specific mutational patterns in each variant. Finally, mutations were mapped on the three-dimensional structure of the SARS-CoV-2 spike protein to assess their localization in the different spike domains. Overall, this review sheds light on, and helps define, the genetic signatures characterizing the currently circulating variants and their clinical relevance.

IMPORTANCE Since the emergence of SARS-CoV-2, several recurrent mutations, particularly in the spike protein, arose during human-to-human transmission or spillover events between humans and animals, generating distinct worrisome variants of concern (VOCs) or of interest (VOIs), designated as such due to their clinical and diagnostic impacts. Characterizing these variants and their related mutations is important in tracking SARS-CoV-2 evolution and understanding the efficacy of vaccines and therapeutics based on monoclonal antibodies, convalescent-phase sera, and direct antivirals. Our study provides a comprehensive survey of the mutational profiles characterizing the important SARS-CoV-2 variants, focusing on spike mutations and highlighting other protein mutations.

Keywords: COVID-19; SARS-CoV-2; emerging variants; mutations; pandemic; variants.

Conflict of interest statement

Robert Shafer has received grant funding from Janssen Pharmaceuticals, Vela Diagnostics, and Insilixa and honoraria from Gilead Sciences and GlaxoSmithKline (GSK).

Figures

FIG 1
Mutations underlying the currently circulating variants in the spike glycoprotein. Only mutated positions are reported. The different domains of the spike glycoproteins are depicted. The consensus sequence for each variant was defined as nonsynonymous substitutions or deletions that occurred in >75% of sequences within that lineage. Each mutation (such as E484K) is indicated by a first letter that is the symbol for the reference amino acid of NC_045512.2 (e.g., E), a number for the amino acid position in the wild-type protein (e.g., 484), and a second letter representing the amino acid actually found in the sequence analyzed (e.g., K). The nomenclatures of the VOCs and some of the VOIs were those reported by WHO and Pango, while the rest of the VOIs and other variants were reported by Pango. Mutations in black refer to the mutations reported by Nextstrain, Outbreak.info, Pango lineages, and Stanford database websites, while mutations in gray are those that we identified by analyzing entire high-quality viral genome sequences from GISAID (n = 1,223,383).
a The mutations L452R, E484K, and S494L are rarely present in this variant, with rates of 0.05%, 0.3%, and 0.3%, respectively. In addition to the deletion at position 144, a deletion at position 145 is also observed, with a low prevalence of about 0.02%.
b This VOC was previously characterized additionally by the presence of L18F, which currently is only in about 38% of sequences, and 2 sublineages have evolved recently (B.1.351.2 and B.1.351.3) that have L18F at prevalences of about 94% and 93%, respectively.
c The mutation P681H is rarely present in this variant, with a prevalence of 1.3%.
d The mutations V70F, A222V, W258L, and K417N are detected in this variant with prevalences of about 0.3%, 12.1%, 0.2%, and 0.3%, respectively. Recently, this variant has evolved into 3 sublineages (AY.1, AY.2, and AY.3) that have acquired some additional mutations, as follows: AY.1 (also called Delta plus) presents W258L and K417N and AY.2 presents A222V and K417N, while AY.3 does not present specific Spike mutations.
e The mutations S13I and W152C are only present in the B.1.429 variant.
f The mutations T19I, G142D, and H1101D are detected in this variant with prevalences of 54.5%, 71.3%, and 30.8%, respectively.
g The mutation Q52R is detected in this variant with a prevalence of 71.6%.
h The mutations L452R, S477N, and E484K cooccur rarely in this variant, while they have sole prevalences of about 25.7%, 15.1%, and 54.0%, respectively. Recently, the B.1.526 variant has evolved into 2 sublineages (B.1.526.1 and B.1.526.2) that appear to have several more unique mutations. B.1.526.1 presents the mutations D80G, Y144Δ, F157S, L452R, D614G, T859N, and D950H, while B.1.526.2 presents L5F, T95I, D253G, S477N, D614G, and Q957R.
i A large deletion of 7 amino acids between residues 247 and 253 is detected in 63.6% of sequences of this variant.
j The mutation F565L is detected in this variant with a prevalence of about 6.9%.
k An insertion is present at 145/146N in all sequences.
l The mutation G142D is detected in this variant with a prevalence of 43.8%.
m The mutations A262S and P272L can be detected with prevalences of 7.5% and 6.1%, respectively.
n The deletion at positions 69 and 70 is detected in about 71% of sequences.
o A 3-amino-acid insertion at 214TDR is present at a prevalence of 71.3% in this variant.
p The mutations S98F, G769V, and K854N are detected with prevalences of 2.9%, 32.5%, and 8.9%, respectively.
q The mutations R102I, E484K, and P812S are detected in this variant with prevalences of 50.5%, 6.0%, and 5.3%, respectively.
r The mutation Q677H is present with a prevalence of about 28.1%.
s A large deletion of 9 amino acids at residues 136 to 144 and an insertion of 4 amino acids at 679GIAL are present in all sequences.
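
The mutation notation described in the FIG 1 legend (reference amino acid, position, observed amino acid) can be illustrated with a small parser. The following Python code is a minimal sketch and is not taken from the study; the names `Mutation` and `parse_mutation` are hypothetical, and the sketch assumes only simple substitutions (e.g., E484K) and single-residue deletions written with Δ (e.g., Y144Δ). Insertions such as 214TDR are deliberately rejected.

```python
import re
from typing import NamedTuple


class Mutation(NamedTuple):
    """Hypothetical container for one mutation label."""
    ref: str       # reference amino acid in NC_045512.2 (e.g., "E")
    position: int  # residue position in the reference protein (e.g., 484)
    alt: str       # amino acid observed in the sequence, or "Δ" for a deletion


# One uppercase letter, one or more digits, then one uppercase letter or Δ.
_LABEL = re.compile(r"^([A-Z])(\d+)([A-Z]|Δ)$")


def parse_mutation(label: str) -> Mutation:
    """Split a label such as 'E484K' into its three components."""
    match = _LABEL.match(label.strip())
    if match is None:
        raise ValueError(f"unrecognized mutation label: {label!r}")
    ref, position, alt = match.groups()
    return Mutation(ref, int(position), alt)


print(parse_mutation("E484K"))  # Mutation(ref='E', position=484, alt='K')
```

Handling insertions or multi-residue deletions would require a richer grammar than this pattern covers.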
FIG 2
Three-dimensional representation of SARS-CoV-2 spike protein reporting residues characterizing the 4 variants of concern (VOCs). The protein is shown as a gray cartoon. The Alpha B.1.1.7, Beta B.1.351, Gamma P.1, and Delta B.1.617.2 VOCs are represented as magenta, blue, cyan, and forest-green spheres, respectively. The shared mutated residues present in all, 3, and 2 VOCs are reported as red, salmon, and chocolate spheres, respectively.
FIG 3
Mutations underlying the currently circulating variants in the SARS-CoV-2 proteins. Only mutated positions are reported. (A) The different structural and regulatory proteins are depicted. (B) The nonstructural proteins are depicted. The consensus sequence for each variant was defined as nonsynonymous substitutions or deletions that occurred in >75% of sequences within that lineage. Each mutation (such as P323L in the viral polymerase) is indicated by a first letter that is the symbol for the reference amino acid of NC_045512.2 (e.g., P), a number for the amino acid position in the wild-type protein (e.g., 323), and a second letter representing the amino acid actually found in the sequence analyzed (e.g., L). The nomenclatures of the VOCs and some of the VOIs were those reported by WHO and Pango, while the rest of the VOIs and other variants were reported by Pango. Mutations in black refer to the mutations reported by the Nextstrain, Outbreak.info, Pango lineages, and Stanford database websites, while mutations in gray are those that we identified by analyzing entire high-quality viral genome sequences from GISAID (n = 1,223,383).
a A large deletion of 9 amino acids between residues 23 and 31 of ORF6 is detected in all sequences of this variant.
b The mutations G238C in the nucleocapsid protein and G172C in the protein encoded by ORF3a are detected with prevalences of about 30.3% and 30.5%, respectively.
c The mutations S2Y and R203K in the nucleocapsid protein are present with prevalences of about 12.1% and 13.4%, respectively.
d The helicase mutation T588I is detected with a prevalence of about 21.9%.
e The mutation E195D in NS6 is present with a prevalence of about 25.7%.
f The mutation T51I in NS10 is present with a prevalence of about 56.1%.
g The mutations L741F in PL-pro and T599I in Hel are present with prevalences of 6.2% and 2.1%, respectively.
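
The consensus rule stated in the legend (mutations occurring in >75% of sequences within a lineage) can be expressed as a simple prevalence filter. The Python code below is a sketch under stated assumptions, not the authors' pipeline: the function name `consensus_mutations` is hypothetical, each sequence is assumed to be already reduced to a set of mutation labels, and the four-sequence input is toy data invented purely for illustration.

```python
from collections import Counter
from typing import Iterable, Set


def consensus_mutations(
    sequences: Iterable[Iterable[str]], threshold: float = 0.75
) -> Set[str]:
    """Return mutations whose prevalence strictly exceeds `threshold`."""
    # Deduplicate labels within each sequence so each sequence counts once.
    per_sequence = [set(mutations) for mutations in sequences]
    if not per_sequence:
        return set()
    counts = Counter(label for labels in per_sequence for label in labels)
    total = len(per_sequence)
    return {label for label, n in counts.items() if n / total > threshold}


# Toy lineage of four sequences (hypothetical data, not from GISAID).
toy_lineage = [
    {"D614G", "E484K", "P323L"},
    {"D614G", "E484K", "P323L"},
    {"D614G", "E484K", "P323L"},
    {"D614G", "P323L"},
]
print(sorted(consensus_mutations(toy_lineage)))  # ['D614G', 'P323L']
```

In this toy input E484K occurs in exactly 75% of sequences and is therefore excluded, because the legend's criterion is a strict inequality.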
