Genes Dis. 2020 Dec;7(4):567-577.
doi: 10.1016/j.gendis.2020.05.006. Epub 2020 Jun 2.

Amino acid variation analysis of surface spike glycoprotein at 614 in SARS-CoV-2 strains

Canhui Cao et al. Genes Dis. 2020 Dec.

Abstract

As severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) continues to disperse globally with worrisome speed, identifying amino acid variations in the virus could help to understand its characteristics. Here, we studied 489 SARS-CoV-2 genomes obtained from 32 countries from the Nextstrain database and performed phylogenetic tree analysis by clade, country, and genotype of the surface spike glycoprotein (S protein) at site 614. We found that virus strains from mainland China were mostly distributed in Clade B and the undefined clade in the phylogenetic tree, with very few found in Clade A. In contrast, Clades A2 (one case) and A2a (112 cases) predominantly contained strains from European regions. Moreover, strains in Clades A2 and A2a differed significantly from those of mainland China in the age of the infected population (P = 0.0071, mean age 40.24 to 46.66), although no such difference existed between the US and mainland China. Further analysis demonstrated that the variation of the S protein at site 614 (QHD43416.1: p.614D>G) was a characteristic of strains in Clades A2 and A2a. Importantly, this variation was predicted to have neutral or benign effects on the function of the S protein. In addition, global quality estimates and 3D protein structures tended to differ between the two S proteins. In summary, we identified different genomic epidemiology among SARS-CoV-2 strains in different clades, especially regarding an amino acid variation of the S protein at site 614, revealing potential viral genome divergence in SARS-CoV-2 strains.

Keywords: ACE2; COVID-19; Phylogenetic tree; SARS-CoV-2; Surface spike glycoprotein.

Conflict of interest statement

The authors declare that they have no conflict of interest.

Figures

Figure 1
Different SARS-CoV-2 clades among countries in the phylogenetic tree. (A) Phylogenetic tree of 489 SARS-CoV-2 genomes from Nextstrain; cases are colored by country. Branch labels are clades. (B) Clade distribution of 489 SARS-CoV-2 genomes on the world map from Nextstrain, colored by clade.
Figure 2
Amino acid variation of the S protein at site 614 in Clade A2 and A2a SARS-CoV-2 strains. (A) Radial phylogenetic tree of 489 SARS-CoV-2 genomes from Nextstrain; cases are colored by the amino acid of the S protein at site 614. Green: aspartic acid (D), yellow: glycine (G). Branch labels are clades. (B) Rectangular phylogenetic tree of 489 SARS-CoV-2 genomes from Nextstrain; cases are colored by the amino acid of the S protein at site 614. Green: aspartic acid (D), yellow: glycine (G). Branch labels are the amino acid variations of SARS-CoV-2 proteins. (C) Diversity of the S protein of SARS-CoV-2.
Figure 3
Variation in the S protein at site 614 is predicted not to affect protein function. (A) Prediction results for the variation of the S protein at site 614. (B) Partial multiple sequence alignment results; amino acids surrounding the variation position (614) are shown.
Figure 4
Protein modeling estimates of QHD43416.1 and QHD43416.1: p.614D>G. (A) Amino acids of QHD43416.1 and QHD43416.1: p.614D>G surrounding site 614. (B) Protein modeling estimate results for QHD43416.1 and QHD43416.1: p.614D>G from the SWISS-MODEL server.
Figure 5
Three-dimensional (3D) protein structure models of QHD43416.1 and QHD43416.1: p.614D>G. (A) 3D protein structures of QHD43416.1 and QHD43416.1: p.614D>G generated by the SWISS-MODEL server. (B) Zoomed-in view of the region around site 614 of QHD43416.1 and QHD43416.1: p.614D>G generated by the SWISS-MODEL server.
