2021 Jan 26;10(2):91.
doi: 10.3390/biology10020091.

One Year of SARS-CoV-2: How Much Has the Virus Changed?

Santiago Vilar et al. Biology (Basel).

Abstract

Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has caused a worldwide crisis with profound effects on both public health and the economy. To combat the COVID-19 pandemic, research groups have shared viral genome sequence data through the Global Initiative on Sharing All Influenza Data (GISAID). Over the past year, ≈290,000 full SARS-CoV-2 proteome sequences have been deposited in GISAID. Here, we used these sequences to assess the rate of nonsynonymous mutants over the entire viral proteome. Our analysis shows that SARS-CoV-2 proteins are mutating at substantially different rates, with most of the viral proteins exhibiting little mutational variability. As anticipated, our calculations capture previously reported mutations that arose in the first months of the pandemic, such as D614G (Spike), P323L (NSP12), and R203K/G204R (Nucleocapsid), but they also identify more recent mutations, such as A222V and L18F (Spike) and A220V (Nucleocapsid), among others. Our comprehensive temporal and geographical analyses show two distinct periods with different proteome mutation rates: December 2019 to July 2020 and August to December 2020. Notably, some mutation rates differ by geography, primarily during the latter half of 2020 in Europe. Furthermore, our structure-based molecular analysis provides an exhaustive assessment of SARS-CoV-2 mutation rates in the context of the current set of 3D structures available for SARS-CoV-2 proteins. This emerging sequence-to-structure insight is beginning to illuminate the site-specific mutational (in)tolerance of SARS-CoV-2 proteins as the virus continues to spread around the globe.

Keywords: 3D proteins; COVID-19; SARS-CoV-2; mutations; proteome; sequence.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Figure 1
Mutation rates (MRs) in the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) proteome. (A) Proteome-wide analysis of the observed mutation rate range for 27 SARS-CoV-2 proteins. The range for each protein is calculated as the difference between the highest and the lowest residue MR. Red labels correspond to proteins with a range > 0.50. (B) Select examples of high-frequency mutating SARS-CoV-2 proteins, showing the main mutation rates at residues D614 (S), A222 (S), L18 (S), P323 (NSP12), R203 (N), G204 (N), A220 (N), G50 (NS9c), and L67 (NS9c). The standard deviation of the mutation rates is plotted. A comprehensive analysis of the mutation rates for the rest of the SARS-CoV-2 proteins is available in Figure S1 of the Supplementary Material. MRs were calculated using November–December data.
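The per-protein range described in the Figure 1 caption can be illustrated with a minimal Python sketch. It assumes the per-residue MRs are already available as a list of fractions for each protein; the function names, the protein labels, and the numeric values are hypothetical and are not taken from the paper.

```python
from typing import Dict, List

# Threshold from the caption: proteins with a range > 0.50 are labeled in red.
RANGE_THRESHOLD = 0.50


def mr_range(residue_mrs: List[float]) -> float:
    """Return the difference between the highest and lowest residue MR."""
    if not residue_mrs:
        raise ValueError("need at least one residue mutation rate")
    return max(residue_mrs) - min(residue_mrs)


def high_range_proteins(
    per_protein: Dict[str, List[float]],
    threshold: float = RANGE_THRESHOLD,
) -> List[str]:
    """List the proteins whose MR range exceeds the threshold."""
    return [name for name, mrs in per_protein.items() if mr_range(mrs) > threshold]


# Made-up per-residue MRs, for illustration only (not data from the paper).
example = {
    "protein_A": [0.00, 0.02, 0.95],
    "protein_B": [0.00, 0.01, 0.03],
}

if __name__ == "__main__":
    for name, mrs in example.items():
        print(name, round(mr_range(mrs), 3))
    print("range > 0.50:", high_range_proteins(example))
```

With these made-up inputs, only `protein_A` exceeds the 0.50 threshold, because a single high-MR residue is enough to produce a large range.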
Figure 2
Temporal emergence of SARS-CoV-2 mutations. (A) Running temporal average of SARS-CoV-2 proteome variation relative to December 2019. (B) Select temporal counts of SARS-CoV-2 variation rates for the high frequency mutating residues in the Spike, NSP12, Nucleocapsid, and NS9c proteins.
Figure 3
Temporal worldwide mutation rate (MR) analysis for the complete SARS-CoV-2 proteome (A) and the high-frequency mutating residues (B): D614 in the Spike (correlated data for P323 in the NSP12), A222 in the Spike (correlated data for A220 in the Nucleocapsid and for L67 in the NS9c), L18 in the Spike, and R203/G204 in the Nucleocapsid (correlated data for G50 in the NS9c). A minimum threshold of five sequences was considered in the world plots.
Figure 4
Three-dimensional (3D) protein structures colored by residue mutation rates: Spike, Nucleocapsid, RdRp (NSP12), Endoribonuclease (NSP15), NSP16-NSP10 heterodimer, Mpro (NSP5), NSP9, ADP ribose phosphatase (NSP3), and Papain-like protease (PLpro, NSP3). Proteins are represented as white ribbons (MRs < 0.01) with color-coded residues (cyan: MRs = 0.01–0.025, green: MRs = 0.025–0.05, magenta: MRs = 0.05–0.10, red: MRs > 0.20). No residues with MR values between 0.10 and 0.20 were present in the crystallized structures shown.
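The color scheme in the Figure 4 caption amounts to a simple binning of residue MRs, sketched below in Python. The caption does not say which bin a boundary value such as 0.025 belongs to, so the half-open intervals used here are an assumption, as is the function name `mr_color`. The caption assigns no color to MRs between 0.10 and 0.20, so the sketch returns `None` for that interval.

```python
from typing import Optional


def mr_color(mr: float) -> Optional[str]:
    """Map a residue mutation rate to the color named in the caption.

    Boundary handling is an assumption: each bin includes its lower
    bound and excludes its upper bound, except magenta, which also
    includes 0.10.
    """
    if mr < 0:
        raise ValueError("mutation rate cannot be negative")
    if mr < 0.01:
        return "white"    # ribbon color, MR < 0.01
    if mr < 0.025:
        return "cyan"     # 0.01-0.025
    if mr < 0.05:
        return "green"    # 0.025-0.05
    if mr <= 0.10:
        return "magenta"  # 0.05-0.10
    if mr > 0.20:
        return "red"      # MR > 0.20
    return None           # 0.10-0.20: no color given in the caption


if __name__ == "__main__":
    for value in (0.005, 0.03, 0.15, 0.5):
        print(value, mr_color(value))
```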
