Infect Genet Evol. 2021 Aug:92:104831.
doi: 10.1016/j.meegid.2021.104831. Epub 2021 Mar 31.

Molecular epidemiology analysis of early variants of SARS-CoV-2 reveals the potential impact of mutations P504L and Y541C (NSP13) in the clinical COVID-19 outcomes

Canhui Cao et al. Infect Genet Evol. 2021 Aug.

Abstract

Since severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has caused a global pandemic with alarming speed, comprehensively analyzing the mutation and evolution of early SARS-CoV-2 strains contributes to detecting and preventing such viruses. Here, we explored 1962 high-quality genomes of early SARS-CoV-2 strains obtained from 42 countries before April 2020. The changing trends of genetic variations in SARS-CoV-2 strains over time and by country were subsequently identified. In addition, viral genotype mapping and phylogenetic analysis were performed to identify the variation features of SARS-CoV-2. Results showed that 57.89% of genetic variations were located in ORF1ab, most of which (68.85%) were nonsynonymous. Haplotype maps and phylogenetic tree analysis showed that amino acid variations in ORF1ab (p.5828P > L and p.5865Y > C, also known as NSP13: P504L and NSP13: Y541C) were important characteristics of an enriched clade. Furthermore, these variants showed more significant aggregation in the United States (P = 2.92E-66, 95%) than in Australia or Canada, especially in strains from Washington State (P = 1.56E-23, 77.65%). Further analysis demonstrated that the report date of the variants was associated with the date of increased infections and with the dates on which the recovery and fatality rates changed in the United States. More importantly, Washington State had a higher fatality rate (4.13%) and poorer outcomes (P = 4.12E-21 in fatality rate, P = 3.64E-29 in death and recovered cases) than other states containing a small proportion of strains with such variants. Using sequence alignment, we found that variations at the 504 and 541 sites had functional effects on NSP13. In this study, we comprehensively analyzed genetic variations in SARS-CoV-2, gaining insights into amino acid variations in ORF1ab and COVID-19 outcomes.
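The abstract names each variant in two coordinate systems (ORF1ab p.5828/p.5865 and NSP13 504/541), and the figure captions give the corresponding nucleotide loci (17747 and 17858). The following is a minimal sketch of how these numbers relate, not the authors' pipeline. It assumes the ORF1ab coordinates of the reference genome NC_045512.2 (start at nucleotide 266, a -1 ribosomal frameshift that re-reads nucleotide 13468, end at 21555), which the abstract does not state. The NSP13 offset of 5324 is inferred from the abstract's own pairs (5828 - 504 = 5865 - 541 = 5324). The function names are hypothetical.

```python
# Assumed reference coordinates (NC_045512.2); not stated in the abstract.
ORF1AB_START = 266        # first nucleotide of the ORF1ab start codon
FRAMESHIFT_SITE = 13468   # nucleotide read twice because of the -1 frameshift
ORF1AB_END = 21555        # last nucleotide of ORF1ab

# Offset inferred from the abstract: 5828 - 504 == 5865 - 541 == 5324.
NSP13_OFFSET = 5324


def orf1ab_residue(nt_pos: int) -> int:
    """Map a genomic nucleotide position to a 1-based ORF1ab residue number."""
    if not ORF1AB_START <= nt_pos <= ORF1AB_END:
        raise ValueError(f"position {nt_pos} is outside ORF1ab")
    if nt_pos < FRAMESHIFT_SITE:
        # ORF1a part: plain in-frame codon counting.
        return (nt_pos - ORF1AB_START) // 3 + 1
    # ORF1b part: translation restarts its frame at FRAMESHIFT_SITE.
    # The frameshift nucleotide itself is assigned to the ORF1b codon here.
    orf1a_codons = (FRAMESHIFT_SITE - ORF1AB_START + 1) // 3
    return orf1a_codons + (nt_pos - FRAMESHIFT_SITE) // 3 + 1


def nsp13_residue(orf1ab_pos: int) -> int:
    """Convert an ORF1ab residue number to NSP13 numbering."""
    pos = orf1ab_pos - NSP13_OFFSET
    if pos < 1:
        raise ValueError(f"ORF1ab residue {orf1ab_pos} precedes NSP13")
    return pos


if __name__ == "__main__":
    for locus in (17747, 17858):
        aa = orf1ab_residue(locus)
        print(f"nt {locus} -> ORF1ab p.{aa} -> NSP13 {nsp13_residue(aa)}")
```

Under these assumptions, locus 17747 maps to ORF1ab residue 5828 (NSP13 504) and locus 17858 to ORF1ab residue 5865 (NSP13 541), matching the numbering used in the abstract and captions.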

Keywords: Amino acid variations; COVID-19; Genetic variations; ORF1ab; SARS-CoV-2 strains.

Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1
Mutational landscape of genetic variations in SARS-CoV-2 strains. Genetic variation heatmap of SARS-CoV-2 strains over time (A) and by country (B). Strains from different time points and different countries (partial) are indicated in the figure. Each vertical line marks a mutation locus; yellow arrows indicate significant genetic variations. (C) Sankey diagram of coding regions in SARS-CoV-2 strains. Nonsynonymous and synonymous variations are shown on top; each gene region of SARS-CoV-2 is indicated on the bottom.
Fig. 2
Characteristics of genetic variations in SARS-CoV-2 strains. (A) Number of strains at each locus of the genetic variations. All gene regions of SARS-CoV-2 are compiled in a histogram, with the top loci labeled. (B) Variation dynamic curve for the occurrence of genetic variations in different countries. Significant loci (ORF1ab: 3037, S: 23403, ORF1ab: 14408, ORF1ab: 8782, ORF8: 28114, ORF1ab: 17858, ORF1ab: 17747) are displayed by detection time and country.
Fig. 3
Variation characteristics of SARS-CoV-2 strains. (A) Viral genotyping maps for haplotypes by country; the gray circle indicates haplotype H (140). Strains from different countries are indicated in the figure; the purple circle indicates the haplotype containing ORF1ab variations at loci 17747 and 17858. (B) Phylogenetic tree across countries based on genome variation of SARS-CoV-2 strains downloaded from the 2019nCoVR dataset with default settings. Strains from different countries are indicated in the figure; the blue square indicates the enriched clade. (C) Venn diagram of strains with haplotype H (140) and the enriched clade. Number of merged strains = 140.
Fig. 4
ORF1ab variations (NSP13: P504L and NSP13: Y541C) in SARS-CoV-2. (A) Phylogenetic tree of strains downloaded from Nextstrain. Strains are colored by amino acid variations in ORF1ab (p.5828P > L or p.5865Y > C) (yellow) or not. (B) Pie chart of variants from Iceland, Australia, and the United States. (C) Column chart of variants from the United States, Iceland, and Australia, with number of strains labeled. (D) Distribution of 271 variants in ORF1ab (p.5828P > L and p.5865Y > C). (E) Sankey diagram of strains from Washington, Utah, Minnesota, Wisconsin, and California states, with strain number of each state labeled. (F) Distribution of variants in ORF1ab (p.5828P > L and p.5865Y > C) or not in Washington State.
Fig. 5
COVID-19 outcomes in states with different proportions of strains containing variation in NSP13: P504L and NSP13: Y541C. (A) Variation frequency curve of ORF1ab p.5828P > L (NSP13: P504L) and ORF1ab p.5865Y > C (NSP13: Y541C); the black line chart indicates variation frequency and the green histogram indicates SARS-CoV-2 strains. (B) Infection, recovery, and fatality rates in the United States; the blue histogram indicates infection cases, the green line chart indicates recovery rate, and the gray line chart indicates fatality rate. (C) Infection, recovery, and fatality rates in China and other countries. (D) Infection and fatality rates in different states in the United States; the blue histogram indicates infection cases and the gray histogram indicates fatality rate. (E) Infection, fatality, and recovery rates in different states; the blue histogram indicates infection cases, the gray histogram indicates fatality rate, and the green histogram indicates recovery rate. (F) Infection and fatality rates in Washington State counties.
Fig. 6
Effects of NSP13: P504L and NSP13: Y541C on NSP13 function. (A) Sequence alignment of SARS-CoV-2 and other coronaviruses, with amino acid sequences around the ORF1ab variations aligned. (B) Results of protein secondary structure prediction for the ORF1ab variant (QHD43415.1: p.5828P > L and p.5865Y > C) and ORF1ab (QHD43415.1), with α-helix, β-sheet, and β-turn structures of the variant and QHD43415.1 displayed. Prediction results for the ORF1ab variations (NSP13: P504L and NSP13: Y541C) based on PolyPhen-2 (C) and PROVEAN v1.1 (D).
