Virol J. 2022 Dec 18;19(1):220.
doi: 10.1186/s12985-022-01951-7.

Mutations in SARS-CoV-2 structural proteins: a global analysis

Mohammad Abavisani et al. Virol J. 2022.

Abstract

Background: The emergence of new variants, mainly variants of concern (VOCs), is caused by mutations in the main structural proteins of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). Therefore, we aimed to investigate mutations in the structural proteins of SARS-CoV-2 globally.

Methods: We analyzed samples of amino-acid sequences (AASs) of the envelope (E), membrane (M), nucleocapsid (N), and spike (S) proteins collected from the declaration of coronavirus disease 2019 (COVID-19) as a pandemic to January 2022. The presence and location of mutations were then investigated by aligning the sequences to the reference sequence and categorizing them by frequency and continent. Finally, human genes related to the viral structural genes were identified, and their interactions were reported.
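
As a rough illustration of the alignment-and-counting step described above, the following Python sketch calls substitutions (in the D614G style used in this abstract) from sequences that are already aligned to a reference, then reports each mutation as a percentage of samples. It is a minimal sketch under stated assumptions, not the authors' pipeline: the function names, the toy sequences, and the choice to ignore gaps and ambiguous residues are all hypothetical, and the reference is assumed to be gap-free so that alignment columns equal reference positions.

```python
from collections import Counter

# Symbols treated as "no call" in a sample (assumption: gap and unknown residue).
SKIPPED = {"-", "X"}


def call_substitutions(reference: str, sample: str) -> list[str]:
    """Return substitutions such as 'T3I' for a sample aligned to the reference.

    Assumes both strings have the same length and the reference has no gaps.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    calls = []
    for position, (ref_aa, alt_aa) in enumerate(zip(reference, sample), start=1):
        # Ignore matching residues, gaps, and ambiguous residues.
        if alt_aa == ref_aa or alt_aa in SKIPPED:
            continue
        calls.append(f"{ref_aa}{position}{alt_aa}")
    return calls


def mutation_frequencies(reference: str, samples: list[str]) -> dict[str, float]:
    """Return the percentage of samples carrying each substitution."""
    counts = Counter()
    for sample in samples:
        counts.update(call_substitutions(reference, sample))
    return {mutation: 100.0 * n / len(samples) for mutation, n in counts.items()}


if __name__ == "__main__":
    toy_reference = "MKTAYIAK"  # hypothetical sequence, not a real viral protein
    toy_samples = ["MKTAYIAK", "MKIAYIAK", "MKIAYIAR", "MK-AYIAK"]
    print(mutation_frequencies(toy_reference, toy_samples))
```

Grouping by continent would amount to running `mutation_frequencies` once per group of samples; how the study handled insertions, deletions, and ambiguous residues is not stated in the abstract.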

Results: The highest relative mutation frequencies in the E, M, N, and S AASs occurred in the regions of 7 to 14, 66 to 88, 164 to 205, and 508 to 635 AAs, respectively. The most frequent mutations in the E, M, N, and S proteins were T9I, I82T, R203M/R203K, and D614G, respectively. D614G was the most frequent mutation in all six geographical areas. Following D614G, L18F, A222V, E484K, and N501Y, in that order, were the most frequent mutations in the S protein globally. In addition, A-kinase Anchoring Protein 8 Like (AKAP8L) was identified as the linkage unit between the M, E, and N cluster genes.

Conclusion: Screening structural protein mutations can help scientists design better drug and vaccine development strategies.

Keywords: COVID-19; Evolutionary analysis; Genome-wide mutations; Mutations; SARS-CoV-2.

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Fig. 1
Genomic positions of the four structural proteins of SARS-CoV-2. The E gene spans positions 26,245 to 26,472, and the M, N, and S genes span positions 26,523 to 27,191, 28,274 to 29,533, and 21,563 to 25,384, respectively
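
The coordinates in the Fig. 1 caption can be turned into protein lengths with simple arithmetic, which gives a quick check that the hot-spot regions reported in the Results fall inside each protein. The Python sketch below assumes the coordinates are 1-based and inclusive and that each range ends with a stop codon; neither assumption is stated in the caption, and the names `GENE_COORDINATES` and `protein_length_aa` are illustrative.

```python
# Genomic coordinates as given in the Fig. 1 caption (start, end).
GENE_COORDINATES = {
    "E": (26245, 26472),
    "M": (26523, 27191),
    "N": (28274, 29533),
    "S": (21563, 25384),
}


def gene_length_nt(start: int, end: int) -> int:
    """Length in nucleotides, assuming 1-based inclusive coordinates."""
    return end - start + 1


def protein_length_aa(start: int, end: int) -> int:
    """Length in amino acids, assuming the range includes one stop codon."""
    length_nt = gene_length_nt(start, end)
    if length_nt % 3 != 0:
        raise ValueError("coding region length is not a multiple of three")
    return length_nt // 3 - 1  # subtract the stop codon


if __name__ == "__main__":
    for gene, (start, end) in GENE_COORDINATES.items():
        print(gene, gene_length_nt(start, end), protein_length_aa(start, end))
    # E 228 75
    # M 669 222
    # N 1260 419
    # S 3822 1273
```

Under these assumptions the S protein is 1273 amino acids long, so the reported 508 to 635 region and mutations such as V1176F lie within it.
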
Fig. 2
Pie chart plot of the number of mutations among E and M proteins of SARS-CoV-2 up to January 2022. The incidence rate of one, two, three, four, and more mutations, in addition to the rate of lack of any mutation among these proteins, has been displayed in the clusters A and B
Fig. 3
Pie chart plot of the number of mutations among N and S proteins of SARS-CoV-2 up to January 2022. The frequency of one, two, three, four, and more mutations, as well as the absence of any mutation, has been shown in the A and B clusters
Fig. 4
Heat map of mutations in the E and M proteins of SARS-CoV-2 as of January 2022, showing the mutation rate per 100 amino acids. The highest mutation frequencies in the E and M AASs occurred in the regions of 7 to 14 AA and 66 to 88 AA, respectively
Fig. 5
The heat map of mutations among the N and S proteins of SARS-CoV-2 as of January 2022. The regions with the highest frequency of mutations among the N and S AASs were 164 to 205 AA and 508 to 635 AA, respectively
Fig. 6
Top ten mutations in E and M with the highest frequency worldwide and by geographic area. The positions of altered and substituted AAs are shown differently based on the frequency percentage of the substituted AA. For better representation, the data are shown on a base-10 logarithmic scale. Overall, T9I, P71L, V62F, L21F/L21V, V58F, L73F, S68F, S55F, V49L, and A41V were the ten most frequent mutations in E AASs. I82T, D3G, A63T, Q19E, A2S, V70L, A81S, F28L, S197N, and T30I were the ten most frequent mutations in M AASs globally
Fig. 7
Top ten mutations in N and S with the highest frequency worldwide and by geographic area. R203M/R203K, D377Y, D63G, G215C, G204R/G204P, D3L, S235F, Q9L, A220V, and P199L rank first to tenth in frequency for N AASs worldwide, and the ten most frequent mutations in S AASs were D614G, L18F, A222V, E484K, N501Y, V1176F, T1027I, D138Y, P26S, and T20N globally
Fig. 8
Locations of the three most frequent mutations in the E protein (A), M protein (B), N protein (C), and S protein (D). These mutations are among the most frequent mutations overall
Fig. 9
Timeline of reported mutations and evolutionary trends of the ten most frequent mutations in the E and M proteins in North America, South America, Europe, Asia, Oceania, and Africa, and globally
Fig. 10
Timeline for reporting the top ten high-rate mutations in the N and S proteins of SARS-CoV-2 across a variety of continents, including North America, South America, Europe, Asia, Oceania, and Africa
Fig. 11
The visualization of the PPI network with 57 nodes and 153 edges
Fig. 12
Hub genes were identified by node-ranking analysis. The human gene AKAP8L links the M protein gene cluster with the E and N genes of SARS-CoV-2. Moreover, ZDHHC5 and GOLGA7 showed strong interactions with the S protein of SARS-CoV-2.
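
The captions of Figs. 11 and 12 describe a network of 57 nodes and 153 edges whose hub genes were found by node ranking. The abstract does not say which ranking measure was used, so the Python sketch below uses plain degree (number of distinct neighbours) as an assumed stand-in and runs it on a hypothetical toy edge list with placeholder node names, not on the study's interaction data. It also derives two summary statistics, mean degree and density, from the reported node and edge counts.

```python
from collections import defaultdict


def degree_ranking(edges: list[tuple[str, str]]) -> list[tuple[str, int]]:
    """Rank nodes of an undirected graph by degree, highest first.

    Ties are broken alphabetically; self-loops and duplicate edges are ignored.
    """
    neighbours = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue
        neighbours[a].add(b)
        neighbours[b].add(a)
    return sorted(
        ((node, len(linked)) for node, linked in neighbours.items()),
        key=lambda item: (-item[1], item[0]),
    )


def mean_degree(n_nodes: int, n_edges: int) -> float:
    """Average number of edges per node in an undirected graph."""
    return 2 * n_edges / n_nodes


def network_density(n_nodes: int, n_edges: int) -> float:
    """Fraction of all possible undirected edges that are present."""
    return 2 * n_edges / (n_nodes * (n_nodes - 1))


if __name__ == "__main__":
    # Hypothetical toy graph with placeholder names.
    toy_edges = [("hub", "a"), ("hub", "b"), ("hub", "c"), ("a", "b")]
    print(degree_ranking(toy_edges))
    # Summary statistics for the reported 57 nodes and 153 edges.
    print(round(mean_degree(57, 153), 2), round(network_density(57, 153), 4))
```

For the reported network size, each node has about 5.37 interactions on average and roughly 9.6% of the possible edges are present. Other centrality measures (betweenness, closeness) could rank nodes differently.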
