Biochim Biophys Acta Proteins Proteom. 2021 Oct;1869(10):140693.
doi: 10.1016/j.bbapap.2021.140693. Epub 2021 Jul 5.

In silico analysis of the aggregation propensity of the SARS-CoV-2 proteome: Insight into possible cellular pathologies

Manuel Flores-León et al. Biochim Biophys Acta Proteins Proteom. 2021 Oct.

Abstract

The SARS-CoV-2 virus causes coronavirus disease 2019 (COVID-19), which emerged as a pandemic in 2020. The pandemic triggered turmoil in public health and is having a tremendous social and economic impact around the globe. Upon entry into host cells, the SARS-CoV-2 virus hijacks cellular machineries to produce and maintain its own proteins, spreading the infection. Although the disease is known for prominent respiratory symptoms, accumulating evidence also demonstrates the involvement of the central nervous system, with possible mid- and long-term neurological consequences. In this study, we conducted a detailed bioinformatic analysis of the aggregation propensity of the SARS-CoV-2 proteome using several complementary computational tools. Our study identified 10 aggregation-prone proteins in the reference SARS-CoV-2 strain: the non-structural proteins Nsp4, Nsp6 and Nsp7, as well as ORF3a, ORF6, ORF7a, ORF7b, ORF10, CovE and CovM. By searching the available mutants of each protein, we found that most proteins are conserved, whereas ORF3a and ORF7b are variable and characterized by a large number of mutants with increased aggregation propensity. The geographical distribution of the mutants revealed differences in the localization of the aggregation-prone mutants of each protein. Aggregation-prone mutants of ORF7b were found in 7 European countries, whereas those of ORF3a were found in only 2. Aggregation-prone sequences of ORF7b, but not of ORF3a, were identified in Australia, India, Nepal, China, and Thailand. Our results provide a basis for future analyses of a possible correlation between the aggregation propensity of SARS-CoV-2 proteins and higher transmissibility and infection, as well as the presence of neurological symptoms.

Keywords: Bioinformatics; Protein aggregation; Proteostasis; SARS-CoV-2.

Conflict of interest statement

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Figures

Fig. 1
Workflow of the study. Aggregation-prone proteins were predicted in the SARS-CoV-2 reference strain. Among all mutant sequences available in August 2020, mutations that increase the aggregation propensity were identified and mapped according to their geographical location.
Fig. 2
Aggregation probability calculated using PASTA2.0. The graphs display the probability that each amino acid is part of an aggregation-prone region (black curve) or part of a disordered region for: a) each SARS-CoV-2 aggregation-prone protein; b) the positive and negative control proteins used in the study.
Fig. 3
CamSol-based analyses of protein solubility. Each graph displays the sequence-based solubility profile of a query protein. The parts of the curve corresponding to the most soluble protein fragments are shown in blue, and the regions of low solubility are shown in red for: a) the SARS-CoV-2 aggregation-prone proteins; and b) the positive and negative control proteins used in the study. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
Fig. 4
AmyloGram-based calculations of aggregation propensity. The complete SARS-CoV-2 proteome and the positive and negative control proteins are displayed. The non-continuous line marks the threshold (>0.85) considered as “high propensity to aggregate”. The beta-amyloid peptide has the highest value, consistent with its propensity to aggregate.
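The threshold rule described in the Fig. 4 caption can be sketched in a few lines. This is only an illustration, assuming each protein has already been assigned a single score between 0 and 1; the function name, the protein labels and the example scores are hypothetical and are not values from the study.

```python
# Threshold stated in the Fig. 4 caption (scores above it count as
# "high propensity to aggregate").
HIGH_PROPENSITY_THRESHOLD = 0.85


def flag_high_propensity(scores, threshold=HIGH_PROPENSITY_THRESHOLD):
    """Return the sorted names of proteins whose score is strictly above the threshold."""
    return sorted(name for name, score in scores.items() if score > threshold)


# Hypothetical scores for illustration only; not data from the paper.
example_scores = {"protein_A": 0.91, "protein_B": 0.85, "protein_C": 0.40}
print(flag_high_propensity(example_scores))
```

A score exactly equal to 0.85 is not flagged here, because the caption gives the threshold as ">0.85".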
Fig. 5
Aggregation propensity calculated using TANGO. Graphs display the percentage of aggregation, signifying the probability of aggregation throughout the queried amino acid sequence for: a) SARS-CoV-2 proteins characterized by high aggregation propensity; b) positive and negative control proteins used in the study.
Fig. 6
Hierarchical clustering heatmap of the sequence variants from different geographical locations based on aggregation propensity. The sum, mean and median of the log2 of their normalized aggregation scores were used to evaluate their aggregation propensity and to cluster them. Green indicates an increase and red a decrease in aggregation propensity.
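The summary statistics named in the Fig. 6 caption can be illustrated as follows. This is a minimal sketch under a stated assumption: "normalized" is taken here to mean the variant's per-position score divided by the reference score at the same position, which the caption does not specify. The function name and the example numbers are hypothetical.

```python
import math
import statistics


def log2_ratio_summary(variant_scores, reference_scores):
    """Summarize log2(variant / reference) ratios by their sum, mean and median.

    Assumption: normalization is a position-wise ratio to the reference
    profile; all scores must be positive so that log2 is defined.
    """
    if len(variant_scores) != len(reference_scores):
        raise ValueError("score profiles must have the same length")
    if any(s <= 0 for s in list(variant_scores) + list(reference_scores)):
        raise ValueError("scores must be positive to take log2 of their ratio")
    log_ratios = [math.log2(v / r) for v, r in zip(variant_scores, reference_scores)]
    return {
        "sum": sum(log_ratios),
        "mean": statistics.mean(log_ratios),
        "median": statistics.median(log_ratios),
    }


# Hypothetical profiles for illustration only; not data from the paper.
summary = log2_ratio_summary([2.0, 4.0, 1.0], [1.0, 1.0, 1.0])
print(summary)
```

Under this assumption, a positive value means the variant scores higher than the reference and a negative value means it scores lower, which corresponds to the increase and decrease shown in the heatmap.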
Fig. 7
Geographical distribution of variants of (A) ORF7b and (B) ORF3a characterized by increased aggregation propensity. The locations of the SARS-CoV-2 variants are shown as percentages, marked by different colors (key, right side), in each reported country.
Fig. 8
ORF3a variant structural analysis. a) COBALT sequence alignment analysis of the mutant variants vs. wild type. The bright red boxes represent the mutation in each variant and the light red boxes represent the residue that was changed in another variant. Below the sequence, the important characteristics of the secondary structure are shown; the blue boxes are alpha-helices, the green boxes represent the beta-strands and the yellow boxes the transmembrane regions. b) Secondary structure of the wild-type ORF3a protein reported in the PDB (6XDC). c) Area under the curve analysis of the aggregation score of each amino acid from each ORF3a variant. Black arrows point towards the peaks that showed a higher increase in the aggregation score compared to the wild type. d) Aggrescan 3D 2.0 reconstruction of the variant structures. Yellow circles show where the mutated amino acid is located; black circles show the important changes in the aggregation-prone regions. e) CoVex protein-protein interaction network analysis. The red circle is the viral protein; the blue circles are the host proteins. The interaction prediction tool is based on the human cell line HEK-293T.
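The area under the curve comparison mentioned in panel c of Fig. 8 can be sketched as below. The caption does not state how the area was computed, so this example assumes the trapezoidal rule with unit spacing between residues; the function names and the score profiles are hypothetical.

```python
def area_under_curve(scores):
    """Trapezoidal area under a per-residue score profile, assuming unit spacing."""
    return sum((left + right) / 2.0 for left, right in zip(scores, scores[1:]))


def auc_change(variant_scores, wild_type_scores):
    """Difference in area between a variant profile and the wild-type profile."""
    return area_under_curve(variant_scores) - area_under_curve(wild_type_scores)


# Hypothetical per-residue aggregation scores; not data from the paper.
wild_type = [0.0, 2.0, 2.0, 0.0]
variant = [0.0, 2.0, 4.0, 0.0]
print(area_under_curve(wild_type))
print(auc_change(variant, wild_type))
```

A positive change indicates that the variant profile encloses more area than the wild type, as would be expected for a peak that rises after a mutation.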
Fig. 9
Proteostasis imbalance due to SARS-CoV-2 aggregation-prone proteins. a) In our study, we identified 10 proteins that showed an increase in the predicted aggregation propensity: the non-structural proteins Nsp4, Nsp6 and Nsp7, as well as ORF3a, ORF6, ORF7a, ORF7b, ORF10, CovE and CovM. b) A common consequence of protein aggregation is impairment of the protein degradation systems. This extra burden on the degradation systems can promote additional protein accumulation, ultimately causing a proteostasis imbalance. This may result in the accumulation of viral proteins that can engage in promiscuous interactions with other proteins, and may increase the virulence of the virus.
