BMC Genom Data. 2022 Apr 4;23(1):25.
doi: 10.1186/s12863-022-01044-y.

Predicted coronavirus Nsp5 protease cleavage sites in the human proteome

Benjamin M Scott et al. BMC Genom Data. 2022.

Abstract

Background: The coronavirus nonstructural protein 5 (Nsp5) is a cysteine protease required for processing the viral polyprotein and is therefore crucial for viral replication. Nsp5 from several coronaviruses has also been found to cleave host proteins, disrupting molecular pathways involved in innate immunity. Nsp5 from the recently emerged SARS-CoV-2 virus interacts with and can cleave human proteins, which may be relevant to the pathogenesis of COVID-19. Given the continuing global pandemic and the emerging understanding of coronavirus Nsp5-human protein interactions, we set out to predict which human proteins are cleaved by the coronavirus Nsp5 protease using a bioinformatics approach.

Results: Using a previously developed neural network trained on coronavirus Nsp5 cleavage sites (NetCorona), we made predictions of Nsp5 cleavage sites in all human proteins. Structures of human proteins in the Protein Data Bank containing a predicted Nsp5 cleavage site were then examined, generating a list of 92 human proteins with a highly predicted and accessible cleavage site. Of those, 48 are expected to be found in the same cellular compartment as Nsp5. Analysis of this targeted list of proteins revealed molecular pathways susceptible to Nsp5 cleavage and therefore relevant to coronavirus infection, including pathways involved in mRNA processing, cytokine response, cytoskeleton organization, and apoptosis.

Conclusions: This study combines predictions of Nsp5 cleavage sites in human proteins with protein structure information and protein network analysis. We predicted cleavage sites in proteins recently shown to be cleaved in vitro by SARS-CoV-2 Nsp5, and we discuss how other potentially cleaved proteins may be relevant to coronavirus-mediated immune dysregulation. The data presented here will assist in the design of more targeted experiments to determine the role of coronavirus Nsp5 cleavage of host proteins, which is relevant to understanding the molecular pathology of coronavirus infection.

Keywords: 3CLpro; COVID-19; Coronavirus; Human proteins; Human proteome; Mpro; Nsp5; Protease; SARS-CoV-2.

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Fig. 1
(a) The SARS-CoV-2 polyproteins pp1a and pp1ab. pp1a contains Nsp1-Nsp11; pp1ab contains Nsp1-Nsp16, with Nsp11 skipped by a −1 ribosomal frameshift. Nsp5 and its cleavage sites are indicated with red arrows. Nsp3 cleavage sites are indicated with grey arrows. (b) SARS-CoV-2 native Nsp5 cleavage motifs. NetCorona scores are indicated, and residues in white boxes differ from SARS-CoV. (c) SARS-CoV-2 pp1ab sequences scored with NetCorona. Scores and frequency were determined for all P5-P4’ motifs surrounding glutamine residues in 8017 patient-derived SARS-CoV-2 sequences. Known Nsp5 cleavage sites are indicated in green, while mutations at an Nsp5 cleavage site are indicated in blue. The Nsp5-Nsp6 cleavage site is indicated in red, and all other glutamine motifs are indicated in black.
Fig. 2
Overview of the approach to predicting Nsp5 cleavage sites in human proteins. Three datasets of human protein sequences were analyzed by the NetCorona neural network. NetCorona assigned scores (0–1.0) to the 9 amino acid motif surrounding every glutamine residue in the datasets, where a score > 0.5 was inferred to be a possible cleavage site. PDB files associated with predicted cleaved proteins were analyzed using the Protein Structure and Interaction Analyzer (PSAIA) tool, which output the accessible surface area (ASA) of each predicted 9 amino acid cleavage motif. Proteins with highly predicted Nsp5 cleavage sites were then analyzed using STRING, which provided information on tissue expression and subcellular localization and was used to perform protein network analysis. Human proteins and molecular pathways of interest containing a predicted Nsp5 cleavage site were then flagged for potential physiological relevance.
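The motif extraction step described in this caption can be sketched as follows. This is a minimal illustration, not NetCorona itself: the function name `glutamine_motifs` is hypothetical, the glutamine is assumed to sit at position P1 (so a P5-P4’ window spans four residues before it and four after), and glutamines too close to either terminus to yield a full 9-residue window are simply skipped, which may differ from how NetCorona handles sequence ends.

```python
def glutamine_motifs(sequence, upstream=4, downstream=4):
    """Return (1-based position, motif) for every glutamine with a full window.

    Assumption: the glutamine is P1, so the P5-P4' motif is the 4 residues
    before it, the glutamine itself, and the 4 residues after it (9 total).
    """
    motifs = []
    for i, residue in enumerate(sequence):
        if residue != "Q":
            continue
        start = i - upstream
        end = i + downstream + 1  # slice end is exclusive
        # Assumption: skip glutamines without a complete 9-residue window.
        if start < 0 or end > len(sequence):
            continue
        motifs.append((i + 1, sequence[start:end]))
    return motifs


if __name__ == "__main__":
    # Toy sequence for illustration only.
    for position, motif in glutamine_motifs("TSAVLQSGFRK"):
        print(position, motif)
```

Each returned motif would then be passed to the scoring network, with scores above 0.5 treated as possible cleavage sites as stated in the caption.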
Fig. 3
Structural analysis of predicted and known Nsp5 cleavage motifs. (a) NetCorona scores are shown for all P5-P4’ motifs surrounding glutamine residues in three datasets of human proteins, binned by score differences of 0.01. The distributions of scores were not statistically different from one another. (b) Despite a high NetCorona score in ACHE, the motif’s location in the core of the protein leads to a low Nsp5 access score. (c) TAB1 contains several motifs predicted to be cleaved, including at Q108 and Q132. The Nsp5 access score is slightly higher for the Q132 motif due to its greater accessible surface area (ASA). (d) DHX15 contains the motif with the highest Nsp5 access score observed in the human proteins studied, located at the C-terminus of the protein. (e) SARS-CoV-2 proteins Nsp15 and Nsp16 contain the native Nsp5 cleavage motif with the lowest Nsp5 access score calculated (487), which helped provide a cut-off for Nsp5 access scores in human proteins. (f) The Nsp5 access scores of human protein motifs are indicated, binned by score differences of 50. 92 motifs in 92 unique human proteins have an Nsp5 access score > 500.
Fig. 4
Sum of the compartment score (a) or expression score (b) of all human proteins with an Nsp5 access score above 500 (92 proteins). Both the compartment and the expression scores were obtained from STRING based on text-mining and database searches.
Fig. 5
Proteins with an Nsp5 access score over 500 that could be found in the same cellular compartment as Nsp5 (48 proteins) were plotted against their expression in the human body. For each protein, the mean expression by IHC is the mean across all tissues measured and reported in the HPA (Not detected = 0, Low = 1, Medium = 2, High = 3; Not measured = NA, which was ignored and removed).
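The mean expression calculation described in this caption can be illustrated with a short sketch. The level-to-number mapping is taken from the caption; the function name `mean_ihc_expression` and the input format (a list of level strings, one per tissue) are assumptions made for illustration, not the authors' code or the HPA file format.

```python
# Mapping stated in the caption; "Not measured" is treated as NA and dropped.
IHC_LEVELS = {"Not detected": 0, "Low": 1, "Medium": 2, "High": 3}


def mean_ihc_expression(tissue_levels):
    """Average the numeric IHC levels across tissues, ignoring NA entries.

    Returns None when no tissue has a measured level (hypothetical choice).
    """
    scores = [IHC_LEVELS[level] for level in tissue_levels if level in IHC_LEVELS]
    if not scores:
        return None
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Hypothetical per-tissue levels for a single protein.
    print(mean_ihc_expression(["High", "Low", "Not measured", "Medium"]))
```

Dropping unmeasured tissues rather than counting them as zero keeps a sparsely profiled protein from appearing artificially low in expression.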
Fig. 6
Network of proteins with plausible Nsp5 colocalization and an Nsp5 access score above 500. Node color represents the Nsp5 access score (light yellow = 500, dark red = 1005). Node size indicates the mean expression across all tissues. An edge linking two nodes denotes a known interaction between these proteins. Grey squares are proteins added by STRING to add connectivity to the network; they do not have an access score above 500 and/or plausible colocalization with Nsp5. Circles highlighting pathways were based on STRING gene set enrichment analysis coupled with manual searches in databases (UniProt, GeneCards, PubMed).
