Bioinform Biol Insights. 2021 Apr 22:15:11779322211011140.
doi: 10.1177/11779322211011140. eCollection 2021.

In Silico Characterization of a Hypothetical Protein from Shigella dysenteriae ATCC 12039 Reveals a Pathogenesis-Related Protein of the Type-VI Secretion System

Md Fazley Rabbi et al. Bioinform Biol Insights. 2021.

Abstract

Shigellosis caused by Shigella dysenteriae is a major public health concern worldwide, particularly in developing countries. The bacterial genome is known, but it contains many hypothetical proteins whose functions are yet to be discovered. A hypothetical protein (accession no. WP_128879999.1, 161 residues) of the S. dysenteriae ATCC 12039 strain was selected in this study for comprehensive structural and functional analysis. Estimates of its subcellular localization and physicochemical properties indicated that it is a stable, soluble, extracellular protein. Functional annotation tools, such as NCBI-CD Search, Pfam, and InterProScan, predicted the target protein to be an amidase effector protein 4 (Tae4) of the type-VI secretion system (T6SS). Multiple sequence alignment of the homologous sequences was consistent with previous findings. Random coil was predominant in the predicted secondary structure. The three-dimensional (3D) structure of the protein was obtained by homology modeling with the SWISS-MODEL server, using a template protein (PDB ID: 4J30) with 80.12% sequence identity. The 3D structure became more stable after YASARA energy minimization and was validated with several quality assessment tools, including PROCHECK, QMEAN, Verify3D, and ERRAT. Superimposition of the target on the template protein in UCSF Chimera gave an RMSD of 0.115 Å, suggesting a reliable 3D structure. The active site of the modeled structure was predicted with the CASTp server and visualized in PyMOL. Interestingly, molecular docking with the ligand L-Ala D-Glu-mDAP showed similar binding affinity and key interacting residues for the target protein and a Salmonella enterica Tae4 protein. Protein-protein docking was also performed between the target protein and hemolysin coregulated protein 1 of the T6SS. Finally, a comparative genomics approach showed the protein to be unique to S. dysenteriae and nonhomologous to human proteins, indicating a potential therapeutic target. Most pathogens harboring a T6SS pose a significant threat to human health. Many T6SSs and their effectors are associated with interbacterial competition, pathogenesis, and virulence; however, the relationships between these effectors and the pathogenicity of S. dysenteriae are yet to be determined. The study findings provide a promising platform for future antibacterial treatment.
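
The 80.12% figure quoted for the template is a sequence identity. As a rough illustration of what such a number measures, the sketch below computes percent identity from two pre-aligned sequences, counting only columns in which both sequences carry a residue. The function name and the toy sequences are hypothetical, and the denominator convention is an assumption: the abstract does not state how SWISS-MODEL normalizes identity, so this will not necessarily reproduce the reported value.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over columns where both aligned sequences have a residue.

    Assumption: gap-containing columns are excluded from the denominator.
    Other conventions (e.g. dividing by the shorter sequence length) exist.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip columns with a gap in either sequence
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no aligned residue pairs to compare")
    return 100.0 * matches / compared


# Toy aligned fragments (hypothetical, not the real target/template sequences).
toy_target = "MKT-AYCH"
toy_template = "MKSGAYCH"
print(round(percent_identity(toy_target, toy_template), 2))  # 85.71
```

In the toy pair, 7 columns are compared and 6 match, giving 85.71%.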

Keywords: Shigella dysenteriae; amidase effector protein 4; functional annotation; hcp1; homology modeling; hypothetical protein; in silico characterization; molecular docking; type-VI secretion system (T6SS).

Conflict of interest statement

Declaration of Conflicting Interests: The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Figures

Figure 1.
Multiple sequence alignment (MSA) of different amidase effector protein 4 (Tae4) proteins, generated with the Clustal Omega algorithm in Jalview. Top row: target protein; rows 2 and 3: Shigella flexneri; row 4: Shigella sonnei; row 5: Enterobacteriaceae; rows 6 and 7: Escherichia coli; row 8: Salmonella enterica. White boxes mark the conserved catalytic cysteine (C) and histidine (H) residues typical of amidases.
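
The boxed positions in this alignment are columns where every sequence carries the same cysteine or histidine. A minimal sketch of how such columns could be located programmatically is given below; the function name and the three-row toy alignment are hypothetical and do not reproduce the Tae4 alignment in the figure. Full conservation in a column is only a hint, since a catalytic role has to be confirmed against known amidase structures.

```python
from typing import List, Tuple


def conserved_columns(alignment: List[str], residues: str = "CH") -> List[Tuple[int, str]]:
    """Return (1-based column, residue) for columns fully conserved as one of `residues`."""
    if not alignment:
        raise ValueError("alignment is empty")
    width = len(alignment[0])
    if any(len(row) != width for row in alignment):
        raise ValueError("all aligned rows must have the same length")
    hits = []
    for col in range(width):
        column = {row[col].upper() for row in alignment}
        if len(column) == 1:  # identical character in every row
            residue = next(iter(column))
            if residue in residues:  # gaps ('-') never match
                hits.append((col + 1, residue))
    return hits


# Toy alignment (hypothetical sequences, for illustration only).
toy_alignment = [
    "MACG-HKL",
    "MSCGTHKL",
    "LACGAHRL",
]
print(conserved_columns(toy_alignment))  # [(3, 'C'), (6, 'H')]
```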
Figure 2.
Three-dimensional conformation of the catalytic triad of Salmonella enterica Tae4 protein (WP_129397493.1) (A) and the target hypothetical protein (B) generated by PyMOL software.
Figure 3.
Phylogenetic tree showing the evolutionary relationship of the target protein (highlighted in yellow) with other Tae4 proteins. The tree was generated in Jalview using the neighbor-joining method based on the BLOSUM62 scoring matrix. The values indicate percentage mismatches between 2 nodes (branch length). The target protein and 3 other Shigella Tae4 amidases (shaded) appear to share their most recent common ancestor with the E. coli amidases (blue) rather than the Salmonella enterica amidases (black).
Figure 4.
Predicted secondary structure of the target protein using PSI-PRED server.
Figure 5.
Predicted 3-dimensional structure of the target protein through SWISS-MODEL server after YASARA energy minimization (visualized by BIOVIA Discovery Studio Visualizer version 20.1.0.19295).
Figure 6.
Quality assessment of the model: (A) Ramachandran plot of model structure validated by PROCHECK program, (B) graphical representation of QMEAN result of the model structure (indicates good agreement between the model structure and experimental structures of similar size).
Figure 7.
Superimposition of the model (red color) and the template (cyan color) protein using UCSF Chimera software.
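
The quality of a superimposition like this one is usually summarized as a root-mean-square deviation (RMSD) over matched atom pairs. The sketch below shows only that final arithmetic, assuming the two coordinate sets are already superimposed and paired one-to-one. The function name and toy coordinates are hypothetical, and the caption does not say which atom pairs UCSF Chimera included, so this is an illustration of the formula rather than a reproduction of the reported value.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def rmsd(coords_a: Sequence[Point], coords_b: Sequence[Point]) -> float:
    """RMSD in the input units (e.g. angstroms) between paired, pre-superimposed atoms."""
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must have the same number of atoms")
    if not coords_a:
        raise ValueError("coordinate lists are empty")
    total = 0.0
    for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b):
        # squared Euclidean distance between the paired atoms
        total += (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
    return math.sqrt(total / len(coords_a))


# Toy coordinates (hypothetical): each atom pair is displaced by 0.1 along one axis.
toy_model = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
toy_template = [(0.1, 0.0, 0.0), (1.5, 0.1, 0.0), (3.0, 0.0, 0.1)]
print(round(rmsd(toy_model, toy_template), 3))  # 0.1
```

A value near zero means the paired atoms almost coincide. The optimal rotation and translation that precede this calculation are performed by the superimposition tool and are not shown here.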
Figure 8.
Z scores of the target (A) and template (B) proteins from the ProSA server. Both structures fell in the region typically found for experimentally determined (NMR and X-ray) native proteins of similar size.
Figure 9.
Active site determined with the CASTp server; the 2 largest pockets are visualized in PyMOL (left). Active-site amino acid residues are highlighted in the right panel.
Figure 10.
L-Ala D-Glu-mDAP ligand (red stick) docked in the active site of the proteins: (A) ligand-bound hypothetical protein (WP_128879999.1), (B) ligand-bound S. enterica Tae4 protein (WP_129397493.1) (analyzed by PyMOL), (C) key interacting residues of WP_129397493.1 with the ligand, and (D) key interacting residues of the hypothetical protein with the ligand (analyzed by Discovery Studio Visualizer).
Figure 11.
Hcp1-Tae4 interactions from ClusPro server docking, analyzed in PyMOL: (A) interaction of Salmonella typhimurium Hcp1 (red) with Salmonella enterica Tae4 (teal), (B) interaction of Salmonella typhimurium Hcp1 (red) with the target hypothetical protein (teal). The interacting residues of Hcp1 and Tae4 are marked in black and blue, respectively.
