Data Brief. 2022 Sep 23;45:108629.
doi: 10.1016/j.dib.2022.108629. eCollection 2022 Dec.

The specific applications of the TSR-based method in identifying Zn2+ binding sites of proteases and ACE/ACE2

Titli Sarkar et al. Data Brief.

Abstract

We have developed an alignment-free TSR (Triangular Spatial Relationship)-based computational method for protein structural comparison and for motif identification and discovery. To demonstrate the potential applications of the method, we have generated two datasets. One dataset contains five classes at hierarchical level 1 (Actin/Hsp70, serine protease (chymotrypsin/trypsin/elastase), ArsC/Prdx2, PKA/PKB/PKC, and AChE/BChE) and twelve groups at level 2. The other dataset includes representative proteases and ACE/ACE2. The x, y, z coordinates of the structures were obtained from the PDB. We calculated the keys (or features) that represent each structure using the TSR-based method. The datasets and data presented here include additional information that helps readers become aware of specific applications of the TSR-based method in protein clustering and in the identification and discovery of metal ion binding sites, and that helps them understand the effect of amino acid grouping on protein 3D structural relationships at both global and local levels.

Keywords: 3D structure; Alignment-free; Amino acid grouping; Metal ion binding site; Protein similarity; Structural motif; Structure comparison; TSR.

Conflict of interest statement

The authors declare that they have no known competing financial interests or personal relationships which have or could be perceived to have influenced the work reported in this article.

Figures

Fig. 1
The hierarchical organization of the structure dataset and Venn diagram analysis of the structures at hierarchical level 1, with and without amino acid grouping. a, The hierarchical organization of the structure dataset, with five classes at level 1 and twelve subclasses at level 2 (leaf node level); b-c, The Venn diagrams show the counts of keys without (b) and with (c) amino acid grouping that are specific to each class of structures at level 1, and the counts in all possible overlapping regions. The numbers of Common keys and total keys, and their ratios, without (b) and with (c) amino acid grouping, are indicated.
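The Venn diagram counts described in this caption can be reproduced with plain set operations once each class has a set of keys. The following is a minimal sketch under stated assumptions: the function name `common_and_specific_keys`, the toy class names, and the integer keys are all hypothetical, and the sketch treats keys as distinct values without considering their frequency.

```python
def common_and_specific_keys(keys_by_class):
    """Split keys into Common, total, and class-specific sets.

    keys_by_class: dict mapping a class name to a set of integer keys.
    Returns (common, total, specific), where specific maps each class
    name to the keys found in that class and in no other class.
    """
    all_sets = list(keys_by_class.values())
    common = set.intersection(*all_sets)   # keys shared by every class
    total = set.union(*all_sets)           # all distinct keys in the dataset
    specific = {}
    for name, keys in keys_by_class.items():
        # Union of the keys of every other class
        others = set().union(
            *(k for n, k in keys_by_class.items() if n != name)
        )
        specific[name] = keys - others
    return common, total, specific


# Hypothetical toy keys, not taken from the published dataset
toy = {
    "serine protease": {1, 2, 3, 4},
    "PKA/PKB/PKC": {2, 3, 5},
    "Actin/Hsp70": {2, 3, 6, 7},
}
common, total, specific = common_and_specific_keys(toy)
ratio = len(common) / len(total)  # Common-to-total ratio, as in panels b-c
print(sorted(common), len(total), specific, round(ratio, 3))
```

The same function applies unchanged with and without amino acid grouping; only the input key sets differ.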
Fig. 2
The sequence alignment of the representative proteins from chymotrypsin, elastase, trypsin, PKA, PKB, PKC, Actin, Hsp70, ArsC, Prdx2, acetylcholinesterase, and cholinesterase.
Fig. 3
Comparison of the number of the Common distinct keys with the number of the total distinct keys of the structures at the hierarchical level 1. a, Without amino acid grouping; b, With amino acid grouping.
Fig. 4
Effect of amino acid grouping on common keys in each of five different protein families. a, The numbers of common distinct keys, calculated without consideration of key frequency, are presented with and without amino acid grouping; b, The numbers of common distinct keys, calculated with consideration of key frequency, are presented with and without amino acid grouping.
Fig. 5
Clustering and structure similarity of the dataset without amino acid grouping compared with amino acid grouping. a, Comparison of clustering without and with amino acid grouping; b, Comparison of similarity distributions with and without amino acid grouping. The weighted averages are indicated.
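This page does not state how pairwise structure similarity is computed from the keys. As an illustration only, the sketch below assumes a generalized Jaccard coefficient over key occurrence counts, which is one common way to compare two multisets; the function name `generalized_jaccard` and the toy key lists are hypothetical.

```python
from collections import Counter


def generalized_jaccard(keys_a, keys_b):
    """Generalized Jaccard similarity between two multisets of keys.

    Assumption: similarity is the sum of per-key minimum counts divided
    by the sum of per-key maximum counts. This is an illustrative choice,
    not a measure confirmed by this page.
    """
    count_a, count_b = Counter(keys_a), Counter(keys_b)
    all_keys = set(count_a) | set(count_b)
    shared = sum(min(count_a[k], count_b[k]) for k in all_keys)
    combined = sum(max(count_a[k], count_b[k]) for k in all_keys)
    return shared / combined if combined else 0.0


# Hypothetical key lists for two structures (repeats encode key frequency)
protein_a = [101, 101, 202, 303]
protein_b = [101, 202, 202, 404]
sim = generalized_jaccard(protein_a, protein_b)
dissimilarity = 1.0 - sim  # a distance-like value usable for clustering
print(sim, dissimilarity)
```

Under this assumption, amino acid grouping would raise similarity whenever it merges keys that differ only by residues placed in the same group.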
Fig. 6
Effect of amino acid grouping on the specific keys of the structures organized in the hierarchical organization. a, At the level 1; b, At the level 2.
Fig. 7
A structure-based hierarchical organization of the dataset. The numbers of specific keys for each (sub)class or type, with and without amino acid grouping, are indicated.
Fig. 8
A small set of specific keys was identified for serine proteases. a-b, Three specific keys were identified without (a) and with (b) amino acid grouping, respectively; c, Eight cysteine residues of a representative protein structure (PDB ID: 4H4F) were identified and are presented for the three keys, both with and without amino acid grouping. Four disulfide bonds were identified from these eight cysteine residues and are shown; d, The distances and cysteine positions of the four disulfide bonds are shown; e, Eight triangles associated with the keys 2129522 (Cys-Gly-Gly, 1 triangle), 2229137 (Cys-Cys-Cys, 4 triangles) and 2229142 (Cys-Cys-Cys, 3 triangles) are shown.
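Pairing cysteines into disulfide bonds by distance, as panels c-d describe, can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `find_disulfides`, the 2.5 Å cutoff on the sulfur-sulfur (SG-SG) distance, and the toy coordinates are all hypothetical and are not taken from PDB entry 4H4F.

```python
import math
from itertools import combinations


def find_disulfides(sg_coords, cutoff=2.5):
    """Return cysteine pairs whose SG atoms lie within `cutoff` angstroms.

    sg_coords: dict mapping a residue number to the (x, y, z) coordinates
    of that cysteine's SG atom. The cutoff is an assumed threshold chosen
    to sit a little above a typical S-S bond length of about 2.05 angstroms.
    """
    bonds = []
    for (res_i, xyz_i), (res_j, xyz_j) in combinations(sorted(sg_coords.items()), 2):
        distance = math.dist(xyz_i, xyz_j)  # Euclidean distance (Python 3.8+)
        if distance <= cutoff:
            bonds.append((res_i, res_j, round(distance, 2)))
    return bonds


# Hypothetical SG coordinates for four cysteines
toy_sg = {
    10: (0.0, 0.0, 0.0),
    20: (2.04, 0.0, 0.0),
    30: (10.0, 0.0, 0.0),
    40: (10.0, 2.0, 0.3),
}
bonds = find_disulfides(toy_sg)
print(bonds)
```

With real structures, the SG coordinates would come from the ATOM records of the PDB file for each cysteine residue.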
Fig. 9
A small set of specific keys was identified for elastases. a-b, Two specific keys and one specific key were identified without (a) and with (b) amino acid grouping, respectively; c, The amino acids associated with the keys 2927359, 8692579 and 2570074 are shown in VDW representation (PDB ID: 1BRU); d, The triangles associated with the keys 2927359 (Cys136-L160-C182 and Cys-L161-C182), 8692579 (Cys210-H210-R230) and 2570074 (Cys136-Y137-Cys201) are shown (PDB ID: 1BRU). Results with and without amino acid grouping are indicated.
Fig. 10
The result from a dataset containing kinases, phosphatases, and isomerases shows a clustering improvement after applying amino acid grouping. a, The clustering maps show the protein clusters before and after amino acid grouping. The dissimilarity values are indicated in the upper left corner of the clustering maps; b, Pairwise structure similarities with and without amino acid grouping were calculated and are shown. The means are labeled and the 25th and 75th percentiles are indicated; c, Percent increases in structure similarity were calculated and are presented.
Fig. 11
The result from a dataset containing diverse receptors shows a clustering improvement after applying amino acid grouping. a, The clustering maps show the protein clusters before and after amino acid grouping. The dissimilarity values are indicated in the upper left corner of the clustering maps; b, Pairwise structure similarities with and without amino acid grouping were calculated and are shown. The means are labeled and the 25th and 75th percentiles are indicated.
Fig. 12
The Zn2+ binding sites of proteases have unique geometries. a, The sequence alignment shows the HExxH motif of ACE, ACE2 and thermolysin; b, MaxDist values of the triangles formed by two His and one Glu or one Asp in the Zn2+ binding sites, as well as those not in the binding sites, were calculated and are shown; c, Theta values of the triangles formed by two His and one Glu or one Asp in the Zn2+ binding sites, as well as those not in the binding sites, were calculated and are shown; b-c, The means are labeled and the 25th and 75th percentiles are indicated; d-e, Two representative Zn2+ binding sites, two His and one Glu (d) and two His and one Asp (e), are shown. The PDB IDs are indicated. The amino acids are labeled and the Zn2+ ions are shown.
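MaxDist and Theta are not defined on this page, so the sketch below rests on assumptions: MaxDist is taken to be the longest edge of the triangle, and Theta the angle (folded to at most 90 degrees) between one edge and the line from that edge's midpoint to the opposite vertex. The function name `triangle_features` and the toy coordinates are hypothetical, and the method's own rule for choosing which edge to use is not reproduced; the edge p1-p2 is used as given.

```python
import math


def triangle_features(p1, p2, p3):
    """Return (max_dist, theta_degrees) for a triangle of three 3D points.

    Assumed definitions (not confirmed by this page):
      max_dist: length of the longest edge.
      theta:    angle between the half edge midpoint->p1 and the median
                midpoint->p3, replaced by 180 - theta when above 90.
    Degenerate triangles with coincident points are not handled.
    """
    max_dist = max(math.dist(p1, p2), math.dist(p1, p3), math.dist(p2, p3))

    midpoint = tuple((a + b) / 2 for a, b in zip(p1, p2))
    half_edge = tuple(a - m for a, m in zip(p1, midpoint))
    median = tuple(a - m for a, m in zip(p3, midpoint))

    dot = sum(x * y for x, y in zip(half_edge, median))
    cos_theta = dot / (math.hypot(*half_edge) * math.hypot(*median))
    cos_theta = max(-1.0, min(1.0, cos_theta))  # guard against rounding error
    theta = math.degrees(math.acos(cos_theta))
    if theta > 90.0:
        theta = 180.0 - theta
    return max_dist, theta


# Hypothetical coordinates standing in for two His and one Glu residue
max_dist, theta = triangle_features((0.0, 0.0, 0.0), (4.0, 0.0, 0.0), (4.0, 3.0, 0.0))
print(round(max_dist, 2), round(theta, 2))
```

Comparing these two values for triangles inside and outside a Zn2+ binding site is the kind of analysis that panels b and c summarize.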
Fig. 13
The Zn2+ binding sites of ACE and ACE2 have unique geometries. a-b, Two representative Zn2+ binding sites of ACE (a) and ACE2 (b) are shown. The PDB IDs are indicated. The amino acids are labeled. c-d, MaxDist and Theta values of the triangles formed by two His and one Glu in the Zn2+ binding sites, as well as those not in the binding sites, were calculated and are shown. The means are labeled and the 25th and 75th percentiles are indicated; e, A representative structure shows the interactions between the spike protein of SARS-CoV-2 and human ACE2. The PDB ID is indicated and the Zn2+ binding site is labeled.
