Bioinform Adv. 2024 Oct 22;4(1):vbae161.
doi: 10.1093/bioadv/vbae161. eCollection 2024.

ProCogGraph: a graph-based mapping of cognate ligand domain interactions

Matthew Crown et al. Bioinform Adv.

Abstract

Motivation: Mappings of domain-cognate ligand interactions can enhance our understanding of the core concepts of evolution and be used to aid docking and protein design. Since the last available cognate-ligand domain database was released, the PDB has grown significantly and new tools are available for measuring similarity and determining contacts.

Results: We present ProCogGraph, a graph database of cognate-ligand domain mappings in PDB structures. Building upon the work of the predecessor database, PROCOGNATE, we use data-driven approaches to develop thresholds and interaction modes. We explore new aspects of domain-cognate ligand interactions, including the chemical similarity of bound cognate ligands and how domain combinations influence cognate ligand binding. Finally, we use the graph to add specificity to partial EC IDs, showing that ProCogGraph can complete partial annotations systematically through assigned cognate ligands.

Availability and implementation: The ProCogGraph pipeline, database and flat files are available at https://github.com/bashton-lab/ProCogGraph and https://doi.org/10.5281/zenodo.13165851.

Conflict of interest statement

None declared.

Figures

Figure 1.
Procession of structures through the ProCogGraph pipeline. At each stage, some structures are lost due to either: no EC annotation, no bound entities or domains in structure, or structure contacts failing to meet criteria.
Figure 2.
Cognate ligand cluster visualization using t-SNE. Each point represents a unique cognate ligand, coloured according to its cluster as assigned by KMeans clustering. Cluster centroids are annotated with the cluster number. Not all clusters within the ordination have clearly similar structure or function; however, cluster 4 corresponds to phosphorylated sugars and metabolites, cluster 5 to sugars, cluster 9 to amino acids/intermediates, cluster 11 to nucleotides/derivatives, cluster 12 to pyrroles and cluster 13 to Coenzyme A/conjugates. The remaining clusters are defined as the cognate substrate/product space.
Figure 3.
Specialized cognate ligand interactions. t-SNE visualization of cognate ligands from the specialized superfamily (SF) 2.40.110.10 (Butyryl-CoA Dehydrogenase, subunit A, domain 2), shown in colour against all unique ProCogGraph cognate ligands (grey). Ligands are clustered based on their chemical similarity and coloured according to cluster number. Cluster 13 (orange) consists primarily of CoA/derivative ligands, while Cluster 11 (blue) contains FAD/NAD/derivative ligands, with the remaining cognate ligands in clusters 0 (green) and 6 (red). The high Tanimoto score (0.75 ± 0.30) indicates a high similarity among the ligands within each cluster.
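
For readers unfamiliar with the Tanimoto score quoted in this caption, the following is a minimal Python sketch of how a mean ± standard deviation of pairwise Tanimoto similarities could be computed. It assumes each ligand is represented as a set of "on" fingerprint bit indices; the fingerprint type used by ProCogGraph is not stated here, and the function names and example fingerprints are hypothetical.

```python
from itertools import combinations
from statistics import mean, pstdev
from typing import List, Set, Tuple


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto coefficient of two bit sets: |A & B| / |A | B|."""
    union = a | b
    if not union:
        # Assumption: two empty fingerprints are treated as dissimilar.
        return 0.0
    return len(a & b) / len(union)


def pairwise_tanimoto_summary(fingerprints: List[Set[int]]) -> Tuple[float, float]:
    """Mean and population standard deviation over all ligand pairs."""
    scores = [tanimoto(a, b) for a, b in combinations(fingerprints, 2)]
    if not scores:
        raise ValueError("need at least two fingerprints")
    return mean(scores), pstdev(scores)


# Hypothetical fingerprints for three ligands (sets of "on" bit indices).
example_fingerprints = [{1, 2, 3}, {1, 2, 3}, {1, 2}]
avg, spread = pairwise_tanimoto_summary(example_fingerprints)
print(f"{avg:.2f} ± {spread:.2f}")
```

A value near 1 means the ligands share most fingerprint bits; a value near 0 means they share few.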
Figure 4.
Exclusive and partner interactions of domain 2,3-Dihydroxybiphenyl 1,2-Dioxygenase, domain 1 (CATH Code 3.10.180.10). Blue dashed lines indicate contacts identified by PDBe-arpeggio. (A) Exclusive interaction with catechol (stick representation with carbon atoms coloured yellow, cognate = 3-chlorocatechol, 0.89 PARITY score) within the beta-barrel of the domain (PDB 1KND). Zinc ion and tertiary-butyl alcohol ligands are coloured grey. (B) Partner interaction of two instances of the domain (PDB 1BH5), coming together to form the enzyme active site for PDB ligand s-hexylglutathione (stick representation with carbon atoms coloured yellow, cognate = (R)-S-Lactoylglutathione, 0.82 PARITY score). Zinc ion highlighted in grey as a space filling representation.
Figure 5.
Cognate ligand mapping to determine exact EC IDs. (A) The graph schema for matching EC IDs with cognate ligands. For protein chains with partial ECs, nonminor domain interactions to bound entities are traced to cognate ligands with similarity above 0.4. An annotation is made if all cognate ligands share the same EC when multiple bound entities are present. (B) For PDB 2VNO, from partial EC 3.2.1, 54 cognate ligands are matched. The highest similarity ligand corresponds to EC 3.2.1.49.
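
The annotation rule described in panel A can be sketched in Python as follows. This is an illustration under stated assumptions, not the ProCogGraph implementation: the function names, the input layout (a mapping from each bound entity to a list of `(cognate EC, similarity)` pairs) and the sample values are hypothetical. Only the 0.4 similarity threshold and the requirement that multiple bound entities agree on one EC come from the caption. Taking the highest-similarity cognate ligand per bound entity is an interpretation of panel B.

```python
from typing import Dict, List, Optional, Tuple

SIMILARITY_THRESHOLD = 0.4  # threshold quoted in the Figure 5 caption


def extends_partial(partial_ec: str, full_ec: str) -> bool:
    """True if full_ec is a complete four-level EC number under partial_ec."""
    partial = [p for p in partial_ec.split(".") if p != "-"]
    full = full_ec.split(".")
    return len(full) == 4 and "-" not in full and full[: len(partial)] == partial


def complete_partial_ec(
    partial_ec: str,
    matches: Dict[str, List[Tuple[str, float]]],
) -> Optional[str]:
    """Return a full EC number if every bound entity's best match agrees."""
    best_ecs = set()
    for candidates in matches.values():
        eligible = [
            (similarity, ec)
            for ec, similarity in candidates
            if similarity > SIMILARITY_THRESHOLD and extends_partial(partial_ec, ec)
        ]
        if not eligible:
            # Assumption: entities with no eligible cognate ligand are skipped.
            continue
        best_ecs.add(max(eligible)[1])
    # Annotate only when exactly one EC is supported by all bound entities.
    return best_ecs.pop() if len(best_ecs) == 1 else None


# Hypothetical similarity values for a chain annotated with partial EC 3.2.1.
example = {
    "bound_entity_1": [("3.2.1.49", 0.82), ("3.2.1.22", 0.55), ("3.2.1.23", 0.31)],
}
print(complete_partial_ec("3.2.1", example))
```

Returning `None` when bound entities disagree keeps the sketch conservative: an ambiguous chain keeps its partial EC rather than receiving a guessed annotation.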
