PLoS Comput Biol. 2021 Aug 9;17(8):e1008844.
doi: 10.1371/journal.pcbi.1008844. eCollection 2021 Aug.

PPIDomainMiner: Inferring domain-domain interactions from multiple sources of protein-protein interactions

Seyed Ziaeddin Alborzi et al. PLoS Comput Biol. 2021.

Abstract

Many biological processes are mediated by protein-protein interactions (PPIs). Because protein domains are the building blocks of proteins, PPIs likely rely on domain-domain interactions (DDIs). Several attempts exist to infer DDIs from PPI networks, but the produced datasets are heterogeneous and sometimes not accessible, while the PPI interactome data keeps growing. We describe a new computational approach called "PPIDM" (Protein-Protein Interactions Domain Miner) for inferring DDIs using multiple sources of PPIs. The approach is an extension of our previously described "CODAC" (Computational Discovery of Direct Associations using Common neighbors) method for inferring new edges in a tripartite graph. The PPIDM method has been applied to seven widely used PPI resources, using as "Gold-Standard" a set of DDIs extracted from 3D structural databases. Overall, PPIDM has produced a dataset of 84,552 non-redundant DDIs. Statistical significance (p-value) is calculated for each source of PPI and used to classify the PPIDM DDIs into Gold (9,175 DDIs), Silver (24,934 DDIs) and Bronze (50,443 DDIs) categories. Dataset comparison reveals that PPIDM has inferred from the 2017 releases of PPI sources about 46% of the DDIs present in the 2020 release of the 3did database, not counting the DDIs present in the Gold-Standard. The PPIDM dataset contains 10,229 DDIs that are consistent with more than 13,300 PPIs extracted from the IMEx database, and nearly 23,300 DDIs (27.5%) that are consistent with more than 214,000 human PPIs extracted from the STRING database. Examples of newly inferred DDIs covering more than 10 PPIs in the IMEx database are provided. Further exploitation of the PPIDM DDI reservoir includes the inventory of possible partners of a protein of interest and characterization of protein interactions at the domain level in combination with other methods. The result is publicly available at http://ppidm.loria.fr/.

Conflict of interest statement

The authors have declared that no competing interests exist. Author Dave W Ritchie was unable to confirm their authorship contributions. On their behalf, the corresponding author has reported their contributions to the best of their knowledge.

Figures

Fig 1. Schematic illustration of edge inference by PPIDM in a tripartite graph setting G(X,Y,Z,E).
Z is here PPI, a set of PPIs; X and Y are D_L and D_R, two sets of Pfam domains. Each item in PPI is an ordered pair of proteins ppi_i = (L_i, R_i) with Id(L_i) ≤ Id(R_i). Domains in D_L and D_R are connected to their common neighbor item ppi_i in PPI through L_i and R_i respectively. The (d1, d2) edge comes from the Gold-Standard dataset of DDIs. With PPIDM, new edges are inferred between domains of D_L and domains of D_R if their adjacency vectors in PPI are similar. Here, the (d3, d2) edge is inferred because d3 and d2 are found in ppi1 and ppi2, and (d3, d4) is inferred because d3 and d4 are found in ppi2 and ppi3. However, the score of (d3, d2) will be lower than the score of (d1, d2) because d3 has one neighbor that does not contain d2 (namely ppi3).
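The scoring idea in this caption can be illustrated with a small sketch. It is not the PPIDM or CODAC implementation: the caption only says that adjacency vectors must be "similar", so cosine similarity is an assumed stand-in, and the neighbor set given to d1 (ppi1 and ppi2) is likewise an assumption chosen to match the statement that (d1, d2) scores higher than (d3, d2). The helper names `adjacency_vector` and `cosine` are hypothetical.

```python
import math

# Toy tripartite graph from the Fig 1 caption. The PPI neighbors of d1 are
# an assumption; those of d2, d3 and d4 follow the caption.
PPIS = ["ppi1", "ppi2", "ppi3"]
left_domains = {"d1": {"ppi1", "ppi2"}, "d3": {"ppi1", "ppi2", "ppi3"}}
right_domains = {"d2": {"ppi1", "ppi2"}, "d4": {"ppi2", "ppi3"}}


def adjacency_vector(neighbors):
    """Binary vector marking which PPIs a domain is connected to."""
    return [1 if ppi in neighbors else 0 for ppi in PPIS]


def cosine(u, v):
    """Cosine similarity, used here as an assumed similarity measure."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Score every (left domain, right domain) pair by vector similarity.
scores = {
    (left, right): cosine(adjacency_vector(ln), adjacency_vector(rn))
    for left, ln in left_domains.items()
    for right, rn in right_domains.items()
}

for pair, score in sorted(scores.items()):
    print(pair, round(score, 3))
```

Under these assumptions, (d3, d2) scores below (d1, d2) because d3 carries the extra neighbor ppi3, which is the qualitative behavior the caption describes.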
Fig 2. Venn diagram for overlapping DDIs between PPIDM (blue), DOMINE (green), and our Gold-Standard (KBDOCK ∩ 3did, yellow).
PPIDM and DOMINE share 8,433 (3,609 + 4,824) DDIs. The Gold-Standard has 6,989 and 4,934 DDIs in common with PPIDM and DOMINE, respectively, while the Gold-Standard, PPIDM, and DOMINE share together 4,824 interactions.
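The counts in this caption determine the pairwise-only regions of the Venn diagram. The short check below derives them by subtraction; the variable names are illustrative, and only the four input numbers come from the caption.

```python
# Counts quoted in the Fig 2 caption.
ppidm_domine_only = 3_609  # shared by PPIDM and DOMINE, not in the Gold-Standard
all_three = 4_824          # shared by Gold-Standard, PPIDM and DOMINE
gold_ppidm = 6_989         # Gold-Standard DDIs also in PPIDM
gold_domine = 4_934        # Gold-Standard DDIs also in DOMINE

# Total PPIDM/DOMINE overlap, as stated in the caption.
ppidm_domine = ppidm_domine_only + all_three

# Regions shared with the Gold-Standard by exactly one of the two datasets.
gold_ppidm_only = gold_ppidm - all_three
gold_domine_only = gold_domine - all_three

print(ppidm_domine, gold_ppidm_only, gold_domine_only)
```

Read this way, the Gold-Standard DDIs recovered by PPIDM but not DOMINE number 2,165, against 110 for the reverse case.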
Fig 3. Coverage of 3didδ by inferred DDIs from PPIDM (blue; panel A) and DOMINE (green; panel B).
Fig 4. Coverage of PPI interactome derived from IMEx database (50,032 PPIs).
(A) Number of PPIs covered by at least one DDI; (B) Number of DDIs covering at least one PPI.
Fig 5. Coverage of human interactome derived from STRING (607,088 PPIs).
(A) Number of PPIs covered by at least one DDI; (B) Number of DDIs covering at least one PPI.
