Genome Biol. 2005;6(10):R89.
doi: 10.1186/gb-2005-6-10-r89. Epub 2005 Sep 19.

Inferring protein domain interactions from databases of interacting proteins

Robert Riley et al. Genome Biol. 2005.

Abstract

We describe domain pair exclusion analysis (DPEA), a method for inferring domain interactions from databases of interacting proteins. DPEA features a log odds score, Eij, reflecting confidence that domains i and j interact. We analyzed 177,233 potential domain interactions underlying 26,032 protein interactions. In total, 3,005 high-confidence domain interactions were inferred, and were evaluated using known domain interactions in the Protein Data Bank. DPEA may prove useful in guiding experiment-based discovery of previously unrecognized domain interactions.

Figures

Figure 1
Overview of DPEA method. (a) In this hypothetical protein interaction dataset, domains are represented as colored squares; proteins are represented as collections of one or more domains joined together; and protein interactions are shown as black double arrows. The protein interactions are known, the domain content of each protein is known, and domain interactions are unknown. Any pair of domains that co-occur in a pair of interacting proteins is considered a potentially interacting domain pair. (b) The frequency of proteins with domain i interacting with proteins with domain j, Sij, is computed. (c) Using Sij as an initial guess, the propensity, θij, of each kind of potential domain interaction is estimated by expectation maximization (EM). (d) The evidence, Eij, for each inferred domain interaction is then assessed by calculating the change in likelihood when that type of domain interaction is excluded.
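Steps (b) through (d) of Figure 1 can be sketched in Python. This is a minimal illustration and not the authors' implementation. It assumes a model in which two proteins interact if at least one of their candidate domain pairs interacts, so that P(interaction) = 1 - Π(1 - θij); the caption does not state the likelihood, so this model is an assumption. The function names, the toy dataset, the fixed iteration count, and the `floor` used to avoid taking the logarithm of zero are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations
import math

def pair_table(proteins, interactions):
    """Map each protein pair to (candidate domain pairs, interacts?)."""
    interacting = {frozenset(p) for p in interactions}
    table = {}
    for a, b in combinations(sorted(proteins), 2):  # self-pairs ignored (assumption)
        dps = {tuple(sorted((i, j))) for i in proteins[a] for j in proteins[b]}
        table[(a, b)] = (dps, frozenset((a, b)) in interacting)
    return table

def compute_S(table):
    """Step (b): S_ij = interacting protein pairs with (i, j) / all protein pairs with (i, j)."""
    n_all, n_int = defaultdict(int), defaultdict(int)
    for dps, interacts in table.values():
        for dp in dps:
            n_all[dp] += 1
            n_int[dp] += interacts
    S = {dp: n_int[dp] / n_all[dp] for dp in n_all if n_int[dp]}
    return S, n_all

def p_interact(dps, theta):
    """Assumed model: probability that at least one candidate domain pair interacts."""
    q = 1.0
    for dp in dps:
        q *= 1.0 - theta.get(dp, 0.0)
    return 1.0 - q

def em(table, theta0, n_all, excluded=None, iters=200):
    """Step (c): EM estimate of theta, optionally with one domain pair fixed at zero."""
    theta = {dp: (0.0 if dp == excluded else t) for dp, t in theta0.items()}
    for _ in range(iters):
        expected = defaultdict(float)
        for dps, interacts in table.values():
            if not interacts:
                continue  # no domain pair interacts in a non-interacting protein pair
            p = p_interact(dps, theta)
            if p == 0.0:
                continue  # pair cannot be explained once its only candidate is excluded
            for dp in dps:
                expected[dp] += theta.get(dp, 0.0) / p
        theta = {dp: expected[dp] / n_all[dp] for dp in theta}
    return theta

def evidence(table, theta, theta0, n_all, dp, floor=1e-6):
    """Step (d): log odds of the observed interactions with and without domain pair dp."""
    theta_ex = em(table, theta0, n_all, excluded=dp)
    score = 0.0
    for dps, interacts in table.values():
        if interacts and dp in dps:
            p_full = max(p_interact(dps, theta), floor)
            p_ex = max(p_interact(dps, theta_ex), floor)  # floor is an assumption
            score += math.log(p_full / p_ex)
    return score

# Toy dataset (hypothetical): proteins as sets of domains, plus known interactions.
proteins = {"A": {"d1"}, "B": {"d2"}, "C": {"d1", "d3"}, "D": {"d2"}, "E": {"d3"}}
interactions = [("A", "B"), ("C", "D")]

table = pair_table(proteins, interactions)
S, n_all = compute_S(table)
theta = em(table, S, n_all)
E = {dp: evidence(table, theta, S, n_all, dp) for dp in S}
```

In this toy dataset the pair (d1, d2) is the only candidate that can explain the A-B interaction, so excluding it lowers the likelihood sharply and its E is large. The pair (d2, d3) appears only alongside (d1, d2), so excluding it changes almost nothing and its E is close to zero.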
Figure 2
Enrichment of PDB complexes in highest-ranking domain pairs predicted to interact. The ratio of observed/expected PDB complexes in each sample of domain pairs is plotted against cumulative rank. For example, the top 100 domain pairs ranked by E have 71-fold more PDB complexes than would be expected in 100 randomly chosen potentially interacting domain pairs in DIP. Potentially interacting domain pairs were ranked by each of three measures: S, θ and E. (a) Ranking all domain pairs by their frequency of co-occurrence in interacting protein pairs, S, yielded no significant enrichment of PDB complexes at any rank cutoff. A significant enrichment of PDB complexes was seen when domain pairs were ranked by θ, and even more so when they were ranked by E, as shown by the successive increase in observed/expected PDB complexes at each cumulative rank. The ratio for all three measures approaches 1.0 as the number of ranked domain pairs approaches the total number of predictions in the dataset. Our results suggest that the E score output by DPEA performs better than S or θ at identifying physically interacting domain pairs. (b) Ranking interactions of modular domains by E reveals enrichment of PDB complexes. No enrichment is found when interactions are ranked by θ or S.
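The observed/expected ratio plotted in Figure 2 can be illustrated with a short Python sketch. It assumes that "expected" means the number of PDB-confirmed pairs a random sample of the same size would contain, given the overall fraction of confirmed pairs in the ranked list; the caption does not spell out the calculation, and the function name and toy data are hypothetical.

```python
def enrichment_at_rank(ranked_pairs, confirmed, k):
    """Observed/expected count of confirmed pairs among the top k ranked pairs."""
    top = ranked_pairs[:k]
    observed = sum(1 for pair in top if pair in confirmed)
    # Fraction of confirmed pairs in the whole list; a random sample of size k
    # is expected to contain k times this fraction.
    background = sum(1 for pair in ranked_pairs if pair in confirmed) / len(ranked_pairs)
    expected = k * background
    return observed / expected

# Toy ranking (hypothetical): ten domain pairs, three of them structurally confirmed.
ranked = ["a", "b", "c", "d", "e", "f", "g", "h", "i", "j"]
confirmed = {"a", "b", "j"}

top2 = enrichment_at_rank(ranked, confirmed, 2)    # 2 observed vs 0.6 expected
full = enrichment_at_rank(ranked, confirmed, 10)   # whole list: ratio near 1.0
```

By construction the ratio returns to 1.0 when k reaches the length of the list, which matches the behavior described in the caption as the rank cutoff approaches the total number of predictions.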
Figure 3
DPEA detects high-specificity domain interactions. (a) Interactions between domain families, such as the hypothetical red and blue domain families, whose members interact specifically are expected to have a low propensity, θ, because the number of interactions occurring between the domain families is a small fraction of the possible interactions (four out of 16 for two domain families of four members each). Conversely, domain interactions with a high θ will typically be between families whose members interact promiscuously. Because high-specificity domain interactions are of obvious interest to biologists, screening for domain interactions by their θ values fails to detect many important domain interactions. (b) Specific interactions of RING ubiquitin ligase domains [Pfam:PF00097, zf-C3HC4] with ubiquitin-conjugating enzymes [Pfam:PF00179, UQ_con] [32] in a fly protein network. The inferred domain interaction has a low θ (θ = 0.011, bottom 10%) and high E (E = 29, Table 1). This reflects the abundant evidence that the domains zf-C3HC4 and UQ_con interact, despite the low probability of interaction between any pair of these domains. (c) Specific interactions of Cyclin N-terminal domains [Pfam:PF00134, Cyclin_N] and protein kinase domains [Pfam:PF00069, Pkinase]. This interaction has a θ of 0.006, which is in the bottom 6% of θ for all domain pairs, suggesting the low propensity of interaction among members of these two domain families. However, the E score of 23 (the 13th highest score in the database) reveals the high degree of evidence for the Cyclin_N ↔ Pkinase interaction. These results show that DPEA identifies high-specificity domain interactions not detected by screening for the most probable domain interactions.
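The arithmetic behind panel (a) can be made concrete with a small Python sketch. It treats the fraction of possible member-to-member interactions that actually occur as a rough stand-in for the propensity θ, which is a simplification; the family members and the helper name are hypothetical.

```python
def interaction_fraction(interacting_pairs, family_a, family_b):
    """Fraction of all possible cross-family pairs that actually interact."""
    possible = len(family_a) * len(family_b)
    observed = sum(1 for a in family_a for b in family_b if (a, b) in interacting_pairs)
    return observed / possible

red = ["r1", "r2", "r3", "r4"]
blue = ["b1", "b2", "b3", "b4"]

# Specific families: each red member binds exactly one blue partner (4 of 16 pairs).
specific = set(zip(red, blue))
# Promiscuous families: every red member binds every blue member (16 of 16 pairs).
promiscuous = {(r, b) for r in red for b in blue}

specific_fraction = interaction_fraction(specific, red, blue)        # 0.25
promiscuous_fraction = interaction_fraction(promiscuous, red, blue)  # 1.0
```

Ranking by this fraction alone would place the specific families well below the promiscuous ones, even though every one of the four specific interactions is real.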
Figure 4
Inferred domain interactions of G-protein subunits. (a) Domain structures of interacting G-γ and G-β proteins in human, mouse and yeast. Protein names are in black to the left of each protein's domain structure schematic. Domains of proteins are colored boxes connected by a gray line. Pfam-A domain names and Pfam-B accession numbers are the same color as the domains they label. Domain structures are schematic and are not to scale. (b) Of the possible domain interactions, only that of G-gamma [Pfam:PF00631] and a Pfam-B domain [Pfam:PB002804] is inferred with high confidence (E = 12). (c) A published structure of complexed G-protein γ and β subunits [PDB:1GP2] [37] confirms our prediction that the G-gamma and PB002804 domains can interact.
Figure 5
Domain interactions of Ras family members with nuclear pore proteins. (a) Yeast and worm Ran signal-transducing proteins interact with proteins that have Ran-binding domains [Pfam:PF00638, Ran_BP1], often found as components of nuclear pore complexes. Domain structures of the relevant interacting proteins are shown. Domains of proteins are colored boxes connected by a gray line. Protein names are in black to the left of each protein's domain structure schematic. Pfam-A domain names and Pfam-B accession numbers are the same color as the domains they label. Domain structures are schematic and are not to scale. (b) We find evidence for the interaction of a Pfam-B domain [Pfam:PB001470] with the Ran_BP1 domain (E = 3.7). (c) Structural evidence [PDB:1RRP] [43] confirms that the domains PB001470 and Ran_BP1 interact, consistent with our prediction.

References

    1. Yu H, Greenbaum D, Xin Lu H, Zhu X, Gerstein M. Genomic analysis of essentiality within protein networks. Trends Genet. 2004;20:227–231. doi: 10.1016/j.tig.2004.04.008.
    2. Eisenberg D, Marcotte EM, Xenarios I, Yeates TO. Protein function in the post-genomic era. Nature. 2000;405:823–826. doi: 10.1038/35015694.
    3. Pawson T, Nash P. Assembly of cell regulatory systems through protein interaction domains. Science. 2003;300:445–452. doi: 10.1126/science.1083653.
    4. McGough AM, Staiger CJ, Min JK, Simonetti KD. The gelsolin family of actin regulatory proteins: modular structures, versatile functions. FEBS Lett. 2003;552:75–81. doi: 10.1016/S0014-5793(03)00932-3.
    5. Lim WA, Richards FM, Fox RO. Structural determinants of peptide-binding orientation and of sequence specificity in SH3 domains. Nature. 1994;372:375–379. doi: 10.1038/372375a0.
