BMC Bioinformatics. 2013 May 7;14:154.
doi: 10.1186/1471-2105-14-154.

Reconstituting protein interaction networks using parameter-dependent domain-domain interactions

Vesna Memišević et al. BMC Bioinformatics.

Abstract

Background: We can describe protein-protein interactions (PPIs) as sets of distinct domain-domain interactions (DDIs) that mediate the physical interactions between proteins. Experimental data confirm that DDIs are more consistent than their corresponding PPIs, lending support to the notion that analyses of DDIs may improve our understanding of PPIs and lead to further insights into cellular function, disease, and evolution. However, currently available experimental DDI data cover only a small fraction of all existing PPIs and, in the absence of structural data, determining which particular DDI mediates any given PPI is a challenge.

Results: We present two contributions to the field of domain interaction analysis. First, we introduce a novel computational strategy to merge domain annotation data from multiple databases. We show that when we merged yeast domain annotations from six annotation databases we increased the average number of domains per protein from 1.05 to 2.44, bringing it closer to the estimated average value of 3. Second, we introduce a novel computational method, parameter-dependent DDI selection (PADDS), which, given a set of PPIs, extracts a small set of domain pairs that can reconstruct the original set of protein interactions, while attempting to minimize false positives. Based on a set of PPIs from multiple organisms, our method extracted 27% more experimentally detected DDIs than existing computational approaches.
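The reconstruction idea can be illustrated with a minimal sketch. This is not the PADDS implementation: it only predicts a protein pair as interacting whenever a selected domain pair has one domain on each protein, then compares the predictions with the input PPIs. The function name `reconstruct_ppis` and all protein and domain identifiers are hypothetical toy values.

```python
from itertools import combinations

def reconstruct_ppis(annotations, selected_ddis):
    """Predict PPIs from a set of selected domain pairs.

    annotations: dict mapping protein -> set of domain labels (toy data).
    selected_ddis: iterable of (domain, domain) pairs.
    Illustration only; this is not the PADDS selection procedure.
    """
    ddis = {frozenset(pair) for pair in selected_ddis}
    predicted = set()
    for p, q in combinations(sorted(annotations), 2):
        # A protein pair is predicted if any cross-protein domain pair is selected.
        if any(frozenset((a, b)) in ddis
               for a in annotations[p] for b in annotations[q]):
            predicted.add(frozenset((p, q)))
    return predicted

# Hypothetical annotations and observed interactions.
annotations = {"P1": {"D1", "D2"}, "P2": {"D3"}, "P3": {"D3", "D4"}, "P4": {"D2"}}
observed = {frozenset(("P1", "P2")), frozenset(("P1", "P3"))}

exact = reconstruct_ppis(annotations, [("D1", "D3")])
noisy = reconstruct_ppis(annotations, [("D2", "D3")])
print("missed:", len(observed - exact), "false positives:", len(exact - observed))
print("missed:", len(observed - noisy), "false positives:", len(noisy - observed))
```

In this toy case both domain pairs reconstruct every observed PPI, but only the second also predicts interactions that were not observed, which is the kind of trade-off a selection method has to weigh.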

Conclusions: We have provided a method to merge domain annotation data from multiple sources, ensuring large and consistent domain annotation for any given organism. Moreover, we provided a method to extract a small set of DDIs from the underlying set of PPIs and we showed that, in contrast to existing approaches, our method was not biased towards DDIs with low or high occurrence counts. Finally, we used these two methods to highlight the influence of the underlying annotation density on the characteristics of extracted DDIs. Although increased annotations greatly expanded the possible DDIs, the lack of knowledge of the true biological false positive interactions still prevents an unambiguous assignment of domain interactions responsible for all protein network interactions. Executable files and examples are given at: http://www.bhsai.org/downloads/padds/

Figures

Figure 1
Evaluation of different protein-domain annotation merging strategies. (A) Using the InterPro database, we obtained seven protein-domain annotations for yeast protein YNL271C from three databases: PFAM [32], Superfamily (SF) [33], and SMART [34,35]. PFAM domains: FH2, Drf_FH3, and two Drf_GBD domains; SF domains: Formin homology 2 domain (FH2 domain) and ARM repeat; and SMART domain: Formin Homology. (B) The naïve domain-merging strategy identified seven unique domains for YNL271C. (C) Sequence locations helped identify some of the identical domains (FH2, FH2 domain, and Formin Homology) but could not differentiate between different domains that share the same sequence position. (D) Taking into consideration both sequence location and domain names/labels, our merging strategy identified four unique domains: ARM repeat, Drf_FH3, Drf_GBD, and a domain consisting of FH2 domains (FH2, FH2 domain, and Formin Homology).
Figure 2
Enrichment of “known” (iPFAM) domain-domain interactions. Evaluation of the top-scoring domain-domain interactions (DDIs) extracted by the parameter-dependent DDI selection (PADDS) and the generalized parsimonious explanation (GPE). (A) The fraction of known DDIs in the iPFAM database [38] retrieved by PADDS as a function of α and the number of top-scoring DDIs. (B) Comparison of the percentage of retrieved iPFAM DDIs using PADDS and GPE as a function of top-ranked DDI sets (i.e., recall). (C) Comparison of the fraction of retrieved iPFAM DDIs using PADDS and GPE as a function of the iPFAM DDI set and top-ranked DDI sets (i.e., precision). For the GPE sets, we used the DDI rank information provided with the published data that includes their designated high-confidence (GPE-HC) and low-confidence (GPE-LC) sets [21]. We have also indicated the best results achievable with any α value, typically achieved for α = 0.1.
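The recall and precision measures in panels (B) and (C) can be sketched as follows. This is a minimal illustration under the assumption that DDIs are unordered domain pairs and that a ranked list is cut at the top k entries; the function name `precision_recall_at_k` and the toy pairs are hypothetical and are not iPFAM data.

```python
def precision_recall_at_k(ranked_ddis, known_ddis, k):
    """Precision and recall of the k top-ranked DDIs against a reference set.

    ranked_ddis: list of (domain, domain) pairs, best first.
    known_ddis: iterable of reference (domain, domain) pairs.
    Pairs are treated as unordered, so (A, B) matches (B, A).
    """
    top = {frozenset(pair) for pair in ranked_ddis[:k]}
    known = {frozenset(pair) for pair in known_ddis}
    hits = len(top & known)
    precision = hits / len(top) if top else 0.0
    recall = hits / len(known) if known else 0.0
    return precision, recall

# Hypothetical ranked predictions and reference set.
ranked = [("A", "B"), ("C", "D"), ("A", "C"), ("B", "D")]
known = [("B", "A"), ("A", "C"), ("E", "F")]

for k in (2, 4):
    p, r = precision_recall_at_k(ranked, known, k)
    print(f"top {k}: precision={p:.2f} recall={r:.2f}")
```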
Figure 3
Overlap between extracted domain-domain interaction sets for different values of parameter α. The graphs indicate fractional overlaps between sets of extracted domain-domain interactions (DDIs) for the six different domain annotation schemes defined in Table 2, for different sets of α values. As the underlying set of PPIs, we used a high-confidence yeast PPI data set created by the Interaction Detection Based On Shuffling (IDBOS) procedure at a 5% false discovery rate [8,41].
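The caption does not state how the fractional overlap is computed. As one plausible reading, the sketch below assumes the overlap coefficient, i.e. the number of shared DDIs divided by the size of the smaller set; both the formula and the name `fractional_overlap` are assumptions, and the domain pairs are toy values.

```python
def fractional_overlap(ddis_a, ddis_b):
    """Shared DDIs divided by the size of the smaller set.

    Assumed definition (overlap coefficient); the source does not give
    the exact formula. Pairs are treated as unordered.
    """
    a = {frozenset(pair) for pair in ddis_a}
    b = {frozenset(pair) for pair in ddis_b}
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))

# Hypothetical DDI sets extracted at two different parameter values.
set_low = [("D1", "D2"), ("D2", "D3"), ("D4", "D5")]
set_high = [("D2", "D1"), ("D4", "D5"), ("D6", "D7"), ("D8", "D9")]

print(round(fractional_overlap(set_low, set_high), 3))
```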
Figure 4
Protein-domain annotation merging procedure. An illustration of the computational procedure used to merge protein-domain annotation data from multiple databases for a single protein P (consisting of n amino acids) and domain annotation data from three databases: DB1, DB2, and DB3. INPUT: Protein sequences and protein-domain annotations from one or more databases. PROCESSING: The annotation data were merged in three consecutive steps. In Step I, tandem domains within each protein (and for each database) were merged and represented as a continuous domain with the same domain label as the tandem domains. In Step II, annotation data between all pairs of databases were merged. In Step III, all pairs from Step II were merged into a final annotation set. In this step, new domain labels were assigned to the sets of merged domains. OUTPUT: The output of the annotation merging procedure consists of 1) a set of new (merged) domain labels assigned to the protein, 2) a mapping between the new and original domain labels, and 3) a list of merging exceptions. Based on these lists, one may (re)define sets of labels that should be treated as equivalent or non-equivalent and iterate through the complete domain annotation merging procedure (ITERATION).
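Step I (tandem-domain merging) can be sketched as below. The caption does not say how close two same-label domains must be to count as tandem, so the `max_gap` parameter is an assumption, as are the function name `merge_tandem_domains` and the coordinates, which are invented for illustration.

```python
def merge_tandem_domains(domains, max_gap=0):
    """Merge consecutive same-label domains from one database on one protein.

    domains: list of (label, start, end), 1-based inclusive positions.
    max_gap: largest number of residues allowed between two domains for
             them to count as tandem (an assumed criterion).
    """
    merged = []
    for label, start, end in sorted(domains, key=lambda d: (d[1], d[2])):
        if merged and merged[-1][0] == label and start - merged[-1][2] - 1 <= max_gap:
            # Extend the previous domain; it keeps the shared label.
            prev_label, prev_start, prev_end = merged[-1]
            merged[-1] = (prev_label, prev_start, max(prev_end, end))
        else:
            merged.append((label, start, end))
    return merged

# Hypothetical annotation: two copies of domain "A" separated by 4 residues.
annotation = [("A", 10, 80), ("A", 85, 150), ("B", 400, 800)]
print(merge_tandem_domains(annotation, max_gap=10))
```

Because the list is sorted by start position, a domain with a different label that starts between two same-label domains prevents them from being merged.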
Figure 5
Example of domain-domain interaction extraction. I: Given a set of protein-protein interactions (PPIs) and a protein-domain annotation scheme, PADDS transformed all PPIs into the corresponding set of domain-domain interactions (DDIs) and calculated the benefit value Bij for all DDIs. II: The five steps involved in the DDI iterative evaluation procedure are illustrated using interactions between domains D1 and D3. III: After PADDS performed the DDI evaluation procedure for all other DDIs, the results were examined to select the final set of DDIs that can reconstitute the PPIs. P1, …, P7 denote proteins and D1, …, D8 denote domains. The benefit Bij and the reassessed benefit Bijr associated with the interaction between domains i and j were calculated using Equations (3) and (4), respectively.
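The first part of panel I, transforming PPIs into candidate DDIs, can be sketched as follows. The sketch only records which PPIs each candidate domain pair could explain; it does not compute the benefit value, since Equations (3) and (4) are not reproduced here. The function name `candidate_ddis` and the toy annotations are hypothetical and do not reproduce the figure.

```python
from collections import defaultdict
from itertools import product

def candidate_ddis(ppis, annotations):
    """Map each candidate domain pair to the set of PPIs it could explain.

    ppis: iterable of (protein, protein) pairs.
    annotations: dict mapping protein -> list of domain labels.
    Only support sets are computed; this is not the benefit value Bij.
    """
    support = defaultdict(set)
    for p, q in ppis:
        # Every cross-protein domain combination is a candidate DDI.
        for a, b in product(annotations.get(p, ()), annotations.get(q, ())):
            support[frozenset((a, b))].add(frozenset((p, q)))
    return dict(support)

# Hypothetical toy input.
toy_annotations = {"P1": ["D1", "D2"], "P2": ["D3"], "P3": ["D3", "D4"]}
toy_ppis = [("P1", "P2"), ("P1", "P3")]

support = candidate_ddis(toy_ppis, toy_annotations)
for ddi, explained in sorted(support.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(ddi), "could explain", len(explained), "PPI(s)")
```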

References

    1. Hart GT, Ramani AK, Marcotte EM. How complete are current yeast and human protein-interaction networks? Genome Biol. 2006;7(11):120. doi: 10.1186/gb-2006-7-11-120.
    2. Sambourg L, Thierry-Mieg N. New insights into protein-protein interaction data lead to increased estimates of the S. cerevisiae interactome size. BMC Bioinformatics. 2010;11:605. doi: 10.1186/1471-2105-11-605.
    3. Stumpf MP, Thorne T, de Silva E, Stewart R, An HJ, Lappe M, Wiuf C. Estimating the size of the human interactome. Proc Natl Acad Sci USA. 2008;105(19):6959–6964. doi: 10.1073/pnas.0708078105.
    4. von Mering C, Krause R, Snel B, Cornell M, Oliver SG, Fields S, Bork P. Comparative assessment of large-scale data sets of protein-protein interactions. Nature. 2002;417(6887):399–403.
    5. Yu H, Braun P, Yildirim MA, Lemmens I, Venkatesan K, Sahalie J, Hirozane-Kishikawa T, Gebreab F, Li N, Simonis N. High-quality binary protein interaction map of the yeast interactome network. Science. 2008;322(5898):104–110. doi: 10.1126/science.1158684.
