BMC Bioinformatics. 2007 Jun 27;8:223.
doi: 10.1186/1471-2105-8-223.

High-throughput identification of interacting protein-protein binding sites

Jo-Lan Chung et al. BMC Bioinformatics.

Abstract

Background: With the growing volume of sequence and structural data, a number of methods have been proposed to locate putative protein binding sites on protein surfaces. Methods that can identify whether these binding sites interact with each other are therefore needed.

Results: We have developed a new method using a machine learning approach to detect whether protein binding sites, once identified, interact with each other. The method exploits information relating to sequence and structural complementarity across protein interfaces and has been tested on a non-redundant data set consisting of 584 homo-dimers and 198 hetero-dimers extracted from the PDB. Results indicate that 87.4% of the interacting binding sites and 68.6% of the non-interacting binding sites were correctly identified. Furthermore, we built a pipeline that links this method to a modified version of our previously developed method that predicts the location of binding sites.

Conclusion: We have demonstrated that this high-throughput pipeline is capable of identifying binding sites for proteins, their interacting binding sites and, ultimately, their binding partners on a large scale.

Figures

Figure 1
The pipeline for the identification of protein binding sites and their binding partners. Predictor 1 can be any method that predicts the location of protein binding sites. Predictor 2 is the method presented in this study that identifies the interacting binding sites.
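
The two-stage flow described in this caption can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `run_pipeline`, `BindingSite`, and the two predictor callables are hypothetical names, and the toy predictors in the usage lines only stand in for real models.

```python
from itertools import combinations
from typing import Callable, Dict, List, Tuple

# Hypothetical representation: a binding site is a list of residue labels.
BindingSite = List[str]

def run_pipeline(
    proteins: Dict[str, str],
    predict_site: Callable[[str], BindingSite],                   # stands in for Predictor 1
    sites_interact: Callable[[BindingSite, BindingSite], bool],   # stands in for Predictor 2
) -> List[Tuple[str, str]]:
    """Return pairs of protein names whose predicted sites are predicted to interact."""
    # Stage 1: locate a putative binding site on every protein.
    sites = {name: predict_site(seq) for name, seq in proteins.items()}
    # Stage 2: test every pair of predicted sites for interaction.
    partners = []
    for a, b in combinations(sorted(sites), 2):
        if sites_interact(sites[a], sites[b]):
            partners.append((a, b))
    return partners

# Toy stand-ins, for illustration only (not real predictors).
def toy_site(seq: str) -> BindingSite:
    return [f"{aa}{i}" for i, aa in enumerate(seq[:3], start=1)]

def toy_interact(s1: BindingSite, s2: BindingSite) -> bool:
    return s1[0][0] != s2[0][0]  # arbitrary rule: first residues differ

pairs = run_pipeline({"A": "MKT", "B": "GSH", "C": "MLV"}, toy_site, toy_interact)
```

Passing the predictors in as callables mirrors the caption's point that Predictor 1 can be any binding-site locator.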
Figure 2
The amino acid contact preferences for (a) homo-dimer and (b) hetero-dimer interfaces. The amino acids are listed according to hydrophobicity [64]. The preferences were calculated with respect to the distribution of interface residues.
Figure 3
The residue contact preferences in terms of secondary structure properties for (a) homo-dimer and (b) hetero-dimer interfaces. H: alpha helix, S: beta strand, C: others, including coil regions. The preferences were calculated with respect to the distribution of interface residues.
Figure 4
The amino acid contact preferences in terms of the extent of water exposure for (a) homo-dimer and (b) hetero-dimer interfaces. PE: partially exposed (40% > ASA >= 15% of a residue's nominal maximum area), FE: fully exposed (ASA >= 40% of a residue's nominal maximum area). The preferences were calculated with respect to the distribution of interface residues.
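
The PE/FE cutoffs in this caption translate directly into a small classifier. The sketch below is illustrative only: the function name `exposure_class` is hypothetical, and the label for residues below 15% is an assumption, since the caption defines only the PE and FE classes.

```python
def exposure_class(asa: float, max_asa: float) -> str:
    """Classify a residue by its ASA relative to its nominal maximum area."""
    if max_asa <= 0:
        raise ValueError("max_asa must be positive")
    rel = asa / max_asa
    if rel >= 0.40:
        return "FE"     # fully exposed: ASA >= 40% of nominal maximum
    if rel >= 0.15:
        return "PE"     # partially exposed: 15% <= ASA < 40%
    return "other"      # assumption: the caption does not name this class
```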
Figure 5
Training classes of the SVM predictor. (a) Positive training class: residue pairs across two interacting binding sites with a distance < 5Å between any of their respective heavy atoms. (b) Negative training class: any possible residue pairs between two non-interacting binding sites.
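
The positive-class criterion (any heavy-atom pair closer than 5 Å) can be sketched as below. This is a hedged illustration rather than the authors' code: the input format (residue label mapped to heavy-atom coordinates) and the name `contacting_pairs` are assumptions, and the coordinates are made up for the example.

```python
from math import dist
from typing import Dict, List, Tuple

Coord = Tuple[float, float, float]
# Hypothetical input format: residue label -> heavy-atom coordinates in Å.
Site = Dict[str, List[Coord]]

def contacting_pairs(site_a: Site, site_b: Site, cutoff: float = 5.0) -> List[Tuple[str, str]]:
    """Residue pairs across two sites with any heavy-atom distance below cutoff."""
    pairs = []
    for res_a, atoms_a in site_a.items():
        for res_b, atoms_b in site_b.items():
            if any(dist(p, q) < cutoff for p in atoms_a for q in atoms_b):
                pairs.append((res_a, res_b))
    return pairs

# Made-up coordinates, for illustration only.
site_a = {"A:LEU10": [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]}
site_b = {"B:VAL22": [(4.0, 0.0, 0.0)], "B:GLY80": [(12.0, 0.0, 0.0)]}
positives = contacting_pairs(site_a, site_b)
```

The negative class needs no distance test: per the caption, every residue pair between two non-interacting sites is a negative example.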
Figure 6
The prediction of contacting residue pairs. Proteins A and B are predicted to interact with each other if the percentage of the predicted contacting residue pairs reaches a certain threshold (see methods for details).
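
The decision rule in this caption amounts to thresholding the fraction of residue pairs predicted to be in contact. The sketch below assumes per-pair boolean predictions are already available. The function name and the threshold values used in the usage lines are hypothetical, since the threshold itself is not given here.

```python
from typing import Sequence

def sites_interact(pair_predictions: Sequence[bool], threshold: float) -> bool:
    """Predict interaction if the fraction of predicted contacts reaches threshold."""
    if not pair_predictions:
        return False  # assumption: no residue pairs means no predicted interaction
    fraction = sum(pair_predictions) / len(pair_predictions)
    return fraction >= threshold

calls = [True, True, False, True]        # 3 of 4 pairs predicted in contact
loose = sites_interact(calls, 0.5)       # hypothetical threshold
strict = sites_interact(calls, 0.8)      # hypothetical threshold
```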
Figure 7
ROC curves for the prediction of interacting and non-interacting protein binding sites using different input features. (a) Predictions using only sequence profile/sequence information but different surface patch or sequence window sizes. (b, c) Predictions using different combinations of sequence profile, secondary structure, and ASA when the surface patch size is (b) 1 and (c) 3. SEQ: sequence profile; SEQ-NE: sequence information only (without the evolutionary information provided by homologous sequences); SEC: secondary structure; ASA: accessible surface area.
Figure 8
Prediction accuracy for interacting and non-interacting protein binding sites. Two protein binding sites were predicted to interact with each other if the percentage of the predicted contacting residue pairs reached a certain threshold.
Figure 9
Comparison of the ROC curves for the prediction of interacting and non-interacting protein binding sites between homo-dimer and hetero-dimer interfaces using training data of different sizes.
Figure 10
The important contacting residues across the heavy chain and the light chain of the CD1d1 complex (PDB code: 1CD1) assigned by our predictor. The high scoring residues at the binding site (yellow) of the heavy chain (green) were colored orange and presented as spheres. The high scoring residues at the binding site (purple) of the light chain (light blue) were colored red and presented as spheres.
Figure 11
The important contacting residues across the PyrDB and the PyrK subunits of dihydroorotate dehydrogenase B (PDB code: 1EP1) assigned by our predictor. The high scoring residues at the binding site (yellow) of the PyrDB subunit (green) were colored orange and presented as spheres. The high scoring residues at the binding site (purple) of the PyrK subunit (light blue) were colored red and presented as spheres. Three cofactors, FMN, FAD and the [2Fe-2S] cluster were colored gray and presented as spheres.
Figure 12
The important contacting residues across protein kinase cdk2 and cyclin (PDB code: 1F5Q) assigned by our predictor. The high scoring residues at the binding site (yellow) of cdk2 (green) were colored orange and presented as spheres. The high scoring residues at the binding site (purple) of cyclin (light blue) were colored red and presented as spheres.
Figure 13
Comparison of the prediction performances based on known binding sites and putative binding sites using sequence profile with a surface patch size of 3 as input.
Figure 14
The important contacting residues across the N-terminal and the C-terminal segments of Rad50 abc-ATPase (PDB code: 1II8) assigned by our predictor based on (a) known binding sites and (b) putative binding sites. The high scoring residues at the binding site (yellow) of the N-terminal segment (green) were colored orange. The high scoring residues at the binding site (purple) of the C-terminal segment (light blue) were colored red.
Figure 15
The important contacting residues across gelsolin G4–G6 domains and actin (PDB code: 1H1V) assigned by our predictor based on (a) known binding sites and (b) putative binding sites. The high scoring residues at the binding site (yellow) of actin (green) were colored orange. The high scoring residues at the binding site (purple) of gelsolin G4–G6 domains (light blue) were colored red.

References

    1. von Mering C, Krause R, Snel B, Cornell M, Oliver SG, Fields S, Bork P. Comparative assessment of large-scale data sets of protein-protein interactions. Nature. 2002;417:399–403. doi: 10.1038/nature750.
    2. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340:245–246. doi: 10.1038/340245a0.
    3. McCafferty J, Griffiths AD, Winter G, Chiswell DJ. Phage antibodies: filamentous phage displaying antibody variable domains. Nature. 1990;348:552–554. doi: 10.1038/348552a0.
    4. Gavin AC, Bosche M, Krause R, Grandi P, Marzioch M, Bauer A, Schultz J, Rick JM, Michon AM, Cruciat CM, et al. Functional organization of the yeast proteome by systematic analysis of protein complexes. Nature. 2002;415:141–147. doi: 10.1038/415141a.
    5. Valencia A, Pazos F. Computational methods for the prediction of protein interactions. Curr Opin Struct Biol. 2002;12:368–373. doi: 10.1016/S0959-440X(02)00333-0.