BMC Bioinformatics. 2008 May 12;9:234.
doi: 10.1186/1471-2105-9-234.

Comprehensive inventory of protein complexes in the Protein Data Bank from consistent classification of interfaces

Andrew J Bordner et al. BMC Bioinformatics.

Abstract

Background: Protein-protein interactions are ubiquitous and essential for all cellular processes. High-resolution X-ray crystallographic structures of protein complexes can reveal the details of their function and provide a basis for many computational and experimental approaches. Differentiation between biological and non-biological contacts and reconstruction of the intact complex is a challenging computational problem. A successful solution can provide additional insights into the fundamental principles of biological recognition and reduce errors in many algorithms and databases utilizing interaction information extracted from the Protein Data Bank (PDB).

Results: We have developed a method for identifying protein complexes in PDB X-ray structures by a four-step procedure: (1) comprehensively collecting all protein-protein interfaces; (2) clustering similar protein-protein interfaces together; (3) estimating the probability that each cluster is biologically relevant based on a diverse set of properties; and (4) combining these scores for each PDB entry in order to predict the complex structure. The resulting clusters of biologically relevant interfaces provide a reliable catalog of evolutionarily conserved protein-protein interactions. These interfaces, as well as the predicted protein complexes, are available from the Protein Interface Server (PInS) website (see Availability and requirements section).
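The shape of steps (3) and (4) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the entry names, cluster identifiers, probabilities, and the fixed 0.5 threshold are all hypothetical, and the abstract does not state how the per-cluster scores are actually combined.

```python
from collections import defaultdict

# Hypothetical toy records: (entry, subunit_a, subunit_b, cluster_id).
# Entry names and cluster ids are placeholders, not real PDB data.
INTERFACES = [
    ("ent1", "A", "B", 0),
    ("ent1", "A", "C", 1),
    ("ent2", "A", "B", 0),
    ("ent2", "B", "C", 2),
]

# Hypothetical per-cluster probabilities of biological relevance (step 3).
CLUSTER_PROBABILITY = {0: 0.92, 1: 0.08, 2: 0.61}


def select_relevant_interfaces(interfaces, cluster_probability, threshold=0.5):
    """Group interfaces by entry, keeping those whose cluster scores at or above threshold.

    The simple threshold rule is an assumption made for illustration only.
    """
    kept = defaultdict(list)
    for entry, sub_a, sub_b, cluster_id in interfaces:
        if cluster_probability[cluster_id] >= threshold:
            kept[entry].append((sub_a, sub_b))
    return dict(kept)


if __name__ == "__main__":
    for entry, contacts in select_relevant_interfaces(
        INTERFACES, CLUSTER_PROBABILITY
    ).items():
        print(entry, contacts)
```

Because every interface in a cluster shares one score, a decision made for a cluster applies consistently to all entries that contain an interface from it, which is the point of classifying clusters rather than individual interfaces.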

Conclusion: Our method demonstrates an almost two-fold reduction in the annotation error rate, as evaluated on a large benchmark set of complexes validated from the literature. We also estimate the relative contribution of each interface property to the accurate discrimination of biologically relevant interfaces and discuss possible directions for further improving the prediction method.

Figures

Figure 1
Estimated probability density function for the Random Forest scores in each class.
Figure 2
ROC curve for the Random Forest prediction of benchmark set interfaces using 10-fold cross-validation. The high prediction accuracy is shown by an area under the curve as high as 0.945.
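For readers unfamiliar with the statistic, the area under the ROC curve equals the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. The sketch below computes it that way in plain Python. The labels and scores are made-up toy values, not the benchmark data, and the cross-validation step is not reproduced.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (positive, negative) pairs in which the positive
    example has the higher score; ties count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one example of each class")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    # Toy data: 1 = biologically relevant interface, 0 = non-biological contact.
    toy_labels = [1, 1, 1, 0, 0, 0]
    toy_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
    print(roc_auc(toy_labels, toy_scores))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation of the two classes.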
Figure 3
Correctly predicted α2β2 structure for a bacterial nitrile hydratase from a PDB structure (entry 1UGQ) that is incorrectly annotated as a heterodimer.
Figure 4
Venn diagram of the number of common Pfam-A domain family contacts in the predicted biological complexes, in the 3DID database, and in the iPfam database (version 21.0).
Figure 5
Schematic graph representation of the predicted homohexamer complex for E. coli phosphopantetheine adenylyltransferase (PDB entry 1B6T). Nodes represent subunits, denoted by their chain name and symmetry transformation, and edges represent inter-subunit contacts in the complex.
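A graph of this kind can be sketched as an adjacency map in which each node is a (chain name, symmetry operation) pair. The Python example below is illustrative only: the integer symmetry index and the ring-shaped contact list are assumptions and do not reproduce the actual contacts of entry 1B6T. Connected components of the graph are read off as candidate complexes.

```python
from collections import defaultdict, deque

# Hypothetical contacts among six symmetry copies of chain "A".
# Each node is (chain_name, symmetry_operation_index); the ring topology
# is a placeholder, not the real contact pattern of any PDB entry.
CONTACTS = [
    (("A", 0), ("A", 1)),
    (("A", 1), ("A", 2)),
    (("A", 2), ("A", 3)),
    (("A", 3), ("A", 4)),
    (("A", 4), ("A", 5)),
    (("A", 5), ("A", 0)),
]


def build_contact_graph(contacts):
    """Return an undirected adjacency map from a list of subunit pairs."""
    graph = defaultdict(set)
    for a, b in contacts:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def connected_complexes(graph):
    """Return the connected components of the graph as sorted node lists."""
    seen = set()
    complexes = []
    for start in sorted(graph):
        if start in seen:
            continue
        component = []
        queue = deque([start])
        seen.add(start)
        while queue:  # breadth-first traversal from the start node
            node = queue.popleft()
            component.append(node)
            for neighbor in graph[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        complexes.append(sorted(component))
    return complexes


if __name__ == "__main__":
    for subunits in connected_complexes(build_contact_graph(CONTACTS)):
        print(len(subunits), "subunits:", subunits)
```

Keying nodes on both chain name and symmetry operation lets a single chain in the asymmetric unit appear several times, which is what a homo-oligomer built by crystal symmetry requires.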
