Review. Brief Bioinform. 2016 Jan;17(1):117-31.
doi: 10.1093/bib/bbv027. Epub 2015 May 13.

Progress and challenges in predicting protein interfaces

Reyhaneh Esmaielbeiki et al. Brief Bioinform. 2016 Jan.

Abstract

The majority of biological processes are mediated via protein-protein interactions. Determination of residues participating in such interactions improves our understanding of molecular mechanisms and facilitates the development of therapeutics. Experimental approaches to identifying interacting residues, such as mutagenesis, are costly and time-consuming and thus, computational methods for this purpose could streamline conventional pipelines. Here we review the field of computational protein interface prediction. We make a distinction between methods which address proteins in general and those targeted at antibodies, owing to the radically different binding mechanism of antibodies. We organize the multitude of currently available methods hierarchically based on required input and prediction principles to provide an overview of the field.

Keywords: antibody antigen interaction; protein interface prediction; protein–protein interaction.

Figures

Figure 1.
Classification of existing protein interface prediction methods. The leftmost column shows the input required by a method. The middle column shows a simplified pipeline for the protocol. The rightmost (prediction) column shows the resulting binding site in red. Most methods output a ranked list of possible binding sites; for simplicity, a single result is shown for each method.
(A) Sequence-feature-based predictors: These methods receive a protein sequence. Sequential features of the input are compared with features thought to contribute to a residue being part of an interface, such as conservation scores and physico-chemical properties.
(B) 3D mapping-based predictors: These methods receive a protein structure and its sequence as input. Evolutionary conservation is coupled with 3D surface and sequence information. Conserved residues can be grouped according to their surface proximity to form contiguous interface patches.
(C) 3D-classifier-based predictors: The input for these methods is a protein structure and its sequence. Distinct sets of attributes (physico-chemical, evolutionary, 3D structural features, etc.) are used as input to a learning method such as an SVM or Random Forest.
(D) Template-based predictors: These methods receive a protein structure (and thus its sequence) as input. Complex templates are then identified, which can be homologues or structural neighbours (these are shown in white, whereas their binding partners are in green, cyan and yellow). Templates of the input protein are aligned to the query protein. The most commonly aligned contact sites are returned as a prediction.
(E) Partner-specific interface predictors: These methods receive the structures/sequences of two proteins that are assumed to interact. Three groups of methods are shown for this category. Partner-specific descriptors can be calculated to predict interfaces. In some cases docking is used to sample possible orientations to identify a consensus binding site. Partner-specific descriptors and docking poses are used as input for parametric functions and classifiers to obtain the final result. In the co-evolution-based strategy, an MSA of interacting homologues is created and sites that appear to mutate in concert (co-evolve) are assumed to constitute the binding site.
A colour version of this figure is available at BIB online: http://bib.oxfordjournals.org.
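The co-evolution-based strategy in panel (E) can be illustrated with a minimal sketch. It assumes that mutual information between alignment columns is used as the co-variation score; the methods covered by the review may use other statistics, and practical tools also handle gaps and correct for phylogenetic bias, which this sketch does not. The function names and the toy alignments are hypothetical and are not taken from the article.

```python
import math
from collections import Counter


def column_mutual_information(col_a, col_b):
    """Mutual information (in bits) between two equally long alignment columns."""
    n = len(col_a)
    freq_a = Counter(col_a)
    freq_b = Counter(col_b)
    freq_ab = Counter(zip(col_a, col_b))
    mi = 0.0
    for (a, b), count in freq_ab.items():
        p_ab = count / n
        # p_ab / (p_a * p_b), written with raw counts to avoid extra divisions
        mi += p_ab * math.log2(p_ab * n * n / (freq_a[a] * freq_b[b]))
    return mi


def rank_coevolving_pairs(msa_a, msa_b):
    """Score every (column of A, column of B) pair, highest score first.

    Assumption: row k of msa_a and row k of msa_b come from the same
    organism, i.e. the two alignments describe interacting homologues.
    """
    if len(msa_a) != len(msa_b):
        raise ValueError("alignments must contain the same number of paired rows")
    cols_a = list(zip(*msa_a))
    cols_b = list(zip(*msa_b))
    scores = [
        (column_mutual_information(ca, cb), i, j)
        for i, ca in enumerate(cols_a)
        for j, cb in enumerate(cols_b)
    ]
    return sorted(scores, reverse=True)


# Toy paired alignments (hypothetical data): column 1 of protein A
# and column 1 of protein B change together across the four organisms.
msa_a = ["AKL", "AEL", "AKL", "AEL"]
msa_b = ["GD", "GK", "GD", "GK"]

top_score, top_i, top_j = rank_coevolving_pairs(msa_a, msa_b)[0]
print(f"highest-scoring pair: A column {top_i}, B column {top_j}, MI = {top_score:.2f} bits")
```

In this toy input, every column other than the covarying pair is fully conserved, so only that pair receives a non-zero score. High-scoring column pairs would then be taken as candidate interface positions.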
Figure 2.
Antibody structure and binding. The most common form of an antibody is the IgG (upper left). IgG is composed of two pairs of heavy and light chains. The tip of an antibody that carries the binding site (symmetrical in an IgG) is the variable region (upper right). The variable region harbours the six CDR loops, which form the majority of the antigen recognition site, the paratope (lower). The CDR regions are distinct between different antibodies whereas the rest of the antibody remains largely unchanged. The paratope recognizes a specific epitope, the corresponding binding site on the antigen (lower). A colour version of this figure is available at BIB online: http://bib.oxfordjournals.org.
