BMC Bioinformatics. 2010 Jan 18;11 Suppl 1(Suppl 1):S3.
doi: 10.1186/1471-2105-11-S1-S3.

Predicting the protein-protein interactions using primary structures with predicted protein surface

Darby Tien-Hao Chang et al. BMC Bioinformatics. 2010.

Abstract

Background: Many biological functions involve various protein-protein interactions (PPIs). Elucidating such interactions is crucial for understanding general principles of cellular systems. Previous studies have shown the potential of predicting PPIs based on only sequence information. Compared to approaches that require other auxiliary information, these sequence-based approaches can be applied to a broader range of applications.

Results: This study presents a novel sequence-based method built on the assumption that protein-protein interactions depend more on amino acids at the surface than on those at the core. The method considers surface information while still relying on only sequence data, because it includes an accessible surface area (ASA) predictor recently proposed by the authors. This study also reports experiments conducted to evaluate a) the performance of PPI prediction achieved by including the predicted surface and b) the quality of the predicted surface in comparison with the surface obtained from structures. The experimental results show that surface information helps to predict interacting protein pairs. Furthermore, the prediction performance achieved by using the surface estimated with the ASA predictor is close to that achieved by using the surface obtained from protein structures.

Conclusion: This work presents a sequence-based method that takes into account surface information for predicting PPIs. The proposed procedure of surface identification improves the prediction performance, raising the F-measure by 5.1%. The extracted surfaces are also valuable in other biomedical applications that require similar information.

Figures

Figure 1
Workflow of the proposed method to predict interacting protein pairs. Given a pair of protein sequences, this method first encodes each of the two sequences as a vector. The encoding process comprises three steps; the two steps marked with an asterisk are the major contributions of this work. The two vectors are concatenated to form the feature vector of the protein pair, which is submitted to the RVKDE to classify whether the two proteins interact.
Figure 2
Example of the surface predicted by the present method. This example employs the two subunits of RNA polymerase II (PDB ID: 2HZM), Med18 (chain B) and Med20 (chain A), to show the predicted surface relative to the interface residues. The protein chain displayed in spacefill mode is the target subunit used in surface identification; the protein chain displayed in stick mode is treated as the interacting partner of the target subunit. The predicted surface that overlaps the interface residues is shown in yellow, and the non-overlapping region is shown in red. Med18 is the target subunit in (a), and Med20 is the target subunit in (b).
Figure 3
Example of encoding a residue in the PSSM-2SP form. This example encodes the fifth residue (i = 5) of a protein (PDB ID: 154L) with window size 11 (w = 11 and h = 5). A position is represented by a 23-dimensional vector (20 amino acid values, a terminal flag and two group values). The first row is a pseudo terminal residue where only the terminal flag is 1 and all 22 other values are zero. Finally, the i-th residue is encoded with its neighboring positions to form a 253-dimensional feature vector.
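The windowing described in this caption can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the position of the terminal flag inside the 23-dimensional vector (index 20), and the choice to pad every out-of-range window position at either terminus with a pseudo terminal residue are all assumptions. The per-position values themselves (the 20 amino acid values and the two group values) are taken as given input.

```python
POSITION_DIM = 23   # 20 amino acid values + 1 terminal flag + 2 group values
TERMINAL_FLAG = 20  # assumed index of the terminal flag within a position vector


def pseudo_terminal():
    """Pseudo terminal residue: terminal flag is 1, the 22 other values are 0."""
    vec = [0.0] * POSITION_DIM
    vec[TERMINAL_FLAG] = 1.0
    return vec


def encode_residue(positions, i, h=5):
    """Encode the i-th residue (1-based) with a window of size w = 2*h + 1.

    positions: one 23-dimensional vector per residue of the protein.
    Window positions that fall outside the sequence are filled with
    pseudo terminal residues (an assumption of this sketch).
    """
    features = []
    for pos in range(i - h, i + h + 1):
        if 1 <= pos <= len(positions):
            vec = positions[pos - 1]
            if len(vec) != POSITION_DIM:
                raise ValueError("each position needs 23 values")
            features.extend(vec)
        else:
            features.extend(pseudo_terminal())
    return features


# Placeholder position vectors for a 20-residue protein (values are dummies).
positions = [[float(r)] * POSITION_DIM for r in range(1, 21)]
vector = encode_residue(positions, i=5, h=5)
print(len(vector))
```

With i = 5 and h = 5 the window starts one position before the first residue, so the first 23 values are the pseudo terminal residue, matching the first row described in the caption, and the total length is 11 × 23 = 253.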
Figure 4
Identifying the surface of a protein sequence. Input: Each residue of the sequence is associated with a predicted relative solvent accessibility (RSA) value. Step 1: Identify surface residues, i.e., those with RSA values ≥ t. Step 2: Scan the sequence with a sliding window of size w; a window is a surface window if it includes at least o surface residues. Step 3: The predicted surface is the union of all surface windows. In this example, t = 0.3, w = 9 and o = 3.
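The three steps in this caption can be sketched in a few lines of Python. The function name and the decision to consider only windows that lie fully inside the sequence are assumptions of this sketch; the defaults t = 0.3, w = 9 and o = 3 are the values given in the caption.

```python
def identify_surface(rsa, t=0.3, w=9, o=3):
    """Return one boolean per residue: True if it lies in the predicted surface.

    rsa: predicted RSA value for each residue of the sequence.
    """
    n = len(rsa)
    # Step 1: residues with RSA >= t are surface residues.
    is_surface_residue = [value >= t for value in rsa]

    in_surface = [False] * n
    # Step 2: slide a window of size w along the sequence
    # (only windows fully inside the sequence, an assumption here).
    for start in range(n - w + 1):
        window = is_surface_residue[start:start + w]
        if sum(window) >= o:
            # Step 3: the predicted surface is the union of all surface windows.
            for k in range(start, start + w):
                in_surface[k] = True
    return in_surface


# Toy input: three exposed residues followed by twelve buried ones.
rsa = [0.5, 0.5, 0.5] + [0.0] * 12
mask = identify_surface(rsa)
print("".join("S" if flag else "-" for flag in mask))
```

Note that a residue with a low RSA value can still end up in the predicted surface if it shares a window with at least o surface residues, which is what makes the result a set of contiguous segments rather than isolated positions.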
Figure 5
Encoding a protein sequence as a feature vector using conjoint triads. Step 1: Transform the amino acid sequence into the group sequence. Step 2: Scan the predicted surface along the group sequence, and count the triads in the occurrence vector O.
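The two steps in this caption can be sketched as below. Several details are assumptions, since the caption does not state them: the seven-class amino acid grouping is the one commonly used with conjoint triads, a triad is counted only when all three of its residues lie in the predicted surface, and the function and variable names are hypothetical. The surface mask is a per-residue list of booleans, such as the output of a surface identification step.

```python
from itertools import product

# Assumed seven-class grouping of the 20 standard amino acids.
GROUPS = ["AGV", "ILFP", "YMTS", "HNQW", "RK", "DE", "C"]
GROUP_OF = {aa: g for g, members in enumerate(GROUPS) for aa in members}

# One slot of the occurrence vector O per ordered triad of groups (7^3 = 343).
TRIADS = list(product(range(len(GROUPS)), repeat=3))
TRIAD_INDEX = {triad: k for k, triad in enumerate(TRIADS)}


def encode_conjoint_triads(sequence, in_surface):
    """Count group triads that lie entirely within the predicted surface."""
    if len(sequence) != len(in_surface):
        raise ValueError("sequence and surface mask must have equal length")

    # Step 1: transform the amino acid sequence into the group sequence.
    groups = [GROUP_OF[aa] for aa in sequence]

    # Step 2: scan the group sequence and count triads on the surface.
    occurrence = [0] * len(TRIADS)
    for start in range(len(groups) - 2):
        if all(in_surface[start:start + 3]):
            triad = tuple(groups[start:start + 3])
            occurrence[TRIAD_INDEX[triad]] += 1
    return occurrence


# Toy example: only the first three residues are on the surface.
O = encode_conjoint_triads("AGVKDE", [True, True, True, False, False, False])
print(sum(O))
```

Nonstandard residue codes raise a `KeyError` in this sketch; a real pipeline would need an explicit policy for them.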
