BMC Genomics. 2014;15 Suppl 1(Suppl 1):S5.
doi: 10.1186/1471-2164-15-S1-S5. Epub 2014 Jan 24.

Cluster based prediction of PDZ-peptide interactions


Kousik Kundu et al. BMC Genomics. 2014.

Abstract

Background: PDZ domains are among the most promiscuous protein recognition modules; they bind short linear peptides and play an important role in cellular signaling. Recently, a few high-throughput techniques (e.g. protein microarray screens, phage display) have been applied to determine the in vitro binding specificity of PDZ domains. Many computational methods are currently available to predict PDZ-peptide interactions, but they often provide domain-specific models and/or have limited domain coverage.

Results: Here, we compiled the largest set of PDZ domains derived from the human, mouse, fly and worm proteomes and defined binding models for PDZ domain families to improve domain coverage and prediction specificity. To this end, we first identified a novel set of 138 PDZ families, comprising 548 PDZ domains from the aforementioned organisms, based on efficient clustering by sequence identity. For 43 PDZ families, covering 226 PDZ domains with available interaction data, we built specialized models using a support vector machine approach. The advantage of family-wise models is that they can also be used to determine the binding specificity of a newly characterized PDZ domain that has sufficient sequence identity to a known family. Since most current experimental approaches provide only positive data, we have to cope with the class imbalance problem. Thus, to enrich the negative class, we introduced a semi-supervised technique to generate high-confidence non-interaction data. We report competitive predictive performance with respect to state-of-the-art approaches.
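The family-wise idea above, assigning a newly characterized domain to a known family when it shares enough sequence identity, can be illustrated with a minimal sketch. It assumes the sequences are already aligned to equal length; the function names, the toy sequences and the 50% threshold are hypothetical and are not taken from the paper.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to the same length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    matches = sum(1 for x, y in pairs if x == y)
    return 100.0 * matches / len(pairs)


def assign_family(query: str, families: dict, min_identity: float = 50.0):
    """Return (family, identity) for the best-matching family member.

    The family is None when no member reaches min_identity
    (an assumed cutoff, chosen here only for illustration).
    """
    best_family, best_identity = None, 0.0
    for name, members in families.items():
        for member in members:
            identity = percent_identity(query, member)
            if identity > best_identity:
                best_family, best_identity = name, identity
    if best_identity >= min_identity:
        return best_family, best_identity
    return None, best_identity


# Made-up aligned fragments, not real PDZ domain sequences.
toy_families = {
    "family_1": ["ACDEFGHIKL", "ACDEFGHIRL"],
    "family_2": ["LLLLLLLLLL"],
}
print(assign_family("ACDEFGHIKV", toy_families))
```

A domain that is assigned to a family in this way could then be scored with that family's model; a domain below the threshold would have no applicable model.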

Conclusions: Our approach makes several contributions. First, we show that domain coverage can be increased by applying an accurate clustering technique. Second, we developed a semi-supervised strategy to obtain high-confidence negative data. Third, we allowed for high-order correlations between the amino acid positions in the binding peptides. Fourth, our method is general enough to be readily applicable to other peptide recognition modules such as SH2 domains. Finally, we performed a genome-wide prediction for 101 human and 102 mouse PDZ domains and uncovered novel interactions with biological relevance. We make all the predictive models and genome-wide predictions freely available to the scientific community.
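One common way for a kernel method to capture correlations between peptide positions is a polynomial kernel over one-hot encoded residues. The sketch below illustrates that general idea only; the abstract does not state which kernel or encoding the authors used, and the function names, degree and example peptides are assumptions.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def one_hot(peptide: str) -> list:
    """Encode each residue as a 20-dimensional indicator vector, concatenated."""
    vector = []
    for residue in peptide:
        vector.extend(1.0 if residue == aa else 0.0 for aa in AMINO_ACIDS)
    return vector


def polynomial_kernel(x: list, y: list, degree: int = 2, coef0: float = 1.0) -> float:
    """K(x, y) = (x . y + coef0) ** degree.

    For one-hot peptides the dot product counts matching positions, so
    expanding the power yields terms for pairs (degree 2) or larger groups
    of positions that match jointly.
    """
    dot = sum(a * b for a, b in zip(x, y))
    return (dot + coef0) ** degree


# Hypothetical four-residue C-terminal peptides.
print(polynomial_kernel(one_hot("ETSV"), one_hot("ETDV")))
```

With degree 1 the kernel reduces to a count of matching positions, which treats each position independently; higher degrees are what introduce the joint terms.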


Figures

Figure 1
Clustering of PDZ domains. Phylogenetic tree of all available PDZ domains from human, mouse, fly and worm. The MCL clustering output was mapped onto the phylogenetic tree. The 138 PDZ families are represented by 138 colors. iTOL was used for the visualization [45].
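For readers unfamiliar with MCL (Markov clustering), the following is a minimal pure-Python sketch of its expansion and inflation loop on a small similarity matrix. It is not the authors' pipeline: the inflation value, the iteration limit, the self-loop weight and the toy matrix are assumptions, and real analyses would normally use the dedicated MCL software.

```python
def normalize_columns(m):
    """Scale every column to sum to 1 (column-stochastic matrix)."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] if sums[j] else 0.0 for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Plain square matrix product."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mcl(similarity, inflation=2.0, max_iter=50, tol=1e-9):
    """Cluster nodes of a symmetric similarity matrix; returns sorted index lists."""
    n = len(similarity)
    m = [[float(v) for v in row] for row in similarity]
    for i in range(n):
        m[i][i] = max(m[i][i], 1.0)  # self-loops (assumed weight of at least 1)
    m = normalize_columns(m)
    for _ in range(max_iter):
        expanded = matmul(m, m)  # expansion: flow spreads along paths
        inflated = normalize_columns(
            [[v ** inflation for v in row] for row in expanded]
        )  # inflation: strong flows are boosted, weak ones fade
        delta = max(abs(inflated[i][j] - m[i][j])
                    for i in range(n) for j in range(n))
        m = inflated
        if delta < tol:
            break
    # Each row that still carries flow lists the members of one cluster.
    clusters = {frozenset(j for j, v in enumerate(row) if v > 1e-6) for row in m}
    return sorted(sorted(c) for c in clusters if c)


# Toy similarity matrix: two groups with no similarity between them.
toy = [
    [1, 1, 1, 0, 0],
    [1, 1, 1, 0, 0],
    [1, 1, 1, 0, 0],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
]
print(mcl(toy))
```

In a setting like the one in the figure, the matrix entries would be pairwise domain similarities, and each resulting cluster would correspond to one family.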
Figure 2
PDZ-peptide complex structure. Representative PDZ-peptide complex structure (PDB ID: 4G69) for PDZ family 1. The second PDZ domain of human DLG1 binds the C-terminal peptide of the human APC protein. Green lines indicate binding pairs separated by less than 4.5 angstroms. UCSF Chimera was used for the visualization [46].
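The 4.5 angstrom criterion in the caption can be sketched as a simple distance filter over coordinates. The example below assumes atom-level coordinates supplied as labelled (x, y, z) tuples; whether the authors measured atom-atom or residue-level distances is not stated here, and the labels and coordinates are invented, not taken from 4G69.

```python
import math


def contacts_within(domain_atoms, peptide_atoms, cutoff=4.5):
    """List (domain_label, peptide_label, distance) for pairs closer than cutoff.

    Each input is a list of (label, (x, y, z)) tuples with coordinates in angstroms.
    """
    pairs = []
    for d_label, d_xyz in domain_atoms:
        for p_label, p_xyz in peptide_atoms:
            distance = math.dist(d_xyz, p_xyz)  # Euclidean distance
            if distance < cutoff:
                pairs.append((d_label, p_label, round(distance, 2)))
    return pairs


# Invented coordinates for illustration only.
domain = [("D1", (0.0, 0.0, 0.0)), ("D2", (10.0, 0.0, 0.0))]
peptide = [("P1", (3.0, 0.0, 0.0)), ("P2", (0.0, 4.0, 3.0))]
print(contacts_within(domain, peptide))
```

For a real structure, the coordinate lists would come from a PDB parser, and the all-against-all loop could be replaced by a spatial index if the structure is large.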
Figure 3
Performance. (A) The AUC-ROC and (B) the AUC-PR curves obtained with the sequence-based feature encoding method.
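As a reminder of what the two summary metrics measure, here is a minimal sketch that computes the area under the ROC curve via the pairwise ranking formulation, and average precision as one common estimate of the area under the precision-recall curve. The labels and scores are made up, and the authors' exact evaluation procedure may differ.

```python
def roc_auc(labels, scores):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Mean of the precision values at the rank of each positive example.

    Tied scores are ordered arbitrarily by the sort, which is a
    simplification of this sketch.
    """
    n_pos = sum(1 for y in labels if y == 1)
    if n_pos == 0:
        raise ValueError("need at least one positive example")
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / n_pos


# Made-up predictions: 1 = interaction, 0 = non-interaction.
toy_labels = [1, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.7, 0.1]
print(roc_auc(toy_labels, toy_scores), average_precision(toy_labels, toy_scores))
```

The precision-recall view is usually the more informative of the two when negatives greatly outnumber positives, which is the class imbalance situation the abstract describes.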
Figure 4
Performance evaluation on an independent test set. Performance comparison of three different tools. Red, green and blue bars indicate the performance of our tool (SVM), DomPep and MDSM, respectively. Our tool (SVM) achieved better performance than the other two.


