PLoS Comput Biol. 2010 May 20;6(5):e1000789.

doi: 10.1371/journal.pcbi.1000789.

Novel peptide-mediated interactions derived from high-resolution 3-dimensional structures

Amelie Stein¹, Patrick Aloy

PMID: 20502673
PMCID: PMC2873903
DOI: 10.1371/journal.pcbi.1000789

Amelie Stein et al. PLoS Comput Biol. 2010.

Affiliation

¹ Institute for Research in Biomedicine, Joint IRB-BSC Program in Computational Biology, Barcelona, Spain.

Abstract

Many biological responses to intra- and extracellular stimuli are regulated through complex networks of transient protein interactions where a globular domain in one protein recognizes a linear peptide from another, creating a relatively small contact interface. These peptide stretches are often found in unstructured regions of proteins, and contain a consensus motif complementary to the interaction surface displayed by their binding partners. While most current methods for the de novo discovery of such motifs exploit their tendency to occur in disordered regions, our work here focuses on another observation: upon binding to their partner domain, motifs adopt a well-defined structure. Indeed, through the analysis of all peptide-mediated interactions of known high-resolution three-dimensional (3D) structure, we found that the structure of the peptide may be as characteristic as the consensus motif, and may help identify target peptides even when they do not match the established patterns. Our analyses of the structural features of known motifs reveal that they tend to have a particular stretched and elongated structure, unlike most other peptides of the same length. Accordingly, we have implemented a strategy based on a Support Vector Machine that uses these features, along with other structure-encoded information about binding interfaces, to search the set of protein interactions of known 3D structure and to identify unnoticed peptide-mediated interactions among them. We have also derived consensus patterns for these interactions, whenever enough information was available, and compared our results with established linear motif patterns and their binding domains. Finally, to cross-validate our identification strategy, we scanned interactome networks from four model organisms with our newly derived patterns to see if any of them occurred more often than expected. Indeed, we found significant over-representations for 64 domain-motif interactions, 46 of which had not been described before, involving over 6,000 interactions in total for which we could suggest the molecular details determining the binding.
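The cross-validation idea in the last two sentences can be illustrated with a small sketch. The abstract does not state which statistical test was used, so everything below is an assumption made for illustration: the function names `count_supported` and `empirical_p_value` are hypothetical, the motif is expressed as a regular expression, and the null model is a simple random relabelling of proteins in the network.

```python
import random
import re


def count_supported(interactions, has_domain, sequences, pattern):
    """Count interactions where one partner carries the domain and the
    other partner's sequence matches the motif pattern."""
    motif = re.compile(pattern)
    supported = 0
    for a, b in interactions:
        a_binds_b = a in has_domain and motif.search(sequences[b])
        b_binds_a = b in has_domain and motif.search(sequences[a])
        if a_binds_b or b_binds_a:
            supported += 1
    return supported


def empirical_p_value(interactions, has_domain, sequences, pattern,
                      n_random=1000, seed=0):
    """Fraction of label-shuffled networks with at least as many supported
    interactions as observed (assumed null model, not the paper's)."""
    rng = random.Random(seed)
    observed = count_supported(interactions, has_domain, sequences, pattern)
    proteins = list(sequences)
    at_least_as_many = 0
    for _ in range(n_random):
        shuffled = proteins[:]
        rng.shuffle(shuffled)
        relabel = dict(zip(proteins, shuffled))
        randomized = [(relabel[a], relabel[b]) for a, b in interactions]
        if count_supported(randomized, has_domain, sequences, pattern) >= observed:
            at_least_as_many += 1
    # Add-one correction so the estimate is never exactly zero.
    return (at_least_as_many + 1) / (n_random + 1)


# Toy data, invented for this example only.
toy_sequences = {
    "P1": "MKTAYIAK",
    "P2": "AAPPLPPRAA",
    "P3": "GGGGSGGG",
    "P4": "QQPALPPKQQ",
}
toy_interactions = [("P1", "P2"), ("P1", "P3"), ("P3", "P4")]
toy_domain_proteins = {"P1"}
toy_pattern = r"P..P.[KR]"
```

On the toy data, only the P1/P2 pair counts as supported: P1 carries the domain and P2 matches the pattern, whereas P4 matches the pattern but has no domain-bearing partner.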

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

**Figure 1. Linearity and elongation of linear motifs.**
(A) The Retinoblastoma-associated protein B domain (RB_B)-binding peptide shows the typical linear and elongated form found in 3D structures of many motifs (PDB ID 1gh6). The concepts of linearity (the maximum deviation of any Cα in the motif from the line through the first and last Cα) and elongation (the distance between the first and last Cα of a motif) are illustrated in this structure. (B) A slice of the data used for SVM training: linearity, elongation and secondary structure classification for 7-residue peptides, with data from the SCOP background shown as dots and the data for known DMI shown as solid triangles, using one colour per DSSP classification. Panels (C) to (F) show the distribution of linearity∶elongation values for those secondary structure classifications for which we had known 7-residue peptides (none, alpha-helix, bend, and turn). These data slices illustrate how known linear motifs fall into distinct regions of the parameter space.
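The two geometric descriptors defined in panel A can be written down directly. The following is a minimal sketch, assuming the reference atoms are Cα atoms given as (x, y, z) tuples in Å; the function names and the handling of the degenerate case (first and last atom at the same position) are choices made here, not taken from the paper.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _norm(v):
    return math.sqrt(sum(x * x for x in v))


def elongation(ca_coords):
    """Distance between the first and last reference atom of the peptide."""
    return _norm(_sub(ca_coords[-1], ca_coords[0]))


def linearity(ca_coords):
    """Maximum distance of any reference atom from the straight line
    through the first and last atom (smaller means more linear)."""
    first = ca_coords[0]
    axis = _sub(ca_coords[-1], first)
    length = _norm(axis)
    if length == 0.0:
        # Degenerate case (assumption): fall back to distance from the point.
        return max(_norm(_sub(p, first)) for p in ca_coords)
    # Point-to-line distance via |(p - first) x axis| / |axis|.
    return max(_norm(_cross(_sub(p, first), axis)) / length for p in ca_coords)


# Toy coordinates: three atoms forming a shallow triangle.
toy_peptide = [(0.0, 0.0, 0.0), (1.0, 1.0, 0.0), (2.0, 0.0, 0.0)]
```

For `toy_peptide`, the end-to-end distance is 2.0 and the middle atom sits 1.0 away from the line through the ends, so `elongation` returns 2.0 and `linearity` returns 1.0. A fully stretched peptide would give a linearity close to 0 and a large elongation.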

**Figure 2. Overview of the generation and filtering of motif-like peptides.**
(Steps 1 and 2) We generated all possible peptides of 4–20 residues from regions of 3D structures that did not match Pfam domains. (3, 4) For peptides accepted by the SVM trained on linearity and elongation (cf. Figure 1) we computed whether there were sufficient contacts with domains in the same structure, which may be in the same or in another protein chain. (5) Peptides that are completely covered by other (longer) peptides are removed, so that the largest accepted peptide represents shorter candidates binding to the same region. (6) Peptides in intrachain interactions that are part of CATH domains are often artefacts of differences between structure- and sequence-based domain assignment and are therefore excluded. (7) Peptides in intrachain interactions that are directly adjacent in sequence to the binding domain are often artefacts and thus removed, though in general peptides close to domains are allowed, as long as they have a sufficient sequential distance from their *binding* domain. (8) Candidate DMIs are excluded if the interface is smaller than 150 Å², or if the interface between domain and peptide is less than 50% of the total interface between the proteins.
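Three of the steps above lend themselves to a short sketch: enumerating candidate peptides (steps 1 and 2), dropping peptides covered by longer ones (step 5), and the interface-size filter (step 8). The code is illustrative only. Peptides are represented as inclusive (start, end) residue indices, all function names are hypothetical, and it is an assumption that the 150 Å² threshold applies to the domain-peptide interface.

```python
def candidate_windows(region_start, region_end, min_len=4, max_len=20):
    """All (start, end) windows of min_len..max_len residues inside a
    region given by inclusive residue indices."""
    windows = []
    for length in range(min_len, max_len + 1):
        for start in range(region_start, region_end - length + 2):
            windows.append((start, start + length - 1))
    return windows


def drop_covered(peptides):
    """Remove peptides that lie completely inside another peptide."""
    unique = set(peptides)
    return sorted(
        p for p in unique
        if not any(q != p and q[0] <= p[0] and p[1] <= q[1] for q in unique)
    )


def passes_interface_filter(peptide_domain_area, total_area,
                            min_area=150.0, min_fraction=0.5):
    """Keep a candidate only if the domain-peptide interface is large
    enough and makes up enough of the total protein-protein interface.
    Areas are in square angstroms."""
    return (peptide_domain_area >= min_area
            and peptide_domain_area >= min_fraction * total_area)
```

For a five-residue region, `candidate_windows(1, 5)` yields the windows (1, 4), (2, 5) and (1, 5), and `drop_covered` reduces them to the single longest window (1, 5). In the pipeline described above, the SVM and contact checks of steps 3 and 4 would sit between these two calls.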

**Figure 3. Joining of partially overlapping peptides for sequence-based clustering.**
(A) Partially overlapping peptides cannot be represented by either one, as both may contribute to an interface in ways not covered by the other. Yet to improve the quality of peptide alignments, and to ensure that motif matches in the overlapping regions (shown in gray) are only counted once for motif support, we need to create a construct that holds unique, non-overlapping regions of one or more peptides accepted by the SVM and having a sufficient interface with a domain. (B) Thus, for each continuous stretch of a protein that is covered by one or more peptides, we built a *peptide-containing region*. (C) These regions are then aligned to generate non-redundant sets of peptides binding to a given domain, and each motif match in a peptide-containing region only qualifies for motif support once. The 90% sequence clustering of the DMIs is computed from a combination of the sequence identities of peptide-containing regions and those of the binding domains.
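Building a peptide-containing region as described in panel B amounts to merging intervals along the protein sequence. A minimal sketch follows, assuming peptides are inclusive (start, end) residue indices on one chain. The function name is hypothetical, and treating directly adjacent peptides as one continuous stretch is an interpretation of the caption rather than something it states.

```python
def peptide_containing_regions(peptides):
    """Merge overlapping or directly adjacent peptides into continuous
    regions, returned as sorted (start, end) tuples."""
    regions = []
    for start, end in sorted(peptides):
        if regions and start <= regions[-1][1] + 1:
            # Extends the current continuous stretch.
            regions[-1][1] = max(regions[-1][1], end)
        else:
            regions.append([start, end])
    return [tuple(region) for region in regions]
```

For example, the peptides (10, 16) and (14, 22) overlap on residues 14 to 16 and are merged into the region (10, 22), while a separate peptide (40, 45) stays a region of its own. A motif match anywhere inside (10, 22) would then be counted once, regardless of how many accepted peptides cover it.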

**Figure 4. Topological clustering of peptide-mediated interactions.**
(A) Alignment of BRCA1 C Terminus (BRCT) sequences to the domain's HMM profile; interface residues are highlighted. The colour corresponds to the “rainbow” colouring scheme used for the domain visualisation in panel B. Lowercase letters refer to amino acids that do not match the domain's profile, - to positions in the profile that do not occur in the given sequence. (B) Clustering of the interaction topologies, based on shared interface residues. Domains with the same or highly similar topologies are grouped together. In the structural representation, all three BRCT domains have the same orientation. Note that the BRCT domain usually forms dimers that bind the peptide, using the interfaces from clusters (3,4,5,6) and (1,2,9), respectively (cf. Figure 5).
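Grouping domains by shared interface residues, as in panel B, can be sketched as follows. The caption does not name the similarity measure or the clustering algorithm, so Jaccard similarity over profile positions, single-linkage grouping, and the 0.5 threshold are all assumptions made for illustration; the function names are hypothetical.

```python
def jaccard(a, b):
    """Shared fraction of two sets of interface positions."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster_topologies(interfaces, threshold=0.5):
    """Group domain instances whose interface positions (numbered along
    the domain profile) are similar, using single linkage."""
    names = list(interfaces)
    parent = {name: name for name in names}

    def find(name):
        # Union-find lookup with path compression.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if jaccard(interfaces[a], interfaces[b]) >= threshold:
                parent[find(a)] = find(b)

    clusters = {}
    for name in names:
        clusters.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in clusters.values())


# Toy interface positions, invented for this example only.
toy_interfaces = {
    "d1": {1, 2, 3, 4},
    "d2": {2, 3, 4, 5},
    "d3": {20, 21, 22},
}
```

With the toy input, d1 and d2 share three of five positions (similarity 0.6) and end up in one cluster, while d3 forms its own. Numbering interface residues by profile position rather than by sequence position is what makes interfaces from different proteins comparable.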

**Figure 5. DMIs significantly enriched in the interactomes.**
Significantly enriched motifs were found for 46 distinct domains (shown in gray; PCNA_N and PCNA_C are shown in the same structure). Binding peptides are given in a rainbow colour scheme, with the SVM-accepted part in sticks representation and the consensus motif in surface representation. In most cases, differences between the interaction types for a given domain are subtle, thus only one is shown in this representative figure. However, for domains that form repeats to bind peptides (Arm, BRCT (cf. Fig. 4 and main text), TPR_1, TRF, WD40), we have visualized all domains required to bind one peptide; these usually employ different interaction types. Blue domain names indicate those that were described in the ELM training dataset, violet names mark additions to ELM since 2007, which were not in our training set, and green names indicate DMIs that are described on the Pawson lab web site but not in ELM.
