PLoS Comput Biol. 2010 May 20;6(5):e1000789.
doi: 10.1371/journal.pcbi.1000789.

Novel peptide-mediated interactions derived from high-resolution 3-dimensional structures

Amelie Stein et al. PLoS Comput Biol.

Abstract

Many biological responses to intra- and extracellular stimuli are regulated through complex networks of transient protein interactions where a globular domain in one protein recognizes a linear peptide from another, creating a relatively small contact interface. These peptide stretches are often found in unstructured regions of proteins, and contain a consensus motif complementary to the interaction surface displayed by their binding partners. While most current methods for the de novo discovery of such motifs exploit their tendency to occur in disordered regions, our work here focuses on another observation: upon binding to their partner domain, motifs adopt a well-defined structure. Indeed, through the analysis of all peptide-mediated interactions of known high-resolution three-dimensional (3D) structure, we found that the structure of the peptide may be as characteristic as the consensus motif, and help identify target peptides even though they do not match the established patterns. Our analyses of the structural features of known motifs reveal that they tend to have a particular stretched and elongated structure, unlike most other peptides of the same length. Accordingly, we have implemented a strategy based on a Support Vector Machine that uses these features, along with other structure-encoded information about binding interfaces, to search the set of protein interactions of known 3D structure and to identify unnoticed peptide-mediated interactions among them. We have also derived consensus patterns for these interactions, whenever enough information was available, and compared our results with established linear motif patterns and their binding domains. Finally, to cross-validate our identification strategy, we scanned interactome networks from four model organisms with our newly derived patterns to see if any of them occurred more often than expected.
Indeed, we found significant over-representations for 64 domain-motif interactions, 46 of which had not been described before, involving over 6,000 interactions in total for which we could suggest the molecular details determining the binding.

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Figure 1. Linearity and elongation of linear motifs.
(A) The Retinoblastoma-associated protein B domain (RB_B)-binding peptide shows the typical linear and elongated form found in 3D structures of many motifs (PDB ID 1gh6). The concepts of linearity (the maximum deviation of any Cα in the motif from the line through the first and last Cα) and elongation (the distance between the first and last Cα of a motif) are illustrated in this structure. (B) A slice of the data used for SVM training: linearity, elongation and secondary structure classification for 7-residue-peptides, with data from the SCOP background shown as dots and the data for known DMI shown as solid triangles, using one colour per DSSP classification. Panels (C) to (F) show the distribution of linearity∶elongation values for those secondary structure classifications for which we had known 7-residue-peptides (none, alpha-helix, bend, and turn). These data slices illustrate how known linear motifs fall into distinct regions of the parameter space.
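The two geometric descriptors defined in this caption can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: it assumes that each residue is represented by the coordinates of its C-alpha atom, and the function names `elongation` and `linearity` are hypothetical.

```python
import math

def elongation(ca_coords):
    """Distance between the first and last C-alpha of a peptide."""
    return math.dist(ca_coords[0], ca_coords[-1])

def linearity(ca_coords):
    """Maximum distance of any C-alpha from the line through the first and last C-alpha."""
    first, last = ca_coords[0], ca_coords[-1]
    axis = [b - a for a, b in zip(first, last)]
    axis_len = math.sqrt(sum(c * c for c in axis))
    if axis_len == 0.0:
        # Degenerate case (first and last atom coincide): no line is defined,
        # so fall back to the largest distance from that single point.
        return max(math.dist(first, p) for p in ca_coords)
    max_dev = 0.0
    for p in ca_coords:
        rel = [b - a for a, b in zip(first, p)]
        # |rel x axis| / |axis| is the point-to-line distance.
        cx = rel[1] * axis[2] - rel[2] * axis[1]
        cy = rel[2] * axis[0] - rel[0] * axis[2]
        cz = rel[0] * axis[1] - rel[1] * axis[0]
        max_dev = max(max_dev, math.sqrt(cx * cx + cy * cy + cz * cz) / axis_len)
    return max_dev

# Toy three-residue "peptide" (made-up coordinates, in Å).
toy = [(0.0, 0.0, 0.0), (3.0, 1.0, 0.0), (6.0, 0.0, 0.0)]
print(elongation(toy), linearity(toy))  # 6.0 1.0
```

A stretched peptide gives a large elongation and a small linearity value, which is the region of parameter space that the caption associates with known motifs.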
Figure 2. Overview of the generation and filtering of motif-like peptides.
(Steps 1 and 2) We generated all possible peptides of 4–20 residues from regions of 3D structures that did not match Pfam domains. (3, 4) For peptides accepted by the SVM trained on linearity and elongation (cf. Figure 1) we computed whether there were sufficient contacts with domains in the same structure, which may be in the same or in another protein chain. (5) Peptides that are completely covered by other (longer) peptides are removed, so that the largest accepted peptide represents shorter candidates binding to the same region. (6) Peptides in intrachain interactions that are part of CATH domains are often artefacts of differences between structure- and sequence-based domain assignment and are therefore excluded. (7) Peptides in intrachain interactions that are sequentially directly next to the binding domain are often artefacts and thus removed, though in general peptides close to domains are allowed, as long as they have a sufficient sequential distance from their binding domain. (8) Candidate DMIs are excluded if the interface is smaller than 150 Å², or if the interface between domain and peptide is less than 50% of the total interface between the proteins.
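Three of the steps above (peptide enumeration, removal of covered peptides, and the interface filter) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' code: peptides are represented as half-open `(start, end)` residue index pairs, the interface values are assumed to be areas in Å², and all function names are hypothetical. The SVM, contact, CATH and sequence-distance steps are not modelled.

```python
def enumerate_peptides(region_start, region_end, min_len=4, max_len=20):
    """Steps 1-2: all windows of min_len..max_len residues inside one
    non-domain region, as half-open (start, end) index pairs."""
    return [
        (start, start + length)
        for length in range(min_len, max_len + 1)
        for start in range(region_start, region_end - length + 1)
    ]

def drop_covered(peptides):
    """Step 5: remove peptides that lie completely inside another peptide."""
    unique = set(peptides)
    return sorted(
        p for p in unique
        if not any(q != p and q[0] <= p[0] and p[1] <= q[1] for q in unique)
    )

def passes_interface_filter(peptide_domain_area, total_protein_area,
                            min_area=150.0, min_fraction=0.5):
    """Step 8: keep a candidate only if its domain-peptide interface is at
    least min_area and at least min_fraction of the total interface."""
    return (peptide_domain_area >= min_area
            and peptide_domain_area >= min_fraction * total_protein_area)

# Toy region of five residues: two 4-mers and one 5-mer are generated,
# and only the 5-mer survives the coverage filter.
candidates = enumerate_peptides(0, 5)
print(candidates)                # [(0, 4), (1, 5), (0, 5)]
print(drop_covered(candidates))  # [(0, 5)]
```

The quadratic coverage check is chosen for readability; a sort-based sweep would scale better on large structures.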
Figure 3. Joining of partially overlapping peptides for sequence-based clustering.
(A) Partially overlapping peptides cannot be represented by either one, as both may contribute to an interface in ways not covered by the other. Yet to improve the quality of peptide alignments, and to ensure that motif matches in the overlapping regions (shown in gray) are only counted once for motif support, we need to create a construct that holds unique, non-overlapping regions of one or more peptides accepted by the SVM and having a sufficient interface with a domain. (B) Thus, for each continuous stretch of a protein that is covered by one or more peptides, we built a peptide-containing region. (C) These regions are then aligned to generate non-redundant sets of peptides binding to a given domain, and each motif match in a peptide-containing region only qualifies for motif support once. The 90% sequence clustering of the DMIs is computed from a combination of the sequence identities of peptide-containing regions and those of the binding domains.
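The construction of peptide-containing regions described in (B), and the rule in (C) that each region contributes at most once to motif support, can be sketched as an interval merge. This is a minimal sketch with assumptions: peptides are half-open `(start, end)` index pairs on one protein, the function names are hypothetical, and the sequence and the pattern `P.LP` are toy values rather than data from the paper.

```python
import re

def peptide_containing_regions(peptides):
    """Merge overlapping or directly adjacent peptides into the continuous
    stretches of the protein that they cover."""
    regions = []
    for start, end in sorted(peptides):
        if regions and start <= regions[-1][1]:
            # Continues the current stretch: extend it if needed.
            regions[-1][1] = max(regions[-1][1], end)
        else:
            regions.append([start, end])
    return [tuple(region) for region in regions]

def motif_support(sequence, regions, pattern):
    """Count regions containing at least one match, so that several matches
    within the same region are only counted once."""
    regex = re.compile(pattern)
    return sum(1 for start, end in regions if regex.search(sequence[start:end]))

regions = peptide_containing_regions([(10, 17), (14, 22), (40, 46)])
print(regions)  # [(10, 22), (40, 46)]

toy_sequence = "GGPPLPGGGGGGPALPGG"
print(motif_support(toy_sequence, [(0, 8), (10, 18)], r"P.LP"))  # 2
```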
Figure 4. Topological clustering of peptide-mediated interactions.
(A) Alignment of BRCA1 C Terminus (BRCT) sequences to the domain's HMM profile; interface residues are highlighted. The colour corresponds to the “rainbow” colouring scheme used for the domain visualisation in panel B. Lowercase letters refer to amino acids that do not match the domain's profile, - to positions in the profile that do not occur in the given sequence. (B) Clustering of the interaction topologies, based on shared interface residues. Domains with the same or highly similar topologies are grouped together. In the structural representation, all three BRCT domains have the same orientation. Note that the BRCT domain usually forms dimers that bind the peptide, using the interfaces from clusters (3,4,5,6) and (1,2,9), respectively (cf. Figure 5).
Figure 5. DMIs significantly enriched in the interactomes.
Significantly enriched motifs were found for 46 distinct domains (shown in gray; PCNA_N and PCNA_C are shown in the same structure). Binding peptides are given in a rainbow colour scheme, with the SVM-accepted part in sticks representation and the consensus motif in surface representation. In most cases, differences between the interaction types for a given domain are subtle, thus only one is shown in this representative figure. However, for domains that form repeats to bind peptides (Arm, BRCT (cf. Fig. 4 and main text), TPR_1, TRF, WD40), we have visualized all domains required to bind one peptide; these usually employ different interaction types. Blue domain names indicate those that were described in the ELM training dataset, violet names mark additions to ELM since 2007, which were not in our training set, and green names indicate DMIs that are described on the Pawson lab web site but not in ELM.
