Nucleic Acids Res. 2008 Jun;36(10):3263-73.
doi: 10.1093/nar/gkn161. Epub 2008 Apr 19.

Prediction of phosphotyrosine signaling networks using a scoring matrix-assisted ligand identification approach

Lei Li et al. Nucleic Acids Res. 2008 Jun.

Abstract

Systematic identification of binding partners for modular domains such as Src homology 2 (SH2) is important for understanding the biological function of the corresponding SH2 proteins. We have developed a web-accessible computer program, dubbed SMALI (scoring matrix-assisted ligand identification), for SH2 domains and other signaling modules. The current version of SMALI harbors 76 unique scoring matrices for SH2 domains derived from screening oriented peptide array libraries. These scoring matrices are used to search a protein database for short peptides preferred by an SH2 domain. An experimentally determined cut-off value is used to normalize a SMALI score, thereby allowing direct comparison of peptide-binding potential across different SH2 domains. SMALI employs scoring matrices distinct from those of Scansite, a popular motif-scanning program. Moreover, SMALI contains built-in filters for phosphoproteins, Gene Ontology (GO) correlation and colocalization of subject and query proteins. Compared to Scansite, SMALI exhibited improved accuracy in identifying binding peptides for SH2 domains. Applying SMALI to a group of SH2 domains identified hundreds of interactions that overlap significantly with known networks mediated by the corresponding SH2 proteins, suggesting that SMALI is a useful tool for facile identification of signaling networks mediated by modular domains that recognize short linear peptide motifs.
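
The matrix-based search described in the abstract can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration: the toy matrix `TOY_PSSM`, the cut-off `TOY_CUTOFF`, the function names, the choice of positions +1 to +3 relative to the tyrosine, and the normalization of a raw score by dividing it by the cut-off. None of these values or choices are taken from SMALI itself.

```python
from typing import Dict, List, Tuple

# Hypothetical toy PSSM (not a real SMALI matrix): offset relative to the
# tyrosine -> residue -> weight. Residues not listed contribute 0.
TOY_PSSM: Dict[int, Dict[str, float]] = {
    1: {"V": 0.5, "E": 0.2},
    2: {"N": 1.0, "Q": 0.1},
    3: {"V": 0.3, "P": 0.2},
}
TOY_CUTOFF = 1.0  # hypothetical cut-off used to normalize raw scores


def raw_score(seq: str, y_index: int, pssm: Dict[int, Dict[str, float]]) -> float:
    """Sum the matrix weights of the residues flanking the tyrosine at y_index."""
    total = 0.0
    for offset, weights in pssm.items():
        pos = y_index + offset
        if 0 <= pos < len(seq):  # skip positions that run off the sequence
            total += weights.get(seq[pos], 0.0)
    return total


def scan(seq: str, pssm: Dict[int, Dict[str, float]], cutoff: float) -> List[Tuple[int, float]]:
    """Score every Tyr site; return (1-based position, relative score), best first.

    Assumption: relative score = raw score / cutoff, so values above 1.0
    correspond to sites scoring above the cut-off.
    """
    hits = [
        (i + 1, raw_score(seq, i, pssm) / cutoff)
        for i, residue in enumerate(seq)
        if residue == "Y"
    ]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


if __name__ == "__main__":
    for position, relative in scan("AAYVNVAAYEQP", TOY_PSSM, TOY_CUTOFF):
        print(position, round(relative, 2))
```

Under this assumed normalization, scores computed with different matrices become comparable because each is expressed relative to its own cut-off.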


Figures

Figure 1.
Schematic representation of the SMALI program. (A) An OPAL-SH2 binding profile (shown here for the BRDG1 SH2 domain) was used to generate a position-specific scoring matrix (PSSM) (B). (C) The PSSM was used to search a protein database for tyrosine-containing peptides that are preferred by a query SH2 domain. (D) Selected peptides are ranked according to their SMALI scores and output either unfiltered or after passing through one or more filters as shown. (E) The output file size can be selected. A sample output file is shown (see text for details).
Figure 2.
Sample output of the domain-scan module in SMALI. (A) A query protein can be entered by ID or by typing the sequence in the space provided. A partial sequence is also acceptable. One or more SH2 domains in the pull-down menu may be selected for the prediction. (B) Tabulated results showing the query protein name, sequence, locations of Tyr residues and SH2 domains predicted to bind a particular Tyr site (assuming the site is phosphorylated). A relative SMALI score is given in parentheses beside a selected SH2 domain. Only SH2 domains with a relative score of >1.0 are listed.
Figure 3.
Validation of SMALI-predicted interactions by peptide array and derivation of cut-off SMALI values. (A) Binding profile of the BRDG1 SH2 domain to an array of 1488 top-ranked phosphotyrosine-containing peptides selected by SMALI from the Swiss-Prot human protein database. (B) Binding of the GRB2 SH2 domain to 720 phosphopeptides taken from the Phosphosite database (15). The first 360 peptides (upper portion) were based on SMALI prediction, whereas the second half (lower portion) was randomly chosen from the database. Dark spots indicate positive binding. (C and D) Distribution of binding peptides over SMALI scores for the BRDG1 (C) and GRB2 (D) SH2 domains. The histograms show ‘hit rate’, defined as the percentage of binding peptides, at a given SMALI score range (in increments of 0.1 and 0.2 for C and D, respectively). (E and F) An optimal SMALI cut-off value is arbitrarily defined as the SMALI score that produces the greatest F-measure. F-measure = 2 × precision × recall/(precision + recall), where precision = binding peptides correctly predicted/binding peptides predicted and recall = binding peptides correctly predicted/real binding peptides. For the BRDG1 SH2 domain, a SMALI score of 1.4 produced the largest F-measure, 0.84 (E). Coincidentally, this SMALI value corresponds to a hit rate of ∼50%. For the GRB2 SH2 domain, the cut-off SMALI score is 1.6 (F). (G and H) Distribution of all Tyr-containing peptides (total 203 494) in the Swiss-Prot human database according to SMALI scores calculated using the PSSM for the BRDG1 (G) or GRB2 (H) SH2 domain. The SMALI cut-off of 1.4 for the BRDG1 SH2 domain corresponds to the top 3.5% of scoring peptides, located to the right of the cut-off value (G). For the GRB2 SH2 domain, the cut-off corresponds to the top 5.5% of peptides ranked according to SMALI (H).
Figure 4.
Validation of peptide ligands for the SH2 domains of CRK (A), NCK (B) and FGR (C), as identified by SMALI (upper half of each peptide array) or Scansite (bottom half). For each SH2 domain, a total of 336 peptides were examined, of which the first 168 were identified as top binders by SMALI and the last 168 by Scansite. The sequences of the peptides and their respective ranking orders in SMALI or Scansite are provided in Tables S3–S5. See also Table 2 for a summary of the results.
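
The cut-off selection defined in the Figure 3 caption (the score that maximizes the F-measure) can be sketched in Python as follows. The function names, the toy scores and binding labels, and the candidate cut-offs are all hypothetical. Treating a peptide as "predicted" when its score is at or above the cut-off is also an assumption, since the caption does not say whether the comparison is strict.

```python
from typing import List, Sequence


def f_measure(scores: Sequence[float], binds: Sequence[bool], cutoff: float) -> float:
    """F-measure = 2 * precision * recall / (precision + recall).

    Assumption: a peptide counts as a predicted binder when score >= cutoff.
    """
    predicted = [b for s, b in zip(scores, binds) if s >= cutoff]
    true_positives = sum(predicted)   # predicted binders that really bind
    real_binders = sum(binds)         # all peptides that really bind
    if true_positives == 0 or real_binders == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / real_binders
    return 2 * precision * recall / (precision + recall)


def best_cutoff(scores: Sequence[float], binds: Sequence[bool],
                candidates: Sequence[float]) -> float:
    """Return the candidate cut-off with the largest F-measure."""
    return max(candidates, key=lambda c: f_measure(scores, binds, c))


if __name__ == "__main__":
    # Hypothetical array results: one score and one binding call per peptide.
    toy_scores: List[float] = [0.5, 0.9, 1.2, 1.5, 1.8, 2.0]
    toy_binds: List[bool] = [False, False, True, False, True, True]
    print(best_cutoff(toy_scores, toy_binds, [1.0, 1.4, 1.7]))
```

Raising the cut-off generally trades recall for precision, and the F-measure is one way to pick a single operating point between the two.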

References

    1. Johnson SA, Hunter T. Kinomics: methods for deciphering the kinome. Nat. Methods. 2005;2:17–25.
    2. Blume-Jensen P, Hunter T. Oncogenic kinase signalling. Nature. 2001;411:355–365.
    3. Manning G, Whyte DB, Martinez R, Hunter T, Sudarsanam S. The protein kinase complement of the human genome. Science. 2002;298:1912–1934.
    4. Pawson T, Scott JD. Protein phosphorylation in signaling - 50 years and counting. Trends Biochem. Sci. 2005;30:286–290.
    5. Pawson T. Specificity in signal transduction: from phosphotyrosine-SH2 domain interactions to complex cellular systems. Cell. 2004;116:191–203.
