Mol Syst Biol. 2010 Jun 22;6:385.

doi: 10.1038/msb.2010.41.

Analysis of protein complexes through model-based biclustering of label-free quantitative AP-MS data

Hyungwon Choi¹, Sinae Kim, Anne-Claude Gingras, Alexey I Nesvizhskii

PMID: 20571534
PMCID: PMC2913403
DOI: 10.1038/msb.2010.41

Hyungwon Choi et al. Mol Syst Biol. 2010.

Affiliation

¹ Department of Pathology, University of Michigan, Ann Arbor, MI 48109, USA.

Abstract

Affinity purification followed by mass spectrometry (AP-MS) has become a common approach for identifying protein-protein interactions (PPIs) and complexes. However, data analysis and visualization often rely on generic approaches that do not take advantage of the quantitative nature of AP-MS. We present a novel computational method, nested clustering, for biclustering of label-free quantitative AP-MS data. Our approach forms bait clusters based on the similarity of quantitative interaction profiles and identifies submatrices of prey proteins showing consistent quantitative association within bait clusters. In doing so, nested clustering effectively addresses the problem of overrepresentation of interactions involving bait proteins as compared with proteins only identified as preys. The method does not require specification of the number of bait clusters, which is an advantage over existing model-based clustering methods. We illustrate the performance of the algorithm using two published intermediate-scale human PPI data sets, which are representative of the AP-MS data generated from mammalian cells. We also discuss general challenges of analyzing and interpreting clustering results in the context of AP-MS data.
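The two-stage structure described above (group baits by profile similarity, then select preys within each bait cluster) can be illustrated with a deliberately simplified sketch. This is not the authors' model-based method: greedy leader clustering with a distance threshold stands in for the probabilistic bait clustering, and a plain cutoff on mean abundance stands in for the mixture model over prey abundance levels. The function names, the `radius` and `min_mean` parameters, and the toy matrix are all assumptions made for illustration.

```python
import numpy as np

def leader_cluster(profiles, radius):
    """Greedy leader clustering (a stand-in, not the paper's model).
    A row joins the first cluster whose leader lies within `radius`;
    otherwise it starts a new cluster, so no cluster count is preset."""
    leaders, labels = [], []
    for row in profiles:
        for k, lead in enumerate(leaders):
            if np.linalg.norm(row - lead) <= radius:
                labels.append(k)
                break
        else:
            leaders.append(row)
            labels.append(len(leaders) - 1)
    return np.array(labels)

def nested_biclusters(counts, radius=2.0, min_mean=1.0):
    """counts: baits x preys matrix of normalized spectral counts.
    Returns one (bait_indices, prey_indices) submatrix per bait cluster."""
    labels = leader_cluster(counts, radius)
    biclusters = []
    for k in np.unique(labels):
        baits = np.flatnonzero(labels == k)
        # Mean abundance of each prey across the baits in this cluster.
        prey_means = counts[baits].mean(axis=0)
        # Hypothetical cutoff replacing the mixture over abundance levels.
        preys = np.flatnonzero(prey_means >= min_mean)
        biclusters.append((baits.tolist(), preys.tolist()))
    return biclusters

# Made-up data: rows are baits, columns are preys.
toy = np.array([
    [9.0, 8.0, 0.0, 0.0],
    [8.0, 9.0, 0.0, 1.0],
    [0.0, 0.0, 7.0, 6.0],
    [0.0, 1.0, 6.0, 7.0],
])
result = nested_biclusters(toy)
```

On this toy matrix the sketch recovers two biclusters, one per block of co-purifying baits and preys. Unlike the published method, it still depends on a hand-chosen threshold, which is exactly the kind of tuning a model-based approach is meant to avoid.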

Conflict of interest statement

The authors declare that they have no conflict of interest.

Figures

**Figure 1**
Overview of the computational method. (A) Nested clustering algorithm. Baits are probabilistically assigned to bait clusters with associated mean and variance. The diameter of circles is proportional to the normalized spectral counts. In bait clusters, mean abundance of each prey is drawn as a square. Mixture modeling is used to group these elements into a small number of abundance levels, completing nested clustering of prey proteins. (B) Resulting biclusters from the algorithm in (A). Each bicluster corresponds to a submatrix consisting of a bait cluster and an associated nested prey cluster. (C) Example of *maximum a posteriori* estimation. Bait clustering is illustrated in a hypothetical data set with two preys. Each dot is a single purification with a different bait. Four unique sets of clustering configurations were generated in 100 samples. The number N is the number of samples sharing the given bait clusters, and *maxP* is the maximum posterior probability under the fixed bait cluster configuration. Model 2 is the most frequently sampled configuration *with the highest maximum posterior probability*, and Model 3 is the second best competing model with similarly high posterior probability. The other two configurations have low posterior probability and low frequency of sampling.

**Figure 2**
Application of nested clustering to TIP49a/b data set. (A) Heatmap of the raw spectral count data organized using estimated mean values. (B) Heatmap of the estimated mean spectral counts. (C) Network visualization of SRCAP, TRRAP, hINO80, and Prefoldin complexes in Sardiu et al (2008). Green and brown nodes are baits and preys, respectively. Baits are shown as circles of larger size to indicate that they are the anchors of protein complexes constructed by nested clustering. Red circles indicate large-protein complexes identified in the form of submatrices.

**Figure 3**
Application of nested clustering to PP2A data set. (A) Heatmap of the raw spectral data organized using estimated mean values. (B) Heatmap of the estimated mean spectral counts. (C) Network visualization of the PP2A system along with the STRIPAK and CCT complexes. Green and brown nodes are baits and preys, respectively. Baits are shown as circles of larger size to indicate that they are the anchors of protein complexes constructed by nested clustering. Red circles indicate large-protein complexes identified in the form of submatrices.
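The bookkeeping described in the Figure 1C caption (count how many samples share each bait clustering, track the best posterior seen for it, then pick the best configuration) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the samples are made-up numbers, and log posterior values are assumed in place of raw probabilities.

```python
from collections import defaultdict

def canonical(labels):
    """Relabel clusters by order of first appearance so that
    label-switched copies of the same partition compare equal."""
    mapping = {}
    return tuple(mapping.setdefault(l, len(mapping)) for l in labels)

def summarize(samples):
    """samples: iterable of (labels, log_posterior) pairs.
    Returns {partition: (N, max_log_posterior)}, mirroring the
    N and maxP quantities in the caption."""
    stats = defaultdict(lambda: [0, float("-inf")])
    for labels, logp in samples:
        entry = stats[canonical(labels)]
        entry[0] += 1
        entry[1] = max(entry[1], logp)
    return {part: (n, best) for part, (n, best) in stats.items()}

def map_partition(samples):
    """Pick the partition with the highest maximum log posterior."""
    summary = summarize(samples)
    return max(summary, key=lambda part: summary[part][1])

# Made-up sampler output for four baits (illustrative values only).
samples = [
    ((0, 0, 1, 1), -10.2),
    ((1, 1, 0, 0), -10.0),  # same partition as above, labels swapped
    ((0, 1, 1, 1), -14.5),
    ((0, 0, 0, 1), -10.4),
    ((2, 2, 5, 5), -10.1),  # same partition again, different label values
]
summary = summarize(samples)
best = map_partition(samples)
```

Canonicalizing the labels matters because a sampler can return the same grouping under different cluster numbers; without it, the frequency N of a configuration would be undercounted.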
