Mol Syst Biol. 2010 Jun 22;6:385.
doi: 10.1038/msb.2010.41.

Analysis of protein complexes through model-based biclustering of label-free quantitative AP-MS data

Hyungwon Choi et al. Mol Syst Biol. 2010.

Abstract

Affinity purification followed by mass spectrometry (AP-MS) has become a common approach for identifying protein-protein interactions (PPIs) and complexes. However, data analysis and visualization often rely on generic approaches that do not take advantage of the quantitative nature of AP-MS data. We present a novel computational method, nested clustering, for biclustering of label-free quantitative AP-MS data. Our approach forms bait clusters based on the similarity of quantitative interaction profiles and identifies submatrices of prey proteins showing consistent quantitative association within bait clusters. In doing so, nested clustering effectively addresses the problem of overrepresentation of interactions involving bait proteins as compared with proteins identified only as preys. The method does not require specification of the number of bait clusters, which is an advantage over existing model-based clustering methods. We illustrate the performance of the algorithm using two published intermediate-scale human PPI data sets, which are representative of the AP-MS data generated from mammalian cells. We also discuss general challenges of analyzing and interpreting clustering results in the context of AP-MS data.
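The two-level idea in the abstract (group baits by profile similarity, then find preys with consistent abundance inside each bait cluster) can be illustrated with a toy sketch. This is not the authors' model: a greedy distance threshold stands in for the model-based bait clustering, and fixed cutoffs stand in for the mixture model over prey abundances. The function names, the `threshold` and `cutoffs` parameters, and the spectral counts are all hypothetical.

```python
from statistics import mean

def cluster_baits(profiles, threshold):
    """Greedy stand-in for bait clustering: a bait joins the first cluster whose
    centroid lies within `threshold` (Euclidean distance), else starts a new one.
    The number of clusters is therefore not fixed in advance."""
    clusters = []  # each cluster is a list of bait names
    for bait, profile in profiles.items():
        for members in clusters:
            centroid = [mean(col) for col in zip(*(profiles[m] for m in members))]
            dist = sum((a - b) ** 2 for a, b in zip(profile, centroid)) ** 0.5
            if dist <= threshold:
                members.append(bait)
                break
        else:
            clusters.append([bait])
    return clusters

def nested_prey_levels(profiles, members, preys, cutoffs):
    """Within one bait cluster, average each prey's count and bin the mean into
    an abundance level (0 = below every cutoff)."""
    levels = {}
    for j, prey in enumerate(preys):
        avg = mean(profiles[m][j] for m in members)
        levels[prey] = sum(avg >= c for c in cutoffs)
    return levels

def biclusters(profiles, preys, threshold, cutoffs):
    """Pair each bait cluster with the preys that reach a nonzero level."""
    result = []
    for members in cluster_baits(profiles, threshold):
        levels = nested_prey_levels(profiles, members, preys, cutoffs)
        result.append((members, {p: lv for p, lv in levels.items() if lv > 0}))
    return result

# Hypothetical spectral counts: rows are baits, columns are preys P1..P4.
preys = ["P1", "P2", "P3", "P4"]
profiles = {
    "baitA": [10, 9, 0, 0],
    "baitB": [11, 8, 1, 0],
    "baitC": [0, 0, 6, 20],
    "baitD": [0, 1, 5, 22],
}
example = biclusters(profiles, preys, threshold=5.0, cutoffs=(2, 8))
for members, prey_levels in example:
    print(members, prey_levels)
```

Each printed pair is one bicluster in the sense used above: a set of baits together with the preys that show a consistent, nonzero abundance level across them.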

Conflict of interest statement

The authors declare that they have no conflict of interest.

Figures

Figure 1
Overview of the computational method. (A) Nested clustering algorithm. Baits are probabilistically assigned to bait clusters, each with an associated mean and variance. The diameter of each circle is proportional to the normalized spectral count. Within each bait cluster, the mean abundance of each prey is drawn as a square. Mixture modeling is used to group these elements into a small number of abundance levels, completing the nested clustering of prey proteins. (B) Resulting biclusters from the algorithm in (A). Each bicluster corresponds to a submatrix consisting of a bait cluster and an associated nested prey cluster. (C) Example of maximum a posteriori estimation. Bait clustering is illustrated using a hypothetical data set with two preys. Each dot is a single purification with a different bait. Four unique clustering configurations were generated in 100 samples. The number N is the number of samples sharing the given bait clusters, and maxP is the maximum posterior probability under the fixed bait cluster configuration. Model 2 is the most frequently sampled configuration and has the highest maximum posterior probability; Model 3 is the second-best competing model, with similarly high posterior probability. The other two configurations have low posterior probability and low sampling frequency.
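The bookkeeping described in panel (C) can be sketched as follows: sampled bait-cluster configurations are grouped, N and maxP are recorded per configuration, and the configuration with the highest maxP is reported. The sampler itself is not shown in the caption, so the sample list, the posterior values, and the function names below are hypothetical; the relabeling step is an assumption added so that two samples describing the same partition with different cluster labels count as one configuration.

```python
def canonical(assignment):
    """Relabel clusters by order of first appearance, so (1, 1, 0, 0) and
    (0, 0, 1, 1) are recognised as the same partition of baits."""
    relabel = {}
    return tuple(relabel.setdefault(label, len(relabel)) for label in assignment)

def summarize_samples(samples):
    """Group samples by configuration; record N (sample count) and
    maxP (highest posterior value seen) for each configuration."""
    summary = {}
    for assignment, posterior in samples:
        entry = summary.setdefault(canonical(assignment), {"N": 0, "maxP": float("-inf")})
        entry["N"] += 1
        entry["maxP"] = max(entry["maxP"], posterior)
    return summary

def map_configuration(summary):
    """Return the configuration with the highest maxP."""
    return max(summary, key=lambda config: summary[config]["maxP"])

# Hypothetical samples: one cluster label per bait, plus a posterior value.
samples = [
    ((0, 0, 1, 1), 0.40),
    ((1, 1, 0, 0), 0.42),  # same partition as above, labels swapped
    ((0, 0, 1, 1), 0.38),
    ((0, 1, 1, 1), 0.39),
    ((0, 1, 2, 3), 0.05),
]
summary = summarize_samples(samples)
best = map_configuration(summary)
print(best, summary[best])
```

Reporting N alongside maxP mirrors the caption: a configuration that is both frequently sampled and high in posterior probability is a more convincing estimate than one that appears once.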
Figure 2
Application of nested clustering to the TIP49a/b data set. (A) Heatmap of the raw spectral count data organized using estimated mean values. (B) Heatmap of the estimated mean spectral counts. (C) Network visualization of the SRCAP, TRRAP, hINO80, and Prefoldin complexes in Sardiu et al (2008). Green and brown nodes are baits and preys, respectively. Baits are shown as larger circles to indicate that they are the anchors of protein complexes constructed by nested clustering. Red circles indicate large protein complexes identified in the form of submatrices.
Figure 3
Application of nested clustering to the PP2A data set. (A) Heatmap of the raw spectral count data organized using estimated mean values. (B) Heatmap of the estimated mean spectral counts. (C) Network visualization of the PP2A system along with the STRIPAK and CCT complexes. Green and brown nodes are baits and preys, respectively. Baits are shown as larger circles to indicate that they are the anchors of protein complexes constructed by nested clustering. Red circles indicate large protein complexes identified in the form of submatrices.

