J Am Stat Assoc. 2011;107(500):1372-1384.
doi: 10.1080/01621459.2012.706121.

Modeling Protein Expression and Protein Signaling Pathways

Donatello Telesca et al. J Am Stat Assoc. 2011.

Abstract

High-throughput functional proteomic technologies provide a way to quantify the expression of proteins of interest. Statistical inference centers on identifying the activation state of proteins and their patterns of molecular interaction formalized as dependence structure. Inference on dependence structure is particularly important when proteins are selected because they are part of a common molecular pathway. In that case, inference on dependence structure reveals properties of the underlying pathway. We propose a probability model that represents molecular interactions at the level of hidden binary latent variables that can be interpreted as indicators for active versus inactive states of the proteins. The proposed approach exploits available expert knowledge about the target pathway to define an informative prior on the hidden conditional dependence structure. An important feature of this prior is that it provides an instrument to explicitly anchor the model space to a set of interactions of interest, favoring a local search approach to model determination. We apply our model to reverse-phase protein array data from a study on acute myeloid leukemia. Our inference identifies relevant subpathways in relation to the unfolding of the biological process under study.

Keywords: AML; Graphical models; Mixture models; POE; RJ-MCMC; RPPA.

Figures

Figure 1
(a) A typical reverse-phase protein array with 40 samples shown as the 40 batches on the slide. Each batch represents one individual sample with 16 spots, which are the results of duplicates of eight-step dilutions. (b) Normalized RPPA intensities for 51 proteins and 531 AML patients.
Figure 2
A protein interaction pathway produced by combining known protein–protein interactions from the literature. This network wiring diagram shows the connectivity of the receptor tyrosine kinase to the MAPK (mitogen-activated protein kinase), Akt, STAT, Bcl-2 (B-cell lymphoma 2), and p53 signaling proteins. (We are able to measure a large percentage of the molecules using the RPPA for AML patients. The relationships between proteins suggested in this diagram will be considered as prior information for the proposed probability model.) The online version of this figure is in color.
Figure 3
Simulated data. Panel (a) shows ROC curves for the correct classification of proteins as activated versus nonactivated. Panels (b) and (c) show ROC curves for reporting edges as present or not. In panel (b), we compare different models. In panel (c), we compare alternate priors under the proposed model (δ = 50 and 100, corresponding to ζ = 1 and 2, respectively). The online version of this figure is in color.
Figure 4
RPPA study: (Left panel) Centered protein abundance $y_{gt}$ versus POE scale intensities ($p_{gt} = E(p_{gt}^{+} - p_{gt}^{-})$). (Right panel) Raw simple correlation estimates versus simple correlations in the POE scale.
Figure 5
Panel (a): Posterior expected interactions E(βu | Y) versus posterior edge inclusion probabilities p((i, j) or (j, i) ∈ E | Y). Solid diamonds correspond to edges originally included in the prior pathway. Panel (b): Median model identified selecting edges with posterior inclusion probability greater than 0.5. Arrows define stimulatory relationships, whereas dotted arrowheads define inhibitory relationships. Edge thickness is proportional to the absolute size of the posterior expected interaction parameters.
Figure 6
Posterior mean degree and associated 95% CI by protein. Panels (a), (b), and (c) correspond to increasing sparsity penalties = 0.5, 1.0, and 2.0, respectively. For each protein, we report the posterior mean vertex degree evaluated in relation to three settings of the pathway deviation penalty parameter ν [ν = 0.5 (black), ν = 1.0 (blue), ν = 2.0 (green)]. The online version of this figure is in color.
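
The median-model rule described in the Figure 5 caption (keep an edge when the posterior probability that (i, j) or (j, i) is in E exceeds 0.5) can be illustrated with a minimal sketch. It assumes the sampler output is available as an array of 0/1 directed adjacency matrices, one per posterior draw. The function name `median_model`, the array layout, and the toy draws are hypothetical and are not taken from the paper.

```python
import numpy as np

def median_model(edge_samples, threshold=0.5):
    """Select edges by posterior inclusion probability.

    edge_samples: array of shape (n_draws, p, p) holding 0/1 indicators,
    where entry [s, i, j] = 1 means directed edge (i, j) is present in
    draw s. This layout is an assumption made for illustration only.
    """
    samples = np.asarray(edge_samples, dtype=float)
    # A pair {i, j} counts as present in a draw if either direction is present.
    either = np.maximum(samples, samples.transpose(0, 2, 1))
    # Monte Carlo estimate of p((i, j) or (j, i) in E | Y).
    incl_prob = either.mean(axis=0)
    # Keep each unordered pair once (upper triangle), strictly above the threshold.
    selected = np.triu(incl_prob > threshold, k=1)
    return incl_prob, selected

# Toy example: 4 hypothetical posterior draws over 3 proteins.
toy = np.zeros((4, 3, 3))
toy[0, 0, 1] = 1  # edge (0, 1) in draw 0
toy[1, 0, 1] = 1  # edge (0, 1) in draw 1
toy[2, 1, 0] = 1  # edge (1, 0) in draw 2
toy[0, 1, 2] = 1  # edge (1, 2) in draw 0
toy[1, 1, 2] = 1  # edge (1, 2) in draw 1

incl_prob, selected = median_model(toy)
print(incl_prob)
print(selected)
```

In this toy input, the pair {0, 1} appears in three of four draws and is selected, while the pair {1, 2} appears in exactly half of the draws and is not, because the rule requires a probability strictly greater than 0.5.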

