Ann Appl Stat. 2014;8(3):1443-1468.
doi: 10.1214/14-AOAS722.

BAYESIAN SPARSE GRAPHICAL MODELS FOR CLASSIFICATION WITH APPLICATION TO PROTEIN EXPRESSION DATA

Veerabhadran Baladandayuthapani et al. Ann Appl Stat. 2014.

Abstract

Reverse-phase protein array (RPPA) analysis is a powerful, relatively new platform that allows for high-throughput, quantitative analysis of protein networks. One of the challenges that currently limit the potential of this technology is the lack of methods that allow for accurate data modeling and identification of related networks and samples. Such models may improve the accuracy of biological sample classification based on patterns of protein network activation and provide insight into the distinct biological relationships underlying different types of cancer. Motivated by RPPA data, we propose a Bayesian sparse graphical modeling approach that uses selection priors on the conditional relationships in the presence of class information. The novelty of our Bayesian model lies in the ability to draw information from the network data as well as from the associated categorical outcome in a unified hierarchical model for classification. In addition, our method allows for intuitive integration of a priori network information directly in the model and allows for posterior inference on the network topologies both within and between classes. Applying our methodology to an RPPA data set generated from panels of human breast cancer and ovarian cancer cell lines, we demonstrate that the model is able to distinguish the different cancer cell types more accurately than several existing models and to identify differential regulation of components of a critical signaling network (the PI3K-AKT pathway) between these two types of cancer. This approach represents a powerful new tool that can be used to improve our understanding of protein networks in cancer.

Keywords: Bayesian methods; graphical models; mixture models; protein signaling pathways.

Figures

Fig. 1
An example of a reverse-phase protein array (RPPA) slide. (A) Each slide comprises 4 rows (A–D) by 12 columns (1–12) of grids, each an 11×11 array of spots. (B) Each grid holds 22 individual samples and 11 controls: each row of the grid consists of 2 individual samples (each with 5 serial 2-fold dilutions) and one control spot. Reproduced with permission from Tabchy et al. (2011).
Fig. 2
Significant edges for the proteins in the PI3K-AKT kinase pathway for breast (a) and ovarian cancer cell lines (b) computed using a Bayesian FDR of 0.10. The red (green) lines between the proteins indicate a negative (positive) correlation between the proteins. The thickness of the edges corresponds to the strength of the associations, with stronger associations having greater thickness.
Fig. 3
Conserved and differential networks for the proteins in the PI3K-AKT kinase pathway between breast and ovarian cancer cell lines, computed using a Bayesian FDR of 0.10. In the conserved network (top panel), the red (green) lines between the proteins indicate a negative (positive) correlation between the proteins. In the differential network (bottom panel), the blue lines between the proteins indicate a relationship that was significant in the ovarian cancer cell lines but not in the breast cancer cell lines; the orange lines indicate a relationship that was significant in the breast cancer cell lines but not in the ovarian cancer cell lines. The thickness of the edges corresponds to the strength of the associations, with stronger associations having greater thickness.
Fig. 4
Conserved and differential networks for the proteins in the PI3K-AKT kinase pathway between ovarian cancer cell lines grown in three different tissue culture conditions: A, B and C (see main text) computed using a Bayesian FDR set to 0.10. In the conserved network [(a)–(c)], the red (green) lines between the proteins indicate a negative (positive) correlation between the proteins. In the differential network [(d)–(f)], the blue lines between the proteins indicate a relationship that was significant in the ovarian cancer cell lines but not in the breast cancer cell lines; the orange lines between the proteins indicate a relationship in the breast cancer cell lines but not in the ovarian cancer cell lines. The thickness of the edges corresponds to the strength of the associations, with stronger associations having greater thickness.
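
The captions above report edges selected at a Bayesian FDR of 0.10, but this page does not say how that threshold was computed. As a rough illustration only, the sketch below uses one common approach: rank edges by posterior inclusion probability and keep the largest set whose average posterior probability of being a false discovery stays at or below the target. The function name `bayesian_fdr_select` and the example probabilities are hypothetical and are not taken from the paper.

```python
def bayesian_fdr_select(post_prob, alpha=0.10):
    """Flag edges to report at an expected Bayesian FDR of at most alpha.

    post_prob: posterior inclusion probability for each candidate edge.
    Assumption (not from the source): the expected FDR of a selected set
    is the mean of (1 - posterior probability) over the edges in that set.
    """
    # Rank edges from most to least probable.
    order = sorted(range(len(post_prob)),
                   key=lambda i: post_prob[i], reverse=True)

    cumulative = 0.0   # running sum of (1 - p) over the top-ranked edges
    keep = 0           # size of the largest prefix that meets the target
    for rank, idx in enumerate(order, start=1):
        cumulative += 1.0 - post_prob[idx]
        if cumulative / rank <= alpha:
            keep = rank

    selected = [False] * len(post_prob)
    for idx in order[:keep]:
        selected[idx] = True
    return selected


# Made-up posterior probabilities for five candidate edges.
example = [0.99, 0.40, 0.95, 0.10, 0.80]
flags = bayesian_fdr_select(example, alpha=0.10)
```

With these made-up values the three most probable edges are kept, because their average false-discovery probability is about 0.087; adding the fourth edge would raise it to about 0.215.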

