Bioinformatics. 2013 Jul 1;29(13):i62-70.
doi: 10.1093/bioinformatics/btt229.

Learning subgroup-specific regulatory interactions and regulator independence with PARADIGM

Andrew J Sedgewick et al. Bioinformatics.

Abstract

High-dimensional '-omics' profiling provides a detailed molecular view of individual cancers; however, understanding the mechanisms by which tumors evade cellular defenses requires deep knowledge of the underlying cellular pathways within each cancer sample. We extended the PARADIGM algorithm (Vaske et al., 2010, Bioinformatics, 26, i237-i245), a pathway analysis method for combining multiple '-omics' data types, to learn the strength and direction of 9139 gene and protein interactions curated from the literature. Using genomic and mRNA expression data from 1936 samples in The Cancer Genome Atlas (TCGA) cohort, we learned interactions that provided support for and relative strength of 7138 (78%) of the curated links. Gene set enrichment found that genes involved in the strongest interactions were significantly enriched for transcriptional regulation, apoptosis, cell cycle regulation and response to tumor cells. Within the TCGA breast cancer cohort, we assessed different interaction strengths between breast cancer subtypes, and found interactions associated with the MYC pathway and the ER alpha network to be among the most differential between basal and luminal A subtypes. PARADIGM with the Naive Bayesian assumption produced gene activity predictions that, when clustered, found groups of patients with better separation in survival than both the original version of PARADIGM and a version without the assumption. We found that this Naive Bayes assumption was valid for the vast majority of co-regulators, indicating that most co-regulators act independently on their shared target.

Availability: http://paradigm.five3genomics.com.

Supplementary information: Supplementary data are available at Bioinformatics online.

Figures

Fig. 1. Factor graph structures in PARADIGM. (A) Central dogma structure shared by all protein-coding genes. (B) Alternative regulation models for the transcription, translation and activation nodes. In the Co-dependent Regulation Model, we learn a full conditional probability table of the child given the parents, while in the Independent Regulation Model, we learn conditional probabilities of individual links and use a Naive Bayes assumption to calculate the probability of the child node given the parents.
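The Independent Regulation Model in Fig. 1B can be illustrated with a small sketch. It assumes three discrete states per node (-1 = down, 0 = unchanged, 1 = up) and treats the child as the class variable in a standard Naive Bayes factorization, P(child | parents) proportional to P(child) times the product of P(parent_i | child). The function name, the state encoding and the example probability tables are hypothetical and are not taken from the PARADIGM implementation.

```python
from math import prod

# Assumed state encoding: -1 = down, 0 = unchanged, 1 = up.
STATES = (-1, 0, 1)

def child_posterior(prior, link_cpts, parent_states):
    """Combine per-link conditional probabilities under a Naive Bayes assumption.

    prior[y]          -- P(child = y)
    link_cpts[i][y][x] -- P(parent_i = x | child = y), one table per link
    parent_states[i]  -- observed state of parent_i
    Returns a dict mapping each child state to P(child | parents).
    """
    unnormalized = {
        y: prior[y] * prod(cpt[y][x] for cpt, x in zip(link_cpts, parent_states))
        for y in STATES
    }
    total = sum(unnormalized.values())
    return {y: value / total for y, value in unnormalized.items()}

# Hypothetical link tables: an activator tends to share the child's state,
# an inhibitor tends to take the opposite state.
activator = {
    -1: {-1: 0.6, 0: 0.3, 1: 0.1},
    0: {-1: 0.2, 0: 0.6, 1: 0.2},
    1: {-1: 0.1, 0: 0.3, 1: 0.6},
}
inhibitor = {
    -1: {-1: 0.1, 0: 0.3, 1: 0.6},
    0: {-1: 0.2, 0: 0.6, 1: 0.2},
    1: {-1: 0.6, 0: 0.3, 1: 0.1},
}
uniform_prior = {y: 1 / 3 for y in STATES}

# Activator up and inhibitor down both favor an active child.
posterior = child_posterior(uniform_prior, [activator, inhibitor], (1, -1))
```

Under this factorization each link needs only a 3 x 3 table, whereas a full conditional probability table over n parents grows as 3^n rows, which is the practical difference between the two models in panel B.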
Fig. 2. (A) Principal component analysis of regulatory links in the TCGA cohort. Each point is the projection of the 9 WPMI scores for a link onto the top two principal components. The convex hulls show the membership of k-means clustering performed on the (unprojected) WPMI scores, and the cluster numbers are placed at the centroid of each cluster. (B) Cluster membership of significant links labeled as activation and inhibition in the pathway. (C) Heatmaps of the WPMI values of the centroids of the clusters show a range from strong inhibition (1) to strong activation (5).
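The caption does not define WPMI. Assuming it stands for weighted pointwise mutual information, p(x, y) * log(p(x, y) / (p(x) * p(y))), and that parent and child each take three states, a link yields the 9 scores mentioned in panel A. The sketch below computes them from a joint distribution; the function name and the example distribution are hypothetical.

```python
from math import log

# Assumed state encoding: -1 = down, 0 = unchanged, 1 = up.
STATES = (-1, 0, 1)

def wpmi_scores(joint):
    """Return the 9 weighted pointwise mutual information scores for one link.

    joint[(x, y)] -- P(parent = x, child = y); values should sum to 1.
    Cells with zero probability contribute a score of 0.
    """
    parent_marginal = {x: sum(joint[(x, y)] for y in STATES) for x in STATES}
    child_marginal = {y: sum(joint[(x, y)] for x in STATES) for y in STATES}
    scores = {}
    for x in STATES:
        for y in STATES:
            p = joint[(x, y)]
            if p > 0:
                scores[(x, y)] = p * log(p / (parent_marginal[x] * child_marginal[y]))
            else:
                scores[(x, y)] = 0.0
    return scores

# Hypothetical activating link: parent and child usually share a state.
activating_joint = {
    (x, y): (0.25 if x == y else 0.25 / 6) for x in STATES for y in STATES
}
activating_scores = wpmi_scores(activating_joint)
```

For an activating link the matching-state cells score positive and the opposing-state cells negative; the sum of the 9 scores equals the mutual information between parent and child.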
Fig. 3. (A) Cluster membership bar plots for WPMI values of significant links learned from the ovarian cohort using an informative prior. (B) Clustering membership when starting with a flat prior. Cluster centers range from strong activation (blue) to strong inhibition (red) as in Figure 2C.
Fig. 4. (A) Percentage of unique child nodes that fail the following tests at each EM step of a PARADIGM run learning a full conditional probability table: (i) a test of the significance of conditional independence of any two parents given the child; (ii) test i, and at least one of the failing parents is significantly linked to the child; (iii) test i, and the failing triplet is incoherent; (iv) tests i, ii and iii. (B) Examples of coherent versus incoherent triplets. The arrows correspond to correlation, with a pointed head for positive correlation (activation) and a flat head for negative correlation (inhibition). The interactions between parents are not found in the literature, so we use double-sided arrows because we cannot know the direction of that interaction.
Fig. 5. Kaplan–Meier survival curves of 416 patients in the TCGA ovarian cohort clustered by Integrated Pathway Activity using (A) the original PARADIGM implementation, (B) PARADIGM learning full conditional probability tables of regulatory nodes and (C) PARADIGM learning conditional probability of single links and using a Naive Bayes assumption.
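For readers unfamiliar with the curves in Fig. 5, the following is a minimal sketch of the standard Kaplan–Meier product-limit estimator for a single patient group. It is not the authors' analysis code; the function name and the toy follow-up data are hypothetical.

```python
def kaplan_meier(times, events):
    """Estimate the survival function S(t) with the product-limit method.

    times  -- follow-up duration for each patient
    events -- 1 if death was observed at that time, 0 if the patient was censored
    Returns a list of (time, survival probability) at each distinct death time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set after time t.
        at_risk -= leaving
    return curve

# Toy data: five patients, two of them censored.
toy_curve = kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0])
```

Running the estimator once per patient cluster gives one curve per cluster; better separation between those curves is what the caption compares across panels A to C.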
Fig. 6. Heatmap of the g-score ranks colored by link correlation, with red tending towards activating and blue tending towards inhibiting. For visualization purposes, interactions were filtered if they had a standard deviation <0.2 across all samples or did not have at least one tissue with a score of ≥0.7, resulting in 211 interactions out of the original 10 307.
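The visualization filter described in the Fig. 6 caption can be sketched as follows. The thresholds (0.2 and 0.7) come from the caption; everything else is an assumption: each interaction is represented by one score per tissue, the population standard deviation is used, and the raw score (not its absolute value) is compared with 0.7. The function name and the example interactions are hypothetical.

```python
from statistics import pstdev

def keep_interaction(scores, min_sd=0.2, min_peak=0.7):
    """Return True if an interaction survives the visualization filter.

    scores -- one score per tissue for a single interaction (assumed layout)
    An interaction is dropped if its scores vary too little (sd < min_sd)
    or if no tissue reaches min_peak.
    """
    return pstdev(scores) >= min_sd and max(scores) >= min_peak

# Hypothetical interactions with per-tissue scores.
interactions = {
    "variable_and_strong": [0.1, 0.9, 0.2],
    "strong_but_flat": [0.75, 0.80, 0.78],
    "variable_but_weak": [0.0, 0.6, 0.3],
}
kept = [name for name, scores in interactions.items() if keep_interaction(scores)]
```

Only interactions that are both variable across tissues and strong in at least one tissue remain, which is consistent with the caption's reduction from 10 307 interactions to 211.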
Fig. 7. Boxplots of WPMI values across cancer types. (A) WPMI values for links with PPARA:RXRA as a parent node. There is a stronger activation signal in GBM and KIRC. (B) WPMI values for links with TAp73a as a parent node, showing activation in OV.
