2021 May 12;2(6):100257.
doi: 10.1016/j.patter.2021.100257. eCollection 2021 Jun 11.

Causal interactions from proteomic profiles: Molecular data meet pathway knowledge

Özgün Babur et al. Patterns (N Y).

Abstract

We present a computational method to infer causal mechanisms in cell biology by analyzing changes in high-throughput proteomic profiles on the background of prior knowledge captured in biochemical reaction knowledge bases. The method mimics a biologist's traditional approach of explaining changes in data using prior knowledge but does this at the scale of hundreds of thousands of reactions. This is a specific example of how to automate scientific reasoning processes and illustrates the power of mapping from experimental data to prior knowledge via logic programming. The identified mechanisms can explain how experimental and physiological perturbations, propagating in a network of reactions, affect cellular responses and their phenotypic consequences. Causal pathway analysis is a powerful and flexible discovery tool for a wide range of cellular profiling data types and biological questions. The automated causation inference tool, as well as the source code, are freely available at http://causalpath.org.

Keywords: cancer; causal pathway analysis; proteomics.

Conflict of interest statement

The authors declare no competing interests.

Figures

Graphical abstract
Figure 1
Overview of the CausalPath pipeline over an example analysis. One relationship CausalPath generated from the EGF stimulation study was GAB1 → MAPK3. Prior information for this relationship was curated into pathway databases, which we integrate into Pathway Commons as detailed mechanistic processes. We detect structural patterns in these processes that indicate that GAB1, when activated through phosphorylation, can in turn help phosphorylate MAPK3 (step 1). These phosphorylations were correlated in the proteomics dataset in the direction compatible with the prior, so CausalPath selects this relationship as a potential explanation (step 2). The final logical network is a subgraph of the EGF stimulation analysis results at the 2 min time frame. For a more comprehensive description of graph notation, please see Figure 3C. To manage complexity, we omit phosphorylation site locations when rendering the resulting network; these can be inspected interactively within CausalPath on causalpath.org. (This figure provides conceptual examples for steps 1 and 2 of CausalPath. ∗Step 1 recognizes a variety of pathway structures that can causally link an upstream protein activity to a downstream proteomic feature, which are detailed in Data S1. ∗∗Step 2 checks whether the direction of the measured proteomic changes is compatible with the expectations set by the prior information using Equations 1 and 2 [see main text]. Step 2 is demonstrated in more detail in Figure S1.)
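The direction check described for step 2 can be illustrated with a minimal Python sketch. It is not a reproduction of the paper's Equations 1 and 2, which appear only in the main text. It assumes every quantity is encoded as a sign (+1 or -1) and that a relation counts as compatible when the product of the source change, the source site's effect on activity, and the relation's sign equals the observed target change. The function and parameter names are hypothetical.

```python
def is_compatible(source_change: int, source_site_effect: int,
                  relation_sign: int, target_change: int) -> bool:
    """Return True if the measured changes agree with the prior relation.

    All arguments are signs (assumed encoding, not taken from the paper):
      source_change       +1 if the source feature increased, -1 if it decreased
      source_site_effect  +1 for an activating site, -1 for an inhibiting site
      relation_sign       +1 for e.g. "phosphorylates", -1 for e.g. "dephosphorylates"
      target_change       +1 if the target feature increased, -1 if it decreased
    """
    for value in (source_change, source_site_effect, relation_sign, target_change):
        if value not in (1, -1):
            raise ValueError("every argument must be +1 or -1")
    # Expected direction of the target, given the source's activity change.
    expected_target_change = source_change * source_site_effect * relation_sign
    return expected_target_change == target_change


# Example in the spirit of the caption: an activating phosphorylation on the
# source goes up, the prior says "source phosphorylates target", and the
# target phosphorylation also goes up.
print(is_compatible(+1, +1, +1, +1))  # True
```

Under the same assumptions, a decrease on an inhibiting source site also predicts an increase downstream, because the two negative signs cancel.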
Figure 2
Validation of CausalPath relations on a cell-line ligand-stimulation and drug-inhibition RPPA dataset. (A) An example subnetwork from CausalPath results to illustrate how the validation works. The first subnetwork is generated by comparing NRG1-stimulated BT20 cells with the unstimulated control cells. Since this network nominates activated MTOR and AKT1 as the cause of several downstream phosphorylations, we can test these hypotheses using MTOR and AKT inhibitors. The next two graphs show the same subnetwork after the inhibitors are applied (ligand+/inhibitor+ cells are compared with ligand+/inhibitor− cells). See Figure 3C for graph legend. (B) Cumulative validation results from all 32 cases, generated by readouts from 14 distinct antibodies. The x axis shows mean changes in the antibody readouts normalized to their global standard deviation and expected direction. A positive value indicates the change is in the expected direction.
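One possible reading of the x axis in panel B is sketched below in Python. It assumes that "global standard deviation" means the standard deviation of one antibody's readouts across all samples, and that the expected direction is a sign (+1 or -1) multiplied into the result. Both are assumptions, and the function name is hypothetical.

```python
from statistics import mean, pstdev


def directed_normalized_change(inhibited, uninhibited, all_readouts,
                               expected_direction):
    """Mean change between two groups, scaled and signed.

    inhibited, uninhibited  readouts of one antibody in the two compared groups
    all_readouts            readouts of the same antibody across all samples
                            (assumed basis for the "global" standard deviation)
    expected_direction      +1 or -1, the direction predicted for the change
    """
    if expected_direction not in (1, -1):
        raise ValueError("expected_direction must be +1 or -1")
    global_sd = pstdev(all_readouts)
    if global_sd == 0:
        raise ValueError("global standard deviation is zero")
    change = mean(inhibited) - mean(uninhibited)
    # Positive result: the change went in the expected direction.
    return expected_direction * change / global_sd


# Made-up numbers: the readout is expected to drop under the inhibitor.
score = directed_normalized_change([1.0, 1.0], [3.0, 3.0],
                                   [1.0, 1.0, 3.0, 3.0], -1)
print(score)  # 2.0
```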
Figure 3
Results for CPTAC ovarian cancer. (A) The largest connected component in the correlation-based causality network with phospho regulations. Note that the visual notation of this correlation-based result network is different from that of the comparison-based network in Figure 1, as we have no differential comparison but have pairwise correlations. For a compiled set of examples on how to read parts of a CausalPath result graph, please see Figure S4. (B) Immunoreactive subtype compared with all other samples, where we show RNA expression and DNA copy variation from corresponding TCGA datasets along with the CPTAC proteomic changes. (C) Key for the graph notation for causal explanations in all figures.
Figure 4
Results for CPTAC breast cancer. (A) A subgraph of the correlation-based causal network with phospho regulations focused on the upstream regulators of proteins that are implicated in breast cancer as provided by the COSMIC Cancer Gene Census (CGS) database. There are 43 genes in CGS annotated with breast cancer, for 11 of which we identify phosphorylation regulators. (B) Subgraph of the correlation-based causal network with expression regulations where RNA-seq changes are explained by upstream proteomic changes, focused on the neighborhood of proteins with significant downstream. (C) Luminal A and luminal B subtypes are collectively compared with the basal-like subtype. Only the ESR1 downstream relations are shown.
Figure 5
Recurrent results for TCGA RPPA datasets. Relations that are identified with correlation-based analysis in at least 15 cancer types are shown, where the faintest color indicates 15 and the boldest color indicates 30. Please note that the bold node borders are repurposed in the graph notation to display recurrence rate.
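The recurrence filter behind this figure can be sketched in Python as a simple count of how many cancer types each relation appears in. The data layout (a mapping from cancer type to a collection of relation tuples), the function name, and the placeholder relations are all assumptions for illustration; only the threshold of 15 comes from the caption.

```python
from collections import Counter


def recurrent_relations(relations_by_cancer_type, min_types=15):
    """Count the cancer types in which each relation was identified.

    relations_by_cancer_type  dict mapping a cancer type name to an iterable
                              of hashable relations, e.g. (source, type, target)
    min_types                 keep relations found in at least this many types
    """
    counts = Counter()
    for relations in relations_by_cancer_type.values():
        # set() makes sure a relation is counted once per cancer type.
        counts.update(set(relations))
    return {rel: n for rel, n in counts.items() if n >= min_types}


# Placeholder data with a lowered threshold so the toy example returns something.
toy = {
    "type_1": [("A", "phosphorylates", "B"), ("C", "phosphorylates", "D")],
    "type_2": [("A", "phosphorylates", "B")],
    "type_3": [("A", "phosphorylates", "B"), ("A", "phosphorylates", "B")],
}
print(recurrent_relations(toy, min_types=3))
# {('A', 'phosphorylates', 'B'): 3}
```

The resulting counts could then drive the color scale, with the smallest retained count drawn faintest and the largest drawn boldest.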

