PLoS One. 2015 May 28;10(5):e0128411.
doi: 10.1371/journal.pone.0128411. eCollection 2015.

Network reconstruction based on proteomic data and prior knowledge of protein connectivity using graph theory


Vassilis Stavrakas et al. PLoS One. 2015.

Abstract

Modeling signal transduction pathways is instrumental for understanding cell function. Considerable effort has gone into modeling signaling pathways so that the signaling events inside the cell's biochemical microenvironment are represented accurately and in a way that is meaningful to biologists. In this article, we propose a method to interrogate such pathways in order to produce cell-specific signaling models. We integrate available prior knowledge of protein connectivity, in the form of a Prior Knowledge Network (PKN), with phosphoproteomic data to construct predictive models of the protein connectivity of the interrogated cell type. Several computational methodologies for logic modeling of pathways, based on optimization formulations or machine learning algorithms, have been published over the past few years. Here, we introduce a light and fast approach that uses a breadth-first traversal of the graph to identify the shortest pathways and score proteins in the PKN, fitting the dependencies extracted from the experimental design. The pathways are then combined through a heuristic formulation to produce a final topology that handles inconsistencies between the PKN and the experimental scenarios. Our results show that the algorithm is efficient and accurate for the construction of medium and large scale signaling networks. We demonstrate the applicability of the proposed approach by interrogating a manually curated interaction graph model of EGF/TNFA stimulation against made-up experimental data. To guard against erroneous predictions, we performed a cross-validation analysis. Finally, we validate that the introduced approach generates predictive topologies comparable to those of the ILP formulation. Overall, we present an efficient graph-theoretic approach to interrogate protein-protein interaction networks and to provide meaningful biological insights.
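The breadth-first step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it only shows how a breadth-first traversal returns a shortest directed path in a PKN stored as an adjacency dictionary. The function name `shortest_path`, the data layout, and the placeholder nodes `X1` to `X3` are assumptions made for illustration; the remaining edges are taken from the path named in the Fig 1 caption.

```python
from collections import deque


def shortest_path(pkn, source, target):
    """Return one shortest directed path from source to target, or None.

    pkn maps each node to the list of nodes it points to.
    """
    parents = {source: None}  # also serves as the visited set
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            # Walk the parent links back to the source.
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for nxt in pkn.get(node, ()):
            if nxt not in parents:
                parents[nxt] = node
                queue.append(nxt)
    return None


# Toy PKN. X1, X2, X3 are hypothetical placeholder nodes that form a
# longer detour; they are not part of the published network.
pkn = {
    "TNFA": ["TNFR"],
    "TNFR": ["TRAF2", "PI3K"],
    "TRAF2": ["MAP3K7"],
    "MAP3K7": ["MKK4"],
    "MKK4": ["P38"],
    "PI3K": ["X1"],
    "X1": ["X2"],
    "X2": ["X3"],
    "X3": ["P38"],
}

path = shortest_path(pkn, "TNFA", "P38")
print(" -> ".join(path))
```

Because breadth-first search visits nodes in order of distance from the stimulus, the first time the target is dequeued the recovered path uses the fewest edges, so the detour through the placeholder nodes is never selected.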

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

Fig 1. A simple example network used for illustration purposes: workflow.
(a) The full network, adopted from [36], after applying the Direct Paths step. The Direct Paths are drawn as blue edges, while dashed lines mark edges and nodes not yet included in our solution. (b) The compressed model obtained after applying the Alternative Paths step and resolving the conflicts detected in the network. In this compressed version of the network, a connection appears between TNFR and PI3K. The purpose of this new edge is not to link TNFA to P38 phosphorylation but to satisfy the TNFA → GSK-3 dependency. That TNFA also links to P38 phosphorylation through this connection (i.e. TNFR → PI3K) is coincidental and depends on the paths derived by the “Direct Paths” procedure (the blue edges were already included in the final topology in the previous step). To satisfy the TNFA → P38 dependency, the algorithm chooses the shortest path TNFA → TNFR → TRAF2 → MAP3K7 → MKK4 → P38, which includes two conflicts (the MAP3K7 and IKK nodes were measured as inactive under TNFA stimulation). However, this error is offset by the satisfaction of two dependencies (TNFA → P38 and TNFA → NFKB). Consequently, the scoring method assesses this as a draw case (2 Satisfied Dependencies – 2 Conflicts Detected = 0). In this work we suggest that draw cases should be included in the compressed topology, as they add connectivity and topology information. The algorithmic steps and the experimental design are colour annotated: Direct Paths produced in the previous step are shown in blue and Alternative Paths in red. Nodes with crimson contours mark the detected inconsistencies (conflicts) between network topology and experimental measurements. Finally, dashed lines mark edges and components excluded from the final solution.
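The draw-case arithmetic in the Fig 1 caption can be written out as a small sketch. This is a simplified reading of the caption, not the paper's scoring method, whose details are not given in this excerpt. The names `score_candidate` and `keep_candidate`, the data layout, and the edges MAP3K7 → IKK → NFKB implied by the candidate node set are assumptions for illustration.

```python
def score_candidate(nodes, stimulus, dependencies, inactive_under):
    """Score = satisfied dependencies minus detected conflicts.

    nodes: set of nodes in the candidate sub-topology.
    dependencies: (stimulus, readout) pairs taken from the experimental design.
    inactive_under: stimulus -> set of nodes measured as inactive under it.
    """
    satisfied = sum(
        1 for stim, readout in dependencies
        if stim == stimulus and readout in nodes
    )
    conflicts = len(nodes & inactive_under.get(stimulus, set()))
    return satisfied - conflicts


def keep_candidate(score):
    # Assumption based on the caption: draw cases (score 0) are kept.
    return score >= 0


# Values from the Fig 1 caption; the candidate node set is an assumed grouping.
candidate = {"TNFA", "TNFR", "TRAF2", "MAP3K7", "MKK4", "P38", "IKK", "NFKB"}
dependencies = [("TNFA", "P38"), ("TNFA", "NFKB")]
inactive_under = {"TNFA": {"MAP3K7", "IKK"}}

score = score_candidate(candidate, "TNFA", dependencies, inactive_under)
print(score, keep_candidate(score))  # 0 True
```

Treating a score of zero as acceptable mirrors the caption's suggestion that draw cases be retained because they add connectivity information.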
Fig 2. Medium scale network: compressed model.
The model structure can be compressed substantially, from 90 nodes and 139 edges to 41 nodes and 44 edges. The compressed model reflects the essential dependencies in the original network structure that can be addressed by the given set of measured nodes. Our solution has a fitting error of 29, down from 59 in the original model. Several edges are absent because they conflict with the data. One example is the absence of RSK1 → RS6, which isolates RS6 activity from the IL1B stimulus. Similarly, several edges are preserved, such as MEK1 → ERK1 and MEK1 → RSK1, to permit the activity of ERK1 and RSK1 under TGFA treatment. Additionally, MAP3K → IKK enables activation of the NFKB signal under both IL6 and TGFA stimulation, and activation of the measured IKBA node by the IL6 stimulus. Edges shown in red are those removed from the compressed model after a parameter change in our ranking method. The resulting model structure consists of 38 nodes and 41 edges. It still reflects the experimental dependencies in the original network structure and gives a final fitting error of 19, compared with 59 in the original model and 29 in the previous solution.
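To put the figures in the Fig 2 caption on a common scale, the relative reductions can be computed directly. The sketch below uses only the numbers quoted in the caption; the helper name `reduction_pct` is an assumption for illustration, and how the fitting error itself is defined is not stated in this excerpt.

```python
def reduction_pct(before, after):
    """Relative reduction from before to after, in percent."""
    return 100.0 * (before - after) / before


# Numbers quoted in the Fig 2 caption.
summary = {
    "nodes, first model (90 -> 41)": reduction_pct(90, 41),
    "edges, first model (139 -> 44)": reduction_pct(139, 44),
    "fitting error, first model (59 -> 29)": reduction_pct(59, 29),
    "fitting error, second model (59 -> 19)": reduction_pct(59, 19),
}

for label, value in summary.items():
    print(f"{label}: {value:.1f}% reduction")
```

Under these numbers the first compressed model removes roughly 54% of the nodes and 68% of the edges while about halving the fitting error, and the parameter change brings the error reduction to roughly 68%.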

References

    1. Saez-Rodriguez J., Alexopoulos L.G., Stolovitzky G. (2011) Setting the Standards for Signal Transduction Research. Science Signaling, 4:10.
    2. Pandey A., Mann M. (2000) Proteomics to study genes and genomes. Nature, 405(6788):837–846. doi: 10.1038/35015709
    3. Downward J. (2001) The ins and outs of signalling. Nature, 411(6839):759–762. doi: 10.1038/35081138
    4. Melas I.N., Mitsos A., Messinis E.D., Weiss T., Alexopoulos L.G. (2011) Combined logical and data-driven models for linking signalling pathways to cellular response. BMC Systems Biology, 5:107. doi: 10.1186/1752-0509-5-107
    5. Mitsos A., Melas I.N., Morris M.K., Saez-Rodriguez J., Lauffenburger D.A., Alexopoulos L.G. (2012) Non Linear Programming (NLP) Formulation for Quantitative Modeling of Protein Signal Transduction Pathways. PLoS ONE, 7(11):e50085. doi: 10.1371/journal.pone.0050085
