Nat Methods. 2016 Apr;13(4):310-8.

doi: 10.1038/nmeth.3773. Epub 2016 Feb 22.

Inferring causal molecular networks: empirical assessment through a community-based effort

Steven M Hill¹, Laura M Heiser^{2,3,4}, Thomas Cokelaer⁵, Michael Unger^{6,7}, Nicole K Nesser⁸, Daniel E Carlin⁹, Yang Zhang¹⁰, Artem Sokolov⁹, Evan O Paull⁹, Chris K Wong⁹, Kiley Graim⁹, Adrian Bivol⁹, Haizhou Wang¹⁰, Fan Zhu¹¹, Bahman Afsari¹², Ludmila V Danilova^{12,13}, Alexander V Favorov^{12,13,14}, Wai Shing Lee¹², Dane Taylor^{15,16}, Chenyue W Hu¹⁷, Byron L Long¹⁷, David P Noren¹⁷, Alexander J Bisberg¹⁷; HPN-DREAM Consortium; Gordon B Mills¹⁸, Joe W Gray^{2,3,4}, Michael Kellen¹⁹, Thea Norman¹⁹, Stephen Friend¹⁹, Amina A Qutub¹⁷, Elana J Fertig¹², Yuanfang Guan^{11,20,21}, Mingzhou Song¹⁰, Joshua M Stuart⁹, Paul T Spellman⁸, Heinz Koeppl^{6,7}, Gustavo Stolovitzky²², Julio Saez-Rodriguez^{5,23}, Sach Mukherjee^{1,24,25}

Collaborators

HPN-DREAM Consortium:
Bahman Afsari, Rami Al-Ouran, Bernat Anton, Tomasz Arodz, Omid Askari Sichani, Neda Bagheri, Noah Berlow, Alexander J Bisberg, Adrian Bivol, Anwesha Bohler, Jaume Bonet, Richard Bonneau, Gungor Budak, Razvan Bunescu, Mehmet Caglar, Binghuang Cai, Chunhui Cai, Daniel E Carlin, Azzurra Carlon, Lujia Chen, Mark F Ciaccio, Thomas Cokelaer, Gregory Cooper, Susan Coort, Chad J Creighton, Seyed-Mohammad-Hadi Daneshmand, Alberto de la Fuente, Barbara Di Camillo, Ludmila V Danilova, Joyeeta Dutta-Moscato, Kevin Emmett, Chris Evelo, Mohammad-Kasim H Fassia, Alexander V Favorov, Elana J Fertig, Justin D Finkle, Francesca Finotello, Stephen Friend, Xi Gao, Jean Gao, Javier Garcia-Garcia, Samik Ghosh, Alberto Giaretta, Kiley Graim, Joe W Gray, Ruth Großeholz, Yuanfang Guan, Justin Guinney, Christoph Hafemeister, Oliver Hahn, Saad Haider, Takeshi Hase, Laura M Heiser, Steven M Hill, Jay Hodgson, Bruce Hoff, Chih Hao Hsu, Chenyue W Hu, Ying Hu, Xun Huang, Mahdi Jalili, Xia Jiang, Tim Kacprowski, Lars Kaderali, Mingon Kang, Venkateshan Kannan, Michael Kellen, Kaito Kikuchi, Dong-Chul Kim, Hiroaki Kitano, Bettina Knapp, George Komatsoulis, Heinz Koeppl, Andreas Krämer, Miron Bartosz Kursa, Martina Kutmon, Wai Shing Lee, Yichao Li, Xiaoyu Liang, Zhaoqi Liu, Yu Liu, Byron L Long, Songjian Lu, Xinghua Lu, Marco Manfrini, Marta R A Matos, Daoud Meerzaman, Gordon B Mills, Wenwen Min, Sach Mukherjee, Christian Lorenz Müller, Richard E Neapolitan, Nicole K Nesser, David P Noren, Thea Norman, Baldo Oliva, Stephen Obol Opiyo, Ranadip Pal, Aljoscha Palinkas, Evan O Paull, Joan Planas-Iglesias, Daniel Poglayen, Amina A Qutub, Julio Saez-Rodriguez, Francesco Sambo, Tiziana Sanavia, Ali Sharifi-Zarchi, Janusz Slawek, Artem Sokolov, Mingzhou Song, Paul T Spellman, Adam Streck, Gustavo Stolovitzky, Sonja Strunz, Joshua M Stuart, Dane Taylor, Jesper Tegnér, Kirste Thobe, Gianna Maria Toffolo, Emanuele Trifoglio, Michael Unger, Qian Wan, Haizhou Wang, Lonnie Welch, Chris K Wong, Jia J Wu, Albert Y Xue, Ryota Yamanaka, Chunhua Yan, Sakellarios Zairis, Michael Zengerling, Hector Zenil, Shihua Zhang, Yang Zhang, Fan Zhu, Zhike Zi

Affiliations

¹ MRC Biostatistics Unit, Cambridge Institute of Public Health, Cambridge, UK.
² Department of Biomedical Engineering, Oregon Health and Science University, Portland, Oregon, USA.
³ Center for Spatial Systems Biomedicine, Oregon Health and Science University, Portland, Oregon, USA.
⁴ Knight Cancer Institute, Oregon Health and Science University, Portland, Oregon, USA.
⁵ European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, UK.
⁶ Automatic Control Laboratory, ETH Zurich, Zurich, Switzerland.
⁷ Institute of Biochemistry, ETH Zurich, Zurich, Switzerland.
⁸ Department of Molecular and Medical Genetics, Oregon Health and Science University, Portland, Oregon, USA.
⁹ Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, California, USA.
¹⁰ Department of Computer Science, New Mexico State University, Las Cruces, New Mexico, USA.
¹¹ Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, USA.
¹² Department of Oncology, Division of Biostatistics and Bioinformatics, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University, Baltimore, Maryland, USA.
¹³ Laboratory of Systems Biology and Computational Genetics, Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow, Russia.
¹⁴ Laboratory of Bioinformatics, Research Institute of Genetics and Selection of Industrial Microorganisms, Moscow, Russia.
¹⁵ Statistical and Applied Mathematical Sciences Institute, Research Triangle Park, North Carolina, USA.
¹⁶ Department of Mathematics, University of North Carolina, Chapel Hill, North Carolina, USA.
¹⁷ Department of Bioengineering, Rice University, Houston, Texas, USA.
¹⁸ Department of Systems Biology, MD Anderson Cancer Center, Houston, Texas, USA.
¹⁹ Sage Bionetworks, Seattle, Washington, USA.
²⁰ Department of Internal Medicine, University of Michigan, Ann Arbor, Michigan, USA.
²¹ Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, Michigan, USA.
²² IBM Translational Systems Biology and Nanobiotechnology, Yorktown Heights, New York, USA.
²³ RWTH-Aachen University Hospital, Joint Research Centre for Computational Biomedicine (JRC-COMBINE), Aachen, Germany.
²⁴ School of Clinical Medicine, University of Cambridge, Cambridge, UK.
²⁵ German Centre for Neurodegenerative Diseases (DZNE), Bonn, Germany.

PMID: 26901648
PMCID: PMC4854847
DOI: 10.1038/nmeth.3773

Steven M Hill et al. Nat Methods. 2016 Apr.

Abstract

It remains unclear whether causal, rather than merely correlational, relationships in molecular networks can be inferred in complex biological settings. Here we describe the HPN-DREAM network inference challenge, which focused on learning causal influences in signaling networks. We used phosphoprotein data from cancer cell lines as well as in silico data from a nonlinear dynamical model. Using the phosphoprotein data, we scored more than 2,000 networks submitted by challenge participants. The networks spanned 32 biological contexts and were scored in terms of causal validity with respect to unseen interventional data. A number of approaches were effective, and incorporating known biology was generally advantageous. Additional sub-challenges considered time-course prediction and visualization. Our results suggest that learning causal relationships may be feasible in complex settings such as disease states. Furthermore, our scoring approach provides a practical way to empirically assess inferred molecular networks in a causal sense.

Conflict of interest statement

The authors declare no competing financial interests.

Figures

**Figure 1. Causal networks.**
(a) A directed edge denotes that inhibition of the parent node A can change the abundance of the child node B. (b) Causal edges, as used here, may represent direct effects or indirect effects that occur via unmeasured intermediate nodes. If node A causally influences node B via measured node C, the causal network should contain edges from A to C and from C to B, but not from A to B (top). However, if node C is not measured (and is not part of the network), the causal network should contain an edge from A to B (bottom). Note that in both cases inhibition of node A will lead to a change in node B. (c) Causal edges may depend on biological context; for example, a causal edge from A to B appears in context 1, but not in context 2 (lines in graphs are as defined in a). (d) Correlation and causation. Nodes A and B are correlated owing to regulation by the same node (C), but in this example no sequence of mechanistic events links A to B, and thus inhibition of A does not change the abundance of B (lines in bottom right graph are as defined in a). Therefore, despite the correlation, there is no causal edge from A to B.
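
The rule in panel b (an indirect effect through an unmeasured node becomes a direct edge between its measured neighbours) can be sketched in a few lines of Python. This is only an illustrative reading of the caption, not code from the challenge; the function name `marginalize` and the edge-set representation are assumptions made for the example.

```python
def marginalize(edges, hidden):
    """Drop unmeasured nodes from a directed graph.

    edges  : iterable of (parent, child) pairs
    hidden : iterable of node names that are not measured
    Each parent of a removed node is linked directly to each child of
    that node, so indirect effects via unmeasured nodes stay visible.
    """
    edges = set(edges)
    for h in hidden:
        parents = {a for a, b in edges if b == h and a != h}
        children = {b for a, b in edges if a == h and b != h}
        # Remove every edge touching the hidden node ...
        edges = {(a, b) for a, b in edges if h not in (a, b)}
        # ... and bridge its parents to its children.
        edges |= {(p, c) for p in parents for c in children if p != c}
    return edges


# Panel b, top: C is measured, so the chain A -> C -> B is kept as is.
chain = {("A", "C"), ("C", "B")}
print(marginalize(chain, hidden=[]))

# Panel b, bottom: C is unmeasured, so the network contains A -> B.
print(marginalize(chain, hidden=["C"]))

# Panel d: A and B share the regulator C but no path links A to B,
# so no causal edge between A and B appears, whether or not C is measured.
common_cause = {("C", "A"), ("C", "B")}
print(marginalize(common_cause, hidden=["C"]))
```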

**Figure 2. The HPN-DREAM network inference challenge: overview of experimental data tasks and causal assessment strategy.**
(a) Protein data were obtained from four cancer cell lines under eight stimuli (described in ref. 31). For each of the 32 resulting contexts, participants were provided with training data comprising time courses for ∼45 phosphoproteins under three different kinase inhibitors and a control (DMSO). For the sub-challenge 1 experimental data task (SC1A), participants were asked to infer causal signaling networks specific to each context. In SC2A, the aim was to predict context-specific phosphoprotein time courses. In both cases, submissions were assessed using held-out, context-specific test data that were obtained under an unseen intervention (inhibition of the kinase mTOR). Each sub-challenge also included a companion *in silico* data task (SC1B and SC2B, respectively; described in the text, Online Methods and Supplementary Fig. 1). Abund., abundance; TP, true positives; FP, false positives. (b) Networks submitted for SC1A were assessed causally in terms of agreement with the interventional test data. For each context, the set of nodes that changed under mTOR inhibition was identified (gold-standard causal descendants of mTOR; described in the text and Online Methods). In the example shown, node X is a descendant of mTOR, whereas node Y is not. (c) Predicted descendants of mTOR from submitted context-specific networks were compared with their experimentally determined gold-standard counterparts. This gave true and false positive counts and a (context-specific) AUROC. (d) In each context, teams were ranked by AUROC score, and mean rank across contexts gave the final rankings.
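
The assessment in panels b to d can be sketched as follows. This is a simplified illustration under stated assumptions, not the challenge's scoring code: the gold-standard descendant set is taken as given, each node is scored by the largest edge-weight threshold at which it is still a descendant of mTOR, the AUROC is computed with the rank (Mann-Whitney) formula, and tied teams share the better rank. All function names and the toy edge weights are hypothetical.

```python
def descendant_scores(weights, source):
    """weights: {(parent, child): edge weight}.

    Returns, for every node other than `source`, the largest threshold at
    which the node is still reachable from `source` when only edges with
    weight >= threshold are kept (0.0 if it is never reachable).
    """
    nodes = {n for edge in weights for n in edge}
    score = {n: 0.0 for n in nodes}
    score[source] = float("inf")
    changed = True
    while changed:  # relax edges until the scores stop improving
        changed = False
        for (a, b), w in weights.items():
            cand = min(score[a], w)  # weakest edge along the path via a
            if cand > score[b]:
                score[b] = cand
                changed = True
    del score[source]
    return score


def auroc(scores, positives):
    """Probability that a gold-standard descendant outscores a non-descendant."""
    pos = [s for n, s in scores.items() if n in positives]
    neg = [s for n, s in scores.items() if n not in positives]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative node")
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def mean_ranks(auroc_by_context):
    """auroc_by_context: {context: {team: AUROC}}. Rank 1 is best."""
    ranks = {}
    for team_scores in auroc_by_context.values():
        for team, s in team_scores.items():
            rank = 1 + sum(other > s for other in team_scores.values())
            ranks.setdefault(team, []).append(rank)
    return {team: sum(r) / len(r) for team, r in ranks.items()}


# Toy network for one context (weights are made up for illustration).
net = {("mTOR", "X"): 0.9, ("X", "Z"): 0.4, ("Y", "X"): 0.8, ("mTOR", "Y"): 0.1}
scores = descendant_scores(net, "mTOR")
print(scores)
print(auroc(scores, positives={"X", "Z"}))
```

Scoring each node by its best threshold gives the same ROC curve as sweeping the threshold and recounting true and false positives at each step, which is why the rank formula can be used here.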

**Figure 3. Network inference sub-challenge (SC1) results.**
(a) AUROC scores in each of the 32 (cell line, stimulus) contexts for the 74 teams that submitted networks for the experimental data task. (b) Scores in experimental and *in silico* data tasks. Each square represents a team. Red borders around squares indicate that a different method was used in each task. Numbers adjacent to squares indicate ranks for the top ten teams under a combined score (three teams ranked third). (c,d) Results of crowdsourcing for the experimental data task. Aggregate networks were formed by combining, for each context, networks from top scoring (c) or randomly selected (d) teams (Online Methods). Dashed lines indicate aggregations of all submissions. Results in d are mean values over 100 iterations of random selection (error bars indicate ±s.d.). (e,f) Performance by method type for the experimental (e) and *in silico* (f) data tasks. The final rank is shown above each bar, and the gray lines indicate the mean performance of random predictions. ODE, ordinary differential equation.
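
As a rough illustration of the crowdsourcing idea in panels c and d, the sketch below combines several teams' networks for one context by averaging their edge weights. The caption defers the actual combination rule to the Online Methods, so the averaging used here, the treatment of missing edges as weight 0, and the name `aggregate` are all assumptions made for the example.

```python
def aggregate(networks):
    """Average edge weights over a list of networks.

    networks: list of {(parent, child): weight} dicts, one per team.
    An edge absent from a team's network counts as weight 0 for that team.
    """
    if not networks:
        raise ValueError("need at least one network to aggregate")
    edges = set().union(*networks)  # every edge proposed by any team
    return {
        e: sum(net.get(e, 0.0) for net in networks) / len(networks)
        for e in edges
    }


# Two hypothetical team submissions for the same context.
team_1 = {("A", "B"): 1.0}
team_2 = {("A", "B"): 0.5, ("B", "C"): 1.0}
print(aggregate([team_1, team_2]))
```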

**Figure 4. Role of pre-existing biological knowledge in the experimental data network inference task (SC1A).**
(a) Box plots showing mean AUROC scores for teams that either did or did not use a prior network. P value calculated via Wilcoxon rank-sum test (n = 18). (b) Performance of aggregate prior network when combined with networks inferred by PropheticGranger (top performer in SC1A when combined with a network prior) or FunChisq (top performer in SC1B). The blue line indicates aggregate prior combined with randomly generated networks (mean of 30 random networks; shading indicates ±s.d.). The dashed line shows the mean AUROC score achieved by the top-performing team in SC1A. Error bars denote ±s.e.m. (c) Performance of aggregate submission network and aggregate prior network in each context. Top, performance by context. Box plots over AUROC scores for the top 25 performers for each context, shown for comparison. Bottom, receiver operating characteristic curves for two contexts that showed performance differences between aggregate submission and prior. For all box plots, line within the box indicates the median, and the box edges denote the 25th and 75th percentiles. Whiskers extend to 1.5 times the interquartile range from the box hinge. Individual data points are also shown.

**Figure 5. Aggregate submission networks for the experimental data network inference task (SC1A).**
(a) The aggregate submission network for cell line MCF7 under HGF stimulation. Line thickness corresponds to edge weight (number of edges shown set to equal number of nodes). To determine which edges were present and not present in the aggregate prior network, we placed a threshold of 0.1 on edge weights. Green and blue nodes represent descendants of mTOR in the network shown (Fig. 2b,c and Supplementary Fig. 2). The network was generated using Cytoscape. (b) Principal component analysis applied to edge scores for the 32 context-specific aggregate submission networks (Online Methods).

References

1. Bansal M, Belcastro V, Ambesi-Impiombato A, di Bernardo D. How to infer gene networks from expression profiles. Mol. Syst. Biol. 2007;3:78. doi: 10.1038/msb4100120.
2. Markowetz F, Spang R. Inferring cellular networks—a review. BMC Bioinformatics. 2007;8:S5. doi: 10.1186/1471-2105-8-S6-S5.
3. Hecker M, Lambeck S, Toepfer S, van Someren E, Guthke R. Gene regulatory network inference: data integration in dynamic models—a review. Biosystems. 2009;96:86–103. doi: 10.1016/j.biosystems.2008.12.004.
4. De Smet R, Marchal K. Advantages and limitations of current network inference methods. Nat. Rev. Microbiol. 2010;8:717–729. doi: 10.1038/nrmicro2419.
5. Marbach D, et al. Revealing strengths and weaknesses of methods for gene network inference. Proc. Natl. Acad. Sci. USA. 2010;107:6286–6291. doi: 10.1073/pnas.0913357107.
