Elucidating regulatory mechanisms downstream of a signaling pathway using informative experiments

Ewa Szczurek¹, Irit Gat-Viks, Jerzy Tiuryn, Martin Vingron

PMID: 19584836
PMCID: PMC2724975
DOI: 10.1038/msb.2009.45

Mol Syst Biol. 2009:5:287. doi: 10.1038/msb.2009.45. Epub 2009 Jul 7.

Affiliation

¹ Computational Molecular Biology Department, Max Planck Institute for Molecular Genetics, 14195 Berlin, Germany. szczurek@molgen.mpg.de

Abstract

Signaling cascades are triggered by environmental stimulation and propagate the signal to regulate transcription. Systematic reconstruction of the underlying regulatory mechanisms requires pathway-targeted, informative experimental data. However, practical experimental design approaches are still in their infancy. Here, we propose a framework that iterates design of experiments and identification of regulatory relationships downstream of a given pathway. The experimental design component, called MEED, aims to minimize the amount of laboratory effort required in this process. To avoid ambiguity in the identification of regulatory relationships, the choice of experiments maximizes diversity between expression profiles of genes regulated through different mechanisms. The framework takes advantage of expert knowledge about the pathways under study, formalized in a predictive logical model. By considering model-predicted dependencies between experiments, MEED is able to suggest a whole set of experiments that can be carried out simultaneously. Our framework was applied to investigate interconnected signaling pathways in yeast. In comparison with other approaches, MEED suggested the most informative experiments for unambiguous identification of transcriptional regulation in this system.
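
The idea of choosing experiments so that genes regulated through different mechanisms show different expression profiles can be illustrated with a small sketch. This is not the MEED algorithm itself, whose details are not given in the abstract; it is a toy greedy selection under stated assumptions. The function names, the program labels, and the predicted response values (-1 down, 0 unchanged, 1 up) are all hypothetical.

```python
from itertools import combinations


def indistinguishable_pairs(profiles, chosen):
    """Count pairs of regulatory programs whose predicted profiles
    are identical on every experiment in `chosen`."""
    count = 0
    for a, b in combinations(profiles, 2):
        if all(profiles[a][e] == profiles[b][e] for e in chosen):
            count += 1
    return count


def greedy_design(profiles, n_experiments, budget):
    """Greedily add the experiment that leaves the fewest indistinguishable
    program pairs, given the experiments already chosen.
    Stops when the budget is used up or no experiment helps any more."""
    chosen = []
    remaining = list(range(n_experiments))
    while remaining and len(chosen) < budget:
        current = indistinguishable_pairs(profiles, chosen)
        best = min(
            remaining,
            key=lambda e: indistinguishable_pairs(profiles, chosen + [e]),
        )
        if indistinguishable_pairs(profiles, chosen + [best]) == current:
            break  # no remaining experiment separates any further pair
        chosen.append(best)
        remaining.remove(best)
    return chosen


# Hypothetical model-predicted responses of four regulatory programs
# across four candidate experiments (columns 0..3).
profiles = {
    "program_A": [1, 0, 0, 1],
    "program_B": [1, 0, 1, 1],
    "program_C": [1, 1, 0, 0],
    "program_D": [1, 1, 1, 0],
}

design = greedy_design(profiles, n_experiments=4, budget=3)
print(design)
```

In this toy input, experiments 1 and 3 each separate the same pairs of programs, so they look equally useful when scored one at a time, yet running both adds nothing. Scoring each candidate conditionally on the experiments already chosen is one simple way to account for such dependencies between experiments.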

Conflict of interest statement

The authors declare that they have no conflict of interest.

Figures

**Figure 1**
Comparative performance analysis on random models. The comparison is carried out on 1000 cyclic models generated by random reshuffling of the TNF canonical human signaling pathway. (A, B) x-axis: the number of highest priority experiments used from the compared experiment lists to distinguish between regulatory programs, y-axis: the *FUP* score averaged over the 1000 random models (only the results with average *FUP*<0.35 are reported). The lower the averaged cumulative *FUP*, the higher the performance of a given ED method. (A) Comparison with the INDEP method. Our MEED algorithm has a significant advantage over independent experiment scoring. (B) Comparison with the network-based methods. The network-based methods choose the perturbed variables according to key features of the structure, whereas stimulations and perturbation states are chosen either at random (the random methods, R-prefixed, green shaded) or following our MEED algorithm (the hybrid methods, M-prefixed, blue shaded). (C) Box plots of the *FUP* scores (y-axis) for groups of 3, 9 and 15 highest priority experiments from the experiment lists proposed by all analyzed methods (x-axis). The results show that MEED consistently outperforms other methods on the tested random models. In general, the hybrid methods perform better than the random methods. This tendency implies that even allowing MEED to decide only on stimulations and perturbation states, regardless of how the perturbed variables were chosen, can still provide significant improvement.

**Figure 2**
Experiment list proposed using MEED for the yeast signaling model. The model is depicted on the left as a network with nodes (ovals) corresponding to environmental conditions (dark gray) and signaling components (light gray). Arrows represent regulatory influences. The list of the experiments designed using MEED is given in a table on the right, listing stimulation (control—YPD) and perturbation (green: knockout and red: overexpression).

**Figure 3**
Comparative performance on the yeast signaling model: *FUP* scores and ambiguity of expansion. MEED (plotted in magenta) is compared with INDEP (gray), network-based methods, as well as two extant ED approaches (Barrett and Palsson (2006)—orange; Ideker *et al* (2000)—red). As the two extant methods take as input results of expansion using the first four experiments proposed by MEED, their report starts from the fifth experiment. The method of Ideker *et al* (2000) reaches its stop criterion already after choosing three experiments (fifth to seventh experiment). x-axis in all plots (A–D): the number of highest priority experiments. For comparison with MEED, we present up to eleven experiments chosen by the other methods. (A, B) *FUP* scores. y-axis: the *FUP* score, measuring the ability of the experiments to distinguish between regulatory programs (only the results for *FUP*<0.35 are reported). With the lowest *FUP* for every number of highest priority experiments, MEED outperforms all alternative methods. The best performing of the network-based methods is M-TOPOL. (C) Regulatory modules. y-axis: the number of modules identified in expansion. The proportion of ambiguous modules is marked in gray. In comparison with the method of Barrett and Palsson (2006), more modules are obtained using the same number of highest priority experiments proposed by MEED (see Supplementary Figure S5 for similar analysis for M-TOPOL and Ideker *et al* (2000)). (D) Ambiguity of expansion. y-axis: ambiguity score (i.e., the average number of regulatory programs per gene; plotted in log scale). With lower ambiguity score for most numbers of highest priority experiments, MEED outperforms M-TOPOL and the method of Barrett and Palsson (2006) on the yeast model.
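
The ambiguity score in panel D is described as the average number of regulatory programs per gene. A minimal sketch of that average follows, assuming each gene is mapped to the set of regulatory programs matching its expression profile. The gene and program labels and the function name are hypothetical, and the treatment of genes that match no program is not specified in the caption, so they are simply left out of the input here.

```python
def ambiguity_score(matching_programs):
    """Average number of matching regulatory programs per gene.

    `matching_programs` maps a gene name to the set of regulatory
    programs whose predicted profile matches that gene.
    """
    if not matching_programs:
        raise ValueError("at least one gene is required")
    total = sum(len(programs) for programs in matching_programs.values())
    return total / len(matching_programs)


# Hypothetical example: one unambiguous gene and two ambiguous ones.
matches = {
    "gene1": {"P1"},
    "gene2": {"P1", "P2", "P3"},
    "gene3": {"P2", "P4"},
}
score = ambiguity_score(matches)  # (1 + 3 + 2) / 3
print(score)
```

A score of 1 would mean that every gene matches exactly one regulatory program, which is the unambiguous case.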

**Figure 4**
Expansion of the yeast signaling model using the experiments proposed by MEED. The yeast model is depicted in the center of the figure. The identified modules are presented, with additional dashed edges connecting the regulators in the pathway to their regulatory programs (nodes labeled with regulators and having a boundary color-coded according to their regulation function). The ambiguous modules, highlighted with dashed yellow squares, are presented as gray-filled nodes, labeled with their size and connected by edges to all their matching regulatory program nodes. The two ambiguous modules were subject to an additional MEED iteration, which succeeded in distinguishing their regulatory programs using only two additional experiments. Matrices showing the expression measurements of target genes (rows) across the eleven experiments proposed using MEED (columns) are presented only for the modules that contain at least seven genes. The columns of the expression matrices are ordered from left to right according to the order proposed by MEED. For clarity, only subsets of the large Ste12 and Kss1/Fus3 matrices are shown. The predicted profiles appear as separate rows above the matrices. For most modules, the expression profiles agree well with the predicted profiles. Blue arrows exemplify experiments in which the majority of the module genes disagree with the predicted profile.

**Figure 5**
Functional coherence of identified regulatory modules. Enrichment of the target genes from each of four large identified modules (rows) in various experiments (columns). Significant enrichment (Bonferroni-corrected hypergeometric P-value; indicated by shades of red) represents distinct behavior of the genes in a module compared with the rest of the genome. Enrichment P-values in TF–DNA binding targets (Zeitlinger *et al*, 2003; Harbison *et al*, 2004; Pokholok *et al*, 2006) and gene ontology annotation (GO, Ashburner *et al*, 2000) are reported. The different data sets and experiments' environmental conditions are color-coded above and below the matrix, respectively. The profiles used for the enrichment tests were not part of our original dataset. RPBc, ribonucleoprotein complex; BG, biogenesis; BS, biosynthesis.
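
For readers unfamiliar with the enrichment statistic named in this caption, the following is a generic sketch of a one-sided hypergeometric test with Bonferroni correction, using only the Python standard library. It is not the authors' pipeline; the function names and the example counts are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(k, population, successes, draws):
    """P(X >= k) when `draws` genes are sampled without replacement from
    `population` genes, of which `successes` carry the annotation."""
    upper = min(successes, draws)
    total = comb(population, draws)
    favorable = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(k, upper + 1)
    )
    return favorable / total


def bonferroni(p_value, n_tests):
    """Bonferroni correction: multiply by the number of tests, cap at 1."""
    return min(1.0, p_value * n_tests)


# Hypothetical counts: a genome of 10 genes, 4 of them annotated,
# and a module of 3 genes of which 2 are annotated.
p = hypergeom_upper_tail(k=2, population=10, successes=4, draws=3)
p_corrected = bonferroni(p, n_tests=2)
print(p, p_corrected)
```

The correction multiplies each raw P-value by the number of tests performed, which keeps the chance of any false positive across all tests below the chosen significance level.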

**Figure 6**
Illustrating expansion results with ambiguity networks. Ambiguity networks for regulatory modules obtained in expansion of the yeast model using the first five (A) and six (B) experiments on the list proposed by M-TOPOL (i.e., A and B differ only by one additional sixth experiment from the list). The ambiguity network provides a detailed insight into the ambiguous modules. Each white-filled node represents a regulatory program matching one of the identified modules. It is labeled with its regulator, and has a boundary color-coded according to its regulation function. Unambiguous modules are presented only by their unique matching regulatory program, without indicating their size. Ambiguous modules are presented as gray-filled nodes, labeled with their size and connected by edges to all their matching regulatory program nodes. Exemplary modules (highlighted with dashed squares) are shown together with their predicted profile (colored vector above the square). Dashed red: an ambiguous module controlled by seven regulatory programs containing a large set of genes in A is replaced in B by two smaller ambiguous modules controlled by four and three regulatory programs, respectively. The two modules differ in the gene response to the additional sixth experiment. Matrices showing expression profiles of the target genes (rows) across the experiments (columns) are plotted next to the modules. Dashed blue: A large ambiguous module whose genes did not respond in any of the first five experiments (the corresponding predicted profile is filled with black in A). Using the sixth experiment, the large module is replaced by two smaller ones in B. One module contains genes that were downregulated in the sixth experiment, whereas another contains genes that were upregulated (can be seen in green versus red entries in the predicted profiles of the modules). A large group of genes, whose expression has not changed in the sixth experiment, does not match any profile and therefore is not contained in any regulatory module.

References

1. Akutsu T, Kuhara S, Maruyama O, Miyano S (1998) A system for identifying genetic networks from gene expression patterns produced by gene disruptions and overexpressions. Genome Inform Ser Workshop Genome Inform 9: 151–160
2. Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, Davis AP, Dolinski K, Dwight SS, Eppig JT, Harris MA, Hill DP, Issel-Tarver L, Kasarskis A, Lewis S, Matese JC, Richardson JE, Ringwald M, Rubin GM, Sherlock G (2000) Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat Genet 25: 25–29
3. Barrett CL, Palsson BO (2006) Iterative reconstruction of transcriptional regulatory networks: an algorithmic approach. PLoS Comput Biol 2: e52
4. Bauer S, Grossmann S, Vingron M, Robinson PN (2008) Ontologizer 2.0--a multifunctional tool for GO term enrichment analysis and data exploration. Bioinformatics 24: 1650–1651
5. Bolouri H, Davidson EH (2002) Modeling transcriptional regulatory networks. Bioessays 24: 1118–1129
