PLoS One. 2017 Dec 6;12(12):e0170340.
doi: 10.1371/journal.pone.0170340. eCollection 2017.

Prophetic Granger Causality to infer gene regulatory networks

Daniel E Carlin et al. PLoS One. 2017.

Abstract

We introduce a novel method called Prophetic Granger Causality (PGC) for inferring gene regulatory networks (GRNs) from protein-level time series data. The method uses an L1-penalized regression adaptation of Granger Causality to model protein levels as a function of time, stimuli, and other perturbations. When combined with a data-independent network prior, the framework outperformed all other methods submitted to the HPN-DREAM 8 breast cancer network inference challenge. Our investigations reveal that PGC provides information complementary to other approaches, raising the performance of ensemble learners, although on its own it achieves only moderate performance. Thus, PGC serves as a valuable new tool in the bioinformatics toolkit for analyzing temporal datasets. We investigate the general and cell-specific interactions predicted by our method and find several novel interactions, demonstrating the utility of the approach in charting new tumor wiring.

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

Fig 1. Prophetic Granger Causality method.
(A) The method is given a set of probes (rows; y-axis), each measuring the level of a particular phospho-protein state at particular time points (columns; x-axis). Each probe value at each time point is considered in turn as the target of a linear regression on all other time points and probes. Depicted is probe A being considered at time t (green). The L1 penalty parameter is chosen such that autoregression contributions (red) are set to zero. Any remaining non-zero regression coefficients for other probes suggest causality: probes at past or concurrent time points (blue) are considered causal of the target, while probes at future time points (yellow) are considered to be caused by the target. The different inhibitor conditions are treated as different examples in the regression task. This process is repeated for each time and probe, with each regression task contributing to the final connectivity matrix. (B) Overview of the PGC plus network prior approach for the HPN DREAM8 submission. Shown is a prediction for a single (cell line, ligand) pair task. (i.) 263 Pathway Commons pathways having at least two proteins in the DREAM dataset (colored shapes). (ii.) Heat diffusion kernels, used to measure closeness between protein pairs in each pathway (see S1 File), were combined into a single weighted “network prior,” represented as an adjacency matrix. (iii.) The Prophetic Granger solution, obtained as shown in part A. (iv.) The final solution for the (cell line, ligand stimulus) pair is produced by averaging the network prior with the absolute value of the Prophetic Granger solution.
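The procedure in panels A and B can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`lasso`, `prophetic_granger`, `combine_with_prior`) are hypothetical, the penalty `alpha` is fixed by hand, same-probe (autoregressive) features are simply excluded rather than zeroed out by tuning the penalty, coefficients are summed into the connectivity matrix, and the toy data and prior are invented for the example.

```python
def soft_threshold(z, g):
    """Shrink z toward zero by g (the proximal step for an L1 penalty)."""
    if z > g:
        return z - g
    if z < -g:
        return z + g
    return 0.0


def lasso(X, y, alpha, n_iter=100):
    """Coordinate descent for (1/2n)*||y - Xw||^2 + alpha*||w||_1, no intercept."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    resid = list(y)  # residual y - Xw, kept up to date as w changes
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * w[j]
            new_w = soft_threshold(rho, alpha) / col_sq[j]
            delta = new_w - w[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                w[j] = new_w
    return w


def prophetic_granger(data, alpha):
    """data[c][p][t]: level of probe p at time t under condition c.

    Returns conn, where conn[i][j] accumulates evidence for the edge i -> j.
    """
    n_cond, n_probe, n_time = len(data), len(data[0]), len(data[0][0])

    def column(p, t):
        # One feature: probe p at time t across conditions, mean-centered.
        vals = [data[c][p][t] for c in range(n_cond)]
        mean = sum(vals) / n_cond
        return [v - mean for v in vals]

    conn = [[0.0] * n_probe for _ in range(n_probe)]
    for target in range(n_probe):
        for t in range(n_time):
            y = column(target, t)
            # Assumption: drop same-probe features instead of tuning the penalty.
            feats = [(p, s) for p in range(n_probe) if p != target
                     for s in range(n_time)]
            cols = [column(p, s) for p, s in feats]
            X = [[col[i] for col in cols] for i in range(n_cond)]
            for (p, s), coef in zip(feats, lasso(X, y, alpha)):
                if coef == 0.0:
                    continue
                if s <= t:
                    conn[p][target] += coef  # past or concurrent: p -> target
                else:
                    conn[target][p] += coef  # future ("prophetic"): target -> p
    return conn


def combine_with_prior(prior, pgc):
    """Average the prior with the absolute value of the PGC solution."""
    n = len(prior)
    return [[(prior[i][j] + abs(pgc[i][j])) / 2 for j in range(n)]
            for i in range(n)]


# Toy data (invented): 8 conditions, 2 probes, 3 time points.
# Probe B at time t+1 is twice probe A at time t, so A should drive B.
h1 = [1, -1, 1, -1, 1, -1, 1, -1]
h2 = [1, 1, -1, -1, 1, 1, -1, -1]
h3 = [1, 1, 1, 1, -1, -1, -1, -1]
h4 = [a * b for a, b in zip(h1, h2)]
data = [[[h1[c], h2[c], h3[c]], [h4[c], 2 * h1[c], 2 * h2[c]]] for c in range(8)]

conn = prophetic_granger(data, alpha=0.1)
prior = [[0.0, 1.0], [0.0, 0.0]]  # hypothetical prior favoring A -> B
final = combine_with_prior(prior, conn)
```

In this toy case both directions of evidence agree: regressions targeting B pick up A at an earlier time point, and regressions targeting A pick up B at a later time point, so all weight lands on the A -> B entry.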
Fig 2. Prophetic augmentations of Granger Causality and GENIE3 complement prior network knowledge.
Performance on the HPN DREAM8 1A sub-challenge after combining different methods with the network prior is shown. Performance of the prior alone is represented by the dotted line. Abbreviations: Prophetic Granger Causality, PGC; ignorant of causal ordering, ICO; solutions averaged across all experiments, SA; only past and present time points used, OPP (since this regression framework does not use future points, it cannot be called prophetic); only present time points used, OP. GENIE3 OP is the originally published version of the algorithm; since no external time points are used for this calculation, there is no equivalent Granger algorithm. GENIE3 error bars show one standard deviation of performance across 10 different random seeds.
Fig 3. Tests on the Yeung dataset reveal PGC adds orthogonal information to improve performance of ensembles.
Ensembles are constructed with methods added to the top performing method, GENIE3 (X-axis). Area under the Precision-Recall Curve (AUPRC) was used to measure performance (Y-axis). Only Prophetic GENIE3, Prophetic Granger Causality (PGC), and dynamic Bayes (EBDBnet) yielded additional performance improvement over the baseline GENIE3. The GENIE3, PGC and EBDBnet combination had the best performance.
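To make the ensemble evaluation concrete, the sketch below combines edge scores from two methods and scores the result by AUPRC. The caption does not say how the ensembles were combined, so rank averaging is an assumption here; AUPRC is approximated by average precision, ties are not handled specially, and the function names and all numbers are invented for illustration rather than taken from the paper.

```python
def average_ranks(score_lists):
    """Combine methods by averaging each edge's rank (higher = more confident)."""
    n = len(score_lists[0])
    combined = [0.0] * n
    for scores in score_lists:
        order = sorted(range(n), key=lambda i: scores[i])  # ascending scores
        for rank, i in enumerate(order):
            combined[i] += rank / len(score_lists)
    return combined


def auprc(scores, labels):
    """Average precision: mean precision at each true edge, by descending score."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    hits, total = 0, 0.0
    for k, i in enumerate(order, start=1):
        if labels[i]:
            hits += 1
            total += hits / k
    return total / hits if hits else 0.0


# Toy gold standard and scores for six candidate edges (all invented).
labels = [1, 1, 0, 0, 1, 0]
method_a = [0.9, 0.2, 0.8, 0.1, 0.7, 0.3]
method_b = [0.3, 0.9, 0.1, 0.2, 0.8, 0.7]

ensemble = average_ranks([method_a, method_b])
print(auprc(method_a, labels), auprc(method_b, labels), auprc(ensemble, labels))
```

Each toy method misranks a different true edge, so averaging their ranks recovers an ordering that neither produces alone, which is the sense in which a method can add orthogonal information to an ensemble.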
Fig 4. Cell-type vs. Stimulus ligand influence on the inferred HPN consensus network reveals a preponderance of cell-type interactions.
Here we show the top 10 percent of interactions in the consensus network. ANOVA on the Granger coefficients was used to determine whether interactions were cell-type dependent (red lines) or independent (grey), and whether they were stimulus ligand-dependent (dotted) or stimulus ligand-independent (solid). Line thickness reflects the inferred interaction strength. Cell-type-dependent interactions were much more common than stimulus ligand-dependent interactions, suggesting that cellular context has an important influence on the underlying GRN. Proteins with more than one phosphorylation site are disambiguated with lower case letters following the protein name. Disambiguation of the identity of these probes appears in Supplemental S4 Table.
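As a rough illustration of the ANOVA step, the sketch below computes a one-way F statistic for a single interaction whose coefficients are grouped by cell line. The caption does not give the exact ANOVA design, so the one-way layout, the function name `one_way_f`, and the coefficient values are all assumptions; converting F to a p-value needs the F distribution and is left out.

```python
def one_way_f(groups):
    """One-way ANOVA F statistic for a list of groups of numbers."""
    all_vals = [v for g in groups for v in g]
    grand_mean = sum(all_vals) / len(all_vals)
    means = [sum(g) / len(g) for g in groups]
    # Between-group and within-group sums of squares.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_between = len(groups) - 1
    df_within = len(all_vals) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Hypothetical coefficients for one interaction: one list per cell line,
# one value per stimulus ligand.
coefs_by_cell_line = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_stat = one_way_f(coefs_by_cell_line)
print(f_stat)
```

A large F statistic means the coefficient varies more between cell lines than within them, which would mark the interaction as cell-type dependent; regrouping the same values by ligand would test ligand dependence in the same way.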
Fig 5. Evidence of mutational disruption of MAPK8 network activity.
Interaction strengths involving JUN N-terminal Kinase (MAPK8) in the mutant UACC812 cell line are lower than in wild type cell lines. Interaction strengths were calculated as the normalized Granger coefficients derived in each cellular context. Each point is an interaction and points that appear above the line of equality (Y = X) indicate loss of function. Interaction strengths derived from all other interactions not involving MAPK8 are shown as the background (grey dots). Both the upstream and downstream interactions of MAPK8 (red) are significantly disrupted.
