Bioinformatics. 2020 Apr 15;36(8):2522-2529.

doi: 10.1093/bioinformatics/btz950.

LiPLike: towards gene regulatory network predictions of high certainty

Rasmus Magnusson¹, Mika Gustafsson¹


Affiliation

¹ Department of Physics, Chemistry and Biology, Linköping University, Linköping 581 83, Sweden.

PMID: 31904818
PMCID: PMC7178405
DOI: 10.1093/bioinformatics/btz950

Abstract

Motivation: High correlation in expression between regulatory elements is a persistent obstacle for the reverse-engineering of gene regulatory networks. If two potential regulators have matching expression patterns, it becomes challenging to differentiate between them, thus increasing the risk of false positive identifications.

Results: To allow for gene regulation predictions of high confidence, we propose a novel method, the Linear Profile Likelihood (LiPLike), that assumes a regression model and iteratively searches for interactions that cannot be replaced by a linear combination of other predictors. To compare the performance of LiPLike with other available inference methods, we benchmarked LiPLike using three independent datasets from the Dialogue on Reverse Engineering Assessment and Methods 5 (DREAM5) network inference challenge. We found that LiPLike could be used to stratify predictions of other inference tools, and when applied to the predictions of DREAM5 participants, we observed an average improvement in accuracy of >140% compared to individual methods. Furthermore, LiPLike was able to independently predict networks better than all DREAM5 participants when applied to biological data. When predicting the Escherichia coli network, LiPLike had an accuracy of 0.38 for the top-ranked 100 interactions, whereas the corresponding DREAM5 consensus model yielded an accuracy of 0.11.

Availability and implementation: LiPLike is available to the community as a Python toolbox at https://gitlab.com/Gustafsson-lab/liplike. We believe that LiPLike will be used for high confidence predictions in studies where individual model interactions are of high importance, and to remove false positive predictions made by other state-of-the-art gene-gene regulation prediction tools.

Supplementary information: Supplementary data are available at Bioinformatics online.

Figures

**Fig. 1.**
Illustration of LiPLike rationale. (A) In a toy system of three gene regulators (X₁, and the correlated variables X₂ and X₃) regulating a target gene (Y), the optimal parameters of the corresponding linear model are easily identified using the method of least squares. Next, by constraining the parameter of X₁ to equal a value ζ and re-estimating the remaining two parameters for each value of ζ, the profile of the residual sum of squares as a function of ζ can be studied. In other words, as ζ is changed, so is the ability of the regulators to explain the data in Y. Of special interest is the point ζ = 0, i.e. where the regulator is removed from the system. If X₁ is uniquely needed to model Y, the residual sum of squares increases rapidly as ζ moves away from its optimum, as seen in the top case, and there is a large increase in the residual sum of squares between the best fit and the case where ζ = 0. In the bottom case, where the parameter between X₂ and Y is changed instead, X₂ and X₃ are correlated, so there exists a linear combination of the remaining explanatory variables that can adequately fit Y, and the residual sum of squares is less dependent on ζ: the variable X₃ is able to take the place of X₂. (B) Two examples of LiPLike applied to data. There are three independent variables, two of which (X₂ and X₃) are highly correlated. To explain the dependent variable Y, either (X₁, X₂) or (X₁, X₃) is needed, and there is no way to infer whether X₂ or X₃ is the correct regulator. If X₂ or X₃ is left out of the set, LiPLike infers both remaining inputs to be important, as illustrated by the magnitudes of q shown to the right. When all three independent variables are included, LiPLike refrains from selecting X₂ or X₃, since neither can be inferred uniquely.
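The rationale in Fig. 1 can be sketched numerically. The following is a minimal illustration, not the LiPLike toolbox itself: the function names `profile_rss` and `rss_increase_at_zero`, the simulated data, and the use of the raw increase in residual sum of squares as a score are all assumptions made here for illustration, and the caption does not give the exact definition of q.

```python
import numpy as np


def profile_rss(X, y, j, zeta):
    """RSS when the coefficient of column j is fixed to zeta and the
    remaining coefficients are re-estimated by least squares."""
    others = [k for k in range(X.shape[1]) if k != j]
    target = y - zeta * X[:, j]          # move the fixed term to the left-hand side
    beta, *_ = np.linalg.lstsq(X[:, others], target, rcond=None)
    resid = target - X[:, others] @ beta
    return float(resid @ resid)


def rss_increase_at_zero(X, y, j):
    """Increase in RSS between the unconstrained best fit and zeta = 0,
    i.e. the cost of removing regulator j (hypothetical score, not q)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return profile_rss(X, y, j, 0.0) - float(resid @ resid)


# Toy system: X2 and X3 are nearly identical, X1 is independent.
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
x3 = x2 + 0.01 * rng.normal(size=n)
y = x1 + x2 + 0.1 * rng.normal(size=n)

X_all = np.column_stack([x1, x2, x3])
increase_all = [rss_increase_at_zero(X_all, y, j) for j in range(3)]

# Leaving X3 out of the candidate set makes X2 uniquely needed.
X_reduced = np.column_stack([x1, x2])
increase_reduced = [rss_increase_at_zero(X_reduced, y, j) for j in range(2)]
```

With all three regulators present, only X₁ should give a large increase, because X₃ can stand in for X₂ and vice versa. Once X₃ is left out, the increase for X₂ becomes large as well, mirroring panel (B).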

**Fig. 2.**
LiPLike properties and performance on *in silico* generated networks. The confidence of inferred edges is listed as q, calculated for two datasets for the same network. The datasets differed in their signal-to-noise ratio. The magnitudes of q were found to depend on the noise level, differing by a factor of 10e⁷ between the datasets, as seen on the x-axis of the histograms to the left. Moreover, both histograms display a property that empirically arises from LiPLike for networks with strong signals, i.e. a separation of confidence into two distinct groups: high confidence or none. To the right are the corresponding receiver operating characteristic curves, showing that LiPLike infers edges well for the highest-confidence predictions and thereafter has a near-random chance of identifying an edge. The networks were retrieved from https://bitbucket.org/sonnhammergrni/genespider.

**Fig. 3.**
LiPLike performance on DREAM5 challenge data. (A) Accuracy of algorithms predicting edges of the *E. coli* network as a function of the number of edges considered. The LiPLike performance is plotted in red, showing a higher accuracy than all DREAM5 participants. (B) The accuracies of LiPLike across top-ranked edges for all networks. (C) The accuracy of all methods, the crowd estimate and LiPLike for the top-ranked edges. LiPLike gave the highest accuracy of all methods in both biologically derived networks, and ranked 20th of 36 in the *in silico* network. (Color version of this figure is available at *Bioinformatics* online.)
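Accuracy as a function of the number of edges considered can be computed along the following lines. This is a sketch under the assumption that accuracy means the fraction of the k top-ranked predicted edges that appear in a gold-standard edge set. The function `top_k_accuracy` and the gene names are hypothetical and are not taken from the DREAM5 evaluation scripts.

```python
def top_k_accuracy(scored_edges, gold_edges, k):
    """Fraction of the k highest-scoring edges that are in gold_edges.

    scored_edges: list of ((regulator, target), score) pairs
    gold_edges:   set of (regulator, target) pairs assumed to be true
    """
    ranked = sorted(scored_edges, key=lambda item: item[1], reverse=True)[:k]
    if not ranked:
        return 0.0
    hits = sum(1 for edge, _ in ranked if edge in gold_edges)
    return hits / len(ranked)


# Hypothetical predictions and gold standard.
predictions = [
    (("tfA", "g1"), 0.9),
    (("tfA", "g2"), 0.8),
    (("tfB", "g1"), 0.4),
    (("tfB", "g3"), 0.1),
]
gold = {("tfA", "g1"), ("tfB", "g1")}

acc_top2 = top_k_accuracy(predictions, gold, k=2)
acc_top3 = top_k_accuracy(predictions, gold, k=3)
```

Evaluating the function over a range of k gives a curve of the kind shown in panel (A).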

**Fig. 4.**
Accuracy of edge predictions of the DREAM5 community prediction and LiPLike, split between top edges that are found exclusively in the community prediction, exclusively in LiPLike, and in both. In all cases, the edges that are found in both predictions have a considerable increase in accuracy compared to the DREAM5 challenge community prediction. Moreover, in the case of the biological networks, *S. aureus* and *E. coli*, LiPLike performs better than the community in the non-overlapping predictions, indicating that LiPLike identifies edges that the community failed to include.

**Fig. 5.**
Network properties. (A) Cumulative distribution of the highest correlation with other regulators for putative interactions, shown for the top-ranked interactions of LiPLike (red) and the crowd (grey) from the respective DREAM5 networks. The regulators LiPLike identifies tend to have, on average, fewer correlating regulators. For example, in the *E. coli* network, we observed a median Pearson correlation of $\rho = 0.57$ for the LiPLike predictions. For the corresponding community prediction, 85.3% of all inferred regulators have a higher correlation than 0.57 to another regulator. This indicates that LiPLike is less prone to predicting edges where there are several potential regulators to choose from. (B) Distribution of inferred edges for each transcription factor for LiPLike (red) and the community (grey). While the putative outdegrees of transcription factors in the community estimate appear to follow a power law (as indicated by the straight line on the log scale), LiPLike appears to select edges with a broader distribution profile. (C) Accuracies for the inferred top regulators in the community prediction were found to be low. The top regulators in the LiPLike network had accuracies similar to the overall LiPLike accuracy. (Color version of this figure is available at *Bioinformatics* online.)
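The quantity plotted in panel (A), the highest correlation of each regulator with any other regulator, can be sketched as below. The function name `max_corr_with_other_regulators`, the simulated expression matrix, and the use of the absolute Pearson correlation are assumptions for illustration, since the caption does not state whether the sign of the correlation is retained.

```python
import numpy as np


def max_corr_with_other_regulators(expr):
    """For each regulator (column of expr, samples in rows), return the
    highest absolute Pearson correlation with any other regulator."""
    corr = np.abs(np.corrcoef(expr, rowvar=False))
    np.fill_diagonal(corr, 0.0)   # ignore each regulator's correlation with itself
    return corr.max(axis=1)


# Hypothetical expression data: regulators 1 and 2 are nearly identical.
rng = np.random.default_rng(1)
n_samples = 200
r0 = rng.normal(size=n_samples)
r1 = rng.normal(size=n_samples)
r2 = r1 + 0.01 * rng.normal(size=n_samples)
expr = np.column_stack([r0, r1, r2])

highest = max_corr_with_other_regulators(expr)
median_highest = float(np.median(highest))
```

Taking the median of these values over the regulators of the top-ranked interactions gives a summary comparable to the one reported for the *E. coli* network.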


