Bioinformatics. 2025 Jun 2;41(6):btaf318.
doi: 10.1093/bioinformatics/btaf318.

BiGSM: Bayesian inference of gene regulatory network via sparse modelling

Hang Qin et al. Bioinformatics. 2025.

Abstract

Motivation: Inferring gene regulatory networks (GRNs) is challenging because the GRN matrix is inherently sparse and expression data are noisy, which often leads to false positive or false negative predictions. To address this, it is essential to leverage the sparsity of the GRN matrix and to develop a robust method that can handle varying levels of noise in the data. Moreover, most existing GRN inference methods produce only fixed point estimates, which are less flexible and less informative for comprehensive network analysis. In contrast, a Bayesian approach that yields closed-form posterior distributions allows probabilistic link selection and offers insight into the statistical confidence of each possible link. Consequently, it is important to develop a Bayesian GRN inference method and to benchmark it rigorously against state-of-the-art methods.

Results: We propose a method, Bayesian inference of GRN via Sparse Modelling (BiGSM). BiGSM exploits the sparsity of the GRN matrix and infers the posterior distributions of GRN links from noisy expression data using maximum-likelihood-based learning. We benchmarked BiGSM thoroughly on biological and simulated datasets, including GeneNetWeaver, GeneSPIDER, and GRNbenchmark. The benchmark evaluates its accuracy and robustness across varying noise levels and data models. On point-estimate-based performance measures, BiGSM provides the best overall performance in comparison with several state-of-the-art methods, including GENIE3, LASSO, LSCON, and Zscore. Additionally, BiGSM is the only method among those compared that provides posteriors for the GRN weights, which helps to assess confidence across predictions.

Availability and implementation: Code implemented in MATLAB and Python is available on GitHub: https://github.com/SachLab/BiGSM and archived at Zenodo.

Figures

Figure 1.
Overall workflow of the BiGSM algorithm. From the model assumption, the likelihood is computed. The posterior distribution is derived from Bayes' rule. Then, the log-likelihood as a function of the prior parameters and noise variance is computed from the posterior. In the iterative learning process, each step first updates the prior parameters and noise variance and then updates the posterior.
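The iterative loop described in this caption can be illustrated with a generic sparse Bayesian regression. The sketch below is not the authors' implementation: it assumes a linear model t = Phi w + noise, an independent zero-mean Gaussian prior on each weight with its own precision alpha_j, and the standard evidence-maximisation updates used for relevance vector machines. The function names, the iteration count, and the clipping constants are all illustrative choices.

```python
import numpy as np

def posterior(Phi, t, alpha, beta):
    """Gaussian posterior N(mean, Sigma) over the weights for fixed alpha, beta."""
    Sigma = np.linalg.inv(np.diag(alpha) + beta * Phi.T @ Phi)
    mean = beta * Sigma @ Phi.T @ t
    return mean, Sigma

def sparse_bayes_fit(Phi, t, n_iter=200, alpha_max=1e8):
    """Assumed model: t = Phi @ w + noise, with prior w_j ~ N(0, 1 / alpha_j)."""
    n_samples, n_features = Phi.shape
    alpha = np.ones(n_features)   # prior precisions, one per weight
    beta = 1.0                    # noise precision (1 / noise variance)
    mean, Sigma = posterior(Phi, t, alpha, beta)
    for _ in range(n_iter):
        # gamma_j in [0, 1] measures how well weight j is determined by the data
        gamma = np.clip(1.0 - alpha * np.diag(Sigma), 0.0, 1.0)
        # Step 1: update the prior parameters and the noise precision
        alpha = np.clip(gamma / (mean ** 2 + 1e-12), 1e-8, alpha_max)
        resid = t - Phi @ mean
        beta = max(n_samples - gamma.sum(), 1e-8) / (resid @ resid + 1e-12)
        # Step 2: update the posterior with the new hyperparameters
        mean, Sigma = posterior(Phi, t, alpha, beta)
    return mean, Sigma, alpha, 1.0 / beta
```

Weights whose precision alpha_j grows towards alpha_max are effectively pruned to zero, which is how this family of methods produces sparse solutions while still returning a posterior covariance for the surviving weights.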
Figure 2.
AUPR, AUROC, and Maximum F1 score of six inference methods on GeneSPIDER data with SNR of 1 (A), 0.1 (B), and 0.01 (C). The simulated data has one replicate and 50 genes for each GRN. The evaluation is without self-loops. Each box contains inference results over 20 GRNs.
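The three metrics in this caption can all be computed from one ranked list of candidate links. The sketch below is an illustration, not the paper's evaluation code. It assumes links are ranked by the absolute value of the inferred weight, that a true link is any nonzero off-diagonal entry, that AUPR is computed as average precision (a step-wise approximation), and that tied scores are ordered by position. The function name is hypothetical.

```python
import numpy as np

def edge_ranking_metrics(A_true, A_pred):
    """AUROC, AUPR (average precision) and maximum F1, excluding self-loops."""
    A_true = np.asarray(A_true, dtype=float)
    A_pred = np.asarray(A_pred, dtype=float)
    off_diag = ~np.eye(A_true.shape[0], dtype=bool)      # drop self-loops
    labels = (A_true[off_diag] != 0).astype(float)       # 1 = true link
    scores = np.abs(A_pred[off_diag])                    # confidence per link
    labels = labels[np.argsort(-scores, kind="stable")]  # best-scored first
    tp = np.cumsum(labels)            # true positives at each cutoff
    fp = np.cumsum(1.0 - labels)      # false positives at each cutoff
    n_pos, n_neg = tp[-1], fp[-1]
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one true link and one true non-link")
    recall = tp / n_pos
    precision = tp / (tp + fp)
    fpr = fp / n_neg
    # AUROC: trapezoidal area under (FPR, recall), starting from the origin
    x = np.concatenate(([0.0], fpr))
    y = np.concatenate(([0.0], recall))
    auroc = float(np.sum(np.diff(x) * (y[1:] + y[:-1]) / 2.0))
    # AUPR as average precision: mean precision at the rank of each true link
    aupr = float(np.sum(precision * labels) / n_pos)
    # F1 = 2TP / (2TP + FP + FN), maximised over all cutoffs
    f1 = 2.0 * tp / (tp + fp + n_pos)
    return {"auroc": auroc, "aupr": aupr, "max_f1": float(f1.max())}
```

Calling edge_ranking_metrics(A_true, A_pred) on a true and an inferred weight matrix of the same shape returns a dictionary with the three scores.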
Figure 3.
AUROC of six inference methods on DREAM4 Insilico size 100 networks, knockdown data. Each group of bars shows the AUROC of six methods on each network.
Figure 4.
Average ranks of six inference methods on GRNbenchmark. Each number represents the average rank of the method across five networks using the corresponding evaluation metric and under the specific noise level. BiGSM has the best overall performance in the 6 challenges, with a summed ranking of 10.
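The average-rank summary in this caption amounts to ranking the methods within each network and then averaging each method's rank across networks. A minimal sketch follows, assuming higher scores are better and ignoring ties; the function name and the input layout (one row per network, one column per method) are illustrative assumptions rather than the GRNbenchmark format.

```python
import numpy as np

def average_ranks(scores):
    """Rank methods within each network (1 = best), then average over networks."""
    scores = np.asarray(scores, dtype=float)            # shape: (networks, methods)
    order = np.argsort(-scores, axis=1, kind="stable")  # method indices, best first
    ranks = np.argsort(order, axis=1, kind="stable") + 1
    return ranks.mean(axis=0)
```

Summing these average ranks over several metric and noise-level combinations gives a single overall figure per method, which is the kind of summed ranking the caption reports.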
Figure 5.
Estimated posterior distribution of a 3×3 GRN. The true GRN weights and the corresponding gene expression are generated by GeneSPIDER with SNR=0.1. The estimated posterior distributions are the probability density functions of predicted GRN weights. The inferred GRN weights are the mean values of the posteriors.
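One way posteriors like these could support probabilistic link selection is to keep a link only when its credible interval excludes zero. The sketch below is a hypothetical rule, not one stated in this record: it assumes each weight has a Gaussian posterior summarised by a mean and a standard deviation, and it uses a central interval of roughly 95% (z = 1.96).

```python
import numpy as np

def select_links(post_mean, post_std, z=1.96):
    """Keep link (i, j) if its central credible interval does not contain zero."""
    post_mean = np.asarray(post_mean, dtype=float)
    post_std = np.asarray(post_std, dtype=float)
    lower = post_mean - z * post_std
    upper = post_mean + z * post_std
    return (lower > 0) | (upper < 0)   # boolean mask of selected links
```

Raising z makes the selection more conservative, trading recall for fewer false positives, which a single point estimate cannot express.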
Figure 6.
AUPR, AUROC, and Maximum F1 score of six inference methods on an E. coli network using biological expression data.
Figure 7.
Density analysis of the full GRNs inferred by six methods. A kernel density estimation (KDE) with a Gaussian kernel is used to estimate the probability density function (PDF) of the true GRN and the inferred GRNs.
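A Gaussian-kernel KDE of the weights in a GRN matrix can be written in a few lines. The sketch below is illustrative only: the bandwidth defaults to Silverman's rule of thumb (1.06 times the sample standard deviation times n to the power -1/5), which is an assumption, since the bandwidth used for the figure is not stated here. The function name is hypothetical.

```python
import numpy as np

def gaussian_kde_pdf(samples, grid, bandwidth=None):
    """Evaluate a Gaussian-kernel density estimate of `samples` on `grid`."""
    samples = np.asarray(samples, dtype=float).ravel()   # flatten the GRN matrix
    grid = np.asarray(grid, dtype=float)
    if bandwidth is None:
        # Silverman's rule of thumb (assumed default, not taken from the paper)
        bandwidth = 1.06 * samples.std() * samples.size ** (-0.2)
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    u = (grid[:, None] - samples[None, :]) / bandwidth
    kernel = np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)
    return kernel.mean(axis=1) / bandwidth
```

Evaluating the true and inferred matrices on the same grid makes their density curves directly comparable. For a sparse GRN the true density is sharply peaked at zero, so an inferred network with a much flatter density is likely predicting too many nonzero links.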
Figure 8.
Average execution time of BiGSM, LSCON, and LASSO for varying numbers of genes (network size N), measured with the standard tic-toc function.
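In Python, a tic-toc style measurement is commonly written with time.perf_counter. The sketch below shows the pattern for varying network size N. The helper names are hypothetical, the random input data is synthetic, and least_squares_grn is only a placeholder for a method under test (it solves A Y = -P by pseudo-inverse); it is not BiGSM, LSCON, or LASSO.

```python
import time
import numpy as np

def least_squares_grn(Y, P):
    """Placeholder inference method: solve A @ Y = -P in the least-squares sense."""
    return -P @ np.linalg.pinv(Y)

def average_runtime(infer, sizes, n_repeats=3, seed=0):
    """Average wall-clock time of `infer(Y, P)` for each network size N."""
    rng = np.random.default_rng(seed)
    timings = {}
    for n in sizes:
        Y = rng.normal(size=(n, n))   # synthetic expression matrix (assumption)
        P = -np.eye(n)                # one single-gene perturbation per experiment
        elapsed = []
        for _ in range(n_repeats):
            start = time.perf_counter()                  # "tic"
            infer(Y, P)
            elapsed.append(time.perf_counter() - start)  # "toc"
        timings[n] = sum(elapsed) / n_repeats
    return timings
```

For example, average_runtime(least_squares_grn, [50, 100, 200]) returns a dictionary mapping each N to its mean runtime in seconds.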
