Inference of differentially expressed genes using generalized linear mixed models in a pairwise fashion
- PMID: 37033732
- PMCID: PMC10078460
- DOI: 10.7717/peerj.15145
Abstract
Background: Advances in RNA-Seq and bioinformatics make it possible to quantify gene transcription levels in cells, tissues, and cell lines, and thereby to identify differentially expressed genes (DEGs). DESeq2 and edgeR are well-established tools for this purpose; both are based on generalized linear models (GLMs), which model only fixed effects. Including random effects, however, reduces the risk of missing DEGs that may be essential to the biological phenomenon under investigation. Generalized linear mixed models (GLMMs) can accommodate both fixed and random effects.
Methods: We present DEGRE (Differentially Expressed Genes with Random Effects), a user-friendly tool that infers DEGs when the experimental design of an RNA-Seq study includes both fixed effects and random effects on individuals. DEGRE preprocesses the raw count matrices, fits GLMMs to the genes, and tests the resulting regression coefficients with the Wald test. P-values can be adjusted with either the Benjamini-Hochberg or the Bonferroni procedure.
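The testing and adjustment steps described in the Methods can be illustrated with a minimal Python sketch. It is not DEGRE's code (DEGRE is distributed as a package, and its internals are not shown here); the function names `wald_p_value`, `bonferroni`, and `benjamini_hochberg` are hypothetical, and the Wald test is assumed to use the standard normal approximation for a coefficient divided by its standard error.

```python
import math


def wald_p_value(coef, std_err):
    """Two-sided Wald p-value for H0: coefficient == 0 (normal approximation)."""
    z = coef / std_err
    # 2 * (1 - Phi(|z|)) written with the complementary error function.
    return math.erfc(abs(z) / math.sqrt(2.0))


def bonferroni(p_values):
    """Bonferroni adjustment: multiply each p-value by the number of tests."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical per-gene coefficients and standard errors.
    coefs = [1.2, 0.1, -0.9, 2.5]
    std_errs = [0.4, 0.5, 0.3, 0.6]
    raw = [wald_p_value(c, s) for c, s in zip(coefs, std_errs)]
    print(benjamini_hochberg(raw))
    print(bonferroni(raw))
```

Bonferroni controls the family-wise error rate and is the more conservative of the two; Benjamini-Hochberg controls the false discovery rate, which is usually the more practical choice when thousands of genes are tested at once.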
Results: DEGRE was assessed on simulated datasets in which the DEGs are known. The simulated count matrices contain fixed effects, and random effects were estimated and inserted to measure the impact of experimental designs with high biological variability. The biological coefficient of variation was inferred from the count matrices to assess variability before and after preprocessing. The preprocessing retains overdispersed genes while removing technical variation from the matrices, and so prepares the data effectively for DEG inference. In the simulations, DEGRE provides improved assessment measures for detecting DEGs in cases with higher biological variability. Applied to transcriptome data from patients with bipolar disorder, the tool also detects new candidate DEGs, which makes it promising for finding more relevant genes.
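The abstract does not say how the biological coefficient of variation (BCV) is estimated. As a rough illustration only, the sketch below assumes the common negative binomial convention in which the variance is mu + phi * mu^2 and the BCV is the square root of the dispersion phi, and it estimates phi per gene by a simple method of moments. The function name `bcv_method_of_moments` is hypothetical, and dedicated tools use more refined estimators.

```python
import math
from statistics import mean, variance


def bcv_method_of_moments(counts):
    """Rough per-gene BCV from replicate counts (needs at least two values).

    Assumes a negative binomial model with var = mu + phi * mu^2,
    so phi = (var - mu) / mu^2 and BCV = sqrt(phi).
    """
    mu = mean(counts)
    if mu <= 0:
        return 0.0
    var = variance(counts)  # unbiased sample variance
    # Clamp at zero: variance at or below the mean implies no overdispersion.
    dispersion = max(0.0, (var - mu) / (mu * mu))
    return math.sqrt(dispersion)


if __name__ == "__main__":
    # Hypothetical counts for one gene across four replicates.
    print(bcv_method_of_moments([10, 20, 30, 40]))
```

Comparing such a per-gene value before and after filtering gives a simple view of how preprocessing changes the variability left in a count matrix.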
Conclusions: DEGRE provides data preprocessing and applies GLMMs for DEG inference. The preprocessing efficiently removes genes that could distort the inference. The computational and biological validation of DEGRE shows promise for identifying possible DEGs in experiments with complex experimental designs. The tool may help handle random effects on individuals in DEG inference and has potential for discovering new DEGs of interest for further biological investigation.
Keywords: DEGRE package; Differentially expressed genes; Gene dispersion; Generalized linear mixed model; Preprocessing; Random effects.
© 2023 Terra Machado et al.
Conflict of interest statement
The authors declare that they have no competing interests.
Similar articles
- BALLI: Bartlett-adjusted likelihood-based linear model approach for identifying differentially expressed genes with RNA-seq data. BMC Genomics. 2019 Jul 2;20(1):540. doi: 10.1186/s12864-019-5851-6. PMID: 31266443. Free PMC article.
- bestDEG: a web-based application automatically combines various tools to precisely predict differentially expressed genes (DEGs) from RNA-Seq data. PeerJ. 2022 Nov 10;10:e14344. doi: 10.7717/peerj.14344. eCollection 2022. PMID: 36389403. Free PMC article.
- Bioinformatics prediction and experimental verification of key biomarkers for diabetic kidney disease based on transcriptome sequencing in mice. PeerJ. 2022 Sep 20;10:e13932. doi: 10.7717/peerj.13932. eCollection 2022. PMID: 36157062. Free PMC article.
- Robust identification of differentially expressed genes from RNA-seq data. Genomics. 2020 Mar;112(2):2000-2010. doi: 10.1016/j.ygeno.2019.11.012. Epub 2019 Nov 20. PMID: 31756426.
- A comparison of transcriptome analysis methods with reference genome. BMC Genomics. 2022 Mar 25;23(1):232. doi: 10.1186/s12864-022-08465-0. PMID: 35337265. Free PMC article.
Cited by
- Transcriptomic signatures of prostate cancer progression: a comprehensive RNA-seq study. 3 Biotech. 2025 May;15(5):135. doi: 10.1007/s13205-025-04297-3. Epub 2025 Apr 19. PMID: 40260408.
- Decoding glycomics with a suite of methods for differential expression analysis. Cell Rep Methods. 2023 Dec 18;3(12):100652. doi: 10.1016/j.crmeth.2023.100652. Epub 2023 Nov 21. PMID: 37992708. Free PMC article.
- DeepCorr: a novel error correction method for 3GS long reads based on deep learning. PeerJ Comput Sci. 2024 Jul 26;10:e2160. doi: 10.7717/peerj-cs.2160. eCollection 2024. PMID: 39678285. Free PMC article.
- Immune response stability to the SARS-CoV-2 mRNA vaccine booster is influenced by differential splicing of HLA genes. Sci Rep. 2024 Apr 18;14(1):8982. doi: 10.1038/s41598-024-59259-1. PMID: 38637586. Free PMC article.
- Integration of transcriptomics and metabolomics data revealed role of insulin resistant SNW1 gene in the pathophysiology of gestational diabetes. Sci Rep. 2025 Feb 4;15(1):4159. doi: 10.1038/s41598-025-88485-4. PMID: 39905161. Free PMC article.