pyComBat, a Python tool for batch effects correction in high-throughput molecular data using empirical Bayes methods
- PMID: 38057718
- PMCID: PMC10701943
- DOI: 10.1186/s12859-023-05578-5
Abstract
Background: Variability in datasets is not only the product of biological processes; it is also the product of technical biases. ComBat and ComBat-Seq are among the most widely used tools for correcting those technical biases, called batch effects, in microarray and RNA-Seq expression data, respectively.
Results: In this technical note, we present a new Python implementation of ComBat and ComBat-Seq. While the mathematical framework is strictly the same, we show here that our implementations: (i) give similar results in terms of batch effect correction; (ii) are as fast as or faster than the original implementations in R; and (iii) offer new tools for the bioinformatics community to participate in its development. pyComBat is implemented in Python and is distributed under the GPL-3.0 license (https://www.gnu.org/licenses/gpl-3.0.en.html) as a module of the inmoose package. Source code is available at https://github.com/epigenelabs/inmoose and the Python package at https://pypi.org/project/inmoose .
Conclusions: We present a new Python implementation of state-of-the-art tools ComBat and ComBat-Seq for the correction of batch effects in microarray and RNA-Seq data. This new implementation, based on the same mathematical frameworks as ComBat and ComBat-Seq, offers similar power for batch effect correction, at reduced computational cost.
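To make the notion of a batch effect concrete, the following is a minimal sketch of a per-gene location and scale adjustment, the kind of correction that ComBat refines with empirical Bayes shrinkage. It is not pyComBat's code and omits the empirical Bayes step entirely; the function name `adjust_gene` and the toy data are hypothetical and chosen only for illustration.

```python
from statistics import mean, pstdev


def adjust_gene(values, batches):
    """Align the per-batch mean and spread of one gene's expression values.

    Simplified illustration only: no covariates and no empirical Bayes
    shrinkage, so this is NOT the ComBat algorithm itself.
    """
    grand_mean = mean(values)

    # Group sample indices by batch label.
    groups = {}
    for i, label in enumerate(batches):
        groups.setdefault(label, []).append(i)

    # Per-batch mean and (population) standard deviation.
    stats = {}
    for label, idx in groups.items():
        batch_values = [values[i] for i in idx]
        stats[label] = (mean(batch_values), pstdev(batch_values))

    # Pooled within-batch standard deviation, weighted by batch size.
    pooled_var = sum(len(idx) * stats[label][1] ** 2
                     for label, idx in groups.items()) / len(values)
    pooled_sd = pooled_var ** 0.5

    adjusted = list(values)
    for label, idx in groups.items():
        batch_mean, batch_sd = stats[label]
        # Leave the spread untouched when a batch has zero variance.
        scale = pooled_sd / batch_sd if batch_sd > 0 else 1.0
        for i in idx:
            adjusted[i] = grand_mean + (values[i] - batch_mean) * scale
    return adjusted


# Toy example: batch "b" is shifted upward relative to batch "a".
expression = [1.0, 2.0, 3.0, 11.0, 12.0, 13.0]
batch_labels = ["a", "a", "a", "b", "b", "b"]
corrected = adjust_gene(expression, batch_labels)
```

After the adjustment, both batches share the same mean and spread for this gene. The empirical Bayes step in ComBat additionally pools information across genes to stabilize the per-batch estimates, which matters most when batches are small.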
Keywords: Batch effects; Bayesian statistics; Open source; Transcriptomics.
© 2023. The Author(s).
Conflict of interest statement
The authors declare that they have no competing interests.
Grants and funding
- 190185351/European Union's Horizon 2020 research and innovation program
