The sva package for removing batch effects and other unwanted variation in high-throughput experiments
- PMID: 22257669
- PMCID: PMC3307112
- DOI: 10.1093/bioinformatics/bts034
Abstract
Heterogeneity and latent variables are now widely recognized as major sources of bias and variability in high-throughput experiments. The best-known source of latent variation in genomic experiments is batch effects, which arise when samples are processed on different days, in different groups or by different people. However, many other variables may also have a major impact on high-throughput measurements. Here we describe the sva package for identifying, estimating and removing unwanted sources of variation in high-throughput experiments. The sva package supports surrogate variable estimation with the sva function, direct adjustment for known batch effects with the ComBat function and adjustment for batch and latent variables in prediction problems with the fsva function.
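To make the idea of a known batch effect concrete, the following is a minimal Python sketch of per-batch mean centering for a single feature. It is a deliberately simplified, location-only adjustment and is not the ComBat procedure, surrogate variable analysis, or any part of the sva package's API. The function name `center_by_batch` and the toy data are hypothetical, chosen only for illustration.

```python
from collections import defaultdict
from statistics import mean


def center_by_batch(values, batches):
    """Remove a per-batch mean shift from one feature's measurements.

    values  : list of floats, one measurement per sample
    batches : list of batch labels, same length as values
    Returns a new list in which every batch has been shifted so that
    its mean equals the overall mean across all samples.
    """
    if len(values) != len(batches):
        raise ValueError("values and batches must have the same length")

    overall = mean(values)

    # Group the measurements by batch label.
    groups = defaultdict(list)
    for v, b in zip(values, batches):
        groups[b].append(v)
    batch_means = {b: mean(vs) for b, vs in groups.items()}

    # Subtract each sample's batch mean, then restore the overall level.
    return [v - batch_means[b] + overall for v, b in zip(values, batches)]


# Hypothetical toy data: one gene, six samples, two processing days.
expr = [5.0, 5.2, 4.8, 7.0, 7.2, 6.8]
day = ["mon", "mon", "mon", "tue", "tue", "tue"]
adjusted = center_by_batch(expr, day)
```

After the adjustment, both processing days share the same mean, so the day-to-day shift no longer dominates the feature. The sketch ignores biological covariates: if batch is confounded with the condition of interest, plain centering would remove real signal as well, which is one reason model-based adjustment is preferred in practice.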