BACOM: in silico detection of genomic deletion types and correction of normal cell contamination in copy number data
- PMID: 21498400
- PMCID: PMC3102226
- DOI: 10.1093/bioinformatics/btr183
Abstract
Motivation: Identification of somatic DNA copy number alterations (CNAs) and significant consensus events (SCEs) in cancer genomes is a main task in discovering potential cancer-driving genes such as oncogenes and tumor suppressors. The recent development of SNP array technology has facilitated studies on copy number changes at a genome-wide scale with high resolution. However, existing copy number analysis methods are oblivious to normal cell contamination and cannot distinguish between contributions of cancerous and normal cells to the measured copy number signals. This contamination could significantly confound downstream analysis of CNAs and affect the power to detect SCEs in clinical samples.
Results: We report here a statistically principled in silico approach, Bayesian Analysis of COpy number Mixtures (BACOM), to accurately estimate genomic deletion type and normal tissue contamination, and accordingly recover the true copy number profile in cancer cells. We tested the proposed method on two simulated datasets, two prostate cancer datasets and The Cancer Genome Atlas high-grade ovarian dataset, and obtained very promising results supported by the ground truth and biological plausibility. Moreover, based on a large number of comparative simulation studies, the proposed method gives significantly improved power to detect SCEs after in silico correction of normal tissue contamination. We developed a cross-platform open-source Java application that implements the whole pipeline of copy number analysis of heterogeneous cancer tissues, including relevant processing steps. We also provide an R interface, bacomR, for running BACOM within the R environment, making it straightforward to include in existing data pipelines.
Availability: The cross-platform, stand-alone Java application, BACOM, the R interface, bacomR, all source code and the simulation data used in this article are freely available at the authors' web site: http://www.cbil.ece.vt.edu/software.htm.
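To illustrate what correcting normal cell contamination means in practice, the sketch below assumes a simple linear mixture: the measured copy number is a weighted average of the normal copy number (2) and the cancer cell copy number, weighted by the fraction of normal cells. This model, the function names and the parameter `normal_fraction` are assumptions made for illustration; they are not taken from the abstract and are not the BACOM or bacomR interface. The sketch also takes the normal fraction as known, whereas estimating it is the problem BACOM addresses.

```python
from typing import List


def mix_with_normal(tumor_cn: List[float], normal_fraction: float,
                    normal_cn: float = 2.0) -> List[float]:
    """Simulate a measured signal as a weighted average of normal and
    cancer cell copy numbers (assumed linear mixture, for illustration)."""
    return [normal_fraction * normal_cn + (1.0 - normal_fraction) * c
            for c in tumor_cn]


def correct_contamination(observed_cn: List[float], normal_fraction: float,
                          normal_cn: float = 2.0) -> List[float]:
    """Invert the assumed linear mixture to recover cancer-only copy numbers."""
    if not 0.0 <= normal_fraction < 1.0:
        # A fraction of 1.0 means no cancer cells, so nothing can be recovered.
        raise ValueError("normal_fraction must be in [0, 1)")
    return [(x - normal_fraction * normal_cn) / (1.0 - normal_fraction)
            for x in observed_cn]


if __name__ == "__main__":
    # Hypothetical segments: hemizygous deletion, normal, homozygous deletion, gain.
    tumor = [1.0, 2.0, 0.0, 3.0]
    observed = mix_with_normal(tumor, normal_fraction=0.4)
    print("observed:", observed)
    print("corrected:", correct_contamination(observed, normal_fraction=0.4))
```

Under this assumption, a hemizygous deletion (one copy in cancer cells) measured in a sample with 40% normal cells appears at roughly 1.4 copies, and a homozygous deletion at roughly 0.8, which shows how contamination pulls deletion signals toward the normal value and can blur the two deletion types.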
Similar articles
- Genome-wide identification of significant aberrations in cancer genome. BMC Genomics. 2012 Jul 27;13:342. doi: 10.1186/1471-2164-13-342. PMID: 22839576. Free PMC article.
- BACOM2.0 facilitates absolute normalization and quantification of somatic copy number alterations in heterogeneous tumor. Sci Rep. 2015 Sep 9;5:13955. doi: 10.1038/srep13955. PMID: 26350498. Free PMC article.
- AISAIC: a software suite for accurate identification of significant aberrations in cancers. Bioinformatics. 2014 Feb 1;30(3):431-3. doi: 10.1093/bioinformatics/btt693. Epub 2013 Nov 29. PMID: 24292941. Free PMC article.
- Analysis of genomic abnormalities in tumors: a review of available methods for Illumina two-color SNP genotyping and evaluation of performance. Cancer Genet. 2013 Apr;206(4):103-15. doi: 10.1016/j.cancergen.2013.03.001. Epub 2013 Apr 25. PMID: 23623180. Review.
- Detection of copy number alterations in acute myeloid leukemia and myelodysplastic syndromes. Expert Rev Mol Diagn. 2012 Apr;12(3):253-64. doi: 10.1586/erm.12.18. PMID: 22468816. Review.
Cited by
- Genome-wide identification of significant aberrations in cancer genome. BMC Genomics. 2012 Jul 27;13:342. doi: 10.1186/1471-2164-13-342. PMID: 22839576. Free PMC article.
- A mixture model for expression deconvolution from RNA-seq in heterogeneous tissues. BMC Bioinformatics. 2013;14 Suppl 5(Suppl 5):S11. doi: 10.1186/1471-2105-14-S5-S11. Epub 2013 Apr 10. PMID: 23735186. Free PMC article.
- AbsCN-seq: a statistical method to estimate tumor purity, ploidy and absolute copy numbers from next-generation sequencing data. Bioinformatics. 2014 Apr 15;30(8):1056-1063. doi: 10.1093/bioinformatics/btt759. Epub 2014 Jan 2. PMID: 24389661. Free PMC article.
- UNDO: a Bioconductor R package for unsupervised deconvolution of mixed gene expressions in tumor samples. Bioinformatics. 2015 Jan 1;31(1):137-9. doi: 10.1093/bioinformatics/btu607. Epub 2014 Sep 10. PMID: 25212756. Free PMC article.
- Comparative analysis of methods for identifying recurrent copy number alterations in cancer. PLoS One. 2012;7(12):e52516. doi: 10.1371/journal.pone.0052516. Epub 2012 Dec 20. PMID: 23285074. Free PMC article.