Bioinformatics. 2011 Jun 1;27(11):1473-80.
doi: 10.1093/bioinformatics/btr183. Epub 2011 Apr 15.

BACOM: in silico detection of genomic deletion types and correction of normal cell contamination in copy number data

Guoqiang Yu et al. Bioinformatics. 2011.

Abstract

Motivation: Identification of somatic DNA copy number alterations (CNAs) and significant consensus events (SCEs) in cancer genomes is a main task in discovering potential cancer-driving genes such as oncogenes and tumor suppressors. The recent development of SNP array technology has facilitated studies on copy number changes at a genome-wide scale with high resolution. However, existing copy number analysis methods are oblivious to normal cell contamination and cannot distinguish between contributions of cancerous and normal cells to the measured copy number signals. This contamination could significantly confound downstream analysis of CNAs and affect the power to detect SCEs in clinical samples.

Results: We report here a statistically principled in silico approach, Bayesian Analysis of COpy number Mixtures (BACOM), to accurately estimate genomic deletion type and normal tissue contamination, and accordingly recover the true copy number profile in cancer cells. We tested the proposed method on two simulated datasets, two prostate cancer datasets and The Cancer Genome Atlas high-grade ovarian dataset, and obtained very promising results supported by the ground truth and biological plausibility. Moreover, based on a large number of comparative simulation studies, the proposed method gives significantly improved power to detect SCEs after in silico correction of normal tissue contamination. We develop a cross-platform open-source Java application that implements the whole pipeline of copy number analysis of heterogeneous cancer tissues including relevant processing steps. We also provide an R interface, bacomR, for running BACOM within the R environment, making it straightforward to include in existing data pipelines.

Availability: The cross-platform, stand-alone Java application, BACOM, the R interface, bacomR, all source code and the simulation data used in this article are freely available at the authors' web site: http://www.cbil.ece.vt.edu/software.htm.
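
To make the idea of "correcting normal tissue contamination" concrete, the following is a minimal sketch, not the BACOM implementation. It assumes that α denotes the fraction of normal cells in the sample, that normal cells carry two copies at every locus, and that the measured signal is a simple linear mixture of the normal and cancer copy numbers. The function names and the constant are hypothetical, and the Bayesian step that BACOM uses to decide the deletion type is not reproduced; here the deletion type is supplied by the caller.

```python
"""Sketch: undoing normal-cell contamination under a linear mixture assumption.

Assumed model (not taken from the BACOM source code):
    observed = alpha * 2 + (1 - alpha) * tumor
where alpha is the fraction of normal cells in the sample.
"""

NORMAL_COPY_NUMBER = 2.0  # assumption: normal cells are diploid at every locus


def alpha_from_deletion(segment_mean, tumor_copies):
    """Estimate alpha from the mean observed copy number of a deletion segment.

    tumor_copies is the copy number assumed for the cancer cells in that
    segment: 1 for a hemizygous deletion, 0 for a homozygous deletion.
    """
    if tumor_copies not in (0, 1):
        raise ValueError("tumor_copies must be 0 or 1 for a deletion segment")
    # Solve segment_mean = alpha * 2 + (1 - alpha) * tumor_copies for alpha.
    return (segment_mean - tumor_copies) / (NORMAL_COPY_NUMBER - tumor_copies)


def correct_copy_number(observed, alpha):
    """Invert the mixture to recover the cancer-cell copy number per probe."""
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must be in [0, 1)")
    return [(x - alpha * NORMAL_COPY_NUMBER) / (1.0 - alpha) for x in observed]


if __name__ == "__main__":
    # Toy profile: two unaltered probes around a hemizygous deletion
    # observed in a sample with 70% normal cells.
    observed = [2.0, 1.7, 1.7, 2.0]
    alpha = alpha_from_deletion(segment_mean=1.7, tumor_copies=1)
    print("estimated alpha:", round(alpha, 3))
    print("corrected profile:", [round(c, 3) for c in correct_copy_number(observed, alpha)])
```

Under this assumed model, a deletion segment alone cannot separate α from the deletion type: an observed mean of 1.4 is consistent with a homozygous deletion at α = 0.7 or a hemizygous deletion at α = 0.4. That ambiguity is why the deletion type has to be estimated before the contamination fraction.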

Figures

Fig. 1.
The flow chart of BACOM.
Fig. 2.
The DNA copy number profile and Bayesian analysis of the deletion segment of simulation dataset 1 when α = 0.7. (A) Copy number profile and correction results. (B) Bayesian analysis to determine the deletion type of the deletion segment.
Fig. 3.
The DNA copy number profile and Bayesian analysis of the deletion segments of simulation dataset 2 when α = 0.7. (A) Copy number profile and correction results. (B) Bayesian analysis to determine the deletion types of the three deletion segments.
Fig. 4.
The DNA copy number profile of Chromosome 10 in a prostate cancer sample assayed on the Affymetrix SNP 500K platform.
Fig. 5.
The DNA copy number profile of Chromosome 17 in a prostate cancer sample assayed on the Affymetrix Genome-Wide 6.0 platform.
Fig. 6.
A comparison of the power to detect significant consensus events with and without correction of the normal tissue contamination, across (A) different false discovery rates (FDRs) and (B) different degrees of contamination.
