Bioinformatics. 2016 Oct 15;32(20):3196-3198.
doi: 10.1093/bioinformatics/btw389. Epub 2016 Jun 26.

Conpair: concordance and contamination estimator for matched tumor-normal pairs

Ewa A Bergmann et al. Bioinformatics.

Abstract

Motivation: Sequencing of matched tumor and normal samples is the standard study design for reliable detection of somatic alterations. However, even very low levels of cross-sample contamination significantly impact calling of somatic mutations, because contaminant germline variants can be incorrectly interpreted as somatic. There are currently no sequence-only based methods that reliably estimate contamination levels in tumor samples, which frequently display copy number changes. As a solution, we developed Conpair, a tool for detection of sample swaps and cross-individual contamination in whole-genome and whole-exome tumor-normal sequencing experiments.
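To illustrate why low-level contamination and copy number changes both matter, the following is a minimal sketch of the read fraction that a contaminant's germline variant would be expected to contribute at a site where the host sample is homozygous reference. It is not Conpair's model: the function name, the definition of the contamination level as the fraction of cells coming from the contaminating individual, and the assumption of a diploid contaminant are all illustrative assumptions.

```python
def expected_contaminant_vaf(contamination, contaminant_alt_copies, host_copy_number=2):
    """Expected alternate-allele read fraction contributed by a contaminant.

    Illustrative assumptions (not taken from the paper):
      - `contamination` is the fraction of cells from the contaminating individual
      - the contaminant is diploid at the site
      - the host carries only reference alleles at the site
      - reads are sampled in proportion to the DNA present
    """
    if not 0.0 <= contamination <= 1.0:
        raise ValueError("contamination must be between 0 and 1")
    if contaminant_alt_copies not in (0, 1, 2):
        raise ValueError("contaminant_alt_copies must be 0, 1 or 2")
    if host_copy_number < 0:
        raise ValueError("host_copy_number must be non-negative")

    contaminant_dna = 2.0 * contamination                 # diploid contaminant
    host_dna = host_copy_number * (1.0 - contamination)   # host copies at this site
    total_dna = contaminant_dna + host_dna
    if total_dna == 0:
        return 0.0
    return contamination * contaminant_alt_copies / total_dna


# Heterozygous contaminant variant at 2% contamination, copy-neutral host
vaf_neutral = expected_contaminant_vaf(0.02, 1)
# Same variant where the host tumor has lost one copy of the region
vaf_loss = expected_contaminant_vaf(0.02, 1, host_copy_number=1)
```

Under these assumptions the same contamination level yields a different allele fraction once the host copy number departs from two, which is one way to see why copy number changes complicate contamination estimates in tumor samples.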

Results: On a ladder of in silico contaminated samples, we demonstrated that Conpair reliably measures contamination levels as low as 0.1%, even in the presence of copy number changes. We also estimated contamination levels in glioblastoma WGS and WXS tumor-normal datasets from TCGA and showed that they strongly correlate with tumor-normal concordance, as well as with the number of germline variants called as somatic by several widely used somatic callers.
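As a rough illustration of tumor-normal concordance, the sketch below computes the fraction of shared marker sites at which two samples have the same genotype call, and the corresponding discordance (1 - concordance). The function name, the genotype encoding and the marker identifiers are hypothetical, and this is a simplified assumption about how such a score could be computed, not Conpair's actual implementation.

```python
def concordance(normal_genotypes, tumor_genotypes):
    """Fraction of shared marker sites with identical genotype calls.

    Both arguments map a marker identifier to a genotype string
    (for example "0/0", "0/1", "1/1"); the encoding is illustrative.
    """
    shared = [marker for marker in normal_genotypes if marker in tumor_genotypes]
    if not shared:
        raise ValueError("no shared marker sites to compare")
    matches = sum(
        1 for marker in shared
        if normal_genotypes[marker] == tumor_genotypes[marker]
    )
    return matches / len(shared)


# Hypothetical genotype calls at four marker sites
normal = {"m1": "0/1", "m2": "1/1", "m3": "0/0", "m4": "0/1"}
tumor = {"m1": "0/1", "m2": "1/1", "m3": "0/1", "m4": "0/1"}

score = concordance(normal, tumor)  # 3 of 4 sites match
discordance = 1 - score
```

A pair from the same individual would be expected to score close to 1, whereas a sample swap would give a much lower value.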

Availability and implementation: The method is available at: https://github.com/nygenome/conpair

Contact: egrabowska@gmail.com or mczody@nygenome.org

Supplementary information: Supplementary data are available at Bioinformatics online.

Figures

Fig. 1.
Relationship between tumor–normal discordance values (1 – concordance) and contamination levels detected by Conpair, ContEst and VerifyBamID in a set of TCGA glioblastoma WXS tumor samples. Whole-genome-amplified samples are shown in red and exome-capture samples in blue.

References

    1. Brennan C.W. et al. (2013) The somatic genomic landscape of glioblastoma. Cell, 155, 462–477.
    2. Cibulskis K. et al. (2011) ContEst: estimating cross-contamination of human samples in next-generation sequencing data. Bioinformatics, 27, 2601–2602.
    3. Cibulskis K. et al. (2013) Sensitive detection of somatic point mutations in impure and heterogeneous cancer samples. Nat. Biotechnol., 31, 213–219.
    4. Jun G. et al. (2012) Detecting and estimating contamination of human DNA samples in sequencing and array-based genotype data. Am. J. Hum. Genet., 91, 839–848.
    5. Magi A. et al. (2013) EXCAVATOR: detecting copy number variants from whole-exome sequencing data. Genome Biol., 14, R120.