Nat Methods. 2018 Jul;15(7):531-534.
doi: 10.1038/s41592-018-0036-9. Epub 2018 Jun 25.

DeTiN: overcoming tumor-in-normal contamination


Amaro Taylor-Weiner et al. Nat Methods. 2018 Jul.

Abstract

Comparison of sequencing data from a tumor sample with data from a matched germline control is a key step for accurate detection of somatic mutations. Detection sensitivity for somatic variants is greatly reduced when the matched normal sample is contaminated with tumor cells. To overcome this limitation, we developed deTiN, a method that estimates the tumor-in-normal (TiN) contamination level and, in cases affected by contamination, improves sensitivity by reclassifying initially discarded variants as somatic.
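To make the idea of a TiN level concrete, the following is a minimal sketch and not the deTiN model itself, which the abstract does not specify. It assumes that, at a somatic site, the expected alternate-allele fraction in the contaminated normal is the TiN level times the tumor allele fraction, and it picks the TiN value on a grid that maximizes a binomial likelihood of the normal read counts. The function name `estimate_tin` and the input layout are hypothetical.

```python
import math

def estimate_tin(sites, grid_size=101):
    """Grid-search maximum-likelihood TiN estimate (illustrative only).

    sites: list of (tumor_alt, tumor_depth, normal_alt, normal_depth)
    read counts for candidate somatic variants; depths must be > 0.
    Assumption: expected normal allele fraction = TiN * tumor allele fraction.
    """
    best_tin, best_ll = 0.0, -math.inf
    for i in range(grid_size):
        tin = i / (grid_size - 1)
        ll = 0.0
        for t_alt, t_depth, n_alt, n_depth in sites:
            af_tumor = t_alt / t_depth
            # Clamp the probability away from 0 and 1 so the logs stay finite.
            p = min(max(tin * af_tumor, 1e-6), 1 - 1e-6)
            ll += n_alt * math.log(p) + (n_depth - n_alt) * math.log(1 - p)
        if ll > best_ll:
            best_tin, best_ll = tin, ll
    return best_tin

# Three sites whose normal allele fraction is one tenth of the tumor's.
example_sites = [(40, 100, 4, 100), (50, 100, 5, 100), (30, 100, 3, 100)]
print(estimate_tin(example_sites))
```

Under this simplification, a variant with a few supporting reads in the normal is consistent with a somatic call whenever the estimated TiN is above zero, which is the intuition behind reclassifying initially discarded variants.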


Conflict of interest statement

Competing Financial Interests Statement:

C.J.W. is a co-founder of Neon Therapeutics and a member of its scientific advisory board.

Figures

Figure 1. Results from in silico and in vitro validation of deTiN.
(a) TiN estimates at different in silico simulated TiN levels. (b) deTiN estimates at different in vitro mixed TiN levels. MAE = mean absolute error. (c, d) Sensitivity to detect mutations with deTiN (red) and without deTiN (blue) at (c) different in silico simulated TiN levels and (d) in vitro mixed TiN levels. (a, c) deTiN results from n=5 in silico independent simulation experiments. Dots represent weighted average and error bars represent standard errors. (b, d) Results from n=1 sequencing experiment. Error bars depict 95% confidence intervals on TiN estimates. (a, b) Dotted blue lines indicate y=x.
Figure 2. Application of deTiN to chronic lymphocytic leukemia (CLL) sequencing data.
(a) TiN estimates for CD19 selected (normal) blood compared with whole blood from minimal residual disease (MRD)-negative patients. Box plot: median TiN value (red line), box represents Q1 and Q3 quartiles, whiskers represent the most extreme data points that are not outliers. Outliers are denoted with red crosses and represent data points outside the range [Q1 - 1.5 IQR, Q3 + 1.5 IQR], where IQR is the interquartile range. P value is calculated using two-tailed Mann–Whitney test (n=257 independent patient samples). (b) Mutation rate in samples pre- and post-application of deTiN stratified by normal sample type. Box plot and P value as in panel a. (c) Heat map and bar plot illustrating recovery of SSNVs in the CLL cohort. Samples are in columns, genes in rows. Blue boxes indicate variants detected prior to deTiN (“without deTiN”); red boxes indicate additional variants recovered by deTiN (“with deTiN”). (d) Stick plots showing mutation data in SF3B1 and TP53. Amino acid positions of recurrent COSMIC mutations are highlighted in teal. Blue circles indicate variants detected prior to deTiN; red circles indicate variants recovered by deTiN.
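The outlier rule quoted in the caption for panel a can be sketched as follows. This is an illustration under assumptions: the quartile convention used for the published box plots is not stated, so the "inclusive" method of Python's `statistics.quantiles` is an arbitrary choice here, and the function name `tukey_outliers` is hypothetical.

```python
import statistics

def tukey_outliers(values):
    """Return values outside [Q1 - 1.5 IQR, Q3 + 1.5 IQR]."""
    # Quartile convention is an assumption; plotting tools differ slightly.
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < low or v > high]

# Hypothetical TiN estimates, not data from the paper.
print(tukey_outliers([0.0, 0.01, 0.02, 0.02, 0.03, 0.04, 0.5]))
```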
Figure 3. Application of deTiN to analysis of solid tumors with adjacent normal controls.
(a) Fraction of contaminated samples (pink; TiN≥0.02) when using different sources for normal tissue (tumor-adjacent normal tissue and peripheral blood) and, in cases with tumor-adjacent normal, stratified by tumor type. Asterisks represent non-TCGA cohorts. (b) Points show mean sensitivity for detecting mutations with deTiN (red) and without deTiN (blue). Means were derived from 256 of the 304 tumors that were matched with both a tumor-adjacent and a blood normal sample and had a sufficient number of somatic events to robustly estimate TiN (TiN = 0 [n=230]; TiN=0.01 [n=9]; TiN = 0.03 [n=9]; TiN=0.07 [n=4]; TiN=0.15 [n=1]; TiN=0.17 [n=1]; TiN=0.74 [n=1]; TiN=0.94 [n=1]). Error bars indicate standard error. (c) Histology images of selected adjacent tissue samples with evidence supporting TiN (n=1 patient sample for each image and plot). deTiN aSCNA data supporting the TiN estimate are displayed for the top two samples; points indicate the allele fraction of heterozygous germline SNPs, blue (tumor) and red (normal) points are used for TiN estimation, and grey points are not used by deTiN. The bottom plot displays deTiN somatic variant data supporting the TiN estimate for the bottom sample. Points indicate the allele fraction of variants in the tumor (x-axis) and normal (y-axis) samples; error bars indicate 95% beta confidence intervals. The green asterisk represents the KRAS G12V mutation, red points represent SSNVs recovered by deTiN, blue points are called before deTiN, and grey points are rejected by deTiN and MuTect as germline or artifact. Each plot displays data supporting TiN from a single tumor-normal pair corresponding to the image on the left (n = 1). (d) Illustration of three modes of contamination. Posterior distribution functions for TiN based on aSCNA data are shown clustered (red and orange) and unclustered for individual events (dashed grey). In the mixture scenario, TiN has two possible values: the lower represents events unique to the tumor cells (red) and the higher represents events shared between the tumor cells and the sibling precursor cells (orange).

