BMC Bioinformatics. 2011 Oct 5;12 Suppl 9(Suppl 9):S21.
doi: 10.1186/1471-2105-12-S9-S21.

Consistency-based detection of potential tumor-specific deletions in matched normal/tumor genomes

Roland Wittler et al. BMC Bioinformatics.

Abstract

Background: Structural variations in human genomes, such as insertions, deletions, or rearrangements, play an important role in cancer development. Next-Generation Sequencing technologies have been central in providing ways to detect such variations. Most existing methods, however, are limited to the analysis of a single genome, and only recently has the comparison of closely related genomes been considered. In particular, a few recent works analyzed data sets obtained by sequencing both tumor and healthy tissues of the same cancer patient. In that context, the goal is to detect variations that are specific to exactly one of the genomes, for example to differentiate between patient-specific and tumor-specific variations. This is a difficult task, especially given the additional challenge that healthy tissues may be contaminated by tumor cells and vice versa.

Results: In the current work, we analyzed a data set of paired-end short reads, obtained by sequencing tumor tissues and healthy tissues, both from the same cancer patient. Based on a combinatorial notion of conflict between deletions, we show that in the tumor data, more deletions are predicted than there could actually be in a diploid genome. In contrast, the predictions for the data from normal tissues are almost conflict-free. We designed and applied a method, specific to the analysis of such pooled and contaminated data sets, to detect potential tumor-specific deletions. Our method takes the deletion calls from both data sets and assigns reads from the mixed tumor/normal data to the normal one, with the goal of minimizing the number of reads that need to be discarded to obtain a set of conflict-free deletion clusters. We observed that, on the specific data set we analyzed, only a very small fraction of the reads needs to be discarded to obtain a set of consistent deletions.

Conclusions: We present a framework based on a rigorous definition of consistency between deletions and the assumption that the tumor sample also contains normal cells. A combined analysis of both data sets based on this model allowed a consistent explanation of almost all data, providing a detailed picture of candidate patient- and tumor-specific deletions.
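
The observation that a diploid genome bounds how many deletions can coexist at one locus can be illustrated with a minimal Python sketch. This is not the conflict definition used in the paper, which the abstract does not spell out. It assumes a simplified proxy: at most two distinct deletions (one per haplotype) may cover any reference position. The half-open interval representation and the names `max_overlap_depth` and `exceeds_ploidy` are hypothetical.

```python
from typing import List, Tuple

# Hypothetical representation: a deletion is a half-open interval
# [start, end) of deleted reference coordinates.
Interval = Tuple[int, int]


def max_overlap_depth(deletions: List[Interval]) -> int:
    """Return the largest number of deletions covering a single position."""
    events = []
    for start, end in deletions:
        events.append((start, 1))   # a deletion opens
        events.append((end, -1))    # a deletion closes
    # At equal coordinates, closings (-1) sort before openings (+1),
    # so intervals that merely touch are not counted as overlapping.
    events.sort(key=lambda e: (e[0], e[1]))
    depth = best = 0
    for _, delta in events:
        depth += delta
        best = max(best, depth)
    return best


def exceeds_ploidy(deletions: List[Interval], ploidy: int = 2) -> bool:
    """Assumed proxy for a conflict: more overlapping deletions than haplotypes."""
    return max_overlap_depth(deletions) > ploidy


if __name__ == "__main__":
    tumor_calls = [(100, 200), (150, 250), (180, 300)]
    normal_calls = [(100, 200), (150, 250)]
    print(exceeds_ploidy(tumor_calls))
    print(exceeds_ploidy(normal_calls))
```

Under this proxy, three mutually overlapping calls cannot all be explained by one diploid genome, whereas two overlapping calls can. Note that the paper also reports conflicting sets of only two clusters, which this depth-based check does not capture.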

Figures

Figure 1
Mappings. Illustration of (left) a mapping and (right) a valid cluster.
Figure 2
Overlapping clusters. We assume that each cluster describes exactly one deletion. Therefore, clusters overlapping as shown in this figure are not consistent. Only the surrounding shape of the mappings in the cluster, and the position and size of the deleted segments are indicated.
Figure 3
Minimal Conflicting Sets. Examples for minimal conflicting sets of two and three deletion clusters.
Figure 4
Results: deletion ranges. Deletion range for the 238 candidate tumor-specific deletions. Each point represents a deletion cluster: the x-coordinate is the minimum length of this deletion and the y-coordinate its maximum length. Lengths are expressed in number of nucleotides.
Figure 5
Results: Support. Support for the 238 candidate tumor-specific deletions. The support is defined by the number of mappings in the cluster defining each deletion.
Figure 6
Results: Comparison with GASV and BreakDancer. Deletion clusters inferred by Algorithm 1, GASV and BreakDancer on chromosome 2. Note that many deletions in close proximity may appear as a single dot, and the size of a dot is in general larger than the respective deletion. Illustrations for all chromosomes can be found in Additional file 2.
Figure 7
Results: refinement of deletion and breakpoint region sizes. (Left) Ratio between the initial deletion size and the deletion size after processing, for clusters indicating normal deletions. (Right) Ratio between the initial breakpoint region length and the breakpoint region length after processing, for clusters indicating normal deletions.
Figure 8
Results: Singletons. (a) Supported normal singletons. In this histogram, the number of normal deletions defined by exactly one mapping is shown w.r.t. the number of mappings in the tumor data set supporting the deletion. E.g., there were 50 singletons supported by eight mappings. The numbers are based on the tumor-subset only. Chromosome 18 is not included, because the computation did not finish within reasonable time limits.
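
The per-cluster quantities named in the captions of Figures 4 and 5 (minimum and maximum deletion length, and support) can be sketched in Python as follows. Only the definition of support as the number of mappings in a cluster comes from the captions. Everything else is an assumption for illustration: each mapping is reduced to the outer distance between its two read ends on the reference, the sequenced fragments are assumed to have lengths in a known range `[frag_min, frag_max]`, and the cluster's deletion range is taken as the intersection of the ranges implied by its mappings. The name `cluster_summary` is hypothetical.

```python
from typing import List, Optional, Tuple


def cluster_summary(
    spans: List[int], frag_min: int, frag_max: int
) -> Optional[Tuple[int, int, int]]:
    """Summarize a cluster of paired-end mappings that suggest a deletion.

    spans: outer distance on the reference between the two ends of each
           mapping (assumed representation).
    frag_min, frag_max: assumed bounds on the true fragment length.

    Returns (min_deletion_length, max_deletion_length, support), or None
    if no single deletion length is compatible with every mapping.
    """
    if not spans:
        return None
    # A mapping with reference span s and true fragment length f implies
    # a deletion of length s - f, with frag_min <= f <= frag_max.
    lower = max(s - frag_max for s in spans)
    upper = min(s - frag_min for s in spans)
    lower = max(lower, 0)  # a deletion cannot have negative length
    if lower > upper:
        return None
    support = len(spans)  # support = number of mappings in the cluster
    return lower, upper, support


if __name__ == "__main__":
    print(cluster_summary([1000, 1050], frag_min=200, frag_max=300))
    print(cluster_summary([1000, 1200], frag_min=200, frag_max=300))
```

Taking the intersection means that each additional compatible mapping can only narrow the range, which is one plausible reading of why well-supported clusters give tighter size estimates.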
