BMC Med Genomics. 2017 Mar 15;10(1):15.
doi: 10.1186/s12920-017-0255-4.

CLImAT-HET: detecting subclonal copy number alterations and loss of heterozygosity in heterogeneous tumor samples from whole-genome sequencing data

Zhenhua Yu et al. BMC Med Genomics.

Abstract

Background: Copy number alterations (CNA) and loss of heterozygosity (LOH) account for a large proportion of the structural genetic variation in cancer genomes. These aberrations accumulate continuously during clonal evolution and are patterned by phylogenetic branching. This invariably results in the emergence of multiple cell populations with distinct mutational landscapes within a tumor sample. With the advent of next-generation sequencing technology, inference of subclonal populations has become a major focus of cancer-associated studies, and it is usually based on assessing combinations of somatic single-nucleotide variations (SNV), CNA and LOH. However, cancer samples often have several inherent issues, such as contamination by normal stroma, tumor aneuploidy and intra-tumor heterogeneity. Addressing these issues is imperative for accurate profiling of clonal architecture.

Methods: We present CLImAT-HET, a computational method designed to capture clonal diversity in the CNA/LOH dimensions while accounting for intra-tumor heterogeneity, for the case where a reference or matched normal sample is absent. The algorithm formulates the clonal identification problem as a factorial hidden Markov model and performs an integrated analysis of read counts and allele frequency data. It infers subclonal CNA and LOH events as well as the fraction of cells harboring each event.
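
To make the link between "fraction of cells harboring each event" and the two data types concrete, the following is a minimal sketch of the mixture arithmetic that such models commonly build on. It is not the CLImAT-HET emission model: the function name `expected_signals` and the assumption that all cells without the event are diploid and heterozygous (two copies, one of each allele) are illustrative choices made here.

```python
import math


def expected_signals(cellularity, total_cn, major_cn):
    """Expected depth ratio and major-allele frequency for one segment.

    Assumption (illustrative, not from the paper): a fraction `cellularity`
    of cells carries the event (total_cn copies, major_cn of the major
    allele); all remaining cells are diploid and heterozygous.
    """
    if not 0.0 <= cellularity <= 1.0:
        raise ValueError("cellularity must lie in [0, 1]")
    if not 0 <= major_cn <= total_cn:
        raise ValueError("major_cn must lie in [0, total_cn]")

    # Average number of copies per cell across the mixed population.
    avg_cn = cellularity * total_cn + (1.0 - cellularity) * 2.0
    # Average number of major-allele copies per cell.
    avg_major = cellularity * major_cn + (1.0 - cellularity) * 1.0

    depth_ratio = avg_cn / 2.0  # relative to a pure diploid sample
    # The allele frequency is undefined when no copies remain at all.
    major_af = avg_major / avg_cn if avg_cn > 0 else math.nan
    return depth_ratio, major_af


# A single-copy gain (3 copies, 2 major) present in half of the cells.
ratio, af = expected_signals(0.5, 3, 2)
print(f"depth ratio = {ratio:.2f}, major allele frequency = {af:.2f}")
```

Under these assumptions a copy-neutral LOH event leaves the depth ratio at 1 and only shifts the allele frequency, which is one reason read counts and allele frequencies are usually analyzed together.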

Results: The results on simulated datasets indicate that CLImAT-HET has high power to identify CNA/LOH segments, achieving an average accuracy of 0.87. It can also accurately infer the proportion of each clonal population, with an overall Pearson correlation coefficient of 0.99 and a mean absolute error of 0.02. CLImAT-HET shows significant advantages when compared with other existing methods. Application of CLImAT-HET to 5 primary triple negative breast cancer samples demonstrates its ability to capture clonal diversity in the CNA/LOH dimensions. It detects two clonal populations in one sample and three clonal populations in another.
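
For readers who want to reproduce this kind of comparison, the two summary statistics named above can be computed as in the sketch below. The function names and the example values are made up for illustration; they are not data from the study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def mean_absolute_error(x, y):
    """Mean absolute difference between two equal-length sequences."""
    if len(x) != len(y) or not x:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(abs(a - b) for a, b in zip(x, y)) / len(x)


# Hypothetical true and estimated cellularities (illustrative values only).
true_cellularity = [0.20, 0.40, 0.60, 0.80]
estimated_cellularity = [0.22, 0.39, 0.63, 0.78]
print(pearson_r(true_cellularity, estimated_cellularity))
print(mean_absolute_error(true_cellularity, estimated_cellularity))
```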

Conclusions: We introduce CLImAT-HET, a novel algorithm for inferring CNA/LOH segments from heterogeneous tumor samples. We demonstrate its ability to accurately recover clonal compositions from tumor WGS data without a matched normal sample.

Keywords: Bayesian information criterion; Copy number alteration; Hidden Markov model; Intra-tumor heterogeneity; Loss of heterozygosity; Whole-genome sequencing.

Figures

Fig. 1
Overview of the CLImAT-HET statistical framework. (a) CLImAT-HET analysis workflow: 1) read counts and read depths at known SNP positions are extracted from whole-genome sequencing data of the tumor sample; 2) read count signals are preprocessed to correct GC-content and mappability bias, and read depths are quantile-normalized to eliminate allelic bias; 3) read count and read depth signals are jointly analyzed using an integrated hidden Markov model, and the model complexity is iteratively evaluated for different numbers of clonal clusters using the Bayesian information criterion; 4) finally, the clonal/subclonal CNA and LOH segments as well as the cellularity of each clonal cluster are inferred. (b) Representation of intra-tumor heterogeneity and the CLImAT-HET solution. The observed copy number signals are generated from three types of cell populations: normal cells, tumor cells with an amplification, and tumor cells with the amplification event and an additional deletion event. CLImAT-HET infers the total and major copy number as well as the corresponding cellularity of each event. (c) The factorial hidden Markov model adopted in CLImAT-HET. The model has two underlying Markov chains, one depicting aberration events and the other delineating the corresponding clonal clusters
Fig. 2
The cellularity estimation results of different methods on a simulated sample. Cellularity estimates from the different methods are compared with the ground truth, which is the predefined cellularity of all segments in the simulation study. CLImAT-HET correctly identifies three clonal populations, infers their cellularity, and assigns the correct clonal cluster to 85% of all segments
Fig. 3
Cellularity prediction results of CLImAT-HET, and tumor purity estimation results of different methods on simulated samples. Estimated cellularities are compared with the true cellularities for each simulated sample. Results of CLImAT-HET on samples containing one (a), two (b) and three (c) clonal populations are shown. The tumor purities inferred by each method are also compared with the ground truth tumor purities (d)
Fig. 4
The copy number inference results of different methods on a simulated sample. Major and total copy numbers inferred by CLImAT-HET, OncoSNP-SEQ and CLImAT are each compared with the ground truth, which is the predefined major and total copy numbers of all segments in the simulation study. The results show that CLImAT-HET infers the correct copy numbers for 86% of all segments
Fig. 5
The accuracy of inferred copy numbers of different methods on simulated samples. The abilities of CLImAT-HET, OncoSNP-SEQ and CLImAT to infer tumor genotypes are assessed on the simulated dataset. A segment is considered accurately identified only if both its total and major copy numbers are correct. For each simulated sample, the predefined total and major copy number profiles of all segments are used as the gold standard for evaluation. The performance of the different methods is assessed on homogeneous samples (a), tumor samples with two clonal populations (b) and tumor samples with three clonal populations (c). The x-axis represents the sample ID as defined in Additional file 3
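
The evaluation rule in the Fig. 5 caption (a segment counts as correct only when both copy numbers match) can be sketched as follows. The function name and the per-segment tuple layout are assumptions made for illustration, and the sketch counts segments equally; whether the original evaluation weights segments by length is not stated here.

```python
def segment_accuracy(predicted, truth):
    """Fraction of segments whose total AND major copy numbers both match.

    `predicted` and `truth` are equal-length lists of
    (total_copy_number, major_copy_number) tuples, one per segment.
    Segments are counted equally (an assumption of this sketch).
    """
    if len(predicted) != len(truth) or not truth:
        raise ValueError("need two non-empty lists of equal length")
    correct = sum(1 for p, t in zip(predicted, truth) if tuple(p) == tuple(t))
    return correct / len(truth)


# Illustrative values only: the last segment has the right total copy
# number but the wrong major copy number, so it counts as an error.
truth = [(2, 1), (3, 2), (1, 1), (4, 3)]
predicted = [(2, 1), (3, 2), (1, 1), (4, 2)]
print(segment_accuracy(predicted, truth))
```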
Fig. 6
The subclonal prediction results of CLImAT-HET on sample SA223. CLImAT-HET infers sample SA223 as heterogeneous with two distinct clonal populations. The cellularity of one subclonal cluster is 0.66, in good concordance with the tumor purities estimated by CLImAT, APOLLOH and ASCAT. In addition, CLImAT-HET identifies another subclonal cluster with a cellularity of 0.39
Fig. 7
The log-likelihoods and BIC differences of samples SA223 and SA227 under different numbers of clonal populations. The log-likelihoods and BIC differences are measured at each iteration for sample SA223 (a) and SA227 (b). The iteration continues until the BIC difference is greater than zero. CLImAT-HET predicts the number of clonal populations as 2 and 3 for samples SA223 and SA227, respectively
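
The stopping rule described in the Fig. 7 caption can be illustrated with a short sketch. Several details here are assumptions rather than facts from the source: the standard definition BIC = k ln(n) - 2 ln(L), the sign convention that the "BIC difference" is BIC(K+1) minus BIC(K), and the function names. The log-likelihoods and parameter counts in the example are invented.

```python
import math


def bic(log_likelihood, n_params, n_obs):
    """Standard Bayesian information criterion (lower is better)."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


def select_num_clones(fits, n_obs):
    """Pick the number of clonal populations by iterative BIC comparison.

    `fits[i]` is a (log_likelihood, n_params) pair for the model with
    i + 1 clonal populations. The loop stops at the first model whose BIC
    is higher than that of the previous model (assumed sign convention:
    difference = BIC(K+1) - BIC(K) > 0) and returns the previous K.
    """
    if not fits:
        raise ValueError("need at least one fitted model")
    best_k = 1
    previous_bic = bic(fits[0][0], fits[0][1], n_obs)
    for k in range(2, len(fits) + 1):
        log_likelihood, n_params = fits[k - 1]
        current_bic = bic(log_likelihood, n_params, n_obs)
        if current_bic - previous_bic > 0:
            break  # the extra population no longer pays for its parameters
        best_k = k
        previous_bic = current_bic
    return best_k


# Invented fits: the third population barely improves the likelihood.
example_fits = [(-1000.0, 10), (-900.0, 12), (-899.0, 14)]
print(select_num_clones(example_fits, n_obs=1000))
```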
Fig. 8
The subclonal prediction results of CLImAT-HET on sample SA227. CLImAT-HET infers sample SA227 as heterogeneous with three distinct clonal populations. The cellularity of one subclonal cluster is 0.44, in accordance with the tumor purities estimated by CLImAT and ASCAT. In addition, CLImAT-HET identifies two other subclonal clusters with cellularities of 0.5 and 0.21, respectively
