Genome Biol. 2015 Feb 13;16(1):35.
doi: 10.1186/s13059-015-0602-8.

PhyloWGS: reconstructing subclonal composition and evolution from whole-genome sequencing of tumors

Amit G Deshwar et al. Genome Biol. 2015.

Abstract

Tumors often contain multiple subpopulations of cancerous cells defined by distinct somatic mutations. We describe a new method, PhyloWGS, which can be applied to whole-genome sequencing data from one or more tumor samples to reconstruct complete genotypes of these subpopulations based on variant allele frequencies (VAFs) of point mutations and population frequencies of structural variations. We introduce a principled phylogenetic correction for VAFs in loci affected by copy number alterations and we show that this correction greatly improves subclonal reconstruction compared to existing methods. PhyloWGS is free, open-source software, available at https://github.com/morrislab/phylowgs.
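To illustrate why VAFs in loci affected by copy number alterations need a correction, the following minimal Python sketch computes the expected VAF of a point mutation from the fraction of cells that carry it. This is a simplified textbook relationship, not the PhyloWGS model: the function name `expected_vaf` and all of its parameters are hypothetical, and it assumes the simplest case in which every cell carrying the mutation has the same copy number at the locus and every other cell (including normal cells) has a second, fixed copy number.

```python
def expected_vaf(phi, mutated_copies=1, cn_mutated_cells=2, cn_other_cells=2):
    """Expected variant allele frequency of a point mutation (illustrative only).

    phi               -- fraction of all sampled cells carrying the mutation (0..1)
    mutated_copies    -- copies of the locus bearing the mutation in each such cell
    cn_mutated_cells  -- total copies of the locus in cells carrying the mutation
    cn_other_cells    -- total copies of the locus in all other cells
    """
    if not 0.0 <= phi <= 1.0:
        raise ValueError("phi must lie in [0, 1]")
    if mutated_copies > cn_mutated_cells:
        raise ValueError("mutated_copies cannot exceed cn_mutated_cells")
    # Average number of variant-bearing copies per cell.
    variant_copies = phi * mutated_copies
    # Average total number of copies of the locus per cell.
    total_copies = phi * cn_mutated_cells + (1.0 - phi) * cn_other_cells
    return variant_copies / total_copies


# Heterozygous mutation in a region of normal copy number: VAF is phi / 2.
diploid = expected_vaf(0.6)

# Same population frequency, but the mutated allele was gained (3 copies, 2 mutated).
amplified = expected_vaf(0.6, mutated_copies=2, cn_mutated_cells=3)

print(f"normal copy number: VAF = {diploid:.3f}, naive 2 * VAF = {2 * diploid:.3f}")
print(f"after allele gain:  VAF = {amplified:.3f}, naive 2 * VAF = {2 * amplified:.3f}")
```

In the second case, doubling the VAF no longer recovers the population frequency of 0.6, which is the kind of distortion a copy-number-aware correction has to account for. Which correction applies depends on how the copy number change and the point mutation are related in the phylogeny, which this sketch does not model.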

Figures

Figure 1
The development of intratumor heterogeneity and subclonal reconstruction. Tumor composition over time (i), the resulting distribution of variant allele frequencies (VAFs) (ii), the result of successful inference of the VAF clusters (iii), and the desired output of subclonal inference (iv). SSM, simple somatic mutation; VAF, variant allele frequency.
Figure 2
Example of copy number variations affecting the distribution of variant allele frequencies.
Figure 3
Changes to VAF caused by CNVs with different phylogenetic relationships. CNV, copy number variation; SSM, simple somatic mutation; VAF, variant allele frequency.
Figure 4
Example subclonal structure and inferred phylogenies using different methods. (A) Example of tumor subclonal structure. (B) Tumor phylogeny recovered by PhyloWGS. (C) Tumor phylogeny recovered by PhyloSub. (D) Subclonal structure implied by only CNVs. CNV, copy number variation; SSM, simple somatic mutation.
Figure 5
PhyloWGS run time. Relationship between the number of SSMs in the simulated dataset with five subpopulations and the run time on a log10 vs log10 plot. Run time was measured using a single core of an Intel i7-4770K with 2,500 MCMC iterations and 5,000 inner Metropolis–Hastings iterations. The run time can be greatly decreased by parallelizing the sampling or by taking fewer samples; however, the implications of these options have not been explored. SSM, simple somatic mutation.
Figure 6
Recovering the true number of clusters. Each panel shows the relationship between the number of SSMs per cluster, the read depth and the ability of PhyloWGS to recover the true number of populations for simulations with three, four, five or six populations. The error is calculated by subtracting the true number of subclonal lineages from the number found. SSM, simple somatic mutation.
Figure 7
Reconstruction accuracy. Each panel shows the relationship between the read depth and the accuracy of the resulting clustering, measured as the area under the precision–recall curve (AUPRC). Plots for three, four, five and six populations are shown with each line representing a different number of SSMs per cancerous population. SSM, simple somatic mutation.
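As a rough illustration of how a clustering can be scored with an area under the precision–recall curve, the sketch below treats every pair of SSMs as one prediction: the score is the inferred probability that the two SSMs share a cluster, and the pair counts as positive if they truly do. This is an assumption about the general approach, not a description of the paper's exact procedure. The function name `pairwise_auprc` is hypothetical, and the area is estimated with average precision, which is one common choice among several.

```python
from itertools import combinations


def pairwise_auprc(true_labels, coclustering_prob):
    """Average-precision estimate of AUPRC over all SSM pairs (illustrative only).

    true_labels        -- true cluster label of each SSM
    coclustering_prob  -- square matrix; entry [i][j] is the inferred probability
                          that SSMs i and j belong to the same cluster
    """
    scored_pairs = [
        (coclustering_prob[i][j], true_labels[i] == true_labels[j])
        for i, j in combinations(range(len(true_labels)), 2)
    ]
    n_positive = sum(1 for _, is_same in scored_pairs if is_same)
    if n_positive == 0:
        raise ValueError("no pair of SSMs shares a true cluster")

    # Rank pairs from most to least confident; tied scores keep their input order.
    scored_pairs.sort(key=lambda pair: pair[0], reverse=True)

    true_positives = 0
    precision_sum = 0.0
    for rank, (_, is_same) in enumerate(scored_pairs, start=1):
        if is_same:
            true_positives += 1
            # Precision at the rank where this true pair is recovered.
            precision_sum += true_positives / rank
    return precision_sum / n_positive


# Four SSMs in two true clusters, with a co-clustering matrix that separates them perfectly.
labels = [0, 0, 1, 1]
perfect = [[1.0 if labels[i] == labels[j] else 0.0 for j in range(4)] for i in range(4)]
print(pairwise_auprc(labels, perfect))
```

A score of 1 means every truly co-clustered pair is ranked above every pair that is not, so the measure rewards getting the grouping of SSMs right without requiring the number of clusters to match.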
Figure 8
Relationship between read depth and accuracy of the resulting clustering. Accuracy was measured as the area under the precision–recall curve (AUPRC) for PhyloWGS and PyClone. Plots are shown for subclonal additions (left) and deletions (right). AUPRC, area under the precision–recall curve.
Figure 9
True and inferred composition of TCGA benchmark samples. The figure shows the true (left), inferred by PhyloSub (center) and inferred by THetA (right) composition of three TCGA benchmark samples. Each bar represents a single sample.
Figure 10
Expert-generated and inferred phylogenies for patient CLL077 with chronic lymphocytic leukemia. Left: The expert-generated phylogeny based on targeted deep-sequencing data. Right: The phylogeny inferred by PhyloWGS on allele frequencies of the same SSMs found using WGS. The subclonal lineage population frequencies for the five samples and the SSM assignments of lineages are also shown in the figure. SSM, simple somatic mutation; WGS, whole-genome sequencing.
Figure 11
Subclonal reconstruction algorithms applied to breast tumor PD4120. Left: Area under the precision–recall curve (AUPRC) for PhyloWGS, PyClone and SciClone when looking at SSMs in areas of normal copy number. Right: AUPRC for PhyloWGS, PyClone and SciClone when looking at SSMs in areas of altered and normal copy number. CN, copy number; SSM, simple somatic mutation.
