Nat Commun. 2020 Feb 5;11(1):731.
doi: 10.1038/s41467-020-14352-7.

Reconstructing evolutionary trajectories of mutation signature activities in cancer using TrackSig

Yulia Rubanova et al. Nat Commun.

Erratum in

Abstract

The type and genomic context of cancer mutations depend on their causes. These causes have been characterized using signatures that represent mutation types that co-occur in the same tumours. However, it remains unclear how mutation processes change during cancer evolution, owing to the lack of reliable methods to reconstruct evolutionary trajectories of mutational signature activity. Here, as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, which aggregated whole-genome sequencing data from 2658 cancers across 38 tumour types, we present TrackSig, a new method that reconstructs these trajectories using optimal, joint segmentation and deconvolution of mutation type and allele frequencies from a single tumour sample. In simulations, we find TrackSig has a 3–5% activity reconstruction error and a 12% false detection rate. It outperforms an aggressive baseline in situations with branching evolution, CNA gain, and neutral mutations. Applied to data from 2658 tumours and 38 cancer types, TrackSig permits pan-cancer insight into evolutionary changes in mutational processes.

Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1. Signature activity trajectories for two samples.
Each plot is constructed from VAF data from a single tumour sample. Each line is an activity trajectory that depicts the inferred activity of a single signature (y-axis) as a function of decreasing CCF (x-axis). The thin lines are trajectories from each of 30 bootstrap runs. The bold line depicts the mean activities across bootstraps. The vertical lines indicate time points in the original dataset, and are placed at the average CCF of their 100 associated mutations. Changes in activity trajectories are not necessarily aligned with vertical bars because mean CCFs of time points change across bootstraps. The frequency of changepoints between two vertical bars is indicated by shade; darker shades indicate a higher density of changepoints. Subclonal boundaries found by PCAWG consensus clustering are shown as red vertical lines. These boundaries are not used in trajectory calculation and are only shown for comparison. Histograms show the mutation counts per signature in fixed-width intervals of CCF. a Breast cancer sample. In the clonal population, signature activities remain constant, with signature 3 (associated with BRCA1 mutations) dominating. In the subclone, the activity of signature 3 decreases and is replaced by SNVs associated with APOBEC/AID (signatures 2 and 13). b Chronic lymphocytic leukaemia sample. Signature 9 (somatic hypermutation) dominates during clonal expansion and drops from 55% activity to almost zero in the subclone. Signature 5 compensates for this change.
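The time points described in this caption (groups of 100 mutations ordered by decreasing CCF, each placed at the average CCF of its mutations) can be sketched as follows. This is a minimal illustration and not the TrackSig implementation: the function name `make_time_points`, the synthetic CCF values, and the choice to keep a final partial bin are all assumptions.

```python
from statistics import mean


def make_time_points(ccfs, bin_size=100):
    """Group mutations into time points of `bin_size` mutations each.

    Mutations are ordered by decreasing CCF, so earlier time points hold
    mutations with higher CCF. Returns a list of (mean_ccf, members) pairs.
    Assumption: a final bin with fewer than `bin_size` mutations is kept.
    """
    ordered = sorted(ccfs, reverse=True)
    bins = [ordered[i:i + bin_size] for i in range(0, len(ordered), bin_size)]
    return [(mean(b), b) for b in bins]


# Synthetic CCF values for 250 hypothetical mutations (not real data).
example_ccfs = [i / 1000 for i in range(250)]
time_points = make_time_points(example_ccfs)

for mean_ccf, members in time_points:
    print(f"time point at CCF {mean_ccf:.3f} with {len(members)} mutations")
```

Each returned mean CCF corresponds to the position of one vertical line in the plots.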
Fig. 2. Results on non-parametric simulations.
a Median activity difference between the reconstructed trajectories and the ground truth. Lines correspond to the simulations with 0, 1, 2, or 3 changepoints. The median is computed across all signatures and time points in the sample. b Distribution of maximum activity change (MAC) discrepancies between estimated activities and ground truth.
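The error measure in panel a, a median of absolute activity differences taken over all signatures and time points in a sample, can be sketched as below. The function name, the dictionary layout (signature name mapped to a per-time-point activity list), and the numbers are illustrative assumptions, not values from the paper.

```python
from statistics import median


def median_activity_error(estimated, truth):
    """Median absolute difference between estimated and true activities.

    Both arguments map a signature name to a list of activities, one per
    time point. Differences are pooled over all signatures and time points
    before the median is taken.
    """
    diffs = [
        abs(est - true)
        for signature in truth
        for est, true in zip(estimated[signature], truth[signature])
    ]
    return median(diffs)


# Made-up trajectories for two hypothetical signatures over three time points.
truth = {"A": [0.60, 0.60, 0.40], "B": [0.40, 0.40, 0.60]}
estimated = {"A": [0.58, 0.63, 0.45], "B": [0.42, 0.37, 0.55]}

print(f"median activity error: {median_activity_error(estimated, truth):.3f}")
```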
Fig. 3. TrackSig and SciClone performance on clonal evolution simulations.
a Scatterplot of median activity errors (i.e. absolute activity difference) on all depth 30 simulations (see Supplementary Fig. 3 for depths 10 and 100). Mean activity error: TrackSig 3.5%, SciClone 6.2%. b Grouped barplot shows proportion of simulations where each method predicts the correct number of subclones for different simulation types as indicated on x-axis label. Different SciClone bars indicate different noise model selections. Results are for the simulations of average depth 30. Results for depths 10 and 100 are shown in Supplementary Fig. 2.
Fig. 4. TrackSig reconstruction examples.
a Simulated data were generated with two clusters and clonal neutral mutations at read depth 100. TrackSig incorrectly places a changepoint before a cluster of neutral mutations from the clonal lineage near the VAF detection limit. However, because the signature activities match those in the clonal cluster, this error could be detected and corrected in post-processing. b Simulated data were generated with two clusters at read depth 10. TrackSig correctly identifies one changepoint. Although the simulation contains two clusters, there is only a single mode of CCF, which makes CCF-cluster-based detection of subclones impossible. However, the histogram on top shows that there are differences in mutation type distributions between the left and right tails, permitting TrackSig to correctly identify a changepoint. Both figures use an expanded x-axis that shows the whole spread of estimated CCF; this is indicated by a change in the x-label descriptor.
Fig. 5. Maximum signature activity changes in PCAWG samples.
The red line shows the threshold of 5%, above which we consider changes to be significant. a Changes on random orders of mutations, where we do not expect to see a change in activities. b Activity changes in TrackSig trajectories across all samples (on mutations sorted by CCF). The frequency axis shows the number of samples in which a given activity change is observed.
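The 5% significance threshold can be illustrated with a short sketch. It assumes that the maximum activity change of a signature is the difference between its highest and lowest activity across time points; the caption does not spell out the definition, so treat this as one plausible reading. The function names and the trajectories are hypothetical.

```python
def max_activity_change(trajectory):
    """Largest change in activity across time points (assumed: max minus min)."""
    return max(trajectory) - min(trajectory)


def significant_changes(activities, threshold=0.05):
    """Return signatures whose maximum activity change exceeds `threshold`.

    `activities` maps a signature name to a list of activities, one per
    time point, expressed as fractions (0.05 corresponds to 5%).
    """
    result = {}
    for signature, trajectory in activities.items():
        change = max_activity_change(trajectory)
        if change > threshold:
            result[signature] = change
    return result


# Made-up trajectories for two hypothetical signatures.
activities = {"S3": [0.60, 0.55, 0.30], "S5": [0.20, 0.21, 0.22]}

for signature, change in significant_changes(activities).items():
    print(f"{signature}: maximum activity change {change:.2f}")
```

Under this reading, only the first hypothetical signature would count as changing significantly, since the second varies by less than 5%.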
Fig. 6. PCAWG signature changes by activity level.
a Mean signature activities ranked from the largest to the smallest within each sample in PCAWG data. Only the top five signatures with the highest activities in a sample are shown. b Maximum changes of signature activities for the corresponding signatures on plot (a). The changes below 5% are omitted.
Fig. 7. Frequency of activity change by number of mutations.
Proportion of tumours that have a significant change greater than 5% activity depending on the number of time points in a sample. Each bar corresponds to the range of number of time points in a sample; each time point contains 100 mutations.

