Bioinformatics. 2025 Mar 29;41(4):btaf145.
doi: 10.1093/bioinformatics/btaf145.

Marker selection strategies for circulating tumor DNA guided by phylogenetic inference

Xuecong Fu et al. Bioinformatics. 2025.

Abstract

Motivation: Blood-based profiling of tumor DNA ("liquid biopsy") offers great prospects for non-invasive early cancer diagnosis and clinical guidance, but requires further computational advances to become a robust quantitative assay of tumor clonal evolution. We propose new methods to better characterize tumor clonal dynamics from circulating tumor DNA (ctDNA), through application to two specific tasks: (i) applying longitudinal ctDNA data to refine phylogeny models of clonal evolution, and (ii) quantifying changes in clonal frequencies that may be indicative of treatment response or tumor progression. We pose these through a probabilistic framework for optimally identifying markers and using them to characterize clonal evolution.

Results: We first estimate a density over clonal tree models using bootstrap samples over pre-treatment tissue-based sequence data. We then refine these models over successive longitudinal samples. We use the resulting framework for modeling and refining tree densities to pose a set of optimization problems for selecting ctDNA markers to maximize measures of utility for reducing uncertainty in phylogeny models and quantifying clonal frequencies given the models. We tested our methods on synthetic data and showed them to be effective at refining tree densities and inferring clonal frequencies. Application to real tumor data further demonstrated the methods' effectiveness in refining a lineage model and assessing its clonal frequencies. The work shows the power of computational methods to improve marker selection, clonal lineage reconstruction, and clonal dynamics profiling for more precise and quantitative assays of somatic evolution and tumor progression.
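The first step described above, estimating a density over tree models from bootstrap replicates, can be illustrated with a minimal sketch. It assumes each bootstrap replicate has already been turned into a clonal tree by some inference tool, and that a tree is represented as a child-to-parent mapping over clone labels. The function name and the toy trees are hypothetical and are not taken from the Mase-phi repository.

```python
from collections import Counter


def empirical_tree_density(bootstrap_trees):
    """Return the relative frequency of each distinct topology among the replicates.

    Each tree is a dict mapping a child clone to its parent clone (an assumed,
    simplified representation). Topologies are keyed by their frozen edge set.
    """
    counts = Counter(frozenset(tree.items()) for tree in bootstrap_trees)
    total = sum(counts.values())
    return {topology: n / total for topology, n in counts.items()}


# Hypothetical trees inferred from four bootstrap replicates.
trees = [
    {"B": "A", "C": "A"},
    {"B": "A", "C": "A"},
    {"B": "A", "C": "B"},
    {"B": "A", "C": "A"},
]
density = empirical_tree_density(trees)
branching = frozenset({("B", "A"), ("C", "A")})
linear = frozenset({("B", "A"), ("C", "B")})
```

In this toy case the branching topology appears in three of four replicates, so it receives most of the initial weight, and later ctDNA measurements can then shift that weight between the two candidates.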

Availability and implementation: https://github.com/CMUSchwartzLab/Mase-phi.git. (DOI: 10.5281/zenodo.14776163).

Figures

Figure 1.
Our overall inference pipeline. (1) We assume we have first sequenced tissue and liquid biopsy sample(s), obtaining reference germline and ctDNA sequence. (2) We create bootstrapped samples over reads for each sequence set. (3) We infer a set of possible trees from the bootstrapped samples, establishing an estimated empirical tree density. (4) We then seek a set of optimal biomarkers of mutations to best reduce the tree uncertainty and (5) quantify these markers in ctDNA (e.g. by ddPCR). (6) We use the results of these assays to update the empirical tree densities. (7) We further seek a set of optimal biomarkers to track subclone frequencies efficiently and (8) assay these biomarkers. (9) Finally, we use the results of the assays to estimate clonal fractions at each sampled time point.
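Step (6) of the pipeline, updating the empirical tree density from assay results, can be sketched as a Bayesian reweighting. The sketch below assumes a binomial model for a single marker: each candidate tree predicts an expected mutant fraction, and the assay reports k mutant molecules out of n. The likelihood model, function names, and numbers are illustrative assumptions rather than the authors' implementation.

```python
import math


def binomial_log_likelihood(k, n, p):
    """Log-probability of k mutant molecules out of n at expected fraction p."""
    p = min(max(p, 1e-9), 1 - 1e-9)  # guard against log(0)
    return math.log(math.comb(n, k)) + k * math.log(p) + (n - k) * math.log(1 - p)


def update_tree_weights(prior, predicted_fraction, k, n):
    """Reweight trees by how well each one explains the observed marker counts.

    prior: tree id -> current weight; predicted_fraction: tree id -> expected
    mutant fraction of the assayed marker under that tree (assumed given).
    """
    log_post = {
        tree: math.log(weight) + binomial_log_likelihood(k, n, predicted_fraction[tree])
        for tree, weight in prior.items()
        if weight > 0
    }
    shift = max(log_post.values())  # subtract the maximum for numerical stability
    unnormalized = {tree: math.exp(v - shift) for tree, v in log_post.items()}
    norm = sum(unnormalized.values())
    return {tree: v / norm for tree, v in unnormalized.items()}


# Hypothetical example: two candidate trees with equal prior weight.
prior = {"T1": 0.5, "T2": 0.5}
predicted = {"T1": 0.10, "T2": 0.30}
posterior = update_tree_weights(prior, predicted, k=10, n=100)
```

Because the observed fraction (10 of 100) matches the prediction of T1, the updated density concentrates on T1. Repeating the update over successive longitudinal samples mirrors the refinement loop in the figure.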
Figure 2.
Results from 10 simulated clonal trees derived from 1 blood and 2 tissue samples, each with 7 clones, mutation rate 50, tissue mask proportion 0.5. (a) Updated weighted distance after tree adjustment versus original weighted distance between estimated and ground truth trees. CASet (Dinardo et al. 2020) is used as the distance metric. (b) The updated weighted distance after the tree adjustment versus the original weighted distance using DISC (Dinardo et al. 2020) as the metric. (c) Updated weights versus original weights for the best tree structure, i.e. the one with the lowest distance to the ground truth tree. We compared three marker selection strategies: optimizing for inferring clonal fractions (frac), inferring tree structures (struct), and random selection (random).
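The weighted distance plotted in panels (a) and (b) can be read as the expected distance to the ground truth tree under the current tree density. The sketch below illustrates that idea under stated assumptions: trees are child-to-parent mappings, and a simple Jaccard distance over edge sets stands in for CASet or DISC, which are not implemented here. All names are hypothetical.

```python
def edge_set_distance(tree_a, tree_b):
    """Jaccard distance between edge sets; a stand-in metric, not CASet or DISC."""
    edges_a, edges_b = set(tree_a.items()), set(tree_b.items())
    union = edges_a | edges_b
    if not union:
        return 0.0
    return 1.0 - len(edges_a & edges_b) / len(union)


def weighted_distance(weighted_trees, truth, distance=edge_set_distance):
    """Average distance to the ground truth, weighting each tree by its density."""
    total_weight = sum(weight for _, weight in weighted_trees)
    return sum(weight * distance(tree, truth) for tree, weight in weighted_trees) / total_weight


# Hypothetical ground truth and a two-tree density.
truth = {"B": "A", "C": "A"}
candidates = [
    ({"B": "A", "C": "A"}, 0.75),
    ({"B": "A", "C": "B"}, 0.25),
]
before = weighted_distance(candidates, truth)

# After refinement, more weight sits on the correct topology.
refined = [
    ({"B": "A", "C": "A"}, 0.95),
    ({"B": "A", "C": "B"}, 0.05),
]
after = weighted_distance(refined, truth)
```

A point below the diagonal in panels (a) and (b) corresponds to `after < before`, meaning the refinement moved weight toward trees closer to the ground truth.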
Figure 3.
Trackable fractions of the simulated tumor clonal population in noisy liquid biopsy samples. The lines show how the trackable fraction, defined as the share of the total clonal frequency belonging to clones that can be quantified with the selected marker set, changes as the number of selected markers increases. We compare marker selection optimized for maximizing the trackable fraction against random marker selection.
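The trackable fraction and its growth with marker count can be illustrated with a small sketch. It makes a strong simplifying assumption that is not taken from the paper: each marker belongs to exactly one clone, and a clone counts as quantifiable once any of its markers is selected. The greedy selection shown is a stand-in for the optimization the authors pose, and all names and numbers are hypothetical.

```python
def trackable_fraction(selected, marker_clone, clone_freq):
    """Share of total clonal frequency held by clones covered by selected markers."""
    covered = {marker_clone[marker] for marker in selected}
    total = sum(clone_freq.values())
    return sum(clone_freq[clone] for clone in covered) / total


def greedy_select(marker_clone, clone_freq, budget):
    """Pick markers one at a time, each time taking the largest gain in trackable fraction."""
    selected = []
    remaining = set(marker_clone)
    for _ in range(min(budget, len(remaining))):
        # sorted() makes tie-breaking deterministic (alphabetical order).
        best = max(
            sorted(remaining),
            key=lambda m: trackable_fraction(selected + [m], marker_clone, clone_freq),
        )
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical markers and clone frequencies.
marker_clone = {"m1": "A", "m2": "A", "m3": "B", "m4": "C"}
clone_freq = {"A": 0.5, "B": 0.3, "C": 0.2}

chosen = greedy_select(marker_clone, clone_freq, budget=2)
optimized = trackable_fraction(chosen, marker_clone, clone_freq)
redundant = trackable_fraction(["m1", "m2"], marker_clone, clone_freq)
```

With a budget of two markers, the greedy choice covers clones A and B, whereas a redundant pick of two markers from clone A covers only A. This is the kind of gap between optimized and random selection that the figure reports.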
Figure 4.
Results of applying our tree refinement and clonal tracking methods to TRACERx sample CRUK0044. (a) Changes in tree weights for each topology identified in bootstrap sampling, after adjusting the tree density using the selected markers at each time point. (b) The inferred most likely tree after all serial samples, corresponding to the blue line in (a). (c) Inferred clonal frequencies at each longitudinal sample, derived from the selected marker set, with lines representing the clones color-coded as in the tree in (b).

References

    Abbosh C, Birkbak NJ, Wilson GA et al.; PEACE Consortium. Phylogenetic ctDNA analysis depicts early-stage lung cancer evolution. Nature 2017;545:446–51.
    Abbosh C, Frankell AM, Harrison T et al.; TRACERx Consortium. Tracking early lung cancer metastatic dissemination in TRACERx using ctDNA. Nature 2023;616:553–62.
    Beerenwinkel N, Schwarz RF, Gerstung M et al. Cancer evolution: mathematical models and computational inference. Syst Biol 2015;64:e1–25.
    Cescon DW, Bratman SV, Chan SM et al. Circulating tumor DNA and liquid biopsy in oncology. Nat Cancer 2020;1:276–90.
    Connal S, Cameron JM, Sala A et al. Liquid biopsies: the future of cancer early detection. J Transl Med 2023;21:118.