Reference-free inference of tumor phylogenies from single-cell sequencing data

Ayshwarya Subramanian, Russell Schwartz

PMID: 26576947
PMCID: PMC4652515
DOI: 10.1186/1471-2164-16-S11-S7

BMC Genomics. 2015;16 Suppl 11:S7. Epub 2015 Nov 10.

Erratum in

Erratum to: 'Reference-free inference of tumor phylogenies from single-cell sequencing data'.
Subramanian A, Schwartz R. BMC Genomics. 2016 May 10;17(1):348. doi: 10.1186/s12864-016-2609-2. PMID: 27164840.

Abstract

Background: Effective management and treatment of cancer continue to be complicated by the rapid evolution and resulting heterogeneity of tumors. Phylogenetic study of cell populations in single tumors provides a way to delineate intra-tumoral heterogeneity and identify robust features of evolutionary processes. The introduction of single-cell sequencing has shown great promise for advancing single-tumor phylogenetics; however, the volume of these data and their high noise present challenges for inference, especially with regard to chromosome abnormalities that typically dominate tumor evolution. Here, we investigate a strategy to use such data to track differences in tumor cell genomic content during progression.

Results: We propose a reference-free approach to mining single-cell genome sequence reads to allow predictive classification of tumors into heterogeneous cell types and reconstruct models of their evolution. The approach extracts k-mer counts from single-cell tumor genomic DNA sequences, and uses differences in normalized k-mer frequencies as a proxy for overall evolutionary distance between distinct cells. The approach computationally simplifies deriving phylogenetic markers, which normally relies on first aligning sequence reads to a reference genome and then processing the data to extract meaningful progression markers for constructing phylogenetic trees. The approach also provides a way to bypass some of the challenges that massive genome rearrangement typical of tumor genomes presents for reference-based methods. We illustrate the method on a publicly available breast tumor single-cell sequencing dataset.
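The k-mer step described in the Results can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the abstract does not say which normalization or distance function was used, so dividing by the total k-mer count and taking the Manhattan (L1) distance are assumptions, and the names `kmer_counts`, `normalize`, `manhattan`, and `distance_matrix` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def kmer_counts(reads, k):
    """Count every k-mer across one cell's reads, skipping k-mers with ambiguous bases."""
    counts = Counter()
    for read in reads:
        read = read.upper()
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if set(kmer) <= set("ACGT"):
                counts[kmer] += 1
    return counts


def normalize(counts):
    """Convert raw counts to relative frequencies (assumed normalization)."""
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()} if total else {}


def manhattan(freq_a, freq_b):
    """L1 distance between two k-mer frequency profiles (assumed distance)."""
    keys = freq_a.keys() | freq_b.keys()
    return sum(abs(freq_a.get(key, 0.0) - freq_b.get(key, 0.0)) for key in keys)


def distance_matrix(cells, k):
    """Pairwise distances between cells, given {cell_name: [read, ...]}."""
    freqs = {name: normalize(kmer_counts(reads, k)) for name, reads in cells.items()}
    dist = {name: {name: 0.0} for name in freqs}
    for a, b in combinations(freqs, 2):
        dist[a][b] = dist[b][a] = manhattan(freqs[a], freqs[b])
    return dist


# Toy input: two identical "primary" cells and one unrelated "metastatic" cell.
cells = {"P1": ["ACGTACGT"], "P2": ["ACGTACGT"], "M1": ["TTTTTTTT"]}
dist = distance_matrix(cells, k=3)
```

No read alignment appears anywhere in this sketch, which is the sense in which the approach is reference-free. At the k = 20 or 25 used in the paper and on real sequencing depth, a dedicated k-mer counter would be needed in place of an in-memory `Counter`.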

Conclusions: We have demonstrated a computational approach for learning tumor progression from single-cell sequencing data using k-mer counts. k-mer features classify tumor cells by stage of progression with high accuracy. Phylogenies built from these k-mer spectrum distance matrices yield splits that are statistically significant when tested for their ability to partition cells at different stages of cancer.
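One way to ask whether a tree split separates cells by stage is a 2x2 contingency test. The abstract does not name the test the authors used, so the two-sided Fisher's exact test below is an assumption chosen for illustration, and `split_table` and `fisher_exact_two_sided` are hypothetical names.

```python
from math import comb


def split_table(side_a, stage):
    """Build a 2x2 table: rows are the two sides of a split, columns primary/metastatic.

    side_a: set of cell names on one side of the split.
    stage:  dict mapping every cell name to "primary" or "metastatic".
    """
    a = sum(1 for c in stage if c in side_a and stage[c] == "primary")
    b = sum(1 for c in stage if c in side_a and stage[c] == "metastatic")
    c_ = sum(1 for c in stage if c not in side_a and stage[c] == "primary")
    d = sum(1 for c in stage if c not in side_a and stage[c] == "metastatic")
    return a, b, c_, d


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same margins
    that are no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))


# Toy example: a split that perfectly separates 5 primary from 5 metastatic cells.
stage = {f"P{i}": "primary" for i in range(5)}
stage.update({f"M{i}": "metastatic" for i in range(5)})
p_value = fisher_exact_two_sided(*split_table({f"P{i}" for i in range(5)}, stage))
```

When many splits from the same tree are tested, a multiple-testing correction would normally be applied to the resulting p-values.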


Figures

**Figure 1**
**Workflow for inferring phylogenies from single-cell genome sequencing data based on the k-mer approach**. The major steps are k-mer counting, normalization, computation of distance matrices and phylogeny building.

**Figure 2**
**Histogram of k-mer relative abundances**. Both 20- and 25-mer relative abundance densities appear log-Laplacian. These data included 20- and 25-mers found in all tumor cells. (a) Histogram of 20-mer relative abundances in log10 scale. (b) Histogram of 25-mer relative abundances in log10 scale.

**Figure 3**
**t-SNE ordination of single tumor cells**. (a) Projection of tumor cells in the space of 20-mers present in all cells. The 20-mers separate the cells into 3 loose groups, the rightmost of which is dominated by primary cells and the others by metastatic cells. The cluster in the middle is equally represented by primary and metastatic cells from patient T16, suggesting a state of transition. The cluster on the upper right is mostly metastatic with some primary cells. This suggests two distinct stages of advancing tumor cells based on k-mer composition. (b) Projection of cells in the space of only those 20-mers that remain after the differential abundance test for k-mer selection. Cells group into 2 clusters, one of which is entirely composed of primary cells and the other a mix of primary and metastatic cells.

**Figure 4**
**20-mer bootstrap consensus neighbor-joining tree built from T16 primary (prefix P) and metastatic data (prefix M)**. Distinct groupings of cells are labeled as clusters.

**Figure 5**
**20-mer bootstrap consensus neighbor-joining tree built from T10 primary breast tumor cells (prefix C), T16 primary (prefix P) and metastatic data (prefix M)**. Distinct groupings of cells are labeled as clusters.
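The trees in Figures 4 and 5 are neighbor-joining trees built from k-mer distance matrices. The sketch below shows the standard neighbor-joining procedure on a dict-of-dicts distance matrix. It is an illustration only: the function name `neighbor_joining` and the internal node labels `node0`, `node1`, ... are hypothetical, and the bootstrap resampling and consensus steps named in the captions are not covered.

```python
def neighbor_joining(names, dist):
    """Standard neighbor joining.

    names: list of leaf labels.
    dist:  symmetric dict-of-dicts with dist[a][b] for every pair a != b.
    Returns a list of (parent, child, branch_length) edges of an unrooted tree.
    """
    d = {a: dict(dist[a]) for a in names}  # working copy; internal nodes get added
    nodes = list(names)
    edges = []
    next_id = 0
    while len(nodes) > 2:
        n = len(nodes)
        # Row sums over the currently active nodes.
        r = {a: sum(d[a][b] for b in nodes if b != a) for a in nodes}
        # Pick the pair minimising Q(a, b) = (n - 2) * d(a, b) - r(a) - r(b).
        best = None
        for i, a in enumerate(nodes):
            for b in nodes[i + 1:]:
                q = (n - 2) * d[a][b] - r[a] - r[b]
                if best is None or q < best[0]:
                    best = (q, a, b)
        _, a, b = best
        u = f"node{next_id}"
        next_id += 1
        # Branch lengths from the joined pair to the new internal node.
        len_a = 0.5 * d[a][b] + (r[a] - r[b]) / (2 * (n - 2))
        len_b = d[a][b] - len_a
        edges += [(u, a, len_a), (u, b, len_b)]
        # Distances from the new node to every remaining node.
        d[u] = {}
        for c in nodes:
            if c not in (a, b):
                d[u][c] = d[c][u] = 0.5 * (d[a][c] + d[b][c] - d[a][b])
        nodes = [c for c in nodes if c not in (a, b)] + [u]
    a, b = nodes
    edges.append((a, b, d[a][b]))
    return edges


# Toy additive distances from the tree ((A:1, B:2):3, (C:1, D:4)).
toy = {
    "A": {"B": 3, "C": 5, "D": 8},
    "B": {"A": 3, "C": 6, "D": 9},
    "C": {"A": 5, "B": 6, "D": 5},
    "D": {"A": 8, "B": 9, "C": 5},
}
edges = neighbor_joining(["A", "B", "C", "D"], toy)
```

On additive input such as the toy matrix, neighbor joining recovers the generating tree and its branch lengths. On noisy k-mer distances the result is an estimate, which is why the captions report bootstrap consensus trees.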


