Nat Commun. 2024 Oct 29;15(1):9343.
doi: 10.1038/s41467-024-53374-3.

Integrative spatial and genomic analysis of tumor heterogeneity with Tumoroscope

Shadi Shafighi et al. Nat Commun. 2024.

Erratum in

Abstract

Spatial and genomic heterogeneity of tumors are crucial factors influencing cancer progression, treatment, and survival. However, a technology for directly mapping clones in the tumor tissue based on somatic point mutations is lacking. Here, we propose Tumoroscope, the first probabilistic model that accurately infers cancer clones and their localization at close to single-cell resolution by integrating pathological images, whole exome sequencing, and spatial transcriptomics data. In contrast to previous methods, Tumoroscope explicitly addresses the problem of deconvoluting the proportions of clones in spatial transcriptomics spots. Applied to a reference prostate cancer dataset and a newly generated breast cancer dataset, Tumoroscope reveals spatial patterns of clone colocalization and mutual exclusion in sub-areas of the tumor tissue. We further infer clone-specific gene expression levels and the most highly expressed genes for each clone. In summary, Tumoroscope enables an integrated study of the spatial, genomic, and phenotypic organization of tumors.

Conflict of interest statement

Projects in Szczurek lab are co-funded by Merck Healthcare. C.E., K.T., and J.E.M. are scientific consultants for 10x Genomics Inc. Other authors declare no competing interests.

Figures

Fig. 1. Overview of the Tumoroscope framework.
a–c Input data. d–f Data preprocessing. g Tumoroscope probabilistic model. h Regression model for inferring gene expression profiles of the clones. i Results of Tumoroscope. j Output of the regression model. Figure is created in BioRender.
Fig. 2. Performance of Tumoroscope on simulated data featuring 5 clones and 30 mutations.
a–c Mean Average Error (MAE; y-axis) as a function of spot coverage (x-axis) in different simulation setups (colors) for Tumoroscope, for different noise levels in the cell count provided at input: no noise (a), small noise (b) and high noise (c). d–f The same as in a–c, but for Tumoroscope-fixed. g Pearson correlation (y-axis) between the average spot coverage and the average error in all the setups is negative for both model versions (x-axis), regardless of the noise in the number of cells provided as input (colors). h–l Comparison of the accuracy (y-axis) of the model between cardelino (gray) and two versions of the model given true and highly noisy values for the number of cells (colors), depending on the spot coverage (x-axis), in different simulation setups: basic (h), increased (i) and decreased (j) number of mutations, increased (k) and decreased (l) number of clones. In each panel, the lower and upper boundaries of the box represent the first (Q1) and third quartiles (Q3), with the median indicated by a line inside the box. The whiskers typically extend to the most extreme data points within 1.5 times the interquartile range (IQR) from the quartiles. Data points outside this range are considered outliers and are plotted individually by diamonds. The boxplots in panels (a–f) and (h–l) are based on 10 data points each, corresponding to 10 generated datasets for each setup. In panel g, each boxplot represents 20 data points, corresponding to the Pearson correlations calculated across the 10 datasets for the 20 different setups.
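The box plot convention described in this caption (box from Q1 to Q3, median line, whiskers reaching the most extreme points within 1.5 × IQR of the quartiles, everything else drawn as an outlier) can be sketched as follows. This is only an illustration, not code from the paper: the function name `box_stats`, the example values, and the choice of linearly interpolated quartiles are assumptions, and plotting libraries may compute quartiles slightly differently.

```python
from statistics import quantiles


def box_stats(values):
    """Summarize values the way a standard box plot draws them.

    Assumption: quartiles use linear interpolation over the data range
    (method="inclusive"); the caption does not state which rule was used.
    """
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    # Whiskers end at the most extreme data points still inside the fences.
    inside = [v for v in values if low_fence <= v <= high_fence]
    # Points beyond the fences are reported separately as outliers.
    outliers = [v for v in values if v < low_fence or v > high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outliers": outliers,
    }


# Hypothetical example with 10 data points, mirroring the 10 datasets per setup.
example = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
stats = box_stats(example)
```

In this made-up example the value 100 lies above Q3 + 1.5 × IQR, so it would be drawn as an individual outlier point while the upper whisker stops at 9.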
Fig. 3. Spatial arrangement of cancer clones inferred for the breast cancer dataset.
a, b Evolutionary tree and genotypes of the inferred clones. Figure 3a is created in BioRender. Colors: major to total ratio, i.e., the fraction of the major copy number to the total copy number, with values that fall within the range of 0 to 1. c Distribution of the Pearson correlation (y-axis) of the clonal composition of the spots that are distant and adjacent, computed for 100 pairs of spots sampled at random 20 times each (x-axis). d Distribution of the agreement of the distant and adjacent spots in cardelino and Tumoroscope, computed for the same randomly sampled pairs as used in c. To compute the agreement, we use the single inferred clone by cardelino and the major inferred clone by Tumoroscope. In panels c and d, the lower and upper boundaries of the box represent the first (Q1) and third quartiles (Q3), with the median indicated by a line inside the box. The whiskers typically extend to the most extreme data points within 1.5 times the interquartile range (IQR) from the quartiles. Data points outside this range are considered outliers and are plotted individually by diamonds. e Pathologist’s annotation of the cancerous areas on the H&E images for sections SB1, SB2, and SB3. f For each section, two rows correspond to the two nearby samples and 7 columns correspond to the proportion of the spots assigned to each clone. g The clonal assignment of the spots inferred by cardelino for the same samples (see Supplementary Fig. 10 for expanded cardelino results). h Assignment of spots to copy number clones inferred by STARCH, with two clusters: gray corresponding to a normal clone, and dark blue corresponding to a single tumor clone.
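The two spot-pair comparisons named in panels c and d can be illustrated with a small sketch. It assumes each spot is represented by a vector of clone proportions, that the "major" clone is the one with the largest proportion, and that agreement is the fraction of spot pairs whose assigned clones match. The last point is an assumption, since the caption does not define agreement precisely, and all function names and example numbers below are hypothetical rather than taken from the paper.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length clone-proportion vectors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")  # undefined when one vector is constant
    return cov / sqrt(var_x * var_y)


def major_clone(proportions):
    """Index of the clone with the largest proportion in a spot."""
    return max(range(len(proportions)), key=lambda i: proportions[i])


def agreement(spot_pairs):
    """Fraction of spot pairs whose major clones coincide (assumed definition)."""
    matches = sum(major_clone(a) == major_clone(b) for a, b in spot_pairs)
    return matches / len(spot_pairs)


# Hypothetical clone proportions for three spots and three clones.
spot_a = [0.7, 0.2, 0.1]
spot_b = [0.6, 0.3, 0.1]  # similar composition to spot_a
spot_c = [0.1, 0.2, 0.7]  # dominated by a different clone

r_similar = pearson(spot_a, spot_b)
r_different = pearson(spot_a, spot_c)
share_agreeing = agreement([(spot_a, spot_b), (spot_a, spot_c)])
```

For a method that assigns a single clone per spot, such as cardelino in the caption, the same agreement count would apply directly to the assigned clone labels instead of calling `major_clone`.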
Fig. 4. Clonal evolution of breast cancer samples inferred by Canopy.
At each branch, known oncogenes, tumor suppressor or fusion genes with mutations that occurred along that branch are marked with black shapes. Colors: genes within 'Top 20 mutated genes in breast cancer' (green), 'Known hallmark of breast cancer' (yellow), and the 'Known breast cancer genes' (blue) categories. Purple framing: genes in the 'Known mutated genes in cancer' category. The branch lengths were adjusted for the visual presentation and are not inferred by the model. The percentage in the brackets for the 'Top 20 mutated genes in breast cancer' category is the mutation frequency (total mutated samples / total samples analysed, in percent) in the breast cancer samples in COSMIC. Figure is created in BioRender.
Fig. 5. Results obtained for the prostate cancer dataset.
a, b Evolutionary tree and genotype of the clones. Figure 5a is created in BioRender. Colors: major to total ratio, i.e., the fraction of the major copy number to the total copy number, with values that fall within the range of 0 to 1. c Pathologist’s annotation of the cancerous areas on the H&E images for sections SP1, SP2, and SP3. d For each section (rows), 4 columns correspond to the proportion of the spots assigned to each clone. e The clonal assignment by cardelino (see Supplementary Fig. 14 for expanded cardelino results). f Assignment of spots to copy number clones as inferred by STARCH, with two clusters: gray corresponding to a normal clone (with no copy number changes), and dark green corresponding to a single tumor clone. g Distribution of the Pearson correlation of the clonal composition of the spots that are distant and adjacent, computed for 100 pairs of spots sampled at random 20 times each. In panels g and h, the lower and upper boundaries of the box represent the first (Q1) and third quartiles (Q3), with the median indicated by a line inside the box. The whiskers typically extend to the most extreme data points within 1.5 times the interquartile range (IQR) from the quartiles. Data points outside this range are considered outliers and are plotted individually by diamonds. h Distribution of the agreement (y-axis) of the distant and adjacent spots for cardelino and Tumoroscope, computed for the same randomly sampled pairs as used in g. For the computation of the agreement, we use the single inferred clone by cardelino and the major inferred clone by Tumoroscope.
Fig. 6. Clonal evolution of prostate cancer samples inferred by Canopy.
At each branch, known oncogenes, tumor suppressor or fusion genes with non-synonymous mutations that occurred along that branch are marked with black shapes. Blue color marks the gene that belongs to the 'Known prostate cancer genes'. The branch lengths were adjusted for the visual presentation and are not inferred by the model. Figure is created in BioRender.
Fig. 7. Genes are expressed differently in various cancer clones.
The expression of the 30 genes that were inferred by the regression model as the most active in at least one clone, clustered in rows and columns, for breast (a) and prostate cancer (b) tissues. * cancer gene found in all cancer tissues (not cancer type specific) according to the HPA database; + cancer gene with nTPM (normalized gene expression value) in the desired cancer type (either breast in a or prostate in b) at least four times higher than in other cancer tissues, according to; ++ cancer gene with nTPM in a group of cancer tissues including the desired cancer type, at least four times higher than in other cancer tissues, according to; - not detected in cancer tissues, or nTPM at least four times higher in another cancer tissue than the desired one, according to.
