Cancer Cell. 2019 Mar 18;35(3):441-456.e8.
doi: 10.1016/j.ccell.2019.02.002.

Undifferentiated Sarcomas Develop through Distinct Evolutionary Pathways

Christopher D Steele et al. Cancer Cell.

Abstract

Undifferentiated sarcomas (USARCs) of adults are diverse, rare, and aggressive soft tissue cancers. Recent sequencing efforts have confirmed that USARCs exhibit one of the highest burdens of structural aberrations across human cancer. Here, we sought to unravel the molecular basis of the structural complexity in USARCs by integrating DNA sequencing, ploidy analysis, gene expression, and methylation profiling. We identified whole genome duplication as a prevalent and pernicious force in USARC tumorigenesis. Using mathematical deconvolution strategies to unravel the complex copy-number profiles, together with mutational timing models, we infer distinct evolutionary pathways of these rare cancers. In addition, 15% of tumors exhibited raised mutational burdens that correlated with gene expression signatures of immune infiltration and with good prognosis.

Keywords: cancer evolution; copy-number signatures; genomics; immuno-oncology; mutational signatures; sarcoma; tumor mutational burden.

Figures

Graphical abstract
Figure 1
Molecular Classification of USARCs (A) Alluvial diagram showing tumor diagnosis reclassification following expert pathological review. UPS, undifferentiated pleomorphic sarcoma; USARC, undifferentiated sarcoma; UCS, unclassified sarcoma; SCS, spindle cell sarcoma; S, spindle; P, pleomorphic; E, epithelioid; MPNST, malignant peripheral nerve sheath tumor; DDLPS, dedifferentiated liposarcoma; M-SFT, malignant solitary fibrous tumor; Ped. SCS, pediatric spindle cell sarcoma. Numbers indicate the number of samples for each subtype. (B) H&E staining of four representative USARC subtypes. Scale bars, 250 μm. (C) Mean methylation of probes categorized by genomic position (left) or position relative to CpG islands (right), in USARC samples (orange) and normal adjacent tissue (green); ∗q < 0.05, ∗∗q < 0.01, ∗∗∗q < 0.001. Boxes show lower quartile, median, and upper quartile; lines denote furthest point within 1.5× the interquartile range away from the box; points denote data further than 1.5× the interquartile range away from the box. (D) Principal-component analysis of tumor (orange) and normal (green) samples for both methylation array data (left) and RNA sequencing data (right) as well as shared hierarchical clustering of RNA and methylation data (center). (E) Scatterplot of rearrangement burden (x axis) against SNV/indel burden (y axis) of USARC samples from WGS. Samples were categorized into three groups: mutation high, rearrangement low (mutHi-rearrLo, purple), mutation low, rearrangement high (mutLo-rearrHi, red), and mutation low, rearrangement low (mutLo-rearrLo, blue). Decision boundary is shown as a dashed line. Filled circles are individual data points; ovals are 50% probability intervals. (F) Number of samples that have ≥1 rearrangement in genomic windows of 1 Mb (top), number of samples that have chromothriptic regions overlapping genomic windows of 1 Mb (middle), and rearrangement partners of rearrangements within regions that are significantly enriched (bottom). Regions with significant enrichment (q < 0.2) are labeled. See also Figures S1–S3 and Table S1.
Figure 2
Integration of Driver Events in USARCs (A) Heatmap showing significant recurrently amplified or deleted regions (GISTIC q < 0.1). Known cancer driver genes in amplified regions are labeled in blue and those in deleted regions are labeled in red. Lengths of significant regions (Mb) are indicated above the heatmap, with –log2(q values) below. (B) SNV and indel mutational burden barplot (top) and copy-number alterations, SNVs, small indels, structural variants and promoter methylation alterations in known cancer genes (middle), with clinical and genetic covariates (bottom). Red text indicates driver genes identified by dNdSCV (q < 0.2). stab, genome stability; CC, cell cycle; tel, telomere maintenance; mTOR, mTOR signaling pathway; men, MENIN pathway; repair, DNA repair. Samples are ordered by sequencing platform, burden group, and mutational status. See also Table S2.
Figure 3
Implications of Increased Tumor Mutational Burden (A) Venn diagram of predicted pathogenic variants from mutHi samples identified from targeted sequencing (left) and identified from WGS in regions overlapping the design of the targeted baitset (right). (B) Variant allele frequency (VAF) of all variants (left) or variants only observed by targeted sequencing (right). (C) Overall survival of patients stratified by mutational burden, with a univariate Kaplan-Meier model. (D) Multivariate accelerated failure time model for progression-free survival with size of tumor, resection status, and burden group as covariates. (E) Gene set enrichment analysis for interferon gamma response (green) and antigen presentation (purple) pathways using both DNA methylation (top) and gene expression (bottom) data comparing the mutHi-rearrLo group against all others. See also Table S3.
Figure 4
Rearrangement Signatures (A) Rearrangement diversity and counts in the USARC cohort, classified by rearrangement size and rearrangement class. (B) Five rearrangement signatures identified by non-negative matrix factorization (NMF); USARC.RS1, clustered translocations (tloc, purple); USARC.RS2, small unclustered tandem duplications (TD, green), inversions (inv, red), and deletions (del, blue); USARC.RS3, large unclustered TDs, invs, and dels; USARC.RS4, large clustered TDs, invs, and dels; USARC.RS5, unclustered tlocs. x axis, strength of each rearrangement class in each signature. (C) Contribution of activities of each signature per sample (left) and cosine similarities between published breast cancer rearrangement signatures (BRCA.RS1-6) and USARC rearrangement signatures (right). See also Figure S4 and Table S4.
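The comparison in panel (C) rests on cosine similarity between signature vectors. A minimal sketch of that measure follows; the two example vectors and their four-class layout are invented for illustration and are not values from the study.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length signature vectors."""
    if len(a) != len(b):
        raise ValueError("signatures must have the same number of classes")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for an all-zero signature")
    return dot / (norm_a * norm_b)


# Hypothetical signatures over four rearrangement classes (values invented).
sig_a = [0.7, 0.1, 0.1, 0.1]
sig_b = [0.1, 0.1, 0.1, 0.7]
similarity = cosine_similarity(sig_a, sig_b)
```

Because signature weights are non-negative, the value lies between 0 (no shared classes) and 1 (identical profile up to scaling).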
Figure 5
Copy-Number Signatures (A) Seven copy-number signatures identified using NMF; amp, amplified (CN ≥ 5, orange); dup, duplicated (3 ≤ CN ≤ 4, purple); neut, neutral (CN = 2, green); del, deletion (CN = 1, blue); homdel, homozygous deletion (CN = 0, gray); het, heterozygous. x axis, strength of each copy-number class in each signature. (B) Activities of copy-number signatures (CNS) per sample, with associated proportion of the genome that shows LOH, and molecular classification groups. (C) Density plot of CNS1 activity stratified by whether the sample has one or fewer genome doubling events (WGD<2) or has two genome doubling events (WGD×2). Thickness of gray region indicates density. Small vertical lines, data points. Large vertical lines, median. (D) Density plot of CNS1 activity stratified by TP53 mutation status. (E) Scatterplot of CNS5 activity against number of chromothriptic chromosomes. Gray line indicates linear fit. (F) Scatterplot of CNS5 activity against USARC.RS1 activity. Gray line indicates linear fit. (G) Diversity estimates of CNS in TCGA sarcoma subtypes and our USARC cohort. (H) CNS5 activity stratified by tumor type in TCGA. Boxes show lower quartile, median, and upper quartile; lines denote furthest point within 1.5× the interquartile range away from the box. See also Figures S5 and S6 and Table S5.
Figure 6
LOH and Haploidization Are Frequent Events in USARCs (A) Histogram of DNA content, measured as integrated optical density (IOD) (x axis), for cell nuclei from PD26890. Proportion of genome LOH = 44%. 2c, median IOD of normal cell nuclei. (B) Histogram of DNA content for cell nuclei from PD26873. Proportion of genome LOH = 93%. (C) Proportion of samples within the USARC WGS cohort that are LOH (y axis) in sliding windows of the human genome of size 1 Mb each separated by 100 kb (x axis). Dashed line, boundary of regions with highly recurrent LOH (>0.8) or retention of heterozygosity (<0.2). Regions with retention of heterozygosity are highlighted with a horizontal black line. Regions with recurrent LOH are labeled with putative driver tumor suppressor genes in those regions.
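The windowed summary in panel (C) can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: "separated by 100 kb" is read as a 100 kb step between window starts, a sample counts as LOH in a window if any of its LOH segments overlaps that window, and all function names are hypothetical.

```python
def sliding_windows(chrom_length, size=1_000_000, step=100_000):
    """Yield half-open (start, end) windows that fit fully inside the chromosome."""
    start = 0
    while start + size <= chrom_length:
        yield (start, start + size)
        start += step


def loh_fraction(window, samples_loh_segments):
    """Fraction of samples with at least one LOH segment overlapping the window.

    samples_loh_segments: one list of (start, end) LOH segments per sample.
    Counting any overlap is an assumption made for this sketch.
    """
    win_start, win_end = window
    hits = sum(
        any(seg_start < win_end and seg_end > win_start for seg_start, seg_end in segments)
        for segments in samples_loh_segments
    )
    return hits / len(samples_loh_segments)


def classify(fraction, high=0.8, low=0.2):
    """Apply the caption's thresholds for recurrent LOH and retained heterozygosity."""
    if fraction > high:
        return "recurrent LOH"
    if fraction < low:
        return "retained heterozygosity"
    return "intermediate"


# Four hypothetical samples; two have LOH overlapping the first window.
example_samples = [
    [(500_000, 600_000)],
    [],
    [(2_000_000, 3_000_000)],
    [(900_000, 1_500_000)],
]
first_window = next(sliding_windows(5_000_000))
first_fraction = loh_fraction(first_window, example_samples)
```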
Figure 7
Timing of Genome Duplication and Driver Mutations (A) The number of WGD determined by the mode of the major allele in a sample (mode 1, diploid, 0×WGD; mode 2, tetraploid, 1×WGD; mode >2, octoploid, 2×WGD) matches inference from the spread of the samples in the proportion of LOH versus ploidy space. (B) Time of WGD in years before diagnosis, split by predominant copy-number signature in USARC cohort whole genomes (circles/squares, mean timing per sample; squares indicate samples with more than 10,000 SNVs and circles those with fewer than 10,000 SNVs; vertical colored bars, 95% confidence intervals on the mean values). (C) Time of WGD in the sarcoma cohort of TCGA, split by tumor type. (D) Relative timing of driver mutations (colored circles) and WGD events (empty/gray circles) using the mutations as a molecular clock in USARC cohort whole genomes. Vertical bars, 95% confidence intervals. Samples split by predominant copy-number signature. (E) Relative timing of driver mutations and WGD events in the sarcoma cohort of TCGA. Samples split by predominant copy-number signature, and subdivided by tumor type. Vertical bars, 95% confidence intervals. Boxes are delimited by first and third quartiles; the thick segment shows the median; and whiskers extend to the last data points within 1.5 times the box length away from the box. See also Figure S7 and Table S6.
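The rule stated in panel (A) maps the modal major-allele copy number to a WGD count. A minimal sketch is given below; weighting the mode by segment length, the input layout, and the function name are assumptions made for illustration, since the caption does not specify them.

```python
from collections import defaultdict


def wgd_count_from_major_allele(segments):
    """Infer the number of WGD events from the modal major-allele copy number.

    segments: iterable of (length_bp, major_allele_copy_number) pairs.
    The mode is weighted by segment length (an assumption of this sketch).
    """
    covered = defaultdict(int)
    for length_bp, major_cn in segments:
        covered[major_cn] += length_bp
    if not covered:
        raise ValueError("no segments supplied")
    mode = max(covered, key=lambda cn: covered[cn])
    if mode <= 1:
        return 0  # mode 1: diploid, 0xWGD
    if mode == 2:
        return 1  # mode 2: tetraploid, 1xWGD
    return 2      # mode >2: octoploid, 2xWGD


# Hypothetical genome where most sequence has a major-allele copy number of 2.
example_segments = [(10_000_000, 1), (90_000_000, 2)]
example_wgd = wgd_count_from_major_allele(example_segments)
```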
Figure 8
Evolutionary Pathways in USARCs (A) Proportion of cells within a sample with no WGD (0×WGD), one WGD (1×WGD), or two WGDs (2×WGD) using cytometric ploidy analysis, for six samples estimated to be non-WGD through WGS. (B) Representative examples of ploidy results for CNS3 (diploid) samples. Ploidy displayed as integrated optical density (x axis) and nuclear perimeter (y axis) of each nucleus. (C) Proposed pathways of USARC tumorigenesis. Driver mutations (TP53 and RB1) are early events in USARCs. Haploidization pathway: extreme anaphase mis-segregation associated with near-genome-wide haploidy, which is rescued by WGD, leading to a CNS4 pattern. Genomic loss pathway: less extreme anaphase mis-segregation generates large areas of LOH. Three signatures (CNS3, CNS2, and CNS1) are variations of this LOH pattern, differentiated from each other by subsequent WGD. Chromothripsis pathway: anaphase mis-segregation or anaphase lagging could also lead to chromosomal micronucleation. CNS5 is a signature of this process followed by WGD. Endoreduplication pathway: a tumor cell may undergo WGD with relatively few other copy-number alterations: CNS7.
