Nature. 2019 Nov;575(7781):210-216.
doi: 10.1038/s41586-019-1689-y. Epub 2019 Oct 23.

Pan-cancer whole-genome analyses of metastatic solid tumours

Peter Priestley et al. Nature. 2019 Nov.

Abstract

Metastatic cancer is a major cause of death and is associated with poor treatment efficacy. A better understanding of the characteristics of late-stage cancer is required to help adapt personalized treatments, reduce overtreatment and improve outcomes. Here we describe the largest, to our knowledge, pan-cancer study of metastatic solid tumour genomes, including whole-genome sequencing data for 2,520 pairs of tumour and normal tissue, analysed at median depths of 106× and 38×, respectively, and surveying more than 70 million somatic variants. The characteristic mutations of metastatic lesions varied widely, with mutations that reflect those of the primary tumour types, and with high rates of whole-genome duplication events (56%). Individual metastatic lesions were relatively homogeneous, with the vast majority (96%) of driver mutations being clonal and up to 80% of tumour-suppressor genes being inactivated bi-allelically by different mutational mechanisms. Although metastatic tumour genomes showed similar mutational landscape and driver genes to primary tumours, we find characteristics that could contribute to responsiveness to therapy or resistance in individual patients. We implement an approach for the review of clinically relevant associations and their potential for actionability. For 62% of patients, we identify genetic variants that may be used to stratify patients towards therapies that either have been approved or are in clinical trials. This demonstrates the importance of comprehensive genomic tumour profiling for precision medicine in cancer.

Conflict of interest statement

E.E.V. is a supervisory board member of the Hartwig Medical Foundation.

Figures

Fig. 1. Mutational load of metastatic cancer.
a, Violin plot showing age distribution of each tumour type, with twenty-fifth, fiftieth and seventy-fifth percentiles marked. b, c, Cumulative distribution function plot (individual samples were ranked independently for each variant type) of mutational load for each tumour type for SNVs and MNVs (b) and indels and SVs (c). The median for each tumour type is indicated by a horizontal bar. Dotted lines indicate the mutational loads in primary cancers from the PCAWG cohort. Only tumour types with more than ten samples are shown (n = 2,350 independent patients), and are ranked from the lowest to the highest overall SNV mutation burden (TMB). CUP, cancer of unknown primary.
Fig. 2. Copy number landscape of metastatic cancer.
a, Proportion of samples with amplification and deletion events by genomic position pan-cancer. The inner ring shows the percentage of tumours with homozygous deletion (orange), LOH and significant loss (copy number < 0.6× sample ploidy; dark blue) and near copy neutral LOH (light blue). Outer ring shows percentage of tumours with high level amplification (>3× sample ploidy; orange), moderate amplification (>2× sample ploidy; dark green) and low level amplification (>1.4× amplification; light green). The scale on both rings is 0–100% and inverted for the inner ring. The most frequently observed high-level gene amplifications (black text) and homozygous deletions (red text) are shown. b, Proportion of tumours with a WGD event (dark blue), grouped by tumour type. c, Sample ploidy distribution over the complete cohort for samples with and without WGD.
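The ploidy-relative thresholds in this caption can be sketched as a small classifier. This is a minimal illustration, not the authors' pipeline: the function name, the `loh` flag and the homozygous-deletion cutoff of 0.5 copies are assumptions (the caption gives no cutoff for homozygous deletion), and a single label is returned even though the figure draws losses and amplifications on separate rings.

```python
# Assumption: the caption gives no numeric cutoff for homozygous deletion.
HOM_DEL_MAX_COPIES = 0.5


def classify_segment(copy_number: float, sample_ploidy: float, loh: bool = False) -> str:
    """Label one genomic segment using the thresholds quoted in the Fig. 2 caption."""
    if sample_ploidy <= 0:
        raise ValueError("sample_ploidy must be positive")
    ratio = copy_number / sample_ploidy
    if copy_number < HOM_DEL_MAX_COPIES:
        return "homozygous deletion"
    # Amplification levels are checked from highest to lowest.
    if ratio > 3:
        return "high-level amplification"
    if ratio > 2:
        return "moderate amplification"
    if ratio > 1.4:
        return "low-level amplification"
    # Loss categories apply only to segments flagged as LOH.
    if loh and ratio < 0.6:
        return "LOH and significant loss"
    if loh:
        return "near copy-neutral LOH"
    return "no event"


print(classify_segment(7.0, 2.0))            # high-level amplification
print(classify_segment(1.0, 2.0, loh=True))  # LOH and significant loss
```

Expressing the thresholds relative to sample ploidy means that a genome-doubled tumour is not labelled as amplified everywhere.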
Fig. 3. The most prevalent driver genes in metastatic cancer.
a–c, The most prevalent somatically mutated oncogenes (a), TSGs (b) and germline predisposition variants (c). From left to right, the heat map shows the percentage of samples in each cancer type that are found to have each gene mutated; absolute bar chart shows the pan-cancer percentage of samples with the given gene mutated; relative bar chart shows the breakdown by type of alteration. For TSGs (b), the final bar chart shows the percentage of samples with a driver in which the gene is biallelically inactivated, and for germline predisposition variants (c), the final bar chart shows the percentage of samples with loss of wild type in the tumour.
Fig. 4. Number of drivers and types of mutation per sample by tumour type.
a, Violin plot showing the distribution of the number of drivers per sample grouped by tumour type (number of patients per tumour type is provided). Black dots indicate the mean values for each tumour type. b, Relative bar chart showing the breakdown per cancer type of the type of alteration.
Fig. 5. Clinical associations and actionability.
a, Percentage of samples in each cancer type with a putative candidate actionable mutation based on data in the CGI, CIViC and OncoKB databases. Level A represents presence of biomarkers with either an approved therapy or guidelines, and level B represents biomarkers with strong biological evidence or clinical trials that indicate that they are actionable. On-label indicates treatment registered by federal authorities for that tumour type, whereas off-label indicates a registration for other tumour types. b, Breakdown of the actionable variants by variant type.
Extended Data Fig. 1. Hartwig sample workflow, biopsy locations and sequence coverage.
a, Sample workflow from patient to high-quality WGS data. A total of 4,018 patients were enrolled in the study between April 2016 and April 2018. For 9% of patients, no blood and/or biopsy material was obtained, mostly because the patients' condition prohibited further study participation. Up to four fresh-frozen biopsies were obtained per patient, and were sequentially analysed to identify a biopsy with more than 30% tumour cellularity as determined by routine histology assessment. For 859 patients, no suitable biopsy was obtained, and 2,796 patients were further processed for WGS analysis. In total, 44 samples failed in DNA isolation or library preparation, and 29 failed raw WGS data quality control tests. For a further 385 samples, the WGS data were of good quality, but the tumour purity determined from the WGS data (PURity & PLoidy Estimator; PURPLE) was less than 20%, which made reliable and comprehensive somatic variant calling impossible; these samples were therefore excluded. Eventually, 2,338 pairs of tumour and normal tissue samples with high-quality WGS data were obtained, which were supplemented with 182 pairs from pre-April 2016, adding up to 2,520 pairs of tumour and normal samples that were included in this study. b, Breakdown of cohort by biopsy location. Tumour biopsies were taken from a broad range of locations. Primary tumour type is shown on the left, and the biopsy location on the right. c, Distribution of sample sequencing depth for tumour and blood reference samples (n = 2,520 independent samples for each category). The median for each is indicated by a horizontal bar.
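The cohort numbers in this caption can be cross-checked with a few lines of arithmetic. The sketch below uses only figures quoted in the caption; the variable names are illustrative, and the count of patients without material is derived on the assumption that the three groups (no material, no suitable biopsy, processed for WGS) partition the enrolled patients.

```python
# Figures quoted in the caption.
enrolled = 4018
no_suitable_biopsy = 859
processed_for_wgs = 2796
failed_isolation_or_library = 44
failed_raw_qc = 29
low_purity_excluded = 385
pre_april_2016_pairs = 182

# Derived (assumption: the three patient groups partition the enrolled cohort).
no_material = enrolled - no_suitable_biopsy - processed_for_wgs
share_no_material = no_material / enrolled

high_quality_pairs = (
    processed_for_wgs
    - failed_isolation_or_library
    - failed_raw_qc
    - low_purity_excluded
)
total_pairs = high_quality_pairs + pre_april_2016_pairs

print(no_material, f"{share_no_material:.1%}")  # 363 9.0%
print(high_quality_pairs, total_pairs)          # 2338 2520
```

The derived 363 patients correspond to the 9% stated in the caption, and the remaining counts reproduce the 2,338 and 2,520 totals.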
Extended Data Fig. 2. Mutational context distribution per tumour type.
a–e, Variant subtype, mutational context or signature per individual sample for each SNV (a), SNV by COSMIC signature (b), MNV (c), indel (d) or SV (e). Each column chart is ranked within tumour type by mutational load from low to high in that variant class. MNVs are classified by the dinucleotide substitution, with ‘NN’ referring to any dinucleotide combination. SVs are classified by type. DEL, deletion (with microhomology (MH), in repeats and other); DUP, tandem duplication; INV, inversion; TRL, translocation; INS, insertion. Highly characteristic known patterns can be discerned, for example the high rates of C>T SNVs, CC>TT MNVs and COSMIC S18 for skin tumours, and high rates of C>A SNVs and COSMIC S4 for lung tumours.
Extended Data Fig. 3. SNV mutational signatures.
a, Prevalence and median mutational load of fitted COSMIC SNV mutational signature per cancer type (the number of patients per category is provided). The observed distribution largely reflects the patterns observed from primary cancers. b, Box plots of relative residuals in fits per cancer type (sum of absolute difference between the fitted and actual divided by total mutational load). Boxes represent the twenty-fifth to seventy-fifth percentiles, and whiskers extend to the highest and lowest values within 1.5× the upper/lower quartile distance, with outliers shown as dots. c, Proportion of variants by 96 trinucleotide mutational context for two selected samples with high residuals and high mutational load. Top and bottom panels represent the highest outliers for breast (HMF002896) and oesophagus (HMF001562) cancers, respectively, from b. Both of these samples were previously treated with the experimental drug SYD985—a duocarmycin-based HER2-targeting antibody–drug conjugate.
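The relative residual defined in panel b (sum of absolute differences between fitted and actual counts, divided by total mutational load) can be written out directly. This is a minimal sketch: the function name and the toy counts are hypothetical, and a real fit would compare all 96 trinucleotide contexts rather than three.

```python
from typing import Sequence


def relative_residual(actual: Sequence[float], fitted: Sequence[float]) -> float:
    """Sum of |fitted - actual| over mutational contexts, divided by total load."""
    if len(actual) != len(fitted):
        raise ValueError("actual and fitted must have the same length")
    total_load = sum(actual)
    if total_load <= 0:
        raise ValueError("total mutational load must be positive")
    return sum(abs(f - a) for a, f in zip(actual, fitted)) / total_load


# Toy example with three contexts (hypothetical counts).
actual_counts = [10, 20, 70]
fitted_counts = [12, 18, 70]
print(relative_residual(actual_counts, fitted_counts))
```

A value near 0 indicates that the fitted signatures reproduce the observed spectrum; a high value marks samples that the fitted signatures explain poorly, such as the two outliers shown in panel c.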
Extended Data Fig. 4. Mutational load, genome-wide analyses and drivers.
a, Proportion of samples by cancer type classified as microsatellite instable (MSIseq score > 4). b, Proportion of samples with a high mutational burden (TMB > 10 SNVs per Mb). c–e, Scatter plots of mutational load per sample for indels versus SNVs (c), indels versus SVs (d), and SVs versus SNVs (e). MSI (MSIseq score > 4) and high TMB (>10 SNVs per Mb) thresholds are indicated. f–h, Mean mutational load versus driver rate for SNVs (f), indels (g) and SVs (h), grouped by cancer type. MSI samples were excluded.
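The two thresholds in this caption translate into a simple per-sample flagging rule. The sketch below is illustrative only: the function name and return shape are assumptions, and it takes the MSIseq score and the SNV rate per Mb as given inputs rather than computing them.

```python
MSI_SCORE_THRESHOLD = 4.0   # MSIseq score > 4, as quoted in the caption
HIGH_TMB_THRESHOLD = 10.0   # > 10 SNVs per Mb, as quoted in the caption


def flag_sample(msiseq_score: float, snvs_per_mb: float) -> dict:
    """Return MSI and high-TMB flags for one sample using strict '>' comparisons."""
    return {
        "msi": msiseq_score > MSI_SCORE_THRESHOLD,
        "high_tmb": snvs_per_mb > HIGH_TMB_THRESHOLD,
    }


print(flag_sample(msiseq_score=6.2, snvs_per_mb=45.0))
print(flag_sample(msiseq_score=0.3, snvs_per_mb=12.5))
```

Keeping the two flags separate matters because a sample can exceed the TMB threshold without being microsatellite instable.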
Extended Data Fig. 5. Effect of sequencing depth on variant calling.
a–f, Comparison of variant calling of ten randomly selected samples at normal depth and 50% downsampled (approximately 50 times, similar to the mean coverage for the PCAWG project) for purity (a), SNV counts (b), SV counts (c), ploidy (d), MNV counts (e) and indel counts (f). Decreasing coverage results in an average decrease in sensitivity of 10% for SNVs, 2% for indels, 15% for MNVs and 19% for SVs.
Extended Data Fig. 6. Effect of bioinformatic analysis pipeline on variant calling.
a–d, Comparison of observed mutational count per sample for SNVs (a), MNVs (b), indels (c) and SVs (d) on 24 patient samples analysed by the PCAWG and HMF pipelines. The PCAWG pipeline was found to have a 43% lower sensitivity for indels (which is based on a consensus calling), 18% lower for SVs (based on a different algorithm) and 6% lower for MNVs (only includes MNVs involving two nucleotides), with nearly the same sensitivity for SNVs. e, f, Cumulative distribution function plot for each tumour type (the number of independent patients per category is provided) of coverage and pipeline-adjusted mutational load for SNVs and MNVs (e) and indels and SVs (f). Mutational loads as shown in Fig. 1 were adjusted for the sensitivity effects caused by differences in sequencing depth coverage (Extended Data Fig. 5) and analysis pipeline differences (a–d). After this correction, the TMB of the primary and metastatic cohorts is much more comparable across all variant types (e, f), which indicates that technical differences do contribute to the reported mutational load differences between primary and metastatic tumours. Prostate cancer is the most notable exception, with approximately twice the TMB in all variant classes, although more subtle differences, potentially driven by biology, can also be observed for other tumour and mutation types. For cancer types that are comparable with the PCAWG cohort, the equivalent PCAWG numbers are shown by dotted lines. The median for each cohort is shown by a horizontal line.
Extended Data Fig. 7. Somatic Y chromosome loss and driver amplifications.
a, Proportion of male tumours with somatic loss of more than 50% of Y chromosome (dark blue) grouped by tumour type. b, Mean rate of amplification drivers per cancer type. c, Breakdown of the number of amplification drivers per gene by cancer type. d, Mean rate of drivers per variant type for samples with and without WGD.
Extended Data Fig. 8. Significantly mutated genes.
Tile chart showing genes found to be significantly mutated per cancer type (the number of independent patients per category is provided) and pan-cancer using dNdScv. Gene names marked in orange are also significant in a previous study, but not found in the COSMIC gene census or curated gene databases. Gene names marked in red are novel in this study. Significance (Poisson with Benjamini–Hochberg false discovery rate correction) is indicated by the intensity of shading.
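The Benjamini–Hochberg correction named in this caption can be sketched in plain Python. This is a generic textbook implementation with hypothetical p values, not the dNdScv code, and the function name is an assumption.

```python
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p values (q values) in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices, smallest p first
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonic q values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        q_values[idx] = running_min
    return q_values


# Hypothetical per-gene p values.
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

Each p value is scaled by the number of tests divided by its rank, and the running minimum keeps the adjusted values from decreasing as the raw p values increase.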
Extended Data Fig. 9. Oncogenic hotspots.
Count of driver point mutations by variant type. Known pathogenic mutations curated from external databases are categorized as hotspot mutations. Mutations within five bases of a known pathogenic mutation are shown as near hotspot, and all other mutations are shown as non-hotspot.
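The three categories in this caption amount to a distance rule around known pathogenic positions. The sketch below is a simplification under stated assumptions: mutations are matched by chromosome and position only (an actual pipeline would presumably also compare the alleles), "within five bases" is treated as inclusive, and the coordinates are made up.

```python
from typing import Iterable, Tuple

Locus = Tuple[str, int]  # (chromosome, position)


def categorize_point_mutation(
    chrom: str, pos: int, known_pathogenic: Iterable[Locus], window: int = 5
) -> str:
    """Classify a driver point mutation as hotspot, near hotspot or non-hotspot."""
    same_chrom = [p for c, p in known_pathogenic if c == chrom]
    if pos in same_chrom:
        return "hotspot"
    # Assumption: a distance of exactly `window` bases still counts as near.
    if any(abs(pos - p) <= window for p in same_chrom):
        return "near hotspot"
    return "non-hotspot"


# Hypothetical coordinates, for illustration only.
known = [("chrA", 1000), ("chrA", 5000)]
print(categorize_point_mutation("chrA", 1000, known))
print(categorize_point_mutation("chrA", 1004, known))
print(categorize_point_mutation("chrB", 1000, known))
```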
Extended Data Fig. 10. Driver co-occurrence.
a, Mutated driver gene pairs that are significantly positively (right) or negatively (left) correlated in individual tumour types (number of independent samples per tumour type is indicated in Fig. 1) sorted by q value (Fisher exact test adjusted for false discovery rate). Pairs of genes on the same chromosome that are frequently co-amplified or co-deleted by chance are excluded from positively correlated results. The 20 significant findings include previously reported co-occurrence of mutated DAXX–MEN1 in pancreatic NET (q = 7 × 10^−4), and CDH1–SPOP in prostate tumours (q = 5 × 10^−4), as well as negative associations of mutated genes within the same signal transduction pathway such as KRAS–BRAF (q = 4 × 10^−4) and KRAS–NRAS (q = 0.008) in colorectal cancer, BRAF–NRAS in skin cancer (q = 6 × 10^−12), CDKN2A–RB1 in lung cancer (q = 8 × 10^−5) and APC–CTNNB1 in colorectal cancer (q = 3 × 10^−6). APC is also strongly negatively correlated with both BRAF (q = 9 × 10^−5) and RNF43 (q = 4 × 10^−6), which together are characteristic of the serrated molecular subtype of colorectal cancers. SMAD2–SMAD3 are highly positively correlated in colorectal cancer (q = 0.02), which supports a previous report in a large cohort of colorectal cancers. In breast cancer, we found several novel relationships, including a positive relationship for GATA3–VMP1 (q = 6 × 10^−5) and FOXA1–PIK3CA (q = 3 × 10^−3), and a negative relationship for ESR1–TP53 (q = 9 × 10^−4) and GATA3–TP53 (q = 5 × 10^−5).
Extended Data Fig. 11. Subclonality of somatic variants.
a, Violin plot showing the percentage of point mutations per tumour purity bucket (the number of independent samples per category is indicated) that are subclonal in each purity bucket per sample. Black dots indicate the mean for each bucket. b, Percentage of driver point mutations that are subclonal in each purity bucket. c, Approximate somatic ploidy detection cut-off of the HMF pipeline at median 106× depth coverage for each purity bucket and for sample ploidy 2 and 4. Subclonal variants with cellular fraction less than this cut-off are unlikely to be detected by our pipeline analyses.
