Review
Nat Rev Cancer. 2022 May;22(5):298-313.
doi: 10.1038/s41568-022-00446-5. Epub 2022 Mar 2.

Cancer proteogenomics: current impact and future prospects

D R Mani et al. Nat Rev Cancer. 2022 May.

Abstract

Genomic analyses in cancer have been enormously impactful, leading to the identification of driver mutations and development of targeted therapies. But the functions of the vast majority of somatic mutations and copy number variants in tumours remain unknown, and the causes of resistance to targeted therapies and methods to overcome them are poorly defined. Recent improvements in mass spectrometry-based proteomics now enable direct examination of the consequences of genomic aberrations, providing deep and quantitative characterization of tumour tissues. Integration of proteins and their post-translational modifications with genomic, epigenomic and transcriptomic data constitutes the new field of proteogenomics, and is already leading to new biological and diagnostic knowledge with the potential to improve our understanding of malignant transformation and therapeutic outcomes. In this Review we describe recent developments in proteogenomics and key findings from the proteogenomic analysis of a wide range of cancers. Considerations relevant to the selection and use of samples for proteogenomics and the current technologies used to generate, analyse and integrate proteomic with genomic data are described. Applications of proteogenomics in translational studies and immuno-oncology are rapidly emerging, and the prospect for their full integration into therapeutic trials and clinical care seems bright.

Figures

Fig. 1 | Typical proteogenomics data.
Proteomic and genomic data types typically available in a proteogenomic study. The right-hand column shows the expected number of features that can be measured using current technology (copy number alteration (CNA) and DNA methylation start with a larger feature set; counts shown are after mapping to gene space). Often, data from tumour tissue samples are compared with normal adjacent tissue (NAT) samples. Combinations of data types are used in various biologically focused analyses. Genomic data are generated using technologies described in BOX 1. Proteomic data are generated using liquid chromatography-based mass spectrometry (LC-MS) with electrospray ionization. Post-translational modification (PTM) data are generated using deep-scale LC-MS profiling of enriched samples to achieve higher sensitivity of detection (BOX 2). Deep-scale proteomic analysis of large numbers of tumour samples now typically employs isobaric chemical labelling reagents that enable up to 18-sample multiplexing (TMTpro). Recently introduced microscaled methods allow DNA, RNA and proteomic analysis from single needle core biopsies. Proteome analysis can also be accomplished to near full depth in FFPE samples using new sample preparation methods, but limitations remain with respect to analysis of PTMs owing to the impact of the embedding process itself. In addition, methods continue to be developed that provide greater depth, accuracy and repeatability of proteomic measurements across samples, including FAIMS, SPS-MS3 and real-time data analysis. The label-free data-independent acquisition (DIA) method shows great promise for reducing the number of missing values in proteomic data across samples and data sets, and is beginning to be used in multi-omic and proteogenomic studies. Samples must be analysed individually, as DIA is currently not compatible with multiplexing strategies. miRNA, microRNA; RNA-seq, RNA sequencing; WGS, whole-genome sequencing; WXS, whole-exome sequencing.
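As a rough illustration of what 18-sample multiplexing means for study planning, the sketch below estimates how many plexes a cohort would need. It assumes one channel per plex is reserved for a common reference, which this caption does not state; the function name `plexes_needed` and its parameters are hypothetical and chosen only for illustration.

```python
import math


def plexes_needed(n_samples, channels_per_plex=18, reference_channels=1):
    """Estimate the number of multiplexed runs ("plexes") for a cohort.

    Assumption (not from the caption): `reference_channels` channels in
    every plex are set aside for a common reference, so only the
    remaining channels carry study samples.
    """
    usable = channels_per_plex - reference_channels
    if usable <= 0:
        raise ValueError("no channels left for samples")
    # Round up: a partially filled plex still has to be run.
    return math.ceil(n_samples / usable)


# Hypothetical cohort of 100 tumour and NAT samples.
print(plexes_needed(100))
```

Under these assumptions, 17 samples fit in a single plex, while an 18th sample forces a second one.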
Fig. 2 | Proteogenomic data analyses for biological insight via cloud computing.
Typical applications of multi-omic data, showing representative analyses along with expected input data. Methods listed are generally applicable to most cancer types and other proteogenomic efforts, and have been used in many of the reported studies. The cloud computing model entails global on-demand access to shared computing resources that can be rapidly provisioned for use and easily released after completion of computing activity (non-peer-reviewed report). As increasingly large data sets are generated and computationally intensive algorithms are used to analyse these data, cloud computing offers an attractive option for affordable high-performance computing for proteogenomics, with centralized data and co-located computing infrastructure and analysis tools. The National Institutes of Health (NIH) Data Commons initiative is a cloud-based initiative to disseminate harmonized genomic (Genomic Data Commons) and proteomic (Proteomic Data Commons) data, among others. The National Cancer Institute (NCI) Genomics Cloud Pilots have created user-friendly cloud-based shared computing environments. FireCloud (non-peer-reviewed preprint), which has now evolved into Terra, is one such computing platform where data, analysis methods and results are encapsulated in reproducible, shareable workspaces. Using the Terra infrastructure, platforms such as PANOPLY have been developed to automate the analysis of proteogenomic data and rapidly provide a comprehensive baseline analysis for proteogenomic studies, leading to many disease-specific hypotheses that can be explored further using additional computational and wet-laboratory experiments. CNA, copy number alteration; CT, cancer/testis; miRNA, microRNA; NAT, normal adjacent tissue; PTM, post-translational modification; SMG, significantly mutated gene.
Fig. 3 | LC-MS/MS workflow for proteomics in CPTAC studies.
Tumour tissue samples are lysed, proteins are extracted, and disulfide bonds are reduced and alkylated prior to proteolytic digestion, typically with LysC followed by trypsin. A common reference channel is created by mixing aliquots of the samples. Peptides from each sample and the reference are chemically labelled with isobaric tandem mass tag (TMT) reagents and then mixed together prior to further processing and analysis. Up to 18 distinct labels are now available, enabling up to 18 different samples to be labelled, then mixed and analysed by liquid chromatography coupled to tandem mass spectrometry (LC-MS/MS) as a single ‘plex’. Relative quantities in different samples become distinguishable after fragmentation in the mass spectrometer, which releases mass tags from labelled peptides, generating low-mass reporter ions in MS/MS spectra. Quantification is based on relative intensities or ratios of mass tags to one another or to a common reference in one of the channels. Peptide identification is based on searching MS/MS spectra against a protein sequence database. To increase sensitivity to detect and quantify peptides from low-abundance proteins as well as post-translational modifications (PTMs), the pool of TMT-labelled peptides is fractionated off-line prior to analysis by LC-MS/MS. Fractionation also reduces ratio compression caused by co-isolation and co-fragmentation of isobaric-labelled target and non-target peptides. Phosphopeptides are enriched prior to analysis using immobilized metal affinity chromatography. Acetyl and ubiquityl peptides are enriched using anti-peptide antibodies. CPTAC, Clinical Proteomic Tumor Analysis Consortium.
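The reference-based quantification described in this caption can be sketched as a log2 ratio of each sample channel's reporter ion intensity to that of the common reference channel. This is a minimal illustration and not the CPTAC pipeline: the function name, the channel labels and the intensity values are hypothetical, and real workflows add steps such as aggregation across spectra and normalization that are omitted here.

```python
import math


def log2_ratios_to_reference(reporter_intensities, reference_channel):
    """Return log2(sample / reference) for every non-reference channel.

    reporter_intensities: dict mapping channel label -> reporter ion
        intensity (None marks a missing measurement).
    reference_channel: label of the channel holding the common reference.
    Channels with a missing or non-positive intensity map to None.
    """
    ref = reporter_intensities.get(reference_channel)
    if ref is None or ref <= 0:
        raise ValueError("reference channel has no usable intensity")

    ratios = {}
    for channel, intensity in reporter_intensities.items():
        if channel == reference_channel:
            continue  # the reference is the denominator, not a sample
        if intensity is None or intensity <= 0:
            ratios[channel] = None  # keep the gap visible, do not impute
        else:
            ratios[channel] = math.log2(intensity / ref)
    return ratios


# Hypothetical reporter ion intensities for one peptide in one plex.
example = {"126": 2000.0, "127N": 500.0, "127C": None, "131": 1000.0}
ratios = log2_ratios_to_reference(example, reference_channel="131")
print(ratios)
```

Because every plex shares the same reference material, expressing each sample relative to that channel puts measurements from different plexes on a comparable scale.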
