Cell Rep Med. 2023 Sep 19;4(9):101173.

doi: 10.1016/j.xcrm.2023.101173. Epub 2023 Aug 14.

Deep learning integrates histopathology and proteogenomics at a pan-cancer level

Joshua M Wang¹, Runyu Hong¹, Elizabeth G Demicco², Jimin Tan³, Rossana Lazcano⁴, Andre L Moreira⁵, Yize Li⁶, Anna Calinawan⁷, Narges Razavian⁸, Tobias Schraink³, Michael A Gillette⁹, Gilbert S Omenn¹⁰, Eunkyung An¹¹, Henry Rodriguez¹¹, Aristotelis Tsirigos¹², Kelly V Ruggles¹³, Li Ding¹⁴, Ana I Robles¹¹, D R Mani¹⁵, Karin D Rodland¹⁶, Alexander J Lazar¹⁷, Wenke Liu¹⁸, David Fenyö¹⁹; Clinical Proteomic Tumor Analysis Consortium

Collaborators

Clinical Proteomic Tumor Analysis Consortium:
François Aguet, Yo Akiyama, Shankara Anand, Meenakshi Anurag, Özgün Babur, Jasmin Bavarva, Chet Birger, Michael J Birrer, Lewis C Cantley, Song Cao, Steven A Carr, Michele Ceccarelli, Daniel W Chan, Arul M Chinnaiyan, Hanbyul Cho, Shrabanti Chowdhury, Marcin P Cieslik, Karl R Clauser, Antonio Colaprico, Daniel Cui Zhou, Felipe da Veiga Leprevost, Corbin Day, Saravana M Dhanasekaran, Marcin J Domagalski, Yongchao Dou, Brian J Druker, Nathan Edwards, Matthew J Ellis, Myvizhi Esai Selvan, Steven M Foltz, Alicia Francis, Yifat Geffen, Gad Getz, Tania J Gonzalez Robles, Sara J C Gosline, Zeynep H Gümüş, David I Heiman, Tara Hiltke, Galen Hostetter, Yingwei Hu, Chen Huang, Emily Huntsman, Antonio Iavarone, Eric J Jaehnig, Scott D Jewell, Jiayi Ji, Wen Jiang, Jared L Johnson, Lizabeth Katsnelson, Karen A Ketchum, Iga Kolodziejczak, Karsten Krug, Chandan Kumar-Sinha, Jonathan T Lei, Wen-Wei Liang, Yuxing Liao, Caleb M Lindgren, Tao Liu, Weiping Ma, Fernanda Martins Rodrigues, Wilson McKerrow, Mehdi Mesri, Alexey I Nesvizhskii, Chelsea J Newton, Robert Oldroyd, Amanda G Paulovich, Samuel H Payne, Francesca Petralia, Pietro Pugliese, Boris Reva, Dmitry Rykunov, Shankha Satpathy, Sara R Savage, Eric E Schadt, Michael Schnaubelt, Stephan Schürer, Zhiao Shi, Richard D Smith, Xiaoyu Song, Yizhe Song, Vasileios Stathias, Erik P Storrs, Nadezhda V Terekhanova, Ratna R Thangudu, Mathangi Thiagarajan, Nicole Tignor, Liang-Bo Wang, Pei Wang, Ying Wang, Bo Wen, Maciej Wiznerowicz, Yige Wu, Matthew A Wyczalkowski, Lijun Yao, Tomer M Yaron, Xinpei Yi, Bing Zhang, Hui Zhang, Qing Zhang, Xu Zhang, Zhen Zhang

Affiliations

¹ Institute for Systems Genetics, NYU Grossman School of Medicine, New York, NY 10016, USA; Department of Biochemistry and Molecular Pharmacology, NYU Grossman School of Medicine, New York, NY 10016, USA.
² Department of Pathology and Laboratory Medicine, Mount Sinai Hospital and Laboratory Medicine and Pathobiology, University of Toronto, Toronto M5G 1X5, ON, Canada.
³ Institute for Systems Genetics, NYU Grossman School of Medicine, New York, NY 10016, USA; Department of Biochemistry and Molecular Pharmacology, NYU Grossman School of Medicine, New York, NY 10016, USA; Division of Precision Medicine, Department of Medicine, NYU Grossman School of Medicine, New York, NY 10016, USA.
⁴ Department of Pathology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA.
⁵ Department of Pathology, NYU Grossman School of Medicine, New York, NY 10016, USA.
⁶ Department of Medicine, Washington University in St. Louis, St. Louis, MO 63110, USA; McDonnell Genome Institute, Washington University in St. Louis, St. Louis, MO 63108, USA.
⁷ Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA; Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA.
⁸ Department of Population Health, NYU Grossman School of Medicine, New York, NY 10016, USA; Department of Radiology, NYU Grossman School of Medicine, New York, NY 10016, USA.
⁹ The Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; Massachusetts General Hospital Division of Pulmonary and Critical Care Medicine, Boston, MA 02114, USA; Harvard Medical School, Boston, MA 02115, USA.
¹⁰ Departments of Computational Medicine & Bioinformatics, Internal Medicine, Human Genetics, and School of Public Health, University of Michigan, Ann Arbor, MI 48109, USA.
¹¹ Office of Cancer Clinical Proteomics Research, National Cancer Institute, Rockville, MD 20850, USA.
¹² Department of Pathology, NYU Grossman School of Medicine, New York, NY 10016, USA; Division of Precision Medicine, Department of Medicine, NYU Grossman School of Medicine, New York, NY 10016, USA.
¹³ Institute for Systems Genetics, NYU Grossman School of Medicine, New York, NY 10016, USA; Division of Precision Medicine, Department of Medicine, NYU Grossman School of Medicine, New York, NY 10016, USA.
¹⁴ Department of Medicine and Genetics, Siteman Cancer Center, Washington University in St. Louis, St. Louis, MO 63110, USA; McDonnell Genome Institute, Washington University in St. Louis, St. Louis, MO 63108, USA.
¹⁵ The Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA.
¹⁶ Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA 99354, USA; Department of Cell, Developmental, and Cancer Biology, Oregon Health & Science University, Portland, OR 97221, USA.
¹⁷ Department of Pathology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA; Department of Genomic Medicine, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA. Electronic address: alazar@mdanderson.org.
¹⁸ Institute for Systems Genetics, NYU Grossman School of Medicine, New York, NY 10016, USA; Department of Biochemistry and Molecular Pharmacology, NYU Grossman School of Medicine, New York, NY 10016, USA. Electronic address: wenke.liu@nyulangone.org.
¹⁹ Institute for Systems Genetics, NYU Grossman School of Medicine, New York, NY 10016, USA; Department of Biochemistry and Molecular Pharmacology, NYU Grossman School of Medicine, New York, NY 10016, USA. Electronic address: david@fenyolab.org.

PMID: 37582371
PMCID: PMC10518635
DOI: 10.1016/j.xcrm.2023.101173

Joshua M Wang et al. Cell Rep Med. 2023.

Abstract

We introduce a pioneering approach that integrates pathology imaging with transcriptomics and proteomics to identify predictive histology features associated with critical clinical outcomes in cancer. We utilize 2,755 H&E-stained histopathological slides from 657 patients across 6 cancer types from CPTAC. Our models effectively recapitulate distinctions readily made by human pathologists: tumor vs. normal (AUROC = 0.995) and tissue-of-origin (AUROC = 0.979). We further investigate predictive power on tasks not normally performed from H&E alone, including TP53 prediction and pathologic stage. Importantly, we describe predictive morphologies not previously utilized in a clinical setting. The incorporation of transcriptomics and proteomics identifies pathway-level signatures and cellular processes driving predictive histology features. Model generalizability and interpretability are confirmed using TCGA. We propose a classification system for these tasks and suggest potential clinical applications for this integrated human and machine learning approach. A publicly available web-based platform implements these models.

Keywords: CPTAC; cancer imaging; cancer proteogenomics; computational pathology; molecular diagnostics.

Conflict of interest statement

Declaration of interests: The authors declare no competing interests.

Figures

**Figure 1**
Workflow, data split, and model performance (A) Overall workflow. Multi-resolution Panoptes models were trained on H&E slide images from six cancer types. Multi-CCA correlated proteomics, transcriptomics, and extracted imaging features from CNN models to reveal significant pathways and molecular signatures. (B) Per-slide level AUROCs of imaging-based prediction tasks with 95% confidence intervals.
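
As a rough illustration of the per-slide AUROC reported in (B), the sketch below averages per-tile scores into one score per slide and then computes AUROC through its rank interpretation (the probability that a positive slide outranks a negative one). The mean aggregation rule, the function names `slide_scores` and `auroc`, and the toy data are assumptions made for illustration; they are not taken from the paper.

```python
from collections import defaultdict
from statistics import mean


def slide_scores(tile_scores):
    """Average tile-level scores per slide (the aggregation rule is an assumption)."""
    by_slide = defaultdict(list)
    for slide_id, score in tile_scores:
        by_slide[slide_id].append(score)
    return {slide_id: mean(scores) for slide_id, scores in by_slide.items()}


def auroc(scores, labels):
    """AUROC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: (slide_id, predicted tumor probability for one tile)
tiles = [("s1", 0.9), ("s1", 0.8), ("s2", 0.2), ("s2", 0.4), ("s3", 0.7), ("s4", 0.1)]
labels = {"s1": 1, "s2": 0, "s3": 1, "s4": 0}

per_slide = slide_scores(tiles)
ids = sorted(per_slide)
print(auroc([per_slide[i] for i in ids], [labels[i] for i in ids]))
```

The pairwise form is quadratic in the number of slides, which is acceptable for a few thousand slides; a sort-based rank computation would scale better.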

**Figure 2**
Tissue-of-origin model performance and omics-integration (A) AUROC for each cancer type at per-slide level. (B) AUROC at per-tile level. (C) Features extracted from penultimate layer are separated with tSNE; each dot represents a tumor tile colored by tissue origin. (D) Feature extraction where each dot represents NAT tiles colored by tissue origin. (E) CCA canonical variate highlighting similarities between UCEC and LUAD samples. Line graphs represent standardized coefficients for subsets of imaging, gene, and proteome features. Each dot represents an image-proteogenomic paired sample. GO term enrichment assessed on subset of genes and proteome features with non-zero values in loading matrix. (F and G) Top and bottom images represent tiles with highest and lowest scores, respectively. Histopathology annotations reflect enriched GO terms.

**Figure 3**
Feature visualization and cross-testing of tumorigenesis models (A) Example UCEC slide with tumor tissue on left and normal tissue on right. (B) Prediction heatmap of example slide with hotter areas (red) highlighting tiles more likely to be tumor tissue. (C) CAM of example slide by tiles with hotter areas emphasizing the tumor tissue. (D–F) Feature extraction from tumorigenesis imaging model by tSNE; each dot represents a tile colored by prediction score, true label, and cancer type, respectively. (G) Example tiles of integrated saliency results highlighting accumulation of nuclei, with densest regions largely composed of stromal lymphoplasmacytic infiltrates. (H) Heatmap showing per-slide AUROCs of applying single cancer type trained models to the other cancer types.

**Figure 4**
Major canonical variates associated with tumorigenesis (A) Canonical variate with strongest correlation separating NAT/tumor samples across all six cancer types. (B) Tiles from highest-scoring regions show mitotic morphologies consistent with transcriptomic and proteomic enrichment. (C) Tiles from lowest-scoring region. (D) Second canonical variate distinguishing NAT/tumor samples. (E and F) Tile scoring parallels enriched biological processes. Tile borders indicate scores; top-scoring regions (red) match tumorigenic areas with increased glycolytic activity, and bottom-scoring (blue) areas correspond with smooth muscle and blood vessel architectures.

**Figure 5**
Model performance and multi-omics assessment of grade and stage (A) Per-slide performance of models trained on tumor grade and disease stage. Numeric predictions represent the expected value from the softmax layer ($\sum_{x=0}^{4} x \, p(x)$, where $x$ represents the grade or stage outcome). AUROC for each outcome is denoted. (B) CCA canonical variate uniquely observed in grade analysis. Tiles with highest projected values (shown by more intense red borders) reflect regions with disorganized tumor nests lacking lumen formation and glandular regions with loss of basal nuclear polarity. Paler tile borders reflect lower projected values.
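
The numeric prediction in (A) is the expected class index under the softmax output. A minimal sketch of that calculation follows, assuming five ordinal outcomes indexed 0 to 4; the logits and the helper names `softmax` and `expected_class` are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def expected_class(probs):
    """Expected value sum_x x * p(x) over ordinal classes 0..len(probs)-1."""
    return sum(x * p for x, p in enumerate(probs))


# Hypothetical logits for five ordinal outcomes (x = 0..4)
probs = softmax([0.1, 0.3, 2.0, 1.2, -0.5])
print(round(expected_class(probs), 3))
```

Unlike taking the most probable class, the expected value yields a continuous score that reflects how probability mass is spread across neighboring grades or stages.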

**Figure 6**
Performance, visualization, and feature extraction of biomarkers (A) One-tail Wilcoxon tests on prediction scores between positively and negatively labeled samples at per-tile level with significance levels. (B) Extraction and visualization of features learned by pan-cancer *TP53* mutation model with tSNE. Reference plots of prediction scores and true labels on the right. (C) Canonical variate with strongest association between image and proteogenomic features. (D) Top tiles demonstrate highly cellular disordered regions correlating with *TP53* mutated samples. (E) Bottom tiles (wild-type) highlight organized and well-differentiated regions. (F) Canonical variate correlating increased IL-1 activity with *TP53* mutated samples. (G) Wild-type samples in canonical variate no. 3 highlight densely packed but relatively preserved tissue architectures. (H) Conversely, mutated samples reside in the bottom portion and show areas of increased immune infiltrate activity.
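
The one-tailed test in (A) can be sketched as follows, assuming it is the unpaired Wilcoxon rank-sum (Mann-Whitney U) form with the alternative that positively labeled tiles score higher. The normal approximation without tie or continuity correction, the function name `rank_sum_one_tailed`, and the toy scores are simplifying assumptions; the caption does not state which variant was used.

```python
import math


def rank_sum_one_tailed(pos, neg):
    """One-tailed rank-sum test, H1: scores in `pos` tend to exceed scores in `neg`.

    Uses the normal approximation without tie or continuity correction,
    so it is only a rough guide for small or heavily tied samples.
    """
    n1, n2 = len(pos), len(neg)
    # U statistic: number of (pos, neg) pairs where pos wins, ties count 0.5
    u = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    mu = n1 * n2 / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mu) / sigma
    p_value = 0.5 * math.erfc(z / math.sqrt(2))  # upper-tail probability
    return u, p_value


# Hypothetical prediction scores for mutated (positive) and wild-type (negative) tiles
positive = [0.9, 0.8, 0.7, 0.95, 0.6]
negative = [0.2, 0.4, 0.1, 0.65, 0.3]

u_stat, p = rank_sum_one_tailed(positive, negative)
print(u_stat, p)
```

Dividing the U statistic by the number of positive-negative pairs gives the AUROC, so this test asks whether the per-tile AUROC is significantly above 0.5.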

**Figure 7**
Panoptes Web (A) App workflow. (B) Boxplot assessment of probability scores and class outcomes, and individual tile probability visualization.
