Deep learning features encode interpretable morphologies within histological images

Ali Foroughi Pour¹, Brian S White¹, Jonghanne Park¹, Todd B Sheridan¹,², Jeffrey H Chuang³,⁴

Affiliations

¹ The Jackson Laboratory for Genomic Medicine, 10 Discovery Dr., Farmington, CT, 06032, USA.
² Department of Pathology, Hartford Hospital, 80 Seymour St, Hartford, CT, 06106, USA.
³ The Jackson Laboratory for Genomic Medicine, 10 Discovery Dr., Farmington, CT, 06032, USA. jeff.chuang@jax.org.
⁴ Department of Genetics and Genome Sciences, UCONN Health, Farmington, CT, 06032, USA. jeff.chuang@jax.org.

PMID: 35676395
PMCID: PMC9177767
DOI: 10.1038/s41598-022-13541-2

Ali Foroughi Pour et al. Sci Rep. 2022 Jun 8;12(1):9428.

Abstract

Convolutional neural networks (CNNs) are revolutionizing digital pathology by enabling machine learning-based classification of a variety of phenotypes from hematoxylin and eosin (H&E) whole slide images (WSIs), but the interpretation of CNNs remains difficult. Most studies have considered interpretability in a post hoc fashion, e.g. by presenting example regions with strongly predicted class labels. However, such an approach does not explain the biological features that contribute to correct predictions. To address this problem, here we investigate the interpretability of H&E-derived CNN features (the feature weights in the final layer of a transfer-learning-based architecture). While many studies have incorporated CNN features into predictive models, there has been little empirical study of their properties. We show that such features can be construed as abstract morphological genes ("mones") with strong independent associations to biological phenotypes. Many mones are specific to individual cancer types, while others are found in multiple cancers, especially those from related tissue types. We also observe that mone-mone correlations are strong and robustly preserved across related cancers. Importantly, linear mone-based classifiers can very accurately separate 38 distinct classes (19 tumor types and their adjacent normals, AUC = [Formula: see text] for each class prediction), and linear classifiers are also highly effective for universal tumor detection (AUC = [Formula: see text]). This linearity provides evidence that individual mones or correlated mone clusters may be associated with interpretable histopathological features or other patient characteristics. In particular, the statistical similarity of mones to gene expression values allows integrative mone analysis via expression-based bioinformatics approaches. We observe strong correlations between individual mones and individual gene expression values, notably mones associated with collagen gene expression in ovarian cancer. Mone-expression comparisons also indicate that immunoglobulin expression can be identified using mones in colon adenocarcinoma and that immune activity can be identified across multiple cancer types, and we verify these findings by expert histopathological review. Our work demonstrates that mones provide a morphological H&E decomposition that can be effectively associated with diverse phenotypes, analogous to the interpretability of transcription via gene expression values. Our work also demonstrates that mones can be interpreted without using a classifier as a proxy.

Conflict of interest statement

The authors declare no competing interests.

Figures

**Figure 1**
An overview of interpretation methods in deep learning. Blue arrows denote methods that require a trained classifier, and green arrows denote methods that do not. (a) Several methods identify regions that drive the network's prediction. These masks can be generated by the network itself (e.g. spatial self-attention) or in post-processing via visualization methods such as Grad-CAM or prediction heatmaps. Heatmaps of individual mones and mone-based classifiers can be used to detect predictive regions. (b) Channel attention and sparse models, including sparse mone-based classifiers, identify subsets of features that are predictive of class labels. Differential mone analysis identifies discriminative features without training a classifier. (c) Methods in (a) and (b) are typically used to select example image regions that have high attention, affect predictions the most, or affect the value of a feature. Mone analysis can be used to (d) identify features that encode a given phenotype of interest and (e) identify the morphology a feature of interest encodes without training a classification model.

**Figure 2**
Individual mones and mone pairs encode and distinguish phenotypes. (a) Cluster map of BRCA slides using the top 100 mones differentiating the slides. 100 mones are sufficient to separate frozen normal (green) from frozen tumor (orange) slides. (b) Venn diagram of statistically significant mones differentiating tumor from adjacent normal frozen slides, comparing different statistical tests. Venn diagrams were calculated for each cancer type, and the plot shown is the average across all cancer types. On average, the statistical tests agree on 75% of the mones differentiating between tumor and normal slides. (c) Probability density function of mone 983 among frozen tumor (orange) and adjacent normal (green) BRCA slides. (d) Log-normalized scatter plot of slide-level mone 983 and Cellpose estimates of cellularity across BRCA frozen slides. (e) Example tiles from slides with extreme mone 983 values (high and low). (f) Cluster map of the mone-mone correlation matrix of LUAD tumor slides, demonstrating that many mone pairs are highly correlated. (g) Mone-mone correlation matrix of LUSC slides, with mones ordered identically to the cluster map in (f).
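The caption describes ranking mones by how well they separate tumor from adjacent normal slides. The following is a minimal sketch of one way to score a single mone without training a classifier, assuming slide-level mone values are already available as plain lists. It uses the rank-based AUC, which equals the Mann-Whitney U statistic divided by the number of tumor-normal pairs. The function names and toy values are hypothetical and are not taken from the paper; the specific tests compared in panel (b) are not listed in this record.

```python
from typing import Dict, List, Sequence, Tuple


def rank_auc(tumor: Sequence[float], normal: Sequence[float]) -> float:
    """Fraction of (tumor, normal) slide pairs in which the tumor slide has
    the higher mone value; ties count 0.5. Equals Mann-Whitney U / (n_t * n_n)."""
    wins = 0.0
    for t in tumor:
        for n in normal:
            if t > n:
                wins += 1.0
            elif t == n:
                wins += 0.5
    return wins / (len(tumor) * len(normal))


def rank_mones(
    tumor: Dict[str, List[float]], normal: Dict[str, List[float]]
) -> List[Tuple[str, float]]:
    """Sort mones by distance of their AUC from 0.5 (0.5 = no separation)."""
    scores = {m: rank_auc(tumor[m], normal[m]) for m in tumor}
    return sorted(scores.items(), key=lambda kv: abs(kv[1] - 0.5), reverse=True)


# Toy slide-level values (illustrative only, not data from the paper).
tumor_slides = {
    "mone_a": [0.9, 1.1, 1.3, 1.0],
    "mone_b": [0.5, 0.4, 0.6, 0.5],
}
normal_slides = {
    "mone_a": [0.2, 0.3, 0.1, 0.4],
    "mone_b": [0.5, 0.6, 0.4, 0.5],
}

ranking = rank_mones(tumor_slides, normal_slides)
for name, auc in ranking:
    print(f"{name}: AUC = {auc:.2f}")
```

This sketch only ranks mones. A real analysis would also need p-values and multiple-testing correction across all mones, for example via `scipy.stats.mannwhitneyu` followed by a false discovery rate adjustment.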

**Figure 3**
The joint distribution of mones reliably separates tumor and normal slides and the underlying cancer. 2D t-SNE plots of the mone-based MLDA feature space distinguishing 38 classes (19 cancers, tumor/normal status) based on (a) cancer type and (b) tumor/normal status. (c) Normalized confusion matrix of the 38-class mone-based logistic regression classifier. The color depicts the fraction of slides with a given true class that are predicted as each of the possible classes. The large diagonal values suggest the classifier has high accuracy. (d) The cross-classification AUCs of mone-based logistic regression tumor/normal classifiers trained on each cancer and applied to all cancers.
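Panel (d) describes cross-classification: a linear tumor/normal classifier is trained on one cancer and evaluated on another. The idea can be sketched as below, under strong simplifying assumptions: the two cohorts are synthetic, each slide has only two made-up mones, and the logistic regression is fit with plain batch gradient descent rather than a library solver. None of the names, parameters, or numbers come from the paper.

```python
import math
import random
from typing import List, Sequence, Tuple

Slide = Tuple[List[float], int]  # (mone vector, label: 1 = tumor, 0 = normal)


def make_cohort(rng: random.Random, n_per_class: int, shift: float) -> List[Slide]:
    """Synthetic cohort: mone 0 separates tumor from normal, mone 1 is noise.
    `shift` mimics a cohort-specific offset in mone 0."""
    slides = []
    for label in (0, 1):
        center = 2.0 if label == 1 else -2.0
        for _ in range(n_per_class):
            x = [rng.gauss(center + shift, 0.5), rng.gauss(0.0, 1.0)]
            slides.append((x, label))
    return slides


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(slides: Sequence[Slide], lr: float = 0.1, epochs: int = 200):
    """Batch gradient descent on the mean logistic loss; returns (weights, bias)."""
    dim = len(slides[0][0])
    w, b, n = [0.0] * dim, 0.0, len(slides)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, y in slides:
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            for j in range(dim):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Rank-based AUC: fraction of (tumor, normal) pairs ordered correctly."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


rng = random.Random(0)
cohort_a = make_cohort(rng, 40, shift=0.0)  # training cancer (hypothetical)
cohort_b = make_cohort(rng, 40, shift=0.3)  # held-out cancer (hypothetical)

w, b = fit_logistic(cohort_a)
scores_b = [sum(wi * xi for wi, xi in zip(w, x)) + b for x, _ in cohort_b]
cross_auc = auc(scores_b, [y for _, y in cohort_b])
print(f"AUC of classifier trained on A, applied to B: {cross_auc:.3f}")
```

Repeating this for every ordered pair of cancers would fill a matrix like the one in panel (d). A high off-diagonal AUC indicates that the tumor/normal signal learned in one cancer transfers to another.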

**Figure 4**
Mone-gene correlation analysis identifies highly correlated mone-gene clusters. Correlation matrix of (a) a cluster of highly correlated mones and collagen genes in OV, and a cluster of highly correlated mones and immune-related genes in (b) pan-GI cancers and (c) LUAD. See Supplementary Fig. 9 for adjusted p-values. Example tiles from slides with (d) high and low PC-1 in OV, (e) high and low mone 179 in COAD, and (f) high and low PC-1 in LUAD. Histopathology review identifies that mone-predicted (d) OV tiles with high PC-1 are rich in collagen, and (e) COAD tiles with high mone 179 and (f) LUAD tiles with high PC-1 have a strong lymphocyte presence. See Supplementary Figs. 10–13 for additional examples at both high and low mone values.
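A mone-gene correlation matrix of the kind shown in panels (a) to (c) can be sketched as follows, assuming each mone and each gene has one value per slide and that slides are listed in the same order in both tables. Pearson correlation is used here purely for illustration; this record does not state which correlation measure the authors used, and the mone names, gene names, and values below are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two equal-length value lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def mone_gene_correlations(
    mones: Dict[str, List[float]], genes: Dict[str, List[float]]
) -> Dict[str, Dict[str, float]]:
    """Nested dict: corr[mone][gene] = correlation across slides."""
    return {
        m: {g: pearson(mv, gv) for g, gv in genes.items()}
        for m, mv in mones.items()
    }


# Toy per-slide values for five slides (illustrative only).
mones = {
    "mone_x": [0.1, 0.4, 0.5, 0.9, 1.2],
    "mone_y": [1.0, 0.8, 0.9, 0.3, 0.2],
}
genes = {
    "GENE_A": [2.0, 3.5, 4.0, 6.0, 7.5],  # constructed as 5 * mone_x + 1.5
    "GENE_B": [5.0, 5.1, 4.9, 5.0, 5.2],
}

corr = mone_gene_correlations(mones, genes)
for m, row in corr.items():
    print(m, {g: round(r, 2) for g, r in row.items()})
```

With thousands of mones and genes, each correlation would also need a p-value adjusted for multiple testing, as the caption's reference to adjusted p-values suggests.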
