[Preprint]. 2024 Mar 21:2024.02.26.582106.
doi: 10.1101/2024.02.26.582106.

Self-Supervised Learning Reveals Clinically Relevant Histomorphological Patterns for Therapeutic Strategies in Colon Cancer


Bojing Liu et al. bioRxiv. 2024.


Abstract

Self-supervised learning (SSL) automates the extraction and interpretation of histopathology features on unannotated hematoxylin-and-eosin-stained whole-slide images (WSIs). We trained an SSL Barlow Twins encoder on 435 TCGA colon adenocarcinoma WSIs to extract features from small image patches. Leiden community detection then grouped tiles into histomorphological phenotype clusters (HPCs). HPC reproducibility and predictive ability for overall survival were confirmed in an independent clinical trial cohort (N=1213 WSIs). This unbiased atlas yielded 47 HPCs displaying unique and shared clinically significant histomorphological traits, highlighting tissue type, quantity, and architecture, especially in the context of tumor stroma. Through in-depth analysis of these HPCs, including immune landscape and gene set enrichment analysis, and their association with clinical outcomes, we shed light on the factors influencing survival and responses to treatments such as standard adjuvant chemotherapy and experimental therapies. Further exploration of HPCs may unveil new insights and aid decision-making and personalized treatments for colon cancer patients.

Keywords: Bevacizumab; Colon cancer; Histopathology; Overall Survival; Self-supervised learning; Tumor microenvironment.


Conflict of interest statement

Declaration of interests AT is a co-founder of Imagenomix. The remaining authors declare no competing interests.

Figures

Figure 1. Overview of the Model Architecture: Training Barlow Twins and deriving Histomorphological Phenotype Clusters.
(a) Training Barlow Twins with TCGA. WSIs from TCGA were processed to extract image tiles and normalize stain colors. The Barlow Twins network was employed to learn 128-dimensional z vectors from these image tiles. (b) Deriving HPCs. The tiles from TCGA were projected into z vector representations obtained from the trained Barlow Twins network. HPCs were defined by applying Leiden community detection to the nearest neighbor graph of z tile vector representations. Each WSI was represented by a compositional vector of the derived HPCs, indicating the percentage of each HPC with respect to the total tissue area. The Barlow Twins model and HPCs were then projected and integrated into the external AVANT trial. (c) Whole Slide Image Representation. The compositional HPC data represented the WSIs in the study. AVANT, Bevacizumab-Avastin® adjuVANT trial. HPC, histomorphological phenotype cluster. TCGA, The Cancer Genome Atlas. WSI, whole slide image.
Figure 2. Identification of HPCs in TCGA and subsequent classification into super-clusters.
(a) UMAP showing 47 HPCs identified from the TCGA dataset, each point representing an image tile. (b) PAGA plot of HPCs. Each node represents an HPC, with edges representing connections between HPCs based on their vector representation similarity. The pie chart at each node represents the tissue composition of that HPC. (c) Grouping of HPCs into super-clusters according to histopathology tissue similarities. Representative tiles for each HPC were labeled with ID and a brief description. HPC, histomorphological phenotype cluster. PAGA, partition-based graph abstraction. TCGA, The Cancer Genome Atlas. UMAP, uniform manifold approximation and projection.
Figure 3. Verification of HPCs in the TCGA training set and the external AVANT trial.
(a-i) Example tiles from TCGA (upper row) and the external clinical AVANT trial (lower row) showcase the eight super-clusters with a zoomed-in representative tile. Notably, the muscle tissue super-cluster is further divided into longitudinal and axial subgroups. (j,k) Stacked bar plots illustrate instances of misclassification for each HPC in the TCGA training set and the AVANT external test set. Green bars represent the percentage of correctly identified odd clusters, yellow bars indicate misclassifications within the tested HPC’s super-cluster, and orange bars show misclassifications outside the super-cluster. HPC, histomorphological phenotype cluster. TCGA, The Cancer Genome Atlas. AVANT, Bevacizumab-Avastin® adjuVANT trial.
Figure 4. HPC-based classifier was associated with OS in patients treated with standard-of-care and AVANT-experimental treatment.
(a) Ordinary Cox regression for OS, incorporating the HPC-based risk classifier, along with sex, age categories, tumor-stroma ratio, and AJCC TNM staging, was conducted within the external test set of the AVANT control group. The HPC model-based classifier stands as an independent prognostic factor (HR 2.50, 95% CI 1.18-5.31) for OS. (b) Ordinary Cox regression for OS, incorporating the HPC-based risk classifier, along with sex, age categories, tumor-stroma ratio, and AJCC TNM staging, was conducted within the AVANT experimental group. The HPC model-based classifier stands as an independent prognostic factor (HR 1.82, 95% CI 1.11-2.99) for OS. (c and d) The SHAP summary plots depict the relationship between the center-log-transformed compositional value of an HPC and its impact on death hazard prediction. The color bar indicates the relative compositional value of an HPC, with red indicating higher and blue indicating lower composition. Higher compositions of the top 10 HPCs were associated with worse OS, while higher compositions of the bottom 10 HPCs were linked to improved OS. AJCC TNM, American Joint Committee on Cancer tumor-node-metastasis classification. AVANT, Bevacizumab-Avastin® adjuVANT trial. HPC, histomorphological phenotype cluster. OS, overall survival. SHAP, SHapley Additive exPlanations. TCGA, The Cancer Genome Atlas.
Figure 5. PAGA plots highlighting important HPCs related to OS in the standard-of-care and experimental treatment groups.
(a) Standard treated group: HPCs colored in red are linked to worse survival and HPCs colored in blue are linked to better survival. (b) AVANT-experimental treated group: HPCs colored in red are linked to worse survival and HPCs colored in blue are linked to better survival. AVANT, Bevacizumab-Avastin® adjuVANT trial. HPC, histomorphological phenotype cluster. OS, overall survival. PAGA, partition-based graph abstraction.
Figure 6. Survival-associated HPCs in relation to immune and genetic profile.
(a) Standard-of-care group: Spearman’s correlations between top 20 OS-related HPCs and immune landscape features. HPCs (columns of the matrix) were colored according to the beta-coefficients estimated from the optimized regularised Cox regression, with red indicating HPCs related to worse survival and green indicating HPCs related to better survival. The color bar at the upper left corner indicates the value of correlation coefficients with red denoting positive and blue denoting negative correlations. (b) AVANT-experimental treated group: Spearman’s correlations between top 20 OS-related HPCs and immune landscape features. (c) Standard-of-care group GSEA between the top OS-related HPCs and major cancer hallmark pathways. HPCs (columns of the matrix) were colored according to the beta-coefficients estimated from the optimized regularised Cox regression, with red indicating HPCs related to worse survival and green indicating HPCs related to better survival. The color bar at the upper left corner indicates the value of the correlation coefficients with red denoting enrichment and blue denoting underrepresentation in a gene pathway. (d) AVANT-experimental treated group GSEA for the top 20 OS-related HPCs. AVANT, Bevacizumab-Avastin® adjuVANT trial. GSEA, gene set enrichment analysis. HPC, histomorphological phenotype cluster. OS, overall survival.
Figure 7. Clinical application of AI-derived HPCs in prediction of patient outcomes.
The clinical algorithm consists of three key stages: data preparation, cancer patient characterization, and AI-supported multidisciplinary treatment meetings. Data preparation involves collecting histopathology WSIs and segmenting them into small image tiles. Patient characterization encompasses SSL model training, yielding HPCs via clustering. HPCs are easily interpretable by pathologists and linkable to omic data. Most importantly, HPCs are valuable for predicting diagnosis, patient outcomes, and treatment responses. In treatment-related outcomes, AI-predicted high/low risk groups aid multidisciplinary meetings, enabling personalized treatment plans by oncologists, pathologists, and physicians. AI, artificial intelligence. HPC, histomorphological phenotype cluster. SSL, self-supervised learning. WSI, whole slide image.
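
The slide-level representation described in the Figure 1 caption (each WSI as a compositional vector of HPCs) and the center-log transform mentioned in the Figure 4 caption can be illustrated with a minimal Python sketch. This is not the authors' code. It assumes equal-sized tiles, so that the fraction of tiles per HPC approximates the fraction of tissue area; the function names, the toy labels, and the pseudocount used to avoid taking the log of zero are all hypothetical choices made for illustration.

```python
import numpy as np


def hpc_composition(tile_labels, n_hpcs):
    """Fraction of a slide's tiles assigned to each HPC (entries sum to 1).

    Assumption: tiles are equal in size, so tile fractions stand in
    for tissue-area fractions.
    """
    counts = np.bincount(np.asarray(tile_labels), minlength=n_hpcs).astype(float)
    return counts / counts.sum()


def center_log_ratio(composition, pseudocount=1e-6):
    """Centered log-ratio transform of a compositional vector.

    The pseudocount is an illustrative assumption that keeps log()
    finite for HPCs absent from a slide.
    """
    x = np.asarray(composition, dtype=float) + pseudocount
    log_x = np.log(x)
    return log_x - log_x.mean()


# Hypothetical slide: 8 tiles, each assigned to one of 4 HPCs (IDs 0..3).
labels = [0, 0, 1, 3, 3, 3, 3, 1]
comp = hpc_composition(labels, n_hpcs=4)
clr = center_log_ratio(comp)
```

The transformed vector is centered at zero, so each entry expresses how abundant an HPC is relative to the slide's geometric mean composition; vectors of this kind are what a downstream survival model would take as input.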

