PLoS One. 2024 Sep 10;19(9):e0309380.
doi: 10.1371/journal.pone.0309380. eCollection 2024.

Exploring the interplay between colorectal cancer subtypes genomic variants and cellular morphology: A deep-learning approach

Hadar Hezi et al. PLoS One. 2024.

Abstract

Molecular subtypes of colorectal cancer (CRC) significantly influence treatment decisions. While convolutional neural networks (CNNs) have recently been introduced for automated CRC subtype identification using H&E stained histopathological images, the correlation between CRC subtype genomic variants and their corresponding cellular morphology expressed by their imaging phenotypes is yet to be fully explored. The goal of this study was to determine such correlations by incorporating genomic variants in CNN models for CRC subtype classification from H&E images. We utilized the publicly available TCGA-CRC-DX dataset, which comprises whole slide images from 360 CRC-diagnosed patients (260 for training and 100 for testing). This dataset also provides information on CRC subtype classifications and genomic variations. We trained CNN models for CRC subtype classification that account for potential correlation between genomic variations within CRC subtypes and their corresponding cellular morphology patterns. We assessed the interplay between CRC subtypes' genomic variations and cellular morphology patterns by evaluating the CRC subtype classification accuracy of the different models in a stratified 5-fold cross-validation experimental setup using the area under the ROC curve (AUROC) and average precision (AP) as the performance metrics. The CNN models that account for potential correlation between genomic variations within CRC subtypes and their cellular morphology pattern achieved superior accuracy compared to the baseline CNN classification model that does not account for genomic variations when using either single-nucleotide-polymorphism (SNP) molecular features (AUROC: 0.824±0.02 vs. 0.761±0.04, p<0.05, AP: 0.652±0.06 vs. 0.58±0.08) or CpG-Island methylation phenotype (CIMP) molecular features (AUROC: 0.834±0.01 vs. 0.787±0.03, p<0.05, AP: 0.687±0.02 vs. 0.64±0.05). 
Combining the CNN models that account for variations in CIMP and SNP further improved classification accuracy (AUROC: 0.847±0.01 vs. 0.787±0.03, p = 0.01, AP: 0.68±0.02 vs. 0.64±0.05). The improved accuracy of CNN models for CRC subtype classification that account for potential correlation between genomic variations within CRC subtypes and their corresponding cellular morphology, as expressed by H&E imaging phenotypes, may elucidate the biological cues impacting cancer histopathological imaging phenotypes. Moreover, considering CRC subtypes' genomic variations has the potential to improve the accuracy of deep-learning models in discerning cancer subtype from histopathological imaging data.
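The two performance metrics named in the abstract, AUROC and AP, can be illustrated with a minimal pure-Python sketch. This is not the authors' evaluation code: the function names are hypothetical, labels are assumed to be 1 for MSI and 0 for MSS, and tied scores in the AP computation are simply taken in sort order, which may differ slightly from library implementations.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via the pairwise (Mann-Whitney) identity."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    # Fraction of (positive, negative) pairs ranked correctly; ties count half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Mean of the precision values measured at the rank of each positive."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    total = 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            hits += 1
            total += hits / rank  # precision at this cut-off
    if hits == 0:
        raise ValueError("need at least one positive label")
    return total / hits
```

In a 5-fold setup such as the one described, each function would be called once per fold on the patient-level scores, and the per-fold values summarized as mean ± standard deviation.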

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Fig 1
Fig 1. Summary of the TCGA COAD and READ datasets application: The total cohort encompasses n = 632 patients.
Some patients were excluded for technical reasons, leaving n = 430 patients. Of these, Kather et al. [7] pre-processed and published data for n = 360 patients, splitting them into a training set and a testing set. The training set was balanced at the patch (p) level. For our research, we used stratified cross-validation folds at the patient level. The partitioning into these folds was informed by the novel sub-labels based on SNP rates and CIMP classifications.
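Patient-level stratification of the kind described in this caption can be sketched as follows. This is only an illustration under stated assumptions: the function name, the patient IDs, and the sub-label strings are hypothetical, and the authors' actual partitioning procedure is not given here. Patients, not patches, are assigned to folds, so all patches from one patient stay in the same fold.

```python
import random
from collections import defaultdict
from typing import Dict, List


def stratified_patient_folds(
    patient_labels: Dict[str, str], k: int = 5, seed: int = 0
) -> List[List[str]]:
    """Split patient IDs into k folds, keeping each sub-label spread evenly."""
    by_label = defaultdict(list)
    for pid, label in sorted(patient_labels.items()):
        by_label[label].append(pid)

    rng = random.Random(seed)
    folds: List[List[str]] = [[] for _ in range(k)]
    offset = 0  # continue the round-robin across labels to balance fold sizes
    for label in sorted(by_label):
        pids = by_label[label]
        rng.shuffle(pids)
        for j, pid in enumerate(pids):
            folds[(offset + j) % k].append(pid)
        offset += len(pids)
    return folds
```

Each fold can then serve once as the held-out split while the patches of the remaining patients are used for training.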
Fig 2
Fig 2. Experimental flow for our exploration of the interplay between CRC subtypes' genomic variants and cellular morphology.
The TCGA-CRC dataset, pre-processed by Kather et al. [7] (N = 360) is split into different sets for analysis. A baseline model is trained, and based on its results, a molecular feature analysis is performed. Based on the analysis we choose to define our data classes based on the ranges and categories of SNP, CIMP and CNV (the BP class definitions step). After the definition, we divide the classes into five stratified folds. Next, three models are trained: BP-CNNCIMP, BP-CNNSNP, and BP-CNNCNV to evaluate the interplay between genomic variations and cellular morphology. The BP-CNNCIMP, BP-CNNSNP, and BP-CNNCNV models classify the data based on CIMP, SNP, and CNV features, respectively. Based on their results, BP-CNNCIMP and BP-CNNSNP are further combined into BP-CNNCombined to incorporate the entire set of genomic variations identified as influencing cellular morphology.
Fig 3
Fig 3. Model architectures.
(a) Baseline Model Architecture: Patches are input into the Inception-Net [24] for feature extraction, with the last two layers acting as fully connected classifier layers. Outputs are propagated to a softmax layer for determining probabilities. N represents the number of patient patches, while Pi denotes the MSI probability for each patch. The MSI score for each patient, Pw, is the average of its corresponding MSI probabilities. (b) Biologically-Primed Model Architecture: Similar to the baseline model, the softmax layer outputs class probabilities at the patch level. However, the MSI probability here is calculated as the maximum value between MSI1 and MSI2 outputs. The calculation of Pw remains the same as in the baseline model.
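The two aggregation rules in this caption can be written out as a short sketch. It assumes the biologically-primed softmax output is available as a dictionary with keys "MSI1" and "MSI2" (plus any other classes); the key names and function names are illustrative and not taken from the authors' code.

```python
from statistics import mean
from typing import Dict, Sequence


def baseline_patient_score(patch_msi_probs: Sequence[float]) -> float:
    """P_w for the baseline model: the mean of the per-patch MSI probabilities P_i."""
    return mean(patch_msi_probs)


def primed_patch_msi_prob(softmax_out: Dict[str, float]) -> float:
    """Patch-level MSI probability: the larger of the MSI1 and MSI2 outputs."""
    return max(softmax_out["MSI1"], softmax_out["MSI2"])


def primed_patient_score(patch_outputs: Sequence[Dict[str, float]]) -> float:
    """P_w for the biologically-primed model: mean of the patch-level maxima."""
    return mean(primed_patch_msi_prob(out) for out in patch_outputs)
```

Taking the maximum means a patch counts as MSI-like if it resembles either MSI sub-class, while the patient-level averaging step is unchanged from the baseline.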
Fig 4
Fig 4. Our BP-CNNCombined model.
Models A and B represent biologically-primed models informed by two distinct genomic variations. The network outputs from trained and fixed models A and B are concatenated, fed into a linear layer, and then propagated to a softmax layer to yield probabilities. ‘N’ represents the number of patches for each patient, and Pi indicates the corresponding MSI probabilities for these patches. The MSI score for each patient, denoted as Pw, is derived from averaging its respective MSI probabilities.
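The forward pass of the combining head described above (concatenate, linear layer, softmax) can be sketched in plain Python. The output sizes of models A and B, the number of output classes, and the weight values are all assumptions for illustration; in practice the linear layer would be trained while A and B stay frozen.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def combined_forward(
    out_a: Sequence[float],
    out_b: Sequence[float],
    weights: Sequence[Sequence[float]],
    bias: Sequence[float],
) -> List[float]:
    """Concatenate the outputs of fixed models A and B, apply one linear layer,
    then softmax. `weights` has one row per output class."""
    features = list(out_a) + list(out_b)
    if any(len(row) != len(features) for row in weights):
        raise ValueError("each weight row must match the concatenated length")
    logits = [
        sum(w * x for w, x in zip(row, features)) + b
        for row, b in zip(weights, bias)
    ]
    return softmax(logits)
```

The per-patch probabilities returned here would then be averaged over a patient's N patches to obtain the patient-level score, as the caption states.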
Fig 5
Fig 5. Baseline model results for per-patient classification of the test set, validated over 5 folds.
Average and 95% CI curves: (a) ROC curve, (b) PR curve.
Fig 6
Fig 6. The distribution of patient-level molecular features in the test set, categorized based on the patch-level classification by the baseline model.
The x-axis indicates the classification of patches, while the y-axis denotes the molecular level determined at the patient level. Here, MSI serves as the positive class and MSS as the negative class: (a) A boxplot illustrating SNP rates for each patch. The y-axis quantifies the cumulative count of SNPs throughout the DNA sample. (b) A bar plot depicting the methylation types for each patch. The y-axis showcases the distribution of various methylation types across classification categories. (c) A boxplot highlighting the CNV rates for patches, with the y-axis measuring the proportion of the DNA sample that manifests CNV.
Fig 7
Fig 7. Average and 95% CI ROC and PR curves for per-patient classification using: (a) the BP-CNNSNP model compared to its corresponding baseline model, (b) the BP-CNNCIMP model compared to its corresponding baseline model, and (c) the BP-CNNCNV model compared to its corresponding baseline model.
Fig 8
Fig 8. Box-plot visualization of (a) AUROC results, (b) AP results and (c) F1-scores for per-patient classification, comparing the biologically primed models with their corresponding baseline model on the test set over different training sessions.
It’s worth noting that due to the stratified k-fold approach used to partition the training data across sessions, the performance of the baseline model can vary between experiments.
Fig 9
Fig 9. Confusion matrices of the patient-level predictions for the different models.
Each matrix represents an average from the test set over various training sessions. The threshold for MSI prediction is determined by the best F1 score over the folds. (a) Baseline model corresponding to the BP-CNNSNP folds. (b) Baseline model corresponding to the BP-CNNCIMP folds. (c) BP-CNNSNP model. (d) BP-CNNCIMP model.
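Choosing a decision threshold by best F1 score can be sketched as below. How the authors pooled or averaged the threshold over folds is not stated in this caption, so the sketch only covers a single set of patient-level scores; the function names are hypothetical, and labels are assumed to be 1 for MSI and 0 for MSS.

```python
from typing import Sequence


def f1_at_threshold(labels: Sequence[int], scores: Sequence[float], thr: float) -> float:
    """F1 score when patients with score >= thr are predicted MSI (label 1)."""
    tp = sum(1 for y, s in zip(labels, scores) if s >= thr and y == 1)
    fp = sum(1 for y, s in zip(labels, scores) if s >= thr and y == 0)
    fn = sum(1 for y, s in zip(labels, scores) if s < thr and y == 1)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def best_f1_threshold(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Return the observed score that maximizes F1 when used as the threshold.

    Ties are resolved in favor of the lowest candidate threshold.
    """
    candidates = sorted(set(scores))
    return max(candidates, key=lambda t: f1_at_threshold(labels, scores, t))
```

Once a threshold is fixed, the confusion matrix follows directly from counting true and false positives and negatives at that cut-off.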
Fig 10
Fig 10. Average and 95% CI ROC and PR curves for per-patient classification using the BP-CNNCombined model compared to the baseline model.
(a) ROC curve. (b) PR curve. (c), (d), and (e) compare the 5-fold AUROC, AP, and F1 results, respectively.
Fig 11
Fig 11. A histogram showcasing the MSI scores for patches from selected patients, misclassified by the baseline model but accurately classified by our proposed models.
The x-axis represents the patch MSI probabilities given by the CNN, while the y-axis denotes the count of patches, normalized to the total number of patches for each patient. The comparisons are between (a) the Baseline and BP-CNNSNP model, (b) the Baseline and BP-CNNCIMP model, and (c) the Baseline and BP-CNNCombined model.
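The per-patient normalization described for these histograms can be sketched as follows. The number of bins and the function name are assumptions; the figure's actual binning is not specified in the caption.

```python
from typing import List, Sequence


def normalized_histogram(patch_probs: Sequence[float], n_bins: int = 10) -> List[float]:
    """Histogram of patch MSI probabilities over [0, 1], with each bin count
    divided by the patient's total number of patches."""
    if not patch_probs:
        raise ValueError("patient has no patches")
    counts = [0] * n_bins
    for p in patch_probs:
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
        # A probability of exactly 1.0 falls into the last bin.
        idx = min(int(p * n_bins), n_bins - 1)
        counts[idx] += 1
    total = len(patch_probs)
    return [c / total for c in counts]
```

Normalizing by the patch count makes histograms comparable between patients whose slides yield very different numbers of patches.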
Fig 12
Fig 12. Patches of patients that were misclassified by our models.
Top row: patches of patients that were misclassified by the Baseline model and correctly classified by the BP-CNNCombined model. (a) TCGA-AA-3833, Baseline: MSS, BP-CNNCombined: MSI, reference: MSI (SNP<1200), (b) TCGA-AY-6197, Baseline: MSS, BP-CNNCombined: MSI, reference: MSI (CIMP-low), (c) TCGA-A6-2685, Baseline: MSI, BP-CNNCombined: MSS, reference: MSS, (d) TCGA-NH-A6GC, Baseline: MSI, BP-CNNCombined: MSS, reference: MSS. Bottom row: patches of patients that were misclassified by both the Baseline model and the BP-CNNCombined model. (e) TCGA-A6-2686, Baseline: MSS, BP-CNNCombined: MSS, reference: MSI, (f) TCGA-AG-A02N, Baseline: MSS, BP-CNNCombined: MSS, reference: MSI, (g) TCGA-AG-3881, Baseline: MSI, BP-CNNCombined: MSI, reference: MSS, (h) TCGA-DC-6682, Baseline: MSI, BP-CNNCombined: MSI, reference: MSS.

References

    1. Sung H, Ferlay J, Siegel RL, Laversanne M, Soerjomataram I, Jemal A, et al. Global cancer statistics 2020: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA: a cancer journal for clinicians. 2021;71(3):209–249.
    2. Hu LF, Lan HR, Huang D, Li XM, Jin KT. Personalized immunotherapy in colorectal cancers: where do we stand? Frontiers in oncology. 2021;11:769305. doi: 10.3389/fonc.2021.769305
    3. Le DT, Durham JN, Smith KN, Wang H, Bartlett BR, Aulakh LK, et al. Mismatch repair deficiency predicts response of solid tumors to PD-1 blockade. Science. 2017;357(6349):409–413. doi: 10.1126/science.aan6733
    4. Baudrin LG, Deleuze JF, How-Kit A. Molecular and computational methods for the detection of microsatellite instability in cancer. Frontiers in oncology. 2018;8:621. doi: 10.3389/fonc.2018.00621
    5. He K, Zhang X, Ren S, Sun J. Deep residual learning for image recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition; 2016. p. 770–778.