Skip to main page content
U.S. flag

An official website of the United States government

Dot gov

The .gov means it’s official.
Federal government websites often end in .gov or .mil. Before sharing sensitive information, make sure you’re on a federal government site.

Https

The site is secure.
The https:// ensures that you are connecting to the official website and that any information you provide is encrypted and transmitted securely.

Access keys NCBI Homepage MyNCBI Homepage Main Content Main Navigation
. 2018 Mar 27;115(13):E2970-E2979.
doi: 10.1073/pnas.1717139115. Epub 2018 Mar 12.

Predicting cancer outcomes from histology and genomics using convolutional networks

Affiliations

Predicting cancer outcomes from histology and genomics using convolutional networks

Pooya Mobadersany et al. Proc Natl Acad Sci U S A. .

Abstract

Cancer histology reflects underlying molecular processes and disease progression and contains rich phenotypic information that is predictive of patient outcomes. In this study, we show a computational approach for learning patient outcomes from digital pathology images using deep learning to combine the power of adaptive machine learning algorithms with traditional survival models. We illustrate how these survival convolutional neural networks (SCNNs) can integrate information from both histology images and genomic biomarkers into a single unified framework to predict time-to-event outcomes and show prediction accuracy that surpasses the current clinical paradigm for predicting the overall survival of patients diagnosed with glioma. We use statistical sampling techniques to address challenges in learning survival from histology images, including tumor heterogeneity and the need for large training cohorts. We also provide insights into the prediction mechanisms of SCNNs, using heat map visualization to show that SCNNs recognize important structures, like microvascular proliferation, that are related to prognosis and that are used by pathologists in grading. These results highlight the emerging role of deep learning in precision medicine and suggest an expanding utility for computational analysis of histology in the future practice of pathology.

Keywords: artificial intelligence; cancer; deep learning; digital pathology; machine learning.

PubMed Disclaimer

Conflict of interest statement

Conflict of interest statement: L.A.D.C. leads a research project that is financially supported by Ventana Medical Systems, Inc. While this project is not directly related to the manuscript, it is in the general area of digital pathology.

Figures

Fig. 1.
Fig. 1.
The SCNN model. The SCNN combines deep learning CNNs with traditional survival models to learn survival-related patterns from histology images. (A) Large whole-slide images are generated by digitizing H&E-stained glass slides. (B) A web-based viewer is used to manually identify representative ROIs in the image. (C) HPFs are sampled from these regions and used to train a neural network to predict patient survival. The SCNN consists of (i) convolutional layers that learn visual patterns related to survival using convolution and pooling operations, (ii) fully connected layers that provide additional nonlinear transformations of extracted image features, and (iii) a Cox proportional hazards layer that models time-to-event data, like overall survival or time to progression. Predictions are compared with patient outcomes to adaptively train the network weights that interconnect the layers.
Fig. 2.
Fig. 2.
SCNN uses image sampling and filtering to improve the robustness of training and prediction. (A) During training, a single 256 × 256-pixel HPF is sampled from each region, producing multiple HPFs per patient. Each HPF is subjected to a series of random transformations and is then used as an independent sample to update the network weights. New HPFs are sampled at each training epoch (one training pass through all patients). (B) When predicting the outcome of a newly diagnosed patient, nine HPFs are sampled from each ROI, and a risk is predicted for each field. The median HPF risk is calculated in each region, these median risks are then sorted, and the second highest value is selected as the patient risk. This sampling and filtering framework was designed to deal with tissue heterogeneity by emulating manual histologic evaluation, where prognostication is typically based on the most malignant region observed within a heterogeneous sample. Predictions based on the highest risk and the second highest risk had equal performance on average in our experiments, but the maximum risk produced some outliers with poor prediction accuracy.
Fig. 3.
Fig. 3.
Prognostication criteria for diffuse gliomas. (A) Prognosis in the diffuse gliomas is determined by genomic classification and manual histologic grading. Diffuse gliomas are first classified into one of three molecular subtypes based on IDH1/IDH2 mutations and the codeletion of chromosomes 1p and 19q. Grade is then determined within each subtype using histologic characteristics. Subtypes with an astrocytic lineage are split by IDH mutation status, and the combination of 1p/19q codeletion and IDH mutation defines an oligodendroglioma. These lineages have histologic differences; however, histologic evaluation is not a reliable predictor of molecular subtype (37). Histologic criteria used for grading range from nuclear morphology to higher-level patterns, like necrosis or the presence of abnormal microvascular structures. (B) Comparison of the prognostic accuracy of SCNN models with that of baseline models based on molecular subtype or molecular subtype and histologic grade. Models were evaluated over 15 independent training/testing sets with randomized patient assignments and with/without training and testing sampling. (C) The risks predicted by the SCNN models correlate with both histologic grade and molecular subtype, decreasing with grade and generally trending with the clinical aggressiveness of genomic subtypes. (D) Kaplan–Meier plots comparing manual histologic grading and SCNN predictions. Risk categories (low, intermediate, high) were generated by thresholding SCNN risks. N/A, not applicable.
Fig. 4.
Fig. 4.
GSCNN models integrate genomic and imaging data for improved performance. (A) A hybrid architecture was developed to combine histology image and genomic data to make integrated predictions of patient survival. These models incorporate genomic variables as inputs to their fully connected layers. Here, we show the incorporation of genomic variables for gliomas; however, any number of genomic or proteomic measurements can be similarly used. (B) The GSCNN models significantly outperform SCNN models as well as the WHO paradigm based on genomic subtype and histologic grading.
Fig. 5.
Fig. 5.
Visualizing risk with whole-slide SCNN heat maps. We performed SCNN predictions exhaustively within whole-slide images to generate heat map overlays of the risks that SCNN associates with different histologic patterns. Red indicates relatively higher risk, and blue indicates lower risk (the scale for each slide is different). (Top) In TCGA-DB-5273, SCNN clearly and specifically predicts high risks for regions of early microvascular proliferation (region 1) and also, higher risks with increasing tumor infiltration and cell density (region 2 vs. 3). (Middle) In TCGA-S9-A7J0, SCNN can appropriately discriminate between normal cortex (region 1 in Bottom) and adjacent regions infiltrated by tumor (region 1 in Top). Highly cellular regions containing prominent microvascular structures (region 3) are again assigned higher risks than lower-density regions of tumor (region 2). Interestingly, low-density infiltrate in the cortex was associated with high risk (region 1 in Top). (Bottom) In TCGA-TM-A84G, SCNN assigns high risks to edematous regions (region 1 in Bottom) that are adjacent to tumor (region 1 in Top).

References

    1. Kong J, et al. Computer-assisted grading of neuroblastic differentiation. Arch Pathol Lab Med. 2008;132:903–904, author reply 904. - PubMed
    1. Niazi MKK, et al. Visually meaningful histopathological features for automatic grading of prostate cancer. IEEE J Biomed Health Inform. 2017;21:1027–1038. - PubMed
    1. Naik S, et al. Proceedings of the 2008 5th IEEE International Symposium on Biomedical Imaging: From Nano to Macro. IEEE; Piscataway, NJ: 2008. Automated gland and nuclei segmentation for grading of prostate and breast cancer histopathology; pp. 284–287.
    1. Ren J, et al. Computer aided analysis of prostate histopathology images Gleason grading especially for Gleason score 7. Conf Proc IEEE Eng Med Biol Soc. 2015;2015:3013–3016. - PMC - PubMed
    1. Kothari S, Phan JH, Young AN, Wang MD. Histological image classification using biologically interpretable shape-based features. BMC Med Imaging. 2013;13:9. - PMC - PubMed

Publication types

LinkOut - more resources