Bioinformatics. 2019 Jul 15;35(14):i446-i454. doi: 10.1093/bioinformatics/btz342.

Deep learning with multimodal representation for pancancer prognosis prediction


Anika Cheerla et al.

Abstract

Motivation: Estimating the future course of patients with cancer lesions is invaluable to physicians; however, current clinical methods fail to effectively use the vast amount of multimodal data available for cancer patients. To tackle this problem, we constructed a multimodal neural network-based model to predict the survival of patients for 20 different cancer types using clinical data, mRNA expression data, microRNA expression data and histopathology whole slide images (WSIs). We developed an unsupervised encoder to compress these four data modalities into a single feature vector for each patient, handling missing data through a resilient, multimodal dropout method. Encoding methods were tailored to each data type: deep highway networks extract features from clinical and genomic data, and convolutional neural networks extract features from WSIs.
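The authors' implementation is available at the GitHub link below; as a rough sketch of the multimodal dropout idea (our illustration, not the paper's exact code; the drop rate and tensor shapes are assumptions), one could zero out entire modality vectors at random during training and rescale the survivors:

```python
import torch

def multimodal_dropout(vecs, p_drop=0.25, training=True):
    """Randomly zero out whole modality feature vectors during training.

    vecs: list of (batch, dim) tensors, one per modality (clinical, mRNA,
    miRNA, WSI). Dropping entire modalities forces the fused patient
    representation to tolerate missing data at test time. p_drop is an
    assumed rate, not a value taken from the paper.
    """
    if not training:
        return vecs
    out = []
    for v in vecs:
        # Bernoulli mask per patient: 1 = keep this modality for this sample
        mask = (torch.rand(v.shape[0], 1, device=v.device) > p_drop).float()
        out.append(v * mask / (1.0 - p_drop))  # inverted-dropout rescaling
    return out
```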

Results: We used pancancer data to train these feature encodings and predict single-cancer and pancancer overall survival, achieving an overall C-index of 0.78. This work shows that it is possible to build a pancancer prognosis model that also predicts prognosis at single cancer sites. Furthermore, our model handles multiple data modalities, efficiently analyzes WSIs and flexibly combines patient multimodal data into an unsupervised, informative representation. We thus present a powerful automated tool to accurately determine prognosis, a key step towards personalized treatment for cancer patients.
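For readers unfamiliar with the metric, Harrell's C-index is the fraction of comparable patient pairs that the model's risk scores order correctly (1.0 = perfect ranking, 0.5 = random). A minimal, dependency-free sketch of the computation (our simplification; it ignores tied event times):

```python
def concordance_index(times, risks, events):
    """C-index: fraction of comparable pairs ranked correctly by risk.

    times: observed survival/censoring times; risks: predicted risk scores
    (higher = worse prognosis); events: 1 if death observed, 0 if censored.
    A pair (i, j) is comparable when patient i has an observed event and a
    shorter time than patient j. Tied risk scores count as half-concordant.
    """
    concordant, comparable = 0.0, 0
    for i in range(len(times)):
        for j in range(len(times)):
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    return concordant / comparable
```

For example, concordance_index([2, 5, 7], [0.9, 0.4, 0.1], [1, 1, 0]) returns 1.0, since every comparable pair is ranked correctly.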

Availability and implementation: https://github.com/gevaertlab/MultimodalPrognosis.


Figures

Fig. 1. Kaplan–Meier survival curves for all cancer sites in TCGA, demonstrating that overall survival is tissue specific. The first graph contains the 10 cancers with the highest mean overall survival; the second graph contains the 10 cancers with the lowest mean overall survival.
Fig. 2. Structure of the unsupervised model: the similarity loss can be visualized as projecting representations of different modalities into the same space. Each modality uses a different network architecture. For the clinical data, we use FC layers with sigmoid activations; for the genomic data, we use deep highway networks (Srivastava et al., 2015); and for the WSI images, we use the SqueezeNet architecture (Iandola et al., 2016) (see main text for architecture details). These architectures generate feature vectors that are then aggregated into a single representation and used to predict overall survival.
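The caption's similarity loss can be made concrete with a small sketch (our assumed form of the loss; the paper's exact formulation may differ): penalize low cosine similarity between every pair of modality embeddings of the same patient, pulling all modality views into a shared space.

```python
import torch.nn.functional as F

def similarity_loss(embeddings):
    """Encourage the per-modality embeddings of each patient to coincide.

    embeddings: list of (batch, 512) tensors, one per modality. For every
    pair of modalities we penalize (1 - cosine similarity) between the two
    views of the same patient, projecting all modalities into one space.
    """
    loss, pairs = 0.0, 0
    for a in range(len(embeddings)):
        for b in range(a + 1, len(embeddings)):
            cos = F.cosine_similarity(embeddings[a], embeddings[b], dim=1)
            loss = loss + (1.0 - cos).mean()
            pairs += 1
    return loss / pairs
```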
Fig. 3. The SqueezeNet model architecture. The SqueezeNet architecture consists of a set of fire modules interspersed with maxpool layers. Each fire module consists of a squeeze layer (with 1 × 1 convolution filters) and an expand layer (with a mix of 1 × 1 and 3 × 3 convolution filters). This fire module architecture helps reduce the parameter space for faster training. We replaced the final softmax layer of the original SqueezeNet model with the 512-length feature encoding predictor.
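The fire module described in the caption is small enough to write out; the sketch below follows the published SqueezeNet design (channel counts are constructor arguments; the 512-length encoding head that replaces the softmax is not shown):

```python
import torch
import torch.nn as nn

class Fire(nn.Module):
    """SqueezeNet fire module: a 1x1 'squeeze' convolution followed by
    parallel 1x1 and 3x3 'expand' convolutions whose outputs are
    concatenated along the channel dimension (Iandola et al., 2016)."""

    def __init__(self, in_ch, squeeze_ch, expand1x1_ch, expand3x3_ch):
        super().__init__()
        self.squeeze = nn.Conv2d(in_ch, squeeze_ch, kernel_size=1)
        self.expand1x1 = nn.Conv2d(squeeze_ch, expand1x1_ch, kernel_size=1)
        self.expand3x3 = nn.Conv2d(squeeze_ch, expand3x3_ch,
                                   kernel_size=3, padding=1)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        x = self.relu(self.squeeze(x))  # bottleneck shrinks channel count
        return torch.cat([self.relu(self.expand1x1(x)),
                          self.relu(self.expand3x3(x))], dim=1)
```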
Fig. 4. t-SNE-mapped representations of feature vectors for 500 patients within the testing set. The 512-length feature vectors were compressed using PCA (50 features) and t-SNE into 2D space. These representations capture relationships between patients: e.g. patients of the same sex were generally clustered together (left image) and, to a lesser extent, patients of the same race and the same cancer type tended to cluster as well (center and right), even when those clinical features were not provided to the model.
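The projection described in the caption is reproducible with standard tooling; a minimal sketch using scikit-learn (the feature array here is a random placeholder for the real patient encodings):

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE

features = np.random.rand(500, 512)  # placeholder: 500 patients x 512-dim encodings

reduced = PCA(n_components=50).fit_transform(features)  # 512 -> 50, as in the caption
coords = TSNE(n_components=2).fit_transform(reduced)    # 50 -> 2D for plotting
```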
Fig. 5. Evaluation of multimodal dropout: learning curve in terms of the C-index of the model on the validation dataset for predicting prognosis across 20 cancer sites combining multimodal data. The model converges after 40 epochs and shows that multimodal dropout improves validation performance.

References

    1. Alizadeh A.A. et al. (2015) Toward understanding and exploiting tumor heterogeneity. Nat. Med., 21, 846–853.
    2. Beck A.H. et al. (2011) Systematic analysis of breast cancer morphology uncovers stromal features associated with survival. Sci. Transl. Med., 3, 108ra113.
    3. Bejnordi B.E. et al. (2017) Deep learning-based assessment of tumor-associated stroma for diagnosing breast cancer in histopathology images. In: IEEE 14th International Symposium on Biomedical Imaging (ISBI 2017), pp. 929–932. IEEE, Melbourne, Australia.
    4. Calin G.A., Croce C.M. (2006) MicroRNA signatures in human cancers. Nat. Rev. Cancer, 6, 857.
    5. Campbell J.D. et al. (2018) Genomic, pathway network, and immunologic features distinguishing squamous carcinomas. Cell Rep., 23, 194.
