NAR Genom Bioinform. 2021 Mar 22;3(1):lqab015.
doi: 10.1093/nargab/lqab015. eCollection 2021 Mar.

Two-stage Cox-nnet: biologically interpretable neural-network model for prognosis prediction and its application in liver cancer survival using histopathology and transcriptomic data

Zhucheng Zhan et al. NAR Genom Bioinform.

Abstract

Pathological images are easily accessible data with potential as prognostic biomarkers. Moreover, integrating heterogeneous, multi-modal data types, such as pathological images and gene expression data, is invaluable for predicting cancer patient survival. However, the analytical challenges are significant. Here, we take hepatocellular carcinoma (HCC) pathological image features extracted by CellProfiler and use them as the input for Cox-nnet, a neural network-based prognosis prediction model. We compare this model with the conventional Cox proportional hazards (Cox-PH) model, CoxBoost, Random Survival Forests and DeepSurv, using the C-index and log-rank P-values. The results show that, on pathological image features, Cox-nnet is significantly more accurate than the Cox-PH and Random Survival Forests models and comparable with the CoxBoost and DeepSurv models. Further, to integrate pathological image and gene expression data from the same patients, we construct a two-stage Cox-nnet model and compare it with another complex neural-network model called PAGE-Net. The two-stage Cox-nnet model combining histopathology image and transcriptomic RNA-seq data achieves much better prognosis prediction, with a median C-index of 0.75 and log-rank P-value of 6e-7 in the testing datasets, compared to PAGE-Net (median C-index of 0.68 and log-rank P-value of 0.03). Imaging features provide predictive information in addition to gene expression features, as the combined model is more accurate than the model with gene expression alone (median C-index 0.70). Pathological image features are correlated with gene expression, as genes correlated with top imaging features have known associations with HCC patient survival and morphogenesis of liver tissue. This work proposes two-stage Cox-nnet, a new class of biologically relevant and interpretable models, to integrate multiple types of heterogeneous data for survival prediction.
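The C-index used above to compare models can be illustrated with a minimal sketch of Harrell's concordance index. This is not the authors' evaluation code: the function name `concordance_index`, the tie handling (a tied risk score counts as half-concordant) and the toy data are all assumptions made for illustration.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index: share of comparable patient pairs ranked correctly.

    times       -- observed follow-up times
    events      -- 1 if the event (death) was observed, 0 if censored
    risk_scores -- model output; higher means higher predicted hazard
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            # A pair is comparable only if the shorter time is an observed event.
            continue
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0   # earlier event had the higher risk
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5   # assumption: ties count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: four patients, the third one censored.
toy_times = [5, 10, 15, 20]
toy_events = [1, 1, 0, 1]
toy_risks = [0.9, 0.6, 0.7, 0.1]
toy_c_index = concordance_index(toy_times, toy_events, toy_risks)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the reported medians of 0.75, 0.70 and 0.68 should be read.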

Figures

Figure 1.
The architecture of the Cox-nnet model: a sketch of the Cox-nnet model for prognosis prediction, based on a single data type.
Figure 2.
Comparison of prognosis prediction among different models using pathology imaging data. (A) C-index results on training and testing datasets. (B–F) Kaplan–Meier survival curves on testing datasets using different methods. (B) Cox-nnet. (C) CoxBoost. (D) DeepSurv. (E) Cox-PH. (F) Random Survival Forests (RSF).
Figure 3.
The architecture of two-stage Cox-nnet. The first stage builds an individual Cox-nnet model for each data type. The second stage takes the hidden nodes of the first-stage Cox-nnet models as input and builds a new Cox-nnet model.
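The two-stage design described in this caption can be sketched as a forward pass: each first-stage model turns its own data type into hidden-node activations, and the concatenated activations become the input of the second-stage model, which is fitted with a Cox partial-likelihood loss. The sketch below is an illustration under stated assumptions, not the authors' implementation: the tanh activation, the function names, and the weights and layer sizes are all hypothetical, and training (optimisation, regularisation) is omitted.

```python
import math


def hidden_activations(x, weights, biases):
    """One hidden layer; tanh is an assumed activation for this sketch."""
    return [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(weights, biases)
    ]


def second_stage_input(image_x, gene_x, image_layer, gene_layer):
    """Concatenate the first-stage hidden nodes of both data types."""
    h_image = hidden_activations(image_x, *image_layer)
    h_gene = hidden_activations(gene_x, *gene_layer)
    return h_image + h_gene


def neg_log_partial_likelihood(times, events, log_hazards):
    """Cox negative log partial likelihood over observed events."""
    loss = 0.0
    for i, (t_i, e_i) in enumerate(zip(times, events)):
        if not e_i:
            continue  # censored patients contribute only through risk sets
        # Risk set: everyone still under observation at time t_i.
        risk_set = [math.exp(h) for t_j, h in zip(times, log_hazards) if t_j >= t_i]
        loss -= log_hazards[i] - math.log(sum(risk_set))
    return loss


# Hypothetical first-stage layers: (weights, biases) per data type.
image_layer = ([[0.5, -0.5]], [0.0])          # 2 image features -> 1 hidden node
gene_layer = ([[1.0], [0.0]], [0.0, 0.0])     # 1 gene feature  -> 2 hidden nodes
combined = second_stage_input([1.0, 1.0], [0.0], image_layer, gene_layer)
```

Here `combined` has one entry per first-stage hidden node (three in this toy setup); a second-stage model would map it to a single log hazard per patient and minimise `neg_log_partial_likelihood` over the training set.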
Figure 4.
Comparison of Cox-nnet prognosis prediction using pathology imaging, gene expression and the combination of the two data types. (A) C-index results on training and testing datasets. (B–D) Kaplan–Meier survival curves on testing datasets. (B) Cox-nnet model on imaging data only. (C) Cox-nnet model on gene expression data only. (D) Two-stage Cox-nnet model combining imaging and gene expression data.
Figure 5.
Comparison of two-stage Cox-nnet and PAGE-Net, based on combined pathological images and gene expression. (A) C-index of the two methods on training (red) and testing (blue) datasets, over 20 repetitions. (B–E) Kaplan–Meier survival curves resulting from the Cox-nnet (B and D) and PAGE-Net (C and E) models using training and testing datasets, respectively.
Figure 6.
Relationship between top imaging and gene features. Rectangle nodes are image features and circle nodes are gene features. Node sizes are proportional to importance scores from Cox-nnet. Two gene nodes are connected only if their correlation is > 0.5; an image node and a gene node are connected only if their correlation is > 0.1 (P-value < 0.05). Green nodes represent features with positive coefficients (hazard ratio > 1) in univariate Cox-PH regression, indicating worse prognosis. Blue nodes represent features with negative coefficients (hazard ratio < 1) in univariate Cox-PH regression, indicating a protective effect against poor prognosis.
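The edge rule in this caption can be sketched as a thresholded correlation graph. This is an assumption-laden illustration rather than the authors' pipeline: it uses Pearson correlation (the caption does not name the coefficient), applies the cutoffs to the signed value as written, omits the P-value < 0.05 filter, and uses hypothetical feature names and values.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def build_edges(gene_features, image_features, gene_cutoff=0.5, image_cutoff=0.1):
    """Return edges as (node, node) tuples following the caption's cutoffs.

    Assumption: the P-value filter on image-gene edges is not applied here.
    """
    edges = []
    genes = sorted(gene_features)
    # Gene-gene edges: correlation above 0.5.
    for i, g1 in enumerate(genes):
        for g2 in genes[i + 1:]:
            if pearson(gene_features[g1], gene_features[g2]) > gene_cutoff:
                edges.append((g1, g2))
    # Image-gene edges: correlation above 0.1.
    for img in sorted(image_features):
        for g in genes:
            if pearson(image_features[img], gene_features[g]) > image_cutoff:
                edges.append((img, g))
    return edges


# Hypothetical feature values across four patients.
toy_genes = {
    "geneA": [1, 2, 3, 4],
    "geneB": [2, 4, 6, 8.5],
    "geneC": [4, 1, 3, 2],
}
toy_images = {"img1": [1, 2, 3, 5]}
toy_edges = build_edges(toy_genes, toy_images)
```

In practice the P-value filter would be added with a statistics library, and node size and colour would come from the Cox-nnet importance scores and the univariate Cox-PH coefficients described in the caption.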
