Front Oncol. 2022 Jun 23:12:893424.
doi: 10.3389/fonc.2022.893424. eCollection 2022.

Deep Learning-Based Multi-Omics Integration Robustly Predicts Relapse in Prostate Cancer

Ziwei Wei et al. Front Oncol. 2022.

Abstract

Objective: Post-operative biochemical relapse (BCR) continues to occur in a significant percentage of patients with localized prostate cancer (PCa). Current stratification methods do not adequately identify high-risk patients. The present study uses deep learning (DL) algorithms, implemented with the H2O package, to integrate multi-omics data and address this problem.

Methods: Data from five omics layers for 417 PCa patients from The Cancer Genome Atlas (TCGA) were used to construct the DL-based, relapse-sensitive model. Among these patients, 265 (63.5%) experienced BCR. Five additional independent validation sets were used to assess the model's predictive robustness. Bioinformatics analyses of the two relapse-associated subgroups were then performed: identification of differentially expressed genes (DEGs), pathway enrichment analysis, copy number analysis, and immune cell infiltration analysis.

Results: The DL-based model, which showed a significant difference between the two subgroups (P = 6e-9) and a good concordance index (C-index = 0.767), was shown to be robust by external validation. A total of 1530 DEGs, including 678 up-regulated and 852 down-regulated genes, were identified in the high-risk subgroup S2 compared with the low-risk subgroup S1. Enrichment analyses found that five hallmark gene sets were up-regulated and 13 were down-regulated. DNA damage repair pathways were significantly enriched in the S2 subgroup. CNV analysis showed that 30.18% of genes were significantly up-regulated and that gene amplification on chromosomes 7 and 8 was significantly elevated in the S2 subgroup. Moreover, enrichment analysis revealed that some DEGs and pathways were associated with immunity. Three tumor-infiltrating immune cell (TIIC) groups with a higher proportion in the S2 subgroup (P = 1e-05, P = 8.7e-06, P = 0.00014) and one TIIC group with a higher proportion in the S1 subgroup (P = 1.3e-06) were identified.
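The concordance index reported above measures how often a model ranks patients correctly: among pairs of patients whose relapse order is known, it is the fraction in which the patient who relapsed earlier was also assigned the higher risk. The following is a minimal sketch of Harrell's C-index in plain Python, given only as an illustration of the metric. The function name and the toy data are assumptions and are not taken from the study, which used its own software pipeline.

```python
from typing import Sequence


def concordance_index(
    times: Sequence[float],
    events: Sequence[int],
    risk_scores: Sequence[float],
) -> float:
    """Harrell's C-index for right-censored data (illustrative sketch).

    times       -- follow-up time per patient
    events      -- 1 if relapse was observed, 0 if censored
    risk_scores -- higher score means higher predicted relapse risk
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot be the "earlier relapse" of a pair
        for j in range(n):
            # A pair is comparable when patient i relapsed strictly
            # before patient j's last follow-up time.
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: four patients, the third one censored.
toy_times = [2.0, 4.0, 6.0, 8.0]
toy_events = [1, 1, 0, 1]
toy_risk = [0.9, 0.3, 0.2, 0.4]
toy_c_index = concordance_index(toy_times, toy_events, toy_risk)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which gives a scale for reading the reported 0.767.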

Conclusion: We developed a novel, robust classification for understanding PCa relapse. This study validated the effectiveness of deep learning techniques in prognosis prediction, and the method may benefit patients and help prevent relapse by improving early detection and enabling earlier intervention.

Keywords: H2O package; autoencoder; deep learning; multi-omics; prostate cancer; relapse prediction.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

Figure 1
Overall workflow. First, mRNA, miRNA, DNA methylation, CNV, and lncRNA features from the TCGA PCa cohort were stacked as input features for the autoencoder, a deep-learning method. Each of the transformed features in the bottleneck layer of the autoencoder was then subjected to a univariate Cox-PH model to select the features associated with relapse. K-means clustering was applied to the relapse-associated deep features, and 10-fold cross-validation was used to evaluate the C-index for the clusters and relapse across 8 DL models. Of the two models with the highest C-index, the one with the better discriminative ability on Kaplan-Meier plots (model_3) was selected. The Lasso method was then used to select relapse-associated features from the TCGA data (mRNA, miRNA, DNA methylation, CNVs, and lncRNA) according to the model_3 subgroup labels. The resulting Lasso model was evaluated on five external validation sets from GEO. Finally, functional analysis was performed to characterize the differences between the two relapse-associated subgroups.
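The clustering step in this workflow can be illustrated with a small example. The sketch below implements plain K-means (Lloyd's algorithm) in Python with k = 2, mirroring the split into two relapse-associated subgroups. The function name, the deterministic initialization from the first k points, and the toy "deep feature" values are all assumptions made for illustration; the caption does not state the settings used in the study.

```python
import math
from typing import List, Sequence, Tuple


def kmeans(
    points: Sequence[Sequence[float]], k: int = 2, max_iter: int = 100
) -> Tuple[List[int], List[List[float]]]:
    """Cluster points into k groups with Lloyd's algorithm (illustrative sketch)."""
    if len(points) < k:
        raise ValueError("need at least k points")
    # Assumption: initialize centroids from the first k points so the
    # result is deterministic; real pipelines usually use random restarts.
    centroids = [list(p) for p in points[:k]]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the algorithm converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical two-dimensional "deep features" for six patients.
toy_features = [
    (0.0, 0.1), (5.0, 5.1), (0.2, 0.0),
    (5.2, 4.9), (0.1, 0.2), (4.9, 5.0),
]
toy_labels, toy_centroids = kmeans(toy_features, k=2)
```

In the workflow described above, the resulting cluster labels play the role of the subgroup labels (S1 and S2) that later steps use for survival comparison and for training the Lasso model.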
Figure 2
Significant survival differences for model_3 and five external validation sets. Relapse-related deep features of model_3 were used for subgrouping, and Kaplan-Meier (KM) plots show the difference in relapse between the two subgroups. The Lasso model constructed according to model_3 was validated in each of the five external validation sets. (A) KM plot of model_3 (log-rank P-value = 6e-09; half of the patients had relapsed by about 3.5 years). (B) GSE70768 validation set (mRNA, N = 111, log-rank P-value = 4.46e-07). (C) GSE26367 validation set (miRNA, N = 149, log-rank P-value = 0.000319447). (D) GSE26126 validation set (DNA methylation, N = 85, log-rank P-value = 0.003265681). (E) GSE21035 validation set (CNVs, N = 198, log-rank P-value = 0). (F) Re-annotated GSE70768 validation set (mRNA, N = 111, log-rank P-value = 0.017250485).
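The log-rank P-values in these panels compare the relapse curves of two subgroups. As a rough illustration of how such a value is obtained, the sketch below computes the two-group log-rank statistic in plain Python and converts it to a P-value using the chi-square distribution with one degree of freedom. The function name and toy data are assumptions for illustration only and do not reproduce the software or data used for the figure.

```python
import math
from typing import Sequence, Tuple


def logrank_test(
    times_a: Sequence[float],
    events_a: Sequence[int],
    times_b: Sequence[float],
    events_b: Sequence[int],
) -> Tuple[float, float]:
    """Two-group log-rank test (illustrative sketch).

    Returns (chi-square statistic, P-value). An event flag of 1 means
    relapse was observed; 0 means the patient was censored.
    """
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e})

    observed_minus_expected = 0.0
    variance = 0.0
    for t in event_times:
        at_risk = sum(1 for ti, _, _ in data if ti >= t)
        at_risk_a = sum(1 for ti, _, g in data if ti >= t and g == 0)
        deaths = sum(1 for ti, e, _ in data if ti == t and e)
        deaths_a = sum(1 for ti, e, g in data if ti == t and e and g == 0)
        frac_a = at_risk_a / at_risk
        # Observed events in group A minus the number expected if both
        # groups shared the same hazard.
        observed_minus_expected += deaths_a - deaths * frac_a
        if at_risk > 1:
            # Hypergeometric variance of the group A event count.
            variance += (
                deaths * frac_a * (1 - frac_a) * (at_risk - deaths) / (at_risk - 1)
            )
    if variance == 0:
        raise ValueError("variance is zero; the groups cannot be compared")
    chi2 = observed_minus_expected ** 2 / variance
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical toy data: group A relapses early, group B relapses late.
chi2_toy, p_toy = logrank_test(
    [1, 2, 3, 4, 5], [1, 1, 1, 1, 1],
    [6, 7, 8, 9, 10], [1, 1, 1, 1, 1],
)
```

Smaller P-values indicate that the separation between the two KM curves is less likely to arise by chance if both subgroups had the same relapse hazard.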
Figure 3
Differentially expressed genes (DEGs) in the two subgroups from the TCGA PCa samples. (A) Differential expression: S2 vs S1, S1: a low relapse-risk subgroup of PCa, S2: a high relapse-risk subgroup of PCa. (B) Volcano plot of DEGs.
Figure 4
GO and KEGG enrichment of upregulated and downregulated genes. (A) GO enrichment analysis of upregulated genes. (B) KEGG enrichment analysis of upregulated genes. (C) GO enrichment analysis of downregulated genes. (D) KEGG enrichment analysis of downregulated genes.
Figure 5
GSEA enrichment analysis in Hallmarks and KEGG (S2 vs S1). (A) The top five upregulated hallmarks. (B) The top five downregulated hallmarks. (C) The top five upregulated pathways in KEGG. (D) The top five downregulated pathways in KEGG.
Figure 6
GSVA enrichment analysis in Hallmarks. (A) Heatmap. (B) Bar chart of the -log(p) values of the GSVA scores (S2 vs S1).
Figure 7
CNVs difference analysis between S1 and S2. (A) CNVs difference analysis by Wilcoxon. (B) Hierarchical clustering. Red indicates amplification, whereas blue indicates deletion. (C–H) Chromosomal distribution of copy number by GISTIC. Red indicates amplification, whereas blue indicates deletion.
Figure 8
Functional analysis of CNV differential genes. (A) GO enrichment analysis of CNV differential genes. (B) Venn diagrams of CNV differential genes and expression differential genes. (C) GO enrichment analysis of upregulated CNV genes. (D) GO enrichment analysis of downregulated CNV genes.
Figure 9
Immune infiltration analysis between S1 and S2. (A) Heatmap of the 22 tumor-infiltrating immune cell (TIIC) types in the two subgroups. (B–E) Relative proportions of the four TIIC types that differed between the two subgroups.
