Sci Rep. 2020 Nov 3;10(1):18951.
doi: 10.1038/s41598-020-76025-1.

Prediction of survival and recurrence in patients with pancreatic cancer by integrating multi-omics data

Bin Baek et al. Sci Rep.

Abstract

Predicting the prognosis of pancreatic cancer is important because of the very low survival rates of patients with this particular cancer. Although several studies have used microRNA and gene expression profiles and clinical data, as well as images of tissues and cells, to predict cancer survival and recurrence, the accuracies of these approaches in the prediction of high-risk pancreatic adenocarcinoma (PAAD) still need to be improved. Accordingly, in this study, we proposed two biological features based on multi-omics datasets to predict survival and recurrence among patients with PAAD. First, the clonal expansion of cancer cells with somatic mutations was used to predict prognosis. Using whole-exome sequencing data from 134 patients with PAAD from The Cancer Genome Atlas (TCGA), we found five candidate genes that were mutated in the early stages of tumorigenesis with high cellular prevalence (CP). CDKN2A, TP53, TTN, KCNJ18, and KRAS had the highest CP values among the patients with PAAD, and survival and recurrence rates were significantly different between the patients harboring mutations in these candidate genes and those harboring mutations in other genes (p = 2.39E-03, p = 8.47E-04, respectively). Second, we generated an autoencoder to integrate the RNA sequencing, microRNA sequencing, and DNA methylation data from 134 patients with PAAD from TCGA. The autoencoder robustly reduced the dimensions of these multi-omics data, and the K-means clustering method was then used to cluster the patients into two subgroups. The subgroups of patients had significant differences in survival and recurrence (p = 1.41E-03, p = 4.43E-04, respectively). Finally, we developed a prediction model for prognosis using these two biological features and clinical data. 
When support vector machines, random forest, logistic regression, and L2-regularized logistic regression were compared as prediction models, logistic regression generally performed best for both disease-free survival (DFS) and overall survival (OS) (accuracy [ACC] = 0.762 and area under the curve [AUC] = 0.795 for DFS; ACC = 0.776 and AUC = 0.769 for OS). Thus, we could identify patients with a high probability of recurrence and a high risk of poor outcomes. Our study provides insights into new personalized therapies on the basis of mutation status and multi-omics data.
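The subgrouping step described above (K-means clustering of patients into two groups on a reduced representation) can be illustrated with a minimal sketch. This is not the authors' pipeline: the `latent` vectors below are made-up toy values standing in for autoencoder output, and the `kmeans` helper, its parameters, and the seed are all hypothetical names chosen for illustration. Only the Python standard library is used.

```python
import random
from typing import List, Sequence, Tuple

Point = Sequence[float]


def squared_distance(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_point(points: List[Point]) -> List[float]:
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def kmeans(points: List[Point], k: int = 2, n_iter: int = 100,
           seed: int = 0) -> Tuple[List[int], List[List[float]]]:
    """Plain Lloyd's algorithm: assign to nearest centroid, then re-average."""
    rng = random.Random(seed)
    # Initialise centroids from k distinct input points.
    centroids = [list(p) for p in rng.sample(list(points), k)]
    labels = None
    for _ in range(n_iter):
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = mean_point(members)
    return labels, centroids


# Toy 2-D "latent" vectors for six hypothetical patients (assumed values).
latent = [(0.1, 0.2), (0.0, 0.1), (0.2, 0.0),
          (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
labels, centroids = kmeans(latent, k=2)
print(labels)
```

With well-separated toy data like this, the first three and last three patients end up in different subgroups; which subgroup is numbered 0 or 1 depends on the random initialisation. In practice the resulting labels would then be compared for survival differences, as the abstract reports.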

Conflict of interest statement

The authors declare no competing interests.

Figures

Figure 1
Workflow of approach. Graphical summary of the prediction of survival and recurrence in patients with pancreatic cancer. (a) Omics datasets used to construct the prediction models. (b) Data preprocessing and the process for obtaining features. (c) Final nine features (including seven clinical features). (d) Machine learning models used for the prediction.
Figure 2
Kaplan–Meier OS and DFS curves for the two groups of patients with PAAD. (a) OS and (b) DFS for the two groups of patients defined by CP values. (c) OS and (d) DFS for patients who had mutations in frequently mutated genes versus other patients.
Figure 3
Kaplan–Meier OS and DFS curves for the two groups of patients identified by K-means clustering. (a) DFS and (b) OS for subgroups G1 and G2. (c) DFS and (d) OS for the two subgroups (G1 and G2) analyzed by PCA.
Figure 4
Predictive performance for DFS and OS using various features. The performance of machine learning models for predicting (a) OS and (b) DFS based on various features was measured. The y-axis represents the accuracy or AUC values. Clinical, nine clinical data; KG, known cancer driver genes (KRAS, CDKN2A, TP53, and SMAD4); HR, a high-risk group of patients harboring mutations in five genes with high CP values; Sub, a feature representing subgroups generated by integrating mRNA, miRNA, and DNA methylation subtypes; AUC, area under the curve values from fivefold cross-validation.
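The Kaplan–Meier curves referred to in Figures 2 and 3 are built from the product-limit estimator, which can be sketched in a few lines. The follow-up times and event flags below are invented toy values, not data from the study, and `kaplan_meier` is a hypothetical helper name; the sketch only shows how the survival probability steps down at each observed event.

```python
from typing import List, Tuple


def kaplan_meier(times: List[float], events: List[int]) -> List[Tuple[float, float]]:
    """Product-limit estimate: (time, survival probability) at each event time.

    events[i] is 1 if the event (death or recurrence) was observed at times[i],
    and 0 if the patient was censored at that time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        observed = 0  # events observed exactly at time t
        removed = 0   # patients leaving the risk set at time t
        while i < len(data) and data[i][0] == t:
            observed += data[i][1]
            removed += 1
            i += 1
        if observed > 0:
            survival *= 1.0 - observed / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


# Toy follow-up times (e.g. months) and event indicators (assumed values).
times = [2, 3, 3, 5, 8]
events = [1, 1, 0, 1, 0]
curve = kaplan_meier(times, events)
for t, s in curve:
    print(t, round(s, 3))
```

For this toy input the estimate drops to 0.8 at time 2, to 0.6 at time 3, and to 0.3 at time 5; the censored observations only shrink the risk set. Comparing two such curves for significance (the p-values quoted in the abstract) would additionally need a test such as the log-rank test, which is not shown here.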
