Hum Mol Genet. 2022 May 19;31(10):1705-1719.
doi: 10.1093/hmg/ddab343.

Development and validation of prognostic and diagnostic model for pancreatic ductal adenocarcinoma based on scRNA-seq and bulk-seq datasets

Kai Chen et al. Hum Mol Genet.

Abstract

The 5-year overall survival (OS) of pancreatic ductal adenocarcinoma (PDAC) is only 10%, partly owing to the lack of reliable diagnostic and prognostic biomarkers. The raw gene-cell matrix for single-cell RNA-seq (scRNA-seq) analysis was downloaded from the GSA database. We constructed a cell atlas for PDAC and normal pancreatic tissues. The inferCNV analysis was used to distinguish tumor cells from normal ductal cells. We identified differentially expressed genes (DEGs) by comparing tumor cells and normal ductal cells. The common DEGs were used to construct prognostic and diagnostic models using univariate and multivariate Cox or logistic regression analysis. Four genes, MET, KLK10, PSMB9 and ITGB6, were used to create a risk score formula to predict OS and to establish a diagnostic model for PDAC. Finally, we drew an easy-to-use nomogram to predict 2-year and 3-year OS. In conclusion, we developed and validated prognostic and diagnostic models for PDAC based on scRNA-seq and bulk-seq datasets.
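
A risk score derived from multivariate Cox regression is conventionally the linear predictor of the model, that is, the sum of each gene's coefficient multiplied by its expression level. The sketch below illustrates that general form for the four genes named in the abstract. The coefficient values, the function names and the median cutoff for splitting patients into high- and low-risk groups are all assumptions made for illustration; the abstract does not report the fitted coefficients or the cutoff used.

```python
from statistics import median

# HYPOTHETICAL coefficients, for illustration only.
# The abstract does not report the fitted Cox coefficients.
COEFFICIENTS = {"MET": 0.30, "KLK10": 0.20, "PSMB9": -0.25, "ITGB6": 0.15}


def risk_score(expression):
    """Return the Cox linear predictor: sum of coefficient * expression."""
    return sum(coef * expression[gene] for gene, coef in COEFFICIENTS.items())


def stratify(scores):
    """Label each score 'high' if strictly above the cohort median, else 'low'.

    A median split is an assumed cutoff, not one stated in the abstract.
    """
    cutoff = median(scores)
    return ["high" if score > cutoff else "low" for score in scores]


# Made-up expression values for three hypothetical patients.
patients = [
    {"MET": 2.0, "KLK10": 1.0, "PSMB9": 0.5, "ITGB6": 1.5},
    {"MET": 0.5, "KLK10": 0.2, "PSMB9": 2.0, "ITGB6": 0.3},
    {"MET": 1.0, "KLK10": 1.0, "PSMB9": 1.0, "ITGB6": 1.0},
]
scores = [risk_score(p) for p in patients]
groups = stratify(scores)
```

With real data, the coefficients would come from the fitted multivariate Cox model, and the resulting groups would be compared with Kaplan-Meier curves.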

Figures

Figure 1
Graphical scheme describing the study design. We first delineated a cell atlas of PDAC and normal pancreas using scRNA-seq datasets, then distinguished tumor cells from normal ductal cells by inferCNV analysis. The common DEGs of the scRNA-seq and TCGA versus GTEx analyses were used to construct prognostic and diagnostic models. We also conducted internal and external validations of both models.
Figure 2
scRNA-seq delineates a cell atlas of the pancreas. (A, B) The t-SNE plot showing the original clusters (A) and named cell subpopulations (B). (C) Violin plots showing the expression level of known cell-type-specific markers to demonstrate the identity of each cluster. (D) Bubble plot showing the top 5 marker genes across all clusters. Size of dots represents the proportion of cells expressing a particular marker, and intensity of color indicates the average expression level.
Figure 3
The different cellular constituents and cell-cycle status between PDAC and normal pancreatic specimens. (A–C) Proportion of various cell subpopulations among normal pancreatic specimens (A), PDAC specimens (B) and PDAC versus normal pancreatic specimens (C). (D–F) Proportion of G1/S/G2M phase among normal pancreatic specimens (D), PDAC specimens (E) and PDAC versus normal pancreatic specimens (F). (G, H) Subclustering of the ductal cell subpopulations for original clusters (G) and named ductal cell subpopulations (H). (I) Violin plots showing the expression level of selected ductal cell type markers among ductal cell subpopulations.
Figure 4
The CNV profile analysis distinguishes tumor cells. (A) Heatmap showing the large-scale CNV profile of each ductal cell and reference cell subpopulation; the red and blue colors represent high and low CNV level, respectively. (B) Boxplot showing the CNV score of each subpopulation; white boxes represent reference cells. (C, D) Boxplots showing the E score and M score of each ductal subpopulation. (E) Violin plot showing the expression level of mesenchymal cell markers among ductal subpopulations.
Figure 5
Construction and validation of the prognostic model in the TCGA_PAAD dataset. (A) DEGs between tumor cell and normal ductal cell subpopulations are shown in an UpSet plot. (B) The overlapping area showing the common DEGs of scRNA-seq and GEPIA2 analyses in a Venn diagram. (C, D) Variable selection using LASSO regression: the correlation between coefficients and the number of variables (C), and the first dashed line showing the cutoff value we selected, indicating minimal deviance (D). (E–H) Construction of the prognostic model in the training set in TCGA_PAAD: KM curve showing different OS between the high- and low-risk groups (E), ROC curve used to evaluate the accuracy of the prognostic model for 1-/1.5-/2-year OS (F), risk score distribution of subjects in the training set (G) and survival status scatter plot (H). (I–L) Internal validation of the prognostic model in the validation set in TCGA_PAAD.
Figure 6
Construction of the nomogram for predicting OS in PDAC. (A–C) External validation of the prognostic model in PACA_AU, GSE57495 and GSE71729. (D, E) Univariate and multivariate Cox analyses were performed to find the risk factors of OS in PDAC; red boxes represent P < 0.05 in the forest plot. (F) The prognosis nomogram was drawn to predict 2-year and 3-year OS for PDAC. (G) Calibration curve showing the agreement between actual and nomogram-predicted OS; the gray diagonal line is the reference line.
Figure 7
Construction of the diagnostic model. (A) Boxplot showing the expression level of the four prognosis-related genes in normal pancreas and PDAC in GSE62452. (B, C) Univariate and multivariate logistic regression analyses were used to select risk factors for the occurrence of PDAC; red boxes represent P < 0.05 in the forest plot. (D) The diagnosis nomogram was drawn to predict the occurrence of PDAC. (E) Calibration curve showing the agreement between actual and nomogram-predicted PDAC; the gray diagonal line is the reference line.
Figure 8
Validation of the diagnostic model using our data. (A–D) RT-qPCR was performed to show the expression levels of MET, KLK10, PSMB9 and ITGB6 among pancreatic cell lines. (E–H) The relative expression levels of MET, KLK10, PSMB9 and ITGB6 in tumor and tumor-adjacent tissues. (I) Calibration curve showing the performance of the diagnostic model in our dataset.
