Gastroenterology. 2022 Nov;163(5):1435-1446.e3.
doi: 10.1053/j.gastro.2022.06.066. Epub 2022 Jul 1.

Radiomics-based Machine-learning Models Can Detect Pancreatic Cancer on Prediagnostic Computed Tomography Scans at a Substantial Lead Time Before Clinical Diagnosis

Sovanlal Mukherjee et al. Gastroenterology. 2022 Nov.

Abstract

Background & aims: Our purpose was to detect pancreatic ductal adenocarcinoma (PDAC) at the prediagnostic stage (3-36 months before clinical diagnosis) using radiomics-based machine-learning (ML) models, and to compare performance against radiologists in a case-control study.

Methods: Volumetric pancreas segmentation was performed on prediagnostic computed tomography scans (CTs) (median interval between CT and PDAC diagnosis: 398 days) of 155 patients and an age-matched cohort of 265 subjects with normal pancreas. A total of 88 first-order and gray-level radiomic features were extracted, and 34 features were selected through the least absolute shrinkage and selection operator-based feature selection method. The dataset was randomly divided into training (292 CTs: 110 prediagnostic and 182 controls) and test subsets (128 CTs: 45 prediagnostic and 83 controls). Four ML classifiers, k-nearest neighbor (KNN), support vector machine (SVM), random forest (RF), and extreme gradient boosting (XGBoost), were evaluated. Specificity of the model with the highest accuracy was further validated on an independent internal dataset (n = 176) and the public National Institutes of Health dataset (n = 80). Two radiologists (R4 and R5) independently evaluated the pancreas on a 5-point diagnostic scale.
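To make the notion of a first-order radiomic feature concrete, the following is a minimal sketch that computes a handful of intensity statistics over the voxels of a segmented region. The abstract does not list the 88 features or name the extraction software, so the feature set, the function name `first_order_features`, and the `bin_width` parameter are assumptions made purely for illustration.

```python
import math
from collections import Counter


def first_order_features(voxels, bin_width=25):
    """Compute a few first-order statistics over voxel intensities.

    `voxels` is a flat sequence of intensities (for CT, Hounsfield units)
    taken from inside a segmentation mask. `bin_width` is a hypothetical
    discretization step used only for the entropy histogram.
    """
    n = len(voxels)
    if n < 2:
        raise ValueError("need at least two voxels")

    mean = sum(voxels) / n
    # Central moments in population form (divide by n).
    m2 = sum((v - mean) ** 2 for v in voxels) / n
    m3 = sum((v - mean) ** 3 for v in voxels) / n
    m4 = sum((v - mean) ** 4 for v in voxels) / n

    # Skewness and kurtosis are undefined for a constant region; report 0.0.
    skewness = m3 / m2 ** 1.5 if m2 > 0 else 0.0
    kurtosis = m4 / m2 ** 2 if m2 > 0 else 0.0

    # Shannon entropy (in bits) of the discretized intensity histogram.
    bins = Counter(math.floor(v / bin_width) for v in voxels)
    entropy = -sum((c / n) * math.log2(c / n) for c in bins.values())

    return {
        "mean": mean,
        "std": math.sqrt(m2),
        "skewness": skewness,
        "kurtosis": kurtosis,
        "entropy": entropy,
    }


# Toy region with two intensity levels; not real CT data.
features = first_order_features([0, 0, 50, 50])
```

Gray-level texture features such as GLSZM-SAE additionally depend on the spatial arrangement of voxels and are not covered by this sketch.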

Results: Median (range) time between prediagnostic CTs of the test subset and PDAC diagnosis was 386 (97-1092) days. SVM had the highest sensitivity (mean; 95% confidence interval) (95.5; 85.5-100.0), specificity (90.3; 84.3-91.5), F1-score (89.5; 82.3-91.7), area under the curve (AUC) (0.98; 0.94-0.98), and accuracy (92.2%; 86.7-93.7) for classification of CTs into prediagnostic versus normal. All 3 other ML models, KNN, RF, and XGBoost, had comparable AUCs (0.95, 0.95, and 0.96, respectively). The high specificity of SVM was generalizable to both the independent internal dataset (92.6%) and the National Institutes of Health dataset (96.2%). In contrast, interreader radiologist agreement was only fair (Cohen's kappa 0.3) and their mean AUC (0.66; 0.46-0.86) was lower than that of each of the 4 ML models (AUCs: 0.95-0.98) (P < .001). Radiologists also recorded false positive indirect findings of PDAC in control subjects (n = 83) (7% R4, 18% R5).
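The metrics reported above can be illustrated with a short sketch that derives sensitivity, specificity, F1-score, and accuracy from confusion-matrix counts, and Cohen's kappa from two readers' labels. The abstract reports rates and confidence intervals but not the underlying counts, so the counts in the usage example are hypothetical: they only match the test subset sizes (45 prediagnostic, 83 controls). The function names are likewise illustrative.

```python
def classification_metrics(tp, fn, tn, fp):
    """Binary classification metrics from confusion-matrix counts.

    Assumes each class has at least one sample and at least one positive
    prediction was made, so no denominator is zero.
    """
    sensitivity = tp / (tp + fn)   # true positive rate (recall)
    specificity = tn / (tn + fp)   # true negative rate
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
        "accuracy": accuracy,
    }


def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters labeling the same cases."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    # Observed agreement: fraction of cases with identical labels.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement from each rater's marginal label frequencies.
    labels = set(ratings_a) | set(ratings_b)
    expected = sum(
        (ratings_a.count(label) / n) * (ratings_b.count(label) / n)
        for label in labels
    )
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical counts: 45 prediagnostic CTs and 83 controls.
metrics = classification_metrics(tp=43, fn=2, tn=75, fp=8)

# Hypothetical binary reads from two readers on four cases.
kappa = cohens_kappa([1, 1, 0, 0], [1, 0, 0, 0])
```

Kappa corrects raw agreement for the agreement expected by chance, which is why two readers can agree on many cases and still score only "fair" when one label dominates.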

Conclusions: Radiomics-based ML models can distinguish PDAC from normal pancreas at a stage when it is beyond human interrogation capability, at a substantial lead time before clinical diagnosis. Prospective validation and integration of such models with complementary fluid-based biomarkers have the potential to enable PDAC detection at a stage when surgical cure is a possibility.

Keywords: Artificial Intelligence; Biomarkers; Pancreas; Pancreatic Ductal Carcinoma; X-Ray Computed Tomography.

Conflict of interest statement

The authors disclose no conflicts.

Figures

Figure 1. Study design and structure of datasets.

Figure 2. Prediagnostic and diagnostic CT: 75-year-old man with PDAC. Pancreas was normal on the prediagnostic CT (A and B). Approximately 2.5 years later, the patient presented with PDAC in the body with peritoneal carcinomatosis (C). Color-coded radiomics texture map overlaid on the prediagnostic CT (D) shows the distribution of gray-level size zone matrix-small area emphasis (GLSZM-SAE) (measurement of fine gray-level texture) over a single slice of pancreas on the prediagnostic CT.

Figure 3. Receiver operating characteristics of the 4 ML models on the test subset. KNN, k-nearest neighbor; RF, random forest; SVM, support vector machine; XGBoost, extreme gradient boosting.

Figure 4. Receiver operating characteristics of the SVM model and the 2 radiologist readers on the test subset. R1: SVM model; R2, R3: two radiologist readers. FPF, false positive fraction or 1 – Specificity; TPF, true positive fraction or sensitivity.
