PLoS Med. 2025 Feb 24;22(2):e1004432.
doi: 10.1371/journal.pmed.1004432. eCollection 2025 Feb.

A systematic review of machine learning-based prognostic models for acute pancreatitis: Towards improving methods and reporting quality

Brian Critelli et al. PLoS Med.

Abstract

Background: An accurate prognostic tool is essential to aid clinical decision-making (e.g., patient triage) and to advance personalized medicine. However, such a prognostic tool is lacking for acute pancreatitis (AP). Machine learning (ML) techniques are increasingly being used to develop high-performing prognostic models in AP, but the methodologic and reporting quality of these models has received little attention. High-quality reporting and study methodology are critical for model validity, reproducibility, and clinical implementation. In collaboration with content experts in ML methodology, we performed a systematic review critically appraising the quality of methodology and reporting of recently published ML AP prognostic models.

Methods/findings: Using a validated search strategy, we identified ML AP studies from the databases MEDLINE and EMBASE published between January 2021 and December 2023. We also searched the pre-print servers medRxiv, bioRxiv, and arXiv for pre-prints registered between January 2021 and December 2023. Eligibility criteria included all retrospective or prospective studies that developed or validated new or existing ML models in patients with AP that predicted an outcome following an episode of AP. Meta-analysis was considered if there was homogeneity in the study design and in the type of outcome predicted. For risk of bias (ROB) assessment, we used the Prediction Model Risk of Bias Assessment Tool (PROBAST). Quality of reporting was assessed using the Transparent Reporting of a Multivariable Prediction Model of Individual Prognosis or Diagnosis-Artificial Intelligence (TRIPOD+AI) statement, which defines standards for 27 items that should be reported in publications using ML prognostic models. The search strategy identified 6,480 publications, of which 30 met the eligibility criteria. Studies originated from China (22), the United States (4), and other countries (4). All 30 studies developed a new ML model and none sought to validate an existing ML model, producing a total of 39 new ML models. AP severity (23/39) and mortality (6/39) were the most common outcomes predicted. The mean area under the curve for all models and endpoints was 0.91 (SD 0.08). The ROB was high for at least one domain in all 39 models, particularly for the analysis domain (37/39 models). Steps were not taken to minimize over-optimistic model performance in 27/39 models. Due to heterogeneity in the study design and in how the outcomes were defined and determined, meta-analysis was not performed. Studies reported on only 15/27 items from TRIPOD+AI standards, with only 7/30 justifying sample size and 13/30 assessing data quality. Other reporting deficiencies included omissions regarding human-AI interaction (28/30), handling low-quality or incomplete data in practice (27/30), sharing analytical code (25/30), study protocols (25/30), and reporting source data (19/30).
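
The counts reported above can be converted to percentages for easier comparison. The following is a minimal sketch, not part of the published analysis: the dictionary labels and the helper names `as_percent` and `summarize` are illustrative assumptions, and only the numerators and denominators come from the abstract.

```python
# Counts copied from the abstract as (numerator, denominator).
# The labels are illustrative paraphrases, not the authors' wording.
REPORTED = {
    "models predicting AP severity": (23, 39),
    "models predicting mortality": (6, 39),
    "models with high ROB in the analysis domain": (37, 39),
    "models with no step against over-optimism": (27, 39),
    "TRIPOD+AI items reported": (15, 27),
    "studies justifying sample size": (7, 30),
    "studies assessing data quality": (13, 30),
}


def as_percent(numerator, denominator):
    """Return numerator/denominator as a percentage rounded to one decimal."""
    return round(100 * numerator / denominator, 1)


def summarize(reported):
    """Map each label to its percentage."""
    return {label: as_percent(n, d) for label, (n, d) in reported.items()}


if __name__ == "__main__":
    for label, pct in summarize(REPORTED).items():
        print(f"{label}: {pct}%")
```

Under these assumptions, 37/39 works out to roughly 95% of models with high ROB in the analysis domain, and 7/30 to roughly 23% of studies justifying sample size.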

Conclusions: There are significant deficiencies in the methodology and reporting of recently published ML-based prognostic models in AP patients. These deficiencies undermine the validity, reproducibility, and implementation of these prognostic models despite their promise of superior predictive accuracy.

Registration: Research Registry (reviewregistry1727).

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Fig 1. Summary of risk of bias in four domains assessed by PROBAST.
Fig 2. Heatmap depicting common areas of deficiencies in reporting standards as assessed by TRIPOD+AI.
* Publication has same first author and year as another paper listed; PMID of each * in ascending order: Yang and colleagues (2022): 35430680, 35607360 [58,59]. Luo and colleagues (2023): 36653317, 36773821 [65,66]. Zhang and colleagues (2023): 36902504, 36964219, 37196588 [–71].

References

    1. Xiao AY, Tan ML, Wu LM, Asrani VM, Windsor JA, Yadav D, et al. Global incidence and mortality of pancreatic diseases: a systematic review, meta-analysis, and meta-regression of population-based cohort studies. Lancet Gastroenterol Hepatol. 2016;1(1):45–55. Epub 20160628. doi: 10.1016/S2468-1253(16)30004-8
    2. Iannuzzi JP, King JA, Leong JH, Quan J, Windsor JW, Tanyingoh D, et al. Global incidence of acute pancreatitis is increasing over time: a systematic review and meta-analysis. Gastroenterology. 2022;162(1):122–34. Epub 20210925. doi: 10.1053/j.gastro.2021.09.043
    3. Lee PJ, Papachristou GI. New insights into acute pancreatitis. Nat Rev Gastroenterol Hepatol. 2019;16(8):479–96. doi: 10.1038/s41575-019-0158-2
    4. Banks PA, Bollen TL, Dervenis C, Gooszen HG, Johnson CD, Sarr MG, et al.; Acute Pancreatitis Classification Working Group. Classification of acute pancreatitis—2012: revision of the Atlanta classification and definitions by international consensus. Gut. 2013;62(1):102–11. doi: 10.1136/gutjnl-2012-302779
    5. Dellinger EP, Forsmark CE, Layer P, Levy P, Maravi-Poma E, Petrov MS, et al.; Pancreatitis Across Nations Clinical Research and Education Alliance (PANCREA). Determinant-based classification of acute pancreatitis severity: an international multidisciplinary consultation. Ann Surg. 2012;256(6):875–80. doi: 10.1097/SLA.0b013e318256f778