Multicenter Study

Large-scale multi-center CT and MRI segmentation of pancreas with deep learning

Zheyuan Zhang et al. Med Image Anal. 2025 Jan;99:103382. doi: 10.1016/j.media.2024.103382. Epub 2024 Nov 8.

Abstract

Automated volumetric segmentation of the pancreas on cross-sectional imaging is needed for diagnosis and follow-up of pancreatic diseases. While CT-based pancreatic segmentation is more established, MRI-based segmentation methods are understudied, largely due to a lack of publicly available datasets, benchmarking research efforts, and domain-specific deep learning methods. In this retrospective study, we collected a large dataset (767 scans from 499 participants) of T1-weighted (T1W) and T2-weighted (T2W) abdominal MRI series from five centers between March 2004 and November 2022. We also collected CT scans of 1,350 patients from publicly available sources for benchmarking purposes. We introduced a new pancreas segmentation method, called PanSegNet, combining the strengths of nnUNet and a Transformer network with a new linear attention module that enables volumetric computation. We tested PanSegNet's accuracy in cross-modality (a total of 2,117 scans) and cross-center settings, using Dice and 95th-percentile Hausdorff distance (HD95) as evaluation metrics. We used Cohen's kappa statistics for intra- and inter-rater agreement evaluation, and paired t-tests for volume and Dice comparisons. For segmentation accuracy, we achieved Dice coefficients of 88.3% (±7.2%, at case level) with CT, 85.0% (±7.9%) with T1W MRI, and 86.3% (±6.4%) with T2W MRI. There was a high correlation for pancreas volume prediction, with R² values of 0.91, 0.84, and 0.85 for CT, T1W, and T2W, respectively. We found moderate inter-observer agreement (kappa of 0.624 and 0.638 for T1W and T2W MRI, respectively) and high intra-observer agreement. All MRI data are made available at https://osf.io/kysnj/. Our source code is available at https://github.com/NUBagciLab/PaNSegNet.
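
The two evaluation metrics named above can be made concrete. Below is a minimal Python sketch (not the authors' evaluation code; the function names and the pooled-percentile HD95 convention are our assumptions) of case-level Dice and the 95th-percentile Hausdorff distance on binary 3-D masks:

import numpy as np
from scipy import ndimage

def dice_coefficient(pred, gt):
    # Volumetric Dice: 2|A ∩ B| / (|A| + |B|)
    pred, gt = pred.astype(bool), gt.astype(bool)
    return 2.0 * np.logical_and(pred, gt).sum() / (pred.sum() + gt.sum())

def hd95(pred, gt, spacing=(1.0, 1.0, 1.0)):
    # 95th-percentile symmetric surface distance, in mm (spacing = voxel size)
    pred, gt = pred.astype(bool), gt.astype(bool)
    pred_surf = pred & ~ndimage.binary_erosion(pred)   # surface voxels
    gt_surf = gt & ~ndimage.binary_erosion(gt)
    dt_gt = ndimage.distance_transform_edt(~gt_surf, sampling=spacing)
    dt_pred = ndimage.distance_transform_edt(~pred_surf, sampling=spacing)
    dists = np.concatenate([dt_gt[pred_surf], dt_pred[gt_surf]])
    return np.percentile(dists, 95)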

Keywords: CT pancreas; Generalized segmentation; MRI pancreas; Pancreas segmentation; Transformer segmentation.

Conflict of interest statement

Declaration of competing interest The authors declare the following financial interests/personal relationships which may be considered as potential competing interests: Ulas Bagci reports financial support was provided by National Institutes of Health. If there are other authors, they declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Figures

Fig. 1.
Flowchart showing the determination of the final study population. Center #1 and Center #2 serve as in-distribution centers (internal validation) for five-fold cross-validation training; Centers #3, #4, and #5 serve as out-of-distribution centers (external validation).
Fig. 2.
PanSegNet combines nnUNet with a linear self-attention mechanism, obtained by linearizing the standard self-attention operation as described below. The architecture accepts volumetric input and therefore captures full anatomical detail, unlike pseudo-3D approaches.
Fig. 3.
Comparison of the traditional self-attention mechanism (left) vs. the linear self-attention mechanism (right). X is the input, O is the output. Red font marks the specific changes we apply to self-attention to linearize it.
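
To make the linearization in Figs. 2-3 concrete, here is a minimal PyTorch sketch (illustrative only, not the exact PanSegNet module; the kernel feature map elu(x) + 1 is a common choice for linear attention and an assumption on our part). Reordering the matrix products avoids materializing the (n × n) attention map, so cost grows linearly with the number of voxels n, which is what makes volumetric attention tractable:

import torch
import torch.nn.functional as F

def standard_attention(q, k, v):
    # q, k, v: (batch, n, d); time and memory are O(n^2)
    scores = torch.softmax(q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5, dim=-1)
    return scores @ v

def linear_attention(q, k, v, eps=1e-6):
    q, k = F.elu(q) + 1, F.elu(k) + 1           # positive kernel features
    kv = k.transpose(-2, -1) @ v                # (d, d) key-value summary, computed once
    z = q @ k.sum(dim=-2, keepdim=True).transpose(-2, -1)  # (n, 1) normalizer
    return (q @ kv) / (z + eps)                 # O(n * d^2) instead of O(n^2 * d)
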
Fig. 4.
Segmentation results for CT pancreas across multiple datasets (green indicates the predicted pancreas, and red indicates the annotations). While AbdominalCT-1K exhibits robust segmentation performance, marked by precise boundary delineation, a domain shift is observed when extending the model to the AMOS, WORD, and BTCV datasets, underscoring the significance of addressing domain shifts for clinical applications. For a fair comparison, we select the visualization samples near the median value according to the Dice coefficient distribution (note: Dice is calculated volumetrically).
Fig. 5.
MRI T1W pancreas segmentation visualization across data centers. The delineations illustrate the model's ability to trace pancreas boundaries precisely. Domain shifts are observed in external validation on Centers #3, #4, and #5.
Fig. 6.
MRI T2W pancreas segmentation visualization across data centers. The delineations illustrate the model's ability to trace pancreas boundaries precisely. Center #3 T2W segmentation also yields relatively high accuracy, showcasing the model's potential. Domain shifts are observed in external validation on Centers #3, #4, and #5.
Fig. 7.
Strong correlation between real and predicted pancreas volume across three modalities: CT, MRI T1W, and MRI T2W. Each subplot shows a linear fit, with R² values (squared Pearson correlation) of 0.91, 0.84, and 0.85 for CT, MRI T1W, and MRI T2W, respectively. These high R² values demonstrate the accuracy of the volume prediction, reinforcing its potential utility in clinical applications.
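
A short sketch of how this correlation could be reproduced (an assumed workflow, not the authors' script): pancreas volume is the foreground voxel count times the voxel volume, and R² is the squared Pearson correlation between real and predicted volumes:

import numpy as np

def volume_ml(mask, spacing_mm=(1.0, 1.0, 1.0)):
    # Foreground voxel count times voxel volume; mm^3 -> mL
    return mask.astype(bool).sum() * float(np.prod(spacing_mm)) / 1000.0

def r_squared(true_vols, pred_vols):
    r = np.corrcoef(true_vols, pred_vols)[0, 1]
    return r ** 2
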
Fig. 8.
Shifts in Dice coefficients across three modalities (CT, T1W, and T2W MRI) under domain shift. Moving from the source domain (dark blue) to other datasets (light blue), segmentation performance varies, as evidenced by the changing Dice coefficients.
Fig. 9.
Upper and lower bounds for Dice vs. volume error (left) and Dice vs. absolute volume error (right), respectively. For example, at a 94% Dice coefficient, a 10% volume error is quite plausible for the pancreas, which is generally not the case for other organs.
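
The upper bound in Fig. 9 follows from the definition of Dice. With prediction volume a, ground-truth volume b, and overlap i, Dice d = 2i / (a + b) and i <= min(a, b); taking a >= b gives a / b <= (2 - d) / d, so the relative volume error (a - b) / b is at most 2(1 - d) / d. A one-line check (our derivation restating the figure's bound, not code from the paper):

def max_relative_volume_error(dice):
    # Worst-case |a - b| / b consistent with a given Dice score
    return 2.0 * (1.0 - dice) / dice

print(max_relative_volume_error(0.94))  # ~0.128: up to ~13% volume error is possible at Dice 94%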
