Bioengineering (Basel). 2025 May 15;12(5):530.
doi: 10.3390/bioengineering12050530.

Private Data Incrementalization: Data-Centric Model Development for Clinical Liver Segmentation

Stephanie Batista et al. Bioengineering (Basel).

Abstract

Machine Learning models, and Artificial Neural Networks in particular, are transforming medical imaging by enabling precise liver segmentation, a crucial task for diagnosing and treating liver diseases. However, these models often struggle to adapt to diverse clinical data sources, because differences in dataset volume, resolution, and origin affect generalization and performance. This study introduces Private Data Incrementalization, a data-centric approach that enhances the adaptability of Artificial Neural Networks by progressively exposing them to varied clinical data. Because the aim of this study is not to propose a new image segmentation model, four existing medical image segmentation models are used: U-Net, ResUNet++, a Fully Convolutional Network, and a modified algorithm based on the Conditional Bernoulli Diffusion Model. The study evaluates these four models using a curated private dataset of computed tomography scans from Coimbra University Hospital, supplemented by two public datasets, 3D-IRCADb01 and CHAOS. The Private Data Incrementalization method systematically increases the volume and diversity of training data, simulating real-world conditions in which models must handle varied imaging contexts. Experiments covering pre-processing, post-processing, incremental training, and performance evaluation show that structured exposure to diverse datasets improves segmentation performance. ResUNet++ achieved the highest accuracy (0.9972), the highest Dice Similarity Coefficient (0.9449), and the best Average Symmetric Surface Distance (0.0053 mm), which demonstrates the importance of dataset diversity and volume for the robustness and generalization of segmentation models. Private Data Incrementalization thus offers a scalable strategy for building resilient segmentation models, ultimately benefiting clinical workflows, patient care, and healthcare resource management by addressing the variability inherent in clinical imaging data.
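The incremental exposure described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `build_incremental_training_sets`, the scan identifiers, and the fractions of private data per stage are all hypothetical, as is the choice to keep the public scans in every stage. The abstract states only that the volume and diversity of training data increase systematically.

```python
import random
from typing import List, Sequence


def build_incremental_training_sets(
    public_scans: Sequence[str],
    private_scans: Sequence[str],
    private_fractions: Sequence[float],
    seed: int = 0,
) -> List[List[str]]:
    """Return one training set per stage, each with a larger share of private scans.

    Assumption (not from the paper): public scans are present in every stage,
    and each stage adds a growing prefix of a fixed shuffle of the private scans,
    so later stages are supersets of earlier ones when fractions are non-decreasing.
    """
    rng = random.Random(seed)
    shuffled_private = list(private_scans)
    rng.shuffle(shuffled_private)  # fix one order so stages are nested

    stages = []
    for fraction in private_fractions:
        if not 0.0 <= fraction <= 1.0:
            raise ValueError("each fraction must lie in [0, 1]")
        n_private = round(fraction * len(shuffled_private))
        stages.append(list(public_scans) + shuffled_private[:n_private])
    return stages


# Hypothetical scan identifiers and stage fractions, for illustration only.
public = ["ircad_01", "ircad_02", "chaos_01"]
private = [f"private_{i:02d}" for i in range(8)]
stages = build_incremental_training_sets(public, private, [0.0, 0.25, 0.5, 0.75, 1.0])

for index, stage in enumerate(stages, start=1):
    n_private = sum(name.startswith("private_") for name in stage)
    print(f"stage {index}: {len(stage)} scans, {n_private} private")
```

Drawing each stage from one fixed shuffle keeps the stages nested, so any change in performance between stages comes from the added scans rather than from a different random sample.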

Keywords: BerDiff; FCN; ResUNet++; U-Net; artificial neural networks; automatic liver segmentation; computed tomography; private data incrementalization.
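For readers unfamiliar with the Dice Similarity Coefficient (DSC) reported in the abstract, the following is a minimal sketch of the standard definition, DSC = 2|A ∩ B| / (|A| + |B|), for two binary masks. It is illustrative only and not taken from the paper; the function name `dice_coefficient`, the toy masks, and the convention of returning 1.0 when both masks are empty are assumptions.

```python
from typing import Sequence


def dice_coefficient(
    pred: Sequence[Sequence[int]], truth: Sequence[Sequence[int]]
) -> float:
    """Dice Similarity Coefficient between two binary 2D masks of equal shape."""
    if len(pred) != len(truth) or any(
        len(p_row) != len(t_row) for p_row, t_row in zip(pred, truth)
    ):
        raise ValueError("masks must have the same shape")

    intersection = 0
    pred_total = 0
    truth_total = 0
    for p_row, t_row in zip(pred, truth):
        for p, t in zip(p_row, t_row):
            p_on, t_on = bool(p), bool(t)
            pred_total += int(p_on)
            truth_total += int(t_on)
            intersection += int(p_on and t_on)

    if pred_total + truth_total == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0
    return 2.0 * intersection / (pred_total + truth_total)


# Toy masks (1 = liver pixel, 0 = background), for illustration only.
predicted_mask = [[0, 1, 1],
                  [0, 1, 0]]
ground_truth = [[0, 1, 0],
                [0, 1, 1]]

dsc = dice_coefficient(predicted_mask, ground_truth)
print(f"DSC = {dsc:.4f}")
```

Here the masks share 2 foreground pixels and each contains 3, so the DSC is 2·2 / (3 + 3) ≈ 0.667. A value of 1 indicates perfect overlap and 0 indicates none.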

Conflict of interest statement

Author Ricardo Filipe was employed by the company Altice Labs, S.A. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

Figure A1
Performance of the U-Net experiments on unseen data: (a) original image; (b) ground truth; (c) segmentation outcome with E1; (d) segmentation outcome with E2; (e) segmentation outcome with E3; (f) segmentation outcome with E4; (g) segmentation outcome with E5.
Figure A2
Performance of the ResUNet++ experiments on unseen data: (a) original image; (b) ground truth; (c) segmentation outcome with E6; (d) segmentation outcome with E7; (e) segmentation outcome with E8; (f) segmentation outcome with E9; (g) segmentation outcome with E10.
Figure A3
Performance of the FCN experiments on unseen data: (a) original image; (b) ground truth; (c) segmentation outcome with E11; (d) segmentation outcome with E12; (e) segmentation outcome with E13; (f) segmentation outcome with E14; (g) segmentation outcome with E15.
Figure A4
Performance of the BerDiff experiments on unseen data: (a) original image; (b) ground truth; (c) segmentation outcome with E16; (d) segmentation outcome with E17; (e) segmentation outcome with E18; (f) segmentation outcome with E19; (g) segmentation outcome with E20.
Figure 1
Structure of the Private Data Incrementalization (PDI) process. Across the layers, the structure remains consistent, but the proportion of private data in the training set increases incrementally to reflect the progressive availability of clinical data in real-world settings.
Figure 2
Methodology of the proposed study: (a) clinical workflow; (b) AI workflow.
Figure 3
Example of a CT scan from the private dataset: (a) CT scan image/slice; (b) liver segmentation ground truth mask.
Figure 4
Datasets’ distribution based on number of scans.
Figure 5
Segmentation flowchart.
Figure 6
Pre-processing flowchart.
Figure 7
Post-processing flowchart.
Figure 8
Performance values for each model across all datasets on unseen data: loss, accuracy, precision, DSC, AOE, ASSD, and MSSD.
Figure 9
Comparison of the performance from each model for dataset 1 on unseen data: (a) original image; (b) ground truth; (c) U-Net outcome; (d) ResUNet++ outcome; (e) FCN outcome; (f) BerDiff outcome.
Figure 10
Comparison of the performance from each model for dataset 2 on unseen data: (a) original image; (b) ground truth; (c) U-Net outcome; (d) ResUNet++ outcome; (e) FCN outcome; (f) BerDiff outcome.
Figure 11
Comparison of the performance from each model for dataset 3 on unseen data: (a) original image; (b) ground truth; (c) U-Net outcome; (d) ResUNet++ outcome; (e) FCN outcome; (f) BerDiff outcome.
Figure 12
Comparison of the performance from each model for dataset 4 on unseen data: (a) original image; (b) ground truth; (c) U-Net outcome; (d) ResUNet++ outcome; (e) FCN outcome; (f) BerDiff outcome.
Figure 13
Comparison of the performance from each model for dataset 5 on unseen data: (a) original image; (b) ground truth; (c) U-Net outcome; (d) ResUNet++ outcome; (e) FCN outcome; (f) BerDiff outcome.
