PLoS One. 2022 Feb 8;17(2):e0263656. doi: 10.1371/journal.pone.0263656. eCollection 2022.

Methods for the frugal labeler: Multi-class semantic segmentation on heterogeneous labels


Mark Schutera et al. PLoS One. 2022.

Abstract

Deep learning increasingly accelerates biomedical research, deploying neural networks for multiple tasks, such as image classification, object detection, and semantic segmentation. However, neural networks are commonly trained supervised on large-scale, labeled datasets. These prerequisites raise issues in biomedical image recognition, as datasets are generally small-scale, challenging to obtain, expensive to label, and frequently heterogeneously labeled. Heterogeneous labels, in particular, are a challenge for supervised methods: if not all classes are labeled in every sample, supervised deep learning approaches can only learn from the subset of samples that share a common set of labels. Consequently, biomedical image recognition engineers need to be frugal concerning their label and ground truth requirements. This paper discusses the effects of frugal labeling and proposes to train neural networks for multi-class semantic segmentation on heterogeneously labeled data based on a novel objective function, which combines a class asymmetric loss with the Dice loss. The approach is demonstrated for training on the sparse ground truth of a heterogeneously labeled dataset, for training within a transfer learning setting, and for the use case of merging multiple heterogeneously labeled datasets. For this purpose, a small-scale, multi-class biomedical semantic segmentation dataset is utilized: the heartSeg dataset, which builds on the medaka fish's role as a cardiac model system. Automating image recognition and semantic segmentation enables high-throughput experiments and is essential for biomedical research. Our approach and analysis show competitive results in supervised training regimes and encourage frugal labeling within biomedical image recognition.
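One way to picture the Dice part of such an objective is a standard soft Dice loss that is simply restricted to the classes for which a ground-truth mask exists in a given sample. The NumPy sketch below illustrates that restriction only; the (C, H, W) tensor layout, the function name, and the epsilon smoothing are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def dice_loss(pred, gt, labeled, eps=1e-6):
    """Soft Dice loss, restricted to the labeled classes of one sample.

    pred    -- predicted per-class masks, shape (C, H, W), values in [0, 1]
    gt      -- ground-truth per-class masks, shape (C, H, W)
    labeled -- boolean NumPy array of shape (C,): True where this sample
               actually carries a ground-truth mask for that class
    """
    losses = []
    for c in range(pred.shape[0]):
        if not labeled[c]:
            # No ground truth for this class: the Dice term cannot make
            # a statement, so the class is skipped (contributes 0).
            continue
        inter = (pred[c] * gt[c]).sum()
        denom = pred[c].sum() + gt[c].sum()
        losses.append(1.0 - (2.0 * inter + eps) / (denom + eps))
    return float(np.mean(losses)) if losses else 0.0
```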


Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Fig 1. Visualization of the combined objective function.
An exemplary sample x is fed into the network, which outputs a training prediction ŷ. Each class is denoted with a unique color. The corresponding ground truth label ỹ is missing the class on the lower right. The dotted line marks where the label would be in the sample. The dashed box on the right denotes L, with the output for both objective functions L_CAL and L_DSC. Prediction errors are marked as red areas. The number in the lower-left corners indicates the approximate output of the respective objective function. Since there is a prediction error in both given classes, the Dice loss will output a value L > 0 for them and 0 for the unlabeled class, since it cannot make a statement without the ground truth mask. The class asymmetric loss will only output a value L > 0 if an unlabeled class’s predicted segmentation mask intersects with a given class’s ground truth mask. Since the network predicted a small portion of the unlabeled class as part of the class in the center-left, the output will be L > 0 for this class. For the other labeled class, no intersection occurred, and hence the output is 0. Since the unlabeled mask does not contain a ground truth, it is not evaluated by the class asymmetric loss, which results in an output of 0.
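Reading the caption literally, the class asymmetric term only fires where the prediction of an unlabeled class intersects the ground-truth mask of a labeled class. Below is a minimal sketch of that behavior, reusing the conventions and the dice_loss function from the sketch above; the overlap normalization, the clipping to a mask union, and the weighting factor alpha are assumptions made for illustration, not the paper's definition.

```python
import numpy as np  # reuses dice_loss from the sketch above

def class_asymmetric_loss(pred, gt, labeled, eps=1e-6):
    """Penalize predictions of unlabeled classes that intersect the
    ground-truth masks of labeled classes (behavior sketched in Fig 1)."""
    unlabeled_pred = pred[~labeled]           # predictions lacking ground truth
    if unlabeled_pred.size == 0:
        return 0.0
    # Union of the available ground-truth masks, clipped in case they overlap.
    labeled_gt = np.clip(gt[labeled].sum(axis=0), 0.0, 1.0)
    # Only the intersection with labeled ground truth contributes.
    overlap = (unlabeled_pred * labeled_gt[None]).sum()
    return float(overlap / (unlabeled_pred.sum() + eps))

def combined_loss(pred, gt, labeled, alpha=1.0):
    # Total objective: Dice on labeled classes + class asymmetric term.
    return dice_loss(pred, gt, labeled) + alpha * class_asymmetric_loss(pred, gt, labeled)
```

With, for example, labeled = np.array([True, True, False]), the third class never enters the Dice term, but any of its predicted pixels falling inside the two available ground-truth masks are still penalized, matching the per-class behavior walked through in the caption.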
Fig 2. Labels of the extended heartSeg dataset.
Image masks for the three semantic classes are overlaid on two image samples of the medaka hatchlings’ cardiac system as part of the extended heartSeg dataset. The orange label masks depict the ventricle V (based on the heartSeg dataset [3]). The blue and green label masks depict the newly added semantic classes bulbus B and atrium A, respectively.
Fig 3. Experiments.
Visualization of the conducted experiments. Each row is an experiment, each column the applied statistic. The scatter points in each plot indicate the value of the respective performance measure: either the mean Performance Frugality Ratio (PFR) over the observed classes or the mean value (μ) of the respective statistic. The optimal performance is highlighted with a vertical dotted line. The plotted values are presented as a moving average with kernel size two. The left axis corresponds to the respective statistic; the right axis, where present, to the respective PFR. For details see S1 File.
Fig 4. Predictions of different training schemes.
For comparability, the predictions are shown for the same sample (see Fig 2). The first column presents the development of the baseline model’s prediction during training, followed by the ablation experiment with 60% dropped labels, the transfer learning approach with 70% dropped labels, and the merge experiment with a ratio of 63% to 37%. The rows depict the predictions’ state after e training epochs (10, 20, 30, 40, 50). For detailed results of the experiments, see Section 3. The last column shows the validation loss development during training, presented as a moving average with kernel size three.

References

    1. Scherr T, Bartschat A, Reischl M, Stegmaier J, Mikut R. Best practices in deep learning-based segmentation of microscopy images. In: Proceedings of the 28th Workshop Computational Intelligence. KIT Scientific Publishing; 2018. p. 175–196.
    2. Rizwan I Haque I, Neubert J. Deep learning approaches to biomedical image segmentation. Informatics in Medicine Unlocked. 2020;18:100297. doi: 10.1016/j.imu.2020.100297
    3. Schutera M, Just S, Gierten J, Mikut R, Reischl M, Pylatiuk C. Machine learning methods for automated quantification of ventricular dimensions. Zebrafish. 2019;16(6):542–545. doi: 10.1089/zeb.2019.1754
    4. Wang Y, Kanagaraj N, Pylatiuk C, Mikut R, Peravali R, Reischl M. High-throughput data acquisition platform for multi-larvae touch-response behavior screening of zebrafish. IEEE Robotics and Automation Letters. 2021.
    5. Milletari F, Navab N, Ahmadi SA. V-net: Fully convolutional neural networks for volumetric medical image segmentation. In: 4th International Conference on 3D Vision. IEEE; 2016. p. 565–571.