Heliyon. 2024 Feb 13;10(4):e26153. doi: 10.1016/j.heliyon.2024.e26153. eCollection 2024 Feb 29.

Attacking the out-of-domain problem of a parasite egg detection in-the-wild

Nutsuda Penpong et al. Heliyon. 2024.

Abstract

The out-of-domain (OO-Do) problem has hindered machine learning models, especially when they are deployed in the real world. The OO-Do problem occurs during the testing phase, when a trained model must predict on data belonging to classes different from those used for training. We tackle the OO-Do problem in an object-detection task: a parasite-egg detection model used in real-world situations. First, we introduce the In-the-wild parasite-egg dataset to evaluate OO-Do-aware models. The dataset contains 1,552 images uploaded through a chatbot: 1,049 parasite-egg images and 503 OO-Do images. It was constructed by conducting a chatbot test session with 222 medical technology students. Thereafter, we propose a data-driven framework for constructing a parasite-egg recognition model for in-the-wild applications that addresses the OO-Do issue. In the framework, we use publicly available datasets to teach the parasite-egg recognition models both in-domain and out-of-domain concepts. Finally, we compare integration strategies for our proposed two-step parasite-egg detection approach on two test sets: a standard dataset and the In-the-wild dataset. We also investigate different thresholding strategies for model robustness to OO-Do data. Experiments on the two test datasets showed that concatenating an OO-Do-aware classification model after an object-detection model achieved outstanding performance in detecting parasite eggs. The framework gained 7.37% and 4.09% F1-score improvements over the baselines on the Chula-test + Wild-OO-Do dataset and the In-the-wild parasite-egg dataset, respectively.
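As a rough illustration of the two-step approach described above, the sketch below (a minimal sketch with an assumed interface, not the authors' actual implementation) shows candidate boxes from an object-detection model being re-scored by an OO-Do-aware classifier, with detections rejected when their maximum softmax probability falls below a threshold. The detector and classifier callables, the class names, and the 0.5 threshold are all illustrative placeholders.

# Hypothetical sketch of the "detect, then verify" pipeline described in the
# abstract: a detector proposes candidate parasite-egg boxes, and an
# OO-Do-aware classifier re-scores each crop, rejecting low-confidence
# (likely out-of-domain) detections via a softmax threshold.
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple
import math

Box = Tuple[int, int, int, int]  # (x1, y1, x2, y2) in pixels

@dataclass
class Detection:
    box: Box
    label: str    # predicted parasite-egg class
    score: float  # confidence of the stage that produced this detection

def softmax(logits: Sequence[float]) -> List[float]:
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def two_step_detect(
    image,
    detector: Callable[[object], List[Detection]],          # stage 1 (assumed interface)
    classifier: Callable[[object, Box], Sequence[float]],   # stage 2: crop -> class logits
    class_names: Sequence[str],
    softmax_threshold: float = 0.5,                          # illustrative value, not from the paper
) -> List[Detection]:
    """Keep only detections that the OO-Do-aware classifier confirms."""
    accepted: List[Detection] = []
    for det in detector(image):
        probs = softmax(classifier(image, det.box))
        best = max(range(len(probs)), key=probs.__getitem__)
        # A low maximum softmax probability is treated as an OO-Do signal
        # and the detection is rejected.
        if probs[best] >= softmax_threshold:
            accepted.append(Detection(det.box, class_names[best], probs[best]))
    return accepted

In this arrangement the classifier acts as a gatekeeper after the detector, which appears to correspond to the "classification-later" integration named in the figure captions below; other integration strategies would place the OO-Do-aware classifier before the detector.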

Keywords: Chatbot; Computer vision in-the-wild; Data driven framework; Out-of-domain; Parasite egg detection.

Conflict of interest statement

The authors declare the following financial interests/personal relationships which may be considered as potential competing interests: Thanapong Intharah reports financial support was provided by The National Science, Research and Innovation Fund (NSRF), Thailand.

Figures

Figure 1
Differences between out-of-domain test data and out-of-distribution test data, where the training data consist of four classes: English springer, car, parachute, and church. Out-of-distribution samples come from the same classes but differ from the training data in pose, texture, context, or weather. In contrast, out-of-domain (OO-Do) samples come from classes entirely absent from the training data, such as fish, cassette player, golf ball, and French horn.
Figure 2
Conceptual model of the relations between the four datasets in the data-driven framework for training an object-detection model that is aware of the out-of-domain problem. We consider data along two dimensions: the class of the data, i.e., the category it belongs to, and the observation of the data, i.e., what describes it; for computer vision tasks, the observation is the appearance. The hard-negative OO-Do dataset has observations similar to the data in the main task but shares no class with it. The main dataset contains only data from the task-focused classes. The in-the-wild dataset, used for testing, contains data from the task-focused classes with slightly different observations, as well as data from unseen classes (OO-Do samples). The OO-Do dataset contains a wider set of unseen classes with completely different observations.
Figure 3
(a) Ascaris lumbricoides, (b) Hookworms, (c) Opisthorchis viverrini, (d) Taenia spp., (e) Trichuris trichiura, (f) Adult parasite, (g) Artifact, (h) Unclear image, (i) Other parasite egg, (j) Arbitrary.
Figure 4
Overview of the testing process for the classification-first framework.
Figure 5
Overview of training and threshold selection for the classification-first framework.
Figure 6
Overview of the testing process for the classification-later framework.
Figure 7
Overview of training and threshold optimization for the classification-later framework.
Figure 8
Examples of misclassified and correctly classified cases. (a, b) Misclassified cases: Echinostoma spp. (a, above) and a minute intestinal fluke (b, above) in the "other parasite egg" class were incorrectly classified as Fasciolopsis buski (a, below) and OV (b, below), respectively, which could lower detection rates for OO-Do images in this class. (c) The classification-later framework correctly rejecting an OO-Do image: an image of cat eyes (c, above) was detected as Taenia spp. (c, below) by the object-detection model, but it was correctly rejected by the classification model using the SoftMax threshold.
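Figures 5, 7, and 8 refer to finding, optimizing, and applying a SoftMax threshold, but the abstract does not spell out how the threshold values are chosen. The following is only a hedged sketch of one common approach, assumed here for illustration: sweep candidate thresholds over validation-set maximum-softmax scores labelled in-domain vs. OO-Do and keep the value that maximizes F1 (the metric reported in the abstract). The function names and the candidate grid are illustrative, not taken from the paper.

# Illustrative only: pick a softmax rejection threshold by maximizing F1 on a
# validation set of (max-softmax score, is_in_domain) pairs.
from typing import List, Tuple

def f1_at_threshold(scores_labels: List[Tuple[float, bool]], thr: float) -> float:
    # A sample is "accepted" as in-domain when its score reaches the threshold.
    tp = sum(1 for s, in_domain in scores_labels if s >= thr and in_domain)
    fp = sum(1 for s, in_domain in scores_labels if s >= thr and not in_domain)
    fn = sum(1 for s, in_domain in scores_labels if s < thr and in_domain)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def pick_softmax_threshold(scores_labels: List[Tuple[float, bool]]) -> float:
    candidates = [i / 100 for i in range(1, 100)]  # 0.01 .. 0.99, illustrative grid
    return max(candidates, key=lambda thr: f1_at_threshold(scores_labels, thr))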
