Biomedicines. 2024 Jul 31;12(8):1704. doi: 10.3390/biomedicines12081704.

Deep Learning-Based Real-Time Organ Localization and Transit Time Estimation in Wireless Capsule Endoscopy


Seung-Joo Nam et al. Biomedicines. 2024.

Abstract

Background: Wireless capsule endoscopy (WCE) has significantly advanced the diagnosis of gastrointestinal (GI) diseases by allowing for the non-invasive visualization of the entire small intestine. However, machine learning-based methods for organ classification in WCE often rely on color information, leading to decreased performance when obstacles such as food debris are present. This study proposes a novel model that integrates convolutional neural networks (CNNs) and long short-term memory (LSTM) networks to analyze multiple frames and incorporate temporal information, ensuring that it performs well even when visual information is limited.

Methods: We collected data from 126 patients using PillCam™ SB3 (Medtronic, Minneapolis, MN, USA), comprising 2,395,932 images. Our deep learning model was trained to identify organs (stomach, small intestine, and colon) using data from 44 training and 10 validation cases. We applied calibration using a Gaussian filter to enhance the accuracy of detecting organ boundaries. Additionally, we estimated the transit time of the capsule through the stomach and small intestine using a combination of a CNN and an LSTM designed to capture the sequential information of continuous video. Finally, we evaluated the model's performance using WCE videos from 72 patients.
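As an illustration of the architecture described above, the following is a minimal PyTorch sketch of a per-frame CNN feature extractor feeding an LSTM over short frame sequences. The ResNet-18 backbone, hidden size, and window length are assumptions for illustration, not the authors' published configuration.

    import torch
    import torch.nn as nn
    import torchvision.models as models

    class CnnLstmOrganClassifier(nn.Module):
        """Per-frame CNN features + LSTM for sequence-aware organ classification."""
        def __init__(self, num_classes=3, hidden_size=256):
            super().__init__()
            backbone = models.resnet18(weights=None)  # assumed backbone
            self.feature_dim = backbone.fc.in_features
            backbone.fc = nn.Identity()               # strip the classification head
            self.cnn = backbone
            self.lstm = nn.LSTM(self.feature_dim, hidden_size, batch_first=True)
            self.head = nn.Linear(hidden_size, num_classes)

        def forward(self, frames):
            # frames: (batch, seq_len, 3, H, W) -- a window of consecutive frames
            b, t, c, h, w = frames.shape
            feats = self.cnn(frames.reshape(b * t, c, h, w)).reshape(b, t, -1)
            out, _ = self.lstm(feats)                 # temporal context across the window
            return self.head(out[:, -1])              # logits: stomach / small intestine / colon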

Results: Our model demonstrated high performance in organ classification, achieving an accuracy, sensitivity, and specificity of over 95% for each organ (stomach, small intestine, and colon), with an overall accuracy and F1-score of 97.1%. The Matthews Correlation Coefficient (MCC) and Geometric Mean (G-mean) were used to evaluate the model's performance on imbalanced datasets, achieving MCC values of 0.93 for the stomach, 0.91 for the small intestine, and 0.94 for the colon, and G-mean values of 0.96 for the stomach, 0.95 for the small intestine, and 0.97 for the colon. Regarding the estimation of gastric and small intestine transit times, the mean time differences between the model predictions and ground truth were 4.3 ± 9.7 min for the stomach and 24.7 ± 33.8 min for the small intestine. Notably, the model's predictions for gastric transit times were within 15 min of the ground truth for 95.8% of the test dataset (69 out of 72 cases). The proposed model shows overall superior performance compared to a model using only a CNN.
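For reference, both imbalance-aware metrics follow their standard one-vs-rest definitions, computed per organ from the confusion-matrix counts (TP, TN, FP, FN); the abstract does not state the exact averaging scheme, so per-class computation is assumed:

    \mathrm{MCC} = \frac{TP \cdot TN - FP \cdot FN}{\sqrt{(TP+FP)(TP+FN)(TN+FP)(TN+FN)}},
    \qquad
    \text{G-mean} = \sqrt{\mathrm{sensitivity} \times \mathrm{specificity}}

As a consistency check, sensitivity and specificity of roughly 0.96 for the stomach give a G-mean of about \sqrt{0.96 \times 0.96} = 0.96, matching the reported value.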

Conclusions: The combination of CNN and LSTM proves to be both accurate and clinically effective for organ classification and transit time estimation in WCE. Our model's ability to integrate temporal information allows it to maintain high performance even in challenging conditions where color information alone is insufficient. Including MCC and G-mean metrics further validates the robustness of our approach in handling imbalanced datasets. These findings suggest that the proposed method can significantly improve the diagnostic accuracy and efficiency of WCE, making it a valuable tool in clinical practice for diagnosing and managing GI diseases.

Keywords: deep learning; gastrointestinal transit; wireless capsule endoscopy.


Conflict of interest statement

Gwiseong Moon, Jung-Hwan Park, Yoon Kim and Hyun-Soo Choi were employed by the company Ziovision Co., Ltd. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

Figure A1
Effects of the calibration. Three graphs show the prediction results of the original model (a), the model with the class probability correction by applying a Gaussian filter of size 64 (b), and the model with the class probability correction by applying a Gaussian filter of size 128 (c). The X-axis represents the frames of the video. The Y-axis represents the model’s prediction, which classifies the location into the stomach (0), small intestine (1), and colon (2).
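As a rough sketch of the calibration step shown in Figure A1: the per-frame class probabilities are smoothed along the time axis with a 1-D Gaussian filter before taking the argmax. The use of scipy's sigma parameterization is an assumption; the paper specifies filter sizes (64 and 128) rather than sigma values.

    import numpy as np
    from scipy.ndimage import gaussian_filter1d

    def calibrate(probs: np.ndarray, sigma: float = 64.0) -> np.ndarray:
        """probs: (num_frames, 3) softmax outputs for stomach / small intestine / colon."""
        smoothed = gaussian_filter1d(probs, sigma=sigma, axis=0)  # smooth over time
        smoothed /= smoothed.sum(axis=1, keepdims=True)           # renormalize per frame
        return smoothed.argmax(axis=1)  # 0 = stomach, 1 = small intestine, 2 = colon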
Figure A2
Normal pylorus image from the duodenal side.
Figure A3
Poor bowel preparation of small intestine due to fecal material (A) or blood clot (B).
Figure 1
Technical flowchart that shows the overall system development process.
Figure 2
Image preprocessing using center crop.
Figure 3
Diagram of the proposed model, which combines a convolutional neural network (CNN) and a long short-term memory (LSTM). The CNN is used for feature extraction, and the LSTM is used to learn sequence awareness.
Figure 4
The effect of probability calibration. One case before the application of the Gaussian filter (a), and the same case after the application of the Gaussian filter (b). The X-axis represents the frames of the video. The Y-axis represents the model’s prediction, which classifies the location into the stomach (0), small intestine (1), and colon (2). The red circle denotes the effect of the Gaussian filter.
Figure 5
Example of anatomical landmark detection. Red circles indicate the transition points.
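A minimal sketch of how transition points like those in Figure 5 translate into transit times, assuming the boundary is taken as the first frame classified as the next organ and a fixed frame rate (PillCam SB3 captures at an adaptive 2-6 frames per second, so the fixed rate here is a simplification):

    import numpy as np

    def transit_times(labels: np.ndarray, fps: float = 2.0):
        """labels: calibrated per-frame predictions (0 = stomach, 1 = SI, 2 = colon)."""
        pylorus = int(np.argmax(labels == 1))   # first small-intestine frame
        icv = int(np.argmax(labels == 2))       # first colon frame (ileocecal valve)
        gastric_min = pylorus / fps / 60.0              # gastric transit time (min)
        small_bowel_min = (icv - pylorus) / fps / 60.0  # small-bowel transit time (min)
        return gastric_min, small_bowel_min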
Figure 6
Comparison of the model’s prediction in terms of sequence awareness and probability calibration. The top two figures show organ prediction by the model without sequence awareness (a,b), and the bottom two figures are from the sequence-aware model (c,d). The two figures on the left represent the results before probability calibration (Gaussian filter) was applied (a,c), and the two pictures on the right represent the results after the application of the Gaussian filter (b,d). The X-axis represents the frames of the video. The Y-axis represents the model’s prediction, which classifies the location into the stomach (0), small intestine (1), and colon (2).
Figure 7
Time difference between the clinician’s decision and the model’s prediction. The red line represents the median value, with the 0.25 and 0.75 quartiles indicated.
Figure 8
Time difference between the clinician’s decision and the model’s prediction across different pathologies of the small intestine. The red line represents the median value, with the 0.25 and 0.75 quartiles indicated.
