Int J Med Inform. 2023 Oct;178:105178.
doi: 10.1016/j.ijmedinf.2023.105178. Epub 2023 Aug 21.

MBT: Model-Based Transformer for retinal optical coherence tomography image and video multi-classification

Badr Ait Hammou et al. Int J Med Inform. 2023 Oct.

Abstract

Background and objective: The detection of retinal diseases using optical coherence tomography (OCT) images and videos is a concrete example of a data classification problem. In recent years, Transformer architectures have been successfully applied to solve a variety of real-world classification problems. Although they have shown impressive discriminative abilities compared to other state-of-the-art models, improving their performance is essential, especially in healthcare-related problems.

Methods: This paper presents a technique named model-based transformer (MBT). It builds on popular pre-trained transformer models: Vision Transformer and Swin Transformer for OCT image classification, and Multiscale Vision Transformer for OCT video classification. The proposed approach represents OCT data using an approximate sparse representation technique. It then estimates the optimal features and performs data classification.

Results: The experiments are carried out using three real-world retinal datasets. The experimental results on OCT image and OCT video datasets show that the proposed method outperforms existing state-of-the-art deep learning approaches in terms of classification accuracy, precision, recall, F1-score, kappa, AUC-ROC, and AUC-PR. It can also boost the performance of existing transformer models, including Vision Transformer and Swin Transformer for OCT image classification, and Multiscale Vision Transformer for OCT video classification.
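
For readers less familiar with the reported metrics, the following is a minimal sketch of how accuracy, macro-averaged precision, recall and F1-score, and Cohen's kappa can be computed from predicted labels. It is not the authors' evaluation code: the function name `multiclass_metrics`, the macro averaging, and the class labels in the example are assumptions made for illustration, and AUC-ROC and AUC-PR are omitted because they require predicted scores rather than hard labels.

```python
from collections import Counter


def multiclass_metrics(y_true, y_pred):
    """Return accuracy, macro precision/recall/F1 and Cohen's kappa.

    Hypothetical helper for illustration; macro averaging is an assumption,
    since the abstract does not state how per-class scores were averaged.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    n = len(y_true)
    labels = sorted(set(y_true) | set(y_pred))
    true_counts = Counter(y_true)   # samples per true class
    pred_counts = Counter(y_pred)   # samples per predicted class
    # Correct predictions per class (true positives).
    tp = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    precisions, recalls, f1s = [], [], []
    for c in labels:
        prec = tp[c] / pred_counts[c] if pred_counts[c] else 0.0
        rec = tp[c] / true_counts[c] if true_counts[c] else 0.0
        f1 = 2 * prec * rec / (prec + rec) if (prec + rec) else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)

    accuracy = sum(tp.values()) / n
    # Agreement expected by chance, from the marginal class frequencies.
    expected = sum(true_counts[c] * pred_counts[c] for c in labels) / (n * n)
    kappa = (accuracy - expected) / (1 - expected) if expected < 1 else 0.0

    k = len(labels)
    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / k,
        "recall": sum(recalls) / k,
        "f1": sum(f1s) / k,
        "kappa": kappa,
    }


if __name__ == "__main__":
    # Made-up labels, not data from the paper.
    y_true = ["CNV", "CNV", "DME", "DME", "NORMAL", "NORMAL"]
    y_pred = ["CNV", "DME", "DME", "DME", "NORMAL", "NORMAL"]
    for name, value in multiclass_metrics(y_true, y_pred).items():
        print(f"{name}: {value:.4f}")
```

Kappa is useful alongside accuracy because it discounts agreement that would occur by chance given the class frequencies, which matters when retinal disease classes are imbalanced.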

Conclusions: This work presents an approach for the automated detection of retinal diseases. Although deep neural networks have shown great potential in ophthalmology applications, our findings demonstrate for the first time a new way to identify retinal pathologies using OCT videos instead of images. Moreover, our proposal can help researchers enhance the discriminative capacity of a variety of powerful deep learning models presented in published papers. This can be valuable for future directions in medical research and clinical practice.

Keywords: Computer-aided diagnosis; Image classification; Multiscale vision transformer; Optical coherence tomography; Retinal disease classification; Swin Transformer; Video classification; Vision Transformer.

Conflict of interest statement

Declaration of Competing Interest: The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
