Sci Rep. 2021 Mar 1;11(1):4250.
doi: 10.1038/s41598-021-83503-7.

Hierarchical deep learning models using transfer learning for disease detection and classification based on small number of medical images

Guangzhou An et al. Sci Rep. 2021.

Abstract

Deep learning is being employed in disease detection and classification based on medical images for clinical decision making. It typically requires large amounts of labelled data; however, the sample size of such medical image datasets is generally small. This study proposes a novel training framework for building deep learning models for disease detection and classification from small datasets. Our approach is based on a hierarchical classification method in which the healthy/disease information from the first model is used, via transfer learning, to build subsequent models that classify the disease into its sub-types. To improve accuracy, multiple input datasets were used, and a stacking ensemble method was employed for the final classification. To demonstrate the method's performance, a labelled dataset extracted from volumetric ophthalmic optical coherence tomography data for 156 healthy and 798 glaucoma eyes was used, in which glaucoma eyes were further labelled into four sub-types. The average weighted accuracy and Cohen's kappa for three randomized test datasets were 0.839 and 0.809, respectively. Our approach outperformed the flat classification method by 9.7% using smaller training datasets. The results suggest that the framework can perform accurate classification with a small number of medical images.
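The stacking step can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the class names, the toy confidence values, and the nearest-centroid metamodel are all hypothetical stand-ins, since the abstract does not say which learner the metamodel uses. It shows only the data flow described in the Figure 2 caption: the 'normal' confidence from the low-level model and the four subtype confidences from the high-level model are concatenated into one feature vector, and a second-stage model is trained on those vectors.

```python
from collections import defaultdict

N_SUBTYPES = 4  # the abstract reports four glaucoma sub-types


def meta_features(p_normal, subtype_probs):
    """Concatenate the low-level 'normal' confidence with the
    high-level subtype confidences into one metamodel input."""
    if len(subtype_probs) != N_SUBTYPES:
        raise ValueError("expected %d subtype confidences" % N_SUBTYPES)
    return [p_normal] + list(subtype_probs)


def fit_centroids(features, labels):
    """Hypothetical stand-in metamodel: one mean vector per class."""
    grouped = defaultdict(list)
    for x, y in zip(features, labels):
        grouped[y].append(x)
    return {
        y: [sum(col) / len(rows) for col in zip(*rows)]
        for y, rows in grouped.items()
    }


def predict(centroids, x):
    """Return the class whose centroid is closest to x."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(c, x))
    return min(centroids, key=lambda y: sq_dist(centroids[y]))


# Toy confidences (invented for illustration, not data from the study).
train_x = [
    meta_features(0.95, [0.40, 0.30, 0.20, 0.10]),
    meta_features(0.90, [0.25, 0.25, 0.25, 0.25]),
    meta_features(0.05, [0.80, 0.10, 0.05, 0.05]),
    meta_features(0.10, [0.70, 0.10, 0.10, 0.10]),
    meta_features(0.08, [0.10, 0.75, 0.10, 0.05]),
]
train_y = ["normal", "normal", "subtype_1", "subtype_1", "subtype_2"]

centroids = fit_centroids(train_x, train_y)
print(predict(centroids, meta_features(0.07, [0.15, 0.70, 0.10, 0.05])))
```

In a real stacking setup the confidences fed to the metamodel would normally come from predictions on data the base models were not trained on, so that the second stage does not simply learn the base models' overfitting.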

Conflict of interest statement

G.A. and M.A. are employees of Topcon Corporation. The other authors declare no competing interests.

Figures

Figure 1
Overview of our main framework for building deep learning models. The top figure illustrates doctors’ diagnostic process. The bottom figure illustrates the building of deep learning models based on the analyzed characteristics of the doctors’ diagnostic process using three techniques: hierarchical classification, hierarchy transfer learning, and a stacking ensemble method.
Figure 2
Proposed approaches for building deep learning models. Consider a classification problem (class number = n + 1) of normal and disease cases with n subtypes. (a) Proposed approach 1: flat classification is used to directly classify normal eyes and those with four subtypes of disease, with transfer learning from an ImageNet-pretrained CNN model to create Model 1. (b) Proposed approach 2: hierarchical classification is used to create a low-level model (Model 2) for classifying normal versus disease cases and a high-level model (Model 3) for classifying subtypes of disease. Both models apply transfer learning from the ImageNet-pretrained CNN model. The confidence in a ‘normal’ result from Model 2 and the confidence in disease subtypes from Model 3 are concatenated to train Metamodel 1, which calculates the overall result. (c) Proposed approach 3: hierarchical classification with hierarchy transfer learning between the different-level models. In contrast with proposed approach 2, the high-level model (Model 4) for classifying disease subtypes is built by transfer learning from the low-level model (Model 2) instead of from the ImageNet-pretrained CNN model. The normal confidence from Model 2 and the disease subtype confidence from Model 4 are concatenated to train Metamodel 2, which calculates the overall result.
Figure 3
Classification performances of deep learning models built with the proposed approaches. The figure shows Cohen’s kappa with standard deviation error bars for the deep learning models for projection, en face, disc H B-scan, and disc V B-scan images and combination models using multiple-input CNN and stacking built with different proposed strategies.
Figure 4
Performance change with different training dataset sizes. (a) Cohen’s kappa with standard deviation error bars for deep learning models trained with different training methods based on single-input (projection) images. (b) Cohen’s kappa with standard deviation error bars for deep learning models trained with different training methods based on all input images. (c) Reduction in classification performance for models built with different training datasets and other training methods with stacking, relative to Cohen’s kappa of the CNN model built with the proposed method of hierarchical classification and hierarchy transfer learning (HC & HTL) with stacking.
Figure 5
ROC curves for models built using a small training dataset. ROC curves for the stacked hierarchical classification model using hierarchy transfer learning: (a) macro-average ROC curve of all five classes, (b) ROC curves for the classification of the largest data class (MY) versus others, and (c) ROC curves for the classification of the smallest data class (SS) versus others.
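The one-versus-rest evaluation behind Figure 5 can be summarized numerically as an area under the curve (AUC) per class, averaged with equal class weight. The sketch below is an assumption-laden illustration rather than the authors' evaluation code: MY and SS are the two class labels named in the caption, while the class name "other", the three-class setup, and all scores are invented. It computes each AUC as the probability that a positive case scores higher than a negative one, counting ties as one half.

```python
def binary_auc(scores, is_positive):
    """AUC as the fraction of (positive, negative) pairs ranked
    correctly, with ties counted as half a correct pair."""
    pos = [s for s, p in zip(scores, is_positive) if p]
    neg = [s for s, p in zip(scores, is_positive) if not p]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def macro_ovr_auc(probs, labels, classes):
    """One-versus-rest AUC per class and their unweighted mean."""
    per_class = {}
    for k, c in enumerate(classes):
        scores = [row[k] for row in probs]
        per_class[c] = binary_auc(scores, [y == c for y in labels])
    return sum(per_class.values()) / len(per_class), per_class


# Toy predicted confidences (invented), one column per class.
classes = ["MY", "SS", "other"]  # "other" is a hypothetical placeholder
probs = [
    [0.8, 0.1, 0.1],
    [0.6, 0.3, 0.1],
    [0.2, 0.7, 0.1],
    [0.3, 0.6, 0.1],
    [0.1, 0.2, 0.7],
]
labels = ["MY", "MY", "SS", "other", "other"]

macro, per_class = macro_ovr_auc(probs, labels, classes)
print(per_class, macro)
```

The pairwise count is quadratic in the number of cases, which is acceptable for datasets of the size described here; a rank-based formula would be the usual choice for larger ones.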
