Comput Biol Med. 2022 Nov:150:106092.
doi: 10.1016/j.compbiomed.2022.106092. Epub 2022 Sep 28.

SVD-CLAHE boosting and balanced loss function for Covid-19 detection from an imbalanced Chest X-Ray dataset

Santanu Roy et al. Comput Biol Med. 2022 Nov.

Abstract

Covid-19 has had a disastrous effect on the health of the global population over the last two years. Automatic early detection of Covid-19 from Chest X-Ray (CXR) images is a crucial step in combating the disease. In this paper, we propose a novel data-augmentation technique, called SVD-CLAHE Boosting, and a novel loss function, Balanced Weighted Categorical Cross Entropy (BWCCE), to detect Covid-19 efficiently from a highly class-imbalanced CXR image dataset. The proposed SVD-CLAHE Boosting method comprises both over-sampling and under-sampling. First, a novel Singular Value Decomposition (SVD) based contrast enhancement method and Contrast Limited Adaptive Histogram Equalization (CLAHE) are employed to over-sample the minor classes. Simultaneously, Random Under-Sampling (RUS) is applied to the major classes, so that the number of images per class becomes more balanced. Thereafter, the BWCCE loss function is proposed to further reduce the small class imbalance that remains after SVD-CLAHE Boosting. Experimental results reveal that a ResNet-50 model trained on the augmented dataset (by SVD-CLAHE Boosting) with the BWCCE loss function achieved a 95% F1 score, 94% accuracy, 95% recall, 96% precision and 96% AUC, which is far better than the results of other conventional Convolutional Neural Network (CNN) models such as InceptionV3, DenseNet-121 and Xception, as well as other existing models such as Covid-Lite and Covid-Net. Hence, our proposed framework outperforms other existing methods for Covid-19 detection. Furthermore, the same experiment is conducted on a VGG-19 model to check the validity of the proposed framework. Both the ResNet-50 and VGG-19 models are pre-trained on the ImageNet dataset.
We have publicly shared our augmented dataset on Kaggle (https://www.kaggle.com/tr1gg3rtrash/balanced-augmented-covid-cxr-dataset) so that the research community can use it. Our code is available on GitHub (https://github.com/MrinalTyagi/SVD-CLAHE-and-BWCCE).
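
The two rebalancing ideas in the abstract, capping the major classes by random under-sampling and then weighting the loss to absorb the imbalance that remains, can be illustrated with a minimal sketch. The abstract does not state the BWCCE formula, so the inverse-frequency weighting below is a common stand-in and should not be read as the authors' loss. The function names, the class labels "major" and "minor", and the cap of 6 are all hypothetical. The SVD and CLAHE over-sampling step is not shown.

```python
import math
import random
from collections import Counter


def random_under_sample(labels, cap, seed=0):
    """Return sorted sample indices, keeping at most `cap` per class."""
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    kept = []
    for idxs in by_class.values():
        # Only classes larger than the cap are randomly thinned.
        if len(idxs) > cap:
            idxs = rng.sample(idxs, cap)
        kept.extend(idxs)
    return sorted(kept)


def inverse_frequency_weights(labels):
    """Assumed weighting: w_c = N / (K * n_c), so rarer classes weigh more."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * n_c) for c, n_c in counts.items()}


def weighted_cce(probs, label, weights, eps=1e-12):
    """Class-weighted categorical cross entropy for a single sample.

    `probs` maps each class to its predicted probability.
    """
    return -weights[label] * math.log(max(probs[label], eps))


# Hypothetical toy dataset: 10 major-class and 4 minor-class samples.
labels = ["major"] * 10 + ["minor"] * 4
kept = random_under_sample(labels, cap=6)
kept_labels = [labels[i] for i in kept]
weights = inverse_frequency_weights(kept_labels)
loss = weighted_cce({"major": 0.7, "minor": 0.3}, "minor", weights)
```

After under-sampling, the toy dataset still has a 6:4 imbalance, which the class weights then compensate for: a misclassified minor-class sample contributes more to the loss than a major-class sample predicted with the same probability.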

Keywords: Categorical Cross Entropy (CCE); Chest X-Ray (CXR) images; Class imbalance problem; Contrast Limited Adaptive Histogram Equalization (CLAHE); Covid-19 detection; Data augmentation; Singular Value Decomposition (SVD).

Conflict of interest statement

Declaration of Competing Interest: The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Figures

Fig. 1
Block diagram of the entire proposed model (SVD-CLAHE Boosting + ResNet-50 + BWCCE).
Fig. 2
Example of the proposed augmented dataset (by SVD-CLAHE Boosting).
Fig. 3
The entire scheme of the proposed SVD-CLAHE Boosting.
Fig. 4
Visualization of the entire proposed methodology. Fig. 4(a) presents the distribution of major and minor classes in the original dataset, based on the number of images. Figs. 4(b) to 4(d) present the changes in distribution after employing the proposed methodology. Fig. 4(e) indicates the cluster representation of major and minor classes for the original dataset. Figs. 4(f) to 4(h) indicate the changes in cluster representation of major and minor classes after employing RUS, SVD-CLAHE over-sampling, and the BWCCE loss function, respectively. All these diagrams of distribution and cluster representation are purely illustrative and have not been taken from any statistical plot of the dataset.
Fig. 5
Comparison of the performance of several experiments on the ResNet-50 model: (a) training accuracy, (b) training F1 score, (c) training loss, (d) validation accuracy, (e) validation F1 score, (f) validation loss. The experiments, labeled in the diagram, are: ResNet-50 on the original dataset; ResNet-50 on the augmented dataset (by SVD-CLAHE Boosting) with an equal number of images per class; ResNet-50 on the augmented dataset (by the proposed SVD-CLAHE Boosting); ResNet-50 + SVD-CLAHE Boosting + WCCE; and the proposed method (ResNet-50 + SVD-CLAHE Boosting + BWCCE).
Fig. 6
Comparison of the performance of several experiments on the VGG-19 model: (a) training accuracy, (b) training F1 score, (c) training loss, (d) validation accuracy, (e) validation F1 score, (f) validation loss. The experiments, labeled in the diagram, are: VGG-19 on the original dataset; VGG-19 on the augmented dataset (by SVD-CLAHE Boosting) with an equal number of images per class; VGG-19 on the augmented dataset (by the proposed SVD-CLAHE Boosting); VGG-19 + SVD-CLAHE Boosting + WCCE; and the proposed method (VGG-19 + SVD-CLAHE Boosting + BWCCE).
Fig. 7
(a) Number of epochs to convergence for ResNet-50 with different loss functions on the proposed augmented dataset; (b) average time per epoch, in seconds, for ResNet-50 with different loss functions on the proposed augmented dataset.
Fig. 8
Confusion matrices for different experiments on the ResNet-50 model: (a) confusion matrix (CM1) for ResNet-50 on the original dataset; (b) confusion matrix (CM2) for ResNet-50 + SVD-CLAHE Boosting; (c) confusion matrix (CM3) for the proposed methodology (ResNet-50 + SVD-CLAHE Boosting + BWCCE).
