Med Image Anal. 2022 Jan;75:102299. doi: 10.1016/j.media.2021.102299. Epub 2021 Nov 4.

Multi-task vision transformer using low-level chest X-ray feature corpus for COVID-19 diagnosis and severity quantification

Sangjoon Park et al. Med Image Anal. 2022 Jan.

Abstract

Developing a robust algorithm to diagnose and quantify the severity of the novel coronavirus disease 2019 (COVID-19) from chest X-rays (CXRs) requires a large, well-curated COVID-19 dataset, which is difficult to collect during the global COVID-19 pandemic. On the other hand, CXR data with other findings are abundant. This situation is well suited to the Vision Transformer (ViT) architecture, in which a large amount of unlabeled data can be exploited through structural modeling by the self-attention mechanism. However, using the existing ViT directly may not be optimal, because the feature embedding of the standard ViT, obtained by direct patch flattening or a ResNet backbone, is not designed for CXR. To address this problem, we propose a novel multi-task ViT that leverages a low-level CXR feature corpus obtained from a backbone network that extracts common CXR findings. Specifically, the backbone network is first trained on large public datasets to detect common abnormal findings such as consolidation, opacity, and edema. The embedded features from the backbone network are then used as a corpus for a versatile Transformer model for both the diagnosis and the severity quantification of COVID-19. We evaluate our model on external test datasets from entirely different institutions to assess its generalization capability. The experimental results confirm that our model achieves state-of-the-art performance in both diagnosis and severity quantification, with the outstanding generalization capability that is a sine qua non for widespread deployment.
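To make the pipeline described in the abstract concrete, the following is a minimal PyTorch sketch of the overall idea: a backbone (here a toy stand-in for the network pretrained on common CXR findings) produces low-level feature maps, the feature map is flattened into a token corpus for a shared Transformer encoder, and task-specific heads produce diagnosis logits and a per-patch severity map. The layer sizes, the toy backbone, the three-class output, and the head designs are illustrative assumptions, not the authors' exact implementation.

import torch
import torch.nn as nn


class ToyCXRBackbone(nn.Module):
    """Stand-in for the backbone pretrained to detect common CXR findings."""

    def __init__(self, out_channels: int = 256):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 64, kernel_size=7, stride=4, padding=3),
            nn.ReLU(inplace=True),
            nn.Conv2d(64, 128, kernel_size=3, stride=4, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(128, out_channels, kernel_size=3, stride=2, padding=1),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):           # x: (B, 1, H, W) grayscale CXR
        return self.features(x)     # (B, C, H/32, W/32) low-level feature map


class MultiTaskViT(nn.Module):
    """Shared backbone + Transformer with task-specific heads (illustrative)."""

    def __init__(self, feat_dim=256, embed_dim=384, depth=6, heads=6, num_classes=3):
        super().__init__()
        self.backbone = ToyCXRBackbone(feat_dim)
        self.proj = nn.Conv2d(feat_dim, embed_dim, kernel_size=1)   # feature map -> token corpus
        self.cls_token = nn.Parameter(torch.zeros(1, 1, embed_dim))
        layer = nn.TransformerEncoderLayer(d_model=embed_dim, nhead=heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)
        self.diagnosis_head = nn.Linear(embed_dim, num_classes)     # e.g. normal / other infection / COVID-19
        self.severity_head = nn.Linear(embed_dim, 1)                # per-token severity logit

    def forward(self, x):
        feat = self.backbone(x)                                 # (B, C, h, w)
        tokens = self.proj(feat).flatten(2).transpose(1, 2)     # (B, N, D) token corpus
        cls = self.cls_token.expand(tokens.size(0), -1, -1)
        hidden = self.encoder(torch.cat([cls, tokens], dim=1))
        diagnosis_logits = self.diagnosis_head(hidden[:, 0])             # global CLS token
        severity_map = self.severity_head(hidden[:, 1:]).squeeze(-1)     # per-patch scores
        return diagnosis_logits, severity_map


if __name__ == "__main__":
    model = MultiTaskViT()
    logits, sev_map = model(torch.randn(2, 1, 512, 512))
    print(logits.shape, sev_map.shape)   # torch.Size([2, 3]) torch.Size([2, 256])

This mirrors the split described in Fig. 3 below: a shared backbone and Transformer (A) with lightweight task-specific heads (B), which is what lets abundant non-COVID CXR data benefit both COVID-19 tasks.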

Keywords: Chest X-ray; Coronavirus disease-19; Multi-task learning; Vision transformer.

Conflict of interest statement

Declaration of Competing Interest: The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Figures

Graphical abstract
Fig. 1. The analogy between the diagnosis by a clinical expert and by our method.
Fig. 2. Backbone network to extract low-level CXR feature corpus.
Fig. 3. Proposed multi-task Vision Transformer model for diagnosis and severity quantification of COVID-19 on CXR, which consists of (A) shared backbone and Transformer and (B) task-specific heads for each task.
Fig. 4. The procedure of severity prediction and labeling. (A) Map head and ROI max-pooling of the proposed framework (a rough code sketch of this step follows the figure list). (B) Our severity annotation method for severity quantification on CXRs.
Fig. 5. Examples of visualization results for each disease class. (A) Bacterial infection, (B) tuberculosis, and (C) COVID-19 infection.
Fig. 6. Examples of severity quantification results of our models on the external dataset.
Fig. 7. Comparison of localization results in the BIMCV dataset. Green: radiologist's annotation. Yellow: model's prediction after thresholding. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
Fig. 8. Simulation under different prevalence levels of COVID-19.
Fig. 9. Examples of failure cases of the proposed model in the classification task. (A) The model misclassified a case of tuberculosis as COVID-19, as the location and distribution of the consolidative lesions resemble those of COVID-19 (lower and peripheral distribution of patchy consolidations). (B) The model failed to diagnose a faint COVID-19 lesion in the patient's right lower lobe, possibly because the lesion was concealed by the opacity of breast tissue. (C) The model failed to diagnose a mild COVID-19 case, apparently confused by a support device. (D) A severe COVID-19 case was misclassified as other infection; the opacity was located at an unusual site for COVID-19 involvement (right middle lobe), but the model retained proper attention to the abnormal lesions.
Fig. 10. Example of a failure case of the proposed model in the severity quantification task. The model mistook a faint opacity in the right middle lobe for COVID-19 involvement, yielding an overall score higher than the label. Nevertheless, its prediction came close to the label annotation.
Fig. A1. Detailed process of Probabilistic Class Activation Map (PCAM) pooling.
Fig. B1. Details of the model output and post-processing for the severity array in the severity quantification task.
Fig. E1. Examples of cases where our method succeeds while previous classification models fail. (A) The ground truth is COVID-19, but the previous COVID-19 classification models fail to make the correct diagnosis, whereas our model correctly diagnoses COVID-19. (B) Similarly, when the previous COVID-19 classification models make wrong diagnoses, our model correctly diagnoses other infection.
Fig. E2. Examples of cases where our method succeeds while previous severity quantification models fail. (A) The annotated severity score is 2, but the other models fail to predict it correctly (CheXNet, Cohen, and PXS scores are 4, 5, and 3, respectively), whereas our model predicts the correct severity score while providing a severity map with high agreement. (B) In a severe case with a score of 5, our model again predicts the severity correctly while the other models fail (CheXNet, Cohen, and PXS scores are 6, 4, and 2, respectively).
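As a rough illustration of the ROI max-pooling step described for Fig. 4(A), the sketch below reduces a dense per-patch severity map to per-region scores by taking the maximum predicted probability within each lung region and counting the regions above a threshold as a global severity score. The 3x2 region grid, the 0.5 threshold, and the rectangular toy masks are assumptions for illustration only; the paper's actual lung regions and annotation scheme may differ.

import torch


def roi_max_pool_severity(severity_map: torch.Tensor,
                          region_masks: torch.Tensor,
                          threshold: float = 0.5):
    """severity_map: (H, W) probabilities in [0, 1].
    region_masks:  (R, H, W) boolean masks, one per (hypothetical) lung region."""
    scores = []
    for mask in region_masks:
        # ROI max-pooling: the region score is the strongest response inside the region.
        scores.append(severity_map[mask].max() if mask.any()
                      else severity_map.new_tensor(0.0))
    region_scores = torch.stack(scores)                     # (R,)
    global_score = int((region_scores > threshold).sum())   # number of involved regions
    return region_scores, global_score


if __name__ == "__main__":
    sev = torch.rand(12, 16)                              # toy per-patch severity map
    masks = torch.zeros(6, 12, 16, dtype=torch.bool)      # 3x2 grid standing in for lung regions
    for r in range(3):
        for c in range(2):
            masks[r * 2 + c, r * 4:(r + 1) * 4, c * 8:(c + 1) * 8] = True
    region_scores, total = roi_max_pool_severity(sev, masks)
    print(region_scores, total)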

