J Digit Imaging. 2023 Feb;36(1):164-177.
doi: 10.1007/s10278-022-00714-8. Epub 2022 Nov 2.

Improved Fine-Tuning of In-Domain Transformer Model for Inferring COVID-19 Presence in Multi-Institutional Radiology Reports


Pierre Chambon et al. J Digit Imaging. 2023 Feb.

Abstract

Building a document-level classifier for COVID-19 on radiology reports could assist providers in their daily clinical routine, as well as create large numbers of labels for computer vision models. We have developed such a classifier by fine-tuning a BERT-like model initialized from RadBERT, a model obtained by continued pre-training on radiology reports that can be used for all radiology-related tasks. RadBERT outperforms all biomedical pre-trainings on this COVID-19 task (P<0.01) and helps our fine-tuned model achieve an 88.9 macro-averaged F1-score, when evaluated on both X-ray and CT reports. To build this model, we rely on a multi-institutional dataset re-sampled and enriched with concurrent lung diseases, helping the model resist distribution shifts. In addition, we explore a variety of fine-tuning and hyperparameter optimization techniques that accelerate fine-tuning convergence, stabilize performance, and improve accuracy, especially when data or computational resources are limited. Finally, we provide a set of visualization tools and explainability methods to better understand the performance of the model and support its practical use in the clinical setting. Our approach offers a ready-to-use COVID-19 classifier and can be applied similarly to other radiology report classification tasks.
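As a concrete illustration of this setup, here is a minimal sketch of fine-tuning an in-domain BERT-like checkpoint into a three-label report classifier with the Hugging Face Trainer and scoring it with a macro-averaged F1-score. The checkpoint path, the two toy reports, and the hyperparameters are placeholders, not the authors' released artifacts.

```python
# Minimal sketch: fine-tune an in-domain BERT-like encoder into a
# three-label report classifier. CHECKPOINT is a hypothetical path,
# not the authors' released RadBERT weights.
import numpy as np
from datasets import Dataset
from sklearn.metrics import f1_score
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

CHECKPOINT = "path/to/in-domain-radbert-like-checkpoint"  # placeholder
# Label mapping: 0 = no COVID-19, 1 = uncertain COVID-19, 2 = COVID-19
tokenizer = AutoTokenizer.from_pretrained(CHECKPOINT)
model = AutoModelForSequenceClassification.from_pretrained(CHECKPOINT, num_labels=3)

# Two toy reports standing in for the multi-institutional dataset.
ds = Dataset.from_dict({
    "text": ["Bilateral peripheral ground-glass opacities, consistent with COVID-19 pneumonia.",
             "Lungs are clear. No acute cardiopulmonary abnormality."],
    "label": [2, 0],
}).map(lambda b: tokenizer(b["text"], truncation=True, max_length=512), batched=True)

def macro_f1(eval_pred):
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    # Macro-averaging weights the three classes equally despite label imbalance.
    return {"macro_f1": f1_score(labels, preds, average="macro")}

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", num_train_epochs=3,
                           learning_rate=2e-5, per_device_train_batch_size=8),
    train_dataset=ds,
    eval_dataset=ds,  # use a held-out split in practice
    compute_metrics=macro_f1,
    tokenizer=tokenizer,  # enables dynamic padding via the default collator
)
trainer.train()
print(trainer.evaluate())
```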

Keywords: BERT; COVID-19; Classification; Natural language processing (NLP); Radiology; Transformer.


Conflict of interest statement

Personal financial interests: Board of directors and shareholder, Bunkerhill Health; Option holder, whiterabbit.ai; Advisor and option holder, GalileoCDS; Advisor and option holder, Sirona Medical; Advisor and option holder, Adra; Advisor and option holder, Kheiron; Advisor, Sixth Street; Chair, SIIM Board of Directors; Member at Large, Board of Directors of the Pennsylvania Radiological Society; Member at Large, Board of Directors of the Philadelphia Roentgen Ray Society; Member at Large, Board of Directors of the Association of University Radiologists (term just ended in June); Honoraria, Sectra (webinars); Honoraria, British Journal of Radiology (section editor); Speaker honorarium, Icahn School of Medicine (conference speaker); Speaker honorarium, MGH (conference speaker). Recent grant and gift support paid to academic institutions involved: Carestream; Clairity; GE Healthcare; Google Cloud; IBM; IDEXX; Hospital Israelita Albert Einstein; Kheiron; Lambda; Lunit; Microsoft; Nightingale Open Science; Nines; Philips; Subtle Medical; VinBrain; Whiterabbit.ai; Lowenstein Foundation; Gordon and Betty Moore Foundation; Paustenbach Fund. Grant funding: NIH; Independence Blue Cross; RSNA.

Figures

Fig. 1
Our classification task consists of taking the text of a radiology report as input and generating one of three labels: COVID-19, uncertain COVID-19, and no COVID-19
Fig. 2
Our fine-tuning dataset includes radiology reports from 6 sites within the same academic health system, Penn Medicine. The left graph shows the number of reports provided by each site, dominated by three sites. The right graph shows that the label imbalance remains stable across these three main sites and is less consistent across the three remaining sites
Fig. 3
The Tree-structured Parzen Estimator builds empirical distributions on the hyperparameter space and suggests points that are highly likely under the distribution of good trials while being highly unlikely under the distribution of bad trials
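This acquisition strategy is what Optuna's TPESampler implements. Below is a hedged sketch of such a search, where fine_tune_and_eval is a hypothetical helper that trains the classifier with the sampled hyperparameters and returns its validation macro-F1; the search space is illustrative.

```python
# Sketch of TPE-driven hyperparameter search with Optuna, whose TPESampler
# implements the Tree-structured Parzen Estimator described in Fig. 3.
import optuna

def fine_tune_and_eval(learning_rate, batch_size, warmup_ratio):
    # Hypothetical helper: fine-tune the classifier with these settings
    # and return the validation macro-averaged F1-score.
    raise NotImplementedError

def objective(trial):
    # TPE fits one density to the hyperparameters of good trials and one to
    # the bad trials, then proposes points where their ratio is largest.
    lr = trial.suggest_float("learning_rate", 1e-6, 1e-4, log=True)
    batch_size = trial.suggest_categorical("batch_size", [8, 16, 32])
    warmup_ratio = trial.suggest_float("warmup_ratio", 0.0, 0.2)
    return fine_tune_and_eval(lr, batch_size, warmup_ratio)

study = optuna.create_study(direction="maximize",
                            sampler=optuna.samplers.TPESampler(seed=0))
study.optimize(objective, n_trials=50)
print(study.best_params, study.best_value)
```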
Fig. 4
Five hundred transformers, all initialized from our pre-training on radiology reports and fine-tuned for the COVID-19 classification task. In the vast majority of cases, the yellow points, which use our fine-tuning approach, perform better than the blue points, which use a standard fine-tuning procedure. This visualization was obtained using the Weights & Biases platform [39]
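A scatter like this can be produced by logging each fine-tuning run to Weights & Biases, tagged by procedure. The project name, tags, and scores below are illustrative placeholders.

```python
# Illustrative logging of fine-tuning runs to Weights & Biases so the two
# procedures can be compared as in Fig. 4. Names and values are made up.
import wandb

def log_run(procedure, hyperparams, macro_f1):
    run = wandb.init(project="covid19-report-classifier",
                     tags=[procedure], config=hyperparams, reinit=True)
    wandb.log({"macro_f1": macro_f1})
    run.finish()

log_run("improved-finetuning", {"learning_rate": 2e-5}, 0.889)  # placeholder scores
log_run("standard-finetuning", {"learning_rate": 2e-5}, 0.850)
```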
Fig. 5
The red, yellow, and blue lines reflect the prevalence of COVID-19 detected by our model in radiology reports, all from the same academic health system. The green line represents the number of positive cases in the same county (data from the CDC COVID Tracker [40])
Fig. 6
For each report and model output, integrated gradients underline in green the words that contributed positively to the model's decision and in red those that contributed negatively
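One common way to compute such attributions is Captum's LayerIntegratedGradients; in this sketch, model and tokenizer refer to the hypothetical classifier from the earlier fine-tuning sketch, and the baseline simply replaces every token with [PAD].

```python
# Sketch of token-level attributions with integrated gradients via Captum.
# `model` and `tokenizer` are the hypothetical classifier from the earlier
# fine-tuning sketch; positive scores map to green words, negative to red.
import torch
from captum.attr import LayerIntegratedGradients

report = "Bilateral peripheral ground-glass opacities, concerning for COVID-19."
enc = tokenizer(report, return_tensors="pt")
input_ids, attention_mask = enc["input_ids"], enc["attention_mask"]

def forward_logits(input_ids, attention_mask):
    return model(input_ids=input_ids, attention_mask=attention_mask).logits

model.eval()
pred = forward_logits(input_ids, attention_mask).argmax(dim=-1).item()

# Baseline input: the same sequence with every token replaced by [PAD].
baseline_ids = torch.full_like(input_ids, tokenizer.pad_token_id)

lig = LayerIntegratedGradients(forward_logits, model.get_input_embeddings())
attributions = lig.attribute(inputs=input_ids, baselines=baseline_ids,
                             additional_forward_args=(attention_mask,),
                             target=pred)
scores = attributions.sum(dim=-1).squeeze(0)  # one score per token
for token, score in zip(tokenizer.convert_ids_to_tokens(input_ids[0].tolist()), scores):
    print(f"{token:>15s} {score.item():+.3f}")
```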

References

    1. Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, and Illia Polosukhin. Attention is all you need; 2017.
    2. Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. BERT: Pre-training of deep bidirectional transformers for language understanding; 2019.
    3. Thomas Wolf, Lysandre Debut, Victor Sanh, Julien Chaumond, Clement Delangue, Anthony Moi, Pierric Cistac, Tim Rault, Remi Louf, Morgan Funtowicz, Joe Davison, Sam Shleifer, Patrick von Platen, Clara Ma, Yacine Jernite, Julien Plu, Canwen Xu, Teven Le Scao, Sylvain Gugger, Mariama Drame, Quentin Lhoest, and Alexander Rush. Transformers: State-of-the-art natural language processing. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, pages 38–45, Online, October 2020. Association for Computational Linguistics. 10.18653/v1/2020.emnlp-demos.6. https://aclanthology.org/2020.emnlp-demos.6.
    4. Emily Alsentzer, John Murphy, William Boag, Wei-Hung Weng, Di Jindi, Tristan Naumann, and Matthew McDermott. Publicly available clinical BERT embeddings. In Proceedings of the 2nd Clinical Natural Language Processing Workshop, pages 72–78, Minneapolis, Minnesota, USA, June 2019. Association for Computational Linguistics. 10.18653/v1/W19-1909. https://aclanthology.org/W19-1909.
    5. Jinhyuk Lee, Wonjin Yoon, Sungdong Kim, Donghyeon Kim, Sunkyu Kim, Chan Ho So, and Jaewoo Kang. BioBERT: a pre-trained biomedical language representation model for biomedical text mining. Bioinformatics, Sep 2019. ISSN 1460-2059. 10.1093/bioinformatics/btz682.
