J Biomed Inform. 2022 Nov;135:104220.
doi: 10.1016/j.jbi.2022.104220. Epub 2022 Oct 10.

RadioBERT: A deep learning-based system for medical report generation from chest X-ray images using contextual embeddings

Navdeep Kaur et al. J Biomed Inform. 2022 Nov.

Abstract

Background: The increasing number of chest X-ray (CXR) examinations in radiodiagnosis departments burdens radiologists and makes the timely generation of accurate radiological reports highly challenging. An automatic radiological report generation (ARRG) system is envisaged to generate radiographic reports with minimal human intervention, ease radiologists' burden, and smooth the clinical workflow. The success of an ARRG system depends on two critical factors: i) the quality of the features extracted by the ARRG system from the CXR images, and ii) the quality of the linguistic expression generated by the ARRG system to describe the normalities and abnormalities indicated by the extracted features. Most existing ARRG systems fail on the latter factor and do not generate clinically acceptable reports because they ignore the contextual importance of the medical terms.

Objectives: The advent of contextual word embeddings, like ELMo and BERT, has revolutionized several natural language processing (NLP) tasks. A contextual embedding represents a word based on its context. The main objective of this work is to develop an ARRG system that uses contextual word embeddings to generate clinically accurate reports from CXR images.

Methods: We present an end-to-end deep neural network that uses contextual word representations for generating clinically useful radiological reports from CXR images. The proposed network, termed RadioBERT, uses DistilBERT for contextual word representation and leverages transfer learning. Additionally, because abnormal observations matter more than normal ones, the network reorders the generated sentences by applying sentiment analysis so that abnormal descriptions appear at the top of the generated report.
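The sentence reordering step can be illustrated with a minimal sketch. The abstract does not describe the sentiment model itself, so the keyword-based scorer below is a stand-in, and the cue lists and the function names `abnormality_score` and `reorder_report` are hypothetical, chosen only for illustration.

```python
# Hypothetical cue lists: a crude stand-in for the sentiment analysis
# mentioned in the abstract, whose actual model is not specified there.
NEGATION_CUES = (" no ", " normal", " without ", " clear", " unremarkable")
ABNORMAL_CUES = ("opacity", "effusion", "cardiomegaly",
                 "consolidation", "pneumothorax", "enlarged")


def abnormality_score(sentence: str) -> int:
    """Return a rough abnormality score: 0 for negated or normal findings,
    otherwise the number of abnormal cue words found in the sentence."""
    text = " " + sentence.lower()  # leading space lets " no " match at the start
    if any(cue in text for cue in NEGATION_CUES):
        return 0
    return sum(cue in text for cue in ABNORMAL_CUES)


def reorder_report(sentences: list[str]) -> list[str]:
    """Place higher-scoring (more abnormal) sentences first.
    sorted() is stable, so sentences with equal scores keep their order."""
    return sorted(sentences, key=abnormality_score, reverse=True)


report = [
    "The heart size is normal.",
    "There is a small left pleural effusion.",
    "No pneumothorax is seen.",
    "Patchy opacity and consolidation in the right lower lobe.",
]

for line in reorder_report(report):
    print(line)
```

In this toy report the two abnormal findings move ahead of the two normal ones, and the normal sentences keep their original relative order. A trained sentiment classifier would replace `abnormality_score` in practice.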

Results: The empirical study, consisting of several experiments performed on the OpenI dataset, indicates that a CNN + hierarchical LSTM with DistilBERT embeddings improves on the benchmark performance. We achieved the following performance scores: BLEU-1 = 0.772, BLEU-2 = 0.770, BLEU-3 = 0.768, BLEU-4 = 0.767, CIDEr = 0.5563, and ROUGE = 0.897.
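For readers unfamiliar with the BLEU-n scores reported above, the following is a minimal sketch of sentence-level BLEU with a single reference and no smoothing. The abstract does not say which BLEU implementation or evaluation toolkit was used, so this shows only the standard definition (clipped n-gram precision combined with a brevity penalty); the function names and example sentences are illustrative assumptions.

```python
import math
from collections import Counter


def ngrams(tokens: list[str], n: int) -> Counter:
    """Count all contiguous n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def modified_precision(candidate: list[str], reference: list[str], n: int) -> float:
    """Clipped n-gram precision: each candidate n-gram is credited at most
    as many times as it occurs in the reference."""
    cand, ref = ngrams(candidate, n), ngrams(reference, n)
    total = sum(cand.values())
    if total == 0:
        return 0.0
    clipped = sum(min(count, ref[gram]) for gram, count in cand.items())
    return clipped / total


def bleu(candidate: list[str], reference: list[str], max_n: int) -> float:
    """BLEU-max_n: geometric mean of the 1..max_n precisions times a brevity penalty."""
    precisions = [modified_precision(candidate, reference, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0:
        return 0.0  # no smoothing in this sketch
    log_mean = sum(math.log(p) for p in precisions) / max_n
    c, r = len(candidate), len(reference)
    brevity_penalty = 1.0 if c > r else math.exp(1 - r / c)
    return brevity_penalty * math.exp(log_mean)


reference = "the heart size is normal".split()
candidate = "the heart is normal".split()

for n in (1, 2):
    print(f"BLEU-{n} = {bleu(candidate, reference, n):.3f}")
```

Published results are usually computed at corpus level with an established toolkit, so numbers from this sketch are not directly comparable with the scores in the abstract.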

Conclusion: The proposed method improves the state-of-the-art performance scores by a substantial margin. We conclude that word embeddings generated by DistilBERT significantly enhance the performance of the hierarchical LSTM in producing clinical reports.

Keywords: BERT (Bidirectional Encoder Representations from Transformers); Medical report generation; Pretrained language model; Word embedding.

Conflict of interest statement

Declaration of Competing Interest: The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Publication types