BMC Med Inform Decis Mak. 2021 Sep 11;21(1):262.
doi: 10.1186/s12911-021-01623-6.

Automatic detection of actionable radiology reports using bidirectional encoder representations from transformers

Yuta Nakamura et al. BMC Med Inform Decis Mak.

Abstract

Background: It is essential for radiologists to communicate actionable findings to the referring clinicians reliably. Natural language processing (NLP) has been shown to help identify free-text radiology reports that contain actionable findings. However, the application of recent deep learning techniques to radiology reports, which could improve detection performance, has not been thoroughly examined. Moreover, the free text that clinicians enter in the ordering form (order information) has seldom been used to identify actionable reports. This study aims to evaluate the benefits of two new approaches: (1) bidirectional encoder representations from transformers (BERT), a recent deep learning architecture in NLP, and (2) using order information in addition to radiology reports.

Methods: We performed a binary classification to distinguish actionable reports (i.e., radiology reports tagged as actionable in actual radiological practice) from non-actionable ones (those without an actionable tag). We used 90,923 Japanese radiology reports from our hospital, of which 788 (0.87%) were actionable. We evaluated four methods: statistical machine learning with logistic regression (LR) and with gradient boosting decision tree (GBDT), and deep learning with a bidirectional long short-term memory (LSTM) model and with a publicly available Japanese BERT model. Each method was used with two different inputs: radiology reports alone, and pairs of order information and radiology reports. Thus, eight experiments were conducted to examine the performance.
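The arithmetic behind the Methods paragraph can be checked with a short sketch. The report counts and the method and input names come from the abstract; the variable names and the use of `itertools.product` to enumerate the design are illustrative assumptions, not the authors' code.

```python
from itertools import product

# Counts quoted in the Methods paragraph.
TOTAL_REPORTS = 90_923
ACTIONABLE_REPORTS = 788

# Share of reports carrying the actionable tag (the positive class).
prevalence = ACTIONABLE_REPORTS / TOTAL_REPORTS
non_actionable = TOTAL_REPORTS - ACTIONABLE_REPORTS

# Four methods crossed with two input settings give the eight experiments.
METHODS = ("LR", "GBDT", "LSTM", "BERT")
INPUTS = ("report only", "order information + report")
experiments = list(product(METHODS, INPUTS))

if __name__ == "__main__":
    print(f"actionable prevalence: {prevalence:.2%}")
    print(f"non-actionable reports: {non_actionable}")
    for index, (method, inputs) in enumerate(experiments, start=1):
        print(f"experiment {index}: {method} with {inputs}")
```

With fewer than 1% positives, the classes are heavily imbalanced, which is one reason a precision-recall metric is informative alongside AUROC.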

Results: Without order information, BERT achieved the highest area under the precision-recall curve (AUPRC) of 0.5138, which showed a statistically significant improvement over LR, GBDT, and LSTM, and the highest area under the receiver operating characteristic curve (AUROC) of 0.9516. Simply coupling the order information with the radiology reports slightly increased the AUPRC of BERT but did not lead to a statistically significant improvement. This may be due to the complexity of clinical decisions made by radiologists.
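For readers unfamiliar with the two metrics, the following is a minimal pure-Python sketch of how AUROC and an AUPRC estimate can be computed from labels and scores. It uses average precision as the AUPRC estimator and assumes distinct scores; the abstract does not state which estimator the authors used, and the toy labels and scores are invented for illustration only.

```python
def auroc(labels, scores):
    """Probability that a random positive scores above a random negative.

    Ties between a positive and a negative count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


def average_precision(labels, scores):
    """Step-wise area under the precision-recall curve.

    Assumes all scores are distinct, so the ranking is unambiguous.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    n_positive = sum(labels)
    hits = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank  # precision at this positive's rank
    return total / n_positive


# Invented toy data: 2 positives among 10 examples.
toy_labels = [0, 0, 1, 0, 0, 0, 1, 0, 0, 0]
toy_scores = [0.10, 0.40, 0.80, 0.20, 0.90, 0.30, 0.35, 0.05, 0.15, 0.25]

if __name__ == "__main__":
    print("AUROC:", auroc(toy_labels, toy_scores))
    print("average precision:", average_precision(toy_labels, toy_scores))
```

A ranking made at random has an expected AUPRC roughly equal to the positive rate (about 0.0087 for this dataset), whereas its AUROC is about 0.5. The two reported numbers therefore sit on very different baselines.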

Conclusions: BERT was considered useful for detecting actionable reports. More sophisticated methods are required to use order information effectively.

Keywords: Actionable finding; Bidirectional encoder representations from transformers (BERT); Deep learning; Natural language processing (NLP); Radiology reports.

Conflict of interest statement

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Figures

Fig. 1
Preprocessing for each radiology report. Note that subwords recognized by SentencePiece are not precisely substrings of words in the grammatical sense, because SentencePiece automatically constructs its vocabulary without a dictionary. For this reason, SentencePiece sometimes treats a long phrase as one subword or, conversely, a single character as one subword.
Fig. 2
Detection of actionable reports with BERT (a) without and (b) with order information. Each sequence was fed into the BERT classifier after adding special tokens (e.g., [CLS] and [SEP]) and padding with [PAD] tokens, as required by the standard specification of BERT. Generally, BERT models output a 512 × 768 hidden state matrix (indicated by **), part of which is a 768-dimensional feature vector for classification tasks (indicated by *). We used only the feature vector for classification tasks and discarded the rest of the hidden state matrix, as in the standard procedure.
Fig. 3
Detection of actionable reports with statistical machine learning (a) without and (b) with order information.
Fig. 4
Precision-recall curves for detection of actionable reports achieved by each method.
Fig. 5
ROC curves for detection of actionable reports achieved by each method, with optimal cut-off points shown as open circles.
Fig. 6
Top n-grams with positive and negative coefficients with the largest absolute values of the LR models (a) without and (b) with order information. Only the top 25 n-grams are shown when more than 25 n-grams had non-zero coefficients. N-grams in order information are marked with [Order]. The translation is not given for n-grams too short to make sense. Negation appears among the n-grams with the smallest (most negative) coefficients.
Fig. 7
Top 25 n-grams with the largest feature importance of GBDT (a) without and (b) with order information. N-grams in order information are marked with [Order]. The translation is not given for n-grams too short to make sense.
Fig. 8
Examples of LSTM and BERT predictions for two truly actionable reports with visualization of attention scores. (a) is an explicit actionable report detected without order information, and (b) is an implicit actionable report detected using order information. Report (a) points out hydronephrosis due to ureteral calculus in a postoperative CT examination for rectal cancer, and report (b) points out a lung nodule in a CT examination performed more than four years after surgery for esophageal carcinoma. "<unk>" stands for out-of-vocabulary subwords that were not recognized by the LSTM and BERT classifiers. Subwords with relatively high attention scores are colored red. For clearer visualization, Japanese periods are not colored.
Fig. 9
Three implicit actionable radiology reports pointing out incidental pneumothorax, all of which were successfully identified only by BERT. BERT attention scores are visualized in the same way as in Fig. 8. Although none of the three radiology reports emphasizes urgency or explicitly recommends clinical actions, BERT has given high attention scores to the disease name "pneumothorax."
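The input layout described in the Fig. 2 caption can be sketched as follows. This is a hedged illustration, not the authors' pipeline: the function names `build_input` and `cls_vector` are hypothetical, the placeholder subwords are invented, and two details are assumptions the caption does not spell out, namely that order information precedes the report in the paired input and that the classification feature vector is the row at the [CLS] position (index 0).

```python
MAX_LEN = 512   # sequence length given in the Fig. 2 caption
HIDDEN = 768    # hidden size given in the Fig. 2 caption


def build_input(report_tokens, order_tokens=None, max_len=MAX_LEN):
    """Return (tokens, attention_mask) padded to max_len.

    Layout without order information: [CLS] report [SEP] [PAD]...
    Layout with order information (assumed segment order):
        [CLS] order [SEP] report [SEP] [PAD]...
    """
    order = list(order_tokens) if order_tokens is not None else None
    n_special = 2 if order is None else 3
    budget = max_len - n_special - (len(order) if order else 0)
    if budget < 0:
        raise ValueError("order information alone exceeds max_len")
    report = list(report_tokens)[:budget]  # truncate the report to fit

    tokens = ["[CLS]"]
    if order is not None:
        tokens += order + ["[SEP]"]
    tokens += report + ["[SEP]"]

    n_real = len(tokens)
    tokens += ["[PAD]"] * (max_len - n_real)
    attention_mask = [1] * n_real + [0] * (max_len - n_real)
    return tokens, attention_mask


def cls_vector(hidden_states):
    """Keep only the row at the [CLS] position and discard the rest."""
    return hidden_states[0]


if __name__ == "__main__":
    # Invented placeholder subwords standing in for SentencePiece output.
    tokens, mask = build_input(["lung", "nodule"], order_tokens=["follow", "up"])
    print(tokens[:8], sum(mask))

    # Dummy 512 x 768 matrix standing in for the BERT hidden states.
    hidden = [[float(row)] * HIDDEN for row in range(MAX_LEN)]
    print(len(cls_vector(hidden)))
```

Truncating the report rather than the order information is a design choice made only for this sketch; the caption does not say how over-long inputs were handled.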
