Eur Radiol. 2023 Jun;33(6):4228-4236. doi: 10.1007/s00330-023-09526-y. Epub 2023 Mar 11.

Transformer-based structuring of free-text radiology report databases

S Nowak et al. Eur Radiol. 2023 Jun.

Abstract

Objectives: To provide insights for on-site development of transformer-based structuring of free-text report databases by investigating different labeling and pre-training strategies.

Methods: A total of 93,368 German chest X-ray reports from 20,912 intensive care unit (ICU) patients were included. Two labeling strategies were investigated to tag six findings reported by the attending radiologist. First, a system based on human-defined rules was applied to annotate all reports (termed "silver labels"). Second, 18,000 reports were manually annotated in 197 h (termed "gold labels"), of which 10% were used for testing. An on-site model pre-trained with masked-language modeling (Tmlm) was compared to a public, medically pre-trained model (Tmed). Both models were fine-tuned for text classification on silver labels only, on gold labels only, and on silver labels followed by gold labels (hybrid training), using varying numbers (N: 500, 1000, 2000, 3500, 7000, 14,580) of gold labels. Macro-averaged F1-scores (MAF1) in percent were calculated with 95% confidence intervals (CI).
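As a rough illustration of this two-stage strategy (on-site MLM pre-training followed by fine-tuning for classification), the sketch below uses the Hugging Face transformers and datasets libraries; the base checkpoint, file names, hyperparameters, and the multi-label treatment of the six findings are assumptions for illustration, not details reported in the study.

```python
# Minimal sketch, assuming the Hugging Face transformers/datasets libraries.
# The base checkpoint, file names, and hyperparameters below are illustrative only.
from datasets import load_dataset
from transformers import (AutoModelForMaskedLM, AutoModelForSequenceClassification,
                          AutoTokenizer, DataCollatorForLanguageModeling,
                          Trainer, TrainingArguments)

base = "bert-base-german-cased"  # assumed German base model, not the study's checkpoint
tokenizer = AutoTokenizer.from_pretrained(base)

# Stage 1: on-site masked-language-model pre-training on the unlabeled reports (Tmlm).
reports = load_dataset("text", data_files={"train": "reports.txt"})["train"]
reports = reports.map(lambda b: tokenizer(b["text"], truncation=True, max_length=512),
                      batched=True, remove_columns=["text"])
mlm_trainer = Trainer(
    model=AutoModelForMaskedLM.from_pretrained(base),
    args=TrainingArguments(output_dir="tmlm", num_train_epochs=3),
    train_dataset=reports,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm_probability=0.15),
)
mlm_trainer.train()
mlm_trainer.save_model("tmlm")

# Stage 2: fine-tune the pre-trained encoder to classify the six findings.
# Silver-only, gold-only, and hybrid training differ only in which labeled
# dataset(s) are passed to the fine-tuning Trainer, and in which order.
classifier = AutoModelForSequenceClassification.from_pretrained(
    "tmlm", num_labels=6, problem_type="multi_label_classification")
```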

Results: Tmlm,gold (95.5 [94.5-96.3]) showed significantly higher MAF1 than Tmed,silver (75.0 [73.4-76.5]) and Tmlm,silver (75.2 [73.6-76.7]), but not significantly higher MAF1 than Tmed,gold (94.7 [93.6-95.6]), Tmed,hybrid (94.9 [93.9-95.8]), and Tmlm,hybrid (95.2 [94.3-96.0]). When using 7000 or fewer gold-labeled reports, Tmlm,gold (N: 7000, 94.7 [93.5-95.7]) showed significantly higher MAF1 than Tmed,gold (N: 7000, 91.5 [90.0-92.8]). With at least 2000 gold-labeled reports, utilizing silver labels did not lead to a significant improvement of Tmlm,hybrid (N: 2000, 91.8 [90.4-93.2]) over Tmlm,gold (N: 2000, 91.4 [89.9-92.8]).
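For readers who want to reproduce this kind of evaluation, a minimal sketch of the macro-averaged F1 metric with a 95% CI follows; the abstract does not state how the intervals were obtained, so a nonparametric bootstrap over the test reports is assumed here purely for illustration.

```python
# Minimal sketch: macro-averaged F1 in percent with a bootstrap 95% CI.
# The bootstrap procedure is an assumption; the study's CI method is not stated here.
import numpy as np
from sklearn.metrics import f1_score

def macro_f1_with_ci(y_true, y_pred, n_boot=2000, seed=0):
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    rng = np.random.default_rng(seed)
    point = 100 * f1_score(y_true, y_pred, average="macro")
    n = len(y_true)
    boot = [100 * f1_score(y_true[idx], y_pred[idx], average="macro")
            for idx in (rng.integers(0, n, n) for _ in range(n_boot))]
    low, high = np.percentile(boot, [2.5, 97.5])   # 95% confidence bounds
    return point, (low, high)
```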

Conclusions: Custom pre-training of transformers and fine-tuning on manual annotations promises to be an efficient strategy to unlock report databases for data-driven medicine.

Key points:
• On-site development of natural language processing methods that retrospectively unlock free-text report databases of radiology clinics for data-driven medicine is of great interest.
• For clinics seeking to develop methods on-site to retrospectively structure the report database of a certain department, it remains unclear which of the previously proposed strategies for labeling reports and pre-training models is most appropriate in the context of, e.g., available annotator time.
• Using a custom pre-trained transformer model, along with a little annotation effort, promises to be an efficient way to retrospectively structure radiological databases, even when millions of reports are not available for pre-training.

Keywords: Deep learning; Intensive care units; Natural language processing; Radiology; Thorax.


Conflict of interest statement

The authors of this manuscript declare no relationships with any companies whose products or services may be related to the subject matter of the article.

Figures

Fig. 1
Overview of the presented study. The dataset includes a total of 93,368 free-text chest X-ray reports of intensive care unit patients. For a subset of the dataset, human annotations of the occurrence of six findings within the reports were generated to create gold-labeled training, validation, and test datasets. Furthermore, a rule-based system was applied for silver-label generation. An on-site model pre-trained with masked-language modeling (Tmlm) was compared to a public, medically pre-trained model (Tmed) when adapting to silver labels only, to gold labels only, and to silver labels followed by gold labels (hybrid). To also give insights into which pre-training and labeling strategy is most appropriate in the context of available human annotation time, the models were developed using varying numbers of gold-labeled reports.
Fig. 2
Model performances for different numbers of gold-labeled reports. F1-scores in % (y-axis) are displayed for the rule-based (RB) system (black), as well as for Tmed,gold (blue), Tmlm,gold (orange), TFIDFgold (green), and Tmlm,hybrid (red), using various numbers of gold-labeled reports for training (x-axis).
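The TFIDFgold curve in Fig. 2 refers to a conventional bag-of-words baseline trained on gold labels only; the classifier and vectorizer settings are not given in the abstract, so the sketch below pairs TF-IDF features with one-vs-rest logistic regression as an assumed stand-in.

```python
# Hypothetical TF-IDF baseline in the spirit of TFIDFgold; the logistic-regression
# classifier and vectorizer settings are assumptions, not the study's configuration.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import make_pipeline

tfidf_baseline = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2), min_df=2),             # word and bigram features
    OneVsRestClassifier(LogisticRegression(max_iter=1000)),    # one classifier per finding
)
# tfidf_baseline.fit(gold_train_reports, gold_train_labels)    # gold-labeled reports only
```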
