Res Diagn Interv Imaging. 2023 Mar 27:6:100027.
doi: 10.1016/j.redii.2023.100027. eCollection 2023 Jun.

BERT-based natural language processing analysis of French CT reports: Application to the measurement of the positivity rate for pulmonary embolism

Émilien Jupin-Delevaux et al. Res Diagn Interv Imaging.

Abstract

Rationale and objectives: To develop a Natural Language Processing (NLP) method based on Bidirectional Encoder Representations from Transformers (BERT), adapted to French CT reports, and to evaluate its performance in calculating the diagnostic yield of CT in patients with clinical suspicion of pulmonary embolism (PE).

Materials and methods: All CT reports produced at our institution in 2019 (99,510 reports; training and validation dataset) and 2018 (94,559 reports; testing dataset) were included after anonymization. Two BERT-based NLP sentence classifiers were trained on 27,700 manually labeled sentences from the training dataset. The first classified report sentences into three classes ("Non chest", "Healthy chest", and "Pathological chest"); the second classified sentences of the last class into eleven pathology subclasses, including "pulmonary embolism". The F1-score was reported on the validation dataset. These classifiers were then applied to the reports of CT examinations requested for pulmonary embolism in the testing dataset. Sensitivity, specificity, and accuracy for detecting the presence of a pulmonary embolism were reported against human analysis of the reports.
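The two-stage cascade described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: `toy_stage1` and `toy_stage2` are hypothetical keyword-based stand-ins for the two fine-tuned BERT classifiers, the example sentences are in English rather than French, and the rule that a report counts as positive when any sentence is labeled "pulmonary embolism" is an assumption, since the abstract does not state how sentence labels are aggregated per report.

```python
from typing import Callable, Iterable

SentenceClassifier = Callable[[str], str]


def toy_stage1(sentence: str) -> str:
    """Hypothetical stand-in for the 3-class classifier (keyword rules, not BERT)."""
    s = sentence.lower()
    if not any(word in s for word in ("pulmonary", "lung", "chest", "pleural")):
        return "Non chest"
    if any(word in s for word in ("no ", "normal", "without")):
        return "Healthy chest"
    return "Pathological chest"


def toy_stage2(sentence: str) -> str:
    """Hypothetical stand-in for the 11-subclass classifier (two labels only here)."""
    return "pulmonary embolism" if "embolism" in sentence.lower() else "other pathology"


def report_is_pe_positive(
    sentences: Iterable[str],
    stage1: SentenceClassifier,
    stage2: SentenceClassifier,
) -> bool:
    """Assumed aggregation: positive if any pathological chest sentence is labeled PE."""
    for sentence in sentences:
        # Only sentences that stage 1 flags as pathological reach stage 2.
        if stage1(sentence) != "Pathological chest":
            continue
        if stage2(sentence) == "pulmonary embolism":
            return True
    return False


positive_report = [
    "Liver is unremarkable.",
    "Bilateral pulmonary embolism involving the lobar arteries.",
]
negative_report = ["No pulmonary embolism.", "Normal lung parenchyma."]

print(report_is_pe_positive(positive_report, toy_stage1, toy_stage2))
print(report_is_pe_positive(negative_report, toy_stage1, toy_stage2))
```

Passing the classifiers in as callables keeps the cascade logic independent of the model; a real sentence classifier could be substituted for the toy functions without changing `report_is_pe_positive`.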

Results: The F1-scores of the 3-Classes and 11-SubClasses classifiers were 0.984 and 0.985, respectively. In the testing dataset, 4,042 examinations were requested for pulmonary embolism, of which 641 (15.8%) were evaluated as positive by radiologists. The sensitivity, specificity, and accuracy of the NLP network for identifying pulmonary embolism in these reports were 98.2%, 99.3%, and 99.1%, respectively.
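For reference, the report-level metrics named above follow directly from a confusion matrix against the human reading. The sketch below uses the standard definitions; the counts in `toy` are hypothetical and chosen only for illustration, because the abstract does not give the underlying confusion matrix. Only the positivity rate uses figures from the abstract (641 positive examinations out of 4,042).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    tp: int  # reports positive for PE by both NLP and human reading
    fn: int  # positive by human reading, missed by NLP
    tn: int  # negative by both
    fp: int  # negative by human reading, flagged by NLP


def sensitivity(c: ConfusionCounts) -> float:
    return c.tp / (c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    return c.tn / (c.tn + c.fp)


def accuracy(c: ConfusionCounts) -> float:
    return (c.tp + c.tn) / (c.tp + c.fn + c.tn + c.fp)


def positivity_rate(positives: int, total: int) -> float:
    """Diagnostic yield: share of requested examinations that were positive."""
    return positives / total


# Hypothetical counts, not taken from the study.
toy = ConfusionCounts(tp=90, fn=10, tn=880, fp=20)
print(f"sensitivity={sensitivity(toy):.3f}")
print(f"specificity={specificity(toy):.3f}")
print(f"accuracy={accuracy(toy):.3f}")

# Counts reported in the abstract for the testing dataset.
print(f"positivity rate={positivity_rate(641, 4042):.4f}")
```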

Conclusion: A BERT-based NLP sentence classifier enables the analysis of large databases of radiological reports to accurately determine the diagnostic yield of CT screening.

Keywords: CT; Natural language processing; Pulmonary embolism.

Conflict of interest statement

The authors declare that they have no known competing financial or personal relationships that could be viewed as influencing the work reported in this paper.

Figures

Fig. 1. Sankey diagram of the 27,731 labels used to train and validate the classifiers.
