Skip to main page content
U.S. flag

An official website of the United States government

Dot gov

The .gov means it’s official.
Federal government websites often end in .gov or .mil. Before sharing sensitive information, make sure you’re on a federal government site.

Https

The site is secure.
The https:// ensures that you are connecting to the official website and that any information you provide is encrypted and transmitted securely.

Access keys NCBI Homepage MyNCBI Homepage Main Content Main Navigation
. 2022 Jun:2022:288-296.
doi: 10.1109/ichi54592.2022.00050. Epub 2022 Sep 8.

Radiology Text Analysis System (RadText): Architecture and Evaluation

Affiliations

Radiology Text Analysis System (RadText): Architecture and Evaluation

Song Wang et al. Proc (IEEE Int Conf Healthc Inform). 2022 Jun.

Abstract

Analyzing radiology reports is a time-consuming and error-prone task, which raises the need for an efficient automated radiology report analysis system to alleviate the workloads of radiologists and encourage precise diagnosis. In this work, we present RadText, a high-performance open-source Python radiology text analysis system. RadText offers an easy-to-use text analysis pipeline, including de-identification, section segmentation, sentence split and word tokenization, named entity recognition, parsing, and negation detection. Superior to existing widely used toolkits, RadText features a hybrid text processing schema, supports raw text processing and local processing, which enables higher accuracy, better usability and improved data privacy. RadText adopts BioC as the unified interface, and also standardizes the output into a structured representation that is compatible with Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM), which allows for a more systematic approach to observational research across multiple, disparate data sources. We evaluated RadText on the MIMIC-CXR dataset, with five new disease labels that we annotated for this work. RadText demonstrates highly accurate classification performances, with a 0.91 average precision, 0.94 average recall and 0.92 average F-1 score. We also annotated a test set for the five new disease labels to facilitate future research or applications. We have made our code, documentations, examples and the test set available at https://github.com/bionlplab/radtext.

Keywords: Natural Language Processing; Radiology; Text Analysis Systems.

PubMed Disclaimer

Figures

Fig. 1.
Fig. 1.
Overview of RadText’s NLP pipeline, main components and implementations.
Fig. 2.
Fig. 2.
The obtained dependency graph of “There is no pleural effusion or pneumothorax” using Stanza [28].

References

    1. Brady, “Error and discrepancy in radiology: inevitable or avoidable?” Insights into Imaging, vol. 8, 12 2016. - PMC - PubMed
    1. Savova G, Masanz J, Ogren P, Zheng J, Sohn S, Kipper-Schuler K, and Chute C, “Mayo clinical text analysis and knowledge extraction system (ctakes): Architecture, component evaluation and applications,” Journal of the American Medical Informatics Association : JAMIA, vol. 17, pp. 507–13, 09 2010. - PMC - PubMed
    1. Neumann M, King D, Beltagy I, and Ammar W, “ScispaCy: Fast and Robust Models for Biomedical Natural Language Processing,” in Proceedings of the 18th BioNLP Workshop and Shared Task. Florence, Italy: Association for Computational Linguistics, Aug. 2019, pp. 319–327. [Online]. Available: https://www.aclweb.org/anthology/W19-5034
    1. Liu H, Bielinski SJ, Sohn S, Murphy S, Wagholikar KB, Jonnalagadda SR, Ravikumar K, Wu ST, Kullo IJ, and Chute CG, “An information extraction framework for cohort identification using electronic health records,” AMIA Joint Summits on Translational Science proceedings. AMIA Joint Summits on Translational Science, vol. 2013, p. 149–153, 2013. [Online]. Available: https://europepmc.org/articles/PMC3845757 - PMC - PubMed
    1. Soysal E, Wang J, Jiang M, Wu Y, Pakhomov S, Liu H, and Xu H, “CLAMP – a toolkit for efficiently building customized clinical natural language processing pipelines,” Journal of the American Medical Informatics Association, vol. 25, no. 3, pp. 331–336, 11 2017. [Online]. Available: 10.1093/jamia/ocx132 - DOI - PMC - PubMed

LinkOut - more resources