J Am Med Inform Assoc. 2016 Mar;23(2):304-10.
doi: 10.1093/jamia/ocv080. Epub 2015 Jul 1.

Preparing a collection of radiology examinations for distribution and retrieval

Dina Demner-Fushman et al. J Am Med Inform Assoc. 2016 Mar.

Abstract

Objective: Clinical documents made available for secondary use play an increasingly important role in discovery of clinical knowledge, development of research methods, and education. An important step in facilitating secondary use of clinical document collections is easy access to descriptions and samples that represent the content of the collections. This paper presents an approach to developing a collection of radiology examinations, including both the images and radiologist narrative reports, and making them publicly available in a searchable database.

Materials and methods: The authors collected 3996 radiology reports from the Indiana Network for Patient Care and 8121 associated images from the hospitals' picture archiving systems. The images and reports were de-identified automatically and then the automatic de-identification was manually verified. The authors coded the key findings of the reports and empirically assessed the benefits of manual coding on retrieval.

Results: The automatic de-identification of the narrative was aggressive and achieved 100% precision at the cost of rendering a few findings uninterpretable. Automatic de-identification of images was less complete: images for two of 3996 patients (0.05%) showed protected health information. Manual encoding of findings improved retrieval precision.
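
The two quantities in the results can be illustrated with a minimal sketch. The patient counts are the ones reported above. The function names and the document ids in the retrieval example are hypothetical, and precision is computed under its standard definition (the fraction of retrieved documents that are relevant), which the abstract does not spell out.

```python
def percentage(part: int, whole: int) -> float:
    """Return part/whole expressed as a percentage."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return 100.0 * part / whole


def retrieval_precision(retrieved: list[str], relevant: set[str]) -> float:
    """Fraction of retrieved documents that are relevant (standard definition)."""
    if not retrieved:
        return 0.0
    hits = sum(1 for doc_id in retrieved if doc_id in relevant)
    return hits / len(retrieved)


# Reported in the abstract: images for 2 of 3996 patients showed PHI.
phi_rate = percentage(2, 3996)
print(f"{phi_rate:.2f}%")  # 0.05%

# Hypothetical report ids, for illustration only (not data from the paper).
retrieved = ["r1", "r2", "r3", "r4"]
relevant = {"r1", "r3", "r4"}
print(retrieval_precision(retrieved, relevant))  # 0.75
```

Comparing this precision value for the same queries with and without the manually coded findings is one way such an improvement could be measured; the abstract does not describe the authors' exact evaluation protocol.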

Conclusion: Stringent de-identification methods can remove all identifiers from text radiology reports. DICOM de-identification of images does not remove all identifying information, and images scanned from film need special attention. Adding manual coding to the radiologist narrative reports significantly improved relevancy of the retrieved clinical documents. The de-identified Indiana chest X-ray collection is available for searching and downloading from the National Library of Medicine (http://openi.nlm.nih.gov/).

Keywords: abstracting and indexing; biometric identification; information storage and retrieval; medical records; radiography.

Figures

Figure 1: A sample radiology report with manual and MTI annotations. Terms removed by the automatic text scrubber are replaced with XXXX. “COPD” in the impression section is annotated with the MeSH term “Pulmonary Disease, Chronic Obstructive.” “Scarring” is translated to MeSH term “Cicatrix.”

Figure 2: Images with (A) hospital tag in the lower right-hand corner (the actual tag data has been obscured for privacy reasons) and (B) partially visible Medtronic device-specific radiopaque alphanumeric code and jaw outline and teeth.

Figure 3: Grid view of search results for pneumonia in the radiology collection indexed in Open-i.
