2021 Jul 16;8(1):183.
doi: 10.1038/s41597-021-00967-y.

A DICOM dataset for evaluation of medical image de-identification

Michael Rutherford et al. Sci Data.

Abstract

We developed a DICOM dataset that can be used to evaluate the performance of de-identification algorithms. DICOM objects (a total of 1,693 CT, MRI, PET, and digital X-ray images) were selected from datasets published in the Cancer Imaging Archive (TCIA). Synthetic Protected Health Information (PHI) was generated and inserted into selected DICOM Attributes to mimic typical clinical imaging exams. The DICOM Standard and TCIA curation audit logs guided the insertion of synthetic PHI into standard and non-standard DICOM data elements. A TCIA curation team tested the utility of the evaluation dataset. With this publication, the evaluation dataset (containing synthetic PHI) and de-identified evaluation dataset (the result of TCIA curation) are released on TCIA in advance of a competition, sponsored by the National Cancer Institute (NCI), for algorithmic de-identification of medical image datasets. The competition will use a much larger evaluation dataset constructed in the same manner. This paper describes the creation of the evaluation datasets and guidelines for their use.
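The idea of inserting synthetic PHI and then checking what a de-identification algorithm leaves behind can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes a DICOM object can be stood in for by a plain dictionary keyed by standard attribute keywords (PatientName, PatientID, PatientBirthDate), and the value pool and the functions `insert_synthetic_phi`, `find_leaks`, and `naive_deidentify` are hypothetical names introduced only for illustration. Real evaluation would operate on DICOM files, including non-standard data elements.

```python
import random

# Hypothetical pool of synthetic names; not taken from the paper.
SYNTHETIC_NAMES = ["DOE^JANE", "ROE^RICHARD", "SMITH^ALEX"]


def insert_synthetic_phi(attributes, rng):
    """Return a copy of `attributes` with synthetic PHI added, plus an answer key."""
    phi = {
        "PatientName": rng.choice(SYNTHETIC_NAMES),
        "PatientID": f"{rng.randrange(10**7):07d}",
        # DICOM dates use the YYYYMMDD form.
        "PatientBirthDate": (
            f"{rng.randrange(1930, 2000)}"
            f"{rng.randrange(1, 13):02d}{rng.randrange(1, 29):02d}"
        ),
    }
    with_phi = dict(attributes)  # leave the caller's dictionary untouched
    with_phi.update(phi)
    return with_phi, phi


def find_leaks(deidentified, answer_key):
    """List attribute names whose value still contains an inserted PHI string."""
    phi_values = set(answer_key.values())
    return sorted(
        name
        for name, value in deidentified.items()
        if any(phi in str(value) for phi in phi_values)
    )


def naive_deidentify(attributes):
    """Toy de-identifier (an assumption): blanks only PatientName and PatientID."""
    cleaned = dict(attributes)
    for name in ("PatientName", "PatientID"):
        cleaned[name] = ""
    return cleaned


rng = random.Random(0)
original = {"Modality": "CT", "StudyDescription": "CHEST"}
with_phi, key = insert_synthetic_phi(original, rng)
leaks = find_leaks(naive_deidentify(with_phi), key)
print(leaks)  # ['PatientBirthDate']
```

Because the answer key records exactly which values were inserted, any value that survives de-identification can be reported by attribute name, which is the role the synthetic PHI plays in an evaluation dataset of this kind.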

Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1. Schematic description of the processing steps involved in the creation of the evaluation dataset and de-identified evaluation dataset.
Fig. 2. Schematic description of the standard TCIA Curation Workflow based on the Posda tool suite.
