Review
Eur Radiol Exp. 2024 Feb 6;8(1):11.
doi: 10.1186/s41747-023-00408-y.

Image annotation and curation in radiology: an overview for machine learning practitioners

Fabio Galbusera et al. Eur Radiol Exp. 2024.

Abstract

"Garbage in, garbage out" summarises well the importance of high-quality data in machine learning and artificial intelligence. All data used to train and validate models should indeed be consistent, standardised, traceable, correctly annotated, and de-identified, considering local regulations. This narrative review presents a summary of the techniques that are used to ensure that all these requirements are fulfilled, with special emphasis on radiological imaging and freely available software solutions that can be directly employed by the interested researcher. Topics discussed include key imaging concepts, such as image resolution and pixel depth; file formats for medical image data storage; free software solutions for medical image processing; anonymisation and pseudonymisation to protect patient privacy, including compliance with regulations such as the Regulation (EU) 2016/679 "General Data Protection Regulation" (GDPR) and the 1996 United States Act of Congress "Health Insurance Portability and Accountability Act" (HIPAA); methods to eliminate patient-identifying features within images, like facial structures; free and commercial tools for image annotation; and techniques for data harmonisation and normalisation.

Relevance statement
This review provides an overview of the methods and tools that can be used to ensure high-quality data for machine learning and artificial intelligence applications in radiology.

Key points
• High-quality datasets are essential for reliable artificial intelligence algorithms in medical imaging.
• Software tools like ImageJ and 3D Slicer aid in processing medical images for AI research.
• Anonymisation techniques protect patient privacy during dataset preparation.
• Machine learning models can accelerate image annotation, enhancing efficiency and accuracy.
• Data curation ensures dataset integrity, compliance, and quality for artificial intelligence development.
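
As an illustration of two of the topics listed in the abstract, pseudonymisation and intensity normalisation, the following is a minimal sketch using only the Python standard library. It is not taken from the review: the function names `pseudonymise` and `zscore_normalise`, the use of keyed hashing (HMAC-SHA256), and the choice of z-score normalisation are all assumptions made for illustration. Whether such a scheme satisfies GDPR or HIPAA depends on local regulations and on how the secret key is managed.

```python
import hashlib
import hmac
import statistics


def pseudonymise(patient_id: str, secret_key: bytes) -> str:
    """Map a patient identifier to a stable pseudonym (hypothetical helper).

    The same ID and key always give the same pseudonym, so scans of one
    patient stay linked, while the original ID cannot be read back
    without the key.
    """
    digest = hmac.new(secret_key, patient_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]  # truncated for readability (assumption)


def zscore_normalise(pixels: list[float]) -> list[float]:
    """Rescale intensities to zero mean and unit standard deviation."""
    mean = statistics.fmean(pixels)
    std = statistics.pstdev(pixels)  # population standard deviation
    if std == 0:
        # A constant image has no contrast; return zeros to avoid dividing by 0.
        return [0.0 for _ in pixels]
    return [(p - mean) / std for p in pixels]


if __name__ == "__main__":
    key = b"example-key-kept-outside-the-dataset"  # placeholder value
    print(pseudonymise("PATIENT-0001", key))
    print(zscore_normalise([100.0, 200.0, 300.0]))
```

In practice, images would be read with a dedicated library and normalised as arrays; plain lists are used here only to keep the sketch dependency-free.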

Keywords: Artificial intelligence; Data curation; Image processing (computer-assisted); Machine learning; Privacy.


Conflict of interest statement

F. Galbusera is a member of the European Radiology Experimental Editorial Board. He has not taken part in the review or selection process of this article.

The other author declares no competing interests.

Figures

Fig. 1
A screenshot of Fiji, a version of ImageJ packaged with several plugins
Fig. 2
3D Slicer, a free software package for medical image processing and analysis
Fig. 3
ITK-Snap, a tool for manual and semi-automatic image segmentation using active contours
Fig. 4
Example of face removal from MRI scans of the head obtained with “Pydefacer”. The first row depicts the original images, while the second shows the same images after the removal of the face. Reprinted from [22] (no permission required)
Fig. 5
Screenshot of VGG Image Annotator, a free online platform for image annotation
Fig. 6
Custom image annotation software developed by the authors, aimed at localising landmarks (green circles) in 3D images of the spine. This Python application runs on a local computer and does not require sharing any information over the Internet. The user interface is designed to minimise any human interaction not directly aimed at localising the landmarks on the images, considerably speeding up the annotation workflow

References

    1. Philbrick KA, Weston AD, Akkus Z, et al. RIL-Contour: a medical imaging dataset annotation tool for and with deep learning. J Digit Imaging. 2019;32:571–581. doi: 10.1007/s10278-019-00232-0.
    2. Hao Z, Ge H, Wang L. Visual attention mechanism and support vector machine based automatic image annotation. PLoS One. 2018;13(11):e0206971. doi: 10.1371/journal.pone.0206971.
    3. Channin DS, Mongkolwat P, Kleper V, Sepukar K, Rubin DL. The caBIG annotation and image markup project. J Digit Imaging. 2010;23:217–225. doi: 10.1007/s10278-009-9193-9.
    4. Monteiro E, Costa C, Oliveira JL. A de-identification pipeline for ultrasound medical images in DICOM format. J Med Syst. 2017;41:89. doi: 10.1007/s10916-017-0736-1.
    5. Rieke N, Hancox J, Li W, Milletarì F, et al. The future of digital health with federated learning. NPJ Digit Med. 2020;3:119. doi: 10.1038/s41746-020-00323-1.
