J Digit Imaging. 2013 Dec;26(6):1045-57.
doi: 10.1007/s10278-013-9622-7.

The Cancer Imaging Archive (TCIA): maintaining and operating a public information repository


Kenneth Clark et al. J Digit Imaging. 2013 Dec.

Abstract

The National Institutes of Health have placed significant emphasis on sharing of research data to support secondary research. Investigators have been encouraged to publish their clinical and imaging data as part of fulfilling their grant obligations. Realizing it was not sufficient to merely ask investigators to publish their collections of imaging and clinical data, the National Cancer Institute (NCI) created the open-source National Biomedical Image Archive software package as a mechanism for centralized hosting of cancer-related imaging. NCI has contracted with Washington University in Saint Louis to create The Cancer Imaging Archive (TCIA), an open-source, open-access information resource to support research, development, and educational initiatives utilizing advanced medical imaging of cancer. In its first year of operation, TCIA accumulated 23 collections (3.3 million images). Operating and maintaining a high-availability image archive is a complex challenge involving varied archive-specific resources and driven by the needs of both image submitters and image consumers. Quality archives of any type (traditional library, PubMed, refereed journals) require management and customer service. This paper describes the management tasks and user support model for TCIA.

Figures

Fig. 1
TCIA operations overview. TCIA project managers negotiate new collection details with each image submitter, then supply the submitter with a de-identification/re-identification script and Clinical Trial Processor (CTP) software that the submitter uses to transmit de-identified images to the TCIA intake server. The TCIA management team reviews images to make sure that counts match those of the submitter, no images have been quarantined, and re-identified IDs are as they should be. Management then directs quality control (QC) processing involving visual inspection and a thorough analysis of DICOM headers. Armed with QC results and a new CTP script to cleanse images of unwanted DICOM tag values, management moves the images to the TCIA public server, from which properly authenticated image consumers may download images. A Support Center assists image consumers with gaining such access. A TCIA wiki hosts collection-specific details, a FAQ for typical TCIA questions and answers, and a user guide for submitting images. TCIA managers use a private portion of the wiki to shepherd collection accrual and documentation thereof. Systems and network personnel facilitate a streamlined operation; software developers improve operations mechanisms and reporting
Fig. 2
Image quality control process. Darkened boxes are step processes; light boxes specify roles required to complete processes. Images submitted to the TCIA intake server are reviewed by managers to make sure DICOM patient IDs have been properly assigned, no images have been quarantined, and image counts match the submitter’s counts. Managers notify curators to begin visual inspection tasks and request TCIA DICOM experts to perform a TagSniffer analysis. A DICOM expert uses the analysis findings to create a CTP script that a manager will use to move the images to the TCIA public server. Curators perform another visual inspection of the public images, and both curators and managers review a new TagSniffer report designed to expose any lingering PHI. Public-server images are then rendered “Visible”
Fig. 3
Submission status at a glance. By way of example, several collections of varying priority and status are shown. Status “active” means images are being received, are being curated, or are expected shortly; “inactive” means the submitter has agreed to send images but is not quite ready to do so; “complete” means all images for the collection have arrived and are available on the public server. Priorities are assigned by TCIA management and are based on the needs of identified research groups
Fig. 4
TCIA home page (https://www.cancerimagingarchive.net)
Fig. 5
TCIA public-server number of images (vertical bars; left vertical axis log-scale) and number of contributing sites (right vertical axis) by month
Fig. 6
(a) Study percentages by modality. (b) Series percentages by modality. (c) Image percentages by modality
Fig. 7
Image percentages by anatomy. “Other” includes head–neck, kidney, lung, and non-image objects
Fig. 8
TCIA registered user accounts by month
Fig. 9
TCIA image-series downloads by month
Fig. 10
Request Tracker (RT) tickets by month
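
The intake review described in the Fig. 1 and Fig. 2 captions (image counts match the submitter's counts, no images quarantined, patient IDs properly assigned) could be sketched roughly as below. This is an illustrative sketch only, not TCIA's actual tooling: the `IntakeBatch` record, its field names, the `intake_problems` function, and the idea that re-identified IDs share a known prefix are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class IntakeBatch:
    """Hypothetical summary of one submission on the intake server."""
    submitter_count: int     # image count reported by the submitter
    received_count: int      # images that arrived on the intake server
    quarantined_count: int   # images held back during transfer
    patient_ids: List[str]   # re-identified DICOM patient IDs
    id_prefix: str           # assumed prefix for every re-identified ID


def intake_problems(batch: IntakeBatch) -> List[str]:
    """Return the reasons a batch is not ready for curation (empty if none)."""
    problems = []
    # Check 1: image counts must match the submitter's counts.
    if batch.received_count != batch.submitter_count:
        problems.append(
            f"count mismatch: received {batch.received_count}, "
            f"submitter reported {batch.submitter_count}"
        )
    # Check 2: no images may have been quarantined.
    if batch.quarantined_count > 0:
        problems.append(f"{batch.quarantined_count} image(s) quarantined")
    # Check 3: every patient ID must look re-identified (assumed prefix rule).
    bad_ids = [p for p in batch.patient_ids if not p.startswith(batch.id_prefix)]
    if bad_ids:
        problems.append(f"unexpected patient IDs: {', '.join(bad_ids)}")
    return problems
```

An empty result would correspond to the point at which managers notify curators to begin visual inspection; any non-empty result would go back to the submitter before QC starts.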

