Radiology. 2020 May;295(2):328-338.
doi: 10.1148/radiol.2020191145. Epub 2020 Mar 10.

The Image Biomarker Standardization Initiative: Standardized Quantitative Radiomics for High-Throughput Image-based Phenotyping

Alex Zwanenburg #, Martin Vallières #, Mahmoud A Abdalah, Hugo J W L Aerts, Vincent Andrearczyk, Aditya Apte, Saeed Ashrafinia, Spyridon Bakas, Roelof J Beukinga, Ronald Boellaard, Marta Bogowicz, Luca Boldrini, Irène Buvat, Gary J R Cook, Christos Davatzikos, Adrien Depeursinge, Marie-Charlotte Desseroit, Nicola Dinapoli, Cuong Viet Dinh, Sebastian Echegaray, Issam El Naqa, Andriy Y Fedorov, Roberto Gatta, Robert J Gillies, Vicky Goh, Michael Götz, Matthias Guckenberger, Sung Min Ha, Mathieu Hatt, Fabian Isensee, Philippe Lambin, Stefan Leger, Ralph T H Leijenaar, Jacopo Lenkowicz, Fiona Lippert, Are Losnegård, Klaus H Maier-Hein, Olivier Morin, Henning Müller, Sandy Napel, Christophe Nioche, Fanny Orlhac, Sarthak Pati, Elisabeth A G Pfaehler, Arman Rahmim, Arvind U K Rao, Jonas Scherer, Muhammad Musib Siddique, Nanna M Sijtsema, Jairo Socarras Fernandez, Emiliano Spezi, Roel J H M Steenbakkers, Stephanie Tanadini-Lang, Daniela Thorwarth, Esther G C Troost, Taman Upadhaya, Vincenzo Valentini, Lisanne V van Dijk, Joost van Griethuysen, Floris H P van Velden, Philip Whybra, Christian Richter, Steffen Löck

Alex Zwanenburg et al. Radiology. 2020 May.

Abstract

Background: Radiomic features may quantify characteristics present in medical imaging. However, the lack of standardized definitions and validated reference values has hampered clinical use.

Purpose: To standardize a set of 174 radiomic features.

Materials and Methods: Radiomic features were assessed in three phases. In phase I, 487 features were derived from the basic set of 174 features. Twenty-five research teams with unique radiomics software implementations computed feature values directly from a digital phantom, without any additional image processing. In phase II, 15 teams computed values for 1347 derived features using a CT image of a patient with lung cancer and predefined image processing configurations. In both phases, consensus among the teams on the validity of tentative reference values was measured through the frequency of the modal value and classified as follows: fewer than three matches, weak; three to five matches, moderate; six to nine matches, strong; 10 or more matches, very strong. In the final phase (phase III), a public data set of multimodality images (CT, fluorine 18 fluorodeoxyglucose PET, and T1-weighted MRI) from 51 patients with soft-tissue sarcoma was used to prospectively assess reproducibility of standardized features.

Results: Consensus on reference values was initially weak for 232 of 302 features (76.8%) at phase I and 703 of 1075 features (65.4%) at phase II. At the final iteration, weak consensus remained for only two of 487 features (0.4%) at phase I and 19 of 1347 features (1.4%) at phase II. Strong or better consensus was achieved for 463 of 487 features (95.1%) at phase I and 1220 of 1347 features (90.6%) at phase II. Overall, 169 of 174 features were standardized in the first two phases. In the final validation phase (phase III), most of the 169 standardized features could be excellently reproduced (166 with CT; 164 with PET; and 164 with MRI).

Conclusion: A set of 169 radiomics features was standardized, which enabled verification and calibration of different radiomics software.

© RSNA, 2020. Online supplemental material is available for this article. See also the editorial by Kuhl and Truhn in this issue.
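The consensus rule described in the abstract (frequency of the modal value, mapped to four levels) can be sketched as follows. This is only an illustration under stated assumptions: the function name `consensus_level` and the use of rounding to a fixed number of decimals to decide when two teams "match" are hypothetical choices, not the matching procedure used in the study.

```python
from collections import Counter


def consensus_level(team_values, decimals=3):
    """Classify consensus on a tentative reference value for one feature.

    team_values: one computed feature value per research team.
    decimals: ASSUMPTION, values are treated as matching when they agree
              after rounding to this many decimals.
    Returns (modal_value, number_of_matching_teams, consensus_label).
    """
    rounded = [round(v, decimals) for v in team_values]
    # The modal value is the most frequently submitted (rounded) value.
    modal_value, matches = Counter(rounded).most_common(1)[0]

    # Thresholds as stated in the abstract.
    if matches < 3:
        label = "weak"
    elif matches <= 5:
        label = "moderate"
    elif matches <= 9:
        label = "strong"
    else:
        label = "very strong"
    return modal_value, matches, label


# Six of eight hypothetical teams agree on 0.5.
print(consensus_level([0.5] * 6 + [0.7, 0.9]))  # (0.5, 6, 'strong')
```

With a real tolerance margin instead of rounding, values that straddle a rounding boundary would be grouped differently, so the matching step is the part most in need of replacement in practice.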

Figures

Figure 1:
Flowchart of study overview. The workflow in a typical radiomics analysis starts with acquisition and reconstruction of a medical image. Subsequently, the image is segmented to define regions of interest (ROI). Afterward, radiomics software is used to process the image and to compute features that characterize an ROI. We focused on standardizing the image processing and feature computation steps. Standardization was performed within two iterative phases. In phase I, we used a specially designed digital phantom to obtain reference values for radiomics features directly. In phase II, a publicly available CT image of a patient with lung cancer was used to obtain reference values for features under predefined configurations of a standardized general radiomics image processing scheme. Standardization of image processing and feature computation steps in radiomics software was prospectively validated during phase III by assessing reproducibility of standardized features in a publicly available multimodality cohort of 51 patients with soft-tissue sarcoma. 18F-FDG = fluorine 18 fluorodeoxyglucose, T1w = T1-weighted.
Figure 2:
Flowchart of the general radiomics image processing scheme for computing radiomics features. Image processing starts with reconstructed images. These images are processed through several optional steps: data conversion (eg, conversion to standardized uptake values), image postacquisition processing (eg, image denoising), and image interpolation. Either the region of interest (ROI) is created automatically during the segmentation step, or an existing ROI is retrieved. The ROI is then interpolated as well, and intensity and morphologic masks are created as copies. The intensity mask may be resegmented according to intensity values to improve comparability of intensity ranges across a cohort. Radiomics features are then computed from the image masked by the ROI and its immediate neighborhood (local intensity features) or the ROI itself (all others). Image intensities are moreover discretized prior to computation of features from the intensity histogram (IH), intensity-volume histogram (IVH), gray-level co-occurrence matrix (GLCM), gray-level run-length matrix (GLRLM), gray-level size-zone matrix (GLSZM), gray-level distance-zone matrix (GLDZM), neighborhood gray-tone difference matrix (NGTDM), and neighboring gray-level dependence matrix (NGLDM) families. All processing steps from image interpolation to the computation of radiomics features were evaluated in this study.
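The mask handling and discretization steps named in this caption can be illustrated on a flattened toy image. The sketch below is a simplification under stated assumptions: the function names, the intensity range, and the toy values are hypothetical, and the fixed-bin-number discretization shown is one common choice, since the caption does not say which discretization method applies.

```python
import math


def resegment(intensities, roi_mask, low, high):
    """Intensity mask: ROI voxels whose intensity lies within [low, high]."""
    return [m and (low <= v <= high) for v, m in zip(intensities, roi_mask)]


def discretize_fixed_bin_number(values, n_bins):
    """Map intensities to bins 1..n_bins (ASSUMED discretization method)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [1] * len(values)
    # The maximum intensity would land in bin n_bins + 1, so clamp it.
    return [min(n_bins, math.floor(n_bins * (v - lo) / (hi - lo)) + 1)
            for v in values]


# Toy data: six voxels; the last one lies outside the ROI.
intensities = [-1200, -50, 0, 40, 120, 900]
roi_mask = [True, True, True, True, True, False]

# Intensity and morphologic masks start as copies of the ROI.
morphologic_mask = list(roi_mask)
# Only the intensity mask is resegmented (hypothetical range -100 to 200).
intensity_mask = resegment(intensities, roi_mask, low=-100, high=200)

roi_values = [v for v, m in zip(intensities, intensity_mask) if m]
bins = discretize_fixed_bin_number(roi_values, n_bins=4)

print(intensity_mask)  # [False, True, True, True, True, False]
print(bins)            # [1, 2, 3, 4]
```

Keeping the morphologic mask untouched while resegmenting only the intensity mask mirrors the caption's distinction between the two copies of the ROI.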
Figure 3:
Bar graphs depict participation and radiomics feature coverage by research teams. A, Graph shows the number of research teams at each analysis time point during the two phases of the iterative standardization process. Teams computed features without prior image processing (phase I) and after image processing (phase II), with the aim of finding reference values for a feature. Consensus on the validity of reference values was assessed at each time point, the time between which was variable (arbitrary unit [arb. unit]). B, Graph shows the final coverage of radiomics features implemented by each team in phase I, as well as the team’s ability to reproduce the reference value of a feature. We were unable to obtain reliable reference values for five features (no ref. value). The teams are listed in Appendix E1 (online). BCOM = Institute of Research and Technology b<>com, Brest, CaPTk = Cancer Imaging Phenomics Toolkit, CERR = Computational Environment for Radiological Research, KCL = King’s College London, LUMC = Leiden University Medical Center, MAASTRO = Maastro, Maastricht, the Netherlands, MaCha = Marie-Charlotte Desseroit, MIRP = Medical Image Radiomics Processor, MITK = Medical Imaging Interaction Toolkit, QIFE = Quantitative Image Feature Engine, RaCaT = Radiomics Calculator, SERA = Standardized Environment for Radiomics Analysis, UCSF = University of California, San Francisco, UMCG = University Medical Center Groningen, USZ = University of Zurich.
Figure 4:
Bar graphs depict iterative development of consensus on the validity of reference values for radiomics features. We tried to find reliable reference values for radiomics features in an iterative standardization process. In phase I, features were computed without prior image processing, whereas in phase II, features were assessed after image processing with five predefined configurations (configurations A–E; Appendix E1 [online]). The panels show, A, the overall development of consensus on the validity of (tentative) reference values in phases I and II and, B, the development of consensus in phase II, according to image processing configuration. Consensus on the validity of a reference value is based on the number of research teams that produce the same value for a feature (weak: <3; moderate: three to five; strong: six to nine; very strong: ≥10). We analyzed consensus at each of the analysis time points, the time between which was variable (arbitrary unit; arb. unit). New features were included at time points 5 and 22, causing an apparent decrease in consensus. For phase II, we first analyzed consensus at time point 10. Image processing configurations C and D were altered after time point 16. Configuration E was altered after revising the resegmentation processing step at time point 22. See Appendix E1 (online) for more information regarding the timeline.
Figure 5:
Bar graph shows reproducibility of standardized radiomics features. We assessed reproducibility of 169 standardized features on a validation cohort of 51 patients with soft-tissue sarcoma using multimodality imaging (CT, fluorine 18 fluorodeoxyglucose PET, and T1-weighted MRI; shown as CT, PET and MRI) according to the feature values computed by research teams. We assigned each feature to a reproducibility category based on the lower boundary of the 95% confidence interval of the two-way random effects, single rater, absolute agreement intraclass correlation coefficient of the feature (poor: <0.50; moderate: 0.50–0.75; good: 0.75–0.90; excellent: ≥0.90). Five features could not be standardized in this study. Two features with unknown reproducibility were computed by fewer than two teams during validation.
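The reproducibility measure named in this caption can be sketched in two parts: the point estimate of the two-way random effects, single rater, absolute agreement intraclass correlation coefficient, often written ICC(2,1), and the mapping from a confidence-interval lower bound to a category. The sketch is illustrative only. The function names are hypothetical, the 95% confidence interval itself is not computed here (the lower bound is taken as an input), and treating each category range as inclusive of its lower edge is an assumption, because the caption leaves the shared boundaries 0.75 and 0.90 ambiguous.

```python
def icc_2_1(x):
    """Point estimate of ICC(2,1) for x[subject][rater] (patients x teams)."""
    n, k = len(x), len(x[0])
    grand = sum(sum(row) for row in x) / (n * k)
    row_means = [sum(row) / k for row in x]
    col_means = [sum(x[i][j] for i in range(n)) / n for j in range(k)]

    # Two-way ANOVA sums of squares.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in x for v in row)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)              # between-subject mean square
    msc = ss_cols / (k - 1)              # between-rater mean square
    mse = ss_err / ((n - 1) * (k - 1))   # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


def reproducibility_category(ci_lower):
    """Map the lower bound of the 95% CI of the ICC to a category.

    ASSUMPTION: each range includes its lower edge (e.g. 0.75 -> 'good').
    """
    if ci_lower >= 0.90:
        return "excellent"
    if ci_lower >= 0.75:
        return "good"
    if ci_lower >= 0.50:
        return "moderate"
    return "poor"


# Two hypothetical teams that differ by a constant offset of 1:
# absolute agreement penalizes the offset, giving an ICC of 2/3.
print(icc_2_1([[1, 2], [2, 3], [3, 4]]))
print(reproducibility_category(0.93))  # excellent
```

Because the category depends on the lower confidence bound rather than the point estimate, a feature with a high ICC but few teams or patients can still fall into a lower category.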
