Comparative Study

Med Phys. 2017 Feb;44(2):479-496.

doi: 10.1002/mp.12041.

Multi-site quality and variability analysis of 3D FDG PET segmentations based on phantom and clinical image data

Reinhard R Beichel¹ ², Brian J Smith³, Christian Bauer¹, Ethan J Ulrich¹ ⁴, Payam Ahmadvand⁵, Mikalai M Budzevich⁶, Robert J Gillies⁶, Dmitry Goldgof⁷, Milan Grkovski⁸, Ghassan Hamarneh⁵, Qiao Huang⁹, Paul E Kinahan¹⁰, Charles M Laymon¹¹ ¹², James M Mountz¹², John P Muzi¹⁰, Mark Muzi¹⁰, Sadek Nehmeh¹³, Matthew J Oborski¹¹, Yongqiang Tan⁹, Binsheng Zhao⁹, John J Sunderland¹⁴, John M Buatti¹⁵

Affiliations

¹ Department of Electrical and Computer Engineering, The University of Iowa, Iowa City, IA, USA.
² Department of Internal Medicine, The University of Iowa, Iowa City, IA, USA.
³ Department of Biostatistics, The University of Iowa, Iowa City, IA, USA.
⁴ Department of Biomedical Engineering, The University of Iowa, Iowa City, IA, USA.
⁵ School of Computing Science, Simon Fraser University, Burnaby, Canada.
⁶ H Lee Moffitt Cancer Center, Tampa, FL, USA.
⁷ Department of Computer Science and Engineering, University of South Florida, Tampa, FL, USA.
⁸ Department of Medical Physics, Memorial Sloan Kettering Cancer Center, New York, NY, USA.
⁹ Department of Radiology, Columbia University Medical Center, New York, NY, USA.
¹⁰ Department of Radiology, University of Washington Medical Center, Seattle, WA, USA.
¹¹ Department of Bioengineering, University of Pittsburgh, Pittsburgh, PA, USA.
¹² Department of Radiology, University of Pittsburgh, Pittsburgh, PA, USA.
¹³ National Center for Cancer Care and Research, Doha, Qatar.
¹⁴ Department of Radiology, The University of Iowa, Iowa City, IA, USA.
¹⁵ Department of Radiation Oncology, The University of Iowa, Iowa City, IA, USA.

PMID: 28205306
PMCID: PMC5834232
DOI: 10.1002/mp.12041

Reinhard R Beichel et al. Med Phys. 2017 Feb.


Abstract

Purpose: Radiomics utilizes a large number of image-derived features for quantifying tumor characteristics that can in turn be correlated with response and prognosis. Unfortunately, extraction and analysis of such image-based features are subject to measurement variability and bias. The challenge for radiomics is particularly acute in Positron Emission Tomography (PET), where limited resolution, a high noise component related to the limited stochastic nature of the raw data, and the wide variety of reconstruction options confound quantitative feature metrics. Extracted feature quality is also affected by the tumor segmentation methods used to define the regions over which features are calculated, making it challenging to produce consistent radiomics analysis results across multiple institutions that use different segmentation algorithms in their PET image analysis. Understanding each element contributing to these inconsistencies in quantitative image feature and metric generation is paramount for ultimate utilization of these methods in multi-institutional trials and clinical oncology decision making.

Methods: To assess segmentation quality and consistency at the multi-institutional level, we conducted a study of seven institutional members of the National Cancer Institute Quantitative Imaging Network. For the study, members were asked to segment a common set of phantom PET scans acquired over a range of imaging conditions as well as a second set of head and neck cancer (HNC) PET scans. Segmentations were generated at each institution using their preferred approach. In addition, participants were asked to repeat segmentations with a time interval between initial and repeat segmentation. Overall, this procedure resulted in 806 phantom insert and 641 lesion segmentations. Subsequently, the volume was computed from each segmentation and compared to the corresponding reference volume by means of statistical analysis.
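The volume comparison described in the Methods can be illustrated with a minimal sketch. It assumes a binary segmentation mask stored as nested lists and a signed relative error expressed in percent of the reference volume; the function names, the toy mask, and the voxel size are hypothetical and are not taken from the paper.

```python
def mask_volume_ml(mask, voxel_size_mm):
    """Volume of a binary 3D mask (nested lists of 0/1) in millilitres."""
    dx, dy, dz = voxel_size_mm
    n_voxels = sum(v for plane in mask for row in plane for v in row)
    return n_voxels * dx * dy * dz / 1000.0  # 1 mL = 1000 mm^3


def relative_volume_error(measured_ml, reference_ml):
    """Signed relative volume error in percent of the reference volume (assumed definition)."""
    if reference_ml <= 0:
        raise ValueError("reference volume must be positive")
    return 100.0 * (measured_ml - reference_ml) / reference_ml


# Hypothetical 2 x 2 x 3 mask with 4 mm isotropic voxels (5 foreground voxels).
mask = [
    [[1, 1, 0], [1, 0, 0]],
    [[1, 1, 0], [0, 0, 0]],
]
measured = mask_volume_ml(mask, (4.0, 4.0, 4.0))  # 5 voxels * 64 mm^3
error = relative_volume_error(measured, 0.256)    # reference volume of 4 voxels
```

Under this assumed definition, a positive value indicates over-segmentation relative to the reference and a negative value indicates under-segmentation.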

Results: On the two test sets (phantom and HNC PET scans), the performance of the seven segmentation approaches was as follows. On the phantom test set, the mean relative volume errors ranged from 29.9% to 87.8% of the ground truth reference volumes, and the repeat difference for each institution ranged from -36.4% to 39.9%. On the HNC test set, the mean relative volume error ranged from -50.5% to 701.5%, and the repeat difference for each institution ranged from -37.7% to 31.5%. In addition, performance measures per phantom insert/lesion size category are given in the paper. On phantom data, regression analysis resulted in coefficient of variation (CV) components of 42.5% for scanners, 26.8% for institutional approaches, 21.1% for repeated segmentations, 14.3% for relative contrasts, 5.3% for count statistics (acquisition times), and 0.0% for repeated scans. Analysis showed that the CV components for approaches and repeated segmentations were significantly larger on the HNC test set, with increases of 112.7% and 102.4%, respectively.

Conclusion: Analysis results underline the importance of PET scanner reconstruction harmonization and imaging protocol standardization for quantification of lesion volumes. In addition, to enable a distributed multi-site analysis of FDG PET images, harmonization of analysis approaches and operator training in combination with highly automated segmentation methods seem advisable. Future work will focus on quantifying the impact of segmentation variation on radiomics system performance.

Keywords: FDG PET; head and neck cancer; multi-site performance analysis; phantom; radiomics; segmentation.


Conflict of interest statement

The authors have no conflict of interest to report.

Figures

**Figure 1**
Modified NEMA IEC Body Phantom with spherical and ellipsoid inserts and corresponding naming scheme. [Color figure can be viewed at wileyonlinelibrary.com]

**Figure 2**
Example of indicator images. [Color figure can be viewed at wileyonlinelibrary.com]

**Figure 3**
Distributions of measured volumes by reference volumes. The gray diamonds (phantom) and line (HNC) represent the reference volumes. Note that phantom and HNC plots have different scales.

**Figure 4**
Approach‐specific relative mean error as a function of reference volume. [Color figure can be viewed at wileyonlinelibrary.com]

**Figure 5**
Distributions of relative errors in volume measurements by approach.

**Figure 6**
Distributions of relative repeat errors in volume measurements by approach.

**Figure 7**
Examples of segmentation results for a primary cancer site, which is part of test set HNC. (a–g) Segmentations generated with approaches 1 to 7. (h) One example of the six manual reference segmentations. The corresponding indicator image is given in Fig. 2(a). For each segmentation approach, the relative volume error $V_{e}$, Dice coefficient $D$, and mean unsigned distance error $d_{e}$ are provided. [Color figure can be viewed at wileyonlinelibrary.com]

**Figure 8**
Examples of segmentation results for a hot lymph node, which is part of test set HNC. (a–g) Segmentations generated with approaches 1 to 7. (h) One example of the six manual reference segmentations. The corresponding indicator image is given in Fig. 2(b). For each segmentation approach, the relative volume error $V_{e}$, Dice coefficient $D$, and mean unsigned distance error $d_{e}$ are provided. [Color figure can be viewed at wileyonlinelibrary.com]

**Figure 9**
Relative mean errors from regression modeling of the phantom and HNC data.

**Figure 10**
Comparison of segmentation performance of methods 1 to 7 on (a) phantom and (b) HNC data. [Color figure can be viewed at wileyonlinelibrary.com]

**Figure 11**
Overview of overall segmentation performance, comparing Methods 1 to 7. [Color figure can be viewed at wileyonlinelibrary.com]

**Figure 12**
Comparison of the absolute mean relative volume error of approaches 1 to 7 against the corresponding mean Dice coefficient (a) and mean unsigned distance error (b). [Color figure can be viewed at wileyonlinelibrary.com]

**Figure 13**
Boxplots of measured Dice coefficients (a) and unsigned distance errors (b) as well as corresponding repeat differences (c and d).
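
The Dice coefficient named in the captions of Figs. 7, 8, 12, and 13 is conventionally defined as twice the overlap of two segmentations divided by the sum of their sizes. The sketch below assumes segmentations are given as collections of voxel index tuples; the function name and the toy data are hypothetical, and the paper's own implementation may differ.

```python
def dice_coefficient(seg_a, seg_b):
    """Dice overlap 2|A ∩ B| / (|A| + |B|) for two collections of voxel indices."""
    a, b = set(seg_a), set(seg_b)
    if not a and not b:
        return 1.0  # convention chosen here: two empty segmentations agree perfectly
    return 2.0 * len(a & b) / (len(a) + len(b))


# Hypothetical segmentations sharing three of their four voxels.
segmentation = {(0, 0, 0), (0, 0, 1), (0, 1, 0), (1, 0, 0)}
reference = {(0, 0, 0), (0, 0, 1), (0, 1, 0), (1, 1, 1)}
d = dice_coefficient(segmentation, reference)  # 2 * 3 / (4 + 4)
```

A value of 1 indicates identical segmentations and 0 indicates no overlap, so two segmentations with the same volume can still have a low Dice coefficient if they are displaced from each other.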


Cited by

Application of Community Detection Algorithm to Investigate the Correlation between Imaging Biomarkers of Tumor Metabolism, Hypoxia, Cellularity, and Perfusion for Precision Radiotherapy in Head and Neck Squamous Cell Carcinomas.
Paudyal R, Grkovski M, Oh JH, Schöder H, Nunez DA, Hatzoglou V, Deasy JO, Humm JL, Lee NY, Shukla-Dave A. Cancers (Basel). 2021 Aug 3;13(15):3908. doi: 10.3390/cancers13153908. PMID: 34359810.
Clinical use of positron emission tomography for radiotherapy planning - Medical physics considerations.
Thorwarth D. Z Med Phys. 2023 Feb;33(1):13-21. doi: 10.1016/j.zemedi.2022.09.001. Epub 2022 Oct 20. PMID: 36272949. Review.
Voxel size and gray level normalization of CT radiomic features in lung cancer.
Shafiq-Ul-Hassan M, Latifi K, Zhang G, Ullah G, Gillies R, Moros E. Sci Rep. 2018 Jul 12;8(1):10545. doi: 10.1038/s41598-018-28895-9. PMID: 30002441.
Quantitative Imaging Informatics for Cancer Research.
Fedorov A, Beichel R, Kalpathy-Cramer J, Clunie D, Onken M, Riesmeier J, Herz C, Bauer C, Beers A, Fillion-Robin JC, Lasso A, Pinter C, Pieper S, Nolden M, Maier-Hein K, Herrmann MD, Saltz J, Prior F, Fennessy F, Buatti J, Kikinis R. JCO Clin Cancer Inform. 2020 May;4:444-453. doi: 10.1200/CCI.19.00165. PMID: 32392097.
A Bayesian framework for performance assessment and comparison of imaging biomarker quantification methods.
Smith BJ, Beichel RR. Stat Methods Med Res. 2019 Apr;28(4):1003-1018. doi: 10.1177/0962280217741334. Epub 2017 Dec 22. PMID: 29271301.


