Cancers (Basel). 2024 Nov 29;16(23):4009.

doi: 10.3390/cancers16234009.

Reliability of Automated RECIST 1.1 and Volumetric RECIST Target Lesion Response Evaluation in Follow-Up CT - A Multi-Center, Multi-Observer Reading Study

Isabel C Dahm¹, Manuel Kolb², Sebastian Altmann³, Konstantin Nikolaou¹,⁴, Sergios Gatidis¹, Ahmed E Othman³, Alessa Hering⁵,⁶, Jan H Moltz⁵, Felix Peisen¹

Affiliations

¹ Department of Diagnostic and Interventional Radiology, Eberhard Karls University, Tuebingen University Hospital, Hoppe-Seyler-Str. 3, 72076 Tuebingen, Germany.
² Department of Radiology, Te Whatu Ora Waikato, Hamilton 3240, New Zealand.
³ Institute of Neuroradiology, Johannes Gutenberg University Hospital Mainz, Langenbeckstr. 1, 55131 Mainz, Germany.
⁴ Image-Guided and Functionally Instructed Tumor Therapies (iFIT), The Cluster of Excellence (EXC 2180), 72076 Tuebingen, Germany.
⁵ Fraunhofer MEVIS, Max-von-Laue-Str. 2, 28359 Bremen, Germany.
⁶ Diagnostic Image Analysis Group, Radboudumc, Geert Grooteplein Zuid 10, 6525 GA Nijmegen, The Netherlands.

PMID: 39682195
PMCID: PMC11640155
DOI: 10.3390/cancers16234009

Isabel C Dahm et al. Cancers (Basel). 2024.

Abstract

Objectives: To evaluate the performance of a custom-made convolutional neural network (CNN) algorithm for fully automated lesion tracking and segmentation, as well as Response Evaluation Criteria in Solid Tumors (RECIST 1.1) evaluation, in longitudinal computed tomography (CT) studies, compared to a manual RECIST 1.1 evaluation performed by three radiologists.

Methods: Baseline and follow-up CTs of patients with stage IV melanoma (n = 58) were investigated in a retrospective reading study. Three radiologists performed manual measurements of metastatic lesions. Fully automated segmentations were generated, and diameters and volumes were computed from the segmentation results, with subsequent RECIST 1.1 evaluation. We measured (1) the intra- and inter-reader variability in the manual diameter measurements, (2) the agreement between manual and automated diameter measurements, as well as between the resulting RECIST 1.1 categories, and (3) the agreement between the RECIST 1.1 categories derived from automated diameter measurements and those derived from automated volume measurements.
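The RECIST 1.1 evaluation step mentioned above can be illustrated with a minimal sketch, assuming a single baseline and a single follow-up scan (so the nadir equals the baseline sum). The thresholds are the published RECIST 1.1 target lesion criteria; the function name `timepoint_response` and the simplified handling of complete response (lymph node short-axis rules are ignored) are assumptions for illustration, not the authors' implementation.

```python
from typing import Sequence


def timepoint_response(baseline_mm: Sequence[float], followup_mm: Sequence[float]) -> str:
    """Classify target lesion response from diameters (in mm) at two timepoints.

    Simplified sketch: with only one follow-up, the nadir is the baseline sum.
    """
    if not baseline_mm or len(baseline_mm) != len(followup_mm):
        raise ValueError("need the same non-empty set of lesions at both timepoints")

    base_sum = sum(baseline_mm)
    follow_sum = sum(followup_mm)
    if base_sum <= 0:
        raise ValueError("baseline sum of diameters must be positive")

    # Complete response: all target lesions have disappeared (simplified).
    if follow_sum == 0:
        return "CR"

    change = (follow_sum - base_sum) / base_sum

    # Progressive disease: at least +20% and at least 5 mm absolute increase.
    if change >= 0.20 and (follow_sum - base_sum) >= 5.0:
        return "PD"
    # Partial response: at least 30% decrease in the sum of diameters.
    if change <= -0.30:
        return "PR"
    return "SD"
```

For example, a sum of diameters growing from 50 mm to 62 mm (+24%, +12 mm) would be classified as PD, whereas a single 10 mm lesion growing to 12.5 mm (+25%, but only +2.5 mm) would remain SD because the absolute 5 mm criterion is not met.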

Results: In total, 114 target lesions were measured at baseline and follow-up. The intraclass correlation coefficients (ICCs) for the intra- and inter-reader reliability of the diameter measurements were excellent, being >0.90 for all readers. There was moderate to almost perfect agreement when comparing the timepoint response category derived from the mean manual diameter measurements from all three readers with those derived from automated diameter measurements (Cohen's κ 0.67–0.76). The agreement between the manual and automated volumetric timepoint responses was substantial (Fleiss' κ 0.66–0.68) and that between the automated diameter and volume timepoint responses was substantial to almost perfect (Cohen's κ 0.81).
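As a reminder of how the agreement statistic reported above is defined, the following is a minimal sketch of unweighted Cohen's κ for two raters assigning categorical timepoint responses. The function name `cohens_kappa` and the example labels are hypothetical; the study's own statistical software and any weighting scheme are not stated in this abstract.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohens_kappa(rater_a: Sequence[Hashable], rater_b: Sequence[Hashable]) -> float:
    """Unweighted Cohen's kappa for two raters over the same items."""
    if not rater_a or len(rater_a) != len(rater_b):
        raise ValueError("both raters must label the same non-empty set of items")

    n = len(rater_a)
    # Observed agreement: fraction of items with identical labels.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: product of the marginal label frequencies, summed over labels.
    count_a = Counter(rater_a)
    count_b = Counter(rater_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)

    if expected == 1.0:
        # Both raters used one identical label throughout; kappa is undefined,
        # and returning 1.0 here is a convention chosen for this sketch.
        return 1.0
    return (observed - expected) / (1.0 - expected)
```

κ corrects the raw agreement rate for the agreement expected by chance, which matters here because response categories such as stable disease can dominate a cohort.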

Conclusions: The automated diameter measurement of preselected target lesions in follow-up CT is reliable and can potentially help to accelerate RECIST evaluation.

Keywords: RECIST 1.1; artificial intelligence and machine learning; melanoma.

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

**Figure A1**
Tumor growth close to 20%, depending on measurement. Baseline and follow-up measurements for automated diameter (A,B), reader 1 (C,D), reader 2 (E,F) and reader 3 (G,H), with tumor growth just under 20% (automated diameter, readers 2 and 3) or over 20% (reader 1), and the resulting differences in timepoint response.

**Figure A2**
Examples of intra- and inter-reader variability. Baseline and follow-up measurements for automated diameter (A,B), reader 1 (C,D), reader 2 session 1 (E,F), reader 2 session 2 (G,H), and reader 3 (I,J). Baseline measurements (A,C,E,G,I) are very close, with low inter- and intra-reader variability. However, follow-up measurements show high inter-reader variability ((B,D) vs. (H,J)) and intra-reader variability (F,H).

**Figure A3**
Incorrect automated segmentation. Baseline and follow-up measurements for automated diameter (A,B), reader 1 (C,D), reader 2 (E,F), and reader 3 (G,H). The algorithm incorrectly outlines the lesion in the follow-up CT (B) and includes a nearby metastasis, resulting in artificial growth of the tumor diameter. All three human readers outlined the correct lesion (D,F,H).

**Figure A4**
Non-spherical shape of lesion, leading to differences between diameter- and volume-based timepoint response. Baseline and follow-up measurements for automated diameter (lesion 1: (A,B); lesion 2: (E,F)) and automated volume (lesion 1: (C,D); lesion 2: (G,H)). The timepoint response for the automated diameter indicates progressive disease for both lesions (lesion 1: +33.1%; lesion 2: +23.8%). The timepoint response for the automated volume indicates stable disease for both lesions (lesion 1: +61.4%; lesion 2: +23.4%).
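The discrepancy described in this caption can be made concrete with a short sketch. It assumes that the volumetric progression and response thresholds are the sphere-equivalent translations of the RECIST 1.1 diameter thresholds (+20% and -30%); that assumption, and the names `sphere_equivalent_volume_change` and `volumetric_response`, are illustrative and not taken from this abstract.

```python
def sphere_equivalent_volume_change(diameter_change: float) -> float:
    """Relative volume change of a sphere whose diameter changes by the given fraction."""
    return (1.0 + diameter_change) ** 3 - 1.0


# Assumed volumetric thresholds, derived from the diameter thresholds of RECIST 1.1.
PD_VOLUME = sphere_equivalent_volume_change(0.20)   # about +72.8%
PR_VOLUME = sphere_equivalent_volume_change(-0.30)  # about -65.7%


def volumetric_response(volume_change: float) -> str:
    """Classify a relative volume change (e.g. 0.614 for +61.4%) under the assumed thresholds."""
    if volume_change >= PD_VOLUME:
        return "PD"
    if volume_change <= PR_VOLUME:
        return "PR"
    return "SD"
```

Under these assumed thresholds, the volume increases quoted in the caption (+61.4% and +23.4%) both stay below the roughly +73% needed for progression, while the corresponding diameter increases (+33.1% and +23.8%) exceed +20%. A lesion that grows mainly along one axis can therefore cross the diameter threshold without crossing the volume threshold.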

**Figure 1**
Schema of the proposed pipeline for the AI-assisted segmentation of metastases in follow-up computed tomography (CT) scans. The AI-assisted segmentation pipeline includes four major components: (1) extraction of the region of interest (ROI) around the lesion in the baseline scan; (2) registration of the baseline to the follow-up scan; (3) propagation of the ROI to the follow-up scan to constrain the search region and inference of the trained U-Net to segment all the lesions in the defined region; (4) selection of the corresponding lesion from the output of the nnU-Net.

**Figure 2**
Excellent inter-reader agreement. Exemplary baseline and follow-up measurements for automated diameter (A,B), reader 1 (C,D), reader 2 (E,F), and reader 3 (G,H), illustrating the excellent inter-reader agreement.

Grants and funding

428216905/Deutsche Forschungsgemeinschaft
