Cancers (Basel). 2024 Nov 29;16(23):4009.
doi: 10.3390/cancers16234009.

Reliability of Automated RECIST 1.1 and Volumetric RECIST Target Lesion Response Evaluation in Follow-Up CT-A Multi-Center, Multi-Observer Reading Study


Isabel C Dahm et al. Cancers (Basel).

Abstract

Objectives: To evaluate the performance of a custom-made convolutional neural network (CNN) algorithm for fully automated lesion tracking and segmentation, as well as Response Evaluation Criteria in Solid Tumors (RECIST 1.1) evaluation, in longitudinal computed tomography (CT) studies, compared to a manual RECIST 1.1 evaluation performed by three radiologists.

Methods: Baseline and follow-up CTs of patients with stage IV melanoma (n = 58) were investigated in a retrospective reading study. Three radiologists performed manual measurements of metastatic lesions. Fully automated segmentations were generated, and diameters and volumes were computed from the segmentation results, with subsequent RECIST 1.1 evaluation. We measured (1) the intra- and inter-reader variability in the manual diameter measurements, (2) the agreement between manual and automated diameter measurements, as well as the resulting RECIST 1.1 categories, and (3) the agreement between the RECIST 1.1 categories derived from automated diameter measurements and those derived from automated volume measurements.
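The step from diameters to a RECIST 1.1 category can be sketched as follows. This is a minimal illustration and not the authors' implementation: it assumes the standard RECIST 1.1 target-lesion cut-offs (at least a 30% decrease in the sum of diameters for partial response; at least a 20% increase plus an absolute increase of at least 5 mm for progressive disease), treats the baseline as the reference because only two timepoints are involved, and simplifies complete response to a follow-up sum of zero. The function name `recist_response` is hypothetical.

```python
def recist_response(baseline_sum_mm: float, followup_sum_mm: float) -> str:
    """Classify the target-lesion timepoint response from sums of diameters (mm).

    Assumptions (not taken from the abstract): standard RECIST 1.1 cut-offs,
    baseline used as the reference, and complete response simplified to a
    follow-up sum of zero.
    """
    if baseline_sum_mm <= 0:
        raise ValueError("baseline sum of diameters must be positive")
    if followup_sum_mm < 0:
        raise ValueError("follow-up sum of diameters must not be negative")

    if followup_sum_mm == 0:
        return "CR"  # complete response (simplified)

    absolute_change_mm = followup_sum_mm - baseline_sum_mm
    relative_change = absolute_change_mm / baseline_sum_mm

    # Progressive disease needs both the relative and the absolute criterion.
    if relative_change >= 0.20 and absolute_change_mm >= 5.0:
        return "PD"
    if relative_change <= -0.30:
        return "PR"  # partial response
    return "SD"      # stable disease


if __name__ == "__main__":
    # Hypothetical sums of diameters, for illustration only.
    for baseline, followup in [(50.0, 61.0), (50.0, 59.0), (50.0, 34.0)]:
        print(baseline, followup, recist_response(baseline, followup))
```

Measurements that land close to a cut-off, such as growth just under or just over 20%, can change the category even when readers differ by only a millimeter or two.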

Results: In total, 114 target lesions were measured at baseline and follow-up. The intraclass correlation coefficients (ICCs) for the intra- and inter-reader reliability of the diameter measurements were excellent, being >0.90 for all readers. There was moderate to almost perfect agreement when comparing the timepoint response category derived from the mean manual diameter measurements from all three readers with those derived from automated diameter measurements (Cohen's κ 0.67-0.76). The agreement between the manual and automated volumetric timepoint responses was substantial (Fleiss' κ 0.66-0.68) and that between the automated diameter and volume timepoint responses was substantial to almost perfect (Cohen's κ 0.81).
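For readers unfamiliar with the agreement statistic, Cohen's κ for two raters compares the observed agreement with the agreement expected by chance. The sketch below uses only the Python standard library; the function name `cohens_kappa` and the example labels are hypothetical and are not data from the study.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two raters labelling the same items."""
    if len(ratings_a) != len(ratings_b) or len(ratings_a) == 0:
        raise ValueError("both raters must label the same non-empty set of items")

    n = len(ratings_a)
    # Observed agreement: fraction of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    # Chance agreement: product of the raters' marginal label frequencies.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)

    if expected == 1.0:
        return 1.0  # both raters used one identical label for every item
    return (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    # Made-up timepoint responses for four lesions, for illustration only.
    manual = ["PD", "PD", "SD", "SD"]
    automated = ["PD", "SD", "SD", "SD"]
    print(cohens_kappa(manual, automated))
```

In this made-up example the raters agree on three of four lesions (observed agreement 0.75) while chance agreement is 0.5, which gives κ = 0.5.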

Conclusions: The automated diameter measurement of preselected target lesions in follow-up CT is reliable and can potentially help to accelerate RECIST evaluation.

Keywords: RECIST 1.1; artificial intelligence and machine learning; melanoma.

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Figure A1
Tumor growth close to 20%, depending on measurement. Baseline and follow-up measurements for automated diameter (A,B), reader 1 (C,D), reader 2 (E,F), and reader 3 (G,H). Measured tumor growth is just under 20% (automated diameter, readers 2 and 3) or over 20% (reader 1), which leads to different timepoint responses.
Figure A2
Examples of intra- and inter-reader variability. Baseline and follow-up measurements for automated diameter (A,B), reader 1 (C,D), reader 2 session 1 (E,F), reader 2 session 2 (G,H), and reader 3 (I,J). Baseline measurements (A,C,E,G,I) are very close, with low inter- and intra-reader variability. However, follow-up measurements show high inter-reader variability (B,D vs. H,J) and intra-reader variability (F,H).
Figure A3
Incorrect automated segmentation. Baseline and follow-up measurements for automated diameter (A,B), reader 1 (C,D), reader 2 (E,F), and reader 3 (G,H). The algorithm incorrectly outlines the lesion in the follow-up CT (B) and includes a nearby metastasis, resulting in an artificial increase in tumor diameter. All three human readers outlined the correct lesion (D,F,H).
Figure A4
Non-spherical shape of lesion, leading to differences between diameter- and volume-based timepoint response. Baseline and follow-up measurements for automated diameter (lesion 1: (A,B); lesion 2: (E,F)) and automated volume (lesion 1: (C,D); lesion 2: (G,H)). The timepoint response for the automated diameter indicates progressive disease for both lesions (lesion 1: +33.1%; lesion 2: +23.8%). The timepoint response for the automated volume indicates stable disease for both lesions (lesion 1: +61.4%; lesion 2: +23.4%).
Figure 1
Schema of the proposed pipeline for the AI-assisted segmentation of metastases in follow-up computed tomography (CT) scans. The AI-assisted segmentation pipeline includes four major components: (1) extraction of the region of interest (ROI) around the lesion in the baseline scan; (2) registration of the baseline to the follow-up scan; (3) propagation of the ROI to the follow-up scan to constrain the search region and inference of the trained U-Net to segment all the lesions in the defined region; (4) selection of the corresponding lesion from the output of the nnU-Net.
Figure 2
Excellent inter-reader agreement. Exemplary baseline and follow-up measurements for automated diameter (A,B), reader 1 (C,D), reader 2 (E,F), and reader 3 (G,H), illustrating the excellent inter-reader agreement.
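The diameter/volume discrepancy described for Figure A4 can be made concrete with a short calculation. The abstract does not state the volumetric cut-offs used, so the sketch below assumes the sphere-equivalent conversion of the RECIST 1.1 diameter thresholds (volume scaling with the cube of the diameter); the function names are hypothetical.

```python
def sphere_equivalent_volume_change(diameter_change: float) -> float:
    """Relative volume change of a sphere whose diameter changes by the given fraction."""
    return (1.0 + diameter_change) ** 3 - 1.0


# Assumed volumetric cut-offs, derived from the RECIST 1.1 diameter thresholds.
PD_VOLUME_THRESHOLD = sphere_equivalent_volume_change(0.20)   # about +72.8%
PR_VOLUME_THRESHOLD = sphere_equivalent_volume_change(-0.30)  # about -65.7%


def volumetric_response(volume_change: float) -> str:
    """Classify a relative volume change under the assumed sphere-equivalent cut-offs."""
    if volume_change >= PD_VOLUME_THRESHOLD:
        return "PD"
    if volume_change <= PR_VOLUME_THRESHOLD:
        return "PR"
    return "SD"


if __name__ == "__main__":
    print(round(PD_VOLUME_THRESHOLD, 3), round(PR_VOLUME_THRESHOLD, 3))
    # Volume changes reported for the two lesions in Figure A4.
    for change in (0.614, 0.234):
        print(change, volumetric_response(change))
```

Under this assumption, the volume increases of +61.4% and +23.4% reported for the two lesions in Figure A4 stay below the roughly +72.8% cut-off, which is consistent with stable disease by volume even though the diameter increases exceed 20%.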

