NPJ Precis Oncol. 2025 Jun 21;9(1):202.
doi: 10.1038/s41698-025-00978-7.

Prediction of long-term recurrence-free and overall survival in early-onset colorectal cancer: the ENCORE multi-centre study

Alessandro Mannucci et al. NPJ Precis Oncol.

Abstract

Survivors of early-onset colorectal cancer (EOCRC, i.e., diagnosed before age 50) are likely to experience recurrence after completing treatment. In this international, multi-centre, phase I-II-III EDRN biomarker study, we identified a panel of tumor-derived biomarkers of EOCRC recurrence. We then trained and independently validated a machine learning model (XGBoost) to predict 5-year recurrence-free and overall survival (RFS and OS) of patients with stage I-III EOCRC. Patients with "low-risk" EOCRC demonstrated significantly higher rates of 2-, 5-, and 10-year RFS than patients with "high-risk" EOCRC in both the training cohort (high-risk vs. low-risk: 51.0% vs. 92.4%; 34.4% vs. 92.4%; 25.8% vs. 92.4%, respectively; p < 0.0001) and the validation cohort (78.9% vs. 100.0%; 75.0% vs. 100.0%; 75.0% vs. 100.0%, respectively; p = 0.0019). We also report a significant reduction in both over-treatment and missed recurrences compared to current clinically available options. This tissue-based, machine learning-powered assay was prognostic of long-term RFS and OS outcomes after curative-intent treatment of EOCRC (ENCORE was first registered on ClinicalTrials.gov [ID: NCT06271980] on February 15th, 2024).

Conflict of interest statement

Competing interests: The authors declare no competing interests.

Figures

Fig. 1 Discovery and prioritization of the 10 best-performing miRNAs.
A The volcano plot displays the differential expression of microRNAs between cases that experienced recurrence and those that did not. MicroRNAs are color-coded based on their level of significance. B The ridgeline plot visualizes the distribution of expression levels for the 10 best-performing candidates. Each ridgeline represents the expression density of a specific microRNA across the patient samples. C The heatmap displays the expression levels of the top 10 prognostic microRNAs across the patient samples. Unsupervised clustering was applied to group patients based on their microRNA expression profiles. The annotation bar above the heatmap indicates the recurrence status (recurrent vs. non-recurrent) and other clinical characteristics of the patients. D The forest plot shows the hazard ratio of recurrence for each of the 10 best-performing microRNA candidates. CI Confidence intervals, FDR False-discovery rate, MSI Microsatellite instability.
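The caption lists a false-discovery rate (FDR) among its abbreviations but does not say how it was controlled. As a minimal sketch, assuming the common Benjamini-Hochberg procedure (an assumption, not confirmed by the source), adjusted p-values for a set of candidate microRNAs could be computed as follows. The function name and the toy p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values, in the input order.

    Assumption: the source does not state which FDR method was used;
    this is the standard step-up procedure, shown for illustration only.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for four candidate microRNAs (not study data).
raw_p = [0.001, 0.04, 0.03, 0.5]
adj_p = benjamini_hochberg(raw_p)
```

A candidate would then be flagged as significant when its adjusted value falls below the chosen FDR cut-off.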
Fig. 2 Architecture and performance of the ENCORE assay.
A Simplified decision tree of the ENCORE decision forest. This panel presents the ensemble of trees that constitute the ENCORE forest model. This simplified view illustrates the hierarchical decision-making process of the algorithm, showing how different microRNA expression levels (and potentially combinations thereof) lead to risk stratification. The nodes represent decision points based on microRNA levels, and the branches represent the possible outcomes, ultimately leading to a predicted risk score or risk group. B This beeswarm plot visualizes the SHAP values of each microRNA included in the ENCORE model. SHAP values quantify the contribution of each microRNA to the model’s prediction for individual patients. Points further from zero on the x-axis indicate a greater impact on the prediction (either increasing or decreasing the risk score). The color gradient of the points corresponds to the measured expression level of the corresponding microRNA, providing insight into how the expression level influences the prediction. C AUROC of ENCORE. D The raincloud plots with superimposed box and whisker plots provide a comprehensive visualization of the distribution of ENCORE-derived risk scores in two groups of EOCRC survivors: those who experienced recurrence and those who remained recurrence-free survivors. AUROC Area under the receiver-operating characteristic curve, CI Confidence intervals, ENCORE Early Onset Colorectal Cancer Recurrence, SHAP SHapley Additive exPlanations.
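Panels C and D relate the AUROC to the separation of risk scores between the two groups. A minimal sketch of that relationship: the AUROC equals the probability that a randomly chosen patient with recurrence has a higher risk score than a randomly chosen recurrence-free patient, with ties counted as one half. The function name and the scores below are hypothetical and are not taken from the study.

```python
def auroc(scores_recurrence, scores_no_recurrence):
    """AUROC as the fraction of correctly ordered (recurrence, no-recurrence) pairs."""
    wins = 0.0
    for s_pos in scores_recurrence:
        for s_neg in scores_no_recurrence:
            if s_pos > s_neg:
                wins += 1.0      # pair ranked correctly
            elif s_pos == s_neg:
                wins += 0.5      # tie counts as half
    return wins / (len(scores_recurrence) * len(scores_no_recurrence))


# Hypothetical risk scores (illustration only).
recurrence = [0.9, 0.7, 0.4]
no_recurrence = [0.5, 0.3, 0.2, 0.1]
area = auroc(recurrence, no_recurrence)
```

The pairwise loop is quadratic in the number of patients, which is acceptable for an illustration; a rank-based formulation gives the same value more efficiently.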
Fig. 3 Recurrence-free and overall survival based on ENCORE prediction.
Kaplan-Meier recurrence-free (A) and overall (B) survival curves stratified by ENCORE status: low-risk (green) vs. high-risk (purple). The statistical significance of the difference between the two survival curves is assessed using a log-rank test. The number of patients at risk in each group at various time points is shown below the graph. CI Confidence intervals, ENCORE Early Onset Colorectal Cancer Recurrence, OS Overall survival, RFS Recurrence-free survival.
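For readers unfamiliar with how such curves are built, the following is a minimal sketch of the standard Kaplan-Meier product-limit estimator. The function name and the follow-up data are hypothetical; the source does not show the software or code used in the study.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    times  : follow-up time per patient (e.g. months)
    events : 1 if the event (recurrence or death) was observed, 0 if censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all patients who share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            # Product-limit step: multiply by the conditional survival at t.
            survival *= 1.0 - n_events / n_at_risk
            curve.append((t, survival))
        # Events and censored patients both leave the risk set after t.
        n_at_risk -= n_leaving
    return curve


# Hypothetical follow-up data in months (not study data).
months = [6, 12, 12, 20, 30, 48]
observed = [1, 1, 0, 1, 0, 0]
curve = kaplan_meier(months, observed)
```

Running the estimator once per risk group gives the two curves that a stratified plot compares, and the running risk-set size corresponds to the numbers-at-risk table under the graph.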
Fig. 4 Independent and external validation.
A, B Kaplan-Meier recurrence-free (A) and overall (B) survival curves in the independent cohort, stratified by ENCORE status: high-risk (red) vs. low-risk (blue). The statistical significance of the difference between the two survival curves is assessed using a log-rank test. The number of patients at risk in each group at various time points is below the graph. CI Confidence intervals, ENCORE Early Onset Colorectal Cancer Recurrence, OS Overall survival, RFS Recurrence-free survival.
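The log-rank comparison mentioned in the caption can be sketched as below. This is a minimal two-group version of the standard test, using the hypergeometric variance and a chi-square distribution with one degree of freedom; the function name and the toy cohorts are hypothetical, and the study's actual software is not stated in the source.

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test. Returns (chi_square, p_value)."""
    # Distinct times at which at least one event was observed in either group.
    event_times = sorted(
        {t for t, e in zip(times_a, events_a) if e}
        | {t for t, e in zip(times_b, events_b) if e}
    )
    observed_a = 0.0
    expected_a = 0.0
    variance = 0.0
    for t in event_times:
        n_a = sum(1 for x in times_a if x >= t)  # at risk in group A
        n_b = sum(1 for x in times_b if x >= t)  # at risk in group B
        d_a = sum(1 for x, e in zip(times_a, events_a) if x == t and e)
        d_b = sum(1 for x, e in zip(times_b, events_b) if x == t and e)
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if variance == 0.0:
        return 0.0, 1.0  # no information to separate the groups
    chi_square = (observed_a - expected_a) ** 2 / variance
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi_square / 2.0))
    return chi_square, p_value


# Hypothetical follow-up data in months (not study data).
high_t, high_e = [5, 8, 12, 20], [1, 1, 1, 0]
low_t, low_e = [15, 30, 40, 50], [0, 1, 0, 0]
chi_square, p_value = logrank_test(high_t, high_e, low_t, low_e)
```

With cohorts this small the chi-square approximation is rough; the sketch is meant only to show what quantity the reported p-values summarize.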
Fig. 5 Decision curve analysis.
A Unstandardized net benefit of a surveillance strategy based on ENCORE vs. clinical characteristics. The y-axis represents the net benefit of each recurrence prediction model. The x-axis represents the probability threshold for considering a patient high-risk and intervening. The plot compares the net benefit of using a strategy based on the ENCORE assay against strategies based on clinical characteristics alone and the default strategies of considering all or no patients at risk of recurrence. The model with the highest net benefit across a clinically relevant range of probability thresholds has the greatest clinical utility. The net benefit is calculated by weighing the benefits of true positives against the harms of false positives. B Clinical Impact of a strategy based on ENCORE (circles represent the 25% high-risk threshold). The x-axis represents the probability threshold used to classify patients as high-risk. The curve shows the number of events (recurrences) captured by the strategy (true positives) and the number of patients unnecessarily classified as high-risk (false positives) at different risk thresholds. C Recurrence risks in survivors classified as “low-risk” compared to all EOCRC survivors. CI Confidence intervals, ENCORE Early Onset Colorectal Cancer Recurrence, EOCRC Early-onset colorectal cancer.
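The weighing of true positives against false positives described for panel A can be sketched with the standard decision-curve formula, NB = TP/n - (FP/n) * pt / (1 - pt), where pt is the probability threshold. This is a minimal sketch under the assumption that the standard unstandardized formula applies; the function names and toy data are hypothetical, and the 0.25 threshold simply mirrors the 25% high-risk threshold mentioned for panel B.

```python
def net_benefit(y_true, risk, threshold):
    """Unstandardized net benefit of flagging patients with risk >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must be strictly between 0 and 1")
    n = len(y_true)
    true_pos = sum(1 for y, r in zip(y_true, risk) if r >= threshold and y == 1)
    false_pos = sum(1 for y, r in zip(y_true, risk) if r >= threshold and y == 0)
    # False positives are penalized by the odds of the threshold probability.
    return true_pos / n - (false_pos / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Net benefit of the default strategy that flags every patient."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Hypothetical outcomes (1 = recurrence) and predicted risks (not study data).
y_true = [1, 1, 0, 0, 0, 1, 0, 0]
risk = [0.9, 0.6, 0.4, 0.1, 0.3, 0.5, 0.7, 0.05]
nb_model = net_benefit(y_true, risk, 0.25)
nb_all = net_benefit_treat_all(y_true, 0.25)
```

The "treat none" strategy has a net benefit of zero at every threshold, so a decision curve plots the model, "treat all", and a flat zero line across the range of thresholds.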
