JCO Precis Oncol. 2024 Oct:8:e2400277.
doi: 10.1200/PO.24.00277. Epub 2024 Oct 11.

5-Hydroxymethylated Biomarkers in Cell-Free DNA Predict Occult Colorectal Cancer up to 36 Months Before Diagnosis in the Prostate, Lung, Colorectal, and Ovarian Cancer Screening Trial

Diana C West-Szymanski et al. JCO Precis Oncol. 2024 Oct.

Abstract

Purpose: Using samples from the Prostate, Lung, Colorectal, and Ovarian (PLCO) Cancer Screening Trial, we identified cell-free DNA (cfDNA) candidate biomarkers bearing the epigenetic mark 5-hydroxymethylcytosine (5hmC) that detected occult colorectal cancer (CRC) up to 36 months before clinical diagnosis.

Materials and methods: We performed the 5hmC-seal assay and sequencing on ≤8 ng cfDNA extracted from PLCO study participant plasma samples, including n = 201 cases (diagnosed with CRC within 36 months of blood collection) and n = 401 controls (no cancer diagnosis on follow-up). We conducted association studies and machine learning modeling to analyze the genome-wide 5hmC profiles within training and validation groups that were randomly selected at a 2:1 ratio.
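The 2:1 random assignment described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the sample IDs, the seeds, the helper name `split_two_to_one`, and the choice to split cases and controls separately are all assumptions made for the example.

```python
import random


def split_two_to_one(sample_ids, seed=0):
    """Shuffle the IDs and return (training, validation) at roughly 2:1."""
    ids = list(sample_ids)
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    rng.shuffle(ids)
    n_train = round(len(ids) * 2 / 3)
    return ids[:n_train], ids[n_train:]


# Hypothetical IDs standing in for the 201 cases and 401 controls.
cases = [f"case_{i}" for i in range(201)]
controls = [f"ctrl_{i}" for i in range(401)]

# Assumption: cases and controls are split separately so that both groups
# keep the same case/control proportion.
train_cases, valid_cases = split_two_to_one(cases, seed=1)
train_ctrls, valid_ctrls = split_two_to_one(controls, seed=2)

print(len(train_cases), len(valid_cases), len(train_ctrls), len(valid_ctrls))
```

Splitting within each group is one common way to keep the case fraction similar in training and validation; the abstract does not say whether this was done.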

Results: We successfully obtained 5hmC profiles from these decades-old samples. A weighted Cox model of 32 5hmC-modified gene bodies showed a predictive detection value for CRC as early as 36 months before overt tumor diagnosis (training set AUC, 77.1% [95% CI, 72.2 to 81.9] and validation set AUC, 72.8% [95% CI, 65.8 to 79.7]). Notably, the 5hmC-based predictive model showed comparable performance regardless of sex and race/ethnicity, and significantly outperformed risk factors such as age and obesity (assessed as BMI). Finally, when splitting cases at median weighted prediction scores, Kaplan-Meier analyses showed significant risk stratification for CRC occurrence in both the training set (hazard ratio [HR], 3.3 [95% CI, 2.6 to 5.8]) and the validation set (HR, 3.1 [95% CI, 1.8 to 5.8]).
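To make the AUC figures concrete, the sketch below computes a plain case-versus-control AUC as the fraction of case/control pairs in which the case has the higher score, with ties counted as one half. The scores are invented for illustration, and the authors may have used a different (for example time-dependent) AUC estimator; this shows only the basic quantity.

```python
def auc(case_scores, control_scores):
    """Probability that a random case outscores a random control (ties = 0.5)."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical prediction scores, not data from the study.
example_cases = [0.9, 0.7, 0.4]
example_controls = [0.8, 0.3, 0.2, 0.1]

# 10 of the 12 case/control pairs are ranked correctly.
print(round(auc(example_cases, example_controls), 3))  # 0.833
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, so the reported 72.8% in validation means a case outranks a control in roughly 73 of 100 pairs under this interpretation.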

Conclusion: Candidate 5hmC biomarkers and a scoring algorithm have the potential to predict CRC occurrence despite the absence of clinical symptoms and effective predictors. Developing a minimally invasive clinical assay that detects 5hmC-modified biomarkers holds promise for improving early CRC detection and ultimately patient outcomes.

Figures

Figure 1. Study design and workflow.
For the current project, we obtained 602 cfDNA samples from the PLCO repository, including pre-diagnostic CRC cases and age-, sex-, and self-identified race/ethnicity (population)-matched controls. Genome-wide 5hmC profiles were obtained using the 5hmC-Seal technique and next-generation sequencing (NGS), followed by association analysis and statistical modeling. In addition, QC samples were used to test between- and within-batch variability, and replicate CRC and CTRL samples were used to assess assay robustness. CRC: colorectal cancer pre-diagnostic (Dx) cases; CTRL: non-cancer controls; cfDNA: cell-free DNA; 5hmC: 5-hydroxymethylcytosine; PLCO: The Prostate, Lung, Colorectal and Ovarian Cancer Screening Trial.
Figure 2. Identification of 5hmC signatures associated with pre-diagnostic CRC cases.
A multivariable Cox proportional hazards model was fitted for each 5hmC marker gene (i.e., gene body) to identify differential signatures associated with pre-diagnostic CRC cases. A. The volcano plot shows hazard ratios (HR) on the x-axis and p-values on the y-axis in the training samples. Gene bodies with the same direction of association in the validation samples are highlighted in red. B. The KEGG pathways enriched in the 5hmC signatures associated with pre-diagnostic CRC cases. KEGG: Kyoto Encyclopedia of Genes and Genomes. C. Most significant cancer-related IPA Canonical Pathways ranked by -log(p-value) and color-coded based on pathway activation z-score (orange is positive/activated, blue is negative/inhibited, and white is zero/neutral value).
Figure 3. Performance of a 5hmC-based model for predicting risk for CRC in pre-diagnostic samples.
Elastic net regularization, a machine learning approach, was used to perform feature selection under the Cox proportional hazards model. A. A predictive model for CRC occurrence in pre-diagnostic samples, comprising a panel of 32 genes. Shown is a forest plot of the model components and their individual hazard ratios (HR). B. Performance of the 5hmC-based predictive model for CRC occurrence in pre-diagnostic samples in the training (T) and validation (V) samples. AUC: area under the curve; CI: confidence interval. C-H show the performance of the 5hmC-based predictive model for CRC occurrence in combined samples by various risk factors, including: C. race/ethnicity (population); D. CRC stage; E. age; F. sex; G. obesity as defined by BMI; and H. smoking history. EA: European American; AA: African American; BMI: body mass index.
Figure 4. Comparison of predictive values between wp-scores and risk factors for CRC.
Predictive performance is shown for the 5hmC-based wp-scores and various risk factors for CRC, by different time periods between blood collection and CRC diagnosis: ≤ 6 months, 6–12 months, 12–24 months, and 24–36 months. For each time interval before diagnosis, 95% CI are also shown for AUC. A. wp-scores; B. age; C. obesity; and D. smoking history. AUC: area under the curve; CI: confidence interval.
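The time windows in this figure can be illustrated with a small binning helper. This is a sketch under stated assumptions: the function name `interval_label`, the example values, and the treatment of boundaries (each window includes its upper bound) are not specified in the caption.

```python
from collections import Counter


def interval_label(months_to_dx):
    """Map months between blood draw and diagnosis to a Figure 4 window.

    Assumption: each window includes its upper bound (e.g. 12.0 -> "6-12").
    """
    if months_to_dx < 0 or months_to_dx > 36:
        raise ValueError("expected a value between 0 and 36 months")
    if months_to_dx <= 6:
        return "<=6"
    if months_to_dx <= 12:
        return "6-12"
    if months_to_dx <= 24:
        return "12-24"
    return "24-36"


# Hypothetical months-to-diagnosis values, not data from the study.
example_months = [2.0, 6.0, 9.5, 12.0, 18.0, 30.0, 36.0]
window_counts = Counter(interval_label(m) for m in example_months)
print(window_counts)
```

Per-window performance would then be computed by evaluating the cases in each window against the controls.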
Figure 5. Risk stratification with the 5hmC-based wp-scores for CRC occurrence in pre-diagnostic samples.
The wp-scores computed from the final predictive model for CRC occurrence are used to assign individual samples to a high-risk group (i.e., those with wp-scores higher than the median) and a low-risk group (i.e., those with wp-scores lower than the median). Kaplan-Meier (KM) plots show the differential rate of CRC occurrence over time (months) according to the 5hmC-based risk in A. training samples and B. validation samples. Forest plots show the hazard ratios (HR) for wp-score and other risk factors in C. training samples and D. validation samples. E. Performance of the full model integrating wp-score and other risk factors for predicting CRC occurrence in pre-diagnostic samples in the training (T) and validation (V) sets. F. Nomogram for risk evaluation, showing the relative contributions of the 5hmC-based wp-scores and the traditional risk factors age, sex, obesity, and smoking history.
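The median split described in this caption can be sketched as below. The sketch assumes that a wp-score is a weighted sum of per-gene 5hmC levels with model coefficients as weights, which the text here does not spell out; the gene names, weights, and levels are hypothetical, and samples exactly at the median are placed in the low-risk group as an arbitrary tie rule.

```python
from statistics import median


def wp_score(levels, weights):
    """Weighted sum of 5hmC levels (assumed form of the wp-score)."""
    return sum(weights[gene] * levels[gene] for gene in weights)


def stratify_by_median(scores):
    """Label each sample high-risk if its score exceeds the median."""
    cutoff = median(scores.values())
    return {sid: ("high" if s > cutoff else "low") for sid, s in scores.items()}


# Hypothetical coefficients and normalized 5hmC levels for two placeholder genes.
weights = {"GENE_A": 0.5, "GENE_B": -0.25}
samples = {
    "s1": {"GENE_A": 1.0, "GENE_B": 2.0},
    "s2": {"GENE_A": 2.0, "GENE_B": 0.0},
    "s3": {"GENE_A": 4.0, "GENE_B": 4.0},
    "s4": {"GENE_A": 0.0, "GENE_B": 4.0},
}

scores = {sid: wp_score(levels, weights) for sid, levels in samples.items()}
groups = stratify_by_median(scores)
print(scores)
print(groups)
```

The resulting high and low groups are what a Kaplan-Meier comparison would then contrast over follow-up time.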
