Front Oncol. 2021 Feb 22;11:631056.
doi: 10.3389/fonc.2021.631056. eCollection 2021.

Development of a Gene-Based Prediction Model for Recurrence of Colorectal Cancer Using an Ensemble Learning Algorithm

Han-Ching Chan et al. Front Oncol.

Abstract

It is difficult to determine which patients with stage I and II colorectal cancer are at high risk of recurrence, qualifying them to undergo adjuvant chemotherapy. In this study, we aimed to determine a gene signature using gene expression data that could successfully identify high risk of recurrence among stage I and II colorectal cancer patients. First, a synthetic minority oversampling technique was used to address the problem of imbalanced data due to rare recurrence events. We then applied a sequential workflow of three methods (significance analysis of microarrays, logistic regression, and recursive feature elimination) to identify genes differentially expressed between patients with and without recurrence. To stabilize the prediction algorithm, we repeated the above processes on 10 subsets by bagging the training data set and then used support vector machine methods to construct the prediction models. The final predictions were determined by majority voting. The 10 models, using 51 differentially expressed genes, successfully predicted a high risk of recurrence within 3 years in the training data set, with a sensitivity of 91.18%. For the validation data sets, the sensitivity of the prediction with samples from two other countries was 80.00% and 91.67%. These prediction models can potentially function as a tool to decide if adjuvant chemotherapy should be administered after surgery for patients with stage I and II colorectal cancer.
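The workflow described in the abstract (oversample the rare recurrence class, train 10 models on bagged subsets, combine them by majority vote, report sensitivity) can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation. Several pieces are assumptions made for illustration: the toy two-feature data, a nearest-centroid classifier standing in for the support vector machine, a stratified bootstrap as the bagging scheme, and ties in the vote counting as high risk. The gene-selection steps (significance analysis of microarrays, logistic regression, recursive feature elimination) are omitted, and all function names are hypothetical.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def smote(minority, n_new, k, rng):
    """SMOTE-style oversampling: each synthetic sample lies on the segment
    between a minority sample and one of its k nearest minority neighbours."""
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted((p for p in minority if p is not base),
                            key=lambda p: sq_dist(base, p))[:k]
        other = rng.choice(neighbours)
        t = rng.random()
        synthetic.append([b + t * (o - b) for b, o in zip(base, other)])
    return synthetic

def fit_centroids(X, y):
    """Stand-in classifier (assumption, not an SVM): one mean vector per class."""
    centroids = {}
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids

def predict_one(centroids, x):
    """Predict the class whose centroid is closest to x."""
    return min(centroids, key=lambda lab: sq_dist(centroids[lab], x))

def bagged_models(X, y, n_models, rng):
    """Train n_models classifiers, each on a stratified bootstrap resample."""
    models = []
    for _ in range(n_models):
        Xb, yb = [], []
        for label in (0, 1):
            rows = [x for x, lab in zip(X, y) if lab == label]
            Xb += rng.choices(rows, k=len(rows))  # sample with replacement
            yb += [label] * len(rows)
        models.append(fit_centroids(Xb, yb))
    return models

def majority_vote(models, x):
    """Return 1 (recurrence) if at least half the models vote 1.
    Counting a tie as high risk is an assumption of this sketch."""
    votes = sum(predict_one(m, x) for m in models)
    return 1 if 2 * votes >= len(models) else 0

def sensitivity(y_true, y_pred):
    """True positives divided by all actual positives."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn)

# Toy data (hypothetical): 40 non-recurrence samples near (0, 0),
# 8 recurrence samples near (5, 5).
rng = random.Random(0)
majority = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(40)]
minority = [[5 + rng.uniform(-1, 1), 5 + rng.uniform(-1, 1)] for _ in range(8)]

# Balance the classes, then build the 10-model ensemble.
synthetic = smote(minority, len(majority) - len(minority), k=3, rng=rng)
minority_balanced = minority + synthetic
X = majority + minority_balanced
y = [0] * len(majority) + [1] * len(minority_balanced)
models = bagged_models(X, y, n_models=10, rng=rng)

test_X = [[5.0, 5.0], [4.5, 5.5], [0.0, 0.0], [0.5, -0.5]]
test_y = [1, 1, 0, 0]
pred = [majority_vote(models, x) for x in test_X]
print(sensitivity(test_y, pred))  # prints 1.0 on this well-separated toy data
```

The perfect sensitivity here reflects only how far apart the toy clusters are; it says nothing about performance on real expression data.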

Keywords: colorectal cancer; ensemble; gene expression; machine learning; prognostic signature.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

Figure 1
Flowchart for data analysis.
Figure 2
Survival analysis using the training data set (France) and validation data sets (USA & Australia). (A) Kaplan-Meier plot for France (n=196) data set. (B) Kaplan-Meier plot for USA (n=55) data set. (C) Kaplan-Meier plot for Australia (n=103) data set. (D) Forest plot of the hazard ratio and 95% confidence intervals in both the training data set and validation data sets. The prediction of high or low risk groups was dependent on majority voting. The P-values correspond to the two-sided log-rank test determining the difference between two curves.
Figure 3
Network analysis using the Ingenuity® Pathway Analysis (IPA®) software program. Genes shown in red are in our list of differentially expressed genes; genes shown in white are putative genes from the IPA database.

