A Classification Approach for Cancer Survivors from Those Cancer-Free, Based on Health Behaviors: Analysis of the Lifelines Cohort
- PMID: 34066093
- PMCID: PMC8151639
- DOI: 10.3390/cancers13102335
Abstract
Health behaviors affect health status in cancer survivors. We hypothesized that nonlinear algorithms would identify distinct key health behaviors compared to a linear algorithm and better classify cancer survivors. We aimed to use three nonlinear algorithms to identify such key health behaviors and to compare their performance with that of logistic regression for distinguishing cancer survivors from those without cancer in a population-based cohort study. We used six health behaviors and three socioeconomic factors for analysis. Participants from the Lifelines population-based cohort were classified into a cancer-survivor group and a cancer-free group using either nonlinear algorithms or logistic regression, and the performance of the algorithms was compared by the area under the curve (AUC). In addition, we performed case-control analyses (matched by age, sex, and education level) to evaluate classification performance by health behaviors alone. Data were collected for 107,624 cancer-free participants and 2760 cancer survivors. Using all variables resulted in an AUC of 0.75 ± 0.01; using only the six health behaviors, the logistic regression and nonlinear algorithms differentiated cancer survivors from cancer-free participants with AUCs of 0.62 ± 0.01 and 0.60 ± 0.01, respectively. The main distinctive classifier was age. Though not relevant to classification, the main distinctive health behaviors were body mass index and alcohol consumption. In the case-control analyses, the algorithms produced AUCs of 0.52 ± 0.01. Neither the linear nor the nonlinear algorithms identified key health behaviors that differentiate cancer survivors from cancer-free participants in this population-based cohort.
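To make the AUC values above easier to interpret, the following is a minimal sketch of how an AUC can be computed from classifier scores: it is the probability that a randomly chosen positive case (here, a cancer survivor) receives a higher score than a randomly chosen negative case, so 0.5 corresponds to chance and 1.0 to perfect separation. The function name `roc_auc` and the toy labels and scores are illustrative assumptions; they are not the study's data or code.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Return the AUC as the fraction of (positive, negative) pairs in which
    the positive case has the higher score; ties count as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = cancer survivor, 0 = cancer-free.
labels = [1, 1, 1, 0, 0, 0]
informative = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]  # scores that mostly rank survivors higher
uninformative = [0.5] * 6                     # scores that carry no information

auc_informative = roc_auc(labels, informative)
auc_uninformative = roc_auc(labels, uninformative)
print(f"informative scores:   AUC = {auc_informative:.2f}")
print(f"uninformative scores: AUC = {auc_uninformative:.2f}")
```

Read this way, an AUC of 0.52 in the matched case-control analyses means the classifiers ranked a survivor above a matched cancer-free participant only slightly more often than a coin flip would.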
Keywords: cancer survivors; classification; health behaviors; lifestyle; machine learning; medical informatics.
Conflict of interest statement
The authors declare no conflict of interest.