BMJ Open. 2014 Mar 17;4(3):e004007.
doi: 10.1136/bmjopen-2013-004007.

Machine-learning prediction of cancer survival: a retrospective study using electronic administrative records and a cancer registry

Sunil Gupta et al. BMJ Open. 2014.

Abstract

Objectives: Using the prediction of cancer outcome as a model, we tested the hypothesis that analysing routinely collected digital data in an electronic administrative record (EAR) with machine-learning techniques could enhance conventional methods of predicting clinical outcomes.

Setting: A regional cancer centre in Australia.

Participants: Disease-specific data from a purpose-built cancer registry (Evaluation of Cancer Outcomes (ECO)) from 869 patients were used to predict survival at 6, 12 and 24 months. The model was validated with data from a further 94 patients, and results compared to the assessment of five specialist oncologists. Machine-learning prediction using ECO data was compared with that using EAR and a model combining ECO and EAR data.
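Predicting survival at 6, 12 and 24 months amounts to one binary outcome per patient per horizon. The sketch below shows one way such labels could be derived from follow-up records. The record layout, the `survival_label` helper and the rule of leaving out patients whose follow-up ended before the horizon are illustrative assumptions, not the study's documented procedure.

```python
from typing import Optional

HORIZONS_MONTHS = (6, 12, 24)  # prediction horizons named in the abstract


def survival_label(followup_months: float, died: bool, horizon: int) -> Optional[int]:
    """Return 1 if the patient survived to at least `horizon` months,
    0 if death was observed before it, or None when follow-up ended
    without a death before the horizon (outcome unknown)."""
    if followup_months >= horizon:
        return 1  # still under observation at the horizon
    if died:
        return 0  # death observed before the horizon
    return None  # censored early: cannot be labelled for this horizon


# Hypothetical patients: (months of follow-up, death observed?)
patients = [(3.0, True), (30.0, False), (10.0, False), (18.0, True)]

# One label list per horizon; None marks patients to exclude at that horizon.
labels = {
    h: [survival_label(months, died, h) for months, died in patients]
    for h in HORIZONS_MONTHS
}
```

Under this assumed rule the usable sample can shrink as the horizon lengthens, because more patients have incomplete follow-up at 24 months than at 6 months.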

Primary and secondary outcome measures: Survival prediction accuracy in terms of the area under the receiver operating characteristic curve (AUC).

Results: The ECO model yielded AUCs of 0.87 (95% CI 0.848 to 0.890) at 6 months, 0.796 (95% CI 0.774 to 0.823) at 12 months and 0.764 (95% CI 0.737 to 0.789) at 24 months. Each was slightly better than the performance of the clinician panel. The model performed consistently across a range of cancers, including rare cancers. Combining ECO and EAR data yielded better prediction than the ECO-based model (AUCs ranging from 0.757 to 0.997 for 6 months, AUCs from 0.689 to 0.988 for 12 months and AUCs from 0.713 to 0.973 for 24 months). The best prediction was for genitourinary, head and neck, lung, skin, and upper gastrointestinal tumours.
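For readers unfamiliar with the metric, the sketch below computes an AUC and a 95% interval on a small hypothetical validation set. The AUC uses the standard rank (Mann-Whitney) formulation. The percentile bootstrap for the interval is an assumption chosen for illustration, since the abstract does not state how its confidence intervals were obtained. All data and function names are made up.

```python
import random
from typing import List, Sequence, Tuple


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive case outscores a
    random negative case, with ties counted as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_ci(
    labels: Sequence[int],
    scores: Sequence[float],
    n_boot: int = 2000,
    alpha: float = 0.05,
    seed: int = 0,
) -> Tuple[float, float]:
    """Percentile bootstrap interval for the AUC (illustrative choice)."""
    if len(set(labels)) < 2:
        raise ValueError("both classes must be present")
    rng = random.Random(seed)
    n = len(labels)
    stats: List[float] = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # skip resamples that contain only one class
        stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Hypothetical data: 1 = survived to the horizon, score = predicted probability.
y_true = [1, 1, 1, 0, 1, 0, 0, 1, 0, 0]
y_score = [0.9, 0.8, 0.7, 0.65, 0.6, 0.4, 0.35, 0.3, 0.2, 0.1]

point = auc(y_true, y_score)
ci_low, ci_high = bootstrap_ci(y_true, y_score)
print(f"AUC {point:.3f} (95% CI {ci_low:.3f} to {ci_high:.3f})")
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of survivors from non-survivors. With only ten cases the interval here is very wide, which is expected for a sample this small.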

Conclusions: Machine learning applied to information from a disease-specific (cancer) database and the EAR can be used to predict clinical outcomes. Importantly, the approach described made use of digital data that is already routinely collected but underexploited by clinical health systems.

Keywords: Cancer; Electronic Medical Record; Machine Learning; Prediction; Survival.
