Proc Mach Learn Res. 2017 Aug:68:25-38.

Classifying Lung Cancer Severity with Ensemble Machine Learning in Health Care Claims Data

Savannah L Bergquist et al. Proc Mach Learn Res. 2017 Aug.

Abstract

Research in oncology quality of care and health outcomes has been limited by the difficulty of identifying cancer stage in health care claims data. Using linked cancer registry and Medicare claims data, we develop a tool for classifying lung cancer patients receiving chemotherapy into early vs. late stage cancer by (i) deploying ensemble machine learning for prediction, (ii) establishing a set of classification rules for the predicted probabilities, and (iii) considering an augmented set of administrative claims data. We find our ensemble machine learning algorithm with a classification rule defined by the median substantially outperforms an existing clinical decision tree for this problem, yielding full sample performance of 93% sensitivity, 92% specificity, and 93% accuracy. This work has the potential for broad applicability as provider organizations, payers, and policy makers seek to measure quality and outcomes of cancer care and improve on risk adjustment methods.
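The median classification rule and the reported metrics can be illustrated with a minimal sketch. This is not the authors' implementation: the probabilities and stage labels below are made-up toy values, the function names are hypothetical, and two assumptions are made that the abstract does not state, namely that late stage is the positive class and that a probability strictly above the sample median is labeled late stage.

```python
from statistics import median


def classify_by_median(probs):
    """Label a case late stage (1) when its predicted probability is
    strictly above the sample median, otherwise early stage (0).
    The strict inequality is an assumption of this sketch."""
    cutoff = median(probs)
    return [1 if p > cutoff else 0 for p in probs]


def performance(y_true, y_pred):
    """Sensitivity, specificity, and accuracy, treating late stage (1)
    as the positive class (an assumption of this sketch)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(pairs),
    }


# Toy data only: hypothetical predicted probabilities of late stage,
# standing in for the output of an ensemble, and hypothetical true stages.
probs = [0.05, 0.10, 0.20, 0.35, 0.60, 0.75, 0.90, 0.95]
stage = [0, 0, 0, 1, 0, 1, 1, 1]

pred = classify_by_median(probs)
metrics = performance(stage, pred)
print(pred)
print(metrics)
```

In practice the predicted probabilities would come from the fitted ensemble, and the cutoff would be only one of several candidate rules compared against the registry stage.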

Figures

Figure 1: Flowchart for Lung Cancer Severity Classification Tool
Figure 2: Stage by Predicted Probability
Figure 3: Cross-Validated AUC Plots by Variable Sets C(·)

