A method to predict breast cancer stage using Medicare claims
- PMID: 20145734
- PMCID: PMC2818641
- DOI: 10.1186/1742-5573-7-1
Abstract
Background: In epidemiologic studies, cancer stage is an important predictor of outcomes. However, cancer stage is typically unavailable in medical insurance claims datasets, thus limiting the usefulness of such data for epidemiologic studies. Therefore, we sought to develop an algorithm to predict cancer stage based on covariates available from claims-based data.
Methods: We identified a cohort of 77,306 women aged ≥ 66 years with stage I-IV breast cancer, using the Surveillance, Epidemiology, and End Results (SEER)-Medicare database. We formulated an algorithm to predict cancer stage using covariates (demographic, tumor, and treatment characteristics) obtained from claims. Logistic regression models derived prediction equations in a training set, and the equations' test characteristics (sensitivity, specificity, positive predictive value [PPV], and negative predictive value [NPV]) were calculated in a validation set.
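The four test characteristics named in the Methods can be computed from a 2x2 table of predicted versus registry-recorded stage. The following is a minimal sketch of those standard definitions, not the authors' code; the names `ConfusionCounts` and `diagnostic_metrics` are hypothetical, and the counts are made-up illustrative numbers, not study data.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """Cells of a 2x2 table comparing predicted stage with true stage."""
    tp: int  # predicted positive, truly positive
    fp: int  # predicted positive, truly negative
    fn: int  # predicted negative, truly positive
    tn: int  # predicted negative, truly negative


def diagnostic_metrics(c: ConfusionCounts) -> dict:
    """Return sensitivity, specificity, PPV, and NPV as proportions."""
    return {
        "sensitivity": c.tp / (c.tp + c.fn),  # true positives among all true cases
        "specificity": c.tn / (c.tn + c.fp),  # true negatives among all non-cases
        "ppv": c.tp / (c.tp + c.fp),          # true cases among predicted positives
        "npv": c.tn / (c.tn + c.fn),          # non-cases among predicted negatives
    }


# Illustrative counts only (assumed for this example, not taken from the study).
example = ConfusionCounts(tp=8, fp=2, fn=2, tn=88)
metrics = diagnostic_metrics(example)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

In a validation set, "positive" would be defined separately for each prediction equation (for example, stage IV versus stages I-III).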
Results: Of the entire sample of women diagnosed with invasive breast cancer, 51% had stage I; 26% stage II; 11% stage III; and 4% stage IV disease. The equation predicting stage IV disease achieved sensitivity of 81%, specificity 89%, positive predictive value (PPV) 24%, and negative predictive value (NPV) 99%, while the equation distinguishing stage I/II from stage III disease achieved sensitivity 83%, specificity 78%, PPV 98%, and NPV 31%. Combined, the equations most accurately identified early stage disease and ascertained a sample in which 98% of patients were stage I or II.
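The low PPV alongside the high NPV for the stage IV equation follows from the low prevalence of stage IV disease. The sketch below applies Bayes' rule to the reported sensitivity and specificity, assuming a stage IV prevalence of 4% (the figure for the entire sample; the validation-set prevalence is not stated here and may differ). The function name `predictive_values` is hypothetical.

```python
def predictive_values(sensitivity: float, specificity: float, prevalence: float):
    """Derive (PPV, NPV) from sensitivity, specificity, and prevalence via Bayes' rule."""
    tp = sensitivity * prevalence              # expected true-positive fraction
    fp = (1 - specificity) * (1 - prevalence)  # expected false-positive fraction
    fn = (1 - sensitivity) * prevalence        # expected false-negative fraction
    tn = specificity * (1 - prevalence)        # expected true-negative fraction
    return tp / (tp + fp), tn / (tn + fn)


# Reported sensitivity and specificity for the stage IV equation;
# the 4% prevalence is an assumption taken from the overall sample.
ppv, npv = predictive_values(sensitivity=0.81, specificity=0.89, prevalence=0.04)
print(f"PPV = {ppv:.1%}, NPV = {npv:.1%}")  # PPV = 23.5%, NPV = 99.1%
```

Under that assumption, the computed values are roughly consistent with the reported PPV of 24% and NPV of 99%: when only about 1 in 25 patients has stage IV disease, even a modest false-positive rate among the remaining patients outnumbers the true positives.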
Conclusions: A claims-based algorithm was utilized to predict breast cancer stage, and was particularly successful when used to identify early stage disease. These prediction equations may be applied in future studies of breast cancer patients, substantially improving the utility of claims-based studies in this group. This method may similarly be employed to develop algorithms permitting claims-based epidemiologic studies of patients with other cancers.