Skip to main page content
U.S. flag

An official website of the United States government

Dot gov

The .gov means it’s official.
Federal government websites often end in .gov or .mil. Before sharing sensitive information, make sure you’re on a federal government site.

Https

The site is secure.
The https:// ensures that you are connecting to the official website and that any information you provide is encrypted and transmitted securely.

Access keys NCBI Homepage MyNCBI Homepage Main Content Main Navigation
. 2012 Jun 20;104(12):931-40.
doi: 10.1093/jnci/djs233. Epub 2012 Apr 30.

Administrative data algorithms to identify second breast cancer events following early-stage invasive breast cancer

Affiliations

Administrative data algorithms to identify second breast cancer events following early-stage invasive breast cancer

Jessica Chubak et al. J Natl Cancer Inst. .

Abstract

Background: Studies of breast cancer outcomes rely on the identification of second breast cancer events (recurrences and second breast primary tumors). Cancer registries often do not capture recurrences, and chart abstraction can be infeasible or expensive. An alternative is using administrative health-care data to identify second breast cancer events; however, these algorithms must be validated against a gold standard.

Methods: We developed algorithms using data from 3152 women in an integrated health-care system who were diagnosed with stage I or II breast cancer in 1993-2006. Medical record review served as the gold standard for second breast cancer events. Administrative data used in algorithm development included procedures, diagnoses, prescription fills, and cancer registry records. We randomly divided the cohort into training and testing samples and used a classification and regression tree analysis to build algorithms for classifying women as having or not having a second breast cancer event. We created several algorithms for researchers to use based on the relative importance of sensitivity, specificity, and positive predictive value (PPV) in future studies.

Results: The algorithm with high specificity and PPV had 89% sensitivity (95% confidence interval [CI] = 84% to 92%), 99% specificity (95% CI = 98% to 99%), and 90% PPV (95% CI = 86% to 94%); the high-sensitivity algorithm had 96% sensitivity (95% CI = 93% to 98%), 95% specificity (95% CI = 94% to 96%), and 74% PPV (95% CI = 68% to 78%).

Conclusions: Algorithms based on administrative data can identify second breast cancer events with high sensitivity, specificity, and PPV. The algorithms presented here promote efficient outcomes research, allowing researchers to prioritize sensitivity, specificity, or PPV in identifying second breast cancer events.

PubMed Disclaimer

Figures

Figure 1
Figure 1
High sensitivity algorithm for a second breast cancer event. “Yes” indicates that the criterion was met. “No” indicates that the criterion was not met.
Figure 2
Figure 2
High specificity and high positive predictive value algorithm for a second breast cancer event. “Yes” indicates that the criterion was met. “No” indicates that the criterion was not met.
Figure 3
Figure 3
Extremely high sensitivity algorithm for a second breast cancer event. “Yes” indicates that the criterion was met. “No” indicates that the criterion was not met.
Figure 4
Figure 4
High sensitivity algorithm for a second breast cancer event (No Surveillance, Epidemiology, and End Results variables). “Yes” indicates that the criterion was met. “No” indicates that the criterion was not met.
Figure 5
Figure 5
High specificity and high positive predictive value algorithm for a second breast cancer event (No Surveillance, Epidemiology, and End Results variables). “Yes” indicates that the criterion was met. “No” indicates that the criterion was not met.
Figure 6
Figure 6
Extremely high sensitivity algorithm for a second breast cancer event (No Surveillance, Epidemiology, and End Results variables). “Yes” indicates that the criterion was met. “No” indicates that the criterion was not met.

References

    1. National Cancer Institute. Overview of the SEER Program. 2011. http://seer.cancer.gov/about/overview.html. Accessed April 4, 2012.
    1. Chen SL, Martinez SR. The survival impact of the choice of surgical procedure after ipsilateral breast cancer recurrence. Am J Surg. . 2008;196(4):495–499. - PubMed
    1. Hershman D, Neugut AI, Jacobson JS, et al. Acute myeloid leukemia or myelodysplastic syndrome following use of granulocyte colony-stimulating factors during breast cancer adjuvant chemotherapy. J Natl Cancer Inst. . 2007;99(3):196–205. - PubMed
    1. Thompson D, Taylor DC, Montoya EL, Winer EP, Jones SE, Weinstein MC. Cost-effectiveness of switching to exemestane after 2 to 3 years of therapy with tamoxifen in postmenopausal women with early-stage breast cancer. Value Health. . 2007;10(5):367–376. - PubMed
    1. Stokes ME, Thompson D, Montoya EL, Weinstein MC, Winer EP, Earle CC. Ten-year survival and cost following breast cancer recurrence: estimates from SEER-Medicare data. Value Health. . 2008;11(2):213–220. - PubMed

Publication types