BMJ Open. 2017 Jan 11;7(1):e011580.
doi: 10.1136/bmjopen-2016-011580

Predicting patient 'cost blooms' in Denmark: a longitudinal population-based study

Suzanne Tamang et al. BMJ Open.

Abstract

Objectives: To compare the ability of standard versus enhanced models to predict future high-cost patients, especially those who move from a lower decile to the upper decile of per capita healthcare expenditures within 1 year (that is, 'cost bloomers').

Design: We developed alternative models to predict being in the upper decile of healthcare expenditures in year 2 of a sample, based on data from year 1. Our 6 alternative models ranged from a standard cost-prediction model with 4 variables (ie, traditional model features), to our largest enhanced model with 1053 non-traditional model features. To quantify any increases in predictive power that enhanced models achieved over standard tools, we compared the prospective predictive performance of each model.

Participants and setting: We used the population of Western Denmark between 2004 and 2011 (2 146 801 individuals) to predict future high-cost patients and characterise high-cost patient subgroups. Using the most recent 2-year period (2010-2011) for model evaluation, our whole-population model used a cohort of 1 557 950 individuals with a full year of active residency in year 1 (2010). Our cost-bloom model excluded the 155 795 individuals who were already high cost at the population level in year 1, resulting in 1 402 155 individuals for prediction of cost bloomers in year 2 (2011).
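The cohort definitions above can be sketched in a few lines. This is a minimal illustration, not the authors' code: it assumes "high cost" means the upper decile of spending in a given year, that the same individuals appear in both years, and that a cost bloomer is someone in the year-2 upper decile who was not in the year-1 upper decile. The function names and the toy data are hypothetical.

```python
def top_decile_ids(costs):
    """Return the ids of the top 10% of spenders (ties broken by input order)."""
    n_top = len(costs) // 10
    ranked = sorted(costs, key=costs.get, reverse=True)
    return set(ranked[:n_top])


def split_high_cost(year1_costs, year2_costs):
    """Split year-2 high-cost patients into cost bloomers and persistent cases."""
    high_y1 = top_decile_ids(year1_costs)
    high_y2 = top_decile_ids(year2_costs)
    bloomers = high_y2 - high_y1      # high cost in year 2 only
    persistent = high_y2 & high_y1    # high cost in both years
    return bloomers, persistent


# Toy cohort of 20 hypothetical patients (ids 0..19).
year1 = {i: 100 * i for i in range(20)}   # ids 18 and 19 are high cost in year 1
year2 = {i: 100 for i in range(20)}
year2[19] = 5000                          # stays high cost
year2[3] = 9000                           # blooms from a lower decile
bloomers, persistent = split_high_cost(year1, year2)
print(bloomers, persistent)

# Cohort sizes reported in the abstract: excluding the year-1 upper decile.
whole_population = 1_557_950
high_cost_year1 = whole_population // 10
print(high_cost_year1, whole_population - high_cost_year1)
```

On the toy data, patient 3 is the only cost bloomer and patient 19 the only persistent high-cost patient. The final line reproduces the abstract's arithmetic: 155 795 individuals excluded, leaving 1 402 155 for cost-bloom prediction.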

Primary outcome measures: Using unseen data from a future year, we evaluated each model's prospective predictive performance by calculating the ratio of predicted high-cost patient expenditures to the actual high-cost patient expenditures in year 2 (that is, cost capture).
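One plausible reading of cost capture can be sketched as follows. The assumptions here are not stated in the abstract: the model is taken to flag as many patients as there are actual high-cost patients, ranked by predicted risk, and the numerator is the actual year-2 spending of those flagged patients. The function name, the `top_fraction` parameter and the toy data are hypothetical.

```python
def cost_capture(predicted_risk, actual_costs, top_fraction=0.10):
    """Ratio of year-2 spending by predicted high-cost patients to year-2
    spending by the actual high-cost patients (1.0 means perfect capture)."""
    n_top = max(1, round(len(actual_costs) * top_fraction))
    # Patients the model flags: highest predicted risk first.
    predicted_top = sorted(predicted_risk, key=predicted_risk.get, reverse=True)[:n_top]
    # Patients who actually were high cost: highest observed spending first.
    actual_top = sorted(actual_costs, key=actual_costs.get, reverse=True)[:n_top]
    captured = sum(actual_costs[pid] for pid in predicted_top)
    possible = sum(actual_costs[pid] for pid in actual_top)
    return captured / possible


# Ten hypothetical patients with observed year-2 costs.
spend = [1000, 800, 50, 40, 30, 20, 10, 10, 5, 5]
actual = {f"p{i}": cost for i, cost in enumerate(spend)}

# Hypothetical risk scores: the model ranks p0 and p2 highest and misses p1.
risk = {pid: 0.1 for pid in actual}
risk.update({"p0": 0.9, "p2": 0.8, "p1": 0.7})

print(cost_capture(risk, actual, top_fraction=0.2))    # (1000 + 50) / (1000 + 800)
print(cost_capture(actual, actual, top_fraction=0.2))  # perfect ranking
```

Under this reading, a model that ranks patients exactly by their future spending scores 1.0, and missing an expensive patient costs more than missing a cheap one, which is why the measure weights errors by expenditure rather than counting them equally.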

Results: Our best enhanced model achieved a 21% and 30% improvement in cost capture over a standard diagnosis-based model for predicting population-level high-cost patients and cost bloomers, respectively.

Conclusions: In combination with modern statistical learning methods for analysing large data sets, models enhanced with a large and diverse set of features led to better performance, especially for predicting future cost bloomers.

Keywords: high-cost patients; predictive analytics.

Conflict of interest statement

Conflicts of Interest: None declared.

Figures

Figure 1
Overview of our model development and evaluation framework. Three independent panel data sets were used for training (model fitting), tuning and testing steps. To evaluate alternative models, we calculated the ratio of predicted high-cost patient expenditures to actual high-cost patient expenditures in year 2.
Figure 2
High-cost persistence in Western Denmark (N=2 146 801). Among the 314 989 individuals with any high-cost years, the bars show the per cent of high-cost patients by total high-cost years; colour saturation increases proportionally to the longest duration of consecutive high-cost years for each individual from 2004 to 2011.
Figure 3
Proportion of chronic condition indicators among persistent high-cost patients (N=49 855) and cost bloomers (N=105 904). Bars show the per cent of patients with each indicator in the prior year, 2010; colour identifies the high-cost group.
Figure 4
Age distribution of 2011 high-cost patients by high-cost status (N=155 756). Lines show the per cent of patients by age; colour distinguishes persistent high-cost or cost-bloom status. Persistent high-cost patients and cost bloomers had mean and median interquartile age ranges of 30 and 34, respectively.
Figure 5
Performance of alternative cost-bloom prediction models by cost capture and relative improvement over the baseline. Bars show cost capture for each model; lines show the per cent increases in predictive power. More details on each model are provided in table 2.
