Biomed Eng Online. 2018 Nov 20;17(Suppl 1):131.
doi: 10.1186/s12938-018-0568-3.

Machine learning approaches for predicting high cost high need patient expenditures in health care

Chengliang Yang et al. Biomed Eng Online. 2018.

Abstract

Background: This paper studies the temporal consistency of health care expenditures in a large state Medicaid program. Predictive machine learning models were used to forecast the expenditures, especially for the high-cost, high-need (HCHN) patients.

Results: We systematically test the temporal correlation of patient-level health care expenditures over both the short and the long term. The results suggest that medical expenditures are significantly correlated over multiple periods. Our work demonstrates a prevalent and strong temporal correlation and shows promise for predicting future health care expenditures using machine learning. Temporal correlation is stronger in HCHN patients, and their expenditures can be predicted more accurately. Including more past periods improves predictive performance.

Conclusions: This study shows that there is significant temporal correlation in health care expenditures. Machine learning models can help to accurately forecast the expenditures. These results could advance the field toward precise preventive care to lower overall health care costs and deliver care more efficiently.

Keywords: High need patients; High-cost; Machine learning; Predictive modeling.
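
As a rough illustration of the temporal-correlation test described in the Results, the sketch below converts each patient's spending in two consecutive periods to percentile ranks and then correlates the ranks, which is equivalent to a Spearman correlation. The spending figures are made up, and `percentile_ranks` and `pearson` are hypothetical helper names rather than the authors' code.

```python
import math

def percentile_ranks(values):
    """Map each value to a percentile rank in (0, 100], averaging ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the group of patients tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = 100.0 * avg_rank / len(values)
        i = j + 1
    return ranks

def pearson(x, y):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Made-up expenditures for six patients in two consecutive periods.
prior_period = [120.0, 0.0, 5400.0, 310.0, 18000.0, 75.0]
next_period = [400.0, 10.0, 4900.0, 90.0, 21000.0, 60.0]

prior_pctl = percentile_ranks(prior_period)
next_pctl = percentile_ranks(next_period)
rank_correlation = pearson(prior_pctl, next_pctl)
print(f"rank correlation between periods: {rank_correlation:.3f}")
```

Working on percentile ranks rather than raw dollars keeps a few very expensive patients from dominating the correlation, which matters for heavily skewed spending data.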

Figures

Fig. 1
Schematic diagram of the deployed RNN model. The process consists of three steps. Step 1: Input variables are embedded; Step 2: An RNN with a single gated recurrent unit (GRU) layer is used to generate attention from the sequential embeddings; Step 3: Attentions and embeddings are summed to make the context vector. The context vector is later transformed to the output
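
The three steps in the Fig. 1 caption can be sketched in plain Python as follows. This is a minimal forward pass under several assumptions, not the authors' model: the weights are random and untrained, GRU bias terms are omitted, attention scores are normalised with a softmax, "summed" is read as an attention-weighted sum of the embeddings, and every name (`init_params`, `forward`, `w_att`, `w_out`) is hypothetical.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def matvec(m, v):
    return [sum(w * x for w, x in zip(row, v)) for row in m]

def rand_matrix(rng, rows, cols, scale=0.1):
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]

def init_params(vocab, emb_dim, hid_dim, seed=0):
    """Random, untrained weights; sizes are illustrative only."""
    rng = random.Random(seed)
    p = {"E": rand_matrix(rng, vocab, emb_dim)}      # embedding table
    for gate in "zrh":                               # update, reset, candidate
        p["W" + gate] = rand_matrix(rng, hid_dim, emb_dim)
        p["U" + gate] = rand_matrix(rng, hid_dim, hid_dim)
    p["w_att"] = rand_matrix(rng, 1, hid_dim)[0]     # hidden state -> attention score
    p["w_out"] = rand_matrix(rng, 1, emb_dim)[0]     # context vector -> output
    p["b_out"] = 0.0
    return p

def gru_step(p, x, h):
    """One GRU update (biases omitted for brevity)."""
    z = [sigmoid(a + b) for a, b in zip(matvec(p["Wz"], x), matvec(p["Uz"], h))]
    r = [sigmoid(a + b) for a, b in zip(matvec(p["Wr"], x), matvec(p["Ur"], h))]
    reset_h = [ri * hi for ri, hi in zip(r, h)]
    cand = [math.tanh(a + b)
            for a, b in zip(matvec(p["Wh"], x), matvec(p["Uh"], reset_h))]
    return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]

def softmax(scores):
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def forward(p, codes):
    """codes: one integer input code per time step (a hypothetical encoding)."""
    embeddings = [p["E"][c] for c in codes]          # Step 1: embed the inputs
    h = [0.0] * len(p["Uz"])
    scores = []
    for e in embeddings:                             # Step 2: GRU -> attention scores
        h = gru_step(p, e, h)
        scores.append(sum(w * hi for w, hi in zip(p["w_att"], h)))
    alpha = softmax(scores)
    dim = len(embeddings[0])
    context = [sum(a * e[d] for a, e in zip(alpha, embeddings))  # Step 3
               for d in range(dim)]
    output = sum(w * c for w, c in zip(p["w_out"], context)) + p["b_out"]
    return alpha, context, output

params = init_params(vocab=20, emb_dim=4, hid_dim=3)
alpha, context, output = forward(params, codes=[3, 7, 7, 12])
print("attention weights:", [round(a, 3) for a in alpha])
```

Because the context vector is a weighted sum of the input embeddings, each attention weight indicates how much one time step contributes to the output.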
Fig. 2
Scatter plot of expenditure percentiles between two consecutive time periods for different period lengths. The upper right corner is denser, implying that HCHN patients are more temporally consistent. From left to right, the x axes show the last month, last 3 months, last 6 months, and entire 12 months of 2012, respectively. The y axes show the first month, first 3 months, first 6 months, and entire 12 months of 2013, respectively
Fig. 3
Scatter plot of expenditure percentiles of the top 10% of the population between two consecutive time periods for different period lengths. The majority of HCHN patients stay above the 80th percentile in the next period. From left to right, the x axes show the last month, last 3 months, last 6 months, and entire 12 months of 2012, respectively. The y axes show the first month, first 3 months, first 6 months, and entire 12 months of 2013, respectively
Fig. 4
Scatter plot of expenditure percentiles of the diabetes cohort between two consecutive time periods for different period lengths. The conclusions are similar to those for the entire adult population. The HCHN patients in the upper right corner are consistent. The low-cost population in the lower left also shows consistency. From left to right, the x axes show the last month, last 3 months, last 6 months, and entire 12 months of 2012, respectively. The y axes show the first month, first 3 months, first 6 months, and entire 12 months of 2013, respectively
Fig. 5
Scatter plot of expenditure percentiles of the top 10% of the diabetes cohort between two consecutive time periods for different period lengths. HCHN diabetes patients are likely to stay in the top 20%. From left to right, the x axes show the last month, last 3 months, last 6 months, and entire 12 months of 2012, respectively. The y axes show the first month, first 3 months, first 6 months, and entire 12 months of 2013, respectively
Fig. 6
Comparison of different period lengths. Performance generally improves as the time period becomes longer, consistent with the higher temporal correlation observed for longer periods. However, GBM and LASSO appear to reach their best R-squared at a period length of 6 months when predicting pctlPMPM and logPMPM
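
For readers unfamiliar with the metric, the sketch below shows how R-squared could be computed on a log-transformed spending target. It assumes that PMPM means cost per member per month and that logPMPM is `log(1 + PMPM)`. Neither definition is given in this text, so both are assumptions, and the cost figures are made up.

```python
import math

def pmpm(total_cost, member_months):
    """Per-member-per-month cost for one patient (assumed definition)."""
    return total_cost / member_months

def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

# Made-up six-month totals for four patients.
total_costs = [0.0, 600.0, 3000.0, 36000.0]
months_enrolled = [6, 6, 6, 6]

# log1p keeps zero-cost patients finite (log(1 + 0) = 0).
log_target = [math.log1p(pmpm(c, m)) for c, m in zip(total_costs, months_enrolled)]

mean_target = sum(log_target) / len(log_target)
r2_perfect = r_squared(log_target, log_target)                        # exact predictions
r2_baseline = r_squared(log_target, [mean_target] * len(log_target))  # predict the mean
print(r2_perfect, r2_baseline)
```

A model that always predicts the mean scores 0 and a perfect model scores 1, so R-squared values in between show how much of the variance in the target a model explains.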
Fig. 7
All four models improved after adding demographics, diagnoses, medical procedures and medications as input variables, suggesting that though prior expenditures already provide a good approximation for future spending, additional information is useful in predictive modeling
Fig. 8
Performance changes after adding more prior periods. Most measures substantially improved after adding the first three periods. The gain for adding a fourth period to LR, LASSO and GBM is minimal. RNN benefits most, indicating its stronger ability to model temporal relations
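
One way to read "adding more prior periods" is as widening a window of lagged expenditures used as model inputs. The sketch below builds such input and target pairs from a single patient's expenditure series. The function name `lagged_rows` and the numbers are hypothetical, and the authors' actual feature construction may differ.

```python
def lagged_rows(series, n_lags):
    """Pair each value with the n_lags values that immediately precede it."""
    rows = []
    for t in range(n_lags, len(series)):
        features = series[t - n_lags:t]  # expenditures in the n_lags prior periods
        target = series[t]               # expenditure in the period to predict
        rows.append((features, target))
    return rows

# Made-up expenditures for one patient over five consecutive periods.
spending = [100.0, 120.0, 90.0, 400.0, 380.0]

one_prior = lagged_rows(spending, 1)
three_prior = lagged_rows(spending, 3)
print(three_prior)
```

Each additional lag gives the model more history per row but leaves fewer rows to train on, since the earliest periods lack a full window.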
Fig. 9
Contributions derived from a prediction by LASSO. The radius of the circle corresponds to the standard deviation of the contribution
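
For a linear model such as LASSO, a prediction splits naturally into one term per input, namely the coefficient times the input value. The caption does not say how the standard deviation of a contribution is obtained. The sketch below assumes it comes from repeated refits (for example on resampled data), and it uses made-up coefficients and hypothetical feature names throughout.

```python
import statistics

def contributions(coefs, intercept, x):
    """Split a linear prediction into per-feature terms coef * value."""
    parts = {name: coefs[name] * x[name] for name in coefs}
    return parts, intercept + sum(parts.values())

# Made-up (coefficients, intercept) pairs standing in for three refits.
refits = [
    ({"prior_cost_pctl": 0.80, "age": 0.05, "n_diagnoses": 0.0}, 5.0),
    ({"prior_cost_pctl": 0.78, "age": 0.06, "n_diagnoses": 0.0}, 6.0),
    ({"prior_cost_pctl": 0.82, "age": 0.04, "n_diagnoses": 0.0}, 4.5),
]
patient = {"prior_cost_pctl": 90.0, "age": 50.0, "n_diagnoses": 4.0}

per_fit = [contributions(coefs, intercept, patient) for coefs, intercept in refits]
predictions = [pred for _, pred in per_fit]

# Mean and standard deviation of each feature's contribution across refits.
summary = {}
for name in patient:
    values = [parts[name] for parts, _ in per_fit]
    summary[name] = (statistics.mean(values), statistics.stdev(values))

for name, (mean, std) in summary.items():
    print(f"{name}: mean contribution {mean:.2f}, std {std:.2f}")
```

A feature whose coefficient LASSO shrinks to zero in every refit contributes nothing and has zero spread, which is one reason a sparse linear model can yield stable contributions.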
Fig. 10
Contributions derived from the same prediction by GBM. We observe a larger variation in contributions, but the variation in the predicted value is similar
Fig. 11
Contributions derived from a prediction by RNN. Comparing LASSO, GBM and RNN, LASSO gives not only a stable predicted value but also stable contributions. GBM has consistent predicted values but is less stable in contributions. RNN is unstable in both, possibly due to its non-convex optimization procedure
