2019 May 6:2019:620-629. eCollection 2019.

Deep Learning on Electronic Health Records to Improve Disease Coding Accuracy

Sina Rashidian et al. AMIA Jt Summits Transl Sci Proc.

Abstract

Characterization of a patient's clinical phenotype is central to biomedical informatics. ICD codes, assigned to inpatient encounters by coders, are important for population health and cohort discovery when clinical information is limited. Although ICD codes are assigned to patients by professionals trained and certified in coding, there is substantial variability in coding. We present a methodology that uses deep learning to model coder decision making and predict ICD codes. Our approach predicts codes based on demographics, lab results, and medications, as well as codes from previous encounters. We predict existing codes with high accuracy for all three of the test cases we investigated: diabetes, acute renal failure, and chronic kidney disease. We employed a panel of clinicians, in a blinded manner, to assess ground truth and compared the predictions of coders, the model, and clinicians. When disparities between the model predictions and coder-assigned codes were reviewed, our model outperformed the coder-assigned ICD codes.

Figures

Figure 1.
(A) F1 comparison of different learning methods across three different diseases. Deep Learning (DL), Logistic Regression (LR) and Random Forest (RF) are shown. (B) Performance of deep learning prediction in other data sets, for facilities with identifier 143, 67, or the multi-10 dataset consisting of 10 different facilities.
Figure 2.
Accuracy of model predictions, with expert review as ground truth, in cases where the model disagrees with the existing codes. Model accuracy is shown as the median (dot) and the 95% confidence interval (bar) using a beta-binomial model with Jeffreys prior. CKD - Chronic Kidney Disease, ARF - Acute Renal Failure.
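
The interval described in the Figure 2 caption can be illustrated with a minimal sketch. It assumes the usual reading of a beta-binomial model with Jeffreys prior: with k correct predictions out of n reviewed cases, the posterior for accuracy is Beta(k + 1/2, n - k + 1/2), and the dot and bar are its median and central 95% interval. The function name `jeffreys_interval`, the Monte Carlo approach, and the counts in the usage line are hypothetical and are not taken from the paper.

```python
import random


def jeffreys_interval(successes, trials, level=0.95, draws=20000, seed=0):
    """Approximate the posterior median and equal-tailed interval for a
    binomial proportion under the Jeffreys prior Beta(1/2, 1/2).

    Uses Monte Carlo draws from the Beta posterior (an assumption of this
    sketch; an exact Beta quantile function would give the same answer
    without sampling noise).
    """
    if not 0 <= successes <= trials:
        raise ValueError("successes must lie between 0 and trials")
    rng = random.Random(seed)
    # Posterior parameters: prior Beta(1/2, 1/2) updated with the counts.
    alpha = successes + 0.5
    beta = trials - successes + 0.5
    samples = sorted(rng.betavariate(alpha, beta) for _ in range(draws))

    def quantile(p):
        # Empirical quantile of the sorted posterior draws.
        return samples[min(draws - 1, int(p * draws))]

    tail = (1.0 - level) / 2.0
    return quantile(tail), quantile(0.5), quantile(1.0 - tail)


# Hypothetical counts: 30 of 40 disputed cases judged in the model's favor.
low, median, high = jeffreys_interval(30, 40)
print(f"median={median:.3f}, 95% interval=({low:.3f}, {high:.3f})")
```

Strictly speaking, an interval derived from a posterior in this way is a credible interval; the caption's wording is kept as published.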