Comparative Study
JMIR Med Inform. 2025 Aug 29:13:e67837.
doi: 10.2196/67837.

Natural Language Processing and ICD-10 Coding for Detecting Bleeding Events in Discharge Summaries: Comparative Cross-Sectional Study

Frederic Gaspar et al. JMIR Med Inform.

Abstract

Background: Bleeding adverse drug events (ADEs), particularly among older inpatients receiving antithrombotic therapy, represent a major safety concern in hospitals. These events are often underdetected by conventional rule-based systems relying on structured electronic medical record data, such as the ICD-10 (International Statistical Classification of Diseases and Related Health Problems 10th Revision) codes, which lack the granularity to capture nuanced clinical narratives.

Objective: This study aimed to develop and evaluate a natural language processing (NLP) model to detect and categorize bleeding ADEs in discharge summaries of older adults. Specifically, the model was designed to distinguish between "clinically significant bleeding," "severe bleeding," "history of bleeding," and "no bleeding," and was compared with a rule-based algorithm using ICD-10 codes.

Methods: Clinicians manually annotated 400 discharge summaries, comprising 65,706 sentences, into four categories: "no bleeding," "clinically significant bleeding," "severe bleeding," and "history of bleeding." The dataset was divided into a training set (70%, 47,100 sentences) and a test set (30%, 18,606 sentences). Two detection approaches were developed and evaluated: (1) an NLP model using binary logistic regression and support vector machine classifiers, and (2) a traditional rule-based algorithm relying exclusively on predefined ICD-10 codes. To address class imbalance, with most sentences categorized as irrelevant ("no bleeding"), a class-weighting strategy was applied in the NLP model. Model performance was assessed using accuracy, precision, recall, F1-score, and receiver operating characteristic (ROC) curve analyses, with manual annotations as the gold standard.
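The abstract does not say which class-weighting scheme was used. As a minimal sketch, assuming the common "balanced" heuristic (weight = n_samples / (n_classes × class count), the same formula scikit-learn applies for `class_weight="balanced"`), the weights could be computed as follows. The label counts are invented for illustration and are not the study's data.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Return inverse-frequency weights: n_samples / (n_classes * class_count).

    Assumption: the 'balanced' heuristic; the study's actual scheme is not
    stated in the abstract.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


# Hypothetical toy distribution dominated by "no bleeding" sentences.
toy_labels = (
    ["no bleeding"] * 90
    + ["clinically significant bleeding"] * 5
    + ["history of bleeding"] * 3
    + ["severe bleeding"] * 2
)

weights = balanced_class_weights(toy_labels)
for label, weight in sorted(weights.items(), key=lambda item: item[1]):
    print(f"{label}: {weight:.3f}")
```

Under this scheme the rarest class receives the largest weight, so a misclassified "severe bleeding" sentence costs the classifier more during training than a misclassified "no bleeding" sentence.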

Results: The NLP model significantly outperformed the rule-based approach across all evaluation metrics. At the document level, the NLP model achieved macro-average scores of 0.81 for accuracy and 0.80 for F1-score. Precision was particularly high for detecting severe (0.92) and clinically significant bleeding events (0.87), demonstrating strong classification capability despite class imbalance. ROC analyses confirmed the model's robust diagnostic performance, yielding an area under the curve (AUC) of 0.91 when distinguishing irrelevant sentences from potential bleeding events, 0.88 for identifying historical mentions of bleeding, and notably, 0.94 for differentiating clinically significant from severe bleeding. In contrast, the rule-based ICD-10 model demonstrated high precision (0.94) for clinically significant bleeding but poor recall (0.03) for severe bleeding events, reflecting frequent missed detections. This limitation arose due to its reliance on commonly used ICD-10 codes (eg, gastrointestinal hemorrhage) and inadequate capture of rare severe bleeding conditions such as shock due to hemorrhage.
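To make the precision and recall figures concrete, the sketch below computes precision, recall, and F1-score from confusion-matrix counts. The counts are hypothetical, chosen only so that recall comes out at 0.03 (the value reported for severe bleeding under the ICD-10 rules); the precision they produce is not a study result.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall, and F1 from true/false positive and false negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def macro_average(values):
    """Unweighted mean of per-class scores."""
    return sum(values) / len(values)


# Hypothetical counts (not from the study): 3 of 100 true events are flagged,
# plus 1 false alarm.
p, r, f1 = precision_recall_f1(tp=3, fp=1, fn=97)
print(f"precision={p:.2f} recall={r:.2f} F1={f1:.3f}")
```

A detector can therefore be right most of the time when it does fire while still missing nearly every event, which is why the F1-score stays low here despite the reasonable precision.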

Conclusions: This study highlights the considerable advantage of NLP over traditional ICD-10-based methods for detecting bleeding ADEs within electronic medical records. The NLP model effectively captured nuanced clinical narratives, including severity, negations, and historical bleeding events, demonstrating substantial promise for improving patient safety surveillance and clinical decision-making. Future research should extend validation across multiple institutions, diversify annotated datasets, and further refine temporal reasoning capabilities within NLP algorithms.

Keywords: AI; ML; NLP; adverse drug events; artificial intelligence; bleeding; cross-sectional; decision support systems; decision-making; deep learning; discharge summaries; elderly; electronic medical records; healthcare; hemorrhage; international classification of diseases; logistic regression; machine learning; medical records; natural language processing; older adults.

Conflict of interest statement

Conflicts of Interest: None declared.

Figures

Figure 1. Overview of the methodological framework used in this study. ADE: adverse drug event; AUC: area under the curve; BERT: Bidirectional Encoder Representations from Transformers; HDBSCAN: hierarchical density-based spatial clustering of applications with noise; ICD-10-GM: German Modification of the International Statistical Classification of Diseases and Related Health Problems, 10th Revision; ISTH: International Society on Thrombosis and Haemostasis; NLP: natural language processing; UMAP: uniform manifold approximation and projection.
Figure 2. Sentence-level classification pipeline for detecting bleeding events. The pipeline has three stages: (1) BOW-LOG, a logistic regression classifier on a bag-of-words (BOW) model that determines whether the sentence contains bleeding-relevant information; (2) BOW-SVM, a support vector machine (SVM) classifier on a BOW model that assesses whether there is evidence of bleeding in the sentence; and (3) a second BOW-LOG, a logistic regression classifier on a BOW model that evaluates the severity of the bleeding event (ie, whether it is severe or clinically significant).
Figure 3. Distribution of sentence lengths in training and test sets.
Figure 4. Receiver operating characteristic (ROC) curves showing the diagnostic performance of the multistage classification model at various thresholds, illustrating the trade-off between the true positive rate (sensitivity) and the false positive rate (1 - specificity).
Figure 5. Comparative analysis of confusion matrices for bleeding detection using natural language processing (NLP) and ICD (International Classification of Diseases) coding methods. This figure contrasts the performance of the NLP model and the ICD coding approach, displaying the counts of true positives, true negatives, false positives, and false negatives. ICD-10: International Statistical Classification of Diseases and Related Health Problems, 10th Revision.
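The three-stage sentence pipeline described for Figure 2 can be sketched as a simple cascade. This is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the keyword stubs stand in for the trained BOW-LOG and BOW-SVM classifiers, and treating a negative second-stage result as "history of bleeding" is an assumption about how the stages map to the four labels.

```python
def classify_sentence(sentence, is_relevant, is_current_bleeding, is_severe):
    """Route one sentence through three binary stages and return one of four labels.

    Each stage argument is a callable taking a sentence and returning bool.
    Assumption: a stage-2 negative is labeled 'history of bleeding'.
    """
    if not is_relevant(sentence):            # stage 1 (BOW-LOG in Figure 2)
        return "no bleeding"
    if not is_current_bleeding(sentence):    # stage 2 (BOW-SVM in Figure 2)
        return "history of bleeding"
    if is_severe(sentence):                  # stage 3 (second BOW-LOG in Figure 2)
        return "severe bleeding"
    return "clinically significant bleeding"


# Hypothetical keyword stubs standing in for trained classifiers.
def stub_is_relevant(sentence):
    text = sentence.lower()
    return "bleed" in text or "hemorrhag" in text


def stub_is_current(sentence):
    return "history of" not in sentence.lower()


def stub_is_severe(sentence):
    text = sentence.lower()
    return "shock" in text or "transfusion" in text


examples = [
    "Patient ambulating well.",
    "History of gastrointestinal bleeding in 2019.",
    "Minor rectal bleeding noted on day 3.",
    "Hemorrhagic shock due to upper GI bleed.",
]
results = [
    classify_sentence(s, stub_is_relevant, stub_is_current, stub_is_severe)
    for s in examples
]
for sentence, label in zip(examples, results):
    print(f"{label}: {sentence}")
```

Passing the stages in as callables keeps the routing logic separate from the models, so each stub could be swapped for a fitted classifier without changing the cascade.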
