Improving Methods of Identifying Anaphylaxis for Medical Product Safety Surveillance Using Natural Language Processing and Machine Learning
- PMID: 36331289
- PMCID: PMC9896464
- DOI: 10.1093/aje/kwac182
Abstract
We sought to determine whether machine learning and natural language processing (NLP) applied to electronic medical records could improve the performance of automated health-care claims-based algorithms for identifying anaphylaxis events. We used data on 516 patients with outpatient, emergency department, or inpatient anaphylaxis diagnosis codes during 2015-2019 in 2 integrated health-care institutions in the Northwest United States. One site's manually reviewed gold-standard outcomes data were used for model development and the other's for external validation, with performance assessed by cross-validated area under the receiver operating characteristic curve (AUC), positive predictive value (PPV), and sensitivity. In the development site, 154 (64%) of 239 potential events met adjudication criteria for anaphylaxis, compared with 180 (65%) of 277 in the validation site. Logistic regression models using only structured claims data achieved a cross-validated AUC of 0.58 (95% CI: 0.54, 0.63). Machine learning improved cross-validated AUC to 0.62 (0.58, 0.66); incorporating NLP-derived covariates further increased cross-validated AUCs to 0.70 (0.66, 0.75) in development data and 0.67 (0.63, 0.71) in external validation data. A classification threshold with cross-validated PPV of 79% and cross-validated sensitivity of 66% in development data had cross-validated PPV of 78% and cross-validated sensitivity of 56% in external data. Machine learning and NLP-derived data improved identification of validated anaphylaxis events.
Keywords: anaphylaxis; electronic health records; health outcome identification; machine learning, supervised; postmarketing product surveillance; predictive modeling.
© The Author(s) 2022. Published by Oxford University Press on behalf of the Johns Hopkins Bloomberg School of Public Health.
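The abstract reports three performance measures: AUC, and PPV and sensitivity at a chosen classification threshold. As a minimal sketch of how these quantities are defined, the Python below computes them from a list of adjudicated labels and model scores. The function names, the toy labels and scores, and the 0.5 threshold are illustrative assumptions; they are not the study's data, models, or code, and the sketch omits the cross-validation the authors describe.

```python
from typing import Sequence, Tuple


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random true event outscores a random non-event.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def ppv_and_sensitivity(
    labels: Sequence[int], scores: Sequence[float], threshold: float
) -> Tuple[float, float]:
    """PPV and sensitivity when scores >= threshold are classified as events."""
    tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
    fn = sum(1 for y, s in zip(labels, scores) if s < threshold and y == 1)
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    return ppv, sensitivity


# Hypothetical toy data (NOT from the study): 1 = adjudicated anaphylaxis, 0 = not.
toy_labels = [1, 1, 1, 0, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]

toy_auc = auc(toy_labels, toy_scores)
toy_ppv, toy_sens = ppv_and_sensitivity(toy_labels, toy_scores, threshold=0.5)
print(f"AUC={toy_auc:.4f} PPV={toy_ppv:.2f} sensitivity={toy_sens:.2f}")
```

Raising the threshold in this sketch generally trades sensitivity for PPV, which is the kind of trade-off a fixed classification threshold reflects when it is carried from development data to external data.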