Zero-shot interpretable phenotyping of postpartum hemorrhage using large language models
- PMID: 38036723
- PMCID: PMC10689487
- DOI: 10.1038/s41746-023-00957-x
Abstract
Many areas of medicine would benefit from deeper, more accurate phenotyping, but there are limited approaches for phenotyping using clinical notes without substantial annotated data. Large language models (LLMs) have demonstrated immense potential to adapt to novel tasks with no additional training by specifying task-specific instructions. Here we report the performance of a publicly available LLM, Flan-T5, in phenotyping patients with postpartum hemorrhage (PPH) using discharge notes from electronic health records (n = 271,081). The language model achieves strong performance in extracting 24 granular concepts associated with PPH. Identifying these granular concepts accurately allows the development of interpretable, complex phenotypes and subtypes. The Flan-T5 model achieves high fidelity in phenotyping PPH (positive predictive value of 0.95), identifying 47% more patients with this complication compared to the current standard of using claims codes. This LLM pipeline can be used reliably for subtyping PPH and outperforms a claims-based approach on the three most common PPH subtypes associated with uterine atony, abnormal placentation, and obstetric trauma. The advantage of this approach to subtyping is its interpretability, as each concept contributing to the subtype determination can be evaluated. Moreover, as definitions may change over time due to new guidelines, using granular concepts to create complex phenotypes enables prompt and efficient updating of the algorithm. Using this language modelling approach enables rapid phenotyping without the need for any manually annotated training data across multiple clinical use cases.
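The idea of building an interpretable phenotype from granular concepts can be illustrated with a minimal sketch. It assumes each concept is posed to the model as a yes/no question and that the answers are combined by simple rules. The concept names, question wording, subtype rules, and the keyword-based `stub_ask` function are all hypothetical stand-ins for illustration; they are not the prompts, the 24 concepts, or the phenotype definitions used in the study. In practice `stub_ask` would be replaced by a call to an instruction-tuned model such as Flan-T5.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical concept names mapped to hypothetical yes/no questions.
CONCEPT_PROMPTS: Dict[str, str] = {
    "pph": "Does the note mention postpartum hemorrhage?",
    "uterine_atony": "Does the note mention uterine atony?",
    "abnormal_placentation": "Does the note mention abnormal placentation?",
    "obstetric_trauma": "Does the note mention obstetric trauma?",
}

# Hypothetical rules: a subtype is assigned if any listed concept is present.
SUBTYPE_RULES: Dict[str, List[str]] = {
    "atony": ["uterine_atony"],
    "placentation": ["abnormal_placentation"],
    "trauma": ["obstetric_trauma"],
}

# (note text, question) -> free-text answer; an LLM call would go here.
AskFn = Callable[[str, str], str]


@dataclass
class Phenotype:
    has_pph: bool
    subtypes: List[str]
    evidence: Dict[str, bool]  # per-concept answers, kept for inspection


def extract_concepts(note: str, ask: AskFn) -> Dict[str, bool]:
    """Ask one question per concept and record whether the answer is 'yes'."""
    return {
        name: ask(note, question).strip().lower().startswith("yes")
        for name, question in CONCEPT_PROMPTS.items()
    }


def phenotype(note: str, ask: AskFn) -> Phenotype:
    """Combine concept-level answers into a phenotype and its subtypes."""
    concepts = extract_concepts(note, ask)
    has_pph = concepts["pph"]
    subtypes = [
        subtype
        for subtype, required in SUBTYPE_RULES.items()
        if has_pph and any(concepts[c] for c in required)
    ]
    return Phenotype(has_pph=has_pph, subtypes=subtypes, evidence=concepts)


# Keyword stand-in for the model so the sketch runs offline.
# It does not handle negation or paraphrase, which is where an LLM would help.
_STUB_KEYWORDS: Dict[str, List[str]] = {
    CONCEPT_PROMPTS["pph"]: ["postpartum hemorrhage"],
    CONCEPT_PROMPTS["uterine_atony"]: ["uterine atony"],
    CONCEPT_PROMPTS["abnormal_placentation"]: ["accreta", "increta", "percreta"],
    CONCEPT_PROMPTS["obstetric_trauma"]: ["laceration", "uterine rupture"],
}


def stub_ask(note: str, question: str) -> str:
    text = note.lower()
    return "yes" if any(k in text for k in _STUB_KEYWORDS[question]) else "no"


if __name__ == "__main__":
    example = "Delivery complicated by postpartum hemorrhage due to uterine atony."
    print(phenotype(example, stub_ask))
```

Because the per-concept answers are retained in `evidence`, each subtype assignment can be traced back to the concepts that triggered it, and a revised definition only requires editing the rule table rather than re-annotating notes.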
© 2023. The Author(s).
Conflict of interest statement
K.J.G. has served as a consultant to Illumina Inc., Aetion, Roche, and BillionToOne outside the scope of the submitted work. D.W.B. reports grants and personal fees from EarlySense, personal fees from CDI Negev, equity from Valera Health, equity from CLEW, equity from MDClone, personal fees and equity from AESOP Technology, personal fees and equity from FeelBetter, and grants from IBM Watson Health, outside the submitted work. V.P.K. reports consulting fees from Avania CRO unrelated to the current work.
Update of
- Zero-shot Interpretable Phenotyping of Postpartum Hemorrhage Using Large Language Models. medRxiv [Preprint]. 2023 Jun 1:2023.05.31.23290753. doi: 10.1101/2023.05.31.23290753. Update in: NPJ Digit Med. 2023 Nov 30;6(1):212. doi: 10.1038/s41746-023-00957-x. PMID: 37398230. Free PMC article.
Similar articles
- Zero-shot Interpretable Phenotyping of Postpartum Hemorrhage Using Large Language Models. medRxiv [Preprint]. 2023 Jun 1:2023.05.31.23290753. doi: 10.1101/2023.05.31.23290753. Update in: NPJ Digit Med. 2023 Nov 30;6(1):212. doi: 10.1038/s41746-023-00957-x. PMID: 37398230. Free PMC article.
- Ensembles of natural language processing systems for portable phenotyping solutions. J Biomed Inform. 2019 Dec;100:103318. doi: 10.1016/j.jbi.2019.103318. Epub 2019 Oct 23. PMID: 31655273. Free PMC article.
- Large language models to identify social determinants of health in electronic health records. NPJ Digit Med. 2024 Jan 11;7(1):6. doi: 10.1038/s41746-023-00970-0. PMID: 38200151. Free PMC article.
- Active management of the third stage of labour: prevention and treatment of postpartum hemorrhage. J Obstet Gynaecol Can. 2009 Oct;31(10):980-993. doi: 10.1016/S1701-2163(16)34329-8. PMID: 19941729. Review.
- Primary and secondary postpartum haemorrhage: a review for a rationale endovascular approach. CVIR Endovasc. 2024 Feb 13;7(1):17. doi: 10.1186/s42155-024-00429-7. PMID: 38349501. Free PMC article. Review.
Cited by
- Toward real-world deployment of machine learning for health care: External validation, continual monitoring, and randomized clinical trials. Health Care Sci. 2024 Oct 14;3(5):360-364. doi: 10.1002/hcs2.114. eCollection 2024 Oct. PMID: 39479276. Free PMC article.
- Use of a Large Language Model to Assess Clinical Acuity of Adults in the Emergency Department. JAMA Netw Open. 2024 May 1;7(5):e248895. doi: 10.1001/jamanetworkopen.2024.8895. PMID: 38713466. Free PMC article.
- Opportunities of AI-powered applications in anesthesiology to enhance patient safety. Int Anesthesiol Clin. 2024 Apr 1;62(2):26-33. doi: 10.1097/AIA.0000000000000437. Epub 2024 Feb 13. PMID: 38348838. Free PMC article. No abstract available.
- Synthetic data distillation enables the extraction of clinical information at scale. NPJ Digit Med. 2025 May 10;8(1):267. doi: 10.1038/s41746-025-01681-4. PMID: 40348936. Free PMC article.
- Decoding substance use disorder severity from clinical notes using a large language model. Npj Ment Health Res. 2025 Feb 7;4(1):5. doi: 10.1038/s44184-024-00114-6. PMID: 39915681. Free PMC article.