Automatic extraction of social determinants of health from medical notes of chronic lower back pain patients
- PMID: 37080559
- PMCID: PMC10354762
- DOI: 10.1093/jamia/ocad054
Abstract
Objective: We applied natural language processing and inference methods to extract social determinants of health (SDoH) information from clinical notes of patients with chronic low back pain (cLBP) to enhance future analyses of the associations between SDoH disparities and cLBP outcomes.
Materials and methods: Clinical notes for patients with cLBP were annotated for 7 SDoH domains, as well as depression, anxiety, and pain scores, resulting in 626 notes with at least one annotated entity for 364 patients. We used a 2-tier taxonomy with these 10 first-level classes (domains) and 52 second-level classes. We developed and validated named entity recognition (NER) systems based on both rule-based and machine learning approaches and validated an entailment model.
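To make the 2-tier labeling concrete, here is a minimal sketch of a rule-based tagger that assigns both a first-level class (domain) and a second-level class to each matched span. It is not the authors' system or the cTAKES pipeline: the domain names, subclass names, and regular expressions below are hypothetical placeholders, since the abstract does not list the actual taxonomy.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    start: int      # character offset where the mention begins
    end: int        # character offset just past the mention
    text: str       # the matched surface form
    domain: str     # first-level class
    subclass: str   # second-level class


# Hypothetical rules: (pattern, first-level class, second-level class).
# None of these labels come from the paper; they only illustrate the shape.
RULES = [
    (re.compile(r"\blives alone\b", re.IGNORECASE), "housing", "lives_alone"),
    (re.compile(r"\bunemployed\b", re.IGNORECASE), "occupation", "unemployed"),
    (re.compile(r"\buninsured\b", re.IGNORECASE), "insurance", "uninsured"),
]


def extract_entities(note: str) -> list[Entity]:
    """Return every rule match in the note, ordered by start offset."""
    entities = []
    for pattern, domain, subclass in RULES:
        for match in pattern.finditer(note):
            entities.append(
                Entity(match.start(), match.end(), match.group(0), domain, subclass)
            )
    return sorted(entities, key=lambda e: e.start)


if __name__ == "__main__":
    for entity in extract_entities("Patient is unemployed and lives alone."):
        print(entity)
```

A machine learning NER model would replace the fixed patterns with learned token-level predictions, but the output can keep the same span-plus-two-labels structure.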
Results: Annotators achieved a high interrater agreement (Cohen's kappa of 95.3% at document level). A rule-based system (cTAKES), RoBERTa NER, and a hybrid model (combining rules and logistic regression) achieved performance of F1 = 47.1%, 84.4%, and 80.3%, respectively, for first-level classes.
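For readers unfamiliar with the two metrics reported here, the following sketch shows how Cohen's kappa and F1 are commonly computed. It assumes one categorical label per document from each of two annotators for kappa, and exact-match comparison of (span, class) pairs for F1. The abstract does not state the matching criterion or averaging scheme the authors used, so these are illustrative assumptions only.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labeling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each annotator's label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a.keys() | freq_b.keys()) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


def precision_recall_f1(gold, predicted):
    """Exact-match precision, recall, and F1 over sets of hashable entities."""
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(cohens_kappa(["yes", "yes", "no", "no"], ["yes", "no", "no", "no"]))
    gold = {(0, "a"), (1, "b"), (2, "c")}
    predicted = {(0, "a"), (1, "b"), (3, "d"), (4, "e")}
    print(precision_recall_f1(gold, predicted))
```

Because F1 is the harmonic mean of precision and recall, a system can match or exceed another on recall while still scoring lower on F1 if its precision is lower.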
Discussion: Although the hybrid model had a lower F1 score, it matched or outperformed the RoBERTa NER model in recall and had lower computational requirements. Applying an untuned RoBERTa entailment model, we detected many challenging wordings missed by the NER systems. Still, the entailment model may be sensitive to hypothesis wording.
Conclusion: This study developed a corpus of annotated clinical notes covering a broad spectrum of SDoH classes. This corpus provides a basis for training machine learning models and serves as a benchmark for predictive models for NER for SDoH and knowledge extraction from clinical texts.
Keywords: depression; lower back pain; machine learning; natural language inference; natural language processing; social determinants of health.
Published by Oxford University Press on behalf of the American Medical Informatics Association 2023.
Conflict of interest statement
DSL is a shareholder of Crosscope Inc and SynthezAI Corp and is currently employed by Johnson & Johnson. BL is supported by the Innovate for Health Data Science Fellowship from Johnson & Johnson. PLA received funding from REAC RAP UCSF through UCSF. EDM received support from Hellman Fellows Fund Payment and from REAC RAP UCSF through UCSF. SP received support from a Back Pain Consortium (BACPAC) grant through UCSF.