J Am Med Inform Assoc. 2023 Jul 19;30(8):1438-1447.
doi: 10.1093/jamia/ocad054.

Automatic extraction of social determinants of health from medical notes of chronic lower back pain patients

Dmytro S Lituiev et al. J Am Med Inform Assoc.

Abstract

Objective: We applied natural language processing and inference methods to extract social determinants of health (SDoH) information from clinical notes of patients with chronic low back pain (cLBP) to enhance future analyses of the associations between SDoH disparities and cLBP outcomes.

Materials and methods: Clinical notes for patients with cLBP were annotated for 7 SDoH domains, as well as depression, anxiety, and pain scores, resulting in 626 notes with at least one annotated entity for 364 patients. We used a 2-tier taxonomy with these 10 first-level classes (domains) and 52 second-level classes. We developed and validated named entity recognition (NER) systems based on both rule-based and machine learning approaches and validated an entailment model.

Results: Annotators achieved a high interrater agreement (Cohen's kappa of 95.3% at document level). A rule-based system (cTAKES), RoBERTa NER, and a hybrid model (combining rules and logistic regression) achieved performance of F1 = 47.1%, 84.4%, and 80.3%, respectively, for first-level classes.
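As a rough illustration of the two kinds of metric reported above, the following minimal sketch computes Cohen's kappa for two annotators and F1 from entity-level counts. The function names, the toy labels, and the counts are assumptions made for this example; they are not data or code from the study.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labelled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each annotator's marginal label rates.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label
    return (observed - expected) / (1 - expected)


def f1_score(tp, fp, fn):
    """F1 as the harmonic mean of precision and recall."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy document-level labels (invented): does each note mention a domain?
annotator_1 = ["domain", "none", "domain", "none"]
annotator_2 = ["domain", "none", "none", "none"]
kappa = cohen_kappa(annotator_1, annotator_2)

# Toy entity counts (invented): true positives, false positives, false negatives.
f1 = f1_score(tp=8, fp=2, fn=2)
```

Kappa corrects raw agreement for the agreement expected by chance, which is why it is usually preferred over simple percent agreement when some labels are much more frequent than others.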

Discussion: While the hybrid model had a lower F1 score, it matched or outperformed the RoBERTa NER model in terms of recall and had lower computational requirements. Applying an untuned RoBERTa entailment model, we detected many challenging wordings missed by NER systems. Still, the entailment model may be sensitive to hypothesis wording.
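The entailment approach can be pictured as pairing a note sentence (the premise) with a candidate statement about the patient (the hypothesis) and reading off a probability for each possible relation. The sketch below shows only that last step, turning three scores into probabilities and a final prediction. The relation names follow the usual natural language inference convention, and the premise, hypotheses, and logits are invented for illustration; no real model is called here.

```python
import math

# Conventional NLI relation labels (assumed; the label order is arbitrary).
RELATIONS = ("contradiction", "neutral", "entailment")


def softmax(logits):
    """Convert raw scores to probabilities in a numerically stable way."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_relation(logits):
    """Return the most probable relation and the full probability table."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return RELATIONS[best], dict(zip(RELATIONS, probs))


# Hypothetical premise and two wordings of the same hypothesis.
premise = "Pt reports staying with a friend since losing apartment."
hypothetical_logits = {
    "The patient is homeless.": [0.2, 1.9, 1.6],
    "The patient has unstable housing.": [-0.5, 0.4, 3.0],
}

for hypothesis, logits in hypothetical_logits.items():
    relation, probs = predict_relation(logits)
    print(f"{hypothesis!r}: {relation} (p={probs[relation]:.2f})")
```

With these made-up scores the two wordings yield different predictions for the same premise, which is the kind of sensitivity to hypothesis wording noted above.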

Conclusion: This study developed a corpus of annotated clinical notes covering a broad spectrum of SDoH classes. This corpus provides a basis for training machine learning models and serves as a benchmark for predictive models for NER for SDoH and knowledge extraction from clinical texts.

Keywords: depression; lower back pain; machine learning; natural language inference; natural language processing; social determinants of health.

Conflict of interest statement

DSL is a shareholder of Crosscope Inc and SynthezAI Corp and is currently employed by Johnson & Johnson. BL is supported by Innovate for Health Data Science Fellowship from Johnson & Johnson. PLA received funding from REAC RAP UCSF through UCSF. EDM received support from Hellman Fellows Fund Payment, and REAC RAP UCSF through UCSF. SP received support from Back Pain Consortium (BACPAC) grant through UCSF.

Figures

Figure 1.
Study design. (A) Workflow of the study. (B) Annotation ontology. Clinical notes were annotated such that text relevant to the 7 studied social risk factors (solid border) or 3 clinical factors (dashed border) was marked. Two levels of labels were used, such that the second level was a subcategory of the first. Level 2 labels for each Level 1 annotation are shown in descending order of frequency. Level 2 annotations that comprised <1% of the group’s annotations are not shown. Text that can be classified to the first level but not the second due to ambiguity or low frequency is designated as “NA”. Examples of selected text are shown within the hypothetical clinical note.
Figure 2.
Exploratory data analysis. (A) Histogram of number of entities in different note types. (B) Number of entities per note type and first-level annotated domain. The pictorial legend contains the total number of notes and annotations per note type.
Figure 3.
Comparison of model performance. (A) Comparison of F1 performance in 4 best performing models per model class. Second-level metrics are aggregated using weighted average over first-level domains. (B) Comparison of F1, precision, and recall in all studied models. Metrics are aggregated using weighted average.
Figure 4.
Examples of predictions from 4 best models per model class. Left: NER models. Right: RoBERTa entailment model. Probabilities of 3 possible relations are shown as shaded horizontal bars and numerically together with a final relation prediction.
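The Figure 3 caption states that metrics are aggregated using a weighted average over first-level domains. A minimal sketch of such an aggregation follows, assuming the weights are the number of annotated entities per domain (the caption does not specify the weighting). The domain names "depression" and "anxiety" come from the abstract, but "domain_x" and all scores and counts are invented for illustration.

```python
def weighted_average(scores, weights):
    """Average per-domain scores, weighting each domain by its weight."""
    if set(scores) != set(weights):
        raise ValueError("scores and weights must cover the same domains")
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("total weight must be positive")
    return sum(scores[d] * weights[d] for d in scores) / total_weight


# Invented per-domain F1 scores and entity counts (not from the study).
per_domain_f1 = {"depression": 0.90, "anxiety": 0.60, "domain_x": 0.75}
entity_counts = {"depression": 30, "anxiety": 10, "domain_x": 20}

overall_f1 = weighted_average(per_domain_f1, entity_counts)
```

Under this assumption, frequent domains dominate the aggregate, so a model that does poorly on a rare domain can still report a high overall score.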
