Med Decis Making. 2014 May;34(4):503-12.
doi: 10.1177/0272989X13514777. Epub 2013 Nov 27.

Automatically annotating topics in transcripts of patient-provider interactions via machine learning

Byron C Wallace et al. Med Decis Making. 2014 May.

Abstract

Background: Annotated patient-provider encounters can provide important insights into clinical communication, ultimately suggesting how it might be improved to effect better health outcomes. But annotating outpatient transcripts with Roter or General Medical Interaction Analysis System (GMIAS) codes is expensive, limiting the scope of such analyses. We propose automatically annotating transcripts of patient-provider interactions with topic codes via machine learning.

Methods: We use a conditional random field (CRF) to model utterance topic probabilities. The model accounts for the sequential structure of conversations and the words comprising utterances. We assess predictive performance via 10-fold cross-validation over GMIAS-annotated transcripts of 360 outpatient visits (>230,000 utterances). We then use automated in place of manual annotations to reproduce an analysis of 116 additional visits from a randomized trial that used GMIAS to assess the efficacy of an intervention aimed at improving communication around antiretroviral (ARV) adherence.
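To illustrate why modeling the sequential structure of a conversation matters, here is a minimal sketch of Viterbi decoding for a linear-chain model. It is not the authors' implementation: the function name `viterbi`, the two labels, and all scores are hypothetical, and the scores stand in for the log-potentials that a trained CRF would derive from the words in each utterance. Viterbi returns the single best label sequence; utterance-level topic probabilities would instead require forward-backward inference.

```python
from typing import Dict, List


def viterbi(emissions: List[Dict[str, float]],
            transitions: Dict[str, Dict[str, float]],
            labels: List[str]) -> List[str]:
    """Return the highest-scoring topic sequence for one conversation.

    emissions[t][y]   : score of topic y for utterance t (hypothetical values)
    transitions[p][y] : score of moving from topic p to topic y
    """
    if not emissions:
        return []
    # best[t][y] holds the best score of any path ending in topic y at utterance t.
    best = [{y: emissions[0][y] for y in labels}]
    back = []  # back[t-1][y] is the best previous topic for topic y at utterance t
    for t in range(1, len(emissions)):
        scores, pointers = {}, {}
        for y in labels:
            prev = max(labels, key=lambda p: best[-1][p] + transitions[p][y])
            scores[y] = best[-1][prev] + transitions[prev][y] + emissions[t][y]
            pointers[y] = prev
        best.append(scores)
        back.append(pointers)
    # Trace back from the best final topic.
    path = [max(labels, key=lambda y: best[-1][y])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Toy example (made-up scores): staying on a topic is free, switching costs 2.
labels = ["ARV", "Other"]
transitions = {"ARV": {"ARV": 0.0, "Other": -2.0},
               "Other": {"ARV": -2.0, "Other": 0.0}}
emissions = [{"ARV": 1.0, "Other": 0.0},
             {"ARV": 0.0, "Other": 0.5},
             {"ARV": 1.0, "Other": 0.0}]
decoded = viterbi(emissions, transitions, labels)
```

In this toy case the middle utterance looks slightly more like "Other" when scored in isolation, but the transition scores make it cheaper to keep the whole run labeled "ARV", which is the kind of context effect a sequence model can capture and a per-utterance classifier cannot.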

Results: With respect to 6 topic codes, the CRF achieved a mean pairwise kappa compared with human annotators of 0.49 (range: 0.47-0.53) and a mean overall accuracy of 0.64 (range: 0.62-0.66). With respect to the RCT reanalysis, results using automated annotations agreed with those obtained using manual ones. According to the manual annotations, the median number of ARV-related utterances without and with the intervention was 49.5 versus 76, respectively (paired sign test P = 0.07). When automated annotations were used, the respective numbers were 39 versus 55 (P = 0.04). While moderately accurate, the predicted annotations are far from perfect. Conversational topics are intermediate outcomes, and their utility is still being researched.
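The agreement and significance statistics reported above can be sketched as follows. This assumes that kappa means Cohen's kappa between two annotators and that the sign test is the exact two-sided version with ties dropped; the abstract does not spell out either detail. The function names and the toy inputs are hypothetical and do not reproduce the study data.

```python
from collections import Counter
from math import comb


def accuracy(a, b):
    """Fraction of utterances on which two annotation sequences agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohen_kappa(a, b):
    """Chance-corrected agreement between two annotators (Cohen's kappa)."""
    n = len(a)
    observed = accuracy(a, b)
    count_a, count_b = Counter(a), Counter(b)
    # Agreement expected if both annotators labeled independently
    # according to their own label frequencies.
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    return (observed - expected) / (1 - expected)


def sign_test_p(pairs):
    """Exact two-sided paired sign test; tied pairs are dropped."""
    pos = sum(x > y for x, y in pairs)
    neg = sum(x < y for x, y in pairs)
    n = pos + neg
    if n == 0:
        return 1.0
    k = min(pos, neg)
    # Probability of a split at least this lopsided under a fair coin.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Toy inputs (made up for illustration only).
human = ["A", "A", "B", "B"]
model = ["A", "B", "B", "B"]
acc = accuracy(human, model)
kappa = cohen_kappa(human, model)
p_value = sign_test_p([(5, 3), (8, 2), (7, 6), (9, 4), (6, 1)])
```

Kappa discounts the agreement that two annotators would reach by chance, which is why it can sit well below raw accuracy when a few topics dominate the conversation.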

Conclusions: This foray into automated topic inference suggests that machine learning methods can classify utterances comprising patient-provider interactions into clinically relevant topics with reasonable accuracy.

Keywords: CRF; communication; informatics; machine learning; natural language processing; patient-provider interaction; speech acts.

Figures

Figure 1. Average relative topic frequencies over deciles in patient-provider conversations. The solid line is the empirical (true) proportion; the dotted line corresponds to average model predictions (over test sets).

Appendix Figure 1. Graphical model representation of the conditional random field (CRF).

Appendix Figure 2. GMIAS topic codes (original schema).
