Automated evaluation of psychotherapy skills using speech and language technologies

Nikolaos Flemotomos¹, Victor R Martinez², Zhuohao Chen³, Karan Singla², Victor Ardulov², Raghuveer Peri³, Derek D Caperton⁴, James Gibson⁵, Michael J Tanana⁶, Panayiotis Georgiou³, Jake Van Epps⁷, Sarah P Lord⁸, Tad Hirsch⁹, Zac E Imel⁴, David C Atkins⁸, Shrikanth Narayanan³,²,⁵

Affiliations

¹ Department of Electrical and Computer Engineering, University of Southern California, Los Angeles, California, USA. flemotom@usc.edu.
² Department of Computer Science, University of Southern California, Los Angeles, CA, 90089, USA.
³ Department of Electrical and Computer Engineering, University of Southern California, Los Angeles, California, USA.
⁴ Department of Educational Psychology, University of Utah, Salt Lake City, Utah, USA.
⁵ Behavioral Signal Technologies Inc., Los Angeles, CA, USA.
⁶ College of Social Work, University of Utah, Salt Lake City, Utah, USA.
⁷ University Counseling Center, University of Utah, Salt Lake City, Utah, USA.
⁸ Department of Psychiatry and Behavioral Sciences, University of Washington, Seattle, Washington, USA.
⁹ Department of Art + Design, Northeastern University, Boston, Massachusetts, USA.

PMID: 34346043
PMCID: PMC8810915
DOI: 10.3758/s13428-021-01623-4

Nikolaos Flemotomos et al. Behav Res Methods. 2022 Apr;54(2):690-711. doi: 10.3758/s13428-021-01623-4. Epub 2021 Aug 3.

Abstract

With the growing prevalence of psychological interventions, it is vital to have measures which rate the effectiveness of psychological care to assist in training, supervision, and quality assurance of services. Traditionally, quality assessment is addressed by human raters who evaluate recorded sessions along specific dimensions, often codified through constructs relevant to the approach and domain. This is, however, a cost-prohibitive and time-consuming method that leads to poor feasibility and limited use in real-world settings. To facilitate this process, we have developed an automated competency rating tool able to process the raw recorded audio of a session, analyzing who spoke when, what they said, and how the health professional used language to provide therapy. Focusing on a use case of a specific type of psychotherapy called "motivational interviewing", our system gives comprehensive feedback to the therapist, including information about the dynamics of the session (e.g., therapist's vs. client's talking time), low-level psychological language descriptors (e.g., type of questions asked), as well as other high-level behavioral constructs (e.g., the extent to which the therapist understands the clients' perspective). We describe our platform and its performance using a dataset of more than 5000 recordings drawn from its deployment in a real-world clinical setting used to assist training of new therapists. Widespread use of automated psychotherapy rating tools may augment experts' capabilities by providing an avenue for more effective training and skill improvement, eventually leading to more positive clinical outcomes.

Keywords: MISC; Machine learning; Motivational interviewing; Psychotherapy; Quality assessment; Speech processing.

Figures

**Figure 1.**
(a) Overview of the system used to assess the quality of a psychotherapy session and provide feedback to the therapist. Once the audio is recorded, it is automatically transcribed to find who spoke when and what they said. If the transcription meets certain quality criteria, this textual information is used to predict utterance-level and session-level behavior codes which are summarized into an interactive feedback report. Otherwise, an error message is displayed to the user. (b) Rich transcription module. The dyadic interaction is transcribed through a pipeline that extracts the linguistic information encoded in the speech signal and assigns each speaker turn to either the therapist or the client.
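
The control flow in panel (a) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the names `Turn`, `FeedbackReport`, `build_feedback`, and the three callables are hypothetical, and restricting utterance-level coding to therapist turns is an assumption (suggested by the therapist-assigned utterance counts reported for Figure 3).

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Union


@dataclass
class Turn:
    speaker: str  # "therapist" or "client", as assigned by the transcription stage
    text: str


@dataclass
class FeedbackReport:
    utterance_codes: List[str]        # one code per therapist turn (assumption)
    session_scores: Dict[str, float]  # session-level codes


def build_feedback(
    turns: List[Turn],
    passes_quality: Callable[[List[Turn]], bool],
    code_utterance: Callable[[str], str],
    score_session: Callable[[List[Turn]], Dict[str, float]],
) -> Union[FeedbackReport, str]:
    """Apply the quality gate first; code behaviors only if the transcript passes."""
    if not passes_quality(turns):
        # Mirrors the "error message" branch of the caption.
        return "ERROR: transcription did not meet the quality criteria"
    therapist_turns = [t for t in turns if t.speaker == "therapist"]
    codes = [code_utterance(t.text) for t in therapist_turns]
    return FeedbackReport(utterance_codes=codes, session_scores=score_session(turns))


# Toy stand-ins for the real models (placeholders, not MISC codes).
def both_speakers_present(turns: List[Turn]) -> bool:
    return {t.speaker for t in turns} == {"therapist", "client"}


def toy_utterance_coder(text: str) -> str:
    return "question" if text.strip().endswith("?") else "other"


def toy_session_scorer(turns: List[Turn]) -> Dict[str, float]:
    return {"n_turns": float(len(turns))}
```

Passing the models in as callables keeps the quality gate separate from the predictors, so the error branch can be exercised without any trained model.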

**Figure 2.**
Count of each target MISC label per session (Table 5) when coded by humans (reference) and when processed by the pipeline. All the sessions in the two test sets of the University Counseling Center (UCC) dataset (UCC_test₁ and UCC_test₂) are shown and the correlation values are calculated based on all of them. The sessions flagged as problematic by the quality safeguards are denoted by square markers. RE is a composite label containing both simple and complex reflections (RES and REC).
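
As a rough illustration of the comparison in this figure, the sketch below counts labels per session, forms the composite RE count as RES plus REC, and correlates human and pipeline counts across sessions. The function names are hypothetical, and the use of the Pearson coefficient is an assumption: the caption does not say which correlation measure is reported.

```python
from collections import Counter
from math import sqrt
from typing import List, Sequence


def count_labels(codes: Sequence[str]) -> Counter:
    """Per-session label counts, with RE added as the sum of RES and REC."""
    counts = Counter(codes)
    # Assumes the input uses only RES/REC for reflections, never a raw "RE" label.
    counts["RE"] = counts["RES"] + counts["REC"]
    return counts


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; raises ValueError if either series is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


def label_correlation(
    reference_sessions: List[Sequence[str]],
    predicted_sessions: List[Sequence[str]],
    label: str,
) -> float:
    """Correlate per-session counts of one label: human codes vs. pipeline output."""
    ref = [count_labels(s)[label] for s in reference_sessions]
    pred = [count_labels(s)[label] for s in predicted_sessions]
    return pearson(ref, pred)
```

Each session contributes one point (reference count, predicted count) per label, which matches the per-session scatter described in the caption.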

**Figure 3.**
Frequency of the utterance-level MISC codes (Table 5) for all the University Counseling Center (UCC) recordings processed and for the subset included in the UCC test sets. Only the sessions successfully processed (that met our quality criteria) are taken into consideration here. The total number of therapist-assigned utterances is about 1.2M for all the sessions (4,269 sessions) and 28K for only the sessions included in the UCC test sets (UCC_test₁ and UCC_test₂; 96 sessions).

**Figure 4.**
Distribution of the session-level MISC codes (Table 1) for all the University Counseling Center (UCC) recordings processed and for the subset included in the UCC test sets. Only the sessions successfully processed (that met our quality criteria) are taken into consideration here.

Grants and funding

R01 AA018673/AA/NIAAA NIH HHS/United States
