Res Psychother. 2021 Dec 20;24(3):548.
doi: 10.4081/ripppo.2021.548.

Monitoring the effects of therapeutic interventions in depression through self-assessments

Ines Moragrega et al. Res Psychother.

Abstract

The treatment of major psychiatric disorders is an arduous and thorny path for the patients concerned, characterized by polypharmacy, massive adverse side effects, modest prospects of success, and constantly declining response rates. All the more important is the early detection of psychiatric disorders prior to the development of clinically relevant symptoms, so that people can benefit from early interventions. A well-proven approach to monitoring mental health relies on voice analysis. This method has been successfully used with psychiatric patients to 'objectively' document the progress of improvement or the onset of relapse. Studies with psychiatric patients over 2-4 weeks demonstrated that daily voice assessments have a notable therapeutic effect in themselves. Therefore, daily voice assessments appear to be a low-threshold therapeutic means that may be realized through self-assessments. To evaluate the performance and reliability of this approach, we carried out a longitudinal study of 82 university students in 3 different countries with daily assessments over 2 weeks. The sample included 41 males (mean age 24.2±3.83 years) and 41 females (mean age 21.6±2.05 years). Unlike other research in the field, this study was not concerned with the classification of individuals in terms of diagnostic categories. The focus lay on the monitoring aspect and the extent to which the effects of therapeutic interventions or of behavioural changes are visible in the results of self-assessment voice analyses. The test persons showed disproportionately high adherence to the daily voice analysis scheme. The accumulated data were of generally high quality: sufficiently high signal levels, a very limited number of movement artifacts, and little to no interfering background noise.
The method was sufficiently sensitive to detect: i) habituation effects when test persons became used to the daily procedure; and ii) short-term fluctuations that exceeded prespecified thresholds and reached significance. Results are directly interpretable and provide information about what is going well, what is going less well, and where there is a need for action. The proposed self-assessment approach was found to be well-suited to serve as a health-monitoring tool for subjects with an elevated vulnerability to psychiatric disorders or to stress-induced mental health problems. Daily voice assessments are in fact a low-threshold therapeutic means that can be realized through self-assessments, requires only little effort, can be carried out in the test person's own home, and has the potential to strengthen resilience and to induce positive behavioural changes.

Keywords: Psychiatric disorders; biofeedback; depression; early detection; psychosomatic disturbances; self-monitoring; stress-related health problems; university students; voice analysis.

Figures

Figure 1.
Scatter plots of the raw scores ‘activity’ (x-axis) versus ‘defeatism’ (y-axis) as derived from the COPE data of 82 university students from the French-speaking part of Switzerland (red triangles), Italy (green triangles), and Spain (blue triangles). A sufficiently large between-subject variation is a necessary prerequisite for studying the interrelations with health factors at sufficiently high resolution.
Figure 2.
Variation of Pause Duration: a relaxed speaker presenting a text produces a large variety of pauses of different lengths and separates some text sections by longer pauses. This is in contrast to speakers with mental health problems, in particular patients with depressive symptoms, who tend to present a text in a more monotone, automatized way that lacks variation (‘M’ means mean pause duration, ‘S’ standard deviation, and ‘n’ the number of pauses in the spoken text).
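The summary values ‘M’, ‘S’, and ‘n’ named in this caption can be sketched in a few lines of Python. This is only an illustration: the function name `summarize`, the example pause durations, and the use of the sample standard deviation (n-1 in the denominator) are assumptions, since the caption does not say how ‘S’ was computed.

```python
from statistics import mean, stdev


def summarize(values):
    """Return (M, S, n): mean, sample standard deviation, and count."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values to estimate a standard deviation")
    return mean(values), stdev(values), n


# Hypothetical pause durations in seconds (not data from the study).
varied = [0.2, 0.4, 0.3, 1.1, 0.5]      # many different pause lengths
monotone = [0.4, 0.5, 0.5, 0.6, 0.5]    # little variation between pauses

M_varied, S_varied, n_varied = summarize(varied)
M_monotone, S_monotone, n_monotone = summarize(monotone)

# Both speakers have the same mean pause duration, but 'S' separates them.
print(f"varied:   M={M_varied:.2f} S={S_varied:.2f} n={n_varied}")
print(f"monotone: M={M_monotone:.2f} S={S_monotone:.2f} n={n_monotone}")
```

In this made-up example the two speakers share the same mean, so only the spread ‘S’ reflects the difference in variation that the caption describes.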
Figure 3.
Variation of Utterance Duration: when presenting a text, relaxed speakers typically vary the speed at which they produce utterances to make their speech more attractive and interesting. Speakers with mental health problems and speakers under chronic stress typically do not do this (‘M’ means mean utterance duration, ‘S’ standard deviation, and ‘n’ the number of utterances in the spoken text).
Figure 4.
Variation of Loudness (Energy): when presenting a text, relaxed speakers typically vary loudness (dynamic expressiveness) to make their speech more attractive and interesting. The width of the distribution of loudness (energy) reflects the speaker’s dynamic expressiveness (‘M’ means mean energy per second, ‘S’ standard deviation, and ‘n’ the number of seconds used for the spoken text).
Figure 5.
Variation of Vocal Pitch (Intonation): intonation is the manner of producing utterances with respect to rise and fall in pitch. It leads to tonal shifts in either direction of the speaker’s mean vocal pitch. The ‘broader’ the variation around the speaker’s mean vocal pitch, the ‘richer’ the intonation. The example shows a typical speaker with fairly good intonation but some room for improvement (‘M’ means mean vocal pitch in quarter tones for the 4 octaves [55-110 Hz], [110-220 Hz], [220-440 Hz], [440-880 Hz], i.e. 48 quarter tones per octave, ‘S’ standard deviation, and ‘n’ the number of segments underlying pitch estimation).
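A pitch scale of this kind can be sketched as a logarithmic mapping from frequency to equal steps per octave. The sketch below is an assumption about how such a scale might be built, not the authors' implementation: the function name `pitch_to_steps` is hypothetical, the anchor of 55 Hz is the lower edge of the first octave in the caption, and the resolution of 48 steps per octave is taken from the caption and left adjustable.

```python
import math

BASE_HZ = 55.0          # lower edge of the first octave named in the caption
STEPS_PER_OCTAVE = 48   # resolution as stated in the caption (adjustable)
N_OCTAVES = 4           # 55-110, 110-220, 220-440, 440-880 Hz


def pitch_to_steps(freq_hz, base_hz=BASE_HZ, steps_per_octave=STEPS_PER_OCTAVE):
    """Map a frequency in Hz to a position on a log scale anchored at base_hz."""
    top_hz = base_hz * 2 ** N_OCTAVES
    if not base_hz <= freq_hz <= top_hz:
        raise ValueError(f"frequency must lie within {base_hz}-{top_hz} Hz")
    # Each doubling of frequency (one octave) adds steps_per_octave steps.
    return steps_per_octave * math.log2(freq_hz / base_hz)


# Octave boundaries land on multiples of the per-octave step count.
for f in (55.0, 110.0, 220.0, 440.0, 880.0):
    print(f, pitch_to_steps(f))
```

On such a scale, equal musical intervals map to equal distances regardless of the speaker's register, which is why a mean and standard deviation in scale steps can be compared across speakers with different base pitch.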
Figure 6.
Variation of F0-Amplitude: this variation is an indicator of the ‘richness’ of a speaker’s voice sound. A narrow distribution typically means deficiency of emotions and empathetic feelings. By contrast, a broad distribution suggests a lively and mindful person (‘M’ means mean F0-Amplitude, ‘S’ standard deviation, and ‘n’ the number of segments underlying F0 estimation).
Figure 7.
Variation of 55-440 Hz Power: this variation is another measure of the tonal ‘richness’ of a speaker’s voice. Reduced variation indicates a more monotonous production of utterances caused, for example, by sorrowfulness, weariness, or fatigue (‘M’ means mean energy in the frequency range 55-440 Hz, ‘S’ standard deviation, and ‘n’ the number of segments underlying 55-440 Hz Power estimation).
Figure 8.
Pause Duration (red bars)/Loudness (green bars): Despite their great stability over time, the speech parameters ‘Pause Duration’ and ‘Loudness’ often show a systematic trend toward shorter pauses and greater loudness when speakers get used to the test (‘M’ means mean value of pause duration and loudness, respectively, ‘S’ standard deviation, and ‘n’ the number of repeated assessments [days]).
Figure 9.
Pause Duration/Loudness: Pause duration (red bars) and loudness (green bars) over an observation period of 14 days: speaking behaviour is virtually unchanged over time except for day 12, which shows longer pauses and lower loudness (the subject may have been tired). No self-assessments were made on days 4 and 13 (‘M’ means mean value of pause duration and loudness, respectively, ‘S’ standard deviation, and ‘n’ the number of repeated assessments [days]).
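A day that stands out from an otherwise stable series, as day 12 does here, could be flagged with a simple rule such as the one sketched below. The rule is an assumption for illustration only: the source speaks of prespecified thresholds without stating them, so the function name `flag_deviating_days`, the cut-off of 2 standard deviations, and the example values are all hypothetical. Missed assessments are represented as `None`.

```python
from statistics import mean, stdev


def flag_deviating_days(daily_values, threshold=2.0):
    """Return the 1-based days whose value lies more than `threshold`
    sample standard deviations from the subject's own mean.
    Days without an assessment (None) are skipped."""
    observed = [(day, v) for day, v in enumerate(daily_values, start=1)
                if v is not None]
    values = [v for _, v in observed]
    if len(values) < 3:
        raise ValueError("need at least three assessments")
    m, s = mean(values), stdev(values)
    if s == 0:
        return []  # perfectly constant series: nothing deviates
    return [day for day, v in observed if abs(v - m) / s > threshold]


# Hypothetical mean pause durations (seconds) over 14 days;
# no assessment on days 4 and 13, longer pauses on day 12.
pause_duration = [0.50, 0.52, 0.49, None, 0.51, 0.50, 0.48,
                  0.52, 0.50, 0.51, 0.49, 0.80, None, 0.50]

flagged = flag_deviating_days(pause_duration)
print(flagged)
```

Because the mean and standard deviation are taken from the subject's own series, the rule compares each day with that person's usual speaking behaviour rather than with a population norm.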
Figure 10.
Energy (red bars) and dynamics (green bars) over an observation period of 14 days: Speaking behaviour shows a systematic trend towards higher energy values and greater dynamic variation as a function of time. This finding is likely due to habituation effects (‘M’ means mean value of energy and dynamics, respectively, ‘S’ standard deviation, and ‘n’ the number of repeated assessments [days]).
Figure 11.
Pause Duration/Loudness: Pause duration (red bars) and loudness (green bars) over an observation period of 14 days: Speaking behaviour shows a systematic trend towards shorter pauses and greater loudness as a function of time. This finding may indicate habituation effects (‘M’ means mean value of pause duration and loudness, respectively, ‘S’ standard deviation, and ‘n’ the number of repeated assessments [days]).
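One simple way to quantify a systematic trend of this kind is the least-squares slope of the daily values against the day number. This is a sketch under stated assumptions, not the authors' procedure: the function name `daily_trend` and the example series are hypothetical, and days without an assessment are represented as `None`.

```python
def daily_trend(daily_values):
    """Least-squares slope (change per day) of values against 1-based day
    numbers. Days without an assessment (None) are skipped."""
    points = [(day, v) for day, v in enumerate(daily_values, start=1)
              if v is not None]
    if len(points) < 2:
        raise ValueError("need at least two assessments")
    x_mean = sum(day for day, _ in points) / len(points)
    y_mean = sum(v for _, v in points) / len(points)
    sxx = sum((day - x_mean) ** 2 for day, _ in points)
    sxy = sum((day - x_mean) * (v - y_mean) for day, v in points)
    return sxy / sxx


# Hypothetical series: pauses shorten and loudness rises as the
# speaker gets used to the daily procedure.
pause_duration = [0.60, 0.58, 0.57, None, 0.55, 0.53, 0.52]
loudness = [50.0, 51.0, 51.5, None, 53.0, 54.0, 54.5]

pause_slope = daily_trend(pause_duration)
loudness_slope = daily_trend(loudness)
print(f"pause duration: {pause_slope:+.4f} per day")
print(f"loudness:       {loudness_slope:+.4f} per day")
```

A negative slope for pause duration together with a positive slope for loudness would match the pattern the caption attributes to habituation; whether a slope is large enough to matter would still need a significance criterion, which the source does not specify.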
Figure 12.
Vocal Pitch (red bars)/Intonation (green bars): the test person exhibits short-term reductions in intonation during the observation period (reaching significance on day 6), thus suggesting physical or psychological reactions to some life events at that time. Additionally, the test person’s mean vocal pitch showed a tonal shift towards lower frequencies, which was probably due to fatigue (‘M’ means mean vocal pitch and intonation, respectively, in quarter tones for the 4 octaves [55-110 Hz], [110-220 Hz], [220-440 Hz], [440-880 Hz], i.e. 48 quarter tones per octave, ‘S’ standard deviation, and ‘n’ the number of repeated assessments [days]).