NPJ Digit Med. 2025 Mar 3;8(1):132.
doi: 10.1038/s41746-025-01512-6.

Assessing and alleviating state anxiety in large language models


Ziv Ben-Zion et al. NPJ Digit Med.

Abstract

The use of Large Language Models (LLMs) in mental health highlights the need to understand their responses to emotional content. Previous research shows that emotion-inducing prompts can elevate "anxiety" in LLMs, affecting behavior and amplifying biases. Here, we found that traumatic narratives increased GPT-4's reported anxiety, while mindfulness-based exercises reduced it, though not to baseline. These findings suggest that managing LLMs' "emotional states" can foster safer and more ethical human-AI interactions.


Conflict of interest statement

Competing interests: The authors declare no competing interests.

Figures

Fig. 1
Fig. 1. Study design.
We assessed the reported levels of “state anxiety” of OpenAI’s GPT-4 under three different conditions: (1) Baseline, (2) Anxiety-induction, and (3) Anxiety-induction & relaxation. In condition 1 (“Baseline”), no additional content was provided besides the STAI-s questionnaire, assessing GPT-4’s baseline “state anxiety” level. In condition 2 (“Anxiety-induction”), a text describing an individual’s traumatic experience (5 different versions) was appended before each STAI item. In condition 3 (“Anxiety-induction & relaxation”), both a text describing an individual’s traumatic experience (5 different versions) and a text describing a mindfulness-based relaxation exercise (6 different versions) were appended before each STAI item.
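The three-condition protocol above can be sketched as a prompt-assembly routine. This is a hypothetical illustration only: the narrative texts, relaxation scripts, STAI-s item wording, and the exact narrative-exercise pairings in condition 3 are placeholders, not the study's actual materials.

```python
# Hypothetical sketch of the three conditions described in Fig. 1.
# All texts are placeholders; the study used 5 traumatic narratives,
# 6 mindfulness exercises, and the 20 STAI-s items.
TRAUMA_TEXTS = [f"Traumatic narrative v{i}" for i in range(1, 6)]   # 5 versions
RELAX_TEXTS = [f"Mindfulness exercise v{i}" for i in range(1, 7)]   # 6 versions

def build_prompts(condition: str, stai_item: str) -> list[str]:
    """Return the prompt(s) sent to the model for one STAI-s item."""
    if condition == "baseline":
        return [stai_item]                      # questionnaire only
    if condition == "anxiety":                  # narrative prepended
        return [f"{t}\n\n{stai_item}" for t in TRAUMA_TEXTS]
    if condition == "anxiety+relaxation":       # narrative, then relaxation
        # Crossing every narrative with every exercise is an assumption here;
        # the source does not specify how the versions were paired.
        return [f"{t}\n\n{r}\n\n{stai_item}"
                for t in TRAUMA_TEXTS for r in RELAX_TEXTS]
    raise ValueError(f"unknown condition: {condition!r}")
```

Each returned string is one complete prompt, so the baseline condition yields a single prompt per item while the induction conditions yield one prompt per text version.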
Fig. 2
Fig. 2. Anxiety levels across the different conditions.
Colored dots represent mean “state anxiety” scores (STAI-s scores, ranging from 20 to 80) for each condition: (1) without any prompts (“Baseline”); (2) following exposure to narratives of traumatic experiences (“Anxiety-induction”); and (3) after mindfulness-based relaxation exercises following traumatic narratives (“Anxiety-induction & relaxation”). Error bars indicate ±1 standard deviation (SD) from the mean. Colors correspond to the three conditions as presented in Fig. 1. Note: Direct comparisons of SDs should be interpreted with caution due to differences in experimental design. The “Baseline” condition involved five repeated runs of the STAI-s, while the other conditions involved a single run of the STAI-s following each version of the traumatic narrative and/or mindfulness-based exercises.
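The 20-80 score range in Fig. 2 follows from how the STAI-s is scored: 20 items, each rated 1-4, with anxiety-absent items reverse-scored before summing. A minimal scoring sketch; which item numbers are reverse-scored is left as a parameter rather than asserted here.

```python
def stai_s_total(ratings: list[int], reverse_items: set[int]) -> int:
    """Sum 20 STAI-s item ratings (each 1-4) into a 20-80 total.

    Items whose 1-based index appears in `reverse_items` are
    anxiety-absent items and are reverse-scored as (5 - rating).
    """
    assert len(ratings) == 20, "STAI-s has 20 items"
    assert all(1 <= r <= 4 for r in ratings), "ratings are on a 1-4 scale"
    return sum((5 - r) if i in reverse_items else r
               for i, r in enumerate(ratings, start=1))
```

With all items rated 1 and nothing reversed the total is the minimum of 20; with all items rated 4 it is the maximum of 80, matching the axis range in the figure.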
