JAMA Psychiatry. 2024 Oct 1;81(10):1010-1019.
doi: 10.1001/jamapsychiatry.2024.1796.

Exploration-Exploitation and Suicidal Behavior in Borderline Personality Disorder and Depression

Aliona Tsypes et al. JAMA Psychiatry. 2024.

Erratum in

  • Error in Funding.
    [No authors listed]. JAMA Psychiatry. 2025 Apr 1;82(4):427. doi: 10.1001/jamapsychiatry.2025.0003. PMID: 39937505. Free PMC article. No abstract available.

Abstract

Importance: Clinical theory and behavioral studies suggest that people experiencing suicidal crisis are often unable to find constructive solutions or incorporate useful information into their decisions, resulting in premature convergence on suicide and neglect of better alternatives. However, prior studies of suicidal behavior have not formally examined how individuals resolve the tradeoffs between exploiting familiar options and exploring potentially superior alternatives.

Objective: To investigate exploration and exploitation in suicidal behavior from the formal perspective of reinforcement learning.

Design, setting, and participants: Two case-control behavioral studies of exploration-exploitation of a large 1-dimensional continuous space and a 21-day prospective ambulatory study of suicidal ideation were conducted between April 2016 and March 2022. Participants were recruited from inpatient psychiatric units, outpatient clinics, and the community in Pittsburgh, Pennsylvania, and underwent laboratory and ambulatory assessments. Adults diagnosed with borderline personality disorder (BPD) and midlife and late-life major depressive disorder (MDD) were included, with each sample including demographically equated groups with a history of high-lethality suicide attempts, low-lethality suicide attempts, individuals with BPD or MDD but no suicide attempts, and control individuals without psychiatric disorders. The MDD sample also included a subgroup with serious suicidal ideation.

Main outcomes and measures: Behavioral (model-free and model-derived) indices of exploration and exploitation, suicide attempt lethality (Beck Lethality Scale), and prospectively assessed suicidal ideation.

Results: The BPD group included 171 adults (mean [SD] age, 30.55 [9.13] years; 135 [79%] female). The MDD group included 143 adults (mean [SD] age, 62.03 [6.82] years; 81 [57%] female). Across the BPD (χ²₃ = 50.68; P < .001) and MDD (χ²₄ = 36.34; P < .001) samples, individuals with high-lethality suicide attempts discovered fewer options than other groups as they were unable to shift away from unrewarded options. In contrast, those with low-lethality attempts were prone to excessive behavioral shifts after rewarded and unrewarded actions. No differences were seen in strategic early exploration or in exploitation. Among 84 participants with BPD in the ambulatory study, 56 reported suicidal ideation. Underexploration also predicted incident suicidal ideation (χ²₁ = 30.16; P < .001), validating the case-control results prospectively. The findings were robust to confounds, including medication exposure, affective state, and behavioral heterogeneity.

Conclusions and relevance: The findings suggest that narrow exploration and inability to abandon inferior options are associated with serious suicidal behavior and chronic suicidal thoughts. By contrast, individuals in this study who engaged in low-lethality suicidal behavior displayed a low threshold for taking potentially disadvantageous actions.

Conflict of interest statement

Conflict of Interest Disclosures: Dr Wright reported grants from the National Institutes of Health during the conduct of the study. No other disclosures were reported.

Figures

Figure 1. Clock Task and Behavioral Manipulation Checks
A, The clock paradigm consists of decision and feedback phases. During the decision phase, a dot revolves 360° around a central stimulus over the course of 4 seconds. Participants press a button to stop the revolution and receive a probabilistic outcome. During the feedback phase, participants are informed about the number of points they won on this trial, with 0 points representing reward omission. Rewards are drawn from 1 of 2 monotonically time-varying contingencies in which expected values of choices either increase (increasing expected value [IEV]) or decrease (decreasing expected value [DEV]) with prolonged wait. Reward probabilities and magnitude varied independently (eFigure 1 in Supplement 1). B, Evolution of participants’ response times (RT) and RT swings by contingency in the borderline personality disorder (BPD) and major depressive disorder (MDD) samples. Plotted data are smoothed using a generalized additive model (GAM) in the ggplot2 package of R version 3.4.4 (R Foundation). In subplots on the right, the smoothing used natural splines from the splines package in R version 4.3.2, with a basis of 5 knots. The shaded area around the lines represents 95% CIs. Participants learned to respond later in the IEV compared to the DEV condition and RT swings generally decreased later in learning (especially in IEV). To ascertain that time courses were not distorted by smoothing, trial-averaged data are presented in eFigures 2 and 3 in Supplement 1. The difference between DEV and IEV at trial 1 is due to the alternation of IEV and DEV conditions, which change every 40 trials of the task.
Figure 2. Model Description, Selection, and Comparison
A, The Strategic Exploration/Exploitation of Temporal Instrumental Contingencies (SCEPTIC) reinforcement learning model shows basis function representation. Top: participant responds at 1 second and wins 110 points. Bottom left: the 1-dimensional space of the task is tiled with Gaussian-shaped learning elements with staggered receptive fields. Bottom right: the reward at 1 second updates expected values (weights) of nearby basis elements. Color indicates the location of the basis function within the interval. Darker colors indicate earlier responses, and lighter colors indicate later responses. B, Entropy dynamics of the information-compressing reinforcement learning (RL) model. Early in learning, entropy is high because all locations have similar values. The figure shows example value distribution early in learning, first within the circular visual space of the task and then projected linearly onto the abscissa. Later in learning, entropy decreased as the most attractive option dominated. Traditional RL is contrasted with information-compressing RL. Information compression (arrows) reduces the entropy of the value distribution. In contrast to traditional RL with long-term value persistence, information-compressing RL learns and forgets faster. Information compression is an emerging property of the algorithm, resulting from both the decay of unchosen options and value updates of the chosen location. In contrast, entropy change in the traditional RL model depends only on the latter. Entropy was defined as Shannon entropy of the normalized vector of element weights (gray bars). C, Random-effects bayesian model comparison of SCEPTIC model variants. Dots represent the estimated model frequency (ie, the proportion of participants for whom a given model provided the best fit to the data). The shade matches that of the same models in panel B. Uncertainty-sensitive RL is the SCEPTIC model variant where choice was influenced by both uncertainty and reward value to embody the alternative hypothesis that uncertainty modulates exploration. Information-compressing RL was used in the study as it performed better than traditional or uncertainty-sensitive RL. Diagnostic groups exhibited a similar pattern of model fits to the full sample. BOR indicates bayesian omnibus risk; EP, exceedance probability.
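The entropy measure and the contrast between traditional and information-compressing RL described in this caption can be illustrated with a minimal Python sketch. It is not the authors' SCEPTIC implementation: the update rule, the number of basis elements, the basis width, the learning rate `alpha`, the decay rate `gamma`, and the use of bits as the entropy unit are all assumptions made for illustration. Only the example response (1 second, 110 points) and the definition of entropy as Shannon entropy of normalized element weights come from the caption.

```python
import math

def shannon_entropy(weights):
    """Shannon entropy (bits, an assumed unit) of a normalized non-negative weight vector."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    probs = [w / total for w in weights]
    return -sum(p * math.log2(p) for p in probs if p > 0)

def gaussian_eligibility(centers, rt, width):
    """Gaussian receptive field of each basis element, evaluated at the response time."""
    return [math.exp(-((c - rt) ** 2) / (2 * width ** 2)) for c in centers]

def update_weights(weights, centers, rt, reward, alpha=0.1, gamma=0.0, width=0.3):
    """One-trial update (hypothetical rule, for illustration only).

    Elements near the chosen response time move toward the reward.
    With gamma > 0, elements far from the chosen time also decay toward zero,
    which is the assumed source of information compression here.
    gamma = 0 leaves unchosen elements untouched (the 'traditional' variant).
    """
    elig = gaussian_eligibility(centers, rt, width)
    updated = []
    for w, e in zip(weights, elig):
        learned = w + alpha * e * (reward - w)           # value update near the choice
        updated.append(learned - gamma * (1 - e) * learned)  # decay away from the choice
    return updated

# Eight basis elements tiling the 4-second interval (assumed layout).
centers = [0.25 + 0.5 * i for i in range(8)]
uniform = [1.0] * 8
print(shannon_entropy(uniform))  # 3.0 bits: all locations equally valued

# The caption's example trial: response at 1 second, 110 points.
traditional = update_weights(uniform, centers, rt=1.0, reward=110, gamma=0.0)
compressing = update_weights(uniform, centers, rt=1.0, reward=110, gamma=0.5)
h_traditional = shannon_entropy(traditional)
h_compressing = shannon_entropy(compressing)
print(h_traditional, h_compressing)
```

Under these assumed parameters, a single rewarded trial lowers entropy in both variants, and the variant with decay ends at the lower value because the unchosen locations lose weight as well.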
Figure 3. Behavior on the Clock Task
Estimates from multilevel linear regression models predicting trial-level responses. See eTables 4, 18, 19, and 22 in Supplement 1, respectively, for full outputs of these models. A, Response time (RT) swings following reward vs omission by group in the borderline personality disorder (BPD) sample. Smaller numbers indicate larger RT swings and vice versa. Individuals with high-lethality (HL) suicide attempts had lower levels of win-stay/lose-shift behavior (ie, the tendency to repeat rewarded options and shift away from unrewarded options). Whereas individuals with HL suicide attempts were less likely to shift behavior after a previously unrewarded action (smaller lose-shift), individuals with low-lethality (LL) suicide attempts were less likely to stick with the previously rewarded actions (smaller win-stay). B, Entropy dynamics by group during the clock task in the BPD sample. The ordinate depicts Shannon entropy of normalized element weights (illustrated in Figure 2), with higher values reflecting a greater number of competing options. Individuals with HL suicide attempts discovered fewer options than other groups, as evidenced by a lack of value entropy expansion in that group. C, Response time swings following reward vs omission as in panel A, replication in the major depressive disorder (MDD) sample. D, Response time swings following reward vs omission on the clock task and prospective suicidal ideation during ecological momentary assessment in individuals with BPD. Lower levels of win-stay/lose-shift (especially after reward omission) were associated with more frequent suicidal ideation in daily life. AU indicates arbitrary units; HC, healthy control individuals; NON, individuals without lifetime history of suicide attempts; SI, suicidal ideation.
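As a rough illustration of the win-stay/lose-shift idea in this caption, the sketch below computes raw RT swings after rewarded and unrewarded trials. It assumes that an RT swing is the absolute trial-to-trial change in response time and that a trial with 0 points counts as an omission; the function name and the example data are hypothetical. The estimates plotted in the figure come from multilevel regression models, so these raw means are not the same quantity and run in the opposite direction to the figure's scale.

```python
def rt_swings_by_outcome(rts, rewards):
    """Mean absolute RT change following rewarded vs unrewarded trials.

    rts     -- response times in seconds, one per trial
    rewards -- points won on each trial (0 means reward omission)
    """
    if len(rts) != len(rewards):
        raise ValueError("rts and rewards must have the same length")
    after_reward, after_omission = [], []
    for t in range(1, len(rts)):
        swing = abs(rts[t] - rts[t - 1])
        # Classify the swing by the outcome of the preceding trial.
        if rewards[t - 1] > 0:
            after_reward.append(swing)
        else:
            after_omission.append(swing)

    def mean(xs):
        return sum(xs) / len(xs) if xs else float("nan")

    return {"after_reward": mean(after_reward), "after_omission": mean(after_omission)}

# Hypothetical five-trial sequence: small swings after wins, large swings after omissions.
rts = [1.0, 1.2, 3.0, 2.9, 1.0]
rewards = [50, 0, 40, 0, 10]
result = rt_swings_by_outcome(rts, rewards)
print(result)
```

In this made-up sequence the swings after omission are much larger than those after reward, which is the win-stay/lose-shift pattern. Per the caption, a reduced lose-shift would appear as a smaller swing after omission, and a reduced win-stay as a larger swing after reward.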
