Collabra Psychol. 2024 Feb 28;10(1):92949. doi: 10.1525/collabra.92949.

Characterizing Human Habits in the Lab

Stephan Nebe et al. Collabra Psychol. 2024.

Abstract

Habits pose a fundamental puzzle for those aiming to understand human behavior. They pervade our everyday lives and dominate some forms of psychopathology but are extremely hard to elicit in the lab. In this Registered Report, we developed novel experimental paradigms grounded in computational models, which suggest that habit strength should be proportional to the frequency of behavior and, in contrast to previous research, independent of value. Specifically, we manipulated how often participants performed responses in two tasks varying action repetition without, or separately from, variations in value. Moreover, we asked how this frequency-based habitization related to value-based operationalizations of habit and self-reported propensities for habitual behavior in real life. We find that choice frequency during training increases habit strength at test and that this form of habit shows little relation to value-based operationalizations of habit. Our findings empirically ground a novel perspective on the constituents of habits and suggest that habits may arise in the absence of external reinforcement. We further find no evidence for an overlap between different experimental approaches to measuring habits and no associations with self-reported real-life habits. Thus, our findings call for a rigorous reassessment of our understanding and measurement of human habitual behavior in the lab.

Keywords: computational modeling; goal-directed control; habit; training; value-based decision making.


Conflict of interest statement

Competing Interests: The authors declare no competing interests.

Figures

Figure 1. Reward Pairs (A, B) and Unrewarded Habit (C, D) tasks.
(A) Trial sequence of the training phase of the Reward Pairs task. Each trial started with the presentation of two stimuli. The selected stimulus was indicated by a purple frame. Then both stimuli and their respective rewards were shown. A fixation cross was presented in the center of the screen during the exponentially jittered inter-trial interval, which also included the remainder of the response time window. The trial sequence and timing of the test phase (not shown) were similar, but the rewards were not displayed. (B) Example stimulus-to-reward assignment of the Reward Pairs task, showing eight stimuli (geometric shapes), five reward levels (1, 3, 5, 7, and 9 points; yellow points), and the respective number of training trials per training session. For example, the pentagon and circle were both worth five points, while the triangle was worth seven points. The circle was presented with the square in ten trials and with the triangle in 30 trials during training. Thus, for reward-maximizing decision makers, stimuli of the same reward level (e.g., 5 points) were chosen with different frequencies (e.g., 30 times for the pentagon and ten times for the circle). (C) Trial sequence of the training phase of the Unrewarded Habit task. Each trial of the training phase started with the presentation of two stimuli. A blue frame appeared around one of the stimuli, instructing participants to select that stimulus. A brown frame indicated both the chosen and the unchosen stimulus. During the exponentially jittered inter-trial interval, which also included the remainder of the response time window, a fixation cross was presented in the center of the screen. (D) Test phase of the Unrewarded Habit task. Trials followed a sequence similar to the training trials but lacked the blue frame instructing participants which stimulus to choose. Thus, when the two stimuli appeared on the screen, participants selected one of them freely. RT – response time.
Figure 2. Choices during the test phase of the Reward Pairs task.
(A) Proportion of choices during the test phase as a function of reward value and choice frequency during training, across all test trials. (B) The association between choice proportions during training and test becomes more apparent when focusing only on trials in which participants chose between two stimuli of the same reward level. Error bars depict standard errors. N=213.
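For readers who want to reproduce this kind of summary from their own data, the sketch below shows one way to compute the plotted choice proportions and standard errors. It is a minimal illustration, not the authors' analysis code; the data frame layout and its column names (participant, stimulus_value, train_frequency, chosen) are hypothetical.

```python
# Minimal sketch (not the authors' code): summarize test-phase choices by a
# stimulus's reward value and its choice frequency during training.
# Assumed long format: one row per stimulus per test trial, with columns
# "participant", "stimulus_value", "train_frequency", and "chosen" (0 or 1).
import pandas as pd

def choice_proportion_summary(test_df: pd.DataFrame) -> pd.DataFrame:
    # Proportion of choosing each value-by-frequency cell, per participant.
    per_participant = (
        test_df
        .groupby(["participant", "stimulus_value", "train_frequency"])["chosen"]
        .mean()
        .reset_index()
    )
    # Average over participants; the SEM corresponds to the plotted error bars.
    return (
        per_participant
        .groupby(["stimulus_value", "train_frequency"])["chosen"]
        .agg(mean="mean", sem="sem")
        .reset_index()
    )
```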
Figure 3. Response times during the test phase of the Reward Pairs task.
(A) Distribution of median response times in the test phase of the Reward Pairs task as a function of reward value and choice frequency during training. (B) Reward value and choice frequency also interacted in the corresponding linear mixed-effects model (Box 2). Shown are the average effects of the difference in stimulus value between the chosen and unchosen stimulus, separately for the difference in previous choice frequency between the chosen and unchosen stimulus. For illustration, the difference in previous choice frequency between the chosen and unchosen option is split into terciles. We find that RTs decrease with increasing value difference and do so more steeply for stimuli chosen more frequently during training. (C) shows the same data as (B) but with the two variables switched: response times were less dependent on previous choice frequency when participants chose low-valued stimuli than when they chose high-valued stimuli. N=213.
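As a rough illustration of the type of model referenced here, the following sketch fits a linear mixed-effects model of trial-wise response times on the value difference and the previous-choice-frequency difference between the chosen and unchosen stimulus, with their interaction and a random intercept per participant. This is an assumption-laden stand-in (using statsmodels and hypothetical column names), not the registered analysis in Box 2.

```python
# Minimal sketch, not the registered analysis: RT as a function of the
# chosen-minus-unchosen value difference and previous-choice-frequency
# difference, plus their interaction, with a random intercept per participant.
# Column names ("rt", "value_diff", "freq_diff", "participant") are hypothetical.
import statsmodels.formula.api as smf

def fit_rt_mixed_model(trials):
    model = smf.mixedlm(
        "rt ~ value_diff * freq_diff",   # fixed effects and their interaction
        data=trials,
        groups=trials["participant"],    # random intercept per participant
    )
    return model.fit()
```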
Figure 4. Exceedance probabilities and estimated model frequencies for choice data of the test phase of the Reward Pairs task.
Across the group of participants (A), the exceedance probability for the RL model was close to 1, signifying a very strong belief that this model was more likely to have generated the observed data than the other models in the comparison set. Models: 1 – random choice, 2 – reinforcement learning, 3 – choice kernel, 4 – reinforcement learning and choice kernel. On the individual level (B), evidence was strongly or very strongly in favor of the combined RL and CK model for 18 participants; there was insufficient evidence for either model for 36 participants; and evidence favored the RL-only model strongly or very strongly for 159 participants. Absolute BIC score differences were categorized according to Raftery (1995): values between 6 and 10 signify strong evidence, values above 10 very strong evidence for one model over the other.
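The caption's evidence categories can be made concrete with a small helper: the sketch below applies the Raftery (1995) cut-offs named above to a pair of BIC scores. The function and its labels are ours, added for illustration only.

```python
# Minimal sketch of the evidence categories used in the caption: absolute BIC
# differences of 6-10 count as strong evidence, above 10 as very strong, and
# smaller differences are treated as insufficient (as in the participant counts).
def bic_evidence_category(bic_a: float, bic_b: float) -> str:
    delta = abs(bic_a - bic_b)
    if delta > 10:
        return "very strong"
    if delta >= 6:
        return "strong"
    return "insufficient"
```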
Figure 5. Exploratory computational modeling analyses of choice behavior during test.
When the reduced model combining RL and CK was included, it was strongly preferred across the group of participants, with an exceedance probability close to 1 (A). Models: 1 – random choice, 2 – reinforcement learning, 3 – choice kernel, 4 – reinforcement learning and choice kernel, 5 – reduced model combining RL and CK. The inverse temperature parameter of the CK, βh, had a distribution with a mean of 6.78, indicating that the weighting of RL and CK values was shifted towards the CK (B). Participants for whom the weighting parameter implied favoring RL are colored blue; those favoring CK are colored green. On the individual level (C), evidence was very strongly in favor of the reduced model combining RL and CK for 211 participants; there was insufficient evidence for either model for one participant; and evidence favored the RL-only model strongly for one participant. Absolute BIC score differences were categorized according to Raftery (1995): values between 6 and 10 signify strong evidence, values above 10 very strong evidence for one model over the other.
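To make the combined model class concrete, the sketch below implements a generic softmax choice rule that mixes reinforcement-learning values with a choice kernel, each scaled by its own inverse temperature (β for RL, βh for the CK), together with a standard choice-kernel update. This is a textbook-style sketch under our own assumptions; the authors' reduced model may parameterize the weighting differently.

```python
# Minimal sketch (our assumptions, not necessarily the authors' exact
# parameterization) of a combined RL + choice-kernel choice rule.
import numpy as np

def choice_probabilities(q_values, choice_kernel, beta, beta_h):
    """Softmax over RL values and choice-kernel values, each weighted by its
    own inverse temperature (beta for RL, beta_h for the choice kernel)."""
    q = np.asarray(q_values, dtype=float)
    ck = np.asarray(choice_kernel, dtype=float)
    logits = beta * q + beta_h * ck
    logits -= logits.max()              # numerical stability
    exp_logits = np.exp(logits)
    return exp_logits / exp_logits.sum()

def update_choice_kernel(choice_kernel, chosen_index, alpha_h):
    """Move the choice kernel toward the most recent choice with rate alpha_h,
    so frequently chosen options accumulate habit-like weight."""
    ck = np.asarray(choice_kernel, dtype=float)
    target = np.zeros_like(ck)
    target[chosen_index] = 1.0
    return ck + alpha_h * (target - ck)
```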
Figure 6. Choices during the test phase of the Unrewarded Habit task.
The proportion of choosing a stimulus during test increased with instructed choice frequency during training. Note that the average percentage across all four stimuli is less than 50% because most participants missed some trials (see Supplementary Information 3.2.1). Error bars depict standard errors. N=213.

