PLoS Comput Biol. 2022 Sep 9;18(9):e1010410.

doi: 10.1371/journal.pcbi.1010410. eCollection 2022 Sep.

Ambiguity drives higher-order Pavlovian learning

Tomislav D Zbozinek¹, Omar D Perez¹ ² ³, Toby Wise¹, Michael Fanselow⁴ ⁵ ⁶ ⁷, Dean Mobbs¹ ⁸

Affiliations

¹ California Institute of Technology, Humanities and Social Sciences, Pasadena, California, United States of America.
² University of Santiago, CESS-Santiago, Faculty of Business and Economics, Santiago, Chile.
³ University of Chile, Department of Industrial Engineering, Santiago, Chile.
⁴ University of California, Los Angeles, Department of Psychology, Los Angeles, California, United States of America.
⁵ University of California, Los Angeles, Department of Psychiatry & Biobehavioral Sciences, Los Angeles, California, United States of America.
⁶ University of California, Los Angeles, Staglin Center for Brain and Behavioral Health, Los Angeles, California, United States of America.
⁷ University of California, Los Angeles, Brain Research Institute, Los Angeles, California, United States of America.
⁸ California Institute of Technology, Computation and Neural Systems Program, Pasadena, California, United States of America.

PMID: 36084131
PMCID: PMC9491594
DOI: 10.1371/journal.pcbi.1010410

Tomislav D Zbozinek et al. PLoS Comput Biol. 2022.

Abstract

In the natural world, stimulus-outcome associations are often ambiguous, and most associations are highly complex and situation-dependent. Learning to disambiguate these complex associations to identify which specific outcomes will occur in which situations is critical for survival. Pavlovian occasion setters are stimuli that determine whether other stimuli will result in a specific outcome. Occasion setting is a well-established phenomenon, but very little investigation has been conducted on how occasion setters are disambiguated when they themselves are ambiguous (i.e., when they do not consistently signal whether another stimulus will be reinforced). In two preregistered studies, we investigated the role of higher-order Pavlovian occasion setting in humans. We developed and tested the first computational model predicting direct associative learning, traditional occasion setting (i.e., 1st-order occasion setting), and 2nd-order occasion setting. This model operationalizes stimulus ambiguity as a mechanism to engage in higher-order Pavlovian learning. Both behavioral and computational modeling results suggest that 2nd-order occasion setting was learned, as evidenced by lack and presence of transfer of occasion setting properties when expected and the superior fit of our 2nd-order occasion setting model compared to the 1st-order occasion setting or direct associations models. These results provide a controlled investigation into highly complex associative learning and may ultimately lead to improvements in the treatment of Pavlovian-based mental health disorders (e.g., anxiety disorders, substance use).

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

**Fig 1. Hierarchical Model of 2nd-Order Occasion Setting.**
Using the example from the main text of a child, friend, and grandparent: a) direct associative learning, b) 1st-order occasion setting, and c) 2nd-order occasion setting. Panels d-j display: d) direct excitation, e) direct inhibition, f) 1st-order positive occasion setting, g) 1st-order negative occasion setting, h) 2nd-order positive occasion setting, i) 2nd-order negative occasion setting, and j) the total model with all associations. Our mathematical model is presented at the bottom of panels d-j, where black/bold variables are active (i.e., values greater than 0), and gray variables are inactive (i.e., values are 0). See Table 3 for details on formulas. In the figure, circles are stimuli: unconditional stimulus (US), conditional stimulus (CS), 1st-order occasion setter (OS1), and 2nd-order occasion setter (OS2). Blue arrows indicate direct excitation; blue line segments indicate positive occasion setting; red line segments indicate direct inhibition or negative occasion setting; yellow glow indicates CS ambiguity; purple glow indicates OS1 ambiguity; blue USs indicate US delivery; and red USs indicate US omission. While we suggest that stimulus ambiguity is a dimensional, learned property, we present it as present/absent in the figure for simplicity. Thick arrows/lines indicate activated pathways; thin arrows/lines indicate deactivated pathways. Stimulus ambiguity is required for higher-order associative learning: 1st-order occasion setting is learned only if the CS is ambiguous and has been trained with an OS1; 2nd-order occasion setting is learned only if the CS and OS1 are ambiguous and if the CS has been trained with an OS2.
CSs have a direct predictive relationship with the US. If the CS is ambiguous (i.e., sometimes predicts the US, sometimes predicts absence of the US), then attention is broadened to other stimuli or contextual factors (i.e., to the OS1); if a stimulus that disambiguates CS reinforcement is identified and is less salient than the CS, it becomes an OS1. The OS1 modulates the CS/US association. If OS1 consistently excites the CS/US association, then OS1 is a positive OS1; if OS1 consistently inhibits the CS/US association, then OS1 is a negative OS1. If OS1 sometimes excites and sometimes inhibits the CS/US association (i.e., OS1 is ambiguous), then attention is broadened to other stimuli or contextual factors (i.e., OS2) that disambiguate how the OS1 affects the CS/US association. If a stimulus disambiguates the effect of OS1 on the CS/US association and is presumably less salient than the OS1, 2nd-order occasion setting is learned. If the OS2 consistently disables the OS1’s 1st-order positive occasion setting ability, the OS2 is a 2nd-order negative occasion setter. If OS2 consistently disables OS1’s 1st-order negative occasion setting ability, then OS2 is a 2nd-order positive occasion setter. Additionally, each hierarchical level (direct associations, 1st-order occasion setting, 2nd-order occasion setting) and the excitatory/inhibitory directions are orthogonal, meaning that a given stimulus can be any combination of an excitatory or inhibitory CS, OS1, or OS2 (e.g., a given stimulus can simultaneously be a CS+, negative OS1, and positive OS2).
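The ambiguity requirements stated in the Fig 1 caption can be read as simple gating rules. The sketch below is only an illustration of those rules, not the authors' model: the class name, field names, and level labels are hypothetical, and ambiguity is treated as a boolean (as in the figure) even though the caption describes it as a dimensional, learned property.

```python
from dataclasses import dataclass


@dataclass
class TrainingHistory:
    """Hypothetical summary of how a CS was trained (names are illustrative)."""
    cs_ambiguous: bool      # CS sometimes predicts the US, sometimes its absence
    trained_with_os1: bool  # CS was trained together with a candidate OS1
    os1_ambiguous: bool     # OS1 sometimes excites, sometimes inhibits CS/US
    trained_with_os2: bool  # CS was trained together with a candidate OS2


def learnable_levels(history: TrainingHistory) -> list[str]:
    """Return the hierarchical levels the caption says can be learned."""
    levels = ["direct"]  # CSs always have a direct relationship with the US
    # 1st-order: only if the CS is ambiguous and was trained with an OS1.
    if history.cs_ambiguous and history.trained_with_os1:
        levels.append("1st-order occasion setting")
    # 2nd-order: only if CS and OS1 are both ambiguous and the CS was
    # trained with an OS2 (conditions taken literally from the caption).
    if history.cs_ambiguous and history.os1_ambiguous and history.trained_with_os2:
        levels.append("2nd-order occasion setting")
    return levels


if __name__ == "__main__":
    unambiguous_cs = TrainingHistory(False, True, False, False)
    fully_ambiguous = TrainingHistory(True, True, True, True)
    print(learnable_levels(unambiguous_cs))
    print(learnable_levels(fully_ambiguous))
```

Under these assumptions, an unambiguous CS never supports occasion setting regardless of which other stimuli accompany it, which is the sense in which ambiguity "drives" higher-order learning.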

**Fig 2. Experiment 1 (2nd-Order Negative Occasion Setting) Trial Design.**
Each colored box represents a trial type. Gray boxes represent what was shown visually on screen. Inter-trial intervals (ITIs) and inter-stimulus intervals (ISIs) included a gray screen with a fixation cross (“+”). Duration of each trial component is shown at top of each trial type. Rating slide is shown in abbreviated form, and visual analog scale was used to rate US Expectancy. Images of nature scenes, shapes, and auditory stimuli indicate experimental stimuli (i.e., CSs, occasion setters). Auditory stimuli are indicated below slides in horizontal auditory band. Violin symbol indicates violin sound, static screen indicates white noise, and dollar sign indicates cash register sound. None of the auditory symbols were shown on screen during the experiment. Gold coin indicates monetary reward (US). Black arrow pointing to the right for each trial type indicates chronological component sequence during trials. All stimuli are counterbalanced across participants within stimulus category: G/H (Unambiguous CSs), C/K (Ambiguous CSs), B/J (1st-order occasion setters), and A/T (2nd-order occasion setters). All trial types for Experiment 1 are shown except TJR+, which is identical to ABR+, except T and J stimuli are substituted for A and B. Experiment 2 (2nd-order POS) design is mirror image of Experiment 1 in which all trial types reinforced in Experiment 1 were not reinforced in Experiment 2, and all trial types not reinforced in Experiment 1 were reinforced in Experiment 2. The only exceptions are G+ and H-, which remained a CS+ and CS-, respectively, in each study. Images of violin, gold coin, confetti, fractals, and dollar sign were obtained from https://openclipart.org.
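The mirror-image relationship between the two experiments can be sketched as a reinforcement flip with two fixed exceptions. This is an illustration under assumptions: the dictionary holds only a subset of Experiment 1 trial types named in the captions (G+, H-, JK+, TJK-, J-), the function and variable names are hypothetical, and the stimulus letters are kept unchanged purely for illustration, since the captions suggest Experiment 2 used its own stimulus labels.

```python
# Subset of Experiment 1 trial types named in the captions.
# True = reinforced (+), False = not reinforced (-).
EXPERIMENT_1 = {"G": True, "H": False, "JK": True, "TJK": False, "J": False}

# G+ and H- stay a CS+ and CS- in both experiments.
UNCHANGED = frozenset({"G", "H"})


def mirror_design(design: dict[str, bool],
                  unchanged: frozenset[str] = UNCHANGED) -> dict[str, bool]:
    """Flip reinforcement for every trial type except the listed exceptions."""
    return {
        stimulus: reinforced if stimulus in unchanged else not reinforced
        for stimulus, reinforced in design.items()
    }


def label(stimulus: str, reinforced: bool) -> str:
    """Format a trial type in the caption's notation, e.g. 'JK+'."""
    return stimulus + ("+" if reinforced else "-")


if __name__ == "__main__":
    mirrored = mirror_design(EXPERIMENT_1)
    print([label(s, r) for s, r in mirrored.items()])
```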

**Fig 3. Experiment 1 and 2 Training Results.**
**a, b, c)** Experiment 1 Training results generally reflect direct CS/US associations, 1st-order positive occasion setting, and 2nd-order negative occasion setting. **d, e, f)** Experiment 2 Training results generally reflect direct CS/US associations, 1st-order negative occasion setting, and 2nd-order positive occasion setting. Congruent conditions/panels are displayed horizontally between experiments. Results in both experiments showed that participants correctly learned which stimuli were (non)reinforced. Error bands reflect standard error. Generally, “cool” colors (blues, greens, purples) indicate hypothesized higher values, whereas “warm” colors (reds, oranges, yellows) indicate hypothesized lower values. Additionally, because not all stimuli were trained in every phase of the experiment (e.g., the “ABC” and “DEF” stimuli were not trained during 2nd Training), there are some empty spaces in the graphs of stimuli being shown either earlier or later in the experiment (e.g., G+ and H- were trained in the first half of the experiment). The reason that trial numbers vary between stimuli is to balance thoroughness of training and concision. During reminder phases, we wanted to remind participants of the most critical trial types relevant for the upcoming transfer test. For example, J- was not in the 2nd Reminder phase because it was not essential to Transfer Test 2 (which focused on three-stimulus combinations, including ABC, AJK, and TJK). Additionally, the reader may notice that some stimuli (e.g., JK+) end prior to other stimuli (e.g., TJK-) during 2nd Reminder. This is a visual artifact of the figure. The former stimuli received three trials of training during 2nd Reminder (for concision purposes because they already received plenty of training beforehand), whereas the latter stimuli received nine trials because they had undergone less training.
Thus, the figures show some stimuli (e.g., TJK-) having nine trials but others (e.g., JK+) having seven trials. Those stimuli end at seven trials in order to interpolate responding to them through the course of training and to scale their trial numbers for visual purposes. In short, this figure offers interpretable results of participants’ learning, and for detailed trial numbers, please see Table A in S7 Text.

**Fig 4. Experiment 1 and 2 Transfer Test Results.**
Panels **a-e** are from Experiment 1 (left column); panels **f-j** are from Experiment 2 (right column). Figure shows a bar plot (mean and standard error) with individual data points for each participant and each stimulus in ascending order. Bars with gradient colors and individual data points with two colors indicate novel transfer stimuli; solid bars and single-colored data points indicate trained stimuli. CS+ = excitatory conditional stimulus; CS- = inhibitory CS; POS1 = 1st-order positive occasion setter; NOS1 = 1st-order negative occasion setter; POS2 = 2nd-order POS; NOS2 = 2nd-order NOS. See main text for results details and references to panels. Main comparisons in panels e (AJK2 vs AJK1) and j (DMN2 vs DMN1) circled in **black ovals** and show that the 2nd-order occasion setters (A & D) transferred more strongly to the 1st-order occasion setter/CS combination after the combination was trained with a different 2nd-order occasion setter (AJK2, DMN2) than before (AJK1, DMN1).

**Fig 5. Computational Modeling Results.**
a) Model fit was determined with Watanabe-Akaike Information Criterion (WAIC), where lower scores indicate more accurate models. In both experiments, we tested three models of hierarchical learning: our direct associative learning model, our 1st-order occasion setting model (which also included direct associative learning), and our 2nd-order occasion setting model (which also included 1st-order occasion setting and direct associative learning). Results show that, in both experiments, the 2nd-order occasion setting model (bold colors) outperformed the 1st-order occasion setting model and direct associations model. b) As secondary/complementary results to our WAIC analyses, we also estimated median R² for each model, finding that our 2nd-order occasion setting models had the greatest R² (.632, .560). c) Experiment 1 exemplar participant responding, model-predicted responding, and “perfect learning” prediction for the 2nd-order occasion setting model. d) Experiment 2 exemplar participant responding, model-predicted responding, and “perfect learning” prediction for 2nd-order occasion setting model. Transfer Test 1 = trials 205–249 and Transfer Test 2 = trials 331–342, shown between the vertical green hashed lines; remaining trials were Training/Reminder.
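For readers unfamiliar with WAIC, the following is a minimal sketch of the standard definition on the deviance scale: minus two times the log pointwise predictive density, penalized by the per-observation variance of the log-likelihood across posterior draws. The function names and the nested-list input layout (`log_lik[draw][observation]`) are assumptions for illustration; the source does not describe the authors' fitting pipeline, and this is not their code.

```python
import math
from statistics import variance


def log_mean_exp(values: list[float]) -> float:
    """Numerically stable log of the mean of exp(values)."""
    peak = max(values)
    return peak + math.log(sum(math.exp(v - peak) for v in values) / len(values))


def waic(log_lik: list[list[float]]) -> float:
    """WAIC from pointwise log-likelihoods.

    log_lik[s][i] is the log-likelihood of observation i under posterior
    draw s (at least two draws are needed for the variance term).
    Lower values indicate better expected predictive accuracy.
    """
    n_obs = len(log_lik[0])
    lppd = 0.0    # log pointwise predictive density
    p_waic = 0.0  # effective number of parameters (variance penalty)
    for i in range(n_obs):
        per_draw = [draw[i] for draw in log_lik]
        lppd += log_mean_exp(per_draw)
        p_waic += variance(per_draw)  # sample variance across draws
    return -2.0 * (lppd - p_waic)


if __name__ == "__main__":
    # Two toy "models": each observation has likelihood 0.5 vs 0.8 in all draws.
    weaker = [[math.log(0.5)] * 3 for _ in range(4)]
    stronger = [[math.log(0.8)] * 3 for _ in range(4)]
    print(waic(weaker), waic(stronger))
```

In the toy comparison, the model that assigns higher likelihood to the observations gets the lower WAIC, matching the caption's reading that lower scores indicate more accurate models.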

**Fig 6. Examples of Formula Inputs and Outputs.**
The formula variables (e.g., V, $\bar{V}$, P) represent presence/absence of stimuli on a hypothetical trial and their training history. Bar graphs indicate predicted responding (i.e., R) to CS based on its training history and presence/absence of occasion setters. Left column provides names of formula variables. Across each row from these variables are values of 0 or 1, corresponding to the values of the variables’ names on the left. The values used by each figure are located in a column directly to the left of each figure. For example, the top-left figure shows a CS with 1 for direct excitation (V) and 0 for all other values (i.e., this is a CS+). POS = positive occasion setting; NOS = negative occasion setting. Excitation is color-coded as teal; inhibition is color-coded as purple. a) Examples of Direct Associative Learning and Successful Occasion Setting. We arranged inputs and outputs in a 2x3 grid, where the first column shows direct learning, the second column shows successful 1st-order occasion setting, and the third column shows successful 2nd-order occasion setting. First row shows excitatory responses, and second row shows inhibitory responses. b) Examples of Unsuccessful Occasion Setting. In bottom-left figure, we provide an example to demonstrate that 1st-order occasion setters do not affect responding if either direct excitation or direct inhibition is 0 (in our example, a 1st-order negative occasion setter does not affect a CS+, whose direct inhibition = 0). Congruently, 2nd-order occasion setters do not affect responding if any of the following is 0: direct excitation, direct inhibition, 1st-order positive occasion setting, or 1st-order negative occasion setting. As examples, in the bottom-middle plot, we show a 2nd-order negative occasion setter will not affect a CS unless an ambiguous 1st-order occasion setter is present (i.e., no 1st-order occasion setter is present, so P and N = 0).
In our bottom-right plot, a 2nd-order positive occasion setter will not affect a CS- for multiple reasons, such as direct excitation = 0 and having no 1st-order occasion setter present (i.e., P and N = 0).
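The "unsuccessful occasion setting" conditions in panel b can be expressed as two predicates. This sketch covers only the gating conditions stated in the caption, not the response formula itself, which is given in the paper's Table 3 and is not reproduced here. The function names are hypothetical, and reading $\bar{V}$ as direct inhibition is an assumption based on the caption's wording.

```python
def os1_can_modulate(v: float, v_bar: float) -> bool:
    """1st-order occasion setters affect responding only if both direct
    excitation (v) and direct inhibition (v_bar, assumed) are nonzero."""
    return v != 0 and v_bar != 0


def os2_can_modulate(v: float, v_bar: float, p: float, n: float) -> bool:
    """2nd-order occasion setters additionally require nonzero 1st-order
    positive (p) and negative (n) occasion setting."""
    return os1_can_modulate(v, v_bar) and p != 0 and n != 0


if __name__ == "__main__":
    # Bottom-left example: negative OS1 with a CS+ (direct inhibition = 0).
    print(os1_can_modulate(v=1, v_bar=0))
    # Bottom-middle example: negative OS2 with no OS1 present (P = N = 0).
    print(os2_can_modulate(v=1, v_bar=1, p=0, n=0))
    # Bottom-right example: positive OS2 with a CS- and no OS1 present.
    print(os2_can_modulate(v=0, v_bar=1, p=0, n=0))
```

All three calls correspond to the caption's examples in which the occasion setter leaves responding unchanged, so each predicate evaluates to false under these assumptions.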

Grants and funding

206460/Z/17/Z/WT_/Wellcome Trust/United Kingdom
