Neuron. 2013 Oct 16;80(2):507-18.

doi: 10.1016/j.neuron.2013.08.008.

Neural estimates of imagined outcomes in the orbitofrontal cortex drive behavior and learning

Yuji K Takahashi¹, Chun Yun Chang, Federica Lucantonio, Richard Z Haney, Benjamin A Berg, Hau-Jie Yau, Antonello Bonci, Geoffrey Schoenbaum

Affiliation

¹ National Institute on Drug Abuse Intramural Research Program, Cellular Neurobiology Research Branch, Behavioral Neurophysiology Research Section, Baltimore, MD 21224, USA. Electronic address: yuji.takahashi@nih.gov.

PMID: 24139047
PMCID: PMC3806218
DOI: 10.1016/j.neuron.2013.08.008

Yuji K Takahashi et al. Neuron. 2013.

Abstract

Imagination, defined as the ability to interpret reality in ways that diverge from past experience, is fundamental to adaptive behavior. This can be seen at a simple level in our capacity to predict novel outcomes in new situations. The ability to anticipate outcomes never before received can also influence learning if those imagined outcomes are not received. The orbitofrontal cortex is a key candidate for where the process of imagining likely outcomes occurs; however, its precise role in generating these estimates and applying them to learning remain open questions. Here we address these questions by showing that single-unit activity in the orbitofrontal cortex reflects novel outcome estimates. The strength of these neural correlates predicted both behavior and learning, learning that was abolished by temporally specific inhibition of orbitofrontal neurons. These results are consistent with the proposal that the orbitofrontal cortex is critical for integrating information to imagine future outcomes.

Figures

**Figure 1. Task design and recording sites**
a. Task design and experimental timeline. A1, A2 and A3 are auditory cues (tone, white noise and clicker, counterbalanced). V is a visual cue (a cue light). Two differently flavored sucrose pellets were used as rewards (banana- or grape-flavored, represented by solid or empty circles, counterbalanced). Training began with 12 conditioning sessions (CD1 – CD12) in which each cue was presented 8 times. A1 and V were paired with the same reward (3 pellets), and A2 was paired with the other reward (3 pellets). A3 was paired with no reward. After the last conditioning session, rats underwent a single compound probe session (CP1) followed by 3 compound training sessions (CP2 – CP4). During the 1st half of the compound probe session (CP 1/2), rats continued to receive simple conditioning. During the 2nd half (CP 2/2), rats began compound training in which A1 and V were presented together as a compound (A1/V), followed by delivery of the same reward (3 pellets). A2, A3 and V continued to be presented as in simple conditioning. During the compound training sessions (CP2 – CP4), rats received presentations of A1/V, A2, A3 and V. After the last compound training session, rats underwent a single extinction probe session (PB). The 1st half of the session (PB 1/2) consisted of further compound training. During the 2nd half of the session (PB 2/2), rats received eight non-reinforced presentations of A1, A2 and A3, with the order mixed and counterbalanced. b. Location of recording sites in OFC. Boxes indicate the approximate location of recording sites in each rat, taking into account any vertical distance traveled during training and the approximate lateral spread of the electrode bundle.

**Figure 2. Conditioned responding and cue-evoked activity increased during simple conditioning**
a. Increase in conditioned responding, measured as the percentage of time in the food cup during each of the 4 cues, across sessions. Red diamond, A1; blue square, A2; green circle, A3; yellow triangle, V. b. Proportions of neurons that were significantly responsive to any of the 4 cues, shown for each pair of sessions and separated into those that increased (white) or decreased (black) firing rate compared to baseline. The proportion of neurons that increased firing grew significantly across conditioning (chi-square test compared to the proportion in the first pair of sessions), whereas the proportion of neurons that decreased firing did not change. **p < 0.01, *p < 0.05. c. Examples of single units showing cue-evoked responses. Top and bottom units were recorded from Rat#11 on conditioning day 5 and from Rat#5 on conditioning day 11, respectively. Activity shown is synchronized to the onset of the 30 s cues. Red, blue, green and yellow lines indicate A1, A2, A3 and V, respectively. Gray bars indicate the period of cue presentation. Bin size: 1 sec.

**Figure 3. Conditioned responding and cue-evoked activity summate at the start of compound training**
a. Conditioned responding as a percentage of time in the food cup during each of the 4 cues during the compound probe (CP) and 3 days of compound training (CP2 – CP4). Red diamonds indicate A1 in the CP 1/2 phase, and A1/V in the CP 2/2 and CP2 – CP4 phases. Blue squares, green circles and yellow triangles indicate A2, A3 and V, respectively. Red and blue bars in the inset indicate the change in responding to A1 (red) and A2 (blue) from the 1st half to the 2nd half of CP. * p < 0.05. Error bars = S.E.M. b. Population responses of all 70 cue-responsive neurons, with firing normalized by neuron, to A1 (left), V (middle) and A2 (right) during 28 compound probe sessions. Dark and light red indicate the population response to A1 in the 1st half of the session and to A1/V in the 2nd half, respectively. Dark and light yellow indicate the population response to V in the 1st half and 2nd half of the session, respectively. Dark and light blue indicate the population response to A2 in the 1st half and 2nd half of the session, respectively. Small insets in each panel show the population response to each cue in the 1st half of the session and on the 1st trial in the 2nd half of the session. Gray shadings indicate S.E.M. Gray bars indicate the period of cue presentation. c. Average normalized firing to A1 (red), A2 (blue), and V (yellow) in the 1st and 2nd half of the compound probe session. Average normalized activity was calculated by dividing average firing during the last 20 sec by average firing during the last 20 sec of the pre-CS period. d-g. Distributions of summation index scores for firing to A1 (d), V (e and f) and A2 (g) in the compound probe. Each summation index compares firing on the first trial of the second half of the compound probe (CP 2/2) against firing in the first half of the compound probe (CP 1/2), using the following formula: (2nd FR - 1st FR) / (2nd FR + 1st FR), where FR represents average normalized firing for each condition. h. Distribution of the compound index in the compound probe session. The compound index compares firing to the compound cue (A1/V) on the first trial of the second half of the session against the sum of firing to A1 and V in the first half of the session, using the following formula: (2nd FR A1/V - (1st FR A1 + 1st FR V)) / (2nd FR A1/V + (1st FR A1 + 1st FR V)), where FR represents average normalized firing for each condition. Black bars represent neurons in which the difference in firing was statistically significant. The numbers in each panel indicate the result of a Wilcoxon signed-rank test (p) on the distribution and the average summation index (u). i. The scatter plot on the left shows the relationship between the average normalized firing of each neuron to its preferred cue in the 1st half and its average normalized firing to A1/V on the 1st trial in the 2nd half of the session. The distribution plot on the right shows the summation index calculated from the same two quantities. j. Correlation between neural summation index scores and behavioral summation index scores during the compound probe session. The behavioral summation index compares conditioned responding to A1/V on the first trial of the second half of the session against that to A1 during the first half of the session, using the following formula: (2nd CR A1/V - 1st CR A1) / (2nd CR A1/V + 1st CR A1), where CR represents the average percent of time spent in the food cup during each condition. k. The line plot shows the ratio between normalized firing to A1/V and A2 during each compound training session (CP – CP4). N's indicate the number of cue-responsive neurons in each session. The A1/A2 ratio increased significantly in the compound phase of the probe, and then gradually decreased (ANOVA, **p < 0.01, *p < 0.05). The line plot in the inset shows normalized firing to A1/V and A2 across 6 trials in the 2nd half of the compound probe session, with red diamonds for A1 and blue squares for A2. Error bars = S.E.M. See also Figures S1, S2, S4.
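
The index formulas in the Figure 3 legend can be written out as a minimal Python sketch. The function names and the example firing rates are hypothetical illustrations, not code or data from the paper, and the sketch assumes the first "last 20 sec" window in panel c refers to the cue period.

```python
def normalized_firing(cue_rate, pre_cs_rate):
    """Firing in the last 20 s (assumed: of the cue) divided by
    firing in the last 20 s of the pre-CS period."""
    return cue_rate / pre_cs_rate


def contrast_index(second, first):
    """Shared form of the legend's indices: (second - first) / (second + first).
    For non-negative inputs with a non-zero sum the result lies in [-1, 1]."""
    return (second - first) / (second + first)


def summation_index(fr_trial1_second_half, fr_first_half):
    """Panels d-g: first trial of CP 2/2 versus the average over CP 1/2."""
    return contrast_index(fr_trial1_second_half, fr_first_half)


def compound_index(fr_a1v_trial1, fr_a1_first_half, fr_v_first_half):
    """Panel h: compound response versus the sum of the element responses."""
    return contrast_index(fr_a1v_trial1, fr_a1_first_half + fr_v_first_half)


# Hypothetical raw rates (spikes/s), chosen only to make the arithmetic easy.
fr_a1 = normalized_firing(4.0, 4.0)    # A1 alone, first half
fr_v = normalized_firing(4.0, 4.0)     # V alone, first half
fr_a1v = normalized_firing(12.0, 4.0)  # A1/V, first compound trial

print(summation_index(fr_a1v, fr_a1))
print(compound_index(fr_a1v, fr_a1, fr_v))
```

In this form, a summation index above zero means firing to the compound exceeded firing to A1 alone, and a compound index of zero would mean the compound response equals the sum of the responses to A1 and V. The same `contrast_index` applies to the behavioral index in panel j, with percent time in the food cup in place of firing.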

**Figure 4. Conditioned responding and cue-evoked activity spontaneously decline at the start of extinction training**
a. Conditioned responding as a percentage of time in the food cup during each of the 4 cues during the extinction probe (PB). The bar graph shows average responding during extinction trials only. Red indicates A1/V in PB 1/2, and A1 in the line plot and bar graph. Blue, green and yellow indicate A2, A3 and V, respectively. * p < 0.01. Error bars = S.E.M. b. Population responses of all 61 cue-responsive neurons, with firing normalized by neuron, to A1 (left) and A2 (right) during 28 extinction probe sessions. Light and dark red indicate the population response to A1/V in the 1st half of the session and to A1 on the 1st trial in the 2nd half, respectively. Light and dark blue indicate the population response to A2 in the 1st half and on the 1st trial in the 2nd half of the session, respectively. Gray shadings indicate S.E.M. Gray bars indicate the period of cue presentation. c. Average normalized firing rate to A1 (red) and A2 (blue) in the extinction probe session. Average normalized activity was calculated by dividing average firing during the last 20 sec by average firing during the last 20 sec of the pre-CS period. d and e. Distribution of over-expectation index scores for firing to A1 (d) and A2 (e) in the extinction probe. Each over-expectation index compares firing on the first trial of the second half of the probe (PB 2/2) against firing in the first half of the probe (PB 1/2), using the following formula: (2nd FR - 1st FR) / (2nd FR + 1st FR), where FR represents average normalized firing for each condition. f. Distribution of the compound index in the extinction probe session. The compound index compares firing to the compound cue (A1/V) in the first half of the session against the sum of firing to V in the first half of the session and to A1 on the first trial of the second half of the session, using the following formula: ((2nd FR A1 + 1st FR V) - 1st FR A1/V) / ((2nd FR A1 + 1st FR V) + 1st FR A1/V), where FR represents average normalized firing for each condition. Black bars represent neurons in which the difference in firing was statistically significant. The numbers in each panel indicate the result of a Wilcoxon signed-rank test (p) on the distribution and the average over-expectation index (u). g. Correlation between behavioral over-expectation and neural over-expectation, and between behavioral over-expectation and neural summation. The neural summation index was the A1 index, computed as in Fig. 3 (i.e. from the compound probe session). The neural over-expectation index was computed as in Fig. 4d. The behavioral over-expectation index compares conditioned responding to A1 on the first trial of the second half of the session against that to A1/V during the first half of the session, using the following formula: (2nd CR A1 - 1st CR A1/V) / (2nd CR A1 + 1st CR A1/V), where CR represents the average percent of time spent in the food cup during each condition. See also Figures S3 and S4.
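
The extinction-probe indices in the Figure 4 legend follow the same contrast form, with the compound in the first half of the session as the reference. The following is a minimal Python sketch under that reading. The function names and input values are hypothetical and are not taken from the paper.

```python
def over_expectation_index(second_half_trial1, first_half):
    """Panels d, e and g: (2nd - 1st) / (2nd + 1st).
    For A1, the first-half term is the response to the A1/V compound."""
    return (second_half_trial1 - first_half) / (second_half_trial1 + first_half)


def extinction_compound_index(fr_a1_trial1, fr_v_first_half, fr_a1v_first_half):
    """Panel f: sum of element responses versus the compound response."""
    summed = fr_a1_trial1 + fr_v_first_half
    return (summed - fr_a1v_first_half) / (summed + fr_a1v_first_half)


# Hypothetical normalized firing values, for illustration only.
neural = over_expectation_index(second_half_trial1=1.0, first_half=3.0)
compound = extinction_compound_index(
    fr_a1_trial1=1.0, fr_v_first_half=1.0, fr_a1v_first_half=3.0
)

# Hypothetical percent time in the food cup (behavioral version of the index).
behavioral = over_expectation_index(second_half_trial1=30.0, first_half=50.0)

print(neural, compound, behavioral)
```

A negative over-expectation index indicates that the response to A1 on the first extinction trial was lower than the response to A1/V during the preceding compound trials, which is the direction of change named in the figure title.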

**Figure 5. Optogenetic inhibition of OFC neurons prevents spontaneous decline in conditioned responding at the start of extinction training**
a. Representative coronal brain slice showing expression of NpHR-eYFP (green) after virus injection into OFC. Blue, fluorescent Nissl staining with NeuroTracer. b. Traces showing the expression of NpHR-eYFP (left) and eYFP (right). c. Locations of fiber tips in the NpHR-eYFP (left) and eYFP (right) groups. d. The NpHR transgene reduced OFC neural excitability. The top panel shows an example trace of the firing pattern of an NpHR-eYFP-expressing OFC neuron in the presence and absence of light. Gray bars, current injection period (300 pA in this case); black bar, light-on period. The line plot at the bottom compares the excitability of NpHR-eYFP-expressing OFC neurons (n = 8) in the presence (open square) or absence (solid square) of light. NpHR-eYFP-expressing OFC neurons generated fewer evoked spikes during light-on conditions than during light-off conditions (F(1,14) = 8.94, p < 0.01). e. Optical stimulation was delivered during presentation of A1/V (NpHR-CS and eYFP-CS groups) or during the inter-trial interval 30 s after A1/V presentation (NpHR-ITI and eYFP-ITI groups). f-i. Conditioned responding as a percentage of time in the food cup during each of 3 cues during the extinction probe in the NpHR-CS (f), eYFP-CS (g), NpHR-ITI (h) and eYFP-ITI (i) groups. The line plots show responding across 8 trials, and the bar graphs show average responding over the 8 trials. Red, blue and yellow indicate A1, A2 and A3, respectively. * p < 0.05. ** p < 0.01. Error bars = S.E.M. See also Figure S5.

Cited by

Basal Forebrain Mediates Motivational Recruitment of Attention by Reward-Associated Cues.
Tashakori-Sabzevar F, Ward RD. Front Neurosci. 2018 Oct 30;12:786. doi: 10.3389/fnins.2018.00786. eCollection 2018. PMID: 30425617. Free PMC article.
Orbital frontal cortex updates state-induced value change for decision-making.
Baltz ET, Yalcinbas EA, Renteria R, Gremel CM. Elife. 2018 Jun 13;7:e35988. doi: 10.7554/eLife.35988. PMID: 29897332. Free PMC article.
Flexible Use of Predictive Cues beyond the Orbitofrontal Cortex: Role of the Submedius Thalamic Nucleus.
Alcaraz F, Marchand AR, Vidal E, Guillou A, Faugère A, Coutureau E, Wolff M. J Neurosci. 2015 Sep 23;35(38):13183-93. doi: 10.1523/JNEUROSCI.1237-15.2015. PMID: 26400947. Free PMC article.
Dreams, reality and memory: confabulations in lucid dreamers implicate reality-monitoring dysfunction in dream consciousness.
Corlett PR, Canavan SV, Nahum L, Appah F, Morgan PT. Cogn Neuropsychiatry. 2014;19(6):540-53. doi: 10.1080/13546805.2014.932685. Epub 2014 Jul 16. PMID: 25028078. Free PMC article.
Orbitofrontal cortex mediates the differential impact of signaled-reward probability on discrimination accuracy.
Ward RD, Winiger V, Kandel ER, Balsam PD, Simpson EH. Front Neurosci. 2015 Jun 23;9:230. doi: 10.3389/fnins.2015.00230. eCollection 2015. PMID: 26157358. Free PMC article.
