Proc Natl Acad Sci U S A. 2005 May 17;102(20):7338-43.

doi: 10.1073/pnas.0502455102. Epub 2005 May 9.

Prefrontal cortex and flexible cognitive control: rules without symbols

Nicolas P Rougier¹, David C Noelle, Todd S Braver, Jonathan D Cohen, Randall C O'Reilly

PMID: 15883365
PMCID: PMC1129132
DOI: 10.1073/pnas.0502455102

Nicolas P Rougier et al. Proc Natl Acad Sci U S A. 2005.

Affiliation

¹ Department of Psychology, University of Colorado, 345 UCB, Boulder, CO 80309, USA.

Abstract

Human cognitive control is uniquely flexible and has been shown to depend on prefrontal cortex (PFC). But exactly how the biological mechanisms of the PFC support flexible cognitive control remains a profound mystery. Existing theoretical models have posited powerful task-specific PFC representations, but not how these develop. We show how this can occur when a set of PFC-specific neural mechanisms interact with breadth of experience to self-organize abstract rule-like PFC representations that support flexible generalization in novel tasks. The same model is shown to apply to benchmark PFC tasks (Stroop and Wisconsin card sorting), accurately simulating the behavior of neurologically intact and frontally damaged people.

Figures

**Fig. 1.**
Model and example stimuli. (a) The model with the complete PFC system. Stimuli are presented in two possible locations (left, right). Rows represent different stimulus dimensions (e.g., color, size, shape, etc., labeled A-E for simplicity), and columns represent different features (red, orange, green, and blue; small, medium, etc., numbered 1-4). Other inputs include a task input indicating the current task to perform (NF, name feature; MF, match feature; SF, smaller feature; LF, larger feature), and, for the “instructed” condition (used to control for lack of maintenance in non-PFC networks), a cue to the currently relevant dimension. Output responses are generated over the response layer, which has units for the different stimulus features, plus a “No” unit to signal nonmatch in the matching task. The hidden layers represent posterior cortical pathways associated with different types of inputs (e.g., visual and verbal). The AG unit is the adaptive gating unit, providing a temporal-differences (TD)-based dynamic gating signal to the PFC context layer. The weights into the AG unit learn via the TD mechanism, whereas all other weights learn using the Leabra algorithm, which combines standard Hebbian and error-driven learning mechanisms with k-winners-take-all inhibitory competition within layers and point-neuron activation dynamics (26) (also see supporting information). (b) Example stimuli and correct responses for one of the tasks (NF) across three trials where the current rule is to focus on the Shape dimension (the same rule was blocked over 200 trials to allow networks plenty of time to adapt to each rule). The corresponding input and target patterns for the network are shown below each trial, with the unit meanings given by the legend in the lower left. The network must maintain the current dimension rule to perform correctly.
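The input layout described in this caption can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the dictionary format for a stimulus, and the `"C4"`-style response labels are hypothetical, and it assumes each stimulus has exactly one active feature in every dimension. The match-feature response convention (feature name on a match, "No" otherwise) is also an assumption based on the caption's mention of a “No” unit.

```python
DIMENSIONS = "ABCDE"   # five stimulus dimensions (rows in the caption)
N_FEATURES = 4         # four features per dimension (columns, numbered 1-4)


def encode_stimulus(features):
    """Return a 5x4 grid of 0/1 values with one active unit per dimension.

    `features` maps each dimension letter to a feature number 1-4,
    e.g. {"A": 2, "B": 1, "C": 4, "D": 3, "E": 1} (hypothetical format).
    """
    grid = []
    for dim in DIMENSIONS:
        row = [0] * N_FEATURES
        row[features[dim] - 1] = 1
        grid.append(row)
    return grid


def name_feature_target(stimulus, relevant_dim):
    """NF task: respond with the stimulus feature in the relevant dimension."""
    return f"{relevant_dim}{stimulus[relevant_dim]}"


def match_feature_target(left, right, relevant_dim):
    """MF task (assumed convention): name the shared feature, else "No"."""
    if left[relevant_dim] == right[relevant_dim]:
        return f"{relevant_dim}{left[relevant_dim]}"
    return "No"


left = {"A": 2, "B": 1, "C": 4, "D": 3, "E": 1}
right = {"A": 2, "B": 3, "C": 1, "D": 3, "E": 2}
left_grid = encode_stimulus(left)
```

In this sketch the correct response depends only on `relevant_dim`, which is the piece of information the network has to maintain across the trials of a block.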

**Fig. 2.**
Representations (synaptic weights) that developed in four different network configurations. (a) Posterior cortex only (no PFC) trained on all tasks. (b) PFC without the adaptive gating mechanism (all tasks). (c) Full PFC trained only on task pairs (name feature and match feature in this case). (d) Full PFC (all tasks). Each image shows the weights from the hidden units (a) or PFC (b-d) to the response layer. Larger squares correspond to units (all 30 in the PFC and a random and representative subset of 30 from the 145 hidden units in the posterior model), and the smaller squares within designate the strength of the connection (lighter = stronger) from that unit to each of the units in the response layer. Note that each row designates connections to response units representing features in the same stimulus dimension (as illustrated in e and Fig. 1). It is evident, therefore, that each of the PFC units in the full model (d) represents a single dimension and, conversely, that each dimension is represented by a distinct subset of PFC units. This pattern is less evident to almost entirely absent in the other network configurations (see text for additional analyses).

**Fig. 3.**
Generalization and learning results. (a) Crosstask generalization results (% correct on task-novel stimuli) for the full PFC network and a variety of control networks, with either only two tasks (Task Pairs) or all four tasks (All Tasks) used during training (n = 10 for each network, error bars are standard errors). Overall, the full PFC model generalizes substantially better than the other models, and this interacts with the level of training such that performance on the All Tasks condition is substantially better than the Task Pairs condition (with no differences in numbers of training trials or training stimuli). With one feature left out of training for each of four dimensions, training represented only 31.6% (324) of the total possible stimulus inputs (1,024); the ≈85% generalization performance on the remaining test items therefore represents good productive abilities. The other networks are: Posterior, a single large hidden unit layer between inputs and response, a simple model of posterior cortex without any special active maintenance abilities; P + Rec, posterior + full recurrent connectivity among hidden units, allows hidden layer to maintain information over time via attractor dynamics; P + Self, posterior + self-recurrent connections from hidden units to themselves, allows individual units to maintain activations over time; SRN, simple recurrent network, with a context layer that is a copy of the hidden layer on the prior step, a widely used form of temporal maintenance; SRN-PFC, an SRN context layer applied to the PFC layer in the full model (identical to the full PFC model except for this difference), tests for role of separated hidden layers; NoGate, the full PFC model without the AG adaptive gating unit. (b) The correlation of generalization performance with the extent to which the units distinctly and orthogonally encode stimulus dimensions for the networks shown in Fig. 2. 
This was computed by comparing each unit's pattern of weights to the set of five orthogonal, complete dimensional target patterns (i.e., the A dimension target pattern has a 1 for each A feature, and 0s for the features in all other dimensions, etc.). A numeric value between 0 and 1, where 1 represents a completely orthogonal and complete dimensional representation, was computed for each unit i (the formula itself is an image and is not reproduced in this text). In that formula, *t_k* is the dimensional target pattern k, *w_i* is the weight vector for unit i, and the two are compared by their normalized dot product (i.e., the cosine). This value was then averaged across all units in the layer and then correlated with that network's generalization performance. (c) Relative stability of PFC and hidden layer (posterior cortex) in the model, as indexed by Euclidean distance between weight states at the end of subsequent epochs (epoch = 2,000 trials). The PFC takes longer to stabilize (i.e., exhibits greater levels of weight change across epochs) than the posterior cortex. For PFC, within-PFC recurrent weights were used. For Hidden, weights from stimulus input to Hidden were used. Both sets of weights are an equivalent distance from error signals at the output layer. The learning rate is reduced at 10 epochs, producing a blip at that point.
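The quantities in this caption can be sketched in Python. This is an illustration under stated assumptions, not the paper's analysis code. The score in `dimensional_score` (largest cosine divided by the sum of cosines over the five targets) is one plausible form consistent with the description, since the original formula is not available here. The 20-element weight layout (five dimensions of four features, ignoring the “No” unit) is also an assumption. The count of 324 training stimuli is reproduced as 3^4 × 4, which matches the caption if four dimensions each lose one of their four features.

```python
import math


def cosine(u, v):
    """Normalized dot product of two equal-length vectors."""
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return sum(a * b for a, b in zip(u, v)) / (norm_u * norm_v)


def dimension_targets(n_dims=5, n_features=4):
    """Target k has 1s for the features of dimension k and 0s elsewhere."""
    targets = []
    for k in range(n_dims):
        t = [0.0] * (n_dims * n_features)
        for j in range(n_features):
            t[k * n_features + j] = 1.0
        targets.append(t)
    return targets


def dimensional_score(weights, targets):
    """ASSUMED form of the measure: best cosine over the sum of cosines."""
    sims = [abs(cosine(weights, t)) for t in targets]
    total = sum(sims)
    return max(sims) / total if total > 0 else 0.0


def weight_change(prev_weights, curr_weights):
    """Euclidean distance between weight states at two successive epochs."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(prev_weights, curr_weights)))


targets = dimension_targets()

# Training coverage: four dimensions restricted to 3 features, one kept at 4.
trained_stimuli = 3 ** 4 * 4   # 324
total_stimuli = 4 ** 5         # 1,024
coverage_percent = round(100 * trained_stimuli / total_stimuli, 1)  # 31.6
```

Under this assumed form, a unit whose weights cover exactly one dimension scores 1, while a unit with uniform weights across all five dimensions scores 0.2.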

**Fig. 4.**
Neuropsychological task results. (a) Performance of the full PFC network on a simulated Stroop task, demonstrating the classic pattern of conflict effects on the subordinate task of color naming with unaffected performance on the dominant word reading task (human data from ref. 31). This was simulated by training one dimension (A) with one-fourth the frequency of the others, making it weaker. In the neutral condition, a single feature was active, whereas the conflict condition had two features present and the dimension cue input specified which was to be named. Reaction time (RT) was measured as the number of cycles to activate a feature in the response layer >0.75 (multiplied by 35 to match human RT in msec). (b) Stroop performance for a 30% lesion (removal) of PFC units in the model (posttraining), compared with data from ref. on patients with left frontal (LF) lesions (six of eight include dorsolateral PFC) and matched controls (Ctrl) (data in seconds to complete a block of trials; model cycles were transformed as RT = cycles × 5.5 − 30 to fit this scale; the Conflict Word reading conditions were not run on the human subjects). The main effect of damage is an overall slowing of color naming, consistent with the notion that the PFC provides top-down support to this weaker pathway via abstract dimensional representations. (c) Performance in a simulated WCST task, demonstrating the classic pattern of increasing perseveration with increased PFC damage (% of units removed, posttraining). Perseverations = number of sequential productions of feature names corresponding to the previously relevant dimension after a switch. Clearly, the simulated PFC is critical for rapid flexible switching. (d) WCST results (perseverations) for the three different training conditions used by ref. (128 is the standard case plotted before, whereas 64A involves providing instructions about the relevant dimensions along which cards could be sorted, and 64B has explicit instruction when the rule changes; see supporting information for details). n = 10 networks; error bars = standard error for all graphs.
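The scoring conventions in this caption are simple enough to sketch in Python. This is a hedged illustration rather than the authors' code: the function names are hypothetical, the lesion is assumed to remove a random subset of units, and responses are assumed to be recorded as the dimension letter of each named feature.

```python
import random


def rt_msec(cycles):
    """Panel (a): model cycles scaled by 35 to approximate human RT in msec."""
    return cycles * 35


def block_time_sec(cycles):
    """Panel (b): cycles mapped to seconds per block as cycles * 5.5 - 30."""
    return cycles * 5.5 - 30


def lesion_units(n_units, fraction, rng):
    """Pick unit indices to remove (assumed random); 30% of 30 units is 9."""
    n_remove = round(n_units * fraction)
    return sorted(rng.sample(range(n_units), n_remove))


def count_perseverations(response_dims, previous_dim):
    """Count consecutive responses after a switch that still follow
    the previously relevant dimension, stopping at the first that does not."""
    count = 0
    for dim in response_dims:
        if dim != previous_dim:
            break
        count += 1
    return count


removed = lesion_units(30, 0.30, random.Random(0))
perseverations = count_perseverations(["A", "A", "B", "A"], previous_dim="A")
```

In the usage lines, the later "A" response is not counted because the definition in the caption refers to sequential productions following the switch.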
