PLoS Comput Biol. 2012;8(11):e1002774.
doi: 10.1371/journal.pcbi.1002774. Epub 2012 Nov 15.

Sensorimotor learning biases choice behavior: a learning neural field model for decision making

Christian Klaes et al. PLoS Comput Biol. 2012.

Abstract

According to a prominent view of sensorimotor processing in primates, selection and specification of possible actions are not sequential operations. Rather, a decision for an action emerges from competition between different movement plans, which are specified and selected in parallel. For action choices which are based on ambiguous sensory input, the frontoparietal sensorimotor areas are considered part of the common underlying neural substrate for selection and specification of action. These areas have been shown capable of encoding alternative spatial motor goals in parallel during movement planning, and show signatures of competitive value-based selection among these goals. Since the same network is also involved in learning sensorimotor associations, competitive action selection (decision making) should not only be driven by the sensory evidence and expected reward in favor of either action, but also by the subject's learning history of different sensorimotor associations. Previous computational models of competitive neural decision making used predefined associations between sensory input and corresponding motor output. Such hard-wiring does not allow modeling of how decisions are influenced by sensorimotor learning or by changing reward contingencies. We present a dynamic neural field model which learns arbitrary sensorimotor associations with a reward-driven Hebbian learning algorithm. We show that the model accurately simulates the dynamics of action selection with different reward contingencies, as observed in monkey cortical recordings, and that it correctly predicted the pattern of choice errors in a control experiment. With our adaptive model we demonstrate how network plasticity, which is required for association learning and adaptation to new reward contingencies, can influence choice behavior. 
The field model provides an integrated and dynamic account for the operations of sensorimotor integration, working memory and action selection required for decision making in ambiguous choice situations.
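
The abstract describes reward-driven Hebbian learning only in general terms. As a minimal sketch of that idea, assuming a plain three-factor rule (weight change proportional to reward times pre- and postsynaptic activity), one update step could look like the following. The function name, the learning rate, and the exact form of the rule are illustrative assumptions, not the rule used in the paper.

```python
def hebbian_update(weights, pre, post, reward, rate=0.1):
    """Return updated weights for one reward-modulated Hebbian step.

    weights[i][j] connects presynaptic unit j to postsynaptic unit i.
    Assumed rule (not taken from the paper):
        w[i][j] += rate * reward * post[i] * pre[j]
    """
    return [
        [w + rate * reward * post_i * pre_j for w, pre_j in zip(row, pre)]
        for row, post_i in zip(weights, post)
    ]


# Hypothetical example: two input units, two output units.
weights = [[0.0, 0.0], [0.0, 0.0]]
pre = [1.0, 0.0]    # only input unit 0 is active
post = [0.0, 1.0]   # only output unit 1 is active

rewarded = hebbian_update(weights, pre, post, reward=1.0)
unrewarded = hebbian_update(weights, pre, post, reward=0.0)
```

In this sketch only the connection between the co-active pair changes, and only when the trial is rewarded, which is the sense in which reward gates the association.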

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Figure 1. Structure of the context-dependent reach task that model and monkeys had to perform.
At the beginning of a trial, either a single spatial cue (PMG task, B) or a spatial and a contextual cue together (DMG task, A) was presented; the spatial cue is indicated by a white circle and the contextual cue by a colored rectangle. During the memory period no cue was shown. The ‘go’ signal instructed the subject to reach towards the goal, which was either at the same location as the spatial cue (direct trial; green) or at the diametrically opposite location (inferred trial; blue). In some PMG trials the contextual cue was presented at the end of the memory period (PMG-CI); in others no contextual cue was shown at all (PMG-NC) and a free choice had to be made (see Methods). In the inferred reach training task (C), a second spatial cue (target cue) was shown at the end of the memory period to indicate the rewarded goal position. This cue was gradually faded out over many trials during training.
Figure 2. Model architecture and interactions in neural fields.
(A) The model consists of four interconnected DNFs and a set of dynamic nodes. The spatial input field, motor preparation field, and motor field are one-dimensional fields that span the space of possible spatial cue/reach directions. The two-dimensional association field is defined over this directional space as well as a second dimension along which selectivity for the contextual cue develops. Its activation is shown color coded (red highest, blue lowest activation). The activation of the two context nodes is shown as a bar plot. Fixed projections between the fields are shown as white arrows; variable projections (which are subject to learning) are shown as dark red arrows with a weight matrix W. (B) Lateral interactions in DNFs, illustrated for the motor preparation field. Exogenous input from other fields (indicated by grey arrows at the bottom) locally increases activation (red). Regions of high activation produce an output signal (the soft threshold of the sigmoid output function is indicated by the dashed line), which acts on other parts of the field and is also projected to other fields of the architecture. The lateral interactions consist of local excitatory connections and surrounding inhibitory connections, which together implement a soft competition between distant field regions. This creates a selection property in the field, promoting the formation of a single peak even for multi-modal input.
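
The lateral interactions described in panel (B) can be sketched as a one-dimensional field on a ring of direction-tuned units. The sketch below assumes an Amari-style field equation, a difference-of-Gaussians kernel (narrow excitation, broader inhibition), and Euler integration. All parameter values, function names, and the two-bump input are hypothetical and not taken from the paper; whether a single peak wins the competition depends on these values.

```python
import math

RESTING_LEVEL = -2.0  # assumed resting level h of the field


def sigmoid(u, beta=4.0):
    """Soft-threshold output function; the tanh form cannot overflow."""
    return 0.5 * (1.0 + math.tanh(0.5 * beta * u))


def circular_distance(a, b, n):
    """Shortest distance between indices a and b on a ring of n units."""
    d = abs(a - b) % n
    return min(d, n - d)


def make_kernel(n, c_exc=1.0, sigma_exc=3.0, c_inh=0.5, sigma_inh=9.0):
    """Local excitation minus broader surround inhibition."""
    kernel = []
    for j in range(n):
        d = circular_distance(0, j, n)
        exc = c_exc * math.exp(-0.5 * (d / sigma_exc) ** 2)
        inh = c_inh * math.exp(-0.5 * (d / sigma_inh) ** 2)
        kernel.append(exc - inh)
    return kernel


def step(u, ext_input, kernel, tau=10.0, dt=1.0):
    """One Euler step of tau * du/dt = -u + h + input + lateral."""
    n = len(u)
    out = [sigmoid(x) for x in u]
    new_u = []
    for i in range(n):
        lateral = sum(kernel[(i - j) % n] * out[j] for j in range(n))
        du = -u[i] + RESTING_LEVEL + ext_input[i] + lateral
        new_u.append(u[i] + dt / tau * du)
    return new_u


def gaussian_input(n, center, amplitude, sigma=3.0):
    """Localized exogenous input centered on one unit."""
    return [
        amplitude * math.exp(-0.5 * (circular_distance(i, center, n) / sigma) ** 2)
        for i in range(n)
    ]


def simulate(ext_input, steps=200):
    """Relax the field from rest under constant external input."""
    kernel = make_kernel(len(ext_input))
    u = [RESTING_LEVEL] * len(ext_input)
    for _ in range(steps):
        u = step(u, ext_input, kernel)
    return u


# Hypothetical bimodal input: two bumps of unequal strength on 72 units.
n_units = 72
bimodal = [
    a + b
    for a, b in zip(gaussian_input(n_units, 18, 3.0), gaussian_input(n_units, 54, 2.5))
]
activation = simulate(bimodal, steps=20)
```

The kernel is positive near its center and negative at intermediate distances, so active regions support their neighbors and suppress more distant ones. This is the mechanism the caption refers to as soft competition.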
Figure 3. Weight changes in the model during IR training.
(A, C, E, G) Weight difference matrix from the context input nodes to the association field. The color at each point of the field indicates the difference between the weights from the inferred context input node and from the direct context input node to that point in the association field. In the untrained network, weight differences are randomly distributed around 0 without any spatial pattern (A). Over the course of IR training, distinct areas sensitive to direct or inferred context input evolve at the trained spatial positions (C, E, G). (B, D, F, H) Index shift in the projection from the association field to the motor preparation field (difference between the spatial position of a point in the association field and the position in the motor preparation field to which it projects most strongly). In the beginning, each point in a given spatial column connects preferentially to the corresponding spatial position in the motor preparation field (B, index shift = 0°). After IR training, those areas which prefer the inferred context input connect preferentially to the opposite spatial position in the motor preparation field, corresponding to an index shift of about 180° (D, F, H).
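
The index shift defined in this caption can be computed directly from a weight vector. The following is a minimal sketch, assuming that each association-field unit has one weight per motor-preparation unit and that the shift is reported in the range [0°, 360°). The function name and the example weights are hypothetical.

```python
def index_shift(source_deg, weights, target_degs):
    """Angular offset between a source unit and its strongest target.

    source_deg  : direction coded by the association-field unit
    weights     : weights from that unit to each motor-preparation unit
    target_degs : direction coded by each motor-preparation unit
    Returns the offset in degrees, wrapped to [0, 360).
    """
    strongest = max(range(len(weights)), key=weights.__getitem__)
    return (target_degs[strongest] - source_deg) % 360.0


# Hypothetical motor-preparation field sampled every 45 degrees.
directions = [0.0, 45.0, 90.0, 135.0, 180.0, 225.0, 270.0, 315.0]

# Untrained-like unit at 90 degrees: strongest weight to the same direction.
untrained = [0.1, 0.2, 0.9, 0.2, 0.1, 0.0, 0.0, 0.0]
# Trained-like unit at 90 degrees: strongest weight to the opposite direction.
trained = [0.0, 0.0, 0.1, 0.0, 0.1, 0.2, 0.9, 0.2]

shift_untrained = index_shift(90.0, untrained, directions)  # 0.0
shift_trained = index_shift(90.0, trained, directions)      # 180.0
```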
Figure 4. Generalization performance in monkey and model.
Reaches performed by monkey (black) and model (white) were analyzed when generalizing from cardinal to oblique spatial cue directions. Bars show the proportion of reaches in each direction relative to the rewarded goal (that is, 0° reaches are directed towards the correct goal; all others are failed reaches). Direct reaches to cardinal (A) and oblique (C) goals are almost always performed correctly. Inferred reaches to trained (cardinal) goals (B) are also almost always performed correctly, as expected. When inferred reaches to oblique positions were required (D), both monkey and model show a similar pattern of failed reaches, illustrated in the inset of panel (D): most reaches were made either in a previously trained cardinal direction adjacent to the goal direction (red, deviation of 45°) or in the direction of the spatial cue, meaning that a direct reach was performed (green, deviation of 180° from the goal direction).
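
The quantity plotted in these bars, the proportion of reaches at each deviation from the rewarded goal, can be sketched as follows. The sketch assumes unsigned deviations folded into [0°, 180°]; the figure itself may distinguish the sign of the deviation. The reach directions in the example are made up for illustration and are not data from the paper.

```python
from collections import Counter


def deviation(reach_deg, goal_deg):
    """Unsigned angular deviation of a reach from the goal, in [0, 180]."""
    d = abs(reach_deg - goal_deg) % 360
    return min(d, 360 - d)


def deviation_proportions(reaches, goal_deg):
    """Map each observed deviation to the fraction of reaches showing it."""
    counts = Counter(deviation(r, goal_deg) for r in reaches)
    total = len(reaches)
    return {dev: n / total for dev, n in sorted(counts.items())}


# Hypothetical inferred trial: spatial cue at 225 degrees, so the goal is 45.
# One correct reach, two reaches to adjacent cardinal directions, and one
# direct reach to the cue location.
example_reaches = [45, 0, 90, 225]
proportions = deviation_proportions(example_reaches, goal_deg=45)
# proportions == {0: 0.25, 45: 0.5, 180: 0.25}
```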
Figure 5. Origin of generalization errors in the model.
Two snapshots of the activation patterns in the model during the memory period are shown, taken from different trials that developed different movement plans due to random noise in the model. In both cases, the spatial cue was located at 225° (an oblique direction not used during training), and the blue context input indicates that an inferred reach should be performed. The model is depicted in the same form as in Figure 2. Arrows show the dominant active projections between fields that arise from the current activation patterns. Regions with pronounced preference for one context are outlined in the association field (green for direct context, white for inferred). (A) When the spatial cue was presented at the beginning of the trial (white arrow), it created an activation peak in the association field at the untrained oblique direction. This active region in the association field projects topologically to the motor preparation field, therefore preparing a reach to the spatial cue direction. This corresponds to a deviation of 180° from the goal direction, since the context cue indicates that an inferred reach should be performed. (B) If the activation peak in the association field overlaps partly with a region that is selective for the inferred context, the activation peak may shift over to that region (the figure shows an intermediate step of this shift). This is driven by the input from the context node. The region of the association field that is now active has adapted its projection to the motor preparation field during training, and induces a new activation peak in the motor preparation field around 360°. This yields a deviation of 45° from the goal location, since the model now prepares one of the trained reaches in a cardinal direction.
Figure 6. Choice behavior of monkeys and model in PMG-NC trials.
If no context instruction is given in a trial, both model and monkeys show an inherent bias towards the inferred reach after training (A). Balanced choice behavior (B) can be achieved by applying an appropriate reward schedule (BMRS).
Figure 7. Comparison of population activation in model and electrophysiological data.
Plots show the averaged and normalized field output from the motor preparation field in the model (A, C) and from electrophysiological recordings in PRR (B, D) during the PMG task. Prior to averaging and normalizing, the real and model neurons' selectivity profiles were aligned according to their preferred directions in DMG trials (PD: preferred direction, OD: opposite-to-preferred direction). The averaged and normalized activity of real neurons during the PMG task in the biased (B) and balanced (D) datasets is shown for three epochs, aligned to cue onset, ‘go’ signal, and movement onset, since the length of the epochs was variable. The model neurons were aligned accordingly even though the epochs had fixed lengths. Before a bias-minimizing reward schedule (BMRS; see Methods) was introduced, only one activation ridge is stable throughout the memory period in both the model and the recorded data (A, B). After application of the BMRS, two stable ridges with lower activation remain during the memory period (C, D).
Figure 8. Emergence of bias for inferred reaches in the DNF model.
The figure shows two snapshots of the activation patterns in the model during a single PMG trial. (A) During the memory period, after the presentation of a spatial cue, an activation peak has formed in the association field. Its position along the spatial axis reflects the direction of the spatial cue, while its location along the second dimension is unspecific and spans both context-sensitive regions (shown as outlines in the association field, green for direct, white for inferred context). The region that shows preference for the inferred context is substantially larger than the direct-context region, due to the high proportion of inferred trials during training. This region projects to the location in the motor preparation field which codes for a reach in the direction opposite to the spatial cue. The competitive interactions in the motor preparation field further amplify this stronger input that supports the inferred reach. (B) When a context signal for a direct trial is given at the end of the memory period, the context input induces a shift of the peak in the association field: It is pulled almost completely onto the region specific for the direct context with which it partly overlapped. The input to the motor preparation field changes accordingly, leading to a switch in that field's activation pattern and a stronger activation of the ‘direct’ reach direction.
Figure 9. Influence of input statistics on model behavior and activation pattern during the memory period.
(A) The behavioral bias for inferred reaches in the free-choice trials depends on the percentage of inferred trials during IR training and rises continuously in a sigmoidal fashion (logistic fit function; black curve). (B) The difference of the mean activation of the motor preparation field at the preferred and opposite-to-preferred position during the memory period shows a softer, but also approximately sigmoid increase when the number of inferred trials is increased.
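
For reference, a four-parameter logistic function of the kind commonly used for such fits can be written as below. This is only a sketch of the general functional form: the parameterization and every value in the example are hypothetical, and the fitted values from the paper are not reproduced here.

```python
import math


def logistic(x, lower, upper, midpoint, slope):
    """Sigmoid rising from `lower` to `upper`, centered on `midpoint`."""
    return lower + (upper - lower) / (1.0 + math.exp(-slope * (x - midpoint)))


# Hypothetical curve: bias (0..1) as a function of the percentage of
# inferred trials during training, centered at 50 percent.
bias_at_25 = logistic(25.0, lower=0.0, upper=1.0, midpoint=50.0, slope=0.1)
bias_at_50 = logistic(50.0, lower=0.0, upper=1.0, midpoint=50.0, slope=0.1)
bias_at_75 = logistic(75.0, lower=0.0, upper=1.0, midpoint=50.0, slope=0.1)
```

A smaller `slope` gives the more gradual rise described for panel (B), while `midpoint` sets the percentage of inferred trials at which the curve is halfway between its lower and upper bounds.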
