Front Comput Neurosci. 2017 Sep 8;11:80.
doi: 10.3389/fncom.2017.00080. eCollection 2017.

Modeling Search Behaviors during the Acquisition of Expertise in a Sequential Decision-Making Task


Cristóbal Moënne-Loccoz et al. Front Comput Neurosci. 2017.

Abstract

Our daily interaction with the world is full of situations in which we develop expertise through self-motivated repetition of the same task. In many of these interactions, and especially when dealing with computer and machine interfaces, we must handle sequences of decisions and actions. For instance, when drawing cash from an ATM, choices are presented in a step-by-step fashion and a specific sequence of choices must be performed to produce the expected outcome. But, as we become experts in the use of such interfaces, is it possible to identify specific search and learning strategies? And if so, can we use this information to predict future actions? Beyond a better understanding of the cognitive processes underlying sequential decision making, this could allow building adaptive interfaces that facilitate interaction at different moments of the learning curve. Here we tackle the problem of modeling sequential decision-making behavior in a simple human-computer interface that instantiates a 4-level binary decision tree (BDT) task. We record behavioral data from voluntary participants while they attempt to solve the task. Using a Hidden Markov Model-based approach that capitalizes on the hierarchical structure of behavior, we then model their performance during the interaction. Our results show that partitioning the problem space into a small set of hierarchically related stereotyped strategies can potentially capture a host of individual decision-making policies. This allows us to follow how participants learn and develop expertise in the use of the interface. Moreover, using a Mixture of Experts based on these stereotyped strategies, the model is able to predict the behavior of participants who master the task.
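As a rough illustration of the modeling idea described in the abstract, the Python sketch below combines simple "expert" strategies into a weighted mixture that predicts the next binary choice. The strategy definitions and the fixed weights are hypothetical placeholders, not the paper's actual strategy set or fitting procedure.

    import numpy as np

    # Minimal sketch: predict the next binary choice as a mixture of
    # "expert" strategies. Strategy names and the fixed weights are
    # illustrative only, not the paper's exact formulation.

    def random_expert(history):
        """Chance-level strategy: 50/50 over the two icons."""
        return np.array([0.5, 0.5])

    def repeat_last_expert(history):
        """Perseverative strategy: strongly favor the last chosen icon."""
        if not history:
            return np.array([0.5, 0.5])
        p = np.full(2, 0.1)
        p[history[-1]] = 0.9
        return p

    def mixture_prediction(history, experts, weights):
        """Weighted average of the experts' predictions (weights sum to 1)."""
        probs = np.array([e(history) for e in experts])
        return weights @ probs

    experts = [random_expert, repeat_last_expert]
    weights = np.array([0.3, 0.7])      # in practice, learned from past choices
    print(mixture_prediction([0, 1, 1], experts, weights))   # -> [0.22 0.78]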

Keywords: Hidden Markov Models; behavioral modeling; expertise acquisition; search strategies; sequential decision-making.


Figures

Figure 1
Schematic presentation of the BDT showing one possible instance of the task. Only one screen per level is presented, depending on the icon clicked previously. Highlighted icons along the continuous black lines represent the correct icon-to-concept mapping (see inset) that, when clicked in the correct sequence, produces positive feedback.
Figure 2
Reaction times. Evolution of trial reaction times (y-axis) grouped by task instances (x-axis) and averaged over all participants. The global RT curve fit corresponds to an exponential decay function of the form λ₁ exp(−λ₂ x^λ₃), where λ₁ = 21.6 is the starting average RT, λ₂ = 0.28, and λ₃ = 0.4. The coefficient λ₃ is necessary because the drop in RT is not as steep as it would be with λ₃ = 1 (the usual exponential decay).
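The fitted curve can be reproduced numerically from the coefficients given in the caption. The Python sketch below evaluates the stretched-exponential form and shows how one might refit it with scipy; the noisy data are synthetic and used purely for illustration (the observed RTs are not reproduced here).

    import numpy as np
    from scipy.optimize import curve_fit

    # Stretched-exponential RT fit from the caption: RT(x) = lam1 * exp(-lam2 * x**lam3)
    def rt_fit(x, lam1, lam2, lam3):
        return lam1 * np.exp(-lam2 * x**lam3)

    x = np.arange(1, 17)                        # 16 task instances
    rt_caption = rt_fit(x, 21.6, 0.28, 0.4)     # curve reported in the caption

    # Refit against noisy synthetic observations to show the procedure
    rng = np.random.default_rng(0)
    noisy = rt_caption + rng.normal(0, 0.5, size=x.size)
    params, _ = curve_fit(rt_fit, x, noisy, p0=(20.0, 0.3, 0.5))
    print(params)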
Figure 3
Framework parameter modulation. An example of a participant's discriminative strategy weight (y-axis) for each task instance (x-axis) at different τ values (see inset). The inset x-axis presents the distance in choices from the current time, and the y-axis the respective kernel weight. The curve marked by (*) represents the σ used for modeling the rest of the results presented in the section.
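The inset describes how past choices are weighted by their distance from the current time. The sketch below assumes a simple exponential recency kernel exp(−d/τ) for illustration only; the paper's exact kernel and its σ/τ parameterization are not reproduced here.

    import numpy as np

    # Illustrative recency kernel: each past choice is weighted by its
    # distance d (in choices) from the current time. Exponential form
    # assumed here; larger tau -> longer memory.
    def recency_weights(n_past, tau):
        d = np.arange(1, n_past + 1)        # distance in choices
        w = np.exp(-d / tau)
        return w / w.sum()                  # normalize to sum to 1

    for tau in (1.0, 3.0, 10.0):
        print(tau, recency_weights(5, tau).round(3))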
Figure 4
Learners' and non-learners' average strategies across tasks. (A) Right panel: characterization of the strategy models averaged over all participants (66.7%) who managed to solve the sixteen different instances of the task (x-axis). Left panel: average weights of each strategy model over the same group of participants. (B) Right panel: characterization of the strategy models averaged over all participants (33.3%) who did not fully learn the task. Left panel: average weights of each strategy model over the same group of participants. The behavioral modeling represents the averaged individual approximation (basal strategy weighting) and is shown with its standard error across tasks. The dispersion of the α curves for the basal strategies is not shown, to facilitate visualization; it can be seen in greater detail in Figure 5 below. The same interpolation scheme as in Figure 2 is used.
Figure 5
Study cases. Three representative individual cases according to task performance. (A) Highly proficient learner. (B) Less proficient learner. (C) Non-learner. The left panel of each case shows the deployment of basal strategies in terms of α over the overall duration of the experiment (x-axis). Colored dots represent the different degrees of expertise used to explore/exploit the interface knowledge (Equation 19). The right panel of each case shows the evolution of the corresponding weights for each strategy across task instances. The average of the basal strategy weights yields α* for the behavioral modeling.
Figure 6
Participants' performance, framework prediction, and 50/50 guesses averaged across tasks. (A) Results for the learners group: the leftmost panel shows the participants' average performance (y-axis), compared with the exploration/exploitation performance predicted by the behavioral modeling; the center panel presents the average number of predicted choices (y-axis) in a sequence for the behavioral model and the individual strategies; the rightmost panel presents the number of 50/50 guesses (left y-axis) for the center panel. The x-axis for all panels covers the 16 different task instances. (B) The equivalent measures of (A) for non-learners. The corresponding standard deviation of participants' performance and of the behavioral modeling is shown in gray. The standard deviation for the remaining curves is not shown, to facilitate visualization.
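For reference, the chance level implied by purely 50/50 guessing on a 4-level binary decision tree can be worked out directly, assuming independent guesses at each level and ignoring any within-trial feedback (an assumption made here only to provide a baseline):

    # Chance baseline: with independent 50/50 guesses at every level of a
    # 4-level binary tree, the probability of completing a whole trial
    # correctly is 0.5**4 = 0.0625 (6.25%).
    p_level = 0.5
    p_trial = p_level ** 4
    print(p_trial)   # 0.0625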
