2019 Jul;2019:2058-2064.

Belief dynamics extraction

Arun Kumar et al. CogSci. 2019 Jul.

Abstract

Animal behavior is not driven simply by its current observations, but is strongly influenced by internal states. Estimating the structure of these internal states is crucial for understanding the neural basis of behavior. In principle, internal states can be estimated by inverting behavior models, as in inverse model-based Reinforcement Learning. However, this requires careful parameterization and risks model mismatch to the animal. Here we take a data-driven approach to infer latent states directly from observations of behavior, using a partially observable switching semi-Markov process. This process has two elements critical for capturing animal behavior: it captures non-exponential distributions of times between observations, and its transitions between latent states depend on the animal's actions, features that would otherwise require more complex non-Markovian models to represent. To demonstrate the utility of our approach, we apply it to the observations of a simulated optimal agent performing a foraging task, and find that the latent dynamics extracted by the model correspond to the belief dynamics of the agent. Finally, we apply our model to identify latent states in the behavior of a monkey performing a foraging task, and find clusters of latent states that identify periods of time consistent with expectant waiting. This data-driven behavioral model will be valuable for inferring latent cognitive states, and thereby for measuring neural representations of those states.

Keywords: Animal behavior; Belief dynamics; Foraging; Partially observable switching semi-Markov process.
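The switching semi-Markov process described in the abstract can be illustrated with a short simulation: dwell times are drawn from a gamma distribution (hence non-exponential), and the transition matrix in force at each jump depends on the current action. Everything here (state count, parameters, and the random action policy) is an invented stand-in for illustration, not the paper's fitted model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical switching semi-Markov process: gamma-distributed dwell
# times (non-exponential) and action-dependent transition matrices.
n_states = 3
actions = ["stay", "leave"]

# One transition matrix per action (rows sum to 1); values are made up.
P = {
    "stay":  np.array([[0.0, 0.7, 0.3],
                       [0.5, 0.0, 0.5],
                       [0.6, 0.4, 0.0]]),
    "leave": np.array([[0.0, 0.2, 0.8],
                       [0.9, 0.0, 0.1],
                       [0.3, 0.7, 0.0]]),
}
# Gamma dwell-time parameters per state (shape > 1 gives a non-exponential hazard).
shape = np.array([2.0, 3.0, 1.5])
scale = np.array([1.0, 0.5, 2.0])

def simulate(T=20.0):
    """Simulate (time, state) jump events up to horizon T."""
    t, s = 0.0, 0
    trajectory = [(t, s)]
    while True:
        dwell = rng.gamma(shape[s], scale[s])  # non-exponential holding time
        t += dwell
        if t > T:
            break
        a = rng.choice(actions)                # stand-in for a behavioral policy
        s = rng.choice(n_states, p=P[a][s])    # action-dependent transition
        trajectory.append((t, s))
    return trajectory

traj = simulate()
```

The action-dependent transition matrices are what distinguish the switching variant from a plain semi-Markov jump process (compare Figure 3).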


Figures

Figure 1:
Overview: In complex natural tasks such as foraging, an animal faces a continuous stream of choices. Some of the choices pertain to hidden variables in the world, such as food availability at a given location and time. These variables determine time- and context-dependent rates for observation events and rewards. To perform well at these tasks, animals must learn these hidden rates and act upon what they have learned. Our goal is to develop a data-driven, continuous-time model for inferring an animal’s latent states and their dynamics.
Figure 2:
A discrete-state Hidden Markov Model. a: Discrete state diagram shows latent states (blue circles) and their transitions (blue lines), as well as the possible emissions from each state (red circles) with their emission probabilities (red lines). b: Directed probabilistic graphical model showing the dependence of the state variable s_{t+1} and the observation o_t on the previous state s_t. c: We present a continuous-time extension for latent states and discrete-time observations using uniformization (Rao & Teh, 2013).
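The discrete-state HMM of Figure 2 can be made concrete with a scaled forward pass that computes the log-likelihood of an observation sequence. The transition matrix A, emission matrix B, and initial distribution pi below are illustrative values, not taken from the paper.

```python
import numpy as np

# Minimal forward-algorithm sketch for a two-state discrete HMM;
# all parameter values are made up for illustration.
A  = np.array([[0.9, 0.1],
               [0.2, 0.8]])          # P(s_{t+1} | s_t)
B  = np.array([[0.8, 0.2],
               [0.3, 0.7]])          # P(o_t | s_t)
pi = np.array([0.5, 0.5])            # initial state distribution

def log_likelihood(obs):
    """Scaled forward pass: log P(o_1..o_T) under the HMM."""
    alpha = pi * B[:, obs[0]]
    ll = np.log(alpha.sum())
    alpha /= alpha.sum()              # rescale to avoid underflow
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]
        ll += np.log(alpha.sum())
        alpha /= alpha.sum()
    return ll
```

This held-out log-likelihood is the quantity the paper uses for model selection in Figures 5, 6a, and 7b.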
Figure 3:
Comparison of graphical models of behavior. Left: In the Belief MDP, belief transitions depend on actions selected by a policy. Center: Transitions in the Semi-Markov Jump Process are independent of actions. Right: The Switching SMJP allows transition rates to depend on actions.
Figure 4:
Overview of the algorithm.
Figure 5:
The model explains simulated test data, and the log-likelihood on held-out data starts flattening out at the true number of states.
Figure 6:
Latent states inferred by the SMJP for an optimal agent implementing a POMDP. (a) Log-likelihood on held-out data provides an estimate of the required number of latent states. (b) Co-clustering of states in the POMDP and our SMJP, based on the conditional probability of observing each POMDP state Z from each SMJP state, P(Z|s,obs). The POMDP states Z are depicted below the horizontal axis. Clustered structure in the plot reveals that the SMJP states carry information about the agent's belief dynamics.
Figure 7:
Analyzing behavioral data from a freely moving monkey using the SMJP. (a) Overhead video (background image) tracked the locations and normalized velocities (vectors) of the monkey; these data were then clustered with the k-means algorithm. (b) Log-likelihood on held-out data provides an estimate of the required number of latent states. (c) SMJP model of the observed monkey behavior for the action stay. The highlighted reward-expectant waiting states illustrate that the latent states are useful as regressors for belief dynamics when interpreting the monkey's behavior. (d) Subspaces p and q (blue and red dotted), within the subgraphs (green and gray highlighted) for the joint operators T31 and T13, reveal persistent reward belief states.
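The preprocessing described in panel (a), clustering tracked positions and normalized velocities with k-means, can be sketched as follows. The data and cluster count are synthetic stand-ins, and a plain Lloyd's-algorithm implementation is used to keep the example self-contained.

```python
import numpy as np

rng = np.random.default_rng(1)

def kmeans(X, k, n_iter=50):
    """Plain k-means (Lloyd's algorithm): returns labels and centroids."""
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign each point to its nearest centroid.
        d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # Move each centroid to the mean of its assigned points.
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = X[labels == j].mean(axis=0)
    return labels, centroids

# Synthetic features: (x, y, vx, vy) with velocities normalized to unit length,
# mirroring the tracked locations and normalized velocity vectors in Figure 7a.
pos = rng.uniform(0, 1, size=(200, 2))
vel = rng.normal(size=(200, 2))
vel /= np.linalg.norm(vel, axis=1, keepdims=True)
X = np.hstack([pos, vel])
labels, centroids = kmeans(X, k=4)
```

The resulting cluster labels play the role of the discrete observations fed to the SMJP.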

