Expanding the Active Inference Landscape: More Intrinsic Motivations in the Perception-Action Loop

Martin Biehl et al. Front Neurorobot. 2018 Aug 30;12:45. doi: 10.3389/fnbot.2018.00045. eCollection 2018.

Abstract

Active inference is an ambitious theory that treats perception, inference, and action selection of autonomous agents under the heading of a single principle. It suggests biologically plausible explanations for many cognitive phenomena, including consciousness. In active inference, action selection is driven by an objective function that evaluates possible future actions with respect to current, inferred beliefs about the world. At its core, active inference is independent of extrinsic rewards, resulting in a high level of robustness across, e.g., different environments or agent morphologies. In the literature, paradigms that share this independence have been summarized under the notion of intrinsic motivations. In general, and in contrast to active inference, these models of motivation come without a commitment to particular inference and action selection mechanisms. In this article, we study whether the inference and action selection machinery of active inference can also be used by alternatives to the originally included intrinsic motivation. The perception-action loop explicitly relates inference and action selection to the environment and agent memory, and is consequently used as the foundation of our analysis. We reconstruct the active inference approach, locate the original formulation within it, and show how alternative intrinsic motivations can be used while keeping many of the original features intact. Furthermore, we illustrate the connection to universal reinforcement learning by means of our formalism. Active inference research may profit from comparisons of the dynamics induced by alternative intrinsic motivations. Research on intrinsic motivations may profit from an additional way to implement intrinsically motivated agents that also shares the biological plausibility of active inference.

Keywords: active inference; empowerment; free energy principle; intrinsic motivation; perception-action loop; predictive information; universal reinforcement learning; variational inference.


Figures

Figure 1
First two time steps of the Bayesian network representing the perception-action loop (PA-loop). All subsequent time steps are identical to the one from time t = 1 to t = 2.
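The repeating structure of the PA-loop described in this caption can be illustrated by sampling from it. The following sketch assumes the conventional factorisation (environment state E_t emits a sensor value S_t; the agent updates a memory M_t from S_t and emits an action A_t; E_t and A_t determine E_{t+1}); the binary state space, the specific dynamics, and all function names are illustrative assumptions, not taken from the paper.

```python
import random

def env_step(e, a):
    """Environment transition p(e_{t+1} | e_t, a_t): the action toggles the state."""
    return e ^ a  # XOR of two binary values

def sense(e):
    """Sensor channel p(s_t | e_t): a noisy copy of the environment state."""
    return e if random.random() < 0.9 else 1 - e

def agent(m, s):
    """Memory update and action selection: remember the last sensor value, act on it."""
    m_next = s   # new memory is the latest observation
    a = m_next   # toy policy: toggle whenever a 1 is observed
    return m_next, a

def rollout(T=5, seed=0):
    """Sample one trajectory of the loop; every time step has the same structure."""
    random.seed(seed)
    e, m = 1, 0
    trace = []
    for t in range(T):
        s = sense(e)
        m, a = agent(m, s)
        trace.append((t, e, s, a))
        e = env_step(e, a)
    return trace

for t, e, s, a in rollout():
    print(f"t={t}: E={e} S={s} A={a}")
```

After the first transition, each step applies the same kernels, mirroring the caption's remark that all time steps after t = 1 are identical in structure.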
Figure 2
Bayesian network of the generative model with parameters Θ = (Θ1, Θ2, Θ3) and hyperparameters Ξ = (Ξ1, Ξ2, Ξ3). Hatted variables are models/estimates of their non-hatted counterparts in the perception-action loop in Figure 1. An edge that splits up to connect one node to n nodes (e.g., Θ2 to Ê1, Ê2, …) corresponds, under the usual Bayesian network convention, to n edges from that node to each of the targets. Note that, in contrast to the perception-action loop in Figure 1, imagined actions Ât have no parents. They are either set to past values or, for those in the future, a probability distribution over them must be assumed.
Figure 3
Internal generative model with data plugged in up to t = 2, i.e., Ŝ0 = s0, Ŝ1 = s1 and Â1 = a1, as well as the henceforth fixed hyperparameters ξ = (ξ1, ξ2, ξ3). Conditioning on the plugged-in data leads to the posterior distribution q(ŝt:T, ê0:T, ât:T, θ | sa≺t, ξ). Predictions for future sensor values can be obtained by marginalising out the other random variables; e.g., to predict Ŝ2 we would like to obtain q(ŝ2 | s0, s1, a1, ξ). Note, however, that this requires an assumption about the probability distribution over Â2.
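The marginalisation step in this caption can be shown on a toy model: to predict the next sensor value, sum the joint over every other variable, which forces an assumed distribution over the imagined action. The tiny binary model below (one hidden state Ê2, one assumed action Â2, one predicted sensor Ŝ2) and all of its numbers are made-up illustrations, not values from the paper.

```python
from itertools import product

p_e = {0: 0.7, 1: 0.3}   # belief over the hidden state Ê2
p_a = {0: 0.5, 1: 0.5}   # assumed distribution over the imagined action Â2

def p_s(s, e, a):
    """Toy sensor model p(ŝ2 | ê2, â2): sensing is sharper when action matches state."""
    match_prob = 0.9 if e == a else 0.6
    return match_prob if s == e else 1 - match_prob

# q(ŝ2) = Σ_{ê2, â2} p(ŝ2 | ê2, â2) p(ê2) p(â2)
q_s = {s: sum(p_s(s, e, a) * p_e[e] * p_a[a]
              for e, a in product((0, 1), (0, 1)))
       for s in (0, 1)}
print(q_s)  # a normalised predictive distribution over Ŝ2
```

Changing `p_a` changes the prediction, which is exactly why the caption notes that a distribution over Â2 must be assumed before Ŝ2 can be predicted.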
Figure 4
Bayesian network of the approximate posterior factor at t = 2. The variational parameters Φ1, Φ2, Φ3, and ΦE≺t = (ΦE0, ΦE1) are positioned so as to indicate which dependencies and nodes they replace in the generative model in Figure 2.
Figure 5
Bayesian network of the approximate complete posterior of Equation (40) at t = 2 for the future actions ât:T. Only Êt-1, Θ1, Θ2, and the future actions ât:T appear in the predictive factor and influence future variables. In general there is one approximate complete posterior for each possible sequence ât:T of future actions.
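The construction in this caption, one predictive distribution per candidate future action sequence, can be sketched by enumeration. The binary actions, the horizon, and the linear toy dynamics below are illustrative assumptions; only the one-posterior-per-sequence structure is taken from the caption.

```python
from itertools import product

T = 3  # remaining horizon: three future binary actions

def predictive(seq):
    """Toy predictive distribution over a final binary sensor value,
    conditioned on one fixed sequence of future actions."""
    p1 = 0.5
    for a in seq:
        p1 = 0.8 * p1 + 0.2 * a  # made-up dynamics: each action nudges the belief
    return {0: 1 - p1, 1: p1}

# One distribution per possible action sequence â_{t:T}.
posteriors = {seq: predictive(seq) for seq in product((0, 1), repeat=T)}
print(len(posteriors))  # 2**3 = 8 candidate sequences, one predictive posterior each
```

An action-selection objective would then score each of these distributions and pick (or weight) the corresponding sequence.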
Figure 6
Generative model including q(ât:T | sa≺t, ξ) at t = 2, with ŜÂ≺2 influencing the future actions Â2:T. Note that only future actions depend on past sensor values and actions; e.g., action Â1 has no incoming edges. The increased gap between time steps t = 1 and t = 2 indicates that this time step is special in the model. For each time step t there is a corresponding model with the particular relation between the past ŜÂ≺t and Ât:T shifted accordingly.
