Review

Probabilistic Models with Deep Neural Networks

Andrés R Masegosa et al. Entropy (Basel). 2021 Jan 18;23(1):117. doi: 10.3390/e23010117.

Abstract

Recent advances in statistical inference have significantly expanded the toolbox of probabilistic modeling. Historically, probabilistic modeling has been constrained to very restricted model classes, where exact or approximate probabilistic inference is feasible. However, developments in variational inference, a general form of approximate probabilistic inference that originated in statistical physics, have enabled probabilistic modeling to overcome these limitations: (i) Approximate probabilistic inference is now possible over a broad class of probabilistic models containing a large number of parameters, and (ii) scalable inference methods based on stochastic gradient descent and distributed computing engines allow probabilistic modeling to be applied to massive data sets. One important practical consequence of these advances is the possibility of including deep neural networks within probabilistic models, thereby capturing complex non-linear stochastic relationships between the random variables. These advances, in conjunction with the release of novel probabilistic modeling toolboxes, have greatly expanded the scope of applications of probabilistic models, and allowed the models to take advantage of the recent strides made by the deep learning community. In this paper, we provide an overview of the main concepts, methods, and tools needed to use deep neural networks within a probabilistic modeling framework.

Keywords: Bayesian learning; deep probabilistic modeling; latent variable models; neural networks; variational inference.
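
As a concrete illustration of the abstract's central idea — a deep neural network supplying the conditional distribution of a probabilistic model — the following minimal sketch draws one sample from a toy generative model p(z)p(x|z) whose likelihood mean is a neural network. The network f_theta, the layer sizes, and the noise scale are illustrative assumptions, not details taken from the paper.

    # Minimal sketch of a probabilistic model whose conditional p(x | z) is
    # parameterized by a neural network (all sizes and weights are illustrative).
    import numpy as np

    rng = np.random.default_rng(0)

    # Hypothetical network weights: latent dim 2 -> hidden 100 -> observed dim 5.
    W1, b1 = rng.normal(size=(100, 2)), np.zeros(100)
    W2, b2 = rng.normal(size=(5, 100)), np.zeros(5)

    def f_theta(z):
        """DNN mapping a latent code z to the mean of p(x | z)."""
        h = np.maximum(0.0, W1 @ z + b1)       # ReLU hidden layer
        return W2 @ h + b2

    # Ancestral sampling from the joint model p(z) p(x | z):
    z = rng.normal(size=2)                      # prior p(z) = N(0, I)
    x = rng.normal(loc=f_theta(z), scale=0.1)   # likelihood p(x | z) = N(f_theta(z), 0.1^2 I)
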


Conflict of interest statement

The authors declare no conflict of interest.

Figures

Figure 1
Structure of the probabilistic model examined in this paper, defined for a sample of size N.
Figure 2
Two-dimensional latent representations resulting from applying probabilistic PCA to: (Left) the iris dataset [48] and (Right) a subset of 1000 instances from the MNIST dataset [49] corresponding to the handwritten digits 1, 2, and 3.
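
As a rough companion to this figure, the sketch below projects the iris data onto a two-dimensional latent space using scikit-learn's PCA, whose principal subspace coincides with the maximum-likelihood probabilistic PCA solution; the paper's exact model, preprocessing, and plotting choices are not reproduced here.

    # Two-dimensional latent representation of the iris data, in the spirit of
    # Figure 2 (left); standard PCA is used as a stand-in for probabilistic PCA.
    from sklearn.datasets import load_iris
    from sklearn.decomposition import PCA
    import matplotlib.pyplot as plt

    X, y = load_iris(return_X_y=True)
    Z = PCA(n_components=2).fit_transform(X)    # 2-D latent coordinates

    plt.scatter(Z[:, 0], Z[:, 1], c=y)          # color points by species
    plt.xlabel("latent dim 1")
    plt.ylabel("latent dim 2")
    plt.show()
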
Figure 3
Example of a simple computational graph. Square nodes denote operations, and the rest are input nodes. This computational graph encodes the operation f = 3·w + 10, where w is a variable with respect to which we can differentiate.
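
The graph in this figure can be reproduced with any automatic differentiation framework. A sketch in PyTorch (chosen only for illustration; the paper is framework-agnostic) is:

    # The computational graph of Figure 3, built and differentiated with autodiff.
    import torch

    w = torch.tensor(2.0, requires_grad=True)  # input variable we differentiate w.r.t.
    f = 3.0 * w + 10.0                         # operation nodes: multiply, then add
    f.backward()                               # reverse-mode traversal of the graph
    print(f.item(), w.grad.item())             # -> 16.0 and df/dw = 3.0
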
Figure 4
Example of a simple computational graph encoding a neural network with two hidden layers and the squared loss function. Note that each operation node encapsulates a part of the CG encoding the associated operations; for the sake of simplicity, we do not expand the whole CG.
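
A sketch of such a graph, with illustrative layer sizes and data, again using PyTorch: the squared loss becomes one more node of the graph, and reverse-mode differentiation yields gradients for every weight.

    # A two-hidden-layer network whose squared loss is part of the computational graph.
    import torch
    import torch.nn as nn

    net = nn.Sequential(
        nn.Linear(4, 32), nn.ReLU(),   # hidden layer 1
        nn.Linear(32, 32), nn.ReLU(),  # hidden layer 2
        nn.Linear(32, 1),              # output layer
    )
    x, y = torch.randn(8, 4), torch.randn(8, 1)
    loss = ((net(x) - y) ** 2).mean()  # squared loss node
    loss.backward()                    # gradients for every weight in the graph
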
Figure 5
Two-dimensional latent representations of the MNIST dataset resulting from applying: (Left) a standard probabilistic PCA (reproduced from Figure 2 to ease comparison), and (Right) a non-linear probabilistic PCA with an ANN containing a hidden layer of size 100 with a ReLU activation function.
Figure 6
(Left) A stochastic computational graph encoding the function h = E_Z[(Z − 5)^2], where Z ∼ N(μ, 1). (Right) Computational graph processing k samples from Z and producing ĥ, an estimate of E_Z[(Z − 5)^2].
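
The right-hand graph translates almost line by line into code: draw k samples from N(μ, 1), average (Z − 5)^2, and, by writing the samples in reparameterized form, let automatic differentiation also estimate dh/dμ. The values k = 1000 and μ = 0 below are illustrative.

    # Monte Carlo estimate of h = E_Z[(Z - 5)^2] with Z ~ N(mu, 1), as in Figure 6.
    import torch

    k = 1000
    mu = torch.tensor(0.0, requires_grad=True)
    z = mu + torch.randn(k)              # reparameterized samples from N(mu, 1)
    h_hat = ((z - 5.0) ** 2).mean()      # Monte Carlo estimate of the expectation
    h_hat.backward()
    print(h_hat.item(), mu.grad.item())  # roughly 26, and roughly 2*(mu - 5) = -10
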
Figure 7
The top part depicts a probabilistic graphical model using plate notation [8]. The lower part depicts an abstract representation of a stochastic computational graph encoding the model, where the relation between z and x is defined by a DNN with L+1 layers. See Section 4 for details.
Figure 8
SCG representing the ELBO function L(ν). r is distributed according to the variational distribution, r ∼ q(r|ν).
Figure 9
Reparameterized SCG representing the ELBO function L(ν).
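
A sketch of a single-sample reparameterized ELBO estimate of the kind depicted in Figures 8 and 9, for a toy model with a standard normal prior, a DNN likelihood mean, and a fully factorized Gaussian q(z | ν). All dimensions, the names decoder, q_mean, and q_logstd, and the likelihood noise scale are illustrative assumptions, not the paper's exact setup.

    # Single-sample reparameterized ELBO estimate for a toy latent variable model.
    import torch
    import torch.nn as nn
    from torch.distributions import Normal

    decoder = nn.Sequential(nn.Linear(2, 100), nn.ReLU(), nn.Linear(100, 5))
    x = torch.randn(5)                                     # one observation

    q_mean = torch.zeros(2, requires_grad=True)            # variational parameters nu
    q_logstd = torch.zeros(2, requires_grad=True)

    eps = torch.randn(2)                                   # noise independent of nu
    z = q_mean + eps * q_logstd.exp()                      # reparameterized sample of z

    log_p = (Normal(0.0, 1.0).log_prob(z).sum()            # log p(z)
             + Normal(decoder(z), 0.1).log_prob(x).sum())  # log p(x | z)
    log_q = Normal(q_mean, q_logstd.exp()).log_prob(z).sum()
    elbo = log_p - log_q                                   # single-sample ELBO estimate
    elbo.backward()                                        # gradients w.r.t. nu (and decoder)
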
Figure 10
Two-dimensional latent representation of the MNIST dataset resulting from applying: (Left) a non-linear probabilistic PCA, and (Right) a VAE. The ANNs of the non-linear PCA and those defining the VAE’s decoder and encoder each contain a single hidden layer of size 100.
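
A minimal VAE matching the caption's description — encoder and decoder with a single hidden layer of size 100 — might look as follows. The MNIST-like input dimension of 784 and the two-dimensional latent space are assumptions for illustration; training (maximizing the ELBO over a dataset) is omitted.

    # Minimal VAE sketch: the encoder amortizes the variational parameters of q(z | x).
    import torch
    import torch.nn as nn

    class VAE(nn.Module):
        def __init__(self, x_dim=784, z_dim=2, hidden=100):
            super().__init__()
            self.enc = nn.Sequential(nn.Linear(x_dim, hidden), nn.ReLU(),
                                     nn.Linear(hidden, 2 * z_dim))  # mean and log-std of q(z|x)
            self.dec = nn.Sequential(nn.Linear(z_dim, hidden), nn.ReLU(),
                                     nn.Linear(hidden, x_dim))      # mean of p(x|z)

        def forward(self, x):
            mean, logstd = self.enc(x).chunk(2, dim=-1)
            z = mean + torch.randn_like(mean) * logstd.exp()        # reparameterized sample
            return self.dec(z), mean, logstd

    x = torch.rand(16, 784)                  # a batch of flattened images
    recon, mean, logstd = VAE()(x)           # the 2-D latent codes live in `mean`
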

References

    1. Pearl J. Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann Publishers; San Mateo, CA, USA: 1988.
    2. Lauritzen S.L. Propagation of probabilities, means, and variances in mixed graphical association models. J. Am. Stat. Assoc. 1992;87:1098–1108. doi: 10.1080/01621459.1992.10476265.
    3. Russell S.J., Norvig P. Artificial Intelligence: A Modern Approach. Pearson; Upper Saddle River, NJ, USA: 2016.
    4. Hastie T., Tibshirani R., Friedman J. The Elements of Statistical Learning. Springer; Berlin/Heidelberg, Germany: 2001.
    5. Bishop C.M. Pattern Recognition and Machine Learning. Springer; Berlin/Heidelberg, Germany: 2006.
