Proc Int Conf Mach Learn. 2015 Jul;37:1992-2001.

Entropic Graph-based Posterior Regularization

Maxwell W Libbrecht et al. Proc Int Conf Mach Learn. 2015 Jul.

Abstract

Graph smoothness objectives have achieved great success in semi-supervised learning but have not yet been applied extensively to unsupervised generative models. We define a new class of entropic graph-based posterior regularizers that augment a probabilistic model by encouraging pairs of nearby variables in a regularization graph to have similar posterior distributions. We present a three-way alternating optimization algorithm with closed-form updates for performing inference on this joint model and learning its parameters. This method admits updates linear in the degree of the regularization graph, exhibits monotone convergence, and is easily parallelizable. We are motivated by applications in computational biology in which temporal models such as hidden Markov models are used to learn a human-interpretable representation of genomic data. On a synthetic problem, we show that our method outperforms existing methods for graph-based regularization and a comparable strategy for incorporating long-range interactions using existing methods for approximate inference. Using genome-scale functional genomics data, we integrate genome 3D interaction data into existing models for genome annotation and demonstrate significant improvements in predicting genomic activity.
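
To make the idea of encouraging nearby variables to have similar posteriors concrete, here is a minimal sketch of a graph-based penalty over discrete posterior distributions. It assumes the penalty is a weighted sum of KL divergences across the edges of the regularization graph; the abstract does not state the exact form, so this choice, along with the names `kl_divergence`, `graph_penalty`, and the toy data, is illustrative only and is not the authors' implementation.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions given as equal-length lists.

    eps guards against log(0); it is a numerical convenience, not part of
    any definition in the source.
    """
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


def graph_penalty(posteriors, edges):
    """Assumed penalty: sum of weight * KL(q_u || q_v) over edges (u, v, weight)."""
    return sum(
        w * kl_divergence(posteriors[u], posteriors[v]) for u, v, w in edges
    )


# Hypothetical posteriors over two labels for three variables.
posteriors = {
    "a": [0.9, 0.1],
    "b": [0.8, 0.2],
    "c": [0.1, 0.9],
}

# Linking variables with similar posteriors costs little;
# linking variables with very different posteriors costs much more.
similar = graph_penalty(posteriors, [("a", "b", 1.0)])
dissimilar = graph_penalty(posteriors, [("a", "c", 1.0)])
print(similar < dissimilar)
```

A regularizer of this kind would be added to the model's objective so that inference trades off fit to the data against agreement along graph edges. The alternating optimization described in the abstract is not shown here.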

Figures

Figure 1.
Illustration of optimization algorithm. Solid ovals denote closed-form update steps. Dashed ovals with dotted expansion lines denote updates that are implemented by alternating optimization. Pairs of opposing arrows indicate alternating optimization implemented by iterating each update to convergence. See the extended version (Libbrecht et al., 2015) for the full algorithm.
Figure 2.
Using EGPR to learn ambiguous clusters. Shape denotes true class, color denotes predicted class, and colored arrows denote cluster means as they evolve between iterations of EM.
Figure 3.
Comparison of EGPR with related inference methods. The X axis shows σ, a hyperparameter controlling the difficulty of inference. The Y axis shows the average accuracy over 200 simulations of MAP inference on the model in question (95% Wilcoxon test confidence intervals).
Figure 4.
Synthetic model of genome spatial interactions. Color and labeled division lines indicate learned labels along the hypothesized 501 bp genome. Large filled circles indicate observed positions. Dotted lines indicate EGPR edges.
Figure 5.
(a) Strategy for utilizing physical interaction information. (b) Improvement in RMSE over chain model for EGPR and SQGPR for 29 experiments. (c) Relative improvement in RMSE between EGPR and SQGPR for each of 29 experiments.
