JMLR Workshop Conf Proc. 2015 Jul:37:1992-2001.

Entropic Graph-based Posterior Regularization

Maxwell W Libbrecht¹, Michael M Hoffman², Jeffrey A Bilmes³, William S Noble¹

Affiliations

¹ Genome Sciences, Box 355065, Foege Building, S220B, 3720 15th Ave NE, Seattle, WA 98195-5065.
² Princess Margaret Cancer Centre, Toronto Medical Discovery Tower 11-311, 101 College St, Toronto, ON M5G 1L7.
³ Department of Electrical Engineering, University of Washington, Seattle, Box 352500, Seattle, WA 98195-2500.

PMID: 39483441
PMCID: PMC11526501

Maxwell W Libbrecht et al. JMLR Workshop Conf Proc. 2015 Jul.

Abstract

Graph smoothness objectives have achieved great success in semi-supervised learning but have not yet been applied extensively to unsupervised generative models. We define a new class of entropic graph-based posterior regularizers that augment a probabilistic model by encouraging pairs of nearby variables in a regularization graph to have similar posterior distributions. We present a three-way alternating optimization algorithm with closed-form updates for performing inference on this joint model and learning its parameters. This method admits updates linear in the degree of the regularization graph, exhibits monotone convergence, and is easily parallelizable. We are motivated by applications in computational biology in which temporal models such as hidden Markov models are used to learn a human-interpretable representation of genomic data. On a synthetic problem, we show that our method outperforms existing methods for graph-based regularization and a comparable strategy for incorporating long-range interactions using existing methods for approximate inference. Using genome-scale functional genomics data, we integrate genome 3D interaction data into existing models for genome annotation and demonstrate significant improvements in predicting genomic activity.
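As a rough illustration of the idea of encouraging neighboring variables in a regularization graph to have similar posterior distributions, the following is a minimal sketch, not the authors' implementation. It assumes the penalty is an edge-weighted sum of KL divergences between the posterior marginals of connected nodes; the abstract does not state the exact form of the regularizer, and the names `kl_divergence` and `graph_kl_penalty` and the toy posteriors are hypothetical.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions given as equal-length lists.

    eps is a small smoothing constant (an assumption of this sketch) that
    avoids division by zero when q has zero entries.
    """
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


def graph_kl_penalty(posteriors, edges):
    """Edge-weighted sum of KL divergences between neighboring posteriors.

    posteriors: dict mapping node id -> list of label probabilities
    edges: iterable of (u, v, weight) tuples from the regularization graph
    """
    return sum(
        w * kl_divergence(posteriors[u], posteriors[v]) for u, v, w in edges
    )


# Toy posteriors over two labels for three variables (illustrative values).
posteriors = {
    0: [0.9, 0.1],
    1: [0.8, 0.2],
    2: [0.1, 0.9],
}

# Nodes 0 and 1 nearly agree; nodes 0 and 2 disagree strongly.
similar = graph_kl_penalty(posteriors, [(0, 1, 1.0)])
dissimilar = graph_kl_penalty(posteriors, [(0, 2, 1.0)])
```

Under this assumed form, an edge between variables with similar posteriors contributes a small penalty and an edge between variables with conflicting posteriors contributes a large one, so minimizing the penalty alongside the model's likelihood pulls connected variables toward the same label.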

Figures

**Figure 1.**
Illustration of optimization algorithm. Solid ovals denote closed-form update steps. Dashed ovals with dotted expansion lines denote updates that are implemented by alternating optimization. Pairs of opposing arrows indicate alternating optimization implemented by iterating each update to convergence. See the extended version (Libbrecht et al., 2015) for the full algorithm.

**Figure 2.**
Using EGPR to learn ambiguous clusters. Shape denotes true class, color denotes predicted class, and colored arrows denote cluster means as they evolve between iterations of EM.

**Figure 3.**
Comparison of EGPR with related inference methods. The X axis shows $\sigma$, a hyperparameter controlling the difficulty of inference. The Y axis shows the average accuracy over 200 simulations of MAP inference on the model in question (95% Wilcoxon test confidence intervals).

**Figure 4.**
Synthetic model of genome spatial interactions. Color and labeled division lines indicate learned labels along the hypothesized 501 bp genome. Large filled circles indicate observed positions. Dotted lines indicate EGPR edges.

**Figure 5.**
(a) Strategy for utilizing physical interaction information. (b) Improvement in RMSE over chain model for EGPR and SQGPR for 29 experiments. (c) Relative improvement in RMSE between EGPR and SQGPR for each of 29 experiments.
