Abstract:
Bayesian learning in undirected graphical models, that is, computing posterior distributions over parameters and predictive quantities, is exceptionally difficult. We conjecture that for general undirected models, there are no tractable MCMC (Markov chain Monte Carlo) schemes giving the correct equilibrium distribution over parameters. While this intractability, due to the partition function, is familiar to those performing parameter optimisation, Bayesian learning of posterior distributions over undirected model parameters has been unexplored and poses novel challenges. We propose several approximate MCMC schemes and test them on fully observed binary models (Boltzmann machines) for a small coronary heart disease data set and larger artificial systems. While approximations must perform well on the model, their interaction with the sampling scheme is also important. Samplers based on variational mean-field approximations generally performed poorly, whereas more advanced methods using loopy propagation, brief sampling and stochastic dynamics led to acceptable parameter posteriors. Finally, we demonstrate these techniques on a Markov random field with hidden variables.
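To make the role of the partition function concrete, the following is a minimal sketch, not the paper's method. It assumes a fully observed Boltzmann machine over {0, 1} variables with pairwise weights `W` (upper triangle used) and biases `b`; this parameterisation and all function names are illustrative assumptions. The exact log-likelihood requires log Z, computed here by enumerating all 2^n states, which is feasible only for very small n. Any Metropolis step that compares two parameter settings needs this quantity for both of them, which is why approximations are required for larger models.

```python
import itertools
import math


def unnormalised_log_prob(x, W, b):
    """Log of the unnormalised probability of binary state x (assumed form)."""
    n = len(x)
    pair = sum(W[i][j] * x[i] * x[j] for i in range(n) for j in range(i + 1, n))
    bias = sum(b[i] * x[i] for i in range(n))
    return pair + bias


def log_partition(W, b):
    """Exact log Z by brute-force enumeration of all 2**n states."""
    n = len(b)
    scores = [unnormalised_log_prob(x, W, b)
              for x in itertools.product((0, 1), repeat=n)]
    m = max(scores)  # log-sum-exp shift for numerical stability
    return m + math.log(sum(math.exp(s - m) for s in scores))


def log_likelihood(data, W, b):
    """Exact log-likelihood of fully observed data; cost grows as 2**n."""
    log_z = log_partition(W, b)
    return sum(unnormalised_log_prob(x, W, b) - log_z for x in data)
```

The enumeration in `log_partition` is the step that becomes intractable as the number of variables grows; approximate schemes replace it, or the expectations derived from it, with cheaper estimates.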