This roadmap gives the background for my NIPS 2013 paper, "Annealing between distributions by averaging moments." It covers the basic machine learning concepts that the paper depends on, and should be sufficient for understanding the paper at a conceptual level.
Section 2: Estimating partition functions
- the partition function estimation problem (see inference in Markov random fields)
- annealed importance sampling (AIS), and its use in estimating partition functions
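To make the two items above concrete, here is a minimal sketch of AIS estimating a partition function on a toy 1-D problem. Everything specific in it is an assumption chosen for illustration and is not taken from the paper: the standard normal initial distribution, the unnormalized Gaussian target, the linearly spaced geometric-averages schedule, the random-walk Metropolis transitions, and the names `log_f0`, `log_f1`, and `ais_log_z`.

```python
import numpy as np


def log_f0(x):
    # Initial distribution: a normalized standard normal, so Z_0 = 1.
    return -0.5 * x**2 - 0.5 * np.log(2 * np.pi)


def log_f1(x):
    # Unnormalized toy target exp(-(x - 1)^2 / (2 * 0.5^2)).
    # Its true partition function is Z_1 = 0.5 * sqrt(2 * pi).
    return -0.5 * ((x - 1.0) / 0.5) ** 2


def ais_log_z(num_chains=2000, num_steps=200, step_size=0.5, seed=0):
    """Estimate log(Z_1 / Z_0) with AIS along a geometric-averages path."""
    rng = np.random.default_rng(seed)
    betas = np.linspace(0.0, 1.0, num_steps + 1)
    x = rng.standard_normal(num_chains)  # exact samples from p_0
    log_w = np.zeros(num_chains)

    for b_prev, b in zip(betas[:-1], betas[1:]):
        # Importance weight update: f_b(x) / f_{b_prev}(x), where
        # f_b = f0^(1 - b) * f1^b is the intermediate distribution.
        log_w += (b - b_prev) * (log_f1(x) - log_f0(x))

        def log_fb(y):
            return (1.0 - b) * log_f0(y) + b * log_f1(y)

        # One random-walk Metropolis step that leaves p_b invariant.
        proposal = x + step_size * rng.standard_normal(num_chains)
        log_u = np.log(rng.random(num_chains))
        accept = log_u < log_fb(proposal) - log_fb(x)
        x = np.where(accept, proposal, x)

    # The average weight is an unbiased estimate of Z_1 / Z_0;
    # take its log with the usual max-subtraction for stability.
    m = log_w.max()
    return m + np.log(np.mean(np.exp(log_w - m)))


if __name__ == "__main__":
    true_log_z = np.log(0.5 * np.sqrt(2 * np.pi))
    print("AIS estimate of log Z:", ais_log_z())
    print("true log Z:           ", true_log_z)
```

Because Z_0 = 1 here, the returned value estimates log Z_1 directly. In realistic models such as RBMs the target cannot be normalized analytically, which is why an estimator of this kind is needed.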
Section 3: Analyzing AIS paths
Note: equation (5) gives the path length on a Riemannian manifold whose metric is the Fisher information. This manifold is the fundamental object of information geometry. The most relevant resource is probably chapters 2 and 3 of Amari and Nagaoka's Methods of Information Geometry. This background is very useful for thinking about the AIS paths, but it is fairly involved and is not needed to understand the paper.
Section 4: Moment averaging
- exponential families
- multivariate Gaussians, and their information form representation
- restricted Boltzmann machines (RBMs)
- variational inference
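For multivariate Gaussians, the difference between averaging natural parameters (the information form) and averaging moments can be sketched in a few lines. This is an illustrative sketch only: the function names `geometric_average` and `moment_average` are hypothetical, and it assumes the intermediate distribution at mixing weight `beta` is defined by linearly interpolating, respectively, the information-form parameters or the first and second moments of the two endpoint Gaussians.

```python
import numpy as np


def geometric_average(mu0, cov0, mu1, cov1, beta):
    """Interpolate in information form: precision L = inv(cov), potential h = L @ mu."""
    prec0, prec1 = np.linalg.inv(cov0), np.linalg.inv(cov1)
    prec = (1.0 - beta) * prec0 + beta * prec1
    h = (1.0 - beta) * (prec0 @ mu0) + beta * (prec1 @ mu1)
    cov = np.linalg.inv(prec)
    return cov @ h, cov


def moment_average(mu0, cov0, mu1, cov1, beta):
    """Interpolate the moments E[x] and E[x x^T], then convert back to (mean, cov)."""
    mu = (1.0 - beta) * mu0 + beta * mu1
    second0 = cov0 + np.outer(mu0, mu0)
    second1 = cov1 + np.outer(mu1, mu1)
    second = (1.0 - beta) * second0 + beta * second1
    return mu, second - np.outer(mu, mu)


if __name__ == "__main__":
    mu0, cov0 = np.array([0.0, 0.0]), np.eye(2)
    mu1, cov1 = np.array([4.0, 0.0]), 0.25 * np.eye(2)
    for name, path in [("geometric", geometric_average), ("moment", moment_average)]:
        mu, cov = path(mu0, cov0, mu1, cov1, 0.5)
        print(name, "mean:", mu, "variances:", np.diag(cov))
```

Under these assumptions, the moment-averaged covariance works out to (1 - beta) * cov0 + beta * cov1 + beta * (1 - beta) * (mu0 - mu1)(mu0 - mu1)^T, so the intermediate distribution widens along the direction separating the two means, while the information-form path does not.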
Section 5: Experimental results
- persistent contrastive divergence (TODO)