# Efficient Learning of Deep Boltzmann Machines

@inproceedings{Salakhutdinov2010EfficientLO, title={Efficient Learning of Deep Boltzmann Machines}, author={Ruslan Salakhutdinov and H. Larochelle}, booktitle={AISTATS}, year={2010} }

We present a new approximate inference algorithm for Deep Boltzmann Machines (DBM’s), a generative model with many layers of hidden variables. [...] Finally, we demonstrate that DBM’s trained using the proposed approximate inference algorithm perform well compared to DBN’s and SVM’s on the MNIST handwritten digit, OCR English letters, and NORB visual object recognition tasks.
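The posterior approximation behind this line of work is mean-field variational inference: each hidden layer's activation probabilities are updated from its neighbouring layers until a fixed point is reached. As a minimal sketch for a two-hidden-layer DBM (function and variable names are our own, not the paper's), the updates look like:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def mean_field_inference(v, W1, W2, b1, b2, n_iters=10):
    """Mean-field fixed-point updates for a two-hidden-layer DBM.

    v  : visible vector, shape (n_v,)
    W1 : visible-to-hidden1 weights, shape (n_v, n_h1)
    W2 : hidden1-to-hidden2 weights, shape (n_h1, n_h2)
    Returns the approximate posterior means (mu1, mu2).
    """
    # bottom-up pass to initialize the variational parameters
    mu1 = sigmoid(v @ W1 + b1)
    mu2 = sigmoid(mu1 @ W2 + b2)
    for _ in range(n_iters):
        # each layer receives input from BOTH neighbouring layers,
        # which is what distinguishes a DBM from a feed-forward DBN
        mu1 = sigmoid(v @ W1 + mu2 @ W2.T + b1)
        mu2 = sigmoid(mu1 @ W2 + b2)
    return mu1, mu2
```

The key design point is that `mu1` depends on top-down input from `mu2` as well as bottom-up input from `v`; a single bottom-up pass only supplies the starting point for the iteration.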

#### 335 Citations

An Efficient Learning Procedure for Deep Boltzmann Machines

- Medicine, Computer Science
- Neural Computation
- 2012

A new learning algorithm for Boltzmann machines that contain many layers of hidden variables is presented, with results on the MNIST and NORB data sets showing that deep Boltzmann machines learn very good generative models of handwritten digits and 3D objects.

A Two-Stage Pretraining Algorithm for Deep Boltzmann Machines

- Computer Science
- ICANN
- 2013

This paper shows empirically that the proposed method overcomes the difficulty in training DBMs from randomly initialized parameters and results in a better, or comparable, generative model when compared to the conventional pretraining algorithm.

Hyper-Parameter-Free Generative Modelling with Deep Boltzmann Trees

- Computer Science
- ECML/PKDD
- 2019

It is shown that the conditional independence structure of any categorical Deep Boltzmann Machine contains a sub-tree that allows the consistent estimation of the full joint probability mass function of all visible units, and that the DBT is a theoretically sound alternative to likelihood-free generative models.

How to Pretrain Deep Boltzmann Machines in Two Stages

- Computer Science
- 2015

This paper shows empirically that the proposed method overcomes the difficulty in training DBMs from randomly initialized parameters and results in a better, or comparable, generative model when compared to the conventional pretraining algorithm.

Soft-Deep Boltzmann Machines

- Computer Science, Mathematics
- 2015

This paper proposes an approximate measure of the representational power of a BM with respect to the efficiency of a distributed representation, and proposes an alternative BM architecture that is shown to exploit distributed representations more efficiently in terms of this measure.

Relationship between PreTraining and Maximum Likelihood Estimation in Deep Boltzmann Machines

- Computer Science
- AISTATS
- 2016

A pretraining algorithm, which is a layer-by-layer greedy learning algorithm, for a deep Boltzmann machine (DBM) is presented, and it is ensured that the pretraining improves the variational bound of the true log-likelihood function of the DBM.

Learning Deep Generative Models with Short Run Inference Dynamics

- Mathematics, Computer Science
- ArXiv
- 2019

This paper proposes to use short-run inference dynamics guided by the log-posterior, such as a finite-step gradient descent algorithm initialized from the prior distribution of the latent variables, as an approximate sampler of the posterior distribution; the step size of the gradient descent dynamics is optimized by minimizing the Kullback-Leibler divergence.
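The idea of a finite-step gradient sampler initialized from the prior can be illustrated on a toy linear Gaussian model, where the log-posterior gradient is available in closed form. This is a hypothetical sketch, not the paper's model or notation; every name below is our own:

```python
import numpy as np

def short_run_sampler(x, W, sigma2=1.0, n_steps=20, step_size=0.05, rng=None):
    """Finite-step gradient ascent on the log-posterior of the toy model
        x ~ N(W z, sigma2 * I),  z ~ N(0, I),
    initialized from the prior -- a stand-in for short-run inference dynamics.
    """
    rng = np.random.default_rng(rng)
    z = rng.standard_normal(W.shape[1])  # draw the starting point from the prior
    for _ in range(n_steps):
        # d/dz [ log p(z) + log p(x|z) ] = -z + W^T (x - W z) / sigma2
        grad = -z + W.T @ (x - W @ z) / sigma2
        z = z + step_size * grad
    return z
```

Because the dynamics run for only a fixed, small number of steps, the result is a biased but cheap posterior sample; in the paper's setting the step size is then tuned by minimizing a KL divergence rather than fixed by hand as it is here.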

Deep Learning using Restricted Boltzmann machines

- 2015

Restricted Boltzmann machines (RBMs) are probabilistic graphical models that can be represented as stochastic neural networks. Increases in computational capacity and the development of faster learning…

A Deterministic and Generalized Framework for Unsupervised Learning with Restricted Boltzmann Machines

- Mathematics, Computer Science
- Physical Review X
- 2018

This work derives a deterministic framework for the training, evaluation, and use of RBMs based upon the Thouless-Anderson-Palmer (TAP) mean-field approximation of widely-connected systems with weak interactions coming from spin-glass theory.

Variational Probability Flow for Biologically Plausible Training of Deep Neural Networks

- Computer Science, Mathematics
- AAAI
- 2018

It is shown that weight updates in VPF are local, depending only on the states and firing rates of the adjacent neurons, and, interestingly, if an asymmetric version of VPF exists, the weight updates directly explain experimental results in Spike-Timing-Dependent Plasticity (STDP).

#### References

Showing 1–10 of 27 references.

Deep Boltzmann Machines

- Computer Science
- AISTATS
- 2009

A new learning algorithm for Boltzmann machines that contain many layers of hidden variables, made more efficient by a layer-by-layer “pre-training” phase that allows variational inference to be initialized with a single bottom-up pass.

On the quantitative analysis of deep belief networks

- Mathematics, Computer Science
- ICML '08
- 2008

It is shown that Annealed Importance Sampling (AIS) can be used to efficiently estimate the partition function of an RBM, and a novel AIS scheme for comparing RBMs with different architectures is presented.

A Fast Learning Algorithm for Deep Belief Nets

- Mathematics, Computer Science
- Neural Computation
- 2006

A fast, greedy algorithm is derived that can learn deep, directed belief networks one layer at a time, provided the top two layers form an undirected associative memory.

Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations

- Computer Science
- ICML '09
- 2009

The convolutional deep belief network is presented, a hierarchical generative model which scales to realistic image sizes and is translation-invariant and supports efficient bottom-up and top-down probabilistic inference.

Learning Deep Architectures for AI

- Computer Science
- Found. Trends Mach. Learn.
- 2007

The motivations and principles behind learning algorithms for deep architectures are discussed, in particular those that use unsupervised learning of single-layer models, such as Restricted Boltzmann Machines, as building blocks to construct deeper models such as Deep Belief Networks.

Learning and Evaluating Boltzmann Machines

- Mathematics
- 2008

We provide a brief overview of the variational framework for obtaining deterministic approximations or upper bounds for the log-partition function. We also review some of the Monte Carlo based…

3D Object Recognition with Deep Belief Nets

- Computer Science
- NIPS
- 2009

A new type of top-level model for Deep Belief Nets is introduced, a third-order Boltzmann machine, trained using a hybrid algorithm that combines both generative and discriminative gradients that substantially outperforms shallow models such as SVMs.

Efficient Learning of Sparse Representations with an Energy-Based Model

- Computer Science
- 2007

A novel unsupervised method for learning sparse, overcomplete features using a linear encoder, and a linear decoder preceded by a sparsifying non-linearity that turns a code vector into a quasi-binary sparse code vector.

Deep Learning using Robust Interdependent Codes

- Computer Science
- AISTATS
- 2009

A simple yet effective method for introducing inhibitory and excitatory interactions between units in the layers of a deep neural network classifier is investigated, and it is shown for the first time that lateral connections can significantly improve the classification performance of deep networks.

Annealed importance sampling

- Mathematics, Physics
- Stat. Comput.
- 2001

It is shown how one can use the Markov chain transitions for such an annealing sequence to define an importance sampler, which can be seen as a generalization of a recently-proposed variant of sequential importance sampling.
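AIS interpolates between a tractable distribution f0 and the target f1 through intermediate distributions f_k ∝ f0^(1-β_k) f1^(β_k), alternating a weight update with a Markov transition that leaves the current intermediate distribution invariant; the resulting importance weights estimate the ratio of normalizing constants. A toy 1-D sketch (our own names, not the RBM setting in which the paper above applies it):

```python
import numpy as np

def ais_estimate_log_ratio(n_runs=5000, n_temps=200, mu=1.0, sigma=2.0, seed=0):
    """Toy AIS estimate of log(Z1/Z0) between two unnormalized 1-D Gaussians
        f0(x) = exp(-x^2 / 2)              (tractable start)
        f1(x) = exp(-(x - mu)^2 / (2 s^2)) (target)
    The exact ratio of normalizers is Z1/Z0 = sigma.
    """
    rng = np.random.default_rng(seed)
    log_f0 = lambda x: -0.5 * x**2
    log_f1 = lambda x: -0.5 * (x - mu)**2 / sigma**2
    betas = np.linspace(0.0, 1.0, n_temps)

    x = rng.standard_normal(n_runs)      # exact samples from p0
    logw = np.zeros(n_runs)
    for k in range(1, n_temps):
        b_prev, b = betas[k - 1], betas[k]
        # weight update: log f_k(x) - log f_{k-1}(x) at the current state
        logw += (b - b_prev) * (log_f1(x) - log_f0(x))
        # one Metropolis step targeting f_b, which leaves it invariant
        log_p = lambda y: (1 - b) * log_f0(y) + b * log_f1(y)
        prop = x + rng.normal(scale=0.5, size=n_runs)
        accept = np.log(rng.random(n_runs)) < log_p(prop) - log_p(x)
        x = np.where(accept, prop, x)

    # log of the mean importance weight, computed stably via log-sum-exp
    m = logw.max()
    return m + np.log(np.mean(np.exp(logw - m)))
```

The estimator is unbiased in the weights w (not in log w), which is why the average is taken before the logarithm; with more intermediate temperatures the weight variance shrinks and the log estimate tightens.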