(Neural Computation. 2006;18:1527-1554.)
© 2006 The MIT Press


Letter

A Fast Learning Algorithm for Deep Belief Nets

Geoffrey E. Hinton

hinton{at}cs.toronto.edu

Simon Osindero

osindero{at}cs.toronto.edu

Department of Computer Science, University of Toronto, Toronto, Canada M5S 3G4

Yee-Whye Teh

tehyw{at}comp.nus.edu.sg

Department of Computer Science, National University of Singapore, Singapore 117543

We show how to use "complementary priors" to eliminate the explaining-away effects that make inference difficult in densely connected belief nets that have many hidden layers. Using complementary priors, we derive a fast, greedy algorithm that can learn deep, directed belief networks one layer at a time, provided the top two layers form an undirected associative memory. The fast, greedy algorithm is used to initialize a slower learning procedure that fine-tunes the weights using a contrastive version of the wake-sleep algorithm. After fine-tuning, a network with three hidden layers forms a very good generative model of the joint distribution of handwritten digit images and their labels. This generative model gives better digit classification than the best discriminative learning algorithms. The low-dimensional manifolds on which the digits lie are modeled by long ravines in the free-energy landscape of the top-level associative memory, and it is easy to explore these ravines by using the directed connections to display what the associative memory has in mind.
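The layer-at-a-time idea described in the abstract can be illustrated with a minimal sketch. The abstract does not say how each layer is fitted, so this sketch assumes each layer is a restricted Boltzmann machine trained with one-step contrastive divergence (CD-1) on binary data. The names `train_rbm` and `greedy_pretrain`, the hyperparameters, and the toy data are all hypothetical. The sketch leaves out the labels, the top-level associative memory, and the contrastive wake-sleep fine-tuning stage.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def train_rbm(data, n_hidden, rng, epochs=5, lr=0.1):
    """Fit one RBM with full-batch CD-1 (an assumed per-layer learner).

    Returns (W, b_vis, b_hid) for a layer with data.shape[1] visible units.
    """
    n_samples, n_visible = data.shape
    W = 0.01 * rng.standard_normal((n_visible, n_hidden))
    b_vis = np.zeros(n_visible)
    b_hid = np.zeros(n_hidden)
    for _ in range(epochs):
        # Positive phase: hidden probabilities driven by the data.
        h_prob = sigmoid(data @ W + b_hid)
        h_sample = (rng.random(h_prob.shape) < h_prob).astype(float)
        # Negative phase: one reconstruction step.
        v_recon = sigmoid(h_sample @ W.T + b_vis)
        h_recon = sigmoid(v_recon @ W + b_hid)
        # CD-1 update: data statistics minus reconstruction statistics.
        W += lr * (data.T @ h_prob - v_recon.T @ h_recon) / n_samples
        b_vis += lr * (data - v_recon).mean(axis=0)
        b_hid += lr * (h_prob - h_recon).mean(axis=0)
    return W, b_vis, b_hid


def greedy_pretrain(data, layer_sizes, epochs=5, lr=0.1, seed=0):
    """Train a stack of layers one at a time, bottom up."""
    rng = np.random.default_rng(seed)
    layers = []
    x = data
    for n_hidden in layer_sizes:
        W, b_vis, b_hid = train_rbm(x, n_hidden, rng, epochs=epochs, lr=lr)
        layers.append((W, b_vis, b_hid))
        # The hidden activities of this layer become the "data" for the next.
        x = sigmoid(x @ W + b_hid)
    return layers


if __name__ == "__main__":
    toy_rng = np.random.default_rng(1)
    toy = (toy_rng.random((64, 20)) < 0.5).astype(float)  # random binary toy data
    stack = greedy_pretrain(toy, [16, 8, 4])
    print([W.shape for W, _, _ in stack])  # [(20, 16), (16, 8), (8, 4)]
```

Each layer is trained only after the layer below it is fixed, which is what makes the procedure greedy. In this sketch the weights would serve only as an initialization for a later fine-tuning stage, which is not shown.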




This article has been cited by other articles:

I. Sutskever and G. E. Hinton
Deep, Narrow Sigmoid Belief Networks Are Universal Approximators.
Neural Comput., November 1, 2008; 20(11): 2629-2636.

M. N. Abdelghani, T. P. Lillicrap, and D. B. Tweed
Sensitivity Derivatives for Flexible Sensorimotor Learning.
Neural Comput., August 1, 2008; 20(8): 2085-2111.

N. L. Roux and Y. Bengio
Representational power of restricted Boltzmann machines and deep belief networks.
Neural Comput., June 1, 2008; 20(6): 1631-1649.

P. Byrne and S. Becker
A principle for learning egocentric-allocentric transformation.
Neural Comput., March 1, 2008; 20(3): 709-737.

J. F. Murray and K. Kreutz-Delgado
Visual recognition and inference using dynamic overcomplete sparse learning.
Neural Comput., September 1, 2007; 19(9): 2301-2352.

G. E. Hinton and R. R. Salakhutdinov
Reducing the dimensionality of data with neural networks.
Science, July 28, 2006; 313(5786): 504-507.



