(Neural Computation. 2003;15:1897-1929.)
© 2003 The MIT Press


Letter

Recurrent Neural Networks with Small Weights Implement Definite Memory Machines

Barbara Hammer

hammer{at}informatik.uni-osnabrueck.de, Department of Mathematics/Computer Science, University of Osnabrück, D-49069, Osnabrück, Germany

Peter Tino

P.Tino{at}cs.bham.ac.uk, School of Computer Science, University of Birmingham, Edgbaston, Birmingham B15 2TT, U.K.

Recent experimental studies indicate that recurrent neural networks initialized with "small" weights are inherently biased toward definite memory machines (Tino, Cernansky, & Benusková, 2002a, 2002b). This article establishes a theoretical counterpart: the transition function of a recurrent network with small weights and a squashing activation function is a contraction. We prove that recurrent networks with a contractive transition function can be approximated arbitrarily well on input sequences of unbounded length by a definite memory machine. Conversely, every definite memory machine can be simulated by a recurrent network with a contractive transition function. Hence, initialization with small weights induces an architectural bias in learning with recurrent neural networks. This bias might have benefits from the point of view of statistical learning theory: it emphasizes one possible region of the weight space where generalization ability can be formally proved. It is well known that standard recurrent neural networks are not distribution-independent learnable in the probably approximately correct (PAC) sense if arbitrary precision and inputs are considered. We prove that recurrent networks with a contractive transition function and a fixed contraction parameter fulfill the so-called distribution-independent uniform convergence of empirical distances property and hence, unlike general recurrent networks, are distribution-independent PAC learnable.
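The contraction argument can be illustrated numerically. The following is a minimal sketch, not the construction used in the article: it assumes a tanh activation (one example of a squashing function), measures distances in the maximum norm, and uses hypothetical names (`step`, `run`, `inf_norm`, `dist`) and arbitrary sizes chosen for illustration. Because tanh is 1-Lipschitz, a recurrent weight matrix whose maximum absolute row sum C is below 1 makes the state transition a contraction with parameter C. Two input sequences that share their last k symbols therefore end in states at most 2·C^k apart, whatever came before, which is the sense in which a machine that reads only the last k inputs approximates the network.

```python
import math
import random


def step(W, V, x, u):
    """One transition x' = tanh(W x + V u) of a simple recurrent network."""
    return [
        math.tanh(sum(w * xj for w, xj in zip(row, x)) + v * u)
        for row, v in zip(W, V)
    ]


def run(W, V, inputs, x0):
    """Feed a whole input sequence through the network, return the final state."""
    x = list(x0)
    for u in inputs:
        x = step(W, V, x, u)
    return x


def inf_norm(W):
    """Maximum absolute row sum: the operator norm induced by the max norm."""
    return max(sum(abs(w) for w in row) for row in W)


def dist(a, b):
    """Max-norm distance between two state vectors."""
    return max(abs(p - q) for p, q in zip(a, b))


rng = random.Random(0)
n = 4  # number of hidden units (arbitrary choice for this sketch)

# "Small" recurrent weights: each row sums to at most 4 * 0.1 = 0.4 in absolute value.
W = [[rng.uniform(-0.1, 0.1) for _ in range(n)] for _ in range(n)]
V = [rng.uniform(-1.0, 1.0) for _ in range(n)]  # input weights may be large
C = inf_norm(W)  # contraction parameter of the transition function

# Two histories of different length and content that share only the last k inputs.
k = 5
suffix = [rng.choice([0.0, 1.0]) for _ in range(k)]
prefix_a = [rng.choice([0.0, 1.0]) for _ in range(50)]
prefix_b = [rng.choice([0.0, 1.0]) for _ in range(200)]

x0 = [0.0] * n
state_a = run(W, V, prefix_a + suffix, x0)
state_b = run(W, V, prefix_b + suffix, x0)

gap = dist(state_a, state_b)
bound = 2.0 * C ** k  # states lie in (-1, 1)^n, so their distance is below 2

print(f"C = {C:.3f}, gap = {gap:.3e}, bound = {bound:.3e}")
```

Increasing `k` shrinks the bound geometrically, so any target precision is reached with a finite memory length that does not depend on the length of the input sequence. The input weights `V` play no role in the bound, since they cancel when the two runs receive the same input symbol.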



