
Neural Computation, Vol 11, 747-769, Copyright © 1999 by The MIT Press


LETTERS

Parameter Convergence and Learning Curves for Neural Networks

Terrence L. Fine and Sayandev Mukherjee

We revisit the oft-studied asymptotic (in sample size) behavior of the parameter or weight estimate returned by any member of a large family of neural network training algorithms. By properly accounting for the characteristic property of neural networks that their empirical and generalization errors possess multiple minima, we rigorously establish conditions under which the parameter estimate converges strongly into the set of minima of the generalization error. Convergence of the parameter estimate to a particular value cannot be guaranteed under our assumptions. We then evaluate the asymptotic distribution of the distance between the parameter estimate and its nearest neighbor among the set of minima of the generalization error. Results on this question have appeared numerous times and generally assert asymptotic normality, the conclusion expected from familiar statistical arguments concerned with maximum likelihood estimators. These conclusions are usually reached on the basis of somewhat informal calculations, although we shall see that the situation is somewhat delicate. The preceding results then provide a derivation of learning curves for generalization and empirical errors that leads to bounds on rates of convergence.
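
One way to read the convergence claim is sketched below. The notation is assumed here for illustration and is not taken from the paper: \hat{w}_n stands for the parameter estimate obtained from n training samples, S for the set of minima of the generalization error, and "converges strongly" is read as almost-sure convergence.

```latex
% Notation assumed for illustration; it is not taken from the paper.
% \hat{w}_n : parameter estimate from n training samples
% S         : set of minima of the generalization error
% Requires amsmath for \lVert, \rVert and \text.
\[
  d(\hat{w}_n, S) \;=\; \inf_{w^\ast \in S} \lVert \hat{w}_n - w^\ast \rVert
  \;\longrightarrow\; 0
  \quad \text{almost surely as } n \to \infty .
\]
```

Under this reading the statement is weaker than convergence to a single point: the estimate may approach S without settling on any one of its elements, which is consistent with the remark that convergence to a particular value cannot be guaranteed. The quantity d(\hat{w}_n, S) is also the distance to the nearest minimum whose asymptotic distribution the abstract refers to.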




