An erratum to this article has been published.
(Neural Computation. 2006;18:1637-1677.)
© 2006 The MIT Press


Letter

Representation and Timing in Theories of the Dopamine System

Nathaniel D. Daw

daw{at}gatsby.ucl.ac.uk UCL, Gatsby Computational Neuroscience Unit, London, WC1N 3AR, U.K.

Aaron C. Courville

aaronc{at}cs.cmu.edu Carnegie Mellon University, Robotics Institute and Center for the Neural Basis of Cognition, Pittsburgh, PA 15213, U.S.A.

David S. Touretzky

dst{at}cs.cmu.edu Carnegie Mellon University, Computer Science Department and Center for the Neural Basis of Cognition, Pittsburgh, PA 15213, U.S.A.

Although the responses of dopamine neurons in the primate midbrain are well characterized as carrying a temporal difference (TD) error signal for reward prediction, existing theories do not offer a credible account of how the brain keeps track of past sensory events that may be relevant to predicting future reward. Empirically, these shortcomings of previous theories are particularly evident in their account of experiments in which animals were exposed to variation in the timing of events. The original theories mispredicted the results of such experiments due to their use of a representational device called a tapped delay line.
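As a rough illustration of the tapped delay line idea, the following minimal sketch trains a TD(0) learner in which each time step after stimulus onset has its own one-hot "tap" with its own weight. The function names, tap count, learning rate, and reward times are all illustrative assumptions, not values from the article. In this sketch, delivering the reward earlier than trained produces a positive error at the early reward and a negative error at the usual reward time, because the remaining taps still carry the old prediction. This may be the kind of timing-related behavior the abstract refers to, although the abstract does not name a specific experiment.

```python
import numpy as np

def td_trial(w, reward_time, gamma=1.0, alpha=0.1, learn=True):
    """Run one trial; stimulus onset is step 0 (assumed setup).

    w[t] is the value of delay-line tap t (one tap per step since onset).
    Returns the TD error registered at each step t >= 1.
    """
    n = len(w)
    deltas = np.zeros(n)
    for t in range(1, n):
        r = 1.0 if t == reward_time else 0.0
        # TD error: reward plus discounted next value minus previous value
        deltas[t] = r + gamma * w[t] - w[t - 1]
        if learn:
            w[t - 1] += alpha * deltas[t]
    return deltas

def train(n_taps=10, reward_time=5, n_trials=2000):
    """Train on a fixed stimulus-reward delay (hypothetical parameters)."""
    w = np.zeros(n_taps)
    for _ in range(n_trials):
        td_trial(w, reward_time)
    return w

w = train()
usual = td_trial(w, reward_time=5, learn=False)  # reward at the trained time
early = td_trial(w, reward_time=3, learn=False)  # probe: reward arrives early
print(np.round(usual, 2))
print(np.round(early, 2))
```

The error at stimulus onset itself is omitted here for brevity; including it would require an assumption about the value of the pre-stimulus state.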

Here we propose that a richer understanding of history representation and a better account of these experiments can be given by considering TD algorithms for a formal setting that incorporates two features not originally considered in theories of the dopaminergic response: partial observability (a distinction between the animal's sensory experience and the true underlying state of the world) and semi-Markov dynamics (an explicit account of variation in the intervals between events). The new theory situates the dopaminergic system in a richer functional and anatomical context, since it assumes (in accord with recent computational theories of cortex) that problems of partial observability and stimulus history are solved in sensory cortex using statistical modeling and inference and that the TD system predicts reward using the results of this inference rather than raw sensory data. It also accounts for a range of experimental data, including the experiments involving programmed temporal variability and other previously unmodeled dopaminergic response phenomena, which we suggest are related to subjective noise in animals’ interval timing. Finally, it offers new experimental predictions and a rich theoretical framework for designing future experiments.
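The division of labor described above, with inference over hidden states feeding a TD learner, can be sketched generically as follows. This is a minimal discrete Bayesian filter step followed by a TD error computed on belief-weighted values, under stated assumptions: the two hidden states, the transition matrix `T`, the observation matrix `O`, and the value vector `v` are hypothetical, and the sketch uses plain Markov dynamics with one transition per time step rather than the semi-Markov timing model the abstract proposes.

```python
import numpy as np

def belief_update(b, T, O, obs):
    """One filtering step: b'(s') is proportional to O[s', obs] * sum_s b(s) T[s, s']."""
    predicted = b @ T                  # propagate belief through the dynamics
    posterior = predicted * O[:, obs]  # weight by likelihood of the observation
    return posterior / posterior.sum()

def td_error(r, b_prev, b_next, v, gamma=1.0):
    """TD error where the value of a belief is the belief-weighted state value."""
    return r + gamma * (b_next @ v) - (b_prev @ v)

# Hypothetical two-state world: state 0 = between trials, state 1 = reward pending.
T = np.array([[0.9, 0.1],
              [0.2, 0.8]])
# Rows are hidden states; columns are observations (0 = nothing, 1 = stimulus).
O = np.array([[0.9, 0.1],
              [0.3, 0.7]])
v = np.array([0.0, 1.0])               # assumed state values

b0 = np.array([1.0, 0.0])              # start certain of state 0
b1 = belief_update(b0, T, O, obs=1)    # a stimulus is observed
print(b1, td_error(0.0, b0, b1, v))
```

The point of the sketch is only that the TD error is driven by a change in inferred belief rather than by the raw observation; how the article actually models dwell times and timing noise is not specified in the abstract.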







