A new paper examines how the brain keeps track of positive and negative outcomes: No unified reward prediction error in local field potentials from the human nucleus accumbens. The authors, London-based neuroscientist Max-Philipp Stenner and colleagues, recorded local field potentials (LFPs) using electrodes implanted into the nucleus accumbens (NAcc) in six patients. The patients all suffered from epilepsy, and the electrodes were being implanted to treat the disease. The authors made use of the electrodes to test a research question: does the NAcc encode reward prediction errors (RPEs)?
An RPE is a signal corresponding roughly to positive or negative surprise. If you expect something good to happen (a reward) and then nothing happens, you would experience a negative RPE. If you expect something bad to happen and you get a reward, that would be a large positive RPE. If you expect a reward and you get that reward, your prediction was accurate, so there's no RPE.
RPEs are a key theoretical construct in much of contemporary neuroscience. Leading theories of decision making and learning propose that particular brain regions track RPEs and use them to adjust behavior. The NAcc is one of the main brain regions believed to track RPEs, and there's plenty of evidence from fMRI studies that it does. However, according to Stenner et al., the fMRI data are not conclusive: because of the low temporal resolution of that method, RPE-like signals could appear through the overlap of other, non-RPE signals if these occur close together in time in the NAcc. Direct recording of LFPs offers much better temporal resolution.
So Stenner et al. recorded NAcc activity while the six patients performed a gambling task for real money. The task is designed to generate lots of positive and negative RPEs. However, the authors say that, while the NAcc did respond to the gain and loss of money, they did not observe the expected RPE signals. You might say that the data produced a negative prediction error. The authors say:
We found little evidence for RPE signals in LFPs recorded from the human NAcc during an economic decision-making task... Four of the patients in our study made choices in a way that required them to consistently form value expectations... none of these patients showed a modulation of NAcc LFPs after outcome onset that unified expected value and outcome magnitude, as expected for a RPE signal.
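To make "unifying expected value and outcome magnitude" concrete, here is a minimal Python sketch of the textbook definition of an RPE: the outcome you got minus the value you expected. The gamble amounts and function names are made up for illustration and are not taken from the paper's task or analysis.

```python
def expected_value(gamble):
    """Probability-weighted average of a gamble given as (probability, amount) pairs."""
    return sum(p * amount for p, amount in gamble)


def reward_prediction_error(outcome, expected):
    """Textbook RPE: what you actually got minus what you expected to get."""
    return outcome - expected


# Hypothetical gamble (not from the paper): 50% chance of winning 10, 50% chance of 0.
gamble = [(0.5, 10.0), (0.5, 0.0)]
ev = expected_value(gamble)

win_rpe = reward_prediction_error(10.0, ev)   # better than expected: positive RPE
loss_rpe = reward_prediction_error(0.0, ev)   # worse than expected: negative RPE

# A sure thing: you expect 5 and you get 5, so there is no surprise.
sure_rpe = reward_prediction_error(5.0, expected_value([(1.0, 5.0)]))

print(ev, win_rpe, loss_rpe, sure_rpe)
```

On this simple definition, a unified RPE signal should go up with the size of the outcome and down with the expected value. That is roughly the pattern the quote above says was not seen in the NAcc LFPs.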
In only one of the six patients did Stenner et al. find evidence of an RPE-like signal. This patient, however, did not seem to understand the gambling task, because they were as willing to make bad bets as good ones. It's therefore not clear what these RPEs mean, since the patient doesn't seem to have been guided by them. In the four patients who did make sensible choices on the gambling task, no RPEs were seen. The sixth patient's data were too noisy to be analyzed.
The authors conclude that the NAcc may not track RPEs. Rather, it may be responsible for 'policy updating': tracking multiple inputs that are relevant to behavior choice, which might not be limited to RPEs. They don't disagree with the theory (for which there's lots of evidence) that the NAcc receives an RPE signal as an input, in the form of dopamine signalling from the midbrain. However, they say, the NAcc does not simply echo this signal:
Our data do not support the idea of a unified RPE signal in NAcc LFPs that is driven by the phasic dopamine release known to represent RPEs.
Stenner MP, Rutledge RB, Zaehle T, Schmitt FC, Kopitzki K, Kowski AB, Voges J, Heinze HJ, & Dolan RJ (2015). No unified reward prediction error in local field potentials from the human nucleus accumbens: evidence from epilepsy patients. Journal of Neurophysiology PMID: 26019312