fMRI researchers should care about (and report) the size of the effects that they study, according to a new NeuroImage paper from NIMH researchers Gang Chen and colleagues. It's called "Is the statistic value all we should care about in neuroimaging?" The authors include Robert W. Cox, creator of the popular fMRI analysis software AFNI. Chen et al. explain the purpose of their paper:
Here we address an important issue that has been embedded within the neuroimaging community for a long time: the absence of effect estimates in results reporting in the literature.
The problem, they say, is that in studying brain activations, neuroscientists have been overly focussed on statistical significance. In a typical fMRI experiment, the researchers look for clusters (aka 'blobs') of brain activation, defined as areas where the observed activity is unlikely to have occurred by chance (p < 0.05). These clusters then get reported (and colorfully depicted) in the paper. But the actual magnitude of the neural response is rarely considered: a small but consistent effect would produce just the same blob as a large but variable activation.
Chen et al. say that this is a problem:
Statistic values alone do not represent the whole scientific endeavor, and there is no reason to believe that neuroimaging should be an exception in which physical measurement is largely ignored... Such numerical and graphical information would offer a safeguard against spurious results, promote reproducibility and aid power and meta analysis.
I agree that reporting the size of effects would be great if it were possible. Yet it's not so simple. One reason why effect sizes have not been commonly reported in fMRI is that the units of measurement (i.e. of the MRI signal) are essentially arbitrary. fMRI is not like a thermometer, which records measurements in precise units with a defined physical meaning, such as degrees Celsius. The MRI signal is just a number. The closest thing to a direct measure of effect magnitude in fMRI is the percent signal change, but this isn't a universal unit: the very same brain activation might produce a 0.8% signal change on one MRI scanner and a 0.3% change on a different scanner using different parameters. Chen et al. seem slightly confused on this point [Edit: or perhaps I was the one confused, see the comments]. First, they say that:
By default in AFNI... the effect estimate for each condition can be directly interpreted as a percent signal change relative to the voxel-wise temporal mean; as a result, effect estimates themselves are interpretable, carry real information about the size of the BOLD effect, and are comparable across brain regions, conditions, subjects, groups, studies and scanners.
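To make the scaling in that quote concrete, here is a minimal sketch of the idea in Python with NumPy: each voxel's raw time series is re-expressed as percent signal change relative to that voxel's own temporal mean. (This is an illustration of the general principle, not AFNI's actual implementation; the function name is mine.)

```python
import numpy as np

def scale_to_percent_signal_change(ts):
    """Re-express a (voxels, timepoints) array of raw MRI signal
    values as percent signal change relative to each voxel's own
    temporal mean, so effect estimates have interpretable units."""
    ts = np.asarray(ts, dtype=float)
    baseline = ts.mean(axis=-1, keepdims=True)  # voxel-wise temporal mean
    return 100.0 * (ts - baseline) / baseline

# A voxel whose raw signal moves from 1000 to 1008 (temporal mean 1004)
# becomes deviations of roughly -0.4% and +0.4% around that mean.
raw = np.array([[1000.0, 1008.0]])
print(scale_to_percent_signal_change(raw))
```

The point of dividing by the voxel's own mean is that the arbitrary raw scale cancels out, which is why the resulting numbers are comparable across brain regions and subjects even though the raw signal is not.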
Yet later they say (and I agree) that percent signal change is not universal:
Even scaled as a percent signal change, the BOLD effect estimates depend on MR acquisition parameters such as field strength B0, scanner sequence (e.g., SE vs. GRE) and echo time. Such dependencies have been studied and modeled by, for example, Uludağ et al. (2009), where total FMRI BOLD percent signal change was shown to increase with field strength as well as with echo time...
So yes, ideally every fMRI paper should include true, physical estimates of brain activation that could be directly compared with one another. Maybe the units would be "percent blood oxygen change", or something similar. However, the fact is that we can't do that (yet). Reporting percent signal change is the best we can do, and it's useful in many cases, but it could also be dangerous (e.g. if it encourages people to make false comparisons across studies).
Chen G, Taylor PA, & Cox RW (2016). Is the Statistic Value All We Should Care about in Neuroimaging? NeuroImage. PMID: 27729277