False-Positive fMRI Hits The Mainstream

A new paper in PNAS has made waves. The article, called

Cluster failure: Why fMRI inferences for spatial extent have inflated false-positive rates,

comes from neuroscientists Anders Eklund, Tom Nichols, and Hans Knutsson. According to many of the headlines that greeted "Cluster failure", the paper is a devastating bombshell that could demolish the whole field of functional magnetic resonance imaging (fMRI):

Bug in fMRI software calls 15 years of research into question (Wired)

A bug in fMRI software could invalidate 15 years of brain research. This is huge. (ScienceAlert)

New Research Suggests That Tens Of Thousands Of fMRI Brain Studies May Be Flawed (Motherboard)

So what's going on here, and is it really this serious? The first thing to note is that the story isn't really new. I've been covering Eklund et al.'s work on the false-positive issue since 2012 (1,2,3,4). Over that time, Eklund and his colleagues have developed the argument that several commonly used fMRI analysis software tools suffer from a basic flaw which leads to elevated false-positive rates when it comes to finding activations associated with tasks or stimuli, i.e. finding which brain areas 'light up' during particular tasks.

The new paper is the culmination of this program, and the results - that up to 70% of analyses produce at least one false positive, depending on the software and conditions - won't come as a surprise to anyone who has been following the issue.

There is, however, one unexpected point in "Cluster failure": Eklund et al. reveal that they discovered a different kind of bug in one of the software packages, AFNI:

A 15-y-old bug was found in [AFNI's tool] 3dClustSim while testing the three software packages (the bug was fixed by the AFNI group as of May 2015, during preparation of this manuscript). The bug essentially reduced the size of the image searched for clusters, underestimating the severity of the multiplicity correction and overestimating significance (i.e., 3dClustSim FWE P values were too low)
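To see why searching a smaller image than the real one makes P values too low, here is a toy sketch. It is not how 3dClustSim works: it assumes independent tests and a simple Sidak-style correction rather than cluster-extent simulation, and the function names and voxel counts are my own hypothetical choices. The point is only that a threshold calibrated for fewer tests than you actually run lets the familywise error (FWE) rate creep above its nominal level.

```python
def sidak_per_test_alpha(fwe_alpha: float, n_tests: int) -> float:
    """Per-test alpha that gives the requested FWE rate over n independent tests."""
    return 1.0 - (1.0 - fwe_alpha) ** (1.0 / n_tests)


def familywise_error(per_test_alpha: float, n_tests: int) -> float:
    """Chance of at least one false positive over n independent null tests."""
    return 1.0 - (1.0 - per_test_alpha) ** n_tests


def fwe_with_underestimated_volume(
    nominal_fwe: float, assumed_tests: int, true_tests: int
) -> float:
    """FWE rate actually achieved when the correction assumes too few tests.

    The threshold is calibrated for `assumed_tests` (the too-small search
    volume) but is then applied to `true_tests` (the real search volume).
    """
    per_test = sidak_per_test_alpha(nominal_fwe, assumed_tests)
    return familywise_error(per_test, true_tests)


if __name__ == "__main__":
    # Hypothetical numbers: the correction assumes half the real volume.
    achieved = fwe_with_underestimated_volume(0.05, 50_000, 100_000)
    print(f"nominal FWE 0.05, achieved FWE {achieved:.4f}")
```

In this toy setting, halving the assumed search volume roughly doubles the achieved error rate for small nominal levels. The size of the effect in the real AFNI bug depends on details of the cluster simulation that this sketch ignores.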

This is a new and important issue, but this new bug only applies to AFNI, not to other widely used packages such as FSL and SPM.

As to the question of how serious this is: in my view, it's very serious, but it doesn't "invalidate 15 years of brain research" as one of the headlines had it. For one thing, the issue only affects fMRI, and most brain research does not use fMRI. Moreover, Eklund et al.'s findings don't call all fMRI studies into question - the problem only affects activation mapping studies. While these experiments are common, they are far from the only application of fMRI. Studies of functional connectivity or multi-voxel pattern analysis (MVPA) are increasingly popular and they're not, as far as I can see, likely to be affected.

Finally, it's important to remember that "70% chance of finding at least one false positive" does not imply that "70% of positives are false". If there are lots of true positives, only a minority of positives will be false. It's impossible to directly know the true positive rate, however.

Update 15th July 2016: Tom Nichols, one of the authors of "Cluster failure", reports that he's requested some corrections to the paper in order to remove some of the statements that led to "misinterpretations" of the study (i.e. to those hyped headlines). However, PNAS did not agree to the correction, so Nichols has posted it on PubMed Commons.
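To put rough numbers on the difference between "70% chance of at least one false positive" and "70% of positives are false", here is a back-of-the-envelope sketch. Every number in it is a made-up assumption for illustration, not an estimate from the paper: in particular, I assume that an affected analysis reports exactly one false cluster, and I simply pick how many true clusters a typical analysis reports, which, as noted, nobody can know directly.

```python
def false_share_of_reported_clusters(
    p_any_false: float,
    false_clusters_when_affected: float,
    true_clusters_per_analysis: float,
) -> float:
    """Long-run share of reported clusters that are false positives.

    p_any_false: chance that an analysis contains at least one false cluster.
    false_clusters_when_affected: average number of false clusters in an
        affected analysis (hypothetical; 1.0 means exactly one).
    true_clusters_per_analysis: average number of true clusters reported
        per analysis (hypothetical; unknowable in practice).
    """
    expected_false = p_any_false * false_clusters_when_affected
    expected_total = expected_false + true_clusters_per_analysis
    if expected_total == 0:
        return 0.0  # nothing is reported at all, so no share to compute
    return expected_false / expected_total


if __name__ == "__main__":
    # Hypothetical scenarios: same 70% familywise rate, different true signal.
    for true_clusters in (0.0, 1.0, 5.0):
        share = false_share_of_reported_clusters(0.7, 1.0, true_clusters)
        print(f"{true_clusters:.0f} true clusters per analysis -> "
              f"{share:.0%} of reported clusters are false")
```

Under these assumptions, an analysis with five true clusters would have about 12% of its reported clusters be false, while an analysis with no true effect at all would have every reported cluster be false. The same 70% figure is compatible with both.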

Eklund A, Nichols TE, & Knutsson H (2016). Cluster failure: Why fMRI inferences for spatial extent have inflated false-positive rates. Proceedings of the National Academy of Sciences of the United States of America. PMID: 27357684