Last month we learned that a problem in commonly used fMRI analysis tools was giving rise to elevated rates of false positives. Now, another issue has been discovered in an fMRI tool. The affected software is called GingerALE and the 'implementation errors' are revealed in a new paper by Simon B. Eickhoff et al., the developers of the package.
GingerALE is a meta-analysis tool that combines the results of multiple fMRI studies to assess the overall level of evidence for neural activations under different conditions. According to Eickhoff et al., there were two different errors in earlier versions of the software. The first affected the false-discovery rate (FDR) statistical correction algorithm, and was caused by "a small mistake in the customized code for sorting floating-point numbers (P values)". The second bug was another "small but important error", this time in the cluster-level family-wise error (FWE) correction function of GingerALE. Worryingly, both bugs made GingerALE more statistically liberal - i.e. they may have increased the rate of false positives, although the authors say it's difficult to know how large an impact the errors had. Both errors were discovered by GingerALE users:
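Eickhoff et al. don't reproduce the faulty code, but it's easy to see why a sorting mistake matters for FDR. The standard Benjamini-Hochberg procedure compares the *sorted* P values against a rising threshold, so if the sort mis-orders the floats, the rank/P-value pairing breaks and the effective threshold shifts. Here's a minimal sketch of the procedure (the function name and details are mine for illustration, not GingerALE's actual code):

```python
import numpy as np

def fdr_threshold(pvals, q=0.05):
    """Benjamini-Hochberg step-up FDR correction.

    Sort the P values ascending, then find the largest rank k (1-based)
    such that p_(k) <= (k / m) * q. All P values at or below p_(k) are
    declared significant. Returns that threshold, or 0.0 if nothing
    survives correction.
    """
    p = np.sort(np.asarray(pvals, dtype=float))  # a correct float sort is critical here
    m = len(p)
    below = p <= (np.arange(1, m + 1) / m) * q   # step-up comparison at each rank
    if not below.any():
        return 0.0
    return p[np.nonzero(below)[0][-1]]           # largest P value passing its rank's cutoff

# With q = 0.05 and four tests, the rank cutoffs are 0.0125, 0.025,
# 0.0375, 0.05 - so the first three P values below survive:
print(fdr_threshold([0.01, 0.02, 0.03, 0.5]))  # 0.03
```

If the sort places even one P value at the wrong rank - say, a buggy comparison that mishandles certain floating-point values - then a P value can be tested against a cutoff meant for a different rank, and the returned threshold can come out too high (liberal, more false positives) or too low (conservative). The paper reports that GingerALE's bug erred on the liberal side.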
Implementation errors in FDR were first suspected in May 2015, when inconsistencies were noted in the output of large scale, replication simulations performed by a member of the BrainMap user community and reported to the BrainMap development team. The source of the inconsistencies was identified rapidly, and a new build (V.2.3.3) was released within weeks. The error in the FWE correction was first suspected in January, 2016, also via a report from a BrainMap user community member. This error was confirmed, identified and corrected with a new build (V2.3.6) released in April, 2016. Both errors and their corrections were described on the BrainMap online forum (http://www.brainmap.org/forum).
The bugs are fixed in the latest version of GingerALE, V2.3.6, and the authors say that users should be sure to upgrade to this new version. But what about the analyses that have already been completed (and, in some cases, peer reviewed and published) using the old versions of the software? Should users re-run these old analyses with the latest GingerALE? What should researchers do if they find that their results change? This is a tricky issue that arises whenever software errors of this kind are discovered. I can't see many neuroscientists being keen to revisit and perhaps correct (or even, retract) their old papers. Eickhoff et al. advise that
We recommend that published meta-analyses using the GingerALE versions with implementation errors in the multiple-comparisons corrections be repeated using the latest version of GingerALE (V2.3.6), and the results compared to those of the original report. Depending upon the magnitude and potential impact of the differences, authors should consider corrective communications in consultation with the journal in which their original report appeared.
They reflect that
Implementation errors (reported here) and algorithmic errors [Eklund et al., 2016] in widely used image-analysis software creates the unfortunate situation wherein well intentioned researchers who have followed developers’ recommendations and established best practices may still have published flawed results - typically erroneous statistical confidence levels or cluster sizes. To best serve the neuroscientific community, corrections to the literature should be two-fold. First, the software developer should highlight the errors and need for re-analysis, as we are doing here. Second, the authors should be encouraged and enabled to self-correct such errors in a concise, rapidly implemented, non-pejorative manner.
In my view, the GingerALE developers handled this situation very well. But we can only expect bugs like these to become more common as new fMRI software packages are developed and made available to end users (and there do seem to be lots of them appearing nowadays). Given that these errors put researchers in an "unfortunate situation" - potentially a catastrophic one, if a really serious bug made it into someone's published papers - perhaps we need a formal system of software validation?
Eickhoff SB, Laird AR, Fox PM, Lancaster JL, & Fox PT (2016). Implementation errors in the GingerALE software: Description and recommendations. Human Brain Mapping. PMID: 27511454