
Beware Dead Fish Statistics

By Neuroskeptic | November 27, 2011 1:32 AM



An editorial in the Journal of Physiology offers some important notes on statistics.

But even more importantly, it refers to a certain blog in the process:


The Student’s t-test merely quantifies the ‘Lack of support’ for no effect. It is left to the user of the test to decide how convincing this lack might be. A further difficulty is evident in the repeated samples we show in Figure 2: one of those samples was quite improbable because the P-value was 0.03, which suggests a substantial lack of support, but that’s chance for you! A parody of this effect of multiple sampling, taken to extremes, can be found at

This makes it the second academic paper to refer to this blog so far. Although I feel rather bad about this one, since the citation ought to have been to the original dead salmon brain scanning study by Craig Bennett, which I just wrote about.

Actually, though, this editorial was published in six separate journals: The Journal of Physiology, Experimental Physiology, the British Journal of Pharmacology, Advances in Physiology Education, Microcirculation, and Clinical and Experimental Pharmacology and Physiology. Phew.

In fact, you could say that this makes not two but seven citations for Neuroskeptic now. Yes. Let's go with that.

Anyway, after discussing the history of the ubiquitous Student's t-test (which was invented in a brewery), the editorial reminds us that the p value you get from such a t-test doesn't tell you how likely it is that your results are "real".

Rather, it tells you how often you'd get a result at least as extreme as yours if there were no effect and it was just random chance. That's a big difference. A p value of 0.01 doesn't mean your results are 99% likely to be real. It means there's a 1% chance that you'd get such a result purely by chance. And if you ran, say, 100 experiments, or more likely 100 statistical tests on the same data, then you'd expect at least one result with a p value of 0.01 by chance alone.
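You can see this for yourself with a quick simulation. Here's a minimal sketch (my own illustration, not anything from the editorial): it runs 100 "experiments" in which the null hypothesis is true by construction, computing a two-sided p value for each via a simple z-test, which is a reasonable stand-in for the t-test at this sample size. Roughly 5 of the 100 should sneak under p < 0.05 despite there being no effect at all.

```python
import math
import random

random.seed(42)

def z_test_p(sample):
    """Two-sided p value for the null hypothesis 'true mean is 0',
    treating the sample standard deviation as known (fine for largish n)."""
    n = len(sample)
    mean = sum(sample) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in sample) / (n - 1))
    z = mean / (sd / math.sqrt(n))
    # P(|Z| >= |z|) under a standard normal = erfc(|z| / sqrt(2))
    return math.erfc(abs(z) / math.sqrt(2))

# 100 null "experiments": 50 draws each from a normal with mean exactly 0
p_values = [z_test_p([random.gauss(0, 1) for _ in range(50)])
            for _ in range(100)]
false_positives = sum(p < 0.05 for p in p_values)
print(f"{false_positives} of 100 null experiments reached p < 0.05")
```

Run it a few times with different seeds and the count bounces around 5, which is exactly what a 5% false positive rate means.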

In that case it would be silly to think that the finding was only 1% likely to be a fluke. Of course it could be true. But we'd have no particular reason to think so until we get some more data.

This is what the dead salmon study was all about. This multiple comparisons issue is very old, but very important. Arguably the biggest problem in science today is that we're doing too many comparisons and only reporting the significant ones.
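One standard guard against this, not discussed in the editorial but worth a sketch, is to correct the significance threshold for the number of tests you ran. The simplest version is the Bonferroni correction, which divides alpha by the number of comparisons (the example p values below are made up for illustration):

```python
def bonferroni_significant(p_values, alpha=0.05):
    """Return a flag per test: True only if it survives the corrected
    threshold alpha / (number of tests)."""
    threshold = alpha / len(p_values)
    return [p < threshold for p in p_values]

# 100 tests: one nominally 'significant' fluke (p = 0.03), one genuinely
# tiny p value (p = 0.0001), and 98 clearly null results.
ps = [0.03, 0.0001] + [0.5] * 98
flags = bonferroni_significant(ps)
print(sum(flags))  # corrected threshold is 0.05 / 100 = 0.0005
```

With 100 tests the threshold drops to 0.0005, so the p = 0.03 result, which looked impressive on its own, no longer counts, while the p = 0.0001 result survives. Bonferroni is conservative; the point is just that the bar has to rise with the number of comparisons.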


Drummond GB & Tom BD (2011). Statistics, probability, significance, likelihood: words mean what we define them to mean. British Journal of Pharmacology, 164(6), 1573-6. PMID: 22022804
