
Do Peer Reviewers Prefer Significant Results?

An experiment on peer reviewers at a psychology conference suggests a positive result premium, which could drive publication bias.

By Neuroskeptic | April 30, 2020 10:27 PM

(Credit: lyf1/Shutterstock)

I’ve long been writing about problems in how science is communicated and published. One of the most well-known concerns in this context is publication bias — the tendency for results that confirm a hypothesis to get published more easily than those that don’t.

Publication bias has many contributing factors, but the peer review process is widely seen as a crucial driver. Peer reviewers, the thinking goes, tend to look more favorably on "positive" (i.e., statistically significant) results.

But do reviewers really prefer positive results? A recently published study suggests that the effect does exist, but that it isn't huge.

Researchers Malte Elson, Markus Huff and Sonja Utz carried out a clever experiment to determine the impact of statistical significance on peer review evaluations. The authors were the organizers of a 2015 conference to which researchers submitted abstracts that were subject to peer review.

The keynote speaker at this conference, by the way, was none other than “Neuroskeptic (a pseudonymous science blogger).”

Elson et al. created a dummy abstract and had the conference peer reviewers review this artificial “submission” alongside the real ones. Each reviewer was randomly assigned to receive a version of the abstract with either a significant result or a nonsignificant result; the details of the fictional study were otherwise identical. The final sample size was n=127 reviewers.

The authors do discuss the ethics of this slightly unusual experiment!

It turned out that the statistically significant version of the abstract was given a higher "overall recommendation" score than the nonsignificant one. The difference, roughly 1 point on a 10-point scale, was itself statistically significant, though only marginally (p=0.039).
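To make that kind of two-group comparison concrete, here is a minimal sketch using simulated data — not Elson et al.'s actual dataset — with an assumed ~1-point premium for the significant version, analyzed with Welch's t-test (the group sizes, means, and spread are all assumptions for illustration):

```python
import math
import random
import statistics

# Hypothetical simulation of the design: each reviewer is randomly assigned
# a "significant" or "nonsignificant" version of the same dummy abstract
# and gives a 1-10 overall recommendation score.
random.seed(0)
n_per_group = 64  # the paper reports n=127 total; an even split is assumed here

# Assumed score distributions with a ~1-point premium for the significant version.
sig_scores = [min(10, max(1, round(random.gauss(6.0, 2.0)))) for _ in range(n_per_group)]
null_scores = [min(10, max(1, round(random.gauss(5.0, 2.0)))) for _ in range(n_per_group)]

def welch_t(a, b):
    """Welch's two-sample t statistic, with a normal-approximation p-value."""
    mean_a, mean_b = statistics.fmean(a), statistics.fmean(b)
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    se = math.sqrt(var_a / len(a) + var_b / len(b))
    t = (mean_a - mean_b) / se
    # Two-sided p via the standard normal CDF (a reasonable shortcut for n ~ 60+).
    p = 2 * (1 - 0.5 * (1 + math.erf(abs(t) / math.sqrt(2))))
    return t, p

t, p = welch_t(sig_scores, null_scores)
print(f"mean difference = {statistics.fmean(sig_scores) - statistics.fmean(null_scores):.2f}")
print(f"t = {t:.2f}, approx p = {p:.4f}")
```

In practice one would use a proper t-distribution (e.g. `scipy.stats.ttest_ind` with `equal_var=False`) rather than the normal approximation, but the structure of the comparison is the same.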

The authors conclude that:

We observed some evidence for a small bias in favor of significant results. At least for this particular conference, though, it is unlikely that the effect was large enough to notably affect acceptance rates.

The experiment also tested whether reviewers had a preference for original studies vs. replication studies (so there were four versions of the dummy abstract in total). This revealed no difference.

Effects of significant vs. nonsignificant results
(Credit: Elson et al. 2020)

So this study suggests that reviewers, at least at this conference, do indeed prefer positive results. But as the authors acknowledge, it’s hard to know whether this would generalize to other contexts.

For example, the abstracts that were reviewed for this conference were limited to just 300 words. In other contexts, notably journal article reviews, reviewers are given far more information on which to base an opinion. With just 300 words to go by, reviewers in this study might have paid attention to the results simply because there wasn't much else to judge by.

On the other hand, the authors note that the participants in the 2015 conference might have been unusually aware of the problem of publication bias, and thus more likely to give null results a fair hearing.

For the context of this study, it is relevant to note that the division (and its leadership at the time) can be characterized as rather progressive with regard to open-science ideals and practices.

This is certainly true; after all, they invited me, an anonymous guy with a blog, to speak to them, just on the strength of my writings about open science.

There have only been a handful of previous studies using similar designs to probe peer review biases, and they generally found larger effects. One 1982 paper found a large bias in favor of significant results at a psychology journal, as did a 2010 study at a medical journal.

The authors conclude that their dummy submission method could be useful in the study of peer review:

We hope that this study encourages psychologists, as individuals and on institutional levels (associations, journals, conferences), to conduct experimental research on peer review, and that the preregistered field experiment we have reported may serve as a blueprint of the type of research we argue is necessary to cumulatively build a rigorous knowledge base on the peer review process.
