Reproducibility Crisis: The Plot Thickens

By Neuroskeptic, November 11, 2015 12:36 AM

A new paper from British psychologists David Shanks and colleagues will add to the growing sense of a "reproducibility crisis" in the field of psychology. The paper is called Romance, Risk, and Replication and it examines the question of whether subtle reminders of 'mating motives' (i.e. sex) can make people more willing to spend money and take risks.

In 'romantic priming' experiments, participants are first 'primed', e.g. by reading a story about meeting an attractive member of the opposite sex. Then, they are asked to do an ostensibly unrelated test, e.g. being asked to say how much money they would be willing to spend on a new watch. There have been many published studies of romantic priming (43 experiments across 15 papers, according to Shanks et al.) and the vast majority have found statistically significant effects. The effect would appear to be reproducible!

But in the new paper, Shanks et al. report that they tried to replicate these effects in eight experiments, with a total of over 1600 participants, and they came up with nothing. Romantic priming had no effect.

So what happened? Why do the replication results differ so much from the results of the original studies? The answer is rather depressing and it lies in a graph plotted by Shanks et al. This is a funnel plot, a two-dimensional scatter plot in which each point represents one previously published study. The graph plots the effect size reported by each study against the standard error of the effect size - essentially, the precision of the results, which is mostly determined by the sample size.

[Figure: funnel_shanks1.png, the funnel plot from Shanks et al.]
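To make the precision axis concrete, here is a minimal Python sketch of how the standard error shrinks as the sample grows. It assumes the effect size is a standardized mean difference (Cohen's d) between two groups and uses a common large-sample approximation for its standard error. The function names and the example numbers are illustrative assumptions, not values taken from Shanks et al.

```python
import math

def standard_error_d(d, n1, n2):
    """Approximate standard error of Cohen's d for two independent groups."""
    return math.sqrt((n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2)))

def is_significant(d, se, z_crit=1.96):
    """Two-sided test at p < 0.05, using a normal approximation."""
    return abs(d) / se > z_crit

# Hypothetical studies that all report the same effect size, d = 0.4,
# but with different numbers of participants per group.
for n_per_group in (20, 50, 200):
    se = standard_error_d(0.4, n_per_group, n_per_group)
    print(n_per_group, round(se, 3), is_significant(0.4, se))
```

With the same reported effect, the smallest study falls short of significance while the larger ones clear it. Put differently, a small study has to observe a large effect to cross the p < 0.05 line.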

This particular plot is a statistical smoking gun, and suggests that the positive results from the original studies (black dots) were probably the result of p-hacking. They were chance findings, selectively published because they were positive. Here's why.

In theory, the points in a funnel plot should form a "funnel", i.e. a triangle, that points straight up. In other words, the more precise studies at the top should have less spread than the noisier estimates, but they should converge on the same effect size that's also the average of the less precise measures.

In this plot, however, the black dots form a 'funnel' which is seriously tilted to the left. The trend line through these points is a diagonal (the red line). In other words, the more precise studies tended to find smaller mating priming effects. The bigger the study, the smaller the romantic priming.

In fact, the diagonal red trend line closely tracks the line where an effect stops being statistically significant at p < 0.05 - which is marked as the outer edge of the grey triangle on the plot. Another way of expressing this would be to say that p values just below 0.05 are overrepresented. The published results "hug" the p = 0.05 significance line. So each of the studies tended to report an effect just strong enough to be statistically significant. It's very difficult to see how such a pattern could arise - except through bias.

Shanks et al. say that this is evidence of the existence of "either p-hacking in previously published studies or selective publication of results (or both)." These two forms of bias go hand in hand, so the answer is probably both. Publication bias is the tendency of scientists (including peer reviewers and editors) to prefer positive results over negative ones. P-hacking is a process by which scientists can maximize their chances of finding positive results.

I've been blogging about these issues for years, yet still I was taken aback by the dramatic nature of the bias in this case. The studies are like a torrent, rolling down the mountain of significance. The image is not so much a funnel plot as an avalanche plot.

[Image: avalanche1.png]
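The tilted funnel can be reproduced in a small simulation, sketched here in Python with only the standard library. Everything in it is an assumption made for illustration: the true effect is set to exactly zero, sample sizes are drawn between 15 and 150 per group, and a study is "published" only if it finds a significant positive effect. It models selective publication only, not p-hacking, and it is not the analysis Shanks et al. ran.

```python
import math
import random
import statistics

def simulate_study(n_per_group, true_effect, rng):
    """Return (Cohen's d, standard error) for one simulated two-group study."""
    control = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
    primed = [rng.gauss(true_effect, 1.0) for _ in range(n_per_group)]
    pooled_sd = math.sqrt(
        (statistics.variance(control) + statistics.variance(primed)) / 2
    )
    d = (statistics.mean(primed) - statistics.mean(control)) / pooled_sd
    # Large-sample approximation for equal group sizes.
    se = math.sqrt(2 / n_per_group + d ** 2 / (4 * n_per_group))
    return d, se

def published_literature(n_studies, rng, true_effect=0.0):
    """Run studies until n_studies of them are significant and positive."""
    published = []
    while len(published) < n_studies:
        n = rng.randint(15, 150)  # hypothetical range of sample sizes
        d, se = simulate_study(n, true_effect, rng)
        if d / se > 1.96:  # only "positive" results reach print
            published.append((d, se))
    return published

def slope(points):
    """Least-squares slope of effect size regressed on standard error."""
    mean_d = statistics.mean(d for d, _ in points)
    mean_se = statistics.mean(se for _, se in points)
    num = sum((se - mean_se) * (d - mean_d) for d, se in points)
    den = sum((se - mean_se) ** 2 for _, se in points)
    return num / den

rng = random.Random(0)
studies = published_literature(40, rng)
print("slope of d on SE:", round(slope(studies), 2))
print("mean published d:", round(statistics.mean(d for d, _ in studies), 2))
```

Under these assumptions every published effect sits just above 1.96 times its standard error, so the less precise studies report the larger effects and the fitted slope should come out positive, even though no real effect exists.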

Taken together with the negative results of the eight replication studies that Shanks et al. conducted, the funnel plot suggests that romantic priming doesn't exist, and that the many studies that did report the effect were wrong. This doesn't mean that the previous romantic priming researchers were consciously trying to deceive by publishing results that they knew were false. In my view, they were probably led astray by their own cognitive biases, helped along by the culture of 'positive results or bust' in science today. This system can produce replicated positive results out of nowhere. I don't think this is a sustainable way of doing research. Reform is needed.

Shanks DR, Vadillo MA, Riedel B, Clymo A, Govind S, Hickin N, Tamman AJ, & Puhlmann LM (2015). Romance, Risk, and Replication: Can Consumer Choices and Risk-Taking Be Primed by Mating Motives? Journal of Experimental Psychology: General. PMID: 26501730
