In 2011, Dutch social psychologist Diederik Stapel was revealed as a charlatan who had published dozens of fraudulent scientific papers. The shocking thing was that no one in his field had noticed until courageous students in his lab reported their suspicions. Within a year, two more fraudulent psychologists were outed.
This spate of misconduct happened in tandem with many failed attempts to replicate some of the field’s classic results, prompting its acolytes to question whether psychology was being polluted by quirky and attention-grabbing findings that might not actually be true.
This issue applies to every field of science, from physics to medicine. But psychologists, spurred by a growing crisis of faith, are tackling it head-on. Psychologist Brian Nosek at the University of Virginia is at the forefront of the fight. At his Center for Open Science, launched in January 2013, he is coordinating several initiatives to make psychology more transparent and reliable, initiatives that could both restore confidence in his beleaguered field and influence similar efforts in other sciences. Discover spoke with Nosek about his current work to combat problems with reproducibility in psychology.
Discover: When did you first become aware of the problem of reproducibility?
Brian Nosek: When I was an undergrad, I read a paper that evaluated claims about subliminal mood tapes, which supposedly improved your memory if you played them while you slept. When people tried to replicate these claims, they found that only the subject’s expectations mattered: If you thought the tape increased your memory, your memory was better. This was my first insight that not all scientific claims are valid or reproducible.
Why is it so easy for mistaken results, or “false positives,” to gain currency in peer-reviewed scientific journals?
BN: Results that are novel are more likely to be published, and the primary currency of an academic scientist is a publication — a good publication record in prestigious journals helps you get a good academic job, funding and awards. But you can get all of those things without having true results. If I need to get positive, clean results, I can do two things: I could throw away results that don’t fit my story, or I could keep on working until the data looks nicer, analyzing it in a different way or collecting data until I get the results I need.
Don’t false positives get corrected in the long run, when other scientists fail to replicate experiments?
BN: There’s an ethic that science is self-correcting, but replication is often a pain in the butt, and since scientists’ career success doesn’t depend on exactly replicating a study that’s already been published, they usually don’t do it. As a result, mistakes end up hanging around longer than they need to.
Is this why fraudulent researchers like Stapel can build careers without anyone recognizing their deception?
BN: Yes. I was shocked by Stapel. How could the rest of us not have figured it out? It was because hardly anyone bothered to replicate his studies. If people did and failed, they assumed it was because they were bad researchers.
What are you doing about it?
BN: In 2011, colleagues and I launched the Reproducibility Project, in which a team of about 200 scientists is repeating experiments that were published in three psychology journals in 2008. We want to see how many can reproduce the original result, and what factors affect reproducibility. That will tell us if the problem of false-positive results in the psychology journals is big, small or non-existent.
In early 2013, with a $5.25 million grant from the Laura and John Arnold Foundation, you announced the Center for Open Science, a new initiative to increase the reproducibility of scientific research. What is the center doing?
BN: My colleague Jeff Spies, who was a cofounder of the center, and I wanted to make it easy to find someone else’s methods and rerun them to see if you can get the same results. So we built the Open Science Framework, a web application where collaborating researchers can put all their data and research materials so anyone can easily see them. We also provide incentives by offering “badges” for good practices, like making raw data available. So the more open you are, the more opportunities you have for building your reputation.
Your projects are part of a larger movement in psychology that has accelerated during the past year. What other initiatives are underway to combat problems with reproducibility?
BN: At least five psychology journals are inviting people to submit plans for replicating specific studies, which will then get peer-reviewed on the basis of their designs [rather than their results]. This changes the incentives for replication, reduces the potential for selectively reporting your data and increases the potential for publishing negative results.
The Association for Psychological Science has also launched a project for publishing replications. As a start, 30 teams have signed up to each try to replicate [observations of] a phenomenon called the “verbal overshadowing effect,” the idea that describing things that aren’t coded in words, like the taste of wine, worsens your memory for those things. The results will collectively show whether the effect exists.
Replications are so rare that people perceive them to indicate a lack of trust. I want it to be very ordinary for people to say, “I’m not sure about this, so I’ll replicate it,” and for that to be a compliment, not a threat.
Reproducibility Problems Plague the Sciences
The problem of unreliable results is not unique to psychology. Every field is dominated by studies that confirm researchers’ hypotheses. Negative results appear just 30 percent of the time in space sciences and less than 10 percent in psychology and psychiatry. In one study that pooled numerous surveys of researchers in diverse fields, a third of scientists owned up to questionable research practices like cherry-picking data that yield positive results.
These practices could significantly slow the pace of research or send bright scientists down blind alleys. When Amgen, a biotech firm, tried to replicate 53 “landmark” studies in basic cancer research in 2012, it could confirm only six of them, or about 11 percent. Another corporate team from Bayer HealthCare could validate only a quarter of preclinical studies in cancer, heart disease and other medical fields.
Researchers need the right incentives to produce dependable results and to correct mistakes in the scientific literature, says John Ioannidis of Stanford University School of Medicine. “Some fields that have already adopted rigorous replication practices and a more transparent stream of work, such as genetic epidemiology, have seen a dramatic improvement in the credibility of their claimed discoveries within a few years.” Funding is also vital. Nosek’s Center for Open Science is allocating $1.3 million to the Reproducibility Initiative, a project that aims to validate 50 high-impact cancer studies published between 2010 and 2012.
[This article originally appeared in print as "Research, Report, Repeat with Brian Nosek."]