Should a dodgy paper on antidepressants be retracted? And what's scientific retraction for, anyway?
Read all about it in a new article in the BMJ: Rules of Retraction. It's about the efforts of two academics, Jon Jureidini and Leemon McHenry. Their mission - so far unsuccesful - is to get this 2001 paper retracted: Efficacy of paroxetine in the treatment of adolescent major depression.
Jureidini is a member of Healthy Skepticism, a fantastic Australian organization that Neuroskeptic readers have encountered before. They've got lots of detail on the ill-fated "Study 329", including internal drug company documents, here.
So what's the story? Study 329 was a placebo-controlled trial of the SSRI paroxetine (Paxil, Seroxat) in 275 depressed adolescents. The paper concluded: that "Paroxetine is generally well tolerated and effective for major depression in adolescents." It was published in the Journal of the American Academy of Child and Adolescent Psychiatry (JAACAP).
There's two issues here: whether paroxetine worked, and whether it was safe. On safety, the paper concluded that "Paroxetine was generally well tolerated...and most adverse effects were not serious." Technically true, but only because there were so many mild side effects.
In fact, 11 patients on paroxetine reported serious adverse events, including suicidal ideation or behaviour, and 7 were hospitalized. Just 2 patients in the placebo group had such events. Yet we are reassured that "Of the 11, only headache (1 patient) was considered by the treating investigator to be related to paroxetine treatment."
The drug company argue that it didn't become clear that paroxetine caused suicidal ideation in adolescents until after the paper was published. In 2002, British authorities reviewed the evidence and said that paroxetine should not be given in this age group.
That's as maybe; the fact remains that in this paper there was a strongly raised risk. However, in fairness, all that data was there in the paper, for readers to draw their own conclusions from. The paper downplays it, but the numbers are there.
The efficacy question is where the allegations of dodgy practices are most convincing. The paper concludes that paroxetine worked, while imipramine, an older antidepressant, didn't.
Jureidini and McHenry say that paroxetine only worked on a few of the outcomes - ways of measuring depression and how much the patients improved. On most of the outcomes, it didn't work, but the paper focusses on the ones where it did. According to the BMJ
Study 329’s results showed that paroxetine was no more effective than the placebo according to measurements of eight outcomes specified by Martin Keller, professor of psychiatry at Brown University, when he first drew up the trial.
Two of these were primary outcomes...the drug also showed no significant effect for the initial six secondary outcome measures. [it] only produced a positive result when four new secondary outcome measures, which were introduced following the initial data analysis, were used... Fifteen other new secondary outcome measures failed to throw up positive results.
Here's the worst example. In the original protocol
, two "primary" endpoints were specified: the change in the total Hamilton Scale (HAMD
) score, and % of patients who 'responded', defined as either an improvement of more than 50% of their starting HAMD score or a final HAMD of 8 or below.
On neither of these measures did paroxetine work better than placebo at the p=0.05 significance level. It did work if you defined 'responded' to mean only a final HAMD of 8 or below, but this was not how it was defined in the protocol. In fact, the Methods section of the paper follows the protocol faithfully. Yet in the Results section, the authors still say that:
Of the depression-related variables, paroxetine separated statistically from placebo at endpoint among four of the parameters: response (i.e., primary outcome measure)...
It may seem like a subtle point. But it's absolutely crucial. Paroxetine just did not work on either pre-defined primary outcome measure, and the paper says that it did.
Finally, there were also issues of ghostwriting. I've never been that concerned by this in itself. If the science is bad, it's bad whoever wrote it. Still, it's hardly a good thing.
Does any of this matter? In one sense, no. Authorities have told doctors not to use paroxetine in adolescents with depression since 2002 (in the UK) and 2003 (in the USA). So retracting this paper wouldn't change much in the real world of treatment.
But in another sense, the stakes are enormous. If this paper were retracted, it would set a precedent and send a message: this kind of p-value fishing to get positive results, is grounds for retraction.
This would be huge, because this kind of fishing is sadly very common. Retracting this paper would be saying: selective outcome reporting is a form of misconduct. So this debate is really not about Seroxat, but about science.
There are no Senates or Supreme Courts in science. However, journal editors are in a unique position to help change this. They're just about the only people (grant awarders being the others) who have the power to actually impose sanctions on scientists. They have no official power. But they have clout.
Were the JAACAP to retract this paper, which they've so far said they have no plans to do, it would go some way to making these practices unacceptable. And I think no-one can seriously disagree that they should be unacceptable, and that science and medicine would be much better off if they were. Do we want more papers like this, or do we want fewer?
So I think the question of whether to retract or not boils down to whether it's OK to punish some people "to make an example of them", even though we know of plenty of others who have done the same, or worse, and won't be punished.
My feeling is: no, it's not very fair, but we're talking about multi-billion pound companies and a list of authors whose high-flying careers are not going to crash and burn just because one paper from 10 years ago gets pulled. If this were some poor 24 year old's PhD thesis, it would be different, but these are grown-ups who can handle themselves.
So I say: retract.
Newman, M. (2010). The rules of retraction BMJ, 341 (dec07 4) DOI: 10.1136/bmj.c6985
Keller MB, et al. (2001). Efficacy of paroxetine in the treatment of adolescent major depression: a randomized, controlled trial. Journal of the American Academy of Child and Adolescent Psychiatry, 40 (7), 762-72 PMID: 11437014