Antidepressants may help depression in some people but make it worse for others, according to a new paper.

This is a tough one so bear with me.

Gueorguieva, Mallinckrodt and Krystal re-analysed the data from a number of trials of duloxetine (Cymbalta) vs placebo. Most of the trials also included another antidepressant (an SSRI). The SSRIs and duloxetine seemed to be indistinguishable, so from now on I'll just call it antidepressants vs. placebo, as the authors did.

People on placebo got, on average, moderately better over 8 weeks.

People on antidepressants fell into two classes. The largest class got, on average, a lot better. But about 25% did poorly, staying just as depressed as before. This "nonresponder" group did much worse than the placebo group - again on average. Here you can see the mean "trajectories" of depression symptoms (HAMD scores) in the three groups:

This raises the scary possibility that while antidepressants are helping some people, they're harming others. But hang on. It's complicated.

First off, maybe this is all a statistical illusion. When the authors say that the people on drug fell into two classes, what they mean is that when you fit the data with a certain mathematical model, assuming either 1, 2, 3 or 4 underlying classes, the 2-class solution was the best fit, while for placebo a 1-class solution was best.

We considered linear, quadratic, and cubic trends over time, with between 1 and 4 trajectory classes. We also considered piecewise models with a change point at 2 weeks, linear change before week 2, and quadratic change after week 2. The selection of the best model was based on the Schwartz-Bayesian information criterion and on the Lo-Mendell-Rubin (LMR) likelihood ratio test...

That's nice... but they don't present the raw data. They don't tell us whether, looking at the individual trajectories of people on antidepressants, you'd actually see two classes. What I want is a graph of how likely people are to get better by a certain amount. If Gueorguieva et al are right, I want it to look like this i.e. bimodal -

We're not shown this graph. I'll eat my hat if it does look like that, frankly, because if it did people would have noticed the bimodality in antidepressant trials ages ago.

True, statistical models can tell us things that aren't obvious by inspection, so even if this isn't what the data look like, they might still be right. It could be that the two "peaks" are so broad, and there's so much random noise, that they blur into one.
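To put a rough number on that blurring, here is a minimal Python sketch. It rests on my own simplifying assumptions, not on anything in the paper: each class's improvement scores are normally distributed with the same spread (1 unit), and the two classes are equally large. `mixture_density` and `count_peaks` are hypothetical helper names. In that idealised case, the two classes only show up as two separate humps once their means are more than two standard deviations apart.

```python
from statistics import NormalDist

def mixture_density(x, separation):
    """Density at x of an equal mix of N(0, 1) and N(separation, 1).

    Equal class sizes and equal spreads are illustrative assumptions.
    """
    low = NormalDist(0.0, 1.0)
    high = NormalDist(separation, 1.0)
    return 0.5 * low.pdf(x) + 0.5 * high.pdf(x)

def count_peaks(separation, step=0.01):
    """Count local maxima of the mixture density on a fine grid."""
    n_points = round((separation + 8.0) / step) + 1
    xs = [-4.0 + i * step for i in range(n_points)]
    ys = [mixture_density(x, separation) for x in xs]
    return sum(1 for i in range(1, n_points - 1) if ys[i - 1] < ys[i] > ys[i + 1])

# Separation between the class means, in standard deviations.
for separation in (1.0, 3.0, 4.0):
    print(separation, count_peaks(separation))
# 1.0 1
# 3.0 2
# 4.0 2
```

So two genuinely distinct classes one standard deviation apart would look like a single hump in a histogram, which is the sense in which a model might pick up something the eye can't.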

However, it's also true that you can fit an infinite number of models to any set of data and at some point you have to step back and say - am I making this more complicated than it needs to be?

It could be that a 2-class model is better than a 1-class model for the people on antidepressants, but only because they're both crap, and really, every patient has a different, unpredictable trajectory which is poorly captured by such models.
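Here is a toy version of that worry, as a hedged Python sketch. It is a much simpler stand-in for what the authors did: a one-dimensional normal mixture fitted by EM and compared by BIC, rather than a trajectory model over 8 weeks. All function names are hypothetical, and both datasets are artificial: evenly spaced quantiles of a single normal population and of a single skewed (exponential) population. Neither contains two kinds of people.

```python
import math
from statistics import NormalDist

def log_normal_pdf(x, mean, var):
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)

def one_class_loglik(data):
    """Log-likelihood of the best-fitting single normal."""
    n = len(data)
    mean = sum(data) / n
    var = sum((x - mean) ** 2 for x in data) / n
    return sum(log_normal_pdf(x, mean, var) for x in data)

def e_step(xs, weights, means, variances):
    """Return the log-likelihood and each point's probability of class 0."""
    loglik, resp = 0.0, []
    for x in xs:
        logs = [math.log(w) + log_normal_pdf(x, m, v)
                for w, m, v in zip(weights, means, variances)]
        top = max(logs)
        total = top + math.log(sum(math.exp(l - top) for l in logs))
        loglik += total
        resp.append(math.exp(logs[0] - total))
    return loglik, resp

def two_class_loglik(data, iterations=300, var_floor=1e-4):
    """EM fit of a two-class normal mixture (no guard against an empty class)."""
    xs = sorted(data)
    n = len(xs)
    halves = [xs[: n // 2], xs[n // 2:]]  # start from the lower and upper halves
    weights = [0.5, 0.5]
    means = [sum(h) / len(h) for h in halves]
    variances = [max(sum((x - m) ** 2 for x in h) / len(h), var_floor)
                 for h, m in zip(halves, means)]
    for _ in range(iterations):
        _, resp = e_step(xs, weights, means, variances)
        members = [resp, [1.0 - r for r in resp]]
        for k in range(2):
            size = sum(members[k])
            means[k] = sum(r * x for r, x in zip(members[k], xs)) / size
            spread = sum(r * (x - means[k]) ** 2 for r, x in zip(members[k], xs)) / size
            variances[k] = max(spread, var_floor)
            weights[k] = size / n
    loglik, _ = e_step(xs, weights, means, variances)
    return loglik

def bic(loglik, n_params, n):
    return n_params * math.log(n) - 2.0 * loglik  # lower is better

def best_class_count(data):
    n = len(data)
    one = bic(one_class_loglik(data), 2, n)   # mean, variance
    two = bic(two_class_loglik(data), 5, n)   # 2 means, 2 variances, 1 weight
    return 1 if one <= two else 2

n = 200
probs = [(i + 0.5) / n for i in range(n)]
symmetric = [NormalDist().inv_cdf(p) for p in probs]  # one normal population
skewed = [-math.log(1.0 - p) for p in probs]          # one skewed population

print(best_class_count(symmetric))  # 1
print(best_class_count(skewed))     # 2
```

In this toy setup the skewed data "prefer" two classes simply because one bell curve is a bad description of them, not because two kinds of people are hiding in there. Whether anything like that is going on in the duloxetine data, I can't say.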

Let's assume however that this is true. What would it mean?

Firstly, the fact that one class of people on antidepressants does worse than people on placebo doesn't mean that antidepressants are harming them. The authors miss this point when they say:

there are 2 trajectories for patients treated with antidepressants and 1 trajectory for patients treated with placebo [so] some patients would seem to be more effectively treated with placebo than with a serotonergic antidepressant.

But that's fallacious. It treats a purely statistical entity as representing individual people. Suppose that what antidepressants do is to take people who, on placebo, would have improved a bit, and make them improve a bit more than they otherwise would have. You'd then end up with more people doing well, but also fewer people doing moderately because they'd have been "moved up" out of the middle ground.

That "nudging people off the fence" could lead to a bimodal distribution and two distinct classes. But in this case the people doing badly would have done badly either way. The drug didn't make them do badly, it just made doing-badly into a class. On the other hand, the data are also consistent with antidepressants doing real harm. We can't tell.
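The "nudging" scenario is easy to simulate, and here is a minimal Python sketch of it. Every number is made up for illustration: the placebo improvements (mean 8 HAMD points, SD 3), the threshold of 6 points above which the drug is assumed to help, and the size of the boost (8 points). `simulate_nudge` is a hypothetical name. Note that the simulation gets to see what each person would have done on both placebo and drug, which no real trial can.

```python
import random
from statistics import mean

def simulate_nudge(n=10000, seed=0, threshold=6.0, boost=8.0):
    """Return each person's improvement on placebo and on drug.

    Assumed drug effect: anyone who would improve by at least `threshold`
    points on placebo improves by `boost` points more on drug; everyone
    else is completely unaffected.
    """
    rng = random.Random(seed)
    placebo = [rng.gauss(8.0, 3.0) for _ in range(n)]
    drug = [p + boost if p >= threshold else p for p in placebo]
    return placebo, drug

placebo, drug = simulate_nudge()

# In the drug arm nobody lands between 6 and 14 points, so two classes appear.
low_class = [d for d in drug if d < 6.0]
high_class = [d for d in drug if d >= 14.0]
print(len(low_class) + len(high_class) == len(drug))  # True

# The low class does worse, on average, than the placebo arm as a whole...
print(mean(low_class) < mean(placebo))                # True

# ...yet no individual does worse on drug than they would have on placebo.
print(any(d < p for p, d in zip(placebo, drug)))      # False
```

A "nonresponder" class that trails the placebo average falls straight out of a drug that, by construction, harms nobody. The same picture could equally be produced by a drug that does harm some people, which is why the averages alone can't settle it.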

We do know that other randomized controlled trials show very convincingly that in a small minority of people, mostly but not exclusively young people, antidepressants do worsen suicidal thoughts and behaviours. So it's plausible. But we just don't know yet.

What worries me is that this paper is the latest in a series of attempts to use, well, creative statistical approaches to antidepressant trial data. This one is nowhere near as dodgy as the Cherrypicker's Manifesto I discussed last year, but it cites that paper and others by the same group. The first sentence of the Abstract of this paper makes the intention clear:

The high percentage of failed clinical trials in depression may be due to high placebo response rates and the failure of standard statistical approaches to capture heterogeneity in treatment response.

In other words, the reason clinical trials of new antidepressants often fail to show a benefit over placebo is not because the drugs are crap but because the statistics aren't subtle enough. And you can see where this is going: if only we could use statistical models to find the people who do benefit from antidepressants, and compare them to placebo, there'd be no problem...

Gueorguieva R, Mallinckrodt C, and Krystal JH (2011). Trajectories of depression severity in clinical trials of duloxetine: insights into antidepressant and placebo responses. Archives of General Psychiatry, 68 (12), 1227-37. PMID: 22147842