Update 06 05 2009: Time readers may find this other post interesting!
Antidepressants are some of the most-prescribed drugs in the world. Yet they are also amongst the least well understood. We know little about how effective antidepressants are in the people who take them. Some antidepressants may work fantastically for most people. On the other hand some of them, perhaps all of them, may be useless or even worse. The truth is unclear.
This is a minority view. Opinions about antidepressants are polarized - most people either firmly believe that they do work, or firmly believe that they don't. Yet neither of these positions seems to me to be supported by the evidence available. I don't think that anyone ought to firmly believe anything about these drugs - except that better research is urgently needed.
Another placebo meta-analysis
The issue is not a lack of studies. After fifty years of research, and untold millions of research dollars, there are hundreds of published clinical trials of antidepressants. It's when you try to make sense of the results of this great mass of trials that the problems become apparent. The latest attempt to do that is a paper from a German-American collaboration, Rief et al.'s Meta-analysis of the placebo response in antidepressant trials. The authors set out to
Determine overall effect sizes of placebo and drug effects in antidepressant trials
In other words, they wanted to find out how much people improve when given antidepressants, and how much of that improvement is due to the placebo effect. They had plenty of data to work with. Even after discarding hundreds of trials for being too small or otherwise unsuitable:
The final sample consisted of 96 trials that reported sufficient data to compute effect sizes. The placebo groups of these studies comprised 9566 people. Approximately half of the studies were published after 1996, 68% were conducted in the United States, and the mean sample size was 86 participants.
And this is what they found after crunching the numbers:
The overall effect size [Cohen's d] of the placebo effect was 1.69 (95% CI=1.54–1.85), as compared to d=2.50 (95% CI=2.30–2.69) in the drug group. The ratio of the effect sizes suggests that 67.6% of the improvements in the drug group were attributable to the placebo effect [i.e. because 1.69 is 67.6% of 2.50].
That seems like a nice, neat and tidy result. When you give depressed people antidepressants, they get loads better (a standardized effect size of 2.50 is enormous), but most of that enormous improvement is due to the "placebo effect". However, the truth is not quite so neat.
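For readers who want to check the headline figure, here is a minimal sketch of the arithmetic. The two effect sizes are the ones quoted above; the function name placebo_share is just an illustrative label, not anything from the paper.

```python
# Effect sizes (Cohen's d) as quoted from Rief et al. (2009).
placebo_d = 1.69
drug_d = 2.50


def placebo_share(placebo: float, drug: float) -> float:
    """Fraction of the drug-group improvement that the placebo group also showed."""
    return placebo / drug


share = placebo_share(placebo_d, drug_d)
print(f"{share:.1%}")  # 67.6%
```

Note that this ratio only describes improvement in the placebo groups relative to the drug groups. Whether that improvement counts as a "placebo effect" is a separate question.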
It's a Little Bit More Complicated Than That
1. First off, none of the studies included in this analysis measured the placebo effect. The "placebo effect" is supposed to be the power of treatments to make people get better purely through making them expect to get better. It's certainly plausible that there could be big placebo effects in depression. There is plenty of anecdotal evidence that it happens.
In these studies, patients took either antidepressant pills or sugar pills. The patients given sugar pills were assessed as having got a lot better, on average. Is that evidence for the placebo effect? No, because as I've explained before, the improvement reported in the placebo group could be huge even if there were no "placebo effect" at all. The patients might have just got better spontaneously, because people who are depressed do tend to get better with time. It might have been that old chestnut, regression to the mean. Or maybe the patients only seemed to get better on average because the ones who didn't get better dropped out of the trial.
According to a meta-analysis of trials which actually did examine the placebo effect - by comparing people given placebos to people who got no treatment at all - the placebo effect in depression is at best small (Hrobjartsson & Gøtzsche 2004). However, the authors of this paper are well known for being very skeptical of placebos, and the number and quality of the trials was very low. There were 7 trials with a total of 258 patients. That's it. The only reasonable view is that we just don't know how powerful placebos are in depression.
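To see how regression to the mean alone can produce an apparent improvement, here is a toy simulation. Everything in it is an assumption made up for illustration: the score scale, the means and spreads, the entry cutoff of 20, and the function name simulate_untreated_trial. Nobody in the simulation receives any treatment, pill or otherwise, and nobody's underlying severity changes.

```python
import random
import statistics


def simulate_untreated_trial(n_screened=20000, entry_cutoff=20.0, seed=0):
    """Simulate symptom scores for people who get no treatment of any kind.

    Each person has a stable underlying severity. Every measurement adds
    independent noise (day-to-day fluctuation, rater error). Only people
    whose screening score reaches the cutoff are enrolled.
    All numbers are hypothetical.
    """
    rng = random.Random(seed)
    baseline, followup = [], []
    for _ in range(n_screened):
        true_severity = rng.gauss(15.0, 4.0)
        screening_score = true_severity + rng.gauss(0.0, 4.0)
        if screening_score >= entry_cutoff:
            baseline.append(screening_score)
            # Same underlying severity, fresh noise at the later visit.
            followup.append(true_severity + rng.gauss(0.0, 4.0))
    return baseline, followup


baseline, followup = simulate_untreated_trial()
mean_change = statistics.mean(baseline) - statistics.mean(followup)
print(f"enrolled: {len(baseline)}, mean apparent improvement: {mean_change:.1f} points")
```

Because people are enrolled on a day when their score happens to be high, their later scores drift back towards their own average, and the group looks improved. This sketch does not model spontaneous recovery or dropout, which would add to the effect.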
2. Rief et al. found that the size of the effects of antidepressants and placebos was much bigger when using "observer-rating" to measure the severity of depression, as compared to when patients rated their own symptoms. The difference between the two types of rating scale was enormous, dwarfing the drug vs. placebo difference:
In the placebo groups, there was a substantial difference between effect sizes for improvements rated by observers (d=1.85; 95% CI=1.69–2.01; 93 studies) compared to those rated by patients (d=0.67; 95% CI=0.49–0.85; 28 studies)...The difference between self-ratings and observer ratings was also found in the drug groups (self-rating d=1.12 versus observer rating d=2.89).
What does this mean? It could mean that psychiatrists tend to exaggerate small changes in their patients' depression. But it could mean that depression renders people unable to notice their own improvement. Perhaps the commonly used self-rating questionnaires, like the BDI, are just not very good at measuring depression, while observer rating scales, like the HAMD, are better. On the other hand it could be that self-rating scales are better, and observer-rating scales tend to exaggerate changes. Or...
Any or all of these could be true. Speaking as both a sufferer from depression and as a trained depression observer (I use the HAMD for research), I can confidently say that rating depression is one of the hardest things I ever have to do. Monitoring my own ups and downs, let alone putting a number on them, is extremely difficult. Trying to put a number on the mood of a patient who I've only known for an hour is even harder.
Poets and novelists struggle mightily to capture the purely qualitative aspects of our emotions. The idea that some guy reading off a printed list of questions could succeed at putting a number on a stranger's wellbeing in 5 minutes seems faintly absurd.
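To put numbers on the rating-scale gap in point 2, here is a rough sketch using only the four effect sizes quoted from the paper. The dictionary layout and variable names are my own.

```python
# Cohen's d values as quoted from Rief et al. (2009).
effect_sizes = {
    "observer": {"placebo": 1.85, "drug": 2.89},
    "self": {"placebo": 0.67, "drug": 1.12},
}

# Drug-minus-placebo difference within each rating method.
drug_vs_placebo = {
    method: round(arms["drug"] - arms["placebo"], 2)
    for method, arms in effect_sizes.items()
}

# Observer-minus-self difference within each trial arm.
observer_vs_self = {
    arm: round(effect_sizes["observer"][arm] - effect_sizes["self"][arm], 2)
    for arm in ("placebo", "drug")
}

print(drug_vs_placebo)   # {'observer': 1.04, 'self': 0.45}
print(observer_vs_self)  # {'placebo': 1.18, 'drug': 1.77}
```

On these figures, who does the rating shifts the effect size by more than the choice between drug and sugar pill does. Bear in mind that the self-rated figures come from far fewer studies, so this is a crude comparison.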
3. The results of this meta-analysis are much more favorable to antidepressants than was the analysis of Irving Kirsch et al. (2008), Initial Severity and Antidepressant Benefits: A Meta-Analysis of Data Submitted to the Food and Drug Administration. This was the (in)famous paper that everyone in the media thought proved that "Prozac doesn't work".
Kirsch et al. reported an average difference between the drug improvement and the placebo improvement of d=0.32, as against d=0.81 in this study. Conventionally, a standardized effect size d of 0.3 would be called "small" while 0.8 would be called "large". So this is a big difference. Why?
Again, there are plenty of possible reasons. Kirsch et al. included fewer trials - only 35 - and only considered "newer" antidepressants. Rief et al. included trials of older drugs. Kirsch et al. included unpublished drug company data; Rief et al. only included published trials, meaning that publication bias could have been a problem (although they say that there is no evidence it was).
Differences in the statistical techniques used could also explain it. As a series of outstanding posts by P J Leonard and Robert Waldmann last year showed, there were serious problems with the Kirsch et al. analysis; these are too technical to go into here but suffice it to say that if the authors of this analysis had chosen to use Kirsch et al.'s methods they might have reached very different conclusions.
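The comparison in point 3 can be reproduced in a few lines. This sketch assumes Cohen's usual rule-of-thumb cutoffs (0.2 small, 0.5 medium, 0.8 large); the function name cohen_label and the "negligible" label are mine, and the cutoffs are conventions rather than anything either paper relies on.

```python
def cohen_label(d: float) -> str:
    """Label an effect size using Cohen's conventional benchmarks."""
    magnitude = abs(d)
    if magnitude >= 0.8:
        return "large"
    if magnitude >= 0.5:
        return "medium"
    if magnitude >= 0.2:
        return "small"
    return "negligible"


# Drug-minus-placebo difference from the Rief et al. figures quoted earlier.
rief_difference = round(2.50 - 1.69, 2)
# Drug-minus-placebo difference as reported by Kirsch et al.
kirsch_difference = 0.32

for name, d in [("Rief et al.", rief_difference), ("Kirsch et al.", kirsch_difference)]:
    print(f"{name}: d={d:.2f} ({cohen_label(d)})")
```

The two analyses land in different bands, which is why the discrepancy matters. The labels themselves are arbitrary, though; nothing special happens to a patient at d=0.8.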
This is a plot of the degree of improvement experienced by patients in the placebo group in each trial. The average improvement ranges from zero to huge. In addition, more recently published trials tended to find greater improvements. Yet all of these patients were given the exact same thing - sugar pills. (This is not a new finding.)
Clearly, something is seriously wrong here. People suffering from the same disease given the same treatment should show similar responses. The most likely explanation is that these groups of people were not all suffering from the same disease. The diagnosis of "major depression" is increasingly seen as problematic; almost certainly there is in fact no single disease called "depression" at all. Yet every antidepressant clinical trial operates under the assumption that there is one.
Given this, it's no wonder that antidepressants, and placebos, give such wildly different results in different trials. The wonder, perhaps, is that we are still conducting such trials without first establishing what exactly we think clinical depression is and how best to measure it.
Winfried Rief, Yvonne Nestoriuc, Sarah Weiss, Eva Welzel, Arthur J. Barsky, Stefan G. Hofmann (2009). Meta-analysis of the placebo response in antidepressant trials. Journal of Affective Disorders. DOI: 10.1016/j.jad.2009.01.029