An unassuming little paper in the latest Journal of Affective Disorders may change everything in the debate over antidepressants: Not as golden as standards should be: Interpretation of the Hamilton Rating Scale for Depression.
Bear with me and I'll explain. It's less boring than it looks, trust me.
The Hamilton Scale (HAMD) is the most common system for rating the severity of depression. If you're only a bit down you get a low score; if you're extremely ill you get a high one. The maximum score's 52 but in practice it's extremely rare for someone to score more than 30.
First published in 1960, the HAMD is used in most depression research including almost all clinical trials of antidepressants. It's come under much criticism recently, but that's not the point here. The authors of the new paper, Kriston & von Wolff, simply asked: what does a given HAMD score mean in terms of severity?
It turns out that people have proposed no fewer than five different systems for interpreting HAMD scores. Do they all agree? Ha. Guess.
The pretty colors are mine. Just a glance shows a lot of variability, but the obvious outlier is the second one. That's the American Psychiatric Association (APA)'s official 2000 recommendations. They tend to label a given point on the scale as more severe than everyone else does.
This is most apparent at the top end. The APA use the terminology "Very Severe", which doesn't even appear on other scales. Much of what they class as "Very Severe" (23-26), two other scales class as "Moderate" depression! Amusingly, British authorities NICE seem to have been so unimpressed with this that they simply copied the APA's scale and toned everything down a notch for their 2009 criteria.
Why does this purely terminological debate matter? Well. A number of recent studies, most notoriously Kirsch et al (2008), have shown that antidepressants work better in more severe cases. See also my post here. The cut-off for antidepressants being substantially better than placebo generally comes out as about 26 on the HAMD in these studies.
Under the APA's 2000 terminology, this is well into the "Very Severe" band. Hence why Kirsch et al wrote - in a phrase that launched a thousand "Prozac Doesn't Work" headlines -
antidepressants reach... conventional criteria for clinical significance only for patients at the upper end of the very severely depressed category.
But for Bech, 26 is simply middle-of-the-road "major depression". For Furukawa, it's borderline "moderate" or "severe". Hmm. So if they'd gone with those criteria, Kirsch et al would have written instead
antidepressants reach... conventional criteria for clinical significance only for patients with major depression, of moderate-to-severe severity.
All of these terminological criteria are arbitrary, so this isn't necessarily more accurate, but it's no less so. The irony of the fact that Kirsch et al used the American Psychiatric Association's own criteria to skewer modern psychiatry isn't lost on me and probably wasn't lost on them either.
But where did the APA get their system from? This is the most extraordinary thing. Here's the paper they based their approach on. It's a 1982 British study by Kearns et al. The authors wanted to see how the HAMD compared to other depression scales. So they used lots of scales on the same bunch of depressed patients and compared them to each other, and to their own judgments of severity. Here's what they found:
You'll recognize the APA's categories, kind of, but they're all shifted. Why? We can only guess. Here's my guess. The scores in that Kearns et al graph were the average HAMD scores of people who fell into each severity band. The APA must have decided that they could use these to create cutoffs for severity.
How? It's not at all clear. The mean score for "Moderate" was 18, but that's the top end of Moderate in the APA's book; ditto for "Mild". The average "Very Severe" was 30 and the average "Severe" was 21, so the cut-off should have been 25 or 26 if you just went for the midpoint (25.5). In fact the APA went with 23. And so on.
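To make that midpoint arithmetic concrete, here's a quick sketch. It assumes the "halfway between adjacent band means" rule, which is only my guess at what a sensible cutoff procedure might look like, not anything the APA documented. The function name `midpoint_cutoffs` is made up for illustration, and only the three mean scores quoted above are used.

```python
# Mean HAMD scores per clinician-judged severity band, as quoted in the text
# from Kearns et al. (1982). Only the three means stated above are included.
band_means = {"Moderate": 18, "Severe": 21, "Very Severe": 30}


def midpoint_cutoffs(means):
    """Return the halfway point between each pair of adjacent band means.

    Assumption: the dict is ordered from least to most severe.
    """
    names = list(means)
    return {
        f"{lower} / {upper}": (means[lower] + means[upper]) / 2
        for lower, upper in zip(names, names[1:])
    }


cutoffs = midpoint_cutoffs(band_means)
for boundary, value in cutoffs.items():
    print(f"{boundary}: {value}")
```

Under this rule the Severe / Very Severe boundary lands at 25.5, well above the APA's 23.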
That's before we get into the question of whether you should be using these results to make cutoffs at all (you shouldn't). And the APA seem to have ignored the fact that the HAMD did not distinguish between "Severe" and "Moderate" depression to a statistically significant degree anyway (p=0.1). Kearns et al's graph shows that other scales, like the Melancholia Subscale ("MS"), would be better. But everyone's been using the HAMD for the past 50 years regardless.
In Summary: Interpreting the Hamilton Scale is a minefield of controversy and the HAMD is far from a perfect scale of depression. Yet almost everything we know about depression and its treatment relies on the HAMD. Don't believe everything you read.
Kriston, L., & von Wolff, A. (2010). Not as golden as standards should be: Interpretation of the Hamilton Rating Scale for Depression. Journal of Affective Disorders. DOI: 10.1016/j.jad.2010.07.011
Kearns, N., Cruickshank, C., McGuigan, K., Riley, S., Shaw, S., & Snaith, R. (1982). A comparison of depression rating scales. The British Journal of Psychiatry, 141 (1), 45-49. DOI: 10.1192/bjp.141.1.45