For three days, Alex Jofriet drank one 8-ounce glass of milk per day. This was a big deal because he hadn’t touched the stuff in years. Jofriet has Crohn’s disease, a poorly understood chronic inflammation of the intestine. Eat the wrong thing, or even too much of the right thing, and he would pay the price: hours in the bathroom wracked with stomach pain, gas and diarrhea.
But Jofriet took the risk. He and his doctor, Shehzad Saeed, decided to set up an experiment, and Jofriet would be the only subject. For those three days, he would drink milk like he wasn’t afraid, like he hadn’t spent the past eight years negotiating a careful truce with his digestive system. He would drink milk and record how he felt on those days. How many bowel movements did he have? Was there blood in his stool? Did he have stomach pain and bloating? Then Saeed would statistically compare Jofriet’s health on the days he drank milk with the days he didn’t.
This isn’t how scientists normally answer health questions. In fact, the question itself is all wrong. Medical research happens on the scale of populations. You’re supposed to ask big questions like, “Does drinking milk cause negative symptoms in Crohn’s disease patients?” Then you’re supposed to round up a large and representative sample of patients, randomly sort them into groups, test the hypothesis using placebos or other controls, and check to see whether all the individual results yield a statistically significant aggregate answer.
This is the basic outline of a randomized controlled clinical trial, the gold standard of medical research and the basis of most of the medical facts we hear from doctors and read in magazines like this one. It’s the best way we know of to learn about how our bodies work (and don’t work) and what it takes to fix them.
But even the gold standard isn’t perfect. The controlled clinical trial is really about averages, and averages don’t necessarily tell you what will happen to an individual. Such a trial might tell you that, statistically speaking, milk isn’t good for Crohn’s patients. But within that sample, there might be people who didn’t have any problems drinking milk, and people whose symptoms even got better while drinking it. In the doctor’s office, one on one, what you know about average results for a population is just the beginning, not the final word.
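To make the point about averages concrete, here is a small Python sketch. The numbers are invented for illustration and do not come from any real trial; the patient labels and the scoring scale (change in a daily symptom score while drinking milk, where a positive number means worse symptoms) are assumptions.

```python
from statistics import mean

# Hypothetical change in daily symptom score on milk versus off milk.
# Positive = symptoms got worse, negative = symptoms improved.
changes = {
    "patient_1": 3,
    "patient_2": 4,
    "patient_3": -1,
    "patient_4": 0,
    "patient_5": 2,
}

# The population-level answer: on average, milk made things worse.
average_change = mean(list(changes.values()))

# The individual-level answer: some patients were unaffected or improved.
unaffected_or_better = sorted(p for p, c in changes.items() if c <= 0)

print(f"Average change: {average_change}")
print(f"No worse on milk: {unaffected_or_better}")
```

In this made-up sample the average says milk is a problem, yet two of the five patients did no worse on it. A population average cannot tell a doctor which kind of patient is sitting in the exam room.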
One way to correct for the gaps the gold standard leaves in our knowledge is the “N of 1” trial, where the number of participants (N) is one instead of hundreds or thousands of volunteers. That one person works with the doctor to test a narrow hypothesis — for example, “I think drinking milk will make me feel sick. Am I right?” There are still controls. Ideally, there are still placebos. But at the end, what you get is a patient-specific, individualized answer. It’s a process shown — by controlled clinical trials, no less — to improve patient outcomes. And scientists working with these studies today, including Saeed, are almost invariably enthusiastic about N of 1’s potential.
The trouble is, other physicians and researchers enthusiastic about N of 1 studies have ended up disappointed. For 30 years, scientists in the United States, Canada and Australia have tried to set up services to help doctors do N of 1 trials. But the services keep failing, bogged down by cost, bureaucracy and a lack of interest from the very doctors and patients the studies are meant to help.
What makes a great idea such a flop in practice? Saeed and other researchers think they’ve figured out the problems. And this time, instead of fading away yet again, N of 1 is here to stay.
Trust but Verify
Alex Jofriet was always a picky eater. But somewhere around fourth grade, he got sick, and the casual pickiness — the “wake up in the morning and decide bologna is gross today” kind of pickiness — turned into something else. His stomach began to hurt. He spent hours on the toilet, suffering recurrent bouts of diarrhea.
Between that and throwing up much of what he ate, it was just easier not to eat. By the time everyone realized this was more than just a fussy distaste for a few foods or a bad stomach flu, Jofriet had lost 20 pounds and was so exhausted that he had to sit down on the shelves at the grocery store to rest. At age 10, the calcium-deficient bones in his back cracked and fractured.
That was the beginning of Jofriet’s life with Crohn’s disease. It was also the beginning of a weird relationship with food. Over the next eight years, he would go through numerous surgeries and treatments, and those things, along with the actual symptoms of Crohn’s, would affect what he could eat and when. Sometimes he subsisted on nothing but Teddy Grahams and cans of Pediasure. Sometimes he didn’t eat at all, receiving all his nutrition through a tube that snaked from a pack on his back into his nose and down to his stomach. The only way Jofriet could go to homecoming during his sophomore year of high school was on a six-hour pass from the hospital where he was recovering from yet another abdominal surgery.
Today, at 18, he’s enjoying the best health he’s had since grade school. Nobody is exactly sure why his illness flared up in horrible ways throughout junior high and high school, but after some experimentation with his doctor, he’s finally found a medication that keeps his Crohn’s under control. And that has left Jofriet in an awkward position. Increasingly healthy, he doesn’t want to do anything that could jeopardize that health. It’s not unusual for Crohn’s patients in his situation to experiment informally, jumping mostly by instinct into fad diets, supplements, new foods and alternative therapies.
That kind of experimentation isn’t limited to Crohn’s patients. Everybody has tried one-person experiments at some point. You want to lose weight, so you try the low-carb diet you keep hearing about on the news. You have arthritis, and you think acupuncture might relieve your pain better than medication. Your kid has a cold, and taking a friend’s advice, you give him some zinc. In that way, N of 1 trials are nothing new.
The problem is that the results of all these little experiments are suspect. Few of us start by documenting a baseline, tracking symptoms before we try a new treatment. Nor do we usually document what happens after we start the new treatment or test different treatments separately, comparing them with each other and with what happens to our bodies without any treatment.
That lack of rigor is what separates basic, everyday decision-making about health from formal N of 1 experiments. Some N of 1 experiments are more formalized than others, but the best have three important elements that are missing when patients, or patients and their doctors, are just trying things out, says Naihua Duan, a retired Columbia University biostatistician. Duan was part of a team of experts convened by the federal Agency for Healthcare Research and Quality that recently published a user’s guide to N of 1 experiments.
The first characteristic of a formal N of 1 experiment is randomized assignment of treatment conditions. That is, the patient should cycle, in a randomly determined order, either between periods using the active treatment and periods using some kind of placebo, or between periods of two different treatments. Second, neither the doctor nor the patient should know when the patient is taking the placebo or the treatment. That’s called blinding. Finally, the doctor and patient should track symptoms in detail throughout the experiment, just as Jofriet and Saeed did. If one of the symptoms is bloating, the amount of bloating should be recorded at various points in the day, every day, throughout the experiment.
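The randomization and tracking steps can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure used by Saeed, Duan or the federal user’s guide: the function names `make_schedule` and `mean_difference`, the choice to randomize within pairs of periods, and the symptom scores are all hypothetical. Blinding is not modeled here; in a real trial, someone other than the doctor and patient would hold the schedule.

```python
import random
from statistics import mean

def make_schedule(n_pairs, seed=None):
    """Build a schedule of treatment periods.

    Each pair of periods contains one "A" (active treatment) and one "B"
    (placebo or comparison treatment) in random order.
    """
    rng = random.Random(seed)
    schedule = []
    for _ in range(n_pairs):
        pair = ["A", "B"]
        rng.shuffle(pair)  # randomize which arm comes first in this pair
        schedule.extend(pair)
    return schedule

def mean_difference(schedule, scores):
    """Return mean symptom score on A periods minus mean score on B periods."""
    if len(schedule) != len(scores):
        raise ValueError("need exactly one symptom score per period")
    a_scores = [s for arm, s in zip(schedule, scores) if arm == "A"]
    b_scores = [s for arm, s in zip(schedule, scores) if arm == "B"]
    return mean(a_scores) - mean(b_scores)

# Hypothetical usage: three pairs of periods, one made-up score per period.
schedule = make_schedule(n_pairs=3, seed=1)
scores = [2, 5, 4, 1, 2, 6]
print(schedule)
print(mean_difference(schedule, scores))
```

Randomizing within pairs guarantees that each arm gets the same number of periods, which keeps a short trial balanced. A real analysis would go beyond a simple difference in means and ask whether the difference is larger than day-to-day fluctuation alone would produce.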
Psychologists have been running one-person experiments, using some of these basic principles, since the mid-20th century. But the idea of single-subject research didn’t really make the leap to medicine of the body until the early 1980s when Gordon Guyatt, a Canadian physician now known as a founder of evidence-based medicine, began working in an interdisciplinary department at McMaster University in Ontario, with psychologists, biostatisticians, ethicists and clinical epidemiologists all working together.
At a weekly departmental seminar, one person would present his or her current research, and the others would lob around criticism and ideas. In these debates, Guyatt remembers, one of the psychologists kept bringing up the idea of N of 1 trials. Guyatt decided to learn more.
At the time, he was dealing with a lot of patients whose health experiences didn’t match up with the results of large randomized clinical trials. One of those was an asthmatic septuagenarian whose three prescribed medications didn’t seem to be helping. N of 1, Guyatt realized, could solve that problem.
Guyatt and his team meant to do three blind comparisons between a placebo and one of the three drugs: theophylline, a bronchodilator that eases breathing. But after switching between theophylline and placebo two times, they had to stop the experiment. It was already clear that changing the medication was making a big difference — and the patient was healthier on the placebo. The active drug was actually making the patient worse, not better. The results didn’t mean theophylline, which is still in use, was a bad drug. It was just a bad drug for this particular patient. With theophylline removed from his regimen, the patient flourished.
Guyatt published the results in The New England Journal of Medicine in 1986. “To definitively establish whether or not something works in an individual is kind of a thrill,” he says. “A trial of 1,000 people is a long-term process. Even just recruitment can take three years. There’s a lot of slogging. But N of 1 gets you answers quickly.”
While the improvements in N of 1 patients aren’t always as dramatic, research over the past 20-plus years has produced solid support for the idea that these studies really can help doctors and patients work together to make better decisions. That’s especially true when it comes to chronic illness.
Saeed, Jofriet’s doctor at Cincinnati Children’s Hospital Medical Center, relates another case: a girl with Crohn’s who believed that taking a probiotic eased her symptoms. Saeed’s subsequent N of 1 experiment showed that the probiotic wasn’t actually helping; the fluctuation in symptoms existed independently of whether she was taking it. So she stopped, and instead, she and her doctors focused on trying to figure out why the symptoms fluctuated.
In 1990, when Guyatt and his team published the first review of their work, this was the kind of result they were looking for: experiments that answered a question and changed a patient’s treatment plan. Of the 70 N of 1 experiments they’d done at that point, 50 led to a definitive answer, and 39 percent of those answers led to a change in treatment. Later studies found similar benefits of N of 1 trials.
And yet, from Guyatt’s perspective, N of 1 trials have been a disappointment. He and other scientists published a lot of research on the experiments in the early to mid-1990s, but then they mostly gave up. Although the experiments produced useful results, their logistical complexities made them expensive and difficult for individual patients and doctors to manage properly. At the same time, Guyatt and other scientists found themselves fighting internal bureaucracy for the right to do the experiments at all. “We hung on for five years or so,” Guyatt says. “And people still call me to ask about it periodically. I have a chat with them, and I say, ‘good luck.’ ”
Rise and Fall and Rise Again
Today, few people have even heard of N of 1 experiments. When Richard Kravitz, co-vice chair of research in the department of internal medicine at the University of California, Davis, does focus groups to see what doctors think about N of 1 trials, he usually has to start by explaining what an N of 1 trial is.
Kravitz studies how doctors’ behavior affects patients’ health, and he sees N of 1 experiments as a clever means of changing the way medicine is done — a chance to tailor care to patients’ individual needs without relying on ultra-high-tech, ultra-expensive genetic-sequencing technologies. “It allows you to implement personalized medicine without the ‘omics,’ ” he says. “You can be rigorous and scientific, but you don’t need a lab.” Unfortunately, while N of 1 experiments may be more down to earth than personal genomics, they come with their own drawbacks — issues that led researchers like Guyatt to abandon them 20 years ago.
First and foremost: It’s just plain complicated to do an N of 1 trial. Most doctors lack the time, tools, administrative help and extra funding they’d need to make it work. And most trials also require a placebo, or sugar pill, with both the placebo and the drug disguised so that neither the patient nor the doctor knows which is being taken when.
That’s more complicated than it sounds, says Paul Glasziou, who used to run a service that helped doctors design N of 1 experiments at the University of Queensland in Brisbane, Australia. Despite their simple ingredients, placebos aren’t cheap. “[It] means getting pharmaceutical companies to shut down the usual production system and put in inert powder instead,” he says. For a simple one-off experiment, you could go to a compounding pharmacist to make capsules that hide either the active drug or a placebo, but those pharmacies aren’t common, and they rarely work in bulk. That’s a problem if you’re trying to set up a service where, ideally, lots of doctors would come to get placebos or capsules for hundreds of patients.
Bureaucracy is the second challenge that looms over N of 1. Any time scientists want to do an experiment on a human, they have to get the plan approved by an Institutional Review Board (IRB). N of 1 trials don’t fit into that established bureaucracy in a clearly defined way. The trials aren’t exactly research: Nobody is using them to figure out whether a drug works or is safe before it’s released to the general public. This is just doctors taking approved drugs and figuring out whether they work for specific patients.
But at the same time, they are each an experiment. And the process of securing placebos and setting up the proper controls convinced some IRBs that N of 1 trials should fall under standard experimental ethics protocols. “IRBs were totally unused to the idea,” Kravitz says. “There were some that even required separate approval for every trial.” So every time a doctor and a patient wanted to compare a drug to a placebo, they had to get IRB approval first. “Programs just collapsed,” Kravitz says. These problems sank the program Guyatt set up to help doctors at McMaster design and conduct N of 1 trials, and many other programs never got past the planning stages.
When interest in N of 1 experiments began to rebound in the last decade, proponents had to deal with the same problems. Sunita Vohra runs the N of 1 service at the University of Alberta, Canada. Although she got her funding in 2004, she couldn’t launch the service until 2006. The intervening years were filled with prolonged negotiations to convince the university that N of 1 experiments were primarily about improving patient care, not doing research on how drugs and other treatments worked.
Today, she can set up all sorts of pediatric N of 1 experiments. She’s tested whether probiotics would help a child with eczema. She helped a family figure out whether a supplement they bought in another country was actually improving their child’s arthritis. And she can do these things without sinking into a bureaucratic morass.
Meanwhile, the program at Cincinnati Children’s Hospital Medical Center — the one that helped Jofriet — has found ways around the costs and complications. Technology helps: Jofriet uses a web interface where he and Saeed, his doctor, can set goals, plan experiments and track Jofriet’s symptoms in between office appointments. Jofriet can send text messages or email and fill out online surveys.
But Cincinnati Children’s also streamlined the N of 1 process by discarding placebos and blinding. For instance, when Jofriet did the experiment to see whether drinking milk would make him sick, he and his doctors didn’t create fake “milk” — they knew when Jofriet was drinking milk and when he wasn’t.
That sounds like a serious sacrifice. After all, blinding and placebos, along with random assignment and documentation of outcomes, are supposed to be what make N of 1 experiments different from just randomly fiddling around. But Kravitz says that experiments like Jofriet’s still count as N of 1. In fact, getting rid of placebos could be a good thing, and not just because it makes the experimental process easier. Any time a treatment works, Kravitz says, the effect is actually a combination of the biological effects of a drug and a whole host of other effects, including placebo effects triggered by factors such as how much confidence the doctor exudes in the exam room or the color of the pill. It’s the total effect that matters.
Think back to Jofriet’s milk trial. The point of that experiment wasn’t just to see what happened when he drank milk, Saeed says. It was also about emotional and psychological reassurance. Jofriet hadn’t eaten a normal diet in years. Trying milk was the first step toward that. He needed to see himself drinking the milk, just as he needed to see the results: that it wasn’t hurting him.
Over the next couple of months, Jofriet went on to do the same kinds of experiments with Cheerios, green vegetables, fruit, peanut butter, pretzels and meat. All those were successful, too, not just because he didn’t get sick, but because he got back a normal life.
That’s why N of 1 experiments are worth doing, even if you can’t do them perfectly, says Michael Seid, director of health outcomes and quality of care research at Cincinnati Children’s Hospital. From Seid’s perspective, N of 1 experiments are not only a cheaper and lower-tech route to personalized medicine; the personalization they offer is even better than what you get with the high-tech tools.
Personalization is Key
Single-subject trials allow you to personalize the outcome, not just the treatment. With the help of N of 1 experiments, Jofriet could decide that “being able to drink milk” was the desired outcome, and he and his doctor could figure out how to make that happen.
That ability to improve the quality of care in a truly personalized way explains why N of 1 trials are making a comeback, Naihua Duan says. It’s part of a bigger trend that also affects large placebo-controlled clinical trials. Historically, those “gold standard” trials have been focused on achieving FDA approval for a drug or producing generalizable knowledge about how the body works and responds to medications. In other words, they were focused on serving scientists.
That’s changing, Duan says. Over the last decade, medical researchers have begun to put more of an emphasis on doing research that serves patients’ needs directly. Some of that happens in the form of large placebo-controlled clinical trials aimed at answering questions such as which of two existing treatments for the same disease produces the best results for the least money. The trend toward patient-centered research is reshaping the way medical studies are done. In 2008, the movement got its own journal, The Patient: Patient-Centered Outcomes Research. In 2010, the Affordable Care Act established the Patient-Centered Outcomes Research Institute.
N of 1 experiments fit neatly into this shift in thinking, and this could be the catalyst that makes these experiments succeed where they had once failed. Duan certainly hopes so, and he’s using the opportunity to make sure more people learn about N of 1 experiments and what they can do. This summer, he and Kravitz began an ambitious study comparing N of 1 experiments with standard medical treatment in almost 250 patients. The study represents one of the first times that N of 1 has gone head to head with traditional health care. Depending on what Duan and Kravitz learn, the study could lead to a world with more Alex Jofriets — and more opportunities for people to improve their health, one patient at a time.
[This article originally appeared in print as "Singled Out."]