By Luke Jostins, a postgraduate student working on the genetic basis of complex autoimmune diseases. Jostins has a strong background in informatics and statistical genetics, and writes about genetic epidemiology and sequencing technology on his blog Genetic Inference. A different version of this post appeared on the group blog Genomes Unzipped.
One of the great hopes for genetic medicine is that we will be able to predict which people will develop certain diseases, and then focus preventative measures on those at risk. Scientists have long known that one of the wrinkles in this plan is that we will only rarely be able to say with certainty whether someone will develop a given disease based on their genetics; more often, we can only give an estimate of their disease risk. This realization came mostly from twin studies, which look at the disease histories of identical and non-identical twins. Twin studies use established models of genetic risk among families and populations, along with the different levels of similarity of identical and non-identical twins, to estimate how much of disease risk comes from genetic factors and how much comes from environmental risk factors. (See this post for more details.) There are some complexities here, and the exact model used can change the results you get, but in general the overall message is the same: genetic risk prediction contains a lot of information, but not enough to give guaranteed predictions of who will and who won't get certain diseases. Nor is this true only of genetics: parallel studies of environmental risk factors usually reveal tendencies and probabilities, not guarantees.
This means that two people with exactly the same weight, height, sex, race, diet, childhood infection exposures, vaccination history, family history, and environmental toxin levels will usually not get the same disease, but they are far more likely to than two individuals who differ in all those respects. To take an extreme example, identical twins, despite sharing the same DNA, socioeconomic background, childhood environment, and (generally) placenta, usually do not die from the same thing, but they are far more likely to than two random individuals. This is a perfect analogy for how well (and badly) risk prediction can work: you will never have a better prediction than knowing the health outcomes of a genetic copy of you. The health outcomes of another version of you will be invaluable, and will help guide you, your doctor, and the health-care establishment, if they use this information properly. But it won’t let them know exactly what will happen to you, because identical twins usually do not die from the same thing. There is no health destiny: there is always a strong random component in anything that happens to your body. This does not mean that none of these things are important; being aware of your disease risks is one of the most important things you can do for your own future health. But risk is not destiny. And this central fact has been well known to scientists for a while now.
This was the context into which a recent paper in Science Translational Medicine by Bert Vogelstein and colleagues was published, which also used twin study data to ask how well genetics could predict disease. The take-home message from the study (or at least the message that many media outlets have taken home) is that DNA does not perfectly determine which disease or diseases you may get in the future.
The paper was generally pretty flawed: many geneticists expressed annoyance at it, and Erika Check Hayden carried out a thorough investigation into the paper for the Nature News blog. In short, the study used a non-standard and arbitrary model of genetic risk, and failed to properly model the twin data, handling neither the many environmental confounders nor the large degree of uncertainty associated with studies of twins. Many geneticists were annoyed that the authors seemed to be unaware of the existing literature on the subject, and that they presented their approach and their results as if they were novel and controversial at a well-attended press release at the American Association for Cancer Research annual meeting.
However, what came as more of a shock was how surprised the media as a whole seemed to be at the results, with headlines such as "DNA Testing Not So Potent for Prevention" and "Your DNA blueprint may disappoint." No reporter (other than Erika) even mentioned the information that we already had about the limits of genetic risk prediction. As Joe Pickrell pointed out on Twitter, we can’t really know whether this was genuine surprise or merely newspapers hyping the message to make it seem more like news, but having talked to a few journalists and members of the public, the surprise appears to be at least in part genuine. The gap between the public perception and the established consensus on genetic risk prediction struck us as unexpected and worrying. But of course, the reason for this gap is relatively obvious. The previous papers on this subject were written by statistical geneticists, were technical, discussed technical models and the merits of various predictive measures, and rarely came with a press release or an attempt to talk to the public about their results.
The message, to those who can read them, is clear and well-established: genetic risk prediction (or any form of risk prediction) will never be able to perfectly predict disease incidence, and will never replace diagnostic tests. But the fact that the results of Bert Vogelstein’s study seem to have come as a surprise to people, when they come as no surprise to us, shows that we have failed in one of our primary duties: to keep the public informed about the results of our research. The paper’s failure as a work of statistical genetics stands in contrast to its success as a work of public outreach. If we are annoyed that a bad paper got the message across, then we should be annoyed with ourselves that we never communicated our own results properly. As people who do technical, statistical research, we need to concentrate especially hard on explaining our results in ways that are intuitively understandable to non-specialists. The precise probabilistic nature of disease risk is tricky to get your head around, but not impossible. Perhaps if most of the public could be made aware of the basic concepts and results of statistical genetics, reports on genetics not being destiny would be old news instead of big news. And we would be able to address the more pressing questions of exactly how to use genetics to improve human health.
(As a side note, and somewhat counterintuitively, only a very small proportion of human disease genetics research actually looks at using genetic information to predict diseases in patients. The majority of our research goes into using genetic predictors of disease to understand what goes wrong in disease in general, so we can treat everyone with the disease regardless of their genetic risk. This means that the benefits of disease genetics research are largely independent of how predictive the genetics actually is. However, this doesn’t mean that we don’t care at all about risk prediction.)