In the tiny workplace of the cell, every one of our 30,000 genes has a partner. Each performs the same job, so to speak, as the gene in the office across the hall. Situated on paired chromosomes, these duos execute one small task, usually directing the assembly of a protein.
A single change in a gene, known as a single nucleotide polymorphism, can lead to disease. The top sequence shows a portion of the gene that codes for hemoglobin. The bottom sequence shows a base change, thymine (T) for adenine (A), that alters the shape of the hemoglobin protein. Having two copies of the altered gene causes sickle-cell anemia. Illustration by Dan Winters & Gary Tanhauser
Now let's say that among the genes facing each other in the cell, there is a pair that runs a biotech company. Yes, biotech company genes. Let's call the firm that this gene pair directs Perlegen Sciences. Let's place this new company, shall we, in Mountain View, California, in the heart of Silicon Valley.
For Perlegen to function, the two genes that make up the pair must work well together. They should like and sympathize with each other, and like genes in a real cell, one should be able to stand in when the other is out. Let's name the two genes Brad Margus and David Cox. The former is the chief executive officer and the latter the chief scientific officer of Perlegen Sciences.
Margus and Cox also happen to look and dress rather alike. Last fall at the company Halloween party, Brad went as David and David went as Brad, or so they joked afterward. But of course Margus and Cox aren't identical persons, just as paired genes in our cells almost never occur as perfect copies, because they were handed down by long and separate paths, one the mother's line and the other the father's. The genes have the same objective, which is spelled out in DNA's biochemical code of A's, C's, G's, and T's (for adenine, cytosine, guanine, and thymine). But rarely is the sequence of the letters, or nucleotides, exactly the same within each pair.
Letters can be shuffled, repeated, or deleted without losing their genetic meaning, the way that the word aeroplane is equivalent to airplane, for instance. The variations in spelling are known as polymorphisms. The most common and subtle type of polymorphism entails the substitution of a single letter. In English it would be kittykat instead of kittycat. In DNA dialect, if the nucleotide letter T appears instead of C, or G where usually there's A, the gene is said to contain a single nucleotide polymorphism. In short, a SNP, or Snip.
Such variants have long histories. They first occur by chance on one chromosome of a person's genome, then either disappear when the person dies or get passed on to future generations. By convention, researchers use the term Snip for a variant that has been inherited by at least 1 percent of the world's population. If less frequent, the change is simply referred to as a mutation. Snips therefore tend to be old and common—survivors of evolution—while mutations tend to be rare, youthful additions to the genome, which may or may not take.
In a broad sense Snips are what cause human beings to differ—short people from tall people (because of Snips in the genes controlling height), black skin from white (because of Snips in the genes for pigment), and so forth. The single-letter changes underlie not only our physical diversity but also our different vulnerabilities to disease. Snips explain why under equal conditions some people will succumb to high blood pressure, asthma, or mental illness, and some people won't.
To return to the Perlegen pair, Margus and Cox, you could think of them as variants, or Snips, of each other. That fits, because the aim of their new company is to search for actual Snips in the human genome. Since the full human genome was first sequenced almost two years ago, the hunt for Snips has become the next new thing. Competing with other players in the biotech and pharmaceutical industries, Perlegen wants to discover significant Snips in the hope that new drugs will eventually flow from the knowledge.
Although the big promise of the Human Genome Project—drugs customized to a person's genetic profile—is still a long way off, the first step toward that goal is to understand the fine print of genetic diversity. The two groups that produced rival sequences of the genome—the National Human Genome Research Institute, run by the federal government, and Celera Genomics, a private company—have both set their sights on Snips.
Margus and Cox voiced confidence, as they glanced at each other in Perlegen's conference room, that their own effort would succeed. Their confidence stems from a powerful new technology called the microarray, or gene chip. The chips can scan whole genomes for changes in lettering, and Perlegen's chips have been hard at it for a year.
Recently, Cox and Margus gave a presentation on the role of Snips in human disease. Margus, 41, set up his laptop on the conference table. He ran through the slides he normally uses for potential investors: the basic speech, not too technical. As Margus talked, Cox, 55, his hands clasped behind his head, interjected scientific detail. Jarrett, Margus's 13-year-old son, watched, too. He was in a wheelchair. Thin and shy, with neatly combed hair, the boy smiled but did not speak. He listened to the presentation with an almost furious concentration.
Brad Margus, gene chip in hand, stands with David Cox in the Perlegen lab in Mountain View, California. "I know better than anyone: You can't really celebrate just because you found the gene," says Margus. Photograph by Gillian Laub
The first thing we learned about the human genome is how extraordinarily long it is: 3.2 billion nucleotide letters. A book containing the human genome sequence would have a million pages of text—3,200 letters per page.
Margus's next image showed strings of letters unspooling from the snapshots of five Perlegen employees (Cox among them). The DNA from each picture read "ATTGCAAGGCCGT," etc. But the five sequences, as Margus pointed out, were not quite the same. At one or two places in the sequences—called bases—there were differences in spelling. These bases marked Snip sites. Still, "All humans share 99.9 percent the same spelling," Margus said.
OK, this raised several questions. A frequent misperception is that the human genome announced by Celera and by the government team was the one definitive human genome. It wasn't. The two sequences were a kind of mishmash. Celera Genomics had blended the DNA of five people in order to derive one genome, and the government group had combined genetic samples from 24 people. The sequences represented the "canonical human being," as Cox put it. They were benchmarks: reference texts against which to compare the lettering of other genomes in other individuals.
Then what about the statement "All humans share 99.9 percent the same spelling"? It was imprecise, Cox admitted. Take any two people in the world, such as Brad Margus and his wife, and on average their genomes will be 99.9 percent identical. Snips, the major source of variation, occur roughly once every thousand bases—which is not very often. Still, the human genome is so long that there are millions of locations where the sequence of the letters is different.
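The arithmetic behind Cox's point can be checked in a few lines. This is only a back-of-the-envelope sketch using the two figures quoted in the article (3.2 billion letters and roughly one Snip per thousand bases); the variable names are illustrative.

```python
# Figures quoted in the article; both are rough estimates.
GENOME_LENGTH = 3_200_000_000   # nucleotide letters in the human genome
BASES_PER_SNP = 1000            # about one differing letter per thousand bases

# Expected number of sites where two people's genomes differ.
expected_differences = GENOME_LENGTH // BASES_PER_SNP

# Share of letters that match between the two genomes.
percent_identical = 100 * (1 - 1 / BASES_PER_SNP)

print(f"{expected_differences:,} differing sites")
print(f"{percent_identical:.1f} percent identical")
```

So "99.9 percent identical" and "millions of differences" are the same statement seen from two sides.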
But now take two other people, David Cox and Jarrett Margus. Although they are also 99.9 percent alike, the sites on their chromosomes where they differ are not the same sites as in the previous pair. Thus, each time another person's DNA is added to the equation, the overall similarity among the human genomes decreases. If we included the DNA of all 6 billion people on Earth, changes in lettering would show up at every base in the genome. "It's a conundrum," said Cox. "It's either we're 99.9 percent the same, or every base is different. You could say all humans share nothing, and that would be correct, too."
The question is more than semantic. At this early stage of discovery, the human genome is a murky and malleable document, open to interpretation. Like a new Web site, it's "under construction" by the scientists and companies seeking to develop it. Two genomes vary, or two genomes are alike, depending on the point of view, yet within a dozen genomes there seem to be common patterns to the variation. Blocks of Snips have been conserved over aeons of evolution, and these shared patterns may hold the clues to common diseases. Those are the variants Perlegen is after.
"Each Snip tends to happen only once in the history of mankind," Cox said dramatically. "So focusing on Snips is a way of using mankind as one big family."
But we were getting ahead of the lecture. Margus, waiting to continue, clicked on the next slide in the presentation. It was titled "Genetics and Disease." His son was swaying slightly in the wheelchair, his eyes cast down.
There was a list of the rare, infamous disorders that are caused when a single gene goes awry: cystic fibrosis, Huntington's disease, muscular dystrophy . . .
The most relevant condition wasn't cited, however: ataxia-telangiectasia, or A-T. It is a neurological disorder. In addition to a loss of motor control (ataxia), patients have cancer-prone immune systems. Spidery blood vessels (telangiectases) sprout in their eyes. A-T is caused by the failure of a gene by that name. Of the eight A-T genes present in the conference room that day—each person there bore two copies—at least three of the copies were flawed. A single change in the lettering was the culprit, but this was technically a mutation, not a Snip, because it is so terribly rare.
Brad Margus is not from the world of science. In the late 1980s, having earned a degree from Harvard Business School, he ran a shrimp-processing company called Kitchens of the Oceans, on the coast of Florida. In 1987 he and his wife, Vicki, started a family. They had a son, Colton, followed in short order by Jarrett and a third son, Quinn.
How gene chips find Snips
Chemical letters, or bases, in a single strand of DNA will bind to their chemical partners: A to T, T to A, C to G, and G to C. This principle allows Perlegen Sciences' gene chip to detect subtle variations in the human genome that may yield clues to the origins of complex diseases. (A) The chip, which is five inches square, is lined with about 60 million short, chemically synthesized single strands of DNA. These strands are called probes. (B) Each small square on the chip contains 400,000 probes. (C) This example shows how probes detect differences between two sample strands. When each letter in the DNA strand to be tested binds to its complement on the probe, the probe lights up, showing the strand's sequence. Because groups of single-letter variants (Snips, shown in red) are inherited together, finding one member can locate the group. Photographs courtesy of Perlegen Sciences (2). Graphic by Matt Zang
The two younger boys learned to walk and talk at a normal age, but they wobbled when they ran and slurred their words. They were diagnosed with ataxia-telangiectasia in 1993. It turned out that Brad happens to be a carrier of the A-T genetic disorder, and so is Vicki.
A-T is a recessive condition inherited according to the pattern established more than a century ago by the monk Gregor Mendel. Each parent carries both a bad copy and a good copy of the gene. The good copy keeps the parent healthy. It generates adequate amounts of a protein called ATM, which normally monitors and repairs damage to the cells' DNA. Without it, a series of small internal breakdowns will progress to serious disease. The Marguses' oldest child is healthy because he also received one good copy.
But against odds of one in four for each child, both Jarrett and Quinn inherited the two bad copies that their parents carried. There are no treatments for the condition.
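The one-in-four figure falls out of Mendel's pattern when every combination of copies is listed. The sketch below is a textbook illustration, not anything specific to the A-T gene: "A" stands for a working copy and "a" for a flawed one, and both labels are assumptions made for the example.

```python
from fractions import Fraction
from itertools import product

# Each carrier parent has one working copy ("A") and one flawed copy ("a").
mother = ("A", "a")
father = ("A", "a")

# Every equally likely pairing of one copy from each parent.
children = list(product(mother, father))

def fraction_where(predicate):
    """Share of the possible children that satisfy the predicate."""
    matches = [child for child in children if predicate(child)]
    return Fraction(len(matches), len(children))

affected = fraction_where(lambda c: c == ("a", "a"))          # two flawed copies
carrier = fraction_where(lambda c: sorted(c) == ["A", "a"])   # one of each
noncarrier = fraction_where(lambda c: c == ("A", "A"))        # two working copies

print(affected, carrier, noncarrier)
```

The odds apply to each child separately, so one affected child does not change the chances for the next.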
After making contact with other A-T families, Brad and Vicki started a foundation, the A-T Children's Project, to raise money for research and to draw attention to the disorder. They were interviewed on national television by Barbara Walters, and Brad testified before Congress. Margus was not satisfied with knowing the basics of his kids' disorder; he wanted every single molecular detail. He found scientists to tutor him in biology. With the foundation's money, he supported geneticists trying to uncover the location and identity of the A-T gene, which in the early 1990s remained a mystery.
Margus learned of a prominent geneticist who'd recently been hired by Stanford University. He flew to California, hoping to enlist the doctor as head of the foundation's scientific advisory board. But David Cox, although impressed by Margus's command of the subject, said he had his own research to manage. He codirected the Stanford Human Genome Center, which was gearing up for its part in the effort to sequence the entire genome.
"He gave me a compassionate smile but said no," Margus recalled. "I talked to him for another hour, explaining why it wouldn't take too much of his time—all I needed him for, I explained, was his brains and advice. By the end of the meeting, David agreed to be my director."
Cox's career, like Margus's, was in transition. He had acquired a Ph.D. in genetics and then his M.D. His initial work, on the clinical side of medicine, was tending children with genetic disorders. "I was a pediatrician dealing with diseases I didn't understand," he said. "Next, I was involved in discovering single genes, for single-gene disorders. To which my wife [a genetics counselor] said, 'Ring me up when my patients should care.'"
She knew that although doctors might have a genetic test to offer, there wasn't much they could do after the diagnosis. The latest example was A-T, whose flawed gene was isolated in 1995, thanks partly to funds from the A-T Children's Project. Nevertheless, the physical condition of Jarrett and Quinn Margus continued to deteriorate, and by the late 1990s the boys were in wheelchairs. The research program that Cox had helped shape was no help yet.
In his studies Cox shed light on genes that contributed to Parkinson's disease and to a rare form of epilepsy, but increasingly he saw his work as "piecemeal contributions." "I realized," he said, "what I'm doing is not enough. Is genetics going to do anything useful or not?" Participating in the Human Genome Project began to give him a sense of the big picture and of the potential of genomics.
Meanwhile, Margus, still in Florida, studied precisely what goes wrong in the A-T gene sequence. He and Vicki had a fourth child, Caden, who was healthy, although he too was a carrier of ataxia-telangiectasia.
The slides flickered on the laptop screen, moving from the rare disorders caused by rare mutations, like A-T, to humanity's more common diseases, which are believed to be connected to common polymorphisms—Snips.
"But the most common diseases," Margus read aloud, "are not caused by a single misspelling. . . . Most are probably caused by changed letters in 20 to 50 places contributing to 'complex' diseases such as Alzheimer's, diabetes, heart failure, schizophrenia, osteoporosis, asthma, lupus, multiple sclerosis."
These are the diseases of interest to the major drug companies and to Perlegen's rivals such as Celera. Explains Celera's chief scientific officer, Samuel Broder: "If you look at the top killers in the United States, they're not the classic Mendelian disorders. You have the interplay of 10 or more genes—a chorus of genes—in which any one gene is singing at a low volume. Plus, you have the environmental factors, which Mendel disallows or doesn't acknowledge are important."
By environment Broder means diet, lifestyle, cigarette smoking, chemical exposures, and other external factors, which influence risks for these complex ailments as much as family history does. Family history, for that matter, is an expression that doctors use when inheritance can't be calculated as simply as with the single-gene maladies. The more common but unknown genes of the Snip medical universe are called disease-associated genes, whose mechanisms will not be evident even when the genes are identified.
Finding these disease-associated genes will require radically new ways of sampling the genome. The conventional approach has been to look for one promising candidate, a gene whose protein might have an impact on blood pressure, say. First, you locate the gene, and then you study its function intensively.
"This is the only way you could afford to do it until now," Margus said. "Most geneticists find a likely spot on the genome and get to work hunting."
The method has worked for pinpointing the sources of single-gene disorders like A-T, but it has not worked for elucidating complex, multi-gene disorders like high blood pressure. The contribution of any individual gene to the disease is too weak to be detected.
But researchers are betting they can simplify the search by focusing on blocks of Snips. Now the Perlegen presentation came to the point. "Snips occur together," said Margus. "We look only at places where genomes of people are frequently found to be spelled differently."
All right, so how will these chunks of Snips—called haplotypes—be identified? Margus introduced whole-genome scanning. The idea is that the haplotypes can be revealed by analyzing all the Snips simultaneously.
Scanning each letter of every subject's genome, however, is too costly and time-consuming. So Perlegen planned a shortcut, taken from genomics: gene chips. The trick is to apply gene chips to a representative collection of human DNA.
A microarray, or gene chip, consists of microscopic grids of DNA of known sequence. Normally, DNA is a double-stranded molecule, but on the chip it is laid down as a single strand. When exposed to a sample of unknown DNA, the probes on the chip bind to their complementary strands, thereby reading the sequences in the sample.
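The pairing rule that lets a probe read a sample can be mimicked in a toy model. This sketch is a deliberate simplification and not Perlegen's or Affymetrix's actual chip design: it assumes four candidate probes per position, treats binding as all-or-nothing, and ignores the fact that paired strands run in opposite directions. The reference string is the one shown on Margus's slide; the sample with one changed letter is invented for the example.

```python
# Base-pairing rule: A binds T, C binds G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def complement(strand):
    """Return the strand that pairs letter by letter with the given one."""
    return "".join(COMPLEMENT[base] for base in strand)

def probe_binds(probe, sample):
    """Toy rule: a probe binds only if every letter pairs with the sample."""
    return len(probe) == len(sample) and all(
        COMPLEMENT[p] == s for p, s in zip(probe, sample)
    )

def call_base(sample, reference, position):
    """Try four probes that differ only at `position`; report which one binds."""
    for letter in "ACGT":
        candidate = reference[:position] + letter + reference[position + 1:]
        if probe_binds(complement(candidate), sample):
            return letter
    return None  # no probe matched: the sample differs somewhere else too

reference = "ATTGCAAGGCCGT"   # sequence shown in the presentation
sample = "ATTGCATGGCCGT"      # hypothetical sample with one changed letter

print(call_base(sample, reference, 6))     # letter read from the sample
print(call_base(reference, reference, 6))  # letter in the reference itself
```

Where the letter read from the sample differs from the reference letter, the position is a candidate Snip site.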
Brad Margus sits between sons Jarrett (left) and Quinn at their home in Florida. Margus works at the Perlegen Sciences office in California and commutes back to his family two or three times a month. Photograph by Gillian Laub
Genomics has brought "big science" into biology. Microarrays, DNA sequencers, and massively parallel computers have overturned the targeted approach to the genome by processing genes collectively. Machines can capture a cell's total genetic activity and then crunch the results for clues to disease. The amount of data is almost overwhelming. "It's a fire hose of information blasting people against the wall," said Cox.
In early 2000 Cox met with Stephen Fodor, the chief executive of a company called Affymetrix. Fodor had invented one of the two main types of microarrays. Affymetrix was now the dominant supplier of gene chips to the industry, but Fodor wanted to be more than a hardware salesman.
"Steve realized that this technology can scan the genome," Cox recalled. Microarrays would work faster and cheaper than the equipment Celera and the government group had used for their genome projects. Yet the Affymetrix chips were based on that output. The public sequence of the genome would serve as the DNA template for the microarrays. Then, by re-sequencing the genomes of other individuals, the chips could highlight the locations where the samples departed from the original. The bases of departure, of course, were the Snips.
Margus said, "What made David jump [from academia] was the technology."
"It was a conjunction of technological stars," Cox agreed. "The first was the Human Genome Project. The second was the chip technology, and the third was the software to join the elements together."
The launch of Perlegen Sciences as a spin-off of Affymetrix was announced in October 2000. By the following spring $100 million had been raised for the venture. Cox was named chief scientific officer. When asked by Fodor to recommend a CEO, a person with solid business skills yet familiar with genetics, Cox soon thought of the right person. "He's a shrimp guy in Florida," Cox told Fodor, "but he is not your ordinary shrimp guy."
Today, Perlegen is about halfway through its scans of 50 genomes. The DNA samples were taken from previously established cell lines of 25 racially diverse individuals. On the face of it, that is too small a sample to reveal humanity's most meaningful variants. But Cox said, "The more common the change, the fewer people you need to test, statistically."
Within a few months the company expects to have found the haplotypes, or chunks of Snips, that occur most frequently. The expectation is that 300,000 blocks will emerge from the scans; each will be present in at least 10 percent of the sampled genomes. By the same token each is expected to occur in at least 10 percent of all human beings.
Firing a shot across the bow of the competition, the company's scientists published a report last fall on its scan of chromosome 21, the most detailed study of a human chromosome to date. Within 20 samples of the chromosome, the Perlegen group found some 35,000 Snips. But the variants could be organized in blocks: The four most frequent haplotypes were featured on 16 of the 20 chromosomes.
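Counting how often each block pattern turns up is simple bookkeeping. The sketch below uses invented toy data, three-letter patterns on 20 chromosomes, arranged only to mirror the 16-of-20 ratio reported above. It is not Perlegen's data or method.

```python
from collections import Counter

# Invented toy data: the letters at three Snip sites in one block,
# read from each of 20 chromosomes.
chromosomes = (
    ["ACG"] * 7 + ["TCG"] * 4 + ["ATA"] * 3 + ["TTA"] * 2
    + ["ACA", "TCA", "ATG", "TTG"]   # four rare patterns, seen once each
)

# Tally each pattern (haplotype) and pick the four most frequent.
counts = Counter(chromosomes)
top_four = counts.most_common(4)
covered = sum(count for _, count in top_four)

print(top_four)
print(f"{covered} of {len(chromosomes)} chromosomes carry a top-four pattern")
```

Three sites with two possible letters each could in principle yield eight patterns, yet in this toy set four of them account for most of the chromosomes, which is the sense in which Snips "occur together."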
"That told us," said Cox, wrapping up the meeting, "that people are similar around the world in terms of these haplotype patterns."
Locating the Snips and haplotypes is actually the easiest part of the quest, a matter of applying the chip technology and reading what it spits out. Next, Perlegen and its customers will try to link the patterns to disease predispositions. Only if the genetic links stand up in large epidemiological studies will the third step be taken, which is to learn the biological functions of the disease-associated polymorphic genes. And the last step will be to devise new drugs against the new targets.
"It's like astrology," Cox said of the first step, "even if the genome is closer to biology than to the stars. But the value of the approach is that it poses hypotheses. Let's say there are 40 to 50 genes that have an effect on the risk of a disease, and you are able to find 20 of them. Now you have a better chance of doing something about it."
Margus wheeled Jarrett to the door of the conference room, where the boy softly said, "Bye."
"He's self-conscious about the slurring in his speech," Margus explained a few days later. "But that evening Jarrett asked a lot of questions. It was the first time he'd seen what Dad's company does. He asked, 'Can you tell who else has A-T by reading their genome?'"
Brad Margus has two tasks to reconcile: Snips and A-T research. "My passion is still A-T," he said. "I think about it falling asleep at night and in the shower the next morning. In this job I'm in the right place to hear things. To keep pushing on A-T from the inside, versus 'Here's this nice guy, head of a foundation,' on the outside looking in."
He has no illusions that Perlegen will solve his sons' problems. Still, his personal life has lent his business life a sense of urgency. David Cox, for his own reasons, feels the same urgency.
"I have to know something sooner," Cox said. "The day that we'll know everything about genetic disease—you know what I say about that day? That I'll be dead."
"The attitude at Perlegen," Margus said, "is, 'If we made this experiment one day faster, could it help somebody's mother?' We only hire people who get it. It's not just about stock options. I've had my wake-up call in life, but others in the company have to have had theirs, too, in different ways."
See also the SNP Consortium home page at snp.cshl.org/about.
Learn about microarrays and genetic variation at the Perlegen site: www.perlegen.com.