Imagine that one day there lands upon Earth an alien spacecraft stuffed with a million crumpled pieces of paper, each covered in text written in an unknown script. The best brains in the world are put to the task of deciphering the code, which takes 10 years. But it takes another 40 years to smooth out all of the pages, translate them into English, sort them, and publish them as a vast book. Then, at long last, the task is done, and we sit down to read the book from beginning to end. It contains thousands of stories about the past, the present, and the future of humankind, from the origin of life to the recipe for curing cancer.
What an extraordinary and unlikely tale. And yet that is essentially what happened this year. After 50 years of preparation, we have suddenly been placed in the position of being able to read the entire genetic story of human beings— the genome.
On June 26 Francis Collins, head of the Human Genome Project, and Craig Venter, head of Celera Genomics, jointly announced that they had completed the reading of a "rough draft" of the human genome— the complete set of human DNA. The announcement came at least two years earlier than expected and marked a dead heat in a fiercely contested scientific marathon.
The researchers on the Human Genome Project had been working toward a complete human sequence since the late 1980s. In early 1998, with less than 10 percent of the job done, scientists were predicting that it would take seven more years. Then Venter announced that he would undertake to do the job by 2001, using private funds.
Twice before he had delivered on equally dramatic promises. In 1991 he invented a quick way to find human genes, using expressed sequence tags, after the senior scientists at the Human Genome Project had said it wouldn't work. In 1995, he invented a new "shotgun" technique for sequencing DNA and read the full genome of a bacterium while the establishment was still dismissing the technique as unworkable.
So Venter's threat was serious. The Human Genome Project reorganized its efforts, and the race was on. In the end, the two projects announced together last June that they had finished a rough draft. A rough draft is a sequence with 91 percent of the letters in the right place, each letter having been read and reread between five and seven times. Plenty of gaps remain, but they amount to less than 10 percent of the text.
This announcement was the beginning of a whole new way of understanding human biology. Everything we have laboriously discovered hitherto about how our bodies work will be dwarfed by the knowledge tumbling from the genome.
It was also the end to a great detective story. In 1860, Gregor Mendel made the bizarre discovery that inheritance comes in tiny particles called genes that do not decay with age, or blend with one another. In 1953, James Watson and Francis Crick made the even more unexpected discovery that those particles are actually digital messages written along strands of DNA in code, using a four-letter chemical alphabet. In 1961, Marshall Nirenberg and Johann Matthaei cracked the first "word" in that code, revealing exactly how DNA instructs the cell to build proteins. It was then inevitable— if mind-boggling— that one day we would read all the genetic messages that a human body inherits. Now we have.
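To make the idea of a "word" in the code concrete, here is a minimal sketch in Python. It assumes the standard picture in which DNA is first copied into RNA (where the letter T becomes U) and then read three letters at a time. The first word cracked in 1961 was UUU, which stands for the amino acid phenylalanine. The table below is a small illustrative subset of the 64 three-letter words, and the function name `translate` is hypothetical, chosen only for this example.

```python
# Illustrative subset of the genetic code (RNA codons -> amino acids).
# The real table has 64 entries; only a handful are listed here.
CODON_TABLE = {
    "AUG": "Met",   # methionine, also the usual "start" word
    "UUU": "Phe",   # phenylalanine, the first word deciphered (1961)
    "AAA": "Lys",   # lysine
    "GGC": "Gly",   # glycine
    "UAA": "STOP",  # one of the "stop" words
}

def translate(mrna):
    """Read an RNA message three letters at a time until a stop word."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        amino_acid = CODON_TABLE.get(codon, "???")  # unknown to this tiny table
        if amino_acid == "STOP":
            break
        protein.append(amino_acid)
    return protein

message = translate("AUGUUUGGCAAAUAA")
poly_u = translate("UUUUUUUUU")
```

Here `message` comes out as Met, Phe, Gly, Lys, and a message made only of U's reads as phenylalanine repeated.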
But, of course, the genome announcement is just a beginning. For the document that has been produced— a 3.3-billion-letter book, as long as 900 Bibles— is almost entirely mysterious. We do not know even the basic facts about it, such as how many genes it contains— although the guesses are converging on a figure of 38,000— let alone what each gene is and how genes interact with one another. We do not know why the genes are hidden in great stretches of apparently meaningless text, or so-called junk DNA. We stand on the brink of a continent of new knowledge.
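The scale of that book can be checked with some back-of-envelope arithmetic, sketched here in Python. It uses the two figures given above (3.3 billion letters, 900 Bibles) plus one assumption that is not in the text: with a four-letter alphabet, each letter can be stored in two binary digits. The variable names are purely illustrative.

```python
# Figures from the text.
GENOME_LETTERS = 3_300_000_000
BIBLES = 900

# Assumption: a four-letter alphabet needs 2 bits per letter (2**2 == 4).
BITS_PER_LETTER = 2

genome_bits = GENOME_LETTERS * BITS_PER_LETTER
genome_bytes = genome_bits // 8            # 8 bits per byte
letters_per_bible = GENOME_LETTERS / BIBLES

print(f"{genome_bytes:,} bytes")           # 825,000,000 bytes
print(f"{letters_per_bible:,.0f} letters per Bible")
```

On those assumptions, the whole text fits in roughly 825 million bytes, and the comparison implies a Bible of about 3.7 million letters.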
Most people do not see the genome in such romantic terms. They want to know how it will help cure cancer; they speculate about customized medicine, with drugs designed for the individual, not the population. They worry that it will lead to designer babies for the rich or to a lessening of respect for the disabled; they fear the patenting of genes by private corporations; they predict that medical insurance may cease to be offered by insurance companies to those whose risks are known and high.
All these are real issues. The medical possibilities and ethical fears that dominate the debate are by no means trivial. But a larger philosophical truth is being missed. The genome represents an unprecedented draft of self-knowledge for humankind, with implications that stretch far beyond medicine. It promises to tell us new things about our past as a species, and it promises new insights into philosophical conundrums, not least of which is the puzzle of free will.
We have been misled into thinking that genetics is all about disorders. Geneticists have so far concentrated on genes that are linked to disease: first the simple but rare inherited diseases like cystic fibrosis (the gene for which is on chromosome 7) or Huntington's (chromosome 4), then the environmental diseases for which different people inherit different susceptibilities, such as Alzheimer's (chromosome 19) or breast cancer (chromosomes 13 and 17). More recently, they have begun to seek genes that affect our behavior, prompting us to be dyslexic (chromosome 6), homosexual (perhaps on the X chromosome), adventurous (chromosome 11), or even highly religious (no map location yet).
A well-studied example of a human gene is the ACE gene on chromosome 17, which seems to predict physical performance. According to a group of scientists at University College, London, possessing one version of this gene rather than another dramatically improves the ability to increase muscle strength with training and increases the mechanical efficiency of trained muscle. Mountain climbers, rowers, and other athletes tend to have this high-performance version of the gene. Likewise, one version of the APOE gene on chromosome 19 predicts the likelihood of a boxer suffering from premature Alzheimer's disease. A person with the "wrong" versions of both these genes would be well advised not to become an athlete in a contact sport.
But that word wrong is all wrong, is it not? There is still too much tendency to think in terms of genetic divergence from the presumed norm. Back in the Stone Age, the low-performance version of the ACE gene might have resisted starvation better, and the risky version of the APOE gene might have had some other advantage. Besides, to define a gene as an "Alzheimer's gene" or a "dyslexia gene" is a bit like defining the heart as a "heart-attack organ." This is misleading. Neither blue nor brown eyes (a gene somewhere on chromosome 15) are normal. With the genome in hand, we can see genes in better context. We can study how and why all human beings inherit a musical sense, rather than why some people are more musical than others.
Genes are windows on the past. Some reflect the history of infectious disease in different tribes. The A and B blood groups (chromosome 9) protect against cholera; the cystic fibrosis and Tay-Sachs (15) mutations may protect against tuberculosis; the sickle-cell (11) and thalassemia (16) mutations protect against malaria. Hence the prevalence of these particular mutations in certain peoples.
Other genes tell a story of responses to culture. The fact that adult Europeans are twice as likely as Asians to tolerate lactose in milk (no location yet) reflects a much longer history of dairy farming in the West; the ability to dehydrogenate alcohol (chromosome 4) is more common in people with a history of drinking fermented fluids; the prevalence of the blond-hair gene in young northern Europeans (perhaps on chromosome 15?) may reflect a sexual preference for youthful mates.
Still others tell of events long before recorded history. The unusual genes of the Basque people mirror the unique nature of their language and suggest that they are descendants of preagricultural Europeans. The astonishing similarity between embryonic-development genes called Hox genes in fruit flies and people (chromosomes 2, 7, 9, 12, and 17) tells us that the common ancestor of people and insects was a segmented animal; yet this animal lived more than 600 million years ago and left no fossils. The genome is going to be a treasure trove of such stories.
Science has a habit of addressing problems raised by philosophy. It may not be too much to claim that the mystery of free will has been recast by recent discoveries in genetics, which have exposed the myth that genes are puppet masters and we are their puppets. Take, as an example, the various learning mutations that have been discovered in fruit flies and subsequently in mice and people (chromosomes 2 and 16). These are found in genes that are central to memory and learning, many of them part of the CREB (cyclic-AMP response element-binding protein) system in the brain. The mutations reveal that every time a person learns something, he or she has to switch on some of these genes in order to lay down new connections between brain cells.
That sounds like dull molecular plumbing. But actually it is revolutionary philosophy. In attempting to answer the question of whether we possess free will, the Scottish philosopher David Hume impaled himself on the following dilemma: Either our actions are determined, in which case we are not responsible for them, or they are the result of random events, in which case we are also not responsible for them. But the CREB genes show how to escape this fix. If genes are at the mercy of behavior, but behavior is also at the mercy of genes, then our actions can be determined by forces that originate within us as well as by outside influences. The will is therefore a mixture of instincts and outside influences. This makes it deterministic and responsible, but not predictable.
Curiously, the free will story brings us to cancer, which is where the whole genome project started. Cancer researchers first suggested sequencing the human genome in the mid-1980s. They were just beginning to realize that cancer was a wholly genetic process. Genetic, but not hereditary. Most cancer is not inherited— though there are well-known mutations that increase susceptibility to cancer, such as BRCA1 and BRCA2, both of which are associated with breast cancer.
Yet cancer is a disease of the genes. Like free will, it is a process mediated by the genes but not caused by them. Like the CREB genes of memory, changes in cancer genes are the consequence, not the cause, of environmental effects. Cigarette smoke, for example, causes cancer by mutating genes inside human cells called oncogenes, which encourage cells to multiply, and tumor-suppressor genes, which prevent them from multiplying. To turn malignant, a tumor must evolve with at least one oncogene jammed in the "on" position and at least one tumor-suppressor gene jammed in the "off" position.
Little wonder that President Clinton, announcing the genome last June, mused that one day people may know cancer only as a star sign, not a disease. That was going much too far, because cancer is also a disease of aging: Its incidence increases steadily with age. Rendering it easily curable will only increase its incidence. Nonetheless, by identifying all oncogenes and tumor-suppressor genes and understanding how they work, the Human Genome Project will transform cancer therapy. Already, drugs based on the most famous of the tumor-suppressor genes (on chromosome 17) are in early clinical trials.
The human genome opens a world of medical opportunity, of commercial promise, of ethical danger, and of social challenge. It is also a cornucopia of scientific possibilities that ranks alongside the revolutions wrought by Euclid, Copernicus, Newton, Darwin, and Einstein. It is a fitting bang with which to start a new millennium.
DEMYSTIFYING ALL THOSE SQUIGGLY LINES
How researchers break down the nucleus of a cell into the sequence of chemicals that make up genes

Human Cell
Inside the nucleus of every human cell are the chromosomes, which carry the genetic code of the host individual. In a quiescent cell, like the one shown at right, the nucleus is intact and the chromosomes are a relatively indistinguishable mass of spaghettilike material. But in a cell preparing to divide, the nuclear boundary breaks down, and the chromosomes condense into near-rigid rods. Breaking such a cell open onto a glass slide, the researcher releases a jumbled cluster of chromosomes that are then stained with dye so that they can be studied under a microscope.

Karyotype
This jumble of stained chromosomes is then photographed, and the images are rearranged into ordered pairs to create a karyotype, the standard form used to display chromosomes. In this configuration, the chromosomes are ordered by length from the largest (chromosome 1) to the smallest (chromosome 22 in humans), followed by the sex chromosomes. Karyotypes are used in clinical tests, such as amniocentesis, to determine if all the chromosomes appear normal and are present in the correct number.

Ideogram
To simplify the information gained from karyotypes, geneticists developed a schematic diagram, called an ideogram, to represent each chromosome. The three characteristics scientists use to identify a chromosome are readily visible in an ideogram: length; the banding pattern of the dye, which reflects the type of nucleotides concentrated in the chromosome; and the location of the centromere, a waistlike constriction required for proper movement during cell division. These features are visible in the enlarged image of chromosome 17 (far right).

Gene
Three of the 1,263 genes present on chromosome 17 are mapped onto the ideogram presented here. Each gene stores the instructions that tell the cell how to make proteins and other vital biochemical components of our bodies.

ACE Gene
ACE is involved in fluid-electrolyte balance and blood-pressure control. Two forms are commonly found in the U.S. population. The D form increases the amount of the ACE enzyme in the bloodstream and may increase the risk of cardiovascular disease. By contrast, the I form, frequently found in successful athletes, increases potential muscle strength.

Nucleotide Sequence
The building blocks of the genes are called nucleotides. The four common ones are adenine, cytosine, guanine, and thymine, represented as A, C, G, and T.
atacagtcac tttttttttt tttttgagac ggagtctcgc tctgtcgccc aggctggagt gcagtggcgg gatctcggct cactgcaacg tccgcctccc gggttcacgc cattctcctg cctcagcctc ccaagtagct gggaccacag cgcccgccac tacgcccggc taattttttg tatttttagt agagacgggg tttcaccgtt ttagccggga tggtctcgat ctcctgacct cgtgatccgc ccgcctcggc ctcccaaagt gctgggatta caggcgtg

Variations in the ACE Gene
The I form has 287 more nucleotides than the D form, which itself has more than 24,000 nucleotides. Researchers refer to this additional amount as an insertion (see diagram at right).
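For readers who want to see how such a string of letters can be handled, here is a small sketch in Python. It counts the four nucleotides in the first 30 letters of the sequence printed above, and then shows what an insertion means using toy strings. The toy "D form" and "extra" strings are made up for illustration and are not real ACE sequence; the function names are likewise hypothetical.

```python
from collections import Counter

# The first three 10-letter groups of the sequence printed above.
fragment = "atacagtcac" + "tttttttttt" + "tttttgagac"

def base_counts(seq):
    """Count how often A, C, G, and T occur (case-insensitive)."""
    counts = Counter(seq.upper())
    return {base: counts.get(base, 0) for base in "ACGT"}

def insert_at(seq, position, extra):
    """Return seq with extra nucleotides spliced in at position."""
    return seq[:position] + extra + seq[position:]

counts = base_counts(fragment)   # {'A': 6, 'C': 4, 'G': 3, 'T': 17}

# Toy illustration of an insertion (made-up strings, not real ACE data).
toy_d_form = "ACGTACGT"
toy_extra = "TTT"
toy_i_form = insert_at(toy_d_form, 4, toy_extra)   # 'ACGTTTTACGT'
```

The longer form differs from the shorter one only by the inserted letters, so its length grows by exactly the size of the insertion: 3 letters in the toy case, 287 in the ACE gene.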