Writing Right

Do you know how to read and write English? You answer, "Of course, Jared Diamond, you dope. How else would I be reading this magazine?" In that case, have you ever tried to explain the rules behind written English to someone? The logic, say, of spelling the word seed as we do instead of cede, ceed, or sied? Or why the sound sh can be written as ce (as in ocean), ti (as in nation), or ss (as in issue), to name just a few possibilities? Innumerable examples like these illustrate the notorious difficulties of written English, even for educated adults. As I am now rediscovering through my twin sons in the first grade, English spelling is so inconsistent that children who have learned the basic rules (insofar as there are any) still can't pronounce many written words or spell words spoken to them. Danish writing is also difficult, Chinese and South Korean harder, and Japanese hardest of all. But it didn't have to be that way. French children can at least pronounce almost any written word, though they often cannot spell spoken words. In Finland and North Korea the fit between spoken sounds and written signs is so nearly perfect that the question "How do you spell it?" is virtually unknown. "Civilized" people have always considered literacy as the divide between themselves and barbarians. Surely, if we civilized English speakers sat down to devise a writing system, we could do as well as Finns or North Koreans. Why, then, is there such variation in the precision of writing systems? With thousands of years of literacy now behind us, are today's writing systems--even imperfect ones like our own--at least more precise than ancient ones, such as Egyptian hieroglyphics? Why do we, or any other people, cling to systems that are demonstrably lousy at doing what they're supposed to do? Before exploring these questions, we need to remind ourselves of the three basic strategies that underlie writing systems. 
The strategies differ in the size of the speech unit denoted by one written sign: either a single basic sound, or a whole syllable, or a whole word. The most widespread strategy in the modern world is the alphabet, which ideally would provide a unique sign--a letter--for every basic sound, or phoneme, of the language. Another widespread strategy employs logograms, written signs that stand for whole words. Before the spread of alphabetic writing, systems heavily dependent on logograms were common and included Egyptian hieroglyphs, Mayan glyphs, and Sumerian cuneiform. Logograms continue to be used today, notably in Chinese and in kanji, the predominant writing system employed by the Japanese. The third strategy uses a sign for each syllable. For instance, there could be separate signs for the syllables fa, mi, and ly, which could be strung together to write the word family. Such syllabaries were common in ancient times, as exemplified by the Linear B writing of Mycenaean Greece. Some persist today, of which the most important is the kana syllabary, used by the Japanese for telegrams, among other things. I've intentionally termed these three approaches strategies rather than writing systems because no actual writing system employs one strategy exclusively. Like all "alphabetic" writing systems, English uses many logograms, such as numerals and various arbitrary signs-- +, $, %, for example--that are not made up of phonetic elements. "Logographic" Egyptian hieroglyphs included many syllabic signs plus a virtual alphabet of individual letters for each consonant. Writing systems are still coming into existence, consciously designed by trained linguists. Missionaries, for example, are translating the Bible into native languages of New Guinea, and Chinese government linguists are producing writing materials for their tribal peoples. Most such tailor-made systems modify existing alphabets, although some instead invent syllabaries. 
But those conscious creations are developed by professional linguists, and linguistics itself is barely a few centuries old. How did writing systems arise before that--also through purposeful design, or by slow evolution? Is there any way we can figure out whether Egyptian hieroglyphs, for example, were a conscious creation? One way of approaching that question is to look at historical examples of systems that we know were consciously designed by nonprofessionals. A prime example is Korea's remarkable hangul alphabet. By the fifteenth century, when this alphabet was invented, Koreans had been struggling for more than 1,000 years with cumbersome adaptations of already cumbersome Chinese writing--a "gift" from their larger, influential neighbor. The unhappy results were described in 1446 by Korea's King Sejong: "The sounds of our country's language differ from those of the Middle Kingdom [China] and are not confluent with the sounds of our characters. Therefore, among the ignorant people there have been many who, having something they want to put into words, have in the end been unable to express their feelings. I have been distressed because of this, and have newly designed 28 letters, which I wish to have everyone practice at their ease and make convenient for their daily use." The king's 28 letters have been described by scholars as "the world's best alphabet" and "the most scientific system of writing." They are an ultrarational system devised from scratch to incorporate three unique features. First, hangul vowels can be distinguished at a glance from hangul consonants: the vowels are written as long vertical or horizontal lines with small attached marks; consonants, meanwhile, are all compact geometric signs. Related vowels or consonants are further grouped by related shapes. For example, the signs for the round vowels u and o are similar, as are the signs for the velar consonants g, k, and kh. 
Even more remarkable, the shape of each consonant depicts the position in which the lips, mouth, or tongue is held to pronounce that letter. For instance, the signs for n and d depict the tip of the tongue raised to touch the front of the palate; k depicts the outline of the root of the tongue blocking the throat. Twentieth-century scholars were incredulous that those resemblances could really be intentional until 1940, when they discovered the original draft of King Sejong's 1446 proclamation and found the logic explicitly spelled out. Finally, hangul letters are grouped vertically and horizontally into square blocks corresponding to syllables, separated by spaces greater than those between letters but less than those between words. That's as if the Declaration of Independence were to contain the sentence:

A   me  a   cr  a   te  e   qua
ll  n   re  e       d       l

As a result, the Korean hangul alphabet combines the advantages of a syllabary with those of an alphabet: there are only 28 signs to remember, but the grouping of signs into larger sound bites facilitates rapid scanning and comprehension. The Korean alphabet provides an excellent example of the cultural phenomenon of "idea diffusion." That phenomenon contrasts with the detailed copying often involved in the spread of technology: we infer that wheels, for example, began to diffuse across Europe around 3500 B.C. because all those early wheels conformed to the same detailed design. However, the Korean alphabet conformed to no existing design; instead it was the idea of writing that diffused to Korea. So too did the idea of square blocks, suggested by the block format of Chinese characters; and so did the idea of an alphabet, probably borrowed from Mongol, Tibetan, or Indian Buddhist writing. But the details were invented from first principles. There are many other writing systems that we know were deliberately designed by historical individuals.
In addition, there are some ancient scripts that are so regularly organized that we can safely infer purposeful design from them as well, even though nothing has come down to us about their origins. For example, we have documents dating from the fourteenth century B.C., from the ancient Syrian coastal town of Ugarit, that are written in a doubly remarkable 30-letter alphabet. The letters were formed by a technique then widespread in the Near East called cuneiform writing, in which a reed stylus was pressed into a clay tablet. Depending on the stylus's orientation, a sign could be a wedge-tipped vertical line, a wedge-tipped horizontal line, or a broad wedge. The Ugaritic alphabet's most striking feature is its regularity. The letterforms include one, two, or three parallel or sequential vertical or horizontal lines; one, two, or three horizontal lines crossed by the same number of vertical lines; and so on. Each of the 30 letters requires, on average, barely three strokes to be drawn, yet each is easily distinguished from the others. The overall result is an economy of strokes and consequently, we assume, a speed of writing and ease of reading. The other remarkable feature of the Ugaritic alphabet is that the letters requiring the fewest strokes may have represented the most frequently heard sounds of the Semitic language then spoken at Ugarit. Again, this would make it easier to write fast. Those two laborsaving devices could hardly have arisen by chance. They imply that some Ugarit genius sat down and used his or her brain to design the Ugaritic alphabet purposefully. As we shall see, by 1400 B.C. the idea of an alphabet was already hundreds of years old in the Near East. And cuneiform writing was by then nearly 2,000 years old. However, as with King Sejong's 28 letters, the Ugarit genius received only those basic ideas by diffusion, then designed the letterforms and the remaining principles independently. 
There were other ancient writing systems with such regular organization and for which we can similarly infer tailor-made creation. Furthermore, evidence suggests that even some highly irregular systems were consciously designed. The clearest example of these is the most famous of all ancient writing systems: Egyptian hieroglyphics, a complex mixture of logograms, syllabic signs, unpronounced signs, and a 24-letter consonantal alphabet. Despite this system's complexity, two facts suggest that the underlying principles were quickly designed and did not evolve through a lengthy process of trial and error. The first is that Egyptian hieroglyphic writing appears suddenly around 3050 B.C. in nearly full-blown form, as annotations to scenes carved on ceremonial objects. Even though Egypt's dry climate would have been favorable for preserving any earlier experiments in developing those signs, no such evidence of gradual development has come down to us. The other fact arguing for the deliberate creation of Egyptian hieroglyphic writing is that it appears suspiciously soon after the appearance of Sumerian cuneiform a couple of centuries earlier, at a time of intense contact and trade linking Egypt and Sumer. It would be incredible if, after millions of years of human illiteracy, two societies in contact happened independently to develop writing systems within a few hundred years of each other. The most likely explanation, again, is idea diffusion. The Egyptians probably learned the idea and some principles of writing from the Sumerians. The other principles and all the specific forms of the letters were then quickly designed by some Egyptian who was clever, but not quite as clever as Korea's King Sejong. So far, I've been discussing writing systems created by conscious design. In contrast, other systems evolved by a lengthy process of trial and error, with new features added and old features modified or discarded at different stages. 
Sumerian cuneiform, the oldest known writing system in the world, is one prime example of such an evolved writing system. Sumerian cuneiform may have begun around 8000 B.C. in the farming villages of the prehistoric Near East, when clay tokens of various simple shapes were developed for accounting purposes, such as recording numbers of sheep. In the last centuries before 3000 B.C., changes in accounting technology and the use of signs rapidly transformed the tokens into the first system of writing. This included a number of innovations, such as the organization of writing into horizontal lines. The most important, however, was the introduction of phonetic representation. The Sumerians figured out how to depict an abstract noun, one that could not be readily drawn as a picture, with another sign that was depictable and that had the same phonetic pronunciation. For instance, it's hard to draw a recognizable picture of life, say, but easy to draw a recognizable picture of arrow. In Sumerian, both these words are pronounced ti. The resulting ambiguity was resolved by adding a silent sign called a determinative to indicate the category of noun the intended object belonged to. Later the Sumerians expanded this phonetic practice, employing it to write syllables or letters constituting grammatical endings. While revolutionary, the phonetic signs in Sumerian writing nonetheless fell far short of a complete syllabary or alphabet. Some sounds lacked any written sign, while the same sign could be written in different ways or be read as a word, syllable, or letter. The result was a clumsy mess. Eventually Sumerian cuneiform, like the later cuneiform scripts derived from it and like the 3,000-year-old tradition of Egyptian hieroglyphics, passed into oblivion, vanquished by the advantages of more precise alphabetic writing. Most areas of the modern world write by means of alphabets because they offer the potential advantage of combining precision with simplicity.
Alphabets apparently arose only once in history: among speakers of Semitic languages, roughly in the area from modern Syria to the Sinai, during the second millennium B.C. All the hundreds of ancient and modern alphabets were ultimately derived from that ancestral alphabet, either by idea diffusion or by actually copying and modifying letterforms. There are two likely reasons that alphabets evolved first among Semites. First, Semitic word roots were specified uniquely by their consonants; vowels merely provided grammatical variations on that consonantal root. (An analogy is the English consonantal root s-ng, where vowel variations merely distinguish verb tenses--sing, sang, and sung--from one another and from the corresponding noun song.) As a result, writing Semitic languages with consonants alone still yields much of the meaning. Consequently, the first Semitic alphabet makers did not yet have to confront the added complication of vowels. The second reason was the Semites' familiarity with the hieroglyphics used by nearby Egypt. As in Semitic languages, Egyptian word roots also depended mainly on consonants. As I've mentioned, Egyptian hieroglyphics actually included a complete set of 24 signs for the 24 Egyptian consonants. The Egyptians never took what would seem (to us) to be the logical next step of using just their alphabet and discarding all their other beautiful but messy signs. Indeed, probably no one would have noticed that the Egyptians even had a consonantal alphabet lost within their messy writing system had it not been for the rise of a true alphabet. Starting around 1700 B.C., though, the Semites did begin experimenting with that logical step. Restricting signs to those for single consonants was only one crucial innovation that distinguished alphabets from other writing systems. Another helped users memorize the alphabet by placing the letters in a fixed sequence and giving them easy-to-remember names. 
Our English names are otherwise-meaningless monosyllables ("a," "bee," "cee," "dee," and so forth). The Greek names are equally meaningless polysyllables ("alpha," "beta," "gamma," "delta"). Those Greek names arose, in turn, as slight modifications, for Greek ears, of the Semitic letter names "aleph," "beth," "gimel," "daleth," and so on. But those Semitic names did possess meaning to Semites: they are the words for familiar objects (aleph = ox, beth = house, gimel = camel, daleth = door). Those Semitic words are related "acrophonically" to the Semitic consonants to which they refer--that is, the first letter of the object is also the letter that is named for the object. In addition, the earliest forms of the Semitic letters appear in many cases to be pictures of those same objects. A third innovation laying the foundations for modern alphabets was the provision for vowels. While Semitic writing could be figured out even without vowel signs, the inclusion of vowels makes it more comprehensible since vowels carry the grammatical information. For Greek and most other non-Semitic languages, however, reading is scarcely possible without vowel signs. (Try reading the example "ll mn r crtd ql," used earlier in the Korean hangul format.) The Semites began experimenting in the early days of their alphabet by adding small extra letters to indicate selected vowels (modern Arabic and Hebrew indicate vowels by dots or lines sprinkled above or below the consonantal letters). The Greeks improved on this idea in the eighth century B.C., becoming the first people to indicate all vowels systematically by the same types of letters used for consonants. The Greeks derived the forms of five vowel letters by co-opting letters used in the Phoenician Semitic alphabet for consonantal sounds lacking in Greek. From those earliest Semitic alphabets, lines of evolutionary modifications lead to the modern Ethiopian, Arabic, Hebrew, Indian, and Southeast Asian alphabets. 
But the line most familiar to us was the one that led from the Phoenicians to the Greeks, on to the Etruscans, and finally to the Romans, whose alphabet with slight modifications is the one used to print this magazine. As a group, alphabets have undergone nearly 4,000 years of evolution. Hundreds of alphabets have been adapted for individual languages, and some of those alphabets have now had long separate evolutionary histories. The result is that they differ greatly in how precisely they match signs to sounds, with English, linguists agree, being the worst of all. Even Danish, the second worst, doesn't come close to us in atrocity. How did English spelling get to be so imprecise? (As a reminder of how bad it is, recall seven fascinating ways we can pronounce the letter o: try horse, on, one, oven, so, to, and woman.) Part of the reason is simply that it has had a long time to deteriorate--the English language has been written since about A.D. 600. Even if a freshly created writing system at first represents a spoken language precisely, pronunciation changes with time, and the writing system must therefore become increasingly imprecise if it is not periodically revised. But German has been written for nearly as long as has English, so that's not the sole answer. Another twist is spelling reforms. As anyone familiar with English and German books printed in the nineteenth century knows, nineteenth-century spelling is essentially identical to modern spelling for English, but not for German. That's the result of a major German spelling reform toward the end of the nineteenth century. The tragicomic history of English spelling adds to the horror. Those Irish missionaries who adapted the Latin alphabet to Old English did a good job of fitting signs to sounds. But disaster struck with the Norman conquest of England in 1066. Today only about half of English words are of Old English origin; the rest are mostly derived from French and Latin. 
English words were borrowed from the French using French spellings, according to rules very different from English spelling rules. That was bad enough, but as English borrowings from French continued, French pronunciation itself was changing without much change in French spelling. The result? The French words borrowed by English were spelled according to a whole spectrum of French spelling rules. English pronunciation itself changed even more radically with time; for example, all written vowels came to sound the same in unstressed syllables. (That is, when pronounced in normal speech, the a in elegant, e in omen, i in raisin, o in kingdom, and u in walrus all sound much the same.) As new words were borrowed from different languages, they were spelled according to the whim of the individual writer or printer. But many English printers were trained in Germany or the Netherlands and brought back still other foreign spelling conventions besides French ones. Not until Samuel Johnson's dictionary of 1755 did English spelling start to become standardized. While English may have the worst writing system in Europe, it is not the worst in the world. Chinese is even more difficult because of the large number of signs that must be independently memorized. As I said earlier, probably the most gratuitously difficult modern writing system is Japan's kanji. It originated from Chinese writing signs and now has the added difficulty that signs can variously be given Japanese pronunciations or modifications of various past Chinese pronunciations. An attempted remedy that compounds the confusion for Japanese readers is the insertion of spellings in yet another writing system, the kana syllabary, for hard-to-read kanji. As George Sansom, a leading authority on Japanese, put it, back in the 1920s: "One hesitates for an epithet to describe a writing system which is so complex that it needs the aid of another system to explain it."
Do sub-ideal writing systems really make it harder for adults to read, or for children to learn to read? Many observations make clear that the answer is yes. In 1928 Turkey switched to the Latin alphabet from the Arabic alphabet, which has the twin disadvantages of a complex vowel notation and of changing the forms of letters depending on where they stand within a word. As a result of the switch, Turkish children learned to read in half the time formerly required. Chinese children take at least ten times longer to learn to read traditional Chinese characters than pinyin, a Chinese adaptation of the Latin alphabet. British children similarly learned to read faster and better with a simplified English spelling termed the Initial Teaching Alphabet than with our conventional spelling. Naturally, the educational problems caused by inconsistent spelling can be overcome by increased educational effort. For example, Japan, with the modern world's most difficult spelling system, paradoxically has one of the world's highest literacy rates--thanks to intensive schooling. Nevertheless, for a given educational effort, a simpler spelling system results in more literate adults. Hebrew provides interesting proof that not only spelling but also letter shapes make a difference. Hebrew writing has several sets of extremely similar letters: only one letter is distinctively tall, and only one letter stands out by dipping below the line (ignoring the special forms of Hebrew letters at the ends of words). As a result, a study suggests that, on the average, readers of Hebrew have to stare at print for longer than do readers of Latin alphabets in order to distinguish those indistinctive letter shapes. That is, distinctive letter shapes permit faster reading. Since details of writing systems do affect us, why do so many countries refuse to reform their writing systems? There appear to be several reasons for this seeming perverseness: aesthetics, prestige, and just plain conservatism. 
Chinese writing and Arabic writing are widely acknowledged to be beautiful and are treasured for that reason by their societies; so were ancient Egyptian hieroglyphics. In Japan and Korea, as in China, mastery of Chinese characters implies education and refinement and carries prestige. It's especially striking that Japan and South Korea stick to their fiendishly difficult Chinese-based characters when each country already has available its own superb simple script: kana for the Japanese, and the hangul alphabet for Korea. Unlike some of these writing systems, our awful English spelling is not considered beautiful or prestigious, yet all efforts to reform it have failed. Our only excuse is conservatism and laziness. If we wanted, we could easily improve our writing to the level of Finland's, so that computer spell-check programs would be unneeded and no child beyond fourth grade would make spelling errors. For example, we should match English spelling consistently to English sounds, as does the Finnish alphabet. We should junk our superfluous letter c (always replaceable by either k or s), and we should coin new letters for sounds now spelled with arbitrary letter combinations (such as sh and th). Granted, spelling is part of our cultural heritage, and English spelling reform could thus be viewed as a cultural loss. But crazy spelling is a part of our culture whose loss would go as unmourned as the loss of our characteristic English medieval torture instruments. But before you get too excited about those glorious prospects for reform, reflect on what happened to Korea's hangul alphabet. Although it was personally designed by King Sejong, not even a king could persuade his conservative Sinophilic countrymen to abandon their Chinese-derived script. South Korea persists with the resulting mess even today. Only North Korea under Premier Kim Il Sung, a dictator far more powerful than King Sejong ever was, has adopted the wonderful hangul alphabet as the writing norm. 
Lacking a president with Kim Il Sung's power to ram unwanted blessings down our throats, we Americans shall continue to suffer under spelling rules that become more and more archaic as our pronunciation keeps changing.
