This article is reposted from the old Wordpress incarnation of Not Exactly Rocket Science. The blog is on holiday until the start of October, when I'll return with fresh material.
For decades, scientists have realised that languages evolve in strikingly similar ways to genes and living things. Their words and grammars change and mutate over time, and new versions slowly rise to dominance while others face extinction.
In this evolutionary analogy, old texts like the Canterbury Tales are the English language's version of the fossil record. They preserve the existence of words that used to be commonplace before they lost a linguistic Darwinian conflict with other, more popular forms.
Today, the majority of English verbs take the suffix '-ed' in their past tense versions. Sitting alongside these regular verbs like 'talked' or 'typed' are irregular ones that obey more antiquated rules (like 'sang/sung' or 'drank/drunk') or obey no rules at all (like 'went' and 'had').
In the Old English of Beowulf, seven different rules competed for governance of English verbs, and only about 75% followed the '-ed' rule. As the centuries ticked by, the irregular verbs grew fewer and farther between. With new additions to the lexicon taking on the standard regular form ('googled' and 'emailed'), the irregulars face massive pressure to regularise and conform.
Today, fewer than 3% of verbs are irregular but they wield a disproportionate power. The ten most commonly used English verbs - be, have, do, go, say, can, will, see, take and get - are all irregular. In the study cited below, Lieberman and his team found that this is because irregular verbs are weeded out much more slowly if they are commonly used.
To get by, speakers have to use common verbs correctly. More obscure irregular verbs, however, are less readily learned and more easily forgotten, and their misuse is less frequently corrected. That creates a situation where 'mutant' versions that obey the regular '-ed' rule can creep in and start taking over.
Lieberman charted the progress of 177 irregular verbs from the 9th-century Old English of Beowulf, to the 13th-century Middle English of Chaucer's Canterbury Tales, to the modern 21st-century English of Harry Potter. Today, only 98 of these are still irregular; many formerly irregular verbs such as 'laugh' and 'help' have put on new regular guises.
He used the CELEX corpus - a massive online database of modern texts - to work out the frequency of these verbs in modern English. Amazingly, he found that this frequency affects the way that irregular verbs disappear according to a very simple mathematical formula.
They regularise in a way that is 'inversely proportional to the square root of their frequency'. This means that if they are used 100 times less frequently, they will regularise 10 times as fast and if they are used 10,000 times less frequently, they will regularise 100 times as fast.
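The arithmetic behind that rule can be sketched in a few lines of Python. This is only an illustration of the stated proportionality, not code from the study; the function name `relative_regularisation_rate` is made up for this example.

```python
import math


def relative_regularisation_rate(times_less_frequent: float) -> float:
    """Return how many times faster a verb regularises when it is used
    `times_less_frequent` times less often than a reference verb.

    Assumes, as described above, that the regularisation rate is
    inversely proportional to the square root of usage frequency.
    """
    if times_less_frequent <= 0:
        raise ValueError("frequency ratio must be positive")
    return math.sqrt(times_less_frequent)


# The two cases mentioned in the text: 100 and 10,000 times rarer.
for ratio in (100, 10_000):
    speed_up = relative_regularisation_rate(ratio)
    print(f"{ratio:,} times rarer: regularises {speed_up:.0f} times as fast")
```

The square root is what makes the effect so gentle: a verb has to become a hundred times rarer before it regularises just ten times faster.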
As Lieberman says, "We measured something no one really thought could be measured, and got a striking and beautiful result." Using this model, the team managed to estimate how much staying power the remaining irregular verbs have and assigned them 'half-lives' just as they would to radioactive isotopes that decay over time.
The two most common irregulars - 'be' and 'have' - crop up once or more in every ten words and have half-lives of over 38,000 years. That's such a long time that they are effectively immune to regularisation and are unlikely to change.
Less common verbs like 'dive' and 'tread' only turn up once in every 10,000-100,000 words. They have much shorter half-lives of 700 years and for them, regularisation is a more imminent prospect. Out of the 98 remaining irregular verbs examined in the study, a further 16 will probably have adopted the '-ed' ending by 2500.
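To get a feel for what those half-lives mean, here is a minimal Python sketch that treats irregular verbs like decaying isotopes. It assumes plain exponential decay and uses only the two half-lives quoted above (with 38,000 years as a lower bound for 'be' and 'have'); the function name `fraction_still_irregular` and the chosen time spans are illustrative, and this is not the model the team actually fitted.

```python
def fraction_still_irregular(years: float, half_life_years: float) -> float:
    """Expected fraction of verbs in a frequency class that are still
    irregular after `years`, assuming simple exponential decay."""
    if half_life_years <= 0:
        raise ValueError("half-life must be positive")
    return 0.5 ** (years / half_life_years)


# Half-lives quoted in the article; 38,000 is a lower bound ("over 38,000").
half_lives = {
    "'be' / 'have' class": 38_000,
    "'dive' / 'tread' class": 700,
}

# Hypothetical time spans, chosen purely for illustration.
for label, half_life in half_lives.items():
    for years in (500, 1_000):
        remaining = fraction_still_irregular(years, half_life)
        print(f"{label}: {remaining:.1%} still irregular after {years:,} years")
```

By definition, after one half-life exactly half of a class is expected to have regularised, which is why the rarest verbs are the ones to watch.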
Which will be next? Lieberman has his speculative sights set on 'wed'. It is one of the least commonly used of modern irregular verbs and the past form 'wed' will soon be replaced with 'wedded'. As he jokes, "Now is your last chance to be a 'newly wed'. The married couples of the future can only hope for 'wedded' bliss."
That little jibe highlights the greatest strength of this paper - it's not the striking and elegant results, it's Lieberman's delightful turns of phrase. Suitably for a study about language, he describes his results in pithy and measured language. Observe, for example, his concluding paragraph:
"In previous millennia, many rules vied for control of English language conjugation and fossils of those rules remain to this day. Yet, from this primordial soup of conjugations, the suffix '-ed' emerged triumphant. The competing rules are long dead, and unfamiliar even to well-educated native speakers. These rules disappeared because of the gradual erosion of their instances by a process that we call regularisation. But regularity is not the default state of a language - a rule is the tombstone of a thousand exceptions."
Ah, if only all scientists could write with such poetic flair.
Reference: Lieberman, Michel, Jackson, Tang & Nowak. 2007. Quantifying the evolutionary dynamics of language. Nature doi:10.1038/nature06137