One curious side effect of the effort to digitize books and historical texts is that the resulting databases can be searched for individual words, revealing when each first appeared and how its frequency of use has changed over time.
The Google Books n-gram corpus is a good example (an n-gram is a sequence of n words). Enter a word or phrase and it’ll show you its relative usage frequency since 1800. For example, the word “Frankenstein” first appeared in the late 1810s and has grown in popularity ever since.
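To make the idea of an n-gram and its relative frequency concrete, here is a minimal Python sketch. It is an illustration only, not Google's actual pipeline: the function names `ngrams` and `relative_frequency` are hypothetical, the sample sentence is made up, and splitting on whitespace is a simplifying assumption about how words are separated.

```python
from collections import Counter


def ngrams(text, n):
    """Return every run of n consecutive words in text, as tuples."""
    # Assumption: lowercase and split on whitespace; real corpora tokenize more carefully.
    words = text.lower().split()
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


def relative_frequency(phrase, text):
    """Fraction of all n-grams in text that equal phrase (n = words in phrase)."""
    target = tuple(phrase.lower().split())
    grams = ngrams(text, len(target))
    if not grams:
        return 0.0
    return Counter(grams)[target] / len(grams)


# Made-up sample text, for illustration only.
sample = "the monster spoke and the monster wept"
print(ngrams(sample, 2))
print(relative_frequency("the monster", sample))
```

Here the two-word phrase "the monster" accounts for two of the six bigrams in the sample. Repeating this kind of count separately for the texts published in each year is what would give a frequency-over-time curve.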
By contrast, the phrase “Harry Potter” appeared in the late 1990s and quickly gained popularity, but it never overtook Frankenstein or, for that matter, Dracula. That may be something of a surprise given the unprecedented global popularity of J.K. Rowling’s teenage wizard.
And therein lies the problem with a database founded on an old-fashioned, paper-based technology. The Google Books corpus records “Harry Potter” once for each ...