reCAPTCHA

Cosmic Variance | By Sean Carroll | November 13, 2007 3:29 AM

We've all seen CAPTCHAs -- those distorted words that function as a cut-rate Turing test, separating humans from spambots on any number of websites.

This weekend I was at a Kavli Frontiers of Science meeting at the National Academies of Science office in Irvine, and one of the participants was Luis von Ahn -- the guy who was responsible for inventing the CAPTCHA idea. He gave a great one-minute talk, in which he traced his personal feelings about being responsible for something that is so useful, yet so annoying. CAPTCHA, you will not be surprised to hear, is ubiquitous. Luis figured out that the little buggers are filled out about sixty million times per day by someone on the web. So, as the inventor, he first felt a certain amount of pride at having exerted such a palpable influence on modern life. But after a bit of reflection, and multiplying sixty million by the five seconds it might take to fill in the form, he became depressed at the enormous number of person-hours that were essentially wasted on this task.

Being a clever guy, Luis decided to make lemonade. What we have here is a huge number of people who are recognizing words that a computer can't make out. Luis realized that there was a separate circumstance in which you would want the computer to recognize the words, even though it wasn't quite up to the task -- optical character recognition, and in particular the problem of digitizing old texts.

Apparently, before the advent of the Internet, people would store information by binding together pieces of paper with words printed on them, forming compact volumes known as "books." In the interest of preserving the products of this outmoded technology, various efforts around the world are attempting to scan in all of those books and store the results digitally. But often the text is not so clear, and the computers don't do such a great job at translating the images into words.
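For the curious, the back-of-the-envelope multiplication mentioned above can be spelled out. This is only a rough sketch using the two figures quoted in the post (about sixty million CAPTCHAs per day, about five seconds each); the function name and the conversion to hours are illustrative choices, not anything from the talk.

```python
# Rough estimate of the human time spent on CAPTCHAs each day,
# using the two figures quoted in the post (both are approximations).
CAPTCHAS_PER_DAY = 60_000_000   # "about sixty million times per day"
SECONDS_PER_CAPTCHA = 5         # "the five seconds it might take"

def person_hours_per_day(count: int, seconds_each: float) -> float:
    """Total human time spent per day, converted from seconds to hours."""
    return count * seconds_each / 3600

hours = person_hours_per_day(CAPTCHAS_PER_DAY, SECONDS_PER_CAPTCHA)
print(f"{hours:,.0f} person-hours per day")  # prints "83,333 person-hours per day"
```

Under those assumptions, that is roughly eighty-three thousand person-hours every day.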

[Image: sample-ocr.gif]

Thus, reCAPTCHA was born. At this point you should be able to guess what it does: it takes scanned images from actual books, with which optical character recognition software is struggling, and uses them as the source material for CAPTCHAs. The project is up and running, and can be implemented anywhere ordinary CAPTCHAs are used. Now, when you get annoyed at having to make out those squiggly words with lines slashed through them, you can take some solace in knowing that you're making the world a better place. Or at least saving some books from the trash bin of history.
