Google buys reCAPTCHA

Spam stinks.  But books are good.  That’s the idea behind reCAPTCHA, an anti-spam tool that, in addition to keeping the Internet from being flooded with junk content, helps accurately digitize vast quantities of printed media.  In fact, the technology is so integral to Google’s Books project that Google has announced it is acquiring the small company.

Most Internet users have been prompted to solve a CAPTCHA, one of those images of distorted letters meant to distinguish an actual human from the annoying automated spam bots that trawl websites, depositing their filth onto any unprotected comment area or user registration form.  While CAPTCHA (which stands for Completely Automated Public Turing test to tell Computers and Humans Apart) techniques have been around for years, reCAPTCHA found a way to harness the human brainpower spent interpreting each CAPTCHA, pooling those few seconds of individual thought into an enormous amount of combined human effort every day.

The basis of reCAPTCHA’s technology is so simple and beautiful that once people understand it, they’ll be scratching their heads wondering why they didn’t think of it first.  The process begins with scanned print media, such as the millions of books, newspapers, and magazines that Google is digitizing using specially designed scanners and advanced OCR software.  Each scanned word is read by two independent OCR programs; any word on which the two readings disagree, or which isn’t found in a reference dictionary, is considered suspicious and becomes eligible for transcription through reCAPTCHA.  From this huge database of undeciphered words, the reCAPTCHA software presents users with two words that they must read and re-type.  One of those words is already known, while the other needs interpretation.  Simply put, a user who types the known word correctly is assumed to be human, and once multiple such users agree on a reading of the unknown word, that reading is accepted and the word is considered deciphered.
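For the programmers in the audience, the two-word check can be sketched in a few lines of Python.  This is only an illustration of the idea described above, not reCAPTCHA's actual code: the `WordPool` class, its `submit` method, and the threshold of three agreeing answers are all assumptions made up for this example.

```python
from collections import Counter, defaultdict

# Assumed threshold: how many humans must agree before a reading is accepted.
MIN_AGREEING_VOTES = 3


class WordPool:
    """Hypothetical tracker for unknown words and the readings users submit."""

    def __init__(self):
        self.votes = defaultdict(Counter)  # unknown word id -> reading -> count
        self.resolved = {}                 # unknown word id -> accepted reading

    def submit(self, control_answer, control_typed, unknown_id, unknown_typed):
        """Return True if the user passes the CAPTCHA, False otherwise."""
        # The known (control) word decides whether the user is treated as human.
        if control_typed.strip().lower() != control_answer.lower():
            return False

        # Only users who passed the control word get a vote on the unknown word.
        if unknown_id not in self.resolved:
            reading = unknown_typed.strip().lower()
            self.votes[unknown_id][reading] += 1
            if self.votes[unknown_id][reading] >= MIN_AGREEING_VOTES:
                self.resolved[unknown_id] = reading
        return True


pool = WordPool()
# Three users type the control word "morning" correctly and agree on the unknown word.
for _ in range(3):
    pool.submit("morning", "morning", "scan-17", "overlooks")
print(pool.resolved)
```

A user who mistypes the known word is simply rejected, so their guess at the unknown word never counts toward the tally.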

Using this technique, reCAPTCHA is deciphering the equivalent of over 160 books a day, with greater accuracy (over 99%) and efficiency than any other manual transcription technique.  Google’s purchase of the company gives it access to an advanced security measure that simultaneously helps its book-scanning project and society.  With the increasing popularity of eBook readers, we all stand to benefit from this unique and simple technology.
