Humans Help Solve Problems Computers Can't
It's a familiar online experience: You're about to make a purchase or submit a form, when you're asked to retype a series of distorted letters on the screen to prove you're human.
The user identification system known as CAPTCHA — Completely Automated Public Turing Test to Tell Computers and Humans Apart — was designed as a security measure to distinguish humans from machines, which are unable to solve the visual tests.
What you might not know is that CAPTCHAs often serve a secondary purpose — harnessing humans' abilities to help with a project that computers can't do alone.
Luis von Ahn, a computer science professor at Carnegie Mellon University, discussed the hidden value of CAPTCHAs on Monday night during the fourth talk in the University of Arizona College of Science lecture series. The series, themed "Humans, Data and Machines," focuses on the convergence of the digital, physical and biological worlds.
After asking a near-capacity crowd at Centennial Hall who had ever filled out a CAPTCHA, von Ahn posed this question: "How many of you found it really, really annoying?"
Met with raised hands and laughter, the 39-year-old responded: "So, I invented that."
Von Ahn, who sold his company reCAPTCHA to Google in 2009, undoubtedly has a sense of humor about the minor inconvenience his work has created for internet users.
"Each time you do one of them, you waste about 10 seconds of your time, and if you multiply 10 seconds by 200 million, humanity as a whole is wasting about 500,000 hours every day typing these CAPTCHAs in because of me," he said.
A Means of Digitizing Books
But it's not for nothing, he assured the crowd, because he found a way to harness all that valuable human time and energy for a greater purpose: using CAPTCHAs to help with an effort to digitize the world's books.
While computers do a decent job of recognizing words on scanned pages of books, they have a harder time if the ink is faded or otherwise compromised, von Ahn said. That's where humans come in.
Those wiggly word pairs you see in a CAPTCHA? Often, one of them is a word that has stumped a machine, and human help is needed to decipher it. When you and other users type the same reading of that word, the agreement tells the computer what the word says.
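The agreement step described above can be sketched in a few lines of Python. This is only an illustration, not reCAPTCHA's actual implementation: the function name `resolve_word` and the `min_votes` and `min_share` thresholds are assumptions, since the talk did not say how agreement is decided.

```python
from collections import Counter


def resolve_word(transcriptions, min_votes=3, min_share=0.75):
    """Return the agreed-upon reading of an unknown word, or None.

    min_votes and min_share are hypothetical thresholds chosen for
    illustration only.
    """
    # Normalize so that "Morning " and "morning" count as the same answer.
    cleaned = [t.strip().lower() for t in transcriptions if t.strip()]
    if len(cleaned) < min_votes:
        return None  # not enough human answers yet

    word, count = Counter(cleaned).most_common(1)[0]
    # Accept the top answer only if enough users typed the same thing.
    if count / len(cleaned) >= min_share:
        return word
    return None


print(resolve_word(["morning", "Morning", "morning", "mourning"]))  # morning
print(resolve_word(["upon", "npon"]))  # None: too few answers so far
```

Requiring several matching answers, rather than trusting a single user, is what would let a system like this tolerate the occasional typo or careless guess.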
"This has helped digitize 100 million words a day, which is the equivalent about 2 million books a year, all being digitized one word at a time by just having people type CAPTCHAs on the internet," said von Ahn, whose many accolades include being named one of the 50 Best Brains in Science by Discover and ranking in Popular Science's Brilliant 10 and Silicon.com's 50 Most Influential People in Technology.
The idea that humans can help with problems computers can't solve alone was central to von Ahn's talk.
The same idea underscored his development about 10 years ago of a human-based computation game called ESP. The game paired strangers online and asked them to describe an image, with the objective of coming up with the same word as their partner.
Fun and games? Sure. But that's not all. The words collected helped accurately label images — something computers aren't great at doing, with online image searches turning up pictures largely based on words that appear in their file names.
"As people were playing, they were also labeling random images from the Web without really knowing that they were doing that — and doing so very effectively and accurately," von Ahn said.
He sold that idea to Google as well, which created its own version of the game called Google Image Labeler.
Von Ahn's latest project, Duolingo, is a popular, free language-learning website and app intended to improve access to foreign language education worldwide.
Bridging the Access Gap
It's a cause von Ahn feels passionately about. Growing up in Guatemala, he saw education not as the great equalizer it is often championed to be, but as something that can instead deepen inequalities between those able to afford it and those who can't.
Today, an estimated 1.2 billion people in the world are learning a foreign language, von Ahn said. Of those, two-thirds are from low socioeconomic backgrounds and are learning English in an effort to get or change jobs.
The problem, von Ahn said, is that learning a language can be expensive. That's why he's committed to keeping Duolingo free. The program also offers a 30-minute language proficiency test for $20 to $50 as an alternative to costlier, more time-intensive language tests sometimes required by universities or corporations.
Duolingo features activity-based learning exercises, and users must prove proficient in certain language skills before advancing to the next level. Mastery of different categories is measured by progress bars that decrease if users haven't touched the game in a while.
"The hardest thing about learning a language by yourself is to keep yourself motivated. It's kind of like going to the gym. Everybody wants to do it but, man, it's really hard," von Ahn said. "So what we decided to do was make Duolingo feel as much like a game as possible."
When von Ahn and one of his students set out to create a language-learning program, they didn't know how best to teach a language, so they started reading up on the topic. They soon discovered that, while there are plenty of philosophies, there is no single agreed-upon answer.
So, like von Ahn's other projects, Duolingo looks to its users for answers.
When new users sign up for Duolingo, some may be assigned, for example, to a group that learns adjectives before plurals, while another group may learn in the opposite order. Comparing the performance of the two groups can provide information about which approach is most effective.
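A comparison of this kind can be sketched as a simple two-group experiment. The code below is an illustration under stated assumptions, not Duolingo's actual system: the group names, the `assign_group` and `compare_groups` helpers, and the sample scores are all hypothetical.

```python
import random
from statistics import mean

GROUPS = ("adjectives_first", "plurals_first")  # hypothetical group names


def assign_group(user_id, groups=GROUPS):
    """Pick a group at random, but consistently for a given user_id."""
    rng = random.Random(user_id)  # seeding by user keeps the assignment stable
    return rng.choice(groups)


def compare_groups(scores_by_group):
    """Return the mean score per group and the name of the best group."""
    averages = {g: mean(s) for g, s in scores_by_group.items() if s}
    best = max(averages, key=averages.get)
    return averages, best


# Made-up quiz scores, for illustration only.
scores = {
    "adjectives_first": [0.62, 0.70, 0.66],
    "plurals_first": [0.71, 0.75, 0.73],
}
averages, best = compare_groups(scores)
print(best)  # plurals_first
```

Comparing raw averages is the simplest possible readout. With real data, an experimenter would normally also check whether the difference is statistically significant before changing the curriculum.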
"It takes about six hours for us to get 50,000 new users on Duolingo," von Ahn said. "At any point in time, were running about 100 different experiments and trying to improve the way we teach. So Duolingo is literally getting better. It also means that if you are using Duolingo, we are probably experimenting on you.
"Duolingo today is, by itself, about as good a beginners' classroom," he said. "What we want is for Duolingo to be as good as a good human tutor."