reCAPTCHA: What’s Google doing now?

For those who don’t know what reCAPTCHA is, this first paragraph is a short introduction; others can skip to the second paragraph. CAPTCHA stands for Completely Automated Public Turing test to tell Computers and Humans Apart. In non-technical terms, it’s a system designed to prevent spam by asking people to solve problems that computers can’t. reCAPTCHA is Google’s CAPTCHA service (it began at Carnegie Mellon University and was acquired by Google in 2009), and its website carries the motto “Stop spam, read books”. It helps book digitization projects, including Google’s, by having humans transcribe words that OCR software can’t read.

Now that we know what reCAPTCHA’s motive is (or was), here’s something I found odd. Anyone who has used reCAPTCHA recently will have noticed that instead of two skewed words, you get two sets of numbers. Not that numbers are odd in themselves, but on closer inspection one number looks like a real-world photograph while the other looks like the old skewed-number image. Here is an example.


You can see many more like this on the reCAPTCHA website mentioned above.

What’s interesting is the second image. Yes, reCAPTCHA is now more difficult for machines to solve and simultaneously easier for humans. But the point is that the second image is not computer generated; it’s taken from the real world. Many of these look like door numbers, street numbers and room numbers. As you can see in the second image, the 8001 looks like it’s above a door. The next logical question is: where did Google get all these images? The obvious answer is Google Street View, so one can reasonably infer that Google has started to “digitize” Street View images.

The strange thing is that there is no mention of this anywhere on the reCAPTCHA site or Google’s blogs. The reCAPTCHA site still states that it is only digitizing books. Why is Google keeping mum? One interesting conclusion to draw from this is that Google has made good progress with its book digitization efforts and is now expanding to Street View images.

I still find Google’s silence weird. All one can hope for now is that Google comes clean soon, or that someone (like me 😉) figures this out and spreads the word.

P.S. The strange thing is that once I started writing this, I stopped seeing numeric captchas. Guess I’m getting paranoid :p


Published by


Hi! My name is Gokula Krishnan; you can call me gokul (I prefer that). I'm a third-year Computer Science major at BITS Pilani. I'm interested in Technology, Theoretical Computer Science and Discrete Mathematics. A FOSS enthusiast, I'm one of the founders of the BITS Firefox community. I'm currently working on Big Data Analytics, Machine Learning and UNIX shell programming. My not-so-geeky hobbies include origami and playing volleyball and football.

