5 Jul, 2007

ReCaptcha in Action

Some captchas are, well, weak. Or better said: broken, and no obstacle for robots. This is because OCR is doing pretty well, and the obfuscation is sometimes not obscure enough.

So why not choose images which have already failed in OCR? 

This is exactly the approach taken by reCAPTCHA. But where do you get words that failed in OCR? And how do you avoid training the bots against these words? reCAPTCHA has refined its approach to cover these points as well:

They take words out of old books that have been digitized but not properly OCRed, because the OCR failed. They separate words out of the books, distort them and give that puzzle to the user. They always present two words: one which has already been decoded and one which has not been decoded yet. Only if both words are entered correctly is the captcha solved.
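To make the two-word idea a bit more concrete, here is a minimal sketch in Python. It is not reCAPTCHA's actual code or API. All names (`Challenge`, `WordStore`, `verify`) and the threshold of three matching answers are made up for illustration. It also rests on one assumption: since nobody knows the right answer for the undecoded word yet, only the already decoded word can really be checked, and the answer for the other word is just collected as a vote.

```python
from collections import Counter
from dataclasses import dataclass, field


def normalize(text: str) -> str:
    """Compare answers case-insensitively and ignore surrounding whitespace."""
    return text.strip().lower()


@dataclass
class Challenge:
    control_word: str  # already decoded, so the correct answer is known
    unknown_id: str    # hypothetical ID of a scanned word that OCR failed on


@dataclass
class WordStore:
    """Collects user answers for words that are not decoded yet."""
    votes: dict = field(default_factory=dict)  # unknown_id -> Counter of guesses

    def record(self, unknown_id: str, guess: str) -> None:
        self.votes.setdefault(unknown_id, Counter())[guess] += 1

    def decoded(self, unknown_id: str, min_votes: int = 3):
        """Return the most common guess once enough users agree (assumed threshold)."""
        counter = self.votes.get(unknown_id)
        if not counter:
            return None
        word, count = counter.most_common(1)[0]
        return word if count >= min_votes else None


def verify(challenge: Challenge, control_answer: str,
           unknown_answer: str, store: WordStore) -> bool:
    """Accept the captcha if the known word matches and the other word was attempted."""
    if normalize(control_answer) != normalize(challenge.control_word):
        return False  # the known word is wrong: most likely a bot or a typo
    guess = normalize(unknown_answer)
    if not guess:
        return False  # both words have to be entered
    store.record(challenge.unknown_id, guess)  # the answer counts as one vote
    return True
```

The user does not know which of the two words is the known one, so both have to be typed carefully. Once enough users agree on the undecoded word, `decoded` returns it and it could be used as a known word in later puzzles.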

So on the one hand, solving the puzzle helps digitize books, and on the other hand it makes the captchas harder to bypass.

But this is not all reCAPTCHA can offer, as it has features which are not (yet) widely spread. Accessibility: if a blind person with a braille reader comes across a captcha, he will usually not be able to decode it. With a click on the appropriate button, it will read out some digits and letters to be entered instead.

I think I will try to integrate this into LifeType. [Update: DONE.]



