Wednesday, 8 November 2017

Captcha schmapcha

Before my bike-fatal road traffic accident, I had to park my wheels [synecdoche!] all over town. I needed a lock which was good enough to deter Bicycle Thieves but not so awkwardly heavy that I couldn't easily cycle with it on board. I found it also helped to deliberately 'distress' the frame with decals and mud spatter, and to customise the handle-bars with sticky-tape. That and parking my bike next to a more obviously desirable bike (with crappier security). It was an optimisation problem.

It's been similar for the Interweb since it started going all ugly and commercial 20 years ago. Commerce, even if it was only counting page-views, created a war between 'genuine' human browsers-of-the-web and 'bots which were trying to game the system to the advantage of Bangalore Spambot Inc.  One solution to this sheep vs goats problem was the invention of Captchas [acronym: Completely Automated Public Turing test to tell Computers and Humans Apart]: where further progress into a site required you to solve a pattern-recognition problem which was easy for people and difficult for 'bots.  It is a wonder of the human condition that we-the-people are so good at this: it is honed by a million years of evolution successfully recognising tigers in the long-grass [count 'em solution].

Google acquired reCaptcha Inc in 2009 and threw a lot of software engineers at the problem, until they were able to deduce humanity simply from the clicking style. I guess the 'bots click on the middle pixel of the tick-box while humans miss the bulls-eye and take longer to hit the outer rings as well.
Captcha got so distorted that although computers could figure them out 99.8% of the time, ordinary people were foxed 50% of the time and that meant lost business. How much do I want to access This Site? Enough to spend two minutes being told that I'm an idiot? Is there another place I can more easily do business? Megalithomania turned up to see The Ringstone, unannounced, on my birthday, a few years ago. I thought it would be nice to make contact but was so frustrated by the Captcha-portal that I finally said "feck it, this guy doesn't want any feedback".

Things have improved recently, now that language and experience skills are added into the mix [R]. It requires an extra load of clicking but not too much and a cute mini-puzzle can be a bonus rather than a frustration - the 3rd Law of the Internet is "You can never have too many kittens". But this is not without its cultural-exclusion problems. If you normally write in Arabic or Cyrillic you may be less adept than me at untangling distorted Roman letters. A well-known push-back against the "Blacks have low IQs" finding questioned whether the dispossessed knew what the words meant when the IQ test was designed by Dr Strate Whyte-Mail PhD from Stanford: Which jacuzzi is the odd one out? A reCaptcha puzzle I was recently presented with asked me to count the number of shop-fronts rather than kittens. I could imagine a smart chap from Doha asking "what's a shop-front?". I recently labelled one of my graphs for the 2nd Year ResMet course "Arbitrary Units" and two of the kids didn't know the word arbitrary in that context. I had similar probs with "salient" a few years ago.

And of course, it's war. If Google and others are trying to separate humans from machines, curious researchers are trying to push the envelope to see if they can break in. Recent research published in Science and reported by George Dvorsky has developed a new way of looking at mangled letters and gets the right answer two times in three. I'm about ready to go back to parchment and quill pen.
