For our fourth episode, we decided to try making a long, in-depth show about those squiggly word puzzles you find all over the internet, called CAPTCHAs. This is our first show that contains interviews, including one with the happy fellow you see above, Dr. Andrei Broder, the Chief Scientist at Yahoo!. You’ll hear from him quite a bit in this episode.
This show is almost 50 minutes long. We hope you enjoy it. Right now we’re thinking about this as sort of a special occasion. Most of our shows will likely be shorter — mostly because they’re easier to make (Nat spent over 100 hours on this one). Unless you tell us long is the way to go!
And on that note, we’d love to get your feedback on this show in the comments below. Constructive criticism and gushing encouragement are all welcome!
If you want to learn more about the topics we discussed, here are some handy links.
- Dr. Andrei Broder, Chief Scientist at Yahoo’s Advertising Technology Group
- Ben Maurer, co-founder of reCAPTCHA
- Dr. Kumar Chellapilla, Scientist at Microsoft Research
- Shaun Friedle, creator of the Megaupload autofill CAPTCHA greasemonkey script
- The official CAPTCHA website
- Alan Turing’s 1950 paper, Computing Machinery and Intelligence, wherein he proposes the Turing Test
- A nice little summary of the history of CAPTCHA
- A long Wired article about CAPTCHA and Luis von Ahn’s GWAP project
- reCAPTCHA - solve spam, read books
- CyberLover – the bot that steals personal information
- The Photoshop Phriday competition to make funny pictures from reCAPTCHA word combinations
- A funny xkcd about CAPTCHA and Turing tests
- The CAPTCHA patent
- Taylor Hayward’s work on 3D images as CAPTCHAs
Algorithmic attacks on CAPTCHA
- Kumar Chellapilla’s paper on breaking CAPTCHAs at Microsoft Research
- Shaun Friedle’s Megaupload autofill CAPTCHA greasemonkey script as broken down in John Resig’s blog
Convolutional Neural Networks
- Video of the Hubel/Wiesel cat brain experiments. Amazing example of reverse engineering.
- Yann LeCun’s LeNet-5, a convolutional neural network. LeCun is one of the originators of the technique.
- A great paper introducing convolutional neural networks
- Convolutional Neural Networks best practices, a Microsoft Research paper from Patrice Simard
CAPTCHA bypass services (aka CAPTCHA farms)
- Inside India’s CAPTCHA solving economy, a ZDNet article
- Spyder CAPTCHA assist for MySpace
This episode contains two songs from Eternal Jazz Project, a Swedish jazz band that released some of their music under the Creative Commons BY-NC-SA license on Magnatune. This episode is distributed under the same license.
DR SBAITSO CLIP
Actually, why don’t we start off and tell people what neural networks are.
Now we’ve been talking about some pretty sophisticated ways of attacking CAPTCHA. But there’s one very easy way to break a CAPTCHA we haven’t mentioned yet.