Spammers and Phishers Breaking CAPTCHAs

The emergence of CAPTCHA-based authentication was a logical move in the fight against automated brute forcing of login details, automated registrations, comment spam, and the registration of splogs (spam blogs). Consequently, spammers, phishers and malware authors started figuring out how to keep achieving their objectives automatically: by breaking a given CAPTCHA, by adapting to it, or, more pragmatically, by outsourcing the request to a third party.

Two months ago, there were news stories on how spammers and phishers, feeling the pressure put on them by anti-spam vendors, have supposedly broken Hotmail's and Yahoo's CAPTCHAs. Nothing is impossible, the impossible just takes a little longer. What's important is discussing the many other perspectives: adapting to a CAPTCHA, directly breaking it, or entirely ignoring it.

In the first example you can see automatic CAPTCHA recognition at a Russian email provider. The script is basically syndicating proxies, verifying that they work, and then starting the mass registration process, reporting confirmation or error results along the way. The CAPTCHA in question is indeed primitive, but what the malicious parties are really interested in is the email provider's clean IP reputation, which makes its accounts a launch pad for spam, phishing and malware. Once the CAPTCHA becomes easily recognizable, the entire process of logging in and sending the malicious content can also be fully automated.

In the second example you can see a great example of the adaptation process. The CAPTCHA cannot be efficiently abused as we've seen in the first case, but instead of putting effort into breaking it directly, the malicious parties are simply adapting. Once proxies get syndicated and verified for connectivity, a request for the number of accounts to be registered is initiated; the script then responds with automatically generated logins and presents the CAPTCHA to be manually entered by the malicious party. Malicious economies of scale in action: even though the CAPTCHA cannot be broken, the process is still partly automated, another example of marginal thinking applied in order to achieve an objective.

Sample CAPTCHA breaking project requests:

- "I need a captcha breaker that can break captchas that are of the same style i will upload here.I will want a c++ dll that recieves a file path and returns a char* with the content of the picture (letters and numbers)"

- "The program needs to take a myspace captcha image and determine what the text says in the image. The accuracy needs to be 80%+"

- "We are an expert group for inputing captcha for you with very low price and high accuracy. We can input 10k to 100k (depending on how many you can offer to us) per day with accuracy at least 70% (for simple captcha such as yahoo, it is above 95%). We also own expert programmers who can help you with writting your spiders or other softwares to get and manage all the captchas."

Some are purely malicious; others aim, for instance, to verify the security of a CAPTCHA in development. Let's summarize: why are malicious parties interested in defeating CAPTCHAs at popular sites?

- take advantage of the clean IP reputation of the email service in order to improve the chance of having their phishing/spam/malware email successfully received

- set the foundations for large-scale automated spamming/phishing operations by using legitimate email addresses, thus improving their chances of not getting filtered

- automated registration of splogs -- spam blogs

- as search engines are starting to crawl sites submitted at the most popular social networks in real time, spammers and malware authors are naturally interested in abusing this development to quickly attract huge audiences to their splogs, which often have malware embedded within

What are malicious parties doing to achieve efficiency despite their inability to defeat an advanced CAPTCHA?

- humans entering the CAPTCHAs while the script auto-generates and stores the passwords, then logs in automatically using them in combination with the human-entered CAPTCHA

- adapting instead of putting more effort into rocket science: whenever a CAPTCHA cannot be beaten automatically, as you already saw in the second screenshot, they make entering the CAPTCHA easier and faster for their human operators than it is for an end user browsing the site

- outsourcing, while making it sound like a quality assurance project for a CAPTCHA about to be introduced on the market

What can web sites do to prevent that sort of malicious behaviour? Strong CAPTCHAs should be in place by default. Taking another perspective, I have discussed how click fraud could be easily detected by advertising networks syndicating the IPs of hosts already known to be malware infected. In the very same fashion, we could have a CAPTCHA system that checks whether, for instance, default proxy ports are open at the host trying to register, and whether or not that host is part of a botnet. With data like this now a commodity, a prioritization process that closely monitors mass registrations from these IPs is a pragmatic early warning system.
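The early warning idea above can be sketched roughly as follows. This is only a minimal illustration, not a production design: the function names, the list of default proxy ports, the score weights and the mass registration threshold are all assumptions made for the example, and the set of known infected IPs stands in for whatever feed a site actually syndicates.

```python
import socket
from collections import Counter

# Assumed, illustrative list of default proxy ports; a real deployment
# would maintain its own list.
DEFAULT_PROXY_PORTS = (1080, 3128, 8080)


def port_is_open(ip, port, timeout=1.0):
    """Return True if a TCP connection to ip:port succeeds within timeout."""
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return True
    except OSError:
        return False


def score_registration(ip, known_infected, recent_counts,
                       probe=port_is_open, mass_threshold=5):
    """Score one registering IP; a higher score means monitor more closely.

    The weights below are arbitrary assumptions, chosen only to show the idea.
    """
    score = 0
    if ip in known_infected:                      # syndicated malware/botnet data
        score += 2
    if any(probe(ip, port) for port in DEFAULT_PROXY_PORTS):
        score += 1                                # host looks like an open proxy
    if recent_counts.get(ip, 0) >= mass_threshold:
        score += 1                                # mass registration from one IP
    return score


def prioritize(registration_ips, known_infected,
               probe=port_is_open, mass_threshold=5):
    """Return (ip, score) pairs with a non-zero score, highest score first."""
    counts = Counter(registration_ips)
    scored = (
        (ip, score_registration(ip, known_infected, counts, probe, mass_threshold))
        for ip in counts
    )
    return sorted((pair for pair in scored if pair[1] > 0),
                  key=lambda pair: (-pair[1], pair[0]))
```

Calling `prioritize(recent_ips, infected_feed)` on the IPs seen during a recent window of registrations would yield a short watch list instead of a hard block, which matches the prioritization approach described above. The `probe` parameter is injectable so that the port check can be replaced, for instance by a cached lookup, rather than connecting to every registering host.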

Interesting reading on the big picture too, CAPTCHA - The Broken Token:

"How much does it cost to have a CAPTCHA hack custom developed? $10 to $20 ought to do the trick; certainly no more than $50. But the cost isn’t the point. What’s more alarming is that thousands upon thousands of site owners are depending upon flawed technology to protect their sites from spam even though they know, or at least should know, that it’s only a matter of time until some spam robot shows up and starts hammering away at those worthless little images."

The irony regarding CAPTCHAs is that less popular sites often have a more sophisticated CAPTCHA than the Web 2.0 darlings, the most widely used web sites.

OCR Research Team; List of Weakness

PWNtcha - captcha decoder

XRumer

Vladuz's EBay CAPTCHA Populator

Attack of the SEO Bots on the .EDU Domain

Spam Comments Attack on Techcrunch Continuing

The Blogosphere and Splogs