Hypography Science Forums: Cracking CAPTCHA - Hypography Science Forums

Cracking CAPTCHA

#1 User is offline   Tormod 

  • Hypographer
  • Group: Administrators
  • Posts: 14,084
  • Joined: 11-February 02
  • Location: Oslo, Norway

Posted 31 October 2006 - 08:19 AM

When people sign up at Hypography (and other forums and sites) they have to fill in a form with letters and numbers from a graphic. This graphic is often obscured to a point where it is hard to see what it actually says.

I learned from Wikipedia that CAPTCHA is an acronym for "Completely Automated Public Turing test to tell Computers and Humans Apart".

Here is an example:
[Image: example CAPTCHA graphic]

What I'm wondering about is how do you set up to crack a thing like this? Obviously it has been done, because we see bots sign up at Hypography, and I'm sure we're not the only target.

Any ideas?
:) Your Friendly Neighborhood Administrator - Please help us by reporting bugs and problems via our Bug Tracker!

Please try out our new Creativity Forums! Log in with your Science Forums credentials.

Science is not only compatible with spirituality; it is a profound source of spirituality.
- Carl Sagan

#2 User is offline   C1ay 

  • ¿42?
  • Group: Administrators
  • Posts: 6,348
  • Joined: 14-February 05

Posted 31 October 2006 - 08:38 AM

Tormod said:

What I'm wondering about is how do you set up to crack a thing like this? Obviously it has been done, because we see bots sign up at Hypography, and I'm sure we're not the only target.

Any ideas?

Me thinks the bots like Googlebot and Slurp possibly phone home when they encounter such requests and a human at the other end helps them out. They only need human help for the initial registration then all they need is their username and password for future visits.

Just thinking out loud,
Clay

Editor and Forum Administrator
stego anyone?
"There are only 10 kinds of people in the world --
.....Those who understand binary, and those who don't."
"Draw no conclusions before their time."

#3 User is offline   TheFaithfulStone 

  • Rockin'
  • Group: Members
  • Posts: 1,482
  • Joined: 09-June 05

Posted 31 October 2006 - 09:23 AM

different kinds of OCR software choke on different kinds of things.

the ocr on my super-high end hp scanner will just spit out garbage if the letter forms aren't super clean, but it can read distorted ones.

the cheapo stuff on my epson at home does pretty good on dirty letters (like a fax or something) but the baseline had best be within a degree or two of level, or no dice.

check out "whatthefont" for an example of how an automated program can id fonts and such.

TFS
There are no stupid questions, but there are a LOT of inquisitive idiots.

#4 User is offline   Buffy 

  • Resident Slayer
  • Group: Administrators
  • Posts: 7,506
  • Joined: 28-January 05

Posted 31 October 2006 - 10:23 AM

Call center software could handle this perfectly along the lines of what C1ay described, and the operators would not need to know any particular language.

Great for outsourcing!

My name is Jane, how service may I you,
Buffy
"If you do not agree with anything I say, I'll not only retract it, but deny under oath that I ever said it!"
________________________________________________________________-- Tom Lehrer

"You know, I promised my mom and dad I wouldn't do anything stupid after I got out of college....Sorry, Mom!"


Forum Administrator
Hypography Science Forums - Science for Boys and Girls! It's not for nothing that we hang out here.

#5 User is offline   InfiniteNow 

  • Suspended
  • Group: Banned
  • Posts: 8,980
  • Joined: 21-December 05

Posted 31 October 2006 - 10:52 AM

Buffy said:

My name is Jane, how service may I you,

Yes, you're right... I am a MCP.


FYI - Craig's post does not appear in this thread. :)
Remember, we cannot see everything even when it is there right in front of us.
"We succeeded in taking that picture [from deep space], and, if you look at it, you see a dot. That's here. That's home. That's us." - YouTube: Pale Blue Dot
(Photo of Earth, February 1990 - Voyager 1: Distance of Pluto)

~~~~~~~~~~~~~~~~~
InfiniteNow

#6 User is offline   Jay-qu 

  • Ancora Imparo
  • Group: Moderators
  • Posts: 5,881
  • Joined: 26-February 05

Posted 31 October 2006 - 11:02 AM

I have seen software that creates blogs on Blogger and posts articles and links. It is fully automated after setup, except for this step: the human just gets shown a picture on the screen and enters what it says, that's it.

I don't think this means that it can't be cracked; it is probably just easier not to bother. With one that is fairly simple like the above, black and white with no orientation changes, I think it could be cracked with some funky software.
Jay-qu
::Hypography Moderator of..
Chemistry, Physics & Mathematics, Astronomy & Cosmology, Space and Technology & gadgets Forums

"I don't think much of a man who is not wiser today than he was yesterday."
-Abraham Lincoln

Physics Guides - Physics Resources and help

#7 User is offline   CraigD 

  • Creating
  • Group: Administrators
  • Posts: 6,508
  • Joined: 23-May 05

Posted 04 November 2006 - 12:05 PM

I don’t think that code to crack CAPTCHA, BAFFLETEXT, and similar anti-spam tools is widely implemented, if at all. As previous posters have noted, it’s likely cheaper to employ a human to do the task. I read in journalist Lee Bruno’s 11/2003 SciAm article Innovations: Baffling the Bots that some academics have worked on such schemes as “a kind of mind sport”, but I suspect that such work hasn’t found its way from the academic to the commercial world.

:doh: Hypothetically speaking as a greedy hacker, if I intended to write such a program, I’d approach it not as the high-minded exercise in AI these academics have, but as a reverse engineering project. A CAPTCHA generator takes a simple random text parameter and some random numeric parameters, and generates a graphic from this data. By knowing the range of possible parameters, and using a “fit”-measuring algorithm, I suspect one could write a program to efficiently find the parameters for a particular CAPTCHA graphic, including the text. It likely wouldn’t be necessary to truly reverse engineer the CAPTCHA software, only to have your own copy to generate graphics to compare to the target graphic.
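
A minimal sketch of the generate-and-compare idea above, under loud assumptions: the 3x5 bitmap font, the `render` function, and the single `shift` parameter are all hypothetical stand-ins for "your own copy" of a generator, and the search is plain brute force rather than anything efficient. It only illustrates the shape of the approach: render every candidate, score it against the target with a fit measure, keep the best.

```python
from itertools import product

# Hypothetical 3x5 bitmap font; a real generator would use real typefaces.
FONT = {
    "A": ["010", "101", "111", "101", "101"],
    "C": ["111", "100", "100", "100", "111"],
    "H": ["101", "101", "111", "101", "101"],
    "T": ["111", "010", "010", "010", "010"],
}


def render(text, shift, width=16):
    """Toy generator: draw `text` starting `shift` pixels from the left
    edge, leaving one blank column between glyphs."""
    rows = [[0] * width for _ in range(5)]
    x = shift
    for ch in text:
        for r, line in enumerate(FONT[ch]):
            for c, bit in enumerate(line):
                if bit == "1" and x + c < width:
                    rows[r][x + c] = 1
        x += 4  # 3 pixels of glyph plus 1 pixel of spacing
    return rows


def misfit(a, b):
    """Fit measure: the number of pixels where two images differ."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def recover(target, length=3, max_shift=4):
    """Try every text and shift; return the (text, shift) pair whose
    rendering differs least from the target image."""
    candidates = (
        ("".join(chars), shift)
        for chars in product(FONT, repeat=length)
        for shift in range(max_shift + 1)
    )
    return min(candidates, key=lambda params: misfit(render(*params), target))
```

With a real generator the parameter space (fonts, warps, noise) would be far too large for exhaustive search, so some smarter optimisation over the fit measure would be needed; the sketch does not attempt that.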

Given how much easier it is to use humans, and the possibility of legal action, I doubt that anyone will try this soon, though it never pays to underestimate human industry and ingenuity when it comes to making a $buck$.
Moderator: Computers and Technology; Medical Science; Science Projects and Homework; Philosophy of Science; Physics and Mathematics; Environmental Studies :)

#8 User is offline   Drip Curl Magic 

  • Creating
  • Group: Members
  • Posts: 1,134
  • Joined: 08-November 05

Posted 04 November 2006 - 12:36 PM

I've been noticing an increase in advertisement bot sign-ups.


I've been wanting to suggest a CAPTCHA, but I had no idea what the name of it was.

I was planning on finding out.... but I guess T is already on it. Bravo.
Rofl waffles

#9 User is offline   Boerseun 

  • Phantom Cow of Justice
  • Group: Moderators
  • Posts: 5,601
  • Joined: 30-May 05

Posted 04 November 2006 - 10:43 PM

It seems that humans have more of a problem deciphering individual CAPTCHA letters than computers do. Where humans beat computers is in telling the letters apart from one another: if the letters are intertwined or tangled, the computer's busted. I personally think that the background and the letters are too far apart in the colour range, so that the letters themselves could be easily 'lifted' out of the background to be deciphered. Photoshop's "magic wand" tool is a simple example of how this is achieved programmatically.
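
As a rough illustration of 'lifting' letters when they sit far from the background in the colour range, here is a minimal sketch using a plain brightness threshold. This is a simpler technique than a magic-wand selection (which grows a region from a clicked pixel), and the `lift_foreground` name, the grayscale list-of-lists image format, and the sample values are assumptions made for the example.

```python
def lift_foreground(gray, threshold=None):
    """Return a 0/1 mask marking the pixels darker than `threshold`.

    `gray` is a list of rows of brightness values (0 = black, 255 = white).
    If no threshold is given, use the midpoint between the darkest and
    brightest pixel, which only works when text and background are well
    separated in brightness.
    """
    if threshold is None:
        flat = [p for row in gray for p in row]
        threshold = (min(flat) + max(flat)) / 2
    return [[1 if p < threshold else 0 for p in row] for row in gray]


# Hypothetical sample: a noisy light background with a few dark "ink" pixels.
SAMPLE = [
    [220, 210, 225, 215],
    [205,  30,  40, 230],
    [215,  25, 220, 210],
]
MASK = lift_foreground(SAMPLE)
```

The midpoint threshold fails as soon as text and background overlap in brightness, which is exactly the weakness the textured-fill suggestion in this post is aimed at.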

In my opinion, to completely baffle the computers, the letters should be filled in with a texture layer that's from a random graphic, and the background should be also filled in with a random graphic. The two should stand out pretty visibly to the human eye, but the random colours assigned to neighbouring pixels, in the background and inside the text alike, will confuse the computer no end: there'd be no easy way to pick the text out programmatically.
Hypography Forums Moderator

IIIIIIIIIIIIIIIII
IIIIIIIIIIIIIIIII
IIIIIIIIIIIIIIIII



Ecce bos taurus justitia

#10 User is offline   moo 

  • Questioning
  • Group: Members
  • Posts: 216
  • Joined: 25-October 06

Posted 04 November 2006 - 11:12 PM

Somewhere in the board software is a table/etc. used to check whether the user has entered the correct code for the image.

I'd guess either it's not that hard to find/crack, or else the tables have been dumped for the major brands (such as vBulletin) and shared among spammers.

My 2 cents. :shrug:

moo
"Other friends have flown before...
On the morrow he will leave me, as my hopes have flown before."
Quoth the raven "Nevermore."

~ From THE RAVEN by Edgar Allan Poe ~

#11 User is offline   CraigD 

  • Creating
  • Group: Administrators
  • Posts: 6,508
  • Joined: 23-May 05

Posted 04 November 2006 - 11:31 PM

Boerseun said:

In my opinion, to completely baffle the computers, the letters should be filled in with a texture layer that's from a random graphic, and the background should be also filled in with a random graphic.

That’s a bit like how BAFFLETEXT works.

To my thinking, simple language and common knowledge puzzles, such as “Enter a word meaning the opposite of ‘good’”, offer hard-to-defeat CAPTCHAs, and are easier for the less graphically capable to implement. Unfortunately, such tests also weed out a considerable number of human beings.

#12 User is offline   ronthepon 

  • An Intern!!
  • Group: Members
  • Posts: 2,106
  • Joined: 30-April 06

Posted 05 November 2006 - 01:47 AM

How are the images correlated to the words to be typed anyway? Are they stored away in some kind of database?

Because moo's thoughts seem much more realistic than AI cracking these twisted word groups.

#13 User is offline   C1ay 

  • ¿42?
  • Group: Administrators
  • Posts: 6,348
  • Joined: 14-February 05

Posted 05 November 2006 - 06:31 AM

Boerseun said:

I personally think that the background and the letters are too far apart in the colour range, so that the letters themselves could be easily 'lifted' out of the background to be deciphered.

I have actually encountered such tests that resembled a test for color blindness. On one that I recall the text was made of colored orange dots on a background made of red and yellow dots. A color blind person would not have been able to make the text out.
Clay


#14 User is offline   Qfwfq 

  • Exhausted Gondolier
  • Group: Administrators
  • Posts: 6,239
  • Joined: 18-February 05
  • Location: Trying to float on an ocean of hydrogen.

Posted 07 November 2006 - 04:48 AM

Forms of daltonism other than the red-green one are exceedingly rare, so this problem could be avoided. At worst, a phone number could be offered for the rare people unable to take the test, and poor Tormod ;) would only have to guess whether he was hearing a voice synthesizer.

To avoid what Moo says, I would expect the character sequences to be generated randomly and the images from them.
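
A minimal sketch of that arrangement, assuming nothing about how vBulletin or any other board actually does it: the text is drawn at random for every challenge, so there is no fixed image-to-answer table to dump. The function names, the in-memory `store` dictionary, and the choice to keep only a hash of the answer are all hypothetical.

```python
import hashlib
import hmac
import secrets
import string

ALPHABET = string.ascii_uppercase + string.digits


def new_challenge(store, length=6):
    """Create a fresh random challenge.

    Returns (token, text). The text would be handed to the image renderer
    only; the store keeps just a hash of it, keyed by the token.
    """
    text = "".join(secrets.choice(ALPHABET) for _ in range(length))
    token = secrets.token_hex(8)
    store[token] = hashlib.sha256(text.encode()).hexdigest()
    return token, text


def check_answer(store, token, answer):
    """Check a submitted answer. Each token can be used only once."""
    expected = store.pop(token, None)
    if expected is None:
        return False
    given = hashlib.sha256(answer.strip().upper().encode()).hexdigest()
    return hmac.compare_digest(expected, given)
```

Because the token is discarded on first use, a bot cannot keep guessing against the same image; it has to request a new challenge each time.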

A bot might be designed to test various areas of various sizes for a few types of average in order to spot the difference between background and typeface; this would have to be countered by having variability at all scales. The boundary would then be recognizable only as a more abrupt change, requiring a more sophisticated 'bot.
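
One way to picture the area-averaging idea is the sketch below, which uses only the plain mean over square tiles of a single size; trying several tile sizes and other kinds of average would be the obvious extension. The function names and the grayscale list-of-lists image format are assumptions for the example.

```python
def block_means(gray, size):
    """Mean brightness of each size x size tile (edge tiles may be smaller)."""
    height, width = len(gray), len(gray[0])
    means = []
    for top in range(0, height, size):
        row = []
        for left in range(0, width, size):
            tile = [
                gray[y][x]
                for y in range(top, min(top + size, height))
                for x in range(left, min(left + size, width))
            ]
            row.append(sum(tile) / len(tile))
        means.append(row)
    return means


def dark_tiles(gray, size):
    """(tile_row, tile_col) of every tile darker than the whole-image mean."""
    flat = [p for row in gray for p in row]
    overall = sum(flat) / len(flat)
    return [
        (r, c)
        for r, row in enumerate(block_means(gray, size))
        for c, mean in enumerate(row)
        if mean < overall
    ]
```

If the image varies at every scale, as suggested above, tile means stop separating text from background and this kind of test no longer helps.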

Once the typeface has been lifted out of the background, you also want to make sure the characters cannot simply be separated. The image in the first post has only two of the characters in contact, and barely so. A relatively simple topological analysis would pick them out. Use characters made of separated strokes as well as ligatures between different characters.
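
A minimal sketch of one such topological analysis, assuming it means connected-component labelling on a 0/1 mask of ink pixels (the post does not name a specific method). Each 4-connected blob of ink is treated as one candidate character; `to_mask` and the tiny test patterns are hypothetical helpers for the example.

```python
from collections import deque


def to_mask(rows):
    """Convert strings such as "1101" into rows of 0/1 integers."""
    return [[int(ch) for ch in row] for row in rows]


def connected_components(mask):
    """Return a list of pixel sets, one per 4-connected blob of 1-pixels."""
    height, width = len(mask), len(mask[0])
    seen = set()
    components = []
    for r in range(height):
        for c in range(width):
            if mask[r][c] != 1 or (r, c) in seen:
                continue
            # Breadth-first flood fill from this unvisited ink pixel.
            queue = deque([(r, c)])
            seen.add((r, c))
            blob = set()
            while queue:
                y, x = queue.popleft()
                blob.add((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (
                        0 <= ny < height
                        and 0 <= nx < width
                        and mask[ny][nx] == 1
                        and (ny, nx) not in seen
                    ):
                        seen.add((ny, nx))
                        queue.append((ny, nx))
            components.append(blob)
    return components


SEPARATE = to_mask(["110011", "110011"])  # two glyphs with a gap between them
TOUCHING = to_mask(["111111", "110011"])  # same glyphs joined by a ligature
BROKEN = to_mask(["101", "101"])          # one glyph drawn as two strokes
```

On these patterns the labelling finds two blobs for `SEPARATE`, one for `TOUCHING`, and two for `BROKEN`, which is why ligatures and broken strokes make the blob count a poor guide to the character count.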
Inutil insegnà al mus, si piart timp, in plui si infastidìs la bestie.

Hypography Forum PITA......... er, Administrator. :hihi:

#15 User is offline   moo 

  • Questioning
  • Group: Members
  • Posts: 216
  • Joined: 25-October 06

Posted 07 November 2006 - 07:54 AM

Hmmm... are the individual character images stored in a file (dll etc.) and then assembled for display? If so, anyone with a copy of vBulletin could rip 'em for image comparisons.

[EDIT] Btw, is it really a good idea to hash this stuff out in an open forum?

moo
