Ray of Nature - Faizan Zahid

Captcha Decoding
Using Multivalued Image Decomposition Algorithm
Group Members:
• Faizan Zahid
• Fubha Burney
• Ibrahim Ajmal
• Syed Haider Raza
Table of Contents
• Motivation for the Project … 2
• Introduction to CAPTCHAs … 3
• Breaking CAPTCHAs … 6
• Algorithm Methodology … 8
• Text extraction phase … 10
• Letter extraction phase … 11
• Image Recognition phase … 12
• Algorithm Analysis … 13
• Demo … 14
• Uses and Benefits … 15
• References & Future aspects … 16
• Queries … 17
Page 2
Algorithmic Motivation
o To learn about:
   • Latest algorithms & their practical usage
   • Solving real-world problems
   • Optical Character Recognition (OCR)
   • Artificial Intelligence (AI)
o Curiosity & awareness:
   • CAPTCHAs behind the scenes
   • Google’s image search
Page 3
Introduction to CAPTCHAs
o CAPTCHA stands for Completely Automated Public Turing test to tell Computers and Humans Apart.
o In simple terms, an “Are you a human?” test.
o Used by many websites to block bots and stop spam.
Examples:
http://www.slideshare.net/avinash2008/captchappt
Page 4
Types:
• Visual
• Audio (Alternative)
More examples:
http://solvecaptchas.com/wp-content/uploads/2013/05/Captcha_Creator_PHP_Script358301.gif
Page 5
Is it possible to crack CAPTCHAs?
“Every defeat is also a victory” [1]
[1] http://computer.howstuffworks.com/captcha5.htm
Page 6
Breaking CAPTCHAs
Why?
o Ensure & improve the efficiency of CAPTCHAs
o Ethical hacking purposes
Basic steps:
• Convert the image to grayscale (text extraction)
• Apply pattern detection (OCR and AI)
• Match the result against a dictionary or reference characters
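The first basic step, grayscale conversion, can be sketched in a few lines. This is a minimal illustration rather than the project's own code: it assumes the image is given as rows of (R, G, B) tuples and uses the ITU-R 601 luma weights (299, 587, 114 per thousand), which the Python Imaging Library documents for its RGB to "L" conversion. The name `to_grayscale` is hypothetical.

```python
def to_grayscale(image):
    """Convert rows of (R, G, B) tuples into rows of 0-255 gray values.

    Uses integer ITU-R 601 luma weights; this is an illustrative choice,
    not necessarily what the presented implementation does.
    """
    return [[(r * 299 + g * 587 + b * 114) // 1000 for (r, g, b) in row]
            for row in image]


# A tiny 1x3 image: white, black, pure red.
gray = to_grayscale([[(255, 255, 255), (0, 0, 0), (255, 0, 0)]])
# gray == [[255, 0, 76]]
```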
Page 7
Algorithm
Extraction Technique:
• Color extraction
Algorithm:
• Multivalued image decomposition
Phases:
=> Text Extraction Phase
=> Letter Extraction Phase
=> Recognition Technique
Page 8
Methodology
Steps:
Text Extraction Phase
o Build histogram of colors
o Create new B/W image
Letter Extraction Phase
o Slice characters
o Build disjoint sets of pixels
Recognition Technique
o Vector space & image recognition
o Build training set
Page 9
Text Extraction Phase
Step-by-step walk-through of how the algorithm solves a given problem.
• Build histogram of colors: each color’s ID and its number of pixels
• Create new B/W image: a text-only version of the CAPTCHA
http://www.wausita.com/captcha/
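The two steps of this phase can be sketched as follows. This is a minimal illustration under stated assumptions, not the presented implementation: the image is modeled as rows of palette color IDs (as in a GIF), the most frequent color is assumed to be the background, and the next two most frequent colors are assumed to be the text. The names `color_histogram` and `extract_text` are hypothetical.

```python
from collections import Counter


def color_histogram(image):
    """Count how many pixels use each color ID."""
    return Counter(pixel for row in image for pixel in row)


def extract_text(image, text_colors):
    """Build a B/W image: 0 (black) for text colors, 255 (white) otherwise."""
    return [[0 if pixel in text_colors else 255 for pixel in row]
            for row in image]


# Toy CAPTCHA: 7 = background, 3 and 5 = letters, 9 = a speck of noise.
captcha = [
    [7, 7, 7, 7, 7, 7],
    [7, 3, 3, 7, 5, 7],
    [7, 3, 7, 7, 5, 7],
    [7, 3, 3, 7, 5, 7],
    [7, 7, 7, 9, 7, 7],
]
histogram = color_histogram(captcha)

# Assumption: rank colors by pixel count, skip the background,
# and treat the next two colors as text.
ranked = [color for color, _ in histogram.most_common()]
text_colors = set(ranked[1:3])
bw = extract_text(captcha, text_colors)
```

In practice the text colors would be picked by inspecting the histogram of real CAPTCHA samples; the fixed "next two colors" rule here is only for the toy image.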
Page 10
Letter Extraction Phase
• Slicing characters
• Building disjoint sets of pixels
http://www.wausita.com/captcha/
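Both ideas of this phase can be sketched on a small B/W image. This is a hedged illustration, not the project's code: slicing is assumed to mean scanning columns and cutting wherever a column holds no black pixel, and "disjoint sets of pixels" is read here as grouping 4-connected black pixels with a union-find structure. The names `slice_letters` and `pixel_sets` are hypothetical.

```python
def slice_letters(bw):
    """Return (start, end) column ranges (end exclusive) that contain ink."""
    width = len(bw[0])
    slices, start = [], None
    for x in range(width):
        has_ink = any(row[x] == 0 for row in bw)
        if has_ink and start is None:
            start = x                      # a letter begins
        elif not has_ink and start is not None:
            slices.append((start, x))      # a letter ends
            start = None
    if start is not None:
        slices.append((start, width))      # letter touches the right edge
    return slices


def pixel_sets(bw):
    """Group black pixels into disjoint sets of 4-connected neighbors."""
    parent = {(y, x): (y, x)
              for y, row in enumerate(bw)
              for x, pixel in enumerate(row) if pixel == 0}

    def find(p):
        while parent[p] != p:
            parent[p] = parent[parent[p]]  # path halving
            p = parent[p]
        return p

    for (y, x) in list(parent):
        for neighbor in ((y + 1, x), (y, x + 1)):
            if neighbor in parent:
                parent[find(neighbor)] = find((y, x))  # union

    groups = {}
    for p in list(parent):
        groups.setdefault(find(p), set()).add(p)
    return sorted(groups.values(), key=min)


bw = [
    [255, 0, 0, 255, 0, 255],
    [255, 0, 255, 255, 0, 255],
    [255, 0, 0, 255, 0, 255],
]
slices = slice_letters(bw)   # [(1, 3), (4, 5)]
sets = pixel_sets(bw)        # two sets, with 5 and 3 pixels
```

Column slicing is simpler but fails when letters overlap horizontally; grouping connected pixels can still separate such letters as long as they do not touch.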
Page 11
Recognition Phase
• Vector space & image recognition
• Build training set
http://www.wausita.com/captcha/
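A vector-space comparison of this kind can be sketched as follows. It is a minimal illustration, assuming each letter image is flattened into a 0/1 vector and compared with labeled training samples by cosine similarity; the training-set layout and the names `to_vector`, `cosine`, and `recognise` are hypothetical.

```python
import math


def to_vector(bw):
    """Flatten a B/W letter image into a 0/1 vector (1 = black pixel)."""
    return [1 if pixel == 0 else 0 for row in bw for pixel in row]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if one is empty)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recognise(letter_image, training_set):
    """Return (score, label) of the training sample closest to the image."""
    vector = to_vector(letter_image)
    return max((cosine(vector, to_vector(sample)), label)
               for label, samples in training_set.items()
               for sample in samples)


W, B = 255, 0
# Toy training set: one 3x3 sample per letter.
training_set = {
    "I": [[[W, B, W], [W, B, W], [W, B, W]]],
    "L": [[[B, W, W], [B, W, W], [B, B, B]]],
}
# An "L" with one pixel missing.
unknown = [[B, W, W], [B, W, W], [B, B, W]]
score, guess = recognise(unknown, training_set)   # guess == "L"
```

A larger training set with several samples per character generally makes the match more tolerant of noise and distortion.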
Page 12
Algorithm Analysis
Language:
• Python 2.5 (http://www.python.org/)
• Python Imaging Library (http://www.pythonware.com/products/pil/)
Technologies:
• Image recognition & Artificial Intelligence
Running Time (48 CAPTCHAs):
• 0.42 sec per CAPTCHA
• 432,000 cracks per day
• 95,040 successful cracks per day (22% success rate)
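The daily figures can be checked with a little arithmetic. This sketch reads 95,040 as the number of successful cracks out of 432,000 attempts per day, which is an interpretation of the slide; the slide does not say how the 0.42 sec per-CAPTCHA timing relates to the daily figure, so that number is left out here.

```python
SECONDS_PER_DAY = 24 * 60 * 60          # 86,400

cracks_per_day = 432000                 # attempts per day, from the slide
successes_per_day = 95040               # read as successful cracks per day

success_rate = successes_per_day / float(cracks_per_day)
cracks_per_second = cracks_per_day / float(SECONDS_PER_DAY)
# success_rate is 0.22 (22%); cracks_per_second is 5.0
```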
Page 13
Uses and Benefits
• This algorithm helps extract text from pictures.
• It can be used to make computers smarter and more efficient.
• Many scanned documents can be converted into Word files using this algorithm.
• Software such as Evernote applies this technique in practice to convert scans and snapshots to text.
• “Scanthing” is another Android application that extracts text from pictures and documents using OCR (Optical Character Recognition) and lets you search them by keyword on your phone.
https://play.google.com/store/apps/details?id=com.evernote
Page 14
Future Aspects
• It can be integrated with speed cameras: when someone exceeds the speed limit, the camera captures the car's license plate, converts the picture to characters, and runs the result through the database.
• This technology could replace barcodes, requiring only that the product's name be scanned.
http://www.princeton.edu/~achaney/tmve/wiki100k/docs/Traffic_enforcement_camera.html
Page 15
References
• Images & Content:
   o http://www.wausita.com/captcha/
   o http://www.howstuffworks.com/captcha.htm
• Research & Study:
   o FYP thesis of BIT-5
   o http://la2600.org/talks/files/20040102/Vector_Space_Search_Engine_Theory.pdf
   o http://stackoverflow.com/questions/1752305/breaking-captchas-for-a-noblepurpose
Page 16
Demo
Live Demonstration of Algorithm
Page 17
Queries?
Thank you for your patience
www.faizanzahid.me
Page 18