Anti-SIFT Images Based CAPTCHA Using Versatile Characters

Chen-Chiung Hsieh and Zong-Yu Wu
Department of Computer Science and Engineering
Tatung University
Taipei City, Taiwan
[email protected]

Abstract: Owing to the vigorous development of pattern recognition, traditional manual form-filling tasks can now be performed by automated processes. However, such automation is often misused for illicit purposes such as sending spam email or registering website accounts. To protect website owners from attacks by automated programs, this paper proposes an innovative image-based CAPTCHA that distinguishes humans from computers by embedding versatile characters in images. The proposed method makes the characters indiscernible to automated image analysis technologies such as the scale-invariant feature transform (SIFT), while humans can easily locate the embedded characters. The designed mechanism was able to withstand this kind of attack. In the experiments, 15 users were invited to test the system and the success rate was 86%. When wrong operations such as clicking outside the text boxes were excluded, the success rate reached 95%. Compared with reCAPTCHA and HELLO CAPTCHA, the average login time of the proposed method is shorter by 32 seconds and 115 seconds, respectively.

Keywords: CAPTCHA; internet verification code; internet attack; image analysis; SIFT

I. Introduction

Throughout the years, CAPTCHA [1] has provided a number of ways of asking the same question: is the user human? Why would a website want to verify this? To prevent spam. CAPTCHA, originally developed by Carnegie Mellon researchers and professors in 2000, provided webmasters with a solution to the spam that plagued their sites. By generating an image containing a string of random, warped characters, CAPTCHA forms stopped automated spam bots in their tracks. Consequently, the CAPTCHA [2] verification mechanism was applied to man-machine distinguishing. It relies on an automatically generated test that a person can pass easily but an automated program can hardly pass. However, spammers quickly adapted their automation software to bypass this kind of test using image analysis technologies: remove the CAPTCHA's background, separate and identify the individual characters, and finally type in the displayed character string.

In this study, the text-based CAPTCHA is extended from a 1D distorted string to 2D versatile characters by displaying each character with varying size, color, style, and angle on a randomly selected background image. The most difficult part for an automated process is to recognize each versatile character, which is interfered with by the non-uniform background image. To demonstrate the robustness of the proposed mechanism, an attack based on the scale-invariant feature transform (SIFT) pattern recognition technique was tried, and the system successfully resisted it. Meanwhile, human users complete the verification task through mouse clicks only.

II. Related Works

Rusu and Govindaraju [3] proposed the handwritten CAPTCHA man-machine distinguishing method shown in Fig. 1, where the alphabetic characters are displayed in a handwritten style. Users need to identify the letters in the image and input the correct answer to pass verification. Because the letters are displayed on a simple background, they can be analyzed by computers. Furthermore, some of the characters are distorted too much for users to identify and answer in a short time.

Fig. 1. Verification mechanism using handwritten characters [3].

Baird and Bentley proposed the implicit CAPTCHA method [4], where the verification mechanism only requires an easy click from the user. The system shows a picture and instructs the user where to click. The user must click on the correct location to pass the verification.
However, such a verification system is easily solved by semantic analysis, so security cannot be guaranteed.

Shirali-Shahreza and Shirali-Shahreza proposed the drawing CAPTCHA [5], in which drawing is used as the verification mechanism. The user is required to draw a correct shape: the system marks some special (diamond-shaped) points, and the user must connect these points to pass verification. Other small points in the picture serve as interference, mainly to hinder automated programs. However, they also inconvenience the user, who may have difficulty detecting the special points that form the required shape.

978-1-4799-0604-8/13/$31.00 ©2013 IEEE

Liao [6] proposed a man-machine distinguishing mechanism based on image content, in which two arbitrarily selected blocks in the image are swapped. The images used are mainly human faces because facial features are easily identified by users, as shown in Fig. 2. Thus the swapped blocks can be easily found by the user.

Fig. 2. Two blocks switched for verification. (a) Original image. (b) Image with blocks switched. Each block is 0.3×0.3 of the original image size. (c) Solid rectangles are correct answers.

On the Microsoft Hotmail website, when users register for the email service, "man-machine distinguishing" is used to tell humans from automated software. In this method, a string of randomly selected alphabetic letters is displayed with modified shapes or with interfering graphics added, as shown in Fig. 3. Such verification cannot be easily passed by automated software, but the interfering information can also confuse users. Users may therefore have to wait for an image that is easier to identify before performing the verification.

Fig. 3. Verification mechanism for Hotmail registration.

There are related studies on man-machine distinguishing from Microsoft, such as Animal Species Image Recognition for Restricting Access (ASIRRA) [7], which exploits the advantage of humans over machines in understanding image content. The test requires the user to select the cat or dog images among the pictures, as shown in Fig. 4. Microsoft collaborated with Petfinder (www.petfinder.com), which not only provides the more than 3 million images of cats and dogs in the Asirra database but also adds fun through the "adopt me" link shown below the pet images during verification. Since the pictures are provided by a pet adoption website, the camera angle of each pet image also matters. Although the classification is hard for automated programs to defeat, identification can also become difficult for users when the angle deviation is too great.

Shirali-Shahreza and Shirali-Shahreza proposed the motion CAPTCHA [8], which is based on recognition of human motion, as shown in Fig. 5. The user is presented with a movie clip in which a person performs some motion, with four choices below, each a sentence describing a motion. The user must select the correct answer to pass the test and continue using the website. However, such a test requires considerable download time before the user can view the movie.

Fig. 5. Motion CAPTCHA [8].

III. Proposed Method

In this study, a random number generation module is used to select the image and the verification characters for man-machine distinguishing. Color, size, style, and angle variations are then applied to the verification characters. Finally, the verification image is generated.

The system flow of the verification mechanism, shown in Fig. 6, first randomly selects a background image from an image bank. Five alphanumeric verification codes are then selected at random and positioned so that no text falls outside the background image.
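The selection step can be illustrated with a minimal Python sketch. It is not the authors' implementation: the character set, the non-duplicate rule, and the 300×300 and 270×270 sizes are taken from Section III-B and Fig. 7, while the function name `generate_codes`, the font-size range, the rotation range, the style names, and the centering of the inner rectangle are assumptions made only for illustration.

```python
import random
import string

# Character set described in Section III-B: 26 uppercase letters,
# 26 lowercase letters, and 10 digits.
CHARSET = string.ascii_uppercase + string.ascii_lowercase + string.digits

IMAGE_SIZE = 300   # background image size (Fig. 7)
REGION_SIZE = 270  # inner rectangle in which codes may be anchored (Fig. 7(b))


def generate_codes(rng, count=5):
    """Pick `count` distinct characters and assign random display attributes."""
    # Assumption: the inner rectangle is centered in the background image.
    margin = (IMAGE_SIZE - REGION_SIZE) // 2
    codes = []
    # sample() draws without replacement, so no character is duplicated.
    for ch in rng.sample(list(CHARSET), count):
        codes.append({
            "char": ch,
            "size": rng.randint(18, 36),    # assumed font-size range in pixels
            "color": tuple(rng.randrange(256) for _ in range(3)),  # RGB triple
            "style": rng.choice(("normal", "bold", "italic")),     # assumed styles
            "angle": rng.randint(-30, 30),  # assumed rotation range in degrees
            "x": rng.randint(margin, margin + REGION_SIZE),
            "y": rng.randint(margin, margin + REGION_SIZE),
        })
    return codes
```

Calling `generate_codes(random.Random())` returns five dictionaries, one per verification code. Passing a seeded `random.Random` makes the output reproducible for testing, whereas a deployed system would presumably want an unpredictable source of randomness.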
The text display area is set smaller than the original image, and each new code is checked for collisions with the codes already anchored.

Fig. 6. System flow.

Fig. 4. Asirra verification mechanism.

A. Text Generation

A background image is arbitrarily selected from an unrestricted image database. Five alphanumeric characters are then randomly generated and embedded into the image. These five characters, with varying size, color, style, and coordinates, are displayed on the web page for user verification. In the verification process, the user clicks the characters in the image in the order given by the verification code displayed in the text box. When a character is clicked, its color changes. Verification is passed once the fifth character has been clicked without error. If an error occurs, a verification error is shown and the system generates a new verification image, repeating until the user passes the verification.

B. Image with Verification Codes Generation

The background image arbitrarily selected from the image database is loaded into the web page using jQuery Ajax. The 26 uppercase letters, 26 lowercase letters, and 10 digits are used as the character set. Five verification codes are generated randomly without duplicates, and their size, color, and style are varied. Each character is also rotated away from its normal upright orientation. In addition, to keep characters from going beyond the image borders, as happens in Fig. 7(a), the location range for the verification codes is restricted to a 270×270 region, as shown in Fig. 7(b).

Fig. 7. (a) Situations of characters beyond the 300×300 image boundary. (b) The inner rectangle of 270×270 is the location range for code generation.

C. Character Collision Detection

During character generation, overlapping may occur. We use an array to store the upper-left and bottom-right corners of each generated character.
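A minimal Python sketch of this bookkeeping follows. It assumes that each character is represented by an axis-aligned bounding box `(left, top, right, bottom)` enclosing the rotated glyph and that a colliding character is simply re-positioned at random. The paper does not say how a collision is resolved, and the names `boxes_overlap` and `place_boxes`, the retry limit, and the default bounds are illustrative assumptions.

```python
import random


def boxes_overlap(a, b):
    """Return True if two axis-aligned boxes (left, top, right, bottom) intersect."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def place_boxes(rng, sizes, lo=15, hi=285, max_tries=1000):
    """Place one box per (width, height) inside [lo, hi] so that none overlap.

    The bounds assume a 270x270 region centered in a 300x300 image.
    """
    placed = []  # corner pairs of the characters generated so far
    for w, h in sizes:
        for _ in range(max_tries):
            left = rng.randint(lo, hi - w)
            top = rng.randint(lo, hi - h)
            box = (left, top, left + w, top + h)
            # Compare the new box with every stored box.
            if not any(boxes_overlap(box, other) for other in placed):
                placed.append(box)
                break
        else:
            # Assumed policy: give up after max_tries random positions.
            raise RuntimeError("could not place box without overlap")
    return placed
```

For example, `place_boxes(random.Random(), [(30, 30)] * 5)` returns five non-overlapping 30×30 boxes. Comparing each new box with every stored box is linear in the number of placed characters, which is negligible for five codes.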
Character collision detection can be easily implemented by comparing the location of a newly generated character with the locations of the already generated characters stored in the array. Figure 8(a) gives an example of a generated image for user verification. The verification code in normal format is displayed under the image, and the user needs to click the characters in the order given by the verification code. Figure 8(b) shows the four corner locations of the five generated characters.

Fig. 8. (a) An example of a generated image with the verification code embedded. The verification code is displayed under the image for the user to find in the image. (b) The corners for "2OJzY" are listed as lists a to e.

D. SIFT Character Matching

Scale-invariant feature transform (SIFT) [9, 10] is a computer vision algorithm for detecting and describing local features in images. Lowe's method for image feature generation transforms an image into a large collection of keypoints with feature vectors. Each keypoint is invariant to image translation, scaling, and rotation, partially invariant to illumination changes, and robust to local geometric distortion. SIFT feature extraction comprises four steps: scale-space extrema detection, keypoint localization, orientation assignment, and keypoint description.

IV. Experimental Results

A. User Verification Test

To test the system, users need a computer connected to the developed web server. Fifteen subjects, male and female, aged 23 to 27, were invited to take part in three experiments. In Experiment one, each subject performed the verification 10 times. For comparison, reCAPTCHA [11] and HELLO CAPTCHA [12] were also tested 10 times by each person. To avoid operation errors caused by misunderstanding, each subject was told the standard verification procedure in advance.
Table I summarizes the results, which show that our method surpassed reCAPTCHA and performed equally to HELLO CAPTCHA. Our execution time was 102 seconds, which is 32 seconds less than reCAPTCHA's 134 seconds and 115 seconds less than HELLO CAPTCHA's 217 seconds. The proposed verification mechanism is superior to reCAPTCHA because the distorted characters in reCAPTCHA were too difficult for users to read. The situation was worse in HELLO CAPTCHA owing to occluded characters in its dynamic presentation; in addition, downloading it was time-consuming. Rather than requiring users to spend time recognizing text and typing in verification codes, the proposed mechanism requires only simple clicks to pass the verification, which saves the user time.

TABLE I. PASS RATES FOR RECAPTCHA, HELLO CAPTCHA AND THE PROPOSED VERIFICATION MECHANISM.

TABLE II. PASS RATES FOR ROTATED CODES OF PROPOSED METHOD

In Experiment two, the characters were generated with rotation and the same procedure was carried out by each subject 10 times. Table II summarizes the results, which show that the success rate declined by only one percent. This suggests that humans can easily find a character even with some degree of rotation.

To investigate the factors that caused failures, Experiment three asked each subject to find the five characters 10 times without aborting on error. Hence, each subject needed to find 50 characters in total from 10 images. The character recognition rate was 95% on average. The causes of failure include a background of similar color, a background of high illumination, a background containing duplicate characters, a complex background, and characters that happened to blend into the background.

B. SIFT Attacks

The SIFT experiment was mainly intended to verify whether the proposed mechanism can be easily defeated. The more keypoints that are matched, the more likely the attack is to succeed. Figure 9 gives two examples of SIFT matching for the characters 'w' and '4'.
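The paper does not state which criterion was used to decide that two keypoints match. One common choice is the nearest-neighbor distance ratio test from Lowe [10], which keeps a match only when the closest descriptor is clearly closer than the second closest, with a threshold of 0.8. The Python sketch below counts such matches between a character template and a CAPTCHA image under that assumption. The function name `count_ratio_matches` is hypothetical, and the descriptors are assumed to have been computed beforehand by some SIFT implementation.

```python
import math


def count_ratio_matches(template_desc, image_desc, ratio=0.8):
    """Count template descriptors that pass the distance ratio test.

    Each descriptor is a sequence of floats. Real SIFT descriptors have
    128 dimensions, but any common length works here.
    """
    matches = 0
    for d in template_desc:
        # Euclidean distances from this template descriptor to all image descriptors.
        dists = sorted(math.dist(d, e) for e in image_desc)
        # Accept only if the best match is clearly better than the runner-up.
        if len(dists) >= 2 and dists[0] < ratio * dists[1]:
            matches += 1
    return matches
```

A low count for every character template would indicate that the background and the style variations leave too few reliable correspondences for an attacker to locate the codes.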
It was found that, in both color and grayscale images, the embedded characters could not be correctly extracted because of the various interferences in the image. On the other hand, if the deployed background image is simple enough, the verification codes of the proposed mechanism may not be protected.

Fig. 9. Two SIFT attacks. (a) Codes in color image. (b) Codes in grey image. (c) Codes in simple background image.

V. Conclusion

With the widespread use of the internet, people increasingly rely on automated mechanisms to process massive amounts of data. To keep these mechanisms from being misused, we need to distinguish between man and machine so as to reduce the risks brought by automation. The rights and interests of users can be effectively protected only if humans and machines can be told apart. In this paper, a novel man-machine distinguishing mechanism that combines several verification mechanisms is proposed. A web server was built with this mechanism, which varies the size, color, style, angle, and position of the verification codes while keeping them readable by humans, to test the feasibility of the proposed method. The user only needs to click the codes according to the displayed text. The experimental results show that the human pass rate is higher than that of some existing approaches, and that the human verification time of our method is considerably shorter than that of the other approaches. To test the robustness of the proposed method, the well-known SIFT matching algorithm was used to attack the system, and the experimental results show that the mechanism is capable of defending against this kind of attack.

Acknowledgment

This work was supported in part by the Division of Research and Development, Tatung University, Taipei, Taiwan, under grant B101-I03-035, 2013.

References

[1] http://www.techi.com/2010/05/are-you-human-captchas-many-ways-of-asking-the-same-question/
[2] L. von Ahn, M. Blum, N. J. Hopper, and J. Langford, "CAPTCHA: telling humans and computers apart (automatically)," in Eurocrypt '03 Advances in Cryptology, Lecture Notes in Computer Science, vol. 2656, pp. 294-311, 2003.
[3] A. Rusu and V. Govindaraju, "Handwritten CAPTCHA: using the difference in the abilities of humans and machines in reading handwritten words," in Proceedings of the 9th Int'l Workshop on Frontiers in Handwriting Recognition, Sept. 2004.
[4] H. S. Baird and J. L. Bentley, "Implicit CAPTCHAs," in Proceedings of the SPIE on Document Recognition and Retrieval XII, vol. 5676, pp. 191-196, 2004.
[5] Shirali-Shahreza, "Drawing CAPTCHA," in Proceedings of the 28th International Conference Information Technology Interfaces (ITI 2006), Cavtat, Dubrovnik, Croatia, pp. 475-480, June 19-22, 2006.
[6] W. H. Liao, "A CAPTCHA mechanism based on textured patterns," A Thesis of the Department of Computer Science, National Chengchi University, 2007.
[7] J. Elson, J. R. Douceur, J. Howell, and J. Saul, "Asirra: a CAPTCHA that exploits interest-aligned manual image categorization," in Proc. of ACM Conference on Computer and Communications Security, pp. 366-374, 2007.
[8] M. Shirali-Shahreza and S. Shirali-Shahreza, "Dynamic CAPTCHA," in Proceedings of the 2008 International Symposium on Communications and Information Technologies (ISCIT 2008), Vientiane, Lao, pp. 436-440, Oct. 2008.
[9] D. G. Lowe, "Object recognition from local scale-invariant features," in Proc. of the International Conference on Computer Vision, Corfu, Greece, pp. 1150-1157, Sep. 1999.
[10] D. G. Lowe, "Distinctive image features from scale-invariant keypoints," International Journal of Computer Vision, vol. 60, no. 2, pp. 91-110, 2004.
[11] ReCAPTCHA: http://www.google.com/recaptcha.
[12] HELLO CAPTCHA: http://hellocaptcha.com/index.php.