Rock Paper Scissors
Alon Biran and Mor Sheffer
Image Processing on Mobile Platforms, IDC

Abstract

Rock-paper-scissors is a hand game usually played by two people, in which the players simultaneously form one of three shapes with an outstretched hand. Rock beats scissors, scissors beat paper and paper beats rock; if both players throw the same shape, the game is tied. The application described here is an implementation of this game in which the player plays against the computer. Using computer vision algorithms, the computer guesses what the player is going to do before the motion is finished, detects the final hand gesture and displays the winner.

Interfaces / UI

The UI consists of 3 screens:

Splash
Match – includes a countdown in the form of 3 pink circles. When the countdown is over, a snapshot is taken and the view changes to the Results screen.
Results – two gestures are displayed: the one detected as the user's gesture and the one the computer selected. The winner is announced, along with the score count for previous matches.

Implementation

The implementation consists of 3 major parts: Gesture Detection, Gesture Guessing and the UI Controller. It was written in both C++ and Java: the major algorithmic parts are in native code, while the analysis of the results, the GUI (including the countdown circles) and the camera control are done in Java.

Gesture Detection

To recognize the user's gesture we implemented a finger-counting technique: count the fingers and, based on the count, decide which gesture the user made (4 or more fingers => paper, 2-3 => scissors, 0-1 => rock). This part was written in native code, which supplies an interface to the application.

To count fingers, building on the technique shown in [3], we implemented the following algorithm. First, we extract the skin from the image by converting it to the YCC (YCrCb) color space (a yellowish plane) and applying a threshold obtained from experiments; we also use a changing threshold in order to deal with different lighting conditions. After thresholding, erosion and dilation are applied to fill holes and remove small unwanted interruptions, as shown in the images below.

(Figures: the skin mask after the threshold, after erosion and after dilation.)

If no skin is detected at all, this is reported as well and an "unknown" image is shown.

After that, a series of heuristic calculations is applied: first we compute a bounding box over the hand as well as a convex hull over the hand polygon, and then we detect convexity defects with respect to that convex hull. From these defects we can tell how many fingers the user showed. The finger count is sent from the native code to the GUI, and the GUI converts it into a gesture.
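The report itself does not include source code, so the following is only a minimal C++/OpenCV sketch of the finger-counting pipeline described above. The YCrCb skin range, the morphology kernel, the defect-depth threshold and the helper names (countFingers, classifyGesture) are illustrative assumptions, not the project's actual values.

#include <opencv2/imgproc/imgproc.hpp>
#include <vector>

enum Gesture { ROCK, SCISSORS, PAPER, UNKNOWN };

// Sketch of the finger-counting step: skin threshold, morphology, convex hull,
// convexity defects. All numeric thresholds here are assumptions.
static int countFingers(const cv::Mat& bgr) {
    // 1. Skin segmentation in the YCrCb color space.
    cv::Mat ycrcb, mask;
    cv::cvtColor(bgr, ycrcb, cv::COLOR_BGR2YCrCb);
    cv::inRange(ycrcb, cv::Scalar(0, 133, 77), cv::Scalar(255, 173, 127), mask);

    // 2. Erosion and dilation to remove small interruptions and fill holes.
    cv::Mat kernel = cv::getStructuringElement(cv::MORPH_ELLIPSE, cv::Size(5, 5));
    cv::erode(mask, mask, kernel);
    cv::dilate(mask, mask, kernel);

    // 3. Take the largest contour as the hand polygon.
    std::vector<std::vector<cv::Point>> contours;
    cv::findContours(mask, contours, cv::RETR_EXTERNAL, cv::CHAIN_APPROX_SIMPLE);
    if (contours.empty()) return -1;                   // no skin detected -> "unknown"
    size_t hand = 0;
    for (size_t i = 1; i < contours.size(); ++i)
        if (cv::contourArea(contours[i]) > cv::contourArea(contours[hand])) hand = i;

    // 4. Convex hull and convexity defects; deep defects are gaps between fingers.
    std::vector<int> hull;
    cv::convexHull(contours[hand], hull, false, false);
    if (hull.size() < 3) return 0;
    std::vector<cv::Vec4i> defects;
    cv::convexityDefects(contours[hand], hull, defects);
    int gaps = 0;
    for (const cv::Vec4i& d : defects)
        if (d[3] / 256.0 > 20.0) ++gaps;               // depth threshold is an assumption
    return gaps + 1;                                   // n gaps between fingers ~ n+1 fingers
}

static Gesture classifyGesture(int fingers) {
    if (fingers < 0) return UNKNOWN;
    if (fingers <= 1) return ROCK;
    if (fingers <= 3) return SCISSORS;
    return PAPER;
}

Counting deep convexity defects (the valleys between extended fingers) and adding one is a common way to approximate the finger count; the project's own heuristics may differ in detail.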
An example output image of this stage is shown in the figure: the gray rectangle is the bounding box of the hand, the blue polygon is the convex hull of the hand, the gray polygon is the hand polygon, the blue circle marks the middle of the hand, and the red circles point to the convexity defects that were found (we can see that, because two fingers were close to each other, the skin regions merged after dilation and no defect was found between them).

Gesture Guessing

In order to "try" and guess what the user is about to do, we used the Farneback optical flow method we saw in the OpenCV samples, which outputs a map of "arrows" describing the flow of differences between two images. We used that output to count up, down, left and right motion directions of the user's hand, and with some heuristics we could infer what the user was about to do. For example, when there were many downward arrows, as shown in one of the images below, we concluded that the user was going to make a rock gesture. This algorithm proved slow even after resizing the images, so the problem was handled with computer science and programming techniques rather than computer vision ones, as explained in the Controller section. This algorithm was also implemented in native C++, and its output is rock, paper or scissors. A minimal code sketch of this step is given after the references.

Controller

The algorithms are controlled by the UI, which grabs camera frames and draws the countdown circles. Since the optical flow algorithm is slow, we take one image at approximately 2.2 seconds and another at approximately 1.1 seconds during the countdown, and start a thread to compute the optical flow between them; this lets the application keep running normally while waiting for the optical flow result. When the countdown finishes, another snapshot is taken and sent to the finger-counting algorithm. When both finish (the finger counting is fast), the results are sent to the Results screen, which displays the winner; the computer's choice is the gesture that would beat the optical flow prediction.

Results

The algorithm was run over 30 gestures and the skin detection thresholds were adjusted accordingly. After adjusting the thresholds, skin was detected as expected, but we then ran into a new challenge: white light vs. yellow light. When the picture was taken under yellow light, the whole picture was detected as skin. After further improvements, by the end of the process skin was detected as expected in over 90% of the cases. Regarding the optical flow prediction, since we take only one frame before the gesture and one frame after it (because of performance limitations), the prediction depends on the user's timing in making the gesture.

Link to a movie about the project:
https://drive.google.com/file/d/0B3GkZ7Vs9JewTHA4bXY2cDJRMEk/edit?usp=sharing

References

[1] OpenCV Android Samples. (n.d.). Retrieved from http://opencv.org/platforms/android/opencv4android-samples.html
[2] OpenCV Tutorials. (n.d.). Retrieved from http://docs.opencv.org/doc/tutorials/tutorials.html
[3] Tongo, L. d. (2010). Hand detection and finger counting example. Retrieved from https://www.youtube.com/watch?v=Fjj9gqTCTfc
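As referenced in the Gesture Guessing section, below is a minimal C++/OpenCV sketch of the optical-flow-based guess. The downscale factor, the calcOpticalFlowFarneback parameters and the direction-to-gesture mapping (apart from the "mostly downward motion means rock" rule taken from the text) are illustrative assumptions rather than the project's actual choices.

#include <opencv2/imgproc/imgproc.hpp>
#include <opencv2/video/tracking.hpp>
#include <cmath>

enum Guess { GUESS_ROCK, GUESS_PAPER, GUESS_SCISSORS, GUESS_UNKNOWN };

// Sketch of the gesture-guessing step: dense Farneback optical flow between two
// frames, followed by a direction-vote heuristic.
static Guess guessGesture(const cv::Mat& prevBgr, const cv::Mat& currBgr) {
    // Downscale and convert to grayscale: dense optical flow is expensive on a phone.
    cv::Mat prevGray, currGray;
    cv::cvtColor(prevBgr, prevGray, cv::COLOR_BGR2GRAY);
    cv::cvtColor(currBgr, currGray, cv::COLOR_BGR2GRAY);
    cv::resize(prevGray, prevGray, cv::Size(), 0.25, 0.25);
    cv::resize(currGray, currGray, cv::Size(), 0.25, 0.25);

    // Dense Farneback optical flow: one (dx, dy) vector per pixel.
    cv::Mat flow;
    cv::calcOpticalFlowFarneback(prevGray, currGray, flow,
                                 0.5, 3, 15, 3, 5, 1.2, 0);

    // Count the dominant motion directions ("arrows") over all pixels.
    int up = 0, down = 0, left = 0, right = 0;
    for (int y = 0; y < flow.rows; ++y) {
        for (int x = 0; x < flow.cols; ++x) {
            const cv::Point2f& v = flow.at<cv::Point2f>(y, x);
            if (v.x * v.x + v.y * v.y < 1.0f) continue;          // ignore near-static pixels
            if (std::fabs(v.y) > std::fabs(v.x)) (v.y > 0 ? ++down : ++up);
            else                                 (v.x > 0 ? ++right : ++left);
        }
    }

    // Only the "mostly downward motion -> rock" rule comes from the report;
    // the remaining mapping is an illustrative placeholder.
    int total = up + down + left + right;
    if (total == 0) return GUESS_UNKNOWN;
    if (down > total / 2)         return GUESS_ROCK;
    if (left + right > total / 2) return GUESS_SCISSORS;
    return GUESS_PAPER;
}

In the application this routine would run on the background thread described in the Controller section, with the two downscaled snapshots taken during the countdown as its inputs.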