Dan Sharpe and Erik Kroeker
PSY/ORF 322 Final Project

Obscured Images and the Visual Processor

Introduction

Human visual processing is a remarkable process that occurs constantly as we navigate the world. Our eyes can take in more visual data than our brains can deal with on a constant basis, so our eyes and brain work together to make assumptions and recognize things from minimal raw visual data. One example is our ability to recognize basic features of things happening in our peripheral vision while focusing on something else. Even if we aren’t looking directly at a sign as we walk, we can make out its basic shapes and features. If we had to use our narrow field of sharp central vision for simple object recognition and edge detection, simple tasks such as walking down the street or driving a car would require much more information.

The ability to process imperfect or incomplete images is also helpful with physically obscured objects and in low-visibility situations. Objects are often only partially visible because of their location in space. In such situations, visual information must be combined with background information to make assumptions about the obscured object. Expectations can be combined with a few visual cues to infer the identity of the object or person in question.

In our experiment, we gave subjects an increasing amount of information about a picture by randomly revealing its pixels until the picture could be identified. This experiment gives us insight into the resolution of visual information needed to identify objects. This is part of the reason that we are not overwhelmed with visual information at any given time: we do not need much information to know what something is or what is going on around us.
Had context clues been given with the pictures in our experiment, it is reasonable to assume that the pictures would have been recognized sooner.

Experimental Methods

To help test how much visual data a person needs to identify an image, we designed a computer program. The program needed to serve two main functions: display the partially obscured images, and record the amount of visual information a subject required to identify each one. The images were 24-bit color, 200 by 160 pixel bitmaps. All of the images were taken from photographs, were properly oriented, and had not had their coloring altered. The program used a series of four such images (see Figures 1-4). We anticipated that different images would require different amounts of information to identify, so we attempted to cover a variety of images in our test. The images represented two objects (a car and a plane) and two landscapes (the Sphinx and a pyramid, and a farm).

Figure 1: A Car
Figure 2: A Plane
Figure 3: Sphinx and Pyramid
Figure 4: A Farm

In order to determine how much of an image needed to be revealed before it could be identified, we decided to randomly display incrementally more pixels of the image until the subject could identify it. We chose to reveal many small pieces (pixels) of the image at once, rather than the leading alternative of showing larger blocks, so as to avoid having the image identified prematurely because a block happened to show a very distinguishable feature (e.g., a tire of the car). This approach was also chosen to better simulate peripheral vision, which does not consist of several separated, high-resolution blocks and is more accurately represented by a roughly uniformly distributed, low-resolution cloud of color.

To model this process, the program was designed to read the position and color data of each pixel from an image and to randomly order the pixels in a list.
The ordering was random rather than a geometric pattern so that no image would be more, or less, easily identifiable because of the pattern chosen.

Once the image data had been randomly sequenced, a subject (who was unaware of the content of the picture) was instructed to click a button labeled ‘Reveal.’ Upon clicking the button, the first one hundred pixels from the list of randomized pixels were displayed on the screen (see Figure 5). Each pixel was drawn as a two-pixel by two-pixel square, so that the image appeared at two hundred percent of its original size. The images were scaled up so that subjects would not fail to identify an image because it was too small, while keeping the pixel count low enough that the randomization process was not slowed too drastically.

Figure 5: Program Still, ‘Reveal’ Button

The subject was then instructed to continue clicking the ‘Reveal’ button, revealing more pixels with each click, until he or she could identify the content of the image. Once the subject felt able to identify the image, he or she told the test conductor what he or she thought the image was. If the identification was correct, the subject was instructed to click the ‘Next Image’ button and proceed to the next image in the series (see Figure 6). If it was incorrect, the test conductor said so and prompted the subject to continue clicking the ‘Reveal’ button until he or she could identify the image correctly.

Figure 6: Program Still, ‘Next Image’ Button

This process was repeated until the subject correctly identified the fourth image, at which time the subject was instructed to click the ‘Done’ button. The ‘Done’ button replaced the ‘Next Image’ button when the subject began to reveal the pixels of the final image of the set. For each image, the program recorded the number of times the subject clicked the ‘Reveal’ button before correctly identifying it.
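The reveal procedure described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the program used in the experiment: the function names and the seed parameter are hypothetical, and it assumes that every click reveals the next one hundred pixels from the shuffled list.

```python
import random

WIDTH, HEIGHT = 200, 160   # image size stated in the text
PIXELS_PER_CLICK = 100     # assumption: every click reveals the next 100 pixels


def make_reveal_order(width, height, seed=None):
    """Return every (x, y) position of the image in random order."""
    rng = random.Random(seed)  # hypothetical seed, only for repeatable runs
    order = [(x, y) for y in range(height) for x in range(width)]
    rng.shuffle(order)
    return order


def pixels_shown(order, clicks, per_click=PIXELS_PER_CLICK):
    """Positions visible after the given number of 'Reveal' clicks."""
    return order[:clicks * per_click]


def fraction_revealed(clicks, total_pixels, per_click=PIXELS_PER_CLICK):
    """Share of the image visible after the given number of clicks."""
    return min(clicks * per_click, total_pixels) / total_pixels


def scaled_square(x, y, scale=2):
    """Screen rectangle (left, top, width, height) for one source pixel
    when the image is drawn at 200 percent of its original size."""
    return (x * scale, y * scale, scale, scale)


order = make_reveal_order(WIDTH, HEIGHT, seed=1)
visible = pixels_shown(order, clicks=16)
print(len(visible), fraction_revealed(16, WIDTH * HEIGHT))
```

Under these assumptions, sixteen clicks show 1,600 of the 32,000 pixels, or 5 percent of the image. Shuffling the full list once and then taking successive slices guarantees that no pixel is revealed twice.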
When the ‘Done’ button was pressed, the program exported the number of clicks the subject had made to an Excel file for analysis (see Figure 7).

Figure 7: Program Still, ‘Done’ Button

The subjects were not given any clues as to the content of the images other than being told that the images were from photographs, were properly oriented, and had not been distorted in any way. The subjects were told that they were not required to identify specifics about the image (e.g., the model of the car). The test conductors were not to answer any questions about the nature of the image, except to say whether a specific answer was correct. This prevented a subject from gaining insight into the nature of the image through a question such as “is it some kind of vehicle?”

The test was run by two conductors with a subject group of thirty students at Princeton University. Subjects were both male and female, and the gender of the subjects was not recorded with the data sets. No data sets were omitted, and measures were taken to prevent subjects from taking the test twice.

Results and Discussion

The data revealed trends both among individual subjects and among the images being observed. While collecting the data, we noticed that variation in individuals’ readiness to respond affected the data. Some subjects were more hesitant to venture a guess, which increased the percentage of the picture revealed before they identified the image. Conversely, some subjects ventured guesses when only a very low percentage of the picture had been revealed. These variations had minimal effect on our aggregate data, since a subject was about as likely to guess conservatively as not, and thus the effects largely cancelled out. This can be observed in Figure 8.
Figure 8: Individual Subject Series Trends
(Chart title: Individual Series Percentage Required to Identify Image. Vertical axis: Amount of Image Revealed before Identified, 0 to 1. Horizontal axis: Car, Biplane, Sphinx and Pyramid, Farm. One line per subject, Series1 through Series30, plus the Average.)

The bold black line in Figure 8 represents the average of all subjects, which is roughly what one would expect from an “average” person. The lines of the individual subjects tend to follow the same trends as the average: increasing from picture one to picture two, and increasing from picture three to picture four. This trend is also evident in the summary of the data in Table 1.

                        Mean     Standard Deviation   Minimum   Maximum
1-Car                   24.38%   6.83%                5.00%     41.88%
2-Biplane               32.44%   16.51%               10.94%    80.63%
3-Sphinx and Pyramid    28.15%   10.77%               11.88%    63.44%
4-Farm                  54.23%   15.02%               24.06%    86.88%

Table 1: Statistical Summary of Data

The table shows that, on average, subjects needed the least information to identify picture one and the most information to identify picture four. While the mean indicates how difficult a picture was to distinguish, the standard deviation seems to indicate how much recognition relied on prior information or familiarity with the object or place in the picture. The picture with the highest standard deviation, the plane, was likely more recognizable to people somewhat familiar with planes than to people who were not. This would explain why some people identified the plane as quickly as the car while others took almost twice as long, even though the average remained relatively close to those of the car and of the Sphinx and pyramid. When we counted the frequency of subjects identifying the pictures in 5% windows and plotted this frequency, we produced the graph in Figure 9.
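The counting into 5% windows can be illustrated with a short Python sketch. It uses only the Picture 1 (car) column of Appendix A and assumes those values are pixels revealed out of 32,000. The text does not say how window boundaries were handled, so the choice that each window includes its lower edge and excludes its upper edge is an assumption, and the names are hypothetical.

```python
from collections import Counter

TOTAL_PIXELS = 200 * 160          # 32,000 pixels per image
WINDOW = TOTAL_PIXELS * 5 // 100  # 5% of the image = 1,600 pixels

# Picture 1 (car) column of Appendix A, one value per subject.
CAR = [1600, 13400, 6400, 10600, 10200, 9600, 9000, 4700, 6600, 7700,
       5400, 9900, 8800, 8600, 6900, 8200, 8400, 7500, 10100, 8400,
       8300, 5400, 6600, 7100, 8700, 7800, 6600, 8400, 8100, 5000]


def window_counts(pixels_revealed, window=WINDOW):
    """Count subjects per 5% window, keyed by the window's lower edge in percent.

    Assumption: a window covers [lower, lower + 5) percent of the image.
    Integer division on pixel counts avoids floating-point edge effects.
    """
    counts = Counter((p // window) * 5 for p in pixels_revealed)
    return dict(sorted(counts.items()))


for lower, n in window_counts(CAR).items():
    print(f"{lower:>2}-{lower + 5}%: {n}")
```

With this boundary convention, the busiest window for the car is 25 to 30 percent, with ten of the thirty subjects. A different convention for values that fall exactly on a boundary would shift a few counts between neighboring windows.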
Figure 9: Frequency Distribution
(Chart title: Percentage Frequency Distribution. Vertical axis: Frequency, 0 to 12. Horizontal axis: Percentage Range in which Image was Identified, 0 to 1 in steps of 0.05. Series: Car, Bi-plane, Sphinx and Pyramid, Farm.)

Figure 9 illustrates a few features of our results that we were pleased to see. The distributions are fairly bell shaped, with the car curve and the Sphinx and pyramid curve lying almost on top of one another, and the mean of the farm curve shifted to a higher percentage range. This suggests that most people are able to identify a given picture within a fairly specific range of percentages. The plane curve is also slightly bell shaped, but it is flatter and more spread out than the other three. This is to be expected given the high standard deviation, and is best explained by the theory that how quickly people identified the plane depended on how familiar they were with the object to begin with. Even though these curves take on different shapes, there is clearly a fairly well-defined zone within which most people will be able to identify a picture, roughly between 20 and 50 percent. Only 13 of the 120 identifications required more than 50-55 percent of the image, and only 9 of the 120 were made before 20 percent of the image was revealed.

Conclusion

The goal of our experiment was to find a range within which the eyes and brain would be able to distinguish an object or place. As our data show, most of the image recognition occurred when 20-50 percent of the image had been revealed. There is some dependence on the image itself, and also on the familiarity of the person with the image, but this is to be expected. We feel confident that if this experiment were carried out on a grander scale, with more subjects and more pictures, our results would be further supported, and that it might reveal other correlations, for example with eyesight or with familiarity with the image.
Appendix A

Raw Data
(Values are pixels revealed before correct identification; each image contains 200 x 160 = 32,000 pixels.)

Picture 1   Picture 2   Picture 3   Picture 4
1600        7800        6800        11100
13400       14600       13400       15700
6400        8200        4200        14500
10600       14900       12100       19400
10200       11200       7300        18200
9600        5600        8200        15000
9000        6100        7600        10400
4700        3700        17200       22900
6600        5300        10600       16500
7700        4500        7200        15500
5400        3500        6600        14900
9900        25800       8900        21500
8800        13200       9500        13900
8600        9300        7900        25000
6900        20200       8000        7700
8200        10500       9100        17100
8400        10000       8900        15200
7500        9900        8500        16100
10100       9700        9700        16200
8400        6300        7600        15300
8300        11400       12500       15000
5400        8000        3800        17500
6600        9200        4600        16200
7100        11400       8700        23000
8700        18000       8000        26700
7800        12200       20300       27800
6600        4600        9400        17900
8400        19900       8500        25800
8100        7700        5800        15400
5000        8700        9300        13200
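As a cross-check, the summary in Table 1 can be recomputed from these values. The following is a minimal sketch, assuming the values are pixels revealed out of 32,000 per image. Whether the original analysis used the sample or the population standard deviation is not stated, so the sample form used here is an assumption, and the function name is hypothetical.

```python
from statistics import mean, stdev

TOTAL_PIXELS = 200 * 160  # each image is 200 by 160 pixels

# Columns of Appendix A, one value per subject.
RAW = {
    "Car": [1600, 13400, 6400, 10600, 10200, 9600, 9000, 4700, 6600, 7700,
            5400, 9900, 8800, 8600, 6900, 8200, 8400, 7500, 10100, 8400,
            8300, 5400, 6600, 7100, 8700, 7800, 6600, 8400, 8100, 5000],
    "Biplane": [7800, 14600, 8200, 14900, 11200, 5600, 6100, 3700, 5300, 4500,
                3500, 25800, 13200, 9300, 20200, 10500, 10000, 9900, 9700, 6300,
                11400, 8000, 9200, 11400, 18000, 12200, 4600, 19900, 7700, 8700],
    "Sphinx and Pyramid": [6800, 13400, 4200, 12100, 7300, 8200, 7600, 17200,
                           10600, 7200, 6600, 8900, 9500, 7900, 8000, 9100,
                           8900, 8500, 9700, 7600, 12500, 3800, 4600, 8700,
                           8000, 20300, 9400, 8500, 5800, 9300],
    "Farm": [11100, 15700, 14500, 19400, 18200, 15000, 10400, 22900, 16500,
             15500, 14900, 21500, 13900, 25000, 7700, 17100, 15200, 16100,
             16200, 15300, 15000, 17500, 16200, 23000, 26700, 27800, 17900,
             25800, 15400, 13200],
}


def summarize(pixels_revealed, total=TOTAL_PIXELS):
    """Return mean, standard deviation, minimum and maximum in percent."""
    pct = [100 * p / total for p in pixels_revealed]
    return {
        "mean": mean(pct),
        "sd": stdev(pct),  # assumption: sample standard deviation
        "min": min(pct),
        "max": max(pct),
    }


for name, column in RAW.items():
    s = summarize(column)
    print(f"{name:<20} {s['mean']:6.2f}% {s['sd']:6.2f}% "
          f"{s['min']:6.2f}% {s['max']:6.2f}%")
```

The means, minima and maxima computed this way agree with Table 1; for example, the car column averages 7,800 pixels, which is 24.38 percent of the image after rounding.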