Obscured Images and the Visual Processor

Dan Sharpe and Erik Kroeker
PSY/ORF 322
Final Project
Introduction
Human visual processing is a remarkable process that occurs constantly as we
navigate the world. Our eyes can take in more visual data than our brains can
deal with on a constant basis, so our eyes work with our brain to make
assumptions and recognize things from minimal raw visual data. One example is
our ability to recognize basic features of things happening in our peripheral vision while
focusing on something else. Even if we aren’t looking directly at a sign as we walk, we
can make out the basic shapes and features of the sign. If it were necessary for us to use our
narrow field of higher-level vision for simple object recognition and edge detection,
simple tasks like walking down the street or driving a car would require much more
information. Other instances where it is helpful to be able to process imperfect
or incomplete images are physically obscured objects and low-visibility situations.
Oftentimes objects are only partially visible because of their location in space. In situations such as
these, it is necessary to combine visual information with background information to make
assumptions about the obscured object. Expectations can be combined with a few visual
cues to infer the identity of the object or person in question.
In the case of our experiment, we gave an increasing amount of information about
a picture by randomly revealing pixels in the picture, until the picture could be identified.
This experiment gives us insight into the resolution of visual information
needed to identify objects. That we do not need much information
to know what something is or what is going on around us is part of the reason we are
not overwhelmed with visual information to process at any given time. Had context clues
been given with the pictures in our experiment, the pictures would presumably have been
recognizable sooner.
Experimental Methods
To help test how much visual data a person needs to identify an image, we
designed a computer program. The program needed to serve two main functions: to display
the partially obscured images, and to record the amount of visual information required for a
subject to identify each one.
The images were 24-bit color, 200 by 160 pixel bitmaps. All of the images were
taken from photographs, were properly oriented, and had not had their coloring altered.
The program used a series of four such images (see Figures 1-4). It was anticipated that
different images would require different amounts of information to identify, so we
attempted to cover a variety of images in our test. The images represented two objects (a
car and a plane) and two landscapes (the Sphinx and a pyramid, and a farm).
Figure 1: A Car
Figure 2: A Plane
Figure 3: Sphinx and Pyramid
Figure 4: A Farm
In order to determine how much of the image needed to be revealed before the
image could be identified, we decided to randomly display incrementally more pixels of
the image until the subject could identify it. We chose to reveal many small pieces
(pixels) of the image at once, rather than the leading alternative of showing larger
blocks of the image, to avoid having the image identified prematurely because a
block happened to show a very distinguishable feature (e.g., a tire of the car). This approach was also
chosen to better simulate peripheral vision, since peripheral vision does not consist
of several separated, high-resolution blocks. It is more accurately represented by
a roughly uniformly distributed, low-resolution cloud of color.
To model this process, the program was designed to read the position and color
data of each pixel from an image, and randomly order the pixels in a list. The ordering
was random rather than a geometric pattern to avoid an image being more, or less, easily
identifiable because of the geometric pattern chosen.
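The random ordering described above can be sketched as follows. This is a minimal illustration, not the original program: the function name `randomized_pixel_list`, the nested-list image representation, and the optional `seed` parameter are all assumptions made for the example.

```python
import random


def randomized_pixel_list(image, seed=None):
    """Return every pixel of `image` as (x, y, color) tuples in random order.

    `image` is assumed to be a list of rows, each row a list of (r, g, b)
    tuples. This representation is hypothetical, chosen for the sketch.
    """
    # Read the position and color of each pixel into one flat list.
    pixels = [
        (x, y, color)
        for y, row in enumerate(image)
        for x, color in enumerate(row)
    ]
    # Shuffle in place; the seed is optional and only makes a run repeatable.
    random.Random(seed).shuffle(pixels)
    return pixels


if __name__ == "__main__":
    # Tiny 3x2 stand-in image; the study's images were 200 by 160 pixels.
    tiny = [[(x * 10, y * 10, 0) for x in range(3)] for y in range(2)]
    for entry in randomized_pixel_list(tiny, seed=1):
        print(entry)
```

A uniform shuffle gives every pixel the same chance of appearing early in the list, which is what keeps the reveal free of any geometric pattern.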
Once the image data had been randomly sequenced, a subject (who was unaware of
the content of the picture) was instructed to click a button labeled ‘Reveal.’ Upon clicking
the button, the first one hundred pixels from the list of randomized pixels were displayed on
the screen (see Figure 5).
The pixels were drawn as two pixel by two pixel squares and were displayed such
that the image is scaled to two hundred percent of its original size. The images were
scaled up to avoid having subjects be unable to identify an image because it was too
small, while also keeping the images from having so many pixels as to slow the
randomization process down too drastically.
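One way to picture the reveal step and the two hundred percent scaling is the sketch below. It assumes the randomized list holds `(x, y, color)` tuples and that the display is a simple grid of colors. The names `blank_canvas` and `reveal_next` and the black background are hypothetical, since the source does not say how the original program drew to the screen or what unrevealed pixels looked like.

```python
PIXELS_PER_CLICK = 100  # pixels revealed per 'Reveal' click, per the text
SCALE = 2               # each source pixel is drawn as a 2x2 square
BACKGROUND = (0, 0, 0)  # assumption: the background color is not stated


def blank_canvas(width, height):
    """Create an empty display grid at SCALE times the image size."""
    return [[BACKGROUND] * (width * SCALE) for _ in range(height * SCALE)]


def reveal_next(canvas, order, revealed):
    """Draw the next batch of pixels and return the new revealed count.

    `order` is the randomized list of (x, y, color) tuples and `revealed`
    is how many of its entries have already been drawn.
    """
    batch = order[revealed:revealed + PIXELS_PER_CLICK]
    for x, y, color in batch:
        # Fill the SCALE x SCALE square that stands for this source pixel.
        for dy in range(SCALE):
            for dx in range(SCALE):
                canvas[y * SCALE + dy][x * SCALE + dx] = color
    return revealed + len(batch)
```

With a 200 by 160 image, each click under these assumptions fills 400 cells of a 400 by 320 canvas.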
Figure 5: Program Still, ‘Reveal’ Button
The subject was then instructed to continue clicking the ‘Reveal’ button until he or she could
identify the content of the image. Once the subject felt able to
identify the image, he or she told the test conductor what he or she thought the
image was. If the identification was correct, the subject was instructed to click the
‘Next Image’ button and proceed with the next image in the series (see Figure 6). If the
subject was incorrect, the test conductor informed the subject that the answer was
incorrect and prompted the subject to continue clicking the ‘Reveal’ button until he or she
could identify the image correctly.
Figure 6: Program Still, ‘Next Image’ Button
This process was repeated until the subject properly identified the fourth image, at
which time the subject was instructed to click the ‘Done’ button. The ‘Done’ button
replaced the ‘Next Image’ button when the subject began to reveal the pixels of the final
image of the set. The program recorded the number of times the subject clicked the
‘Reveal’ button before he or she properly identified each image. When the ‘Done’ button was
pressed, the program exported the number of clicks the subject made to an Excel file to be
analyzed (see Figure 7).
Figure 7: Program Still, ‘Done’ Button
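The click recording and export can be sketched as below. The original program wrote to an Excel file. This sketch swaps in CSV output from Python's standard `csv` module, which a spreadsheet can open. The class name `ClickLog` and its methods are hypothetical.

```python
import csv
import io


class ClickLog:
    """Count 'Reveal' clicks per image and export them as one CSV row."""

    def __init__(self, image_names):
        # One counter per image, in presentation order.
        self.clicks = {name: 0 for name in image_names}

    def record_reveal(self, image_name):
        """Call once for every click of the 'Reveal' button."""
        self.clicks[image_name] += 1

    def export(self, stream):
        """Write a header row and one row of click counts to `stream`."""
        writer = csv.writer(stream, lineterminator="\n")
        writer.writerow(list(self.clicks.keys()))
        writer.writerow(list(self.clicks.values()))


if __name__ == "__main__":
    log = ClickLog(["Car", "Plane"])
    for _ in range(3):
        log.record_reveal("Car")
    log.record_reveal("Plane")
    buffer = io.StringIO()  # stands in for a file opened when 'Done' is pressed
    log.export(buffer)
    print(buffer.getvalue())
```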
The subjects were not given any clues as to the content of the images other than being
told that the images were from photographs, were properly oriented, and had not been
distorted in any way. The subjects were told that they were not required to identify
specifics about the image (e.g., the model of the car). The test conductors were not to answer any
questions about the nature of the image, except to say whether a specific answer was
correct or not. This prevented a subject from gaining insight into the nature of the image
through a question such as “is it some kind of vehicle?” or the like.
The test was run by two conductors with a subject group of thirty students at
Princeton University. Subjects were both male and female, and no records were made as
to the gender of the subjects in relation to the data sets. No data sets were omitted and
measures were taken to prevent subjects from taking the test twice.
Results and Discussion
The data revealed trends both among individual subjects and among the
images being observed. While collecting the data, we noticed that differences in individuals’
readiness to respond affected the data. Some subjects were more hesitant to venture a
guess, which increased the percentage of the picture revealed before they
identified the image. Conversely, some subjects would venture guesses with very low
percentages of the picture revealed. These individual differences
had minimal effect on our averaged data: we assume a subject was as likely to be conservative in
guessing as not, so the effects largely cancel out. The individual
patterns can be observed in Figure 8.
[Chart: Individual Series Percentage Required to Identify Image. Vertical axis: Amount of Image Revealed before Identified, from 0 to 1. Horizontal axis: Car, Biplane, Sphinx and Pyramid, Farm. One line per subject (Series1 through Series30), plus the Average.]
Figure 8: Individual Subject Series Trends
The bold black line in Figure 8 represents the average of all subjects, that is, what one
would expect from an “average” person. The lines of the individual subjects tend to follow the
same trends as the average, increasing from picture one to picture two and
increasing from picture three to picture four. This trend is also evident in the summary
of data in Table 1.
Picture                Mean     Standard Deviation   Minimum   Maximum
1-Car                  24.38%   6.83%                5.00%     41.88%
2-Biplane              32.44%   16.51%               10.94%    80.63%
3-Sphinx and Pyramid   28.15%   10.77%               11.88%    63.44%
4-Farm                 54.23%   15.02%               24.06%    86.88%
Table 1: Statistical Summary of Data
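The summary statistics can be recomputed from the raw data as sketched below. Two assumptions are made here that the source does not state outright: the raw values in Appendix A are read as the number of pixels revealed (clicks times one hundred), and the standard deviation is taken to be the sample standard deviation. Under those assumptions the Picture 1 column should reproduce the Car row of Table 1 to within rounding. The function name `summarize` is hypothetical.

```python
import statistics

TOTAL_PIXELS = 200 * 160  # image size given in Experimental Methods

# Picture 1 (car) column of Appendix A, read as pixels revealed (assumption).
CAR_PIXELS = [
    1600, 13400, 6400, 10600, 10200, 9600, 9000, 4700, 6600, 7700,
    5400, 9900, 8800, 8600, 6900, 8200, 8400, 7500, 10100, 8400,
    8300, 5400, 6600, 7100, 8700, 7800, 6600, 8400, 8100, 5000,
]


def summarize(pixel_counts, total_pixels=TOTAL_PIXELS):
    """Return mean, sample standard deviation, minimum and maximum
    of the fraction of the image revealed before identification."""
    fractions = [count / total_pixels for count in pixel_counts]
    return {
        "mean": statistics.mean(fractions),
        "stdev": statistics.stdev(fractions),  # sample (n - 1) form, assumed
        "min": min(fractions),
        "max": max(fractions),
    }


if __name__ == "__main__":
    for name, value in summarize(CAR_PIXELS).items():
        print(f"{name}: {value:.2%}")
```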
The table shows that, on average, subjects needed the least information to identify picture
one and the most information to identify picture four. While the mean acts
as an indicator of how difficult the picture was to distinguish, the standard deviation
seems to indicate how much image recognition relied on prior information or familiarity
with the object or place in the picture. The picture with the highest standard deviation, the
plane, was likely more recognizable to people somewhat familiar with planes than to
people who were not. This would explain why some people identified the plane as quickly as
the car while others took almost twice as long, even though the average was still relatively
close to those of the car and the pyramid. When we counted the number of subjects
identifying each picture within 5% windows and plotted these frequencies, we produced the
graph in Figure 9.
[Chart: Percentage Frequency Distribution. Vertical axis: Frequency, from 0 to 12. Horizontal axis: Percentage Range in which Image was Identified, from 0 to 1 in steps of 0.05. Series: Car, Bi-plane, Sphinx and Pyramid, Farm.]
Figure 9: Frequency Distribution
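The counting behind a frequency distribution of this kind can be sketched as follows. The source does not say how values falling exactly on a window edge were treated, so this sketch assumes half-open windows (a value of exactly 20% falls in the 20-25% window). It also assumes the Appendix A values are pixel counts out of 200 by 160 pixels. The function name `frequency_by_window` is hypothetical, and the sample input is the first five Picture 1 values from Appendix A.

```python
from collections import Counter

TOTAL_PIXELS = 200 * 160  # image size given in Experimental Methods
WINDOW_PERCENT = 5        # width of each window, in percent


def frequency_by_window(pixel_counts, total_pixels=TOTAL_PIXELS):
    """Count identifications per 5% window.

    Keys are each window's lower bound in percent (5 means 5% up to,
    but not including, 10%). Integer arithmetic avoids floating-point
    trouble at window edges.
    """
    counts = Counter()
    for pixels in pixel_counts:
        window = (pixels * 100) // (total_pixels * WINDOW_PERCENT)
        counts[window * WINDOW_PERCENT] += 1
    return dict(sorted(counts.items()))


if __name__ == "__main__":
    first_five_car = [1600, 13400, 6400, 10600, 10200]
    print(frequency_by_window(first_five_car))
```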
Figure 9 illustrates a few features of our results that we were pleased to see. The
distributions are fairly bell shaped, with the car and pyramid curves lying almost on
top of one another and the farm curve shifted to a higher percentage range. This
suggests that the average person is able to identify each of these pictures within a fairly
specific range of percentages. The plane curve is also slightly bell shaped, but it is flatter
and more spread out than the other three. This is to be expected given its high standard
deviation, and is best explained by the theory that how quickly people identified the plane
depended on how familiar they were with the object to begin with. Even though these curves
take on different shapes, there is a fairly well defined zone within
which most people will be able to identify a picture, roughly between 20 and 50 percent.
Only 13 out of the 120 pictures were identified using more than 50-55 percent, and only 9
pictures out of 120 were identified before 20 percent of the image was revealed.
Conclusion
The goal of our experiment was to find a range within which the eyes and brain
are able to distinguish an object or place. As our data show,
most of the image recognition occurred in the range of 20-50 percent revealed. There is
some dependence on the image itself, and also on the familiarity of the person with the
image, but this is to be expected. We feel confident that if this experiment were carried
out on a grander scale, with more subjects and more pictures, our results would be further
supported, and might reveal other correlations, such as with eyesight or familiarity with the image.
Appendix A
Raw Data
Picture 1   Picture 2   Picture 3   Picture 4
1600        7800        6800        11100
13400       14600       13400       15700
6400        8200        4200        14500
10600       14900       12100       19400
10200       11200       7300        18200
9600        5600        8200        15000
9000        6100        7600        10400
4700        3700        17200       22900
6600        5300        10600       16500
7700        4500        7200        15500
5400        3500        6600        14900
9900        25800       8900        21500
8800        13200       9500        13900
8600        9300        7900        25000
6900        20200       8000        7700
8200        10500       9100        17100
8400        10000       8900        15200
7500        9900        8500        16100
10100       9700        9700        16200
8400        6300        7600        15300
8300        11400       12500       15000
5400        8000        3800        17500
6600        9200        4600        16200
7100        11400       8700        23000
8700        18000       8000        26700
7800        12200       20300       27800
6600        4600        9400        17900
8400        19900       8500        25800
8100        7700        5800        15400
5000        8700        9300        13200