Triggers - the hunt for the recognition moment
The proposition is that when we recognize animals, people and objects in
depictions, the depiction is triggering a subset of the recognition abilities that are
triggered when we recognize similar things in real life.
This is not a radical suggestion. It would come as no surprise if it turned out that
similar mechanisms of visual cognition are engaged by encountering an object in
real life and recognizing that object in a drawing or photograph. In fact, this very
assumption underlies much research into visual cognition, where real 3D objects are
rarely used as stimuli, and where drawings and photographs invariably serve as
‘targets’ in experiments.
An obvious experiment to confirm or disconfirm this assumption of researchers
in visual cognition would be to compare the brain scans of subjects exposed to an
object and to a picture of that object. If there is a significant overlap of brain
activity it would seem that an object and picture of that object engage similar
mechanisms of visual recognition. Unfortunately, overlap in brain activity in the
case of seeing the object and a picture of that object proves very little. It would
be surprising if there wasn’t any overlap – that would be an interesting result!
Perhaps, as Semir Zeki suggests, it would be more fruitful to identify non-overlapping brain activity. We could then begin to explore which cognitive
mechanisms an object engages that a picture doesn’t and vice versa. It is
nonetheless crucial to my thesis that a subset of object recognition mechanisms
are triggered by a picture of the object. Raw recognition of an object usually
happens in less than half-a-second, and in some cases can happen in less than a
20th of a second. In this short time our visual system has processed the light and
associated the visual cues with other objects we have encountered. There is no
evidence that this timeframe varies in the case of recognition of depictions. If a
subset of our natural object recognizing abilities is not engaged when we see a
depiction of an object it is hard to explain how recognition ever gets started. That
is, if a picture of an object does not trigger either the visual pathways that are
triggered by the object or memory modules that engender associations with that
object, then there is no recognition.
When we look for explanations of how recognition of an object in a depiction
gets started, the shortcomings of all theories of depiction are highlighted.
Conventionalist theories (theories that argue that when we look at a depiction
we are deciphering a symbol system) cannot explain how someone can instantly
recognize an object in a picture despite never having learned the symbology of
the style or format the depiction is in. Children as young as 18 months can
recognize animals in drawings and photographs, despite never having been
taught how to ‘interpret’ various kinds of pictures. If depiction is a symbol
system, its advocates need to explain what in a child’s memory is being matched
to the ‘symbols’ it sees in the picture. If the child hasn’t learned the symbology
there is nothing in memory to match the symbol to. Recognition cannot get
started without the lines, tones or colours in the depiction triggering something
in the brain that has been learnt or memorized.
Advocates of Resemblance theories have the opposite problem. It has been
argued that everything resembles everything else in some respect. In many
respects a real cat resembles a real dog more than a 2D line drawing of a cat
resembles a cat. It is possible to find resemblances between all kinds of disparate
objects and animals. In the recognition moment when the child sees the line
drawing of a cat, how does the child know that the drawing is supposed to
resemble a cat, and not another drawing, or something else that is
mainly white with lines on it? At that recognition moment the child needs guidance
as to the respects in which the drawing is supposed to resemble something. It is not
clear that the resemblance mechanism can supply that guidance, unless the child
has already recognized the cat using her natural cat recognition abilities!
Resemblance therefore cannot be the primary mechanism of recognition.
It would seem that only natural recognition can supply the lifting power needed
for recognition of content in depictions to get started.
Recognition of content in drawings, paintings and photographs is fast and
powerful. We are rarely in doubt about what is depicted. Occasionally a bad
drawing, an odd angle, or some ambiguity will cause us to falter for a second
while we analyse and try to decipher the picture. This, for me, is always an
interesting moment; it is as if the ‘recognition moment’ has been elongated and
laid itself out for analysis. Later on I will explore delayed recognition in a series
of visual experiments. There is much to learn about our cognitive processes from
the phenomenon of visual recognition failure.
What are our options for isolating what Michael Podro calls ‘the mechanisms of
recognition’? Aesthetic theory has had much to say about the mechanisms at play
when we browse a picture, analysing how style and composition ‘sustain
recognition’. There has been very little interest in aesthetics in the ‘recognition
moment’; that moment when in the first half-a-second of encountering a picture
we recognize that it is a landscape, a portrait or scene, when we initially pick out
trees, buildings, people, animals or objects. Arguably this moment is not
available for introspection; the cognitive mechanisms of recognition are
triggered in an instant and a complex set of processes and interactions in diverse
centres of the brain give rise to the perception of content of the picture. There
has been some inconclusive speculation and some experiments concerning
whether style or content are perceived simultaneously or serially. However, it is
not clear to me what implications this work has for a theory of depiction based
around natural recognition abilities. It is encouraging and informative, however,
to see some attempt to lay bare that primary moment of recognition, because it is
the mechanisms in play at that moment that I would like to lay bare.
It would seem that my options are:
Design a brain scan experiment that can discriminate between the cognitive processes which are engaged when we see an object in 2D and those engaged when we see the same object in 3D.
Design a psychological experiment that can discriminate between the psychological processes engaged when we see an object in 2D and those engaged when we see the same object in 3D.
Other????
What counts as an explanation of how human beings recognize content in
depictions?
In general, for something to count as an explanation of a phenomenon (as
opposed to merely a description), a theory needs to account for proven facts
about the phenomenon, satisfy test cases and experiments, provide the tools to
predict features of the phenomenon which are, as yet, unproven or discount
them.
What features of recognition of depicted content does a robust theory need to
account for?
Universality
Speed of recognition
All styles of visual depiction
For an image to be a depiction the artist or photographer must intend that the
picture he or she is creating is recognizable by people with visual abilities like ours.
An image of something created by accident, for example Leonardo’s stains on the
wall, is not a depiction.
An image which fails to trigger recognition is simply a failed depiction.
Why a Duck? [decoy – shooting – recognizing – quack -] The 4th June
Pre-amble about why I’m bothering with such a crazy hypothesis when the
Gestaltist/context theory is likely to be the one with legs. [Praise Gombrich and
mention Ramachandran and neuroscience].
We almost never see an object out of context – except in catalogues and
recognition tests. [Stimulus pictures from experiments and catalogues].
So, we can recognize objects without the Gestaltist superstructure.
Entertain the possibility that the natural recognition abilities that enable you to
recognize these objects scattered around the room are the ones that enable you
to see similar objects in drawings (and photos and paintings – but today line
drawings).
Further consider that the cognitive abilities that enable you to see the duck, keys
etc in the lines are a subset of the ones that enable you to see the 3D duck and 3D
keys.
This is an exercise in parsimony. If all you need to see a duck in a line drawing is
whatever you need to see a duck in real life there could be a recognition trigger
common to both. It seems unlikely, because when you look at the line drawing it
doesn’t look much like a duck…and neither do all these easily
recognizable drawn ducks. [Picture of lots of ducks].
What could a trigger be? Mention Ramachandran’s gulls and something else. Also
mention that the gull skill is instinct – we probably haven’t got an instinctual
recognition trigger for chairs.
What are the chances of all chairs having a common recognition trigger? [Picture
of lots of different chairs] – zero.
THE PROBLEM – you recognize all of these drawings as chairs and yet you
probably have never seen a chair in real life like this or this or this. [Picture of
weird chairs] Maybe you’ve seen one in a picture, and our ability to recognize so
many chairs with apparently nothing/so little in common (??) is based on seeing
lots of pictures of chairs in context beforehand. [Maybe explain this with pictures
of chairs in offices, houses etc].
Does this possibility mean that nothing hangs on the ‘common recognition
trigger’ hypothesis?
Etched Nebula
When I close my eyes at night I see a nebula of drifting clouds of greys and pale
blues and greens. Occasionally a tiny bright flash like a shooting star will appear
and disappear. The nebula subtly changes its shade as waves like the burning
credits on an old western sweep across the screen. Shapes like alien spacecraft
sometimes drift out of the mist. They are almost never a distinct shape. They
might be like a little green Star Trek probe – there and then gone. Or there might
be a glimpse of complex grey metallic panels briefly visible through the grey
mist. Sometimes whole sheets of shapes, like Leonardo’s pages of drawings for
machines, seem to be etched onto metal and slide across the screen only to be
swallowed again by the grey. I can look around and try to follow these shapes;
sometimes there are flashes of colour like distant lightning behind the clouds. It
eventually all stops and becomes a homogeneous grey until it begins again with a
geometric shape, or a cloud of gas.
It is as if the aurora borealis is playing out on my retina.
Hypothesis – Algorithm – it is a diagnosis algorithm
The recognition process must be algorithmic.
Identification of an object usually happens in less than half-a-second.
If Oliva and Torralba are right, we first hypothesise an outdoor scene, an indoor
scene, a street scene etc., then move on to hypothesising about the objects in that
scene.
What are the steps in the diagnosis? What tests does it do?
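The two-stage diagnosis sketched above – a scene hypothesis first, then object hypotheses conditioned on the scene – can be mocked up in a few lines of code. This is a toy illustration only, not Oliva and Torralba's actual model: the scene classes, the "signature" features, and the prior numbers are all invented for the example.

```python
# Toy sketch of a two-stage recognition "diagnosis": hypothesise the
# scene class first, then use scene-conditional priors to rank object
# hypotheses. All classes, features and numbers are invented.

SCENE_PRIORS = {
    # P(object | scene) -- made-up values for illustration
    "street": {"car": 0.5, "person": 0.3, "duck": 0.01},
    "pond":   {"duck": 0.6, "person": 0.2, "car": 0.01},
    "indoor": {"chair": 0.5, "person": 0.3, "duck": 0.01},
}

def hypothesise_scene(global_features):
    """Stage 1: pick the scene class whose signature features match best."""
    signatures = {
        "street": {"strong_verticals", "hard_edges"},
        "pond":   {"horizontal_bands", "soft_edges"},
        "indoor": {"strong_verticals", "enclosed"},
    }
    return max(signatures, key=lambda s: len(signatures[s] & global_features))

def diagnose(global_features, local_evidence):
    """Stage 2: rank object hypotheses by scene prior x local evidence."""
    scene = hypothesise_scene(global_features)
    scores = {obj: SCENE_PRIORS[scene].get(obj, 0.0) * strength
              for obj, strength in local_evidence.items()}
    return scene, max(scores, key=scores.get)

# Identical local evidence for "duck" and "car"; the scene hypothesis
# breaks the tie in favour of the duck.
scene, obj = diagnose({"horizontal_bands", "soft_edges"},
                      {"duck": 0.7, "car": 0.7})
print(scene, obj)  # pond duck
```

The point of the sketch is only that the scene hypothesis does real work: the same local evidence yields different winners under different scene priors.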
Doors to Perceptions
A corridor of doors with rooms (scenes) behind them.
Through each door is one of the objects.
Why not use a window?
Am I setting context?
What is doing the work?
What feature of the sketch is enabling you to recognize what in reality is a smelly
one ton cow, a shiny metal VW, a hollow wooden shell, a feather covered duck?
The experience of the sketch does not come close to the experience of the close
encounter with the subject.
Could it be that there is some trigger feature of the real 3D object (or the real 3D
encounter) which is captured in the lines of the sketch and does the heavy lifting
work of triggering recognition? Seems unlikely. But we need to eliminate this
pesky idea before we move on to identifying how context etc. jump-starts the
recognition process.
20 Objects - score – animal vegetable mineral
A matrix of 20 ‘everyday’ object types which I can easily recognize from a simple
line drawing or photograph.
The master set is a set of crude line drawings of 20 objects which I have sketched
from memory and which I intend to be recognizable as a particular object-type
by any viewer with natural human visual recognition abilities.
There is enough information in the sketches to trigger recognition of an object
class.
2D Cues to a 3D world
The hypothesis hypothesis and the unlikely object-specific recognition trigger
proposal.
Looking for cues – representing and depicting in 2D– let us count the ways (some
ways of representing in 2D are not depicting) Excluding film/video for now.
B&W Sketch
B&W Drawing
B&W silhouette
B&W Photo
B&W plan and elevation
B&W diagram
B&W map
B&W vector (eg matchstick man, point light figure)
Colour Drawing
Colour Painting
Colour Photo
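The 20-objects-by-techniques test could be recorded in a simple scoring matrix, crossing object types against the ways of depicting listed above. A minimal sketch, assuming a pass/fail recognition score per cell; the object names and filled-in scores are examples only, not results.

```python
# Sketch of the scoring matrix: object types (a few shown, up to 20)
# against depiction techniques, recording whether each picture
# triggered recognition. Names and scores are illustrative only.

techniques = ["B&W sketch", "B&W silhouette", "B&W photo", "colour photo"]
object_types = ["duck", "chair", "cow", "VW"]  # ... up to 20

# scores[obj][technique] -> True/False if tested, None if untested
scores = {obj: {t: None for t in techniques} for obj in object_types}

# Example entries (hypothetical):
scores["duck"]["B&W sketch"] = True
scores["duck"]["B&W silhouette"] = True

def recognition_rate(obj):
    """Fraction of tested techniques for which the object was recognized."""
    tested = [v for v in scores[obj].values() if v is not None]
    return sum(tested) / len(tested) if tested else 0.0

print(recognition_rate("duck"))  # 1.0
```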
A WORD ABOUT STYLE
A colour photograph taken with a standard lens is a depiction that uses a
technique (colour photography) and will be in a style (eg realist, documentary).
The style of a depiction may be constrained by its technique.
The style of my original sketches is constrained by what I can do with a pencil on
paper and my decision to use lines with no shading to depict the object.
Most of the objects are in ¾ view as opposed to side-on or face-on. This is because
I wanted to depict the object as one might encounter it in everyday life, not as
one might find it in a book (eg an I-Spy book). The essential thing about a
depiction is that it is designed to evoke objects or scenes that we might
encounter or might have encountered in normal life and ….
A depiction, as I define it, is a visual representation in standard projection of how
an object looks to a human being from a particular angle.
A picture of an object in side-view is the beginning of depiction, if the Lascaux
Cave paintings are anything to go by.
A WORD ABOUT CONTEXT – more objects in space.
I’m going to assume that it makes sense to say that a picture of an object can be
presented without context. A generous interpretation of this assumption would
be to grant that a drawing or photograph of an object against a white
background with no shadows or horizon line is an object without context. A less
generous interpretation might be that it is an object in the context of a white
background. Unless it can be shown that this ungenerous interpretation
somehow invalidates my experiment I am going to assume that a white
background is an adequate analog for ‘no context’.
Having said this, it is clear that our picture viewing experience and our pictured-object-recognising moment will always be
Human Vision – bad design, bad engineering, but a good post-production
team. – let’s assume (despite Gibson’s1 objection) that the data that is on
the retina is crucial as data for the object recognition to get started. If the
ludicrous proposition fails this retinal test we will revisit Gibson’s
ecological optics and see if that can provide a recognition kick-start for a
picture-object.
What is the resolution of the combination of lens and retinal ‘pixels’?
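A common back-of-envelope answer to the resolution question can be worked through in a few lines. The numbers are rough assumptions (a ~120° usable field and ~0.3 arcminute effective resolution, roughly what 20/20-and-better acuity implies), and the retina is nothing like a uniform sensor: only the fovea resolves this finely, so the figure overstates what the eye delivers in a single glance.

```python
# Back-of-envelope "megapixel" estimate for the eye.
# Assumptions (rough): ~120 degree usable field of view and ~0.3
# arcminute effective resolution. Only the fovea actually resolves
# this finely, so this is an upper-bound style estimate.

field_deg = 120        # assumed usable field of view, degrees
res_arcmin = 0.3       # assumed effective resolution, arcminutes

pixels_per_side = field_deg * 60 / res_arcmin   # 24,000 "pixels" per side
total = pixels_per_side ** 2
print(f"{total / 1e6:.0f} megapixels")  # 576 megapixels
```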
1. Gibson JJ. The Ecological Approach to Visual Perception. Boston:
Houghton Mifflin, 1979. Gibson says that the optics of the eye and the resulting
retinal image are irrelevant for ecological optics. In Gibson’s view, information
from the environment is structured into the optic array, and the eye simply
“picks up” invariant relationships in the array that specify the environment.
Picture-games – the syntax of line drawings
Wittgenstein’s notion of language-games2 may be useful in highlighting how
recognizing content in pictures is different from understanding
language/words[???]. The crucial difference (if you accept, for the sake of this
thesis, that Schier’s natural recognition theory of depiction is on the right track)
is that meaning, in a language-game, is use, and meaning in a picture-game is
recognition [this sounds wrong]. Proper names are a very particular and not
very typical example of how language enables us to identify things in the world.
Names get their meaning from the thing they pick out [maybe this is wrong]. To
say “Here is Halle Berry” is to identify Berry, the person, using the sound “Halle
Berry”. When we hear this sound we don’t recognize Berry the person in the
sound; we associate Berry with the sound – it is reserved for her. The rest of
language does not work like that, and neither does recognizing pictures.
Wittgenstein originally argued that propositions (sentences) pictured the world
and that the world was the totality of facts (“everything that is the
case”). He rejected this picture-theory of language when, in Philosophical
Investigations, he formulated his ‘language-game’ account of language and
meaning. In this later account Wittgenstein sees language use as a social activity
interwoven with our ‘forms of life’ and not something which stands apart from
the activity in a representational way (as in the picture-theory).
Maybe what we need is a picture-theory of pictures.
It is uncontroversial to say that a picture can picture (represent??) a
particular object in the world (in the way a proper noun does), a type of object
(as a category-type word such as guitar does) or an imaginary object.
Pictures describe the visible world in the way that a descriptive sentence might.
However, it may be that not every element of a picture refers. The same is true of
sentences. Some words are part of the syntax. The elements of a picture
selectively commit to reference. For example, a black and white line drawing of a
red ball commits to visible shape but does not commit to colour, and is not
suggesting that the ball has a black outline.
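The idea of selective commitment can be made concrete: each depiction asserts only the properties it commits to, and stays silent on the rest. A minimal sketch; the property names and representation are my own illustration, not a proposal from the literature.

```python
# Sketch of "selective commitment": a picture is modelled as a record
# of the properties it asserts; None means "no claim made", not
# "property absent". Property names are illustrative.

line_drawing_of_ball = {
    "shape": "circle",        # committed: the outline asserts visible shape
    "colour": None,           # not committed: a B&W drawing says nothing about colour
    "outline_is_black": None, # the black line is syntax, not a claim about the ball
}

colour_photo_of_ball = {
    "shape": "circle",
    "colour": "red",          # a colour photo does commit to colour
    "outline_is_black": None,
}

def commits_to(picture, prop):
    """True if the picture makes a claim about this property."""
    return picture.get(prop) is not None

print(commits_to(line_drawing_of_ball, "colour"))  # False
print(commits_to(colour_photo_of_ball, "colour"))  # True
```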
2. Kenny – “Like the picture theory of meaning, the concept of language-game was
much more than a metaphor. Words, Wittgenstein now insisted, cannot be
understood outside the context of the non-linguistic human activities into which
the use of the language is interwoven: the words plus their behavioural
surroundings make up the language-game. Words are like tools: their functions
differ from one another as much as those of a saw and a screwdriver. But their
dissimilarities of function are hidden by their uniform appearance in sound and
in print. (Similarly, a clutch-pedal is like a foot-brake to look at, but their
mechanical functions are totally different.) The similarity between words of
different kinds makes us assimilate them all to names, and tempts us to try to
explain their meaning by pointing to objects for which they stand. But in fact the
way to understand the meaning of a word is to study it in the language-game to
which it belongs, to see how it contributes to the communal activity of a group of
language-users. In general, the meaning of a word is not an object for which it
stands, but rather its use in a language (PI, I, 11–12, 24, 43).”
The syntax of line drawings – the lines in a line drawing work together to
reference the object-type. In a line drawing of an object with a white background
and no horizon line or shadows the lines refer in distinct ways and together they
trigger recognition by providing their own context [this is not the right word].
Degrading Images - vectors-blur-size-noise-obscuration-angle-distort-removal (also enhancing – exaggeration – compositional highlighting etc)
The conceit of this phase of this project is to test the ludicrous proposition that
every object has individual recognition triggers. The task is to find out what is
essential in each picture of an object to kick-start the recognition process (given
that we are assuming ‘no-context’ and thus no prior hypothesis about what the
object in the picture is likely to be). The matrix of line drawings, photographs etc
will be subjected to a number of processes designed to eliminate what is
unnecessary in the picture for recognition (eg its colour or its sharpness).
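The degradation step could be scripted as a pipeline of simple image operations, re-testing recognition after each one. A minimal stdlib sketch, assuming the picture is a tiny greyscale grid; the three processes here (box blur, noise, downsampling) are simple stand-ins for the fuller list of degradations above.

```python
import random

# Sketch of the degradation pipeline: apply one degrading process at a
# time to a picture (a greyscale grid of 0-255 values) so that what is
# unnecessary for recognition can be eliminated step by step.

def box_blur(img):
    """3x3 mean blur -- removes sharpness."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            vals = [img[j][i]
                    for j in range(max(0, y - 1), min(h, y + 2))
                    for i in range(max(0, x - 1), min(w, x + 2))]
            out[y][x] = sum(vals) // len(vals)
    return out

def add_noise(img, amount=30, seed=0):
    """Random intensity jitter, clamped to 0-255 (seeded for repeatability)."""
    rng = random.Random(seed)
    return [[min(255, max(0, v + rng.randint(-amount, amount)))
             for v in row] for row in img]

def downsample(img):
    """Halve the resolution -- degrades size."""
    return [row[::2] for row in img[::2]]

# A tiny 8x8 test pattern standing in for a scanned drawing.
picture = [[255 if (x + y) % 4 == 0 else 0 for x in range(8)] for y in range(8)]
for step in (box_blur, add_noise, downsample):
    picture = step(picture)
    # ...show `picture` to viewers here and record whether recognition survives
```

Each pass strips out one kind of information (sharpness, cleanliness, size), which is exactly the logic of eliminating what is unnecessary for recognition.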
A drawing, painting or photograph presumably provides less ‘data’ to the retina
for the human visual ‘post-production’ process to evaluate (recognize).
Using Kennedy’s ‘seven ways a line can refer’, remove each kind of line and see
which images are no longer recognizable.
Seven Ways Lines Can Refer
A picture is a proposition about what is visible in the world.
Lines refer to visible things – (like proper nouns but with different conditions eg
must be visible)
There is a syntax of line referral – that is, sometimes a line needs other lines to
determine how it refers.
Kennedy says blind people can feel how a line refers even though they have
never had vision or seen a picture with their eyes.
Thus the way a line refers is connected with how the brain interprets the data on
the retina. This seems too simple. But I am going to test it somehow.
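The line-removal test could be sketched directly: tag each line in a drawing with the way it refers and regenerate the drawing with one kind deleted. The referral categories below paraphrase a few of Kennedy's kinds; the tagging of individual lines would have to be done by hand, and the data here is invented.

```python
# Sketch of the line-removal test: each line in a drawing is tagged
# with how it refers (a few of Kennedy's kinds, paraphrased), and the
# drawing is regenerated with one kind removed. Tags are hand-assigned.

drawing = [
    {"id": 1, "refers_as": "occluding_edge"},   # silhouette boundary
    {"id": 2, "refers_as": "occluding_edge"},
    {"id": 3, "refers_as": "surface_crack"},
    {"id": 4, "refers_as": "corner_edge"},
    {"id": 5, "refers_as": "shadow_boundary"},
]

def remove_kind(lines, kind):
    """Return the drawing with every line of the given kind deleted."""
    return [ln for ln in lines if ln["refers_as"] != kind]

for kind in sorted({ln["refers_as"] for ln in drawing}):
    reduced = remove_kind(drawing, kind)
    # ...render `reduced` and test whether the image is still recognizable
    print(kind, len(reduced))
```

If recognition collapses only when, say, the occluding edges go, that would be evidence about which kinds of line-referral carry the recognition load.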
Ron’s Pedagogical Sketchbook
Looking at how lines refer.
Tanaka differential amplifier - model a response (is it like colour vision?)
The visual system looks at differences not similarities.
Purves and Lotto
The reason it is difficult to draw in perspective is that the visual system assumes
that it is always looking at a 3D scene and calculates angles based on their real-world
relations, not on the relations on the flat plane of the retina.
The sketches
The sketches by Turner, Constable et al are often very sparse with trees
indicated with a few lines, squiggles and a rough hatch. Some[who?] have
described sketches like the ones from Turner’s sketchbooks as being drawn in a
kind of shorthand or notation. This may not be the right way to characterise
rough sketches. There seem to be similarities in the ‘notation’ techniques used
by quite diverse (diverse in time and geography) artists in this little group.
Note how buildings are drawn with short vertical dashes for windows.
Tell the story of encountering a line drawing
First glance – black and white lines
Light activates rods and cones.
Raw data analysed for verticals, horizontal edges, etc. Curves? Bezels?
Model the analysis algorithm for line drawings.
Unrecognisable images
In the paintings by Yves Tanguy the shapes are familiar but unrecognisable.
Make a collection of unrecognizable objects/images.
Waking Up - disorientation
I love it when I wake up in a strange room, a hotel, or someone’s house, and I
don’t recognize the scene when I open my eyes. There is a moment there when
my brain is seizing on everything and trying to make it part of a scene that it is
not – my bedroom.
Discontinuity - the visual system is a discontinuity engine
The first thing that the visual system does to the pattern of light-stimulated rods
and cones is analyse adjacent discontinuities of light intensity. These are
boundaries where the stimulus in the world has an edge, where two differently
coloured surfaces abut, or where some other discontinuity results in a difference
in the light reflected from a surface. The visual system analyses these
discontinuities for horizontal and vertical boundaries, takes note of large
fields where light intensity does not alter suddenly, and ignores them.
The visual system, at this stage, does not know what causes these discontinuities;
it could be a real-world object or a drawing of an object.
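The discontinuity-engine idea can be sketched as code: scan a greyscale grid for sharp jumps between adjacent intensities, keep their positions and orientations, and ignore uniform fields. A toy illustration of the principle only, not a model of retinal or cortical processing; the threshold is arbitrary.

```python
# Sketch of the "discontinuity engine": find positions where adjacent
# intensity values jump sharply, noting orientation, and ignore large
# uniform fields. The threshold is an arbitrary illustrative value.

def discontinuities(img, threshold=64):
    """Return (y, x, orientation) for each sharp adjacent intensity change."""
    found = []
    for y, row in enumerate(img):
        for x, v in enumerate(row):
            if x + 1 < len(row) and abs(v - row[x + 1]) > threshold:
                found.append((y, x, "vertical"))    # jump across columns -> vertical edge
            if y + 1 < len(img) and abs(v - img[y + 1][x]) > threshold:
                found.append((y, x, "horizontal"))  # jump across rows -> horizontal edge
    return found

# A uniform field yields nothing; a black bar on white yields its edges.
flat = [[200] * 4 for _ in range(4)]
bar = [[255, 255, 255, 255],
       [0,   0,   0,   0],
       [255, 255, 255, 255]]
print(len(discontinuities(flat)))  # 0
print(len(discontinuities(bar)))   # 8
```

Note that the same routine fires identically on a "real" luminance pattern and on a line drawing, which is the point made above: at this stage the system cannot tell an object from a drawing of one.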
The Signal Station