CHAPTER TWO: TRIVARIANT COLOR VISION
“Color vision is known best by man's perception of it. It creates a unique dimension to
sight that is impossible to appreciate by any non-visual means. It depends on wavelength
more than on the energy of light but it is an illusion of reality resulting from a
comparison of the responses of nerve cells in our brain. Color and all vision are in a
sense illusory depending only on messages that pass between millions of neurons that
reside within the darkness of our skull. These visual messages allow us to project
ourselves into a universe that would be unknown to us without vision.”
Dr. Peter Gouras
2.1 Introduction
It is generally accepted that Aristotle was the first to enumerate the five “classical”
human senses of sight, hearing, smell, touch and taste. Today we recognize that we have
many more sensory systems including a system for a kinesthetic sense and a system for
sense of balance. Not all of our sensory systems give rise to a direct appreciation of a
sensation because many of our sensory systems function subconsciously. Of those that do
give rise to sensation, the process by which the sensory stimulation is translated into an
experience is called perception and the experience itself is called a percept1.
Of all of the systems that communicate to us via perception, the visual system is by
far the most underappreciated because we rarely notice its percepts. We notice auditory
percepts, we notice olfactory percepts but we rarely notice visual percepts. That is
because unlike the other senses, the visual system is never quiescent. We notice other
senses as part of the world around us but the visual system is the world around us. It is
our primary sensory modality, it dominates our neocortex and it functions even when the
eyes are closed and the brain is asleep.
The percepts of vision are the objects that surround us and interact with us. It is the
purpose of the visual system to identify these objects, determine their spatial position and
relationships, evaluate their movement and discover their properties. It must do all this
with the myriad objects that surround us, and do it all in real time. The complexity of
this task dwarfs all other sensory tasks. We do not notice most of the percepts because the
visual system is so efficient that we simply do not have time to become consciously
aware of everything that it produces.

1 An impression of an object obtained by use of the senses.
The products of the visual processing system are percepts and, since we intend to use
that system to communicate scientific information, the products of visualization must also be
percepts; we must perceive that which we are trying to show. To that end, this chapter
introduces the concepts of visual percepts and trivariant color vision. The former is what
we want to produce; the latter is the system by which the visual system goes about its
task of producing them. It introduces them in the context of the seismic variable density
display and goes on to demonstrate that although we conceive of this display as a seismic
display, we do not actually perceive the seismic in it at all.
2.2 The Concept of Seismic Perception
“Perceptions are internal representations of the external world”
R.L. Gregory, Eye and Brain, Fourth Edition, 1997
Figure 2.1 and Figure 2.2 are both images of a complex scene. The former is a
picture of the author’s wife and dog in an alpine meadow; the latter is a variable density
display of a seismic line from the Trujillo area of Peru.
In both instances, the brain creates a model of the scene in the mind. Creating this
model is a two-stage process. In the first stage, the visual system segregates the scene
into discrete objects; in the second, it interprets these objects as percepts. These percepts
are provisional in nature because as we acquire new knowledge our percept changes.
Take, for example, the percept of the dog in Figure 2.1. Almost everyone with normal
vision would recognize it as some form of animal; beyond that, most adults would
recognize it as a dog; beyond that, most people familiar with dogs would recognize it as a
Springer spaniel; beyond that, most people familiar with my family would recognize it as
my dog etc.
Figure 2.1: The author’s wife and dog in an alpine
meadow.
Figure 2.2: Variable density seismic display of a
faulted data set from the Trujillo area of Peru, data
courtesy PeruPetro. The color palette represents negative
amplitudes in blue, zero amplitude in white and positive
amplitudes in red.
Perception is also a multi-stage process because we assemble the whole from the
parts. We do not just perceive the dog; we first perceive its components: its eyes, its nose,
its tongue, its smile, its tail and so on, and we interpret each of those in turn as its own
provisional percept. We develop our perception of the dog as a whole from a
consideration of its parts but only after we assemble our perceptions of its parts from our
perceptions of its parts' parts, and so on, ad infinitum.
Figure 2.2 is a seismic image of a complex geological cross-section and it contains
as many constituent parts as does the real world scene shown in Figure 2.1. It contains
numerous reflection events each of which is broken into smaller sections by the myriad
collection of major and minor faults. Each of these smaller sections is further subdivided
by even smaller faults and each of these even smaller sections has its own characteristic
amplitudes, dominant frequency etc.
However, unlike Figure 2.1 there is no concept of a percept here. There is no
perception of the major faults, for example. We cannot see them directly and therefore we
have to imply their locations. The situation is worse for the minor faults, which we can
barely detect at all. We also do not perceive the reflection events as objects and although
we know that each reflection event has amplitude changes along it, we do not perceive
those either. In fact, there is no sensation of perception at all in this image; there is
nothing in it that grabs our attention and no part of it that is visually more distinct than
any other. When we look at the real world scene, we develop sensations of perception
but when we look at the seismic scene we develop none at all; our visual system has
failed us.
This is a strange concept: that the visual system can fail even though we can see
something clearly. At first, it sounds nonsensical to imply that perception can fail even
when we can see. Nevertheless, the products of the visual system are percepts, sensations
of perception, and if the visual system cannot produce them for us, then it has by
definition, failed. The ultimate goal of visualization must be to produce a display that is
as interpretable to the visual system as a real world scene. It is doubtful, given
the physical differences between the objects in the two images, that we will ever achieve
that goal. However, there is a fundamental reason why Figure 2.2 is so poor, why it
produces absolutely no sensation of perception. The reason is human trivariant color
vision, which I will discuss in the remainder of this chapter.
2.3 Simple Visual Experiment
In the spring of 2006, I conducted a simple psychophysical2 experiment involving
over 100 participants from the University of Calgary and Divestco Inc. I showed two
images (shown full size in Figure 2.3 and Figure 2.4) to the survey participants and asked
them the same two questions for each:
1. What is it?
2. Did you recognize it automatically or did you have to think about what it was?
2
Study of the quantitative relations between psychological events and physical events or, more
specifically, between sensations and the stimuli that produce them.
37
Figure 2.3: An image of an object created using a typical seismic color palette: cyan-blue-white-red-yellow.
I sent this image to the participants second; even so, 60% of the survey participants could not identify what
the underlying object was. The majority of the 40% of the respondents who did identify it reported that they
did not develop their identification directly but had to use secondary information, in other words, they had
to think about what the object was before arriving at a conclusion.
Figure 2.4: A shaded relief image of the same object shown in Figure 2.3. I sent this image to the
participants first and 85% of the respondents reported that they identified the underlying object
automatically. The identifications varied, with the majority perceiving it as a mountain range and others
perceiving it as either crumpled paper or a blanket.
Of the two questions, the second was the more important because the answer is
indicative of whether or not the visual processing system succeeded. The term “think
about it” is crucial because the visual processing system does not include conscious
thought. If the visual system functions as we would like, then we sense the object; if it does not, then
we pass the image on to the higher brain functions for analysis.
The object that I show in both images is a mountain range, specifically, the
Crowsnest Pass region of southeastern British Columbia. The color image of Figure 2.3 is
essentially a variable density elevation display because elevations are mapped to color in
the same way that a variable density seismic display maps seismic amplitude to color. For
this example, I used a typical seismic palette (cyan-blue-white-red-yellow). Cyan
represents the lowest elevations and yellow represents the highest elevations.
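
As an illustration of how such a variable density mapping works, the following Python sketch interpolates a cyan-blue-white-red-yellow palette over a normalized two-dimensional array of values. The control-point positions and RGB triples are illustrative assumptions, not the exact palette used for Figure 2.3:

import numpy as np

# Illustrative control points for a cyan-blue-white-red-yellow palette,
# given as (position in [0, 1], RGB triple in [0, 1]).
PALETTE = [
    (0.00, (0.0, 1.0, 1.0)),  # cyan   -> lowest values
    (0.25, (0.0, 0.0, 1.0)),  # blue
    (0.50, (1.0, 1.0, 1.0)),  # white  -> mid-range values
    (0.75, (1.0, 0.0, 0.0)),  # red
    (1.00, (1.0, 1.0, 0.0)),  # yellow -> highest values
]

def variable_density(values: np.ndarray) -> np.ndarray:
    """Map a 2-D array of scalars (elevation or amplitude) to RGB colors."""
    # Normalize the data to [0, 1] so the palette spans the full range.
    v = (values - values.min()) / (values.max() - values.min())
    positions = np.array([p for p, _ in PALETTE])
    colors = np.array([c for _, c in PALETTE])
    # Interpolate each RGB component independently between control points.
    return np.stack([np.interp(v, positions, colors[:, i]) for i in range(3)], axis=-1)

# Example: a synthetic 512 x 512 "elevation" grid.
demo = np.random.rand(512, 512)
image = variable_density(demo)

The same function applies unchanged to a grid of seismic amplitudes; only the input array differs, which is precisely why the elevation display and the variable density seismic display are directly comparable.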
By contrast, Figure 2.4 is a shaded relief image of the same elevation data. Shaded
relief (Batson, 1975) is a picture of the light that would reflect off a given surface for a
given direction of illumination. For this image, the light source points from the upper left
of the image towards the center. I sent this image to the survey participants first.
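
A minimal sketch of how such a shaded relief image can be computed from a digital elevation model is given below. It assumes a simple Lambertian reflection model applied to a numpy elevation grid; the light direction (azimuth 315 degrees, i.e. from the upper left), the light altitude and the cell size are illustrative, adjustable parameters, and the sign conventions may need flipping depending on how the grid is oriented:

import numpy as np

def shaded_relief(elevation: np.ndarray, azimuth_deg: float = 315.0,
                  altitude_deg: float = 45.0, cell_size: float = 1.0) -> np.ndarray:
    """Lambertian shaded relief: the brightness of light reflected from each
    cell of a digital elevation model for a given illumination direction."""
    # Surface gradients (rise over run) along the rows and columns.
    dy, dx = np.gradient(elevation, cell_size)
    # Unit surface normals.
    norm = np.sqrt(dx**2 + dy**2 + 1.0)
    nx, ny, nz = -dx / norm, -dy / norm, 1.0 / norm
    # Unit vector pointing toward the light source.
    az, alt = np.radians(azimuth_deg), np.radians(altitude_deg)
    lx = np.cos(alt) * np.sin(az)
    ly = np.cos(alt) * np.cos(az)
    lz = np.sin(alt)
    # Dot product of surface normal and light direction, clipped to [0, 1].
    return np.clip(nx * lx + ny * ly + nz * lz, 0.0, 1.0)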
2.3.1 Survey Results
Eighty-five percent of the people who responded to the survey reported that they
were able to identify an object in Figure 2.4 very quickly. Most people correctly
identified it as a mountain range but others saw it as crumpled paper and others as a
blanket. Interestingly, there was a high degree of correlation between the respondents'
experience and what they reported seeing in the image. People experienced with
aerial photographs identified the image almost exclusively as a mountain range, whereas
people in Divestco’s accounting department saw it as crumpled paper. As much as this is
amusing, the correlation between occupation and identification is significant because it is
indicative of the provisional nature of percepts. People perceive things according to their
own personal experiences and knowledge. In this experiment, however, it was not
important what they eventually sensed the object to be. What was important was
that the vast majority of respondents had the sensation that what they were seeing in the
shaded relief image was a real object.
This perception of an underlying object is in contrast with the respondents'
experience with Figure 2.3. I sent this image to the participants several days after the
shaded relief image and even though they had already identified the underlying object,
most people did not sense anything in the color image. Sixty percent of the respondents reported
that they could not determine what the underlying object was. Of the remaining 40%, the
majority reported that the recognition was not automatic and that they had to think about
the image before arriving at an answer.
This was not a rigorous survey and I caution the reader not to take too much from it.
I include it here because it serves to introduce what is the most important and most
fundamental fact of the visual processing system, one that psychologists have known for
over a hundred years. This simple experiment shows that the visual processing system is
a multi-channel system and that each channel contributes to perception but in a different
way. In this example, the shaded relief image is processed by a channel that produces the
sensation of perception. The purely chromatic image, by contrast, is processed by a channel
that fills in details but gives very little sensation of the object to which those details apply.
2.4 Primate Trivariant Color Vision
Our modern understanding of color begins with a series of experiments conducted by
Sir Isaac Newton in the late 1660s. Before his experiments, people believed that color
was a mixture of light and darkness. Hooke, Newton’s antagonist, was a proponent of this
theory and proposed a scale from brilliant red, which he believed was pure white light
with no darkness added, to dull blue, the last step before black. He believed that darkness
was a physical property and that black was the complete extinction of light by the
hypothetical dark. Newton overturned this theory by use of the prism. In a revolutionary
experiment, he first split the light into its spectrum and then refracted it back together. By
reforming the original light, he proved that light itself was responsible for the color.
Newton’s experiments led to an understanding of the nature of color but said nothing
about how we see it. The Trichromatic Theory of Color Vision, first proposed by Thomas
Young in 1802, was the first widely accepted theory of how we actually see colors.
Young based his theory of color vision on the premise that there are three classes of cone
receptors subserving color vision. One of the more important empirical aspects of this
theory is that it is possible to match all of the colors in the visible spectrum by
appropriate mixing of three primary colors. Which primary colors are used is not
important as long as mixing two of them does not produce the third.
Anyone who has looked at a television or a computer monitor will be familiar with
Trichromatic color. Each pixel on a computer monitor consists of three smaller pixels, a
red pixel, a blue pixel and a green pixel. By varying the intensity of the light emitted
from each, the display can produce a complete spectrum of colors. We call this the
additive mixing of colors. As
one might expect, however, nature is more complex and even though we have three
separate color receptors in the retina, we do not combine them in the same way.
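
The additive nature of this mixing is easy to demonstrate. The short Python sketch below simply sums the red, green and blue intensities channel by channel; the 0-255 intensity range and the helper name add_light are illustrative choices:

# Additive mixing: light from the red, green and blue sub-pixels simply adds.
red   = (255,   0,   0)
green = (  0, 255,   0)
blue  = (  0,   0, 255)

def add_light(*colors):
    """Add RGB intensities channel by channel, clipping at the display maximum."""
    return tuple(min(sum(c[i] for c in colors), 255) for i in range(3))

print(add_light(red, green))        # (255, 255, 0)   -> perceived as yellow
print(add_light(red, green, blue))  # (255, 255, 255) -> perceived as white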
Whereas the Trichromatic theory of color explains many aspects of generating color,
it is seriously deficient when it comes to the human perception of color. We can use it to
simulate colors but it does not explain why there are certain colors that we never see
together. For example, we see yellowish-greens and bluish-reds but we never see bluish-yellows or reddish-greens. The trichromatic theory cannot explain this and we now
generally accept that the theory of trichromacy only applies to the color receptors in the
retina and not to our perception of color.
2.4.1 Hering Theory of Opponent Color Vision
The 19th century physiologist Ewald Hering proposed a different model for color
vision. He proposed the Opponent Color Theory (Hering, 1964; Hurvich 1981) which we
now accept as generally correct. Hering hypothesized that the trichromatic signals from
the cones were subject to subsequent neural processing. He proposed two major opponent
classes of processing, a spectrally opponent process and a spectrally non-opponent
process.
[Figure 2.5 diagram: the S-, M- and L-cone outputs combine into L+M (Black-White, or luminance), L-M (Red-Green) and L+M-S (Blue-Yellow) channels.]
Figure 2.5: The Hering theory of Opponent Color Vision. Neural processing produces three channels of
visual information, each of which is processed by separate neural circuitry in the visual cortex. The first
channel is the opponent black-white (achromatic or luminance) channel; it provides the bulk of our
perception. The two chromatic channels, the opponent blue-yellow channel and the opponent red-green
channel also contribute to perception but to a lesser degree.
In the opponent color theory, the spectrally opponent processes of red vs. green and
blue vs. yellow provide our ability to separate hues. The spectrally non-opponent process
produces our black and white vision. This opponent-process model lay relatively dormant
for many years until a pair of visual scientists then working at Eastman Kodak, Leo Hurvich
and Dorothea Jameson, conceived of a method for quantitatively measuring the
opponent-process responses. They invented the hue cancellation method to evaluate
psychophysically the opponent-processing nature of color vision. Due in large part to
their work, we no longer question opponent processing. We call the modern model for
how humans (and other primates) see colors "the Stage Theory" and it incorporates both
the Trichromatic theory and the opponent color theory. The first stage, the Trichromatic
stage, can be considered the receptor stage, which consists of the three photopigments
(the blue, green and red cones). The second is the neural processing stage and this is where
the color opponency occurs. It begins as early as the first post-receptoral layer in the retina
and continues through the visual system and on into the visual cortex itself.
Figure 2.5 shows, in general terms, Hering’s spectrally non-opponent and opponent
processes. We process the Trichromatic signals from the cones into three separate
channels of visual information, an achromatic channel and two chromatic channels.
Hering’s non-opponent process occurs first: we combine the signals from the L and
M cones to produce the Black-White (luminance) channel. Once we produce this channel,
we difference the same two inputs to produce a Red-Green channel, and then we
difference the luminance channel with the S-cone signal to produce the third, the Blue-Yellow channel.
We subsequently process these three channels with two separate circuits in the visual cortex.
The primary circuit, which we call the Achromatic Neural circuit, processes the intensity
channel. The secondary circuit, which we call the Chromatic Neural Circuit, processes
both the Red-Green and the Blue-Yellow channels. This processing of the Trichromatic
cone signals into three channels of information is, in very general terms, how we see in
daylight conditions. It is known as Trivariant color vision and, among mammals, it is
unique to Old World primates.
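
In schematic form, the channel combinations of Figure 2.5 amount to one sum and two differences of the cone signals. The Python sketch below expresses them with unit weights purely for illustration; the actual neural weightings are more complicated than this:

import numpy as np

def opponent_channels(L: np.ndarray, M: np.ndarray, S: np.ndarray):
    """Form the three post-receptoral channels of Figure 2.5 from cone responses.

    Returns (luminance, red_green, blue_yellow). The unit weights are a
    schematic simplification of the real neural combinations."""
    luminance   = L + M        # spectrally non-opponent Black-White channel
    red_green   = L - M        # spectrally opponent Red-Green channel
    blue_yellow = (L + M) - S  # spectrally opponent Blue-Yellow channel
    return luminance, red_green, blue_yellow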
2.4.2 Trivariant Color Vision in Practice
In section 2.3 I discussed the results of a small visualization survey that I conducted
in the spring of 2006. In this survey, I sent two images of the Crowsnest Pass (Figure 2.6)
to a group of over a hundred participants. I reproduce the images here at a smaller scale
for comparison purposes.
Figure 2.6: Three-dimensional view of the
Crowsnest Pass using the same lighting and color
used for the images in the visualization survey.
Figure 2.7: Bump mapped image formed by
multiplying the color values of Figure 2.9 with the
intensity values of Figure 2.8. This display is
analogous to looking at Figure 2.6 from directly
above (i.e. straight down).
Figure 2.8: Small-scale version of Figure 2.4 shown
for comparison. This image is purely achromatic
and is processed by the achromatic neural circuitry
in the visual cortex. Note that in comparison to
Figure 2.7 the underlying perception of the
mountain range doesn’t change by removing the
color.
Figure 2.9: Small-scale version of Figure 2.3 shown
for comparison. This image is purely chromatic and
is processed by the chromatic neural circuitry in the
visual cortex. Note that in comparison to Figure 2.7
the underlying perception of the mountain range is
lost when we remove the lighting.
These four images are an illustration of the underlying processes of trivariant color
vision. Figure 2.6 and Figure 2.7 represent “real world” views of the elevation data. The
former is a three-dimensional image and the latter a two-dimensional bump mapped3
image. They both represent the single integrated image that we are conscious of
whenever we view a scene in the real world.
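
For reference, the bump mapped image of Figure 2.7 is formed by a per-pixel multiplication of the chromatic color values by the shaded relief intensities. A minimal sketch of that operation, assuming numpy arrays with colors and intensities scaled to the range 0 to 1, is:

import numpy as np

def bump_map(chromatic_rgb: np.ndarray, shaded_intensity: np.ndarray) -> np.ndarray:
    """Multiply a purely chromatic image (rows x cols x 3) by a shaded relief
    intensity image (rows x cols), pixel by pixel."""
    return chromatic_rgb * shaded_intensity[..., np.newaxis]

Combined with the earlier sketches, something like bump_map(variable_density(dem), shaded_relief(dem)) would reproduce this construction, up to the differences in the assumed palette and lighting parameters.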
By contrast, Figure 2.8 and Figure 2.9 simulate what happens to these integrated
images once they enter the visual system. According to the now widely accepted
trivariant theory, the visual processing system splits the integrated image4 into three
separate images. The first is an achromatic, intensity only image, which I simulate in
Figure 2.8. The other two are purely chromatic images, which I simulate by the single
combined image Figure 2.9. I have used a single chromatic image here instead of two
because the important point is that there are two neural pathways for processing
information, one for achromatic information and one for chromatic. When we look at
Figure 2.8, what we perceive is the result of processing by the achromatic channel; when
we look at Figure 2.9 what we perceive is the result of processing by the chromatic
channel.
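
One simple way to simulate this split for an arbitrary RGB image, and so to produce analogues of Figure 2.8 and Figure 2.9, is to take a weighted sum of the color components as the achromatic image and to rescale each pixel toward constant luminance for the chromatic image. In the sketch below, the weights are the standard ITU-R BT.601 luma weights and the 0.5 target luminance is an arbitrary choice; this is an illustrative decomposition, not the exact construction used for the survey images, which were built directly from the elevation data:

import numpy as np

LUMA = np.array([0.299, 0.587, 0.114])  # ITU-R BT.601 luma weights

def split_channels(rgb: np.ndarray):
    """Separate an RGB image (rows x cols x 3, values in 0..1) into an
    achromatic (luminance-only) image and a chromatic-only image whose
    luminance is held roughly constant."""
    luminance = rgb @ LUMA                             # achromatic image
    safe = np.maximum(luminance, 1e-6)[..., np.newaxis]
    chromatic = np.clip(0.5 * rgb / safe, 0.0, 1.0)    # hue and saturation only
    return luminance, chromatic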
Comparing these images can provide an understanding of the fundamental nature of
these channels. The only difference between Figure 2.7 and Figure 2.8 is the absence of
the chromatic information. When we compare the two, it is clear that our underlying
perception of the scene does not change because we get almost exactly the same
sensation of perception in the two images. This is not to imply that nothing is lost,
however, because clearly we lose details. For example, there is a small “island”-like
structure in the lower right corner of Figure 2.7. This percept of an island is not there in
the achromatic image. Clearly, it is the colors used that make this appear as an isolated
structure.
3 Bump mapping (Blinn, 1978) is a technique that produces the perception of three-dimensional wrinkles on a two-dimensional surface.
4 The reader is cautioned that at no time does an “image” appear anywhere in the brain. I use the term here as a colloquialism to refer to streams of visual information.
By contrast, the only difference between Figure 2.7 and Figure 2.9 is the absence of
the achromatic shaded relief information and consequently the comparison is more
dramatic. This is because when we look at the purely chromatic image we lose the
sensation of perception almost entirely. Interestingly, the “island” is even more apparent
on the chromatic image than on the bump mapped image. Its visual appearance, though,
is not a percept because it produces very little sensation of perception.
What this simple survey exposes is that the visual processing system is dependent
upon both achromatic and chromatic information. In subsequent chapters, I provide a
detailed description of the properties of these images as well as how they are formed and
processed. For now, though, all that is important is to understand that the achromatic and
the chromatic images are processed by separate but parallel neural circuits in the brain.
Both channels contribute to our sensation of perception but it is the achromatic circuit
that dominates.
2.4.3 Trivariance and Seismic Data
The trivariant nature of vision has direct implications for our ability to communicate
seismic information. There are two conventional seismic displays, the wiggle trace
display and the variable density display. In this discussion, I am only interested in the
variable density display because we create it using color palettes similar to the one I
used throughout this chapter. I discuss the wiggle trace display, which is more achromatic
in nature, in Chapter 4.
Figure 2.10 and Figure 2.11 are variable density displays, the former being of the
elevation data from the Crowsnest Pass and the latter being of a small section of a faulted
seismic line from the Trujillo area of Peru. Both displays use the same gray-dark blue-white-dark red-yellow color palette. For the elevation data, gray represents the lowest
elevation and yellow the highest; for the seismic data, gray represents the lowest negative
amplitudes, white zero amplitude and yellow the highest positive amplitude. Both sets of
data contain exactly the same number of vertical and horizontal samples, in this case 512
by 512.
Figure 2.10: A variable density image of the
Crowsnest Pass elevation data using a gray-dark
blue-white-dark red-yellow color palette. The data
came from a 512 x 512 digital elevation model.
Figure 2.11: A variable density image of a small
section of a seismic line from the Trujillo area of
Peru (data courtesy PeruPetro). The data shown has
the same number of vertical and horizontal samples
as Figure 2.10 (512 traces by 512 samples) and uses
the same color palette.
Figure 2.12: F-K Spectrum of the data shown in
Figure 2.10. The elevation information is
concentrated at very low spatial and temporal
frequencies. We could decimate this data set several
times both spatially and temporally before we lose
significant information.
Figure 2.13: F-K Spectrum of the data shown in
Figure 2.11. The seismic data is spread over a wider
range of both temporal and spatial frequencies.
Decimating this data by a power of two would result
in a significant loss of both spatial and temporal
information. This indicates that it is a more complex
or information rich data source than is the digital
elevation model.
Figure 2.12 and Figure 2.13 show the F-K spectrum of the two data sets out to their
spatial and temporal Nyquists. They show, in F-K space, what readers can judge for
themselves by comparing the two variable density images: that, in terms of information
content, the seismic data is the richer data source. Visually, the seismic just looks busier;
there is a lot more going on in this section than in the image of the elevation data. In
terms of the F-K spectra, it is clear that we could decimate the elevation data several
times before we lost any significant information. This contrasts with the seismic data
because, as is evidenced by the energy beyond the half-Nyquists in Figure 2.13,
decimating the seismic data even once would result in a significant loss of information.
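
For readers who wish to repeat the comparison, an F-K amplitude spectrum of a 512 x 512 section can be computed with a two-dimensional Fourier transform. In the sketch below, the sample intervals dt and dx are illustrative placeholders rather than the actual intervals of either data set:

import numpy as np

def fk_spectrum(section: np.ndarray, dt: float = 0.002, dx: float = 25.0):
    """Amplitude spectrum of a (time samples x traces) section in F-K space.

    Frequencies are returned out to the temporal and spatial Nyquists,
    as in Figures 2.12 and 2.13."""
    nt, nx = section.shape
    spectrum = np.abs(np.fft.fftshift(np.fft.fft2(section)))
    f = np.fft.fftshift(np.fft.fftfreq(nt, d=dt))  # temporal frequency (Hz)
    k = np.fft.fftshift(np.fft.fftfreq(nx, d=dx))  # spatial frequency (cycles/m)
    return f, k, spectrum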
All this leads to the point that a seismic section is an extraordinarily complex object.
It contains a remarkable amount of information, all of which we must communicate
visually. We have assumed that variable density displays show us this information but, in light
of what we now know about trivariant vision, they cannot; the assumption is false. If
the reader doubts this, then consider the following question:
We know what a mountain range looks like; given that Figure 2.10 does not look like a mountain
range, how can Figure 2.11 be a seismic section? The answer is that it is not a real
seismic section at all; it is just a cartoon of a seismic section.