Week 9 - CNS Classes

The doors of perception
Aldous Huxley, 1954
William Blake
The marriage of heaven and hell
1790
If the doors of perception were
cleansed, everything would appear
to man as it is: infinite.
The Doors
JIM MORRISON AND TEXTURE
1) What is texture?
2) Textural segmentation -- textons?
3) Representations for segmentation
CN530 S-2004 9- 3
Week 9: APPROACHES TO
TEXTURAL SEGMENTATION AND GROUPING
“What is it about the optical stimulation available to our eyes
that specifies the layout of surfaces in our environment?”
is the fundamental “computational question” of vision.
Gibson (1950) inaugurated the “modern era” of investigation of
visual texture as information for surface layout.
SOURCES OF VISUAL TEXTURE, 1
WHAT IS VISUAL TEXTURE?
Quote for the day: “Texture is bogus.” (M.B.)
What is visual texture?
Is it a statistical measure of local image variability?
Is it “leftover detail”?
Are there “primitive” elements of visual texture?
If not, what then?
If so, what are they?
What are the units of texture perception?
What if you only see a
part of a buffalo?
Can a water buffalo
be considered
an element of visual texture?
BUFFALO ELEMENTS
Note: “Stuff” is named by “mass nouns” (e.g., grass, sand),
as opposed to “count nouns,” which refer to easily enumerated
entities (e.g., truck, door, “politician with honor,” . . .)
Sources of environmental variation that generate optical texture:
discontinuities in pigmentation of surfaces
“roughness” (physical microfacets) of surfaces
occlusion of small parts of “stuff” by other stuff
SOURCES OF VISUAL TEXTURE, 2
Each pad has texture.
The pads collectively form a texture.
LILY PADS
Think of how optical texture is transformed
as you approach environmental objects:
there are projective transformations (cf. Renaissance art),
but also non-projective transformations, including:
“accretion”
“deletion”
“generation” of new edges (“elements?”) by the eye’s
ability to resolve new levels of detail
etc.
Gibson described the optic array as being made up of
nested solid angles. (Why “nested”?)
Does an analysis of environmental sources of variation for
optical texture suggest that
visual texture is made up of elements?
MY KINGDOM FOR AN ELEMENT
*Although this term is thrown around quite casually in the vision
literature, you should be very careful of how you use it, and
extremely skeptical of how others use it.
II. Texture as a basis for segmentation and grouping, i.e. the
“preattentive*” determination that distinct regions exist in a scene.
(Cf. segmentation, grouping, figure-from-ground, “pop-out” in
visual search, etc.)
I. Texture as a source of information for surface layout
(in the sense of slant, curvature, 3-D arrangement of surfaces.)
TEXTURE PERCEPTION TRADITIONS
TEXTURE MODELING FOR STIMULUS GENERATION
Notwithstanding the preceding diatribe,
interesting approximations of many natural textures can
be made by imagining discrete elements that are “attached”
to some homogeneous background.
In many ecological or computational analyses, such
discrete elements of texture are assumed to be
the same (or statistically the same) physical size on
the surface being modeled,
distributed regularly (or statistically regularly) on a surface,
and lying tangent (or statistically tangent) to a surface.
From such assumptions follows a huge literature on
“slant perception,” “shape-from-texture,” and so on,
. . . . . . . which we will not pursue.
1) Do you first find the boundary between textures and conclude
that there must be distinct regions on either side?
2) Do you first distinguish two aggregates of texture properties
and conclude that a boundary must separate them?
3) Is there a third alternative?
BOUNDARY (edge?)
REGION (surface?)
“line-like”
“area-like”
more 1-D than 2-D
more 2-D than 1-D (?)
ARROW OF CAUSALITY
HISTORICAL ASIDE
CLAIM: The first “computational analysis” of shape-from-texture
was in the dissertation of Purdy (1957), a student of J.J. Gibson,
at Cornell University.
Questions:
Why did Gibson turn away from such “computational analyses”?
Why do Marr and Ullman single him out for attack?
(e.g. “Against direct perception.”)
∫∫ a( x − ξ, y − η)b(ξ,η)dξdη
ISODIPOLE CONJECTURE
ISODIPOLE CONJECTURE: textural regions are discriminable
if and only if
they differ globally in “second order (‘dipole’) statistics.”
Note 1: For the special case of binary images, second order
statistics are equivalent to computation of autocorrelation,
with the proviso that sums are tabulated separately for
every possible displacement (e.g., up, down, diagonal, …)
up to some size limit on displacements.
Note 2: This intuition amounts to the assertion that
TEXTURE IS “VARIANCE” in an image, in the sense
of a second order “moment” or “power spectrum,” which is the
definition of texture found in many computational texts to this day.
Julesz (1981) and Beck (1983) give their respective versions
of how Julesz’s “isodipole” conjecture was refuted.
AUTOCORRELATION AND SECOND ORDER STATISTICS
Consider autocorrelation for binary images.
Let a(x,y) and b(x,y) be two scalar functions on an image.
Cross-correlation:
a ⊗ b = ∫∫ a(x − ξ, y − η) b(ξ, η) dξ dη
becomes autocorrelation if b(x,y) = a(x,y).
NOTE: For discrete images, there are issues of quantization
(spatial and amplitude) and truncation (near image borders)
to be considered.
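The tabulation of second order statistics for a binary image, with a separate sum for every displacement up to a size limit, might be sketched as follows. This is only an illustration under stated assumptions: the function name `dipole_counts` and the parameter `max_shift` are hypothetical, NumPy is assumed to be available, and pixel pairs that would fall outside the image are simply skipped (one of several possible answers to the truncation issue near image borders).

```python
import numpy as np

def dipole_counts(img, max_shift):
    """Count pixel pairs that are both 1, separately for each displacement.

    img: 2-D array of 0/1 values (a binary image).
    max_shift: largest displacement in either direction (hypothetical name);
        must be smaller than both image dimensions.
    Returns a dict mapping (dy, dx) -> number of "on-on" pairs. Pairs that
    would extend past the border are skipped (truncation), not wrapped.
    """
    img = np.asarray(img, dtype=np.int64)
    h, w = img.shape
    if max_shift >= min(h, w):
        raise ValueError("max_shift must be smaller than the image size")
    counts = {}
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            # a holds the pixels at position p + d, b the pixels at p,
            # restricted to the region where both lie inside the image.
            a = img[max(0, dy):h + min(0, dy), max(0, dx):w + min(0, dx)]
            b = img[max(0, -dy):h + min(0, -dy), max(0, -dx):w + min(0, -dx)]
            counts[(dy, dx)] = int(np.sum(a * b))
    return counts

# Vertical stripes one pixel wide: pairs line up at even horizontal shifts only.
stripes = [[1, 0, 1, 0]] * 4
stripe_counts = dipole_counts(stripes, 2)
```

Two regions with the same table of counts (suitably normalized) would be "isodipole" in this discrete sense; the conjecture claims such regions cannot be told apart.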
. . . would need local power spectrum for texture segmentation
source: http://www.geog.ucsb.edu/~jeff/115a/jack_slides/page6.html
DO YOU FEEL THE POWER?
AUTOCORRELATION DEMO
TEXTON THEORY
In place of his refuted conjecture, Julesz offered the
TEXTON THEORY:
1) Textons are the primitive elements of texture. (Cf. protons.)
2) The textural “signature” of a region is viewed as the density
of the respective kinds of textons in that region.
NOTE: Density is a “first order” statistic!
(i.e. It is something like a mean, rather than a correlation
or “second moment” or “variance” of a luminance distribution.)
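A region's textural "signature" as a density of texton types can be illustrated with a short sketch. Everything here is an assumption made for illustration: textons are taken to be already detected and coded as integer labels on a grid (0 for background), and the function name `texton_density` is hypothetical. The point of the example is that the signature is a first order quantity: it counts textons and is blind to how they are arranged inside the region.

```python
import numpy as np

def texton_density(labels, n_types, region_mask):
    """Density of each texton type within a region.

    labels: 2-D int array; 0 = background, 1..n_types = texton type present
        at that site (a hypothetical coding of detected textons).
    region_mask: boolean array of the same shape selecting the region.
    Returns an array of length n_types: count of each type / region area.
    """
    labels = np.asarray(labels)
    region = np.asarray(region_mask, dtype=bool)
    area = int(region.sum())
    if area == 0:
        raise ValueError("region_mask selects no sites")
    counts = np.bincount(labels[region], minlength=n_types + 1)
    return counts[1:n_types + 1] / area

# Left half contains only type-1 textons, right half only type-2 textons.
labels = np.array([[1, 0, 2, 0],
                   [0, 1, 0, 2]])
left = np.zeros(labels.shape, dtype=bool)
left[:, :2] = True
left_signature = texton_density(labels, 2, left)
right_signature = texton_density(labels, 2, ~left)
```

Rearranging the textons inside a region leaves its signature unchanged, which is exactly the property that the linking demonstrations call into question.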
Figure from: Julesz, 1981, who refers to “metameric” matches
of different textures, by analogy to color metamers,
which are two different combinations of wavelengths that yield
indistinguishable perceived colors.
REFUTATION OF “IF”
Different second order statistics, but no segregation:
Q: How many unique classes of textons exist?
Heuristic 1. Human vision operates in two distinct modes
1. Preattentive vision -- parallel, instantaneous, without scrutiny,
independent of the number of patterns, covering a large visual
field, as in texture discrimination.
2. Attentive vision -- serial search by focal attention in 50-ms
steps limited to a small aperture, as in form recognition.
Heuristic 2: Textons
a. Elongated blobs, e.g., rectangles, ellipses, line segments
with specific colors, angular orientations, widths, and lengths.
b. Terminators -- ends-of-line segments
c. Crossings of line segments
Heuristic 3: Preattentive vision directs attentive vision to the
locations where differences in the density (number) of textons
occur, but ignores the positional relationships between textons.
TEXTONS
Q: What is a “texton”? Quoting from Julesz, 1981:
From Julesz (1981).
Image contains same second order (and third order!) statistics
throughout, but there is no difficulty in segregation.
REFUTATION OF “ONLY IF”
Jacob Beck
“Consigliere”
Jake had long since (!)
refuted the isodipole
conjecture, using
displays such as:
Beck, 1972
Ahead of His Time
Those lists always ended with “etc.”
The 1980s (and, to a diminishing extent, the 1990s) saw
the publication of many articles that attempted to find
new “textons,” and every so often somebody would
attempt to list all the known kinds of textons.
Jake went on to note that dramatic
differences in ease of segregation
(of top and bottom halves of the
three figures here) could occur
despite similar “dipole statistics” in
the two halves, depending on
interactions in the arrangements
of local elements.
CONVERGING VIEWS?
The texton theory is quite similar to the view expressed
by Beck (1966, and following) that texture segmentation
occurs by first-order differences in stimulus features
(i.e., little lines or shapes formed by groups of pixels),
rather than as the result of second-order differences of
image points (pixels).
Features may be formed
1) from the outputs of simple filtering
(e.g. center-surround or elongated receptive fields), or
2) BY LINKING OPERATIONS CARRIED OUT ON
SIMPLE FEATURES TO FORM HIGHER-ORDER FEATURES.
NOTE: this last point is in contradiction to Julesz’s third assertion.
It also remains largely ignored in the “computational” literatures.
ISSUES
regarding bases for segregation:
1) What does the texton theory have to say about linking?
I.e., are we concerned only with amounts (numbers, density)
of textons in different regions, or can geometric interactions
among textons affect segregation?
2) When are the results of linking operations perceptually “visible”?
Beck:
“higher order elements”
are formed by
“linking operations.”
EMERGENT FEATURES VIA LINKING
In an often-reproduced figure, Jake noted the importance
of similarities and differences in line orientation in regions
for textural segmentation. Here the right third of the figure
strongly segregates from the rest, even though an individual
rotated [glyph] is judged more similar to an upright [glyph]
than a [second glyph] is to that same upright [glyph].
Orientation and Arrangement
Q: What is
“the postulate of spatial impenetrability”?
Beck’s body of work on textural segmentation
was the single most important source of
constraints in the original design of the
Boundary Contour System.
POTENT BUT INVISIBLE BOUNDARIES
Jake noted that local
interactions, such as
“linking” among individual
texture elements could
create perceptually salient
regions.
Linking of Texture Elements
Boundary completion ==> boundary overruling.
Conclusion: The effective orientation of a perceptual boundary
at a place may not be the orientation of local contrast at that place.
Bonus question: What is the relation of (c) to (a) and (b)?
Contrast sensitivity: What are necessary or sufficient conditions
for activating a) simple cells, b) complex cells, and
c) boundary completion mechanisms?
Emergent features generally do not appear homogeneous
(in the sense of having homogeneous visible brightness/hue.)
Why not?
What should the output of a texture segmentation process tell us?
SEEING VS. RECOGNIZING REVISITED
Beck, Prazdny, and Rosenfeld, 1983
TYPICAL BECK MANUSCRIPT
but also a “fit” between the
orientation of linked features
and the orientation of linking:
(Different parts of BCS/FCS architecture may be of different
relative importance for the two tasks.)
CLAIM: The same mechanism (process) is responsible for both
boundary finding (in the sense that most closely parallels
“edge detection” or “coding the textural primitives” in other
theories) and textural segmentation and grouping.
For Marr, the textural grouping and segmentation process is,
by nature, different from and subsequent to the detection of
constituent elements (and the assignment of symbolic tokens
to stand for those elements.)
Marr: Yes, and they are coded in the raw primal sketch.
Are there any textural primitives?
EXISTENCE OF TEXTURAL PRIMITIVES
I.e., perceptual linking
requires not just a
spatial zone like this:
ELEGANT UNDERSTATEMENT
MORE ZUCKER ET AL.
Prediction 1. Crossings, corners, and bifurcations are represented
at the early processing stages by multiple neurons firing within a
“hypercolumn.”
[Q: What are the interactions between neurons in a hypercolumn?!]
Prediction 2. Endstopped neurons carry
the quantized representation of orientation
and (non-zero) curvature at each position.
Prediction 3. Inter-columnar interactions exist between
curvature consistent (co-circular) tangent hypotheses.
(i.e., between units coding orientation/position combinations
that could be on some same circle.)
[Q: Comparison to “bipole” hypothesis?]
Note 1: Zucker was researching curvature before BCS.
Note 2: Zucker’s work in general merits study.
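The co-circularity condition in Prediction 3 has a simple geometric form that can be written down as a check. The sketch below is not Zucker's implementation; it only tests the textbook geometric fact that two tangents lie on a common circle when they make equal and opposite angles with the chord joining their positions. The function name `cocircularity_error` is hypothetical, and orientations are treated as undirected (modulo pi).

```python
import math

def cocircularity_error(p1, theta1, p2, theta2):
    """Angular deviation from perfect co-circularity, in [0, pi/2] radians.

    p1, p2: (x, y) positions of two tangent hypotheses.
    theta1, theta2: their orientations in radians (undirected, modulo pi).
    The tangents are co-circular when theta1 + theta2 equals twice the
    chord direction, modulo pi; collinear tangents are the limiting case
    of a circle with infinite radius.
    """
    phi = math.atan2(p2[1] - p1[1], p2[0] - p1[0])  # chord direction
    d = (theta1 + theta2 - 2.0 * phi) % math.pi
    return min(d, math.pi - d)

# Two tangents sampled from the unit circle should give (nearly) zero error.
a1, a2 = 0.3, 1.2
on_circle_error = cocircularity_error(
    (math.cos(a1), math.sin(a1)), a1 + math.pi / 2,
    (math.cos(a2), math.sin(a2)), a2 + math.pi / 2,
)
```

A graded interaction between units could be built by letting support fall off with this error, which is one way to frame the comparison with the "bipole" hypothesis.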
The visual system cannot know a priori what kind of processing
to apply to which part of an image. Remember the thermos!
How does one assess the relative contributions of pigmentation,
texture, occlusion, shadows, etc. in forming the intensity of a
local patch of a scene?
Think of phase transitions in physical systems.
In other words,
whether aspects of a scene form a shaded object, or a boundary,
or a textured region, is a determination of the entire model,
not a precondition for invoking a particular module.
WHAT ARE THE MODULES?
CO-TANGENT COMPUTATION
Intro is a critique of “received mythology” re: simple,
complex cells, etc.
Two stages of curve detection:
1) Local:
Coarse and explicit; preconfigured architecture
2) Global:
Fine and implicit; dynamically-constructed architecture
Zucker, Dobbins and Iverson (1989)
ZUCKER ET AL.
[Figure: schematic filter profiles marked with + and − lobes]
Image
Filters (linear,
even-symmetric)
Half-wave rectification
to get simple cell responses
PIR
post inhibition response
Threshold and take max
over small neighborhoods
Wide odd-symmetric filters
Max
First Stage
Linear filter[s] (e.g. horizontal),
of high spatial frequency
[or a range of frequencies]
Second Stage
A pointwise nonlinearity,
e.g., rectification, squaring
Third Stage
Linear filter, e.g., vertical, at
fundamental spatial frequency
[of periodic pattern in stimulus]
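The three stages listed above (linear filter, pointwise nonlinearity, second linear filter) can be sketched as a single "complex channel." This is a minimal illustration, not the published model: the Gabor-shaped kernels, their parameters, squaring as the nonlinearity, and wrap-around (circular) filtering via the FFT are all assumptions chosen to keep the example short, and the names `gabor`, `filter2d`, and `complex_channel` are hypothetical.

```python
import numpy as np

def gabor(size, wavelength, theta, sigma):
    """Even-symmetric, zero-mean Gabor kernel (size should be odd)."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    g = np.exp(-(x**2 + y**2) / (2.0 * sigma**2)) * np.cos(2 * np.pi * xr / wavelength)
    return g - g.mean()  # zero mean: no response to uniform luminance

def filter2d(img, kernel):
    """Circular convolution via FFT (wrap-around borders are an assumption)."""
    img = np.asarray(img, dtype=float)
    kh, kw = kernel.shape
    padded = np.zeros_like(img)
    padded[:kh, :kw] = kernel
    # Centre the kernel on the origin so the output is not shifted.
    padded = np.roll(padded, (-(kh // 2), -(kw // 2)), axis=(0, 1))
    return np.real(np.fft.ifft2(np.fft.fft2(img) * np.fft.fft2(padded)))

def complex_channel(img, first, second, nonlinearity=np.square):
    """Stage 1: fine-scale filter. Stage 2: pointwise nonlinearity.
    Stage 3: coarse-scale filter applied to the rectified output."""
    stage1 = filter2d(img, first)
    stage2 = nonlinearity(stage1)
    return filter2d(stage2, second)

# A fine grating on the left half only: the channel responds to the change
# in local "energy" even though mean luminance carries no coarse pattern.
img = np.zeros((32, 32))
img[:, :16] = np.cos(2 * np.pi * np.arange(16) / 4.0)
fine = gabor(9, 4.0, 0.0, 2.0)
coarse = gabor(17, 32.0, 0.0, 6.0)
response = complex_channel(img, fine, coarse)
```

Replacing `np.square` with `np.abs` gives full-wave rectification instead of squaring; the second-stage filter then sees a slightly different energy map but the structure of the channel is unchanged.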
MALIK & PERONA (1990)
[Figure: schematic filter profiles marked with + and − lobes]
Getting back to texture segregation as such . . .
Graham, Beck, and Sutter (1991) proposed a “complex model”:
MODELS OF TEXTURE SEGREGATION
SIMULATIONS
The model thus expresses less than the sum of all his
intuitions regarding texture segregation.
Note that while Beck is one of the authors of the
model on the previous page, that model is not capable
of “linking” local features to form higher-order features.
A broad consensus exists among many researchers regarding the
function, in textural segmentation, of:
1) early filters that are sensitive to oriented contrasts and a
variety of spatial scales, and
2) an early rectification (or, for some, squaring) nonlinearity,
followed by
3) a later compressive nonlinearity (e.g. “logarithmic” or
“normalizing”), which dovetails into a “choice” mechanism
for asserting the location of a texture boundary.
(See Beck, Week 9.)
TEXTURE SEGMENTATION CONSENSUS
Second nonlinearity:
Early contrast compression OR
normalization via intracortical inhibition
Even-symmetric OR odd-symmetric filters?
Rectification: Full wave OR (two) half-wave(s)?
MALIK AND PERONA ISSUES
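The rectification question (full wave, or two half-waves?) comes down to whether the sign of a filter response survives the nonlinearity. A minimal sketch, assuming NumPy and using hypothetical function names `half_wave` and `full_wave`:

```python
import numpy as np

def half_wave(r):
    """Split a signed response into two non-negative channels (ON, OFF)."""
    r = np.asarray(r, dtype=float)
    return np.maximum(r, 0.0), np.maximum(-r, 0.0)

def full_wave(r):
    """Discard the sign of the response altogether."""
    return np.abs(np.asarray(r, dtype=float))

responses = np.array([-2.0, 0.0, 3.0])
on, off = half_wave(responses)
```

The two half-wave channels sum to the full-wave output, so nothing is lost by keeping them apart, while their difference recovers the original signed response. Full-wave rectification alone cannot distinguish opposite contrast polarities.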
What’s the best way to characterize the relationship
of mechanisms of texture segmentation to other visual functions
(such as brightness perception, figure/ground perception, or
visual search)?
What kinds of formalism are best used for modeling
(i.e. filters, networks, Bayesian approaches)?
What about “linking” and “emergent features”?
TEXTURE SEGMENTATION: STILL NO CONSENSUS
Question 1: What are the similarities and differences of the
“complex channels” model of Graham, Beck and Sutter
and the model of Malik and Perona?
Question 2: How do both of these models compare with BCS?
TEXTURE SEGMENTATION CONSENSUS (?)
1) Do you first find the boundary between textures and conclude
that there must be distinct regions on either side?
2) Do you first distinguish two aggregates of texture properties
and conclude that a boundary must separate them?
3) Is there a third alternative?
BOUNDARY (edge?)
REGION (surface?)
“line-like”
“area-like”
more 1-D than 2-D
more 2-D than 1-D (?)
What do we say now about the following?
ARROW OF CAUSALITY, REVISITED
CANONICAL COMPUTATIONAL APPROACH
REMAINING ISSUES
Remaining ISSUES in textural segmentation and grouping:
1) When do or do not two abutting regions appear distinct from
each other, by virtue of textural differences?
2) Does visible contrast “track” the grouping?
E.g. Glass patterns: no; neon: yes; textures: (usually) no
3) Evaluation of models: Formal steps vs. interpretation of those
steps -- that is, what in nature are model steps identified with?
NOTE: The “channels” of many models are functional and
abstract, and need not correspond with any anatomical
pathway in vivo.
4) What is a discrimination of regions a discrimination of?
What attributes of the outcome of a simulation correspond with
(or “explain”) what attributes of a percept? (Also, what’s missing?
E.g. how do texture boundaries interact with lightness, depth, etc?)
Figures in previous panel and this one taken from
http://www-dbv.informatik.uni-bonn.de/image/example8.html
SEGMENTATION BY FEATURE EXTRACTION
AND CLUSTERING
LOOSE ENDS
Among the categories of models that we should talk more about
in CN 530 (besides models incorporating cortical magnification!)
are medial axis models,
e.g. Blum’s 1967 “grassfire” model.
This image stolen from the web page of Robert L. Ogniewicz
(postdoc at Harvard’s Robotics lab):
http://hrl.harvard.edu/people/postdocs/rlo.html
http://cyvision.if.sc.usp.br/msskeletons/
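The "grassfire" idea can be sketched on a pixel grid: set fire to the boundary of a figure, let the fire spread inward one step at a time, and mark the places where fronts meet and quench. The code below is a toy version under several assumptions that are not from Blum: a 4-connected grid, city-block distance, and a simple local-maximum rule for quench points. The function names `grassfire` and `quench_points` are hypothetical.

```python
from collections import deque

def grassfire(mask):
    """Burn time of every cell: 4-connected steps to the nearest background cell.

    mask: list of rows of 0/1 values, 1 = figure. At least one background
    cell is assumed, otherwise nothing ever ignites and all times stay None.
    """
    h, w = len(mask), len(mask[0])
    time = [[None] * w for _ in range(h)]
    queue = deque()
    for y in range(h):
        for x in range(w):
            if not mask[y][x]:
                time[y][x] = 0          # background: already burnt
                queue.append((y, x))
    while queue:                        # breadth-first spread of the fire front
        y, x = queue.popleft()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if 0 <= ny < h and 0 <= nx < w and time[ny][nx] is None:
                time[ny][nx] = time[y][x] + 1
                queue.append((ny, nx))
    return time

def quench_points(time):
    """Figure cells whose burn time is not exceeded by any 4-neighbour."""
    h, w = len(time), len(time[0])
    points = set()
    for y in range(h):
        for x in range(w):
            t = time[y][x]
            if not t:                   # background (0) or never ignited (None)
                continue
            neighbours = [time[ny][nx]
                          for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))
                          if 0 <= ny < h and 0 <= nx < w]
            if all(n <= t for n in neighbours):
                points.add((y, x))
    return points

# A 3 x 5 rectangle surrounded by a one-cell background border.
rect = [[0] * 7] + [[0, 1, 1, 1, 1, 1, 0] for _ in range(3)] + [[0] * 7]
burn = grassfire(rect)
axis = quench_points(burn)
```

On this small rectangle the quench points are the central spine plus the four corner cells, a coarse analogue of the branches that a medial axis sends toward the corners of a rectangle.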
Image downloaded from http://www-white.media.mit.edu/~fliu/
A book of photographs of natural textures that has become a
standard reference and source of images:
Brodatz, P., "Textures: A Photographic Album for Artists and Designers,"
Dover Publications, New York, 1966.
Brodatz
Bold lines indicate stimulated
units
Radius of detector indicates
its scale
doubly stimulated
medialness detector
medialness detector
boundariness detector
or, even . . .
Burbeck, C. A. & Pizer, S. M. (1995). Object representation by
cores: identifying and representing primitive spatial regions.
Vision Research. 35(13), 1917-1930.
MEDIAL AXES AND SCALES
“The following figure shows a hierarchical segmentation of a
mixture image with 16 Brodatz micro-textures. All textures have
been correctly identified, borders are localized precisely. The
result has been obtained without prior knowledge of the spatial
relationship of different sites. Stable solutions have been detected
for 11 and 16 clusters according to our new model selection
criterium. The 6 cluster solution posesses local stability. For the
segmentation 12 Gabor filters on three octaves were used. The
resolution was K=24 clusters and 300 evaluated dissimilarities
for each site.”
Brodatz’s “coffee table book” has
become a standard reference in both
psychophysics and machine vision!
http://socrates.berkeley.edu/~plab/earlygroup/figureGroundGrouping.htm
Check out:
JUST FOR FUN
FUZZY CORE
SCALE SPACE FOR CORES