Bidirectional and hierarchical learning models of
cognitive representation and computation
in the visual cortex
–– Project outline ––
Nov, 2016
Haruo Hosoya
Senior researcher, ATR
Background: Visual cortex
•  Major neural basis of our visual system
•  A wealth of experimental facts from neuroscience
–  Anatomy: multiple pathways/stages, feedforward/feedback, ...
–  Physiology: tuning/selectivity to various visual features, ...
•  Theoretical questions:
–  What are underlying computational principles?
–  How do they lead to our highly sophisticated visual system?
•  Potential applications: brain information decoding, artificial
intelligence, ...
[Figure: visual cortical areas V1, V2, V4, and IT]
Background: Efficient coding theory
•  Hypothesis: the visual cortex might employ a coding strategy
optimized to the statistics of natural inputs
•  Sparse coding theory [Olshausen & Field 1996] and related theories
explained a number of receptive field properties in V1
–  edge detection, color coding, binocular disparity, motion, etc.
•  Our recent hierarchical model exhibited a
good match with V2 properties
[Hosoya & Hyvärinen 2015]
•  But most of these results focused on early
visual stages
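As a rough illustration of the sparse coding idea, the sketch below infers sparse coefficients for one input under a fixed, overcomplete set of basis functions. It is a minimal toy, not the learning procedure of [Olshausen & Field 1996] or of [Hosoya & Hyvärinen 2015]: the basis is hand-picked rather than learned, the L1 penalty and the iterative shrinkage-thresholding (ISTA) solver are swapped in for simplicity, and the names `sparse_code`, `basis`, `lam`, and `step` are assumptions made for this example.

```python
import math

def soft_threshold(v, t):
    """Shrink v toward zero by t; values within [-t, t] become exactly 0."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0

def sparse_code(x, basis, lam=0.1, step=0.25, n_iter=500):
    """Infer coefficients a that minimize
    0.5 * ||x - sum_j a[j] * basis[j]||^2 + lam * sum_j |a[j]|
    using ISTA. The step size is assumed small enough for this toy basis."""
    a = [0.0] * len(basis)
    for _ in range(n_iter):
        # Reconstruct the input from the current coefficients.
        recon = [sum(a[j] * basis[j][i] for j in range(len(basis)))
                 for i in range(len(x))]
        resid = [xi - ri for xi, ri in zip(x, recon)]
        # Gradient step on the reconstruction error, then shrinkage.
        a = [soft_threshold(
                 a[j] + step * sum(b * r for b, r in zip(basis[j], resid)),
                 step * lam)
             for j in range(len(basis))]
    return a

# Hypothetical overcomplete basis: two axis-aligned elements and one diagonal.
s = math.sqrt(0.5)
basis = [(1.0, 0.0), (0.0, 1.0), (s, s)]
x = (1.0, 1.0)  # toy "input patch" lying along the diagonal element

coeffs = sparse_code(x, basis)
print(coeffs)
```

In this toy setup the input could be reconstructed from the two axis-aligned elements, but the sparsity penalty favors the single diagonal element, so only its coefficient stays nonzero.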
[Figure, panels (A) and (B): hierarchical model. Inputs (natural image patches from an input image) → Layer 1, “V1 simple” → Layer 2, “V1 complex” (energy pooling, $x_1^2 + x_2^2$) → Layer 3, “V2”; model neurons are sparsely activated.]
Background: Bidirectional computation
•  Bidirectional (feedforward and feedback) circuitry is ubiquitous in the
visual cortex, but its precise roles are not well understood
•  A modern approach to studying bidirectionality is Bayesian inference
–  Seminal hierarchical Bayesian vision model [Rao & Ballard 1999]
–  Our project touched on this direction
•  blind-spot filling-in [Hosoya 2012]
•  But prior studies concentrated on rather low-level phenomena
Bayes’ theorem:
$$P(x \mid y) = \frac{P(y \mid x)\,P(x)}{P(y)}$$
–  P(x): top-down info. (prior prob.)
–  P(y | x): bottom-up info. (likelihood)
–  P(x | y): integrated info. (posterior prob.)
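To make the combination of top-down and bottom-up information concrete, the following sketch applies Bayes' theorem to a discrete hypothesis x given one observation y. The two hypotheses and all probability values are made-up numbers chosen only for illustration; the function name `posterior` is likewise an assumption of this example.

```python
def posterior(prior, likelihood):
    """Return P(x | y) for each hypothesis x.

    prior[x]      : P(x), the top-down information
    likelihood[x] : P(y | x) for the observed y, the bottom-up information
    """
    joint = {x: prior[x] * likelihood[x] for x in prior}
    evidence = sum(joint.values())  # P(y), the normalizing constant
    return {x: j / evidence for x, j in joint.items()}

# Hypothetical numbers: context makes "face" unlikely a priori,
# but the observed input fits "face" better than "object".
prior = {"face": 0.2, "object": 0.8}
likelihood = {"face": 0.9, "object": 0.3}

post = posterior(prior, likelihood)
print(post)
```

Here the posterior for "face" is larger than its prior but still below that of "object", showing how the two sources of information are weighed against each other.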
Project goal
•  Bidirectionality in visual cortex has been considered to play roles in
visual imagery, contextual inference, and attention
•  But we consider that bidirectional computation may be far more
essential for the visual system than previously thought
•  Questions:
–  What are other kinds of bidirectional visual computation?
–  What are the underlying theoretical principles?
–  How can they explain related neural properties?
–  How do they lead to our sophisticated visual functionalities?
[Figure: examples of imagery, contextual completion, and hinted recognition]
Recent result:
a mixture of sparse coding models
•  A hierarchical model with a mixture of sparse coding models can
–  reconcile parts-based and holistic processing of faces and objects
–  explain selectivities and certain tuning properties of face neurons
in the monkey IT cortex
•  A top-down “explain-away effect” is crucial for these properties
•  Preprint available
[Hosoya & Hyvärinen 2016]
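The explain-away idea can be illustrated with a generic mixture computation: each submodel scores the input, a posterior over submodels is formed, and the units of a submodel are scaled by its posterior weight, so strong evidence for one submodel suppresses the other. This is only a toy sketch under assumed simplifications, not the model of [Hosoya & Hyvärinen 2016]; the log-likelihood values, the equal priors, the raw responses, and the names `responsibilities` and `gated_responses` are all hypothetical.

```python
import math

def responsibilities(log_likelihoods, priors):
    """Posterior over submodels, P(k | input), via a numerically stable softmax."""
    scores = [ll + math.log(p) for ll, p in zip(log_likelihoods, priors)]
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]

def gated_responses(raw, resp):
    """Scale each submodel's unit responses by that submodel's posterior weight."""
    return [[r * u for u in units] for units, r in zip(raw, resp)]

# Hypothetical values: submodel 0 ("face") explains the input better
# than submodel 1 ("object").
log_lik = [-2.0, -6.0]
priors = [0.5, 0.5]
raw = [[1.0, 0.5], [0.8, 0.4]]  # made-up feedforward responses per submodel

resp = responsibilities(log_lik, priors)
gated = gated_responses(raw, resp)
print(resp, gated)
```

With these numbers the better-fitting submodel receives almost all of the posterior weight, and the responses of the other submodel are strongly reduced even though its raw responses were sizable.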
[Figure: facial parts representations in panels labeled (1,4), (1,14), and (1,23), each plotted over x [pixel] and y [pixel]; model diagram in which energy inputs ($x_1^2 + x_2^2$) feed a face sparse coding model and an object sparse coding model.]
Future plan
•  We will pursue other roles of bidirectionality in visual functions, e.g.,
–  High-level representations of objects, scenes, faces, etc.
–  Visual attention
–  Invariant visual recognition
–  Visual perception, imagery, and illusion
•  Larger-scale bidirectional vision model integrating individual models
•  Applicability to realistic problems
References
[Olshausen & Field 1996] Olshausen BA, Field DJ. Emergence of simple-cell receptive
field properties by learning a sparse code for natural images. Nature. 1996;381:607–9.
[Hosoya & Hyvärinen 2015] Hosoya H, Hyvärinen A. A Hierarchical Statistical Model of
Natural Images Explains Tuning Properties in V2. Journal of Neuroscience.
2015;35:10412–28.
[Rao & Ballard 1999] Rao RP, Ballard DH. Predictive coding in the visual cortex: a
functional interpretation of some extra-classical receptive-field effects. Nature
Neuroscience. 1999;2:79–87.
[Hosoya 2012] Hosoya H. Multinomial Bayesian learning for modeling classical and
nonclassical receptive field properties. Neural Computation. 2012;24:2119–50.
[Hosoya & Hyvärinen 2016] Hosoya H, Hyvärinen A. A mixture of sparse coding models
explaining properties of face neurons related to holistic and parts-based processing.
bioRxiv: http://dx.doi.org/10.1101/086637