Bidirectional and hierarchical learning models of cognitive representation and computation in the visual cortex
Project outline
Nov. 2016
Haruo Hosoya
Senior researcher, ATR

Background: Visual cortex
• Major neural basis of our visual system
• A wealth of experimental facts from neuroscience
  – Anatomy: multiple pathways/stages, feedforward/feedback, ...
  – Physiology: tuning/selectivity to various visual features, ...
• Theoretical questions:
  – What are the underlying computational principles?
  – How do they lead to our highly sophisticated visual system?
• Potential applications: brain information decoding, artificial intelligence, ...
[Figure: visual cortical areas V1, V2, V4, and IT]

Background: Efficient coding theory
• Hypothesis: the visual cortex might employ a coding strategy optimized to the statistics of natural inputs
• Sparse coding theory [Olshausen & Field 1996] and related theories explained a number of receptive field properties in V1
  – edge detection, color coding, binocular disparity, motion, etc.
• Our recent hierarchical model exhibited a good match with V2 properties [Hosoya & Hyvärinen 2015]
• But most of these results focused on early visual stages
[Figure: hierarchical model with inputs (natural image patches), Layer 1 ("V1 simple"), Layer 2 ("V1 complex", energy pooling x_1^2 + x_2^2), and Layer 3 ("V2", sparsely activated model neurons)]

Background: Bidirectional computation
• Bidirectional (feedforward and feedback) circuitry is ubiquitous in the visual cortex, but its precise roles are not well understood
• A modern approach to studying bidirectionality is Bayesian inference
  – Seminal hierarchical Bayesian vision model [Rao & Ballard 1999]
  – Our project has touched on this direction
    • blind-spot filling-in [Hosoya 2012]
• But prior studies concentrated on rather low-level phenomena
• Bayes' theorem: P(x | y) = P(y | x) P(x) / P(y)
  – P(x): top-down info. (prior prob.)
  – P(y | x): bottom-up info. (likelihood)
  – P(x | y): integrated info. (posterior prob.)

Project goal
• Bidirectionality in the visual cortex has been considered to play roles in visual imagery, contextual inference, and attention
• But we consider that bidirectional computation may be far more essential for the visual system than previously thought
• Questions:
  – What are other kinds of bidirectional visual computation?
  – What are the underlying theoretical principles?
  – How can they explain related neural properties?
  – How do they lead to our sophisticated visual functionalities?
[Figure: examples of bidirectional computation: imagery, contextual completion, hinted recognition]

Recent result: a mixture of sparse coding models
• A hierarchical model with a mixture of sparse coding models can
  – reconcile parts-based and holistic processing of faces and objects
  – explain selectivities and certain tuning properties of face neurons in the monkey IT cortex
• A top-down “explain-away effect” is crucial for these properties
• Preprint available [Hosoya & Hyvärinen 2016]
[Figure: input, energy model (x_1^2 + x_2^2), face sparse-coding model, and object sparse-coding model; facial parts representations for units (1,4), (1,14), and (1,23), with axes x [pixel] and y [pixel]]

Future plan
• We will pursue other roles of bidirectionality in visual functions, e.g.,
  – High-level representations of objects, scenes, faces, etc.
  – Visual attention
  – Invariant visual recognition
  – Visual perception, imagery, and illusion
• Larger-scale bidirectional vision model integrating individual models
• Applicability to realistic problems

References
[Olshausen & Field 1996] Olshausen BA, Field DJ. Emergence of simple-cell receptive field properties by learning a sparse code for natural images. Nature. 1996;381:607–9.
[Hosoya & Hyvärinen 2015] Hosoya H, Hyvärinen A. A Hierarchical Statistical Model of Natural Images Explains Tuning Properties in V2. Journal of Neuroscience. 2015;35:10412–28.
[Rao & Ballard 1999] Rao RP, Ballard DH. Predictive coding in the visual cortex: a functional interpretation of some extra-classical receptive-field effects. Nature Neuroscience. 1999;2:79–87.
[Hosoya 2012] Hosoya H. Multinomial Bayesian learning for modeling classical and nonclassical receptive field properties. Neural Computation. 2012;24:2119–50.
[Hosoya & Hyvärinen 2016] Hosoya H, Hyvärinen A. A mixture of sparse coding models explaining properties of face neurons related to holistic and parts-based processing. bioRxiv: http://dx.doi.org/10.1101/086637
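
Supplementary sketch: Bayes' theorem on a toy example
The background on bidirectional computation describes the posterior P(x | y) as top-down information (the prior P(x)) integrated with bottom-up information (the likelihood P(y | x)). The Python sketch below works through that formula for a discrete set of hypotheses. It is only an illustration of the formula, not the model used in the project: the function name `posterior`, the hypothesis labels, and all numbers are assumptions made up for this example.

```python
def posterior(prior, likelihood):
    """Combine a prior P(x) and a likelihood P(y | x) for one observed y.

    Both arguments are dicts keyed by hypothesis x. Returns a dict
    holding P(x | y) for every hypothesis.
    """
    # Numerator of Bayes' theorem: P(y | x) * P(x) for each hypothesis x.
    joint = {x: likelihood[x] * prior[x] for x in prior}
    # Denominator: the evidence P(y), summed over all hypotheses.
    evidence = sum(joint.values())
    if evidence == 0:
        raise ValueError("observation has zero probability under every hypothesis")
    return {x: p / evidence for x, p in joint.items()}


# Hypothetical numbers: the context (top-down) makes "object" more likely
# a priori, while the image evidence (bottom-up) favors "face".
prior = {"face": 0.2, "object": 0.8}
likelihood = {"face": 0.9, "object": 0.3}

post = posterior(prior, likelihood)
for hypothesis, probability in post.items():
    print(hypothesis, round(probability, 3))
```

In this toy case the posterior lies between what the prior alone and the likelihood alone would suggest, which is the sense in which the two information streams are integrated.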