Spatio-Temporal Sequence Learning of
Visual Place Cells for Robotic Navigation
IJCNN, WCCI, Barcelona, Spain, 2010
Nguyen Vu Anh, Alex Leng-Phuan Tay,
Wooi-Boon Goh
School of Computer Engineering
Nanyang Technological University
Singapore
Janusz A. Starzyk
School of Electrical Engineering
Ohio University
Athens, USA

Presented by Nguyen Vu Anh
Date: 20th July 2010
Outline
• Introduction
• HMAX Feature Building and Extraction
• Spatio-Temporal Learning and Recognition
• Empirical Results
• Conclusion and Future Directions
Introduction
• Robotic navigation: Localization and Mapping.
– Topological map & Place cells
– Scope: Topological Visual Localization
• Challenges:
– High dimension and uncertainty of visual features
– Perceptual aliasing
– Complex probabilistic frameworks e.g. HMM
• Approach:
– Structural organization of human memory architecture.
– Short-Term Memory (STM) and Long-Term Memory (LTM) interaction
Introduction
• System Architecture
Figure: system architecture – Feature Building and Extraction → Symbol Quantization → Sequence Storage → Classifier
Introduction
• Existing Works:
– Autonomous navigation (SLAM): Mapping, Localization and Path Planning
• Topological vs metric representation
• Humans employ mainly a topological representation of the environment [O’Keefe (1976), Redish (1999), Eichenbaum (1999), etc.]
– Visual place-cell models: [Torralba (2001); Renninger & Malik (2004); Siagian & Itti (2007)]
• Hierarchical feature building and extraction (HMAX model) [Serre et al. (2007)]
– Spatio-temporal sequence learning: [Wang & Arbib (1990, 1993); Wang & Yuwono (1995)]
• Our previous works: [Starzyk & He (2007); Starzyk & He (2009); Tay et al. (2007); Nguyen & Tay (2009)]
HMAX Feature Building and Extraction
• Interleaving simple (S) and complex (C) layers with increasing spatial invariance (Retina – LGN – V1 – V2, V4)
• 2 stages:
  – Feature Construction
  – Feature Extraction
• Feature Significance:
HMAX Feature Building and Extraction
Figure: spatial invariance processing – dot-product matching of image patches against stored prototypes (a simplified code sketch follows the reference below)
Ref: Riesenhuber & Poggio (1999), Serre et al (2007)
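Below is a minimal, illustrative numpy sketch of the S/C alternation just described: S units perform dot-product (template) matching of image patches against stored prototypes, and C units max-pool the resulting maps for spatial invariance. The patch size, pooling window, and function names (s_layer, c_layer) are assumptions for illustration, not the HMAX parameters of Serre et al.

```python
import numpy as np

def s_layer(image, prototypes):
    """Simple (S) units: dot-product matching of every image patch
    against each stored prototype patch."""
    p = prototypes.shape[1]                      # prototype side length
    H, W = image.shape
    out = np.zeros((len(prototypes), H - p + 1, W - p + 1))
    for k, proto in enumerate(prototypes):
        for i in range(H - p + 1):
            for j in range(W - p + 1):
                patch = image[i:i + p, j:j + p]
                out[k, i, j] = np.dot(patch.ravel(), proto.ravel())
    return out

def c_layer(s_maps, pool=8):
    """Complex (C) units: local max-pooling of each prototype's response
    map, giving increased spatial invariance."""
    K, H, W = s_maps.shape
    return np.array([[[s_maps[k, i:i + pool, j:j + pool].max()
                       for j in range(0, W - pool + 1, pool)]
                      for i in range(0, H - pool + 1, pool)]
                     for k in range(K)])

# Toy usage: 4 random 5x5 prototypes applied to a 32x32 image.
rng = np.random.default_rng(0)
image = rng.random((32, 32))
prototypes = rng.random((4, 5, 5))
features = c_layer(s_layer(image, prototypes)).ravel()   # pooled feature vector
```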
Spatio-Temporal Learning Architecture
• STM Structure:
See: Tay, Zurada, Wong and Xu, TNN, 2007
– Quantization of input using KFLANN with vigilance ρ (a simplified sketch follows below)
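The following is a rough, simplified sketch of vigilance-based quantization in the spirit of KFLANN (it omits KFLANN's centroid re-shuffling and consistency passes): an input joins the nearest existing cluster if enough of its dimensions fall within a tolerance of that centroid (the vigilance test), otherwise it seeds a new cluster, i.e. a new STM symbol. The class name, fixed tolerance, and update rule are assumptions for illustration.

```python
import numpy as np

class VigilanceQuantizer:
    """Illustrative KFLANN-like quantizer: each cluster centroid
    corresponds to one discrete STM symbol."""
    def __init__(self, rho=0.7, tolerance=0.5):
        self.rho = rho                 # vigilance parameter
        self.tolerance = tolerance     # per-dimension tolerance (fixed here for simplicity)
        self.centroids = []            # one centroid per symbol
        self.counts = []

    def quantize(self, x):
        x = np.asarray(x, dtype=float)
        if self.centroids:
            dists = [np.linalg.norm(x - c) for c in self.centroids]
            best = int(np.argmin(dists))
            # vigilance test: fraction of dimensions within tolerance of the nearest centroid
            within = np.mean(np.abs(x - self.centroids[best]) <= self.tolerance)
            if within >= self.rho:
                # accept: update the winning centroid incrementally, return its symbol index
                self.counts[best] += 1
                self.centroids[best] += (x - self.centroids[best]) / self.counts[best]
                return best
        # reject: the input becomes a new symbol
        self.centroids.append(x.copy())
        self.counts.append(1)
        return len(self.centroids) - 1

# Toy usage: turn a stream of feature vectors into a stream of symbols.
q = VigilanceQuantizer(rho=0.7)
symbols = [q.quantize(v) for v in np.random.default_rng(1).random((10, 16))]
```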
Spatio-Temporal Learning Architecture
• LTM Cell Structure:
– Each LTM cell is learnt by a one-shot mechanism.
– Each long training sequence is segmented into N overlapping subsequences of the same length M.
– Each subsequence is dedicated permanently to one LTM cell (a toy segmentation sketch follows below).
Spatio-Temporal Learning Architecture
• LTM Cell Structure:
Figure: LTM cell structure – primary neurons (primary excitation) paired with dual neurons (STM)
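A small sketch of the segmentation step just described: a labelled training sequence of symbols is cut into overlapping subsequences of fixed length M, and each subsequence is stored as one LTM cell carrying the majority label of its elements (the labelling rule used later for testing). The dictionary representation and helper name are placeholders, not the paper's data structures.

```python
from collections import Counter

def segment_into_ltm_cells(symbols, labels, M=100, step=40):
    """Cut a labelled training sequence into overlapping length-M subsequences
    (step < M/2 gives the >50% overlap used on the slides); each subsequence
    becomes one LTM cell tagged with the majority label of its elements."""
    cells = []
    for start in range(0, len(symbols) - M + 1, step):
        subseq = symbols[start:start + M]
        sublab = labels[start:start + M]
        majority = Counter(sublab).most_common(1)[0][0]
        cells.append({"sequence": subseq, "label": majority})
    return cells

# Toy usage: a 400-symbol training sequence covering two (made-up) places.
train_symbols = [i % 25 for i in range(400)]
train_labels = ["place_A"] * 200 + ["place_B"] * 200
ltm_cells = segment_into_ltm_cells(train_symbols, train_labels, M=100, step=40)
```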
Spatio-Temporal Learning Architecture
• Storage
  – One-shot learning
• Recognition (sketched below)
  – Input feature vector → Primary Excitation Computation → Dual Neurons Update (Evidence Accumulation) → Output Matching Score from the last DN
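The following is a deliberately simplified, functional stand-in for the recognition flow above, not the paper's neuron dynamics: each incoming symbol is compared against the stored symbols (primary excitation), evidence for in-order matches accumulates from position to position (standing in for the dual-neuron update), and the score read out at the last position is the cell's matching score.

```python
def ltm_match_score(stored, observed):
    """Count how many stored symbols are observed in the stored order,
    normalised by the cell length M; read the result at the last position."""
    M = len(stored)
    evidence = [0] * (M + 1)                  # evidence[i]: matches using the first i stored symbols
    for symbol in observed:                   # feed the test subsequence symbol by symbol
        prev = 0                              # evidence[i-1] as it was before this symbol arrived
        for i in range(1, M + 1):
            old = evidence[i]
            if stored[i - 1] == symbol:       # primary excitation at position i
                evidence[i] = max(evidence[i], prev + 1)
            if evidence[i - 1] > evidence[i]: # accumulated evidence carries forward along the cell
                evidence[i] = evidence[i - 1]
            prev = old
    return evidence[M] / M                    # matching score from the last (dual-neuron-like) position

# Toy usage: recall with a noisy, partial observation of a stored cell.
stored = [3, 7, 7, 2, 9, 4]
observed = [3, 7, 5, 9, 4]                    # one stored symbol missing, one spurious symbol
score = ltm_match_score(stored, observed)     # 4/6 ≈ 0.67
```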
Empirical Results
• ImageCLEF Competition 2010 dataset
  – 9 classes of places
  – 2 sets of images with the same trajectory (Set S and Set C), ~4000 images per set
Figure: example images from the dataset (labelled C, K, L, O)
Empirical Results
• Task
  – 1 sequence (Set S) as training set and 1 sequence (Set R) as testing set.
• Features
  – 10% of the training sequence
• Training
  – ρ = 0.7
  – Segmentation into consecutive subsequences of equal length (100) with overlapping portion (>50%)
  – Each subsequence is stored as an LTM cell
  – The label of each LTM cell is the majority label of its individual components
• Testing (see the sketch after this list)
  – The label is assigned as the label of the maximally activated LTM cell
  – If the activation of the maximally activated LTM cell is below θ, the system refuses to assign a label
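Below is a short sketch of this test-time decision, reusing the hypothetical ltm_match_score and segment_into_ltm_cells helpers from the earlier sketches: the test subsequence takes the label of the maximally activated LTM cell, unless the best activation is below the threshold θ, in which case no label is assigned.

```python
def classify_subsequence(test_symbols, ltm_cells, theta=0.4):
    """Return the label of the maximally activated LTM cell, or None
    (refuse to label) when the best activation falls below theta."""
    activations = [ltm_match_score(cell["sequence"], test_symbols) for cell in ltm_cells]
    best = max(range(len(ltm_cells)), key=lambda k: activations[k])
    if activations[best] < theta:
        return None                            # system refuses to assign a label
    return ltm_cells[best]["label"]

# Toy usage, with the cells built in the earlier segmentation sketch:
# label = classify_subsequence(test_symbols[:100], ltm_cells, theta=0.4)
```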
Empirical Results
Table: LTM listing with training set S
Empirical Results
• Accuracy without threshold
• Accuracy with threshold θ = 0.4
• Robust testing: missing elements
Empirical Results
Figure: LTM cells’ activation during recall stage
Empirical Results
• Intersection case:
Conclusion
• A hierarchical spatio-temporal learning architecture
– HMAX hierarchical feature construction and extraction
– STM clustering by KFLANN
– Sequence storage and retrieval by LTM cells.
• Application in appearance-based topological localization
Future Directions
• Automatic tolerance estimation
– E.g. signal-to-noise ratio figure of features [Liu & Starzyk 2008]
• Hierarchical episodic memory which characterizes the
interaction between STM and LTM
– Other embodied intelligence components
– Goal creation system [Starzyk 2008]
• Application in other domains:
– Human Action Recognition
Thank you! 