DBN experiments and some ideas.

Deep belief nets experiments and some ideas.
Karol Gregor
NYU/Caltech
Outline
DBN Image database experiments
Temporal sequences
Deep belief network
[Diagram: layer stack, bottom to top: Input, H1, H2, H3, Labels; trained with backprop]
Preprocessing – Bag of words of SIFT
With: Greg Griffin (Caltech)
Images → Features (using SIFT) → Group them (e.g. K-means) → Bag of words

         Word1  Word2  Word3  …
Image1   23     12     92     …
Image2   11     55     33     …
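The bag-of-words step can be sketched as follows. This is a minimal illustration, not the code used in the experiments: it assumes the codebook (the K-means centres) is already computed, uses toy 2-D vectors in place of real 128-dimensional SIFT descriptors, and the names `nearest_word` and `bag_of_words` are hypothetical.

```python
from collections import Counter

def nearest_word(descriptor, codebook):
    """Index of the codebook entry closest to the descriptor (squared Euclidean distance)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)), key=lambda k: sq_dist(descriptor, codebook[k]))

def bag_of_words(descriptors, codebook):
    """Count how many descriptors of one image are assigned to each visual word."""
    counts = Counter(nearest_word(d, codebook) for d in descriptors)
    return [counts.get(k, 0) for k in range(len(codebook))]

# Assumed toy data: three "words" and four descriptors from one image.
codebook = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
image1 = [(0.1, 0.0), (0.9, 1.1), (0.0, 0.2), (0.1, 0.9)]
histogram = bag_of_words(image1, codebook)
```

Each image becomes one row of word counts, as in the table above, and that row is what the network receives as input.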
13 Scenes Database – Test error
Train error
- Pre-training on larger dataset
- Comparison to SVM, SPM (spatial pyramid matching)
Explicit representations?
Compatibility between databases
Pretraining: Corel database
Supervised training: 15 Scenes database
Temporal Sequences
Simple prediction
Supervised learning: predict Y at time t from X at times t-1, t-2, t-3 through weights W.
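A minimal sketch of this kind of predictor, under assumptions the slide does not state: the model is linear, the signal is a scalar, and the weights are fitted with a simple delta (LMS) rule. The function names and the toy sequence are hypothetical.

```python
def predict(weights, history):
    """Linear prediction of the next value from [x(t-1), x(t-2), x(t-3)]."""
    return sum(w * x for w, x in zip(weights, history))

def train(sequence, order=3, lr=0.05, epochs=200):
    """Fit the weights by stochastic gradient descent on the squared prediction error."""
    weights = [0.0] * order
    for _ in range(epochs):
        for t in range(order, len(sequence)):
            history = [sequence[t - 1 - i] for i in range(order)]  # x(t-1), x(t-2), x(t-3)
            error = sequence[t] - predict(weights, history)
            weights = [w + lr * error * x for w, x in zip(weights, history)]
    return weights

# Assumed toy sequence with period 4, so x(t) = -x(t-2).
sequence = [0.0, 1.0, 0.0, -1.0] * 10
weights = train(sequence)
next_value = predict(weights, [0.0, 1.0, 0.0])  # history x(t-1)=0, x(t-2)=1, x(t-3)=0
```

A model of this form has no internal state beyond the three past inputs.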
With hidden units
(need them for several reasons)
[Diagram: G and X at times t-1, t-2, t-3; H and Y at time t]
$$ -E = \sum_{ijk} W^{XYH}_{ijk} X_i Y_j H_k + \sum_{jk} W^{YH}_{jk} Y_j H_k + \sum_{j} W^{Y}_{j} Y_j + \sum_{k} W^{H}_{k} H_k $$
Memisevic, R. F. and Hinton, G. E., Unsupervised Learning of Image Transformations. CVPR-07
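For binary hidden units, the energy above implies that each H_k is on with probability given by a sigmoid of its total input from X, Y and its bias. The sketch below evaluates -E and that conditional. It is an illustration only, not the contents of any script named in these slides; the nested-list weight layout and the function names are assumptions.

```python
import math

def neg_energy(x, y, h, w_xyh, w_yh, w_y, w_h):
    """-E = sum W^XYH_ijk X_i Y_j H_k + sum W^YH_jk Y_j H_k + sum W^Y_j Y_j + sum W^H_k H_k."""
    three_way = sum(w_xyh[i][j][k] * x[i] * y[j] * h[k]
                    for i in range(len(x)) for j in range(len(y)) for k in range(len(h)))
    two_way = sum(w_yh[j][k] * y[j] * h[k] for j in range(len(y)) for k in range(len(h)))
    bias_y = sum(w_y[j] * y[j] for j in range(len(y)))
    bias_h = sum(w_h[k] * h[k] for k in range(len(h)))
    return three_way + two_way + bias_y + bias_h

def hidden_probabilities(x, y, w_xyh, w_yh, w_h):
    """P(H_k = 1 | X, Y) for binary H: sigmoid of the total input to hidden unit k."""
    probs = []
    for k in range(len(w_h)):
        total = w_h[k]
        total += sum(w_yh[j][k] * y[j] for j in range(len(y)))
        total += sum(w_xyh[i][j][k] * x[i] * y[j]
                     for i in range(len(x)) for j in range(len(y)))
        probs.append(1.0 / (1.0 + math.exp(-total)))
    return probs

# Assumed toy sizes: 2 input units, 2 output units, 1 hidden unit.
x, y = [1, 0], [1, 1]
w_xyh = [[[0.5], [-0.2]], [[0.3], [0.1]]]
w_yh = [[0.1], [0.4]]
w_y = [0.0, 0.2]
w_h = [-0.3]
probs = hidden_probabilities(x, y, w_xyh, w_yh, w_h)
```

Because X only enters through the three-way term, the input acts as a gate on the Y-H interaction rather than as an extra visible layer.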
Example
pred_xyh_orig.m
$$ -E = \sum_{ijk} W^{XYH}_{ijk} X_i Y_j H_k + \sum_{jk} W^{YH}_{jk} Y_j H_k + \sum_{j} W^{Y}_{j} Y_j + \sum_{k} W^{H}_{k} H_k $$
Additions
[Diagram: G and X at time t-1; H and Y at time t]
Sparsity: when H is first inferred, keep only the n largest units on.
Slow H change: after H is first inferred, set H = (G + H)/2.
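The two additions can be sketched as below. Several details are assumptions, since the slide leaves them open: the kept units retain their inferred values (rather than being set to 1), G is taken to be the hidden state from the previous time step, and sparsification is applied before averaging. The function names are hypothetical.

```python
def keep_largest(h, n):
    """Keep the n largest entries of h and set all other units to zero."""
    keep = set(sorted(range(len(h)), key=lambda k: h[k], reverse=True)[:n])
    return [value if k in keep else 0.0 for k, value in enumerate(h)]

def smooth_hidden(g, h):
    """Slow H change: average the previous hidden state G with the newly inferred H."""
    return [(gk + hk) / 2.0 for gk, hk in zip(g, h)]

# Assumed toy values for four hidden units.
g = [0.0, 1.0, 0.0, 0.0]        # hidden state at t-1
h_first = [0.9, 0.2, 0.7, 0.1]  # first inference of H at t
h_sparse = keep_largest(h_first, 2)
h_new = smooth_hidden(g, h_sparse)
```

Averaging with G means a hidden unit needs consistent evidence over more than one step to switch fully on or off.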
Examples
pred_xyh.m
present_line.m
present_cross.m
Cortex+Thalamus
Hippocampus
Senses
Muscles
e.g. Eye (through retina, LGN)
(through subcortical structures)
See, e.g., Jeff Hawkins, On Intelligence
Cortical patch: Complex structure
(not a single layer RBM)
From Alex Thomson and Peter Bannister (see numenta.com)
Desired properties
1) Prediction
A B C D E F → G
H J K L E F → H
2) Explicit representations for sequences
V I S I O N R E S E A R C H
time
3) Invariance discovery
e.g. complex cell
time
4) Sequences of variable length
V I S I O N R E S E A R C H
time
5) Long sequences
Layer2: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 ? ? 2 2 2 2 2 2 2 2 2 2
Layer1: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 1 1 2 3 5 8 13 21 34 55 89 144
6) Multilayer
- Inferred only after some time
V I S I O N R E S E A R C H
time
7) Smoother time steps
8) Variable speed
- Can fit a knob with small speed range
9) Add a clock for actual time
Cortex+Thalamus
Hippocampus
In Addition
- Top down attention
- Bottom up attention
- Imagination
- Working memory
- Rewards
Senses
Muscles
e.g. Eye (through retina, LGN)
(through subcortical structures)
Training data
- Videos
-Of the real world
-Simplified: Cartoons (Simpsons)
-A robot in an environment
-Problem: Hard to grasp objects
-Artificial environment with 3D objects that are easy to manipulate (e.g. Grand Theft Auto IV with objects)