Contextual models for object detection using boosted random fields
by Antonio Torralba, Kevin P. Murphy and William T. Freeman
Quick Introduction
What is this?
Now can you tell?
Belief Propagation (BP)
Network (pairwise Markov random fields):
- observed nodes $y_i$
- hidden nodes $x_i$
- statistical dependency between a hidden node and its observation, called the local evidence: $\phi_i(x_i, y_i)$, shorthand $\phi_i(x_i)$
- statistical dependency between neighboring hidden nodes, the compatibility function: $\psi_{ij}(x_i, x_j)$
Belief Propagation (BP)
Joint probability:
$$p(\{x\}) = \frac{1}{Z} \prod_i \phi_i(x_i) \prod_{(ij)} \psi_{ij}(x_i, x_j)$$
[Figure: a pairwise MRF; hidden nodes $x_1, x_2, \dots$ connected to each other and to their observations $y_i$]
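To make the factorization concrete, here is a minimal NumPy sketch that evaluates this joint on a toy three-node chain; the potential values and the brute-force partition function are illustrative stand-ins, not anything from the paper.

```python
import numpy as np
from itertools import product

# Toy pairwise MRF: 3 binary hidden nodes in a chain, states in {0, 1}.
# phi[i] is the local evidence vector of node i; psi[(i, j)] is the
# compatibility matrix of edge (i, j). All values are made up.
phi = [np.array([0.7, 0.3]), np.array([0.4, 0.6]), np.array([0.5, 0.5])]
psi = {(0, 1): np.array([[0.9, 0.1], [0.1, 0.9]]),
       (1, 2): np.array([[0.8, 0.2], [0.2, 0.8]])}

def unnormalized_p(x):
    """prod_i phi_i(x_i) * prod_(ij) psi_ij(x_i, x_j) for one configuration."""
    p = np.prod([phi[i][xi] for i, xi in enumerate(x)])
    for (i, j), m in psi.items():
        p *= m[x[i], x[j]]
    return p

# Brute-force partition function Z over all 2^3 configurations.
Z = sum(unnormalized_p(x) for x in product([0, 1], repeat=3))
p = {x: unnormalized_p(x) / Z for x in product([0, 1], repeat=3)}
assert np.isclose(sum(p.values()), 1.0)
```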
Belief Propagation (BP)
The belief $b$ at a node $i$ is computed from
- the local evidence of the node, and
- all the messages coming in from its neighbors $N(i)$:
$$b_i(x_i) = k\, \phi_i(x_i) \prod_{j \in N(i)} m_{ji}(x_i) \;\approx\; p_i(x_i \mid y)$$
Belief Propagation (BP)
Messages $m$ between hidden nodes: $m_{ji}(x_i)$ expresses how likely node $j$ thinks it is that node $i$ will be in the corresponding state.
$$m_{ji}(x_i) = \sum_{x_j} \phi_j(x_j)\, \psi_{ji}(x_j, x_i) \prod_{k \in N(j) \setminus i} m_{kj}(x_j)$$
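A small runnable sketch of this update on a toy chain follows; the potentials, the synchronous schedule, and the normalization are illustrative choices, not the paper's.

```python
import numpy as np

# Toy 3-node chain x1 - x2 - x3 with binary states and made-up potentials.
phi = {1: np.array([0.7, 0.3]), 2: np.array([0.4, 0.6]), 3: np.array([0.5, 0.5])}
psi = {e: np.array([[0.9, 0.1], [0.1, 0.9]]) for e in [(1, 2), (2, 1), (2, 3), (3, 2)]}
neighbors = {1: [2], 2: [1, 3], 3: [2]}
messages = {(j, i): np.ones(2) for j in neighbors for i in neighbors[j]}

def send_message(j, i):
    """m_ji(x_i) = sum_{x_j} phi_j(x_j) psi_ji(x_j, x_i) prod_{k in N(j)\\i} m_kj(x_j)"""
    prod_in = phi[j].copy()
    for k in neighbors[j]:
        if k != i:
            prod_in = prod_in * messages[(k, j)]
    m = psi[(j, i)].T @ prod_in   # marginalize over x_j
    return m / m.sum()            # normalize to keep values well scaled

# A few synchronous sweeps; on a tree this converges to the exact marginals.
for _ in range(5):
    messages = {edge: send_message(*edge) for edge in messages}

def belief(i):
    b = phi[i].copy()             # b_i(x_i) ∝ phi_i(x_i) prod_j m_ji(x_i)
    for j in neighbors[i]:
        b = b * messages[(j, i)]
    return b / b.sum()

print(belief(1), belief(2), belief(3))
```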
Conditional Random Field
Distribution of the form:
$$p(x \mid y) = \frac{1}{Z} \prod_i \phi_i(x_i) \prod_{j \in N_i} \psi_{ij}(x_i, x_j)$$
Boosted Random Field
Basic idea:
- Use BP to estimate $P(x \mid y)$
- Use boosting to maximize the log-likelihood of each node with respect to $\phi_i(x_i)$
Algorithm: BP
Minimize the negative log-likelihood of the training data $(y_i)$. The label loss function to minimize is
$$J^t = \sum_i J_i^t, \qquad J_i^t = \prod_m \hat{b}_{i,m}^t(+1)^{-x^*_{i,m}} \left(1 - \hat{b}_{i,m}^t(+1)\right)^{-(1 - x^*_{i,m})}$$
where $x^*_{i,m} = (x_{i,m} + 1)/2$ and $x_{i,m} \in \{-1, +1\}$.
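In code, this loss is just the per-node cross-entropy between the labels and the beliefs. A minimal sketch, with made-up beliefs and labels:

```python
import numpy as np

def node_loss(b_plus, x):
    """log J_i = -sum_m [x*_m log b_m(+1) + (1 - x*_m) log b_m(-1)],
    with labels x in {-1, +1} mapped to x* = (x + 1) / 2."""
    x_star = (x + 1) / 2
    return -np.sum(x_star * np.log(b_plus) + (1 - x_star) * np.log(1 - b_plus))

b_plus = np.array([0.9, 0.2, 0.7])   # toy beliefs b_{i,m}(+1)
x = np.array([+1, -1, +1])           # toy ground-truth labels
print(node_loss(b_plus, x))
```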
Algorithm: BP
The beliefs combine the local evidence with the product of incoming messages:
$$b_i^t(x_i) = \phi_i^t(x_i) \prod_{j \in N(i)} m_{j \to i}^{t-1}(x_i)$$
with the shorthand $M_i^{t-1}(x_i) = \prod_{j \in N(i)} m_{j \to i}^{t-1}(x_i)$, evaluated below at $x_i = +1$.
Algorithm: BP
Message update (the division removes the message previously sent from $i$ to $j$):
$$m_{j \to i}^{t}(x_i) = \sum_{x_j \in \{-1, +1\}} \psi_{j,i}(x_j, x_i)\, \frac{b_j^t(x_j)}{m_{i \to j}^{t-1}(x_j)}$$
Algorithm: BP
The local evidence is parameterized by $F$, a function of the input data:
$$\phi_i^t(x_i) = \left[ e^{F_i^t/2};\; e^{-F_i^t/2} \right], \quad \text{i.e. } \phi_i^t(x_i) = e^{x_i F_i^t / 2}$$
Algorithm: BP
The belief at node $i$ then takes a logistic form:
$$\hat{b}_i^t(+1) = \sigma(F_i^t + G_i^t), \qquad \sigma(u) = \frac{1}{1 + e^{-u}}$$
where
$$G_i^t = \log M_i^t(+1) - \log M_i^t(-1)$$
so the loss becomes
$$\log J_i^t = \sum_m \log\left(1 + e^{-x_{i,m}\left(F_{i,m}^t + G_{i,m}^t\right)}\right)$$
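The identity between the two forms of the loss is easy to verify numerically; the F and G values below are arbitrary.

```python
import numpy as np

sigma = lambda u: 1.0 / (1.0 + np.exp(-u))

F, G = 0.8, -0.3                 # arbitrary evidence and message terms
b_plus = sigma(F + G)            # b_i^t(+1) = sigma(F_i^t + G_i^t)

for x in (+1, -1):
    lhs = np.log1p(np.exp(-x * (F + G)))              # log(1 + e^{-x(F+G)})
    rhs = -np.log(b_plus if x > 0 else 1.0 - b_plus)  # -log b(x)
    assert np.isclose(lhs, rhs)
```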
Function F
$F$ is built up additively, one round at a time:
$$F_i^t(y_{i,m}) = F_i^{t-1}(y_{i,m}) + f_i^t(y_{i,m})$$
Boosting! $f$ is the weak learner: a weighted decision stump,
$$f_i(y) = a\, h(y) + b$$
Minimization of the loss L
$$\log J_i^t = \sum_m \log\left(1 + e^{-x_{i,m}\left(F_{i,m}^t + G_{i,m}^t\right)}\right)$$
Minimizing this over the weak learner reduces to a weighted least-squares problem:
$$\arg\min_{f_i^t} \log J_i^t \approx \arg\min_{f_i^t} \sum_m w_{i,m}^t \left(Y_{i,m}^t - f_i^t(y_{i,m})\right)^2$$
where
$$Y_{i,m}^t = x_{i,m}\left(1 + e^{-x_{i,m}(F_i^t + G_i^t)}\right), \qquad w_{i,m}^t = \hat{b}_i^t(+1)\, \hat{b}_i^t(-1)$$
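A sketch of one such weighted least-squares fit with a regression stump; the stump form, the candidate thresholds, and all numbers are illustrative assumptions.

```python
import numpy as np

def fit_stump(feat, Y, w):
    """Fit f = a * [feat > theta] + b by weighted least squares,
    scanning theta over the observed feature values."""
    best = None
    for theta in np.unique(feat):
        h = (feat > theta).astype(float)
        w1, w0 = w * h, w * (1 - h)
        b = np.sum(w0 * Y) / max(np.sum(w0), 1e-12)   # weighted mean, h = 0 side
        a = np.sum(w1 * Y) / max(np.sum(w1), 1e-12) - b
        err = np.sum(w * (Y - (a * h + b)) ** 2)
        if best is None or err < best[0]:
            best = (err, theta, a, b)
    return best

# Working response Y and weights w from the current scores F + G (toy data).
x = np.array([+1.0, -1.0, +1.0, -1.0])
FG = np.array([0.5, -0.2, 0.1, 0.4])
Y = x * (1 + np.exp(-x * FG))          # Y = x (1 + e^{-x(F+G)})
b_plus = 1 / (1 + np.exp(-FG))
w = b_plus * (1 - b_plus)              # w = b(+1) b(-1)
feat = np.array([0.9, 0.1, 0.8, 0.3])  # a toy stump input h(y)
print(fit_stump(feat, Y, w))
```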
Local Evidence: algorithm
For t = 1..T:
- Iterate N_boost times:
  - find the best basis function $h$
  - update the local evidence with $F_i^t = F_i^{t-1} + f_i^t$
  - update the beliefs
  - update the weights $w_{i,m}^t = \hat{b}_i^t(+1)\, \hat{b}_i^t(-1)$
- Iterate N_BP times:
  - update the messages
  - update the beliefs
[Figure: node $x_i$ receiving $F_i^t$ from its observation $y_i$ and $G_i^t$ from neighboring beliefs $b_j(x_j)$]
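A schematic of this alternating schedule in Python follows. The weak-learner fit and the message update are reduced to toy stand-ins (`fit_weak_learner` and `update_messages` are placeholders, not the paper's routines); only the control flow mirrors the slide.

```python
import numpy as np

sigma = lambda u: 1 / (1 + np.exp(-u))

def fit_weak_learner(feat, Y, w):
    # Placeholder: one least-squares coefficient on a single feature column.
    a = np.sum(w * Y * feat) / np.sum(w * feat ** 2)
    return a * feat

def update_messages(F, G):
    # Placeholder for a real BP sweep: damp G toward zero.
    return 0.5 * G

def train_local_evidence(F, G, x, feat, T=10, n_boost=3, n_bp=3):
    for t in range(T):
        for _ in range(n_boost):                   # boosting rounds
            b_plus = sigma(F + G)
            w = b_plus * (1 - b_plus)              # w_{i,m} = b(+1) b(-1)
            Y = x * (1 + np.exp(-x * (F + G)))     # working response
            F = F + fit_weak_learner(feat, Y, w)   # F^t = F^{t-1} + f^t
        for _ in range(n_bp):                      # BP rounds
            G = update_messages(F, G)              # refresh incoming messages
            # beliefs sigma(F + G) are implicitly refreshed too
    return F, G

x = np.array([+1.0, -1.0, +1.0, -1.0])
feat = np.array([0.9, -0.8, 0.7, -0.6])
F, G = train_local_evidence(np.zeros(4), np.zeros(4), x, feat)
print(sigma(F + G))   # beliefs should move toward the labels
```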
Function G
By assuming that the graph is densely connected we can make the approximation
$$\frac{m_{j \to i}^{t-1}(+1)}{m_{j \to i}^{t-1}(-1)} \approx 1$$
i.e. each individual message is only weakly informative. G then becomes a non-linear additive function of the beliefs: $G_i^t = G(b_m^{t-1})$.
Function G
Instead of learning $\psi_{ij}$, the function $G_i^t(b_m^{t-1})$ can be learnt with an additive model:
$$G_{i,m}^t = \sum_{n=1}^{t} g_i^n(b_m^{n-1}), \qquad g_i^t(b_m^t) = a\,(w \cdot b_m^t) + b$$
i.e. weighted regression stumps over the beliefs.
Function G
The weak learner is chosen by minimizing the loss:
$$\log J_i^t(b^{t-1}) = \sum_m \log\left(1 + e^{-x_{i,m}\left(F_{i,m}^t + \sum_{n=1}^{t-1} g_i^n(b_m^{n-1}) + g_i^t(b_m^{t-1})\right)}\right)$$
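A sketch of that selection step: candidate projections of the neighboring beliefs are scored by the same weighted least-squares criterion used for $f$. The linear-in-projection form and all data below are assumptions for illustration.

```python
import numpy as np

def fit_g(beliefs, Y, w, candidates):
    """Pick g(b) = alpha * (proj . b) + beta minimizing sum_m w_m (Y_m - g)^2.
    beliefs: (M, C) array; candidates: list of projection vectors."""
    best = None
    for proj in candidates:
        z = beliefs @ proj                       # proj . b_m
        A = np.stack([z, np.ones_like(z)], axis=1)
        sw = np.sqrt(w)
        sol, *_ = np.linalg.lstsq(A * sw[:, None], Y * sw, rcond=None)
        err = np.sum(w * (Y - A @ sol) ** 2)
        if best is None or err < best[0]:
            best = (err, proj, sol)
    return best

rng = np.random.default_rng(0)
beliefs = rng.random((6, 3))                   # toy neighboring beliefs
Y = rng.normal(size=6)                         # toy working response
w = rng.random(6)                              # toy weights
candidates = [np.eye(3)[c] for c in range(3)]  # one projection per class
print(fit_g(beliefs, Y, w, candidates))
```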
The Boosted Random Field Algorithm
For t = 1..T:
- find the best basis function $h$ for $f$
- find the best basis function for $g_i^t(b_{N_i,m}^{t-1})$
- compute the local evidence
- compute the compatibilities
- update the beliefs
- update the weights
[Figure: beliefs $b_1, b_2, \dots, b_j$ feeding node $x_i$]
Final classifier
For t = 1..T:
- update the local evidences $F$
- update the compatibilities $G$
- compute the current beliefs
Output classification: $\hat{x}_{i,m} = \delta(b_{i,m} > 0.5)$
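At test time the stored weak learners are simply replayed. A minimal sketch, where `f_rounds` and `g_rounds` are hypothetical stand-ins for the learned per-round updates:

```python
import numpy as np

sigma = lambda u: 1 / (1 + np.exp(-u))

def classify(y_feat, f_rounds, g_rounds, threshold=0.5):
    F = np.zeros_like(y_feat)
    G = np.zeros_like(y_feat)
    b = np.full_like(y_feat, 0.5)        # uninformative initial beliefs
    for f_t, g_t in zip(f_rounds, g_rounds):
        F = F + f_t(y_feat)              # update local evidence F
        G = G + g_t(b)                   # update compatibilities G
        b = sigma(F + G)                 # current beliefs
    return b > threshold                 # x_hat_{i,m} = 1[b_{i,m} > 0.5]

# Hypothetical learned updates: each round nudges F by the input feature
# and G by the (centered) current belief.
f_rounds = [lambda y: 0.3 * y] * 5
g_rounds = [lambda b: 0.1 * (b - 0.5)] * 5
print(classify(np.array([1.0, -1.0]), f_rounds, g_rounds))
```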
Multiclass Detection
U: a dictionary of ~2000 image patches
V: the same number of image masks
At each round t, for each class c and each dictionary entry d there is a weak learner:
$$v^d(I) = (I \otimes U^d) \otimes V^d$$
(the image is correlated with patch $U^d$ and the response spread by mask $V^d$)
Function f
To take different object sizes into account, we first downsample the image, apply the detector, then upsample and OR across scales:
$$f_{x,y,c}^d(I) = \bigvee_s \left[\, v^d(I \downarrow s) \uparrow s \,\right]$$
which is our function for computing the local evidence.
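A sketch of this multiscale OR with SciPy; the patch, mask, scales, and threshold are invented stand-ins for the dictionary entries.

```python
import numpy as np
from scipy.signal import correlate2d, convolve2d
from scipy.ndimage import zoom

def v(I, U, V, thresh=1.0):
    """Patch response: correlate with patch U, spread by mask V, threshold."""
    r = correlate2d(I, U, mode="same")
    r = convolve2d(r, V, mode="same")
    return r > thresh

def f_local_evidence(I, U, V, scales=(1, 2, 4)):
    out = np.zeros(I.shape, dtype=bool)
    for s in scales:
        small = zoom(I, 1.0 / s, order=1)                 # downsample by s
        resp = v(small, U, V).astype(float)
        up = zoom(resp, (I.shape[0] / resp.shape[0],      # upsample back
                         I.shape[1] / resp.shape[1]), order=0)
        out |= up > 0.5                                   # OR across scales
    return out

rng = np.random.default_rng(1)
I = rng.random((32, 32))
U = rng.random((5, 5))                # toy dictionary patch
V = np.ones((3, 3)) / 9.0             # toy mask
print(f_local_evidence(I, U, V).mean())
```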
Function g
The compatibility function has a similar form:
$$g_{x,y,c}^d(b) = \sum_{c'=1}^{C} b_{x',y',c'} \otimes W_{x',y',c'}^d$$
W represents a kernel collecting all the messages directed to node $(x, y, c)$.
Kernels W
[Figure: example kernels W showing incoming messages]
Function G
The overall incoming-messages function is given by:
$$G_{x,y,c}^t(b) \stackrel{\text{def}}{=} \sum_n \sum_{c'=1}^{C} b_{x',y',c'} \otimes W_{x',y',c'}^n$$
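Concretely, this amounts to summing class-wise convolutions of the belief maps with the learned kernels. A sketch, with random stand-ins for the kernels W:

```python
import numpy as np
from scipy.signal import convolve2d

def G_messages(belief_maps, kernels):
    """G(b) = sum_n sum_{c'} b_{c'} (*) W^n_{c'}.
    belief_maps: (C, H, W) array; kernels: list over rounds of (C, kh, kw)."""
    _, H, Wd = belief_maps.shape
    out = np.zeros((H, Wd))
    for Wn in kernels:                        # sum over rounds n
        for c, bc in enumerate(belief_maps):  # sum over classes c'
            out += convolve2d(bc, Wn[c], mode="same")
    return out

rng = np.random.default_rng(2)
beliefs = rng.random((2, 16, 16))                        # toy belief maps
kernels = [0.1 * rng.normal(size=(2, 3, 3)) for _ in range(3)]
print(G_messages(beliefs, kernels).shape)
```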
Learning…
- Labeled dataset of office and street scenes, ~100 images each
- In the first 5 rounds, only the local evidence is updated
- After the 5th iteration, the compatibility functions are updated as well
- At each round, only the F and G of the single object class that most reduces the multiclass cost are updated
Learning…
The biggest objects are detected first, because they reduce the error of all classes the fastest.
The End
Introduction
Observed: Picture
Dictionary: Dog
P(Dog|Pic)
Introduction
P(Head | Pic_i)
P(Tail | Pic_i)
P(Front Legs | Pic_i)
P(Back Legs | Pic_i)
Introduction
Dog!
Comp(Head, Tail)
Comp(Head, Legs)
Comp(Tail, Legs)
Comp(F. Legs, B. Legs)
Introduction
P(Piranha | Pic_i)
Comp(Piranha, Legs)
Graphical Models
Observation nodes $y_i$ (the set $Y$)
$y_i$ can be a pixel or a patch
Graphical Models
Hidden nodes $x_i$ (the set $X$), drawn from a dictionary
Local evidence: $\phi_i(x_i, y_i)$, shorthand $\phi_i(x_i)$
Graphical Models
Compatibility function: $\psi_{ij}(x_i, x_j)$