Adaptive Segmentation Based on a Learned Quality Metric
I. Frosio (1), E. Ratner (2)
(1) NVIDIA, USA; (2) Lyrical Labs, USA
Motivation: good / bad segmentation
SLIC (Achanta, 2012)
2
Motivation: good / bad segmentation
GRAPH-CUT (Felzenszwalb, 2004)
3
Motivation: good / bad segmentation
ADAPTIVE GRAPH-CUT (ours)
4
Motivation: good / bad segmentation
>
SLIC (Achanta, 2012)
>
GRAPH-CUT (Felzenszwalb, 2004)
ADAPTIVE GRAPH-CUT (ours)
5
Motivation: good / bad segmentation
Achanta, 2012 (SLIC); Kaufhold, 2004: segmentation algorithms aggregate sets of perceptually similar pixels in an image.
Felzenszwalb, 2004 (graph-cut): a segmentation algorithm should capture perceptually important groupings or regions, which often reflect global aspects of the image.
6
Motivation: segmentation & video compression
Frame segmentation
Segment motion estimation
Encoding
True block and sub-block motion vectors
7
Aim #1: use the human factor
(aka segmentation quality metric)
8
Aim #2: automatic parameter tuning
9
Road map
1) Pick a segmentation algorithm…
2) … Learn a quality metric including the human factor (application needs) …
3) … And put them together (autotuning).
10
Graph-cut
Graph: nodes vi, vj; edges (vi, vj); weights w(vi, vj)
[Figure: example edges labeled w(vi, vj) = 0, w(vi, vj) > 0, and w(vi, vj) >> 0]
11
Graph-cut
Internal difference of component Cm
12
Graph-cut
Difference between components Cm and Cn
13
Graph-cut
Boundary predicate between Ck and Cn
[Figure: example values 10, 15, 12]
14
Graph-cut
Boundary predicate between C1 and C2
[Figure: example values 15, 8, 11]
15
Graph-cut
Boundary predicate between C1 and C2
Observation scale ~ k
16
Graph-cut
[Figure: segmentations obtained with k = 3, k = 100, and k = 10,000]
17
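The graph-cut slides above can be condensed into a small sketch. It assumes the standard formulation in Felzenszwalb (2004): the internal difference Int(C) is the largest edge weight in the minimum spanning tree of C, the difference between two components is the smallest weight of an edge joining them, and the two components are merged when that difference does not exceed min(Int(C1) + k/|C1|, Int(C2) + k/|C2|). The function name `segment_graph` and the toy edge list are illustrative only and are not taken from the slides.

```python
def segment_graph(num_nodes, edges, k):
    """Group nodes into components; `edges` holds (weight, i, j) tuples."""
    parent = list(range(num_nodes))
    size = [1] * num_nodes
    internal = [0.0] * num_nodes  # Int(C), stored at each component root

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Edges are visited by increasing weight, so the edge that merges two
    # components is the largest edge in the merged component's spanning tree.
    for w, i, j in sorted(edges):
        a, b = find(i), find(j)
        if a == b:
            continue
        threshold = min(internal[a] + k / size[a],
                        internal[b] + k / size[b])
        if w <= threshold:  # no boundary: merge the two components
            parent[b] = a
            size[a] += size[b]
            internal[a] = w
    return [find(i) for i in range(num_nodes)]


# Toy chain of six nodes: two tight groups joined by one heavy edge.
toy_edges = [(1, 0, 1), (1, 1, 2), (10, 2, 3), (1, 3, 4), (1, 4, 5)]
for k in (0.5, 3, 100):
    labels = segment_graph(6, toy_edges, k)
    print(k, len(set(labels)))
```

On this toy graph a small k keeps every node separate, an intermediate k recovers the two groups, and a large k merges everything, which mirrors the role of k as the observation scale.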
Road map
1) Pick a segmentation algorithm…
2) … Learn a quality metric including the human factor…
3) … And put them together (autotuning).
18
(Weighted) symmetric uncertainty
4 bits / (7 bits + 5 bits) = 33%
Entropy based average
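A minimal sketch of a weighted symmetric uncertainty follows. It uses the standard definition U = 2 I(X;Y) / (H(X) + H(Y)), which ranges from 0 to 1; whether the slide's 4 / (7 + 5) illustration applies the same factor of 2 cannot be told from the slide text. Comparing each image channel with the corresponding channel of the segmented image, and weighting each channel by its entropy, is an assumed reading of "entropy based average". All function names are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy, in bits, of a sequence of discrete values."""
    n = len(values)
    return -sum(c / n * log2(c / n) for c in Counter(values).values())


def symmetric_uncertainty(x, y):
    """U = 2 * I(X;Y) / (H(X) + H(Y)) for two equally long sequences."""
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0:
        return 0.0  # both signals are constant
    mutual_info = hx + hy - entropy(list(zip(x, y)))
    return 2.0 * mutual_info / (hx + hy)


def weighted_symmetric_uncertainty(image_channels, segmented_channels):
    """Average the per-channel U, weighting each channel by its entropy."""
    scores, weights = [], []
    for x, y in zip(image_channels, segmented_channels):
        scores.append(symmetric_uncertainty(x, y))
        weights.append(entropy(x))  # assumed weighting scheme
    total = sum(weights)
    if total == 0:
        return 0.0
    return sum(s * w for s, w in zip(scores, weights)) / total
```

Each channel is passed as a flat sequence of pixel values, so a segmentation that reproduces a channel exactly scores 1 on it and a constant (fully merged) channel scores 0.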
19
k vs. Uw vs. quality
160 x 120 image block
20
k vs. Uw vs. quality
Training
160 x 120 blocks, 320x240 RGB images
k = [1, …, 10,000]
Visual inspection & classification
21
k vs. Uw vs. quality
Training
160 x 120 blocks, 640x480 RGB images
k = [1, …, 10,000]
Visual inspection & classification
22
Learning the metric
Uw = m log(k) + b
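One way to obtain m and b is an ordinary least-squares fit over the (log k, Uw) points that were classified as good segmentations during training. The slide does not say how the line was actually fitted, so the sketch below is only an assumption; it also assumes the natural logarithm, and the sample values are synthetic.

```python
from math import log


def fit_quality_line(ks, uws):
    """Least-squares fit of Uw = m * log(k) + b; returns (m, b)."""
    xs = [log(k) for k in ks]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(uws) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, uws))
    m = sxy / sxx
    b = mean_y - m * mean_x
    return m, b


# Synthetic points lying exactly on a line (not data from the talk).
ks = [1, 10, 100, 1000]
uws = [0.5 - 0.02 * log(k) for k in ks]
m, b = fit_quality_line(ks, uws)
```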
23
Road map
3) … And put them
together (autotuning).
2) … Learn a quality
metric including the
human factor…
1) Pick a segmentation algorithm…
24
Automatic k selection
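The selection step can be sketched as a search for the k at which the measured Uw meets the learned line Uw = m log(k) + b. The sketch below uses plain bisection on log(k), which is a stand-in and not necessarily the iterative scheme used by the authors. It assumes that the gap between the measured Uw and the line changes sign exactly once between k_min and k_max. The callable `score` is hypothetical: it is expected to segment the image with the given k and return the resulting Uw.

```python
from math import exp, log


def select_k(score, m, b, k_min=1.0, k_max=10000.0, iterations=20):
    """Bisection on log(k) for the crossing of score(k) and m*log(k)+b."""
    def gap(log_k):
        return score(exp(log_k)) - (m * log_k + b)

    lo, hi = log(k_min), log(k_max)
    gap_lo = gap(lo)
    if gap_lo * gap(hi) > 0:
        raise ValueError("score(k) does not cross the line in [k_min, k_max]")
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        gap_mid = gap(mid)
        if gap_lo * gap_mid <= 0:
            hi = mid  # crossing lies in the lower half
        else:
            lo, gap_lo = mid, gap_mid  # crossing lies in the upper half
    return exp(0.5 * (lo + hi))


# Toy stand-in for "segment with k, then measure Uw" (not real data).
def toy_score(k):
    return 1.0 - 0.1 * log(k)


k_star = select_k(toy_score, m=-0.02, b=0.5)
```

Each iteration costs one segmentation, which is why the number of iterations matters in practice.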
25
… and adaptivity
k = k(x,y)
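A spatially varying k can be approximated by choosing one k per block. The sketch below tiles the frame with the 160 x 120 blocks mentioned on the training slides and calls a hypothetical `choose_k(box)` for each tile; how k is interpolated or smoothed between neighboring blocks is not stated on the slide and is left out.

```python
def block_k_map(width, height, choose_k, block_w=160, block_h=120):
    """Return {(left, top, right, bottom): k} for each tile of the frame."""
    k_map = {}
    for top in range(0, height, block_h):
        for left in range(0, width, block_w):
            box = (left, top,
                   min(left + block_w, width),
                   min(top + block_h, height))
            k_map[box] = choose_k(box)  # hypothetical per-block selector
    return k_map


# Example with a constant selector; a real one would inspect the pixels.
k_map = block_k_map(640, 480, lambda box: 100.0)
```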
30
Road map
31
Results - Quality
Adaptive graph-cut (ours)
Graph-cut (Felzenszwalb, 2004) *
SLIC (Achanta, 2012) *
* Same number of segments forced for each algorithm
32
Results
33
Results
SLIC
Graph-cut
Adaptive graph-cut
34
Results
35
Results
SLIC
Graph-cut
Adaptive graph-cut
36
Results: inter-class contrast
(the higher the better)
Sum of the contrasts among segments weighted by their areas (Chabrier, 2004)
[Bar charts, 320x240 and 640x480 images: average and median inter-class contrast for SLIC, Graph-cut, and Adaptive graph-cut]
37
Results: intra-class uniformity
(the lower the better)
Sum of the normalized standard deviation for each region (Chabrier, 2004)
[Bar charts, 320x240 and 640x480 images: average and median intra-class uniformity for SLIC, Graph-cut, and Adaptive graph-cut]
38
Results: contrast - uniformity ratio
(the higher the better)
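The three scores can be illustrated with a deliberately simplified sketch on a grayscale image given as flat lists of pixel values and region labels. It follows only the one-line descriptions on the slides, not the exact formulas in Chabrier (2004): the contrast of a region is taken against all other regions rather than only its neighbors, and the standard deviation is normalized by an assumed value range of 255. All function names are hypothetical.

```python
from statistics import mean, pstdev


def region_values(pixels, labels):
    """Group pixel values by region label."""
    regions = {}
    for value, label in zip(pixels, labels):
        regions.setdefault(label, []).append(value)
    return regions


def inter_class_contrast(pixels, labels):
    """Area-weighted mean contrast between region means (higher is better)."""
    regions = region_values(pixels, labels)
    means = {r: mean(v) for r, v in regions.items()}
    total = 0.0
    for r, values in regions.items():
        others = [abs(means[r] - means[q]) for q in means if q != r]
        contrast = mean(others) if others else 0.0
        total += len(values) * contrast
    return total / len(pixels)


def intra_class_uniformity(pixels, labels, value_range=255.0):
    """Sum of per-region normalized standard deviations (lower is better)."""
    regions = region_values(pixels, labels)
    return sum(pstdev(v) / value_range for v in regions.values())


def contrast_uniformity_ratio(pixels, labels):
    """1000 * Inter / Intra, as labeled on the chart axis."""
    intra = intra_class_uniformity(pixels, labels)
    if intra == 0:
        return float("inf")
    return 1000.0 * inter_class_contrast(pixels, labels) / intra
```

For four pixels with values 0, 10, 200, 210, grouping the two dark and the two bright pixels gives a higher contrast, a lower uniformity score, and a higher ratio than pairing one dark with one bright pixel.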
[Bar charts, 320x240 and 640x480 images: average and median of 1000 * Inter / Intra for SLIC, Graph-cut, and Adaptive graph-cut]
39
Discussion
LEARNED segmentation quality metric including the HUMAN FACTOR
Iterative method to AUTOMATICALLY and ADAPTIVELY compute the optimal scale parameter
40
A more general approach
(edge thresholding segmentation in YUV)
41
A more general approach
(edge thresholding segmentation in YUV)
Openbroadcast encoding (x264)
Lyrical Labs encoding (adaptive segmentation)
42
A more general approach
(edge thresholding segmentation in YUV)
Openbroadcast encoding (x264)
Lyrical Labs encoding (adaptive segmentation)
43
Open issues & improvements
Resolution dependency (160x120 blocks)
Learning: the Berkeley Segmentation Dataset
Avoid iterations (see I. Frosio, SPIE EI 2015)
44
Questions
45