Enhanced Generic Fourier Descriptor for Object

Generic Fourier Descriptor for
Shape-based Image Retrieval
Dengsheng Zhang, Guojun Lu
Gippsland School of Comp. & Info Tech
Monash University
Churchill, VIC 3842
Australia
[email protected]
http://www.gscit.monash.edu.au/~dengs/
Faculty of Information Technology
Outline
• Motivations
• Problems
• Generic Fourier Descriptor (GFD)
• Experimental Results
• Conclusions
Motivations
• Content-based Image Retrieval
– Image description is important for image searching
– Image description constitutes one of the key parts of MPEG-7
– Shape is an important image feature along with color and
texture
• Effective and Efficient Shape Descriptor
– good retrieval accuracy, compact features, general
application, low computational complexity, robust retrieval
performance, and hierarchical coarse-to-fine representation
Fourier Descriptor
• Obtained by applying Fourier transform
on a shape signature, such as the central
distance function r(t).
$$a_n = \frac{1}{N} \sum_{t=0}^{N-1} r(t) \exp\left(-\frac{j 2\pi n t}{N}\right), \quad n = 0, 1, \ldots, N-1$$
[Figure labels: "Shape Signature"; "No Contour"; "Same Contour and Different Content"]
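The Fourier descriptor defined on this slide can be sketched in a few lines of NumPy. This is only an illustration: the function names are hypothetical, the contour is assumed to be an ordered list of boundary points, and the centroid is approximated by the mean of those points.

```python
import numpy as np

def centroid_distance(contour):
    """Central distance signature r(t) for an ordered (N, 2) array of boundary points.

    Assumption: the centroid is taken as the mean of the boundary points.
    """
    contour = np.asarray(contour, dtype=float)
    centroid = contour.mean(axis=0)
    return np.linalg.norm(contour - centroid, axis=1)

def fourier_descriptor(r):
    """a_n = (1/N) * sum_t r(t) * exp(-j*2*pi*n*t/N), for n = 0..N-1."""
    r = np.asarray(r, dtype=float)
    # np.fft.fft uses the same exp(-j*2*pi*n*t/N) kernel, without the 1/N factor.
    return np.fft.fft(r) / len(r)

# Usage: a circle has a constant signature, so only a_0 is non-zero.
t = np.linspace(0.0, 2.0 * np.pi, 64, endpoint=False)
circle = np.column_stack([5.0 + 2.0 * np.cos(t), 3.0 + 2.0 * np.sin(t)])
a = fourier_descriptor(centroid_distance(circle))
```

Because the signature is built from the boundary alone, the sketch needs a traceable contour and ignores interior content, which is the limitation the figure on this slide illustrates.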
Zernike Moments
• Acquired by applying the Zernike moment transform to a shape region in polar space.
– Complex form
– Does not allow finer resolution in the radial direction
– Creates a number of repetitions in each order of moment
– Shape must be normalized into a unit disk
$$Z_{nm} = \frac{n+1}{\pi} \sum_x \sum_y f(x, y)\, V_{nm}^{*}(x, y) = \frac{n+1}{\pi} \sum_r \sum_\theta f(r\cos\theta, r\sin\theta)\, R_{nm}(r) \exp(-jm\theta), \quad r \le 1$$
$$V_{nm}(x, y) = V_{nm}(r\cos\theta, r\sin\theta) = R_{nm}(r) \exp(jm\theta)$$
$$R_{nm}(r) = \sum_{s=0}^{(n-|m|)/2} (-1)^s \frac{(n-s)!}{s!\left(\frac{n+|m|}{2}-s\right)!\left(\frac{n-|m|}{2}-s\right)!}\, r^{n-2s}$$
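To make the "complex form" point concrete, here is a minimal sketch of the radial polynomial R_nm(r) evaluated directly from the factorial sum. The function name is hypothetical, and the check that n - |m| is even and non-negative is the standard Zernike constraint, which the slide does not state.

```python
from math import factorial

def zernike_radial(n, m, r):
    """Radial polynomial R_nm(r) from the factorial sum.

    Assumes the usual Zernike constraint: |m| <= n and n - |m| even.
    """
    m = abs(m)
    if m > n or (n - m) % 2 != 0:
        raise ValueError("need |m| <= n and n - |m| even")
    total = 0.0
    for s in range((n - m) // 2 + 1):
        numerator = (-1) ** s * factorial(n - s)
        denominator = (factorial(s)
                       * factorial((n + m) // 2 - s)
                       * factorial((n - m) // 2 - s))
        total += numerator / denominator * r ** (n - 2 * s)
    return total
```

For example, `zernike_radial(2, 0, r)` evaluates the polynomial 2r^2 - 1 and `zernike_radial(1, 1, r)` evaluates r.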
Generic Fourier Descriptor
• Polar Transform
– An input image f(x, y) is first transformed into the polar image f(r, θ):
$$r = \sqrt{(x - x_c)^2 + (y - y_c)^2}, \quad \theta = \arctan\frac{y - y_c}{x - x_c}$$
where (x_c, y_c) is the shape centroid:
$$x_c = \frac{1}{M} \sum_{x=0}^{N-1} x, \quad y_c = \frac{1}{N} \sum_{y=0}^{M-1} y$$
– Find R = max{r(θ)}
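The polar transform step might look like the following sketch for a binary shape image. Two assumptions are made here that the slide does not spell out: the centroid is computed as the mean coordinate of the non-zero (shape) pixels, and `np.arctan2` is used instead of a plain arctangent so that the angle covers the full circle. The function name is hypothetical.

```python
import numpy as np

def polar_coordinates(image):
    """Return the centroid, per-pixel (r, theta), and the maximum radius R.

    image: 2-D array in which non-zero pixels belong to the shape.
    """
    image = np.asarray(image)
    ys, xs = np.nonzero(image)          # row index = y, column index = x
    xc, yc = xs.mean(), ys.mean()       # assumed: centroid of the shape pixels
    r = np.hypot(xs - xc, ys - yc)
    theta = np.arctan2(ys - yc, xs - xc)
    return (xc, yc), r, theta, r.max()

# Usage: a 3x3 block inside a 5x5 image.
img = np.zeros((5, 5), dtype=int)
img[1:4, 1:4] = 1
(xc, yc), r, theta, R = polar_coordinates(img)
```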
Generic Fourier Descriptor-II
• Polar Raster Sampling
[Figure: polar grid; polar image; polar raster sampled image in Cartesian space]
Generic Fourier Descriptor-III
• Binary polar raster sampled shape images
[Figure: binary shape images and their polar raster sampled versions]
Generic Fourier Descriptor-IV
• 2-D Fourier transform on the polar raster sampled image f(r, θ):
$$PF(\rho, \phi) = \sum_r \sum_i f(r, \theta_i) \exp\left[-j 2\pi \left(\frac{r}{R}\rho + \frac{2\pi i}{T}\phi\right)\right]$$
where 0 ≤ r < R and θ_i = i(2π/T) (0 ≤ i < T); 0 ≤ ρ < R, 0 ≤ φ < T.
R and T are the radial frequency resolution and
angular frequency resolution, respectively.
• The normalized Fourier coefficients
are the GFD.
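The sampling and transform steps can be sketched as follows. This is not the authors' implementation: the function names, the default resolutions (32 radial by 64 angular samples), and the nearest-neighbour sampling are all assumptions, and NumPy's standard 2-D DFT is used as a stand-in for the polar Fourier transform.

```python
import numpy as np

def polar_raster_sample(image, n_radial=32, n_angular=64):
    """Sample a shape image on a polar grid centred on the shape centroid.

    Rows of the result index the radius, columns index the angle.
    Nearest-neighbour lookup is an assumption made for simplicity.
    """
    image = np.asarray(image, dtype=float)
    ys, xs = np.nonzero(image)
    xc, yc = xs.mean(), ys.mean()
    max_r = np.hypot(xs - xc, ys - yc).max()
    radii = np.arange(n_radial) * max_r / n_radial
    thetas = np.arange(n_angular) * 2.0 * np.pi / n_angular
    rr, tt = np.meshgrid(radii, thetas, indexing="ij")
    x = np.rint(xc + rr * np.cos(tt)).astype(int)
    y = np.rint(yc + rr * np.sin(tt)).astype(int)
    inside = (x >= 0) & (x < image.shape[1]) & (y >= 0) & (y < image.shape[0])
    polar = np.zeros(rr.shape)
    polar[inside] = image[y[inside], x[inside]]
    return polar

def polar_fourier(polar):
    """2-D DFT of the polar raster image (standard DFT kernel, an assumption)."""
    return np.fft.fft2(polar)

# Usage: a filled disc of radius 10 in a 41x41 image.
yy, xx = np.mgrid[:41, :41]
disc = ((xx - 20) ** 2 + (yy - 20) ** 2 <= 100).astype(float)
polar = polar_raster_sample(disc)
pf = polar_fourier(polar)
```

In this layout a rotation of the shape becomes a circular shift along the angle axis of the polar image, which changes only the phase of the DFT coefficients, so working with their magnitudes discards it.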
Generic Fourier Descriptor-V
• Rotation invariant
[Figure: two shapes, each polar raster sampled and then Fourier transformed to give PF]
Generic Fourier Descriptor-VI
• Translation invariant because the shape centroid is used as the origin.
• Scale normalization:
$$GFD = \left\{ \frac{|PF(0,0)|}{area}, \frac{|PF(0,1)|}{|PF(0,0)|}, \ldots, \frac{|PF(0,n)|}{|PF(0,0)|}, \ldots, \frac{|PF(m,0)|}{|PF(0,0)|}, \ldots, \frac{|PF(m,n)|}{|PF(0,0)|} \right\}$$
• Because f(x, y) is real, only a quarter of the
transformed coefficients are distinct. The first
36 coefficients are selected as the shape
descriptor.
• The similarity between two shapes is
measured by the city block distance between
the two sets of GFDs.
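The normalization and matching steps can be sketched as below. The slide only states that 36 coefficients are kept; splitting them into 4 radial by 9 angular frequencies is an assumption for illustration, as are the function names. The `area` term is passed in by the caller, since the slide does not define it further.

```python
import numpy as np

def gfd(pf, area, n_radial=4, n_angular=9):
    """Normalized GFD feature vector from polar Fourier coefficients pf.

    The first feature is |PF(0,0)| / area; every other feature is
    |PF(rho, phi)| / |PF(0,0)|. The 4 x 9 = 36 split is an assumption.
    """
    mags = np.abs(np.asarray(pf)[:n_radial, :n_angular]).ravel()
    if mags[0] == 0:
        raise ValueError("PF(0,0) is zero; the shape appears to be empty")
    features = np.empty_like(mags, dtype=float)
    features[0] = mags[0] / area
    features[1:] = mags[1:] / mags[0]
    return features

def city_block(a, b):
    """City block (L1) distance between two GFD vectors."""
    return float(np.abs(np.asarray(a) - np.asarray(b)).sum())

# Usage with synthetic coefficients (not real shape data).
pf_demo = np.arange(1, 41, dtype=float).reshape(4, 10)
f1 = gfd(pf_demo, area=2.0)
f2 = gfd(3.0 * pf_demo, area=2.0)
```

Dividing by |PF(0,0)| makes every feature after the first unchanged when all coefficients are multiplied by a common factor, which is what the scale normalization relies on.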
Experiment
• Datasets
– The MPEG-7 region shape database (CE-2) has been tested. MPEG-7 has
organized CE-2 into six datasets to test a shape descriptor's behavior
under different distortions.
– Set A1 tests scale invariance. The 100 shapes in Set A1 have
been classified into 20 groups, which are designated as queries.
– Set A2 tests rotation invariance. The 140 shapes in Set A2 have
been classified into 20 groups, which are designated as queries.
– Set A3 tests rotation/scaling invariance.
– Set A4 tests robustness to perspective transform. The 330
shapes in Set A4 have been classified into 30 groups, which are
designated as queries.
– Set B consists of 2811 shapes from the whole database and is used for
the subjective test. 682 shapes in Set B have been manually classified
into 10 groups by MPEG-7.
– For the whole database, 651 shapes have been classified into 31
groups, which can be used as queries.
Performance Measurement
• Precision-Recall
$$R = \frac{r}{n_1} = \frac{\text{number of relevant retrieved images}}{\text{total number of relevant images in DB}}$$
$$P = \frac{r}{n_2} = \frac{\text{number of relevant retrieved images}}{\text{number of retrieved images}}$$
• For each query, the precision of the retrieval at
each level of recall is obtained. The final
precision is the average precision over
all the query retrievals.
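As a small worked example of the two measures, the sketch below computes precision and recall after each retrieved item of a ranked list. The function name and the boolean-list input format are assumptions for illustration.

```python
def precision_recall(ranked_relevance, total_relevant):
    """Return (precision, recall) after each retrieved item.

    ranked_relevance: booleans in rank order, True if that retrieved
    image is relevant to the query.
    total_relevant: number of relevant images in the database (n1).
    """
    results = []
    hits = 0  # r: relevant images retrieved so far
    for retrieved, is_relevant in enumerate(ranked_relevance, start=1):
        if is_relevant:
            hits += 1
        results.append((hits / retrieved, hits / total_relevant))
    return results

# Usage: 3 images retrieved, 2 relevant images exist in the database.
curve = precision_recall([True, False, True], total_relevant=2)
```

Averaging the precision values at matching recall levels over all queries gives one curve per descriptor.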
Results
• Average Precision-Recall on Set A1 and A2
[Figure: precision (%) versus recall (%) for GFD and ZMD; panels: "Scale Invariance Test" and "Rotation Invariance Test"]
Results
• Average Precision-Recall on Set A4 and CE-2
[Figure: precision (%) versus recall (%) for GFD and ZMD; panels: "Perspective Invariance Test" and "General Invariance Test"]
• Average Precision-Recall on Set B
[Figure: "Subjective Test"; precision (%) versus recall (%) for GFD and ZMD]
Class           1     2     3     4     5     6     7     8     9     10    Average
No. of shapes   68    248   22    28    17    22    45    145   45    42    -
GFD (%)         47.0  66.4  55.6  50.0  50.0  24.8  30.4  50.8  55.6  29.0  46.0
ZMD (%)         37.0  58.0  55.0  41.2  42.6  22.6  33.6  52.0  41.4  34.0  41.7
Results
[Figure labels: "CE-2"; "Set B"]
Conclusions
• A new shape descriptor, the generic Fourier descriptor
(GFD), has been proposed.
• It has been tested on the MPEG-7 region shape database.
• Comparisons have been made between GFD and the
MPEG-7 shape descriptor ZMD.
• Compared with ZMD, GFD has four advantages:
– it captures spectral features in both radial and circular
directions;
– it is simpler to compute;
– it is more robust and perceptually meaningful;
– the physical meaning of each feature is clearer.
• The proposed GFD satisfies all six requirements
set by MPEG-7 for shape representation.