Shape representations vs. descriptors
• After the segmentation of an image, its regions or edges are represented and described in a manner appropriate for further processing.
• Shape representation: the ways we store and represent the objects
  – Perimeter
  – Interior
• Shape descriptors: methods for characterizing object shapes.
  – The resulting feature values should be useful for discrimination between different object types.

Absolute chain codes
• Search direction: look to the left first and check the neighbors in clockwise direction.
• [Figure: the eight chain code directions, numbered 0–7, and an example object traced from its start point.]
• Chain code in clockwise direction:
  00077665555556700000644444442221111112234445652211

Chain code
• The chain code depends on the starting point.
• It can be normalized for the starting point by treating the code as a circular/periodic sequence and redefining the starting point so that the resulting number is of minimum magnitude.
• We can also normalize for rotation by using the first difference of the chain code (the direction changes between code elements):
  – Code: 10103322
  – First difference (counterclockwise): 33133030
  – Minimum circular shift of first difference: 03033133
• To find the first difference, look at the code and count counterclockwise direction changes between consecutive elements.
• Treating the curve as circular, we add the 3 for the first point.
• This invariance is only valid if the boundary itself is invariant to rotation and scale.

Relative chain code
• Directions are defined in relation to a moving perspective.
• Example: orders given to a blind driver ("F", "B", "L", "R").
• The directional code representing any particular section of line is relative to the directional code of the preceding line segment.
• Why is the relative code R,F,F,R,F,R,R,L,L,R,R,F?
• Note: rotate the code table so that 2 is forward from your position.
• The absolute chain codes for the triangles are 4,1,7 and 0,5,3.
• The relative codes are 7,7,0. (Remember: "forward" is 2.)
• Invariant to rotation, as long as the starting point remains the same.
• Start-point invariance by "minimum circular shift".
• To find the first R, we look back to the end of the contour.
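To make the two normalizations concrete, here is a small Python sketch applied to the 4-directional example code above (the function names are mine, not from the lecture):

import numpy as np

def first_difference(code, n_dirs=4):
    """Rotation normalization: count counterclockwise direction changes
    between consecutive elements, treating the code as circular."""
    code = np.asarray(code)
    prev = np.roll(code, 1)                  # previous element (circular)
    return (code - prev) % n_dirs

def min_circular_shift(code):
    """Start-point normalization: the circular shift that reads as the
    smallest number."""
    code = list(code)
    shifts = [code[i:] + code[:i] for i in range(len(code))]
    return min(shifts)

code = [1, 0, 1, 0, 3, 3, 2, 2]              # 4-directional chain code from the slide
diff = first_difference(code)
print(diff.tolist())                         # [3, 3, 1, 3, 3, 0, 3, 0]  -> 33133030
print(min_circular_shift(diff.tolist()))     # [0, 3, 0, 3, 3, 1, 3, 3]  -> 03033133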
Signature representations
• A signature is a 1D functional representation of a 2D boundary.
• It can be represented in several ways.
• Simple choice: radius vs. angle.
• Invariant to translation.
• Not invariant to starting point, rotation or scaling.

Boundary segments from convex hull
• The boundary can be decomposed into segments.
  – Useful to extract information from concave parts of the objects.
• The convex hull H of a set S is the smallest convex set containing S.
• The set difference H−S is called the convex deficiency D.
• If we trace the boundary and identify the points where we go in and out of the convex deficiency, these points can represent important border points characterizing the shape of the border.
• Border points are often noisy, and smoothing can be applied first:
  – Smooth the border by a moving average of k boundary points.
  – Use a polygonal approximation to the boundary.
  – Simple algorithm to get the convex hull from polygons.
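The radius-vs-angle signature from the signature slide above can be sketched as follows; this assumes the centroid lies inside the object, and the function name and binning scheme are mine:

import numpy as np

def radial_signature(boundary, n_angles=360):
    """Signature r(theta): max distance from the centroid per angle bin.
    boundary: (M, 2) array of (x, y) points along a closed contour."""
    centroid = boundary.mean(axis=0)
    d = boundary - centroid
    r = np.hypot(d[:, 0], d[:, 1])                     # distance from the centroid
    theta = np.arctan2(d[:, 1], d[:, 0])               # angle in (-pi, pi]
    bins = ((theta + np.pi) / (2 * np.pi) * n_angles).astype(int) % n_angles
    signature = np.zeros(n_angles)
    for b in range(n_angles):
        hits = r[bins == b]
        signature[b] = hits.max() if hits.size else 0.0   # empty bins are left at 0
    return signature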
Descriptors extracted from the Convex Hull
Useful features for shape characterization can be e.g.:
• Area of the object and area of its convex hull (CH)
  – CH "solidity", aka "convexity" = (object area)/(CH area)
    = the proportion of pixels in the CH that are also in the object
  – Better than "extent" = (object area)/(area of bounding box)
• Number of components of the convex deficiency
  – Distribution of component areas
• Relative location of
  – points where we go in and out of the convex deficiency
  – points of locally maximal distance to the CH

Skeletons
• The skeleton of a region is defined by the medial axis transform:
  for a region R with border B, for every point p in R, find its closest neighbor in B.
• If p has more than one such neighbor, it belongs to the medial axis.
• The skeleton S(A) of an object A is the medial axis of the object.
• The medial axis transform gives the distance to the border.
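A minimal sketch of the solidity and extent descriptors above, assuming scikit-image is available and the input is a binary mask containing a single object:

import numpy as np
from skimage.morphology import convex_hull_image

def convex_hull_features(mask):
    """Solidity, extent and the convex deficiency for one binary object."""
    mask = mask.astype(bool)
    hull = convex_hull_image(mask)               # binary image of the convex hull H
    object_area = mask.sum()
    rows, cols = np.nonzero(mask)
    bbox_area = (rows.max() - rows.min() + 1) * (cols.max() - cols.min() + 1)
    solidity = object_area / hull.sum()          # (object area)/(CH area)
    extent = object_area / bbox_area             # (object area)/(bounding box area)
    convex_deficiency = hull & ~mask             # pixels in H - S
    return solidity, extent, convex_deficiency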
Introduction to Fourier descriptors
• Suppose that we have an object S and that we are able to find the length of its contour. The contour should be a closed curve.
• We partition the contour into M segments of equal length, and thereby find M equidistant points along the contour of S.
• Travelling anti-clockwise along this contour at constant speed, we can collect a pair of waveforms (= coordinates) x(k) and y(k). Any 1D signal representation can be used for these.
• If the speed is such that one circumnavigation of the object takes 2π, x(k) and y(k) will be periodic with period 2π.

Reminder: complex numbers
• a + bi
• [Figure: the complex plane, with a along the real (x) axis and b along the imaginary (y) axis.]
Contour representation using the 1D Fourier transform
• We choose a direction (e.g. anti-clockwise).
• We view the x-axis as the real axis and the y-axis as the imaginary one for a sequence of complex numbers.
• The coordinates (x, y) of these M points are then put into a complex vector s:
  s(k) = x(k) + i·y(k),  k ∈ [0, M−1]
• Example (contour traced from its start point):
  x = 3 2 3 3 3 3 4 4 4 4 4 4 3
  y = 1 2 3 4 5 6 6 5 4 3 2 1 1
  s(1) = 3 + 1i, s(2) = 2 + 2i, s(3) = 3 + 3i, s(4) = 3 + 4i, ...
• The representation of the object contour is changed, but all the information is preserved.
• We have transformed the contour problem from 2D to 1D.

Fourier coefficients from s(k)
• We perform a 1D forward Fourier transform:
  a(u) = (1/M) Σ_{k=0}^{M−1} s(k) exp(−2πiuk/M)
       = (1/M) Σ_{k=0}^{M−1} s(k) [cos(2πuk/M) − i·sin(2πuk/M)],   u = 0, 1, ..., M−1
• The complex coefficients a(u) are the Fourier representation of the boundary.
• a(0) contains the center of mass of the object.
• Exclude a(0) as a feature for object recognition.
• a(1), a(2), ..., a(M−1) will describe the object in increasing detail.
• These depend on rotation, scaling and the starting point of the contour.
• For object recognition, use only the N first coefficients (N < M).
• This corresponds to setting a(k) = 0 for k > N−1.
Approximating a curve with Fourier coefficients in 1D
• Take the Fourier transform of the signal of length N.
• Keep only the M (< N/2) first Fourier coefficients (set the others to 0 in amplitude).
• Compute the inverse 1D Fourier transform of the modified signal.
• Display the signal corresponding to the M coefficients.
• [Figure: an original signal of length N = 36 approximated in increasing detail, with reconstructions using m = 2, 3, 4, 10, 13 and 15 Fourier coefficients.]

• The inverse Fourier transform gives an approximation to the original contour:
  ŝ(k) = Σ_{u=0}^{N−1} a(u) exp(2πiuk/M),   k = 0, 1, ..., M−1
• We have only used N features to reconstruct each component of ŝ(k).
• The number of points in the approximation is the same (M), but the number of coefficients (features) used to reconstruct each point is smaller (N < M).
• Use an even number of descriptors.
• The first 10–16 descriptors are found to be sufficient for character description. They can be used as features for classification.
  – See e.g. Ø. D. Trier, A. Jain and T. Taxt, "Feature extraction methods for character recognition – a survey", Pattern Recognition, vol. 29, no. 4, pp. 641–662, 1996.
• The Fourier descriptors can be invariant to translation and rotation if the coordinate system is appropriately chosen.
• All properties of 1D Fourier transform pairs (scaling, translation, rotation) can be applied.

Fourier symbol reconstruction
• Look back at the 2D Fourier spectra (23.10): most of the energy is concentrated in the lowest frequencies.
• We can reconstruct the image with increasing accuracy by starting with the lowest frequencies and adding higher ones.
• [Figure: the original symbol and its reconstruction from m = 8 coefficients.]
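Before the Matlab example below, here is a rough numpy sketch of the same computation: Fourier descriptors from a boundary, and reconstruction from only the lowest-frequency coefficients (my own function names, following the formulas above):

import numpy as np

def fourier_descriptors(boundary):
    """boundary: (M, 2) array of (x, y) points -> complex coefficients a(u)."""
    s = boundary[:, 0] + 1j * boundary[:, 1]   # s(k) = x(k) + i*y(k)
    return np.fft.fft(s) / len(s)              # a(u) = (1/M) * sum s(k) exp(-2*pi*i*u*k/M)

def reconstruct(a, n_coeff):
    """Rebuild the boundary from the n_coeff lowest-frequency coefficients."""
    a_trunc = np.zeros_like(a)
    keep = n_coeff // 2                        # n_coeff should be even and >= 2
    a_trunc[:keep] = a[:keep]                  # low positive frequencies (incl. a(0))
    a_trunc[-keep:] = a[-keep:]                # low negative frequencies
    s_hat = np.fft.ifft(a_trunc) * len(a)      # undo the 1/M scaling used above
    return np.column_stack([s_hat.real, s_hat.imag])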
Fourier descriptor example
• Matlab DIPUM Toolbox, on a 26×21-pixel image. The figure shows the image, its boundary, and reconstructions from 2, 4, 6, 8 and 20 coefficients.

b = boundaries(f);
b = b{1};                     % size(b) tells that the contour is 65 pixels long
bim = bound2im(b, 26, 21);    % must tell image dimension
z = frdescp(b);
zinv2 = ifrdesc(z, 2);
z2im = bound2im(zinv2, 26, 21);
imshow(z2im);

Fourier coefficients and invariance
• Translation affects only the center of mass (a(0)).
• Rotation only affects the phase of the coefficients.
• Scaling affects all coefficients in the same way, so ratios a(u1)/a(u2) are not affected.
• The start point affects the phase of the coefficients.
• Normalized coefficients can be obtained, but that is beyond the scope of this course.
Run Length Encoding of Objects
• Sequences of adjacent pixels are represented as "runs".
• Absolute notation of foreground in binary images:
  – Run_i = ...; <row_i, column_i, runlength_i>; ...
• Relative notation in binary images:
  – Start value, length1, length2, ..., eol, ..., start value, length1, length2, ..., eol, eol.
• Relative notation in graylevel images:
  – ...; (graylevel_i, runlength_i); ...
• This is used as a lossless compression transform.
• RLE is found in TIFF, GIF, JPEG, ..., and in fax machines.

"Gray code"
• Is the conventional binary representation of graylevels optimal?
• Consider a single-band graylevel image having b bit planes.
• We desire minimum complexity in each bit plane, because the run-length transform will then be most efficient.
• Conventional binary representation gives high complexity:
  – If the graylevel value fluctuates between 2^k − 1 and 2^k, k+1 bits will change value. Example: 127 = 01111111 while 128 = 10000000.
• In "Gray code" only one bit changes if the graylevel is changed by 1.
• This is also useful for representation of image bit planes.
• The transition from binary code to "Gray code" is a reversible transform, while both binary code and "Gray code" are codes.
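A minimal sketch of the relative run-length notation for a single row (it works for both binary and graylevel rows; the helper name is mine):

import numpy as np

def run_lengths(row):
    """Relative RLE of one image row: a list of (value, runlength) pairs."""
    runs = []
    start = 0
    for i in range(1, len(row) + 1):
        if i == len(row) or row[i] != row[start]:
            runs.append((int(row[start]), i - start))
            start = i
    return runs

print(run_lengths(np.array([0, 0, 1, 1, 1, 0, 5, 5])))
# [(0, 2), (1, 3), (0, 1), (5, 2)]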
Gray Code transforms
• "Binary Code" (BC) to "Gray Code" (GC):
  1. Start at the MSB of the BC and keep all 0s until you hit a 1.
  2. The 1 is kept, but all following bits are complemented until you hit a 0.
  3. The 0 is complemented, but all following bits are kept until you hit a 1.
  4. Go to 2.
• "Gray Code" to "Binary Code":
  1. Start at the MSB of the GC and keep all 0s until you hit a 1.
  2. The 1 is kept, but all following bits are complemented until you hit a 1.
  3. The 1 is complemented, but all following bits are kept until you hit a 1.
  4. Go to 2.
• (A compact code sketch of this transform is given after the learning-goals list below.)

Learning goals - representation
• Chain codes
  – Absolute
    • First difference
  – Relative
    • Minimum circular shift
• Polygonization
  – Focus on recursive
• Signatures
• Convex hull
• Skeletons
  – Thinning
• Fourier descriptors
• Run Length Encoding
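As noted above, the stepwise Gray code rules can be implemented compactly with bit operations (a standard reflected-Gray-code formulation, not taken from the lecture):

def binary_to_gray(n):
    """Gray code of n: XOR each bit with the next more significant bit."""
    return n ^ (n >> 1)

def gray_to_binary(g):
    """Invert the transform by cumulatively XORing from the MSB downwards."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n

print(format(binary_to_gray(127), '08b'))   # 01000000
print(format(binary_to_gray(128), '08b'))   # 11000000 (differs from gray(127) in one bit)
print(gray_to_binary(binary_to_gray(200)))  # 200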
Bayes rule for a classification problem
• Suppose we have J classes, j = 1, ..., J. ω_j is the class label for a pixel, and x is the observed feature vector.
• We can use Bayes rule to find an expression for the class with the highest probability:
  P(ω_j | x) = p(x | ω_j) P(ω_j) / p(x)
  posterior probability = (likelihood × prior probability) / normalizing factor
• P(ω_j) is the prior probability for class j. If we don't have special knowledge that one of the classes occurs more frequently than the others, we set the priors equal for all classes (P(ω_j) = 1/J, j = 1, ..., J).

Euclidean distance vs. Mahalanobis distance
• Euclidean distance between a point x and the class center μ:
  r² = (x − μ)ᵀ(x − μ)
• Mahalanobis distance between x and μ:
  r² = (x − μ)ᵀ Σ⁻¹ (x − μ)
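A small numpy sketch contrasting the two distances (the covariance matrix below is only an illustrative assumption):

import numpy as np

def euclidean_sq(x, mu):
    """Squared Euclidean distance (x - mu)^T (x - mu)."""
    d = x - mu
    return float(d @ d)

def mahalanobis_sq(x, mu, cov):
    """Squared Mahalanobis distance (x - mu)^T Sigma^-1 (x - mu)."""
    d = x - mu
    return float(d @ np.linalg.inv(cov) @ d)

x = np.array([2.0, 1.0])
mu = np.array([0.0, 0.0])
cov = np.array([[4.0, 0.0],
                [0.0, 1.0]])            # large variance along the first feature
print(euclidean_sq(x, mu))              # 5.0
print(mahalanobis_sq(x, mu, cov))       # 2.0 -- the high-variance direction counts less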
Discriminant functions for the normal density
• We saw that minimum-error-rate classification can be computed using the discriminant functions
  g_i(x) = ln p(x | ω_i) + ln P(ω_i)
• With a multivariate Gaussian we get:
  g_i(x) = −½ (x − μ_i)ᵀ Σ_i⁻¹ (x − μ_i) − (d/2) ln 2π − ½ ln |Σ_i| + ln P(ω_i)
• Let us look at this expression for some special cases.

Case 1: Σ_j = σ²I
• An equivalent formulation of the discriminant functions:
  g_i(x) = w_iᵀ x + w_i0,
  where w_i = (1/σ²) μ_i and w_i0 = −(1/(2σ²)) μ_iᵀ μ_i + ln P(ω_i)
• The equation g_i(x) = g_j(x) can be written as
  wᵀ(x − x₀) = 0,
  where w = μ_i − μ_j
  and x₀ = ½(μ_i + μ_j) − (σ² / ‖μ_i − μ_j‖²) ln[P(ω_i)/P(ω_j)] (μ_i − μ_j)
• w = μ_i − μ_j is the vector between the mean values.
• This equation defines a hyperplane through the point x₀ and orthogonal to w.
• If P(ω_i) = P(ω_j), the hyperplane will be located halfway between the mean values.
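A minimal sketch of the case 1 discriminant g_i(x) = w_iᵀx + w_i0, with illustrative values that are not from the lecture:

import numpy as np

def linear_discriminant(x, mu, sigma2, prior):
    """g_i(x) = w_i^T x + w_i0 for the Sigma_j = sigma^2 * I case."""
    w = mu / sigma2
    w0 = -(mu @ mu) / (2 * sigma2) + np.log(prior)
    return w @ x + w0

x = np.array([1.0, 1.0])
mus = [np.array([0.0, 0.0]), np.array([3.0, 3.0])]
scores = [linear_discriminant(x, mu, sigma2=1.0, prior=0.5) for mu in mus]
print(int(np.argmax(scores)))   # 0 -> x is assigned to the class centered at (0, 0)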
A simple model, Σ_j = σ²I
• The distributions are spherical in d dimensions.
• The decision boundary is a generalized hyperplane of d−1 dimensions.
• The decision boundary is perpendicular to the line separating the two mean values.
• If P(ω_i) = P(ω_j), the decision boundary will be halfway between μ_i and μ_j.
• This kind of classifier is called a linear classifier, or a linear discriminant function,
  – because the decision function is a linear function of x.

Case 2: Common covariance, Σ_j = Σ
• If we assume that all classes have the same shape of data clusters, an intuitive model is to assume that their probability distributions have the same shape.
• By this assumption we can use all the data to estimate the covariance matrix.
• This estimate is common for all classes, and this means that also in this case the discriminant functions become linear functions:
  g_j(x) = −½ (x − μ_j)ᵀ Σ⁻¹ (x − μ_j) − ½ ln |Σ| + ln P(ω_j)
         = −½ (xᵀΣ⁻¹x − 2 μ_jᵀΣ⁻¹x + μ_jᵀΣ⁻¹μ_j) − ½ ln |Σ| + ln P(ω_j)
• The terms xᵀΣ⁻¹x and ½ ln |Σ| are common for all classes, so there is no need to compute them; g_j(x) again reduces to a linear function of x.
Case 2: Common covariance, Σ_j = Σ (continued)
• An equivalent formulation of the discriminant functions is
  g_i(x) = w_iᵀ x + w_i0,
  where w_i = Σ⁻¹ μ_i
  and w_i0 = −½ μ_iᵀ Σ⁻¹ μ_i + ln P(ω_i)
• The decision boundaries are again hyperplanes.
• Because w = Σ⁻¹(μ_i − μ_j) is not in the direction of (μ_i − μ_j), the hyperplane will not be orthogonal to the line between the means.

Case 3: Σ_j arbitrary
• The discriminant functions will be quadratic:
  g_i(x) = xᵀ W_i x + w_iᵀ x + w_i0,
  where W_i = −½ Σ_i⁻¹, w_i = Σ_i⁻¹ μ_i,
  and w_i0 = −½ μ_iᵀ Σ_i⁻¹ μ_i − ½ ln |Σ_i| + ln P(ω_i)
• The decision surfaces are hyperquadrics and can assume any of the general forms:
  – hyperplanes
  – hyperspheres
  – pairs of hyperplanes
  – hyperellipsoids
  – hyperparaboloids
  – hyperhyperboloids
Use few, but good features
• To avoid the "curse of dimensionality" we must take care in finding a set of relatively few features.
• A good feature has high within-class homogeneity, and should ideally have large between-class separation.
• In practice, one feature is not enough to separate all classes, but a good feature should:
  – separate some of the classes well
  – isolate one class from the others.
• If two features look very similar (or have high correlation), they are often redundant and we should use only one of them.
• Class separation can be studied by:
  – visual inspection of the feature image overlaid on the training mask
  – scatter plots
• Evaluating features as done by training can be difficult to do automatically, so manual interaction is normally required.

Exhaustive feature selection
• If – for some reason – you know that you will use d out of D available features, an exhaustive search will involve a number of combinations to test:
  n = D! / ((D − d)! d!)
• If we want to perform an exhaustive search through D features for the optimal subset of the d ≤ m "best features", the number of combinations to test is
  n = Σ_{d=1}^{m} D! / ((D − d)! d!)
• Impractical even for a moderate number of features!
  d ≤ 5, D = 100 => n = 79,375,495
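The count on the last bullet can be verified directly in Python 3.8+:

from math import comb

D, m = 100, 5
print(sum(comb(D, d) for d in range(1, m + 1)))   # 79375495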
Suboptimal feature selection
• Select the best single features based on some quality criterion, e.g., estimated correct classification rate.
  – A combination of the best single features will often imply correlated features and will therefore be suboptimal.
• "Sequential forward selection" implies that when a feature is selected or removed, this decision is final.
• "Stepwise forward-backward selection" overcomes this.
  – A special case of the "add a, remove r" algorithm.
• Improved into "floating search" by making the number of forward and backward search steps data dependent.
  – "Adaptive floating search"
  – "Oscillating search"
• More in INF 5300.

k-Nearest-Neighbor classification
• A very simple classifier.
• Classification of a new sample x_i is done as follows:
  – Out of N training vectors, identify the k nearest neighbors (measured by Euclidean distance) in the training set, irrespective of the class label. k should be odd.
  – Out of these k samples, identify the number of vectors k_i that belong to class ω_i, i = 1, 2, ..., M (if we have M classes).
  – Assign x_i to the class ω_i with the maximum number of k_i samples.
• k must be selected a priori.

About kNN-classification
• If k = 1 (1NN-classification), each sample is assigned to the same class as the closest sample in the training data set.
• If the number of training samples is very high, this can be a good rule.
• If k → ∞, this is theoretically a very good classifier.
• This classifier involves no "training time", but the time needed to classify one pattern x_i will depend on the number of training samples, as the distance to all points in the training set must be computed.
• "Practical" values for k: 3 ≤ k ≤ 9.
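A compact numpy sketch of the kNN rule described above (toy data and my own function name):

import numpy as np

def knn_classify(x, X_train, y_train, k=3):
    """Assign x to the majority class among its k nearest training samples."""
    dists = np.linalg.norm(X_train - x, axis=1)     # Euclidean distance to all training vectors
    nearest = np.argsort(dists)[:k]                 # indices of the k nearest neighbors
    labels, counts = np.unique(y_train[nearest], return_counts=True)
    return labels[np.argmax(counts)]                # class with the most votes

X_train = np.array([[0.0, 0.0], [0.2, 0.1], [1.0, 1.0], [1.1, 0.9]])
y_train = np.array([0, 0, 1, 1])
print(knn_classify(np.array([0.9, 1.0]), X_train, y_train, k=3))   # -> 1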
Supervised or unsupervised classification
• Supervised classification
  – Classify each object or pixel into a set of k known classes.
  – Class parameters are estimated using a set of training samples from each class.
  – The object is classified to the class which has the highest posterior probability.
• Unsupervised classification
  – Partition the feature space into a set of k clusters.
  – k is not known and must be estimated (difficult).
  – "The clusters we get are not the classes we want."
• In both cases, classification is based on the value of the set of n features x_1, ..., x_n.
K-means clustering
• Note: the K-means algorithm normally means ISODATA, but different definitions are found in different books.
• K is assumed to be known.
1. Start by assigning K cluster centers:
  – K random data points, or the first K points, or K equally spaced points.
  – For k = 1:K, set μ_k equal to the feature vector x_k for these points.
2. Assign each object/pixel x_i in the image to the closest cluster center using Euclidean distance. Compute for each sample the distance r² to each cluster center:
  r² = (x_i − μ_k)ᵀ(x_i − μ_k)
  and assign x_i to the closest cluster (with minimum r² value).
3. Recompute the cluster centers based on the new labels.
4. Repeat from 2 until #changes < limit.
• ISODATA K-means: splitting and merging of clusters are included in the algorithm.
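A minimal sketch of the plain K-means loop above (no ISODATA splitting or merging; random initialization is one of the options listed in step 1):

import numpy as np

def kmeans(X, K, max_iter=100, seed=0):
    """X: (N, d) array of feature vectors. Returns labels and cluster centers."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=K, replace=False)].astype(float)   # step 1
    labels = np.full(len(X), -1)
    for _ in range(max_iter):
        # step 2: squared Euclidean distance from every sample to every center
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        new_labels = d2.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break                                    # step 4: no changes -> stop
        labels = new_labels
        for k in range(K):                           # step 3: recompute the centers
            if np.any(labels == k):                  # (leave empty clusters unchanged)
                centers[k] = X[labels == k].mean(axis=0)
    return labels, centers

X = np.array([[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]])
labels, centers = kmeans(X, K=2)
print(labels.tolist(), centers.tolist())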
Learning goals from classification
• Be able to use and implement Bayes rule with an n-dimensional Gaussian distribution.
• Know how the μs and Σs are estimated.
• Understand the 2-dimensional case where a covariance matrix is illustrated as an ellipse.
• Be able to simplify the general discriminant function for the 3 cases.
• Have a geometric interpretation of classification with 2 features.

Learning goals continued
• Understand how different measures of classification accuracy work.
• Be familiar with the curse of dimensionality and the importance of selecting few, but good features.
• Understand kNN-classification.
• Understand the difference between supervised and unsupervised classification.
• Understand the K-means algorithm.
• Be able to solve the previous exam questions on classification.