Applied Mathematics and Computation 205 (2008) 916–926
Classification of plant leaf images with complicated background
Xiao-Feng Wang a,b,c,*, De-Shuang Huang a, Ji-Xiang Du a,b, Huan Xu a,b, Laurent Heutte d

a Intelligent Computing Lab, Hefei Institute of Intelligent Machines, Chinese Academy of Sciences, P.O. Box 1130, Hefei, Anhui 230031, China
b Department of Automation, University of Science and Technology of China, Hefei 230027, China
c Department of Computer Science and Technology, Hefei University, Hefei 230022, China
d Lab LITIS, UFR Sciences, University of Rouen, France
Article info
Keywords:
Image segmentation
Plant leaf
Complicated background
Watershed segmentation
Hu geometric moments
Zernike moment
Moving center hypersphere (MCH) classifier
Abstract
Classifying plant leaves has so far been an important and difficult task, especially for leaves with complicated background, where interferents and overlapping phenomena may exist. In this paper, an efficient classification framework for leaf images with complicated background is proposed. First, an automatic marker-controlled watershed segmentation method, combined with pre-segmentation and morphological operations, is introduced to segment leaf images with complicated background based on prior shape information. Then, seven Hu geometric moments and sixteen Zernike moments are extracted as shape features from the segmented binary images after leafstalk removal. In addition, a moving center hypersphere (MCH) classifier, which can efficiently compress feature data, is designed to handle the resulting mass of high-dimensional shape features. Finally, experimental results on practical plant leaves show that the proposed classification framework works well for leaf images with complicated background: twenty classes of practical plant leaves are successfully classified, and the average correct classification rate reaches 92.6%.
© 2008 Elsevier Inc. All rights reserved.
1. Introduction
Plants play the most important part in the cycle of nature. They are the primary producers that sustain all other life forms, including people, because plants are the only organisms that can convert light energy from the sun into food. Animals, incapable of making their own food, depend directly or indirectly on plants for their food supply. Moreover, all of the oxygen available for living organisms comes from plants, and plants are also the primary habitat for thousands of other organisms. In addition, many of the fuels people use today, such as coal, natural gas and gasoline, were made from plants that lived millions of years ago. However, in recent years, people have been seriously destroying natural environments, so that many plants constantly die and even die out every year. Consequently, the resulting ecological crisis has brought many serious consequences, including land desertification, climate anomalies and floods, which menace the survival of human beings and the development of society. People have now realized the importance and urgency of protecting plant resources. Besides taking effective measures to protect plants, it is important for ordinary people to be able to recognize and classify plants, which can further enhance the public's consciousness of plant protection. Therefore, besides professional botanists, many non-professional researchers have paid increasing attention to plant classification.
Classifying plants is a process in which each individual plant is correctly assigned to a descending series of groups of related plants. According to statistical and survey data, there are about 400,000 species of plants, of which 270,000 have been named and identified by botanists. It is impossible for any botanist or non-professional researcher to know more than a tiny fraction of the total number of named species, which makes further research on plants difficult [1]. Up to now, some new plant taxonomy methods, such as cytotaxonomy, chemotaxonomy, serotaxonomy and cladistics, have become popular. However, these new methods are all complicated and time-consuming and can mainly be carried out by botanists. Comparatively, traditional shape taxonomy methods are still widely used, since they are easily implemented and suitable for classifying field-living plants. In recent years, information technologies, including image processing and pattern recognition techniques, have been introduced into plant shape taxonomy to compensate for the limits of human classification ability [1–10].
According to the theory of plant shape taxonomy, plants are basically classified according to the shapes of their leaves and flowers. Usually, leaves are approximately two-dimensional in shape while flowers are three-dimensional, and it is difficult to analyze the shapes and structures of flowers because of their complex 3D structures [2]. Moreover, leaves can easily be found and collected everywhere in all seasons, whereas flowers can only be obtained in the blooming season. Therefore, leaves are widely used for computer-aided plant classification. Im et al. [3] used a hierarchical polygon approximation representation of leaf shape to classify varieties of the Acer family. Oide and Ninomiya [4] chose leaf shape images as neural network inputs and applied a Hopfield model and a simple perceptron to soybean leaf classification. Wang et al. [5,6] proposed combining different features based on the centroid-contour distance curve and adopted the fuzzy integral for leaf image retrieval. Mokhtarian and Abbasi [7] used the curvature scale space image to represent leaf shapes and applied it to the classification of leaves with self-intersection. Fu and Chi [1] combined a thresholding method and a BP neural network to extract leaf veins. Du et al. [8] adopted an accelerated Douglas–Peucker approximation algorithm for leaf shape approximation and used a modified dynamic programming algorithm for leaf shape matching. Wu et al. [9] and Tak and Hwang [10] have also studied leaf image retrieval.
Although some encouraging results have been obtained, most research on leaf classification has focused on feature extraction and classifier design. The test leaf images used in these studies are generally specimen images and indoor images with pure or simple background. Accordingly, traditional segmentation methods, such as thresholding methods, gradient operators and morphological operators, were selected to segment the leaf objects from color or gray images. However, one of the objectives of leaf classification is to identify field-living leaves, whose digital images inevitably contain complicated background. For example, overlaps between adjacent parts of leaves are sometimes unavoidable, which may create confusion between the boundaries of adjacent leaves. Interferents around the target leaves, such as small stones, ruderals and dead leaves, may make segmentation results unsatisfactory. Even in indoor leaf images, there may exist branches or non-target leaves touching the target leaves. As a result, the boundaries of target leaves obtained from traditional segmentation methods will unavoidably connect to the boundaries of branches or non-target leaves, which makes feature extraction imprecise.
In this paper, we propose an efficient classification framework for leaf images with complicated background. First, an automatic marker-controlled watershed method combined with pre-segmentation and morphological operations is applied to segment leaf images with complicated background based on prior shape information. Then, shape features including seven Hu geometric moments and sixteen Zernike moment features are extracted from the segmented binary images. In addition, an efficient moving center hypersphere (MCH) classifier with a data compression function is introduced to handle the extracted features.
The rest of this paper is organized as follows. In Section 2, we introduce the automatic marker-controlled watershed segmentation method for leaf images with complicated background. Shape feature extraction is presented in Section 3. The moving center hypersphere classifier is then described in Section 4. In Section 5, the proposed classification framework is validated on practical plant leaf images with complicated background. Finally, some concluding remarks are given in Section 6.
2. Automatic marker-controlled watershed segmentation
As discussed above, leaf images with complicated background usually contain more than one object, including foreground objects, i.e., target leaves, and background objects (small stones, ruderals, branches, non-target leaves and other interferents). Moreover, target leaves may touch or cover the background objects. Consequently, traditional segmentation methods, including thresholding methods, edge operators and morphological operators, can hardly separate the target leaves from background objects due to their own limitations. In this section, we introduce one effective method to address this problem, called the automatic marker-controlled watershed segmentation method.

Watershed segmentation [11] is a region growing technique belonging to the class of morphological operations. It is usually simulated by an immersion process, which increases speed and accuracy. A digital watershed is defined as a small region that cannot be assigned uniquely to any influence zone of the local minima in the gradient image. Watershed segmentation has increasingly been recognized as a powerful segmentation process due to its many advantages [12], including simplicity, speed and complete division of the image. Even for target regions with low contrast and weak boundaries, watershed segmentation can always provide closed contours.
In contrast to classical area-based segmentation, watershed segmentation is executed on the gradient image, which can be regarded as a topography with boundaries between regions. However, over-segmentation inevitably arises because the gradient image exhibits too many minima. To avoid severe over-segmentation, marker-controlled watershed segmentation was proposed, which is normally implemented by region growing based on a set of markers [13]. The fundamental idea is to filter out the undesired minima of the gradient image according to the markers. Certain desired local minima are selected as markers, and geodesic reconstruction is then applied to fill the other minima up to non-minimum plateaus. The marker image used for watershed segmentation is a binary image consisting of marker regions, where each marker is placed inside an object (either a foreground object or a background object). Each initial marker has a one-to-one relationship with a specific watershed region, so the number of markers equals the final number of watershed regions. After segmentation, the boundaries of the watershed regions are arranged on the desired ridges, thus separating each object from its neighbors [14]. The markers can be selected manually or automatically; in this paper, we propose automatically creating markers for leaf images based on pre-segmentation results and prior shape information.
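To make the mechanism concrete, the minimal sketch below shows how a marker-controlled watershed can be run with scikit-image; the library choice and function names are our illustration, not the implementation used in the paper.

```python
import numpy as np
from skimage.filters import sobel
from skimage.segmentation import watershed

def marker_controlled_watershed(gray, markers):
    """Flood the gradient image starting only from the given markers.

    gray    : 2-D float array, the gray-scale leaf image.
    markers : 2-D int array; 0 = unlabeled, positive integers label the
              marker regions (one label per desired watershed region).
    """
    gradient = sobel(gray)          # topographic surface to be flooded
    labels = watershed(gradient, markers=markers)
    return labels                   # one output region per initial marker
```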
Although thresholding methods cannot give satisfying results when segmenting leaf images with complicated background, they can still provide useful information for automatic marker creation. In this paper, we choose the Otsu thresholding method [15] to construct the preliminary marker image. The Otsu method belongs to the global thresholding algorithms, which assume that the image histogram is bimodal. The basic idea is that the threshold $t$ that minimizes the weighted within-class variance $\sigma_W^2(t)$ among all possible values is the one that best divides target from background. The weighted within-class variance $\sigma_W^2(t)$ is represented as follows:
$$\sigma_W^2(t) = q_1(t)\,\sigma_1^2(t) + q_2(t)\,\sigma_2^2(t), \tag{1}$$
where the class probabilities are estimated as
$$q_1(t) = \sum_{i=1}^{t} P(i), \qquad q_2(t) = \sum_{i=t+1}^{I} P(i), \tag{2}$$
and the class means are given by
$$\mu_1(t) = \sum_{i=1}^{t} \frac{i\,P(i)}{q_1(t)}, \qquad \mu_2(t) = \sum_{i=t+1}^{I} \frac{i\,P(i)}{q_2(t)}. \tag{3}$$
Finally, the individual class variances are
$$\sigma_1^2(t) = \sum_{i=1}^{t} \left[i - \mu_1(t)\right]^2 \frac{P(i)}{q_1(t)}, \qquad \sigma_2^2(t) = \sum_{i=t+1}^{I} \left[i - \mu_2(t)\right]^2 \frac{P(i)}{q_2(t)}. \tag{4}$$
For a gray-scale image, $\sigma_W^2(t)$ is computed over the full range of $t$ values $[1, 256]$, and the $t$ that minimizes $\sigma_W^2$ is selected as the threshold.
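As an illustration of Eqs. (1)–(4), the exhaustive search can be written in a few lines of NumPy; this is a sketch of the standard algorithm, not the authors' code.

```python
import numpy as np

def otsu_threshold(gray):
    """Exhaustive search for the t minimizing the weighted within-class
    variance of Eq. (1), using a 256-bin gray-level histogram."""
    hist, _ = np.histogram(gray, bins=256, range=(0, 256))
    P = hist / hist.sum()                         # gray-level probabilities P(i)
    i = np.arange(256)
    best_t, best_var = 0, np.inf
    for t in range(1, 256):
        q1, q2 = P[:t].sum(), P[t:].sum()         # class probabilities, Eq. (2)
        if q1 == 0 or q2 == 0:
            continue                              # skip degenerate splits
        mu1 = (i[:t] * P[:t]).sum() / q1          # class means, Eq. (3)
        mu2 = (i[t:] * P[t:]).sum() / q2
        var1 = (((i[:t] - mu1) ** 2) * P[:t]).sum() / q1   # class variances, Eq. (4)
        var2 = (((i[t:] - mu2) ** 2) * P[t:]).sum() / q2
        var_w = q1 * var1 + q2 * var2             # Eq. (1)
        if var_w < best_var:
            best_t, best_var = t, var_w
    return best_t        # pixels below t -> background, at or above -> foreground
```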
After segmenting leaf images using the Otsu thresholding method, binary images are obtained in which foreground parts are displayed with the value 1 (white) and the background with 0 (black). Here, the segmented foreground objects usually consist of only parts of the target leaf plus other interferents, owing to the complicated background. Even so, this does not harm marker creation, since the isolated foreground parts can represent the corresponding objects well. Correspondingly, each isolated foreground part can be regarded as a marker that acts for a specific watershed region. Note that some interferents, such as branches and non-target leaves, may touch or be covered by the target leaf in some leaf images with complicated background. After thresholding, the overlapping phenomenon still exists in the obtained binary images, i.e., a certain foreground part may contain both the target leaf and the touching interferents. If this foreground part were directly regarded as the marker for watershed segmentation, the target leaf could hardly be segmented, just as with other traditional methods. Therefore, the binary images obtained from Otsu thresholding are only preliminary marker images and need to be processed further to avoid the overlapping phenomenon.
According to our observations of the original leaf images, most overlapping phenomena arise because target leaves touch or partially cover interferents, including non-target leaves on the same branch as the target leaf, neighboring ruderals, etc. The shapes of target leaves are usually maintained completely, and concave angles exist between the target leaves and the interferents. These attributes are approximately preserved in the binary (preliminary marker) images. Thus, we consider applying a morphological operator, more precisely erosion, to the binary image to separate the marker corresponding to the target leaf from that corresponding to the touching or covered interferents. The basic effect of the erosion operation on a binary image is to erode away the boundaries of foreground parts; foreground parts thus shrink in size, and holes within those parts become larger. The mathematical definition of erosion for binary images is as follows.

Suppose that $X$ is the set of Euclidean coordinates corresponding to the input binary image, and $K$ is the set of coordinates of the structuring element. Let $(K)_x$ denote the translation of $K$ so that its origin is at $x$. Then the erosion of $X$ by $K$ is simply the set of all points $x$ such that $(K)_x$ is a subset of $X$, i.e.,
$$X \ominus K = \{\, x \mid (K)_x \subseteq X \,\}. \tag{5}$$
When applied to a binary leaf image, erosion can efficiently enlarge the concave angles between the target leaf and the touching or covered interferents. If a sufficiently large structuring element is selected, the target leaf and the interferents will inevitably be divided, because the concave connections shrink to zero and eventually disappear. Each separated part can then be regarded as a marker. It should be noticed that some small foreground parts may disappear after the erosion operation. These disappearing parts generally belong to background objects in the original images, and their disappearance does not affect the later watershed segmentation, since only the target leaves are needed in the whole segmentation process.
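For illustration, Eq. (5) with a disk-shaped structuring element is available directly in SciPy; in the sketch below the disk radius is an assumption that would be tuned per image.

```python
import numpy as np
from scipy import ndimage as ndi

def erode_and_label(binary, radius=7):
    """Erode a binary pre-segmentation (Eq. (5)) so that objects joined
    by thin necks or concave angles fall apart, then label the pieces."""
    y, x = np.ogrid[-radius:radius + 1, -radius:radius + 1]
    disk = x ** 2 + y ** 2 <= radius ** 2       # structuring element K
    eroded = ndi.binary_erosion(binary, structure=disk)
    labels, n = ndi.label(eroded)               # each piece = one candidate marker
    return labels, n
```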
After the thresholding and erosion procedures, all markers are obtained. Since segmenting the target leaf is the final objective, the markers need to be further classified into internal and external markers. Internal markers are associated with the objects of interest (target leaves) and are made the only allowed minima in the watershed segmentation process. External markers are associated with the background, including all interferents, and help to efficiently partition the image into regions containing objects of interest and background. Usually, discriminating internal from external markers is difficult if no prior information is available. Note that the size of the target leaf is always the largest among all objects in leaf images with complicated background. Even in the eroded images this characteristic is maintained, since the erosion operation is applied to all objects. Therefore, the size difference between the target leaf and the interferents can be regarded as prior shape information and used to judge whether a marker is an internal marker or not. Assume that there are in total $n$ markers after erosion, each with size $S_i\ (i = 1, 2, \ldots, n)$; then we can define the judgment criterion as follows:
$$I = \arg\max_{i \in \{1, 2, \ldots, n\}} S_i, \tag{6}$$
where $I$ is the index of the internal marker. All other markers are regarded as external markers.

In the final marker image, the values of pixels belonging to the internal marker, the external markers and the background are set to 1, 2 and 0, respectively. Watershed segmentation can then be carried out based on the gradient image and the marker image. Readers can refer to [16] for a detailed treatment.
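Putting the pieces together, a sketch of the marker creation step (our own illustrative code) labels the eroded foreground, applies Eq. (6) to pick the internal marker, and encodes the 1/2/0 convention described above:

```python
import numpy as np
from scipy import ndimage as ndi

def build_marker_image(eroded):
    """eroded: 2-D bool array, the eroded Otsu pre-segmentation."""
    labels, n = ndi.label(eroded)               # isolated parts = candidate markers
    sizes = ndi.sum(eroded, labels, index=range(1, n + 1))
    internal = int(np.argmax(sizes)) + 1        # Eq. (6): largest part is the leaf
    markers = np.zeros(eroded.shape, dtype=np.int32)
    markers[labels == internal] = 1             # internal marker
    markers[(labels > 0) & (labels != internal)] = 2   # external markers
    return markers
```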
Figs. 1 and 2 demonstrate the process of automatically creating markers and the corresponding watershed segmentation results for leaf images with complicated background. The first column of Fig. 1 shows two indoor leaf images with overlapping phenomena, in which the target leaves cover non-target leaves. The target leaves cannot be segmented using traditional segmentation methods, since the boundaries between the target leaves and the covered leaves are blurred. The binary images in the second column show the segmentation results of the Otsu thresholding method, which here acts as a pre-segmentation tool for marker creation. The erosion operation is then applied to the binary images so that marker images can be obtained according to the proposed judgment criterion (third column). The corresponding gradient images are shown in the fourth column. Finally, watershed segmentation is carried out based on the marker images and gradient images. The final segmentation results in the fifth column show that the two target leaves are segmented completely. Fig. 2 demonstrates the segmentation of two outdoor leaf images containing interferents. The first column of Fig. 2 shows that interferents including ruderals, small stones and non-target leaves surround the target leaves. Otsu thresholding is applied as the pre-segmentation procedure for automatic marker creation; the pre-segmentation results, in which the target leaves are still touched by those interferents, are shown in the second column. The third column shows the marker images obtained after the erosion operation, and the final segmentation results of the marker-based watershed method are listed in the fourth column. The two target leaves are successfully separated from the surrounding interferents.
Notice that there is some variance in the length and bending of leafstalks. To preserve the precision of shape feature extraction, these leafstalks should be removed from the obtained binary images. Here, we apply the opening operation of mathematical morphology to the binary images, which is defined as an erosion operation followed by a dilation operation using the same structuring element. By performing the opening operation with a proper structuring element, we can successfully remove the leafstalks while preserving the main shape characteristics of the leaf objects. The results of leafstalk removal for three binary images of Figs. 1 and 2 are shown in Fig. 3.

Fig. 1. Segmentation of indoor leaf images with overlapping phenomena using the automatic marker-controlled watershed method. First column: original leaf images with overlapping phenomena. Second column: Otsu thresholding results. Third column: erosion results. Fourth column: gradient images. Fifth column: final segmentation results.

Fig. 2. Segmentation of outdoor leaf images with interferents using the automatic marker-controlled watershed method. First column: original leaf images containing interferents. Second column: Otsu thresholding results. Third column: erosion results. Fourth column: final segmentation results.
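As an illustration of the leafstalk-removal step described above, a short SciPy sketch follows; the disk radius is an assumption, chosen slightly wider than a typical stalk.

```python
import numpy as np
from scipy import ndimage as ndi

def remove_leafstalk(leaf_binary, radius=9):
    """Morphological opening (erosion then dilation with the same
    structuring element); a disk slightly wider than the stalk removes
    it, while the much wider blade survives."""
    y, x = np.ogrid[-radius:radius + 1, -radius:radius + 1]
    disk = x ** 2 + y ** 2 <= radius ** 2
    return ndi.binary_opening(leaf_binary, structure=disk)
```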
3. Shape feature extraction
Shape is one of the most important image features for characterizing an object, since it is an important feature of human perception. Human beings tend to perceive scenes as being composed of individual objects, which can best be identified by their shapes. Moreover, according to the theory of plant taxonomy, shape features are also the most important and effective visual features for classifying plants. Consequently, we use shape features for leaf image classification. Notice that there exist great morphological differences between different kinds of leaves; even within the same kind of leaf there may be obvious variance in location, direction of rotation and scale in the obtained digital images. Thus, the selection of good features is a crucial step for leaf classification. An efficient leaf classification system must be able to recognize a leaf regardless of its location, orientation and size in the field of view, i.e., it requires translation, rotation and scale invariance.

Image moments are widely used as shape features for image processing and classification, as they carry a more geometric and intuitive meaning than simple geometric features. The invariant properties of moments have received considerable attention in recent years, since moments define a simply calculated set of region properties that can be used for shape-based classification. In this paper, two kinds of moments, Hu geometric moments and Zernike orthogonal moments, are chosen as the shape features.
3.1. Hu geometric moments
The moment invariant was first introduced by Hu, who showed how the descriptors can be derived from algebraic invariants in his fundamental theorem of moment invariants [17]. Hu also defined seven moment invariants, computed from central moments through order three, that are invariant under object translation, scaling and rotation. The two-dimensional traditional Hu geometric moments of order $(p+q)$ of an intensity function $f(x, y)$ are defined as
Fig. 3. Leafstalk removal results of three binary images in Figs. 1 and 2. (a) Leafstalk removal result of the binary image in the second row and fifth column of Fig. 1. (b) Leafstalk removal result of the binary image in the first row and fourth column of Fig. 2. (c) Leafstalk removal result of the binary image in the second row and fourth column of Fig. 2.
$$M_{pq} = \int_{-\infty}^{+\infty}\!\!\int_{-\infty}^{+\infty} x^p y^q f(x, y)\, dx\, dy, \qquad p, q = 0, 1, 2, \ldots \tag{7}$$
In practical pattern recognition applications, the image space is reduced to a binary version; in this case $f(x, y)$ takes the value 1 when the pixel $(x, y)$ represents target objects (or even noise) and 0 when the pixel belongs to the background.
When the geometric moments $M_{pq}$ in Eq. (7) are referred to the object centroid $(x_c, y_c)$, they become the central moments, given by
$$\mu_{pq} = \int_{-\infty}^{+\infty}\!\!\int_{-\infty}^{+\infty} (x - x_c)^p (y - y_c)^q f(x, y)\, dx\, dy, \qquad x_c = \frac{M_{10}}{M_{00}}, \quad y_c = \frac{M_{01}}{M_{00}}. \tag{8}$$
Thus, the central moments are invariant to translation and may be normalized to become invariant to scaling as well, through the relation
$$\eta_{pq} = \mu_{pq} / \mu_{00}^{\gamma}, \tag{9}$$
where the normalization factor is $\gamma = (p+q)/2 + 1$. The values of the seven Hu geometric moments can then be calculated from the normalized central moments as follows:
$$\begin{aligned}
\phi_1 &= \eta_{20} + \eta_{02},\\
\phi_2 &= (\eta_{20} - \eta_{02})^2 + 4\eta_{11}^2,\\
\phi_3 &= (\eta_{30} - 3\eta_{12})^2 + (3\eta_{21} - \eta_{03})^2,\\
\phi_4 &= (\eta_{30} + \eta_{12})^2 + (\eta_{21} + \eta_{03})^2,\\
\phi_5 &= (\eta_{30} - 3\eta_{12})(\eta_{30} + \eta_{12})\left[(\eta_{30} + \eta_{12})^2 - 3(\eta_{21} + \eta_{03})^2\right]\\
       &\quad + (3\eta_{21} - \eta_{03})(\eta_{21} + \eta_{03})\left[3(\eta_{30} + \eta_{12})^2 - (\eta_{21} + \eta_{03})^2\right],\\
\phi_6 &= (\eta_{20} - \eta_{02})\left[(\eta_{30} + \eta_{12})^2 - (\eta_{21} + \eta_{03})^2\right] + 4\eta_{11}(\eta_{30} + \eta_{12})(\eta_{21} + \eta_{03}),\\
\phi_7 &= (3\eta_{21} - \eta_{03})(\eta_{30} + \eta_{12})\left[(\eta_{30} + \eta_{12})^2 - 3(\eta_{21} + \eta_{03})^2\right]\\
       &\quad - (\eta_{30} - 3\eta_{12})(\eta_{21} + \eta_{03})\left[3(\eta_{30} + \eta_{12})^2 - (\eta_{21} + \eta_{03})^2\right].
\end{aligned} \tag{10}$$
It can be seen from Eq. (7) that the double integrals are taken over the whole area of the object, including its boundary, which implies a computational complexity of order $O(N^2)$ for an $N \times N$ image. Chen [18] proposed a fast computation method based on the contours of objects, which reduces the computational complexity to $O(N)$.
The Chen’s improved geometrical moments of order (p + q) are defined as
Mpq ¼
Z
xp yq ds;
ds ¼
qffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi
ðdxÞ2 þ ðdyÞ2 ;
ð11Þ
C
where $\oint_C$ denotes the line integral along the closed contour $C$. For practical implementation, the moments are computed in their discrete form:
$$M_{pq} = \sum_{(x, y) \in C} x^p y^q. \tag{12}$$
The contour central moments invariant to translation can be similarly defined as
$$\mu_{pq} = \oint_C (x - x_c)^p (y - y_c)^q\, ds, \qquad x_c = \frac{M_{10}}{M_{00}}, \quad y_c = \frac{M_{01}}{M_{00}}. \tag{13}$$
In the discrete case, $\mu_{pq}$ above becomes
$$\mu_{pq} = \sum_{(x, y) \in C} (x - x_c)^p (y - y_c)^q. \tag{14}$$
The scale-normalized central moments, with normalization factor $\alpha = p + q + 1$, are given by
$$\eta_{pq} = \mu_{pq} / \mu_{00}^{\alpha}. \tag{15}$$
3.2. Zernike orthogonal moments
Zernike moments are the most widely used family of orthogonal moments due to their properties of being invariant to an arbitrary rotation of the object they describe and of being insensitive to image noise. They are usually used, after being made invariant to scale and translation, as object descriptors in pattern recognition applications.

Zernike moments were introduced into image analysis by Teague [19], using a set of complex polynomials which form a complete orthogonal set over the interior of the unit circle $x^2 + y^2 = 1$. The form of these polynomials is as follows [20]:
$$V_{pq}(x, y) = V_{pq}(r, \theta) = R_{pq}(r) \exp(jq\theta), \tag{16}$$
where $p$ is a non-negative integer and $q$ is an integer (positive or negative) subject to the constraints that $p - |q|$ is even and $|q| \le p$; $r$ is the length of the vector from the origin to the pixel $(x, y)$, and $\theta$ is the angle between the vector $r$ and the $x$ axis in the counter-clockwise direction. $R_{pq}(r)$ is the radial polynomial defined as
$$R_{pq}(r) = \sum_{s=0}^{(p-|q|)/2} \frac{(-1)^s (p-s)!\; r^{p-2s}}{s!\left(\frac{p+|q|}{2}-s\right)!\left(\frac{p-|q|}{2}-s\right)!}. \tag{17}$$
Here, it should be noticed that $R_{p,-q}(r) = R_{pq}(r)$.
These polynomials are orthogonal and satisfy the orthogonality principle:
$$\iint_{x^2 + y^2 \le 1} V_{nm}^{*}(x, y)\, V_{pq}(x, y)\, dx\, dy = \frac{\pi}{n+1}\, \delta_{np}\, \delta_{mq}, \qquad \delta_{ab} = \begin{cases} 1, & a = b,\\ 0, & \text{otherwise}. \end{cases} \tag{18}$$
Zernike moments are the projection of the image function onto these orthogonal basis functions. The Zernike moment of order $p$ with repetition $q$ for a continuous image function $f(x, y)$ that vanishes outside the unit circle is
$$Z_{pq} = \frac{p+1}{\pi} \iint_{x^2 + y^2 \le 1} f(x, y)\, V_{pq}^{*}(r, \theta)\, dx\, dy. \tag{19}$$
For a digital image, the integrals are replaced by summations as follows:
$$Z_{pq} = \frac{p+1}{\pi} \sum_{x} \sum_{y} f(x, y)\, V_{pq}^{*}(r, \theta), \qquad x^2 + y^2 \le 1. \tag{20}$$
To compute the Zernike moments of a given image, the center of the image is first taken as the origin and the pixel coordinates are mapped to the range of the unit circle; pixels falling outside the unit circle are not used in the computation [20]. In this way, Zernike moments can be made invariant to translation, rotation and scaling.
Calculating Zernike moments directly from the definition of the radial polynomials in Eq. (17) is called the Direct Method. As Eq. (17) shows, four factorial functions must be computed for each $R_{pq}(r)$, so the Direct Method consumes a great deal of computation time. For this reason, many recursive algorithms for computing the radial polynomials have been developed, of which the so-called q-recursive algorithm shows remarkable performance [21]. The q-recursive method uses Zernike radial polynomials of fixed order $p$ with higher repetition $q$ to derive the polynomials of lower repetition $q$, without computing the polynomial coefficients or the power series of the radius. The recurrence relation and its coefficients are given as follows:
$$R_{p(q-4)}(r) = H_1\, R_{pq}(r) + \left(H_2 + \frac{H_3}{r^2}\right) R_{p(q-2)}(r), \tag{21}$$
where the coefficients $H_1$, $H_2$ and $H_3$ are given by
$$H_1 = \frac{q(q-1)}{2} - qH_2 + \frac{H_3 (p+q+2)(p-q)}{8}, \tag{22a}$$
$$H_2 = \frac{H_3 (p+q)(p-q+2)}{4(q-1)} + (q-2), \tag{22b}$$
$$H_3 = \frac{-4(q-2)(q-3)}{(p+q-2)(p-q+4)}. \tag{22c}$$
For $p = q$ and $p - q = 2$, the following equations are used:
$$R_{pp}(r) = r^p \quad \text{for } p = q, \tag{23}$$
$$R_{p(p-2)}(r) = p\,R_{pp}(r) - (p-1)\,R_{(p-2)(p-2)}(r) \quad \text{for } p - q = 2. \tag{24}$$
In this paper, we use the q-recursive method to compute the Zernike moments and select the moduli of the Zernike moments with order $p$ increasing from 0 to 6 as features; sixteen Zernike moment features are thus obtained. Table 1 shows the values of the seven Hu geometric moments and sixteen Zernike moment features for the binary images in Fig. 3, computed as sketched below.
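A sketch of the q-recursive computation (Eqs. (21)–(24)) and of Eq. (20) for a square binary image follows; the coordinate mapping and the handling of $r = 0$ are our own choices, not the authors' code.

```python
import numpy as np

def zernike_radial(p, q, r):
    """Radial polynomial R_pq(r) via the q-recursive method of
    Eqs. (21)-(24); requires q >= 0 and p - q even."""
    r = np.asarray(r, dtype=float)
    if q == p:
        return r ** p                                   # Eq. (23)
    if p - q == 2:
        return p * r ** p - (p - 1) * r ** (p - 2)      # Eq. (24)
    qh = q + 4                                          # recurse from higher repetitions
    H3 = -4.0 * (qh - 2) * (qh - 3) / ((p + qh - 2) * (p - qh + 4))          # Eq. (22c)
    H2 = H3 * (p + qh) * (p - qh + 2) / (4.0 * (qh - 1)) + (qh - 2)          # Eq. (22b)
    H1 = qh * (qh - 1) / 2.0 - qh * H2 + H3 * (p + qh + 2) * (p - qh) / 8.0  # Eq. (22a)
    rr = np.where(r == 0, 1.0, r)                       # dodge 0-division; fixed below
    out = H1 * zernike_radial(p, qh, r) + (H2 + H3 / rr ** 2) * zernike_radial(p, qh - 2, r)
    return np.where(r == 0, (-1.0) ** (p // 2) if q == 0 else 0.0, out)      # R_pq(0)

def zernike_moment(img, p, q):
    """Z_pq of Eq. (20) for a square image mapped onto the unit disk."""
    n = img.shape[0]
    y, x = (2 * np.mgrid[:n, :n] - n + 1) / (n - 1)     # pixel coords -> [-1, 1]
    r, theta = np.hypot(x, y), np.arctan2(y, x)
    inside = r <= 1.0                                   # drop pixels outside the disk
    V = zernike_radial(p, abs(q), r) * np.exp(1j * q * theta)
    return (p + 1) / np.pi * (img.astype(float) * np.conj(V))[inside].sum()
```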
Table 1
The values of seven Hu geometric moments and sixteen Zernike moment features of the binary images in Fig. 3

Feature   (a)            (b)            (c)
φ1        0.1715         0.1668         0.2195
φ2        0.0035         0.0022         0.0223
φ3        8.1872         9.3963         0.0002
φ4        1.0578         2.4425         5.6788
φ5        5.7193         3.2854         6.2085
φ6        2.9324         3.1682         7.9338
φ7        8.0136         1.7025         1.2641
|Z00|     10.1063e+003   7.7270e+003    6.8497e+003
|Z11|     0.0237e+003    2.2339e+003    0.7551e+003
|Z20|     4.6072e+003    5.1459e+003    4.2505e+003
|Z22|     4.5639e+003    5.0964e+003    5.7622e+003
|Z31|     0.1149e+003    2.5842e+003    1.1485e+003
|Z33|     0.1111e+003    2.5870e+003    1.3990e+003
|Z40|     0.1501e+003    1.7138e+003    0.0047e+003
|Z42|     0.0725e+003    2.1891e+003    4.7479e+003
|Z44|     0.0525e+003    1.2093e+003    4.4473e+003
|Z51|     0.0726e+003    0.9887e+003    1.2103e+003
|Z53|     0.0852e+003    0.9542e+003    1.3801e+003
|Z55|     0.0892e+003    0.7914e+003    1.5874e+003
|Z60|     0.5168e+003    1.5320e+003    3.5339e+003
|Z62|     0.6177e+003    0.2487e+003    4.7252e+003
|Z64|     0.6716e+003    0.7933e+003    2.9298e+003
|Z66|     0.6165e+003    2.5870e+003    2.6311e+003
From Table 1 it can be seen that the values of the Hu geometric moments and the Zernike moment features differ greatly in order of magnitude. Therefore, before classification, these features need to be normalized as follows:
$$X_N = \frac{X - X_{\min}}{X_{\max} - X_{\min}}, \tag{25}$$
where $X$ denotes one feature; $X_{\max}$ is the maximum value of all features in the same class as $X$, and $X_{\min}$ is the minimum one.
4. Moving center hypersphere classifier
In this paper, one feature vector containing the seven Hu geometric moments and sixteen Zernike moment features is regarded as a pattern. Considering that the number of patterns and the dimension of the pattern space are both very large, traditional classifiers such as K-NN and neural networks would make the classification process quite time- and space-consuming. Therefore, we propose a moving center hypersphere (MCH) classification method for plant leaf classification. Its fundamental idea is that each class of patterns in $n$-dimensional space can be represented by a series of $n$-hyperspheres (as shown in Fig. 4b), whereas in traditional approaches the patterns from one class are all treated as a set of points. In other words, the training process of the MCH classification method can be regarded as a data compression process.

The $n$-hypersphere is a generalization of the circle (2-hypersphere) and the sphere (3-hypersphere) to dimensions $n \ge 4$. The $n$-hypersphere is defined as the set of $n$-tuples of points $(x_1, x_2, \ldots, x_n)$ such that
$$x_1^2 + x_2^2 + \cdots + x_n^2 = R^2, \tag{26}$$
where $R$ is the radius of the $n$-hypersphere.
Here, we take one class as an example to introduce the training process of the MCH classifier. The training process is essentially a process of constructing the center and radius of each hypersphere. The first step is to compute the initial center of a candidate hypersphere: the multi-dimensional median of all points belonging to the class is computed first, and the initial center is chosen as the point of the class closest to this median. Then, the maximum radius that encompasses the points of the class must be found. To achieve this, the center of the hypersphere is moved around so as to enlarge the hypersphere and make it encompass as many points as possible; this is an iterative process in which the center moves from one data point to a neighboring point. Once the largest possible hypersphere is found, the points inside it are all removed. The above process is executed repeatedly on the remaining points of the class until all points belonging to that class are surrounded by some number of hyperspheres; the next class is then treated in the same manner. The steps of the training process of the moving center hypersphere classifier for one class are summarized as follows (a code sketch is given after Eq. (27)):
Step 1. Put all the training data points into a set S and set the hypersphere index k = 0.
Step 2. Select the point closest to the median of the points in S as the initial center C0 of the kth hypersphere (as shown in Fig. 5a).
Step 3. Find the nearest point Z to the center among all other classes, and denote its distance by d1 (as shown in Fig. 5a).
Step 4. Find the farthest point U of the same class inside the hypersphere of radius d1 around the center, and let d2 denote the distance from the center to that farthest point (as shown in Fig. 5a).
Step 5. Set the radius of the kth hypersphere to (d1 + d2)/2.
Step 6. Among the nearest m points of this class, select the point lying in the most negative direction along the vector from the center to the nearest point of the other classes; the purpose is to move the center away so as to enlarge the hypersphere. If such a point exists, set it as the new center C1 of the kth hypersphere (as shown in Fig. 5b) and return to Step 4; otherwise, go to the next step.
Step 7. Remove the points surrounded by the kth hypersphere from the set S. If S is still not empty, set k = k + 1 and return to Step 2; otherwise, go to the next step.
Step 8. Erase the redundant hyperspheres which are completely enclosed by some larger hypersphere of this class. The algorithm stops.

Fig. 4. Demonstration of the proposed hypersphere classification method. (a) Pattern space and (b) hyperspheres surrounding each class of patterns.

Fig. 5. Demonstration of the center-moving process while training the MCH classifier. (a) Initial center C0 and radius (d1 + d2)/2 of a certain hypersphere and (b) the center moves from C0 to C1.

Fig. 6. Demonstration of the classification criterion.

Fig. 7. Block diagram of the proposed classification framework.
When the training process is finished, the MCH classifier must be able to classify any given input data point. The perpendicular distances from the input data point to the surfaces of all hyperspheres (with the distance counted as negative if the point is inside the hypersphere) are used as the classification criterion (as shown in Fig. 6). Assume that there are in total $H$ hyperspheres after training, each with radius $R_i\ (i = 1, 2, \ldots, H)$, and let $D_i$ denote the distance between the data point and the center of the $i$th hypersphere. The decision criterion is then defined as
$$I = \arg\min_{i \in \{1, 2, \ldots, H\}} (D_i - R_i), \tag{27}$$
where $I$ is the index of the nearest-neighbor hypersphere.
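The sketch below implements a simplified version of the training steps and the decision rule of Eq. (27); the center-moving refinement of Step 6 and the pruning of Step 8 are omitted for brevity, so this is an illustration of the idea rather than the authors' exact algorithm.

```python
import numpy as np

def train_mch(X, y):
    """Cover each class with hyperspheres (Steps 1-5 and 7; the
    center-moving refinement of Step 6 is omitted for brevity)."""
    spheres = []                                      # (center, radius, label)
    for c in np.unique(y):
        other = X[y != c]
        remaining = X[y == c].copy()
        while len(remaining):
            median = np.median(remaining, axis=0)     # Step 2: initial center
            center = remaining[np.argmin(np.linalg.norm(remaining - median, axis=1))]
            d1 = np.linalg.norm(other - center, axis=1).min()       # Step 3
            d_own = np.linalg.norm(remaining - center, axis=1)
            inside = d_own < d1
            d2 = d_own[inside].max() if inside.any() else 0.0       # Step 4
            radius = (d1 + d2) / 2.0                  # Step 5
            spheres.append((center, radius, c))
            remaining = remaining[d_own > radius]     # Step 7: drop covered points
    return spheres

def classify_mch(spheres, x):
    """Eq. (27): nearest hypersphere surface (negative if x is inside)."""
    dists = [np.linalg.norm(x - ctr) - rad for ctr, rad, _ in spheres]
    return spheres[int(np.argmin(dists))][2]
```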
5. Experimental results
The whole classification framework for leaf images with complicated background is demonstrated in the block diagram of Fig. 7. To verify the proposed classification framework, we took 1200 leaf samples corresponding to 20 classes of plants, collected by ourselves [23] and from a web resource [22], including Ginkgo, Chinese Allspice, Camphortree, Maple, Honeysuckle, and so on. Each class includes 60 leaf samples with pure and complicated backgrounds, of which 40 samples are selected randomly as training samples and the remaining 20 are used as test samples. The proposed classification framework was implemented on a computer with an Intel Pentium 2.6 GHz CPU, 1 GB RAM, and the Windows XP operating system.
Fig. 8a shows the histogram of each hypersphere's radius after training on 800 samples; the class that each hypersphere belongs to is listed in Table 2. It can be seen from Fig. 8a and Table 2 that 115 hyperspheres in total are obtained after the training process. The histogram of the distances between the test sample extracted from Fig. 3c and the surface of each hypersphere is shown in Fig. 8b, in which the distance from the input data point to the surface of the 63rd hypersphere is negative; it can be inferred that this point lies inside the 63rd hypersphere. From Table 2 it can be seen that the 63rd hypersphere encompasses part of the Honeysuckle training sample points. The correct classification rates of the 400 test samples corresponding to the 20 classes of plants listed in Table 2, obtained with the MCH classifier, are shown in Fig. 9; the average correct classification rate is 92.6%.
[Fig. 8 appears here: two histograms plotted against hypersphere ID (0–120); panel (a) has y-axis "Hypersphere Radius" (0–4.5) and panel (b) has y-axis "Distance Between Test Sample and Surface of Each Hypersphere" (−5 to 25).]

Fig. 8. Demonstration of the moving center hypersphere classifier. (a) The histogram of each hypersphere's radius after training 800 samples corresponding to 20 classes and (b) the histogram of the distances between the test sample extracted from Fig. 3c and the surface of each hypersphere.
Table 2
Class that each hypersphere belongs to

Class ID   Hypersphere ID      Class ID   Hypersphere ID
1          1–6                 11         64–65
2          7–16                12         66–70
3          17–27               13         71–79
4          28–31               14         80–86
5          32–38               15         87–90
6          39–46               16         91–95
7          47–49               17         96–100
8          50–56               18         101–103
9          57–60               19         104–109
10         61–63               20         110–115

*Class ID explanation: 1 Chinese Allspice; 2 Seatung; 3 Camphortree; 4 Photinia; 5 Ginkgo; 6 Tuliptree; 7 Arrowwood; 8 Maple; 9 Douglas Fir; 10 Honeysuckle; 11 Sweetgum; 12 Panicled Goldraintree; 13 Hazel; 14 Rose bush; 15 Laurel; 16 Chestnut; 17 China Redbud; 18 London Planetree; 19 Plum; 20 Willow.
[Fig. 9 appears here: a bar chart of correct recognition rate (%) (0–100) versus class ID (0–20).]

Fig. 9. The histogram of the correct classification rates of the 400 test samples corresponding to the 20 classes of plants listed in Table 2, using the MCH classifier.
6. Conclusions
In this paper, an efficient classification framework is proposed to classify leaf images with complicated background, where interferents and overlapping phenomena may exist. First, an automatic marker-controlled watershed method combined with pre-segmentation and morphological operations is applied to segment leaf images with complicated background based on prior shape information. Then, twenty-three moment invariants, including seven Hu geometric moments and sixteen Zernike moment features, are extracted from the binary images after watershed segmentation and leafstalk removal. Moreover, an efficient moving center hypersphere (MCH) classifier with a data compression function is introduced to handle the extracted high-dimensional features. Experimental results show that twenty classes of practical plant leaves are successfully classified, with an average classification rate of 92.6%. Our future research will focus on how to define and compute leaf image complexity and on combining the level set method with the watershed technique to further improve the segmentation of leaf images with complicated background.
Acknowledgement
This work was supported by the grants of the National Science Foundation of China, Nos. 60772130, 60705007 and 30700161; the grant of the National Basic Research Program of China (973 Program), No. 2007CB311002; the grants of the National High Technology Research and Development Program of China (863 Program), Nos. 2007AA01Z167 and 2006AA02Z309; the grant of the Guide Project of Innovative Base of the Chinese Academy of Sciences (CAS), No. KSCX1-YWR-30; the grant of the Oversea Outstanding Scholars Fund of CAS, No. 2005-1-18; the grant of the Graduate Students' Scientific Innovative Project Foundation of CAS; the grant of the Scientific Research Foundation of the Education Department of Anhui Province, No. KJ2007B233; the grant of the Young Teachers' Scientific Research Foundation of the Education Department of Anhui Province, No. 2007JQ1152; and the grant of the Scientific Research Foundation of Hefei University, No. 06KY007ZR.
References
[1] H. Fu, Z. Chi, Combined thresholding and neural network approach for vein pattern extraction from leaf images, IEE Proc. Vis. Image Signal Process. 153
(6) (2006) 881–892.
[2] C.L. Lee, S.Y. Chen, Classification for leaf images, Proc. 16th IPPR Conf. Comput. Vision Graphics Image Process. (2003) 355–362.
[3] C. Im, H. Nishida, T.L. Kunii, Recognizing plant species by leaf shapes – a case study of the Acer family, Proc. Pattern Recog. 2 (1998) 1171–1173.
[4] M. Oide, S. Ninomiya, Discrimination of soybean leaflet shape by neural networks with image input, Comput. Electron. Agric. 29 (2000) 59–72.
[5] Z. Wang, Z. Chi, D. Feng, Fuzzy integral for leaf image retrieval, Proc. Fuzzy Systems 1 (2002) 372–377.
[6] Z. Wang, Z. Chi, D. Feng, Shape based leaf image retrieval, IEE Proc. Vis. Image Signal Process. 150 (1) (2003) 34–43.
[7] F. Mokhtarian, S. Abbasi, Matching shapes with self-intersection: application to leaf classification, IEEE Trans. Image Process. 13 (5) (2004) 653–661.
[8] J.X. Du, D.S. Huang, X.F. Wang, X. Gu, Computer-aided plant species identification (CAPSI) based on leaf shape matching technique, Trans. Inst. Measure.
Control. 28 (3) (2006) 275–284.
[9] Q. Wu, C. Zhou, C. Wang, Feature extraction and XML representation of plant leaf for image retrieval, Lecture Notes in Computer Science 3842 (2006) 127–131.
[10] Y.S. Tak, E. Hwang, A leaf image retrieval scheme based on partial dynamic time warping and two-level filtering, Proc. Seventh ICCIT, IEEE (2007) 633–638.
[11] L. Vincent, P. Soille, Watersheds in digital spaces: an efficient algorithm based on immersion simulation, IEEE Trans. PAMI 13 (6) (1991) 583–598.
[12] G. Hamarneh, X. Li, Watershed segmentation using prior shape and appearance knowledge, Image Vis. Comput. (2007).
[13] F. Meyer, S. Beucher, Morphological segmentation, J. Visual Commun. Image Represent. 1 (1) (1990) 21–46.
[14] X.C. Tai, E. Hodneland, J. Weickert, N.V. Bukoreshtliev, A. Lundervold, H.H. Gerdes, Level set methods for watershed image segmentation, Lecture Notes in Computer Science 4485 (2007) 178–190.
[15] N. Otsu, A threshold selection method from gray-level histograms, IEEE Trans. Syst. Man Cybernet. 9 (1) (1979) 62–66.
[16] S. Beucher, F. Meyer, The morphological approach to segmentation: the watershed transformation, in: E.R. Dougherty (Ed.), Mathematical Morphology
in Image Processing, Marcel Dekker, 1993.
[17] M.K. Hu, Visual pattern recognition by moment invariants, IRE Trans. Inform. Theory 8 (1962) 179–187.
[18] C.C. Chen, Improved moment invariants for shape discrimination, Pattern Recogn. 26 (5) (1993) 683–686.
[19] M. Teague, Image analysis via the general theory of moments, J. Opt. Soc. Am. 70 (8) (1980) 920–930.
[20] Z.J. Miao, Zernike moment-based image shape analysis and its application, Pattern Recogn. Lett. 21 (2) (2000) 169–177.
[21] C.W. Chong, P. Raveendran, R. Mukundan, A comparative analysis of algorithms for fast computation of Zernike moments, Pattern Recogn. 36 (2003)
731–742.
[22] Hiker’s Guide to the Trees, Shrubs, and Woody Vines of Ricketts Glen State Park, third ed., Web Version, 2007.
[23] X.F. Wang, J.X. Du, G.J. Zhang, Recognition of leaf images based on shape features using a hypersphere classifier, Lecture Notes in Computer Science 3644 (2005) 87–96.