Applied Mathematics and Computation 205 (2008) 916–926

Contents lists available at ScienceDirect: Applied Mathematics and Computation
journal homepage: www.elsevier.com/locate/amc

Classification of plant leaf images with complicated background

Xiao-Feng Wang a,b,c,*, De-Shuang Huang a, Ji-Xiang Du a,b, Huan Xu a,b, Laurent Heutte d

a Intelligent Computing Lab, Hefei Institute of Intelligent Machines, Chinese Academy of Sciences, P.O. Box 1130, Hefei, Anhui 230031, China
b Department of Automation, University of Science and Technology of China, Hefei 230027, China
c Department of Computer Science and Technology, Hefei University, Hefei 230022, China
d Lab LITIS, UFR Sciences, University of Rouen, France

ARTICLE INFO

Keywords: Image segmentation; Plant leaf; Complicated background; Watershed segmentation; Hu geometric moments; Zernike moments; Moving center hypersphere (MCH) classifier

ABSTRACT

Classifying plant leaves has so far been an important and difficult task, especially for leaves with complicated background, where interferents and overlapping phenomena may exist. In this paper, an efficient classification framework for leaf images with complicated background is proposed. First, an automatic marker-controlled watershed segmentation method, combined with pre-segmentation and morphological operations, is introduced to segment leaf images with complicated background based on prior shape information. Then, seven Hu geometric moments and sixteen Zernike moments are extracted as shape features from the segmented binary images after leafstalk removal. In addition, a moving center hypersphere (MCH) classifier, which can efficiently compress feature data, is designed to handle the resulting mass of high-dimensional shape features. Finally, experimental results on practical plant leaves show that the proposed classification framework works well in classifying leaf images with complicated background.
Twenty classes of practical plant leaves are successfully classified, and the average correct classification rate reaches 92.6%.

© 2008 Elsevier Inc. All rights reserved.

1. Introduction

Plants play the most important part in the cycle of nature. They are the primary producers that sustain all other life forms, including people, because plants are the only organisms that can convert light energy from the sun into food. Animals, incapable of making their own food, depend directly or indirectly on plants for their supply of food. Moreover, all of the oxygen available to living organisms comes from plants, and plants are the primary habitat for thousands of other organisms. In addition, much of the fuel people use today, such as coal, natural gas and gasoline, was formed from plants that lived millions of years ago. However, in recent years, people have been seriously destroying natural environments, so that many plants constantly die and even die out every year. The resulting ecological crisis has brought many serious consequences, including land desertification, climate anomalies, floods, and so on, which threaten the survival of human beings and the development of society. People have now realized the importance and urgency of protecting plant resources. Besides taking effective measures to protect plants, it is important for ordinary people to be able to recognize and classify plants, which can further enhance the public's consciousness of plant protection. Therefore, besides professional botanists, many non-professional researchers have paid increasing attention to plant classification.

* Corresponding author. Address: Intelligent Computing Lab, Hefei Institute of Intelligent Machines, Chinese Academy of Sciences, P.O. Box 1130, Hefei, Anhui 230031, China.
E-mail addresses: [email protected] (X.-F. Wang), [email protected] (D.-S. Huang), [email protected] (J.-X. Du), [email protected] (H. Xu), [email protected] (L. Heutte).
0096-3003/$ - see front matter © 2008 Elsevier Inc. All rights reserved. doi:10.1016/j.amc.2008.05.108

Classifying plants is a process in which each individual plant is correctly assigned to a descending series of groups of related plants. According to statistical survey data, there are about 400,000 species of plants, of which 270,000 have been named and identified by botanists. It is impossible for any botanist or non-professional researcher to know more than a tiny fraction of the total number of named species, which makes further research on plants difficult [1]. Some new plant taxonomy methods, such as cytotaxonomy, chemotaxonomy, serotaxonomy and cladistics, are becoming popular. However, these new methods are all complicated and time-consuming and can mainly be carried out by botanists. Comparatively, traditional shape-based taxonomy methods are still widely used, since they are easily implemented and suitable for classifying field-living plants. In recent years, information technologies including image processing and pattern recognition techniques have been introduced into plant shape taxonomy to compensate for the limits of human classification ability [1–10]. According to the theory of plant shape taxonomy, plants are basically classified according to the shapes of their leaves and flowers. Usually, leaves are approximately two-dimensional in shape, while flowers are three-dimensional. It is difficult to analyze the shapes and structures of flowers since they have complex 3D structures [2]. Moreover, leaves can be easily found and collected everywhere in all seasons, while flowers can only be obtained in the blooming season. Therefore, leaves are widely used for computer-aided plant classification. Im et al. [3] used a hierarchical polygon approximation representation of leaf shape to classify the Acer family variety.
Oide and Ninomiya [4] chose leaf shape images as neural network input and applied a Hopfield model and a simple perceptron to soybean leaf classification. Wang et al. [5,6] proposed combining different features based on the centroid-contour distance curve and adopted the fuzzy integral for leaf image retrieval. Mokhtarian and Abbasi [7] used the curvature scale space image to represent leaf shapes and applied it to the classification of leaves with self-intersection. Fu and Chi [1] combined a thresholding method and a BP neural network to extract leaf veins. Du et al. [8] adopted an accelerated Douglas–Peucker approximation algorithm for leaf shape approximation and used a modified dynamic programming algorithm for leaf shape matching. Wu et al. [9] and Tak and Hwang [10] have also conducted research on leaf image retrieval. Despite some encouraging results, most research on leaf classification has focused on feature extraction and classifier design. The test leaf images used in this research are generally specimen images and indoor images with pure or simple background. Accordingly, traditional segmentation methods, such as thresholding, gradient operators and morphological operators, were selected to segment the leaf objects from color or gray images. However, one of the objectives of leaf classification is to identify field-living leaves, whose digital images inevitably contain complicated background. For example, overlaps between adjacent parts of leaves are sometimes unavoidable, which may blur the boundaries between adjacent leaves. Interferents around the target leaves, such as small stones, ruderals and dead leaves, may make segmentation results unsatisfactory. Even in indoor leaf images, there may exist branches or non-target leaves touching the target leaves.
As a result, the boundaries of target leaves obtained by traditional segmentation methods will unavoidably connect to the boundaries of branches or non-target leaves, which makes feature extraction imprecise. In this paper, we propose an efficient classification framework for leaf images with complicated background. First, a so-called automatic marker-controlled watershed method combined with pre-segmentation and morphological operations is applied to segment leaf images with complicated background based on prior shape information. Then, shape features including seven Hu geometric moments and sixteen Zernike moment features are extracted from the segmented binary images. In addition, an efficient moving center hypersphere (MCH) classifier with a data compression function is introduced to handle the extracted features. The rest of this paper is organized as follows: Section 2 introduces the automatic marker-controlled watershed segmentation method for leaf images with complicated background. Shape feature extraction is presented in Section 3. The moving center hypersphere classifier is described in Section 4. In Section 5, the proposed classification framework is validated on practical plant leaf images with complicated background. Finally, concluding remarks are given in Section 6.

2. Automatic marker-controlled watershed segmentation

As discussed above, leaf images with complicated background usually contain more than one object, including foreground objects (target leaves) and background objects (small stones, ruderals, branches, non-target leaves and other interferents). Moreover, target leaves may touch or cover the background objects. Consequently, traditional segmentation methods such as thresholding, edge operators and morphological operators can hardly separate the target leaves from background objects due to their own limitations.
In this section, we introduce an effective method to address this problem: the automatic marker-controlled watershed segmentation method. Watershed segmentation [11] is a region-growing technique belonging to the class of morphological operations. Usually, it is simulated by an immersion process, enabling an increase in speed and accuracy. A digital watershed is defined as a small region that cannot be assigned uniquely to any influence zone of the local minima of the gradient image. Watershed segmentation has increasingly been recognized as a powerful segmentation process due to its many advantages [12], including simplicity, speed and complete division of the image. Even for target regions with low contrast and weak boundaries, watershed segmentation always provides closed contours. In contrast to classical area-based segmentation, watershed segmentation is executed on the gradient image, which can be regarded as a topography with boundaries between regions. However, over-segmentation inevitably arises because the gradient image exhibits too many minima. To avoid severe over-segmentation, marker-controlled watershed segmentation was proposed, which is normally implemented by region growing based on a set of markers [13]. The fundamental idea is to filter out the undesired minima of the gradient image according to the markers. Certain desired local minima are selected as markers, and geodesic reconstruction is then applied to fill the other minima into non-minimum plateaus. The marker image used for watershed segmentation is a binary image consisting of marker regions, where each marker is placed inside an object (either a foreground object or a background object). Each initial marker has a one-to-one relationship with a specific watershed region, so the number of markers equals the final number of watershed regions.
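To make the marker-growing idea concrete, the following minimal sketch implements a priority-flood variant of marker-controlled watershed (not the immersion simulation used in the paper, and not the authors' code; the function name `marker_watershed` is our own). Regions grow outward from the marker pixels in order of increasing gradient value, so region boundaries settle on the gradient ridges:

```python
import heapq
import numpy as np

def marker_watershed(gradient, markers):
    """Minimal priority-flood watershed sketch.

    gradient: 2-D array (higher values = ridges between regions).
    markers:  int array, 0 = unlabeled, k > 0 = seed of region k.
    Returns a label image where every pixel carries the label of the
    marker whose flood reached it first (lowest gradient path).
    """
    H, W = gradient.shape
    labels = markers.copy()
    heap = []
    # Seed the priority queue with all marker pixels.
    for y in range(H):
        for x in range(W):
            if markers[y, x] > 0:
                heapq.heappush(heap, (gradient[y, x], y, x))
    # Flood: always expand from the lowest-gradient frontier pixel.
    while heap:
        g, y, x = heapq.heappop(heap)
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            yy, xx = y + dy, x + dx
            if 0 <= yy < H and 0 <= xx < W and labels[yy, xx] == 0:
                labels[yy, xx] = labels[y, x]
                heapq.heappush(heap, (gradient[yy, xx], yy, xx))
    return labels
```

With one internal and one external marker on either side of a high-gradient ridge, the two floods meet at the ridge, which is exactly the behavior the marker image is designed to exploit.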
After segmentation, the boundaries of the watershed regions are arranged on the desired ridges, thus separating each object from its neighbors [14]. The markers can be selected manually or automatically; in this paper, we propose automatically creating markers for leaf images based on pre-segmentation results and prior shape information. Although thresholding methods cannot give satisfying results when segmenting leaf images with complicated background, they can still provide useful information for automatic marker creation. In this paper, we choose the Otsu thresholding method [15] to construct the preliminary marker image. The Otsu method belongs to the global thresholding algorithms, which assume that the image histogram is bimodal. The basic idea is that the threshold t which minimizes the weighted within-class variance $\sigma_W^2$ over all possible values is the one that best divides target and background. The weighted within-class variance is

$\sigma_W^2(t) = q_1(t)\,\sigma_1^2(t) + q_2(t)\,\sigma_2^2(t)$,   (1)

where the class probabilities are estimated as

$q_1(t) = \sum_{i=1}^{t} P(i), \qquad q_2(t) = \sum_{i=t+1}^{I} P(i)$,   (2)

and the class means are given by

$\mu_1(t) = \sum_{i=1}^{t} \frac{i\,P(i)}{q_1(t)}, \qquad \mu_2(t) = \sum_{i=t+1}^{I} \frac{i\,P(i)}{q_2(t)}$.   (3)

Finally, the individual class variances are

$\sigma_1^2(t) = \sum_{i=1}^{t} [i - \mu_1(t)]^2 \frac{P(i)}{q_1(t)}, \qquad \sigma_2^2(t) = \sum_{i=t+1}^{I} [i - \mu_2(t)]^2 \frac{P(i)}{q_2(t)}$.   (4)

For a gray-scale image, $\sigma_W^2$ is computed over the full range of t values [1, 256], and the t value that minimizes $\sigma_W^2$ is selected as the threshold. After segmenting the leaf images with the Otsu method, binary images are obtained in which foreground pixels are set to 1 (white) and background pixels to 0 (black). Due to the complicated background, the segmented foreground objects are usually only partial regions of the target leaf together with other interferents.
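Eqs. (1)–(4) translate directly into a histogram sweep. The following sketch (our own helper `otsu_threshold`, with class 1 taken as gray levels below t) searches all thresholds for the minimum weighted within-class variance:

```python
import numpy as np

def otsu_threshold(gray):
    """Return the threshold t minimizing the weighted within-class
    variance of Eq. (1), computed from a 256-bin gray-level histogram."""
    hist, _ = np.histogram(gray, bins=256, range=(0, 256))
    p = hist / hist.sum()                      # P(i) for i = 0..255
    i = np.arange(256)
    best_t, best_var = 0, np.inf
    for t in range(1, 256):
        q1, q2 = p[:t].sum(), p[t:].sum()      # class probabilities, Eq. (2)
        if q1 == 0 or q2 == 0:
            continue                           # one class empty: skip
        mu1 = (i[:t] * p[:t]).sum() / q1       # class means, Eq. (3)
        mu2 = (i[t:] * p[t:]).sum() / q2
        var1 = (((i[:t] - mu1) ** 2) * p[:t]).sum() / q1   # Eq. (4)
        var2 = (((i[t:] - mu2) ** 2) * p[t:]).sum() / q2
        w_var = q1 * var1 + q2 * var2          # Eq. (1)
        if w_var < best_var:
            best_t, best_var = t, w_var
    return best_t
```

On a bimodal histogram the minimum falls between the two modes, which is what makes the method usable as a pre-segmentation step here.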
Even so, this does not hinder marker creation, since the isolated foreground parts can still represent the corresponding objects. Accordingly, each isolated foreground part can be regarded as a marker acting for a specific watershed region. Note that some interferents, such as branches and non-target leaves, may touch or be covered by the target leaf in leaf images with complicated background. After segmentation, the overlapping phenomenon then persists in the obtained binary images, i.e., a certain foreground part may contain both the target leaf and the touching interferents. If this foreground part is directly used as a marker for watershed segmentation, the target leaf can hardly be segmented, just as with other traditional methods. Therefore, the binary images obtained from Otsu thresholding are only preliminary marker images and need further processing to remove the overlapping phenomenon. According to our observations of original leaf images, most overlapping phenomena occur because target leaves touch or partially cover interferents, including non-target leaves on the same branch as the target leaf, neighboring ruderals, etc. The shapes of target leaves are usually maintained completely, and concave angles exist between the target leaves and the interferents. These attributes are approximately preserved in the binary (preliminary marker) images. Thus, we apply a morphological operator, more precisely erosion, to the binary image to separate the marker corresponding to the target leaf from those corresponding to touching or covered interferents. The basic effect of erosion on a binary image is to erode away the boundaries of foreground parts: foreground parts shrink in size, and holes within them become larger.
The mathematical definition of erosion for binary images is as follows. Suppose that X is the set of Euclidean coordinates corresponding to the input binary image, and K is the set of coordinates of the structuring element. Let (K)_x denote the translation of K so that its origin is at x. Then the erosion of X by K is simply the set of all points x such that (K)_x is a subset of X, i.e.,

$X \ominus K = \{\,x \mid (K)_x \subseteq X\,\}$.   (5)

When applied to a binary leaf image, erosion efficiently enlarges the concave angles between the target leaf and the touching or covered interferents. If a sufficiently large structuring element is selected, the target leaf and the interferents will inevitably be divided, because the narrow junctions at the concave angles shrink and finally disappear. Each separated part can then be regarded as a marker. It should be noticed that some small foreground parts may disappear after the erosion operation. These disappearing parts generally belong to the background objects in the original images, and their disappearance does not affect the later watershed segmentation, since only the target leaves are needed in the whole segmentation process. After the thresholding and erosion procedures, all markers are obtained. Since segmenting the target leaf is the final objective, the markers need to be further classified into internal and external markers. Internal markers are associated with objects of interest (target leaves), which are then made the only allowed minima in the watershed segmentation process. External markers are associated with the background, including all interferents, and help to partition the image into regions containing objects of interest and background. Usually, discriminating internal markers from external markers is a difficult task if no prior information is available.
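Eq. (5) can be realized with a direct (unoptimized) scan: a pixel survives only if the whole structuring element, translated to that pixel, fits inside the foreground. This is a sketch for illustration (the helper name `erode` is our own), not an efficient implementation:

```python
import numpy as np

def erode(X, K):
    """Binary erosion per Eq. (5): output pixel is 1 iff the structuring
    element K, translated so its origin (centre) sits there, fits in X."""
    H, W = X.shape
    kh, kw = K.shape
    oy, ox = kh // 2, kw // 2          # origin at the element centre
    out = np.zeros_like(X)
    for y in range(H):
        for x in range(W):
            fits = True
            for dy in range(kh):
                for dx in range(kw):
                    if K[dy, dx]:
                        yy, xx = y + dy - oy, x + dx - ox
                        # Pixels outside the image count as background.
                        if not (0 <= yy < H and 0 <= xx < W and X[yy, xx]):
                            fits = False
            out[y, x] = 1 if fits else 0
    return out
```

Eroding a small blob with an element nearly its own size illustrates why thin junctions between the leaf and interferents vanish first: only pixels whose full neighborhood is foreground survive.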
Note that the size of the target leaf is always the largest among all objects in leaf images with complicated background. Even in eroded images this characteristic is maintained, since the erosion operation is applied to all objects. Therefore, the size difference between the target leaf and the interferents can be regarded as prior shape information and used to judge whether a marker is internal or not. Assume that there are in total n markers after erosion, each of size $S_i$ (i = 1, 2, ..., n); we can then define the judging criterion as

$I = \arg\max_{i\in\{1,2,\ldots,n\}} S_i$,   (6)

where I is the index of the internal marker. All other markers are regarded as external markers. In the final marker image, the values of pixels belonging to the internal marker, the external markers and the background are set to 1, 2 and 0, respectively. Watershed segmentation can then be carried out based on the gradient image and the marker image; readers can refer to [16] for a detailed treatment. Figs. 1 and 2 demonstrate the process of automatically creating markers and the corresponding watershed segmentation results for leaf images with complicated background. The first column of Fig. 1 shows two indoor leaf images with overlapping phenomena, in which the target leaves cover non-target leaves. These target leaves cannot be segmented using traditional methods, since the boundaries between the target leaves and the covered leaves are blurred. The binary images in the second column show the segmentation results of the Otsu thresholding method, which acts here as a pre-segmentation tool for marker creation. The erosion operation is then applied to the binary images so that marker images can be obtained according to the proposed judging criterion (third column). The corresponding gradient images are shown in the fourth column. Finally, watershed segmentation is carried out based on the marker images and gradient images.
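Given a labeled image of the markers that survive erosion, Eq. (6) and the 1/2/0 encoding reduce to a few lines. This sketch assumes connected-component labels are already available (the function name `make_marker_image` is our own):

```python
import numpy as np

def make_marker_image(labels):
    """Build the final marker image from eroded-marker labels.

    labels: int array, 0 = background, k > 0 = the k-th marker.
    Returns an image with 1 = internal marker (largest marker, Eq. (6)),
    2 = external markers, 0 = background.
    """
    sizes = np.bincount(labels.ravel())   # S_i = pixel count of marker i
    sizes[0] = 0                          # ignore the background label
    internal = int(np.argmax(sizes))      # I = arg max S_i, Eq. (6)
    out = np.where(labels == 0, 0, 2)     # default: external marker
    out[labels == internal] = 1           # largest marker -> internal
    return out
```

The prior here is only relative size, so the rule stays valid after erosion as long as the target leaf remains the largest surviving component.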
It can be seen from the final segmentation results in the fifth column that the two target leaves are segmented completely. Fig. 2 demonstrates the segmentation of two outdoor leaf images containing interferents. It can be seen from the first column of Fig. 2 that some interferents, including ruderals, small stones and non-target leaves, surround the target leaves. Otsu thresholding is applied as the pre-segmentation procedure for automatic marker creation. The pre-segmentation results are shown in the second column, in which the target leaves are still touched by the interferents. The third column shows the marker images obtained after the erosion operation. The final segmentation results of the watershed method based on these marker images are listed in the fourth column; the two target leaves are successfully separated from the surrounding interferents.

Notice that there is some variance in the length and bending of leafstalks. To keep shape feature extraction precise, these leafstalks should be removed from the obtained binary images. Here, we apply the opening operation of mathematical morphology to the binary images, defined as an erosion followed by a dilation using the same structuring element. By performing the opening operation with a proper structuring element, we can successfully remove the leafstalks while preserving the main shape characteristics of the leaf objects. The results of leafstalk removal for three binary images of Figs. 1 and 2 are shown in Fig. 3.

Fig. 1. Segmentation of indoor leaf images with overlapping phenomena using the automatic marker-controlled watershed method. First column: original leaf images with overlapping phenomena. Second column: Otsu thresholding results. Third column: erosion results. Fourth column: gradient images. Fifth column: final segmentation results.

Fig. 2. Segmentation of outdoor leaf images with interferents using the automatic marker-controlled watershed method. First column: original leaf images containing interferents. Second column: Otsu thresholding results. Third column: erosion results. Fourth column: final segmentation results.

Fig. 3. Leafstalk removal results for three binary images of Figs. 1 and 2. (a) Result for the binary image in the second row, fifth column of Fig. 1. (b) Result for the binary image in the first row, fourth column of Fig. 2. (c) Result for the binary image in the second row, fourth column of Fig. 2.

3. Shape feature extraction

Shape is one of the most important image features for characterizing an object, since it is central to human perception: human beings tend to perceive scenes as composed of individual objects, which are best identified by their shapes. Shape features are also the most important and effective visual features for classifying plants according to the theory of plant taxonomy. Consequently, we use shape features for leaf image classification. Notice that there are large morphological differences between different kinds of leaves, and even leaves of the same kind may show obvious variance in location, rotation and scale in the obtained digital images. Thus, the selection of good features is a crucial step for leaf classification: an efficient leaf classification system must be able to recognize a leaf regardless of its location, orientation and size in the field of view, i.e., with translation, rotation and scale invariance. Image moments are widely used as shape features for image processing and classification, since they provide a more geometric and intuitive meaning than simple geometric features. The invariant properties of moments have received considerable attention, since moments define an easily calculated set of region properties usable for shape-based classification. In this paper, two kinds of moments, Hu geometric moments and Zernike orthogonal moments, are chosen as the shape features.

3.1. Hu geometric moments

The moment invariants were first introduced by Hu, who showed how these descriptors can be derived from algebraic invariants in his fundamental theorem of moment invariants [17]. Hu also defined seven moment invariants, computed from central moments through order three, that are invariant under object translation, scaling and rotation. The two-dimensional geometric moment of order (p + q) of an intensity function f(x, y) is defined as

$M_{pq} = \int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty} x^p\, y^q\, f(x,y)\, dx\, dy, \qquad p, q = 0, 1, 2, \ldots$   (7)

In practical pattern recognition applications the image is reduced to a binary version; in such a case f(x, y) takes the value 1 when the pixel (x, y) belongs to a target object (or even noise) and 0 when the pixel belongs to the background. When the geometric moments $M_{pq}$ in Eq. (7) are referred to the object centroid $(x_c, y_c)$, they become the central moments, given by

$\mu_{pq} = \int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty} (x - x_c)^p (y - y_c)^q f(x,y)\, dx\, dy, \qquad x_c = \frac{M_{10}}{M_{00}}, \quad y_c = \frac{M_{01}}{M_{00}}$.   (8)

Thus, the central moments are invariant to translation; they may further be normalized to become invariant to scaling through the relation

$\eta_{pq} = \mu_{pq} / \mu_{00}^{\gamma}$,   (9)

where the normalization factor is $\gamma = (p + q)/2 + 1$.
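The discrete computation of Eqs. (7)–(9), carried through to the seven invariants via the standard Hu formulas, can be sketched as follows (a minimal NumPy version for binary images; the helper name `hu_moments` is our own):

```python
import numpy as np

def hu_moments(img):
    """Seven Hu invariants from the normalized central moments of a
    binary image, computed in discrete form (Eqs. (7)-(9))."""
    ys, xs = np.nonzero(img)                    # foreground pixels
    m00 = len(xs)                               # area = M00
    xc, yc = xs.mean(), ys.mean()               # centroid, Eq. (8)
    def eta(p, q):                              # Eq. (9), gamma=(p+q)/2+1
        mu = ((xs - xc) ** p * (ys - yc) ** q).sum()
        return mu / m00 ** ((p + q) / 2 + 1)
    n20, n02, n11 = eta(2, 0), eta(0, 2), eta(1, 1)
    n30, n03, n21, n12 = eta(3, 0), eta(0, 3), eta(2, 1), eta(1, 2)
    phi1 = n20 + n02
    phi2 = (n20 - n02) ** 2 + 4 * n11 ** 2
    phi3 = (n30 - 3 * n12) ** 2 + (3 * n21 - n03) ** 2
    phi4 = (n30 + n12) ** 2 + (n21 + n03) ** 2
    phi5 = ((n30 - 3 * n12) * (n30 + n12)
            * ((n30 + n12) ** 2 - 3 * (n21 + n03) ** 2)
            + (3 * n21 - n03) * (n21 + n03)
            * (3 * (n30 + n12) ** 2 - (n21 + n03) ** 2))
    phi6 = ((n20 - n02) * ((n30 + n12) ** 2 - (n21 + n03) ** 2)
            + 4 * n11 * (n30 + n12) * (n21 + n03))
    phi7 = ((3 * n21 - n03) * (n30 + n12)
            * ((n30 + n12) ** 2 - 3 * (n21 + n03) ** 2)
            - (n30 - 3 * n12) * (n21 + n03)
            * (3 * (n30 + n12) ** 2 - (n21 + n03) ** 2))
    return np.array([phi1, phi2, phi3, phi4, phi5, phi6, phi7])
```

A quick sanity check is that translating the binary shape leaves all seven values unchanged, since the central moments subtract the centroid.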
The values of the seven Hu geometric moments can then be calculated from the normalized central moments as follows:

$\phi_1 = \eta_{20} + \eta_{02}$,
$\phi_2 = (\eta_{20} - \eta_{02})^2 + 4\eta_{11}^2$,
$\phi_3 = (\eta_{30} - 3\eta_{12})^2 + (3\eta_{21} - \eta_{03})^2$,
$\phi_4 = (\eta_{30} + \eta_{12})^2 + (\eta_{21} + \eta_{03})^2$,
$\phi_5 = (\eta_{30} - 3\eta_{12})(\eta_{30} + \eta_{12})[(\eta_{30} + \eta_{12})^2 - 3(\eta_{21} + \eta_{03})^2] + (3\eta_{21} - \eta_{03})(\eta_{21} + \eta_{03})[3(\eta_{30} + \eta_{12})^2 - (\eta_{21} + \eta_{03})^2]$,   (10)
$\phi_6 = (\eta_{20} - \eta_{02})[(\eta_{30} + \eta_{12})^2 - (\eta_{21} + \eta_{03})^2] + 4\eta_{11}(\eta_{30} + \eta_{12})(\eta_{21} + \eta_{03})$,
$\phi_7 = (3\eta_{21} - \eta_{03})(\eta_{30} + \eta_{12})[(\eta_{30} + \eta_{12})^2 - 3(\eta_{21} + \eta_{03})^2] - (\eta_{30} - 3\eta_{12})(\eta_{21} + \eta_{03})[3(\eta_{30} + \eta_{12})^2 - (\eta_{21} + \eta_{03})^2]$.

It can be seen from Eq. (7) that the double integrals are taken over the whole area of the object including its boundary, which implies a computational complexity of order O(N²). Chen [18] proposed a fast computation method based on the contours of objects which reduces the computational complexity to O(N). Chen's improved geometric moments of order (p + q) are defined as

$M_{pq} = \oint_C x^p y^q \, ds, \qquad ds = \sqrt{(dx)^2 + (dy)^2}$,   (11)

where $\oint_C$ denotes the line integral along a closed contour C. For practical implementation, they are computed in discrete form:

$M_{pq} = \sum_{(x,y)\in C} x^p y^q$.   (12)

The contour central moments, invariant to translation, can be similarly defined as

$\mu_{pq} = \oint_C (x - x_c)^p (y - y_c)^q \, ds, \qquad x_c = \frac{M_{10}}{M_{00}}, \quad y_c = \frac{M_{01}}{M_{00}}$.   (13)

In the discrete case, $\mu_{pq}$ becomes

$\mu_{pq} = \sum_{(x,y)\in C} (x - x_c)^p (y - y_c)^q$.   (14)

The scale-normalized central moments with normalization factor $\alpha = p + q + 1$ are given by

$\eta_{pq} = \mu_{pq} / \mu_{00}^{\alpha}$.   (15)

3.2. Zernike orthogonal moments

Zernike moments are the most widely used family of orthogonal moments due to their properties of being invariant to an arbitrary rotation of the object that they describe and insensitive to image noise. They are usually used, after being made invariant to scale and translation, as object descriptors in pattern recognition applications.
Zernike moments were introduced into image analysis by Teague [19], using a set of complex polynomials which form a complete orthogonal set over the interior of the unit circle $x^2 + y^2 = 1$. The form of these polynomials is as follows [20]:

$V_{pq}(x,y) = V_{pq}(r,\theta) = R_{pq}(r)\exp(jq\theta)$,   (16)

where p is a non-negative integer and q is an integer (positive or negative) subject to the constraints that $p - |q|$ is even and $|q| \le p$; r is the length of the vector from the origin to the pixel (x, y), and $\theta$ is the angle between that vector and the x axis in the counter-clockwise direction. $R_{pq}(r)$ is the radial polynomial defined as

$R_{pq}(r) = \sum_{s=0}^{(p-|q|)/2} (-1)^s \, \frac{(p-s)!}{s!\left(\frac{p+|q|}{2}-s\right)!\left(\frac{p-|q|}{2}-s\right)!} \, r^{p-2s}$.   (17)

Note that $R_{p,-q}(r) = R_{pq}(r)$. These polynomials are orthogonal and satisfy the orthogonality principle

$\iint_{x^2+y^2\le 1} V_{nm}^{*}(x,y)\, V_{pq}(x,y)\, dx\, dy = \frac{\pi}{n+1}\,\delta_{np}\,\delta_{mq}, \qquad \delta_{ab} = \begin{cases} 1, & a = b \\ 0, & \text{otherwise} \end{cases}$   (18)

Zernike moments are the projection of the image function onto these orthogonal basis functions. The Zernike moment of order p with repetition q for a continuous image function f(x, y) that vanishes outside the unit circle is

$Z_{pq} = \frac{p+1}{\pi} \iint_{x^2+y^2\le 1} f(x,y)\, V_{pq}^{*}(r,\theta)\, dx\, dy$.   (19)

For a digital image, the integrals are replaced by summations:

$Z_{pq} = \frac{p+1}{\pi} \sum_x \sum_y f(x,y)\, V_{pq}^{*}(r,\theta), \qquad x^2 + y^2 \le 1$.   (20)

To compute the Zernike moments of a given image, the center of the image is first taken as the origin and the pixel coordinates are mapped into the range of the unit circle; pixels falling outside the unit circle are not used in the computation [20]. In this way, Zernike moments can be made invariant to translation, rotation and scaling. Calculating Zernike moments directly from the definition of the radial polynomials in Eq. (17) is called the Direct Method. It can be seen from Eq. (17) that four factorial functions must be computed for each $R_{pq}(r)$.
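The Direct Method of Eqs. (17) and (20) can be sketched as follows (our own helper names `radial_poly` and `zernike_moment`; the image is assumed square, with the grid mapped into the unit circle as described above):

```python
import numpy as np
from math import factorial

def radial_poly(p, q, r):
    """R_pq(r) by the direct method of Eq. (17)."""
    q = abs(q)
    out = np.zeros_like(r, dtype=float)
    for s in range((p - q) // 2 + 1):
        out += ((-1) ** s * factorial(p - s)
                / (factorial(s)
                   * factorial((p + q) // 2 - s)
                   * factorial((p - q) // 2 - s))) * r ** (p - 2 * s)
    return out

def zernike_moment(img, p, q):
    """Z_pq of a square image mapped onto the unit circle (Eq. (20))."""
    n = img.shape[0]
    c = (n - 1) / 2.0
    y, x = np.mgrid[0:n, 0:n]
    xr, yr = (x - c) / c, (y - c) / c       # map pixels into [-1, 1]
    r = np.hypot(xr, yr)
    mask = r <= 1.0                          # pixels outside circle unused
    theta = np.arctan2(yr, xr)
    V_conj = radial_poly(p, q, r) * np.exp(-1j * q * theta)  # V*_pq
    return (p + 1) / np.pi * (img * V_conj * mask).sum()
```

Because only $|Z_{pq}|$ is kept as a feature, rotating the leaf changes the phase factor $\exp(jq\theta)$ but not the modulus, which is the rotation invariance used in this paper.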
Hence, the Direct Method consumes a great deal of computation time. For this reason, many recursive algorithms for computing the radial polynomials in Eq. (17) have been developed, among which the so-called q-recursive algorithm shows remarkable performance [21]. The q-recursive method uses Zernike radial polynomials of fixed order p with higher repetition q to derive the polynomials of lower repetition q, without computing the polynomial coefficients or the power series of the radius. The recurrence relation and its coefficients are given as follows:

$R_{p(q-4)}(r) = H_1 R_{pq}(r) + \left(H_2 + \frac{H_3}{r^2}\right) R_{p(q-2)}(r)$,   (21)

where the coefficients $H_1$, $H_2$ and $H_3$ are given by

$H_1 = \frac{q(q-1)}{2} - qH_2 + \frac{H_3(p+q+2)(p-q)}{8}$,   (22a)
$H_2 = \frac{H_3(p+q)(p-q+2)}{4(q-1)} + (q-2)$,   (22b)
$H_3 = \frac{-4(q-2)(q-3)}{(p+q-2)(p-q+4)}$.   (22c)

For p = q and p − q = 2, the following equations can be used:

$R_{pp}(r) = r^p \quad \text{for } p = q$,   (23)
$R_{p(p-2)}(r) = p R_{pp}(r) - (p-1) R_{(p-2)(p-2)}(r) \quad \text{for } p - q = 2$.   (24)

In this paper, we use the q-recursive method to compute the Zernike moments and select the moduli of the Zernike moments of orders p = 0 to 6 as features, yielding sixteen Zernike moment features. Table 1 shows the values of the seven Hu geometric moments and the sixteen Zernike moment features of the binary images in Fig. 3.

Table 1
The values of the seven Hu geometric moments and sixteen Zernike moment features of the binary images in Fig. 3

           (a)            (b)            (c)
φ1         0.1715         0.1668         0.2195
φ2         0.0035         0.0022         0.0223
φ3         8.1872         9.3963         0.0002
φ4         1.0578         2.4425         5.6788
φ5         5.7193         3.2854         6.2085
φ6         2.9324         3.1682         7.9338
φ7         8.0136         1.7025         1.2641
|Z00|      10.1063e+003   7.7270e+003    6.8497e+003
|Z11|      0.0237e+003    2.2339e+003    0.7551e+003
|Z20|      4.6072e+003    5.1459e+003    4.2505e+003
|Z22|      4.5639e+003    5.0964e+003    5.7622e+003
|Z31|      0.1149e+003    2.5842e+003    1.1485e+003
|Z33|      0.1111e+003    2.5870e+003    1.3990e+003
|Z40|      0.1501e+003    1.7138e+003    0.0047e+003
|Z42|      0.0725e+003    2.1891e+003    4.7479e+003
|Z44|      0.0525e+003    1.2093e+003    4.4473e+003
|Z51|      0.0726e+003    0.9887e+003    1.2103e+003
|Z53|      0.0852e+003    0.9542e+003    1.3801e+003
|Z55|      0.0892e+003    0.7914e+003    1.5874e+003
|Z60|      0.5168e+003    1.5320e+003    3.5339e+003
|Z62|      0.6177e+003    0.2487e+003    4.7252e+003
|Z64|      0.6716e+003    0.7933e+003    2.9298e+003
|Z66|      0.6165e+003    2.5870e+003    2.6311e+003

From Table 1 it can be seen that the values of the Hu geometric moments and the Zernike moment features differ greatly in order of magnitude. Therefore, before classification, each feature is normalized as follows:

$X_N = \frac{X - X_{\min}}{X_{\max} - X_{\min}}$,   (25)

where X denotes one feature, $X_{\max}$ is the maximum value of that feature over all samples of the same class as X, and $X_{\min}$ is the minimum.

4. Moving center hypersphere classifier

In this paper, a feature vector containing the seven Hu geometric moments and sixteen Zernike moment features is regarded as one pattern. Since the number of patterns and the dimension of the pattern space are both very large, traditional classifiers such as K-NN and neural networks would make the classification process quite time- and space-consuming.
Therefore, we propose using a moving center hypersphere (MCH) classification method to perform the plant leaf classification, the fundamental idea of which is that each class of patterns in n-dimensional space can be represented by a series of n-hyperspheres (as shown in Fig. 4b), whereas in traditional approaches the patterns of one class are all treated as a set of points. In other words, the training process of the MCH method can be regarded as a data compression process. The n-hypersphere is a generalization of the circle (2-hypersphere) and the sphere (3-hypersphere) to dimensions n ≥ 4, defined as the set of n-tuples of points $(x_1, x_2, \ldots, x_n)$ such that

$x_1^2 + x_2^2 + \cdots + x_n^2 = R^2$,   (26)

where R is the radius of the n-hypersphere. We take one class as an example to introduce the training process of the MCH classifier, which is essentially a construction process for the center and radius of each hypersphere. The first step is to compute the initial center of a candidate hypersphere: the multi-dimensional median of all points belonging to the class is computed, and the point of the class closest to this median is chosen as the initial center. Then, the maximum radius encompassing points of the class must be found. To achieve this, the center of the hypersphere is moved around in a way that enlarges the hypersphere and makes it encompass as many points as possible; this iteration is performed by moving the center from one data point to a neighboring point. Once the largest possible hypersphere is found, the points inside it are all removed. This process is executed repeatedly for the remaining points of the class until all points belonging to that class are covered by some number of hyperspheres. The next class is then treated in the same manner.
The algorithm steps of the training process of the moving center hypersphere classifier for one class are summarized as follows:

Step 1. Put all the training data points of the class into a set S and set the hypersphere index k = 0.
Step 2. Select the point closest to the median of the points in S as the initial center C0 of the kth hypersphere (as shown in Fig. 5a).
Step 3. Find the point Z of all other classes nearest to the center, and denote its distance to the center by d1 (as shown in Fig. 5a).
Step 4. Find the farthest point U of the same class inside the hypersphere of radius d1 around the center, and let d2 denote the distance from the center to that point (as shown in Fig. 5a).
Step 5. Set the radius of the kth hypersphere to (d1 + d2)/2.
Step 6. Among the m points of the class nearest to the center, select the point lying farthest in the direction opposite to that from the center towards the nearest point of the other classes; moving the center to this point enlarges the hypersphere. If such a point exists, set it as the new center C1 of the kth hypersphere (as shown in Fig. 5b) and return to Step 4; otherwise, go to the next step.
Step 7. Remove the points surrounded by the kth hypersphere from the set S. If S is still not empty, set k = k + 1 and return to Step 2; otherwise, go to the next step.
Step 8. Erase the redundant hyperspheres that are completely enclosed by larger hyperspheres of the same class. The algorithm stops.

Fig. 4. Demonstration of the proposed hypersphere classification method. (a) Pattern space and (b) hyperspheres surrounding each class of patterns.
Fig. 5. Demonstration of the center-moving process while training the MCH classifier. (a) Initial center C0 and radius (d1 + d2)/2 of a certain hypersphere and (b) the center moves from C0 to C1.
Fig. 6. Demonstration of the classification criterion.
Fig. 7. Block diagram of the proposed classification framework.
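Steps 1–5 and 7 above can be sketched as the following greedy covering loop. This is a simplified Python illustration, not the paper's implementation: the center-moving refinement of Step 6 and the cleanup of Step 8 are omitted, and all names are ours.

```python
import math
import statistics

def dist(a, b):
    """Euclidean distance between two n-dimensional points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def train_one_class(own_points, other_points):
    """Greedily cover one class with hyperspheres.
    Returns a list of (center, radius) pairs covering all own_points."""
    remaining = list(own_points)
    spheres = []
    while remaining:
        # Step 2: initial center = class point closest to the coordinate-wise median.
        median = [statistics.median(col) for col in zip(*remaining)]
        center = min(remaining, key=lambda p: dist(p, median))
        # Step 3: d1 = distance to the nearest point of any other class.
        d1 = min(dist(center, q) for q in other_points)
        # Step 4: d2 = distance to the farthest same-class point within radius d1.
        inside = [p for p in remaining if dist(center, p) < d1]
        d2 = max(dist(center, p) for p in inside)
        # Step 5: the radius splits the margin between the two classes.
        radius = (d1 + d2) / 2
        spheres.append((center, radius))
        # Step 7: drop the points this hypersphere already covers.
        remaining = [p for p in remaining if dist(center, p) > radius]
    return spheres
```

Each iteration removes at least the chosen center from the remaining set, so the loop terminates with every point of the class covered by some hypersphere.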
When the training process is finished, the MCH classifier must be able to classify any given input data point. The perpendicular distance from the input data point to the surface of each hypersphere (counted as negative if the point lies inside that hypersphere) is selected as the classification criterion (as shown in Fig. 6). Assume that there are H hyperspheres in total after training, the ith of which has radius R_i (i = 1, 2, ..., H), and let D_i denote the distance between the data point and the center of the ith hypersphere. The decision criterion can then be defined as follows:

I = arg min_{i ∈ {1, 2, ..., H}} (D_i − R_i),  (27)

where I is the index of the nearest-neighbor hypersphere.

5. Experimental results

The whole classification framework for leaf images with complicated background is demonstrated in the block diagram of Fig. 7. To verify the proposed framework, we took 1200 leaf samples corresponding to 20 classes of plants, collected by ourselves [23] and from a web resource [22], including Ginkgo, Chinese Allspice, Camphortree, Maple, Honeysuckle, and so on. Each class includes 60 leaf samples with pure and complicated backgrounds, of which 40 samples are selected randomly for training and the remaining 20 are used for testing. The proposed classification framework was implemented on a computer with an Intel Pentium 2.6 GHz CPU, 1 GB of RAM, and the Windows XP operating system. Fig. 8a shows the histogram of each hypersphere's radius after training on 800 samples; the class to which each hypersphere belongs is listed in Table 2. It can be seen from Fig. 8a and Table 2 that a total of 115 hyperspheres are obtained after the training process. The histogram of the distances between the test sample extracted from Fig. 3c and the surface of each hypersphere is shown in Fig. 8b, in which the distance from the input data point to the surface of the 63rd hypersphere is negative.
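A negative surface distance signals an interior point. The decision rule of Eq. (27) can be sketched in Python as follows; the (center, radius, label) triple representation and all names are ours, not the paper's:

```python
import math

def classify(point, hyperspheres):
    """Eq. (27): choose the hypersphere minimizing D_i - R_i, the signed
    distance from the point to the hypersphere surface (negative when the
    point lies inside), and return that hypersphere's class label.
    `hyperspheres` is a list of (center, radius, class_label) triples."""
    def signed_surface_distance(sphere):
        center, radius, _ = sphere
        d = math.sqrt(sum((p - c) ** 2 for p, c in zip(point, center)))
        return d - radius
    best = min(hyperspheres, key=signed_surface_distance)
    return best[2]
```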
It can be inferred that this point lies inside the 63rd hypersphere, and from Table 2 it can be seen that the 63rd hypersphere encompasses part of the Honeysuckle training sample points. The correct classification rates of the 400 test samples corresponding to the 20 classes of plants listed in Table 2, obtained with the MCH classifier, are shown in Fig. 9; the average correct classification rate is 92.6%.

Fig. 8. Demonstration of the moving center hypersphere classifier. (a) The histogram of each hypersphere's radius after training on 800 samples corresponding to 20 classes and (b) the histogram of the distances between the test sample extracted from Fig. 3c and the surface of each hypersphere.

Table 2
Class that each hypersphere belongs to

Class ID  Hypersphere ID    Class ID  Hypersphere ID
1         1–6               11        64–65
2         7–16              12        66–70
3         17–27             13        71–79
4         28–31             14        80–86
5         32–38             15        87–90
6         39–46             16        91–95
7         47–49             17        96–100
8         50–56             18        101–103
9         57–60             19        104–109
10        61–63             20        110–115

*Class ID explanation: 1 Chinese Allspice; 2 Seatung; 3 Camphortree; 4 Photinia; 5 Ginkgo; 6 Tuliptree; 7 Arrowwood; 8 Maple; 9 Douglas Fir; 10 Honeysuckle; 11 Sweetgum; 12 Panicled Goldraintree; 13 Hazel; 14 Rose bush; 15 Laurel; 16 Chestnut; 17 China Redbud; 18 London Planetree; 19 Plum; 20 Willow.

Fig. 9. The histogram of the correct classification rates of the 400 test samples corresponding to the 20 classes of plants listed in Table 2, using the MCH classifier.

6. Conclusions

In this paper, an efficient classification framework is proposed to classify leaf images with complicated background, where interferents and overlapping phenomena may exist.
First, an automatic marker-controlled watershed method combined with pre-segmentation and morphological operations is applied to segment leaf images with complicated background based on prior shape information. Then, twenty-three moment invariants, comprising seven Hu geometric moments and sixteen Zernike moment features, are extracted from the binary images obtained after watershed segmentation and leafstalk removal. Moreover, an efficient moving center hypersphere (MCH) classifier with a data compression capability is introduced to handle the extracted high-dimensional features. Experimental results show that twenty classes of practical plant leaves are successfully classified, with an average classification rate of up to 92.6%. Our future research will focus on how to define and compute leaf image complexity, and on combining the level set method with the watershed technique to further improve the segmentation of leaf images with complicated background.

Acknowledgement

This work was supported by the grants of the National Science Foundation of China, Nos. 60772130, 60705007 and 30700161; the grant of the National Basic Research Program of China (973 Program), No. 2007CB311002; the grants of the National High Technology Research and Development Program of China (863 Program), Nos. 2007AA01Z167 and 2006AA02Z309; the grant of the Guide Project of the Innovative Base of the Chinese Academy of Sciences (CAS), No. KSCX1-YWR-30; the grant of the Oversea Outstanding Scholars Fund of CAS, No. 2005-1-18; the grant of the Graduate Students' Scientific Innovative Project Foundation of CAS; the grant of the Scientific Research Foundation of the Education Department of Anhui Province, No. KJ2007B233; the grant of the Young Teachers' Scientific Research Foundation of the Education Department of Anhui Province, No. 2007JQ1152; and the grant of the Scientific Research Foundation of Hefei University, No. 06KY007ZR.

References

[1] H. Fu, Z.
Chi, Combined thresholding and neural network approach for vein pattern extraction from leaf images, IEE Proc. Vis. Image Signal Process. 153 (6) (2006) 881–892.
[2] C.L. Lee, S.Y. Chen, Classification for leaf images, in: Proc. 16th IPPR Conf. Comput. Vision Graphics Image Process., 2003, pp. 355–362.
[3] C. Im, H. Nishida, T.L. Kunii, Recognizing plant species by leaf shapes – a case study of the Acer family, in: Proc. Int. Conf. Pattern Recogn., vol. 2, 1998, pp. 1171–1173.
[4] M. Oide, S. Ninomiya, Discrimination of soybean leaflet shape by neural networks with image input, Comput. Electron. Agric. 29 (2000) 59–72.
[5] Z. Wang, Z. Chi, D. Feng, Fuzzy integral for leaf image retrieval, in: Proc. IEEE Int. Conf. Fuzzy Systems, vol. 1, 2002, pp. 372–377.
[6] Z. Wang, Z. Chi, D. Feng, Shape based leaf image retrieval, IEE Proc. Vis. Image Signal Process. 150 (1) (2003) 34–43.
[7] F. Mokhtarian, S. Abbasi, Matching shapes with self-intersection: application to leaf classification, IEEE Trans. Image Process. 13 (5) (2004) 653–661.
[8] J.X. Du, D.S. Huang, X.F. Wang, X. Gu, Computer-aided plant species identification (CAPSI) based on leaf shape matching technique, Trans. Inst. Measure. Control 28 (3) (2006) 275–284.
[9] Q. Wu, C. Zhou, C. Wang, Feature extraction and XML representation of plant leaf for image retrieval, Lecture Notes Comput. Sci. 3842 (2006) 127–131.
[10] Y.S. Tak, E. Hwang, A leaf image retrieval scheme based on partial dynamic time warping and two-level filtering, in: Proc. 7th ICCIT, IEEE, 2007, pp. 633–638.
[11] L. Vincent, P. Soille, Watersheds in digital spaces: an efficient algorithm based on immersion simulation, IEEE Trans. Pattern Anal. Mach. Intell. 13 (6) (1991) 583–598.
[12] G. Hamarneh, X. Li, Watershed segmentation using prior shape and appearance knowledge, Image Vis. Comput. (2007).
[13] F. Meyer, S. Beucher, Morphological segmentation, J. Visual Commun. Image Represent. 1 (1) (1990) 21–46.
[14] X.C. Tai, E. Hodneland, J. Weickert, N.V. Bukoreshtliev, A. Lundervold, H.H. Gerdes, Level set methods for watershed image segmentation, Lecture Notes Comput. Sci. 4485 (2007) 178–190.
[15] N. Otsu, A threshold selection method from gray-level histograms, IEEE Trans. Syst. Man Cybernet. 9 (1) (1979) 62–66.
[16] S. Beucher, F. Meyer, The morphological approach to segmentation: the watershed transformation, in: E.R. Dougherty (Ed.), Mathematical Morphology in Image Processing, Marcel Dekker, 1993.
[17] M.K. Hu, Visual pattern recognition by moment invariants, IRE Trans. Inform. Theory 8 (1962) 179–187.
[18] C.C. Chen, Improved moment invariants for shape discrimination, Pattern Recogn. 26 (5) (1993) 683–686.
[19] M. Teague, Image analysis via the general theory of moments, J. Opt. Soc. Am. 70 (8) (1980) 920–930.
[20] Z.J. Miao, Zernike moment-based image shape analysis and its application, Pattern Recogn. Lett. 21 (2) (2000) 169–177.
[21] C.W. Chong, P. Raveendran, R. Mukundan, A comparative analysis of algorithms for fast computation of Zernike moments, Pattern Recogn. 36 (2003) 731–742.
[22] Hiker's Guide to the Trees, Shrubs, and Woody Vines of Ricketts Glen State Park, third ed., Web Version, 2007.
[23] X.F. Wang, J.X. Du, G.J. Zhang, Recognition of leaf images based on shape features using a hypersphere classifier, Lecture Notes Comput. Sci. 3644 (2005) 87–96.