Journal of Electronic Imaging 25(1), 013009 (Jan/Feb 2016)

Fruit classification based on weighted score-level feature fusion

Hulin Kuang,a,* Leanne Lai Hang Chan,a Cairong Liu,b and Hong Yana

aCity University of Hong Kong, Department of Electronic Engineering, 83 Tat Chee Avenue, Kowloon, Hong Kong
bChinese University of Hong Kong, Department of Mathematics, Tai Po Road, Shatin, NT, Hong Kong
Abstract. We describe an object classification method based on weighted score-level feature fusion using
learned weights. Our method is able to recognize 20 object classes in a customized fruit dataset. Although
the fusion of multiple features is commonly used to distinguish variable object classes, the optimal combination
of features is not well defined. Moreover, in these methods, most parameters used for feature extraction are not
optimized and the contribution of each feature to an individual class is not considered when determining the
weight of the feature. Our algorithm relies on optimizing a single feature during feature selection and learning
the weight of each feature for an individual class from the training data using a linear support vector machine
before the features are linearly combined with the weights at the score level. The optimal single feature is
selected using cross-validation. The optimal combination of features is explored and tested experimentally
using a customized fruit dataset with 20 object classes and a variety of complex backgrounds. The experiment
results show that the proposed feature fusion method outperforms four state-of-the-art fruit classification
algorithms and improves the classification accuracy when compared with some state-of-the-art feature fusion
methods. © 2016 SPIE and IS&T [DOI: 10.1117/1.JEI.25.1.013009]
Keywords: object classification; multiple feature extraction; optimal feature selection; weighted score-level feature fusion; fruit
classification.
Paper 15634 received Aug. 11, 2015; accepted for publication Dec. 11, 2015; published online Jan. 19, 2016.
1 Introduction
Many engineering applications depend on object classification.1 In general, systems for object classification utilize
machine-learning algorithms, such as Adaboost2 and its variants,3 support vector machine (SVM),4,5 neural networks,6
and deep learning,7,8 with feature descriptors. Most object
classification methods usually utilize a single type of feature
descriptor, such as the Haar feature,2,3 histogram of oriented
gradients (HOG)4,5 and its variants,9 scale-invariant feature transform (SIFT)10 and extensions of SIFT,11,12 local binary pattern (LBP)13 and variants of LBP,14 four-direction features,15 Gabor features16 and Gabor-based features,17
and convolutional neural network features.18 In view of the
limited performance when using a single feature descriptor in
complex scenes, multifeature fusion can be used to improve
the classification accuracy.
Multiple feature descriptors can be combined using various methods, which can be categorized into four types:
feature-level fusion,19–26 learning-level fusion,12,27–29 score-level fusion,30–34 and decision-level fusion.35,36 Score-level fusion is also referred to as “classifier fusion” or “classifier ensemble”30,37–41 because it combines the classification results (scores) of classifiers trained using each feature. We focus on score-level fusion because it is less prone
to the curse of dimensionality and can make full use of the
complementarity of different features. This paper proposes a
weighted score-level fusion method that combines the scores
of all classifiers with class-specific weight vectors learned
*Address all correspondence to: Hulin Kuang, E-mail: [email protected]
from training samples using a linear SVM. The proposed
weighted score-level feature fusion is more effective than
several feature fusion approaches.
Commonly used features for fruit classification19,20,23,24,35
are not optimized before performing feature fusion.
Moreover, the optimal combination of features is not considered in existing methods. In this work, we present a set of
diverse and complementary feature descriptors that can be
used to represent different attributes of fruits. They are global
shape features, global color histograms, and statistical color
features, LBP, HOG, and LBP based on edge maps obtained
using edge detection,42 also named edgeLBP. Except for
the global shape features, the features are optimized by a
five-fold cross-validation. The optimal combination of feature descriptors is then selected by testing each possible
combination. Our contributions in this work include complementary feature extraction, optimal feature selection, optimal
combination, and the effective weighted score-level feature
fusion based on learned weights.
The performance of the proposed weighted score-level
fusion method using the optimal feature combination is analyzed experimentally on a new customized fruit dataset. The
images in the dataset were collected from Google Images,
which include images with more complex backgrounds, a
greater variety of images, and more fruit classes than the
existing fruit datasets.23,24,35 Each image contains a single
fruit object, similar to the datasets for face recognition.2,6
The evaluation of the proposed method using the customized
fruit dataset is twofold. First, our proposed method is
compared with several state-of-the-art feature fusion approaches. Then, it is compared with four state-of-the-art fruit
classification methods. The extensive experiments on this
large dataset enable us to confirm the effectiveness and
superiority of the weighted score-level fusion approach.
The remainder of the paper is organized as follows.
Related work is described in Sec. 2. The proposed fruit
classification approach, multiple feature extraction, optimal
feature selection, and the proposed feature fusion approach
are presented in Sec. 3. In Sec. 4, experimental settings are
provided. Section 5 demonstrates the experiment results, and
provides the evaluations and comparisons with other methods. Finally, conclusions are discussed in Sec. 6.
2 Related Work
In this section, we review four feature fusion techniques: feature-level fusion, learning-level fusion, score-level fusion, and decision-level fusion.
Feature-level fusion: concatenating one feature after another to obtain a new, longer feature vector is traditional
and simple, yet effective. Harjoko and Abdullah19 utilized
this traditional feature fusion of shape and color features
for fruit classification and Arivazhagan et al.20 performed
the concatenating fusion of color and texture features for
fruit classification. An extension of LBP was concatenated
with an extension of HOG to improve performance in pedestrian detection.21 SIFT and boundary-based shape features
were combined using fusion by concatenating to improve
object class recognition.22 These methods were shown to
be effective. However, the curse of dimensionality may
occur when the number of features increases.
There are other feature-level fusion methods such as principal component analysis (PCA)-based fusion,23,24 multiple
component analysis (MCA)-based fusion,25 and canonical
correlation analysis (CCA)-based fusion.26 Multiple features
are combined using PCA to reduce the feature dimension and
to obtain a new feature vector after normalization.23,24 The
new feature vector after PCA might not be optimal for classification.25 Hou et al.25 proposed a feature-level fusion
method named MCA in which feature dimension reduction
and feature fusion were coupled together. This method was
effective for the fusion of three or more features. Two feature
sets were fused by CCA,26 which was used to find the basis
vectors for two sets of variables. These feature fusion methods obtained correlated information from different feature
vectors. Because different features transform images into
various feature spaces and have different scales and ranges,
their feature fusion results might not be optimal for
classification.
Learning-level fusion, typified by multiple kernel learning (MKL),
has been widely utilized in object classification and recognition to combine different feature sets by selecting the
optimal combination of kernels based on different features
or different similarity functions.27 MKL learned a kernel
matrix, in which a variety of information of multiple features
could be combined together. The kernel function of SVM
was extended in a weighted fashion for multiple features
but the weights of each feature were equal.12 MKL was
utilized to integrate various types of features.28 With MKL,
an SVM classifier was trained with an adaptively weighted
combination of kernels, which integrated multiple features.
However, MKL did not perform data dimensionality
reduction.25 MKL required additional time to select the optimal combination of kernels.27–29 Moreover, an MKL-based
method for selecting the optimal kernel combination that
works well for all classification tasks has not yet been
explored.
Score-level fusion, that is, combining multiple features at the score level by fusing the scores obtained by each feature to acquire the final scores for the classification decision, has also been studied.30–34 Score-level fusion was referred to as classifier ensemble,30 where an ensemble of classifiers trained on multiple features was formed by combining scores
using operators including sum, product, median, max, and
majority vote. The combination rule was also trained by a
multilayer perceptron combiner using a posteriori probability
obtained by each feature. The trained combination rule demonstrated improved performance compared to the fixed rules.
Chen and Chiang31 proposed a multiple-feature extraction
approach in which two kinds of features were combined
at score-level stage by several well-known fusion techniques
including the mean, max, and product rules. Before score
fusion, the scores were normalized into a common scale
and range. The recognition results (scores) of two features
were fused by Kikutani et al.32 using a weighted combination
according to the normalized output likelihood of each feature. Han et al.33 proposed a feature fusion method based
on Gaussian mixture models. In this method, the probabilities of each feature were summed using the maximum
likelihood probability as the weights of each feature to
achieve feature fusion. Several weighting approaches to linearly combine the scores of features were evaluated by
Eisenbach et al.34 They concluded that pairwise optimization
of projected genuine-impostor overlap (PROPER) was very
effective. Classifier fusion approaches37–41 could also be
utilized for score-level feature fusion. Guermeur37 assessed
multiclass SVMs (M-SVM) as an ensemble method of
classifiers. The optimal class-specific weight vectors that
contained weights of each classifier and class-specific bias
terms were learned using a new family of M-SVMs to
linearly combine the classification results from multiple
discriminant models. Guermeur38 combined two M-SVMs
with the linear ensemble method (LEM). In this method,
the classification results were postprocessed and input into
a LEM method to learn the weights of each classifier for
each class to minimize a loss function. The weights were
also class-specific vectors where each vector contained classifier-specific weights. The class posterior probabilities were
then estimated for recognition. A genetic algorithm (GA)
was used by Santos et al.39 and Chernbumroong et al.40 to
search for optimal weights for each classifier to combine
the scores of each classifier. Different from the abovementioned methods, Kumar and Raj41 developed instance-specific weights for score-level classifier fusion. Although these methods are effective, the weights are mostly classifier-specific; the differences between the classification contributions of each feature to individual classes were not considered in these methods when designing the weights.
Moreover, in these methods, weights should be relearned
when new classifiers are added.
Decision-level fusion utilizes operators such as AND, OR, max, and min36 to integrate classification labels. Rocha et al.35 presented a unified approach that
could combine many features and classifiers at decision level
for multiclass classification of fruits and vegetables. The
multiclass classification problem was divided into a set of
binary classifications. For each class, a set of binary classifiers was trained using each feature. Then, the decision
results were computed by majority voting of the classified
results (labels) of all sets of binary classifiers using each
feature. Because hard labels such as −1 or +1 lose more
information than soft scores, the fusion of hard labels might
lead to misclassifications.
3 Proposed Object Classification Based on
Weighted Multiple Feature Fusion
In this section, we present the proposed object classification
approach, multiple feature extraction, optimal feature selection, and the proposed weighted score-level feature fusion
method.
3.1 Overview of the Proposed Object Classification
Approach
The framework of the proposed object classification method
is shown in Fig. 1. The image preprocessing step includes
conversion from RGB color space to HSV color space, grayscale transformation, image segmentation (Otsu’s method43),
and edge detection (Canny edge detection44). Multiple and
complementary features are then extracted. We improve
the recognition accuracy by performing optimal single feature selection using cross-validation to choose the optimal
feature parameters. The complementarity of each feature
and the different classification contributions of each feature for each class are fully utilized by employing weighted score-level feature fusion. Each optimal feature is used to train
a multiclass classifier with SVM (LibSVM45). Then, the
weights of each feature (i.e., the multiclass classifier) for
each class are separately learned from training samples using
a linear SVM (LibLinear46) with the scores of each classifier
for each class as input features. The classifier responses
(scores) of each classifier are vectors containing the probability of samples being classified into each class. The optimal
fusion of features is determined by testing all possible combinations of features. During the testing stage, the optimal multiple features are extracted. Subsequently, the trained
multiclass classifiers are used to recognize the corresponding
feature vectors to obtain the score vectors. Finally, the final
score vector is computed by summing the score vectors
of each multiclass classifier (i.e., a feature) with learned
weights during the training stage. The classification result
(label) is decided by finding the maximum score in the final
score vector.
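To make the decision step concrete, the following minimal sketch shows how the final score vector and label (Eqs. (9) and (10), presented later in Sec. 3.4.4) could be computed once the per-feature score vectors and the learned weights and bias terms are available; all names are illustrative, not taken from the authors' code.

import numpy as np

def fuse_and_classify(scores, weights, biases):
    # scores, weights, biases: arrays of shape (n_features, n_classes);
    # each row of `scores` is one classifier's probability vector for the image
    final = (weights * scores + biases).sum(axis=0)  # weighted score-level fusion
    return int(np.argmax(final))                     # label = index of the maximum score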
3.2 Multiple Feature Extraction
In this section, details of the multiple feature extraction procedure are presented. We use complementary features that
include shape, color, HOG, LBP, and edgeLBP features.
3.2.1 Global shape features
In general, different fruits have different shape properties, for
example, the shape of an apple is spherical, whereas the
shape of a hami melon is elliptic. Therefore, the global
shape features carry important attributes for fruit recognition.
However, using the shape features as the only attributes of
fruit is not sufficient because fruits have different colors, textures, as well as gradient features. In the field of fruit recognition, simple shape features such as area, circumference,
and circularity are utilized.19 In this work, these three
kinds of global shape features are all used. The shape features are computed from the fruit region based on Otsu’s
image segmentation43 and Canny edge detection.44 Area
(A) is the number of foreground pixels, belonging to a
fruit. Circumference (C) is the number of pixels belonging
to the edge of a fruit. Circularity is computed as Circularity = C^2 / (4πA). The shape features represent the
global shape of an object, which may lead to misclassification because of the different scales and orientations of the
object. Hence, features that are able to represent the local
shape of an object are indispensable.
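Because these three quantities are fully defined above, they can be sketched in a few lines; the helper below is hypothetical and assumes a binary foreground mask from Otsu's method and a binary edge map from Canny detection.

import numpy as np

def shape_features(mask, edges):
    area = float(np.count_nonzero(mask))            # A: number of foreground pixels
    circumference = float(np.count_nonzero(edges))  # C: number of edge pixels
    circularity = circumference ** 2 / (4.0 * np.pi * area)  # C^2 / (4*pi*A)
    return np.array([area, circumference, circularity])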
3.2.2 Color features
Statistical color features, such as the mean and standard
deviation of all color values in each color channel, are commonly used in object recognition. Methods using the statistical features are simpler and faster than methods based on
the color histogram. The hue is invariant to the orientation of
an object with respect to illumination and the camera direction; hence, the HSV color space is often used. In addition,
the HSV color space more closely resembles that of the
human visual system than the RGB color space.20 In HSV,
the H and S channels indicate the color, whereas the V channel indicates the brightness or luminance. Therefore, the
color features are extracted on the H and S channels in
the HSV color space. Four statistical color features are
used in this paper. They are the mean and standard deviation
values of both of the H and S channels, respectively.
We also utilize a global color histogram on the HSV color
space by dividing each channel into various bins to compute
the histograms. The final color histogram is obtained by
combining the histograms of each channel. The number
of bins influences the performance of the color histogram.
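As an illustration, the four statistical color features and the global color histogram might be computed as sketched below, assuming an HSV image with channels scaled to [0, 1]; the exact binning and normalization used by the authors are not stated, so both are assumptions here.

import numpy as np

def color_features(hsv, bins=32):
    h, s = hsv[..., 0].ravel(), hsv[..., 1].ravel()
    stats = np.array([h.mean(), h.std(), s.mean(), s.std()])  # 4 statistical features
    hist_h, _ = np.histogram(h, bins=bins, range=(0.0, 1.0))
    hist_s, _ = np.histogram(s, bins=bins, range=(0.0, 1.0))
    hist = np.concatenate([hist_h, hist_s]).astype(float)
    return stats, hist / (hist.sum() + 1e-12)                 # normalized histogram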
Fig. 1 Framework of the proposed object classification method.
3.2.3 Histogram of oriented gradients
The HOG descriptor adopts the statistical information of gradients to represent the local contour of an object, such as a
pedestrian.4 HOG can represent the local contour of an
object, which is a kind of local shape feature. Hence, HOG
not only complements simple shape features but also color
and texture features. In this paper, the HOG features are
extracted from the grayscale images. In general, images
are divided into several blocks with overlap or no overlap to
capture more information. Therefore, the accuracy of HOG
depends on the size of a block and the extent of overlap.
Similarly, using the HOG feature alone is also insufficient
because the local shape features only form a subset of all
the attributes of fruit. Thus, other features, especially texture
features such as LBP, should be utilized.
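For illustration, HOG could be extracted with scikit-image as below; mapping the 20 × 20 non-overlapping blocks reported in Sec. 5.1 onto `pixels_per_cell` and `cells_per_block` is our assumption rather than a detail given in the paper.

from skimage.feature import hog

def hog_features(gray):
    # gray: a 100x100 grayscale image; 20x20 cells with no block overlap
    return hog(gray, orientations=9, pixels_per_cell=(20, 20),
               cells_per_block=(1, 1), feature_vector=True)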
3.2.4 Local binary pattern
LBP is a commonly used texture feature that has shown
robust performance in pedestrian detection and face recognition.13 Hence, LBP is utilized in this work to improve the
accuracy of fruit classification.
We utilize uniform pattern LBPs that are extracted on the
V channel of HSV color space. The recognition accuracy of
LBP depends on radius, number of sample points, block size,
and extent of overlap. LBP complements other features such
as HOG. Although LBP is useful for object recognition, LBP
only extracts the local texture information between the center
pixel and its neighbors. Thus, global features should be used
to complement LBP.
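A block-wise uniform-pattern LBP histogram in the spirit of this section might look as follows, using the radius-1, 8-point, 25 × 25-block configuration reported in Sec. 5.1; applying the same function to an edge map would give the edgeLBP feature of Sec. 3.2.5. This is a sketch, not the authors' implementation.

import numpy as np
from skimage.feature import local_binary_pattern

def lbp_features(channel, P=8, R=1, block=25):
    lbp = local_binary_pattern(channel, P, R, method="uniform")  # codes in [0, P+1]
    hists = []
    for y in range(0, channel.shape[0], block):
        for x in range(0, channel.shape[1], block):
            patch = lbp[y:y + block, x:x + block]
            hist, _ = np.histogram(patch, bins=P + 2, range=(0, P + 2))
            hists.append(hist / (hist.sum() + 1e-12))            # per-block histogram
    return np.concatenate(hists)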
3.2.5 EdgeLBP
Edge detection using structured forests is one of the state-of-the-art edge detection methods42 that is capable of computing
high-quality edge maps. We complement other features by
extracting edge maps from RGB images using edge detection.42 The use of edge maps as features leads to a large
number of high dimension features. We extract LBP on
edge maps to obtain the local properties of edge features.
Similar to LBP, the performance of edgeLBP is also influenced by the block size and degree of overlap. Edge maps
are global edge features and edgeLBP captures local analyses
of edge features that can complement other features.
3.3 Optimal Feature Selection
The above analyses indicate that the accuracy of the global color histogram, HOG, LBP, and edgeLBP depends on the feature extraction parameters. Optimal single-feature selection is necessary for improving recognition accuracy and reducing dimensionality. We select the optimal parameters of each feature by first applying kernel principal component analysis to reduce the feature dimension and discard useless features.47 Subsequently, the average five-fold cross-validation accuracy of each set of parameters is computed. The set of parameters with the highest average cross-validation accuracy is selected as the optimal parameters of that feature. The optimal single feature is extracted using the optimal parameters, which are presented in Sec. 5.1.

3.4 Proposed Multifeature Fusion Approach
Once the optimal features have been selected, we perform multiple-feature fusion. In this section, we propose a weighted score-level feature fusion method based on learned weights. We first introduce several state-of-the-art feature-fusion baselines so that they can be better compared with our proposed method.
3.4.1 Feature-level fusion based on simple
concatenation
We concatenate one feature after another to obtain the final
feature vector, which is used to train the final classifier (an
approach named “Concatenating”), similar to the fusion
described by Manshor et al.22 The fusion of multiple features
is achieved as follows:
F = [shape; color; LBP; HOG; edgeLBP],  (1)
where F is the final feature fusion vector, and the five terms
in the bracket denote the five features discussed above.
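As a sketch, Eq. (1) amounts to a single concatenation call (argument names are illustrative):

import numpy as np

def concatenate_features(shape_vec, color_vec, lbp_vec, hog_vec, edgelbp_vec):
    # Eq. (1): F = [shape; color; LBP; HOG; edgeLBP]
    return np.concatenate([shape_vec, color_vec, lbp_vec, hog_vec, edgelbp_vec])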
3.4.2 Score-level feature fusion baselines
Score-level fusion based on average classification contribution. We propose a simple and new weighted score-level
fusion based on the average classification contribution
(named “WSLF-ACC” in this paper), where the weights of
each feature are defined as the average classification contribution computed using average accurate probability.
Each feature vector is used to train a multiclass classifier
using SVM on training samples. Each classifier (corresponding to each feature) can obtain the responses (i.e., scores) of
each sample. We then linearly sum the scores of each classifier for each sample together with weights to obtain the
final scores. The weights of each classifier are computed
using the average accurate probability as
w_j = (1/N) Σ_{i=1}^{N} P_ij,  (2)

w̄_j = w_j / Σ_{j=1}^{5} w_j,  (3)
where N is the number of training samples and P_ij is the probability that the i'th sample is classified correctly by the j'th classifier (feature). P_ij is computed using LibSVM.
During the testing stage, the outputs of the classifiers that
are trained using each of the optimal features are summed
with the computed weights to obtain the final score vectors.
For multiclass classification, the output of the classifier is
a vector containing the probability of samples belonging to
each class and the class label is computed by determining
the maximum score in the final score vector.
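A minimal sketch of the WSLF-ACC weight computation in Eqs. (2) and (3), assuming a matrix P whose entry (i, j) holds the probability that training sample i is classified correctly by the j'th classifier:

import numpy as np

def wslf_acc_weights(P):
    w = P.mean(axis=0)   # Eq. (2): average accurate probability per classifier
    return w / w.sum()   # Eq. (3): normalize so the weights sum to one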
Four state-of-the-art score-level fusion approaches. M-SVM:37 Five classifiers are trained using each feature with
LibSVM and cross validation. The class-specific weight vectors and bias terms are learned using the previously proposed
M-SVM37 to linearly combine the results of each sample
obtained by the five classifiers as
h(s_k) = w_k^T s_k + b_k,  (4)
where s_k is the score vector of the k'th class containing the scores obtained by the five classifiers, w_k^T and b_k are the class-specific weight vector and bias term, respectively, and h(s_k) is the final score of the k'th class.
“M-SVM+LEM”:38 We follow an existing concept of classifier fusion.38 This is done by training five M-SVM classifiers using the five features introduced above, after which the classification results of the five classifiers are combined by class-specific weight vectors learned by the LEM method38 as
h(s_k) = w_k^T s_k.  (5)
Weighted linear combination (WLC) using genetic
algorithm (WLC-GA):39 After computing the scores of
each classifier (i.e., feature), we search for the optimal
weights of each feature by using a modified GA39 to linearly
combine the scores of each classifier for predicting the
class label.
GA-based fusion weight selection (GAFW):40 GAFW is
another GA-based feature fusion approach. The GA used to
compute weights40 is different from that in WLC-GA.
Similarly, we select the weights of each feature using this
GA40 to fuse together the classification results of each
feature for class label recognition.
In WLC-GA and GAFW, the weights are classifier-specific. The final score vector is computed as
S_final = Σ_{i=1}^{5} w_i S_i,  (6)
where S_i is the score vector containing the scores of each class obtained by the i'th classifier, and w_i is the learned weight of the i'th classifier.
3.4.3 Decision level fusion and learning level fusion
We use the five optimal features described in Sec. 3 and
apply a known feature fusion approach.35 One hundred
(5 × 20) binary classifiers of each feature for each class are learned and combined using the classifier fusion framework35 to obtain the final decision. This decision-level baseline is
termed “DLF-BC” in this paper (decision-level fusion of
binary classifiers).
“SimpleMKL,”48 one of the state-of-the-art MKL methods, is selected as the baseline for learning-level feature
fusion. SimpleMKL wraps a multiclass SVM and selects an
optimal combination of kernels based on different features
and is easy to use for multiclass classification problems.
We utilize the five features above. Then, SimpleMKL48 is
used to combine the five features by finding the optimal combination of kernels for object classification.
3.4.4 Our proposed weighted score-level multiple
feature fusion based on learned weights
The differences in classification strength among the features
of each class should be considered. Therefore, we propose a
score-level feature fusion based on learned weights of each
feature for each class in this section to improve the overall
classification accuracy. We name the proposed feature fusion weighted score-level fusion with learned weights (WSLF-LW). For each class, different features (classifiers) demonstrate different classification accuracies. In addition, a given feature (classifier) demonstrates different accuracies across classes. This requires us to learn the weight of each feature for an individual class accurately and reasonably. That is
to say, the learned weights should be classifier-specific and
class-specific.
In recent years, coefficients learned from training data have been utilized in object proposal methods.49,50 “BING” first learned a classifier to compute the original scores, followed by a score function with two learned terms (weights) to obtain accurate and reasonable scores.50 In view of this, we also learn the
accurate weights of each feature for each class using a linear
SVM.
In stage 1, we train a multiclass classifier of each feature
using SVM (LibSVM) and conduct cross-validation on
training samples. Each feature utilized is generated by the
optimization selection framework proposed in this paper.
The multiclass classifier computes the classifier responses
(e.g., scores) of each sample. The scores of each classifier
for each sample are represented by the following vector
S_ij = [s_i1j, s_i2j, ..., s_ikj, ..., s_iCj],  (7)
where C is the number of classes, j and i denote the j'th classifier (feature) and the i'th sample, respectively, and s_ikj denotes the score of the i'th sample for the k'th class obtained by the j'th classifier (feature).
In stage 2, the weights and bias terms of each feature for
each class are learned by a linear SVM. The score vector of
each feature (e.g., classifier) is calibrated using learned
weights and then multiclass prediction is derived from the
final score vector that is computed by summing the calibrated score vectors of each feature.
We define the calibrated score vector of each feature for
each class as
o_ij = w_j S_ij + b_j = [w_j1 s_i1j + b_j1, w_j2 s_i2j + b_j2, ..., w_jk s_ikj + b_jk, ..., w_jC s_iCj + b_jC],  (8)
where w_jk and b_jk are the learned weight and bias term of the j'th feature for the k'th class used for score calibration, respectively, and o_ij is the output score vector of the i'th sample using the j'th feature.
The final score vector is computed by summing the
calibrated score vectors of each feature as
O_i = Σ_{j=1}^{5} o_ij
    = [Σ_{j=1}^{5} (w_j1 s_i1j + b_j1), Σ_{j=1}^{5} (w_j2 s_i2j + b_j2), ..., Σ_{j=1}^{5} (w_jk s_ikj + b_jk), ..., Σ_{j=1}^{5} (w_jC s_iCj + b_jC)]
    = [O_i1, O_i2, ..., O_ik, ..., O_iC],  (9)
where o_ij is obtained using Eq. (8), O_i is the final score vector of the i'th sample after weighted feature fusion, and O_ik is the score for the k'th class.
Note that the decision function of the multiclass classification problem is defined as follows
l_i = arg max_{k=1,...,C} O_ik,  (10)
where l_i is the classification result (class label) of the i'th sample and O_ik is the k'th value of the final score vector of the i'th sample. This decision function signifies that the
class label of a sample is the index of the maximum
score.
The terms w_jk and b_jk in Eq. (8) are learned using a linear SVM (LibLinear) by taking training samples of the k'th class as positive samples and training samples of all other classes as negative samples. The score values of the j'th feature for the k'th class serve as the (one-dimensional) input features to the linear SVM. Details of the weight-learning procedure can be found in the published “BING” code.50 The learning procedure is run 5 × C times to obtain the weight and bias term of each feature for each class.
We first evaluate each multiclass classifier for the corresponding features of each testing sample, after which we
compute the final score vector by inputting the classifier
responses and learned weights together with the bias
terms into Eq. (9). Finally, the classification results are
obtained using Eq. (10).
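The stage-2 learning loop might be implemented as sketched below, with scikit-learn's LinearSVC standing in for LibLinear; the input layout and hyperparameters are assumptions, as the paper defers implementation details to the published BING code.50

import numpy as np
from sklearn.svm import LinearSVC

def learn_wslf_lw(scores, y, n_classes):
    # scores[j]: (n_samples, n_classes) probabilities from the j'th classifier;
    # y: true labels; returns W[j, k] = w_jk and B[j, k] = b_jk
    n_feat = len(scores)
    W = np.zeros((n_feat, n_classes))
    B = np.zeros((n_feat, n_classes))
    for j in range(n_feat):
        for k in range(n_classes):
            svm = LinearSVC(C=1.0)
            # one-dimensional feature: the j'th classifier's score for class k;
            # positives are samples of class k, negatives all other classes
            svm.fit(scores[j][:, [k]], (y == k).astype(int))
            W[j, k] = svm.coef_[0, 0]
            B[j, k] = svm.intercept_[0]
    return W, B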
The highlights of our proposed weighted score-level
feature fusion based on learned weights are as follows:
1. Our proposed WSLF-LW approach considers the
strength of complementarity of each feature for recognizing the individual class when designing weights,
which is ignored by most feature fusion approaches
except M-SVM37 and M-SVM+LEM.38 The learned
weights are not only class-specific (for the same feature, the weights of each class are different and learned
separately) but also classifier-specific (for the same
class, the weights of each feature are also different).
2. Our proposed method learns each weight of each
feature for each class separately in a learning step (independent), whereas other score-level fusion approaches such as M-SVM,37 M-SVM+LEM,38 WLC-GA,39
and GAFW40 only use one learning step to learn all the
weights. Thus, those approaches require all weights to
be relearned when additional features are added. In
contrast, our proposed method is easy to extend when
additional features are added and only necessitates the
training of the new multiclass classifiers using the added features and the learning of the weight and bias terms of the added feature for each class, while keeping the existing classifiers and weights unchanged.
3. We also learn the bias term of each classifier for each individual class to calibrate all weights and improve classification accuracy. Bias terms are not utilized in the other score-level fusion baselines, with the exception of M-SVM,37 where the bias terms of each class are the same for different classifiers.
4. Our approach differs from feature-level and learning-level feature-fusion approaches, in which all features are used to train a single classifier; our method is less prone to the curse of dimensionality because each feature is used separately to train a classifier.
5. Our proposed approach differs from the existing decision-level feature-fusion approach,35 in which 5 × C binary classifiers are trained and the final decision is obtained by majority voting of hard labels (e.g., +1/−1). In contrast, our proposed method considers the differences in the classification ability of each feature for each class, trains only five multiclass classifiers, and linearly sums the soft scores with weights. The disadvantage of using hard labels and majority voting is that some useful information may be lost. Our proposed method thus has the potential to be more effective and to reduce training and testing time.
4 Experimental Settings
4.1 Experimental Conditions
All experiments were performed using an Intel Core i7-4770
processor with 8 GB installed memory (RAM) running
Windows 7 Enterprise 64-bit. The computing speed of the
CPU was 3.40 GHz. In addition, the evaluation measure
used in this work is recognition accuracy, which is the
ratio of the number of samples that are recognized correctly
to the number of samples tested.
4.2 Dataset
In the work described in this paper, all the experiments are
performed on the customized fruit dataset. Currently, there is
no universal fruit dataset for fruit recognition. We developed
an object classification method as the basis of object detection. Therefore, the samples should consist of images containing a single fruit. As a result, existing fruit datasets
are not suitable for our work. Therefore, a new fruit dataset
with images containing only a single fruit in the foreground
is built. The dataset comprises 20 fruit classes, including the
background, red apple, orange, pear, tomato, strawberry,
banana, watermelon, kiwi fruit, peach, pomegranate, pineapple, starfruit, red grape, lemon, mango, pomelo, durian,
hami melon, and papaya. The size of each sample is 100 × 100 pixels. Fifty samples of each class are randomly
selected for testing and the other samples are used for
training. All the samples are RGB images. Details of the
dataset are given in Table 1 and the dataset is available
online.51
Table 1 Fruit dataset developed in this paper.

Class of fruit   Training samples   Testing samples
Background              740               50
Red apple               418               50
Orange                  432               50
Pear                    268               50
Tomato                  411               50
Strawberry              450               50
Banana                  394               50
Watermelon              239               50
Kiwi fruit              311               50
Peach                   434               50
Pomegranate             266               50
Pineapple               225               50
Starfruit               161               50
Red grape               164               50
Lemon                   149               50
Mango                   149               50
Pomelo                  118               50
Durian                  120               50
Hami melon              102               50
Papaya                   59               50
Total                  5610             1000
Figure 2 shows representative images of each class in our
fruit dataset and two other existing fruit datasets. The first
two images in Fig. 2(a) are background images, that is,
images in which fruits do not appear. The backgrounds,
sizes, positions, and illuminations of the sample images
are different. Our dataset has a large diversity and is more
complex than the existing datasets in which most of the
images have a simple and monotonic background. For instance, the background of images in an existing dataset23
is white in color [see Fig. 2(b)] and in Fig. 2(c) the background of each image of another existing dataset35 is very
simple and monotonic. All experiments shown in Sec. 5
are conducted on this fruit dataset.

Fig. 2 Examples of each class in our dataset and examples of other datasets. (a) Examples of each class of our fruit dataset; (b) and (c) examples of other fruit datasets.23,35
5 Experimental Results
This section presents the experimental results, including optimal feature selection, the differences in the classification strength of each feature for each class, a comparison between our proposed method and the baseline methods, a validation of feature complementarity, a comparison with other fruit classification methods, the accuracy with different numbers of training samples, and the classification speed.
5.1 Optimal Feature Selection Results
Optimal feature selection is performed to select the optimal
set of feature extraction parameters. The selection standard is
the average five-fold cross-validation accuracy. For a global
color histogram, the optimal number of bins is 32. The optimal HOG is extracted by dividing images into 20 × 20 blocks without overlap. The optimal LBP is a uniform pattern with a radius of 1 and 8 sample points; the block size is 25 × 25 with no overlap among blocks.
For edgeLBP, the parameters of LBP extraction in edgeLBP
are similar to those of LBP.
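For reference, the five-fold cross-validation sweep of Sec. 3.3 could be implemented along the following lines, here scanning the color-histogram bin count; the KPCA dimension, SVM settings, and parameter grid are illustrative assumptions.

import numpy as np
from sklearn.decomposition import KernelPCA
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

def select_bins(extract, images, y, grid=(16, 32, 64)):
    # extract(img, bins) is a hypothetical feature extractor, e.g., the color histogram
    best_bins, best_acc = None, -np.inf
    for bins in grid:
        X = np.array([extract(img, bins) for img in images])
        pipe = make_pipeline(KernelPCA(n_components=50), SVC())  # KPCA + SVM, per Sec. 3.3
        acc = cross_val_score(pipe, X, y, cv=5).mean()           # five-fold CV accuracy
        if acc > best_acc:
            best_bins, best_acc = bins, acc
    return best_bins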
5.2 Classification Strength Differences of
Each Feature for Each Class
The classification strength of each feature for each class is
investigated by classifying samples of each fruit class using
the five respective feature descriptors. The classification
accuracies of each feature for each class are shown in
Fig. 3. “Shape” and “color” denote the global shape features and color features, respectively.

Fig. 3 Classification accuracy comparison of each feature for each class.
The results in Fig. 3 show that the same feature demonstrates various classification accuracies for the 20 classes of
fruit. For instance, LBP demonstrates high accuracy (98%)
for durian and low accuracy (50%) for papaya. Different features also vary in the classification accuracy for each class.
For instance, HOG demonstrates higher accuracy (60%) for
papaya than LBP. In terms of the average accuracy for all 20
classes, LBP demonstrates the highest accuracy (83.7%) and
shape shows the lowest accuracy (25.7%). These results
indicate that the various contributions of each feature to
an individual class require the weights of each feature for
each class to be considered when performing feature fusion
to improve the overall classification accuracy.
5.3 Validation of Feature Complementarity
This section describes the results of experiments that are performed to validate feature complementarity. Table 2 summarizes the performance (accuracy for 20 classes) of single
features and several multiple-feature fusions based on
learned weights. The accuracies of shape and color features
are low. However, this does not mean that shape and color features are not useful. The analysis above shows that the shape features describe global shape, which is complementary to HOG, and that the color features are complementary to the other features. The accuracy of the fusion of shape and HOG features is 82.2%, which is higher than that of HOG (78.6%) and shape features (25.7%) alone. This validates the complementarity of
shape features with HOG. To validate the complementarity of edgeLBP with LBP, the fusion of edgeLBP and LBP is tested. The accuracy of this fusion is 86.8%, which is higher than that of LBP (83.7%) and edgeLBP (77.8%). The accuracy of the fusion of color, shape, and HOG features is 84.4%, which is higher than the accuracy of any single feature and validates the complementarity of the color features with the other features. The fusion of all five features demonstrates the highest recognition accuracy (90.7%), which also validates that the five features are complementary with each other. These results also demonstrate that our proposed feature-fusion approach is effective. Note that score-level multiple-feature fusion based on learned weights is performed when two or more features are used. Moreover,
Table 2 Experimental results for validating the complementarity of features. C and S denote color and shape features, respectively.

Features                        Accuracy (%)
C                               55.6
S                               25.7
LBP                             83.7
HOG                             78.6
EdgeLBP                         77.8
S + HOG                         82.2
LBP + edgeLBP                   86.8
C + S + HOG                     84.4
C + S + LBP + HOG               88.7
C + S + LBP + HOG + edgeLBP     90.7
we consider the five feature vectors extracted from red apples
as random variables and conduct a T-test and an F-test on each pair of different features to verify the statistical independence
of the feature descriptors. The average P values over all
samples using T-test and F-test are both less than 0.05. For
instance, the P values of T-test and F-test on LBP and HOG
are 0.0324 and 0.0059, respectively. These results indicate
that the five features are statistically independent of each other to a large extent. We also compare the classification speed (i.e., computation time) when using each feature. For instance, the classification speed when using “LBP,” “S + HOG,” and “C + S + LBP + HOG + edgeLBP” is
about 0.032 s per image, 0.044 s per image, and 0.092 s
per image, respectively. Because the weights have been
learned offline, the classification speed of our proposed feature fusion is only slightly slower than that of a single feature
(e.g., LBP); however, our proposed method improves the
accuracy by 7%. This result demonstrates the effectiveness
of our proposed method.
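The pairwise T-test and F-test mentioned above might be run as in the sketch below using SciPy; since the paper does not specify the exact test variants, the two-sample t-test and variance-ratio F-test here are assumptions.

import numpy as np
from scipy import stats

def pairwise_tests(f1, f2):
    # f1, f2: two feature vectors from the same image (e.g., LBP and HOG)
    t_p = stats.ttest_ind(f1, f2, equal_var=False).pvalue       # two-sample t-test
    F = np.var(f1, ddof=1) / np.var(f2, ddof=1)                 # variance ratio
    f_p = 2 * min(stats.f.sf(F, len(f1) - 1, len(f2) - 1),
                  stats.f.cdf(F, len(f1) - 1, len(f2) - 1))     # two-sided F-test
    return t_p, f_p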
Because we extract five complementary features, a natural question is which combination of features is optimal. Therefore, we also test all possible fusions of features to select the optimal combination. There are 26 possible combinations of two or more features (C(5,2) + C(5,3) + C(5,4) + C(5,5) = 10 + 10 + 5 + 1 = 26), but we only partially show the results in Table 2. The results indicate that the combination of all five features is optimal.
Hereafter, the performance of the proposed method is obtained
using the optimal fusion of features.
5.4 Comparison with Multiple-Feature Fusion Baselines
The effectiveness of our proposed score-level feature fusion based on learned weights is demonstrated by comparing our results with the eight other multiple-feature fusion methods described in Sec. 3. The eight baseline methods are evaluated on the same fruit dataset using the five features. The comparison of recognition accuracies of the multiple-feature fusion approaches is shown in Fig. 4. The trend of recognition accuracy with an increasing number of fruit classes is shown by randomly selecting 2 to 20 classes from the fruit dataset to be recognized. For example, 4 in Fig. 4 denotes that we randomly select four classes (e.g., background, pear, red apple, and orange) that are evaluated by the different fusion baseline methods. Recognition accuracy decreases as the number of fruit classes increases. The weighted score-level feature fusion based on learned weights (WSLF-LW) demonstrates higher accuracy than the other fusion methods with the same number of classes. For instance, for 20 classes of fruit, our proposed weighted score-level fusion based on learned weights demonstrates higher classification accuracy (90.7%) than M-SVM+LEM (89.3%), GAFW (89.1%), WLC-GA (88.8%), SimpleMKL (88.5%), M-SVM (88.3%), DLF-BC (88.1%), WSLF-ACC (87.4%), and “concatenating” (85.6%). These results confirm that our proposed feature fusion is superior to the other multiple-feature fusion methods in terms of classification accuracy.

Fig. 4 Comparison of different feature fusion methods.
5.5 Comparison with State-of-the-Art Fruit
Classification Methods
Our proposed object classification method is validated on a
fruit classification task using a customized fruit dataset. Four
state-of-the-art fruit classification methods and our proposed
object classification method are compared on the same
dataset.
The first is the method proposed by Harjoko and Abdullah,19 which uses shape and color features. The second is the method developed by Arivazhagan et al.,20 which utilizes a fusion of color and texture features. These first two methods both fuse features by simply concatenating one feature after another. The third method, proposed by Zhang and Wu,23 fuses shape, color, and texture features using PCA. The last method utilizes three features and combines the binary classifiers of each feature using the majority voting rule.35 Figure 5 shows the
comparison between the four fruit classification methods
and our proposed object classification method. Our proposed
fruit classification method demonstrates higher accuracy (90.7% for 20 classes) than the four state-of-the-art fruit classification methods under the same conditions. For instance, our proposed method improves accuracy for 20 classes by 4%, 9.4%, 30.4%, and 33.2% compared to the four existing methods,19,20,23,35 respectively. Therefore, our proposed method is more effective than the other fruit classification methods in terms of classification accuracy.

Fig. 5 Comparison of several fruit classification methods on our developed fruit dataset.
5.6 Results Using Different Numbers of
Training Sample
In the new fruit dataset, there are 5610 training images. We
are interested in the impact of changing the number of training samples on accuracy. The higher the number of training
samples included, the longer the training time. We investigate whether we could obtain comparable accuracy with
fewer training samples. Experiments are performed by
changing the percentage of training samples per class and
using weighted feature fusion based on learned weights.
This is done by testing 8%, 16%, 32%, 64%, 80%, 90%,
and 100% of training samples per class. Figure 6 shows
the accuracy comparison (for 20 classes) with different numbers of training samples per class. The results in Fig. 6 indicate that the higher the number of training samples per class, the higher the accuracy, which suggests that the accuracy could be improved further by increasing the number of training samples.

Fig. 6 Comparison of different numbers of training samples per class using weighted multiple-feature fusion based on learned weights. The X-axis shows the percentage of training samples per class and the Y-axis denotes the recognition accuracy. The curve shows that the accuracy increases with the number of training samples used.
5.7 Classification Speed
The classification process includes feature extraction, classification by each trained classifier, and weighted summation of scores. The classification time of weighted multiple-feature fusion based on learned weights is about 0.092 s per image (for a 100 × 100 RGB image), which would
meet the criterion for real-time recognition. We can conclude
that weighted score-level multiple-feature fusion based on
learned weights is effective and efficient in terms of recognition accuracy and classification speed.
6 Conclusions
This paper proposes an object classification method using
weighted score-level multiple-feature fusion based on
learned weights and an optimal feature selection framework.
The proposed method demonstrates effective and robust performance for 20 classes of fruit and can recognize fruit
against complex backgrounds using a customized fruit dataset. Color features, shape features, and more efficient and
robust features including LBP, HOG, and edgeLBP, are utilized. Optimal feature parameter selection is performed. The
complementarity of the five features is also analyzed and
validated. Multiple features are combined at the score level
by summing scores with learned weights of each feature
for each class. The weights are separately learned from training data using a linear SVM. The experiment results demonstrate that the proposed score-level multiple feature fusion
based on learned weights is more effective than several
state-of-the-art multiple feature fusion methods. As a consequence of the complementarity of the five features and the
effectiveness of the proposed feature fusion approach, the
proposed object classification method outperforms other
state-of-the-art fruit classification methods when validated
on the same dataset. The recognition speed can meet the
demand of real-time applications. Each image in the dataset
contains a single fruit object. Therefore, the proposed object
classification could be used as the basis for object detection
systems.
The proposed method is effective and efficient for object
classification. Yet, there is still scope for improvement. In this paper, LBP, HOG, and edgeLBP are extracted from regions of an image. However, several regions contain little information that is useful for recognition; for example, regions in the corners of an image may contain only background. In the future, a region
selection framework will be added to obtain meaningful
regions and improve accuracy and testing speed.
In real-world applications, imaging conditions vary; however, only grayscale images and the HSV color space are considered in this paper. Improving robustness to varying imaging conditions would require the use of multiple color spaces for feature extraction.
Acknowledgments
The work described in this paper was fully supported by
a grant from City University of Hong Kong (Project
No. 9610326).
References
1. D. S. Prabha and J. S. Kumar, “Three dimensional object detection and
classification methods: a study,” Int. J. Eng. Res. Sci. Tech. 2(2), 33–42
(2013).
2. P. Viola and M. Jones, “Robust real-time face detection,” Int. J. Comput.
Vision 57 (2), 137–154 (2004).
3. J. Geng and Z. Miao, “Domain adaptive boosting method and its applications,” J. Electron. Imaging 24(2), 023038 (2015).
4. N. Dalal and B. Triggs, “Histograms of oriented gradients for human
detection,” in IEEE Conf. on Computer Vision and Pattern Recognition,
pp. 886–893 (2005).
5. I. Charfi et al., “Optimized spatio-temporal descriptors for real-time fall
detection: comparison of SVM and Adaboost based classification,”
J. Electron. Imaging 22(14), 041106 (2013).
6. Z.-Q. Zhao, D. S. Huang, and B.-Y. Sun, “Human face recognition
based on multiple features using neural networks committee,”
Pattern Recognit. Lett. 25 (12), 1351–1358 (2004).
7. D. Zang et al., “Vehicle license plate recognition using visual attention
model and deep learning,” J. Electron. Imaging 24(3), 033001 (2015).
8. T. N. Sainath et al., “Deep convolutional neural networks for large-scale
speech tasks,” Neural Networks 64, 39–48 (2015).
9. P. F. Felzenszwalb et al., “Object detection with discriminatively trained
part based models,” IEEE Trans. Pattern Anal. Mach. Intell. 32 (9),
1627–1645 (2010).
10. D. G. Lowe, “Distinctive image features from scale-invariant keypoints,” Int. J. Comput. Vision 60(2), 91–110 (2004).
11. R. Wang, Z. Zhu, and L. Zhang, “Improving scale invariant feature
transform-based descriptors with shape-color alliance robust feature,”
J. Electron. Imaging 24(3), 033002 (2015).
12. K. E. Van De Sande, T. Gevers, and C. G. Snoek, “Evaluating color
descriptors for object and scene recognition,” IEEE Trans. Pattern
Anal. Mach. Intell. 32(9), 1582–1596 (2010).
13. G. Zhao et al., “Rotation-invariant image and video description with
local binary pattern features,” IEEE Trans. Image Process. 21(4),
1465–1477 (2011).
14. A. Satpathy, X. Jiang, and H. L. Eng, “LBP-based edge-texture features
for object recognition,” IEEE Trans. Image Process. 23(5), 1953–1964
(2014).
15. H. Kuang et al., “Mutual cascade method for pedestrian detection,”
Neurocomputing 137, 127–135 (2014).
16. M. A. Amin and H. Yan, “An empirical study on the characteristics of
gabor representations for face recognition,” IEEE Trans. Pattern Anal.
Mach. Intell. 23(3), 401–431 (2009).
17. H. Han et al., “Discriminant analysis with Gabor phase feature for
robust face recognition,” J. Electron. Imaging 22(4), 043035 (2013).
18. X.-X. Niu and Y. S. Ching, “A novel hybrid CNN-SVM classifier for
recognizing handwritten digits,” Pattern Recognit. 45(4), 1318–1325
(2012).
19. A. Harjoko and A. Abdullah, “A fruit classification method based on
shapes and color features,” in 3rd Asian Physics Symp., pp. 445–448
(2009).
20. S. Arivazhagan et al., “Fruit recognition using color and texture
features,” J. Emerg. Trends Comput. Inf. Sci. 1(2), 90–94 (2010).
21. X. Wang, T. X. Han, and S. Yan, “An HOG-LBP human detector with
partial occlusion handling,” in IEEE Int. Conf. on Computer Vision,
pp. 32–39 (2009).
22. N. Manshor et al., “Feature fusion in improving object class recognition,” J. Comput. Sci. 8(8), 1321–1328 (2012).
23. Y. Zhang and L. Wu, “Classification of fruits using computer vision
and a multiclass support vector machine,” Sensors 12, 12489–12505
(2012).
24. Y. Zhang et al., “Fruit classification using computer vision and feedforward neural network,” J. Food Eng. 143, 167–177 (2014).
25. S. Hou, Q. Sun, and D. Xia, “Feature fusion using multiple component
analysis,” Neural Process Lett. 34(3), 259–275 (2011).
26. F. Ou et al., “Face verification with feature fusion of Gabor based
and curvelet based representations,” Multimedia Tools Appl. 57(3),
549–563 (2012).
27. S. S. Bucak, R. Jin, and A. K. Jain, “Multiple kernel learning for visual
object recognition: a review,” IEEE Trans. Pattern Anal. Mach. Intell.
36(7), 1354–1369 (2014).
28. H. Hoashi, T. Joutou, and K. Yanai, “Image recognition of 85 food
categories by feature fusion,” in IEEE Int. Symp. on Multimedia,
pp. 296–301 (2010).
29. P. Gehler and S. Nowozin, “On feature combination for multiclass
object classification,” in IEEE 12th Int. Conf. on Computer Vision,
pp. 221–228 (2009).
30. R. M. Cruz, G. D. Cavalcanti, and T. I. Ren, “Handwritten digit
recognition using multiple feature extraction techniques and classifier
ensemble,” in 17th Int. Conf. on Systems, Signals and Image
Processing, pp. 215–218 (2010).
31. Y. M. Chen and J. H. Chiang, “Face recognition using combined
multiple feature extraction based on Fourier-Mellin approach for single
example image per person,” Pattern Recognit. Lett. 31(13), 1833–1841
(2010).
32. Y. Kikutani et al., “Hierarchical classifier with multiple feature
weighted fusion for scene recognition,” in Int. Conf. on Software
Engineering and Data Mining, pp. 648–651 (2012).
33. G. Han et al., “A new feature fusion method at decision level and its
application,” Optoelectron. Lett. 6, 129–132 (2010).
34. M. Eisenbach et al., “Evaluation of multi feature fusion at score-level
for appearance-based person re-identification,” in Int. Joint Conf. on
Neural Networks, pp. 1–9 (2015).
35. A. Rocha et al., “Automatic fruit and vegetable classification from
images,” Comput. Electron. Agric. 70(1), 96–104 (2010).
36. J. Kittler et al., “On combining classifiers,” IEEE Trans. Pattern Anal.
Mach. Intell. 20, 226–239 (1998).
37. Y. Guermeur, “Combining discriminant models with new multi-class
SVMs,” Pattern Anal. Appl. 5(2), 168–179 (2002).
38. Y. Guermeur, “Combining multi-class SVMs with linear ensemble
methods that estimate the class posterior probabilities,” Commun.
Stat. Theory Methods 42(16), 3011–3030 (2013).
39. A. B. Santos, A. de Albuquerque Araujo, and D. Menotti, “Combining
multiple classification methods for hyperspectral data interpretation,”
IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 6(3), 1450–1459 (2013).
40. S. Chernbumroong, S. Cang, and H. Yu, “Genetic algorithm-based classifiers fusion for multisensor activity recognition of elderly people,”
IEEE J. Biomed. Health Inf. 19(1), 282–289 (2014).
41. A. Kumar and B. Raj, “Unsupervised fusion weight learning in multiple
classifier systems,” CoRR, abs/1502.01823, http://arxiv.org/abs/1502.01823 (2015).
42. P. Dollár and C. L. Zitnick, “Fast edge detection using structured forests,”
IEEE Trans. Pattern Anal. Mach. Intell. 37(8), 1558–1570 (2014).
43. N. Otsu, “A threshold selection method from gray-level histograms,”
IEEE Trans. Syst. Man Cybern. 9(1), 62–66 (1979).
44. J. Canny, “A computational approach to edge detection,” IEEE Trans.
Pattern Anal. Mach. Intell. 8(6), 679–698 (1986).
45. C.-C. Chang and C.-J. Lin, “LIBSVM: a library for support vector
machines,” ACM Trans. Intell. Syst. Technol. 2(3), 27 (2011).
46. R.-E. Fan et al., “Liblinear: a library for large linear classification,”
J. Mach. Learn. Res. 9, 1871–1874 (2008).
47. L. J. Cao et al., “A comparison of PCA, KPCA and ICA for dimensionality reduction in support vector machine,” Neurocomputing 55(1–2),
321–336 (2003).
48. A. Rakotomamonjy et al., “SimpleMKL,” J. Mach. Learn. Res. 9,
2491–2521 (2008).
49. Z. Zhang and P. Torr, “Object proposal generation using two-stage cascade
SVMs,” IEEE Trans. Pattern Anal. Mach. Intell. 38(1), 102–115 (2015).
50. M. M. Cheng et al., “BING: binarized normed gradients for objectness
estimation at 300 fps,” in IEEE Conf. on Computer Vision and Pattern
Recognition (CVPR), pp. 3286–3293 (2014).
51. H. Kuang, “Fruitdataset,” https://www.researchgate.net/publication/283087342_Fruitdataset (October 2015).
Hulin Kuang received his MEng and BEng degrees from Wuhan
University, China, in 2013 and 2011, respectively. Currently, he is
a PhD student at the Department of Electronic Engineering at City
University of Hong Kong. His current research interests include computer vision and pattern recognition, especially object recognition.
Leanne Lai Hang Chan received her BEng degree in electrical and
electronic engineering from University of Hong Kong, MS degree in
electronic engineering, and her PhD in biomedical engineering
from University of Southern California. Currently, she is an assistant
professor in electronic engineering at City University of Hong Kong.
She is a member of IEEE. Her research interests include artificial
vision, retinal prosthesis, and neural recording.
Cairong Liu received her BSci degree from Wuhan University, China
in 2014. Currently, she is an Mphil student at the Department of
Mathematics at the Chinese University of Hong Kong. Her current
research interests include real analysis and machine learning in
computer vision.
Hong Yan received his PhD from Yale University. He was a professor
of imaging science at the University of Sydney and currently is a
professor of computer engineering at City University of Hong Kong.
He is a fellow of IEEE and IAPR. His research interests include image
processing, pattern recognition, and bioinformatics.