
Decision Model for Car Evaluation
Final Project in Pattern Recognition
Ronald Fatalla
December 6th, 2004
Computer and Electrical Engineering Department
University of Iceland
Abstract

When an individual considers buying a car, there are many aspects that can affect the decision about which type of car he or she is interested in. Factors such as the price, regular maintenance, comfort, and safety are a few of the vital issues to consider. The Car Evaluation database provides structured information on the features and classification of each car that can support this decision making. Because of its known underlying concept structure, the database is very informative and particularly useful for testing practical learning and structure discovery methods. It classifies car specifications according to PRICE, COMFORT, and SAFETY.
The data used in this project can be accessed at ftp://ftp.ics.uci.edu/pub/machine-learning-databases/car. The Car Evaluation Database was derived from a simple hierarchical decision model originally developed for the demonstration of DEX (M. Bohanec, V. Rajkovic: Expert system for decision making. Sistemica 1(1), pp. 145-157, 1990).
The goal of this report is to model this decision making: to identify how a car's overall value can be recognized once certain attributes are met, and to apply this pattern to the whole data set as a classifier that separates acceptable cars from unacceptable ones.
1. Introduction
Understanding how to make a decision when acquiring an automobile is essential for everyone, especially first-time buyers or anyone inexperienced in how the automobile industry works. Usually we need a car as a means of transportation, but once we add fun into the choice we tend to forget the factors that should not be underestimated. Overlooking them can lead to abandoning our source of transportation and going back to commuting, which is not a bad option in itself, but it adds hassle and is less comfortable than having your own car available the moment you need it.
Telling a good car from a decent or a bad one is usually done manually, with the help of a friendly mechanic who tells us what to buy and why, or from the opinions of family and friends who have had car troubles before. It would be nice to have a gadget that could scan a car's features and report whether it is an A car or a D car; with such a device there would be little to worry about when choosing a particular car. At present it is often the car salesman's personality that persuades us whether or not to buy a car. Whether we realize it consciously or not, we are simply ignoring the factors that would serve us financially, comfortably, and safely in the long run.
This report is divided into several parts: an introduction to the data and information, the selection of classifiers that best fit the data or the application, and an attempt at other approaches that could be beneficial for classifying the samples, such as neural network design. The classifiers used are the minimum Euclidean distance classifier, the Bayes classifier, Parzen density estimation (R-method), and k-NN (R-method). An application of neural networks was intended but was not completed.
2. Background
The hierarchical decision model, from which this dataset is derived, was
first presented in M. Bohanec and V. Rajkovic: Knowledge acquisition and
explanation for multi-attribute decision making. In 8th Intl Workshop on
Expert Systems and their Applications, Avignon, France. Pages 59-78, 1988.
Within machine-learning, this dataset was used for the evaluation of HINT
(Hierarchy INduction Tool), which was proved to be able to completely
reconstruct the original hierarchical model. This, together with a comparison
with C4.5, is presented in B. Zupan, M. Bohanec, I. Bratko, J. Demsar:
Machine learning by function decomposition. ICML-97, Nashville, TN. 1997
(to appear).
The model evaluates cars according to the following concept structure:
CAR                  car acceptability
. PRICE              overall price
. . buying           buying price
. . maint            price of the maintenance
. TECH               technical characteristics
. . COMFORT          comfort
. . . doors          number of doors
. . . persons        capacity in terms of persons to carry
. . . lug_boot       the size of luggage boot
. . safety           estimated safety of the car
Input attributes are printed in lowercase. Besides the target concept (CAR), the model includes three intermediate concepts: PRICE, TECH, and COMFORT. In the original model, every concept is related to its lower-level descendants by a set of examples (for these example sets see http://www-ai.ijs.si/BlazZupan/car.html).
The Car Evaluation Database contains examples with the structural
information removed, i.e., directly relates CAR to the six input attributes:
buying, maint, doors, persons, lug_boot, safety.
The data set has the following characteristics:
Number of Instances: 1728
Number of Attributes: 6

Attribute Values:
  buying     v-high, high, med, low
  maint      v-high, high, med, low
  doors      2, 3, 4, 5-more
  persons    2, 4, more
  lug_boot   small, med, big
  safety     low, med, high
Missing Attribute Values: none
Class Distribution (number of instances per class):

  Class     N (# samples)   N [%]
  Unacc     1210            70.023%
  Acc       384             22.222%
  Good      69              3.9930%
  V-good    65              3.7620%
3. Classifiers
A. Minimum Euclidean Distance
(X - \mu_1)^T (X - \mu_1) \;\overset{\omega_1}{\underset{\omega_2}{\lessgtr}}\; (X - \mu_2)^T (X - \mu_2)
The minimum Euclidean distance (MED) classifier, with the decision rule shown above, is probably the simplest classifier applied in this project. It assigns each sample to the class whose mean is closest to it: the mean of each class is estimated from the training data, and each test sample is compared against these means. In the six-dimensional feature space each sample is simply taken in vector form, so the X in the decision rule above is a vector of the form [a1 a2 a3 a4 a5 a6], an individual sample containing the six attributes. In this problem, with four classes in a six-dimensional space, the decision boundary between any two classes is a hyperplane whose points are equidistant from the two class means. One concern was the unequal number of samples per class: whenever I compared the training data with the test data I had to adjust the size of the training set to match the size of the test set. In the end I simply took 30 samples from each class, estimated each class mean, and compared the test samples against these means. This is probably why I obtained fairly good accuracy with this classifier. The MED results are shown in the table below.
  Med_error1   Med_error2   Med_error3
  7.8%         4.35%        4.62%
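As a rough illustration of the rule above, a minimal minimum-Euclidean-distance classifier can be written as follows (Python/NumPy). The function names are mine and the code assumes the numeric encoding sketched in the Background section; it is not the report's original MATLAB implementation.

import numpy as np

def med_train(X_train, y_train):
    """Estimate one mean vector per class."""
    classes = np.unique(y_train)
    means = np.array([X_train[y_train == c].mean(axis=0) for c in classes])
    return classes, means

def med_classify(X_test, classes, means):
    """Assign each test sample to the class whose mean is nearest."""
    # Squared Euclidean distance from every sample to every class mean.
    d2 = ((X_test[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
    return classes[np.argmin(d2, axis=1)]

# Example use: error rate on a held-out test set.
# classes, means = med_train(X_train, y_train)
# error = np.mean(med_classify(X_test, classes, means) != y_test)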
B. Bayes Classifier
(X - \mu_1)^T \Sigma_1^{-1} (X - \mu_1) - (X - \mu_2)^T \Sigma_2^{-1} (X - \mu_2) + \ln\frac{|\Sigma_1|}{|\Sigma_2|} - 2\ln\frac{P_1}{P_2} \;\overset{\omega_1}{\underset{\omega_2}{\lessgtr}}\; 0
The Bayes classifier is the optimal parametric classifier when the class-conditional densities and priors are known, and in principle it improves on the minimum Euclidean distance classifier. It is designed from the covariance matrix, the mean, and the prior of each class; if these are given and the data follow a multivariate Gaussian distribution, the classifier works in an optimal sense. In my case I had to estimate the covariance matrix and solve for the mean of each class. I assumed the classes had equal size, taking a finite number of samples from each class, and then solved for the decision boundary. This process is not precise: as with the previous classifier, handling the six-dimensional data was awkward, and reducing the samples to a fixed length helped a lot and also decreased the time spent on the classification. But as I decreased the number of training samples, much of the information was thrown away, which probably led to some inaccuracies on my part. I am sure there is a better way of handling this simple case that does not require reducing the data.
Bayes classification error:

  Bayes_error1   Bayes_error2   Bayes_error3
  1.82%          17.39%         9.23%
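A Gaussian quadratic classifier corresponding to the rule above could be sketched as follows (Python/NumPy, again only an illustration of the approach rather than the original implementation). The small ridge term added to each estimated covariance is my own assumption, used only to keep the matrices invertible.

import numpy as np

def bayes_train(X_train, y_train, ridge=1e-3):
    """Estimate mean, covariance, and prior for each class."""
    params = {}
    for c in np.unique(y_train):
        Xc = X_train[y_train == c]
        cov = np.cov(Xc, rowvar=False) + ridge * np.eye(X_train.shape[1])
        params[c] = (Xc.mean(axis=0), cov, len(Xc) / len(X_train))
    return params

def bayes_classify(X_test, params):
    """Pick the class with the largest Gaussian log posterior."""
    scores = []
    for c, (mu, cov, prior) in params.items():
        diff = X_test - mu
        inv = np.linalg.inv(cov)
        maha = np.einsum("ij,jk,ik->i", diff, inv, diff)   # (x-mu)^T Sigma^-1 (x-mu)
        logdet = np.linalg.slogdet(cov)[1]
        scores.append(-0.5 * (maha + logdet) + np.log(prior))
    classes = np.array(list(params.keys()))
    return classes[np.argmax(np.stack(scores, axis=1), axis=1)]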
C. Parzen Density Estimation
In this classifier a window function is placed on every training sample to obtain a density estimate for each class. The width h of the window function can be varied to obtain different Parzen densities; in my case I solved for the optimal value of h for a multivariate distribution. The Parzen classifier assigns to a test pattern the class whose density, computed from the training samples, is highest at that pattern. Although this classifier can achieve good accuracy, it is very time consuming and requires more storage, because it has to keep all the training samples in order to compute the density estimate of each class for a given test pattern. Here, instead of expressing the window width as a vector, I simply used a single scalar width per class, the mean of the per-dimension widths. The error was calculated by counting the samples that did not fall within the window of their own class. The window widths and error results are shown below.
Optimal window widths used for each class pair:

  h_x12/h_x21   h_x13/h_x31   h_x14/h_x41
  0.6118        0.7312        0.7336
  0.7991        1.2768        1.3587

Parzen density error estimation:

  Parzen_error1   Parzen_error2   Parzen_error3
  1.56%           8.7%            9.23%
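The Parzen approach described above could be sketched as follows, using a Gaussian window and a single scalar width h per class, mirroring the simplification in the text. The Gaussian kernel and the function names are illustrative assumptions, not the report's MATLAB code.

import numpy as np

def parzen_log_density(x, X_class, h):
    """Gaussian-window Parzen estimate of log p(x | class) with scalar width h."""
    d = X_class.shape[1]
    dist2 = ((X_class - x) ** 2).sum(axis=1)       # squared distances to the training samples
    kernel = np.exp(-dist2 / (2.0 * h ** 2))        # Gaussian window centred on each training sample
    norm = (2.0 * np.pi) ** (d / 2.0) * h ** d
    return np.log(kernel.sum() / (len(X_class) * norm) + 1e-300)

def parzen_classify(X_test, X_train, y_train, widths):
    """Assign each test sample to the class with the highest Parzen density.
    `widths` maps class label -> window width h."""
    classes = np.unique(y_train)
    preds = []
    for x in X_test:
        scores = [parzen_log_density(x, X_train[y_train == c], widths[c]) for c in classes]
        preds.append(classes[int(np.argmax(scores))])
    return np.array(preds)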
D. K-NN Rule
k-NN is a classifier that simply finds the classes of the k nearest neighbors of a test pattern (based on a distance metric, here the Euclidean distance between samples), determines which class is in the majority, and assigns that class to the test pattern. I started with a few samples to get the implementation of the k-NN rule working, comparing ten samples for each set or class, and then expanded it; but as the volume of samples grew, the program became complicated and slow to run. I wrote a loop to handle every sample, which improved matters, and the results are shown below.
  K1NN_error1   K1NN_error2   K1NN_error3   Overall
  22.2%         4.0%          3.76%         9.98%
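The k-NN rule itself is short to write down. The sketch below uses the Euclidean metric and a majority vote among the k nearest training samples, looping over the test samples much as described above; it is illustrative code rather than the original program.

import numpy as np

def knn_classify(X_test, X_train, y_train, k=1):
    """Classify each test sample by majority vote among its k nearest neighbours."""
    preds = []
    for x in X_test:
        dist2 = ((X_train - x) ** 2).sum(axis=1)    # squared Euclidean distances to all training samples
        nearest = np.argsort(dist2)[:k]             # indices of the k closest training samples
        labels, counts = np.unique(y_train[nearest], return_counts=True)
        preds.append(labels[np.argmax(counts)])     # majority class (ties go to the smaller label)
    return np.array(preds)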
E. Neural Networks
Neural Network design that I started working on is the “feedforward
backpropagation”. One of the simplest and most general methods for supervised
training of multilayer neural networks. This design deals with setting the
weigths in accordance with training patterns and the desired output. The power
of backpropagation is that it allows us to calculate the effective error for each
hidden unit, and thus derive a learning ruel for the input-to-hidden weights. This
method has two modes of operation: Feedforward and Learning. Feedforward
consists of presenting a pattern to the input units and passing the signals through
the network in order to yield outputs from the output units. Supervised learning
consists of presenting an input pattern and changing the network parameters to
bring the actual outputs closer to the desired teaching or target values.
The MATLAB neural network toolbox is very helpful for understanding how this design works. I would like to try out and understand more of these neural network designs; I also want to implement the perceptron and the linear layer, and will probably do that when I get the chance.
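Although the neural network part was not completed, the feedforward and backpropagation steps described above can be sketched in a few lines. The report intended to use MATLAB's neural network toolbox, so this Python/NumPy stand-in, along with its network size, learning rate, and one-hot targets, is purely an illustrative assumption.

import numpy as np

def train_mlp(X, y, n_hidden=8, n_classes=4, lr=0.1, epochs=200, seed=0):
    """One-hidden-layer network trained with plain backpropagation
    (sigmoid units, squared-error loss, batch gradient descent, no bias terms for brevity)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    T = np.eye(n_classes)[y]                        # one-hot target vectors
    W1 = rng.normal(scale=0.1, size=(d, n_hidden))
    W2 = rng.normal(scale=0.1, size=(n_hidden, n_classes))
    sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
    for _ in range(epochs):
        # Feedforward pass: input -> hidden -> output.
        H = sigmoid(X @ W1)
        O = sigmoid(H @ W2)
        # Backpropagation: output error, then the effective error at the hidden units.
        dO = (O - T) * O * (1 - O)
        dH = (dO @ W2.T) * H * (1 - H)
        W2 -= lr * H.T @ dO / n
        W1 -= lr * X.T @ dH / n
    return W1, W2

def mlp_classify(X, W1, W2):
    """Feedforward pass only; pick the output unit with the largest activation."""
    sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
    return np.argmax(sigmoid(sigmoid(X @ W1) @ W2), axis=1)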
4. Summary
Different classifiers behave differently depending on the data being worked on. Simpler ones such as the minimum Euclidean distance and Bayes classifiers could be implemented directly in my design, while others such as Parzen and k-NN offered another way to achieve the goal of classifying the data into its classes. One of the first problems I encountered was converting my information into something I could work with. Unlike previous projects in the class, this one involves four classes, so a multi-class classifier is needed. A combination of classifiers in a cascading manner, in the same spirit as a tree classifier, would probably have worked well, but I was not able to carry out that design. Another problem was the uneven distribution of the classes: every time I compared one class with another, or tested against a data set, I lost information because I had to omit some of the remaining samples to obtain an equal distribution. Lastly, I think the neural network could have worked better as a classifier; I just was not able to grasp the idea behind it in time.

This project was a learning experience and fun, although time consuming. I would like to continue and pursue its goal, and to clarify and understand better the parts of the design process that I was not able to complete. One thing this project did was greatly improve my knowledge of programming.