IS53024A: Artificial Intelligence
Generalization as Search: Version Space Learning
1. Tree-based Generalization
The candidate elimination algorithm for incremental concept
learning operates on example and hypotheses spaces defined
using generalization hierarchies (lattices).
This approach has several advantages:
- the version space can be defined in terms of simple ordered
sets of nodes;
- the generality ordering can be traversed using classical
tree-traversal algorithms;
- the specific boundary is guaranteed to be a singleton set.
Overall, this approach offers a practical way of exploring the
knowledge provided by the training examples.
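For illustration, a tree-structured hierarchy of this kind can be represented very simply. The following is a minimal Python sketch, assuming nodes are plain strings and each node records its parent (the colour tree and its node names are hypothetical); it tests generality by walking the ancestor chain:

parent = {
    "red": "any-colour",   # hypothetical two-level colour tree
    "blue": "any-colour",
}

def ancestors(node):
    # the node itself plus all of its ancestors, most specific first
    chain = [node]
    while chain[-1] in parent:
        chain.append(parent[chain[-1]])
    return chain

def more_general_or_equal(a, b):
    # in a tree hierarchy, a generalizes b exactly when a is an ancestor of b
    return a in ancestors(b)

print(more_general_or_equal("any-colour", "red"))   # True
print(more_general_or_equal("red", "blue"))         # False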
The concept learning problem is essentially a search problem.
Concept learning by candidate elimination modifies the
boundary sets of hypotheses after the arrival of each new example;
the modification of one particular hypothesis corresponds to one
step in the generalization hierarchy representing the version space.
Moving over the version space is typically carried out by
depth-first or breadth-first search, although other techniques
are also applicable.
Since the candidate elimination algorithm updates simultaneously
both the general and specific boundary sets it actually performs
bi-directional search through the version space of hypotheses.
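The bi-directional update loop can be outlined as follows. This is only a skeleton, after Mitchell (1982); the arguments generalize, specialize and covers stand for domain-specific routines and are assumptions of this sketch, not a fixed API:

def candidate_elimination(examples, G, S, generalize, specialize, covers):
    # generalize(s, x, G): minimal generalizations of s that cover x;
    # specialize(g, x, S): minimal specializations of g that exclude x
    for x, label in examples:
        if label == "+":
            G = [g for g in G if covers(g, x)]                # prune G
            S = [h for s in S for h in
                 ([s] if covers(s, x) else generalize(s, x, G))]
        else:
            S = [s for s in S if not covers(s, x)]            # prune S
            G = [h for g in G for h in
                 ([g] if not covers(g, x) else specialize(g, x, S))]
        # (a full implementation also removes non-maximal members of G
        #  and non-minimal members of S after each update)
    return G, S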
2. Generalization as Search
2.1. Specific-to-General Depth-first Search
Consider the following positive and negative training examples:
1. (large red triangle)(small blue circle) +
2. (large blue circle)(small red triangle) +
3. (large blue triangle)(small blue triangle) -
After the first example the algorithm is initialized with
the following hypothesis (conjunctive concept description):
H = [((large red triangle)(small blue circle))]
After the arrival of the second example the hypothesis is
generalized so that it can match this new example:
H = [((large ? ?)(small ? ?))]
After the arrival of the third example the hypothesis should be
specialized so as to exclude this negative example:
H = [((? red triangle)(? blue circle))]
The problem with using only depth-first search is that it may need
backtracking, in order to reconsider previous examples, whenever
the current hypothesis must be altered.
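The competing one-step generalizations can be made concrete with a short Python sketch. It assumes, as the examples above suggest, that the two objects of an example may be matched against the two parts of the hypothesis in either order; depth-first search commits to just one of the branches printed below and must backtrack if that branch later fails:

from itertools import permutations

def pat_match(pattern, obj):
    # a pattern matches an object if each attribute is '?' or equal
    return all(p in ("?", o) for p, o in zip(pattern, obj))

def hyp_match(hyp, example):
    # the two patterns must match the two objects in some order
    return any(all(pat_match(p, o) for p, o in zip(hyp, objs))
               for objs in permutations(example))

def generalizations(hyp, example):
    # one minimal generalization per pairing of patterns with objects:
    # keep the attribute values that agree, wildcard the ones that differ
    return {tuple(tuple(p if p == o else "?" for p, o in zip(pat, obj))
                  for pat, obj in zip(hyp, objs))
            for objs in permutations(example)}

H = (("large", "red", "triangle"), ("small", "blue", "circle"))  # example 1
for g in generalizations(H, (("large", "blue", "circle"),
                             ("small", "red", "triangle"))):     # example 2
    print(g)  # the two branches: ((large ? ?)(small ? ?))
              # and ((? red triangle)(? blue circle))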
2.2. Specific-to-General Breadth-first Search
Adopting breadth-first search allows us to maintain several
alternative hypotheses while searching the hypotheses space.
Starting with the most specific hypotheses, the search follows
the branches of the partial ordering, so that progressively more
general hypotheses are considered each time the current set
must be modified.
After the first example the algorithm is initialized as follows:
S = [((large red triangle)(small blue circle))]
In response to the second example the current hypotheses are
generalized along each branch of the partial ordering:
S = [((large ? ?)(small ? ?))
((? red triangle)(? blue circle))]
When the third, negative example arrives, the hypotheses that
match it are removed from the set of current hypotheses:
S = [((? red triangle)(? blue circle))]
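This breadth-first maintenance of the hypothesis set can be sketched directly, reusing the pat_match, hyp_match and generalizations helpers from the sketch in section 2.1:

def update_S(S, example, positive):
    if positive:
        # generalize every hypothesis that fails to cover the positive example
        return {g for h in S
                for g in ([h] if hyp_match(h, example)
                          else generalizations(h, example))}
    # negative example: discard the hypotheses that match it
    return {h for h in S if not hyp_match(h, example)}

S = {(("large", "red", "triangle"), ("small", "blue", "circle"))}
S = update_S(S, (("large", "blue", "circle"),
                 ("small", "red", "triangle")), positive=True)
S = update_S(S, (("large", "blue", "triangle"),
                 ("small", "blue", "triangle")), positive=False)
print(S)  # the surviving hypothesis ((? red triangle)(? blue circle)),
          # as in the text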
2.3. Bi-directional Search
The version space strategy may be considered as an extension of
the breadth-first search approach into bi-directional search:
it conducts a complementary general-to-specific breadth-first
search (in addition to the specific-to-general search described
above).
The algorithm is initialized with two boundaries as follows:
G = [((? ? ?)(? ? ?))]
S = [((large red triangle)(small blue circle))]
Processing the second example yields:
G = [((? ? ?)(? ? ?))]
S = [((large ? ?)(small ? ?)) ((? red triangle)(? blue circle))]
After the third negative example these boundaries are updated as follows:
G = [((? red ?)(? ? ?)) ((? ? circle)(? ? ?))]
S = [((? red triangle)(? blue circle))]
Note: the version space contains the members of G and S as well
as all generalizations that lie between these two sets in the partially
ordered hypotheses space.
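This note can be turned into an executable membership test. The following sketch keeps the '?'-pattern representation used above, with hyp_geq treating the object pairs as unordered; the intermediate hypothesis h is an illustrative choice:

from itertools import permutations

def pat_geq(p, q):
    # pattern p is more general than or equal to pattern q
    return all(a == "?" or a == b for a, b in zip(p, q))

def hyp_geq(h1, h2):
    # h1 >= h2 if its patterns dominate h2's patterns in some pairing
    return any(all(pat_geq(p, q) for p, q in zip(h1, perm))
               for perm in permutations(h2))

def in_version_space(h, G, S):
    # h must lie below some member of G and above some member of S
    return any(hyp_geq(g, h) for g in G) and any(hyp_geq(h, s) for s in S)

G = [(("?", "red", "?"), ("?", "?", "?")),
     (("?", "?", "circle"), ("?", "?", "?"))]
S = [(("?", "red", "triangle"), ("?", "blue", "circle"))]
h = (("?", "red", "?"), ("?", "blue", "circle"))  # an intermediate hypothesis
print(in_version_space(h, G, S))                  # True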
Example: Consider an unknown concept in the vehicles domain,
represented by two attributes whose generalization hierarchies are
predefined as follows:
size
    small
        micro   tiny   med
    large
        big   vast   huge

transport
    vehicle
        bike   moped   car
    plane
        prop   jet   glider
Learn this concept with the candidate elimination algorithm using the
following positive and negative training examples:
1. (( tiny moped ) +)
2. (( tiny car ) +)
3. (( big jet ) -)
4. (( med car ) +)
5. (( med jet ) -)
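Before tracing the algorithm, the two attribute trees and the matching test can be encoded in Python. This is a minimal sketch; the parent table is a direct transcription of the hierarchies above:

parent = {
    "micro": "small", "tiny": "small", "med": "small",
    "big": "large", "vast": "large", "huge": "large",
    "small": "size", "large": "size",
    "bike": "vehicle", "moped": "vehicle", "car": "vehicle",
    "prop": "plane", "jet": "plane", "glider": "plane",
    "vehicle": "transport", "plane": "transport",
}

def geq(a, b):
    # node a is more general than or equal to node b (a is an ancestor of b)
    while b != a and b in parent:
        b = parent[b]
    return a == b

def covers(hyp, instance):
    # a (size, transport) hypothesis covers an instance attribute by attribute
    return all(geq(h, x) for h, x in zip(hyp, instance))

print(covers(("small", "vehicle"), ("tiny", "moped")))   # True
print(covers(("small", "vehicle"), ("big", "jet")))      # False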
The steps performed by the candidate elimination algorithm are:
Training with the first example: (( tiny moped ) +) initializing…
G = [( size transport )]
S = [( tiny moped )]
[Diagram: the size/transport instance grid with a + mark at ( tiny moped )]
Training with the second example: (( tiny car ) +) generalizing…
G = [( size transport )]
S = [( tiny vehicle )]
[Diagram: instance grid with + marks at ( tiny moped ) and ( tiny car )]
Training with the third example: (( big jet ) -) specializing…
G = [( small transport )( size vehicle )]
S = [( tiny vehicle )]
[Diagram: instance grid with the G-set and S-set boundaries, + marks at
( tiny moped ) and ( tiny car ), and a - mark at ( big jet )]
Training with the fourth example: (( med car ) +) generalizing…
G = [( small transport )( size vehicle )]
S = [( small vehicle )]
(Both members of G still cover the new positive example, so G is
unchanged; S is minimally generalized from ( tiny vehicle ) to
( small vehicle ).)
[Diagram: instance grid with the G-set and S-set boundaries, + marks at
( tiny moped ), ( tiny car ) and ( med car ), and a - mark at ( big jet )]
Training with the fifth example: (( med jet ) -) specializing…
G = [( size vehicle )]
S = [( small vehicle )]
(( small transport ) matches the negative example; its only admissible
specialization is ( small vehicle ), which is then dropped from G as more
specific than ( size vehicle ). ( size vehicle ) itself does not match
( med jet ) and survives unchanged.)
[Diagram: instance grid with the final G-set and S-set boundaries, + marks
at the three positive examples and - marks at ( big jet ) and ( med jet )]
The version space has now narrowed to the hypotheses between the two
boundaries, namely ( small vehicle ) and ( size vehicle ). One further
negative example, such as (( big car ) -), would eliminate ( size vehicle )
and yield convergence: the learned concept must be ( small vehicle ).
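The whole trace can be reproduced with a compact implementation of the algorithm for this tree-structured domain. This is a sketch that repeats the parent table from the earlier snippet so the block runs on its own:

parent = {
    "micro": "small", "tiny": "small", "med": "small",
    "big": "large", "vast": "large", "huge": "large",
    "small": "size", "large": "size",
    "bike": "vehicle", "moped": "vehicle", "car": "vehicle",
    "prop": "plane", "jet": "plane", "glider": "plane",
    "vehicle": "transport", "plane": "transport",
}
children = {}
for c, p in parent.items():
    children.setdefault(p, []).append(c)

def geq(a, b):                 # a is an ancestor of b, or equal to it
    while b != a and b in parent:
        b = parent[b]
    return a == b

def h_geq(g1, g2):             # hypothesis-level generality test
    return all(geq(a, b) for a, b in zip(g1, g2))

def lub(a, b):                 # least upper bound of two nodes in a tree
    while not geq(a, b):
        a = parent[a]
    return a

def covers(h, x):
    return all(geq(hi, xi) for hi, xi in zip(h, x))

def specializations(g, x, S):
    # minimal specializations of g that exclude x yet still cover some s in S
    out = []
    for i, node in enumerate(g):
        for c in children.get(node, []):
            h = g[:i] + (c,) + g[i + 1:]
            if not covers(h, x) and any(covers(h, s) for s in S):
                out.append(h)
    return out

def candidate_elimination(examples):
    (first, _), rest = examples[0], examples[1:]
    S, G = [first], [("size", "transport")]
    for x, label in rest:
        if label == "+":
            G = [g for g in G if covers(g, x)]
            S = [tuple(lub(si, xi) for si, xi in zip(s, x)) for s in S]
        else:
            S = [s for s in S if not covers(s, x)]
            G = [h for g in G for h in
                 ([g] if not covers(g, x) else specializations(g, x, S))]
            G = [g for g in G                 # keep only maximal members
                 if not any(g2 != g and h_geq(g2, g) for g2 in G)]
        print(label, x, "G =", G, "S =", S)
    return G, S

candidate_elimination([(("tiny", "moped"), "+"), (("tiny", "car"), "+"),
                       (("big", "jet"), "-"), (("med", "car"), "+"),
                       (("med", "jet"), "-")])

Running this prints the boundary sets after each example and ends with G = [('size', 'vehicle')] and S = [('small', 'vehicle')], matching the trace above.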
3. Features of Version Space Learning
The version space strategy features the following characteristics:
- Version space and the candidate elimination algorithm provide
a framework for studying concept learning. However, this learning
algorithm is not robust to noisy data, or to situations in which the
unknown target concept cannot be expressed in terms of the
provided hypotheses space;
- The boundary sets provide the learner with a description of its uncertainty
regarding the exact identity of the target concept. The version space of
alternative hypotheses can be examined to determine whether the learner
has converged to the target concept, to determine whether the training data
are inconsistent, and to generate queries that further refine the version space;
- Inductive learning algorithms are able to classify unseen examples
only because of their implicit inductive bias for selecting one consistent
hypothesis over another. The bias associated with the candidate
elimination algorithm is that the target concept can be found in the
provided hypotheses space;
- If the hypotheses space is enriched to the point where there is a
hypothesis corresponding to every possible subset of training examples
this will remove any inductive bias from the candidate elimination
algorithm. Unfortunately this will also remove the ability to classify
any examples beyond the observed training instances;
- A disjunctive version space can be considered as a union of classical
version spaces, each of which constitutes a fragment.
Disjunctive concept learning can be performed by manipulating
the boundary sets of the fragments, as sketched below.
The boundary sets of the disjunctive version space comprise the
corresponding boundary sets of the classical version spaces it contains.
One fragment is represented by one maximally general consistent boundary
and the closest maximally specific boundary in the partial ordering of the
concept description language.
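As a rough sketch of this last idea (the Fragment class and classify function are illustrative names, not a standard API), a disjunctive version space can be kept as a list of boundary pairs, and an instance judged positive when at least one fragment certainly covers it:

class Fragment:
    # the boundary sets of one classical version subspace
    def __init__(self, G, S):
        self.G, self.S = G, S

def classify(fragments, x, covers):
    # positive if every member of some fragment's S boundary covers x
    # (every hypothesis in that fragment is then guaranteed to cover x,
    #  since each of them is more general than some member of S)
    return any(all(covers(s, x) for s in f.S) for f in fragments)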
References:
Mitchell, T.M. (1982). “Generalization as Search”, Artificial Intelligence, vol. 18, no. 2, pp. 203-226.
Mitchell, T.M. (1997). “Machine Learning”, The McGraw-Hill Companies, Inc.
Thornton, C.J. (1992). “Techniques in Computational Learning”, Chapman and Hall, London.