IS53024A: Artificial Intelligence

Generalization as Search: Version Space Learning

1. Tree-based Generalization

The candidate elimination algorithm for incremental concept learning operates on example and hypothesis spaces defined using generalization hierarchies (lattices). This approach has several advantages:

- the version space can be defined in terms of simple ordered sets of nodes;
- the generality ordering can be traversed using classical tree-traversal algorithms;
- the general and specific boundaries are guaranteed to be singleton sets.

Overall, this approach offers a plausible way of exploring the knowledge provided by the training examples.

The concept learning problem is essentially a search problem. Concept learning by candidate elimination modifies the boundary sets of hypotheses after the arrival of each new example; the modification of one particular hypothesis corresponds to one step in the generalization hierarchy representing the version space. Moving over the version space is typically carried out by depth-first or breadth-first search, although other techniques are also applicable. Since the candidate elimination algorithm updates both the general and the specific boundary sets simultaneously, it actually performs bi-directional search through the version space of hypotheses.

2. Generalization as Search

2.1. Specific-to-General Depth-first Search

Consider the following positive and negative training examples, where each example is a pair of objects described by size, colour and shape:

1. (large red triangle)(small blue circle)      +
2. (large blue circle)(small red triangle)      +
3. (large blue triangle)(small blue triangle)   -

After the first example the algorithm is initialized with the following hypothesis (a conjunctive concept description):

H = [((large red triangle)(small blue circle))]

After the arrival of the second example the hypothesis is generalized so that it can match this new example:

H = [((large ? ?)(small ? ?))]

After the arrival of the third example the hypothesis should be specialized so as to exclude this negative example:

H = [((? red triangle)(? blue circle))]

The problem with using only depth-first search is that it may need backtracking in order to reconsider previous examples when alterations of the current hypothesis are necessary.

2.2. Specific-to-General Breadth-first Search

Adopting breadth-first search allows us to maintain several alternative hypotheses while searching the hypothesis space. Starting with the most specific generalizations, the search is organized to follow the branches of the partial ordering, so that progressively more general descriptions are considered each time the current set must be modified. After the first example the algorithm is initialized as follows:

S = [((large red triangle)(small blue circle))]

In response to the second example the current hypotheses are generalized along each branch of the partial ordering:

S = [((large ? ?)(small ? ?)) ((? red triangle)(? blue circle))]

When the third, negative example arrives, the hypotheses that match it are removed from the set of current hypotheses:

S = [((? red triangle)(? blue circle))]

2.3. Bi-directional Search

The version space strategy may be considered an extension of the breadth-first search approach into bi-directional search: it conducts a complementary general-to-specific breadth-first search in addition to the specific-to-general one. The algorithm is initialized with two boundaries as follows:

G = [((? ? ?)(? ? ?))]
S = [((large red triangle)(small blue circle))]

Processing the second example yields:

G = [((? ? ?)(? ? ?))]
S = [((large ? ?)(small ? ?)) ((? red triangle)(? blue circle))]

After the third, negative example these boundaries are updated as follows:

G = [((? red ?)(? ? ?)) ((? ? ?)(? ? circle))]
S = [((? red triangle)(? blue circle))]

Note: the version space contains the members of G and S, as well as all generalizations that lie between these two sets in the partially ordered hypothesis space.

Example: Consider an unknown concept in the vehicles domain, represented by two attributes whose generalization hierarchies are predefined as follows:

size
  small: micro, tiny, med
  large: big, vast, huge

transport
  vehicle: bike, moped, car
  plane: prop, jet, glider

Learn this concept with the candidate elimination algorithm using the following positive and negative training examples:

1. (tiny moped)   +
2. (tiny car)     +
3. (big jet)      -
4. (med car)      +
5. (med jet)      -

The steps performed by the candidate elimination algorithm are:

Training with the first example, (tiny moped) +, initializing:

G = [(size transport)]
S = [(tiny moped)]

Training with the second example, (tiny car) +, generalizing:

G = [(size transport)]
S = [(tiny vehicle)]

Training with the third example, (big jet) -, specializing:

G = [(small transport) (size vehicle)]
S = [(tiny vehicle)]

Training with the fourth example, (med car) +, generalizing:

G = [(small transport)]
S = [(small vehicle)]

Training with the fifth example, (med jet) -, specializing:

G = [(small vehicle)]
S = [(small vehicle)]

The two boundaries have converged, so the learned concept must be: (small vehicle)

3. Features of Version Space Learning

The version space strategy has the following characteristics:

- Version space and the candidate elimination algorithm provide a framework for studying concept learning.
However, this learning algorithm is not robust to noisy data, or to situations in which the unknown target concept cannot be expressed in terms of the provided hypothesis space;

- The boundary sets provide the learner with a description of its uncertainty regarding the exact identity of the target concept. The version space of alternative hypotheses can be examined to determine whether the learner has converged to the target concept, to determine whether the training data are inconsistent, and to generate queries that further refine the version space;

- Inductive learning algorithms are able to classify unseen examples only because of their implicit inductive bias for selecting one consistent hypothesis over another. The bias associated with the candidate elimination algorithm is the assumption that the target concept can be found in the provided hypothesis space;

- If the hypothesis space is enriched to the point where there is a hypothesis corresponding to every possible subset of the training examples, all inductive bias is removed from the candidate elimination algorithm. Unfortunately, this also removes the ability to classify any examples beyond the observed training instances;

- Every classical version space can be considered a set of fragments, so that the union of classical version subspaces is a disjunctive version space. Disjunctive concept learning can be performed by manipulating the boundary sets of the fragments, which include the corresponding boundary sets of the classical version spaces within the disjunctive version space. One fragment is represented by one maximally general consistent boundary and the closest maximally specific boundary in the partial ordering of the concept description language.

References:

Mitchell, T.M. (1982). “Generalization as Search”, Artificial Intelligence, vol. 18, no. 2, pp. 203-226.
Mitchell, T.M. (1997). “Machine Learning”, The McGraw-Hill Companies, Inc.
Thornton, C.J. (1992). “Techniques in Computational Learning”, Chapman and Hall, London.
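Appendix A. The tree-based generalization steps in the vehicles example (generalizing moped and car to vehicle, or tiny and med to small) amount to walking ancestor chains in the two attribute hierarchies. The following Python sketch encodes those hierarchies as child-to-parent links; the helper names (PARENT, ancestors, covers, least_generalization) are illustrative choices, not part of the original notes.

```python
# Child -> parent links encoding the two generalization hierarchies of the
# vehicles example (attribute roots: "size" and "transport").
PARENT = {
    "micro": "small", "tiny": "small", "med": "small",
    "big": "large", "vast": "large", "huge": "large",
    "small": "size", "large": "size",
    "bike": "vehicle", "moped": "vehicle", "car": "vehicle",
    "prop": "plane", "jet": "plane", "glider": "plane",
    "vehicle": "transport", "plane": "transport",
}


def ancestors(node):
    """The chain from a node up to the root of its hierarchy, inclusive."""
    chain = [node]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def covers(general, specific):
    """True when `general` equals `specific` or is one of its ancestors,
    i.e. `general` matches everything `specific` matches."""
    return general in ancestors(specific)


def least_generalization(a, b):
    """The lowest node covering both a and b (their least common ancestor):
    the minimal generalization applied when the S boundary is generalized."""
    up = set(ancestors(a))
    return next(n for n in ancestors(b) if n in up)
```

Here covers implements the generality ordering used when the boundary sets are pruned, and least_generalization performs the minimal generalization of the trace's steps 2 and 4 (moped, car -> vehicle; tiny, med -> small).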
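Appendix B. To make the bi-directional search of Section 2.3 concrete, the following Python sketch implements a candidate elimination loop for the shapes domain. It assumes, as the two-element S set in Section 2.2 indicates, that the two objects of a training instance may be paired with the two object patterns of a hypothesis in either order; the attribute domains are taken from the values appearing in the examples, and all function and variable names (candidate_elimination, generalize, specialize, WILD, and so on) are illustrative choices rather than part of the original algorithm description.

```python
WILD = "?"


def obj_matches(pattern, obj):
    """An object pattern matches an object when every attribute is equal or '?'."""
    return all(p == WILD or p == v for p, v in zip(pattern, obj))


def matches(hyp, example):
    """A hypothesis (a pair of object patterns) matches an example (a pair of
    objects) if the patterns can be assigned to the objects in either order."""
    (a, b), (x, y) = hyp, example
    return (obj_matches(a, x) and obj_matches(b, y)) or \
           (obj_matches(a, y) and obj_matches(b, x))


def obj_ge(p, q):
    """Pattern p is equal to, or more general than, pattern q attribute-wise."""
    return all(pi == WILD or pi == qi for pi, qi in zip(p, q))


def more_general(h, g):
    """h >= g in the generality ordering, allowing either pairing of patterns."""
    (a, b), (x, y) = h, g
    return (obj_ge(a, x) and obj_ge(b, y)) or (obj_ge(a, y) and obj_ge(b, x))


def canon(a, b):
    """Canonical, hashable form of an unordered pair of object patterns."""
    return tuple(sorted((a, b)))


def generalize(hyp, example):
    """Minimal generalizations of hyp matching a positive example: one
    candidate per way of pairing the patterns with the objects."""
    (a, b), (x, y) = hyp, example
    out = set()
    for (p1, o1), (p2, o2) in (((a, x), (b, y)), ((a, y), (b, x))):
        g1 = tuple(p if p == o else WILD for p, o in zip(p1, o1))
        g2 = tuple(p if p == o else WILD for p, o in zip(p2, o2))
        out.add(canon(g1, g2))
    return out


def specialize(hyp, example, domains):
    """Minimal specializations of hyp (one '?' replaced by a value) that
    exclude a negative example."""
    a, b = hyp
    out = set()
    for keep, pat in ((b, a), (a, b)):
        for i, p in enumerate(pat):
            if p != WILD:
                continue
            for val in domains[i]:
                new = pat[:i] + (val,) + pat[i + 1:]
                if not matches((new, keep), example):
                    out.add(canon(new, keep))
    return out


def candidate_elimination(examples, domains):
    (x, y), _ = examples[0]        # the first example must be positive
    S = {canon(x, y)}              # most specific boundary: the example itself
    top = (WILD,) * len(x)
    G = {canon(top, top)}          # most general boundary: all wildcards
    for example, positive in examples[1:]:
        if positive:
            G = {g for g in G if matches(g, example)}
            grown = set()
            for s in S:
                if matches(s, example):
                    grown.add(s)
                else:
                    grown |= {h for h in generalize(s, example)
                              if any(more_general(g, h) for g in G)}
            S = {h for h in grown  # keep only the maximally specific members
                 if not any(h != o and more_general(h, o) for o in grown)}
        else:
            S = {s for s in S if not matches(s, example)}
            shrunk = set()
            for g in G:
                if not matches(g, example):
                    shrunk.add(g)
                else:
                    shrunk |= {h for h in specialize(g, example, domains)
                               if any(more_general(h, s) for s in S)}
            G = {h for h in shrunk  # keep only the maximally general members
                 if not any(h != o and more_general(o, h) for o in shrunk)}
    return G, S


DOMAINS = (("large", "small"), ("red", "blue"), ("triangle", "circle"))
EXAMPLES = [
    ((("large", "red", "triangle"), ("small", "blue", "circle")), True),
    ((("large", "blue", "circle"), ("small", "red", "triangle")), True),
    ((("large", "blue", "triangle"), ("small", "blue", "triangle")), False),
]
G, S = candidate_elimination(EXAMPLES, DOMAINS)
```

Running this on the three shape examples leaves the boundaries of Section 2.3: G contains ((? red ?)(? ? ?)) and ((? ? ?)(? ? circle)), and S contains ((? red triangle)(? blue circle)); the two patterns inside each hypothesis are stored in a canonical sorted order, so they may print in a different order than the text shows.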