Course  : T0264/Intelijensia Semu
Year    : July 2006
Version : 2/2
Session 26
Learning
Learning from Examples
1
Learning Outcomes
By the end of this session, students are expected
to be able to:
• << TIK-99 >>
• << TIK-99 >>
2
Outline
• Learning from Examples
• Explanation-Based Learning
• Discovery
• Analogy
• Formal Learning Theory
3
17.5. Learning from Examples: Induction
• Classification is the process of assigning,
to a particular input, the name of a class
to which it belongs.
• The task of constructing class definitions
from examples is called concept learning or induction.
Three induction techniques are covered here:
1. Winston's arch-learning program
2. Version spaces
3. Decision trees
4
Winston's Learning Program
(Figure: some Blocks World concepts, including an example
of a structural description for the House.)
5
Winston's Learning Program
Example:
• It learned the concepts House, Tent, and Arch.
• The program started with a line drawing of a blocks
world structure.
• It used procedures to analyze the drawing and
construct a semantic net representation of the
structural description of the object(s).
• A near miss is an object that is not an instance of
the concept in question but is very similar to such
instances.
6
Structural Descriptions
7
Structural Descriptions (cont'd)
8
Notes on the Structural Descriptions
• Figure (a): an example of such a structural
description for the House. Node A represents the
entire structure, which is composed of two parts:
node B, a Wedge, and node C, a Brick.
• Figures (b) and (c) show descriptions of the two Arch
structures. These descriptions are identical except
for the types of the objects on top; one is a Brick
while the other is a Wedge.
• The two supporting objects are related not only by left-of
and right-of links, but also by a does-not-marry
link, which says that the two objects do not marry
(touch each other).
• The marry relation is critical in the definition of an
Arch.
9
Winston's Learning Program
• The basic approach that Winston's program took
to the problem of concept formation can be
described as follows (a minimal sketch appears
after this list):
1. Begin with a structural description of one known
instance of the concept. Call that description the
concept definition.
2. Examine descriptions of other known instances
of the concept. Generalize the definition to
include them.
3. Examine descriptions of near misses of the
concept. Restrict the definition to exclude these.
10
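A deliberately crude sketch of this loop, assuming descriptions are flat sets of (relation, node, node) facts; real Winston generalizes by climbing isa hierarchies and uses richer link types, none of which is modeled here:

# Hypothetical sketch of Winston-style concept formation.
# A description is modeled as a set of facts like ("isa", "C", "Brick").

def learn(instances, near_misses):
    definition = set(instances[0])          # step 1: first known instance
    for example in instances[1:]:           # step 2: generalize over instances
        definition &= set(example)          # keep facts shared by all instances
    for miss in near_misses:                # step 3: restrict using near misses
        # Facts present in the definition but absent from the near miss are
        # what rule it out; mark them as required (cf. must-not-marry).
        required = {f for f in definition - set(miss) if f[0] != "must"}
        definition |= {("must",) + fact for fact in required}
    return definition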
The Comparison of Two Arches
11
The Comparison of Two Arches
Notes:
• The c-note link from node C describes the
difference found by the comparison routine.
• The difference occurred in the isa link: in
the first structure the isa link pointed to Brick,
while in the second it pointed to Wedge.
• If we follow the isa links up from Brick and Wedge,
these links eventually merge, and a new description
of the concept Arch can be generated:
• Node C must be either a Brick or a Wedge.
• It is probably better to trace up the isa hierarchies
of Brick and Wedge until they merge.
12
The Arch Description after Two Examples
13
The Arch Description After a Near Miss
14
Notes: The Comparison with a Near Miss
• The comparison routine will note that the only
difference between the current definition and the
near miss is in the does-not-marry link between
nodes B and A.
• But since this is a near miss, we do not want to
broaden the definition to include it.
• Instead, the does-not-marry link is strengthened
to must-not-marry. The figure above shows the Arch
description at this point.
• Actually, must-not-marry should not be a completely
new link type.
• There must be some structure among link types to
reflect the relationship between marry, does-not-marry,
and must-not-marry.
15
Version Spaces
• Version spaces are another approach to concept
learning [Mitchell, 1977; 1978].
• The goal is to produce a description that is
consistent with all positive examples but no
negative examples in the training set.
• A version space works by:
- maintaining a set of candidate descriptions, and
- evolving that set as new examples and near
misses are presented.
16
An Example of the Concept “Car”

Car023
  origin       : Japan
  manufacturer : Honda
  color        : Blue
  decade       : 1970
  type         : Economy

A frame representing an individual car.
17
Representation Language for Car

origin       : {Japan, USA, Britain, Germany, Italy}
manufacturer : {Honda, Toyota, Ford, Chrysler, Jaguar, BMW, Fiat}
color        : {Blue, Green, Red, White}
decade       : {1950, 1960, 1970, 1980, 1990, 2000}
type         : {Economy, Luxury, Sports}

The concept “Japanese economy car” (a dict encoding
is sketched after this slide):
  origin       : Japan
  manufacturer : x1
  color        : x2
  decade       : x3
  type         : Economy
18
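A minimal Python encoding of these frames, offered as a sketch only (the dict layout and the variable convention are our assumptions, not from the slides): feature values live in a dict, and any string beginning with "x" acts as a variable.

# Hypothetical encoding: frames as dicts, variables as strings "x1", "x2", ...
FEATURES = ["origin", "manufacturer", "color", "decade", "type"]

car023 = {"origin": "Japan", "manufacturer": "Honda",
          "color": "Blue", "decade": 1970, "type": "Economy"}

japanese_economy_car = {"origin": "Japan", "manufacturer": "x1",
                        "color": "x2", "decade": "x3", "type": "Economy"}

def is_variable(value):
    return isinstance(value, str) and value.startswith("x")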
Partial Ordering of Concepts Specified by the Representation Language
19
Version Spaces
• Before we proceed to the version space
algorithm, we should make some
observations about the representation.
• Some descriptions are more general than
others (x1 is more general than Honda, etc.).
• In fact, the representation language defines
a partial ordering of descriptions.
• The figure shows a portion of that partial
ordering of concepts specified by the
representation language (a matching sketch follows).
20
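Under the dict encoding above, this partial order can be tested with two small helpers (a sketch under our assumptions, not Mitchell's code): a description covers an instance when every non-variable feature agrees, and d1 is at least as general as d2 when d1 only relaxes d2, feature by feature.

def covers(description, instance):
    """True if the description matches the instance; variables match anything."""
    return all(is_variable(value) or instance[feature] == value
               for feature, value in description.items())

def more_general(d1, d2):
    """True if d1 is at least as general as d2 in the partial order."""
    return all(is_variable(d1[f]) or d1[f] == d2[f] for f in FEATURES)

For example, covers(japanese_economy_car, car023) and more_general(japanese_economy_car, car023) both return True.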
A Concept Space and Version Spaces
21
A Concept Space and Version Spaces
• At the top of the concept space is the null description,
consisting only of variables.
• At the bottom are all of the possible training instances,
which contain no variables.
• Before we receive any training examples, the target concept
could lie anywhere in the concept space.
• If every possible description is an instance of the intended
concept, then the null description is the concept definition,
since it matches everything.
• If the target concept includes only a single example, then
one of the descriptions at the bottom of the concept space
is the desired concept definition.
• Most target concepts, of course, lie somewhere between
these two extremes.
22
How Can We Represent the Version Space?
• The version space consists of two subsets of the concept space.
• G contains the most general descriptions consistent with
the training examples.
• S contains the most specific descriptions consistent
with the training examples.
• The version space is the set of all descriptions that lie
between some element of G and some element of S in the
partial order of the concept space.
• A positive training example makes the S set more general.
• A negative training example makes the G set more specific.
• If the S and G sets converge, our range of hypotheses
narrows to a single concept description.
• The algorithm for narrowing the version space is called the
candidate elimination algorithm.
23
Algorithm: Candidate Elimination
• Given: a representation language and a set of
positive and negative examples expressed in
that language.
• Compute: a concept description that is
consistent with all the positive examples and
none of the negative examples.
1. Initialize G to contain one element: the null
description (all features are variables).
2. Initialize S to contain one element: the first
positive example.
24
Algorithm : Candidate Elimination
3. Accept a new training example.
• If it is a positive example, first remove from G any
descriptions that do not cover the example. Then,
update the S set to contain the most specific set
of descriptions in the version space that cover
the example and the current element of the S set.
That is, generalize the elements of S as little as
possible so that they cover the new training
example.
25
Algorithm: Candidate Elimination
• If it is a negative example, first remove from S
any descriptions that cover the example. Then,
update the G set to contain the most general set
of descriptions in the version space that do not
cover the example. That is, specialize the
elements of G as little as possible so that the
negative example is no longer covered by any of
the elements of G.
• If S and G are both singleton sets, then if they
are identical, output their value and halt. If they
are both singleton sets but they are different,
then the training cases were inconsistent; output
this result and halt. Otherwise, go to step 3.
(A compact sketch of the whole algorithm follows.)
26
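A compact Python sketch of candidate elimination for this conjunctive feature-vector language, under our assumptions only: "?" stands in for the variables x1..x5, the first example is positive, the data are consistent, and the singleton-convergence check and pruning of non-maximal G elements are left out.

# Hypothetical helpers; hypotheses are 5-tuples, "?" is a variable.
NULL = ("?",) * 5

def matches(h, x):
    """A hypothesis covers an instance if every bound feature agrees."""
    return all(hv in ("?", xv) for hv, xv in zip(h, x))

def generalize(s, x):
    """Minimally generalize S so that it also covers positive example x."""
    return tuple(sv if sv == xv else "?" for sv, xv in zip(s, x))

def specialize(g, s, x):
    """Minimal specializations of g that exclude negative x, staying above S."""
    return [g[:i] + (s[i],) + g[i + 1:]
            for i in range(len(g))
            if g[i] == "?" and s[i] != "?" and s[i] != x[i]]

def candidate_elimination(examples):
    (first, _), rest = examples[0], examples[1:]
    S, G = first, [NULL]                 # steps 1 and 2
    for x, positive in rest:             # step 3
        if positive:
            G = [g for g in G if matches(g, x)]
            S = generalize(S, x)
        else:
            G = [h for g in G
                   for h in ([g] if not matches(g, x) else specialize(g, S, x))]
    return S, G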
Positive and Negative Examples of the
Concept “Japanese economy car”

(+) origin: Japan, mfr: Honda,    color: Blue,  decade: 1980, type: Economy
(-) origin: Japan, mfr: Toyota,   color: Green, decade: 1970, type: Sports
(+) origin: Japan, mfr: Toyota,   color: Blue,  decade: 1990, type: Economy
(-) origin: USA,   mfr: Chrysler, color: Red,   decade: 1980, type: Economy
(+) origin: Japan, mfr: Honda,    color: White, decade: 1980, type: Economy
27
A Version Space Example
• After the first positive example (Japan, Honda, Blue, 1980, Economy):
  G = {(x1, x2, x3, x4, x5)}
  S = {(Japan, Honda, Blue, 1980, Economy)}
• After the negative example (Japan, Toyota, Green, 1970, Sports):
  G = {(x1, Honda, x3, x4, x5), (x1, x2, Blue, x4, x5),
       (x1, x2, x3, 1980, x5), (x1, x2, x3, x4, Economy)}
  S = {(Japan, Honda, Blue, 1980, Economy)}
• After the positive example (Japan, Toyota, Blue, 1990, Economy):
  G = {(x1, x2, Blue, x4, x5), (x1, x2, x3, x4, Economy)}
  S = {(Japan, x2, Blue, x4, Economy)}
• After the negative example (USA, Chrysler, Red, 1980, Economy):
  G = {(Japan, x2, Blue, x4, x5), (Japan, x2, x3, x4, Economy)}
  S = {(Japan, x2, Blue, x4, Economy)}
• After the positive example (Japan, Honda, White, 1980, Economy):
  G = {(Japan, x2, x3, x4, Economy)}
  S = {(Japan, x2, x3, x4, Economy)}  (S and G converge; see the run below)
28
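Running the sketch from the algorithm slide on these five examples reproduces this convergence. The intermediate G after the fourth example may differ in minor ways, since equally valid maximally-general sets depend on pruning choices, but the final concept is the same:

examples = [
    (("Japan", "Honda",    "Blue",  1980, "Economy"), True),
    (("Japan", "Toyota",   "Green", 1970, "Sports"),  False),
    (("Japan", "Toyota",   "Blue",  1990, "Economy"), True),
    (("USA",   "Chrysler", "Red",   1980, "Economy"), False),
    (("Japan", "Honda",    "White", 1980, "Economy"), True),
]
S, G = candidate_elimination(examples)
print(S)   # ('Japan', '?', '?', '?', 'Economy')
print(G)   # [('Japan', '?', '?', '?', 'Economy')] -- S and G have converged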
A Decision Tree

origin?
  USA     -> (-)
  Germany -> (-)
  Britain -> (-)
  Italy   -> (-)
  Japan   -> type?
               Sports  -> (-)
               Luxury  -> (-)
               Economy -> (+)

(An encoding of this tree is sketched below.)
29
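One possible encoding of the tree in the figure, as a sketch only (the tuple-and-dict layout is our assumption): internal nodes are (feature, branches) pairs and leaves are "+" or "-" strings.

TREE = ("origin", {
    "USA": "-", "Germany": "-", "Britain": "-", "Italy": "-",
    "Japan": ("type", {"Sports": "-", "Luxury": "-", "Economy": "+"}),
})

def classify(tree, instance):
    """Walk from the root, answering one question per level, to a leaf."""
    while not isinstance(tree, str):
        feature, branches = tree
        tree = branches[instance[feature]]
    return tree

print(classify(TREE, {"origin": "Japan", "type": "Economy"}))  # prints "+"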
A Decision Tree
• Decision trees, as exemplified by the ID3 program
of Quinlan [1986], are a third approach to induction.
• ID3 uses a tree representation for concepts, such
as the one shown in the figure.
• To classify an input, start at the top of the tree and
answer questions until reaching a leaf, where the
classification is stored.
• ID3 begins by choosing a random subset of the
training examples.
• This subset is called the window.
• The algorithm builds a decision tree that correctly
classifies all the examples in the window.
30
A Decision Tree
• The tree is then tested on the training examples
outside the window.
• If all the examples are classified correctly, the
algorithm halts.
• Otherwise, it adds a number of training examples
to the window and the process repeats (a windowing
sketch follows).
• Empirical evidence indicates that the iterative
strategy is more efficient than considering the
whole training set at once.
• When the concept space is very large, decision
tree learning algorithms run more quickly than their
version space cousins.
31
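A hedged sketch of ID3 with windowing; the helper names and the information-gain split are ours, and Quinlan's actual implementation differs in many details. It assumes consistent training data and reuses classify from the sketch above.

import math
import random
from collections import Counter

def entropy(examples):
    counts = Counter(label for _, label in examples)
    return -sum(c / len(examples) * math.log2(c / len(examples))
                for c in counts.values())

def build_tree(examples, features):
    labels = {label for _, label in examples}
    if len(labels) == 1:
        return labels.pop()                 # pure leaf
    def split_entropy(f):                   # weighted entropy after splitting on f
        groups = {}
        for x, label in examples:
            groups.setdefault(x[f], []).append((x, label))
        return sum(len(g) / len(examples) * entropy(g) for g in groups.values())
    best = min(features, key=split_entropy)       # i.e., maximum information gain
    rest = [f for f in features if f != best]
    return (best, {v: build_tree([(x, l) for x, l in examples if x[best] == v], rest)
                   for v in {x[best] for x, _ in examples}})

def id3_with_window(examples, features, window_size=3):
    window = random.sample(examples, window_size)
    while True:
        tree = build_tree(window, features)
        wrong = [(x, l) for x, l in examples if classify(tree, x) != l]
        if not wrong:
            return tree                     # correct on all training examples
        window += wrong                     # grow the window and repeat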
17.6. Explanation-Based Learning
• Input:
– A training example: what the learning program
sees in the world
– A goal concept: a high-level description of what the
program is supposed to learn
– An operationality criterion: a description of which
concepts are usable
– A domain theory: a set of rules that describe
relationships between objects and actions in a domain
• Output:
– A generalization of the training example that is
sufficient to describe the goal concept, and that also
satisfies the operationality criterion (a toy sketch follows)
32
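A toy illustration of these four inputs, assuming Mitchell's classic "cup" domain theory rather than anything from these slides: the explanation step proves the goal concept from the example and keeps only the operational facts the proof actually used, so irrelevant features drop out of the learned rule.

# Toy EBL sketch; the "cup" theory is an assumption, not from these slides.
DOMAIN_THEORY = {                       # head -> list of body predicates
    "cup": ["liftable", "stable", "open_vessel"],
    "liftable": ["light", "has_handle"],
    "stable": ["has_flat_bottom"],
    "open_vessel": ["has_concavity", "points_up"],
}
# Operationality criterion: these predicates are directly observable.
OPERATIONAL = {"light", "has_handle", "has_flat_bottom",
               "has_concavity", "points_up"}

def explain(goal, example_facts):
    """Prove `goal` from the example; return the operational leaves used."""
    if goal in OPERATIONAL:
        return [goal] if goal in example_facts else None
    leaves = []
    for subgoal in DOMAIN_THEORY[goal]:
        sub = explain(subgoal, example_facts)
        if sub is None:
            return None                 # proof fails
        leaves += sub
    return leaves

# Training example: one particular cup, including irrelevant facts.
example = {"light", "has_handle", "has_flat_bottom",
           "has_concavity", "points_up", "is_red", "owned_by_ali"}
body = explain("cup", example)
print("IF", " AND ".join(body), "THEN cup")
# The learned rule omits is_red and owned_by_ali: the explanation never used them.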
17.7. Discovery
Discovery is a restricted form of learning in which one
entity acquires knowledge without the help of a teacher:
– AM: theory-driven discovery
– BACON: data-driven discovery
– Clustering
33
Discovery (cont'd)
• BACON discovering the ideal gas law (a data-driven sketch follows)
34
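A minimal BACON-flavored sketch, using made-up but physically consistent measurements (not BACON's own data or code): the program notices that the compound term pV/(nT) is constant across observations and proposes it as a law.

# BACON-style data-driven discovery sketch.
# Each observation: pressure p (Pa), volume V (m^3), moles n, temperature T (K).
observations = [
    {"p": 101325.0, "V": 0.0224, "n": 1.0, "T": 273.15},
    {"p": 202650.0, "V": 0.0112, "n": 1.0, "T": 273.15},
    {"p": 101325.0, "V": 0.0448, "n": 2.0, "T": 273.15},
]

def is_constant(values, tol=1e-3):
    mean = sum(values) / len(values)
    return all(abs(v - mean) / mean < tol for v in values)

# Candidate compound term, in BACON's spirit of combining variables:
ratios = [o["p"] * o["V"] / (o["n"] * o["T"]) for o in observations]
if is_constant(ratios):
    R = sum(ratios) / len(ratios)
    print(f"Discovered: pV/(nT) = {R:.3f}  (close to the gas constant, 8.314 J/(mol K))")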
17.8. Analogy
• Last month, the stock market was a roller
coaster.
• Bill is like a fire engine.
• Problems in electromagnetism are just like
problems in fluid flow.
35
Transformational Analogy
• The idea is to transform a solution to previous
problem into a solution for the current problem.
36
Derivational Analogy
• Transformational analogy does not look at how
the old problem was solved; it only looks at the
final solution.
• Derivational analogy, by contrast, also replays the
line of reasoning (the derivation) behind the old
solution when solving the new problem.
37
17.9. Formal Learning Theory
• A device learns a concept if it can, given positive
& negative examples, produce an algorithm that
will classify future examples correctly with
probability 1/h.
• The complexity of learning a concept is a
function of three factors: the error tolerance (h),
the number of binary features present in the
examples (t), and the size of the rule necessary
to make the discrimination (f).
• If the number of training examples required is
polynomial in h, t, and f, then the concept is said
to be learnable. (A standard sample-complexity
bound is sketched below.)
38
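As an illustration only, here is a standard PAC-style bound, not from these slides: for a learner that outputs a hypothesis from a finite space H consistent with the training data,

m \ge \frac{1}{\varepsilon}\left(\ln\lvert H\rvert + \ln\frac{1}{\delta}\right)

examples suffice to guarantee error at most \varepsilon with probability at least 1 - \delta. With t binary features and discrimination rules of bounded size f, \ln\lvert H\rvert grows only polynomially in t and f, which is the sense in which such concepts are learnable.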
The Concept Elephant
39
Summary
• Learning by taking advice
– Initial state: high-level advice
– Final state: an operational rule
– Operators: unfolding definitions, case
analysis, etc.
• Learning from examples
– Initial state: positive and negative examples
– Final state: concept description
– Search algorithm: candidate elimination,
induction of decision trees
40
Summary (cont'd)
• Learning in problem solving
– Initial state: solution traces to example problems
– Final state: new heuristics for solving new problems
efficiently
– Heuristics for search: generalization, explanation-based
learning, utility considerations
• Discovery
– Initial state: some environment
– Final state: unknown
– Heuristics for search: interestingness, analogy, etc.
41
<< Closing >>
End of Session 26
Good Luck
42