Computational Learning Theory
Part 1: Preliminaries
VERSION SPACE
Concept Learning by Induction
• Much of human learning involves acquiring general
concepts from specific training examples (this is called
inductive learning)
• Example: Concept of ball
* red, round, small
* green, round, small
* red, round, medium
• Complicated concepts: “situations in which I should
study more to pass the exam”
• Each concept can be thought of as a Boolean-valued
function whose value is true for some inputs and false
for all the rest
(e.g. a function defined over all the animals, whose
value is true for birds and false for all the other
animals)
• This lecture is about the problem of automatically
inferring the general definition of some concept, given
examples labeled as members or nonmembers of the
concept. This task is called concept learning, or
approximating (inferring) a Boolean-valued function
from examples
• Target Concept to be learnt: “Days on which Aldo
enjoys his favorite water sport”
• The training examples are:

Example  Sky    AirTemp  Humidity  Wind    Water  Forecast  EnjoySport
1        Sunny  Warm     Normal    Strong  Warm   Same      Yes
2        Sunny  Warm     High      Strong  Warm   Same      Yes
3        Rainy  Cold     High      Strong  Warm   Change    No
4        Sunny  Warm     High      Strong  Cool   Change    Yes
• The training examples are described by the values of
seven “Attributes”
• The task is to learn to predict the value of the attribute
EnjoySport for an arbitrary day, based on the values of
its other attributes
Concept Learning by Induction: Hypothesis Representation
• The possible concepts are called Hypotheses and we
need an appropriate representation for the hypotheses
• Let the hypothesis be a conjunction of constraints on
the attribute-values
• If sky = sunny ∧ temp = warm ∧ humidity = ? ∧
     wind = strong ∧ water = ? ∧ forecast = same
  then EnjoySport = Yes
  else EnjoySport = No
• Alternatively, this can be written as:
{sunny, warm, ?, strong, ?, same}
• For each attribute, the hypothesis will have one of three
kinds of constraint:
* ? : any value is acceptable
* a single specified value (e.g. warm): only that value is
acceptable
* ∅ : no value is acceptable
• If some instance (example/observation) satisfies all the
constraints of a hypothesis, then it is classified as
positive (belonging to the concept)
• The most general hypothesis is {?, ?, ?, ?, ?, ?}
It would classify every example as a positive example
• The most specific hypothesis is {∅, ∅, ∅, ∅, ∅, ∅}
It would classify every example as negative
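To make the representation concrete, here is a minimal Python sketch (a hedged illustration: the tuple encoding and the names QUESTION, EMPTY and matches are my own choices, not from the slides):

QUESTION = '?'   # "?" constraint: any value is acceptable
EMPTY = None     # "∅" constraint: no value is acceptable

def matches(h, x):
    # An instance x satisfies hypothesis h iff every attribute
    # constraint is either "?" or equal to the instance's value.
    return all(c == QUESTION or c == v for c, v in zip(h, x))

h = ('sunny', 'warm', QUESTION, 'strong', QUESTION, 'same')
x = ('sunny', 'warm', 'normal', 'strong', 'warm', 'same')
print(matches(h, x))                # True  -> classified positive
print(matches((QUESTION,) * 6, x))  # most general: always True
print(matches((EMPTY,) * 6, x))     # most specific: always False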
• An alternate hypothesis representation could have been a
disjunction of several conjunctions of constraints on the
attribute-values
Example:
{sunny, warm, normal, strong, warm, same}
∨ {sunny, warm, high, strong, warm, same}
∨ {sunny, warm, high, strong, cool, change}
• Another alternate hypothesis representation could
have been
Conjunction of constraints on the attribute-values
where each constraint may be a disjunction of
values
Example:
{sunny, warm, normal ∨ high, strong, warm ∨ cool, same ∨ change}
• Yet another alternate hypothesis representation could
have incorporated negations
Example:
{sunny, warm, ¬(normal ∨ high), ?, ?, ?}
By selecting a hypothesis representation, the space of all
hypotheses (that the program can ever represent and
therefore can ever learn) is implicitly defined
In our example, the instance space X contains 3·2·2·2·2·2 =
96 distinct instances
There are 5·4·4·4·4·4 = 5120 syntactically distinct hypotheses
(counting both ? and ∅ as possible constraints for each attribute)
Since every hypothesis containing even one ∅ classifies every
instance as negative, the number of semantically distinct
hypotheses is only 1 + 4·3·3·3·3·3 = 973
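These counts can be checked with one line of arithmetic each (plain Python, nothing assumed beyond the attribute domain sizes above):

print(3 * 2**5)       # 96   distinct instances
print(5 * 4**5)       # 5120 syntactically distinct hypotheses
print(1 + 4 * 3**5)   # 973  semantically distinct hypotheses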
Most practical learning tasks involve much larger, sometimes
infinite, hypothesis spaces
Concept Learning by Induction: Search in Hypothesis Space
Concept learning can be viewed as the task of searching
through a large space of hypotheses implicitly defined by
the hypothesis representation
The goal of this search is to find the hypothesis that best fits
the training examples
Concept Learning by Induction: Basic Assumption
Once a hypothesis that best fits the training examples is
found, we can use it to predict the class label of new
examples
The basic assumption while using this hypothesis is:
Any hypothesis found to approximate the target function well
over a sufficiently large set of training examples will also
approximate the target function well over other unobserved
examples
Concept Learning by Induction: General to Specific Ordering
If we view learning as a search problem, then it is natural
that our study of learning algorithms will examine
different strategies for searching the hypothesis space
Many algorithms for concept learning organize the
search through the hypothesis space by relying on a
general to specific ordering of hypotheses
Example:
Consider h1 = {sunny, ?, ?, strong, ?, ?}
h2 = {sunny, ?, ?, ?, ?, ?}
any instance classified positive by h1 will also be
classified positive by h2 (because h2 imposes fewer
constraints on the instance)
Hence h2 is more general than h1 and h1 is more
specific than h2
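This ordering can be written as a predicate over the tuple encoding used in the earlier sketch (again an illustrative sketch; the helper name is mine):

QUESTION, EMPTY = '?', None   # as in the earlier sketch

def more_general_or_equal(ha, hb):
    # ha is at least as general as hb iff, attribute by attribute,
    # ha accepts every value that hb accepts.
    return all(a == QUESTION or b is EMPTY or a == b
               for a, b in zip(ha, hb))

h1 = ('sunny', '?', '?', 'strong', '?', '?')
h2 = ('sunny', '?', '?', '?', '?', '?')
print(more_general_or_equal(h2, h1))   # True:  h2 is more general
print(more_general_or_equal(h1, h2))   # False: h1 is more specific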
Consider the three hypotheses
h1 = {sunny, ?, ?, strong, ?, ?}
h2 = {sunny, ?, ?, ?, ?, ?}
h3 = {sunny, ?, ?, ?, cool, ?}
• Neither h1 nor h3 is more general than the other
• h2 is more general than both h1 and h3
Find-S Algorithm
How to find a hypothesis consistent with the observed
training examples?
- A hypothesis is consistent with the training examples if it
correctly classifies these examples
One way is to begin with the most specific possible
hypothesis, then generalize it each time it fails to cover a
positive training example (i.e. classifies it as negative)
The algorithm based on this method is called Find-S
We say that a hypothesis covers a positive training example
if it correctly classifies the example as positive
A positive training example is an example of the concept to
be learnt
Similarly a negative training example is not an example of
the concept
[Slides 22-23: statement of the Find-S algorithm and a diagram
of its search trace through the hypothesis space; figures not
reproduced]
The nodes shown in the diagram are the possible hypotheses
allowed by our hypothesis representation scheme
Note that our search is guided by the positive examples and
we consider only those hypotheses which are consistent
with the positive training examples
The search moves from hypothesis to hypothesis, searching
from the most specific to progressively more general
hypotheses
At each step, the hypothesis is generalized only as far as
necessary to cover the new positive example
Therefore, at each stage the hypothesis is the most specific
hypothesis consistent with the training examples observed
up to this point
Hence, it is called Find-S
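The procedure just described can be sketched directly in Python (a hedged sketch over the tuple encoding used earlier; the function name find_s and the data layout are my own):

QUESTION, EMPTY = '?', None

def find_s(examples):
    # Start from the most specific hypothesis and minimally
    # generalize it to cover each positive example.
    n = len(examples[0][0])
    h = [EMPTY] * n
    for x, positive in examples:
        if not positive:
            continue                # negative examples are ignored
        for i, v in enumerate(x):
            if h[i] is EMPTY:
                h[i] = v            # adopt the first observed value
            elif h[i] != v:
                h[i] = QUESTION     # conflicting values: generalize to ?
    return tuple(h)

data = [(('sunny', 'warm', 'normal', 'strong', 'warm', 'same'), True),
        (('sunny', 'warm', 'high', 'strong', 'warm', 'same'), True),
        (('rainy', 'cold', 'high', 'strong', 'warm', 'change'), False),
        (('sunny', 'warm', 'high', 'strong', 'cool', 'change'), True)]
print(find_s(data))   # ('sunny', 'warm', '?', 'strong', '?', '?')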
Note that the algorithm simply ignores every negative
example
However, since at each step the current hypothesis is the most
specific one consistent with the positive examples, it will
never cover (falsely classify) any negative example. In other
words, it will always be consistent with each negative training
example
This holds, however, only if the data is noise-free and the
hypothesis representation is expressive enough to describe the
true concept
Definition: Version Space
Version Space is the set of hypotheses consistent with the
training examples of a problem
The Find-S algorithm finds one hypothesis present in the
version space; however, there may be others
List-then-Eliminate Algorithm
This algorithm first initializes the version space to
contain all possible hypotheses, then eliminates every
hypothesis found inconsistent with any training
example
The version space of candidate hypotheses thus shrinks
as more examples are observed, until ideally just one
hypothesis remains that is consistent with all the
observed examples
For the EnjoySport data we can list all 973 possible
hypotheses
Then we can test each hypothesis to see whether it is
consistent with our training data set or not
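A brute-force sketch of this idea (illustrative names again; it enumerates only the 972 ∅-free hypotheses, since with positive examples present no hypothesis containing ∅ can be consistent):

from itertools import product

QUESTION = '?'

domains = [('sunny', 'cloudy', 'rainy'), ('warm', 'cold'),
           ('normal', 'high'), ('strong', 'weak'),
           ('warm', 'cool'), ('same', 'change')]

data = [(('sunny', 'warm', 'normal', 'strong', 'warm', 'same'), True),
        (('sunny', 'warm', 'high', 'strong', 'warm', 'same'), True),
        (('rainy', 'cold', 'high', 'strong', 'warm', 'change'), False),
        (('sunny', 'warm', 'high', 'strong', 'cool', 'change'), True)]

def matches(h, x):
    return all(c == QUESTION or c == v for c, v in zip(h, x))

def consistent(h, data):
    # h must classify every training example correctly
    return all(matches(h, x) == label for x, label in data)

space = product(*[vals + (QUESTION,) for vals in domains])
for h in (h for h in space if consistent(h, data)):
    print(h)   # prints exactly the six hypotheses listed below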
For this data we will be left with the following hypotheses
h1 = {Sunny, Warm, ?, Strong, ?, ?}
h2 = {Sunny, ?, ?, Strong, ?, ?}
h3 = {Sunny, Warm, ?, ?, ?, ?}
h4 = {?, Warm, ?, Strong, ?, ?}
h5 = {Sunny, ?, ?, ?, ?, ?}
h6 = {?, Warm, ?, ?, ?, ?}
Note that the Find-S algorithm is able to find only h1
If insufficient data is available to narrow the version
space to a single hypothesis, then the algorithm can
output the entire set of hypotheses consistent with the
observed data
It has the advantage that it is guaranteed to output all
the hypotheses consistent with the training data
Unfortunately, it requires exhaustive listing of all
hypotheses – an unrealistic requirement for practical
problems
Candidate Elimination Algorithm
The Candidate Elimination algorithm, instead of listing all
the possible members of the version space, employs a much
more compact representation
The version space is represented by its most general
(maximally general) and most specific (maximally
specific) members
These members form the general and specific boundary sets
that delimit the version space. Every other member of the
version space lies between these boundaries
[Slide 34: diagram of a version space delimited by its general
and specific boundary sets; figure not reproduced]
It begins by initializing the version space to the set of all
hypotheses, i.e. by initializing the G and S boundary sets to
{?, ?, …, ?, ?} and {∅, ∅, …, ∅, ∅} respectively
As each training example is considered, the S boundary is
generalized and the G boundary is specialized, to
eliminate from the version space any hypotheses
found inconsistent with the new training example
[Slides 36-38: full statement of the Candidate Elimination
algorithm; text not reproduced]
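In place of the missing statement, here is a runnable sketch of the update rules just described, for conjunctive hypotheses over discrete attributes (a reconstruction under stated assumptions: all names and the exact pruning details are mine, not the slides' own text):

QUESTION, EMPTY = '?', None

def matches(h, x):
    return all(c == QUESTION or c == v for c, v in zip(h, x))

def more_general_or_equal(ha, hb):
    return all(a == QUESTION or b is EMPTY or a == b
               for a, b in zip(ha, hb))

def minimal_generalization(s, x):
    # The unique minimal generalization of s that covers positive x.
    return tuple(v if c is EMPTY else (c if c == v else QUESTION)
                 for c, v in zip(s, x))

def minimal_specializations(g, x, domains):
    # All minimal specializations of g that exclude negative x.
    for i, c in enumerate(g):
        if c == QUESTION:
            for val in domains[i]:
                if val != x[i]:
                    yield g[:i] + (val,) + g[i + 1:]

def candidate_elimination(data, domains):
    n = len(domains)
    S = {(EMPTY,) * n}      # maximally specific boundary
    G = {(QUESTION,) * n}   # maximally general boundary
    for x, positive in data:
        if positive:
            G = {g for g in G if matches(g, x)}
            S = {minimal_generalization(s, x) if not matches(s, x) else s
                 for s in S}
            S = {s for s in S       # keep s only if some g covers it
                 if any(more_general_or_equal(g, s) for g in G)}
        else:
            S = {s for s in S if not matches(s, x)}
            new_G = set()
            for g in G:
                if not matches(g, x):
                    new_G.add(g)
                else:               # specialize g just enough to exclude x
                    new_G |= {h for h in minimal_specializations(g, x, domains)
                              if any(more_general_or_equal(h, s) for s in S)}
            G = {g for g in new_G   # drop members below another member of G
                 if not any(g2 != g and more_general_or_equal(g2, g)
                            for g2 in new_G)}
    return S, G

Using the domains and data lists from the List-then-Eliminate sketch above, candidate_elimination(data, domains) returns S = {('sunny', 'warm', '?', 'strong', '?', '?')} and G = {('sunny', '?', '?', '?', '?', '?'), ('?', 'warm', '?', '?', '?', '?')}, matching the trace that follows.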
Candidate Elimination Algorithm: Example
S0 = {∅, ∅, ∅, ∅, ∅, ∅}
G0 = {?, ?, ?, ?, ?, ?}
After the 1st (positive) example:
S1 = {sunny, warm, normal, strong, warm, same}
G1 = G0 = {?, ?, ?, ?, ?, ?}
After the 2nd (positive) example:
S2 = {sunny, warm, ?, strong, warm, same}
G2 = G1 = {?, ?, ?, ?, ?, ?}
After the 3rd (negative) example:
S3 = S2 = {sunny, warm, ?, strong, warm, same}
G3 = {sunny, ?, ?, ?, ?, ?}, {?, warm, ?, ?, ?, ?}, {?, ?, ?, ?, ?, same}
After the 4th (positive) example:
S4 = {sunny, warm, ?, strong, ?, ?}
G4 = {sunny, ?, ?, ?, ?, ?}, {?, warm, ?, ?, ?, ?}
[Slide 44: diagram of the final version space delimited by S4
and G4; figure not reproduced]
Candidate Elimination Algorithm: Noisy Data
Suppose the 2nd example is presented as negative
S0 = {∅, ∅, ∅, ∅, ∅, ∅}
S1 = S2 = {sunny, warm, normal, strong, warm, same}
G0 = G1 = {?, ?, ?, ?, ?, ?}
G2 = {?, ?, normal, ?, ?, ?}
When the 4th (positive) example arrives, the only remaining
general hypothesis, G3 = G2 = {?, ?, normal, ?, ?, ?}, fails to
cover it and is wiped out
The version space then becomes empty, which signals that the
training data is inconsistent with the hypothesis representation
Candidate Elimination Algorithm: Is the target concept present
in the hypothesis representation?
The algorithm will fail if the target concept cannot be
described in the hypothesis representation
Our hypothesis representation: conjunction of attribute
values
Example:
Sky     AirTemp  Humidity  Wind    Water  Forecast  EnjoySport
sunny   warm     normal    strong  cool   change    Yes
cloudy  warm     normal    strong  cool   change    Yes
rainy   warm     normal    strong  cool   change    No
Our representation is unable to represent disjunctive
target concepts such as:
sky = sunny ∨ sky = cloudy
After the two positive examples, the most specific consistent
conjunctive hypothesis is {?, warm, normal, strong, cool,
change}, but it wrongly covers the negative third example
The obvious solution to the problem is to provide a
hypothesis space capable of representing every teachable
concept (by allowing arbitrary disjunctions, conjunctions,
negations of attributes and hypotheses to form new
hypotheses)
However, this raises the problem that the algorithm becomes
unable to generalize beyond the training instances
Example:
Let there be three positive instances x1, x2 & x3
and two negative instances x4 and x5
Then S = {x1 ∨ x2 ∨ x3} and G = {¬(x4 ∨ x5)}: the boundaries
merely memorize the training data, so the learner cannot
classify any instance beyond the observed ones
Reference
Sections 2.5.5 – 2.6 of T. Mitchell, Machine Learning,
McGraw-Hill, 1997