Rule-based Classification
Compiled By: Umair Yaqub, Lecturer, Govt. Murray College Sialkot
Readings: Chapter 6 – Han and Kamber

Using IF-THEN Rules for Classification
A rule-based classifier uses a set of IF-THEN rules for classification. An IF-THEN rule is an expression of the form IF condition THEN conclusion. For example:

R1: IF age = youth AND student = yes THEN buys_computer = yes

The "IF" part (or left-hand side) of a rule is known as the rule antecedent or precondition. The "THEN" part (or right-hand side) is the rule consequent. R1 can also be written as:

R1: (age = youth) ∧ (student = yes) ⇒ (buys_computer = yes)

If the condition (that is, all of the attribute tests) in a rule antecedent holds true for a given tuple, we say that the rule antecedent is satisfied (or simply, that the rule is satisfied) and that the rule covers the tuple.

A rule R can be assessed by its coverage and accuracy. A rule's coverage is the percentage of tuples in the data set D that are covered by the rule (i.e., whose attribute values hold true for the rule's antecedent):

coverage(R) = n_covers / |D|

where n_covers is the number of tuples covered by R and |D| is the number of tuples in D.

For a rule's accuracy, we look at the tuples that the rule covers and see what percentage of them it can correctly classify:

accuracy(R) = n_correct / n_covers

where n_correct is the number of tuples correctly classified by R.

Our task is to predict whether a customer will buy a computer. Consider rule R1 above, which covers 2 of the 14 tuples in the training data. It can correctly classify both tuples. Therefore, coverage(R1) = 2/14 ≈ 14.28% and accuracy(R1) = 2/2 = 100%.

Let's see how we can use rule-based classification to predict the class label of a given tuple, X. If a rule is satisfied by X, the rule is said to be triggered.
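The coverage and accuracy definitions above can be sketched in code. This is a minimal illustration, assuming tuples are represented as dicts and a rule as an (antecedent, class) pair; the four toy tuples below are made up for the example and are not the 14-tuple table from the slides.

```python
# Sketch: computing a rule's coverage and accuracy.
# A tuple is a dict of attribute values; a rule is (antecedent dict, class).

def covers(antecedent, tuple_):
    """True if every attribute test in the antecedent holds for the tuple."""
    return all(tuple_.get(attr) == val for attr, val in antecedent.items())

def coverage_and_accuracy(antecedent, consequent, data, class_attr="buys_computer"):
    covered = [t for t in data if covers(antecedent, t)]
    n_covers = len(covered)                                   # tuples covered by R
    n_correct = sum(1 for t in covered
                    if t[class_attr] == consequent)           # covered AND correct
    coverage = n_covers / len(data)                           # n_covers / |D|
    accuracy = n_correct / n_covers if n_covers else 0.0      # n_correct / n_covers
    return coverage, accuracy

# R1: IF age = youth AND student = yes THEN buys_computer = yes
R1 = ({"age": "youth", "student": "yes"}, "yes")

# Illustrative toy data (not the slides' 14-tuple table).
toy_data = [
    {"age": "youth",       "student": "yes", "buys_computer": "yes"},
    {"age": "youth",       "student": "no",  "buys_computer": "no"},
    {"age": "middle_aged", "student": "no",  "buys_computer": "yes"},
    {"age": "senior",      "student": "yes", "buys_computer": "yes"},
]

cov, acc = coverage_and_accuracy(*R1, toy_data)
print(cov, acc)  # R1 covers 1 of 4 tuples and classifies it correctly
```

On the slides' full 14-tuple table the same computation would give the 2/14 and 2/2 figures quoted above.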
Note that triggering does not always mean firing, because more than one rule may be satisfied! If more than one rule is triggered, we have a potential problem. We look at two strategies, namely size ordering and rule ordering.

The size ordering scheme assigns the highest priority to the triggering rule that has the "toughest" requirements, where toughness is measured by the rule antecedent size. That is, the triggering rule with the most attribute tests is fired.

The rule ordering scheme prioritizes the rules beforehand. With class-based ordering, the classes are sorted in order of decreasing "importance," such as by decreasing order of prevalence. That is, all of the rules for the most prevalent (or most frequent) class come first, the rules for the next most prevalent class come next, and so on.

Rule Extraction from a Decision Tree
Rules may be easier to understand than trees. One rule is created for each path from the root to a leaf: each attribute-value pair along the path forms a conjunct of the rule antecedent (the IF part), and the leaf holds the class prediction (the rule consequent, or THEN part). Rules extracted this way are mutually exclusive and exhaustive.

Example: rule extraction from our buys_computer decision tree, whose root tests age? (the age = young branch then tests student?, the age = old branch tests credit_rating?, and age = mid-age leads directly to a yes leaf):

IF age = young AND student = no THEN buys_computer = no
IF age = young AND student = yes THEN buys_computer = yes
IF age = mid-age THEN buys_computer = yes
IF age = old AND credit_rating = excellent THEN buys_computer = yes
IF age = old AND credit_rating = fair THEN buys_computer = no
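The two conflict-resolution strategies above can be illustrated with a small sketch. This is a hypothetical example, assuming rules are (antecedent dict, class) pairs; the third, one-test rule is invented here purely to create an overlap with the two-test rule so that size ordering has a conflict to resolve.

```python
# Sketch: size-ordering conflict resolution when several rules trigger on X.

def covers(antecedent, x):
    """True if every attribute test in the antecedent holds for x."""
    return all(x.get(attr) == val for attr, val in antecedent.items())

# Each rule: (antecedent dict, predicted class). The last rule is a
# hypothetical overlapping rule, added so that two rules can trigger at once.
rules = [
    ({"age": "young", "student": "no"},  "no"),
    ({"age": "young", "student": "yes"}, "yes"),
    ({"credit_rating": "fair"},          "no"),
]

def classify_size_ordering(rules, x, default="no"):
    """Fire the triggered rule with the most attribute tests (size ordering);
    fall back to a default class if no rule is triggered."""
    triggered = [r for r in rules if covers(r[0], x)]
    if not triggered:
        return default
    toughest = max(triggered, key=lambda r: len(r[0]))  # most attribute tests
    return toughest[1]

# Both the second and third rules trigger; the second has two attribute
# tests, so size ordering fires it.
x = {"age": "young", "student": "yes", "credit_rating": "fair"}
print(classify_size_ordering(rules, x))  # "yes"
```

Rule ordering would instead resolve the conflict by walking a pre-sorted rule list and firing the first rule that is satisfied.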
Pruning Rule Sets
In some cases, when trees are large, the set of extracted rules may be difficult to interpret. Decision trees may also suffer from subtree repetition and replication, so the rule set may need pruning.

Rule Assessment
For a given rule antecedent, any condition that does not improve the estimated accuracy of the rule can be pruned. A rule R is assessed by its coverage and accuracy:

n_covers = number of tuples covered by R (the rule antecedent holds true)
n_correct = number of tuples correctly classified by R
coverage(R) = n_covers / |D|, where D is the training data set
accuracy(R) = n_correct / n_covers

Issues in Rule Extraction
After pruning, the rules may no longer be mutually exclusive and exhaustive: more than one rule may be triggered (and the triggered rules may specify different classes), or no rule may be triggered at all.

If more than one rule is triggered, conflict resolution is needed:
- Size ordering: assign the highest priority to the triggering rule with the "toughest" requirement (i.e., the most attribute tests).
- Class-based ordering: classes are sorted in order of decreasing importance (order of prevalence or misclassification cost per class).
- Rule-based ordering (decision list): rules are organized into one long priority list, according to some measure of rule quality or by experts.

If no rule is satisfied, a default rule may be specified to assign a default class. This may be the overall majority class, or the majority class of the tuples that were not covered by any rule.

Rule Extraction from Training Data
A sequential covering algorithm extracts rules directly from the training data. Typical sequential covering algorithms: FOIL, AQ, CN2, RIPPER. Rules are learned sequentially; each rule for a given class Ci should cover many tuples of Ci but none (or few) of the tuples of other classes.

Steps:
- Rules are learned one at a time.
- Each time a rule is learned, the tuples covered by the rule are removed.
- The process repeats on the remaining tuples until a termination condition holds, e.g.,
when there are no more training examples, or when the quality of a rule returned is below a user-specified threshold.

Comparison with decision-tree induction: decision-tree induction can be viewed as learning a set of rules simultaneously, whereas sequential covering learns one rule at a time.

Sequential covering algorithm
(The algorithm pseudocode figure from the slides is not reproduced here.)
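The steps above can be sketched as a loop: learn one rule for the target class, remove the tuples it covers, and repeat until no positive tuples remain or the learned rule falls below a quality threshold. This is a minimal sketch, not FOIL/AQ/CN2/RIPPER itself: the `learn_one_rule` helper here is a deliberately naive stand-in that greedily picks the single attribute test with the highest accuracy on the remaining data.

```python
# Sketch of the sequential covering loop from the slides.

def covers(antecedent, t):
    return all(t.get(attr) == val for attr, val in antecedent.items())

def learn_one_rule(data, target_class, class_attr):
    """Naive stand-in rule learner: return the single attribute test with
    the best accuracy on the given data, plus that accuracy."""
    best_ant, best_acc = None, -1.0
    for t in data:
        for attr, val in t.items():
            if attr == class_attr:
                continue
            ant = {attr: val}
            cov = [u for u in data if covers(ant, u)]  # never empty: covers t
            acc = sum(u[class_attr] == target_class for u in cov) / len(cov)
            if acc > best_acc:
                best_ant, best_acc = ant, acc
    return best_ant, best_acc

def sequential_covering(data, target_class, class_attr="buys_computer",
                        min_quality=0.6):
    rules, remaining = [], list(data)
    # Repeat while positive tuples of the target class remain uncovered.
    while any(t[class_attr] == target_class for t in remaining):
        ant, quality = learn_one_rule(remaining, target_class, class_attr)
        if ant is None or quality < min_quality:
            break  # rule quality below the user-specified threshold
        rules.append((ant, target_class))
        # Remove the tuples covered by the newly learned rule.
        remaining = [t for t in remaining if not covers(ant, t)]
    return rules

# Illustrative toy data (made up for this sketch).
data = [
    {"age": "young",   "student": "yes", "buys_computer": "yes"},
    {"age": "young",   "student": "no",  "buys_computer": "no"},
    {"age": "mid-age", "student": "no",  "buys_computer": "yes"},
    {"age": "old",     "student": "yes", "buys_computer": "no"},
]

learned = sequential_covering(data, "yes")
print(learned)  # [({'age': 'mid-age'}, 'yes')]
```

On this toy data the loop learns one perfect rule (age = mid-age), then stops because the best remaining candidate rule falls below the quality threshold, illustrating the second termination condition.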