Chapter 2 Basic Case-Based Reasoning Techniques

2.1 Case Representation

A case is “a contextualized piece of knowledge representing an experience that teaches a lesson fundamental to achieving the goal of the reasoner” [Kolodner, 1993]. Typically, a case has three major parts:

Problem description: the state of the world when the case occurred and the problem that needed solving at the time
Solution: the stated or derived solution to the problem
Outcome: the resulting state of the world after the case occurred

Table 2-1 lists the contents of these three parts.

Table 2-1 The Content of the Major Parts of a Case [Kolodner, 1993]

Major Parts            Contents
Problem Description    1. Goals to be achieved
                       2. Constraints on the goals
                       3. Features of the problem situation and relationship between its parts
Solution               1. Solutions
                       2. Reasoning steps
                       3. The set of justifications for decisions
Outcome                1. The outcome itself
                       2. Explanation of the expectation violation and/or failure
                       3. Repair strategy
                       4. Pointer to next attempt at solution

Another way to describe case representation is to visualize the structure in terms of the problem space and the solution space [Watson, 1997 and Doyle, 1998]. Figure 2-1 illustrates the structure. According to this structure, the description of a problem resides in the problem space. The retrieval process uses similarity metrics to identify the stored case whose problem description best matches the new problem. The solution of that case may then have to be adapted to solve the new problem.
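The three-part case structure of Table 2-1 can be sketched as a small data structure. The following Python sketch is illustrative only: the class names, field names, and example values are assumptions made for this example and do not come from [Kolodner, 1993].

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ProblemDescription:
    # Field names are hypothetical; they mirror the rows of Table 2-1.
    goals: List[str]
    constraints: List[str] = field(default_factory=list)
    features: Dict[str, object] = field(default_factory=dict)


@dataclass
class Solution:
    solution: str
    reasoning_steps: List[str] = field(default_factory=list)
    justifications: List[str] = field(default_factory=list)


@dataclass
class Outcome:
    result: str
    explanation: Optional[str] = None        # why an expectation was violated or the solution failed
    repair_strategy: Optional[str] = None
    next_attempt: Optional["Case"] = None    # pointer to the next attempt at a solution


@dataclass
class Case:
    problem: ProblemDescription
    solution: Solution
    outcome: Optional[Outcome] = None        # may be unknown until the solution has been tried


# Illustrative values only (assumed for this sketch).
case = Case(
    problem=ProblemDescription(
        goals=["decide loan status"],
        features={"monthly_income": 2000, "job_status": "Salaried", "repayment": 200},
    ),
    solution=Solution(solution="Good"),
)
print(case.problem.features["job_status"])
```

Keeping the outcome optional reflects the fact that a case can be stored before the result of applying its solution is known.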
Figure 2-1 Problem and solution spaces [Watson, 1997 and Doyle, 1998]. (In the figure, R = Retrieval and A = Adaptation; the legend distinguishes the description of the new problem to solve, the descriptions of solved problems, the stored solutions, and the new solution created by adaptation.)

2.2 Case Indexing

An index is a computational data structure that can be stored in memory and searched quickly. Case indexing involves assigning indexes to cases to facilitate their retrieval [Watson, 1995]. The CBR community has proposed several guidelines on indexing [Hammond, 1989 and Kolodner, 1993]:

1. Indexes should be predictive.
2. Predictions that can be made should be useful ones, that is, they should address the purposes the case will be used for.
3. Indexes should be abstract enough to make a case useful for future cases.
4. Indexes should be concrete enough to be recognized in the future.

Methods for choosing indexes include both manual and automated approaches. In some systems, cases are indexed by hand. For example, hand indexing is needed when the cases are complex and the knowledge needed to understand them well enough to choose indexes accurately is not concretely available [Kolodner, 1993]. On the other hand, if problem solving and understanding are already automated, it is advantageous to use automated indexing methods.

2.3 Case Retrieval

Case retrieval is the process by which a retrieval algorithm finds the stored cases most similar to the current problem. It requires a combination of search and matching. In general, the major CBR applications use two retrieval techniques: the nearest-neighbor retrieval algorithm and the inductive retrieval algorithm.

2.3.1 Nearest-Neighbor Retrieval

Nearest-neighbor retrieval is a simple approach that computes the similarity between the stored cases and the new input case based on weighted features.
A typical evaluation function used to compute nearest-neighbor matching [Kolodner, 1993] is shown in Figure 2-2:

\[
\mathrm{similarity}(Case_I, Case_R) = \frac{\sum_{i=1}^{n} w_i \times \mathrm{sim}(f_i^I, f_i^R)}{\sum_{i=1}^{n} w_i}
\]

Figure 2-2 A nearest-neighbor evaluation function

Here \( w_i \) is the importance weight of feature \( i \), sim is the similarity function for features, and \( f_i^I \) and \( f_i^R \) are the values of feature \( i \) in the input and retrieved cases, respectively.

Figure 2-3 displays a simple scheme for nearest-neighbor matching. In this 2-dimensional space, case3 is selected as the nearest neighbor because similarity(NC, case3) > similarity(NC, case1) and similarity(NC, case3) > similarity(NC, case2).

Figure 2-3 How to find the nearest neighbor of the new case NC. (The figure plots case1, case2, case3, and the new case NC against the axes feature1 and feature2, together with similarity(NC, case1), similarity(NC, case2), and similarity(NC, case3).)

2.3.2 Inductive Retrieval

Inductive retrieval is a technique that determines which features best discriminate between cases and generates a decision-tree structure to organize the cases in memory [Watson, 1997]. This approach is very useful when a single case feature is required as a solution and when that feature depends on others. Figure 2-4 shows a completed decision tree generated from the data in Table 2-2 [Watson, 1997]. The task is to predict the status of a loan from features of the loan applicant (income, job status, and repayment).

Table 2-2 Four loan cases [Watson, 1997]

Case No.   Loan Status   Monthly Income   Job Status   Repayment
Case 1     Good          $2000            Salaried     $200
Case 2     Very bad      $4000            Salaried     $600
Case 3     Very good     $3000            Waged        $300
Case 4     Bad           $1500            Salaried     $400

Repayment < $400?
    Yes: Job status?
        Salaried: Case 1
        Waged: Case 3
    No: Income > $1500?
        Yes: Case 2
        No: Case 4

Figure 2-4 The completed decision tree

Table 2-3 A Target Case

Case No.   Loan Status   Monthly Income   Job Status   Repayment
Case X     ?             $1000            Salaried     $600

If a target case were presented as shown in Table 2-3, the algorithm would determine its loan status by traversing the decision tree to find the best matching case in the case base. Because the repayment of the target case ($600) is not less than $400, the algorithm first follows the No branch of the root node. It then reaches the node Income > $1500 and, because the monthly income ($1000) is not greater than $1500, again follows the No branch. We can therefore predict that the best matching case is Case 4. This suggests that the loan prospect is bad, because the loan status of Case 4 is bad.

2.3.3 Nearest-Neighbor Retrieval vs. Inductive Retrieval

Nearest-neighbor retrieval and inductive retrieval are widely applied in CBR applications and tools. Table 2-4 shows the strengths and weaknesses of the two techniques. The choice between them in a CBR application requires experience and experimentation. Usually, nearest-neighbor retrieval without any pre-indexing is a good choice [Watson, 1997]. If retrieval time becomes an important issue, inductive retrieval is preferable.

Table 2-4 Comparison between nearest-neighbor retrieval and inductive retrieval

Retrieval Technique          Strength               Weakness
Nearest-Neighbor Retrieval   Simple                 Slow retrieval speed when the case base is large
Inductive Retrieval          Fast retrieval speed   1. Depends on pre-indexing, which is a time-consuming process
                                                    2. Cannot retrieve a case when case data is missing or unknown

In some CBR tools, both techniques are used: inductive indexing is used to retrieve a set of matching cases, then nearest-neighbor retrieval is used to rank the cases in the set according to their similarity to the target case.
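The two retrieval techniques described in this section can be sketched in Python on the four loan cases of Table 2-2. This is a minimal sketch under stated assumptions, not the method of any particular CBR tool: the equal feature weights, the value ranges, and the local similarity function (exact match for job status, range-scaled distance for numeric features) are hypothetical choices made for illustration. Only the weighted-average form of Figure 2-2 and the decision tree of Figure 2-4 come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoanCase:
    name: str
    status: str       # loan status, the feature to predict
    income: float     # monthly income in dollars
    job: str          # "Salaried" or "Waged"
    repayment: float  # repayment in dollars


# The four cases of Table 2-2.
CASE_BASE = [
    LoanCase("Case 1", "Good", 2000, "Salaried", 200),
    LoanCase("Case 2", "Very bad", 4000, "Salaried", 600),
    LoanCase("Case 3", "Very good", 3000, "Waged", 300),
    LoanCase("Case 4", "Bad", 1500, "Salaried", 400),
]

# ASSUMPTIONS (not from the text): equal weights, and numeric ranges taken
# from the minimum and maximum values that occur in Table 2-2.
WEIGHTS = {"income": 1.0, "job": 1.0, "repayment": 1.0}
RANGES = {"income": 4000 - 1500, "repayment": 600 - 200}


def local_sim(feature, a, b):
    """Assumed local similarity: exact match for job, scaled distance otherwise."""
    if feature == "job":
        return 1.0 if a == b else 0.0
    return max(0.0, 1.0 - abs(a - b) / RANGES[feature])


def similarity(target, case):
    """Weighted average of local similarities, in the form of Figure 2-2."""
    weighted = sum(
        w * local_sim(f, getattr(target, f), getattr(case, f))
        for f, w in WEIGHTS.items()
    )
    return weighted / sum(WEIGHTS.values())


def nearest_neighbor(target, cases):
    """Return the stored case with the highest similarity to the target."""
    return max(cases, key=lambda c: similarity(target, c))


def inductive_retrieve(target):
    """Traverse the decision tree of Figure 2-4 and return the case at the leaf."""
    if target.repayment < 400:
        leaf = "Case 1" if target.job == "Salaried" else "Case 3"
    else:
        leaf = "Case 2" if target.income > 1500 else "Case 4"
    return next(c for c in CASE_BASE if c.name == leaf)


# The target case of Table 2-3.
target = LoanCase("Case X", "?", 1000, "Salaried", 600)
print(inductive_retrieve(target).name, inductive_retrieve(target).status)
print(nearest_neighbor(target, CASE_BASE).name)
```

Under these assumed weights and similarity measures, both techniques retrieve Case 4 for the target case. With different weights the nearest-neighbor result could change, whereas the tree traversal depends only on the tree itself.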