Decision Sciences Institute 2002 Annual Meeting Proceedings

IS DESIGN USING ANALOGICAL GENERATION

Dan Zhu
Department of Logistics, Operations and MIS
300 Carver Hall
Iowa State University
Ames, IA 50011
515-294-5041
515-294-2534 (Fax)
[email protected]

Chang Zhang
Microsoft Corp.

Dongrong Xu
Center for Biomedical Image Computing
Johns Hopkins University

Abstract: This paper presents a framework for using analogical reasoning and generation to automate the process of designing forms and reports. We begin by separating successful cases into different regions and then storing them in a knowledge base that allows future retrieval based on identity and similarity. The target can be obtained through a three-layered structure matching and analogical reasoning process.

1. INTRODUCTION

Information systems development is a complex process that requires a clear understanding of the organization or enterprise being modeled and systematic skills in developing computer-based systems. This research aims to use an analogical reasoning approach to provide design strategies for form and report design in IS project development.

Analogy forms a significant part of our cognitive endeavors. Analogical reasoning occurs when we evaluate and classify objects according to their similarities. Significant research has been conducted on the important role of the comparative process in human decision-making (Markman and Medin 1995; Medin, Goldstone, & Gentner 1995; Simonson & Tversky 1992).

This research focuses on the conceptual design phase of information systems development. The conceptual design phase is one of the most critical phases and has been viewed as a predominantly creative process (French 1971). This is where analogy can play an important role. Research on creative design has explored the use of analogies in proposing solutions to design problems in the conceptual phase of the design process (Goel 1997).
This conceptual design phase is also the phase that makes the greatest demands on designers and offers the greatest scope for improvement (French 1971). In general, system designers learn through experience, with guidance from proven methods. As they seek out additional information, they also begin to formulate alternative solutions to the problem. Devising solutions leads to a search for more information, which in turn leads to improvements in the alternatives.

More specifically, we focus on the design of database relations. The design of database relations (tables) is probably the most important part of the job, generally requiring 30-40% of the total time and effort spent on the project. The tables must be designed according to certain criteria but, more importantly, they must be convenient for users to operate, allowing them to retrieve information from the database efficiently. Often, different tables share the same resources and thus may be similar in many ways. Despite these similarities, tables are designed manually and individually, a mundane and routine task that is both inefficient and error-prone.

An alternative method is to collect examples and separate them into elements with descriptions attached. When a new task is given, the descriptions can be searched to determine whether any of the previous information can be used in the new table. If so, data corresponding to each item in the table can be obtained directly from the system and inserted into the new table. This can potentially automate the design process by allowing designers to work from existing tables rather than constantly designing new ones from scratch (Poulin 1995).

The rest of the paper is organized as follows. Section 2 introduces the theory of analogical reasoning and multi-source analogical generation. Section 3 presents the new representation schema of forms as matrices.
Section 4 describes the process of applying multi-source analogical generation to the design of tables and forms. Section 5 uses an example to illustrate our approach. Section 6 concludes the paper with some directions for future research.

2. BACKGROUND

A typical system development life cycle includes project identification and selection, project initiation and planning, analysis, logical and physical design, implementation, and maintenance (McFadden et al. 2001). Given the problem specification, people need to analyze the problem and determine ways to solve it. Simon (1960) proposed a problem-solving model that lends significant insight into how people solve certain types of problems. The model includes four phases for analyzing and solving problems: intelligence, design, choice and implementation. First, all relevant information is collected during the intelligence phase. Next, several alternatives are formulated during the design phase. The best alternative is then chosen during the choice phase. Finally, the solution is put into practice during the implementation phase.

The development of information systems can be broken down into many activities. In the conceptual design phase, when the system's requirements and constraints have been identified and prioritized, people may develop several alternative design strategies for the organization's information system problem. No matter which design approach is adopted, a successful design that satisfies all the requirements in one step is impractical, if not impossible. It is usually necessary to subdivide the overall job into smaller tasks. In design problems for large organizations, the decomposition is often done by considering different user views separately. A user view is the set of requirements that is necessary to support the operations of a particular user.
Therefore, an information-level design involves representing each individual user view, refining the views, and then merging them into a cumulative design. When given a user view or some other stated requirement, we can develop a collection of tables that will support it. People find that the more designs they have done, the easier it becomes to develop such a collection without resorting to any special procedure.

Tables are a major component of information systems. End users access and manage information by manipulating and querying the data contained in the tables. This paper explores the process of table design and investigates using a structured matrix to access the knowledge embedded in the tables.

First, a table is represented as a matrix, each of whose elements can be either a data item or a matrix. A table is therefore treated as a general matrix whose elements are sub-matrices. In this way, a table is represented as a multi-layered matrix. There are usually up to two layers, and the 2-layered matrix structure can be expressed by a 3-layered structure, like a tree with many branches. Every branch of the tree has attributes that are the data items constructing the basic structure of the matrix. The root of the tree represents a table, and its attributes are the name of the table, the number of its sub-matrices, and the number of different matrices, respectively. The leaves of the tree are the data items, and their attributes are the size of the data item, the number of lines in the matrix of the specific data item, and the number of columns in the matrix of the specific data item. The intermediate nodes of the tree stand for the sub-matrices, and the attributes of each node are the number of data items and sub-matrices that belong to the current sub-matrix, the feature of the sub-matrix, and its position, marked by its column and row location taking the sub-matrix as the unit.
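The three-layer tree described above can be sketched as a small data structure. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the class names `DataItem`, `SubMatrix` and `Table`, the field names, and the sample order form are all hypothetical. Reading "the number of different matrices" as the number of distinct sub-matrix features is also an assumption.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class DataItem:
    """Leaf of the tree: one data item (all field names are hypothetical)."""
    name: str
    size: int      # size of the data item
    lines: int     # number of lines recorded for this data item
    columns: int   # number of columns recorded for this data item


@dataclass
class SubMatrix:
    """Intermediate node: one sub-matrix of the table."""
    feature: str                 # descriptive feature of the sub-matrix
    position: Tuple[int, int]    # (row, column), counted in sub-matrix units
    items: List[DataItem] = field(default_factory=list)

    @property
    def item_count(self) -> int:
        # Number of data items that belong to this sub-matrix.
        return len(self.items)


@dataclass
class Table:
    """Root of the tree: the whole table or form."""
    name: str
    submatrices: List[SubMatrix] = field(default_factory=list)

    @property
    def submatrix_count(self) -> int:
        return len(self.submatrices)

    @property
    def distinct_submatrix_count(self) -> int:
        # Assumption: two sub-matrices are "different" if their features differ.
        return len({sm.feature for sm in self.submatrices})


# Hypothetical example: an order form with a header block and a repeating block.
header = SubMatrix(
    feature="customer header",
    position=(1, 1),
    items=[DataItem("customer_id", 8, 1, 1), DataItem("customer_name", 30, 1, 1)],
)
order_lines = SubMatrix(
    feature="order lines",
    position=(2, 1),
    items=[
        DataItem("item", 20, 1, 1),
        DataItem("quantity", 5, 1, 1),
        DataItem("price", 10, 1, 1),
    ],
)
order_form = Table(name="ORDER-FORM", submatrices=[header, order_lines])
```

Keeping the counts as derived properties rather than stored fields avoids the root and intermediate attributes drifting out of sync with the actual children when a case is edited.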
Figure 1 illustrates the tree analogy by showing a table that has been divided into modules. Table A can be considered a 3 by 2 matrix consisting of 4 sub-matrices, A1, A2, A3 and A4:

$$A = \begin{pmatrix} A_1 & A_2 \\ 0 & A_3 \\ 0 & A_4 \end{pmatrix}$$

[Figure 1: Table A drawn as a grid of its elements. The elements shown are a111, a112, a121, a122, a123, a131, a132 (sub-matrix A1); a211, a213, a221 (A2); a311, a321, a322, a323, a331, a332 (A3); and a411, a412, a413, a421, a422, a423, a431, a432, a433 (A4).]

Figure 1. An Example Matrix

Therefore,

$$A_1 = \begin{pmatrix} 0 & a^{1}_{11} & a^{1}_{12} \\ a^{1}_{21} & a^{1}_{22} & a^{1}_{23} \\ 0 & a^{1}_{31} & a^{1}_{32} \end{pmatrix}$$

In this case, A4 consists of records with the same three fields. From the point of view of the whole table, it is obtained by reading the database more than once. When a record occupies one row, the records line up horizontally and form rows. When a certain number of records are combined into one column, they are arranged vertically, and more than one row and column are formed. In either the vertical or the horizontal direction, the 4 sub-matrices cannot be aligned; each is read from the database once and only once. The original matrix can be rewritten in the format of a 3-layered tree, as shown in Figure 2.

There may be other ways to combine the tables. However, the basic hypothesis is that the success of a case in the knowledge base is attributable to a successful general form, which mainly refers to the relative location of the sub-sections. Therefore, if we can find a similar case in the knowledge base, the result should be considered acceptable and other solutions should be discarded.

In this way, many designed tables are collected and recorded as knowledge based on the structured representation. When the task of designing a new table is given, similar cases can be retrieved from the knowledge base and new solutions can be obtained by analogical reasoning on the "successful experience." The objective of retrieval is to find trees whose layer structures match that of the target at the level of description.

3. METHODOLOGY

Analogy is an important aspect of human learning and thinking.
Analogical reasoning occurs when people recall knowledge from a previous problem and relate it to a new problem (Mayer 1991). Research on analogical problem solving is rooted in cognitive psychology (Carbonell 1983; Gentner 1983; Holyoak and Thagard 1989, 1995). Analogies have been used to enhance students' abilities to model and solve, for example, algebra and geometry problems (Weaver & Kintsch 1992; Lovett and Anderson 1994; Chee 1993).

Several models of an analogy machine have been developed, each with some limitations. For example, analogy has looser constraints than deductive inference, so the final result cannot be guaranteed to be true in all cases. Analogy is one of the most important human abilities, and it is the kernel of human intelligence. Gentner's theory of structure mapping laid the foundation for analogical reasoning. However, the theory is limited by the fact that it requires detailed knowledge and understanding of the objects, plans, and targets, as well as an accurate description and expression of the targets. Therefore, useful and satisfactory results can only be obtained when sources and targets are highly similar, as with the model of the atom and the solar system.

Analogical problem solving involves retrieving a source that is similar to the target problem and then using its solution to solve the target problem. One of the keys to successful analogical reasoning lies in ignoring the similarities or dissimilarities in the surface features of the problems while recognizing analogies in their structures (Mayer 1991).

Research on analogy and design has focused primarily on mental models of design (Bhatta & Goel 1996; Goel 1997). Mental models are characterized by the type of knowledge they capture. One of the most common mental models is the structure-behavior-function model. This model represents the structure of a design in some object-attribute-relation ontology, representing its internal causal behavior as well as its functions.
In this paper, we propose the concept of multi-source analogy, which frees traditional analogy from the limitation of presumed restrictions, thus allowing its newly added sources to be mapped to a more extended target field. The similarity calculation based on multi-source analogy can thus be applied to a variety of domains.

In order to apply analogical reasoning in the field of design to produce a new result, we should make no presumption of similarity. A less rigid definition of similarity is needed. Because analogical reasoning (AR) is not a restrictive form of reasoning, its conclusion is not necessarily true. Therefore, it is important to verify its outcome. However, the fact that it is not always true complicates the verification process. Since the method presented in this article uses the same structure in the final result and in its sources, we suggest applying a separate set of evaluation standards to each part and then combining them into an overall evaluation score. If the score is too low, the process is guided back into the redesign loop. The design is considered complete when the score is sufficient.

4. DESIGN FRAMEWORK

The next problem is how to create a new table or form from a group of retrieved cases according to the given restrictions. Because the task involves synthesizing more than one analog into a single result, conflicts among the source cases are bound to arise. For simplicity, consider only two original tables (Table A and Table B), each consisting of two sub-matrices. Now let us take A1 out of Table A and B2 out of Table B, and recombine them into a new table, Table C. According to analogical reasoning theory, this is an example of 2-source analogical generation (AG). Before Table C is created, analogical reasoning is used to revise A1 and B2 into A1' and B2'. Therefore the resulting Table C is similar to both Table A and Table B.

[Figure 2: FORM-A (sub-matrices A1, A2) and FORM-B (sub-matrices B1, B2) are combined into FORM-C through the mappings M: A1 → A1' and M: B2 → B2'.]

Figure 2.
An example of Analogical Generation (AG)

Since Table A and Table B already exist, it would be meaningless to design Table C so that it duplicates A or B. Table C, however, differs from both A and B. Because C is derived from A and B, the design makes sense, yet only C is new.

To make the idea of AG practical and operable, a measure of similarity has to be defined to drive the procedures of retrieving, matching, and reasoning. A mapping in analogical reasoning is m: s/t. In this case, it should be revised as m (p%): s/t, where s stands for the source, t for the target, and p% is the similarity between s and t.

Since an object consists of many parts with various similarities between them, another parameter is needed to determine the influence of each part: that is, what distinguishes one part from another, or how easily the object is recognized if only one specific part is known. Denote the identity of the j-th sub-part as I_j; then:

$$\sum_{j=1}^{N} I_j = 1$$

We can see that the more parts an object has, the more difficult it will be for a single sub-object to represent it. If the identity vectors of the sub-parts of A and B are (a_1, a_2, ..., a_n) and (b_1, b_2, ..., b_n), respectively, and the vector of similarity is (s_1, s_2, ..., s_n), then the general similarity between A and B is defined as:

$$S = w_1 s_1 + w_2 s_2 + \cdots + w_n s_n = \sum_{i=1}^{n} w_i s_i$$

where w_i is the weight of the i-th sub-part, with

$$\sum_{i=1}^{n} w_i = 1 \quad \text{and} \quad w_i = \frac{a_i b_i}{\sum_{j=1}^{n} a_j b_j}$$

As a result:

$$S = \frac{a_1 b_1}{\sum_{j=1}^{n} a_j b_j}\, s_1 + \frac{a_2 b_2}{\sum_{j=1}^{n} a_j b_j}\, s_2 + \cdots + \frac{a_n b_n}{\sum_{j=1}^{n} a_j b_j}\, s_n = \frac{\sum_{i=1}^{n} a_i b_i s_i}{\sum_{j=1}^{n} a_j b_j}$$

Extending the situation discussed above to general cases, we should consider the following:
(1) A generated table can have more than two analogues.
(2) The "de-parting" of each table may not necessarily be the same (e.g., some have two parts while others may have three or more parts).
(3) A resulting table may inherit different sub-parts from different analogues, and it may also inherit a synthesized sub-result from the corresponding parts of different analogues.

Accordingly, the concepts are extended to a similarity matrix and an identity matrix (both n by m).

When a new table is designed, the cases stored as analogues in the knowledge base should be removed a priori. In other words, cut the root of the 3-layer tree to cancel the high-order constraints between the source and the target, making each sub-matrix a sub-part of the table. Thus, each sub-matrix of the target can be compared with all the sub-matrices of every table. This means that all of the tables can act as one analog of the sub-part. This is a multi-source analogy, and it is necessary to choose the matrix most similar to the target matrix. This is a problem of similarity calculation. Each data item is considered a sub-part, and all of their identities are considered identical. The best-matching source sub-matrices can then be retrieved; thus a map from each source sub-matrix to the target sub-matrix is formed, and the target sub-matrices are obtained.

Subsequently, these target sub-matrices need to be reunited into an integrated table. This reunification task is also performed by analogical generation. This time the goal is to find a table in the source base that includes matrices of almost the same style as the target sub-matrices. We assume that all of the sub-matrices here possess the same identity. Therefore, the objective is to find the table in the base that has the greatest sum of similarities over the corresponding parts. After the table has been found, all of the target sub-parts are arranged in the same way as the parts of the retrieved table. Note that only the general layout of the retrieved table matters here, not its content.

Finally, since analogical reasoning is not 100 percent accurate, the results must be reviewed and validated by a human user.
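The similarity calculation and retrieval step described in this section can be sketched in Python. The function `general_similarity` follows the identity-weighted definition given earlier, S = (sum of a_i b_i s_i) / (sum of a_j b_j). Everything else is an assumption made for illustration and does not come from the paper: the function names, the sample source base, and in particular the way each part similarity s_i is obtained (1 if a data item with the same name appears in both sub-matrices, 0 otherwise).

```python
from typing import Dict, List, Optional, Sequence, Tuple


def general_similarity(a: Sequence[float], b: Sequence[float],
                       s: Sequence[float]) -> float:
    """Identity-weighted similarity: S = sum(a_i*b_i*s_i) / sum(a_j*b_j)."""
    if not (len(a) == len(b) == len(s)):
        raise ValueError("identity and similarity vectors must have equal length")
    denominator = sum(ai * bi for ai, bi in zip(a, b))
    if denominator == 0:
        return 0.0
    return sum(ai * bi * si for ai, bi, si in zip(a, b, s)) / denominator


def submatrix_similarity(target: Sequence[str], source: Sequence[str]) -> float:
    """Compare two sub-matrices given as lists of data item names.

    Every data item is a sub-part with the same identity, as the text states.
    Assumption: s_i is 1.0 when the item occurs in both sub-matrices, else 0.0.
    """
    target_set, source_set = set(target), set(source)
    parts = sorted(target_set | source_set)
    if not parts:
        return 0.0
    identity = [1.0 / len(parts)] * len(parts)   # identities sum to 1
    s = [1.0 if (p in target_set and p in source_set) else 0.0 for p in parts]
    return general_similarity(identity, identity, s)


def retrieve_best(target: Sequence[str],
                  base: Dict[str, Dict[str, List[str]]]
                  ) -> Optional[Tuple[str, str, float]]:
    """Return (table name, sub-matrix name, score) of the closest source."""
    best: Optional[Tuple[str, str, float]] = None
    for table_name, submatrices in base.items():
        for sub_name, fields in submatrices.items():
            score = submatrix_similarity(target, fields)
            if best is None or score > best[2]:
                best = (table_name, sub_name, score)
    return best


# Hypothetical source base: roots removed, every sub-matrix is a candidate.
source_base = {
    "FORM-A": {"A1": ["customer_id", "name", "address"],
               "A2": ["item", "quantity", "price"]},
    "FORM-B": {"B1": ["order_no", "date"],
               "B2": ["item", "quantity", "price", "discount"]},
}
target_fields = ["item", "quantity", "price", "discount"]
best_match = retrieve_best(target_fields, source_base)
```

With equal identities the weights reduce to 1/n, so the score is simply the mean of the part similarities. A real system would replace the 0/1 name match with a graded similarity between data items, and would repeat the same search at the table level to pick the layout with the greatest summed similarity.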
Analogical reasoning techniques can be a tremendous help in automating the design of information systems. We tested our theory of multi-source analogical generation on an inventory order processing system and found it highly successful.

5. CONCLUSION

Designing an information system can be a time-consuming task. In this paper, we propose a novel approach, multi-source analogical generation based on analogical reasoning, to facilitate information system design and development. We demonstrate successful uses of the theory in designing tables and forms in databases. Traditional analogical reasoning is a one-source, two-part analogy process, and the operation applied to it is binary, which does not accurately reflect the human thought process. Multi-source analogy uses analogical reasoning to move beyond simple repetition. This allows creative and practical design of databases with few constraints.

The methodology presented in this paper is helpful to database designers, especially novices, and can be generalized to other domains as well. It makes it possible for a beginner to use the rich experience stored in the knowledge base, and the efficiency of the system improves as the system is used.

References

Bhatta, S. and Goel, A. From Design Experiences to Generic Mechanisms: Model-Based Learning in Analogical Design, Artificial Intelligence in Engineering Design, Analysis, and Manufacturing, special issue on machine learning in design, Vol. 10, 1996, 131-136.
Carbonell, J. Learning By Analogy: Formulating and Generalizing Plans from Past Experience, in Machine Learning: An Artificial Intelligence Approach, R. Michalski, J. Carbonell and T. Mitchell, eds., Tioga, Palo Alto, California, 1983.
Chee, Y.S. Applying Gentner's theory of analogy to the teaching of computer programming, Int. J. Man-Machine Studies, 38, 1993, 347-368.
French, M.J. Engineering Design: The Conceptual Stage. Heinemann Educational Books, London, 1971.
Gentner, D. Structure-mapping: A Theoretical framework for analogy. Cognitive Science, 1983, 7, 155-170.
Gentner, D., & Markman, A. B. Structural Alignment in Comparison: No difference without similarity. Psychological Science, 1994, 5, 152-158.
Gentner, D., Rattermann, M. J., & Forbus, D. D. The roles of similarity in transfer: Separating retrievability from inferential soundness, Cognitive Psychology, 25, 1993, 524-575.
Goel, A. "Design, Analogy and Creativity", IEEE Expert, Vol. 12, no. 3, May/June, 1997, 62-70.
Holyoak, K. J. and P. Thagard. A computational model of analogical problem solving, Similarity and Analogical Reasoning, Cambridge University Press, Cambridge, England, S. Vosniadou, A. Ortony, eds. 1989, 242-266.
Holyoak, K. J. and P. Thagard. Mental Leaps: Analogy in Creative Thought, MIT Press, Cambridge, MA, London, England, 1995.
Lung, C.H. and Urban, J.E. An Approach to the Classification of Domain Models in Support of Analogical Reuse, ACM, Software Engineering Notes, 1995, ACM No.595950.
Lovett, C.M. and Anderson, J.R. Effects of solving related proofs on memory and transfer in geometry problem solving. Journal of Experimental Psychology: Learning, Memory and Cognition, 1994, 20 (2), 366-378.
McFadden, F. R., J. A. Hoffer, and M.B. Prescott, Modern Database Management. 6th edition. Reading, MA: Addison Wesley Longman, 2001.
Markman, A.B. & Gentner, D. Structural alignment during similarity comparisons. Cognitive Psychology, 1993, 25, 431-467.
Markman, A. B. & Medin, D. L. Similarity and Alignment in Choice. Organizational Behavior and Human Decision Processes, 1995, 63, 117-130.
Mayer, R.E. Thinking, Problem Solving, Cognition. Second Edition. W. H. Freeman and Company, New York. 1991.
Medin, D.L., R. L. Goldstone and D. Gentner, Comparison and Choice: Relations between similarity processes and decision processes. Psychological Review, 2, 1995, 1-19.
Mukhopadhyay, D., Dalezman, B.
Designing Open System with CASE, Information System Management, Vol. 12, No. 1, Winter 1995.
Poulin, J.S. Populating Software Repositories: Incentives and Domain-Specific Software, The Journal of Systems and Software, Vol. 30, No. 3, September 1995.
Simon, H. A. 1960. The New Science of Management Decision, New York: Harper & Row.
Weaver, C.A. and Kintsch, W. Enhancing Students' Comprehension of the Conceptual Structure of Algebra Word Problems. Journal of Educational Psychology, 1992, 84 (4), 419-428.