Introduction

User-Adaptive Systems: An Integrative Overview
Tutorial originally presented at UM99 (June, 1999) and IJCAI99 (August, 1999)
Anthony Jameson
Department of Computer Science, Saarland University, Germany
[email protected]
http://www.cs.uni-sb.de/users/jameson

Overview
1. Introduction
2. Functions of Adaptation
3. Properties Modeled
4. Types of Input
5. Inference Techniques
6. Empirical Foundations
7. Concluding Remarks

Contents
Introduction: Overview; Contents; Previous Terminology; Proposal for an Integrative Term; Why Are User-Adaptive Systems Important?
Functions: Overview; Tailor Information Presentation; Recommend Products or Other Objects; Help U to Find Information; Support Learning; Give Help; Adapt an Interface; Take Over Routine Tasks; Support Collaboration; Other Functions
Properties: Overview; Personal Characteristics; General Interests and Preferences; Proficiencies; Noncognitive Abilities; Current Goal; Beliefs; Behavioral Regularities; Psychological States; Context of Interaction
Input Types: Overview; Self-Reports on Personal Characteristics; Self-Reports on Proficiencies and Interests; Evaluations of Specific Objects; Responses to Test Items; Naturally Occurring Actions; Low-Level Measures of Psychological States; Low-Level Measures of Context
Inference: Overview; Bayesian Networks; Machine Learning, Introduction; ML, Rule Learning; ML, Perceptron-Style Learning; ML, Probability and Instance-Based Learning; ML, Collaborative Filtering; ML, General Discussion; Logic; Stereotypes
Empirical Foundations: Overview; Results of Previous Research; Early Exploratory Studies; Knowledge Acquisition; Controlled Evaluations With Users; Experience With Real-World Use
Concluding Remarks

For a collection of some of the early research, see Kobsa, A., & Wahlster, W. (Eds.) (1989). User models in dialog systems. Berlin: Springer.

Previous Terminology (1)
User modeling
• First research in the late 1970's
• Research originally focused on natural language dialog systems
• Recently, considerable expansion of scope to include the types of adaptation listed below (see, e.g., the journal User Modeling and User-Adapted Interaction)
• Emphasis on the use of detailed, explicit models of the user (U) to adapt the behavior of the system (S)
Student (or learner) modeling
• U uses an intelligent tutoring system or learning environment
• Typically very explicit and sophisticated models of U

Previous Terminology (2)
Learning about users, adaptive interfaces
• Terms sometimes used by machine-learning specialists
• Modeling of the user is typically less explicit
Personal learning assistants / apprentices
• Terms often used in the agents community
• Emphasis is on the role of the agent in supporting the individual user
Personalization
• The most popular term at the present time
• Used by many commercial firms, most of which have been founded only recently

Proposal for an Integrative Term
A user-adaptive system
• adapts its behavior to the individual user U on the basis of nontrivial inferences from information about U
General schema for processing in a user-adaptive system: INPUT → (upward inference) → GENERALLY RELEVANT PROPERTIES → (downward inference) → DECISION-RELEVANT PROPERTIES

Why Are User-Adaptive Systems Important?
Relevant technological developments Recent technological developments have enabled the practical deployment of user-adaptive systems: • General increase in computing power • New application domains where adaptivity is especially valuable • New interaction techniques that facilitate adaptivity • Advances in relevant AI techniques Examples of the earliest widely used systems • Recommendations provided by Amazon.com’s Web sites • Help provided by Microsoft’s Office Assistant Examples of companies that offer personalization software • NetPerceptions • BroadVision • ... (Go to Google and type "personalization") 8 9 Functions / Overview 10 Functions / Overview Overview of Functions of Adaptation 9 1. Tailor information presentation 5. Give help 2. Recommend products or other objects 6. Adapt an interface 3. Help U to find information 7. Take over routine tasks 8. Support collaboration 4. Support learning 9. Other functions Questions About Functions of Adaptation (1) Task What is the task that U and/or S is performing? Normal division of labor In a nonadaptive system, which aspects of the task would be performed by U and by S, respectively? Division of labor with adaptation How does adaptivity lead to a change in the division of labor − or in other aspects of the performance of the task? Adaptive decisions of S What decisions does S typically make in adapting to U? 10 11 11 Functions / Overview 12 Questions About Functions of Adaptation (2) Relevant properties of U What are the main properties of U that S takes into account when deciding how to adapt? Potential benefits What are the main potential benefits of this type of adaptation? Limitations What are the main difficulties or disadvantages? Tailor Information Presentation Figure 3 of Hirst, G., DiMarco, C., Hovy, E., & Parsons, K. (1997). Authoring and generating health-education documents that are tailored to the needs of the individual patient. In A. Jameson, C. Paris, & C. Tasso (Eds.), User modeling: Proceedings of the Sixth International Conference, UM97 (pp. 107−118).Wien, New York: Springer Wien New York. (http://www.cs.uni-sb.de/UM97/): the HealthDoc system Tailoring Medical Information Presentation 12 13 Hohl, H., Böcker, H., & Gunzenhauser, R. (1996). Hypadapter: An adaptive hypertext system for exploratory learning and programming. User Modeling and User-Adapted Interaction, 6, 131−156. 13 Functions / Tailor Information Presentation Tailoring Educational Presentations (1) Two different descriptions of the same LISP function for a beginner (left) and an expert (right) Tailoring Educational Presentations (2) Figure 6 of Hohl et al. (1996) 14 The size and availability of the listed concepts depend on S’s assessment of U’s knowledge 14 15 Functions / Tailor Information Presentation 16 Discussion (1) 15 Task S presents information to U about some topic Normal division of labor S (generates and) presents the same information in the same form to all users Division of labor with adaptation S may generate a different presentation for each user Decisions of S What information should S present? In what form should S present it (e.g., with what media)? Discussion (2) 16 Relevant properties of U Interests Knowledge Influences U’s need to see particular information and her ability to understand it Presentation preferences E.g., for media, for concrete vs. 
abstract forms of presentation Potential benefits Greater comprehensibility, relevance, and enjoyment for U Limitations Presentations become less consistent over occasions and different users Whether such inconsistency is a problem depends on the situation 17 Functions / Recommend Products or Other Objects 18 Recommend Products or Other Objects Adaptive Route Advisor Figure 3 from Rogers, S., Fiechter, C., & Langley, P. (1999). An adaptive interactive agent for route advice. Proceedings of the Third International Conference on Autonomous Agents, Seattle, WA. 17 Discussion (1) Task Find objects suitable for use by U Normal division of labor S provides object descriptions U searches through them, evaluating the objects Division of labor with adaptation U provides some indication of evaluation criteria S recommends suitable objects Decisions of S What objects to recommend 18 P1: VTL/RKB P2: PMR/TKL Machine Learning P3: PMR/TKL KL459-02-Pazzani 19 QC: PMR/AGR June 17, 1997 T1: PMR 17:23 Functions / Recommend Products or Other Objects 20 Discussion (2) 19 Relevant properties of U U’s evaluation criteria Potential benefits U saves time and effort searching U covers a larger number of potential objects Limitations U’s evaluation criteria may be hard to assess adequately in the limited time available S can more easily be designed to offer biased presentations Help U to Find Information Figure 2 from Pazzani, M., & Billsus, D. (1997). Learning and revising user profiles: The identification of interesting web sites. Machine Learning, 27, 313−331. Recommendations of Syskill & Webert Nonrecommended page Recommended page Pages U has already rated 20 21 Functions / Help U to Find Information 22 Discussion (1) 21 Task Finding information relevant to U in a large collection of information Normal division of labor Browsing: U navigates within a hypertext system Query-based information retrieval (IR): S retrieves items matching U’s query U finds relevant items among those retrieved Division of labor with adaptation Browsing: U navigates, but S gives advice about where to look Query-based IR: S’s choice and presentation of retrieved items take into account information about U that goes beyond U’s query Discussion (2) Adaptive decisions of S Browsing: What links should S recommend? Query-based IR: What items should S present, in what order? Relevant properties of U U’s current information need, goals, and general interests Potential benefits Browsing: Better decisions by U about what links to pursue Query-based IR: Higher recall and precision Limitations If S’s support is of poor quality, U may miss information that she otherwise would have found 22 23 Functions / Support Learning 24 Support Learning The APT Lisp Tutor Interface Corbett, A. T., & Anderson, J. R. (1995). Knowledge tracing: Modelling the acquisition of procedural knowledge. User Modeling and User-Adapted Interaction, 4, 253−278.; Corbett, A. T., & Bhatnagar, A. (1997). Student modeling in the ACT programming tutor: Adjusting a procedural learning model with declarative knowledge. In A. Jameson, C. Paris, & C. Tasso (Eds.), User modeling: Proceedings of the Sixth International Conference, UM97 (pp. 243−254). Wien, New York: Springer Wien New York. 
(http://www.cs.uni-sb.de/UM97/) 23 Discussion (1) 24 Task S acquires knowledge and/or skills in some topic area Normal division of labor S provides information, exercises, tests, hints, and feedback U processes this material and learns Division of labor with adaptation S performs basically the same subtasks but more strongly dependent on properties of the individual U Adaptive decisions of S What information, exercises or tests should S present or recommend next? What hints are appropriate at this point? How should U’s state of knowledge be described to U? 25 Functions / Support Learning 26 Discussion (2) 25 Relevant properties of U Prior knowledge and knowledge acquired during interaction Learning style, motivation, viewpoints, ... Current specific goals and beliefs (e.g., misconceptions) Potential benefits Faster, better, and/or more enjoyable learning by U Limitations Decision-relevant information about the specific U may be too difficult to obtain Decisions by S may not need to depend on such information E.g., S can teach a correct method without knowing what incorrect method U is currently employing Give Help http://www.research.microsoft.com/research/dfg/horvitz/lum.htm. Cf. Horvitz, E., Breese, J., Heckerman, D., Hovel, D., & Rommelse, K. (1998). The Lumière project: Bayesian user modeling for inferring the goals and needs of software users. In G. F. Cooper & S. Moral (Eds.), Uncertainty in Artificial Intelligence: Proceedings of the Fourteenth Conference (pp. 256−265).San Francisco: Morgan Kaufmann. (http://www.sis.pitt.edu/~dsl/UAI.html) Active Help From the Office Assistant 26 27 Functions / Give Help 28 Discussion (1) 27 Task U finds out how to use an application successfully and efficiently Normal division of labor U searches through documentation and engages in trial and error Division of labor with adaptation U does basically the same things, but S gives hints to expedite the process Adaptive decisions of S What hints should S offer? Should S intervene now at all? Relevant properties of U Knowledge of the application Current goal General desire to be helped by S Discussion (2) 28 Potential benefits U finds relevant functions and documentation more quickly and with less frustration U discovers aspects of the application that U would otherwise never have known about Limitations Reasonably accurate assessment by S can be very difficult The relevant properties are usually reflected only indirectly in U’s behavior U’s current goal can change quickly Interventions by S that aren’t helpful can be distracting and misleading Dey, A. K., Abowd, G. D., & Wood, A. (1998). Cyberdesk: A framework for providing self-integrating context-aware services. Knowledge-Based Systems, 11, 3−13.(http://www.cc.gatech.edu/fce/cyberdesk) Figure 4 of Dey, A. K., Abowd, G. D., & Wood, A. (1998). Cyberdesk: A framework for providing self-integrating context-aware services. Knowledge-Based Systems, 11, 3−13. 
(http://www.cc.gatech.edu/fce/cyberdesk) 29 29 Functions / Adapt an Interface A Context-Aware Interface (2) U’s current work context is in itself a "property" of U that should be taken into account when adaptation decisions are made 30 Adapt an Interface A Context-Aware Interface (1) CyberDesk actively offers U different applications and specific functions, depending on U’s current work context 30 31 Functions / Adapt an Interface 32 Discussion (1) 31 Task U works with one or more applications through a given interface Normal division of labor U learns to deal with the given interface, perhaps adapting it explicitly − if it is an adaptable interface (to be distinguished from an adaptive one) Division of labor with adaptation S modifies the interface to make it more suitable for U, perhaps after consulting U for approval Adaptive decisions of S How should input methods be adjusted? E.g., keyboard properties What functions should be made available or highlighted? How should aspects of the display be adjusted? E.g., colors, fonts Discussion (2) Relevant properties of U Noncognitive skills (e.g., motor or visual disabilities) Tendency or ability to use particular functions Potential benefits Saving of time, effort, errors, and frustration in using the interface Avoidance of need to specify interface adaptations explicitly U may not know what an appropriate specification would be E.g., optimal delay before the keyboard starts repeating Specifications might have to be made in each individual case, as opposed to just once and for all E.g., suitable choice of fonts and colors for each individual graph 32 33 Functions / Adapt an Interface 34 Discussion (3) 33 Limitations Adaptation of interface elements can decrease the consistency and predictability of the interface These properties are prerequisites for rapid automatic processing by a skilled user Take Over Routine Tasks Mitchell, T., Caruana, R., Freitag, D., McDermott, J., & Zabowski, D. (1994). Experience with a learning personal assistant. Communications of the ACM, 37(7), 81−91. Advice from the Calendar Apprentice Situation • U is adding a planned meeting to her calendar • U has already specified the meeting type, the attendees, and the date • U is about to specify the duration Adaptation • S proposes a duration of 60 minutes on the basis of previous observations of U’s scheduling behavior • U can accept or reject S’s recommendation 34 35 35 Functions / Take Over Routine Tasks 36 Discussion (1) Task U performs some regularly recurring task that requires many specific decisions and/or operations The task may involve devices other than traditional computers (e.g., cars, home heating systems) Normal division of labor U makes the decisions and performs the operations individually Division of labor with adaptation Strong form: S makes the decisions and performs the operations on U’s behalf Weak form: S recommends specific decisions and offers to perform particular operations Adaptive decisions of S What action would U want to perform in this case? Would U want me to perform this action without consultation? Discussion (2) 36 Relevant properties of U Preferences, interests, goals, etc. 
Note: Most systems in this category don’t represent such properties explicitly; instead, they learn behavioral regularities of U Potential benefits U saves time and effort Limitations Having S perform actions without consulting U can have serious negative consequences in individual cases Unlike a human assistant, S may not be able to recognize exceptional cases that require U’s attention When S does consult U, the savings of time and effort for U may be limited 37 Functions / Support Collaboration 38 Support Collaboration Identification of Suitable Collaborators Figure 2 of Collins, J. A., Greer, J. E., Kumar, V. S., McCalla, G. I., Meagher, P., & Tkatch, R. (1997). Inspectable user models for just-in-time workplace training. In A. Jameson, C. Paris, & C. Tasso (Eds.), User modeling: Proceedings of the Sixth International Conference, UM97 (pp. 327−337). Wien, New York: Springer Wien New York. (http://www.cs.uni-sb.de/UM97/) 37 Discussion (1) 38 Task U identifies persons who can work with U on a given task Normal division of labor U considers potential collaborators and assesses their suitability Division of labor with adaptation S compares a characterization of U with characterizations of potential collaborators and suggests good matches Adaptive decisions of S Who would be a good collaborator? Relevant properties of U Knowledge, interests, ... Willingness to collaborate, availability, ... 39 Functions / Support Collaboration 40 Discussion (2) 39 Potential benefits U can save time and find collaborators that couldn’t be identified otherwise Limitations Because of the greater accessibility of models of users, threats to privacy may be greater Especially valuable collaborators may be requested too often Other Functions Figure 1 of Noh, S., & Gmytrasiewicz, P. J. (1997). Agent modeling in antiair defense: A case study. In A. Jameson, C. Paris, & C. Tasso (Eds.), User modeling: Proceedings of the Sixth International Conference, UM97 (pp. 389−400).Wien, New York: Springer Wien New York. (http://www.cs.uni-sb.de/UM97/) Agent Modeling for Coordination C A Missile B E D Interceptor F Defense Unit 1 2 1 2 When deciding which incoming missiles to try to intercept, Agent 1 should anticipate the corresponding decisions of Agent 2; and vice-versa 40 41 Properties / Overview 42 Properties / Overview 41 Overview of Properties Modeled 1. Personal characteristics 2. General interests and preferences 3. Proficiencies 4. Noncognitive abilities 5. Current goal 6. Beliefs 7. Behavioral regularities 8. Psychological states 9. Context of interaction Typical Relationships Among Properties Personal Characteristic 1 Personal Characteristic 2 Knowledgeability About Topic General Interest Long-Term Goal Specific Belief Specific Evaluation Short-Term Goal User Action 1 "A −> B" = "A influences B" User Action 2 Noncognitive Ability User Action 3 42 43 Properties / Overview 44 Questions About Properties 43 Breadth of implications Is the property P relevant to a variety of decisions of S or only to a single type of decision? Directness of decision-relevance Can S take P into account directly when making a decision, or does S first have to make (uncertain) inferences on the basis of P? Ease of assessment How difficult is it in general to arrive at a reliable assessment of P? Personal Characteristics Figure 3 from Hirst, G., DiMarco, C., Hovy, E., & Parsons, K. (1997). Authoring and generating health-education documents that are tailored to the needs of the individual patient. In A. Jameson, C. Paris, & C. 
Tasso (Eds.), User modeling: Proceedings of the Sixth International Conference, UM97 (pp. 107−118).Wien, New York: Springer Wien New York. (http://www.cs.uni-sb.de/UM97/) Personal Characteristics and Text Generation 44 References to personal characteristics 45 Properties / Personal Characteristics 46 Discussion 45 Breadth of implications Often quite high (e.g., for age, gender) Directness of decision-relevance Personal characteristics sometimes have direct, important consequences E.g., what type of clothes a customer might want to buy But inferences about properties like preferences and knowledge are often unreliable Ease of assessment The necessary information is often already available or easily supplied by U Inferring personal characteristics on the basis of indirect evidence is in general difficult General Interests and Preferences Figure 3 from Rogers, S., Fiechter, C., & Langley, P. (1999). An adaptive interactive agent for route advice. Proceedings of the Third International Conference on Autonomous Agents, Seattle, WA. Modeling of Evaluation Criteria In the Adaptive Route Advisor, the preferences of each user U are represented as a vector of importance weights; this vector is used to predict how U would evaluate any given route if she knew all of its attributes 46 47 Properties / General Interests and Preferences 48 Discussion 47 Breadth of implications Generally high Directness of decision-relevance Generally moderate Ease of assessment Explicit self-reports can be fairly useful (see below) Indirect inference is typically moderately difficult Proficiencies Figure 1 of Hohl, H., Böcker, H., & Gunzenhauser, R. (1996). Hypadapter: An adaptive hypertext system for exploratory learning and programming. User Modeling and User-Adapted Interaction, 6, 131−156. Proficiencies in Hypadapter Specific proficiencies General proficiency 48 49 Properties / Proficiencies 50 Discussion 49 Breadth of implications The more general the proficiency, the broader the implications Directness of decision-relevance The more general the proficiency, the less direct the implications Ease of assessment Self-reports can be of some use (see below) The more general the proficiency, the greater the amount of available indirect evidence But this evidence may not be representative of the overall proficiency E.g., S wants to assess U’s knowledge of Linux, but the available evidence concerns mainly the use of LaTeX and Ghostview Noncognitive Abilities Figure 1 from Morales, R., & Pain, H. (1999). Modelling of novices’ control skills with machine learning. In J. Kay (Ed.), UM99, User modeling: Proceedings of the Seventh International Conference (pp. 159−168).Wien, New York: Springer Wien New York. (http://www.cs.usask.ca/UM99/) Modeling Pole-Balancing Skills a 0 x F Morales and Pain use machine learning techniques to model a learner’s ability to balance a pole on a cart 50 51 Properties / Noncognitive Abilities 52 Discussion 51 Breadth of implications Ranges from very high (e.g., generally poor eyesight) to very low (e.g., specific motoric impairment) Directness of decision-relevance Typically quite high Ease of assessment Self-reports are in some cases possible, but often U wouldn’t know how to characterize a noncognitive ability Reliable objective tests are often available Current Goal Figure 1 from Heckerman, D., & Horvitz, E. (1998). Inferring informational goals from free-text queries: A Bayesian approach. In G. F. Cooper & S. 
Moral (Eds.), Uncertainty in Artificial Intelligence: Proceedings of the Fourteenth Conference (pp. 230−237). San Francisco: Morgan Kaufmann. (http://www.sis.pitt.edu/~dsl/UAI.html)

Goal Identification With the Answer Wizard
Figure: U's free-text query is mapped onto a probability distribution over possible goals, p(Goal 1 | Query), ..., p(Goal n | Query)

Properties / Current Goal
Discussion
Breadth of implications
Relevant only to decisions that S makes in the current situation
Directness of decision-relevance
Generally high
Ease of assessment
Explicit specification by U is usually too inconvenient
Indirect inference is typically difficult
Observable actions may be compatible with various goals
U may be pursuing more than one goal at a time

Beliefs
Representation of Beliefs in BGP-MS
Figure 8 from Kobsa, A., & Pohl, W. (1995). The user modeling shell system BGP-MS. User Modeling and User-Adapted Interaction, 4, 59−106.

Higher-Order Beliefs and Coordination
Figure 3 of Noh, S., & Gmytrasiewicz, P. J. (1997). Agent modeling in antiair defense: A case study. In A. Jameson, C. Paris, & C. Tasso (Eds.), User modeling: Proceedings of the Sixth International Conference, UM97 (pp. 389−400). Wien, New York: Springer Wien New York. (http://www.cs.uni-sb.de/UM97/) Used with permission.
Figure: a hierarchy of payoff matrices over the interception options A-F of the two batteries, with Level 1 and Level 2 models weighted by belief values (0.75, 0.14, 0.11) and uniform "no-information" models
In the RMM approach, higher-order expectations about beliefs and values are represented in terms of a hierarchy of payoff matrices

Discussion
Breadth of implications
High or low, depending on the generality of the belief
Directness of decision-relevance
Can be high even for a general belief, if it has specific implications in the current situation
E.g., "Ink jet printers can't handle PostScript files"
Ease of assessment
Explicit self-reports would in general be unnatural and time-consuming
Indirect assessment is often tricky
Observable actions seldom unambiguously reflect particular beliefs
Particular beliefs are in general not easy to predict on the basis of general proficiencies and personal characteristics

Behavioral Regularities
Mitchell, T., Caruana, R., Freitag, D., McDermott, J., & Zabowski, D. (1994). Experience with a learning personal assistant. Communications of the ACM, 37(7), 81−91.
Rules Representing U's Scheduling Habits
Example of a rule learned by the Calendar Apprentice
If
• Position-of-attendees is Grad-Student, and
• Single-attendee? is Yes, and
• Sponsor-of-attendees is Mitchell
Then
Duration is 60 minutes
Comments
There is no reference to the reasons why such a rule may be applicable (or why it may be inapplicable in some cases), e.g.:
• the sort of topic that is typically discussed with graduate students
• Mitchell's attitude toward graduate students
• ...
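To make this kind of learned regularity concrete, here is a minimal sketch (in Python, for illustration only) of how a rule like the one above might be stored and applied to a new calendar entry. The conditions and the predicted duration are taken from the Calendar Apprentice example above; the data structures, attribute spellings, and function names are assumptions introduced here, not CAP's actual implementation.

# Minimal sketch: representing and applying one learned scheduling rule.
# The rule content mirrors the Calendar Apprentice example above; the code
# itself is illustrative and is not taken from CAP.

def rule_applies(conditions, meeting):
    """A rule applies if every attribute-value condition matches the meeting."""
    return all(meeting.get(attr) == value for attr, value in conditions.items())

learned_rule = {
    "conditions": {
        "Position-of-attendees": "Grad-Student",
        "Single-attendee?": "Yes",
        "Sponsor-of-attendees": "Mitchell",
    },
    "prediction": ("Duration", 60),  # minutes
}

# A calendar entry that U is currently filling in (invented example).
new_meeting = {
    "Position-of-attendees": "Grad-Student",
    "Single-attendee?": "Yes",
    "Sponsor-of-attendees": "Mitchell",
    "Day": "Tuesday",
}

if rule_applies(learned_rule["conditions"], new_meeting):
    attribute, value = learned_rule["prediction"]
    # S only proposes the value; U can accept or reject it.
    print(f"Suggested {attribute}: {value} minutes")

In CAP itself many such rules are learned and their competing predictions are ranked; here a single rule is simply matched literally.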
Excerpt from Table 1 of Linton, F., Joy, D., & Schaefer, H. (1999). Building user and expert models by long-term observation of application usage. In J. Kay (Ed.), UM99, User modeling: Proceedings of the Seventh International Conference (pp. 129−138). Wien, New York: Springer Wien New York. (http://www.cs.usask.ca/UM99/)

Profiles of Users' Command Usage

Rank  Menu    Command       Frequency (%)  Cumulative frequency (%)
 1    File    Open              13.68          13.68
 2    Edit    Paste             12.50          26.18
 3    File    Save              11.03          37.22
 4    File    DocClose          10.25          47.47
 5    Edit    Clear              9.50          56.97
 6    Edit    Copy               7.86          64.83
 7    Format  Bold               4.22          69.05
 ...
18    Tools   Spelling           0.75          88.19
19    File    PrintPreview       0.74          88.94
20    View    Header             0.68          89.62

A simple frequency profile of U's command usage can yield useful information if it is compared with the profiles of more expert users

Discussion
Breadth of implications
Low: The implications typically concern the particular action that U would perform in a given situation
Directness of decision-relevance
High, for the same reason
Ease of assessment
The inference methods are typically straightforward and reliable, but they require a large amount of behavioral data

Psychological States
Affective-Computing homepage: http://www.media.mit.edu/affect/
Transitions Among Emotional States
Within the Affective Computing paradigm, a variety of methods are used to recognize and predict a user's emotional state

Discussion
Breadth of implications
Relatively low, since the states are typically of limited duration
Directness of decision-relevance
Seldom very high
Even if S knows that U is frustrated, distracted, or rushed, it is seldom obvious how S should adapt its behavior accordingly
Ease of assessment
Can be fairly high if suitable sensors are used (see below)
More tricky if behavioral symptoms must be interpreted

Context of Interaction
Figure 5 of Dey, A. K., Abowd, G. D., & Wood, A. (1998). Cyberdesk: A framework for providing self-integrating context-aware services. Knowledge-Based Systems, 11, 3−13. (http://www.cc.gatech.edu/fce/cyberdesk)
Sensitivity to Other Applications Being Used
U's current work context is in itself a "property" of U that should be taken into account when adaptation decisions are made

Discussion
Breadth of implications
Implications are largely limited to the current situation
Directness of decision-relevance
Implications are often quite straightforward
Ease of assessment
Some aspects of context are more or less directly accessible to S, since they concern the state of the computing device
If appropriate sensors are available (see below), other aspects can be assessed straightforwardly
In other cases, difficult indirect inferences may be required

Input Types / Overview

Overview of Input Types
Explicit self-reports and -assessments
1. Self-reports on personal characteristics
2. Self-reports on proficiencies and interests
3. Evaluations of specific objects
Nonexplicit input
4. Responses to test items
5. Naturally occurring actions
6. Low-level measures of psychological states
7. Low-level measures of context

Questions About Input Types
Frequency
How often is the input of the type I typically acquired?
Additional activity
Does I require extra activity by U that would not be required if I were not needed?
Cognitive effort
How much thinking and memory retrieval does I require from U?
Motor effort
How much physical input activity (e.g., typing) does I require?
Reliability To what extent does I allow S to derive precise, accurate assessments of properties of U? Self-Reports on Personal Characteristics PC Self-Reports: Discussion Frequency Typically only at one or two points in a dialog Additional activity Usually, there’s extra activity But in some cases U might have to supply much of the same information anyway − e.g., to place an order Cognitive effort Generally little Motor effort Often a lot − unless easy copying from other information sources is supported 66 Reliability On the whole high But there may be distortions if sensitive issues are involved, or if seldom-used information has to be recalled from memory 67 Input Types / Self-Reports on Proficiencies and Interests 68 Self-Reports on Proficiencies and Interests Examples from Hypadapter Figure 1 from Hohl, H., Böcker, H., & Gunzenhauser, R. (1996). Hypadapter: An adaptive hypertext system for exploratory learning and programming. User Modeling and User-Adapted Interaction, 6, 131−156. 67 Discussion Frequency Typically only at one or two points in a dialog Additional activity There’s almost always extra activity Cognitive effort High, if U tries seriously to give a reliable assessment (see below) Motor effort Little 68 Reliability Generally low: U is unlikely to know the exact meanings of the reference points given U may be motivated to give an inaccurate self-assessment 69 Input Types / Evaluations of Specific Objects 70 Evaluations of Specific Objects Feedback to Syskill & Webert Figure 1 from Pazzani, M., & Billsus, D. (1997). Learning and revising user profiles: The identification of interesting web sites. Machine Learning, 27, 313−331. 69 U can click on the thumbs to indicate interest or disinterest in the page being shown Discussion Frequency Typically, many evaluations are given Additional activity Usually, there’s some extra activity but it’s not much of a burden (see below) Cognitive effort Low; usually a spontaneous reaction is expressed Motor effort Can be reduced to a single mouse click Reliability Fairly high The same problems as for general interests may arise to some extent, but the concreteness of the judgment makes it more reliable 70 71 Input Types / Responses to Test Items 72 Responses to Test Items http://www.c-lab.de/~gutkauf/um97.html; Gutkauf, B., Thies, S., & Domik, G. (1997). A user-adaptive chart editing system based on user modeling and critiquing. In A. Jameson, C. Paris, & C. Tasso (Eds.), User modeling: Proceedings of the Sixth International Conference, UM97 (pp. 159−170). Wien, New York: Springer Wien New York. 
(http://www.cs.uni-sb.de/UM97/) 71 Example: Color Discrimination Test In the graph critiquing system of Gutkauf et al., users can take a color discrimination test S can then take any limitations into account when advising about graph design Discussion Frequency Tests can be administered at a single point, or continually (e.g., in educational systems) Additional activity Usually there’s extra activity If the test items also serve as practice items, they may be worth dealing with in any case Cognitive and motor effort Depends on the nature of the test Reliability Generally high: Test items can be selected, administered, and interpreted through well-understood, reliable procedures 72 73 Input Types / Naturally Occurring Actions 74 Naturally Occurring Actions Discussion 73 Frequency There may be a great number of relevant natural actions − or only a few if a very specific type is involved Additional activity Reliability Naturally occurring actions can be reliable indicators of specific beliefs and attitudes, but less so of generally relevant properties of U Usually there’s no extra activity Sometimes, ostensibly natural actions are invoked by disguised tests Cognitive and motor effort Can vary greatly; but not very important, since the actions will tend to be performed anyway Low-Level Measures of Psychological States Affective-Computing homepage: http://www.media.mit.edu/affect/ Example: Jewelry for Affective Computing 74 75 Input Types / Low-Level Measures of Psychological States 76 Discussion 75 Reliability Frequency Where the necessary sensors are available, a large amount of input data can be acquired Additional activity Since most people wouldn’t wear such sensors otherwise, any associated effort is usually extra effort Fairly high for some psychological states that are closely associated with measurable symptoms Cognitive effort Typically none, since U doesn’t have to give any explicit input Motor effort Sensors can be cumbersome, but technological advances are continually reducing the burden Low-Level Measures of Context Use of Position Information CyberDesk can take into account a mobile user’s position, as sensed via GPS 76 77 Input Types / Low-Level Measures of Context 78 Discussion 77 Reliability Frequency The amount of data that can be acquired is limited only by the technological possibilities and the amount of available information that is worth collecting Depends on the degree of precision required and on the available technology Additional activity; cognitive and motor effort Some information about context may be required from U, but technological advances make it increasingly possible to acquire it without effort from U Inference / Overview Overview of Main Inference Techniques 1. Decision-theoretic methods 2. Machine learning • Rule learning • Perceptron-style learning • Learning of probabilities • Instance-based learning 3. Logic 4. Stereotypes 78 79 79 Inference / Overview 80 Questions About Inference Techniques Knowledge acquisition effort How much effort is required to provide S with the necessary background knowledge about users in general? Computational complexity How much time (and other computational resources) does T require? Amount of input data required How much input data does T require in order to be able to make useful inferences? Handling of noise and uncertainty How adequately can T deal with the uncertainty that is inherent in most inferences about a user? 
Validity and justifiability To what extent can the inferences made through T be counted on to be accurate? Bayesian Networks Introduction (1) 80 Basic concepts Each chance node in a Bayesian network (BN) corresponds to a variable In user-adaptive systems, most variables refer to properties of U These may be observable or unobservable S’s belief about each variable is represented as a probability distribution over the possible values of the variable Links between nodes typically (though not always) represent causal influences When the value of a given variable is observed, the corresponding node is instantiated The evaluation of the network then leads to revised beliefs about the other variables 81 82 Introduction (2) 81 Horvitz, E., & Barry, M. (1995). Display of information for time-critical decision making. In P. Besnard & S. Hanks (Eds.), Uncertainty in Artificial Intelligence: Proceedings of the Eleventh Conference (pp. 296−305).San Francisco: Morgan Kaufmann. Inference / Bayesian Networks Extension to Influence Diagrams An influence diagram contains (in addition to chance nodes) decision nodes and value nodes Through the evaluation of an influence diagram, S can determine which decision is likely to yield the highest value Influence diagrams have so far seldom been employed in user-adaptive systems (but see. e.g., Horvitz & Barry, 1995; Jameson et al., 2000, cited below). Introduction (3) Learning of Bayesian networks from data In contrast to the machine learning techniques discussed below, the learning of Bayesian networks has so far been restricted almost exclusively to networks that apply to users in general Therefore, the learning does not in itself involve adaptation to a particular user; it only sets the stage for such adaptation Examples Albrecht, D. W., Zukerman, I., & Nicholson, A. E. (1998). Bayesian models for keyhole plan recognition in an adventure game. User Modeling and User-Adapted Interaction, 8, 5−47. Lau, T., & Horvitz, E. (1999). Patterns of search: Analyzing and modeling Web query dynamics. In J. Kay (Ed.), UM99, User modeling: Proceedings of the Seventh International Conference (pp. 119−128).Wien, New York: Springer Wien New York. (http://www.cs.usask.ca/UM99/) 82 83 Inference / Bayesian Networks 84 Introduction (4) 83 References Overview of applications in user-adaptive systems Jameson, A. (1996). Numerical uncertainty management in user and student modeling: An overview of systems and issues. User Modeling and User-Adapted Interaction, 5, 193−251. General books Pearl, J. (1988). Probabilistic reasoning in intelligent systems: Networks of plausible inference. San Mateo, CA: Morgan Kaufmann. Jensen, F. V. (1996). An introduction to Bayesian networks. New York: Springer. Castillo, E., Gutierrez, J. M., & Hadi, A. S. (1997). Expert systems and probabilistic network models. Berlin: Springer. Figure 7 of Heckerman, D., & Horvitz, E. (1998). Inferring informational goals from free-text queries: A Bayesian approach. In G. F. Cooper & S. Moral (Eds.), Uncertainty in Artificial Intelligence: Proceedings of the Fourteenth Conference (pp. 230−237). San Francisco: Morgan Kaufmann. (http://www.sis.pitt.edu/~dsl/UAI.html) Figures from this article reproduced with permission. Answer Wizard (1) 84 85 Inference / Bayesian Networks Answer Wizard (2) 85 Figure 3 of Heckerman, D., & Horvitz, E. (1998). Inferring informational goals from free-text queries: A Bayesian approach. In G. F. Cooper & S. 
Moral (Eds.), Uncertainty in Artificial Intelligence: Proceedings of the Fourteenth Conference (pp. 230−237). San Francisco: Morgan Kaufmann. (http://www.sis.pitt.edu/~dsl/UAI.html)
Figure: overall goal of the Answer Wizard, mapping a Query onto p(Goal 1 | Query), ..., p(Goal n | Query)
Figure: simplified conception of the relationships between U's current goal and the terms in U's query (a Goal node linked to term nodes Term 1, Term 2, Term 3, ..., Term m)
Inference in the Answer Wizard
WORDS IN U'S QUERY → (Bayesian updating) → U'S CURRENT GOAL

Overview of Lumière
Figure 1 of Horvitz, E., Breese, J., Heckerman, D., Hovel, D., & Rommelse, K. (1998). The Lumière project: Bayesian user modeling for inferring the goals and needs of software users. In G. F. Cooper & S. Moral (Eds.), Uncertainty in Artificial Intelligence: Proceedings of the Fourteenth Conference (pp. 256−265). San Francisco: Morgan Kaufmann. (http://www.sis.pitt.edu/~dsl/UAI.html) Figures from this article reproduced with permission.
Figure: the Lumière architecture, with components labeled Actions (t), Query, Event synthesis, Events, Words, Inference, and Control system

Influence Diagram from Lumière
Figure 6 of Horvitz et al. (1998)
Figure: an influence diagram with nodes including User's background, Competency profile, Task history, Context, Documents & data structures, User's goals, User's acute needs, Explicit query, User's actions, Assistance history, Automated assistance, Cost of assistance, and Utility

Bayesian Network from Lumière
Figure 2 of Horvitz et al. (1998)
Figure: a network fragment with nodes such as Recent menu surfing, Difficulty of current task, User expertise, User needs assistance, Pause after activity, and User distracted

Inference in Lumière
U'S ACTIONS → (propagation in Bayesian network) → U'S CURRENT GOAL, OTHER CURRENT STATES, AND LONGER-TERM PROPERTIES → (Bayesian net propagation / influence diagram evaluation) → U'S DESIRE FOR HELP / UTILITY OF OFFERING U HELP

For more detailed discussion of these issues, see Jameson, A. (1996). Numerical uncertainty management in user and student modeling: An overview of systems and issues. User Modeling and User-Adapted Interaction, 5, 193−251.
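As a concrete illustration of the Bayesian updating summarized above, the following toy script computes a distribution over U's possible goals from the words of a free-text query, in the spirit of the Answer Wizard's simplified goal-term model. The goals, prior probabilities, and conditional probabilities below are invented for illustration; the real Answer Wizard and Lumière models are far larger and were built from authored help topics and observed usage.

# Toy sketch of inferring p(goal | query words) by Bayesian updating,
# assuming (as in a naive Bayes model) that the words are conditionally
# independent given the goal. All numbers are invented for illustration.

priors = {"print a document": 0.5, "format a chart": 0.3, "set page margins": 0.2}

# p(word occurs in the query | goal); unlisted words get a small default value.
p_word_given_goal = {
    "print a document": {"print": 0.8, "printer": 0.6, "page": 0.3},
    "format a chart":   {"chart": 0.7, "format": 0.5, "axis": 0.4},
    "set page margins": {"page": 0.6, "margin": 0.7, "format": 0.3},
}
DEFAULT = 0.01

def posterior_over_goals(query_words):
    """Return the normalized distribution p(goal | query words)."""
    scores = {}
    for goal, prior in priors.items():
        likelihood = 1.0
        for word in query_words:
            likelihood *= p_word_given_goal[goal].get(word, DEFAULT)
        scores[goal] = prior * likelihood
    total = sum(scores.values())
    return {goal: score / total for goal, score in scores.items()}

print(posterior_over_goals(["print", "page"]))
# The most probable goals can then be offered to U as candidate help topics.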
Inference / Bayesian Networks

Discussion (1)
Knowledge acquisition
Arriving at a suitable network structure and accurate probabilities can require a substantial knowledge-acquisition effort
Recently, researchers in this area have begun applying general techniques for the learning of Bayesian networks from data
Computational complexity
In general, evaluation of Bayesian networks is NP-hard
In practice, some designers of user-adaptive systems have experienced considerable difficulty with these complexity problems, investing a lot of effort in finding ways to overcome them
But in more favorable cases, the complexity does not pose a significant problem

Discussion (2)
Amount of input data required
If the networks are learned from data − usually off-line − a large amount of data is required
Inferences about a specific user can be made on the basis of a single piece of unreliable evidence
Handling of noise and uncertainty
Bayesian networks are especially good at appropriately interpreting individual pieces of evidence that permit only uncertain inference
Note that even an uncertain belief about a user may uniquely determine an appropriate system action

Discussion (3)
Validity and justifiability
If the qualitative and quantitative aspects of the networks are correct, the inferences made will be valid (though typically they will yield explicitly uncertain conclusions)
Shortcuts taken to simplify knowledge acquisition or to deal with complexity problems may impair the validity of inferences in unpredictable ways

Machine Learning, Introduction

Introduction (1)
Basic concepts
A machine learning (ML) system comprises a learning component and a performance component
In the context of user-adaptive systems,
• the learning component, on the basis of data about a particular user U (and/or about other users), infers the knowledge that is required by the performance component;
• the performance component is typically used to make decision-relevant predictions about a particular user U
The following subsections illustrate the ML methods that have been used most in user-adaptive systems

Introduction (2)
General References
Overviews of use for user-adaptive systems
Langley, P. (1999). User modeling in adaptive interfaces. In J. Kay (Ed.), UM99, User modeling: Proceedings of the Seventh International Conference. Wien, New York: Springer Wien New York. (http://www.cs.usask.ca/UM99/)
Sison, R., & Shimura, M. (1998). Student modeling and machine learning. International Journal of Artificial Intelligence in Education, 9, 128−158.
Machine learning texts
Langley, P. (1996). Elements of machine learning. San Francisco: Morgan Kaufmann.
Mitchell, T. M. (1997). Machine learning. Boston: McGraw-Hill.

ML, Rule Learning
"CAP" = "Calendar APprentice". Cf. Mitchell, T., Caruana, R., Freitag, D., McDermott, J., & Zabowski, D. (1994). Experience with a learning personal assistant. Communications of the ACM, 37(7), 81−91.
Inference in CAP
OBSERVED SCHEDULING DECISIONS OF U → (decision tree learning) → RULES FOR PREDICTING U'S DECISIONS → (rule application) → PREDICTED DECISIONS OF U

Learned Rules
Example of a learned rule (Mitchell et al., 1994):
If
• Position-of-attendees is Grad-Student, and
• Single-attendee? is Yes, and
• Sponsor-of-attendees is Mitchell
Then
Duration is 60
Applicability so far
Training data: 6/11
Test data: 51/86
Figure: example of a learned decision tree, with attribute tests such as Position (Student / Programmer / Faculty), Lunchtime? (Yes / No), and Department, leading to predictions such as DiningHall and W5309
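The following toy sketch illustrates the learning step on invented scheduling data. To stay short it builds only a one-level decision "stump" (the single attribute with the highest information gain, plus a majority-duration prediction per attribute value) rather than the full decision-tree-to-rules procedure that CAP uses; the attribute names, values, and training data are made up.

# Toy sketch: pick the attribute that best predicts meeting duration in U's
# past scheduling decisions, then form one rule per value of that attribute.
# This is a one-level "decision stump", not CAP's full learning procedure.
from collections import Counter, defaultdict
from math import log2

observed = [  # invented past scheduling decisions of U: attributes -> duration
    ({"Position": "Grad-Student", "Single-attendee?": "Yes"}, 60),
    ({"Position": "Grad-Student", "Single-attendee?": "Yes"}, 60),
    ({"Position": "Faculty",      "Single-attendee?": "No"},  30),
    ({"Position": "Faculty",      "Single-attendee?": "Yes"}, 30),
    ({"Position": "Programmer",   "Single-attendee?": "No"},  60),
]

def entropy(labels):
    counts = Counter(labels)
    return -sum((c / len(labels)) * log2(c / len(labels)) for c in counts.values())

def best_attribute(data):
    """Choose the attribute with the highest information gain for the duration."""
    base = entropy([duration for _, duration in data])
    best, best_gain = None, -1.0
    for attr in data[0][0]:
        groups = defaultdict(list)
        for attrs, duration in data:
            groups[attrs[attr]].append(duration)
        remainder = sum(len(g) / len(data) * entropy(g) for g in groups.values())
        if base - remainder > best_gain:
            best, best_gain = attr, base - remainder
    return best

attr = best_attribute(observed)
rules = {}
for value in {attrs[attr] for attrs, _ in observed}:
    durations = [d for attrs, d in observed if attrs[attr] == value]
    rules[value] = Counter(durations).most_common(1)[0][0]  # majority duration
print(attr, rules)

On these invented data the chosen attribute is Position, yielding rules such as "if Position is Grad-Student, predict a duration of 60 minutes", which is the shape of the CAP rule shown above.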
ML, Perceptron-Style Learning
Cf. Rogers, S., Fiechter, C., & Langley, P. (1999). An adaptive interactive agent for route advice. Proceedings of the Third International Conference on Autonomous Agents, Seattle, WA.
Inference in the Adaptive Route Advisor
U'S EXPRESSED PREFERENCES FOR SPECIFIC ROUTES → (perceptron-style learning) → U'S IMPORTANCE WEIGHTS FOR 5 ATTRIBUTES OF ROUTES → (graph search) → PRESUMABLY DESIRABLE ROUTES FOR U

ML, Probability and Instance-Based Learning
NewsDude Interface
Billsus, D., & Pazzani, M. J. (1999). A hybrid user model for news story classification. In J. Kay (Ed.), UM99, User modeling: Proceedings of the Seventh International Conference (pp. 99−108). Wien, New York: Springer Wien New York. (http://www.cs.usask.ca/UM99/)
Inference in NewsDude: Long-Term
RATINGS OF HEARD STORIES → (learning of conditional probabilities) → CONDITIONAL PROBABILITIES FOR WORDS → (naive Bayesian classification) → INTEREST IN INDIVIDUAL STORIES
Modeling Long-Term Interests
Figure: naive Bayesian classification in NewsDude, with a STORY INTERESTING? node linked to word nodes WORD 1? ... WORD 200?; cf. Syskill & Webert (Pazzani & Billsus, 1997) and the Answer Wizard (Heckerman & Horvitz, 1998)
Inference in NewsDude: Short-Term
RATINGS OF HEARD STORIES → (storage of stories in vector space) → PREVIOUSLY HEARD STORIES AND THEIR RATINGS → (nearest-neighbor prediction) → INTEREST IN INDIVIDUAL STORIES
Use of Nearest Neighbor Algorithm
Figure: a new story (?) is classified on the basis of the ratings (+ or −) of the previously rated stories that lie nearest to it in the vector space

ML, Collaborative Filtering
Inference in Collaborative Filtering Systems
U'S RATINGS OF OBJECTS → (computation of correlations with other users' ratings) → SIMILARITY TO PARTICULAR OTHER USERS → (weighted averaging of ratings of others) → PREDICTED RATINGS OF U
Billsus, D., & Pazzani, M. (1998). Learning collaborative information filters. In J. Shavlik (Ed.), Machine learning: Proceedings of the Fifteenth International Conference. San Francisco, CA: Morgan Kaufmann.
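A minimal sketch of the correlation-based scheme summarized above: U's ratings are correlated with those of other users, and a rating for an object U has not evaluated is predicted as a correlation-weighted combination of the other users' (mean-centered) ratings. The ratings, user names, and item names are invented; this represents the "traditional" style of collaborative filtering whose limitations the next slide discusses.

# Toy sketch of correlation-based collaborative filtering.
# All ratings are invented for illustration.
from math import sqrt

ratings = {  # user -> {object: rating on a 1-5 scale}
    "U":     {"item1": 5, "item2": 1, "item3": 4},
    "Alice": {"item1": 4, "item2": 2, "item3": 5, "item4": 4},
    "Bob":   {"item1": 1, "item2": 5, "item3": 2, "item4": 1},
}

def pearson(a, b):
    """Pearson correlation over the objects that users a and b have both rated."""
    common = [i for i in ratings[a] if i in ratings[b]]
    if len(common) < 2:
        return 0.0
    xs = [ratings[a][i] for i in common]
    ys = [ratings[b][i] for i in common]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sqrt(sum((x - mx) ** 2 for x in xs)) * sqrt(sum((y - my) ** 2 for y in ys))
    return num / den if den else 0.0

def predict(user, item):
    """Weighted average of the other users' mean-centered ratings of the item."""
    user_mean = sum(ratings[user].values()) / len(ratings[user])
    num = den = 0.0
    for other in ratings:
        if other == user or item not in ratings[other]:
            continue
        sim = pearson(user, other)
        other_mean = sum(ratings[other].values()) / len(ratings[other])
        num += sim * (ratings[other][item] - other_mean)
        den += abs(sim)
    return user_mean + num / den if den else user_mean

print(predict("U", "item4"))  # predicted rating for an object U has not yet rated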
105 Inference / ML, Collaborative Filtering 106 Progress With Collaborative Filtering Problems with traditional methods • Lack of sound theoretical basis • Information loss because of limited overlap in ratings Approach using better-understood machine learning methods • Treatment of problem as one of classification learning • Reduction of dimensionality so that overlap in specific ratings is less important • Demonstrable improvement in precision over traditional methods ML, General Discussion Discussion (1) 106 Knowledge acquisition effort In general relatively little The systems typically start with very few assumptions about users in general, learning everything they need to know from the input data But it is important to frame the learning problem carefully Computational complexity Complexity ranges from negligible to considerable; it seems in general not to pose a major problem Since learning often takes place over a number of sessions, some computation can often be performed off-line, between sessions Amount of input data required Typically, a large amount, preferably collected over a number of sessions with the user in question Exceptions • instance-based learning, as in NewsDude • methods which exploit data collected from other users 107 107 Inference / ML, General Discussion 108 Discussion (2) Handling of noise and uncertainty The methods applied so far to user-adaptive systems take into account the noisiness of the data The conclusions derived are often associated with an indication of S’s confidence; but the representation of such uncertainty is less explicit and sophisticated than for Bayesian networks Validity and justifiability Confidence in the applicability of the techniques is in general based on empirical tests of their performance in the specific domain in question Machine learning researchers have conducted more thorough empirical tests than other researchers into user-adaptive systems There is in general no a priori reason to be confident that the technique will achieve a particular level of performance in a given setting Inspection of the individual learned rules, probabilities, weights, etc., does not in general permit an assessment of their validity Logic Introduction (1) 108 The role of logic in user-adaptive systems The most salient strength of logic is the possibility of using it to • represent beliefs of U in a precise and differentiated way; • make inferences on the basis of the content of these beliefs Logic can also be used to represent and make inferences about other aspects of a user But its suitability, relative to other representation and inference techniques, is less clear when U’s beliefs are not involved Accordingly, logic is typically used in conjunction with other approaches to representation and inference 109 Inference / Logic 110 Introduction (2) 109 General references Overview of applications to user-adaptive systems Part I of Pohl, W. (1998). Logic-based representation and reasoning for user modeling shell systems. Sankt Augustin: infix. Relevant books of broader scope Ballim, A., & Wilks, Y. (1991). Artificial believers: The ascription of belief. Hillsdale, NJ: Erlbaum. Fagin, R., Halpern, J. Y., Moses, Y., & Vardi, M. Y. (1995). Reasoning about knowledge. Cambridge, MA: MIT Press. Genesereth, M. R., & Nilsson, N. J. (1987). Logical foundations of artificial intelligence. Palo Alto, CA: Morgan Kaufmann. 
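Before the BGP-MS material that follows, here is a minimal sketch of the partition idea: S keeps the facts and rules it ascribes to U in a separate collection and computes their logical closure by forward chaining. The printer predicates anticipate the example on the following pages; the Horn-style rule format, the simplifying assumption that S ascribes to U the definite belief that LPT1 is an inkjet printer, and all names in the code are illustrative choices made here, not the BGP-MS mechanism itself.

# Minimal sketch: a partition of beliefs ascribed to U (ground facts plus a
# simple if-then rule), closed under forward chaining. Illustrative only.

u_believes = {                     # facts that S ascribes to U
    ("PS-document", "DOC1"),
    ("inkjet-printer", "LPT1"),    # simplification: a definite, not disjunctive, belief
}

def printable_rule(facts):
    """Ascribed general belief: PostScript documents are printable on inkjet (or PS) printers."""
    derived = set()
    for (p1, x) in facts:
        for (p2, y) in facts:
            if p1 == "PS-document" and p2 in ("PS-printer", "inkjet-printer"):
                derived.add(("printable-on", (x, y)))
    return derived

def closure(facts, rules):
    """Repeatedly apply the ascribed rules until no new beliefs can be derived."""
    facts = set(facts)
    while True:
        new = set()
        for rule in rules:
            new |= rule(facts) - facts
        if not new:
            return facts
        facts |= new

for belief in sorted(closure(u_believes, [printable_rule]), key=str):
    print("U is assumed to believe:", belief)
# The derived entry ('printable-on', ('DOC1', 'LPT1')) is the kind of
# decision-relevant belief that S can then act on.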
Inference / Logic

Logical Inference in BGP-MS
ACTIONS AND SELF-ASSESSMENTS OF U → (logical inference) → GENERALLY RELEVANT (POSSIBLY HIGHER-ORDER) BELIEFS AND GOALS OF U → (logical inference) → DECISION-RELEVANT BELIEFS AND GOALS OF U

U's Beliefs About the World (1)
U's specific beliefs:
DOC1 is a PostScript document: PS-document(DOC1)
LPT1 is either a laser printer or an inkjet printer: laser-printer(LPT1) ∨ inkjet-printer(LPT1)
U's general beliefs:
PostScript documents are printable on PostScript printers: ∀x ∀y (PS-document(x) ∧ PS-printer(y) → printable-on(x, y))
PostScript documents are printable on inkjet printers: ∀x ∀y (PS-document(x) ∧ inkjet-printer(y) → printable-on(x, y))
DOC1 is printable on LPT1: printable-on(DOC1, LPT1)
Jameson, A. (1995). Logic is not enough: Why reasoning about another person's beliefs is reasoning under uncertainty. In A. Laux & H. Wansing (Eds.), Knowledge and belief in philosophy and artificial intelligence (pp. 199−229). Berlin: Akademie Verlag.

U's Beliefs About the World (2)
Strengths and limitations of this approach
+ Complex beliefs can be represented (e.g., in first-order predicate calculus)
+ The use of modal operators can often be avoided if the statements are maintained in a partition of beliefs ascribed to U
+ The logical inferences that U can make on the basis of her beliefs can be computed straightforwardly
− The partition approach can't handle S's uncertain beliefs about U's beliefs
• Either U thinks LPT1 is a laser printer or U thinks LPT1 is an inkjet printer
• U probably knows that PostScript documents are printable on inkjet printers
− Inferences presuppose that the relevant background knowledge can be ascribed to U
There is often no direct evidence about U's background knowledge (cf. Jameson, 1995)

More Complex Intensional Ascriptions (1)
U's goals and desires
U wants to print DOC1 on LPT1: Want(U, printed-on(DOC1, LPT1))
If U wants to print a document on a printer, then U must believe that the document is printable on that printer: Want(U, printed-on(x, y)) → Bel(U, printable-on(x, y))
Therefore, U believes that DOC1 is printable on LPT1: Bel(U, printable-on(DOC1, LPT1))
General inference schemas
If U wants to do something and knows that it will have a particular consequence, then U wants that consequence: Want(U, p) ∧ Bel(U, p → q) → Want(U, q)

More Complex Intensional Ascriptions (2)
U's higher-order beliefs
U believes that S believes that U believes that DOC1 is printable on LPT1: Bel(U, Bel(S, Bel(U, printable-on(DOC1, LPT1))))
U believes that it is mutually believed that DOC1 is printable on LPT1: Bel(U, Mutually-Believed(printable-on(DOC1, LPT1)))
Strengths and limitations of this approach
+ Sophisticated representation and reasoning involving important concepts are possible
− Reasoning with modal logic can be computationally very complex
− Here again, the required premises are often unknown or uncertain

Discussion (1)
Knowledge acquisition effort
Systems that use logic sometimes presuppose many assumptions about users in general that actually require verification
Computational complexity
Complexity is mainly a problem in connection with reasoning about higher-order beliefs with modal logic
A good deal of effort has been devoted to finding ways of keeping complexity within reasonable limits
Amount of input data required
A logic-based system can make an inference on the basis of a single specific fact about U

Discussion (2)
Fagin, R., & Halpern, J. Y. (1994). Reasoning about knowledge and probability. Journal of the Association for Computing Machinery, 41, 340−367.
Handling of noise and uncertainty
Logic-based systems in this area have in general treated uncertainty management as a matter of secondary priority
A typical approach is to view inferences as being defeasible and to withdraw them when contradictory evidence is obtained, using truth-maintenance methods
Research integrating logic and probabilistic methods (see, e.g., Fagin & Halpern, 1994) has so far had little impact in the area of user-adaptive systems
Validity and justifiability
The inference methods used are normally known to be valid
The propositions represented and the inferences made are in general understandable
Questions about validity arise when uncertainty about the propositions being used in inference is not taken into account

Inference / Stereotypes

Stereotypes
Introduction (1)
What is a stereotype?
A stereotype comprises two main components (Rich, 1989, cited below):
• a body, which contains information that is typically true of users to whom the stereotype applies
• a set of triggers, which are used to determine whether the stereotype applies to a given user
Upward inference occurs when the condition corresponding to a trigger is satisfied and the stereotype is activated
Downward inference occurs when one or more of the facts in the body of an activated stereotype is assumed to apply to the user in question

Introduction (2)
Hierarchies and multiple inheritance
Stereotypes may be arranged in hierarchies
More than one stereotype may be active for a given U
Conflicts arising through multiple inheritance are typically resolved through the application of heuristic rules
Classic general references
Rich, E. (1979). User modeling via stereotypes. Cognitive Science, 3, 329−354.
Rich, E. (1989). Stereotypes and user modeling. In A. Kobsa & W. Wahlster (Eds.), User models in dialog systems (pp. 35−51). Berlin: Springer.
Chin, D. N. (1989). KNOME: Modeling what the user knows in UC. In A. Kobsa & W. Wahlster (Eds.), User models in dialog systems (pp. 74−107). Berlin: Springer.

Stereotypes in BGP-MS (1), (2)
Figures 8 and 9 of Kobsa, A., & Pohl, W. (1995). The user modeling shell system BGP-MS. User Modeling and User-Adapted Interaction, 4, 59−106.

Figure 1 of Hohl, H., Böcker, H., & Gunzenhauser, R. (1996). Hypadapter: An adaptive hypertext system for exploratory learning and programming. User Modeling and User-Adapted Interaction, 6, 131−156. Figures from this article reproduced with permission.
Inference in Hypadapter
ACTIONS AND SELF-ASSESSMENTS OF U → (triggering of stereotypes) → MEMBERSHIP IN ONE OR MORE STEREOTYPES → ((possibly multiple) inheritance) → SPECIFIC DECISION-RELEVANT FACTS ABOUT U

Personal Questionnaires (1)
Personal Questionnaires (2)
Figure 2 of Hohl et al. (1996)
Content of User Model
Figure 8 of Hohl et al. (1996)

Stereotype Triggering Rules
Based on Figure 10 of Hohl et al. (1996)
Rule "Specified-explicitly"
Condition
U has specified her level of qualification as "Expert"
Triggered stereotype
Expert
Rule "Sufficient-topic-knowledge"
Condition
There are at least 15 topics T such that U's knowledge of T equals or exceeds the knowledge of T expected for an Expert
Triggered stereotype
Expert

Evaluating Topics for Presentation
Based on Figure 13 of Hohl et al. (1996)
Rule "Easy-Topic-for-Beginners"
Conditions
U is a Beginner
The level of difficulty of Topic T is "Mundane" or "Simple"
Scoring
Add 10 points to the appropriateness score for T
Rule "Avoid-Known-Topic"
Condition
According to the user model, U knows Topic T
Scoring
Subtract 20 points from the appropriateness score for T
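The following compressed sketch shows how triggering rules (upward inference) and scoring rules (downward inference) of the kind listed above might be operationalized. The 15-topic threshold and the +10 / -20 scores come from the Hohl et al. rules; the numeric coding of knowledge levels, the fallback to a Beginner stereotype, and all data structures and names are assumptions made here for illustration.

# Compressed sketch of stereotype triggering and topic scoring in the style
# of the Hypadapter rules above. Thresholds and scores follow those rules;
# everything else (coding of levels, default stereotype, names) is assumed.

EXPERT_LEVEL = 3  # assumed numeric coding of knowledge levels

user_model = {
    "self-assessed-qualification": "Expert",
    "topic-knowledge": {"recursion": 3, "lambda": 3, "macros": 1},  # topic -> level
    "known-topics": {"recursion"},
}

def active_stereotypes(um):
    """Upward inference: trigger stereotypes from self-reports and the user model."""
    active = set()
    if um["self-assessed-qualification"] == "Expert":       # Rule "Specified-explicitly"
        active.add("Expert")
    expert_topics = sum(1 for level in um["topic-knowledge"].values() if level >= EXPERT_LEVEL)
    if expert_topics >= 15:                                  # Rule "Sufficient-topic-knowledge"
        active.add("Expert")
    if not active:
        active.add("Beginner")                               # assumed default stereotype
    return active

def appropriateness(topic, difficulty, um, stereotypes):
    """Downward inference: score a topic for presentation to U."""
    score = 0
    if "Beginner" in stereotypes and difficulty in ("Mundane", "Simple"):
        score += 10                                          # Rule "Easy-Topic-for-Beginners"
    if topic in um["known-topics"]:
        score -= 20                                          # Rule "Avoid-Known-Topic"
    return score

stereotypes = active_stereotypes(user_model)
print(stereotypes, appropriateness("recursion", "Simple", user_model, stereotypes))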
Inference / Stereotypes

Discussion (1) 127

Knowledge acquisition effort
  In principle, the same sort of effort is required as for Bayesian networks: What stereotypes should be defined? What triggering rules are appropriate? What properties can be inherited from each stereotype? ...
  In practice, systematic knowledge acquisition for stereotype-based systems has been rare
Computational complexity
  Low
Amount of input data required
  Stereotypes are typically triggered on the basis of only a few observations

Discussion (2) 128

Handling of noise and uncertainty
  Various calculi for representing S's confidence in its inferences have been employed, covering
  • uncertainty about the assignment of U to a given stereotype
  • uncertainty about the inheritance of particular properties from that stereotype
  These methods are less well understood than probabilistic methods
  (A simple illustration of combining these two kinds of uncertainty follows below.)
Validity and justifiability
  Stereotypes don't explicitly represent propositions that can be judged to be true or false
  The inference methods don't guarantee the truth of the conclusions given the truth of the premises
  The various aspects of a given system can be judged to be more or less plausible, and the performance of the system can be evaluated
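As a simple illustration of how the two kinds of uncertainty just mentioned can be combined, one can attach probability-style confidence values to stereotype assignment and to property inheritance and marginalize over the stereotypes. This is only an illustration, not a reconstruction of the calculi actually used in the systems discussed here, and all numbers are invented.

```python
# Combining uncertainty about stereotype assignment with uncertainty about
# inheritance, using probability-style confidences (illustrative values only).

# S's confidence that each stereotype applies to U (assignment uncertainty)
p_stereotype = {"Expert": 0.7, "Beginner": 0.3}

# Confidence that a user in a stereotype has a given property (inheritance uncertainty)
p_property_given_stereotype = {
    ("knows-recursion", "Expert"): 0.9,
    ("knows-recursion", "Beginner"): 0.2,
}

def belief_in_property(prop):
    """Marginalize over stereotypes: P(prop) = sum over s of P(prop | s) * P(s)."""
    return sum(p_property_given_stereotype.get((prop, s), 0.0) * p
               for s, p in p_stereotype.items())

print(round(belief_in_property("knows-recursion"), 2))   # 0.9*0.7 + 0.2*0.3 = 0.69
```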
Empirical Foundations / Overview

Overview of Types of Empirical Studies 129

Studies that don't require a running system
1. Results of previous research
2. Early exploratory studies
3. Knowledge acquisition from experts
Studies with a system
4. Controlled evaluations with users
5. Experience with real-world use

Questions Addressed by Empirical Studies (1) 130

1. Correctness of assumptions about users relied on by inference techniques
Each system typically presupposes some general assumptions about users that are not logically guaranteed to be correct. Can these assumptions be shown empirically to be correct?
• Bayesian networks: network structure, probabilities
• Logic: general propositions about users that S assumes to be true; simplifying assumptions that are made to make inference easier
• Stereotypes: specific triggering rules, assumptions underlying inheritance mechanisms

Questions Addressed by Empirical Studies (2) 131

2. Appropriateness of the inference techniques used
Are the techniques used well suited to dealing with the inference tasks faced by S? E.g., can they deal with the amount of noise actually present in the input data?
3. Adequacy of available data
Is there typically enough data available about each user U to enable S to make useful inferences about U?
4. Adequacy of coverage
Does S take into account enough of the relevant input data and user properties to be able to make a useful number of appropriate adaptation decisions? If not, S's adaptation may be of little use, even if it meets the other criteria

Questions Addressed by Empirical Studies (3) 132

5. Appropriateness of adaptation decisions
Do the adaptation decisions that S makes on the basis of decision-relevant properties actually improve the quality of U's interaction with S?
Even if S fulfills all other criteria, its adaptation decisions may be useless or harmful if they don't take into account important factors
E.g., even if S correctly identifies the menu options that U needs most, it may be a bad idea to move these to the top of the respective menus: the resulting unpredictability of the menus could outweigh any advantage of the adaptation

Empirical Foundations / Results of Previous Research

Results of Previous Research 133

Figure 1 of Corbett, A. T., & Bhatnagar, A. (1997). Student modeling in the ACT programming tutor: Adjusting a procedural learning model with declarative knowledge. In A. Jameson, C. Paris, & C. Tasso (Eds.), User modeling: Proceedings of the Sixth International Conference, UM97 (pp. 243−254). Wien, New York: Springer Wien New York. (http://www.cs.uni-sb.de/UM97/)

A Research-Based Tutor
The ACT tutors are based on a large body of theoretical and empirical research into cognitive psychology and cognitive modeling

Discussion 134

Previous research can in principle yield valuable information about almost any of the questions listed above
But it does not in general refer to a system like S or even to a similar setting
Consequently, a good deal of extrapolation is typically required
It may be necessary to conduct studies in the domain of S in order to determine the applicability of previous research results

Empirical Foundations / Early Exploratory Studies

Early Exploratory Studies 135

Horvitz, E., Breese, J., Heckerman, D., Hovel, D., & Rommelse, K. (1998). The Lumière project: Bayesian user modeling for inferring the goals and needs of software users. In G. F. Cooper & S. Moral (Eds.), Uncertainty in Artificial Intelligence: Proceedings of the Fourteenth Conference (pp. 256−265). San Francisco: Morgan Kaufmann. (http://www.sis.pitt.edu/~dsl/UAI.html)

Wizard of Oz Study: Method
The studies were based on a "Wizard of Oz" design. In the studies, naive subjects with different levels of competence in using the Microsoft Excel spreadsheet application were given a set of spreadsheet tasks. Subjects were informed that an experimental help system would be tracking their activity and would be occasionally making guesses about the best way to help them. Assistance would appear on an adjacent computer monitor. Experts were positioned in a separate room that was isolated acoustically from the users. The experts were given a monitor to view the subjects' screen, including their mouse and typing activity, and could type advice to users. Experts were not informed about the nature of the set of spreadsheet tasks given to the user. Both subjects and experts were told to verbalize their thoughts, and the experts, subjects, and the subjects' display were videotaped.
Wizard of Oz Study: Results (1) 136

Difficulty of the system's task
We found that experts had the ability to identify user goals and needs through observation. However, recognizing the goals of users was a challenging task. Experts were typically uncertain about a user's goals and about the value of providing different kinds of assistance. At times, goals would be recognized with an "Aha!" reaction after a period of confusion.

Consequences of poor advice
We learned that poor advice could be quite costly to users. Even though subjects were primed with a description of the help system as being "experimental", advice appearing on their displays was typically examined carefully and often taken seriously; even when expert advice was off the mark, subjects would often become distracted by the advice and begin to experiment with features described by the wizard. This would give experts false confirmation of successful goal recognition, and would bolster their continuing to give advice pushing the user down a distracting path.

Wizard of Oz Study: Results (2) 137

Suggestions for desirable strategies
Experts would become better over time with this and other problems, learning, for example, to become conservative with offering advice, using conditional statements (i.e., "if you are trying to do X, then..."), and decomposing advice into a set of small, easily understandable steps.

Discussion 138

Other methods in this category include interviews with − or observation of − users of a nonadaptive version of S
These methods can yield valuable information about all of the questions listed above
Getting this information at an early stage can help to avoid poor design decisions that may later be difficult or impossible to change

Empirical Foundations / Knowledge Acquisition

Knowledge Acquisition 139

Figure 4 of Heckerman, D., & Horvitz, E. (1998). Inferring informational goals from free-text queries: A Bayesian approach. In G. F. Cooper & S. Moral (Eds.), Uncertainty in Artificial Intelligence: Proceedings of the Fourteenth Conference (pp. 230−237). San Francisco: Morgan Kaufmann. (http://www.sis.pitt.edu/~dsl/UAI.html)

Elicitation of Probabilities from Experts
Early in the design of the Answer Wizard, usability experts identified key terms employed by users and assessed the conditional probabilities of their being used in a query, given that U had a particular goal: p(term | goal_j)
[The figure lists terms such as "chart", "graph", "cells", "row", "column", "insert", "modify", "edit", "format", "plot", and "file", together with their assessed probabilities for the goals "Modifying chart" and "Formatting document"]

Discussion 140

Knowledge acquisition from domain experts has a special role in that adaptation to a given person has so far been something that humans are much better at than computers
For some types of system, there are persons who have extensive experience with the type of adaptation required
The knowledge acquired can concern all of the questions listed above
The general comments about early exploratory studies apply here as well
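Once such conditional probabilities p(term | goal) have been elicited, they can be combined with prior probabilities of the goals to rank the goals suggested by a free-text query. The sketch below does this under a naive term-independence assumption; it is a simplified illustration, not the actual Answer Wizard model of Heckerman and Horvitz (1998), and the probability values, priors, and smoothing constant are all invented.

```python
# Ranking candidate goals for a free-text query from elicited p(term | goal) values,
# assuming term independence given the goal (naive-Bayes-style simplification).
# All numbers are invented for illustration.

p_term_given_goal = {
    "Modifying chart":     {"chart": 0.30, "graph": 0.20, "plot": 0.10, "format": 0.02},
    "Formatting document": {"format": 0.25, "edit": 0.10, "cells": 0.05, "chart": 0.01},
}
p_goal = {"Modifying chart": 0.5, "Formatting document": 0.5}   # assumed uniform prior
DEFAULT = 0.005   # smoothing probability for terms not assessed for a goal

def rank_goals(query):
    """Return a normalized posterior over goals for the given query string."""
    terms = query.lower().split()
    scores = {}
    for goal, prior in p_goal.items():
        likelihood = prior
        for t in terms:
            likelihood *= p_term_given_goal[goal].get(t, DEFAULT)
        scores[goal] = likelihood
    total = sum(scores.values())
    return {g: s / total for g, s in scores.items()}

print(rank_goals("format my chart"))
# With these invented numbers, "Modifying chart" receives the higher posterior.
```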
Empirical Foundations / Controlled Evaluations With Users

Controlled Evaluations With Users 141

Value of Adaptivity in the Route Advisor
Figure 7 of Rogers, S., Fiechter, C., & Langley, P. (1999). An adaptive interactive agent for route advice. Proceedings of the Third International Conference on Autonomous Agents, Seattle, WA.
[Plot: percent correct (0−100) for each of 23 subjects, comparing the individualized model with the aggregate model]
The Adaptive Route Advisor's personalized models consistently outpredicted an aggregate model, even though each personalized model was based on only a small fraction of the data

Pazzani, M., & Billsus, D. (1997). Learning and revising user profiles: The identification of interesting web sites. Machine Learning, 27, 313−331.

Discussion 142

Controlled evaluations differ in the extent to which they
• focus on specific aspects of S (e.g., the inferences made) or on S's overall effectiveness
• are conducted in a setting that approximates that of the real use of S
The more focused studies mainly yield insight into
• the correctness of assumptions about users in general and
• the appropriateness of the inference techniques used
A frequent type of study of this sort compares the effectiveness of several variants of a given modeling or inference technique, e.g., naive Bayesian classification vs. more complex classification techniques (see, e.g., Pazzani & Billsus, 1997)

Empirical Foundations / Experience With Real-World Use

Experience With Real-World Use 143

Evaluation of Calendar Apprentice in Use (1)
Top half of Figure 2 of Mitchell, T., Caruana, R., Freitag, D., McDermott, J., & Zabowski, D. (1994). Experience with a learning personal assistant. Communications of the ACM, 37(7), 81−91.
[Four panels: prediction accuracy (0−100%) over the course of a year (Mar−Dec) for User A and User B, for the location and the duration of meetings]
Explanation on next slide

Evaluation of Calendar Apprentice in Use (2) 144

The previous figure shows CAP's accuracy over time in predicting, for each of the users A and B, two aspects of their scheduling behavior:
• choice of location for a meeting
• choice of duration
Note the drops in accuracy at times in the year when scheduling habits typically change
The authors also reported a number of more specific analyses

Discussion 145

Studies of real use can potentially yield highly specific answers to all of the questions listed above
Because of the lack of control, it may be hard to pinpoint the reasons for particular (positive or negative) results
Building a version of the system that can be tested in real use may be impractical at an early stage; and at a late stage the insights gained may be hard to take into account

Concluding Remarks 146

• In the past, research on user-adaptive systems has been unnecessarily fragmented
• Specialized communities focused on particular types of application or particular implementation techniques
• Design decisions were often taken by default, rather than on the basis of a consideration of all available options
• Recent conferences and publications have seen an increasing tendency for researchers to adopt ideas and results that originated in other research communities
• If this trend is strengthened, we may expect an acceleration of progress in this already fast-moving field