The Learning Chatbot
Bonnie Chantarotwong
IMS-256, Fall 2006

What is wrong with state-of-the-art chatbots?
• They are repetitive
• They are predictable – simple pattern matching and set responses
• They have no memory – which can lead to circular conversations
• They don’t sound like real people

Proposed Solution
• Train chatbots on a corpus of conversations in order to mimic a given personality

Filtering the training corpus
• Need a lot of conversations containing the query screen name
• Eliminate undesirable data
  – phone numbers
  – addresses
  – sensitive gossip
• Eliminate highly technical data – most tech problems are very specific, so unless the bot was trained on a newsgroup, learned responses are not likely to be useful for tech support

Parsing the training corpus
• Extract messages from HTML
• Group together consecutive messages by the same screen name
• Simplify prompt messages
  – !!!!!!!??????? -> !?
  – Ohhhhhhhhhhhh! -> ohh!
  – WhATz uP?? -> whatz up?
  – hahahahaha -> haha
• Break prompts into word sequences (eliminating stop words)
  – I took the cat to a vet -> [i, took, cat, to, vet]

Constructing the CFD
• CFD conditions are prompt words
• FD samples are string responses, with a numeric count indicating strength of correlation
• Example:
  – cfd[‘sleep’].sorted_samples() -> [“sleep is the best thing ever”, “are you tired?”, “maybe after I eat.”, “hang on a sec.”]

Constructing the CFD – Simple Concept
Original conversation:
  A: I want a kitten
  B: Can we put mittens on it?
  A: I want food
  B: Me too, I’m hungry
  A: They had good food at the restaurant
  B: What kind did they have?
Each prompt word is credited with the response that followed it:
  i -> “Can we put mittens on it?” (1/3), “Me too, I’m hungry” (1/3)
  want -> “Can we put mittens on it?” (1/3), “Me too, I’m hungry” (1/3)
  kitten -> “Can we put mittens on it?” (1/3)
  food -> “Me too, I’m hungry” (1/3), “What kind did they have?” (1/6)
  they, had, good, at, restaurant -> “What kind did they have?” (1/6 each)
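The simple-concept construction above – each non-stop word of an n-word prompt credits the response that followed with weight 1/n – can be sketched in a few lines of Python. This is an illustrative reconstruction, not the author's code (which used NLTK-style conditional frequency distributions); the function name, the tiny stop list, and the dict-of-dicts representation are all assumptions.

```python
from collections import defaultdict

# Illustrative stop list; the real one would be larger.
STOP_WORDS = {"the", "a"}

def build_cfd(pairs):
    """Build the CFD: each non-stop word of an n-word prompt
    credits the response that followed with weight 1/n."""
    cfd = defaultdict(lambda: defaultdict(float))
    for prompt, response in pairs:
        words = [w for w in prompt.lower().split() if w not in STOP_WORDS]
        if not words:
            continue
        for w in words:
            cfd[w][response] += 1.0 / len(words)
    return cfd

# The conversation from the slide, as (prompt, response) pairs:
conversation = [
    ("I want a kitten", "Can we put mittens on it?"),
    ("I want food", "Me too, I'm hungry"),
    ("They had good food at the restaurant", "What kind did they have?"),
]
cfd = build_cfd(conversation)
# e.g. cfd["kitten"]["Can we put mittens on it?"] == 1/3
```

Running this reproduces the fractions on the slide: ‘kitten’ maps to its response with weight 1/3, and each word of the six-word restaurant prompt maps to its response with weight 1/6.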
If a prompt is n words long, then each word is 1/n likely to have caused the response.

Using the CFD
Problem
• Each word in a prompt is not equally likely to have caused the response
  – More common words (such as ‘i’) are less indicative of meaning
Solution
• Take into account how common the word is across all conversations
  – Divide the weight of each word/response pair by the sum of weights over all samples for that word
  – Rare words are weighted more, using a dynamic scale
  – This greatly improved the quality of bot responses!

Using the CFD – Example
CFD (with the total weight for each word):
  cfd[‘i’] = [(“Can we put mittens on it?”, 1/3), (“Me too, I’m hungry”, 1/3)]   Sum: 2/3
  cfd[‘want’] = [(“Can we put mittens on it?”, 1/3), (“Me too, I’m hungry”, 1/3)]   Sum: 2/3
  cfd[‘kitten’] = [(“Can we put mittens on it?”, 1/3)]   Sum: 1/3
  cfd[‘food’] = [(“Me too, I’m hungry”, 1/3), (“What kind of food did they have?”, 1/6)]   Sum: 1/2
  cfd[‘they’] = [(“What kind of food did they have?”, 1/6)]   Sum: 1/6
  cfd[‘had’] = [(“What kind of food did they have?”, 1/6)]   Sum: 1/6
  cfd[‘good’] = [(“What kind of food did they have?”, 1/6)]   Sum: 1/6
  cfd[‘at’] = [(“What kind of food did they have?”, 1/6)]   Sum: 1/6
  cfd[‘restaurant’] = [(“What kind of food did they have?”, 1/6)]   Sum: 1/6

Responses(“how is your kitten?”) = [(“Can we put mittens on it?”, (1/3) / (1/3) = 1)]
Responses(“the food was good”) = [(“Me too, I’m hungry”, (1/3) / (1/2) = 2/3), (“What kind of food did they have?”, (1/6) / (1/2) + (1/6) / (1/6) = 4/3)]

Using the CFD – Redundancy
• Given this CFD, any prompt containing ‘food’ and ‘good’ will give back “What kind of food did they have?”
• Problem: this can lead to redundancy
  A: The food was good
  B: What kind of food did they have?
  A: Didn’t you think the food was good?
  B: What kind of food did they have?
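The rarity-weighted lookup in the example above – divide each word/response weight by that word's total weight, then sum over the prompt's words – might be sketched like this. The function name `respond_scores` and the plain-dict CFD are hypothetical stand-ins for the author's implementation:

```python
def respond_scores(cfd, prompt, stop_words=("the", "a")):
    """Score candidate responses for a prompt: each word/response
    weight is divided by that word's total weight over all samples,
    so rare words count for more than common ones."""
    scores = {}
    for w in prompt.lower().split():
        if w in stop_words or w not in cfd:
            continue
        total = sum(cfd[w].values())  # commonality of the word
        for response, weight in cfd[w].items():
            scores[response] = scores.get(response, 0.0) + weight / total
    # Best-scoring responses first.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# The relevant CFD fragment from the example above:
cfd = {
    "food": {"Me too, I'm hungry": 1/3, "What kind of food did they have?": 1/6},
    "good": {"What kind of food did they have?": 1/6},
}
ranked = respond_scores(cfd, "the food was good")
# "What kind of food did they have?" scores 1/3 + 1 = 4/3;
# "Me too, I'm hungry" scores (1/3)/(1/2) = 2/3, as on the slide.
```

Because ‘good’ occurs only once in the corpus, it contributes a full 1.0 to its response, which is exactly why the rarer word dominates the ranking.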
• Solution: store an FD of used responses, and don’t use them again
  A: The food was good
  B: What kind of food did they have?
  A: Didn’t you think the food was good?
  B: Me too, I’m hungry

What if we have no responses?
Because:
• we’ve never encountered any of the prompt words, or
• we’ve used up all relevant responses
We can:
1. Find a random response
2. Enhance randomness by favoring unlikely responses (near the end of association lists) to reduce redundancy
3. Fabricate a response based on pattern matching
4. Select a response from a default list (e.g. “Let’s talk about something else”, “I don’t know anything about that”)
* All my bots implement 1 & 2, and one bot (bonnie) also implements 3 & 4

Interactive webpages are not trivial
• Especially if you want to retain some “memory” of the past
• First CGI problem: a new bot is created with every web prompt – all memory is lost
• Solution:
  – Write all bot state changes to a file, including used responses
  – Run this file with every prompt, and reset it when a new conversation starts
  – The bot loads the huge CFD and all state changes from scratch with EVERY call – slow, but it works
• Memory = self-modifying code

Demo
http://ischool.berkeley.edu/~bonniejc/
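The used-response memory and the default-list fallback described above (strategy 4) might look like the following sketch. The helper name `pick_response` and the two canned default lines are illustrative assumptions, not the author's code:

```python
import random

# Hypothetical default lines, echoing the slide's strategy 4.
DEFAULT_RESPONSES = [
    "Let's talk about something else.",
    "I don't know anything about that.",
]

def pick_response(ranked, used, rng=random):
    """Return the best-scoring response not yet used in this
    conversation; once all relevant responses are exhausted,
    fall back to a canned default line."""
    for response, _score in ranked:
        if response not in used:
            used.add(response)  # remember it so we never repeat it
            return response
    return rng.choice(DEFAULT_RESPONSES)

used = set()
ranked = [("What kind of food did they have?", 4/3),
          ("Me too, I'm hungry", 2/3)]
first = pick_response(ranked, used)     # best unused response
second = pick_response(ranked, used)    # next-best, since first is now used
fallback = pick_response(ranked, used)  # everything used -> default line
```

Persisting the `used` set between CGI calls is exactly the state that the file-replay scheme above has to write out and reload on every web prompt.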