
The Learning Chatbot
Bonnie Chantarotwong
IMS-256 Fall 2006
What is wrong with state-of-the-art chatbots?
• They are repetitive
• They are predictable
– simple pattern matching & set response
• They have no memory
– can lead to circular conversations
• They don’t sound like real people
Proposed Solution
• Train chatbots on corpora of conversations in order to mimic a given personality
Filtering the training corpus
• Need a lot of conversations containing the query screen name
• Eliminate undesirable data
– phone numbers
– addresses
– sensitive gossip
• Eliminate highly technical data
– Most tech problems are very specific; unless the bot was trained on a newsgroup, learned responses are unlikely to be useful for tech support
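The slides don't show the actual filtering rules, so as a minimal sketch, undesirable data could be screened with a few regexes; the phone-number and address patterns below are illustrative assumptions, not the original project's code:

```python
import re

# Hypothetical filter sketch: patterns for phone numbers and street
# addresses are illustrative assumptions, not the original rules.
PHONE = re.compile(r"\b\(?\d{3}\)?[-. ]?\d{3}[-. ]?\d{4}\b")
ADDRESS = re.compile(r"\b\d+\s+\w+\s+(St|Ave|Rd|Blvd|Dr|Lane)\b", re.I)

def is_safe(message):
    """Keep a message only if it contains no obvious sensitive data."""
    return not (PHONE.search(message) or ADDRESS.search(message))

messages = ["call me at 415-555-0123", "I got a new kitten!"]
print([m for m in messages if is_safe(m)])  # ['I got a new kitten!']
```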
Parsing the training corpus
• Extract messages from HTML
• Group together consecutive messages by the same screen name
• Simplify prompt messages
– !!!!!!!??????? -> !?
– Ohhhhhhhhhhhh! -> ohh!
– WhATz uP?? -> whatz up?
– hahahahaha -> haha
• Break prompts into word sequences (eliminating stop words)
– I took the cat to a vet -> [i, took, cat, to, vet]
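The simplification and tokenization steps above can be sketched with a few regexes; the exact rules and stop list of the original bot aren't shown in the slides, so the patterns and the `STOP_WORDS` set below only reproduce the examples given:

```python
import re

STOP_WORDS = {"a", "an", "the"}  # assumed stop list, not the original

def simplify(prompt):
    """Normalize a prompt as in the slide's examples."""
    s = prompt.lower()
    s = re.sub(r"([a-z])\1{2,}", r"\1\1", s)  # Ohhhhhhhhhhhh! -> ohh!
    s = re.sub(r"([!?.,])\1+", r"\1", s)      # !!!!!!!??????? -> !?
    s = re.sub(r"(..)\1{2,}", r"\1\1", s)     # hahahahaha -> haha
    return s

def tokenize(prompt):
    """Break a prompt into a word sequence, eliminating stop words."""
    words = re.findall(r"[a-z']+", prompt.lower())
    return [w for w in words if w not in STOP_WORDS]

print(simplify("WhATz uP??"))               # whatz up?
print(tokenize("I took the cat to a vet"))  # ['i', 'took', 'cat', 'to', 'vet']
```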
Constructing the CFD
• CFD conditions are prompt words
• FD samples are string responses, with a numeric count indicating strength of correlation
• Example:
– Cfd[‘sleep’].sorted_samples() -> [“sleep is the best thing ever”, “are you tired?”, “maybe after I eat.”, “hang on a sec.”]
Constructing the CFD
Simple concept: if a prompt is n words long, then each word is 1/n likely to have caused the response.
Original conversation:
A: I want a kitten
B: Can we put mittens on it?
A: I want food
B: Me too, I’m hungry
A: They had good food at the restaurant
B: What kind of food did they have?
Resulting word -> response weights:
– i -> Can we put mittens on it? (1/3); Me too, I’m hungry (1/3)
– want -> Can we put mittens on it? (1/3); Me too, I’m hungry (1/3)
– kitten -> Can we put mittens on it? (1/3)
– food -> Me too, I’m hungry (1/3); What kind of food did they have? (1/6)
– they, had, good, at, restaurant -> What kind of food did they have? (1/6 each)
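The 1/n weighting scheme can be sketched in a few lines of Python (a minimal sketch, not the original implementation; the stop list is an assumption):

```python
from collections import defaultdict

STOP_WORDS = {"a", "an", "the"}  # assumed stop list

def build_cfd(pairs):
    """Map each prompt word to a {response: weight} dictionary,
    crediting the response 1/n per word of an n-word prompt."""
    cfd = defaultdict(lambda: defaultdict(float))
    for prompt, response in pairs:
        words = [w for w in prompt.lower().split() if w not in STOP_WORDS]
        for w in words:
            cfd[w][response] += 1.0 / len(words)  # each word is 1/n likely
    return cfd

conversation = [
    ("I want a kitten", "Can we put mittens on it?"),
    ("I want food", "Me too, I'm hungry"),
    ("They had good food at the restaurant", "What kind of food did they have?"),
]
cfd = build_cfd(conversation)
print(cfd["food"])  # 1/3 from 'I want food', 1/6 from the restaurant prompt
```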
Using the CFD
Problem
• Each word in a prompt is not equally likely to have caused the response
– More common words (such as ‘I’) are less indicative of meaning
Solution
• Take into account the commonality of the word over all conversations
– Divide the weight of each word/response pair by the weight sum over all samples for that word
– Rare words are weighted more, using a dynamic scale
– This greatly improved the quality of bot responses!
Using the CFD - Example
CFD:
Cfd[‘i’] = [(“Can we put mittens on it?”, 1/3), (“Me too, I’m hungry”, 1/3)]  Sum: 2/3
Cfd[‘want’] = [(“Can we put mittens on it?”, 1/3), (“Me too, I’m hungry”, 1/3)]  Sum: 2/3
Cfd[‘kitten’] = [(“Can we put mittens on it?”, 1/3)]  Sum: 1/3
Cfd[‘food’] = [(“Me too, I’m hungry”, 1/3), (“What kind of food did they have?”, 1/6)]  Sum: 1/2
Cfd[‘they’] = [(“What kind of food did they have?”, 1/6)]  Sum: 1/6
Cfd[‘had’] = [(“What kind of food did they have?”, 1/6)]  Sum: 1/6
Cfd[‘good’] = [(“What kind of food did they have?”, 1/6)]  Sum: 1/6
Cfd[‘at’] = [(“What kind of food did they have?”, 1/6)]  Sum: 1/6
Cfd[‘restaurant’] = [(“What kind of food did they have?”, 1/6)]  Sum: 1/6

Responses(“how is your kitten?”) = [(“Can we put mittens on it?”, 1/3 / 1/3 = 1)]
Responses(“the food was good”) = [(“Me too, I’m hungry”, 1/3 / 1/2 = 2/3),
(“What kind of food did they have?”, 1/6 / 1/2 + 1/6 / 1/6 = 4/3)]
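The normalized scores in this example can be reproduced with a short sketch (illustrative, not the original code); the CFD below is transcribed from the slide:

```python
from collections import defaultdict

# CFD transcribed from the slide (prompt word -> {response: weight})
cfd = {
    "i":          {"Can we put mittens on it?": 1/3, "Me too, I'm hungry": 1/3},
    "want":       {"Can we put mittens on it?": 1/3, "Me too, I'm hungry": 1/3},
    "kitten":     {"Can we put mittens on it?": 1/3},
    "food":       {"Me too, I'm hungry": 1/3, "What kind of food did they have?": 1/6},
    "they":       {"What kind of food did they have?": 1/6},
    "had":        {"What kind of food did they have?": 1/6},
    "good":       {"What kind of food did they have?": 1/6},
    "at":         {"What kind of food did they have?": 1/6},
    "restaurant": {"What kind of food did they have?": 1/6},
}

def score_responses(cfd, prompt):
    """Divide each word/response weight by that word's total weight
    over all its responses, so rare words count for more."""
    scores = defaultdict(float)
    for w in prompt.lower().split():
        total = sum(cfd.get(w, {}).values())
        if not total:
            continue  # word never seen as a prompt word
        for response, weight in cfd[w].items():
            scores[response] += weight / total
    return sorted(scores.items(), key=lambda kv: -kv[1])

ranked = score_responses(cfd, "the food was good")
print(ranked)  # 'What kind of food did they have?' scores 4/3, then 2/3
```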
Using the CFD
• Given the CFD, any prompt containing ‘food’ and ‘good’ will get back “What kind of food did they have?”
• Problem: this can lead to redundancy
– A: The food was good
– B: What kind of food did they have?
– A: Didn’t you think the food was good?
– B: What kind of food did they have?
• Solution: store an FD of used responses, and don’t use them again
– A: The food was good
– B: What kind of food did they have?
– A: Didn’t you think the food was good?
– B: Me too, I’m hungry
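The used-response bookkeeping above can be sketched as follows (a minimal sketch with an assumed structure, not the original code):

```python
from collections import defaultdict
import re

# Two entries from the earlier example CFD
cfd = {
    "food": {"Me too, I'm hungry": 1/3, "What kind of food did they have?": 1/6},
    "good": {"What kind of food did they have?": 1/6},
}
used = defaultdict(int)  # FD of responses the bot has already given

def respond(prompt):
    scores = defaultdict(float)
    for w in re.findall(r"[a-z']+", prompt.lower()):
        total = sum(cfd.get(w, {}).values())
        for response, weight in cfd.get(w, {}).items():
            if used[response]:
                continue  # never repeat a response
            scores[response] += weight / total
    if not scores:
        return None  # fall through to the backup strategies
    best = max(scores, key=scores.get)
    used[best] += 1
    return best

r1 = respond("The food was good")
r2 = respond("Didn't you think the food was good?")
print(r1)  # What kind of food did they have?
print(r2)  # Me too, I'm hungry
```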
What if we have no responses?
Because:
• We’ve never encountered any of the prompt words
• We’ve used up all relevant responses
We can:
1. Find a random response
2. Enhance randomness by favoring unlikely responses (near the end of association lists) to reduce redundancy
3. Fabricate a response based on pattern matching
4. Select a response from a default list (e.g. “Let’s talk about something else”, “I don’t know anything about that”)
* All my bots implement 1 & 2, and one bot (bonnie) also implements 3 & 4
Interactive Webpages are not trivial
• Especially if you want to retain some “memory” of the past
• First CGI problem: a new bot is created with every web prompt - all memory is lost
• Solution:
– Write all bot state changes to a file, including used responses
– Run this file with every prompt, and reset it when a new conversation starts
– The bot loads the huge CFD & all state changes from scratch with EVERY call. Slow, but works.
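The replay-a-state-log scheme can be sketched as below; the file name, the JSON line format, and the `Bot` class are illustrative assumptions, not the original project's code:

```python
import json, os

STATE_LOG = "bot_state.log"  # illustrative file name

class Bot:
    def __init__(self):
        self.used = set()  # responses already given in this conversation

    def mark_used(self, response):
        self.used.add(response)
        # Persist every state change so the next CGI call can replay it
        with open(STATE_LOG, "a") as f:
            f.write(json.dumps({"used": response}) + "\n")

def load_bot():
    """Recreate the bot from scratch on every call: slow, but works."""
    bot = Bot()
    if os.path.exists(STATE_LOG):
        with open(STATE_LOG) as f:
            for line in f:  # replay each recorded state change
                bot.used.add(json.loads(line)["used"])
    return bot

def reset_conversation():
    """A new conversation starts with an empty state log."""
    if os.path.exists(STATE_LOG):
        os.remove(STATE_LOG)

reset_conversation()
load_bot().mark_used("hang on a sec.")
print(load_bot().used)  # {'hang on a sec.'}
```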
Interactive Webpages are not trivial
Memory = Self-modifying code
Demo
http://ischool.berkeley.edu/~bonniejc/