Bayesian Update of Dialogue State system

Dialogue Modelling
Milica Gašić
Dialogue Systems Group
Why are current methods poor?
Dialogue as a Partially Observable Markov Decision Process (POMDP)
[Figure: POMDP graphical model with nodes s_t, s_{t+1}, o_t, o_{t+1}, a_t and reward r_t]
• The state is only observed through a noisy observation o_t:
  P(o_t | s_t) – the observation probability
• The state is unobservable and depends on the previous state and action:
  P(s_{t+1} | s_t, a_t) – the transition probability
• Action selection (policy) is based on the distribution over all states at every time step t – the belief state b(s_t)
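For concreteness, the two distributions can be written down as tables in a toy domain. The sketch below is purely illustrative (the state values "cheap"/"expensive" and all numbers are invented, not from the lecture):

```python
# Toy POMDP tables (hypothetical example): the "state" is simply whether
# the user wants a cheap or an expensive venue.

# Transition probability P(s_{t+1} | s_t, a_t): users rarely change their mind.
transition = {
    ("cheap",     "ask_price"): {"cheap": 0.9, "expensive": 0.1},
    ("expensive", "ask_price"): {"cheap": 0.1, "expensive": 0.9},
}

# Observation probability P(o_t | s_t): the speech recogniser is noisy.
observation = {
    "cheap":     {"heard_cheap": 0.8, "heard_expensive": 0.2},
    "expensive": {"heard_cheap": 0.3, "heard_expensive": 0.7},
}
```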
How to track belief state?
b(s_{t+1}) = ?
Belief propagation
• Probabilities conditional on the observations
[Figure: evidence D_a and D_b connected through a node x]
• Interested in the marginal probabilities p(x|D), D={Da,Db}
p(x | D_b, D_a) \propto p(x, D_b | D_a) = p(D_b | x, D_a) p(x | D_a) = p(D_b | x) p(x | D_a)
Belief propagation
• Split Db further into Dc and Dd
[Figure: D_b split into D_c and D_d, both connected to the node x; D_a also connected to x]
p(x | D_a, D_c, D_d) \propto p(D_c, D_d | x) p(x | D_a) = p(D_c | x) p(D_d | x) p(x | D_a)
Belief propagation
[Figure: a graph with nodes a, b, c; evidence D_a attached to a, D_b to b, D_c to c]
p(c | D_a, D_b) = \sum_{a,b} p(a | D_a) p(b | D_b) p(c | a, b)
p(D_c, D_b | a) \propto \sum_{b,c} p(D_c | c) p(b | D_b) p(c | a, b)
Belief propagation
[Figure: chain D_a – a – b – D_b]
p(b | D_a) = \sum_a p(a | D_a) p(b | a)
p(D_b | a) = \sum_b p(D_b | b) p(b | a)
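These two messages are easy to compute numerically. A minimal sketch on the chain D_a – a – b – D_b, with invented probability tables:

```python
# Minimal message passing on the chain D_a -- a -- b -- D_b
# (all tables are made-up illustrations).

p_a_given_Da = {"a0": 0.7, "a1": 0.3}            # p(a | D_a)
p_b_given_a  = {"a0": {"b0": 0.9, "b1": 0.1},    # p(b | a)
                "a1": {"b0": 0.2, "b1": 0.8}}
p_Db_given_b = {"b0": 0.4, "b1": 0.6}            # p(D_b | b), a likelihood

# Forward message: p(b | D_a) = sum_a p(a | D_a) p(b | a)
p_b_given_Da = {
    b: sum(p_a_given_Da[a] * p_b_given_a[a][b] for a in p_a_given_Da)
    for b in ("b0", "b1")
}

# Backward message: p(D_b | a) = sum_b p(D_b | b) p(b | a)
p_Db_given_a = {
    a: sum(p_Db_given_b[b] * p_b_given_a[a][b] for b in p_Db_given_b)
    for a in p_a_given_Da
}

print(p_b_given_Da)   # {'b0': 0.69, 'b1': 0.31}
print(p_Db_given_a)
```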
How to track belief state?
[Figure: POMDP graphical model with nodes s_t, s_{t+1}, o_t, o_{t+1}, a_t, r_t]
b(s_{t+1}) = ?
Belief state tracking
[Figure: POMDP graphical model]
b(s_{t+1}) \propto p(o_{t+1} | s_{t+1}) \sum_{s_t} p(s_{t+1} | a_t, s_t) b(s_t)
• Requires summation over all possible states at every dialogue turn – intractable!!!
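Written as code, the exact update is a sum over every state for every successor state, which is exactly what the callout flags as intractable for realistic state spaces. A minimal sketch over a two-state toy domain (all numbers invented):

```python
# Exact belief update over a small, enumerated state space (toy example).
transition = {("cheap", "ask_price"): {"cheap": 0.9, "expensive": 0.1},
              ("expensive", "ask_price"): {"cheap": 0.1, "expensive": 0.9}}
observation = {"cheap": {"heard_cheap": 0.8, "heard_expensive": 0.2},
               "expensive": {"heard_cheap": 0.3, "heard_expensive": 0.7}}

def belief_update(belief, action, obs):
    """b(s') ∝ p(o | s') * sum_s p(s' | a, s) * b(s), then normalise."""
    new_belief = {}
    for s_next in belief:
        prior = sum(transition[(s, action)][s_next] * belief[s] for s in belief)
        new_belief[s_next] = observation[s_next][obs] * prior
    norm = sum(new_belief.values())
    return {s: p / norm for s, p in new_belief.items()}

b0 = {"cheap": 0.5, "expensive": 0.5}
b1 = belief_update(b0, "ask_price", "heard_cheap")
print(b1)   # the belief shifts towards "cheap"
```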
Challenges in POMDP dialogue modelling
• How to define the state space?
• How to tractably maintain the belief state?
• How to define transition and observation probabilities?
How to represent dialogue state?
• Markov property – needs to know what happened before: the dialogue history
• Task-oriented dialogue – needs to know what the user wants: the user goal
• Robust to errors – needs to know what the user says: the user act
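Read as a data structure, these three requirements become the three fields of the dialogue state. A minimal sketch (field names are illustrative, not taken from any particular system):

```python
from dataclasses import dataclass, field

@dataclass
class DialogueState:
    """The three ingredients discussed above (illustrative field names)."""
    user_goal: dict = field(default_factory=dict)    # e.g. {"food": "Chinese", "area": "centre"}
    user_act: str = ""                               # last decoded user dialogue act
    history: list = field(default_factory=list)      # what has been said so far
```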
Dialogue state factorisation
• Decompose the state into conditionally independent elements:
[Figure: factorised POMDP graphical model – the state s_t is split into the user goal g_t, the user action u_t and the dialogue history d_t, together with a_t, o_t, o_{t+1}, r_t]
Belief update
[Figure: factorised POMDP graphical model]
b(g_{t+1}, u_{t+1}, d_{t+1}) \propto p(o_{t+1} | u_{t+1}) p(u_{t+1} | g_{t+1}, a_t) \sum_{g_t} p(g_{t+1} | a_t, g_t) \sum_{d_t, u_t} p(d_{t+1} | d_t, g_t, u_t, a_t) b(g_t, u_t, d_t)
• Requires summation over all possible goals – intractable!!!
• Requires summation over all possible histories and user actions – intractable!!!
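Transcribed directly into code, the update loops over every combination of goal, user act and history, which is what the callouts flag as intractable. In the sketch below the conditional distributions are assumed to be supplied as plain Python callables (p_obs, p_act, p_goal and p_hist are hypothetical names, not part of any real system):

```python
import itertools

def factored_belief_update(belief, a_t, o_next,
                           goals, acts, hists,
                           p_obs, p_act, p_goal, p_hist):
    """b(g', u', d') ∝ p(o'|u') p(u'|g', a) Σ_g p(g'|a, g) Σ_{d,u} p(d'|d, g, u, a) b(g, u, d).

    belief is a dict keyed by (goal, act, history); the p_* arguments are
    callables returning the corresponding conditional probabilities."""
    new_b = {}
    for g_next, u_next, d_next in itertools.product(goals, acts, hists):
        total = 0.0
        for g in goals:
            inner = sum(p_hist(d_next, d, g, u, a_t) * belief[(g, u, d)]
                        for d, u in itertools.product(hists, acts))
            total += p_goal(g_next, a_t, g) * inner
        new_b[(g_next, u_next, d_next)] = (p_obs(o_next, u_next)
                                           * p_act(u_next, g_next, a_t) * total)
    norm = sum(new_b.values()) or 1.0
    return {k: v / norm for k, v in new_b.items()}
```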
Dialogue models for real-world dialogue systems
• Hidden Information State (HIS) system
• Bayesian Update of Dialogue State (BUDS) system
Hidden Information State system
• Real-world dialogue system based on POMDPs
• Takes an N-best list of user utterance hypotheses as input
• Maintains a distribution over the most probable dialogue states in real time
Hidden Information State system – dialogue acts
Is there um maybe a cheap place in the centre of town please?
inform(pricerange=cheap, area=centre)
• dialogue act type: inform, request, confirm, …
• semantic slots and values: type=restaurant, food=Chinese, …
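A dialogue act is therefore just an act type plus a set of slot-value pairs. A minimal representation (class and field names are illustrative):

```python
from dataclasses import dataclass

@dataclass
class DialogueAct:
    act_type: str     # e.g. "inform", "request", "confirm"
    slots: dict       # e.g. {"pricerange": "cheap", "area": "centre"}

    def __str__(self):
        pairs = ", ".join(f"{k}={v}" for k, v in self.slots.items())
        return f"{self.act_type}({pairs})"

act = DialogueAct("inform", {"pricerange": "cheap", "area": "centre"})
print(act)   # inform(pricerange=cheap, area=centre)
```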
Hidden Information State system -- ontology
[Figure: example ontology tree – an entity has a type (restaurant, hotel); a restaurant has slots such as area (north, south) and food (Chinese, Indian); a hotel has slots such as area and stars]
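The ontology can be held as a nested mapping from entity types to slots and their allowed values. The sketch below mirrors the figure; the extra values (e.g. the star ratings) are invented for illustration:

```python
# Illustrative ontology (values mirror the figure; real ontologies are larger).
ontology = {
    "restaurant": {
        "area": ["north", "south"],
        "food": ["Chinese", "Indian"],
    },
    "hotel": {
        "area": ["north", "south"],
        "stars": ["3", "4", "5"],
    },
}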
Hidden Information State system – belief update
b(g_{t+1}, u_{t+1}, d_{t+1}) \propto p(o_{t+1} | u_{t+1}) p(u_{t+1} | g_{t+1}, a_t) \sum_{g_t} p(g_{t+1} | a_t, g_t) \sum_{d_t, u_t} p(d_{t+1} | d_t, g_t, u_t, a_t) b(g_t, u_t, d_t)
• Only the user acts from the N-best list are considered
• Dialogue histories take a small number of values
• Goals are grouped into partitions
• All probabilities are handcrafted
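The first simplification means the sum over user acts only ever visits the handful of hypotheses on the recogniser's N-best list, with their confidence scores standing in for p(o_{t+1} | u_{t+1}). A sketch with invented hypotheses and scores:

```python
# N-best list of decoded user acts with confidence scores (hypothetical values).
n_best = [
    ("inform(pricerange=cheap, area=centre)", 0.6),
    ("inform(area=centre)",                   0.3),
    ("inform(pricerange=cheap)",              0.1),
]

# Treat the confidence scores as a stand-in for p(o | u): the update only
# ever loops over these few hypotheses, not over every possible user act.
def p_obs(user_act):
    scores = dict(n_best)
    return scores.get(user_act, 0.0)

print(p_obs("inform(area=centre)"))   # 0.3
```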
Dialogue history in the HIS system
• The dialogue history ideally represents everything that has happened so far
• History states: system informed, user informed, user requested, system requested – for each concept in the dialogue
• p(d_{t+1} | d_t, g_t, u_t, a_t) is either 1 or 0 and is defined by a finite state automaton
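Because the history transition is deterministic, it can be implemented as a small finite state automaton per concept. The sketch below uses simplified, illustrative state and event names rather than the exact HIS definitions:

```python
# Deterministic history update per concept: p(d' | d, g, u, a) is 1 for exactly
# one successor state and 0 otherwise. State/event names are illustrative.
HISTORY_FSA = {
    ("initial",          "user_informed"):    "user_informed",
    ("initial",          "system_requested"): "system_requested",
    ("system_requested", "user_informed"):    "user_informed",
    ("user_requested",   "system_informed"):  "system_informed",
}

def next_history(state, event):
    """Return the single successor state; stay put if no transition fires."""
    return HISTORY_FSA.get((state, event), state)

print(next_history("system_requested", "user_informed"))   # user_informed
```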
HIS partitions
• Represent groups of (the most probable) goals
• Dynamically built during the dialogue
• p(g_{t+1} | g_t, a_t) is set to a high value if g_{t+1} is consistent with g_t and a_t, and to a small value otherwise
HIS partitions – example
System: How may I help you?
request(task)
User: I’d like a restaurant in the centre.
inform(entity=venue, type=restaurant, area=centre)
[Figure: partition tree built from the dialogue so far – the goal space is successively split into partitions such as entity=venue vs. !venue, type=restaurant vs. !restaurant, and area=central vs. !central]
Pruning
[Figure: partition tree with beliefs attached – e.g. entity=venue 0.9, type=restaurant 0.2, area=central 0.5; low-probability partitions are pruned]
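Pruning can be sketched as dropping partitions whose belief falls below a threshold and renormalising the rest; a real system would fold the pruned mass back into coarser partitions rather than discard it. The numbers are those shown in the figure, treated here as a flat distribution purely for illustration:

```python
# Prune low-probability partitions and renormalise (threshold is illustrative).
partitions = {
    "entity=venue":    0.9,
    "type=restaurant": 0.2,
    "area=central":    0.5,
}

def prune(beliefs, threshold=0.3):
    kept = {p: b for p, b in beliefs.items() if b >= threshold}
    norm = sum(kept.values()) or 1.0
    return {p: b / norm for p, b in kept.items()}

print(prune(partitions))   # drops type=restaurant, renormalises the rest
```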
Hidden Information State systems
Any limitations?
Bayesian Update of Dialogue State system
• Further decomposes the dialogue state
• Tractable belief state update
• Learns the shape of the distributions
Bayesian network model for dialogue
[Figure: BUDS Bayesian network – the goal, user act and dialogue history are further factorised per slot, e.g. g_t^food, g_t^area, u_t^food, u_t^area, d_t^food, d_t^area, together with a_t, o_t, o_{t+1}, r_t]
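The point of the per-slot factorisation is that each slot node can be updated with its own small sum instead of one sum over the joint goal. The sketch below illustrates independent per-slot goal updates; the "stickiness" parameter, the value lists and the evidence weights are all invented, and the real model's distributions are parameterised differently:

```python
# Illustrative per-slot goal update: each slot's belief is tracked separately,
# so the cost grows with the number of values per slot, not the joint goal space.
slot_beliefs = {
    "food": {"Chinese": 0.5, "Indian": 0.5},
    "area": {"north": 0.5, "south": 0.5},
}

def update_slot(belief, evidence, stickiness=0.9):
    """evidence: weight for each slot value; stickiness: p(goal stays the same)."""
    n = len(belief)
    new_b = {}
    for value in belief:
        prior = sum((stickiness if value == old else (1 - stickiness) / max(n - 1, 1)) * p
                    for old, p in belief.items())
        new_b[value] = evidence.get(value, 1e-3) * prior
    norm = sum(new_b.values())
    return {v: p / norm for v, p in new_b.items()}

slot_beliefs["food"] = update_slot(slot_beliefs["food"], {"Chinese": 0.8, "Indian": 0.2})
print(slot_beliefs["food"])   # mass shifts towards Chinese
```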
Belief tracking
• For each node x:
• Start at one end and propagate forward to obtain p(x | D_a)
• Then start at the other end and propagate backward to obtain p(D_b | x)
• To get the marginal, simply multiply the two (and normalise)
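Concretely, the marginal at x is the normalised product of the forward and backward messages; a short sketch with made-up message values:

```python
# Combine forward and backward messages into the marginal (made-up numbers).
forward  = {"x0": 0.69, "x1": 0.31}   # p(x | D_a)
backward = {"x0": 0.40, "x1": 0.60}   # p(D_b | x)

unnorm = {x: forward[x] * backward[x] for x in forward}
norm = sum(unnorm.values())
marginal = {x: p / norm for x, p in unnorm.items()}   # p(x | D_a, D_b)
print(marginal)
```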
Bayesian network model for dialogue
[Figure: BUDS Bayesian network with shared parameters θ governing the slot-level distributions]
Training policy using different parameters
• Policy trained using reinforcement learning (explained in the next lecture)
• Evaluated at different error rates in the user input
[Figure: average reward as a function of the input error rate for different parameter settings]
Summary
• Essential ingredients to include in the dialogue state
• Maintaining the belief state
• Dialogue modelling for real-world problems
• Learning the shapes of probability distributions