Dialogue Modelling
Milica Gašić
Dialogue Systems Group

Why are current methods poor?

Dialogue as a Partially Observable Markov Decision Process (POMDP)
[Figure: POMDP graphical model with nodes $a_t$, $s_t$, $s_{t+1}$, $o_t$, $o_{t+1}$, $r_t$]
• The state is unobservable and depends on the previous state and action: $P(s_{t+1} \mid s_t, a_t)$, the transition probability
• The state is seen only through a noisy observation: $P(o_t \mid s_t)$, the observation probability
• Action selection (policy) is based on the distribution over all states at every time step $t$: the belief state $b(s_t)$

How to track belief state?
  $b(s_{t+1}) = ?$

Belief propagation
[Figure: node $x$ with evidence $D_a$ on one side and $D_b$ on the other]
• Probabilities conditional on the observations
• Interested in the marginal probabilities $p(x \mid D)$, $D = \{D_a, D_b\}$
  $p(x \mid D_b, D_a) \propto p(x, D_b \mid D_a) = p(D_b \mid x, D_a)\, p(x \mid D_a) = p(D_b \mid x)\, p(x \mid D_a)$

Belief propagation
[Figure: node $x$ with evidence $D_a$ on one side and $D_c$, $D_d$ on the other]
• Split $D_b$ further into $D_c$ and $D_d$
  $p(x \mid D_a, D_c, D_d) \propto p(D_c, D_d \mid x)\, p(x \mid D_a) = p(D_c \mid x)\, p(D_d \mid x)\, p(x \mid D_a)$

Belief propagation
[Figure: nodes $a$, $b$, $c$ with evidence $D_a$, $D_b$, $D_c$]
  $p(c \mid D_a, D_b) = \sum_{a,b} p(a \mid D_a)\, p(b \mid D_b)\, p(c \mid b, a)$
  $p(D_c, D_b \mid a) \propto \sum_{b,c} p(D_c \mid c)\, p(b \mid D_b)\, p(c \mid a, b)$

Belief propagation
[Figure: chain $D_a$, $a$, $b$, $D_b$]
  $p(b \mid D_a) = \sum_{a} p(a \mid D_a)\, p(b \mid a)$
  $p(D_b \mid a) = \sum_{b} p(D_b \mid b)\, p(b \mid a)$

Belief state tracking
[Figure: POMDP graphical model with nodes $a_t$, $s_t$, $s_{t+1}$, $o_t$, $o_{t+1}$, $r_t$]
  $b(s_{t+1}) \propto p(o_{t+1} \mid s_{t+1}) \sum_{s_t} p(s_{t+1} \mid a_t, s_t)\, b(s_t)$
• Requires summation over all possible dialogue states at every dialogue turn: intractable!

Challenges in POMDP dialogue modelling
• How to define the state space?
• How to tractably maintain the belief state?
• How to define transition and observation probabilities?

How to represent dialogue state?
Markov property
• Needs to know what happened before: the dialogue history
Task oriented dialogue
• Needs to know what the user wants: the user goal
Robust to errors
• Needs to know what the user says: the user act

Dialogue state factorisation
• Decompose the state into conditionally independent elements: user goal $g_t$, user action $u_t$, dialogue history $d_t$
[Figure: factorised graphical model in which $s_t$ is split into $g_t$, $u_t$, $d_t$, with nodes $a_t$, $g_t$, $g_{t+1}$, $d_t$, $d_{t+1}$, $u_t$, $u_{t+1}$, $o_t$, $o_{t+1}$, $r_t$]

Belief update
[Figure: the same factorised graphical model]
  $b(g_{t+1}, u_{t+1}, d_{t+1}) = p(o_{t+1} \mid u_{t+1})\, p(u_{t+1} \mid g_{t+1}, a_t) \sum_{g_t} p(g_{t+1} \mid a_t, g_t) \sum_{d_t, u_t} p(d_{t+1} \mid d_t, g_t, u_t, a_t)\, b(g_t, u_t, d_t)$
• The sum over $g_t$ requires summation over all possible goals: intractable!
• The sum over $d_t, u_t$ requires summation over all possible histories and user actions: intractable!

Dialogue models for real-world dialogue systems
• Hidden Information State (HIS) system
• Bayesian Update of Dialogue State (BUDS) system

Hidden Information State system
• Real world dialogue system based on POMDP
• Takes an N-best input of user utterances
• Maintains a distribution over most probable dialogue states in real time

Hidden Information State system: dialogue acts
"Is there um maybe a cheap place in the centre of town please?"
  inform(pricerange=cheap, area=centre)
• Dialogue act type: inform, request, confirm, …
• Semantics (slots and values): type=restaurant, food=Chinese, …

Hidden Information State system: ontology
[Figure: ontology tree. Recoverable node labels: type, restaurant, hotel, area, north, south, food, Chinese, Indian, stars]

Hidden Information State system: belief update
  $b(g_{t+1}, u_{t+1}, d_{t+1}) = p(o_{t+1} \mid u_{t+1})\, p(u_{t+1} \mid g_{t+1}, a_t) \sum_{g_t} p(g_{t+1} \mid a_t, g_t) \sum_{d_t, u_t} p(d_{t+1} \mid d_t, g_t, u_t, a_t)\, b(g_t, u_t, d_t)$
• Only the user acts from the N-best list
• Dialogue histories take a small number of values
• Goals are grouped into partitions
• All probabilities are handcrafted

Dialogue history in the HIS system
• The dialogue history ideally represents everything that happened
• History states: system informed, user informed, user requested, system requested, for each concept in the dialogue
• $p(d_{t+1} \mid d_t, g_t, u_t, a_t)$ is either 1 or 0 and is defined by a finite state automaton

HIS partitions
• Represent groups of (most probable) goals
• Dynamically built during the dialogue
• $p(g_{t+1} \mid g_t, a_t)$ is set to a high value if $g_{t+1}$ is in line with $g_t$ and $a_t$, otherwise to a small value

HIS partitions: example
System: How may I help you?
  request(task)
User: I'd like a restaurant in the centre.
  inform(entity=venue, type=restaurant, area=centre)
[Figure: partition tree built from the user act. entity splits into venue and !venue; type splits into restaurant and !restaurant; area splits into central and !central. One leaf is entity=venue, type=restaurant, area=central.]

Pruning
[Figure: the same partition tree after pruning, annotated with entity=venue 0.9, type=restaurant 0.2, area=central 0.5]

Hidden Information State systems
Any limitations?
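
The partition slides above describe goals being grouped, split as the dialogue proceeds, and pruned. A minimal Python sketch of that idea follows. It is a toy illustration under stated assumptions, not the HIS implementation: the `Partition` class, the `split`, `observe` and `prune` helpers, the prior values and the confidence values are all hypothetical, and the observation model is reduced to a single confidence score per user act.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Partition:
    """A group of user goals described by slot constraints (hypothetical structure)."""
    equals: Dict[str, str] = field(default_factory=dict)         # slot must have this value
    excludes: Dict[str, Set[str]] = field(default_factory=dict)  # slot must not have these values
    belief: float = 1.0


def normalise(partitions: List[Partition]) -> List[Partition]:
    """Rescale beliefs in place so that they sum to one."""
    total = sum(p.belief for p in partitions)
    for p in partitions:
        p.belief /= total
    return partitions


def split(partitions: List[Partition], slot: str, value: str, prior: float) -> List[Partition]:
    """Split every undecided partition into slot=value and slot!=value.

    `prior` is an assumed handcrafted probability that a goal has slot=value.
    """
    result = []
    for p in partitions:
        if slot in p.equals or value in p.excludes.get(slot, set()):
            result.append(p)  # this partition already decides slot=value
            continue
        match = Partition({**p.equals, slot: value},
                          {s: set(v) for s, v in p.excludes.items()},
                          p.belief * prior)
        rest_excludes = {s: set(v) for s, v in p.excludes.items()}
        rest_excludes.setdefault(slot, set()).add(value)
        rest = Partition(dict(p.equals), rest_excludes, p.belief * (1.0 - prior))
        result.extend([match, rest])
    return result


def observe(partitions: List[Partition], slot: str, value: str, confidence: float) -> List[Partition]:
    """Reweight partitions for one user act inform(slot=value) with a toy observation model."""
    for p in partitions:
        p.belief *= confidence if p.equals.get(slot) == value else 1.0 - confidence
    return normalise(partitions)


def prune(partitions: List[Partition], keep: int) -> List[Partition]:
    """Keep only the `keep` most probable partitions and renormalise."""
    kept = sorted(partitions, key=lambda p: p.belief, reverse=True)[:keep]
    return normalise(kept)


# Toy run: the user act inform(type=restaurant, area=centre), slot by slot.
partitions = [Partition()]
for slot, value, prior, confidence in [("type", "restaurant", 0.5, 0.8),
                                       ("area", "centre", 0.3, 0.8)]:
    partitions = split(partitions, slot, value, prior)
    partitions = observe(partitions, slot, value, confidence)
partitions = prune(partitions, keep=2)
best = partitions[0]
print(best.equals, round(best.belief, 3))
```

In this sketch the number of partitions grows only with the slot values that users actually mention, which is the property that lets belief be maintained over groups of goals rather than over every individual goal.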
Bayesian Update of Dialogue State system
• Further decomposes the dialogue state
• Tractable belief state update
• Learning of the shape of the distributions

Bayesian network model for dialogue
[Figure: Bayesian network with per-slot nodes $g^{food}$, $g^{area}$, $d^{food}$, $d^{area}$, $u^{food}$, $u^{area}$ at turns $t$ and $t+1$, plus $a_t$, $o_t$, $o_{t+1}$, $r_t$]

Belief tracking
• For each node $x$:
• Start on one side, and keep getting $p(x \mid D_a)$
• Then start on the other end and keep getting $p(D_b \mid x)$
• To get a marginal, simply multiply these

Bayesian network model for dialogue
[Figure: the same Bayesian network with an added parameter node $\theta$]

Training policy using different parameters
• Policy trained using reinforcement learning (explained in next lecture)
• Examined under different errors in the user input
• Average reward

Summary
• Essential ingredients to include in the dialogue state
• Maintaining the belief state
• Dialogue modelling for real world problems
• Learning of the shapes of probability distributions
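
The belief tracking recipe above (collect $p(x \mid D_a)$ from one side, $p(D_b \mid x)$ from the other, then multiply) can be sketched in Python for the two-node chain $D_a$, $a$, $b$, $D_b$ from the belief propagation slides. This is an illustrative sketch only: the function names and all the numbers are assumptions made up for the example, and the BUDS network itself is larger than this chain.

```python
from typing import List


def normalise(values: List[float]) -> List[float]:
    """Rescale non-negative values so that they sum to one."""
    total = sum(values)
    return [v / total for v in values]


def forward_message(p_a_given_Da: List[float], p_b_given_a: List[List[float]]) -> List[float]:
    """p(b | Da) = sum_a p(a | Da) p(b | a)."""
    n_b = len(p_b_given_a[0])
    return [sum(p_a_given_Da[a] * p_b_given_a[a][b] for a in range(len(p_a_given_Da)))
            for b in range(n_b)]


def backward_message(p_Db_given_b: List[float], p_b_given_a: List[List[float]]) -> List[float]:
    """p(Db | a) = sum_b p(Db | b) p(b | a)."""
    return [sum(p_Db_given_b[b] * row[b] for b in range(len(row))) for row in p_b_given_a]


def marginal(from_one_side: List[float], from_other_side: List[float]) -> List[float]:
    """p(x | Da, Db) is proportional to p(x | Da) p(Db | x)."""
    return normalise([f * g for f, g in zip(from_one_side, from_other_side)])


# Made-up numbers for two binary variables a and b.
p_a_given_Da = [0.7, 0.3]                # message reaching a from the Da side
p_b_given_a = [[0.9, 0.1], [0.2, 0.8]]   # rows indexed by a, columns by b
p_Db_given_b = [0.5, 0.1]                # message reaching b from the Db side

p_b_given_Da = forward_message(p_a_given_Da, p_b_given_a)
p_Db_given_a = backward_message(p_Db_given_b, p_b_given_a)

posterior_b = marginal(p_b_given_Da, p_Db_given_b)
posterior_a = marginal(p_a_given_Da, p_Db_given_a)
print(posterior_a, posterior_b)
```

Each message costs a sum over one neighbouring variable only, so the work grows with the size of individual nodes rather than with the size of the joint state space.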