Conversational role assignment
problem in multi-party dialogues
Natasa Jovanovic, Dennis Reidsma, Rutger Rienks
TKI group
University of Twente
Outline
Research tasks at TKI
- Interpretation of multimodal human-human communication in meetings
- Conversational Role Assignment Problem (CRAP)
- Towards automatic addressee detection
A framework for multimodal interaction research
- I. Unannotated corpus (video and audio recordings in a certain domain)
- II. Layered annotation (via a multimodal annotation tool):
  - IIA. Annotation of events in different modalities (e.g. gaze, posture, gesture, speech)
  - IIB. Multimodal interpretation of events in terms of semantic models
- III. Models and theories of interaction; semantics of annotation schemes
- IV. Tools (e.g. for retrieval, simulation, remote presence, generation of minutes for meetings)
- V. Research on human behaviour
Who is talking to whom?
- CRAP as one of the main issues in multi-party conversation (Traum, 2003)
- Taxonomy of conversational roles (Herbert H. Clark):
  - all participants: speaker, addressee, side participant
  - all listeners additionally include: bystander, eavesdropper
Our goal
- Automatic addressee identification in small-group discussions
- Addressees in meeting conversations: a single participant, a group of people, or the whole audience
- Importance of the issue of addressing in multi-party dialogues
Addressing mechanisms
- What are the relevant sources of information for addressee identification in face-to-face meeting conversations?
- How does the speaker express who the addressee of his utterance is?
- How can we combine all this information in order to determine the addressee of an utterance?
Sources of information
- Speech
  - Linguistic markers
    - word classes: personal pronouns, determiners in combination with personal pronouns, possessive pronouns and adjectives, indefinite pronouns, etc.
  - Name detection (vocatives)
  - Dialogue acts
- Gaze direction
- Pointing gestures
- Context categories (features)
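As a sketch of the speech-based cues above, the snippet below scans an utterance for second-person pronouns and vocatives. The lexicons and participant names are illustrative placeholders, not the actual resources used at TKI.

```python
import re

# Hypothetical lexicons; the real marker lists are not given on the slides.
SECOND_PERSON = {"you", "your", "yours"}
PARTICIPANT_NAMES = {"natasa", "dennis", "rutger"}  # example participants

def extract_markers(utterance: str) -> dict:
    """Collect simple linguistic cues that hint at the addressee."""
    tokens = re.findall(r"[a-z']+", utterance.lower())
    return {
        "second_person": [t for t in tokens if t in SECOND_PERSON],
        "vocatives": [t for t in tokens if t in PARTICIPANT_NAMES],
    }

cues = extract_markers("Dennis, do you agree with this analysis?")
```

A vocative such as "Dennis" is a strong cue on its own; pronouns like "you" only narrow the candidate set and must be combined with gaze or context.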
Dialogue Acts and Addressee Detection (I)
- How many addressees may an utterance have?
- According to dialogue act theory, an utterance or an utterance segment may have more than one conversational function.
- Each DA has an addressee ==> an utterance may have several addressees
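The point above can be illustrated with a minimal data model: one utterance carries several dialogue acts, each with its own addressee. The labels and participant ids are made up for the example.

```python
from dataclasses import dataclass

@dataclass
class DialogueAct:
    """One conversational function of an utterance segment (illustrative)."""
    function: str   # a simplified, MRDA-style label
    addressee: str  # a participant id or "GROUP"

# One utterance, two dialogue acts, two different addressees:
# "Yes, I agree. Rutger, what do you think?"
utterance_das = [
    DialogueAct(function="accept", addressee="GROUP"),
    DialogueAct(function="question", addressee="Rutger"),
]

addressees = {da.addressee for da in utterance_das}
```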
Dialogue Acts and Addressee Detection (II)
- MRDA (Meeting Recorder Dialogue Acts): a tag set for labeling multi-party face-to-face meetings (ICSI)
- We use a large subset of the MRDA set, organized on two levels:
  - Forward-looking functions (FLF)
  - Backward-looking functions (BLF)
Non-verbal features
- Gaze
  - The contribution of gaze to addressee detection depends on: participants' locations (visible area), utterance length, and the current meeting action
  - Turn-taking behavior and addressing behavior
- Gesture (pointing at a person)
  - TALK_TO(X, Y) AND POINT_TO(X, Y)
  - TALK_TO(X, Y) AND POINT_TO(X, Z): X talks to Y about Z
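The two pointing rules above can be sketched as a single function: pointing at the person being talked to reinforces the addressee hypothesis, while pointing at someone else marks that person as the topic rather than the addressee. Function and field names are my own.

```python
def interpret_pointing(talk_target: str, point_target: str) -> dict:
    """Combine TALK_TO(X, Y) with POINT_TO(X, Z), per the rules above."""
    if point_target == talk_target:
        # POINT_TO reinforces the addressee; nothing extra is referenced.
        return {"addressee": talk_target, "about": None}
    # X talks to Y about Z: the pointed-at person is the topic.
    return {"addressee": talk_target, "about": point_target}
```

For example, `interpret_pointing("Y", "Z")` keeps Y as the addressee and records Z as the referent of the utterance.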
Context categories
- Bunt: the "totality of conditions that may influence understanding and generation of communicative behavior"
  - Local context is an aspect of context that can be changed through communication
- Context categories:
  - Interaction history (verbal and non-verbal)
  - Meeting action history
  - Spatial context (participants' locations, distances, visible area, etc.)
  - User context (name, gender, roles, etc.)
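One way to make the four context categories concrete is a simple container type; the grouping mirrors the list above, but the field shapes are assumptions for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class MeetingContext:
    """Hypothetical grouping of the context categories listed above."""
    interaction_history: list = field(default_factory=list)     # prior (speaker, addressee) pairs
    meeting_action_history: list = field(default_factory=list)  # e.g. ["presentation", "discussion"]
    spatial: dict = field(default_factory=dict)                 # participant -> seat position
    users: dict = field(default_factory=dict)                   # participant -> {"name": ..., "role": ...}

ctx = MeetingContext()
ctx.interaction_history.append(("A", "B"))  # A addressed B in the previous turn
```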
Towards automatic addressee detection
- Manual or automatic feature annotation?
- An automatic target interpreter has to deal with uncertainty
- Methods:
  - Rule-based method
  - Statistical method (Bayesian networks)
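To give the statistical option some shape: the simplest Bayesian network for this task treats each cue as conditionally independent given the addressee (naive Bayes). The slide does not specify the network structure, so the numbers and cue names below are purely illustrative.

```python
def posterior(prior: dict, likelihoods: list) -> dict:
    """Combine a prior over candidate addressees with per-cue likelihoods,
    assuming conditionally independent cues (a naive-Bayes sketch)."""
    scores = dict(prior)
    for cue in likelihoods:
        for cand in scores:
            scores[cand] *= cue.get(cand, 1e-6)  # small floor for unseen candidates
    total = sum(scores.values())
    return {cand: s / total for cand, s in scores.items()}

p = posterior(
    {"B": 1/3, "C": 1/3, "D": 1/3},            # uniform prior over candidates
    [{"B": 0.7, "C": 0.2, "D": 0.1},            # e.g. a gaze cue
     {"B": 0.6, "C": 0.3, "D": 0.1}],           # e.g. a linguistic cue
)
```

With both cues favoring B, the posterior concentrates on B; a real Bayesian network would also model dependencies between cues instead of assuming independence.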
Rule-based method
1. Process the information obtained from the utterance (linguistic markers, vocatives, DAs). The result is a list of possible addressees with corresponding probabilities:
   1. Eliminate cases where the target is completely determined (for instance, a name in vocative form)
   2. Apply the set of rules for BLFs
   3. Apply the set of rules for FLFs
2. Process gaze and gesture information, adding additional probability values to the candidates
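The two-stage procedure above can be sketched as follows: a vocative short-circuits the decision; otherwise DA-derived candidate probabilities are adjusted with gaze/gesture evidence. The input format and all weights are assumptions, not taken from the slides.

```python
def rule_based_addressee(utterance_cues: dict, gaze_scores: dict) -> dict:
    """Sketch of the two-stage rule-based method."""
    # Stage 1a: a vocative completely determines the addressee.
    if utterance_cues.get("vocative"):
        return {utterance_cues["vocative"]: 1.0}
    # Stage 1b: candidate probabilities from the BLF/FLF rules (given here as input).
    scores = dict(utterance_cues.get("da_candidates", {}))
    # Stage 2: add gaze/gesture probability mass to the candidates.
    for cand, g in gaze_scores.items():
        scores[cand] = scores.get(cand, 0.0) + g
    total = sum(scores.values()) or 1.0
    return {c: s / total for c, s in scores.items()}

out = rule_based_addressee({"da_candidates": {"B": 0.5, "C": 0.5}},
                           {"B": 0.8, "C": 0.2})
```

Here the DA rules leave B and C tied, and the gaze evidence breaks the tie in favor of B.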
Meeting actions and addressee detection
- The automatic addressee detection method can be applied to the whole meeting
- Knowledge about the current meeting action, as well as the meeting action history, may help to better recognize the addressee of a dialogue act
Future work
- Development of a multimodal annotation tool
- Data annotation for:
  - training and evaluating statistical models
  - obtaining inputs for the rule-based method
- New meeting scenarios for research in addressing