slides day 4 (Idioms. Speech acts and conventionalisation.)

Idioms in HPSG implementations
Speech acts
Conventional speech act formulae
Idioms. Speech acts and conventionalisation.
Ann Copestake
Natural Language and Information Processing Group
Computer Laboratory
University of Cambridge
June 2008
Course topics
• Tuesday: Idioms in HPSG implementations. Speech acts
and conventionalisation.
• Wednesday: Generation: realisation ranking. Lexical
selection: grammar vs collocation.
Outline.
Idioms in HPSG implementations
Idiom variability
Formalizing/implementing idioms
Speech acts
BDI
Speech acts as idioms
Speech acts in computational systems
Conventional speech act formulae
Conventional speech act formulae
Idioms and variability
Google search on spill beans on 3/6/2008:
• Grant’s chance to spill the beans on exit
• Early universe to spill beans on physics
• I won’t embarass Tom and spill all the beans
• White hat hackers spill their beans
• Mariah Carey and Nick Cannon spill wedding beans to
People
• Loos to spill more of Beckham’s beans
Classes of idiom
1. words not found in other contexts
by dint of, tit for tat
2. syntactically ill-formed
by and large
NB: to lose face is syntactically regular
3. not decomposable
kick the bucket, red herring
4. decomposable once idiom meaning is known
let the cat out of the bag, spill the beans
5. transparent (conventional metaphor?)
cast light on (seeing as understanding), grease the wheels
Nunberg, Sag and Wasow (1994), Riehemann (2001)
Idioms as compositional
Hypothesis: some speakers attribute meaning to the individual
words in decomposable and transparent idioms:
• spill the beans corresponds to reveal the secrets
• cat out of the bag corresponds to secret out of the hiding
place
• light at the end of the tunnel corresponds to good outcome
at the end of the difficult circumstances
That is a cat which has been a very long time coming out of its
bag.
Idiomatic lexical signs
• lexical variation: cast/throw/shed light on
• recurring uses
shed light on (help understanding of)
see the light (come to understanding)
light dawns (understanding happens)
• mixing idioms
drop a bombshell (utter something startling)
drop a brick (utter something stupid)
These idioms can be mixed:
Kim is unpredictable: she’ll either drop a bombshell or a
brick
conjunction implies the same ‘drop’ in both idioms.
Intuitive idea of formalisation
• Decomposable idioms are compositional, given the
idiomatic meaning-form correspondence
• Idiomatic lexical signs, constrained by idiomatic phrase
types to co-occur (possibly with normal signs)
• Specify semantics on idiomatic phrase to get the right
idiom pattern:
i_spill_v(e,u,y), i_bean_n(y)
Idioms in the LKB
Lexical signs:
i_spill_tv1 := idiomatic-trans-verb &
  [ ORTH <! "spill" !>,
    SEM.HOOK.KEYPRED "i_spill_v_rel" ].
i_bean_n1 := idiomatic-noun-lxm &
  [ ORTH <! "bean" !>,
    SEM.HOOK.KEYPRED "i_bean_n_rel" ].
Phrasal constraint:
spill_the_beans := phrase &
  [ SEM.IDRELS < [ PRED i_spill_v_rel,
                   ARG2 #y ],
                 [ PRED i_bean_n_rel,
                   ARG0 #y ] > ].
Phrasal constraint
• Phrasal constraints are required to ensure that all the
required parts of the idioms are there: need to block e.g.,
idiomatic beans without idiomatic spill.
• Lexical selection not adequate: non-idiomatic words in
idioms, non-headed idioms (cat out of bag).
• Constraint implemented as a root symbol/start symbol in
grammar: all bits of idiom must appear in same sentence.
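The co-occurrence check described above can be sketched in Python. This is only an illustration and not the LKB mechanism: the `Rel` tuple, the `IDIOMS` table and the function `idioms_licensed` are hypothetical names, and the semantics of a sentence is simplified to a flat list of relations.

```python
from typing import List, NamedTuple, Optional

class Rel(NamedTuple):
    """A simplified relation (hypothetical representation, not LKB's)."""
    pred: str
    arg0: str
    arg2: Optional[str] = None

# Hypothetical idiom table: (verb predicate, noun predicate).
# The verb's ARG2 must be the same variable as the noun's ARG0.
IDIOMS = {"spill_the_beans": ("i_spill_v_rel", "i_bean_n_rel")}

def idioms_licensed(rels: List[Rel]) -> bool:
    """True if every idiomatic relation (pred starting with 'i_') takes
    part in a complete idiom pattern within the same sentence."""
    licensed = set()
    for verb_pred, noun_pred in IDIOMS.values():
        for v in rels:
            for n in rels:
                if (v.pred == verb_pred and n.pred == noun_pred
                        and v.arg2 == n.arg0):
                    licensed.update({v, n})
    return all(r in licensed for r in rels if r.pred.startswith("i_"))

# Idiomatic 'spill' with idiomatic 'beans' as its object: accepted.
ok = [Rel("i_spill_v_rel", "e1", "x1"), Rel("i_bean_n_rel", "x1")]
# Idiomatic 'beans' without idiomatic 'spill': blocked.
bad = [Rel("_eat_v_rel", "e1", "x1"), Rel("i_bean_n_rel", "x1")]
print(idioms_licensed(ok), idioms_licensed(bad))
```

Checking the whole sentence rather than a head's complements mirrors the point that lexical selection alone is not adequate for non-headed idioms.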
Properties of a grammar
Grammar: A grammar consists of a set of grammar rules, G,
a set of lexical entries, L, and a start structure, Q.
Lexical sign: A lexical sign is a pair ⟨L, S⟩ of a TFS L and a
string list S.
Valid phrase: A valid phrase P is a pair ⟨F, S⟩ of a TFS F and a
string list S such that either:
1. P is a lexical sign, or
2. F is subsumed by some rule R and there are valid phrases
⟨F1, S1⟩ . . . ⟨Fn, Sn⟩ s.t. R’s daughters subsume
F1 . . . Fn and S is the ordered concatenation of
S1 . . . Sn.
Sentences: A string list S is a well-formed sentence if there is
a valid phrase ⟨F, S⟩ such that the start structure
Q subsumes F.
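The subsumption test that these definitions rely on can be sketched in Python under strong simplifying assumptions: feature structures are plain nested dictionaries with atomic string values, with no type hierarchy and no reentrancy (real TFSs have both). The feature names in the example are hypothetical.

```python
def subsumes(general, specific) -> bool:
    """True if `general` subsumes `specific`, i.e. every feature-value
    constraint in `general` also holds in `specific`.
    Simplification: untyped nested dicts, no reentrancy."""
    if isinstance(general, dict):
        return (isinstance(specific, dict)
                and all(feat in specific and subsumes(val, specific[feat])
                        for feat, val in general.items()))
    return general == specific

# Hypothetical start structure Q and the TFS F of some valid phrase.
Q = {"HEAD": "verb", "SUBCAT": "saturated"}
F = {"HEAD": "verb", "SUBCAT": "saturated",
     "SEM": {"KEYPRED": "i_spill_v_rel"}}

print(subsumes(Q, F))  # True: Q is more general, so the string is a sentence
print(subsumes(F, Q))  # False: F carries information that Q lacks
```

An idiom constraint implemented as a start structure then amounts to adding further conditions to Q that F must satisfy.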
Idioms: summary
• Many idioms are not fixed phrases.
• Distribution implies compositionality.
• HPSG implementation as phrasal constraints on
co-occurring lexical signs. (Maybe statistical models can
account for canonical forms???)
• Can we use this sort of account elsewhere, e.g., for
conventional speech acts?
Idioms as compositional: Nunberg et al
Pulman: idioms as normal lexical items plus quasi-inference
Copestake (1994), Riehemann (1997, 2001)
http://mwe.stanford.edu/idioms.html for bibliography
Speech acts
The simple story:
• declarative ↔ statement
• interrogative ↔ question
• imperative ↔ request/command
But:
• A window is open
• ∃x[window(x) ∧ open(e, x) ∧ present(e)]
• Context: tense, set of relevant windows etc, temperature.
• Speaker’s intention: get hearer to close the window.
i.e., a declarative acting as a request/command
Similarly:
• Is a window open?
Speech act theory
• assertives — Speaker is committed to the truth of a
proposition
• directives — Speaker is attempting to achieve some effect
via the actions of Hearer (e.g., ordering, requesting)
• commissives — commit Speaker to some action (e.g.,
promising, offering)
• expressives — expression of some internal/psychological
state
• declarations — successful performance brings about the
propositional effect (e.g., ‘I resign’)
The inference approach
‘Can you pass the salt?’ is a question as far as syntax and
semantics are concerned, but the hearer infers a request from
the literal meaning.
1. Speaker has uttered something and the hearer
understands the literal meaning
2. In context, literal meaning would violate the Cooperative
principle.
3. Hearer assumes that Speaker must have intended
something else
4. Hearer has to work out what Speaker intended.
For questions used as requests, the query concerns a
precondition that must be satisfied in order for Hearer to carry
out the action.
Beliefs, desires and intentions (BDI)
Request (S is Speaker, H is Hearer, A is some Action):
Preconditions:
S believes H able/willing to do A;
A is not going to happen anyway;
S wants H to do A.
Action: S predicates future act A of H.
Effect: H believes S wants H to do A.
Planning concepts:
preconditions — conditions necessary for an action to be
performed
action — the nature of the action performed
effect — the expected result of the action.
A rational agent reasons about the actions that it can perform
which will have effects that lead towards its goal.
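The Request operator above can be written as a STRIPS-style planning operator. The sketch below is an assumption-laden illustration: propositions are plain strings, effects only add facts, "able/willing" is split into two preconditions, and the names `Operator`, `applicable` and `apply_op` are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet

@dataclass(frozen=True)
class Operator:
    name: str
    preconditions: FrozenSet[str]
    effects: FrozenSet[str]

# Request, following the slide (S = Speaker, H = Hearer, A = Action).
REQUEST = Operator(
    name="request(S,H,A)",
    preconditions=frozenset({
        "bel(S,able(H,A))",
        "bel(S,willing(H,A))",
        "not(happens_anyway(A))",
        "want(S,do(H,A))",
    }),
    effects=frozenset({"bel(H,want(S,do(H,A)))"}),
)

def applicable(op: Operator, state: FrozenSet[str]) -> bool:
    """An operator can be performed only if all its preconditions hold."""
    return op.preconditions <= state

def apply_op(op: Operator, state: FrozenSet[str]) -> FrozenSet[str]:
    """Return the state after performing the operator (add-only effects)."""
    if not applicable(op, state):
        raise ValueError(f"preconditions of {op.name} do not hold")
    return frozenset(state) | op.effects

state = frozenset(REQUEST.preconditions)
print(sorted(apply_op(REQUEST, state) - state))
```

A planner would search for a sequence of such operators whose effects reach the agent's goal.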
Requests
S has a desire for the window to be open.
Possible ways of realizing this:
1. S opens the window
2. S requests that H opens the window
S can also make an indirect request: if S asks Can you open
the window?, then S is questioning a precondition of the literal
Open the window. Since H doesn’t believe S is interested in H’s
ability for its own sake, H can interpret this as a request to open
the window.
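The reasoning step for indirect requests can be sketched as a lookup: a question about a precondition of some action is reinterpreted as a request for that action when H does not believe S wants the literal answer for its own sake. The table `PRECONDITIONS` and the function `interpret_question` are hypothetical, and the single boolean stands in for what would really be inference over beliefs.

```python
# Hypothetical table: action -> preconditions of a request for that action.
PRECONDITIONS = {
    "open(H,window)": {
        "able(H,open(H,window))",
        "willing(H,open(H,window))",
    },
}

def interpret_question(queried: str, literal_interest: bool) -> tuple:
    """Interpret a yes/no question about the proposition `queried`.

    literal_interest: whether H believes S wants the answer for its own
    sake (an assumption standing in for full belief reasoning)."""
    if not literal_interest:
        for action, preconds in PRECONDITIONS.items():
            if queried in preconds:
                return ("request", action)
    return ("question", queried)

# 'Can you open the window?' when S clearly does not care about ability:
print(interpret_question("able(H,open(H,window))", literal_interest=False))
# The same words where ability is genuinely at issue:
print(interpret_question("able(H,open(H,window))", literal_interest=True))
```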
Indirect responses
A: Can we meet next week?
B: I’m away from the 4th.
B’s response is rational if the 4th is before or during next week,
but otherwise isn’t.
Assume A’s goal is to meet next week: A is genuinely
questioning the preconditions.
Some problems
Why don’t we just use imperatives? e.g., Open the window!
It’s not polite, but how do we account for politeness?
Conventional ways of making requests with interrogatives
(Clark et al 1979):
Can you close the window?
Are you able to close the window?
Second is more likely to be interpreted literally.
Different behaviour of conventional ‘indirect’ requests:
Could you please pass the salt?
* Should I please close the window?
* Are you able to please pass the salt?
* I can’t please reach the salt.
This suggests that conventional ‘indirect’ requests aren’t really
indirect.
Speech acts vs empirical data
Theoretical problems with the speech act approach:
• Speech act theory does not explain the forms that
speakers use
• Conventional forms are used to perform requests (e.g.,
evidence of please)
• Distribution of forms used depends on extra-linguistic facts
(social class, etc) and degree of politeness
• Theory of mind problem: small children use indirect
speech acts but don’t seem to be able to reason about
what other people do/don’t know.
We want to know how people actually make requests etc, not
just how they can do so.
Conventional speech acts as idioms?
Idiom approach
conventional speech acts are represented in the
lexicon/grammar with idiosyncratic meaning
Do you have X?
gives logical form equivalent to
‘request(transfer(hearer,speaker,X))’
Not appropriate for non-conventional cases, indirect responses
etc, where inference is still assumed:
A: When do you want to leave?
B: My meeting starts at 3pm.
Conventional speech acts as idioms?
Many of the standard objections (e.g., in Levinson’s book) don’t
apply if idioms are treated as essentially compositional.
The dual uptake problem:
A: Do you have the International Herald Tribune today?
B: Yes, here you are.
The speech act is not just a request. ‘Do you have X?’ means
something more like ‘Do you have X and if so transfer it to me’.
So can’t just have idiomatic meaning.
Speech acts in computational systems
• Planning approach used for interpretation in demo
systems, but it is relatively brittle.
• Alternative (mainstream): interpretation primarily based on
dialogue state: ignore could you etc (or treat as cue phrases
which provide features for a machine learning approach).
• Generally: try to discourage the user from making indirect
responses.
• Generation: requests etc usually hard-wired (but lack of
variation).
Dialogue act tagging (Stolcke et al, 2000)
Manual tagging with dialogue acts:
• Modified DAMSL tag set (42 labels)
• 1,155 Switchboard conversations were tagged: this is
about 205,000 utterances
• Good interannotator agreement overall: kappa 0.8
Experiments with automatic tagging:
• Word strings: HMMs over transcribed/recognised words.
• Prosodic model
• ‘discourse grammar’ for dialogue structure: HMMs over
dialogue act tags.
Best results from combined approaches: 65% accuracy on
recognised speech, 71% on transcribed speech. (Baseline is 35%;
ceiling (human agreement) is 84%.)
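The 'discourse grammar' idea, an HMM whose hidden states are dialogue act tags, can be illustrated with a toy Viterbi decoder. Everything numeric here is invented for illustration: the three tags, the probabilities, and the reduction of each utterance to a single cue feature (`wh`, `yes_no_word`, `other`) are assumptions, not the models or tag set of Stolcke et al.

```python
import math

TAGS = ["statement", "question", "answer"]
# Toy probabilities (made up for this sketch).
START = {"statement": 0.5, "question": 0.4, "answer": 0.1}
TRANS = {  # P(next tag | previous tag): the 'discourse grammar'
    "statement": {"statement": 0.6, "question": 0.3, "answer": 0.1},
    "question": {"statement": 0.1, "question": 0.1, "answer": 0.8},
    "answer": {"statement": 0.5, "question": 0.4, "answer": 0.1},
}
EMIT = {  # P(cue feature | tag): stands in for the word-based model
    "statement": {"wh": 0.05, "yes_no_word": 0.15, "other": 0.8},
    "question": {"wh": 0.6, "yes_no_word": 0.1, "other": 0.3},
    "answer": {"wh": 0.05, "yes_no_word": 0.6, "other": 0.35},
}

def viterbi(observations):
    """Return the most probable tag sequence for a non-empty list of
    cue features, working in log space."""
    best = {t: (math.log(START[t]) + math.log(EMIT[t][observations[0]]), [t])
            for t in TAGS}
    for obs in observations[1:]:
        new = {}
        for t in TAGS:
            # Best predecessor for tag t at this position.
            score, path = max(
                ((best[p][0] + math.log(TRANS[p][t]), best[p][1])
                 for p in TAGS),
                key=lambda sp: sp[0])
            new[t] = (score + math.log(EMIT[t][obs]), path + [t])
        best = new
    return max(best.values(), key=lambda sp: sp[0])[1]

print(viterbi(["wh", "yes_no_word", "other"]))
# → ['question', 'answer', 'statement']
```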
Conventional speech act formulae
• Conventional formulae frequent (varying according to
context), may be lexicalized. But ‘idiom theory’ fails to
account for dual responses.
• Conventional speech act formulae as interpretive shortcuts
(Morgan 1978): allow hearers a shortcut to an
interpretation they could have obtained by full inference;
a pre-packaged way for speakers to achieve an effect
• Copestake and Terkourafi:
• conventional illocutionary force represented separately from
the compositional semantics
• conventional illocutionary force adds to the compositional
semantics, thus licensing dual responses
• work with real formulae verified by corpus analysis
• attempt a precise formalization of conventional formulae
Conventional speech act formulae in Cypriot Greek
(Terkourafi 2001)
• Corpus of 2,000+ recorded spontaneous exchanges
• Offers and requests identified depending on:
addressee’s uptake
desirability of act predicated
• Full contexts broken down to ‘minimal’ contexts (age,
gender, socioeconomic status, etc)
• Analyzed linguistic realization of offers and requests in
different ‘minimal’ contexts
Conventional speech act formulae in Cypriot Greek:
findings
• Different formulae preferred in different ‘minimal’ contexts
• Types of formulae:
Lexeme-based: inflected verb forms rendered with a
particular accent and intonation
Construction-based: e.g. imperative, 1sg subjunctive, 2sg
subjunctive
• formulae: frequency in a ‘minimal’ context
• formulae: evidence of lexicalization
• fixed word order
• phonological reduction
• characteristic intonation contour
• please-insertion in requests (Greek parakalo / ligho)
The ‘want’ formulae
(Gr. thelo): thelis NP/VP? (= do-you-want NP/VP?)
[Shoe-shop; Speaker: female, aged 18-30, working-class;
Addressee: female, aged 18-30, middle class;
Relationship: acquaintances]
’lis kafe? (.) indalos in’ o kafes su?
‘Do you want coffee? How is your coffee?’
[Shoe-shop; Speaker: female, aged 31-50, working-class;
Addressee: female, aged 31-50, working class;
Relationship: salesperson to new customer]
thelis na valumen kanena pataki mesa?
‘Do you want us to put an insole in?’
The ‘want’ formulae
• illocutionary force: offer
• context: wide range of informal contexts (home and work)
• most frequent function of thelis NP/VP? (103/112
occurrences)
• most frequent verb form for offers in each of many informal
contexts
• lexicalization:
word order: sentence initial (over 90%)
phonological reduction: thes, ’lis
Speech act formulae in HPSG
Formulae treated as analogous to lexical items/idioms
• each formula is a sign, a conventional association between
• PHON: including intonation as indicator of function
• SYNSEM: syntax, semantics, morphology
• CTXT
• C-ILLOC: conventionalized illocutionary force (i.e., the
shortcut)
• BACKGROUND: situational info from ‘minimal’ contexts
• all formulae are listed
The C-ILLOC feature in CTXT
C-ILLOC only instantiated in formulae
thelis na valumen kanena pataki mesa?
Do you want us to put an insole in?
sign for utterance contains (schematically where S is
SPEAKER, H is HEARER):
SYNSEM:CONTENT: int(thelo(H,[1]=put-insole-in(S)))
C-ILLOC: OFFER(S,H,[1])
Some utterances will have formulaic and non-formulaic
analyses
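The schematic sign above can be mimicked with a small data structure in which the tag [1] is modelled as a shared object: the same term appears inside CONTENT and as the third argument of C-ILLOC. The classes `Term` and `Sign` are hypothetical and far simpler than an HPSG sign; PHON, syntax and BACKGROUND are omitted or reduced to strings.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass(frozen=True)
class Term:
    functor: str
    args: Tuple[object, ...] = ()

@dataclass(frozen=True)
class Sign:
    phon: str
    content: Term                    # SYNSEM:CONTENT (compositional)
    c_illoc: Optional[Term] = None   # only instantiated in formulae

# [1] = put-insole-in(S), shared between CONTENT and C-ILLOC.
act = Term("put-insole-in", ("S",))
content = Term("int", (Term("thelo", ("H", act)),))

phon = "thelis na valumen kanena pataki mesa?"
formulaic = Sign(phon, content, c_illoc=Term("OFFER", ("S", "H", act)))
non_formulaic = Sign(phon, content)  # same string, no C-ILLOC

# The offered act is the very object that the hearer is asked about.
print(formulaic.c_illoc.args[2] is formulaic.content.args[0].args[1])
```

Keeping both analyses for one string reflects the point that some utterances have formulaic and non-formulaic readings.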
C-ILLOC and compositional semantics
C-ILLOC and compositional semantics - three possibilities:
1. conventionalized illocutionary force, C-ILLOC, instantiated
along with compositional semantics (as above, this talk)
2. no C-ILLOC, speaker intentions inferred by the hearer from
compositional semantics
3. C-ILLOC is instantiated, no (useful) compositional
semantics, e.g. greetings like Hello!
Dual uptake
[Pharmacy; Speaker A: female, aged 18-30, working class;
Speaker B: female, aged over 51, middle class;
Relationship: new customer to salesperson]
A: na mu kopsete apodhiksin? (=‘Can you give me a receipt?’)
B: ne ((issues receipt)) (=‘Yes.’ ((issues receipt)))
Asymmetry between two parts:
• ordering: literal first
• different interactional consequences if only one is
provided:
• no reply to literal meaning > lack of politeness
• no reply to indirect meaning > uncooperativeness
Compositional semantics plus conventionalised illocutionary
force.
Formalizing OFFER
our account: OFFER(S,H,ACT(S))
• connection with part of compositional semantics (which
part depends on formula)
• OFFER(S,H,[index into content])
SYNSEM:CONTENT: int(thelo(H,[1]=put-insole-in(S)))
C-ILLOC: OFFER(SPEAKER,HEARER,[1])
BUT:
’lis kafe? (Do you want coffee?)
thelis na scepastis? (Do you want to cover up?)
offers and OFFERS
• uptake criterion primary for corpus annotation as offer vs
request (beneficiary may not distinguish in collaborative
situation, e.g., buying/selling event)
• but for C-ILLOC, OFFER is distinguished from REQUEST
by speaker agency (speaker’s plan)
• examples from corpus which are offers according to the
uptake criterion may not contain explicit ACT(S)
• hypothesis: all offers involve explicit or implicit ACT(S)
• ’lis kafe? S offers to provide H with coffee
• thelis na scepastis? S offers H some action by S which will
allow H to cover up (fetching a blanket)
Logical metonymy and conventionalized illocutionary
force
• OFFER takes event with SPEAKER agent
• contextual coercion with NP: ’lis kafe?
SYNSEM:CONTENT: int(thelo(H,[1]=coffee))
C-ILLOC: OFFER(SPEAKER,HEARER, P(S,H,[1]))
most examples of thelis NP are food/drink, so P=provide
(cf enjoy)
• contextual coercion with VP with HEARER agent
thelis na scepastis? (=‘Do you want to cover up?’)
SYNSEM:CONTENT: int(thelo(H, [1] = cover_up(H)))
C-ILLOC: OFFER(S,H,P) & P=ACT(S)
& PRECONDITION(P,[1])
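The three cases for thelis NP/VP? (speaker-agent VP, NP, hearer-agent VP) can be summarised as a small case analysis. This is a sketch only: the classes `Entity` and `Event`, the function `c_illoc_for_want`, and the string output format are hypothetical, and `provide` is hard-wired as the default predicate on the basis of the food/drink observation above.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Entity:
    noun: str          # content of an NP complement, e.g. coffee

@dataclass(frozen=True)
class Event:
    pred: str          # content of a VP complement
    agent: str         # "S" (speaker) or "H" (hearer)

def c_illoc_for_want(wanted, default_np_pred: str = "provide") -> str:
    """Return a schematic C-ILLOC for 'thelis NP/VP?' given what H is
    asked about wanting. OFFER must end up with a speaker-agent event."""
    if isinstance(wanted, Entity):
        # NP: coerce the entity to an event P(S,H,[1]), P = provide.
        return f"OFFER(S,H,{default_np_pred}(S,H,{wanted.noun}))"
    if wanted.agent == "S":
        # VP with speaker agent: direct link to the content.
        return f"OFFER(S,H,{wanted.pred}(S))"
    # VP with hearer agent: S offers some act that enables H's act.
    return f"OFFER(S,H,P) & P=ACT(S) & PRECONDITION(P,{wanted.pred}(H))"

print(c_illoc_for_want(Event("put-insole-in", "S")))
print(c_illoc_for_want(Entity("coffee")))
print(c_illoc_for_want(Event("cover_up", "H")))
```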
Conventionalised illocutionary force: a summary
• Formalization of Morgan’s shortcut idea via
conventionalized illocutionary force.
• Compared with idiom theory: compositional meaning is
retained.
• HPSG account allows all conventional aspects of sign to
be integrated.
• Linking between compositional semantics and C-ILLOC,
not replacing compositional semantics (so, dual uptake
possible).
• Specifying meaning of REQUEST, OFFER: hypothesis that
some examples involve conventionalized metonymic
inference (cf logical metonymy).
This talk: summary
• Idioms are conventionalised but not fully fixed (Riehemann
2001 for detailed analysis).
• Compositional idioms can be treated as lexical entries
constrained to appear in fixed semantic relationship.
• Inference is needed in general to work out speaker
intention, but conventional formulae exist (Terkourafi 2001).
• Conventional speech act formulae as interpretative
shortcuts.
• Formulae as phrases associated with extra C-ILLOC, direct
or metonymic connection to compositional interpretations.
• Idiom and formula distribution/variation???