poster - Computational Linguistics and Phonetics

Automatic recognition of habituals:
a three-way classification of clausal aspect
Annemarie Friedrich
EMNLP 2015
LISBON, PORTUGAL
Manfred Pinkal
Department of Computational Linguistics, Saarland University
Clausal aspect
Lexical aspectual class
lexical aspectual class + aspectual transformations
(temporal) function of clause in discourse
property of verb in context
dynamic: event, activity
drink, swim, forget
stative: states, properties
like, be, own
Habituality
episodic: a particular event
habitual: generalization over situations,
exceptions are tolerated
www.coli.uni-saarland.de/projects/sitent
similar: Xue & Zhang (2014)
clausal aspect
Mathew &
Katz (2009)
episodic
John went
swimming
yesterday!
episodic
Bill drank a coffee after lunch.
dynamic
habitual
Bill usually drinks coffee after lunch.
Italians drink coffee after lunch.
Sloths sometimes sit on top of branches.
John never drinks coffee.
dynamic
dynamic
stative
dynamic
Bill likes coffee.
Bill can swim.
Bill didn’t drink coffee yesterday.
Mary has made a cake.
stative
dynamic
dynamic
dynamic
habitual
static
Bill often goes
swimming.
lexical aspect
Siegel & McKeown (2000)
Zarcone & Lenci (2008)
Friedrich & Palmer (2014)
lexically stative clauses & clauses
stativized via aspectual
transformations such as negation,
modals or English perfect
Future work: distinguish them!
Context-based features
Type-based features – linguistic indicators
Data
verb:
subject:
object:
clause:
Siegel & McKeown (2000)
102 texts
10355 clauses
tense, POS, voice, progress., perfect
bare plural, (in)definite
absent, bare plural, (in)definite
modal, negated, conditional, tmod, …
verb type: drink -- ling_ind_past = 0.0927
9.27% of all instances of drink in corpus are in past tense
John has spilled his coffee.
tense=past
perfect=true
parsed background corpus
modal=false
static vs. non-static: accuracy in %
maj. class
90
Context
79.3
Type
84.4
79.2
73.2
64.9
59.7
negated
perfect
progressive
no subject
evaluation adverb
continuous adverb
future
particle
for-PP
in-PP
manner adverb
temporal adverb
60% static
20% episodic
20% habitual
3 annotators, κ=0.61
CASCADED MODEL
JOINT MODEL
Random Forest
classifier
RandomForest
classifier
Context+Type+lemma
71.7
70
frequency
present
past
59.7
50
episodic
static
habitual
30
RANDOM CV
UNSEEN VERBS
static
non-static
episodic vs. habitual: accuracy in %
90
maj. class
lemma
Type
M&K
Context
Context+Type
85.1
82.3 82.8
70
65.4
81.4 83.8 83.1
68.1
Random
Forest
classifier
episodic vs. habitual vs. static
accuracy in %
Both Context- and
ype-based features
are essential.
CASCADED MODEL:
works better
especially for
UNSEEN VERBS.
60
68.4 69.9
59.7 63.8
59.7
20
0
RANDOM CV
30
80
UNSEEN VERBS
81.2 82.6
69.5 72
31.3
CASCADED MODEL improves
classification especially of
habitual class.
habitual
habitual
JOINT
50.2
40
UNSEEN VERBS
F1-scores
60
episodic
63.9
72.1 74.3
40
100
RANDOM CV
79 79.6
80
46.3 46.3
42.1
JOINT
MODEL
100
53.9
50
maj. class
Context
Type
Context+Type
CASCADED
20
0
static
episodic
Next steps
CASCADED
Leverage discourse context?
John rarely ate fruit. He just ate oranges.
Use aspectual distinctions to improve
models of temporal discourse structure
[Costa & Branco 2012]