Modeling Long-Distance
Dependencies in Double R
July 2008
Jerry Ball
Human Effectiveness Directorate
Air Force Research Laboratory
Double R Model
• Goal: Model the basic grammatical patterns of
English to support development of cognitively
plausible and functional language comprehension
systems
– Declaratives – “The man hit the ball”
– Questions
• Yes-No Questions – “Did the man hit the ball?”
• Wh Questions – “Where did the man hit the ball?”
– Imperatives – “Hit the ball!”
– Relative Clauses – “The ball that the man hit”
– Wh Clauses – “I know where the man hit the ball”
– Passive constructions – “The ball was hit”
Empirical Evidence
• Basic grammatical patterns have been most
extensively studied in generative grammar
– The focus in generative grammar has been on studying the
syntactic form of linguistic expressions in isolation from
meaning and processing
• The “Simpler Syntax” of Culicover and Jackendoff
(2005) restores the consideration of meaning
and simplifies syntax as a side effect
• O’Grady’s “Syntactic Carpentry” (2005) integrates
processing as well (see also Hawkins, 2004)
• Reference grammars (Huddleston & Pullum, 2002;
Quirk et al., 1985) provide a wealth of examples
which integrate form, function and meaning
Long-Distance Dependencies
• Long-distance dependencies are the sine qua non of
modern linguistic theorizing
– An important motivation for Chomsky’s transformational
grammar – deep structures with arguments in place are
mapped to surface structures with arguments “moved” by
various transformations
• Introduction of traces supported the collapsing of deep and
surface structure – traces mark the original location
• Construction-specific transformations were generalized to
Move α, subject to universal, parameterized constraints
– Many basic grammatical constructions involve long-distance
dependencies
• Wh questions, relative clauses, passive constructions…
– Require retention of grammatical information for extended
stretches of input
Long-Distance Dependencies
• Binding of pronouns and anaphors:
– Anaphors (“himself”) vs. pronouns (“him”)
• John_i kicked himself_i (i = i) (Principle A of GB Theory)
• John_i kicked him_j (i ≠ j) (Principle B of GB Theory)
– Proper binding often requires use of semantic
information (although binding is considered syntactic in
generative grammar)
• John_i and Mary_j were talking. She_j told him_i… (gender)
• John_i is reading a book_j. It_j is about… (animacy)
• John_i is reading the comics_j. They_j are… (number)
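The role of gender, animacy and number in binding can be illustrated with a small Python sketch. It is only an illustration under stated assumptions: the `Referent` class, its field names and the "most recent agreeing candidate" policy in `resolve_pronoun` are hypothetical and are not taken from the Double R model.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Referent:
    """Hypothetical record of a referring expression and its agreement features."""
    text: str
    index: str               # binding index such as "i" or "j"
    animate: bool
    gender: Optional[str]    # "male", "female" or None when not applicable
    number: str              # "sing" or "plur"


def resolve_pronoun(pronoun: Referent, candidates: List[Referent]) -> Optional[Referent]:
    """Return the most recent candidate that agrees with the pronoun (assumed policy)."""
    for cand in reversed(candidates):
        agrees = (
            cand.animate == pronoun.animate
            and cand.number == pronoun.number
            and (pronoun.gender is None or cand.gender == pronoun.gender)
        )
        if agrees:
            return cand
    return None


john = Referent("John", "i", True, "male", "sing")
mary = Referent("Mary", "j", True, "female", "sing")
book = Referent("a book", "j", False, None, "sing")
comics = Referent("the comics", "j", False, None, "plur")

she = Referent("she", "?", True, "female", "sing")
him = Referent("him", "?", True, "male", "sing")
it = Referent("it", "?", False, None, "sing")
they = Referent("they", "?", False, None, "plur")

she_antecedent = resolve_pronoun(she, [john, mary])      # gender decides
him_antecedent = resolve_pronoun(him, [john, mary])      # gender decides
it_antecedent = resolve_pronoun(it, [john, book])        # animacy decides
they_antecedent = resolve_pronoun(they, [john, comics])  # number decides
```

In each call the syntactic configuration alone does not single out the antecedent; the semantic features do.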
Long-Distance Dependencies
• Verb Control
– Object Control: “He_i persuaded me_j PRO_j to go”
• PRO_j is an “implicit” pronoun (a trace without movement)
– Subject Control: “He_i promised me_j PRO_i to go”
• Raising Verbs
– “He_i seems t_i to like me”
• t_i is a trace of a “raised” argument
Long-Distance Dependencies
• Passive Constructions
– “The ball_i was kicked t_i by the man”
• The object is “raised” out of its normal position and the
subject is pushed into an oblique complement position
“by the man”
• Wh Questions
– “Who_i did John_j decide PRO_j to see t_i”
• Relative Clauses
– “The ball_i that the man kicked t_i”
Modeling Long-Distance
Dependencies
• An ontology of DM chunk types supports the
grammatical distinctions
• Productions match buffer elements at the
appropriate level of the ontology given the function
of the production, e.g.
– Production matches pronoun “he…” → project nominal and
put in subject buffer
– Production matches predicate specifier (e.g. “…is…”) →
project a declarative clause
– Production matches declarative clause and a nominal in
subject buffer (e.g. “he is…”) → integrate the nominal as the
subject of the clause
– Production matches transitive verb (e.g. “hitting”)
functioning as clausal head (e.g. “he is hitting…”) and a
nominal (e.g. “…the ball”) → integrate the nominal as the
object of the verb
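Matching "at the appropriate level of the ontology" can be sketched in Python as a walk up a type hierarchy. This is a minimal sketch, not the Double R implementation (which is written as ACT-R productions): the parent links in `PARENT`, the helpers `isa` and `matches`, and the condition table `INTEGRATE_SUBJECT` are all assumptions made for the example.

```python
from typing import Dict, Optional

# Hypothetical fragment of the chunk-type ontology (child type -> parent type).
PARENT: Dict[str, str] = {
    "obj-refer-expr": "refer-expr",
    "sit-refer-expr": "refer-expr",
    "decl-sit-refer-expr": "sit-refer-expr",
    "wh-quest-sit-refer-expr": "sit-refer-expr",
}


def isa(chunk_type: Optional[str], target: str) -> bool:
    """True if chunk_type is target or inherits from it in PARENT."""
    while chunk_type is not None:
        if chunk_type == target:
            return True
        chunk_type = PARENT.get(chunk_type)
    return False


def matches(buffers: Dict[str, Optional[str]], conditions: Dict[str, str]) -> bool:
    """A production matches when every named buffer holds a chunk of the required type."""
    return all(isa(buffers.get(name), required) for name, required in conditions.items())


# Assumed conditions for "integrate the nominal as the subject of the clause".
INTEGRATE_SUBJECT = {"retrieval-2": "decl-sit-refer-expr", "subject": "obj-refer-expr"}

he_is = {"retrieval-2": "decl-sit-refer-expr", "subject": "obj-refer-expr"}
where_did_he = {"retrieval-2": "wh-quest-sit-refer-expr", "subject": "obj-refer-expr"}

fires_for_declarative = matches(he_is, INTEGRATE_SUBJECT)
fires_for_wh_question = matches(where_did_he, INTEGRATE_SUBJECT)
```

A production written against `sit-refer-expr` rather than `decl-sit-refer-expr` would match both buffer states, which is the sense in which the level of the ontology is chosen to suit the function of the production.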
Ontology of Situation Referring
Expressions
• Decl-sit-refer-expr
• Yes-no-quest-sit-refer-expr
– “Is he going?”
• Wh-quest-sit-refer-expr
– “Where did he go?”
• Imp-sit-refer-expr
– “Don’t go!”
• Wh-sit-refer-expr
– “I know where he went”
• Rel-sit-refer-expr
– “The book that you like”
Note: Situation Referring Expression corresponds to Clause in other
approaches
What are the grammatical cues that trigger recognition of
an expression type? These cues need to be accessible!
Slots in Referring Expressions
• Bind-indx (all referring expression types)
– Identifier for referring expression
• Parent (all chunk types)
– Links child to parent chunk
– Used to avoid multiply integrating chunk into other chunks
• Token (all chunk types)
– Distinguishes types from tokens (and type-tokens)
• Grammatically relevant semantic info
– Animate (all object referring expression types)
– Gender (all animate referring expression types)
– Number (all object referring expression types)
– Person (all object referring expression types)
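The slots above can be pictured as fields of a record, with the parent slot acting as a guard against integrating the same chunk twice. The Python below is a sketch under assumptions: the class `ReferExpr`, the value encodings and the `integrate` helper are hypothetical stand-ins for ACT-R chunk definitions, which the slides do not show.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ReferExpr:
    """Hypothetical stand-in for a referring-expression chunk."""
    chunk_type: str
    bind_indx: str                        # identifier for the referring expression
    token: str = "token"                  # "type", "token" or "type-token"
    parent: Optional["ReferExpr"] = None  # link from child to parent chunk
    animate: Optional[bool] = None
    gender: Optional[str] = None
    number: Optional[str] = None
    person: Optional[int] = None


def integrate(child: ReferExpr, parent: ReferExpr) -> bool:
    """Attach child to parent unless the child already has a parent."""
    if child.parent is not None:
        return False
    child.parent = parent
    return True


he = ReferExpr("obj-refer-expr", "i", animate=True, gender="male", number="sing", person=3)
clause = ReferExpr("decl-sit-refer-expr", "s1")
other_clause = ReferExpr("decl-sit-refer-expr", "s2")

first_attempt = integrate(he, clause)         # succeeds
second_attempt = integrate(he, other_clause)  # blocked by the parent slot
```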
Recognizing Wh-Quest and Wh-Situation Referring Expressions
Where did he…?
(p cog-process-obj-refer-expr-->project-wh-quest-sit-refer-expr
   =goal>
      isa process-obj-refer-expr
   =wh-focus>
      isa wh-refer-expr                ;; “where”
   =most-recent-child-sre-head>
      isa operator-pred-spec           ;; “did”
   =retrieval-2>
      isa obj-refer-expr               ;; “he”
   =subject>
      isa nothing
   =context>
      isa context
      - sit-context "wh-quest-sit-refer-expr"
==>
   ;; project wh-quest-sit-refer-expr
)
…where he went
(p cog-process-pred-type-->project-wh-sit-refer-expr
   =goal>
      isa process-pred-type
   =wh-focus>
      isa wh-refer-expr                ;; “where”
   =subject>
      isa refer-expr                   ;; “he”
   =retrieval-2>
      isa pred-type                    ;; “went”
   =context>
      isa context
      - sit-context "wh-sit-refer-expr"
      - sit-context "wh-quest-sit-refer-expr"
==>
   ;; project wh-sit-refer-expr
)
Note: the more grammatical cues, the greater the likelihood of being correct!
“Who kicked…?” “Where the heck is...?” “Why is there…?”
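The left-hand sides of the two productions above can be restated as Python predicates over a snapshot of buffer contents, which makes it easy to see that the cues separate “Where did he…?” from “…where he went”. This is a simplified sketch: buffers are reduced to a dictionary of chunk-type names, the goal buffer is omitted, the `isa refer-expr` test on the subject is approximated by "subject buffer is filled", and the function names are hypothetical.

```python
from typing import Dict, Optional

Buffers = Dict[str, Optional[str]]


def wh_quest_matches(b: Buffers) -> bool:
    """Conditions of the wh-question production: wh-focus, operator, nominal, empty subject."""
    return (
        b.get("wh-focus") == "wh-refer-expr"                              # "where"
        and b.get("most-recent-child-sre-head") == "operator-pred-spec"   # "did"
        and b.get("retrieval-2") == "obj-refer-expr"                      # "he"
        and b.get("subject") is None                                      # no subject yet
        and b.get("sit-context") != "wh-quest-sit-refer-expr"
    )


def wh_sit_matches(b: Buffers) -> bool:
    """Conditions of the wh-clause production: wh-focus, filled subject, predicate type."""
    return (
        b.get("wh-focus") == "wh-refer-expr"                              # "where"
        and b.get("subject") is not None                                  # "he"
        and b.get("retrieval-2") == "pred-type"                           # "went"
        and b.get("sit-context") not in ("wh-sit-refer-expr", "wh-quest-sit-refer-expr")
    )


where_did_he: Buffers = {
    "wh-focus": "wh-refer-expr",
    "most-recent-child-sre-head": "operator-pred-spec",
    "retrieval-2": "obj-refer-expr",
    "subject": None,
    "sit-context": None,
}
where_he_went: Buffers = {
    "wh-focus": "wh-refer-expr",
    "subject": "obj-refer-expr",
    "retrieval-2": "pred-type",
    "sit-context": None,
}
```

The negated `sit-context` tests keep either production from firing again once the corresponding expression type has been projected.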
Modeling Long-Distance
Dependencies
• Model needs simultaneous access to multiple
grammatical elements
– Serial retrieval from DM is not a viable option
– Buffers support simultaneous access – buffers on left-hand
side of production constitute focus of attention – limited to
~4 (Cowan, 2000) besides goal and context buffers
– Can’t predict in advance of production selection which
grammatical elements will be needed
– Buffers and productions are functionally motivated – they
are needed in the processing of various constructions
• A model with fewer buffers (and productions) that
handles a similar set of phenomena might be a better
model, but a model with fewer buffers that handles
fewer phenomena is not comparable (Ball, in
preparation)
Double R Buffers – Single Chunk
• Subject – stores the subject
• Wh-focus – stores the fronted wh expression
• Rel-focus – stores the relative clause marker
• Context – stores contextual information
• Construct – buffer for constructing DM chunks
– Dual path processing – construct chunk vs. retrieve chunk
• Retrieval-2 – buffer for storing retrieved or constructed DM
chunks
– Retrieval buffer only used temporarily, retrieved chunk is
copied into retrieval-2 for subsequent processing
• Most-recent-loc-refer-expr – just the most recent
– Supports locative fronting “On the table is the book”
Double R Buffers – Multiple Chunk
Obj-Refer-Expr buffers
• Most-recent-child-obj-refer-expr
• Most-recent-parent-obj-refer-expr
• Most-recent-grandparent-obj-refer-expr
Obj-Refer-Expr-Head buffers
• Most-recent-child-obj-refer-expr-head
• Most-recent-parent-obj-refer-expr-head
• Most-recent-grandparent-obj-refer-expr-head
Four generic Short-Term Working Memory buffers
• St-wm-1
• St-wm-2
• St-wm-3
• St-wm-4
Note: object referring expression corresponds to nominal
in other approaches
Double R Buffers – Multiple Chunk
Sit-Refer-Expr buffers
• Most-recent-child-sit-refer-expr
• Most-recent-parent-sit-refer-expr
• Most-recent-grandparent-sit-refer-expr
Sit-Refer-Expr-Head buffers
• Most-recent-child-sit-refer-expr-head
• Most-recent-parent-sit-refer-expr-head
• Most-recent-grandparent-sit-refer-expr-head
Note 1: with the introduction of obj-refer-expr and sit-refer-expr
specific buffers, the short-term working memory buffers are
infrequently used (primarily for conjunctions and adverbs)
Note 2: child, parent and grandparent buffers are all directly accessible,
whereas only st-wm-1 is directly accessible
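The access contrast in Note 2 can be sketched in Python: the short-term working memory buffers behave like a small stack whose top slot alone can be read, while the child, parent and grandparent buffers are individually named and can each be read at any time. Everything here is an assumption made for illustration, including the class name `ShortTermWM`, the push and pop behaviour, and the dropping of the oldest item when a fifth is pushed; the slides do not describe how the buffers are updated.

```python
from typing import Dict, List, Optional


class ShortTermWM:
    """Four-slot store in which only slot 1 (st-wm-1) is directly readable (assumed policy)."""

    def __init__(self, size: int = 4) -> None:
        self._slots: List[str] = []
        self._size = size

    def push(self, chunk: str) -> None:
        """Place chunk in st-wm-1 and shift the rest down, dropping any overflow."""
        self._slots.insert(0, chunk)
        del self._slots[self._size:]

    def top(self) -> Optional[str]:
        """Read st-wm-1 without removing it."""
        return self._slots[0] if self._slots else None

    def pop(self) -> Optional[str]:
        """Remove st-wm-1 so that the next item becomes accessible."""
        return self._slots.pop(0) if self._slots else None

    def __len__(self) -> int:
        return len(self._slots)


wm = ShortTermWM()
for chunk in ["a", "b", "c", "d", "e"]:
    wm.push(chunk)

# Named buffers: each level can be matched directly by a production.
sit_refer_expr_buffers: Dict[str, Optional[str]] = {
    "child": "infinitive clause",
    "parent": "matrix clause",
    "grandparent": None,
}
matrix = sit_refer_expr_buffers["parent"]  # no popping needed to reach the parent
```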
Long-Distance Dependencies
I want to go
Note: entire representation is not accessible at once!
Infinitive sit-refer-expr has implied subj with trace
bound to matrix subj
Combination of “bind-indx” and “trace” needed to
indicate long-distance dependency
Traces only occur in argument positions!
Long-Distance Dependencies
He promised me to go
Alternative view: antecedent & trace both
bind to same object in situation model
Subject control (verb): matrix clause subject binds to subject
of infinitive situation complement – subject must be accessible
Long-Distance Dependencies
He persuaded me to go
Object control (verb): matrix clause (indirect) object binds to subject
of infinitive situation complement – object must be accessible
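The difference between subject control and object control comes down to which matrix argument supplies the binding index for the implied subject of the infinitive. A minimal Python sketch follows; the lexicon `CONTROL` and the function `infinitive_subject_index` are hypothetical names, and only the three verbs shown on the slides are listed.

```python
from typing import Optional

# Control type per verb, following the slide examples
# ("I want to go", "He promised me to go", "He persuaded me to go").
CONTROL = {
    "want": "subject",
    "promise": "subject",
    "persuade": "object",
}


def infinitive_subject_index(verb: str, subject_idx: str, object_idx: Optional[str] = None) -> str:
    """Return the bind-indx that the trace in the infinitive's subject position should carry."""
    if CONTROL[verb] == "object":
        if object_idx is None:
            raise ValueError("object control requires an accessible matrix object")
        return object_idx
    return subject_idx


promised = infinitive_subject_index("promise", subject_idx="i", object_idx="j")
persuaded = infinitive_subject_index("persuade", subject_idx="i", object_idx="j")
wanted = infinitive_subject_index("want", subject_idx="i")
```

The `ValueError` branch mirrors the accessibility requirement: with object control, the binding cannot be made unless the matrix object is still available when the infinitive is processed.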
Long-Distance Dependencies
Who did he kick the ball to?
Object of preposition is bound to fronted who-obj-refer-expr –
wh-focus must be accessible
Long-Distance Dependencies
The man that I gave the book
I-Obj Trace to Obj-Refer-Expr
with animate or human head
Rel-focus co-indexed with Obj-Refer-Expr
rel-focus and subject must be accessible
(rel-focus is optional)
Long-Distance Dependencies
The book that I gave the man
Obj Trace to Obj-Refer-Expr
with inanimate head
Rel-focus co-indexed with Obj-Refer-Expr
rel-focus and subject must be accessible
(rel-focus is optional)
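The two relative clause examples (“The man that I gave the book” and “The book that I gave the man”) differ only in where the trace goes, and the animacy of the head decides it. The Python sketch below illustrates that choice for a ditransitive verb with one overt object; the function name and the dictionary layout are assumptions made for the example, not the model's representation.

```python
from typing import Dict, Union

Role = Union[str, Dict[str, object]]


def relative_clause_roles(head_index: str, head_animate: bool,
                          subject: str, overt_object: str) -> Dict[str, Role]:
    """Assign roles in a relative clause built on a ditransitive verb such as "gave"."""
    trace = {"trace": True, "bind_indx": head_index}
    if head_animate:
        # Animate (or human) head: the trace fills the indirect object position.
        return {"subj": subject, "iobj": trace, "obj": overt_object}
    # Inanimate head: the trace fills the direct object position.
    return {"subj": subject, "iobj": overt_object, "obj": trace}


man_clause = relative_clause_roles("i", True, subject="I", overt_object="the book")
book_clause = relative_clause_roles("i", False, subject="I", overt_object="the man")
```

In both results the trace carries the bind-indx of the head, which is what co-indexes it with the obj-refer-expr being modified.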
Architectural Constraints
• No hard architectural limit on the number of buffers
• Buffers provide the context for production selection
and execution
– Highly context sensitive
• Productions limited to accessing ~4 buffers on left-hand
side (besides goal and context buffers)
– Focus of attention (Cowan, 2000)
– “Conscious activity corresponds to the manipulation of the
contents of these buffers by production rules” (Anderson,
2007)
• Can humans learn to buffer useful information?
– Fronted Wh-expression buffer very useful in English, but
not needed in in situ languages like Chinese
Processing Constraints
• A “mildly” deterministic, serial processing mechanism
(selection and integration) operating over a parallel,
probabilistic substrate (activation)
• Interactive and non-autonomous processing (no distinctly
syntactic representations exist)
• Incremental processing with immediate determination of
meaning – word by word
• No algorithmic backtracking or lookahead – a mechanism of
context accommodation (Ball et al., 2007) used instead
• Forward chaining only
• Declarative and explicit linguistic representations generated via
implicit execution of productions
• Operates in real-time on Marr’s algorithmic level (serial and
parallel processing are relevant)
– No slowdown with length of input
Summary
• Additions to model are
– motivated by functional considerations
– driven by empirical evidence
– constrained by well-established cognitive
constraints on language processing
• Goal is a large-scale, functional language
comprehension system implemented in the ACT-R
cognitive architecture
• Model currently handles a fairly wide range of
grammatical constructions including numerous
forms of long-distance dependency
Questions?
References
Anderson, J. (2007). How Can the Human Mind Occur in the Physical Universe? Oxford:
Oxford University Press.
Ball, J. (in preparation). A Naturalistic Functional Approach to Modeling Language
Comprehension.
Ball, J., Heiberg, A. & Silber, R. (2007). Toward a Large-Scale Model of Language
Comprehension in ACT-R 6. Proceedings of the 8th International Conference on
Cognitive Modeling.
Cowan, N. (2000). The magical number 4 in short-term memory: A reconsideration of
mental storage capacity. Behavioral and Brain Sciences, 24, 87-185.
Culicover, P. & Jackendoff, R. (2005). Simpler Syntax. Oxford: Oxford University Press.
Hawkins, J. (2004). Efficiency and Complexity in Grammars. Oxford: Oxford University
Press.
Huddleston, R. & Pullum, G. (2002). The Cambridge Grammar of the English Language.
NY: Cambridge University Press.
O’Grady, W. (2005). Syntactic Carpentry: An Emergentist Approach to Syntax.
Mahwah, NJ: LEA.
Quirk, R., Greenbaum, S., Leech, G. & Svartvik, J. (1985). A Comprehensive Grammar of
the English Language. Essex, UK: Pearson Education Limited.
Long-Distance Dependencies
The ball by the table was kicked by the man
passive cue
(be + V-ed or V-en)
Subject co-indexed with Object
subject must be accessible
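The passive cue (a form of "be" followed by a V-ed or V-en form) can be sketched as a scan over adjacent words, after which the subject's bind-indx is copied onto a trace in object position. This Python sketch is deliberately crude and rests on assumptions: the suffix test stands in for real participle recognition and will over-generate (for example on "is open"), and the names `BE_FORMS`, `find_passive_cue` and `object_trace_for` are hypothetical.

```python
from typing import Dict, List, Optional

BE_FORMS = {"am", "is", "are", "was", "were", "be", "been", "being"}


def find_passive_cue(tokens: List[str]) -> Optional[int]:
    """Return the position of a "be" form that is followed by a word ending in -ed or -en."""
    for i in range(len(tokens) - 1):
        if tokens[i].lower() in BE_FORMS and tokens[i + 1].lower().endswith(("ed", "en")):
            return i
    return None


def object_trace_for(subject_bind_indx: str) -> Dict[str, object]:
    """Build an object trace co-indexed with the subject, which must still be accessible."""
    return {"trace": True, "bind_indx": subject_bind_indx}


passive = "The ball by the table was kicked by the man".split()
active = "The man hit the ball".split()

cue_position = find_passive_cue(passive)  # position of "was"
no_cue = find_passive_cue(active)
trace = object_trace_for("i")
```

Because the cue arrives after the intervening modifier “by the table”, the subject has to be held in a buffer until the cue is seen.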