A formal theory of dependency syntax with non-lexical units
VINCENZO LOMBARDO*,** and LEONARDO LESMO**
Résumé - Abstract
The paper describes a formal theory for the dependency approach
to syntax. The formalism is able to deal with long-distance
dependencies, involved in function sharing and extractions. The
distinctive aspects of the approach are the use of non lexical
categories and a GPSG-style treatment of long-distance
dependencies. The formalism is applied to a number of
coordination phenomena that are well known in the literature.
Mots Clefs - Keywords: Dependency grammar, Long-distance
dependencies, Coordination, Non-lexical units
* Dipartimento di Scienze e Tecnologie Avanzate, Università del Piemonte Orientale "A.
Avogadro", c.so Borsalino 54, 15100 Alessandria, Italy , [email protected].
** Dipartimento di Informatica and Centro di Scienza Cognitiva, Università di Torino, c.so
Svizzera 185, 10149 Torino, Italy, {vincenzo, lesmo}@di.unito.it.
1. INTRODUCTION
Dependency syntax has had a long tradition in European linguistics
since (Tesnière L. 1959): formal descriptions of dependency syntax are the
functional generative approach (Sgall P. et al. 1986), the meaning-text theory
(Mel'cuk I. 1988), the hierarchy-based word grammar (Hudson R. 1990)
(Fraser N. & Hudson R. 1992), the dynamic dependency grammar (Milward
D. 1994). Also, a number of parsers have been developed for frameworks
featuring core aspects of dependency syntax (Covington M. 1990) (Sleator
D. & Temperley D. 1993) (Hahn U. et al. 1994) (Lombardo V. & Lesmo L.
1996), (Järvinen T. & Tapanainen P. 1997), including a stochastic treatment
(Eisner J. 1996) and an object-oriented parallel parsing method (Neuhaus P.
& Hahn U. 1996).
The basic idea of dependency is that the syntactic structure of a
sentence is described in terms of binary relations (dependency relations) on
pairs of words, a head (parent), and a dependent (daughter), respectively;
these relations usually form a tree, the dependency tree (fig. 1).
Figure 1. A dependency tree for the sentence "I know John likes beans": know
heads I (SUBJ) and likes (SCOMP); likes heads John (SUBJ) and beans (OBJ).
The leftward or rightward orientation of the edges represents the linear order
of words: the dependents that precede (respectively, follow) the head stand on
its left (resp. right).
The linguistic merits of dependency syntax are widely acknowledged:
core dependency concepts like the head of a phrase and the representation
of grammatical relations have become ubiquitous in linguistic theories.
Dependency syntax is also attractive because of the immediate mapping of
dependency trees onto predicate-argument structures, and because of the
treatment of free-word order constructs. These merits have recently triggered
a series of mathematical analyses of dependency syntax. Many years after
Gaifman showed that projective dependency grammars are weakly
equivalent to context-free grammars (Gaifman H. 1965), a number of authors
have devised O(n³) parsers for projective dependency formalisms (Eisner J.
1996) (Lombardo V. & Lesmo L. 1996) (Milward D. 1994). Then, Neuhaus
and Bröker (1997) showed, by reduction from the vertex cover
problem, that the recognition problem for unconstrained non-projective
dependency grammars (what they call discontinuous DG) is NP-complete.
More recently (Bröker N. 1998), Bröker has proposed the adoption of a
context-free backbone to introduce linear precedence constraints in the
grammar, which avoid the combinatorial explosion over word order. The
goal of relaxing projective constraints in a controlled manner has led Nasr
(1995, 1996) to introduce the notion of pseudo-projectivity, which allows a
limited number of arc crossings in a dependency tree. This approach has
been carried on by Kahane et al. (1998), who have formalized this notion in
the so-called lifting rules, which permit some element to be displaced by
means of a mechanism similar to functional uncertainty (Kaplan R. & Zaenen
A. 1988) applied to a dependency tree. All these approaches, also including
(Lombardo V. & Lesmo L. 1998a), an earlier development of the work in this
paper, have been gathered under the general notion of meta-projectivity in
(Bröker N., this issue), with some minor differences concerning the licensing
of gaps. Meta-projectivity claims that an element D governed by an element
G appears either among the dependents of G or among the dependents of
one of G’s ancestors. This notion recalls the structural relations between
fillers and gaps posed by c-command in the Chomskian approach to syntax.
One of the central assumptions of dependency theories is the
lexical character of all the units of the syntactic representation. In fact, non-lexical categories are generally banned from dependency theories.
However, their use can be viewed as a notational variant of a number of
approaches. Word Grammar (Hudson R. 1990) adopts a graph structure to
represent multiple dependencies of a single element (as in the case of
function sharing or displacements): this representation can be replaced by
making one of the two links point to an empty category node, co-indexed
with a lexical node (see fig. 2). A similar dichotomy (with different
motivations) is in (Kahane S. et al. 1998), where a word can have two
relations, with a Syntactic Governor and a Linear Governor: again, one of the
two relations can be represented by a trace co-indexed with the lexical
element. Finally, Neuhaus and Bröker (1997) distinguish between dashed
and solid dependencies to represent items that respect projectivity and items
that don't: again the same comments above hold.
Figure 2. A graph-structured Word Grammar representation of "John promised
Mary to cook beans" (on the left) and a notational variant that includes
co-indexed non-lexical nodes (on the right): in the variant, the second SUBJ
link of cook points to a trace ε1 co-indexed with John.
This paper introduces a lexicalized projective dependency formalism
which represents long-distance dependencies through the use of non-lexical
categories (structurally represented as empty nodes). The non-lexical
categories allow us to leave the condition of projectivity intact, encoded in
the notion of derivation, and to produce a rich syntactic structure which
explicitly represents both surface and deep dependencies. These facilities
are extremely useful in NLP applications. The formalism is an extension of
the one presented in (Lombardo V. & Lesmo L. 1998a), and incorporates the
ideas on the treatment of coordination outlined in (Lombardo V. & Lesmo L.
1998b).
The paper is organized as follows. The next section presents the
complete formal system. Section 3 illustrates how the formal system can
cope with some broad coordination phenomena. Section 4 concludes the
paper.
2. A DEPENDENCY FORMALISM
The basic idea of dependency is that the syntactic structure of a
sentence is described in terms of binary relations (dependency relations) on
pairs of words, a head (or parent), and a dependent (daughter), respectively;
these relations form a tree, the dependency tree. In this section we introduce
a formal dependency system called RDG, Rule-based Dependency
Grammar, which expresses syntactic knowledge through dependency rules
that describe one level of a dependency tree. Also, we introduce a notion of
derivation that allows us to define the language generated by a dependency
grammar of this form, and to compute the derivation (dependency) tree.
In RDG, syntactic and lexical knowledge coincide, since the rules are
lexicalized: the head of the rule is a word of a certain category, called the
lexical anchor. From the linguistic point of view we can recognize two types
of dependency rules: primitive dependency rules, which represent
subcategorization frames, and non-primitive dependency rules, which result
from the application of lexical metarules to primitive and non-primitive
dependency rules. In section 2.3, we sketch the general idea of a metarule,
and provide a few examples.
An RDG is a six-tuple <W, C, S, D, U, H>, where
• W is a finite set of symbols (words of a natural language);
• C is a finite set of syntactic categories (including the special category Ε);
• S is a non-empty set of root categories (S ⊆ C);
• D is the set of dependency relations, e.g. SUBJ, OBJ, XCOMP, P-OBJ, PRED (among which the special relation VISITOR1);
• U is a finite set of symbols (among which the special symbols !, !c and ◊), called u-indices;
• H is a set of dependency rules of the form
x:X (<d1 Y1 u1 τ1> ... <di-1 Yi-1 ui-1 τi-1> <# ui> <di+1 Yi+1 ui+1 τi+1> ... <dm Ym um τm>)
where:
1) x∈W, is the head of the rule;
2) X∈C, is its syntactic category;
1 The relation VISITOR (Hudson 1990) accounts for displaced elements and, unlike
the other relations, is not semantically interpreted.
3) an element <dj Yj uj τj> is a d-quadruple (which describes a
dependent); the sequence of d-quads, which includes the pair <# ui>
(the linear position of the head, where # is a special symbol), is called the
d-quad sequence. We have that ui∈U and that, for j ∈ {1, ..., i-1, i+1, ..., m}:
a) dj∈D
b) Yj∈C
c) uj∈U
d) τj is a (possibly empty) set of triples <u, d, Y>, called u-triples,
where u∈U, d∈D, Y∈C.
Intuitively, a dependency rule constrains one node (head) and its
dependents in a dependency tree2: the d-quad sequence states the order of
elements, both the head (# position) and the dependents (d-quads). The
grammar is lexicalized, because each dependency rule has a lexical anchor
in its head (x:X). A d-quad <di Yi ui τi> identifies a dependent of category Yi,
connected to the head via a dependency relation di. Each element of the
d-quad sequence is possibly associated with a u-index (uj) and a set of
u-triples (τj). The u-index, when present, specifies that the dependent
described by the d-quad is co-indexed with some (non-lexical) trace ε. A
u-triple (the τ-component of the d-quad) <u, d, Y> bounds the area of the
dependency tree where the trace can occur. Both uj and τj can be null
elements, i.e. ◊ and ∅, respectively (but see the principle of u-triple
satisfiability below). The τ-component can be viewed as a slash feature in
GPSG terms (Gazdar G. et al. 1985). That is, if a d-quad is of the form <di, Yi,
ui, {<u, dj, Yj>}>, it can be expressed in GPSG notation as Yi / Yj, because the
Yj element must be realized by a trace in the subtree rooted by Yi.
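The d-quad machinery can be made concrete with a small data sketch. The encoding below is our own (the paper defines no implementation); it builds the object-extraction rule for "know" that appears as rule 6 of grammar G1 in section 2.2, whose SCOMP d-quad carries the slash-like u-triple <u, OBJ, N>.

```python
from dataclasses import dataclass

# Hypothetical encoding of RDG dependency rules; all names are ours.

@dataclass(frozen=True)
class UTriple:
    u: str      # u-index naming the co-indexed dependent
    d: str      # relation the trace bears to its direct governor
    cat: str    # category of the trace

@dataclass(frozen=True)
class DQuad:
    d: str                             # dependency relation (SUBJ, OBJ, ...)
    cat: str                           # category of the dependent
    u: str = "◊"                       # u-index; "◊" is the null value
    triples: frozenset = frozenset()   # the τ-component (set of u-triples)

@dataclass(frozen=True)
class DepRule:
    word: str      # lexical anchor x
    cat: str       # its syntactic category X
    dquads: tuple  # d-quad sequence; the string "#" marks the head position

# Rule 6 of grammar G1 (section 2.2): object extraction with "know".
know = DepRule("know", "V+EX", (
    DQuad("VISITOR", "N", u="u"),
    DQuad("SUBJ", "N"),
    "#",
    DQuad("SCOMP", "V", triples=frozenset({UTriple("u", "OBJ", "N")})),
))
```

Read in GPSG terms, the SCOMP d-quad is a slashed category V/N: the OBJ trace must be realized inside the sentential complement, co-indexed with the VISITOR dependent.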
There are two possible situations (refer to fig. 3, where on the left side
are dependency rule schemas, and on the right side are dependency tree
schemas)3: a trace can participate in whole subtree gapping or in partial
subtree gapping. The major difference is that in the first case, traces have no
dependents (the entire subtree is gapped), while in the second case, traces
do have dependents (only part of the subtree is gapped). To keep the two
cases apart, there are two forms of u-indices: full u-indices, for whole subtree
gapping, and local u-indices for partial subtree gapping. An example of whole
subtree gapping is in fig. 3a: the u-triple <u,SUBJ,DET> states that the
subtree rooted by the VCOMP dependent must contain a trace co-indexed
2 Dependency rules express the subcategorization frames, and possibly include adjuncts.
We defer to a future project the formal distinction between arguments and adjuncts. On this
topic, an interesting solution is in (Nasr A., 1995, 1996).
3 What we mean by a schema of some structure is a fragment of the structure that is
significant for the topic under discussion. In figure 3, we report some schemas (fragments)
of the dependency rules and of the dependency tree associated with the sentence "A nice boy
wants to buy a book of 200 pages for Mary and the guy across the street of 300 pages for
Susan". Further aspects of these structures will be analyzed in depth throughout the paper.
-5-
Dependency tree
Dependency rule(s)
wants:V
wants:V
... SUBJ
u
...
TO <u, SUBJ, DET>
DET
...
... SUBJ
VCOMP ...
α
VCOMP ...
u a: DET
NBAR
boy:N
ATTR
nice:ADJ
to:TO
PRED
buy:V
SUBJ
β
ε:DET u
α
(a)
!c wants:V
u
COORD
VCOMP
...
TO
!
...
wants:V
...
...
CONJ-V <!c, 2nd,V>
...
...
t
buy:V
OBJ
...
! DET
a:DET
!
... NBAR
...
δ
book: N
...
2nd
buy:V
γ
!N
...
and:CONJ-V
OBJ
...
...
a:DET
v
NBAR
...
...
book: N
z
!V
!
...
COORD
...
w to:TO
PRED
...
to:TO
!
PRED
VCOMP
ε:V u
...
VCOMP
...
ε:TO w
... PRED
...
ε:V t
... OBJ ...
ε:DETv
... NBAR ...
ε:N z
γ’
(b)
Figure 3. Schematic representation of u-triple satisfiability. The figure reports
some fragments of the dependency tree of the sentence “A nice boy wants to
buy a book of 200 pages for Mary and the guy across the street of 300 pages
for Susan”. 3.a illustrates the use of full u-indices, while 3.b shows the use of
local u-indices. See comments in the text.
with the SUBJ dependent. The u-triple also states that such a trace, of
category DET, must be linked to its direct governor through the relation
SUBJ; u is a full u-index. The dependency tree in fig 3.a on the right satisfies
this condition, as the co-indexation between the node a:DET and the trace
ε:DET reveals. The trace ε:DET is in the subtree β rooted by the VCOMP
dependent. The figure also fleshes out the fact that the trace (ε:DET) is
intended to refer to the complete subtree α, rooted by a:DET4.
Figure 3b illustrates an example of partial subtree gapping. Partial
subtree gapping requires a more flexible mechanism for the identification of
the gapped structure. Since partial subtree gapping occurs when some trace
element of the gapped structure governs some word in the sentence, the full
u-index approach described above is not adequate, because the full subtree
trace element cannot govern any word (the whole subtree is gapped). The
approach to partial subtree gapping involves a different type of indices, called
local u-indices. The gapped structure consists of a number of traces, each
referring to a word of the subtree, but, in contrast with the full u-index
approach, the trace is not intended as a reference to its whole subtree. The
trace is intended to refer just to the word, and then it can govern a number of
dependents as licensed by a dependency rule headed by that word. Some of
these dependents can in turn be traces.
On the left side of figure 3b, there are five dependency rule schemas
involving the two forms of local u-indices, !c and !. !c identifies the root of the
subtree; ! signals the top-down continuation of the gapped subtree. The
dependency rule on top identifies the word wants as the root of the subtree
(!c index) which is gapped in the substructure headed by the COORD
dependent (marked with the u-triple <!c, 2nd, V>); the same rule also states
that such a subtree spans only the descendants of wants that are reachable
through the VCOMP dependent, marked with the u-index !5. Subtree
continuation is licensed by the other dependency rules in the figure, where
both the head and the dependents that are involved in the gapping process
are marked with the u-index !. The bottom rule (headed by book:N) is not
involved in the gapping phenomenon (except for the head “book”, marked by
the a:DET rule above).
On the right side of figure 3b there is a dependency tree schema which
satisfies the conditions expressed by the dependency rules. The node
wants:V is co-indexed with the trace ε:Vu which occurs in the subtree rooted
by the COORD dependent of wants:V. Then, one dependent of wants:V (of
category TO) is co-indexed with the trace ε:TOw; likewise, buy:V is co-indexed
with the trace ε:Vt, and a:DET with the trace ε:DETv.
Finally, book:N is co-indexed with the trace ε:Nz. So, in the subtree rooted
and:CONJ-V (the trapezoid δ), there is a partial gapping of the subtree rooted
by wants:V, concerning the portion wants+to+buy+a+book replaced by
traces. The gapping is partial because only part of the subtree rooted by
wants is needed in the subtree rooted by and:CONJ-V. Also, the gapped part
does not include the low levels: the two subtrees γ (“of 200 pages”) and γ’ (“of
300 pages”) are in fact different. The replication of the tree structure is
realized by the derivation process (see below), when the dependency rule
4 Notice that the subtrees indicated in figure 3a, α and β, are depicted as dashed
trapezoids because they are not actual elements of the grammar theory, but only additional
devices introduced for the mere purpose of explanation.
5 Notice that the dependents spanned by the subtree can be more than one, as well as
none, when only the head is gapped. In this example, the subtree spans only one
dependent; the dependents that are not of interest for the phenomenon have been omitted.
associates the symbol !c and ! with the head and some dependents,
respectively.
The distinction between full and local u-indices recalls the distinction
between standard and non standard constituents in constituency theories
(see e.g. Steedman 1990). Non standard constituents are particularly useful
in describing coordination, where it is possible to conjoin sequences that
cannot be parsed as standard phrases. In our approach to dependency
syntax (see also section 3), local u-indices allow for dependency subtrees
with traces that are licensed at the word level. To illustrate the difference
between full and local u-indices, consider the following sentence:
A nice boy wants to buy a book for Mary and a guy across the street
for Susan
The constituency structure in terms of non standard constituents is the
following:
[[A nice boy] wants to buy [a book for Mary]] and [[a guy across the
street] [for Susan]]
where the second conjunct is a typical non standard constituent. The
dependency structure licensed by our approach corresponds to the following
linearization:
1[A nice boy] 2wants 3to ε1 4buy 5a 6book for Mary and 7[a guy across
the street] ε2 ε3 ε7 ε4 ε5 ε6 for Susan
where indices associated with square brackets correspond to full-subtree
co-indexing (full u-indices), and indices associated with single words correspond
to single-word co-indexing (local u-indices).
A fundamental constraint concerning the u-indices and the u-triples is
established by the following principle of u-triple satisfiability:
• For each dependency rule δ∈H, there exists a u-triple <uj, d, Y> ∈ τi
(uj∈U) in a d-quad <di Yi ui τi> of δ iff there exists one d-quad
<dj Yj uj τj> of δ and i ≠ j.
• For each dependency rule δ∈H, there exists a u-triple <!c, d, Y> ∈ τi
in a d-quad <di Yi ui τi> of δ iff the pair <#, !c> belongs
to the d-quad sequence.
• The local u-index ! does not appear in the u-triples.
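The principle can be made operational with a minimal checker. The tuple encoding below is ours, not the paper's: a d-quad is (d, category, u-index, u-triples), the head position is the pair ("#", u-index), and "◊" is the null u-index. The sketch checks only the left-to-right direction of the two biconditionals, plus the ban on the local u-index !.

```python
# Minimal checker for the u-triple satisfiability principle (our encoding).

def satisfiable(dquad_seq):
    # u-index attached to the head position <#, u_i>
    head_u = next(item[1] for item in dquad_seq if item[0] == "#")
    for i, item in enumerate(dquad_seq):
        if item[0] == "#":
            continue
        _, _, _, triples = item
        for (tu, td, tcat) in triples:
            if tu == "!":
                return False        # ! never appears in u-triples
            if tu == "!c":
                if head_u != "!c":  # <!c, ...> needs <#, !c> in the sequence
                    return False
            elif not any(j != i and other[0] != "#" and other[2] == tu
                         for j, other in enumerate(dquad_seq)):
                return False        # <u_j, ...> needs another d-quad bearing u_j
    return True

# Rule 6 of G1 (section 2.2) satisfies the principle ...
know = [("VISITOR", "N", "u", set()), ("SUBJ", "N", "◊", set()),
        ("#", "◊"), ("SCOMP", "V", "◊", {("u", "OBJ", "N")})]
# ... while the same rule without the VISITOR d-quad does not:
bad = know[1:]
```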
In the next subsection we introduce a notion of derivation which allows
us to define the language generated by an RDG. Then we sketch the structure of
the metarules, and give a few linguistic examples.
2.1 Derivation
In the RDG formalism, the derivation is a string rewriting process, which
starts from the root categories of the grammar, and yields the strings of the
language. The derivation process requires u-index instantiation in the
dependency rules, because any dependency rule can be used more than
once in a derivation. So, it is necessary to instantiate the u-indices
as progressive integers during the derivation process. The instantiation must be
consistent across the u and τ components. A dependency rule (as well as a
u-triple) with all the u-indices instantiated is said to be instantiated: U+ refers to
the set of instantiated u-indices (including the special symbol ◊, which is the
null value for the u-indices).
Before presenting the derivation rules, we introduce some terminology.
Word Objects
A 4-tuple consisting of a word w (∈W) or the trace symbol ε (∉W), and
three instantiated u-indices η, µ and ν is a word object of the grammar.
Formally, the set of word objects of a given grammar G is,
W' (G)={η,µxν / η,µ,ν ∈ N∪{◊}, x ∈ W∪{ε}}.
The two indices η and µ are used to signal that the word object is
co-indexed with some other word object (a trace occurring elsewhere): η is the
instantiation of a local u-index, and thus is intended to refer to a word object;
µ is a full u-index, and thus is intended to refer to the whole subtree headed
by the word object. ν is associated with traces only, and co-refers with either
the η or the µ index of some other word object. The word objects are the terminal
elements of a derivation: the sequence of word objects represents the
generated sentence, annotated with traces and co-indexing. For instance,
the derivation of the sentence "The boy wants to leave" produces the following
sequence of word objects:
◊,1The◊
◊,◊boy◊
◊,◊wants◊
◊,◊to◊
◊,◊ε1
◊,◊leave◊
which, omitting the null indices, and observing that 'The' governs 'boy' and
that 1 is the instantiation of a full u-index (µ), corresponds to the more usual
form:
1[The boy] wants to ε1 leave.
Category Objects
A category object of the grammar G is a pair consisting of a category X
(∈C) and a set of instantiated u-triples ρ; it will be denoted by X(ρ). The
category objects play the role of non-terminals in a derivation in constituency
-9-
terms. When the set of u-triples ρ is empty, then X corresponds to a standard
non-terminal; when the set of u-triples ρ is not empty, then X corresponds to
a slashed category in GPSG, and the u-triples specify which traces are to be
found in the subtree rooted in the category object.
Derivation Objects
A derivation object of the grammar G is a quadruple consisting of a
dependency relation d (∈D), a category object X(ρ), an instantiated u-index
µ, and a pair of instantiated u-indices η+. Given a grammar G, the set of
derivation objects of G is
C'(G) = {<d, X(ρ), µ, η+> /
d∈D, X∈C, µ∈N, η+∈(N∪{!c})×N, ρ a set of instantiated u-triples}.
The derivation objects stand for dependents not yet instantiated. The
quadruple specifies the grammatical relation linking the dependent to its
governor [d] and the, possibly slashed, expected category of the dependent
[X(ρ)]; moreover, the two indices µ and η refer to the (possible) coindexing of
the dependent with a trace (as a full subtree or as a single word).
Derivation Rules
Let α∈W'(G)* and ω∈(W'(G) ∪ C'(G))*. The following derivation rules
define the derivation relation (⇒):
DR1: α <d, X(ρ), µ, η+> ω ⇒
α
<d1, Y1(ρ1), µ1, η1>
<d2, Y2(ρ2), µ2, η2>
...
<di-1, Yi-1(ρi-1), µi-1, ηi-1>
vi,µ x ◊
<di+1, Yi+1(ρi+1), µi+1, ηi+1>
...
<dm, Ym(ρm), µm, ηm>
ω
where
a) x:X (<r1Y1u1τ1> ... <ri-1Yi-1ui-1τi-1> <# ui> <ri+1Yi+1ui+1τi+1> ...
<rmYmumτm>) is a dependency rule, and µj stands for instantiated uj;
b) ρ1 ∪ ... ∪ ρm = ρ ∪ τ1 ∪ τ2 ... ∪ τm;
c) u-indices: three combinations are possible
c1) ui=!c and η+=◊
In this case, vi=w, where w is an integer never used before. A d-quad
<rj Yj uj τj> in the rule must contain a u-triple of the form <!c, d, x:X>
(u-triple satisfiability); correspondingly, the u-triple <!c.w, d, x:X> is
inserted into ρj;
c2) ui=! and η+=q.s (where q, s ∈ N)
In this case, vi=s and a u-triple <q.s, d, x:X> is inserted in the ρ-set in
α or ω containing the u-triple <t.q, ...> (t∈N);
c3) ui=◊
In this case, if η+=◊, then vi=◊; else, if η+=q.s, then vi=s;
d) if vi=y, then, for each uk=! (k=1, ..., i-1, i+1, ..., m) in the dependency rule,
set ηk=y.zk, where zk are all different integers never used before.
DR1 works as follows. The derivation object <d, X(ρ), µ, η+> (called the
current derivation object), occurring in the derivation sequence, can be
rewritten by means of any dependency rule associated with the category X.
The rewriting process involves the following operations:
a) Inserting the lexical anchor (x) in the proper position in the derivation
sequence. It is possibly annotated with the u-indices encoding the fact
that x is co-indexed with a single-word trace (vi), or with a full-subtree
trace (µ).
b) Inserting in the derivation sequence the derivation objects licensed by a
dependency rule associated with X (corresponding to the dependents). All
the traces predicted to occur below x (predicted by the u-triples in ρ)
together with all traces introduced by the rule being applied (predicted by
the u-triples in τi) must be distributed appropriately across the
dependents (ρ ∪ τ1 ∪ τ2 ... ∪ τm = ρ1 ∪ ... ∪ ρm).
c) Assigning local u-indices to the lexical anchor (see fig. 4, top three steps).
This occurs when a tree-shaped part of the subtree rooted by x is
replicated below one of the dependents of x, say y. Let us call this part P.
The replication of P consists in gapping part of y’s subtree, since it is
identical to P. The nodes (i.e. the derivation objects) of P that are gapped
in the subtree below y must be marked for trace co-indexing (i.e. labeled
with local u-indices). As described above, local u-indices serve to co-index
nodes with single-word traces. Since marked nodes are
arranged as a tree (P), every marked node must depend upon another
marked node, except one (the root of P). To replicate P in the subtree
rooted by y, the derivation process must store P’s structure in some way:
the solution is to keep P’s structure with the set of u-triples associated
with the dependent y. The labels of nodes become dotted pairs such that
the first element refers to the parent node, and the second element refers
to the node itself. Thus, a path in P can be described with a sequence of
the following form <!c.w, w.v, ..., t.q, q.s>, where !c marks the beginning of
the path (root of the marked subtree). The u-triple corresponding to some
object has the form <dotted-pair, d, x:X>: the trace able to satisfy such a
u-triple must be linked through a relation d to its parent (labeled by the
first element of the dotted pair), must license its children through a
dependency rule headed by x:X, and must have a ν index equal to the
second element of the dotted pair (see DR3 below).
c1) Let ui =!c, where ui is the u-index associated with the head of the rule
x:X(…). Since ui=!c, the current object is the root of the subtree P; so,
the derivation process must label x with vi=!c.w (there is no parent) in
the η component, and must introduce the u-triple <!c.w, d, x:X> in the
ρ-set of the appropriate derivation object.
c2) Let ui =!. This means that the current derivation object is part of P, but
it is not the root. The local u-index η+ must be of the form q.s (see the
comments above). The local u-index of the lexical anchor vi=s, and a
u-triple <q.s, d, x:X> is inserted in the ρ-set of the derivation object
where the pair t.q appears.
c3) Let ui=◊. This means that the dependency rule does not contribute to
the marked subtree P (no continuation), and the lexical anchor is
possibly marked only if required by its parent node (if η+=q.s, then
vi=s);
d) Instantiating local u-indices and assigning them to the new derivation
objects. All the derivation objects corresponding to dependents indexed
with ! in the dependency rule must be marked with a local u-index (a
dotted pair) of the following form: the first element is equal to the second
element of the lexical anchor (its parent in the tree structure); the second
element is a new integer never used before in the derivation process.
As shown by the derivation relation, the derivation rule DR1 does not
introduce any trace, but actual words of the surface string. The complexity of
the rule is due to the necessity of (possibly) marking the subtrees that are
wholly (µ index) or partially (η index) gapped.
Figure 4. A few derivation steps in graphical form (subtree-root marking,
continuation, and completion of the subtree marking, all via DR1; trace
insertion via DR3): for each step, we report the appropriate derivation rule,
and the expansion of the dependency tree with the variations in the ρ-sets of
the category objects.
DR2: α <d, X({<ν, d, X>}), µ, η+> ω ⇒ α η+,µεν ω
DR2 accounts for the insertion of full-subtree traces. The ν index is now
satisfied and is associated with the trace. Moreover, the trace itself can act as a
co-indexing element for a further trace (through µ and η+). DR2 can be
applied just in case the derivation object under analysis includes a category
object of the category X, linked to its governor via the dependency relation d,
and with the ρ-set being a singleton containing the u-triple <instantiated
u-index, d, X>. Notice that it is necessary that ρ be a singleton, since no other
u-triple can be satisfied in this subtree.
DR3:α <d, X(ρ), µ, η+> ω ⇒
α
<d1, Y1(ρ1), µ1, η1>
<d2, Y2(ρ2), µ2, η2>
...
<di-1, Yi-1(ρi-1), µi-1, ηi-1>
vi,µεq
<di+1, Yi+1(ρi+1), µi+1, ηi+1>
...
<dm, Ym(ρm), µm, ηm>
ω
where
a) x:X (<r1Y1u1τ1> ... <ri-1Yi-1ui-1τi-1> <# ui> <ri+1Yi+1ui+1τi+1> ...
<rmYmumτm>) is a dependency rule;
b) ρ includes a u-triple of the form <t.q, d, x:X>, t∈N∪{!c}; q∈N;
c) for each single u-triple <q.s, dY, y:Y> in ρ, there exists a d-quad
<dkiYkiukiτki> in the dependency rule such that dki=dY and Yki=Y, and
{<s, dki, Yki>} ⊆ ρki;
d) ρ1 ∪ ... ∪ ρm=ρ ∪ τ1 ∪ τ2 ... ∪ τm - {<t.q, d, x:X>}
DR3 is a kind of merge of DR1 and DR2 (see fig. 4, bottom step): it expands
the derivation object both by inserting a single-word trace, and by introducing
derivation objects associated with the dependents licensed by the rule (a).
The trace is inserted exactly as in DR2 (u-index q), the only difference being
that the ρ-set need not be a singleton, since some other traces can be
satisfied in the subtree headed by the trace (b). In fact, the remaining
u-triples in ρ are distributed over the dependents as in DR1, with the exception
of the u-triple already satisfied by the local trace inserted here (c) (d).6
6 Condition (c) forces the presence of a suitable d-quad in the rule (a) selected for
application. For instance, the rule for the transitive sense of "eat" can specify that the
direct object is gapped. Condition (c) prevents the application, in the second conjunct, of the
rule associated with the intransitive sense of "eat", where the direct object gap cannot be
satisfied.
The Derivation Relation
We define ⇒* as the reflexive, transitive closure of ⇒.
Given a grammar G, L'(G) is the language of sequences of word
objects:
L' (G)={α∈W'(G)* / <TOP, Q(∅), ◊, ◊> ⇒* α and Q∈S(G)}
where TOP is a dummy dependency relation. The language generated
by the grammar G, L(G), is defined through the function t:
L(G) = {w∈W* / w = t(α) and α∈L'(G)},
where t is defined recursively as
t(-) = -; t(η,µxν α) = x t(α) for x∈W; t(η,µεν α) = t(α);
where - is the empty sequence.
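The function t can be sketched directly in code. The tuple encoding of word objects as (η, µ, x, ν), with "◊" for the null index and "ε" for the trace symbol, is our own rendering of the definition above.

```python
# Our encoding of word objects and of the flattening function t: it keeps the
# actual words of the surface string and drops traces and all indices.

def t(seq):
    return [x for (eta, mu, x, nu) in seq if x != "ε"]

# Word-object sequence derived for "The boy wants to leave" (section 2.1):
derived = [("◊", 1, "The", "◊"), ("◊", "◊", "boy", "◊"),
           ("◊", "◊", "wants", "◊"), ("◊", "◊", "to", "◊"),
           ("◊", "◊", "ε", 1), ("◊", "◊", "leave", "◊")]

print(" ".join(t(derived)))  # The boy wants to leave
```

The derivation annotates the sentence with the co-indexed trace ε1 after "to", and t erases exactly that annotation layer.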
2.2 An example
In the following example, we show how the derivation mechanism deals
with full u-indices. An example concerning local u-indices is in the appendix,
and involves a coordination construct.
Let us consider the grammar
G1 = <
W(G1) = {I, you, John, beans, know, likes, say}
C(G1) = {V, V+EX, N}
S(G1) = {V, V+EX}
D(G1) = {SUBJ, OBJ, SCOMP, VISITOR, TOP}
U(G1) = {◊, u}
H(G1) >,
where H(G1) includes the following dependency rules:
1. I: N (#);
2. you: N (#);
3. John: N (#);
4. beans: N (#);
5. likes: V (<SUBJ, N, ◊, ∅> # <OBJ, N, ◊, ∅>);
6. know: V+EX (<VISITOR, N, u, ∅> <SUBJ, N, ◊, ∅> # <SCOMP, V,
◊, {<u,OBJ,N>}>);
7. say: V (<SUBJ, N, ◊, ∅> # <SCOMP, V, ◊, ∅>).
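The first DR1 step of a derivation with this grammar can be sketched in code. The encoding is ours (the paper defines none), and it models only what G1 needs: instantiating the full u-index u as a fresh integer and copying the instantiated u-triple into the ρ-set of the SCOMP dependent.

```python
import itertools

# Our sketch of one DR1 step over G1. A d-quad is (d, cat, u, triples); "#"
# marks the head position. Expanding <TOP, V+EX(∅), ◊, ◊> with rule 6 ("know")
# yields the dependents of know plus the word itself, with u instantiated as 1.

fresh = itertools.count(1)

def dr1_step(word, dquads):
    # instantiate every non-null u-index of the rule as a fresh integer
    inst = {q[2]: next(fresh) for q in dquads if q != "#" and q[2] is not None}
    out = []
    for q in dquads:
        if q == "#":
            out.append(word)          # the lexical anchor takes the head slot
        else:
            d, cat, u, triples = q
            rho = {(inst[tu], td, tc) for (tu, td, tc) in triples}
            out.append((d, cat, rho, inst.get(u)))
    return out

know_rule = [("VISITOR", "N", "u", set()), ("SUBJ", "N", None, set()),
             "#", ("SCOMP", "V", None, {("u", "OBJ", "N")})]
step1 = dr1_step("know", know_rule)
```

The result mirrors the first step of the derivation below: the VISITOR dependent carries the instantiated index 1, and the SCOMP dependent carries the instantiated u-triple <1, OBJ, N>.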
A derivation for the sentence "Beans I know you say John likes" is the
following:
<TOP, V+EX(∅), ◊, ◊> ⇒DR1
<VISITOR, N(∅), 1, ◊> <SUBJ, N(∅), ◊, ◊> know
<SCOMP, V({<1,OBJ,N>}), ◊, ◊> ⇒DR1
1beans <SUBJ, N(∅),◊,◊> know
<SCOMP, V({<1,OBJ,N>}), ◊, ◊> ⇒DR1
1beans I know <SCOMP, V({<1,OBJ,N>}), ◊, ◊> ⇒DR1
1beans I know <SUBJ, N(∅),◊,◊> say
<SCOMP, V({<1,OBJ,N>}), ◊, ◊> ⇒DR1
1beans I know you say <SCOMP, V({<1,OBJ,N>}), ◊, ◊> ⇒DR1
1beans I know you say <SUBJ, N(∅),◊,◊> likes
<OBJ, N({<1,OBJ,N>}), ◊, ◊> ⇒DR1
1beans I know you say John likes <OBJ, N({<1,OBJ,N>}), ◊, ◊> ⇒DR2
1beans I know you say John likes ε1
The dependency tree corresponding to this derivation is in fig. 5.
know (VISITOR: 1beans, SUBJ: I, SCOMP: say (SUBJ: you, SCOMP: likes (SUBJ: John, OBJ: ε1)))
Figure 5. Dependency tree of the sentence "Beans I know you say John
likes", given the grammar G1.
2.3 Metarules
As stated above, there are two types of dependency rules: primitive
dependency rules, which represent subcategorization frames, and non-primitive dependency rules, which result from the application of lexical
metarules to primitive and non-primitive dependency rules. In this section, we
sketch the general idea of a metarule, and provide a few examples.
The general schema of a metarule is
SOURCE →meta-id TARGET
where "meta-id" is an identifier of the metarule, and SOURCE and
TARGET are PATTERNS of dependency rules. A PATTERN is an
abstraction (underspecification) over a dependency rule, where the head can
(possibly) reduce to the syntactic category (from x:X to X), and some
subsequences of d-quads can be (possibly) replaced by some variable
symbol. The abstraction makes it possible to generalize the description of
phenomena, where possible.7
7 Recently, there has been a trend in linguistics to view most syntactic phenomena as
lexicon-dependent rather than category-dependent (sometimes, e.g. passivization, the shift
An example of metarule for object extraction is the following:
V (<SUBJ, N, ◊, ∅> # <SCOMP, V, ◊, ∅>)
→extr2
V+EX (<VISITOR, N, u, ∅> <SUBJ, N, ◊, ∅> #
<SCOMP, V, ◊, {<u,OBJ,N>}> )
which can also be represented graphically as a pair of dependency tree schemata (SOURCE and TARGET).
The metarule extr2 takes as SOURCE a PATTERN that represents the
subcategorization frame of a verb with a sentential complement (know,
believe, admit, ...), and produces as TARGET a PATTERN that accounts for
the object extraction from the sentential complement. In the previous section,
we have seen how a dependency rule abstracted by this PATTERN can
derive "Beans, I know you say John likes".
Some subsequences of d-quads do not affect the application of
metarules, and pass unaltered from the SOURCE to the TARGET.
PATTERNs avoid redundancy by introducing variables. The following
metarule accounts for preposition stranding (for example, "This place, he
comes to", or "This place, he often comes to with reluctance"):
V (<SUBJ, N, ◊, ∅> # σ1 <ρ, P, ◊, ∅> σ2)
→pstrand1
V+PS (<VISITOR, DET, u, ∅> <SUBJ, N, ◊, ∅> # σ1
<ρ, P, ◊, {<u, P-OBJ, DET>}> σ2 )
A unification procedure accounts for matching PATTERNs (SOURCE
and TARGET) and dependency rules, taking care of variable substitutions
from SOURCE to TARGET. The SOURCE dependency rule for "comes" in
fig. 6a represents its subcategorization frame. The metarule pstrand1
matches this dependency rule through its SOURCE PATTERN, and
produces the TARGET dependency rule, which licenses the dependency tree
in fig. 6b.
even continues to sentence-dependency). In this work, we associate metarules with
subcategories arranged in a hierarchy (on this topic, see (Barbero C. et al., 1998)).
(a) comes: V (<SUBJ, N, ◊, ∅> # <DEST, P, ◊, ∅>) →pstrand1
comes: V (<VISITOR, DET, u, ∅> <SUBJ, N, ◊, ∅> # <DEST, P, ◊, {<u, P-OBJ, DET>}>)
(b) comes (VISITOR: 1this (NBAR: place), SUBJ: he, DEST: to (P-OBJ: ε1))
Figure 6. Metarule pstrand1 for preposition stranding (a), and dependency
tree of the sentence "This place, he comes to" (b).
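The matching of PATTERNs against dependency rules can be sketched as follows. The encoding is deliberately simplified (d-quads reduced to (relation, category) pairs, u-indices and u-triples omitted), and the function names are our own, not the paper's:

```python
def match(pattern, dquads, bindings=None):
    """Match a SOURCE pattern against a d-quad sequence.
    Variable symbols (strings starting with 'σ') bind to arbitrary,
    possibly empty, subsequences; everything else must match literally."""
    if bindings is None:
        bindings = {}
    if not pattern:
        return bindings if not dquads else None
    head, rest = pattern[0], pattern[1:]
    if isinstance(head, str) and head.startswith("σ"):
        for i in range(len(dquads) + 1):       # try every split point
            result = match(rest, dquads[i:], {**bindings, head: dquads[:i]})
            if result is not None:
                return result
        return None
    if dquads and dquads[0] == head:           # literal d-quad or '#'
        return match(rest, dquads[1:], bindings)
    return None

def substitute(target, bindings):
    """Build the TARGET d-quad sequence, expanding bound variables."""
    out = []
    for el in target:
        if isinstance(el, str) and el.startswith("σ"):
            out.extend(bindings[el])
        else:
            out.append(el)
    return out

# pstrand1, radically simplified, applied to the rule for 'comes':
source = [("SUBJ", "N"), "#", "σ1", ("DEST", "P"), "σ2"]
target = [("VISITOR", "DET"), ("SUBJ", "N"), "#", "σ1", ("DEST", "P"), "σ2"]
comes  = [("SUBJ", "N"), "#", ("DEST", "P")]
print(substitute(target, match(source, comes)))
```

Here both σ1 and σ2 bind to the empty subsequence; with modifiers between the head and the stranded preposition ("he often comes to with reluctance"), they would carry the intervening d-quads unaltered into the TARGET.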
In the next section, we describe the metarules that apply to dependency
rules to generate some coordination constructs.
3. COORDINATION
In this section we show how the formal mechanisms of RDG can be
applied to describe coordination constructs. The goal of the section is not to
provide an exhaustive treatment of coordination, one of the most challenging
issues for the syntactic theories, but rather to motivate the distinction
between full and local u-indices, and their interaction with the derivation
rules, from the linguistic point of view.
Coordination phenomena do not fit well with the dominance-based
character of the vast majority of linguistic paradigms. The accounts of
coordination all rely on the notion of non-traditional constituents, because of
the variety of word strings that can play the role of conjuncts (see, for
example, (Gazdar G. et al. 1985), (Steedman M. 1985, 1990, 1996)).
Dependency paradigms exhibit obvious difficulties with coordination
because, unlike most other constructions, it is not possible to characterize
the coordination construct with a general schema involving a head and some
dependents. The conjunction itself has distributional properties that have
nothing to do with the whole coordination. Hudson (1990), following
(Tesniere L. 1959), claims that conjuncts are word substrings (instead of
subtrees), which can be internally organized as (possibly disconnected)
dependency structures, and each conjunct root is dependency related to
some element of the sentence which is external to the coordination. Mel'cuk
(1988), on the other hand, privileges one of the two conjuncts as the head of
the coordination, and claims that coordination symmetry is such only at the
semantic level. This approach solves the problem of providing a head with
the same distributional properties of the whole coordination.
The dependency account of coordination we propose follows Mel'cuk's
hint: one of the two conjuncts is the head of the construction, and the
conjunction itself is the head of the other conjunct. This approach is
consistent with the general framework of metaprojectivity (Bröker N., this issue),
even if in a generalized form for some cases of gapping (see below). The
dependency rules that license coordination are non primitive rules for the
head conjunct. Conjunctions are treated as lexically ambiguous elements
(polymorphic functors in categorial terms): they are assigned categories of
the form CONJ-X, for each syntactic category X. So, we have CONJ-V for
verbs, CONJ-N for nouns, and so on. In the following, we illustrate a limited
number of metarules for coordination, which are useful for illustrating the
derivation process. There are three subsections: metarules for unit
coordination, which do not involve u-indices; metarules for non-constituent
coordination, which involve full u-indices; and metarules for gapping, which
involve local u-indices. Motivations and examples for our approach to
coordination can be found in (Lombardo V. & Lesmo L. 1998b).
3.1 Metarules for unit coordination
Unit coordination occurs when conjuncts are complete. The metarule for
unit coordination (coord-unit) is the following8:
X (σ) →coord-unit X (σ <COORD, CONJ-X, ◊, ∅>)
For each dependency rule with a head of category X (variable), "coord-unit"
produces a dependency rule having exactly the same d-quad sequence
(σ), but with an added rightmost dependent CONJ-X, whose relation with the
head is COORD (fig. 7a). The dependency rule
and: CONJ-V (# <2nd, V, ◊, ∅>)
licenses the second conjunct of a coordination of finite verbs. The
relation "2nd" links the second conjunct as a dependent of the conjunction
and. The TARGET dependency rule in fig. 7b results from the application of
the metarule "coord-unit" to the primitive dependency rule for the predicate-argument structure of laughed (the same holds for sneezed and other intransitive
verbs). According to these dependency rules, the sentence "John laughed
and Mary sneezed" can be represented as shown in fig. 7c.
8 The term "unit coordination" for the full constituent coordination comes from (Huang X.
1984).
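The coord-unit metarule itself is simple enough to state as code. As above, the rule encoding (category plus d-quad list, '#' for the head, frozenset() for ∅) is a sketch of ours, not the paper's notation:

```python
DIAMOND = "◊"   # the null u-index

def coord_unit(rule):
    """coord-unit metarule: X (σ)  →  X (σ <COORD, CONJ-X, ◊, ∅>)."""
    head_cat, dquads = rule
    coord = ("COORD", "CONJ-" + head_cat, DIAMOND, frozenset())
    return (head_cat, dquads + [coord])

# Primitive rule for 'laughed': V (<SUBJ, N, ◊, ∅> #)
laughed = ("V", [("SUBJ", "N", DIAMOND, frozenset()), "#"])
print(coord_unit(laughed)[1][-1])
```

Note that the category variable X determines the conjunction category CONJ-X, so the same function covers COORD for verbs, nouns, and so on.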
(a) X (σ) →coord-unit X (σ <COORD, CONJ-X, ◊, ∅>)
(b) laughed: V (<SUBJ, N, ◊, ∅> #) →coord-unit laughed: V (<SUBJ, N, ◊, ∅> # <COORD, CONJ-V, ◊, ∅>)
(c) laughed (SUBJ: John, COORD: and (2nd: sneezed (SUBJ: Mary)))
Figure 7. Graphical representation of the metarule for unit coordinations
(a), its application to a dependency rule for laughed (b), and the
dependency tree of "John laughed and Mary sneezed" (c).
The metarule "coord-unit" accounts for full constituent (unit)
coordinations, where the two conjuncts are headed by a word of the same
category. Note that in our terms "full constituents" means that the conjuncts
have no gaps inside. Other examples are the following sentences:
She eats apples and pears (OBJ coordination),
Elizabeth likes to go to the zoos and to the museums
(DEST coordination),
Mary thought ostriches fly and kangaroos swim (SCOMP coordination).
As a final remark, note that these examples consider constituents that
are full arguments of predicate-argument structures. Note that VP
coordination, which is a full constituent coordination in phrase structure
terms, is not an example of unit coordination in our terms, because the
dependency rules represent complete predicate-argument structures
(including subject). VP coordination can be described as shown in the next
subsection.
3.2 Metarules for non-constituent coordination
From the point of view of dependency syntax, non-constituent
coordination occurs when one or both conjuncts display an incomplete
predicate structure. The missing elements are handled by traces (empty
nodes) and u-indices, which are controlled through u-triples in non-primitive
dependency rules. U-triple specifications allow a uniform treatment of many
kinds of argument gaps, among which the following (single and multiple)
ones:
Mary cooked and John ate beans (OBJ gap, fig. 8b)
Mary cooked and ate beans (SUBJ+OBJ gap, or V coordination, fig.
8c)
John offered, and Mary actually gave, a golden Cadillac to Billy
Schwartz (OBJ+I-OBJ gap, or Right Node Raising, fig. 8d)
(a) coord-gapSUBJ: the shared SUBJ of the head conjunct carries the full u-index u (uN); the u-triple <u, SUBJ, N> on the COORD dependent (CONJ-V) requires a coindexed SUBJ trace inside the second conjunct.
(b) coord-gapOBJ: the shared OBJ carries u (uN); <u, OBJ, N> on the COORD dependent requires an OBJ trace in the second conjunct.
(c) coord-gapSUBJ+OBJ: both SUBJ and OBJ carry full u-indices (uN, vN); <u, SUBJ, N> and <v, OBJ, N> are attached to the COORD dependent.
(d) coord-gapOBJ+I-OBJ: the shared OBJ (uN) and I-OBJ (vP[to]) carry full u-indices; <u, OBJ, N> and <v, I-OBJ, P[to]> are attached to the COORD dependent.
Figure 8. Metarules for several types of non-constituent coordinations.
These metarules enforce the high attachment of actual dependents and
the low attachment of gapped dependents (trace nodes). This implies that
right dependents attach to the farther head of the two, always respecting the
condition of projectivity. Even if the second conjunct could attach more
immediately to the lower head, we think that the study of intonation in
naturally occurring speech favours an analysis where the second conjunct
forms a unit per se, without the right dependents (Steedman M. 1996). Also,
this analysis implements an immediate generalization of the notion of meta-projectivity. In fact, in this case we are not talking about displacing elements,
but about sharing them.
A different case of non-constituent coordination is the following
I gave the books to Mary and the records to Sue (V + SUBJ gapping)
where both the head and some dependent are involved in gapping. For
this sentence, we need to employ local u-indices, as this (non-primitive)
dependency rule reveals:
gave: V (<SUBJ, N, !, ∅>
<#,!c>
<OBJ, N, ◊, ∅>
<I-OBJ, PREP[to], ◊, ∅>
<COORD, CONJ-V, ◊, {<!c, 2nd, gave:V>}>)
This rule is produced by the metarule in fig. 9a. The dependency tree
which results from the derivation of this sentence is in fig. 9b.
(a) V (<SUBJ, N, ◊, ∅> # σ) →coord-gapSUBJ-V
V (<SUBJ, N, !, ∅> <#, !c> σ <COORD, CONJ-V, ◊, {<!c, 2nd, V>}>)
(b) 1gave (SUBJ: 2I, OBJ: the (NBAR: books), I-OBJ: to (P-OBJ: Mary), COORD: and (2nd: ε1 (SUBJ: ε2, OBJ: the (NBAR: records), I-OBJ: to (P-OBJ: Sue))))
Figure 9. The metarule for V+SUBJ gapping (a) and the dependency tree
associated with the sentence "I gave the books to Mary and the records to
Sue".
3.3 Metarules for gapping
Now let us turn to gapping. These coordination constructs occur when
the missing structure in the second conjunct is not a whole subtree, but only
a part of it, namely the head and, possibly, some of its dependents. Here are
some examples:
I saw a unicorn and Carol a tyrannosaurus (V gapping)
John wants to give the books to Mary and Bill to Sue
(V complex + OBJ gapping)
For the (V-gapped) sentence
I saw a unicorn and Carol a tyrannosaurus
we need the metarule in fig. 10a, which produces the TARGET
dependency rule
saw: V (<SUBJ, N, ◊, ∅> <#, !c> <OBJ, DET, ◊, ∅>
<COORD, CONJ-V, ◊, {<!c, 2nd, V>}>)
which licenses the dependency tree in fig. 10b.
(a) V (σ1 # σ2) →coord-gapV V (σ1 <#, !c> σ2 <COORD, CONJ-V, ◊, {<!c, 2nd, V>}>)
(b) 1,◊saw (SUBJ: I, OBJ: a (NBAR: unicorn), COORD: and (2nd: ε1 (SUBJ: Carol, OBJ: a (NBAR: tyrannosaurus))))
Figure 10. The metarule for verb gapping (a) and the dependency
tree of the sentence "I saw a unicorn and Carol a tyrannosaurus" (b).
Notice that our approach does not distinguish these cases of gapping
from some cases of non-constituent coordination (see above). In the
appendix, the interested reader can find a complete example of derivation for
the sentence “John wants to give a present to Mary, and Bill to Sue”, whose
dependency tree is in fig. 11. There we list the rules of the grammar, and the
complete derivation in detail.
2,◊wants (SUBJ: ◊,1John,
          VCOMP: 3,◊to (PRED: 4,◊give (SUBJ: ε1, OBJ: 5,◊a (NBAR: 6,◊present), I-OBJ: to (P-OBJ: Mary))),
          COORD: and (2ND: ε2 (SUBJ: ◊,7Bill,
                               VCOMP: ε3 (PRED: ε4 (SUBJ: ε7, OBJ: ε5 (NBAR: ε6), I-OBJ: to (PREP-OBJ: Sue))))))
Figure 11. The dependency tree associated with the sentence "John wants
to give a present to Mary and Bill to Sue". Notice that ε2, ε3, and ε4, as well
as ε5 and ε6, can be collapsed into a single node. However, this collapsing
is possible only after the completion of the derivation process, with possible
complications for the semantic interpreter in terms of compositionality.
4. CONCLUSION
The paper has described the rule-based dependency formalism RDG.
We have introduced the form of the rules, and the derivation relation, which
determines the structure of the dependency tree. We have also sketched a
structured treatment of coordination which is coherent with the dependency
approach to syntax in terms of heads and dependents. Two aspects require
some closing comments: the use of non-lexical categories, and the
overgeneration problem.
The introduction of non lexical categories in a dependency formalism
allows the representation of long-distance dependencies in the static
derivation tree. In totally lexical approaches, where all the units in the
syntactic representation are lexically realized, the mapping between syntax
and semantics can be realized only dynamically, that is while computing the
syntactic structure (see for example the formalisms described in (Milward D.
1994) and (Kahane S. et al. 1998)). However, in NLP this process is often
split in two temporal phases, with the syntactic structure as a communication
medium between the parser and the interpreter. This architectural
organization requires a maximally informative syntactic structure which
overtly expresses all the dependencies. The use of non-lexical units makes it
possible to express in the same representation both the syntactic and the
argumental dependencies.
The formalism that we have described overgenerates, i.e. it yields
ungrammatical natural language sentences. The major issue concerns the
satisfaction of u-triples, which can occur at any element in a subtree. A
possible immediate solution is grounded on one of two basic mechanisms, well known in the
literature: bounding nodes, adopted in the GB theories and in early LFG, and
functional uncertainty, adopted in late LFG, and applied to dependency
syntax by Kahane et al. (1998). Although we are currently oriented to the
second solution (see also Bröker N., this issue), this is a matter of further
investigation.
A final word on parsing. The generalization of the formalism described
in (Lombardo V. & Lesmo L. 1998a) can result in the loss of polynomiality of
the parsing algorithm described therein. The treatment of local u-indices can
in fact introduce a sentence-length factor in the exponent, thus yielding an
exponential algorithm. We are currently studying a careful selection of the
data structures, in order to reduce this conjectured exponential complexity.
ACKNOWLEDGEMENTS
We would like to thank Norbert Bröker and Sylvain Kahane for several comments
and discussions on the formal issues of dependency syntax. Also, we thank
Cristina Bosco for having read and commented previous drafts of this paper.
REFERENCES
BARBERO Cristina, LESMO Leonardo, LOMBARDO Vincenzo,
MERLO Paola, "Integration of syntactic and lexical information in a
hierarchical dependency grammar", Proc. of the Workshop on Dependency
Grammars, ACL-COLING 1998, Montreal, Canada, 1998, 58-67.
BECKER Tilman, RAMBOW Owen, "Parsing non-immediate dominance
relations", Proc. IWPT 95, Prague, 1995, 26-33.
BRÖKER Norbert, "Separating Surface Order and Syntactic Relations
in a Dependency Grammar", Proc. ACL/COLING 1998, Montreal, Canada, 174-180.
BRÖKER Norbert, "Unordered and Non-projective Dependency
Grammars", this issue.
COVINGTON Michael A., "Parsing Discontinuous Constituents in
Dependency Grammar", Computational Linguistics 16, 1990, 234-236.
EARLEY Jay, "An Efficient Context-free Parsing Algorithm",
Communications of the ACM 13, 1970, 94-102.
EISNER Jason, "Three New Probabilistic Models for Dependency
Parsing: An Exploration", Proc. COLING 96, Copenhagen, 1996, 340-345.
FRASER Norman M., HUDSON Richard A., "Inheritance in Word
Grammar", Computational Linguistics 17, 1992, 133-157.
GAIFMAN Haim, "Dependency Systems and Phrase Structure
Systems", Information and Control 7, 1965, 304-337.
GAZDAR Gerald, KLEIN Ewan, PULLUM Geoffrey, SAG Ivan,
Generalized Phrase Structure Grammar, Basil Blackwell, Oxford, 1985.
HAHN Udo, SCHACHT Susanne, BRÖKER Norbert, "Concurrent, Object-Oriented Natural Language Parsing: The ParseTalk Model", Journal of
Human-Computer Studies 41, 1994, 179-222.
HUANG Xiuming, "Dealing with conjunctions in a machine-translation
environment", Proc. of COLING 84, Stanford, 243-246.
HUDSON Richard, English Word Grammar, Basil Blackwell, Oxford,
1990.
JÄRVINEN Timo, TAPANAINEN Pasi, "A Dependency Parser for
English", Technical Report TR-1, Dept. Of General Linguistics, Univ. Helsinki,
1997.
KAHANE Sylvain, NASR Alexis, RAMBOW Owen, "Pseudo-Projectivity:
A Polynomially Parsable Non-Projective Dependency Grammar", Proc.
ACL/COLING 1998, Montreal, Canada, 646-652.
KAPLAN Ronald, ZAENEN Annie, "Long Distance dependencies,
constituent structure, and functional uncertainty", in M.Baltin, A.Kroch (eds.):
Alternative conceptions of phrase structure, Univ. of Chicago Press, Chicago,
IL, 1988.
KWON Huan, YOON A., "Unification-Based Dependency Parsing of
Governor-Final Languages", Proc. IWPT 91, Cancun, 1991, 172-192.
LOMBARDO Vincenzo, LESMO Leonardo, "An Earley-type recognizer
for dependency grammar", Proc. COLING 96, Copenhagen, 1996, 723-727.
LOMBARDO Vincenzo, LESMO Leonardo, "Formal aspects and
parsing issues of dependency theory", Proc. ACL-COLING 98, Montreal,
1998a, 787-793
LOMBARDO Vincenzo, LESMO Leonardo, "Unit coordination and
gapping in dependency theory", Proc. of the Workshop on Dependency
Grammars, ACL-COLING 1998, Montreal, Canada, 1998b, 11-20.
MEL'CUK Igor, Dependency Syntax: Theory and Practice, SUNY Press,
Albany, 1988.
MILWARD David, "Dynamic Dependency Grammar", Linguistics and
Philosophy 17, December 1994, 561-606.
NASR Alexis, "A formalism and a parser for lexicalized dependency
grammar", Proc. IWPT 95, Prague, 1995, 176-195.
NASR Alexis, “Un modèle de reformulation automatique fondé sur la
Théorie Sens-Texte: Application aux langues contrôlées”, Ph.D. Thesis,
Université Paris 7, 1996.
NEUHAUS Peter, BRÖKER Norbert, "The Complexity of Recognition of
Linguistically Adequate Dependency Grammars", Proc. ACL/EACL97,
Madrid, 1997, 337-343.
NEUHAUS Peter, HAHN Udo, "Restricted Parallelism in Object-Oriented Parsing", Proc. COLING 96, Copenhagen, 1996, 502-507.
RAMBOW Owen, JOSHI Aravind, "A Formal Look at Dependency
Grammars and Phrase-Structure Grammars, with Special Consideration of
Word-Order Phenomena", Int. Workshop on The Meaning-Text Theory,
Darmstadt, 1992.
SARKAR Anoop, JOSHI Aravind K., "Handling Coordination in a Tree
Adjoining Grammar", Unpublished manuscript, Department of Computer and
Information Science, University of Pennsylvania, Philadelphia (PA), February
1997.
SCHABES Yves, “Mathematical and Computational Aspects Of
Lexicalized Grammars”, Ph.D. Dissertation MS-CIS-90-47, Dept. of
Computer and Information Science, University of Pennsylvania, Philadelphia
(PA), August 1990.
SGALL Petr, HAJICOVA Eva, PANEVOVA Jarmila, The Meaning of
Sentence in its Semantic and Pragmatic Aspects, Dordrecht Reidel Publ.
Co., Dordrecht, 1986.
SLEATOR Daniel D., TEMPERLEY Daniel, "Parsing English with a Link
Grammar", Proc. of IWPT93, 1993, 277-291.
STEEDMAN Mark, "Dependency and Coordination in the grammar of
Dutch and English", Language 61, 1985, 523-568.
STEEDMAN Mark, "Gapping as constituent coordination", Linguistics
and Philosophy 13, 1990, 207-264.
STEEDMAN Mark, Surface Structure and Interpretation, MIT Press,
1996.
TESNIERE Lucien, Eléments de syntaxe structurale, Klincksieck, Paris,
1959.
Appendix: Another example of derivation
In this section we trace the derivation for the sentence "John wants to
give a present to Mary, and Bill to Sue". The underlined derivation objects are
the ones to which the derivation rule applies. The resulting derivation tree is
shown in fig.11.
Sentence
John wants to give a present to Mary, and Bill to Sue.
Rules
and: CONJ-V (# <2nd, V, ◊, ∅>)
giveα: V (<SUBJ, N, ◊, ∅> # <OBJ, DET, ◊, ∅> <I-OBJ, PREP-TO, ◊, ∅>)
giveβ: V (<SUBJ, N, ◊, ∅> <#,!> <OBJ, DET, !, ∅> <I-OBJ, PREP-TO, ◊, ∅>)
wantsα: V (<SUBJ, N, u, ∅> # <VCOMP, TO, ◊, {<u,SUBJ,N>}>)
wantsβ: V (<SUBJ, N, u, ∅>
<#,!c>
<VCOMP, TO, !, {<u,SUBJ,N>}>
<COORD, CONJ-V, ◊, {<!c,2nd,wants:V>}>)
toα: TO (# <PRED, V, ◊, ∅>)
toβ: TO (<#,!> <PRED, V, !, ∅>)
toγ: PREP-TO (# <PREP-OBJ, N, ◊, ∅>)
aα: DET (# <NBAR, N, ◊, ∅>)
aβ: DET (<#,!> <NBAR, N, !, ∅>)
presentα: N (#)
presentβ: N (<#,!>)
John, Mary, Bill, Sue: N (#)
Derivation
<TOP, V(∅), ◊, ◊>
=== DR1 (wantsβ) [u = 1, !c = !c.2] ===>
<SUBJ,N(∅),1,◊> 2,◊wants◊ <VCOMP,TO({<1,SUBJ,N>}),◊,2.3>
<COORD, CONJ-V({<!c.2,2nd,wants:V>}), ◊, ◊>
Rule DR1 is applied to expand the TOP dummy category. The wantsβ dependency rule specifies
that:
a - The SUBJ of 'wants' will be coindexed with a (full subtree) trace appearing within the
VCOMP (index u, instantiated as 1)
b - The head verb 'wants' will be coindexed with a (single word) trace appearing within the
COORD subtree (index !c, starting up a new chain and instantiated with the dotted pair
!c.2). The new index value (i.e. 2) is associated with 'wants' (in the derivation sequence) as
a single-word index.
According to DR1, a continuation index (2.3) is inserted in the VCOMP derivation object, to
signal that the head of VCOMP will also be single-word coindexed (see ! in the wantsβ
dependency rule).
=== DR1 (John) [u = 1] ===>
◊,1John◊ 2,◊wants◊ <VCOMP,TO({<1,SUBJ,N>}),◊,2.3>
<COORD, CONJ-V ({<!c.2,2nd,wants:V>}), ◊, ◊>
The lexical element 'John' is inserted in the sequence. It gets the full-subtree index 1, as
specified in the derivation object which has been expanded via DR1.
=== DR1 (toβ) [! =2.3] ===>
◊,1John◊ 2,◊wants◊ 3,◊to◊ <PRED, V({<1,SUBJ,N>}),◊,3.4>
<COORD, CONJ-V ({<!c.2,2nd,wants:V>, <2.3, VCOMP,to:TO>}), ◊, ◊>
The lexical element 'to' is inserted in the sequence. It gets the single-word index 3, i.e. the
second element of the dotted pair 2.3 appearing in the VCOMP derivation object. The u-triple
expressing the requirement of a single-word trace is inserted in the COORD d-quad, i.e. where
!c.2 appears. A continuation of the coindexing chain is introduced in the PRED d-quad (as
required by the toβ dependency rule). In the present situation, it is known that:
a - A two-trace chain must be found within the COORD subtree
b - The two elements which will be coindexed with the traces have been found ('wants' and 'to').
c - The chain of coindexed elements will continue within the PRED
=== DR1 (giveβ) [! = 3.4] ===>
◊,1John◊ 2,◊wants◊ 3,◊to◊ <SUBJ, N({<1,SUBJ,N>}), ◊, ◊> 4,◊give◊
<OBJ, DET(∅),◊,4.5>
<I-OBJ, PREP-TO(∅), ◊, ◊>
<COORD, CONJ-V ({<!c.2,2nd,wants:V> <2.3 ,VCOMP,to:TO>
<3.4, PRED,give:V>}),
◊, ◊>
This application of DR1 is exactly analogous to the previous one. Now the chain within COORD
is three traces long, since 'give' is also single-word coindexed.
=== DR2 ===>
◊,1John◊ 2,◊wants◊ 3,◊to◊ ◊,◊ε1 4,◊give◊ <OBJ, DET(∅),◊,4.5>
<I-OBJ, PREP-TO(∅), ◊, ◊>
<COORD, CONJ-V ({<!c.2,2nd,wants:V> <2.3, VCOMP,to:TO>
<3.4, PRED,give:V>}),
◊, ◊>
The first (full-subtree) trace is inserted via DR2. In fact, the 'SUBJ,N' requirements expressed in
the u-triple are satisfied by the expected element.
=== DR1 (aβ) [! = 4.5] ===>
◊,1John◊ 2,◊wants◊ 3,◊to◊ ◊,◊ε1 4,◊give◊ 5,◊a◊ <NBAR, N(∅),◊,5.6>
<I-OBJ, PREP-TO(∅), ◊, ◊>
<COORD,CONJ-V ({<!c.2,2nd,wants:V> <2.3,VCOMP,to:TO>
<3.4,PRED,give:V> <4.5,OBJ,a:DET>}),
◊, ◊>
The single-word chain grows again (the determiner 'a')
=== DR1 (presentβ) [! = 5.6] ===>
◊,1John◊ 2,◊wants◊ 3,◊to◊ ◊,◊ε1 4,◊give◊ 5,◊a◊ 6,◊present◊
<I-OBJ, PREP-TO(∅), ◊, ◊>
<COORD, CONJ-V ({<!c.2,2nd,wants:V> <2.3,VCOMP,to:TO>
<3.4,PRED,give:V> <4.5,OBJ,a:DET> <5.6,NBAR,present:N>}),
◊, ◊>
And, finally, it is completed with 'present'
=== DR1 (toα) [] ===>
◊,1John◊ 2,◊wants◊ 3,◊to◊ ◊,◊ε1 4,◊give◊ 5,◊a◊ 6,◊present◊ ◊,◊to◊
<PREP-OBJ, N(∅), ◊, ◊>
<COORD, CONJ-V ({<!c.2,2nd,wants:V> <2.3,VCOMP,to:TO>
<3.4,PRED,give:V> <4.5,OBJ,a:DET> <5.6,NBAR,present:N>}),
◊, ◊>
'to' is inserted, with no trace involved
=== DR1 (Mary) [] ===>
◊,1John◊ 2,◊wants◊ 3,◊to◊ ◊,◊ε1 4,◊give◊ 5,◊a◊ 6,◊present◊ ◊,◊to◊ ◊,◊Mary◊
<COORD, CONJ-V ({<!c.2,2nd,wants:V> <2.3,VCOMP,to:TO>
<3.4,PRED,give:V> <4.5,OBJ,a:DET> <5.6,NBAR,present:N>}),
◊, ◊>
As well as 'Mary'.
=== DR1 (and) [] ===>
◊,1John◊ 2,◊wants◊ 3,◊to◊ ◊,◊ε1 4,◊give◊ 5,◊a◊ 6,◊present◊ ◊,◊to◊ ◊,◊Mary◊ ◊,◊and◊
<2nd, V ({<!c.2,2nd,wants:V> <2.3,VCOMP,to:TO> <3.4,PRED,give:V>
<4.5,OBJ,a:DET> <5.6,NBAR,present:N>}),
◊, ◊>
COORD is expanded; all the expectations for traces are forwarded to the governed V.
=== DR3 (wantsα) [u = 7] ===>
◊,1John◊ 2,◊wants◊ 3,◊to◊ ◊,◊ε1 4,◊give◊ 5,◊a◊ 6,◊present◊ ◊,◊to◊ ◊,◊Mary◊ ◊,◊and◊
<SUBJ, N(∅), 7, ◊>
◊,◊ε2
<VCOMP,
TO ({<2.3, VCOMP,to:TO> <3.4, PRED,give:V> <4.5, OBJ, a:DET>
<5.6, NBAR, present:N> <7,SUBJ,N>}),
◊, ◊>
The verb is gapped (constraints for !c.2 satisfied). So a trace is inserted in the derivation
sequence in place of the expected head (wants). The trace gets the index 2 (second element of
the dotted pair !c.2), thus encoding the co-indexing with 'wants'.
=== DR1 (Bill) [u = 7] ===>
◊,1John◊ 2,◊wants◊ 3,◊to◊ ◊,◊ε1 4,◊give◊ 5,◊a◊ 6,◊present◊ ◊,◊to◊ ◊,◊Mary◊ ◊,◊and◊
◊,7Bill◊ ◊,◊ε2
<VCOMP,
TO ({<2.3, VCOMP,to:TO> <3.4, PRED,give:V> <4.5, OBJ, a:DET>
<5.6, NBAR, present:N> <7,SUBJ,N>}),
◊, ◊>
‘Bill’ gets the full-subtree index (7) that enables co-indexing with the (traced) subject of the
gapped ‘give’.
=== DR3 (toα) [] ===>
◊,1John◊ 2,◊wants◊ 3,◊to◊ ◊,◊ε1 4,◊give◊ 5,◊a◊ 6,◊present◊ ◊,◊to◊ ◊,◊Mary◊ ◊,◊and◊
◊,7Bill◊ ◊,◊ε2 ◊,◊ε3
<PRED, V({<3.4, PRED,give:V> <4.5, OBJ, a:DET>
<5.6, NBAR, present:N> <7,SUBJ,N>}),
◊, ◊>
Insertion of the trace for 'to'
=== DR3 (giveα) [] ===>
◊,1John◊ 2,◊wants◊ 3,◊to◊ ◊,◊ε1 4,◊give◊ 5,◊a◊ 6,◊present◊ ◊,◊to◊ ◊,◊Mary◊ ◊,◊and◊
◊,7Bill◊ ◊,◊ε2 ◊,◊ε3 <SUBJ, N({<7,SUBJ,N>}), ◊, ◊> ◊,◊ε4
<OBJ, DET({<4.5, OBJ, a:DET> <5.6, NBAR, present:N>}), ◊, ◊>
<I-OBJ, PREP-TO(∅), ◊, ◊>
Insertion of the trace for 'give'. The expected traces are distributed among the dependents:
- The full-subtree trace (7) to the governed subject (SUBJ)
- All the single word traces (4.5 and 5.6) to the governed direct object (OBJ)
=== DR2 ===>
◊,1John◊ 2,◊wants◊ 3,◊to◊ ◊,◊ε1 4,◊give◊ 5,◊a◊ 6,◊present◊ ◊,◊to◊ ◊,◊Mary◊ ◊,◊and◊
◊,7Bill◊ ◊,◊ε2 ◊,◊ε3 ◊,◊ε7 ◊,◊ε4
<OBJ, DET({<4.5, OBJ, a:DET> <5.6, NBAR, present:N>}),◊, ◊>
<I-OBJ, PREP-TO(∅), ◊, ◊>
The subject of the (gapped) ‘give’ is a trace co-indexed with Bill (index 7)
=== DR3 (aα) [] ===>
◊,1John◊ 2,◊wants◊ 3,◊to◊ ◊,◊ε1 4,◊give◊ 5,◊a◊ 6,◊present◊ ◊,◊to◊ ◊,◊Mary◊ ◊,◊and◊
◊,7Bill◊ ◊,◊ε2 ◊,◊ε3 ◊,◊ε7 ◊,◊ε4 ◊,◊ε5
<NBAR, N({<5.6, NBAR, present:N>}),◊, ◊>
<I-OBJ,PREP-TO(∅),◊,◊>
The trace for the determiner 'a' is introduced (index 5)
=== DR3 (presentα) [] ===>
◊,1John◊ 2,◊wants◊ 3,◊to◊ ◊,◊ε1 4,◊give◊ 5,◊a◊ 6,◊present◊ ◊,◊to◊ ◊,◊Mary◊ ◊,◊and◊
◊,7Bill◊ ◊,◊ε2 ◊,◊ε3 ◊,◊ε7 ◊,◊ε4 ◊,◊ε5 ◊,◊ε6
<I-OBJ, PREP-TO(∅), ◊, ◊>
The trace for the 'present' is introduced (index 6)
=== DR1 (toα) [] ===>
◊,1John◊ 2,◊wants◊ 3,◊to◊ ◊,◊ε1 4,◊give◊ 5,◊a◊ 6,◊present◊ ◊,◊to◊ ◊,◊Mary◊ ◊,◊and◊
◊,7Bill◊ ◊,◊ε2 ◊,◊ε3 ◊,◊ε7 ◊,◊ε4 ◊,◊ε5 ◊,◊ε6 ◊,◊to◊
<PREP-OBJ, N(∅), ◊, ◊>
The I-OBJ (to Sue) is realized in the sentence, so it is generated in a standard way.
=== DR1 (Sue) [] ===>
◊,1John◊ 2,◊wants◊ 3,◊to◊ ◊,◊ε1 4,◊give◊ 5,◊a◊ 6,◊present◊ ◊,◊to◊ ◊,◊Mary◊ ◊,◊and◊
◊,7Bill◊ ◊,◊ε2 ◊,◊ε3 ◊,◊ε7 ◊,◊ε4 ◊,◊ε5 ◊,◊ε6 ◊,◊to◊ ◊,◊Sue◊
So that the final sequence is obtained.