Projection, Heads, and Optimality

Projection, Heads, and Optimality*
Jane Grimshaw
Department of Linguistics and Center for Cognitive Science
Rutgers University
May 1995
Department of Linguistics
18 Seminary Place
New Brunswick NJ 08903
[email protected]
* This research was generously supported by Rutgers University. The analysis has benefited
immensely from discussions with, and (often extensive) comments from, the following colleagues:
Peter Ackema, David Adger, Maria Bittner, Eric Bakovic, Claudia Borgonovo, Christine Brisson,
George Broadwell, Daniel Büring, Harald Clahsen, Veneeta Dayal, Viviane Déprez, Liliane
Haegeman, Hubert Haider, Katharina Hartmann, Marco Haverkort, Teun Hoekstra, Norbert
Hornstein, Riny Huybregts, Sabine Iatridou, Ingeborg Lasser, Geraldine Legendre, Jim McCloskey,
Armin Mester, Gereon Müller, Ad Neeleman, Jaye Padgett, Susan Pintzuk, Andrew Radford, Vieri
Samek-Lodovici, Yael Sharvit, Nomi Shir, Sten Vikner, Fred Weerman, Ken Wexler, Edwin
Williams, Ellen Woolford, Raffaella Zanuttini. Many suggestions came from audiences at Brandeis
University, CUNY, University of Delaware, Girona International Summer School in Linguistics, Lund
University, University of Massachusetts at Amherst, MIT, the University of Pennsylvania, Rutgers
University, Universität Stuttgart, and Wuppertaler Linguistisches Kolloquium. I owe a particular debt
to Alan Prince and Paul Smolensky, for their extremely generous contributions to this work, and to
Vieri Samek-Lodovici for his invaluable scrutiny of the manuscript. Two reviewers provided very
helpful commentaries.
1. Introduction
The goal of this paper is to show that the distribution of heads in English clauses, including inversion
of the subject and auxiliary verb, and the appearance of the auxiliary verb do and the complementizer
that, can be explained by the interaction of universal constraints, under Optimality Theory as
proposed in Prince and Smolensky 1993. The core of Optimality Theory lies in these ideas:
Constraints are universal.
Constraints can be violated.
Grammars are rankings of constraints.
The optimal form is grammatical, all non-optimal candidates are ungrammatical.
An optimal output form for a given input is selected from among the class of competitors in
the following way: A form which, for every pairwise competition involving it, best satisfies
the highest-ranking constraint on which the competitors conflict, is optimal.
The constraints which play a central role in the proposal are these:1
Constraints related to Specifiers:
OPERATOR IN SPECIFIER (OP-SPEC)
position
CASE-MARKING (CASE)
SUBJECT (SUBJ)
-- syntactic operators must be in specifier
-- DPs must be case marked
-- clauses have subjects
Constraints related to heads:
OBLIGATORY HEADS (OB-HD)
-- a projection has a head
-- the head is leftmost in its projections
HEAD LEFT (HD-LFT)
-- the head is rightmost in its projections
HEAD RIGHT (HD-RT)
NO MOVEMENT OF A LEXICAL HEAD (NO-LEX-MVT) -- a lexical head cannot move
Government Constraints:
TRACE GOVERNMENT (T-GOV)
-- a trace is governed
TRACE IS LEXICALLY GOVERNED (T-LEX-GOV) -- a trace is lexically governed
Others:
PROJECTION PRINCIPLE (PROJ-PRIN)
-- no adjunction to subordinate clauses; and no
movement into the head of a subordinate clause
ECONOMY OF MOVEMENT (STAY)
-- trace is not allowed
FULL INTERPRETATION (FULL-INT)
-- lexical conceptual structure is parsed
CONDITIONAL (COND)
-- a dependent head c-commands the extended
projection containing it
1
Grimshaw 1993 and 1994 assumed an additional constraint, “Minimal Projection”, which
required that a functional projection make a contribution to the functional representation of the
extended projection that it is part of, thus ruling out entirely empty projections. In the current system,
there is no need to stipulate such a constraint: the effects follow from OB-HD and STAY, as will be
seen in Section 2.
2
I will discuss the formulation of the constraints and their relationship to ideas in the literature at
the appropriate points in the discussion to come. The constraints are assessed at s-structure, and the
account is, as far as I can tell, entirely consistent with the view that s-structure is freely constructed
from the primitives of the theory, including traces, and then evaluated by well-formedness constraints,
among them the ones in this paper.
Because OT constraints are violable, they are not in general “surface-true”. Under standard
assumptions, positing a constraint which is violated requires corrective work. The constraint may be
modified to a less general form so that no violation occurs, or taken to be satisfied by an invisible
element or piece of structure. Or it may be relegated to some level of representation where it can be
held to be unviolated, LF having proved most useful for this purpose. Under OT, violability is the
norm, and it is this which makes it possible to have general constraints freely interacting. The
generality of the constraints leads to systematic conflicts. For example, OB-HD may mandate
movement to provide a head for a projection, yet STAY is always violated by movement. Similarly,
PROJ-PRIN will prohibit head movement if the projection is a subordinate clause. Such conflicts, and
their resolutions, prove essential to explaining the empirical generalizations at issue.
When constraints conflict, it is their relative ranking that determines which will be satisfied and
which violated. The following rankings will be crucial in the proposal to come: Both OB-HD and
OP-SPEC dominate STAY, OP-SPEC dominates OB-HD. FULL-INT is dominated by OB-HD, NO-LEXMVT, SUBJ and CASE. PROJ-PRIN dominates OB-HD and T-LEX-GOV. HD-LFT dominates OB-HD.
An ordering which is consistent with all of these dominance relations is: COND, PROJ-PRIN, OP-SPEC,
HD-LFT, OB-HD, CASE, NO-LEX-MVT, FULL-INT, STAY, T-GOV, T-LEX-GOV. The constraints are
cited in this order throughout.
Cond
No-Lex-Mvt
Proj-Prin
HdLeft
T-Lex-Gov
Case
Subj
Full-Int
Op-Spec
2-Criterion
Ob-Hd
Stay
The competitors, Prince and Smolensky's “candidate set”, are alternative realizations of an input.
3
The INPUT for a verbal extended projection is a lexical head plus its argument structure and an
assignment of lexical heads to its arguments, plus a specification of the associated tense and
semantically meaningful auxiliaries.
The input is passed to GEN (cf. Prince and Smolensky 1993), which generates all extended
projections which conform to XN theory, i.e. in which all projections are of the right basic structure.
Thus GEN can introduce such functional heads as do not appear in the input, due to their lack of
semantic content, that being an example discussed in this paper. I assume a inimal XN theory, which
simply says that each node must be a good projection of a lower one, if a lower one is present. The
presence of a head is mandated by OB-HD, which is violable. (For a very different position on the
head position see Bakovic (1995), where properties of the Spanish inversion system are argued to
follow if OB-HD is considered to be part of GEN, hence inviolable.)
An extended projection (Grimshaw 1991; see also Haider 1989, van Riemsdijk 1990 for related
proposals) is a unit consisting of a lexical head and its projection plus all the functional projections
erected over the lexical projection. The smallest verbal projection is VP, but IP and CP are both
extended projections of V. Other projections such as Agr-s, Agr-o can be incorporated into the
overall program, but are not discussed here.
Suppose, for example, that the lexical head is see, and John is assigned to its external argument
and who to its internal argument: see(x,y), x=John, y=who. GEN will construct all extended
projections of this lexical specification. Among the candidates will be some with who in object
position, some with who in Spec of a functional projection, some with that included, some with do
and so forth.2 Competing candidates are evaluated as analyses of the same lexical material because
of the way the input is defined. Thus, say, John saw who as an analysis of the input above is compared
to other analyses of the same input, and not to, say, analyses of an input with a different lexical head,
or with different arguments assigned to the predicate. I will assume that competing candidates have
non-distinct logical forms, in a sense which must be made precise by further research, but which
certainly must entail that they are truth functionally equivalent. Reinhart (1993) and Iatridou and
Embick (1994) discuss this issue with respect to minimalism and economy (Chomsky 1991, 1992).
It may turn out that the input should include a specification of LF-related properties, such as scope,
as suggested in Smolensky et al. (1995).
The entire set of candidates is compared with respect to conformity to the set of violable
constraints provided by UG and ranked by the grammar of the language, and the optimal one(s)
survive as grammatical.
2
Among other candidates are those which do not analyze elements of the input: those with
violations of the Prince and Smolensky (1993) PARSE constraints. FULL-INT is a constraint of this
type -- see 3.3.
4
2. The Basics of Inversion
Patterns of inversion follow from the interactions among universal constraints affecting heads in the
verbal extended projection. The key constraints for this initial exploration are STAY, OP-SPEC, OBHD.
First, consider an English matrix declarative sentence, where inversion is neither required nor
allowed.
(1)
a. They will read some books
b. *Will they read some books
What well-formedness requirements are relevant for a matrix declarative? In an IP analysis of a
matrix declarative with an auxiliary verb, such as the one in (1a), both projections have heads: a
lexical V heads the VP and the auxiliary will heads IP. Thus there is no head missing from the
structure,and here lies the crucial contrast between a declarative and an interrogative. Wh questions
are operator variable constructions and the wh phrase is subject to OP-SPEC. OP-SPEC is based on
the insight of Rizzi (1991) and Haegeman (1992), that there is a special relationship between the
Specifier position and a syntactic operator, a scope-bearing expression which takes its scope by virtue
of its syntactic position. All phrases which are marked as wh by virtue of percolation (through an
extended projection) of a wh feature from a wh head or specifier (as in Grimshaw (1991)) count as
wh operators. OP-SPEC requires that such expressions be Specifiers, motivating one more projection
than occurs in the corresponding declarative. Specifier of VP is filled by the underlying subject,
specifier of IP by the surface subject, hence the operator must be in specifier of an additional
projection. This extra projection has no head and thus violates OB-HD, which requires a projection
to have a head (either lexically realized or occupied by a trace) much as in Haider (1989). Head
movement provides a head for the CP. Wh movement thus violates STAY in order to satisfy OP-SPEC.
Similarly, inversion violates STAY, but it results in satisfaction of OB-HD.
(2) a. Which books will they read ?
b. *Which books they will read ?
In questions introduced by how come there is no empty C because how come itself is a C (Collins
1991) and OB-HD is satisfied without inversion. To complete the explanation for (2), it is necessary
to show that there is no C to head the CP in (2b), an obvious candidate being the complementizer
that. Section 6 addresses this issue.
The basic idea, then, is that subject-auxiliary inversion in interrogatives is movement to C,
following den Besten (1983), due to the OB-HD constraint, which induces movement to provide a
head for a projection which would otherwise lack one. The projection is absent altogether in
declaratives, hence no movement is necessary for the satisfaction of OB-HD.3
3
Verb second languages show a much-studied paradigm in which inversion is found in matrix
declaratives. While there are a number of different accounts of V2, many view the typological
5
The tableau in (T1) shows the key competitors for the interrogative. If the interrogative is just
an IP it will inevitably violate OP-SPEC (except when the wh phrase is the subject, see 3.4). The CP
structures with no wh movement also violate OP-SPEC, and without inversion they violate OB-HD
too. The CP structure with wh movement but no inversion violates OB-HD and STAY. The CP with
wh movement and inversion violates STAY twice. It is optimal.
(T1)
Matrix interrogatives
Candidates
L
OPSPEC
[IP DP will [VP read what]]
*!
[CP e [IP DP will [VP read what]]]
*!
[CP what e [IP DP will [VP read t ]]]
STAY
*
*!
*
**
[CP what willi [IPDP ei [VP read t ]]]
[CP willi [IPDP ei [VP read what ]]]
OB-HD
*!
*
This illustrates some of the fundamentals of Optimality Theory. Every candidate, including the
grammatical one, violates some constraint. The optimal one, however, marked with the pointing
finger, violates only STAY, while the competitors all violate at least one constraint that is higherranked than STAY. (Because of the way constraint violations are reckoned, no number of violations
of STAY would evict the winning candidate from its position as optimal, see Prince and Smolensky
(1993).) Since OP-SPEC is the highest ranked of these constraints, any candidate which violates it will
fail if there are candidates which do not violate it, as there are in this case.
Several rankings are crucial here. OP-SPEC must out-rank, or “dominate” STAY, otherwise
the candidates with no wh movement would be preferred over candidates which satisfy OP-SPEC. OBHD must out-rank STAY, for similar reasons: this is why head movement is possible if it leads to
satisfaction of OB-HD. Furthermore, we might wonder why the wh phrase cannot be base generated
in Specifier position, avoiding violation of STAY and satisfying OP-SPEC. This possibility is
property that distinguishes the V2 languages from the others as a property of the C position (see for
example, the work reviewed in Weerman (1989) and Vikner (in press)), taking the filling of Spec of
CP as a secondary property. The analysis of English given here invites a different perspective, in
which what is special about these languages is that they require the presence of a Spec position of a
particular kind, hence an extra projection can appear and OB-HD motivates V2 to fill the empty head
position. (The secifier-related constraint in question must outrank STAY in V2 languages, and not
in English -- see 3.3 and 7 for discussion of effects of re-ranking.) The correct formulation of the
constraint requiring the Spec position is a complex issue, however, and I will not pursue it further.
6
eliminated, for arguments at least, if the Theta Criterion is invoked: either as a violable constraint
dominating STAY or as an inviolable constraint (Vieri Samke-Lodovici p.c.). The result is that the
system prefers to generate the wh phrase inside the VP to satisfy the Theta Criterion, at the cost of
violating STAY in satisfying OP-SPEC.
In the declarative, OP-SPEC is always satisfied, vacuously, since there is no operator present.
Hence the extra structure is not needed:
(T2)
Matrix declaratives
OPSPEC
Candidates
L
OB-HD
STAY
[IP DP will [VP read books]]
[CP e [IP DP will [VP read books]]]
[CP willi [IP DPei [VP read books]]]
*!
*!
When the matrix is an IP, every constraint considered so far is respected, and this is the optimal
candidate. OB-HD is satisfied by will in I and by a lexical head in V. OP-SPEC is vacuously satisfied.
When the matrix is a CP there are two alternative patterns. If inversion does not occur then OB-HD
is violated, while if inversion does occur, STAY is violated. (This is why there is no need to appeal
to the constraint “Minimal Projection” of Grimshaw (1993, 1994) to eliminate empty projections.
Joan Bresnan points out, however, (p.c.) that empty adjunction structures are not excluded by the
system.) Thus of the three competing possibilities discussed, the IP is the best. In effect, there is
no point to inversion, since it leads to violation of STAY, with no compensating improvement on a
higher-ranked constraint. In interrogatives, in contrast, the STAY violation has the benefit of allowing
OB-HD to be satisfied.
Under the VP-internal subject hypothesis (Zagona 1982, Koopman and Sportiche 1991, Kitagawa
1986, McNally 1992, Burton and Grimshaw 1992), the subject in both the declarative and the
interrogative has raised from Spec of VP to Spec of IP, satisfying CASE and SUBJECT but in violation
of STAY. I will factor this out for now, and not represent it in the analysis, until it becomes strictly
relevant, in 3.5.
Returning to the interrogatives, OP-SPEC merely requires that the wh phrase appear in a Specifier
position — it says nothing about the position of this specifier relative to the remainder of the clause.
Why then does the extra projection go on top of IP, instead of, for example, between IP and VP?
This is due to the scopal properties of wh phrases: they take scope by virtue of their syntactic
position, and must have scope over the entire propositional structure (roughly speaking IP) in order
to perform their semantic function. There are several different ways to incorporate this into the
present solution. The assumption in Grimshaw (1993) was that structures with the wh phrase in the
7
wrong Specifier would be uninterpretable and hence would never be grammatical. Here instead, I will
take the tack that OP-SPEC requires an operator to be in the Specifier from which it takes its scope.
Thus OP-SPEC for a wh operator will be satisfied only by movement to a position from which the
operator c-commands the entire extended projection.
This solution predicts that when more than one wh phrase is present in a clause, only one moves,
so that (3b), for example, is ungrammatical.
(3)
a. What will they put where?
b. *What will where they put?
This is because the movement of where to an internal specifier violates STAY, but fails to satisfy OPSPEC since it is not moving the wh phrase into a scope position. Thus both candidates have one OPSPEC violation, and the decision falls to STAY, which selects the candidate with fewest movements.
The (simplified) tableau in (T3) illustrates this.
(T3)
Multiple interrogatives
Candidates
L
OPSPEC
OB-HD
STAY
[CP what will [IP they [VP put t where ]]]
*
*
[CP what will [where [IP they [VP put t t ]]]]
*
**!
Movement to Specifier induces inversion, as we have seen. Adjunction, however, does not.
Adjunction leaves the number of projections present in the representation unchanged:
(4)
a.
b.
XP
/ \
Spec XN
/ \
X YP
XP
/ \
ZP XP
/ \
Spec XN
/ \
X
YP
There are no more heads to be filled in an adjoined structure than in a structure with no adjunction
and OB-HD will be satisfied in exactly the same way in both. Inversion will therefore never be
necessary and therefore never be possible, since inversion always violates STAY. Along with Baltin
(1982) and Lasnik and Saito (1992), I take topicalization (perhaps also the wh movement in
exclamatives) to be a case of adjunction. See Cinque (1990), Lasnik and Stowell (1991), Postal
8
(1992), Müller and Sternefeld (1993) for recent discussion of differences between topicalization and
true operator movements. Possibly English exclamatives, which have no inversion, are also adjunction
structures.
In English matrix yes-no questions, inversion is again required.
(5)
a. Will they read some books?
b. *They will read some books?
The most straightforward way to subsume them under the same system is to posit a null operator in
Spec of CP.4 Then matrix yes-no questions are CPs, the C position will be empty without inversion,
and OB-HD will induce inversion. The assumption that yes-no questions involve an operator is crucial
for other aspects of the syntax of interrogatives, such licensing of polarity items, and relativized
minimality effects for interrogative complements (Rizzi 1990).
Empty layers of structure in the system presented here always violate OB-HD without movement
and STAY with movement. The effects of a constraint against empty structure follow without
stipulation.5 It is a consequence of competition among competing candidates that a clause is only
as big as it needs to be. It is an IP unless it has to be a CP. A clause always has the minimal structure
consistent with maximal satisfaction of constraints, under the OT theory of ranking. There is no fixed
structure for clauses; no unique answer to the question of whether they are IPs or CPs. (or indeed
VPs as we will see below). Sometimes they are one, sometimes the other, depending on what wellformedness conditions are relevant.
It should be noted that is crucial that all extended projections which can be formed above the
lexical projection be in the candidate set, otherwise both the optimal CP candidates and the optimal
IP candidates would be grammatical, giving inversion in declaratives alongside a no-inversion
structure, and admitting an interrogative in which neither wh movement of the operator nor inversion
occurred.
3. Do Support
The generalization governing do in English is simple to state but has proved challenging to
formalize: that do is possible only when it is necessary (Chomsky 1957, 1991). The proposal
developed so far in this paper allows us to make precise this conceptualization of do, given two
4
George Broadwell points out (p.c.) that cross-linguistically, overt yes-no question markers
seem to be X-zeros rather than XPs, casting some doubt on analyses which treat yes-no questions and
wh questions in parallel. This suggests exploring an alternative which would more closely resemble
the proposal made in Section 6 for conditionals, where the operator is taken to be a head rather than
a phrase.
5
In Grimshaw (1993,1994), however, “Minimal Projection” was necessary, because STAY was
not among the constraints.
9
assumptions. First, do is a semantically and functionally empty verbal head: this seems to be the
minimal specification we can give to do. As a result every occurrence of do violates FULL-INT.
Second, OB-HD outranks FULL-INT, so do occurs when its presence results in satisfaction of OB-HD.
The distribution of do then follows. The consequences are that do is impossible in (positive) matrix
declaratives and subordinate interrogatives, required in matrix interrogatives with no auxiliary, never
co-occurs with other auxiliaries and never co-occurs with itself.
3.1 Background
Emonds (1978) and Pollock (1989) argue that in French, the main verb raises to I, while in
English the main verb appears to stay in its base position (see these works on English and French,
Vikner (in press) on Scandinavian). If we assume, more or less as they do, that I is the locus of verbal
inflection, then the French system is immediately comprehensible, since V raises to I in order to
combine with its inflection. On the other hand, the English system becomes quite mysterious, since
it is self-evident that the verb and its inflection do in fact combine, yet there is no obvious way for this
result to be achieved. There have been a variety of responses to this problem. Under “checking
theory” English main verbs raise at Logical Form (Chomsky 1993). In “Distributed Morphology”
tense and the verb are merged in Morphological Structure ( Halle and Marantz 1993). Williams
(1994) argues that inflectional features are not syntactic nodes, but part of X-bar projections: under
this view there is no V-to-I movement even in French type languages. Here I will take a position
which maintains features of both the Emonds/Pollock proposal and the alternatives. Suppose that
the difference between the French system and the English system is that in the English system
inflection is morphologically associated with a V, i.e. it is lexically attached to a V head, while in
French it is syntactically projected as head of a projection. (Whether inflection is syntactically
projected as an I, or morphologically attached to V, the result is the same in one respect, namely the
entire verbal extended projection has an inflectional specification, regardless of which (extended) head
of the projection is the source of it). French will then have V to I raising, while English will not. If,
as has been suggested by Pollock (1989), Platzack and Holmberg (1989), there is an important
relationship between the existence of V-to-I movement and richness or “strength” of inflection, this
should now be understood as reflecting a relationship between properties of inflection and existence
as an independently projected head.
For concreteness I will adopt the position that the first auxiliary verb, like a finite main verb, is
morphologically associated with phi features. It is generated in I. I assume a constraint, not further
discussed here, which requires that the finite verb must c-command all other proposition-related heads
in the extended projection. (This automatically penalizes extended projections containing more than
one finite verb, and will in practice rule them out.) CASE and SUBJ constrain the position of the
subject relative to functional heads in the extended projection. Their role will be examined in the
discussion of negation, but in the meantime I consider only candidates which satisfy them both. A
consequence of this analysis is that a tensed clause which contains a main verb and no auxiliaries,
must be a VP in English, since inclusion of an IP in its representation will violate OB-HD with no
10
compensating effects.6 Thus main clauses can be VPs, IPs, or CPs, depending on their contents.
3.2
The Distribution of do
I will first show how the analysis works out for the contrast between matrix declaratives and
matrix interrogatives, starting with declaratives. (I will omit CP representations for declaratives,
always non-optimal for reasons given above.) The first issue, then, is why do does not occur in
declaratives, even when there is no other auxiliary present.
(6)
a. She said that
b. *She did say that
(T4)
Matrix declaratives with and without do
Candidates
L
OP-SPEC
OB-HD
FULL-INT
STAY
[VP DP V that ]
[IP DP do [VP V that ]
[IP
e [VP DP V that ]
*!
*!
(T4) shows how this result is obtained. Note that FULL-INT, which was satisfied in all previous cases,
and was omitted from the tableaux for reasons of presentation, is now crucial. The VP form of the
clause, with no auxiliary verb projection, is the optimal one, since it involves no violations, while the
alternative with do violates FULL-INT, and the final alternative which has an extra projection but no
do, violates OB-HD. Thus auxiliary do cannot occur in a declarative. The crucial difference between
do and other auxiliaries is that they differ from do in having semantic content. They are part of the
input.
In interrogatives, do must appear in the absence of another auxiliary verb.
(7)
a. What did she say?
b. *What she said?
c. *What she did say?
6
Adjunction to X-bar will have to be admitted, to accommodate the position of the adverb in
John usually likes tomatoes, as pointed out to me by Sten Vikner and Marco Haverkort. If
adjunction is regulated only by scope considerations and by PROJ-PRIN, this is not a problematic
conclusion. Haverkort suggests that adjunction to X-bar is needed anyway for examples like the
following: Peter said that Carl, had he been on time, would have caught the train.
11
I simplify by considering only those forms which do not violate OP-SPEC, i.e. where wh movement
has occurred. Further, I will temporarily ignore candidates in which the main verb moves to supply
the otherwise missing head, returning to the analysis of these cases in 3.3.
(T5) Matrix Interrogatives with and without do
Candidates
L
OP-SPEC
OB-HD
[CP wh doi [IP DP ei [VP V t ]]]
[CP wh e [IP DP e [VP V t ]]]
FULL-INT
STAY
*
**
*!*
*
[CP wh e [VP DP V t ]]
*!
*
[CP wh e [IP DP do [VP V t ]]]
*!
*
*
The optimal form is the one in which do occurs and inverts to C: this one violates only FULL-INT
and STAY. Inversion of do satisfies OB-HD, which is violated in all other candidates. The ranking
of OB-HD over FULL-INT is crucial here: the opposite ranking would select as grammatical a form in
which the C is empty and do does not appear.
Thus we obtain the desired result: do is possible only where it is necessary, and it is necessary
when its presence makes a clause obey a constraint that has a higher ranking than FULL-INT, in this
case OB-HD.7 We will see in Section 4 that in subordinate interrogatives, OB-HD is violated anyway,
regardless of whether an auxiliary verb is present or not. As a result there is no possible advantage
to the inclusion of do, so it never occurs in embedded questions.
This analysis explains why do never co-occurs with another auxiliary verb, or with another token
of do. Consider, for example, a matrix interrogative clause in which do and will co-occur. Movement
of will to C satisfied OB-HD. There is no advantage to including do and its projection, which will
always add a FULL-INT violation, hence only the will version is possible.
(8)
a. What will she say?
b. *What will she do say?
c. *What does she will say?
7
Although Roberts (1992: 293-294) notes that do insertion appears to have been freely available
in 16th century English.
12
(T6) Co-occurrence of do and another auxiliary in a matrix interrogative
Candidates
L
OP-SPEC
OB-HD
FULL-INT
STAY
**
[CP wh willi [IPDP ei [VP V t ]]]
[CP wh willi [IPDP ei [XP do [VP V t ]]]]
*!
**
[CP wh doi [IPDP ei [XP will [VP V t ]]]]
*!
**
Main be and do will not occur together. Since be has the capacity to move to C, to satisfy OBHD, the form with do will have no competitive advantage over the one without, and the do-less form
will be optimal.
Obviously, auxiliary do will also not repeat. Clauses containing do will be in competition with
all otherwise equivalent clauses containing more and fewer occurrences of do. Each occurrence of
do yields a violation of FULL-INT and no occurrence of do after the first one can ever improve the
success of the structure with respect to the contstraints. Hence a clause with more than one instance
of do can never be optimal. This is illustrated with respect to a matrix question with no semantic
auxiliary, where one do is possible:
(9)
a. What did she say?
b. *What did she do say?
c. *What did she do do say?
(T7) Multiple occurrences of do (illustrated for matrix interrogative)
Candidates
L
OP-SPEC
[CP wh doi [IPDP ei [VP V t ]]]
[CP wh doi [IPDP ei [XP do [VP V t ]]]]
[CPwh doi [IPDP ei [XP do [XP do [VPV t ]]]]
OB-HD
FULL-INT
STAY
*
**
** !
**
** ! *
**
The fundamental point of analysis for do, then, is that it violates FULL-INT, hence it is present only
when its presence leads to improvement on a higher ranked constraint. Since OB-HD >> FULL-INT,
do will appear in order to satisfy OB-HD.
13
3.3
Constraint Ranking and the Lexicon: Main Verbs and the Availability of do
In 3.2, I set aside a central property of the English inversion system, namely that main verbs do
not invert, do occurring instead.
(10)
a. *What said she?
b. What did she say?
This motivates a constraint which penalizes movement of a lexical head, a the constraint named “NOLEX-MVT". This constraint is ranked above FULL-INT in English, therefore English will violate FULLINT (introducing do) in order to avoid violating NO-LEX-MVT. Thus the full constraint analysis is the
one given in (T8), which compares the winning candidate from (T5) with the alternative involving
movement of a main verb.
(T8)
Inversion of a Main Verb versus presence of do
Candidates
L
OP-SPEC
[CPwh Vi
NO-LEXMVT
[VP DP ei t]]]
OB-HD
FULLINT
STAY
*!
**
*
[CPwh doi [IPDP ei [VPV t]]]
**
In English, then, we know that OB-HD >> FULL-INT, and that NO-LEX-MVT >> FULL-INT. There
is no evidence concerning the ranking of OB-HD with NO-LEX-MVT, since the winning candidate
violates neither. Evidence for ranking NO-LEX-MVT over OB-HD is sketched in footnote 15.
What of a language that allows main verbs to move? Such a language must have both OB-HD and
FULL-INT dominating NO-LEX-MVT. This ranking yields a system which selects filling a head position
with a main verb over leaving the head position unfilled or filling it with a meaningless item. This is
schematized in (T9):
(T9)
Effect of re-ranking on Inversion of a Main Verb versus presence of do
Candidates
L
OP-SPEC
[CPwh Vi
OBHD
FULL-INT
[VP DP ei t ]]
*!
[CPwh doi [IPDP ei [VP V t ]]]
[CPwh e
[VP V t ]]
*!
NOLEXMVT
STAY
*
**
**
*
14
As in the English case, there is no crucial ranking here of OB-HD and FULL-INT — either ranking
will yield the desired result.
The difference between the English ranking and the alternative is farther-reaching than might
appear at first glance: it does not just concern the ability of a main verb to move. A grammar with
NO-LEX-MVT dominated by both FULL-INT and OB-HD will be inconsistent with the use of a
semantically empty verb like English do in inversion. In fact it will be inconsistent with the existence
of such a verb, which can never appear. Thus we can derive a gap in the behavior of verbs from the
ranking of constraints, following the line of reasoning developed by Prince and Smolensky (1993) for
segmental inventories.
Pursuing this point further, we can also ask whether it is an accident that English has a
semantically empty verb, and whether it is an accident that the morpheme do is the one which has this
use. Can we characterize the presence of do in English as a necessary consequence of the constraint
rankings, just as its absence in the grammar of (T9) is a necessity? The starting point for the answer
is that there is only one do in English. When its lexical conceptual structure (lcs) is parsed, do is a
theta-marking and argument-taking predicate. However, its lcs can also be unparsed, and this is the
source of "auxiliary" do.
(11)
a. parsed lcs:
b. unparsed lcs:
do ---- (ACT x)
do --//-- (ACT x)
When its lcs is unparsed, do is a "light verb" in the sense of Grimshaw and Mester (1988), using a
term from Jespersen (1954). Light do will always vacuously satisfy NO-LEX-MVT. Only an element
with an lcs is "lexical", hence only such an element is subject to the constraint. This is why do shows
the characteristic auxiliary-verb property of appearing outside the lexical projection: it patterns with
the semantic auxiliaries and presumably be in having no lcs. Light do will, however, violate FULLINT, since it will have no lcs interpretation associated with it. With parsed lcs, FULL-INT will be
satisfied, so "main" or "heavy" do does not violate FULL-INT. Thus, light do will not be affected by
NO-LEX-MVT, but it will violate FULL-INT, while heavy do will be regulated by NO-LEX-MVT but will
satisfy FULL-INT.
We can now address the question of why do is the morpheme that appears when an auxiliary is
required by OB-HD. This follows if, as seems reasonable enough, do is the lexically simplest verb of
the language, though perhaps it is necessary to appeal to further details of its analysis in order to
explain why do violates FULL-INT minimally, rather than be. At any rate, the reason do appears and
not shout or obfuscate is that the lcs of do is simpler, hence there is less in the lcs of do to be
unparsed. Failure to parse the lcs of do results in a less severe violation of FULL-INT than failure
to parse the lcs of any other verb in the language. This proposal treats FULL-INT as a "gradient
constraint" (see Prince and Smolensky (1993) for theoretical background.)
15
We can also see that it is inevitable that light do exists in the language, given the constraint
rankings. A grammar with the rankings of English will necessarily select the least offensive FULL-INT
violation in order to satisfy OB-HD, and since lcs-unparsing is freely available there will always be (at
least) one candidate which minimally violates FULL-INT and which is therefore optimal.
Finally, one might ask why "auxiliary" do must occur even when the main verb is itself do, if they
are really both the same verb. The answer has already been implicitly given. When do has its lcs
parsed it can theta-mark but cannot occur outside VP. When its lcs is unparsed it can occur outside
VP but cannot theta-mark. Thus (12a) violates NO-LEX-MVT when do has a parsed lcs, but when
it lacks one, the sentence violates the theta criterion, since do cannot theta-mark when its lcs is
unanalyzed. Assuming that the theta criterion dominates FULL-INT (perhaps because the theta
criterion is never dominated), the optimal form will contain both occurrences of do.
(12)
a. *What did he e t?
b. What did he do t?
(T10) shows the interaction of the Theta Criterion and the other constraints in the sentences of (12),
for candidates which satisfy OB-HD and OP-SPEC.
(T10) Co-occurrence of heavy and light do
Candidates
THETACRIT
L
NOLEXMVT
*!
[CPwh doi [IPDP ei [VPdo t]]]
-lcs
+lcs
[CPwh doi [IPDP ei [VPdo t]]]
+lcs
-lcs
FULLINT
*
**
*
**
**
*!
*!
STAY
**
*!
[CPwh doi [IPDP ei [VPdo t]]]
+lcs
+lcs
[CPwh doi [IPDP ei [VPdo t]]]
-lcs
-lcs
OBHD
*!
[CPwh doi [VP he ei t]]
+lcs
[CPwh doi [VP he ei t]]
-lcs
OPSPEC
**
**
**
An advantage of this solution is that it eliminates the positing of ambiguity for do. That is, we
16
no longer have to say that English has a main verb do and an auxiliary verb which is accidentally
identical to a main verb which itself is semantically extremely vague. There is only one G43
verb, recruited to occur in non-lexical positions because it is the least costly choice. Note that
Japanese suru is similarly “ambiguous” between and light and heavy use (Grimshaw and Mester
(1988)), inviting the same analysis. See Ritter and Rosen (1993) for related ideas about have.
In this view, there can not be a language which lexically lacks a semantically empty verb, any
more than there can be a language which lexically lacks an epenthetic vowel. The occurrence of such
items is not regulated by lexical stipulation. Rather, the semantically simplest verb will appear if the
constraint rankings for the language induce its presence, and not otherwise, just as an epenthetic
vowel appears when it is induced by constraint interaction. Thus we can predict both that a language
with the constraint rankings in (T9) will not have a light verb like do and that a language with the
constraint rankings of English will inevitably have such a verb. In this way, we avoid lexical
stipulations concerning the inventory of the language and instead explain the visible lexical items as
the result of constraint interaction. This general point is taken up again in Section 7.
3.4
Do and Wh Subjects
The one matrix interrogative configuration in which do does not appear is one with a wh subject:
(13)
a. Who saw it?
b. *Who did see it?
The key here is the formulation of OP-SPEC, which does not require that a wh operator occur in
Specifier of CP, just that it occur in a Specifier position from which it c-commands the verbal
extended projection. An interrogative clause is therefore not required to be a CP, but just a verbal
projection with a wh specifier. When a wh phrase is a non-subject it will always have to move to a
Spec position. The candidate positions are basically VP, IP, and CP. Both of the first two are filled,
so non-subject wh phrases move to Spec of a higher projection, which we label CP. Specifier of VP
is not a possible position for a wh operator, both because it is (usually) filled and because it does not
give the wh operator the right scope.
The one case where the requirements of an interrogative are met without any movement is when
a wh phrase is a subject. It is already in Specifier position, moreover it is in Specifier of the highest
phrase in the verbal extended projection. Thus an IP (when the clause contains an auxiliary verb) or
a VP (when there is no auxiliary present) will be a perfectly good interrogative, provided it has a wh
phrase as subject.
As a result, a subject wh phrase can never occur in Spec of CP, because the STAY violation
incurred by the movement will not be offset by satisfaction of OP-SPEC: OP-SPEC is already satisfied.
By the same reasoning, a subject wh phrase cannot occur in Spec of IP when there is no auxiliary
present: the wh phrase in this case would be moving from Spec of VP to Spec of IP, violating STAY,
again with no improvement on OP-SPEC. This is why do never occurs in subject wh clauses.
17
(T11) The position of subject wh phrases
Candidates
OPSPEC
L
NOLEXMVT
OBHD
FULLINT
STAY
[VP wh V ... ]
[IP wh e
[VP
t
V ... ]]
[IP wh do [VP t
V ... ]]
*!
*
*!
*
Thus we conclude that subject wh phrases do not move, but remain in Specifier of the verbal head
with phi features, i.e. IP or VP. This Spec position must then count simulataneously as an A-position
(by virtue of its relationship to phi features) and an A-bar position (by virtue of being a scope
position.) The idea that subject wh phrases do not move has been frequently entertained (see Gazdar
(1981), Chung and Mccloskey (1983), Haider (1989) (Rizzi 1991) for example, also Travis (1991)
and Vikner and Schwartz (1991) Cardinaletti (1992) for discussion of similar issues in V2 systems).
It is important to counter an apparently overwhelming argument against the idea that wh questions
with subject wh phrases are just VPs (when they contain no auxiliary verb) or IPs (when an auxiliary
is present): this is based on selection. The very same predicates select questions with wh subjects
as those with other wh phrases, despite the VP versus IP versus CP distrinctions. How is it possible
to explain why they are selected by the same predicates? This problem does not arise under the
“Type-Category” theory of selection (Grimshaw 1991), in which any verbal projection with a [+wh]
Spec is equivalent to any other verbal projection with the same property, as far as selection is
concerned, hence it necessarily follows that a verb which takes an interrogative will combine with all
of these sub-cases. See Section 8 for further discussion.
Interrogatives with wh subjects show that OP-SPEC does not stipulate which specifier must house
the wh phrase. This suggests the desirability of a theory in which reference cannot be made to such
information, in which notions like “CP”, “IP” really have no status. I return to this issue in Section
9 below..
3.5 Negative clauses with do
The appearance of negation, like inversion, induces the presence of do. The core idea is that when
not is present, so is a higher projection which is absent from clause structure when not is absent. The
head of this projection is an auxiliary verb, or do if no other auxiliary verb is present in the clause.
In this way, the analysis of do with negation resembles the analysis for interrogatives, although, as
18
we will see, the situations are not identical.8
What requires the presence of the extra projection? There are a number of possibilities which
could be explored. One would be to appeal to the idea that there is a requirement that T must ccommand negation (Laka 1990). This will force the generation of a tensed verb above not, and when
the only contentful verb is a main verb, it will force the generation of tensed do as a head ccommanding negation. A second line might try to use the idea that not is a clitic and a verb must
appear to its left to support it. This is essentially the proposal of Roberts (1992), and as he argues,
it explains why in English do support came in at the same time as n't. However, while a clitic analysis
may be correct for n't (though see Williams 1994 for arguments that n't is lexically attached) it is less
clearly so for not, which occurs unsupported elsewhere, e.g. in subjunctives, as we will see below.
The proposal I will make here attempts to use independently necessary constraints on subjects
and verbs to achieve the desired result. The first constraint, SUBJECT, corresponds essentially to the
“Extended Projection Principle” (Chomsky, 1981), and says that a clause must have a subject. There
are two alternative formulations, either of which will be satisfactory here. The constraint may require
that the highest A Specifier in a clause must be filled. Alternatively it could require that the Specifier
of the highest I-related head be filled, where “I-related” includes V, T, Agr, Neg etc., i.e. the cluster
of heads that are neither lexical nor type-affecting. Which of these formulations is to be preferred
depends on the theoretical development of the notions “A-position” and “I-related”, and I will not
address these questions here. The second constraint, CASE requires that the head of a DP A-chain be
in a case position.
CASE must dominate STAY, because the Spec of VP raises to Spec of IP to get case, when I contains
the case assigner.
In the positive declaratives in (14), both constraints are satisfied: by a DP in Spec of VP in (14a)
and by a DP in Specifier of IP in (14b), where movement has occurred from Specifier of VP to the
higher Specifier position, assuming the VP-internal subject hypothesis (see references in Section 2).
The reason is that the DPs in (14a) and (14b) are both in case positions, and the clause has a subject.
Inclusion of a do could not improve the fate of the examples on these or any other constraints, hence
the FULL-INT violation that accompanies the presence of do is fatal.
(14)
a. John left
b. John will t leave
When a negative is present, however, the situation is disrupted. If not is a head, as in Roberts
(1992) Williams (1994), it is not one that has case-assigning ability, so if the subject is in its Spec,
CASE is violated, (15a). On the other hand, if the DP is in Spec of VP CASE is obeyed but SUBJ is
8
I will not discuss emphatic do here. It may well be that the general line of explanation will
support the proposal of Laka (1990) that emphasis induces a projection, although this projection, like
all the others discussed, will not be present in all clauses.
19
violated (assuming that the Specifier position is an A position, see Haegeman 1992 for an alternative
perspective): (15b).
(15)
a. *John not t left
b. *Not John left
Even if not is a specifier as in Zanuttini (1991), the conflict still arises, since the subject in (15a)
must then be in a higher specifier, and again CASE is violated. (A more precise formulation of SUBJ
must be given to eliminate the possibility that not, if it is a specifier, might satisfy the constraint.
Perhaps SUBJ should require that an argument fill the position.) In contrast, the presence of an
intervening adjoined element such as rarely, in John rarely left on time, does not affect satisfaction
of the two constraints. Thus they conflict when negation is present but not otherwise.
The candidate in (16) does succeed in satisfying both SUBJ and CASE but at the cost of a NO-LEXMVT violation, since the main verb is heading a non-lexical projection. The ranking of NO-LEX-MVT
higher than FULL-INT, established above for inversion in interrogatives, thus properly eliminates this
candidate.
(16)
*John left not
The conflict caused by not is resolved by the presence of do, which is both the feature-carrying
head and the highest I-related head. in (17), just as will is in (14b). Provided that FULL-INT be
dominated by both SUBJ and CASE, this candidate will be the optimal one.
(17)
John did not leave
Tableau (T12) displays the analysis for a clause with no meaningful auxiliary. (I have omitted OPSPEC from the table since it is vacuously satisfied by all candidates.) The tableau assumes that not
is a head, although as previously mentioned, this is not crucial. It also assumes that subjects are
generated VP-internally, and shows the resulting traces, factored out elsewhere in the paper.
(T12) Presence of do with negation
Candidates
HDLFT
NOLEXMVT
OBHD
SUBJ
STAY
HDRT
*
*!
[not [VP John left]]
*
[John did[t not[VPt leave]]]
[John left[t not[VP t t]]]
FULLINT
*!
[John not [VP t left]]
L
CASE
*!
**
***
20
One instance of do makes it possible to satisfy both subject-related constraints: additional
instances would constitute fatal violations of FULL-INT, although I do not show these cases. Note
the crucial ranking of FULL-INT with both SUBJ and CASE. If FULL-INT dominated both constraints,
then (17) could not be the optimal candidate: instead either (15a) or (15b) would be, depending on
the ranking between SUBJ and CASE. If FULL-INT dominated just SUBJ, and was dominated by CASE
then (15b) would be optimal, since it satisfies CASE and FULL-INT, violating only SUBJ. If FULL-INT
dominated just CASE and was dominated by SUBJ, then (15a) would be optimal, since it satisfies SUBJ
and not CASE. The ranking between NO-LEX-MVT and FULL-INT is crucial also, as pointed out
already.
In the account given here, (15a) is ungrammatical because there is an alternative which better
satisfies the constraints, given the constraint rankings for English. This solution contrasts with recent
proposals by Pollock (1989) and Chomsky (1991), discussed in Baker (1991), which make appeal
to hidden elements or movements. Pollock's account posits a null counterpart of do, which must
move to T, the movement being blocked when not is present. In a similar vein, Chomsky (1991) uses
obligatory LF V-to-I raising, triggered by an invisible affixal Q morpheme, but blocked by not.
The full system of constraints predicts the distribution of do in a negative question: here do will
appear even when the wh phrase is in subject position, for the same reason it appears in negative
declaratives, and only one do is permitted.
(18)
a. Who did not leave?
b. *Who not left?
Construction of the tableaux for these examples is left for the reader.
Where a clause does not contain a head bearing phi-features, what happens? One answer is that
CASE is simply violated in this configuration. This is the case in infinitives, and in subjunctives (thanks
to Hubert Haider for drawing the importance of subjunctives to my attention.). 9 The conflict which
gives rise to do support in (17) is not in effect, with the result illustrated for subjunctives in (19).
(19)
a.
b.
c.
d.
I insist that John not leave
*I insist that not John left
*I insist that John do not leave
*I insist that John leave not
The extended projection in (19a), with the subject in specifier of not, satisfies SUBJ and violates
CASE. Since none of the other candidates can satisfy CASE either, and they all do worse on one of
the other relevant constraints, (19a) is optimal, and the variant which includes do is impossible.
9
Imperatives, such as Do not collect $200, do not pattern with subjunctives and infinitives,
although the reason is unclear, having to do perhaps with their morphology (note that they take finite
tag questions) or the missing subject.
21
(Calculations of STAY violations are omitted, as they involve many irrelevant commitments and have
no effect on the outcome here).
(T13) Negation in Subjunctives
Candidates
L
NOLEXMVT
OBHD
SUBJ
[IP John [ not [ VP t leave]]]
[ not [ VP John leave]]
*!
*!
FULLINT
STAY
*
?
*
?
*
[IP John do [t not [ VP t leave]]]
[IP John leave [ t not [ VP t t]]]
CASE
*
*!
?
?
Thus the different behavior of not in tensed and non-tensed clauses follows from the effect of
CASE. Nothing need be said about not itself. This contrasts with the analysis given in Williams
(1994), where not has an arbitrary [-tense]specification, and hence does not take a tensed
complement.
Under this proposal, the nature of the problem solved by “do-support” with negation is a little
different from the nature of the problem solved by “do-support” in interrogatives. In the interrogative
case the role of do is to fill an otherwise empty head. In the negation case the role of do is to provide
a head that is structurally higher than not and that agrees with the subject. More precisely, the
occurrence of do in inversion depends on the ranking of OB-HD with FULL-INT The occurrence of
do with negation depends on the ranking of CASE and SUBJECT with FULL-INT. But both depend on
the ranking of NO-LEX-MVT and FULL-INT. This partial separation receives indirect support from the
historical development of English: Roberts (1992) notes that the use of do in interrogatives pre-dates
its occurrence in negatives. This suggests that the two instances of do support represent the same
solution to a different problem, rather than the same solution to the same problem. In any case, it is
no accident that the same verb appears in both circumstances: the verb which minimally violates
FULL-INT.
In this analysis, then, the finite auxiliary is freely generated in any head of the verbal extended
projection, its distribution being reined in by a set of constraints. One consequence of this is that
there is no movement involved even in cases where the auxiliary precedes not. This will later turn
out to be important: we will see that the effects of raising an auxiliary verb to C are detectable from
their interaction with the Projection Principle, while there is no such effect of the presence of
negation.
22
4. Adjunction, Heads and the Projection Principle
As is well known, inversion in interrogatives is limited to main clauses in most varieties of English
(see McCloskey (1992) for analysis of inversion in Hibernian English).
(20)
a. They found out when they should take the train.
b. *They found out when should they take the train.
This simple fact is initially quite puzzling. OP-SPEC will be violated unless the wh phrase is in a
Specifier in subordinate interrogatives, for just the same reason as in matrix interrogatives. Yet if this
is correct, OB-HD is regularly violated in examples like (20a). I will develop a solution here which
attributes the difference between matrix and subordinate interrogatives to a constraint which conflicts
with OB-HD, namely the Projection Principle. The constraints conflict because filling the head
position by head movement would violate PROJ-PRIN. Since PROJ-PRIN is the dominant constraint,
and since it is not possible to satisfy both PROJ-PRIN and OB-HD, the structure with no head is wellformed. It is the optimal structure, hence it is the only one possible.
4.1
The Projection Principle
PROJ-PRIN is loosely related to the principle proposed in Chomsky (1981), and it prohibits
movement into the head of, and adjunction to, a subordinate clause. This is a development of two
proposals in the literature. The first (Rizzi and Roberts (1989)), is that the root nature of certain head
movements (see Emonds 1975) follows from the PROJ-PRIN. In particular, they propose that head
movement which is direct substitution is disallowed in selected contexts. The second is the argument
in McCloskey (1992) that configurations in which inversion is ruled out seem to be systematically
related to configurations in which adjunction is disallowed, the correspondence being particularly
clear in the case of arguments. The constraint proposed by McCloskey, based on that in Chomsky
(1986), states: “Adjunction to a phrase which is s-selected by a lexical head is ungrammatical”. The
contrast in (21) from McCloskey (1992), illustrates the motivation for the constraint.
(21)
a. It's probable that in general/most of the time he understands what is going on.
b. *It's probable in general/most of the time that he understands what is going on.
(Note also the observation made by Rochemont (1989) that topicalization adjoins to either CP or IP
in a matrix, but only to IP in a subordinate clause).
Both of these proposed constraints make specific reference to selection. That is, they take the
primary cut to be between selected arguments on the one hand, and matrix and adjunct clauses on the
other (see also Kayne (1982, 1983) and den Besten (1983)). This split was followed in Grimshaw
(1993): there is, however, a problem with such a formulation. McCloskey shows that adjunction to
relative clauses and adjuncts is not possible, despite the fact that they are not selected.
23
(22)
a. *The people [when you get home [who want to talk to you right away]]...
b. *I graduated [while at college [ without having really learned anything]]]
This suggests that all non-root clauses, including adjuncts and relatives, are subject to the constraint.
Moreover, as Rizzi and Roberts (1989) note, inversion is impossible in relative clauses, again
suggesting that all subordinate clauses behave in the same way. This is supported by patterns such
as the one displayed by temporal adjuncts and relative clauses introduced by when, illustrated in (23).
They behave exactly like the corresponding indirect questions in (23c), and unlike the corresponding
matrix questions in (23d).
(23)
a.
b.
c.
d.
I left when he had / *had he arrived.
The day when he had / *had he arrived ...
I found out when he had /*had he arrived
When *he had / had he arrived?
If PROJ-PRIN governs adjuncts, then both patterns have the same explanation: in each case
inversion violates PROJ-PRIN and since there is no movement-inducing constraint dominating PROJPRIN the optimal forms violate OB-HD.
Examples with whether, raised by Armin Mester and Andrew Radford (p.c.), make the same
point: there is no inversion in adjuncts introduced by whether, just like complements.
(24)
a. Whether we can agree or not, we have to make a decision
b. *Whether can we agree or not, we have to make a decision
The most attractive position is obviously that the absence of inversion in all of these cases reflects a
general property of adjuncts and complements, i.e. of subordinate clauses. Hence the Projection
Principle must ban adjunction and head movement for all subordinate clauses:
PROJ-PRIN:
clause.
No adjunction to subordinate clauses and no movement into the head of a subordinate
I will leave a number of questions open here, including whether head movement is substitution (as
Rizzi and Roberts argue for I to C movement) or adjunction. Also, it is possible that PROJ-PRIN is
properly understood as two constraints, one on adjunction and one on head movement, which both
regulate subordinate clauses. In this case, it will be possible to rank them separately. For English I
see no reason to separate them so I will continue to treat PROJ-PRIN as a single constraint.
It is important that the definition of “clause” and “head of a clause” be made clear, as it will be
crucial to understanding the effect of PROJ-PRIN. A clause is the highest extended projection of V:
i.e. if a CP is the top projection then it is the clause, and other extended projections of V, such as XP,
IP, etc. are not clauses, merely parts thereof. The head of a clause is the head of the highest extended
projection of V. i.e. C is the head if CP is highest, I is the head if there is no CP etc. PROJ-PRIN is
24
therefore violated by head movement or adjunction to complements to lexical heads, relative clauses
and adjuncts, which are all subordinate clauses. It is not violated by head movement in matrix
clauses, or in complements to functional heads, since the former are not subordinate and the latter
are not clauses.
This leaves us with one problem: if adjunction and head movement are both ruled out in adjuncts,
including relative clauses, why is head movement admitted in one kind of adjunct clause, namely
conditionals, even though adjunction is disallowed?
(25)
a. Had I learned anything at college, I would be better off now
b. *While at college had I learned anything, I would be better off now.
This is a dilemma which can be disssolved under OT, if PROJ-PRIN is violated in (25a) due to
the effects of a higher ranked constraint. I will suggest in Section 6 that this is in fact the case here.
4.2
The Projection Principle and Subordinate Interrogatives
PROJ-PRIN is irrelevant for matrix clauses, clearly, in fact it is vacuously satisfied in all of the cases
we have looked at so far, and has been omitted from tableaux. Subordinate interrogatives are
subordinate clauses in the sense relevant for PROJ-PRIN however. Let us for now consider just the
CP version of the complement, with wh-fronting. These candidates satisfy OP-SPEC at the cost of
a STAY violation. With respect to the other constraints there are two possibilities: if inversion occurs
OB-HD will be satisfied (at the cost of a STAY violation), but PROJ-PRIN will be violated, since the
CP is a subordinate clause. If inversion does not occur, OB-HD will be violated but PROJ-PRIN will
be satisfied (and STAY will be violated only once, although this is not relevant to determining the
optimal form). Since the clause with no inversion is the grammatical one, we conclude that PROJPRIN>> OB-HD, hence the optimal form is one which satisfies PROJ-PRIN and OP-SPEC but violates
OB-HD. Ranking PROJ-PRIN above OB-HD thus derives the absence of inversion in subordinate
interrogatives.
(T14) Subordinate Interrogatives — Candidates which Satisfy OP-SPEC
Candidates
L
PROJPRIN
[CP wh willi [ IP DP ei [VP V t ]]]
OPSPEC
CAS
E
*!
SUBJ
FULLINT
STAY
**
*
[CP wh e [IP DP will [VP t V t ]]]
[IP wh will [VPDP V t ]]]
OBHD
*
*!
There is another candidate to be ruled out here, included as the final candidate in (T14). Here the
auxiliary has been generated in C, rather than moving there. Since this strategy evades a violation
25
of PROJ-PRIN, the candidate threatens the success of the actual winner. However, CASE is violated
here, since the DP is never in a Specifier-head relationship with the case-marking head. Thus the final
candidate is eliminated. (Having made this point, I will set such candidates aside, and omit SUBJ and
CASE from tableaux in this section, along with NO-LEX-MVT.) If the picture is as in (T14), then
CASE >> OB-HD. If SUBJ is also violated by the final candidate, which depends on the exact
formulation of the constraint, we can conclude only that at least one of CASE and SUBJ dominates
OB-HD.
Some alternative candidates which do not satisfy OP-SPEC, are illustrated in (T15). A CP with no
wh movement but with inversion violates PROJ-PRIN, which guarantees that it will lose to one of the
other candidates. A CP with no wh movement and no inversion, the last candidate in (T15), satisfies
PROJ-PRIN, but at the cost of a violation of OP-SPEC and OB-HD, so it too compares unfavorably with
the optimal candidate in the first row. An informative comparison is between the optimal candidate
and the last candidate: an IP with no wh movement. This extended projection necessarily violates
OP-SPEC, but it satisfies all the other constraints. Comparison of this alternative with the grammatical
one reveals that there is a crucial ranking between OP-SPEC and OB-HD: OP-SPEC >> OB-HD.
Otherwise the IP structure would be optimal, since it violates OP-SPEC but not OB-HD.
(T15) Candidates which violate OP-SPEC, compared to the optimal candidate
Candidates
L
PROJPRIN
OP-SPEC
*
[CPwh e [IPDP will [VPV t ]]]
[CP willi [IP DP ei [VPV wh ]]]
OB-HD
*!
*
[IP DP will [VPV wh ]]
*!
[CP e [IPDP will [VPV wh ]]]
*!
FULLINT
STAY
*
*
*
We can now see why do never occurs in subordinate interrogatives, given the analysis of do from
Section 3.
(26)
a. *I don't know what did she say
b. *I don't know what she did say
c. I don't know what she said
When no auxiliary verb is present and there is no inversion, a subordinate interrogative violates OBHD because the head of the projection housing the wh operator is empty. Including a do adds a FULLINT violation, and if it inverts, a PROJ-PRIN violation. Since OB-HD is violated in subordinate
interrogatives in all candidates which respect PROJ-PRIN, and since the only virtue of do is that it can
satisfy OB-HD, the presence of a do can only add violations in this situation, it can never reduce them.
Hence there can be no advantage to the presence of do, hence it is impossible.
26
(T16) Subordinate Interrogatives with do
Candidates
L
PROJPRIN
[CP wh doi [IP DP ei [VP V t ]]]
OP-SPEC
OB-HD
*!
[CP wh e [VP DP V t ]]
*
[CP wh e [IP DP do [VPV t ]]]
*
FULL-INT
STAY
*
**
*
*!
*
If if and whether are heads, then both inversion and the appearance of do will be ruled out in
subordinate yes-no questions because the C position will be filled by a complementizer. As a result,
the if and whether forms satisfy OB-HD, without inversion or the appearance of do. (T17) shows the
optimal candidate, for an input which includes an auxiliary, under this analysis. Since inversion
violates STAY, and do violates FULL-INT, it is clear that the inverted and do forms will always lose
to the optimal ones.
(T17) Subordinate Interrogatives with if and whether as heads
Candidates
L
(27)
PROJPRIN
OPSPEC
OB-HD
FULLINT
STAY
[CP if/whether [ IPDP will [VPV ]]]
a.
b.
c.
d.
They asked if/whether he will leave
*They asked if/whether will he leave
They asked if/whether he left
*They asked if/whether he did leave
If, on the other hand, whether is a Spec of CP, as in Kayne (1991), then the analysis for clauses
with whether is essentially the same as for other subordinate interrogatives, except for the fact that
no wh movement is involved, hence there is no STAY violation. Comparison of the optimal if clause
in (T17) and the optimal whether clause in the Specifier analysis, shows that under Kayne's proposal,
the two clause types must not be competitors. If they were then the if variant would be optimal
because it does not violate OB-HD. Perhaps the structural difference between them is sufficient to
guarantee that their LFs are distinct. However, another reason why they might not be in the same
27
candidate set is the analysis suggested by Donca Steriade (p.c.) in which the if clause is really a kind
of adjunct. This is supported by Steriade's observation that if clauses occur naturally only with verbs
that allow Null Complement Anaphora in the sense of Grimshaw (1979).
So far, then, three questions concerning inversion patterns have been addressed: why there is no
inversion in matrix declaratives, why there is inversion in matrix interrogatives and why matrix and
subordinate interrogative clauses should show different inversion patterns. OB-HD makes heads
obligatory except where PROJ-PRIN makes them impossible. The fact that an empty head is possible
is a side effect of the fact that movement is not. Since PROJ-PRIN does not affect matrix clauses, a
head can be empty only in a complement clause or adjunct.
4.3
Some Alternatives
The essential property of the solution for inversion is that each component principle is fully
general: none of the principles is specific to interrogatives or to inversion, for example. In fact, there
is no theory of inversion; it is just the result of OB-HD, whose effects are seen whenever the effects
of PROJ-PRIN do not obscure them. Conflict between general constraints, and the resolution of the
conflict, lies behind the observed patterns.
Consider an alternative to the constraint conflict proposal for the absence of preposing in
subordinate interrogatives, namely that there is a null C, or a C filled by +wh, in subordinate
interrogatives. Now it is necessary to distinguish in a principled way between the empty head in this
case, which by hypothesis would be filled by a null element, and other empty heads, such as the one
in a matrix interrogative (and others to be analyzed in Section 8), which cannot be filled by a null
complementizer. One might take the position that only selected heads can be null. This would run
into serious empirical problems with adjuncts (see 4.1), and that complements (see 8). But it is worth
dwelling on, because it has a revealing property — it builds into the principle governing empty heads
the effects of PROJ-PRIN. The real situation is that heads can be empty in exactly the case where
PROJ-PRIN will not allow them to be filled. This is a direct consequence of constraint conflict but not
of an empty heads solution, where elaboration of the constraint, reflecting the effects of conflict must
be stipulated. A useful example is the following:
Specifier Licensing Condition (Plunkett 1990, 128)
If a maximal projection is in a non-subcategorized position, its specifier may not be filled at sstructure unless its head position has also been filled by that time.
This is an accurate description of the empirical situation, setting aside adjuncts; examining each part
of the condition reveals that it states the effects of the interaction of the relevant constraints. The
highly specific character of this kind of solution entails that it cannot extend over the range of cases
which follow from the optimality theoretic proposal: this will be particularly clear in the case of
obligatory complementizers (Section 8). It is, moreover, inevitably language particular (see Sect. 7.)
Similar points hold with respect to the recent proposal for inversion developed by Rizzi (1991)
Haegeman (1992). It uses the idea that a head with a certain feature has to raise in cases of inversion
28
in order to participate in a Specifier-head relationship with a Specifier with the same feature, to meet
a well-formedness principle, called the “Wh Criterion” for the interrogative cases. An important
insight in this work, that the relationship of Specifiers and heads lies behind inversion, is incorporated
into the present proposal.
A large part of the work done by the Wh Criterion and related principles results from an auxiliary
hypothesis concerning the initial distribution of the feature: it is on I at d-structure in matrix clauses,
hence it must raise to C, and it is on C already in subordinate clauses, hence inversion is not required
to put it in the right relationship to Specifier of CP. What is the result of constraint interaction in the
present paper is instead the result of feature distribution in the Rizzi-Haegeman approach. From the
present perspective, the Wh Criterion, like the Specifier Licencing Condition, builds in the effects of
the independently existing PROJ-PRIN. As expected, this is accompanied by significant loss of
generality. For example, there is no relationship between the explanation for inversion and the
explanation for the obligatoriness of that discussed below: inversion is necessary to get features in
the right place, that is necessary for reasons of selection, as touched on in Section 8.
The constraint conflict proposal is more general in another way — it predicts that any constraint
that forces an element to occur in Spec will have the effect of inducing inversion. English offers a
number of other instances of inversion, some of which do not seem to be insightfully subsumed under
a feature-based account. These inversions have the same distribution as inversion with negative
preposing, analyzed in Section 5.)
(28)
a.
b.
c.
d.
So wealthy will he become that ....
I will be rich and so will you
I don't like coffee and neither/nor does Bill.
Only under these circumstances will you be able to win
Although of course possible, it seems unlikely that there is a criterion governing Specifier- head
relationships for all of these, with a feature generated on I in matrix clauses and so forth. Rather it
sems that the patterns in (28) reflect the existence of a set of expressions which must occur in
specifier. Inversion is simply a structural consequence.
Maximally general principles will inevitably conflict. The alternative is to formulate more specific
principles which are designed never to conflict, and one price is generality. Only by allowing
constraints to conflict can we avoid building the effects of every principle into all of the others that
it potentially conflicts with. There is another price — universality. These points will be developed
further in Section 7.
5. Inversion inside a subordinate clause
We know from the previous discussion that OB-HD will induce inversion wherever PROJ-PRIN
does not prevent it. PROJ-PRIN is violated by inversion into the highest head of a subordinate clause,
i.e. into the highest head of the extended projection of V. It is not violated, however, by inversion
29
into other heads of the extended projection, since the projections they head are not subordinate
clauses, merely parts of clauses. So inversion should be possible when the relevant head is not the
highest head of a subordinate clause, but is contained within the extended projection. This prediction
is verified by the pattern of inversion which accompanies preposing of a negative.
If a negative operator is preposed, inversion is required, as illustrated in (29) (Klima 1964,
Liberman 1974).
(29)
a. Never/under no circumstances will she work this hard again
b. *Never/under no circumstances she will work this hard again
In the absence of preposing, inversion is not allowed (see (30)).
(30)
a. She will never work this hard
b. *Will she never work this hard
This paradigm follows the same pattern as interrogatives: negative operators occur in specifier
position, so a projection is present when preposing occurs, which is otherwise absent. The head of
this projection is empty. Hence head movement must occur to fill the head, and inversion follows.
What is the projection that is present when negative preposing occurs? It cannot be CP since the
negative element follows C in subordinate clauses, where the entire paradigm can be replicated.
(31)
a.
b.
c.
d.
She said that never/under no circumstances would she work this hard again
*She said that never/under no circumstances she would work this hard again
She said that she would never work this hard again
*She said that would she never work this hard again
Nor can the projection be IP since the Specifier of IP is already filled by the subject. Thus the
projection must be a further member of the verbal extended projection, which intervenes between IP
and CP, and which I label “XP”. The relative position of the wh phrase and preposed negative follows
from the nature of the operator. Wh movement is type changing, so the wh operator must be outside
everything pertaining to the propositional structure. Negative preposing on the other hand, is a
variety of sentential negation, hence the operator c-commands IP but is c-commanded by C. There
is no need, then, to stipulate which specifier position each operator appears in.10
10
Teun Hoekstra (p.c.) raises the question of whether wh movement and negative preposing can
co-occur, in examples like Which book will never in her life Mary read? Such an example
presumably violates Relativized Minimality (Rizzi 1990), and certainly has a highly marginal status.
It seems clear, however, that this version comes closest to well-formedness -- positioning the auxiliary
after the preposed negative seems worse, which is what the proposal made here would predict.
Exchanging the position of the wh phrase and negative gives clear ungrammaticality.
30
The constraints discussed so far dictate that the structure of the matrix clause in (29a) is (32a),
with no CP present just as for matrix declaratives, while the structure of the complement in (31a) is
(32b).
(32)
a. [XP never/under no circumstances... X [IPDP I [VP V ..... t ]]]
b. [CP C [XP never/under no circumstances... X [IPDP I [VP V .... t ]]]
The discussion here is simplified by proceeding as if all negative phrase operators are subject to
OP-SPEC, and therefore must move. This is approximately true for some, presumably those for which
sentential scope is the only possibility:
(33)
a. *She would do this under no circumstances
b. Under no circumstances would she do this
It is not true for, e.g. never. I assume that this reflects the fact that such negatives can take scope
from more than one position, unlike wh phrases.
(T18) shows how inversion is induced by preposing. If the matrix is an IP then there will be no
possibility for preposing in the first place, so OP-SPEC will be violated. If the matrix is an XP there
are four possibilities: no inversion and no preposing will violate OB-HD and OP-SPEC, preposing
without inversion will violate OB-HD, inversion without preposing will violate OP-SPEC but preposing
with inversion will violate nothing other than STAY twice. Since STAY is ranked below both OP-SPEC
and OB-HD, as we already know, the form which violates only STAY is optimal, and grammatical.
In this way the constraints and rankings predict that preposing “induces” inversion, and inversion is
impossible without preposing.
(T18) Negative-Induced Inversion in Matrix Clauses: “u. no circs.” stands for “under no
circumstances”.
Candidates
PROJPRIN
OPSPEC
[IP DP will [VP V u. no circs.]]
*!
[XP e [IP DP will [VP V u. no circs.]]]
*!
[XP u. no circs. e [IP DP will [VP V t ]]]
L[
XP
FULLINT
STAY
*
*!
*
**
u. no circs. willi [IP DP ei [VP V t ]]]
[XP willi [IP DP ei [VP V u. no circs. ]]]
OB-HD
*!
*
31
(31) shows that inversion with negative preposing does not show a matrix-subordinate contrast,
unlike inversion in interrogatives. The reason is that PROJ-PRIN is vacuously satisfied by inversion
into X — the XP is not a subordinate clause, hence inversion into it does not violate PROJ-PRIN. The
subordinate clause in (31a) is the CP headed by that, and not the XP it contains, since CP is the
highest extended projection of V. (We will consider what happens when CP is absent in Section 8).
PROJ-PRIN is violated by head movement into the complement of a lexical head, but not by movement
into the complement of a functional head; this follows from the fact that complements to functional
heads are not clauses. XP embedded within a CP is like a matrix clause in the critical respect, and
both therefore require inversion.
(T19) Negative-Induced Inversion in Subordinate Clauses
Candidates
PROJPRIN
OPSPEC
[CP that [IP DP will [VP V u. no circs.]]]
*!
[CP that [XP e [IP DP will [VP V u. no circs.]]]]
*!
[CP that [XP u. no circs. e [IP DP will [VP V t ]]]]
L[
CP
FULLINT
STAY
*
*!
*
**
that [XP u. no circs. will [IP DP t [VP V t ]]]]
[CP that [XP willi [IP DP ei [VP V u. no circs]]]]
OB-HD
*!
*
The pattern of constraint violation and satisfaction is exactly the same for negative preposing in
a subordinate clause and in a root clause, as can be seen from a comparison of (T19) with (T18). By
the same token, (vacuous) satisfaction of PROJ-PRIN admits adjunction to VP (when dominated by
IP) and to IP (when dominated by CP), as can be seen in (34), based on McCloskey (1992), where
this observation is made.
(34)
They announced that at Christmas time [IPthe president has generally [ VP gone to visit his
mother.
The structure assigned to a clause is entirely determined by the constraints. Hence it is not
possible for the analyst to simply declare that some clause has a certain structure, without providing
the system of constraints which will guarantee that desired result. In the present case, the constraints
require that the XP intermediate verbal projection be omitted except when preposing to specifier
occurs. An XP which is empty will always violate OB-HD, and moving an auxiliary to fill the head
X position will always violate STAY. Thus the optimal representation of a clause will not include XP
unless some constraint higher ranked than STAY can be satisfied by virtue of its presence. As a
consequence the structure of a subordinate clause with a preposed negative is different from that of
one with no preposing. One has the extra XP and the other does not. This is crucial to inversion of
course: when the XP is present inversion must occur, when the XP is absent inversion is not possible.
32
6. Inversion in Adjuncts
PROJ-PRIN is violated by inversion into the head of any subordinate clause, including an adjunct.
Hence adjuncts systematically resist XP adjunction and inversion. However, as mentioned above,
inversion is possible in conditional adjuncts, which show the pattern illustrated in (35)-(36), recently
discussed in Rizzi and Roberts (1989), Pesetsky (1989) Iatridou and Embick (1994).
(35)
a. Had I been on time I would have caught the train
b. Were he to be asked, he would probably say no.
c. Should it ever happen, you will be sorry.
(36)
a. *I had been on time I would have caught the train
b. *He were to be asked, he would probably say no.
c. *It should ever happen, you will be sorry.
These examples show that a higher-ranked constraint can force violation of PROJ-PRIN . In support
of this, note that I-to-C movement is possible in complements sometimes also, just not in English. In
the Aux-to-Comp process studied in Rizzi (1983, Ch3) and Raposo (1987), an Aux raises to C to
assign case to an otherwise caseless subject. Here too PROJ-PRIN is overriden by another constraint,
in a complement this time.
There are two properties of conditionals which might form the basis of the relevant constraint:
the morphology of the auxiliary, and its semantics. Unlike conditionals introduced by if, which allow
any auxiliary to occur, inverted conditionals admit only certain auxiliaries, those in (35), which never
occur in this form in declaratives (apart from should in its root meaning). 11 These auxiliaries, then,
are not independent elements, but require a connection to the consequent. COND requires a
dependent head to c-command the extended projection containing it. The idea is that the inverting
auxiliaries are semantic and/or morphological dependents, and thus must be locally accessible to the
supporting main clause, and hence at the top of the extended projection that houses them. COND
dominates PROJ-PRIN, hence the inverted form is grammatical despite the violation of PROJ-PRIN.
11
The auxiliary do will never invert, because it has neither the semantics nor the morphology
to be subject to COND. Pesetsky (1989) attributes absence of inversion with do to its language
particular status, which makes it inert. In the present system, do is most certainly not inert, since OBHD induces movement of do just like any other auxiliary. Indeed, do is not so very language
particular either, given the proposal in 3.3.
33
(T20) Inversion in Conditional Adjuncts
Candidates
L
COND
[IP DP had [VP V ..]]
[XP hadi [IP DP ei [VP V ..]]]
PROJPRIN
OPSPEC
OBHD
FULLINT
STAY
*!
*
*
In a clause introduced by the conditional element if, there is no dependency of relevance to
COND, and the constraint is satisfied without inversion. In this circumstance, inversion will violate
STAY and will not improve the status of the extended projection on any other constraint, hence the
ungrammaticality of examples like (37b).
(37)
a. If he were to be asked, he would probably say no
b. *If were he to be asked, he would probably say no.
It is important that the if and inversion conditionals are not in the same candidate set, given that
inversion conditionals always violate PROJ-PRIN and STAY where if conditionals respect both. The
inversion conditionals would always be ungrammatical if the two were in competition. The issue
arises in any theory with an economy constraints; see Iatridou and Embeck (1994) for discussion of
this point, and an argument that the two kinds of conditional have different logical forms.
In sum, inversion in conditionals is not motivated by OB-HD, which is dominated by PROJ-PRIN
in English, but by a constraint which dominates PROJ-PRIN. Hence conditionals provide the lone
example of inversion in a subordinate clause in the language.
7. Typological Consequences of Constraint Ranking
As Prince and Smolensky (1993) show, positing universal constraints subject to ranking by
individual grammars offers a theory of language typology, with often striking predictions. (See also
Legendre et al (1993) for an optimality theoretic typology of case systems.) It is not strictlyspeaking possible to determine the effects of alternative rankings of constraints without knowing what
all the constraints of UG are. Nonetheless it is revealing to examine re-rankings, and in particular
to compare re-ranking with parametric accounts of variation.
As an illustration of the systems generated with alternative rankings, consider the interaction of
STAY with OB-HD and OP-SPEC, crucially ranked in English with STAY dominated by both OB-HD
and OP-SPEC. The six possible rankings of these three constraints are illustrated in (38).
34
(38)
a.
OP-SPEC
OB-HD
STAY
b.
OB-HD
OP-SPEC
STAY
c.
STAY
OP-SPEC
OB-HD
d.
STAY
OB-HD
OP-SPEC
e.
OB-HD
STAY
OP-SPEC
f.
OP-SPEC
STAY
OB-HD
(38a) shows the English option, giving wh movement and inversion. Both (38c) and (38d) yield
systems with neither wh movement nor inversion, since STAY suppresses the effects of both OB-HD
and OP-SPEC. (38e) in fact yields the same result: when OB-HD dominates STAY and STAY dominates
OP-SPEC, the effect is the same as when STAY dominates both. This is because the ranking of OPSPEC relative to STAY will prohibit wh movement, hence there will be no CP present and no empty
C to fill. Thus rankings (38c,d,e) all give a system with no movement.12 (38f) corresponds to a
system which has wh movement but no inversion. These all seem to be natural possibilities.
One a priori possible language type is characterized as impossible, namely one in which there is
no general inversion process, yet there is inversion, but not wh movement, in interrogatives. This is
ruled out by the reasoning governing (38e): without XP movement to Spec there is no head to be
filled.
Finally, the ranking in (38b) gives a system in which both wh movement and inversion occur in
a matrix clause, and also in a subordinate clause if OB-HD dominates PROJ-PRIN. If, however, PROJPRIN outranks OB-HD, in subordinate clauses there will be no wh movement (Compare the first and
third candidates in (T15). Whether this corresponds to a possible language, or whether some other
constraint interferes here, perhaps concerning selection, is a question that I leave open.
Re-ranking constraints gives typological effects of the kinds treated as parametric in other
perspectives. Within OT, the constraints are universal. Whether the effects of a constraint are visible
in a given language depends on the constraint rankings. In contrast, if only inviolable constraints are
admitted into the theory, then some notion like a parameter is essential, since there is no set of nonparameterized constraints such that every language satisfies them. Consider for example, the
descriptive generalization cited in 4.3 concerning the distribution of inversion in English, which, as
noted in 4.3, is a non-violable counterpart, in some sense, of the proposed OT constraint system.
12
If Japanese and Korean “wh-phrases” are, as first argued in Kim (1989) are simply QPs with
no wh properties then the absence of movement might be properly attributed to the irrelevance of OPSPEC.
35
It is accurate for English, but it is not true of all languages, and it cannot be a universal constraint,
but must be parametric. Hence the need for parameters — they accommodate language variation in
a system of inviolable constraints. In contrast, when such generalizations are understood, not as part
of the grammar of any language but rather as descriptions of the state of affairs that results from the
constraints as ranked in a particular grammar, the constraints are seen to be universal, and not
parameterized.
There is a systematic relationship between the constraint rankings in an optimality theoretic
grammar and the formulation of the consequences of these rankings as an unviolable constraint,
varying parametrically. If we reformulate a violable constraint Cv, as an inviolable constraint Ci, what
happens is this. If Cv is undominated, no change is required. However, if Cv is dominated, the
reformulation will have to build into Ci the effects of its interaction with every dominating constraint.
The more numerous the constraints that crucially dominateCv the more complicated the formulation
of Ci will have to be. Needless to say, the form that Ci eventually takes will vary cross-linguistically,
since Cv stands in different dominance relations in different grammars.
Variation determined by constraint interaction entails the existence of an entire set of grammars:
there is no way to simply eliminate a system by stipulation. Given the constraints, each grammar is
inevitable, as illustrated by (38). Of course the constraints could be wrong or other constraints could
be at play, but any change in the posited constraints, or the addition of new ones, will itself make
further predictions about possible re-rankings. For this reason, there is no way to surgically remove
exactly those systems which we wish to dispose of. This is not true of a parametric system. For the
sake of illustration, suppose we posit a parametric system using inviolable constraints to replace the
one posited here. We could formulate it like (39).
(39)
Matrix/subordinate interrogatives have wh movement/no movement
Matrix/subordinate interrogatives have inversion/no inversion
This system allows the problematic system yielded by ranking (38b) above. It also allows the
grammar which re-ranking of the universal constraints makes impossible — inversion without wh
movement. But the point is that it can easily be reformulated, in any way we choose. For instance,
we could write in a dependency between the specification concerning wh movement and the
specification concerning inversion. Any other arbitary dependency could equally well be established.
The reformulation may involve complication, but it is always possible. In the limit we can just list the
alternative systems, call the list a "parameter", and its members "values" of the parameter. It is
arbitrary what appears on the list and what does not, and mere descriptive convenience is the driving
force. In this sense, parametric values are isolated from each other, interacting constraints are not.
Re-ranking has inevitable consequences for the entire grammatical system in a way that re-writing
parameters does not. Re-ranking of universal constraints clearly promises a more illuminating theory
of typological variation.
36
8. Head Position and Subordinate Interrogatives
The interaction of OB-HD and PROJ-PRIN explains why inversion does not occur in subordinate
interrogatives. It does not, however, explain why the optimal candidates have no complementizer,
rather than some element such as that. Such a representation would satisfy both OB-HD and PROJPRIN. In Grimshaw (1993) it was assumed that the answer lies in the lexicon of English: that is a -wh
complementizer, hence cannot occur in a Specifier-head relationship with a +wh expression, and
English offers no alternatives which are compatible both semantically and syntactically with an
interrogative structure. However, it seems that despite the apparently parochial character of this
puzzle, the answer nonetheless resides in UG.
In order to develop this more interesting alternative, we will need to build a picture of how XNtheory can be treated under OT assumptions, in particular a theory of how heads are positioned.
Travis (1989) discusses two cases where a head V is not uniformly initial or final. In Chinese,
according to her analysis, the V precedes arguments but follows adjuncts. This follows if there is
default head-final, overriden by directionality of theta-marking, which requires theta-marking to the
right. In Kpelle, the default is head-initial, and it is overridden by directionality of case-marking,
which is leftward. Hence NP complements precede V while PPs follow. (Note that this solution is
not coherent under standard assumptions, since it crucially relies on notions like "default" and
"override" which have no place in a system of inviolable constraints.) In OT terms, in Chinese ThetaRight >> Head-Right (and both these constraints dominate all others concerning theta-marking, casemarking and head position).13 In Kpelle, Case-Left >> Head-Left (and these constraints dominate
all relevant others, as for Chinese.) Thus in Chinese and Kpelle head-position constraints, HD-LFT
and HD-RT, conflict with other constraints affecting position, and when the other constraints
dominate, the pattern of head position is perturbed as a result. These constraints, which require that
the head of every projection be leftmost/rightmost, are alignment constraints (Prince and Smolensky
1993, McCarthy and Prince 1993).
English shows a different kind of mixture: it is head final at the XP level and head initial at the
XN level. This too is the result of the head position constraints in interaction wth others. Suppose
that in English HD-LFT >> HD-RT. (If all structure is binary then HD-RT is satisfied whenever HD-LFT
is violated). HD-LFT is violated by subjects (Spec of VP when there is no auxiliary, Spec of IP when
there is). This is the effect of a dominant constraint on Specifier positions: SPEC-LFT. In this analysis,
then, English is a left-headed language because of HD-LFT, except where other constraints demanding
other head configurations intervene. More generally, heads are uniformly left or right within a
13
The head-finality of nominals in Chinese (Huang 1982) could be due to a constraint treating
N as final, N-RT, outranking THETA-RT . More interestingly, it could be attributed to N not thetamarking, as Travis suggests, in which case THETA-RT would not be violated in a head final NP. The
issue is subtle, however, in view of the arguments (Grimshaw 1990) that nouns take obligatory
arguments just like verbs when they have an event structure.
37
language, to the extent that they are able to be so. Apparent mixed systems always arise from
constraint conflict.
This sketch of the theory of head position in UG makes possible a simple solution to the residual
problem of heads in subordinate interrogatives. The candidates of interest are one with no C (the
successful one for English), and the alternative in which C is filled with that, or with any other
morpheme. The crucial comparison is between the pair in (40).
(40)
a. I wonder when I will see such a sight again.
b. *I wonder when that I will see such a sight again
Both these sentences, in the analyses in (T21), violate STAY twice, because of wh movement and
movement of the subject DP. Both also violate HD-LFT twice, because V and I are not leftmost in
their XP projections, due to the presence of Specifiers. The critical difference is that the candidate
with no that violates OB-HD, while the candidate with a that satisfies OB-HD but at the cost of an
additional HD-LFT violation, in CP. We conclude that HD-LFT dominates OB-HD, hence English
chooses to have no C rather than a C in the wrong position. (The opposite ranking of OB-HD and
HD-LFT would give a language in which a complementizer occurs obligatorily with interrogatives,
as well as with adverbial clauses. Perhaps middle English is close to such a system.)
(T21) omits PROJ-PRIN, OP-SPECand FULL-INT, all satisfied in the two candidates at issue.
There is no crucial ranking of SPEC-RT AND HD-RT with respect to each other or with respect to OBHD and STAY.
(T21) Subordinate Interrogatives, presence versus absence of that
Candidates
[CP wh that [IP DP will [VP t V t ]]]
L
[CP wh e [IP DP will [VP t V t ]]]
SPECLFT
HDLFT
SPECRT
HDRT14
***!
****
**
**
****
OBHD
STAY
**
*
**
Why doesn't the null complementizer in the optimal candidate violate HD-LFT also? If it does,
then the two candidates will tie on HD-LFT, and the decision will be made by OB-HD, which will
choose the wrong candidate. The answer must be that there is no "null complementizer" present.
This conclusion allows us to choose between two alternative conceptions of OB-HD. We might have
14
Violations of HD-RT (strictly irrelevant here) are calculated on the assumption that a
complement to the right of X induces two violations, since X is final neither in X' nor in XP.
Alternative formulations of the constraint would give different outcomes, but no evidence here bears
on the issue.
38
taken the position that GEN always includes a head X for a projection of X. Then both the candidates
in (40) would have a head, and the OB-HD violation would be due to the emptiness of the head. But
this interpretation is inconsistent with the explanation just given here for the absence of a
complementizer in interrogative CPs. Thus we conclude that GEN includes a head X for a projection
of X only when there is an element filling X. Only the first candidate in (41) has a head C present:
the other has no C position at all, just a CN dominating IP. OB-HD, then, is violated by this
configuration, i.e. it regulates the presence of X0 in a projection, and not the filling of X0 with
linguistic material. With this interpretation, the problem of why the null complementizer in the
optimal candidate does not violate HD-LFT is trivial: there is no null complementizer, so HD-LFT is
(vacuously) satisfied.15
When an auxiliary verb raises to a higher position, as in matrix questions and with negative
preposing, OB-HD is satisfied, but what of HD-LFT? The raised head has a specifier on its left in both
cases, so it is apparently not in the optimal head position. Since we know that HD-LFT dominates
OB-HD, we are in danger of making the incorrect prediction that raising of an auxiliary verb should
never be the best response to an empty head, which would undermine the entire analysis of inversion.
This suggests that HD-LFT (and presumably HD-RT) holds only of perfect heads in the sense of
Grimshaw (1991), ie. heads which match the projection in all respects. The relationship of I to IP
is that of perfect head, but that of any raised head to the projection it raises into, is not. Hence HDLFT is not violated by moved heads. This will have no effect on X-bar structure in general, since
simple X-bar projections are always perfect projections, hence HD-LFT is always effective here.
We know, then, that unprojected heads and moved heads do not figure in calculating violations
of HD-LFT (and HD-RT): the same proves to be true for traces. (T5') is a revision of (T5), which now
includes HD-LFT and HD-RT, and also shows the trace left by movement of DP from within VP. How
many violations of HD-LFT are there? If traces and unprojected heads left by head movement are
exempted from the constraint, the violations are those indicated. All candidates except the last incur
one violation, hence the decision passes to OB-HD, which prefers the first candidate, with inversion.
Note that if unprojected heads alone were exempted from the constraint, we would get the wrong
result. The trace of head movement in the first candidate would violate HD-LEFT, thus incurring two
violations of the constraint, which would eliminate the actual optimal candidate from the competition.
We conclude that the constraint holds of overt perfect heads.16
15
16
Thanks to Alan Prince for aid in developing this proposal.
The other cases of do support -- those involving negation provide us with evidence for a
ranking. In (T12) the final candidate (with no violation of HD-LFT) will win unless NO-LEX-MVT
dominates HD-LFT, a ranking which is entirely consistent with the system, but which is not further
explored here.
39
(T5') Matrix Interrogatives with and without do (revised)
Candidates
L
OPSPEC
*HDLFT
OBHD
FULLINT
STAY
HDRT
*
***
*
[CP wh doi [IP DP ei [VP t V t ]]]
*
[CP wh e [IP DP e [VP t V t ]]]
*
*!*
**
*
[CP wh e [VP DP V t ]]
*
*!
*
*
[CP wh e [IP DP do [VP t V t ]]]
**
*!
**
**
*
Thus UG provides constraints on head position, in the form of HD-LFT and HD-RT. The ranking
of HD-LFT relative to OB-HD in English ultimately explains the absence of complementizers in
subordinate interrogatives, and a fact which appears to be highly language specific is derived from
the interaction of universal constraints.
Note too, that this is another example where a lexical gap is explained as a function of constraint
rankings. English has no word to appear in the head position in subordinate interrogatives. The
constraint ranking of English makes it impossible for such a word to be used, thus in effect it makes
it impossible for it to exist in the lexicon. The same point has been made previously: no language
in which NO-LEX-FUNCT is dominated by both FULL-INT and OB-HD can have a semantically empty
use for a verb as in English auxiliary do (3.3.) Gaps in the lexicon can be epiphenomena of constraint
rankings, and the same is true of the existence of lexical items in particular analyses, if the analysis
of do in 3.3 is correct. These results are quite surprising, in view of the more standard view
(Chomsky 1992) that language variation is due to differences in the lexicon, rather than, as here, that
lexical variation may be due to differences in the grammar.
This raises the question of whether all principled lexical variation, in particular all cross-linguistic
variation in functional categories, might be derived from constraint ranking. Typological differences
attributed to “strong” versus “weak” features in work in the minimalist program (Chomsky 1992) for
example, can be understood as resulting from ranking of checking constraints and STAY. When STAY
outranks a checking constraint there will be no movement, when it is outranked by a checking
constraint, movement will occur. Minimally, it seems that re-ranking of violable constraints offers
an extremely interesting window on the relationship between lexical and syntactic properties.
9. The Obligatoriness of that
9.1
Complements with and without that
This paper does not aim to characterize inversion, but to derive the distribution of inversion from
the properties of heads in general. I will show here that the obligatory appearance of that in
40
subordinate clauses with topicalization or operator movement to Specifier follows from the principles
already laid out.
The constraints allow sentential complements to be VPs, IPs or CPs, or in principle XPs, although
as we will see below, this possibility is in fact excluded by the constraints. When a complement is a
CP, it must have its head filled, because of OB-HD. Hence a clause which is not introduced by a
complementizer cannot be a CP. (Doherty (1993) provides extensive argumentation in favour of
analyzing clauses without that as IPs.) Similarly, a clause with no auxiliary cannot be an IP. Thus
verbs like think allow three complement structures, as in (41):
(41)
a. I think [ CP that it will rain]
b. I think [ IP it will rain]
c. I think [ VP it rained]
An apparent objection to this analysis is that we have to stipulate that think and every verb like
it selects VP, IP and CP as a complement. This would not only complicate the lexical representation
for all of these verbs, it makes it impossible to explain why there isn't a verb of approximately the
same semantics as think, which takes just one of the three, since such a verb would be exploiting the
simplest possible selection option available.
The same issue arose in connection with the claim that interrogatives with wh subjects are VPs
or IPs, in Section 6, and the same answer holds here, exploiting Type-Category selection (Grimshaw
1991). Recall that in the theory of extended projection, functional heads do not select at all. Lexical
heads do select: they c-select the syntactic category and s-select the semantic type of their
complements. All members of the verbal extended projection (C, I, V and whatever other heads
participate) are of the same syntactic category (verbal). They differ in their functional analysis, not
in their syntactic category. It follows that they cannot be distinguished by c-selection. What about
s-selection? Suppose finite VP, IP and CP are good realizations of the same semantic type: let us
call it “proposition”. Then it will follow that all verbs which take propositional arguments take all
three realizations (VP, IP and CP) of their arguments. (Those verbs which appear to take just CP,
such as factives and manner-of-speaking verbs, have different selectional specifications.)
Given the constraints and rankings developed so far, when the input includes a semantic auxiliary,
there are two optimal candidates: a CP with that as its head, and an IP headed by the auxiliary. When
there is no semantic auxiliary in the input, the two optimal candidates are a CP with that and a bare
VP. A CP with no head is non-optimal in both cases, since it violates OB-HD.
Note that the complementizer that does not violate FULL-INT, since it does not have an unparsed lcs,
unlike do as analyzed in 3.3.
41
(T22) VP and CP propositional complements
Candidates
L
L
PROJPRIN
OPSPEC
OB-HD
FULLINT
STAY
V [VP DP V ...]
V [CP that [VP DP V ...]]
(T23) IP and CP propositional complements
PROJPRIN
Candidates
L
L
OPSPEC
OB-HD
FULLINT
STAY
V [IP DP will [VP t V ...]]
*
V [CP that [IP DP will [VP t V ...]]]
*
The structure of Optimality Theory makes the survival of more than one candidate a difficult
result to achieve. In fact some adjustment of the present proposal will be necessary if the head
position constraints include HD-RT, as suggested in the previous section. Otherwise the CP
complement will always be eliminated in favor of the VP or IP complement, since the CP complement
contains an extra left-headed projection (C'), and hence violates HD-RT one more time than the VP
or IP complement. It is also necessary to assume that that, and presumably all other functional heads,
can be freely included in an extended projection or not, without incurring a violation of constraints
of either the FILL or PARSE type. Both of these issues merit further exploration.
In general, then, VP, IP and CP alternate in complement position. Nonetheless, there are certain
circumstances in which the complementizer occurs obligatorily. When topicalization or any other
adjunction occurs in a subordinate clause, the that complementizer is obligatory. We can see this
below: (42), with no adjunction, is equally good with and without the complementizer, while (43),
with most of the time adjoined to IP and construed with the subordinate clause, becomes very
seriously degraded if the complementizer is omitted.17
17
The complementizer is obligatory also in all positions except when the clause is a complement
to a verb (Stowell (1981) Kayne (1981)). This generalization, illustrated for subject clauses and
complement clauses in (i), is more robust than the obligatory that effect analysed here. There are
42
(42)
a. She swore/insisted/thought that they accepted this solution most of the time.
b. She swore/insisted/thought they accepted this solution most of the time.
(43)
a. *She swore/insisted/thought(,) most of the time(,) they accepted this solution.
b. She swore/insisted/thought that(,) most of the time(,) they accepted this solution.
The solution builds on a suggestion by Eric Hoekstra (p.c.): when the complement is a CP then
adjunction to the IP is possible, whereas when the complement is an IP then adjunction to IP will be
ruled out. Hence, only when there is a CP projection over the IP projection will the IP projection be
a possible adjunction site.
Let us consider first of all the situation for adjunction. If the complement is an IP, PROJ-PRIN will
be violated, because the IP is a subordinate clause. If the complement is an XP (not shown in the
tableau), or a CP with no C, OB-HD will be violated, and if inversion takes place PROJ-PRIN will be
violated. But if the complement is a CP headed by that, PROJ-PRIN and OB-HD will both be
respected. Hence the optimal configuration is a CP with a filled head, hence this is the only
grammatical configuration.
(T24) Adjunction to IP and CP complements
PROJPRIN
Candidates
V [IP adjunct [IP DP will [VP t V ...]]]
V [CP willi [IP adjunct [IP DP ei
L
[VP t V ...]]]]
V [CP e [IP adjunct [IP DP will [VP t V ...]]]]
V [CP that [IP adjunct [IP DP will [VP t V ...]]]]
OPSPEC
OBHD
FULLINT
STAY
*!
*
*!
**
*!
*
*
Obligatoriness of that is predicted both with adjunction and with negative preposing. The
reasoning for negative preposing parallels that for adjunction. If the complement is an XP, either
speakers who systematically accept adjunction and negative-related inversion even in the absence of
the complementizer, (Andrew Radford p.c.), but such speakers still reject examples like (ib).
(i)
a.
That he left so early shows (that) he was tired.
b.
*He left so early shows (that) he was tired.
The explanation for this effect seems to concern the root character of complements to some verbs,
but the issue will not be developed further here.
43
PROJ-PRIN or OB-HD must be violated: PROJ-PRIN if inversion to X occurs, and OB-HD if inversion
to X does not occur. When the complement is a CP headed by that, OB-HD is satisfied in CP, and
inversion to X is not prohibited by PROJ-PRIN, and satisfies OB-HD for X. This is the optimal
structure, and it includes that. As (44) shows, that is indeed required with negative preposing.
(44)
a. She swore/insisted/thought that never in her life would she accept this solution
b. *She swore/insisted/thought never in her life would she accept this solution
(T25) Negative Preposing in XP and CP complements
Candidates
PROJPRIN
V [XP never e [ IPDP will [ VP V ...]]]
V [XP never will [ IP DP
L
e [ VP V ...]]]
V [CP that [XP never e [ IP DP will [ VP V ...]]]]
OPSPEC
OBHD
*!
*!
STAY
*
**
*!
*
**
V[CP that [XP never willi [ IP DP ei [ VP V ...]]]]
V [CP e [XP never willi [ IP DP ei [ VP V ...]]]]
FULLINT
*!
**
The CP in all of these cases is present to protect the projection below it from the effects of PROJPRIN. When the IP or XP is a subordinate clause it cannot be adjoined to, and its head cannot be
filled by movement. It is no longer necessary to appeal to the idea that selectional properties of that
(its ability to select a CP complement for "CP recursion") are responsible for its appearance here,
(Rizzi and Roberts (1989), Cardinaletti and Roberts (1991), McCloskey (1992), Vikner in press).
Such a solution rests on a lexical stipulation, while the solution based on constraint conflict rests on
general properties of the grammatical system of English. (It seems very likely that the same basic
analysis explains the obligatoriness of a complementizer in some embedded V2 systems, see Vikner
(in press)).
Comparison of the situation with negative preposing and wh movement in subordinate clauses
shows an interesting contrast. In subordinate interrogatives, inversion simply fails, since PROJ-PRIN
and OB-HD conflict and PROJ-PRIN outranks OB-HD. Why then does inversion not fail also when
negative preposing occurs in the XP complement to V? If we get a grammatical sentence by failing
to invert in a subordinate interrogative, why don't we get a grammatical sentence by failing to invert
in an XP complement to a V? The two sentences have the same constraint profile, and certainly look
identical:
(45)
a. He wondered when she would arrive
b. *He said never he had arrived so late
44
The solution rests on the comparative nature of optimality theory. In the XP case there is another
candidate available, namely the one that includes the CP, which is more successful. In contrast, for
an interrogative complement, there is no way to enlarge the extended projection to protect the CP,
making inversion possible. The reason is that any additional projection will add an OB-HD violation,
given that that does not occur outside a interrogative. So the uninverted form of the interrogative is
optimal, but the inverted form of the negative is optimal. The very same pattern of constraint
satisfaction and violation can yield a grammatical sentence or an ungrammatical sentence, depending
on the competition.
Why does that not occur outside the wh phrase? More generally, why does English not have a
way to fill a higher functional head above the projection used for wh fronting, as does Spanish (Suñer
1991). Sten Vikner suggests (p.c.) that this follows from OP-SPEC. In Section 2 it was suggested that
OP-SPEC is satisfied for a wh operator only when the operator is in a position from which it ccommands the entire extended projection. When a higher projection is present, then, OP-SPEC will
be violated for wh operators. The successful candidate will be one with the wh phrase in Specifier
of the highest projection in the extended projection. Hence there is no way to enlarge the extended
projection to allow inversion with the interrogative.
In sum, the question of why the complementizer is obligatory with topicalization or preposing to
Specifier is answered in terms of the principles laid out here. The question of why that is obligatory
reduces to the question of why a CP projection is obligatory, and this follows from the Projection
Principle. OB-HD, responsible for the inversion patterns analyzed in earlier sections, forces the
presence of that in the optimal CP. A possible extension of these results (suggested by Sten Vikner
and Viviane Déprez p.c.) would take the general obligatoriness of the complementizer in Romance
and Germanic languages to follow from their having V-to-I movement in finite clauses. This will
cause them to violate PROJ-PRIN unless the complementizer is present. In contrast, since there is no
head movement involved in the English auxiliary system, that is not obligatory in subordinate clauses
in general.
9.2 A Note on that-trace Configurations
Extraction of an adjunct or a complement is unaffected by the presence of that, while the
extraction of a subject is possible only if that is omitted, except in relative clauses where that must
be present when it is the highest subject that is extracted. This effect has been widely attributed to
the ECP (e.g. Kayne (1981), Lasnik and Saito (1984), Rizzi (1990), and the proposal here is based
on such solutions. It builds on the insight of Déprez (1991, 1993) that English that-trace
configurations are ungrammatical because English offers an alternative, in the form of a that-less
clause. The proposal to be given here depends crucially on the results presented above concerning
clause structure, and on the assumption that (all) heads govern both their complements and the
specifiers of their complements. We then posit these government constraints:
45
T-GOV
T-LEX-GOV
Trace is governed
Trace is lexically governed
T-GOV is violated when a trace is not governed by any head, T-LEX-GOV when a trace is not
governed by a lexical head.
Let us begin by examining the extraction of complements and adjuncts, both of which are
unaffected by the presence or absence of that. The reason is that in each case the candidates with and
without that are equally successful at satisfying the constraints.
(46)
a. Who do you think (that) they will see t?
b. When do you think (that) they will see them t?
When the object is extracted as in (46a), both constraints are satisfied, so both candidates are optimal,
as can be seen in the tableau (T26), which shows only the government constraints
(the bold and unindexed trace is the relevant one).
(T26) Extraction of an object
Candidates
L
L
T-GOV
T-LEX-GOV
V [IP DPi I [VPti V t ]]
V [CP that [IP DPi I [VP ti V t ]]]
When the adjunct is extracted, as in (46b), neither constraint is satisfied, since an adjunct is not
governed at all. Again, then, the candidates with and without that are equally successful and both
are grammatical.
(T27) Extraction of an adjunct
Candidates
L
L
[CP
T-GOV
T-LEX-GOV
V [IP DPi I [VP ti V ] t ]
*
*
that [IP DPi I [VP ti V ] t ]]
*
*
It is precisely in the case of extraction of a subject that the presence of that makes a difference.
(47)
a. Who do you think will see them?
b. *Who do you think that t will see them?
46
(T28) Extraction of a subject
Candidates
T-GOV
T-LEX-GOV
V [IP ti I [VP ti V ... ]]
V [CP that [IP ti I [VPt i V ... ]]]
*!
When that is present, it governs the trace, so the trace is governed, but it is not lexically governed,
since C is not lexical. Hence the second candidate violates T-LEX-GOV. When the complement is just
an IP, however, as in the first candidate, the complement-taking verb governs the trace, which is
therefore lexically governed and satisfies both constraints. Hence the that-trace configuration is
ungrammatical because the V-trace configuration is optimal.
Relative clauses, being adjuncts, are not governed. It follows that the specifier of a relative clause
is not governed by any element from outside the clause. This is the key to the paradigm in (48).
(48)
a. *The people t will see them ...
b. The people that t will see them ...
The trace in Spec of IP is governed by that in (48b), hence it is governed, although not lexically
governed. The trace in Spec of IP is not governed by any head in (48a), hence this candidate
violates both government constraints.18
18
I have represented the relative clause with no that as just an IP in (T29). A more traditional
analysis, based on Chomsky (1977), would posit a CP here with a null operator in its Spec.
(i)
(ii)
The people [Op e [t will ...]]
The people [Op that [t will...]]
Assuming that HD-LFT is violated in the second structure, if HD-LFT is outranked by T-GOV, we will
get the right result under this representation. There is a problem, however, since if HD-LFT is violated
because of the empty operator to the left of C, the prediction is that the candidate with no that will
be optimal whenever anything other than the highest subject has been relativized, contrary to fact.
If, on the other hand HD-LFT is not violated because the OP is empty, then (ii) will still be the optimal
version, but we encounter a different problem: OB-HD will require that to introduce all relative
clauses. This is the reason for the choice of the IP analysis.
47
(T29) Extraction of a subject in a relative clause
Candidates
L
N]
[IP t I [VP V ... ]]
T-GOV
T-LEX-GOV
*!
*
N ] [CP that [IP t I [VP V ... ]]]
*
The essential point of this solution is that the that-trace configuration is not always ruled out: under
certain circumstances it may be the optimal configuration, and be grammatical. Whether it is good
or bad for a trace to be governed by C depends on what the alternatives are: it is better to be
governed by C than not to be governed at all.
This point holds cross-linguistically as well. The prediction is that languages in which there is no
better candidate will simply not show that-trace effects, so that extraction corresponding to (47b) will
be grammatical. This is exactly the situation in Dutch, as illustrated in this example from Weerman
(1989):
(49)
Wie denk je dat t ons gezien heeft?
Who think you that t us seen has
The observation that that-trace configurations are not universally ruled out has been challenging
to understand (see e.g. Maling and Zaenen 1978, Sobin 1987, Bayer 1984, Bennis and Haegeman
1984). But in the optimality-theoretic constraint satisfaction perspective, it is just what is expected.
In a language which does not allow complementizer-less subordinate clauses, the route to optimality
followed by English is not available, so the language must settle for only non-lexical government in
this configuration.
The violability of the constraint system governing that-t effects is visible within English, and lies
behind the final puzzle to be analyzed here. (Culicover 1993) shows that the presence of an
expression adjoined to IP can make a that-trace configuration legitimate:
(50)
a. *Who did she swear(,) most of the time(,) t accepted this solution?
b. Who did she swear that(,) most of the time(,) t accepted this solution.
Culicover concludes that ECP-based accounts of the effect must be incorrect. However note that in
exactly these configurations, there is a conflict between PROJ-PRIN and the lexical government
constraint. Tableau (T30) shows the effects.
48
(T30) PROJ-PRIN conflicts with T-LEX-GOV:
PROJPRIN
Candidates
L
O
BHD
FULLINT
STAY
*
V [CPthat [VP adjunct [VP t V ]]]
V [VP adjunct [VP t V ]]
OPSPEC
*!
TGOV
TLEXGOV
*
*
PROJ-PRIN requires that here because of the adjunction. But if that is present, then T-LEX-GOV is
violated, since the trace is not lexically governed. The correct ranking of these two constraints will
therefore automatically give the right result: PROJ-PRIN dominates T-LEX-GOVERNMENT.
10.
General Discussion
Every projection is optional, and only present if it is needed. The size of an extended projection
is variable, and depends on the effects of grammatical constraints (see Haider (1989), Ackema et al.
(1992), Heycock and Kroch (1993) for related proposals). This is inconsistent with the hypothesis
that what constructs phrase markers is either phrase structure rules or selectional statements. Optional
projections are difficult to make sense of in such systems, because they always involve complicating
the PS rules or selection statements. Instead, it seems that what is involved is free combination of
heads and their projections, regulated by constraints of the kind discussed here, among others. At
this point, however, we can take a more radical step: there is no reason to label the projections which
make up extended projections. Every projection is just that, a projection, which has a grammatical
category, may have a lexically realized head, may have acquired a head by movement, or may lack
one completely. The properties of a projection are just a function of what happens to head it. If it
has no head it has no properties other than those imposed by its role in the extended projection. E.g.
“XP” is just a verbal projection. Strongly consistent with this is the fact that selectional statements
make no reference to distinctions among projections, at least those that are of the same category,
such as VP, IP and CP. Moreover, none of the constraints refer to projection labels: a wh phrase
need not be in Specifier of CP for example, any specifier which c-commands the extended projection
will suffice. Adjunction occurs freely, its effects being reined in by PROJ-PRIN. From this
perspective, there is, of course, no such thing as “CP recursion” (Rizzi and Roberts (1989) Vikner
and Schwartz (1991) McCloskey (1993)), even for cases like Dutch with its three “C” positions
(Hoekstra 1993). The “extra” projections that have sometimes been given this analysis are really no
different from all other projections: just part of the indefinitely expandible verbal extended projection,
which is regulated by the constraints examined in this paper, among others.
Since there is no fixed limit on the number of projections that can be included in an extended
projection, the number of competitors also has no fixed limit. However, there is a way to eliminate
all but a few candidates; adding projections eventually and reliably leads to worsening status on the
constraints. If the addition of, say, two projections leads to a less satisfactory candidate, the addition
49
of yet another projection will never yield improvement. Thus candidates which are yet larger need
not be considered.
There is a clear affinity between the optimality theoretic model presented here and research using
notions like “economy”. As I pointed out earlier, STAY is an economy principle, which chooses
representations containing the fewest traces. Optimality theory provides a way to understand such
notions as “economy”, as just subcases of the total universal set of violable constraints. Moreover,
under OT there is an explicit way to determine how constraints will interact with each other. This
is of course essential to making sense of any theory in which there is constraint interaction. Consider,
for example, the idea that short derivations are less costly than longer ones, and that universal devices
are less costly than language particular ones, as in Chomsky (1991). Without a means of computing
the comparative cost of the two expensive items, there is no way to calculate the results of interaction
between them. What happens when we have to choose between a derivation with fewer steps but
more language particular devices and one with more steps and fewer language particular devices?
OT provides a theory of constraint interaction which makes such questions answerable: the choice
will depend on the ranking of the constraints in the grammar.
50
References
Ackema, Peter, Ad Neeleman and Fred Weerman. 1992.
Proceedings of NELS.
Deriving functional projections.
Baker, L. 1991. The Syntax of English Not: The Limits of Core Grammar. Linguistic Inquiry
22:387-429.
Bakovic, Eric. 1995. A Markedness Subhierarchy in Syntax. ms. Rutgers University.
Baltin, Mark. 1982. A landing site theory of movement rules. Linguistic Inquiry 13: 1-38.
Bayer, Joseph 1984. Towards an explanation for certain that-t phenomena: The COMP-node in
Bavarian. In Sentential Complementation, ed. W. de Geest and Y. Putseys, 23-32. Dordrecht: Foris.
Besten, Hans den. 1983. On the interaction of root tranformations and lexical deletive rules. In On
the formal syntax of the Westgermania, ed. Werner Abraham, 47-131. Amsterdam: Benjamins.
Brisson, Christine. 1994. Noun Phrases, of, and Optimality. Ms. Rutgers University, New Brunswick,
NJ.
Burton, Strang and Jane Grimshaw. 1992. Coordination and VP-internal subjects. Linguistic Inquiry
23: 305-313.
Cardinaletti, Anna. 1992. Spec CP in Verb-Second Languages. in M. Starke ed. Geneva Generative
Papers, Volume 0 Number 0, 1-9. Departement de Linguistic Generale, Geneva.
Cardinaletti, Anna and Ian Roberts. 1991. Clause-structure and X-Second. Ms. Università di Venezia
and Université de Genève, Venice and Geneva, Switzerland and Italy.
Cheng, Lisa L. S. 1991. On the typology of Wh-questions. Doctoral dissertation. MIT, Cambridge,
MA.
Chomsky, Noam. 1957. Syntactic Structures. den Haag: Mouton.
Chomsky, Noam. 1981. Lectures on Government and Binding. Dordrecht: Foris.
Chomsky, Noam. 1986. Barriers. Cambridge, MA: MIT Press.
Chomsky, Noam. 1991. Some notes on economy of derivation and representation. In Principles and
Parameters in Comparative Grammar, ed. Robert Freidin, 417-453. Cambridge, MA: MIT Press.
Chomsky, Noam. 1992. A minimalist program for linguistic theory. MIT Occasional Papers in
Linguistics: Number 1. Cambridge, MA: MIT Press.
51
Chung, Sandra and James McCloskey. 1983. On the interpretation of certain island facts in GPSG.
Linguistic Inquiry 14: 704-13.
Cinque, Guglielmo. 1990. Types of A-Bar dependencies. Cambridge, MA: MIT Press.
Collins, Chris. 1991. Why and How Come. MIT Working Papers in Linguistics 15.
Déprez, Viviane. 1991. Economy and the that-t effect. Proceedings of ESCOL.
Déprez, Viviane. 1991. Wh-movement: adjunction and substitution. Proceedings of WCCFL X.
Déprez, Viviane. 1993. A minimal account of the that-t effect. To appear in Festschrift for Richard
Kayne.
Doherty, Cathal. 1993. Clauses without that: the case for bare sentential complementation in English.
Doctoral dissertation, University of California, Santa Cruz.
Dwivedi, Veena. 1993. Topicalization in Hindi and small pro. Presented at the Workshop on
Ergativity, Rutgers University.
Emonds, Joseph. 1975. A transformational approach to English syntax. Academic Press: New York,
NY.
Emonds, Joseph. 1978. The verbal complex of VN–V in French. Linguistic Inquiry 9:151-175.
Engdahl, Elisabet. 1980. Wh-constructions in Swedish and the relevance of Subjacency. Proceedings
of NELS 10.
Erteschik-Shir, Nomi. 1992. Focus Structure and Predication, the case of negative wh-questions.
Belgian Journal of Linguistics 7: 35-51.
Erteschik-Shir, N. In preparation. Focus.
Gazdar, Gerald. 1981. Unbounded dependencies and coordinate structure. Linguistic Inquiry 12:
155-184.
Geest, W. de and Y. Putseys, eds. 1984. Sentential Complementation. Dordrecht: Foris.
Grimshaw, Jane. 1979. Complement selection and the lexicon. Linguistic Inquiry 10: 279-326.
Grimshaw, Jane. 1990. Argument Structure. Cambridge, MA: MIT Press.
52
Grimshaw, Jane. 1991. Extended Projection. Ms, Brandeis University, Waltham, MA.
Grimshaw, Jane 1994. Minimal Projection and Clause Structure. In B. Lust, J. Whitman and J.
Kornfilt eds. Syntactic Theory and First Language Acquisition: Cross Linguistic Perspectives Volume I: Heads, Projections, and Learnability, Erlbaum. 75-83.
Grimshaw, Jane and R. Armin Mester. 1988. Light verbs and Theta-Marking. Linguistic Inquiry 19:
205-232.
Grimshaw, Jane and Vieri Samek-Lodovici. 1994. Optimal Subjects. Ms, Rutgers University, New
Brunswick NJ.
Haegeman, Liliane. 1992. Sentential negation in Italian and the Neg Criterion. In M. Starke ed.
Geneva Generative Papers, Volume 0 Number 0, 10-26. Departement de Linguistic Generale,
Geneva.
Haegeman, Liliane and Hans Bennis. 1984. On the status of agreement and relative clauses in WestFlemish. In Sentential Complementation, ed. W. de Geest and Y. Putseys, 33-53. Dordrecht: Foris.
Haider, Hubert. 1989. Matching projections. In Constituent Structure: Papers from the 1987 GLOW
Conference, ed. Anna Cardinaletti, Guglielmo Cinque and G. Giusti, 101-122. Dordrecht: Foris.
Halle, Morris, and A. Marantz. 1993. Distributed Morphology. In Kenneth Hale and Samuel Jay
Keyser ed. The View from Building 20. MIT Press, Cambridge, MA. 111-176.
Heycock, Caroline and Anthony Kroch. 1993. Verb movement and coordination in the Germanic
languages: evidence for a relational perspective on licensing. Ms. University of Pennsylvania,
Philadelphia, PA.
Hoekstra, Eric. 1993. On the parametrisation of functional projections in CP. In Proceedings of the
North East Linguistic Society 23, volume 1, 191-204.
Huang, C.-T. James. 1982. Logical relations in Chinese and the theory of grammar. Doctoral
dissertation. MIT, Cambridge, MA.
Iatridou, Sabine and David Embick. 1994. Conditional inversion. In Proceedings of NELS 24.
Iatridou, Sabine and Anthony Kroch. 1992. The licensing of CP-recursion and its relevance to the
Germanic verb-second phenomenon. Working Papers in Scandinavian Syntax 50: 1-24.
Jespersen, O. 1954. A Modern English Grammar on Historical Principles, George Allen & Unwin,
London, and Ejnar Munksgaard, Copenhagen.
53
Kayne, Richard. 1981. ECP extensions. Linguistic Inquiry 12: 93-133.
Kayne, Richard. 1982. Predicates and arguments, verbs and nouns. Abstract of paper presented to
the 1982 GLOW conference. GLOW Newsletter 8: 24.
Kayne, Richard. 1983. Chains, categories external to S, and French complex inversion. Natural
Language & Linguistic Theory 1: 109-137.
Kayne, Richard. 1991. Romance clitics, verb movement, and PRO. Linguistic Inquiry 22: 647-686.
Kim, Soo Won. 1990. Chain scope and quantificational structure. Doctoral dissertation. Brandeis
University, Waltham, MA.
Kitagawa, Yoshihisa. 1986. Subjects in Japanese and English. Doctoral dissertation. University of
Massachusetts, Amherst, MA.
Klima, Edward. 1964. Negation in English. In The Structure of Language, eds. Jerry Fodor and
Jerrold Katz. New York: Prentice-Hall.
Koopman, Hilda and Dominique Sportiche. 1991. The position of subjects. In The syntax of verbinitial languages, ed. James McCloskey. Elsevier. [Published as a special issue of Lingua].
Laka Mugarza, M. Itziar. 1990. Negation in syntax: On the nature of functional categories and
projections. Doctoral dissertation. MIT, Cambridge, MA.
Lasnik, Howard and Mamoru Saito. 1984. On the nature of proper government. Linguistic Inquiry
15: 235-289.
Lasnik, Howard and Mamoru Saito. 1992. Move ". Cambridge, MA: MIT Press.
Lasnik, Howard and Tim Stowell. 1991. Weakest crossover. Linguistic Inquiry 22: 687-720.
Legendre, Géraldine, William Raymond and Paul Smolensky. 1993. An Optimality-Theoretic
typology of case and grammatical voice systems. Proceedings of the Berkeley Linguistics Society 19.
Liberman, Mark. 1974. On conditioning the rule of Subject-Auxiliary Inversion. Proceedings of
NELS V, 77-91.
McCarthy, John and Alan Prince. 1993. Generalized Alignment. In Yearbook of Morphology 1993,
ed. Geert Booij and Jaap van Marle, 79-153. Dordrecht: Kluwer.
McCloskey, James. 1992. Adjunction, selection and embedded verb second. Linguistics Research
Report LRC-92-07. University of California at Santa Cruz, Santa Cruz, CA.
54
McNally, Louise. 1992. VP-coordination and the VP-internal subject hypothesis. Linguistic Inquiry
23: 336-341.
Müller, Gereon and Wolfgan Sternefeld. 1993. Improper movement and unambiguous binding.
Linguistic Inquiry 24: 461-507.
Pesetsky, David. 1989. Language particular processes and the Earliness Principle. Unpublished ms.
Cambridge, MA: MIT.
Platzack, Christer and Anders Holmberg. 1989. The rôle of AGR and finiteness. Working Papers in
Scandinavian Syntax 31: 9-14.
Plunkett, Bernadette. 1990. Inversion and early questions. In Papers in the acquisition of Wh, ed.
T. L. Maxfield and B. Plunkett. University of Massachusetts Occasional Papers, special edition.
University of Massachusetts, Amherst, MA.
Pollock, Jean-Yves. 1989. Verb movement, Universal Grammar, and the structure of IP. Linguistic
Inquiry 20: 365-424
Postal, Paul. 1992. Overlooked extraction distinctions. Ms. T. J. Watson Research Center, IBM,
Yorktown Heights, NY.
Prince, Alan and Paul Smolensky. 1993. Optimality Theory: constraint interaction in generative
grammar. RuCCS Technical Report #2. Rutgers University Center for Cognitive Science,
Piscataway, NJ.
Raposo, Eduardo. 1987. Case Theory and Infl-to-Comp: the inflected infinitive in European
Portuguese. Linguistic Inquiry 18: 85-109.
Reinhart, Tanya. 1993. Wh-in-situ in the framework of the Minimalist Program. Ms. Utrecht and TelAviv.
Riemsdijk, Henk van. 1990. Functional prepositions. In Unity in Diversity, ed. H. Pinkster and I.
Genee. Dordrecht: Foris.
Ritter, Elizabeth and Sara Thomas Rosen. 1993. Deriving causation. Natural Language & Linguistic
Theory 11: 519-555.
Rizzi, Luigi. 1982. Issues in Italian Syntax. Dordrecht: Foris.
Rizzi, Luigi. 1990. Relativized Minimality. Cambridge, MA: MIT Press.
Rizzi, Luigi. 1991. Residual Verb Second and the Wh Criterion, Technical Reports in Formal and
55
Computational Linguistics 2. Faculté des Lettres, Université de Genève, Geneva.
Rizzi, Luigi and Ian Roberts. 1989. Complex inversion in French. Probus 1.1.
Roberts, Ian. 1993. Verbs and diachronic syntax. Dordrecht: Foris.
Rochemont, Michael. 1989. Topic islands and the Subjacency parameter. Canadian Journal of
Linguistics 43: 145-170.
Sadock, Jerrold. 1984 The pragmatics of subordination. In Sentential Complementation, ed. de
Geest and Putseys, 205-213. Dordrecht: Foris.
Safir, Kenneth. 1993. Perception, selection, and structural economy. Ms. Rutgers University, New
Brunswick, NJ.
Smolensky, Paul, Colin Wilson, Géraldine Legendre, Kristin Homer and William Raymond. 1994.
Scope and wh-movement in Optimality Theory. Presented at the Society for Philosophy and
Psychology, Memphis, TN.
Sobin, Nicholas. 1987. The variable status of COMP-trace phenomena. Natural Language &
Linguistic Theory 5: 33-60.
Stowell, Tim. 1981. Origins of phrase structure. Doctoral dissertation. MIT, Cambridge, MA.
Suñer, Margarita. 1991. Indirect questions and the structure of CP. In Current issues in Spanish
linguistics, ed. H. Campos and F. Martínez-Gil. Washington, DC: University of Georgetown Press.
Travis, Lisa. 1989. Parameters of phrase structure. In Alternative Conceptions of Phrase Structure,
ed. Mark Baltin and Anthony Kroch. Chicago, IL: University of Chicago Press.
Travis, Lisa. 1991. Parameters of phrase structure and V2 phenomena. In Principles and Parameters
in Comparative Grammar, ed. Robert Freidin. Cambridge, MA: MIT Press.
Vikner, Sten. In press. Verb movement and expletive subjects in the Germanic languages. Oxford
University Press.
Vikner, Sten and Bonnie Schwartz. 1992. The verb always leaves IP in V2 clauses. To appear in
Parameters and functional heads, ed. Adriana Belletti and Luigi Rizzi. Oxford: Oxford University
Press.
Weerman, Fred. 1989. The V2 conspiracy. Dordrecht: Foris.
Williams, Edwin. 1994. Thematic structure in syntax. Cambridge, MA: MIT Press.
56
Zagona, Karen. 1982. Government and Proper Government of verbal projections. Doctoral
dissertation. University of Washington, Seattle, WA.
Zanuttini, Raffaella. (1991) Syntactic properties of sentential negation: a comparative study of
Romance languages. Doctoral dissertation. University of Pennsylvania, Philadelphia, PA.