XX THE THEORY OF HEAD MOVEMENT AND CYCLIC SPELL OUT

Head movement and cyclic Spell Out PRE-FINAL 1
Balázs Surányi, RIL, Hungarian Academy of Sciences
1. INTRODUCTION
This paper develops a syntactic approach to head movement which offers a resolution of
several long-standing challenges to the view according to which head movement proper exists
in narrow syntax (=HM). It has been a recurrent theme in recent minimalist theorizing that
head movement sticks out in a typology of movements as exceptional, and hence its status in
the computational system (CS) itself is questionable. Recent reactions to the problematic
nature of head movement follow two markedly different paths. According to one view head
movement phenomena are not syntactic in nature but are part of the PF branch of the
computation (e.g., Chomsky, 2000, 2001; Boeckx and Stjepanović, 2001), while according to
the other prominent approach, they are syntactic, but they do not involve the movement of a
head but of a remnant XP (e.g., Sportiche, 1998; Mahajan, 2000, 2003; Koopman and
Szabolcsi, 2000). These are both perfectly feasible approaches, and several others are also
readily conceivable, and indeed have been proposed. What must be recognized, however, is
that on fairly standard assumptions about the syntactic computation (of the sort made by
Chomsky, 2000 et seq.), narrow syntactic head movement is expected to exist insofar as
syntactic heads are extant in the model (as entities potentially distinct from phrasal categories,
cf. Brody, 2000)—unless, of course, independent reasons conspire to block it from ever
arising. This is because the operation of Move (involving Merge) is not sensitive to
“projection level,” if at all this notion has any status in the theory. Unless further conditions
are added, Move should be able to apply to heads just as Merge is (the null hypothesis).
∗
The present paper, which has gone through a relatively long gestation period, develops the analysis of head
movement in Surányi (2004a/2000, 2002), drawing on Surányi (2004b). Thanks are due for questions and
comments to the audiences at GLOW 26 in 2003, TiLT 2004, an NYU Syntax/Semantics colloquium (November
2005), and the Sounds of Silence conference, where different versions of this material have been presented. I am
grateful to the audiences there, as well as to Misi Brody, Katalin É. Kiss, Eric Reuland, Michal Starke, and Peter
Svenonius for questions, comments and for discussions both short and long. Special thanks are due to two
anonymous reviewers for their suggestions and criticism, as well as to the editors of the present volume for their
kind patience. While working on this material, I received the support of grant #D048454 of HSRF (Hungarian
Scientific Research Fund), and participated in the HSRF project #TS049873.
2 Sounds of Silence (provisional book title)
I argue that indeed this null hypothesis can be upheld, embedding it in a restrictive
model allowing only strictly cyclic derivations. I demonstrate that it is possible to retain the
descriptive benefits that have motivated syntactic head movement, and to do away with the
unwanted complications at the same time. According to the view
defended here, HM does not involve adjunction as in Chomsky’s (1993, 1995) checking
theory approach, but—in terms of generalized transformations—it is merger with the root,
followed by ‘projection’ of the raised head. This means that under the right conditions a head
H can be moved out of the current phrase marker K, merging H with K and projecting H into
HP, with K a complement of H, as below.
1. [HP Hi [K … (Hi) … ] ]
The structural description for HM in (1) is not new as such; an analysis fundamentally along the
same lines, though with completely different motivation and descriptive focus, was proposed
by Ackema, Neeleman and Weerman (1993). It is a fact that the idea expressed in (1) has not
gained a great deal of acceptance in minimalist theorizing since it was first entertained. The
reason is, I believe, that the idea has not been worked out in sufficient detail, and to the extent
details have been given, the account has remained less than compelling. Indeed, proposing (1)
as a structural description is insufficient as a theory of the basic properties of head movement
phenomena: in several ways it does not in itself provide an answer to the set of challenges
head movement phenomena pose. The proper identification of the precise mechanisms of how
(1) can be, and why it must be, brought about in the derivation is required if (1) is to offer a
reasonable hope of success both in terms of descriptive coverage and in terms of parsimony.
This is the goal pursued in the present study.
The paper is organized as follows. In Section 2, I review some frequently noted, and
some less often recognized complications incurred by head movement as conceived of in a
checking theory of movement, arguing that head movement as modeled there is not merely
susceptible to definitional difficulties, but in fact bleeds from more wounds than commonly
acknowledged. Section 3 provides a context for the proposal of the paper by briefly
presenting the scene of competing alternatives currently being explored. In Section 4 I present
the key assumptions which my alternative is based on. Each of these theses is motivated
independently of HM, and they are shown to interact to yield an elegant account of HM that
resolves all of the problems reviewed in Section 2. I conclude in Section 5 by highlighting the
repercussions of the analysis for the syntax of null heads, and by a summary of key results.
2. HEAD MOVEMENT IN CHECKING THEORY
Head movement is described in early minimalism (Chomsky 1993, 1995) as an adjunction
operation, moving a lower head element to adjoin to a higher head category (see Baker 1988).
2. [YP [Y′ [Y Xi Y] [XP [X′ [X (X)i ] [ . . . ] ]]]]
Adopting a lexicalist view, Chomsky (ibid.) assumes lexical items (LI) to be inserted fully
inflected, moving by head-to-head movement to inflectional functional heads to check their
features either in overt or in covert syntax, allowing for variation across languages and
constructions. In this section I enumerate a few of the problems for this account of HM.
The most conspicuous of these, often noted in the literature (and of course realized by
Chomsky too), are (i) and (ii) below (e.g., Brody 1997a/2000; Mahajan 2000, 2003).
(i) The moved head does not trivially c-command its trace position: X in (2) does not
c-command out of Y (cf. also the Proper Binding Condition on movement). Accordingly, the
definition of c-command needs to be complicated in terms of a distinction posited between
containment vs. dominance / segment vs. category.1
(ii) Head movement apparently defies the Extension Condition (EC) on Generalized
Transformations (the condition that all transformations should extend the root), i.e., it is
counter-cyclic. The EC is best viewed as a consequence of the definition of the basic
operations of CS (see Kitahara, 1995; Watanabe, 1995), not a filter on derivations.2 Note that
the EC and the C-command Condition are logically independent, and as determined by the
choice of the model, each one may or may not implicate the other (cf. Fn. 2).
(iii) A further shortcoming of the approach is that the special locality of HM,
unmatched in the domain of phrasal movement, does not receive a genuine explanation. The
strict locality involved can be summed up roughly by the generalization that HM cannot skip
any c-commanding head position, captured by Travis’s Head Movement Constraint (HMC),
subsumed under the ECP. The restriction that head movement cannot proceed via
excorporation plays a crucial role in accounting for HMC effects. If excorporation were
allowed, then a (non-null) head could incorporate into another (non-null) head, and then raise
on by excorporating movement, effectively voiding the HMC. (The ban on excorporation
entails that, in contrast to phrasal movement, head movement cannot be successive cyclic.)
Nevertheless in minimalist theory the prohibition against excorporating movement is not
properly derived from an independent source, and remains stipulative (see Brody, 1997a/2000
for an elaboration of this point, esp. Brody, 2000, note 7; cf. Roberts, 1991 for an antecedent-government based account).3
1
Arguably, the problem with c-command does not arise in (2) if the c-command condition holds of the Agree
relation: Y does c-command (the original copy of) X.
2
Chomsky’s (1993, 1995) execution of a Copy+re-Merge theory of movement effectively derives the EC.
Compare Chomsky’s (1995) strong feature based account of cyclicity (dubbed featural cyclicity by Richards,
1999), as well as Chomsky’s (2000 et seq.) Least Tampering Condition (LTC), according to which once
established, structural relations should not be altered later on in the derivation. Both accounts allow for tucking-in multiple specifiers (see Richards, 1999), and head movement by adjunction. Tucking-in, however, need not
involve any non-root-extending operations: multiple movements to a single head may be taken to be
simultaneous (see Hiraiwa, 2001; Chomsky 2005b), or to proceed from inside out (see Surányi, 2005). For
critical discussion of Chomsky’s (2000) LTC, see Surányi (2006).
Problem (ii) can be avoided in more permissive models where interarboreal operations (hence sideward
movements) are assumed to be available (Bobaljik and Brown, 1997; Nunes, 2001). In such models, however,
other conditions must be invoked to appropriately restrict sideward movement.
3
Various kinds of apparent long head movement phenomena described among others by Rivero (1991, 1994),
Roberts (1994) and Borsley et al. (1996) are assumed here not to involve HMC-violating HM, but to be
amenable to either an XP-movement analysis (e.g., Broekhuis and Migdalski, 2003), or an overt HM analysis
(Bošković, 1995, 2001) or a covert head movement analysis (Ackema and Čamdžić, 2003 for Serbo-Croatian), or
a non-movement (base generation plus prosodic clustering) analysis (Schafer, 1997 for Breton). As for
excorporating sub-extraction from complex heads, Julien (2002, 66ff) argues at length that some apparent cases
of excorporation are indeed merely apparent.
Even if we grant the unavailability of excorporation, it is still not clear why a
functional head P cannot attract a head G2 which is further down in the hierarchy than the
immediately next lower head G1, as schematized in (3).
3. [ P
[ G1
[ G2 … ] ] ]
This is because principles of locality like Relativized Minimality and its descendants
formulated in terms of Closeness to the attracting head P, are selective: they are sensitive to
(classes of) features. The locality of HM sticks out in that being a head is not a featural
property (so it is not clear why only heads are affected), and also in that it is non-selective
(any head is an intervenor for HM). In short, the special locality of HM is not properly
derived.4
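The contrast between selective, feature-based locality and the non-selective locality of HM can be made concrete with a small sketch. It is an illustrative toy only, under stated assumptions: the function names and the features f and g are hypothetical, and the hierarchy in (3) is simplified to a top-down list of the heads below P.

```python
from typing import Dict, List, Optional, Set


def closest_by_feature(feature: str,
                       lower_heads: List[str],
                       feats: Dict[str, Set[str]]) -> Optional[str]:
    """Selective locality (Relativized Minimality style, simplified):
    only heads bearing `feature` count as candidates or interveners."""
    for head in lower_heads:          # heads are listed from highest to lowest
        if feature in feats[head]:
            return head
    return None


def hmc_candidate(lower_heads: List[str]) -> Optional[str]:
    """Non-selective locality (HMC, simplified): only the immediately
    next lower head may raise, whatever features it bears."""
    return lower_heads[0] if lower_heads else None


# The configuration in (3): P above G1 above G2.
lower_heads = ["G1", "G2"]
feats = {"G1": {"g"}, "G2": {"f"}}    # hypothetical feature assignment

print(closest_by_feature("f", lower_heads, feats))  # G2
print(hmc_candidate(lower_heads))                   # G1
```

On the feature-based notion of Closeness, G1 does not intervene between P and G2 when it lacks the relevant feature, whereas the HMC singles out G1 regardless. The sketch merely restates the puzzle; it does not derive either notion of locality.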
(iv) Head movement also incurs complications with regard to the Uniformity
Condition on chains (a descendant of Chomsky’s (1986) formulation of Structure
Preservation), which demands that chain links be of uniform projection level (Chomsky
1995). This is because, strictly speaking, a head-chain is non-uniform on the Bare Phrase
Structure theory (Chomsky 1994), which draws on relational definitions of projection levels.
The lower link L1 of a head chain projects, hence it is non-maximal (in fact, minimal), while
the higher link L2 does not project, hence it is maximal.
4.        HP
         /  \
       H2    LP
      /  \    |
    L2    H1  L1
The Uniformity Condition is apparently too strong to allow HM. To evade this complication,
Chomsky (1995) exempts head-internal elements from its application by more or less direct
stipulation (see Chomsky, ibid., 321–322; cf. Brody, 1998 for relevant discussion).
The Uniformity Condition (UC) also plays a role in warranting the strict locality of
HM. For, if a head could move into a specifier slot, assuming the position and status of a
phrase, then it would be free to move on from that position as a phrase. The UC is
instrumental in preempting such head movements. Nevertheless, reliance on the UC is a
marked departure from minimalist ideals. The condition is a purely syntax-internal principle
with no well-substantiated syntax-external motivation, and as such it had better not be
postulated if possible.
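The relational definitions of projection level that give rise to the complication in (iv) can be sketched as follows. This is a minimal illustration under assumptions of my own: the class and function names are hypothetical, each phrase records which daughter projects, and in the structure modeled on (1) the lower copy of H is assumed to project K.

```python
class Node:
    """A syntactic object; `projector` is the daughter that projects, if any."""

    def __init__(self, label, children=(), projector=None):
        self.label = label
        self.children = list(children)
        self.projector = projector
        self.parent = None
        for child in self.children:
            child.parent = self


def is_minimal(node):
    """Minimal: not a projection at all (a lexical item)."""
    return not node.children


def is_maximal(node):
    """Maximal: does not project any further."""
    return node.parent is None or node.parent.projector is not node


def uniform(chain):
    """A chain is uniform if all its links have the same projection level."""
    return len({(is_minimal(n), is_maximal(n)) for n in chain}) == 1


# Head adjunction as in (4): L2 adjoins to H1; the lower link L1 projects LP.
l1 = Node("L")
lp = Node("LP", [l1, Node("ZP")], projector=l1)
l2, h1 = Node("L"), Node("H")
h2 = Node("H", [l2, h1], projector=h1)
hp_adjunction = Node("HP", [h2, lp], projector=h2)
adjunction_chain = [l2, l1]

# Root re-Merge as in (1): the raised H projects HP over K.
h_lo = Node("H")
k = Node("K", [h_lo, Node("ZP")], projector=h_lo)
h_hi = Node("H")
hp_remerge = Node("HP", [h_hi, k], projector=h_hi)
remerge_chain = [h_hi, h_lo]

print(uniform(adjunction_chain))  # False
print(uniform(remerge_chain))     # True
```

In the adjunction structure the higher link is maximal while the lower one is not, so the chain is non-uniform; in the structure modeled on (1) both links are minimal and both project.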
(v) Since a moved head is allowed to adjoin to another in Chomsky’s model, it is not
clear how to rule out the analogous movement of a phrase into a head. Such movement will
not violate the Uniformity Condition (whether or not head-internal elements are exempted
from its application) since the relevant chain links would both be maximal. Chomsky (1995:
319) reasons that “the morphological component gives no output (so the derivation crashes) if
presented with an element that is not an X⁰ or a feature.” In a strongly lexicalist theory this is
a paradoxical statement: assuming morphology to assemble words prior to syntactic
computation, the morphological component will never be informed of what a given word
combines with in syntax.
4
See also Chomsky (2001, note 67) for a hint at this general problem. Pesetsky and Torrego (2001) claim that if
and only if a head attracts its sister, it is the head of its sister category that will raise; in all other cases the whole
phrase gets pied-piped. This is of course only a re-formulation, rather than a resolution, of the puzzle itself.
(vi) The treatment of HM as attraction to another head runs into complications with
respect to the strong/weak distinction (no matter whether the distinction is encoded by feature
strength proper or [EPP], or by some other technology). Various syntactic contexts involve an
alternation in terms of the location occupied by a given head-level element E at Spell Out
between a high and a low head position (H and L) within the relevant hierarchy of functional
projections. Frequently there is at least one functional head in between H and L, which E
passes through in the course of its raising. For instance, in some Germanic languages we find
clauses that have the finite verb in the C position in some context, but inside the VP in some
other context (e.g., clauses embedded under a subset of bridge verbs in German, and in
Frisian, see Vikner, 1995; de Haan, 2001). The former is the case in (5a) (assuming a uniform
V2 analysis), where the verb passes through T en route to C. The latter situation is illustrated
by (5b), where the embedded C is filled by a lexical complementizer (parentheses around V
mark the landing site of covert V-movement). V-to-T is not licensed by the syntactic context
at hand.
5. a. [CP [C [T [ V ] T ] C ] … [TP SU [T [ V ] T ] … [VP [V V ] ]]]
      Sie sagte, [CP sie wolle keine Bücher    kaufen]
      she said       she wants no    books-acc buy
      ‘She said that she didn't want to buy any books.’
   b. [CP [C C ] … [TP SU [T [(V)] T ] … [VP [V V] ]]]
      Sie sagte, [CP dass [TP sie keine Bücher kaufen wolle]]
A similar dissociation of overt V-to-T (which is lacking) and overt V2 is found in main
clauses in Mainland Scandinavian and Faroese (see Zwart, 2001 for discussion). In fact, an
elaborate cartographic approach entails that analogous alternations are rather wide-spread in
language, and not only in the clausal domain. For instance, the presence and absence of V-to-T within the same language is also of the same nature, with a set of functional heads (incl.
aspectual, modal, adverbial, agreement etc.) intervening between H and L.
Unless we make additional assumptions, the set of intervening heads in such scenarios
must be characterized as alternating between being uniformly strong and being uniformly
weak, so that they attract the moving head element E step by step in one context but not in the
other. This pattern may be encoded by a transitive (sub)set of selectional requirements,
‘passing’ information about the choice of the filler of H starting out from H all the way down
to the head selecting E. This would be a curious state of affairs, given that, besides the choice
of the filler of H, there is typically no other, independent difference in the featural properties
of these (series of) intermediate heads. In other words, what one is forced to assume is (i)
selection for the strength property itself (however strength may be represented), and (ii)
systematic lexical ambiguity (optionality) in terms of strength. Both these postulates are
undesirable in minimalist explanation.
(vii) In a checking theory of head movement to functional projections, the functional
heads that are identified as landing sites for HM are typically phonologically empty (see, inter
alia, Starke, 2001; Fanselow, 2003/2004). This must also be considered a drawback of the
standard checking approach to HM, in so far as it makes the postulation of systematically
phonologically empty functional heads inevitable. Paradoxically, if a syntactic head must
systematically be phonologically empty according to the theory, this weakens the motivation
of positing that head to begin with.
Further doubt is cast on a checking theory approach to HM if we consider feature
strength of a functional head F that at Spell Out hosts some attracted word form generated
under a lower head L. On a checking theory of HM, F bears an uninterpretable feature [uF]
that is targeted by HM. This [uF] is typically assumed to be some feature of the lexical head
below the extended projection hierarchy that F is a member of (e.g., a [V]-feature, see
Chomsky 1993, 1995), or a feature of the head of the complement of F (e.g., a [T]-feature, if
F=C). The problem is posed by alternations similar to that illustrated in (5) above. It is a
common scenario that a functional head F behaves radically differently in terms of HM when
it is null from the way it behaves when it is non-null. Typically, when F is null itself and
houses the raised inflected stem at Spell Out (e.g., when T is filled with the raised verb), then
it does not overtly attract the word formed from the same stem when it is non-null, i.e., when
it is filled by some element of category F (e.g., when T is filled by a modal). In other words,
the non-null counterpart of a strong null functional head is weak.
Such a generalization is not explained on a checking theory of HM. Given that in
checking theory words are inserted fully formed, morphology must be pre-syntactic
(otherwise words could be formed in syntax via HM). On this approach nothing in principle
excludes that an inflectional head that serves as the target of HM be morpho-phonologically
non-null itself, as HM to a head-adjunction position will not feed morphology. This option is
instantiated, for instance, in the syntax of verbal complexes of some languages (cf. É. Kiss
and Van Riemsdijk, 2004), and various other overt incorporation phenomena. As there is
apparently no correlation within the class of functional heads between the property of being
null and the property of being strong, or between the property of being non-null and the
property of being weak, the systematic alternation in terms of strength described immediately
above remains unaccounted for.
The way out for checking theory is to deny that there is strength alternation in such
cases. The idea would be that F is uniformly strong, and when it is generated as filled by a
lexical item of category F (LI[F]), its strength is saturated by (LI[F]) itself. The way to
implement this intuition would be to analyze LI[F] as being Merged itself to F in an adjoined
position, from where it can check F. However, such an account is problematic, as it introduces
phonologically radically empty functional heads, i.e., heads that are invariably unpronounced.
This approach poses a complication for the semantic analysis as well. This is because either it
is the case that the silent head F is meaningful when it attracts an inflected stem, but
meaningless when LI[F] adjoins to it (as in this case it is LI[F] that gets interpreted), or
alternatively, F is meaningful in both cases, and LI[F] itself is meaningless—neither analysis
being particularly attractive.
What has been demonstrated in this section is that HM as implemented in checking
theory not only suffers from definitional problems related to c-command and the Extension
Condition, as is commonly recognized, but is much more deeply flawed. For a discussion of
further shortcomings, see Surányi (2006) and references there.
3. RECENT REACTIONS
We have seen that there are ample grounds to abandon the checking theory approach to HM.
In the course of the past couple of years, although not necessarily for the same reasons,
various researchers have proposed to reinterpret head movement phenomena. This section
will serve to provide the context within which the proposal advocated in this paper is to be
situated, presenting the core conceptions of the main directions of recent research. I cannot
hope to offer a balanced discussion of the alternatives within the confines of these pages; but
see Surányi (2006) for a critical discussion.
One avenue to take is to deny that syntactic head movement exists. As noted in
Section 1, this is a (defendable) departure from what is prima facie the null hypothesis,
namely that the operation of movement (just like Merge, which on most accounts is also
involved in Move) is applicable to syntactic objects independently of their internal structure.
This genre of approach has several different varieties.
One type of account ascribes head movement phenomena to variation in the spell
out of PF-features only, without assuming syntactic HM proper. The best known suggestion
to this effect is that of Chomsky (2000), who proposes to relegate head movement phenomena
to the PF-branch of the computation, taking them to involve PF reordering (see also Boeckx and
Stjepanović, 2001).5 A different option under the same general rubric is proposed by Brody
(1997a/2000) within his Mirror Theory, according to which head displacement effects derive
from variation in the choice of the head where a sequence of related heads get spelled out as a
morphological word. Zwart’s (2001) proposal is a particular combination of the previous two
proposals, according to which the spell out position of a lexical head is determined by
phonological movement of the LEX-features (lexical features) of the lexical head within a
chain of (F-)related heads. Bobaljik (1995) is in the same general vein as the latter two
accounts, although that work only assumes the lowering of heads to take place on the PF-branch (keeping overt HM in the syntax). A more distant precursor, of course, is the Affix
Hopping rule of early generative grammar.6 Harley (2004) takes PF-features of elements to
migrate in the syntactic tree through the mechanism of projection.
A second alternative approach that has been taken, still denying the availability of
syntactic HM, is that apparent head movement is the result of remnant phrasal movement.
This account is put forward for instance by Sportiche (1998), Mahajan (2000, 2003) and
Koopman and Szabolcsi (2000). Mahajan (2003) proposes that a remnant VP raises to check
T in an inner [Spec,TP], and Koopman and Szabolcsi (ibid.) develop an account of verbal
complexes relying on remnant phrasal movement, but not on HM. Snow-balling phrasal
movement can generate the same surface order of head elements as roll-up head movement in
a model with HM, and various constraints interact to rule out unwanted XP-movements.
Indeed, more generally, it needs to be ensured on this type of approach that the remnant XPs
include just the head (or a unique pre-head specifier, as in Müller, 2004), and that they are
allowed to move only in ways similar to the manner in which heads can move on a theory
based on HM.
5
This suggestion, besides not being worked out in great detail, has met with considerable criticism, both
conceptual and empirical (see e.g., Zwart, 2001; Matushansky, 2006: Appendix, and references there). The
empirical criticism has mostly focused on one of the key predictions of the view, namely the expected lack of
semantic effects associated with head movement. It has been argued that head movement is far from being
uniformly semantically vacuous (e.g., Benedicto 1997; Cinque 1999: 102f, 184, Fn. 8; Müller 2001; Lechner
2005).
6
Bobaljik (1995), developing the core ideas of Distributed Morphology (Halle and Marantz, 1993), accounts for
low position inflected stems (such as the verb in English), by a hybrid approach. On that approach, stems
appearing higher than their base position are displaced by overt HM as in checking theory, while affixes
appearing on stems lower than their base position are displaced by a morphological merger PF operation that
effectively lowers the affix onto the stem prior to Vocabulary Insertion (see also Embick and Noyer, 2001). In so
far as the mechanism of overt head movement remains unchanged, such a theory is subject to the problems
discussed in Section 2. To the extent that (overt) HM and Affix Hopping have analogous effects and descriptive
conditions (e.g., identity in strictness of locality, class of interveners), the question of redundancy of these two
operations arises in a sharp form. In addition, the PF merger operation needs to access relatively large chunks of
syntactic representation, which are simply not available in a cyclic Spell Out model.
It is also possible to combine the view that the moved head is in fact an XP with the
view that some rearrangement is carried out in the PF component. Matushansky (2006) offers
such an analysis. She assumes that heads can undergo syntactic movement, but instead of
landing in a head position, they land in an XP position, i.e., in a specifier slot (see Toyoshima,
2000, 2001 for this idea). A special reanalysis operation, having stipulated properties, then
combines the head in the specifier position with its host head, and a further condition (called
Transparence Condition) ensures the locality of head movement.
4. THE PROPOSAL
The proposal I advocate in this paper differs from those reviewed in the preceding section in
that it assumes neither that head movement involves some PF operation, nor that heads only
move qua phrases or only to phrasal positions. The account I develop below maintains that
Move can and does apply to heads too, taking them to landing sites that qualify as head
positions. Of course, only if such an account is shown to avoid the unwanted complications
discussed in Section 2 above can we hope to retain the descriptive benefits that motivated
head movement in the first place. This is the objective of the present section.
4.1 The Re-Merge and Project Hypothesis
The alternative I advocate here is based on the mechanism schematized in (1), reproduced
below as a tree-diagram. (1) encapsulates two assumptions: (A) HM uniformly involves root
re-Merger of a head, and (B) the re-Merged head projects (I abbreviate (A) and (B) together
as the RPH, the Re-Merge and Project Hypothesis).
6.    HP
     /  \
    H    K
         |
        (H)
Assumption (A) is a subcase of the more general assumption (A′) that at any derivational
point, Merge (and re-Merge) can only apply at the root. The particular approach to movement
I will adopt below is a version of copy theory (Chomsky 1993, 1995), according to which
Move is Copy + Re-Merge. Assumption (B) is of course much less trivial. I argue first that
the syntactic scenario that (B) involves is an available option.
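The derivational step that the RPH describes can be sketched as Copy followed by re-Merge at the root, with the copy projecting. The sketch is illustrative only and rests on assumptions of my own: the names `Node`, `find`, `head_move` and `show` are hypothetical, the label of the new phrase is written simply as the head label plus "P" as in (1), and the lower copy is marked as silent.

```python
from dataclasses import dataclass, replace
from typing import Optional, Tuple


@dataclass(frozen=True)
class Node:
    label: str
    children: Tuple["Node", ...] = ()
    silent: bool = False  # True for the lower copy left behind by movement


def find(node: Node, label: str) -> Optional[Node]:
    """Return the first pronounced node with the given label (depth-first)."""
    if node.label == label and not node.silent:
        return node
    for child in node.children:
        hit = find(child, label)
        if hit is not None:
            return hit
    return None


def _silence(node: Node, label: str) -> Tuple[Node, bool]:
    """Rebuild the tree with the first pronounced `label` node marked silent."""
    if node.label == label and not node.silent:
        return replace(node, silent=True), True
    new_children, done = [], False
    for child in node.children:
        if not done:
            child, done = _silence(child, label)
        new_children.append(child)
    return replace(node, children=tuple(new_children)), done


def head_move(root: Node, head_label: str) -> Node:
    """Copy a head inside `root`, re-Merge the copy with `root`, project the copy."""
    head = find(root, head_label)
    if head is None or head.children:
        raise ValueError("no movable head with that label")
    copy = replace(head)                              # Copy
    remnant, _ = _silence(root, head_label)           # lower copy stays in K
    return Node(head_label + "P", (copy, remnant))    # re-Merge at the root


def show(node: Node) -> str:
    name = f"({node.label})" if node.silent else node.label
    if not node.children:
        return name
    return "[" + name + " " + " ".join(show(c) for c in node.children) + "]"


k = Node("K", (Node("Spec"), Node("H")))
print(show(head_move(k, "H")))  # [HP H [K Spec (H)]]
```

Nothing in the sketch explains why the re-Merged head should project rather than K; that choice is simply built into `head_move`.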
4.1.1 Projection of moved heads: an argument for possibility
The question whether a moved element can project is raised and answered in the negative by
Chomsky (1993, 190ff; 1994, 19; 1995, section 4.4.2; 2000, section 5), making the
generalization that it is always the target of movement that projects. Brody (1998, section 6),
however, argues that the way Chomsky derives the ‘Target Projects’ generalization is both
conceptually and empirically flawed. Chomsky’s argumentation is based in part on the
Uniformity Condition. However, as pointed out in Section 2 above, in a theory without
syntax-internal filters nothing like the Uniformity Condition should exist.7 That does not
necessarily mean that we should find non-uniform chains all over the place in syntax: the
Uniformity Condition may simply turn out to be redundant, if the computational system itself
is such that it cannot generate structures that would violate it. To be sure, the structure for HM
pictured in (1) involves a uniform head chain: the moved H is minimal in both chain link
positions. If (1) is indeed the structural description for HM, the view defended in this paper,
then HM only yields uniform chains.
Consider now the argument against the projection of a moved head advanced in
Chomsky (1994). There it is suggested that should a head H move, attach to its own
projection HP, and project itself into a higher HP (as in [HP H [HP Spec [HP H Compl]]]), the
generated higher HP would have two heads, creating a headedness ambiguity. However, it is
important to observe that the headedness ambiguity can arise to begin with only if H₁⁰ and H₂⁰
are categorially non-distinct. As Brody (1998, 391) notes, it is not explicitly stated by
Chomsky why such a headedness ambiguity should result in crash, and indeed it is not clear
what interface requirement such an ambiguity would violate. At any rate, since according to
the proposal I develop below H₁⁰ and H₂⁰ are categorially distinct, no headedness ambiguity
can arise.
It appears then that the analysis in (1), according to which H provides the label for the
phrase that is built when HM Merges H to the root, is not precluded as such. The more
difficult question concerns why H should project in such a scenario. This issue will be taken
up in Section 4.2.
4.1.2 What the RPH buys
Let us consider now what the welcome consequences of an analysis of HM along the lines of
(1) would be with regard to the problems noted in Section 2 for a checking theory approach to
HM. (i) First, the moved head c-commands its pre-movement position trivially; no
definitional problems arise. (ii) Second, the moved head extends the root. Head movement is
no longer exceptional in this regard (and there is no need for the introduction of the Least
Tampering Condition); compare Fn. 2. (v) As movement operations extend the root,
movement into a head never occurs; a fortiori, it does not occur with XP-movement into a
head either. In short, analyzing HM as in (1) allows properties (i), (ii), and (v) to fall out from a
restrictive definition of movement (where movement can only affect an element inside the
current root and it has to re-Merge it to the current root itself), such as Chomsky’s (1993,
1995) Copy+re-Merge theory.8 Note that these results are shared by any theory of head
movement which assumes the same theory of movement, including accounts that take head
movement to involve XP-movement. The real challenge is how to make all the other
properties follow as well.
7
Carnie (1995) and Toyoshima (2000, 2001) argue convincingly against a condition like the Uniformity
Condition on theoretical grounds (see also Gärtner (2002, 88–90) for discussion of how the Uniformity
Condition is both too strong and too weak). Further, if it did exist as a principle, the Uniformity Condition would
inevitably push the model of syntax back to a mixed derivational–representational type (cf. Brody, 2002).
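Points (i) and (ii) can be checked mechanically on simplified versions of the structures in (2) and (1). The sketch below is an illustration under assumptions of my own: it uses the simplest definition of c-command (no segment/category distinction), omits intermediate bar levels, and the names `c_commands` and `merged_at_root` are hypothetical.

```python
class Node:
    def __init__(self, label, *children):
        self.label = label
        self.children = list(children)
        self.parent = None
        for child in children:
            child.parent = self


def dominates(a, b):
    """True if a properly dominates b."""
    node = b.parent
    while node is not None:
        if node is a:
            return True
        node = node.parent
    return False


def c_commands(a, b):
    """Simple c-command: neither node dominates the other,
    and the mother of a dominates b."""
    if a is b or dominates(a, b) or dominates(b, a):
        return False
    return a.parent is not None and dominates(a.parent, b)


def merged_at_root(moved, root):
    """True if the moved element is an immediate daughter of the root,
    i.e. the operation that placed it extended the root."""
    return moved.parent is root


# Head adjunction, cf. (2): X adjoins to Y, leaving a copy inside XP.
x_hi, y, x_lo = Node("X"), Node("Y"), Node("(X)")
xp = Node("XP", x_lo, Node("ZP"))
y_complex = Node("Y", x_hi, y)
yp = Node("YP", y_complex, xp)

# Root re-Merge, cf. (1): H re-Merges with K and projects HP.
h_hi, h_lo = Node("H"), Node("(H)")
k = Node("K", Node("Spec"), h_lo)
hp = Node("HP", h_hi, k)

print(c_commands(x_hi, x_lo), merged_at_root(x_hi, yp))  # False False
print(c_commands(h_hi, h_lo), merged_at_root(h_hi, hp))  # True True
```

The adjoined head neither c-commands its lower copy nor extends the root, whereas the re-Merged head in the structure modeled on (1) does both without any amendment to the definitions.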
As for (vii), the systematic phonological emptiness of inflectional heads targeted by
HM, no such inflectional heads are assumed to exist prior to HM, thus there is no need to
explain why they should be systematically null. Whether the remaining issues raised in
Section 2 are also accounted for by a theory of HM incorporating the derivational segment in
(1) will greatly depend on further assumptions, which are logically independent of (1).9
In particular it remains to be ensured that the re-Merged head must project, which
renders head chains uniform (iv), allowing the Uniformity Condition to be a descriptive
generalization, rather than a syntax-internal principle. The strict locality of HM (i.e., the
HMC) also awaits an account (iii). The second problem noted in (vii) regarding the
unexplained alternation between null and filled functional heads in terms of feature strength,
as well as the apparent strong/weak transmutation of sequences of intermediate functional
heads discussed in (vi) are not yet resolved. Finally, adopting the minimalist assumption of
the Last Resort character of syntactic movements, given that the movement of H in (1) cannot
be driven by an attracting feature, the question arises as to what the trigger of HM is on the
RPH. These are the issues that will be addressed in the remainder of this section.
8
I adopt Chomsky’s (1993, 1995) copy theory of movement (see also Munn 1994, Kitahara 1997), rather than a
multi-dominance (Internal Merge) theory (see Starke 2001, Gärtner 2002, Chomsky 2004 etc.). The copy theory
allows for each copy to have a distinctly modifiable lexical content, as discussed in Sauerland (1998, 2004) and
Fox (2000), as well as distinct spell out forms in different positions of the chain (see e.g., Nunes, 2001). See also
Brody (2004), who argues extensively against the view that it is a logical necessity that Internal Merge should
exist, contra Chomsky (2004). He also shows that some phenomena remain puzzling if movement involves
nothing more than Internal Merge.
Note that the operation of Copy need not be conceptualized as being specific to movement: Copy can
be involved in accessing the Lexicon (or the Numeration, should there be one), and there is no reason to block
the application of Copy followed by Merge outside of the realm of canonical movement dependencies. Sideward
movement (as in parasitic gap licensing) may involve the latter case (cf. Nunes 2001, 2004; Hornstein 2001). On
the view according to which the question whether two elements are related as being in a ‘movement’ dependency
is an interface matter (see Brody 2002, cf. also Collins 1997: 90–91), applying Copy+Merge in non-movement
cases as well is harmless (it yields accidental coreference for lexical DPs, for instance).
9
In fact, several researchers have advocated the treatment of HM along the lines of (1) in recent years. Work by
Koeneman (2000), Surányi (2004a/2000, 2002, 2003), Bury (2003a, 2003b) and Fanselow (2003/2004) all
incorporate this basic idea in one form or another, diverging with respect to virtually all their further assumptions
(see also Nash and Rouveret, 1997). An approach to verb movement along the lines of (1) was proposed, albeit
for rather different reasons from those considered here, by Ackema, Neeleman and Weerman (1993) (ANW),
whose analysis is an important antecedent of each of these recent accounts of HM, as well as by Holmberg
(1991). The reader is referred to Surányi (2006) for a critical review of ANW, the development of their ideas by
Koeneman (2000), as well as the implementations by Fanselow (2003/2004) and Bury (2003a, 2003b).
4.2 Head movement and cyclic Spell Out
4.2.1 Phase evacuation and head movement
Let us begin with the last question, that of the trigger for HM. Before addressing this issue,
we will have to make a brief detour to examine the nature of cyclic Spell Out, which will be
essential to the proposal to be made in this part.10 To anticipate the proposal, my claim will be
that HM is driven (in the sense of Last Resort) in the same way as (successive cyclic)
movement to intermediate phase edges. A central motivation for the hypothesis of phases
comes from the successive cyclic nature of non-local movements. I argue that if Spell Out is
cyclic at the phase level, as Chomsky (2000, 2001, 2004) assumes, then the property of
successive cyclicity of long movements can be captured without postulating P-/EPP-/OCC-features on phase heads, contra Chomsky (ibid.). The point elaborated on here was originally
made in Surányi (2004a/2000, 2002, 2003).
Chomsky’s EPP-(/P-/OCC-)based account of successive cyclic movement to edges
has various shortcomings, calling for an alternative. First, movement to an edge is technically
an optional movement, triggered by an optionally assigned uninterpretable feature. Such
optionality gives cause for concern: if such features could in general be optional, we would
expect optional (or optionally overt) movements to phase edges all over the place, contrary to
fact.
Second, if there is more than one phrase that has to move through the same edge, the
account based on an EPP-feature has to be complicated, either by introducing a complex
structure to the feature composition of heads in order to distinguish the individual instances of
the EPP feature (N.B. axiomatically, the set {EPP, EPP} is equivalent to {EPP}), or by
assuming that EPP-features can remain active even after an element has been moved to the
edge. This latter account, however, effectively introduces unwarranted complexity to the
theory of checking and Last Resort.
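The point about {EPP, EPP} can be illustrated with a minimal sketch. It assumes, purely for illustration, that the feature composition of a head is modelled as a Python set; the variable names are hypothetical and are not part of any proposal discussed here. Two instances of EPP collapse into one, so keeping them apart requires additional structure, for instance a multiset or indexed features.

```python
from collections import Counter

# Assumption: a head's feature bundle is modelled as a plain set.
# Duplicate members collapse, so {EPP, EPP} is just {EPP}.
head_features = {"EPP", "EPP"}
print(len(head_features))

# Extra structure, option 1: a multiset that counts feature instances.
multiset_features = Counter(["EPP", "EPP"])
print(multiset_features["EPP"])

# Extra structure, option 2: indexed features, each instance distinct.
indexed_features = {("EPP", 1), ("EPP", 2)}
print(len(indexed_features))
```

Either option amounts to enriching the feature composition of heads, which is the complication noted in the text.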
Third, an EPP-based account of successive cyclic movement in a multiple Spell Out
model fails to extend to long covert movement dependencies, induced by probes checked
against some goal inside a lower phase (see, e.g., Ochi, 1999). Assume for instance that an in
situ wh-phrase WhP in an embedded clause in some language needs to undergo Agree with
the matrix C. WhP does not undergo overt movement, hence it cannot be attracted to the edge
of the embedded CP by an EPP feature on C. This means that WhP will not be accessible to
matrix C when it is merged in, and also that WhP will lead to crash when the embedded CP
undergoes Transfer. Cases like this have led some researchers to propose that interpretation is
cyclic at the phase level only with respect to phonological spell out, and semantic
interpretation takes place only at the end of the whole derivation. This allows Agree that is
not accompanied by overt displacement to stretch across phase boundaries (e.g., Stjepanović
and Takahashi, 2002; Legate, 2005; Bošković, to appear). This move effectively pulls back
from the objective of minimizing computational complexity (a key motivation for the
introduction of phase-based interpretation, see Chomsky 2000, 2001), and leaves the model
with the conceptual problems of a mixed derivational/representational theory (see Brody,
2002).
The alternative that is able to cover cases of long covert movement dependencies and
that is consistent with an approach that strives to be radically derivational is to treat covert
10
I use the term Spell Out to cover mapping to both interface components, i.e., in the sense of Chomsky’s (2004)
Transfer, in keeping with the earlier use of the term in Chomsky (2000, 2001).
movement dependencies (on a par with overt movement) as involving category displacement
(see a.o. Nissenbaum, 2000; Pesetsky, 2000; Richards, 2001; Fanselow, 2001; Cheng and
Rooryck, 2002; Chomsky, 2004). However, if covert movement affects categories just like
overt movement does, and it is successive cyclic (when long) just like overt movement is,
then an EPP feature on phase heads cannot do the job of pulling uninterpretable elements up
to the phase edge, because the EPP by definition does not induce covert movements. Either
way, long covert movement dependencies pose a complication for the EPP-theory of
movement to intermediate phase edges.
Note, furthermore, that positing the EPP-feature on intermediate phase heads does not
help in explaining how the [uF] of the moved XP in its phase-internal position is deleted when
XP is raised to the phase edge. The elimination of the [uF] from XP is not derived by Nunes’
(2001) account of deletion in movement chains, according to which the [uF] is removed from
all occurrences in a movement chain except the one that participates in feature checking. If
successive cyclic A-movement through A-positions exists (which is currently under debate),
then it represents a different case, as the occurrences of the DP undergoing successive cyclic
A-movement form a chain internally to a single phase on Chomsky’s assumptions, hence
Nunes’ (2001) Chain Reduction can apply. It should be concluded that deletion of the [uF] off
the phase-internal occurrence of XP takes place as an operation independent of the checking
of [uF], which is not achieved by movement to an intermediate phase edge, triggered by an
EPP feature. The [uF] can be deleted off the phase-internal XP, as it is recoverable due to its
presence on the raised copy, and it must be deleted, as otherwise the Spell Out of the phase
containing XP results in crash.
This brings us to the fourth, and perhaps most critical problem. The complication is
that movement to intermediate phase edge positions is in fact determined locally, even
without the assignment of optional uninterpretable EPP features to phase heads. Recall that on
Chomsky’s (2000, 2001, 2004) account, the goal must bear some uninterpretable feature [uF]
if it is to be moved overtly. As Chomsky notes, “local determination is straightforward: […]
an uninterpretable feature in the domain at the phase level determines that the derivation will
crash” [unless it is moved to the phase edge, BS] (Chomsky 2000, 22). In other words, it can
be locally determined what the two available alternatives result in. Either no movement of the
offending XP applies, in which case the derivation crashes upon Spell Out at the next step, or
the uninterpretable XP is raised, allowing the offending [uF] to be removed in the phase-internal position, in which case the derivation avoids crash and can continue. In this light, the
introduction of the uninterpretable features on the phase head appears to be unwarranted:
movement that rescues XP out of the domain that is to undergo Spell Out to the edge position
is locally determined to be unavoidable, i.e., it is properly triggered, as required by Last
Resort. The conception of Last Resort is understood here as a general economy condition,
which can be formulated as in (7):
7. Last Resort
A syntactic (movement) operation is licensed by Last Resort iff it serves the
elimination of uninterpretability from the syntactic object that is fed to the external
components of sound and meaning (=Spell Out Domain).
The core of (7) is that a syntactic operation is allowed if it serves to avoid uninterpretability
(crash) at the interface (cf. Full Interpretation). Significantly, (7) is not a radical departure
from the narrow, formal feature based view of Last Resort according to which movement
must result in feature checking (cf. Chomsky, 1993); rather, it is merely an explicit
rationalization of that definition.11
To reiterate, a movement operation evacuating an element to the phase edge out of the
constituent that is sent to Spell Out (the Spell Out Domain) satisfies Last Resort whether or
not it also eliminates an uninterpretable P-/EPP-/OCC-feature on the phase head.
Consequently, this latter feature is redundant: it is not necessary to drive movement to edges.
In this manner, it is cyclic Spell Out itself that drives cyclic evacuating movements out of
Spell Out Domains.12
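The reasoning can be made concrete in a minimal sketch, under simplifying assumptions that are mine rather than part of the formal proposal: a Spell Out Domain is modelled as a flat list of elements, each carrying a set of unchecked [uF]s, and evacuation simply removes an element from the domain (in the text, the lower copy remains and its [uF] is deleted). All names are hypothetical. No feature on the phase head is consulted at any point.

```python
from dataclasses import dataclass, field

@dataclass
class Element:
    name: str
    ufeatures: set = field(default_factory=set)  # unchecked [uF]s

def spell_out(domain):
    """Return True if the domain converges, False if it crashes."""
    return all(not e.ufeatures for e in domain)

def evacuate(domain):
    """Split the domain into evacuated elements and the remainder.

    Licensed in the sense of (7): the operation removes
    uninterpretability from what is fed to the interfaces.
    """
    edge = [e for e in domain if e.ufeatures]
    remaining = [e for e in domain if not e.ufeatures]
    return edge, remaining

wh = Element("what", {"uWh"})
verb = Element("bought")
domain = [verb, wh]

converges_without_movement = spell_out(domain)
edge, remaining = evacuate(domain)
converges_after_evacuation = spell_out(remaining)
```

In this toy setting, the choice between crash and evacuation is determined by inspecting the domain alone, which is the sense in which the trigger is local.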
A similar conclusion is reached independently by Bošković (to appear), as well as
Stjepanović and Takahashi (2002). Bošković (to appear) discusses, among others, an instance
of the following general problem for EPP-driven movement to phase edges in Chomsky’s
(2000 et seq.) probe–goal theory. According to that theory, both probe and goal bear some
[uF]. As the probe is merged in much later than the goal, there is no way, without massive
lookahead, to block successive cyclic raising of the goal to intermediate phase edges. The
puzzle is posed by scenarios where the goal XP, bearing some [uF], is to be checked against a
‘weak’ probe (one without an EPP feature) in some higher phase. On Chomsky’s theory, in
such scenarios after the last step of successive cyclic movement of the goal XP to the edge
that is already accessible to the probe, the movement dependency between the attracting probe
and XP is covert. This would leave XP in an intermediate edge position in overt syntax. Such
situations are apparently not attested, however. Bošković (to appear) illustrates the case with
the ungrammatical sentence *Who thinks what that Mary bought?, which would be wrongly
generated on Chomsky’s theory.
Returning to the main thread now, I argue that HM is triggered in the same way as
instances of phrasal movements to intermediate phase edges. The idea is that, in terms of the
configuration in (1), with H a head, and K the containing phase, H bears some uninterpretable
feature [uF] that it has not been able to check internally to K, and thus upon the completion of
K, H is evacuated from K to avoid being Spelled Out, which could not be avoided if H stayed
11
Chomsky (2004), building on much recent work, broadens the conception of Last Resort even further by
assuming it to also license movements that have an effect on interpretation (e.g., Quantifier Raising, Object
Shift). Such a permissive view involves some extra computation, but that consequence is argued to be
empirically correct by Reinhart (2006). Another case is the Merger of arguments, which, although it does not
involve feature checking, nevertheless turns an uninterpretable (unsaturated) syntactic object into an
interpretable (saturated) one (see Chomsky, ibid.).
12
The drive behind successive cyclic movement through escape hatches is non-featural in Chomsky and Lasnik
(1993) and Takahashi (1994), where it is identified directly with a locality principle minimizing the distance
between chain links. Such an approach is informulable in the current restrictive derivational framework.
Chomsky’s (1995 et seq.) feature checking account (in terms of uninterpretable wh-features on intermediate C’s)
faces the same problems that are discussed in the text above.
Heck and Müller’s (2001) (H&M) resourceful account of successive cyclic movement, noted by an
anonymous reviewer, may be considered as a hybrid of the feature-based and the non-feature-based approaches.
The proposal is feature-based, yet it does not involve actual feature-checking, but only the requirement that there
be a matching feature for each feature to be checked. Specifically, H&M introduce the principle of Phase
Balance, which demands for every phase PH that for every strong feature in the Numeration there must be a
distinct matching feature either in the Numeration or at the edge of PH. In this manner lookahead is avoided, at
least in the technical sense of the term. Arguably, however, the computation still does look ahead to what
building blocks are yet to be Merged at later stages of the derivation. It is not clear how the principle of Phase
Balance itself is to be derived in a minimalist model that is not augmented with an OT component, and also, the
Numeration is apparently dispensable in a theory based on highly local choices (cf. Collins, 1997; Frampton and
Guttmann, 2002) (for arguments against, and for alternatives to, Chomsky’s (2000) Merge-over-Move
preference, supposedly supplying empirical evidence for Lexical Array, see also Epstein and Seely (2006) and
references cited therein). For these reasons I will not discuss H&M here any further.
within K. This result can be derived by a modification in the definition of the notion of edge.
The edge, for Chomsky (2000 et seq.) is a set of positions in a phase PH that are exempted
from being affected by Spell Out when Spell Out applies to PH. The notion of ‘edge’ is
understood in the descriptive terms of (8).13
8. The edge includes the phase head Ph, the feature-checking specifiers of Ph, and any
elements raised to Ph by some intermediate movement to phase edge.
According to Chomsky (2000) (see also Nissenbaum, 2000), Spell Out applies upon the
completion of a phase.14 Then the Spell Out Domain (SD) of a phase PH, defined as the
syntactic domain that in fact is shipped to the external component for interpretation upon
Spell Out, is PH minus its edge. It is clear that given the mechanism of phase-based Spell Out,
elements that are still to be displaced should not be shipped to the interface modules. The set
of positions these elements occupy includes exactly those identified as the edge in (8). In fact,
(8) also prevents the Spell Out of some elements that are not moved any further, namely
feature-checking specifiers that are fully checked (valued) at the level of PH, and also those
phase heads Ph that are spelled out in the Ph position. This apparent inelegance, having the
effect of requiring more operational memory than would otherwise be warranted, is imposed
on the system by the definition in (8). But there is a more serious and more conspicuous
problem with (8) within Chomsky’s frame of assumptions. Chomsky has been insisting that
the set of phases is limited to CP and vP (or more precisely v*P, which stands for vPs
containing an external argument), and excludes TP and VP. He proposed to motivate this
selective view on the empirical basis that CP and vP are the categories that are correlated with
a propositional structure at the semantic interface and with a degree of independence at the
phonological interface (e.g., Chomsky, 2001; see also Legate, 2003). The paradox is that the
set of syntactic objects that have ‘interface motivation’ in this way (i.e., phases) is disjoint
from the set of syntactic objects that are actually Transferred to the interface systems (i.e.,
complements of phase heads) (see e.g., Abels, 2003 for this latter point). Finally, note a
problem related to the topmost phase in such a model: the Spell Out of this last phase needs to
involve a mechanism distinct from that applying to all other phases, given that it is not
(within) the Spell Out Domain of any phase head.
Each of these three discrepancies can be eliminated if the edge is defined as below:
9. The edge is the set of elements bearing some [uF] that are Merged to the phase head.
Under (9), fully checked specifiers of the phase head need not wait to be spelled out at the
level of the next higher phase, which effectively reduces operational complexity. More
significantly, as the phase head is not part of the edge, there is no mismatch between
categories undergoing Spell Out and categories identifiable on the basis of interface criteria as
13
Here and in what follows, I ignore adjuncts (on the status and treatment of which there is currently much
disagreement) for simplicity of exposition.
14
Chomsky (2001, 2004) weakens the theory by allowing more representation to stay on: each (strong) phase
undergoes Spell Out at the next higher (strong) phase level. I adopt the stricter version. Chomsky delineates
strong phases from weak phases, a distinction I will not assume here (see Legate, 2003).
phasal: the two are identical. Accepting Chomsky’s characterization of phases from the side
of the interfaces for the moment, this result provides support for (9).15
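The difference between (8) and (9) can be illustrated with a minimal sketch. It assumes a flat, hypothetical representation of a phase as a list of items tagged by structural role, ignores adjuncts, and treats every specifier as falling under (8); none of these names belong to the formal proposal. The Spell Out Domain is computed as the phase minus its edge.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Item:
    name: str
    role: str            # "head", "spec", or "comp"
    has_uF: bool = False # bears some unchecked [uF]

def edge_8(phase):
    """(8): the phase head and its specifiers, checked or not."""
    return {x for x in phase if x.role in ("head", "spec")}

def edge_9(phase):
    """(9): elements Merged to the phase head that bear some [uF]."""
    return {x for x in phase if x.role != "head" and x.has_uF}

def spell_out_domain(phase, edge):
    """The Spell Out Domain is the phase minus its edge."""
    return set(phase) - edge(phase)

phase = [
    Item("Ph", "head"),
    Item("Spec1", "spec"),               # fully checked specifier
    Item("Spec2", "spec", has_uF=True),  # still to be displaced
    Item("Comp", "comp"),
]

sd_8 = {x.name for x in spell_out_domain(phase, edge_8)}
sd_9 = {x.name for x in spell_out_domain(phase, edge_9)}
print(sorted(sd_8))
print(sorted(sd_9))
```

Under (8) only the complement is shipped off, whereas under (9) the phase head and the fully checked specifier are spelled out together with it, and only the element still bearing a [uF] is held back.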
Significantly, (9) itself is made possible on the assumption that HM re-Merges the
head at the root. On the analysis that HM adjoins the moved phase head Ph to the attractor A,
the narrow definition of edge in (9) is not available. The reason for this is that on this latter
view Ph would be shipped to the interface components as soon as the phase PH that it heads is
completed, i.e., before A is Merged in. This results in crash if Ph bears some [uF], but at any
rate it preempts HM of Ph to A. As HM of phase heads is attested (cf. v-to-T), this outcome is
undesirable.
According to the alternative developed here, if Ph bears some [uF], it will have to be
evacuated from its current phase-head position, since that position belongs to the domain of
Spell Out according to (9). If Ph does not bear any [uF], then it will be Spelled Out as part of
the Spell Out domain of PH. In short, the approach to intermediate movement to edges
discussed in this subsection, based on Last Resort (7), extends to HM too, given the revised
notion of phase edge (9), which has independent appeal. Then we have an answer to how HM
is triggered in (1).
This proposal gives rise to several immediate questions. The discussion thus far has
identified H in (1) with the head of K and has taken K to be a phase. One question concerns
how the treatment sketched immediately above can be generalized. In other words, what about
the other options: when H is not the head of K, and when K is not a phase (i.e., when H is not
a phase head Ph)? Also, nothing has been said about the uninterpretable features on the
moving heads that trigger HM. Is there a need to postulate novel features for this purpose?
And what ensures that H projects when it is Merged at the root? The following subsections
address these questions in turn.
4.2.2 Labels and checking
I begin by addressing the last of these issues, that of the head status of H in its raised position.
I suggest that H functions as the head in its higher position precisely due to its [uF] that drove
it out of the Spell Out Domain of the phase. It is this [uF] that identifies H as (containing) the
label of the syntactic object created as a result of HM of H. More specifically, I will be
arguing that H projects the label because this leads to the creation of a checking configuration
licensing the checking of [uF] on H.16
15
Note that (9) allows for complements to be part of the edge when they bear a [uF]. That may be ruled out by
some appropriate modification of (9), should there be strong reasons for doing so. I am not aware of such
reasons. As formulated above, (9) rules out Complement-to-Edge movement. This is apparently a welcome
consequence, see Abels (2003). On the assumption that a Complement is in a sufficiently local relation with the
head to enter into feature checking with it, it follows that Last Resort bans Complement-to-Specifier
movements (Abels, 2003). This reasoning alone does not rule out Complement-to-Edge movement on the view
of movement to edges laid out above. But (9) does: in case the Complement itself bears some [uF], the
Complement is part of the Edge, i.e., it does not need to be moved in order to occupy an Edge position.
16
For present purposes, I take labels to be copies of LIs. I am assuming labels for complex syntactic objects too
(see also Chomsky 2005b: 15–16), contra Collins (2002). Collins (ibid., 49) notes that the notion of label has a
representational flavor, insofar as “once [it is] created, [it] remains in existence for the rest of the derivation.”
This is not a necessary property of labels (see Hornstein and Uriagereka, 2002, as well as Fn. 24). Collins (2002)
argues that labels are predictable, but the fact that something is predictable of course in no way entails that it
does not exist. Crucially, Collins makes “the stipulation that the locus is identifiable in the workspace with no
search” (p. 48). But if we do not introduce such a memory buffer into our model (neither a backtracking search
The analysis does not involve Chomsky’s (2000 et seq.) Agree as the checking
relation. In Chomsky’s (2000 et seq.) theory of movement, Agree is taken to be the basic
operation relating elements to be checked against each other. For Chomsky (2000, 2001),
covert movement consists in nothing more than Agree, while overt movement involves Agree
and, in addition, displacement of a category. Agree is crucial in differentiating overt and
covert movement dependencies. One peculiarity of this setup is that it conceptualizes covert
movement as being simpler and more fundamental than overt movement. Chomsky (2004)
modifies this picture by reintroducing covert category movement, as displacement of a
category after transfer of phonological features to the sound component (see also
Nissenbaum, 2000, as well as the other relevant references in Section 4.2.1).
An alternative rationalization of overt and covert category movement, the one I am
adopting here, is that movement of the syntactic category itself can pied-pipe both semantic
and phonological features contained in the category, only semantic features, only
phonological features, or neither. The last one of these options is covert movement without
semantic effects (e.g., the movement of the associate in there-constructions, see Lasnik,
1999); pied-piping phonological features only is the case of radical reconstruction (e.g., short
scrambling in Japanese); pied-piping semantic features only is semantically interpreted covert
movement (e.g., Quantifier Raising); and finally, pied-piping both is overt movement.
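The four options form a simple cross-classification, which can be written out as a lookup table. This is only an illustrative sketch; the function and parameter names are hypothetical, and the descriptions are those given in the preceding paragraph.

```python
def movement_type(pied_pipes_semantic: bool, pied_pipes_phonological: bool) -> str:
    """Classify category movement by which features it pied-pipes."""
    table = {
        (True, True): "overt movement",
        (True, False): "semantically interpreted covert movement",
        (False, True): "radical reconstruction",
        (False, False): "covert movement without semantic effects",
    }
    return table[(pied_pipes_semantic, pied_pipes_phonological)]

# Quantifier Raising: semantic features only.
qr = movement_type(True, False)
# Movement of the associate in there-constructions: neither.
associate = movement_type(False, False)
```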
Interpretability is per definitionem a notion that is relative to some specific interpretive
system. Given the two interpretive systems of sound and meaning interfacing with syntax,
unless we add a stipulation to the contrary, (un)interpretability can be either semantic (S-) or
(morpho)phonological (P-). A prominent view concerning the interpretation of syntactic
elements by interface systems in a derivational setting is that the derivation hands over
elements to the interface components as soon as all their uninterpretable features are
eliminated (checked, valued) (see, for instance, Kitahara, 2000; Svenonius, 2001; and
Chomsky, 2001).17 Given the distinction between
(un)interpretability for semantics and for (morpho)phonology, the computation hands over
elements to the phonological interface component as soon as all their P-uninterpretable
features are checked, and hands over elements to the semantic interpretive component as soon
as all their S-uninterpretable features are checked. This view allows for two theoretical
options: either it is required that an element can be fed to some interpretive component to be
interpreted there only if it is sent to both, or this is not required. Enforcing that the two operations
of transfer be simultaneous might involve some extra constraint, but at any rate, the latter
mechanism), then we need labels to ensure “identifiab[ility] in the workspace with no search.” Collins trades in a
memory buffer for the elimination of labels.
In reality, if labels (may) remain after a phrase has been built and the derivation has moved on, then we
have a straightforward answer to why it is the whole phrase XP that is moved to a non-local attractor, when it is
a feature of the LI head X of that phrase that is being attracted (i.e., why the phrase itself is apparently pied
piped). The reason can be that it is the label of the phrase itself that contains the closest occurrence of the
attracted feature to the attractor. Another possible answer is that moving just the head X would result in the
elimination (deletion/valuation) of occurrences of the offending feature [uF] on X, but it would leave the
occurrence of [uF] in the label of the phrasal projection(s) of X in place, unchecked.
Note that projection of labels is not vital for the core proposal I am making below; the analysis can be
transposed into a model based on Locus, dispensing with labels for complex syntactic objects.
17
Chomsky (2001, 10) reasons that “the simplest assumption is that the phonological component spells out
elements that undergo no further displacement, with no need for further specification [i.e. checking, BS]”. By
extension, the same should apply to the semantic component as well.
option is no less restrictive than the former. I will assume here for simplicity’s sake that
simultaneity of transfer to the two interface modules is not required.
Independently of this choice of modeling the overt/covert opposition, the point to be
underscored is that if covert movement involves category movement, then Agree is no longer
instrumental in differentiating overt and covert movements.
But Agree has other functions too, in Chomsky’s theory, owing to the way it is
defined. It is assumed to be a complex notion involving feature matching, c-command, and
Closeness. C-command itself is a complex and stipulative relation (contra Epstein et al., 1998;
Epstein, 1999): it is complex, as it is composed of the relations immediately contain (which
can be derived from the application of Merge itself) and contain (a transitive closure of
immediately contain), and it is stipulative, as nothing explains the elementary asymmetry at
the heart of the notion, namely the fact that on one branch it is immediately contain that
defines the participant in the relation (viz. c-command), while on the other branch it is contain
(see Brody, 2002 for this point). Why isn’t contain the defining relation on both branches?18
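The composition of c-command from the two containment relations, and the asymmetry between them, can be seen in a minimal sketch. It assumes binary-branching trees built by a hypothetical merge function and the standard definition on which a c-commands b if the node immediately containing a contains b, while a itself neither is nor contains b; the class and function names are illustrative only.

```python
class Node:
    def __init__(self, label, left=None, right=None):
        self.label = label
        self.children = [c for c in (left, right) if c is not None]

def merge(label, a, b):
    """Build a new syntactic object immediately containing a and b."""
    return Node(label, a, b)

def immediately_contains(x, y):
    return any(c is y for c in x.children)

def contains(x, y):
    """Transitive closure of immediately_contains."""
    return any(c is y or contains(c, y) for c in x.children)

def all_nodes(root):
    yield root
    for c in root.children:
        yield from all_nodes(c)

def c_commands(a, b, root):
    """On a's side: immediate containment. On b's side: containment."""
    for x in all_nodes(root):
        if immediately_contains(x, a):
            return b is not a and contains(x, b) and not contains(a, b)
    return False

V, Obj, Subj = Node("V"), Node("DP_obj"), Node("DP_subj")
VP = merge("VP", V, Obj)
root = merge("vP", Subj, VP)

subj_over_obj = c_commands(Subj, Obj, root)
obj_over_subj = c_commands(Obj, Subj, root)
```

The two relations enter the definition unequally: the c-commander is located by immediate containment, the c-commandee by containment, which is the stipulated asymmetry at issue.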
Another potential conceptual shortcoming of c-command, abstracting away now from
the asymmetry it incorporates, is that it is non-local. It does not fall out directly from the basic
structure building operation of Merge: immediate containment (IC) is not an addition to the
theory, but relying on the transitive closure of IC is (otherwise all other relations definable in
terms of IC and logical connectives should be ‘freely’ available syntactic relations).
Furthermore, the non-locality of Agree makes Agree a representational concept, confining the
derivational theory that makes use of it into the class of mixed derivational–representational
theories (see Brody, 2002 for this notion).
A last problem with incorporating c-command into Agree (or into another checking
relation) to be mentioned here is that c-command seems to be eliminable for other
grammatical purposes (see also Chomsky, 2005b, 7–8). The monotonic increasingness of the
structure building operation (Merge to the root only) derives c-command for category
movement, rendering an additional c-command requirement (as part of Agree) redundant.
Binding and control appear to be reducible to movement (see Hornstein 2001; Kayne 2002;
Zwart, 2002; for development of this view). Variable binding by (binary) quantifiers is also
partly implemented via movement. The c-command effect in variable binding in in-situ binary
quantification (should it exist) can be put down to the semantics of binary quantifiers Q. This
is because Q takes its restrictor as its first argument and the nuclear clause, where it binds a
variable, as its second argument (according to Hornstein and Uriagereka, 2002, the same
holds of moved QPs). This means that [Q+restrictor] (e.g., a quantifier phrase) will be Merged
with its nuclear clause containing a variable, thereby deriving the c-command effect. But if c-
18
One can dispense with this stipulated asymmetry by deconstructing c-command into two components:
projection of label and dominance (containment) (see Brody 2000, 2002, who decomposes it into Specifier–Head
agreement and dominance). This would open up the possibility of differentiating features triggering overt
movement from those triggering covert movement in the following manner. A covert movement dependency is
created when the probe (=the label of the current root) needs to dominate the goal (‘weak probe’), and an overt
movement ensues if there is a stronger requirement in addition to that: a ‘strong probe’ needs to immediately
dominate the goal. This involves movement of the goal, Merging it with the root, and projecting the label of the
root (i.e., the probe). Overt movement would be seen as involving the requirement of the establishment of the
basic structural relation, i.e., the one created by Merge. This approach would constitute an attractive alternative
to the EPP-based account of the overt/covert distinction.
However, requiring a dominance relation to hold between attracting label and goal would face the same
problems that are identified for c-command immediately below in the main text.
command is only epiphenomenal in all these areas of the grammar, then keeping it just for
feature checking (as distinct from movement) seems unmotivated.19
These considerations should eliminate the c-command based checking relation of
Agree from the syntactic toolbox. I propose to replace Agree with a simple and natural
checking configuration: immediate containment, i.e., immediate dominance, a syntactic
relation that is directly supplied by Merge (cf. the notion of a ‘natural’ relation in Epstein,
1999; see also Chomsky 2005b, 7).20 As the probe is the label that projects, i.e., is copied to
become the label of the complex, phrasal syntactic object, it will come to immediately
dominate the goal, or more precisely the label of the goal, sitting in a specifier position. A
consequence, in full accord with the primacy of lexical items (LI) in minimalism, is that the
checking relation of immediate dominance is a relation between LIs: between a label and
another label, or a label and a terminal LI.
Besides being a natural relation supplied by Merge, taking the checking relation to be
immediate dominance converges with the fundamental labeling mechanism in Chomsky
(2000, 2001) according to which it is always the probe (‘selector’) that projects. On the
assumption that the checking relation is immediate dominance, unless the current probe, or
Locus (see Collins, 2002), projects, the goal will not have a chance to enter a checking
relation with it.21 This is the reason why a labeling mechanism that projects the Locus can be
considered an optimal solution to the checking configuration supplied by Merge, facilitating
checking. I spell out this mechanism for future reference as (10).22
19
A residual component of the grammar where c-command plays a crucial role is Closeness/Relativized
Minimality (part of Agree, on Chomsky’s (2000 et seq.) definition). Closeness might well be dispensable,
however (cf. Chomsky, 2005b for this suggestion), in favor of Shape Conservation (e.g., Cyclic Linearization,
Fox and Pesetsky, 2005), or in favor of sufficiently small phases (on this, see Müller, 2004, as well as Section
4.2.3 below). This choice is immaterial for the present purposes, as Closeness can be retained in the system
without c-command on the simplification of the checking relation I am proposing immediately below.
20
Compare the view that checking is under sisterhood, e.g., Zwart (1992, 2003), Adger (2003). ‘Derivational
sisterhood’ (see Epstein et al., 1998) is non-local: it is a relation involving mutual c-command.
21
Note that projecting the probe H takes place in two steps. H is projected first when H and its complement are
Merged, at which point H attracts the goal. The goal is then Merged at the root, and H projects further. This
two-step derivational sequence may be taken to involve lookahead, insofar as H will come to immediately dominate
XP only when it has undergone both projection operations. This lookahead, however, is minimal, i.e. extremely
local. The decision for H to undergo the first projection step will be evaluated as correct in the immediately next
step, when H projects again, coming to be in a checking relation with (and checking off) the goal Merged in [Spec,H].
Projecting the complement of H, in contrast, results in no checking relation. The labeling principle in (10) avoids
even this minimal lookahead.
It is interesting to note that on the approach described in Fn. 18, the two projection steps would each be
motivated separately (the first one by the dominance requirement, and the second one, applicable only in overt
movement, by the immediate dominance requirement).
Note, finally, that I do not assume uninterpretable c-selectional features to drive head–complement
Merge (cf. e.g., Svenonius, 1994; Julien, 2000); for arguments against that view, see Surányi (2006).
22
See Chomsky (2005a, 14) on the necessity of some form of labels, and (Chomsky, ibid., 15) on the
unproblematic nature of the ordered pair notation of labeling, which need not have the status of a primitive for
the computational system, beyond the operation of set formation.
Chomsky (2005b, 10–11) advocates a two-part labeling algorithm, of which both parts are stipulated
principles, even though they are simple. One of the two states that upon Merger it is the non-projected category,
i.e., the LI, that will be the label, and the other is an update of the ‘Target Projects’ generalization of Chomsky
(1995) in the form of a stipulation (cf. Section 4.1.1). Admittedly, the two rules conflict when an LI-level
category (like a pronoun) is moved. Chomsky makes the puzzling assumption that in such cases there is
optionality with respect to which principle is conformed to, and which one is violated. Besides the obvious
10. Labeling Principle
The Locus projects.
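The effect of (10) on checking under immediate dominance can be made concrete with a small computational sketch. The Python fragment below is purely illustrative and is not part of the proposal: the class names `LI` and `SO`, the function `merge`, and the feature name `phi` are hypothetical, and checking is simplified to set intersection between the [uF]s of the projecting label and the features of the label it comes to immediately dominate.

```python
from dataclasses import dataclass, field
from typing import Set, Tuple


@dataclass(eq=False)
class LI:
    """A lexical item: interpretable features plus unchecked [uF]s."""
    name: str
    features: Set[str] = field(default_factory=set)
    ufeatures: Set[str] = field(default_factory=set)


@dataclass(eq=False)
class SO:
    """A syntactic object: a label (always an LI) and its daughters."""
    label: LI
    daughters: Tuple["SO", ...] = ()


def merge(projector: SO, other: SO) -> SO:
    """Merge two objects and project the label of `projector`.

    Checking is modeled under immediate dominance only: the projected
    label checks its [uF]s against the label of the non-projecting daughter.
    """
    new = SO(label=projector.label, daughters=(projector, other))
    matched = new.label.ufeatures & other.label.features
    new.label.ufeatures -= matched
    return new


# Hypothetical example: T bears an uninterpretable phi feature; the DP bears phi.
T = LI("T", ufeatures={"phi"})
vP = SO(LI("v"))
DP = SO(LI("D", features={"phi"}))

T_bar = merge(SO(T), vP)   # first projection step: T dominates v, no match
TP = merge(T_bar, DP)      # second step: T immediately dominates D

print(TP.label.name, sorted(T.ufeatures))
```

On this toy model, projecting T in both steps leaves T immediately dominating the label of the DP in its specifier, so the [uF] of T is checked. If the DP were allowed to project instead, the [uF] of T would remain unchecked, which is the situation that (10) is meant to exclude.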
As for argument structure, I wish to adopt a Hale and Keyser-type configurational–
contextual approach to theta roles, where the theta role of a phrase is determined by its
syntactic context (viz., by virtue of being located in a certain position). In terms of the present
assumptions, the verb V, and also the external-argument-taking v, is unsaturated, and will be
interpretable only as part of a syntactic object labeled by itself, where the label immediately
dominates an appropriate argument (see Chomsky 2005b, Fn. 30, for a suggestion along these
lines). An agent, in the present terms, is the DP that is immediately dominated by a v node,
and other theta roles are interpreted analogously. If it is not the v that projects when the
constituent labeled with v and the agent DP are Merged, then v itself will remain
uninterpretable at the semantic interface, as it will not have its argument identified.
Returning to the issue investigated in this subsection, namely the head status of H in
its raised position after the application of HM, the following can be concluded. Suppose H
bears a [uF] in its original position marked (H) in (1), this being the reason for the movement
rescuing it from the Spell Out Domain of (H)P. Then, provided that H is to check its [uF] in
its raised position, it will have to project, by (10).
This result appears paradoxical at first glance. A feature of H, [uF], can enter checking
in its higher position, but it was not able to do so in its lower position (H), when it was inside
the Spell Out Domain, even though no new element inaccessible to H in its lower position (H)
has been Merged in (assuming specifiers of (H) to be accessible for checking with (H), (H)
being the label immediately dominating those specifiers). The paradox is resolved, however,
if (H) projected a different label from the label that H projects.
This is possible if H itself contains more than one element that can be projected as a
label. As labels are LIs (Chomsky, 1995), it follows that H must contain more than one LI.
Recall that according to the theory developed here, there are no functional heads attracting H
to undergo HM. Compare head-to-head movement theory: there if a head H undergoes HM,
then it incorporates into another head F. On the lexicalist approach, H already bears a
morpheme corresponding to F. For instance, in cases of V-to-T movement V is inserted with a
T morpheme affixed on it (N.B. if T is not affixal but free, no V-to-T can take place). In other
words, in V-to-T movement, V+T, which is generated as a unit prior to Merging the inflected
verb with its object, raises to a (null) T. On the present account, this latter null T does not
exist prior to HM. But given a lexicalist approach, affixal T is anyway present prior to HM:
namely, it is part of the complex verb, which it is generated as a morphological part of. I will
adopt the same assumption for the present purposes, with the difference that I will take the
complex (inflected) verb to be formed syntactically from its individual components, rendering
these components visible for the syntax. In other words, I adopt a syntactic approach to
inflectional morphology. Nonetheless, analogously to the lexicalist approach, I continue to
assume that affixation of a stem takes place prior to the combination of the affixed stem with
its phrasal arguments.23 For instance, a finite verb in French would have the following
schematic structure:
conceptual problems this suggestion raises within the given larger frame of assumptions, it is doubtful whether
this could extend to simple cases like pronominal subjects.
23
Although I will assume, unlike Distributed Morphology, that syntactic nodes do contain lexical material
already within syntax, I believe the account can be transposed to a late insertion framework without difficulties.
11. [[[ V ] v ] T ]
Taking this example, this complex verb can saturate its v component in a given position,
projecting v as the label, in accordance with (10), which will come to immediately dominate
the agent DP (see the discussion immediately below (10)). The complex verbal head can then
be raised, and in its raised position it can project a different label, T, which thereby comes to
immediately dominate the subject DP raised to T.24
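The successive projection of distinct components of the complex head in (11) can be pictured schematically as follows. The fragment is only a sketch under simplifying assumptions that go beyond the text: the requirements of V and v for an argument are encoded in the same way as the [uF] of T, the names `theme`, `agent` and `phi` are hypothetical, and the choice of the innermost unsatisfied component as the next Locus follows the suggestion in Fn. 24 rather than anything established here.

```python
from typing import Dict, List, Optional, Set, Tuple


def next_locus(components: List[str],
               pending: Dict[str, Set[str]]) -> Optional[str]:
    """Return the innermost component that still has something unchecked."""
    for comp in components:
        if pending.get(comp):
            return comp
    return None


def project_in_turn(components: List[str],
                    pending: Dict[str, Set[str]],
                    goals: List[Set[str]]) -> Tuple[List[str], Dict[str, Set[str]]]:
    """Let each occurrence of the complex head project its current Locus.

    `goals[i]` holds the features of the element that the i-th occurrence
    of the head comes to immediately dominate (a hypothetical encoding).
    """
    left = {comp: set(feats) for comp, feats in pending.items()}
    labels: List[str] = []
    for goal in goals:
        locus = next_locus(components, left)
        if locus is None:          # nothing left to check: no further HM
            break
        labels.append(locus)       # by (10), the Locus projects
        left[locus] -= goal        # checking under immediate dominance
    return labels, left


# The finite verb in (11), listed innermost component first.
labels, left = project_in_turn(
    ["V", "v", "T"],
    {"V": {"theme"}, "v": {"agent"}, "T": {"phi"}},
    [{"theme"}, {"agent"}, {"phi"}],
)
print(labels)
```

The list of labels returned records which component projects at each occurrence of the complex head. Once no component has anything left to check, the loop stops, mirroring the idea that HM applies only as long as some component of H still bears a [uF].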
We now have the contours of an answer to why H raises out of its (H) position inside
(H)P in (1), and why it projects again in its raised position. The reason is the same for both: a
[uF] on one component of the complex head H. This [uF] would cause the Spell Out Domain
of the phase (H)P to crash if H is not moved out, and the component that bears it projects as
the label in the higher position, in order to be able to enter checking under immediate
dominance with a goal.
The moving head gives rise to a uniform chain, allowing the Uniformity Condition to
remain a descriptive generalization, as desired. As for the concern about the nature of the
features that are assumed on H to drive HM, it appears that regular features of the head
components of the complex moving head can act as triggers for HM: no special features need
to be introduced.
Recall that it was noted in (vii) that there is an unexplained alternation between null
and filled functional heads in terms of feature strength in checking theory. Functional heads
that overtly attract an affixed head, i.e., exhibiting the property of being strong, are apparently
weak and do not trigger overt HM to them when they are lexically filled themselves. This
does not follow from checking theory, but falls out trivially on the present assumptions. When
a functional head F is morphologically free, it will not be generated as part of the inflected
head H, which will then never raise to F by HM. In this case strong F is checked in its
base-generated position. If, on the other hand, F is affixal and therefore originates as part of H, the
complex head H raises to the F position in overt syntax, due to the strong (i.e.,
P-uninterpretable) feature of F, which gets checked there.
Finally, consider the apparent strong/weak transmutation of sequences of intermediate
functional heads discussed in (vi). As it was pointed out above, it is a recurrent scenario in
derivations that one would have to assume some sequence of intermediate inflectional heads
I1...In to be once uniformly weak, once uniformly strong, only to allow both for the derivation
of sentences with the verb (or the relevant head generated below In) pronounced low (lower
than I1...In) and for the derivation of sentences with the verb pronounced high (higher than
I1...In). Examine this setting within the model outlined above. If the verb is pronounced in
24
I must leave the issue of the hierarchy of the components and the order of their projection aside here. See
Surányi (2002, 2003, 2006) for an account according to which complex heads are internally re-projected: the
next higher head component LI with an uninterpretable feature becomes the new Locus of checking upon
re-Merger of the complex head H at the root. Re-projection internal to H is the result of the assumption that the
basic operation of Copy applies to LIs both when LIs are Copied from the Lexicon and when they are Copied as
part of Move. Movement of a complex head effectively involves Copying each of its LI components, and feeding
the copies to Merge to build another instance of the complex head. This new occurrence of H will have atomic
components identical to those of the lower occurrence of H, but due to the fact that the component LI with the
lowest remaining [uF], the new Locus, will start to project (owing to the labeling principle Locus Projects, see
(10)), all the labels in this new occurrence that dominate the new projecting Locus will be different. This
mechanism of re-projection upon the step-by-step re-assembling of the new occurrence of H by Merge—though
distinct from it in requiring actual re-Merger of the complex element for its re-projection—is akin to Hornstein
and Uriagereka’s (2002) notion of Reprojection.
some high functional head, call it C, where it gets to by head movement, then this means in
terms of the current model that C must be an affix in the complex verb.25 The fact that the
verb is pronounced high, in C, shows that this C component of the complex verb bears a
strong feature, i.e., a [uF] uninterpretable at PF. In the paradigmatic case, when the verb is
pronounced low, C is realized as a free morpheme (say, a complementizer). If the complex
verb contains no strong features on its I1...In inflectional components (which on the present
account, are also generated inside the complex verbal head), then the verb will be sent to PF
low, below In. Note that on this approach, whether C is a bound morpheme inside the verbal
complex, or a free morpheme Merged from the Lexicon at the relevant high position, the
feature specification of the inflectional head components I1...In remains the same (they bear no
strong feature). Therefore, the problem of strong/weak transmutation does not arise. This is
due to the fact that the feature to be checked by verb movement to the high position is a
property of a component of the moved head.26
4.2.3 Head-based cyclic Spell Out
Let me complete the core of the account by addressing the remaining question raised at the
end of Section 4.2.1, concerning the generalizability of the analysis, which has been
demonstrated to work for phase heads. One of the two options considered there was precisely
the HM of non-phase heads (i.e., when K in (1) is non-phasal). In the radically derivational
model I am to adopt, this state of affairs is never the case.
For Chomsky (2000 et seq.), the radical derivationality of narrow syntax significantly
reduces the operational complexity of the computational system, in accord with his strong
minimalist thesis (SMT), and at the same time, it drastically restricts available syntactic
operations, especially movements, which can apply only within the relatively small ‘active
window’ (=the phase) accessible at a given derivational stage. A priori, however, phases
25
Note that this assumption must be made in the checking account too. This is because there the verb
syntactically incorporates into a phonologically null C, which would not be possible if C were a free morpheme
(i.e., a word).
26
In reality, the problem of strong/weak transmutations crops up in certain derivations involving remnant
movement. This case is worth examining separately, as it shows that the problem of strong/weak transmutations
cannot be explained away in a way that assumes that the verb is somehow attracted all the way up to C, even
though it was still in its low position before C gets Merged, due to the ‘weak’ property of intermediate
inflectional heads. Consider, for instance, a derivation where the finite verb ends up in the C position, and the
remnant vP is fronted to some lower specifier position, say [Spec,T]. In a language that has weak inflectional
heads above VP, the strong/weak transmutation problem manifests itself as a non-local lookahead problem.
Consider why this is so.
(i) [CP [C V] [TP [vP ... tV ...] T . . . [ . . . tvP ]]]
In (i), the C element is merged in only after vP-fronting has taken place. If inflectional heads between V and C
are weak, they do not attract the verb overtly, hence fronting the vP to [Spec,TP] moves the overt verb along
(i.e., the position tV in (i) should be filled by the overt verb before C is merged in). The problem now is that in
order to get the verb move to C one would have to resort to extraction of the overt verb from a position
embedded in a left branch (and note that this position is not necessarily the highest head position in vP, it could
be lower, hence more deeply embedded in a left branch). In order to circumvent this difficulty, one either has to
resort to assuming inflectional heads above VP to uniformly be strong when V-to-C must take place, or else, the
movement of the V from head to head all the way up to C will involve non-local lookahead. This problem does
not arise on the present account, however. Here the complex verb will leave the vP prior to vP-fronting precisely
due to the fact that its C component carries a strong feature.
should be as small as possible, if the SMT holds. Chomsky (ibid.) has been assuming,
however, that the identity of phasal categories is an empirical issue. He has also insisted that
the identification of phases as CP and v*P in his model has appropriate interface motivation,
in terms of the properties of propositionality on the semantic side and relative independence
on the PF side (see Section 4.2.1 above).
In fact, the proper interface-based, and also syntactically motivated, definition of the
phase has been the subject of much controversy, both empirical and conceptual, ever since
Chomsky (2000) put forward his own (e.g., Bošković, 2002a; Epstein and Seely, 2002;
Legate, 2003; Matushansky, 2005; Boeckx and Grohmann, to appear). The protracted
quandary about the resolution of this dilemma is unsurprising if it should turn out that the
question itself is based on false premises. The relevant premises are that (i) Spell Out
(Transfer) proceeds in cycles (the Multiple Spell Out architecture), and (ii) Spell Out is
selective: there is an asymmetry in that some syntactic objects created by the computational
system of syntax (ChL) are inherently such that they are subjected to Spell Out when they are
built, while others are inherently such that Spell Out does not apply when they are completed.
It has been suggested by some that (ii) does not hold, i.e. the relevant asymmetry should not
be postulated. According to one particular alternative view that has been advocated regarding
the distribution of Spell Out points, each phrase undergoes Spell Out (=(12)) (see e.g.,
Bošković, 2002a; Müller, 2004; and Boeckx, 2003; compare also Frampton and Gutmann,
1999). According to (12), in a simplified clausal domain, not only v*P and CP, but also VP,
TP and vP are phasal, thereby eliminating any inherent asymmetry between phasal and
non-phasal phrases.27
12. Each phrase undergoes Spell Out (i.e., each phrase is a phase).
If (12) is correct, then the analysis of HM presented above can apply to every phrase,
not just to v*P and CP. Even though (12) does not postulate an inherent asymmetry between
phasal and non-phasal phrases, as it stands, it defines the notion of phase purely syntax-internally,
since it singles out maximal phrasal projections as phases, as opposed to non-maximal
ones.28 Then (12) cannot be axiomatic according to the SMT: it can only hold if it
is a theorem following from more general assumptions. Such a construal of (12) is possible: I
assume that it is best thought of as being derivative of (13–15).
13. Free Spell Out
Spell Out can apply (to the root syntactic object, SORoot) at any derivational stage.
14. Spell Out Earliness
Each category K is subjected to Spell Out as soon as it can be.
27
The discussion of the thesis in (12) is based on a similar exposition in Surányi (to appear).
28
The ‘edge’ of the root syntactic object SORoot is exempt from being interpreted by the interface systems upon
Spell Out. Assume for now, following Chomsky’s (2000 et seq.) lead, that the edge is understood to include the
head H of SORoot, any feature-checking specifiers of H, as well as any elements raised to H via intermediate
movement to phase edge (I am ignoring adjuncts here and in what follows). It is only the complement of H that
is sent off to the external modules (=the Spell Out Domain, SD).
15. (A corollary of) Full Interpretation
The syntactic object that ChL is operating on at any given stage (=SORoot) can undergo
Spell Out if there is no [uF] in its Spell Out Domain (=SD).29
Let me briefly comment on these three assumptions. (13) states that no extrinsic
restriction limits the application of Spell Out. Unless further restrictions are added, (13) yields
a Multiple Spell Out (MSO) model. MSO is simply the null hypothesis in a derivational
syntax, which by definition postulates a sequence of derivational stages. To the extent that
these derivational stages are real, they should in principle all be able to undergo the operation
of Spell Out, acting as an interface level. (13) does not imply that Spell Out must apply at
each derivational stage: according to the general premise inherited from the Principles and
Parameters framework, operations are formally optional in the minimalist approach—even
though generally heavily restricted by economy (cf. Last Resort), and by triggers (cf. Full
Interpretation). It is the economy principle in (14) that endorses applying Spell Out as often as
possible. (14) is a substantive principle, reducing operational complexity to a minimum
through minimizing the size of syntactic domains subjected to Spell Out (hence also limiting
the burden imposed on operational memory). (14) together with (15) (a corollary of Full
Interpretation (=FI), the hypothesis about the interface systems according to which they do
not tolerate uninterpretable elements) define an optimally small ‘delay’ in Spell Out whenever
it is necessary. This delay, stated descriptively in (12), is optimal in the sense that it is the
shortest delay (i.e., the smallest amount of violation of the economy constraint of Spell Out
Earliness in (14)) that is required to meet Full Interpretation.
Consider first how this modifies the model under consideration here, yielding a head-based
view of cycles of Spell Out. In accord with (15), ChL can apply Spell Out immediately
after Merging a head H only if the complement of H and H (making up the Spell Out Domain)
contain no [uF]s. If the complement of H does contain elements E bearing some [uF], then if
E matches some feature of H, this triggers (overt or covert) movement of E to H. Any
remaining elements bearing [uF] in the SD are raised up to the edge of HP in order to avoid
crash upon the Spell Out of SD (see Section 4.2.1). Taken together, this means that the
earliest stage when Spell Out can apply without violating FI/(15) is the completion of the
maximal phrase.
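The interaction of (13–15) within a single cycle can be sketched procedurally. The following fragment is an illustration only, under assumptions not made in the text: the function name `complete_phrase`, the element names, and the feature names `case` and `Q` are hypothetical, the Spell Out Domain is represented as a flat inventory of elements rather than as a structure, and the head itself is left out of that inventory.

```python
from typing import Dict, List, Set, Tuple


def complete_phrase(head: str,
                    head_checks: Set[str],
                    sd: Dict[str, Set[str]]) -> Tuple[List[str], Dict[str, Set[str]]]:
    """Simulate one cycle: Merge `head`, evacuate [uF]-bearers, Spell Out.

    head_checks: features that `head` can check (a hypothetical simplification).
    sd: elements in the Spell Out Domain mapped to their unchecked [uF]s.
    Returns the derivational log and the contents of the edge.
    """
    log = [f"Merge {head}"]
    edge: Dict[str, Set[str]] = {}
    remaining = {name: set(ufs) for name, ufs in sd.items()}
    for name in list(remaining):
        ufs = remaining[name]
        if not ufs:
            continue                      # no [uF]: the element may stay in SD
        checked = ufs & head_checks
        kind = "checking" if checked else "rescue"
        edge[name] = ufs - checked        # leftover [uF]s wait for a later cycle
        del remaining[name]
        log.append(f"Move {name} to edge of {head}P ({kind})")
    # (15): SD is now free of [uF]; (14): Spell Out therefore applies at once.
    assert all(not ufs for ufs in remaining.values())
    log.append(f"Spell Out SD of {head}P")
    return log, edge


log, edge = complete_phrase(
    "T", {"case"}, {"DP": {"case"}, "wh": {"Q"}, "Adv": set()}
)
for step in log:
    print(step)
```

With an SD that contains no [uF]s, the log reduces to Merge followed directly by Spell Out, which is the limiting case allowed by (15). Otherwise Spell Out is delayed exactly until every [uF]-bearing element has left the SD, that is, until the phrase is complete.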
In this manner, the generalization in (12) that each phrase is a phase is derived from
the architectural definition of the timing of Spell Out in (14), granting the assumptions of the
freedom of Spell Out and FI. Note that nothing that is said in (13–15) is selective with regard
to categories or projection level.30 Given that Spell Out is non-selective, the notion of ‘phase’
is epiphenomenal. If true, this eliminates any syntax-internal aspect to the definition of phase,
as no definition is required—not even one that is motivated by and formulated in terms of
notions of the external components, be it propositionality or some other concept.
Combining the account of HM developed thus far with this radically cyclic
derivational approach yields the result that a head H bearing a [uF] will be raised out of the
SD of every phrase. The application of this movement (=HM) is repeated as long as H still
bears a [uF]. A consequence of this analysis is that HM can be successive cyclic in the same
29
Compare, for instance, Kitahara (1997, 2000), Svenonius (2001, 2004), Chomsky (2001) for analogous views.
30
Also, nothing in the argumentation is crucially based on the assumption that SD includes H, or that movements
to the edge are non-feature-driven. Thus, that each phrase is a phase is derived from (13–15) within Chomsky’s
(2000 et seq.) model as well.
way as phrasal movement (unlike in the incorporation + no excorporation view of HM): the
two are symmetric in this regard.
Recall now the other option raised at the end of Section 4.2.1 concerning the
generalizability of the proposed analysis of HM, namely the question what happens if the
raised head H is not the head of the current ph(r)ase, K, i.e., if K≠HP. Although nothing that
has been said so far addresses this issue in an explicit way, it can be deduced from the model
that this scenario can never arise. I briefly spell out how this result is derived.
Assume, for the sake of the argument, that the current ph(r)ase K=GP, and the moved
head H is the head of the complement selected by G. A scenario under consideration, where
H≠G, is ‘head-superraising’ given abstractly in (16) (with projection levels omitted).
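16. [H2 H2 [G(=K) Y [G G [H1 H1 … X … ]]]]
(labeled bracketing: the first symbol inside each bracket is the label of that constituent)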
As a first option, take the case when the label that the complex head H projects is the
same in both the lower and the higher position, call it L. Consider why such a case cannot
exist. Given that L projects in the position above GP, it follows that L bears a [uF] that it
checks there against some goal F, or some externally Merged element E. Note that it is also
due to this [uF] that such head-superraising is conceivable at all: this [uF] on L, with L
projected to HP1, defines HP1 as part of the edge of G (see (9)). Assume that L could have
checked this feature in the same way in the H1 position, either because E could have been
externally Merged to H1, or because F was already part of the phrase marker at the level of the
HP1 (e.g., F=X in (16)). It can be argued that the Merger of G, a new head, was in plain
violation of the restriction of lexical access, because checking possibilities have not been
exhausted prior to the insertion of G (see Chomsky, 2000, 132; Collins 2002).31 But
31
Lexical access, i.e., Merger of new features from outside the work space, can be seen as a Last Resort type
operation. The computation introduces new features (i.e., LIs) from the Lexicon only if there is no other way to
continue the derivation. This restriction on lexical access may be derivable from general considerations of
minimal computational complexity, in line with the SMT. Collins (2002) formulates this property of syntactic
computation as what he calls the Locus Principle, according to which the locus being checked stays the same
until it is fully saturated by checking. Compare also Chomsky’s (2005b) generalization of Edge features to act as
triggers for external Merge.
The view of restricted lexical access as formulated above implies that Move is preferred over Merge
from the Lexicon, which is incompatible with Chomsky’s (2000) Merge-over-Move theorem (maintained in
Chomsky, 2004). The present approach takes Merge (in Chomsky’s (2004) terms, External Merge) and Move
(Internal Merge) to be equally complex operations, both involving Copy + Merge (see Fn. 8 above), and
identifies the relevant difference as the introduction of new features into the workspace, which Merge from the
Lexicon does involve, whereas Move does not. It should also follow that (External) Merger of a complex
syntactic object already built (i.e., existing in the workspace) does not introduce new features into the
workspace, hence it should not be dispreferred to Move. Compare Epstein and Seely (2006), and references
independently of this argument, there is good reason within the present approach why this
derivation is illegitimate: the movement of H1 to the H2 position is movement to the edge of G
from within an element (=HP1) that is also part of the edge of G. HP1 is in Edge(G), since L, in
the phrasal node HP1, bears a [uF]. But recall that movement to edge is licensed by Last
Resort because it eliminates an occurrence of an uninterpretable feature from the current Spell
Out Domain (see (7)). As H1 is already contained in Edge(G), moving it removes nothing from
that domain. This means that the H1-to-H2 movement does not satisfy Last Resort.
Assume that L could not have checked its [uF] at the lower stage, but it can do so at
the higher position H2. This is possible only if [uF] of L is checked against Y in [Spec,G],
which was not present at the lower stage yet. What is it that rules out this case? The first
conceivable factor is that such checking of [uF] of L in the H2 position does not achieve the
elimination of [uF] from L in HP1. This is because although the movement of H1 to the H2
position is movement to an edge, which involves deletion of [uF] off L in the original
occurrence H1 (under Recoverability), this does not delete [uF] off L in the label of HP1. This
stranded [uF] stays on, and as it cannot be checked by Y, the derivation does not lead to
convergence. The second reason why the derivation is illegitimate is simply that it also
involves movement from a position within Edge(GP) to another position within Edge(GP), in
violation of Last Resort.
Let us turn now to the second option, namely when the label that the complex head H
projects is different in the H1 and H2 positions, call it L1 and L2, respectively. L1 must have
been saturated in the H1 position, otherwise H1-to-H2 would once again be movement within
the same edge. L2 bears a [uF] that it checks in the H2 position. It follows on the head-based
approach to Spell Out cycles that HP1 must be completed and subjected to Spell Out before G
can be Merged in. This involves checking all features of H that can be checked in the H1
position (i.e., saturating L1), and rescuing all elements with uninterpretable features to edge of
H1. This includes moving H1 itself to the edge of H1, as H1 is part of the Spell Out Domain of
HP1, and H1 contains at least one [uF], namely the [uF] on L2. This means that
head-superraising as in (16) cannot arise.
The immediately preceding discussion leads to another option of non-local HM that
needs to be considered, namely the scenario when H1 in (16) is raised first to the edge of HP1,
and moves on to the H2 position from there. The situation is depicted in (17) (where the
occurrences of H are suppressed, except for the one in Edge(L1), and are replaced by the
labels they project). H projects and saturates its L1 component, and rescues H to the edge of
L1P. G is Merged in, and is saturated by entering checking with a goal Y that it attracts to its
specifier. H in Edge(L1P) is part of SD(GP), since it is part of GP but not part of Edge(GP),
given that L1P bears no [uF]. Apparently, H can raise on from its Edge(L1P) position inside
SD(GP) to Edge(GP), where it can project one of its components with [uF], to be checked by
some element within Y (or by Y itself, should it bear a second [uF] not checked against G).
The trouble with this derivation lies with the first step of moving H. The label of H in its L1
position is L1. L1 is saturated by assumption (internally to the lower L1 projection in (17)),
otherwise L1P would be part of the edge of G. The derivation in (17) gets blocked at the stage
when H is moved from the L1 position to Edge(L1P). This is because L1 is still the label of the
whole complex head H when it is in Edge(LP1), and L1 in H has no [uF], having been
saturated in the L1 position. That means, however, that when H is raised from the L1 position
to the H position in (17), it will not come to be part of Edge(LP1), as its label (=L1) bears no
[uF], and consequently will still be defined as part of SD(LP1). In that case, this first
movement step of LP1 to H is not appropriately driven, in the sense of Last Resort (see (7)).
therein, for a rejection of Chomsky’s (2000) Merge-over-Move (cf. also Shima, 2000 for a defense of Move-over-Merge).
17. [L2 L2 [G (=K) Y [G G [L1 H [L1 L1 … X … ]]]]]
We have been concerned thus far with non-local HM from the complement position of
the current ph(r)ase head G. But in fact, whatever has been said about the complement of G
carries over to the specifier of G, as both are part of Edge(GP) if their label bears a [uF], and
both are part of SD(GP) if their label bears no [uF].
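The partition just described can be illustrated with a minimal sketch. It assumes a deliberately simplified encoding that is not part of the paper's formalism: each immediate constituent of GP is represented only by its label and by the set of unchecked [uF]s on that label. The names `Constituent` and `partition_gp` are hypothetical and introduced purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Constituent:
    """An immediate constituent of GP (complement or specifier of G).

    Hypothetical encoding: only the label and the unchecked
    uninterpretable features on that label are recorded.
    """
    label: str
    unchecked_uf: frozenset = frozenset()


def partition_gp(daughters):
    """Split the complement/specifier of G into Edge(GP) and SD(GP).

    A daughter belongs to Edge(GP) if its label bears an unchecked [uF],
    and to the Spell Out Domain SD(GP) otherwise.
    """
    edge = [d.label for d in daughters if d.unchecked_uf]
    spell_out_domain = [d.label for d in daughters if not d.unchecked_uf]
    return edge, spell_out_domain


# Illustration: a specifier Y whose label still bears a [uF], and a
# complement L1P whose label has been saturated.
specifier = Constituent("Y", frozenset({"uF"}))
complement = Constituent("L1P")
edge, sd = partition_gp([specifier, complement])
print("Edge(GP):", edge)
print("SD(GP):", sd)
```

On this encoding the complement and the specifier are treated identically: only the featural content of the label decides whether an element escapes Spell Out, which is the point made in the text.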
There is, in reality, exactly one case where non-local movement of a head is
determined to be allowed. Consider a structure analogous to (17), but with an element E in
Edge(L1P) that was moved to Edge(L1P) not from the head position of L1P, unlike in (17). If
E is a phrase HP, then moving its head H to the L2 position would necessarily strand an
unchecked copy of the [uF] that the label L2 of H (projected above GP) bears on the label of
HP itself in (Edge,L1P) (recall the parallel case of stranding in the phrasal label discussed
immediately above). But if E is not phrasal, but only a complex head H itself, then its
movement from (Edge,L1P) to above G is licit: it leaves behind no stranded occurrence of
[uF], and it involves movement from SD(GP) to Edge(GP). If H is not yet checked after
movement, then it stays part of Edge(GP), and the movement can be repeated after some
further head F is Merged in above GP. As a last move, F can attract H into [Spec,F]. The
empirical case being looked at is phrasal movement of simplex phrasal elements. Bare Phrase
Structure theory allows for LIs to move on their own as a phrase, since simplex elements are
maximal projections (in fact, both minimal and maximal; see Chomsky, 1995, 321).
The more interesting case is when H is not attracted to the specifier of another head as
a final step, but rather, it is moved and projected, similarly to the last movement of H in (17).
Such a derivation would involve phrasal movement of a simplex element (an LI) through one
or more phrase edges, ending up in a head position. One case that has been argued to have
properties of this kind is clitic pronouns. That VP-external clitics reach their higher position
via phrasal chains is suggested in work by Sportiche (1992), Cardinaletti and Starke (1999),
and Roberts (1997), while at the same time, there is a host of evidence that clitic pronouns
typically behave as heads in their landing site position, often taken to be of the category Agr
(cf. also Bošković, 2002b, on whose analysis they are generated in a specifier position and are
moved to an incorporated head position). The movement of some clitic pronouns, then, may
instantiate the prediction at hand. Another case that comes to mind is that of simplex wh-expressions like what as discussed by Donati (2006) (see also Chomsky 2005b, 10–11).
Donati (ibid.) argues that in free relatives simplex wh-elements, which are of category D, are
raised to the CP-domain (on a standard view, non-local movement of the phrasal movement
type), where they project, giving rise to a DP (see also her analogous treatment of
comparative clauses for another case). As we have seen, such syntactic constructions are
predicted to exist. Significantly, however, they can arise only under the rather stringent
conditions that the moving simplex element must be base-generated as an unmodified
complement or specifier, and cannot be born as a head that has a complement or specifier
itself. This is because in this latter case it would not be able to come to occupy an edge
position: as demonstrated above, extraction of such a head is impossible. Indeed the cases
briefly noted as possible instantiations each fall into the expected category.
In summary, the model as developed above derives the strict locality of HM without
any extra assumptions. Recall from (iii) above that on Chomsky’s (1995) theory this special
locality remained a mystery. The HMC generalization, along with a specific well-determined
case when its effect can be circumvented, derives crucially on the present approach from the
head-based cyclicity of the Spell Out operation itself.32
5. CONSEQUENCES AND CONCLUSIONS
The approach to HM and structure building advocated in the foregoing sections has several
important repercussions for the syntax of null head positions, which this concluding section
will briefly recapitulate. Null elements, the subject of the present volume, are no doubt
legitimate and indeed extremely beneficial tools of grammatical description. A syntactic
element can in principle be null either on the sound side or on the meaning side, or both.
Nevertheless, it should be clear that positing abstract entities requires substantial (and
preferably empirical, rather than purely theory-internal) motivation. Methodologically
speaking, if a theory can dispense with some null element that its competitor is bound to
postulate, then, on the usual provisions of theory comparison, this can be regarded as a
distinct advantage.
According to the present theory of HM, HM of H is not driven by feature checking
against a c-commanding functional head F, resulting in incorporation of H into F, but instead,
it is driven by some [uF] on the moving head H itself. Perhaps the most conspicuous
consequence of this approach is that there is no need to posit pre-existing, systematically null
inflectional heads to act as attractors. Such attracting inflectional heads appear to be null,
because they do not exist. In fact, these heads are not only problematic because they are
systematically null in a lexicalist framework, but also for the reason that they duplicate word-internal morphemes in the syntax, which introduces a massive redundancy in the base-generated grammatical representation. In the theory advocated in this paper word-internal
affixes are generated only once, as part of the word, which then raises successive-cyclically to
come to occupy locations in phrase structure that correspond to those of the null inflectional
heads in Chomsky’s (1995) checking theory.33 A fundamental change in perspective is that
these head loci do not exist independently of HM, i.e., they are not truly positional in nature.
32
As can be recalled from Section 2, one restriction that remained stipulative in Chomsky’s (1995) checking
theory of HM is the ban on excorporation. In the model presented here, no complex head is created by moving
an element into a head; complex heads are both syntactically and morphologically integral units. On the natural
assumption that morphological words are licensed only if their elements form a syntactic unit, it follows that if
some element of the complex head is excorporated in the course of the derivation, the morphological word is
disrupted.
As an anonymous reviewer notes, XP-movement is also strictly local on the phase=phrase hypothesis.
While this is certainly the case in the sense that the farthest position any XP can move to in a given movement
step is the edge of the current phrase, nevertheless, the point is that XPs can cross over (some) other XPs that
they are (in descriptive terms) c-commanded by (e.g., XPs in Spec or Edge positions), while the same is
impossible for HM.
Null elements often accomplish their mission in a given analysis not by merely being
present in the structure, but by being bearers of properties that are themselves instrumental. It
is all the worse if these properties are null at both interfaces, as is the case with strength
features. Two such cases are discussed in this paper: that of silent inflectional heads
intermediate between a high and a low spell out position (Section 2, (vi)), and that of silent
strong heads, whose lexicalized alternants are invariably weak (Section 2, (vii)). As for the
first case, the model offered in this study does not require these intermediate heads to be
duplicated, forming a strong and a weak paradigm, where it needs to be additionally stipulated
that “strong selects for strong, weak selects for weak.” The movement of the head H
generated below such a sequence of heads to the head position F above them is due to a single
strong feature present on the F component of the H complex itself (see Section 4.2.2).
There is no unexplained alternation between null and filled functional heads in terms
of feature strength either. When a functional head F is morphologically affixal and bears a
strong (i.e., P-uninterpretable) [uF], the complex head H that it is part of will overtly raise to
the F “position,” where its strong [uF] is checked. When F is morphologically free, it will not
be generated as part of the inflected head H, whence H will not raise to F (see Section 4.2.2).
Finally, the present theory resolves the problematic status of Agr (and AgrP). Agr is a
head that is radically null on a checking theory approach: it is null at both interfaces. As
Chomsky (1995) suggests, AgrPs are merely projected to ‘house’ the checking relation of a
head and a DP. The issue that Chomsky (ibid.) raises is that the Agr head itself is never
interpretable. Moreover, one is forced to posit two features in the Agr head, one attracting the
verb, and one attracting the DP. This means that there are altogether four features, out of
which only one happens to be interpretable (agreement on DP). Within this system, Chomsky
(1995) concludes, Agr projections had better not exist. Indeed, he proposes that they do not,
and develops a multiple specifiers based alternative account of V–DP agreement, according to
which agreement is always parasitic on some independently existing head (v and T, in the
clausal domain).34
33
A further complexity that the base-generated duplication gives rise to is that the two sequences of elements
(affixes and functional heads) will need to be matched through some mechanism, in order to derive the
generalization captured by the Mirror Principle (Baker, 1988). The auxiliary assumptions Chomsky (1995)
introduces to ensure the mirror order matching effect are that (i) the order of the checking features on the inflected
word corresponds to the order of affixes inside the morphological word (resulting in a triplication of the same
order), and (ii) checking must proceed in strict order starting from the innermost checking feature. As Brody
(1997a/2000) points out, “the ordering requirement amounts to a stipulation that is not obviously better than
stipulating the mirror principle itself: the mirror principle is also just an ordering statement based on suffix
order.” To make matters worse, the generalization described by the Mirror Principle will only follow on this
account if two ill-understood restrictions are adopted: the ban on excorporation, and the strict locality of HM
itself (see (iii) in Section 2 above). Brody (1997a) concludes that a theory “that makes it possible to project
complementation structure directly from the lexical information encapsulated in the structure of words, making it
possible to eliminate the independent construction of complementation structures” would be a superior
alternative. Both Brody’s (ibid.) Mirror Theory and the present approach are such alternatives. Note that
(re-)projection of the LI components of the complex head H is order preserving with respect to the hierarchy of
LI components internal to H, owing to the fact that re-projection involves re-assembly of the new occurrence of
H via Merge; see Fn. 24. In this manner it is ensured that the hierarchy of syntactic heads mirrors word-internal
morpheme order. See Surányi (2006) for detailed discussion.
At the same time, many researchers are of the opinion that there is solid empirical
evidence that AgrP-type projections do exist, and a great number of analyses of empirical
facts incorporate AgrPs (see e.g., Belletti, 2001, and references therein). Note also that a
Kaynean (single specifier) approach to phrase structure virtually calls for AgrPs in clause
structure (Kayne 1994, 30) in order to be able to provide an account of the simplest word
order facts (cf. Transitive Expletive Constructions, Object Shift, etc.; see also Cinque, 1999).
In short, even though they may well be descriptively motivated, a checking account of head
movement cannot admit AgrPs for conceptual–methodological reasons.
On the account of HM advocated here, AgrPs lead to no conceptual difficulty. This is
because we can directly relate the projection of AgrPs to agreement morphology on the verb.
The case to consider is when verbal agreement morphology and tense are not fused in the
language (cf. Bobaljik and Thráinsson, 1998), but project separately. Given that agreement is
uninterpretable on the verb (or at least it is taken to be uninterpretable in mainstream
minimalist theory), the agreement morpheme, a component of the complex verbal head, will
cause the complex verbal head at some derivational point to raise out of the current phrase
(which is to be subjected to Spell Out), get re-Merged at the root, and project its Agr
component. By projecting that component, agreement features can come to immediately
dominate the DP raised into the specifier of the created AgrP projection, licensing the
checking of Agr. Uninterpretable categorial features like D- and V-features are uncalled for,
only agreement features themselves play a role. There is no need for mediation, and there is
no need for four features, but just for two: agreement on the verb and agreement on the DP,
which enter checking with each other directly. Agr is not interpreted by semantics when it is
part of the verb, but it is interpretable at PF as agreement morphology on the verb (and in fact
it is interpretable as part of the DP, satisfying Radical Interpretability; see Brody, 1997b,
Pesetsky and Torrego, forthcoming). Agr and AgrP pose no special theory-internal problems
at all.
No doubt a number of new, and I believe intriguing, questions arise, if the outline of
the theory of HM put forward in this paper is on the right track. Most have not even been
touched on here, and the resolution of some has been barely sketched.35 Nevertheless, I hope
to have substantiated, even if somewhat programmatically, the attractiveness and plausibility
of the approach, which is defined by the interaction of a small number of general assumptions
about the computational system and the way it interacts with the interface modules. Taking
stock, the key assumptions include the following.
18. a. Last Resort (see the general formulation in (7)),
b. an improved definition of Edge (based on uninterpretable features, see (9)),
c. the local checking relation of immediate dominance supplied by Merge, and its interaction with projection of labels,
d. the head-based cyclic Spell Out defined by Spell Out Earliness (14) (see also (13), (15)), and finally,
e. movement as involving Copy (applied to LIs) and re-Merge at the root.
Note that none of (18a–e) is an addition to the mainstream minimalist theory, but instead,
each one expresses a particular resolution of some fundamental question that arises in any
derivational minimalist approach. None of the specific assumptions I have adopted leads to a
loss (or for that matter, an unsupported gain) in empirical coverage elsewhere, and (excepting
the view of movement in (18e), for which see Fn. 8) each assumption represents a
simplification over its mainstream alternative.
Combined, these assumptions have led to a picture of HM according to which HM and
phrasal movement are symmetric (both can be successive cyclic, both extend the root, etc.);
projection level (i.e., the head vs. phrase distinction) is epiphenomenal (it is not referred to by
any grammatical operation, not even in terms of contextual definitions of Bare Phrase
Structure); the Uniformity Condition remains descriptively correct, but is dispensable as a
principle; and finally, the special locality of HM is the result of the head-based cyclicity of
Spell Out. Hopefully, these results, as well as the simplicity of the theoretical base from
which they are derived, place the theory of HM outlined in this paper on the scene of
currently explored directions as an attractive alternative, according to which the null
hypothesis can be upheld: LIs can be Moved on their own in narrow syntax.
34
Additionally, in Chomsky’s (1995) model, the checking configuration must be complicated so that a V
incorporated into Agr can check against the DP in [Spec,Agr]. If DP and V are attracted to AgrP by
uninterpretable categorial features (a D-feature and a V-feature, respectively), then the fact that V’s phi-features
come to be checked by DP is viewed as a mere accident: they happen to be in the right configuration in AgrP.
Chomsky (2000) provides further arguments against AgrPs. The idea pursued there is that once the
uninterpretable features (undergoing Agree with the verb and with the nominal) are checked and deleted, the
constituent corresponding to AgrP will lack a label, and assuming that Agr has no other features than the ones
checked by V and DP, the whole Agr head will be deleted as such, yielding an illegitimate, because headless,
syntactic object. While Chomsky’s (1995) argument, based on the multiplication of the required uninterpretable
features, is convincing, the force of the one in Chomsky (2000) is questionable: it is unclear why a headless
syntactic object {DP, VP} representing the case of a moved nominal phrase is illegitimate, once there are no
syntax-internal filters on phrase structure (like X-bar theory).
35
For instance, as shown in Surányi (to appear), a certain interpretation of the definition of edge in (9) yields
phase extension (den Dikken, to appear) / phase sliding (Gallego, 2006). Another issue, also sidestepped in this
paper, is whether HM only ensues if the moved head H carries a [uF] that is checked by some element distinct
from H (through movement or external Merge), or whether there exists HM that could be called ‘independent.’ Obtaining
compelling empirical evidence is more difficult than it would seem at first glance, as covert movement (and
external Merger of covert elements) to H need to be properly excluded. In Surányi (2006) it is assumed that such
independent HM can take place, and it is analyzed as involving the checking of an EPP feature on one
component of H by its own phonological features. EPP is construed as a feature whose only property is P-uninterpretability (=being unvalued for P(honological)-features), to be checked (valued) under immediate
dominance by the P-features of some element.
Overt incorporation is yet another issue that cannot be discussed here. Nonetheless, it may be noted that
at least two approaches to incorporation data are consistent with the present account of HM. One option is to
extend this account to incorporation phenomena by base-generating the incorporated element and its host as a
complex head, which is labeled by the incorporated element in the base position and comes to be labeled by
the host when raising via HM to the ‘position’ of the host. Another alternative is to analyze overt incorporation
as phrasal movement to the specifier of the host. Both accounts have been proposed in the literature.
REFERENCES
Abels, K. (2003). Adposition stranding and Universal Grammar. Ph.D. thesis, University of
Connecticut.
Ackema, P., A. Neeleman and F. Weerman (1993). Deriving functional projections. In:
Proceedings of NELS 23, pp. 17-31.
Ackema, P. and A. Čamdžić (2003). LF complex predicate formation: the case of participle
fronting in Serbo-Croatian. In: UCL Working Papers in Linguistics, 15, 131-175.
Adger, D. (2003). Core Syntax. A Minimalist Approach. Oxford University Press, Oxford.
Baker, M. C. (1988). Incorporation: A Theory of Grammatical Function Changing.
University of Chicago Press, Chicago, IL.
Belletti, A. (2001). Agreement projections. In: The handbook of contemporary syntactic
theory (Baltin, M. and C. Collins, ed.), pp. 483-510. Blackwell, Oxford.
Benedicto, E. (1997). V-movement and its interpretational effects. GLOW Newsletter, 38, 14-15.
Bobaljik, J. D. (1995). Morphosyntax: The Syntax of Verbal Inflection. Ph.D. thesis, MIT.
Bobaljik, J. D. and S. Brown (1997). Interarboreal operations: Head movement and the
Extension Requirement. Linguistic Inquiry, 28, 345-356.
Bobaljik, J. D. and H. Thráinsson (1998). Two heads aren’t always better than one. Syntax, 1,
37-71.
Boeckx, C. (2003). Islands and Chains: Resumption as Stranding. John Benjamins,
Amsterdam.
Boeckx, C. and S. Stjepanović (2001). Head-ing toward PF. Linguistic Inquiry, 32, 345-355.
Boeckx, C. and K. Grohmann (To appear). Putting phases into perspective. Syntax.
Borsley, R. D., M.-L. Rivero, and J. Stephens. (1996). Long head movement in Breton. In:
The Syntax of Celtic Languages: A Comparative Perspective (R. D. Borsley and I.
Roberts, ed.), pp. 53-74. Cambridge University Press, Cambridge.
Bošković, Ž. (1995). Participle movement and second position cliticization in Serbo-Croatian.
Lingua, 96, 245-266.
Bošković, Ž. (2001). On the Nature of the Syntax-Phonology Interface. Elsevier, Amsterdam.
Bošković, Ž. (2002a). A-movement and the EPP. Syntax, 5, 167-218.
Bošković, Ž. (2002b). Clitics as nonbranching elements and the linear correspondence
axiom. Linguistic Inquiry, 33, 329-340.
Bošković, Ž. (To appear). On the locality and motivation of Move and Agree: An even more
minimal theory. Linguistic Inquiry.
Brody, M. (1997a). Mirror Theory. Manuscript, University College London/Hungarian
Academy of Sciences.
Brody, M. (1997b). Perfect chains. In: Elements of grammar (L. Haegeman, ed.), pp. 139-167. Kluwer, Dordrecht.
Brody, M. (1998). Projection and phrase structure. Linguistic Inquiry, 29, 367-398.
Brody, M. (2000). Mirror Theory: Syntactic representation in Perfect Syntax. Linguistic
Inquiry, 31, 29-56.
Brody, M. (2002). On the Status of Derivations and Representations. In: Derivation and
Explanation in the Minimalist Program (S. D. Epstein and T. D. Seely, ed.), pp. 19-41. Blackwell, Oxford.
Brody, M. (2004). Move in syntax: Logically necessary or undefinable in the best case?
Manuscript, UCL.
Broekhuis, H. and K. Migdalski (2003). Participle fronting in Bulgarian as XP movement. In:
Linguistics in the Netherlands 2003 (L. Cornips and P. Fikkert, ed.), pp. 1-12. John
Benjamins, Amsterdam.
Bury, D. (2003a). Phrase structure and derived heads. Ph.D. thesis, UCL.
Bury, D. (2003b). Selection and head chains. In Generative Grammar in a Broader
Perspective: Proceedings of the 4th GLOW in Asia (H.-J. Yoon, ed.), pp. 67-86.
Hankook, Seoul.
Cardinaletti, A. and M. Starke (1999). The typology of structural deficiency: A case study of
the three classes of pronouns. In: Clitics in the languages of Europe (Riemsdijk, H.
van, ed.), pp. 145-233. de Gruyter, Berlin.
Carnie, A. (1995). Nonverbal predication and head movement. Ph.D. thesis, MIT.
Cheng, L. L-S. and J. Rooryck (2002). Types of wh-in-situ. Manuscript, Leiden University.
Chomsky, N. (1986). Barriers. MIT Press, Cambridge, MA.
Chomsky, N. (1993). A minimalist program for linguistic theory. In: The view from Building
20: Essays in linguistics in honor of Sylvain Bromberger (K. Hale and S. Keyser, ed.),
pp. 1-52. MIT Press, Cambridge, MA.
Chomsky, N. (1994). Bare phrase structure. MIT Occasional Papers in Linguistics, 5.
MITWPL, MIT.
Chomsky, N. (1995). Categories and transformations. In: The Minimalist Program, pp. 219-394. MIT Press, Cambridge, MA.
Chomsky, N. (2000). Minimalist inquiries: the framework. In: Step by Step: Essays on
minimalism in honor of Howard Lasnik (R. Martin, D. Michaels and J. Uriagereka,
ed.), pp. 89-155. MIT Press, Cambridge, MA.
Chomsky, N. (2001). Derivation by phase. In: Ken Hale: A Life in Language (M. Kenstowicz,
ed.), pp. 1-52. MIT Press, Cambridge, MA.
Chomsky, N. (2004). Beyond explanatory adequacy. In: The cartography of syntactic
structures. Structures and Beyond, Vol. 3 (A. Belletti, ed.), pp. 104-131. Oxford
University Press, Oxford.
Chomsky, N. (2005a). Three factors in language design. Linguistic Inquiry, 36, 1-22.
Chomsky, N. (2005b). On phases. Manuscript, MIT.
Chomsky, N. and H. Lasnik (1993). The theory of principles and parameters. In: Syntax: An
International Handbook of Contemporary Research (J. Jacobs, A. von Stechow, W.
Sternefeld and T. Vennemann, ed.), pp. 506-569. de Gruyter, Berlin.
Cinque, G. (1999). Adverbs and Functional heads: A Cross-Linguistic Perspective. Oxford
University Press, Oxford.
Collins, C. (1997). Local Economy. MIT Press, Cambridge, MA.
Collins, C. (2002). Eliminating labels. In: Derivation and Explanation in the Minimalist
Program (S. D. Epstein and T. D. Seely, ed.), pp. 42-61. Blackwell, Oxford.
Dikken, M. den. (To appear). Phase extension: Contours of a theory of the role of head
movement in phrasal extraction. Theoretical Linguistics.
Donati, C. (2006). On wh-head movement. In: Wh-Movement. Moving On (L. Cheng and J.
Rooryck, ed.), pp. 21-46. MIT Press, Cambridge, MA.
É. Kiss, K. and H. van Riemsdijk (ed.) (2004). The Verbal Complex. A Study of Hungarian,
German and Dutch. John Benjamins, Amsterdam.
Embick, D. and R. Noyer. (2001). Movement operations after syntax. Linguistic Inquiry, 32,
555-595.
Epstein, S. (1999). Un-principled syntax and the derivation of syntactic relations. In: Working
Minimalism (S. Epstein and N. Hornstein, ed.), pp. 317-345. MIT Press, Cambridge,
MA.
Epstein, S., E. Groat, R. Kawashima and H. Kitahara (1998). A Derivational Approach to
Syntactic Relations. Oxford University Press, Oxford.
Epstein, S. D., and T. D. Seely (2002). Rule applications as cycles in a level-free syntax. In:
Derivation and Explanation in the Minimalist Program (S. D. Epstein and T. D. Seely,
ed.), pp. 65-89. Blackwell, Oxford.
Epstein, S. D. and T. D. Seely. (2006). Derivations in Minimalism. Cambridge University
Press, Cambridge.
Fanselow, G. (2001). When formal features need company. In: Audiatur Vox Sapientiae (C.
Féry and W. Sternefeld, ed.), pp. 131-152. Akademie-Verlag, Berlin.
Fanselow, G. (2003). Münchhausen-style head movement and the analysis of Verb Second.
In: Syntax at Sunset 3: Head Movement and Syntactic Theory. UCLA Working Papers
in Linguistics, 10 (A. Mahajan, ed.), pp. 40-76. University of California, LA.
Fanselow, G. (2004). Münchhausen-style head movement and the analysis of Verb Second.
In: Linguistics in Potsdam, 22, 9-49.
Fox, D. (2000). Economy and Semantic Interpretation. MIT Press, Cambridge, MA.
Fox, D. and D. Pesetsky (2005). Cyclic linearization of syntactic structure. Theoretical
Linguistics, 31, 1-46.
Frampton, J. and S. Gutmann (1999). Cyclic computation, a computationally efficient
minimalist syntax. Syntax, 2, 1-27.
Frampton, J. and S. Gutmann. (2002). Crash-proof syntax. In: Derivation and Explanation in
the Minimalist Program (S. D. Epstein, and T. D. Seely, ed.), pp. 90-105. Blackwell,
Oxford.
Gallego, Á. J. (2006). Phase sliding. Manuscript, UAB/UMD.
Gärtner, H-M. (2002). Generalized Transformations and Beyond: Reflections on Minimalist
Syntax. Akademie Verlag, Berlin.
Haan, G.J. de (2001). More is going on upstairs than downstairs: embedded root phenomena
in West Frisian. Journal of Comparative Germanic Syntax, 4, 3-38.
Halle, M. and A. Marantz. (1993). Distributed Morphology and the pieces of inflection. In:
The View from Building 20 (K. Hale and S. J. Keyser, ed.), pp. 111-176. MIT Press,
Cambridge, MA.
Harley, H. (2004). Merge, conflation, and head movement: The First Sister Principle
revisited. In: Proceedings of NELS 34 (K. Moulton and M. Wolf, ed.) GLSA,
Amherst, MA.
Heck, F. and G. Müller. (2001). Repair-driven movement and the local optimization of
derivations. Manuscript, Universität Stuttgart and IDS Mannheim.
Hiraiwa, K. (2001). On nominative-genitive conversion. In: A Few From Building E-39. MIT
Working Papers in Linguistics, 39 (O. Matushansky and E. Guerzoni, ed.), pp. 66-123.
MIT, Cambridge, MA.
Holmberg, A. (1991). Head scrambling. Talk at GLOW 1991, Leiden, The Netherlands.
Hornstein, N. (2001). Move! A Minimalist Theory of Construal. Blackwell, Oxford.
Hornstein, N. and J. Uriagereka. (2002). Reprojections. In: Derivation and Explanation in the
Minimalist Program (S. D. Epstein and T. D. Seely, ed.), pp. 106-132. Blackwell,
Oxford.
Julien, M. (2000). Syntactic heads and word formation: A study of verbal inflection. Ph.D.
thesis, University of Tromsø.
Julien, M. (2002). Syntactic Heads and Word Formation. Oxford University Press, Oxford.
Kayne, R. (1994). The Antisymmetry of Syntax. MIT Press, Cambridge, MA.
Kayne, R. (2002). Pronouns and their antecedents. In: Derivation and Explanation in the
Minimalist Program (S. D. Epstein and T. D. Seely, ed.), pp. 133-166. Blackwell,
Oxford.
Kitahara, H. (1995). Target α: Deducing strict cyclicity from derivational economy.
Linguistic Inquiry, 26, 47-77.
Kitahara, H. (1997). Elementary Operations and Optimal Derivations. MIT Press,
Cambridge, MA.
Kitahara, H. (2000). Two (or more) syntactic categories vs. multiple occurrences of one.
Syntax, 3, 151-158.
Koeneman, O. (2000). The flexible nature of verb movement. Ph.D. thesis, University of
Utrecht.
Koopman, H. and A. Szabolcsi (2000). Verbal Complexes. MIT Press, Cambridge, MA.
Lasnik, H. (1999). Minimalist Analysis. Blackwell, Oxford.
Lechner, W. (2005). Semantic and syntactic effects of head movement. Talk at GLOW 2005,
Geneva.
Legate, J. A. (2003). Some interface properties of the phase. Linguistic Inquiry, 34, 506-516.
Legate, J. A. (2005). Phases and cyclic agreement. In: Perspectives on Phases. MIT Working
Papers in Linguistics, 49 (M. McGinnis and N. Richards, ed.), pp. 147-156. MIT,
Cambridge, MA.
Mahajan, A. (2000). Eliminating head-movement. The GLOW Newsletter, 44, 44-45.
Mahajan, A. (2003). Word order and (remnant) VP movement. In: Word Order and
Scrambling (S. Karimi, ed.), pp. 217-237. Blackwell, Oxford.
Matushansky, O. (2005). Going through a phase. In: Perspectives on Phases. MIT Working
Papers in Linguistics, 49 (M. McGinnis and N. Richards, ed.) MIT, Cambridge, MA.
Matushansky, O. (2006). Head movement in linguistic theory. Linguistic Inquiry, 37, 69-109.
Müller, G. (2001). Order preservation, parallel movement, and the emergence of the
unmarked. In: Optimality-Theoretic Syntax (G. Legendre, J. Grimshaw and S. Vikner,
ed.), pp. 279-313. MIT Press, Cambridge, MA.
Müller, G. (2004). Phrase impenetrability and wh-intervention. In: Minimality Effects in
Syntax (A. Stepanov, G. Fanselow and R. Vogel, ed.), pp. 289-325. Mouton/de
Gruyter, Berlin.
Munn, A. (1994). A minimalist account of reconstruction asymmetries. In: Proceedings of
NELS 24, pp. 397-410. GLSA, Amherst, MA.
Nash, L. and A. Rouveret (1997). Proxy categories in phrase structure theory. In: Proceedings
of NELS 27, pp. 287-304. GLSA, Amherst, MA.
Nissenbaum, J. (2000). Investigations of covert phrasal movement. Ph.D. thesis, MIT.
Nunes, J. (2001). Sideward movement. Linguistic Inquiry, 32, 303-343.
Nunes, J. (2004). Linearization of Chains and Sideward Movement. MIT Press, Cambridge,
MA.
Ochi, M. (1999). Constraints on feature checking. Ph.D. thesis, University of Connecticut,
Storrs.
Pesetsky, D. (2000). Phrasal Movement and Its Kin. MIT Press, Cambridge, MA.
Pesetsky, D. and E. Torrego (2001). T-to-C movement: Causes and consequences. In: Ken
Hale: A life in language (M. Kenstowicz, ed.), pp. 355-426. MIT Press, Cambridge,
MA.
Pesetsky, D. and E. Torrego (Forthcoming). The syntax of valuation and the interpretability of
features. In: Clever and Right: Festschrift in honor of Joe Emonds (S. Karimi, V.
Samiian, and W. Wilkins, ed.) Mouton de Gruyter, Berlin.
Reinhart, T. (2006). Interface Strategies: Optimal and Costly Computations. MIT Press,
Cambridge, MA.
Richards, N. (1999). Featural cyclicity and the ordering of multiple specifiers. In: Working
Minimalism (S. D. Epstein and N. Hornstein, ed.), pp. 127-158. MIT Press,
Cambridge, MA.
Richards, N. (2001). Movement in Language: Interactions and Architectures. Oxford
University Press, Oxford.
Rivero, M-L. (1991). Long head movement and negation: Serbo-Croatian vs. Slovak and
Czech. The Linguistic Review, 8, 319-351.
Rivero, M-L. (1994). Clause structure and V-movement in the languages of the Balkans.
Natural Language and Linguistic Theory, 12, 63-120.
Roberts, I. (1991). Excorporation and minimality. Linguistic Inquiry, 22, 209-218.
Roberts, I. (1994). Two types of head-movement in Romance. In: Verb Movement (N.
Hornstein and D. Lightfoot, ed.), pp. 207-242. Cambridge University Press,
Cambridge.
Roberts, I. (1997). Restructuring, head movement and locality. Linguistic Inquiry, 28, 423-460.
Sauerland, U. (1998). The meaning of chains. Ph.D. thesis, MIT.
Sauerland, U. (2004). The interpretation of traces. Natural Language Semantics, 12, 63-127.
Schafer, R. (1997). Long head movement and information packaging in Breton. Canadian
Journal of Linguistics, 42, 169-203.
Shima, E. (2000). A preference for Move over Merge. Linguistic Inquiry, 31, 375-385.
Sportiche, D. (1992). Clitic constructions. Manuscript, UCLA.
Sportiche, D. (1998). TBA. Handout, MIT.
Starke, M. (2001). Move dissolves into merge: a theory of locality. Ph.D. thesis, University of
Geneva.
Stjepanović, S. and S. Takahashi (2002). Eliminating the Phase Impenetrability Condition.
Manuscript, Kanda University of International Studies.
Surányi, B. (2002). Multiple operator movements in Hungarian. Ph.D. thesis, University of
Utrecht.
Surányi, B. (2003). Head movement qua substitution. GLOW Newsletter #26 (D. Adger and
P. Svenonius, ed.)
Surányi, B. (2004a/2000). The left periphery and Cyclic Spellout: the case of Hungarian. In:
Peripheries: Syntactic Edges and their Effects (D. Adger, C. de Cat and G. Tsoulas,
ed.), pp. 49-73. Kluwer, Dordrecht.
Surányi, B. (2004b). Head movement qua root merger. In: The Even Yearbook 9 (L. Varga,
ed.), pp. 167-183. ELTE, Budapest.
Surányi, B. (2005). Object Shift and linearization at the PF interface. Theoretical Linguistics,
31, 199-213.
Surányi, B. (2006). Cyclic Spell Out and head movement. Manuscript, Hungarian Academy
of Sciences.
Surányi, B. (To appear). On phase extension and head movement. Theoretical Linguistics.
Svenonius, P. (1994). C-selection as feature-checking. Studia Linguistica, 48, 133-155.
Svenonius, P. (2001). On Object Shift, scrambling, and the PIC. In: A Few from Building E39.
Papers in Syntax, Semantics and their Interface. MIT Working Papers in Linguistics,
39 (E. Guerzoni and O. Matushansky, ed.) MIT, Cambridge, MA.
Svenonius, P. (2004). On the edge. In: Peripheries: Syntactic Edges and their Effects (D.
Adger, C. de Cat and G. Tsoulas, ed.), pp. 261-287. Kluwer, Dordrecht.
Takahashi, D. (1994). Minimality of movement. Ph.D. thesis, University of Connecticut.
Toyoshima, T. (2000). Head-to-Spec movement and dynamic economy. Ph.D. thesis, Cornell
University.
Toyoshima, T. (2001). Head-to-Spec movement. In: The Minimalist Parameter. Selected
Papers from the Open Linguistics Forum, Ottawa, 12-23 March 1997. (G. M.
Alexandrova and O. Arnaudova, ed.), pp. 115-136. John Benjamins, Amsterdam.
Vikner, S. (1995). Verb Movement and Expletive Subjects in the Germanic Languages. Oxford
University Press, Oxford.
Watanabe, A. (1995). Conceptual basis of cyclicity. In: Papers on Minimalist Syntax. MIT
Working Papers in Linguistics, 27 (R. Pensalfini and H. Ura, ed.), pp. 269-291.
MITWPL, MIT.
Zwart, J.-W. (1992). Matching. In: Language and Cognition 2. Yearbook 1992 of the
Research Group for Linguistic Theory and Knowledge Representation of the
University of Groningen (D. Gilbers and S. Looyenga, ed.), pp. 349-361. University of
Groningen.
Zwart, J.-W. (2001). Syntactic and phonological verb movement. Syntax, 4, 34-62.
Zwart, J.-W. (2002). Issues relating to a derivational theory of binding. In: Derivation and
Explanation in the Minimalist Program (S. D. Epstein and T. D. Seely, ed.), pp. 269-302.
Blackwell, Oxford.
Zwart, J.-W. (2003). Agreement as sisterhood. Talk at the Comparative Germanic Syntax
Workshop 18, 2003, Durham.