Journal of Memory and Language 56 (2007) 384–409 Journal of Memory and Language www.elsevier.com/locate/jml The effect of syntactic constraints on the processing of backwards anaphora Nina Kazanina a,*, Ellen F. Lau b, Moti Lieberman c, Masaya Yoshida b, Colin Phillips b,d a Department of Linguistics, University of Ottawa, 401 Arts Hall, 70 Laurier Avenue East, Ottawa, Ont., KIN 6N5, Canada b Department of Linguistics, University of Maryland, College Park, MD 20742, USA c Department of Linguistics, McGill University, 1085 Dr. Penfield, Montreal, QC H3A 1A7, Canada d Neuroscience and Cognitive Science Program, University of Maryland, College Park, MD 20742, USA Received 5 February 2006; revision received 20 September 2006 Available online 3 November 2006 Abstract This article presents three studies that investigate when syntactic constraints become available during the processing of long-distance backwards pronominal dependencies (backwards anaphora or cataphora). Earlier work demonstrated that in such structures the parser initiates an active search for an antecedent for a pronoun, leading to gender mismatch effects in cases where a noun phrase in a potential antecedent position mismatches the gender of the pronoun [Van Gompel, R. P. G. & Liversedge, S. P. (2003). The influence of morphological information on cataphoric pronoun assignment. Journal of Experimental Psychology: Learning, Memory, and Cognition, 29, 128–139]. Results from three self-paced reading studies suggest that structural constraints on coreference, in particular Principle C of the Binding Theory [Chomsky, N. (1981). Lectures on government and binding. Dordrecht, Foris], exert an influence at an early stage of this search process, such that gender mismatch effects are elicited at grammatically licit antecedent positions, but not at grammatically illicit antecedent positions. The results also show that the distribution of gender mismatch effects is unlikely to be due to differences in the predictability of different potential antecedents. These findings suggest that backwards anaphora dependencies are processed with a grammatically constrained active search mechanism, similar to the mechanism used to process another type of long-distance dependency, the wh dependency (e.g., [Stowe, L. (1986). Evidence for online gap creation. Language and Cognitive Processes, 1, 227–245; Traxler, M. J., & Pickering, M. J. (1996). Plausibility and the processing of unbounded dependencies: an eye-tracking study. Journal of Memory and Language, 35, 454–475.]). We suggest that the temporal priority for syntactic information observed here reflects the predictability of structural information, rather than the need for an architectural constraint that delays the use of non-syntactic information. 2006 Elsevier Inc. All rights reserved. Keywords: Parsing; Coreference; Binding; Cataphora; Grammatical constraints; Pronouns * Corresponding author. Fax: +1 613 562 5141. E-mail address: [email protected] (N. Kazanina). 0749-596X/$ - see front matter 2006 Elsevier Inc. All rights reserved. doi:10.1016/j.jml.2006.09.003 N. Kazanina et al. / Journal of Memory and Language 56 (2007) 384–409 Introduction To understand the architecture of the human language processor it is important to establish the extent to which its behavior is governed by general mechanisms or by highly specific subroutines that are restricted to individual constructions or individual languages. In this article we address the question of the generality of sentence processing mechanisms by examining whether well-known properties of the processing of long-distance filler-gap dependencies are also found in the processing of backwards pronominal dependencies (backwards anaphora or cataphora), as found in sentences like When she was in Paris, Susan visited the Louvre. Backwards anaphoric dependencies parallel filler-gap dependencies in terms of the relative ordering of key elements, but differ in a number of other respects, including the obligatoriness and overtness of the elements of the dependency and the syntactic and discourse constraints that they are subject to. An improved understanding of how backwards anaphora is processed may contribute to a fuller understanding of the mechanisms responsible for the construction of long-distance relations in language in general. Much previous research on filler-gap dependencies such as wh interrogatives, topicalizations and scrambling constructions has established two generalizations that form the starting point for the present studies. First, much evidence indicates that the parser constructs filler-gap dependencies actively, i.e., after encountering a suitable filler (e.g., a wh-phrase) it constructs gap positions without waiting for unambiguous evidence for the position of the gap in the input (e.g., Crain & Fodor, 1985; Frazier & Clifton, 1989; Stowe, 1986). Second, the mechanisms that search for potential gap positions show on-line sensitivity to grammatical constraints on fillergap dependencies, such that active gap construction is not observed in structural positions that are ruled out by those constraints (e.g., Stowe, 1986; Traxler & Pickering, 1996). In this study we investigate whether these two properties are also found in the processing of backwards anaphora. Previous studies of the processing of backwards anaphora provide some evidence for active dependency formation mechanisms (Van Gompel & Liversedge, 2003) and for the impact of grammatical constraints (Cowart & Cairns, 1987), but in both cases the findings are open to alternative interpretations. In the current study we examine these issues in more detail, with a particular focus on distinguishing the effects of grammatical constraints from the effects of distributional probabilities on active dependency formation. Properties of long-distance dependencies The wh-constructions in (1) are examples of a longdistance dependency that create a relation between a 385 wh-phrase, also known as a filler, and its gap position, as in (1a). Although the filler and the gap may potentially be separated by an indefinite amount of intervening material (1b–c), the occurrence of the filler reliably predicts the occurrence of a gap within the same sentence and if there is no gap, the result is unacceptable (1d). Note that we use the filler-gap terminology here mostly for the purposes of exposition. The long-standing controversy about whether wh-dependencies genuinely involve gaps or whether they involve direct associations between verbs and wh-phrases (for reviews see Phillips & Wagers, in press; Pickering, 1993) is orthogonal to the processing issues that are the focus of this article. (1) a. What did John see __? b. What did Mary say that John saw __? c. What did Bill say that Mary claimed that John saw __? d. *What did John see an apple? The relative positioning of the two elements of a whdependency is subject to a number of constraints on well-formedness, often known as island constraints. The sentences in (2), for example, are unacceptable due to a ban on filler-gap dependencies that cross relative clause boundaries as in (2a) or that span non-complement adverbial clauses as in (2b) (Chomsky, 1973; Huang, 1982; Ross, 1967). (2) a. *What did the boy that bought __ like Mary? b. *What did the boy read a book while his sister was playing with __? Referential dependencies between a pronoun and an antecedent noun phrase include forwards anaphora, in which the antecedent precedes the pronoun (3a), and backwards anaphora, in which the pronoun precedes the antecedent (3b). Similar to filler-gap dependencies, referential dependencies can span multiple clauses, as in (3a), and are subject to structural restrictions. Most relevant for the current study is a constraint, Principle C (Chomsky, 1981), that prohibits referential dependencies in which the pronoun structurally c-commands the antecedent, i.e., where the antecedent is structurally within the scope of the pronoun (4). (3) a. Johni said that the news report would make the customers think that hei wanted to sell the company. b. Although shei was sleepy, Susani tried to pay attention. 386 N. Kazanina et al. / Journal of Memory and Language 56 (2007) 384–409 (4) *Hei said that Johni saw the movie. intended meaning: ‘John said that he (=John) saw the movie.’ In research on language processing, the constraints on long-distance filler-gap dependencies have received a good deal of attention, but there has been relatively little research on the processing of constraints on backwards anaphora. Real time processing of wh-dependencies Fodor (1978) sketched out two potential mechanisms for parsing filler-gap dependencies, contrasting a ‘gapdriven’ view whereby dependency formation is initiated only when the parser has encountered all necessary pieces of information in the input, that is, both the filler and the gap, with a ‘filler-driven’ view, according to which the parser starts predictively building a dependency upon encountering the filler that constitutes the first half of the dependency. Stowe (1986) explored this distinction using a word-by-word self-paced reading technique, by comparing reading times for sentences that contain a licit wh-dependency (5a) with controls that lack a whdependency (5b). (5) a. My brother wanted to know who Ruth will bring us home to __ at Christmas. b. My brother wanted to know if Ruth will bring us home to Mom at Christmas. Stowe found that readers slowed down at the direct object noun phrase us in wh-sentences, relative to the corresponding word in the control sentence, and argued that this effect arose because the parser predictively posited a gap in direct object position, which subsequently needed to be retracted upon encountering the overt pronoun us. This ‘filled-gap effect’ is taken to support the ‘filler-driven’ view and to reflect the parser’s willingness to complete a wh-dependency before it has unambiguous bottom-up evidence that the verb has a missing argument. Such a mechanism for fillergap dependency formation was dubbed as active search, and has been supported by evidence from similar filled-gap effects in diverse syntactic positions (Aoshima, Phillips, & Weinberg, 2004; Crain & Fodor, 1985; Lee, 2004), and by evidence using different experimental measures, including eye-tracking (Sussman & Sedivy, 2003; Traxler & Pickering, 1996), event-related potentials (Garnsey, Tanenhaus, & Chapman, 1989; Kaan, Harris, Gibson, & Holcomb, 2000), and cross-modal lexical priming (Nicol, Fodor, & Swinney, 1994). The evidence for active wh-dependency formation leads to a further question about the grammatical accuracy of the parser. Whereas a gap-driven parsing mechanism could be expected to posit gaps only in grammatically acceptable positions, by virtue of its dependence on direct evidence in the language input, this is not guaranteed for an active dependency formation mechanism, since it is predictive and less closely tied to evidence in the bottom-up input. If the process of active dependency formation immediately adheres to grammatical constraints, the parser should not attempt to posit a gap inside an island, despite its preference to complete the dependency as soon as possible. In her second experiment, Stowe (1986) found that no filled-gap effect was observed inside a syntactic island, suggesting that the parser respects constraints on wh-movement even at the earliest stages of filler-gap dependency formation (see also Bourdages, 1992; Phillips, 2006; Traxler & Pickering, 1996 for a similar conclusion and Pickering, Barton, & Shillcock, 1994 for discussion of potentially discrepant findings). Referential dependencies Backward anaphoric dependencies parallel fillergap dependencies in the respect that they may span long-distances, and in the respect that a dependent element (wh-phrase, pronoun) precedes a controlling element (verb or gap, antecedent noun phrase). This raises the possibility that the parser may analyze backward anaphoric dependencies in the same fashion as filler-gap dependencies, searching for an antecedent for an unanchored pronoun in a manner that is active yet sensitive to constraints on referential dependencies. However, unlike wh-fillers that require a gap position, finding an overt antecedent is not a grammatical requirement for a pronoun, which can be taken to refer to an unspecified discourse referent. For this reason, an active search mechanism may not be as useful for backwards anaphora as for filler-gap dependencies. Furthermore, if common mechanisms are used for parsing forwards and backwards referential dependencies, then this is another reason why active search might not be deployed, since forward anaphoric dependencies cannot be recognized until both the antecedent and the pronoun have been encountered. If pronouns do trigger an active search for an antecedent, this leads to the question of whether binding principles immediately constrain active search for antecedents. If binding principles, like island constraints on filler-gap dependencies, constrain the earliest stages of structure generation, we should expect that during an active search for an antecedent triggered by a cataphoric pronoun the parser should never consider positions that would violate the binding constraints. Note that our use N. Kazanina et al. / Journal of Memory and Language 56 (2007) 384–409 of the term ‘antecedent’ in referring to a noun phrase does not necessarily entail that that noun phrase is a licit licensor for the dependent element in question. Instead it will be used as a shortcut for ‘a candidate antecedent’ which could become a licit antecedent provided that (i) it occurs in a licit structural position, and (ii) it matches the pronoun in its morphological features (gender, number). Active search in backwards anaphora processing Previous evidence for active search for antecedents in the processing of backwards anaphora comes from a study by Van Gompel and Liversedge (2003) that used eye-tracking to examine reading times for bi-clausal sentences as in (6). The gender of the main subject was manipulated such that it either matched (6a) or mismatched (6b) with the preceding pronoun. In all sentences an antecedent for the pronoun appeared at some point in the main clause. (6) Stimuli from Experiment 1, Van Gompel and Liversedge (2003). a. gender match When he was at the party, the boy cruelly teased the girl during the party games. b. gender mismatch When he was at the party, the girl cruelly teased the boy during the party games. c. control When I was at the party, the boy cruelly teased the girl during the party games. Van Gompel and Liversedge found a mismatch effect in early eye-tracking measures at the region immediately following the main clause subject noun: first-pass reading times were slower at cruelly in (6a) than in (6b). Furthermore, inclusion of the control condition in (6c) made it possible to rule out the possibility that the longer reading times following the second subject in (6b) were due to the introduction of a new discourse entity, since reading times in the critical region in the control condition (6c) did not differ from (6a). Van Gompel and Liversege instead argue that this gender mismatch effect reflects formation of a referential dependency between the pronoun and the second subject noun phrase before relevant bottom-up information about the antecedent has been taken into consideration, which leads to processing difficulty when the antecedent noun phrase is recognized as semantically incompatible with the pronoun. This evidence parallels the filled-gap effect found in filler-gap dependency processing, and may be considered as evidence for active formation of backwards anaphoric dependencies 387 (for similar findings from Japanese see Aoshima, 2003; Aoshima, Yoshida, & Phillips, submitted for publication). Note that a number of different processing mechanisms for backward anaphora may be considered as ‘active’, in the respect that dependencies are constructed before all bottom-up information has been analyzed. Van Gompel and Liversedge (2003) assume that the syntactic category of the antecedent must be processed before a referential dependency can be formed, whereas computation of its morphological properties ‘is delayed until after the computation of coreference relations’ (p. 128), and thus argue that syntactic category information has an architectural priority in parsing (e.g., Cowart & Cairns, 1987; Fodor, 1983; Frazier, 1987). However, Van Gompel and Liversedge’s findings are also compatible with an even more ‘active’ processing mechanism that constructs referential dependencies as soon as an antecedent position can be reliably predicted, potentially before any of the specific features of the antecedent are encountered in the input. Under this approach, the parser may recognize that a sentence-initial while-clause, as in (6), must be followed by a main clause that contains a subject noun phrase, and may use this knowledge to construct a dependency between the pronoun and the main clause subject even before the end of the while-clause. If this is the case, then the gender mismatch effect may reflect the fact that a subject position may be reliably projected ahead of time as a structurally licit antecedent, whereas its gender and other features may only become known once the subject noun phrase is encountered bottom-up. This view would obviate the need to impose an architectural constraint that delays the availability of morphological information, as Van Gompel and Liversedge propose. The gender mismatch paradigm therefore provides a useful method for tracking active dependency formation in backwards anaphora, but leaves open the question of how ‘active’ the search for pronoun antecedents is, and also how sensitive it is to constraints on referential dependencies. Processing of constraints on forwards anaphora A number of previous studies explored the timecourse of the application of binding constraints on forwards anaphora constructions in which the pronoun follows the antecedent, in particular the requirement that reflexives find a clausemate antecedent (7) and the requirement that pronouns not have a clausemate antecedent (8), known as Principle A and Principle B respectively (Chomsky, 1981; Reinhart & Reuland, 1993). 388 N. Kazanina et al. / Journal of Memory and Language 56 (2007) 384–409 (7) a. Susan heard that Billi had painted himselfi. b.*Billi heard that Susan had painted himselfi. (8) a. *Susan heard that Billi had painted himi. b. Billi heard that Susan had painted himi. Evidence has been found for both early/immediate and late/delayed application of these constraints. Nicol and Swinney (1989) examined the activation of candidate antecedents at a pronoun or reflexive using a cross-modal priming task and found that candidates excluded by Conditions A and B were not considered (see also Clifton, Kennison, & Albrecht, 1997). In contrast, Badecker and Straub (2002) showed in a selfpaced reading task that reading times for a pronoun were affected by the gender congruency of a grammatically inaccessible antecedent, and concluded from this that the application of grammatical constraints on binding was delayed relative to early parsing processes (see also Kennison, 2003). More recently, Sturt (2003) found in eye-tracking studies on the processing of reflexives that early eye-tracking measures (first-fixation and first-pass) were unaffected by the gender congruency of grammatically inaccessible antecedents, although some later eye-tracking measures (secondpass, regression path) did show effects of those antecedents. Sturt’s findings raise the possibility that discrepancies among earlier findings might have been due to differences in the stage of processing that different tasks tapped into. This parallels findings on the processing of filler-gap dependencies, where evidence for the construction of illicit dependencies comes primarily from studies that have used whole sentence judgment tasks. A difficulty in interpreting effects of grammatical constraints on the processing of forwards anaphora arises from the fact that the pronoun or reflexive cannot, in general, be anticipated, and therefore the search for an antecedent cannot be initiated until the end of the dependency is reached. Consequently, the search for an antecedent involves a search in memory for candidate antecedents. If there are multiple candidate antecedents, they may need to be evaluated in parallel. Studies of binding constraints in processing have asked whether the parser selectively searches structural positions where antecedents are allowed, but this question overlaps with the question of what type of memory representation is searched. The pieces of a sentence may be encoded in echoic memory, in a syntactic parse, or in a discourse model, and may to some degree be stored in all of these simultaneously. Therefore, evidence for effects of grammatically inaccessible antecedents on the processing of pronouns or reflexives may reflect processes that search illegal antecedent positions in a syntactic structure, or they may reflect processes that access a non-syntactic encoding of potential antecedents. Furthermore, in cases where the effect of structurally inaccessible antecedents is realized as a processing disruption due to the presence of two candidate antecedents (e.g., Badecker & Straub, 2002; Sturt, 2003), this may reflect interference effects at a lexical or conceptual level that make a grammatically accessible antecedent harder to retrieve from memory when it is similar to another noun phrase in the sentence. These difficulties do not arise in the processing of backwards anaphora, since the dependency can be anticipated after processing of its first element, i.e., the pronoun, and therefore each potential antecedent can be evaluated in succession. It is therefore possible to investigate, for a series of different structural positions, whether the search for an antecedent for the pronoun is sensitive to grammatical constraints on backwards anaphora. This may be viewed as a particularly strong test of the impact of binding constraints in parsing, since successful constraint application requires that the parser ignore a candidate antecedent at a point where no other antecedent is yet available. Principle C in grammar and processing Although backwards anaphora is productive and fully acceptable in English (see van Hoek, 1997 for multiple naturalistic examples), it is excluded in configurations where a pronoun c-commands its antecedent, as in the examples in (9), where the noun phrases he and John cannot be understood as coreferential. This constraint, known as Principle C (Chomsky, 1981), is highly robust across languages (e.g., Baker, 1991; Jelinek, 1984; but cf. Speas, 1990; Bruening, 2001) and appears early in language development (Crain & McKee, 1985; Kazanina & Phillips, 2001). (9) a. *Hei likes Johni. b. *Hei said that Johni likes wine. c. *Hei drank beer while Johni watched a soccer game. It should be noted that there are at least two welldefined classes of apparent exceptions to Principle C, which must be taken into consideration when designing studies on this constraint. The first type of exception can be seen in a sentence like He then did what John always did in such situations, where coreference between he and John is acceptable. Such cases are generally understood to involve comparisons of multiple mental representations or ‘guises’ of the same individual (e.g., Heim, 1992; Reinhart, 1983), and minimal changes to the situations under comparison can lead to failure of coreference, as in *He then did what John had done half an hour earlier. Rather than undermine the structural account of Principle C, such contrasts suggest that we should understand coreference in terms N. Kazanina et al. / Journal of Memory and Language 56 (2007) 384–409 of elements in a mental model, which may stand in a many-to-one relation with entities in the world. A second type of exception involves sentences like He was threatening to leave when Billy noticed that the computer had died (Harris & Bates, 2002). Such cases are clearly acceptable, and are most effective in descriptions of scenarios where the embedded clause event interrupts the main clause event. Again, minimal changes make coreference unacceptable, as in *He decided to leave when Billy noticed that the computer had died. Syntactic tests suggest, however, that the when-clause in such exceptional examples is a sentential modifier, rather than a verb phrase-subordinator, which takes away the problematic c-command relation between the pronoun and the embedded subject and obviates the Principle C violation. In sum, both apparent exceptions to Principle C may not ultimately be counterexamples to the structural generalization (see Kazanina, 2005 for more detail). Yet, in all of the studies reported below care was taken to avoid such exceptions in constructing the experimental materials and the acceptability of all contrasts under investigation was verified by off-line questionnaires. Previous psycholinguistic investigation of Principle C in sentence processing has been limited. In a whole-sentence self-paced reading Hirst and Brill (1980) looked for effects of an inaccessible antecedent in constructions that contained a potential Principle C violation using two sentence sequences such as (10), that contained both a grammatically accessible antecedent for the pronoun (John) and a grammatically inaccessible one (Henry). (10) John stood watching. He ran for a doctor after Henry fell down some stairs. Hirst and Brill found that reading times for the second sentence differed as a function of the plausibility of the inaccessible antecedent (Henry) as the referent for the pronoun. They concluded that during pronoun resolution speakers temporarily consider grammatically illicit referents. However, the coarse temporal resolution of the task makes it difficult to draw inferences regarding the time-course and the source of the effect. Specifically, the effect could be due to a later stage of processing at which information from multiple sources is combined and the sentence is evaluated for overall plausibility. Cowart and Cairns (1987) used sentence fragments such as (11)–(12) to investigate the processing of backwards anaphora and the application of Principle C. (11) a. Whenever they lecture during the procedure, charming babies is. . . b. Whenever you lecture during the procedure, charming babies is. . . 389 (12) a. If they want to believe that visiting uncles is. . . b. If you want to believe that visiting uncles is. . . The first phrase of the 2nd clause (charming babies,visiting uncles), which is ambiguous between a noun phrase and a gerund, is disambiguated by the number marking on the following auxiliary in favor of the gerund parse. Cowart and Cairns investigated whether this ambiguity was affected by a pronoun subject in a preceding clause. If the pronoun initiates a search for an antecedent then this may increase the likelihood that the ambiguous phrase should be analyzed as a noun phrase in (11a)–(12a), relative to (11b)–(12b), which contain an indexical pronoun. In either condition there was an additional restriction against treating the ambiguous phrase as the antecedent of the pronoun: in (11a) it is a semantically implausible antecedent for the pronoun; in (12a) it is a grammatically inaccessible antecedent, due to Principle C. In a naming task, Cowart and Cairns found longer response latencies for is in (11a) relative to (11b), whereas no difference was found in (12a) vs. (12b). They concluded that the parser was more biased towards the noun phrase analysis when it needed to link the pronoun with an antecedent, but only if that antecedent was in a grammatically accessible position. Further, they interpreted the contrast between (11) and (12) to suggest that semantic restrictions, unlike syntactic restrictions, do not have the immediate effect of excluding an incongruent noun phrase from the set of candidate antecedents for the pronoun. The findings in Cowart and Cairns (1987) suggest two important conclusions. First, the cataphoric pronoun they initiates an active search for an antecedent in the main subject position. In fact, these results may provide stronger evidence for active search than Van Gompel and Liversedge’s (2003) study: if the presence of the pronoun biases the parser to analyze the ambiguous phrase as a noun phrase, this suggests that dependency formation does not wait until after grammatical categories have been identified based on bottom-up analysis of the input. Secondly, their results suggest that the parser does not search for an antecedent in positions that are subject to Principle C. However, some objections can be raised with respect to the design of the study and this interpretation of the results. First, the naming task that was used did not provide information on positions beyond the probe position, which may have concealed important effects at earlier or later words. Second, as the authors themselves point out, the results were only significant in the participants analysis but not in the items analysis, which raises the possibility that the results are dependent on the specific 390 N. Kazanina et al. / Journal of Memory and Language 56 (2007) 384–409 properties of individual items. Finally, the argument for immediate effects of Principle C is based on the fact that the pronoun-type effect is present in the ‘semantic’ pair (11) but not in the ‘syntactic’ pair (12). This argument requires the implicit assumption that different semantic and syntactic violations should produce an effect of a comparable magnitude at the same point in time. In light of this, the conclusions drawn from these findings should be more cautious, perhaps that the syntactic considerations are taken into account by the parser earlier than the semantic considerations. However, the claim that the parser never violates a syntactic constraint needs stronger evidence, e.g., using structures as controls that are not subject to any constraints. In the experiments that we present below, we used methods and materials designed to resolve these difficulties. Experiment 1 Experiment 1 consisted of an acceptability rating task and an on-line self-paced reading task that used a gender mismatch paradigm to test for the impact of Principle C on the search for pronoun antecedents. Participants Participants were 60 native speakers of English from the University of Maryland undergraduate population with normal or corrected-to-normal vision and no history of language disorders. All participants in this and subsequent studies gave informed consent and were paid $10/ h for their participation. Materials and design The materials for the on-line task are described first, since the materials for the acceptability rating study were derived from them. The experiment had five conditions, four of which were organized in a 2 · 2 design with the factors constraint(Principle C vs. no-constraint) and gender congruency (match vs. mismatch between the pronoun and the subject of the second clause). A full set of items is shown in Table 1. The gender of the cataphoric pronoun was balanced across stimulus sets: half of the sets were built on the basis of the masculine pronoun he, and the other half on the basis of the feminine pronoun she. The gender-match and gender-mismatch sentences differed only in the gender of the subject of the second clause, which was always a gender-unambiguous proper name, matched for the number of letters and syllables in the match and mismatch variants within each set of items. To guard against a semantic bias for or against a coreferential interpretation of the pronoun and the second subject, the two clauses were carefully selected such that the events in each clause could plausibly be performed either by the same agent or by different agents. In addition, to ensure that the cataphoric pronoun received a grammatical antecedent in every case, the target structures were embedded in a further sentence introduced by the conjunctions although or since. The gender of the third clause subject was chosen such that each sentence had a unique grammatical antecedent for the pronoun. Thus, in the Principle C conditions the subject of the third clause always matched the gender of the pronoun and served as a grammatical antecedent. In the no-constraint conditions the gender of the third clause subject mismatched the pronoun in the gender-match condition, due to the possibility of coreference between the pronoun and the second clause subject, but the third clause subject matched the gender of the pronoun in the gender-mismatch condition. Following Van Gompel and Liversedge (2003), we added a fifth ‘name’ condition to each set. This condition was identical to the no-constraint/mismatch condition in the number of referents introduced in each clause. This was done to ensure that any observed Table 1 Sample set of experimental items for on-line Experiment 1 Principle C/match Because last semester shei was taking classes full-time while Kathryn was working two jobs to pay the bills, Ericai felt guilty Principle C/mismatch Because last semester shei was taking classes full-time while Russell was working two jobs to pay the bills, Ericai felt guilty No constraint/match Because last semester while shei was taking classes full-time Kathryni was working two jobs to pay the bills, Russell never got to see her No constraint/mismatch Because last semester while shei was taking classes full-time Russell was working two jobs to pay the bills, Ericai promised to work part-time in the future No constraint/name Because last semester while Ericai was taking classes full-time Russell was working two jobs to pay the bills, shei promised to work part-time in the future The underlined name indicates the critical second subject noun phrase. Subscript indices indicate intended backward anaphoric dependencies. N. Kazanina et al. / Journal of Memory and Language 56 (2007) 384–409 mismatch effect at the second subject in the no-constraint pair could not merely be due to the introduction of a new discourse referent. If the number of referents is indeed the reason for a mismatch effect in the no-constraint pair, we should expect a similar increase in reading times in the name condition. Thirty sets of five conditions were distributed among 5 lists in a Latin Square design, and combined with 90 filler sentences. To mask experimental sentences, the fillers bore a number of similarities with the target items, including length and average clause number and were designed in several subgroups, each built around a salient feature of the targets, such as the use of proper names and pronouns, or a subordinator followed by a temporal modifier at the beginning of the sentence. There were no instances of unresolved anaphora in the filler sentences. Thus, we ensured that throughout the experiment pronouns always found intra-sentential antecedents. This was appropriate, since the interest of the study is not in whether readers search for an antecedent but, rather, in where they search. Acceptability rating task (off-line) We conducted an off-line rating task using a similar methodology to Gordon and Hendrick (1997). In each sentence a pronoun and a noun phrase were highlighted in bold and participants were instructed ‘to determine how plausible it is that the pronoun in bold and the noun in bold refer to the same person’ on a scale from 1 (impossible) to 5 (absolutely natural). We sought to confirm that, provided matching gender/number values, participants indeed accept coreference between the cataphoric pronoun and the second subject in the no-constraint environments, but reject it in the Principle C environments. 12 sets of items were chosen from the materials for the online task of Experiment 1 and then simplified to create the two conditions shown in (13). Two different lists were constructed based on these 12 sets using a Latin Square design, and each list had two different stimulus order randomizations. 40 native English speakers from the University of Maryland undergraduate population were recruited specifically for the purposes of the rating study and completed the questionnaire, which contained 6 instances of each condition, interspersed with 20 additional filler sentences, twelve of which tested the configurations used in Experiment 2. (13) Stimuli from off-line rating task, Experiment 1. a. * Principle C Because last semester shei was taking classes full-time while Kathryn was working two jobs to pay the bills, Ericai felt guilty. 391 b. No-constraint Because last semester while shei was taking classes full-time Kathryni was working two jobs to pay the bills, Russell never got to see her. The mean rating score for the Principle C condition (mean = 1.4, standard error = .12) was significantly lower than the mean score in the no-constraint condition (mean = 4.1, standard error = .13) both in the participants analysis and items analysis (two-tailed paired t-test, both ps < .001, t1(39) = 12.2, t2(11) = 38.2). These results fully agree with the predictions of the Principle C constraint: coreference between the pronoun in the main subject position and the name in the second subject position is acceptable only when the pronoun does not c-command the name, i.e., in the no-constraint condition but not in the Principle C condition. Procedure Participants were tested using a desktop PC running the Linger software (Doug Rohde, MIT) in a standard self-paced word-by-word moving window paradigm (Just, Carpenter, & Woolley, 1982). Each trial started with a blank screen. Upon pressing the space bar, a sentence masked by dashes appeared on the screen. The masks extended to all letters and punctuation marks, but left spaces unmarked. As the participant pressed the spacebar, a new word appeared on the screen as the previous one was re-masked by dashes. A comprehension question appeared after the end of each sentence all at once (e.g. Was Kathryn/Russell working two jobs?). Participants were instructed to read sentences at a natural pace and to respond to the comprehension questions as accurately as possible. To answer the question the subject pressed the f-key for ‘yes’ and the j-key for ‘no.’ If the question was answered incorrectly the word ‘Incorrect’ appeared briefly in the center of the screen. Each participant was randomly assigned to one of the lists, and the order of the stimuli within the presentation list was randomized for each participant. Analysis Only trials for which the corresponding comprehension question was answered correctly were included in the analysis. Reading times that exceeded a threshold of 2.5 standard deviations above a participant’s mean reading rate for each region were replaced by the threshold value. This winsorizing procedure affected 2.4% of the data (range 1.9–2.6% for individual conditions). 392 N. Kazanina et al. / Journal of Memory and Language 56 (2007) 384–409 collapsing over participants (F2). Furthermore, min F 0 statistics (Clark, 1973) were computed, and 95% confidence intervals (CIs) based on the mean squared errors of the respective effect from the participants analysis were calculated for comparisons between means of conditions (Masson & Loftus, 2003). The regions used for the data analysis corresponded to single words, except for regions corresponding to the end of the clause, for which several words were combined due to variation in the clause length between items (see the legend in Figs. 1 and 2 for regions). The critical second subject position in which the gender manipulation occurred corresponded to region 8 in all conditions. The data from the first four conditions were entered into a 2 · 2 repeated-measures ANOVA with the factors constraint (Principle C, no constraint) and congruency (match, mismatch). Reading times from the name condition were compared pairwise to each of the no-constraint conditions in a one-way ANOVA. ANOVAs were computed on the participant mean raw reading times collapsing over items (F1), and on item means Results Comprehension question accuracy Two of 60 participants showed comprehension question accuracy that was more than 2.5 standard deviations below the mean accuracy. Consequently, the lowest scoring participant from each list was removed from the analysis to balance the number of participants 480 Mean reading time (ms) Principle C, match Principle C, mismatch subject NP 440 400 360 320 1 2 4 5 6 7 8 9 10 11 12 13 14 15 Because1 / last semester 2 / she4 / was taking5 /classes full-time6 / while7 / Kathryn (Russell)8 / was working 9 / two10 / jobs to pay the bills,11 / Erica12 / felt13 /guilty14. Fig. 1. Mean reading times in milliseconds for the Principle C conditions from Experiment 1. The error bars represent one standard error above/below the participant mean at each region. The arrow marks the critical second subject noun phrase. Mean reading time (ms) 480 No constraint, match No constraint, mismatch Name 440 400 360 subject NP 320 1 2 3 4 5 6 8 9 10 11 12 13 14 15 Because1 / last semester2 / while3 / she (Erica)4 /was taking5 / classes full-time6 / Kathryn (Russell)8 / was working 9 / two10 /jobs to pay the bills,11 / Russell (Erica) (she)12 / never13 / got14 / to see her15. Fig. 2. Mean reading times in milliseconds for the no-constraint and the name conditions from Experiment 1. The error bars represent one standard error above/below the participant mean at each region. The arrow marks the critical second subject noun phrase. N. Kazanina et al. / Journal of Memory and Language 56 (2007) 384–409 393 Self-paced reading We first present the results of the 2 · 2 ANOVA on the Principle C and no-constraint conditions (Table 2), and then discuss the name condition. The results from the on each list. For the remaining 55 participants, the mean accuracy was 91%, with a range of 90–93% for individual conditions. There were no reliable differences in accuracy between individual conditions. Table 2 Results of 2 · 2 ANOVAs and min F 0 values for Experiment 1 By participants Min F 0 By items df MSE F1 p df F2 p df min F 0 p Region 4 (1st subject) Constraint Congruency Constraint · congruency 1,54 1,54 1,54 2357 2509 1622 2.0 1.2 2.9 0.160 0.276 0.094 1,29 1,29 1,29 1.6 3.1 1.0 0.216 0.091 0.319 1,70 1,81 1,50 <1 <1 <1 0.348 0.354 0.388 Region 5 (was taking) Constraint Congruency Constraint · congruency 1,54 1,54 1,54 2163 1699 2017 <1 2.0 <1 0.622 0.162 0.768 1,29 1,29 1,29 <1 1.4 <1 0.721 0.244 0.558 1,59 1,66 1,76 <1 <1 <1 0.772 0.366 0.792 Region 6 (classes full-time) Constraint Congruency Constraint · congruency 1,54 1,54 1,54 1785 1227 2300 <1 <1 <1 0.910 0.443 0.946 1,29 1,29 1,29 <1 1.0 <1 0.799 0.320 0.952 1,72 1,83 1,70 <1 <1 <1 0.917 0.541 0.964 Region 8 (2nd subject) Constraint Congruency Constraint · congruency 1,54 1,54 1,54 12602 8026 6982 9.2 <1 4.8 0.004 0.385 0.033 1,29 1,29 1,29 10.9 <1 2.8 0.003 0.419 0.104 1,79 1,72 1,62 5.0 <1 1.8 0.028 0.551 0.188 Region 9 (was working) Constraint Congruency Constraint · congruency 1,54 1,54 1,54 4571 3471 2895 8.6 <1 1.0 0.005 0.834 0.313 1,29 1,29 1,29 11.4 <1 <1 0.002 0.725 0.357 1,81 1,80 1,71 4.9 <1 <1 0.030 0.857 0.493 Region 10 (two) Constraint Congruency Constraint · congruency 1,54 1,54 1,54 2309 2742 2495 6.2 <1 2.6 0.016 0.347 0.112 1,29 1,29 1,29 2.3 <1 2.3 0.142 0.414 0.141 1,50 1,69 1,72 1.7 <1 1.2 0.202 0.535 0.273 Region 11 (jobs to pay the bills) Constraint 1,54 Congruency 1,54 Constraint · congruency 1,54 2883 2234 2183 1.4 1.1 1.4 0.236 0.295 0.249 1,29 1,29 1,29 <1 1.8 <1 0.579 0.196 0.545 1,42 1,82 1,45 <1 <1 <1 0.614 0.411 0.590 Region 12 (3rd subject) Constraint Congruency Constraint · congruency 1,54 1,54 1,54 6854 9304 8821 <1 4.2 <1 0.820 0.046 0.970 1,29 1,29 1,29 <1 3.1 <1 0.754 0.087 0.762 1,83 1,68 1,56 <1 1.8 <1 0.853 0.185 0.971 Region 13 (never) Constraint Congruency Constraint · congruency 1,54 1,54 1,54 3266 3381 3623 5.9 2.2 2.7 0.019 0.140 0.109 1,29 1,29 1,29 4.3 1.6 3.4 0.048 0.216 0.075 1,67 1,67 1,80 2.5 <1 1.5 0.121 0.337 0.226 Region 14 (got) Constraint Congruency Constraint · congruency 1,54 1,54 1,54 1706 2514 1958 <1 3.6 <1 0.517 0.062 0.698 1,29 1,29 1,29 <1 3.2 <1 0.588 0.083 0.817 1,67 1,73 1,50 <1 1.7 <1 0.676 0.195 0.842 Region 15 (to see her) Constraint Congruency Constraint · congruency 1,54 1,54 1,54 1339 1089 1513 <1 1.9 <1 0.920 0.178 0.983 1,29 1,29 1,29 <1 1.0 <1 0.707 0.329 0.897 1,61 1,59 1,57 0.0 0.6 0.0 0.922 0.425 0.983 394 N. Kazanina et al. / Journal of Memory and Language 56 (2007) 384–409 Principle C conditions are presented in Fig. 1 and the results from the no-constraint and name conditions in Fig. 2. Principle C and no-constraint conditions. There were no reliable main effects or interactions in any region in the first clause. In the main ANOVA at the critical second subject in region 8 there was a main effect of constraint due to shorter mean reading times in the Principle C conditions than in the no-constraint conditions (mean for the noconstraint conditions = 425.5 ms, mean for the Principle C conditions = 377.3 ms, 95% CI = 30.3 ms). This effect can be attributed to the differences in the content of the preceding region between the two pairs of conditions. Most importantly, there also was a significant constraint · congruency interaction at the second subject in the participants analysis. Pairwise comparisons within each level of the constraint factor showed no effect of gender congruency for the Principle C conditions (gender-match—383.4 ms, gender-mismatch—371.1 ms, 95% CI = 23.8 ms) and a marginally significant effect of gender congruency for the no-constraint conditions (gender-match—408.4 ms, gender-mismatch—442.6 ms, 95% CI = 40.3 ms) due to a slowdown in average reading times in the gender-mismatch condition (i.e., a mismatch effect). The main effect of constraint persisted to the region immediately following the second subject (mean for the no-constraint conditions = 390.7 ms, mean for the Principle C conditions = 361.7 ms, 95% CI = 18.3 ms). However, neither the main effect of congruency nor the constraint · congruency interaction were significant and no significant differences were observed in the pairwise comparisons within each level of the factor constraint (Principle C conditions: gender-match—366.1 ms, gender-mismatch—357.3 ms, 95% CI = 16.1 ms; no-constraint conditions: gendermatch—388.0 ms, gender-mismatch—393.5 ms, 95% CI = 26.1 ms). The main effect of constraint was still significant in the second post-subject region in the participants analysis, although not in the items analysis (mean for the no-constraint conditions = 351.4 ms, mean for the Principle C conditions = 369.0 ms, 95% CI = 13.0 ms). Other reliable effects included a main effect of congruency at the third subject noun phrase in region 12 that was significant in the participants analysis and marginally significant in the items analysis, due to longer reading times for the third subject in the match conditions than in the mismatch conditions (mean for the gendermatch conditions = 450.2 ms, mean for the gender-mismatch conditions = 426.0 ms, 95% CI = 26.1 ms). Recall that the congruency factor manipulated the (mis)match in gender between the pronoun and the 2nd subject, rather than the 3rd subject. There was also a main effect of the constraint factor at the following word (mean for the no-constraint conditions = 351.8 ms, mean for the Principle C conditions = 371.9 ms, 95% CI = 15.4 ms) due to slower reading times in the Principle C conditions. Pairwise comparisons in the same region showed a significant effect of congruency for the Principle C pair (gender-match—362.7 ms, gender-mismatch— 340.9 ms, 95% CI = 18.6 ms) due to longer reading times in the match condition, but no corresponding effect of congruency in the no-constraint pair (gendermatch—371 5 ms, gender-mismatch—372.3 ms, 95% CI = 26.4 ms). No other differences were statistically reliable. The effects in regions 12 and 13 do not receive a straightforward explanation, e.g. they cannot be explained by the parser’s anticipation of an antecedent noun phrase in conditions where the anaphoric dependency has not yet been resolved (i.e. both Principle C conditions and the gender-mismatch condition of the no-constraint pair). Name condition. In the first clause there was a significant difference in reading times between the name and the mismatch condition of the no-constraint pair at the subject noun phrase in region 4 (mean for the name condition = 377.4 ms, mean for the no-constraint mismatch condition = 351.0 ms, 95% CI = 19.5 ms) and at the following region (mean for the name condition = 356.7 ms, mean for the no-constraint mismatch condition = 331.9 ms, 95% CI = 16.7 ms), due to longer reading times in the name condition. These effects are expected in light of the length differences in the first subject noun phrase, which was a personal pronoun in the no-constraint conditions and a proper name in the name condition. Reading times at the critical second subject noun phrase were significantly longer in the mismatch condition of the no-constraint pair than in the name condition (mean for the name condition = 393.9 ms, mean for the no-constraint mismatch condition = 442.6 ms, 95% CI = 41.0 ms). However, reading times at the same region did not differ between the name and the match condition (mean for the name condition = 393.9 ms, mean for the no-constraint match condition = 408.4 ms, 95% CI = 22.6 ms). The fact that reading times at the second subject were longer in the mismatch condition than in the other two conditions is unlikely to be a spill-over from an earlier region, in light of the almost identical reading times across all three conditions at the immediately preceding region (i.e. region 6). There were additional reading-time differences between the name condition and either of the no-constraint conditions at later regions. These differences are expected in light of the parser’s attempt to resolve coreference at the second subject noun phrase in the no-constraint conditions but not in the name condition and are not reported here as they are not germane to the goals of this study. N. Kazanina et al. / Journal of Memory and Language 56 (2007) 384–409 Discussion The off-line ratings showed that participants judged the second subject noun phrase to be an acceptable antecedent for the preceding pronoun in the no-constraint condition, but not in the Principle C condition, confirming that the contrast between the conditions is real, as predicted by the Principle C constraint. Notwithstanding the fact that the effect remained marginally significant, our results from the self-paced reading task replicated Van Gompel and Liversedge’s (2003) gender-mismatch effect in the no-constraint conditions: there was a slowdown in the second subject position if that noun phrase mismatched in gender with the preceding pronoun. In contrast, there was no effect of gender congruency at the corresponding position in the Principle C conditions, or in any other region in the second clause. The absence of a mismatch effect in the Principle C condition suggests that in this condition the parser never considered the subject of the second clause as a potential antecedent, thus making the gender-congruency between that subject and the preceding pronoun irrelevant. We can also rule out the possibility that the gender-mismatch effect in the no-constraint condition was due to the cost of adding a new discourse referent, since reading times at the second subject in the name condition patterned with the gender-match condition rather than with the gender-mismatch condition. The name condition was no harder to process than the gender match condition in any of the regions, so the difference between the gender match and mismatch conditions cannot be due to the introduction of a new discourse referent in the mismatch condition. The presence of the gender mismatch effect in the noconstraint conditions indicates that an anaphoric dependency is formed at the second subject noun phrase before gender information is taken into account, but it does not show exactly when the dependency is formed. This could occur immediately following the identification of the category of the subject based on bottom-up information, as in Van Gompel & Liversedge’s account. Alternatively, dependency formation could occur before any bottomup information about the second subject noun phrase appears, due to the fact that a sentence-initial adverbial clause reliably predicts a higher clause that contains a subject noun phrase. Both of these accounts may be viewed as evidence for ‘active’ search for the antecedent for a pronoun, but uncertainty about the role of prediction leads to a problem in interpreting the Experiment 1 results. The argument for the early on-line effects of Principle C involves the absence of a mismatch effect, and therefore relies on the assumption that there is no other difference between the conditions that could have led to the absence of a mismatch effect in the Principle C conditions. However, there is an inherent asymmetry between the Principle C and the no-constraint conditions related 395 to the parser’s ability to predict the second subject position. Recall that the sentences used in the experiment were 3-clause structures, schematized in (14). (14) Schematic representation of conditions from Experiment 1. No-constraint: [Because [while pronoun [2nd subject. . .. . ..]], 3rd subject. . .] Principle C: [Because [pronoun. . ..[while 2nd subject. . ...]], 3rd subject. . .] In the no-constraint conditions the parser could reliably predict the second subject position immediately after encountering the subordinator while in the first clause. In contrast, the second subject position cannot be reliably predicted during the first clause in the Principle C conditions. Furthermore, in these conditions the third subject position, which corresponds to the main clause subject, can be reliably predicted during the first clause. Therefore, the reason for the absence of a gender mismatch effect in the Principle C conditions could be due to the unpredictability of the second subject position, rather than to the effects of the grammatical constraint. Furthermore, the lack of expectation for the second subject position could strongly reduce any effect of the gender of the noun phrase in that position. To separate the effects of grammatical constraints and predictability, Experiments 2 and 3 tested the effects of the Principle C constraint in configurations where the critical noun phrase position is equally predictable across all conditions. Experiment 2 The goal of Experiment 2 was to test the effect of Principle C on active search for antecedents, while avoiding confounds due to differences in the predictability of the critical subject noun phrase. In Experiment 2 both the Principle C and the no-constraint conditions were identical with respect to the predictability of the second clause at all times. Thus, a contrast between the results for the no-constraint and Principle C conditions in this study would provide stronger evidence for the impact of grammatical constraints on the construction of anaphoric dependencies. Participants Participants were 60 native speakers of English from the University of Maryland undergraduate population with normal or corrected-to-normal vision and no history of language disorders. 396 N. Kazanina et al. / Journal of Memory and Language 56 (2007) 384–409 Materials and design The materials for the on-line task are described first, since the materials for the rating study were derived from them. The experiment had four conditions in a 2 · 2 design with the within-subjects factors constraint (Principle C vs. no-constraint) and gender congruency (match vs. mismatch between the pronoun and the subject of the second clause). A full set of items is shown in Table 3. All target sentences started with a clause with an impersonal subject and a non-agentive predicate such as it seemed or it was surprising. We reasoned that a sentence fragment like It seemed worrisome to . . . would create a strong expectation for a complement clause and hence for an additional subject position. Consequently, in all conditions the second subject position could be anticipated when the cataphoric pronoun was first encountered, thereby eliminating the potential confound that made it more difficult to interpret the distribution of gender mismatch effects in Experiment 1. In the Principle C conditions the cataphoric pronoun was the complement of the dative preposition to or for and referred to the experiencer of the main clause predicate. In this configuration coreference between the pronoun and the subject of an embedded clause is unacceptable. In the no-constraint conditions the pronoun was embedded as the possessor inside the complement of to or for. In this configuration the pronoun may corefer with the subject of an embedded clause. Note that this contrast in acceptability follows from the syntactic formulation of Principle C only if prepositions are assumed to be irrelevant for calculation of c-command relations. This assumption is consistent with results from other diagnostics of c-command, such as negative polarity item licensing and scope assignment, and the reader is referred to Reinhart (1983) and Brody (1994) for further discussion. As in Experiment 1, the materials were designed such that a cataphoric pronoun could always find a grammatically accessible intra-sentential antecedent, even in the Principle C conditions. This was achieved by adding a but-clause to the end of the sentence, the subject of which was a licit antecedent for the pronoun. The gender of the third subject was chosen such that there was a unique antecedent for the pronoun in each condition. The stimuli were designed such that plausibility considerations should not favor coreference between the pronoun and the embedded subject more strongly in either the Principle C condition or the noconstraint condition (irrespective of grammatical acceptability). For example, in the set of materials in Table 3 both John and his family could plausibly be worried about something negative involving John such as gaining weight. Thirty two sets of target items were distributed among four presentation lists in a Latin Square design. Each list contained 32 experimental sentences (8 per condition) and 90 filler sentences of varying length and complexity, thereby maintaining target-to-filler ratio of approximately 1:3. Filler sentences were identical across all four lists, and were designed with the same considerations used in Experiment 1. Each participant was randomly assigned to one of the lists, and the order of the stimuli within the presentation list was randomized for each participant. The full list of materials is given in Appendix B. Acceptability rating task (off-line) An off-line rating task was used to confirm that, provided matching gender/number values, participants accept coreference between the cataphoric pronoun and the second subject in the no-constraint environment, but reject it in the Principle C environments. The acceptability rating task for Experiment 2 was run together with the corresponding task for Experiment 1, and so was completed by the same 40 participants described there, using the same design and procedure. Participants were asked to rate the acceptability of coreference on a scale from 1 (impossible) to 5 (absolutely natural). The test items were 12 pairs of gender-match conditions chosen from the materials for the online task, distributed Table 3 Sample set of experimental items for on-line Experiment 2 Principle C/match It seemed worrisome to himi that John was gaining so much weight, but Matti didn’t have the nerve to comment on it Principle C/mismatch It seemed worrisome to himi that Ruth was gaining so much weight, but Matti didn’t have the nerve to comment on it No constraint/match It seemed worrisome to hisi family that Johni was gaining so much weight, but Ruth thought it was just a result of aging No constraint/mismatch It seemed worrisome to hisi family that Ruth was gaining so much weight, but Matti thought it was just a result of aging The underlined name indicates the critical second subject noun phrase. Subscript indices indicate intended backward anaphoric dependencies. N. Kazanina et al. / Journal of Memory and Language 56 (2007) 384–409 Results across two presentation lists using a Latin Square design and interspersed with 48 filler items (additional to the materials from Experiment 1). The mean rating score for the Principle C condition (mean = 1.5, standard error = .12) was significantly lower than in the no-constraint condition (mean = 4.2, standard error = .13) both in the participants and items analyses (two-tailed paired t-test, both ps < .001, t1(39) = 12.9, t2(11) = 31.2). These results strongly suggest that participants recognize the constraint on coreference for the Principle C structures in (14). Comprehension question accuracy The mean question answering accuracy was 95.7% and ranged between 94.4% and 97.3% for individual conditions. There were no reliable differences in accuracy between individual conditions (Fs < 1). Self-paced reading The results from Experiment 2 are plotted in Fig. 3. The results of the 2 · 2 ANOVA analysis are reported in Table 4. At the impersonal subject it in region 1 the general ANOVA showed an interaction of constraint and congruency that was significant in the participants analysis and marginally significant in the items analysis. However, this interaction was not supported in the pairwise comparisons within each level of the constraint factor. There were no other significant effects or interactions anywhere in the first clause. At the complementizer that in region 7 there was a main effect of the constraint factor, due to longer reading times for the no-constraint conditions than for the Principle C conditions (mean for the no-constraint conditions = 321.7 ms, mean for the Principle C conditions = 302.4 ms, 95% CI = 7.7 ms). This difference was likely due to the lexical difference between conditions at the preceding regions. A main effect of the constraint factor in the same direction was also found at the subsequent region, the critical subject of the second clause (region 8: mean for the no-constraint conditions = 328.5 ms, mean for the Principle C conditions = 317.7 ms, 95% CI = 10.3 ms). However, in region 8 the constraint · congruency interaction did not reach significance, and planned pairwise comparisons revealed no Procedure and analysis The procedure was a self-paced word-by-word moving window task, identical to Experiment 1. Each sentence was followed by a yes-no comprehension question. Data from all 60 participants was included in the analysis, which followed the same steps as in Experiment 1. Sentences for which the comprehension question was answered incorrectly were excluded from the analysis, and reading times that exceeded a threshold of 2.5 standard deviations above a participant’s mean reading rate for each region were replaced by the threshold value. This winsorizing procedure affected 2.0% of trials (1.9–2.2% for individual conditions). The regions used for the data analysis corresponded to single words, except for regions corresponding to the end of the clause, for which several words were averaged together due to variation in the clause length between items (see the legend for Fig. 3 for region encoding). Raw reading times for each region were entered into a 2 · 2 repeated-measures ANOVA with the factors constraint and gender congruency. Principle C, match 360 397 subject NP Principle C, mismatch Mean reading time (ms) No constraint, match No constraint, mismatch 340 320 300 280 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 It1 seemed2 worrisome3 to4 {him5/ his5 family6} that7 John/Ruth8 was9 gaining10 so11 (much weight)12, but13 Matt14 didn't15 have16 (the nerve to comment on it)17. Fig. 3. Mean reading times in milliseconds from Experiment 2. The error bars represent one standard error above/below the participant mean at each region. The arrow marks the critical second subject noun phrase. 398 N. Kazanina et al. / Journal of Memory and Language 56 (2007) 384–409 Table 4 Results of 2 · 2 ANOVAs and Min F 0 values for Experiment 2 By participants Min F 0 By items df F2 p df min F 0 p 0.626 0.564 0.571 1,31 1,31 1,31 <1 <1 <1 0.754 0.690 0.750 1,31 1,31 1,31 <1 <1 <1 0.792 0.743 0.781 5.4 <1 <1 0.024 0.821 0.743 1,31 1,31 1,31 3.2 <1 <1 0.082 0.754 0.701 1,31 1,31 1,31 2.0 <1 <1 0.165 0.855 0.803 1448.3 997.3 1401.9 16.3 <1 <1 <0.001 0.486 0.912 1,31 1,31 1,31 10.6 <1 <1 0.003 0.654 0.966 1,31 1,31 1,31 6.4 <1 <1 0.016 0.707 0.968 1,59 1,59 1,59 1589.5 1187.6 1624.0 4.7 3.2 <1 0.034 0.077 0.927 1,31 1,31 1,31 4.3 2.7 <1 0.048 0.112 0.895 1,31 1,31 1,31 2.2 1.5 <1 0.145 0.235 0.940 Region 9 (was) Constraint Congruency Constraint · congruency 1,59 1,59 1,59 1925.5 2223.8 1681.0 <1 3.7 <1 0.880 0.059 0.544 1,31 1,31 1,31 <1 3.9 <1 0.708 0.058 0.406 1,31 1,31 1,31 <1 1.9 <1 0.889 0.179 0.625 Region 10 (gaining) Constraint Congruency Constraint · congruency 1,59 1,59 1,59 1422.0 1055.7 1587.0 1.6 8.8 <1 0.209 0.004 0.601 1,31 1,31 1,31 1.3 5.0 <1 0.260 0.033 0.438 1,31 1,31 1,31 <1 3.2 <1 0.401 0.085 0.665 Region 11 (so) Constraint Congruency constraint · congruency 1,59 1,59 1,59 1339.5 1909.9 2056.6 8.1 7.4 6.7 0.006 0.009 0.012 1,31 1,31 1,31 5.2 5.7 9.0 0.030 0.023 0.005 1,31 1,31 1,31 3.2 3.2 3.8 0.085 0.083 0.059 Region 12 (much weight) Constraint Congruency Constraint · congruency 1,59 1,59 1,59 3863.8 5224.2 2737.8 4.0 1.9 1.1 0.049 0.175 0.297 1,31 1,31 1,31 3.1 5.0 1.9 0.090 0.033 0.183 1,31 1,31 1,31 1.7 1.4 <1 0.196 0.250 0.411 Region 13 (but) Constraint Congruency Constraint · congruency 1,59 1,59 1,59 1464.2 2053.6 2752.1 1.6 <1 2.4 0.210 0.450 0.126 1,31 1,31 1,31 1.0 1.3 5.9 0.326 0.256 0.021 1,31 1,31 1,31 <1 <1 1.7 0.439 0.530 0.200 Region 14 (3rd subject) Constraint Congruency Constraint · congruency 1,59 1,59 1,59 1106.5 1562.1 1231.9 <1 5.8 <1 0.480 0.019 0.383 1,31 1,31 1,31 <1 8.5 <1 0.578 0.007 0.367 1,31 1,31 1,31 <1 3.5 <1 0.662 0.073 0.530 Region 15 (didn’t) Constraint Congruency Constraint · congruency 1,59 1,59 1,59 1399.0 1101.5 1840.9 2.5 8.8 <1 0.122 0.004 0.789 1,31 1,31 1,31 1.4 3.0 <1 0.240 0.091 0.764 1,31 1,31 1,31 <1 2.3 <1 0.349 0.143 0.842 Region 16 (have) Constraint Congruency Constraint · congruency 1,59 1,59 1,59 1134.4 1309.7 1176.0 3.8 4.0 5.5 0.058 0.050 0.022 1,31 1,31 1,31 1.0 4.5 5.6 0.332 0.042 0.024 1,31 1,31 1,31 <1 2.1 2.8 0.387 0.156 0.105 df MSE F1 Region 4 (to) Constraint Congruency constraint · congruency 1,59 1,59 1,59 1279.1 943.0 880.6 <1 <1 <1 Region 5 (him/his) Constraint Congruency Constraint · congruency 1,59 1,59 1,59 955.4 944.8 820.4 Region 7 (that) Constraint Congruency Constraint · congruency 1,59 1,59 1,59 Region 8 (2nd subject) Constraint Congruency Constraint · congruency p N. Kazanina et al. / Journal of Memory and Language 56 (2007) 384–409 effect of congruency within either level of the constraint factor. The ANOVAs at each of the three regions immediately following the subject (regions 9–11 in Fig. 1) showed a significant or a marginally significant main effect of congruency, due to slower reading times in mismatching conditions than matching conditions (region 9: mean for the gender-match conditions = 321.6 ms, mean for the gender-mismatch conditions = 333.8 ms, 95% CI = 12.2 ms; region 10: mean for the gender-match conditions = 324.3 ms, mean for the gender-mismatch conditions = 337.2 ms, 95% CI = 8.4 ms; region 11: mean for the gender-match conditions = 323.4 ms, mean for the gender-mismatch conditions = 338.5 ms, 95% CI = 11.3 ms). In region 11 there also was a significant main effect of constraint (mean for the no-constraint conditions = 337.8 ms, mean for the Principle C conditions = 324.1 ms, 95% CI = 11.7 ms) and a significant constraint · congruency interaction due to longer reading times in the no-constraint mismatch condition (noconstraint conditions: gender-match—322.4 ms, gender-mismatch—353.1 ms; Principle C conditions: gender-match—324.4 ms, gender-mismatch—323.9 ms, 95% CI = 23.4 ms). Planned pairwise comparisons at regions 9–11 showed a significant or marginally significant effect of gender congruency (i.e., gender-mismatch effect) in the no-constraint conditions at all three regions (region 9: mean for the gender-match condition = 320.2 ms, mean for the gender-mismatch condition = 336.4 ms, 95% CI = 16.8 ms; region 10: 325.8, 341.9 and 14.9 ms; region 11: 322.4, 353.1 and 19.2 ms, respectively). The effect of gender congruency did not reach significance in the Principle C conditions in any region in the second clause, although there was a numerical trend towards slower reading times in the incongruent conditions. In the third clause there was a significant main effect of congruency in the overall ANOVA at the subject noun phrase in region 14, due to longer reading times in the mismatch conditions than in the match conditions (mean for the gender-match conditions = 311.9 ms, mean for the gender-mismatch conditions = 323.9 ms, 95% CI = 10.2 ms). Pairwise comparisons showed that the effect of gender congruency was significant in the Principle C conditions (region 14: mean for the gender-match condition = 311.9 ms, mean for the gendermismatch condition = 327.7 ms, 95% CI = 12.7 ms), but not in the no-constraint conditions (region 14: mean for the gender-match condition = 312.0 ms, mean for the gender-mismatch condition = 320.2 ms, 95% CI = 14.6 ms). This was the first region where an accessible antecedent for the pronoun could be found in all but the no-constraint/match condition. This pattern of reading-time contrasts was unexpected, especially in light of the persistency of the effect that continued to hold in regions 15 and 16. 399 Discussion The goal of Experiment 2 was to test the claim that Principle C immediately restricts dependency formation during online processing of backwards anaphora, using materials that match the predictability of potential antecedent positions across conditions. As in Experiment 1, this experiment elicited a gender mismatch effect in the no-constraint conditions, and no corresponding effect in the Principle C conditions. These findings support the claim that the parser does not search for an antecedent for a pronoun in grammatically inaccessible positions, and do not straightforwardly lend themselves to an explanation in terms of predictability, unlike Experiment 1. However, it is clear that the mismatch effect in the noconstraint conditions in Experiment 2 was both delayed and diminished in comparison to Experiment 1. The gender mismatch effect showed a significant or marginally significant mismatch effect at the each of the three words following the critical region, but the overall constraint · congruency interaction did not reach significance until the third region after the critical noun phrase. One possible reason for the smaller and delayed gender-mismatch effect in this experiment relative to Experiment 1 involves the structural necessity of the clause that contains the potential antecedent noun phrase. Experiment 1 introduced pronouns in adverbial clauses introduced by subordinators like while, which are obligatorily followed by a main clause. The target items in Experiment 2 introduced pronouns inside a main clause with a beginning like It seemed clear. . ., which were assumed to create a strong bias for a clausal continuation, based on intuition and on informal corpus searches (Linguist’s Search Engine: Resnik & Elkiss, 2003). However, the lead-ins used in Experiment 2 are compatible with a grammatical (although pragmatically unlikely) parse as mono-clausal sentences with referential subject pronouns. To test whether the weaker gender mismatch effect in this experiment might have been due to a weaker expectation for a following clause, we conducted an offline sentence completion task to determine the strength of the prediction for a complement clause. Thirteen undergraduate students from the University of Maryland completed a pencil-and-paper completion task for monetary compensation. The questionnaire contained sentence-initial fragments such as (15), which participants were asked to complete in a natural fashion. The last word of the fragment was a possessive pronoun, e.g. his. The pronoun was included to assess what proportion of the completions that contained an embedded clause also contained a referent for the pronoun as part of the embedded clause. All target fragments contained the masculine 3rd person singular pronoun his or plural their. The feminine pronoun her was not used in the 400 N. Kazanina et al. / Journal of Memory and Language 56 (2007) 384–409 target sentences due to the homophony of the accusative and possessive feminine forms. The questionnaire consisted of 12 target fragments, interspersed with 60 fragments of other types. (15) It seemed clear to his. . .. The results from the completion task suggest that a sentence fragment such as (15) created a strong but not absolute expectation for an upcoming embedded clause. Participants provided a clausal continuation in 69% (99/143) of cases (71% (51/72) in the ‘his’ condition, 67% (48/71) in the ‘their’ condition), but in the remaining 29% (44/143) of cases the completed sentence was a simple monoclausal sentence. Such monoclausal completions are grammatical, since the sentence-initial it can be understood as a 3rd person neuter pronoun with an unspecified referent, but they are pragmatically infelicitous in the absence of a supporting context. The complement clause contained a potential antecedent for the cataphoric pronoun his or their in 70% (69/99) of the completions that included a second clause, suggesting that when the embedded clause was expected, a referent for the pronoun was also often expected. In 87% (60/69) of the completions that provided a potential antecedent, the potential referent appeared in subject position. The results from the fragment completion questionnaire lend support to the notion that speakers normally expect a clausal continuation following a lead-in like It seemed clear to his . . ., and also suggest that in most cases speakers also anticipate an antecedent for the pronoun. However, the 30% of completions that did not contain an embedded clause still represent a significant difference from the materials in Experiment 1, where a completion to a fragment like While he . . . would necessarily include a second clause. Thus, it is possible that the difference in strength of prediction represented by this 30% contributed to the weakened gender mismatch effect in Experiment 2. Therefore, Experiment 2 provided further evidence that active search for antecedents for pronouns is restricted by grammatical constraints on coreference, while leaving open certain questions about the strength and immediacy of gender-mismatch effects and what they show about the time-course of anaphoric dependency formation. In particular, the results of Experiment 2 show that the distribution of gender-mismatch effects cannot be explained entirely in terms of predictability, but do not indicate whether the predictability of a potential antecedent position is a prerequisite for eliciting a gender-mismatch effect. The results of Experiments 1 and 2 are consistent with a view in which active dependency formation mechanisms for cataphora are activated only at the time when the pronoun is encountered. Under this view, which we refer to as instantaneous resolution, the parser actively constructs anaphoric depen- dencies only in positions that can be reliably predicted at the time when the pronoun is encountered. Instantaneous resolution predicts that a dependency between the anaphoric pronoun (his) and the second subject position (John) would be actively constructed in a structure like ‘While his mother slept, John sang’, in which the second subject position can be anticipated even before the pronoun is encountered, due to the early appearance of the subordinator while. On the other hand, instantaneous resolution predicts that the parser would not attempt a similar dependency in ‘His mother slept, while John sang’, as the second subject position cannot be reliably predicted until after the pronoun is encountered. An alternative to the instantaneous resolution view is that the parser’s search for candidate antecedent positions remains active beyond the point when the pronoun is encountered and takes account of any new piece of structure predicted by bottom-up information following the pronoun. This view, which we refer to as continuous resolution, predicts that the parser would actively consider a dependency between a cataphoric pronoun and the second subject even in ‘His mother slept, while John sang’, despite the fact that a reliable prediction for the second subject position becomes available only with the subordinator while and after the anaphoric pronoun is encountered. This latter view is a closer parallel to the properties of active search for gaps in processing fillergap dependencies, where active gap creation effects have been found in positions that could not have been anticipated at the point when the filler was first encountered (e.g., Aoshima et al., 2004; Frazier & Clifton, 1989). Importantly, whereas the two views disagree on the specific configuration that must obtain to initiate an active search, they both agree that a cataphoric dependency can be formed based on a prediction for an antecedent position and before the antecedent becomes available bottom-up. We assume that speakers can encode the fact that a pronoun is linked to another syntactic position that serves as its antecedent, independent of encoding what the antecedent refers to. We refer to this process as ‘linking’, and suggest that it may occur as soon as an antecedent position is predicted. We assume that a separate process of ‘valuation’ occurs when the properties of the antecedent are processed bottom-up, and the pronoun automatically inherits the referential properties of the antecedent. Experiment 3 attempts to adjudicate between the instantaneous resolution and the continuous resolution proposals by testing a configuration in which no syntactic prediction for a potential antecedent position is available at the time when the cataphoric pronoun is encountered. If a gender mismatch effect is found in such a configuration, it would be evidence that the search mechanism does not search for candidate antecedents just once, but repeatedly, as more information comes online. Therefore, the results of this third experiment N. Kazanina et al. / Journal of Memory and Language 56 (2007) 384–409 will also speak further to the extent of the parallelism between active search mechanisms in the two construction types. Experiment 3 The goals of Experiment 3 were twofold. First, to test the instantaneous vs. continuous resolution accounts, i.e., whether active search for pronoun antecedents occurs in structural positions that could not be reliably predicted at the point when the pronoun is first encountered. Second, as in Experiments 1 and 2 we sought to test whether active search for pronoun antecedents is restricted by grammatical constraints, specifically Principle C. Participants Participants were 60 native speakers of English from the University of Maryland undergraduate population with normal or corrected-to-normal vision and no history of language disorders. They gave informed consent and were financially compensated for their participation. Materials and design The target items for Experiment 3 consisted of 24 sets of 4 items organized in a 2 · 2 design with the factors constraint (no-constraint vs. Principle C) and congruency (match vs. mismatch) as within-subjects factors. One full set of items is shown in Table 5 and a complete set of materials can be found in Appendix C. All conditions began with a main clause that was followed by an adverbial clause headed by a subordinator such as while. The pronoun appeared as the subject of the main clause in the Principle C conditions, or as the possessor of the main clause subject in the no-constraint conditions. The critical potential antecedent position 401 was the subject noun phrase of the embedded adverbial clause, which could not be reliably predicted at the point when the pronoun was first encountered. The first possible cue for such a position appeared later in the sentence when the subordinator while was encountered. Importantly, the subordinator while did not immediately precede the critical head noun of the second subject noun phrase, but instead was separated from the head noun by the determiner the and two adjectives. This ensured that the parser had ample opportunity to process the information associated with the subordinator before encountering the critical noun, i.e., (i) that there is a new clause and, hence, an upcoming subject position, and (ii) that in the Principle C condition this subject position is c-commanded by the cataphoric pronoun and thus excluded as a potential antecedent due to Principle C. Gender congruent and incongruent sentences were formed by manipulating the gender of the sentence-initial pronoun such that it either matched or mismatched the head noun of the second subject noun phrase. The nouns used were either lexically (e.g. king, queen) or conventionally (e.g. quarterback) strongly gender-specific. To make it impossible to reliably anticipate the gender of the noun, the adjectives preceding the noun did not include gender-biased adjectives (e.g., pregnant or handsome). As in the previous experiments, to ensure the possibility of intra-sentential coreference, the sentences included an additional clause that contained a licit antecedent for the cataphoric pronoun whenever no suitable antecedent was available in the second clause. The twenty-four sets of experimental stimuli were distributed across four lists in a Latin Square design. Each list contained 24 experimental sentences and 72 filler sentences. Filler sentences were identical across all four lists. Each participant was randomly assigned to one of the lists, and the order of the stimuli within the presentation list was randomized for each participant. Table 5 Sample set of experimental items for on-line Experiment 3 Principle C/match Hei chatted amiably with some fans while the talented, young quarterback signed autographs for the kids, but Stevei wished the children’s charity event would end soon so he could go home Principle C/mismatch Shei chatted amiably with some fans while the talented, young quarterback signed autographs for the kids, but Caroli wished the children’s charity event would end soon so she could go home No constraint/match Hisi managers chatted amiably with some fans while the talented, young quarterbacki signed autographs for the kids, but Carol wished the children’s charity event would end soon so she could go home No constraint/mismatch Heri managers chatted amiably with some fans while the talented, young quarterback signed autographs for the kids, but Caroli wished the children’s charity event would end soon so she could go home The underlined noun indicates the head of the critical second subject noun phrase. Subscript indices indicate intended backward anaphoric dependencies. 402 N. Kazanina et al. / Journal of Memory and Language 56 (2007) 384–409 Acceptability rating task (off-line) Procedure and analysis As in Experiments 1 and 2, an off-line acceptability rating task was used to confirm that the configurations in the no-constraint and Principle C conditions effectively varied the acceptability of coreference between the pronoun and the subject of the second clause. Participants were the same 60 native speakers of English who participated in the on-line portion of Experiment 3. To avoid bias from the acceptability rating task on the on-line results, participants completed the rating questionnaire after the on-line experiment. The task was identical to that used in Experiments 1 and 2, i.e., participants rated the acceptability of coreference on a scale from 1 (impossible) to 5 (absolutely natural). Twentyfour sets of stimuli were constructed based on 24 target stimulus sets from on-line Experiment 3. The Principle C and the no-constraint conditions were derived from the gender-match conditions of the on-line study by removing the sentence-final but-clause. The forwards anaphora condition was created by exchanging the pronoun and the second subject noun phrase in the Principle C condition. This condition was included to address the possible concern that lower acceptability ratings in the Principle C condition might reflect the implausibility of having the main and embedded clause events be simultaneously performed by the same agent. The but-condition was identical to the Principle C/match condition in the online study and was included to test the effectiveness of our effort to provide a licit intra-sentential antecedent for the pronoun through the subject of an additional but-clause. Four experimental lists were constructed using a Latin Square design, such that each participant saw only one condition from each set. In addition to the 24 experimental sentences, each questionnaire also contained 36 filler sentences that were identical across the four lists. The mean coreference rating scores from Experiment 3 are summarized in Table 6. The Principle C condition received a mean rating score of 1.7 that was significantly lower than the rating score in the other three conditions (2-tailed paired t-test, all ps < .01). The coreference rating score in the no-constraint and the but-conditions was significantly lower than in the forwards anaphora condition, but this is expected given that forwards anaphora is the preferred way of expressing coreference in these contexts. The procedure was a self-paced word-by-word moving window task, identical to Experiments 1 and 2. Each sentence was followed by a yes-no comprehension question. The data from 2 of the 60 participants could not be included due to technical problems. To balance the number of participants across each list, one participant was excluded from each list, such that analysis was performed on 56 participants, distributed equally among lists. As in Experiment 1, trials on which the comprehension question was answered incorrectly were excluded, and reading-times that exceeded a threshold of 2.5 standard deviations above the mean reading rate for each region and participant were replaced by that threshold value. This procedure affected 2.2% of trials (2.1–2.3% for individual conditions). The data were entered into a 2 · 2 repeated-measures ANOVA. Table 6 Mean rating scores from Experiment 3 Condition Principle C No-constraint Forwards anaphora But-condition Mean rating (Standard error) 1.7 3.4 4.3 3.9 (.09) (.13) (.08) (.09) Results The mean accuracy on comprehension question was 92.6% (91.6-93.8% for individual conditions). There were no reliable differences in accuracy between individual conditions (Fs < 1). The results from all conditions in Experiment 3 are shown in Fig. 4. The results of the 2 · 2 ANOVAs are summarized in Table 7. In the first clause, there was a significant main effect of the constraint factor at the verb-adverb sequence in regions 3 and 4 due to longer reading times in the noconstraint conditions than in the Principle C conditions (region 3: mean for the no-constraint conditions = 406.3 ms, mean for the Principle C conditions = 338.6 ms, 95% CI = 18.3 ms; region 4: 381.2, 345.1 and 14.6 ms respectively). In region 5 there was a significant main effect both of constraint (mean for the no-constraint conditions = 358.0 ms, mean for the Principle C conditions = 338.7 ms, 95% CI = 15.1 ms) and of congruency (mean for the gender-match conditions = 340.9 ms, mean for the gender-mismatch conditions = 355.8 ms, 95% CI = 8.8 ms), but no significant interaction of the factors. These effects were partly due to lexical differences between conditions at the preceding regions and partly unexpected. At the conjunction while in region 6 there was a significant main effect of constraint (mean for the no-constraint conditions = 362.3 ms, mean for the Principle C conditions = 337.8 ms, 95% CI = 12.8 ms) and a significant constraint · congruency interaction. Pairwise comparisons revealed that the interaction was caused by an unexpected effect of congruency in the Principle C conditions that contained identical lexical material until that point (region 6: mean for the gender-match condition = 328.3 ms, mean for the gender-mismatch condition = N. Kazanina et al. / Journal of Memory and Language 56 (2007) 384–409 Mean reading time (ms) 460 subject noun 403 Principle C, match Principle C, mismatch No constraint, match 420 No constraint, mismatch 380 340 300 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 He (she)1 / [Principle C] or His (her)1 / managers2 / [no-constraint] / chatted3 / amiably4 / with some fans5 / while6 / the7 / talented,8 / young9 / quarterback10 / signed11 / autographs12 /for13 /the kids,14 / but15 / Steve (Carol)16 / wished17 / the18 / children's charity event would end soon so he could go home19. Fig. 4. Mean reading times for the Principle C and no-constraint conditions from Experiment 3. The arrow marks the position of the critical gender-marked head-noun. The error bars represent one standard error above/below the participant mean at each region. The arrow marks the critical second subject noun phrase. 347.4 ms, 95% CI = 17.0 ms) that was not observed in the no-constraint conditions (mean for the gendermatch condition = 369.6 ms, mean for the gender-mismatch condition = 354.9 ms, 95% CI = 17.2 ms). The effect of constraint was marginally significant at the determiner in region 7 (mean for the no-constraint conditions = 326.0 ms, mean for the Principle C conditions = 316.9 ms, 95% CI = 9.6 ms), and was significant at the first adjective in region 8 (mean for the no-constraint conditions = 354.1 ms, mean for the Principle C conditions = 331.6 ms, 95% CI = 16.6 ms). Importantly, there were no effects of congruency in the two regions preceding the critical noun and similarly, no effects of the constraint factor in the region immediately preceding the critical noun. At the critical noun in region 10 there was a main effect of congruency (mean for the gender-match conditions = 367.1 ms, mean for the gender-mismatch conditions = 389.4 ms, 95% CI = 14.1 ms) and a significant constraint · congruency interaction. Separate pairwise comparisons of the Principle C and no-constraint conditions revealed a strong effect of congruency in the noconstraint pair in the predicted direction (region 10: mean for the gender-match condition = 364.6, mean for the gender-mismatch condition = 402.5, 95% CI = 18.1 ms). No corresponding effect in the Principle C pair (region 10: mean for the gender-match condition = 369.6 ms, mean for the gender-mismatch condition = 376.4 ms, 95% CI = 19.9 ms). The constraint · congruency interaction was also significant at the word following the subject noun in region 11. Once again, pairwise comparisons within each level of the constraint factor showed that the interaction was due to the presence of a significant effect of congru- ency in the no-constraint conditions (mean for the gender-match condition = 384.4, mean for the gendermismatch condition = 417.0, 95% CI = 24.9 ms), and no corresponding effect in the Principle C conditions (mean for the gender-match condition = 395.6 ms, mean for the gender-mismatch condition = 391.3 ms, 95% CI = 22.4 ms). There were no other significant effects in the remainder of the second clause or anywhere in the final clause. Discussion The main finding of Experiment 3 was the presence of a gender mismatch effect in the no-constraint condition at the critical gender-marked head noun. The immediacy and robustness of this effect in Experiment 3 is important in light of the fact that the corresponding effect was marginally significant in Experiment 1 and delayed in Experiment 2. The gender mismatch effect was present despite the fact that there was no independent prediction for an antecedent position at the time when the cataphoric pronoun was encountered. This suggests that the parser’s active search for pronoun antecedents is not a strictly instantaneous process that occurs when the pronoun is first encountered, but is rather a continuous process that applies whenever it becomes possible to anticipate a new potential antecedent position. In this respect, therefore, active search for pronoun antecedents resembles active search for gaps in filler-gap dependencies. In contrast, the gender mismatch effect was not found at the critical second subject or in any other region in the second clause in the Principle C conditions. These results corroborate the conclusion drawn from Experiments 1 404 N. Kazanina et al. / Journal of Memory and Language 56 (2007) 384–409 Table 7 Results of 2 · 2 ANOVAs and min F 0 values for Experiment 3 By participants Min F 0 By items df MSE F1 p df F2 p df min F 0 p Region 6 (while) Constraint Congruency Constraint times congruency 1,55 1,55 1,55 2283.0 2230.7 1843.1 13.7 <1 7.6 0.001 0.825 0.008 1,24 1,24 1,24 14.9 <1 11.2 0.001 0.832 0.003 1,69 1,65 1,75 7.1 <1 4.5 0.009 0.878 0.037 Region 7 (the) Constraint Congruency Constraint times congruency 1,55 1,55 1,55 1293.0 1205.2 1500.1 3.6 <1 <1 0.062 0.936 0.652 1,24 1,24 1,24 4.2 <1 <1 0.051 0.851 0.896 1,71 1,71 1,28 2.0 <1 <1 0.167 0.941 0.900 Region 8 (talented) Constraint Congruency Constraint times congruency 1,55 1,55 1,55 3848.1 2863.8 3604.4 6.2 1.3 1.4 0.016 0.260 0.250 1,24 1,24 1,24 11.4 1.5 1.1 0.003 0.229 0.316 1,78 1,71 1,60 4.0 <1 <1 0.049 0.405 0.445 Region 9 (young) Constraint Congruency Constraint times congruency 1,55 1,55 1,55 3726.0 1963.0 3876.2 1.0 <1 <1 0.326 0.779 0.655 1,24 1,24 1,24 1.2 <1 <1 0.280 0.928 0.576 1,72 1,29 1,77 <1 <1 <1 0.463 0.931 0.725 Region 10 (quarterback) Constraint Congruency Constraint times congruency 1,55 1,55 1,55 2772.2 3077.2 1949.5 2.5 8.7 8.1 0.118 0.005 0.006 1,24 1,24 1,24 3.0 4.9 6.2 0.097 0.037 0.021 1,71 1,52 1,60 1.4 3.1 3.5 0.246 0.082 0.066 Region 11 (signed) Constraint Congruency Constraint times congruency 1,55 1,55 1,55 5072.1 3125.3 4710.2 <1 2.9 4.2 0.468 0.095 0.045 1,24 1,24 1,24 <1 2.8 5.7 0.431 0.110 0.026 1,71 1,66 1,74 <1 1.4 2.4 0.591 0.239 0.124 Region 12 (authographs) Constraint Congruency Constraint times congruency 1,55 1,55 1,55 2112.0 1405.5 2298.9 <1 3.9 2.0 0.980 0.052 0.161 1,24 1,24 1,24 <1 <1 1.3 0.935 0.408 0.258 1,65 1,33 1,56 <1 <1 <1 0.981 0.444 0.373 Region 13 Constraint Congruency Constraint times congruency 1,55 1,55 1,55 1642.1 1175.3 1593.0 4.4 <1 <1 0.041 0.866 0.886 1,24 1,24 1,24 2.9 <1 <1 0.103 0.845 0.606 1,56 1,74 1,63 1.7 <1 <1 0.193 0.898 0.890 Region 14 Constraint Congruency Constraint times congruency 1,55 1,55 1,55 2114.9 2606.5 3352.8 1.3 4.3 <1 0.259 0.042 0.562 1,24 1,24 1,24 <1 1.8 <1 0.407 0.192 0.871 1,51 1,45 1,28 <1 1.3 <1 0.499 0.263 0.875 Region 15 (but) Constraint Congruency Constraint times congruency 1,55 1,55 1,55 1847.7 1536.3 1672.7 <1 <1 <1 0.748 0.740 0.460 1,24 1,24 1,24 <1 <1 <1 0.617 0.742 0.794 1,79 1,67 1,30 <1 <1 <1 0.786 0.814 0.805 Region 16 (3rd subject) Constraint Congruency Constraint times congruency 1,55 1,55 1,55 1530.2 1227.0 1604.4 1.5 <1 <1 0.224 0.987 0.398 1,24 1,24 1,24 1.9 <1 <1 0.183 0.865 0.493 1,72 1,56 1,56 <1 <1 <1 0.363 0.987 0.592 Region 17 Constraint Congruency Constraint times congruency 1,55 1,55 1,55 1992.0 1251.7 1799.6 1.6 0.1 0.2 0.205 0.704 0.664 1,24 1,24 1,24 1.8 0.2 0.0 0.190 0.661 0.911 1,69 1,74 1,27 0.9 0.1 0.0 0.355 0.773 0.914 N. Kazanina et al. / Journal of Memory and Language 56 (2007) 384–409 and 2 that the initial set of antecedents for a cataphoric pronoun does not contain noun phrase candidates that violate Principle C. Note that this does not necessarily exclude the possibility that the ‘illicit’ noun phrases might be considered at some later point, before they are ultimately rejected, although we find no evidence of such ‘late’ effects (see Sturt, 2003 for evidence of late effects involving forwards anaphora). General discussion Our primary concern in this study was to investigate the extent to which mechanisms for processing backwards anaphoric dependencies match the mechanisms that have been previously reported for the processing of filler-gap dependencies, with a focus on mechanisms of active search for pronoun antecedents and on the impact of grammatical constraints on coreference relations. Results from our three off-line and three on-line experiments together yield a fairly consistent picture. The three off-line rating studies all showed that judgments of coreference are substantially degraded when a pronoun c-commands its antecedent, as predicted by the Principle C constraint. These results show that Principle C captures genuine acceptability contrasts, but leave open the question of whether the constraint impacts the real-time construction of coreference relations. The three on-line experiments all used the gender mismatch paradigm (Aoshima et al., submitted for publication; Van Gompel & Liversedge, 2003) to test where active dependency formation occurs. All three experiments found a gender mismatch effect in the no-constraint conditions and no gender mismatch effect in the Principle C conditions. These findings are therefore consistent with the hypothesis that upon encountering a cataphoric pronoun the parser actively searches for an antecedent for the pronoun, except in positions that are excluded by grammatical constraints. This in turn suggests that constraints on coreference apply during the earliest stages of sentence processing. The results of Experiment 1 showed the contrast between the no-constraint conditions and the Principle C conditions, and a fifth control condition demonstrated that the gender mismatch effect in the no-constraint conditions could not simply be an effect of additional discourse complexity. However, in this experiment an additional difference in structure predictability was confounded with the difference in the relevance of Principle C. The lack of a gender mismatch effect in the Principle C conditions could have been due to the effects of Principle C, or it may have reflected the fact that the critical antecedent position could not be reliably predicted far in advance. Experiments 2 and 3 were designed to match the conditions in terms of predictability. Experiment 2 contained structures that shared equally strong indicators (prior to 405 the pronoun) of the upcoming critical subject position, while in Experiment 3 the critical antecedent position was equally unpredictable at the pronoun across the two pairs of conditions. In both experiments we obtained essentially the same results as in Experiment 1: a significant reading time slowdown associated with gender mismatch in the no-constraint conditions, but no corresponding slowdown in the Principle C conditions. These findings suggest that the parser considered the critical subject noun phrase as a potential antecedent in advance of bottom-up morphological information, but only if the subject position was not ruled out by a grammatical constraint on coreference. These findings clearly parallel results from previous work on filler-gap dependencies that has shown that those dependencies are parsed using an active search mechanism that posits gaps in advance of bottom-up confirmation, but does not posit gaps in positions that are excluded by island constraints on long-distance dependencies (Phillips, 2006; Stowe, 1986; Traxler & Pickering, 1996). Backwards anaphora constructions have the same dependent-element-first property as filler-gap dependencies, but differ from filler-gap dependencies in a number of respects, most notably the fact that potential antecedent noun phrases are overt and hence easier to identify based on bottom-up information than phonologically null gaps. The similarity in the processing mechanisms for these two different-looking types of constructions raises the possibility that the parser might draw on a single mechanism for all linguistic dependencies that share this same basic dependent-element-first structure. It also suggests that the active search mechanism is not a direct consequence of the fact that gaps are difficult to identify using bottom-up information alone, since the same mechanism is found in the search for antecedents of pronouns. In light of the evidence from Experiment 3 for continuous (rather than instantaneous) resolution of cataphoric pronouns, the difference in the amplitude and timing of the gender mismatch effect in Experiment 2 relative to the other two experiments cannot be accounted in terms of the weaker predictability of the second subject position in that experiment. Instead, the strength of the mismatch effect may be dependent on the distance of the critical subject position from its structural predictor. This idea was supported by the stronger mismatch effect observed in Experiment 3. Although the critical subject position in that experiment was entirely unpredictable at the point when the pronoun was first processed, the subsequent subordinator (e.g., while) made it possible to predict the head of the subject noun phrase three words before it was confirmed by bottom-up information. This increased distance may have enabled the parser to fully process the subordinator and the accompanying structural prediction for a subject position. Although it is possible that the parser did not initiate 406 N. Kazanina et al. / Journal of Memory and Language 56 (2007) 384–409 processing of the pronoun-antecedent relation until bottom-up information about the antecedent was encountered, as assumed by Van Gompel and Liversedge (2003), this would entail that the consequences of grammatical constraints such as Principle C are computed faster than the gender of the antecedent can be identified, and would also make it more difficult to account for the delayed gender mismatch effect in Experiment 2. Generality of active search mechanism To understand how the parser actively constructs long-distance dependencies in advance of bottom-up information it is important to know under what circumstances dependency formation can be initiated. In the domain of wh-dependencies there is good reason for the parser to initiate the search for a gap as soon as a wh-phrase is encountered, since the wh-phrase itself provides sufficient evidence that a gap must occur somewhere in the sentence. Matters are less straightforward in the case of backwards anaphora, since the occurrence of a pronoun early in a sentence does not normally guarantee that an antecedent for the pronoun will appear later in the sentence. Furthermore, the occurrence of a pronoun does not normally guarantee that even a potential antecedent position will appear. However, the results of Experiment 3 indicate that active search can be initiated even when evidence for an upcoming potential antecedent position does not occur until a few words after the pronoun. Of course, we cannot at present exclude the possibility that the generality of active search in our studies was related to the fact that the target and filler items in our studies were designed to ensure that all pronouns found an intra-sentential antecedent. This aspect of the design was well-motivated, based on our interest in the absence of gender mismatch effects in the Principle C conditions, but it may have exaggerated the evidence for active search. A related question about the generality of active search involves the range of positions where the parser considers potential antecedents to occur. In the case of wh-constructions previous evidence indicates that gaps are actively posited in object positions (Stowe, 1986) and subject positions (Lee, 2004) of the clause that is local to the wh-phrase, and also in embedded clauses that could not have been anticipated at the point when the wh-phrase was first encountered (Aoshima et al., 2004; Frazier & Clifton, 1989; Phillips, 2006; Phillips, Kazanina, & Abada, 2005). It should be noted that the studies reported here share with previous studies on backwards anaphora processing (Aoshima et al., submitted for publication; Van Gompel & Liversedge, 2003) the limitation that they show active construction of anaphoric dependencies only in subject positions. This limitation does not undermine our conclusions about the effects of Principle C, but it leaves open the question of whether the active search for pronoun antecedents is as general as the active search for gaps. Temporal priority for structural information As discussed in the Introduction, the existence of gender mismatch effects implies that the parser constructs anaphoric dependencies before gender information about a potential antecedent noun phrase is fully processed. Our current results add to this the finding that structural constraints on coreference such as Principle C are implemented before gender information is processed, providing further evidence for a temporal priority for structural information in the processing of backwards anaphora. Van Gompel and Liversedge (2003) argue that this timing advantage for syntactic information provides evidence against ‘strongly interactive’ processing models and favors ‘modular’ models that give priority to syntactic evidence (p. 138). We would argue, however, that this conclusion is not required, since the temporal priority for syntactic information in processing backwards anaphora can be explained without the need to appeal to architectural constraints that specifically delay the use of certain types of information. We suggest that structural information has priority in this domain because it is often the only type of information about a potential antecedent that is reliably derivable before bottom-up information about the noun phrase is encountered. If the parser encounters structural evidence for an upcoming clause, then it can immediately and reliably predict that the clause will have a subject noun phrase, and it could also immediately evaluate whether that subject noun phrase is in a position that makes it a structurally licit or illicit antecedent for a previously encountered pronoun. This information could all be computed before any bottomup information about the subject noun phrase is encountered. In contrast, other information about the subject noun phrase such as its morphological features and its semantic match to a previously encountered pronoun cannot be evaluated except via bottom-up information, because these properties are not structurally predictable. Thus, under this view syntactic information plays a crucial role because it enables the parser to make predictions about upcoming material earlier than any other type of information, and hence there is no need to impose architectural constraints that force certain information types to have priority. Note that this line of reasoning might apply differently in languages that display richer morphological agreement than English, such that it may be possible to reliably predict morphological properties of an upcoming noun in advance of the noun itself. Studies on the processing of forwards anaphora have also explored the issue of priority for structural information, and the discussion has focused on the question of N. Kazanina et al. / Journal of Memory and Language 56 (2007) 384–409 whether structural constraints on coreference immediately restrict the parser’s selection of candidate antecedents for pronouns and reflexives. Some studies have found immediate effects of structural constraints in processing forwards anaphora (e.g., Nicol & Swinney, 1989; Sturt, 2003), while others have reported effects of ungrammatical antecedents on the processing of pronouns (e.g., Badecker & Straub, 2002; Kennison, 2003; see also Runner, Sussman, & Tanenhaus, 2003). Sturt (2003) proposes a ‘binding-as-defeasible-filter’ model that attributes the differing results across studies to different processing stages. Sturt suggests that syntactic constraints on coreference are respected at an initial phase of processing, but may be violated at later stages. He observes time-course differences in his own studies in the effects of structurally accessible and inaccessible antecedents. Further, Sturt’s account may capture the fact that in previous studies even highly salient structurally inaccessible antecedents have failed to elicit a gender mismatch effect, a measure of early anaphor resolution (for more detailed review see Kazanina, 2005). In the respect that we observed no effects of structurally inaccessible antecedents in our studies, our results are compatible with Sturt’s account. Note, however, that care should be taken in comparing the resolution of forwards and backwards anaphora, due to differences in their temporal dynamics and due to differences in the representations that must be searched for an antecedent. Forwards anaphora involves a retrospective search for an antecedent in memory, and multiple candidate antecedents may be considered simultaneously. Backwards anaphora, in contrast, involves a prospective search for an antecedent. Consequently, potential antecedents do not need to be retrieved from memory, and may be evaluated in succession. In contrast to some studies of forwards anaphora, our studies found no delayed effect of structurally inaccessible antecedents. This difference may reflect either the temporal dynamics of backwards anaphora resolution, or the different representations that are searched for an antecedent. Analogous effects of giving priority to one type of information over others during anaphora processing have also been discussed by Sanford and Garrod (1989) for cross-sentential forwards anaphor resolution. Their findings suggest that bottom-up morphological information about a noun phrase and a pronoun can set in motion a process they call ‘bonding’: the formation of a tentative (and ultimately incorrect) dependency in cases like ‘‘Harry was sailing to Ireland. It sank without trace’’ on the basis of the shared singular morphology of Ireland and it (originally reported in Sanford, Garrod, Lucas, & Henderson, 1984). These findings are similar to ours in the sense that one type of information has temporal priority over others in the processing of anaphoric dependency. However, whereas the Sanford and Garrod’s effect relies on bottom-up morpho- 407 logical information to begin the resolution process, our cases crucially depend on the combined use of information from the pronoun and structural expectations to create a dependency in advance of the bottom-up information, although the full details are filled in only after the bottom-up information about the antecedent is encountered. Therefore, we think that these two types of effects are probably driven by distinct mechanisms. Implications for ungrammatical representations Finally, our results are also relevant to questions about the nature of grammatical constraints and how they are implemented. In some linguistic frameworks there is a clear distinction between the range of structures that the grammar generates and the range of structures that satisfy all grammatical constraints or ‘filters.’ This overgenerate-and-filter property is explicitly advocated in such frameworks as Optimality Theory (Prince & Smolensky, 1993/2004) and Government and Binding Theory (Chomsky, 1981). A different perspective on illicit representations is that ill-formedness is the result of an inability to generate a structure, such that there is a closer match between the set of wellformed structures and the set of representable structures. This is more characteristic of analyses in such approaches as Head-Driven Phrase Structure Grammar (HPSG: Pollard & Sag, 1994) and the Minimalist Program (Chomsky, 1995). To the extent that parsing is seen as reflective of the linguistic knowledge modeled by these theories, our results are consistent with the latter view, at least for Principle C. Speakers do not appear to generate representations that violate Principle C and then reject them based on application of a later filter. Instead, it appears that Principle C violations simply do not occur to speakers of English. Note that this finding contrasts with the conclusion from an interesting but controversial set of findings by Freedman and Forster (1985), who claimed based on the results of a series of sentence-matching studies that the parser considers illicit filler-gap dependencies that violate syntactic island constraints (for alternative interpretations see Crain & Fodor, 1987; Stowe, 1992). Note, also, that our results cannot exclude the possibility that the parser does temporarily construct representations that violate Principle C, but successfully filters these representations before any bottom-up information about an antecedent is presented. Conclusion We have shown that upon encountering a cataphoric pronoun the parser initiates an active search for an antecedent in the following material. Importantly, during this search the parser does not consider positions that 408 N. Kazanina et al. / Journal of Memory and Language 56 (2007) 384–409 are inaccessible antecedent positions due to a structural constraint on coreference, Principle C. These results support a view whereby syntactic constraints immediately restrict active search processes, whether the search is for a thematic position, as in wh-dependencies, or for a referent, as in backwards anaphora. More generally, our results suggest that the mechanisms used for processing wh-dependencies and referential dependencies like backwards anaphora might be largely identical. Acknowledgments We are grateful to Roger van Gompel and to an anonymous reviewer for valuable feedback on an earlier draft. This research was supported by grants to CP from the National Science Foundation (#BCS-0196004) and the Human Frontiers Science Program (#RGY-0134). EFL was supported by an NSF Graduate Fellowship. Appendix A. Supplementary data Supplementary data associated with this article can be found, in the online version, at doi:10.1016/j.jml.2006.09.003. References Aoshima, S. (2003). The grammar and parsing of wh-dependencies. PhD dissertation, University of Maryland, College Park. Aoshima, S., Phillips, C., & Weinberg, A. S. (2004). Processing filler-gap dependencies in a head-final language. Journal of Memory and Language, 51, 23–54. Aoshima, S., Yoshida, M., & Phillips, C. (submitted for publication). Incremental processing of coreference and binding in Japanese. Badecker, W., & Straub, K. (2002). The processing role of structural constraints on the interpretation of pronouns and anaphora. Journal of Experimental Psychology: Learning Memory and Cognition, 28, 748–769. Baker, M. A. (1991). On some subject/object non-asymmetries in Mohawk. Natural Language and Linguistic Theory, 9, 537–576. Bourdages, J. S. (1992). Parsing complex NPs in French. In H. Goodluck & M. S. Rochemont (Eds.), Island constraints: Theory, acquisition and processing (pp. 61–87). Netherlands: Kluwer Academic Publishers. Brody, M. (1994). Phrase structure and dependence. London: Ms. University College. Bruening, B. (2001). Syntax at the edge: cross-clausal phenomena and the syntax of Passamaquoddy. PhD dissertation, MIT. Chomsky, N. (1973). Conditions on transformations. In S. Anderson & P. Kiparsky (Eds.), A festschrift for morris halle (pp. 232–286). New York: Holt, Rinehart & Winston. Chomsky, N. (1981). Lectures on government and binding. Dordrecht: Foris. Chomsky, N. (1995). The minimalist program. Cambridge, Mass: MIT Press. Clark, H. H. (1973). The language-as-fixed-effect fallacy: a critique of language statistics in psychological research. Journal of Verbal Learning and Verbal Behavior, 12, 335–359. Clifton, C., Kennison, S. M., & Albrecht, J. E. (1997). Reading the words him and her: implications for parsing principles based on frequency and on structure. Journal of Memory and Language, 36, 276–292. Cowart, W., & Cairns, H. S. (1987). Evidence for an anaphoric mechanism within syntactic processing: some reference relations defy semantic and pragmatic constraints. Memory & Cognition, 15, 318–331. Crain, S., & Fodor, J. D. (1985). How can grammars help parsers. In D. R. Dowty, L. Karttunen, & A. M. Zicky (Eds.), Natural language parsing: psychological, computational, and theoretical perspectives (pp. 94–128). Cambridge: Cambridge University Press. Crain, S., & Fodor, J. D. (1987). Sentence matching and overgeneration. Cognition, 26, 123–169. Crain, S., & McKee, C. (1985). The acquisition of structural restrictions on anaphora. In S. Berman, J. Choe, & J. McDonough (Eds.), Proceedings of NELS15. Amherst, MA: GLSA Publications. Fodor, J. A. (1983). The modularity of mind. Cambridge, MA: MIT Press. Fodor, J. D. (1978). Parsing strategies and constraints on transformations. Linguistic Inquiry, 9, 427–473. Frazier, L. (1987). Sentence processing: a tutorial review. In M. Coltheart (Ed.), Attention and performance XII. Hillsdale, N.J.: Erlbaum. Frazier, L., & Clifton, C. (1989). Successive cyclicity in the grammar and the parser. Language and Cognitive Processes, 4, 93–126. Freedman, S. E., & Forster, K. I. (1985). The psychological status of overgenerated sentences. Cognition, 19, 101–131. Garnsey, S. M., Tanenhaus, M. K., & Chapman, R. M. (1989). Evoked potentials and the study of sentence comprehension. Journal of Psycholinguistic Research, 18, 51–60. Gordon, P. C., & Hendrick, R. (1997). Intuitive knowledge of linguistic co-reference. Cognition, 62, 325–370. Harris, C. L., & Bates, E. A. (2002). Clausal backgrounding and pronominal reference: a functionalist approach to ccommand. Language and Cognitive Processes, 17, 237–269. Heim, I. (1992). Anaphora and semantic interpretation: a reinterpretation of Reinhart’s approach. In U. Sauerland & O. Percus (Eds.), The interpretive tract (pp. 205–246). Cambridge, MA: MIT Working Papers in Linguistics. Hirst, W., & Brill, G. A. (1980). Contextual aspects of pronoun assignment. Journal of Verbal Learning and Verbal Behavior, 19, 168–175. Huang, C.-T. J. (1982). Move-wh in a language without whmovement. The Linguistic Review, 1, 369–416. Jelinek, E. (1984). Empty categories, case, and configurationality. Natural Language and Linguistic Theory, 2, 39–76. Just, M. A., Carpenter, P. A., & Woolley, J. D. (1982). Paradigms and processes and in reading comprehension. Journal of Experimental Psychology: General, 3, 228–238. N. Kazanina et al. / Journal of Memory and Language 56 (2007) 384–409 Kaan, E., Harris, A., Gibson, E., & Holcomb, P. (2000). The P600 as an index of syntactic integration difficulty. Language and Cognitive Processes, 15, 159–201. Kazanina, N. (2005). The acquisition and processing of backwards anaphora. Ph.D. dissertation, University of Maryland, College Park. Kazanina, N., & Phillips, C. (2001). Coreference in child russian: distinguishing syntactic and discourse constraints. In A. Do, L. Domiı́nguez, & A. Johansen (Eds.), Proceedings of the 25th Annual Boston University Conference on Language Development (pp. 413–424). Somerville MA: Cascadilla Press. Kennison, S. M. (2003). Comprehending the pronouns her, him, and his: Implications for theories of referential processing. Journal of Memory and Language, 49, 335–352. Lee, M. (2004). Another look at the role of empty categories in sentence processing (and grammar). Journal of Psycholinguistic Research, 33, 51–73. Masson, M. E. J., & Loftus, G. R. (2003). Using confidence intervals for graphically based data interpretation. Canadian Journal of Experimental Psychology, 57, 203–220. Nicol, J. L., Fodor, J. D., & Swinney, D. (1994). Using crossmodal lexical decision tasks to investigate sentence processing. Journal of Experimental Psychology: Learning, Memory & Cognition, 20, 1229–1238. Nicol, J., & Swinney, D. (1989). The role of structure in coreference assignment during sentence comprehension. Journal of Psycholinguistic Research, 18, 5–19. Phillips, C. (2006). The real-time status of island phenomena. Language, 82, 4. Phillips, C., Kazanina, N., & Abada, S. H. (2005). ERP effects of the processing of syntactic long-distance dependencies. Cognitive Brain Research, 22, 407–428. Phillips, C. & Wagers, M. (in press). Relating structure and time in linguistics and psycholinguistics. In: G. Gaskell (ed.), Oxford handbook of psycholinguistics. Oxford, UK: Oxford University Press. Pickering, M. J. (1993). Direct association and sentence processing: a reply to Gorrell and to Gibson and Hickok. Language and Cognitive Processes, 8, 163–196. Pickering, M. J., Barton, S., & Shillcock, R. (1994). Unbounded dependencies, island constraints and processing complexity. In C. Clifton, L. Frazier, & K. Rayner (Eds.), Perspectives on sentence processing (pp. 199–224). London: Lawrence Erlbaum. Pollard, C., & Sag, I. A. (1994). Head-driven phrase structure grammar. Chicago: University of Chicago Press. Prince, A., & Smolensky, P. (1993/2004). Optimality theory: Constraint interaction in generative grammar. Technical 409 Report, Rutgers University and University of Colorado at Boulder, 1993. Revised version published by Blackwell, 2004. Rutgers Optimality Archive 537. Reinhart, T. (1983). Anaphora and semantic interpretation. London: Croom Helm. Reinhart, T., & Reuland, E. (1993). Reflexivity. Linguistic Inquiry, 24, 657–720. Resnik, P. & Elkiss, A. (2003). The linguist’s search engine: Getting started guide. Technical Report LAMP-TR-108, University of Maryland. Ross, J. R. (1967). Constraints on variables in syntax. Ph.D. dissertation, MIT. Runner, J. T., Sussman, R. S., & Tanenhaus, M. K. (2003). Assignment of reference to reflexives and pronouns in picture noun phrases: evidence from eye movements. Cognition, 89, B1–B13. Sanford, A. J., Garrod, S. C., Lucas, A., & Henderson, R. (1984). Pronouns without antecedents? Journal of Semantics, 2, 303–318. Sanford, A. J., & Garrod, S. C. (1989). What, when and how? Questions of immediacy in anaphoric reference resolution. Language and Cognitive Processes, 4, 235–262. Speas, M. (1990). Phrase structure in natural language. Dordrecht: Kluwer Academic Publishers. Stowe, L. (1986). Evidence for online gap creation. Language and Cognitive Processes, 1, 227–245. Stowe, L. A. (1992). The processing implementation of syntactic constraints: the sentence matching debate. In H. Goodluck & M. Rochemont (Eds.), Island constraints: Theory, acquisition and processing (pp. 419–443). Dordrecht: Kluwer Academic Publishers. Sturt, P. (2003). The time-course of the application of binding constraints in reference resolution. Journal of Memory and Language, 48, 542–562. Sussman, R. S., & Sedivy, J. C. (2003). The time-course of processing syntactic dependencies: evidence from eye-movements during spoken wh-questions. Language and Cognitive Processes, 18, 143–163. Traxler, M. J., & Pickering, M. J. (1996). Plausibility and the processing of unbounded dependencies: an eye-tracking study. Journal of Memory and Language, 35, 454–475. Van Gompel, R. P. G., & Liversedge, S. P. (2003). The influence of morphological information on cataphoric pronoun assignment. Journal of Experimental Psychology: Learning, Memory, and Cognition, 29, 128–139. van Hoek, K. (1997). Anaphora and conceptual structure. Chicago: University of Chicago Press.
© Copyright 2025 Paperzz