EXCLUSION INFERENCES VS. SCALAR IMPLICATURE Differentiating scalar implicature from mutual exclusivity in language acquisition [Authors and Affiliations Blinded] Total Word Count: 9,597 Abstract Word Count: 205 [Author contact information blinded] 1 EXCLUSION INFERENCES VS. SCALAR IMPLICATURE 2 Abstract When acquiring language, children must not only learn the meanings of words, but also how to interpret them in context. For example, as part of acquiring the quantifiers some and all, children must learn to differentiate the logical semantics of the scalar quantifier some from its pragmatically enriched meaning, “some but not all”. Many studies report that this type of inference, generally referred to as “scalar implicature”, poses a difficult challenge to children until quite late in development – as late as 9 or 10 years of age. In contrast, more recent studies find successes as early as 3 years of age. We ask whether the computation of exclusion inferences (like those licensed by contrast or mutual exclusivity) – rather than an ability to compute scalar implicatures – might account for children’s performance on tasks showing early success on distinguishing some from all. We find strong evidence that young children (N = 214; ages 4;0-7;11) compute exclusion inferences when relating some and all to one another even when such inferences are inappropriate in the context. We also find evidence that, in some contexts, performance that is consistent with the computation of scalar implicature can actually be explained by the computation of exclusion inferences like contrast and mutual exclusivity. Keywords: Pragmatics, mutual exclusivity, contrast, scalar implicature, inference EXCLUSION INFERENCES VS. SCALAR IMPLICATURE 3 When children encounter new words, they assume that they differ in meaning from previously learned words. For example, when shown a pair of objects – one novel, and the other familiar – and told to “Find the blicket”, children as young as 18 months of age preferentially select the novel referent (Halberda, 2003; Markman, Wasow, & Hansen, 2003; Spiegel & Halberda, 2011). Results like this have been reported in relation to children’s acquisition of nouns, verbs, adjectives, proper names, and even number words (e.g., Au & Markman, 1987; Carey & Bartlett, 1978; Clark, 1987; 1988; 1990; Wynn, 1992), suggesting that children very generally assume that different words encode different meanings. According to many accounts, this reasoning reflects a more general pragmatic approach to language: Children assume that speakers are cooperative Gricean interlocutors who seek to be truthful, informative, relevant, and clear (Grice, 1970). Thus, they reason that if the speaker had intended to refer to the object with the known label then they would have used that word, and thus must have intended to label the novel object. In this way, children can infer that utterances like those in (1) differ in meaning, even if they lack meanings for the words some and all or understand how some and all are related to one another. (1a) I ate some of the cookies. (1b) I ate all of the cookies. Taken alone, word learning assumptions like Clark’s (1987) Principle of Contrast and Au and Markman’s (1987) Mutual Exclusivity Assumption are useful, but incomplete, cues to word learning and interpretation. In the best case scenario, assumptions like these can indicate that two words differ somehow in meaning – even if only in connotation or conversational register. In the worst case, they can lead to false inferences – e.g., that a cat is not an animal. Very generally, in order to arrive at adult-like interpretations of words, the assumption that they have mutually EXCLUSION INFERENCES VS. SCALAR IMPLICATURE 4 exclusive meanings (or simply contrast) must be supplemented with additional information about the words under consideration (e.g., whether they are relevant alternatives to one another; whether one entails the other). For example, to interpret the sentences in (1) in an adult-like way (e.g., that the speaker is intending to convey that they ate some but not all of the cookies), children must know the literal meanings of some and all, including that all is strictly stronger than some in this context, and that if (1b) were true, then it would have been a more informative/optimal thing to say. Only with this knowledge is it possible to infer that (1a) implies the negation of (1b). This particular inference – a “scalar implicature” – is in some respects similar to inferences based on contrast or mutual exclusivity. Just as interpreting the novel word blicket is easier if one can use contrast to rule out “cat” as a possible meaning, interpreting an utterance containing some to mean “not all” involves negating a corresponding utterance that contains all. Indeed, several researchers have argued that contrast and mutual exclusivity rely on similar computational architecture to scalar implicature (Barner & Bachrach, 2010; Clark, 1990; Gathercole, 1989; Grice, 1970, 1991; Katsos & Wilson, 2014; Wynn, 1992). However, despite their similarities, these types of inferences exhibit important differences, both in their structure and their developmental trajectories. Below, we explore one property of scalar implicature that is not described by word learning constraints – what linguists call “asymmetric entailment”. In particular, we ask whether studies which claim to show early abilities to compute implicatures actually do so, or whether they might instead rely on simpler inferences, like those required by contrast and mutual exclusivity. In doing so, we not only ask whether previous studies provide valid tests of implicature, but also whether the ability to compute implicatures might arise initially from the same machinery that supports word learning. EXCLUSION INFERENCES VS. SCALAR IMPLICATURE 5 In order to understand how contrast and mutual exclusivity are related to scalar implicature, it is important to first characterize the computations that they involve. First, consider contrast (for discussion see Clark 1987; 1988; 1990). Contrast inferences arise from the assumption that differences in linguistic form signal differences in meaning, and are relatively weak, in the sense that they do not, in isolation license strong assumptions about reference. For example, contrast allows expressions like couch and sofa to convey different connotations while still labeling the same referent. Mutual exclusivity inferences can be described as enriched contrast inferences: “Mutual exclusivity is one kind of contrast, but it is a more specific and stronger assumption: many terms that contrast in meaning are not mutually exclusive.” (Woodward & Markman, 1991; p. 141) Much like contrast, mutual exclusivity inferences arise when listeners assume that each word has one meaning and each meaning is expressed by only one word, such that word “A” implies not-“B”, just as “B” negates “A” (Woodward & Markman, 1991). In addition, mutual exclusivity inferences also involve the assumption that differences in meaning predict differences in reference – a particular object labeled as a “blicket” cannot also be called a “toma”. Thus, in this sense, they are stronger than contrast alone. For the purposes of the present study, this difference between contrast and mutual exclusivity will not play a central role, and therefore we will refer to both forms of inference collectively as “exclusion inferences”, except in cases where the difference is important. According to Clark (1990), exclusion inferences can be characterized as Gricean in nature, and can therefore be thought of as being importantly similar to scalar implicature. For example, in a context including a cat and a novel object, a speaker who intends to refer to the cat is expected to say, “Look at the cat!” rather than, “Look at the blicket”. In this case, on a standard Gricean analysis, the listener hears the word blicket and assumes that the speaker's EXCLUSION INFERENCES VS. SCALAR IMPLICATURE 6 intended referent is not a cat, since if they had intended to refer to the cat then they would have chosen the word cat instead. This is much like the inference proposed by Gricean models of scalar implicature. In (1a), if the speaker had actually eaten all of the cookies then they should have said so; thus, saying they ate some of the cookies licenses the implicature that the statement containing all is false. While exclusion inferences are structurally similar to scalar implicature, they also differ in critical ways. First, exclusion inferences differ dramatically from scalar implicatures with respect to their developmental trajectories. While children readily compute exclusion inferences by the age of 2 or earlier (Au & Markman, 1987; Carey & Bartlett, 1978; Halberda, 2003; Heibeck & Markman, 1987; Markman & Wachtel, 1988; Wynn, 1992), the literature on scalar implicature is divided, with some studies reporting difficulties among children as old as 9 or 10 years of age, and others finding evidence of scalar implicature in 3-year-olds (Chierchia et al., 2001; Miller, Schmitt, Chang, & Munn, 2003; Noveck, 2001; Papafragou & Tantalou, 2004; Papafragou & Musolino, 2003; Stiller, Goodman, & Frank, 2015; Yoon, Wu, & Frank, 2015). A second difference is that, whereas exclusion inferences involve a symmetrical exclusion relationship (hence “mutual exclusivity”), implicature generally does not. For example, in (1a) and (1b) the utterances not only contrast, but also exhibit what linguists call “asymmetric entailment”: Whereas the truth of the utterance in (1b) entails that (1a) must also be true – i.e., if I ate all of the cake I must have eaten some of it – the opposite is not true. This observation has led some linguists to posit the existence of ordered lexical classes, sometimes called Horn scales, which are deployed in the service of scalar implicature (Horn, 1989; Gazdar, 1979; Levinson, 2000). According to this view, the ability to compute scalar implicatures involves not only the ability to contrast utterances, but also the ability to relate them to one another according to their EXCLUSION INFERENCES VS. SCALAR IMPLICATURE 7 relative informational strength in context. Specifically, upon hearing, e.g., “I ate some of the cookies”, the listener must infer that were the stronger alternative statement true – i.e., “I ate all of the cookies” – then the speaker would have said so, and consequently they must not believe this stronger statement to be true, but instead believe that they ate some, but not all, of the cookies. As already noted, whereas children compute exclusion inferences almost as soon as they can talk, they often fail to compute scalar implicatures until well into elementary school (Chierchia et al., 2001; Noveck, 2001; Papafragou & Musolino, 2003). Explanations for children’s failures vary, but there is increasing evidence that at least part of their difficulty lies in summoning relevant scalar alternatives – e.g., they fail to spontaneously conjure utterances containing all as relevant alternatives to utterances containing some, resulting in an inability to make a some-but-not-all inference (Barner, Brooks, & Bale, 2011; Hochstein, Bale, Fox, & Barner, 2014; Skordos & Papafragou, 2016). In support of this view, children appear to perform better when scalar alternatives are either primed, or explicitly contrasted with one another in two-alternative forced choice paradigms (e.g., when one utterance is matched with one of two pictures, or one picture is matched with one of two utterances; Chierchia et al., 2011; Katsos & Bishop, 2010; Miller, Schmitt, Chang, & Munn, 2003; Papafragou & Tantalou, 2004; Stiller, Goodman, & Frank, 2015). Critically, these are precisely the paradigms that would allow participants to easily recruit mutual exclusivity to guide their judgments. Because of this, as we describe below, although these studies can be interpreted as evidence for early abilities to compute scalar implicature, it’s also possible that children’s earliest experimental successes rely on a simpler exclusion inference, which does not involve computing asymmetric entailment. Currently, no previous study has provided evidence to differentiate these two alternatives. EXCLUSION INFERENCES VS. SCALAR IMPLICATURE 8 In one report of early implicature abilities, Papafragou and Tantalou (2004) showed 4and 5-year-old children a puppet who was asked to do a task – e.g., “Puppy! Paint all of the stars”. After disappearing into a house, the puppet then returned and reported what he had done – e.g., “I painted some of the stars”. In this case, children correctly judged that Puppy had not done the right thing (e.g., that some implied, in this case, not all). While this study was interpreted as providing evidence of scalar implicature in the early preschool years, it is also possible that children’s success was driven by mutual exclusivity alone. To understand this, consider the dialogue in (2). (2a) S1: Did you buy all of the candy? (2b) S2: I bought some of the candy. Here, if children compute a scalar implicature, they should infer that not all of the candy has been bought, because the statement containing all is stronger than the statement containing some. However, they should also reach this conclusion if they use mutual exclusivity – some implies “not all” simply because some and all are different words. In such cases, it is impossible to differentiate which mechanism a listener has used to infer that some implies “not all”. However, other cases do allow us to tease these two mechanisms apart. Consider now the dialogue in (3), which contains an entailment relation. (3a) S1: Did you approve some of the charges? (3b) S2: I approved all of the charges. Here, if children make an exclusion inference, then they should assume (counterintuitively and incorrectly) that the speaker of 3b did not approve some of the charges because he approved “all” of them. On the other hand, if children represent the appropriate scalar EXCLUSION INFERENCES VS. SCALAR IMPLICATURE 9 entailment relationship (that all entails some), they should infer that the speaker in 3b approved some (and, in fact, all) of the charges. In the present study, we investigated why different studies find discrepancies of up to 8 years in their estimates of when children can compute implicatures. First, we asked whether recent reports of early implicature computation in children necessarily involved true implicatures, or instead might be explained as being supported by exclusion inferences alone. Second, our study addressed the question of whether scalar implicature might be rooted, in part, in the same inferential capacities that underlie exclusion inferences like contrast and mutual exclusivity. To test these questions, we built on Papafragou and Tantalou’s (2004) study with two important modifications. First, in addition to trials like those used in the original study (which could be solved either by computing scalar implicatures or by using exclusion inferences), we also included trials that required sensitivity to asymmetric entailment, as in (3). This allowed us to ask whether children’s inferences reflect the computation of exclusion inferences, or whether they also involve assumptions regarding asymmetric entailment, as required by scalar implicature. Second, we included trials in which logically equivalent statements nonetheless contrasted in form (e.g., a request to paint “3” out of 3 stars, fulfilled by a Puppy painting “all” of the stars), and trials containing novel words (e.g., blick, as in 3) about which the child should have no previous expectations about meaning. This allowed us to ask whether, outside of the special case of scalar implicature, children relied more heavily on mutual exclusivity than other sources of information when making judgments about contrasting utterances. If children rely on simple exclusion inferences to guide their judgments, then they should perform as though any contrasting statements (regardless of their logical relations to one another) negate one another EXCLUSION INFERENCES VS. SCALAR IMPLICATURE 10 (e.g., arriving at the false inferences that all implies “not some”; three (out of 3) implies “not all”). In contrast, if children can compute scalar implicatures, then they should perform in a way that suggests understanding of asymmetric entailment – namely, that stronger alternatives entail weaker ones but not vice versa. Experiment 1 Materials and Methods Participants A total of 95 monolingual English participants between the ages of 4;0 and 7;11 from [BLINDED] participated (7 additional participants were tested but were out of our age range). Children were recruited from [BLINDED], a local museum, and preschools and daycares in [BLINDED]. Parental consent and verbal child assent were obtained prior to testing. Three participants were excluded for failing to complete at least half of the task or for experimenter notes of inattention, 2 were excluded for being bilingual, and 1 was excluded due to experimenter error. Finally, 8 additional participants were excluded for only providing one type of response (e.g., always giving Puppy a prize or withholding a prize). Thus, a final N of 81 children (4 YOs n = 21; 5 YOs n = 27; 6+ YOs n = 33) was included in analyses; our data collection goal was to have a minimum of 20 participants in each age group. An additional 16 adult (undergraduate) participants participated for course credit, and there were no exclusions of adults. Materials and Procedure Our methods were modeled after Papafragou and Tantalou (2004). A toy puppy was used during the task, and an Experimenter served as narrator. A plastic container full of approximately 25 small plastic grapes served as “prizes” for Puppy. For each trial, there was a “Before” picture of EXCLUSION INFERENCES VS. SCALAR IMPLICATURE 11 three items (e.g., a picture of 3 unpainted stars; see Fig. 1 left panel). Pictures were presented on an 8.5” x 11” piece of paper, with all three items laid out horizontally across the page. Puppy painted some of the stars! Okay, Puppy, you get a prize if you paint all the stars! Make sure to paint all of the stars! Puppy painted some of the stars! Look what Puppy did! Request Exp. 1 Exp. 2 (Words and Pictures Exp. 2 (Pictures Only) What Puppy Did Figure 1. Schematic of methods for Exp. 1 and Exp. 2. Left panel indicates the experimenter’s request (e.g., Puppy’s task), while the right panel demonstrates how the child learned of what puppy did. For each trial, the experimenter requested that Puppy complete a task. For example, the Experimenter requested that Puppy e.g., paint all/some/two/three/blick of the stars (list of all trials is in Table 1). Puppy’s task was always described verbally and repeated twice (see below), with the request first stated in a conditional “if…” statement, where scalar implicatures are typically not calculated, minimizing the likelihood that participants would compute implicatures on the request (e.g. some) itself. Participants were informed that they would have the option to give Puppy a prize (toy grapes that they could feed to Puppy) if he did his job well. On each trial, the experimenter showed a Before picture (e.g., three unpainted stars) and told the participant that Puppy had to complete a task. On each trial, the experimenter said, EXCLUSION INFERENCES VS. SCALAR IMPLICATURE 12 “Wow, look! There are three [objects]. Okay Puppy, you get a prize if you [requested task]. Make sure to [requested task].” The participant then learned what Puppy actually did via a verbal description (e.g., “Alright, Puppy, did you [task]?” followed by the summary of his actions: “Puppy painted some of the stars”), but saw no visual evidence of his actions (i.e. the picture did not change). After hearing about Puppy’s actions, the participant was given an opportunity to offer a prize to Puppy (indicating that Puppy had successfully done what was asked of him), and to give justifications for their decision. Participants completed 22 trials. Within each set of stimuli, there were two trial orders to reduce the likelihood of order effects. We had seven trial types, described in detail below and in Table 1. In Table 1, we make the content of and predictions for each trial explicit (e.g., what task the Experimenter requested Puppy to do, what Puppy actually did, the number of trials each participant received, and the predictions for the trial). There were two types of control trials: ‘Good Job’ Controls and ‘Bad Job’ Controls. For the ‘Good Job’ Control trials, the experimenter’s request exactly matched what Puppy did, allowing us to measure whether participants would give Puppy a prize if he did the right thing (see Table 1 for a description of how – regardless of strategy – all participants are predicted to give a prize on these trials). For the “Bad Job” Control trials, the experimenter’s request clearly didn’t match what the Puppy did, allowing us to measure whether participants would withhold a prize from Puppy if he did the wrong thing. EXCLUSION INFERENCES VS. SCALAR IMPLICATURE 13 Table 1. Trial types, names, actions involved, and predictions for trials in Experiment 1. Exclusion n of Experimenter Puppy's Implicature Inference trials Request Actions Prediction Prediction 1 three three prize prize 3 some some prize prize Trial Type Trial Name Good Job Controls request 3, did 3 request some, did some Bad Job Controls request 3, did 2 request blue, did green & red request green, did red & blue 1 1 1 three blue green two green/red red/blue no prize no prize no prize no prize no prize no prize Scalar scalar implicature scalar entailment request 3, did some 3 1 3 all some three some all some no prize prize no prize no prize no prize no prize Novel request all, did blick request blick, did all 1 1 all blick blick all no prize prize no prize no prize Contrast Mismatch request 2, did some 3 two some prize no prize Numerical Entailment request 2, did all 1 two all prize no prize 1 1 blue hat blue/green hat/shirt prize prize no prize no prize Contextual request blue, did blue and green Entailment request hat, did hat and shirt There were five critical trial types: Scalar Implicature, Novel Im-“blick”-ature, Contrast Mismatch, Numerical Entailment, and Contextual Entailment. For the Scalar trials (Scalar Implicature and Scalar Entailment), we used some and all in the request and fulfillment. On Scalar Implicature trials, we assessed whether participants computed implicatures (by inferring that some implies “not all”), and on Scalar Entailment trials, we measured whether they also computed entailment relations (that all entails some). For the Novel Im-“blick”-ature trials, we replaced the word some from the Scalar trials with the novel quantifier blick in describing either Puppy’s task or what he did. In other words, the Im-“blick”-ature trials were structurally identical to the Scalar Implicature trials, but involved a novel word instead of the word some, EXCLUSION INFERENCES VS. SCALAR IMPLICATURE 14 thus allowing us to test participants’ inferences in the absence of semantic or scale-specific knowledge. For the Contrast Mismatch trial type, Puppy did what the experimenter requested, but Puppy’s action was described using a different word from the request (see Table 1). This trial type was important because if participants rely only on mutual exclusivity, they should withhold a prize from Puppy, while if they use any other strategy that involves knowledge of the word meanings (including computing scalar implicature), they should give Puppy a prize. For the Numerical Entailment trials, Puppy was asked to do a number of actions (i.e., 2), but actually did more. These trials were structurally similar to the Scalar Entailment trials in that Puppy’s actions entailed the requested actions, but did not involve the sum/all scale – if participants rely on mutual exclusivity, they should reject Puppy’s actions (since the request differed in format from Puppy’s fulfillment), whereas if they recognize that a set of 3 includes a set of 2, they should accept Puppy’s actions. Finally, the Contextual Entailment trials were again similar to the Numerical Entailment and Scalar Entailment trials, in that Puppy did what was asked and more – however, they did not require scale-specific knowledge of numerals and quantifiers, but instead relied only on contextually-defined alternatives. Again, in this case, we predicted that if participants compute entailment relations between the requests and reports of Puppy’s actions, then they should give Puppy a prize, since Puppy did everything that was asked. However, if participants rely on mutual exclusivity – and only reward actions that exactly match requests – then they should withhold the prize. Results Data Management A total of 9 trials were excluded prior to analyses due to experimenter error. Responses were binary in nature (“prize” or “no prize”). As such, all analyses were binomial (though we EXCLUSION INFERENCES VS. SCALAR IMPLICATURE 15 also report the proportion of trials on which participants gave puppy a prize). Whenever a participant provided responses to multiple trials of a given trial type, we included participant as a random factor when predicting binomial performance; analyses were conducted using the lme4 package in R (Bates, Machler, Bolker, & Walker, 2015). For all reported analyses, we first tested for age effects on children's performance; unless reported otherwise, we found no differences in performance between 4-, 5-, and 6-year-olds. Thus, all subsequent analyses compared children’s performance to adults’ performance, although we visually display performance for each age separately in our figures, in order to allow for comparisons with previous literature. Results Control Trials. ‘Good Job’ Controls. We first considered the control trials on which there was an exact match between the experimenter’s request and Puppy’s actions – e.g., the experimenter asked Puppy to perform an action on three of the objects, and, in fact, the Puppy performed an action on three of the objects. Across all ages, participants successfully gave Puppy a prize on these trials (see Fig. 2), and did so the vast majority of the time (see Table 2), and there was no difference between children and adults (B = .44, SE = 1.84, p = .81). ‘Bad Job’ Controls. We next considered the control trials on which Puppy’s actions did not fulfill the experimenter’s request. For example, the experimenter asked Puppy to perform an action on three of the objects, and instead the child heard that Puppy performed an action on two of the objects. Across all trial types and ages, participants nearly always withheld a prize (see Fig. 2; Table 2), suggesting that they did not believe that Puppy’s actions fulfilled the experimenter’s request. This suggests that participants attended to and understood the task, and EXCLUSION INFERENCES VS. SCALAR IMPLICATURE were willing to both give and withhold a prize from Puppy when they thought it was warranted. Again, there were no differences between children and adults (B = .14, SE = 1.8, p = .94). 16 EXCLUSION INFERENCES VS. SCALAR IMPLICATURE 17 Figure 2: Mean rates of prize-giving for control trials; ‘Good Job’ Control trials are on the top row and ‘Bad Job’ Control trials are on the bottom two rows. Error bars are SEM, dashed line is 50%. Critical Trials Scalar Implicature and Scalar Entailment Trials. The Scalar Implicature and Entailment trials contained only the scalar terms some and all. On these trials, participants who compute scalar implicatures should accept Puppy’s actions in the ‘request some, did all’ Scalar Entailment case (because all entails some), but should reject Puppy’s actions on the ‘request all, did some’ Scalar Implicature trials (because some implies that “not all” were done). Therefore, performance on these two trial types should differ significantly. Consistent with this prediction, adults gave puppy a prize 73.33% of the time in the ‘request some, did all’ Scalar Entailment case, whereas they did so 8.33% of the time in the Scalar Implicature ‘request all, did some’ case; performance on these two trial types differed significantly for adults (B = -3.41, SE = .78, p<.0001). Moreover, consistent with the use of true implicature, there was a strong correspondence between performance on these two trial types: Nine out of the 11 adults who withheld a prize on the scalar implicature trials rewarded puppy with a prize on the entailment trials. 1 Note that whereas the use of scalar implicature predicts asymmetric judgments on these different trials types, the use of exclusion inference predicts that participants should reject 1 To complete these analyses, we coded whether participants gave puppy a prize on Scalar Implicature and Entailment trials. Only participants who provided consistent responses within a given trial type were included in this analysis (Adult n = 11; Child n = 60). EXCLUSION INFERENCES VS. SCALAR IMPLICATURE 18 Puppy’s actions for both trial types (‘request some, did all’ and ‘request all, did some’). This is what we found in children. Children gave puppy a prize 43.21% of the time on the Scalar Entailment ‘request some did all’ trials, and 38.24% of the time on the Scalar Implicature “request all did some” trials, and performance on these two trial types did not differ statistically (B = -.43, SE = .42, p = .31). Importantly, the same child participants who rejected Puppy’s actions for the ‘request all did some’ trials also rejected Puppy’s actions for the ‘request some did all’ trials. Unlike in adults, only 4/40, or 10%, of children who withheld a prize on the scalar implicature trials then gave puppy a prize on the entailment trials. These judgments are unlike those of adults’ and suggest that children did not compute scalar implicatures during our task, but instead based their judgments purely on exclusion. When the experimenter requested three, and Puppy reportedly did some, children gave a prize 40.17% of the time, at a rate comparable to that of the scalar implicature ‘request all did some’ trials. This performance is consistent with either exclusion inference or scalar implicature. However, children’s performance on this trial type was not adult-like: children gave a prize significantly more often than adults did (6.25% of adults gave a prize; B = 4.43, SE = 1.38, p = .001), suggesting that overall they were less likely to judge that reported actions mismatched requests. This finding is consistent with the possibility that at least some children interpreted some as “some and possibly all” (not computing a scalar implicature) or “a relatively small quantity”. Novel Im-“blick”-ature trials. In order to understand the extent to which the above pattern of performance relied on scale-specific knowledge of the word some, we next considered trials that involved the novel word (blick) instead of the scalar word some. For one such trial, Puppy was asked to do blick, and he reportedly did all. On this trial, both adults and children EXCLUSION INFERENCES VS. SCALAR IMPLICATURE 19 rewarded Puppy around half of the time (Children: 44.87%; Adults: 53.33%; B = .34, SE = .57, p = .55). For the request all did blick trials, adults consistently rejected Puppy’s actions (6.67% gave Puppy a prize), suggesting that they did not believe that blick was a fulfillment of a request for all. Children gave prizes at significantly higher rates (49.32% of the time) than did adults (B = 2.54, SE = 1.06, p = .02). Importantly, children’s performance on the ‘request all did blick’ Im“blick”-ature trials was statistically indistinguishable from their performance on the Scalar Implicature ‘request all did some’ trials (B = .21, SE = .37, p = .57). These data suggest that scale-specific knowledge of the relation between some and all was not necessarily driving children’s behavior on our scalar implicature and scalar entailment trials (Fig. 3). EXCLUSION INFERENCES VS. SCALAR IMPLICATURE 20 Figure 3: Mean rates of prize-giving for Scalar and Im-“blick”-ature; error bars are SEM, dashed line is 50%. EXCLUSION INFERENCES VS. SCALAR IMPLICATURE 21 Contrast Mismatch. We next considered the case where reports of Puppy’s actions were consistent with the experimenter’s request, yet contrasted in linguistic format (i.e. when puppy was asked to do two, and said he did some). Critically, on this trial type, Puppy’s response was described using a different linguistic term than had been used in the initial request, leading to different predictions of performance for participants who computed scalar implicatures, relative to those who computed exclusion inferences . Specifically, participants who computed a scalar implicature (that some implies “not all”) should give puppy a prize, since some (and not all) is a good fulfillment of a request for two items in this context (as adults did, see Table 1). However, participants who (a) compute an exclusion inference (e.g., that two and some are different words and therefore must refer to different quantities in this context) or (b) fail to compute a scalar implicature (and believe that some might be compatible with doing 3 out of 3) should reject Puppy’s actions, as children did around half of the time of the time (Fig. 4; Table 2). Numerical Entailment. We next considered the trial on which Puppy was asked to do some number of actions (e.g., “feed 2 of the lions”), but ended up doing too much (e.g., “feed all of the lions”). On these trials, participants had three choices for how to interpret the experimenter’s request. The first was to interpret the request for Puppy to do 2 as a lowerbounded request (e.g., interpreting the request like: “do at least two actions”), consistent with a scalar entailment calculation. This predicted that participants would give Puppy a prize when he did too much, as adults tended to do (Table 1). The second option was to compute an exclusion inference, and the third was to interpret the request as an exact request (e.g., “do two, and no more and no less”); both of these options predict withholding the prize. Children (and a plurality of adults) tended to adopt this latter strategy, and withheld Puppy’s prize (Table 2, Fig. 4). EXCLUSION INFERENCES VS. SCALAR IMPLICATURE 22 Figure 4: Mean rates of prize-giving for Contrast Mismatch and Numerical/Contextual Entailment trials; error bars are SEM, dashed line is chance. Contextual Entailment. Finally, we considered trials on which Puppy did more than what was requested of him, and Puppy’s actions were described using contextual (as opposed to scalar or numerical) alternatives. For example, these were trials where Puppy was asked to wash the hat, but he actually washed the hat and shirt; or he was asked to eat the blue lollipop but actually ate the blue and green lollipops (Fig. 4). These were important, because they involved entailment (e.g., that washing a hat and shirt entails washing the hat), but included contextually defined alternatives rather than lexical quantifiers like some and all. This allowed us to test whether children’s reliance on mutual exclusivity was specific to quantifiers or reflected a more general indifference to entailment relations in this task. Participants who correctly computed the EXCLUSION INFERENCES VS. SCALAR IMPLICATURE 23 entailment relation should have preferred to give Puppy a prize. Consistent with this, adults gave prizes on 81.25% of trials. On the other hand, we predicted that participants who compute exclusion inferences across noun phrases should withhold Puppy’s prize. This is exactly what children did, giving puppy a prize only 19.88% of the time; see Table 1). Table 2. Percent giving puppy a prize in Experiment 1 (Words Only), and Experiment 2 (Words and Pictures; Pictures Only). Trial Type Good Job Controls Stimuli Four Year Olds Five Year Olds Words Only 98.70% 91.51% Words and Pictures 78.18% 87.50% Pictures Only 67.79% 71.58% Six+ Year Olds 93.08% 91.76% 85.56% Adults 88.71% 98.75% 98.75% Bad Job Controls Words Only Words and Pictures Pictures Only 8.06% 9.38% 11.11% 13.58% 6.38% 14.04% 8.08% 0.00% 5.56% 12.50% 0.00% 0.00% Scalar Implicature Words Only Words and Pictures Pictures Only 38.33% 12.12% 13.89% 33.75% 10.42% 5.26% 41.84% 0.00% 7.41% 8.33% 2.13% 0.00% Scalar Entailment Words Only Words and Pictures Pictures Only 57.14% 72.72% 58.33% 40.74% 43.75% 68.42% 36.36% 47.06% 72.22% 75.00% 93.33% 87.50% Contrast Mismatch Words Only Words and Pictures Pictures Only 53.33% 90.91% 87.50% 66.67% 93.75% 86.84% 48.45% 97.06% 100.00% 78.05% 100.00% 100.00% Numerical Enatilment Words Only Words and Pictures Pictures Only 33.33% 50.00% 33.33% 14.81% 31.25% 32.14% 9.09% 8.00% 12.96% 43.75% 52.08% 62.50% Contextual Entailment Words Only Words and Pictures Pictures Only 36.59% 36.36% 41.67% 20.37% 28.13% 21.05% 9.09% 14.71% 31.42% 81.25% 75.00% 71.85% EXCLUSION INFERENCES VS. SCALAR IMPLICATURE 24 Experiment 1 Discussion Throughout the control trials in Experiment 1, participants of all ages demonstrated that they understood and could complete the task, by giving Puppy a prize when appropriate, and by withholding a prize when appropriate. Critical to the main question of this experiment, we found that while both children and adults tended to (appropriately) withhold prizes from Puppy on Scalar Implicature trials (when Puppy was asked to do all and actually did some), only adults successful gave Puppy a prize on Scalar Entailment trials (when Puppy was asked to do some but actually did all). Children, instead, frequently withheld a prize from Puppy on these entailment trials. Importantly, the majority of children who initially appeared to “succeed” on Scalar Implicature trials (by withholding a prize from Puppy when he did some after being told to do all) failed to behave in an adult-like manner on entailment trials. And, when we replaced the word some with the novel word blick for the Im-“blick”-ature trials, children’s performance didn’t change, suggesting that their performance on Scalar Implicature trials didn’t necessarily require scale-specific knowledge of the relation between some and all. The simplest explanation of our data, therefore, is that children compute exclusion inferences instead of relying on scalespecific entailment relations: Whenever the form of the reported result differed from the form of the request, children withheld the prize for the Puppy, even if the result entailed the request. Thus, although children are clearly able to compute entailment relations in tasks that directly probe this (e.g., Chierchia et al., 2001), they may not deploy this knowledge spontaneously, and may instead prefer to base judgments on contrast. More generally, without explicit tests of whether children deploy their knowledge of entailment, it is impossible to know whether their behaviors reflect true scalar implicatures, or simpler strategies. EXCLUSION INFERENCES VS. SCALAR IMPLICATURE 25 Further evidence for our conclusion that children computed exclusion inferences comes from cases in which Puppy’s actions did not match the request in form but did in meaning. In these cases, children did not reward Puppy’s actions, even when the two expressions were otherwise compatible; e.g., when Puppy was asked to do two (out of 3) and reportedly did some. We found that adults appeared to infer that some could, in this context, refer to 2, while children believed that some was a poor fulfillment of a request for two. Similarly, on the Contextual and Numerical Entailment trials where Puppy’s actions entailed the request (but did not linguistically match the request), children withheld a prize much more frequently than adults did. These data suggest that even in situations where adults readily use entailment information, children instead computed simple exclusion inferences. One possible objection to these conclusions is that children’s behaviors can also be explained if we assume that they, unlike adults, computed scalar implicatures both at the moment of request and at the moment when the Puppy’s behavior was reported. For example, when Puppy was asked to paint some of the stars, children may have computed an implicature at this time, interpreting the request as a demand to paint “some but not all of the stars”. This would predict that on entailment trials (when Puppy painted all of the stars after being asked to paint some) children should treat Puppy’s action of painting all as a violation of the demand (now strengthened by scalar implicature) to paint “some but not all.” One reason to be skeptical of this argument is that we couched the request first in an “if…” statement before the direct request, which should help minimize implicatures on the request. Another main problem with this argument, already alluded to, is that adults (who can compute scalar implicatures) did not arrive at this interpretation of the stimuli, and consistently rewarded Puppy when he painted all stars after a request for some. Thus, such an account would EXCLUSION INFERENCES VS. SCALAR IMPLICATURE 26 require that early in life children are more likely to compute implicatures than adults, and then gradually learn to compute them less frequently – the exact opposite of what is standardly believed to occur. Nevertheless, in Experiment 2, we directly tested this alternative interpretation of our data. To rule out the possibility that children computed scalar implicatures at the moment of request, we modified the stimuli from Experiment 1. Half of our participants in Experiment 2 did not have access to linguistic information regarding Puppy’s behavior (e.g., Puppy was asked to paint some of the stars, and the child saw visually that Puppy painted all of the stars), thus removing linguistic contrast as a possible means by which children decided to reward Puppy. The other half gained evidence of what Puppy did both linguistically and visually (e.g., Puppy was asked to paint some of the stars, and the child both heard and saw visual evidence that Puppy painted all of the stars). If children compute implicatures on the experimenter’s original request, then they should reject Puppy’s actions (as they did in Exp. 1) when he does all after being asked to do some, even in the absence of contrasting linguistic evidence. On the other hand, if our results from Experiment 1 were driven by exclusion inference, then in the absence of contrasting linguistic evidence (i.e. when the child learns of Puppy’s actions visually, not linguistically), they should accept Puppy’s actions in entailment trials as fulfillments of the request. In other words, if children no longer reject Puppy’s action when he completes all after being asked to complete some) when this result is presented nonlinguistically, we can rule out the possibility that children computed a scalar implicature at the moment of request. Experiment 2 Materials and Methods Participants EXCLUSION INFERENCES VS. SCALAR IMPLICATURE 27 A total of 119 children between the ages of 4;0 and 7;11 were recruited from the same population and using the same methods as Experiment 1 (an additional 6 participants were tested but were beyond our targeted age range). Six participants were excluded for failing to complete at least half of the task or for experimenter notes of inattention, 2 were excluded for being bilingual, 2 were excluded due to experimenter error, and 3 were excluded for having participated in a related study. Finally, 13 additional participants were excluded for only providing one type of response (e.g., always giving Puppy a prize or withholding a prize). Thus, a final N of 93 child participants (4 YOs n = 23; 5 YOs n = 35; 6+ YOs n = 35) were included in analyses; our data collection goal had been to include a minimum of 20 usable participants in each age group. An additional 32 adult (undergraduate) participants participated for course credit, and there were no exclusions of adults. Materials and Procedure The procedure in Experiment 2 was identical to that in Experiment 1, with the following exceptions. As in Experiment 1, participants saw a Before picture alongside the experimenter’s request (e.g., a picture of 3 unpainted stars), but new to Experiment 2, they also saw an After picture, showing what Puppy had done (e.g., a picture of 2 painted and 1 unpainted stars Fig. 1, lower right panels). Thus, the new paradigm introduced in Experiment 2 provided all participants with visual evidence of what Puppy did. Participants were randomly assigned to one of two conditions: The Words and Pictures condition, and the Pictures Only condition. As a result, participants learned of Puppy’s actions in one of two ways (see Fig. 1). Participants in the Words and Pictures condition heard a verbal description of what Puppy did (e.g., “Puppy painted some of the stars”; as in Experiment 1) and saw a picture of what that looked like (e.g., one unpainted star and two painted stars; new to EXCLUSION INFERENCES VS. SCALAR IMPLICATURE 28 Experiment 2). Participants in the Pictures Only condition saw a picture of what Puppy did (e.g., one unpainted star and two painted stars), but Puppy’s actions weren’t labeled except to draw attention to the picture (e.g., “Look what Puppy did!”). Thus, between-subjects, we manipulated access to linguistic information about Puppy’s actions in the presence of visual information about Puppy’s actions. To aid in comparing methods across experiments, in Figure 1, we have also depicted Experiment 1 (as Words Only). There were several additional small changes to the stimuli in Experiment 2. While images in Experiment 1 were hand drawn, images in Experiment 2 were computer generated. Second, we removed the Im-“blick”-ature trial types in Experiment 2, since we could not develop an appropriate picture of what it looked like to paint blick stars. We replaced these trials with (a) a Control “Good Job” trial (‘request two did 2’) trial and (b) a Numerical Entailment (‘request two did 3’) trial. In addition, we replaced one “Request two did some” with a “Request two did 3” trial in Exp. 2. Results Nine individual trials were excluded due to experimenter error. All other analytic procedures are as reported in Exp. 1. Control Trials ‘Good Job’ Controls. As in Experiment 1, we found that participants of all ages successfully accepted Puppy’s actions when there was an exact match between the request and puppy’s fulfillment (Table 2; Children = 81.03%; Adults = 98.75%). However, a post-hoc inspection of the data revealed that unlike in Experiment 1, children were less likely to accept Puppy’s actions for the ‘request some, did some’ (72.76%) trial than for the theoretically identical ‘request two, did two’ trial (91.40%), even though adults were at ceiling for both trial EXCLUSION INFERENCES VS. SCALAR IMPLICATURE types (puppy was given a prize on 100% and 97.9% of trials, respectively). This suggests that children may have believed that the visual depiction of 2/3 actions was a poor match for the request to complete some. Instead, some children appeared to interpret some as only true when all was true, perhaps considering some to be ‘a handful’ or ‘at least 3’. Also, performance was lower for children when they did not have a verbal description of puppy’s actions (as in the Pictures Only condition: 75.8% vs. 86.82%; post-hoc Wilcoxon p = .044), suggesting that children may have had difficulty spontaneously relating visual and verbal depictions (e.g., some to visual depictions of ‘2 out of 3 items’). ‘Bad Job’ Controls. We next considered the trials on which Puppy’s actions clearly did not fulfill the experimenter’s request. As in Experiment 1, at all ages and across all trials, participants successfully rejected Puppy’s actions (Table 2; Children = 7.6% gave prize; Adults = 0%). Thus, across control trials, we saw a similar pattern of performance in Experiment 1 and Experiment 2, suggesting that the introduction of visual evidence of Puppy’s actions in Experiment 2 did not interfere with participants’ understanding of the task. Critical Trials Scalar Implicature and Entailment Trials. Our critical finding from Experiment 1 was that while children rejected Puppy’s actions on Scalar Implicature trials (when Puppy was asked to do all but actually did some), they also rejected Puppy’s actions on Scalar Entailment trials, (when Puppy was asked to do some, and Puppy did all). We first considered participants’ judgments for Scalar Implicature ‘request all did some’ trials.2 For these trials, adult responses 2 Note that it is possible to succeed on these trials without computing scalar implicatures, because participants are shown visual representations of Puppy’s actions. For example, when 29 EXCLUSION INFERENCES VS. SCALAR IMPLICATURE 30 were not influenced by the presence of disambiguating visual information. As in Experiment 1, adults in Experiment 2 believed that some (and/or a visual representation consistent with some) was not a good fulfillment of a request for all, and withheld Puppy’s prize (Table 2). Children also rejected Puppy’s some fulfillments for all, as they did in Experiment 1, both when they heard that Puppy did some (only 6.8% accepted Puppy’s actions) and when they didn’t hear some but only saw that Puppy did 2 out of 3 (only 8.2% accepted Puppy’s actions). Thus, like in Experiment 1, children believed that some was not a good fulfillment of a request for all, whether the information was presented linguistically or visually. For Scalar Entailment (‘request some did all’) trials, adults gave Puppy a prize across all conditions and experiments (Table 2). In contrast, children’s performance differed depending on whether the learned about Puppy’s actions linguistically or visually. Similar to Experiment 1, children in the Words and Pictures condition gave Puppy a prize around half of the time (Exp 1: 43.21%; Exp 2 Words and Pictures: 52.27%; no different from a chance rate of 50%, Wilcoxon p = .77), once again raising the possibility that children computed a scalar implicature during Puppy’s request – e.g., that Puppy was asked to do ‘only’ some. Against this interpretation, unlike adults, children in the Pictures Only condition who heard that Puppy was asked to do some and then saw that he did 3/3 (without hearing the word all) readily accepted Puppy’s actions (67.34% of the time3; significantly higher than a chance rate of 50%, Wilcoxon p = .013), asked to do all, child are shown that puppy did 2/3, and thus could use visual evidence as a basis for rejecting Puppy’s actions. 3 Based on performance on the ‘Good Job’ Control trials, it also seems possible that children included ‘3’ as a possible interpretation of ‘some’, causing them to accept Puppy’s EXCLUSION INFERENCES VS. SCALAR IMPLICATURE 31 suggesting that they did not initially compute a “not all” interpretation for some. Instead, crucially, the linguistic contrast between the words some and all (when both are spoken) appears to cause children to assume that the action and request did not match. Contrast Mismatch. We next considered the cases where Puppy’s response (performing an action on some (2/3 objects) when asked to do two) matched the experimenter’s request in content but not in form. Both adult and child participants in Experiment 2 overwhelmingly accepted Puppy’s performance of the action on 2/3 of the items. While children frequently rejected Puppy’s actions in Experiment 1 (accepting them only 55.74% of the time), when children had visual access to disambiguating information (a picture showing that Puppy completed the action on 2/3 of the objects), they now overwhelmingly gave Puppy a prize (93.01%; Table 2). In other words, children who got both contrasting linguistic information (hearing a request for two and hearing that Puppy did some) used visual information (a picture of two stars colored) to remind them that, in this context, some is an appropriate fulfillment of two. Numerical Entailment. We next considered trials on which Puppy was asked to do some number of actions (e.g., “feed 2 of the lions”), but ended up doing more than this. For the trials where Puppy was asked to do two and did three, adults typically accepted Puppy’s actions, while children routinely rejected them (Table 2). For trials where Puppy was asked to do two and did all, as in Experiment 1, children also withheld Puppy’s prize (awarding a prize only 24.73% of actions at high rates in the Pictures Only condition. Still, if this was the strategy that children adopted, it is inconsistent with either a mutual exclusivity-based or scalar-implicature based computation. EXCLUSION INFERENCES VS. SCALAR IMPLICATURE 32 the time). Interestingly, children withheld Puppy’s prize even in the Pictures Only condition, where they did not have access to contrasting linguistic information. Contextual Entailment. We next considered trials on which Puppy did more than what was requested of him and his actions were described using contextual (as opposed to scalar or numerical) alternatives. As in Experiment 1, adults overwhelmingly computed the entailment relation and gave Puppy a prize (73.43% of the time). However, children consistently withheld prizes (Pictures/Words: 25% gave prizes; Pictures Only: 29.90% gave prizes; Table 2). Experiment 2 Discussion The main goal of Experiment 2 was to test whether children’s apparent failure to compute linguistic entailment (e.g., by recognizing that if puppy did all he also did some) might actually be due to computation of a scalar implicature at the moment of request (i.e. inferring that a request for some was a request to ‘not do all’). Across all conditions (Experiment 1’s ‘Words Only’ and both Experiment 2’s ‘Pictures and Words’ and ‘Pictures Only’ conditions), the child heard, “Puppy, paint some of the stars”, and therefore had the opportunity to compute the scalar implicature, “Puppy, paint some but not all of the stars” at the moment of request. Had the child computed a scalar implicature at the moment of the request (and therefore inferred that the request was to do some but not all), then they should have rejected Puppy’s actions whenever he was asked to do some but really did all. In Experiment 1 and the Pictures and Words condition of Experiment 2 – the conditions in which the children had access to contrasting linguistic labels (some and all) – children withheld a prize from Puppy when he was asked to do some but actually did all. However, these results could have emerged either if children computed an exclusion inference, or computed a scalar implicature both at the moment of request and at test. In the Pictures Only condition, however, we could differentiate an exclusion-based strategy from EXCLUSION INFERENCES VS. SCALAR IMPLICATURE 33 one in which the child computed a scalar implicature at the moment of request. This is because the child didn’t hear contrasting linguistic labels in this condition – instead, they simply saw that Puppy painted all of them. In other words, there was no some/all linguistic contrast, but there was still the opportunity to compute a scalar implicature at the moment of request. Thus, if children computed a scalar implicature at the moment of request, they should have rejected Puppy’s actions. This is not what we found. Instead, without access to contrasting linguistic labels, children no longer withheld a prize when Puppy did all after a request to do some. This is inconsistent with the possibility that children spontaneously computed a scalar implicature at the moment of request: If children did, then they should have interpreted the request as one for ‘some and not all’, and therefore should have found Puppy’s actions (doing all) incompatible with the initial request. Importantly, adults’ performance was consistent across all trial types in Experiment 1 and Experiment 2, suggesting that by adulthood, the presence (or absence) of supporting visual information does not influence the mature linguistic processing of the types of dialogues presented in our experiments (Table 2). General Discussion Many past studies have found that, whereas children readily apply the principle of contrast and mutual exclusivity during word learning, they take many years to exhibit adult-like behavior when computing scalar implicatures (e.g., that an utterance containing some implies “not all”). However, a small number of recent studies have reported surprisingly early successes at scalar implicature, usually using forced choice paradigms or direct contrast of scalar alternatives (Papafragou & Tantalou, 2004; Miller et al., 2006; Skordos & Papafragou, 2016; Stiller et al., 2015; Yoon, Wu, & Frank, 2015). Based on these findings, we explored the possibility that children’s earliest successes on purported tests of scalar implicature might instead EXCLUSION INFERENCES VS. SCALAR IMPLICATURE 34 be best explained by appeal to exclusion inference (e.g., inference driven by contrast or mutual exclusivity): When children successfully judge that some is not a good fulfillment of a request for all (Papafragou & Tantalou, 2004), they may do so either because they compute a scalar implicature or because they note that some and all are different words, and therefore infer that they must differ in some way in meaning (contrast) or even that they are incompatible (mutual exclusivity). Here, we asked whether it is possible to differentiate scalar implicature from simpler, alternative exclusion inferences like contrast and mutual exclusivity. We began with the observation that, unlike in the case of scalar implicature, exclusion inferences allow the symmetric negation of alternatives and can therefore give rise to the judgment that an utterance containing some implies “not all”, and also the false inference that an utterance containing all implies ‘not some’. In contrast, scalar implicature generally involves asymmetric negation – e.g., such that “I ate some of the cookies” implies the negation of “I ate all of the cookies” but not vice versa. Based on this logic, we asked whether children who successfully contrasted alternative utterances also spontaneously invoked asymmetric entailment relations in the same experimental context, as would be predicted if they truly computed scalar implicatures. Consistent with the computation of exclusion inferences, we found that children rejected a character’s actions whenever the descriptions of these actions linguistically contrasted with the experimenter’s original request, independent of entailment relations. For example, if the experimenter requested some and then said that Puppy did all, children rejected Puppy’s action to the same degree as when all was requested and Puppy reportedly did some. This pattern of performance is largely inconsistent with the possibility that children computed scalar EXCLUSION INFERENCES VS. SCALAR IMPLICATURE 35 implicatures, because it suggests that children did not use asymmetric entailment relations to constrain their responses. As noted in Experiment 2, one alternative explanation of this pattern of results is that children may have computed a scalar implicature both at the moment of the experimenter’s request and when they learned of what Puppy did. Like an account based on exclusivity inferences, this explanation also predicts that the child should reject Puppy’s actions when he does all after being told to do some. For example, upon hearing the experimenter say “Puppy, paint some of the stars”, children may have computed a scalar implicature and inferred that the experimenter expected Puppy to paint some but not all of the stars, leading children to reject Puppy’s action when he painted all of the stars, as we found in Experiment 1. Against this interpretation, adults in Experiment 1 (who certainly can compute scalar implicatures) didn’t reject Puppy’s actions, suggesting that it is unlikely that, in this context, a mature language user would compute a scalar implicature at the moment of request. Nevertheless, in Experiment 2, we tested this possibility in a condition where the child heard, e.g., “Puppy, paint some of the stars” (and therefore still had the opportunity to compute a scalar implicature at the moment of request), but then learned of Puppy’s actions visually instead of linguistically (and therefore no longer had access to contrasting linguistic labels). Had children computed a scalar implicature at the moment of request, then they should have rejected Puppy’s actions regardless of the modality in which they learned about his actions (linguistic vs. visual). On the other hand, if performance were driven by the computation of an exclusion inference, then they should have rejected Puppy’s actions only when they had access to contrasting linguistic labels. We found that when children didn’t have access to contrasting linguistic labels, they were much less likely to reject Puppy’s actions, suggesting that the pattern we found in Experiment 1 was not due to children EXCLUSION INFERENCES VS. SCALAR IMPLICATURE 36 computing a scalar implicature at the moment of request. This suggests that exclusion inferences – and not scalar implicature – were driving children’s performance on our task. While we found that children appeared to use exclusion inferences (instead of scalar implicature) in our tasks, our claim is not that children are unable to compute entailment relations, or even that their knowledge of the entailment relations between all and some is incomplete. In fact, there is experimental evidence that young children of this age understand entailment. For example, in a study by Chierchia et al. (2001), 3- to 6-year-olds demonstrated an understanding of entailment by successfully choosing the more informative of two possible utterances (by selecting e.g., “Every farmer cleaned a horse and a rabbit” instead of “Every farmer cleaned a horse or a rabbit”) to describe a scene (Chierchia et al., 2001). Because there is no reason to believe that children are incapable of reasoning about asymmetric entailment, our claim here is simply that because exclusion inferences and implicature predict similar behaviors in many contexts, it is critical to test whether children deploy their knowledge of asymmetric entailment in a task, to be sure that they are indeed computing implicatures. Notice here that the critical difference between implicature and exclusion inferences is not necessarily the form of the inference itself, but instead the nature of the alternatives and how they are represented by children in the service of this inference. In tasks like the one presented here, children do not appear to interpret alternative utterances according to their scalar relations, but instead assign all alternatives equal status, such that uttering an utterance A negates utterance B, regardless of whether one entails the other. This last point is important to understanding the significance of our findings to the literature at large. A key difference between experiments which find early successes with implicature and those which find later successes is that the former studies tend to use forced EXCLUSION INFERENCES VS. SCALAR IMPLICATURE 37 choice paradigms that are similar to mutual exclusivity tasks, and which therefore present children with a set of salient alternative referents. According to some past reports, children’s main difficulty with implicature computation is their ability to spontaneously access relevant alternatives – e.g., to access an utterance containing all when interpreting a sentence containing some (Barner, Brooks, & Bale, 2011; Chierchia et al., 2001; Hochstein, Bale, Fox, & Barner, 2016; Skordos & Papafragou, 2016). Forced choice paradigms may help children appear to succeed on implicature tasks by providing them with a pre-established set of relevant alternative referents, which can be used by the child to generate linguistic alternatives. As evidence for this, adult subjects are faster to compute implicatures in visual world paradigms when they are allowed to preview the visual referents before hearing the test sentence (Snedeker, 2015). According to Snedeker (2015), this suggests that such paradigms – and by extension forced choice paradigms like those used in recent tests of implicature in children (e.g., Stiller, Goodman, & Frank, 2014) – allow children to generate descriptions for different visual alternatives, and to then use these descriptions to compute inferences. Consistent with this, infants as young as 18 months appear to also spontaneously conjure the referents of pictures upon viewing them, suggesting that even very young children can generate descriptions of visually presented alternatives (Mani & Plunkett, 2010; for evidence in adults, see Meyer & Belk, 2007; Meyer & Damian, 2007). How might this impact children’s ability to compute scalar implicature? Upon seeing a picture with “all of the socks” and another containing “some of the soccer balls” children may code the pictures with corresponding verbal descriptions, and use a relatively simple matching strategy to guide their looking at test – e.g., looking to the picture containing “some of the soccer balls” upon hearing “some” simply because it matches their prior expectation, without involving any appeal to the stronger alternative containing “all”. In this EXCLUSION INFERENCES VS. SCALAR IMPLICATURE 38 way, children may appear to succeed on scalar implicature tasks without needing to deploy any knowledge of asymmetric entailment. To summarize, we exploited the asymmetric relationship between scalar items to investigate children’s understanding of the relation between words like some and all. We asked whether children might rely on exclusion inferences (based in contrast or mutual exclusivity) – which we already know children recruit for the purpose of word learning – in cases where adults would typically compute scalar implicatures. We found strong evidence that young children rely on exclusion inferences when scalar implicatures would be more appropriate. This suggests that previous work reporting early success at computing scalar implicatures, which do not test children's knowledge of entailment relations, may overestimate young children’s pragmatic knowledge. EXCLUSION INFERENCES VS. SCALAR IMPLICATURE Acknowledgements: [Acknowledgments Blinded] 39 EXCLUSION INFERENCES VS. SCALAR IMPLICATURE References Au, T. & Markman, E. (1987). Acquiring word meanings via linguistic contrast. Cognitive Development, 2, 217-236. Barner, D., & Bachrach, A. (2010). Inference and exact numerical representation in early language development. Cognitive Psychology, 60, 40–62. Barner, D., Brooks, N., & Bale, A. (2011). Accessing the unsaid: The role of scalar alternatives in children’s pragmatic inference. Cognition, 118, 84–93. Bates, D., Machler, M., Bolker, B., & Walker, S. (2015). Fitting Linear Mixed-Effects Models using lme4. Journal of Statistical Software, 67, 1-48. Carey, S. & Bartlett, E. (1978). Acquiring a single new word. Papers and Reports on Child Language, 15, 17-29. Chierchia, G., Crain, S., Guasti, M.T., Gualmini, A., & Meroni, L. (2001). The Acquisition of Disjunction: Evidence for a Grammatical View of Scalar Implicature. Proceedings of the 25th Boston University Conference on Language Development. Cascadilla Press: Somerville, MA. Clark, E. (1987). The principle of contrast: A constraint on language acquisition. Mechanisms of Language Acquisition, 1, 33. Clark, E. (1988). On the logic of contrast. Journal of Child Language, 15(2), 317-335. Clark, E. (1990). On the pragmatics of contrast. Journal of Child Language,17(2), 417 - 431. Gathercole, V. (1989). Contrast: a semantic constraint? Journal of Child Language, 16, 685-702. Gazdar, G. (1979). Pragmatics: Implicature, presupposition, and logical form. New 40 EXCLUSION INFERENCES VS. SCALAR IMPLICATURE 41 York: Academic Press. Grice, H. P. (1970). Logic and Conversations. Syntax and Semantics Volume Three: Speech Acts, P. Cole and J.L. Morgan. New York: Academic Press, 41-58. Grice, H. P. (1991). Studies in the Way of Words. Harvard University Press. Halberda, J. (2003). The development of a word-learning strategy. Cognition, 87, B23-B34. Heibeck, T. & Markman, E. (1987). Word learning in children: an examination of fast mapping. Child Development, 58, 1021-1034. Hochstein, L., Bale, A., Fox, D., & Barner, D. (2014). Ignorance and inference: Do problems with Gricean epistemic reasoning explain children’s difficulty with scalar implicature? Journal of Semantics, 0, 1-29. Horn, L. (1989). A Natural History of Negation. Chicago: University of Chicago Press. Katsos, N., & Bishop, D. (2011). Pragmatic tolerance: Implications for the acquisition of informativeness and implicature. Cognition, 120, 67-81. Katsos, N. & Wilson, E. (2014). Convergence and divergence between word learning and pragmatic inferencing. Proceedings of Formal and Experimental Pragmatics, 2014, 1420. Levinson, S. (2000). Presumptive Meanings: The theory of generalized conversational implicature. Cambridge, MA: The MIT Press. Mani, N., & Plunkett, K. (2010). In the Infant’s Mind’s Ear: Evidence for Implicit Naming in 18 - Month-Olds. Psychological Science, 21, 908-913. Markman, E. & Wachtel, G. (1988). Children’s use of mutual exclusivity to constrain the meaning of words. Cognitive Psychology, 20, 121-157. EXCLUSION INFERENCES VS. SCALAR IMPLICATURE 42 Markman, E., Wasow, J., & Hannsen, M. (2003). Use of the mutual exclusivity assumption by young word learnings. Cognitive Psychology, 47, 241-275. Meyer A.S., Belke E., Telling A., Humphreys G.W. (2007). Early activation of object names in visual search. Psychonomic Bulletin & Review, 14, 710–716. Meyer A.S., Damian M.F. (2007). Activation of distractor names in the picture-picture word interference paradigm. Memory & Cognition, 35, 494–503. Miller, K., Schmitt, C., Chang, H., & Munn, A. (2005). Young children understand some implicatures. Proceedings of the 29th Annual Boston University Conference on Language Development. Noveck, I. (2001). When children are more logical than adults: Experimental investigations of scalar implicature. Cognition, 78, 165-188. Papafragou, A., & Musolino (2003). Scalar implicatures: experiments at the semantics pragmatics interface. Cognition, 86, 253-282. Papafragou, A., & Tantalou, N. (2004). Children's Computation of Implicatures. Language Acquisition, 12(1), 71–82. Skordos, D., & Papafragou, P. (2016). Children’s derivation of scalar implicatures: alternatives and relevance. Cognition, 153, 6-18. Snedeker, J. (2015). Scalar implicature: A whirlwind tour with stops in processing, development, and disorder. Tubingen, Germany. Slides available: https://software.rc.fas.harvard.edu/lds/research/snedeker/jesse-snedeker/ Spiegel, C. & Halberda, J. (2011). Rapid fast-mapping abilities in 2-year-olds. Journal of Experimental Child Psychology, 109, 132-140. Smith, C. (1980). Quantifiers and question answering in young children. Journal of EXCLUSION INFERENCES VS. SCALAR IMPLICATURE Experimental Child Psychology, 30, 191-205. Stiller, A. J., Goodman, N. D., & Frank, M. C. (2014). Ad-hoc Implicature in Preschool Children. Language Learning and Development, 11(2), 176–190. Woodward, A. & Markman, E. (1991). Constraints on Learning and Default Assumptions: Comments on Merriman and Bowman’s “The Mutual Exclusivity Bias in Children’s Word Learning”. Developmental Review, 11, 137-163. Wynn, K. (1992). Children’s acquisition of the number words and the counting system. Cognitive Psychology, 24(2), 220-251. Yoon, E., Wu, Y., & Frank, M. (2015). Children’s Online Processing of Ad-Hoc Implicatures. Proceedings of the 37th Annual Conference of the Cognitive Science Society. 43
© Copyright 2026 Paperzz