Modifying Baayen: Can a Corpus Count Provide a Reasonable Measure of Morphological Productivity?

Rebecca Brown [Scarborough]
UCLA
Spring 2001

The notion of morphological productivity is widely invoked and even somewhat intuitive. But the formalization of this notion of the readiness with which various affixes combine with roots is not at all straightforward. There are a number of definitions and measures of productivity in the literature, but studies investigating productivity have generally involved some method of counting affixed forms. These counting methods, however, leave open the question of which forms should be counted, and against which set of countable forms.

Earlier studies, such as Aronoff’s (1976), measured the productivity of affixes by counting the number of times a particular affix appeared as part of any word in a dictionary, making the claim that productivity was correctly conceived of as a ratio of ‘actual words’ to ‘possible words’. Baayen (with Lieber, 1991) [1] expresses Aronoff’s approach in more formal terms as a ratio of the number of actual forms with an affix x to the number of bases of a kind eligible for x-affixation. But as Baayen also points out, such an approach, in which the dictionary is used to make affix counts, is not altogether satisfactory. Dictionaries represent the ‘actual’ lexicon of only a particular lexicographer, and furthermore, represent only standard, well-established elements of the lexicon. Since productivity critically involves the formation of novel forms, dictionaries are inappropriate and unreliable for studies of productivity, and other methods of quantifying productivity must be explored.

[1] Though Baayen often collaborates with various other linguists, the productivity index described in various works bearing his name will, for the sake of simplicity, be referred to as Baayen’s throughout this paper.
Baayen’s Productivity Index

Baayen (1989, 1991, 1996) proposes instead a measure of morphological productivity based on counts made from a corpus. By using a corpus, he alleviates the problem of representing the lexicon of only a single speaker, as the corpus includes data from a large number of sources. And because a number of different kinds of sources are incorporated, and all forms found in those sources are included unedited, productive forms used in actual speech and writing are accessible.

Baayen’s measure is based on the number of hapax legomena, or singly occurring forms, with a given affix in a corpus. Specifically, Baayen’s index of productivity is calculated as a ratio of hapax legomena to tokens for a given affix:

$$P = \frac{n_1}{N}$$

n1 = number of hapax legomena with the affix
N = number of tokens with the affix

Of course, this sort of measure is still dependent upon the particular corpus used. In Baayen’s study, CELEX, a corpus of more than 18,000,000 English words taken from a variety of written (75%) and spoken (25%) sources, is used.

To understand this ratio as a measure of productivity, we must first consider what the two components of the relation represent. The numerator, which is the number of x-affixed hapaxes, represents novel forms with the affix x. Because they are created on the fly for use in a particular situation rather than stored, individual novel forms appear only infrequently in a corpus. The denominator, which is the total number of x-affixed tokens, represents the overall extent of occurrence of the affix x, taking into account both the number of different types (or words) and the frequency of each type. By calculating the ratio of hapaxes to tokens, Baayen gets an estimate of the probability of coming across new, unobserved types (i.e., novel forms) with a given affix.
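The ratio of hapaxes to tokens described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `productivity_index`, the input format (a mapping from each affixed word type to its token count), and the toy word list are hypothetical and are not taken from Baayen's work or from CELEX.

```python
from collections import Counter


def productivity_index(type_frequencies):
    """Return (n1, N, P) for one affix, where P = n1 / N.

    type_frequencies maps each word type bearing the affix to its
    token count in the corpus (a hypothetical input format).
    """
    # n1: number of types that occur exactly once (hapax legomena)
    n1 = sum(1 for count in type_frequencies.values() if count == 1)
    # N: total number of tokens bearing the affix
    N = sum(type_frequencies.values())
    if N == 0:
        raise ValueError("no tokens with this affix")
    return n1, N, n1 / N


# Invented toy sample of -ness tokens; these are not CELEX counts.
toy_tokens = [
    "kindness", "kindness", "kindness",
    "darkness", "darkness",
    "bashfulness",
    "heinousness",
]
toy_n1, toy_N, toy_P = productivity_index(Counter(toy_tokens))

# Counts reported for -ous (n) in Table 1: 13 hapaxes among 21861 tokens.
table1_ous = round(13 / 21861, 4)
```

In the toy sample, two of the four types are hapaxes and there are seven tokens in all, so P is 2/7. The last line applies the same ratio to the -ous counts reported in Table 1 and, rounded to four places, gives the 0.0006 listed there.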
Since, for Baayen, productivity is a measure of the likelihood of producing novel forms (Baayen 1991), if it is assumed that the likelihood of encountering forms in the corpus directly represents the likelihood of a speaker producing those forms, this ratio seems to provide an index of productivity. But a better indication of the appropriateness of this measure is whether or not it yields satisfactory results.

We consider first Baayen’s own means of evaluating the success of a measure of productivity. He proposes that a good measure of productivity should meet the following requirements (Baayen, et al. 1991):

1. it should reflect the linguist’s intuitions concerning productivity
2. it should express the ‘statistically determinable readiness with which an element enters into new combinations’ (Bolinger)

The index is designed specifically to meet the second criterion, given the above-mentioned assumption on Baayen’s part that the occurrence of forms in the corpus directly reflects the likelihood of a speaker producing those forms. As for the first criterion, Baayen and Lieber (1991, p. 801) claim explicitly that the results yielded by P “accord nicely with [their] intuitive estimates of productivity.” The results of Baayen’s index for selected affixes are shown in Table 1 below:

Table 1 Baayen's Results

affix         N       hapaxes   dislegomena   P1       P2
-ous (n)      21861   13        10            0.0006   0.0011
-able (v)     15004   10        8             0.0007   0.0012
-er (v)       57683   40        40            0.0007   0.0014
-ness (adj)   17481   77        54            0.0044   0.0075
-ish (n)      1602    8         4             0.0050   0.0075
-ish (adj)    290     1         2             0.0034   0.0103

N = tokens, P1 = productivity index with hapaxes, P2 = productivity index with hapaxes + dislegomena

These results do not, however, accord with the intuitions of all linguists, Van Marle (1991) and this author among them. The following comparisons of Baayen’s indices, for instance, do not satisfy my intuitions about the relative productivity of certain affixes:

1. P(-ous) = 0.0006 ≈ P(-able) = 0.0007, P(-er) = 0.0007
2. P(-able) = 0.0007, P(-er) = 0.0007 ≠ P(-ness) = 0.0044, P(-ish) = 0.0050

As indicated in the first comparison, Baayen’s measure assigns the suffix –ous a productivity index of 0.0006, only marginally smaller than the index of 0.0007 assigned to both –able and agentive deverbal –er. However, it is my impression that –ous is effectively unproductive, appearing primarily in learned, listed words (e.g., ferrous, but not foamous), while both –able and –er are essentially fully productive, attaching to virtually any semantically appropriate verb (e.g., teachable, teacher, slurpable, slurper). The second comparison shows the converse problem: –able and –er receive indices far below those of –ness and denominal –ish, although all four are intuitively similar in productivity.

If we assume that the premise of Baayen’s index is sound, we must consider what could be causing the index to yield unintuitive results in these cases. Given the nature of the index, there are two obvious places to look for the source of error: in the counting of tokens and in the counting of hapaxes.

The first potential error is that the number of relevant tokens is being over-counted. Certain types (words) are very frequent, and as such, are stored non-compositionally in the lexicon. Baayen himself (with Lieber, 1991) acknowledges that derived forms are more likely to be stored as token frequency increases, regardless of productivity. It does not make sense to count tokens of these words as instances of a particular affix since they are not stored with any reference to the affix. By counting all tokens containing the phonological string of a given affix, Baayen counts the non-compositional forms as well, and thus the number of relevant tokens is over-counted. Because the denominator of the index is artificially increased, some affixes are made to seem less productive than they ought to be.

The second potential error is that the number of relevant hapax legomena is being over-counted. There are three reasons that a word might occur only once in a corpus:

1. The word is a novel formation, created productively by adding an affix to a stem to which it has not been added before. This is what Baayen intends.
2. The word is simply very infrequent. It is stored in its entirety, but it is only stored in the lexicons of a few people or only used in very limited contexts.
3. The infrequent occurrence of the word is a pure statistical accident of the particular corpus (e.g., picky). This sort of gap, statistically, should not happen often.

By counting all types with a given affix appearing exactly once in the corpus, Baayen counts hapaxes of types 2 and 3 as well as those of type 1, and thus the number of relevant hapaxes is over-counted. Because the index numerator is artificially increased, some affixes are made to seem more productive than they ought to be.

A Revised Index of Productivity

In an effort to address these problems so that the basic core of Baayen’s index of productivity can be better evaluated, the methods used for counting forms were modified and Baayen’s calculation of the index of productivity was reapplied. Three basic modifications were made independently and in combination.

First, high frequency tokens were eliminated. In an attempt to get rid of the bulk of high frequency, listed types, all words with a frequency of 100 or more were excluded from the token count.

Secondly, low frequency hapax roots were eliminated. In an attempt to get rid of noncompositional hapaxes that are simply obscure words, words with low frequency roots were excluded from the hapax count. If a root (the part of the word excluding the relevant affix, and sometimes excluding another productive affix attached to the same stem) occurred fewer than 20 times in the corpus alone or with any affix, the root was considered to be too infrequent to be separately accessible to a speaker who might be searching for roots and affixes to productively combine.
It was assumed that these forms must be listed in their entirety, with their affix strings, and are therefore not instances of morphological productivity.

Finally, dislegomena were included. A number of types occurring twice in the corpus looked just as novel as the singly-occurring forms. It is possible that in some cases, speakers repeat a newly coined word again in the same conversation, or, given the size of the corpus, that two speakers might have individually produced the same novel form. In such cases, dislegomena are just as indicative of morphological productivity as hapaxes are. Thus, productivity indices were calculated on the basis of the number of hapaxes plus dislegomena as well as on the basis of the number of hapaxes alone.

Incorporating these modifications, productivity indices for selected affixes were recalculated from the CELEX corpus. Results are shown in the following tables. Table 2 shows the modified P values based on hapax counts, and Table 3 shows the modified indices based on counts of hapaxes plus dislegomena.

Table 2 Modified P values based on hapaxes

             tokens          hapaxes
Suffix       N       Ned     n1    n1-ed   P Unedited   Without High   Without Low     Without
                                                        Freq. Tokens   Freq. Hapaxes   Both
-ous         36068   8921    34    13      0.00094      0.00381        0.00036         0.00146
-able        18401   4366    22    21      0.00120      0.00504        0.00114         0.00481
-ness        14915   8582    206   177     0.01381      0.02400        0.01187         0.02062
-ish (n)     1592    780     9     6       0.00566      0.01154        0.00377         0.00769
-ish (adj)   452     452     2     2       0.00442      0.00442        0.00442         0.00442
-er          32651   12406   142   126     0.00435      0.01145        0.00386         0.01016

N = tokens, Ned = edited number of tokens, n1 = hapaxes, n1-ed = edited number of hapaxes, P Unedited = n1/N, Without High Freq. Tokens = n1/Ned, Without Low Freq. Hapaxes = n1-ed/N, Without Both = n1-ed/Ned

Table 3 Modified P values based on hapaxes + dislegomena

             tokens          hap + dislegomena
Suffix       N       Ned     n2    n2-ed   P Unedited   Without High   Without Low      Without
                                                        Freq. Tokens   Freq. Hap & Dis  Both
-ous         36068   8921    55    28      0.00152      0.00616        0.00078          0.00314
-able        18401   4366    41    38      0.00223      0.00939        0.00207          0.00870
-ness        14915   8582    307   270     0.02058      0.03577        0.01810          0.03146
-ish (n)     1592    780     13    8       0.00817      0.01667        0.00503          0.01026
-ish (adj)   452     452     4     4       0.00885      0.00885        0.00885          0.00885
-er          32651   12406   242   221     0.00741      0.01951        0.00677          0.01781

N = tokens, Ned = edited number of tokens, n2 = hapaxes + dislegomena, n2-ed = edited number of hapaxes + dislegomena, P Unedited = n2/N, Without High Freq. Tokens = n2/Ned, Without Low Freq. Hap & Dis = n2-ed/N, Without Both = n2-ed/Ned

Recall that we were dissatisfied with the results from Baayen’s productivity calculations whereby the productivity of –ous is assessed to be approximately the same as the productivity of –able and –er, and –able and –er appear to be far less productive than the intuitively similarly productive affixes –ness and denominal –ish. The specific goals of the modified index of productivity, then, were to bring out the difference in productivity between unproductive –ous and productive –able and –er and to neutralize the difference in productivity between –able and –er on the one hand and –ness and –ish on the other.

With regard to the first goal, the current modifications prove to be generally helpful. The degree to which –able and –er are more productive than –ous is illustrated in the following tables, which give the factor by which P(–able) and P(–er) exceed P(–ous) for both hapax and hapax plus dislegomena counts. The P values listed in the tables are the hapax-based values from Table 2.

Table 4 Relative Degrees of Productivity: –ous vs. –able

              unedited   w/o hi freq token   w/o lo freq hapax   w/o both
P(–ous)       .00094     .00381              .00036              .00146
P(–able)      .00120     .00504              .00114              .00481
hapax         1.28 x     1.32 x              3.17 x              3.29 x
hap+disleg    1.47 x     1.52 x              2.65 x              2.77 x

Table 5 Relative Degrees of Productivity: –ous vs. –er

              unedited   w/o hi freq token   w/o lo freq hapax   w/o both
P(–ous)       .00094     .00381              .00036              .00146
P(–er)        .00435     .01145              .00386              .01016
hapax         4.63 x     3.01 x              10.72 x             6.96 x
hap+disleg    4.88 x     3.17 x              8.68 x              5.67 x

As can be seen in Table 5, the unedited productivity index of –er is already greater than that of –ous in the counts made for this study (P(–ous) = 0.00094 vs. P(–er) = 0.00435). [2] Even so, for both –able and –er, the present modifications increase the degree to which these affixes are demonstrably more productive than –ous. This change in the measured productivity of the suffix –ous and the suffixes –able and –er can be attributed to the large number of –ous hapaxes and dislegomena in the unedited list having low-frequency roots, combined with the high frequencies of many –able and –er forms. By eliminating low-frequency roots from the –ous hapax counts, the numerator of the hapax to token ratio is reduced, and thus P(–ous) is reduced. At the same time, eliminating high-frequency tokens from the counts for –able and –er reduces the denominator of the hapax to token ratio, thus increasing the productivity index for these affixes. The combined result is the desired expansion of the difference between the measured productivity of –ous and that of –able and –er.

With regard to the second goal of neutralizing the difference in productivity between –able and –er and –ness and denominal –ish, the results are slightly less straightforward than in the previous case. The various P values for these affixes can be compared in Table 6 below [3]:

Table 6 Comparison of P values for –able, –er, –ness, and –ish

           P Unedited          Without High         Without Low          Without Both
                               Freq. Tokens         Freq. Hap & Dis
Suffix     hapax     hap+dis   hapax     hap+dis    hapax     hap+dis    hapax     hap+dis
-able      0.00120   0.00223   0.00504   0.00939    0.00114   0.00207    0.00481   0.00870
-er        0.00435   0.00741   0.01145   0.01951    0.00386   0.00677    0.01016   0.01781
-ish (n)   0.00566   0.00817   0.01154   0.01667    0.00377   0.00503    0.00769   0.01026
-ness      0.01381   0.02058   0.02400   0.03577    0.01187   0.01810    0.02062   0.03146

[2] It is not clear why the current measures differ from those obtained in the Baayen studies. Baayen does not specify which version of CELEX he has used in his calculations. As the corpus is continually updated, some of the differences between his counts and the current counts may arise from consulting different versions of the corpus. Additional disparities may arise from differences in the determination of which forms have a “noun root” in the case of –ous or a “verb root” in the case of –able and –er, etc.

[3] The data in Table 6 are taken from Tables 2 and 3 above.

In the current counts, –able has a lower P value than –er, –ness, or –ish in all conditions, and –ness has a much higher P value than the other affixes regardless of the condition. The most obvious effect of the modifications of P is that for all modified indices (except for the hapax-only count ‘without high frequency tokens’), –er and –ish trade positions, so that –er is characterized as being at least slightly more productive than –ish. The degree of measured difference between the productivity of the two affixes, however, is not much affected; and where it is, the difference is not lessened. As for the productivity of –er relative to –ness, both the ‘without high frequency tokens’ and the ‘without both’ conditions lessen the measured productivity difference somewhat, though P(–ness) is still approximately two times greater than P(–er) in both cases (compared with a difference of approximately three times in the unedited counts). Likewise, the productivity difference between –able and the affixes –ness and –ish is reduced somewhat in the ‘without high frequency tokens’ and the ‘without both’ conditions: counting hapaxes, P(–ness) goes from 11.5 times P(–able) to 4.8 and 4.3 times P(–able), and P(–ish) goes from 4.7 times to 2.3 and 1.6 times P(–able).
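The four counting conditions compared in Tables 2 through 6 can be sketched as follows. This is a minimal illustration, assuming the cutoffs stated earlier (types with a frequency of 100 or more dropped from the token count, rare forms whose root occurs fewer than 20 times dropped from the numerator). The function name `modified_indices`, its parameters, and the toy frequencies are hypothetical; they are not the procedure or data actually used with CELEX.

```python
def modified_indices(type_freqs, root_freqs, max_count=1,
                     high_cutoff=100, root_cutoff=20):
    """Return P under the four counting conditions for one affix.

    type_freqs: word type -> token count for words with the affix
    root_freqs: word type -> corpus frequency of that word's root
    max_count:  1 counts hapaxes only; 2 counts hapaxes + dislegomena
    """
    # Denominators: all tokens, and tokens of types below the cutoff.
    N = sum(type_freqs.values())
    N_ed = sum(f for f in type_freqs.values() if f < high_cutoff)
    if N == 0 or N_ed == 0:
        raise ValueError("no tokens left to count")

    # Numerators: rare types, and rare types whose root is frequent
    # enough to be treated as separately accessible.
    rare = [w for w, f in type_freqs.items() if f <= max_count]
    n = len(rare)
    n_ed = sum(1 for w in rare if root_freqs.get(w, 0) >= root_cutoff)

    return {
        "unedited": n / N,
        "without_high_freq_tokens": n / N_ed,
        "without_low_freq_hapaxes": n_ed / N,
        "without_both": n_ed / N_ed,
    }


# Invented frequencies for illustration only; not CELEX counts.
toy_types = {"usable": 150, "readable": 40, "slurpable": 1,
             "kissable": 1, "wieldable": 1}
toy_roots = {"usable": 5000, "readable": 3000, "slurpable": 25,
             "kissable": 400, "wieldable": 12}
toy_result = modified_indices(toy_types, toy_roots)
```

In the toy data, removing the one high-frequency type shrinks the denominator from 193 to 43, and removing the one hapax with a rare root shrinks the numerator from 3 to 2, which mirrors the two directions of adjustment seen in the tables.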
Thus, we see that in the conditions in which high frequency tokens are not included in the count, the differences between the measured productivity of –able and –er and –ness and denominal –ish are generally reduced relative to the differences in the unedited counts; however, the differences are certainly not neutralized as we had predicted.

To better evaluate the effect of the modifications to Baayen’s productivity index considered here, it is useful to summarize the effect of each with respect to individual affixes, as in Table 7:

Table 7 The effect of individual modifications on P of individual affixes

High Frequency Tokens Removed:
   -able, -er, -ness, -ish (n) are all approx. doubled or more in this condition
   -able is most affected
Low Frequency Hapaxes Removed:
   -ous is most affected
   little or no effect on other affixes in any condition
Both:
   -able, -er and -ous are most affected (-ous in the opposite direction from the other two)

As mentioned above, the effect of the removal of high frequency tokens on –able and the effect of the removal of low frequency hapaxes on –ous combine to bring about the widening of the gap in productivity between the two affixes. But because all of the intuitively productive affixes, –able, –er, –ness, and denominal –ish, are affected by the removal of high frequency tokens, while the removal of low frequency hapaxes has little effect on any of these affixes, the current modifications are not very effective in neutralizing differences among these affixes.

Given the imperfect results of the current modifications, we must consider carefully whether these adjustments, though motivated, really capture the psychological dynamic that determines productivity, or whether they merely seize upon some emergent property of this dynamic. We have noted that the productive affixes, –ish, –er, –able, and –ness, show increased measured productivity when the affix counting methods are modified.
It is not the case, however, that –er and –able show a greater increase in productivity than –ish and –ness, thus effecting the neutralization of the productivity of these affixes. To satisfy our intuitions about relative productivity, then, we must assume that there are other factors that systematically differentiate the modification of different productive affixes. Insofar as it is these other factors (which may well be dependent on or at least highly correlated with the factors manipulated in the current study) that would bring about the desired result, we might infer that it is these factors which more closely represent some psychologically real phenomenon of productivity.

Further, despite the fact that removing hapaxes with low root frequencies seems to have worked well in lowering the measured productivity of –ous, when the individual forms actually eliminated from the count are considered, it is not clear that this is the most reasonable sort of means of effecting this result. The roots of –ous hapaxes are not simply less frequent than the roots of hapaxes with other affixes; the roots of –ous words (hapaxes as well as more frequent forms) are of an intrinsically different kind. While the roots of other forms tend to be words (e.g., bashfulness, dwarfish, weaver), the roots of –ous words are fundamentally less word-like (e.g., opprobrious, contumacious). Thus, although these non-word-like roots tend to have lower frequencies, not all low frequency roots are of the –ous type. Some, particularly among the forms suffixed with –able, –er, –ness, and –ish, are simply obscure (or even just accidentally infrequent) words that a speaker may add productive morphology to. Surely such forms (e.g., heinousness, kissable) should not, by their removal from the calculation, count against productivity.
Corpus-Based Measures of Productivity in General

As we have established, modifications to the method of counting in the calculation of the index of productivity seem to bring relative P values at least somewhat closer to our intuitions about the relative productivity of certain affixes, but their success is far from complete. Though we may consider slight adjustments to the modifications to the productivity index, as suggested above, the results of the current study lead us to consider as well whether even a perfectly counted corpus-based measure could ever be sufficient to capture the nature of morphological productivity.

It is clear that certain characterizations can be made of productivity that are not accessible to a model that can only count forms. There are a number of specific constraints on productivity that require more complicated, less quantitative sorts of knowledge. For example, some affixation is parasitic; in other words, some affixes attach primarily to particular other affixes (e.g., –ity attaches to –ic and –able forms). Some affixes are more likely in the semantic context of another particular affix (e.g., –ee forms appear often in the context of –er forms) (Barker 1998). Other affixes attach only to roots of a certain etymological type (e.g., –ous attaches only to Latinate roots). Still other affixes are restricted to particular registers of usage (e.g., –ish is found primarily in spoken language, and not in written texts) (Plag, et al. 1999). An intuitive notion of productivity clearly includes such factors, but though their effect may in some sense be counted in the current index, their substance is in no way represented by the index.

The argument could be made that it is not a flaw of the current index that it does not account for such conditions on productivity.
One could imagine, for example, that factors like these do not directly affect productivity; rather, they act as a kind of post-application constraint on the possible outputs of Word Formation Rules, which in turn have access only to the P values of affixes to determine whether or not to apply to a particular root and affix. In other words, the constraints have a sort of veto power over the less discriminating Word Formation Rules, which produce far more forms than are actually attested. Under this system, roots like gold and ferr and the affix –ous are combined by a general Word Formation Rule, but the morphologically complex output form goldous is rejected by a constraint on –ous words with non-Latinate roots, while ferrous, with its Latinate root, is accepted. Insofar as productivity is understood to be the likelihood of a particular affix entering into a morphological process, these constraints do not, in fact, affect productivity.

But under such an analysis, “productivity”, as measured by the productivity index, seems almost meaningless. Such a notion of productivity is not at all accessible to intuition, and given Baayen’s own requirement that a measure of productivity should “reflect…intuitions concerning productivity,” it can be inferred that an analysis with such a deliberate exclusion of factors which influence intuitions on productivity is not Baayen’s intention. We might assume instead that these factors were simply not considered to be a necessary part of this model, which is based on the theory that the critical properties of productivity are statistical in nature. However, it might be precisely factors like these which contribute in a non-quantitative way to the real productivity of an affix, influencing the likelihood that a Word Formation Rule will even apply to a particular root and affix to yield an output, and rendering the current model, which cannot consider such conditions, empirically inadequate.
There are means other than an appeal to intuition by which the empirical adequacy of a corpus-based, statistical view of productivity might be evaluated as well. A statistical view of productivity carries certain consequences which must be considered and which provide the basis for potential experimental evaluation of such models. Perhaps the most obvious consequence is that a statistical model of productivity suggests that productivity is inherently gradient. To illustrate, if we choose the maximally edited version of P (hapaxes + dislegomena with revised numerator and denominator), –ness is shown to be more productive than –er, which is more productive than –ish (n), and so on, as illustrated in Table 8 below:

Table 8 The relative modified P values of various affixes in decreasing order of productivity

-ness     -er       -ish (n)   -ish (adj)   -able     -ous
0.03146   0.01781   0.01026    0.00885      0.00870   0.00314

Furthermore, –er, for example, is calculated to be twice as productive as –able, while –ish (n) is 1.2 times as productive. Moreover, the difference in productivity between –able and –ous is less (P(–able) is 2.77 times P(–ous)) than the difference in productivity between –ness and –able (P(–ness) is 3.62 times P(–able)).

These consequent facts, however, do not accord with the intuitions of this author. In fact, the intuitions which serve as the basis for the current evaluation of Baayen’s index represent productivity as essentially categorical rather than gradient. One of the goals of the modification of the productivity index was to neutralize the P values of the affixes –ness, –er, –able, and –ish, effectively making them all equal members of a category of fully productive affixes. The other goal was to differentiate between the P value for –ous and the values for –able and –er in order to allow –ous to fall into a separate category of non-productive affixes.
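The ratios quoted from Table 8 are simple quotients of the tabulated P values, and the arithmetic can be reproduced as below. The values are copied from Table 8; the names `TABLE_8` and `times_as_productive` are hypothetical and used only for this illustration.

```python
# Maximally edited P values (hapaxes + dislegomena, both edits), from Table 8.
TABLE_8 = {
    "-ness": 0.03146,
    "-er": 0.01781,
    "-ish (n)": 0.01026,
    "-ish (adj)": 0.00885,
    "-able": 0.00870,
    "-ous": 0.00314,
}


def times_as_productive(affix, baseline):
    """Return P(affix) / P(baseline), rounded to two decimal places."""
    return round(TABLE_8[affix] / TABLE_8[baseline], 2)


# Affixes in decreasing order of measured productivity.
ranking = sorted(TABLE_8, key=TABLE_8.get, reverse=True)

er_vs_able = times_as_productive("-er", "-able")        # about twice
ish_vs_able = times_as_productive("-ish (n)", "-able")  # about 1.2 times
able_vs_ous = times_as_productive("-able", "-ous")
ness_vs_able = times_as_productive("-ness", "-able")
```

On a gradient reading, each of these quotients is meaningful in its own right; on a categorical reading, only the gap separating –ous from the other affixes would be expected to matter.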
Though there is no explicit discussion of the gradient or categorical nature of productivity in Baayen’s studies, Baayen’s claim is that the productivity index is not merely a descriptive statistic, but rather “P expresses…in a very real sense the linguistic notion of productivity” (Baayen 1991). We assume, then, that he considers the gradient nature of his assessment of productivity to be psychologically real.

The conflict between this consequence of the model and its alternative lends itself to experimental investigation. Carefully designed lexical decision experiments involving low frequency (hapax and dislegomena) affixed words and like-affixed nonwords could reveal differences which might exist among novel or near-novel forms with various affixes (as manifested by low-level differences in reaction times). The underlying assumption is that forms with more productive affixes would be more quickly accepted and more slowly rejected because it would simply be more likely that any given root could occur with a more productive affix than with a less productive one. Systematic differences between words with different affixes would offer support to the position that productivity is gradient, while a clustering of groups of affixes around two distinct reaction time points would offer support to the categorical position. With this more representative sample of more objectively gathered intuitions, the relative ordering of affixes by their productivity (if systematic differences are found) could also be compared with Baayen’s results.

Of course, a gradient model could be interpreted as somewhat categorical. One could imagine, for example, that there is a threshold of productivity, above which an affix is marked as productive, and below which it is not. A learner could keep track of statistical information about an affix until it reached the P threshold and was labeled as ‘productive’.
After that, such detailed knowledge would be unnecessary, as the affix would already be marked as able to participate in word formation processes. Statistical differences among productive affixes would then be meaningless, although varying degrees of productivity should still exist among less productive affixes.

The linguistic and psychological phenomenon of morphological productivity might well, finally, be comprised of such knowledge of the lexicon and the frequency of occurrence of its members as are represented in Baayen’s index of productivity. But even when it is most successfully modified and given its best chance, Baayen’s index is inadequate to meet the standard of satisfying linguists’ intuitions about the relative productivity of certain affixes. Thus, we are left with two possible conclusions. First, it could be that important, perhaps non-quantitative factors are missing from the corpus-based statistical model, and incorporating more factors would improve its success. But because we know that, after a point, the more complications that are introduced, the less likely the model is to be psychologically plausible, due to limitations on human computational abilities, we must consider a second possible conclusion as well. If, ultimately, the only knowledge necessary for the proper application of Word Formation Rules is whether or not an affix is productive and any specific conditions on a particular affix’s occurrence, why would the effort for such detailed calculation be wasted? Perhaps then the nature of morphological productivity is actually very different from the way that Baayen’s index represents it, and the critical factors, whatever they are, cannot be counted, however carefully, from any corpus. We must await further experimental studies to make a determination.

References

Aronoff, M. (1976) Word Formation in Generative Grammar. Cambridge, MA: MIT Press.

Baayen, H. (1991) Quantitative aspects of morphological productivity. Yearbook of Morphology, 1991, 109-149.

Baayen, H. & Lieber, R. (1991) Productivity and English derivation: a corpus-based study. Linguistics, 29, 801-843.

Baayen, R.H. & Renouf, A. (1996) Chronicling the Times: productive lexical innovations in an English newspaper. Language, 72, 69-97.

Barker, C. (1998) Episodic –ee in English: a thematic role constraint on new word formation. Language, 74, 695-727.

Plag, I., Dalton-Puffer, C., & Baayen, H. (1999) Morphological productivity across speech and writing. English Language and Linguistics, 3, 209-228.

Van Marle, J. (1991) The relationship between morphological productivity and frequency: a comment on Baayen’s performance-oriented conception of morphological productivity. Yearbook of Morphology, 1991, 151-163.