John Benjamins Publishing Company

This is a contribution from International Journal of Corpus Linguistics 12:3 © 2007. John Benjamins Publishing Company

This electronic file may not be altered in any way. The author(s) of this article is/are permitted to use this PDF file to generate printed copies to be used by way of offprints, for their personal use only. Permission is granted by the publishers to post this file on a closed server which is accessible to members (students and staff) only of the author’s/s’ institute. For any other use of this material prior written permission should be obtained from the publishers or through the Copyright Clearance Center (for USA: www.copyright.com). Please contact [email protected] or consult our website: www.benjamins.com Tables of Contents, abstracts and guidelines are available at www.benjamins.com

Lexical repulsion between sense-related pairs

Antoinette Renouf and Jayeeta Banerjee
University of Central England

This paper builds on the groundwork and setting up of methods for an innovative approach to analysing text. We have proposed that there is a hitherto unexplored textual feature, which we call ‘repulsion’, which operates on the construction of meaning in an opposing way to that of word collocation. To illustrate, we do not say cheerfully happy even though we say blissfully happy. We focus on ‘lexical repulsion’, by which we mean the intuitively-observed tendency in conventional language use for certain pairs of words not to occur together, for no apparent reason other than convention. Our goal is to establish how repulsion as a whole operates and whether it can be assigned the status of an objectively measurable ‘force’. It is anticipated that this approach will have wide implications for corpus linguistics and NLP. In this paper, we take the particular case of repulsion between sense-related word pairs.

Keywords: lexical repulsion, attraction, collocation, corpus linguistics, synonyms, antonyms

1. Introduction
1.1 Background

‘Corpus linguistics’ is a broad term, encompassing a cycle of theoretical and practical activities which both precede and include the discovery in a corpus of texts of many different kinds of information about language in use. The corpus linguistic cycle typically begins with a hypothesis which brings a twinkle to the eye of the corpus designer, then moves through the birth pangs of corpus creation, through the detailed scrutiny of each instance of a word in a given text corpus once constructed, and on to the extraction from the corpus of more generalised, class-level knowledge, which in turn spawns new hypotheses (Renouf 2007). In this paper, we focus on the latter stages of this methodological cycle: the procedures associated with the extraction of information about word use from a corpus, and the derivation from this of knowledge about textual organisation.

International Journal of Corpus Linguistics 12:3 (2007), 415–443. issn 1384–6655 / e-issn 1569–9811 © John Benjamins Publishing Company

Since corpus linguistics as we know it began around forty years ago with the creation of the Brown Corpus (Kučera & Francis 1967), the bread and butter of corpus analytical practice has been the perusal of words arrayed in alphabetically-ordered contexts, with particular attention to their preferred word neighbours. This routine serves to identify the semantic and other features of a word which are revealed by the presence of its corresponding collocational associations. The term ‘collocation’ (as defined by Firth (1957)) typically characterises the situation whereby a word is not evenly or randomly distributed across texts, but is found close to its preferred word partners (or ‘collocates’). In our previous work, the focus has been on the circumstances under which this lexical ‘attraction’1 occurs, whether in adjacent word pairings or discontinuous phrasal or grammatical frameworks.
This fact of the language has in turn been transformed into a ‘methodological tool’ for discovering further information; for example, in the discovery of the meaning of an unknown word (e.g. Renouf 1996).

The possibility of the existence of repulsion had lain in the recesses of the RDUES2 Unit’s collective consciousness through the years of collocational study, and was finally disinterred for inspection a few years ago. Our intuitive awareness was that one routinely utters some word combinations, such as Merry Christmas, Happy Christmas and Happy Birthday, but not Merry Birthday. We tested this apparent phenomenon of avoidance against our existing z-score collocational statistics (span +/–1, case insensitive), and found that it was indeed identifiable. The measures for the four word pairs revealed (see Table 1) that, while Pairs 1 (Merry + Christmas), 2 (Happy + Birthday) and 4 (Happy + Christmas) collocate strongly, Pair 3 (Merry + Birthday) does not collocate at all and produces a negative z-score.

Table 1. Collocation of merry, happy, christmas and birthday

Pair   Word1 (corpus freq.)   Word2 (corpus freq.)   Collocates   Z score
1      merry (2326)           christmas (90,670)     450          393.205
2      happy (8323)           birthday (2416)        526          516.16
3      merry (2326)           birthday (2416)        0            –1.014
4      happy (8323)           christmas (90,670)     299          196.010

This finding encouraged us to proceed to develop and test systematically a hypothesis about the existence and nature of ‘repulsion’ in text. Given that our methodology is based on collocational considerations, we naturally focus on ‘lexical’ repulsion. By ‘lexical repulsion’, we mean the observed tendency in conventional language use for certain pairs of words not to occur together. Of course, we know that repulsion, if it exists, does so at several levels of description and generality.
This is also suggested by the existing studies of ‘cooccurrence restriction’ in language use. Other fields have traditionally focused on what is linguistically allowable, stated in terms of constraints. In grammar, a vast and established body of scholarship deals with the rules governing word-class co-occurrence, concord and syntactic sequencing (e.g. Blache et al. 2003). In semantics, there is a long-established tradition of studying ‘selectional constraints’ on word co-occurrence, encompassing both non-computational and computational approaches (e.g. Resnik 1997). Morphologists and phonologists talk of ‘blocking’ in word formation (e.g. Andrews 1990; Aronoff 1976; Suzuki 1998; Yip 1998; Kim 1998). Translators (e.g. Laviosa-Braithwaite 1996) and foreign language teachers (e.g. Bonci 2004) refer impressionistically to ‘collocational constraints’, ‘restrictions’ and ‘clashes’ in relation to the preference for certain pairings, such as round of applause over round of clapping, and take a nap over take a sleep.

However, we find no real guidance elsewhere3 on ‘lexical repulsion’ per se, in terms of active repulsion, as a measure of distance between two words, and of its scope and potential as a supplementary tool in text analysis.

1.2 Goals of the Repulsion Project

So against the background outlined so far, of our own intuitions and of work in associated fields, our study goals are as follows.

1.2.1 Notion of a force

We wish to introduce the notion that there is another ‘force’, which we call ‘repulsion’, which operates on the construction of text in an opposing way to that of collocation or ‘lexical attraction’. The goal of the study is to establish some understanding of how this phenomenon operates, and whether it can be assigned the status of an objective and measurable feature of textual organisation.
1.2.2 Active distancing

Specifically, we investigate the existence and measurability of ‘active’ distancing which may operate between words, rather than the known, routine constraints on co-occurrence imposed by grammatical and other more easily observable norms, and as distinct from ‘indifference’ (see 3.3).

1.2.3 Lexical repulsion

We focus particularly on ‘lexical repulsion’, the trickiest case, where there appears to be no explanation for the non-cooccurrence of two particular words beyond tradition, and no objective means of prediction or rule to apply. This case is a continuing thorn in the side of English language learners of all mother tongues. Crucially, we seek to isolate the exceptional cases of active ‘repulsion’ (see Section 3.2) between two words in a text from the commoner relationship of ‘indifference’,4 as defined in Section 3.3.

1.2.4 Semantic repulsion

We shall also conduct a small-scale investigation into another area of repulsion which we assume will also be measurable by collocational means: that is, ‘semantic repulsion’ (e.g. bus and butter).

1.2.5 Applications

We shall investigate ways (some of which we already have in mind) in which the knowledge thereby cumulatively gained can be exploited to provide solutions to some problems in language teaching and NLP.

2. The scope of this paper

This paper is based on our initial investigations into lexical repulsion within the larger project, as reported in Renouf and Banerjee (2007 and forthcoming). We focus in the paper on the repulsion obtaining between sense-related pairings, with reference to synonymy and antonymy. At this point, however, our interest is not in executing a rigorous and exhaustive investigation of sense relations and repulsion per se. The reason for prioritising sense-related items is two-fold.
The first is that sense-related pairings are likely to embody repulsion in its strongest form, and thus generate a more linguistically significant sub-set of repelled items. Synonyms can be expected by virtue of their shared meanings to share significant collocates in text. So where two synonyms repel each other’s collocates, the repulsion is particularly surprising and noteworthy, and in principle, one could expect the converse with antonyms: that the absence of repulsion would be surprising. The second reason for prioritising sense-related pairs is a practical one: if we focus only on repulsion between these pairs, we reduce the amount of indifferent and irrelevant output generated, given that an individual word attracts only a few favoured collocates, and is more or less indifferent or “hostile” to the rest of the vocabulary. In this paper, we also focus only on the contiguous collocates of each synonym pair, in order to eliminate any ambiguity between repulsion and simply close collocation.

3. Definition of terms used

As a prelude to the study proper, we shall clarify our use of the key terms in the study.

3.1 Collocation

This is a property of language whereby words are not randomly distributed across texts, but occur close to certain preferred word partners (or ‘collocates’), a relationship refined according to their particular roles in each textual domain, genre and text type. Collocation exists between both adjacent word pairings and discontinuous phrasal or grammatical frameworks; and the significance of co-occurrence is deemed to be measurable by the application of appropriate statistical algorithms.
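To make the notion of measurable significance concrete, the following is a minimal sketch of one common textbook formulation of a collocational z-score. It is an assumption offered for illustration only: it is not necessarily the measure used in this study, whose exact formula is not stated here, and it should not be expected to reproduce the scores reported in the tables. The function name, parameter names and toy frequencies are all hypothetical.

```python
import math


def collocation_z_score(node_freq, collocate_freq, observed, corpus_size, span=1):
    """Z-score for a node/collocate pair under a random-distribution model.

    node_freq      -- corpus frequency of the node word
    collocate_freq -- corpus frequency of the candidate collocate
    observed       -- number of times the collocate occurs within the span
    corpus_size    -- total number of tokens in the corpus
    span           -- number of token positions inspected around each node
    """
    # Chance of meeting the collocate at any position not occupied by the node.
    p = collocate_freq / (corpus_size - node_freq)
    # Co-occurrences expected if the collocate were scattered at random.
    expected = p * node_freq * span
    # Standard deviation of that count under a binomial approximation.
    std_dev = math.sqrt(expected * (1.0 - p))
    return (observed - expected) / std_dev


# Toy frequencies (invented): the same pair, first never co-occurring,
# then co-occurring far more often than chance predicts.
z_never = collocation_z_score(1000, 2000, 0, 1_000_000, span=2)
z_often = collocation_z_score(1000, 2000, 50, 1_000_000, span=2)
print(f"never together: z = {z_never:.3f}")
print(f"often together: z = {z_often:.3f}")
```

Under this formulation a pair that co-occurs less often than chance predicts receives a negative score and a pair that co-occurs more often receives a positive one, so attraction and its opposite are read off a single scale.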
3.2 Repulsion

By ‘repulsion’, we mean the intuitively-observed tendency in conventional language use for certain pairs of words not to occur together, not simply due to the fact that they are semantically incompatible, or grammatically disallowed, or morphologically or phonologically blocked, but where there appears to be no explanation other than convention. For instance, for no apparent reason, it is conventional in English to say sheer guts, but not utter guts; and utter peace but not sheer peace.

3.3 Indifference

Furthermore, non-cooccurrence is not necessarily a matter of indifference. ‘Indifference’ refers to the situation where two words are in a statistically neutral relationship as regards proximity; there is no statistical evidence that Word A is singling out Word B either to be significantly close to or significantly distant from.

3.4 Force

We borrow the term ‘force’ from the sciences, to reinforce our ‘attraction’ and ‘repulsion’ metaphors. In general, a force is an action on an object that causes its momentum to change; in electromagnetism, a force is the repulsion of like, and attraction of unlike, charges. The fact that electrically-charged particles can also repel each other allows us to extend the metaphor to characterise the relationship of ‘repulsion’. In the corpus linguistic context, however, we use the term loosely, to mean a “statistically-significant tendency for something to happen, specifically for certain words to repel or be repelled by each other”.

Our concept of repulsion is represented diagrammatically in Figure 1. Word A and Word B in Figure 1 are synonyms, shown within their respective ‘collocational spaces’. The circular area on the left represents the significant collocate set for Word A, and the circular area on the right represents the significant collocate set for Word B.
The overlap between their two sets of collocates represents their shared significant collocates. The middle area of the circle (between the overlap and crescent) for each word represents those collocates which might also collocate with the other word, but only weakly or insignificantly. Meanwhile, the extreme outer crescents represent areas of actual repulsion: by Word B of Word A’s collocates, and by Word A of Word B’s collocates.

[Figure 1. Graphic representation of ‘Repulsion’. Two overlapping circles show the collocate space of Word A and the collocate space of Word B. The overlap holds the shared A/B collocates (strong collocation with both A and B). One outer crescent is labelled ‘Repels word B but strong collocation with word A’, the other ‘Repels word A but strong collocation with word B’.]

4. Data and method

The source data used in our study consisted of a corpus of 800 million words of written text from the domain of UK broadsheet journalism, the Independent and the Guardian newspapers.5 In view of this data selection, it will be realised that when we make reference to instances of lexical repulsion in this paper, we only speak of that which is characteristic of “broadsheet” journalistic text. The texts cover the period 1989 to end 2006 in an unbroken stretch, and so allow observation of recent usage. Since the limits of repulsion cannot, by definition, extend beyond the confines of a complete text, it was informative to retain the mark-up which indicates the boundary of each article.

The study entails the interaction of linguistics, statistics and software tool development. The approach taken was the usual iterative, stochastic method of articulating hypotheses, of developing tools and statistical measures to allow them to be tested on the selected language data, and ultimately of evolving a system which can extract the knowledge on a sufficiently large scale to allow its ultimate application in the fields of language teaching and NLP.
4.1 Measuring repulsion

The statistical measures of repulsion which we applied were based on relative frequency in relation to the 800 million word corpus as a whole. We have built ‘collocational profiles’ containing this information for each word in newspaper text, and used a z-score cut-off to identify only the most significant collocates. Collocational z-scores are a measure of the strength of a relationship based on comparing (a) the frequency with which two observed words collocate within a given span with (b) their expected frequency in a body of text if the occurrence of one word of the pair were at random relative to the other word. We demonstrate the statistical thresholds (z-score cut-offs), with reference to the synonymous word pair nearly and almost, as follows:

– for ‘repulsion’: the strength of association was set at ≤ –2; that is, the words nearly + certainly exhibit repulsion because their strength of association is –5.959
– for ‘weak collocation’: the strength of association lay between –2 and 2; that is, the words nearly + commonplace exhibit weak collocation, with a strength of association of –1.27, as do the words almost + 50ft, with a strength of association of –1.103
– for ‘strong collocation’: the strength of association was set at ≥ 2; that is, the words almost + half have a collocation strength of 261.703, while the words nearly + doubled have one of 284.358

It is important to note that the statistical thresholds for both collocational attraction and repulsion are thus set on the same scale, albeit at different points.

5. Lexical repulsion between sense-related word-pairs: synonyms

It was essentially a pragmatic move to focus on the repulsion behaviour of synonymous pairs, as explained, and an exhaustive study of all aspects of the phenomenon was beyond the scope of this project. There were many possible ways to proceed.
We had previously composed a candidate inventory based on three main sources: our intuition (and that of others, e.g. Palmer (1981)); the products of our ACRONYM collocational similarity measures (Renouf 1996; Pacey et al. 1998); and those of targeted lexical signalling (e.g. Renouf 2001). But we began this research by looking at classic examples of close synonymy within different word classes and between polysemous items which are known to cause problems for language learners. From that beginning, our method of selection has largely been one of exploratory boot-strapping, moving on to a new synonym pair which suggested itself in examining a previous pair. Sometimes this was “good research practice”, to change one linguistic variable to throw particular light, even if obscure, such as a shift to a different word-class. At other times it was to break a deadlock, or out of curiosity. In due course, we intend to shape this incidental study into a more structured whole, investigating for example how the typicality of behaviour varies according to scales of concreteness to abstraction, monosemy to polysemy, and low to high frequency.

5.1 Case studies on synonym pairs to demonstrate new repulsion methodology

The following case studies have been selected to illustrate the phenomenon of repulsion within different word classes. The pairs chosen are: almost and nearly, seat and chair, argue and discuss, and pretty and attractive.

5.1.1 Synonymous pair: ‘almost’ and ‘nearly’

5.1.1.1 Selection

The adverbial synonyms almost and nearly were selected because they are staples in the pedagogic challenge to English language teachers to explain the appropriate use of near synonyms to learners. The general view (see also recent analysis by Kjellmer (2003)) is that it is impossible to differentiate between these two words sensibly.
Linguists assert ‘attitudinal’ or ‘evaluative’ differences as the key, with one being positively and the other negatively connotative, but this nuance does not hold for long before a counter-example emerges. Most dictionaries, even the corpus-based Collins-Cobuild Dictionary (2006), define each word in terms of the other. We decided to see if our collocational measures (see 4.1) could reveal anything further about the pair.

5.1.1.2 Identifying degrees of association between synonyms

A cross tabulation (Table 2) was used to display the joint distribution of the three categories of association: repulsion, weak collocation and strong collocation, between the collocates of almost and nearly.

Table 2. almost and nearly repulsion cross-tabulation

                                nearly
almost               repulsion   weak collocation   strong collocation   Total
repulsion            -           -                  28                   28
weak collocation     -           -                  335                  335
strong collocation   78          1999               762                  2839
Total                78          1999               1127                 3202

Table 3. Collocates exhibiting different textual uses of almost and nearly

almost repels (nearly attracts): far, man, Not, men, year, million, Christmas, taking, including, quarter, 68, Over, average, pretty, weren’t, Very, 77, aged, metres, sent, 1990, 31, married, put, population, serving, 2.3, purchased

nearly repels (almost attracts): see, seems, become, times, feel, look, certainly, sure, appears, knew, expect, felt, looks, seemed, remains, immediately, word, worse, remain, feels, physical, surely, straight, equally, exactly, directly, academic, regarded, essential, hear, quiet, considered, desperate, bound, appear, guarantee, wish, intense, definitely, remained, imagine, afraid, entirely, Victorian, classical, daily, sweet, concentrate, focused, precisely, hidden, secondary, comic, amounts, depends, inevitably, romantic, unknown, acceptable, monthly, overnight, routine, sheer, bizarre, quietly, qualify, rely, immediate, completely

both attract: half, 40, doubled, two-thirds, pounds, two, 200, three, twice, spent, one-third, killed, six, lost, dollars, impossible, seven, lasted, worth, drowned, died, tripled, eight, enough, fell, 25, 40000, 1m, 2m, 100000, trebled, 400000, ruined, 5m, 2, spend, collapsed, blew, 250000, 5bn, went, broke, 1400, extinct, 13, reached, 300m, blind, two-and-a-half, attracting, scuppered, After, scored, 17000, 70000, 1300, 900000, midnight, 10m, bankrupt, raised, 2.5

The word almost is more than twice as frequent as nearly, their frequencies being 232,689 and 85,431 respectively, and Table 2 shows that almost collocates strongly with more word types (2839) than does nearly (1127). Thus, nearly operates within a more restricted domain and/or range of functions, and consequently repels more word types (78) than almost (28) in the shared collocate space. The two words share many strong collocates, 762 in fact. As shown in Table 2, these 762 collocates occupy the common shaded area between the two circles of Figure 1, while 78 and 28 items occupy the crescents on either edge of the circles respectively, representing repulsion. There are 1999 strong collocates of almost which only weakly collocate with nearly, and conversely 335 strong collocates of nearly which only weakly collocate with almost. The other boxes are empty because we have been considering only the significant collocate space of the two words, and in this space, we are not likely to find any words that are either repelled by both almost and nearly, or that weakly collocate with each of them.
Only when we consider all collocates of almost and nearly can we expect these boxes to be filled. We shall later in our study also consider the contents of these boxes. Table 3 lists some of the words that either almost or nearly repel or attract.

5.1.1.3 Linguistic analysis and interpretation

As might be expected for words expressing approximation, there is a prevalence of words representing numerals in the results. Of the 762 words that collocate strongly with both words, 241 are numerals, with or without a unit of measurement attached (e.g. 7m) and this does not include numbers written as words (e.g. five). The word almost seems to repel precise numbers, whereas nearly is found modifying numerals, e.g. nearly 5, nearly 68 and so on. Consistent with this, almost repels verbs of “achievement” or “milestones”, such as aged, sent, married, serving, purchased, which frequently occur in text with a unit of measurement (e.g. married nearly 10 years) quantifying the achievement, and which collocate with nearly. Further, in contrast with nearly, almost repels pretty and very, adverbials which can emphasise achievement.

The word nearly, meanwhile, repels verbs of perception, including some forms of the lemmata LOOK, FEEL, HEAR, which collocate significantly with almost. Nearly also repels some stative verbs, such as APPEAR, SEEM, BECOME; verbs of reported opinion, such as considered, regarded; adverbs of “certainty”, such as surely, definitely, completely, immediately; and verbs of “reassurance”, such as bound, rely, guarantee.

Since other repelled items in Table 3 seem to be consistent with this analysis, let us tentatively claim that the data do show us a semantic difference between these synonyms: that almost generally avoids physical measurement, while nearly repels contexts of modality, used in hedging. In other words, nearly seems to modify precise numbers, while almost contributes to a discourse which is down-playing certainty.
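The three-way categorisation and the cross-tabulation of two collocational profiles described above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the software used in the study: the function names and the representation of a profile as a word-to-z-score dictionary are hypothetical, and the z-scores in the toy profiles are invented, with only their signs chosen to agree with Table 3. The cut-offs of –2 and 2 are those given in Section 4.1.

```python
from collections import Counter

REPULSION_MAX = -2.0  # z <= -2 counts as repulsion
STRONG_MIN = 2.0      # z >= 2 counts as strong collocation


def classify(z):
    """Map a z-score onto one of the three categories of association."""
    if z <= REPULSION_MAX:
        return "repulsion"
    if z >= STRONG_MIN:
        return "strong"
    return "weak"


def cross_tabulate(profile_a, profile_b):
    """Cross-tabulate the collocates scored in both profiles.

    Only collocates that are strong for at least one of the two words are
    counted, mirroring the restriction to the significant collocate space.
    Returns the cell counts and the words that fall in each cell.
    """
    counts = Counter()
    members = {}
    for word in sorted(profile_a.keys() & profile_b.keys()):
        cell = (classify(profile_a[word]), classify(profile_b[word]))
        if "strong" not in cell:
            continue  # weak/weak, weak/repulsion and similar cells are skipped
        counts[cell] += 1
        members.setdefault(cell, []).append(word)
    return counts, members


# Toy profiles with invented z-scores (illustration only).
almost = {"certainly": 9.0, "married": -3.0, "half": 50.0,
          "commonplace": 6.0, "table": 0.5}
nearly = {"certainly": -6.0, "married": 8.0, "half": 40.0,
          "commonplace": -1.3, "table": 0.1}

counts, members = cross_tabulate(almost, nearly)
for (cat_almost, cat_nearly), words in sorted(members.items()):
    print(f"almost: {cat_almost:9}  nearly: {cat_nearly:9}  {words}")
```

Each cell of the resulting count corresponds to one box of a cross-tabulation such as Table 2, and the word lists per cell correspond to the columns of a table such as Table 3.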
The words almost and nearly have a shared area of collocates, which seems business-oriented, involving verbs to do with money (spend, blow, lost, doubled, trebled, tripled) and a number of verbs of catastrophe, both financial (bankrupt, broke, ruined) and existential (blind, scuppered, collapsed, killed, died, drowned).

5.1.2 Synonyms: ‘seat’ and ‘chair’

5.1.2.1 Selection

The words seat and chair were selected because their semantic relationship had been shown by our earlier ACRONYM project to vacillate in text. There was an assumption in pre-computational lexical semantics that seat was a superordinate of chair, a hierarchical relationship which we have observed to be inconsistent in real text. Rather, we tend to view the two as distinguishable referentially.

5.1.2.2 Identifying degrees of association between synonyms

The word seat occurs more often (41,244 occs.) than chair (27,214 occs.) in the corpus. The cross tabulation in Table 4 shows three categories of association: repulsion, weak collocation and strong collocation, between the collocates of chair and seat.

Table 4. seat and chair repulsion cross-tabulation

                                seat
chair                repulsion   weak collocation   strong collocation   Total
repulsion            -           -                  29                   29
weak collocation     -           -                  505                  505
strong collocation   22          374                70                   466
Total                22          374                604                  1000

Table 4 shows that seat collocates with more word types (604) than does chair (466), indicating that seat operates over a less restricted range of functions, and repels fewer word types (22) than chair (29). The two words share only 70 strong collocates, which occupy the common shaded area between the two circles, whereas 29 and 22 items occupy the crescents on either edge of the circles respectively, representing repulsion.
There are 374 strong collocates of chair which only weakly collocate with seat, and 505 strong collocates of seat which only weakly collocate with chair.

Table 5. Collocates exhibiting different textual uses of seat and chair

chair repels (seat attracts): country, family, West, best, child, third, East, sales, prices, ground, front, economy, South, left, free, space, second, race, majority, sale, numbers, class, compared, target, middle, train, parliament, fit, held

seat repels (chair attracts): woman, today’s, national, year’s, made, Davies, Professor, King, Campbell, Smith, Jones, Commission, university, man, black, MP, party, personal, easy, leg, office, deputy

both attract: electric, director’s, editor’s, empty, leather, wooden, plastic, comfy, manager’s, executive’s, folding, reclining, former, comfortable, vacant, cane, garden, presenter’s, upholstered, opposite, captain’s, Labour, speaker’s, favourite, leader’s, covers, nearby, courtside, independent, minister’s, facing, non-executive, upright, child’s, uncomfortable, next, vacated, padded, currently, Democrat

5.1.2.3 Linguistic analysis and interpretation

In Table 5, it can be seen that, in our journalistic corpus, each word specifically repels the other’s phrasal “completives”; seat thus repels woman, national, Commission, university, man, easy, leg, office, deputy, and a list of surnames of chairmen; while chair repels country and family (as in country seat and family seat in reference to ancestral homes), child, West, East, South, prices, numbers, train, parliament. Extrapolating further, one can say that the commoner word seat appears to repel references to holding or carrying out academic or public office, particularly in the role of chairperson, while the rarer word chair repels references to the
holding of inherited territory or property, the gaining of seats in government, seats and seat prices in the theatre or cinema, and seats or seating spaces in cars and aeroplanes (and possibly other forms of transport). In other words, they are differentiated referentially as follows: a seat is generally an inherited property or parliamentary representation, or alternatively a kind of paid seating space, while a chair is generally a role as chairperson in academic or other public contexts, including governmental.

As for the information derivable from the right-hand column of Table 5 about the collocates shared by both seat and chair, we learn that both synonyms can refer to “things to sit on”, with seat capable of referring either to the object as a whole or specifically to its seating pad (e.g. padded, upholstered). The collocates also play a role in their combinations with seat and chair in the context, in themselves having more than one reference. The shared collocates appear to suggest chair more strongly. This is because they do indeed generally collocate more strongly with chair. It is because the word seat itself is much more frequent in the corpus than chair that chair’s collocates also sometimes co-occur with it (that is, corpus frequency is affecting the significance scores). Moreover, where the collocate in question, e.g. reclining, is rare in the corpus, the significance scores for its co-occurrence are bigger. This fact is known, and we have as an ongoing task the modification of statistical thresholds and measures.

5.1.3 Synonyms: ‘argue’ and ‘discuss’

5.1.3.1 Selection

The synonyms argue and discuss were selected as verbs with similar frequencies, argue occurring 35,419 times and discuss 34,660 times. Intuitively, they might be regarded as differing in terms of degree; folk wisdom regards the verb argue as meaning [discuss + (the semantic component of) ‘anger’].
We were interested in seeing whether this was confirmed in real text.

5.1.3.2 Identifying degrees of association between synonyms

As shown in Table 6, argue and discuss have similar frequencies and thus similar numbers of strong collocates (244 and 277). The number of strong collocates that argue and discuss share is smaller than the number they repel. The words argue and discuss share only 14 strong collocates; argue repels 58 words and discuss repels 19 words in the shared collocate space. The fact that argue repels about three times as many collocates as discuss implies that discuss is used more selectively in the corpus.

Table 6. argue and discuss repulsion cross-tabulation

                                discuss (span 1)
argue                repulsion   weak collocation   strong collocation   Total
repulsion            -           -                  58                   58
weak collocation     -           -                  172                  172
strong collocation   19          244                14                   277
Total                19          244                244                  507

5.1.3.3 Linguistic analysis and interpretation

The pattern of repulsion, as revealed in Tables 6 and 7, seems to be as follows. The word discuss particularly repels words which refer to people and institutions who are cited in newspapers as being in a position to put forward an argument: people, companies, industry, authorities, scientists, analysts, banks, etc.6 On the other hand, it does not significantly repel any words identifying what the argument or proposition might be (since these are rendered too disparate by the favoured syntax of the word argue for any right-hand collocates to emerge as sufficiently statistically significant to count).

The word argue, meanwhile, repels many words which refer to the problems of society which are reported on in journalism, and ways to deal with them. Specifically, these repelled items consist of adjectives and nouns which designate a particular area or topic of interest, such as political, economic, human, industrial, nuclear; trade, peace, sex, security, safety, arms.
They also comprise adjectives indicating the (time) scale involved: future, possible, potential, propose, regional, global. Argue also repels some discourse-organising nouns with future reference, such as plans, strategy, policies, ideas, policy, efforts. Argue further repels certain progressive verb forms with semantic implications of futurity, such as putting, setting, increasing, buying, moving. © 2007. John Benjamins Publishing Company All rights reserved 430 Antoinette Renouf and Jayeeta Banerjee Table 7. Collocates exhibiting different textual uses of argue and discuss argue repels (discuss attracts) political such interest future things possible security economic human problems terms trade plans potential peace questions sex military personal reports common events drugs ways details individual moving aid global nuclear strategy putting policies progress buying ideas safety England’s arms policy issues sexual setting alternative fears proposed industrial joint efforts increasing turning concerns relations various alleged regional allegations discuss repels (argue attracts) still people companies now instead others industry banks supporters authorities Americans Tories agencies states doctors easily groups heads organisations Scientists Observers Analysts both attract economics let’s writers openly publicly today endlessly even leaders seriously privately ministers authors experts Overall, it was found that argue repels more of the collocates of discuss than vice versa. This can be explained by the fact that the word argue is used to report a proposition or thesis, occurring over 50% (22,736 : 39,540) with the right-hand collocate that, and used as a reporting verb in the construction ‘<NN> argue(d) that…’ with the result that it can precede almost any word, whereas discuss is used more frequently in constructions such as ‘discuss + <JJ> + <NN>…’ or ‘discuss + <Ving> + <JJ> + <NN>…’ . 
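The cross-tabulation reported in Table 6 can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the association scores are invented, the two thresholds are hypothetical, and the reading of the "shared collocate space" as "collocates that are strong for at least one of the two words" is inferred from the populated cells of Table 6 rather than stated in the text. The sketch does not reproduce the project's actual statistical measure.

```python
from collections import Counter

# Hypothetical thresholds on an association score (assumed, not the paper's values).
REPEL_BELOW = -2.0
STRONG_ABOVE = 2.0


def categorise(score):
    """Map an association score to one of the three categories used in the tables."""
    if score <= REPEL_BELOW:
        return "repulsion"
    if score >= STRONG_ABOVE:
        return "strong collocation"
    return "weak collocation"


def cross_tabulate(scores_a, scores_b):
    """Count collocates by (category for word A, category for word B).

    Only collocates scored for both words, and strong for at least one
    of them, are counted (an assumed reading of 'shared collocate space').
    """
    table = Counter()
    for collocate in scores_a.keys() & scores_b.keys():
        cat_a = categorise(scores_a[collocate])
        cat_b = categorise(scores_b[collocate])
        if "strong collocation" in (cat_a, cat_b):
            table[(cat_a, cat_b)] += 1
    return table


# Invented toy scores; only the direction of 'political' and 'people' echoes Table 7.
argue_scores = {"political": -3.1, "people": 4.0, "ministers": 3.5, "today": 0.4, "the": 0.1}
discuss_scores = {"political": 4.2, "people": -2.5, "ministers": 2.8, "today": 2.2, "the": 0.0}

table = cross_tabulate(argue_scores, discuss_scores)
for (cat_argue, cat_discuss), count in sorted(table.items()):
    print(f"argue: {cat_argue:18}  discuss: {cat_discuss:18}  {count}")
```

In this toy data, the cell (repulsion, strong collocation) corresponds to the "argue repels (discuss attracts)" column of Table 7, and (strong collocation, strong collocation) to the "both attract" column.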
What the two synonyms share can be seen in the right-hand column of Table 7 to be particular nouns which seem to denote human subjects who can both “discuss” and “argue”: ministers, authors, experts, leaders, writers. The pair also both collocate with certain adverbs of manner: openly, publicly, privately, endlessly, seriously, which can be employed to characterise both these verbal activities.

5.1.4 Synonyms: ‘pretty’ and ‘attractive’

5.1.4.1 Selection

The synonyms pretty and attractive were automatically extracted from the ‘nymic’ thesaural output of our ACRONYM system as two adjectives which can be synonymous in relation to physical attributes. The word pretty is more than twice as frequent as attractive in the corpus (70,767 and 32,283 respectively). This is as expected, given that pretty in fact functions not just as an evaluative adjective but also as a degree adverbial, while attractive is restricted to the former of these functions.

Table 8. attractive and pretty repulsion cross-tabulation

                                     pretty
attractive            repulsion   weak collocation   strong collocation   Total
repulsion                                                    43              43
weak collocation                                           1324            1324
strong collocation      131            298                  140             569
Total                   131            298                 1507            1936

5.1.4.2 Identifying degrees of association between synonyms

Table 8 shows that though pretty is more frequent in the corpus and has a greater number of strong collocates (1,507) than attractive (569), pretty significantly repels more collocates (131) than attractive (43). This implies that attractive modifies a wider range of words than does pretty. The two words share more collocates (140) than either of them repels. This follows as expected from their synonymic relationship.

5.1.4.3 Linguistic analysis and interpretation

Table 9 shows, perhaps unsurprisingly but logically, that the word pretty repels nouns with human reference which are not described as pretty: man, men, person, people, team, quality. Above all, however, pretty repels words which deal with aspects of inanimate business and finance: deal, offer, price, tax, interest, shares, return, investment, insurance, stock, business, fund, growth, financial, and so on.

The word attractive, in contrast, repels words that refer to states of affairs that often are or are becoming difficult to handle. These include adjectives such as hard, difficult, poor, bad, impossible, complex, unlikely, important, serious, significant, but also more neutral terms, such as average, similar, and some semantically positive items such as sure, clear, high, firm, full, happy, keen. To put this more simply, pretty functions primarily in journalism as an adverbial, used to emphasise the seriousness of bad situations or the positivity of good ones. Meanwhile, it is perhaps a sad reflection on UK broadsheet journalism that attractive is more closely associated with business deals than with human appearance.

Table 9. Collocates exhibiting different textual uses of attractive and pretty

attractive repels (pretty attracts): get, sure, good, kept, got, early, hard, average, already, similar, important, significant, big, unlikely, difficult, firm, close, quickly, getting, happy, serious, nearly, soon, clear, gone, moving, comes, probably, mean, actually, poor, feeling, worked, keen, near, impossible, high, sitting, full, complex, bad, hot, felt

pretty repels (attractive attracts): way, People, man, income, team, performance, deal, subject, tax, markets, interest, single, shares, career, men, price, business, even, financial, insurance, return, property, find, offer, longer, Business, investment, provide, form, buy, market, source, growth, character, prices, target, rate, choice, least, benefits, quality, range, idea, rates, bid, terms, person, products, opportunity, businesses, building, figure, stock, policies, areas, ideas, compared, partner

both attract: look, looks, sight, particularly, young, confident, blonde, girl, sounds, extremely, looked, woman, girls, slim, seem, football, damned, looking, village, seemed, sound, still, intelligent, conventionally, enough, exceptionally, women, pictures, terribly, little, picture, smart, strikingly, equally, neat, female

Looking to the final column in Table 9, we see that pretty and attractive call a truce, and find common ground in strongly collocating with words dealing with physical good looks and sometimes intellectual calibre. Their shared semantics of good looks is particularly attributed to the female stereotype: girl, woman, women, little, neat, slim, young, blonde, confident. Both pretty and attractive are significantly often preceded by adverbs of degree such as: particularly, extremely, damned, exceptionally, strikingly, terribly.

6. Lexical repulsion between sense-related word-pairs: antonyms

Our attention to antonyms has so far been cursory: our intention is not to take on a rigorous a-priori classification of antonymy, which would be beyond our scope. Instead, we have side-stepped our main research thrust, and selected a couple of antonyms, in the first instance, to gain a different perspective on repulsion. Antonyms represent another subset of the lexicon which is likely to generate restricted and surprising output.
Antonymy poses of course an even greater problem of interpretation than synonymy, since our intuitions are weaker for this aspect of the thesaurus. We know from the outset that, as with synonyms, antonymic pairs are only partially antonymic, and will thus only repel certain of each other’s collocates. Furthermore, they will, in spite of their contrastive meanings, share some collocates.

6.1 Case studies on antonym pairs to demonstrate new repulsion methodology

For this case-study, we have selected just two antonymous pairs, black and white exemplifying complementarity and hot and cold representing gradability.

6.1.1 Antonyms: ‘black’ and ‘white’

6.1.1.1 Selection

The words black and white are complementary antonyms, expressing absolutely opposite concepts. We selected them as an instance of ‘core’ antonyms; that is to say most archetypal in being conceptually related most neutrally of various scales of degree, evaluation and so on (e.g. Carter 1987), and because they both refer to colours, physical appearance and racial type, which we were curious to observe in its effect on the pattern of repulsion.

6.1.1.2 Identifying degrees of association between antonyms

As expected, these core antonyms generally seem to share more words in the strongly collocating group and repel only a few selective words in their shared collocate space. The pair are highly frequent and also fairly equi-frequent in the corpus, black with 139,269 occurrences and white with 120,013, both strongly collocating with 1,232 words and repelling only 77 and 73 words, as seen in Table 10.

Table 10. black and white repulsion cross-tabulation

                                     white
black                 repulsion   weak collocation   strong collocation   Total
repulsion                                                    73              73
weak collocation                                           1513            1513
strong collocation       77           1477                 1232            2786
Total                    77           1477                 2818            4372

6.1.1.3 Linguistic analysis and interpretation

Table 11 shows that both black and white, as expected of core words in the language, do more than denote the obvious contrasts of colour, physical appearance and race. Many of the words repelled by black form established idiomatic phrases with white such as the metaphorical lie, paper, goods, meat, house, palace, heat, spirit; and the literal: board, (great white) hope. Similarly, many words repelled by white combine in metaphors with black, such as day, book, market, the new, rain, mark. Through these runs a common thread that black repels a positive and white a negative connotation.

When white is viewed individually, a repulsion profile unfolds which again reflects the conventionally negative or problematic associations with black, almost exclusively in the racial context. White repels issues referred to as experience, rights, struggle, and topics such as crime, criminal, employment, pensions. On the other hand, white also repels a series of words which associate black positively with success: successful, talented, ambitious, leading, respected, popular, and with successful social positions: leader, leading, governor, politics, shadow, MPs, Tory. These are often regarded as newsworthy in being oldest, first, new. On a different but also positive tack, white repels words which assume a connotation of sophistication in association with black as a colour term: lacquer, bra, Lycra, limousine, moustache.

Table 11. Collocates exhibiting different textual uses of black and white

black repels (white attracts): government, working, health, building, house, hope, board, energy, skills, parliament, hopes, bed, alternative, pictures, goods, regime, rose, week’s, image, fears, room, transport, ruling, eat, buildings, heat, Australian, garden, spirit, winter, lie, dead, landscape, English, photographs, smile, tourists, beach, meat, dominated, clinical, clean, medium, rooms, Red, justice, tender, lies, palace, footage, nights, port, paper

white repels (black attracts): day, market, new, book, first, experience, rights, leader, art, Tory, history, Africa, successful, term, newspaper, presence, mood, radio, Britain’s, criminal, talent, politics, dark

both attract: writing, pepper, bra, rain, leather, artists, England’s, pudding, stuff, pitch, community, Lycra, struggle, people, urban, music, hair, king, crime, velvet, girl, leading, peppercorns, limousine, weekly, teenager, glossy, arts, smoke, lacquer, MPs, man, moustache, mark, shiny, gun, tie, affect, young, joke, trousers, referee, lace, jokes, matt, shadow, armbands, ambitious, thick, governor, box, respected, men, awareness, dress, pensions, ink, unemployment, satin, popular, magic, oldest, Africans

The word black, meanwhile, repels the major semantic areas associated with white. These include race: regime, dominated, ruling, Australian, tourists; but also the colour of décor: room, bed, winter, rose, dead, clinical; and the colour of landscape (through sun or possibly snow): garden, landscape, beach.

Absolute or canonical opposites, particularly the high-frequency, core pairings like black and white, are also most likely to share collocates, since they often co-occur in fixed phrases and frameworks of the kind not black but white, both black and white. These phrases can be metaphorical, as in in black and white (= in writing) as well as literal, as in black and white (= monochromatic). In the latter case, the phrasal nature of black and white leads it to combine as a noun-phrase modifier with words such as pictures, image, photos, montage. When observing repulsion over a span of 1, this can throw the scores out, since the phrases generate a strong collocation score with white but a falsely high repulsion score with black.
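The span-1 effect just described can be illustrated with a small sketch. It assumes that a "span of 1" means the immediately adjacent right-hand word, and it uses an invented three-sentence text rather than corpus data; the function name `right_neighbours` is hypothetical and is not part of the project's software.

```python
from collections import Counter


def right_neighbours(tokens, node):
    """Count the words occurring immediately to the right of `node` (span 1)."""
    counts = Counter()
    for left, right in zip(tokens, tokens[1:]):
        if left == node:
            counts[right] += 1
    return counts


# Invented toy text, not corpus data.
text = (
    "she kept the black and white pictures in a box . "
    "the black and white image was grainy . "
    "he took black and white pictures of the harbour"
)
tokens = text.split()

black_right = right_neighbours(tokens, "black")
white_right = right_neighbours(tokens, "white")

print("black:", dict(black_right))
print("white:", dict(white_right))
```

In this toy text, white is directly followed by pictures and image, whereas black is only ever followed by and. A measure restricted to adjacent words would therefore credit these nouns to white alone and record them as absent next to black, even though they modify the whole phrase black and white.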
The current output also cannot show that black repels classes of multiword phrase such as government/parliamentary/transport white paper. Clearly, all this points to the need to extend this study to units above individual word level.

6.1.2 Antonyms: ‘hot’ and ‘cold’

6.1.2.1 Selection

The terms hot and cold are classic, core gradable antonyms, sitting at two ends of the spectrum but different from black and white in being (more readily) amenable to modification, by adverbs of degree such as very. They were selected for this, and also because they are almost equally frequent in the corpus (44,674 and 45,672 respectively).

6.1.2.2 Identifying degrees of association between antonyms

Table 12 shows that hot and cold share many strong collocates (252) and repel only a few collocates of the other (38 and 28), which probably indicates that they share several functions and contexts purely by virtue of being core items.

Table 12. hot and cold repulsion cross-tabulation

                                     cold
hot                   repulsion   weak collocation   strong collocation   Total
repulsion                                                    28              28
weak collocation                                            436             436
strong collocation       38            564                  252             854
Total                    38            564                  716            1318

6.1.2.3 Linguistic analysis and interpretation

Table 13 shows the words that both hot and cold attract and repel. To some extent, we see similar patterns of repulsion here to those for black and white. These core, multi-functional words override any notion of straightforward antonymy where they each combine strongly in fixed idiomatic expressions. Thus, hot repels the favoured pairings of cold as follows: cold meaning “unsolicited” in call, calling; cold meaning “pitiless” as with heart, truth, reality, eye; cold meaning “not warm”, as with front, dead; and cold meaning “viral infection”, as with caught, common, bad.
There is also a series of verbs which hot repels, which clearly combine phrasally with cold: left, goes, went, gone. Conversely, cold repels the collocates combining significantly with hot in many, more widely semantically-distributed idiomatic combinations as follows: (illegal) money, (immediate) line, (high-achieving) shot, (likely to succeed) favourite, (responsibility-carrying) seats, (strong) competition, (vibrant, sexy) red, (highly-charged) dispute.

The significantly repelling items which are not idiomatically fixed with these two antonyms confirm but also enlarge the picture. There are many food and weather terms in their output: cold repels July, August; hot repels December, January, February, March. There is no doubt that hot, in spite of its equifrequency with cold, is the favoured term in journalism, and so cold repels the many different collocates of hot which denote topical matters, such as issue, issues, news, subject, debates; and which name fashionable topics: crime, sex, band, books, fashion, gay, takeover; as well as journalistic adjectives and adverbs such as currently, quick, latest, new, young, small. Cold also repels collocates of hot in its “sexy” sense: in addition to those collocates in fixed expressions above, we find weekend, video.

Table 13. Collocates exhibiting different textual uses of hot and cold

hot repels (cold attracts): left, went, old, front, gone, common, bad, March, dead, cash, economic, clear, goes, post, call, February, heart, truth, caught, peace, January, December, voters, north, eye, calling, reality, hard

cold repels (hot attracts): money, new, young, line, issue, news, shot, race, keep, subject, issues, debate, books, crime, August, favourite, currently, followed, date, fashion, July, band, became, video, natural, sex, pace, comes, seats, latest, gay, competition, weekend, small, takeover, red, dispute, property

both attract: water, spots, air, weather, potato, summer, oven, springs, bath, dry, spot, drinks, meals, long, milk, baths, showers, hot, shower, breath, day, iron, food, extremely, plate, blowing, unusually, serve, sweet, climate, towels, spring, uncomfortably, really, served, dishes, toast, add, blow, white, exceptionally, cup

On looking at our entire list of collocates for these words, those displaying weak repulsion/collocation included, we discover that there are many further words which do not quite make the threshold for the ‘cold repels…’ group but which are, nevertheless, semantically interesting in being phrasal completives of hot: chocolate, tub, seat, pursuit, topic, dog, cakes, piping, coals, tip, dinners, chilli, etc.

The advantage gained from studying antonyms as well as synonyms is that it contributes a new perspective to our study of repulsion while incidentally furthering our understanding of the nature of sense relations. We shall thus continue with the investigation of other sense relations.

7. Concluding remarks

At the outset, we knew that synonymy and antonymy, for various well-rehearsed reasons (e.g. Palmer 1981), are only partial phenomena, and that synonyms differ in meaning according to the particular functions they each fulfil, and the contexts in which they typically occur, just as antonyms do something like the reverse of this. In our studies so far, we have discovered that where synonyms differ along such parameters, they actively repel certain of each other’s collocates.

A further finding is that the differences between two synonyms which cannot readily be ascertained by consulting the mental lexicon, appearing just to be arbitrary and conventional, show up robustly in our repulsion output (bearing in mind that this is based on journalistic corpus data) as being systematic and explicable differences in terms of register or style, but above all, semantic. Of course, it should be borne in mind that our output lists contain actively repelled words identified specifically according to measures of lexical repulsion, but this does not prevent some of these instances of repulsion from being further explicable in terms of phonological, grammatical or other co-occurrence constraints.

From a language-pedagogic point of view, if lexical repulsion is not just an arbitrary matter of convention but is explicable in terms of semantic and other qualities which in principle should be accessible to us within our mental lexicon, then this finding is rather fundamental: it indicates that what a language learner ideally needs to learn, and to be taught, is a finer awareness of these less intuitively accessible features of words as they are used in text. Unfortunately, this awareness does not seem to be intuitively accessible even to native speaker linguists.
One practical application of our research could accordingly be in the generation of inventories of lexical repulsion lists: lists of words which cannot normally combine naturally with a given headword, for the benefit of language learners and non-native-speakers wishing to optimise the quality of their textual composition.

We are currently using a corpus of UK newspaper texts for analysis but our approach will be extended to other text types; it will of course also be interesting to investigate repulsion patterns in spoken language data. It is hoped that a new ‘repulsion’ measure will emerge from the research, which complements and supplements the use of existing collocation measures in another area of application, NLP. For instance, repulsion scores could be used to advance the field of spell-checking, by identifying well-formed but contextually inappropriately spelled or selected words on the basis of the aberrant presence of dis-preferred or repelled collocates.

If concluding remarks are ordered from specific to general, then perhaps the most general conclusion we can offer is the observation that corpus linguistics is vital in the search for repulsion. We may know of words within our own mental lexicon which we try to avoid, personal shibboleths, just as we prefer some words. But whereas we can perceive lexical collocation (or attraction) each time we read or hear English text, we cannot as individuals observe the phenomenon of lexical repulsion, except through a sizeable cross-section of the shared repository of language that a large text corpus represents.

We have given a taste of our early findings concerning the phenomenon of repulsion as a possible force in text, and in particular lexical repulsion, and this specifically between synonyms and antonyms.
Our plan is to move on to investigate the following areas: modified statistical thresholds and measures, repulsion spans, directionality of repulsion in fixed phrases, as well as text at phrase level, repulsion across sentence boundaries, and the effect of case sensitivity. We have found that semantic distinctions play a major role in lexical repulsion, but we also plan a small-scale study of ‘semantic’ repulsion per se; that is to say, to study word pairs which are semantically incompatible, such as plug and horse. Of course, these words could be identified manually, by semantic componential analysis, but we are interested in discovering how far we can distinguish them by collocational means.

Acknowledgement

We acknowledge with gratitude the financial support for the Repulsion Project awarded by the EPSRC (grant no. EP/D502551/1). The authors would also like to thank the two anonymous reviewers and the editors of this issue for their comments on the various versions of this manuscript.

Notes

1. The metaphor of ‘attraction’ comes from the sciences, where it refers to a type of magnetic relationship, that of being drawn together, that arises between electrically charged particles that are in motion. It is thus a convenient metaphor for the characteristic preferences for cooccurrence that are well-known to exist between two (or more) particular words in text.

2. The Research and Development Unit for English Studies (http://rdues.uce.ac.uk). This Unit began at the University of Birmingham in 1988, moved to the University of Liverpool in 1994, and returned to UCE, Birmingham, in 2004. The Director, Antoinette Renouf, and the statistician, Paul Davies, have been in place since the outset; the remaining research team has changed in personnel through the years. Currently, it is composed of software experts Jayeeta Banerjee, Andrew Kehoe and Matt Gee. The Unit has primarily engaged in English corpus linguistic description which has led to the creation of automated and semiautomated software systems, which in turn extract various types of knowledge about lexical and textual meaning, and language change, from large textual databases, primarily (for convenience) broadsheet journalism.

3. Beeferman et al. (1997), in ‘A Model of Lexical Attraction and Repulsion’, in fact only discuss the ‘lexical exclusion principle’: specifically, that exact word repetition occurs less frequently within shorter text spans.

4. This ‘indifference’ is a complex matter, which we cannot go into fully here. We use it according to a minimal definition, simply for the case where words do not significantly attract or repel each other. A more comprehensive definition would see it as a relative, not absolute, term, both in degree and in function. The construction of a single article or text on a single topic ultimately requires that all words in it are selected interdependently. But here, a whole network of larger-scale relationships involving long-range collocation is probably involved.

5. The shift from the Independent to the Guardian took place in 2000, when the distribution of Independent data moved out of the hands of the Financial Times and into the Lexis-Nexis database, whence it is only retrievable at commercial rates. The Guardian, at the moment of writing, still allows a free download to be made.

6. Though we are viewing this interaction collocationally, there is a grammatical point here: discuss and argue are base verb forms, and each form of the lexeme will only interact positively or negatively with word forms which are grammatically compatible; here, we find that these are plural nouns.

References

Andrews, A. (1990). Unification and Morphological Blocking. Natural Language & Linguistic Theory, 8, 507–57.
Aronoff, M. (1976). Word Formation in Generative Grammar. MIT Press.
Beeferman, D., Berger, A. & Lafferty, J. (1997). A Model of Lexical Attraction and Repulsion. In Proceedings of the 35th Annual Meeting of the ACL and 8th Conference of the EACL (pp. 373–380). Association for Computational Linguistics, NJ, USA.
Blache, P., Guenot, M. L. & Van Rullen, T. (2003). A corpus-based technique for grammar development. In D. Archer, P. Rayson, A. Wilson & T. McEnery (Eds.), Proceedings of Corpus Linguistics 2003 (pp. 123–131). University of Lancaster.
Bonci, A. (2004). Collocations in Italian L2. A case control study. PhD Thesis, Royal Holloway College, University of London.
Carter, R. (1987). Is there a Core Vocabulary? Some Implications for Language Teaching. Applied Linguistics, 8 (2), 178–193. Oxford: Oxford University Press.
Collins Cobuild English Language Dictionary (2006). (5th edition). Sinclair, J. et al. (Eds.). London/Glasgow: William Collins Sons & Co. Ltd.
Firth, J. R. (1957). Modes of meaning. In J. R. Firth: Papers in Linguistics 1934–1951 (pp. 190–215). London: Oxford University Press.
Kim, D. W. (1998). Finding the Reader in Literary Computing. In R. Woolridge, W. McCarty & W. Winder (Eds.), CH Working Papers series, A.11. Jointly publ. with TEXT Technology, 8.1. Wright State University. Available at: http://www.chass.utoronto.ca/epc/chwp/kim/
Kjellmer, G. (2003). Synonymy and corpus work: on almost and nearly. ICAME Journal, 27, 19–27.
Kučera, H. & Francis, W. N. (1967). Computational Analysis of Present-Day American English. Providence: Brown University Press.
Laviosa-Braithwaite, S. (1996). Comparable Corpora: Towards a Corpus Linguistic Methodology for the Empirical Study of Translation. In M. Thelen & B. Lewandowska-Tomaszczyk (Eds.), Translation and Meaning, Part 3 (pp. 153–163). Maastricht.
Pacey, M., Collier, A. J. & Renouf, A. J. (1998). Refining the Automatic Identification of Conceptual Relations in Large-scale Corpora. In E. Charniak (Ed.), Proceedings of the Sixth Workshop on Very Large Corpora, Montreal, 15–16 August 1998 (pp. 76–84). COLING-ACL.
Palmer, F. R. (1981). Semantics. Cambridge: Cambridge University Press.
Renouf, A. J. (2007). Corpus Development 25 years on: from super-corpus to cyber-corpus. In R. Facchinetti (Ed.), Corpus Linguistics twenty-five years on: Selected papers of the twenty-fifth International Conference on English Language Research on Computerised Corpora, Verona, May 2004 (pp. 27–49). Atlanta/New York: Rodopi.
Renouf, A. J. (2001). Lexical Signals of Word Relations. In M. Scott & G. Thompson (Eds.), Patterns of Text: in honour of Michael Hoey (pp. 35–54). Amsterdam/Philadelphia: John Benjamins Publishing Co.
Renouf, A. J. (1996). The ACRONYM Project: Discovering the Textual Thesaurus. In I. Lancashire, C. Meyer & C. Percy (Eds.), Papers from English Language Research on Computerized Corpora (ICAME 16) (pp. 171–187). Amsterdam/New York: Rodopi.
Renouf, A. J. & Banerjee, J. (Forthcoming). The Phenomenon of Repulsion in text. In C. Leclère et al. (Eds.), Special edition of Proceedings of 25th International Conference on Lexis and Grammar, Palermo, Sicily, Sept. 6–10, 2006, Lingvisticae Investigationes, Amsterdam: John Benjamins.
Renouf, A. J. & Banerjee, J. (2007). The Search for Repulsion: a new corpus analytical approach. In eVARIENG: Methodological Interfaces as online Proceedings of 27th International ICAME Conference, May 2006, Hanasaari, Finland. Available at: http://www.helsinki.fi/varieng/journal/volumes/index.html
Resnik, P. (1997). Selectional Preference and Sense Disambiguation. Presented at the ACL SIGLEX Workshop on Tagging Text with Lexical Semantics: Why, What, and How? April 4–5, 1997, Washington, D.C.
Suzuki, S. (1998). A Typological Investigation of Dissimilation. Unpublished doctoral dissertation, University of Arizona.
Yip, M. (1998). Identity avoidance in phonology and morphology. In S. LaPointe, D. Brentari & P. Farrell (Eds.), Morphology and its Relation to Phonology and Syntax (pp. 216–246). Stanford, CA: CSLI Publications.

Authors’ address:
Antoinette Renouf and Jayeeta Banerjee
Research and Development Unit for English Studies, School of English, University of Central England in Birmingham
Perry Barr, Birmingham B42 2SU
[email protected], [email protected]