CHAPTER 1 THE STUDY OF COLLOCATIONS

1.0 Introduction

'Collocations' are usually described as "sequences of lexical items which habitually co-occur [i.e. occur together]" (Cruse 1986:40). Examples of English collocations are: 'thick eyebrows', 'sour milk', 'to collect stamps', 'to commit suicide', 'to reject a proposal'. The term collocation was first introduced by Firth, who considered that meaning by collocation is lexical meaning "at the syntagmatic level" (Firth 1957:196). The syntagmatic and paradigmatic relations of lexical items can be schematically represented by two axes: a horizontal and a vertical one. The paradigmatic axis is the vertical axis and comprises sets of words that belong to the same class and can be substituted for one another in a specific grammatical and lexical context. The horizontal axis of language is the syntagmatic axis and refers to a word's ability to combine with other words. Thus, in the sentence 'John ate the apple' the word 'apple' stands in paradigmatic relation with 'orange', 'sandwich', 'steak', 'chocolate', 'cake', etc., and in syntagmatic relation with the words 'ate' and 'John'. Collocations represent lexical relations along the syntagmatic axis.

Firth's attempt to describe the meaning of a word on the collocational level was innovative in that it looked at the meaning relations between lexical items, not from the old perspective of paradigmatic relations (e.g. synonyms, antonyms) but from the level of syntagmatic relations. Syntagmatic relations between sentence constituents had been widely used by structural linguists (e.g. 'John ate the apple' is a 'Subject-Verb-Object' construction), but not in the study of lexical meaning. Up to now, studies on collocation have not succeeded in defining the concept of collocation rigorously (Cowan 1989:1).
Since the term 'collocation' was introduced by Firth to describe meaning at the syntagmatic level, subsequent linguists and researchers have seldom attempted to define 'collocation' in a more thorough and methodical way. Collocation is still defined as the tendency of a lexical item to co-occur with one or more other words (Halliday, McIntosh & Strevens 1964:33; Ridout & Waldo-Clarke 1970; Backlund 1973, 1976; Seaton 1982; Crystal 1985:55; Cruse 1986:40; Zhang 1993:1).

Although the theoretical treatment of collocations has been inadequate, the teaching of collocations to second language (L2) learners has gained importance during the last decade. For a long time the emphasis in vocabulary learning was on accumulating and memorising lists of word definitions, followed by gap-filling exercises (Robinson 1989:276; Gitsaki 1992; for a review of the development of vocabulary teaching see Carter and McCarthy 1988). However, applied linguists realised that vocabulary skills involve more than the ability to define a word. Suggestions were made for a new approach to vocabulary teaching that would avoid the previous emphasis on words in isolation and on word definitions. The new approach would include an examination of the syntagmatic relations of collocation between lexical items, a skill that is evident in adult native speakers of a language (McCarthy 1984:14-16; Carter 1987:38; Sinclair 1991).

The shift of interest towards lexical learning is also evident in the introduction of a new approach to L2 teaching. The Lexical Approach, as it is outlined by Lewis (1993), regards language as grammaticalised lexis and places the way words combine at the centre of its theoretical perspective (Hewitson & Steele 1993). Lexis becomes the central organising principle of the syllabus, and collocation assumes an important syllabus-generating role (Lewis 1993).
Raising the learners' understanding of the collocations of words is a matter of first-rate importance (McCarthy 1984:21), since the task of learning collocations can present both intralingual and interlingual problems. 'Collocation' as a term describing lexical relations is not well-defined, and unfortunately joining words that are in principle semantically compatible does not always produce acceptable collocations, e.g. 'many thanks' is an acceptable collocation in English but *'several thanks' is not, in the same way that 'strong tea' is well-formed but *'powerful tea' is not. Furthermore, unlike paradigmatic relations between words, which can be the same for different languages, syntagmatic relations are more likely to differ from language to language (Mitchell 1975:10). For example, English speakers 'draw conclusions' while Greek speakers 'βγάζουν συμπεράσματα' [take out conclusions]; in English you have to 'wait for somebody' while in Greek 'περιμένεις κάποιον' [wait somebody]; in English you 'go on a diet' while in Greek 'κάνεις δίαιτα' [do diet]; in English someone who drinks a lot is a 'heavy drinker' while in Greek he is a 'γερό ποτήρι' [strong glass]; in English you 'get in touch with someone', while in Greek you 'έρχεσαι σε επαφή με κάποιον' [come to touch with someone].

The purpose of this thesis is to study syntagmatic lexical relations within a framework that will allow a more thorough treatment of the phenomenon of 'collocation', and to investigate the acquisition process of English collocations by L2 learners as an attempt to describe the possible factors affecting the development of English collocational knowledge.

1.1 The Importance of Collocations in L2 Learning

The importance of collocations for the development of L2 vocabulary and communicative competence has been underscored by a number of linguists and language teachers who recommend the teaching and learning of collocations in the L2 classroom.
Collocation has been considered as a separate level of vocabulary acquisition. Bolinger (1968, 1976) argues that we learn and memorise words in chunks and that most of our "manipulative grasp of words is by way of collocations" (Bolinger 1976:8). The learning of language in segments of collocation size, especially in children, is evidenced by the fact that "the collocate is what the young child produces if you ask him a definition", e.g. a 'hole' is 'a hole in the ground' (Cazden 1972:129, cited in Bolinger 1976:11). Bolinger describes language learning as a continuum starting at the morpheme level with word formation rules, moving to the word level and activating phrase formation rules. The last stage before storage into memory is the level where words enter into collocations. When learning a language people may or may not store a morpheme as such, but they do store phrases. For example, the phrase 'indelible ink' will be stored as a phrase, but few people will analyse the word 'indelible' as having the morpheme 'in-' as a prefix (Bolinger 1968:106).

Among the early advocates for the importance of collocations in L2 learning and their inclusion in L2 teaching is Brown (1974), who suggests that an increase of the students' knowledge of collocation will result in an improvement of their oral and listening comprehension and their reading speed. In an effort to help advanced students achieve a better feel of what is acceptable and what is appropriate, Brown outlines a number of exercises.

The combination of lexical items as a source of difficulty in vocabulary acquisition has been noted by researchers like Korosadowicz-Struzynska (1980), who claims that the learner's mastery of these troublesome combinations, rather than her/his knowledge of single words, should be an indication of her/his progress (Korosadowicz-Struzynska 1980:111).
Korosadowicz-Struzynska reports that students face intralingual and interlingual problems in the use of collocations, and even advanced students who have considerable fluency of expression in a foreign language make collocational errors. The teaching and learning of collocations for production reasons is regarded as essential by Korosadowicz-Struzynska, who also describes certain steps that should be followed in order to promote the teaching of collocations from the initial stages of foreign language learning. These include selection of the most essential words on the basis of usefulness and frequency of occurrence, selection of the most frequent collocations of these words, presentation of these collocations in the most typical contexts, and contrasting any of the selected collocations with the equivalent native-language collocations that could cause interference problems for the learners.

The significant role that conventionalised language forms (idioms, routine formulas and other forms such as collocations) play in the development of foreign language learners' communicative competence is stressed by Yorio (1980). One of the functions of conventionalised forms is that they "make communication more orderly because they are regulatory in nature" (Yorio 1980:438). Realising that random selection on purely subjective grounds from diverse conventionalised language forms is totally inadequate for the purposes of foreign language teaching, Yorio describes a set of criteria for the selection of specific forms to be taught: need, usefulness, productivity, currency, frequency, and ease (Yorio 1980:439).

It has been claimed that prefabricated language chunks and routinized formulas play an important role in acquiring and using language (Nattinger & DeCarrico 1992:1; Nattinger 1980).
Nattinger and DeCarrico have argued that a common characteristic in acquiring a language is the progression from routine to pattern to creative language use (Nattinger & DeCarrico 1992:116). Therefore, it is suggested that the learning of prefabricated language patterns should be promoted in the classroom.

The "apparent rulelessness" of collocations as one factor that interferes with foreign language vocabulary learning has been noted by Laufer (1988). Laufer reports that collocations constitute an essential aspect in the learners' knowledge of vocabulary, and she acknowledges that problems can arise in the learners' use of word combinations. She also suggests that collocations could be found to provide help in many levels of vocabulary development and the development of self-learning strategies such as guessing (Laufer 1988:16).

Realising the foreign language learner's difficulties in learning vocabulary, Cowie (1978, 1981) stresses the importance of the compilation of English dictionaries "in which collocation and examples play a separate but complementary role" (Cowie 1978:131). Cowie points out that "meaning is not the only determinant of the extent and semantic variety of collocating words.... The constraint may be situational" (Cowie 1978:134). For example, in the collocation 'a tea/dinner service of 50 pieces' there is a restriction as to which meals can combine with 'service' (tea, dinner, breakfast, ?luncheon) and their combination is based on cultural factors, i.e. which of these meals it is customary to serve, and whether it is conventional to have separate sets of dishes and plates for each (Cowie 1978:134). As a result, special treatment of the cultural factor of collocability in a learner's dictionary is proposed.
Cowie also suggests the inclusion of 'free word-combinations' that could still cause problems for the foreign language learners, as well as the inclusion of grammatical rules that will indicate the correct grammatical treatment of the included collocations (Cowie 1981:226,232).

The teaching of collocations in the classroom could help students overcome problems of vocabulary, style and usage (Leed & Nakhimovsky 1979). Leed and Nakhimovsky suggest the utilisation of lexical functions, as these are described by Mel'cuk and Zholkovsky (1988) (see Table 1), for the construction of foreign language teaching materials, vocabulary exercises and learners' dictionaries. Leed and Nakhimovsky argue that vocabulary exercises should be based on the findings of a well-structured lexical analysis, in the same way that pronunciation exercises are based on phonology (Leed & Nakhimovsky 1979:111). The theory of lexical functions can provide the basis for the generation of pedagogical exercises that are more consistent, diversified, and elaborate, less arbitrary, and ultimately more effective. Such an approach would help foreign language learners with problems of vocabulary, style and usage, and give teachers a method to produce and carry out lexical exercises in the classroom, as well as concentrate on the teaching of restricted collocations such as 'heavy drinker', 'heavy smoker', 'deep trouble', etc. (Leed & Nakhimovsky 1979:109).

Table 1. Examples of Lexical Functions

Lexical Functions
Syn (to shoot) = to fire [synonym]
Sync (to shoot) = to machine-gun [narrower synonym]
Anti (victory) = defeat [antonym]
Oper1 (analysis) = to perform [be the subject of]
Oper2 (analysis) = to undergo [be the object of]
(Mel'cuk & Zholkovsky 1970:26; Mel'cuk 1981:39)

Teaching phrase-patterns and sentence patterns from the early stages of L2 learning may help vocabulary expansion (Twaddell 1973; Korosadowicz-Struzynska 1980).
Twaddell argues that vocabulary expansion should take place from the intermediate stages of L2 learning onwards, under the condition that "the most habitual parts of language use" such as phrase-patterns and sentence patterns will be "practised and established as early as possible" (Twaddell 1973:63). After those habits have been adequately established, new vocabulary can be assimilated into the L2 patterns. Korosadowicz-Struzynska also suggests that it is reasonable to teach collocations of words to learners from the beginning rather than to arrange remedial courses afterwards, when lexical errors have become fossilised (Korosadowicz-Struzynska 1980:116). She disagrees with Smith's view that "mastery of the utterance should be the culmination of learning, not the beginning" (Smith 1971:42).

It has been argued that the teaching of collocations facilitates vocabulary building for university-bound ESL students (Smith 1983). Smith (1983) illustrates a type of exercise for the teaching of collocations that combines both paradigmatic and syntagmatic relations between words. A number of collocations that are primarily used in academic subjects are selected for teaching, and the key words of these collocations are members of the same semantic field (e.g. 'same', 'identical', 'equivalent', 'parallel', 'equal', 'homogeneous', 'similar'). According to Smith, this type of exercise could prove to be useful in an ESP course.

A "carefully graded curriculum" should include word associations according to Murphy (1983), who treats collocations and word associations as synonymous. Murphy describes 11 steps that foreign language teachers could follow in order to include collocations, word association, famous sayings and catch phrases in their teaching program.

The study of fixed expressions in English has been suggested as a useful starting point for a principled approach to vocabulary learning and teaching (Alexander 1984:132).
Alexander stresses the benefits in the learning process if emphasis is placed "on the three C's of vocabulary learning: collocation, context, and connotation" (Alexander 1984:128).

Contrastive analysis has been suggested as an approach to the teaching of collocations. The main strategy of this approach is the compilation of lists of collocations in the learner's L1 and their equivalents in the target language. Newman (1988) conducted a contrastive analysis of Hebrew and English dress and cooking verbs and their noun/object collocations. Newman suggests that providing learners with words that are described in terms of meaning components, derived from contrastive analysis and collocation restrictions, can prove to be a useful device at the learners' disposal for making conscious distinctions and avoiding lexical errors arising from negative L1 transfer (Newman 1988:303). Therefore, the language learning process should be complemented by frequent practice and immersion to cater for the acquisition of idioms and rigidly restricted collocations, along with meaningful mnemonic operations that will involve the "deliberate exercising of the learner's powers of analysis and creativeness parallel to the characteristics of the transparent freer end of the collocational range" (Newman 1988:304). A similar view is reported by Bahns (1993). He argues that a contrastive analysis of the lexical collocations in the students' L1 and the target language will reveal which collocations have direct translational equivalents and therefore need not be taught, allowing foreign language teaching to concentrate on items for which there is no translational equivalence in the target language.
The studies reviewed above show the importance of teaching collocations to ESL learners, and the necessity of the inclusion of collocations in the second/foreign language curriculum, as this can prove to be beneficial for the development of L2 vocabulary, communicative competence, and language performance. Even though some criteria are offered in order to help teachers decide which collocations to teach, these criteria are arbitrarily established, they are not based on empirical research, and they are by no means conclusive. For example, Brown (1974) recommends that 'normal' collocations should be taught first because they form the basis for 'unusual' collocations (Brown 1974:3), but she does not define the criteria that would help teachers distinguish 'normal' from 'unusual' collocations. In addition, the proposed exercises do not seem to have been constructed systematically: the choice of verbs and nouns to be combined seems random, no criteria are given as a means for establishing the "usefulness" of the collocations provided by the exercises, and the teacher has to rely on her/his own intuition about which of the collocations are more or less useful.

Similarly, Laufer (1988) accepts the view that collocations constitute an essential aspect in the learner's knowledge of vocabulary, and she acknowledges that problems can arise in the learner's use of word combinations, but she nevertheless concentrates on the paradigmatic lexical relations, abandoning collocations to their 'rulelessness'. In addition, Laufer does not explain how the problem of teaching, learning, and use of collocations can be tackled, even though collocations could be found to provide help in many levels of vocabulary development (Laufer 1988).
In Murphy's paper (1983) a number of exercises are outlined for the teaching of collocations, but it is left to the teacher's personal judgement to decide which collocations, word associations and phrases are more useful than others and which ones should be taught first.

These are some of the problems presented by studies prescribing the teaching of collocations. It is apparent that even though the importance of collocations in L2 teaching and learning has been established, the treatment of collocations has been inadequate. There are still decisions to be made as to which collocations should be given priority in the classroom, how many collocations per new word should be taught, how to practise collocations, at which level the teaching of collocations should be attempted, and how the acceptability of specific collocations is to be established.

Finally, the large repertoire of terms employed by linguists and language pedagogists to refer to word combinations includes 'combinations of lexical items' (Korosadowicz-Struzynska 1980), 'conventionalised language forms' (Yorio 1980), 'prefabricated language chunks and routinized formulas' (Nattinger & DeCarrico 1992), 'phrase patterns and sentence patterns' (Twaddell 1973), 'word associations' (Murphy 1983), 'fixed expressions' (Alexander 1984) (see also Kennedy 1990). The variety of terms used underscores the need for a more precise definition of 'collocation' and a method for the systematic classification of individual collocations.

1.2 Collocations in L2 Acquisition Research

There have been a number of studies in L2 acquisition research that investigated how the knowledge and use of collocations by students at different levels of proficiency affect their communicative competence and language performance, and so established the importance of collocations in L2 learning.
In her effort to identify the main factors in L2 acquisition for academic achievement, Saville-Troike (1984) studied a group of nineteen non-English speaking elementary school students who were subsequently taught and tested in English. The longitudinal study revealed that the most usual verbal interaction patterns consisted of the use of English routines such as 'don't do' and 'that's mine' (Saville-Troike 1984:207) and that vocabulary knowledge in English is the most important aspect of L2 competence for academic achievement (Saville-Troike 1984:216). Students progressed from simply repeating after the teacher, to nodding or shaking the head, to using single words, and finally to using phrase and sentence patterns. These patterns and routines can be considered as collocations since they are word combinations, and hence Saville-Troike's study shows that collocations are essential for communicative interaction even from the initial stages of L2 acquisition.

In an experiment carried out by Bahns and Eldaw (1993), a translation and a cloze task were used to test German post-secondary learners' active knowledge of 15 English verb-noun 'lexical collocations' (i.e. collocations that included words belonging to open-class categories, and excluding words such as prepositions, articles or conjunctions). The German collocations used in the translation test were direct equivalents of the English collocations. In the cloze test there were 15 sentences, each containing one verb-noun collocation with the verb missing. The analysis of the data revealed that the subjects produced more than twice as many errors in their translations of the nouns in the verb-noun collocations as in their translation of general lexical words, while in the cloze test nearly 52% of the responses were grammatically or collocationally unacceptable to a native speaker of English. The results show that for advanced ESL students collocations present a major problem in the production of correct English.
The results also indicate that the learners' knowledge of collocations does not expand in parallel with their knowledge of general vocabulary, since they could not identify the specific verb-noun collocations, although they could use general lexical items. Also, the learners' inability to paraphrase collocational phrases suggests that "a knowledge of collocations is essential to full communicative mastery of English" (Bahns & Eldaw 1993:109). Bahns and Eldaw suggest that the results of their study are due to the fact that collocations are not taught explicitly in the classroom and therefore learners do not pay any attention to learning them (Bahns & Eldaw 1993:109).

Verb-noun collocations were also tested by Aghbar (1990) in a writing task based on the assumption that the use of formulaic language should be considered in assessing native and non-native English proficiency. Aghbar defines formulaic language as language chunks that are used and learnt together. He reports that "collocations are the less obvious examples of formulaic language", possibly because they are not fixed in the same way that idioms and proverbs are (Aghbar 1990:2). The writing test consisted of 50 sentences, appropriate for formal written contexts, with each sentence containing one formulaic verb-noun expression. In each of these expressions the verb was missing and the participants had to provide the verb most likely to be used in a formal written context. The results showed that ESL students did well where 'get' was the desirable word. However, they used 'get' even when other more specific and more appropriate verbs were needed. For example, 'This is an opportunity for you to _______ knowledge in your field of study' could be filled with 'get' but also with other more appropriate verbs such as 'acquire', 'accumulate', 'gain', 'demonstrate', 'display' etc.
Aghbar attributes the poor ESL performance in the test to the "lack of acquisition of those language chunks that make discourse fluent and idiomatic" (Aghbar 1990:6). The results also showed that the performance of American students was similar to that of ESL students, indicating that even the native undergraduates' knowledge of the collocations used in formal written language was inadequate.

Similarly, 200 undergraduate third and fourth year Jordanian students majoring in English performed poorly in a multiple choice test conducted by Fayez-Hussein (1990), who aimed to assess the students' ability to collocate words correctly in English. The multiple choice test consisted of 40 sentences, with each sentence containing an incomplete collocation (i.e. idioms, fixed expressions, restricted collocations). The collocations tested were mainly noun-noun, adjective-noun, and verb-noun phrases. The students' performance on the test (only 48.4% of the collocations were answered correctly) was found unsatisfactory. Almost half of the incorrect responses were found to be due to negative transfer from L1, e.g. in item 5 'By the weekend the death _________ had reached 95', 51% of the subjects chose 'death number' instead of 'death toll'. Unfamiliarity with the structure of the particular idioms and fixed expressions was another major factor for incorrect responses, e.g. in item 21 'The first voyage of a new ship is referred to as a __________ voyage', 45.5% of the subjects selected 'primary voyage' instead of 'maiden voyage'. Finally, the students' tendency to use generic terms instead of specific ones accounted for 38.3% of incorrect responses, e.g. in item 29 'After the current repairs of the city's water supply system, ________ water will be safe for drinking', 48.5% of the subjects chose 'pipe water' instead of 'tap water'.
Fayez-Hussein lists a number of reasons for the students' inadequate knowledge of English collocations: the neglect of lexicon in the teaching and learning of English as a foreign language, the students' insufficient reading experience (which is assumed to restrict their knowledge of vocabulary, synonyms, lexical restrictions, etc.), the reduction and simplification that takes place in the teaching of a foreign language (which can encourage students to use oversimplified generalisations), and the subjects' overuse of guessing strategies in answering the test items. The latter could have also been encouraged by the format of the test, i.e. multiple choice test items.

The lack of emphasis which most syllabuses place on vocabulary has been reported as the main reason for the frequency of learners' lexical errors (e.g. collocational errors, over-use of a few general items) by Channell (1981). A group of eight advanced students of English were asked to fill in a 'collocational grid' which had the adjectives 'handsome, pretty, charming, lovely' as its vertical axis, and the nouns 'woman, man, child, dog, bird, flower, weather, landscape, view, house, furniture, bed, picture, dress, present, voice' as its horizontal axis. The test showed that the students failed to mark a large number of acceptable collocations, even though they were very familiar with the words involved in the test. Channell concludes that it is essential that learners realise the potential of words they know and of the new words they learn, and she recommends that syllabuses should take into account two things about every new word the learner needs to learn: how it relates to other words with similar meaning, and which other words it can be used with and in which contexts (Channell 1981:116).
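The collocational grid described above can be pictured as a simple adjective-by-noun table in which each cell is marked if the combination is judged acceptable. The following is a minimal sketch in Python, offered only as an illustration: the two axes are those reported for Channell's grid, but the reference key (`REFERENCE`), the learner's responses (`LEARNER`) and the function names are hypothetical and are not taken from Channell (1981).

```python
# Axes as reported above for Channell's (1981) collocational grid.
ADJECTIVES = ["handsome", "pretty", "charming", "lovely"]
NOUNS = ["woman", "man", "child", "dog", "bird", "flower", "weather",
         "landscape", "view", "house", "furniture", "bed", "picture",
         "dress", "present", "voice"]

# ASSUMPTION: a small, partial reference key chosen for illustration only;
# it is not Channell's answer key.
REFERENCE = {("handsome", "man"), ("pretty", "woman"), ("pretty", "dress"),
             ("charming", "voice"), ("lovely", "weather"), ("lovely", "view")}

# ASSUMPTION: hypothetical responses from one learner.
LEARNER = {("handsome", "man"), ("pretty", "woman"), ("lovely", "weather")}


def render_grid(marked):
    """Return the grid as text: one row per adjective, '+' for a marked cell."""
    lines = [" " * 10 + " ".join(noun[:4].ljust(4) for noun in NOUNS)]
    for adjective in ADJECTIVES:
        cells = []
        for noun in NOUNS:
            mark = "+" if (adjective, noun) in marked else "."
            cells.append(mark.ljust(4))
        lines.append(adjective.ljust(10) + " ".join(cells))
    return "\n".join(lines)


def missed(reference, learner):
    """List the collocations in the reference key that the learner left unmarked."""
    return sorted(reference - learner)


if __name__ == "__main__":
    print(render_grid(LEARNER))
    print(missed(REFERENCE, LEARNER))
```

Comparing a learner's marks with a reference key in this way makes visible the kind of gap Channell reports, namely acceptable combinations of familiar words that the learner does not mark.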
An analysis of the writing of four Arab college ESL students by Elkatib (1984) showed unfamiliarity with collocation as well as overuse of a few general lexical items to be among the eight main types of lexical errors that were recorded. In a further analysis of the collocational errors, Elkatib observes that the learners knew the basic meaning of the lexical item but they did not know its collocative patterns, which resulted in the use of erroneous collocations such as 'beautiful noise', 'shooting stones', 'I increased a hundred marks', 'do progress'. Elkatib concludes that new words should be presented in company with their most typical collocations in the form of example sentences or of collocation grids like the ones proposed by Channell (1981). The importance of such a practice derives from the fact that "students often fail to realise the potential even of words they know well, because they use them only in a limited number of collocations of which they are sure" (Elkatib 1984:50).

The analysis of frequent words and their collocations was used in order to assess the writing proficiency of primary school students in Singapore (Ghadessy 1989). Writing samples of grade three (8-9 years old) and grade six (11-12 years old) students were analysed using the KWIC (key-word in context) method. It was found that grade three students used content words (i.e. nouns, verbs, adjectives and adverbs) more frequently than grade six students, who showed a more frequent use of function words (i.e. articles, pronouns, prepositions, etc.) (Ghadessy 1989:113). According to Ghadessy, the frequent use of function words is indicative of a more advanced use of collocations, grammatical patterns and cohesive devices on the part of grade six students (Ghadessy 1989:114). Ghadessy reports that looking at the collocations students use is a valid way of investigating what happens during their development towards a full linguistic communicative competence, i.e. by looking at the collocations of nouns, one can draw conclusions about the development of the students' ability to use premodification and postmodification of nouns. For example, in Ghadessy's study all students used premodification (e.g. 'tall tree', 'tennis ball', 'shady tree') more frequently than postmodification (e.g. 'the tree that...', 'a tree near the place that...', 'the tree which...'), which appeared mainly in the writings of grade six students. Therefore, it appears that postmodification is a more complex skill that develops at later stages of L2 learning, and as such it may be used as an indicator of a more advanced level of language acquisition.

The use of collocations in the writings of native and non-native college freshmen was examined by Zhang (1993). Samples of written essays, as well as a fifty-item blank filling test containing 21 types of collocation (11 grammatical and 10 lexical ones), were analysed in order to examine any associations between collocational knowledge (as this was measured by the blank filling tests) and writing quality, on the one hand, and the use of collocations in the students' essays and writing quality on the other. The results show that collocational knowledge is a source of fluency in written communication, and also that the quality of collocations in terms of variety and accuracy is indicative of the quality of college freshmen writing. An interesting result in Zhang's study is that the use of more grammatical collocations (e.g. SV to Inf) and fewer lexical collocations (e.g. Verb Adverb) (see section 1.5 for definitions) was found to be characteristic of the writing in native Good writers and non-native Poor writers (Zhang 1993:168). Zhang considers this result indicative of the development that takes place as non-native speakers develop from poor writers to good writers to native-like writers.
Even though Zhang did not test subjects from different proficiency levels, he anticipates that learners at the lower levels of English proficiency use more grammatical collocations and fewer lexical collocations in their writing, and whatever collocations they do use are poor in variety and accuracy. As learners progress to intermediate levels they produce a greater variety of collocations and fewer collocational errors, but they are still dependent on the prefabricated routines they have acquired, and thus they use more lexical collocations than grammatical ones. At higher levels of English proficiency learners have a better knowledge of grammatical collocations and they are able to use the analysed parts to create new ones, resulting in fewer lexical collocations and more grammatical ones (Zhang 1993:169). Zhang's study suggests that there is some kind of development in collocational knowledge as L2 learners proceed from low language proficiency to more advanced language proficiency.

In an investigation of possible ways of facilitating L2 vocabulary learning, Cohen and Aphek (1981) concluded that intermediate level students find tasks with contextualised words (average 77% correct) easier than tasks involving lists of words, which in turn are easier for beginners (average 84% correct) (Cohen & Aphek 1981:225). Thus, teaching words in their collocations could be beneficial for intermediate level students but not for elementary students.

Overall, the use of correct collocations in the reviewed studies was found to be indicative of a higher level of language proficiency, and the lack of collocational knowledge was found to impair language performance. Even though the above studies pursued similar goals, i.e. to reveal that a limited knowledge of collocations inhibits language performance and that the teaching of collocations in the L2 classroom is necessary, they present a number of limitations.
Some of the studies were limited to the examination of a small number of collocations, usually belonging to the same pattern (verb-noun collocations in Bahns & Eldaw 1993; Aghbar 1990; adjective-noun collocations in Channell 1981). The elicitation procedures differed from study to study, making their results difficult to compare (translation and cloze test in Bahns & Eldaw 1993; blank filling in Aghbar 1990; collocational grid in Channell 1981; multiple choice test in Fayez-Houssein 1990; analysis of written performance in Ghadessy 1989; Elkatib 1984; essay writing and blank filling in Zhang 1993) (for a critique of the use of multiple choice tests and open-choice tests in the investigation of collocational knowledge see Aghbar & Tang 1991). Some studies contained only a small number of subjects (eight subjects in Channell 1981; four in Elkatib 1984; nineteen in Saville-Troike 1984; Cohen & Aphek 1981). There is no common theoretical framework for the study of collocations, i.e. the studies are mainly descriptions of the problems that learners have with collocations (word combinations, routinized patterns, phrase patterns, etc.). With the exception of Zhang's (1993) study, where a number of collocational patterns are identified and systematically tested, the studies lack systematicity and methodology in the selection of the collocations they tested, which was based mainly on native speaker intuitions.

Due to these limitations the study of the acquisition of collocations is still in need of systematic and methodologically sound research, while a common framework for the study of collocations is yet to be established. The following section outlines the different approaches to the study of collocations in an attempt to construct a theoretical framework as the basis of the present study.
1.3 Approaches to the Study of Collocations

Since the 1960s there have been three main approaches to the study of collocations, focusing on different aspects of the phenomenon of collocation. In this study, these approaches are referred to as: the lexical composition approach, the semantic approach, and the structural approach. The lexical composition approach characterises collocation as a different level of lexical meaning. The semantic approach attempts to predict the collocates of lexical units by reference to their semantic features. The structural approach examines collocations using grammatical patterns. Each approach is described in more detail in the following sections.

1.3.1 The Lexical Composition Approach

The lexical composition approach in the study of collocations is based on the assumption that words receive their meaning from the words they co-occur with. Among those who perceived collocations as a lexical phenomenon independent of grammar is Firth, who is also regarded as the 'father' of the term "collocation". Collocation according to Firth is a "mode of meaning". Just as the light of mixed wave-lengths disperses into a spectrum, "the lexical meaning of any given word is achieved by multiple statements of meaning at different levels", e.g. the orthographic level, phonological level, grammatical level, and collocational level (Firth 1957:192). For example, the meaning of the word 'peer' is described by Firth in the following way: at the orthographic level the group of letters 'peer' is distinguished from the group of 'pier'. Next the pronunciation is stated, then at the grammatical level we state whether 'peer' is a noun or a verb, and by making such statements at the grammatical level we make explicit a further component of meaning. Also, formal and etymological meaning may be added, together with social indications of usage (Firth 1957:192).
Finally, at the collocational level, one of the meanings of the word 'peer' is its collocation with 'school', as in 'school peers'. Firth highlights the "general rule" that every word entering a new context is a new word. Firth also distinguishes contextual meaning from meaning by collocation, and attempts a classification of collocations into "general or usual collocations and more restricted technical or personal collocations", though unfortunately without any further elaboration (Firth 1957:195). Even though Firth does not enter into a thorough exploration of a theory of collocations, he uses collocation in his book as a technique for the stylistic criticism of literary works, e.g. personal or 'unusual' collocations can reflect personal idiosyncratic styles in the use of language (for the use of collocations in the stylistic analysis of literature, see Behre 1967).

Halliday (1966) and Sinclair (1966) took Firth's theory of meaning one step forward and stressed the importance of lexical collocations, i.e. collocations that consist of lexical items, in an integrated lexical theory. The so-called Neo-Firthians attempted the study of lexis as a distinct linguistic level. Sinclair saw Grammar and Lexis as two 'interpenetrating ways' of looking at language form (Sinclair 1966:411), and Halliday argued that lexical theory is complementary to, but not part of, grammatical theory (Halliday 1966:148). Grammar organises language as a system of choices and whatever patterns and/or items fail to "resolve themselves into systems" are listed at the end of each grammatical description (Sinclair 1966:411). 'Lexis', on the other hand, is devoted to the study and description of individual lexical items and their collocational tendencies that cannot be dealt with by grammar, since they are not a matter of choice (one rather than another) but of likelihood of occurrence, i.e. "there are virtually no impossible collocations, but some are more likely than others" (Sinclair 1966:411), e.g. the collocation 'this lemon is sweet' could be considered unusual except in the context of somebody exclaiming over a child's still-life painting (McIntosh 1961:329).

The Neo-Firthians also introduced a new set of linguistic terms related to the study of collocations. They used the term Node to refer to a lexical item whose collocations are being studied, Span to refer to the number of lexical items on either side of the node that are considered to be relevant to the node, and Collocates to refer to those items that are in the environment defined by the span (Sinclair 1966:415). For example, when we study the collocational patterns of 'tea', 'tea' is the node. If we decide to have a span of 3, that means we study the 3 lexical items that occur before and the 3 that occur after 'tea'. All the lexical items that are within the span of the word 'tea' are considered to be its collocates.

To the extent that words are specified by their collocational environment, similarities in their collocational restrictions enable linguists to group lexical items into "lexical sets", i.e. sets of words with similar collocational restrictions. For example, the words 'bright', 'shine' and 'light' are members of the same lexical set because they are frequent collocates of the word 'moon' (Halliday 1966:156). Along the same lines, the lexical items 'bright', 'hot', 'shine', 'light', 'lie' and 'come out' are all members of the same lexical set because they all collocate with the item 'sun' (Halliday 1966:158). The criterion for a lexical item to enter a lexical set is its syntagmatic relation to a specific lexical item (i.e. its collocation with a specific word) rather than its paradigmatic relation to that lexical item.
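The node, span and collocate terminology described above lends itself to a small illustration. The following is a minimal sketch, not a method taken from the sources cited: the function name `collocates` and the toy sentence are invented for illustration, and, as a simplification, every token (including grammatical words) is counted as an item within the span.

```python
from collections import Counter

def collocates(tokens, node, span=3):
    """Count the items found within `span` positions on either side
    of every occurrence of `node` in the token list."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != node:
            continue
        lo = max(0, i - span)                 # left edge of the span
        hi = min(len(tokens), i + span + 1)   # right edge (exclusive)
        for j in range(lo, hi):
            if j != i:                        # the node is not its own collocate
                counts[tokens[j]] += 1
    return counts

# Toy text invented for illustration; it is not corpus data.
text = "she made a strong cup of tea and he drank the hot tea slowly"
tokens = text.split()

tea = collocates(tokens, "tea", span=3)
print(tea.most_common())
```

Under this sketch, the keys of the returned counter are the collocates of the node; collecting the frequent collocates of one node gives a rough approximation of a lexical set defined by syntagmatic relation to that node.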
For example, lexical items like 'strong' and 'powerful' are considered members of the same lexical set because they collocate with the lexical item 'argument', e.g. 'strong argument' and 'powerful argument'. As far as other collocates are concerned, e.g. 'car' and 'tea', the lexical items 'strong' and 'powerful' will enter different lexical sets, i.e. 'strong' will be a member of the lexical set defined by 'tea', and 'powerful' will be a member of the lexical set defined by 'car' (Halliday 1966:152). Halliday is also interested in the collocational patterns that lexical items enter. For example, 'a strong argument' presents the same collocational pattern as 'the strength of his argument' and 'he argued strongly'. Since 'strong', 'strength', and 'strongly' are parts of the same collocational pattern, they are considered word-forms of the same lexical item (Halliday 1966:151). Halliday also points out that lexical items need not have any formal relationship to one another in order to collocate. For example, 'strong' and 'argument' could be in different sentences: 'I wasn't convinced of his argument. He had some strong points but they could all be met'.

What Halliday refers to as 'collocational pattern' McIntosh calls 'collocational range' in order to distinguish it from its grammatical equivalent, i.e. 'pattern', which has to do with the structure of the sentences we produce, while 'collocational range' has to do with the specific collocations we produce in a series of particular instances (McIntosh 1961:337; McIntosh & Halliday 1966). McIntosh also argues that since collocations are the material out of which sentences are made, collocational range should be taken into account within the dictates of pattern when dealing with the text of actual sentences.

A theory of lexical meaning similar to the one outlined by Firth and the Neo-Firthians is suggested by Anthony (1975).
Even though Anthony was not involved directly in the study of collocation, his proposed theory treats the lexical word as an empty form capable of bonds to different kinds of meaning (Anthony 1975:22). Each lexical word becomes a discourse word when it is used in ordinary discourse, and the particular meaning which is in focus is called its lexical meaning. For example, the lexical word 'pitch' can mean many things, i.e. it is capable of bonds to different kinds of meaning (a throwing action, a tar-like substance, something musical, etc.). The moment 'pitch' is used communicatively in a group of other words and becomes a discourse word, a small portion of its repertory of meanings is in focus and this becomes its lexical meaning, e.g. in the sentence 'pitch the ball to me', 'pitch' receives the meaning of 'a throwing action'. Anthony also remarks that a word that occurs in one grammatical construction differs in lexical meaning from the same word in another construction. For example, the use of 'mother' as a verb has a different referential meaning from the use of 'mother' as a noun.

Collocation has also been identified by Halliday and Hasan as a form of lexical cohesion, and it has been defined as the "cohesive effect" of pairs of words such as 'bee...honey' and 'king...crown' which "depends not so much on any systematic semantic relationship as on their tendency to share the same lexical environment, to occur in COLLOCATION with one another" (Halliday & Hasan 1976:286). However, 'collocational cohesion', as it is used by Halliday and Hasan, is simply "a cover term" for textual cohesion, a kind of "semantic interlace that provides texts with their texture - their non-structural cohesion or lexical form" (Addison 1983:3), and leaves the "specific kinds of co-occurrence which are variable and complex" to be dealt with by "a general semantic description of the English language" (Halliday & Hasan 1976:287-288).
Halliday and Hasan's definition of collocation serves the task of textual analysis, but it is restricted to lexically predictable collocational chains that extend beyond the boundaries of a sentence. Furthermore, it does not pay attention to idiosyncratic and unpredictable co-occurrences of words that are not semantically or environmentally, in a physical sense, associated with each other, e.g. there is nothing obvious in the meaning of 'tea' that explains why it collocates with 'strong' but not with 'powerful'.

The main problem with lexical analysis has been identified as "the circularity of the definition of the basic unit of description, the lexical item" (Sinclair 1966:412). That is, every item is described in terms of its environment, which in its turn is defined in terms of the item. For example, one of the meanings of 'night' is its collocability (i.e. ability to collocate) with 'dark', and of 'dark', its collocation with 'night' (Firth 1957:196). This circularity makes lexical statements look weaker and less precise than grammatical ones, which are based on a well-defined and explicit framework.

One of the strengths of the lexical composition approach is that it drew attention to lexis and uncovered the insufficiency of grammatical analysis to account for the 'patterns' a word enters into, in the Hallidayan sense, and the collocational idiosyncrasies of lexical items. The Neo-Firthians argue that grammar alone cannot describe what the lexical item is, therefore the lexical item "must be identified within Lexis, on the basis of collocation" (Halliday, McIntosh & Strevens 1964:35). Sinclair and Halliday do not underestimate the importance of grammatical analysis; they rather highlight the significance of being able to make valid statements about lexis that do not disregard but complement grammar.
However, the Neo-Firthians admit that they do not know "how far collocational patterns are dependent on the structural relations into which the items enter" (Halliday 1966:159), and therefore it is essential to examine collocational patterns in their grammatical environments. In other words, the advocates of the lexical composition approach recommend that collocational patterns are best described and analysed through lexical analysis, but they do admit that help from grammar is still needed.

1.3.2 The Semantic Approach

Collocation as a linguistic phenomenon associated with lexical semantics was described as early as 2,300 years ago. Greek Stoic philosophers, according to Robins (1967), rejected the equation of "one word, one meaning" and shed light on an important aspect of the semantic structure of language: "word meanings do not exist in isolation, and they may differ according to the collocation in which they are used" (Robins 1967:21). In parallel to the lexical composition approach, where linguists recognised lexis as a level of analysis of language separate from grammar, in the semantic approach linguists attempted to investigate collocations on the basis of a semantic framework, also separate from grammar.

Chomsky was among the first to suggest the treatment of collocations by semantics. Even though Chomsky did not examine collocations, he distinguished between 'strict subcategorisation rules', i.e. rules that "analyze a symbol in terms of its categorial context", and 'selectional rules', i.e. rules which "analyze a symbol in terms of syntactic features of the frames in which it appears" (Chomsky 1965:95). These rules assist the generation of grammatical strings. The breaking of strict subcategorisation rules will result in strings such as 'John found sad' and 'John became Bill to leave', while failure to observe the selectional rules will give examples like 'Colorless green ideas sleep furiously' (Chomsky 1965:149).
He then finds that selectional rules play a marginal role in the grammar and suggests that they should be dropped from the syntax and be taken over by semantics.

The Neo-Firthians' approach to the study of collocations was found inadequate by semanticists because it sorts lexical items into sets according to their collocations, but it does not explain why there are lexical items that collocate only with certain other lexical items. In the lexical composition approach collocations and sets are studied as if the combinatorial processes of language were arbitrary (Lehrer 1974:176). Firth's theory of meaning was found to be insufficient for the study of collocations (Lyons 1966). Lyons claims that Firth's definition of 'meaning' as a "complex of contextual relations" is puzzling, and he criticises the apparent lack of principles by means of which "lexical groups by association" can be established and "lexical sets" can be defined (Lyons 1966:289-297). Overall, Lyons proposes an abandonment of Firth's theory of meaning, in which the statement of meaning by collocation was introduced, because it does not coincide with well-established theories of meaning and language description and furthermore there are other "more important meaning relations" which must be accounted for in a theory of meaning (Lyons 1966:295).

Even though Lyons seems to agree that 'collocations' restricted to "syntagms (or collocations) composed of a noun and a verb or a noun and an adjective" (Lyons 1977:261) are worthy of study by the semanticist, he does not believe that a separate collocational level has to be established. Lyons also proposes that collocations should be studied only as part of the synchronic and diachronic analysis of language. For the study of collocations Lyons proposes the notion of "lexical fields" founded upon "the relations of sense holding between pairs of syntagmatically connected lexemes" (Lyons 1977:261).
However, he advises against going to the extreme of "defining the meaning of a lexeme to be no more than the set of its collocations" (Lyons 1977:265-268). He then proceeds to describe the principles of a strong version of field-theory as if the vocabulary of a language were a closed set of lexemes with each lexeme being a member of no more than one field. However, the vocabulary of a language is an open system, and lexemes do belong to different fields due to their different meanings. Therefore, the study of vocabulary in a theory of lexical fields based on syntagmatic relations presents problems. These problems led Lyons to suggest that descriptive semantics can get along well without syntagmatic relations (Lyons 1977:268). Thus, Lyons decides to deal with the 'more important' paradigmatic relations of sense in his study of semantics, setting aside the study of syntagmatic relations altogether.

Even though Lyons (1977) provided only a criticism of the Firthian theory of meaning, there have been other semanticists who tried to put together a theory of lexical meaning based on the semantic properties of lexical units. This approach is the semantic approach to the study of collocations. According to the semantic approach, the meaning of a lexical item is perceived as a combination of the semantic properties of that item. It is the semantic properties of a lexical item that determine its collocates. Just as the Neo-Firthians tried to establish lexis as distinct from grammar, the semanticists also tried to establish a semantic theory that is different from, but complementary to, grammar. Katz and Fodor (1963) describe a semantic theory that would organise, systematise, and generalise facts about meaning (Katz & Fodor 1963:170). They state that a semantic theory of a language would "take over the explanation of the speaker's ability to produce and understand new sentences at the point where grammar leaves off" (Katz & Fodor 1963:172-173).
They accept that one component of a semantic theory of a language is a dictionary of that language, and they proceed to describe the semantic markers for a few lexical entries of a model dictionary of English. According to the semantic theory proposed by Katz and Fodor, each entry for a lexical item in the dictionary must contain a selection restriction, i.e. a condition for that particular lexical item to combine with others. For example, the lexical item 'sleep' would require a subject with the feature [Animate], and the lexical item 'break' would require as object something that is a [Physical object] and [Rigid]. Because under the semantic approach each lexical item is defined by semantic markers based on its meaning or meanings, Lehrer (1974) argues that the semantic approach is more likely to explain why certain words can be found together.

In his examination of syntagmatic meaning relations between lexical units, Cruse describes collocational restrictions as co-occurrence restrictions that are arbitrarily established (Cruse 1986:279). For example, 'kick the bucket' can only be used with human beings, although its propositional meaning is simply 'die' and not 'die in a characteristically human way'. Similarly, 'blond' refers to hair, but describing a hairy animal or a fur coat as 'blond' would be unacceptable. Cruse also distinguishes three kinds of collocational restrictions: systematic, semi-systematic and idiosyncratic, according to whether, and if so to what degree, the semantic properties of a lexical item set up an expectation of a certain collocant. Lexical units that illustrate systematic collocational restrictions are 'grill' and 'toast'. Both verbs denote the same process from the point of view of the agent, but take different patients: normally we 'grill' food that is raw, while we 'toast' food that is already cooked.
Semi-systematic are those collocational restrictions that still behave as presuppositions of the lexical item in question, but there can be certain exceptions to the general tendency. For example, 'customers' obtain something material in exchange for money, while a 'client' receives a less tangible professional or technical service. So, butchers, bakers, and grocers have 'customers', but solicitors and architects have 'clients'. However, banks seem to have 'customers' rather than 'clients' (Cruse 1986:281). Finally, for lexical items that present idiosyncratic collocational restrictions, their collocational ranges can only be described by enumerating all their acceptable collocants (Cruse 1986:281). For example, one can 'pay attention/a visit' but not ?'pay a greeting or welcome'. Idiosyncratic collocational preferences, such as 'flawless performance' but not *'unblemished performance', do not give rise to presuppositions, according to the semantic approach, and Cruse wonders whether "idiosyncratic restrictions are a matter of semantics at all" (Cruse 1986:282).

A close study of what collocational restrictions can contribute to the sentence they are used in is fully justified, since they are not 'logically' necessary. For example, 'die' and 'pass away' have the same meaning, but 'pass away' refers to human beings, so the use of 'pass away' in the sentence 'My grandfather passed away' adds semantic cohesion to it; if it is used to describe the death of a pet animal then it anthropomorphises the animal (Cruse 1986:280). Given the difficulty of syntagmatic relations, Cruse (1986), like most lexical semanticists, finds that paradigmatic sense relations are "a richer vein to mine than relations of the syntagmatic variety" (Cruse 1986:86).
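The contrast drawn above between feature-based selection restrictions (Katz and Fodor) and idiosyncratic restrictions that can only be enumerated (Cruse) can be made concrete with a small sketch. Everything in it is hypothetical and for illustration only: the tiny feature lexicon, the dictionary names and the two helper functions are assumptions, and only the feature labels and the 'sleep', 'break' and 'pay' examples come from the discussion above.

```python
# Hypothetical feature lexicon; the feature labels follow the examples in the text.
FEATURES = {
    "dog": {"Animate", "Physical object"},
    "vase": {"Physical object", "Rigid"},
    "idea": set(),
}

# Selection restrictions: the features a verb requires of one of its arguments.
SELECTION = {
    ("sleep", "subject"): {"Animate"},
    ("break", "object"): {"Physical object", "Rigid"},
}

# Idiosyncratic restrictions cannot be derived from features;
# the acceptable collocants are simply listed.
ENUMERATED = {
    "pay": {"attention", "a visit"},
}

def satisfies(verb, role, noun):
    """True if the noun carries every feature the verb requires for that role."""
    return SELECTION[(verb, role)] <= FEATURES[noun]

def listed(verb, collocant):
    """True if the collocant is among the enumerated collocants of the verb."""
    return collocant in ENUMERATED.get(verb, set())

print(satisfies("sleep", "subject", "dog"))   # feature check
print(listed("pay", "a greeting"))            # list lookup
```

The design point is that the first mechanism generalises to any noun once its features are known, whereas the second grows only by adding entries one at a time.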
One of the weaknesses of the semantic approach - the view that co-occurrence of words is the result of their semantic properties - is that there is a large number of idiosyncratic co-occurrences or combinations that are arbitrarily restricted (see Cruse's examples above). These constructions create problems for the study of collocations under a theory of lexical fields, and therefore they are left unexplained and treated as marginal by semanticists. To return to Halliday's example, since there is nothing in the meaning of 'tea' to explain why it collocates with 'strong' but not with 'powerful', according to the semantic approach 'strong tea' will be listed as an idiom and as such it will be ignored in a study of lexical semantics. Furthermore, as Lehrer (1974) points out, finding semantic features for each lexical item that would account for all its collocates is an extremely ambitious task (Lehrer 1974:178). Fillmore (1978) also points out the difficulty of estimating the magnitude of collocational binding between lexical items, while he acknowledges the fact that a semantic theory must not accept the suggestion that all meanings must be described in the same terms.

An example of how the semantic approach to the study of collocations can best be utilised is the Explanatory Combinatorial Dictionary (ECD), a prototype dictionary model intended to be applicable to any language. The ECD is related to the Meaning-Text theory, which defines language as "a specific system of correspondences between an infinite set of meanings and an infinite set of texts" (Mel'cuk 1988:167).
As a core component of the Meaning-Text Model, the ECD, according to Mel'cuk (1988): "ensures the lexicalisation of the initial meaning (i.e., of semantic representation), uniting bundles of configurations of semantic elements into actual lexical units and supplying the enormous bulk of syntactic and lexical co-occurrence information that accrues from the individual lexical units of the language in question" (Mel'cuk 1988:167).

Each ECD entry is divided into three zones: a semantic zone, a syntactic zone and a lexical co-occurrence zone. The latter comprises all the restricted lexical co-occurrences of the entry lexeme. For this purpose, Mel'cuk and Zholkovsky, the ECD initiators, devised the concept of Lexical Functions that describe all the paradigmatic and syntagmatic relations that a lexeme can have with other lexemes (Mel'cuk & Zholkovsky 1988:42). This approach resulted in a large number of standard basic lexical functions, some of which had already been utilised in dictionaries for several decades (e.g. 'Syn' for synonyms) while others were new (e.g. 'Instr', a preposition meaning 'by means of', and 'Propt', a preposition meaning 'because of', 'as a result of') (see Table 1, above). In the ECD version for French, Dictionnaire Explicatif et Combinatoire du Français Contemporain, there are 53 lexical functions listed, and these are used together with the other semantic and syntactic information for the description of 50 lexical items. Mel'cuk and Zholkovsky are considered pioneers for the lexicographic principles and the heuristic criteria they used for the compilation of the ECD. The fact that only 50 lexical items were described in the French ECD underlines the extreme difficulty of listing all the semantic features of lexical items in an effort to account for all their collocates.
Despite its limitations, the ECD could be used as "a central component of automatic text synthesis and analysis", as a "format" for the development of textbooks, pedagogically oriented dictionaries, and reference works, and it can also contribute to language theory (Mel'cuk & Zholkovsky 1988:66-67).

Even though semanticists claimed that syntagmatic lexical relations should be studied under the scope of semantics, they did not proceed any further with the study of collocation and they did not make the phenomenon of 'collocation' any more explicit. Due to the irregularities and idiosyncrasies that collocations present, semanticists, who played a role similar to that of grammarians (i.e. assigning semantic labels to sentence constituents and examining generalisable tendencies and regularities), preferred to study the more regular paradigmatic lexical relations, abandoning collocations to their rulelessness.

1.3.3 The Structural Approach

The structural approach is represented by those linguists and researchers who suggest that collocation is influenced by structure, and that collocations occur in patterns. Therefore, the structural approach recommends that the study of collocations should include grammar. The Neo-Firthians' view of separating lexical analysis from grammatical analysis was criticised by Mitchell (1971), who argues for the "one-ness of grammar, lexis and meaning" (Mitchell 1971:43). The interdependence of grammar and lexicon is evident from the fact that 'lexical particularities' derive their meaning not only from contextual extension of a lexical kind but also from the generalised grammatical patterns in which they appear (Mitchell 1971:48). For the study of collocations, Mitchell proposes that "collocations [which are 'of roots' rather than 'of words'] are to be studied within grammatical matrices" (Mitchell 1971:65).
In a group of word forms like 'drinks', 'drinker' and 'drinking' Mitchell abstracts the element common to the word forms and labels it the 'root', e.g. /drink-, and labels the associations of different roots, e.g. /drink- and /heav-, as 'collocations', e.g. 'heavy drinker', 'drink heavily' (Mitchell 1971:51). Mitchell refers to the collocation 'heavy drinker' as an exemplification of the colligation 'adjective + agentive noun' (Mitchell 1966:337). The relationship between 'collocation' and 'colligation' is one of generality: 'colligations' are the generalised classes of associations and 'collocations' are their particular members (Mitchell 1971:53). Mitchell's view that collocations are of roots rather than of words does not hold for every combination of roots. For example, 'faint praise' is an acceptable English collocation, but not all combinations of the two roots, /faint- and /praise-, produce acceptable collocations: 'she was damned by faint praise' is acceptable, but 'he praised her faintly' is not.

Matthews (1965) proposes another way of studying collocations within grammar. He suggests enriching Chomsky's syntax with extra sets of rules that will account for the selectional restrictions on lexical items. This approach deals with the syntagmatic relations along a string of lexical items, a 'kernel colligation' (p. 38), rather than with individual collocational relations of pairs of words, but Matthews realises that such a description of the language involves double or triple the number of rules when compared to a description on the lines of Chomsky's syntax (Matthews 1965). Matthews' theory suggested the study of syntagmatic relations, and consequently of collocation, along the lines of transformational grammar, but it was not developed any further.

The influence of grammar on collocation was also discussed by Greenbaum (1970, 1974), who pointed out that certain instances of collocation require syntactic information.
For example, 'much' collocates with 'prefer' when it is in a pre-verb position as in 'I much prefer a dry wine', but it does not collocate with 'prefer' in post-object position as in *'I prefer a dry wine much' (Greenbaum 1974:82). Greenbaum suggests that the collocability of words (i.e. their potential co-occurrence with other lexical items) should be "tied" to syntax, and realises that there are certain lexical items that can occur only in certain syntactic relationships, e.g. 'His sincerity frightens us' but not 'We frighten his sincerity' (Greenbaum 1974:82). Without reference to syntax, the notion of collocability becomes vacuous - virtually any two items can co-occur at a given arbitrary distance. For example, 'sincerity' can collocate with 'frighten', but the acceptability of the combinations they produce can only be judged via syntax.

The notion of language blocks and lexicalised sentence stems was introduced by Pawley and Syder (1983), who suggest that if a learner is going to achieve a native-like control of a language, then along with the rules of a generative grammar, she/he needs to "learn a means for knowing which of the well-formed sentences are native-like -- a way of distinguishing those sentences that are normal or unmarked from those that are unnatural or highly marked" (Pawley & Syder 1983:194). Pawley and Syder propose a new way of examining native-like selection and fluency. According to their approach, learners memorise a language in blocks, and a large portion of a native speaker's lexicon consists of "lexicalised sentence stems". For example, an expression of apology like 'I'm sorry to keep you waiting' gives the sentence stem 'NP be-TENSE sorry to keep-TENSE you waiting'; the constituents of this sentence are its 'inflections' and any additional constituents (e.g. 'all this time') are its 'expansions' (Pawley & Syder 1983:210).
According to Pawley and Syder, lexicalisation belongs to the domain of competence, and a sentence stem can be lexicalised if it is a standard expression of the meaning in question in a particular community, or if it is an "arbitrary choice, in terms of linguistic structure, for the role of standard expression". For example, 'it's twenty to six' is a standard expression in English since it is a convention that one tells 'to [Hour]' rather than 'preceding [Hour]' or 'before [Hour]', and 'I want to marry you' is an arbitrarily established standard usage, compared to a less standard paraphrase such as 'I wish to be wedded to you', which could be used in a formal letter or a satirical speech (Pawley & Syder 1983:198). As with most of the theories examined so far, Pawley and Syder do not define the notion of lexicalised sentence stems any further, and they do not offer an explicit list of sentence stems that could be used as a framework in the study of collocations.

The view that language consists of blocks or 'chunks' was also supported by Nattinger and DeCarrico (1992), who proposed the compilation of a lexical phrase dictionary for L2 learners. Nattinger and DeCarrico give the following examples of lexical phrases for inclusion in the dictionary: Conversational Maintenance (regularities of conversational interaction that describe how conversations begin, continue and end). Summoning: Excuse/pardon me (sustained intonation); Hey/hi/hello, (Name); How are you (doing)?; I didn't catch/get your name; Do you live around here?; Hello, I'm + NAME; Good morning/afternoon/evening, (how are you); What's up? (Nattinger & DeCarrico 1993). From the examples of lexical phrases, as these were presented by Nattinger and DeCarrico, it appears that lexical phrases are not the same as collocations or lexicalised sentence stems. Lexical phrases appear to be more general than collocations and less systematic than lexicalised sentence stems.
Also, Nattinger and DeCarrico are not concerned with explaining why certain lexical phrases are put together, which would be more useful for the study of collocations.

A set of criteria for examining whether a combination of words is a collocation or not is outlined by Kjellmer (1984), who also suggests the study of collocations in a grammatical framework. Kjellmer defines collocations as "lexically determined and grammatically restricted sequences of words" (Kjellmer 1984:163). According to this definition, only recurring sequences that are grammatically well-formed can be considered collocations. For example, during a search of the Brown Corpus, Kjellmer found the following sequences: 'green ideas', 'try to', 'hall to'. Of these strings, only 'hall to' and 'try to' recur, and of these two, only 'try to' is grammatically well-formed. Therefore, only 'try to' is a collocation (Kjellmer 1984:163). Kjellmer also tries to establish a set of rules for assessing 'collocational distinctiveness'. According to these, a sequence is highly distinctive when it appears frequently in many different categories of texts; it is long (minimum length is two words); and it is structurally complex.

Renouf and Sinclair (1991), on the other hand, applied their theory of studying collocations to 'frameworks' consisting of discontinuous sequences of two words, whose grammatical well-formedness depends on what intervenes, e.g. 'a + ? + of', 'too + ? + to' (Renouf & Sinclair 1991:128). They found that in some cases one member of the pair seems to exert a stronger collocational pull on some items than on others, e.g. in the framework 'too + ? + to', 'to' is able to collocate with 'easy', 'hard', 'good' and 'proud' even in the absence of 'too', e.g. 'easy to do', 'good to do', but not with 'much' or 'tired', which require the presence of 'too', e.g. 'too tired to dance', 'too full to eat' (Renouf & Sinclair 1991:133).
Thus, Renouf and Sinclair demonstrated that the collocations of grammatical words offer an appropriate basis for studying collocations, since "co-occurrences in the language most commonly occur among grammatical words" (Renouf & Sinclair 1991:128).

The importance of grammatical words for the study of collocations was also confirmed by Jones and Sinclair (1974). Even though their study on English lexical collocations was based on a relatively small corpus (147,000 running words), it yielded some interesting results concerning the study of collocation. The influence of the node does not extend beyond span position Node (N) + 4 (see also Berry-Rogghe 1973). Grammatical words are not collocationally neutral (contrary to Haskel 1971). Even though grammatical words are weak at predicting their environment, they do show an ability to predict word classes at specific span positions, e.g. the collocates of the word 'the' in position N-1 are mainly verbs and prepositions, while in position N+1 they are nouns and adjectives. The significance of a collocation takes into account the overall frequency of the two items concerned, the number of times they occur together, and the length of the text. Collocations can appear to be 'text dependent'. Verbs tend to collocate with grammatical items, e.g. 'put' and 'take' collocate with a great number of prepositions to form phrasal verbs. Association between lexical items is subject to grammatical influence, e.g. the adjective 'good' is preceded by adverbs and followed by nouns as significant collocates. Significant collocations show a considerable amount of position dependence, e.g. in a span of 4, significant collocations most frequently occur in the span positions immediately next to the node, N-1 and N+1, while very few occur at the two extremes of the span, N-4 and N+4.
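The notion of span position used by Jones and Sinclair can be illustrated with a short sketch. The code below is not their procedure: it is a minimal Python illustration, assuming that a text is simply a list of lower-case tokens, of how the collocates of a node might be tallied at each position from N-4 to N+4. The function name collocates_by_position and the toy sentence are invented for this illustration, and no significance measure is computed.

```python
from collections import Counter, defaultdict


def collocates_by_position(tokens, node, span=4):
    """Tally the words found at each offset (-span..+span) around every
    occurrence of `node`. Offset -1 corresponds to position N-1, and so on."""
    counts = defaultdict(Counter)
    for i, token in enumerate(tokens):
        if token != node:
            continue
        for offset in range(-span, span + 1):
            j = i + offset
            # Skip the node itself and positions outside the text.
            if offset == 0 or j < 0 or j >= len(tokens):
                continue
            counts[offset][tokens[j]] += 1
    return counts


# Toy text (hypothetical), already lower-cased and split on whitespace.
tokens = "the good man gave the good dog a very good bone".split()
counts = collocates_by_position(tokens, "good")

# Words immediately before (N-1) and after (N+1) the node 'good'.
print(dict(counts[-1]))
print(dict(counts[1]))
```

With a real corpus, comparing such positional tallies with the overall frequency of each collocate would be the next step towards the kind of significance measure described above.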
Finally, Jones and Sinclair found collocation to be an organising principle that influences the construction and interpretation of utterances (Jones & Sinclair 1974:48; Leitner 1992).

The study of collocations in structural patterns was also suggested by Aisenstadt (1979). Aisenstadt distinguishes collocability restrictions as part of the wide field of collocability. Word combinations whose constituents are restricted in their 'commutability', i.e. their ability to combine with other words, are called restricted collocations (Aisenstadt 1979:71). Restricted collocations are defined as combinations of two or more words used in one of their regular, non-idiomatic meanings, following certain structural patterns (e.g. V+(art)+(A)+N), and restricted in their commutability not only by grammatical and semantic valency (e.g. in the restricted collocation 'shrug one's shoulders' both components have a narrow semantic valency), but also by usage (e.g. we can 'bear a grudge' but we cannot *'bear hatred/scorn') (Aisenstadt 1979:71, 1981:54). Restricted collocations are different from free word-combinations. For example, 'carry' can enter a large number of free word-combinations when it means 'to support the weight of something', as in 'carry a book/bag/chair/torch/table/etc.', but it may also enter a restricted collocation pattern, as in 'carry conviction', 'carry persuasion', 'carry weight', when it is used to denote 'being convincing' or 'winning the argument' (Aisenstadt 1979:72). Some of the structural patterns of restricted collocations in English listed by Aisenstadt are given below in Table 2.

Table 2. Examples of structural patterns of restricted collocations in English

Pattern                 Example
V+(art)+(A)+N           'command devotion', 'give a loud laugh'
V+prep+(art)+(A)+N      'leap to a sudden conclusion', 'leap to a decision'
A+N                     'cogent argument'
V+Adv                   'take off', 'take away', 'sit down'
I+A                     'dead tired', 'dead drunk', 'stark naked'

Note: V = Verb, art = Article, A = Adjective, N = Noun, prep = Preposition, Adv = Adverb, I = Intensifier

Aisenstadt also reports that restricted collocations have not yet been studied adequately as a specific linguistic phenomenon, and therefore they have not received proper treatment in lexicography: some of them are listed alongside free word combinations and others are listed in dictionaries of idioms as idioms (Aisenstadt 1981:53). Aisenstadt concludes that a study of restricted collocations is of great importance for applied linguists, translators, lexicographers, language teachers and students.

The structure-based studies make clear that collocational restrictions do not apply only to lexical words (as the other two approaches assume) but also to grammatical words. Furthermore, studies such as Jones and Sinclair (1974), Renouf and Sinclair (1991), and Aisenstadt (1979) show that it is possible to study collocations using structural patterns. Thus, there is no need for the debate among linguists over whether collocations should be described using lexical analysis, or semantic rules and/or grammar rules. It is possible that by structurally defining and isolating a particular collocational pattern and examining its frequency, variability and systematicity in a language corpus, the notion of collocation could be enriched.

Benson, Benson and Ilson (1986a) compiled the BBI Combinatory Dictionary of English, a dictionary of English collocations. The difference between the BBI and the ECD, examined earlier on, is that the BBI includes more lexical items and a less detailed grammatical and lexical treatment.
The BBI writers do not include in their dictionary "free combinations" that are predictable and thus not needed, e.g. the collocation of the verb 'to destroy' with a large number of nouns denoting physical objects like 'bridge', 'house', 'road' etc. (Benson 1985:66; Ilson 1985; Benson et al. 1986a). Fifteen different types of "essential grammatical and lexical recurrent word combinations" are defined and included in the BBI dictionary for "general use" (Benson et al. 1986a:7). The BBI distinguishes between grammatical and lexical collocations in the following way: a grammatical collocation is a phrase that consists of a dominant word (verb, noun, adjective) and a preposition or grammatical structure such as an infinitive or clause. Lexical collocations normally do not contain prepositions, infinitives, or clauses. Typical lexical collocations consist of nouns, adjectives, verbs, and adverbs. Examples of grammatical and lexical collocational patterns are given in Table 3.

Table 3. Examples of Grammatical and Lexical Collocations in the BBI Combinatory Dictionary of English

Code    Pattern                 Example
Grammatical Collocations:
(G4)    preposition + noun      in agony, at anchor
(G8)    verb + to infinitive    decide to come, offer to help
Lexical Collocations:
(L1)    verb + noun             make an impression
(L3)    adjective + noun        long hair
(L4)    noun + verb             dogs bark
(Benson et al. 1986)

The BBI contains seven types of lexical collocations, L1...L7, and eight main types of grammatical collocations, G1...G8, with the eighth type consisting of nineteen English verb patterns, e.g. SVO to O (e.g. 'I gave the book to Mary'), SV V-ing (e.g. 'I started crying'), SV to inf (e.g. 'I want to sleep'), etc. Altogether, there are 33 patterns of grammatical and lexical collocations included in the BBI. One of the disadvantages of the BBI is that its writers do not explain how they established that a word combination is recurrent enough to be included in their dictionary.
Recent advances in corpus analysis provide more accurate examples of significant collocations for inclusion in a dictionary (see COLLINS COBUILD English Words in Use, forthcoming; cited in Bahns 1993; also Collins COBUILD English Collocations on CD-ROM); for the advantages of using corpus analysis in lexicography see also Sinclair (1985) and Greenbaum (1984). The use of language corpora for the detection of collocative semantic lexical relations in the compilation of dictionaries is also suggested by Meijs (1992), Noel (1992), and Sinclair (1992, 1993), and, for the making of a lexical and phraseological grammar, by Francis (1993). Even though the BBI has methodological weaknesses, its major contribution to the study of collocations is that it explicitly defines a number of patterns and, unlike the previous studies on collocations, actually organises the collocations of a large number of words around those patterns, showing that it is possible to use structural patterns in order to study collocations.

1.3.4 Summary of the Three Approaches

The three approaches to the study of collocations focus on different aspects of the phenomenon of collocation. The lexical composition approach regards lexical analysis as independent from grammar and considers lexis an autonomous entity, choosing its own collocates which can be enumerated and classified in lexical sets. The semantic approach tries to find semantic features based on the meaning of lexical units that would enable the prediction of their collocates. The structural approach tries to establish patterns of collocations that include grammatical and lexical words alike. The semantic and the lexical composition approaches are restricted to the study of a small number of collocations (usually 'verb noun' and 'adjective noun' collocations); they exclude grammatical words from their scope, and they have achieved only limited results.
The structural approach, on the other hand, examines more patterns of collocations, includes grammatical words in the study of collocations, and provides a framework for the study of collocations that is feasible and systematic (e.g. the collocational patterns included in the BBI).

1.4 Collocations and Idioms

Before proceeding to the description of the framework employed by the present study on collocations, it is necessary to refer to one of the debates concerning the study of collocations: to what degree collocations are similar to idioms. Along the continuum with free combinations on one end and idioms on the other, collocations seem to fall in the middle, as they blend together the semantic transparency of free combinations and the syntagmatic bonds of idioms. An idiom is usually described as "a constituent or series of constituents for which the semantic interpretation is not a compositional function of the formatives of which it is composed" (Nagy 1978:296). Collocations, although they are combinations of at least two words, exhibit a degree of syntactic frozenness and resistance to lexical substitution; they are semantically transparent; and hence they are not idioms. However, there are certain lexical combinations that are semantically transparent, and therefore should be classified as collocations, but which also show a certain degree of syntactic frozenness and resistance to lexical substitution, just like idioms: for example, 'foot the bill', 'curry flavour', 'high explosive', 'highest confidence'. Such expressions have been called 'bound collocations' (Cruse 1986:41), 'semiproductive expressions' (Nagy 1978:296), and 'partial idioms' (Palmer 1976:99).

There are linguists who do not distinguish between idioms and collocations. For example, Wallace (1979) describes collocations as a class of idioms, as stereotyped expressions that are easily decoded from the meaning of their constituent elements.
Wallace distinguishes two dimensions to the idiom: the dimension of meaning (the semantic dimension) and the dimension of grammatical context (the structural dimension) (Wallace 1979:63). Idioms, according to the degree of their decodability, are classified as 'transparent', if they are easily decoded, or 'opaque'. Idioms falling into the area of transparent stereotypes are called 'restricted collocations', e.g. 'Pleased to meet you', 'be honest with', 'use up'. The semantic approach to the study of collocations also considers lexical co-occurrences that are arbitrarily restricted and so lack a semantic explanation. These are like idioms, i.e. linguistically non-productive, and as such they should be left out of the study of lexical fields (Lehrer 1974:187).

By and large, semantic transparency appears to be the only criterion that could make a difference in the process of classifying expressions as idioms or collocations, while the importance of how clear-cut the distinction is between collocations and idioms seems to vary among linguists, with some arguing that "it is, of course, a matter of terminology whether collocations should be classed separately from idioms or as a major sub-class" (Bolinger 1976:5). This study examines collocations, i.e. word combinations, in terms of the syntactic patterns into which they enter. Therefore, the degree of their semantic transparency is disregarded for the purposes of this study.

1.5 A Framework for the Study of Collocations

For the investigation of the acquisition of collocations, this study adopts a framework based on the structuralist approach. The framework comprises 37 patterns of collocation. Of these, 33 patterns are from the BBI, 2 are extensions of the BBI patterns, and 2 are adapted from Zhang (1993). The use of structural patterns for the study of collocations has been employed in previous studies (Zhang 1993; Bahns & Eldaw 1993; Biscup 1992).
These patterns are utilised in this study in order to operationalise the notion of collocation and to examine the development of English collocational knowledge in L2 learners. In order to avoid confusion between structural/collocational patterns and grammatical patterns, the patterns used in this study are, from now on, referred to as 'types'. 'Type' with a capital 'T' is used for reference to individual collocation types. For a complete list of the 37 types of collocation with examples, see Table 4 below.

Table 4. Types of Collocation used in the study*

TYPE                                    EXAMPLE
1. Noun Preposition                     argument about
2. Noun to Infinitive                   (it was a) pleasure to do it
3. Noun that-clause                     he took an oath that he would do ....
4. Preposition Noun                     in agony
5. Adjective Preposition                angry at
6. Predicate Adjective to Infinitive    she is ready to go
7. Adjective that-clause                she was afraid that she would fail...
8. SVO to O/SVOO                        he sent the book to his brother
9. SVO to O                             they described the book to her
10. SVO for O/SVOO                      she bought a shirt for her husband
11. SV(O) Preposition O                 we export to many countries
12. SV to Infinitive                    they began to speak
13. SV Infinitive                       we must work
14. SV V-ing                            he kept talking
15. SVO to Infinitive                   she asked me to come
16. SVO Infinitive                      she heard them leave
17. SVO V-ing                           I caught them stealing apples
18. SV Possessive V-ing                 they love his clowning
19. SV(O) that-clause                   they admitted that they were wrong
20. SVO to be c                         we consider her to be well trained
21. SVOc                                she dyed her hair red
22. SVOO                                the teacher asked the pupil a question
23. SV(O) Adverbial                     he carried himself well
24. SV(O) wh-word                       he asked how to do it
25. S(it) VO to Infinitive              it surprised me to learn of her decision
26. SVc                                 he was a teacher
27. Verb Noun/Pronoun (creation)        make an impression
28. Verb Noun (eradication)             reject an appeal
29. Adjective Noun                      strong tea
30. Noun Verb                           bees buzz
31. Noun1 of Noun2                      a piece of advice
32. Adverb Adjective                    deeply absorbed
33. Verb Adverb                         affect deeply
34. Noun Noun                           aptitude test
35. Miscellaneous                       in fact
36. Preposition Determiner Noun         on the contrary
37. Phrasal Verb                        to pass on

Note: S: Subject, V: Verb, O: Object, c: complement
* Henceforth, 'Preposition' and 'Prep', 'Adjective' and 'Adj', 'Noun' and 'N', 'Verb' and 'V', 'Infinitive' and 'Inf', 'creation' and 'creat', 'determiner' and 'det' are used interchangeably depending on the availability of space in the tables. See also table of abbreviations.

The categorisation of the above collocation types into lexical and grammatical collocations by the BBI (see section 1.3.3.) was further refined by Zhang (1993). According to Zhang, a lexical collocation is "a type of collocation where one component recurrently co-occurs with one or more other components as the only lexical choice or one of the few lexical choices in a combination" (Zhang 1993:14). A grammatical collocation, on the other hand, is "a type of collocation where one component recurrently co-occurs with one or more other components as a grammatical category, rather than a particular lexical item" (Zhang 1993:14). In other words, if a collocation is lexicalised, i.e. if the combination of an open class word (verb, noun, adjective, adverb) and a preposition or another open class word is used as a single word, e.g. 'to do one's homework', 'to depend on', 'strong in', then it is a lexical collocation. If the collocation is a combination of an open class word (verb, adjective, noun, adverb) and a clause, infinitive, gerund, or preposition, then it is a grammatical collocation, e.g. 'enjoy + V-ing', 'want + to infinitive'. Zhang's definition of lexical and grammatical collocations was found to be more appropriate than the BBI's for pedagogical research, and this study considers the types listed in Table 5 to be lexical collocations (Types 27, 28, 29, 30, 31, 32, and 33 were also defined as lexical collocations by the BBI).

Table 5. Types of Lexical Collocations used in the study

Type
1. Noun Prep
5. Adjective Prep
27. Verb Noun (creation)
28. Verb Noun (eradication)
29. Adjective Noun
30. Noun Verb
31. Noun1 of Noun2
32. Adverb Adjective
33. Verb Adverb
36. Prep Det Noun
37. Phrasal Verb

The use of syntactic structures to operationalise English collocations and to examine the acquisition of an area of vocabulary, i.e. collocations, is considered appropriate for this study for the following reasons:

i) English collocations have already been found to be influenced by structure (see studies under the structuralist approach, above). Also, the classification of English collocations in patterns/types enables a large scale investigation of vocabulary acquisition, i.e. by using types of collocations a larger area of vocabulary will be covered than by using a number of specific collocations.

ii) The use of syntactically defined structures will enable the description of the development of collocational knowledge with respect to types of collocation rather than to a limited number of specific collocations. If collocational knowledge is affected by structure and does develop in terms of collocation types, then the results of this investigation will be applicable to all the specific collocations that belong to a particular collocation type. For example, if certain conclusions can be drawn about how collocational knowledge develops with respect to the 'SV inf' collocations, then the results will hold for all collocations that belong to this type: 'I can sing', 'we must go', 'he might win', etc.

iii) The old debate in linguistics on the division between grammar/syntax and vocabulary did not prove a constructive approach to the description of L2 acquisition.
If vocabulary is not a mere listing of words in memory but combinations of words carrying meaning and governed by syntactic rules, as the studies reviewed in this chapter claim, then investigating the acquisition of vocabulary in combination with syntactic structures will yield a more complete picture of L2 vocabulary acquisition.

The investigation of the development of English collocations at different proficiency levels was considered useful because previous studies have suggested, on the basis of their results, that learners at different levels of proficiency use different types of collocation (Zhang 1993). The aim of the present study is to describe the development of collocational knowledge in L2 learners at different proficiency levels and to investigate whether there are any collocation types that are acquired before others. If different collocation types are used by L2 learners at different levels of proficiency, could it be that there are developmentally determined acquisition orders in the acquisition of English collocations? The following chapter reviews a number of studies on acquisition orders and developmental sequences in L2 acquisition.

CHAPTER 2
SECOND LANGUAGE ACQUISITION AND THE DEVELOPMENT OF COLLOCATIONAL KNOWLEDGE

2.0 Introduction

In the 1970s, research in L1 acquisition provided evidence of developmental patterns and stages that characterise child language acquisition (see Brown 1973). Along similar lines, studies in L2 acquisition investigated how an L2 is acquired and whether it follows a similar developmental route. Theories of L2 acquisition were formulated, deductively or inductively, and research in the L2 classroom flourished. Longitudinal and cross-sectional studies were conducted (for a critique see Meisel, Clahsen & Pienemann 1981; Rosansky 1976) and the data were analysed to reveal "developmental sequences" of L2 acquisition.
These sequences were then compared to L1 developmental sequences and found to be either similar (Ravem 1968, 1970, 1974; Dato 1970; Milon 1974; Gillis & Weber 1976) or different (Wode 1976).

Among the studies investigating L2 development there is great variation in the way language "development" is operationalised. Some studies describe the various "stages" that the learner's interlanguage goes through before a particular language structure is considered to be acquired, e.g. the five stages of the acquisition of word order in German (Meisel et al. 1981). Such stages form a "developmental sequence" that all learners seem to traverse regardless of their native language or the learning context. Other studies describe "acquisition orders" for certain language components, e.g. it has been shown that the acquisition of a number of English morphemes follows such a predetermined acquisition order (see Krashen 1977). Such orders have also been referred to as "accuracy orders" because the criterion for a certain item to enter an order is its accurate use by the L2 learner. Morpheme acquisition orders also support the existence of developmental sequences in L2 acquisition.

The aspects of language most commonly researched for developmental sequences were morphology (Dulay & Burt 1973, 1974; Bailey, Madden & Krashen 1974; Larsen-Freeman 1975; Krashen, Sfelazza, Feldman & Fathman 1976; Mace-Matluck 1977; Fuller 1978; Fathman 1978; Makino 1979; Lightbown 1983) and word order and syntax (Huang 1970; Butterworth 1972; Ravem 1974; Wagner-Gough 1975; Adams 1978; Cazden, Cancino, Rosansky & Schumann 1975; Gillis & Weber 1976; Meisel et al. 1981; Pienemann, Johnston & Brindley 1988). This chapter reviews studies on developmental sequences pertaining to different aspects of L2 acquisition and highlights the motivation for the present study, i.e. the investigation of evidence of development in the acquisition of English collocations.
2.1 Morphology

The Natural Order Hypothesis in Krashen's Monitor Theory suggests that there is a natural order of acquisition of L2 rules. Some of them are early-acquired and some are late-acquired. This order does not necessarily depend on simplicity of form. It can also be influenced by classroom instruction (Krashen 1985). Evidence for the Natural Order Hypothesis was provided by a series of research studies investigating morpheme acquisition orders and showing that grammatical morphemes elicited in free speech and with the use of specifically designed instruments (e.g. the Bilingual Syntax Measure) constitute a natural order of morpheme acquisition for performers (Houck, Robertson & Krashen 1978; Krashen, Houck, Giunchi, Bode, Birnbaum & Strei 1977). Krashen's Natural Order for the acquisition of 9 English morphemes, from the early acquired morphemes (top) to the late acquired ones (bottom), is given below in Table 6.

Table 6. The acquisition of English morphemes

Morpheme
-ing
plural
copula
auxiliary
article
irregular past
regular past
3rd person singular
possessive 's
(Krashen 1977)

Dulay and Burt (1973, 1974) used the Bilingual Syntax Measure (BSM) to elicit speech data from 250 Spanish- and Chinese-speaking children learning English in the USA. They found statistically significantly related acquisition orders for the two groups, but these were different from the order of acquisition for English L1 obtained by Brown (1973) in his longitudinal study of three children. Dulay and Burt's findings were also confirmed by Bailey et al. (1974) in their study of 73 Spanish and non-Spanish ESL adults.

Acquisition orders that were L1-neutral were also found by Larsen-Freeman (1975). She tested the acquisition of ten English morphemes by 24 adults from four different L1 backgrounds (Arabic, Spanish, Japanese, and Farsi) using five different tasks: the BSM speaking task, a reading task, a listening comprehension test, an imitating task, and a writing test.
Larsen-Freeman found that language background did not affect performance in morpheme ordering in a significant way, i.e. there were significantly high coefficients of concordance among the language groups on tasks within the study, and the BSM elicited a very similar order of morphemes for learners from different L1 backgrounds. The BSM ordering from Larsen-Freeman's study and the ordering obtained by Dulay and Burt (1974) correlated highly (rho = .87, significant at the .01 level). The ordering elicited by the imitating task also correlated significantly with the ordering obtained in Dulay and Burt (1974), rho = .60. However, the morpheme orderings that the other three tasks produced had low correlations with Dulay and Burt's study, none of them reaching statistical significance. In an attempt to explain the similar ordering obtained by the BSM in both the Dulay and Burt (1974) and the Larsen-Freeman (1975) studies, Larsen-Freeman suggested that input frequency could be one factor influencing the order, along with other factors (Larsen-Freeman 1975, 1976). Other possible factors are that the learner supplies certain morphemes correctly because she/he is trying to match the gestalt of the speech she/he hears, or because these morphemes occur in speech patterns that she/he has memorised (Larsen-Freeman 1978:100).

Other morpheme studies involved learners from Indo-European and non-Indo-European L1 backgrounds (Mace-Matluck 1977; Fuller 1978), in both second and foreign language learning contexts (Fathman 1978; Makino 1979; Lightbown 1983), and on different tasks (Krashen et al. 1976). Morpheme studies for L2s other than English (e.g. Spanish in van Naersen 1980; Quiche Mayan in Bye 1980; and a 'creoloid' (Singapore English) in Platt 1977) also provided evidence of accuracy orders.
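The rho values reported above compare two rank orderings of the same morphemes. As a minimal sketch, assuming two complete orderings with no tied ranks, Spearman's rank correlation can be computed as shown below. The function name spearman_rho and the second ordering are hypothetical and serve only as an illustration; the first ordering simply repeats the sequence given in Table 6, and the resulting figure is not a finding of any of the studies cited.

```python
def spearman_rho(order_a, order_b):
    """Spearman's rank correlation between two orderings of the same items.

    Assumes no tied ranks: each item appears exactly once in each ordering.
    """
    if len(order_a) != len(order_b) or set(order_a) != set(order_b):
        raise ValueError("both orderings must contain the same items")
    if len(set(order_a)) != len(order_a):
        raise ValueError("items must not repeat within an ordering")
    n = len(order_a)
    if n < 2:
        raise ValueError("at least two items are needed")
    # Rank of each item in the second ordering (1 = earliest acquired).
    rank_b = {item: rank for rank, item in enumerate(order_b, start=1)}
    # Sum of squared rank differences across items.
    d_squared = sum(
        (rank_a - rank_b[item]) ** 2
        for rank_a, item in enumerate(order_a, start=1)
    )
    return 1 - (6 * d_squared) / (n * (n ** 2 - 1))


# Ordering as listed in Table 6.
table6_order = [
    "-ing", "plural", "copula", "auxiliary", "article",
    "irregular past", "regular past", "3rd person singular", "possessive 's",
]

# Hypothetical second ordering: 'auxiliary' and 'article' swapped.
hypothetical_order = [
    "-ing", "plural", "copula", "article", "auxiliary",
    "irregular past", "regular past", "3rd person singular", "possessive 's",
]

rho = spearman_rho(table6_order, hypothetical_order)
print(round(rho, 3))
```

Identical orderings give a rho of 1 and fully reversed orderings give -1, so values such as .87 or .60 indicate how closely two elicited orders agree.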
Evidence was also provided for strong similarities in the L2 acquisition process for learners involved in different learning situations and with different amounts of exposure (Makino 1979), and for the language acquisition processes utilised by adults and children (Krashen et al. 1976).

An alternative to the morpheme order studies is reported by Wode, Bahns, Bedey and Frank (1978). They describe the stages that German children go through while acquiring one morpheme, i.e. the plural in English. The data for this study were from Wode's four children acquiring English naturalistically (without classroom instruction) during a six-month field trip to the U.S.A. Three stages are described:

Stage 1: One form for both singular or plural intention
Stage 2: Two forms for each noun reflecting target singular and plural
Stage 3: Forms with plural target reflexes restricted to plural intention; forms with singular target reflexes restricted to singular intention
(Wode et al. 1978:178-179).

Wode et al. argue that their approach of investigating morpheme order and language acquisition as a developmental process can provide more insights into the mechanisms of the process of language acquisition. However, their approach was limited to the analysis of the acquisition of English plural inflections, and it can only be used for the investigation of the acquisition of morphemes that present a variety of allomorphs, like the English plural.

Although these results strongly suggest that common accuracy and acquisition orders in morphemes are evident across L2 learners, there are certain shortcomings in the morpheme studies. Research did not provide enough empirical support for a theoretical explanation of the developmental sequences (e.g. for a critique of Krashen's Monitor Theory see Gregg 1984). Also, only a tiny portion of English grammar was studied, and the acquisition orders obtained represented a linguistically heterogeneous group of bound and free NP and VP morphemes.
The methodology was also criticised for using a limited number of elicitation methods (mainly the BSM, which has been claimed not to be a valid instrument for measuring the sequence of morpheme acquisition; for a critique of the BSM see Porter 1977). However, even though these orders are not rigidly invariant across studies, they are far from being random (Krashen 1977; Larsen-Freeman & Long 1991).

2.2 Syntax

Empirical evidence for developmental sequences in the area of syntax is also available. Studies identified developmental sequences for the acquisition of ESL interrogatives (Huang 1970; Butterworth 1972; Ravem 1970, 1974; Young 1974; Wagner-Gough 1975; Adams 1978; Cazden, Cancino, Rosansky & Schumann 1975; Gillis & Weber 1976; for a review see Larsen-Freeman & Long 1991). Four stages of interrogative formation in ESL were identified:

Stage 1. Rising intonation, e.g. He work today?
Stage 2. Uninverted Wh-word, with or without an auxiliary, e.g. What he (is) saying?
Stage 3. Overinversion, e.g. Do you know where is it?
Stage 4. Differentiation, e.g. Does she like where she lives?
(examples from Larsen-Freeman & Long 1991:93)

Four stages of acquisition were also identified for negation in ESL:

Stage 1: no + X, e.g. 'No book', 'No you playing here'
Stage 2: no/don't Verb, e.g. 'He don't have job'
Stage 3: auxiliary-negation, e.g. 'I can't play the guitar'
Stage 4: analysed don't, e.g. 'She doesn't drink alcohol'
(examples from Larsen-Freeman & Long 1991:94; for a review see Schumann 1979).

Studies in German word order acquisition yielded a five-stage model of development in the acquisition of German L2 (Meisel et al. 1981). The Multidimensional Model provided a theoretical basis for the observed acquisition order and was further extended to ESL acquisition (Pienemann & Johnston 1987).
According to the model, invariant developmental stages in the acquisition of certain morphological and syntactic elements in both German and English can be predicted and explained in terms of "hierarchically ordered speech processing constraints" (Pienemann, Johnston & Brindley 1988:217). Based on the same data, Pienemann (1984, 1985, 1989) suggests that formal input impedes rather than promotes language acquisition, and that the formal instruction of syntax can therefore be abandoned (see also Dulay & Burt 1973). For a review of the debate on whether instruction affects L2 acquisition see Long (1983); for a critique of the Multidimensional Model see Hudson (1993).

The acquisition of relative clauses in ESL was also investigated and found to follow a developmental route similar to that found in some L1 acquisition studies (Schumann 1980).

Apart from studies describing developmental stages for the acquisition of a single syntactic structure, there have also been studies that investigated the existence of acquisition orders of grammatical structures. Fathman (1977) tested the usage of 20 grammatical structures by 500 non-native English-speaking children learning English in public schools in the United States. She found difficulty orders (or learning orders) that were similar for students coming from different language backgrounds and ages. Fathman suggests that the forms found to be used correctly early in the learning of L2 are those which are needed for effective communication.

Difficulty orders were also found by Yamada and Matsuura (1982) in the acquisition of English articles by Japanese students. Yamada and Matsuura reported that the definite article was the easiest for both intermediate and advanced students. The zero article was most difficult for the intermediate students, while the indefinite article remained most difficult for the advanced level students.
In a functional approach to linguistic universals in L2 acquisition research, Keenan and Comrie (1977) constructed the Accessibility Hierarchy for Relativisation. They argue that the degree of difficulty for relativising on a particular noun phrase proceeds along an implicational order. For example, sentences with NP in subject position are predicted to relativise more easily than sentences with NP in direct object position. Keenan and Comrie suggest that the Accessibility Hierarchy could be considered as an acceptability ordering within each language and used for the explanation of syntactic processes in learners' interlanguage.

A number of studies have used the Accessibility Hierarchy for testing predictions concerning ease or difficulty of acquisition. Gass (1979) and Gass and Ard (1980) tested relative clause formation in English by learners from different L1 backgrounds. The results indicate that learners followed the constraints of the Accessibility Hierarchy in their English regardless of their L1 background. All learners found it easier to relativise sentences with NP in subject position than sentences with NP in direct object position.

Markedness was also examined as a factor affecting L2 acquisition in the Principles and Parameters approach (White 1989). Although there are a number of definitions of markedness, most of them treat as marked those structures which are exceptions to linguistic generalisations, or which are of low frequency across the world's languages, or which are very complex (White 1989:117). Markedness has been used to make predictions about L1 and L2 acquisition. It has been claimed that developmental sequences of language structures based on the criterion of markedness can predict ease or difficulty of acquisition of specific language structures. For example, it was shown that learners acquire unmarked forms, i.e. the unmarked dative prepositional phrase complement (e.g. Mary gave the book to John), before marked forms, i.e.
marked double noun phrase constructions (e.g. Mary gave John the book) (Mazurkewich 1984). The limitations of the markedness theory in predicting developmental sequences of L2 acquisition are reported by White (1987). In an investigation of the value of markedness as a predictor of L1 transferability, White (1987) concludes that even though markedness can affect acquisition, it is not a clear predictor of what L2 learners will or will not transfer from L1.

The above studies provide evidence that there are stages of L2 learner development which are sequenced in a predictable order and which can be identified and described with a certain degree of accuracy. What is also evident from the studies reviewed so far is that grammar (in the form of syntax, word order or morphology) has been the central issue in L2 acquisition research. In contrast, phonology and vocabulary have not been investigated to the same extent that grammar has (Tarone, Swain & Fathman 1976). Other limitations reported by Tarone et al. are the undeveloped methodology for data collection and data analysis (the limitations of data collection instruments such as the BSM have been noted by a number of researchers), and finally the limited number of replicated studies in L2 acquisition. The focus on form rather than function is another limitation in the interlanguage studies (Long & Sato 1984). Long and Sato also argue that more research is needed in "a broader array of morphosyntactic features, e.g. complex syntactic structures, and for lexical choice" (Long & Sato 1984:279). In the next sections of this chapter a representative selection of studies in phonology and vocabulary acquisition is reviewed.

2.3 Phonology

In the limited research studies to date, claims have been made that L2 phonology also follows certain patterns of development. For example, Tarone (1976) found that L2 learners prefer to use open syllables (i.e. syllables that end in a vowel) rather than closed syllables (i.e.
syllables that end in a consonant) in the early stages of L2 acquisition (Tarone 1976, 1978). Also, Wode (1977) found that children acquire the L2 phonological system in ordered developmental sequences. In his study of German children acquiring ESL, he found that German children follow the same developmental route for /r/ as native English-speaking children (Wode 1977:213). Similar findings were obtained in an analysis of the production of the English syllable-final stops /b d g/ by Spanish, Polish and Mandarin learners (Flege and Davidian 1985). The authors conclude that the observed developmental processes are similar to those affecting child L1 speech production.

Markedness theory has also been applied to L2 phonology. Eckman (1977) claimed that where there are differences between the phonemes of L1 and L2, those phonemes that are more 'marked' (e.g. word-final voicing contrasts are more marked than medial or initial contrasts) will be more difficult for the L2 learner.

In her review paper, Tarone (1978) reports that the following processes have been utilised in shaping the development of L2 phonology:

i) negative transfer from L1
ii) first language acquisition processes
iii) overgeneralisation
iv) approximation
v) avoidance
(Tarone 1978:25, 1987:77).

These processes are similar to the general interlanguage strategies employed by L2 learners (see Selinker 1972). As yet there is no substantial evidence as to why some developmental processes that occur in the acquisition of L1 phonology are employed by the L2 learner, and some others are not (Ioup & Weinberger 1987). What these studies show, though, is that there are certain developmental processes that learners follow in the acquisition of L2 phonology (for a review on the acquisition of L2 speech see Leather & James 1991).

2.4 Vocabulary

Until recently, lexical acquisition has been a "victim of discrimination" (Levenston 1979:147).
Traditionally, L2 acquisition research has meant "grammar" research, in which the focus is on understanding the acquisition of rules of structural development. Largely ignored was the fact that "using the right word is the most important aspect of language use" (Politzer 1978:258), and that lexis is "the major learning priority" in L2 acquisition (Jones 1994:441). As a result, research in developmental sequences in ESL has been mainly concerned with morphology and syntax. Lexical development has rarely been researched (Meara 1978, 1980) even though it is evident that vocabulary is an important aspect of L2 acquisition. It has been shown that lexical errors outnumber grammatical ones by almost four to one (Meara 1984), and that a poor knowledge of vocabulary has negative effects on the writing of L2 learners (Linnarud 1986). Also, it was found that L2 learners' vocabulary errors are corrected more frequently by native speakers than errors in syntax (Chun, Day, Chenoweth & Luppescu 1982).

2.4.1 Vocabulary as a Language Sub-skill

Interest in L2 vocabulary development has been expressed by two sources: those linguists and language practitioners who saw vocabulary as a component of one of the four major language skills, i.e. reading, and those who saw vocabulary as an independent aspect of language development, equal in importance and status to grammar.

L2 vocabulary development is viewed as a necessary subcomponent of the development of reading skills because L2 learners need very well developed vocabularies in order to read authentic selections (Dubin 1989). However, according to Dubin, ESL learners do not have time to undertake separate vocabulary building courses. Furthermore, teaching vocabulary items which are not embedded in some meaningful context, such as a stretch of text, does not seem to help learners, and therefore vocabulary should be taught through unedited text.
Krashen's view on vocabulary acquisition is that vocabulary is acquired in the same way that the rest of the language is acquired (Krashen 1989). In contrast to the skill-building view, in which vocabulary learning "involves learning words one at a time, by deliberate study" (Krashen 1989:440), Krashen holds that comprehensible input in the form of reading and listening to stories is the way to successful vocabulary development. Krashen concludes that explicit teaching of vocabulary is not so effective and "in addition, many vocabulary teaching methods are at best boring, and are at worst painful" (Krashen 1989:450). Thus, on this view, successful vocabulary development can only occur through the development of reading and listening skills.

Along the same lines, Fox (1987) suggests an approach to vocabulary development based on the assumption that "developing vocabulary and reading skills takes time and extensive practice" (Fox 1987:310). According to this approach, reading simplified texts followed by more complex ones results in a gradual development of L2 vocabulary. Fox also expresses the need for research on rates of acquiring receptive vocabulary.

Oral translation was also suggested as an adequate exercise to build vocabulary (Heltai 1989), as it makes students devote attention to vocabulary and encourages them to extend their vocabulary into new areas, for example synonymic sets, collocations and idioms (Heltai 1989:292). However, such an approach is possible only if all the students and the teacher share the same mother tongue.

Other L2 vocabulary teaching suggestions include the teaching of new words through a "meaningful learning approach", i.e. teaching the etymology of a word, as opposed to other techniques such as rote memorisation of words, especially with intermediate and advanced L2 learners (Pierson 1989:57).
The above studies express an 'interest' in vocabulary acquisition mainly due to the fact that language practitioners realised that the development of reading skills was impeded by the lack of adequate vocabulary. The suggestions given for vocabulary development are not the product of research in the development of L2 vocabulary, but ways of circumventing the problem of inadequate vocabulary in order to develop reading skills.

2.4.2 Vocabulary as a Language Skill

The first attempts to discover how L2 vocabulary is acquired led researchers to investigate how vocabulary is stored and then retrieved by L2 learners. Evidence for a phonologically organised mental lexicon was provided by Fay and Cutler (1977) and Cutler and Fay (1982) through an investigation of "malapropisms" (word substitution errors), e.g. 'we need a few laughs to break up the monogamy' instead of 'monotony' (Fay & Cutler 1977:505). They conclude that the mental dictionary lists its entries according to syllable structure and/or stress pattern, and only within these categories according to sound (Fay & Cutler 1977:511).

In investigating the problem of how new foreign words are stored in the learner's mental lexicon, Meara (1978) tested the word associations of 76 English girls learning French in two London Comprehensive schools. The girls were given a list of 100 French words and were asked to write down, beside each one, the first French word that it made them think of (Meara 1978:194). These associations were then compared with the word associations produced by native French speakers. Meara concludes that the native speaker's mental dictionary is organised mainly on semantic lines while in L2 learners this semantic organisation seems to be much less well established (Meara 1978:208). This lack of proper semantic organisation could be the source of difficulty that foreign language learners experience in processing both written and spoken foreign language material (Meara 1978:208).
Meara finds it plausible that learners follow a transition from a mental L2 lexicon organised on nonsemantic criteria to a more native-like one organised on semantic grounds.

Meara's claim that there are transitional stages in the lexicon has been criticised. The results of his research have been described as "simply messy" and failing to confirm the existence of developmental patterns (Sharwood-Smith 1984:238). Despite the negative criticism, Sharwood-Smith suggests that the networks of semantic associations that exist between words could be a viable avenue to explore in the investigation of L2 vocabulary acquisition.

In a study of the acquisition of individual words, Meara and Ingle (1986) tested the acquisition and retrieval of 35 low-frequency French nouns by English-speaking learners. The nouns were presented and practised phonetically. They found that the beginnings of L2 words were relatively resistant to error, while subsequent consonants were more likely to be incorrect. The results of Meara and Ingle's study are suggestive of how words are stored and retrieved from the mental lexicon, but they are limited in that they pertain to words acquired phonetically. Furthermore, they concern individual lexical items. In a more recent paper Meara (1992) draws attention to the examination of vocabulary acquisition as a network of structures and associations.

Laufer's (1990a) study offered support, in the area of vocabulary acquisition, for the L1 acquisition = L2 acquisition hypothesis, which predicts that L2 learners follow a developmental route similar to that followed by a child learning the same language as L1 (Laufer 1990a:290). Laufer compared adult EFL learners and English native speaking children in order to examine the similarities and/or differences that they experience in distinguishing between words of similar form (synforms), e.g. 'considerate' and 'considerable', 'extend' and 'extent', 'simulate' and 'stimulate'.
Laufer concludes that native speaking learners of English and foreign learners of English share the same order of difficulty in the acquisition of 'synforms', i.e. suffix synforms (e.g. considerable/considerate) created the most difficult synformic distinctions, followed by the vocalic (e.g. cute/acute), and then the prefix (e.g. superficial/artificial) and consonantal (e.g. price/prize) (Laufer 1990a:281). Despite the interesting results, Laufer's study suffers from certain shortcomings: she compared adult foreign learners of English and 12-year-old native speakers of English without justifying why she expected language development in these two groups to be comparable. Furthermore, the multiple choice test she used for her research was poorly designed (e.g. the fourth distractor of each item is almost always one that is definitely wrong - in the 38 items tested, only one has (d) as the correct answer). Despite its limitations, Laufer's investigation suggests that in L2 vocabulary acquisition, too, there are developmental sequences.

Palmberg (1987) also investigated patterns of vocabulary development in Swedish ESL learners in Finland. Palmberg used 'spew' tests, which required the students to write down as many words as they could think of that began with a given letter (M or R). This was done for one minute per week for 17 weeks. Palmberg found that the words produced by his subjects consisted mainly of textbook vocabulary. Results also show a steady increase in the overall word-production capacity of the subjects over time (see also Palmberg 1988).

The acquisition of modal auxiliaries (i.e. can, could, may, and might) by L2 learners was investigated by Gibbs (1990). She examined 75 Panjabi-speaking pupils on their expression of English modal auxiliaries and found that the acquisition of modal auxiliaries by the L2 learners follows an English L1 developmental pattern.

The acquisition of word formation processes was investigated by Olshtain (1987).
Word formation rules in Hebrew were tested using three tasks (production, evaluation and interpretation) with a group of native speakers and two groups of foreign speakers of Hebrew (advanced and intermediate levels). In the production task, subjects were asked to coin new terms for concepts not named in the conventional lexicon of Hebrew. In the evaluation task, subjects were presented with five innovative forms representing word formation devices in Hebrew and asked to judge which of these forms was the most suitable name for a specified noun. In the interpretation task, subjects were asked to supply the most likely meaning of an innovative blend. Olshtain's results show that L2 learners acquire target word formation processes in a gradual progression, with the advanced learners exhibiting productivity that is very similar to native speakers' performance (Olshtain 1987:229). It was also shown that at the advanced level the L1 influence in the application of L2 word formation devices is marginal, while at the intermediate level students rely mainly on word formation devices that were covered in their Hebrew course (i.e. affixation devices). Olshtain's study strongly suggests a developmental process in the acquisition of word-formation rules.

Giacobbe and Cammarota (1986) conducted an investigation of the relationship between L1 and L2 in the construction of lexis during the first phases of L2 acquisition. They collected their data by interviewing two Spanish subjects acquiring French during the first months of their stay in France. They concluded that there are two approaches to the construction of lexis, systematic and non-systematic, depending on the learner's ability or inability to establish a relationship between the L1 and L2. In the systematic approach, the learner forms a General Lexeme Construction Hypothesis (GLCH) which is concretised by a series of simple operations facilitating the transformation of L1 lexemes into L2 lexemes.
For example, Cacho, one of the subjects in the study, suppressed the final vowel of Spanish lexemes, e.g. [kurs] instead of 'curso' and [mism] instead of 'misma', in order to produce French lexemes, e.g. 'cours' and 'même'. The GLCH is further complemented by parallel hypotheses concerning other aspects of the lexemes such as stress. In the non-systematic approach, the learner just memorises words that are frequently used in her/his environment. Even though Giacobbe and Cammarota's study reveals that a degree of systematicity can exist in the acquisition of L2 lexis, it has certain shortcomings. First, their study was limited to the examination of only two Spanish adults acquiring French without formal instruction. Second, the similarity of the subjects' mother tongue and the L2 could have accentuated the role of L1 in the construction of rules for the acquisition of lexis.

In the studies reviewed above, vocabulary acquisition has been equated with the acquisition of individual words by L2 learners, even though it has been suggested that an examination of vocabulary as a network of semantic and structural associations would be worthwhile (Meara 1992). So far, results suggest that in L2 vocabulary acquisition, too, there are certain patterns of development. However, the scope of these studies has been mainly exploratory, and there has not been a systematic framework of investigation of patterns of vocabulary development. The rest of this chapter will focus on studies exploring the acquisition of sequences of lexical items, i.e. lexical phrases and collocations.

2.4.3 The Acquisition of Lexical Phrases

The studies considered so far dealt with the acquisition of individual words. Other studies have also dealt with the acquisition of combinations of two or more words.
The investigation of the early acquisition and use of prefabricated patterns such as "can you", "where is", "how to" and others revealed that in the initial stages of L2 acquisition learners learn to use multiword phrases as if they were individual lexical items (Hakuta 1974). Hakuta poses the question of whether this rote memorisation of prefabricated patterns accelerates or decelerates language development. Peters, for one, believes that 'chunks' play an important role in L1 acquisition (Peters 1983).

Krashen and Scarcella (1978) have also identified the memorisation of syntactic patterns, i.e. prefabricated routines, as part of the early stages of L2 acquisition. However, they conclude that, when more learning has taken place, "language development proceeds analytically, in the 'one word at a time' fashion" (Krashen & Scarcella 1978:297). Krashen and Scarcella conclude that prefabricated routines and patterns are useful for establishing social relations and also for encouraging intake of target language. However, this intake is insufficient for successful language acquisition and thus the teaching of routines and patterns should play only a minor role (Krashen & Scarcella 1978:298). Even though Krashen and Scarcella provide an answer (negative) to Hakuta's question, their conclusions are speculative since they have not been based on empirical evidence.

Counter to Krashen and Scarcella's view of the usefulness of prefabricated routines, Nattinger and DeCarrico (1992) have argued that unanalysed chunks of language play an integral part in acquiring and using language. Nattinger and DeCarrico identified the structural and functional properties of lexical phrases (e.g. 'I'm sorry to hear that X' (expressing sympathy), 'by the way' (topic shift), 'Could/Would you X ?' (request) (DeCarrico & Nattinger 1993)), and suggested ways for utilising lexical phrases in language teaching.
Nattinger and DeCarrico's lexical approach to language learning draws attention to the systematic utilisation of lexical phrases in language teaching; however, there is still little empirical evidence on the way these 'lexico-grammatical units' are actually acquired by L2 learners. Furthermore, their approach is limited - for the purposes of this study - by being focused on the linguistic analysis of native adult language use (Weinert 1995).

Pienemann et al. (1988) also underscore the importance of lexical phrases. The use of formulae in the oral production of English L2 learners was classified as a Stage 1 structure, i.e. low in processing complexity, and the formulae were used as indicators of linguistic development. However, these 'formulae' were left unexplained and the individuals employed as 'assessors' of linguistic development had considerable difficulties in identifying when a formula was used or not. It is possible that using an umbrella term, i.e. 'formulae', to refer to word combinations memorised as chunks could create problems when it is used as an indicator of linguistic development, as different formulae can exhibit different levels of complexity depending on factors such as the length of the collocational string, the frequency of the lexical items in the formula, the formality of the formula, etc. Thus, more refinement is needed in the description of formulae if they are to be used as an indicator of linguistic development.

The above studies suggest that the acquisition of formulae/lexical phrases is characteristic of the initial stages of L2 acquisition, and that their utilisation for language teaching would be of benefit to the learner.
However, their conclusions and suggestions are not based on empirical evidence, while the use of the term 'formula' or 'lexical phrase' to describe any combination of words that could be memorised as a whole is inappropriate and vague for a detailed investigation and description of the acquisition process of such word combinations. Still, we need to know much more about the role of formulaic language in classroom L2 development (Weinert 1995).

2.4.4 The Acquisition of Collocations

Collocational development in L2 vocabulary acquisition has not yet been investigated in terms of systematic patterns of acquisition, even though there has been evidence for the existence of such sequences in the fields of syntax, morphology and phonology, and also evidence that vocabulary acquisition may follow patterns of development.

There is no doubt that collocations are an important part of L2 lexical development. It has been shown that collocational errors make up a high percentage of all errors committed by L2 learners (Grucza & Jaruzelska 1978, cited in Biskup 1992; Marton 1977; Arabski 1979), and linguists have acknowledged the importance of focusing on the relations that hold between items in the lexical system in order to describe vocabulary development (White 1988; Meara 1992). It has also been suggested that collocations provide most of the "initial lexical units", and thus their study is of great importance both for the early stages of language acquisition and for the following years of vocabulary development (Greenbaum 1974:89).

The need for research in collocations has long been identified (Levenston 1979), but it is only in recent years that empirical investigations have been conducted. One reason for this lack of interest could be the shortage of suitable research instruments designed specifically for testing hypotheses about lexical acquisition processes (Levenston & Blum 1978:2). The recent research on collocations has taken a number of forms.
Links between the acquisition and use of collocations and writing proficiency were reported by Ghadessy (1989) (see Chapter 1). According to Ghadessy, the use of function words indicates a more advanced use of collocations, grammatical patterns and cohesive devices on the part of the older students (Ghadessy 1989:114). Ghadessy's study demonstrates that the examination of the collocations L2 learners use can be useful in an investigation of what happens during the L2 learners' development towards a full linguistic communicative competence.

A developmental process in the acquisition of collocations is also suggested by Zhang (1993) in his study of the use of collocations in the writings of native and non-native speakers of English (also see Chapter 1). One of the results of the study is that poor non-native writers and good native writers use more grammatical collocations and fewer lexical collocations. Even though Zhang did not compare the acquisition of English collocations by L2 learners from different proficiency levels, he assumes that the results of his study indicate a certain development in the acquisition of collocations by L2 learners: at the lower levels of English proficiency learners use more grammatical collocations and fewer lexical collocations; when learners are at intermediate levels they produce a greater variety of collocations but they still rely greatly on the prefabricated routines they have acquired at early stages, and therefore use more lexical collocations than grammatical ones; finally, when learners have reached an advanced level of proficiency, they have a better knowledge of grammatical collocations, which they are now able to break down into parts and use to create new ones, thus resulting in a heavier use of grammatical collocations. However, a developmental continuum like the one described by Zhang would require empirical evidence from L2 learners at different acquisition stages.
The acquisition of lexical collocations by advanced learners of English from two different L1 backgrounds, Polish and German, was investigated by Biskup (1992). Subjects were asked to supply the English translation equivalents of lexical collocations in Polish and German respectively. German learners were more prone to use descriptive answers and try alternative ways of rendering the meaning of unfamiliar collocations, while the Polish students would use a collocation only if they were sure it was the correct one. This result is explained in the light of the different emphasis on EFL in Poland and Germany. The Polish educational system insists on accuracy, so the Polish learners would refrain from giving any answer at all unless they were certain that it was the correct one. On the other hand, the Germans pay more attention to communication and fluency and thus the German learners tried to use alternative ways of expressing the meaning of collocations whose English equivalents they did not know (Biskup 1992:88). Even though Biskup's study does not concern the acquisition of collocations from a strictly linguistic point of view, it suggests that by employing different approaches and taking into account factors such as the focus of instruction, new and valuable insights in the field of vocabulary acquisition can be provided.

Aghbar and Tang (1991) devised an instrument to measure the acquisition of collocations. The principle of the proposed scoring scheme is based on the assumption that the acquisition and use of collocations evolves along a continuum from the least semantic approximation to full mastery of collocations that are idiomatic and appropriate, both semantically and by register (Aghbar & Tang 1991:2). The scoring instrument was used to test mastery of verb-noun collocations by 205 university level ESL students.
The collocations were collected using a blank-filling test, and they were scored in terms of their idiomaticity (idiomatic/non-idiomatic), semanticity (semantic/marginally semantic/not semantic), and register (proper register/not proper register). Results showed that the use of common verbs such as 'take', 'get', 'find' was relatively easy for the low proficiency groups and therefore does not discriminate between low and high proficiency in collocations. It was also concluded that open-choice tests are more reflective of the students' choice of collocations in their own natural communication, and that low proficiency students are much more likely to choose an appropriate answer in a multiple choice test.

The acquisition of low frequency (or rare) words and multi-word (or complex) lexical units (e.g. noun phrases (a damp squib), adjectival/adverbial/prepositional phrases (at a pinch), predicates (to bite the bullet), and sayings (the penny drops)) by advanced L2 learners was investigated by Arnaud and Savignon (1994). A list of sixty rare words and sixty complex lexical units was compiled in a multiple choice format (i.e. each item on the list was followed by four choices, one of them being a paraphrase or a synonym of the item and the other three distractors). The list was given to French advanced learners of English, who were asked to complete the multiple choice test by choosing the appropriate definition for each test item. Results show that native-like performance was attained in the case of rare words but not in that of complex lexical units (Arnaud & Savignon 1994). It is possible that because of lack of awareness of the importance and nature of complex lexical units, learners did not pay attention to them.
Arnaud and Savignon conclude that even though the acquisition of a large number of complex lexical units (such as collocations) involves considerable difficulty, such an acquisition is necessary for the advanced learner's receptive competence (Arnaud & Savignon 1994).

The acquisition of lexical collocations or "conventional syntagms" in foreign language learning was also investigated by Marton (1977). Results showed that recurrent exposure to conventional syntagms did not lead learners to remember and recall them. This could be due to the fact that conventional syntagms are easily decodable and thus do not cause any difficulty in the process of recognition. Simple words or more idiomatic expressions have a stronger impact on the learner's conscious mind as the learner makes an effort to learn them, and thus they have a better chance of being remembered. Marton suggests that intensive study of vocabulary and a conscious effort in memorising and rehearsing a great number of conventional syntagms is the most effective way to learn how to handle target language lexical collocations (Marton 1977:55). More recent studies have also underscored the effects of practice in L2 acquisition (see Kirsner, Lalor & Hird 1993).

The above studies show that an investigation of how collocations are acquired will be of potential benefit for illuminating some of the processes that contribute to L2 vocabulary development and for L2 teaching.

2.4.5 Summary

The reviewed literature so far suggests that:

i) L2 vocabulary development has only recently received systematic attention and examination, even though there have been studies suggesting the existence of developmental patterns in the acquisition of L2 vocabulary.

ii) Given the emerging consensus that vocabulary knowledge is best viewed as a network of associations, the acquisition of collocations is a valuable avenue to explore since it represents structural and semantic relationships between lexical items.
iii) Other language aspects have been found to exhibit developmental processes and patterns. The acquisition of vocabulary, and in particular the acquisition of collocations, could be found to follow a developmental process of some kind that can be described and analysed (Ellis 1994:113).

2.5 The Aims of the Present Study

The limited research in the development of L2 vocabulary, and the availability of English collocations for a study of development, as these are operationalised in this study (see Chapter 1), provided the rationale for this study, which aimed to investigate whether there are patterns in the development of collocational knowledge in L2 learners.

Describing developmental 'stages' in the acquisition of collocations (i.e. the stages that the learner goes through before the correct English collocations are fully acquired) is not feasible in an investigation of vocabulary learning. For example, in the investigation of the acquisition of English interrogatives (Cazden et al. 1975) the end product (i.e. a well-formed interrogative conforming to English grammar rules) was evident and the researchers had to describe the stages learners go through in the acquisition of English interrogatives. In vocabulary acquisition, however, and in particular in the acquisition of collocations, the end product is not as obvious. For example, when the learner uses 'bad milk', the end product cannot be confidently determined. It is possible that the learner is trying to say 'sour milk', or even that 'the milk is off'.

Due to the above limitation, this study aimed to explore 'patterns' or 'acquisition/difficulty/accuracy orders' rather than 'stages' of development in the acquisition of collocations. Thus, development in the acquisition of collocations is in the form of sequences or implicational steps of correctly used English collocations by learners at different proficiency levels.
For the purposes of the present study, ESL learners from three different proficiency levels were tested in their free and cued production of collocation types as these are operationalised in the BBI and other studies on collocations (see Biskup 1992; Zhang 1993). The proficiency level of the selected L2 learners was based on the assumption that collocations are important for the early stages of language acquisition and for the following years of vocabulary development (Greenbaum 1974). Thus, the subjects in this study were at post-beginner, intermediate, and post-intermediate levels of proficiency. The correctly used collocations were sequenced to reveal implicational orders from 'easy' or 'early acquired' collocation types to 'difficult' or 'late acquired' types. In this way any systematic patterns of development in the acquisition of collocations would emerge.

As a result of the foregoing, there are two hypotheses tested in this study:

i) There are stable patterns of development of collocational knowledge across language proficiency levels.

ii) There are stable patterns of development of collocational knowledge within language proficiency levels.

The next chapter describes the methodology of this study.

CHAPTER 3 METHODOLOGY FOR THE PRESENT STUDY

3.0 Introduction

This chapter specifies the methodology of the present study. It describes the development of the testing materials, the data collection procedures, the coding and scoring of the data, and the analyses to be performed in order to test the two predictions:

1. There are stable patterns in the development of collocational knowledge across language proficiency levels.

2. There are stable patterns in the development of collocational knowledge within proficiency levels.

3.1 Analysis of the Teaching Materials

For the purposes of the present study an initial analysis and classification of the collocations found in three textbooks, namely Task Way English 1, 2 & 3, was performed.
These textbooks are used in all the State Junior High Schools in Greece for the teaching of English. The Task Way English (TWE) series was designed by a five-member English Language Teaching (ELT) committee appointed by the Greek Ministry of Education. All the members of the committee were Greek and their aim was to design a series of textbooks for the teaching of English in the State Junior High Schools that would meet the interests and needs of Greek students.

3.1.1 Curriculum Objectives

The objectives of the Junior High School ELT curriculum were reformed under the task-based approach to foreign language learning adopted by the authoring committee of the series. According to the committee, the new objectives were "related to knowledge of language as a system and to language as a means of communication" (Dendrinos 1988:2) and the TWE series was designed to realise these objectives through role-play tasks, listening activities, and emphasis on communicative competence.

3.1.2 Syllabus and Methodology

The syllabus for each of the three textbooks is graded in terms of both grammar and the communicative functions of language: the contents page for each book describes for each unit the title, the grammar points included, and the language functions that are to be practised. The aim of the textbooks, as outlined by the authoring team, is to develop the learners' communicative competence and provide them with practice in using the target language. The authors of the TWE series wanted to adopt a methodology that follows the principles of "process-oriented learning" (Dendrinos 1988:5). Such a methodology, the authors of the books claim, has made the grading of the formal, semantic and pragmatic properties of language "far less important than the sequencing of the learning tasks" (Dendrinos 1988:5).
3.1.3 Activities and Tasks

Each unit in the textbooks has a central theme, which is further divided into several topics and issues leading to situations where the learner is invited to participate by using her/his communicative skills. Before each task is performed, the sociolinguistic context of the situation is given. The team of authors designed the tasks, aiming to develop in learners both receptive and productive skills and to encourage them to discover new knowledge rather than impose it on them. They also wanted to offer the students opportunities for metacognition and metacommunication (Dendrinos 1988:6). For example, learners are asked to look at the usage of different grammar tenses in a comparative way, or they are informed about the roles of certain grammatical structures, e.g. Passive Voice is used when we are interested in the action rather than the agent (TWE3, p.57).

The last part of each unit in the three textbooks aims to help the learner systematise the knowledge that she/he acquired throughout the unit. For example, in TWE2, Unit 6, p. 84, an alphabetical list of more than eighty English verbs with irregular past tense forms is presented in order to help learners systematise their knowledge of irregular past tense verb forms. However, the effectiveness of such tasks depends to a large extent on the way these are presented to the learners, and the use that learners make of them.

The writers of the series did not design activities that could raise the students' awareness of collocations in a systematic way. In TWE1, there is only one activity that asks students to list nouns that could be accompanied by a certain adjective, e.g. big: toe, finger, foot, hand, mouth, ear, eye. In TWE2, there are no activities that would help learners acquire specific collocations. Finally, in TWE3, there is one activity in which students are asked to make adjective-noun and noun-noun compounds using specific words on a list, e.g.
'classified advertisement', 'natural resources', 'entertainment section'. Instructors follow the curriculum closely and the TWE series textbooks are the only textbooks used in the classroom. The textbooks therefore control the learners' lexical acquisition in the classroom.

3.1.4 The Use of L1

The learners' L1, Greek, is used in the textbooks in order to describe the context of the tasks to the students, to tell them how to carry out the task, and in some instances to give rules of language use. The use of Greek is much more extensive in the first book, which is aimed at beginners.

3.1.5 The Vocabulary

The authors report that due to the communicative purposes of the textbooks, "the vocabulary which appears in different discourse types has not been chosen with any formal criteria in mind, while it has not been strictly graded" (Dendrinos 1988:4). For the purposes of this study, and in order to understand the linguistic environment that the subjects of this study have been exposed to, the vocabulary of the textbooks was analysed in terms of types of collocations. The list of 37 types of collocation developed by the BBI was used (see Chapter 1). The classification was performed manually by the researcher. Inter-rater reliability of 90% was achieved with one other rater on a random 5% sample of the total number of pages analysed, and it was considered to be sufficient. The results were entered in a database using the Quattro Pro 3.0 software program. Descriptive statistics were then calculated.

3.1.6 Descriptive Statistics for the TWE Series

There is a steady increase in the number of English collocations included in the books (Figure 1). TWE1 contains 2,161 collocations, TWE2 contains 3,922 collocations, and TWE3 contains 5,901 collocations. Token-type ratios were calculated for each book (see Table 7).

[Figure 1: bar chart of the sum of collocations per textbook (vertical axis, 0 to 6000) for Task Way English 1, 2, 3 (horizontal axis)]

Figure 1. Distribution of collocations across the TWE series

Table 7.
Collocation tokens and token/type ratios in the TWE series

TYPE                                   TWE1   TWE2   TWE3   Total  Tokens/Type
1. Noun Preposition                      76     80    145     301        100.3
2. Noun to Infinitive                     2     16     27      45           15
3. Noun that-clause                       0      0      0       0            0
4. Preposition Noun                     228    215    429     872       290.67
5. Adjective Preposition                 45     75    128     248        82.66
6. Predicate Adjective to Infinitive      3      9     37      49        16.33
7. Adjective that-clause                  8      7     12      27            9
8. SVO to O/ SVOO                        15     26     27      68        22.66
9. SVO to O                               1      3      7      11         3.66
10. SVO for O/ SVOO                       4      6      9      19         6.33
11. SV(O) Preposition O                  10     45      5      60           20
12. SV to Infinitive                     19    234    285     538       179.33
13. SV Infinitive                        26    230    347     603          201
14. SV V-ing                             31      9     22      62        20.66
15. SVO to Infinitive                     4     25     56      85        28.33
16. SVO Infinitive                       11     28     37      76        25.33
17. SVO V-ing                             1     16     14      31        10.33
18. SV Possessive V-ing                   0      0      0       0            0
19. SV(O) that-clause                    45    149    234     428       142.67
20. SVO to be c                           0      0      3       3            1
21. SVOc                                  4      9     32      45           15
22. SVOO                                  3     13      8      24            8
23. SV(O) Adverbial                     110    130     61     301       100.33
24. SV(O) wh-word                        73    174    203     450          150
25. S(it) VO to Infinitive                0      0      0       0            0
26. SVc                                 402    351    503    1256       418.67
27. Verb Noun/Pronoun (creation)         30    135    143     308       102.67
28. Verb Noun (eradication)               3      5      4      12            4
29. Adjective Noun                      306    704   1346    2356       785.33
30. Noun Verb                             3     11      2      16         5.33
31. Noun1 of Noun2                       16     21     47      84           28
32. Adverb Adjective                      3     15     30      48           16
33. Verb Adverb                          58    173    146     377       125.67
34. Noun Noun                           197    319    498    1014          338
36. Preposition Determiner Noun         240    345    454    1039       346.33
37. Phrasal Verb                        184    344    600    1128          376
TOTAL OCCURRENCES                     2,161  3,922  5,901  11,984       ------

As can be seen from Table 7, there were also collocation types for which no instances of collocations were found in any of the books:

Type 18. SV Possessive V-ing
Type 25. S(it)VO to Infinitive
Type 3. Noun that-clause

These categories have not been included in the calculation of the means and standard deviations for each textbook that appear below. A look at the mean number of collocations found in each book confirms this steady increase of collocations (see Table 8).
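As a reading aid for the last column of Table 7, the following is a minimal sketch under one assumption that the text does not state explicitly: that each Tokens/Type figure is the row total divided by the number of textbooks (three). The function name `tokens_per_type` and the choice of sample rows are illustrative only; the counts themselves are copied from Table 7.

```python
# Token counts copied from three rows of Table 7: (TWE1, TWE2, TWE3).
table7_rows = {
    "4. Preposition Noun": (228, 215, 429),
    "13. SV Infinitive": (26, 230, 347),
    "29. Adjective Noun": (306, 704, 1346),
}


def tokens_per_type(counts):
    """Row total divided by the number of books.

    ASSUMPTION: this is how the Tokens/Type column was obtained; the
    thesis does not spell out the calculation.
    """
    return round(sum(counts) / len(counts), 2)


for label, counts in table7_rows.items():
    # Prints the type label, the row total, and the assumed ratio.
    print(label, sum(counts), tokens_per_type(counts))
```

For these three rows the assumed calculation gives 290.67, 201.0 and 785.33, which agrees with the tabulated values; some other rows in the table appear to be truncated rather than rounded (e.g. 82.66), so exact agreement is not claimed for every row.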
The standard deviation was also calculated for each book (see Table 8), and it was found that as the level of English progresses in the TWE series the variability from the central point in the distribution of scores becomes greater, that is, some types of English collocations are represented by a large number of tokens, which gets even larger in the third book of the series, while in other types the occurrences remain consistently low. It also appears that the scores are much more spread out in the second and third books than in the first one, which has generally low levels of English collocations.

Table 8. Means and standard deviations per textbook

BOOK      MEAN      STD
TWE 1     61.5    97.01
TWE 2    111.1    150.1
TWE 3    167.8    261.9

As we can also see in Table 7 above, most of the collocation types have a small number of tokens in all three books, a few have a medium-sized number of tokens, and only a couple have a high number of tokens. This indicates that some types are represented more than others in the TWE series.

There is little recycling of collocations across the three books. For about half of the collocation types there were no common collocations appearing in all three books. The largest numbers of recycled collocations across the three books fall under the following types:

Type 24. SV(O) wh-word (13 tokens appearing in all three books);
Type 37. Phrasal Verb (18 tokens appearing in all three books);
Type 36. Preposition Det Noun (11 tokens appearing in all three books);
Type 4. Preposition Noun (11 tokens appearing in all three books).

A closer look at these categories revealed the following:

i) Type 24. SV(O) wh-word: The collocations that belong to this category and appear in all three books are instructions mainly for carrying out role-play tasks, e.g. "ask if she has got a brother" (TWE1, Unit 5, p.66).

ii) Type 37. Phrasal Verb: Most of the recycled collocations in this category also appear in task instructions, e.g. "try to fill in the 'What is Done By Whom' table" (TWE3, Unit 3, p.47).

From the above, it appears that most of the recycled collocations are standard expressions used for giving instructions to the students about the task they are asked to perform.

A close look at the textbook data shows that there is little recycling of the collocations used in TWE1 and TWE2, and TWE1 and TWE3. TWE2 and TWE3 appear to be more compatible as they have 329 collocations in common. However, an examination of these 329 collocations revealed that over a quarter of them (28%) belonged to the types used for task instructions (see above).

3.2 Subjects

Three groups of Greek students of English were involved in the present study. They were all learners from the same Greek Junior High School and were taught English via the TWE coursebooks. The first group consists of students in the first year of Junior High School, the second one consists of students in the second year of Junior High School, and the third group comprises students in the third year of Junior High School.

The Junior High School that participated in the study used the TWE series, and it was situated in an urban area, i.e. Veria, which is the capital of the prefecture of Imathia, to ensure that its students were mainly from the same town rather than from the nearby villages. Permission was obtained from the Principal of the Junior High School and the Department of Education for Secondary Schools, as the research would engage the participating students for one hour and forty minutes for the completion of the test.

There were 347 subjects participating in the study: 107 subjects for the first group, 125 subjects for the second group, and 115 subjects for the third group. All subjects were between 12 and 15 years of age. They were Greek nationals who were native speakers of Greek, and they all had the same level of formal education.
These subjects had to be screened with regard to their language proficiency and their production of collocations, and hence those subjects that had not written an essay were not included in the study (see 3.5.2.). Ultimately, there were 275 subjects included in the study: 91 subjects for Group 1, 94 subjects for Group 2, and 90 subjects in Group 3. The subjects' mean age for each group was calculated: 12 years and 9 months for Group 1, 13 years and 8 months for Group 2, and 14 years and 7 months in Group 3 (see Appendix A).

The study was conducted three teaching weeks before the end of the school year in Greece. By this time students were in the final chapters of their books and under revision in preparation for the annual exams which follow the end of the school year.

3.3 Materials

The test instrument used for the data collection consisted of a questionnaire eliciting background information about the subjects, and a battery of three tests: an essay writing task eliciting free-production data, a translation task, and a blank-filling task eliciting accuracy in the use of collocations. The purpose and the contents of the materials are examined in detail below.

3.3.1 Questionnaire

The first part of the test instrument was a questionnaire in Greek aiming to elicit information about the students' background. There were 15 questions, 10 open-ended and 5 closed ones, asking for information such as the students' age, sex, recent marks in English, how many languages they speak, whether they had any additional exposure to English, when they started learning English, how often they watch English movies with Greek subtitles or without subtitles, how often they read English books and/or newspapers, how often they listen to English songs, whether they speak English with their friends, and whether they correspond with pen friends in English. The questionnaire was the same for all three groups.
See Appendix A for information on the subjects' gender and age and Appendix B for the English translation of the questionnaire.

3.3.2 Composition

The first test in the battery of tests was a composition task measuring free production of collocations. Students were asked to write an essay of approximately 200 words on a given topic. There were different topics for the three different groups and each topic had been covered in the textbook of the particular group. The topic for each group was given in Greek (see Appendix C for the topics in Greek and their English translations).

3.3.3 Translation

The second test consisted of an elicited translation task. The translation test measured cued production of collocations. There were 10 sentences in Greek for each group, and the subjects were asked to translate them into English (see Appendix D for a word-by-word English translation of the Greek sentences and their expected translations in English). Each sentence tested one collocation. The collocations included in the translation test were selected from the database of the collocations found in the students' textbooks, and each collocation included in the test was different from its Greek equivalent, e.g. 'draw conclusions' is 'take out conclusions' in Greek. The types of grammatical and lexical collocations tested in the elicited translation task for each group are given below in Table 9.

Table 9. Collocation types included in the translation test.

TYPES                            Group1  Group2  Group3
1. Noun Preposition                   2       2       2
5. Adjective Preposition              1       2       2
11. SV(O) Preposition O               2       2       1
13. SV Infinitive                     1       0       1
14. SV V-ing                          1       1       1
16. SVO Infinitive                    1       1       1
23. SV(O) Adverbial                   1       0       0
27. Verb Noun (creation)              1       2       2
TOTAL                                10      10      10

3.3.4 Blank-Filling

Finally, a blank-filling test was also included. This test measured cued production of collocations. There were a number of sentences in English (50 sentences for Group 1, 65 for Group 2, and 90 for Group 3), containing collocations in context.
Each sentence contained one collocation. In each sentence, one part of the collocation was replaced by a blank and students were asked to read the sentence and provide one suitable word for each blank. As with the translation test, the collocations tested in the blank-filling test were selected from those appearing in the students' textbooks, and they were different from their Greek equivalents (see Appendix E for the sentences included in the blank-filling tests with the intended collocations underlined and the missing part bolded). All parts of the tests were typed. There were instructions in Greek for each part of the test. The types of grammatical and lexical collocations tested in the blank-filling task for each group are given below in Table 10.

Table 10. Collocation types included in the blank-filling test.

TYPES                            Group1  Group2  Group3
1. Noun Preposition                   1       5       3
4. Preposition Noun                   3       6       7
5. Adjective Preposition              5       5      15
11. SV(O) Preposition O              12      14      27
23. SV(O) Adverbial                   1       1       2
24. SV(O) wh-word                     2       1       3
27. Verb Noun (creation)              6      10      11
28. Verb Noun (eradication)           0       0       2
29. Adjective Noun                    1       0       2
30. Noun Verb                         1       0       0
31. Noun1 of Noun2                    3       1       0
33. Verb Adverb                       1       3       1
34. Noun Noun                         2       1       1
36. Preposition Determiner Noun       6      11      12
37. Phrasal Verb                      6       7       4
TOTAL                                50      65      90

3.4 Data Collection Procedures

In the following sections, the procedures followed by the researcher for the collection of data are reported.

3.4.1 Test Administration

The test was administered on three consecutive days: one day for each group. All subjects for each group were tested on the same day. All the subjects belonging to a particular group were tested together. During the data collection, the researcher personally monitored the testing. The subjects were told that their school had agreed to participate in a research project undertaken by the Centre for Language Teaching and Research of the University of Queensland.
Their knowledge of English was to be assessed using a test and the collective results of their performance would be forwarded to their school after the completion of the project. A complimentary copy of the Macquarie Dictionary of Australian English would be donated to the school library for student use as a reward for their participation in the testing. Subjects were assured that the data would be treated confidentially and would not affect their course marks.

After the tests were distributed to the subjects, they were asked to first complete the questionnaire, then to write the essay, then to do the translation task and finally to complete the blank-filling task. The researcher explained what each test required the students to do. All test instructions were written and spoken in Greek. The questionnaire and the topic for the essay were written in Greek, while the translation and the blank-filling test were introduced by instructions in Greek asking students to translate and fill in the sentences respectively (see Appendix F for the exact wording of the instructions).

Subjects were then asked to complete the tests. They were encouraged to ask the researcher about anything in the test they might find difficult to understand, or any unknown words. Even though the vocabulary used for the test items came out of the students' textbooks, the researcher realised that the students might not remember certain words under the pressure of time. Therefore, any words unknown to the students were explained by the researcher, with care taken that the particular words were not giving away the answers to any of the test items. Such cases were limited only to subjects in Group 1 because of their low level of English.

The subjects were allowed one hour and thirty minutes to complete the test, and they were told that they should not leave any of the test items unanswered. Students finishing earlier than expected were asked to remain seated and revise their tests, e.g.
try to expand their composition. When the time was up all of the subjects had finished and they were allowed to leave the room. The same procedure was followed for each day of the testing, until all three groups of subjects had been tested.

3.4.2 Debriefing

At the end of the last day of testing, the researcher had a meeting with the two English teachers of the school during which she explained the purpose of the testing and the research project. She also asked them to complete a questionnaire about the use of the TWE coursebooks and the teaching of collocations in the classroom (see Appendix G for the teachers' questionnaire). This information was to be used later in the data analysis and the discussion of the results.

3.5 Coding Procedures

Each set of data from the three tests was coded and scored according to the following criteria.

3.5.1 Free Composition

The data obtained from the free composition were treated as evidence of both language proficiency and of collocational use. As language proficiency data, the essays were analysed with respect to six different measurements: holistic rating, target-like use of articles, lexical density, length of terminal units, error-free terminal units, and sentence-nodes per terminal unit. The use of each of the six measures is explained below.

i) Holistic Rating

The free compositions were scored on a holistic scale of 1-100 which is a standardised and widely used scale compiled by Jacobs, Zinkgraf, Wormuth, Hartfiel and Hughey (1981) (see Appendix H for a list of the criteria for scoring). Each composition was assessed by two raters. The raters were native speakers of English and experienced English teachers. Each essay received two scores, one from each rater. In cases where the two raters had more than ten points difference in their evaluation of a particular essay, the essay in question was scored by a third rater. If the third rater gave a score that was half way between the previous two scores, e.g.
Rater 1 gave a 50, Rater 2 gave a 30, and Rater 3 gave a 40, then the third rater's score was counted, and the previous two ignored. If the third rater gave a score that was the same as or closer to one of the previous two scores, then the third score and the closest other score were averaged and the resulting score was given to the essay, e.g. if Rater 1 gave a 50, Rater 2 gave a 30, and Rater 3 also gave a 30, the scores from Rater 2 and Rater 3 were averaged, while Rater 1 was ignored. If in the previous case Rater 3 gave a 60, then Rater 1 and Rater 3 would be averaged, and Rater 2 would be ignored. In 32 out of 275 essays (11.6%) where a third rater was needed, the score from the third rater had no more than nine points difference with at least one of the previous two scores. Finally, each essay received a score on a scale 1-100 based on the average of the two ratings.

ii) Target-Like Use of Articles

The analysis of the Target-Like Use (TLU) of articles was performed by the researcher. TLU is an accuracy measure. As in Pica (1983), the number of accurately supplied articles (definite and indefinite) in obligatory contexts was counted and divided by the sum of the overall number of obligatory contexts in the essay (whether an article was provided in them or not) and the number of non-obligatory contexts with inappropriate articles; the result was then multiplied by 100. The TLU percentage score was recorded for each essay.

Inter-rater reliability was assessed on a sub-set of the ratings randomly sampled from the entire set of data. A 5% sample of the data (i.e. 15 essays) was rated in this way by two other raters and the inter-rater reliability was at 99%. The raters were given a random selection of 15 essays, 5 essays from each sample, and an instruction sheet, which reported what the measurement was and what each rater was required to do (see Appendix I). The raters were given a short training session by the researcher on the TLU analysis on two other essays.
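The two scoring rules described above (the third-rater adjudication for the holistic rating and the TLU percentage) can be sketched as follows. This is an illustrative reading of the prose, not the procedure as actually implemented in the study: the function and parameter names are hypothetical, and the article counts in the usage lines are invented for demonstration.

```python
def holistic_score(rater1, rater2, rater3=None):
    """Combine two (or three) 1-100 ratings into one essay score."""
    if abs(rater1 - rater2) <= 10:
        # No more than ten points apart: average the two ratings.
        return (rater1 + rater2) / 2
    if rater3 is None:
        raise ValueError("ratings differ by more than ten points; a third rating is needed")
    dist1, dist2 = abs(rater3 - rater1), abs(rater3 - rater2)
    if dist1 == dist2:
        # Third score lies half way between the first two: it is used alone.
        return float(rater3)
    # Otherwise average the third score with whichever score it is closer to.
    closest = rater1 if dist1 < dist2 else rater2
    return (rater3 + closest) / 2


def tlu_percent(correct_in_obligatory, obligatory_contexts, inappropriate_in_non_obligatory):
    """Target-Like Use score as a percentage, following the description above."""
    denominator = obligatory_contexts + inappropriate_in_non_obligatory
    if denominator == 0:
        raise ValueError("essay contains no article contexts")
    return 100 * correct_in_obligatory / denominator


# The three worked examples given in the text:
print(holistic_score(50, 30, 40))  # third rating is the midpoint
print(holistic_score(50, 30, 30))  # third rating matches Rater 2
print(holistic_score(50, 30, 60))  # third rating is closer to Rater 1

# Invented counts: 18 correct articles, 20 obligatory contexts, 4 overuses.
print(tlu_percent(18, 20, 4))
```

Under this reading the three examples from the text yield 40, 30 and 55 respectively. Placing the non-obligatory overuses in the denominator means that supplying articles where none is required lowers the TLU score, which is what distinguishes it from a simple suppliance-in-obligatory-context count.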
After the raters had performed the measurement on the sample essays, the researcher's and each rater's ratings were compared. In a total of 235 instances of accurate and inaccurate suppliance and omission of articles, there were two instances of disagreement between the researcher and one of the raters in one of the essays. In essay 13 from Group 1, there was an ungrammatical sentence, "He plays basketball and sometimes he plays tennis or going with friends for jogging". The researcher had considered the phrase "going with friends for jogging" as a non-obligatory context for an article, provided that it was read as "going (for) jogging with friends", while the rater had marked it as an obligatory context for the use of an indefinite article, if it was read as "going with friends for a jog(ging)". After the case was discussed, both the researcher and the rater agreed that "going with friends for a jog" was closer to what the student had written, and as such it was an obligatory context for an article. The second instance was the phrase "she often climbs on the mountains" in the same essay. Here the researcher had considered that the phrase did not need an article, while the rater had marked it as correct, assuming that the student was probably talking about a specific group of mountains. After discussion it was agreed that the omission of the article was much more general and consequently appropriate, especially since the student did not refer any further to particular mountains in his essay. For the rest of the essays, there was total agreement between the researcher and the raters.

iii) Lexical Density

A Lexical Density analysis (LD) was also performed by the researcher. Lexical Density refers to the number of lexical, or 'open class', words divided by the total number of words in each essay and multiplied by 100 (see Long 1991, unpublished paper; Linnarud 1986). For this analysis a number of criteria were defined: abbreviations such as 'etc.', 'e.g.'
were not counted at all; proper names were not counted as lexical words (in Group 1 the students were asked to describe themselves and their family, so in each essay there was a considerable number of Greek names that did not really contribute to the semantic richness of the essay and so they were not considered to be lexical words, see also Palmberg 1987:212); names of places in Greece were not counted as lexical words (see above); numbers were not counted as lexical words; adverbs other than those ending in -ly were not counted as lexical words; the verb 'do' was not counted as a lexical word, even when it was used as a main verb; misspelt words that could be easily recognised as English words either in writing or when pronounced according to the Greek or English phonetic system were counted (in this case it was assumed that the student knew the word and was attempting to use it, see also Palmberg 1987:205); words written in Greek were not counted at all.

Inter-rater reliability was also obtained for the LD analysis from two other raters on 5% of the data, which was randomly sampled (i.e. 15 essays), and was at 98%. Each rater was given the same sample of essays that the TLU raters were given and an instruction sheet (see Appendix J) and was asked to underline the lexical words in each essay. A short training session was given by the researcher on two other essays. After the raters performed the LD analysis on the sample essays, the researcher's and each rater's ratings were compared. There were three instances in which one of the raters had underlined the verb "going" as a lexical word in phrases in which it was an auxiliary, and as such it was a grammatical rather than a lexical word, e.g. essay 37 Group 3 "we are going to do everything". In one case the adverb "hard" was underlined as a lexical word even though it did not end in -ly.
Finally, words such as "other" and "everybody" were underlined by the raters even though they were clearly grammatical words, while there were cases in which the raters should have underlined words such as "think", "way", "worst", "better", "use", "thanks" but did not. Overall, there was agreement between the researcher and the raters in the LD analysis in 1264 out of 1280 lexical words, and the inter-rater reliability was considered sufficient for the LD measurement.

iv) Terminal-Units

The essays were also analysed with respect to the number of terminal units (T-Units) they contained. All the main clauses plus any subordinate clauses attached to or embedded in them were counted as T-Units (see Long 1991, unpublished paper; Hunt 1966). A T-Unit is a structural discourse unit and it was used in this study in three different measures: length of T-Units, Error-Free T-Units, and S-Nodes per T-Unit. Inter-rater reliability for the number of T-Units was also obtained from two other raters on a randomly sampled sub-set of the data (5% of the data), and was at 97%. Each of the raters was given the same sample of essays and the T-Unit instructions (see Appendix K) and was asked to mark all the T-Units in each essay. A short training session was given by the researcher on two other essays. After the measurement, the numbers of T-Units marked by the raters and the researcher were compared and the inter-rater reliability (97%) was considered sufficient for the T-Units measurement.

v) Length of T-Units

After the number of T-Units in each essay was calculated, the average length of the T-Units in each essay was estimated by dividing the total number of words in the essay by the number of T-Units in that essay (see Larsen-Freeman 1978), e.g. an essay with 186 words and 25 T-Units had 7.44 as the average length of a T-Unit. This was a complexity measure.

vi) Error-Free T-Units

The number of error-free T-Units in each essay was also calculated as an accuracy measure.
Only T-Units that were free from grammatical, syntactic, lexical, spelling and punctuation errors were counted as Error-Free T-Units (see Larsen-Freeman 1978). Inter-rater reliability on a randomly sampled sub-set of the data (15 essays) was also obtained from two other raters, and was at 97%. Each of the raters was given the same sample of essays and instructions for counting the Error-Free T-Units in each essay (see Appendix K). A short training session was given by the researcher on two other essays. After the measurement, the numbers of Error-Free T-Units marked by the raters and the researcher were compared, and the inter-rater reliability (97%) was considered sufficient for the Error-Free T-Units measurement.

vii) S-Nodes per T-Unit

The essays were also analysed with respect to the number of sentence nodes (S-Nodes) they contained. This was a measure of syntactic accumulation. The number of underlying sentence nodes, indicated by tensed and untensed verbs, was calculated for each essay and then the average number of S-Nodes per T-Unit was estimated by dividing the number of S-Nodes in each essay by the number of T-Units in that essay (see Long 1991, unpublished paper). Inter-rater reliability on 5% of the data was also obtained from two other raters, and was at 98%. Each of the raters was given the same sample of essays and instructions for counting the S-Nodes in each essay (see Appendix L). A short training session was given by the researcher on two other essays. After the measurement, the numbers of S-Nodes marked by the raters and the researcher were compared and the inter-rater reliability (98%) was considered sufficient for the S-Nodes measurement.

3.5.2 Use of Collocations in the Essays

The subjects' performance on the free composition task served not only as a measurement of the subjects' writing proficiency in English (see above), but also as a measurement of their free production of collocations.
Test papers in which no composition was given, either because of the particular subject's inadequate level of English or because of lack of time, interest etc., were not included in the study. Thus, there were 275 complete test papers: 91 in Group 1, 94 in Group 2, and 90 in Group 3. The essays were then analysed with regard to the collocations they contained. The students' production of the 37 different collocation types as these are operationalised in this study (see Chapter 1) was recorded as frequency data. Where the students provided a correct collocation they were marked as having used a token of the particular type to which the collocation belonged. Misspelt collocations were recorded as evidence of collocational use. Each collocation found in the essays was checked against the collocations included in the BBI. If the particular collocation was included in the BBI it was recorded as correct evidence of use of the particular collocation type (see also Zhang 1993). If it was not included in the BBI it was discarded. Across the three groups, 13.1% of the collocations were rejected in this way. These collocations were mainly Adjective Noun combinations with 'big' or 'good' as the adjective. Such collocations are considered 'free combinations' by the BBI writers and they are not listed in the BBI (Benson et al. 1986a:xxiv). The quantity of collocations found in each essay for each of the 37 types was also recorded, as was the percentage of the type-token ratio. Inter-rater reliability of 90% was obtained for 5% of the data.

3.5.3 Translation

The data from the translation task were scored both as frequency data and as accuracy data. As frequency data the answers in the translation test were marked using a binary code: when students used the correct collocation they received 1, and when they used the wrong collocation or provided no collocation at all, they received 0.
As accuracy data, the mean accuracy of response to each collocation type in the translation test was recorded. Spelling mistakes were disregarded. For a list of the types of collocations tested in the translation test see Table 9 above.

3.5.4 Blank-Filling

The data from the blank-filling test were also recorded both as frequency data and as accuracy data. As frequency data, the same binary coding was used as for the translation data. As accuracy data, the mean accuracy of response to each of the collocation types included in the blank-filling test was recorded. Spelling mistakes were disregarded. In the few cases where the students provided a collocation that did not match the target one, but which belonged to the same collocation type, the collocation was recorded as correct. For a list of the collocation types tested in the blank-filling test see Table 10 above.

3.6 Analyses

In the following section, the analyses of the language proficiency measures and the two hypotheses investigated by this study are outlined.

3.6.1 Language Proficiency Measures

Before testing the hypotheses it was necessary to perform a number of language proficiency measures on the free production data in order to confirm that the groups represent different levels of language proficiency. The analyses performed included the following measurements: holistic rating, target-like use of articles, lexical density, length of T-Units, error-free T-Units, and S-Nodes per T-Unit. A one-way MANOVA, with Group as the factor and the six measures as dependent variables, was calculated in order to determine whether the three samples were different with respect to all six measures. Following that, six one-way ANOVAs were calculated, one for each of the six measures.

3.6.2 Analyses for the Hypotheses

The analyses performed for testing the two hypotheses are described below.
i) Analysis for Hypothesis 1

To address Hypothesis 1, that there are stable patterns of development in the acquisition of collocations across proficiency levels, non-parametric Kruskal-Wallis tests were performed on the data. Kruskal-Wallis is the non-parametric equivalent of ANOVA. Due to the wide range of types of collocations that were used in the analysis of the essay data, and the unequal number of tokens for each of the types of collocations tested in the translation and the blank-filling tests, the data were not expected to be normally distributed and thus non-parametric statistics were considered suitable to address the first hypothesis. Previous studies on collocations used non-parametric statistics too (see Zhang 1993).

For the free production essay data that resulted from the analysis of the students' essays, Kruskal-Wallis tests followed by Dunn's multiple comparison procedures were performed on the mean tokens of each of the 37 collocation types used by subjects in each group.

For the cued production translation data, non-parametric Kruskal-Wallis tests followed by Dunn's procedure were performed on the mean accuracy of responses to each of the 6 types of collocations repeated across the three groups.

For the cued production blank-filling data, non-parametric Kruskal-Wallis tests followed by Dunn's procedure were performed on the mean accuracy of responses to each of the 11 types of collocations repeated across the three groups.

Implicational scaling analysis for each of the three sets of data was also performed in order to reveal any acquisition orders. The frequency data were used for this analysis. For each implicational scale, Guttman's coefficients of reproducibility and scalability were then calculated (Hatch & Lazaraton 1991:204) in order to test the validity of the scales and the scalability of the items on the scales.
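The between-group Kruskal-Wallis comparison described in this section can be sketched as follows. This is a minimal illustration under stated assumptions and not the procedure actually run for the study: the function names and the token counts are hypothetical, the tie correction is the standard one (H divided by 1 - Σ(t³ - t)/(N³ - N), where t is the size of each set of tied scores), and Dunn's follow-up comparisons are not implemented. The mean ranks it returns are the quantities that a procedure such as Dunn's compares pairwise.

```python
from collections import Counter


def average_ranks(values):
    """Rank values from 1 to N, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # sorted positions are 0-based
        for pos in range(start, end + 1):
            ranks[order[pos]] = mean_rank
        start = end + 1
    return ranks


def kruskal_wallis(groups):
    """Return (H corrected for ties, mean rank of each group)."""
    pooled = [x for group in groups for x in group]
    n_total = len(pooled)
    ranks = average_ranks(pooled)

    weighted_sum = 0.0
    mean_ranks = []
    offset = 0
    for group in groups:
        group_ranks = ranks[offset:offset + len(group)]
        offset += len(group)
        rank_sum = sum(group_ranks)
        weighted_sum += rank_sum ** 2 / len(group)
        mean_ranks.append(rank_sum / len(group))

    h = 12.0 / (n_total * (n_total + 1)) * weighted_sum - 3 * (n_total + 1)

    # Tie correction: t is the number of observations sharing one value.
    tie_sizes = Counter(pooled).values()
    correction = 1 - sum(t ** 3 - t for t in tie_sizes) / (n_total ** 3 - n_total)
    if correction == 0:
        raise ValueError("all observations are identical; H is undefined")
    return h / correction, mean_ranks


# Hypothetical token counts of one collocation type in three small groups.
group_1 = [0, 1, 1, 2]
group_2 = [1, 2, 3, 3]
group_3 = [2, 3, 4, 5]
h_stat, mean_ranks = kruskal_wallis([group_1, group_2, group_3])
print(round(h_stat, 3), [round(r, 2) for r in mean_ranks])
```

With three groups, H would be referred to a chi-square distribution with 2 degrees of freedom; the significance lookup is left out of the sketch.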
ii) Analysis for Hypothesis 2

To address Hypothesis 2, that there are patterns of development in the acquisition of collocations within proficiency levels, non-parametric Friedman repeated measures tests were performed on each group for each set of data, i.e. three Friedman tests for each set of data. The Friedman repeated measures test is the non-parametric equivalent of the repeated measures ANOVA. These tests were followed by Nemenyi's multiple comparisons tests based on the Friedman rank sums. The analysis for Hypothesis 2 also includes those collocation types that were not repeated across groups.

For the free production essay data, separate repeated measures Friedman tests were performed for each group on the tokens for each of the 37 types of collocation found in the students' essays, followed by Nemenyi's multiple comparison procedures. Implicational scaling analyses using the frequency data for each group were also performed and Guttman's coefficients of reproducibility and scalability were calculated.

For the cued production translation data, separate repeated measures Friedman tests were performed for each group. There were 8 types of collocation for Group 1, 6 types for Group 2, and 7 types for Group 3. These tests were followed by Nemenyi's multiple comparisons procedures. Implicational scaling analyses using the frequency data for each group were also performed and Guttman's coefficients of reproducibility and scalability were calculated.

For the cued production blank-filling data, separate repeated measures Friedman tests were performed for each group. There were 14 types of collocation for Group 1, 12 types for Group 2, and 13 types for Group 3. These tests were followed by Nemenyi's multiple comparisons procedures. Implicational scaling analyses using the frequency data for each group were also performed and Guttman's coefficients of reproducibility and scalability were calculated.
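The within-group Friedman comparison described in this section can be sketched in the same spirit. Again this is only an illustration under stated assumptions: the function names and the scores are hypothetical, each row is taken to hold one subject's score on every collocation type, the tie correction is the standard one, and Nemenyi's follow-up comparisons are not implemented. The rank sums returned alongside the statistic are the quantities on which a rank-sum procedure such as Nemenyi's is based.

```python
from collections import Counter


def rank_within_row(row):
    """Rank one subject's scores from 1 to k, averaging the ranks of ties."""
    order = sorted(range(len(row)), key=lambda i: row[i])
    ranks = [0.0] * len(row)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted score ties with the current one.
        while end + 1 < len(order) and row[order[end + 1]] == row[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # sorted positions are 0-based
        for pos in range(start, end + 1):
            ranks[order[pos]] = mean_rank
        start = end + 1
    return ranks


def friedman(rows):
    """Return (chi-square corrected for ties, rank sum of each condition).

    rows: one list per subject, holding that subject's score on each of the
    k conditions (here, collocation types) in the same order.
    """
    n = len(rows)
    k = len(rows[0])
    rank_sums = [0.0] * k
    tie_term = 0
    for row in rows:
        for j, rank in enumerate(rank_within_row(row)):
            rank_sums[j] += rank
        # t is the number of scores sharing one value within this subject.
        tie_term += sum(t ** 3 - t for t in Counter(row).values())

    chi2 = 12.0 / (n * k * (k + 1)) * sum(r ** 2 for r in rank_sums) - 3 * n * (k + 1)
    correction = 1 - tie_term / (n * k * (k ** 2 - 1))
    if correction == 0:
        raise ValueError("every subject has identical scores; statistic undefined")
    return chi2 / correction, rank_sums


# Hypothetical accuracy scores of four subjects on three collocation types.
scores = [
    [0.2, 0.5, 0.9],
    [0.1, 0.4, 0.4],
    [0.3, 0.6, 0.8],
    [0.0, 0.7, 0.5],
]
chi2_stat, rank_sums = friedman(scores)
print(round(chi2_stat, 3), rank_sums)
```

The statistic would be referred to a chi-square distribution with k - 1 degrees of freedom; the significance lookup is left out of the sketch.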
The following chapter presents the results of the analyses.

CHAPTER 4
ANALYSES AND RESULTS

4.0 Introduction

This chapter describes the results from the language proficiency measures and the main analyses performed to address the two hypotheses listed in 2.5. The presentation of the results is organised around the three sets of data used to address each of the hypotheses: the free production essay data, the elicited production translation data and the elicited production blank filling data.

In section 4.1 the results from the language proficiency measures performed on the essay data are reported. The aim of these measures was to screen the data and establish clear proficiency differences among the three groups, using different measures of language proficiency on the essay data. This initial screening of the data was considered necessary, since the proficiency differences between groups are the major independent variable for the present study.

In 4.2 the results of the main analyses of each set of data are described. The analyses and results for Hypothesis 1 are first reported in 4.2.1. The aim of these analyses was to examine evidence for developmental differences in the knowledge of collocations, assessed both in terms of ability to use collocations in the essay data, and in terms of accuracy of response to questions eliciting collocations in the translation and blank filling data. These analyses involved comparisons of collocation use and accuracy between groups using Kruskal-Wallis tests. The Kruskal-Wallis one-way analysis of variance by ranks is a non-parametric test for deciding whether a number of independent groups are from different populations (Siegel & Castellan 1988:206). Evidence for acquisition orders was then sought using implicational scaling of: i) the use of collocations in essays across all groups, and ii) mean accuracy of response to collocation types on the translation and blank filling tests across all groups.
In this way it was hoped to show what differences existed in the subjects' knowledge of collocations across different proficiency levels, and how knowledge of collocations developed. This evidence was used to address the first hypothesis, which states that there are patterns in the development of collocational knowledge across all groups.

Analyses and results for Hypothesis 2 are described in section 4.2.2. These analyses involved comparisons of collocation use and accuracy within each group using the Friedman repeated measures test, which is the non-parametric parallel of the repeated-measures ANOVA. Evidence for acquisition orders within groups was then examined using implicational scaling of collocation use and accuracy for each group. The aim of these analyses was to show what developmental differences and sequences existed in the use of collocations within groups. These results would reveal any group-specific patterns in the development of collocational knowledge.

The results of all the analyses are summarised in 4.3 in relation to the two hypotheses of the study.

4.1 Language Proficiency Results

In this section the results of the analyses performed to determine the proficiency differences between the three groups are described.

4.1.1 Descriptive Statistics

Prior to performing the MANOVA for the six measurements on the free production data, the data were examined with regard to the normality of their distribution. The data for each group were tested for kurtosis and skewness. The results are reported below.

4.1.1.1 Descriptive Statistics for Group 1

The results of the descriptive analysis of the data for Group 1 show normal distributions for the majority of the dependent variables: Holistic Rating (Kurtosis: -.689, Skewness: -.162), Words per T-Unit (Kurtosis: -.183, Skewness: .209), and Error-Free T-Units (Kurtosis: -.669, Skewness: .513).
The results of the analysis for the dependent variable Target-Like Use of Articles also show no significant effects for kurtosis or skewness (Kurtosis: -1.099, Skewness: -.229). Results of the analysis for the dependent variable Lexical Density reveal a slightly peaked distribution (Kurtosis: 1.301, Skewness: -.54). Finally, the results of the analysis for the dependent variable S-Nodes per T-Unit show a distribution that is positively skewed and peaked (Kurtosis: 2.514, Skewness: 1.51).

4.1.1.2 Descriptive Statistics for Group 2

The results of the analysis for the data in Group 2 reveal normal distributions for the dependent variables: Holistic Rating (Kurtosis: .12, Skewness: -.447), Target-Like Use of Articles (Kurtosis: .134, Skewness: -.747), and S-Nodes per T-Unit (Kurtosis: .251, Skewness: .611). The data for Lexical Density (Kurtosis: 7.005, Skewness: 1.94) and Error-Free T-Units (Kurtosis: 1.067, Skewness: 1.259) reveal distributions that are peaked and positively skewed. Finally, the results of the analysis for the dependent variable Words per T-Unit show a peaked distribution (Kurtosis: 3.312, Skewness: .891).

4.1.1.3 Descriptive Statistics for Group 3

The analysis of the data for Group 3 again reveals normal distributions for the majority of the dependent variables: Holistic Rating (Kurtosis: -.855, Skewness: -.3), Lexical Density (Kurtosis: .204, Skewness: .129), Words per T-Unit (Kurtosis: -.042, Skewness: .354), and S-Nodes per T-Unit (Kurtosis: .394, Skewness: .519). The results of the analysis for the dependent variable Target-Like Use of Articles show a peaked and negatively skewed distribution (Kurtosis: 2.986, Skewness: -1.815), while the results of the analysis for the dependent variable Error-Free T-Units show a distribution that is peaked and positively skewed (Kurtosis: 3.372, Skewness: 1.569).
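The kurtosis and skewness screening reported for the three groups can be illustrated with a short sketch. It assumes the simple moment-based definitions (skewness as m3 / m2^1.5 and excess kurtosis as m4 / m2^2 - 3, where m2, m3 and m4 are the second, third and fourth central moments). The statistics package behind the reported figures is not named in this section and may apply small-sample adjustments, so the sketch is not expected to reproduce those figures exactly. The function name and the scores are hypothetical.

```python
def skewness_and_kurtosis(scores):
    """Return (skewness, excess kurtosis) from the central moments of scores.

    Under these definitions both values are 0 for a normal distribution;
    positive kurtosis indicates a peaked distribution and positive skewness
    a longer right-hand tail.
    """
    n = len(scores)
    mean = sum(scores) / n
    m2 = sum((x - mean) ** 2 for x in scores) / n
    m3 = sum((x - mean) ** 3 for x in scores) / n
    m4 = sum((x - mean) ** 4 for x in scores) / n
    if m2 == 0:
        raise ValueError("all scores are identical; moments are undefined")
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3


# Hypothetical proficiency scores for one measure in one group.
scores = [52, 58, 61, 63, 64, 66, 67, 70, 73, 88]
skew, kurt = skewness_and_kurtosis(scores)
print(round(skew, 3), round(kurt, 3))
```

Running the function once per measure and per group would give the kind of grid summarised in Table 11.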
Despite the fact that some variables in each group displayed slightly peaked or skewed distributions, overall the distribution of the data from the analyses of the language proficiency measures was found to be normal. For a summary of the results on the frequency distributions of the language proficiency measures see Table 11 below.

Table 11. Kurtosis* and skewness* for the language proficiency measures

                      Group 1                Group 2                Group 3
Measures          Kurtosis  Skewness     Kurtosis  Skewness     Kurtosis  Skewness
Hol. Rating         -.689     -.162         .12      -.447        -.855     -.3
TLU                -1.099     -.229         .134     -.747        2.986    -1.815
Lex. Density        1.301     -.54         7.005     1.94          .204      .129
Words per T         -.183      .209        3.312      .891        -.042      .354
Error-Free T        -.669      .513        1.067     1.259        3.372     1.569
S-nodes per T       2.514     1.51          .251      .611         .394      .519

* Values > +1 show distributions that are not normal.

4.1.2 Results of the MANOVA

A MANOVA was performed for the factor Group (three levels) and the six dependent variables. The results of the MANOVA reveal a significant main effect for Group (F(6, 268) = 69.363, p = .0001). Following this, six one-way ANOVAs were performed on the data to examine which of the six language proficiency measures show differences between the groups. The results of the univariate ANOVAs for the six variables are reported below.

4.1.3 Holistic Rating

The results of the ANOVA for the holistic rating show no significant difference between the groups (F(2, 272) = 1.148, p = .3188). As can be seen from Table 12, the mean holistic rating for each group is similar.

Table 12. Means and standard deviations for the dependent variable: Holistic Rating

GROUP       COUNT     MEAN       STD. DEV.
Group 1      91       63.868     12.638
Group 2      94       66.574     12.333
Group 3      90       66.333     15.039

4.1.4 Target-Like Use of Articles

The ANOVA for the dependent variable target-like use of articles (TLU) shows a significant main effect for the factor Group (F(2, 272) = 31.306, p = .0001).
To examine the source of the significant effect for the factor Group, post-hoc comparisons of the means for each group were performed. The results of the comparisons are illustrated in Table 13. There is a significant difference between all groups at the p < .05 level of significance. The table of means, Table 14, shows a steady increase in TLU from Group 1 to Group 3.

Table 13. Post-hoc comparisons for the dependent variable: Target-Like Use of Articles

Comparison              Mean Diff.    Fisher PLSD    Scheffe F-test    Dunnett t
Group 1 vs. Group 2     -152.815      73.324*         8.419*           4.103
Group 1 vs. Group 3     -297.831      74.122*        31.295*           7.911
Group 2 vs. Group 3     -145.016      73.531*         7.539*           3.883

* Significant at .05 level

Table 14. Means and standard deviations for the dependent variable: Target-Like Use of Articles

GROUP       COUNT     MEAN       STD. DEV.
Group 1      91       54.879     32.479
Group 2      94       70.160     23.117
Group 3      90       84.662     18.292

4.1.5 Lexical Density

The ANOVA for the dependent variable Lexical Density shows a significant main effect for the factor Group (F(2, 272) = 23.642, p = .0001). To examine the source of the significant main effect, post-hoc comparisons of the means for each group were performed. The results of the comparisons are illustrated in Table 15. There is a significant difference between all groups at the p < .05 level of significance. The table of means, Table 16, shows that there is a steady decrease in Lexical Density from Group 1 to Group 3. The reason for this drop is made clear in the light of the further analyses performed.

Table 15. Post-hoc comparisons for the dependent variable: Lexical Density

Comparison              Mean Diff.    Fisher PLSD    Scheffe F-test    Dunnett t
Group 1 vs. Group 2      27.091       13.845*         7.422*           3.853
Group 1 vs. Group 3      48.765       13.995*        23.533*           6.86
Group 2 vs. Group 3      21.674       13.884*         4.724*           3.074

* Significant at .05 level

Table 16. Means and standard deviations for the dependent variable: Lexical Density

GROUP       COUNT     MEAN       STD. DEV.
Group 1      91       42.017     5.240
Group 2      94       39.308     5.030
Group 3      90       37.141     3.956

4.1.6 Length of T-Units

The ANOVA for the dependent variable Words per T-Unit shows a significant main effect for the factor Group (F(2, 272) = 151.684, p = .0001). To examine the source of the significant main effect, post-hoc comparisons of the means for each group were performed. The results of the comparisons are illustrated in Table 17. There is a significant difference between all groups at the p < .05 level of significance. The table of means, Table 18, shows a steady increase in the length of the T-Units from Group 1 to Group 3.

Table 17. Post-hoc comparisons for the dependent variable: Words per T-Unit

Comparison              Mean Diff.    Fisher PLSD    Scheffe F-test    Dunnett t
Group 1 vs. Group 2      -9.734       4.619*           8.609*           4.15
Group 1 vs. Group 3     -39.622       4.669*         139.603*          16.709
Group 2 vs. Group 3     -29.889       4.632*          80.719*          12.706

* Significant at .05 level

Table 18. Means and standard deviations for the dependent variable: Words per T-Unit

GROUP       COUNT     MEAN       STD. DEV.
Group 1      91        6.801     1.070
Group 2      94        7.774     1.339
Group 3      90       10.763     2.177

4.1.7 Error-Free T-Units

The ANOVA for the dependent variable Error-Free T-Units shows a significant main effect for the factor Group (F(2, 272) = 9.031, p = .0002). To examine the source of the significant effect for the factor Group, post-hoc comparisons of the means for each group were performed. The results of the comparisons are illustrated in Table 19. There is a significant difference between Group 3 and Group 1, and between Group 3 and Group 2, at the p < .05 level of significance. The highest proficiency group had the smallest number of Error-Free T-Units (see Table 20).

Table 19. Post-hoc comparisons for the dependent variable: Error-Free T-Units

Comparison              Mean Diff.    Fisher PLSD    Scheffe F-test    Dunnett t
Group 1 vs. Group 2      -.095        2.09            .004              .09
Group 1 vs. Group 3      3.892        2.112*         6.581*            3.628
Group 2 vs. Group 3      3.988        2.096*         7.019*            3.747

* Significant at .05 level

Table 20. Means and standard deviations for the dependent variable: Error-Free T-Units

GROUP       COUNT     MEAN       STD. DEV.
Group 1      91       9.681      6.993
Group 2      94       9.777      8.662
Group 3      90       5.789      5.596

4.1.8 S-Nodes per T-Unit

The ANOVA for the dependent variable S-Nodes per T-Unit shows a significant main effect for the factor Group (F(2, 272) = 89.607, p = .0001). To examine the source of the significant effect for the factor Group, post-hoc comparisons of the means for each group were performed. The results of the comparisons are illustrated in Table 21. There is a significant difference between all groups at the p < .05 level of significance. There is a steady increase in the number of S-Nodes per T-Unit from Group 1 to Group 3 (see Table 22).

Table 21. Post-hoc comparisons for the dependent variable: S-Nodes per T-Unit

Comparison              Mean Diff.    Fisher PLSD    Scheffe F-test    Dunnett t
Group 1 vs. Group 2     -2.462        .691*          24.614*            7.016
Group 1 vs. Group 3     -4.747        .698*          89.551*           13.383
Group 2 vs. Group 3     -2.285        .693*          21.085*            6.494

* Significant at .05 level

Table 22. Means and standard deviations for the dependent variable: S-Nodes per T-Unit

GROUP       COUNT     MEAN       STD. DEV.
Group 1      91       1.119      .144
Group 2      94       1.366      .229
Group 3      90       1.594      .312

4.1.9 Summary of the Results for the Language Proficiency Analyses

The results of the analyses for the language proficiency measures show an overall significant main effect for the factor Group. No difference was found between the three groups in the holistic rating of the essays, even though the three groups represent three different levels of language proficiency. This could be due to the nature of the holistic rating, since it takes into account not just the use of language, but also the structure of the essay, its organisation, the expression of ideas, the explanations and arguments provided by the writer, etc.
Therefore the ratings based on the holistic scale may obscure differences among subjects that are attributable to language proficiency, which is of most interest to this study. However, this lack of significant differences between groups using this measure is counter-balanced by the fact that reliable differences were found using the other measures, and these are in line with the claim that the different groups are composed of subjects at different levels of proficiency, and possibly different stages of development.

As can be seen from Table 16, there is a significant drop in Lexical Density for subjects in Groups 2 and 3 compared with subjects in Group 1, and for subjects in Group 3 compared with subjects in Group 2. To this extent, increases in proficiency appeared to be related to increases in the number of grammatical words used in the essays (as the TLU analysis showed). Conversely, lower-level students were grammatically less accurate in their essays (as the results from the TLU analysis show), and their omission of grammatical words (e.g. articles) contributed to a higher percentage score for Lexical Density. In Group 3, where the students are grammatically more accurate, as the TLU analysis showed, the lexical density is lower. These results are also consistent with the findings of recent research which showed that subjects of lower proficiency levels use more content words, while those of higher proficiency levels use more function words, e.g. pronouns, articles, and prepositions (Ghadessy 1989).

The results for the dependent variable Words per T-Unit also reflect different proficiency groupings (see Table 18). The higher the level of proficiency, the more subordination and embedding the student uses in the construction of sentences, and thus the longer the sentences they produce.
This finding, it must be noted, is in partial agreement with the finding reported by Larsen-Freeman and Strom (1977), who found that the mean length of the T-Units in the writings of the subjects in their study increased steadily with proficiency level, but the statistical analysis performed on their data did not yield significant differences. Larsen-Freeman and Strom conclude that length of T-Units is still "a viable contender on which to base an index of development" (Larsen-Freeman & Strom 1977:132).

The results for the dependent variable Error-Free T-Units do show significant differences between groups (see Table 19). Although these differences support the claim that the three different groups reflect different proficiency groupings, the direction of the difference is in contrast to the findings of previous research. In line with Larsen-Freeman's findings (1978) it was expected that more proficient subjects would use more Error-Free T-Units than less proficient subjects. However, the present findings show that subjects in Groups 1 and 2 use significantly more Error-Free T-Units than subjects in Group 3, while Groups 1 and 2 do not differ significantly from each other (see Tables 19 and 20). The present results could be due to the fact that the more proficient subjects in this study are simply trying harder to produce more complex syntax than the less proficient subjects. It is certainly true that subjects in the present study are not at a sufficiently advanced level to make no mistakes in their writing, since the subjects in Group 3, who have had the longest period of instruction in English, and who are older by one and two years on average than subjects in the other groups, are only at a post-intermediate level.
In Larsen-Freeman's study subjects were from a larger range of proficiency levels (5 groups), from subjects that were of very low proficiency and needed a great deal of ESL instruction (Group 1) to subjects that were advanced enough not to need any more ESL instruction (Group 5). Even though Larsen-Freeman does not report the post-hoc comparisons for the Error-Free T-Units measure, it is apparent from the percentages reported in her paper that it is at the advanced level that subjects use significantly more Error-Free T-Units, e.g. there is a 15% increase in the amount of Error-Free T-Units used by the advanced learners in Group 5 (see Table 23).

Table 23. Percentage of Error-Free T-Units in Larsen-Freeman (1978)

Group     Number     %EFT
1          37        11.4
2          39        18.5
3          45        22.1
4          56        34.3
5          35        49.6

(Adapted from Larsen-Freeman 1978:445)

In line with the above interpretation, that more proficient subjects in Group 3 try harder to produce more complex syntax and so make greater numbers of errors, it was shown that Group 3 students write longer T-Units than subjects in the other groups. They should therefore have a higher chance of making mistakes than subjects in the other two groups. The shorter the T-Units, the less chance subjects have of making spelling, punctuation, grammatical, or syntactic mistakes.

Also in line with this argument are the results for the dependent variable S-Nodes per T-Unit. The higher the level of the students' proficiency, the more syntactically complex sentences they produce in writing (see Bardovi-Harlig 1992a).

In summary, the higher the level students belong to, the more accurate they are in the use of articles, and the more syntactically complex and longer sentences they produce, while their lexical density decreases and their chance of making an error increases.

4.2 Results of the Main Analyses

In this section the results of the analyses performed to address each of the two hypotheses are described.
4.2.1 Hypothesis 1: There are patterns of development in collocational knowledge across proficiency levels

To address Hypothesis 1, the three sets of data were analysed separately:

i) For the free production data, tokens of the correct use of the thirty-seven types of collocation were recorded. Lack of, or incorrect use of, a particular type was scored as 0. The data were entered as the sum of tokens of correct usage of each collocation type by each subject.

ii) For the translation data, the mean accuracy of response to each of the six types of collocation repeated across groups was calculated.

iii) For the blank filling data, the mean accuracy of response to each of the eleven types of collocation repeated across groups was calculated.

The data for these analyses were examined and were not found to be normally distributed. This is due, in the case of the elicited production measures, to the fact that means for accurate responses to some types were calculated on the basis of a small number of responses to tokens, thus restricting the possible range of scores on these types. In the case of the essay data the mean use of many types of collocation did not follow the normal pattern of distribution within and across groups. This justifies the use of non-parametric Kruskal-Wallis tests, followed by Dunn's multiple comparisons procedures, to address the first hypothesis regarding between-group differences in accuracy and use of collocations.

4.2.1.1 Essay Data (All Groups)

The sum of tokens for each of the 37 types of collocation was calculated for each essay. Kruskal-Wallis tests were performed to identify significant between-group differences with respect to each collocation type. The results of the Kruskal-Wallis tests of the mean tokens of each of the 37 collocation types used by subjects in each group, corrected for ties, together with the results of the post-hoc Dunn's multiple comparisons procedures, are reported below. These are summarised in Table 24.
Collocation types that did not show significant differences across all groups, or which did not contain any tokens for one or two of the groups, are not included in the table.

Table 24. Summary of the results of the Kruskal-Wallis tests and post-hoc analyses for the essay data

                            Dunn's Procedure: Mean Rank Differences
Type               K-W      1 vs. 2              2 vs. 3              1 vs. 3
1. Noun Prep       15.664   122.401 - 136.926    136.926 < 154.894*   122.401 < 154.894*
2. Noun to inf     6.832    132 - 139.33         139.33 - 142.678     132 - 142.678
4. Prep Noun       19.104   129.742 < 163.234*   163.234 > 119.994*   129.742 - 119.994
5. Adjective Prep  6.118    125.242 < 144.601*   144.601 - 144.006    125.242 < 144.006*
11. SV(O) prep O   14.592   146.33 - 154.053     154.053 > 112.811*   146.33 > 112.811*
12. SV to inf      41.069   97.242 < 166.83*     166.83 > 149.1*      97.242 < 149.1*
13. SV inf         71.452   89.11 < 144.34*      144.34 < 180.811*    89.11 < 180.811*
14. SVV-ing        6.19     137.198 - 147.963    147.963 > 128.406*   137.198 - 128.406
15. SVO to inf     16.115   127.473 - 136.239    136.239 - 150.483    127.473 < 150.483*
19. SV(O) that     45.251   103.11 < 141.569*    141.569 < 169.55*    103.11 < 169.55*
21. SVOc           19.721   128 - 135.154        135.154 - 151.083    128 < 151.083*
24. SV(O) wh       8.585    128.06 < 147.096*    147.096 - 138.55     128.06 - 138.55
26. SVc            41.535   170.088 > 147.202*   147.202 > 95.944*    170.088 > 95.944*
29. Adj Noun       63.637   177.049 > 149.261*   149.261 > 86.756*    177.049 > 86.756*
31. N1 of N2       8.371    135.588 - 133.936    133.936 - 144.683    135.588 - 144.683
36. Prep Det N     16.584   113.813 < 158.527*   158.527 > 141.017*   113.813 < 141.017*
37. Phrasal Verb   51.136   116.082 < 175.399*   175.399 > 121.1*     116.082 < 121.1*

*: Significant at the .05 level
< or >: direction of the difference

4.2.1.1.1 Kruskal-Wallis Analyses for the Essay Data

Type 1. Noun Preposition - Results of the Kruskal-Wallis test for numbers of tokens of Noun Preposition collocations used in the subjects' essays show a significant main effect for the factor Group (Kruskal-Wallis χ² (2, N = 275) = 15.664, p = .0004). The table of means, Table 25, shows that the mean number of tokens used per group increases across groups.
The results of the post-hoc Dunn's multiple comparisons procedure show significant differences at the p < .05 level between numbers of tokens of this type used in Groups 1 and 3, and in Groups 2 and 3, but not in Groups 1 and 2 (see Table 24).

Type 2. Noun to Infinitive - Results of the Kruskal-Wallis test for numbers of tokens of Noun to Infinitive collocations used in the subjects' essays show the difference between groups to be significant (Kruskal-Wallis χ² (2, N = 275) = 6.832, p = .0328). However, the results of Dunn's multiple comparisons procedure show no significant difference between any pairs of groups at the p < .05 level, even though the mean number of tokens used per group increases across groups and the difference between Group 1 and Group 2 is approaching significance (see Table 24).

Type 4. Preposition Noun - Results of the Kruskal-Wallis test for numbers of tokens of Preposition Noun collocations used in the subjects' essays show a significant main effect for the factor Group (Kruskal-Wallis χ² (2, N = 275) = 19.104, p = .0001). The table of means, Table 25, shows that subjects in Group 2 used more collocations of this type in their essays than subjects in Group 1, and subjects in Group 1 used more collocations of this type than subjects in Group 3. The results of the post-hoc Dunn's multiple comparisons procedure show significant differences in numbers of tokens of this type used between Groups 1 and 2, and between Groups 2 and 3, but not between Groups 1 and 3 (see Table 24).

Type 5. Adjective Preposition - Results of the Kruskal-Wallis test for numbers of tokens of Adjective Preposition collocations used in the subjects' essays show a significant main effect for the factor Group (Kruskal-Wallis χ² (2, N = 275) = 6.118, p = .0469).
The table of means, Table 25, shows that the mean number of tokens used by subjects in Group 2 is nearly equal to the mean number of tokens of this type used by subjects in Group 3, while subjects in Group 1 produced considerably fewer tokens of this type than subjects in Groups 2 and 3. The results of the post-hoc Dunn's multiple comparisons procedure show significant differences in numbers of tokens of this type used between Groups 1 and 3, and between Groups 1 and 2, but not between Groups 2 and 3 (see Table 24).

Type 11. SV(O) Preposition O - Results of the Kruskal-Wallis test for numbers of tokens of SV(O) Preposition O collocations used in the subjects' essays show a significant main effect for the factor Group (Kruskal-Wallis χ² (2, N = 275) = 14.592, p = .0007). The table of means, Table 25, shows that subjects in Group 2 produced more tokens of this type of collocation than subjects in Group 1, who used more tokens of this type of collocation than subjects in Group 3. The results of the post-hoc Dunn's multiple comparisons procedure show significant differences in numbers of tokens of this type used between Groups 1 and 3, and between Groups 2 and 3, but not between Groups 1 and 2 (see Table 24).

Type 12. SV to Infinitive - Results of the Kruskal-Wallis test for numbers of tokens of SV to Infinitive collocations used in the subjects' essays show a significant main effect for the factor Group (Kruskal-Wallis χ² (2, N = 275) = 41.069, p = .0001). The table of means, Table 25, shows that subjects in Group 2 produced more tokens of this type of collocation than subjects in Group 3, who used more tokens of this type of collocation than subjects in Group 1. The results of the post-hoc Dunn's multiple comparisons procedure show significant differences in numbers of tokens of this type used across all groups (see Table 24).

Type 13.
SV Infinitive - Results of the Kruskal-Wallis test for numbers of tokens of SV Infinitive collocations used in the subjects' essays show a significant main effect for the factor Group (Kruskal-Wallis χ² (2, N = 275) = 71.452, p = .0001). The table of means, Table 25, shows that the mean number of tokens used per group increases across groups. The results of the post-hoc Dunn's multiple comparisons procedure show significant differences in numbers of tokens of this type used across all groups (see Table 24).

Type 14. SVV-ing - Results of the Kruskal-Wallis test for numbers of tokens of SVV-ing collocations used in the subjects' essays show a significant main effect for the factor Group (Kruskal-Wallis χ² (2, N = 275) = 6.19, p = .0453). The table of means, Table 25, shows that subjects in Group 2 produced more tokens of this type of collocation than subjects in Group 1, who used more tokens of this type of collocation than subjects in Group 3. The results of the post-hoc Dunn's multiple comparisons procedure show significant differences in numbers of tokens of this type used only between Groups 2 and 3, but not between Groups 1 and 2, or between Groups 1 and 3 (see Table 24).

Type 15. SVO to Infinitive - Results of the Kruskal-Wallis test for numbers of tokens of SVO to Infinitive collocations used in the subjects' essays show a significant main effect for the factor Group (Kruskal-Wallis χ² (2, N = 275) = 16.115, p = .0003). The table of means, Table 25, shows that the mean number of tokens used increases across groups. The results of the post-hoc Dunn's multiple comparisons procedure show significant differences in numbers of tokens of this type only between Groups 1 and 3, but not between Groups 1 and 2, or between Groups 2 and 3 (see Table 24).

Type 19.
SV(O) that-clause - Results of the Kruskal-Wallis test for numbers of tokens of SV(O) that-clause collocations used in the subjects' essays show a significant main effect for the factor Group (Kruskal-Wallis χ² (2, N = 275) = 45.251, p = .0001). The table of means, Table 25, shows that the mean number of tokens used per group increases across groups. The results of the post-hoc Dunn's multiple comparisons procedure show significant differences in numbers of tokens of this type used between all groups (see Table 24).

Type 21. SVOc - Results of the Kruskal-Wallis test for numbers of tokens of SVOc collocations used in the subjects' essays show a significant main effect for the factor Group (Kruskal-Wallis χ² (2, N = 275) = 19.721, p = .0001). The table of means, Table 25, shows that the mean number of tokens used per group increases across groups. The results of the post-hoc Dunn's multiple comparisons procedure show significant differences in numbers of tokens of this type used only between Groups 1 and 3, but not between Groups 1 and 2, or between Groups 2 and 3 (see Table 24).

Type 24. SV(O) wh-word - Results of the Kruskal-Wallis test for numbers of tokens of SV(O) wh-word collocations used in the subjects' essays show a significant main effect for the factor Group (Kruskal-Wallis χ² (2, N = 275) = 8.585, p = .0137). The table of means, Table 25, shows that subjects in Group 2 produced more tokens of this type of collocation than subjects in Group 3, who used more tokens of this type of collocation than subjects in Group 1. The results of the post-hoc Dunn's multiple comparisons procedure show significant differences in numbers of tokens of this type used only between Groups 1 and 2, but not between Groups 2 and 3, or Groups 1 and 3 (see Table 24).

Type 26.
SVc - Results of the Kruskal-Wallis test for numbers of tokens of SVc collocations used in the subjects' essays show a significant main effect for the factor Group (Kruskal-Wallis χ² (2, N = 275) = 41.535, p = .0001). The table of means, Table 25, shows that the mean number of tokens of this type of collocation decreases as the proficiency level of the subjects increases. The results of the post-hoc Dunn's multiple comparisons procedure show significant differences in numbers of tokens of this type of collocation between all groups (see Table 24).

Type 29. Adjective Noun - Results of the Kruskal-Wallis test for numbers of tokens of Adjective Noun collocations used in the subjects' essays show a significant main effect for the factor Group (Kruskal-Wallis χ² (2, N = 275) = 63.637, p = .0001). The table of means, Table 25, shows that the mean number of tokens of this type of collocation decreases as the proficiency level of the subjects increases. The results of the post-hoc Dunn's multiple comparisons procedure show significant differences in numbers of tokens of this type of collocation across all groups (see Table 24).

Type 30. Noun Verb - Results of the Kruskal-Wallis test for numbers of tokens of Noun Verb collocations used in the subjects' essays show a significant main effect for the factor Group (Kruskal-Wallis χ² (2, N = 275) = 6.212, p = .0448), though the Dunn's multiple comparisons procedure revealed no significant between-group differences (see Table 24).

Type 31. Noun1 of Noun2 - Results of the Kruskal-Wallis test for numbers of tokens of Noun1 of Noun2 collocations used in the subjects' essays show the difference between groups to be significant (Kruskal-Wallis χ² (2, N = 275) = 8.371, p = .0152). However, the results of Dunn's multiple comparisons procedure show no significant differences between any of the groups at the p < .05 level (see Table 24).

Type 36.
Preposition Determiner Noun - Results of the Kruskal-Wallis test for numbers of tokens of Preposition Determiner Noun collocations used in the subjects' essays show a significant main effect for the factor Group (Kruskal-Wallis χ² (2, N = 275) = 16.584, p = .0003). The table of means, Table 25, shows that subjects in Group 2 produced more tokens of this type of collocation than subjects in Group 3, who used more tokens of this type of collocation than subjects in Group 1. The results of the post-hoc Dunn's multiple comparisons procedure show significant differences between all groups (see Table 24).

Type 37. Phrasal Verb - Results of the Kruskal-Wallis test for numbers of tokens of Phrasal Verb collocations used in the subjects' essays show a significant main effect for the factor Group (Kruskal-Wallis χ² (2, N = 275) = 51.136, p = .0001). The table of means, Table 25, shows that subjects in Group 2 produced more tokens of this type of collocation than subjects in Group 3, who used more tokens of this type of collocation than subjects in Group 1. The results of the post-hoc Dunn's multiple comparisons procedure show significant differences between Groups 1 and 2, and between Groups 2 and 3, but not between Groups 1 and 3 (see Table 24).

Table 25. Means and standard deviations by group for the essay data

                         Means per Group              Std. Dev. per Group
Types                    Group1   Group2   Group3     Group1   Group2   Group3
1. Noun Prep             .099     .234     .400       .335     .517     .650
2. Noun to infinitive    0.000    .074     .083       0.000    .366     .323
3. Noun that             0.000    0.000    0.000      0.000    0.000    0.000
4. Preposition Noun      .626     .989     .411       1.217    1.187    .701
5. Adjective Prep        .231     .426     .411       .616     .823     .833
6. Pred Adj to inf       .033     .128     .100       .180     .421     .337
7. Adj that              0.000    .011     .011       0.000    .103     .105
8. SVO to O/SVOO         .209     .117     .133       .587     .384     .429
9. SVO to O              0.000    0.000    .011       0.000    0.000    .105
10. SVO for O/SVOO       .011     .011     0.000      .105     .103     0.000
11. SV(O) prep O         2.066    2.213    1.200      2.081    2.047    1.432
12. SV to inf            .835     2.404    1.933      1.790    2.329    2.360
13. SV inf               .286     1.287    2.389      .898     1.708    2.355
14. SVV-ing              .330     .606     .222       .844     1.483    .746
15. SVO to inf           .011     .106     .256       .105     .427     .628
16. SVO inf              .011     .021     .044       .105     .145     .207
17. SVO V-ing            0.000    .032     0.000      0.000    .177     0.000
18. SV poss V-ing        0.000    0.000    0.000      0.000    0.000    0.000
19. SV(O) that           .165     .702     1.078      .719     1.199    1.326
20. SVO to be c          0.000    .011     0.000      0.000    .103     0.000
21. SVOc                 0.000    .053     .256       0.000    .226     .646
22. SVOO                 0.000    0.000    0.000      0.000    0.000    0.000
23. SV(O) Adverbial      .813     .649     .533       .999     .924     .782
24. SV(O) wh-word        .055     .245     .133       .273     .581     .373
25. S(it)VO to inf       0.000    0.000    0.000      0.000    0.000    0.000
26. SVc                  7.846    6.160    3.789      5.131    3.740    3.000
27. Verb Noun (creat)    .549     .681     .800       1.036    .941     1.083
28. Verb Noun (erad)     0.000    0.000    0.000      0.000    0.000    0.000
29. Adj Noun             3.440    2.021    .856       3.078    1.328    1.076
30. Noun Verb            0.000    0.000    .033       0.000    0.000    .181
31. Noun1 of Noun2       .044     .011     .122       .295     .103     .419
32. Adv Adj              0.000    0.000    0.000      0.000    0.000    0.000
33. Verb Adverb          0.000    .021     .056       0.000    .145     .275
34. Noun Noun            .209     .298     .367       .548     .583     .800
35. Miscellaneous        .033     .043     .022       .233     .203     .148
36. Prep Det Noun        .714     1.362    1.033      1.088    1.335    1.146
37. Phrasal Verb         .143     .936     .200       .382     1.096    .050

[Figure: "Collocation Tokens - Essay Data - All Groups"; x-axis: Collocation Types; series: Group 1, Group 2, Group 3]

Note. Only those collocation types showing significant differences across groups are included in this figure.

Figure 3. Mean use of collocation tokens - essay data - all groups

4.2.1.1.2 Summary of the Results for the Essay Data

The results of the Kruskal-Wallis analyses of the accurate use of the 37 types of collocations in the students' essays partially support Hypothesis 1, since there are significant differences between different proficiency groups in the use of collocations. These differences are clear in the use of the collocation Types 13. SV Infinitive and 19. SV(O) that.
As the proficiency level increases, the accurate use of these two types of collocations increases, resulting in significant between-group differences across all three groups. Type 1. Noun Preposition collocations are also positively related to proficiency, since Group 3 subjects use significantly more tokens of this type of collocation than subjects in the other two groups.

The results also show that the direction of the between-group differences is not always the expected one. With respect to the collocation Types 26. SVc and 29. Adjective Noun, proficiency is negatively correlated with accurate use of these two types across groups: the less proficient students use significantly more tokens of these two types of collocation than the more proficient students. There are also collocation Types, 4. Preposition Noun, 12. SV to Infinitive, 36. Prep Det Noun, and 37. Phrasal Verb, for which Group 2 subjects use significantly more tokens than either Group 1 or Group 3 subjects. For Type 5. Adjective Preposition, Group 2 and Group 3 subjects use significantly more collocations than subjects in Group 1, while for Type 11. SV(O) Preposition O, Group 1 and Group 2 subjects use significantly more tokens than Group 3 subjects. There are also a number of collocation types that did not receive any tokens of accurate use by any of the groups (3. Noun that, 18. SV possessive V-ing, 22. SVOO, 25. S(it)VO to inf, 28. Verb Noun (eradication), and 32. Adverb Adjective).

The results of the analysis for this set of data, summarised below in Table 26, suggest that there are indeed proficiency-related differences in the accurate use of collocations, and that there are specific types of collocation that are used in the early stages of proficiency, and others that are used in the later stages of development.

Table 26. Collocational use distinguishing proficiency levels

Group 1                  Group 2                  Group 3
Collocation Types        Collocation Types        Collocation Types
26. SVc**                4. Prep Noun**           1. Noun Prep**
29. Adjective Noun**     12. SV to Inf**          13. SV Inf**
11. SV(O) Prep O*        36. Prep Det Noun**      19. SV(O) that**
                         37. Phrasal Verb**       5. Adjective Prep*
                         5. Adjective Prep*
                         11. SV(O) Prep O*

**: Significantly more occurrences than the other two groups
*: Significantly more occurrences than one other group

4.2.1.1.3 Implicational Scaling for the Essay Data (All Groups)

For the implicational scaling analysis the Guttman procedure was used. When the Guttman analysis reveals that a particular scale is consistently interpretable, that is, if one item on the scale is statistically consistently more difficult than another, which is in turn harder than another, then the scale attains a certain predictive power (Davidson 1987). The coefficient of reproducibility, which shows how accurately a subject's performance can be predicted from that person's position in the matrix, and the coefficient of scalability, which is a single statistic detailing the strength of the items as an ordered scale and indicating whether a given set of features is truly scalable and unidimensional, were calculated. The higher the value of the coefficient of scalability, the more "implicational" the scale (Davidson 1987).

Each subject was coded as having used (1), or not used (0), each of the 37 types of collocation in their essays. The two axes of the matrix for the implicational scaling consisted of the 37 items ranked from most commonly used by all subjects to least commonly used, and the 275 subjects ranked in order of their frequency of use of all types of collocations, from subjects using the most types to subjects using the fewest types. This matrix is summarised for the first 17 types, mean > .1, in Figure 4 below. The coefficient of reproducibility for this analysis was .90. The coefficient of scalability was .33.
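The two coefficients can be illustrated with a minimal sketch on a small 0/1 matrix. This is not the procedure as run in this study. It assumes the commonly cited textbook formulas (reproducibility = 1 - errors / cells; minimum marginal reproducibility = sum of the larger marginal of each item / cells; scalability = (reproducibility - minimum marginal reproducibility) / (1 - minimum marginal reproducibility)), and it assumes one particular error-counting convention, in which every cell that differs from the ideal pattern implied by a subject's total score counts as an error. The function name and the matrix are hypothetical.

```python
def guttman_coefficients(matrix):
    """matrix: one row per subject, one 0/1 code per item (column).

    Returns (coefficient of reproducibility,
             minimum marginal reproducibility,
             coefficient of scalability).
    """
    n_subjects = len(matrix)
    n_items = len(matrix[0])
    # Number of subjects coded 1 on each item.
    totals = [sum(row[j] for row in matrix) for j in range(n_items)]
    # Order items from most to least frequently coded 1.
    order = sorted(range(n_items), key=lambda j: -totals[j])

    errors = 0
    for row in matrix:
        score = sum(row)
        # Ideal pattern (assumed convention): the `score` most common
        # items are 1 and all remaining items are 0.
        for position, j in enumerate(order):
            expected = 1 if position < score else 0
            if row[j] != expected:
                errors += 1

    cells = n_subjects * n_items
    c_rep = 1 - errors / cells
    mm_rep = sum(max(t, n_subjects - t) for t in totals) / cells
    # Scalability is undefined when the marginals alone predict every cell.
    c_scal = (c_rep - mm_rep) / (1 - mm_rep) if mm_rep < 1 else float("nan")
    return c_rep, mm_rep, c_scal


# Hypothetical matrix: five subjects, three items, one deviant row.
matrix = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 1, 0],  # deviates from the ideal pattern for a total score of 1
    [0, 0, 0],
]
c_rep, mm_rep, c_scal = guttman_coefficients(matrix)
print(round(c_rep, 3), round(mm_rep, 3), round(c_scal, 3))
```

The sketch shows why a matrix in which most cells are 0 can yield a high coefficient of reproducibility alongside a low coefficient of scalability: lopsided marginals raise the minimum marginal reproducibility, which leaves little room for improvement over it.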
While the coefficient of reproducibility is at the level necessary for this implicational scale to be considered valid (see Andersen 1978), the coefficient of scalability is below the recommended level of .6 (Hatch & Lazaraton 1991:212). This suggests that while the implicational scale for the essay data is valid, the variance in terms of numbers of errors, and the fact that most subjects did not use the majority of the scaled collocations, resulted in the low coefficient of scalability.

Note. Only those types with a mean > .1 are included in this figure.

Figure 4. Mean use of collocation tokens per type in the essay data.

4.2.1.2 Translation Data (All Groups)

The data set used in these analyses consisted of the mean accuracy of response to each of the six types of collocation repeated across groups in the translation test. As with the essay data reported above, the procedure followed in analysing the translation data was to perform Kruskal-Wallis tests of the differences between groups for each collocation type separately. Subsequently, where significant group effects were identified, post-hoc Dunn's multiple comparisons procedures were calculated in order to identify the source of the significant contrasts between groups. The results of the Kruskal-Wallis tests, together with the results of post-hoc Dunn's multiple comparisons procedures, are reported below. These are summarised in Table 27. Type 16. SVO Infinitive showed no significant across-group differences and therefore is not included in the table.

Table 27. Summary of the results of the Kruskal-Wallis tests and post-hoc analyses for the translation data

                              Dunn's Procedure: Mean Rank Differences
Type                 K-W      1 vs. 2              2 vs. 3              1 vs. 3
1. Noun Prep         51.334   122.505 - 110.681    110.681 < 182.2*     122.505 < 182.2*
5. Adjective Prep    6.503    128.703 - 136.404    136.404 - 149.06     128.703 < 149.06*
11. SV(O) Prep O     14.546   127.434 < 159.479*   159.479 > 126.25*    127.434 - 126.25
14. SVV-ing          33.999   122.709 - 121.793    121.793 < 170.38*    122.709 < 170.38*
27. Verb N (creat)   85.758   101.269 < 126.511*   126.511 < 187.13*    101.269 < 187.13*

*: Significant at the .05 level
< or >: Direction of the difference

4.2.1.2.1 Kruskal-Wallis Analyses for the Translation Data

Type 1. Noun Preposition - Results of the Kruskal-Wallis test of responses to Noun Preposition collocations show a significant main effect for the factor Group (Kruskal-Wallis χ² (2, N = 275) = 51.334, p = .0001). The results of the post-hoc Dunn's multiple comparisons procedure show significant differences in the mean accuracy of response to this type of collocation between Groups 1 and 3, and between Groups 2 and 3, but not between Groups 1 and 2 (see Table 27).

Type 5. Adjective Preposition - Results of the Kruskal-Wallis test of responses to Adjective Preposition collocations show a significant main effect for the factor Group (Kruskal-Wallis χ² (2, N = 275) = 6.503, p = .0387). The table of means, Table 28, shows that the mean accuracy of response to Adjective Preposition collocations increases across groups. The results of the post-hoc Dunn's multiple comparisons procedure show significant differences in the mean accuracy of response to this type of collocation only between Groups 1 and 3, but not between Groups 2 and 3, or between Groups 1 and 2 (see Table 27).

Type 11. SV(O) Preposition O - Results of the Kruskal-Wallis test of responses to SV(O) Preposition O collocations show a significant main effect for the factor Group (Kruskal-Wallis χ² (2, N = 275) = 14.546, p = .0007). The results of the post-hoc Dunn's multiple comparisons procedure show significant differences in the mean accuracy of response to this type of collocation between Groups 1 and 2, and between Groups 2 and 3, but not between Groups 1 and 3 (see Table 27).

Type 14.
SVV-ing - Results of the Kruskal-Wallis test of responses to SVV-ing collocations show a significant main effect for the factor Group (Kruskal-Wallis χ² (2, N = 275) = 33.999, p = .0001). The results of the post-hoc Dunn's multiple comparisons procedure show significant differences in the mean accuracy of response to this type of collocation between Groups 1 and 3, and between Groups 2 and 3, but not between Groups 1 and 2 (see Table 27).

Type 27. Verb Noun (creation) - Results of the Kruskal-Wallis test of responses to Verb Noun (creation) collocations show a significant main effect for the factor Group (Kruskal-Wallis χ² (2, N = 275) = 85.758, p = .0001). The table of means, Table 28, shows that the mean accuracy of responses to Verb Noun collocations increases across groups. The results of the post-hoc Dunn's multiple comparisons procedure show significant differences in the mean accuracy of response to this type of collocation across all groups (see Table 27).

Table 28. Means and standard deviations by group for the translation data

                         Means per Group              Std. Dev. per Group
Types                    Group1   Group2   Group3     Group1   Group2   Group3
1. Noun Prep             23.077   18.085   57.222     29.162   29.193   4.368
5. Adjective Prep        10.989   10.106   20.000     31.449   21.477   35.790
11. SV(O) Prep O         19.231   40.957   23.333     30.523   44.579   42.532
14. SVV-ing              20.879   20.213   55.556     40.870   40.374   49.969
16. SVO Inf              25.275   23.404   36.667     43.699   42.567   48.459
27. Verb Noun (creat)    3.297    11.702   37.222     17.954   21.283   29.567

[Figure: "Mean Accuracy of Response - Translation Test - All Groups"; y-axis: Mean Accuracy of Response; x-axis: Collocation Types (Type 1, Type 5, Type 11, Type 14, Type 16, Type 27); series: Group 1, Group 2, Group 3]

Figure 5. Mean accuracy of response for the translation data

4.2.1.2.2 Summary of the Results for the Translation Data

As with the results of the analyses of collocational use in the essay data, the translation data also reveal a significant difference across groups in terms of the accuracy of their responses to the six types of collocation. The results for Types 1. Noun Preposition, 14. SVV-ing, and 27. Verb Noun (creation) are consistent with the claim that differences in the accuracy of translation of collocations are positively related to differences of proficiency, since the more proficient subjects in Group 3 are more accurate in the use of these collocations than subjects in either Group 2 or Group 1. Group 3 subjects were significantly more accurate than subjects in Group 1 but equal to Group 2 subjects in their responses to Type 5. Adjective Preposition collocations. With the exception of Type 11. SV(O) Preposition O collocations, in which Group 2 subjects were better than either Group 1 or Group 3 subjects, the differences across all groups are in the predicted direction (see Table 29).

Table 29. Translation accuracy distinguishing proficiency levels

Group 1                  Group 2                  Group 3
Collocation Types        Collocation Types        Collocation Types
                         11. SV(O) Prep O**       1. Noun Prep**
                         5. Adjective Prep*       14. SVV-ing**
                                                  27. Verb Noun (creat)**
                                                  5. Adjective Prep*

**: Significantly more accurate than the other two groups
*: Significantly more accurate than one other group

4.2.1.2.3 Implicational Scaling for the Translation Data (All Groups)

For the implicational scaling analysis, following the Guttman procedure, each subject was coded as having translated accurately (1), or not (0), each of the 6 types of collocations in the translation test. A criterion of 80% accuracy was used for the coding of the data (see also Andersen 1978; Anderson 1978). That is, if a subject was 80 to 100% accurate in translating the particular collocation type, she/he was coded as 1.
Accuracy less than 80% was coded as 0. As with the essay data, the two axes of the matrix for the implicational scaling consisted of the six items ranked from the most accurately translated by all subjects to the least accurately translated, and the 275 subjects ranked in order of their accuracy of response to all types of collocations, from subjects translating accurately the most types to subjects translating accurately the fewest types. This matrix is given in Appendix M and summarised in Figure 6 below. The coefficient of reproducibility for this analysis was .92. The coefficient of scalability was .578 and so approached the recommended level of .6 (see Andersen 1978; Hatch & Lazaraton 1991:212).

[Figure: "Accuracy - Translation Test - All Groups"; x-axis: Collocation Types, in the order Type 14, Type 16, Type 11, Type 1, Type 5, Type 27]

Figure 6. Mean accuracy of response for the translation data - all groups

4.2.1.3 Blank Filling Data (All Groups)

As with the translation data, the mean accuracy of responses to each of the eleven types of collocation repeated across groups in the blank filling test was calculated. The procedure used was identical to the procedure followed in analysing the essay and translation data. The results of the Kruskal-Wallis tests and the post-hoc Dunn's multiple comparisons procedures are reported below. These are summarised in Table 30. Type 4. Prep Noun showed no significant across-group differences and therefore is not included in the table.

Table 30. Summary of the results for the Kruskal-Wallis tests and post-hoc analyses for the blank filling data

                               Dunn's Procedure: Mean Rank Differences
Type                  K-W      1 vs. 2              2 vs. 3              1 vs. 3
1. Noun Prep          58.03    95.566 < 146.883*    146.883 < 171.62*    95.566 < 171.628*
5. Adjective Prep     19.673   160.791 > 110.53*    110.532 < 143.64*    160.791 - 143.64
11. SV(O) Prep O      14.711   115.747 < 137.5*     137.5 < 161.02*      115.747 < 161.02*
23. SV(O) Adverb      39.351   102.495 < 149.42*    149.42 - 161.97      102.495 < 161.97*
24. SV(O) wh          22.03    111.758 < 139.91*    139.915 < 162.53*    111.758 < 162.53*
27. Verb N (creat.)   29.988   108.967 < 132.78*    132.787 < 172.8*     108.967 < 172.8*
33. Verb Adverb       77.892   116.731 - 108.138    108.138 < 190.69*    116.731 < 190.69*
34. Noun Noun         26.577   167.885 > 114.82*    114.824 < 131.98*    167.885 > 131.98*
36. Prep Det N        17.299   131.505 - 118.064    118.064 < 165.38*    131.505 < 165.38*
37. Phrasal Verb      26.57    116.176 - 125.872    125.872 < 172.73*    116.176 < 172.73*

*: Significant at the .05 level
< or >: Direction of the difference

4.2.1.3.1 Kruskal-Wallis Analyses for the Blank Filling Data

Type 1. Noun Preposition - Results of the Kruskal-Wallis test of responses to Noun Preposition collocations show a significant main effect for the factor Group (Kruskal-Wallis χ² (2, N = 275) = 58.03, p = .0001). The mean accuracy of responses to Noun Preposition collocations increases across groups (see Table 31). The results of the post-hoc Dunn's multiple comparisons procedure show significant differences in the mean accuracy of response to this type of collocation between all groups (see Table 30).

Type 5. Adjective Preposition - Results of the Kruskal-Wallis test of responses to Adjective Preposition collocations show a significant main effect for the factor Group (Kruskal-Wallis χ² (2, N = 275) = 19.673, p = .0001). The results of the post-hoc Dunn's multiple comparisons procedure show significant differences in the mean accuracy of response to this type of collocation between Groups 1 and 2, and between Groups 2 and 3, but not between Groups 1 and 3 (see Table 30).

Type 11. SV(O) Preposition O - Results of the Kruskal-Wallis test of responses to SV(O) Preposition O collocations show a significant main effect for the factor Group (Kruskal-Wallis χ² (2, N = 275) = 14.711, p = .0006). The mean accuracy of responses to SV(O) Preposition O collocations increases across groups (see Table 31).
The results of the post-hoc Dunn's multiple comparisons procedure show significant differences in the mean accuracy of response to this type of collocation between all groups (see Table 30).

Type 23. SV(O) Adverbial - Results of the Kruskal-Wallis test of responses to SV(O) Adverbial collocations show a significant main effect for the factor Group (Kruskal-Wallis χ² (2, N = 275) = 39.351, p = .0001). The mean accuracy of responses to SV(O) Adverbial collocations increases across groups (see Table 31). The results of the post-hoc Dunn's multiple comparisons procedure show significant differences in the mean accuracy of response to this type of collocation between Groups 1 and 2, and between Groups 1 and 3, but not between Groups 2 and 3 (see Table 30).

Type 24. SV(O) wh-word - Results of the Kruskal-Wallis test of responses to SV(O) wh-word collocations show a significant main effect for the factor Group (Kruskal-Wallis χ² (2, N = 275) = 22.03, p = .0001). The mean accuracy of responses to SV(O) wh-word collocations increases across groups (see Table 31). The results of the post-hoc Dunn's multiple comparisons procedure show significant differences in the mean accuracy of response to this type of collocation between all groups (see Table 30).

Type 27. Verb Noun (creation) - Results of the Kruskal-Wallis test of responses to Verb Noun (creation) collocations show a significant main effect for the factor Group (Kruskal-Wallis χ² (2, N = 275) = 29.988, p = .0001). The mean accuracy of responses to Verb Noun (creation) collocations increases across groups (see Table 31). The results of the post-hoc Dunn's multiple comparisons procedure show significant differences in the mean accuracy of response to this type of collocation between all groups (see Table 30).

Type 33. Verb Adverb - Results of the Kruskal-Wallis test of responses to Verb Adverb collocations show a significant main effect for the factor Group (Kruskal-Wallis χ² (2, N = 275) = 77.892, p = .0001).
The results of the post-hoc Dunn's multiple comparisons procedure show significant differences in the mean accuracy of response to this type of collocation between Groups 1 and 3, and between Groups 2 and 3, but not between Groups 1 and 2 (see Table 30).

Type 34. Noun Noun - Results of the Kruskal-Wallis test of responses to Noun Noun collocations show a significant main effect for the factor Group (Kruskal-Wallis χ² (2, N = 275) = 26.577, p = .0001). The results of the post-hoc Dunn's multiple comparisons procedure show significant differences in the mean accuracy of response to this type of collocation between all groups (see Table 30).

Type 36. Preposition Determiner Noun - Results of the Kruskal-Wallis test of responses to Preposition Determiner Noun collocations show a significant main effect for the factor Group (Kruskal-Wallis χ² (2, N = 275) = 17.299, p = .0002). The results of the post-hoc Dunn's multiple comparisons procedure show significant differences in the mean accuracy of response to this type of collocation between Groups 1 and 3, and between Groups 2 and 3, but not between Groups 1 and 2 (see Table 30).

Type 37. Phrasal Verb - Results of the Kruskal-Wallis test of responses to Phrasal Verb collocations show a significant main effect for the factor Group (Kruskal-Wallis χ² (2, N = 275) = 26.57, p = .0001). The mean accuracy of responses to Phrasal Verb collocations increases across groups (see Table 31). The results of the post-hoc Dunn's multiple comparisons procedure show significant differences in the mean accuracy of response to this type of collocation between Groups 1 and 3, and between Groups 2 and 3, but not between Groups 1 and 2 (see Table 30).

Table 31. Means and standard deviations by group for the blank filling data

                        Means per Group            Std. Dev. per Group
Types                   Group1   Group2   Group3   Group1   Group2   Group3
1. Noun Prep            4.396    13.617   28.656   20.613   15.579   30.304
4. Prep Noun            51.538   43.819   47.989   25.269   23.649   33.728
5. Adjective Prep       41.099   26.170   35.156   24.651   24.012   22.714
11. SV(O) Prep O        24.648   31.468   37.633   21.185   23.631   22.321
23. SV(O) Adverbial     8.791    39.362   41.111   28.474   49.117   39.361
24. SV(O) wh-word       21.429   40.426   48.656   34.272   49.338   38.407
27. Verb Noun (creat)   20.758   25.851   40.600   19.007   20.289   26.100
33. Verb Adverb         23.077   12.638   72.222   42.366   22.233   45.041
34. Noun Noun           58.242   26.596   37.778   39.626   44.421   48.755
36. Prep Det Noun       42.527   38.489   52.311   25.933   23.514   19.253
37. Phrasal Verb        24.725   28.957   45.000   22.684   24.586   27.335

[Figure: "Mean Accuracy of Response - Blank Filling Test - All Groups". Vertical axis: Mean Accuracy of Response (0 to 80). Horizontal axis: Collocation Types (1, 4, 5, 11, 23, 24, 27, 33, 34, 36, 37). Series: Group1, Group2, Group3.]

Figure 7. Mean accuracy of response for the blank filling data

4.2.1.3.2 Summary of the Results for the Blank Filling Data

As with the essay and the translation data, the results for the blank filling data also lend partial support to Hypothesis 1. The subjects' mean accuracy of response to Types 1. Noun Preposition, 11. SV(O) Preposition O, 24. SV(O) wh-word, 27. Verb Noun (creation), 33. Verb Adverb, 36. Preposition Determiner Noun, and 37. Phrasal Verb collocations is consistent with the claim that differences in proficiency are positively related to differences in the accuracy of collocation use, since the more proficient subjects in Group 3 are more accurate in the use of these collocations than subjects in Group 2 and Group 1. Responses to Type 23. SV(O) Adverbial collocations partially confirm Hypothesis 1, since Groups 2 and 3 are more accurate than Group 1. With respect to accuracy of responses to Type 5. Adjective Preposition collocations, Groups 1 and 3 are significantly better than Group 2. The results for Type 34. Noun Noun collocations are the only exception to the general direction of the blank filling data, since Group 1 students are significantly more accurate than students in Groups 2 and 3.
Overall, the results for the blank filling data are in the predicted direction, that is, accuracy of response to collocational types increases with proficiency (see Table 32).

Table 32. Blank filling accuracy distinguishing proficiency levels

Group 1                   Group 2                   Group 3
Collocation Types         Collocation Types         Collocation Types
34. Noun Noun**           23. SV(O) Adverbial*      1. Noun Prep**
5. Adjective Prep*                                  11. SV(O) Prep O**
                                                    24. SV(O) wh-word**
                                                    27. Verb Noun (creat)**
                                                    33. Verb Adverb**
                                                    36. Prep Det Noun**
                                                    37. Phrasal Verb**
                                                    23. SV(O) Adverbial*
                                                    5. Adjective Prep*

**: Significantly more accurate than the other two groups
*: Significantly more accurate than one other group

4.2.1.3.3 Implicational Scaling for the Blank Filling Data (All Groups)

For the implicational scaling analysis the Guttman procedure was used. Each subject was coded as having answered accurately (1), or not (0), each of the 11 types of collocations repeated across groups in the blank filling test. As with the translation data, an 80% accuracy criterion was used for the coding of the data, i.e. accuracy less than 80% was coded as 0, and accuracy of 80 to 100% was coded as 1. The two axes of the matrix for the implicational scaling consisted of the eleven types, ranked from most accurately answered by all subjects to least accurately answered, and the 275 subjects, ranked in order of their accuracy of response to all types of collocations, from subjects responding accurately to the most types to subjects responding accurately to the fewest. This matrix is summarised in Figure 8 below. The coefficient of reproducibility for this analysis was .91. The coefficient of scalability was .4. As with the essay data, even though the coefficient of reproducibility for the blank filling data is at the level necessary for this implicational scale to be considered valid, the coefficient of scalability is below the recommended level of .6 (see Andersen 1978; Hatch & Lazaraton 1991:212).
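The two coefficients reported for the implicational scales can be illustrated with a short computation. The sketch below is not the procedure used in this study. It assumes the Goodenough-Edwards way of counting scale errors (each subject's responses are compared with the ideal pattern implied by that subject's total score) and the usual textbook definitions of minimum marginal reproducibility and scalability. The function names `binarize` and `guttman_coefficients`, and any data passed to them, are hypothetical.

```python
from typing import List, Tuple


def binarize(accuracy_pct: List[List[float]], criterion: float = 80.0) -> List[List[int]]:
    """Code each accuracy percentage as 1 (criterion met) or 0 (not met)."""
    return [[1 if value >= criterion else 0 for value in row] for row in accuracy_pct]


def guttman_coefficients(matrix: List[List[int]]) -> Tuple[float, float, float]:
    """Return (reproducibility, minimum marginal reproducibility, scalability).

    `matrix` holds one row per subject and one column per collocation type.
    Errors are counted by the Goodenough-Edwards method (an assumption).
    """
    n_subjects = len(matrix)
    n_items = len(matrix[0])
    cells = n_subjects * n_items

    # Number of subjects who reached the criterion on each item.
    item_totals = [sum(row[j] for row in matrix) for j in range(n_items)]
    # Items ordered from easiest (most 1s) to hardest.
    order = sorted(range(n_items), key=lambda j: -item_totals[j])

    errors = 0
    for row in matrix:
        score = sum(row)
        for rank, j in enumerate(order):
            # Ideal scale pattern: a subject with score s passes the s easiest items.
            predicted = 1 if rank < score else 0
            if row[j] != predicted:
                errors += 1

    c_rep = 1 - errors / cells
    # For each item, the larger of the 1-count and the 0-count.
    max_marginals = sum(max(t, n_subjects - t) for t in item_totals)
    mm_rep = max_marginals / cells
    if mm_rep == 1:
        # Every item is all 1s or all 0s, so scalability is undefined.
        return c_rep, mm_rep, float("nan")
    c_scal = (c_rep - mm_rep) / (1 - mm_rep)
    return c_rep, mm_rep, c_scal
```

With real data, each row passed to `binarize` would hold one subject's percentage accuracy on each collocation type, and the resulting 0/1 matrix would be passed to `guttman_coefficients`. Under these definitions a scale can have a high coefficient of reproducibility and still fall below .6 on scalability when most cells take the same value, which is the situation described for the essay and blank filling data.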
[Figure: "Accuracy - Blank Filling Data - All Groups". Vertical axis: 0 to .35. Horizontal axis: Collocation Types, in the order 34, 33, 24, 23, 4, 36, 5, 37, 27, 1, 11.]

Figure 8. Mean accuracy of response for the blank filling data - all groups

4.2.1.4 Summary of the Results for Hypothesis 1

The results from the analyses of the three sets of data, the free production essay data and the elicited production translation and blank filling data, support Hypothesis 1 by providing evidence that there are differences between groups in the production and knowledge of collocations, assessed in this study both in terms of ability to use collocations in the essays, and in terms of accuracy of response to questions eliciting collocations in the translation and blank filling tests. Hypothesis 1, however, is only partially supported since there was limited evidence in the data analysis to support the existence of accuracy orders in the use and knowledge of collocations across groups. The implicational scales for the essay and blank filling data, though proven to be valid according to the coefficient of reproducibility, were found to be only marginally scalable. The implicational scaling for the translation data approached statistical significance and revealed a valid accuracy order.

4.2.2 Hypothesis 2: There are patterns in the development of collocational knowledge within proficiency groups

To address Hypothesis 2 and examine the extent of the within-group differences in the use and knowledge of collocations, non-parametric Friedman repeated measures tests were used, followed by post-hoc Nemenyi's multiple comparisons procedures. Implicational scaling for each of the three groups in each of the three sets of data was then performed. Results of the analyses are reported below.

4.2.2.1 Essay Data

As for the analyses in Hypothesis 1, the tokens of accurate use of the 37 types of collocation by each subject in each group were used as data for these analyses.
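The within-group procedure named above can be sketched as follows. This is a minimal illustration and not the software used in this study. It computes the Friedman chi-square statistic from within-subject ranks without a correction for ties, which is an assumption worth noting because essay data of this kind contain many tied zero counts. It also returns the pairwise differences in mean rank, the quantity a Nemenyi-type post-hoc comparison works from. The critical value for those differences is not computed here. All function names are hypothetical.

```python
from typing import Dict, List, Tuple


def average_ranks(values: List[float]) -> List[float]:
    """Rank values from 1 (smallest) upwards, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for pos in range(i, j + 1):
            ranks[order[pos]] = avg
        i = j + 1
    return ranks


def friedman_statistic(data: List[List[float]]) -> Tuple[float, int, List[float]]:
    """Return (chi-square, degrees of freedom, mean rank per condition).

    `data` holds one row per subject and one column per collocation type.
    No tie correction is applied (an assumption of this sketch).
    """
    n = len(data)       # subjects
    k = len(data[0])    # conditions (collocation types)
    rank_rows = [average_ranks(row) for row in data]
    rank_sums = [sum(r[j] for r in rank_rows) for j in range(k)]
    chi2 = 12.0 / (n * k * (k + 1)) * sum(R * R for R in rank_sums) - 3.0 * n * (k + 1)
    mean_ranks = [R / n for R in rank_sums]
    return chi2, k - 1, mean_ranks


def mean_rank_differences(mean_ranks: List[float]) -> Dict[Tuple[int, int], float]:
    """Absolute mean-rank difference for every pair of conditions (column indices)."""
    k = len(mean_ranks)
    return {
        (a, b): abs(mean_ranks[a] - mean_ranks[b])
        for a in range(k)
        for b in range(a + 1, k)
    }
```

Each row passed to `friedman_statistic` would hold one subject's scores on every collocation type. The dictionary returned by `mean_rank_differences` corresponds in layout to the upper-triangular tables of mean rank differences reported in this section.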
4.2.2.1.1 Friedman Test for the Essay Data - Group 1

The results of the Friedman test for Group 1 show a significant difference in the use of the 37 types of collocation in the students' essays (Friedman χ² (36, N = 37) = 1699.221, p = .0001). Nemenyi's multiple comparisons tests based on the Friedman rank sums were performed on the data. The results of these tests are summarised below in Table 33. The results of the post-hoc analysis show a clustering of certain collocations. Types 11. SV(O) Prep O, 26. SVc, and 29. Adjective Noun are used significantly more than almost all the other types of collocation.

4.2.2.1.2 Friedman Test for the Essay Data - Group 2

The results of the Friedman test for Group 2 show a significant difference in the use of the 37 types of collocation in the students' essays (Friedman χ² (36, N = 37) = 1823.796, p = .0001). Nemenyi's multiple comparisons tests based on the Friedman rank sums were performed on the data. The results of these tests are summarised below in Table 34. The results show Types 11. SV(O) Prep O, 12. SV to Inf, 26. SVc, 29. Adjective Noun, and 36. Prep Det Noun to be used significantly more than all the other types.

4.2.2.1.3 Friedman Test for the Essay Data - Group 3

The results of the Friedman test for Group 3 show a significant difference in the use of the 37 types of collocation in the students' essays (Friedman χ² (36, N = 37) = 1401.246, p = .0001). Nemenyi's multiple comparisons tests based on the Friedman rank sums were performed on the data. The results of these tests are summarised below in Table 35. The results show that Types 12. SV to Inf, 13. SV Inf, and 26. SVc are used significantly more than all other types.

Table 33.
Nemenyi's multiple comparisons tests of mean rank diferences for the essay data - Group 1 Type 1 2 3 4 5 6 7 8 9 11.0 13.2* 13.7* 11.3 10 11 13.7* 13.5* ----- 12 13 14 15 16 17 18 7.47 11.1 10.5 13.5* 13.5* 13.7* 13.7* 11 12.2* 13.7* 13.7* 7.28 26 18.1* 19.5* 19.5* 13.1* 16.8* 19.0* 19.5* 17.1* 19.5* 19.3* 5.83 13.3* 16.9* 16.4* 19.3* 19.3* 19.5* 19.5* 29 14.8* 16.2* 16.2* 9.85 10.0 13.5* 15.7* 16.2* 13.8* 16.2* 16.1* 2.57 13.7* 13.1* 16.1* 16.1* 16.2* 16.2* * Significant at the .05 level Type 19 20 21 22 23 24 25 26 27 28 29 13.7* 2.57 30 31 32 33 34 36 12.1* 13.7* 13.7* 13.7* 4.77 13.0* 13.7* 5.83 7.79 26 18.0* 19.5* 19.5* 19.5* 10.6 18.8* 19.5* ----- 13.6* 19.5* 3.26 19.5* 19.1* 19.5* 19.5* 16.9* 19.1* 12.4* 17.3* 29 14.7* 16.2* 16.2* 16.2* 7.34 15.5* 16.2* 3.26 10.3 16.2* 15.8* 16.2* 16.2* 13.6* 15.9* 9.16 * Significant at the .05 level Note. Only those types that were significantly different from the other types are included in the table 281 13.3* 6.58 37 11 16.2* ----- 13.7* 13.3* 13.7* 13.7* 11.0 35 11.5 14.0* Table 34. 
Nemenyi's multiple comparisons tests of mean rank diferences for the essay data - Group 2 Type 1 2 11 11.6 5 6 12 13 14 15 13.7* 14.6* 4.21 9.68 13.1* 14.5* 13.1* 14.6* 14.5* ----- .309 4.31 10.0 13.4* 14.3* 14.2* 14.6* 12 11.9* 14.0* 15.0* 4.52 9.99 13.4* 14.8* 13.4* 15.0* 14.8* .309 ----- 4.62 10.3 13.7* 14.6* 14.5* 15.0* 26 18.1* 20.2* 21.2* 10.7 16.1* 19.5* 21.0* 19.6* 21.2* 21.0* 6.50 6.19 10.8 16.5* 19.9* 20.8* 20.7* 21.2* 29 13.9* 16.1* 17.0* 6.58 12.0* 15.5* 16.9* 15.5* 17.0* 16.9* 2.36 2.05 6.68 12.4* 15.8* 16.7* 16.6* 17.0* 36 9.49 7.56 2.43 2.19 7.96 11.3 32 33 34 11.6 3 4 12.5* 2.09 * Significant at the .05 level Type 19 20 21 22 23 7 11.0 8 9 12.4* 11.0 24 25 26 10 11 12.5* 12.4* 2.12 27 28 29 30 31 16 17 18 12.2* 12.1* 12.5* 35 36 37 11 8.13 14.5* 13.9* 14.6* 6.79 11.6 14.6* 6.50 6.65 14.6* 2.36 14.6* 14.5* 14.6* 14.3* 10.7 14.0* 2.12 5.31 12 8.44 14.8* 14.2* 15.0* 7.10 12.0* 15.0* 6.19 6.96 15.0* 2.05 15.0* 14.8* 15.0* 14.6* 11.0 14.3* 2.43 5.62 26 14.6* 21.0* 20.4* 21.2* 13.3* 18.2* 21.2* ------ 13.1* 21.2* 4.13 29 10.5 16.9* 16.2* 17.0* 9.24 14.0* 17.0* 4.13 9.02 17.0* ------ 17.0* 16.9* 17.0* 16.7* 13.1* 16.3* 4.48 36 6.0 12.4* 11.7* 12.5* 4.76 9.57 4.53 12.5* 4.48 12.5* 8.62 21.2* 21.0* 21.2* 20.8* 17.2* 20.5* 8.62 12.5* 12.4* 12.5* 12.2* 8.62 Note. Only those types that were significantly different from the other types are included in the table 282 11.8* 7.68 11.9* ------ 3.19 Table 35. 
Nemenyi's multiple comparisons tests of mean rank diferences for the essay data - Group 3 Type 1 2 12 8.51 13 9.75 26 3 4 5 6 12.6* 13.8* 8.56 8.76 13.8* 15.1* 9.80 10.0 7 8 9 10 11 12 13 14 15 16 17 18 12.4* 13.7* 12.2* 13.7* 13.8* 3.62 ------ 1.24 11.8 10.8 13.1* 13.8* 13.8* 13.7* 14.9* 13.4* 14.9* 15.1* 4.86 1.24 ------ 13.0* 12.1* 14.4* 15.1* 15.1* 13.4* 17.5* 18.8* 13.4* 13.6* 17.3* 18.6* 17.1* 18.6* 18.8* 8.53 4.91 3.67 16.7* 15.7* 18.1* 18.8* 18.8* * Significant at the .05 level Type 19 20 21 22 23 12 3.46 13.8* 11.0 27 28 13.8* 4.91 5.63 13.8* 4.68 13.3* 12.3* 13.8* 13.1* 9.62 13.5* 2.37 11.1 13 4.70 15.1* 12.2* 15.1* 8.38 13.1* 15.1* 3.67 6.87 15.1* 5.92 14.5* 13.6* 15.1* 14.4* 10.8 14.7* 3.62 12.3* 26 8.37 18.8* 15.9* 18.8* 12.0* 16.7* 18.8* ------ 10.5 18.8* 9.59 18.2* 17.2* 18.8* 18.0* 14.5* 18.4* 7.29 16.0* 13.8* 7.13 24 25 11.8 26 29 30 31 32 33 * Significant at the .05 level Note. Only those types that were significantly different from the other types are included in the table 283 34 35 36 37 4.2.2.1.4 Implicational Scaling for the Essay Data by Groups The implicational scaling was done by coding each subject as having used (1), or not used (0), each of the 37 types of collocations in their essays. The two axes of the matrix for the implicational scaling consisted of the 37 collocation types, ranked from most commonly used to least commonly used, and the subjects in each group ranked in order of their use of types of collocations, from subjects using the most types to subjects using the fewest. This matrix is summarised in Figures 9, 10, and 11 below. For Group 1, the coefficient of reproducibility was .94, which is considered to be valid (see Andersen 1978). The coefficient of scalability was .40. While the coefficient of reproducibility is at the level necessary for this implicational scale to be considered valid, the coefficient of scalability is below the recommended level of .6 (see Hatch & Lazaraton 1991:212). 
This suggests that while the implicational scale for Group 1 is valid, the fact that most subjects did not use the majority of the 37 scaled collocation types resulted in the low coefficient of scalability. The implicational scale shows that the first three collocation types are those that were found to be used significantly more than all the other types in the post-hoc analyses (see Table 33).

For Group 2, the coefficient of reproducibility was .90, which is considered to be valid (see Andersen 1978). The coefficient of scalability was .33, below the recommended level of .6 (see Hatch & Lazaraton 1991:212). The implicational scale shows that the first five items are those types that were found to be used significantly more than all the other types in the post-hoc analyses (see Table 34).

For Group 3, the coefficient of reproducibility was .89, and the coefficient of scalability was .31. As for the other two groups, the implicational scale for Group 3 was found to be below the recommended level of scalability. The implicational scale shows that the first three items were those collocation types that were found to be used significantly more than all the other types according to the post-hoc analyses (see Table 35).

[Figure: "Collocation Tokens - Essay Data - Group 1". Vertical axis: 0 to 1. Horizontal axis: Collocation Types, in the order 26, 29, 11, 23, 36, 12, 4, 27, 14, 5, 34, 13, 8, 37.]

Note. Only those types with mean > .1 are included in this figure

Figure 9. Mean use of collocation tokens in the essay data - Group 1

Note. Only those types with mean > .1 are included in this figure

Figure 10. Mean use of collocation tokens in the essay data - Group 2

Note. Only those types with mean > .1 are included in this figure

Figure 11.
Mean use of collocation tokens in the essay data - Group 3

4.2.2.1.5 Summary of the Results for the Essay Data

The results of the Friedman repeated measures tests for the essay data support Hypothesis 2 that there are group-specific patterns in the development of collocational knowledge. For Group 1, Types 26. SVc, 29. Adjective Noun, and 11. SV(O) Prep O were used significantly more than the other types of collocation. For Group 2, Types 26. SVc, 29. Adjective Noun, 12. SV to Inf, and 11. SV(O) Prep O were used significantly more than the other types. For Group 3, Types 26. SVc, 13. SV Inf, and 12. SV to Inf were used significantly more than the other types. These results suggest that for each group there are certain types that are used more than others, indicating that subjects in each group prefer to use, and are more accurate in using, specific types of collocation. These results also indicate the existence of group-specific patterns in the acquisition of collocation, as was predicted by Hypothesis 2.

4.2.2.1.6 Further Analyses on the Essay Data

Due to the lack of statistical significance of the accuracy orders obtained from the implicational scaling analyses for the essay data, further analyses were performed to investigate the correlation of the accuracy orders for the three groups. Spearman's rho correlation coefficient was calculated for the accuracy orders of Groups 1 and 2, Groups 2 and 3, and Groups 1 and 3. Only those types with a mean greater than .1 were included in the analyses. The correlation for Groups 1 and 2 was rs = .832, p = .0004; for Groups 2 and 3 rs = .766, p = .0011; and for Groups 1 and 3 rs = .552, p = .019. The significance of these results is discussed in the next chapter.

4.2.2.2 Translation Data

The mean accuracy of response to each type of collocation supplied in the translation test was calculated. The number of types differs from group to group.
For Group 1 there were 8 types of collocation tested in the translation test, 6 types for Group 2, and 7 types for Group 3. The results of the analyses are summarised below.

4.2.2.2.1 Friedman Test for the Translation Data - Group 1

The results of the Friedman test for Group 1 show a significant difference in the students' mean accuracy of translation of the 8 types of collocation in the translation data (Friedman χ² (7, N = 8) = 220.613, p = .0001). Nemenyi's multiple comparisons tests based on the Friedman rank sums were performed on the data. The results of these tests are summarised below in Table 36. According to the results of the post-hoc analysis, Types 13. SV Inf and 23. SV(O) Adverbial are translated significantly more accurately than all the other types of collocation.

Table 36. Nemenyi's multiple comparisons tests of mean rank differences for the translation data - Group 1

Types                 1      5      11     14     16     27     13     23
1. Noun Prep          ---    .687   .237   .264   .039   1.044  1.186  2.26*
5. Adj Prep                  ---    .45    .423   .648   .357   1.87*  2.95*
11. SV(O) Prep O                    ---    .027   .198   .807   1.42*  2.50*
14. SVV-ing                                ---    .225   .78    1.45*  2.53*
16. SVO Inf                                       ---    1.005  1.225  2.30*
27. Verb Noun(creat)                                     ---    2.23*  3.31*
13. SV Inf                                                      ---    1.083
23. SV(O) Adverb                                                       ---

*: Significant at the .05 level

4.2.2.2.2 Friedman Test for the Translation Data - Group 2

The results of the Friedman test for Group 2 show a significant difference in the students' mean accuracy of translation of the 6 types of collocation in the translation data (Friedman χ² (5, N = 6) = 74.279, p = .0001). Nemenyi's multiple comparisons tests based on the Friedman rank sums were performed on the data. The results of these tests are summarised below in Table 37. According to the results of the post-hoc analysis, only Type 11. SV(O) Prep O is significantly more accurately translated than all the other types.

Table 37. Nemenyi's multiple comparisons tests of mean rank differences for the translation data - Group 2

Types                 1      5      11     14     16     27
1. Noun Prep          ---    .346   .931*  .037   .154   .298
5. Adj Prep                  ---    1.27*  .383   .5     .048
11. SV(O) Prep O                    ---    .894   .777   1.22*
14. SVV-ing                                ---    .117   .335
16. SVO Inf                                       ---    .452
27. Verb Noun(creat)                                     ---

*: Significant at the .05 level

4.2.2.2.3 Friedman Test for the Translation Data - Group 3

The results of the Friedman test for Group 3 show a significant difference in the students' mean accuracy of translation of the 7 types of collocation in the translation data (Friedman χ² (6, N = 7) = 134.62, p = .0001). Nemenyi's multiple comparisons tests based on the Friedman rank sums were performed on the data. The results of these tests are summarised below in Table 38. According to the post-hoc analysis, Types 13. SV Inf, 14. SVV-ing, and 1. Noun Prep were found to be significantly more accurately translated than all the other types of collocation.

Table 38. Nemenyi's multiple comparisons tests of mean rank differences for the translation data - Group 3

Types                 1      5      11     14     16     27     13
1. Noun Prep          ---    1.52*  1.33*  .056   .8     .789   .972
5. Adj Prep                  ---    .189   1.47*  .728   .739   2.5*
11. SV(O) Prep O                    ---    1.28*  .539   .55    2.31*
14. SVV-ing                                ---    .744   .733   1.028
16. SVO Inf                                       ---    .011   1.77*
27. Verb Noun(creat)                                     ---    1.76*
13. SV Inf                                                      ---

*: Significant at the .05 level

4.2.2.2.4 Implicational Scaling for the Translation Data by Groups

As for the implicational scaling analysis of the translation data in the first hypothesis, each subject was coded as having translated accurately (1), or not (0), each of the types of collocations in the translation test according to the 80% accuracy criterion. The two axes of the matrix for the implicational scaling consisted of the collocation types ranked from most accurately translated by all subjects in each group to least accurately translated, and the subjects in each group ranked in order of their accuracy of translation of all types of collocations, from subjects translating accurately the most types to subjects translating accurately the fewest.
The matrix for each group is given in Appendix N and summarised in Figures 12, 13, and 14 below.

For Group 1, the coefficient of reproducibility was .936, and the coefficient of scalability was .632. The implicational scale for this set of data was found to be significant and the items on the scale are scalable (see Andersen 1978; Hatch & Lazaraton 1991:212). The implicational scale (Figure 12) also shows that the first two items on the scale are the two types of collocation that were found to be translated significantly more accurately than all the other types in the post-hoc analyses (see Table 36).

For Group 2, the coefficient of reproducibility was .97 and the coefficient of scalability was .78. As for Group 1, the implicational scale for this set of data was found to be significant and the items on the scale scalable. The implicational scale (Figure 13) also shows that the first item on the scale is Type 11. SV(O) Prep O, which was also found to be translated significantly more accurately than all the other types in the post-hoc analyses (see Table 37).

For Group 3, the coefficient of reproducibility was .89, and the coefficient of scalability was .59. Both coefficients are approaching significance and it can be concluded that the implicational scale for Group 3 is valid. The implicational scale (Figure 14) shows that the first three items on the scale are the ones that were found to be translated significantly more accurately than all the other types according to the post-hoc analyses (see Table 38).

Figure 12. Mean accuracy of response for the translation data - Group 1

Figure 13. Mean accuracy of response for the translation data - Group 2

Figure 14.
Mean accuracy of response for the translation data - Group 3

4.2.2.2.5 Summary of the Results for the Translation Data

The results of the Friedman repeated measures tests for the translation data support Hypothesis 2 that there are significant differences in the knowledge of collocations within proficiency groups. For Group 1, Types 23. SV(O) Adverbial and 13. SV Inf were translated significantly more accurately than the other types of collocation. For Group 2, only Type 11. SV(O) Prep O was translated significantly more accurately than the other types. For Group 3, Types 13. SV Inf, 14. SVV-ing, and 1. Noun Prep were translated significantly more accurately than the other types. These results suggest that for each group certain collocation types are easier to translate than others.

4.2.2.3 Blank Filling Data

As for the translation data, for the within-group analyses of the blank filling data the mean accuracy of responses to each type of collocation included in the blank filling test for each group was calculated. Thus, the number of types differs from group to group. For Group 1 there were 14 types of collocation included in the blank filling test, for Group 2 there were 12 types, and for Group 3 there were 13 types (see Table 10, Chapter 3). The results of the analyses are summarised below.

4.2.2.3.1 Friedman Test for the Blank Filling Data - Group 1

The results of the Friedman test for the blank filling data for Group 1 show a significant difference in the students' mean accuracy of response to the 14 types of collocation in the blank filling data (Friedman χ² (13, N = 14) = 541.595, p = .0001). Nemenyi's multiple comparisons tests based on the Friedman rank sums were performed on the data. The results of these tests are summarised below in Table 39. According to the results, the significant differences are spread among many different pairs of collocation types.
Thus, the clustering of only a limited number of types that are significantly different to all other types, evident in the results of the post-hoc analyses for the translation and essay data, is not found in the post-hoc analyses for the blank filling data for Group 1.

4.2.2.3.2 Friedman Test for the Blank Filling Data - Group 2

The results of the Friedman test for the blank filling data for Group 2 show a significant difference in the students' mean accuracy of response to the 12 types of collocation in the blank filling data (Friedman χ² (11, N = 12) = 202.339, p = .0001). Nemenyi's multiple comparisons tests based on the Friedman rank sums were performed on the data. The results of these tests are summarised below in Table 40. According to the results of the post-hoc analysis, Type 4. Prep Noun with the highest mean rank (i.e. most accurately answered) and Type 33. Verb Adverb with the lowest mean rank (i.e. least accurately answered) are the ones that show significant differences to most of the other collocation types.

4.2.2.3.3 Friedman Test for the Blank Filling Data - Group 3

The results of the Friedman test for the blank filling data for Group 3 show a significant difference in the students' mean accuracy of response to the 13 types of collocation in the blank filling data (Friedman χ² (12, N = 13) = 191.452, p = .0001). Nemenyi's multiple comparisons tests based on the Friedman rank sums were performed on the data. The results of these tests are summarised below in Table 41. According to the results, Type 33. Verb Adverb with the highest mean rank (i.e. most accurately answered), and Type 28. Verb Noun (eradication) with the lowest mean rank (i.e. least accurately answered), are significantly different to all the other types of collocation.

Table 39. Nemenyi's multiple comparisons tests of mean rank differences for the blank filling data - Group 1

TYPES                   1      4      5      11     23     24     27     33     34     36     37     30     31     29
1. Noun Prep            ---    6.68*  5.65*  3.87*  .45    2.016  2.89   1.956  6.49*  5.97*  3.65*  .115   1.104  .099
4. Prep Noun                   ---    1.022  2.807  6.23*  4.66*  3.79*  4.72*  .187   .709   3.02*  6.56*  5.57*  6.78*
5. Adj Prep                           ---    1.785  5.20*  3.64*  2.769  3.70*  .835   .313   2.005  5.54*  4.55*  5.75*
11. SV(O) Prep O                             ---    3.42*  1.858  .984   1.918  2.62   2.098  .22    3.75*  2.77   3.97*
23. SV(O) Adverbial                                 ---    1.566  2.44   1.506  6.04*  5.52*  3.20*  .335   .654   .549
24. SV(O) wh-word                                          ---    .874   .06    4.47*  3.95*  1.638  1.901  .912   2.115
27. Verb Noun (creat.)                                            ---    .934   3.60*  3.08*  .764   2.775  1.786  2.98*
33. Verb Adverb                                                          ---    4.53*  4.01*  1.698  1.841  .852   2.055
34. Noun Noun                                                                   ---    .522   2.84   6.37*  5.39*  6.59*
36. Prep Det Noun                                                                      ---    2.318  5.85*  4.86*  6.07*
37. Phrasal Verb                                                                              ---    3.53*  2.55   3.75*
30. Noun Verb                                                                                        ---    .989   .214
31. Noun1 of Noun2                                                                                          ---    1.203
29. Adj Noun                                                                                                       ---

*: Significant at the .05 level

Table 40. Nemenyi's multiple comparisons tests of mean rank differences for the blank filling data - Group 2

TYPES                   1      4      5      11     23     24     27     33     34     36     37     31
1. Noun Prep            ---    4.04*  1.362  2.70*  1.559  1.66   1.793  .798   .421   3.64*  1.931  .196
4. Prep Noun                   ---    2.68*  1.335  2.48*  2.38*  2.25   4.84*  3.62*  .394   2.112  4.23*
5. Adj Prep                           ---    1.346  .197   .298   .431   2.16   .941   2.287  .569   1.558
11. SV(O) Prep O                             ---    1.149  1.048  .915   3.50*  2.287  .941   .777   2.90*
23. SV(O) Adverbial                                 ---    .101   .234   2.35*  1.138  2.09   .372   1.755
24. SV(O) wh-word                                          ---    .133   2.45*  1.239  1.989  .271   1.856
27. Verb Noun (creat.)                                            ---    2.59*  1.372  1.856  .138   1.989
33. Verb Adverb                                                          ---    1.219  4.44*  2.72*  .602
34. Noun Noun                                                                   ---    3.22*  1.51   .617
36. Prep Det Noun                                                                      ---    1.718  3.84*
37. Phrasal Verb                                                                              ---    2.127
31. Noun1 of Noun2                                                                                   ---

*: Significant at the .05 level

Table 41. Nemenyi's multiple comparisons tests of mean rank differences for the blank filling data - Group 3

TYPES                   1      4      5      11     23     24     27     33     34     36     37     29     28
1. Noun Prep            ---    2.74*  1.217  1.534  1.284  2.66*  1.884  4.61*  1.017  3.73*  2.506  1.339  1.572
4. Prep Noun                   ---    1.528  1.211  1.461  .078   .861   1.866  1.728  .994   .239   1.406  4.31*
5. Adj Prep                           ---    .317   .067   1.45   .667   3.39*  .2     2.522  1.289  .122   2.78*
11. SV(O) Prep O                             ---    .25    1.133  .35    3.07*  .517   2.205  .972   .195   3.10*
23. SV(O) Adverbial                                 ---    1.383  .6     3.32*  .267   2.455  1.222  .055   2.85*
24. SV(O) wh-word                                          ---    .783   1.944  1.65   1.072  .161   1.328  4.23*
27. Verb Noun (creat.)                                            ---    2.72*  .867   1.855  .622   .545   3.45*
33. Verb Adverb                                                          ---    3.59*  .872   2.105  3.27*  6.18*
34. Noun Noun                                                                   ---    2.72*  1.489  .322   2.589
36. Prep Det Noun                                                                      ---    1.233  2.4    5.31*
37. Phrasal Verb                                                                              ---    1.167  4.07*
29. Adj Noun                                                                                         ---    2.91*
28. Verb Noun (era)                                                                                         ---

*: Significant at the .05 level

4.2.2.3.4 Implicational Scaling for the Blank Filling Data by Groups

A matrix was compiled for each group, consisting of the types of collocation included in the blank filling test, ranked from most accurately answered by all subjects in each group to least accurately answered, and the subjects in each group ranked in order of their accuracy of response to all types of collocations, from subjects responding accurately to the most types to subjects responding accurately to the fewest types. As with the translation data, the 80% accuracy criterion was used. These matrices are given in Appendix O and summarised in Figures 15, 16, and 17 below.

For Group 1, the coefficient of reproducibility was .95, and the coefficient of scalability was .5. Even though the coefficient of reproducibility for the blank filling data for Group 1 is at the level necessary for this implicational scale to be considered valid, the coefficient of scalability is below the recommended level of .6 (see Andersen 1978; Hatch & Lazaraton 1991:212).

For Group 2, the coefficient of reproducibility was .95, and the coefficient of scalability was .61. The implicational scale for this set of data was found to be significant and the items on the scale scalable.
For Group 3, the coefficient of reproducibility was .928, and the coefficient of scalability was .68. The implicational scale for Group 3 was found to be significant and the items on the scale scalable.

Figure 15. Mean accuracy of response for the blank filling data - Group 1

Figure 16. Mean accuracy of response for the blank filling data - Group 2

Figure 17. Mean accuracy of response for the blank filling data - Group 3

4.2.2.3.5 Summary of the Results for the Blank Filling Data

The results of the Friedman repeated measures tests for the blank filling data support Hypothesis 2 that there are significant differences in the knowledge of collocations within proficiency groups. From the implicational scales it is apparent that certain types of collocation are answered more accurately than others in each group. For Group 1, Types 34. Noun Noun and 33. Verb Adverb were answered more accurately than the other types of collocation. For Group 2, Types 24. SV(O) wh-word, 23. SV(O) Adverbial, and 34. Noun Noun were answered more accurately than the other types. For Group 3, Type 33. Verb Adverb was answered more accurately than the other types. These results suggest that in each group subjects found certain collocation types easier to answer than others.

4.2.2.4 Summary of the Results for Hypothesis 2

The results from the analyses of the three sets of data, the essay data, the translation data and the blank filling data, support Hypothesis 2 by providing evidence that there are within-group differences in the use and knowledge of collocation types, assessed in this study both in terms of ability to produce collocations in the essays, and in terms of accuracy of response to questions eliciting collocations in the translation and blank filling tests.
The statistical significance for the implicational analyses of the translation and the blank filling data strongly suggests that there are group-specific patterns in the acquisition of collocations; that certain collocation types are easier than others to acquire; and that they do form an accuracy order. The next chapter discusses the significance of the results for the two hypotheses.

CHAPTER 5 DISCUSSION OF THE FINDINGS

5.0 Introduction

This study investigated the acquisition of English collocations by ESL subjects at three proficiency levels (post-beginners, intermediate, and post-intermediate) in an attempt to describe the development of English collocational knowledge in L2 learners. The acquisition of English collocations was measured both as free production of collocations (accuracy of use in the students' essays) and cued production of collocations (accuracy of response in the translation and blank-filling tests). Evidence for the development of collocational knowledge was sought by comparing the production and knowledge of collocation types across and within the different proficiency groups. An implicational scaling analysis was also performed on the data in an attempt to find evidence for accuracy orders in the acquisition of English collocations.

The findings are summarised and discussed in the following sections. In section 5.1 the free production results are discussed; in 5.2 the cued production results are considered; a summary of the findings is then presented in 5.3; the factors affecting the development of collocational knowledge are discussed in 5.4; and finally a summary of the discussion is given in 5.5. The pedagogical implications of this study are given in 5.6.

5.1 Free Production of Collocations

The accurate use of collocations in the subjects' essays was used as evidence for the acquisition of collocations.
There were significant differences in the production of a number of collocation types between and within the three proficiency groups.

5.1.1 Between-Group Differences

Subjects used significantly more Type 13. SV Inf and 19. SV(O) that collocations as their level of proficiency increased (see Table 24). There were also a number of collocations which were used significantly more by subjects in the highest level group. For example, Type 1. Noun Prep, 5. Adjective Prep, 15. SVO to Inf, and 21. SVOc collocations were used significantly more by subjects in Group 3. Types 1 and 5 are lexical collocations, and Types 13, 19, 15 and 21 are grammatical collocations that are syntactically more demanding than the simple grammatical collocations SV to Inf and SVc.

The analysis of the collocations in the subjects' textbooks revealed that the use of Type 13. SV Inf, 19. SV(O) that, 1. Noun Prep, 5. Adjective Prep, 15. SVO to Inf, and 21. SVOc collocations in the textbooks also increases as the level of difficulty of the language increases from TWE1 to TWE3. It is possible that the subjects' exposure to larger numbers of these collocations as their level of proficiency increased has influenced the production of these types in their essays. That is, the more the subjects were exposed to a particular collocation type, the more they used it. This is also reflected in the fact that, for each group, the order of frequency of use of the 37 collocation types in the students' essays correlated significantly with the order of frequency of the 37 types in the textbook for that particular group.

Types 26. SVc and 29. Adjective Noun were used significantly more by subjects at the lower proficiency levels.
This could be due to the fact that these two types of collocation are more frequent in everyday speech and syntactically simple (e.g. Type 26. SVc includes constructions such as 'I am a student', 'I am bad', 'She became a teacher'; Type 29. Adjective Noun includes collocations such as 'long hair', 'good student', 'beautiful girl'). Another explanation for the extensive use of these collocation types by Group 1 subjects is that students in this group used fewer collocation types overall, with more tokens used for each type. As the level of proficiency increased, the number of collocation types used in the essays also increased. The analysis of the collocations in the essays showed that subjects in the lowest proficiency level, Group 1, used only 23 out of the 37 collocation types investigated in this study, while subjects in the higher proficiency levels, Groups 2 and 3, used 29 and 28 of the 37 types respectively. Group 1 used fewer collocation types and a greater number of tokens for some types (e.g. 26. SVc and 29. Adjective Noun). Similar results were reported by Zhang (1993), who found that the more proficient L2 learners used significantly more collocation types than the less proficient L2 learners (Zhang 1993:147).

Another factor that could have influenced the subjects' performance with regard to the use of SVc and Adjective Noun collocations is the topic of the essay. Group 1 subjects had to describe themselves and their family in the essay, a topic that may have prompted the use of more Adjective Noun and SVc phrases. Group 2 subjects had to describe themselves and their plans for the future, and Group 3 students had to describe and discuss pollution problems in their home town. The essay topics for Groups 2 and 3 were thus not as purely descriptive as for Group 1.

Type 11. SV(O) Prep O collocations were also more frequent in the Group 1 and 2 essays.
The textbook analysis also shows that TWE3, the textbook for Group 3 (post-intermediate students), contains the fewest collocations of this type when compared to the other two textbooks. This suggests that the students' production of collocations may mirror their exposure to these collocations in their current textbook, and not necessarily the incremental growth of collocations from TWE1 to TWE3.

There were also collocation types which were used significantly more by Group 2 than by Group 1, but which were used less frequently by Group 3. These types are: 12. SV to Inf, 36. Prep Det Noun, 37. Phrasal Verb, 4. Prep Noun, 14. SVV-ing, and 24. SV(O) wh-word. Such a phenomenon has been described in previous studies as 'backsliding' (Lightbown 1985a). According to Lightbown, L2 acquisition is not linear and cumulative, but is characterised by backsliding and loss of forms that appeared to be previously mastered. In this study, Group 3 subjects are able to use the above collocation types, but they seem to rely less on the use of these types than subjects in the lower proficiency levels. Backsliding has been reported in previous developmental studies too (see Hyltenstam 1977; Andersen 1978).

There were also collocation types which were not used at all. These are: 3. Noun that, 18. SV Possessive V-ing, 22. SVOO, 25. S(it)VO to Inf, 28. Verb Noun (eradication), and 32. Adverb Adjective. The majority of these types are structurally demanding and infrequent in everyday English. According to the BBI, examples of these types are: 'We reached an agreement that she would represent us in court', or 'it was his desire that his estate be divided equally' (Type 3. Noun that); 'They love his clowning', or 'This fact justifies Bob's coming late' (Type 18. SV Possessive V-ing); 'It surprised me to learn of her decision' (Type 25. S(it)VO to Inf). Type 22.
SVOO collocations consist of a transitive verb and two objects, neither of which can be used in a prepositional phrase with 'to' or 'for', e.g. 'God will forgive them their sins', or 'we bet her ten pounds'. Previous research has also shown that SV Possessive V-ing constructions are acquired late (Anderson 1978:97). Also, SVOO constructions were found to be acquired after the more unmarked SVO to O constructions (Mazurkewich 1984).

What the above collocation types appear to have in common is a greater degree of complexity. Studies in L1 acquisition have shown that grammatical complexity is a determinant of acquisition orders (see Brown 1973). Given that collocations in this study are operationalised in terms of structurally determined patterns, grammatical complexity could be a factor affecting the pattern of results obtained in this study. Zhang's (1993) study also defined collocations in structural terms, and he found that the L2 learners in the study avoided, and were unable to produce, the more structurally demanding collocations when compared with native speakers (Zhang 1993:126). These collocation types are also structurally different from their equivalent collocations in Greek, e.g. the Greek equivalents of Noun that collocations are Noun to [Passive Voice] Infinitive. Laufer and Eliasson (1993) have also reported that L1-L2 difference was the best predictor of avoidance in their investigation of the use of phrasal verbs by Swedish and Hebrew ESL learners.

Finally, the absence of Type 28. Verb Noun (eradication) and 32. Adverb Adjective collocations could be due to the fact that these types are relatively fixed (not free combinations) and therefore difficult to acquire. For example, some Type 28. Verb Noun (eradication) collocations in the BBI are 'to reject an appeal', 'to reverse a decision', 'to rescind a tax'. The authors of the BBI suggest that collocations of this type are arbitrary and unpredictable, i.e.
no predictions can be made as to why certain verbs combine with certain nouns. L2 learners therefore have difficulty acquiring them, as they cannot tell why 'make an estimate' is acceptable but *'make an estimation' is not (Benson et al. 1986b:258). For Type 32. Adverb Adjective collocations the BBI includes 'deeply absorbed', 'strictly accurate', 'sound asleep'. Previous research has also revealed that adverbs, in particular, are difficult for the L2 learner to use appropriately because they typically collocate with specific words, i.e. they are fixed (Linnarud 1986:105).

With respect to Types 3, 18, 22, 25, 28, and 32, it is possible that the subjects in this study have not yet reached a proficiency level advanced enough to use such complex, infrequent, and/or fixed collocations. Also, the analysis of the TWE series showed that Types 3. Noun that, 18. SV Possessive V-ing, and 25. S(it)VO to Inf do not appear in the subjects' textbooks, i.e. no tokens of those collocation types were found in any of the three textbooks. Furthermore, only a limited number of Type 22. SVOO, 28. Verb Noun (eradication), and 32. Adverb Adjective collocations were found in the textbooks (see Table 7, Chapter 3). It appears that lack of exposure to specific collocation types, or the low frequency of these collocation types in the subjects' textbooks, may also have contributed to the avoidance of these types by the subjects.

Collocational development across groups was examined by implicational scaling analysis of acquisition orders. The implicational scale for the essay data was found to have a significant coefficient of reproducibility, which means that a subject's performance can be predicted with a high degree of accuracy from that subject's position on the scale.
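The two coefficients used in this study to evaluate implicational scales can be illustrated with a minimal sketch. It follows the general procedure described in textbook treatments of implicational scaling, but the error-counting rule (deviations from the ideal pattern implied by each subject's total score), the function name, and the small example matrix are assumptions made for illustration; they are not the study's data or its exact procedure.

```python
from typing import List, Tuple


def guttman_coefficients(matrix: List[List[int]]) -> Tuple[float, float]:
    """Return (reproducibility, scalability) for a 0/1 subject-by-item matrix.

    A cell is 1 if the subject reached the accuracy criterion on that item.
    """
    n_subjects = len(matrix)
    n_items = len(matrix[0])
    total_cells = n_subjects * n_items

    # Column totals: how many subjects reached the criterion on each item.
    item_totals = [sum(row[j] for row in matrix) for j in range(n_items)]

    # Order items from easiest (most 1s) to hardest (fewest 1s).
    order = sorted(range(n_items), key=lambda j: -item_totals[j])

    # Assumption: an error is any cell that deviates from the ideal pattern
    # implied by the subject's total score (1s on the k easiest items only).
    errors = 0
    for row in matrix:
        score = sum(row)
        for rank, j in enumerate(order):
            expected = 1 if rank < score else 0
            if row[j] != expected:
                errors += 1

    c_rep = 1 - errors / total_cells

    # Minimal marginal reproducibility: per item, the larger of the 1s and 0s.
    max_marginals = sum(max(t, n_subjects - t) for t in item_totals)
    mm_rep = max_marginals / total_cells

    # Scalability: improvement over the marginals relative to the largest
    # improvement possible; undefined if the marginals predict every cell.
    if mm_rep == 1:
        return c_rep, float("nan")
    c_scal = (c_rep - mm_rep) / (1 - mm_rep)
    return c_rep, c_scal


# Hypothetical matrix: 4 subjects (rows) by 3 collocation types (columns).
example = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 1],  # this subject deviates from the ideal pattern
]
c_rep, c_scal = guttman_coefficients(example)
print(round(c_rep, 3), round(c_scal, 3))
```

Because scalability discounts whatever the item marginals alone would predict, a scale can show high reproducibility and still fall below the recommended scalability level.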
Although previous studies using implicational scaling analysis considered a high coefficient of reproducibility as adequate evidence for the presence of an implicational scale (see Andersen 1978; Hyltenstam 1977), in this study the coefficient of scalability was also calculated, to provide additional evidence as to the strength of the collocation types as an ordered scale. The essay data were found to have a low coefficient of scalability (Cscalability = .33). It is possible that the large number of items on the scale for this data could have reduced the strength of the scale. Also, the backsliding learning patterns which occurred for some collocation types will influence the scalability of the data (Hatch & Lazaraton 1991:216). Even though the essay scale does not reach the recommended level of scalability, the relative magnitude of the predictive power of the scale cannot be determined, due to the lack of other implicational analysis studies in the acquisition of collocations. According to Davidson (1987) "the magnitude of a coefficient of scalability should rightly be judged against similar findings in the field" (p. 25). Since there are no other studies similar to this one, it is possible that even a scalability of .33 is valid enough as a predictor for the order of acquisition of collocation types (Davidson 1987:26). However, only future research and implicational analysis on collocations can verify this.

5.1.2 Within-Group Differences

Differences in the use of collocations were also sought in the writing performance of each group of subjects. The following types were used most frequently in each of the three groups.

Table 42. Collocation types used most frequently in the students' essays

Group 1 Group 2 Group 3
26. SVc 26. SVc 26. SVc 29. Adjective Noun 11. SV(O) Prep O 29. Adjective Noun 13. SV Inf 12. SV to Inf 12. SV to Inf 11. SV(O) Prep O 36. Prep Det Noun

It appears that Type 26.
SVc collocations were used significantly more than the other types in all three groups. Given that Type 26. SVc constructions are basic and frequent in everyday speech, e.g. 'I am a student', 'I am happy', 'She became a teacher', it is not surprising that subjects at all levels of proficiency used collocations of this type more than any other type. Zhang's (1993) study also showed that more SVc collocations were used by all learners, more and less proficient, in their essays (Zhang 1993:125). These results are also in line with previous research on the sequence of acquisition of grammatical structures by Fathman (1977). She found that structures that needed to be produced correctly for effective communication, such as SVc constructions, were learned early. Also, according to Pienemann's Processability Model, copula sentences such as 'I am a student' belong to Stage 1 (basic sentence structures and basic categories) of second language acquisition (Pienemann 1996). Evidence from Japanese as a second language has also shown copula sentences to be a Stage 1 structure (Huter 1996). This collocation type may then be considered a 'core' type in the acquisition of collocations.

Groups 1 and 2 also used Type 29. Adjective Noun and 11. SV(O) Prep O collocations significantly more than the other types. As already mentioned above, it is possible that the topic of the essay for these two groups (see Appendix B) could have influenced the frequency of use of Type 29. Adjective Noun collocations. As far as Type 11. SV(O) Prep O collocations are concerned, TWE1 and TWE2 contain more collocations of this type than TWE3. The implicational scaling for the essay data between groups (see Figure 3) shows that Types 26, 29 and 11 are also the first three items on the implicational scale for all groups. Since Group 1 is the lowest proficiency group investigated by this study, it is understandable that the subjects in this group use the easiest collocation types more than the others.
Therefore, it can be concluded that Types 26, 29 and 11 are early acquired collocation types, as their use was measured in the writing performance of L2 learners in this study.

Types 12. SV to Inf and 13. SV Inf are also used more than the others by higher level students, Groups 2 and 3. Both these types are still among the first six items on the implicational scale of the essay data for all groups (see Figure 3), i.e. they are among the most frequently used types of collocation, but their use increases significantly in Groups 2 and 3. Zhang (1993) also reports that these two types of collocation were used frequently by the L2 learners in his study (Zhang 1993:126). The textbook analysis reveals a few tokens of these two types in TWE1 (19 tokens for Type 12, and 26 tokens for Type 13) and a considerable increase in TWE2 and TWE3 (TWE2: 234 tokens for Type 12, and 230 tokens for Type 13; TWE3: 285 tokens for Type 12, and 347 tokens for Type 13).

From a linguistic point of view, the fact that Type 13. SV Inf collocations are acquired later than Type 12. SV to Inf collocations could be due to cumulative grammatical complexity, a notion introduced by Brown (1973). Cumulative grammatical complexity assumes that a construction y is more complex than a construction x only if y involves all the transformations involved in x plus one or more others (Brown 1973:377). In this respect, cumulative grammatical complexity is different from the theory of derivational syntactic complexity, which assumes that all transformations involve a constant increment of complexity (see Brown & Hanlon 1970).
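Brown's criterion can be read as a proper-superset test on sets of transformations, which the following minimal sketch illustrates. The transformation labels (T1 to T4) and the function name are hypothetical and serve only to show the logic of the comparison.

```python
from typing import FrozenSet


def cumulatively_more_complex(y: FrozenSet[str], x: FrozenSet[str]) -> bool:
    """True if y involves every transformation in x plus at least one more."""
    return x < y  # proper-subset test on the two sets of transformations


# Hypothetical transformation labels, for illustration only.
construction_x = frozenset({"T1", "T2"})
construction_y = frozenset({"T1", "T2", "T3"})
construction_z = frozenset({"T1", "T4"})

# y contains everything in x plus T3, so y counts as more complex than x.
print(cumulatively_more_complex(construction_y, construction_x))

# x and z each involve a transformation the other lacks, so the criterion
# does not rank them in either direction.
print(cumulatively_more_complex(construction_z, construction_x))
print(cumulatively_more_complex(construction_x, construction_z))
```

On this reading the criterion yields only a partial ordering: two constructions can be compared only when the transformations of one are wholly contained in those of the other.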
Derivational syntactic complexity proved inadequate for providing an explanation of language acquisition (see Smith 1988), and Brown claims that the cumulative number of transformations is a better index of complexity (Brown 1973:377; for other approaches to assessing lexico-syntactic complexity see Frazier 1988; Crain & Shankweiler 1988; Cheung & Kemper 1992; Hulstijn & deGraaff 1994; Hulstijn 1995). In the present data, Type 13. SV Inf requires all the rules that other constructions containing infinitives require, plus one more, i.e. to-deletion. Type 13. SV Inf collocations are thus more difficult and hence are acquired later.

Furthermore, Type 13. SV Inf represents collocations that contain modal auxiliaries, e.g. 'can, could, should, would, may + Inf'. Modal auxiliaries constitute a closed class of verbs with limited distributions and have distinct features when compared to regular verbs, e.g. they require to-deletion before their combination with an infinitive, they take no third-person inflection, they have abnormal time reference, and they can only occur as the first element of the verb phrase (see Quirk, Greenbaum, Leech & Svartvik 1985; Steele 1981; on the learnability of English auxiliaries in L1 acquisition see Pinker 1984). From a developmental point of view, the correct use of Type 13 collocations mainly by Group 3 subjects indicates that accurate use of modal auxiliaries develops later in L2 learners and thus Type 13 collocations are developmentally 'difficult'. From a learnability point of view, Type 13 collocations are different from their equivalent collocations in Greek, which do not require to-deletion, e.g. 'μπορείς να ψωνίσεις εδώ' [you can to shop here] is SV[Modal Auxiliary] to Inf. Due to the L1-L2 difference, Type 13 collocations can be considered more difficult than Type 12 collocations. Similar results regarding the use of modal auxiliaries are also reported by Anderson (1978).
In Ravem (1974) too, it was reported that the acquisition of a full range of auxiliary morphemes (which included Modals) and their distribution develops late (Ravem 1974:148).

The implicational scales for the essay data by group have coefficients of scalability below the recommended level. As with the implicational scale for the essay data (all groups), the large number of items on the scale could be responsible for the low scalability. Despite the low coefficients of scalability, the three scales reveal orders of difficulty similar to the patterns of acquisition, as measured by the Friedman repeated measures analyses. The scale for Group 1 has Types 26, 29 and 11 as the first three items on the scale. The scale for Group 2 has Types 26, 29, 12, 11 and 36 as the first five items on the scale. Finally, the scale for Group 3 has Types 26, 13, 12, and 36 as the first four items on the scale. The results from the implicational scaling analysis, although not reaching statistical significance, exhibit a pattern that supports the view that certain orders exist in the acquisition of collocations, as measured by the writing performance of L2 learners. These orders appeared to be influenced by exposure, as the subjects' textbook analysis shows, and/or the complexity, arbitrariness, and predictability of specific collocation types (see above).

The correlation of the three implicational orders (see also Fathman 1977; Pica 1983) showed that the orders for subjects in Groups 1 and 2 were highly correlated (rs = .832); the orders for Groups 2 and 3 were also highly correlated (rs = .766); and the orders for Groups 1 and 3 revealed the lowest correlation (rs = .552). These results show a gradual development of collocational knowledge across the three Groups in the study.
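The rank-order correlations reported above are Spearman coefficients (rs). As a rough illustration of how such a coefficient compares two orders, the sketch below uses the standard formula for rankings without ties. The function name and the two example rankings are hypothetical, and the study's own computation may have handled ties differently.

```python
from typing import Sequence


def spearman_rho(ranks_a: Sequence[int], ranks_b: Sequence[int]) -> float:
    """Spearman rank correlation for two rankings of the same items.

    Assumes each ranking is a permutation of 1..n with no tied ranks.
    """
    n = len(ranks_a)
    if n != len(ranks_b) or n < 2:
        raise ValueError("rankings must have the same length (at least 2)")
    # Sum of squared differences between the two ranks of each item.
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


# Hypothetical positions of five collocation types on two groups' scales.
order_group_a = [1, 2, 3, 4, 5]
order_group_b = [1, 2, 3, 5, 4]  # the last two types swap places
print(spearman_rho(order_group_a, order_group_b))
```

A value close to 1 means that the two groups rank the collocation types in nearly the same order; lower values indicate that the orders diverge.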
Since adjacent Groups in this study were only one year apart, the lower correlation between Groups 1 and 3 suggests that collocational knowledge had progressed to a different stage after only two years of instruction (exposure to collocations via the TWE textbooks) and maturation (during the period between 12 and 15 years of age). Thus, even though the implicational scales for the essay data lack statistical significance, they can still be used as indicators of the development of English collocational knowledge in L2 learners.

5.2 Cued Production of Collocations

Knowledge of collocations was also measured as accuracy of response to the translation and blank-filling tests. Between- and within-group differences in the accuracy of responses were used as evidence of the development of collocational knowledge in the three proficiency levels.

5.2.1 Translation Data

Subjects were tested on their ability to translate correctly sentences from their L1 into English. Each sentence contained an English collocation that was different from its equivalent in the learners' L1. The significant results for this set of data are discussed below.

5.2.1.1 Between-Group Differences

The students in Group 3 performed with the greatest accuracy in the translation test. The results showed that Type 1. Noun Prep, 5. Adjective Prep, 14. SVV-ing, and 27. Verb Noun (creation) collocations were translated significantly more accurately by higher proficiency subjects.

The results for Type 1. Noun Prep reflect to some extent the treatment of this collocation type in the subjects' textbooks: 76 tokens in TWE1, 80 tokens in TWE2 and 145 tokens in TWE3. The use of Type 27. Verb Noun (creation) collocations also increases in the textbooks in a pattern similar to the one found in the results of the translation test, i.e. as level increases the number of collocations found in the textbooks increases. The results for Type 5.
Adjective Prep collocations show a significant difference in accuracy only between Group 1 and Group 3, with the highest proficiency subjects, Group 3, being more accurate than the others. A similar pattern is also found in the students' textbooks. Although Type 14. SVV-ing collocations appear more in TWE1, students were able to translate them with significantly more accuracy after their level increased. Finally, Type 11. SV(O) Prep O collocations were translated most accurately by subjects in Group 2. Collocations of this type were also found more in TWE2 than in the other two textbooks.

Implicational scaling analysis was also used for the between-group differences. The coefficient of reproducibility was found to be significant, which confirmed the predictive power of the scale. The coefficient of scalability for the translation data approached significance (Cscalability = .57). The smaller number of collocation types included in the translation test could have contributed to the higher coefficient of scalability. This result also indicates that a small number of items and a translation test are more likely to yield strong enough differences in performance for a set of collocation types to be truly scalable, i.e. implicationally ordered.

Considering the implicational scale for the translation data, the following order of accuracy was found after the 80% criterion of acquisition was applied to the data (types at the top of the order were more accurately translated than types at the bottom):

Table 43. Accuracy order for the collocation types included in the translation test - All Groups

Type
14. SVV-ing
16. SVO Inf
11. SV(O) Prep O
1. Noun Prep
5. Adjective Prep
27. Verb Noun (creation)

The results from the implicational analysis indicate that students were more accurate in translating grammatical collocations (Types 14, 16, and 11) than lexical collocations (Types 1, 5, and 27). Type 14. SVV-ing was easier to translate than Type 16. SVO Inf.
Similar results were reported by Anderson (1978), who found that gerund SVV-ing constructions were acquired earlier than SVO Inf constructions that required to-deletion (Anderson 1978:97).

The most accurately translated lexical collocation type on the scale, Type 1. Noun Prep, included collocations such as 'things about', 'flight to', 'plans about', 'champion in', 'success in', 'pain in [the stomach]'. Students found these lexical collocations easier to translate than Type 5. Adjective Prep collocations. The following Type 5 collocations were included in the translation test: 'afraid of', 'interested in', 'bored with', 'married to'. Type 1 occurred more frequently than Type 5 in the TWE series, i.e. the type-token ratio for Type 1 in the TWE series was 100.3, while for Type 5 it was 82.6. Also, all of the Type 1 collocations included in the translation test have a similar structure in Greek, i.e. a noun followed by a preposition. Some Type 5 collocations, on the other hand, e.g. 'afraid of' and 'bored with', have a different structure in Greek, i.e. Verb Det Noun: 'φοβάμαι τα φίδια' [afraid-[Middle Voice Verb] the snakes], 'βαριέμαι το σχολείο' [bored-[Middle Voice Verb] the school]. The L1-L2 difference with regard to the English Adjective Prep collocations could be one factor responsible for the subjects' low accuracy in the translation of Type 5. Adjective Prep collocations. It has also been reported that Adjective Prep collocations are more fixed (i.e. consistently used with a preposition, e.g. 'fond of', 'afraid of', 'deaf to' (Benson et al. 1986a:xii)) and difficult for low proficiency students, and as such they are indicative of a higher level of proficiency. Zhang (1993) also reports that in his investigation of English collocational knowledge by L2 learners and native speakers, collocations such as Adjective Prep were used considerably more by native writers than L2 learners.
Type 27, the most difficult collocation type on the scale, included lexical collocations that are fairly fixed in English, e.g. 'draw conclusions', 'face problems', and different from their equivalent collocations in Greek, e.g. 'βγάζω συμπεράσματα' [take out conclusions], 'αντιμετωπίζω προβλήματα' [confront problems]. The arbitrary nature of Verb Noun (creation) collocations has also been reported by the writers of the BBI (Benson et al. 1986b). The arbitrariness and unpredictability of these collocations make non-native speakers unable to cope with them. It is not surprising, then, that such collocations were difficult for the subjects of this study. Also, an examination of the translations supplied by the students showed considerable influence from Greek. It is possible that the nature of the test, i.e. translation, could have increased L1 influence. L1 interference has also been reported in past studies on collocations involving a translation test (Marton 1977:46).

The acquisition order for the translation data approached statistical significance: that is, students who correctly translated Type 27. Verb Noun (creation) collocations, the last and most difficult to translate type on the scale, also translated correctly the rest of the collocation types included in the translation test. Overall, results show a very low accuracy in the translation test, i.e. only 88 out of 275 subjects, about 32%, were 80% or more accurate in the translation of Type 14. SVV-ing collocations, which was the most accurately translated type on the scale. Hence, translation proved to be a difficult test for the subjects. Previous research involving advanced L2 learners, i.e. fifth year Polish students majoring in English, in a translation test, Polish to English, showed that even advanced students did not have most of the collocations which were tested in their productive repertoires (Marton 1977:45).
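The 80% criterion used throughout the implicational analyses can be sketched as a simple thresholding step. The function names and the per-subject scores below are hypothetical; only the 80% threshold and the figure of 88 out of 275 subjects come from the text.

```python
from typing import Iterable, Tuple


def meets_criterion(correct: int, attempted: int, criterion: float = 0.8) -> bool:
    """True if the proportion of accurate responses reaches the criterion."""
    return attempted > 0 and correct / attempted >= criterion


def proportion_at_criterion(
    scores: Iterable[Tuple[int, int]], criterion: float = 0.8
) -> float:
    """Share of subjects whose (correct, attempted) score meets the criterion."""
    score_list = list(scores)
    passed = sum(meets_criterion(c, a, criterion) for c, a in score_list)
    return passed / len(score_list)


# Hypothetical (correct, attempted) scores for four subjects on one type.
hypothetical_scores = [(4, 5), (3, 5), (5, 5), (2, 5)]
print(proportion_at_criterion(hypothetical_scores))  # 0.5

# Figure reported in the text for Type 14. SVV-ing: 88 of 275 subjects.
print(round(88 / 275, 2))  # 0.32
```

Each subject's result on a collocation type is thereby reduced to a yes/no value, which is the form of data that an implicational scale is built from.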
5.2.1.2 Within-Group Differences

The results for the translation data revealed significant within-group differences in the translation responses.

Group 1

For Group 1, Types 23. SV(O) Adverbial and 13. SV Inf were translated more accurately than all the other types. The implicational scaling analysis also confirms that more subjects were 80% or more accurate in translating Types 23 and 13 than in translating the other types. The coefficient of reproducibility and the coefficient of scalability were both found to be statistically significant for this analysis, which suggests that the accuracy order for the translation data for Group 1 has validity and predictive power. The order is the following (at the top are those types that were the easiest to translate, while at the bottom are those types that were the most difficult to translate):

Table 44. Accuracy order for the collocation types included in the translation test - Group 1

Type
23. SV(O) Adverbial
13. SV Inf
16. SVO Inf
14. SVV-ing
5. Adjective Prep
11. SV(O) Prep O
1. Noun Prep
27. Verb Noun (creation)

The above accuracy order shows the following.

i) SV Inf collocations are easier to translate than SVO Inf collocations, which in turn are easier to translate than SVV-ing collocations.

ii) Collocations containing a preposition, i.e. Adjective Prep, SV(O) Prep O, and Noun Prep, are more difficult to translate than collocations containing an infinitive, i.e. SV Inf, and SVO Inf. Prepositions are also more likely to cause interference from the subjects' L1 than infinitives. Greek has a number of prepositions that do not always coincide with the English prepositions, e.g. 'pain in the stomach' is 'πόνος στο στομάχι' [pain to the stomach], 'things about other countries' is 'πράγματα για άλλες χώρες' [things for other countries]. On the other hand, infinitives in Greek are like their English equivalents.
Prepositional phrases and phrasal verbs have also been reported as constructions that exhibit arbitrary lexical restrictions (Allerton 1984), and as such they are difficult to acquire.

iii) Verb Noun (creation) lexical collocations are the most difficult to translate, a result that was also evident from the between-group analysis of the translation data.

Group 2

The results for Group 2 show that only Type 11. SV(O) Prep O collocations were translated significantly more accurately than the other collocation types. The implicational scaling analysis for this set of data, which was also found to be statistically significant, shows the accuracy order given in Table 45:

Table 45. Accuracy order for the collocation types included in the translation test - Group 2

Type
11. SV(O) Prep O
16. SVO Inf
14. SVV-ing
1. Noun Prep
5. Adjective Prep
27. Verb Noun (creation)

The above accuracy order shows that:

i) Grammatical collocations are easier to translate than lexical collocations.

ii) As with the accuracy order for Group 1, collocations that contain a preposition, i.e. Noun Prep and Adjective Prep, are more difficult to translate than collocations that contain an infinitive.

iii) Verb Noun (creation) lexical collocations are also the most difficult to translate for Group 2.

Comparing the accuracy orders for Groups 1 and 2, it appears that they are similar with respect to most types. The only exception is Type 11. SV(O) Prep O. Subjects in Group 2 found collocations of this type easier to translate than the subjects in Group 1. That is, subjects that received an additional year of instruction were more accurate in translating Type 11. SV(O) Prep O collocations. Group 2 subjects also received more exposure to Type 11 collocations through their textbooks than Group 1 students, i.e. Type 11 collocations were found more frequently in TWE2 than in TWE1.

Group 3

The results for the translation data for Group 3 show Types 13. SV Inf, 14. SVV-ing and 1.
Noun Prep to be translated more accurately than the other types. The implicational scaling analysis approached statistical significance (Cscalability = .59). The following accuracy order was obtained:

Table 46. Accuracy order for the collocation types included in the translation test - Group 3

Type
13. SV Inf
14. SVV-ing
1. Noun Prep
16. SVO Inf
11. SV(O) Prep O
5. Adjective Prep
27. Verb Noun (creation)

The above accuracy order shows that:

i) SV Inf collocations are easier to translate than SVO Inf collocations.

ii) With the exception of Noun Prep collocations, collocation types that contain a preposition are more difficult to translate than collocation types that contain an infinitive.

iii) Verb Noun (creation) collocations are the most difficult to translate for Group 3.

Even for the subjects in the highest proficiency level, Verb Noun (creation) collocations are the most difficult to translate with accuracy. The same applies to structures that are grammatically more complex, e.g. SVO Inf versus SV Inf (see above). According to the notion of cumulative grammatical complexity (Brown 1973), SVO Inf structures are more complex than SV Inf structures since they require the insertion of an Object, and as such they are more difficult to acquire. Also, recent research in L2 acquisition has shown that the greater number of units and morphemes in some structures obscures their perceptual 'salience', making them harder to 'notice' and therefore to produce accurately (Bardovi-Harlig 1987; Robinson 1995; Schmidt 1990, 1995). SVO Inf collocations contain all the units of SV Inf constructions plus one more, i.e. the Object, and as such they are less salient and harder to produce accurately. The above result is also consistent with previous studies on the acquisition of grammatical structures (see Anderson 1978).

The subjects' accuracy improves significantly in the translation of Type 1. Noun Prep collocations, i.e.
as the students' level increased, their ability to translate lexical collocations also improved. This result is in line with Zhang's (1993) study, in which he found that the high proficiency L2 students had a better command of English lexical collocations than the low proficiency L2 students (Zhang 1993:148).

Comparing the three accuracy orders for the translation data, we can conclude the following.

i) Verb Noun (creation) collocations are difficult for all three proficiency groups. This was also evident from the between-group analysis. Collocations of this type also appeared infrequently in the students' essays with no significant between-group differences. As mentioned above, collocations of this type are fixed, e.g. 'to face problems', 'to draw conclusions', and subjects at all three levels exhibit a general weakness in the free production (essay data) and cued production (translation data) of Verb Noun (creation) collocations. Zhang (1993) also found the use of such collocations to be "weak areas" for L2 learners (Zhang 1993:106).

ii) Grammatical collocations, e.g. SV Inf, SVO Inf, SVV-ing, are easier to translate than lexical collocations, e.g. Noun Prep, Adjective Prep, Verb Noun (creation). However, as subjects become more proficient, their accuracy in lexical collocations improves, i.e. Group 3 subjects become more accurate in translating Noun Prep collocations than students in Groups 1 and 2. This is also consistent with Zhang's results.

iii) Collocations that contain prepositions are harder to translate than collocations that contain infinitives. Zhang reports that "knowing prepositions and being able to use them in idiomatic combinations with other words are part of native fluency" (Zhang 1993:135). In his study, too, L2 learners showed a weakness in knowledge and ability to use collocations that contained prepositions.

5.2.2 Blank-Filling Data

Cued production of collocations was also measured in a blank-filling test.
Each sentence contained an English collocation with one part missing. Subjects were required to provide the missing part of each collocation. The collocations included in the blank-filling test could not be translated directly into the learners' L1, Greek. The results for this set of data are discussed below.

5.2.2.1 Between-Group Differences

The results in the blank-filling data revealed that for Types 1. Noun Prep, 11. SV(O) Prep O, 23. SV(O) Adverbial, 24. SV(O) wh-word, 27. Verb Noun (creation), 33. Verb Adverb, 36. Prep Det Noun, and 37. Phrasal Verb collocations, subjects were significantly more accurate in supplying the correct collocation as their level of proficiency increased. The textbook analysis (see Chapter 3) also showed that all but two of the above collocation types exhibit a similar pattern of increase across the three textbooks. For example, tokens for Types 1. Noun Prep, 24. SV(O) wh-word, 27. Verb Noun (creation), 33. Verb Adverb, 36. Prep Det Noun, and 37. Phrasal Verb increase as the level of language proficiency increases from TWE1 to TWE3. The students' exposure to these collocation types could have influenced their performance on the blank-filling test, i.e. the more frequently students were exposed to a particular type of collocation, the more accurate they became in their knowledge of collocations of this type.

The subjects' performance on two collocation types showed a U-shaped pattern of acquisition. For Types 5. Adjective Prep and 34. Noun Noun, subjects in Group 1 were more accurate than subjects in Group 2, who were also less accurate than subjects in Group 3. A look at the specific collocations shows that the level of difficulty increases with proficiency. For example, Type 34. Noun Noun collocations for Group 1 were 'post office' and 'phone number', for Group 2 'traffic lights', and for Group 3 'curriculum vitae'.
Group 3 subjects were more accurate in responding to the 'curriculum vitae' collocation than subjects in Group 2 were with the 'traffic lights' collocation, even though 'curriculum vitae' is less frequent than 'traffic lights' in everyday speech. The analysis of the textbooks shows that 'traffic lights' appears twice in TWE2, while 'curriculum vitae' appears only in TWE3, four times. Again, the amount of exposure to a specific collocation appears to influence performance.

Performance on the Adjective Prep collocations also increased as the level of proficiency increased. For example, some of the Adjective Prep collocations for Group 3 were: 'competent in', 'fond of', 'successful in', 'married to', 'unsure about', 'similar to', 'slow in', 'capable of', 'regardless of'. Group 2 subjects were tested on the following Type 5 collocations: 'full of', 'sympathetic to', 'engaged to', 'upset about'. Despite the fact that Group 3 students were faced with a larger number of Type 5 collocations compared to Group 2 students, they were more accurate in supplying the correct collocations.

Both Noun Noun and Adjective Prep collocations are lexical collocations. It appears that students at the initial stages of ESL learn specific lexical collocations, possibly as unanalysed chunks, and hence they are relatively accurate with respect to selected lexical collocations. As their proficiency increases and their grammatical knowledge develops, their relative accuracy in lexical collocations declines and they become better in grammatical collocations (in the translation test too, intermediate level students were better in SV(O) Prep O collocations). At the post-intermediate level, the subjects' overall accuracy increases, and they once again become more accurate in lexical collocations. Such a U-shaped development in L2 learners has been reported in previous linguistic studies too (see McLaughlin 1987, 1990; Lightbown et al. 1980).
Overall, accuracy improves in the same fashion as in the essay and translation tasks. Subjects in Group 3 performed the best and subjects in Group 1 performed the worst. The implicational analysis for the blank-filling data had a low coefficient of scalability (Cscalability = .33), and thus there is little evidence of a stable acquisition order. Again, the U-shaped learning patterns probably contributed to the low scalability of the blank-filling data (see Hatch & Lazaraton 1991:216).

5.2.2.2 Within-Group Differences

There were significant differences in the accuracy of the subjects' responses to specific collocation types within each group.

Group 1

After the 80% accuracy criterion was applied to the data for the implicational scaling analysis, the following accuracy order was revealed:

Table 47. Accuracy order for the collocation types included in the blank-filling test - Group 1

Type
34. Noun Noun
33. Verb Adverb
24. SV(O) wh-word
36. Prep Det Noun
5. Adjective Prep
23. SV(O) Adverbial
30. Noun Verb
1. Noun Prep
4. Prep Noun
37. Phrasal Verb
29. Adjective Noun
31. Noun1 of Noun2
11. SV(O) Prep O
27. Verb Noun (creation)

The non-significant accuracy order shows that subjects in Group 1 were more accurate in lexical collocations, i.e. Noun Noun, Verb Adverb, Prep Det Noun, Adjective Prep, than in grammatical collocations that were more difficult to produce (longer collocational strings), e.g. SV(O) wh-word, SV(O) Adverbial, SV(O) Prep O.

Group 2

After the 80% accuracy criterion was applied to the data for the implicational scaling analysis, the following accuracy order was obtained for Group 2 subjects:

Table 48. Accuracy order for the collocation types included in the blank-filling test - Group 2

Type
24. SV(O) wh-word
23. SV(O) Adverbial
34. Noun Noun
31. Noun1 of Noun2
4. Prep Noun
5. Adjective Prep
36. Prep Det Noun
37. Phrasal Verb
27. Verb Noun (creation)
11. SV(O) Prep O
33. Verb Adverb
1. Noun Prep

The accuracy order for Group 2 is statistically significant and reveals that as the level of proficiency increases, subjects become more accurate in their responses to grammatical collocations that were initially difficult to produce (see scale for Group 1 subjects), i.e. SV(O) wh-word, SV(O) Adverbial. With regard to SV(O) wh-word collocations, the subjects' accuracy could be due to the systematic appearance of this type of collocation in their textbook, TWE2. SV(O) wh-word collocations are mainly used in the TWE series to give instructions for the various tasks in the textbooks, e.g. "ask what the area code for Liverpool is" (TWE2:56), "find out why Sam went back to his home town" (TWE2:100). TWE1, because it was designed for beginner levels, gives task instructions in Greek. It is in TWE2 that instructions are given in English and a large number of SV(O) wh-word collocations are included in the textbook. Hence, the subjects in Group 2 had more exposure to SV(O) wh-word collocations compared with Group 1 students.

Group 3

After the 80% accuracy criterion was applied to the data for the implicational scaling analysis, the following accuracy order was evident for subjects in Group 3:

Table 49. Accuracy order for the collocation types included in the blank-filling test - Group 3

Type
33. Verb Adverb
34. Noun Noun
4. Prep Noun
24. SV(O) wh-word
23. SV(O) Adverbial
29. Adjective Noun
27. Verb Noun (creation)
1. Noun Prep
36. Prep Det Noun
37. Phrasal Verb
28. Verb Noun (eradication)
5. Adjective Prep
11. SV(O) Prep O

The above accuracy order for Group 3 was statistically significant. It shows that subjects in the highest proficiency level were accurate in both lexical and grammatical collocations. Overall, as with the translation data, the subjects' accuracy in the blank-filling test was low: only 45 subjects out of 275, about 16%, were accurate in their responses to Type 34, the type with the most accurate answers.
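The implicational scaling procedure behind the accuracy orders above can be illustrated with a short sketch. It is not the analysis script used in this study: the function names (meets_criterion, implicational_scale) and the toy data are hypothetical, the error count follows one common convention (deviations from the ideal pattern implied by each subject's total score), and the coefficients follow the formulas usually given for Guttman scaling (see Hatch & Lazaraton 1991). A subject is scored 1 on a collocation type when at least 80% of the items of that type are answered correctly.

```python
def meets_criterion(correct, total, criterion=0.8):
    """Return True if a subject's accuracy on one collocation type
    reaches the criterion (80% by default)."""
    if total == 0:
        return False
    return correct / total >= criterion


def implicational_scale(matrix):
    """Compute a Guttman-style implicational scale.

    matrix: list of rows (one per subject); each row holds 0/1 values,
    one per collocation type (1 = criterion reached).
    Returns (order, c_rep, mm_rep, c_scal), where order lists the column
    indices from easiest to most difficult.
    """
    n_subjects = len(matrix)
    n_items = len(matrix[0])
    # Number of subjects who reached the criterion on each type.
    col_totals = [sum(row[j] for row in matrix) for j in range(n_items)]
    # Easiest types (highest totals) come first.
    order = sorted(range(n_items), key=lambda j: -col_totals[j])

    # Assumed convention: a subject with total score k should ideally
    # succeed on the k easiest types only; each deviating cell is an error.
    errors = 0
    for row in matrix:
        k = sum(row)
        observed = [row[j] for j in order]
        ideal = [1 if rank < k else 0 for rank in range(n_items)]
        errors += sum(o != i for o, i in zip(observed, ideal))

    cells = n_subjects * n_items
    c_rep = 1 - errors / cells  # coefficient of reproducibility
    max_marginals = sum(max(t, n_subjects - t) for t in col_totals)
    mm_rep = max_marginals / cells  # minimum marginal reproducibility
    improvement = c_rep - mm_rep
    # Coefficient of scalability; undefined when mm_rep equals 1.
    c_scal = improvement / (1 - mm_rep) if mm_rep < 1 else 0.0
    return order, c_rep, mm_rep, c_scal


# Toy data (invented): four subjects, three collocation types.
toy = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
toy_order, toy_c_rep, toy_mm_rep, toy_c_scal = implicational_scale(toy)
```

Scales are conventionally treated as valid when the coefficient of reproducibility is above .90 and the coefficient of scalability is above .60, which is why a value of .59 is described as approaching significance and a value of .33 as low.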
Comparing the results from the three scales, the following conclusions can be drawn.

i) Types 11. SV(O) Prep O and 27. Verb Noun (creation) collocations were among the most difficult collocation types (see also the results for the translation data). Also, the subjects' responses to Type 28. Verb Noun (eradication) collocations were no more accurate than their responses to Type 27. Verb Noun (creation) collocations (see implicational scale for Group 3). It appears that Verb Noun collocations are difficult to acquire, irrespective of whether they denote creation or eradication (see Benson et al. 1986a).

ii) Subjects in Groups 1 and 2 achieved similar levels of accuracy, while subjects in Group 3 were clearly more accurate in the blank-filling test, despite the fact that their test contained more items. Undoubtedly, students at the most proficient level for this study had a more advanced level of collocational knowledge.

iii) The greatest difference in the three acquisition orders was with respect to Type 33. Verb Adverb collocations. Subjects in Groups 1 and 3 were accurate in their responses to this type of collocation, with Group 3 subjects significantly more accurate than Group 1 subjects, while subjects in Group 2 were not at all accurate on this collocation type. An examination of the specific collocations tested showed that Groups 1 and 3 were tested only on the collocation 'work hard', while subjects in Group 2 were tested on 'work hard', 'brake hard', and 'think highly'. In terms of idiomaticity, 'think highly' is more idiomatic than the other two Verb Adverb collocations. The idiomaticity of the collocation 'think highly' can be determined in terms of its level of abstraction and literalness (i.e. the likelihood of its literal meaning): 'think highly' is a more abstract collocation compared to 'work hard' and 'brake hard', which represent physical actions; also 'think highly' is of low literalness (i.e.
of unlikely literal meaning), while 'work hard' and 'brake hard' are collocations with high literalness (see Cronk & Schweigert 1992). The collocation 'think highly' appeared to be especially difficult for subjects in Group 2, and even if they answered the other two collocations correctly they still would not be able to score more than 66% accuracy on this type of collocation (less than the 80% accuracy criterion).

iv) As the level of proficiency increased, the students' performance on Prep Noun collocations also increased. Despite the fact that subjects in Group 3 had more than double the number of Prep Noun collocations in their version of the blank-filling test than subjects in Group 1, they were far more accurate in their responses to this type of collocation. On the other hand, Noun Prep collocations were difficult for all three groups. The two types of collocation consist of the same parts of speech (a noun and a preposition) but in a different order. When the preposition precedes the noun, collocations are easier for L2 learners. When the preposition comes after the noun, collocations become more difficult. A look at some of the Prep Noun collocations included in the test shows that these collocations are fairly fixed, frequent and regular (i.e. rule-governed), e.g. 'on Sundays' [on + day of the week], 'at 7:06' [at + time], 'in favour', 'in danger'. Noun Prep collocations are also fixed but less regular, more unpredictable (i.e. no rules can be generated for them) and associative, e.g. 'skills in', 'attitude towards', 'accusations against', 'degree in'. It is possible that the order in which the parts of a collocation combine, rather than the class they belong to (e.g. noun, verb, preposition, etc.), influences the degree of difficulty and consequently the acquisition of a collocation.
v) SV(O) Adverbial and SV(O) wh-word collocations were relatively easy for all groups, with SV(O) Adverbial collocations slightly more difficult than SV(O) wh-word collocations. Both these types occurred frequently in the TWE series, with SV(O) wh-word collocations more frequent than SV(O) Adverbial collocations.

vi) As students became more proficient their accuracy on Adjective Noun lexical collocations also improved. The Adjective Noun collocations for this test were fixed and formal, e.g. 'sore throat', 'marine life', 'heavy drinker'. The subjects' knowledge of fixed collocations therefore improved significantly with proficiency.

5.3 Summary of the Findings

As far as the free production of collocations is concerned, the following conclusions can be drawn:

i) Type 26. SVc collocations are 'core' collocations, as they were the most frequently used by students at all proficiency levels.

ii) Types 26. SVc and 29. Adjective Noun collocations are acquired early, as their use by subjects in this study revealed.

iii) Types 3. Noun that, 18. SV Possessive V-ing, 22. SVOO, 25. S(it)VO to Inf, 28. Verb Noun (eradication), and 32. Adverb Adjective were avoided by subjects in all groups. These types represent collocations that are structurally demanding, infrequent, and/or fixed.

iv) The use of Types 12. SV to Inf, 13. SV Inf, 15. SVO to Inf, 21. SVOc, 1. Noun Prep, and 5. Adjective Prep collocations indicates a higher level of proficiency and development of collocational knowledge, as they were used mainly by subjects in Group 3, the highest proficiency level.

v) The development of collocational knowledge occurs gradually, and collocational use develops significantly after two years of instruction, exposure and maturation have taken place (see correlations of the acquisition orders for the three Groups).

As far as the learners' cued production of collocations is concerned, the following conclusions can be drawn.
i) Grammatical collocations are easier to translate than lexical collocations. Accuracy in translating lexical collocations, Types 1, 5, and 27, increased as language proficiency increased.

ii) Grammatically more complex types were more difficult, e.g. Type 16. SVO Inf collocations were more difficult to translate with accuracy than Type 13. SV Inf collocations.

iii) Collocations containing a preposition were more difficult to translate than collocations containing an infinitive, as prepositions appeared to be more likely to cause L1 interference for the subjects in this study, and their combination with other words produced relatively fixed and difficult collocations.

iv) Type 27. Verb Noun (creation) lexical collocations were the most difficult to translate with accuracy for all subjects. Verb Noun collocations in the blank-filling test were also difficult for all subjects and they were also infrequent in the students' essays.

5.4 Factors Affecting the Development of Collocational Knowledge

In previous developmental studies, frequency in the input has been considered a determinant of the sequence of acquisition of morphemes (Larsen-Freeman 1976a, 1976b). In this study too, frequency of input seemed to affect the development of collocational knowledge. The results from the translation and the blank-filling tests suggest that the more frequently students are exposed to a particular collocation type, the more likely they are to know it. There is also evidence that the amount of exposure to a particular collocation via textbooks can influence the acquisition of that particular collocation, irrespective of how frequently that particular collocation occurs in everyday speech, e.g. 'curriculum vitae'. The results of the essay data strongly suggest that the production of English collocations by the subjects in the present study was influenced by the frequency of occurrence of English collocations in their textbooks.
Greater frequency could have made certain collocations more salient and noticeable, supporting the argument that 'noticing' the form of input leads to learning (Doughty 1991; Robinson 1995; Schmidt 1990, 1995). Palmberg (1987, 1988) also found that the vocabulary L2 learners produced consisted mainly of textbook vocabulary. Instruction has been found to influence the rate of acquisition in other studies too (Olshtain 1987; Doughty 1991). However, given the fact that the subjects in this study were tested on collocations already taught to them, their overall low accuracy in both the translation and the blank-filling tests suggests that mere exposure to collocations is not enough to facilitate recall. This conclusion is also in line with past research (Marton 1977:47; Bardovi-Harlig 1992b:272).

Complexity was also considered as a factor influencing the development of collocational knowledge in ESL learners. With regard to grammatical collocations, for specific pairs of collocational structures, the type that was grammatically more complex was also more difficult for L2 learners. For example, learners were more accurate in SV Inf collocations than in SVO Inf collocations, and their use of SV Inf collocations increased later than the use of SV to Inf collocations. Also, grammatically complex and infrequent collocation types were avoided by the L2 learners in this study, e.g. students showed no evidence of acquisition of SV Possessive V-ing collocations. With regard to lexical collocations, 'complexity' in terms of arbitrariness, unpredictability and idiomaticity seemed to influence their acquisition by L2 learners, e.g. subjects were less accurate with fixed (not free), arbitrary, and unpredictable Verb Noun lexical collocations. Idiomaticity and arbitrariness have been previously found to affect the acquisition of individual words too (for a review see Laufer 1990b).
Also, in this study, those collocation types, grammatical and lexical, that were acquired early, i.e. SVc and Adjective Noun, represent collocations that are structurally 'salient' and need to be produced correctly for effective communication due to their high frequency in everyday speech. Similar results with respect to these two structures were reported by Fathman (1977) in her study of the acquisition of grammatical structures.

Also, there has been suggestive evidence that the order in which the parts of a certain collocation type combine can influence the degree of regularity of the collocations represented by that particular type. This has also been found to affect the degree of difficulty of acquisition for that particular type. For example, Prep Noun collocations, e.g. 'on Sundays', 'at 7:06', have been found to be more regular (i.e. rule-governed) and hence easier to acquire than Noun Prep collocations, e.g. 'degree in', 'attitude towards', 'skills in', which are unpredictable (i.e. associative).

There is also evidence that the degree of L1-L2 difference influences the salience and consequently the acquisition of certain collocation types. For example, collocation types that were structurally different from the subjects' L1 were more difficult to translate, e.g. Type 5. Adjective Prep collocations that were 'Verb Determiner Noun' collocations in Greek were more difficult to translate: the Greek equivalent of 'I am bored with school' is 'Βαριέμαι το σχολείο' [bore[Middle Voice Verb] the school].

Finally, for a number of collocation types, knowledge develops as overall language proficiency increases, i.e. the subjects' accuracy and production of collocations was influenced by their overall language proficiency, and the most proficient students performed with greater accuracy in the translation and the blank-filling tests than the other two groups.
By and large, the greatest difference in performance appeared to be between Groups 1 and 3, which also suggests that maturation, in terms of language proficiency and age, affects the development of collocational knowledge. The following model summarises schematically the factors affecting the development of collocational knowledge:

[Figure 17. Model of the Development of Collocational Knowledge. Diagram labels: Language Proficiency, Maturation, Instruction, Saliency, L1-L2 Difference, Grammatical Collocations, Salient, Complex, Lexical Collocations, Regular/Salient, Fixed/Arbitrary, Unpredictable, Collocational Knowledge.]

According to this model, collocational knowledge develops as overall language proficiency develops, as students become more mature, and as more exposure to collocations takes place. The development of collocational knowledge is influenced by the 'salience' of the particular collocation types. Grammatical collocations that are simple and frequent in everyday speech are acquired early. The more complex structures are acquired later. Lexical collocations are more difficult to acquire than the simple grammatical collocations. They are syntactically simple (e.g. Noun Verb, Verb Noun, Noun Prep, Prep Noun, Verb Adverb), but their acquisition is affected by other factors of 'semantic complexity', e.g. arbitrariness, unpredictability and idiomaticity, i.e. the more fixed and idiomatic they are, the more difficult they are to acquire.

Also, based on these results, a continuum of collocational knowledge and language proficiency can be described. Beginning students (Group 1) are able to produce simple grammatical collocations; they are more accurate on lexical collocations than on complex grammatical collocations, but their overall accuracy is low.
This can be interpreted as evidence that these students use lexical collocations as unanalysed blocks of language that they have memorised, and because their grammatical competence is not yet well-developed, they are less accurate with the more structurally demanding grammatical collocations. The fact that they can memorise lexical collocations more easily than grammatical ones could be due to the saliency of lexical collocations in terms of length of the collocational strings, i.e. most of the lexical collocation types consist of two words (Verb Adverb, Adjective Noun, Noun Prep, Adjective Prep) so they are easier to remember. Grammatical collocations, on the other hand, are longer and as such harder to memorise.

At the intermediate level (Group 2), students become more accurate with the more complex grammatical collocations as their grammatical competence increases, but their accuracy on lexical collocations and their overall accuracy do not improve.

As students reach a higher level of proficiency, post-intermediate (Group 3), their overall accuracy in collocations (both lexical and grammatical) increases considerably, and they once again show greater accuracy on lexical collocations, indicating a richer vocabulary. Previous research has also shown that more advanced learners have more lexical and syntactic tools when they approach a language learning task (Ferris 1991, 1994).

A similar step-by-step model of L1 acquisition is described by Berman (1986). According to Berman's model, children in acquiring their L1 go through three main phases:

(a) a PREGRAMMATICAL phase... where children's knowledge is largely item-bound...; (b) the phase of GRAMMAR ACQUISITION..., where rules are applied productively across items in terms of linguistic structure, and items are interrelated within more general systems, categories and paradigms; and (c) a final phase of APPROPRIATE USAGE where the repertoire of forms and rules acquired previously are deployed with increasing skill.
(Berman 1986:193).

The beginners' stage is similar to Berman's pregrammatical phase: they learn collocations as lexicalised items. At the intermediate level, learners are at the phase of grammar acquisition: they apply rules productively, increasing their knowledge of grammatical collocations. At the post-intermediate level, students are approaching Berman's final phase of appropriate usage: their overall knowledge of collocations increases for both grammatical and lexical collocations. Since collocations are one of the key building blocks of language, it is not surprising that their acquisition follows a pattern similar to L1 acquisition.

Pienemann's Processability Model also provides a framework for understanding the development of collocational knowledge. The first stage of Pienemann's model consists of basic sentence structures and basic categories (Pienemann 1996). This stage coincides with the initial stage of collocational knowledge: learners acquire simple grammatical collocations and relatively free lexical collocations that are basic and frequent in everyday speech. The second stage of Pienemann's model contains extensions of the noun phrase, verb phrase, and sentence. This is the stage where students become able to apply grammatical rules productively and have a better understanding of the constituents of the sentence, resulting in the use of more complex collocational strings. Stage 3 of the Processability Model is characterised by the use of new categories which are filled with lexical items. The third stage of collocational knowledge is also characterised by a better command of both lexical and grammatical collocations and a preference for lexical collocations, signifying a richer vocabulary.
The roughly parallel stages between Pienemann's Processability Model and the model of the development of collocational knowledge described in this study underscore the existence of a stage-by-stage development of collocational knowledge and its significance for the overall development of L2 proficiency.

5.5 Summary of the Discussion

With regard to the main questions in this study - is there development of collocational knowledge in L2 learners as their overall language proficiency develops; and are there any differences in development between and within proficiency levels? - the answer is affirmative. There is significant development of collocational knowledge as overall language proficiency develops. Evidence has been provided by both production (essay data) and knowledge of collocations (translation and blank-filling data). The development of collocational knowledge has been defined in terms of the differences in the use and knowledge of collocations between and within three different proficiency levels: post-beginners, intermediate, and post-intermediate.

This study also explored what possible factors can account for the acquisition of English collocations by L2 learners, and whether there are identifiable patterns of acquisition of that part of vocabulary previously described as 'ruleless'. As with most developmental studies, the main emphasis has been on describing the emerging patterns of acquisition of English collocations. The large number of structures examined by this study has led to the emergence of a number of different patterns of acquisition. Where possible, explanations pertaining to theories of second language acquisition have been provided with regard to specific patterns of acquisition.
The present study has shown that an overall explanation of lexical acquisition may require a modular theory of language acquisition with different modules on the grammatical complexity, learnability, processability, and developmental order of the different collocational structures.

The ultimate aim of this study has been to shed light on the acquisition of collocations, which are considered an important aspect of L2 acquisition. In the next section, some pedagogical implications of the results of the present study are given. It is hoped that the data can also provide language instructors with an anchor point in the teaching of English collocations.

5.6 Pedagogical Implications

The main goal of this study has been to investigate the acquisition of L2 collocations. L2 learners have been tested on how their collocational knowledge develops. Overall, results show that students from the three proficiency levels tested were not very accurate in either the translation or the blank-filling tests. This is indicative of the L2 learners' general weakness in producing acceptable collocations noted by other researchers, and of the need to provide L2 learners with help for the improvement of their collocational knowledge.

The subjects in this study did not receive explicit teaching on collocations. The teachers' questionnaire (see Appendix F) showed that the teachers did not emphasise either the importance of collocations to their students, or the use of other resources in learning collocations. The teachers also agreed that the treatment of vocabulary in TWE is inadequate. The results of this project reveal certain weaknesses and needs on the part of L2 learners, and ways to utilise these results in L2 classrooms are suggested below.

The results provide useful information as to how collocational knowledge develops in L2 learners. Such information can be used for improving the treatment of collocations in ESL syllabuses.
The knowledge of which collocation types are acquired early in L2 learning, and which are acquired later, can help syllabus designers order the presentation of collocations to promote a step-by-step development of collocational knowledge. Specific collocational problems for L2 learners have also been identified. Students from all the proficiency levels had difficulties with lexical collocations that are fairly fixed and arbitrary (not predictable) in English, e.g. Verb Noun collocations such as ‘draw conclusions’, ‘earn a living’, ‘take shorthand’, ‘call a penalty’. Such lexical combinations require specific collocational knowledge and native-like ability. L2 learners have no means of telling which words collocate with which unless they are specifically taught about such collocations.

The findings can also be used as a guide to help teachers decide how to handle the teaching of collocations in their classroom, e.g. teaching early acquired types before late acquired types, or more regular and frequently used collocations before more fixed and idiomatic ones. Also, by analysing teaching materials (e.g. readings) with respect to which collocation types they contain, teachers can assess the different teaching materials to be used with the different proficiency levels.

Making teachers aware of the importance of collocations is not enough. Students also need to become aware of collocations and develop strategies for their acquisition. By raising the students' awareness of the existence of collocations and their usefulness in L2 learning, teachers can help students take note of the collocations they come across and make more effective use of them. Students should become aware that words do not occur in isolation, but in combination with other words. Increased awareness of, and attempts to use, communicatively redundant grammatical structures may also lead to faster rates of acquisition and possibly higher levels of L2 attainment (Long 1988:120).
The present study also showed that the L1 can influence the learners' knowledge of collocations, especially lexical collocations, that are different from their equivalent collocations in the learners' L1. For example, Adjective Preposition collocations, such as ‘afraid of [snakes]’, are Verb Det Noun collocations in Greek, ‘φοβάμαι τα φίδια’ [afraid-[Middle Voice Verb] the snakes]. As a result, the Greek learners in this study often translated the Adjective Preposition collocations leaving out the preposition. Also, in coping with arbitrary Verb Noun collocations, such as ‘draw conclusions’, ‘take an examination’, ‘earn a living’, subjects seemed to use their knowledge of Greek, e.g. *‘take out conclusions’ ‘βγάζω συμπεράσματα’ [take out conclusions], *‘give an examination’ ‘δίνω εξετάσεις’ [give examinations], *‘take out a living’ ‘βγάζω το ψωμί μου’ [take out my bread]. Such differences between the L1 and the L2 should be pointed out to the L2 learners, and L2 learners should be encouraged to practise and use such collocations in order to sound more idiomatic in the target language.

The accuracy orders reported here may also be relevant as a starting point for an index of L2 development (see Larsen-Freeman 1978b, 1978c). That is, the students' language proficiency can be determined according to which collocation types they have acquired. Such an index of development can also be used for designing language testing materials, and for the placement of students in a suitable proficiency level.

CHAPTER 6 CONCLUSIONS

6.1 Summary and Conclusions

This study has investigated the development of English collocational knowledge in three different proficiency levels - post-beginners, intermediate and post-intermediate - of 275 Greek learners of ESL. Three tests measuring the learners’ knowledge of collocations were used: essay writing, a translation test and a blank-filling test.
The essay writing measured free production of collocations, and the translation and blank-filling tests were measures of cued production. Evidence was sought for the development of collocational knowledge between and within the three proficiency groups. Results revealed that there are specific patterns of development across and within the three different groups. Collocational knowledge increased steadily as the overall language proficiency increased, and the development of collocational knowledge was found to be influenced by the frequency of the input, the L1-L2 difference, the overall language proficiency, and the 'saliency' of the collocation types. Grammatical and lexical collocations that were simple and frequent in everyday use of English were acquired early and the more complex grammatical collocations were acquired later. Lexical collocations that were idiomatic, fixed and/or unpredictable were more difficult than those that were less arbitrary and more rule-bound. Finally, the development of collocational knowledge in terms of the three proficiency levels can be described as follows. Post-beginner students have already acquired the simple and frequent grammatical collocations, e.g. SVc; they use few types of collocation and a large number of tokens for some of them; and they are more accurate with regard to lexical collocations than complex grammatical collocations, but their overall accuracy is very low. At the intermediate level, students use more collocation types and they use both simple and complex grammatical collocations, but their overall accuracy does not improve. At the post-intermediate level, students become more accurate with respect to both grammatical (simple and complex) and lexical collocations, and their collocational knowledge is significantly advanced.
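As an illustration only of what an accuracy order with an implicational structure involves, the following Python sketch orders collocation types from most to least widely acquired and computes a coefficient of reproducibility in the sense of Guttman scaling. The acquisition matrix is invented for the example and does not reproduce the data or the results of this study. The type labels T1 to T4, the function name implicational_scale and the choice of Guttman's coefficient are assumptions made for the sketch, and the statistic actually used in the analysis may differ.

```python
from typing import Dict, List, Tuple

# Hypothetical data: 1 = the learner has acquired the collocation type
# (e.g. reached some accuracy criterion), 0 = not acquired.
# These values are invented and are NOT taken from the study.
learners: Dict[str, Dict[str, int]] = {
    "learner_a": {"T1": 1, "T2": 1, "T3": 0, "T4": 0},
    "learner_b": {"T1": 1, "T2": 1, "T3": 1, "T4": 0},
    "learner_c": {"T1": 1, "T2": 0, "T3": 1, "T4": 0},
    "learner_d": {"T1": 1, "T2": 1, "T3": 1, "T4": 1},
}


def implicational_scale(
    acquired: Dict[str, Dict[str, int]]
) -> Tuple[List[str], float]:
    """Return the types ordered from most to least widely acquired,
    together with Guttman's coefficient of reproducibility."""
    types = list(next(iter(acquired.values())).keys())

    # A type acquired by more learners is treated as earlier in the order.
    totals = {t: sum(row[t] for row in acquired.values()) for t in types}
    order = sorted(types, key=lambda t: -totals[t])

    # For a learner who has acquired k types, a perfect implicational
    # pattern is 1 on the k earliest types and 0 on the rest.
    errors = 0
    for row in acquired.values():
        observed = [row[t] for t in order]
        score = sum(observed)
        ideal = [1] * score + [0] * (len(order) - score)
        errors += sum(o != i for o, i in zip(observed, ideal))

    cells = len(acquired) * len(order)
    return order, 1 - errors / cells


order, reproducibility = implicational_scale(learners)
print(order)
print(reproducibility)
```

In this invented matrix only learner_c departs from the ideal pattern, having acquired T3 without T2. The closer the coefficient is to 1, the more strongly acquisition of a later type implies acquisition of the earlier ones.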
From a theoretical point of view, the present study developed a classification of the various studies on collocations into three major approaches: the lexical composition approach, the semantic approach and the structural approach. Each approach has been critically reviewed to reveal its strengths and weaknesses for the study of collocations. The systematic use of a classification system for classifying collocations makes the replication of this study possible. If this classification system is used in future studies on collocations, it will enable a comparison of the results, and support a systematic contribution to how collocational ability develops.

The empirical contribution of this study lies in the use of the different elicitation instruments and the analyses of the data. The detailed description of the construction of the battery of tests used for the collection of data (Chapter 3), as well as their strengths and weaknesses (see next section), can be used as a guide for designing future studies on collocations and developing more sensitive and effective elicitation instruments. The analysis performed on the data is an improvement over analyses in other developmental studies, i.e. studies on the order of acquisition of morphemes. It shows not only the order of acquisition of collocational types, but also the strength of the relationship of the items on the implicational order.

From a pedagogical point of view, this study provides a picture of how English collocational knowledge develops in ESL learners. Knowing how collocations are acquired is fundamental for devising ways of teaching them and strategies for learning them. It was the aim of this study to investigate the development of collocational knowledge in L2 acquisition, and to provide a starting point towards unravelling the acquisition process of English collocations.
A model for the development of collocational knowledge has been suggested, and the possible factors affecting the various stages of collocational knowledge have been examined. Hopefully, the study of collocations will continue in the future. Further studies should reveal a more detailed picture of the development of collocational knowledge in L2 learners, with important implications for L2 theory and instruction.

6.2 Directions for Further Research

This study used syntactic structures in defining and operationalising collocational knowledge, which is traditionally considered an area of lexical acquisition. The results suggest that defining collocational types syntactically is a valid approach in the examination of collocational development, especially with grammatical collocations. The description of the acquisition of lexical collocations, however, requires further refinement using semantic information. As already mentioned in the discussion of the results (Chapter 5), lexical collocations are syntactically simple, i.e. they are usually combinations of two words such as Verb Noun, Adjective Prep, Noun Prep, Verb Adverb, but their acquisition is influenced by other factors. For example, lexical collocations that belonged to the same collocation type were found to vary in difficulty, e.g. subjects had more difficulties with the collocation ‘think highly’ than with ‘work hard’ even though both collocations belonged to the same collocation type, Verb Adverb. ‘Think highly’ is more idiomatic than ‘work hard’ and as such it was more difficult for the ESL learners in this study. Future researchers should be aware that the acquisition of syntactic forms is a necessary but not sufficient condition for the development of collocational knowledge, especially with regard to lexical collocations.

The translation test revealed strong differences in the development of collocational knowledge between and within proficiency levels.
One of the advantages of translation, as opposed to a blank-filling test, is that it enables the testing of grammatical collocations as well as lexical ones. However, translation has proved to be difficult for both beginning and more advanced L2 learners. Furthermore, there is evidence that it promotes L1 interference in the students' production. Future research on collocations should take the above limitations into account before deciding on the use of a translation test.

The blank-filling test for this study contained more lexical than grammatical collocations, mainly because grammatical collocations are more difficult to test in a blank-filling test. For example, testing SVO to O collocations in a blank-filling test creates the problem of where to put the blank space without making the collocation too general or too obvious. Even though the blank-filling test showed that most of the differences in accuracy reflect language proficiency, the accuracy orders were weak. This could be due to the fact that the majority of the test items tested lexical collocations. Lexical collocations, as already discussed above, are influenced by semantic factors as well as syntactic ones. Therefore, the students' performance on the blank-filling test was not consistent enough to produce a reliable accuracy order, as the students' accuracy of responses reflected not only their knowledge of the particular collocational type, but also which particular words were required for the particular lexical collocations. Research on collocations is in need of a reliable instrument to elicit information on a wider range of collocational knowledge. For example, future research might examine the development of collocational knowledge in a two-fold way, i.e. development with respect to lexical collocations, controlling collocations for formality, frequency of occurrence and idiomaticity, and development with respect to grammatical collocations, controlling for grammatical complexity.
Although the present study did not set out to determine the extent to which syllabuses influence the acquisition and the rate of acquisition of collocations, it has provided evidence that the frequency of occurrence of collocations in L2 textbooks influences their acquisition (see Long 1988). That is, the more students were exposed to a particular collocation type, the more accurately they used it. Future research can test this result by controlling for the number of exposures to given collocations in an experimental condition. One question of interest is how far exposure to collocations accounts for acquisition orders. This would help identify the optimal instruction conditions leading to the acquisition of collocations (see also Chaudron 1988; Sheen 1994). Also, it will be useful to determine whether instruction can change the order of acquisition, i.e. whether emphasising exposure to some types of collocation will produce a change in the acquisition orders obtained in this study, or whether classroom instruction affects only the rate of acquisition but not the order of acquisition of collocations (see also Ellis 1989). Long (1988) also underscores the need for research on collocational ability achievable with and without instruction.

In this study, essay writing revealed a number of interesting results with respect to the use of collocations. Subjects were controlled with respect to variables such as age, formal education, English proficiency, first language background, and knowledge of vocabulary. Unlike previous studies on collocations, subjects in the present study were tested on their knowledge of collocations already taught to them. The collocations included in the translation and the blank-filling tests were taken from the subjects' textbooks. This ensured that the subjects were tested on knowledge of collocations already presented to them. The topics of the essays were also chosen with the subjects' textbooks in mind.
This ensured that subjects from all proficiency levels could perform successfully in the essay composition and produce those collocations that they had acquired and felt comfortable using. However, the use of specific topics has been shown to promote the use of specific collocation types, such as a large number of SVc and Adjective Noun constructions in the essays by subjects in Group 1. Future research could investigate the performance of different proficiency levels in essay writing, using the same topic for all proficiency levels. In this way, any influences of the essay topic on the use of collocations would be equal for all levels.

The present study has concentrated on accuracy in the use and knowledge of collocations. The analysis of collocational errors was not part of this study. However, future research could investigate the misuse of collocations by L2 learners, the possible causes leading to collocational errors, and ways to remedy them. The use of a corpus-based dictionary could also provide future researchers with information as to whether collocational misuse is greater with infrequent collocations or not. Note that the BBI does not provide frequency information.

Further research is also needed on how collocational knowledge develops in native speakers of English. Such information can be used to compare the routes of development by L2 learners and native speakers in the acquisition of English collocations. Also, research in the development of collocational knowledge by learners from different L1 backgrounds would reveal whether the accuracy orders found in this study are L1-neutral. A comparison of the collocational errors would yield important information about the extent of the influence of L1 in the development of collocational knowledge in L2 learners.

The classification system used in this study has proved to be useful for a systematic categorisation of the collocations found in the students' essays.
Some types, though, need some fine-tuning. For example, Type 15, SVO to Infinitive, as it is used in the BBI, implies that the object of the main verb is the subject of the infinitive, e.g. 'she told him to leave'. There can be cases, though, in which the subject of the main verb is also the subject of the infinitive, e.g. 'she used the knife to cut the bread'. In the present study both examples would be classified under the same type. However, future research could use a different type of collocation for the second example, e.g. 'SVO to Inf O' or 'SVO to Inf NP' (NP = Noun Phrase). Such fine-tuning may yield more sensitive differences in collocational performance among learners from different language proficiency levels.

Studies on collocations to date have concentrated on written data. It would be interesting also to investigate L2 learners' use of collocations in oral production. By using the classification system employed by the present study, L2 learners' oral production data could be analysed in a similar way to reveal acquisition orders and development of collocational knowledge. These orders could then be compared with the ones found in this study and reveal helpful information as to whether collocational knowledge in L2 writing and speech develops in similar or different ways.

The above are selected directions for future research on collocations. The development of collocational knowledge in L2 learners is far from being exhaustively described. More work is needed in the area of lexical acquisition both for theoretical and pedagogical reasons as it has proved to be a profitable avenue for inquiry in the study of L2 acquisition.