ComAComA 2014
24 August 2014, Dublin, Ireland

A Taxonomy for Afrikaans and Dutch Compounds

Gerhard B van Huyssteen
Centre for Text Technology, North-West University, Potchefstroom, South Africa

Ben Verhoeven
CLiPS - Computational Linguistics, University of Antwerp, Antwerp, Belgium

1. CompoNet
2. Existing taxonomies
3. Our taxonomy for Afrikaans and Dutch
4. Future work

CompoNet
http://componet.sslmit.unibo.it/
• Existing (amongst others): English = 163; Dutch = 188 (56 revisions)
• New: Afrikaans = 144

Problems with 2005 annotation guidelines
Often vague
• Only one article: Bisetto & Scalise (2005)
• Some language-specific guidelines for some languages on website
Outdated
• Revised taxonomy: Scalise & Bisetto (2009)
• Applications and critiques: Lieber (2009a, 2009b); Arcodia et al. (2009); Vercellotti and Mortensen (2012); Arnaud and Renner (2014)
Insufficient for language-specific phenomena
• Problematic for annotators

Bisetto & Scalise (2005)
[Diagram: compounds are classified at the morphosemantic level as subordinate, attributive, coordinate or other; each class is split at the semosyntactic level into endo and exo.]
Subordinate
• Head-complement (argumental) relation
• NN compounds: often of-relation
• Also synthetic and (neo)classical compounds
  mushroom soup
Attributive
• Head-modifier (non-argumental) relation
• AN compounds
  mushroom cloud
Coordinate
• Two semantic heads, one categorial head
• and-relation
  poet-painter
Endo/exo
• Mostly semantic head
  doormat

Scalise & Bisetto (2009)
[Diagram: three macro-classes, SUB, ATAP and COORD. At the morphosyntactic level SUB is split into ground and verbal-nexus, and ATAP into attributive and appositive; every class is split into endo and exo. A second build of the slide adds the labels ATTR (beside ATAP) and "real" (beside attributive), with the citation Fábregas and Scalise (2012).]
Verbal-nexus
• Secondary compounds
• Deverbal components
  taxi driver
Ground
• Primary compounds
• Uninflected components
  mushroom soup
Appositive
• Non-head expresses property of head by means of noun acting as attribute
• Often metaphorical
  atomic bomb (ATT) vs mushroom cloud

Vercellotti & Mortensen (2012)
[Diagram labels: word-formation process: compounds; a level marked "???": hierarchical, equal; morphosemantic level: subordinate, attributive, coordinate, appositive; morphosyntactic level: expressed predicate, unexpressed predicate; semosyntactic level: endo, exo; "future research".]
On the appositive category:
1. "is unclear how many languages would need this category, given the difficulty distinguishing the category"
2. "'appositive' is already in the literature as a type of coordinate compound"

Arnaud & Renner (2014)
[Diagram labels: NN composites; subordinative; relational; generic-specific; attributive; analogical; coordinative; multifunctional; equative; hybrid; additional.]

Van Huyssteen & Verhoeven (2014)
[Diagram: the proposed taxonomy, arranged in four levels.]
• Word-formation process: Compounding (alongside Reduplication and Derivation); language-specific/marginal types: Compounding compound, Parasynthetic compound, Separable complex verb
• Morphosemantic level: Subordinate compound, Attributive compound, Appositive compound, Coordinate compound (with the label "hierarchical")
• Morphosyntactic level: Verbal-nexus compound, Ground compound, Phrasal compound, (Neo-)classical compound
• Semosyntactic level: Endocentric, Exocentric

Morphosemantic level
• Appositives differ sufficiently from coordinates and attributives
  mushroom soup (subordinate)
  atomic bomb (attributive)
  mushroom cloud (appositive)
  chicken mayonnaise (coordinate)
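The levels of the Van Huyssteen & Verhoeven (2014) taxonomy can be read as a small record structure: one label per level for each annotated compound. The following is a minimal sketch under that assumption. The class names, field names and the one-label-per-level layout are hypothetical and are not the CompoNet database schema, and the example labels are illustrative readings of examples that appear in these slides, not annotations taken from CompoNet.

```python
from dataclasses import dataclass
from enum import Enum


class Morphosemantic(Enum):
    SUBORDINATE = "subordinate"
    ATTRIBUTIVE = "attributive"
    APPOSITIVE = "appositive"
    COORDINATE = "coordinate"


class Morphosyntactic(Enum):
    VERBAL_NEXUS = "verbal-nexus"
    GROUND = "ground"
    PHRASAL = "phrasal"
    NEO_CLASSICAL = "(neo-)classical"


class Semosyntactic(Enum):
    ENDOCENTRIC = "endocentric"
    EXOCENTRIC = "exocentric"


@dataclass(frozen=True)
class CompoundAnnotation:
    """Hypothetical record: one label per taxonomy level."""
    form: str
    morphosemantic: Morphosemantic
    morphosyntactic: Morphosyntactic
    semosyntactic: Semosyntactic

    def label(self) -> str:
        # Join the three level labels into one readable tag.
        return " / ".join(
            level.value
            for level in (self.morphosemantic,
                          self.morphosyntactic,
                          self.semosyntactic)
        )


# Illustrative labels only (assumed, not taken from the CompoNet data).
EXAMPLES = [
    CompoundAnnotation("mushroom soup", Morphosemantic.SUBORDINATE,
                       Morphosyntactic.GROUND, Semosyntactic.ENDOCENTRIC),
    CompoundAnnotation("taxi driver", Morphosemantic.SUBORDINATE,
                       Morphosyntactic.VERBAL_NEXUS, Semosyntactic.ENDOCENTRIC),
    CompoundAnnotation("mushroom cloud", Morphosemantic.APPOSITIVE,
                       Morphosyntactic.GROUND, Semosyntactic.ENDOCENTRIC),
]

for annotation in EXAMPLES:
    print(f"{annotation.form}: {annotation.label()}")
```

Using enumerations rather than free-text labels means an annotator cannot enter a category that the taxonomy does not define, which is one way to address vague or inconsistent labels.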

Morphosyntactic level
• Kind of component
• Category of component
  Verbal-nexus compound: deverbal
  Ground compound: uninflected
  Phrasal compound: phrases
  (Neo-)classical compound: semi-words

Semosyntactic level
• (Morpho)semantic head
• Only two kinds of exocentric compounds
  knip+oog snip+eye 'to wink'
  rooi+kop red+head 'ginger (derogatory)'

Phrasal compound
• [[[a]Adj [b]N]NP [c]N]N volcanic-mountain climber
• [[[a]Num [b]N]NP [c]N]N three-tooth giant
• [[[a]P [b]N]PP [c]N]N in-house expertise

Parasynthetic compound
• Compounding through derivation
  groot+skaal-s 'large-scale'

Separable complex verb
• Well-known phenomenon in Germanic languages
  op+zoeken up+look 'look up/search for'

Formalisation
subordinate: [[x]i [y]j]k ↔ [SEMj of SEMi]k

Hence…
• Rigorous/explicit annotation guidelines
  • Corrected anomalies and discrepancies
  • Formalised (Construction Morphology)
  • Terminology disambiguated and translated
• Comprehensive for all compounds in Dutch and Afrikaans
  • All patterns in De Haas & Trommelen (1993) and Afrikaans literature
• Illustrated what/how to adapt for specific languages
• Next:
  • Re-annotate data in CompoNet
  • Future research

http://tinyurl.com/aucopro

Acknowledgements
• Funding: Nederlandse Taalunie (Dutch Language Union), South African Department of Arts and Culture (DAC), South African National Research Foundation (NRF), European Network on Word Structure (NetWordS) (European Science Foundation)
• Benito Trollip, who populated the first version of the Afrikaans section in the CompoNet database
• Anonymous reviewers for comments and suggestions