Comparative Studies in Australian and New Zealand English

Varieties of English Around the World (VEAW)
A companion monograph series devoted to sociolinguistic research, surveys and
annotated text collections. The VEAW series is divided into two parts: a text series which
contains carefully selected specimens of Englishes documenting the coexistence of
regional, social, stylistic and diachronic varieties in a particular region; and a general
series which contains outstanding studies in the field, collections of papers devoted to
one region or written by one scholar, bibliographies and other reference works.
General Editor
Edgar W. Schneider
Department of English & American Studies
University of Regensburg
Universitätsstraße 31
D-93053 REGENSBURG
Germany
Editorial Assistant
Alexander Kautzsch
Editorial Board
Laurie Bauer
Wellington
Manfred Görlach
Cologne
Rajend Mesthrie
Cape Town
Peter Trudgill
Fribourg
Walt Wolfram
Raleigh, NC
General Series, Volume G39
Comparative Studies in Australian and New Zealand English: Grammar and beyond
Edited by
Pam Peters, Macquarie University
Peter Collins, University of NSW
Adam Smith, Macquarie University
John Benjamins Publishing Company
Amsterdam / Philadelphia
The paper used in this publication meets the minimum requirements of
American National Standard for Information Sciences – Permanence of
Paper for Printed Library Materials, ANSI Z39.48-1984.
Library of Congress Cataloging-in-Publication Data
Comparative studies in Australian and New Zealand English : grammar and beyond /
edited by Pam Peters, Peter Collins, Adam Smith.
p. cm. (Varieties of English Around the World, ISSN 0172-7362 ; v. G39)
Includes bibliographical references and index.
1. English language--Australia--Grammar. 2. English language--New Zealand--Grammar. 3. Grammar, Comparative and general. 4. Language and culture. I.
Peters, Pam. II. Collins, Peter, 1950- III. Smith, Adam Michael.
PE3601.C66 2009
427'.994--dc22
ISBN 978 90 272 4899 2 (hb; alk. paper)
ISBN 978 90 272 8940 7 (eb)
2009011793
© 2009 – John Benjamins B.V.
No part of this book may be reproduced in any form, by print, photoprint, microfilm, or any
other means, without written permission from the publisher.
John Benjamins Publishing Co. · P.O. Box 36224 · 1020 ME Amsterdam · The Netherlands
John Benjamins North America · P.O. Box 27519 · Philadelphia PA 19118-0519 · USA
Table of contents
List of abbreviations
List of contributors
Prologue
  Peter Collins
Section I. Morphology
Irregular verbs: Regularization and ongoing variability
  Pam Peters
Pronoun forms
  Heidi Quinn
Hypocoristics in New Zealand and Australian English
  Dianne Bardsley & Jane Simpson
Section II. Verbs and verb phrases
Modals and quasi-modals
  Peter Collins
The perfect and the preterite in Australian and New Zealand English
  Johan Elsness
The progressive
  Peter Collins
The mandative subjunctive in spoken English
  Pam Peters
Light verbs in Australian, New Zealand and British English
  Adam Smith
Section III. Nouns and noun phrases
Non-numerical quantifiers
  Adam Smith
From chairman to chairwoman to chairperson: Exploring the move from sexist usages to gender neutrality
  Janet Holmes, Robert Sigley & Agnes Terraschke
Section IV. Clauses and sentences
Concord with collective nouns in Australian and New Zealand English
  Marianne Hundt
No in the lexicogrammar of English
  Pam Peters & Yasmin Funk
Zero complementizer, syntactic context, and regional variety
  Kate Kearns
Infinitival and gerundial complements
  Christian Mair
Commas and connective adverbs
  Peter G. Peterson
Section V. Discourse
Information-packaging constructions
  Peter Collins
Like and other discourse markers
  Jim Miller
Final but in Australian English conversation
  Jean Mulder, Sandra A. Thompson & Cara Penry Williams
Swearing
  Keith Allan & Kate Burridge
Epilogue
  Pam Peters
Index
List of abbreviations
ACE      Australian Corpus of English (data from 1986)
AmE      American English
ART      Australian Radio Talkback corpus (data from 2004–6)
AusE     Australian English
B-LOB    Before LOB corpus (British English from the 1930s)
BNC      British National Corpus (data from 1975–1990s)
BrE      British English
Brown    Brown Corpus (American English from the 1960s)
COLT     Corpus of London Teenager Language (spoken data from 1993)
FLOB     Freiburg corpus modeled on LOB (with data from the 1990s)
Frown    Freiburg corpus modeled on Brown (with data from the 1990s)
ICE      International Corpus of English
IDG      indigenized (variety of English), i.e. an “outer circle” English
LOB      Lancaster-Oslo/Bergen corpus (British English from the 1960s)
NZE      New Zealand English
SBC      Santa Barbara Corpus of Spoken American English (data from the 1990s)
STL      settler (variety of English), i.e. an “inner circle” English
WSC      Wellington Spoken Corpus (New Zealand English from the 1980s)
WWC      Wellington Written Corpus (New Zealand English from 1986)
List of contributors
Keith Allan, Monash University, Melbourne
Dianne Bardsley, Victoria University, Wellington
Kate Burridge, Monash University, Melbourne
Peter Collins, University of New South Wales
Johan Elsness, University of Oslo, Norway
Yasmin Funk, Macquarie University, Sydney
Janet Holmes, Victoria University, Wellington
Marianne Hundt, University of Zurich, Switzerland
Kate Kearns, University of Canterbury, Christchurch
Christian Mair, University of Freiburg, Germany
Jim Miller, University of Edinburgh
Jean Mulder, University of Melbourne
Cara Penry Williams, University of Melbourne
Pam Peters, Macquarie University, Sydney
Peter G. Peterson, University of Newcastle, NSW
Heidi Quinn, University of Canterbury, Christchurch
Robert Sigley, Daito Bunka University, Tokyo
Jane Simpson, University of Sydney
Adam Smith, Macquarie University, Sydney
Agnes Terraschke, Macquarie University, Sydney
Sandra Thompson, University of California, Santa Barbara
Prologue
Peter Collins
University of New South Wales
The characteristic phonological and lexical features of Australian English (AusE)
and New Zealand English (NZE) have attracted a good deal of scholarly interest for
about half a century, e.g. Mitchell and Delbridge 1965; Ramson 1966; Delbridge et al.
1981; Horvath 1985; Gordon and Deverson 1985; Orsman 1997. This is not surprising,
given that it is in these areas that AusE and NZE are generally perceived to differ most
significantly from other national varieties.
1. Previous grammatical studies of AusE and NZE [1]
The earliest grammatical studies of AusE and NZE examined usage and acceptability,
using data obtained through elicitation tests which were modeled on the techniques
pioneered by linguists involved in the Survey of English Usage (see Quirk & Svartvik
1966; Greenbaum & Quirk 1970). The focus of attention was primarily on questions
of divided usage (e.g. that displayed by verbs such as have, need, dare and used, which
in non-assertive constructions can either be auxiliary-like in not requiring do, or like
lexical verbs in selecting do), and on debatable usages involving questions of agreement (as in is/are a number of) and case selection (as in than me/I). Australian studies
of this type date from the 1970s, and include those by Eagleson (1972, 1976), Watson
(1978), and Collins (1978, 1979), while in New Zealand they date from the 1980s, notably Bauer’s (1988, 1989a–c) studies of number agreement with collective nouns and
aspects of verb morphology and syntax.
More recent studies have in many cases exploited the advent of corpora of AusE
and NZE. The most comprehensive corpus-based study of NZE is Hundt’s (1998)
monograph, which uses data from a set of corpora representing not only NZE but
also AusE, BrE and AmE to draw comparisons between the frequency and use of
a range of morphological, syntactic and lexicogrammatical features in NZE and in
other Englishes. A number of small-scale corpus-based studies of AusE have been
reported in the literature, including Collins (1988, 1991a,b, 1996, 2005, 2007, 2008),
Peters (1993, 1996, 1998, 2001), and Peters and Fee (1987).
In addition to these, there are studies that use either impressionistic evidence
or textual data collected from various sources (e.g. Sussex 1982; Newbrook 1992,
2001; Engel & Ritz 2000), as well as studies focusing on nonstandard grammatical
features (e.g. Eisikovits 1989; Shnukal 1989; Pawley 2008).
[1] For recent surveys of AusE and NZE morphosyntax see Collins and Peters (2008) and
Hundt, Hay and Gordon (2008) respectively.
2. The present volume [2]
The present volume is a collection of invited contributions from scholars with relevant
research interests from both sides of the Tasman, and in some cases from further afield.
It includes chapters focusing not merely on topics central to the domain of grammar,
but also – as the “and beyond” in the title suggests – on those which are less central, on
the boundary of grammar and lexis, and of grammar and discourse.
There were a number of questions and issues that we as editors anticipated that
our contributors might cast light on. One was the evolutionary status of AusE and
NZE as World Englishes (Schneider 2003, 2007). During the nineteenth century both
were dominated by exonormative allegiance to standard British norms. During the
twentieth century the settler population in both countries developed a new identity
based on local realities and the need for self-sufficiency (significant historical events
including Australians finding themselves unprotected against Japanese attack in
1942, and New Zealanders finding themselves bereft of their primary export market
following British entry into the European Economic Community in 1973). With the development of a
locally rooted self-confidence that followed, reflected inter alia in enthusiastic support
for national dictionaries and the local literature, there came endonormative stabilization for the dialects. It is beyond the scope of this volume to explore whether the final
stage of Schneider’s evolutionary model (“differentiation”) has been reached in both
countries, which would see Australians and New Zealanders regarding themselves not
as defined in terms of differences from the “Mother Country” but as composites of
subgroups defined regionally, socially and ethnically. However we anticipate that this
book will provide insights from the grammar into the extent of the endonormativity of
AusE and NZE. Neither variety can expect to be immune from global influences. But
is there any grammatical evidence for “Americanization” in the wake of the loosening
of ties with Britain that has occurred in Australia and New Zealand, and the growing
dominance of America in their cultural, political and economic domains? Are there
grammatical features that we can identify as subject to continuing British influence?
Another question that the book is designed to address concerns the parallels
and differences between AusE and NZE, both transplanted approximately two centuries ago (with the first European colony dating from 1788 in Australia and 1840 in
New Zealand). Are features identified as “distinctive” found only in these Englishes,
or simply used more frequently in them than in other varieties? Where we can identify grammatical trends similar to those in British English (BrE) and/or American
English (AmE), to what extent are they the product of external influences, and to what
extent independent but parallel developments? Is there grammatical evidence to support
the view that NZE is more closely associated with BrE than is AusE (a view which, if
valid, may be attributable to the fact that the immigration history of New Zealand
has involved significantly less diversity than Australia’s)? Are there sufficiently close
parallels in the grammatical features and trends shared between AusE and NZE for us
to posit a single antipodean standard that is distinguishable from the standard British
and American supervarieties?
[2] The preparation of the chapters written by the editors of the volume was supported by
an ARC Discovery Grant (2004–6), as was the compilation of the ART Corpus described
in Section 3.
3. Corpus-based approaches
Contributors to the volume were asked to use corpus data, and many availed themselves of the opportunity to use the Australian corpora housed and available at
Macquarie University: the Australian Corpus of English (ACE), a million-word
corpus of printed texts published in 1986 designed as a parallel to the Brown Corpus
of written American English (Brown) and the Lancaster/Oslo-Bergen Corpus
of written British English (LOB); the million-word Australian component of the
International Corpus of English (ICE-AUS), comprising spoken and written texts
recorded/published in 1991–5; the 257 000-word Australian talkback radio corpus,
with texts recorded in 2004–6 (ART). For NZE the primary resources were the
Wellington Corpus of written New Zealand English (WWC), a parallel to Brown,
LOB and ACE with texts published in 1986–90, and ICE-NZ (1990–8). A number
of authors also made use of further corpora of BrE and AmE: the Freiburg/Brown
Corpus of written American English, or Frown (1991); the Freiburg/LOB Corpus
of written British English, or FLOB (1990–1); and ICE-GB (1991–4). The corpora
not only allowed regional comparisons of national varieties to be made, but their
multigeneric composition (with the exception of ART) enabled authors to explore
patterns of stylistic variation as well.
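Because the corpora described above differ in size (ACE and ICE-AUS each contain about a million words, ART only 257 000), raw counts of a feature cannot be compared directly across them. Comparative corpus work commonly converts counts to a rate per million words. The following is a minimal sketch of that conversion, not a procedure stated in the volume: the corpus sizes are those given in this section, while the function name and the raw counts are hypothetical and purely illustrative.

```python
# Corpus sizes in words, as stated in the section above.
CORPUS_SIZES = {
    "ACE": 1_000_000,
    "ICE-AUS": 1_000_000,
    "ART": 257_000,
}


def per_million(raw_count: int, corpus: str) -> float:
    """Convert a raw hit count into a frequency per million words."""
    return raw_count * 1_000_000 / CORPUS_SIZES[corpus]


# Hypothetical raw counts for some grammatical feature.
# These figures are invented for illustration and do not come from the volume.
hypothetical_counts = {"ACE": 120, "ICE-AUS": 150, "ART": 60}

for corpus, count in hypothetical_counts.items():
    rate = per_million(count, corpus)
    print(f"{corpus}: {count} raw hits, {rate:.1f} per million words")
```

On these invented figures the smallest raw count (ART) would correspond to the highest normalized rate, which is why normalization matters whenever a smaller corpus such as ART is set beside the million-word corpora.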


4. Structure of the volume
The book is structured into five sections progressing from the smallest (morphological)
phenomena, through phrasal and clausal grammatical categories, to discourse
phenomena. A number of the chapters address topics that concern more than one
of the areas mentioned here, so there was an inevitable element of arbitrariness in
their ultimate location in the book. For example, the information-packaging constructions
discussed in Section V represent types of clausal construction but at the same time
their evolution in the English language has clearly been motivated by their special
“information packaging” functions. The chapter on pronoun morphology deals with
the inflectional category of case, warranting location in Section I, yet the association
of case with pronouns might alternatively have justified inclusion of this chapter in
Section III on the noun phrase. Similarly, the chapter on the mandative subjunctive
deals with the inflectional category of mood, again warranting location in Section I,
yet it can reasonably be included in Section II on the grounds of the association of
mood with the verb phrase.
4.1 Section I (“Morphology”)
The first two chapters in Section I deal with topics in inflectional morphology. Peters
uses a framework of language change to explore variable patterns of regularization with
the preterite and past participial forms of English verbs. She observes inter alia that with
verbs such as burn, learn and spell, the -t forms are more popular than the -ed forms in
the antipodean varieties (slightly more so in AusE than NZE), and more popular in
both than in BrE. Supplementary survey data indicate support in AusE for the “nonstandard”
forms amongst younger speakers, and rising support amongst older speakers.
Quinn examines the regional distribution of pronominal case forms. Two
AusE trends she notes are the use of I and myself in coordinative constructions and
the use of me as a possessive. Quinn’s findings also suggest differences between
AmE and the other three varieties with second person preferences: y’all was not
attested outside of her AmE data, whereas this was the only data in which yous(e)
was not found.
Bardsley and Simpson’s chapter is concerned with lexical morphology, more
specifically hypocoristic forms such as rego, blowie, turps and preggers, the use of
which reflects such speaker characteristics as good humour, jocular cynicism, love
of informality and a desire to knock things down to size. Hypocoristics suffixed
with -ie, Bardsley and Simpson observe, are commonly found in both AusE and
NZE, where they are more frequent than other forms and have the most diversified
range. By contrast -o is more frequent in AusE, especially in general common nouns,
geographical names and particular occupational labels.
4.2 Section II (“Verbs and Verb Phrases”)
Section II begins with Collins’s study of the modal auxiliaries must, should, need, will
and shall, and the semantically related quasi-modals have to, have got to, need to, be
going to and want to, in AusE, NZE, AmE and BrE. Collins cites independent evidence
that the quasi-modals have in recent decades been enjoying an increase in popularity at
the expense of the modals, which are in decline. These two trends, his findings suggest,
are most advanced in AmE. Of the remaining three varieties it is AusE that is most
similar to AmE, and NZE the least, the latter showing a degree of conservatism even
greater than that of BrE.
Elsness’s study of the perfect aspect suggests that New Zealanders are still influenced by British practices, while (younger) Australians may be bowing to pressure
from AmE. Elsness finds that the present perfect is more common in AusE and
NZE than it is in the two northern hemisphere supervarieties (of which BrE has
the higher frequency), in both of which it is losing ground to the preterite. AusE
appears to lead the way in the use of the present perfect with past-referring adverbials,
but a contrary development is the preference among younger speakers of AusE but
not NZE for the use of the preterite – as in AmE – in contexts involving vague past
time specification.
Collins’s study of the progressive be + Ving construction takes into account a
range of variables including form classes, main clause use, special pragmatic uses, and
contracted forms. Noting that the progressive has enjoyed spectacular growth in late
Modern English, Collins finds AusE and NZE to have advanced further than BrE and
AmE. Meanwhile, of the two antipodean varieties it is AusE that is the more advanced,
and of the northern hemisphere pair, AmE is ahead of BrE.
Peters’s study of the mandative subjunctive disconfirms the predictions made
by some writers that the construction is moribund. Peters finds that the frequency
of the construction in AusE and NZE (as well as Singapore and Philippine English)
outstrips that in BrE, and confirms that the mandative subjunctive enjoys more
support in Commonwealth Englishes than in Britain itself.
Smith’s chapter on the light verbs give, have, make, and take (as in give someone
a poke, have a shower, make a start, take a rest) finds that have, which tends to be
more favoured in BrE than in AmE (the latter having a preference for take), has been
increasing in recent decades, while the other light verbs have shown little or no movement. Furthermore it is the antipodean varieties that are leading the way in the rise of
have, with AusE slightly ahead of NZE.
4.3 Section III (“Nouns and Noun Phrases”)
In the first of the two chapters in this section Smith explores regional and stylistic
variation in the frequency of non-numerical quantifiers such as lots/a lot of, in their
agreement patterns with associated verbs and noun complements, and in the extent
of their delexicalization. Smith’s data indicate stronger support for a bunch of, heaps of
and a lot of in the antipodean varieties than in BrE.
Holmes, Sigley and Terraschke’s chapter draws on a wide range of corpora to provide
up-to-date information on developments involving the use of gender-marked words.
The authors identify two different types of development, in both of which AusE and
NZE are leading the northern hemisphere varieties, BrE and AmE. In one case, the use
of gender-neutral expressions such as person, people, and chairperson, it is AusE that
is in the vanguard of change. In the other, the use of gender-visible expressions such
as woman, man, female, male, and of gendered heads followed by occupational labels
such as woman doctor, it is NZE leading the way.
4.4 Section IV (“Clauses and Sentences”)
The first four chapters in this section deal with grammatical
categories whose domain of operation is the clause (subject-verb agreement, negation,
clausal complementation), while the domain of the fifth chapter (run-on sentences)
is the sentence rather than the clause.
Hundt examines concord with collective nouns such as government and team,
which has seen a growing trend towards the use of singular concord, with AmE
leading the way. Hundt’s conclusion is that inter-dialectal variation (between AusE and
NZE, and between these varieties and BrE) is insignificant as against intra-dialectal
variation, triggered by speech/writing differences and by the choice of subject noun.
Peters’s chapter on negation finds evidence of linguistic creativity in the use of no
collocations, particularly in NZE. Another finding is that AusE is more advanced than
NZE in the replacement of no-negation with not-negation.
Kearns’s chapter is the first of two whose concern is with complementation.
Kearns’s focus is on finite clauses, in particular on the alternation between that and
zero complementizer. Using newspaper data Kearns finds a higher rate of zero complementizer in AusE and NZE (higher in the latter) than in AmE and BrE (higher
in the former).
Mair’s findings for patterns of infinitival complementation (as in help (to) V) and
gerundial complementation (as in prevent/stop NP (from) Ving) are similar to those
of Hundt on concord with collective nouns: differences between AusE and NZE are
negligible, with both patterning similarly to BrE and not showing signs of submitting
to American influence. Thus for example the American rejection of prevent/stop NP
Ving is not encountered in Britain or the antipodes.
Peterson studies three connective adverbs, however, therefore and thus, as used
to introduce a second main clause within a single orthographic sentence, preceded
by a comma. In a comparison of the parallel corpora FLOB, Frown, WWC and ACE,
Peterson finds that the antipodean varieties are leading the way over BrE and AmE in
the spread of this usage.
4.5 Section V (“Discourse”)
The four chapters in this section all take us in various ways beyond the scope of
sentence-bound grammar into the realms of discourse. In the first chapter Collins
suggests that antipodean practices in the use of “information-packaging” constructions pattern more closely with BrE than with AmE, which appears to be leading the
way in a rise in the popularity of the reversed pseudo-cleft, and in a decline in the
popularity of the it-cleft. The study finds NZE to be more conservative than BrE in its
support for constructions with dummy it and there, and in its preference for using the
information-packaging constructions in the written medium.
In Miller’s chapter the “discourse marker” like is observed to serve different functions depending on its occurrence in clause-initial, clause-medial, or clause-final
position (respectively, introducing exemplification, highlighting, and countering or
anticipating incorrect inferences). Miller finds the frequency of like to be quite similar
in AusE and NZE (slightly higher in the former, which favours medial position far
more than NZE), and outstripped in both by the colloquial discourse markers well,
you know and so.
Mulder, Thompson and Penry Williams’s chapter on final but provides strong
evidence that this item is distinctive to AusE and indexical of Australianness. The
authors distinguish between the fully grammaticized discourse particle but, which
conveys contrastiveness/concessiveness, and Final Hanging but, which merely leaves
these implications “hanging”. While final but is attested in other Englishes, its special
place in AusE is suggested by the extent of folklinguistic comment upon it, and by its
common use in fictional dialogue.
Allan and Burridge investigate swearing, the strongly emotive use of taboo terms
in insults, epithets and expletives. These terms are argued to have a variety of functions: to mark in-group solidarity; to spice up what is being said; to serve simply
as discourse particles, bleached largely of their taboo quality. Allan and Burridge
claim that bloody, which has deservedly acquired a reputation as the “great Australian
adjective”, and other swearwords have become widely accepted in the public arena in
contemporary Australia.
This book ends with an Epilogue by Peters, which explores the wider varietal
implications of a number of the findings of individual papers: the extent to which AusE
and NZE continue to reflect the norms of BrE, and where they are now independent of
it, and of each other.


References
Bauer, Laurie. 1988. “Number agreement with collective nouns in New Zealand English”.
Australian Journal of Linguistics 8: 247–59.
Bauer, Laurie. 1989a. “The verb have in New Zealand English”. English World-Wide 10: 69–83.
Bauer, Laurie. 1989b. “Marginal modals in New Zealand English”. Te Reo 32: 3–16.
Bauer, Laurie. 1989c. “Irregularity in past non-finite verb forms and a note on the New Zealand
weekend”. New Zealand English Newsletter 3: 13–16.
Collins, Peter. 1978. “Dare and need in Australian English: A study of divided usage”. English
Studies 59: 434–41.
Collins, Peter. 1979. “Elicitation experiments on acceptability”. Working Papers of the Speech and
Language Research Centre, Macquarie University 2: 1–49.
Collins, Peter. 1988. “The semantics of some modals in contemporary Australian English”.
Australian Journal of Linguistics 8: 233–58.
Collins, Peter. 1991a. “Will and shall in Australian English”. In Stig Johansson & Anna-Brita Stenström
(eds), English Computer Corpora: Selected Papers and Research Guide. Berlin: Mouton de
Gruyter, 181–99.
Collins, Peter. 1991b. “The modals of obligation and necessity in Australian English”. In Karin
Aijmer & Bengt Altenberg (eds), English Corpus Linguistics: Studies in Honour of Jan
Svartvik. London: Longman, 145–65.
Collins, Peter. 1996. “Get-passives in English”. World Englishes 15: 43–56.
Collins, Peter. 2005. “The modals and quasi-modals of obligation and necessity in Australian
English and other Englishes”. English World-Wide 26: 249–73.
Collins, Peter. 2007. “Can/could and may/might in British, American and Australian English: a
corpus-based account”. World Englishes 26: 474–91.
Collins, Peter. 2008. “The progressive aspect in World Englishes: A corpus-based study”. Australian
Journal of Linguistics 28(2): 225–49.
Collins, Peter & Pam Peters. 2008. “Australian English morphology and syntax”. In Kate Burridge &
Bernd Kortmann (eds), Varieties of English 3: The Pacific and Australasia. Berlin:
Mouton de Gruyter, 341–61.
Delbridge, Arthur, John Bernard, David Blair & Susan Butler (eds). 1981. The Macquarie Dictionary. Sydney: Macquarie Library.
Eagleson, Robert. 1972. “Aspects of Australian English usage”. AULLA (Australasian Universities
Language and Literature Association) Proceedings 14: 204–16.
Eagleson, Robert. 1976. “Anyone for his”. Working Papers in Language and Linguistics 4: 31–45.
Eisikovits, Edina. 1989. “Girl-talk/boy-talk: sex differences in adolescent speech”. In
Peter Collins & David Blair (eds), Australian English: The Language of a New Society.
St Lucia: University of Queensland Press, 35–54.
Engel, Dulcie & Marie-Eve Ritz. 2000. “The use of the present perfect in Australian English”.
Australian Journal of Linguistics 20: 119–40.
Gordon, Elizabeth & Tony Deverson. 1985. New Zealand English: An Introduction to New Zealand
Speech and Usage. Auckland: Heinemann.
Greenbaum, Sidney & Randolph Quirk. 1970. Elicitation Experiments in English: Linguistic Studies
in Use and Attitude. London: Longman.
Horvath, Barbara. 1985. Variation in Australian English: The Sociolects of Sydney. Cambridge:
Cambridge University Press.
Hundt, Marianne. 1998. New Zealand English Grammar: Fact or Fiction? A Corpus-Based Study
in Morphosyntactic Variation. Amsterdam: John Benjamins.
Hundt, Marianne, Jennifer Hay & Elizabeth Gordon. 2008. “New Zealand English: morphosyntax”.
In Kate Burridge & Bernd Kortmann (eds), Varieties of English 3: The Pacific and Australasia.
Berlin: Mouton de Gruyter, 305–40.
Mitchell, Alex & Arthur Delbridge. 1965. The Speech of Australian Adolescents. Sydney: Angus &
Robertson.
Newbrook, Mark. 1992. “Unrecognised grammatical and semantic features typical of Australian
English: Checklist with commentary”. English World-Wide 13(1): 1–32.
Newbrook, Mark. 2001. “Syntactic features and norms in Australian English”. In David Blair &
Peter Collins (eds), English in Australia. Amsterdam: John Benjamins, 113–32.
Orsman, Harry. 1997. The Dictionary of New Zealand English. New Zealand Words and their
Origins. Oxford: Oxford University Press.
Pawley, Andrew. 2008. “Australian vernacular English: Some grammatical characteristics”. In
Kate Burridge & Bernd Kortmann (eds), Varieties of English 3: The Pacific and Australasia.
Berlin: Mouton de Gruyter, 362–97.
Peters, Pam. 1993. “American and British influence on Australian English verb morphology”. In
Udo Fries, Gunnel Tottie & Peter Schneider (eds), Creating and Using English Language
Corpora. Amsterdam: Rodopi, 248–55.
Peters, Pam. 1996. “Comparative insights into comparison”. World Englishes 15: 57–68.
Peters, Pam. 1998. “The survival of the subjunctive: Evidence of its use in Australia and elsewhere”.
English World-Wide 19: 87–103.
Peters, Pam. 2001. “Corpus evidence on some points of Australian style and usage”. In David Blair &
Peter Collins (eds), English in Australia. Amsterdam: John Benjamins, 163–78.
Peters, Pam & Margery Fee. 1987. “New configurations: The balance of British and American
English features in Australian and Canadian English”. Australian Journal of Linguistics
9: 135–47.
Quirk, Randolph & Jan Svartvik. 1966. Investigating Linguistic Acceptability. The Hague: Mouton.
Ramson, William. 1966. Australian English: An Historical Study of the Vocabulary, 1788–1898.
Canberra: Australian National University Press.
Schneider, Edgar W. 2003. “The dynamics of New Englishes: From identity construction to
dialect birth”. Language 79(2): 233–81.
Schneider, Edgar W. 2007. Postcolonial English: Varieties around the World. Cambridge: Cambridge
University Press.
Shnukal, Anna. 1989. “Variable subject relative pronoun absence in Australian English”. In
Peter Collins & David Blair (eds), Australian English: The Language of a New Society.
St Lucia: University of Queensland Press, 70–77.
Sussex, Roland. 1982. “A note on the get-passive construction”. Australian Journal of Linguistics
2: 83–92.
Watson, Ken. 1978. “Teachers’ attitudes to usage”. ALAA (Applied Linguistics Association of
Australia) Occasional Papers 2: 32–40.

Section I
Morphology
Irregular verbs
Regularization and ongoing variability
Pam Peters*
Macquarie University
Both language history and mathematical modeling suggest that the English
irregular verbs will generally evolve to become more regular. Yet closer
investigation of individual verbs and verb groups shows that evolutionary
expectations can be overstated. Data from the ICE-corpora for Australian,
New Zealand and British English show differing endorsements of nonstandard
forms, whether these are long-established variants as for ring/shrink/spring, or
latter-day revivals such as -t for burn, learn, spell. The data put Australian and
New Zealand English closer to each other than either is to British. Australian
population surveys show that younger citizens are more inclined to use
nonstandard/nonstandardized forms. Sociolinguistic and regional preferences
may thus run counter to the broad evolutionary trend for English verbs, at least
in the short term.
1. Introduction: Movements in the English verb system
The number of irregular (strong) verbs in English has steadily declined over the last
millennium. Of the 312 which were operative in Old English, only 66 (i.e. 21%) remain
irregular in the twentieth century (Fries 1940: 60–1). The Middle English period saw the
breakdown of many irregular paradigms, and mergers between irregular and regular
verbs, as in the case of cleave and let. Other irregular verbs were totally reconstructed as
regular verbs, e.g. melt > melted, though the adjective molten – formerly the verb’s past
participle – bears witness to its irregular past.
*The contribution of my research assistant Yasmin Funk in the gathering and sifting of verbal
data for this paper is gratefully acknowledged.
The historical trend towards regularization of English irregular verbs has
recently been the subject of mathematical modeling by Harvard mathematicians (Erez Lieberman et al.), published in a letter to the journal Nature vol. 449: 714
(11 October 2007). They calculated the regularization rate of 177 verbs from Old English to
the present day, relative to their frequency of use, and found an inverse ratio between
their frequency and the speed of assimilation. The less frequent irregular verbs were
more quickly regularized, whereas higher frequency irregular verbs remained that way
much longer. This emerged as the dominant factor in assimilation or resistance over
many centuries, from which they extrapolated the half life for each irregular verb in
five frequency bands. Irregular verbs in the lowest frequency band, e.g. hew, shrive,
slink, wreak would almost all be regularized within 300 years; whereas high frequency
verbs such as come, find, get, give, go, know, say, see, take, think would remain irregular
for over 14 000 years – if the English language lasts that long! The correlation between
high frequency and irregularity of form has been strongly supported from the perspective
of cognitive linguistics, by Pinker (1999) and Bybee (1985, 2006) among others.
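The notion of a half-life for irregular verbs can be made concrete with a minimal sketch. It assumes simple exponential decay, so that the share of verbs in a frequency band still irregular after t years is 0.5 raised to the power t/h, where h is the half-life. The function name and the two half-life values are hypothetical and chosen only for illustration; they are not the figures reported by Lieberman et al.

```python
def share_still_irregular(years: float, half_life: float) -> float:
    """Expected share of verbs still irregular after `years`, assuming exponential decay."""
    if half_life <= 0:
        raise ValueError("half_life must be positive")
    return 0.5 ** (years / half_life)


# Hypothetical half-lives (in years) for a low- and a high-frequency band.
# These values are invented for illustration, not taken from the Nature letter.
illustrative_half_lives = {"low frequency": 500.0, "high frequency": 10_000.0}

for band, half_life in illustrative_half_lives.items():
    share = share_still_irregular(1_000, half_life)
    print(f"{band}: {share:.1%} still irregular after 1,000 years")
```

Under any such model the longer half-life of the high frequency band leaves a much larger share of its verbs irregular over the same period, which is the frequency effect described above.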
Yet the process of verb regularization has certainly not been taking place at the same
rate everywhere in the English-speaking world. Statistically significant differences (at
the < 0.01 level of probability or lower) in AmE and BrE use of irregular parts were
found by Hofland and Johansson (1982: 472–544) for verbs such as burn, learn, light,
smell, where the strong AmE preference for -ed contrasted with BrE variation between
-t and -ed. In AmE, verbs such as mow now have mowed as both past tense and
past participle; whereas in BrE the past participle is still usually mown (Peters 2004:
360). AmE was and is more strongly oriented toward regular forms, which Webster
endorsed as the distinctively American codification of the language (Baron 1982:
60–7). Webster’s American Dictionary of the English Language (1828) disendorsed
as obsolescent the -en (past participle) forms of verbs like bind, burst, sit (bounden,
bursten, sitten), which eighteenth century British grammarians such as Lowth (A
Short Introduction to English Grammar 1762) had affirmed. Lowth’s conservative
position was strongly supported by other British authorities, including Priestley in The
Rudiments of Grammar (1769: ix), who argued that “the paucity of inflections is the
greatest defect in our language” and urged readers “if possible [to] make a participle
different from the preterite of a verb; as a book is written, not wrote; the ships are
taken, not took”. Johnson, in the introductory Grammar to his Dictionary (1755), put
it more axiomatically: “a distinct past participle is more proper and more elegant”.
Their intervention may indeed have helped to arrest the contemporary trend to use
the past tense for the past participle with many irregular verbs. At any rate, there are
still distinct past participles for verbs like break, drive, eat, take, write and others in
standard English 250 years later.
The conservation of irregular verb parts by eighteenth century British grammarians
goes against the dominant historical trend towards regularization of the English irregular
verbs. Even more remarkable are the instances of English verbs which have added irregular parts to what were previously regular verbs. In Early Modern English, the verb thrive
acquired irregular past forms (throve, thriven) as if it were a member of the drive/drove/
driven paradigm – although thrived is used much more generally in late twentieth century
English (Peters 2004: 540). The regular verb dive acquired the past tense dove in Canada
during the nineteenth century, which has spread as an alternative into the northern US
(Fee and McAlpine 2007: 186). Clearly there is continuing life in some of the irregular
verb paradigms, enough to provide alternative models to the regular -ed inflection.
Most examples of full or partial remodeling of English regular verbs according
to irregular patterns are somewhat regionalized. The use of hung as the past tense/
participle of the once regular verb hang is more fully integrated in AmE than BrE
(Biber et al. 1999: 397). In BrE the regular verbs saw and sew have acquired irregular
past participles (sawn/sewn), though the earlier -ed forms are more resilient in AmE
(Peters 2004: 487), witness the use of sawed-off shotgun in the US.
All these examples suggest that the regularization of English verb morphology is
conditioned by time and place, and that regional variation is a factor within any larger
historical trend. Differences between BrE and AmE have long been noted, and it is
of no small interest to see how they are reflected in southern hemisphere varieties,
especially AusE and NZE. It is also clear that verbal changes are not unidirectional,
given the various examples in which regular verbs have been remodeled according
to irregular paradigms. This incidentally argues for the importance of analogy in
morphological change (cf. Bauer 2001: 75–84), operating apart from any general rule
of regularization over the course of time. It indicates the need for a more complex
model of verbal movements, to allow for both regularization and irregularization.
2. Modeling and analyzing the directions of change
Apart from the bidirectionality in the movements of English verbs, the notion of regularization itself needs closer inspection. The changes and variability shown among
the irregular verbs of present-day English are certainly not all in the direction of
implementing the -ed suffix instead of vowel gradation for expressing past time.
Rather – as with the sing/sang/sung group – it’s the reduction of contrasting verb
parts from 3 to 2, while still using contrasting vowels (just i/u). Founding members
of the i/u paradigm are earlier refugees from the sing/sang/sung group, such as cling,
sling, slink, sting, stink, swing, wring. That paradigm seems to be gaining in strength:
witness its ability to recruit from among the regular verbs, e.g. fling, ring, string
in the sixteenth and seventeenth century, and examples such as sneak, drag with
their alternative irregular past forms snuck, drug in the nineteenth and twentieth
century. Though these two verbs end in velar stops rather than velar nasals, they
nevertheless seem to be susceptible to the power of the paradigm, and help to extend
its phonological base.
Examples such as these, and others mentioned above in Section 1, show that verb
morphology may change so as to reduce or increase the number of contrasting parts.
The reduction of a verb’s parts to 2 (as with cling/clung etc.) puts it on a par with the
ordinary -ed paradigm in having the same form for both past tense and past participle.
Meanwhile verbs with a “distinct past participle” may have retained it from Old English
or added it in Early Modern English, but either way they are 3-part verbs. These four
essential groups challenge the dichotomy between regular and irregular verbs, and need
to be plotted on an extended scale, as in Table 1.
Table 1. Modeling the directions of change in English verbs

〈 〈 〈 Regularization                                        Irregularization 〉 〉 〉
2-part verbs                            3-part verbs
(A)                (B)                  (C)                     (D)
-ed (pt and pp)    other past forms     pt as -ed plus          irreg. pt and
                   for pt and pp        irreg. pp               irreg. pp
earn/earned        build/built          prove/proved/proven     begin/began/begun
help/helped        bring/brought        sew/sewed/sewn          drive/drove/driven
like/liked         fling/flung          show/showed/shown       bite/bit/bitten
melt/melted        spin/spun            sow/sowed/sown          take/took/taken

The four sets of verbs shown in Table 1 don’t line up neatly in terms of a regular/
irregular dichotomy. Those in group A on the extreme left hand side are of course to
be regarded as regular, or having reached the evolutionary target of regularization,
in Lieberman et al.’s terms (2007). Whether group B verbs such as those which use a
-t suffix for both past forms are regular or irregular is more debatable. Most modern
grammarians, including Quirk et al. (1985) and Huddleston and Pullum (2002), treat
them as irregular because of changes to the vowel and/or stem consonants. However
Fries’s (1940: 63–5) more historically informed classification distinguishes “weak” verbs
whose past forms now end with a dental suffix (e.g. build/built) from both “irregular”
verbs (be, do, go) and “strong” verbs with vowel gradation (e.g. sing/sang/sung). He also
draws attention to the fact that at least 24 of the “strong” verbs (e.g. fling, spin) have
become “two-form” verbs – which makes them a subset of Group B verbs in terms of
the model in Table 1 above. Group C verbs are also a mix, in that they form their past
tense with -ed but are strictly 3-part verbs because of their distinct past participle. Yet
by their dental past tense they have more in common with regular verbs (i.e. group A)
than some of their 2-part neighbors (group B). Discontinuities like these challenge the
notional continuum from irregular to regular; as does the fact that group C includes
verbs which represent opposite trends: sow is a case of incomplete regularization, while
sew (like saw) is a case of incomplete irregularization.
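The four-way grouping in Table 1 can be expressed as a small decision procedure, sketched here under simplifying assumptions. The function names are hypothetical, and the test for a regular past form only recognizes base + -ed (or base + -d after a final e); it ignores consonant doubling and the change of y to -ied, so it is an illustration of the model rather than a full morphological analyzer.

```python
def is_regular_past(base: str, form: str) -> bool:
    """True if `form` is simply base + -ed (or base + -d after a final e).

    Simplification: consonant doubling (stop/stopped) and y -> ied are ignored.
    """
    return form == base + "ed" or (base.endswith("e") and form == base + "d")


def classify(base: str, past_tense: str, past_participle: str) -> str:
    """Assign a verb to group A, B, C or D of Table 1."""
    if past_tense == past_participle:
        # 2-part verbs: one form serves as both past tense and past participle.
        return "A" if is_regular_past(base, past_tense) else "B"
    # 3-part verbs: the past participle is distinct from the past tense.
    return "C" if is_regular_past(base, past_tense) else "D"


examples = [
    ("earn", "earned", "earned"),
    ("build", "built", "built"),
    ("sew", "sewed", "sewn"),
    ("take", "took", "taken"),
]
for base, pt, pp in examples:
    print(base, classify(base, pt, pp))
```

The sketch makes the discontinuity visible: the first test (are the two past forms identical?) separates 2-part from 3-part verbs, while the second (is the past tense formed with -ed?) cuts across that division and links group C with group A.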
The model drafted in Table 1 copes with the bidirectional changes in English verbs –
at least for those with either two or three contrasting parts, whatever the vagaries of
their evolution. It does not provide for invariant 1-part verbs like hit, let, put, which
show no sign of regularizing their past tense with -ed and becoming 2-part verbs.
Perhaps their high frequency is a significant factor in their resisting regularization,
given that some lower frequency members of the set, e.g. bet, knit, quit, wed, wet have
had alternative past forms with -ed. But there’s little evidence of the -ed forms in contemporary English: they mostly occur in very particular idioms e.g. wedded to, wetted
his lips, with no particular regional variability (Peters 2004: 574–5). The general resistance of this group to regularization may be due to their phonology: with their short
lax vowels before a dental consonant, the addition of -ed for their past form would
have them “clogged with consonants”, as Addison noted in his English as an enemy to
loquacity (1711, cited in Tucker 1961). They stand apart as a phonologically defined
group, resisting regularization with -ed, and not likely to assimilate with the larger set of
2-part verbs including groups A and B. Yet like the fling/flung group, they are a resilient
paradigm of irregular verbs which challenge the notion that all English verbs will be
regularized to -ed. Both groups have distinguishing phonological features which seem
to consolidate them, and thus demonstrate that the English verb system is susceptible
to other linguistic factors than the purely morphological process.
In what follows here we will concentrate on two kinds of English verbs whose
current variability associates them with group B, though they come from opposite
ends of the spectrum in Table 1. The first set are members of the sing/sang/sung group
(group D in “standard English”), the second a subset of group A verbs (burn, spell etc.),
whose past forms are variably spelled with -ed and -t. Together they show the attraction of the 2-part verb paradigm, rather than regularization to -ed. However recent
research has shown them fluctuating in both BrE and AmE (Hundt 2009); and whether
AusE and NZE show similar variability is an open question.
Apart from regional variation, the medium of communication also affects both
the relative frequency of past forms of verbs, and the degree of control exercised
over them by those who produce or subsequently edit the discourse. Past forms are
typically less frequent in speech than writing, because spoken discourse is so often
embedded in its immediate context (Biber et al., 1999: 456). At the same time the
past forms of verbs which are used in speaking can vary because of the mode of
production, whereas those embedded in written/published texts are likely to present the standard past forms because of the normative process of editing (Cameron
1995: 50–4). Even in written discourse, the two forms of the past may be unevenly
represented, as Fries found in his corpus of unedited letters. Those from writers with
lower levels of education (= “Vulgar” letters) contained substantially less use of past
participles (only a quarter of that found in the “Standard” letters), and there were
twice as many nonstandard ones (Fries 1940: 67–71). Reliance on published writing
as language evidence probably inflates the relative frequency of standard past forms.
So the conditioning effect of the medium is a further factor to take into account in
analysing the variability of verb morphology.
3. Using ICE-corpus data, written and spoken
The ICE corpora lend themselves to this research project, with their parallel collections
of spoken and written material from multiple regional varieties of English. The material
from each variety is collected in the same categories/registers of writing and speech, and
from “educated” speakers in each case, i.e. those who have completed secondary education. They therefore show how members of a literate speech community conjugate such
verbs in everyday spoken interactions as well as in writing, and the contribution of both
modes of discourse to the patterns of inflection.
The ICE-corpora are not particularly large: within the total of 1 million words
there are just 600 000 words of natural or scripted speech, and 400 000 words of
written (mostly published) discourse. But the data comes from known contexts of
speaking, both more and less formal, including casual/private conversation and
institutional kinds of speech such as are used in classrooms, courts and radio discussions. Data from written discourse is drawn from some unpublished sources
as well as published works including newspapers and academic/scientific works,
and fiction.
In Sections 4 and 5 below, we will examine verbal data from the ICE-corpora
for three genetically related varieties of English, i.e. AusE and NZE, both antipodean
“settler” varieties; and BrE, the foundation variety, which they both still resemble in
many ways (Trudgill & Hannah, 2002: 1–8). The aim is to compare the relative frequency of the standard and nonstandard or nonstandardized past forms within each
variety, and their presence in spoken and written discourse. The analysis will focus on
two sets of verbs: those with a variable past tense, e.g. sing/sang/sung (a set of 8 verbs);
and those where the past tense and past participle both vary, e.g. burn/burned or burnt
(a set of 9 verbs).
Spoken usage itself varies much more than any published writing according to
discoursal context (public utterance vs. private conversation), and according to the
identities of the speakers. In conversation one’s sociolinguistic identity can be freely
expressed, and conversational styles vary with the speakers and the formality of the
setting. All speech data in the ICE corpora comes from adults who have completed
a secondary education, yet informal variants may be used to reduce the distance
between interlocutors. In any case, elements of “standard” usage may be overlooked
when everyone’s discourse is subject to production pressures. It is then a question as
to how far conversational performance reflects the speakers’ underlying usage preferences. These can be explored with the aid of population surveys, as we do in Section 6
below with some results from surveys carried out by Australian Style over several years.
This attitudinal data complements the textual evidence provided by the ICE-corpora,
at least for AusE.
4. Frequencies of nonstandard past verb forms used with sing/sang/sung verbs
The overall trend for most variable verbs is for the same form to be used for the past
tense and past participle, as shown above in Table 1 for verb groups A and B. This may
take the form of using the past tense form for the past participle for variable members
of group C, as when sheared is used instead of shorn; and it has been noted for common
group D verbs in Australian vernacular English, as in threw for thrown, took for taken,
went for gone, trod for trodden (Pawley 2004: 631–2). It is found in AmE with verbs like
drink, as noted in Merriam-Webster’s Third New International Dictionary (1961); and
Biber et al. (1999: 398) demonstrate it with swam substituted for the past participle swum,
in an example from fiction in the Longman corpus. The same kind of substitution of
the past tense for the past participle can be seen in the following Australian example
for the verb begin:
(1) The …Inquiry has began investigations into the events surrounding the
resignation of the former Police Minister [ICE-AUS S2B-008:254]
However examples like these were rare in the ICE data. Far more often the direction of
substitution is for the past participle to be used for the past tense, as in:
(2) Oh I had the impression …she begun the whole thing [ICE-AUS S1A-084:315]
The data in Table 2 below (p. 20) consist entirely of such cases, where the past participle
has been used for the past tense, a substitution which tends to be called “nonstandard”.
There are noteworthy differences in totals for the three sets of verbal data, but the
complete absence of nonstandard verb forms from ICE-GB means that their statistical
significance as part of the trio cannot be tested. The differences between the ICE-AUS
and ICE-NZ data are also not statistically significant. From this we can only say that
AusE and NZE do not seem to differ in their utilization of the nonstandard forms;
rather they pattern together in this aspect of English verb morphology. The fact that
both provide evidence of the use of nonstandard forms used by educated speakers and
writers is perhaps evidence of “colonial lag”. According to Fowler (1926), past tense
forms like shrunk, sung were formerly used (during the nineteenth century) but had
largely been replaced by shrank, sang – a further remarkable case of the maintenance
of, or reversion to, the 3-part verb paradigm. That this consolidation of 3-part verbs
should not have gone so far in the antipodes is not so surprising. It was after all BrE
norms of the nineteenth century on which the southern hemisphere speech communities
were founded.
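The comparison of the ICE-AUS and ICE-NZ totals can be checked with a short calculation. The chapter does not name the statistical test used, so the sketch below assumes a Pearson chi-square test on a 2 x 2 table without continuity correction; the function name is hypothetical, and the counts are the totals for standard and nonstandard forms reported in Table 2.

```python
import math


def chi_square_2x2(a: int, b: int, c: int, d: int) -> tuple[float, float]:
    """Pearson chi-square (no continuity correction) and p-value for [[a, b], [c, d]]."""
    n = a + b + c + d
    statistic = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom, the upper-tail probability is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Totals from Table 2: (standard, nonstandard) past tense tokens.
ice_aus = (244, 9)
ice_nz = (250, 20)

statistic, p_value = chi_square_2x2(*ice_aus, *ice_nz)
print(f"chi-square = {statistic:.2f}, p = {p_value:.3f}")
```

On these totals the p-value comes out slightly above the conventional 0.05 threshold, which is consistent with the statement that the two corpora do not differ significantly. A continuity correction, if applied, would raise the p-value further.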
The absence of nonstandard forms in ICE-GB data is also worth commenting on,
suggesting that the corpus draws on metropolitan speakers rather than those from
the traditional dialect areas of southern England. In her study of the sing/sang/sung
verbs, Anderwald (2007) succeeded in capturing reasonable quantities of nonstandard
past tense forms in her 2.4 million word corpus of contemporary British dialectal English (the so-called FRED corpus held at the University of Freiburg-im-Breisgau). Their
survival in dialectal speech complements Fowler’s suggestion that standard forms had
been restored in British written English (i.e. as edited and published). They are less
stigmatized in AusE and NZE, judging by the sociolinguistic data discussed in Section
6 below. Yet even Australians and New Zealanders – by these ICE data – seem to avoid
nonstandard use of drunk for the past tense of drink, perhaps because it carries a kind
of taboo from the homonymic adjective.

Table 2. Data from three ICE corpora showing the relative frequency of standard past
tense forms and the nonstandard ones with u for 8 verbs with i/a/u conjugation

                            ICE-AUS   ICE-NZ   ICE-GB   Total   spoken/written
began                           117      166      100     383
begun                             1        1        0       2   2 sp
drank                            10        6        2      18
drunk                             0        0        0       0
rang                             93       44       36     173
rung                              1        9        0      10   9 sp/1 wr
sang                             17       12        3      32
sung                              0        3        0       3   3 sp
sank                              4       15        6      25
sunk                              0        4        0       4   3 sp/1 wr
shrank                            1        1        1       3
shrunk                            4        0        0       4   2 sp/2 wr
sprang                            1        2        3       6
sprung                            2        3        0       5   2 sp/3 wr
swam                              1        4        0       5
swum                              1        0        0       1   1 sp
Total: standard forms           244      250      151     645
Total: nonstandard forms          9       20        0      29   22 sp/7 wr

The most frequently found nonstandard form was rung, used in references to
phone calling, for example:
(3) I rung this morning. I rung your mother and she was out [ICE-NZ S1A-007:266]
(4) It was you rung us up, remember. It was you charged ’im with…
[ICE-AUS W2F-004:148]
All the New Zealand and Australian examples used the verb ring in this sense, in
casual references to phoning (not as of a church bell). In the northern hemisphere it’s
the verb call which carries the additional sense of phoning, so that it does not impact
on BrE use of the verb ring.
Although most instances of nonstandard verbal usage come with ring, its ratio of
nonstandard to standard is quite low in comparison with the other verbs in the list: just
over 5%, whereas its counterparts for verbs sink and swim are 14% and 17% respectively,
and those for spring and shrink are 45% and 57%. Thus the higher frequency verb shows
far fewer nonstandard tokens for the past tense than the lower frequency ones, though
the correlation is not continuous. They demonstrate not verbal regularization, but the
pull of the 2-part verb paradigm which we have already noted with other analogous
verbs such as cling, sling, swing. In fact shrunk and sprung are listed as alternative past
forms in authoritative American and Canadian dictionaries (Peters 2004: 499, 513).
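The percentages quoted for individual verbs can be recomputed from the combined totals in Table 2. The sketch below assumes that each figure expresses nonstandard tokens as a share of all past tense tokens for that verb (standard plus nonstandard), since that reading reproduces the reported values; the variable and function names are illustrative.

```python
# Combined totals from Table 2 for the three ICE corpora: (standard, nonstandard).
past_tense_totals = {
    "ring": (173, 10),
    "sink": (25, 4),
    "swim": (5, 1),
    "spring": (6, 5),
    "shrink": (3, 4),
}


def nonstandard_share(standard: int, nonstandard: int) -> float:
    """Percentage of all past tense tokens that take the nonstandard u form."""
    return 100 * nonstandard / (standard + nonstandard)


shares = {verb: nonstandard_share(*counts) for verb, counts in past_tense_totals.items()}
for verb, share in shares.items():
    print(f"{verb}: {share:.1f}%")
```

With totals this small for swim, spring and shrink, a single additional token shifts the percentage considerably, so the ordering of the lower frequency verbs should be read with caution.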
Table 2 shows that the occurrences of rung, sung are typically found in Australian
and New Zealand speech. Out of the total of 29 instances found in ICE-AUS and
ICE-NZ, 22 are embedded in spoken texts. The other 7, found in written texts, occur
in both edited and unedited writing. Of those, 4 are scattered across a variety of writing
styles, from informal nonfiction to fictional narrative and dialogue; while 3 occur in
unpublished writing from student exams. For example:
(5) Once people sunk this low it was up to them [ICE-NZ W1A-007:203]
Yet one student’s awareness of the need to put one’s best grammatical foot forward
in exam writing can be seen in self correction, and the replacement of nonstandard
sprung for the past tense with standard sprang:
(6) The broadbased opposition sprung sprang from a number of groups who
opposed the regime. [ICE-AUS W1A-041:69]
Apart from example 6, nonstandard past forms seem to be used quite unselfconsciously
by some Australian and New Zealand speakers and writers, as acceptable within their
speech communities. As indicated before, they pattern together on this aspect of verb
morphology, and distinguish themselves from otherwise comparable British data in
ICE-GB. The continuing currency of “nonstandard” forms suggests greater willingness
to allow sing/sang/sung verbs to become 2-part verbs in the southern hemisphere – or
at least outside Britain.
5. Frequencies of standard -ed and nonstandardized -t for verbs with variable past forms
With the rather small gleanings of nonstandard forms for i/a/u verbs in the ICE data,
it seemed worthwhile looking at a larger data set – such as the variable -ed/-t verbs,
where the nonstandard or rather nonstandardized verb forms are more frequent, and
there is no stigma attached to using them. Here the -t form is a voiceless variant which
goes back to Old English, and it is a well represented variant of many verbs in ME and
EME, according to the Oxford English Dictionary (1989) records. British orthoepists
and grammarians of the seventeenth and eighteenth centuries worked to regularize
them all to -ed, in line with the linguistic trend of the times to disconnect norms of
writing from vagaries of pronunciation (Gordon 1966). Data from historical corpora
suggests that they were relatively successful in BrE until the later nineteenth century
(Hundt 2009: 14–15). However the -t forms were never fully repressed, and were sufficiently frequent in the early twentieth century to secure Fowler’s (1926) endorsement. He advised readers to use the -t forms on phonological grounds, while Gowers
in his second edition of Fowler’s Dictionary of Modern English Usage (1965) notes
his predecessor’s recommendation and claims that “there has since been a movement towards -t”. This remarkable set of verbs in which irregular forms have been
revived – at least in BrE – are thus moving in with the group B verbs of the model
discussed in Table 1 above. In AmE, where the -ed forms have been consistently
maintained through the twentieth century, they remain group A verbs. Let us now
review the frequency of the nonstandardized -t forms for this variable set of verbs
in data from the three ICE corpora.
Looking at the total frequencies of the two types shown in Table 3, we note that
the southern hemisphere varieties are more strongly in favor of the -t forms overall,
especially AusE, and they contrast with the BrE pattern in which the -ed forms are
preferred. The differing ratios of -t to -ed forms in the three corpora are highly significant in statistical terms (p < 0.0007). We might have expected the ICE corpora
to form a more coherent set, given the fact that both NZE and AusE are genetically
related to BrE and share its linguistic stock. Very similar results were found for the
most frequent examples (burned/-t, learned/-t) in NZE and BrE in an earlier study
by Hundt (1998: 24–5) based on newspaper data. Yet other corpus-based research on
the distribution of -ed/-t in written English (Peters 1994: 156) found no clear reflection
in AusE of British rather than the American norms. This data from the ICE-corpora
confirms the distance between the parent BrE and both AusE and NZE.
There are noteworthy differences between the New Zealand and Australian
data in Table 3. AusE shows a much higher ratio of -t forms to -ed forms, not just
larger totals of -t as Hundt, Hay and Gordon (2004: 562) found in their analysis
of purely written sources. Yet the ICE data from NZE shows a slightly wider commitment to the -t forms in terms of the overall number of verbs evidencing that
pattern. Two particular verbs (learnt, spelt) are major contributors to AusE’s larger
totals of -t forms, both strongly associated with the educational domain, and may
therefore reflect deeply rooted prescriptions in regional education. Without them
the Australian and New Zealand ratios for -t/-ed would be closer and more comparable overall.
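The effect of learn and spell on the overall ratios can be worked through with the per-verb counts given in Table 3. The sketch below is only an arithmetic check: the dictionary and function names are illustrative, and the share of -t forms is calculated as a percentage of all -ed and -t tokens for the verbs retained.

```python
# (-ed, -t) counts per verb from Table 3.
ice_aus = {"burn": (6, 18), "dream": (3, 3), "kneel": (3, 1), "lean": (4, 2), "leap": (1, 3),
           "learn": (21, 35), "spell": (3, 21), "spill": (1, 2), "spoil": (0, 0)}
ice_nz = {"burn": (8, 19), "dream": (6, 1), "kneel": (0, 3), "lean": (3, 2), "leap": (2, 6),
          "learn": (36, 28), "spell": (4, 15), "spill": (7, 3), "spoil": (2, 2)}


def t_share(counts, exclude=()):
    """Percentage of -t forms among all -ed/-t tokens, optionally leaving out some verbs."""
    kept = [pair for verb, pair in counts.items() if verb not in exclude]
    ed_total = sum(ed for ed, _ in kept)
    t_total = sum(t for _, t in kept)
    return 100 * t_total / (ed_total + t_total)


for label, exclude in [("all nine verbs", ()), ("without learn and spell", ("learn", "spell"))]:
    aus, nz = t_share(ice_aus, exclude), t_share(ice_nz, exclude)
    print(f"{label}: AusE {aus:.1f}%, NZE {nz:.1f}%, gap {aus - nz:.1f} points")
```

Leaving out the two verbs narrows the gap between the Australian and New Zealand shares of -t forms, in line with the observation above, although AusE still shows the higher share.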
Table 3. Relative frequencies of -ed forms and -t forms in individual ICE corpora*

              ICE-AUS       ICE-NZ        ICE-GB       Totals (% for each pair)
burned        6             8             8            22 (33.3%)
burnt         18            19            7            44 (66.7%)
dreamed       3             6             3            12 (70.6%)
dreamt        3             1             1            5 (29.4%)
kneeled       3             0             1            4 (50%)
knelt         1             3             0            4 (50%)
leaned        4             3             10           17 (73.9%)
leant         2             2             2            6 (26.1%)
leaped        1             2             1            4 (28.6%)
leapt         3             6             1            10 (71.4%)
learned       21            36            35           92 (48.2%)
learnt        35            28            36           99 (51.8%)
spelled       3             4             1            8 (17.8%)
spelt         21            15            1            37 (82.2%)
spilled       1             7             3            11 (50%)
spilt         2             3             6            11 (50%)
spoiled       0             2             0            2 (40%)
spoilt        0             2             1            3 (60%)
TOTALS -ed    42 (33.1%)    68 (46.3%)    62 (53%)     172 (44%)
TOTALS -t     85 (66.9%)    79 (53.7%)    55 (47%)     219 (56%)

Further inspection of individual verbs finds two cases (learnt, spilt) where the largest number of -t forms comes from ICE-GB. This finding for spilt contrasts with Levin’s
(2008), based on data from the British newspaper The Independent, where spilled was
the preferred form. Levin noted (p. 66 fn) that most of these were embedded in sports
idiom (spill the ball/shot etc.), whereas the examples found in this ICE data (with one
ICE-NZ exception) were not of this type. The case of spill shows the sensitivity of individual verbs to idiomatic usage, which impacts heavily on the smaller frequencies in
intercomparisons.
The data in Table 3 (combined totals for each pair) show the lack of correlation
between frequency and the regularity of morphology which would be predicted
by mathematical modeling. High frequency examples like learned/learnt and low
frequency ones like kneeled/knelt present similar ratios for the two suffixes. This noncorrelation may reflect the fact that the variation between -t and -ed forms in twentieth
and twenty-first century English is not an interim stage of regularization, but rather
analogical restoration of irregular forms, as argued by Hundt (2009: 16–17), i.e. the
process of irregularization discussed in Section 2. The differing frequencies of verbs
in spoken and written data could also contribute to the inconsistencies we see in this
set of ICE data. We noted above (Section 2) that past forms are less frequent generally
in speech. Yet when they do occur, there is a greater chance of their being irregular: more
irregular (i.e. nonstandardized) forms for the -t/-ed verbs were found by Biber et al.
(1999: 396) in conversation than in the written material of the Longman corpus.
To examine this effect in our ICE data, nonstandardized verb forms from the three
corpora were retabulated in Table 4, according to their frequencies in written and
spoken data.
Table 4. Written and spoken frequencies from all three ICE corpora for nonstandardized
past forms for -ed/-t verbs: raw and normalized frequencies per 1 million words (in parentheses)

(1) standardized past forms
            written        spoken
burned      20 (50)        2 (3.2)
dreamed     10 (25)        3 (4.8)
kneeled     2 (5)          2 (3.2)
leaned      15 (37.5)      2 (3.2)
leaped      4 (10)         -
learned     62 (155)       44 (70.4)
spelled     5 (12.5)       3 (4.8)
spilled     7 (17.5)       4 (6.4)
spoiled     1 (2)          2 (3.2)

(2) nonstandardized past forms
            written        spoken
burnt       21 (52.5)      35 (57.6)
dreamt      -              5 (8.0)
knelt       3 (7.5)        1 (1.6)
leant       3 (7.5)        3 (4.8)
leapt       3 (7.5)        7 (11.2)
learnt      39 (97.5)      61 (97.6)
spelt       9 (22.5)       32 (51.2)
spilt       4 (10)         8 (12.8)
spoilt      2 (5)          0

In Table 4 the regular -ed forms are in every case (except spoiled) more strongly
associated with writing; whereas the -t forms are less polarized in their distribution
over the two media. The normalized figures show that nonstandardized forms for high
frequency verbs (burnt, learnt) are also well established in writing, and are indeed
more frequent there for some low frequency verbs (knelt, leant). So despite separating
the data from written and spoken sources, no better correlation between high frequency
and regularity of form emerges, pace Lieberman et al. (2007). Various linguistic factors
have been associated with the distribution of -ed and -t forms. Grammatical distinctions such as aspect and transitivity have been found for some of them in BrE but not
AmE (Levin 2008), and they could not be upheld in Bauer’s (1993: 8–9) elicitation
study of NZE and WWC. The following set of data from ICE-AUS for the verb learn is
symptomatic of the lack of such distinctions in AusE:
(7) Yet when he learnt English… [ICE-AUS S1A-080:37]
(8) We learnt on the job [ICE-AUS S2B-047:54]
(9) I guess I’ve learnt a costly lesson [ICE-AUS S2B-040:100]
(10) I’ve learnt from that experience [ICE-AUS S1A-047:142]
In those examples learnt is used for both past tense and past participle, transitive
and intransitive, and appears indifferent to aspect or transitivity. The lack of distinction
in AusE is unsurprising, given that it is the dominant form, as shown in Table 3.
The -t forms are invested with stylistic value by some, though paradoxically they are
associated with both formal/literary writing and spoken discourse (Trudgill & Hannah 2002: 56).
In actual speech the -ed suffix may well be uttered as /t/ through the devoicing of the
final consonant. Yet the use of either -t or -ed in transcriptions of speech may simply
be the transcriber’s preferred way of representing it, rather than representing phonetic reality. We have no way of knowing which combinations of factors come into
play with the different speakers, transcribers, writers and editors captured in corpus
data. Their unpredictability would account for the rather variable ratios we see in
Tables 3 and 4.
The extent to which individual and social factors impact on the use of standard,
nonstandard and nonstandardized verbs is not extractable from ICE corpus data, however closely analysed. Individual speech styles may underlie what seem to be regional
differences in the use of nonstandard past forms such as rung in the ICE corpora,
i.e. their apparently greater acceptability outside Great Britain, in the US, Australia
and New Zealand. Apart from regional differences in conversational norms, there is
the further question of how variable the norms are for younger and older speakers.
Judging by their topics of conversation, the average age of speakers in the ICE-AUS and ICE-NZ
corpora is lower than that of those included in ICE-GB (though all are “educated”
speakers, according to the ICE criteria). It seems very likely that the frequency of nonstandard verb usage varies with the age of the conversationalists, for example whether
they are under or over 35, the threshold used by Rayson et al. (1997) in their research
on speech patterns in the British National Corpus. The norms of “standard English”
conversation, including verb selections, are likely to vary with the sociolect, and are not
simply determined by the medium.
6. Sociolinguistic variation in verbal preferences, especially
on the age spectrum
Sociolinguistic variation with irregular verbs can be assessed more directly through
population surveys than corpus evidence. Here we turn to data from linguistic surveys,
especially those conducted in the wider Australian community over the last few years.
These have been designed to target the verbal selections of individuals in particular
syntactic structures (e.g. transitive, intransitive, active, passive) and to correlate their
preferences as far as possible with sociolinguistic aspects such as age and gender. Similar
surveys carried out in New Zealand by Bauer (1987) among others have tended to
use only undergraduate students, and do not therefore show how the age differential
affects linguistic preferences. The data presented below highlight generational variation
within the Australian community in verbal preferences, extracted from surveys
carried out in 2002 and 2007.
The data were returned through questionnaires published in the magazine Australian
Style, with hundreds of respondents (in 2002 there were >1100; in 2007 n = 376). The
set of results shown in Table 5 below focuses on three verbs from the two types we
have been discussing above, which show the typical pattern of variation across the age
spectrum. The test sentences used are listed below.1
Table 5. Nonstandard and nonstandardized verb past form selections by respondents to
Australian Style surveys in 2002 and 2007, separated by /
                         Overall    Age 1 (18–44)   Age 2 (45–64)   Age 3 (65+)
-u- for -a-
shrunk (2002*/2007)      44%/27%    55%/46%         31%/23%         21%/24%
sunk (2002)              37%        47%             22%             13%
sprung (2002/7)          47%/35%    52%/48%         35%/24%         24%/37%
-t for -ed
leant (2002*/2007)       47%/51%    58%/53%         49%/45%         39%/55%
learnt (2002/2007*)      51%/53%    59%/58%         54%/51%         43%/53%
spelt (2002)             73%        79%             71%             71%
Legend: results marked * are based on intransitive test sentences; the others all transitive
The overall results in Table 5 show that the nonstandard/nonstandardized variants command quite a following, much greater than was evident from the
ICE data shown in Table 2 above. In both surveys there was a notable gradation from
Age 1 to Age 3 in the extent to which shrunk, sunk, sprung are used for the past tense.
The readiness to shift to the 2-part paradigm for those verbs suggests they carry no
stigma as far as many respondents are concerned. In every case the nonstandard
forms were most strongly endorsed by the youngest group (up to 44 years). The
effect is there also in the 2002 data for the past forms of -ed/-t verbs, but less consistently in 2007. Overall the results from the 2007 survey suggest a slight reduction in
age-based differentiation from that shown in 2002 over a larger population, as if the
acceptability of the nonstandard(ized) forms among the over-65s is on the rise. The
differences in the returns in 2002/2007 for sprung, leant and learnt are quite remarkable in this regard. There is no consistent pattern of difference between transitive
and intransitive sentences.
1. The test sentences used to elicit past forms of the verb were as follows:
SHRINK  (2002) My old woolly jumper ____ in the wash.
        (2007) The heat ____ the plastic plate to a tiny disk.
SINK    (2002) The dog ____ his teeth into the visitor’s leg.
SPRING  (2002) In heavy seas the ship ____ a leak.
        (2007) The ginger cat ____ the mousetrap.
LEAN    (2002) After the explosion, the wall ____ precariously.
        (2007) He ____ his tired back against the wall.
LEARN   (2002) In those two years they ____ nothing of any use.
        (2007) Despite all the warnings, he never ____.
SPELL   (2002) The article had not ____ their name correctly.
That the youngest group should more freely endorse the nonstandard forms is
perfectly explicable, because they (especially the under 30s) are probably more attuned
to oral/visual culture than to reading books for pleasure, and may therefore have more
limited exposure to the written standard. Yet the discourse of this group is less often captured in the reference corpora, as Minugh (2002: 72) has argued. Both spoken and
written corpora are easily biased towards middle-aged and older citizens, who are the
public speakers and published writers of the community. This applies equally to the
ICE-AUS corpus, even if the conversational data includes more younger speakers. Age
variation was not a factor in the design of ICE corpora, and in any case they are rather
too small to accommodate sociolinguistic parameters as well as those of medium and
text-type. Yet sociolinguistic characteristics are clearly relevant in the distribution of
nonstandard and nonstandardized verb forms, as these Australian surveys show.
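The narrowing of the age differential between the two surveys can be checked directly from the Table 5 percentages. The sketch below, in Python, is an illustration rather than part of the original analysis: it computes the percentage-point gap between Age 1 and Age 3 for the four verbs surveyed in both years. The helper name age_gap is hypothetical, and the comparison is only approximate, since for shrunk, leant and learnt the 2002 and 2007 test sentences differ in transitivity.

```python
# Percentages from Table 5 as (Age 1, Age 3) for each survey year.
table5 = {
    "shrunk": {2002: (55, 21), 2007: (46, 24)},
    "sprung": {2002: (52, 24), 2007: (48, 37)},
    "leant": {2002: (58, 39), 2007: (53, 55)},
    "learnt": {2002: (59, 43), 2007: (58, 53)},
}


def age_gap(youngest, oldest):
    """Percentage-point gap between the youngest and oldest age groups."""
    return youngest - oldest


gaps = {
    verb: {year: age_gap(*pair) for year, pair in years.items()}
    for verb, years in table5.items()
}

# True for every verb whose Age 1 / Age 3 gap is smaller in 2007 than in 2002.
narrowed = {verb: g[2007] < g[2002] for verb, g in gaps.items()}
```

For all four verbs the gap is smaller in 2007 than in 2002, and for leant it reverses slightly, with the over-65s ahead of the youngest group.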
7. Conclusions
Regional differences in the use of variable verb forms have come to light through this
analysis of data from three ICE corpora. We have seen that the two southern hemisphere varieties (AusE and NZE) pattern together in contrast to BrE in their tolerance
of nonstandard past forms in the ring, shrink, spring set, and in their greater use of the
nonstandardized -t suffix for the past forms of verbs such as burn, leap, spell. The deeper
commitment of younger and older Australians to these forms has been demonstrated
through the results of usage surveys, though comparable survey data are not available for New Zealand.
These regional divergences paint a more complex picture of the evolution of
English verb morphology than is provided by unidirectional modeling. The irregularization of some verb forms is clearly a factor which needs accommodating in any
larger model, and the notion of regularization to -ed needs to be accompanied by the
larger concept of modification in the direction of the 2-part paradigm. This would
take account of the gathering strength of the fling/flung paradigm, which is still very
much alive in current English, though perhaps more vital in non-British varieties,
i.e. in North America and the southern hemisphere. It would also accommodate the
currently strong preference in the southern hemisphere for -t forms with verbs like
burn, spell. The 2-part verb paradigm, which includes all existing -ed forms, as well as
the realigned irregular verbs and those with latter-day nonstandardized forms, is the
focus of all these verbal movements, and it provides the most pervasive and essential
verbal contrasts in English.
References
Anderwald, Liselotte. 2007. “ ‘He rung the bell’ and ‘She drunk ale’ – nonstandard past forms in
traditional British dialects and on the internet”. In Marianne Hundt, Nadja Nesselhauf &
Carolin Biewer (eds), Corpus Linguistics and the Web. Amsterdam: Rodopi, 271–86.
Australian Style. 2002 (June, December), 2007 (December). Sydney: Dictionary Research Centre.
Baron, Dennis. 1982. Grammar and Good Taste. New Haven CT: Yale University Press.
Bauer, Laurie. 1987. “New Zealand English morphology: some experimental evidence”.
Te Reo: 37–53.
Bauer, Laurie. 1993. “Progress with a corpus of New Zealand English and some early results”.
In Clive Souter & Eric Atwell (eds), Corpus-based Computational Linguistics. Amsterdam:
Rodopi, 1–10.
Bauer, Laurie. 2001. Morphological Productivity. Cambridge: Cambridge University Press.
Biber, Douglas, Stig Johansson, Geoffrey Leech, Susan Conrad & Edward Finegan. 1999. Longman
Grammar of Spoken and Written English. Harlow, Essex: Pearson Education.
Bybee, Joan. 1985. Morphology: A Study of the Relation between Meaning and Form. Amsterdam:
John Benjamins.
Bybee, Joan. 2006. “From usage to grammar: The mind’s response to repetition”. Language
82(4): 711–33.
Cameron, Deborah. 1995. Verbal Hygiene. London: Longman.
Fee, Margery & Janice McAlpine. 2007. Guide to Canadian Usage. 2nd edn. Oxford: Oxford
University Press.
Fries, Charles C. 1940. American English Grammar. New York NY: Appleton Century Crofts.
Fowler, Henry. 1926. A Dictionary of Modern English Usage. Oxford: Clarendon Press.
Gordon, Ian. 1966. The Movement of English Prose. London: Longman.
Gowers, Ernest. 1965. A Dictionary of Modern English Usage by H.W. Fowler. 2nd edn. Oxford:
Clarendon Press.
Hofland, Knut & Stig Johansson. 1982. Word Frequencies in British and American English.
Bergen: Norwegian Computing Centre for the Humanities.
Huddleston, Rodney & Geoffrey Pullum. 2002. Cambridge Grammar of the English Language.
Cambridge: Cambridge University Press.
Hundt, Marianne. 1998. New Zealand English Grammar: Fact or Fiction? Amsterdam: John
Benjamins.
Hundt, Marianne. 2009. “Colonial lag, colonial innovation, or simply language change?” In Rohdenburg & Schlüter (eds): 13–37.
Hundt, Marianne, Jen Hay, & Elizabeth Gordon. 2004. “New Zealand English morphosyntax”.
In Kortmann et al. (eds): 560–92.
Kortmann, Bernd, Edgar W. Schneider, Rajend Mesthrie & Kate Burridge. 2004. Handbook of
Varieties of English. 2 vols. Berlin: Walter de Gruyter.
Levin, Magnus. 2008. “The formation of the preterite and the past participle”. In Rohdenburg &
Schlüter (eds): 60–85.
Lieberman, Erez, Jean-Baptiste Michel, Joe Jackson, Tina Tang & Martin A. Nowak. 2007. “Quantifying the
evolutionary dynamics of language”. Nature 449 (October): 713–16.
Minugh, David. 2002. “The Coll corpus: towards a corpus of web-based college student newspapers”. In Pam Peters, Peter Collins & Adam Smith (eds), New Frontiers of Corpus Linguistics. Amsterdam: Rodopi, 71–90.
Oxford English Dictionary. 2nd edn. 1989. Oxford: Clarendon Press.
Pawley, Andrew. 2004. “Australian vernacular English: Some grammatical characteristics”. In
Kortmann et al. (eds): 611–42.
Peters, Pam. 1994. “American and British influence on Australian verb morphology”. In Udo Fries,
Gunnel Tottie & Peter Schneider (eds), Creating and Using English Corpora. Amsterdam:
Rodopi, 149–58.
Peters, Pam. 2004. Cambridge Guide to English Usage. Cambridge: Cambridge University Press.
Pinker, Steven. 1999. Words and Rules. London: Weidenfeld and Nicolson.
Quirk, Randolph, Sidney Greenbaum, Geoffrey Leech & Jan Svartvik. 1985. A Comprehensive
Grammar of the English Language. London: Longman.
Rayson, Paul, Geoffrey Leech & Mary Hodges. 1997. “Social differentiation in the use of English
words. Some analyses of the conversational component of the BNC”. International Journal
of Corpus Linguistics 2(1): 133–52.
Rohdenburg, Günter & Julia Schlüter. 2009. One Language, Two Grammars? Differences between
British and American English. Studies in English Language. Cambridge: Cambridge University
Press.
Trudgill, Peter & Jean Hannah. 2002. International English. A Guide to the Varieties of Standard
English. 4th edn. London: Edward Arnold.
Tucker, Susie I. 1961. English Examined. Two Centuries of Comment on the Mother Tongue.
Cambridge: Cambridge University Press.
Pronoun forms
Heidi Quinn
University of Canterbury
This paper compares the distribution of pronoun case forms (I/me, he/him,
she/her, we/us, they/them), non-reflexive myself, and second person plural
variants in corpora of New Zealand, Australian, American, and British
English, with a view to identifying possible regional differences in pronoun
use. While low token numbers prevent a detailed comparison of the four
varieties, the corpus data suggest that the use of I and myself in coordinates
is most strongly favoured in Australian English. Similarly, possessive me is
significantly more frequent in the written Australian English corpus than
elsewhere. The second person plural variant y’all would seem to be confined
to American English, whereas yous(e) occurs only in the New Zealand,
Australian, and British English corpora.
1. Introduction
Kortmann and Szmrecsanyi (2004: 1146, 1154f, 1162–6, 1173f) observe that many
varieties of English share the pronominal variants listed in (1)–(6).
(1) me in coordinate subjects ([Me and you] were gonna do it)
(2) myself in non-reflexive contexts (Pam and myself)
(3) possessive me (me own private business)
(4) demonstrative them (them little fellas)
(5) use of us + noun phrase (us NP) in subject position ([us long distance drivers]
have been...)
(6) special second person plural (2pl) forms (yous(e), you guys)
Biber et al. (1999: 336f) found register-related pronoun case variation in the following
contexts in the corpus of British and American English compiled for the Longman
grammar project:
(7) it-clefts (e.g. it was she/her who stood in long queues)
(8) it BE constructions (e.g. it was I/me)
(9) than comparatives (e.g. than I/me)
Many of the variants in (1)–(9) are also discussed in Wales’s (1996 and 2004) studies of BrE, Quinn’s (2005) written survey of NZE, and various contributions in
Kortmann et al. (2004). However, nobody has so far attempted a systematic comparison of pronoun forms in written and spoken corpora of both northern and southern
hemisphere varieties of English. The aim of this paper is to present the results of
such a comparison and draw attention to the limitations of a purely corpus-based
approach.
2. Data sources and methodology
I examined the distribution of pronoun case forms in the written and spoken corpora
of NZE, AusE, AmE, and BrE listed in Table 1. The study of 2pl variants focused on the
spoken corpora, because the use of forms such as yous(e) and you guys is predominantly
a feature of speech.
Table 1. The corpora used in this study
corpus                                              abbreviation   variety (mode)           approx. size    sampling period
International Corpus of English – New Zealand       ICE-NZ         NZE (spoken)             600 000 words   1990s onwards
International Corpus of English – Australia         ICE-AUS        AusE (spoken)            600 000 words   1990s onwards
Santa Barbara Corpus of Spoken American English     SBC            AmE (spoken)             249 000 words   1990s onwards
Corpus of London Teenage Language                   COLT           BrE teenagers (spoken)   444 831 words   1993
Wellington Corpus of Written New Zealand English    WWC            NZE (written)            1 000 000       1986–90
Australian Corpus of English                        ACE            AusE (written)           1 000 000       1986
Freiburg-Brown Corpus of American English           Frown          AmE (written)            1 000 000       1991
Freiburg-LOB Corpus of British English              FLOB           BrE (written)            1 000 000       1992
The main spoken corpora included in the study are the spoken components of the
ICE-NZ and ICE-AUS,1 and the Santa Barbara Corpus, which forms the unscripted
part of ICE-US.2 Unfortunately, I was unable to obtain access to ICE-GB. I did however examine the distribution of pronoun forms in the COLT, a corpus of BrE teenage
spoken language. Although COLT is not directly comparable to the spoken standard
NZE, AusE and AmE corpora, the COLT recordings do resemble the informal dialogues found in the other corpora. COLT also turned out to yield considerably more
pronoun tokens in a wider range of pronoun constructions than the other corpora, and
thus arguably represents the most suitable kind of corpus for a study of pronoun variation. It seems likely that the patterns found in COLT are indicative of future trends in
BrE, and possibly the other major varieties of English as well.
Given that there is as yet no written ICE-US corpus, and I was unable to gain
access to ICE-GB, I decided to focus on Frown, FLOB, ACE, and WWC for my analysis
of pronoun use in written texts.3 Although the written sections of ICE-NZ and
ICE-AUS cover a sampling period similar to that of Frown and FLOB, in terms of the
text categories covered, Frown and FLOB most closely resemble the slightly earlier
ACE and WWC.
Since the syntactic context is particularly important for determining the status
of a pronoun, I searched all corpus files with the concordancing program AntConc
3.2.1w, which returns a key-word-in-context (KWIC) output.4 The variants targeted
in the searches were: the nominative and accusative pronoun forms I/me, he/him,
she/her, we/us, they/them, possessive my, and all instances of reflexive myself, you,
yous(e) and y’all. The search results obtained were saved as text files and copied into
Excel, so that each pronoun token could be coded for the construction it appeared
in (e.g. coordinate, cleft) as well as its function in the sentence (e.g. subject of finite
clause, object of verb or preposition). Research funding from the Department of
Linguistics at the University of Canterbury enabled me to employ Ruth Hope, Sarah
Kerr, and Theo Dainis to code a large proportion of the nominative and accusative
pronoun tokens in the written corpora as well as the ICE-AUS and COLT. I coded
the 2pl variants as well as the remaining case tokens, and I double-checked all the
coding for consistency.
1. I would like to thank Pam Peters at Macquarie University for granting me access to the
ICE-NZ and ICE-AUS. Special thanks to Yasmin Funk for assistance with accessing the full
text of the ICE-NZ.
2. ICE-US is as yet not completed. The Santa Barbara Corpus transcripts used in this study
were downloaded from the TalkBank website at http://www.talkbank.org/data/Conversation/.
3. I would like to thank Alex D’Arcy and Kate Kearns for obtaining a multiple user licence
for the ICAME corpora through the Linguistics Department at the University of Canterbury.
4. AntConc 3.2.1w was developed by Laurence Anthony and can be downloaded from his
website: http://www.antlab.sci.waseda.ac.jp/software.html. Many thanks to Kate Kearns (p.c.)
for drawing my attention to AntConc and demonstrating how to use it.
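To make the search step concrete, the following Python sketch shows what a key-word-in-context (KWIC) listing looks like for a small set of pronoun forms. It is an illustration only and does not reproduce AntConc or its settings: the function kwic, its parameters and the context width are all assumptions, and the sample text is the transcript line cited later as example (11).

```python
import re


def kwic(text, forms, width=30):
    """Return (left context, keyword, right context) for each whole-word hit.

    Matching is case-sensitive, so "I" is found but "i" is not.
    """
    pattern = re.compile(r"\b(?:" + "|".join(re.escape(f) for f in forms) + r")\b")
    rows = []
    for match in pattern.finditer(text):
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        rows.append((left, match.group(0), right))
    return rows


# Hypothetical usage on a single transcript line.
sample = "And so me and Julie me and Julie Julie and I were first partners"
hits = kwic(sample, ["I", "me", "myself"], width=12)
```

Each row can then be pasted into a spreadsheet and given extra columns for the construction and the syntactic function, which is the coding step described above.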
3. Results
The analysis of the corpus data supports Quinn’s (2005) observation that pronoun
case variation in English is almost entirely confined to instances where the pronoun is
modified or embedded in a more complex syntactic construction. Overall, we find the
greatest degree of variation (and the largest number of pronoun tokens) in coordinated
constructions involving a first person singular (1sg) pronoun.
3.1 Conjoined pronouns
3.1.1 First person singular (I/me)
The examples in (10) illustrate the range of conjunct orders and 1sg forms attested in
coordinate noun phrases that appear as the subject of a finite clause. For simplicity’s
sake I will refer to coordinates in this position as “subject coordinates”, but it is important to bear in mind that this term is only intended to cover subjects of finite clauses.
The variants in (10) are listed in order of decreasing overall frequency.
(10) a. [Dad and I] were listening to the radio this morning [ICE-AUS S1A-066:452]
     b. [me and Carl] kind of looked at it [ICE-NZ S1A-010:116]
     c. ... that [Pam and myself] want to know if ... [ICE-AUS S1A-023:23]
     d. “... [You and me] can be good friends ...” [ACE W12:2213]
     e. ... that [I and many of my colleagues] represented a new generation of
        New Zealanders [WWC G20:089-91]
In all of the corpora in the sample, I and myself occur mainly in final conjuncts, whereas
me is largely confined to initial conjunct position, at least in subject coordinates. These
trends tie in with the patterns reported by Biber et al. (1999: 337ff) for the Longman
Spoken and Written English (LSWE) Corpus, examples from the Survey of English
Usage (SEU) and the ICE-GB cited in Wales (1996: 102f), and the results of a detailed
written survey of NZE speakers discussed in Quinn (2005). As examples (10d–e) illustrate, exceptions to the favoured patterns tend to involve conjunction mates other than
a proper noun: final me is most likely to occur when the initial conjunct is a pronoun,
and initial I will tend to appear only when the following conjunct is a more complex
noun phrase (cf. Wales 1996: 105).
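The conjunct-order categories used in this section (X & I, me & X, X & myself, X & me, I & X) can be stated as a simple rule over the first and last conjunct. The Python sketch below is a hypothetical helper written for illustration; in the study itself the tokens were coded by hand. It assumes a coordinate joined by and in which at most one conjunct is a 1sg form.

```python
# Lower-cased 1sg forms mapped to the spelling used in the category labels.
FIRST_PERSON = {"i": "I", "me": "me", "myself": "myself"}


def classify_1sg_coordinate(coordinate):
    """Return a label such as 'X & I' or 'me & X', or None if no 1sg conjunct."""
    conjuncts = [c.strip() for c in coordinate.split(" and ")]
    first, last = conjuncts[0].lower(), conjuncts[-1].lower()
    if last in FIRST_PERSON:
        return "X & " + FIRST_PERSON[last]
    if first in FIRST_PERSON:
        return FIRST_PERSON[first] + " & X"
    return None


# The bracketed coordinates from examples (10a-e).
examples = [
    "Dad and I",
    "me and Carl",
    "Pam and myself",
    "You and me",
    "I and many of my colleagues",
]
labels = [classify_1sg_coordinate(e) for e in examples]
```

A rule of this kind only recovers the order and form of the 1sg conjunct; deciding whether the coordinate is the subject of a finite clause still requires inspecting the syntactic context.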
Kortmann and Szmrecsanyi (2004: 1154) report that “me instead of I in coordinate
subjects” is one of the morphosyntactic features most widely attested in their survey
of 46 varieties of English around the world. The corpus data from my sample support
this to a certain extent, in that we do find instances of me and X (10b) in subject
position in all of the corpora. However X and I (10a) is clearly favoured in the
written corpora and in most of the spoken corpora, the only exception being the
COLT (see Table 2). The teenagers in COLT display a strong overall preference for
me (and other accusative pronoun forms) in all contexts where we find pronoun
case variability. It is possible that this trend signals the future of the English pronoun
system (cf. Quinn 2005: 383).
Table 2. 1sg forms in coordinates functioning as the subject of a finite clause
subject coordinates     X & I     me & X    X & myself   X & me    I & X     Total tokens
spoken    ICE-NZ        72.22%    25.00%    2.78%        0.00%     0.00%     36
          ICE-AUS       85.11%    6.38%     8.51%        0.00%     0.00%     47
          SBC           67.86%    32.14%    0.00%        0.00%     0.00%     28
          COLT          22.73%    75.00%    0.00%        2.27%     0.00%     44
written   WWC           93.65%    0.00%     3.17%        0.00%     3.17%     63
          ACE           92.31%    3.85%     1.92%        1.92%     0.00%     52
          Frown         88.51%    5.75%     1.15%        2.30%     2.30%     87
          FLOB          86.11%    8.33%     0.00%        2.78%     2.78%     36
Among the remaining spoken corpora, the preference for X and I is markedly
greater in ICE-AUS than in ICE-NZ and SBC. ICE-AUS also contains more tokens of
X and myself than the other corpora. Since the use of me in subject coordinates overall
is largely confined to informal conversations (actual or fictional), it might be tempting
to put some of the differences in case patterns down to differences in the content of
the corpora. While 40% of the spoken ICE corpora are monologues, COLT and SBC
consist largely of informal conversations. However, the proportion of subject coordinates involving me is still significantly lower in the ICE-AUS than in the other spoken
corpora if we include only tokens from dialogues/conversations in our analysis (Table 3).5
As example (11) illustrates, we even find an example of self-correction from me and X
to X and I in an informal conversation in the ICE-AUS.
(11) And so [me and Julie] [me and Julie] [Julie and I] were first partners
     [ICE-AUS S1A-089:293]
5. The chi-square and p-values were calculated using the web-based calculator on
http://www.physics.csbsju.edu/cgi-bin/stats/contingency and the CHITEST function in Excel.
Table 3. The frequency of subject coordinates involving me in conversations
conversations   me & X, X & me   Total 1sg tokens in    chi-square and p-value for
                                 subject coordinates    difference to ICE-AUS
ICE-AUS         8.11%            37
ICE-NZ          30.00%           30                     χ2 = 5.40, p < 0.03
SBC             32.14%           28                     χ2 = 6.12, p < 0.02
COLT            77.27%           44                     χ2 = 38.7, p < 0.0001
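The chi-square values in Table 3 can be reproduced from the table itself. The Python sketch below is not the calculator used in the study. It assumes a Pearson chi-square on a 2 x 2 table without continuity correction, and it recovers the raw counts of me-coordinates by applying the printed percentages to the printed totals (for example, 8.11% of 37 is 3 tokens). The function names are illustrative.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


def p_value_df1(chi2):
    """Upper-tail probability of a chi-square value with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))


# (tokens involving me, total 1sg tokens), recovered from the Table 3 percentages.
counts = {"ICE-AUS": (3, 37), "ICE-NZ": (9, 30), "SBC": (9, 28), "COLT": (34, 44)}


def compare_with_ice_aus(corpus):
    """Return (chi-square, p-value) for the difference between a corpus and ICE-AUS."""
    me_aus, total_aus = counts["ICE-AUS"]
    me_other, total_other = counts[corpus]
    chi2 = chi_square_2x2(me_aus, total_aus - me_aus, me_other, total_other - me_other)
    return chi2, p_value_df1(chi2)


results = {name: compare_with_ice_aus(name) for name in ("ICE-NZ", "SBC", "COLT")}
```

Under these assumptions the sketch returns chi-square values that round to the 5.40, 6.12 and 38.7 printed in Table 3, with p-values below the stated thresholds.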
Collins (1989: 146) reports that the sentence This decision will come between you
and I was given an 80 per cent acceptability rating for formal contexts in an evaluation
test involving Australian participants. Similarly, Bauer (2002: 107) observes that New
Zealand undergraduate students now consider I in contexts such as He saw you and I
and between you and I to be “better or more formal English” than me. He also speculates that this use of I may become the norm in varieties of English around the world
in fifty years’ time (Bauer 2002: 107). However, comments by Biber et al. (1999: 338f)
would seem to suggest that in the Longman corpus, which contains only British and
American texts, X and I is largely limited to subject position.
More than thirty years ago, Walshe (1972: 277) observed that “Myself seems
to be as much used as me by Australians and New Zealanders in sentences like
‘They have issued the invitation to the secretary and myself (me)’ ”. According to
Walshe, usage surveys in Britain and the US suggested that the majority of people
there preferred me. While Biber et al. (1999: 339) comment that self-forms “provide
a convenient way of avoiding a choice between a nominative and accusative case
form”, they note only occasional instances of myself in subject coordinates, and do
not provide any evidence that conjoined myself also occurs in other positions in the
Longman corpus.
The corpus data summarized in Table 4 indicate that Australians and New Zealanders
may be more ready than speakers of AmE and BrE to use X and I and also X and myself
in contexts where lone pronouns tend to surface in their accusative forms.6 Interestingly,
the majority of tokens come from informal conversations, where we might expect to
find a higher use of me. It would obviously be important to check the figures in the
ICE-GB, but it is worth noting that none of Wales’s (1996: 105ff, 194f) examples of
I and myself in non-subject coordinates come from the ICE-GB.
Table 4. 1sg forms in coordinates that do not function as subject of a finite verb
                        me & X    X & I     X & me    X & myself   myself & X   Total tokens
spoken    ICE-NZ        45.83%    29.17%    0.00%     20.83%       4.17%        24
          ICE-AUS       14.29%    42.86%    17.86%    21.43%       3.57%        28
          SBC           58.33%    8.33%     33.33%    0.00%        0.00%        12
          COLT          78.79%    3.03%     15.15%    0.00%        3.03%        33
written   WWC           26.67%    26.67%    30.00%    13.33%       3.33%        30
          ACE           45.00%    5.00%     30.00%    15.00%       5.00%        20
          Frown         33.33%    9.09%     48.48%    3.03%        6.06%        33
          FLOB          36.84%    15.79%    26.32%    15.79%       5.26%        19
6. The figures in Table 4 relate only to non-reflexive uses of myself, where the pronoun is not
coreferential with another argument in the clause. Any instances of reflexive myself, as in the
sentence below, were excluded from the analysis:
Several times, I contemplated flinging myself and the baby out into the rain.
When considering the results in Table 4, it is important to keep in mind that the
figures generalize over a wide range of different syntactic contexts as shown in (12)–(21)
below, which arguably have quite distinct case properties (cf. Quinn 2005), and are not
equally represented in all of the corpora.
(12) object of a preposition
a. she was saying to [Josh and I] the other day... [ICE-NZ S1A-025:268]
b. so Saturday night you came with [Paul and myself] to euro
[ICE-NZ S1A-035:123]
(13) object of a verb
they were effusive in thanking [herself and myself] [ACE A41:8879]7
(14) possessive
Yep yep, it was [Firstname1 and I’s] celebration [ICE-AUS S1A-004:270]
(15) right-dislocation
a. we were there by ourselves [my brother and I] [ICE-AUS S1A-037:55]
b. we were playing darts [this other fellow and myself] [ICE-NZ S1B-048:73]
(16) subject of a V-ing construction
a. And then we’re gonna watch um the video that we took at the Leadership
Retreat [Firstname2 and I] playing darts and and um what else is on it
Firstname3 playing pool [ICE-AUS S1A-013:289–92]8
b. One autumn morning saw [Gordon and myself] cruising across the bay
[ACE E12:2387]
(17) subject of a small clause
... an underlying complicity a sense of [you and I] against the world
[ICE-NZ S2B-034:176]
(18) subject of a to-infinitive
And it’s time for [Jean and I] to to run away [ICE-AUS S1B-076:177]
(19) gapping
Two of my fellow ordinands, Frank Kennedy and Cyril Butler, went to the
Pacific Islands, Jim Beban to a parish, and [Denis Scully and I] to college work.
[WWC G35 007-010]
(20) identificational construction
that’s [Gilbert, Hamish, and myself] [ICE-NZ S1B-032:87]
(21) it-cleft
was it [you and I] who were talking about you [ICE-NZ S1A-053:74]
7. This example, which comes from a review of a catering company, contains the only
attested non-reflexive use of a third person (3ps) self-form in my sample. Non-reflexive 3ps
self-forms are generally associated with special meanings. In this case, herself is used to refer
to the wife/partner of the author.
3.1.2 Other pronouns
Wales (1996: 105) and Biber et al. (1999: 337f) note that while the 1sg nominative
I is largely restricted to final conjunct position, the third person singular masculine
(3sgM) nominative he mostly appears initially in the British and American corpora
they consulted. Quinn (2005) reports a similar ordering preference for other third
person (3ps) nominatives (she, they) and the first person plural (1pl) nominative (we).
According to a summary table presented in Biber et al. (1999: 337), the case of 3sgM
pronouns in subject coordinates is strongly influenced by register in the Longman
corpus: the accusative form him appears to be favoured in conversation (where it
occurs in both initial and final conjunct position), whereas in written registers (fiction
and news) there is an overwhelming preference for the nominative he.
We find the same preference for subject coordinates of the form he/she and X (22)
in all of the written corpora investigated in this study (see Table 5), with occasional
instances of X and he/she (23).
(22) a. [He and Hilary] are keen to retain the bush [WWC E32:099]
b. So [she and her husband] set out to make the place liveable [ACE E07:1379]
(23) a. [my father and he] were soon in conversation [WWC K51:192-3]
b. [Her mother and she] had shared it [FLOB K25:40]
8. The use of X and I here contrasts nicely with the lone me in the parallel V-ing construction
me being an idiot [ICE-AUS S1A-013:293] which was uttered by another speaker immediately
after (16a).
Interestingly, the only example of an accusative 3sg pronoun in a subject coordinate
appears with a 3sg verb form (24). As the example in (25) indicates, even lone unmodified subject pronouns occur in their accusative form when there is no person/number
agreement between the subject and the verb.9
(24) I imagine of all the people in the family, [me and her] was the closest.
[Frown R04:16-17]
(25) Oh me’s old I thought to myself [ICE-AUS S1A-040:32]
Table 5. 3sg forms in coordinates functioning as the subject of a finite clause
(raw token numbers)10
subject coordinates     he/she & X   him/her & X   X & he/she   X & him/her   Total tokens
spoken    ICE-NZ        11           4             0            1             16
          ICE-AUS       11           3             0            0             14
          SBC           5            2             0            1             8
          COLT          0            4             0            4             8
written   WWC           52           0             2            0             54
          ACE           36           0             1            0             37
          Frown         74           0             2            1             77
          FLOB          63           0             2            0             65
As in the Longman corpus, the favoured options in the spoken corpora are he/she and
X (26) and him/her and X (27), with occasional instances of X and him/her (28). The
small token numbers prevent us from drawing any meaningful comparisons between
the different spoken corpora, but it is worth noting that the absence of conjoined 3sg
nominative tokens from COLT ties in with the teenage speakers’ strong preference for
the 1sg accusative form me in the same context.
(26) a. [he and I] are pretty the same era [ICE-AUS S1B-055:162]
b. [she and her daughter] were comparing... [ICE-NZ S1B-044:46]
(27) a. [him and his erm girlfriend] are the champions [COLT B137804]
b. [her and Calvin] were coming down [ICE-NZ S1A-008:102]
(28) a. [me and him] were just sitting there [ICE-NZ S1A-054:77]
b. do you think [Lucy and her] were sad [COLT B142706]
9. For a detailed discussion of such links between case and subject-verb agreement in Belfast
English see Henry (1995: 16–43).
10. Since 3sgM (he/him) and third person singular feminine (3sgF) forms (she/her) exhibit
the same distribution and the overall token numbers in the spoken corpora are very low, Table 5
shows the combined 3sg token numbers.
In the ICE-AUS, we find self-corrections from nominative to accusative (29) as well as
accusative to nominative forms (30) in informal conversation, which would seem to
suggest that speakers differ in which 3sg case form they consider most appropriate for
subject coordinates.11
(29) [He him him and and the brood] are all going to Queensland
[ICE-AUS S1A-098:78]
(30) And then [her she and Stephen] hol huddled away little secrets you know
[ICE-AUS S1A-089:51]
In coordinates that do not function as the subject of a finite clause, 3sg and first person
plural (1pl) pronouns generally appear in initial conjunct position and surface in their
accusative forms (31)–(33).12 Examples (31b) and (33) illustrate nicely how this case
and ordering preference differs from the favoured 1sg pattern in the ICE-AUS.
(31) right-dislocation
a. He’s happy we’re going sailing on Saturday just [him and me]
[ICE-AUS S1A-022:293]
b. It’s hard because we’re all in the same boat just [her you and I].
[ICE-AUS S1A-024:376]
(32) object of a verb
I’ll give [her and her policy policies] one last chance [ICE-NZ S1B-033:181]
11. It is of course possible that (29)–(30) simply reflect a difference in case preferences for
3sgM and 3sgF. Low token numbers prevent any statistical comparison of the distribution of
3sgM and 3sgF forms in coordinates, but it would be interesting to investigate this question in
a more controlled empirical study. The survey results reported in Quinn (2005: 115) indicate
that there are at least some speakers of NZE who favour him over he and she over her in initial
conjuncts of subject coordinates.
12. Since in most of the corpora we only find a handful of tokens that cover a wide range of
not necessarily comparable contexts for each pronoun, I have decided not to give a table with
the results. The general trends discussed in this section appear to hold for all the corpora in
the sample.
(33) object of a preposition
questions which arose for [Bob and I] as partners and for [us and our children]
[ICE-AUS S2A-042:113]
The accusative case is also favoured for 3pl pronouns in this context, but the position
of the pronoun in the coordinate tends to depend on the complexity of the conjunction
mate. In both Australian corpora, them appears in initial position when the other
conjunct is a full noun phrase (34), but in final position when the conjunction mate is
a pronoun (35).
(34) And I love finding out about [them and their language] [ICE-AUS S1A-037:178]
(35) all of these things are matters between [us and them] [ICE-AUS S1B-046:252]
In ICE-NZ, we find an instance of final them conjoined with a proper noun, but them
is reinforced with the quantifier all, which gives the final conjunct the expected greater
complexity (36).
(36) stayed with um [Michelle and all them] [ICE-NZ S1A-054:174]
3.2 Pronouns in identificational constructions, clefts,
and than-comparatives
Biber et al. (1999: 336f) observe that case variation in it-clefts, it BE constructions, and
than-comparatives is partly linked to register in the Longman corpus, with nominatives
almost entirely confined to written texts.
There are hardly any tokens of it BE (37a,c,d) and other identificational constructions (37b,e) in the corpora included in this sample, but all pronouns tend to surface in
their accusative form in this context, especially in conversations and fictional dialogue
(37). The only clear examples of unmodified nominatives come from the ICE-AUS and
WWC (38), largely from more formal texts.
(37) a. “How did you know [it was me]?” asked Erich [WWC L22:029]
b. yeah [that was him] [ICE-NZ S1A-010:192]
c. well [it probably is her] [ICE-NZ S1A-005:114]
d. “[If it wasn’t him] [it’d be us]” [WWC K67:180-1]
e. because [you are them] and they are you [ICE-NZ S1B-045:41]
(38) a. No [it was I] [ICE-AUS S1A-082:301]
b. Descendants of Robert Palmer, still living in Kaikoura, claim [it was he]
[WWC E33:014-015]
c. And [perhaps it was she] and perhaps it was not. [WWC L19:023-024]
d. They’re even exemplified in the Waltzing Matilda Down came the troopers
one two three [These were they] [ICE-AUS S2A-014:195-7]
It-clefts are fairly rare overall, and occur predominantly in written texts. The few
attested tokens do not provide any evidence for regional differences, but would seem
to support observations by Wales (1996: 96) and Quinn (2005: 134f) that the tendency
towards nominative case is strongest when the cleft focuses a 3sg pronoun and the
clause is introduced by a wh-pronoun relativizing the subject. Across the corpora, 3sg
pronouns almost always surface in the nominative form in such subject clefts (39),
whereas there is no clear case preference with 1sg (40). However, it is worth noting
that virtually all tokens of the accusative form me occur in actual (40c) or fictional
dialogue (40b).
(39) a. as if it was he who had financed the Apollo programme [WWC L17:034]
b. because it was she who explained it to me afterwards [ACE W08:1443]
c. is it he who makes it fictional in that game [ICE-NZ S1B-004:100]
(40) a. My friend told him [it was I who did the work] [WWC A41:060]
b. “It’s me who falls apart.” [ACE P13:2473]
c. It was actually me who picked it [ICE-NZ S1A-001:87]
The majority of than comparatives found in this sample involve a 1sg pronoun. In
keeping with their general preference for accusative forms, the teenage speakers in
COLT consistently opted for me. However, we find case variation in all of the written
corpora as well as in ICE-NZ and ICE-AUS (41)–(42). The choice of case form does
not clearly correlate with the formality of the text or the syntactic properties of the
comparative construction,13 although it is conceivable that such factors would emerge
as significant in a larger sample (cf. Collins & Peters 2004: 605).
(41) a. Technical people could explain in far more detail [than I] the method that
he has used... [WWC G50:173-4]
b. This great lout was taller [than I] and seemingly quite fearless.
[ACE F16:2971]
c. you could do a better job back announcing it [than I] [ICE-NZ S1B-024:14]
d. those more knowledgeable [than I] on such matters have suggested...
[ICE-AUS S2B-049:132]
(42) a. Everyone else seemed to be better at it [than me] [WWC F06:039-040]
b. The statue had become a boy some years older [than me] [ACE L13:2567]
c. Hannah you’re older [than me] [ICE-NZ S1B-039:136]
d. one who actually I’m sure found the whole thing even more distressing
[than me] [ICE-AUS S1A-095:20]
13. See Quinn (2005: 85f) for a brief discussion of different types of than comparatives.
3.3 Possessive me
Burridge (2004: 1118) identifies possessive me as a feature that characterizes the
“vernacular varieties” of AusE and NZE. Given that the corpora in this sample were
mostly designed to represent regional standards, it is not surprising that possessive me is
generally rare and largely confined to written representations of very casual speech (43).
(43) “[Me bloody finger] is broken and poisoned...” [ACE G22:4696]
The only spoken corpus where me occurs more than a handful of times is the COLT,
which contains the most informal conversations among the youngest speakers in the
sample, and also has the highest overall number of possessive 1sg tokens (Table 6).
Table 6. Variation between possessive me and my (raw token numbers)

possessive      me      my     Total tokens
spoken
  ICE-NZ         4    1029        1033
  ICE-AUS        3    1235        1238
  SBC            0     993         993
  COLT          14    2403        2417
written
  WWC            9    2077        2086
  ACE           19    1416        1435
  Frown          6    1958        1964
  FLOB           5    1572        1577
Among the written corpora, possessive me is significantly more frequent in the
ACE than in WWC, Frown and FLOB (Table 7). Table 7 focuses on data from the
biographical (G) and fiction sections (K onwards) of the corpora, where we are most
likely to find informal dialogue. The higher rate of possessive me in the ACE would
seem to suggest that the variant is particularly common in vernacular AusE, or is at
least perceived as such by Australian writers.
Table 7. Possessive me in biographical writings and fiction

biography/essays   me      Total 1sg           chi-square and p-value
and fiction                possessive tokens   for difference to ACE
ACE                2.03%    937
WWC                0.54%   1665                χ2 = 12.6, p < 0.0005
Frown              0.40%   1482                χ2 = 14.9, p < 0.0005
FLOB               0.41%   1211                χ2 = 12.5, p < 0.0005
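The comparison summarized in Table 7 can be retraced with a short calculation. The sketch below is illustrative only and rests on two assumptions that are not stated in the study: that the rounded percentages correspond to 19, 9, 6 and 5 tokens of possessive me (the written-corpus counts given in Table 6), and that a Pearson chi-square test on a 2 × 2 table without continuity correction is a reasonable stand-in for the test actually used. All function and variable names are hypothetical.

```python
import math

# (possessive "me" tokens, total 1sg possessive tokens) per corpus.
# Assumption: counts reconstructed from the percentages in Table 7.
TABLE_7 = {
    "ACE": (19, 937),
    "WWC": (9, 1665),
    "Frown": (6, 1482),
    "FLOB": (5, 1211),
}


def pearson_chi_square(me_a, total_a, me_b, total_b):
    """Pearson chi-square (df = 1, no continuity correction) for a 2x2 table."""
    a, b = me_a, total_a - me_a          # corpus A: me vs. other 1sg possessives
    c, d = me_b, total_b - me_b          # corpus B: me vs. other 1sg possessives
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With one degree of freedom the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


def compare_to_ace(table=TABLE_7):
    """Compare every corpus in the table with ACE."""
    ace_me, ace_total = table["ACE"]
    return {
        name: pearson_chi_square(ace_me, ace_total, me, total)
        for name, (me, total) in table.items()
        if name != "ACE"
    }


if __name__ == "__main__":
    for corpus, (chi2, p_value) in compare_to_ace().items():
        print(f"ACE vs {corpus}: chi-square = {chi2:.2f}, p = {p_value:.5f}")
```

Under these assumptions the statistics should come out close to the values in Table 7. Small discrepancies are to be expected, since the exact counts and test variant behind the published figures are not given.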
3.4 Demonstrative them and us NP
Wales (1996: 100) notes that in informal English, a 1pl pronoun followed by a noun
phrase often surfaces as us, even when the whole construction appears as the subject
of a finite clause (44).
(44) I believe that if [us women] get in there we can make decisions quicker
[ICE-NZ S2A-028:116]
Comments by Wales (1996: 100) and Pawley (2004: 635) suggest that this use of us is
less stigmatized than demonstrative them (45).
(45) “[Them men] were ’ere, Grandad.” [WWC K99:063]
There are not many tokens of 1pl NP subjects in the corpora investigated here, and
hardly any of them involve the accusative form us. The corpus data do however
provide some support for Wales’s (1996: 100) and Burridge’s (2004: 1118) classification of demonstrative them as a feature of nonstandard varieties. In keeping with
Wales’s (1996: 100) observation that them NP is pervasive in the London vernacular, most instances of demonstrative them occur in the COLT (see Table 8). In the
written corpora, them NP appears only in fictional dialogue and typically cooccurs
with other features characteristic of vernacular English, such as possessive me and
reflexive meself (46).
(46) a. “Me multigrips! I’ve been lookin’ for them little fellas for weeks!”
[ACE R03:528]
b. and I used to say to meself: “Them chooks are eating their heads
off in there” [ACE R13:2744]
Table 8. Variation between demonstrative them and those

              them NP   those NP   Total tokens
spoken
  ICE-NZ          1        458         459
  ICE-AUS         1        536         537
  SBC             2        202         204
  COLT           61        162         223
written
  WWC             4        375         379
  ACE             7        301         308
  Frown          16        463         479
  FLOB            4        346         350
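Because the corpora differ considerably in the number of demonstrative tokens they contain, the raw counts in Table 8 are easier to compare as proportions. The following sketch simply converts the counts from Table 8 into the percentage of them NP among all them/those NP tokens. The names used are hypothetical, and the calculation is offered as a reading aid rather than as part of the original analysis.

```python
# (them NP, those NP) raw token numbers, copied from Table 8.
TABLE_8 = {
    "ICE-NZ": (1, 458),
    "ICE-AUS": (1, 536),
    "SBC": (2, 202),
    "COLT": (61, 162),
    "WWC": (4, 375),
    "ACE": (7, 301),
    "Frown": (16, 463),
    "FLOB": (4, 346),
}


def them_share(them, those):
    """Percentage of demonstrative tokens realized as them NP."""
    return 100 * them / (them + those)


def them_shares(table=TABLE_8):
    """Map each corpus to its percentage of them NP."""
    return {corpus: them_share(them, those) for corpus, (them, those) in table.items()}


if __name__ == "__main__":
    for corpus, share in them_shares().items():
        print(f"{corpus:8s} {share:5.1f}%")
```

On these figures COLT stands well apart from the other corpora, in line with the observation that most instances of demonstrative them occur there.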
Table 9. The most common distinct 2pl forms in the spoken corpora, listed by relative
frequency (raw token numbers in brackets)14

ICE-NZ            ICE-AUS           SBC               COLT
you guys (22)     you guys (12)     you guys (51)     you lot (49)
you all (7)       you all (8)       y’all (7)         you two (29)
yous (7)          you two (7)       you people (3)    you guys (8)
you people (6)    you people (3)    you all (2)       youse (8)
you lot (2)       you blokes (2)                      you all (3)
                  you both (2)                        you people (3)
                  you girls (2)                       you three (2)
                  you three (2)                       youse lot (2)
                  youse (2)                           youse two (2)
3.5 2pl variants
As Wales (2004: 179) points out, many spoken varieties of English have developed
new distinctive 2pl forms to make up for the loss of the original second person
singular/plural distinction. Table 9 lists all the distinct 2pl forms that occurred
more than once in the individual spoken corpora. 14
You guys, which appears to have originated in AmE (cf. Montgomery 2001: 149;
Butters 2001: 333), is easily the most popular form overall and refers just as readily
to females as males. For example, the question in (47a) is followed soon after by the
explanation in (47b), which makes it clear that at least some females are among the
intended referents for you guys.
(47) a. what did you guys get up to [ICE-NZ S1A-037:24]
b. oh we all helped out Suzy and Leigh and I did a bit [ICE-NZ, S1A-037:29-30]
The predominance of you lot in COLT ties in with observations by Butters (2001: 333)
and Wales (2004: 171f) that you lot is common but stigmatized in colloquial BrE and
tends to be used somewhat disparagingly. However, the following passage from a
public lecture illustrates that, in NZE at least, you lot may also be used in a more formal
context and does not necessarily have negative connotations:
(48) things may have picked up in the next five years by the time you lot come
through [ICE-NZ S2A-027:58]
14. The syntactic status of you all is not always clear when you is the subject of a clause
containing only a lexical verb, as in (i). In clauses like (i), the all could be part of the noun
phrase containing you, or it could occupy the same lower floating quantifier spot as in (ii).
i. which one symbol you all had in common [ICE-NZ S1A-036:367]
ii. which one symbol you’ve all had in common
The token numbers for you all in Table 9 include only instances where all forms a clear unit
with you and could not possibly be analyzed as a floating quantifier.
Yous(e) appears in the spoken NZE, AusE and BrE corpora (49), but not in SBC.
(49) a. we’ve been asked to let yous know... [ICE-NZ S1B-010:201]
b. even though youse’ve gone to uni youse’ve still merged
[ICE-AUS S1A-050:257]
c. why don’t youse two work together? [COLT B140701]
It is possible that the absence of yous(e) from the SBC is due to the West Coast origins
of the corpus (Katie Drager, p.c.). According to Wales (2004: 183) and Montgomery
(2001: 149), yous(e) can be heard in AmE varieties from the mid-West eastwards.
Maybe not surprisingly, given its strong association with southern US English
(cf. Butters 2001: 332), the contracted form y’all occurs only in the SBC. Example (50)
suggests that y’all has a pronominal status equivalent to you and yous(e) which allows
it to be followed by a numeral (49c)–(51).
(50) NICK: This way y’all two can see it [SBC 057]
(51) What do you two get up to [ICE-AUS S1A-076:259]
4. Conclusion
As the results of this study illustrate, there are considerable drawbacks to basing an
investigation of pronoun variants entirely on corpus data. Constructions that give rise
to pronoun case variation are comparatively rare in conversation and even written texts,
so that it is often difficult to obtain a sufficient number of comparable pronoun tokens
in the full range of variable contexts. The corpus data considered here do nevertheless
provide some evidence for possible differences between AusE, NZE, AmE, and BrE: The
use of the 1sg nominative form I in coordinates appears to be most widespread in
spoken AusE, and possessive me is significantly more frequent in the written AusE
corpus than elsewhere. While speakers of AusE and NZE readily use non-reflexive
myself in non-subject coordinates, even in informal conversation, this variant is virtually absent from the spoken AmE and BrE corpora. Many of the 2pl variants attested in
the corpora are found in all four varieties, but y’all would seem to be confined to AmE,
and yous(e) occurs only in the NZE, AusE, and BrE corpora. It would be interesting to
see whether more structured elicitation could yield additional insights into the regional
trends identified here.
References
Bauer, Laurie. 2002. An Introduction to International Varieties of English. Edinburgh: Edinburgh
University Press.
Biber, Douglas, Stig Johansson, Geoffrey Leech, Susan Conrad & Edward Finegan. 1999. Longman
Grammar of Spoken and Written English. Harlow: Pearson Education.
Burridge, Kate. 2004. “Synopsis: Morphological and syntactic variation in the Pacific and
Australasia”. In Kortmann et al. (eds): 1116–31.
Butters, Ronald R. 2001. “Grammatical structure”. In John Algeo (ed.), The Cambridge History of
the English Language, Vol. VI: English in North America. Cambridge: Cambridge University
Press, 325–39.
Collins, Peter. 1989. “Divided and debatable usage in Australian English”. In Peter Collins &
David Blair (eds), Australian English. St Lucia: University of Queensland Press, 138–49.
Collins, Peter & Pam Peters. 2004. “Australian English: Morphology and syntax”. In
Kortmann et al. (eds): 593–610.
Henry, Alison. 1995. Belfast English and Standard English: Dialect Variation and Parameter
Setting. New York: Oxford University Press.
Kortmann, Bernd, Kate Burridge, Rajend Mesthrie, Edgar W. Schneider & Clive Upton (eds).
2004. A Handbook of Varieties of English, Vol. 2: Morphology and Syntax. Berlin: Mouton
de Gruyter.
Kortmann, Bernd & Benedikt Szmrecsanyi. 2004. “Global synopsis: Morphological and syntactic variation in English”. In Kortmann et al. (eds): 1142–1202.
Montgomery, Michael. 2001. “British and Irish antecedents”. In John Algeo (ed.), The Cambridge
History of the English Language, Vol. VI: English in North America. Cambridge: Cambridge
University Press, 86–153.
Pawley, Andrew. 2004. “Australian Vernacular English: Some grammatical characteristics”. In
Kortmann et al. (eds): 611–42.
Quinn, Heidi. 2005. The Distribution of Pronoun Case Forms in English [Linguistik Aktuell/
Linguistics Today 82]. Amsterdam: John Benjamins.
Wales, Katie. 1996. Personal Pronouns in Present-Day English [Studies in English Language].
Cambridge: Cambridge University Press.
Wales, Katie. 2004. “Second person pronouns in contemporary English”. Franco-British Studies
33–4: 172–85.
Walshe, Robert D. 1972. “Guide to usage and style”. In George W. Turner (ed.), Good Australian
English and Good New Zealand English. Sydney: Reed Education, 241–310.
Hypocoristics in New Zealand
and Australian English
Dianne Bardsley & Jane Simpson
New Zealand Dictionary Centre/University of Sydney
New Zealand and Australia share a propensity to create new words and
hypocoristic forms of existing words by adding -ie and -o suffixes (among others)
to a base which is usually monosyllabic. While the creation of new words is
driven by the need to refer quickly to new things, the creation of hypocoristic
alternatives is driven partly by the desire to identify with a group’s particular
way of talking. The distribution of hypocoristic forms is similar across both
countries, except for the greater use of the -o ending in Australia, especially in
naming occupations and in fishing. Across different semantic domains there is a
greater range of suffixes to be found in proper names (personal, geographic and
institutional) than in common nouns.
1. Introduction
Many hypocoristics of common nouns, personal names and placenames are in widespread use in New Zealand and Australia, though they are most common in casual
speech and informal writing. While many are spontaneous formations, or restricted to
particular trades and hobbies, some, such as rego ‘registration’, compo ‘compensation’,
info ‘information’, have made their way into common use. Most consist of one syllable
from the base form followed by a vowel (/i/ written “y” or “ie” or “ey”, /oʊ/ written “o”
or “oh”, or /ə/ written “-er” or “-a”) or other ending (-s, or -as written “-ers” or “-as”),
as blowie ‘blowfly’, journo ‘journalist’, acca ‘cadet who enjoys academic work’, turps
‘turpentine’, preggers ‘pregnant’. Occasionally other sound changes occur, as in the s > z
alternation found in Aussie ([z]) from Australian ([s]). Several of the endings are
phonetically identical to endings used for creating words for new things which have no
existing names (the agentive suffix “-er” which creates nouns from verbs, and its derivative the processive suffix (Corne 1998), and the adjectival suffix “-y” which creates
adjectives from nouns). It is sometimes difficult to distinguish between new coinages
and hypocoristics of existing words, and so in this paper we will treat both together.
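The inventory of endings described above can be made concrete with a small sorting routine. The sketch below is a purely orthographic heuristic, not a method used in this paper: it assigns a written form to one of the ending types named in the text on the basis of its final letters, and it cannot tell a hypocoristic from an ordinary word that happens to end the same way. The labels and function name are hypothetical.

```python
# Written endings named in the text, longest spellings first so that
# "preggers" is matched by "ers" before the plain "s" ending is tried.
ENDINGS = [
    ("ers", "-ers/-as"),
    ("as", "-ers/-as"),
    ("ie", "-ie/-y"),
    ("ey", "-ie/-y"),
    ("oh", "-o"),
    ("er", "-er/-a"),
    ("y", "-ie/-y"),
    ("o", "-o"),
    ("a", "-er/-a"),
    ("s", "-s"),
]


def classify_ending(form):
    """Return the ending type suggested by the spelling, or None if none fits."""
    word = form.lower()
    for spelling, label in ENDINGS:
        if word.endswith(spelling):
            return label
    return None


if __name__ == "__main__":
    for example in ["blowie", "journo", "acca", "turps", "preggers"]:
        print(example, classify_ending(example))
```

Applied to the examples cited above, the routine groups blowie with the /i/ forms, journo with the -o forms, acca with the /ə/ forms, turps with -s and preggers with -ers/-as.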
Australian hypocoristics have attracted the attention of numerous authors over
the last three decades, including Dabke 1976; Dermody 1980; Wierzbicka 1984, 1991;
McAndrew 1992; Mühlhäusler 1983; Poynton 1984; Taylor 1992, 1993; Simpson 2001;
2004; Skelt 2002; Sussex in prep. Kiesling (2006: 78) suggests that the use of hypocoristics is a feature of AusE that distinguishes it from NZE, but this is very definitely
not the case. The use of hypocoristics is far from recent in New Zealand: the database
of more than 1150 hypocoristics maintained at the New Zealand Dictionary Centre
records usages from the 1800s. The Australian data in this paper come from a database
of nearly 2000 forms collected by the second author.1
One of the earliest of New Zealand’s recorded usages is beacher (1844), which
was the name given to a whaler or sailor who set up home on the New Zealand
coast.2 Other terms, such as spotter and spoiler, ‘self-interested land speculators’ were
recorded from 1856, while gummy ‘gum-digger’ has been found in print since 1890.
Many early New Zealand forms come from particular occupational domains. Such
examples include occupational roles within the early freezing industry: beefy ‘beef
butcher or killer on freezing works beef chain’, chainie ‘freezing works chainman’, cully
‘animal due for culling’, and guttie ‘employee in freezing works gut-house’. Similarly,
roles within the nineteenth century harvesting domain include baggie ‘harvesting gang
member responsible for sewing grain-bags’, chaffie ‘chaff collector in harvesting team’,
flaxie ‘flax-cutter/flax-mill worker’, forkie ‘sheaf-forker’, tanky ‘driver of the water tank’,
and water joey ‘driver of the watertank’. Shearing terms, such as broomy/broomie ‘one
employed to sweep in a shearing shed’, fleecie/fleec-o ‘woolshed fleece-gatherer/classer’,
greasie ‘a sheep with wool in which oil has risen’, moccie ‘shearing moccasin’, ringie
‘ringer, fastest shearer’, roughie ‘sheep that has missed shearing’, rousie ‘rouseabout or
casual worker’, sheepo ‘woolshed pen-filler’, shornie ‘newly shorn sheep’, and woollie
‘full-woolled sheep’ are also among early usages.
Features of the environment, including micky ‘mingi mingi tree’, mockie ‘mocking
bird, bellbird or tui’, and Captain Cooker ‘a wild pig’, were also among early New Zealand
coinages. Skiddy ‘timberworker’ has been cited in various genres as early as 1910 and as
recently as 2006. In Australia, the Tasmanian term piner ‘someone involved in felling
Huon pines and getting them to market’ goes back to 1871, and can be found today on
the internet.
1. The New Zealand Dictionary Centre hypocoristics database is compiled from terms in
the Dictionary of NZE, the newspaper database of January 2005–December 2007, and the
Dictionary Centre database of New Zealand English, including terms and citations collected
in the PhD studies of Dianne Bardsley (2003) and Diana Looser (2001). Cherie Connor contributed terms and citations from her current doctoral project: “A diachronic exploration of
the contribution of the harvesting of the marine environment to a distinctive New Zealand
English lexicon”.
The AusE data has been compiled by Jane Simpson and David Nash from observations
of, and discussions with, Australian English speakers since 1987, as well as written sources and
other authors’ works (Dabke 1976; Dermody 1980; Wierzbicka 1984; Taylor 1992; McAndrew
1992). Most of the collection is incorporated into a dictionary (Sussex in prep.). The placenames
material is discussed in Simpson (2001), and a general discussion of the material is in Sussex
(2004) and Simpson (2004).
2. In the AND beacher is ‘a wave which a body-surfer rides to the beach’ (1930 attestation).
In sharing a similar colonial experience in the rural domain, it is not surprising
that some early terms are shared between New Zealand and Australia. Australian and
New Zealand shepherds, shearers, and sheep-breeders traveled freely across the Tasman in the 1850s and 1860s, although this traffic was mainly between the province
of Canterbury and states of Victoria and New South Wales. In a comparative study
of rural terms recorded in The Australian National Dictionary/AND (1988) and The
Dictionary of New Zealand English/DNZE (1997), Bardsley (2003, 2006) found several duplicate hypocoristics. Those that are shared, with earlier citations from Australia, include bullocky ‘bullock-driver’ and smoko/smoke-o ‘work-break’. Australian
farmers adopted the New Zealand terms woollie/woolly and placer ‘an animal that
stays in one location’.
Of the form cocky for farmer from Australian cockatoo, only the singular form
cocky, together with cow cocky and sheep-cocky are recorded in the DNZE,3 whereas the
AND shows further differentiated use in boss cocky, cane cocky, share cocky, spud cocky,
sugar cocky, and wheat cocky. There are also cases of the same term being used with a
different referent within the rural domain. Examples include roughie/roughy, used in
New Zealand for a wild sheep that has missed shearing and in Australia for an unbroken
horse;4 a bushie is someone who lives in the bush in Australia whereas it is a bush or
forestry worker in New Zealand. A duffer is a dry cow in New Zealand (an unusual
usage), while across the Tasman it refers to an illegal grazier or rustler. While beefer was
formerly used in Australia for a beast killed for home consumption, killer is the term
used in both countries, along with boiler ‘animal to be boiled down’, chopper ‘animal
sold for pet food’ and milker ‘cow, goat, or sheep used for milk production’.5 Beefie is
used in New Zealand as a general term for a cattle beast. But this is not a consistent
pattern: while swagger was noted by Morris (1898) as a New Zealand term for ‘swagman’
and is still the common term, across the Tasman the common term is swaggie (Morris
1898). Tussock jumper (‘a farm-hand’ according to the AND) is attested in a shortened
form tussy-jumper in Australia, but is left in an unshortened and semantically distinctive form in New Zealand. And the same base form may have both endings: thus
Morris (1898) has slusher as well as the more common slushy for a cook’s assistant at
shearing-time, both with citations in the 1890s. The Australian National Dictionary
(Ramson 1988) records both blocker and blockie for someone who farms a small block
of land.
3. Cocky is also a shortened form of cockabully, which is an altered form of the Maori kokopu,
a small New Zealand freshwater fish. Sea-cocky ‘fisherman’ has been cited in New Zealand
newspaper sources. (Bully is another form for a freshwater crayfish and also for a pigdog with
bulldog genes or a bull-terrier.)
4. Like many forms based on adjectives, roughie or roughy has many senses. For Australia
these include: a rough person, a rough-leaved pineapple, a Tommy Rough (fish), a shrewd
trick, an unbroken horse, an unpleasant job. In New Zealand, a wild cattle beast, a poor or
unqualified tradesman, a poorly performing racehorse, a poor sports game, and a disobedient
or poorly-trained sheep dog, are amongst further uses. In both countries, roughie is also used
for a racehorse that starts at long odds.
5. There are related semantic differences here: in New Zealand, a killer is a sheep for home
consumption, and a chopper is a pig killed for second-grade pork products. (A chopper is also
an axeman in New Zealand.)
Nevertheless, many distinctive examples are found in New Zealand and Australian
rural lexicons. Well-known examples from New Zealand include flyer ‘a fast-shearing
sheep’, nodder ‘nodding thistle’, packie ‘mustering packman and cook’, pinky ‘newly
shorn sheep’, scotchie ‘Scotch thistle’, and shotty ‘shot gun’. Only the AND records dogger ‘dingo hunter’, fizzers and ragers for wild stock, hornie ‘cattle beast’, jummy ‘jumbuck’, snagger ‘shearer’, and tomahawker or tommyhawker ‘rough shearer’. An early
Australian term is w(h)aler for a horse bred in Australia, especially New South Wales
(1849). Other early Australian examples in the AND include snailey ‘cow with horns
curled like a snail’, and poley ‘hornless cow’.
Hypocorism flourished during World Wars I and II among New Zealand and
Australian troops, generating forms in the tradition of their particular varieties of
English. The DNZE cites, among others, gypie and gippo ‘Egyptian’, homer ‘a serious
wound that will send a serviceman home’, Jacko and Jacky ‘Turkish soldier’, kriegie
‘prisoner of war’, limby ‘a serviceman who has lost a limb’, and slittie ‘slit trench,
a narrow slit in the earth used to protect a soldier or weapon in battle’. Aussie for
‘Australia’ or ‘Australian’ gained currency in World War I (AND), as did digger
(extended from its use for ‘miner’). Other terms used by Australian soldiers cited
in the AND or Laugesen (2003) include anty ‘sugar’, mousee ‘cheese’, prive ‘private
soldier’, and pozzy ‘position’ for a soldier’s place of shelter or firing position in
trench warfare. There were also many compounds using the agentive -er on verbs
(body-snatcher for ‘stretcher-bearer’, bum-brusher for ‘officer’s servant’, sin-shifter
for ‘army chaplain’), and also on nouns (cold-footer for ‘coward’, Would to Godder
for ‘a civilian who “would to God that he could go to the war”’). Compounds could
have variants, wagger ‘signaler’, which are thus indistinguishable from coinages
formed from the verb on its own, such as plonker ‘a shell’, and from coinages where
the source is not obvious, macker ‘a new recruit’, drongo ‘a fool’. The tradition of word
coinage continues, as evidenced by the many examples used by Australian Army
officer cadets in Moore (1993), including checkie ‘check parade’, drillie ‘drill sergeant’,
and messies ‘mess boots’.
2. Functions of hypocoristics
The examples detailed in the introduction show the diversity of domains in which -ie,
-a/-er and -o forms have occurred since the colonization of Australia and New Zealand.
But they also show the difficulty of distinguishing between creating hypocoristic forms
for things which have existing names and creating words for new things which have
no existing names. This difficulty is observable in the nineteenth century; an early lexicographer, Edward Morris (1898) lists several forms in -ie/-y, some of which are new
words, and some of which are variations of existing words. He describes them variously
as “slang” (slushy, p. xvii), “a slang name in the bush” (see boss-cockie), as “a pet name”
(see Tassy for Tasmania), as “a humorous variation” (see swaggie for swagman), and as
“a school-boys’ name” (see greenie for the White-plumed Honeyeater). While Tassie
is an alternative name, boss-cockie is a new word, since the only alternative is the long
paraphrase offered as a definition: “a farmer, larger than a Cockatoo who employs other
labour as well as working himself”.
Morris’s mention of slang, humour and pet names indicates that these forms
were used in maintaining good relations with people. This is still an important
function. The -ie ending has long been associated with babytalk, that is, with adults
speaking to children. It is noticeable that of the 21 diminutives for names of babies
found in an Australian newspaper (The Canberra Times Babies of 2007 supplement,
22/1/08), 9 ended in -ie or variants of it (Daley for Dale, Coopie for Cooper), 5 were
truncations (Max for Maxwell), 6 were reduplications (Lucy Lu, Zoe Poe); 4 ended in
Bear (Logie Bear for Logan), and 1 in -a (Bonza for Bonnie). None ended in -o or -s,
or -as, suggesting that these forms are not linked to the adult-to-child relationship.
Instead, the function of many of these forms is to express adults’ relationships
with other adults. In present-day Australian society their use serves to assert group membership, as Skelt (2002) observes, adding that accommodation is often involved as a means
of showing positive politeness in face-threatening acts. Wierzbicka (1991) in discussing
the social solidarity function of hypocoristics in Australian usage, claims that their
use suggests not endearment6 but good humour, and the jocular cynicism, love of
informality, and tendency to knock things down to size, which she says are part of the
Australian ethos. These also belong to the New Zealand ethos, and are increasingly
evident in media interaction. Pejoratives are not common – in general, New Zealand
hypocoristics have a euphemistic function, rather than a pejorative effect. However,
they may have a leveling or even patronizing effect, as when financial journalists use
hypocoristic forms of surnames of well-known financial figures.7
6. These hypocoristics need to be distinguished from terms of endearment (lovie, dearie)
and from babytalk (eggie). See Mühlhäusler (1983), McAndrew (1992).
Hypocoristics simultaneously carry out two functions: creating a new word, and
creating the feeling that comes from sharing a common expression. This is well brought
out by Looser (1999) in her study of female prison argot:
Inmate D ... felt that the argot made things “easier to say ... takes less time” and
that it provided “short-term equivalents” for words and situations which would
otherwise involve lengthy explanations. (Looser 1999: 17)
... using words which are not of the dominant variety promotes the individuality
of the prisoner group against those in general society... The slang use emphasises
the fact that prisoners are eager to hold themselves different to and not to identify
with prison officials,... (Looser 1999: 19)
This social solidarity function has already been illustrated in the camaraderie and
shared social identity of wartime, and it continues in the naming of schools and school
students, and the familiar names given to members of sports teams, etc.
3. Sources of hypocoristics and their contexts of use
Recent New Zealand data collected from the reading of newspapers can be compared qualitatively and quantitatively with that from the Wellington corpus of spoken
New Zealand English (WSC), the Wellington corpus of written New Zealand English
(WWC), and ICE-NZ. There is in fact a dearth of examples in the corpora, apart
from geographical names. It is not, however, surprising that hypocorism is underrepresented in these particular corpora, given that (i) many hypocoristics have their
origin or use in particular occupations or contexts, and (ii) they are often used in
informal and personal contexts. Many are on-the-spot coinages, as in “I’ll get a second-handy” (a second-hand one), which are unlikely to make it into print, except
occasionally through on-line forums.
Of the 1150 hypocoristic terms recorded in the New Zealand Dictionary Centre
database, only 93 were found in the WWC and WSC. The most common form of hypocoristic found in the corpora is one syllable followed by -ie, with the most common type
being placenames (most notably Palmie with 16 tokens). The most common specific
location is prefixed by the definite article (The Beehive). Among personal names Fitzy
and Foxy were the two most common terms for sportsmen, being rugby players of the
1980s, while Macca (5 tokens) was the most common example of a substitute name.
NZE-specific terms were not well represented in the corpora, the most commonly cited
general noun in all three corpora being varsity (118 tokens), a term which has been
widely superseded in 2007 by uni amongst young people. A broad search in 2007 using
Google resulted in 249 000 hits for uni, and 18 100 for varsity. Other common terms,
none of which are specific to NZE, include Aussie (20 tokens), physio (17 tokens), rep
(15 tokens), pro (10 tokens) and munchie (10 tokens).
7. An example is:
TELSTRA’s public relations supremo Phil Burgess greeted City Beat at the company’s
shareholder shindig on Wednesday with his usual kind words. But at least with Burgo, what
you see and hear is pretty much what you get. (Michael Sainsbury, 9 November 2007, The
Australian, 32).
Hypocoristics in New Zealand and Australian English
While a comparable study of ACE and ICE-AUS has not been undertaken, a search
for polysyllabic words ending in -o in the Australian radio talkback corpus (ART) held
by Macquarie University (about 250 000 words) showed relatively few tokens: only 3
proper names (Peto, Dozzo, and Rotto for ‘Rottnest Island’), 14 tokens of fibro (both for
houses made of fibro-asbestos cement, and for the substance itself), 3 tokens of metho
‘methylated spirits’, and 1 or 2 tokens each of reno(vation), chemo(therapy), el cheapo,
hippo, hypno(therapy), info, macho, osteo(porosis), combo, rego, servo ‘service station’.
A number (apart from hello) were used as interjections: there were 9 tokens of Righto,
1 each of Alrightio, Cheerio, Goodo, Perfecto, and Bingo!
The New Zealand newspaper data, collected from two main metropolitan daily
newspapers in three calendar years between January 2005 and December 2007, show
an increasing use of hypocorisms in all morphological forms. They come from death
notices, obituaries, and classified advertisements, where it might be thought that
expression is more formal and formulaic than that used by columnists, feature writers,
and letter-writers.
Hypocoristic placenames, names of national sportspeople, and national figures are
used widely in newspapers, with hypocoristic forms such as steppie/steppy ‘stepmother/
stepfather’ being found in death notices, along with streetie ‘street inhabitant’ in obituaries in both New Zealand and Australia. Others from Situations Vacant columns
include glassie ‘glass-washer’ and hospo ‘worker in the hospitality industry’, while
skiddy ‘worker in a saw mill’ seems a strictly New Zealand creation, as is the use of
Wangas in a newspaper column to refer to the city of Wanganui. Food and lifestyle
columns are sources for a range of terms including the Cab Savs, the Chardies, the savs,
the savvy, the Eggs Benny, and the Eggs Florrie, which are found in both New Zealand
and Australia.
However, the widespread use of hypocoristics in the media is not a wholly
popular trend:
I am fed up with “Palmie” for Palmerston North, “chrissie” for Christmas and “the
heke” for Waiheke Island .. Jim Mora [National Radio host] does not sound like
us. (Letter to the editor, NZ Listener, 24.02.2007: 4)
Although the use of hypocoristics in both New Zealand and Australia is well
represented in newspapers and periodicals, there are also distinctive groups that have
their own characteristic usage. Skiers, surfers, fishermen, and farmers all demonstrate
wide use of hypocorism in New Zealand, while the criminal and fringe element have
also produced numbers of them in Australia (Simes 1993). Aspects of mass culture lend
themselves to multiple hypocoristics: for example, the local soap opera in New Zealand
(titled Shortland Street) is known as Landy, Shorty, Streety, and Tantie.
Business and brand names are not exempt from hypocorism in newspapers and
periodicals. In both countries people talk of a Beamer ‘BMW’, a Fergie ‘Ferguson
tractor’, a Rangy ‘Range Rover’, a Steiny ‘Steinlager’, Vinnies or St Vinnies ‘St Vincent
de Paul’ and Woolies/Woollies ‘Woolworths’. Ballys ‘Ballantynes’, Kirks ‘Kirkcaldie &
Stains’, and Maccie ‘a Macintosh Apple computer’ are used in New Zealand. Australian
examples of abbreviated brandnames and businesses include DJs ‘David Jones’,
blunnies ‘Blundstone boots’, Bundy ‘Bundaberg rum’ and Inter ‘International tractor’.
Plant names, such as daff ‘daffodil’, gladdie ‘gladiolus’, poly ‘polyanthus’ and rhodo
‘rhododendron’ are common, and found in the UK as well as in Australian and
New Zealand usage.
In both Australia and New Zealand there are hypocoristic terms for religious groups
and sects. Both nations use Pressies ‘Presbyterians’, Sallies ‘Salvation Army’ (although
Australians use Salvos more), and Happyclappies ‘charismatic denominations’. Australians
also use Metho ‘Methodist’, Proddie and Proddo ‘Protestant’, Presbie and Presbo
‘Presbyterians’; Catho and Caffo, rock crunchie and (rock) chopper ‘Catholic’. In NZE
other terms are used for a Catholic, such as benders and Doolans (from ‘Mickey Doolan’,
a stereotypical name for a Catholic). Scarfies is applied to the Exclusive Brethren from
the headcoverings worn by their female members. Hypocoristics are also commonly
used to refer to trade unions and political parties. Both countries have the Nats for the
National Party and the Shoppies for unions of shop-workers, while Australia has the
Missos ‘the Miscellaneous Workers Union’ and the Libs ‘the Liberal party’.
Closely related by their usage are the hypocoristic acronyms used for other kinds of
institutions. In New Zealand these include Eggs ‘Epsom Girls’ Grammar School’ and Stac
‘St Andrew’s College’, examples from the schools domain. Other pet-names for schools
include Dio for Diocesan School for Girls, and Rangi for Rangi Ruru Girls’ School. Winz
‘Work and Income New Zealand’ and Doc ‘Department of Conservation’ are among
government departments. In Australia many government departments are known by
acronyms or initialisms or combinations: DEST ‘Department of Education Science and
Training’, Deefat ‘DFAT’ (Department of Foreign Affairs and Trade) and so on. Soob, the
acronym for ‘small owner-operated brothel’, has been generated with the legalization of
prostitution in New Zealand. With it goes P, the most common of the terms used for
the illegal drug methamphetamine, not to mention E ‘ecstasy’, which is not confined to
New Zealand usage.
Hypocoristics can be sourced from various registers but especially slang. NZE
provides pissy for “piss poor”, as in: “I woke up on the couch in my clothes with
only a pissy blanket over me and my socks still on my feet”. Glad rags are shortened to glads and gladdies. Baby-talk expressions like blanky ‘blanket’ have become
common in the conversation of adults, particularly with connotations of
comfort, as found in “Alice Taylor describes it (bottling/preserving) as comforting, like
a little blanky” (Dominion Post Indulgence 4 August 2007: 5). As in that example, they
may be documented through quotations of everyday speech.
4. Hypocoristics of placenames
In both New Zealand and Australia, the local hypocoristics for placenames may be
found by searching for phrases such as “as (the) locals call it”, or “as it’s known to
locals”. Such phrases are common in travel articles, as they provide both an insider
perspective (you can learn the right password for this group), and an outsider perspective (the locals are not us). In a South Australian newspaper, the capital of Queensland
is described as “beautiful ‘Brissie’ as the locals call it” (Matt Williams, 26 August 2006,
The Advertiser). Searching for these phrases in Australian sources in Factiva for the
years November 2005–November 2007 produced 21 geographical hypocoristics, most
in travel articles; and a Google search produced a further 28 Australian placenames as
well as duplicates (listed in Appendix 1). The most common suffix was -ie (20 tokens),
followed by 10 truncations, and 10 using “The”.
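The phrase search described above can be approximated with a regular expression. What follows is a minimal sketch in Python, not the procedure that was run in Factiva or Google. The pattern, the function name find_local_names and the second sample sentence in the usage note are assumptions made for illustration; the first sample is taken from the Advertiser quotation above.

```python
import re

# Hypothetical pattern covering the three search phrases used in this section:
# "as (the) locals call it", "as it's known to locals",
# "as it's called by (the) locals".
LOCALS_PHRASE = re.compile(
    r"\bas\s+(?:(?:the\s+)?locals\s+call\s+it"
    r"|it[’']s\s+known\s+to\s+locals"
    r"|it[’']s\s+called\s+by\s+(?:the\s+)?locals)",
    re.IGNORECASE,
)

# A quoted name that ends directly before the phrase (an optional comma may
# intervene), e.g. ‘Brissie’ as the locals call it.
QUOTED_NAME = re.compile(r"""[‘'“"]([^‘’'“”"]+)[’'”"]\s*,?\s*$""")


def find_local_names(text):
    """Return quoted names that directly precede a 'locals' phrase."""
    names = []
    for match in LOCALS_PHRASE.finditer(text):
        # Only look at the text to the left of the matched phrase.
        before = text[:match.start()]
        quoted = QUOTED_NAME.search(before)
        if quoted:
            names.append(quoted.group(1))
    return names
```

Applied to “beautiful ‘Brissie’ as the locals call it”, the function returns the single name Brissie; an invented sentence such as “Rockhampton, or 'Rocky', as it's known to locals” yields Rocky. Names that are not set in quotation marks are missed, so a search of this kind would still need manual checking.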
New Zealand and Australia show similar strategies for forming hypocoristic
names for places. Australian places have been discussed in Simpson (2001), and so we
focus here on New Zealand. Cardy Capital, Cardy Town, Wellies, Wello, Wellers, Welly
and Wellytown are all hypocorisms of Wellington, while Hammers, Hammytown and
Hamiltron are forms for the city of Hamilton. There are occasional puns, the best-known being Taradise and Nakiwood for ‘Taranaki’ and Waiberia ‘Waiouru’.8 Blends
include Ashvegas ‘Ashburton’ and Rotovegas ‘Rotorua’ (compare Australian Brizvegas
for Brisbane), which relate to the ribbon development of the town and its accommodation, while Te Texas for ‘Te Teko’ connotes the “back country” aspects of the town.
Although no hypocoristics have been recorded in the Wellington database for The
Hutt, or the (Lower or Upper) Hutt Valleys, there are citations for Hutties, who are
residents there. Residents of Carterton are termed Cartertonics, while those from Ashburton (‘Ashvegas’) are known as Ashvegans.
8. The byname Taradise for Taranaki follows an extended film-making visit by Tom Cruise,
who described the area as paradise. Waiouru is located in the North Island’s barren central
plateau, with winter snowfalls.
Several towns have familiar forms suffixed with -town: P-town ‘Porirua’, Q-town
‘Queenstown’, T town ‘Timaru’, and Whaka-town ‘Whakatane’. Te Awamutu in the Waikato is commonly known as TA.
The definite article is used in some geographical names, either in shortened or
complete forms. The Bay, The Coast, The Cone, The Mount, The Pass, The Sounds, and
The Strait are familiar forms respectively for Hawke’s Bay, the West Coast, Treble
Cone, Mount Maunganui, Arthur’s Pass, Marlborough Sounds, and Cook Strait.
Some forms combine the definite article with the shortened forms such as The Lewis
‘Lewis Pass’, The Naki ‘Taranaki’, The Nua ‘Horowhenua’, The Rap/Wrap ‘Wairarapa’,
The Takas ‘Rimutaka Range’, and The Waimak ‘Waimakariri’. The French definite
article is also found prefixed in New Zealand names for some locations, as in La
Central ‘Central Hotel’.
Other hypocoristic New Zealand placenames are simple abbreviated forms, such
as Annie for Gentle Annie, Central for Central Otago, and Lynn for Grey Lynn. Other
truncated forms include retention of either first or final syllables, Keri for Kerikeri,
Kune for Ohakune, Papa for Whakapapa, Pori for Porirua, and Tiki for Opotiki.9
Retention of the final syllables in hypocoristics appears more often among placenames
than common words, in both New Zealand and Australia.10
Placenames made hypocoristic with the -ie or -y suffix include Cardie/Cardies
‘Cardrona’, Gladdy ‘Gladstone’, Nellie ‘Nelson’, Palmie ‘Palmerston’, Piccy ‘Picton’, Queenie
‘Queenstown’, Welly ‘Wellington’, and Yaldie ‘Yaldhurst’. Others are actually truncated
forms of words from the Maori language: Hoki ‘Hokitika’, Kati ‘Katikati’, Kohi ‘Kohimarama’, Naki ‘Taranaki’, Pori ‘Porirua’, Wainui ‘Wainuiomata’, Heke ‘Waiheke’ and Kune
‘Ohakune’. Their use is definitely hypocoristic, so they are grouped here together with
those formed with an additional suffix (mostly -ie and -a (-er)) which are phonetically
identical. This also applies to a number of hypocoristic forms of Australian placenames,
including The Curry ‘Cloncurry’ and The Berra ‘Canberra’.
5. Ways of forming hypocoristics: Derivation, grammar and meaning
The overwhelming majority of hypocoristic forms in both countries involve nouns.
While hypocoristic forms have traditionally been used for concrete nouns, those
for abstract nouns are included in this survey, New Zealand examples being flattie
(in a “flat spin”), and foggy ‘the foggiest idea’. Examples shared with Australia include
hissy ‘hysterical fit’ and tantie ‘tantrum’.
9. More work needs to be done on the conditioning for these. Some, like Taranaki, have
main stress on the third syllable, but this is not consistent.
10. Use of the final syllable is occasionally found in other domains, e.g. roo for kangaroo, keet
for lorikeet (AND).
Nouns are frequently created from verbs, particularly with the suffix -ie.
New Zealand examples include baggie ‘a grain bagger’, blowie ‘cow blown with bloat’,
and clipper ‘sheep ready for shearing’. Australian examples include floaties ‘tea leaves in
tea, or swimming aids’ and bities ‘biting insects’. New Zealand examples of noun forms that
are created from adjectives include hairy ‘a wild goat’, and heavy ‘a serious consequence’.
In Australia, a bendie is a bendy bus, while a coldie is typically a cold beer.
Polysyllabic names for diseases are frequently shortened into hypocoristic nouns,
in well-established examples such as dermo ‘dermatitis’, lepto ‘leptospirosis’, sypho
‘syphilis’, TB ‘tuberculosis’, and pleura or pleuro ‘contagious bovine pleuro-pneumonia’
in historical usage. Similarly, drugs such as benzodiazepine (Benzo), Halcyon (Halcie),
Haloperidol (Halo), Mogadon (Mogie/Moggie), and Rohypnol (Rolie, or roey in
Australia) are known in shortened form both in medical and general usage.
Hypocoristics are sometimes formed from both parts of a compound or phrase,
e.g. the double abbreviation clan lab ‘clandestine laboratory’ which is used widely in
New Zealand. More often they abbreviate just one part, as in the Australian compound
dual-occy ‘a dual occupancy block’, where occy doesn’t seem to exist on its own. The same
holds for the older half squarie ‘prostitute’, from squarie ‘a young woman, a girl friend’,
hot crossie ‘hot cross bun’, and the New Zealand cutty-grass ‘cutting-grass’. At least one
coordinate structure has been recorded, the Australian fish-and-chippie ‘shop-keeper
or shop that sells fish and chips’. In NZE there are also compounds where both parts
have hypocoristic endings, as in walkie-chalkie ‘parking warden, who marks tyres with
chalk’, woolly-pullies ‘woolly pullovers’ and undie-5-hundie ‘500m sprint street race in
underwear’, and in AusE the children’s game hoppo-bumpo, and Subi Centro, the official
name given to a development in the Subiaco suburb of Perth. On both sides of the Tasman, new compounds and common nouns are formed with an existing hypocoristic,
for example westie. New Zealanders derive them from the notional West Aucklander:
Westie chic, westie chick, westieism/westyism, westiemobile, westieness. They parallel the
derogatory Australian use of Westie as a person from the western suburbs of Sydney,
and similarly westie car and westie chick.
Hypocoristic adjectives (apart from those which are also used as nouns, such as
prezzie/presbie, presbo ‘Presbyterian’) are less often coined. But they can be found in
New Zealand examples such as buttie ‘fat’, churchie ‘churchgoing’, dicky ‘dimwitted’
(from dick), kitcheny ‘housewifely’, and gunny ‘with expertise’. Examples used in both
countries include bosker ‘fine’, blokey, blokesy, bolshie, and pissy, as well as (go) berko
‘berserk’, preggers and preggo ‘pregnant’, plakky ‘plastic’, aggro ‘aggressive’, comfy ‘comfortable’, fantazzo ‘fantastic’, non-ressie ‘non-residential’, para ‘paralytic’ (i.e. drunk),
sarky ‘sarcastic’, and shonky ‘shady or dishonest’. There are a few hypocoristic forms
of verbs, including the New Zealand bungie ‘to move quickly’ and scarpo ‘to escape’,
and trans-Tasman examples such as diss ‘discard, distribute, disrespect’ and veg out (from
‘vegetable’). Australian examples of hypocoristic adjectives and verbs used by officer
cadets (Moore 1993) are maco (from immaculate, a term of praise), obno ‘obnoxious’, to
go meglo ‘megalomaniac’, and to acca ‘to engage in academic work’.
In both countries hypocoristics are frequently polysemous. A familiar New Zealand
example is pressie/prezzie. Pressie is the form used for a member of a President’s
rugby team (a team composed of players over a certain age, usually 35 years), along
with meanings found in both countries: the more widely used ‘present’, as well as
‘Presbyterian’. In New Zealand, cashie has four uses, referring to a car salesman,
cashier, cash converters, person who works for cash tax-free; and a distinct plural
formation: cashies ‘small change’. Similarly, flattie can be a flat-bottomed boat, flatmate, flat tyre, confusion (as in a ‘flat spin’), a low-heeled shoe, and a flat-headed
nail. Sav is a shortened form for both saveloy and sauvignon blanc, the latter often
in the alternative form savvy. Traditionally, a footy/footie can be either a rugby ball,
or a rugby game in New Zealand, and in Australia it may also denote an Australian
Rules football, but requires “the” for a football match as in “Off to the footie”. In New
Zealand Auntie is used for a mentor, an old ewe, and for an effeminate male. Spotty
is used for a spotted dick pudding, a deer under the age of three months, and for the
more common fish. The DNZE lists several uses for bluey from different historic and
contemporary semantic fields.11
A freshy in New Zealand is a new immigrant (Fresh Off the Boat) or a fresh snowfall
(skiing and snowboarding lexis). In Australia it may also be a freshwater crocodile. In
Australia, falsies has been used for padded bras, false teeth, false eyelashes, faked registration forms, or even lumps of fat used on lamb chops to make them look attractive in
window displays. Both nations share usage of stockie as a stock saddle, a stock inspector,
a stock transporter, a stock car or stock items used in a stock car (a stock steering wheel
for example). New Zealand cricketers also use the term for a stock bowler.
At the same time, multiple alternative hypocoristics are sometimes found for the
same referent in both countries. In New Zealand usage, corrugated iron is presented
in the building trade as corrie, corro and corru. In Australia bottley, bottle-oh and bottler
have been recorded for a type of marble. Several forms, including a(r)vo, afto, and
sarvo exist for ‘afternoon’ in both countries, and this arve is heard in Australia, while
afters is a common form for ‘after event’ in New Zealand. In both countries there are
hypocoristic alternatives for personal names with two suffixes, as when -s is added to
the hypocoristic ending -o, in Waynos for Wayne, and Juleso for Julia or Jules.
11. Common to Australia and New Zealand are: a blanket roll, a blue blanket, luggage or a
pack, a traffic ticket, a summons, a red-haired male. In New Zealand it may also include a blue
denim overall, an error, a public bar banning notice, a beast with blue or red colouring. In
Australia it may include a blue swimmer crab or several species of blue-tongue lizards.
In sum, hypocoristic strategies are a powerful and productive way of forming
nouns from a range of linguistic raw material. The results may have a range of context-dependent interpretations, and on the other hand there may be multiple alternative
hypocoristics for the same form. Adjectives and verbs are much less frequently created
using these strategies.
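The grouping of forms by ending that underlies this discussion can be approximated mechanically. The sketch below is an illustration in Python and not the coding procedure applied to the New Zealand and Australian data sets, which was done by hand. The function name classify_ending and the spelling rules are assumptions. It works on spelling alone, follows the ending columns of Table 1 (so -as is counted with -a), treats forms in -er as -a, and cannot tell whether a final letter is a suffix or part of the stem.

```python
def classify_ending(form):
    """Assign a hypocoristic form to an ending class by its spelling.

    Classes follow the ending columns of Table 1:
    "-ie", "-o" (including -os), "-s", "-a" (including -as and -er), "other".
    """
    word = form.lower().strip()
    if word.endswith(("ie", "y")):
        return "-ie"
    if word.endswith(("o", "os", "oh")):
        return "-o"
    if word.endswith(("a", "as", "er")):
        return "-a"
    if word.endswith("s"):
        return "-s"
    # Truncations and anything else not caught by the spelling rules above.
    return "other"


for example in ["Palmie", "servo", "Salvos", "Macca", "Kirks", "crim"]:
    print(example, classify_ending(example))
```

A spelling-based rule of this kind places truncated Maori placenames such as Hoki under “other”, although they are phonetically identical to forms in -ie, so it can only serve as a first pass before manual coding.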
6. Distribution of hypocoristics
We present now the preliminary findings from a comparison of the distribution of
hypocoristic forms across various semantic domains in the New Zealand and Australian
data sets. Those for proper names and other words are examined separately.
Table 1 below shows words other than proper names according to semantic domain,
country and form. The semantic domain information is not comparable across the
data sets for the two countries, because the coding was done separately and because
the data reflects the opportunistic nature of the collection. The Australian data includes
many forms from Bruce Moore’s collection of material from Australian military cadets
(Moore 1993), while the New Zealand corpus includes a more extensive collection of
terms for marbles and the rural sector. Much more systematic study of work-places
and hobbies would be needed to make conclusions about the prevalence of hypocoristics in particular fields. But the semantic domain information is nonetheless important
in showing tendencies for particular endings to occur in particular domains.
Table 1. Semantic domains of New Zealand and Australian hypocoristics and new
coinages according to endings
Domain | One or more¹² syllables plus -ie | One or more syllables plus -o, or -os | One or more syllables plus -s | One or more syllables plus -a or -as | Other, including truncated form
Rural NZ (N=94)             | 70 (74%) | 3 (3%)   | 0       | 17 (18%) | 4 (4%)
Rural Aus. (N=20)           | 15 (75%) | 0        | 0       | 5 (25%)  | 0
Crime NZ (N=108)            | 94 (87%) | 9 (8%)   | 0       | 3 (3%)   | 2 (2%)
Crime Aus (N=19)            | 2 (11%)  | 1 (5%)   | 0       | 11 (58%) | 5 (26%)
Fishing NZ (N=40)           | 39 (98%) | 0        | 0       | 1 (3%)   | 0
Fishing Aus. (N=38)         | 20 (53%) | 11 (29%) | 0       | 6 (16%)  | 1 (3%)
Sport NZ (N=44)             | 39 (89%) | 0        | 1 (2%)  | 1 (2%)   | 3 (7%)
Sport Aus. (N=14)           | 7 (50%)  | 1 (7%)   | 0       | 2 (14%)  | 4 (29%)
Occupat. roles NZ (N=110)   | 81 (74%) | 9 (8%)   | 0       | 13 (12%) | 7 (6%)
Occupat. roles Aus. (N=147) | 88 (60%) | 42 (29%) | 0       | 6 (4%)   | 11 (7%)
Marbles NZ (N=56)           | 45 (80%) | 0        | ?2 (4%) | ?3 (5%)  | 6 (11%)
Marbles Aus. (N=25)         | 17 (68%) | 4 (16%)  | 0       | 4 (16%)  | 0
Military Aus. (N=65)        | 27 (42%) | 4 (6%)   | 3 (5%)  | 4 (6%)   | 27 (42%)
Legend: Rounding of decimals means that not all percentages add up to 100%.
12. Most of the forms are based on one syllable plus ending, but occasionally two syllable
bases are found, such as Duntroony (Duntroon cadet), common-oh (marble), dissolvo
(dissolving stitches).
The most notable finding shown above is that hypocoristics in -ie are the most
common in both countries. However, the Australian data shows the gradual encroachment of the hypocoristic in -o; a larger proportion of forms with -o are recorded in
the domains of occupational roles and fishing. Some other domains (e.g. fishing and
military) include occupational roles, and so in fact the use of -o in occupational roles
is greater than the classification indicates.
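The percentages in Table 1 can be recomputed from the raw counts, which also shows why some rows do not sum to 100%. The following Python sketch is offered as a check under stated assumptions: the function names are hypothetical, and rounding half up to whole numbers is an assumption about how the published figures were rounded.

```python
def share(count, total):
    """Percentage of count in total, rounded half up, using integer arithmetic."""
    return (200 * count + total) // (2 * total)


def row_percentages(counts):
    """Percentages for one table row; the row total is the sum of its counts."""
    total = sum(counts)
    return [share(c, total) for c in counts]


# Counts from Table 1, in column order: -ie, -o/-os, -s, -a/-as, other.
rural_nz = [70, 3, 0, 17, 4]    # Rural NZ (N=94)
crime_aus = [2, 1, 0, 11, 5]    # Crime Aus (N=19)

print(row_percentages(rural_nz))
print(row_percentages(crime_aus))
```

For the Rural NZ row this gives 74, 3, 0, 18 and 4, which match the published figures and sum to 99 rather than 100, the effect noted in the legend to Table 1.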
Occupational roles also provide many examples of -a (-er) in other domains,
for example in the New Zealand and Australian rural domains, but perhaps most
strikingly in the Australian crime domain. Whereas the New Zealand crime domain
contains many forms in -ie, the Australian crime domain contains very few. This
reflects the fact that the New Zealand crime domain material was collected more
recently than the main source for the Australian crime domain (Simes 1993), which
largely consists of two glossaries compiled by prisoners in New South Wales gaols,
one in 1944 and the other in 1950. These glossaries contain a considerable amount
of rhyming slang (see Looser (1999) on the greater use of rhyming slang by older
(45–60) prisoners in New Zealand jails). There are some occupational nouns in
-a, but few forms in -ie or -o, most of them terms for things which are not specifically criminal and also in general currency.13 For example, terms for drugs used
by younger prisoners (20–30) happen to include a number ending in -ie (tammies
‘temazepam’, rivvies ‘rivitrol’, misties for morphine sulphate tablets), and one in -o
(benzo ‘benzedrine’).
Table 2 shows the distribution of types of proper names compared with common
nouns in the New Zealand and Australian data. Note that the data is not completely comparable because in the New Zealand data the use of the definite article is treated as a
separate way of forming hypocoristics, while for the Australian data some forms with the
definite article are included under truncations and endings as well as a separate strategy.
13. Hypocoristics in Simes (1993) include: crim, dissing (dicing term), doughy, bumper and
cigga (all terms for cigarettes or cigarette butts), imbo (imbecile), to vag (imprison under the
Vagrancy Act). Words for sexual orientation or prostitution include aspro, chromo, chippy,
condy boy, lezo, lowie. Some, such as troppo (crazy), conshy (conscientious objector), are used
generally. The forms listed here under -a include forms analyzable with agentive -er, such as
wifestarver (someone imprisoned for desertion of his wife), and dudder (swindler), as well as
forms based on nouns, such as sixer (six-month sentence).
Table 2. New Zealand and Australian hypocoristics of proper names and common nouns
according to endings
Type | One or more syllables plus -ie | One or more syllables plus -o | One or more syllables plus -s, or -as | One or more syllables plus -a | Other, including truncated form | Use of the definite article
Geographical names NZ (N=104) | 22 (21%) | 3 (3%) | 9 (9%) | 0 | 37 (36%) | 33
Geographical names Aus. (N=366) | 147 (40%) (incl. 5 with “The”)¹⁴ | 60 (16%) | 25¹⁵ (7%) | 28 (8%) (incl. 6 with “The”)¹⁶ | 107 (29%) (incl. 23 with “The”) | [104]¹⁷ (plus 70 full words)
Personal names NZ (N=26) | 10 (38%) | 4 (15%) | 4 (15%) | 4 (15%) | 4 (15%) | 0
Personal names Aus. (including surnames and nicknames) (N=205) | 58¹⁸ (28%) | 34¹⁹ (17%) | 30 (15%) | 32²⁰ (16%) | 51²¹ (25%) | 0
Names of NZ institutions/locations²² (N=39) | 4 (10%) | 1 (3%) | 7 (18%) | 0 | 5 (13%) | 22
(continued)
14. These include forms such as The Dangie (Urandangie) whose final ie is part of the stem.
15. These include forms such as Bruns (Brunswick) whose final s is part of the stem.
16. These include forms such as Coota (Cootamundra) whose final vowel is part of the stem.
17. This figure includes forms with other endings classified in this table, e.g. The Wello (The
Duke of Wellington pub), which is also classified as -o. Only 7 consist of complete words, e.g.
The Lion (the British Lion hotel).
18. This figure includes six which also end in -sy, e.g. Debsy (Deborah).
19. This figure includes two which also end in -so, e.g. Debso (Deborah).
20. This includes 23 in which the final consonant has been changed to /z/, e.g. Lazza (Larry),
Ozza (Owen), Mazza (Amanda).
21. This includes 44 truncations. Most retain the first syllable of the word, but a few take the
final syllable: Rell, (Narelle), Shell (Michelle), or the middle syllable: Liss (Melissa). Perhaps the
lack of initial stress in these words is a contributing factor.
22. “Specific location” is a sports ground, pub, etc.
Table 2. (continued)
Type | One or more syllables plus -ie | One or more syllables plus -o | One or more syllables plus -s, or -as | One or more syllables plus -a | Other, including truncated form | Use of the definite article
Names of Aus. Institutions/locations (N=66) | 19 (28%) | 18 (27%) | 5 (7%) | 7 (10%) | 17 (27%) | [47]²³
Sportspeople and celebrities NZ (N=35) | 14 (40%) | 2 (6%) | 0 | 12 (34%) | 0 | 7 (20%)
Sportspeople and celebrities Aus (N=35) | 15 (43%) | 9 (26%) | 8 (23%) | 2 (6%) | 1 (3%) | 0
Common nouns NZ (N=453) | 336 (74%) | 38 (8%) | 9 (2%) | 35 (8%) | 35 (8%) | 0
Common nouns Aus. (N=673)²⁴ | 412²⁵ (61%) | 129 (19%) | 9 (1%) | 59 (9%) | 65 (10%) | 0
Table 2 shows that the hypocoristics found on common nouns are also found
on proper names, though the proportions differ, and there are some features either
restricted to, or most common on, proper names, such as the use of the definite article
in placenames. Yet it is clear that with common nouns, in both countries -ie is dominant,
and the other endings are much less common. But proper names of people are most
subject to playful hypocoristic formations (Poynton 1984; Taylor 1992, 1993). So it is
not surprising that on proper names generally the percentage of use of hypocoristics
other than -ie is higher. Again -o is more common in Australia.
23. This figure includes forms with other endings classified in this table, e.g. The Wello (The
Duke of Wellington pub), which is also classified as -o. Only 6 consist of complete words, e.g.
The Lion (the British Lion hotel).
24. This figure is a subset of general common nouns, excluding those listed in semantic
domains in Table 1.
25. This figure includes some with -sie: footsie.
7. Conclusion
From the early days of European settlement in both New Zealand and Australia there
has been a need to create new words to describe new things. Some of the strategies
used to do this are the same as strategies used to create hypocoristic forms of existing words by adding endings to a base which is usually monosyllabic. Exactly which
endings are used fluctuates according to time and fashion, and depends in part on the
semantic domain of what the word denotes. The creation of hypocoristic alternatives
to existing words is driven in part by the desire to identify with a group which has a
particular way of talking. In turn, this desire influences the choice of ways to create
new words.
The similarities in settlement history and in the economic basis for settlement
(agriculture and sheep) and the long interchange between New Zealand and Australia
have led to similar patterns of word creation. As the tables show, the distribution of use
of hypocoristic endings is basically similar in both countries. In both countries, the -ie
ending is the most popular, and has the most diversified range. Perhaps the most notable difference between the countries is the greater use of the -o ending in Australia
among general common nouns, geographic names, words for occupational roles and
in fishing. However, finer-grained work is needed in Australia to show the geographic
and age distributions of these terms, since preliminary work on hypocoristics of
Australian placenames (Simpson 2001) showed that the -o forms were more common
in the eastern states NSW and Queensland than in the other states.
Appendix 1: Some Australian hypocoristics for placenames
and institutions
These forms were found through Google searching Australian web-pages 17/8/06, and
searching Australian sources in Factiva for the years November 2005–November 2007 for
the phrases “as the locals call it”, “as it’s called by (the) locals” and “as it’s known to locals”.
Hypocoristic | Place | Location | Source
Forms with -ie
The Arty | artificial reef near Bundaberg | Queensland | www.fishingmonthly.com.au/AreaArchives/qldarchives-ft/central/bundaberg/med_0209.html
Barci | Barcaldine | Queensland | www.bigscreen.afc.gov.au/tour_blog/blog21.aspx
Beakey | Beaconsfield | Tasmania | www.tamarcove.com/gettinghere.htm
Billi | Billinudgel | NSW | www.smh.com.au/articles/2004/04/09/1081326928159.html?from=storyrhs
Brissie | Brisbane | Queensland | Matt Williams, 26 August 2006, The Advertiser, also www.accordmb.com/apply-queensland.html accessed 17/8/06
Broadie | Broadbeach | Queensland | Maurice Dunlevy, The Australian 16 September 2006
Cumby | Come-By-Chance | NSW | Peter Trute, Daily Telegraph, 26 October 2006
Freshie | Freshwater Beach | NSW | Aimee Brown, Daily Telegraph, 27 January 2007
Maggie | Magnetic Island | Queensland | Peter Vincent, 29 July 2006, The Sydney Morning Herald, also yahoo.domain.com.au/Public/Article.aspx?id=1154198117234&index=NationalInd accessed 17/8/06
Mainy | Main Beach | Queensland | The Go Girls with Shannon Willoughby and Melanie Pilling, The Gold Coast Bulletin 9 November 2006
Mossy | Moss Vale | NSW | James Cockington, The Sydney Morning Herald 1 November 2006
Newie | Newcastle | NSW | www.fasterlouder.com.au/forum/showthread.php?t=2344
Palmy | Palm Beach | NSW | www.surfit.com.au/Surfit/Display.asp?ss=5&AID=2467&CID=92
Patche | Patchewollock | Victoria | Orietta Guerrera, The Age 17 April 2006
Rocky | Rockhampton | Queensland | www.choicehotels.com.au/resources/ITINERARIES/QLD/QLD_East_Coast_Adventure.pdf
The Roey | The Roebuck Bay Hotel | WA | www.alia.com.au/productions/productionsuntil02-02-02.htm
Straddie | North Stradbroke Island | Queensland | travel.yahoo.com.au/guide/australia/Queensland/north-stradbroke-island/index.html
The Subi | The Subiaco Hotel, Perth | WA | Jessica Hurt and Rob Malinauskas, The Advertiser 17 December 2005
Willy | Williamstown | Victoria | www.whitehat.com.au/Melbourne/StreetsSuburbs/Williamstown.asp
Witchy | Witchcliffe | WA | www.broomstick.com.au
Forms with “The”
The Bay | Apollo Bay | Victoria | Tony Prytz, Geelong Advertiser 5 November 2007
The Bay | Nelson Bay | Queensland | Rachel Sullivan, The Sydney Morning Herald 17 November 2007
The Curry | Cloncurry | Queensland | www.bigscreen.afc.gov.au/tour_blog/blog24.aspx
The Druitt | Mount Druitt | NSW | Elisabeth Wynhausen, The Australian Magazine, 20 May 2006
The Ekka | The Royal Queensland Show | Queensland | www.withincooee.com/brisbane/brisbane-major-events.htm
The G | The MCG | Melbourne, Victoria | www.travellers-autobarn.com/new-design2/tab-melbourne.shtml
The Gorge | Cataract Gorge Reserve | Tasmania | www.discovertasmania.com.au/home/index.cfm?SiteID=397
The Isa | Mount Isa | Queensland | www.littlehills.com/travel_information/qld.mountisa.shtml
The Sov | the Sovereign Hotel | Queensland | www.queenslandholidays.com.au/travel-info/gay-and-lesbian.cfm
The Valley | Kangaroo Valley | NSW | www.lawyersweekly.com.au/articles/95/0c01f695.asp
Truncations
Alex | Alexandra Headland | Queensland | www.qldtravel.com.au/mooloolaba.html
Bre | Brewarrina | NSW | Jordan Baker, The Sydney Morning Herald 2 November 2007, also www.openroad.com.au/backtothebush.asp accessed 17/8/06
Byron | Byron Bay | NSW | www.openroad.com.au/travel_greatdrives_nothernriversdrive.asp
Conspic | Conspicuous Cliffs | WA | Caren Blair, Sunday Mail 29 July 2007
Copa | Copacabana Beach | NSW | Peter Vincent, The Sydney Morning Herald 17 December 2005
Fed Square | Federation Square, Melbourne | Victoria | www.mynrma.com.au/victoria_melbournes_museums.asp
Port | Port Macquarie | NSW | www.nomadsworld.com/productlist.asp?backpack=port+macquarie
Port | Port Douglas | Queensland | Alison Cotes, Sunday Mail 7 January 2007, also www.stayz.com.au/13971
Strath | Strathalbyn | South Australia | www.smallguide.com.au/sa4.html
Toke | Tocumwal | Victoria | Neil McDonald, The Australian Magazine 24 February 2007
Wang | Wangaratta | Victoria | www.greatrides.com.au/inform.php?a=4&b=22&c=145
Initials
JB | Jervis Bay | ACT | Mark Eggleton, The Australian Magazine 3 March 2007
KI | Kangaroo Island | South Australia | www.coxy.com.au/vic/bigbreak/?tourid=97
PA’s | The Prince Alfred Hotel, Melbourne | Victoria | Melbourne/Yarra Leader 28 November 2005
TI | Thursday Island | Queensland | www.reefwatch.com/sampleitineraries.htm
Forms with -o
Freo | Fremantle | WA | www.frogandtoad.com.au/wa/grperth/index.html and www.aussieholidays.net.au/wainfo.html
Rotto | Rottnest Island | WA | as above
Forms with -a
Macca | Macquarie Island | external territory | www.aad.gov.au/default.asp?casid=15435
Dubious
`kraffa | Karratha | WA | www.grinspoon.com.au/band/joes_diary/arc11-2002.html
Forms with -s
Margs | Margaret River area | WA | John Andersen, Townsville Bulletin 27 January 2007
References
Australian National Dictionary. See Ramson, William S.
Bardsley, Dianne. 2003. “The rural New Zealand English lexicon 1842–2002”. Ph.D. Victoria
University of Wellington.
Bardsley, Dianne. 2006. “A specialist study in New Zealand English lexis: the rural sector”.
International Journal of Lexicography 19 (1): 41–72.
Blank, Claudia (ed.). 1992. Language and civilization: a concerted profusion of essays and studies
in honour of Otto Hietsch. Frankfurt: Peter Lang.
Corne, Chris. 1998. “The -er processive suffix and You little bottler!” New Zealand English
Journal 12: 21–4.
Dabke, Roswitha. 1976. Morphology of Australian English. Ars Grammatica Band 6. Munich:
Wilhelm Fink.
Dermody, Anthony C. 1980. “Word abbreviation and suffixing in Australian English”. B.A. Honours
thesis, La Trobe University, Melbourne.
Dictionary of New Zealand English. See Orsman, Harry O.
Kiesling, Scott F. 2006. “English in Australia and New Zealand”. In Braj Kachru, Yamuna Kachru &
Cecil L. Nelson (eds), The Handbook of World Englishes, 74–89. Malden MA: Blackwell.
Laugesen, Amanda (ed.). 2003. Glossary of Slang and Peculiar Terms in Use in the A.I.F. 1921–1924.
Canberra: Australian National Dictionary Centre. <http://www.anu.edu.au/andc/res/aus_words/wwi/index.php> (03 Feb. 2008).
Looser, Diana. 1999. “ ‘Boob jargon’: The language of a women’s prison”. New Zealand English
Journal 13: 14–37.
Looser, Diana. 2001. “Boobslang: A lexicographical study of the argot of New Zealand prison
inmates in the period 1996–2000”. Ph.D. thesis, University of Canterbury.
McAndrew, Alex. 1992. “Hosties and Garbos: A look behind diminutives and pejoratives in
Australian English”. In Claudia Blank (ed.), 166–84.
Moore, Bruce. 1993. A Lexicon of Cadet Language: Royal Military College, Duntroon in the
period 1983 to 1985. Canberra: Australian National Dictionary Centre, Australian National
University.
Morris, Edward E. 1898. Austral English: A Dictionary of Australasian Words, Phrases and Usages.
London: Macmillan and Co.
Mühlhäusler, Peter. 1983. “Stinkiepoos, cuddles and related matters”. Australian Journal of
Linguistics 3: 75–91.
Orsman, Harry O. (ed.). 1997. The Dictionary of New Zealand English. Auckland: Oxford University
Press. Updates at: <http://www.victoria.ac.nz/lals/research/nzdc/index.htm>
Poynton, Cate. 1984. “Names as vocatives: forms and functions”. Nottingham Linguistic Circular
13: 1–34.
Ramson, William S. (ed.). 1988. The Australian National Dictionary: A Dictionary of Australianisms on Historical Principles. Melbourne: Oxford University Press. Updates at <http://www.anu.edu.au/andc/res/aus_words/index.php>
Simes, Gary. 1993. A Dictionary of Australian Underworld Slang. Melbourne: Oxford University
Press.
Simpson, Jane. 2001. “Hypocoristics of place-names in Australian English”. In Peter Collins &
David Blair (eds), Varieties of English: Australian English, 89–112. Amsterdam: Benjamins.
Simpson, Jane. 2004. “Hypocoristics in Australian English”. In Bernd Kortmann, Kate Burridge,
Rajend Mesthrie, Edgar W. Schneider & Clive Upton (eds), Handbook of Varieties of English:
Volume 2: Morphology and Syntax: Australasia and the Pacific, 643–56. Berlin: Mouton de
Gruyter.
Skelt, Louise. 2002. “The discourse function of Australian -ie and -o suffixed hypocoristic terms”.
BA Honours thesis. Australian National University.
Sussex, Roland. 2004. “Abstand, Ausbau, creativity and ludicity in Australian English”. Australian
Journal of Linguistics 24 (1): 3–19.
Sussex, Roland. In prep. [Australian diminutives].
Taylor, Brian A. 1992. “Otto 988 to Ocker 1988”. In Claudia Blank (ed.), 505–36.
Taylor, Brian A. 1993. “Ocker, Richo and the other Aussie Dunny. Mucking about with people’s
names in Australian English: What is the code?” Australian Folklore. A Yearly Journal of
Folklore Studies 8: 112–37.
Wierzbicka, Anna. 1984. “Diminutives and depreciatives: semantic representation for derivational
categories”. Quaderni di semantica 5 (1 (June)): 123–30.
Wierzbicka, Anna. 1991. Cross-Cultural Pragmatics: The Semantics of Human Interaction. Berlin:
Mouton de Gruyter.
section ii
Verbs and verb phrases
Modals and quasi-modals
Peter Collins
University of New South Wales
The findings of the present study of selected modals and quasi-modals in
matching corpora of Australian, New Zealand, British and American English
reinforce those of diachronic investigations attesting to the rising popularity
of the quasi-modals and declining fortunes of the modals in recent decades.
That these two developments are connected is suggested by the near symmetrical
results obtained across the four regional varieties and across the spoken
versus written categories. American English appears to be in the vanguard of
change, both in simple frequency terms and in the extent of the gulf in stylistic
preferences between the quasi-modals and modals. New Zealand English
emerges as the most conservative of the four varieties, with Australian and
British English in between.
1. Introduction
This chapter examines the frequency, distribution and meanings – in contemporary
AusE, NZE, BrE, and AmE – of the quasi-modals have to, have got to, need to, be going
to and want to, and compares them systematically to the modals into whose semantic
space they appear to be making inroads, must, should, need, will, and shall. Two criteria
were exercised in the selection of quasi-modals: their frequency of occurrence (by
which criterion semantically relevant but nevertheless low frequency quasi-modals
such as had better were excluded); and their semantic similarities to modal auxiliaries
(resulting in the exclusion of used to).
The quasi-modals and modals examined fall into two broad semantic groups: those
expressing necessity and obligation (have to, have got to, need to, must, should, and
need); and those expressing prediction and volition (be going to, want to, will, and
shall). The classification of modal meanings for each item is based on Palmer’s (1990)
tripartite distinction between “epistemic” modality (concerned with the speaker’s
commitment to the truth of the proposition), “deontic” modality (concerned with
conditions relating to the completion of an action deriving from an external source),
and “dynamic” modality (concerned typically with an individual’s ability or volition).
2. Recent changes: Quasi-modals on the rise
Figures presented by Mair and Leech (2006) show that BrE and AmE have seen, in
the three decades spanning the early 1960s to the early 1990s, a rise in the frequency
of the quasi-modals with a concomitant and related decline in the frequency of the
modal-auxiliaries. Table 1 below reproduces Mair and Leech’s figures for written
BrE and AmE for the items that are investigated in the present study, determined by
calculating the difference between the frequencies derived from their 1960s corpora
(LOB and Brown) and from their 1990s corpora (FLOB and Frown), as a percentage
of the former.1
Table 1. Changes in the frequency of some quasi-modals and modals in recent British
and American writing
Quasi-modals | BrE | AmE
have to | +9.0% | +1.1%
have got to | –34.1% | +15.6%
need to | +249.1% | +123.2%
be going to | –1.2% | +51.6%
want to | +18.5% | +70.9%
Modals | BrE | AmE
must | –29.0% | –34.4%
should | –11.8% | –13.5%
need | –40.2% | –12.5%
will | –2.7% | –11.1%
shall | –43.7% | –43.8%
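The percentage-change measure used for Table 1 can be sketched in a few lines of Python. This is a minimal illustration of the arithmetic only (the difference between the later and the earlier frequency, expressed as a percentage of the earlier one); the function name percent_change and the two example counts are hypothetical and are not taken from LOB, FLOB, Brown or Frown.

```python
def percent_change(earlier: float, later: float) -> float:
    """Return the change from `earlier` to `later` as a percentage of `earlier`."""
    if earlier == 0:
        raise ValueError("the earlier frequency must be non-zero")
    return (later - earlier) / earlier * 100


# Hypothetical counts, for illustration only (not corpus data).
example_1960s = 200
example_1990s = 218

# Format with an explicit sign, in the style of Table 1.
print(f"{percent_change(example_1960s, example_1990s):+.1f}%")
```

A positive value indicates a rise between the two sampling points and a negative value a decline.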
While Mair and Leech are not in a position to provide detailed information about
spoken BrE and AmE, they nevertheless report that a search of spoken corpora of BrE
covering a similar period of time shows the trends found for writing to be more pronounced in speech (compare an overall rise of 10.0% of the quasi-modals in British
writing with one of 36.1% in British speech, and an overall fall of 9.5% of the modals in
British writing with one of 17.3% in British speech). The unavailability of comparable
corpora for AmE speech deprives Mair and Leech of the opportunity to provide parallel
American figures. However they refer to the results of a search of the 4-million-word
Longman Corpus of Spoken American English (from the 1990s) which indicated that
the frequency of quasi-modals was 62.5% greater than that of core modals (compared
with a difference of 17% for written corpora of AmE and BrE of the same era). This
finding, Mair and Leech conclude, “suggests that, as is often suspected, the spoken
American variety of the language is the main driving force of change in this area, as
presumably in others, and places the encroachment of semi-modals on the territory of
the modals in AmE speech, in frequency terms, beyond doubt” (p.328).
1. The figures, from Table 14.3 on p.327 and Table 14.4 in Mair and Leech (2006: 327–8), are
based largely on those reported in Leech (2003).
This chapter uses matching ICE corpora of the 1990s containing both spoken and
written material, to seek further confirmation that AmE is leading the way in this domain
of grammatical change. It seeks, furthermore, to determine where AusE and NZE fit in
to the picture. Do they pattern similarly to each other, or differently? Do they retain their
traditional British orientation or is there evidence of US influence? Is there evidence of
linguistic individualism, with patterns that are neither clearly British nor American?
3. The corpora
The corpora used in the study were selected for their capacity to facilitate the investigation of both dialectal and stylistic variation. They were the parallel million-word
corpora of the International Corpus of English representing BrE (ICE-GB), AusE
(ICE-AUS), and NZE (ICE-NZ), and a specially-assembled corpus of c.200 000 words
of AmE (C-US). ICE-GB, ICE-AUS and ICE-NZ, like all ICE corpora, conform to a
common design, comprising 500 texts each of 2000 words, sampled in the early 1990s
(300 spoken texts – 180 dialogic and 120 monologic; and 200 written texts – 50 nonprinted and 150 printed).
The texts for C-US – which was designed to fill the gap caused by the non-availability
hitherto of an actual ICE-US corpus – had two sources. For the spoken component
Part 1 and Part 2 of the Santa Barbara Corpus (SBC) were selected (containing 116 458
words, this count determined by stripping out all but orthographic words from the transcripts). Insofar as the SBC texts are predominantly (about 80%) dialogic, there is unfortunately some noncomparability with the ICE corpora, in which the spoken component
is 60% dialogic. For the written component of C-US 80 000 words were extracted from
the Freiburg-Brown Corpus of Written American English (Frown), the selection of texts
being made to match as closely as possible the ICE categories, as follows:
ICE | C-US
Non-printed (50 texts) | G1–3; P1–7 (10 texts)
Printed: informational (100 texts) | J1–8; F1–8; A1–4 (20 texts)
Printed: instructional (20 texts) | H1–2; E1–2 (4 texts)
Printed: persuasive (10 texts) | B1–2 (2 texts)
Printed: creative (20 texts) | K1–4 (4 texts)
All frequencies for C-US, which contains 196 458 words, were normalized to tokens
per one million words, to match those for ICE-GB, ICE-AUS and ICE-NZ.
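The normalization step can be illustrated with a short Python sketch. The helper name per_million is an assumption of this example, as is rounding to the nearest whole token; the corpus size and the raw C-US counts for the quasi-modals are those reported in this chapter (Table 2).

```python
C_US_WORDS = 196_458  # size of C-US as stated in the chapter


def per_million(raw_count: int, corpus_words: int) -> int:
    """Scale a raw token count to tokens per one million words (rounded)."""
    return round(raw_count * 1_000_000 / corpus_words)


# Raw C-US counts for the five quasi-modals (bracketed figures in Table 2).
raw_c_us = {
    "have to": 272,
    "have got to": 34,
    "need to": 93,
    "be going to": 474,
    "want to": 280,
}

normalized = {item: per_million(n, C_US_WORDS) for item, n in raw_c_us.items()}
print(normalized)
```

Because the ICE corpora each contain one million words, their raw and normalized figures coincide and no scaling is needed for them.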
4. Frequencies across the regional varieties
Table 2 presents frequencies for all the quasi-modals investigated in the four corpora.
Table 2. Frequencies of the quasi-modals*
 | ICE-AUS | ICE-NZ | ICE-GB | C-US
have to | 1311 (1311) | 1182 (1182) | 1244 (1244) | 1385 (272)
have got to | 332 (332) | 228 (228) | 339 (339) | 173 (34)
need to | 343 (343) | 338 (338) | 280 (280) | 473 (93)
be going to | 1191 (1191) | 1088 (1088) | 1056 (1056) | 2413 (474)
want to | 1039 (1039) | 800 (800) | 858 (858) | 1425 (280)
Total | 4216 | 3636 | 3777 | 5869
* For C-US figures are normalized to tokens per one million words (raw figures in brackets).
What is immediately striking is that with the single exception of have got to, the
American corpus evidences the highest frequency for the quasi-modals examined, a
finding which lends plausibility to the claim that AmE is leading the way in the rise
of the quasi-modals. A comparison of the total number of tokens in the four corpora
confirms that in general terms C-US has the strongest affinity for these quasi-modals
and ICE-NZ the weakest, the ordering being as follows: C-US (5869 tokens) > ICE-AUS
(4216) > ICE-GB (3777) > ICE-NZ (3636). AusE appears to pattern more closely with
BrE than AmE.
Table 3 presents frequencies for the modals investigated in the four corpora.
Table 3. Frequencies of the modals*
 | ICE-AUS | ICE-NZ | ICE-GB | C-US
must | 613 (613) | 714 (714) | 675 (675) | 402 (79)
should | 1141 (1141) | 1577 (1577) | 1124 (1065) | 850 (167)
need | 19 (19) | 20 (20) | 34 (34) | 15 (3)
will | 3868 (3868) | 3874 (3874) | 3861 (3861) | 3950 (776)
shall | 100 (100) | 99 (99) | 223 (223) | 102 (20)
Total | 5741 | 6284 | 5917 | 5319
* For C-US figures are normalized to tokens per one million words (raw figures in brackets).
Here again the findings suggest that AmE is in the box seat of linguistic change,
leading the way in the decline of must, should and need. As for the remaining two
items, with will – tokens of which massively outnumber those of the other items –
the frequencies across the four dialects are strikingly similar, as they are for shall in AusE,
NZE and AmE. If we again compare the total number of modals in each of the four
corpora as a means of determining general trends, the following ordering from most
to least innovative results: AmE (5319 tokens) > AusE (5741) > BrE (5917) > NZE
(6284). Removing will – whose vast number of tokens has the potential to skew the
results – makes no difference to the ordering AmE (1369) > AusE (1873) > BrE (2056) >
NZE (2410). As in the case of the rise of the quasi-modals, so here in the fall of the
modals, it would seem that AmE is leading the way, with NZE the most conservative.
In between are AusE and BrE, with AusE closer to BrE than to AmE.
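The two orderings can be checked directly against the figures in Table 3. The following Python sketch is illustrative only: the dictionary layout and the helper total are assumptions of the example, and the counts are copied from Table 3 (normalized figures in the case of C-US). Corpora are sorted from fewest to most modal tokens, fewer tokens being read here as "more innovative".

```python
# Modal frequencies per corpus, copied from Table 3.
modals = {
    "ICE-AUS": {"must": 613, "should": 1141, "need": 19, "will": 3868, "shall": 100},
    "ICE-NZ": {"must": 714, "should": 1577, "need": 20, "will": 3874, "shall": 99},
    "ICE-GB": {"must": 675, "should": 1124, "need": 34, "will": 3861, "shall": 223},
    "C-US": {"must": 402, "should": 850, "need": 15, "will": 3950, "shall": 102},
}


def total(counts: dict, exclude: tuple = ()) -> int:
    """Sum the counts for one corpus, optionally leaving some items out."""
    return sum(n for item, n in counts.items() if item not in exclude)


totals = {corpus: total(c) for corpus, c in modals.items()}
totals_without_will = {corpus: total(c, exclude=("will",)) for corpus, c in modals.items()}

# Sort corpora from fewest to most modal tokens.
ordering = sorted(totals, key=totals.get)
ordering_without_will = sorted(totals_without_will, key=totals_without_will.get)

print(ordering)
print(ordering_without_will)
```

Excluding will lowers every total by roughly the same amount, which is why the relative positions of the four corpora are unaffected.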
In the following section we add a further variable to that of regional variation,
considering the frequencies of the modal expressions studied in the spoken and written
varieties of each dialect. In light of Mair and Leech’s (2006) discovery of more pronounced trends in speech than writing, it is anticipated that this will provide further
insights into diachronic developments across the dialects.
5. Frequencies across speech and writing
I shall begin with some general observations based on the frequencies reported in
Table 4, more detailed discussion of which is found in Section 6 below. The first is that
the five quasi-modals are commoner (three times more so, or 3.09:1) in speech than in
writing. The finding is not unduly surprising given that it is not uncommon for innovations to spread rapidly in informal spoken genres before becoming established more
broadly in the language. The second observation is that there are major differences across
the regional varieties with respect to the strength of the quasi-modals’ preference for
occurrence in speech over writing (ranging from 5.44:1 in C-US to 1.71:1 in ICE-NZ). The
third observation is that these stylistic preferences correlate with the overall frequency of
tokens (C-US has both the largest number of tokens and the greatest proportion of tokens
in speech, ICE-AUS has both the second largest number of tokens and the second greatest
proportion of tokens in speech, ICE-GB is third and ICE-NZ fourth).
Table 4. Frequencies of the quasi-modals in speech and writing*
 | | ICE-AUS | ICE-NZ | ICE-GB | C-US
have to | Spoken | 1728 (1037) | 1218 (731) | 1390 (834) | 2069 (241)
have to | Written | 685 (274) | 1128 (451) | 1025 (410) | 388 (31)
have to | S:W ratio | 2.52:1 | 1.07:1 | 1.35:1 | 5.33:1
have got to | Spoken | 530 (318) | 327 (196) | 540 (324) | 266 (31)
have got to | Written | 35 (14) | 70 (28) | 38 (15) | 38 (3)
have got to | S:W ratio | 15.14:1 | 4.67:1 | 14.21:1 | 7.0:1
need to | Spoken | 347 (208) | 253 (152) | 293 (176) | 670 (78)
need to | Written | 338 (135) | 465 (186) | 260 (104) | 188 (15)
need to | S:W ratio | 1.02:1 | 0.54:1 | 1.12:1 | 3.56:1
be going to | Spoken | 1853 (1112) | 1578 (947) | 1642 (985) | 3821 (445)
be going to | Written | 198 (79) | 353 (141) | 178 (71) | 363 (29)
be going to | S:W ratio | 9.35:1 | 4.47:1 | 9.22:1 | 10.52:1
want to | Spoken | 1457 (874) | 980 (588) | 1142 (685) | 1966 (229)
want to | Written | 413 (165) | 530 (212) | 433 (173) | 638 (51)
want to | S:W ratio | 3.52:1 | 1.84:1 | 2.63:1 | 3.08:1
TOTAL | Spoken | 5915 | 4356 | 5007 | 8792
TOTAL | Written | 1699 | 2546 | 1934 | 1615
TOTAL | S:W ratio | 3.48:1 | 1.71:1 | 2.58:1 | 5.44:1
* C-US figures and all speech/writing figures are normalized to tokens per one million words
(raw figures in brackets).
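As a rough illustration of how the speech/writing figures in Table 4 relate to one another, the sketch below normalizes raw spoken and written counts by the size of each component and then takes their ratio. It assumes that the ICE spoken and written components contain 600 000 and 400 000 words respectively (300 and 200 texts of 2000 words each, as described in Section 3); the function names are hypothetical, and the example counts are the raw ICE-AUS figures for have to.

```python
ICE_SPOKEN_WORDS = 300 * 2000   # 300 spoken texts of 2000 words each
ICE_WRITTEN_WORDS = 200 * 2000  # 200 written texts of 2000 words each


def per_million(raw_count: int, component_words: int) -> float:
    """Scale a raw count to tokens per one million words of that component."""
    return raw_count * 1_000_000 / component_words


def sw_ratio(raw_spoken: int, raw_written: int,
             spoken_words: int, written_words: int) -> float:
    """Ratio of the normalized spoken frequency to the normalized written one."""
    return per_million(raw_spoken, spoken_words) / per_million(raw_written, written_words)


# Raw ICE-AUS counts for "have to" (bracketed figures in Table 4).
spoken_norm = per_million(1037, ICE_SPOKEN_WORDS)
written_norm = per_million(274, ICE_WRITTEN_WORDS)
ratio = sw_ratio(1037, 274, ICE_SPOKEN_WORDS, ICE_WRITTEN_WORDS)

print(f"spoken {spoken_norm:.0f}, written {written_norm:.0f}, S:W {ratio:.2f}:1")
```

Normalizing each component separately matters because the spoken component is larger than the written one; a ratio of raw counts would overstate the preference for speech.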
Consider next the modals in Table 5 (with, again, more detailed discussion
presented in Section 6 below).
Table 5. Frequencies of the modals in speech and writing*
 | | ICE-AUS | ICE-NZ | ICE-GB | C-US
must | Spoken | 495 (297) | 478 (287) | 527 (316) | 318 (37)
must | Written | 790 (316) | 1068 (427) | 898 (359) | 525 (42)
must | S:W ratio | 0.62:1 | 0.44:1 | 0.58:1 | 0.60:1
should | Spoken | 1053 (632) | 828 (497) | 1043 (626) | 764 (89)
should | Written | 1273 (509) | 2700 (1080) | 1245 (498) | 975 (78)
should | S:W ratio | 0.82:1 | 0.30:1 | 0.83:1 | 0.78:1
need | Spoken | 10 (6) | 10 (6) | 33 (20) | 0 (0)
need | Written | 33 (13) | 35 (14) | 35 (14) | 38 (3)
need | S:W ratio | 0.30:1 | 0.28:1 | 0.94:1 | 0.0:1
will | Spoken | 4270 (2562) | 3473 (2084) | 3818 (2291) | 4173 (486)
will | Written | 3265 (1306) | 4475 (1790) | 3925 (1570) | 3625 (290)
will | S:W ratio | 1.30:1 | 0.77:1 | 0.97:1 | 1.15:1
shall | Spoken | 50 (30) | 40 (24) | 218 (131) | 112 (13)
shall | Written | 175 (70) | 188 (75) | 230 (92) | 88 (7)
shall | S:W ratio | 0.28:1 | 0.21:1 | 0.94:1 | 1.27:1
TOTAL | Spoken | 5878 | 4829 | 5639 | 5367
TOTAL | Written | 5536 | 8466 | 6333 | 5251
TOTAL | S:W ratio | 1.06:1 | 0.57:1 | 0.89:1 | 1.02:1
* C-US figures and all speech/writing figures are normalized to tokens per one million words
(raw figures in brackets).
A comparison of the findings here with those for the quasi-modals above reveals
some intriguing parallels. We have seen that the ordering of the dialects with respect
to the frequency of quasi-modals corresponds to the ordering determined by the proportion of quasi-modals in speech. The figures in Table 5 mirror the situation with the
modals, with the strength of the dispreference for speech now being associated with
higher frequency (i.e. the strength of the frequency for declining modals is associated
with the perseverance of these items in more conservative, written, genres). A comparison of the speech/writing ratios across the four corpora yields the following ordering,
which differs from all previous orderings only in the reversal of positions for C-US and
ICE-AUS: ICE-AUS (1.06:1) > C-US (1.02:1) > ICE-GB (0.89:1) > ICE-NZ (0.57:1).
These findings provide strong evidence that the independently attested rise of the
quasi-modals and fall of the modals are related developments, and furthermore that they
are strongly linked not only to regional but also stylistic factors. As for the differences
between the two antipodean Englishes, the results suggest that AusE is more influenced
by American trends, and NZE is more “conservatively British” in orientation. A word of
caution is in order, however. In this study AusE does not pattern any more closely with
AmE than it does with the more conservative BrE and NZE. In their use of the modal
expressions examined here, Australians may be interpreted as dissociating themselves
both from the progressive and rapidly changing practices of the Americans, and from
the conservative and slowly changing practices of the British (and New Zealanders).
6. The individual quasi-modals
6.1 Have to
The figures in Table 1 above show that have to has increased in popularity in recent
British and American usage, while at the same time must has declined. It therefore
comes as no surprise that have to should, as Tables 2 and 3 show, outstrip must in the four
contemporary corpora examined. The degree of difference (and therefore the degree
to which the trend may have advanced) is considerably greater in AmE than in the
other varieties, the ordering being C-US (3.44:1) > ICE-AUS (2.13:1) > ICE-GB
(1.84:1) > ICE-NZ (1.65:1). This ordering reveals the same American domination
as that determined by the frequency of have to tokens across the varieties: see Table 2
where C-US (1385 tokens) > ICE-AUS (1311) > ICE-GB (1244) > ICE-NZ (1182).
These findings suggest that at least one important factor driving the popularity of
have to in AusE, BrE and NZE may be Americanization.
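The have to : must ratios quoted above can be recomputed from the frequencies in Tables 2 and 3. The Python sketch below is a minimal illustration under that assumption; the variable names are hypothetical, and because the published ratios appear to have been rounded or truncated slightly differently, the recomputed values may differ from them in the second decimal place.

```python
# Frequencies per million words, copied from Table 2 (have to) and Table 3 (must).
have_to = {"ICE-AUS": 1311, "ICE-NZ": 1182, "ICE-GB": 1244, "C-US": 1385}
must = {"ICE-AUS": 613, "ICE-NZ": 714, "ICE-GB": 675, "C-US": 402}

# Ratio of have to to must in each corpus.
ratios = {corpus: have_to[corpus] / must[corpus] for corpus in have_to}

# Order the corpora from the largest ratio to the smallest.
ordering = sorted(ratios, key=ratios.get, reverse=True)

for corpus in ordering:
    print(f"{corpus}: {ratios[corpus]:.2f}:1")
```

The same two dictionaries also give the ordering by have to frequency alone, via sorted(have_to, key=have_to.get, reverse=True).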
Furthermore, the figures in Tables 4 and 5 show have to to be considerably more popular in
speech than in writing (by a ratio of 1.98:1), contrasting strikingly with must (the modal
being almost twice as popular – 1.8:1 – in writing). This finding, in combination with
that of Mair and Leech (2006) that have to has been on the increase in recent British
and American writing, suggests that another possible factor is “colloquialization” (the
drift into other genres of – and increasing acceptability of – features associated with
colloquial speech). The result for AmE is particularly striking, with the frequency of
have to being more than five times greater in speech, and the same ordering of dialects
occurring on this variable as noted above for other variables: C-US (5.33:1) > ICE-AUS
(2.52:1) > ICE-GB (1.35:1) > ICE-NZ (1.07:1).
As Table 6 shows, have to expresses mainly deontic necessity, or obligation. Most
commonly the type of obligation conveyed is of an objective kind, with the deontic
source external to the speaker as in (1), rather than subjective, with the speaker as
deontic source as in (2).
(1)Yeah and if you’re a bit older I think and you have a a broader perspective of
things then when you have to do an assignment or a presentation your brain
is that little bit broader and it makes it easier to do something original and
inventive just being older [ICE-AUS S1A-042:231]
(2)You have to use your imagination. Have a look at the pieces and then
choose the pieces that fit into your lifestyle the pieces that are right for
you [ICE-AUS S2A-011:98]
Table 6. Meanings of the quasi-modals*
 | | Epistemic | Dynamic | Deontic
have to | ICE-AUS | 5 (0.4%) | 295 (22.7%) | 998 (76.9%)
have to | ICE-NZ | 29 (2.5%) | 309 (26.3%) | 835 (71.2%)
have to | ICE-GB | 2 (0.2%) | 332 (26.9%) | 902 (73.0%)
have to | C-US | 20 (1.5%) | 255 (18.6%) | 1099 (80.0%)
have got to | ICE-AUS | 8 (2.4%) | 42 (12.8%) | 278 (84.8%)
have got to | ICE-NZ | 5 (2.2%) | 34 (15.1%) | 186 (82.7%)
have got to | ICE-GB | 2 (0.6%) | 51 (15.4%) | 279 (84.0%)
have got to | C-US | 5 (2.9%) | 15 (8.7%) | 153 (88.4%)
need to | ICE-AUS | 6 (1.9%) | 217 (67.4%) | 99 (30.7%)
need to | ICE-NZ | 7 (2.3%) | 198 (63.9%) | 105 (33.9%)
need to | ICE-GB | 11 (4.2%) | 158 (59.6%) | 96 (36.2%)
need to | C-US | 10 (2.3%) | 305 (69.8%) | 122 (27.9%)
be going to | ICE-AUS | 659 (59.4%) | 446 (40.2%) | 4 (0.4%)
be going to | ICE-NZ | 525 (52.7%) | 466 (46.8%) | 5 (0.5%)
be going to | ICE-GB | 562 (57.3%) | 408 (41.6%) | 11 (1.1%)
be going to | C-US | 1217 (55.3%) | 957 (43.5%) | 25 (1.1%)
want to | ICE-AUS | 4 (0.4%) | 1021 (98.5%) | 12 (1.2%)
want to | ICE-NZ | 2 (0.3%) | 784 (98.5%) | 10 (1.3%)
want to | ICE-GB | 6 (0.7%) | 843 (98.4%) | 8 (0.9%)
want to | C-US | 15 (1.1%) | 1390 (97.9%) | 15 (1.1%)
* C-US figures are normalized to tokens per one million words.
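The percentages in Table 6 appear to be each meaning's share of the classified tokens for that item and corpus. The Python sketch below reproduces the ICE-AUS row for have to on that assumption; the function name meaning_shares is hypothetical, and rounding to one decimal place is assumed.

```python
def meaning_shares(counts: dict) -> dict:
    """Return each meaning's percentage share of the classified tokens."""
    classified = sum(counts.values())
    return {meaning: round(n / classified * 100, 1) for meaning, n in counts.items()}


# ICE-AUS counts for "have to", copied from Table 6.
have_to_aus = {"epistemic": 5, "dynamic": 295, "deontic": 998}

shares = meaning_shares(have_to_aus)
print(shares)
```

Because each share is rounded separately, the three percentages for an item need not sum to exactly 100.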
In this regard have to is distinguishable from must, which is more readily used with
strong subjective force, enabling the speaker to assert power and authority over the
addressee, as in (3).
(3)I’m not exactly sure but um ah er this is what I think but you must check
[ICE-AUS S1A-032:34]
The sum of the frequencies for each expression is slightly less than the frequencies
presented in Table 3 because indeterminate cases are excluded.
It is this difference that leads Myhill (1995) to suggest an explanation for the
encroachment of have to on the semantic territory of must based on the notion of
“democratization”, his term for the emphasis on equality of power which has emerged
as a desideratum in contemporary society. Have to may thus be an attractive option
for speakers seeking a more “democratic” and less overtly authoritative modal expression than must. The figures in Table 6 provide some quantitative support for Myhill’s
suggestion, to the extent that the relative popularity of the deontic meaning is greatest
in AmE (the dialect that we have identified as leading the way in the rise of this and
other quasi-modals), and the other three dialects are ordered in this respect as we have
noted them to be on a range of different variables (AmE > AusE > BrE > NZE). Not
surprisingly it is AmE that, as Table 7 shows, has the strongest dispreference of all four
dialects for deontic must.
Have to also expresses dynamic and epistemic necessity. Dynamic have to is
normally associated with some need relating to force of circumstances as in (4),
whereas with dynamic must the need often relates to an individual’s disposition or
behaviour as in (5).
(4)It’s true that roses do fall victim to disease, especially to fungus attacks which
damage their leaves, but it’s also true that you don’t have to spray them.
[C-US Frown-E02:14]
(5)We must also address in this Charter the complex forces that have
created the crises we face as well as working towards their improvement.
[ICE-AUS W1A-020:52]
The ascendancy of have to over must with root meanings does not extend to epistemic
necessity, exemplified with have to in (6).
(6)And it’s the nature of a new industry that if if it’s not been available in this
country before then you don’t have experience in it and we don’t we don’t
believe that ah it has to follow that the only people who are qualified to begin
a pay television service are those who’re already in television for the last thirty
years [ICE-AUS S1B-046:72]
Table 7. Meanings of the modals*
 | | Epistemic | Dynamic | Deontic
must | ICE-AUS | 185 (31.1%) | 40 (6.7%) | 369 (62.1%)
must | ICE-NZ | 244 (35.1%) | 31 (4.5%) | 420 (60.4%)
must | ICE-GB | 216 (33.3%) | 41 (6.3%) | 391 (60.3%)
must | C-US | 153 (39.5%) | 25 (6.5%) | 209 (54.0%)
should** | ICE-AUS | 134 (14.0%) | 0 (0.0%) | 826 (86.0%)
should** | ICE-NZ | 89 (6.5%) | 0 (0.0%) | 1280 (93.5%)
should** | ICE-GB | 112 (13.4%) | 0 (0.0%) | 721 (86.6%)
should** | C-US | 122 (16.9%) | 0 (0.0%) | 601 (83.1%)
need | ICE-AUS | 4 (22.2%) | 12 (66.7%) | 2 (11.1%)
need | ICE-NZ | 3 (16.7%) | 11 (61.1%) | 4 (22.2%)
need | ICE-GB | 8 (25.0%) | 21 (65.6%) | 3 (9.4%)
need | C-US | 0 (0.0%) | 10 (66.7%) | 5 (33.3%)
will | ICE-AUS | 2662 (72.4%) | 945 (25.7%) | 68 (1.9%)
will | ICE-NZ | 2673 (73.0%) | 940 (25.7%) | 51 (1.4%)
will | ICE-GB | 2523 (69.1%) | 1044 (28.6%) | 84 (2.3%)
will | C-US | 2128 (56.9%) | 1563 (41.8%) | 51 (1.4%)
shall | ICE-AUS | 1 (1.0%) | 48 (49.0%) | 49 (50.0%)
shall | ICE-NZ | 5 (5.2%) | 43 (44.8%) | 48 (50.0%)
shall | ICE-GB | 14 (6.4%) | 90 (40.9%) | 116 (52.7%)
shall | C-US | 31 (30.1%) | 41 (39.8%) | 31 (30.1%)
* For C-US figures are normalized to tokens per one million words.
** The frequencies for should exclude both the use where it is an alternative to would and the
subjunctive use.
The epistemic meaning accounts for 33.2% of all must tokens – AmE having
the highest frequency of the four dialects with 38.1% in C-US – but only 1.1% of have
to tokens. Curiously NZE, with 29 tokens of epistemic have to, did not display as strong
a dispreference for this meaning as the other three dialects. The only other modal
with sufficiently robust numbers to be regarded as a serious competitor for have
to in the semantic field of deontic necessity is should. Table 1 indicates that should
has undergone a decline in recent British and American writing. It seems likely that
this decline has occurred in speech as well, given the smaller numbers for should in
speech as against writing in the present study (see Table 5). Furthermore the relatively modest number of shoulds in AmE (850), and the fact that it has the lowest
proportion of tokens expressing deontic necessity (see Table 7), suggests that it
may be leading the way in the decline of should, with NZE (1577 tokens) the most
conservative of the dialects. Even though deontic should is commonly, like deontic
must, associated with subjectivity, its strength is weaker than that of must (compare
the weakly subjective advice expressed by should in (7) below with the strongly
subjective imposition associated with must in (3) above). Should thus presents as a less
overbearing deontic modal than must, and this may be a factor both in the superiority of its
numbers over must and in its quite healthy numbers when compared to have to:
should (4692) versus have to (5121), with frequencies for C-US normalized to tokens
per million words.
(7) maybe you should say Hey hey look at that spunky tutor I’m sitting
next to down down in the front row [ICE-AUS S1A-020:114]
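The regional comparison of deontic should rests on the row percentages in Table 7, each of which is a count divided by its row total. The following is a minimal sketch of that arithmetic, using only the should counts printed in the table; the function and variable names are illustrative and are not part of the study.

```python
# Counts for "should" as (epistemic, dynamic, deontic), copied from Table 7.
SHOULD = {
    "ICE-AUS": (134, 0, 826),
    "ICE-NZ": (89, 0, 1280),
    "ICE-GB": (112, 0, 721),
    "C-US": (122, 0, 601),
}


def shares(counts):
    """Return each count as a percentage of the row total, to one decimal."""
    total = sum(counts)
    return tuple(round(100 * c / total, 1) for c in counts)


# The deontic share is the third value in each row.
deontic = {corpus: shares(row)[2] for corpus, row in SHOULD.items()}

# List the corpora from the lowest to the highest deontic share.
for corpus, pct in sorted(deontic.items(), key=lambda kv: kv[1]):
    print(f"{corpus}: {pct}% deontic")
```

On these counts C-US has the lowest deontic share and ICE-NZ the highest, in line with the discussion above. The row totals (723 for C-US, 1369 for ICE-NZ) are lower than the overall counts of 850 and 1577 quoted in the text, presumably because Table 7 excludes the uses listed in its second note.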
6.2 Have got to
Have got to is semantically similar to have to, but differs from it in its syntactic properties
and stylistic distribution. Unlike have to, have got to exhibits most of the formal features
of the modal auxiliaries: no non-tensed forms (e.g. *to have got to, *having got to); unable
to cooccur with modals (e.g. *may have got to); operator functions (e.g. negative forms
with n’t such as hasn’t got to, and inversion with the subject as in has she got to?). Have to
is, furthermore, consistently more popular than have got to in all four dialects, as Table 2
shows, ranging from eight times as popular in C-US to between roughly four and five times in the other
dialects: C-US (8.00) > ICE-NZ (5.18) > ICE-AUS (3.94) > ICE-GB (3.66).
While Mair and Leech’s (2006) figures for recent written English as presented in
Table 1 show a decline for have got to in BrE and a modest rise in AmE, some (Krug
2000: 63; Smith 2003: 263; Leech 2003: 229) have noted that it is becoming more
common in conversation (where presumably it is less likely to attract the attention
of prescriptivists censuring the use of got). The figures in Table 4 show that have got
to had the strongest preference for speech (9.18:1) of all the quasi-modals. It is the
only quasi-modal for which C-US did not have the greatest number of tokens of the
four dialects: in fact C-US had the smallest number of tokens by far (see Table 2).
BrE and AusE had the largest number, and at the same time the strongest proportion
of tokens in speech.
Have got to expresses deontic necessity, as in (8), even more frequently than have
to, and like it presents a more “democratic” option than must. AmE heads the other
dialects in the extent of its preference for this meaning. The epistemic meaning, as in
(9), is rare, as for have to.
(8) you can hardly isolate one part of course like the tutorials and look at it by itself
you’ve got to look at all the components [ICE-NZ S1B-007:102]
(9) There’s got to be something better to read [C-US SBC-023:1041]
6.3 Need to
As Table 1 indicates, need to has undergone a spectacular increase in popularity in
recent British and American writing, particularly the former, while at the same time
its auxiliary counterpart need has suffered a massive decline. However these figures,
limited as they are to the written word, do not tell the full regional story of need to and
need and their contrasting fortunes. Once we enter spoken English into the equation
we find that, in the rise of need to and the decline of need, once again it is AmE leading
the way and that BrE (rather than NZE, which patterns closely with AusE with these
items) is the most conservative of the four dialects. As Table 2 shows, the ordering for
frequency of tokens with need to is: C-US (473) > ICE-AUS (343) > ICE-NZ (338) >
ICE-GB (280). As Table 3 shows, the ordering for frequency of tokens with need is:
ICE-GB (34) > ICE-NZ (20) > ICE-AUS (19) > C-US (15). Tables 4 and 5 reveal some
striking correlations between these findings and those for the comparative popularity
of the two modal items in speech and writing: in C-US need to has the strongest preference for occurrence in speech of the four dialects (3.56:1) and need the weakest (0.0:1);
in ICE-GB need has the strongest preference for occurrence in speech (0.94:1).
Need to and need are semantically alike, with their proportions of deontic, dynamic
and epistemic meanings (see Tables 6 and 7) being similar and roughly comparable to
those for have to and have got to. Like have to and have got to, but not on account of the
objective orientation displayed by this pair, they may be felt by contemporary English
users to convey a less authoritarian tone when used deontically. In the case of need
to and need, the deontic source is, at a literal level, the subject-referent: their use as
expressions of obligation comes about via a pragmatic extension of their intrinsically
dynamic sense, with the speaker expressing a requirement which appears to acknowledge the subject-referent’s needs, as in (10) (see further Smith 2005; Nokkonen 2006).
(10) The onus of proof is borne by the Crown and the accused need not persuade
you of anything and he is presumed to be innocent and he enjoys that
presumption until a jury decides otherwise [ICE-AUS S2A-065:131]
It is presumably the syntactic inflexibility of auxiliary need, restricted as it is largely to
negative clauses as in (10), that accounts for its declining fortunes at the expense of its
quasi-modal counterpart.
6.4 Be going to
Table 2 indicates that be going to was more than twice as frequent in the American
corpus as it was in the others, the regional frequencies for this quasi-modal – C-US
(2413) > ICE-AUS (1191) > ICE-NZ (1088) > ICE-GB (1056) – suggesting that
Americanization may be a factor in its growing popularity. That colloquialization
may be another relevant factor, at least for AmE, is suggested by the finding that
be going to is strongly preferred in speech over writing (by a ratio of 8.1:1), taken
in conjunction with Leech’s (2003) finding that be going to enjoyed an increase in
popularity in American writing (51.6%) between 1961 and 1991/2. Again we find a
correlation between overall frequency and speech/writing ratios (see Table 4): C-US
(10.52:1) > ICE-AUS (9.35:1) > ICE-GB (9.22:1) > ICE-NZ (4.47:1).
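One way to make the claimed correlation concrete is to compare the two orderings with a rank correlation. The sketch below uses Spearman's coefficient, which is a choice made here for illustration and not a statistic reported in the study; the figures are the be going to frequencies and speech/writing ratios quoted above, and with only four data points the result is indicative at best.

```python
# Figures for "be going to" as quoted in the text (Tables 2 and 4).
FREQ = {"C-US": 2413, "ICE-AUS": 1191, "ICE-NZ": 1088, "ICE-GB": 1056}
RATIO = {"C-US": 10.52, "ICE-AUS": 9.35, "ICE-GB": 9.22, "ICE-NZ": 4.47}


def ranks(values):
    """Rank 1 = largest value. Assumes no ties, which holds for these figures."""
    order = sorted(values, key=values.get, reverse=True)
    return {name: i + 1 for i, name in enumerate(order)}


def spearman(a, b):
    """Spearman rank correlation for two dicts with the same keys and no ties."""
    ra, rb = ranks(a), ranks(b)
    n = len(a)
    d2 = sum((ra[k] - rb[k]) ** 2 for k in a)
    return 1 - 6 * d2 / (n * (n ** 2 - 1))


rho = spearman(FREQ, RATIO)
print(f"Spearman rho = {rho:.2f}")
```

The two orderings differ only in the relative position of ICE-GB and ICE-NZ, which is why the coefficient is high but not perfect.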
As Table 6 shows, be going to mostly expresses epistemic modality (56.1%) and
dynamic modality (43.1%). Epistemic be going to differs from epistemic will in always
locating the situation in future time. It is thus in cases involving future reference, rather
than in cases such as (11) and (12) where will has no future reference, that epistemic
be going to is throwing down the gauntlet to epistemic will.
(11) No and the level of acceleration ah at any point will be ah related to the ah
instantaneous radius that it’s turning [ICE-AUS S1B-064:261]
(12) If New Zealand loses the distinctive “whio... whio...” from its mountain
streams then it has lost not only a national symbol of the back country,
but will have sacrificed the quality and character of the country’s river
systems. [ICE-NZ W2B-026:113]
Be going to often presents itself as an attractive option if the speaker wishes to highlight
the immediacy of the event, with situations that are on the verge of occurring or are
already in train, as in (13).
(13) I’m just going to top up my tea again if you don’t mind [ICE-GB S1A-067:169]
While dynamic will (and shall) tend to express “willingness”, dynamic be going to tends
to express the weaker sense of “intention”. Thus in (14) I’m not gonna is paraphraseable
as “I don’t intend to” whereas I won’t in the same context would be paraphraseable by
“I refuse to”:
(14) OK so this one obviously has got its body wall cut up so I’m not gonna
do another one now this afternoon but you can look at this one
[ICE-AUS S2A-052:10]
6.5 Want to
The frequencies in Table 2 show that want to was considerably more popular in the
American corpus than in the others, with the following ordering: C-US (1425) > ICE-AUS
(1039) > ICE-GB (858) > ICE-NZ (800). This finding is compatible with those of two
recent diachronic investigations that attest to the rising popularity of want to, particularly in AmE. Krug (2000) compared the frequencies of want to in samples of press and
fictional writing in LOB/Brown and FLOB/Frown and, noting huge increases in the
American corpora, concluded that “while the rise of the new volitional modal probably did not originate in the US, the change obviously caught on more rapidly here
than in Britain” (p. 135). Leech’s (2003) study, figures from which are replicated in Table
1 above, found that want to enjoyed a spectacular increase in popularity of 70.9% in
American writing between 1961 and 1991/2, with a milder increase of 18.5% in British
writing.
The finding of the present study that want to is preferred in speech over
writing by a ratio of 2.9:1 (see Table 4) both confirms the accuracy of Krug’s
(2000: 136) claim that it is “approximately three times more common in spoken English”, and points to the validity of his suggestion that colloquialization has had a
role to play in the frequency gains experienced by want to in contemporary written
English. Interestingly, the neat correlation between frequencies and speech/writing
ratios that has been noted with a number of the quasi-modals and modals was
disrupted with want to, with AmE being surpassed by AusE: ICE-AUS (3.52:1) > C-US
(3.08:1) > ICE-GB (2.63:1) > ICE-NZ (1.84:1).
Semantically, want to is predominantly a dynamic modal expression, with a meaning
comparable to that of dynamic will, as in (15), where wanted to is paraphraseable by
“was willing to”:
(15) “My brother wanted to live my life for me” [C-US Frown-P05:88]
An inspection of the subjects of dynamic want to and will revealed an interesting
difference. Dynamic want to far more commonly takes a 2nd person subject (28.5%)
than does dynamic will (7.7%), while by contrast dynamic want to selects a 1st person
subject less commonly (40.5%) than does dynamic will (68.1%). What this finding
suggests is that dynamic want to is not invading the semantic territory of dynamic will
indiscriminately, but rather in a way that is bringing about a (partial) distributional
complementarity.
Although want to is dominantly dynamic, there is a smattering of tokens in which
it expresses deontic modality, as in (16), and epistemic modality, as in (17):
(16) You want to watch that I mean you could lose hours of work
[ICE-AUS S1A-006:269]
(17) Tough games for Agassi now. He wouldn’t wanna get behind two sets to love
against a big serve volleyer like Martin who’s got some good groundies too
[ICE-AUS S2A-004:138]
The fact that such modal meanings are beginning to appear lends further support to
the morphological evidence that want to is undergoing auxiliarization/modalization
with the incorporation of the infinitival to into a compound often written as wanna in
informal styles.
7. Conclusion
The findings of the present study are compatible with those of diachronic studies suggesting that recent decades have seen an increase in the popularity of the quasi-modals
and a decline in that of the modals. The near symmetrical nature of the results for
semantically-paired quasi-modals and modals, across the spoken and written modes
of four Englishes, suggests furthermore that the two trends are interconnected. It is
tempting to propose a hypothesis that might be tested in comparable studies of other
grammatical categories: a language will maintain a quantum of exponents for the
expression of a semantic category (modality, in the present study), such that if there
is an increase in the frequency of one type of exponent this will be at the expense of
other types.
The findings yielded a consistent regional pattern: it is AmE that is in the vanguard
of change in the rise of the quasi-modals and the decline of the modals. At the same
time there is ample evidence that stylistic factors are also at play, with quasi-modals
flourishing in speech, their modal counterparts maintaining a penchant for the written
word. Furthermore there is a connection between the regional and the stylistic: it is
in AmE that the stylistic gulf between quasi-modals and modals is most marked, and
there is a regular ordering of the other three dialects examined. BrE and NZE consistently differentiate themselves from AmE at the conservative end of the spectrum,
with AusE located in-between, Australians seemingly not prepared to completely
differentiate themselves from their more conservative New Zealand “cousins” and
British “parents” on the one hand, and yet not prepared to yield to the seductive
linguistic might of the Americans on the other.
References
Facchinetti, Roberta, Manfred Krug & Frank Palmer (eds). 2003. Modality in Contemporary English. Berlin: Mouton de Gruyter.
Krug, Manfred. 2000. Emerging English Modals. A Corpus-based Study of Grammaticalization.
Berlin: Mouton de Gruyter.
Leech, Geoffrey. 2003. “Modality on the move: The English modal auxiliaries 1961–1992”. In
Facchinetti et al. (eds): 223–40.
Mair, Christian & Geoffrey Leech. 2006. “Current changes in English syntax”. In Bas Aarts &
April McMahon (eds), Handbook of English Linguistics. Oxford: Blackwell, 318–42.
Myhill, John. 1995. “Change and continuity in the functions of the American English modals”.
Linguistics 33: 157–211.
Nokkonen, Soili. 2006. “The semantic variation of NEED TO in four recent British English
corpora”. International Journal of Corpus Linguistics 11: 29–71.
Palmer, Frank. 1990. Modality and the English Modals, 2nd edn. London: Longman.
Smith, Nicholas. 2003. “Changes in the modals and quasi-modals of strong obligation and
epistemic necessity in recent British English”. In Facchinetti et al. (eds): 241–66.
The perfect and the preterite in Australian
and New Zealand English
Johan Elsness
University of Oslo
The distinction between the present perfect and the preterite verb forms is
one of the comparatively few points of English grammar where clear differences
have been noted between the various national varieties, not least between
American and British English: it has often been pointed out that the present
perfect is used more extensively in the latter variety. This chapter takes up the
distribution of the two verb forms in Australian and New Zealand English.
A wide use of the present perfect is documented in both antipodean varieties,
but especially in Australian English. At the same time a trend is recorded for
younger speakers of Australian English to be moving in the direction of the
more restrictive American English norm.
1. Introduction
Like a large number of other languages, English has two main verb forms used in
references to past time: a synthetic preterite (past) tense and a periphrastic present
perfect construction, as in, respectively,
(1) She just came.
and
(2) She has just come.
The distribution between these two verb forms varies a great deal, both synchronically
between different languages and language varieties, and diachronically within individual languages, including English. In English the preterite tense is definitely the more
frequent of the two. In many kinds of present-day English texts the preterite tense has
been found to be at least ten times as frequent as the present perfect.
One major distinguishing feature in English is that the present perfect does
not generally allow any clear specification of the past time referred to, as long
as that time is located wholly in the past. The present perfect is used to refer to
points or periods of time which wholly precede the deictic zero-point without that
preceding time being clearly located, and to periods of time defined as extending
from the past up until the deictic zero-point, and possibly further into the future,
as in, respectively:
(3) I’ve been there several times.
(4) I’ve lived here since 1999.
More or less clearly defined past time wholly preceding the deictic zero-point will
usually be expressed by the preterite tense instead. In the most typical cases the past
time will be identified by means of a temporal adverbial (such as yesterday, two weeks
ago, in 1989), but there are also all sorts of indirect ways of establishing reference
times for a preterite tense to attach itself to. Quite often the time referred to by a
preterite tense remains vague, as may well be the case in a sentence like (1) above.
The distinction between the two verb forms is drawn very differently even in
closely related languages such as German and French. Generally, these languages use
the present perfect more widely than English, as there is no similarly strict ban on the
combination of the present perfect with clear specifications of points or periods of
time located wholly in the past. In the case of German, the use of the present perfect
is especially widespread in southern dialects, including Austrian and Swiss German,
where combinations like that seen in (5) are particularly common:
(5) Ich habe ihn gestern gesehen.
Similarly, sentences like (6) are straightforward in French:
(6) Je l’ai vu hier.
By contrast, an English sentence like
(7) *I’ve seen him yesterday.
would usually be deemed unacceptable.
In English this difference between the present perfect and the preterite is often
seen as the chief criterion defining the distributional distinction between the two verb
forms. An alternative theory holds that the fundamental difference between them is
rather that the present perfect is selected to express what is often termed resultativeness,
or current relevance more generally, as when
(8) Has Joan come yet?
is uttered to ask whether Joan is now at the place in question. In such cases (8) will be
more or less synonymous with
(9) Is Joan here (yet)?
One major problem with all current relevance theories is that it is extremely difficult to
define what exactly the term “current relevance” implies – most, if not all, past events
can be seen as having a certain current relevance.¹
Current relevance in its purest form occurs when the past-referring verb denotes
the inception of a durative situation which still obtains at the deictic zero-point, as
when (8) above is intended in the sense of (9). In such cases the present perfect may be
said to be associated with a pretty clear resultative meaning.
The trouble is that even the preterite tense may be used of past situations which
are resultative in this sense. A sentence like
(10) Joan came five minutes ago.
will normally be given a reading which is just as resultative as that of
(11) Joan has come.
Barring contextual signals to the contrary, (10) as well as (11) will be taken to imply
that Joan is now here.
In present-day English the distributional distinction between the present perfect
and the preterite is far from clear-cut. There are lots of cases where the temporal location
of the situation referred to is so vaguely defined that either verb form may easily be used.
This may be because this location is determined by an adverbial or other constituent
which only denotes a very vague past time, with an uncertain temporal distance from
the deictic zero-point – we have already seen constructions with just (cf. examples (1)
and (2) above) – or it may be because the temporal reference is determined less directly,
without any clear temporal reference being expressed by adverbials or other constituents
but rather signaled more vaguely by either the linguistic or the situational context. Thus,
as well as sentence (11) “Joan has come.”, one may equally say
(12) Joan came.
even without any very clear past-time reference being established by either the linguistic
or the situational context. In such cases it may seem as if a preterite verb form often
places the verbal situation in a past time-sphere, sometimes triggering expectations
that further situations will be placed in the same past time-sphere. A sentence like (12)
above may often occur in the following kind of context:
(13) Joan came. She told me that … .
By contrast, a present-perfect sentence like (11) “Joan has come.” will often occur in
the context of references to the present time-sphere, as exemplified by
(14) Joan has come. She intends to stay at least until the weekend.
1. For a survey of theories attempting to explain the distributional distinction between the
present perfect and the preterite in English, see e.g. Elsness (1997) and McCoard (1978).
Usage in these and many other cases is far from settled, however. Individual speakers
may use different verb forms on different occasions. There are also some pretty clear
dialectal differences, not least between the major national varieties of present-day
English. It has often been pointed out that as far as the two best studied national varieties are concerned, the present perfect is sometimes used in BrE where AmE would
prefer the preterite.
Vanneck (1958) was among the first to provide specific evidence for the claim
that in AmE the preterite may be common in constructions where the present perfect
would be expected in BrE. He links what he calls the “colloquial preterite” especially
with spoken AmE, where he has recorded cases like
(15) Spain’s a nice country. I know some people who were there. (Vanneck 1958: 239)
A generation later Görlach (1987) notes that Quirk et al. (1985) list the opposition
between the present perfect and the preterite as one of the few points of present-day
English grammar where there is a pretty distinct difference between AmE and BrE.
On the basis of a comprehensive corpus investigation, Elsness (1997) was able
to confirm that the present perfect is indeed used more sparingly in AmE than in
BrE, and that this difference is not confined to the spoken language. This AmE/BrE
difference is placed in a diachronic perspective: while in languages such as German
and French the use of the present perfect has increased over the centuries and seems
to be continuing to increase, a similar growth in the use of the English present perfect leveled off in Early Modern English times, and in Late Modern English the
use of the present perfect even appears to be receding, especially in AmE. Elsness
(2009) finds that a continuing development along these lines is notable within the
thirty-year period from the early 1960s to the early 1990s: the frequency of the
present perfect still appears to be declining in both AmE and BrE but continues to
be higher in BrE.
At the same time a seemingly contrary development has been noted by several
writers: at least in BrE there appears to be a tendency – apparently an increasing one –
for the present perfect rather than the preterite to be used even in certain cases where
the verb form is accompanied by a clear specification of past time. This usage seems to
be largely confined to colloquial registers.
Trudgill (1984) is one writer who comments on this particular perfect usage:
The rules governing the use of the present perfect in Standard English English
seem to be altering somewhat, and there appears in particular to be an increase in
the usage of forms such as:
I’ve seen him last year
He’s done it two days ago
(Trudgill 1984: 42)
Engel (2002) also refers to claims that in BrE the present perfect can now be used
with clear past-time specifiers such as last year, three years ago, although she adds
that “we must concede that, on a global scale, this tendency would appear to be
weaker than the opposing one (the expansion of the [preterite]) led by American
English” (Engel 2002: 258). In her own corpus of BrE radio talk shows and news
bulletins she does not seem to have recorded any really convincing examples of the
extended perfect usage. Such examples are recorded by Cotte (1987), however, who presents
quite a few examples of the present perfect being used with very clear specifications
of past time, many of them from spoken (radio) sources. Two of these (the second
from AusE) are (Cotte 1987: 91):
(16) Well, everybody’s got to make their own decision, as I’ve said yesterday …
(Interview with Lord Brittan on BBC Radio 4, 1983)
(17) There have been more deaths in Northern Ireland yesterday.
(Australian radio news, 1976)
Such a liberal perfect usage is nothing new in English. Elsness (1997: 292–3) recorded
similar examples spanning the whole of the Modern English period:
(18) Knew you not Pompey? Many a time and oft
Have you climbed up to walls and battlements,
To towers and windows, yea, to chimney-tops,
Your infants in your arms, and there have sat
The live-long day, with patient expectation,
To see great Pompey pass the streets of Rome.
And when you saw his chariot but appear,
Have you not made an universal shout…
(Shakespeare, Julius Caesar, 1600)
In (18) the choice between the present perfect and the preterite is apparently dictated
more by consideration of the metre than by any strict temporal distinction.
(19) Lady Sneer. But do your brother’s distresses increase?
Joseph S. Every hour. I am told he has had another execution in the house
yesterday. (Sheridan, The School for Scandal, 1777)
The following examples are from Visser (1973: 2197):
(20) ... which I have forgot to set down in my Journal yesterday. (Pepys’ Diary, 1669)
(21) The Englishman ... has murdered young Halbert ... yesterday morning.
(Scott, Monastery, 1820)
(22) I have been to Richmond last Sunday. (Galsworthy, In Chancery, 1920)
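The combination discussed in this section, a present perfect accompanied by a definite past-time adverbial such as yesterday or last Sunday, can be approximated by a surface search when candidate examples are being collected from untagged text. The sketch below is a deliberately crude heuristic written for illustration, not the retrieval method of any study cited here: the adverbial list is a small hypothetical sample, no participle is checked, and contracted 's is ignored because it is ambiguous between has and is. Every hit would still need manual inspection.

```python
import re

# Crude surface patterns, not a parser. "have"/"has" as main verbs will
# produce false positives, and "'s" is left out as ambiguous.
PERFECT_AUX = re.compile(r"\b(?:have|has)\b|['’]ve\b", re.IGNORECASE)

# A small, hypothetical list of definite past-time adverbials.
PAST_TIME = re.compile(
    r"\b(?:yesterday|ago|last\s+(?:night|week|month|year|"
    r"(?:mon|tues|wednes|thurs|fri|satur|sun)day))\b",
    re.IGNORECASE,
)


def perfect_with_past_time(sentence: str) -> bool:
    """True if the sentence contains both a perfect auxiliary and a
    definite past-time adverbial from the sample list."""
    return bool(PERFECT_AUX.search(sentence)) and bool(PAST_TIME.search(sentence))


EXAMPLES = [
    "I’ve seen him yesterday.",
    "I have been to Richmond last Sunday.",
    "I’ve been there several times.",
    "Joan came five minutes ago.",
]
for s in EXAMPLES:
    print(perfect_with_past_time(s), s)
```

The first two sentences are flagged; the third has no definite past-time adverbial and the fourth has no perfect auxiliary, so neither is flagged.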
2. The perfect and the preterite in AusE and NZE
The key question facing us here is where AusE and NZE place themselves in the
company of the other national varieties of present-day English: are they more similar
to AmE or to BrE, or are they perhaps halfway between the two, or have they set out
on their own?
It would be an exaggeration to say that the relationship between the present
perfect and the preterite – or the use of other verb forms, for that matter – has
received a great deal of attention in the available literature on AusE and NZE. One
reason for that is that many of the publications which have appeared focus on matters of phonology/phonetics and on lexical peculiarities, which represent the most
striking differences between AusE, NZE and the other national varieties. Several
of the publications which do address morphosyntactic topics also leave the present
perfect/preterite contrast unmentioned. Since there are clear, and well documented,
differences between AmE and BrE in this area, the position of AusE and NZE is,
however, of some interest.
Engel and Ritz (2000) offer a comprehensive treatment of the present perfect in
AusE, although their data are limited to a certain segment of the spoken language: news
bulletins and chat shows put out by three local Australian radio stations (one based in
Sydney, the other two in Perth – see Engel & Ritz 2000: 129).² In this material they record
a distinctly more extensive use of the present perfect than has usually been reported
as acceptable in English. Above all they note a number of instances where the present
perfect occurs in combination with clear adverbial specifications of past time, i.e. cases
where the preterite, and not the present perfect, would usually be the expected verb.
Some examples are (from Engel & Ritz 2000: 130):
(23) In the morning he’s stuck an “I love Redman” sticker on her back.
(Chat show, Perth)
(24) Police confirm that at 16.30 hours yesterday the body of Ivan Jepp
has been located. (News, Perth)
In quite a few cases the most noteworthy thing about the use of the present perfect is
that this verb form occurs in a sequence of past-time references (examples from Engel &
Ritz 2000: 130 again):
2. In their discussion of the present perfect in AusE, Collins and Peters (2004: 597–8) note
that the study published in Engel and Ritz (2000) suggests that the ‘generalization of the
present perfect to simple past contexts’ may be more advanced in AusE than in other varieties
of English, although they comment that the material put forward in Engel and Ritz (2000) is
‘symptomatic rather than quantitative’ (Collins & Peters 2004: 598).
(25) After the collision, the vehicle has sped off. (News, Perth)
(26) A man has been injured when the tanker he was driving crashed into …
(News, Perth)
(27) Remember last year we were giving out … the “I hate Redman” and “I love
Redman” stickers? …Well, there was a man, he’s used his initiative. … He’s
obviously got a handful of these stickers and he’s cut them all up … and he’s
made a new sticker and it says “I tolerate Redman”. (Breakfast show, Perth)
Now the use of the present perfect has been found to be particularly frequent in radio
news bulletins even in BrE. Elsness (1984 and 1997, esp. pp. 156–9) investigated the
distribution of past-referring verb forms in BBC radio news bulletins. The frequency
of the present perfect was found to be at its highest in the introductory news headlines,
where there are few adverbial specifiers and also few sequences of past-time reference;
then the frequency of the present perfect declines in the introduction to the more
detailed treatment of each item, and goes further down in the following text. What is
characteristic of the BBC news bulletins, however, is that the present perfect is used in
the absence of any clear specification of a definite past time.³
Some of the present perfect examples adduced by Engel and Ritz from their AusE
data are such that this verb form would be straightforward even in BrE (and perhaps
AmE). After all, it is only very clear specifications of distinct points or periods of time
in the past which tend to block the use of the present perfect in those other varieties;
cases of more vague past-time specification combine fairly freely with the present
perfect. And we have seen that cases of the present perfect combining with very definite specifications of past time have been reported even for BrE, although it may seem
as if such perfect uses are more widespread in AusE.⁴ To what extent that is really the
case is a question we shall consider further below.
As for NZE, Bauer takes up the distinction between the present perfect and the
preterite in several of his publications. Bauer (1989: 70–1) reports cases from Radio
New Zealand news of the present perfect being used with clear past-time specification,
two of them being:
(28) Sanctions have been imposed by the UN thirteen years ago.
(Radio New Zealand news, 1979)
(29) The union has informed the employers yesterday that … .
(Radio New Zealand news, 1980)
3. Example (26) above comes close to the kind of present perfect use commonly occurring
initially in BBC news bulletins.
4. See references to Cotte (1987), Engel (2002) and Trudgill (1984) above.
However, in an elicitation test carried out by Bauer the informants (who were university
students) showed a very clear preference for the preterite in similar constructions. In the
case of constructions with yet the informants sided with BrE rather than AmE, rejecting
the preterite alternative in a construction such as:
(30) Have you read the book I recommended yet?
The conclusion seems to be that educated NZE does not differ significantly from what is
assumed to be the BrE norm in its treatment of the present perfect/preterite opposition.
This is confirmed in Bauer (1994: 400–1), where the opposition between the two
verb forms is absent from a list of nine grammatical factors said to distinguish standard
NZE from standard BrE. However, Bauer includes “generalization of the perfect to
simple past contexts” in a subsequent list of grammatical points which are claimed to
be characteristic of nonstandard NZE. Bauer’s example is:
(31) I have seen it last week.
Hundt (1998) devotes five pages to the present perfect and the preterite in NZE, largely
discussing occurrences with the adverbials yet, since and just (Hundt 1998: 70–5). She
records few instances which deviate significantly from standard BrE, and concludes
that “If the perfect is at all current in past contexts in NZE, it is probably a development
which has not yet affected the written medium.” (Hundt 1998: 74–5)5
Quinn (1999) also has a brief reference to the distinction between the present
perfect and the preterite in NZE, referring to Bauer’s and Hundt’s treatments. She
notes that neither Bauer nor Hundt has been able to confirm the use of the present
perfect with past-time specification in contexts other than the radio news bulletins
recorded by Bauer (Quinn 1999: 196).
3. AusE and NZE in the company of the other national varieties
To survey the use of the present perfect and the preterite in AusE and NZE, and compare
that use with AmE and BrE, we shall look at data from the following corpora:
i. The Brown University Corpus of American English from 1961
ii. The Frown Corpus of American English from 1992 (Freiburg update of Brown)
iii. The Lancaster-Oslo/Bergen (LOB) Corpus of British English from 1961
iv. The FLOB Corpus of British English from 1991 (Freiburg update of LOB)
v. The Australian Corpus of English (ACE) from 1986
vi. The Wellington Corpus of Written New Zealand English (WWC) from 1986
vii. The Wellington Corpus of Spoken New Zealand English (WSC)
viii. The Australian section of the International Corpus of English (ICE-AUS)
ix. The New Zealand section of the International Corpus of English (ICE-NZ)
x. The Australian radio talkback corpus (ART)
5. See also Hundt et al. (2004: 567–8).
The perfect and the preterite in Australian and New Zealand English
(i) – (vi) each consist of (a little more than) 1 million words of printed (and published)
texts, divided between informational and fictional prose. The further textual composition of these corpora is also largely similar, so as to facilitate comparison. (vii) – (ix)
likewise consist of c. 1 million words each, (viii) and (ix) divided between c. 600 000
words of spoken and c. 400 000 words of written language.6 (x) is a smaller corpus: it
consists of c. 256 000 words of unscripted spoken language, some from public radio
(ABC), some from commercial radio. At the time of writing (i) – (v), but not (vi) – (x),
are available in editions where they are supplied with a system of grammatical tagging
at the word-class level.
Since in practice it is impossible to count all verb forms in untagged corpora of
this size, Table 1 is confined to occurrences of 16 verbs. They are begin, choose, drive,
eat, fall, fly, give, grow, know, ride, see, sing, speak, steal, take and write. These particular
verbs were selected because (i) they can be assumed to be comparatively frequent in
many types of text, and (ii) they are irregular, with distinct forms for the preterite and
the past participle, which made electronic searches for these forms possible even in the
untagged corpora.7 It is also helpful that between them these verbs span a fairly wide
semantic distribution, for example in terms of the telic/atelic (or bounded/unbounded)
distinction. The verbs be and do, although topping frequency lists, were avoided because
of the various auxiliary functions they may have, which might disturb the comparison
between the present perfect and the preterite.
Table 1. Frequencies of the present perfect and the preterite of 16 verbs across corpora;
present perfect/preterite ratios. Active, positive, declarative, non-progressive
constructions only.

Corpus       Pr. perf.   Preterite   Ratio
Brown        184         2417        0.08
Frown        199         2212        0.09
LOB          247         2267        0.11
FLOB         210         2167        0.10
ACE          219         2034        0.11
WWC          194         2071        0.09
ICE-AUSwr    86          644         0.13
ICE-NZwr     124         927         0.13
ICE-AUSsp    270         858         0.31
ART          83          240         0.35
ICE-NZsp     201         732         0.27
WSC          329         1610        0.20
6. There is partial overlap between WSC and the spoken section of ICE-NZ.
7. It is true that some forms of these verbs have unwanted homographs (e.g. saw, fell) which
would be included in electronic counts. It was assumed, however, that these homographs were
sufficiently infrequent not to distort distributions significantly.
The searches were confined to active, positive, declarative, non-progressive forms
of the 16 verbs, partly to simplify the searches, but also because the other forms might
have distorted the present perfect/preterite distribution.8 In the case of the present
perfect, adverbial and other forms which frequently intervene between the auxiliary
and the main verb were included in the search string.9 Both full and contracted forms
of the perfect auxiliary were allowed, i.e. have, ’ve, has and ’s. This resulted in a large
number of unwanted hits, especially since ’s doubles as a contraction of is. All potential
present perfect hits were checked manually, however, and unwanted cases (including
lots of perfect infinitives) were weeded out.
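The search procedure just described can be sketched roughly as follows. This is a minimal illustration in Python, not the search string actually used for Table 1: the names PARTICIPLES, INTERVENING, AUXILIARY, PATTERN and find_candidates are hypothetical, and only a small subset of the intervening forms is included.

```python
import re

# Past participles of the 16 verbs named in the text.
PARTICIPLES = [
    "begun", "chosen", "driven", "eaten", "fallen", "flown", "given", "grown",
    "known", "ridden", "seen", "sung", "spoken", "stolen", "taken", "written",
]

# Illustrative subset of the intervening forms; the chapter's own list is longer.
INTERVENING = [
    "only just", "just now", "of course", "always", "never", "ever",
    "often", "already", "recently", "just", "actually", "um", "er",
]

# Full and contracted forms of the perfect auxiliary: have, 've, has, 's.
AUXILIARY = r"(?:\b(?:have|has)\b|(?<=\w)[’'](?:ve|s)\b)"

# Auxiliary, at most one intervening form (an assumption), then a participle.
PATTERN = re.compile(
    AUXILIARY
    + r"(?:\s+(?:" + "|".join(re.escape(form) for form in INTERVENING) + r"))?"
    + r"\s+(?:" + "|".join(PARTICIPLES) + r")\b",
    re.IGNORECASE,
)


def find_candidates(text):
    """Return potential present perfect strings; hits still need manual checking."""
    return [match.group(0) for match in PATTERN.finditer(text)]


sample = (
    "She has already written to them. I’ve seen it. "
    "The book’s written in French. They took it."
)
print(find_candidates(sample))
```

The third sentence in the sample is returned although ’s stands for is there, which illustrates why every potential hit had to be checked by hand.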
The results can be studied in Table 1, where a distinction is made between the
written and the spoken sections of the two ICE corpora. The table also gives present
perfect/preterite ratios, illustrated in Figure 1.
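The ratios are simply the number of present perfect forms divided by the number of preterite forms in each corpus. The following sketch recomputes them from the raw counts in Table 1; the dictionary TABLE_1 and the helper names are ours, introduced for illustration only.

```python
# Raw counts from Table 1: (present perfect, preterite) per corpus.
TABLE_1 = {
    "Brown": (184, 2417),
    "Frown": (199, 2212),
    "LOB": (247, 2267),
    "FLOB": (210, 2167),
    "ACE": (219, 2034),
    "WWC": (194, 2071),
    "ICE-AUSwr": (86, 644),
    "ICE-NZwr": (124, 927),
    "ICE-AUSsp": (270, 858),
    "ART": (83, 240),
    "ICE-NZsp": (201, 732),
    "WSC": (329, 1610),
}


def ratio(present_perfect, preterite):
    """Present perfect/preterite ratio for one corpus."""
    return present_perfect / preterite


# Ratios rounded to two decimals, as in the table.
ratios = {corpus: round(ratio(*counts), 2) for corpus, counts in TABLE_1.items()}

for corpus, value in ratios.items():
    print(f"{corpus:<10} {value:.2f}")
```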
[Bar chart: Ratios present perfect/preterite across corpora; one bar per corpus, vertical axis from 0.00 to 0.35.]
Figure 1. Ratios between the present perfect and the preterite of 16 verbs. See Table 1.
8. For example, the preterite of lexical verbs does not generally occur in negative and interrogative constructions, while the present perfect does.
9. These were always, never, ever, often, seldom, rarely, occasionally, sometimes, recently,
lately, probably, perhaps, wisely, already, now, particularly, just, only just, just now, just recently,
also, only, even, actually, naturally, of course, inexplicably, really, kind of, sort of, foolishly, easily,
sensibly, certainly, obviously, definitely, desperately, immediately, totally, completely, nevertheless,
rarely, um, er.
Some notable differences emerge. The most striking of them is the very marked
difference between speech and writing: on the whole the ratio of the present perfect
to the preterite is at least twice as high in the spoken corpora. This must be because
spoken texts tend to be more orientated towards present time, which is an environment
generally favouring the present perfect. Whether the recorded difference can also be
due to certain uses of the present perfect being more frequent in colloquial registers is
a question which will be addressed below.
It will further be seen that the assumed difference between AmE and BrE is
confirmed in the case of Brown and LOB (from 1961): the present perfect/preterite
ratio is distinctly higher in the latter corpus.10 The difference between the corresponding
corpora from thirty years later (Frown and FLOB) is only marginal, and obviously not
statistically significant (χ²=0.4452). What is more surprising is the apparent development in AmE over the thirty-year period between Brown and Frown: here there is
actually an increase in the present perfect/preterite ratio, although only a slight one,
not statistically significant (χ²=2.3026). This runs counter to the findings reported in
Elsness (2009). However, one should not lose sight of the fundamental fact that any
such comparison depends crucially on the composition of the corpora involved – even
fairly modest differences in the temporal orientation of the particular texts included
may lead to apparently significant variations in the distribution of verb forms, whether
or not those differences are representative of changes in the language as a whole. Also,
of course, our comparisons are vulnerable because of the limitation to just 16 verbs,
even though the total number of recorded cases from each corpus seems reassuring.
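The χ² values quoted for these comparisons can be reproduced from the counts in Table 1 with a 2×2 Chi-square test. The sketch below assumes Yates's continuity correction; that is an inference from recomputing the figures, not something stated in the text, and the function name chi_square_yates is hypothetical.

```python
def chi_square_yates(a, b, c, d):
    """Chi-square for the 2x2 table [[a, b], [c, d]] with Yates's correction.

    Here a, b are present perfect and preterite counts in one corpus,
    and c, d the corresponding counts in the other corpus.
    """
    n = a + b + c + d
    corrected = max(abs(a * d - b * c) - n / 2, 0)
    numerator = n * corrected ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


# Critical value of chi-square with 1 degree of freedom at the .1 per cent level.
CRITICAL_0_1_PER_CENT = 10.828

brown_lob = chi_square_yates(184, 2417, 247, 2267)
frown_flob = chi_square_yates(199, 2212, 210, 2167)
brown_frown = chi_square_yates(184, 2417, 199, 2212)

print(f"Brown vs LOB:   {brown_lob:.4f}")
print(f"Frown vs FLOB:  {frown_flob:.4f}")
print(f"Brown vs Frown: {brown_frown:.4f}")
```

Only the Brown/LOB comparison exceeds the critical value at the .1 per cent level; the other two fall well below even the 5 per cent threshold of 3.841.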
As regards AusE and NZE, it will be seen that AusE especially places itself at the
high end of the scale of present perfect/preterite ratios. The particularly high ratios
recorded for the written sections of both ICE-AUS and ICE-NZ compared with the
other written corpora must be linked with the textual composition of those written
sections, with much less fiction than in the various corpora of the so-called Brown
family and a substantial proportion of unpublished letters and other categories which
can be expected to be broadly orientated towards present rather than past time.
The significance of text categories in this respect is demonstrated in Table 2 and
Figure 2 below, where results for ACE and WWC are broken down into the four text
category groups that the corpora of the Brown family are often divided into. It can
be seen that the present perfect/preterite ratio varies considerably, even though these
corpora consist exclusively of printed, published texts. The ratio is particularly high
in the Press texts and well below average in Fiction. This reflects a clear difference
in temporal orientation: fictional texts tend to relate (imaginary) events located in a
(fictional!) past, while Press writings contain a substantial proportion of commentary
and are generally more orientated towards present time.
10. According to the Chi-square test the difference between Brown and LOB is statistically
significant at the .1 per cent level: χ²=12.1823
Table 2. Frequencies of present perfect and preterite in four text category groups of
ACE and WWC: Press (text categories A–C), General prose (D–H), Academic (J),
Fiction (K–W in ACE, K–L in WWC).

Text category group   Corpus   Present perfect   Preterite   Ratios
Press                 ACE      64                224         0.29
Press                 WWC      46                246         0.19
General prose         ACE      97                727         0.13
General prose         WWC      74                741         0.10
Academic              ACE      17                126         0.13
Academic              WWC      18                129         0.14
Fiction               ACE      41                957         0.04
Fiction               WWC      56                955         0.06

Note: the data include frequencies of the present perfect and the preterite of 16 verbs; active, positive,
declarative, non-progressive constructions only.
[Bar chart: Ratios present perfect/preterite in subdivisions of ACE and WWC; bars for ACE and WWC in each of Press, General prose, Academic and Fiction, vertical axis from 0 to 0.3.]
Figure 2. Subdivision of ACE and WWC into four text category groups. Ratios between the
present perfect and the preterite. See Table 2.
Table 2 and Figure 2 cast interesting light on the relationship between ACE and
WWC displayed by Table 1 and Figure 1, which gave an impression of the present
perfect/preterite ratio being distinctly higher in the AusE corpus. It can now be seen
that this difference becomes blurred when the two corpora are broken down into text
category groups, the ratio actually being higher in WWC for both Academic and, especially, Fiction. This is a reminder of the importance of treating results from corpus
comparison with due caution.
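The point can be checked directly against the counts in Table 2. The following sketch (with hypothetical names TABLE_2, overall and groups_where_wwc_higher) sums the category groups back to the corpus totals of Table 1 and lists the groups in which WWC, not ACE, has the higher ratio.

```python
# (present perfect, preterite) per text category group, from Table 2.
TABLE_2 = {
    "ACE": {
        "Press": (64, 224),
        "General prose": (97, 727),
        "Academic": (17, 126),
        "Fiction": (41, 957),
    },
    "WWC": {
        "Press": (46, 246),
        "General prose": (74, 741),
        "Academic": (18, 129),
        "Fiction": (56, 955),
    },
}


def ratio(present_perfect, preterite):
    return present_perfect / preterite


def overall(corpus):
    """Sum the four category groups to whole-corpus counts."""
    counts = TABLE_2[corpus].values()
    return sum(pp for pp, _ in counts), sum(pret for _, pret in counts)


# Groups in which the WWC ratio exceeds the ACE ratio.
groups_where_wwc_higher = [
    group
    for group in TABLE_2["ACE"]
    if ratio(*TABLE_2["WWC"][group]) > ratio(*TABLE_2["ACE"][group])
]

print(overall("ACE"), overall("WWC"))
print(groups_where_wwc_higher)
```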
The overall impression left by Table 1 and Figure 1 is nevertheless that the ratio
between our two verb forms may well be somewhat higher in AusE than in NZE. Apart
from the (slight) difference between ACE and WWC, we notice that the recorded ratio
is quite a bit higher for the spoken section of ICE-AUS than for that of ICE-NZ. Even
more striking is the difference between ART and WSC, i.e. the two corpora consisting
of spoken texts only, but then these corpora are very different in their textual composition, ART consisting exclusively of unscripted speech, the spoken texts making up
WSC being more varied.
As we have already seen, some writers have suggested that in AusE, and also NZE,
there may be a tendency to use the present perfect in some cases where the preterite
has been the expected verb form in English, most notably in combination with
adverbial specification of clearly defined past time. The figures we have looked at so
far are at least consistent with an assumption that this is more common in the two
antipodean varieties of English than in BrE and, especially, AmE. To find out more we
shall have a look at the results of a search carried out on some of the most obvious,
and most easily searchable, expressions of past time in combination with the present
perfect: constructions with ago (such as three weeks ago), yesterday, and expressions
with last night/week/month/year/*day (such as last Wednesday). In these searches the
present perfect was represented simply by its potential auxiliary forms: have, ’ve, has
and ’s, in order to capture all lexical verbs. Lots of constructions were returned which
were irrelevant to our present concerns, either because there was no present perfect
verb form, or because the adverbial did not modify that verb or did not express clearly
defined past time. After all unwanted cases had been excluded from the count, a total
of 22 cases remained. Their distribution across the various corpora is set out in Table 3.
Comparing these results one should bear in mind that the size of ART is only about
one-quarter of that of the other corpora. In Figure 3 the result for ART has been
adjusted accordingly.
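A search of this kind might be approximated as below. The sketch is only an illustration under stated assumptions: it treats co-occurrence of an auxiliary form and a past-time adverbial within one sentence as a candidate, which is our simplification, and the names AUXILIARY, PAST_ADVERBIAL and candidate_sentences are hypothetical. As in the searches reported here, every candidate would still have to be checked manually.

```python
import re

# Potential perfect auxiliaries: have, 've, has, 's.
AUXILIARY = re.compile(r"\b(?:have|has)\b|(?<=\w)[’'](?:ve|s)\b", re.IGNORECASE)

# ago, yesterday, and last night/week/month/year/*day (e.g. last Wednesday).
PAST_ADVERBIAL = re.compile(
    r"\b(?:ago|yesterday|last\s+(?:night|week|month|year|\w*day))\b",
    re.IGNORECASE,
)


def candidate_sentences(sentences):
    """Keep sentences containing both an auxiliary form and a past-time adverbial."""
    return [
        sentence
        for sentence in sentences
        if AUXILIARY.search(sentence) and PAST_ADVERBIAL.search(sentence)
    ]


sentences = [
    "The union has informed the employers yesterday that … .",
    "We had two phone calls yesterday.",
    "Supt Farrah said motorists last week has cooperated fully with police.",
    "She has written three books.",
]
for hit in candidate_sentences(sentences):
    print(hit)
```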
Table 3. Frequencies of the present perfect in constructions with past-referring temporal
adverbials with – ago, yesterday and last night/week/month/year/*day. Raw figures

Brown   Frown   LOB   FLOB   ACE   WWC   ICE-AUS   ICE-NZ   ART   WSC   Total
0       2       3     1      2     0     6         0        6     2     22
With a total of just 22 recorded cases one should obviously be cautious about
drawing sweeping conclusions from the very lopsided distribution displayed. If we
keep our focus on the two antipodean varieties, the difference between them is striking, however: 16 of the recorded cases are from these varieties, but of those the three
AusE corpora account for as many as 14 cases, the three NZE corpora for just two, and
that is in spite of the fact that one of the AusE corpora, ART, is much smaller than the
other corpora.
[Bar chart: The present perfect with past-referring adverbials; one bar per corpus (Brown, Frown, LOB, FLOB, ACE, WWC, ICE-AUS, ICE-NZ, ART, WSC), vertical axis from 0 to 25.]
Figure 3. The present perfect in constructions with certain past-referring adverbials – see
Table 3. Result for ART adjusted to norm of 1 million words.
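The adjustment of the ART figure to a norm of 1 million words is a simple proportional scaling. A minimal sketch, taking the approximate corpus size of 256 000 words given earlier and the six cases recorded from ART (the function name per_million is ours):

```python
def per_million(raw_count, corpus_size_words):
    """Scale a raw frequency to occurrences per 1 million words."""
    return raw_count * 1_000_000 / corpus_size_words


# Six recorded cases in ART, a corpus of c. 256 000 words.
art_adjusted = per_million(6, 256_000)
print(f"{art_adjusted:.1f}")
```

Since the corpus size is itself approximate, the adjusted figure should be read as a rough indication only.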
It is true that the composition of ART is such (unscripted conversation, often of
a distinctly colloquial character) that any usage which is typical of informal language
could be expected to display above-average frequencies. But the difference between
AusE and NZE goes further than that: two cases recorded in ACE, not a single one
in the closely parallel WWC; as many as six cases in ICE-AUS, not a single one in the
closely parallel ICE-NZ. Even with the small overall number of recorded constructions, this suggests that the rather special use of the present perfect with distinct
specifications of past time may be more common in (certain varieties of) AusE than
it is in NZE.
We shall look at some of the recorded instances of the present perfect combining
with adverbials of the type examined. The first are the two from ACE:
(32) Most come from broken homes and have themselves been broken a
long time ago. [ACE W13:2335]
(33) Supt Farrah said motorists last week has cooperated fully with police.
[ACE A39:8300]
We notice that (33) is from a police context, which is precisely the sort of environment in
which Engel and Ritz (2002) found this kind of perfect usage to be particularly common.
A notable difference between (32) and (33) is that (32) has a pretty vague indication of past time, whereas the temporal specification in (33) is much more precise.
Indeed, the specification in (32) is so vague that the use of the present perfect would
be considered straightforward in most registers and geographical varieties of English.
What is more, all the constructions recorded from the AmE and BrE corpora are of
this type. In fact the adverbial is invariably long ago or a long time ago in all those cases.
A couple of examples, the first from Frown, the second from LOB:
(34) I have myself long ago decided never to write … . [Frown G71:154]
(35) All this naturalism, he says, has been done such a long time ago in
France and elsewhere … [LOB A18:1899]
As regards the six recorded cases from ICE-AUS (set out below), they are all from
the spoken section, which, it will be recalled, makes up about 60% of that corpus.
With the exception of (40) they all have pretty precise past-time specifiers, and even
(40) places the verbal situation in a past time with a very distinct separation from
the deictic zero-point:
(36) Hey Trev are you selling sports cars or something eh what do you think
about ah that side Wayne that you’ve played against last week no change
[ICE-AUS S1B-035:55]
(37) Mr surname1 um um can I just start firstly with um ah ascertaining from
you some chronology of your um um advice to various people of this
allegation that you have made that Mr surname2 spoke to you in the
bowling club in the manner that you’ve er described yesterday in your
evidence [ICE-AUS S1B-065:4]
(38) That’s correct isn’t it? And in that statement you made no mention
whatsoever of this allegation that you have made in evidence yesterday about
Mr surname2 approaching you at the Christmas party [ICE-AUS S1B-065:13]
(39) A man has been bashed and a woman shot at during a robbery in a
house at Casula near Liverpool last night [ICE-AUS S2B-004:172]
(40) Cos I’ve done a lot of outback travel years and years ago but firstname1
hasn’t done any and we bought the car with the purpose of doing it um but
we haven’t got around to it yet obviously and [ICE-AUS S1A-059:65]
(41) As I say and yet you know when he learnt English was very much the
Queen’s English and we have spoken that until twenty years ago so you
can always see [ICE-AUS S1A-080:37]
Most of the constructions recorded from ART likewise have pretty distinct past-time
specifiers, although the very informal character of these unscripted spoken texts means
that the expression of the temporal specification is sometimes less straightforward than
in most written texts. It may also be noteworthy that all but one of the six recorded
cases are from commercial radio. The first example is the only one from the ABC:
(42) [Caller 10:] Hello Ramona hello Sandy 〈P1 g’day〉 and Tim. Tim I’ve just
coincidentally finished reading Cloudstreet three weeks ago. And you’re
a national treasure it’s one of the most beautiful books I’ve ever read.
[ART ABCnat2]
Here the occurrence of just between the auxiliary and the main verb eases the use of
the present perfect. Even so, an adverbial like just three weeks ago would probably be
more likely to combine with the preterite verb form in most kinds of text.
In (43) there are similar contextual factors which help to explain the use of the
present perfect in combination with the past-time specifier last year: the preceding adverbial already, very common in combination with this verb form, and the
hesitation marker um, which serves to separate the following adverbial from the
preceding verb form:
(43) I’ve got a house I’ve already bought um last year and I’ll probably be selling it
about say another three four years when I move. [ART COMne3]
The remaining constructions from ART also have features which serve to make the
past-time location less precise, with modifying elements such as about, say for instance
and just, or, as in the case of (47), with the adverbial being the inherently vague a long
time ago:
(44) [Caller4] Okay. Now my problem is I’ve put in these little lettuce seedlings
and they’re only probably three inches high and I’ve only had planted them
out about three weeks ago 〈E1 mm〉 but they’re bolting to 〈E1 yeah〉 seed.
[ART COMe1]
(45) [Expert1] The idea with buying off the plan and 〈C3 mm〉 it it will take a p a
fairly brave person to be buying off the plan now and having completion soon.
Well the people that have bought off the plan say for instance twelve months
ago 〈C3 mm〉 and are having to complete now. [ART COMe2]
(46) [Presenter1b] There’s a whole bunch of people all waiting on the phone
lines now let us jump to them very very shortly but let us begin with a couple 〈,〉
who meant oh so much to me and unfortunately they’ve actually literally
just hung up just a second ago. So aren’t they worthless bastards. Hello Brett.
[ART COMe6]
(47) [Expert1] Ah Claire pines conifers take a long while to die so that could’ve
been damage that’s happened a long time ago so it’s not probably anything
that you’ve done recently [ART COMne4]
Several of the constructions recorded from ART are marked by the anacolutha and
hesitations which are so frequent in many kinds of unscripted speech. The fact
that as many as five of the six recorded constructions come from commercial radio
may be an indication that this type of verbal usage is most common in particularly
informal registers.
The two constructions of this type recorded from WSC are (48) and (49):
(48) We’d need to look at the hospitals um [clears throat] how many women are
seeking um medical attention for injuries that they’ve sustained perhaps two
months ago ten years prior [WSC DGI157:0070]
Even though in (48) the reference is to a past time which is very clearly separated from
the deictic zero-point, there is a vagueness about the temporal reference which, in
combination with a couple of hesitation markers, may again have eased the use of the
present perfect.
Example (49) is from a distinctly colloquial conversation between two speakers,
with typical interruptions and hesitations:
(49) A: … she … definitely goes overboard.
B: She needs …
A: She’s pissed me by the end of the day yesterday.
B: Yeah, she really needs um she really needs to be pulled into line … .
[WSC DPC291:0475]
One thing which characterizes most of the constructions we have considered with the
present perfect being modified by past-referring adverbials is that, in spite of the past-referring specifier, the temporal orientation seems to be very much towards the present
time-sphere. That leads to a potential clash between the present perfect and the preterite.
Our findings suggest that in such cases the usual ban on the combination of the present
perfect with past-referring adverbials is more easily relaxed in AusE than in other varieties
of English. This seems to apply especially to informal, colloquial AusE.
The constructions we have considered above can only be the tip of the iceberg
as far as accounting for the more frequent use of the present perfect in AusE is concerned. Even in this variety the use of the present perfect with clear and unequivocal
past-time specification seems to be no more than a marginal phenomenon. However,
this may be an indication of a generally more liberal use of the present perfect in AusE,
so that AusE is somewhat more likely to prefer the present perfect over the preterite
where the choice between the two verb forms is pretty open in English worldwide. This
would help to explain the higher ratios between the present perfect and the preterite
that we have recorded.
It must not be overlooked, however, that the use of the present perfect with past-time
specification has been reported even for other varieties of English. Trudgill’s (1984)
remarks concerned Standard English English, and Cotte (1987) also recorded constructions of this type from BrE. Moreover, Bauer (1994) cites similar examples
from NZE. A search of the British National Corpus, with a total of 100 million words
(90 million written, 10 million spoken), yields several cases of the present perfect
combining with adverbials of the type we are concerned with, most of them from the
spoken section. A couple of examples are:
(50) Anyway [Pause] three people have phoned yesterday, we had two phone
calls yesterday, in the morning [Pause] I had one last night and there was
another one this morning about the washing machine and I said sorry but
I said the advert was put in the Campaign I said a month or so ago.
[BNC KCC 480, Conversation]
(51) In the event my Lord, erm, that er your Lordship felt that further guidance
was required, there are the two routes that I’ve indicated to your Lordship
briefly yesterday, there is the route of er seeking some information, if your
Lordship felt it’d be of assistance to you in resolving any doubts that you
may have from the and your Lordship has seen yesterday the notice on
co-operation which is in and at page eleven thirty two and is also the exhibit
[BNC K73 457, Royal Courts hearing]
In the case of (50) there is a highly noteworthy continuation to the use of the present
perfect, where the speaker switches to the preterite: we had two phone calls yesterday,
indicating that the speaker is uneasy about her own use of the present perfect in this
construction. (51) is characterized by the same orientation towards the present time-sphere that we saw with several of the constructions from AusE. The occurrence of
hesitations and anacolutha is also typical. At the same time (51) serves as a reminder
that this particular verbal usage is not confined to colloquial language.
Among our findings there are few if any straightforward examples of the present
perfect being used in the kind of past narrative context reported by Engel and Ritz
(2000), but then our searches focused on particular past-referring adverbials. Example
(50) comes close to a narrative context, but with the one occurrence of the present
perfect and the speaker’s own correction this hardly counts as convincing evidence.
The bulk of the examples we have uncovered rather point to a clear present-time orientation, the present perfect being used to express what might be termed a synoptical
point of view: the reference of the present perfect verb itself is clearly to past time, but
the main contextual function of that past-time reference often seems to be to shed light
on a predominant present situation.
4. Data from Australian Style
In the Feedback column of the biannual publication Australian Style readers are invited
to give their responses to various points of AusE usage by indicating which of alternative constructions they prefer. The results are classified according to the sex and age of
the individual respondent, and also according to the state or territory within Australia
in which he or she is resident.
In most cases the Feedback column triggers responses from around 500 readers
or more, with a good spread across the various criteria used to classify respondents.
This column thus provides a wealth of information about current AusE usage, within
the limitations set by an enquiry of this kind (views registered only by readers of
Australian Style who volunteer to send in their responses, who can be assumed to be
more interested in points of language usage than the average citizen, and probably
better educated too).
In the issue of Australian Style from June 2004 I was allowed to include four present perfect/preterite constructions I wanted to test in the Feedback column.11 The four
sentence pairs, with the overall results recorded in Australian Style, are set out in Table 4.
They are illustrated in Figure 4.
Table 4. Sentences tested in the Feedback column of Australian Style, with recorded
results. Raw figures and vertical percentages within each sentence pair

      Sentences                                                         Results
Ia    That problem has been solved long ago.                            67 (12.1%)
Ib    That problem was solved long ago.                                 486 (87.9%)
IIa   I know Joanna is around somewhere – Alex has just spoken to her.  327 (56.1%)
IIb   I know Joanna is around somewhere – Alex just spoke to her.       256 (43.9%)
IIIa  You speak remarkably good French. Have you ever lived in France?  461 (81.3%)
IIIb  You speak remarkably good French. Did you ever live in France?    106 (18.7%)
IVa   Have you told them the news yet?                                  497 (89.1%)
IVb   Did you tell them the news yet?                                   61 (10.9%)
It will be noticed that the respondents treated constructions I and II very differently, with long ago and just, respectively. With the former construction there is an
overwhelming majority for the preterite, with the latter there is a much smaller majority
favouring the present perfect. Both adverbials express a vaguely defined past time, but
with a very marked difference in distance from the deictic zero-point. It might also
be suggested that construction II signals a temporal orientation towards present time
(some would speak of current relevance here), construction I rather towards past time,
and that this contributes further to the different choices of verb form.
As regards constructions III and IV, with ever and yet, respectively, there is a very
clear preference for the present perfect. It has often been noted that in constructions
with yet (and also with already) AmE frequently has the preterite, while in BrE only the
present perfect is current. It is noteworthy that the result from Australian Style places
AusE firmly on the side of BrE in this case.12
11. I am grateful to the editor of Australian Style, Professor Pam Peters, for being prepared to
include these constructions in the Feedback column, and to all the readers who took the time
to register their responses.
[Bar chart: The present perfect and the preterite from Feedback; percentages of Pr. perf. and Preterite for constructions I to IV, vertical axis from 0% to 90%.]
Figure 4. Percentages of the present perfect vs. the preterite for the four constructions tested
in the Feedback column of Australian Style. See Table 4.
In the next two tables the recorded Feedback results are subdivided according to sex
(Table 5) and age (Table 6).13 They are illustrated in Figure 5 and Figure 6, respectively.
Table 5. Responses to the Feedback column of Australian Style according to the sex of
respondents. Raw figures and vertical percentages within each pair

Sex                       Female         Male
Ia has been solved        30 (12.8%)     24 (14.0%)
Ib was solved             205 (87.2%)    147 (86.0%)
IIa has just spoken       138 (55.2%)    106 (59.6%)
IIb just spoke            112 (44.8%)    72 (40.5%)
IIIa have ever lived      208 (86.0%)    133 (76.9%)
IIIb did ever live        34 (14.1%)     40 (23.1%)
IVa have told them yet    215 (90.3%)    152 (88.9%)
IVb did tell them yet     23 (9.7%)      19 (11.1%)
12. The four constructions that were tested in the Feedback column of Australian Style are
(more or less) identical with constructions included in the elicitation test reported in Elsness
(1990 and 1997), where speakers of AmE and BrE were used as informants. The presumed
difference between AmE and BrE in constructions with these adverbials was amply confirmed
by that test.
13. Breakdown of the Feedback results according to state or territory did not reveal any clear
and consistent differences and will not be further considered here.
[Bar chart: The present perfect according to sex; bars for Female and Male respondents for constructions I to IV, vertical axis from 0% to 100%.]
Figure 5. Percentage of present perfect constructions in Feedback column according to sex of
respondent. See Table 5.
Table 6. Responses to the Feedback column of Australian Style according to the age of
respondents. Raw figures and vertical percentages within each pair

Age                       10–24         25–44         45–64          65+
Ia has been solved        7 (10.6%)     9 (13.6%)     27 (12.3%)     26 (12.7%)
Ib was solved             59 (89.4%)    57 (86.4%)    192 (87.7%)    179 (87.3%)
IIa has just spoken       8 (12.3%)     31 (44.9%)    146 (64.0%)    143 (64.1%)
IIb just spoke            57 (87.7%)    38 (55.1%)    82 (36.0%)     80 (35.9%)
IIIa have ever lived      53 (81.5%)    52 (75.4%)    187 (83.9%)    171 (80.1%)
IIIb did ever live        12 (18.5%)    17 (24.6%)    36 (16.1%)     41 (19.3%)
IVa have told them yet    55 (85.9%)    55 (78.6%)    198 (90.8%)    191 (91.8%)
IVb did tell them yet     9 (14.1%)     15 (21.4%)    20 (9.2%)      17 (8.2%)
If we first look at the distributions recorded according to the sex of the respondent
(Table 5 and Figure 5), it will be seen that no very clear differences emerge. However,
there may seem to be a slight tendency for female respondents to be more clear-cut
in their choices than their male counterparts: in those cases where both sexes record
a clear preference for the present perfect (constructions III and IV), the preference
for that verb form is even stronger among female respondents; conversely, in the case
where the general preference is for the preterite verb form (construction I), that preference is again slightly stronger among female than among male respondents. This
might be taken as confirmation that female speakers tend to stick to the language
norm more strictly than male speakers, although the numbers and the differences
recorded are not sufficient to warrant any firm conclusions.14

[Figure 6: chart titled "The present perfect according to age"; vertical axis 0–100%; horizontal axis: constructions I–IV; series: 10–24, 25–44, 45–64 and 65+.]
Figure 6. Percentage of present perfect constructions in Feedback column according to age of
respondent. See Table 6.

The figures for the distribution of verb forms according to the age of the
respondent can be studied in Table 6. They are illustrated in Figure 6. The three
constructions displaying a clear overall preference for either the present perfect
or the preterite (I, III and IV) show fairly stable results across the age brackets distinguished, as one would expect. With construction II, however, it can now be seen
that the overall results concealed a very clear differentiation according to age: here
there is a striking development from a very distinct preference for the preterite among
informants aged 10–24 to a marked preference for the present perfect in the two oldest
age brackets, 45–64 and 65+. In other words, a clear majority of respondents aged 45
and over side with what has been seen as the traditional BrE norm, while respondents
below the age of 25 place themselves firmly on the AmE side over this construction.15
We shall see whether there is any difference between the sexes in this respect. In
Table 7, illustrated in Figure 7, results are given for age and sex combined.16 It will be
seen that the very marked differentiation according to age is consistent for both sexes.
The tendency for male respondents to favour the present perfect in construction II
slightly more than female respondents holds for three of the four age brackets: only
among those aged 45–64 is the present perfect somewhat more popular with female
respondents. The differences are not sufficient to be statistically significant in any age
bracket, however.17

Table 7. Feedback results for sentence pair II according to age and sex. Raw figures and
vertical percentages

Age          10–24                     25–44                     45–64                     65+
Sex          Male         Female       Male         Female       Male         Female       Male         Female
Pr. perf.    1 (8.3%)     2 (5.9%)     5 (45.5%)    10 (34.5%)   30 (56.6%)   54 (70.1%)   52 (62.7%)   59 (60.2%)
Preterite    11 (91.7%)   32 (94.1%)   6 (54.5%)    19 (65.5%)   23 (43.4%)   23 (29.9%)   31 (37.3%)   39 (39.8%)
Totals       12           34           11           29           53           77           83           98

[Figure 7: chart titled "Sentence pair II according to age and sex"; vertical axis 0–100%; horizontal axis: Male and Female within each age bracket (10–24, 25–44, 45–64, 65+); series: Preterite and Pr. perf.]
Figure 7. Feedback results for sentence pair II according to age and sex. See Table 7.

14. On claims that the sexes tend to differ along these lines, see e.g. Eisikovits (1989) and
Trudgill (1972).
15. There is no doubt about the statistical significance of the recorded differences among
the three lowest age brackets: 10–24 vs. 25–44: χ²=15.7154, p≤0.001; 25–44 vs. 45–64:
χ²=7.2576, p≤0.01.
16. Some respondents did not indicate both age and sex. Hence figures from the various
tables do not necessarily match.
17. Not even in the age bracket 45–64, with the most distinct difference between the sexes, is
the difference statistically significant at the 5 per cent level: χ²=1.9552.

Our findings may be taken as a highly significant indication that change is under
way in AusE as regards the distinction between the present perfect and the preterite
in cases similar to our construction II, away from the BrE norm towards the choice
which seems to be the one favoured by speakers of AmE. What is characteristic of
construction II is that there is a specifier of past time (just) but only a very vague one,
which leaves considerable scope for variation and individual choice in all varieties of
English. Our results suggest that in such cases speakers of AusE from young middle
age upwards (i.e. aged 45 and above) still conform to what has been seen as the BrE
pattern, while the youngest Australians reject the choice of their elders and display a
linguistic behaviour which is hardly distinguishable from that of speakers of AmE. As
this type of time reference can be assumed to be fairly frequent in many kinds of text,
it may be expected to play a central role in the further development of the present
perfect/preterite distribution in AusE.
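
The χ² values reported for construction II in notes 15 and 17 can be recomputed from the raw counts in Tables 6 and 7. The following Python sketch is only an illustration: it assumes a 2×2 chi-square test with Yates' continuity correction, which the chapter does not name but which appears to reproduce the reported figures, and the helper name yates_chi2 is hypothetical.

```python
def yates_chi2(a, b, c, d):
    """Chi-square statistic for the 2x2 table [[a, b], [c, d]],
    with Yates' continuity correction (1 degree of freedom)."""
    n = a + b + c + d
    # Shrink |ad - bc| by n/2, but never below zero.
    diff = max(abs(a * d - b * c) - n / 2, 0.0)
    return n * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Construction II counts as (present perfect, preterite).
comparisons = {
    "10-24 vs. 25-44 (Table 6)": ((8, 57), (31, 38)),
    "25-44 vs. 45-64 (Table 6)": ((31, 38), (146, 82)),
    "45-64, male vs. female (Table 7)": ((30, 23), (54, 23)),
}

CRITICAL_5_PERCENT = 3.841  # chi-square critical value, df = 1, 5 per cent level

results = {}
for label, (group1, group2) in comparisons.items():
    results[label] = yates_chi2(*group1, *group2)
    verdict = "significant" if results[label] > CRITICAL_5_PERCENT else "not significant"
    print(f"{label}: chi2 = {results[label]:.4f} ({verdict} at the 5 per cent level)")
```

Under these assumptions the two age comparisons come out well above the 5 per cent critical value, while the male/female comparison in the 45–64 bracket stays below it, in line with the notes.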
5. Summing up
The distribution between the present perfect and the preterite is an area of English
grammar where there are well documented differences between the two best studied
national varieties, AmE and BrE. Today the present perfect is distinctly more frequent
in the British variety, although its frequency appears to be declining in both varieties, the preterite gaining ground. We have seen that both AusE and NZE, especially
the former, place themselves at the high end of the scale as far as the frequency of
the present perfect is concerned. It was also seen that within ACE and WWC there
are marked differences in the present perfect/preterite ratio between the various text
category groups that these corpora may be divided into. The trend is for the ratio to
be at its highest in texts which can be assumed to be generally orientated towards
present time – especially newspaper texts – and particularly low in texts which tend
to be orientated towards past time, most notably fictional texts. This variation turned
out to be greater in AusE than in NZE. When spoken language was brought into the
comparison, the ratio between the present perfect and the preterite was on the whole
at least twice as high as in the written texts. There seem to be two main reasons for this:
(i) in most kinds of spoken texts orientation towards present time predominates, and
(ii) the use of the present perfect may be somewhat more liberal in varieties of AusE
and NZE which can be characterized as colloquial, informal. In both ICE-AUS and
ICE-NZ the present perfect/preterite ratio is more than twice as high in the spoken as
in the written sections. And in ART, a corpus made up exclusively of unscripted AusE
speech (radio talkback), the ratio is at its very highest. In that corpus the occurrence
of present perfect constructions combining with certain past-referring adverbials (as
in A man has been bashed last night) is also distinctly higher than in the other corpora
examined. In such constructions the preterite rather than the present perfect is the
prescribed verb form, but the occasional use of the present perfect is nothing new in
English. It may seem, however, as if this use is more common in (informal) AusE than
in other varieties. This is nevertheless no more than a marginal phenomenon even in
AusE, but it may be an indication that the present perfect is generally used somewhat
more widely in that variety. However, a contrary development now seems to be under
way: an inquiry in the Feedback column of Australian Style revealed a very marked
generation gap over a construction with vague past-time specification, i.e. the kind
of construction where the rivalry between the present perfect and the preterite is at
its greatest in English generally. Here middle-aged and elderly Australians conformed
to what has been considered the BrE norm, opting for the present perfect, while the
youngest respondents preferred the preterite alternative, which is the AmE favourite,
by a wide margin. This may be an indication that the pressure from AmE is now
considerable on the youngest speakers of AusE.
A possible explanation for the apparently opposite trends detected in AusE – a
liberal use of the present perfect especially in colloquial speech but also a movement
away from the present perfect – may be that what has often been considered the main
semantic distinction between the two verb forms has become blurred. This would give
speakers the opportunity to choose the present perfect when that verb form is considered
desirable, for instance to indicate that the main orientation is towards the present time-sphere, even if the temporal specification is one of past time; but also increasingly to use
the simpler preterite form in cases where the choice of verb form is felt not to matter very
much. The situation in NZE is less well documented, but it seems as if that variety is still
closer to BrE in both respects: the use of the present perfect with clear past-time specification is less common (although it does occur), and no evidence is available for a particularly
marked development away from the present perfect among young speakers, although that
seems to be a general long-term trend among speakers of English worldwide.
References
Bauer, Laurie. 1989. “The verb have in New Zealand English”. English World-Wide 10: 69–83.
Bauer, Laurie. 1994. “English in New Zealand”. In Burchfield (ed.): 382–429.
Bell, Allan & Koenraad Kuiper (eds). 1999. New Zealand English. London and Henley: Routledge &
Kegan Paul.
Burchfield, Robert (ed.). 1994. The Cambridge History of the English Language V. English in
Britain and Overseas: Origins and Development. Cambridge: Cambridge University Press.
Caie, Graham, Kirsten Haastrup, Arnt L. Jakobsen, Jørgen E. Nielsen, Jørgen Sevaldsen, Henrik Specht & Arne Zettersten (eds). 1990. Proceedings from the Fourth Nordic Conference
for English Studies, vol.1. Copenhagen: University of Copenhagen.
Collins, Peter & David Blair (eds). 1989. Australian English: The Language of a New Society.
St. Lucia: University of Queensland Press.
Collins, Peter & Pam Peters. 2004. “Australian English: Morphology and syntax”. In Kortmann et al.
(eds): 593–610.
Cotte, Pierre. 1987. “Réflexions sur l’Emploi des Temps du Passé en Français et en Anglais à la
Lumière de deux Évolutions Récentes du Système Verbal de l’Anglais”. Contrastes: Revue de
l’Association pour le Développement des Etudes Contrastives 14–15: 89–161.
Eisikovits, Edina. 1989. “Girl-Talk/Boy-Talk: Sex differences in adolescent speech”. In Collins &
Blair (eds): 35–54.
Elsness, Johan. 1984. “The preterite and the perfect in BBC news bulletins: The case for a text
linguistic approach”. In Ringbom & Rissanen (eds): 159–71.
Elsness, Johan. 1990. “The present perfect in American and British English: Some results from an
elicitation test”. In Caie et al. (eds): 169–78.
Elsness, Johan. 1997. The Perfect and the Preterite in Contemporary and Earlier English. Berlin:
Mouton de Gruyter.
Elsness, Johan. 2009. “The present perfect and the preterite”. In Rohdenburg & Schlüter (eds):
228–45.
Engel, Dulcie M. 2002. “Radio talk: French and English perfects on air”. Languages in Contrast
2(2): 255–77.
Engel, Dulcie M. & Marie-Eve Ritz. 2000. “The use of the present perfect in Australian English”.
Australian Journal of Linguistics 20(2): 119–40.
Görlach, Manfred. 1987. “Colonial lag? The alleged conservative character of American English
and other ‘colonial’ varieties”. English World-Wide 8: 41–60.
Hundt, Marianne. 1998. New Zealand English Grammar: Fact or Fiction? A Corpus-based Study
in Morphosyntactic Variation. Amsterdam: John Benjamins.
Hundt, Marianne, Jennifer Hay & Elizabeth Gordon. 2004. “New Zealand English: Morphosyntax”. In Kortmann et al. (eds): 560–92.
Kortmann, Bernd, Kate Burridge, Rajend Mesthrie, Edgar W. Schneider & Clive Upton (eds).
2004. A Handbook of Varieties of English 2: Morphology and Syntax. Berlin: Mouton de
Gruyter.
McCoard, Robert W. 1978. The English Perfect: Tense-choice and Pragmatic Inferences.
Amsterdam: North-Holland.
Quinn, Heidi. 1999. “Variation in New Zealand English syntax and morphology”. In Bell &
Kuiper (eds): 173–97.
Quirk, Randolph, Sidney Greenbaum, Geoffrey Leech & Jan Svartvik. 1985. A Comprehensive
Grammar of the English Language. London: Longman.
Ringbom, Håkan & Matti Rissanen (eds). 1984. Proceedings from the Second Nordic Conference for
English Studies. Åbo: Åbo Akademi.
Rohdenburg, Günter & Julia Schlüter (eds). 2009. One Language, Two Grammars? Grammatical
Differences between British and American English. Cambridge: Cambridge University Press.
Trudgill, Peter. 1972. “Sex, covert prestige and linguistic change in the urban British English of
Norwich”. Language in Society 1: 179–95.
Trudgill, Peter. 1984. “Standard English in England”. In Trudgill (ed.): 32–44.
Trudgill, Peter (ed.). 1984. Language in the British Isles. Cambridge: Cambridge University Press.
Vanneck, Gerard. 1958. “The colloquial preterite in modern American English”. Word 14: 237–42.
Visser, F. Theodor. 1973. An Historical Syntax of the English Language 3, 2: Syntactical Units with
Two and with More Verbs. Leiden: Brill.
The progressive
Peter Collins
University of New South Wales
The progressive aspect has enjoyed spectacular growth in English since late
Modern English, but its spread has not been uniform across all varieties.
The study compared the frequency and uses of the progressive in Australian,
New Zealand, British and American English across a range of variables. These
included the overall frequency of tokens, the proportion of complex progressive
forms, the proportion of special pragmatic uses, the frequency of main clause
progressives and the frequency of contracted forms. It was found that the rise of
the progressive is most advanced in the two antipodean varieties, with Australian
English ahead of New Zealand English, and that of the northern hemisphere pair
American English is the more advanced.
1. Introduction
This chapter reports the findings of a corpus-based study of the progressive aspect,
the syntactic category that is realized in English by be in conjunction with an -ing
participle and that characteristically expresses progressive aspectuality (which is
associated with notions such as progressivity, duration and imperfectivity).
The distribution and frequency of the progressive are examined in four World
Englishes, the two established northern hemisphere varieties of AmE and BrE, and
the newer antipodean varieties of AusE and NZE.
2. Previous corpus-based studies
A number of corpus-based studies of the English progressive have been conducted.
The diachronic studies of Mair and Hundt (1995), Smith (2002), and Mair and Leech
(2006), which are discussed in Section 5 below, are limited to BrE and/or AmE, and
in the case of the first two studies, to written data alone. Mindt (2000) and Scheffer
(1975) both draw examples from collections of textual data, rather than standard
corpora. Biber et al. (1999: 461–2) present some frequency figures, albeit based on
only a small number of variables. The most recent book-length study, Römer (2005),
is large-scale but nevertheless limited to spoken BrE. The present study analyzes all
progressive tokens (a total of 2933) across a set of four parallel corpora in terms of a
range of grammatical and pragmatic variables.
3. The corpora
The data for the study were extracted from the ICE corpora representing AusE, NZE
and BrE (each corpus comprising one million words, with a sampling date in the early
1990s, and conforming to a common design). Progressive tokens were analyzed in
120 000 words of text from each corpus, half spoken (conversations from Category
S1A) and half written (comprising 20 000 words of academic writing in the humanities from W2A, 20 000 of news reports from W2C, and 20 000 of fiction from W2F).
ICE-US has not yet been completed, so in order to represent AmE 60 000 words were
selected from the Santa Barbara Corpus (SBC), a corpus of primarily dialogic speech
which the compilers intend for use in ICE-US, and 60 000 words from Frown (from
categories J, A and K, corresponding to W2A, W2C and W2F respectively). In all the
tables frequencies for C-US (the composite AmE sample drawn from SBC and Frown) are normalized to match those for the other corpora.
As Table 1 indicates, the four corpora yielded a total of 2933 progressive tokens,
with approximately twice as many in speech (1992) as in writing (941), and within the
three written genres approximately twice as many in fiction as in news (531:272, or
1.95:1), and approximately twice as many in news as in academic writing (272:138, or
1.97:1). The order of popularity for the four genres (conversation > fiction > news >
academic) matches that reported by Biber et al. (1999: 462) for AmE and BrE, although
fiction and news are closer in Biber et al. than in the present study. While the frequency of
the progressive in AmE conversation was only slightly higher than in BrE conversation (476:459) in the present study, Biber et al. (1999: 462) report a larger difference
(approximately in the ratio of 4:3).

Table 1. Frequencies of the progressive across the four Englishes

                              ICE-AUS        ICE-NZ         C-US           ICE-GB         TOTAL
Speech                        541 (71.8%)    516 (57.7%)    476 (76.0%)    459 (69.5%)    1992 (67.9%)
Writing
  Academic                    32 (15.1%)     61 (16.1%)     29 (19.3%)     16 (8.0%)      138 (14.7%)
  News                        81 (38.2%)     73 (19.3%)     53 (35.3%)     65 (32.3%)     272 (28.9%)
  Fiction                     99 (46.7%)     244 (64.6%)    68 (45.3%)     120 (59.7%)    531 (56.4%)
  Total for writing           212 (28.2%)    378 (42.3%)    150 (24.0%)    201 (30.5%)    941 (32.1%)
TOTAL (speech and writing)    753 (100%)     894 (100%)     626 (100%)     660 (100%)     2933 (100%)

The overall figures presented in Table 1 suggest that the progressive is more
frequently used in the two antipodean Englishes than in AmE and BrE. The order
of popularity (NZE > AusE > BrE > AmE) is the same as that reported by Hundt
(1998: 75) in her study of four written corpora (WWC, ACE, FLOB and Frown). The
extremely high overall frequency for NZE is due largely to its striking popularity in
fiction (where it is twice as common as in British fiction, almost two and a half times
as in Australian, and more than three and a half as in American).
If we restrict our focus of attention to speech – the mode strongly favoured by
progressives, as noted above, and that which has seen the greatest increase in the
spread of the progressive (see Section 5 below) – a different picture emerges. The
southern hemisphere varieties still lead their northern hemisphere counterparts in
the frequency of progressives in speech, but there is a reordering within the two
hemispheres as follows: AusE > NZE > AmE > BrE.
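
The ratios and orderings quoted in this section follow directly from the counts in Table 1. The short Python sketch below recomputes them as an illustration; the variety labels are mapped onto the corpora as in the text (C-US standing for AmE), the speech figure used for C-US is the one given in the table, and the helper name ranking is hypothetical.

```python
# Progressive tokens per variety and genre, as given in Table 1.
table1 = {
    "AusE": {"speech": 541, "academic": 32, "news": 81, "fiction": 99},
    "NZE": {"speech": 516, "academic": 61, "news": 73, "fiction": 244},
    "AmE": {"speech": 476, "academic": 29, "news": 53, "fiction": 68},
    "BrE": {"speech": 459, "academic": 16, "news": 65, "fiction": 120},
}


def ranking(scores):
    """Return the variety labels ordered from highest to lowest score."""
    return " > ".join(sorted(scores, key=scores.get, reverse=True))


overall = {variety: sum(counts.values()) for variety, counts in table1.items()}
speech = {variety: counts["speech"] for variety, counts in table1.items()}
genre_totals = {
    genre: sum(counts[genre] for counts in table1.values())
    for genre in ("speech", "academic", "news", "fiction")
}
writing_total = sum(genre_totals[g] for g in ("academic", "news", "fiction"))

print("Speech and writing:", ranking(overall))
print("Speech only:", ranking(speech))
print(f"speech : writing = {genre_totals['speech'] / writing_total:.2f} : 1")
print(f"fiction : news = {genre_totals['fiction'] / genre_totals['news']:.2f} : 1")
print(f"news : academic = {genre_totals['news'] / genre_totals['academic']:.2f} : 1")
for variety in ("BrE", "AusE", "AmE"):
    ratio = table1["NZE"]["fiction"] / table1[variety]["fiction"]
    print(f"NZE fiction vs. {variety} fiction: {ratio:.2f} times as many tokens")
```

The two rankings differ only because the exceptionally high NZE fiction count lifts NZE above AusE once writing is included.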
4. Progressive aspectuality
Progressive aspectuality involves the representation of a situation as having an internal
temporal structure, and as progressing through time. Consider (1), an extract from a
conversation about blacksmithing:
(1) I heard everybody saying um you know, like the people that took the class
before, talking about “Oh you have to do this, you have to do that, there’s dead
horse hooves, you know you gotta ... and they stink, and all this stuff”, and I was
just going “Oh my God, I’m never … I don’t want to take that class, so maybe
I’ll wait till next year.” [SBC 01:607-8]
The speaker’s selection of the past progressive was going rather than the simple past
went has the effect of assigning durativity to the situation, enabling the addressee’s
attention to be “zoomed in” to what is happening.
Progressive situations are, furthermore, presented as susceptible to change. In (2) the
progressive ’re looking expresses the same kind of imperfectivity as its non-progressive
counterpart look, but suggests a greater degree of temporariness.
(2) What, you’re looking a bit skinny [ICE-AUS S1A-008:67]
The progressive is commonly regarded as involving a “temporal frame” encompassing
a temporal reference point (following Jespersen 1931: 18). Examples of the type in (3),
where this temporal point is explicitly mentioned, are in fact rare, being associated
with only 3.3% of spoken tokens and 1.4% of written tokens in this study.
(3) There’s a big dockyard clock and that clock was striking exactly nine o’clock
when when we got there [ICE-GB S1A-28:139]
The correspondence between the progressive aspect and progressive aspectuality is not
one-to-one (hence the qualification “characteristically expressing” in Section 1 above).
As we shall see in Section 7, the progressive aspect has developed two uses in which it
expresses meanings associated with futurity rather than aspectuality.
5. The growth of the progressive
Much has been written on the spectacular spread of the progressive in (late) Modern
English (e.g. Smitterberg 2005), and the corpus-based studies mentioned in Section 2
above have investigated the extent to which changes are still going on (Mair & Hundt
1995; Smith 2002; Mair & Leech 2006). Various types of explanation have been offered.
Some have claimed (e.g. Mair & Hundt 1995) that one factor in the spread of the progressive is the development of new forms (e.g. combinations of the progressive with
modals, perfect aspect and passive voice) and of new uses (e.g. the intentional, interpretive and attitudinal uses). In the next two sections the forms and special uses of the
progressive respectively are examined, and the possible implications of their distribution across the corpora for patterns of change are considered.
Another suggestion is that stylistic factors have had a role to play in the growth
of the progressive (e.g. Quirk et al. 1985; Mair & Hundt 1995; Biber et al. 1999; Mair &
Leech 2006). The attested popularity of the progressive in speech, it is argued, is making it an increasingly welcome choice in writing, as the norms of writing move progressively closer to those of colloquial speech (the phenomenon of colloquialization).
One relevant variable in the present study was the frequency of contracted progressives
(including both be-contraction and not-contraction): see Section 8 below.
In the following sections we examine a range of variables which the various studies
referred to above have identified as contributing to the growth of the progressive: complex
forms, main clause use, special pragmatic uses, and contraction.
6. Grammatical features
6.1 Forms of the progressive
The set of progressive forms was determined by the following variables: tense (e.g.
present is watching vs. past was watching), perfect aspect (e.g. perfect has been
watching vs. non-perfect is watching), modality (e.g. modal might be watching vs.
non-modal is watching), to-infinitival (e.g. infinitival to be watching vs. non-infinitival
is watching), and voice (e.g. active is watching vs. passive is being watched).

Table 2. Progressive forms across the four Englishes

             ICE-AUS        ICE-NZ         C-US           ICE-GB         TOTAL
Present      405 (53.8%)    378 (42.3%)    368 (58.8%)    373 (56.5%)    1524 (52.0%)
Past         251 (33.3%)    382 (42.7%)    188 (30.0%)    202 (30.6%)    1023 (34.9%)
Pres perf    31 (4.1%)      32 (3.6%)      29 (4.6%)      15 (2.3%)      107 (3.6%)
Past perf    7 (0.9%)       22 (2.5%)      4 (0.6%)       4 (0.6%)       37 (1.3%)
Modal        33 (4.4%)      45 (5.0%)      15 (2.4%)      17 (2.6%)      110 (3.8%)
Mod perf     2 (0.3%)       5 (0.6%)       3 (0.5%)       1 (0.2%)       11 (0.4%)
To-infin     6 (0.8%)       16 (1.8%)      4 (0.6%)       26 (3.9%)      52 (1.8%)
Pres pass    12 (1.6%)      4 (0.4%)       10 (1.6%)      8 (1.2%)       34 (1.2%)
Past pass    6 (0.8%)       10 (1.1%)      5 (0.8%)       14 (2.1%)      35 (1.2%)
TOTAL        753 (100%)     894 (100%)     626 (100%)     660 (100%)     2933 (100%)

As Table 2 indicates, a number of the sixteen possible forms were unattested
in the present corpora (perfect to-infinitival, present perfect passive, past perfect
passive, modal passive, modal perfect passive, to-infinitival passive, and perfect
to-infinitival passive). Those that did occur varied strikingly in their frequencies
(the simple present and simple past together accounting for 86.8% of tokens). The
present progressive accounted for 61.0% of all forms in speech, as against 37.3% in
writing, whereas the past progressive accounted for more forms in writing (44.5%)
than in speech (31.4%). These findings are consistent with Smith’s (2002) claim that
the most significant increases in the progressive in recent decades have involved the
present rather than past forms.
A further finding that can be extrapolated from Table 2, which may be relevant
to the patterns of growth of the progressive across the four Englishes studied, is the
proportion of complex forms (i.e. perfect, modal, infinitival, and passive) to simple
tense forms (i.e. present and past). Interestingly, the ordering determined by the
percentage of complex forms – ICE-NZ (14.98%) > ICE-AUS (12.88%) > ICE-GB
(12.87%) > C-US (11.18%) – is the same as that determined by the overall frequency
of progressives (see Section 3).
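
The percentages of complex forms are not listed in Table 2 itself but can be derived from it by summing the seven complex rows for each corpus and dividing by the column total. The Python sketch below illustrates the calculation; cutting the result to two decimals rather than rounding it is an assumption made here because it appears to match the percentages quoted above.

```python
import math

# Complex progressive forms per corpus, in the row order of Table 2:
# pres perf, past perf, modal, mod perf, to-infin, pres pass, past pass.
complex_forms = {
    "ICE-AUS": [31, 7, 33, 2, 6, 12, 6],
    "ICE-NZ": [32, 22, 45, 5, 16, 4, 10],
    "C-US": [29, 4, 15, 3, 4, 10, 5],
    "ICE-GB": [15, 4, 17, 1, 26, 8, 14],
}
totals = {"ICE-AUS": 753, "ICE-NZ": 894, "C-US": 626, "ICE-GB": 660}

# Share of complex forms among all progressive tokens, in per cent.
shares = {
    corpus: 100 * sum(counts) / totals[corpus]
    for corpus, counts in complex_forms.items()
}
order = sorted(shares, key=shares.get, reverse=True)

for corpus in order:
    truncated = math.floor(shares[corpus] * 100) / 100  # cut to two decimals, no rounding
    print(f"{corpus}: {sum(complex_forms[corpus])} complex forms, {truncated:.2f}%")
```

The gap between ICE-AUS and ICE-GB is a matter of a few thousandths of a percentage point, so the middle of the ordering rests on very little.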
6.2 Clause type
While progressives occurred mainly in subordinate clauses in the eighteenth and
nineteenth centuries (see Strang 1982; Smitterberg 2005), their use in main clauses
has been on the increase in recent decades (Smith 2002).
Table 3. Subordinate/main clause progressives across the four Englishes

                  ICE-AUS        ICE-NZ         C-US           ICE-GB         TOTAL
Subord clauses    157 (20.8%)    287 (32.1%)    188 (30.0%)    242 (36.7%)    874 (29.8%)
Main clauses      596 (79.2%)    607 (67.9%)    438 (70.0%)    418 (63.3%)    2059 (70.2%)
TOTAL             753 (100%)     894 (100%)     626 (100%)     660 (100%)     2933 (100%)

As Table 3 shows, AusE leads the way in the frequency of main clause uses of the
progressive, as it does on a number of the other variables examined. The ordering of the
varieties for main clause use of the progressive is as follows: AusE > AmE > NZE > BrE.
7. Special pragmatic uses
A number of specialized pragmatic uses of the progressive have developed (listed
below (i) to (v)), in which the meanings associated with progressive aspectuality are
extended in various ways. These uses are more popular in speech, where they account
for 26.2% of all tokens, than they are in writing, where they account for 12.8% of
tokens, the only exception being the attitudinal use, which is equally popular in both
speech and writing. What this finding suggests is that as a set the special uses are likely
to have impacted on the growth of the progressive.
i. In its attitudinal use the progressive, in combination with a temporal adjunct
(typically always), expresses the unpredictable recurrence of a situation of which
the speaker disapproves (see Huddleston & Pullum et al. 2002: 167; Killie 2004), as
in (4).
(4)“He’s always rabbiting on about the suburbs,” Claire continued. “Who cares?”
[ICE-NZ W2F-003:405]
ii. In the interpretive use the durative meaning of the progressive is, as in the
attitudinal use, extended into the domain of subjective interpretation. The speaker’s
concern is with explaining or clarifying what someone says, as in (5), or does, as in
(6) (see Ljung 1980; Wright 1995, and Mindt 2000). Huddleston and Pullum et al.
(2002: 165) suggest plausibly that this use evolves from the imperfective and durative meanings associated with progressive aspectuality, by metaphorically slowing
down a situation in order to concentrate on interpreting it.
(5) Are you sort of saying music’s a funny game [ICE-AUS S1A-011:112]
(6) And uhm I think f what’s been happening for quite a period of time is that
therapy has been uhm put on one side dance on the other [ICE-GB S1A-004:93]
iii. In the politeness use the durativity associated with the progressive serves to convey
the speaker’s wish or attitude in a way that is construed as more polite or deferential
than the corresponding non-progressive form(s): compare I wonder/wondered in (7).
(7) we’re about to go to Glebe markets and he hasn’t called so I was just wondering
if you want to come [ICE-AUS S1A-007:64]
iv. In the intentional use the progressive expresses the non-aspectual meaning of
futurity, more specifically that in which the subject-referent’s intentionality is typically
involved, as in (8).
(8) I’m only going up one thousand grand [ICE-AUS S1A-008:32]
As Table 4 indicates, the intentional use is the most frequent of the special uses. Mair and
Hundt (1995: 116) suggest that it has been a factor in the growth of the progressive.
v. In the “matter of course” use the progressive combines with will (or occasionally
shall or be going to) to suggest that the actualization of the situation is inevitable, a
matter of course, as in (9).
(9) ’Cause she won’t be taking anything good. Because they’ll be doing rough and
tumble stuff. [ICE-AUS S1A-005:167, 169]
A comparison of the proportion of special uses in the Englishes once again shows
NZE lagging behind the other varieties and BrE holding an intermediate position.
This time, however, AusE is ahead of AmE (AusE 21.9% > AmE 18.8% > BrE 17.9% >
NZE 15.3%). The differences between the Englishes are even more marked if we restrict
ourselves to just speech: AusE 29.2% > AmE 19.0% > BrE 18.8% > NZE 13.4%.
Table 4. Special uses of the progressive across the four Englishes

                    ICE-AUS        ICE-NZ         C-US           ICE-GB         TOTAL
Interpretive        41 (26.6%)     24 (17.5%)     63 (53.4%)     43 (36.4%)     171 (32.4%)
Attitudinal         0 (0.0%)       1 (0.7%)       5 (4.2%)       0 (0.0%)       6 (1.1%)
Politeness          3 (1.9%)       0 (0.0%)       9 (7.6%)       5 (4.2%)       17 (3.2%)
Intentional         98 (63.6%)     81 (59.1%)     32 (27.1%)     63 (53.4%)     274 (52.0%)
Matter-of-course    12 (7.8%)      31 (22.6%)     9 (7.6%)       7 (5.9%)       59 (11.2%)
TOTAL               154 (100%)     137 (100%)     118 (100%)     118 (100%)     527 (100%)

8. Contraction
As noted in Section 5 above, one tangible indicator of colloquialization is contraction.
Not surprisingly the incidence of contraction (of forms of be, and not) was far greater
in speech (53.5%) than it was in writing (14.0%). Smith (2002: 326) reports a large rise
in the frequency of be-contractions and not-contractions in his written BrE data.

Table 5. Contracted progressives in speech and writing across the four Englishes

            ICE-AUS        ICE-NZ         C-US           ICE-GB         TOTAL
Speech      319 (59.0%)    264 (51.2%)    248 (52.1%)    235 (51.2%)    1066 (53.5%)
Writing     31 (14.6%)     55 (14.6%)     22 (14.7%)     24 (11.9%)     132 (14.0%)
TOTAL       350 (46.5%)    319 (35.7%)    270 (43.1%)    259 (39.2%)    1198 (40.8%)

Note: The percentages represent the frequencies of contracted progressives as a proportion of all progressive tokens in
speech, in writing, and in speech + writing, respectively.

The figures presented in Table 5 show that in its preference for contracted
progressives AusE is again in the forefront of change. For this variable the ordering
of the varieties is as follows: AusE > AmE > BrE > NZE.
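
The percentages in Table 5 relate the contracted tokens to the progressive totals given in Table 1. As an illustration, the Python sketch below recomputes the overall contraction rate for each corpus and the pooled rates for speech and writing; rounding to one decimal place is assumed, and the helper name pct is hypothetical.

```python
# (speech, writing) token counts per corpus.
contracted = {  # contracted progressives, Table 5
    "ICE-AUS": (319, 31),
    "ICE-NZ": (264, 55),
    "C-US": (248, 22),
    "ICE-GB": (235, 24),
}
all_progressives = {  # all progressives, Table 1
    "ICE-AUS": (541, 212),
    "ICE-NZ": (516, 378),
    "C-US": (476, 150),
    "ICE-GB": (459, 201),
}


def pct(part, whole):
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Contraction rate per corpus, speech and writing combined.
overall_rate = {
    corpus: pct(sum(contracted[corpus]), sum(all_progressives[corpus]))
    for corpus in contracted
}
# Rates pooled over the four corpora, separately for speech and writing.
pooled_speech = pct(sum(c[0] for c in contracted.values()),
                    sum(p[0] for p in all_progressives.values()))
pooled_writing = pct(sum(c[1] for c in contracted.values()),
                     sum(p[1] for p in all_progressives.values()))

for corpus in sorted(overall_rate, key=overall_rate.get, reverse=True):
    print(f"{corpus}: {overall_rate[corpus]}% contracted")
print(f"speech: {pooled_speech}%, writing: {pooled_writing}%")
```

Because speech and writing contribute unequal numbers of progressives in each corpus, the overall rate is a weighted figure and cannot be read off as the mean of the speech and writing percentages.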
9. Conclusion
The present study suggests that the antipodean Englishes are more advanced than the
northern hemisphere Englishes in the growth of the progressive. In sheer frequency
of tokens the former outstrip the latter. Furthermore, of the southern hemisphere pair
it is AusE that tends to lead NZE, while in the north it is AmE that tends to be more
advanced than BrE. This is what we find with the frequency of progressive tokens in
speech rather than speech and writing combined (i.e. AusE > NZE > AmE > BrE), and
also for contracted progressives and for special uses of the progressive (except that
C-US and ICE-GB have the same number of special use tokens). Slightly different is
the ordering for complex progressive forms: while the antipodean Englishes are again
ahead of the northern hemisphere Englishes, NZE outstrips AusE, and BrE outstrips
AmE. However, the overall picture is not in doubt. If we can trust that the corpora are
representative of their regional varieties, the popularity of the progressive is greater in
the southern hemisphere (where it is in turn greater in AusE than in NZE) than in the
northern hemisphere (where it is in turn greater in AmE than in BrE). These findings are thus
compatible with claims that AusE and NZE are endonormative in this respect, and
have consolidated their own norms as independent national standards (cf. Collins forthcoming;
Collins & Peters 2008; Hundt 1998).
References
Biber, Douglas, Stig Johansson, Geoffrey Leech, Susan Conrad & Edward Finegan. 1999.
Longman Grammar of Spoken and Written English. London: Longman.
Collins, Peter. (forthcoming). “Australia”. In Martin Ball (ed.), Sociolinguistics around the World.
Oxford: Routledge.
Collins, Peter & Pam Peters. 2008. “Australian English morphology and syntax”. In
Kate Burridge & Bernd Kortmann (eds), Varieties of English 3: The Pacific and Australasia,
341–61. Berlin: Mouton de Gruyter.
Huddleston, Rodney & Geoffrey Pullum. 2002. The Cambridge Grammar of the English Language.
Cambridge: Cambridge University Press.
Hundt, Marianne. 1998. New Zealand English Grammar: Fact or Fiction? Amsterdam: John
Benjamins.
Jespersen, Otto. 1931. A Modern English Grammar on Historical Principles. Vol 4. Heidelberg:
Carl Winter.
Killie, Kristin. 2004. “Subjectivity and the English progressive”. English Language and Linguistics
8(1): 25–46.
Ljung, Magnus. 1980. Reflections on the English Progressive. Gothenburg: Gotab.
Mair, Christian & Geoffrey Leech. 2006. “Current changes in English syntax”. In Bas Aarts &
April McMahon (eds), Handbook of English Linguistics, 318–42. Oxford: Blackwell.
Mair, Christian & Marianne Hundt. 1995. “Why is the progressive becoming more frequent
in English? A corpus-based investigation of language change in progress”. Zeitschrift für
Anglistik und Amerikanistik 43(2): 111–22.
Mindt, Dieter. 2000. An Empirical Grammar of the English Verb System. Berlin: Cornelsen.
Quirk, Randolph, Sidney Greenbaum, Geoffrey Leech & Jan Svartvik. 1985. A Comprehensive
Grammar of the English Language. London: Longman.
Römer, Ute. 2005. Progressives, Patterns, Pedagogy. A Corpus-driven Approach to English
Progressive Forms, Functions, Contexts and Didactics. Amsterdam: John Benjamins.
Scheffer, Johannes. 1975. The Progressive in English. Amsterdam: North-Holland.
Smith, Nicholas. 2002. “Ever moving on? The progressive in recent British English”. In
Pam Peters, Peter Collins & Adam Smith (eds), New Frontiers of Corpus Research: Papers from
the Twenty-first International Conference on English Language Research on Computerized Corpora, 317–30. Amsterdam: Rodopi.
Smitterberg, Eric. 2005. The Progressive in 19th Century English. A Process of Integration.
Amsterdam: Rodopi.
Strang, Barbara. 1982. “Some aspects of the history of the be + ing construction”. In John Anderson
(ed.), Language Form and Linguistic Variation: Papers dedicated to Angus McIntosh [Current
Issues in Linguistic Theory 15], 427–74. Amsterdam: John Benjamins.
Wright, Susan. 1995. “The mystery of the modal progressive”. In Dieter Kastovsky (ed.), Studies
in Early Modern English, 467–85. Berlin: Mouton de Gruyter.
The mandative subjunctive in spoken English
Pam Peters*
Macquarie University
Twentieth century corpus-based research on regional variation of the mandative
subjunctive has shown it to be standard usage in American English but limited
in British English. This research reviews the use of the mandative in spoken data
from six ICE-corpora, to show marked regional differences among both settler
and indigenized varieties of English. While its currency in spoken data from New
Zealand is relatively low, it is on a par with written usage in Australian English,
as well as Singaporean and Philippine English. However spoken instances of the
mandative are typically found in public and institutional dialogue/monologue,
rather than private conversation, so that it cannot be said to have become
vernacularized.
1. Introduction: A vexed construction
The status and currency of the mandative subjunctive ((the) MS)1 prior to the
twentieth century is rather uncertain. Research based on limited data from the
ARCHER corpus suggests that MS was not common in either AmE or BrE of
the eighteenth and nineteenth centuries (Hundt 2009: 30–1). The twentieth century has
seen repeated forecasts of its extinction, by lexicographers, usage commentators,
grammarians and creative writers. They begin with Henry Bradley’s comment in The
Making of English (1904): that MS would survive no more than a generation. Henry
Fowler – a generation later in A Dictionary of Modern English Usage (1926) – is similarly
pessimistic about its future: “moribund except in a few specified cases”. As usage commentator, Fowler counseled avoidance of MS because it was easily misunderstood or
misused by the average writer. Somerset Maugham, in A Writer’s Notebook (1949),
advises giving it the coup de grace, because it is “in its death throes, and the best thing
to do is to put it out of its misery”. If that were not sufficient disincentive to use it,
George Vallins, in Good English: How to write it (1951) declares that the last stronghold
of MS is in “poetry, and adolescent romantic poetry at that”. Three generations later
Quirk et al. (1985) take a more objective view, concluding that its use represents a
stylistically marked choice, usually associated with formal and legal style.
* The contribution of my research assistant Yasmin Funk in the gathering and analysis of data
is warmly acknowledged.
1. The abbreviation MS is used for convenience in the main text of this paper for the
mandative subjunctive.
We might observe that all five of those commentators are British, and that the
first four are curiously vexed by the construction – inclined to talk it down and out
of existence. Only the fifth makes it clear that MS is still used in BrE, though as a
marked feature of more conservative and formal styles. The possibility that its currency and contexts of use are somewhat different in other varieties of English is not
contemplated.
2. Previous corpus-based studies of the mandative subjunctive
in British and American English
One of the early triumphs of corpus research was the demonstration based on the
Brown and LOB corpora (Johansson & Norheim 1988) that use of MS was much
livelier in AmE than BrE, and that it was not as moribund as British writers and grammarians had thought. This demonstration was followed by Gerd Overgaard’s diachronic
study (1995) of MS, showing how its use in the two major varieties of English had
been on divergent paths for most of the twentieth century. She used a diachronic corpus of published material from five time points in the century, to show that British
usage of MS around 1900 was almost zero; and that even in the US it was relatively low.
This tallies with Charles Fries’s finding in American English Grammar (1940): there
was little evidence of MS in his corpus of US government correspondence dating from
World War I. But following the war (between 1920 and 1940) there was a rapid rise in
American use of MS, and a continuing rise from 1940 to 1960, while British usage
hardly changed. This large difference between the two regional varieties helps to explain
the British commentators’ alienation from the use of MS. Overgaard’s data also suggests
why British use of MS rose sharply between 1960 and 1990, to above 50% of susceptible
constructions: most likely through American influence after World War II.
That level of usage was much higher than before World War II, though still far lower
than that of contemporary AmE, which she found to be closer to 90%.
Substantial differences in British and American levels of usage of MS have continued
to the close of the twentieth century, as documented through its relative frequency in
LOB/FLOB and Brown/Frown by Hundt, Hay and Gordon (2004), Mair and Leech
(2006) and others. Data from FLOB shows a definite rise in British use of MS from
12.9% to 39.6% (Hundt et al. 2004: 316), not as great as in Overgaard’s data, but still
an upward movement. The level of usage in Frown meanwhile is more or less stable at
around 89%. BrE is still much less inclined to use MS than AmE.
A point to note in all these studies of MS usage is that they were based almost
entirely on written prose as source material. Overgaard (1995) did add drama texts to
her corpora, but their status as representations of speech is always debatable. In fact
very few of her examples came from the drama texts, as the detail of her appendix
shows. So the evidence we have showing the liveliness of MS in published
prose doesn’t directly challenge the notion that it belongs to rather formal styles of
writing – except for the scattering of examples in the everyday prose categories of
Frown/FLOB. So far we have no evidence that MS can be used freely in conversation
and spoken interactions, which would suggest that it is stylistically more neutral than
Quirk et al. (1985), among others, have maintained.
3. Variation in postcolonial Englishes in their use
of the mandative subjunctive
The survival of MS has been affirmed in several postcolonial Englishes. Its presence
in NZE was demonstrated by Hundt (1998a), and in AusE through Peters’s (1998)
research. The levels of MS usage found in their research put both varieties rather closer
to AmE than BrE, even though BrE provided the linguistic stock for both antipodean
varieties. However both AusE and NZE were found to occupy an intermediate position
between BrE and AmE on other grammatical variables.
The frequency of MS in indigenized – as opposed to settler – varieties of English
(Schneider 2007) is a further frontier of interest, opened up by Schneider (2000)
himself, through a corpus-based study of Indian English using data from the Kolhapur
corpus with written data from c.1980. He found that the level of MS use in Indian
English was substantially higher than in LOB (c.30%), and closer to that of FLOB
(Schneider 2000, 2005). This is consistent with the growth in use of MS in BrE
mentioned in Section 2, though it may also be a larger regional or worldwide phenomenon. It is also arguable that the higher level of MS usage in Indian English of
the 1980s, vis-à-vis the BrE on which it is based, owes something to the neutralization
of MS’s formal associations in BrE (Quirk et al. 1985), so that it can be used across
more registers of writing than speakers of settler varieties might expect. A lack of
register differentiation in indigenized varieties of English has been associated by
some researchers with the wider distribution of other grammatical elements of
more formal style (Hundt 2006). Whether it represents exonormativity or a kind of
endonormativity is the further issue to discuss.
A second study of MS in an indigenized variety of English, i.e. Philippine English
(PhilE), raises similar questions of exonormativity. There Schneider (2005: 37) found
substantially greater use of MS in ICE-PHIL than in Indian English; in fact it was on a
par with AmE as documented in the Brown and Frown corpora. This is unsurprising,
given that PhilE was based on AmE. Schneider was able to separate MS frequencies in
the spoken and written components of ICE-PHIL, and found that it was actually somewhat more common in the spoken data. Because it was used across the various spoken
and written text types, he found “no difference in the overall propensity of any style, or
form of expression, to use the (mandative) subjunctive” (p.35). This conclusion about
the stylistic neutrality of MS in PhilE invites comparison with its use in a wider range
of ICE corpora, including both settler and indigenized varieties.
4. Written vs. spoken use of the mandative subjunctive
The question of using MS in speech was first broached by Hoffmann (1997), with
his research on the occurrences of the BE subjunctive in the BNC. He showed that
it actually occurred more often in the BNC’s spoken subcorpus than in imaginative
writing – a radical discovery, challenging the notion of its inherent formality in BrE.
A second challenge came with the research of Hundt (1998b), comparing data
from the same BNC spoken subcorpus with written data from FLOB, to find that
MS occurred slightly more often in the spoken data. She noted however that many
of the spoken instances came from the context-governed part of the subcorpus, and that
far fewer came from the demographic part. This finding does not support the
suggestion that MS is stylistically neutral, at least not in BrE.
Schneider’s (2005) study of MS in PhilE, mentioned in the previous section, was
the first to compare spoken and written use of MS in an indigenized variety. He too
found that it was definitely more frequent in spoken than written material, and though
quite a number of instances came from “rather formal” political speeches, there were
occurrences in “fairly informal phrases as well”. All these findings seem to challenge
the association of MS with formal written style, and to raise the possibility that in postcolonial Englishes at least, the use of MS is relatively neutral. Whether it might have
even “vernacular” associations (Overgaard 1995: 47, 50) is a further issue. There’s little
doubt that if MS is used in ordinary conversation, its prospects of survival are a good
deal better than if it is confined to formal style. In formal style its future would be jeopardised
by the world-wide trend away from formality and towards colloquialization of English
(Mair 2006: 187–93).
5. Spoken and written data from six ICE corpora
Research on spoken and written English is now greatly helped by the availability of
ICE corpora for several other varieties of English to parallel Schneider’s (2005) study.
With data from ICE-AUS, ICE-NZ, ICE-GB, ICE-SING, ICE-IND we can compare the
frequency of MS in spoken and written usage of both settler and indigenized varieties,
and probe further the question of what types of spoken usage MS occurs in. Does it
occur in all four subtypes of speech represented in ICE (private conversation, public
discussion, public monologue, scripted monologue)? Are there marked differences
between the two regional sets?
Table 1 below presents the summary numerical findings for the use of MS in our
six varieties of English, for a common set of matrix verbs (verb lemmas and related
nouns) which are identified below (p. 132) in Table 2. Although computer string
searches could be used to extract relevant examples of MS, a good deal of culling was
needed to filter them out from:
a. mandative constructions using modal paraphrases with should, and a not inconsiderable number of others, usually deontic, with must, had to, need to, whereas the
meaning of would, will, may, might etc. varies with the context.
b. nonmandative uses of the same verb, as when suggest is used to introduce a declarative content clause. Compare:
(1) Are you suggesting that we should find a lawyer for the Board? [ICE-AUS S1B-074:199]
(2) A poll taken last week suggests that National will hold the seat [ICE-NZ S2B-077:103]
This discrimination needs to be made with other verbs such as propose. The distinction is often signaled by the type of modal used: deontic modals or semimodals
are taken to signify mandative intent, whereas typically epistemic
modals or semimodals (like will in the second example above) are taken to signify
nonmandative intent. The data on modal paraphrases of MS, shown in the last line
of Table 1 include instances with should, must, have to, would, could, can, might, may.
Negative forms have also been included.
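The triage of modals described above can be sketched in code. This is a minimal illustration only, not the procedure actually applied to the ICE data: the function name triage_modal and the category labels are hypothetical, and the two word lists simply restate the modals named in the text.

```python
# Hypothetical sketch: sort the modal of a that-clause into the groups
# distinguished in the text. Word lists restate the modals the text names.

# Modals and semimodals the text treats as (usually) deontic.
DEONTIC = {"should", "must", "have to", "had to", "need to"}

# Modals whose meaning the text says varies with the context.
CONTEXT_DEPENDENT = {"would", "will", "may", "might", "could", "can"}


def triage_modal(modal: str) -> str:
    """Return a working label for the modal found in a complement clause."""
    m = modal.strip().lower()
    if m in DEONTIC:
        return "mandative paraphrase"
    if m in CONTEXT_DEPENDENT:
        # e.g. epistemic 'will' in example (2) signals nonmandative intent,
        # but the reading has to be confirmed from the context.
        return "check context"
    return "unclassified"


if __name__ == "__main__":
    print(triage_modal("should"))  # modal in example (1)
    print(triage_modal("will"))    # modal in example (2)
```

A lookup of this kind only narrows the search; the culling itself still depends on reading each example in context.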
Table 1. Data from six ICE corpora on the relative frequency of MS in spoken and
written material, and total for modal paraphrases, with normalizations per 1m. words
shown afterwards in brackets

                           ICE-AUS     ICE-NZ       ICE-GB      ICE-SING    ICE-IND     ICE-PHIL*
MS in spoken               23 (36.8)   16 (25.6)    6 (9.6)     25 (40)     11 (17.6)   30 (48)
MS in written              17 (42.5)   47 (117.5)   11 (27.5)   24 (60)     10 (25)     23 (57.5)
Total: MS                  40          63           17          49          21          53
Total: modal paraphrases   11          16           28          16          41          19

*ICE-PHIL data have been extracted from Schneider (2005)
The differences between raw frequencies from spoken/written parts of the corpus don’t all go one way. While those for ICE-AUS and ICE-PHIL have MS used
rather more often in speech, those for ICE-NZ and ICE-GB go the other way. Those
for ICE-SING and ICE-IND are much of a muchness, and the differences among the
three sets of results for the indigenized varieties are not statistically significant. However
the differences in the spoken and written frequencies for the three settler varieties are statistically significant, with the chi-square test putting the p-value below
the 0.005 threshold (p < 0.0045). The normalizations of the raw frequencies
per 1 million words also confirm that the differences between spoken and written
frequencies are greater within the settler group than the indigenized group.
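The two calculations behind this paragraph can be retraced from the raw counts in Table 1. The sketch below rests on stated assumptions rather than on the author's own procedure: the word counts used as denominators (625,000 spoken, 400,000 written) are inferred from the bracketed figures in the table, not given in the text, and the test is taken to be a plain Pearson chi-square on each 3 x 2 table of raw counts. With 2 degrees of freedom the p-value has the closed form exp(-x/2), so no statistics library is needed.

```python
import math

# Raw MS counts (spoken, written) as read from Table 1.
SETTLER = {"ICE-AUS": (23, 17), "ICE-NZ": (16, 47), "ICE-GB": (6, 11)}
INDIGENIZED = {"ICE-SING": (25, 24), "ICE-IND": (11, 10), "ICE-PHIL": (30, 23)}

# ASSUMPTION: denominators inferred from the bracketed normalizations,
# not stated in the text.
SPOKEN_WORDS = 625_000
WRITTEN_WORDS = 400_000


def per_million(count: int, words: int) -> float:
    """Normalize a raw count to a rate per 1 million words."""
    return round(count / words * 1_000_000, 1)


def chi_square_3x2(table: dict) -> tuple:
    """Pearson chi-square for a 3 x 2 table; returns (statistic, p-value)."""
    rows = list(table.values())
    if len(rows) != 3 or any(len(r) != 2 for r in rows):
        raise ValueError("expected exactly three rows of two counts")
    col_totals = [sum(r[j] for r in rows) for j in range(2)]
    grand_total = sum(col_totals)
    chi2 = 0.0
    for r in rows:
        row_total = sum(r)
        for j in range(2):
            expected = row_total * col_totals[j] / grand_total
            chi2 += (r[j] - expected) ** 2 / expected
    # df = (3 - 1) * (2 - 1) = 2, for which the upper tail is exp(-x / 2).
    return chi2, math.exp(-chi2 / 2)


if __name__ == "__main__":
    for name, (spoken, written) in {**SETTLER, **INDIGENIZED}.items():
        print(name, per_million(spoken, SPOKEN_WORDS),
              per_million(written, WRITTEN_WORDS))
    print("settler:", chi_square_3x2(SETTLER))
    print("indigenized:", chi_square_3x2(INDIGENIZED))
```

Under these assumptions the settler group yields a p-value of about 0.0045, in line with the figure reported above, while the indigenized group falls well short of significance.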
The ratios between use of MS and modal paraphrases in each ICE-corpus also
diverge considerably. ICE-GB and ICE-IND set themselves apart by their strong preference for modal paraphrases, where the other four varieties are all strongly inclined to
MS. Here the divergent frequencies within each set are highly statistically significant,
with the chi-square test showing p-values of <0.0001 for both the settler and the indigenized
set. There are also regional differences which cut across the settler/indigenized groups.
Both ICE-IND and ICE-GB share the inclination to paraphrase MS, much as Schneider
(2005) found when comparing the relative frequency of the two constructions in the
Kolhapur corpus and FLOB. We might then expect the same to apply to ICE-SING as
a derivative of BrE. Instead it shows a strong preference for MS, like ICE-PHIL, which
is AmE-based. The ICE-SING result is particularly surprising, given that this data was
collected in the 1990s at the height of the Singaporean government’s campaign to “Speak
good English”, for which the model was BrE. Also of interest, but beyond the scope of
this paper, is the very wide range of deontic modals and quasi-modals used in ICE-IND,
as well as remarkable paraphrases such as:
(3) I request you truly sir, that please bring in the practice that… [ICE-IND S1B-051:54]
Examples like this are unique to the Indian corpus.
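The contrast between MS and modal paraphrases in Table 1 can be made concrete by expressing MS as a share of all mandative constructions in each corpus. The sketch below is an illustration only: the names TOTALS, ms_share and prefers_ms are hypothetical, and the counts are the two total rows as read from Table 1.

```python
# Totals as read from Table 1: (MS, modal paraphrases) for each corpus.
TOTALS = {
    "ICE-AUS": (40, 11),
    "ICE-NZ": (63, 16),
    "ICE-GB": (17, 28),
    "ICE-SING": (49, 16),
    "ICE-IND": (21, 41),
    "ICE-PHIL": (53, 19),
}


def ms_share(ms: int, modal: int) -> float:
    """Proportion of mandative constructions realized as MS."""
    return ms / (ms + modal)


def prefers_ms(corpus: str) -> bool:
    """True when MS outnumbers modal paraphrases in the given corpus."""
    ms, modal = TOTALS[corpus]
    return ms > modal


if __name__ == "__main__":
    for corpus, (ms, modal) in TOTALS.items():
        print(f"{corpus:9s} MS share = {ms_share(ms, modal):.1%}")
```

On these counts only ICE-GB and ICE-IND have more modal paraphrases than instances of MS, which is the split described above.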
Table 1 above shows that both AusE and NZE make more use of MS than ICE-GB.
This contrast between northern and southern hemisphere varieties is like that found
by Hundt (1998), except that the overall level of MS usage in ICE-AUS puts it closer
to ICE-NZ than ICE-GB. In their utilization of MS, the two southern hemisphere
varieties diverge considerably. For ICE-NZ it is very strongly associated with writing, whereas for ICE-AUS, it appears about equally in spoken and written discourse.
This might seem to indicate that MS is stylistically more neutral in Australia, although
other possibilities will be discussed in Section 7 below, when we break the spoken data
down into the different contexts of use. In its level of MS usage, ICE-GB is the lowest
of the three settler varieties and indeed the lowest of the six varieties examined here.
Its position as the least MS-friendly variety is confirmed.
The data for the three indigenized varieties in Table 1 show strong similarities
between ICE-SING and ICE-PHIL in their overall levels of MS usage and similar distribution of MS in spoken and written discourse, whereas ICE-IND shows much less
evidence of its use. If the strength of MS in ICE-PHIL reflects its AmE base, this might
be seen as a symptom of exonormativity, and especially if the extended use of MS is
seen as a lack of register differentiation (cf. Hundt 2006). However the very similar
pattern of usage in ICE-SING cannot be explained in the same way – unless we allow
that CNN is now a more potent linguistic influence in Singapore than the BBC. This is
further discussed in Section 8 below.
6. Matrix verbs for the mandative subjunctive across six ICE corpora
Research on the use of MS has shown that it regularly complements a specifiable set of
suasive verbs, shown in Table 2 below. Included with each verb are instances where MS
is used in the clause complementing the related abstract noun, e.g. advice for advise,
insistence for insist, though they do not add substantially to their frequency. Certain
other structures also cooccur occasionally with MS (cf. Hundt 1998b: 161–2), including adjectives such as necessary (and the related noun necessity), and a wider range of
verbs and verb phrases e.g. agree, as in:
(4) It was agreed that the budget be placed…[ICE-IND S1B-080:48]
However these other matrices are relatively uncommon, and provide little comparative
material across multiple varieties of English in both spoken and written modes, so have
not been included here.
The highest frequencies of MS in the ICE data are with the verbs suggest, demand, recommend and move, as shown in Table 2 below. All four belong to the subset which require a
that clause complement: demanded that he bring a partner. The nonfinite construction:
*demanded him to bring a partner is not available. While this syntactic requirement probably contributes to the high MS scores for those four verbs, it also means that the use of
MS is lexically conditioned in some cases, as noted in Schneider (2005: 35). By the same
token, the relatively frequent use of MS with verbs which do allow a choice between clausal
(i.e. finite) and nonfinite forms of complementation (e.g. request, require, insist, ask)
is all the more significant. The use of MS with these middle-frequency verbs indicates
that MS is still freely used within the variety, as the optional rather than the required
construction. It has been suggested that the nonfinite option is more popular in conversation, as can be shown for ask (Hundt 1998b). In the fiction and news texts of the
Longman corpus, ask is the most frequently found speech act verb with a to- infinitive
(with or without an intervening NP), as shown by Biber et al. (1999: 712–13). At the
same time Biber et al. (1999: 664) list ask as one of the controlling verbs which less
Table 2. Relative frequency of MS with the canonical set of matrix verbs and their related
abstract nouns, across six regional ICE corpora
ICE-AUS
ICE-NZ
advise
ask
demand*
insist*
move*
order
propose*
recommend*
request
require
stipulate*
suggest*
urge
2
1
9
3
12
1
3
2
1
6
7
9
5
4
TOTAL
40
ICE-GB ICE-SING
1
3
13
3
7
9
3
2
1
3
2
4
2
2
1
6
4
7
2
8
6
1
2
12
63
17
49
ICE-IND
ICE-PHIL total
5
2
1
1
2
1
3
8
6
1
1
3
3
5
17
4
5
1
5
21
53
3
17
35
17
26
9
15
28
25
22
1
39
6
* Asterisked verbs take only (finite) clausal complements; those without an asterisk take either clausal or
infinitival complements
commonly take a that-clause, so its bias is clear. The fact that ask nevertheless appears
in the mid-range of frequencies with MS in Table 2 suggests the continuing viability
of the construction, at least in NZE and Philippine English. Meanwhile Biber et al.
(1999: 663, 665) confirm that suggest is commonly found as a matrix (controlling) verb
with that clauses across all written registers in the Longman corpus, correlating with
its relatively high frequency across all the varieties of English shown in Table 2.
7. Spoken contexts for the use of the mandative subjunctive
In earlier discussion we questioned whether the occurrences of MS in spoken data
might signify its availability in all types of spoken interaction, public and private, more
and less formal. Let us now examine the distribution of data for the 6 matrix verbs
which most often take MS in the ICE data – those with a frequency of 20 or more.
Those with lower frequencies offer insufficient instances for further subdivision into
the four speech contexts of ICE, and even with a threshold of 20, the distribution
is quite uneven. Note also that of the four speech types included in ICE, only one
(S1A private conversation) could indicate that MS has become “vernacularized” in
that variety. The other three speech types do not constitute spontaneous conversation
among equals, because all involve institutional contexts for speaking. S1B (public
dialogue) is sampled from settings such as courts and classrooms, as well as broadcast
discussions, and though interactive they are contexts in which the parties in dialogue
play unequal roles and form unequal dyads. Both S2A and S2B consist of monologic
speech, scripted and unscripted, i.e. noninteractive speech, which is fully or partly
prepared. The distribution of MS across all four types of speech is shown in Table 3
below for the most frequent of the verbs shown in Table 2 above.
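The selection of matrix verbs for this breakdown can be retraced from the row totals in the final column of Table 2. The sketch below is illustrative: the names VERB_TOTALS, THRESHOLD and high_frequency_verbs are hypothetical, and the totals are those read from the table for all six corpora combined.

```python
# Row totals as read from the final column of Table 2 (six corpora combined).
VERB_TOTALS = {
    "advise": 3, "ask": 17, "demand": 35, "insist": 17, "move": 26,
    "order": 9, "propose": 15, "recommend": 28, "request": 25,
    "require": 22, "stipulate": 1, "suggest": 39, "urge": 6,
}

# Cut-off stated in the text: a frequency of 20 or more.
THRESHOLD = 20


def high_frequency_verbs(totals: dict, threshold: int = THRESHOLD) -> list:
    """Return, in alphabetical order, the verbs at or above the threshold."""
    return sorted(verb for verb, n in totals.items() if n >= threshold)


if __name__ == "__main__":
    print(high_frequency_verbs(VERB_TOTALS))
```

Applying the cut-off of 20 leaves the six verbs that appear in Table 3.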
Table 3. Distribution of MS over four types of speech for the 6 highest-ranking matrix
verbs over five regional ICE corpora
demand
S1A and S1B
S2A and S2B
move
S1A and S1B
S2A and S2B
recommend
S1A and S1B
S2A and S2B
request
S1A and S1B
S2A and S2B
require
S1A and S1B
S2A and S2B
suggest
S1A and S1B
S2A and S2B
TOTAL MS for 6 verbs
ICE-AUS 3
3
ICE-NZ ICE-GB 1
1
ICE-SING
ICE-IND 1*
12
2
1
6
3
3
1
5
2
2
1
1
2
1
2*
22
1
1*
15
1
1
1
1*
20
1
1
3
3
8
Legend: instances of MS found in S1A are set in bold and asterisked
Table 3 shows that occurrences of MS are scattered pretty thinly over the four
speech types included in the ICE corpora. The paucity of instances of MS occurring
in spontaneous conversation (S1A) is evident – a total of 5 in all. Of those instances,
4 occur with the matrix verb suggest, the most frequent matrix verb for MS in the set
examined in this study. Interestingly, suggest and request both appear in Stenström’s
(1994) list of primary speech acts, as recurrent conversational moves of the kind which
could be enacted between friends and equals. But Table 3 shows that suggest is the
only verb which occurs more than once with MS in ICE conversations, where it can be
stylistically neutral. It also appears in S1B settings in four out of the five varieties, i.e.
in institutional contexts and unequal dyads (e.g. teacher/student), where its use carries
more weight than in conversation.
Overall there is scant evidence of MS usage becoming “vernacular” in the sense of
it being everyday conversational usage. Table 3 shows rather that most instances of MS
occur in the discourse associated with institutional settings. Almost a third of those
from the ICE data examined are instances of move with MS, which is symptomatic of
formal meeting procedures and found in all five varieties. The other high-frequency
verbs occurring with MS are associated more with S2A and S2B contexts, i.e. public
monologues, scripted and unscripted, across all varieties of English. The use of MS
with demand in both ICE-AUS and ICE-SING exemplifies its public function, to mandate the actions of others when there is no right of reply. Not to be used in social
conversation! Likewise the relatively frequent use of MS with recommend in S2B in
ICE-NZ smacks of public speaking rather than interpersonal transactions.
So the ICE spoken data shows that MS is not often generated in conversation or
interactive speech. Most cases are found in institutionalized settings, where the directive speech acts with which they are associated are used for the management of others
or ritual purposes. MS does not seem to be part of the conversational repertoire: it
would be odd to express everyday requests with demand, or provide advice for your
friends with verbs like recommend. Such verbs are probably more acceptable as part
of professional consultation, where professional advice is sought in an unequal dyad.
They carry that institutional resonance even when they do not come from the lips of a professional.
In an auxiliary corpus of Australian talkback radio (ART) there were instances of MS
with verbs like demand and recommend especially in samples from commercial radio,
from the demagogic type of anchorperson we know as “shockjocks” – those who want
to use the microphone to plug their views over the airwaves. Interestingly there were
few examples of MS in data from the much more discursive ABC Radio National talkback, where the anchorpersons are proactive in seeking the views of listeners in the
community, and engaging with them in natural conversation.
8. The future of the mandative subjunctive in world English
With this close look at MS data from settler and indigenized varieties of English in
different speech contexts, the future of MS still looks rather constrained. It is still
largely confined to institutional kinds of speech, and not readily used in ordinary
conversation. The linguistic constraints on its use include the fact that the suasive
verbs like demand which currently show the highest frequencies of MS also have
rather specialised uses, because of the unequal interpersonal relationship they presuppose, and discoursal settings in which they can be used. Even where it is demonstrably still current usage, MS seems to be marked rather than stylistically neutral.
The use of MS with almost half the verbs shown in Table 2 is also challenged by the
fact that it competes with nonfinite constructions, especially in conversation.
Its popularity probably varies within particular English-speaking communities,
which may lead to declining use over the course of time.
The sociolinguistic matrix for the use of MS in the Australian community has
been surveyed more than once through the magazine Australian Style (in 1993 and 2004).
On the first occasion, there was more than 80% support overall for the two following
test sentences, one using a suasive verb, the other a suasive noun:
(5) They insisted that the complaint be presented in writing
(6) She expressed the wish that her jewelry be given to charity.
When the survey including those sentences was rerun in 2004, the overall response
in favor of MS was down to 67% – still a majority, but not as strong an endorsement
as before.
Sensitivity to using MS is still there in the Australian community, to judge by a
recent flurry of interest in it in correspondence to the Sydney Morning Herald, where a
quotation from Winston Churchill: All that matters is that he goes was hypercorrected
by one writer to All that matters is that he go. The hypercorrector’s version was then
challenged by other readers/writers, who defended Churchill’s usage as conversational
style – and/or his well-known disregard for grammatical strictures, being the orator
that he was. This storm in a teacup incidentally raises other issues in the use of MS
which have not been discussed within the confines of this paper, including the fact that
BrE makes greater use of the indicative as an alternative to MS (Overgaard 1995: 41, 52).
It also highlights the role of prescriptivism in supporting or countering the use of MS,
whose impact is measurable (Peters 2006).
Substantial variation between regional varieties of English in their use of MS has
emerged from the comparative corpus data analyzed in Sections 5–7 above. Support
for MS is still much stronger outside BrE than within it, despite some increase in its
use in the later twentieth century (Section 2 above). The contrasting data from AusE
and NZE in Table 1 show greater usage of MS in Australian speech and in NZ writing,
though both may be the products of sampling. The ICE-AUS data is certainly swelled
by 12 instances of MS found with move as used in a formal meeting (as shown in
Table 3). There may be some skewing in the samples of writing for ICE-NZ, which are
somewhat conservative in their syntax, as has emerged in research on other syntactic
variables such as the use of no-negation (Peters 2008).
Closer affinities than we might have expected in the use of MS have emerged
from ICE-SING and ICE-PHIL data (shown in Table 1), with considerable frequencies in spoken as well as written contexts. Perhaps this similarity is underpinned by
their participation in cooperative regional alliances such as SEAMEO (South-East
Asian Ministers of Education Organization), which might explain why ICE-SING has
remarkably strong use of MS, contrary to what we might expect for a British-based
variety. However the results for ICE-PHIL show that AmE may indeed have provided
exonormative support for its use of MS. Unfortunately there is no comparable
ICE-US data to show whether MS is as much used in general American speech as in
writing. Wider American influence in the Pacific region during and especially after
World War II through the ANZUS pact may also be a contributing factor, although
New Zealand has been less well-disposed to it than Australia (Peters 2009). This might
help to account for the Australia/New Zealand difference in their use of MS, while
demonstrating that American influence cannot be seen as a simple areal effect.
Despite all the evidence of the survival of MS in many varieties of English, its
significance in English grammar is usually played down, blended in with the discussion
of other syntactic topics. In Biber et al. (1999: 80, 667, 674), these are (i) subject-verb
concord for the third person singular (which MS lacks), and (ii) its use as an alternative
formulation of that clauses with certain controlling verbs and adjectives of necessity.
In Huddleston and Pullum (2002: 993–1004) MS is treated under the general heading
of “content clauses and reported speech”. So in contemporary English, MS is projected
simply as an alternative syntactic resource. Yet the selection of MS retains some stylistic
significance in different varieties of English. In the British ICE data, it remains a formal
construction by its limited frequency and restricted contexts of use (mostly written);
whereas in AusE and NZE, as well as SingE, it is also found in several kinds of public
speech. These do not guarantee it a place in everyday spoken English however, or in
world English in the future.
References
Aarts, Bas & April McMahon (eds). 2006. Handbook of English Linguistics. Oxford: Blackwell.
Australian Style December 1993; December 2004. Dictionary Research Centre: Macquarie
University.
Biber, Douglas, Geoffrey Leech, Stig Johansson, Susan Conrad & Edward Finegan. 1999. Longman Grammar of Spoken and Written English. London: Longman.
Bradley, Henry. 1904. The Making of English. Repr. 2007. Lucas Press.
Fowler, Henry. 1926. A Dictionary of Modern English Usage. Oxford: Clarendon Press.
Fries, Charles. 1940. American English Grammar. New York NY: Appleton Century Crofts.
Hoffmann, Sebastian. 1997. “Mandative sentences. A study of variation based on the British
National Corpus”. Unpublished Lizenziats-Arbeit. University of Zurich.
Huddleston, Rodney & Geoffrey Pullum. 2002. Cambridge Grammar of the English Language.
Cambridge: Cambridge University Press.
Hundt, Marianne. 1998a. New Zealand English Grammar: Fact or Fiction. Amsterdam: John
Benjamins.
Hundt, Marianne. 1998b. “It is important that this study (should) be based on the analysis of
corpora: On the use of the mandative in four major varieties of English”. In Hans Lindquist,
Staffan Klintborg, Magnus Levin & Maria Estling (eds), The Major Varieties of English.
Växjö University, 159–76.
Hundt, Marianne. 2006. “The committee has/have decided … On concord patterns in inner and
outer circle varieties of English”. Journal of English Linguistics 34(3): 206–32.
Hundt, Marianne. 2009. “Colonial lag, colonial innovation, or simply language change?” In
Günter Rohdenburg & Julia Schlüter (eds), One Language, Two Grammars. Cambridge:
Cambridge University Press, 13–37.
Hundt, Marianne, Jennifer Hay & Elizabeth Gordon. 2004. “New Zealand English: Morphosyntax”.
In Bernd Kortmann, Edgar W. Schneider & Kate Burridge (eds), Handbook of Varieties of
English (vol. 2). Berlin: De Gruyter, 560–92.
Johansson, Stig & Else Norheim. 1988. “The subjunctive in British and American English”.
ICAME Journal 12: 27–36.
Mair, Christian. 2006. Twentieth Century English: History, Variation and Standardization.
Cambridge: Cambridge University Press.
Mair, Christian & Geoffrey Leech. 2006. “Current changes in English syntax”. In Aarts &
McMahon (eds): 318–42.
Overgaard, Gerd. 1995. The Mandative Subjunctive in American and British English in the
Twentieth Century. Uppsala, Studia Anglistica Upsaliensis 94.
Peters, Pam. 1998. “The survival of the subjunctive. Evidence of its use in Australian English and
elsewhere”. English World-Wide 19(1): 87–103.
Peters, Pam. 2006. “English usage: prescription and description”. In Aarts & McMahon (eds):
759–80.
Peters, Pam. 2008. “Patterns of negation: the relationship between NO and NOT in regional varieties of English”. In Terttu Nevalainen, Irma Taavitsainen, Päivi Pahta & Minna Korhonen (eds),
The Dynamics of Linguistic Variation: Corpus Evidence on English Past and Present. Amsterdam: John Benjamins, 147–62.
Peters, Pam. 2009. “Australian English as a regional epicentre”. In Lucia Siebers & Thomas Hoffmann
(eds), World Englishes: Problems – Properties – Prospects. Amsterdam: John Benjamins.
Quirk, Randolph, Sidney Greenbaum, Geoffrey Leech & Jan Svartvik. 1985. A Comprehensive
Grammar of the English Language. London: Longman.
Schneider, Edgar W. 2000. “Corpus linguistics in the Asian context: Exemplary analyses of the
Kolhapur corpus of Indian English”. In Maria Lourdes Bautista, Teodoro A. Llamzon &
Bonifacio P. Sibayan (eds), Parangal cang Brother Andrew: Festschrift for Andrew Gonzalez,
Manila, Linguistic Society of the Philippines, 115–37.
Schneider, Edgar W. 2005. “The subjunctive in Philippine English”. In Danilo T. Dayag &
J. Stephen Quakenbush (eds), Linguistics and Language Education in the Philippines and
Beyond: A Festschrift in Honor of Ma Lourdes S Bautista. Manila, Linguistic Society of the
Philippines, 27–40.
Schneider, Edgar W. 2007. Postcolonial English: Varieties of English around the World. Cambridge:
Cambridge University Press.
Stenström, Anna-Brita. 1994. An Introduction to Spoken Interaction. London: Longman.
Light verbs in Australian, New Zealand
and British English*
Adam Smith
Macquarie University
This paper examines regional and register differences in the use of the light verbs
give, have, make and take across British, Australian, New Zealand and American
English, to see whether statements in the literature such as the US preference for
take can be supported. Primary and secondary materials were investigated, in the
form of L1 and L2 dictionaries across the regions, and data from the ICE corpora
for Britain, Australia and NZ. The dictionary data only partially confirmed
regional differences between take and have, while the corpora showed a growing
use of the light verb have, with Australian and New Zealand English leading the
way. The corpora also demonstrated more frequent and more productive use of
the construction in spoken than in written data, which allowed conclusions to be
drawn about the interpersonal functions of light verbs.
1. Definition of “light verb”
Poutsma (1926) and Jespersen (1931) first identified the tendency of modern English to
form verbal expressions with a noun complement, where the semantic content resides
almost entirely in the noun: have a look, take a rest, do a dance etc. The construction,
“hovering between grammar and lexis” as Algeo (2006: 269) remarks, has posed
problems of classification for grammarians and linguists ever since. The sheer range
of terms that have been applied to the phenomenon gives an indication of its ambiguous status. For the verb there is “light or insignificant verb” (Jespersen 1931), “copula”
(Curme 1935), “function verb” (Nickel 1968), “empty” or “stretched” verb (Allerton
2002); and for the construction “verbo-nominal phrase” (Rensky 1964), “complex
verbal structure” (Nickel 1968), “composite predicate” (Cattell 1984) and “expanded
predicate” (Algeo 1995). All of these encompass a wide range of constructions, the
boundaries of which are not always clearly defined (although Algeo does distinguish
*With grateful acknowledgement to Pam Peters for her assistance and encouragement in the
development of this paper, and to Peter Collins for his helpful comments.
between “core expanded predicates” and others (1995: 207)). In order to collect a compatible set of data for regional and register comparison, it is necessary first to establish
some clear guidelines as to what is to be classified as a light verb.
The widest definition of light verbs is that they are “semantically ‘light’ in the sense
that their contribution to the meaning of the predication is relatively small in comparison with that of their complements” (Huddleston & Pullum 2002: 290). This allows for
constructions where the choice of verb might complement the noun without modifying the meaning (ask a question), or where a common verb is added to a noun where
there is no equivalent single-word verb (have mercy). This definition also makes no
stipulation about the form of the noun, so that make a calculation, do the ironing can
be classified as light verb constructions.
At the other end of the scale, it is possible to classify the apparent noun complement as a verb by limiting examples to those where the noun is exactly equivalent to
the infinitive form of the verb (have a swim = to swim etc.). Wierzbicka (1982) took
this approach in her study on the have a V frame. While this allows her to make some
interesting observations on the semantic rules that govern the choice of verb in the
have construction, the criteria for distinguishing between the identical forms as a verb
or a noun are not always clearcut.
A more useful approach for the present study is provided by Kearns (2002) who
divided the traditional class of light verbs into two categories, based on grammatical
principles. Her distinction between the “true light verb” (TLV) and other types of light
verb, which she termed “vague action verbs”, excludes nominalized forms of the verb as
well as other variations where the noun complement is highlighted in some way.
Table 1. The main attributes of TLVs according to Kearns (2002)

| Description of the True Light Verb | Examples |
|---|---|
| 1. The complement is headed by a N which is a stem form identical to a verb | Have a look = look; Take a walk = walk |
| 2. The complement NP must be indefinite | He gave a groan not Who gave the groan? |
| 3. The complement NP cannot be subject of passive | *A groan was given by John |
| 4. The complement NP cannot be the focus of a WH-question or modified by a relative clause | *Which groan did he give? *The groan (which) he gave startled me. |
| 5. The complement cannot be pronominalized | *The deceased gave a groan at midnight and another one later. |
In addition, she notes the particular identification between the stem noun and its
corresponding verb in certain light HAVE constructions, where the noun use appears
to be coined explicitly for the TLV construction in examples such as: Can I have a /ju:z
(*/ju:s/) of your pen or Can I have a lend/borrow of your pen?
The light verb construction’s tendency towards pre-modification¹ shifts the
balance away from this verb-as-complement interpretation. Kearns also makes the
point that modification often appears in TLV phrases to reinforce the “common use
of these constructions as hedging strategies, e.g. I just gave it a little poke and it
exploded.” (ibid). Pre-modified forms in the corpus data will be
examined in the current study, but dealt with separately from the unmodified data.
Because this ambivalence between verbalization and nominalization appears to be
at the heart of the construction, I shall follow Kearns’s classification of the TLV (light
verb = TLV in the rest of the paper) in identifying examples and corpus data. The light
verbs covered are give, have, make and take, which are defined as the verbs of the core
expanded predicate by Algeo (1995: 208), and also, being the commonest, they provide
the most corpus data for regional comparison.
2. Evidence for regional divergence
2.1 Research studies
The literature on light verb constructions is full of bold statements about their regional variability. One is “Examples of constructions with the gerund, which occur with higher frequency
in American English, are ... I gave Gulliver’s Travels a re-reading” (Nickel 1968: 6).
Another is “In these synonymous sets the selection of the light verb is often a matter of
dialect – social or regional” (Live 1973: 33). An extended example is “In particular, the
frequency of use, and hence the importance, of this construction in British or Australian
English is far greater than in American English, which makes much greater use of the
related take a V construction. In Australian English, in particular, the have a V construction constitutes a fundamental part of everyday talk.” (Wierzbicka 1982: 756). Finally: “It
appears that interdialectal variation is greater for the light HAVE construction, which
is most robust in Australian English, common in New Zealand and British English, but
limited in American English” (Kearns 2002). There have been few corpus studies to confirm or deny these claims.
The first research based on corpus material was published by Stein and Quirk
(1991), using a corpus of British novels from the 1980s. It therefore concentrated on the
different semantic areas covered by three different light verbs, and was not able to make
regional comparisons. In a separate paper, Stein (1991) had questioned Wierzbicka’s
1. Observed by Jespersen “Such constructions also offer an easy means of adding some
descriptive trait in the form of an adjunct: we had a delightful bathe, a quiet smoke, etc.”
contention (see above) that the have + V construction was a particularly informal one.
Again, their corpus was purely composed of written material, so not suited to make
comparisons of register – although one wonders what proportion of their light verb
examples came from literary representation of speech.
A more extensive study, comparing British and American data in the Brown and
LOB corpora, was carried out by Algeo (1995, 2006). He looked at the evidence for
regional differences in the uses of the five light verbs do, give, have, make, take. While
the examples he used were not all TLVs, he found no significant differences in frequency of use of four of them. However, “British [English] uses have as the verb of an
expanded predicate nearly twice as often as American does and in about 1.75 times as
many different constructions” (2006: 270). This modifies the accepted belief that AmE
favours take, BrE have – Algeo quotes Quirk et al. saying that “when the eventive object
collates with both have and take, have is typically British option, take the American”
(1995: 211). Algeo’s findings show that “The difference is not that American favours
take but that British favours have” (213).
Both of these previous corpus studies concentrate on British and/or American
written material. There have been no corpus studies of the use of light verbs in AusE
and NZE, or of spoken material. This study will therefore focus on the spoken components of the Australian, New Zealand and British ICE corpora (ICE-AUS, ICE-NZ,
ICE-GB) to examine whether there are regional differences between the varieties and
test Wierzbicka’s claim that the light verb construction is particularly informal.
First, however, to give a wider view of regional variation, let us look at the treatment of a set of light verbs using secondary evidence. Their coverage and labeling
in a range of dictionaries can provide an indication of variation between regions
and registers.
2.2 Dictionary evidence
Light verb constructions present a problem for lexicographers. As Algeo writes
“Because it [the expanded predicate] is not exclusively either grammatical or lexical,
it is likely to be treated inadequately in both grammars and dictionaries” (1995: 204).
A selection of American, British, Australian and New Zealand dictionaries was therefore surveyed to see how far their treatment of light verbs was systematized, and
if regional and register variations were covered by inclusion/exclusion and labeling.
Dictionaries both for native speakers (L1) and second-language learners (L2) were
included in the survey, as L2 dictionaries of comparable size are more likely to cover
spoken idiom.
The following tables show a comparison of the dictionaries’ treatment of the
common light verb constructions take/have a bath, break, holiday/vacation, look,
shower, walk.
Table 2a. Coverage of light verbs in American, British, Australian and New Zealand L1
dictionaries
American (RHD) British (NODE)
Australian (MD) NZ (NZOD)
In definition
holiday (take) –
definition of
vacation as verb
(chiefly US)
shower (take)
In example
sentence
As subheadword
bath (take)
holiday (take)
bath (take)
holiday (take)
– definition of
vacation as verb
shower (take)
vacation/holiday
(take/have) –
definition of
vacation as verb
bath (take)
look (take) bath
(take) – in sense of
“suffer defeat, loss”
(informal) holiday
(take)
bath (take) – in
sense of “suffer
defeat” (colloq.)
look (have) bath
(take) – in sense
of “suffer defeat”
(colloq.)
RHD = Random House Dictionary, 2nd Edition (1987)
NODE = New Oxford Dictionary of English (1998)
MD = Macquarie Dictionary, 4th Edition (2005)
NZOD = New Zealand Oxford Dictionary (2005)
There is quite uneven coverage of the 7 sample light verbs across the four L1
dictionaries. The Australian MD has the widest coverage, covering 5 of them either in
a definition, in an example sentence or as a sub-headword. The American RHD has
the smallest coverage with only 2. There is also a remarkable preference for take as the
light verb across all the regions, with only the Australian dictionary offering have as an
equal alternative with take in take/have a holiday/vacation, and as the chosen example
for have a look. Take a bath, covered in NODE, MD and NZOD, is a particularly interesting example as it shows the light verb construction being used to introduce a new
sense to the simple verb, and it is the only one marked as colloquial/informal.²
In Table 2b, 6 out of 7 of the light verb constructions chosen were found in at
least one of the L2 dictionaries, with only have/take a vacation not covered. LongBr
has the most complete coverage with 6 of the 7, and vacation would be excluded as a US
variant, in line with NODE’s labeling of it as “chiefly US”. But across the regions there
is greater coverage of the TLVs than in the L1 dictionaries. There is also a sign of
2. See Section 3.1.1 for treatment of the innovative colloquial use of light verb constructions.
Table 2b. Coverage of light verbs in American, British, Australian and New Zealand
L2 dictionaries
In definition
In example sentence
American
(LongAm)
British
(LongBr)
bath (take)
break (take)
bath (take)
break (take)
holiday (have)
look (have)
shower (take)
walk (take)
bath (have (Br)/take (Am))
break (have/take)
holiday (have)
look (have/take)
look (take)
shower (take)
walk (take)
As sub-headword
shower (have (Br)/take (Am))
walk (have/take)
AUS
(MALD)
bath (have/take)
bath (have)
break (have)
holiday (take)
look (have)
look (have) – in
sense “see and
pay attention to”
LongAm = Longman Dictionary of American English, 2nd Edition (1997)
LongBr = Longman Dictionary of Contemporary English, 3rd Edition (1995)
MALD = Macquarie Australian Learners Dictionary (1997)
regional preferences shown in LongAm’s consistent choice of take as the light verb, and
LongBr/MALD’s presentation of have, at least as an option. LongBr even labels have as
“Br” and take as “Am” in the case of take/have a bath/shower.
Across the L1 and L2 dictionaries there are varying ways of treating the light verb
construction within the entry. Using it to define the simple verb demonstrates that it
is regarded as exactly synonymous. Demonstration of its use through an example sentence gives no indication as to whether the light verb has been considered in choosing
the example – although choice through frequency would have been an option for the
Longman dictionaries where corpora were used to provide examples. Treatment of
the construction as a distinct sub-headword occurs only in the British and Australian
dictionaries, and only consistently in LongBr. This is actually an unconventional lexicographical approach, as the LongBr sub-headwords are exemplifying usage rather
than drawing attention to a change in sense. The Australian dictionaries, on the other
hand, do distinguish the senses of both have a look and take a bath from the
simple verb senses.
Overall, both sets of dictionaries confirm an American preference for take in
common light verb constructions, while the British and Australian dictionaries at
least present have as an option. Backing up Algeo’s instinct, the uneven coverage
of the construction across the dictionaries tends to imply that they have not been
treated consistently (except in LongBr), although the greater coverage in L2 dictionaries acknowledges the prevalence of the construction in modern English, and the
need to support L2 users in constructing sentences. While the Longman dictionary
quotations would have been based on examples from corpora, it is not possible
to say whether they were chosen to represent proportional tendencies of light
verb choice.
3. Frequency of common light verbs in the ICE corpora
Many verbs go through the process of delexicalization, from those with quite a wide
range of meaning such as put, get, set, to verbs like shed, cast and throw that are more
semantically constrained (see Verde 2003). This study will focus on the main light
verbs, or “core expanded predicates”, as Algeo (1995) categorizes them: give, make, have,
take.³ These, being the most semantically general, have been found to be the most commonly used in light verb constructions, and are therefore likely to produce the most
corpus data for regional and register comparisons.
As explained above, the Kearns classification of the TLV was used to collect a consistent set of data. Searches were made on the present and past forms of the four light
verbs selected, with premodified constructions (“make a great impact”; “have a bit of a
look”) and monotransitive vs. ditransitive uses (“give a call” vs. “give her a call”) noted.
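The kind of search described above can be sketched in Python. This is only an illustration under stated assumptions, not the procedure actually used for the study: the noun list, the pronoun list and the function name are hypothetical, and a pattern match of this kind does not apply Kearns's TLV criteria, so every hit would still need manual checking.

```python
import re

# Present and past forms of the four light verbs examined in the paper.
FORMS = {
    "give": ("give", "gives", "gave"),
    "have": ("have", "has", "had"),
    "make": ("make", "makes", "made"),
    "take": ("take", "takes", "took"),
}
# Hypothetical sample of stem nouns; a real search would need a fuller list.
NOUNS = ("look", "call", "walk", "impact")
# Hypothetical list of pronouns that can fill the indirect object slot.
PRONOUNS = ("me", "you", "him", "her", "it", "us", "them")

FORM_TO_LEMMA = {form: lemma for lemma, forms in FORMS.items() for form in forms}

PATTERN = re.compile(
    r"\b(?P<verb>" + "|".join(FORM_TO_LEMMA) + r")\s+"
    r"(?:(?P<iobj>" + "|".join(PRONOUNS) + r")\s+)?"   # ditransitive use
    r"an?\s+"                                          # indefinite article
    r"(?P<premod>(?:\w+\s+){0,4}?)"                    # optional pre-modifiers
    r"(?P<noun>" + "|".join(NOUNS) + r")\b",
    re.IGNORECASE,
)

def find_light_verbs(text):
    """Return candidate light verb constructions found in text."""
    hits = []
    for match in PATTERN.finditer(text):
        hits.append({
            "lemma": FORM_TO_LEMMA[match.group("verb").lower()],
            "noun": match.group("noun").lower(),
            "premodified": bool(match.group("premod").strip()),
            "ditransitive": match.group("iobj") is not None,
        })
    return hits
```

Applied to the examples quoted above, the sketch would flag “have a bit of a look” as pre-modified and “give her a call” as ditransitive.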
The following results are taken from ICE-GB, ICE-AUS and ICE-NZ, with reference also made to ART, the corpus of Australian radio talkback. The ICE corpora were
chosen because they are compatible in size and construction, and allowed for comparison between spoken and written usage.
3.1 Spoken vs written
As noted above, Wierzbicka’s instinct for have a V is that “the construction is highly
colloquial” (1982: 757). Table 3 below gives an overview of the returns for the spoken
and written components of the ICE corpora:
3. Algeo (1995) also looked at do, but found only 4 tokens in Brown, and none in LOB.
Searches in the ICE corpora discovered a similarly limited range.
Table 3. The light verb construction in ICE-GB, ICE-AUS and ICE-NZ*

| | ICE-GB | % | ICE-AUS | % | ICE-NZ | % |
|---|---|---|---|---|---|---|
| give (spoken) | 2.3 (14) | 40% | 6.5 (39) | 74% | 6.0 (36) | 53% |
| give (written) | 3.5 (14) | 60% | 2.3 (9) | 26% | 5.3 (21) | 47% |
| total | (28) | | (48) | | (57) | |
| have (spoken) | 23 (138) | 70% | 44.0 (264) | 80% | 26.5 (159) | 66% |
| have (written) | 9.8 (39) | 30% | 11.3 (45) | 20% | 13.5 (54) | 34% |
| total | (177) | | (309) | | (213) | |
| make (spoken) | 9.5 (57) | 59% | 6.3 (38) | 51% | 6.3 (38) | 68% |
| make (written) | 6.5 (26) | 41% | 6.0 (24) | 49% | 3.0 (12) | 32% |
| total | (83) | | (62) | | (50) | |
| take (spoken) | 2.0 (12) | 44% | 3.2 (19) | 51% | 2.0 (12) | 40% |
| take (written) | 2.5 (10) | 56% | 3.0 (12) | 49% | 3.0 (12) | 60% |
| total | (22) | | (31) | | (24) | |

*Frequency/100 000 words is given to offset the different size of spoken and written components of the
corpora, with raw figures in brackets. The percentages are based on the normalized frequencies.
This shows almost uniformly a higher incidence of the most common light verb
constructions in spoken than in written English. The exceptions, where the frequencies are very close or equal, are in the comparatively lower frequency items for give
and take in ICE-GB, and for take in ICE-NZ. Given that there are more spoken texts
than written in the ICE corpora (300 spoken, 200 written), the raw figures have been
normalized to show frequency per 100 000 words. This adjustment gives a higher proportional frequency in written texts for the examples above, and makes the difference
minimal between spoken and written for give in ICE-NZ and make/take in ICE-AUS.
It is therefore only have constructions that appear more often in speech across all
three regions, with give more common in spoken AusE, and make more common in
spoken BrE and NZE.
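The normalization behind Table 3 can be illustrated with a short Python sketch. It assumes that each ICE text contains roughly 2,000 words (the standard ICE design, which is not stated in this paper), so that 300 spoken texts amount to about 600,000 words and 200 written texts to about 400,000. The function names are hypothetical.

```python
# Assumed component sizes: about 2,000 words per ICE text.
SPOKEN_WORDS = 300 * 2000   # 300 spoken texts
WRITTEN_WORDS = 200 * 2000  # 200 written texts

def per_100k(raw_count, component_words):
    """Convert a raw count into a frequency per 100,000 words."""
    return raw_count / component_words * 100_000

def spoken_written_shares(raw_spoken, raw_written):
    """Return normalized frequencies and their percentage shares."""
    spoken = per_100k(raw_spoken, SPOKEN_WORDS)
    written = per_100k(raw_written, WRITTEN_WORDS)
    total = spoken + written
    return spoken, written, 100 * spoken / total, 100 * written / total

# Raw counts for have in ICE-GB, taken from Table 3.
spoken, written, pct_spoken, pct_written = spoken_written_shares(138, 39)
print(f"{spoken:.2f} vs {written:.2f} per 100,000 words")
print(f"{pct_spoken:.0f}% spoken, {pct_written:.0f}% written")
```

Because the percentages are computed from the normalized frequencies rather than the raw counts, equal raw counts in the two components (as with give in ICE-GB) still yield a higher written share.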
3.1.1 Informality vs. interpersonal courtesy
Even where there is higher frequency of light verbs in the spoken categories of the
corpora, this does not necessarily indicate a greater degree of informality for the construction over the use of the simple verb. Stein (1991: 26) challenges Wierzbicka’s
characterization of the “informality” of have constructions, suggesting that the construction is better described as a form of “interpersonal courtesy” rather than “reduced
formality”. We can test this assertion on the phrase have a look, which is by far the most
common light verb construction across the corpora examined, and almost exclusively
a spoken usage:
Table 4. Frequency of have a look in ICE-GB, ICE-AUS and ICE-NZ

| | ICE-GB Occurrences | ICE-GB Freq./100 000 | ICE-AUS Occurrences | ICE-AUS Freq./100 000 | ICE-NZ Occurrences | ICE-NZ Freq./100 000 |
|---|---|---|---|---|---|---|
| have a look | 43 | | 116 | | 79 | |
| Spoken | 42 | 7 | 113 | 18.8 | 76 | 12.7 |
| Written | 1 | 0.3 | 3 | 0.8 | 3 | 0.8 |
When we look at individual instances of the phrase, well over half of them are
used as invitations to join the speaker in looking at something, as in:
(1) Let’s just have a look at some of the headlines [ICE-NZ S1B-054:39]
(2) Thank you If Your Worship has a look at the evidence and Your Worship ah I
submit has an easy job in this particular matter [ICE-AUS S2A-066:3]
(3) Let’s have a look at that date [ICE-GB S1A-077:146]
It should be noted that each of these examples comes from a different category of the
corpus (S1A – private dialogue; S1B – public dialogue; S2A – unscripted monologue),
so that each represents a different communicative situation and different degrees of
formality. The speaker is in a different power relationship with their listener: in (1) it
is a teacher instructing students, in (2) a solicitor in court trying to gain the cooperation of the judge, and in (3) it is a personal conversation where there is no difference
in status between the speakers. This suggests that the light verb construction, while
typically prevalent in speech, covers a range of registers and interpersonal functions
within the medium. A breakdown of the spoken categories for have a look confirms
this spread:
Table 5. Occurrences of have a look across ICE spoken categories

| | ICE-GB | ICE-AUS | ICE-NZ |
|---|---|---|---|
| S1A (Dialogue, private) | 21 | 31 | 30 |
| S1B (Dialogue, public) | 14 | 18 | 21 |
| S2A (Monologue, unscripted) | 6 | 55 | 19 |
| S2B (Monologue, scripted) | 1 | 6 | 2 |
Only S2B is under-represented here. As the scripted monologue category, it
might be expected to bear more relation to written language, and to make less
use of the interpersonal resources of the light verb construction.
3.1.2 Hedging and other uses of pre-modification
The have a look examples not only provide interest by way of their distribution across
the spoken categories. Another noticeable feature they demonstrate is the frequent
use of hedging, a characteristic described by Kearns: “the use of one of these TLV
constructions trivializes or minimizes the denoted action or event, which is consistent
with the common use of these constructions as hedging strategies” (p.5). The most
common instance of hedging in the corpora is through the adverbial just as in (1) or
the following example which is additionally hedged by perhaps before and briefly after
the light verb construction:
(4) what i was going to do is just perhaps have a look briefly at
[ICE-NZ S2A-044:37]
Another means of hedging is by pre-modification of the noun complement, as in:
(5) We’ve just had a really quick look probably not very good uhm 〈,〉
at these single cell organisms that live in natural ponds
[ICE-GB S2A-051:110]
In this example the speaker is giving a tutorial, and is therefore using hedging as a
means of mitigating the unequal teacher-student discoursal relationship. Another
example from the same category in ICE-AUS shows the teacher using pre-modification to the opposite effect, to stress a point and give instruction as to what the students
should focus on:
(6) There’s a little more about Lizzie in this book and there’s a a lot more
about other people and one of the the things that you’ll find the about
this book that you it’s the opportunity to take a good long hard look
at what a stallholder actually uh is and looks like
[ICE-AUS S2A-048:115]
The purpose of pre-modification is therefore not clearcut in interpersonal terms. It
also has the effect of emphasizing the nominal character of the complement and distancing it from the verb, dislocating the light verb construction as a discrete grammatical entity (in contrast to example (5) where the adverbial modification enhances
the verbal status of the have a look construction). In this light it would be interesting
to discover whether there are differences between spoken and written English in the
amount of pre-modification used.
An overview of all the pre-modified constructions in the ICE corpora shows a
subtle pattern in the difference in use for spoken and written English:
Table 6. The occurrence of pre-modified against unmodified light verbs in ICE-GB,
ICE-AUS and ICE-NZ*

| | ICE-GB Pre-mod. | ICE-GB Unmod. | ICE-AUS Pre-mod. | ICE-AUS Unmod. | ICE-NZ Pre-mod. | ICE-NZ Unmod. |
|---|---|---|---|---|---|---|
| give (spoken) | 1.2 (7), 50% | 1.2 (7), 50% | 1.0 (6), 15% | 5.5 (33), 85% | 1.0 (6), 17% | 5.0 (30), 83% |
| give (written) | 2.0 (8), 57% | 1.5 (6), 43% | 1.75 (7), 78% | 0.5 (2), 22% | 1.8 (7), 33% | 3.5 (14), 67% |
| have (spoken) | 7.0 (42), 30% | 16.0 (96), 70% | 11.5 (69), 26% | 32.5 (195), 74% | 5.8 (35), 22% | 20.7 (124), 78% |
| have (written) | 5.8 (23), 59% | 4.0 (16), 41% | 6.0 (24), 53% | 5.3 (21), 47% | 8.3 (33), 61% | 5.3 (21), 39% |
| make (spoken) | 4.0 (24), 42% | 5.5 (33), 58% | 3.0 (18), 47% | 3.3 (20), 53% | 2.8 (17), 45% | 3.5 (21), 55% |
| make (written) | 2.8 (11), 42% | 3.75 (15), 58% | 3.3 (13), 54% | 2.8 (11), 46% | 1.5 (6), 50% | 1.5 (6), 50% |
| take (spoken) | 0.7 (4), 33% | 1.3 (8), 67% | 0.8 (5), 26% | 2.3 (14), 74% | 0.8 (5), 42% | 1.2 (7), 58% |
| take (written) | 1.5 (6), 60% | 1.0 (4), 40% | 1.8 (7), 58% | 1.3 (5), 42% | 0.8 (3), 25% | 2.3 (9), 75% |

*Frequency per 100 000 words is given to offset the different sizes of the spoken
and written corpora, with raw figures in brackets. The percentages are based on the normalized frequencies.
These figures show that in general there are proportionally more pre-modified
light verb constructions than unmodified ones in written texts, and fewer of them in
spoken. Even when pre-modified forms are not the majority in written, as in give for
ICE-NZ, the proportion still goes up from 17% to 33%. The only examples to contradict this trend are take in ICE-NZ, where pre-modification is lower in written texts
(although there are very few occurrences), and make in ICE-GB, where there are equal
proportions of pre-modified and unmodified in spoken and written.
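The proportions in Table 6 can be checked with a minimal Python sketch. Since the pre-modified and unmodified counts in each row come from the same corpus component, the share is the same whether it is computed from raw or normalized figures. The function name is hypothetical.

```python
def premod_share(premodified, unmodified):
    """Return the percentage of tokens that are pre-modified."""
    total = premodified + unmodified
    if total == 0:
        raise ValueError("no tokens to compare")
    return 100 * premodified / total

# Raw counts for give in ICE-NZ, taken from Table 6.
spoken = premod_share(6, 30)
written = premod_share(7, 14)
print(f"spoken {spoken:.0f}%, written {written:.0f}%")
# prints "spoken 17%, written 33%"
```

This reproduces the rise from 17% to 33% noted above for give in ICE-NZ.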
The suggestion that pre-modification might be more prevalent in written texts
is only partially supported by Stein and Quirk (1991) where their corpus of British
fiction from the 1980s shows 3 times more modified than unmodified examples with
give, but only 1:2 for take/have. They suggest that this skewed distribution is because
“Give tends to be used with realized experience which is thus more prone to invite
detailed description and evaluation.” (201), and give a clearly novelistic example from
John Fowles: “She…gave him a still faintly doubting smile back” (ibid.). Of the ICE corpora, ICE-AUS shows a similar preference for pre-modified give in written texts, but the
creative writing category (W2F) doesn’t significantly swell the number of examples.
3.1.3 Colloquial productivity
The number of light verb constructions in spoken data, whether they are indicators of
register or serve an interpersonal function, is just one sign that the construction is a
feature of spoken language. Another is its capacity to produce creative extensions of
conventional phrases or words. There is evidence in the ICE corpora of this happening
in two ways:
a. Introduction of noun complements in a specialized sense within the light
verb construction
The clearest example of this in the ICE data is the proliferation of alternative nouns
based on the conventional collocation have a/give (it) a try. In ICE-AUS, the synonymous variations are have a bash, a crack, a shot, and in ART (radio talkback) there is
also have a lash and give (it) a go. ICE-NZ has have a crack, lash, give (it) a go and take
a stab, while ICE-GB has make a stab. All of these are marked as “colloquial” in the
Macquarie Dictionary, and the Oxford English Dictionary (OED), and have the same
sense “to try” or “make an attempt”. Along the same lines are take a nosey (ICE-GB),
have a shoofty (ART) – although the former could be said to be adding shades of
meaning to the basic have/take a look. The phrase take/have a shufti is recorded in
the OED as a now rare piece of military slang and is labeled as British slang in the
Macquarie Dictionary (4th edition) with the spelling shoofty. Take a nosey, however,
appears to be a nonce usage, highlighting the potential of the light verb construction
to creatively expand usage. This use of the light verb construction to specify a variant
sense for a familiar word shows what productive power is inherent within it, in the
spoken idiom at least.
b. Use of verb forms as noun complements
There are several examples of nouns within light verb constructions that hardly appear
elsewhere as nouns, and tend to be categorized as highly informal or nonstandard.
Huddleston and Pullum claim that they are especially found in AusE, and cite some
examples: Can I have a borrow/lend of your pen/a carry of the baby? (2002: 296).4
The OED at least supports the regional labeling of lend in this sense, describing
it as “Sc. and north. dial.” as well as “Austral. and N.Z. colloq.”, while the Macquarie
Dictionary labels it “non-standard”. These particular examples do not appear in the
corpora, but there are four instances of have a lend in the sense of “to tease someone”
in ART, and both ICE-AUS and ICE-NZ provide examples of have a feed, where the
verb form feed is used to mean “meal”.
4. Add to that the Kearns example cited above: Can I have a /ju:z/ (*/ju:s/) of your pen?
Light verbs in Australian, New Zealand and British English 
All of the examples above, apart from take a nosey, appear in the spoken sections of
the corpora, and take a nosey appears in a sample of reported speech from a novel, so it
is at least imitating a spoken register. Of the others, virtually all appear in category S1A –
private discussion, where we would expect the most informal speech. ICE-NZ appears to
vary from the other ICE corpora in that it gives us have a lash in S1B (public dialogue),
and have a stab, have a crack and give it a go in S2A (unscripted monologue). These appear
less anomalous when we see that the S1B example is from an interview with a man talking
about his criminal past, and all the S2A examples are from sports commentary where we
might expect a more informal register than in other broadcast speech – especially as they
often involve dialogic exchanges despite their classification as “monologue”.
The comparative lack of BrE examples suggests that the light verb construction might
be more productive in both AusE and NZE than it is in BrE, both for extending the
sense of a noun in spoken idiom, and for further breaking down the grammatical integrity
of its elements by employing a verb form as the noun complement.5 From these particular
instances, we shall move on to look at more general evidence of regional variation.
3.2 Regional and temporal differences
3.2.1 Corpus comparisons
Light verb constructions are often considered to vary regionally (see Section 2), with the
American preference for take over the British preference for have as the light verb the most widely
cited evidence of this variation. Algeo’s survey of the American Brown written corpus,
and its British equivalent, LOB, showed the spread of light verb choice as follows:
Table 7a. Light verbs in the LOB and Brown corpora (adapted from Algeo 1995)
Summary of tokens/types
         give     have      make     take
LOB      40/29    100/61    67/39    38/20
Brown    40/30    55/35     59/44    41/20
5. Note that have a lend is also labeled as a Scottish and northern dialect usage.
These figures, Algeo argues, do show regional variation, but it is not so much a matter of
US English preferring take as of British English favouring have. A direct comparison cannot be made with the ICE corpus findings for several reasons: LOB and Brown consist
exclusively of written material; the texts are 30 years older than ICE, dating from the early
1960s; and Algeo used a much broader classification of what constitutes a light verb.
However, an overview of the comparative frequency of the same light verbs in the ICE
corpora offers an interesting contrast:
Table 7b. Light verbs in ICE-GB, ICE-AUS and ICE-NZ
Summary of tokens/types
           give     have      make     take
ICE-GB     28/23    177/48    83/41    22/13
ICE-AUS    48/29    309/64    62/37    31/17
ICE-NZ     57/26    213/46    50/32    24/12
The most striking difference between these figures taken from 1990s data, as
opposed to the earlier LOB/Brown corpora, is the higher ratio of tokens to types for
have in the ICE corpora – reaching ratios of nearly 5:1 in ICE-AUS and ICE-NZ. This
figure can be partially explained by the exceptionally high frequency of have a look
(see above, Table 4). If we remove all these instances, the adjusted numbers of tokens
are: ICE-GB, 134; ICE-AUS, 193; ICE-NZ, 134. This now gives a consistent token/type
ratio of about 3:1 across the regions, which is still higher than for any other light verb
in ICE, or in LOB and Brown. There are several noun complements for have other
than look that have multiple instances in the ICE corpora, such as chat, effect, holiday,
impact, which account for this high ratio. It appears that have, as well as being the
most frequent light verb across the ICE regions, also generates more tokens of constructions using the same noun complement – thus reinforcing the choice of have as
the light verb.
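The ratios discussed here can be checked with a small sketch. The token and type counts for have are copied from Table 7b and the adjusted token counts from the text; the assumption that removing have a look reduces the type count by exactly one is mine, as are the variable and function names.

```python
# (tokens, types) for the light verb "have", copied from Table 7b.
HAVE = {"ICE-GB": (177, 48), "ICE-AUS": (309, 64), "ICE-NZ": (213, 46)}

# Token counts after removing "have a look", as quoted in the text.
ADJUSTED_TOKENS = {"ICE-GB": 134, "ICE-AUS": 193, "ICE-NZ": 134}


def tokens_per_type(tokens, types):
    """Average number of tokens for each distinct noun complement (type)."""
    return tokens / types


for corpus, (tokens, types) in HAVE.items():
    raw = tokens_per_type(tokens, types)
    # Assumption: dropping "look" removes exactly one type from the count.
    adjusted = tokens_per_type(ADJUSTED_TOKENS[corpus], types - 1)
    print(f"{corpus}: {raw:.1f} tokens per type, "
          f"{adjusted:.1f} without 'have a look'")
```

On these figures the unadjusted ratios for ICE-AUS and ICE-NZ fall a little under 5:1, and the adjusted ratios cluster around 3:1 in all three corpora.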
The other light verbs do not have a significantly higher incidence in the ICE corpora
than in LOB and Brown, and in fact take produces lower figures across the regions.
Algeo’s broader categorization of the construction would tend to inflate the number
of tokens/types, but this research still supports his finding that have is the most common
light verb outside the US. ICE-GB shows the same order of preference as LOB, as does
ICE-AUS (have, make, give, take), while ICE-NZ has give higher than make.
The data suggests that there has been an increase in the use of have in light verb
constructions in BrE over the 30 years between the compilation of the LOB and
ICE-GB corpora, while the other light verbs show little sign of change, except in the
case of give which appears to be less frequent. The even greater frequency of have in
New Zealand, and particularly Australia, shows the southern hemisphere varieties to be
ahead in this trend.
3.2.2 Regional choices of have or take
It should follow, therefore, that in instances where there is a choice between the use
of have or take with a particular noun complement, have will be the usual choice. The
following table shows the occurrences in ICE of the light verbs checked for regional
labeling in dictionaries in Tables 2a/2b above.
Table 8. Choice of have/take in the ICE corpora
                  ICE-GB   ICE-AUS   ICE-NZ   Total
(have) bath       0        2         1        3
(take) bath       0        1         0        1
(have) break      5        4         3        12
(take) break      2        4         5        11
(have) holiday    10       5         5        20
(take) holiday    1        1         0        2
(have) look       43       119       79       241
(take) look       6        3         2        11
(have) shower     1        4         3        8
(take) shower     0        1         0        1
(have) walk       3        1         1        5
(take) walk       0        1         0        1
Again the preference for have in all regions is quite consistent, although the frequency is low in some cases. The only exception is with break, where have is preferred
in ICE-GB, but is equal with take in ICE-AUS and lower in ICE-NZ. ART takes this
trend even further by giving 15 examples of take a break but none with have. This
might be an instance of the choice of verb changing the sense of the phrase – all the
ART instances are to announce a pause for advertising, rather than being used in the
more general sense of resting. It’s also possible that this is a conventional formula borrowed from American talkshows, where take might be the more standard choice of
verb. As Tables 2a/b above showed, this was one where no clear regional pattern was
evident in dictionaries, and the corpus data backs up this lack of clear distinction.
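The preference for have in Table 8 can be expressed as a share of all have/take tokens for each noun. The following is a minimal sketch using the counts copied from Table 8; the data structure and function names are hypothetical and serve only to make the comparison explicit.

```python
# (have, take) counts per corpus, copied from Table 8.
TABLE_8 = {
    "bath":    {"ICE-GB": (0, 0),  "ICE-AUS": (2, 1),   "ICE-NZ": (1, 0)},
    "break":   {"ICE-GB": (5, 2),  "ICE-AUS": (4, 4),   "ICE-NZ": (3, 5)},
    "holiday": {"ICE-GB": (10, 1), "ICE-AUS": (5, 1),   "ICE-NZ": (5, 0)},
    "look":    {"ICE-GB": (43, 6), "ICE-AUS": (119, 3), "ICE-NZ": (79, 2)},
    "shower":  {"ICE-GB": (1, 0),  "ICE-AUS": (4, 1),   "ICE-NZ": (3, 0)},
    "walk":    {"ICE-GB": (3, 0),  "ICE-AUS": (1, 1),   "ICE-NZ": (1, 0)},
}


def have_share(have, take):
    """Share of 'have' among have/take tokens, or None if there are none."""
    total = have + take
    return have / total if total else None


def totals(noun):
    """Sum the have and take counts for a noun across the three corpora."""
    have = sum(h for h, _ in TABLE_8[noun].values())
    take = sum(t for _, t in TABLE_8[noun].values())
    return have, take


for noun in TABLE_8:
    have, take = totals(noun)
    print(f"{noun}: have {have}, take {take}, "
          f"have share {have_share(have, take):.0%}")
```

The summed counts reproduce the Total column of Table 8, and break is the only noun for which the share of have drops to or below one half in any corpus (ICE-AUS and ICE-NZ). With such low counts for most nouns, the shares should be read as indicative only.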
4. Conclusions
The corpus evidence confirms the preference for have as a light verb in BrE, AusE and
NZE that is indicated by its coverage in L2 dictionaries. In fact it gives reason to think
that have may be even more prevalent than is suggested by the L2 dictionaries when
they offer take as an equal alternative. The corpora show the L1 dictionaries’ tendency
to illustrate take as the light verb in all varieties to be somewhat misrepresentative. On
the other hand, the L1 dictionaries do give an indication of the colloquial productivity
of light verb constructions by recording variants on the simple verb use, as in the case
of take a bath. It is possible that a more exhaustive study of the coverage of light verbs
across these dictionaries would give an alternative picture of the use of light verbs in
the different varieties, although the uneven coverage of the set of constructions chosen
suggests that this is unlikely.
Much more frequent use of light verbs in spoken than in written English was
confirmed by the ICE corpora for Australia, New Zealand and Britain. However, this
was not necessarily found to be an indicator of informality, but more of its usefulness
for a range of interpersonal strategies. This appeared to be a particularly rich field
of investigation, which warrants further research. Evidence for colloquial usage was
found in the use of the light verb that extended the sense of the base verb, as in have a
bash/lend/feed, make a stab, with AusE and NZE appearing to be more innovative than
BrE in this respect. Results were not definitive as to whether pre-modified light verb
constructions are more typical of written than spoken texts.
Comparison of ICE-GB with Algeo’s analysis of LOB data indicated that the use
of have in light verb constructions had increased over the 30-year period between the
corpora, while the use of other light verbs showed little or no movement. AusE and NZE
again appear to be leading the way in the expanding use of light have constructions.
References
Algeo, John. 1995. “Having a look at the expanded predicate”. In Bas Aarts & Charles F. Meyer
(eds), The Verb in Contemporary English: Theory and Description. Cambridge: Cambridge
University Press, 203–17.
Algeo, John. 2006. British or American English? A Handbook of Word and Grammar Patterns.
Cambridge: Cambridge University Press.
Allerton, David. 2002. Stretched Verb Constructions in English. London: Routledge.
Cattell, Ray. 1984. Composite Predicates in English [Syntax and Semantics 17]. New York NY:
Academic Press.
Curme, George. 1935. A Grammar of the English Language. Boston: D.C. Heath & Co.
Huddleston, Rodney & Geoffrey Pullum. 2002. The Cambridge Grammar of the English Language.
Cambridge: Cambridge University Press.
Jespersen, Otto. 1931. A Modern English Grammar, vol. 6. Heidelberg: Carl Winter.
Kearns, Kate. 2002. “Light verbs in English”. Unpublished Generals Paper, MIT.
Live, Anna H. 1973. “The take-have phrasal in English”. Linguistics 95: 31–50.
Longman Dictionary of American English, 2nd edn. 1997. New York: Longman.
Longman Dictionary of Contemporary English, 3rd edn. 1995. London: Longman.
Macquarie Australian Learners Dictionary. 1997. Sydney: Macquarie Library.
Macquarie Dictionary, 4th edn. 2005. Sydney: Macquarie Library.
New Oxford Dictionary of English. 1998. Oxford: Oxford University Press.
New Zealand Oxford Dictionary. 2005. Auckland: Oxford University Press.
Nickel, Gerhard. 1968. “Complex verbal structures in English”. International Review of Applied
Linguistics 6: 1–21.
Oxford English Dictionary, 2nd edn. 1989. 20 vols. Oxford: Clarendon Press.
Poutsma, Hendrik. 1926. A Grammar of Late Modern English. Groningen: P. Noordhoff.
Random House Dictionary of the English Language, 2nd edn. 1987. New York NY: Random House.
Rensky, Miroslav. 1964. “Nominal tendencies in English”. Philologica Pragensia 7: 135–50.
Stein, Gabriele. 1991. “The phrasal verb type ‘to have a look’ in Modern English”. International
Review of Applied Linguistics 29: 1–29.
Stein, Gabriele & Randolph Quirk. 1991. “On having a look in a corpus”. In Karin Aijmer &
Bengt Altenberg (eds), English Corpus Linguistics: Studies in Honour of Jan Svartvik. London:
Longman, 197–203.
Verde, Maria. 2003. “Shedding light on SHED, CAST and THROW as nodes of extended lexical
units”. In Dawn Archer, Paul Rayson, Andrew Wilson & Tony McEnery (eds), Proceedings
of the Corpus Linguistics 2003 conference. <http://ucrel.lancs.ac.uk/publications/CL2003/
papers/verde.pdf> (5 May 2008).
Wierzbicka, Anna. 1982. “Why can you have a drink when you can’t *have an eat?”. Language 58:
753–99.
Section III
Nouns and noun phrases
Non-numerical quantifiers*
Adam Smith
Macquarie University
This paper looks at non-numerical quantifiers (NNQs), such as a lot of,
loads of. The set of quantifiers to be discussed is first identified in relation
to their description in major English grammars. Issues of variable noun
complementation and verb agreement with the NNQ are identified as being
of interest, along with the choice of quantifier and its collocations in different
regions (Australian, New Zealand and British English) and registers. Corpus
findings for a lot/lots of are compared with other NNQs where the quantifying
noun can be singular or plural (ONNQs), indicating a level of delexicalization/
grammaticization. Some regional variation was found, with the ONNQ loads of
much more frequent in British English, and heaps of more frequent in Australian
and New Zealand English.
1. Introduction
The NNQ is a class of noun phrase that is rarely discussed – either for its grammatical
properties or as an indicator of variation in English. This is surprising as phrases such
as a lot/lots of, a heap/heaps of provide points of relevant interest on both fronts. Grammatically they raise problems of agreement with their associated verb (“heaps of food
was/were on the table”) and issues in relation to their noun complement (compare “a
lot of money/mistakes”). The choice of singular or plural agreement provides a basis
for regional comparison for collective nouns (see e.g. Hundt, Section IV; Levin 1998),
and might well do so for NNQs also.
Another influence on number agreement in NNQs is what Reid calls “semantic
weight” (1991: 269). The most straightforward use of numerical quantifiers will give
regular agreement with the verb and the noun complement (“six dogs were barking”;
“half a day was wasted”), whereas NNQs often contain extra semantic content that
brings into question whether they are acting simply as a quantifier or have a descriptive
purpose (see the heaps of example above). To assess the degree to which particular NNQs
are functioning as quantifiers or descriptors, it will be instructive to look at their
agreement and collocational patterns across a range of corpora. In the process, it is
hoped that regional differences in the choice of quantifier can be linked to the extent
of their delexicalization.
*With thanks to Yasmin Funk (Macquarie University), for her work on helping to identify
non-numerical quantifiers through corpus searches, and to Pam Peters for all her invaluable
insights and assistance in the development of this paper.
Dictionary labeling can provide an indication of regional divergence in NNQs. A
bunch of, for example, is sometimes marked as AmE, while loads of is marked as more typically
BrE. Dictionaries and grammars also note that many NNQs are more common in
spoken than in written English, suggesting that a study of the different registers in
which particular NNQs are used would be interesting.
Corpus research allows us to quantify the grammatical regularity of NNQs, and
assess their regional, register and collocational divergence. This paper will use data
from ICE-AUS, ICE-NZ and ICE-GB to look for patterns in the use of NNQs.
2. Classification of NNQs
The standard grammars do not provide a standard approach to classifying NNQs.
Their divergent labeling and grouping of categories within the grammatical class
problematizes the selection of a coherent group, which is necessary for a systematic
corpus evaluation. I shall therefore look at the treatment of these quantifiers in three
recent grammars, to give a definition of the term non-numerical quantifier for use in
this study, and provide criteria for the selection of corpus searches.
The reference grammars selected are Quirk et al. (1985), Biber et al. (1999) and
Huddleston and Pullum (2002). Both Quirk et al., and Huddleston and Pullum
treat the kind of NNQs we have been looking at as a set, describing the quantity
noun + of construction as “open-class quantifiers” and “non-count quantificational
nouns” respectively.1 Some of these take noncount nouns, some take plural count
nouns, and some can take either. Biber et al. separate quantifying nouns into different categories according to whether they take plural count nouns (quantifying
collectives), or noncount nouns (unit nouns). A summary of the grammars’ classifications is provided below:
– A Comprehensive Grammar of the English Language (Quirk et al. 1985)
   – closed-class quantifiers (single-word postdeterminers, sometimes preceded by central determiners e.g. “a”, “these”, “that”): much/a little – only with noncount nouns; many/(a) few/several – only with plural count nouns
   – open-class quantifiers (phrasal quantifiers consisting of quantity noun + of, often preceded by indefinite article): deal/amount of – only with noncount nouns; number of – only with plural count nouns; plenty/a lot/lots of – with noncount or plural count nouns
– Longman Grammar of Spoken and Written English (Biber et al. 1999)
   – quantifying collectives (e.g. bunch of/group of/set of) – with countables
   – unit nouns (e.g. bit of/piece of/slice of) – with uncountables
   – quantifying nouns (e.g. barrel of/heap(s) of/pint of/dozens of/load(s) of/armful of/pair of) – with countables and uncountables
– The Cambridge Grammar of the English Language (Huddleston & Pullum 2002)
   Non-count quantificational nouns treated as a set with a noncount noun as head with of PP as complement. These have different patterns of complementation:
   – number of noun complement (“oblique”) controls number of whole NP e.g. “a lot of work was done”/“a lot of errors were made” (number transparent)
   – singular quantifying noun with singular oblique (“a great deal of work was done”)
   – plural quantifying noun with plural oblique (“dozens of errors were made”)
1. The labeling of such nouns as “non-count” appears problematic when there are singular
and plural alternatives in NNQs such as a lot of/lots of; a heap of/heaps of. The label is presumably being used to distinguish the quantitative use from the descriptive use where the sense of
individual units is emphasised: two lots of paté (ICE-GB), several heaps of leaves.
Despite the difference in terminology, it is clear that each of the grammars classifies
NNQs according to the same principle – whether they regularly take a singular or
plural noun, or are variable. In this paper I will be focusing on the variable NNQs, using
Huddleston and Pullum’s phrase “number transparent” to denote those that can take
either a singular or plural noun complement (an “oblique” in their terms) – although
whether it is the number of the noun complement or other factors
that control the number of the associated verb will be examined.
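The three complementation patterns summarised from Huddleston and Pullum can be made concrete in a short sketch. It encodes only the grammar's description, using the three quantifiers quoted in the summary as an illustrative lexicon; the class, dictionary and function names are hypothetical, and the sketch does not model the other factors that may influence agreement in actual usage.

```python
from enum import Enum


class Pattern(Enum):
    TRANSPARENT = "number transparent"  # oblique controls number of the NP
    SINGULAR_ONLY = "singular oblique"  # e.g. "a great deal of work"
    PLURAL_ONLY = "plural oblique"      # e.g. "dozens of errors"


# Illustrative lexicon limited to the examples quoted in the summary.
LEXICON = {
    "a lot of": Pattern.TRANSPARENT,
    "a great deal of": Pattern.SINGULAR_ONLY,
    "dozens of": Pattern.PLURAL_ONLY,
}


def expected_np_number(quantifier, oblique_number):
    """Return 'singular' or 'plural' for the whole NP under the described
    pattern, or None if the oblique does not fit the quantifier's pattern."""
    pattern = LEXICON[quantifier]
    if pattern is Pattern.TRANSPARENT:
        return oblique_number
    required = "singular" if pattern is Pattern.SINGULAR_ONLY else "plural"
    return required if oblique_number == required else None


print(expected_np_number("a lot of", "singular"))  # a lot of work was done
print(expected_np_number("a lot of", "plural"))    # a lot of errors were made
```

Corpus cases that depart from what this function returns are the ones of interest for the question of what controls verbal agreement.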
In order to ascertain a set of variable NNQs, corpus searches were carried out on a
broad range of quantity noun + of constructions (see Appendix 1 for listing), excluding
only items such as numerals that consistently take plural complements, or unit nouns
(or partitives), which consistently take noncount complements. The partitive set is not
a neatly circumscribed one, since quantifying nouns that are not unit nouns can be
used partitively (a lot of the cake was eaten), and some unit nouns can function more
widely than simply to “split up an undifferentiated mass” (Biber et al. 1999: 250). A
bit of, for example, can sometimes express a large quantity of something, especially
when qualified as in quite a bit of time/money, and can even be used as an intensifier
with an adjective, as in she’s a bit of alright. Channell (1994: 99) labels her vague quantifiers
“pseudopartitives”, in order to compare their behaviour with standard partitive
nouns. This study will exclude standard partitive nouns and uses, to concentrate on
expressions of quantity that are ambivalent as to number.
3. Issues that apply to NNQs
3.1 Grammar: Verbal agreement and noun complementation
Number ambivalence is not restricted to NNQs. As Huddleston and Pullum write
“Two of the most common overrides of the simple agreement rule are found with
singular collective nouns and with the number-transparent quantification noun
construction” (2002: 501). Reid (1991: 261) gives examples such as “Seventy years
of Marxist doctrine now seems headed for the dustheap” to show how numerical
quantification can present the same kind of mismatches as occur with collective
nouns between the number of the noun and its associated verb. In an international
survey on agreement, the sentence: Six days of rain was/were not what we expected in
the sunshine state provided the “least decisive result”, with 52% of the respondents opting
for singular (notional) agreement, and 48% for plural (formal) agreement (Peters
1999: 6). The question hinges on whether the quantitative element is seen as a single
span of time or as several successive days. A similar distinction could be made in the
sentence Heaps of food was/were on the table. Are we to imagine simply a lot of food,
or several distinguishable piles of food – at which point heaps is no longer a general
quantifier, but a specific descriptor? If we take heaps, with all its lexical baggage, to be
the head of the noun phrase, then plural verbal agreement is natural, whereas if the
sequence heaps of is acting as a complex determiner, the singularity of food dictates
the verbal agreement.
Heap can also be used as a quantifier in the singular form (like lot/lots, load/loads
etc.), as in: A heap of watches was/were stolen. Here the descriptive possibilities of heap
are less likely to come into play – delicate items like watches are not normally put in heaps.
But a choice still has to be made over whether the verb agrees with the singular quantifier heap, or is affected by the proximity of the plural watches.
The different interpretations available in the use of heap suggest that there is
a process of delexicalization for some NNQs. From their primary lexical role as
a descriptor, they can become a kind of quantifying collective, thence to number
transparency where the descriptive and quantificational functions assume varying
significance according to context, to the final stage as a complex determiner such as
a lot of, where the lexical content of the quantificational noun has no bearing on the
number of the noun complement. Corpus data will help us trace patterns for NNQs
to see if this process can be observed.
3.2 Collocation
The number of the noun complement was one of the main factors used by Biber et al.
in their classification of “quantifying nouns” at large. These were classed either as
“quantifying collectives” (see summary of classifications in Section 2 above), the most
common of which were bunch of and group of, which were consistently used with plural nouns, or as unit nouns (e.g. bit of, piece of), which were consistently used with
singular nouns. Using data from the Longman Corpus of Spoken and Written English,
they found many of the quantifiers they looked at to be associated with particular entities. Typical collocations included bunch of flowers, group of friends, bit of fun, piece of
cake. Yet some of these would behave less predictably as to the number of their noun
complement when used as more general quantifiers (e.g. “a pack of nonsense”). Other
categories of quantifying nouns, such as those denoting shape, were more flexible both
in the number of their complement (a heap of leaves/rubble; a pile of bricks/wood)
and their range of collocation. Such NNQs will repay investigation to see whether
productivity of collocation correlates with delexicalization where the emphasis is on
quantification rather than description.
3.3 Semantic weight
The question of the semantics of individual quantifiers affecting their number agreement is discussed by Reid (1991). In dealing with a category that includes some of
our NNQs, where a singular quantifier + of takes a plural complement (a row of
horns, an increasing number of men), he suggests a scale of semantic categories with
varying semantic weight that dictates whether the verbal agreement goes with the
singular entity (in our terms the quantifier), or the plural complement. At the “light”
end of the scale are “decimals and fractions with one as numerator” (e.g. one fifth
of Australian men are…), while at the heavy end are “semantically specific words
that do not imply a referential plurality (e.g. presence, process, retrieval, weight…)”
(1991: 270).2 In the middle of this scale, and therefore most unpredictable as to
agreement, come the categories of “imprecise aggregate: group, handful, host, spate”
and “precise aggregate: team, band…number, series, sequence” – many of which fall
under the current classification of NNQ.
2. For example “Sheer weight of numbers gives us an advantage”.
3.4 Variation
3.4.1 Regional divergence
While we might expect to find regional difference in terms of verbal agreement,
following the example of collective nouns, there is also some dictionary evidence of
regional divergence in the choice of NNQ. The Longman Dictionary of Contemporary
English, based on data from the Longman Corpus, labels a bunch of and a raft of as
“esp. AmE”, while loads of and a stack/stacks of are labeled “esp. BrE”. The fact that these
phrases are also labeled as either colloquial or informal suggests that it is at the less
standard end of the spectrum that these regional differences occur. By using spoken
corpora we will be able to investigate the range of these idioms across regions, and
whether AusE or NZE show higher frequencies of forms that are regionally marked as
British or American.
3.4.2 Register
The question of register also features heavily in the discussion of NNQs. Quirk et al.
describe what they call “open-class quantifiers” as being “chiefly used informally”
(1985: 264), and Channell writes that the “starting point” for her study of this class of
“vague language” was Crystal and Davy’s observation of the frequency of such phrases
in conversation data (1994: 95). It might be expected that as a symptom of vagueness,
non-numerical quantification belongs more naturally in speech than in writing. But
Channell herself showed in an earlier study (1990) that some academics have quite specific uses in their writing for apparently vague expressions such as a number of. Corpus
data can quantify their relative frequency of usage in spoken and written texts, and also
help to establish the range and regional spread of phrases considered to be at the more
colloquial end of the spectrum, such as oodles of, heaps of, loads of.
4. Previous corpus studies
There is very little by way of previous corpus studies of NNQs. Kennedy (1987) looked
at the variety of types of quantification in English from a language learner’s perspective, using a small corpus of journalistic and academic written English to identify
subcategories of quantification, and then compared the relative frequencies of these
subcategories in the scholarly texts of the Brown and LOB corpora. These categories
were titled “specific”, “non-specific” and “relative” quantities/degrees. The NNQ construction, as classified here, came in under several of the “non-specific” subcategories,
such as “small quantities/degrees”/“large quantities/degrees”/“non-specific parts of a
whole”, so it is not possible to make direct comparisons with any of the figures given by
Kennedy for the overall categories, and individual tokens were only recorded for the
“approximation” category. Although American (Brown) and British (LOB) corpora
were used, no regional inferences were drawn.
A study more focused on NNQs was conducted by Channell (1994), as part of
her investigation into different types of vague language. Spoken and written corpora,
including the Oxford Corpus of the English Language and the Birmingham collection
of English Texts, were used to assess the frequency of certain NNQs such as bags of,
a load/loads of, a lot/lots of, oodles of, a bit of in spoken and written English, and to analyze whether they collocated more frequently with countable or uncountable nouns,
where there was possible variation. All the corpora used were of BrE, and therefore
no regional distinctions could be made. While the quantitative data presented by
Channell is limited, her study gives a useful starting point to the discussion of NNQs.
Continuing with the theme of vague language, Drave (2002) used a corpus of conversation between native speakers of English and Cantonese to investigate the range
and function of vague language in intercultural conversations. The only NNQ he studied was a lot of, the most frequent in the corpus. Drave focused on the collocational
patterning around a set of examples of vague language, such as about, stuff, thing, to
determine if there were any functional differences between the use of vague language
by this set of native and nonnative English speakers.
In this study we use data from the spoken and written components of ICE-AUS,
ICE-NZ and ICE-GB to investigate regional and register variation in the class of nonpartitive quantifying constructions consisting of a quantifying noun followed by of (as
defined above, Section 3). The areas considered will be:
– verbal agreement and noun complementation in the cases of common NNQs that can be either singular or plural in form (such as heap(s), load(s), lot(s))
– fixed and variable collocation with a set of quantifying nouns (bunch, heap, load)
– the regional and register distribution of a selection of lower frequency NNQs
5. Corpus findings
5.1 A lot/lots of
Both Channell and Drave state that a lot of/lots of is particularly frequent in their
corpus material, and a lot of was found to be the most frequent NNQ in the ICE
corpora; it is one of the quantifying nouns which may be singular or plural, and it
can take either a singular or plural noun complement (i.e. is number transparent).
Quirk et al. (1985: 262) note that it is chiefly used informally, and is therefore of
interest with regard to register. This indication of plentiful data and variability makes
it a useful point to start from in a comparative corpus analysis.
5.1.1 Regional and register features
Channell found that there was a distinct register difference in the use of lots of
(confirming Quirk et al.’s observation about its informality). It was much more frequent
in the Cobuild spoken corpus than in the written corpus, where its use was restricted
to direct/reported speech or personal narrative (1994: 102). The overall frequency of
a lot/lots of in each of the ICE corpora, along with the comparative figures between the
spoken and written components, are shown in Table 1:
Table 1. Frequency of a lot/lots of in ICE-AUS, ICE-NZ, ICE-GB
                 ICE-AUS                   ICE-NZ                    ICE-GB
                 spoken  written  total    spoken  written  total    spoken  written  total
(a) lot of       355     45       400      353     71       424      241     22       263
lots of          59      17       76       67      31       98       78      26       104
The difference in frequencies of both a lot of and lots of between the spoken
and written data is marked, with this NNQ notably more common in spoken than
in written English across the regions. Both ICE-AUS and ICE-NZ show particularly
high frequencies for a lot of, with ICE-GB showing a lower frequency, but higher for
lots of than either ICE-AUS or ICE-NZ.
5.1.2 Grammatical features
Channell noted that a lot of collocated with countables and uncountables (1994: 106),
but did not remark on any difference in frequency. There was a strong preference
for collocating lots of with countable nouns (lots of things/children), while noncount
complements were equally possible (lots of money) though less represented in her corpora. She made no comment on verbal agreement for either quantifier.
In this study, complements have been classified either as singular or plural, to concentrate on the grammatical form rather than semantic distinctions that may arise from
the choice of count or noncount uses of the noun, as in “lots of time”/“lots of times”.
Table 2 shows how a lot of and lots of collocated with singular and plural complements,
as well as showing the verbal agreement with the NNQ phrase, where it was the subject
of the verb and the number of the verb was evident. The complement people is treated
separately as it is singular in form, although usually plural in meaning (it is treated as a
plural complement by Channell).
Table 2. Noun complementation and verbal agreement for a lot/lots of in ICE-AUS, ICE-NZ, ICE-GB
                ICE-AUS               ICE-NZ                ICE-GB
                CP   CS   VP   VS     CP   CS   VP   VS     CP   CS   VP   VS
(a) lot of      152  180  38   38     141  221  25   38     77   149  22   47
  (people)      68   -    31   9      62   -    14   3      37   -    19   4
  total         220  180  69   47     203  221  39   41     114  149  41   51
lots of         40   27   10   3      54   28   5    7      48   44   17   10
  (people)      9    -    3    0      16   -    2    1      12   -    4    2
  total         49   27   13   3      70   28   7    8      60   44   21   12
(CP = plural complement; CS = singular complement; VP = plural verb; VS = singular verb)
In the ICE corpora, a lot of took a singular complement more frequently than
a plural one in all regions (the difference was most marked for ICE-GB (=1:2), and
then, for ICE-NZ (=2:3)). However, the numbers of singular and plural complements
become more even, as in Channell’s findings, when people is added to the plural complements. Lots of took a plural complement more consistently, with a singular complement being about
half as common in ICE-AUS and ICE-NZ, but much more common in ICE-GB. These
results show a slight tendency for the singular form of the quantifier to take a singular
complement (most marked in ICE-GB), and for the plural form of the quantifier to
take a plural complement (least marked in ICE-GB). So there is a tendency towards
consistency, but not a clear pattern.
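The ratios quoted above can be checked directly against Table 2. The following is a minimal sketch that recomputes them for a lot of, with and without people counted among the plural complements. The counts are transcribed from Table 2; the dictionary layout and the function name `plural_per_singular` are illustrative assumptions.

```python
# Complement counts for "a lot of", transcribed from Table 2:
# (plural complements, "people" complements, singular complements).
A_LOT_OF = {
    "ICE-AUS": (152, 68, 180),
    "ICE-NZ": (141, 62, 221),
    "ICE-GB": (77, 37, 149),
}


def plural_per_singular(plural, singular):
    """Number of plural complements for every singular complement."""
    return plural / singular


for corpus, (cp, people, cs) in A_LOT_OF.items():
    without_people = plural_per_singular(cp, cs)
    with_people = plural_per_singular(cp + people, cs)
    print(f"{corpus}: CP/CS = {without_people:.2f}; with people = {with_people:.2f}")
```

A value of 0.50 would correspond exactly to the 1:2 ratio cited for ICE-GB and 0.67 to the 2:3 ratio cited for ICE-NZ; the computed values are roughly 0.52 and 0.64. Adding people brings all three corpora closer to parity and tips ICE-AUS towards plural complements.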
Different trends were detectable in terms of verbal agreement across the regions.
ICE-AUS displayed a consistent preference for plural verbs with a lot of and lots of
(especially if the figures for people are included), whereas ICE-NZ and ICE-GB both
went the other way for a lot of, with a slight preference for singular agreement.
When we look more closely at examples from the corpora, we discover that a
high proportion of the singular verbs used in conjunction with NNQs are instances
of existential there. As Biber et al. note: “The subject status of existential there is also
indicated by the strong tendency in conversation to use a singular verb regardless of
the number of the notional subject” (1999: 994). We cannot therefore attribute the
mismatch between the number of the verb and that of the complement to the singular
lot in Table 3 below (1a/b).
Table 3. Verbal agreement with noun complement of a lot of/lots of: existential there construction preceding NNQ compared with verb following NNQ
                               ICE-AUS    ICE-NZ     ICE-GB
lot
1a  there’s a lot of CP        8          11         5
1b  a lot of CP is             0          0          0
2a  there’s a lot of CS        20 (74%)   11 (48%)   30 (81%)
2b  a lot of CS is             7 (26%)    12 (52%)   7 (19%)
                               35         34         42
lots
3a  there’s lots of CP         2          6          4
3b  lots of CP is              0          0          0
4a  there’s lots of CS         1          0          3
4b  lots of CS is              0          0          1
                               10         13         21
All of the instances from 1a are examples with existential there and, coming predominantly from the spoken corpora, bear out Biber et al.’s assertion that “In fact,
such examples [there’s + plural noun] are somewhat more common in conversation than
the standard constructions with plural verb plus plural noun phrase” (1999: 186).
The data for lots of is insufficient to show a clear preference, but the marked
tendency for a singular verb to be attached to lots of + plural complement (especially
in ICE-NZ, see 3a) again shows the influence of existential there.
Examples of mismatches between the number of the complement and that of the verb when the verb
follows are not common, although there are some to be found in ICE-GB. The only
example with a lot of is a rather questionable one where the plurality of the verb is
influenced by coordination of the singular NNQ phrase with a plural subject:
(1) Poorly shaded lights or a lot of movement are also undesirable.
[ICE-GB W2B-033:095]
The speech data provides good evidence for speakers of English being comfortable
with there’s + plural complement in quantitative statements (100% of the instances
of a lot/lots of with a singular verb and plural complement in Table 3 have existential
there, and they are all from the spoken components of the corpora). Yet the following
example enacts the apparent uncertainty for one speaker over verbal agreement with
lots of, with a self-correction within the same utterance:
(2) But there there there there’s lots of deers and lots of rabbits
[ICE-GB S1A-006:260]
The stuttering repetition of there suggests hesitation by the speaker over what follows,
and the normally zero plural deer is mistakenly regularized to deers to match rabbits.
A few moments later the speaker has resolved the problem with this grammatically
regularized statement:
(3) There are lots of deer and lots of rabbits [ICE-GB S1A-006:264]
Even given the influence of existential there on the number of the verb, it is worth
looking at the overall pattern of agreement between the NNQ, its complement and
associated verb, to look for signs of variability. Table 4 presents the percentages and
numbers for a lot of and lots of showing where the associated verb agrees with the
number of the complement, with the NNQ, where there is no distinction (i.e. both the
NNQ and the complement are either singular or plural), or there is no agreement with
either (e.g. there’s lots of ideas).
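The four-way scheme just described can be made explicit as a small decision function. This is a sketch of the logic as stated in the paragraph above, not the author's annotation procedure; the function name `agreement_category`, the category labels and the "sg"/"pl" coding are assumptions made for illustration.

```python
def agreement_category(nnq_number, complement_number, verb_number):
    """Classify one subject phrase; each argument is "sg" or "pl".

    Returns which element the verb agrees with:
    "complement", "nnq", "no distinction" or "neither".
    """
    if nnq_number == complement_number:
        # NNQ and complement share a number, so agreement cannot be
        # attributed to one of them rather than the other.
        if verb_number == nnq_number:
            return "no distinction"
        return "neither"
    # NNQ and complement differ, so the verb must match exactly one of them.
    if verb_number == complement_number:
        return "complement"
    return "nnq"


if __name__ == "__main__":
    # "there's lots of ideas": plural NNQ, plural complement, singular verb
    print(agreement_category("pl", "pl", "sg"))
    # "a lot of movement are": singular NNQ, singular complement, plural verb
    print(agreement_category("sg", "sg", "pl"))
```

Both usage lines fall into the "neither" category: the verb matches neither the quantifying noun nor its complement.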
Table 4. Verbal agreement with number of noun complement and/or NNQ for a lot/lots of*
                                               ICE-AUS    ICE-NZ     ICE-GB
a lot of
plural verb + plural complement                38 (50%)   23 (38%)   21 (31%)
singular verb + sNNQ                           8 (11%)    9 (15%)    5 (7%)
singular verb + singular complement + sNNQ     30 (39%)   28 (47%)   40 (60%)
plural verb + singular complement + sNNQ       0          0          1 (2%)
lots of
singular verb + singular complement            1 (10%)    0          4 (19%)
plural verb + pNNQ                             0          0          1 (5%)
plural verb + plural complement + pNNQ         7 (70%)    7 (50%)    12 (57%)
singular verb + plural complement + pNNQ       2 (20%)    7 (50%)    4 (19%)
sNNQ = singular NNQ; pNNQ = plural NNQ
*includes existential constructions
Here we can see for a lot of that, where there is a mismatch between the number of
the NNQ and the complement, the number of the complement is most likely to dictate
the number of the following verb in ICE-AUS, and it is reasonably common in ICE-NZ
(38%) and ICE-GB (31%). For these last two, there is a high proportion of NNQ phrases
where the number of the complement and the NNQ match, and it is therefore not possible to distinguish which element dictates agreement. However, if we add the examples
for a lot of + people, we find a very consistent use of the plural verb across the regions
(ICE-AUS, 31/40 instances, ICE-NZ, 14/17 and ICE-GB 19/23) which makes the complement the main determiner of verbal agreement across the regions, i.e. giving semantic (notional and proximity) rather than formal grammatical agreement.
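To make the consistency of the people figures easier to compare, the three fractions quoted above can be converted to percentages. This is a minimal sketch using only the counts given in the text; the function name `plural_verb_share` is an illustrative assumption.

```python
# Verb counts for "a lot of people" as quoted in the text:
# (plural verbs, all verbs with evident number).
A_LOT_OF_PEOPLE = {
    "ICE-AUS": (31, 40),
    "ICE-NZ": (14, 17),
    "ICE-GB": (19, 23),
}


def plural_verb_share(plural, total):
    """Percentage of verbs that are plural."""
    return 100 * plural / total


for corpus, (plural, total) in A_LOT_OF_PEOPLE.items():
    print(f"{corpus}: {plural_verb_share(plural, total):.1f}% plural verbs")
```

All three shares fall between about 77% and 83%, which is the sense in which plural agreement with people is consistent across the regions.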
We will now look at the set of NNQs found in the corpora that can vary between
singular and plural forms, to see if they show any similar tendencies to the most
frequent NNQ a lot/lots of.
5.2 ONNQs that have a singular or plural quantifying noun
The only vague quantifiers that Channell identified as being able to take a singular or
plural form are a load/loads of and a mass/masses of (1994: 101–2). Neither of these
were particularly frequent, so for the purpose of this study we will look at a larger set
of NNQs with the singular/plural alternative, to see how they compare with a lot/lots
of. As noted earlier, NNQs are an open grammatical set. We therefore applied certain
criteria to select a coherent set of other NNQs (ONNQs) for comparison:
–– No partitives were included, which also excluded measure nouns such as cup/cups.
–– Numerical uses were included where the number did not refer to a precise amount, as in a couple of/dozens of.
–– Only those were included that could take either a singular or plural noun complement (i.e. are number transparent), thus excluding some relatively frequent NNQs such as group, set, which consistently take plural complements.
–– Only examples that had a quantitative function were included. Uses such as clouds of gas, which could be interpreted either as descriptive or quantitative, were excluded, as were others such as two piles of leaves, where the number obviously takes over any quantifying function that would exist in examples such as piles of homework, and those with an adjectival qualifier that puts focus on the descriptive function of the word – a small pile of leaves.
–– For the purposes of identifying a set of NNQs that could take either a singular or plural noun, only those that took a singular and a plural form in at least one of the ICE regions were included.
See Table 6 for a full list of the other quantifiers (ONNQs) inspected.
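The selection criteria listed above amount to a conjunction of tests applied to each candidate noun. The sketch below expresses them as a filter over hand-annotated records. The `Candidate` fields, the function name `is_onnq` and the example annotations are hypothetical: in the study these judgements were made by inspecting corpus examples, not computed.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """One candidate quantifying noun with hypothetical manual annotations."""

    noun: str
    is_partitive: bool          # includes measure nouns such as cup/cups
    precise_number: bool        # numeral use that refers to an exact amount
    number_transparent: bool    # takes singular and plural noun complements
    quantitative_use: bool      # not merely descriptive (cf. "clouds of gas")
    has_sg_and_pl_form: bool    # both forms attested in at least one ICE region


def is_onnq(c: Candidate) -> bool:
    """Apply the five selection criteria in the order they are listed."""
    return (
        not c.is_partitive
        and not c.precise_number
        and c.number_transparent
        and c.quantitative_use
        and c.has_sg_and_pl_form
    )


if __name__ == "__main__":
    candidates = [
        Candidate("heap", False, False, True, True, True),
        Candidate("cup", True, False, True, True, True),      # measure noun
        Candidate("group", False, False, False, True, True),  # plural complements only
    ]
    print([c.noun for c in candidates if is_onnq(c)])
```

Keeping each criterion as a separate field makes it easy to see which test excluded a given noun, for example number transparency in the case of group and set.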
5.2.1 Regional and register features
Regional and register frequencies across the corpora for the plural and singular forms
of these NNQs are shown in Table 5, compared with data from Table 1 for a lot/lots of.
Table 5. Comparison of a lot/lots of with alternative singular/plural ONNQs
                 ICE-AUS                   ICE-NZ                    ICE-GB
                 spoken  written  total    spoken  written  total    spoken  written  total
(a) lot of       355     45       400      353     71       424      241     22       263
lots of          59      17       76       67      31       98       78      26       104
Total            414     62       476      420     102      522      319     48       367
Singular ONNQ    48      23       71       21      36       57       31      20       51
Plural ONNQ      38      13       51       50      32       82       36      25       61
Total            86      36       122      71      68       139      67      45       112
The most striking contrast with the a lot/lots of data is that there is a much more
even spread of ONNQs between the regions. ICE-NZ is just ahead, followed by ICE-AUS and then ICE-GB. There is not the same weighting of frequencies towards the
spoken component of the corpora. While there are still higher frequencies of ONNQs
in spoken data across the regions, the ratio between spoken and written is much closer,
particularly in ICE-NZ.
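The register contrast described above can be quantified as a spoken-to-written ratio for each corpus. The following sketch uses the Total rows of Table 5; the function name `spoken_to_written` is an illustrative assumption, and the ratios are based on raw counts with no normalisation for the size of the spoken and written components.

```python
# (spoken, written) totals transcribed from the Total rows of Table 5.
TABLE_5_TOTALS = {
    "a lot/lots of": {
        "ICE-AUS": (414, 62),
        "ICE-NZ": (420, 102),
        "ICE-GB": (319, 48),
    },
    "ONNQs": {
        "ICE-AUS": (86, 36),
        "ICE-NZ": (71, 68),
        "ICE-GB": (67, 45),
    },
}


def spoken_to_written(spoken, written):
    """Raw number of spoken tokens for every written token."""
    return spoken / written


for quantifier, by_corpus in TABLE_5_TOTALS.items():
    for corpus, (spoken, written) in by_corpus.items():
        ratio = spoken_to_written(spoken, written)
        print(f"{quantifier:14} {corpus:8} spoken:written = {ratio:.2f}")
```

On these figures a lot/lots of is between about four and seven times as frequent in speech as in writing, whereas the ONNQ ratio stays below 2.5 everywhere and is close to 1 in ICE-NZ.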
With regard to the ratio of singular and plural forms, again the effect is one of
a more even distribution. Whereas the singular a lot of is the dominant form in all
regions, ONNQs appear to offer more flexibility as to the choice of singular or plural
form, with the plural forms actually the more frequent option in ICE-NZ and ICE-GB.
Table 6 gives a breakdown of the ONNQs selected following the criteria in 5.2:
Table 6. Frequency of alternative singular/plural ONNQs in ICE-AUS, ICE-NZ, ICE-GB
              ICE-AUS                   ICE-NZ                    ICE-GB
              spoken  written  total    spoken  written  total    spoken  written  total
bag           2       0        2        0       1        1        2       0        2
bags          1       1        2        0       0        0        0       0        0
band          0       3        3        1       2        3        2       2        4
bands         0       0        0        0       1        1        2       4        6
block         7       2        9        5       3        8        1       2        3
blocks        1       4        5        3       4        7        0       1        1
body          3       4        7        3       7        10       3       3        6
bodies        1       0        1        1       0        1        0       1        1
bunch         9       1        10       5       5        10       3       1        4
bunches       1       0        1        0       2        2        0       0        0
crowd         3       2        5        0       0        0        0       1        1
crowds        0       0        0        1       0        1        0       4        4
dose          4       3        7        0       2        2        2       0        2
doses         0       1        1        0       1        1        0       0        0
dozen         2       0        2        0       2        2        0       0        0
dozens        3       2        5        3       4        7        4       3        7
heap          0       0        0        1       2        3        0       0        0
heaps         21      0        21       38      4        42       2       1        3
host          0       3        3        2       1        3        1       2        3
hosts         0       0        0        0       0        0        0       1        1
load          7       1        8        1       1        2        10      1        11
loads         1       1        2        0       2        2        16      4        20
mass          4       0        4        0       4        4        5       8        13
masses        3       1        4        0       0        0        5       1        6
pile          2       4        6        2       2        4        1       0        1
piles         1       0        1        1       3        4        3       0        3
quantity      3       0        3        0       3        3        1       0        1
quantities    3       3        6        0       7        7        3       5        8
stack         2       0        2        1       1        2        0       0        0
stacks        2       0        2        2       1        3        2       0        2
Total         86      36       122      71      68       139      67      45       112
This list shows the variety of ONNQs available. While a lot/lots of gives no more
specific sense than “a large amount of ” (see Channell 1994: 101, 106), we have several
different categories here:
–– Apparently numerical (dozen) or precise (dose) which are used to refer to a large or unspecified number, e.g. dozens of celebrities, a dose of discipline
–– Non-specific quantifiers that often have to be qualified to give a sense of the amount referred to: a small/large body of work
–– Nouns that are used generally to denote a large amount (crowd, host, mass, quantity)
–– Nouns that are used generally to denote a collection of objects of unspecified size, but are used as quantifiers always to express a large amount (bag, heap, load, stack)
The fact that they can be used either singularly or plurally shows that they can all be
conceptualised either as an undifferentiated mass or a collection of discrete items.
The low frequencies of individual ONNQs in Table 6, in comparison to the figures
for a lot/lots of, indicate that their primary functions are not as quantifiers. They can
however be number transparent, and are therefore interesting to compare with a lot/
lots of. Some of the more frequent ONNQs such as set of, group of, which are excluded
from Table 6 because they showed no signs of number transparency, only taking plural
complements, are listed in Appendix 1.³
In contrast to a lot/lots of it is clear that these NNQs are not restricted in the same
way to the spoken genre. Some such as band, blocks, body and host in particular show
a higher frequency in written material. The following examples show them being used
in both bureaucratic (4) and technical (5) written contexts, with (6) appearing in the
business correspondence category:
(4) Directly related to these problems of responsiveness and sensitivity were a host
of ‘technical’ problems… [ICE-AUS W2A-012:13]
(5) The alternatives are, therefore, discussed, where possible, in terms of blocks of
energy corresponding to baseload generation at 250 MW. [ICE-NZ W2A-038:10]
(6) As discussed I am introducing a new band of charges, which will be applicable
for performance times, as against rehearsal times. [ICE-GB W1B-021:091]
Table 6 also draws attention to some regional difference in the use of ONNQs. Heaps
of appears to be a distinctively antipodean quantifier, with 21 and 42 occurrences
in ICE-AUS and ICE-NZ respectively, with only 3 in ICE-GB. Conversely, loads of
appears almost exclusively in the British corpus, with 20 in ICE-GB as against 2 each
in ICE-AUS and ICE-NZ (although ICE-AUS has a similar frequency for the singular
form). The other ONNQ showing a distinctive regional grouping is bunch of, often
identified as typically American (see Section 1.3), which is more than twice as frequent in ICE-AUS and ICE-NZ as it is in ICE-GB.⁴ In order to look more closely at
these regional divergences, we will assess the collocations of bunch, heap and load, to
see if there are differences in usage that reflect the differences in frequency.
3. Note that these are both classified by Biber et al. as “quantifying collectives”, which only
take countable nouns. Likewise bunch which, conversely, was found to be number transparent
in the corpus data.
4. There are also 7 instances of heaps, and 12 of bunch in the ART corpus, with no examples
of loads as a quantifier, further confirming the regional distinctions.
5.2.2 Collocation with heap(s)/load(s)/bunch(es)
These ONNQs form a useful set for investigation not only because of their regional
differences, but also because they each possess, in their primary senses, an element of
physical description that more abstract quantifiers such as lot/lots do not. Table 7 gives
an inventory of collocations for each of the ONNQs for the regions under discussion:
Table 7. Collocations with heap(s)/load(s)/bunch(es)
ICE-AUS
heap
heaps
load
loads
bunch
bunches
ICE-NZ
ICE-GB
metal
cash, character, friends,
HECS, junk, kids,
nationalities, money,
people (2), places, plastic
surgery, stuff, students,
times (2), work (2)
clothing, cushions, papers, stuff
action, apostrophes, energy, files, bodies, hours, relations
halyard, hassles, leaves, lights,
mending, money (2), movies,
music, noise, overtime(2),
people (7), places, profits, pubs,
presents, rouge, rubbish,
seats, spray & rain, shit, stars,
support, tapes (2), things (2),
time, wine, work, zeros
crap (3), filters, money,
casks, nonsense
advantages, bread,
people, rubbish (2)
engineers, fun,
nonsense, numbers,
rubbish (2),
sensitivity, uniforms
calcium, cards & letters
ammonia & phosphate, food
blanks, books (2), films,
jobs, money (2),
people, photos, rabbits
& guinea pigs, rails,
sentences, space, stuff
(2), texts, things (2)
Aborigines, clods, cows,
blood, guys, lefties, live wires,
dorks & bubbleheads,
kids, lads, men & women, mountains, no-hopers, people
flowers, people, roses
people (2), recruits, words (3), warriors
flowers
flowers, greenery
It is interesting to note, first of all, the difference between the uses of heap and
heaps. While the collocations for the singular form are all objects that can form a
physical heap (clothing, cushions, metal), the collocations for heaps in ICE-AUS and ICE-NZ
give a much wider range, from the purely abstract time, to objects that have an abstract
and a concrete sense, such as money, and to animate entities such as kids, people. The
ICE-GB collocations show some of that range, but the reference to bodies, as in
example (7) below, is referring to piles of human corpses, and therefore foregrounding the descriptive rather than quantitative sense of heaps:
(7) I was walking by piles of heaps of bodies that had been torched cos
there was these black charred embers of grotesque 〈,〉 figures you know
[ICE-GB S2A-050:152]
Load(s) doesn’t show the same distinction between the singular and plural forms in
the ICE-GB data, with load and loads both giving examples of animate, inanimate
and abstract objects. ICE-AUS and ICE-NZ have a much more limited range, with
inanimate objects (casks, filters) again emphasizing the descriptive (as do forms not
included here such as bus loads, car loads, truck loads, which appear in both corpora).
Both ICE-AUS and ICE-NZ do, however, give a dismissive connotation to loads with
the collocations crap, nonsense, rubbish.
(8) They reckon this thing’s cost them about six thousand dollars and I reckon that’s
a load of crap [ICE-AUS S1A-030:51]
Interestingly ICE-GB shows a similar variation happening with bunch. Alongside the
very literal collocations of flowers, roses, is the evocative dorks and bubbleheads, mirroring the sardonic tone of clods, lefties, no-hopers. Perhaps this is a transition that
many NNQs go through, from being purely descriptive to a colloquial means of
bunching things/people together in an offhand, dismissive way, to achieving a neutral
sense of quantity.
5.2.3 Grammatical features
To draw comparisons between a lot/lots of, and the number-flexible ONNQs, we
will treat the latter as a set. Table 8 compares the figures from Table 2 for noun complementation and verbal agreement (with any indeterminate plurals such as people,
data removed).
Table 8. Noun complementation and verbal agreement for a lot of/lots of compared with alternative singular/plural ONNQs
                   ICE-AUS               ICE-NZ                ICE-GB
                   CP   CS   VP   VS     CP   CS   VP   VS     CP   CS   VP   VS
(a) lot of         152  180  38   38     141  221  25   38     77   149  22   47
lots of            40   27   10   3      54   28   5    7      48   44   17   10
ONNQ (singular)    37   32   5    17     32   22   5    4      20   24   4    10
ONNQ (plural)      28   22   6    3      35   34   12   6      30   29   8    2
(CP = plural complement; CS = singular complement; VP = plural verb; VS = singular verb)
There is a remarkable consistency between the AusE and NZE preference for a
plural complement with a singular ONNQ, and the data for ICE-GB shows a much
higher proportion of CPs than for a lot of. This suggests an even higher degree of
grammaticization for the ONNQs than for a lot of, where the CS is preferred across
the regions.
Again the plural forms of ONNQs are less frequent (as for lots of ), but to a lesser
degree. They show a greater tendency than lots of to take a singular complement, reinforcing the case for grammaticization of ONNQs.
With verbal agreement, the numbers for singular and plural verbs with a singular
ONNQ are quite even across the regions, in contrast to the preference for a singular
verb in ICE-NZ/ICE-GB for a lot of. Plural ONNQs take a plural verb more regularly
than does lots of, so we have contrasting trends whereby the singular ONNQ appears
more likely to take a plural complement, but the verb agreement is unpredictable,
whereas plural ONNQs take plural complements and plural verbs quite consistently.
As with a lot/lots of, it will be instructive to look more closely at the impact that the
existential there construction has.
Table 9. Verbal agreement with noun complement of alternative singular/plural ONNQs: existential there’s preceding ONNQ compared with verb following ONNQ
                               ICE-AUS    ICE-NZ     ICE-GB
Singular
1a  there’s sONNQ of CP        1 (33%)    1 (50%)    1 (33%)
1b  sONNQ of CP is             2 (67%)    1 (50%)    2 (67%)
2a  there’s sONNQ of CS        6 (86%)    2 (67%)    3 (60%)
2b  sONNQ of CS is             1 (14%)    1 (33%)    2 (40%)
                               10         5          8
Plural
3a  there’s pONNQ of CP        7 (88%)    4 (100%)   1 (100%)
3b  pONNQ of CP is             1 (13%)    0          0
4a  there’s pONNQ of CS        0          1 (50%)    0
4b  pONNQ of CS is             0          1 (50%)    1 (100%)
                               8          6          2
sONNQ = singular ONNQ; pONNQ = plural ONNQ
CP = plural complement; CS = singular complement
If we compare the results in Table 9 to those of Table 3, there is not quite the same
degree of exclusivity for ’s as the singular verb with a plural complement. Whereas
Table 3 had no returns for 1b in any of the regions, the singular verb following the
singular ONNQ + plural noun complement is actually more common than there’s
introducing it (1a) in both ICE-AUS and ICE-GB. There is also one example of
there’s with the plural ONNQ. Again we could attribute this to the fact that these
ONNQs are less delexicalized/grammaticized than a lot/lots of.
Table 10 gives the overall patterns for agreement between other singular/plural
ONNQs and the noun complement, including the existential there’s construction.
Table 10. Verbal agreement with noun complement or ONNQ for singular/plural ONNQs
                                               ICE-AUS    ICE-NZ     ICE-GB
plural verb + plural complement                4 (29%)    4 (44%)    4 (33%)
singular verb + sONNQ                          3 (21%)    2 (22%)    3 (25%)
singular verb + singular complement + sONNQ    7 (50%)    3 (33%)    5 (42%)
plural verb + singular complement + sONNQ      0          0          0

singular verb + singular complement            0          2 (13%)    1 (11%)
plural verb + pONNQ                            0          3 (20%)    1 (11%)
plural verb + plural complement + pONNQ        2 (20%)    6 (40%)    6 (67%)
singular verb + plural complement + pONNQ      8 (80%)    4 (27%)    1 (11%)
sONNQ = singular ONNQ; pONNQ = plural ONNQ
The clear contrast between these figures and those for a lot in Table 4 is that there
is a greater tendency for the number of other singular ONNQs to dictate verbal agreement, although the complement still has a higher percentage. This suggests that these
ONNQs have more lexical weight – a load or a heap retaining a physical quality that a
lot doesn’t. This lexical weight is lost in the plural form, where the proportion of verbal
agreement with the ONNQ is much lower. The figures for plural ONNQ are inconclusive as to agreement with complement or ONNQ, but they do show an even more
marked tendency for inconsistency than that already seen for lots of (although again
this is heavily influenced by existential there), as in:
(9) And there was heaps heaps of kids like you go to Carnarvon Gorge
[ICE-AUS S1A-067:308]
Although the numbers that demonstrate verbal agreement with these plural ONNQs
are quite small, there is evidence here of the tension between the descriptive and the
quantitative roles of these less frequent ONNQs.
6. NNQs with singular or plural forms only
This paper has concentrated on a subset of NNQs – those that can be either singular
or plural in form – in order to focus on questions of agreement. The full inventory of
NNQs discovered through corpus searches (see Appendix 1) spans a much wider range.
The NNQs listed are all non-partitive, and were selected on the basis that they had
quantitative uses that went beyond their conventional descriptive use, e.g. a parcel of
shares (ICE-AUS S2A-031:125), a spot of coffee and porridge (ICE-NZ W1B-003:423).
The lists may not be exhaustive, but they are representative of the range of NNQs
found. There was a remarkable consistency between the regions in the number and range of
quantifiers (ICE-NZ had the most NNQs, with 70, then ICE-AUS with 67, and ICE-GB with 58).
Table 11 below shows the only points of regional divergence.
Table 11. NNQs common and specific to different regions
ICE-AUS only
ICE-NZ only
ICE-GB only
ICE-AUS,NZ
ICE-NZ,GB
dollops (1)
dash (1)
droves (1)
gaggle (1)
swag (2)
wadges (1)
clump(s) (2,3)
mob (1,1)
oodles (1,1)
raft (2,3)
smattering (1,3)
spot (2,1)
Clearly the numbers are not large enough to make regional comparisons. There
are however particularly AusE and NZE words, such as droves, mob and swag, and
it is therefore not surprising that they don’t appear in ICE-GB. Conversely, oodles is
one that Channell highlighted in her BrE data (1994: 103), so its appearances only in
ICE-AUS and ICE-NZ here are not representative.
While the overall figures in Appendix 1 do point to a greater use of NNQs in spoken
than written English, they are by no means exclusive to informal communication.
Consider the following examples for raft of:
(10) Male unionists were intent on the exclusion of possible competition; they
managed to achieve a raft of restrictions to women’s employment opportunities
at the same time as they successfully excluded immigrant competition by means
of the White Australia Policy. [ICE-AUS W2A-017:73]
(11) There was a a raft of of basically policy givens that we were working within
[ICE-NZ S2A-047:20]
(12) The use of in situ concrete for floors in Australia generates a whole raft of
falsework design associated with table forms. [ICE-NZ W2A-040:92]
Two of these examples are from the “learned information” category of ICE, and are
being used in a quite formal, or technical context (in the case of (12)). While (11)
comes from the spoken medium, it is clearly being used as a piece of political jargon,
and therefore within a specialised discourse. Raft of has no more specific sense than
lot of, and yet its imprecision does not disqualify it from a formal setting. Channell
(1990) looked at some examples of academics in the field of economics – where we
would expect numerical precision to be particularly important – who indicated “that
they recognize the inherent vagueness [of the non-numerical quantifier] and know
how to exploit it for particular communicative purposes” (1990: 103). For example,
the choice of the NNQ a number of was explained by the academic who had used it to
indicate “this is an area where a considerable amount of work has been done and there
is no monopoly of interpretation…There’s at least two, ’cause I think I’d have put two if
there’s only two, and I think the word a number also indicates there’s no front-runner.
I think if there was about 25 of them I’d have started introducing classifying things in
there – ‘a great number’, ‘a vast number’…”. Note also the prevalence of existential there’s
with a plural in the expert’s reported utterance.
7. Conclusions
This study has shown several areas of interest in the study of NNQs that would repay
study beyond the corpora used here:
Process of grammaticization
The figures for a lot/lots of showed a high degree of variability in the choice of a singular or plural complement, while verbal agreement was quite regular – when the influence of existential there was allowed for – demonstrating this NNQ’s role as a complex
determiner. The singular/plural ONNQs looked at developed rather lower degrees of
number transparency, but more variability as to verbal agreement, suggesting a lesser
degree of delexicalization/grammaticization.
Classifications of NNQs modified by corpus evidence
The grammars looked at provided a template which allowed this study to target the
most likely types of NNQ to give evidence of variation as to the number of the noun
complementation and verbal agreement. Therefore partitives were excluded, and
Biber’s class of “quantifying collectives” not looked at in detail. Corpus evidence did,
however, provide motivation to reassess his labeling of some NNQs, with bunch of in
particular showing signs of number transparency that suggest it does not always function simply as a collective.
Regional differences
Corpus evidence confirmed load(s) of as a typically British NNQ, and found that the
AmE-marked bunch of is more popular in AusE and NZE than in BrE. Heap(s) was also
found to be a particularly antipodean NNQ, with few examples of its usage in ICE-GB.
NZE showed itself to be particularly productive both in the overall range of NNQs
used, and in the range of collocations found with the subset bunch/heap/load.
Register differences
While the figures for a lot of/lots of confirmed Quirk et al.’s contention (1985: 264) that
NNQs are more frequent in spoken than written English, some NNQs revealed more
subtle relationships to register. Examples such as band/block/host/number of showed
evidence, in written texts, of carrying more specialised meanings than simply that
of a vaguely large amount. Even where vague quantity appeared to be the only sense,
as in raft of, the formal context in which this NNQ is used suggests a more purposeful
approach to numerical imprecision than mere informal vagueness or overstatement.
References
Biber, Douglas, Stig Johansson, Geoffrey Leech, Susan Conrad & Edward Finegan. 1999.
Longman Grammar of Spoken and Written English. London: Longman.
Channell, Joanna. 1990. “Precise and vague quantities in academic writing”. In Walter Nash
(ed.), The writing scholar: Studies in the Language and Conventions of Academic Discourse,
95–117. Beverly Hills CA: Sage.
Channell, Joanna. 1994. Vague Language. Oxford: Oxford University Press.
Drave, Neil. 2002. “Vaguely speaking: a corpus approach to vague language in intercultural
conversations”. In Pam Peters, Peter Collins & Adam Smith (eds), New Frontiers of Corpus
Linguistics. Papers from the 21st International Conference on English Language Research on
Computerized Corpora, 25–40. Amsterdam: Rodopi.
Huddleston, Rodney & Geoffrey Pullum. 2002. The Cambridge Grammar of the English Language. Cambridge: Cambridge University Press.
Kennedy, Graeme. 1987. “Quantification and the use of English: A case study of one aspect of
the learner’s task”. Applied Linguistics 8(3): 264–86.
Levin, Magnus. 1998. “Concord with collective nouns in British and American English”. In
Hans Lindquist, Staffan Klintborg, Magnus Levin & Maria Estling (eds), The Major Varieties
of English, 193–204. Växjö University.
Longman Dictionary of Contemporary English, 4th edn. 2007. Harlow: Pearson Education.
Peters, Pam. 1999. “Differing on agreement: A report on Langscape 3”. English Today 15(2): 6–9.
Quirk, Randolph, Sidney Greenbaum, Geoffrey Leech, & Jan Svartvik. 1985. A Comprehensive
Grammar of the English Language. London: Longman.
Reid, Wallis. 1991. Verb and Noun Number in English: A Functional Explanation. London:
Longman.
Appendix 1

No.  ICE-AUS     Spoken Written   ICE-NZ      Spoken Written   ICE-GB      Spoken Written
1    bag              2       0   bag              0       1   bag              2       0
2    bags             1       1   bags             1       2   bands            2       4
3    band             0       3   band             0       1   batch            1       4
4    batch            2       0   batch            1       0   block            1       2
5    block            7       2   block            5       3   blocks           0       1
6    blocks           1       4   blocks           3       4   bodies           0       1
7    bodies           1       0   bodies           1       0   body             3       3
10   body             3       4   body             3       7   bunch            3       1
11   bunch            9       1   bunch            5       5   chunks           1       2
12   bunches          1       0   bunches          0       2   cluster          1       0
13   chunks           0       1   chunk            4       1   clusters         0       1
14   clump            0       1   cluster          0       5   clutch           0       1
15   clumps           1       0   clusters         0       1   couple          97      29
16   cluster          2       4   clutch           1       1   crowd            0       1
17   clusters         1       2   couple         148      51   crowds           0       4
18   clutch           2       0   crowds           1       0   cup             10       3
19   clutches         0       2   cup             13       5   cups             1       0
20   couple         187      36   cups             1       2   deal            26      12
21   crowd            3       2   dash             1       0   dose             2       0
22   cup             20       7   deal            13      26   dozens           4       3
23   cups             4       1   dose             0       2   drop             1       0
24   deal            21       9   doses            0       1   flock            0       1
25   dollops          0       1   dozen            1       2   flood            1       1
26   dose             4       3   dozens           3       4   group           38      32
27   doses            0       1   drop             1       1   groups          10      21
28   dozen            2       0   droves           0       1   heaps            2       1
29   dozens           3       2   flocks           0       2   hordes           2       0
30   drop             0       1   flood            0       4   host             1       2
31   flocks           2       0   gaggle           0       1   hosts            0       1
32   flood            1       3   gang             1       1   load            10       1
33   gang             1       0   gangs            0       1   loads           16       4
34   gangs            0       1   group           33      51   lot            241      22
35   group           47      17   groups           7      16   lots            78      26
36   groups           8       5   heap             1       2   lumps            1       0
37   heaps           21       0   heaps           38       4   mass             5       8
38   hordes           0       1   horde            2       0   masses           5       1
39   host             0       3   hordes           0       1   myriad           0       2
40   load             7       1   host             2       1   number         123     115
41   loads            1       1   load             1       1   numbers         19      10
42   lot            355      45   loads            0       2   parcels          0       1
43   lots            59      17   lot            353      71   pile             3       0
44   lump             0       1   lots            67      31   piles            3       0
45   mass             4       0   mass             0       4   pinch            1       1
46   masses           3       1   mob              0       1   quantities       3       7
47   mob              1       0   myriad           0       2   quantity         2       2
48   myriad           1       2   number         113     200   scrap            0       2
49   number          87      69   numbers          9      32   set             20      34
50   numbers          8      15   oodles           0       1   sets             6       7
51   oodles           0       1   pack             4       0   smattering       1       0
52   parcel           1       0   parcel           0       2   spate            0       2
53   pile             3       5   pile             1       2   spot             2       0
54   piles            1       0   piles            1       4   stacks           2       0
55   pinch            0       1   pinches          1       0   swathe           0       0
56   quantities       3       3   quantities       1      13   swathes          0       1
57   quantity         1       0   quantity         0       7   touch            2       3
58   raft             1       1   raft             1       2   wadges           0       1
59   scrap            0       1   scrap            1       0
60   scraps           0       1   scraps           0       1
61   set             31      32   set             26      43
62   sets             1       6   sets             6       7
63   spate            1       3   smattering       0       3
64   stack            2       0   spate            1       0
65   stacks           2       0   spot             1       0
66   swathe           0       2   stack            1       1
67   touch            2       3   stacks           2       1
68                                swag             0       2
69                                swathe           1       0
70                                touch            2       6
     Total          932     329   Total          883     651   Total          752     381
From chairman to chairwoman
to chairperson
Exploring the move from sexist usages
to gender neutrality*
Janet Holmes, Robert Sigley & Agnes Terraschke
Victoria University of Wellington/Daito Bunka University/
Macquarie University
This paper analyzes data from written and spoken corpora of British, American,
Australian and New Zealand English to track social change in patterns of
gender-marking. Frequency data for the use of general terms like woman and
man are compared across the different regional varieties of written English, and
contrasted with spoken corpus data from Australia and New Zealand. Several
alternative social interpretations of the data are considered and discussed. The
distributional patterns for occupational terms in the corpora are examined with
regard to gender pre-modification and post-modification. The results indicate
that female roles are often still explicitly linguistically marked, but this could be
interpreted as an indication of women’s entry into formerly male-centric domains.
The most recent Australian data suggests a move towards gender neutrality.
1. Introduction
Sexist language is one means by which a culture or society can perpetuate sexist
attitudes, and for this reason it has attracted much attention from feminists. Analyses of
sexist usages in a number of varieties of English have identified many ways in which
sexist attitudes are encoded, maintained and systematically re-created or constructed
in everyday interaction (Holmes 1999, 2001). Areas which have been examined include
the use of pseudo-generic terms, such as -man and he; gender-neutral terms such as
chairperson; sexist suffixes such as -ess and -ette; metaphorical reference terms such
as bitch and stud; and terms of address such as Ms and dear (Holmes 1993a, 1993b,
1993c, 1997, 2001; Pauwels 1998, 2001a; Wolfson & Manes 1980). Such analyses indicate
that women are often assigned subordinate status by virtue of their gender alone, and
that they are treated linguistically as subordinate, regardless of their actual power
or social status in a particular context. The English language appears to collude in
the subordination of women, in that females are “marked” compared to males. However,
what counts as “marked” or normal is often socially rather than linguistically
determined, with different standards being applied to each sex. Marked behaviour is
usually that which runs counter to conservative social expectations. Corpus analysis
of linguistic differences in markedness can therefore provide interesting insights into
social change.

*We would like to thank Rowan Shoemark and Yasmin Funk for their assistance in
extracting some of the collocational data; and Laurie Bauer, Tom Lavelle and Jan Svartvik for
their insightful comments on drafts and presentations of this material.

Feminists long ago identified the linguistic markedness of women as a particular
cause for concern in the battle for equal treatment of the sexes. And indeed, the most
commonly advocated policy among official nonsexist language guidelines is to move
towards gender neutrality (Hellinger 2001: 109). This strategy includes avoidance of
pseudo-generics and sexist suffixes as well as making both genders visible
where necessary, e.g. female and male writers. These guidelines are primarily aimed
at language use in documents written by certain organizations (e.g. UNESCO) or
governments. Thus, the effect on the language use of the general public is not
immediate and is likely to be slow. According to a relatively recent analysis of BrE (Romaine
1999: 130), there is little reason to believe that these feminist initiatives and language
guidelines have brought about any radical change in the situation in the last few
decades. Romaine cites evidence of the linguistic markedness of women from the
100 million word British National Corpus. She points to disparities in patterns such
as lady doctor (125 instances) vs. gentleman doctor (0). Clearly, women are still the
marked gender in the area of occupational labels, as in so many other areas.
Similar observations of the linguistic markedness of women were made by
Holmes and Sigley (2002), who also noted an increased reference to women in job
titles as opposed to men between 1961 and the 1990s. They pointed out that this
linguistic visibility is not necessarily a bad thing but an important acknowledgement
of women moving into traditionally male work domains. Their prediction was that,
over time, the novelty of female judges or policewomen may wear off, and then this
stage of gender or female visibility could move on to the use of gender-neutral marking
(Holmes & Sigley 2002: 254). Overall, current research on AusE and NZE suggests
a general reduction in the use of pseudo-generics (e.g. Holmes 2001; Pauwels 1997;
Pauwels & Wrightson-Turcotte 2001); and an increase in the use of gender-neutral
markers such as -person or singular they (Holmes 2001; Pauwels 1997, 2001b), and the
neutral title Ms (Chiles 2003; Holmes 2001; Pauwels 1998, 2001a). It appears, then,
that after several decades of exposure to nonsexist language policies, the idea might
have finally caught on.
This paper updates that of Holmes and Sigley (2002), supplementing their data
with more recent corpora of NZE and AusE. One of its aims is to investigate their
prediction of the trend from explicit gender-marking to gender neutrality. Another
aspect considered in this study is Holmes and Sigley’s (2002) suggestion that NZE is
more progressive than northern hemisphere English, and whether this reflects larger
hemispheric differences in the use of nonsexist language. Finally, the inclusion of three
spoken corpora in this study further expands the scope of the earlier research to compare the relative frequency of nonsexist language in written and spoken discourse. This
comparison will give an indication of the extent to which nonsexist language guidelines for public discourse impact on private and more ad hoc language use.
The data analysis follows the structure established in Holmes and Sigley’s
(2002) paper. After examining the patterns revealed by frequency data alone, the
focus shifts to the area of occupational labels, comparing the relative frequencies
of forms used for referring to men and women, with attention paid to both
gender-marked premodification and postmodification. Primary data (raw frequencies,
frequency indexes, and examples of use in context) are drawn from six
parallel 1 million-word corpora of edited written English, as listed in Table 1. Each
contains 500 texts of 2000 words each, following the structure established for
the Brown Corpus of American English (Francis 1964) and its British equivalent, the
Lancaster-Oslo-Bergen (LOB) Corpus (Johansson 1978). Time depth is provided by
comparing those corpora (based on texts published in 1961) with parallel samples
from 1991 (known respectively as Frown and FLOB), and with the Australian Corpus
of English (ACE) based on texts from 1986 (Green & Peters 1987), and the Wellington
Corpus of Written NZE (WWC) based on texts from 1986–90 (described in Bauer
1993). This data is further supplemented by the more recent written components of
ICE-AUS and ICE-NZ, both of which consist of about 400 000 words with 200 texts
each about 2000 words long.
Table 1. Corpora of edited written English

Name           National variety   Texts published in   Number of words
Brown          American           1961                 1 000 000
LOB            British            1961                 1 000 000
ACE            Australian         1986                 1 000 000
WWC            New Zealand        1986–90              1 000 000
FLOB           British            1990–1               1 000 000
Frown          American           1991                 1 000 000
ICE-NZ (wr)    New Zealand        1990–8                 400 000
ICE-AUS (wr)   Australian         1991–5                 400 000

Apart from these written corpora, this study makes use of data from three
corpora of spoken English, two of AusE and one of NZE, as shown in Table 2. Since
these corpora were collected slightly later than WWC and ACE, they offer further
insights into the progressive use of gender-marked terms over time. The corpora under
consideration are the spoken segments of ICE-AUS and ICE-NZ (600 000 words
each), collected between 1991 and 1994–5, and the Australian corpus of talkback radio
(ART), compiled in 2004–5. While the ICE corpora consist of a combination of speech
from public or broadcast events and private conversation, ART is designed to provide
samples of speech from radio talkback interactions, representing a type of discourse
between the realms of public monologue and private conversation.

Table 2. Corpora of spoken English

Name           National variety   Collected in   Number of words
ICE-NZ (sp)    New Zealand        1990–4         600 000
ICE-AUS (sp)   Australian         1991–5         600 000
ART            Australian         2004–5         256 000

In order to provide direct comparisons between corpora of varying size, the raw
frequency count is supplemented by a frequency index (FI). This index shows the frequency of occurrence in relation to the size of the corpus by dividing the number of
tokens found by the number of words in each corpus. Since this would produce very
small frequency indices, the results were then multiplied by 10 000, to give the relative
frequency of use per 10 000 words.
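The calculation of the frequency index can be illustrated with a short Python sketch. It is an illustration only and not part of the original study: the function name frequency_index and the layout of the dictionary are assumptions, while the raw counts for woman and the corpus sizes are those reported in Tables 2 and 4.

```python
def frequency_index(tokens: int, corpus_words: int, per: int = 10_000) -> float:
    """Relative frequency of a form per `per` words of corpus text."""
    if corpus_words <= 0:
        raise ValueError("corpus size must be positive")
    return tokens / corpus_words * per


# (raw count of 'woman', corpus size in words), as reported in Tables 2 and 4
woman_counts = {
    "ICE-NZ (sp)": (135, 600_000),
    "ICE-AUS (sp)": (120, 600_000),
    "ART": (39, 256_000),
}

for corpus, (tokens, size) in woman_counts.items():
    print(f"{corpus}: FI = {frequency_index(tokens, size):.2f}")
```

Because the index is normalised to a common base of 10 000 words, the 39 tokens in the much smaller ART corpus can be compared directly with the 135 tokens in spoken ICE-NZ.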
2. Women, men and social salience
Even using the grossest level of analysis, there are indications that the position of
women in society has changed over the last forty years. An earlier analysis of the relative frequency of tokens of woman/women vs. lady/ladies, for instance, indicated that in
British, American, New Zealand and Australian usage, instances of woman/women had
increased dramatically over the last thirty years, while by comparison, lady/ladies had
decreased (Holmes 1999). Holmes and Sigley (2002) noted a steady increase of references to women between 1961 and 1991 – a pattern confirmed in the British National
Corpus by Romaine (1999: 139).
Table 3 provides the relative frequencies with which the terms woman/women and
man/men have appeared in written material over a thirty-year period. A consistent
increase in the use of both woman and women can be observed from Brown and LOB
to ACE, WWC and Frown. An increase over time in the use of women can also be
observed between ACE/WWC and the ICE corpora. In New Zealand this could be
related to the prominent public roles that women have played since 1986 (Holmes &
Sigley 2002). At the time of writing (June 2008), women comprise eight of a Cabinet
of 26 members, and have been appointed to four of New Zealand’s top political and
judicial posts – Prime Minister, Chief Justice, Attorney-General and Speaker of the
House of Representatives.
Table 3. Frequency indexes (FI) of woman/women, man/men, female(s) and male(s) in
written corpora¹

            Brown     LOB     ACE     WWC   Frown    FLOB   ICE-NZ (wr)   ICE-AUS (wr)
               FI      FI      FI      FI      FI      FI            FI             FI
woman        2.48    2.67    3.56    4.26    4.78    2.67          3.33           1.97
women        2.23    2.23     6.1    7.43    8.51    4.61          9.23           6.97
man          13.5    11.9     7.1    7.57    8.40    7.38          6.08           3.85
men          7.88    6.41     5.3    5.10    5.56    4.54           7.1           4.72
female       0.54    0.38    0.85    1.11    1.44    0.86          1.53           1.02
females      0.17    0.07    0.15    0.51    0.24    0.17           0.8           0.32
male         0.37    0.40    1.01    1.61    1.77    0.89          2.08            1.7
males        0.19    0.08     0.2    0.59    0.47    0.22          0.95           0.77

Contrasting with the increased use of woman in New Zealand data, there
is a notable decrease in its frequency in the progression from ACE to ICE-AUS.
However, the latter corpus has low frequencies for references to both women and men,
particularly for the singular forms (see further Table 12 in the Appendix). ICE-AUS is
also the first corpus where the overall frequency index for references to men is lower
than the one for women, which could be regarded as a great success for the feminist
movement. However, it is more likely that this peculiar drop in frequencies is produced
by coincidental factors such as the kinds of written texts included in the corpus, rather
than a general shift in language use.
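The comparison of overall indexes can be checked against Table 3 by adding the singular and plural FI values for each gender. The sketch below is illustrative only: treating the sum of the two indexes as the "overall" figure is one reading of the text, and the names TABLE_3 and combined_indexes are hypothetical.

```python
# FI values per 10 000 words copied from Table 3: (woman, women, man, men)
TABLE_3 = {
    "Brown": (2.48, 2.23, 13.5, 7.88),
    "LOB": (2.67, 2.23, 11.9, 6.41),
    "ACE": (3.56, 6.1, 7.1, 5.3),
    "WWC": (4.26, 7.43, 7.57, 5.10),
    "Frown": (4.78, 8.51, 8.40, 5.56),
    "FLOB": (2.67, 4.61, 7.38, 4.54),
    "ICE-NZ (wr)": (3.33, 9.23, 6.08, 7.1),
    "ICE-AUS (wr)": (1.97, 6.97, 3.85, 4.72),
}


def combined_indexes(row):
    """Return (woman + women, man + men) for one corpus."""
    woman, women, man, men = row
    return woman + women, man + men


# Corpora in which references to women outnumber references to men overall
women_ahead = [
    corpus
    for corpus, row in TABLE_3.items()
    if combined_indexes(row)[0] > combined_indexes(row)[1]
]
print(women_ahead)
```

On these figures only written ICE-AUS qualifies (roughly 8.94 for woman/women against 8.57 for man/men), and both sums are well below those of the other corpora from the 1980s and 1990s.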
The figures for the spoken corpora as shown in Table 4 present a similar picture
to the results for the written corpora: the terms woman/women occur more frequently
than in the earlier data. In fact, the plural forms are used at rates similar to those in
the written corpora. This suggests that women and women’s issues not only feature as
important topics in carefully drafted writing of the southern hemisphere, but have also
entered the consciousness and discourses of the general public. Again, the data shows
a noticeable drop in frequencies for woman/women in ART, the most recent corpus.
And again, it appears that references to both women and men are comparatively low.
The fact that this pattern has already been observed for written ICE-AUS suggests a
general trend in the language which reduces explicit gender specification. Furthermore,
the data also suggests that changes like this are implemented in the language over time,
as the drop was first noted in the written corpus and then surfaced in the spoken one
a few years later.

Table 4. Overall frequencies and frequency indexes of woman/women, man/men,
female(s) and male(s) in corpora of spoken English

            ICE-NZ           ICE-AUS          ART
            No.     FI       No.     FI       No.     FI
woman       135   2.25       120      2        39   1.52
women       377   6.28       455   7.58       117   4.57
man         248   4.13       213   3.55        78   3.05
men         143   2.38       179   2.98        26   1.02
female       17   0.28        67   1.12        19   0.74
females       4   0.07        26   0.43         1   0.04
male         48    0.8        68   1.13        31   1.21
males         6    0.1        31   0.52         6   0.23

1. The raw frequency counts were excluded from Table 3 for reasons of space. For a
table including the raw frequency counts please refer to the expanded version (Table 12)
in Appendix 1.

As Holmes and Sigley (2002) point out, the increased references to women in other
corpora are disproportionately associated with the plural rather than the singular form.
Indeed, while men is more frequent in the older corpora, Brown and LOB, women is
consistently more frequent than men in each of the newer ones. This increased use of
the plural form indicates an increase in the discussion of women as a group, not as
individuals. One explanation for this could be the inclusion of interactions and texts
promoting or discussing feminist viewpoints on a variety of issues. The increasing
frequency of women in these corpora can thus be regarded, in part, as a sign of ongoing
consciousness-raising, as shown by examples (1)–(4).
(1) It is strange that feminists, despite frequent rhetoric about the uniqueness or
even the superiority of women’s insights, invariably regard patterns of choice
by women as worse than those made by men. [ACE F15:2915]²
(2) the feminist view that prostitution is a form of exploitation of women, [...].
[WWC G52:024]
(3) [M]en’s control of women’s sexuality places additional constraints on women’s
lives. [Frown F25:091]
(4) It’s based on a feminist philosophy. It aims to empower women by giving
them more choices enhances those who community by assisting them change.
[ICE-AUS S2A-043:67]

2. Corpus markup has been removed from all cited examples. Example references are in
the format (Corpus: Text Line): e.g. WWC G52 024 is from line 24 of text 52 in category G
(Belles Lettres) from WWC.

Other examples of discourse relating to women as a group are discussions of the
changing role of women in society, in particular their role in the workforce and in
politics (spoken ICE-AUS) or health issues such as menopause (ART). Some other
topics of discussion relating to women point towards other, less positive social trends.
For example, increases in reported violence against women could indicate not just
increasing awareness or condemnation, but also a real increase in such behaviour.
(5) In recent years there has been a marked increase in the amount of violent crime
committed in urban centres, especially involving rape and sexual assault. [...]
As a result, urban parks have become increasingly dangerous to women and
unaccompanied children. [WWC F41:002-007]
(6) [...] headlines began to scream all over the country with a renewed, shall
we say, outbreak of visible incidents involving the victimization of women.
[Frown H03:057]
(7) In New South Wales especially in nineteen eighty seven approximately
thirty thousand serious assaults against women by men were reported to
auth authorities. [ICE-AUS S2B-044:91]
Still other examples display a more reactionary stance:
(8) After all women are only what their husbands are. [WWC L20:159]
According to Holmes and Sigley (2002), the frequency of the singular woman can be
considered a closer index of real social progress towards gender equality, at least in
terms of whose stories are judged important enough or interesting enough to write
about and publish. The raw frequency of such references seems to have increased in
writing in New Zealand, the United States and to a lesser extent in Australia, but not in
Britain and quite notably not in either of the spoken corpora. It is also noticeable that
after a steady increase in the frequencies for both woman and women, the ICE corpora
show an increase in the plural but a drop in the use of the singular form. It remains
to be seen whether this drop in frequencies reflects the demise of feminism-inspired
political correctness with regard to the public discussion of women, an actual decline
of women in prominent public positions, a return to male-centric publishing or some
other methodological reason (e.g. the types of texts included in the corpus).
In any case, despite the overall evidence of progress, it is still true that far fewer
individual women than men are singled out as topics of discussion in these corpora,
so that equality in the workforce and access to more powerful positions remains a
legitimate feminist goal. This trend seems to be particularly prominent in the spoken
data where the frequency indexes for references to individual women are markedly
low, while references to women as a group are comparatively high. Meanwhile men
are referred to more frequently as individuals in the earlier corpora, but less often in
data from the ICE corpora (Tables 3 and 4). The inclusion of generic uses of man and
Man! as an exclamation does not pose a problem for this claim, since the frequencies
for these uses are rather low (see further below).
In addition to the quantitative analysis, it is important to examine in more detail
whether this increase in references to women is accompanied by changes in attitudes
and usage. Even if a greater number of individual women are mentioned in texts, this
is not necessarily a positive step if they continue to be mentioned in sexist ways. For
example, while FLOB contains several references to female Prime Ministers, these are
less than complimentary:
(9) Nehru also harboured a protectionist obsession even more paranoically than
does the troubling new woman prime minister of France. [FLOB B12:040-042]
(10) She believes that “the so-called womanly qualities – like listening and
supporting – make for ideal MP’s.” Those are qualities that she feels Margaret
Thatcher lacks: “I think it’s an absolute tragedy that she was the first woman
Prime Minister. She’s done nothing for women and is so unwomanly, so
uncaring. She’s a role model for politics that young women shouldn’t follow.”
[FLOB F13:155-161]
The fact that comparable figures for the terms man/men have declined rather
than increased throws the increase in woman/women into even starker contrast.
Figures for man/men are complicated by their (pseudo-)generic usage, though that
does appear to be decreasing.³ The consistent decrease in frequency of man/men
between Brown/LOB and ACE/WWC/Frown/FLOB and ICE-NZ/ICE-AUS seen
in Table 3 may be due partly to this change in usage; however, this cannot account
for all of the observed shift (since pseudo-generics accounted for at most 1 in 3
of man/men tokens even in the older corpora). It should be noted that, in the ICE
corpora, men are more often referred to as a group than as individuals, mirroring
the patterns observed for women more closely. This could suggest an increased
public interest in men’s issues such as men’s health problems, general gender issues
and differences as well as an increase in public visibility and interest in homosexuality
and gay men.
(11) In politically correct circles the non-existence of men’s health at the expense of
women’s is called “positive discrimination” [ICE-AUS W2B-012:39]
(12) A survey of 535 gay and bisexual men in New South Wales and the Australian
Capital Territory was designed in the light of practice-based analyses of gender
and sexuality [ICE-AUS W2A-011:19]

3. An analysis of the relative frequency of pseudo-generic vs non-generic instances of
man in the New Zealand press found that by 1986 no more than 16% of instances were
(pseudo-) generic, a decline of about 50% over the last twenty-five years if comparison
with the 1961 LOB Corpus is used as an indication of earlier usage patterns (see Holmes
1994, 2001).

In their 2002 paper, Holmes and Sigley noted that the decrease in the use of man/men
may be related to an increase in the use of male(s), to clarify that male-only reference
is intended instead of a generic: and indeed, a marked increase in frequency of male(s)
can also be observed in data from the more recent corpora in Tables 3 and 4. At the
same time, there has been an almost exactly parallel increase in female(s), so it is
more likely that these terms indicate an increase in explicit gender comparisons. The
trend towards increased use of these terms seems particularly pronounced in both
the written and spoken ICE-AUS data. In this corpus, frequencies for woman/women
and man/men appear low when compared to the other more recent corpora, while the
results for female(s) and male(s) are similar, making their use proportionately more
prominent within the set of gender-marked terms. However, it is also possible that the
overall drop in gender references in the ICE-AUS and ART corpora indicates a move
away from gender-marking to gender invisibility in AusE.
We turn now to examine the trends in the occupational labels in the corpora and
then proceed to explore some of the socio-semantic implications of the use of certain
items in different contexts.
3. Occupational terms
The increase in references to women gives us evidence of increased interest in and
description of women as a group. But it is also important to examine what perspective this description takes. One extremely important arena of social change over the
last 40 years has been the workplace, where older divisions of labour and gender
differences in salary have come under scrutiny. For this reason, occupational terms
are potentially an important area where language can provide some indication of
social change in progress.
Gendered occupational terms are of special interest because linguistically
unmarked items indicate that traditional expectations have not been disrupted, while
marked items signal a dangerous transgression of established boundaries, threatening the status quo. For example, Romaine (1999: 130) draws attention to the lack
of gender parallelism in the occurrence of terms such as family man but not family
woman (what else would she be after all?), and career woman or even career girl (but
rarely career man). As she puts it,
Expressions such as career woman/lady/girl count as two strikes against women.
On the one hand, they suggest that as women, females can’t be real professionals,
and on the other, they suggest that as professionals, females can’t be real women.
(Romaine 1999: 131).
In their analysis of occupational terms in LOB, FLOB and WWC, Holmes and Sigley
(2002) found that female occupational terms occurred more often in the latter corpora.
This extended analysis of those terms also considers data from ACE and ICE-AUS
(ICE-NZ was excluded for technical reasons), and involves compiling a list of
all gendered occupational labels found in the corpora, following Holmes and Sigley’s
model. This list includes occupational labels headed by the alternative neutral
suffixes -person and -people (e.g. spokesperson, business person). The term ‘occupation’
was interpreted in a relatively wide sense, to include terms such as chairperson,
professional man and women slaves. Terms relating to sports, hobbies, and military
roles (e.g. batsman; horsewoman; seaman; swordsman; infantryman) were excluded in
Holmes and Sigley (2002) on the basis that they did not denote a proper profession at the
time or, in the case of military labels, excluded women. While it can be argued that the
terms sportswoman/man can nowadays be used as occupational terms, the items
have been excluded to maintain consistency across the data.
The following search patterns were used:
i. words containing -man, -woman, etc. as a bound suffix, e.g. spokesman, chairwoman,
frontperson
ii. gendered heads preceded by an occupational indicator (including hyphenated
tokens): e.g. Qantas girls, maintenance man, call-girl
iii. gendered phrasal premodifiers followed by an occupational indicator, e.g. woman
doctor, male nurse, lady teacher
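The three search patterns can be approximated with regular expressions. The Python sketch below is an assumption-laden illustration rather than the procedure actually used in the study: the word lists OCCUPATIONS and FALSE_FRIENDS are hypothetical and deliberately tiny, and any real search of this kind would still need manual checking of the candidates it returns.

```python
import re

# Hypothetical, deliberately small word lists (not taken from the study)
OCCUPATIONS = {"doctor", "nurse", "teacher", "judge", "maintenance", "qantas"}
FALSE_FRIENDS = {"woman", "women", "human", "humans", "german", "roman", "specimen"}

# (i) gendered or neutral element as a bound suffix, e.g. spokesman, frontperson
SUFFIX = re.compile(r"\b[a-z]+(?:woman|women|man|men|person|people)\b", re.I)
# (ii) occupational indicator followed by a spaced or hyphenated gendered head
HEAD = re.compile(
    r"\b([a-z]+)[ -](man|men|woman|women|lady|ladies|boys?|girls?|persons?|people)\b",
    re.I,
)
# (iii) gendered premodifier followed by an occupational indicator
PREMOD = re.compile(
    r"\b(man|men|male|woman|women|female|lady|boy|girl)\s+([a-z]+)\b", re.I
)


def find_candidates(text):
    """Return candidate tokens for classes (i), (ii) and (iii)."""
    bound = [m.group(0) for m in SUFFIX.finditer(text)
             if m.group(0).lower() not in FALSE_FRIENDS]
    heads = [m.group(0) for m in HEAD.finditer(text)
             if m.group(1).lower() in OCCUPATIONS]
    premods = [m.group(0) for m in PREMOD.finditer(text)
               if m.group(2).lower() in OCCUPATIONS]
    return bound, heads, premods


sample = ("The spokesman met a woman doctor, a male nurse and the maintenance man; "
          "the chairwoman thanked a German frontperson.")
bound, heads, premods = find_candidates(sample)
print(bound, heads, premods)
```

The stoplist is needed because a purely orthographic search for -man also retrieves words such as German or human, which is one reason why pattern-based retrieval of this kind has to be followed by inspection of each token in context.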
Classes (i) and (ii) can be usefully grouped together as consisting of occupational
descriptors followed by gendered heads. These categories show much the same
trends between corpora, so it is of lesser importance (except perhaps as a rough indication of how recent a given formation is) whether the gendered head is represented
orthographically as within the word or as a separate word. The overall distribution
of these gendered heads is shown in Table 5.
Table 5. Gendered heads following occupational descriptors (including unspaced, spaced
and hyphenated forms) in written corpora

Head               LOB           FLOB          WWC           ACE           ICE-AUS (wr)
                   No.    FI     No.    FI     No.    FI     No.    FI     No.    FI
man/men            434  4.34     408  4.08     342  3.42     337  3.37      70  1.75
woman/women          8  0.08      13  0.13      22  0.22      12  0.12       5  0.13
lady/ladies         12  0.12      11  0.11       6  0.06       6  0.06       0     0
person(s)/people    18  0.18      23  0.23      40   0.4      31  0.31      54  1.35
boy(s)              11  0.11      10   0.1       6  0.06       7  0.07       3  0.07
girl(s)             11  0.11       9  0.09       2  0.02      10   0.1       1  0.02

While LOB and FLOB provide only slight evidence of a decrease in the use of
-man terms in BrE, such terms are less frequent in WWC (χ²(2 df) = 11.39, p = 0.003)
and ACE, and they occur even less often in ICE-AUS. Occupational terms headed by
lady/ladies, boy(s) and girl(s) also show little difference between LOB and FLOB, but
appear to be dispreferred in the southern hemisphere. However, the numbers of tokens
for these three heads are very small.
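The text does not state how the chi-square value reported for the -man terms was calculated. One reading that comes close to the reported figure is a goodness-of-fit test on the three raw counts for man/men in LOB, FLOB and WWC (Table 5), with equal expected counts on the assumption that the three corpora are the same size. The Python sketch below follows that assumption; the function names are hypothetical, and the closed form used for the p-value holds only for two degrees of freedom.

```python
import math


def chi_square_equal_expected(observed):
    """Goodness-of-fit statistic against equal expected counts."""
    expected = sum(observed) / len(observed)
    return sum((count - expected) ** 2 / expected for count in observed)


def p_value_2df(statistic):
    """Upper-tail probability of a chi-square variable with 2 degrees of freedom."""
    return math.exp(-statistic / 2)


# Raw counts of occupational terms headed by man/men (Table 5): LOB, FLOB, WWC
man_counts = [434, 408, 342]

stat = chi_square_equal_expected(man_counts)
p = p_value_2df(stat)
print(f"chi-square = {stat:.2f}, p = {p:.4f}")
```

Under these assumptions the statistic comes to about 11.40 with p of about 0.003, which agrees with the reported values to within rounding. This suggests, but does not confirm, that a test of this kind lies behind the figures in the text.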
By contrast, terms in -woman and -people both show a slight increase between LOB
and FLOB, and are almost twice as common in WWC as in FLOB. The Australian data
meanwhile does not show more frequent use of -woman terms, but a marked increase
in the use of person/people constructions from ACE to ICE-AUS. The frequencies for
spoken ICE-AUS, shown in Table 6, sit between ACE and written ICE-AUS, but are
closer to those of ACE.
Table 6. Gendered heads following occupational descriptors (including bound morphs,
hyphenated forms, and phrases) in spoken corpora

Head               ICE-AUS (sp)   ART
                   No.    FI      No.    FI
man/men             87  1.45       15  0.59
woman/women         10  0.17        0     0
lady/ladies          1  0.02        1  0.04
person(s)/people    34  0.57        3  0.12
boy(s)               5  0.08        0     0
girl(s)              0     0        0     0

The ART corpus does not provide many occupational terms at all, which suggests
that care should be taken not to over-interpret the data. However, the fact that the
Australian talkback radio corpus does not include a single occupational term headed
by -woman but at least a few headed by -person/people could be taken as a further
confirmation of a preference for gender neutral terms in AusE. Overall, these figures
reflect the decrease in man/men, the increase in woman/women (in all but the Australian data), and the decrease in girl(s), boy(s) and lady/ies that was noted by Holmes and
Sigley (2002).
The trends with occupational terms premodified by gender (Table 7 and Table 8)
are largely consistent with the general trends noted above. However, one linguistic asymmetry is immediately evident: the productive use of woman/women, but not man/men,
as an occupation premodifier (with men servants from LOB and men workers in spoken
ICE-AUS as the only occurrences), so that it may be possible to talk of a woman doctor,
but impossible to remark on a *man doctor. This is consistent with men having a continuing unmarked status in relation to most occupations, while women continue to be
marked, and it is also consistent with the noted drop in the use of -man and increase of
-woman in occupational terms. Yet the frequency of woman/women in premodification
seems to have dropped in recent years, and overall appears to be low, so that it is difficult
to draw any firm conclusions.
Table 7. Occupational terms with gender premodification in written corpora

Premodifier     LOB           FLOB          WWC           ACE           ICE-AUS (wr)
                No.    FI     No.    FI     No.    FI     No.    FI     No.     FI
man/men           1  0.01       0     0       0     0       0     0       0      0
male              2  0.02       7  0.07      17  0.17      13  0.13       5  0.125
woman/women       7  0.07      22  0.22      14  0.14      10   0.1       3  0.075
female            4  0.04      11  0.11      11  0.11       9  0.09       3  0.075
lady              1  0.01       2  0.02       2  0.02       0     0       0      0
boy               1  0.01       1  0.01       2  0.02       0     0       0      0
girl              5  0.05       0     0       0     0       1  0.01       0      0

Table 8. Occupational terms with gender premodification in spoken corpora

Premodifier     ICE-AUS (sp)    ART
                No.     FI      No.    FI
man/men           1  0.017        0     0
male              5  0.084        0     0
woman/women       8   0.14        2  0.08
female            2   0.04        3  0.12
lady              0      0        0     0
boy               0      0        0     0
girl              0      0        0     0

The implications of this data are nevertheless worth further discussion. Firstly,
at least for the moment, gender-marking using male is increasing alongside forms
with female and woman/women, suggesting a more general trend towards marking
of both genders where appropriate. At the same time, unmarked terms (using -person)
are increasing as well, in particular in the more recent Australian corpora, which
indicates a trend towards avoidance of gender-marking where it would be inappropriate.
Overall, then, we may expect that the choice of whether or not to mark gender
in occupational contexts is now more determined by (prescriptive) societal, rather
than linguistic rules. That societal norms are indeed trending towards avoidance of
gender-marking can be seen in the official description and classification of occupations
published cooperatively by Australian and New Zealand government statistics agencies
(the Australian and New Zealand Standard Classification of Occupations 2006),
which was compiled in order to facilitate comparisons between the Australian and
New Zealand job markets. This list of job titles contains hardly any gender-markings,
with traditional occupations like fireman being replaced by the gender neutral
firefighter, fire officer and fireperson. Among the very few exceptions are the military
job of a crewman, which appears to be a job exclusively open to males, and occupations
such as dogman/dogwoman, which have a male and a female version. One sign that the
changes in marking gender of occupational terms are conscious – or even externally
imposed – is that they are occasionally directly discussed as in (13) or overtly parodied
as in (14).
(13) Phrases such as “craftsperson” and “wait person” forget that effective language
requires simplicity. We should be reaching for simple alternatives e.g. “waiter”
instead of “waitress” not “wait person”. [ACE B26:5898]
(14) […] there’s a woman here in New York named Dina Goldfield who has ah
suggested that the syllable man must be replaced in all written and ah printed
speech and language because men have had had the best of it for too long so
she suggests that in in commercials and where ever else you would replace the
syllable man with with the syllable fem just to make things more even. So instead
of saying manhole you would say femhole. You would say instead of amendment
you would say afemdfemt while commencement becomes confemcefemt and it
trips right off the tongue I’m sure you’ll admit. [ICE-AUS S2B-013:164]
Meanwhile, as Holmes and Sigley (2002) suggest, the trivializing or pseudo-polite postmodifier lady seems to be disappearing; this also suggests a conscious change in attitudes.4 In this light, the general rise in the use of woman markers could be interpreted
as encouraging evidence that women are increasingly gaining attention as they break
into professional areas they have not previously occupied. However, the recent leveling
and even slight drop in the use of gender-markings, at least in AusE, could mean that
women are now fully accepted in many professions in Australia, and that this initial
phase of gender visibility is giving way to that of gender neutrality.
4. See Holmes (1999) for a discussion of the socio-semantics of lady in current usage.
3.1 Contextual considerations
3.1.1 Gender-marking and sexism
The analysis of occupational terms raises some interesting issues which require greater
attention to their contexts of use. Firstly, it is worth considering the general underlying
assumption that marking gender in relation to occupations is unwarranted, and thus
provides evidence of sexist attitudes. In fact, there are only a few instances where the
gender-marking is clearly superfluous to the text, as in (15):
(15) A woman journalist insisted that unemployment was our major difficulty in
Britain. She simply smiled disbelievingly at the statement that, in fact, there
were more situations vacant than people looking for jobs. [LOB B21:118]
Yet sexist attitudes may surface even where the gender-marking itself is justified. For
example, the marking in (16) is unexceptionable, as it makes an explicit comparison
between women and men – albeit one founded on sexist stereotypes.
(16) “Though there are not many women registrars yet, I think we can give men
registrars a lead in some ways,” she told me with a smile. “Men may be more
efficient and businesslike, but on the personal side of Births, Deaths and
Marriages women have a more sympathetic approach.” [LOB F18:027-031]
In practice, most examples of premodification by woman/women in the later corpora
can be attributed to consciousness-raising and explicit contrast between genders. This
supports our generally positive interpretation of the increased frequency of woman/
women in these corpora.
(17) Women artists have never been excluded from culture, but they have occupied
and spoken from a different place within it. [ACE F36:7193]
(18) The Labour Research survey shows there are only 20 women judges out of the
527 surveyed, a miserable 4% and an increase of only three over the last five
years. There is only one woman judge in the Court of Appeal and one in the
High Court. [FLOB G72:130]
(19) We don’t have any any women judges on the Supreme Court or the County
Court [ICE-AUS S1B-024:31]
There are only a handful of occupational woman tokens in the data which cannot be
explained in this way; and these occur in fiction or in discussion of historical periods, where we may expect presentation of unreconstructed sexist viewpoints without
authors necessarily agreeing with them. We can safely assume that the writer of (20),
for example, is being sarcastic.
From chairman to chairwoman to chairperson 
(20) [After WWII] the life of the nation, freed of the necessary wartime aberrations
such as women tram conductors, could resume its rightful, natural pattern: Dad
at work, Mum at home with the kids. [WWC G40:048]
3.1.2 Occupational labels ending in -man
A second important concern is the kinds of occupations which are associated with
different heads. To some extent, these may merely reflect (with greater or lesser
accuracy) real-world differences in employment. For example, on examining the
kinds of occupations associated with the head -man (singular and plural versions),
Holmes and Sigley (2002) noted a general decrease over time in the range of occupational terms taking this form. This decrease can also be observed in the more
recent data examined here. Tables 9 and 10 illustrate this trend in
the 1 million word written corpora (for items with 5 or more tokens), and in the
smaller, more recent ICE corpora (for items with 3 tokens or more). The range of
these occupational terms is particularly small for written ICE-AUS and the set of
spoken corpora, with chairman, spokesman, businessman and policeman emerging
as the most frequent -man labels.
Table 9. Occupational terms headed by -man in the written corpora
LOB                      FLOB                     WWC
chairman          119    chairman          111    chairman          109
craftsman          29    spokesman          50    spokesman          35
alderman           25    policeman          39    fisherman          26
spokesman          22    fisherman          31    policeman          24
statesman          16    businessman        23    businessman        19
fisherman          16    fireman            12    ombudsman          14
policeman          15    salesman           10    headman            13
business(-)man     14    clergyman          10    tradesman           8
clergyman          11    watchman            9    foreman             6
workman            10    craftsman           8    craftsman           5
barman             10    churchman           6    barman              5
ombudsman           9    tradesman           5
salesman            8    statesman           5
cattleman           7    doorman             5
tradesman           6
foreman             5
ACE                      ICE-NZ (wr)              ICE-AUS (wr)
chairman           50    chairman           33    spokesman          35
spokesman          47    fisherman          14    chairman           17
policeman          26    spokesman          11    ombudsman           3
businessman        16    businessman         3    policeman           3
junkman            16    ombudsman           3
ombudsman          10    salesman            3
barman              9
cameraman           6
deskman             6
fisherman           6
salesman            5
Table 10. Occupational terms headed by -man in spoken corpora
ICE-NZ (sp)              ICE-AUS (sp)             ART
chairman           35    spokesman          15    policeman           7
spokesman          11    chairman           14
policeman           8    policeman           6
businessman         8    businessman         4
foreman             5    salesman            4
                         money man           3
At the lower-frequency end, many of the -man items in LOB constitute conservative
or old-fashioned labels for occupations, some of which are now relatively rare, or have no
exact modern equivalents: e.g. crossbowman, liveryman, herdsman, cowman, coachman.
The predictable decrease in reference to such occupational classes certainly contributes
to the overall decrease in occupational -man items between LOB and FLOB. Another
major contributing factor to this decrease is the official adoption of gender neutral terms
mentioned above.
But there are still several areas where the choice of label makes a difference to
meaning. One of these is the use of lady. As Lakoff (1975) suggested, the head and
premodifier lady is used as a politeness device to add or attribute status to the referent,
and not surprisingly, this is most necessary in relation to “demeaning” jobs, such as
cleaning lady and tea lady. The only occupational term in LOB, FLOB and WWC as
well as ACE is landlady, but the American corpora do contain a saleslady (Brown) and
an Elevator Lady (Frown), as well as professional ladies, or ladies of the night. Interestingly,
the other items found in ACE were less condescending, such as funny lady, used to refer
to a comedic radio host, and a reference to a former First Lady of the United States.
Nevertheless, as Holmes and Sigley (2002: 256) point out, due to the generally demeaning connotations of the word lady, serious occupations such as judge and doctor tend
to be modified by the terms woman/women in current usage (see Holmes 1999). As
a premodifier, lady is used for a curiously limited set of occupations in these corpora,
mainly concerned with writing: we find references to a lady singer (LOB), lady novelists
and a lady writer (FLOB), and a lady editor and lady poet (WWC). These closely parallel
suffixed forms such as poetess or editress, which tend to suggest dilettantish scribblers
rather than serious scribes (Holmes 1993b: 360). Lady as a premodifier does not occur
in any of the spoken or written corpora of AusE, which further supports the notion
that this construction is on its way out. However, a preliminary search of spoken and
written ICE-NZ found two items, lady editor and lady councilor. This could be taken as
additional evidence that AusE is further ahead than other varieties of English in terms
of the implementation of gender neutrality.
3.1.3 Chairman
Alternative applications (male or female) are still available for some labels with -man.
The persistent use of -man as a (pseudo-)generic means we cannot always be sure of
the extent to which -man/men terms are restricted to male referents. Most of the tokens
in the corpora have specifically male referents, reflecting the reality of a working world
still dominated by men. Nevertheless, there are also several instances of occupational
terms ending in -man which are used to refer to women. Chairman is the easiest case
to identify, as it is generally used to refer to named individuals.
(21) He is, he adds: “A very great supporter of chairman Mrs Miscampbell.”
[FLOB A42:024]
It has been suggested (e.g. Salter-Duke 1983: 18; Ehrlich & King 1994; Romaine 1999)
that terms such as chairperson might in practice be used to refer mainly to women
(and so mask women’s presence). However, even though the figures in Table 11 suggest
that the use of chairperson has become progressively more popular in AusE and NZE,
Table 11. Frequency index scores for chairman, chairperson, chairwoman, and chair in
written corpora
chairman/men
male individual
female individual
no information
chairperson
male individual
female individual
no information
chairwoman
chairpeople
chair (verb)
male subject
female subject
general/collective
chair (=position)
chair (=person)
male individual
female individual
general/collective
Brown LOB Frown
FLOB WWC
ICE-NZ ACE
(wr)
ICE-AUS
(wr)
0.79
0.50
0.19
0.09
0
1.19
0.85
0.03
0.31
0
1.11
0.68
0.05
0.38
0
0
0
0
0
0
0.1
0.01
0
0
0.08
0.01
0
0
0.01
0.825
0.575
0.025
0.225
0.2
0.05
0.05
0.1
0.025
0.05
0.08
0.03
0
0.05
0.03
0
0.57
0.35
0
0.23
0.76
0
0
0.76
0.03
0
0
0.03
0
0.69
0.59
0.03
0.07
0.02
0
0.01
0.01
0
0
0.05
0.03
0.01
0.01
0.03
0.10
0.06
0.04
0
0.01
0
0.07
0.05
0
0.02
0.05
0.03
0.02
0
0.01
1.06
0.76
0.07
0.23
0.06
0.01
0.01
0.04
0.02
0
0.11
0.07
0.01
0.03
0.03
0.01
0
0.01
0
0.50
0.33
0.01
0.17
0.04
0.02
0
0.02
0
0
0.08
0.06
0.01
0.01
0
0.01
0
0.01
0
0.08
0
there is little evidence of any gender preference in their use. Among the few tokens
found in ACE, Frown and WWC (no tokens were found in the earlier corpora and
only 1 in all of the spoken corpora), only 2 clearly referred to a woman, whereas 3
were used to refer to men. Even among the 8 tokens found in ICE-NZ only 2 clearly
refer to women and 2 to men. Interestingly, the 31 tokens in written ICE-AUS all
occurred in a legal document outlining the responsibilities of a chairperson of a club
or an organization and therefore the title does not refer to any person in particular.
While this use does not show the adoption of this neutral form by the wider society,
it further exemplifies the policy to use nonsexist gender neutral language in official
writings. Nevertheless, many other occupational terms cannot straightforwardly be
assigned a gender in context. Spokesman and spokesperson in particular are almost
always used without naming or even pronominalizing the referent. So there may be a
level of concealed sexist usage in the corpora, where terms such as postman or spokesperson are used to refer to a woman, without this being evident from the context
(Holmes & Sigley 2002).
A rather striking point to note in the data is the occurrence of the term chairpeople
as a plural form in ICE-NZ. Even though only 2 instances could be found in all
corpora, this use is an interesting linguistic innovation, almost certainly generated
from the search for gender-neutral terminology, and its further development would
be worth watching.
(22) Community board members receive a salary of $3000 and their chairpeople
$10 000. [ICE-NZ W2C-001:25]
Another positive, though less direct, piece of evidence for increasing avoidance of
chairman is provided by a very slight increase in the use of chair in several alternative
constructions: as a verb (23); as a noun referring to the position (24); and as the title
of the occupant (25).
(23) When all of them had steaming cups from the famous Bremmer gas ring,
Harry made several polite attempts to chair proceedings, then gave up and
sipped his tea. [ACE M04:642]
(24) LABOUR OUTVOTED – SO A TORY GETS THE CHAIR [LOB A28:089]
(25) I’ve asked them to come along and to sort of start us of and ah I myself um
as chair will be talking to that sheet of paper that you have there which ah.
[ICE-AUS S2A-034:11]
The data available suggests that the verb is favoured in NZE, while chair is used in
AmE as a direct equivalent of chairperson, and the recent AusE data indicates a slight
preference for the use of chair to refer to the position. However, the frequencies for
these items are rather low, so that these findings cannot be considered conclusive.
Overall, this survey of occupational terms reveals a consistent series of shifts
(albeit small) within all the varieties of English under consideration. This appears
to be partly driven by a decrease in reference to more traditional rural or historical
roles, and partly a result of increased use of contemporary and gender-neutral
alternatives for professions where women are competing in the real world (including terms headed by -person, as well as more idiosyncratic items such as chair or
firefighter) which, within the limits of our data, seem to be used as true epicenes
rather than covertly marking female gender. At the same time, after an increase in
marking women’s presence in the workplace in WWC (whether by specific labels
headed by -woman, or by premodification), explicit marking of women appears
to have dropped again in the recent AusE data. This could indicate that the stage
of visibility of women in the workplace has given way to the next stage where
occupations take on non-gender-specific markers such as -person, officer, manager, consultant, etc. This would suggest that these jobs are open to all now – at least
in theory. However, since there was no evidence of a previous surge in the use of
markers of women, it is also possible that this stage of making women’s roles visible
was skipped in Australia, and job title conventions moved straight from male
domination to gender neutrality, hiding women’s participation in the labour force.
Generally speaking, then, the New Zealand data emerges as more progressive than
FLOB for some of these changes, while Australia seems to be ahead of New Zealand
in terms of neutral job labels.
4. Conclusion
The data used in this research, gathered over a 45-year period, shows significant differences in the frequency of terms referring to women and men. References to women
more than doubled between 1961 and 1991, and then ebbed again in the late 1990s, at
least in Australia and New Zealand. Meanwhile references to men have decreased continuously over the same period across all varieties of English, partly through avoidance
of (pseudo-) generic man. In the same period, the frequencies of female(s), male(s)
and people/person(s) have increased dramatically, suggesting more frequent explicit
comparison of gender groups.
Intercomparison of the occupational terms in all the corpora investigated
confirms this trend of increasing reference to women, and increased discussion
of gender issues. Holmes and Sigley (2002) found clear evidence of an increase in
gender premodification, and in the use of forms suffixed by -woman, between 1961
and 1986/1991. However, the Australian data does not support this observation,
since the frequencies remain in the mid range and even decrease in the most recent
ICE-AUS. The more recent data used in this paper seems to show an ongoing move
towards using gender-neutral markers, at least in AusE. In NZE and BrE, women
continue to be linguistically marked in occupational terms compared to men – even
if, as was argued, such usages need not be regarded negatively. Almost all examples from the more recent corpora can be attributed to consciousness-raising and
explicit contrast between genders, and this increase in gender-marking reflects both
increased real-world participation, and continued attention to equal opportunity
issues. Yet the quite marked move away from gender-marking that was observed
for AusE could be the result of women being accepted more and more as members
of the workforce and as holders of more powerful positions, which would make
explicit gender-marking unnecessary.
At the same time, there is decreasing use of terms compounded with -man.
This appears to be partly driven by reduced reference to more traditional rural or
historical roles, and partly a result of more contemporary terms being used where
women are competing in the real world, as with terms headed by -person, as well
as individual lexical items such as firefighter. These seem (within the limits of our
data) to be true epicenes rather than covert markers of female gender. Where prescriptions do not prevail, less politically correct usages may still surface, as with
the occasional use of girl in the workplace. However, we have evidence to suggest
that the attention focused on pseudo-generics and similar sexist terminology in the
workplace has had some real effect.
The data from New Zealand written corpora emerges as more progressive than
the British data in relation to these changes, giving greater visibility to female-marked
terms and indicating a greater awareness of feminist issues – at least among New
Zealand writers. On the other hand, the more recent Australian data suggests a move
towards gender neutrality, which could either indicate acceptance and inclusion of
women or a stronger push towards politically correct language. Overall, then, it can
be said that there are differences in nonsexist language use between the northern
and the southern hemisphere varieties, with the southern forms both showing social
progress towards gender equality, but marking this in different ways. With regard to
the differences between written and spoken English, the data indicates that the general trends observed in written usage are usually not as pronounced in spoken usage,
though they are still reflected there.
In conclusion, then, while women continue to be the linguistically marked
gender, there is evidence to support a positive interpretation of many of the
patterns identified in the most recent corpora, since the relevant marked contexts
reflect inroads made by women into occupational domains previously considered as
exclusively male.
References
Australian Bureau of Statistics and Statistics New Zealand. 2006. Australian and New Zealand
Standard Classification of Occupations. <http://www.abs.gov.au/AUSSTATS/abs@.nsf/
DetailsPage/1220.02006?OpenDocument> (16 April 2008).
Bauer, Laurie. 1993. Manual of Information to accompany the Wellington Corpus of Written
New Zealand English. Wellington: Department of Linguistics, Victoria University of
Wellington.
Chiles, Tina. 2003. “Titles and surnames in the linguistic construction of women’s identities”.
New Zealand Studies in Applied Linguistics 9(1): 87–97.
Ehrlich, Susan & Ruth King. 1994. “Feminist meanings and the (de)politicization of the lexicon”. Language in Society 23: 59–76.
Francis, W. Nelson. 1964. Manual of Information to Accompany a Standard Sample of Present-Day
Edited American English, For Use with Digital Computers. Providence RI: Department of
Linguistics, Brown University.
Green, Elizabeth & Pam Peters. 1987. “Towards a corpus of Australian English”. ICAME Journal
11: 27–38.
Hellinger, Marlis. 2001. “English – gender in a global language”. In Marlis Hellinger &
Hadumod Bussman (eds), Gender across Languages. The Linguistic Representation of
Women and Men. Vol. 1. Amsterdam: John Benjamins, 105–14.
Holmes, Janet. 1993a. “Charpersons, chairpersons and goddesses: Sexist usages in New Zealand
English”. Te Reo 36: 99–113.
Holmes, Janet. 1993b. “Sex-marking suffixes in written New Zealand English”. American Speech
68(4): 357–70.
Holmes, Janet. 1993c. “He-man beings, poetesses, and tramps: Sexist language in New Zealand”.
In Laurie Bauer & Christine Franzen (eds), Of Pavlova, Poetry and Paradigms: Essays in
Honour of Harry Orsman. Wellington: Victoria University Press, 34–49.
Holmes, Janet. 1997. “Generic pronouns in the Wellington Corpus of Spoken New Zealand
English”. Kotare 1: 32–40.
Holmes, Janet. 1999. “Ladies and gentlemen: Corpus analysis and linguistic sexism”. Proceedings of
the 20th Annual Meeting of the International Computer Archive of Modern/Medieval English.
Freiburg: University of Freiburg, 141–55.
Holmes, Janet. 2001. “A corpus-based view of gender in New Zealand English”. In Marlis Hellinger &
Hadumod Bussman (eds), Gender across Languages. The Linguistic Representation of Women
and Men. Vol. 1. Amsterdam: John Benjamins, 115–36.
Holmes, Janet & Robert Sigley. 2002. “What’s a word like girl doing in a place like this?” In Pam
Peters, Peter Collins & Adam Smith (eds), New Frontiers of Corpus Linguistics. Papers from
the 21st International Conference on English Language Research on Computerized Corpora.
Amsterdam: Rodopi, 247–63.
Johansson, Stig. 1978. Manual of Information to Accompany the Lancaster-Oslo/Bergen Corpus of
English, for Use with Digital Computers. Oslo: Department of English, University of Oslo.
Lakoff, Robin. 1975. Language and Woman’s Place. New York NY: Harper Colophon.
Pauwels, Anne. 1997. “Of handymen and waitpersons: A linguistic evaluation of job classifieds”.
Australian Journal of Communication 24(1): 58–69.
Pauwels, Anne. 1998. Women Changing Language. London: Longman.
Pauwels, Anne. 2001a. “Spreading the feminist word: The case of the new courtesy title Ms
in Australian English”. In Marlis Hellinger & Hadumod Bussman (eds), Gender across
Languages. The Linguistic Representation of Women and Men. Vol. 1. Amsterdam: John
Benjamins, 137–51.
Pauwels, Anne. 2001b. “Non-sexist language reform and generic pronouns in Australian
English”. English World-Wide 22(1): 105–119.
Pauwels, Anne & Kellinde Wrightson-Turcotte. 2001. “Pronoun choice and feminist language
change in the Australian Media”. Australian Journal of Communication 28(1): 69–82.
Romaine, Suzanne. 1999. Communicating Gender. Mahwah NJ: Lawrence Erlbaum.
Salter-Duke, Linden. 1983. Woman-Machine Interaction: A computational investigation into
the disparity between feminine and masculine pronouns in the LOB Corpus. MA thesis,
Lancaster University.
Wolfson, Nessa & Joan Manes. 1980. “Don’t ‘dear’ me!” In Sally McConnell-Ginet, Ruth Borker &
Nelly Furman (eds), Women and Language in Literature and Society. New York NY: Praeger,
79–92.
Appendix
Table 12. Raw numbers and frequency indexes (FI) of woman/women, man/men,
female(s) and male(s) in written corpora
Corpus                   woman         women         man           men
                         No    FI      No    FI      No    FI      No    FI
Brown (1961 AmE)         248   2.48    223   2.23    1354  13.5    788   7.88
LOB (1961 BrE)           267   2.67    223   2.23    1194  11.9    641   6.41
Frown (1991 AmE)         478   4.78    851   8.51    840   8.40    556   5.56
FLOB (1991 BrE)          267   2.67    461   4.61    738   7.38    454   4.54
WWC (1986–90 NZE)        426   4.26    743   7.43    757   7.57    510   5.10
ACE (1986 AusE)          356   3.56    610   6.1     710   7.1     530   5.3
ICE-NZ (1990–8 NZE)      133   3.33    369   9.23    263   6.57    288   7.2
ICE-AUS (1991–5 AusE)    79    1.97    279   6.97    154   3.85    189   4.72
Corpus                   female        females       male          males
                         No    FI      No    FI      No    FI      No    FI
Brown (1961 AmE)         54    0.54    17    0.17    37    0.37    19    0.19
LOB (1961 BrE)           38    0.38    7     0.07    40    0.40    8     0.08
Frown (1991 AmE)         144   1.44    24    0.24    177   1.77    47    0.47
FLOB (1991 BrE)          86    0.86    17    0.17    89    0.89    22    0.22
WWC (1986–90 NZE)        111   1.11    51    0.51    161   1.61    59    0.59
ACE (1986 AusE)          85    0.85    15    0.15    101   1.01    20    0.2
ICE-NZ (1990–8 NZE)      61    1.53    32    0.8     83    2.08    38    0.95
ICE-AUS (1991–5 AusE)    41    1.02    13    0.32    68    1.7     31    0.77
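The frequency index (FI) figures in Table 12 are consistent with raw counts normalised to occurrences per 10,000 words. The short Python sketch below illustrates that reading. It is not taken from the chapter: the function name frequency_index, the CORPUS_WORDS dictionary and the nominal corpus sizes (1 million words for Brown, 400,000 words for each written ICE component) are assumptions introduced here for illustration.

```python
# Illustrative sketch only. It assumes FI = occurrences per 10,000 words,
# a reading inferred from the tabulated figures rather than stated here.

# Nominal corpus sizes in words (assumed round figures, not exact counts).
CORPUS_WORDS = {
    "Brown": 1_000_000,
    "ICE-NZ (wr)": 400_000,
    "ICE-AUS (wr)": 400_000,
}


def frequency_index(count: int, corpus_words: int, per: int = 10_000) -> float:
    """Normalise a raw token count to occurrences per `per` words."""
    if corpus_words <= 0:
        raise ValueError("corpus_words must be positive")
    return count * per / corpus_words


if __name__ == "__main__":
    # Raw counts for 'woman' as listed in Table 12.
    for corpus, count in [("Brown", 248), ("ICE-NZ (wr)", 133), ("ICE-AUS (wr)", 79)]:
        fi = frequency_index(count, CORPUS_WORDS[corpus])
        print(f"{corpus}: {count} tokens, FI = {fi:.3f}")
```

Under these assumptions the three counts give 2.48, 3.325 and 1.975, which agree with the tabulated 2.48, 3.33 and 1.97 up to rounding. Normalising in this way is what allows the smaller ICE components to be compared with the 1 million word corpora.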
section iv
Clauses and sentences
Concord with collective nouns
in Australian and New Zealand English*
Marianne Hundt
University of Zurich
In English, nouns like government or team can be used with singular or plural verbs
and pronouns. In the twentieth century, there seems to be a growing trend to use
singular concord with most collective nouns. This change is particularly pronounced
in American English but can also be found in other national varieties of English. The
focus of this chapter is variable concord in Australian and New Zealand English.
Data for the study come from the relevant components of the International Corpus
of English which, unlike the corpora used in most previous studies, offer information
on written as well as spoken usage. Somewhat surprisingly, variability in this area of
grammar is not, primarily, a question of the regional variety investigated. Instead,
it is mainly due to language-internal factors, such as medium (written vs. spoken
usage) or the choice of noun (with some nouns preferring singular, others preferring
plural concord).
1. Introduction
The syntactic variable investigated in this chapter is variable concord or ‘agreement’
after collective nouns like committee, government or family. These can be used with
either singular or plural verbs and pronouns in English, as in The government has/have
managed to create a full-scale crisis of confidence in its/their unity. This dual concord
is of interest for several reasons. Ongoing linguistic change in the twentieth century
shows a tendency towards singular rather than plural concord. This ongoing change,
in turn, may result in regional variation because change happens at differential speeds
in different varieties, and some turn out to be more advanced than others.
In the following, I will briefly summarize the main results from previous studies
on diachronic, regional, and stylistic variation concerning agreement with collective
nouns in English. Syntactic factors that influence the variability of concord patterns will
also have to be taken into account.1 Previous studies (Hundt 1998 & Levin 2001) were
based on newspaper databases rather than more stratified reference corpora. Furthermore, AusE and NZE have, so far, not been compared directly. To test the hypotheses
on regional and stylistic variation, corpus data on variable concord in AusE and NZE
were collected from the respective components of the International Corpus of English,
ICE-AUS and ICE-NZ. Comparative data come from ICE-GB.
*I am grateful to Nina Störiko and Martin Schendzielorz, who helped with the initial retrieval
of data.
2. Previous research and hypotheses
2.1 Diachronic variation
Ongoing language change in the twentieth century only provides the background against
which patterns of regional, stylistic and language-internal variation are investigated because
diachronic evidence is not available for AusE and NZE. At the end of the twentieth century, AmE is more advanced
in the use of singular concord than AusE, NZE and BrE
(cf. Siemund 1995: 366ff.; Hundt 1998: 88–9; Levin 2001: 36–7 and 86ff.).
It is important to note in this context that singular concord is not simply an American
innovation. Historically, agreement of formally singular collective nouns with singular
verbs goes back to the earliest stages of English, but plural forms can also be found as early
as 1000 (cf. Marckwardt 1958: 77). Consistent use of plural concord with certain nouns
is a much later phenomenon in BrE (Marckwardt 1958; Levin 2001: 36). Preliminary evidence from corpora and text databases of eighteenth- and nineteenth-century English on
eleven nouns (Hundt 2009) in fact suggests that the singular has always remained a latent
option in both AmE and BrE. Levin (2006) does not find a clear diachronic trend towards
singular concord for a set of low-frequency collective nouns, either.2
1. Depraetere (2003) and Levin (2001) have convincingly shown that semantic factors (such
as notional concord or the semantics of the verb) do not play an important part in this area
of variable grammatical rules.
2. More detailed, long-term studies that investigate comparable sets of nouns are still needed,
especially since lexicogrammatical variation seems to play an important role in this area of
variable usage (see Section 3.2.3 below).
2.2 Regional variation
With respect to regional variation, various sources (cf. Quirk et al. 1985: 16–17; Biber et al.
1999: 188; Trudgill & Hannah 2002: 70) found that singular concord is most frequently
used in AmE; plural concord is used most frequently in BrE; while varieties like AusE and
NZE take an intermediate position (Hundt 1998: 83; Levin 2001: 60–70).3 As Depraetere
(2003: 112–13) remarks, “from a sociolinguistic point of view, the preference for the
singular may reflect the pecking order among the different varieties of English: American
English […] is beginning to set the norm for British English.” This runs counter to
Bauer’s (1994: 61–6) argument: he points out that the development within BrE must
have taken place independently of influence from AmE because in his data, singular
forms start increasing in BrE in the 1930s, i.e. at a time when influence from AmE
through mass media and increased global mobility was not as widespread as it is today.
As pointed out in the previous section, long-term evidence on differential change in
AmE and BrE (Hundt 2009) also suggests independent developments in the two varieties (with a revival of a latent option in BrE) rather than direct influence from AmE
onto BrE. The same is likely to hold for AusE and NZE.
2.3 Stylistic variation
As far as stylistic variation is concerned, the general tendency in all varieties is
that singular concord is preferred in more formal styles (with the exception of BrE
officialese, see Fries 1981: 23ff.; Hundt 1998: 88; Mollin 2007: 201). In more informal styles (e.g. those used in sports reportage) the likelihood of plural concord
increases (cf. Levin 2001: 85). Even though ‘written’ and ‘spoken’ language are obviously a question of medium rather than style, singular concord is more likely to
occur in the written medium and plural concord in the spoken medium, especially
in informal conversation.
2.4 Language-internal variation4
2.4.1 Verbal vs. pronominal concord
Overall, pronouns used after collective nouns are more likely to yield plural marking than verbs (Nixon 1979: 123ff.; Hundt 1998: 84–6; Levin 2001: 91ff.). One of
the main reasons for this is that verbs are more likely to show a close proximity to
their antecedent (as in example (1)), whereas pronouns are quite likely to occur at
a greater distance. Pronominal concord may even run across sentence boundaries,
as example (2) illustrates.
(1) if the government was really serious about tourism and the future growth of
tourism they’d at least give it some stability [ICE-AUS S2B-010:80]5
3. For concord with collective nouns in Philippine and Singapore English, see Hundt (2006).
Sand (forthcoming) investigated a somewhat larger set of New Englishes but did not include
agreement across the sentence boundary in her definition of the variable.
4. The most detailed study of intra-linguistic variation is provided for BrE in Depraetere (2003).
5. Emphasis in the examples has been added throughout.
 Marianne Hundt
(2) Well the main difference is of course that we have a plan a detailed plan for the
future of Australia and the government doesn’t. They’ve been there for ten years
and all we have is ah an economy of queues [ICE-AUS S1B-029:8-10]
As a result, mixed concord or ‘discord’ (Johansson 1979: 205), i.e. the combination of a
singular verb and a plural pronoun “typically occurs where there is considerable distance
between the co-referent noun phrases; discord is generally motivated by notional considerations, i.e. a tendency towards agreement with the meaning, rather than the form,
of the subject noun phrase” (Biber et al. 1999: 192). Mixed concord or discord shows a
fairly complicated interaction of regional, stylistic, and intra-linguistic variation:
a. mixed concord is slightly more common in AmE than in BrE, NZE or AusE
(cf. Trudgill & Hannah 2002: 72; Hundt 1998: 85; Johansson 1979: 205);
b. mixed concord is more often used in informal and spoken language than in
formal, written language (cf. Levin 2001: 116; Biber et al. 1999: 332);
c. some collective nouns are more likely to yield mixed concord than others, e.g. family and team vs. government and committee (cf. Hundt 1998: 85).
This last aspect brings us to the lexicogrammatical aspect of variation: the preference
for certain concord patterns is linked to individual collective nouns.
2.4.2 Concord patterns with individual nouns
Biber et al. (1999: 188) point out that “[m]ost collective nouns prefer singular concord,
although a few collective nouns commonly take plural concord.” Nouns like audience,
board, committee, government, jury and public belong to the singular-type, staff is given as
a noun that prefers plural concord; examples of nouns that are truly variable according to
their corpus findings are nouns like crew and family. It is for this last group of nouns that
Biber et al. comment on regional differences between AmE and BrE.
In previous studies on national varieties, divergence was most likely to occur at
the lexicogrammatical level (team and family, for instance, are more likely to yield
plural verbal concord in BrE than in AmE).
3. Corpus data
3.1 Definition of the variable
Before we can move on to the results, a few comments on data collection are in order.
One possible approach would have been to limit the corpus searches to those nouns
that Biber et al. (1999: 188) list as being truly variable in AmE and BrE. This, however,
would have meant working on the tacit assumption that the same set of nouns would
have variable concord patterns in AusE and NZE as in the northern hemisphere varieties. Instead, I chose to include most of the nouns listed in Quirk et al. (1985: 316),
excluding only those that occurred very infrequently in one-million-word corpora
(e.g. jury and enemy).6 A total of 35 nouns were included:
army association audience board cast clan class club college commission
committee community company corporation council couple crew crowd
department family federation gang generation government group institute
majority ministry minority opposition party population staff team university
A closer look reveals that they are all collective nouns which refer to humans, a subtype that Nixon (1979: 120) refers to as ‘corporate’ nouns. This also meant that instances
in which words like generation or class refer to an inanimate, non-human entity (e.g.
second generation of mobile phones or a class of acceleration) were not included in the
data. Similarly, whenever the abstract institution or notion rather than a collection of
individuals is referred to, the instances were not included in the data set:
(3) a. Dad – was college like this when you went? [ICE-AUS W1B:88]
b. The historical fact that opposition to foreign control is a venerable part of
Labor’s ideological baggage is not, on its own, a sound basis for the formulation of public policy. [ICE-AUS W2E-002:157]
An exception to this rule was the use of the noun family in an institutional sense
which occurs quite frequently in ICE-AUS. I decided to include these instances
because this use also allows for variable concord, as the following instance of mixed
concord illustrates:
(4) The family is like a ch is an arm that um capitalism uses um their functioning I
think […] [ICE-AUS S1B-015:195-200]
Sometimes the nouns are not instances of the collective use but actually refer to
individuals, for instance in the following uses of the word party which illustrate its
legal sense (in which it is always singular):
(5) a. An indemnity […] involves a contract by which one party agrees to keep
the other party harmless against a loss. [ICE-AUS W1A-015:154]
b. Now a restive party has knifed Mr Brown and his deputy, Alan Stockdale.
[ICE-AUS W2E-004:88]
This also applies to nouns in the above list that are preceded by a numeral, which
automatically makes them require plural concord:
(6) Forty staff have been urged to have medical tests. [ICE-AUS S2B-004:207]
6. On agreement with infrequent collective nouns in BrE, see Levin (2006).
Instances of verbs which are not unambiguously either plural or singular were not
included in the counts. This can be the case wherever the relevant VP occurs in a subordinate clause following mandative expressions like recommended (7) or inevitable (8), which
open up the choice between an indicative, a subjunctive and a periphrastic construction:
(7) […] recommended that the harbour board provide a fleet of modern salvage
tugs […] [ICE-NZ W2B-010]
(8) […] so it is it is inev inevitable that University Council support the students
and the u ah the above two the other two motions [ICE-AUS S2A-040:234]
The subjunctive in hypothetical if-clauses is also a context in which a choice between
singular and plural concord is obscured by another grammatical phenomenon.7
The following example is somewhat problematic because of the intervening
appositive clause that contains plural NPs which might have triggered the plural verb.
I decided to include this instance because, syntactically, the apposition is clearly not
the antecedent of the verb:
(9) During culling, a suitably-sized group of elephants is located from the air and
the ground party, of two or three hunters and their assistants, are then guided
to the group by radio. [ICE-AUS W2B-028:214]
Similarly, the plural noun ministers is likely to have had an influence on the choice of the
plural verb in the following sentence, but a singular verb would also have been a grammatically correct option, and therefore, the example was included in the data set:
(10) […] accordingly the electricity corporation not the shareholding ministers are
responsible for decisions relating to the clyde dam […]. [ICE-NZ S1B-051]
Generally, I only counted instances of mixed concord whenever the collective noun
was directly followed by a singular and a plural form. A typical example is:
(11) The Liberal Party of Australia is dead and buried. They can kiss goodbye any
aspirations of leading this country in the years to come. [ICE-AUS W1E-009:28]
In other words, whenever another possible antecedent was introduced, the following
verb or pronoun could no longer safely be taken to refer back to the collective noun.
This is the case in the following examples:
7. The following example from ICE-AUS would be a case in point if the noun were a collective noun (which it is not in this case, as party is used here in the non-collective, legal sense):
“Because of the inequality in bargaining power, it would be best if the weaker party were at least
given, or told to obtain, independent advice as to the consequences of entering the transaction”
[ICE-AUS W1A-015:205].
(12) a. The group has grown from originally sixteen who’re interested in doing
this to sixty or seventy people [ICE-AUS S2A-007:243]
b. A spokesman later said the group was made up of timber industry workers
and farmers all incensed at what they see as an attack on their livelihoods
by the Green Movement […]. [ICE-AUS S2B-001:2]
The collective noun group in (12a) is first followed by a singular verb. The plural
verb ’re more likely refers back to its direct antecedent, the numeral sixteen, than
to the collective noun. In (12b), the plural pronouns are more likely to refer back to
the plural antecedent NP timber industry workers and farmers. Therefore, only the singular
verbs were counted as instances of (singular) concord. The examples were not taken
to illustrate mixed agreement.
Obviously, only verbs which allow for number marking (i.e. finite, indicative present tenses of full verbs) were considered. Like Levin (2001: 51) I disregarded unclear
instances from the spoken sections of the ICE corpora. Unlike Levin (2001: 31–2
and 55–60), however, I did not include relative which and who as number-marked
forms. In other words, only instances followed by personal pronouns were included
in my data.8
For the set of 35 nouns listed above, all instances of singular, plural or mixed concord were recorded. All instances of the nouns in question, including cases where they were
part of proper name NPs (e.g. NSW Dairy Corporation or The Australian Labor Party),
were included in the counts. As mentioned in Section 2.4.1 above, mixed concord often
occurs across sentence boundaries; the unit of analysis was therefore not limited to the
sentence. Instances of mixed concord were counted as single instances of ‘mixed concord’. Likewise, occurrences of all-singular and all-plural concord were only counted
once. In other words, if a collective noun was followed by a singular verb and, further
on in the text, by a singular pronoun, these occur as only one instance of ‘singular
concord’ in the figures and tables. For the investigation of verbal and pronominal concord, each instance (verb or pronoun) after a collective noun was counted separately.9
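The two ways of counting described above can be made explicit in a short sketch. The Python below is an illustration only: the annotation format (pairs such as ("verb", "sg")) and the function names classify_token and tally_forms are assumptions introduced here, not the coding scheme actually used for the chapter. The first function collapses all number-marked forms after one collective noun into a single 'singular', 'plural' or 'mixed' token; the second tallies every verb and pronoun separately, as in the comparison of verbal and pronominal concord.

```python
from collections import Counter

# Hypothetical hand annotation: each number-marked form that refers back to a
# collective noun is coded as (target, number).
VALID_TARGETS = {"verb", "pron"}
VALID_NUMBERS = {"sg", "pl"}


def classify_token(forms):
    """Collapse the forms following ONE collective noun into a single pattern."""
    if not forms:
        raise ValueError("no number-marked form follows the collective noun")
    for target, number in forms:
        if target not in VALID_TARGETS or number not in VALID_NUMBERS:
            raise ValueError(f"unexpected annotation: {(target, number)!r}")
    numbers = {number for _, number in forms}
    if numbers == {"sg"}:
        return "singular"
    if numbers == {"pl"}:
        return "plural"
    return "mixed"  # at least one singular and one plural form


def tally_forms(tokens):
    """Count every verb and pronoun separately, across all tokens."""
    counts = Counter()
    for forms in tokens:
        counts.update(forms)
    return counts


# Annotations modelled on examples (11), (14a), (16a) and (16b) of this chapter.
example_11 = [("verb", "sg"), ("pron", "pl")]                   # is ... They
example_14a = [("verb", "sg"), ("pron", "sg"), ("pron", "pl")]  # has ... its ... they
example_16a = [("verb", "sg"), ("pron", "sg")]                  # continues ... its
example_16b = [("verb", "pl"), ("pron", "pl")]                  # continue ... their

tokens = [example_11, example_14a, example_16a, example_16b]
patterns = Counter(classify_token(forms) for forms in tokens)
forms_by_type = tally_forms(tokens)
print(patterns)
print(forms_by_type)
```

Because the second function counts forms rather than antecedents, its totals are necessarily larger than those of the first, which is why the two sets of tables cannot be expected to tally.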
8. The data sets in this chapter only include instances of either verbal or pronominal concord.
Rare instances where a noun that refers back to a preceding collective also indicates number
of the antecedent were not included. One such example from ICE-AUS would be: “Manly Art
Gallery became part of the emergence in New South Wales of museums and galleries which
were trustees of public art, with local government as their major patrons, during the 1970s
and 1980s” (W1A-003:136). In this example, the noun patrons suggests that the preceding collective noun is conceived of as a collection of individuals rather than a single body.
9. This accounts for the fact that the overall raw frequencies in Tables 1a and 1b, on the one hand, and Tables 2a and 2b, on the other, do not tally.
3.2 Results and discussion
The overall results of the searches (see Figure 1a) show that AusE and NZE have very
similar concord patterns after collective nouns. There is a tendency for slightly more
plural and especially mixed agreement in AusE, whereas NZE has a slightly higher incidence of singular agreement, but these differences are below the level of statistical significance. Both southern hemisphere varieties are different from BrE, which is still relatively
conservative in having the highest relative frequency of plural verbs and pronouns after
collectives, but again, the differences are below the level of statistical significance.
                     singular   plural    mixed
ICE-AUS (N = 699)       76.8%    19.6%     3.6%
ICE-NZ (N = 1017)       77.6%    19.5%     2.9%
ICE-GB (N = 561)        74.3%    23.9%     1.8%
Figure 1a. Concord patterns in the ICE corpora
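The percentages in Figure 1a can be re-derived from the column totals of the appendix tables (Tables 1a to 1c). The following Python sketch does so; the dictionary layout and the helper names overall and shares are assumptions made for illustration, while the counts themselves are the 'Total' rows of the appendix.

```python
# Column totals (singular, plural, mixed) from the 'Total' rows of
# appendix Tables 1a (ICE-AUS), 1b (ICE-NZ) and 1c (ICE-GB).
TOTALS = {
    "ICE-AUS": {"written": (256, 55, 8), "spoken": (281, 82, 17)},
    "ICE-NZ": {"written": (386, 82, 7), "spoken": (403, 116, 23)},
    "ICE-GB": {"written": (173, 46, 1), "spoken": (244, 88, 9)},
}


def overall(corpus):
    """Add the written and spoken counts of one corpus, column by column."""
    written = TOTALS[corpus]["written"]
    spoken = TOTALS[corpus]["spoken"]
    return tuple(w + s for w, s in zip(written, spoken))


def shares(counts):
    """Express each count as a percentage of the row total (one decimal)."""
    n = sum(counts)
    return tuple(round(100 * c / n, 1) for c in counts)


for corpus in TOTALS:
    counts = overall(corpus)
    print(corpus, "N =", sum(counts), shares(counts))
```

Calling shares on the written or spoken tuples alone gives the medium-specific proportions in the same way.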
3.2.1 Variation across medium
A look at the written sub-corpora (see Figure 1b) shows that all three varieties have a
higher frequency of singular concord in the written medium; otherwise, the regional
tendencies resemble the overall results, with AusE and NZE aligning into similar usage
preferences. BrE has practically no mixed concord in the written data. Again, none of
these differences turned out to be statistically significant.
                     singular   plural    mixed
ICE-AUS (N = 319)       80.3%    17.2%     2.5%
ICE-NZ (N = 475)        81.3%    17.3%     1.5%
ICE-GB (N = 220)        78.6%    20.9%     0.5%
Figure 1b. Concord patterns in the written parts of the ICE corpora
In the spoken sections of the ICE corpora (see Figure 1c), plural concord is more
common across all varieties. As a result, we also find more mixed concord in all varieties.
Interestingly, AusE and NZE have almost identical proportions of plural concord, but
ICE-AUS yields an even higher share of mixed concord patterns than ICE-NZ. The proportion of mixed agreement patterns in the spoken part of ICE-GB is the lowest, despite
the fact that BrE has the highest relative frequency of plural concord.
                     singular   plural    mixed
ICE-AUS (N = 380)       73.9%    21.6%     4.5%
ICE-NZ (N = 542)        74.4%    21.4%     4.2%
ICE-GB (N = 341)        71.6%    25.8%     2.6%
Figure 1c. Concord patterns in the spoken parts of the ICE corpora
3.2.2 Verbal vs. pronominal concord
Even at a deeper, more qualitative level of analysis, AusE and NZE show very similar
agreement patterns: both have almost identical proportions of verbal and pronominal
concord, as Table 1 shows:10
Table 1. Verbal vs. pronominal concord in ICE-AUS and ICE-NZ
            singular                    plural
            verb : pron.                verb : pron.
ICE-AUS     488 : 168 (74% : 26%)       94 : 110 (46% : 54%)
ICE-NZ      695 : 251 (73% : 27%)       130 : 152 (46% : 54%)
10. For patterns in the written and spoken components of the corpora, see Tables 2a and 2b in the Appendix.
As in previous studies, plural concord is more likely to be of the pronominal type,
whereas singular concord shows a strong connection with verbs.
Mixed concord – across all varieties – is typically of the ‘singular verb + plural
pronoun’-type. A typical example is the following:
(13) […] the humming bird energy budget analysis group was fairly bursting
with pride as their leader presented pretty pictures of their findings
[ICE-AUS S2B-027:16]
A variation on the singular–plural sequence of mixed concord involves singular
pronouns which are relatively close to the antecedent and are then followed by plural
pronouns at a greater distance:
(14) a. A record forty two member group of ministers and parliamentary secretaries
has held its first meeting of Labor’s fifth term only twenty four hours after
they were sworn in [ICE-AUS S2B-010:59]
b. the new zealand government let itself down by saying they wouldn’t protest
every every test [ICE-NZ S1B-035]
The reverse sequence, plural–singular, is much rarer but still attested, as in the following
instances from ICE-AUS (the first instance is a verb-pronoun pattern; in the second, the
usual sequence of singular verb and plural pronoun is reversed):
(15) a. Oh I’ve been through it plenty of times It’s all written in the paper here
there one there’s only one team have won a premiership on more occasions
than Saint George and that was South Sydney [ICE-AUS S1A-006:28]
b. A group calling themselves Tas Alert has complained to the
anti-discrimination board [ICE-AUS S2B-015:187]
The overall occurrence of mixed concord patterns in the three ICE-corpora is, however,
too low to allow for conclusions on regional differences in the use of individual mixed
concord patterns.
3.2.3 Lexicogrammatical variation
At the level of individual nouns it is interesting to note that for government, ICE-AUS
yields the highest proportion of singular concord (91%), followed by ICE-NZ (89%)
and ICE-GB (84%); AusE and NZE also have a sizeable number of instances with mixed
concord patterns following this noun, whereas BrE exclusively has either singular or
plural concord in the ICE corpus data. If the cases of mixed concord are disregarded,
the differences in the proportion of singular vs. plural concord after government in the
southern hemisphere varieties on the one hand, and BrE on the other hand prove significant in a chi-square test at p ≤ 0.01. Officialese might be the genre that is responsible for the slightly higher proportion of plural patterns following government in
ICE-GB. Preliminary evidence for this explanation comes from a comparison of
British parliamentary debates and their Hansard ‘transcriptions’. Mollin (2007: 201–2)
found that originally singular agreement in the debates was regularly changed into
plural patterns in the transcripts, as in for instance:
(16) a. And I’m disappointed that the government continues to drag its feet when
it comes to this issue.
b. I am disappointed that the Government continue to drag their feet on
this issue.
Her analysis shows that the language actually used in the debates was somewhat conservative, in that MPs used singular concord after government in 67% (26 of 39) of all
instances in the debates (cf. singular concord with 79.7% in the spoken part of ICE-GB).
The Hansard transcriptions seem to employ an even more conservative style, rendering all 39 instances with plural concord. But parliamentary reporting style and officialese cannot explain all the instances of plural concord in the ICE-GB data. In fact the
instances of plural concord with “government” come from a range of sources: 16 are
from the spoken part of the corpus; they occur in broadcast discussions (7), parliamentary debates (3), broadcast news (1), other talks (3) and even spontaneous conversations
(2). The two written instances are from an examination script and a press news report.
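The chi-square result reported above for government can be checked against the counts in the appendix tables. The sketch below is a minimal re-computation under two assumptions that the chapter does not spell out: that ICE-AUS and ICE-NZ are pooled into one 'southern hemisphere' row, and that no continuity correction is applied. The function name chi_square_2x2 is likewise introduced here for illustration.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("a row or column total is zero")
    return n * (a * d - b * c) ** 2 / denominator


# (singular, plural) counts for 'government', written + spoken, taken from
# appendix Tables 1a-1c; instances of mixed concord are disregarded.
ice_aus = (69 + 79, 2 + 7)
ice_nz = (71 + 174, 5 + 14)
ice_gb = (34 + 63, 2 + 16)

# Assumption: the two southern hemisphere corpora are pooled into one row.
southern = (ice_aus[0] + ice_nz[0], ice_aus[1] + ice_nz[1])

statistic = chi_square_2x2(*southern, *ice_gb)
CRITICAL_VALUE_P01_DF1 = 6.635  # chi-square critical value, 1 degree of freedom, p = 0.01

print(round(statistic, 2), statistic > CRITICAL_VALUE_P01_DF1)
```

Under these assumptions the statistic exceeds the critical value for one degree of freedom at the 0.01 level, in line with the significance level stated in the text.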
Regional difference in concord can be seen for family, with a tendency towards
singular concord in the southern hemisphere varieties: 19 out of 29 in ICE-AUS and
20 out of 30 in ICE-NZ. In ICE-GB, by contrast, singular and plural concord are more
evenly distributed, with singular concord in 11 out of 23 instances. The same holds for board, which takes singular concord in 17 out of 18
instances in ICE-AUS and 36 out of 38 in ICE-NZ, but in only 7 out of 11 in ICE-GB. On the other
hand, team shows higher proportions of plural concord in both ICE-AUS (18 of 38)
and ICE-NZ (25 of 51) than in ICE-GB (only 7 of 21). A similar tendency can be
observed for group (see the tables in the appendix). Due to the overall low frequencies, however, these differences are below the level of statistical significance.
It is not always the case that the two southern hemisphere varieties pattern
together whenever regional differences for individual nouns can be observed. Population, for instance, is a noun that shows a clear tendency towards singular in both
ICE-GB (12 out of 13) and ICE-NZ (28 out of 30), whereas ICE-AUS yields a fair
share of plural patterns (7 out of 28). ICE-AUS and ICE-GB produce similar findings
for the noun class, namely a clear preference for singular concord (with 6 of 6 and 8 of
11 instances, respectively); ICE-NZ, on the other hand, shows a preference for plural
concord with this noun (10 out of 18 contexts).11 Club, finally, has both singular and
plural agreement in ICE-GB and ICE-NZ, but prefers singular concord in ICE-AUS.
11. A Fisher Exact test applied to the findings for class did not produce statistically significant differences.
In previous studies, family and team were nouns that triggered mixed concord patterns more often than government or committee. Data from the ICE corpora have shown
that this is not necessarily the case: in ICE-GB and ICE-NZ, family does not yield any
instances of mixed concord. Government, on the other hand, is followed by a mix of
singular and plural forms in both ICE-NZ and ICE-AUS, as pointed out earlier.
Of the 9 nouns that are potentially of interest with respect to regional variation, only
1 – namely government – produced statistically significant differences. The remaining 8
nouns show interesting tendencies that might be supported on the basis of larger corpora in the future. The important thing to remember, however, is that the large majority
of nouns investigated in this chapter show practically uniform agreement patterns in
the Australian, New Zealand and British components of the ICE.
4. Conclusion
On the basis of unstratified newspaper databases, Hundt (1998) and Levin (2001)
found that singular concord was the preferred pattern in AmE, that plural concord
was used most frequently in BrE, whereas varieties like AusE and NZE took an intermediate position. The data from the more balanced ICE corpora that were investigated
for this chapter do not, however, provide evidence of substantial regional differences
between AusE and NZE nor between the southern hemisphere varieties and BrE, not
even at the level of individual nouns. Instead, language-internal variation seems to
dominate the picture.
A comparison of the written and spoken components underscores previous
hypotheses on stylistic variation: across all varieties, plural concord is more likely to
occur in informal, spoken English than in formal, written texts.12 At the level of individual nouns, the ICE-corpora investigated in this study also provide little evidence of
variation across varieties: in BrE, NZE and AusE, the majority of nouns (e.g. association, audience, committee, community, company, council, department, government) tend
towards singular concord; nouns like couple or staff show the opposite trend, namely a
preference for plural concord.
12. This contrasts sharply with the greater variety-internal homogeneity that Hundt (2006) found
in two outer-circle varieties, namely Singaporean and Philippine English.
References
Bauer, Laurie. 1994. Watching English Change. An Introduction to the Study of Linguistic Change in Standard Englishes in the Twentieth Century. London: Longman.
Biber, Douglas, Stig Johansson, Geoffrey Leech, Susan Conrad & Edward Finegan. 1999. Longman Grammar of Spoken and Written English. Harlow: Longman.
Depraetere, Ilse. 2003. “On verbal concord with collective nouns in British English”. English Language and Linguistics 7 (1): 85–127.
Fries, Udo. 1981. “Zur Kongruenz bei Kollektiven”. In Wolfgang Pöckl (ed.), Europäische Mehrsprachigkeit. Festschrift zum 70. Geburtstag von Mario Wandruszka. Tübingen: Niemeyer, 17–27.
Hundt, Marianne. 1998. New Zealand English Grammar. Fact or Fiction? Amsterdam and Philadelphia: John Benjamins.
Hundt, Marianne. 2006. “The committee has/have decided… On concord patterns with collective nouns in inner and outer circle varieties of English”. Journal of English Linguistics 34 (3): 206–32.
Hundt, Marianne. 2009. “Colonial lag, colonial innovation, or simply language change?” In Günter Rohdenburg & Julia Schlüter (eds), One Language, Two Grammars. Cambridge: Cambridge University Press, 13–37.
Johansson, Stig. 1979. “American and British English grammar: An elicitation experiment”. English Studies 60: 195–215.
Levin, Magnus. 2001. Agreement with Collective Nouns in English. Stockholm: Almqvist & Wiksell.
Levin, Magnus. 2006. “Collective nouns and language change”. English Language and Linguistics 10 (2): 321–43.
Marckwardt, Albert H. 1958. American English. London: Oxford University Press.
Mollin, Sandra. 2007. “The Hansard Hazard. Gauging the accuracy of British parliamentary transcripts”. Corpora 2 (2): 187–210.
Nixon, Graham. 1979. “Corporate-concord phenomena in English”. Studia Neophilologica 44: 120–6.
Quirk, Randolph, Sidney Greenbaum, Geoffrey Leech & Jan Svartvik. 1985. A Comprehensive Grammar of the English Language. London: Longman.
Sand, Andrea. (forthcoming). Angloversals? Shared Morpho-Syntactic Features in Contact Varieties of English. Amsterdam: John Benjamins.
Siemund, Rainer. 1995. “ ‘For who the bell tolls.’ – Or why corpus linguistics should carry the bell in the study of language change in present-day English”. Arbeiten aus Anglistik und Amerikanistik 20 (2): 351–77.
Trudgill, Peter & Jean Hannah. 2002. International English. A Guide to the Varieties of Standard English, 4th edn. London: Arnold.
Appendix
Table 1a. Collective nouns in ICE-AUS (written vs. spoken)
             written                                spoken
             singular   plural    mixed    Total    singular   plural    mixed    Total
army                0        0        0        0           1        2        0        3
association        10        1        0       11           4        2        0        6
audience            3        1        0        4           5        0        0        5
board              10        0        1       11           7        1        0        8
cast                0        0        0        0           1        0        0        1
clan                0        0        0        0           0        0        0        0
class               3        0        0        3           3        0        0        3
club                1        0        0        1           7        1        0        8
college             1        0        0        1           0        1        0        1
commission          4        1        0        5          11        1        0       12
committee          14        1        0       15           8        2        0       10
community           7        1        0        8           7        1        0        8
company            19        1        0       20          14        4        0       18
corporation         3        0        0        3           4        0        1        5
council            21        1        0       22           9        2        0       11
couple              1        3        0        4           0        5        1        6
crew                3        1        0        4           1        2        0        3
crowd               1        2        0        3           6        0        0        6
department         15        0        0       15          13        2        0       15
family              2        4        2        8          17        6        2       25
federation          1        0        0        1           4        0        0        4
gang                0        0        0        0           1        0        0        1
generation          3        0        0        3           0        1        0        1
government         69        2        2       73          79        7        3       89
group              17        1        1       19          21       11        5       37
institute           2        0        0        2           0        0        0        0
majority            0        1        0        1           0        7        0        7
ministry            1        0        0        1           1        0        0        1
minority            1        0        0        1           0        0        0        0
opposition          4        0        0        4          12        1        0       13
party              12        1        1       14          21        1        1       23
population         17        7        1       25           2        0        1        3
staff               0       21        0       21           1        8        0        9
team                4        5        0        9          13       13        3       29
university          7        0        0        7           8        1        0        9
Total             256       55        8      319         281       82       17      380
                80.3%    17.2%     2.5%                73.9%    21.6%     4.5%
Table 1b. Collective nouns in ICE-NZ (written vs. spoken)
             written                                spoken
             singular   plural    mixed    Total    singular   plural    mixed    Total
army                4        0        0        4           2        1        0        3
association        13        1        0       14           5        0        0        5
audience            8        2        0       10           2        3        1        6
board              26        1        0       27          10        1        2       13
cast                0        0        0        0           1        0        0        1
clan                0        0        0        0           0        0        0        0
class               4        7        0       11           4        3        0        7
club                6        1        0        7           0        3        0        3
college             1        1        0        2           2        0        0        2
commission          9        0        0        9          11        2        0       13
committee          22        3        0       25           5        1        1        7
community           7        0        0        7          13        0        2       15
company            44        2        0       46          14        1        1       16
corporation         2        1        0        3           3        1        0        4
council            40        3        0       43          15        3        1       19
couple              0        3        0        3           0        5        0        5
crew                0        5        0        5           1        0        0        1
crowd               0        1        0        1           2        6        1        9
department         10        1        0       11          12        3        0       15
family              6        7        0       13          14       12        0       26
federation          0        0        0        0           5        0        0        5
gang                2        0        0        2           1        0        0        1
generation          5        1        0        6           1        0        0        1
government         71        5        1       77         174       14       10      198
group              41        7        3       51          24       15        1       40
institute           3        0        0        3           0        0        0        0
majority            2        0        0        2           0        0        0        0
ministry            5        1        0        6           7        0        0        7
minority            0        0        0        0           0        0        0        0
opposition          1        1        0        2           1        5        0        6
party              24        4        1       29          41        8        2       51
population         20        1        0       21           8        1        0        9
staff               0       18        0       18           2        7        0        9
team                8        5        2       15          15       20        1       36
university          2        0        0        2           8        1        0        9
Total             386       82        7      475         403      116       23      542
                81.3%    17.3%     1.5%                74.4%    21.4%     4.2%
Table 1c. Collective nouns in ICE-GB (written vs. spoken)
             written                                spoken
             singular   plural    mixed    Total    singular   plural    mixed    Total
army                5        0        0        5           6        1        0        7
association         9        0        0        9           3        1        0        4
audience            1        0        0        1           6        5        1       12
board               3        2        0        5           4        2        0        6
cast                0        1        0        1           0        1        0        1
clan                0        0        0        0           0        0        0        0
class               0        0        0        0           8        2        1       11
club                1        2        0        3           3        3        0        6
college            11        1        0       12           5        1        1        7
commission          0        0        0        0           4        1        0        5
committee           8        0        0        8           3        1        0        4
community           6        0        0        6          17        1        0       18
company            14        3        0       17          19        5        0       24
corporation         1        0        0        1           0        0        0        0
council            21        1        0       22          12        0        1       13
couple              0        1        0        1           0        8        0        8
crew                0        0        0        0           0        0        0        0
crowd               0        0        0        0           2        1        0        3
department          8        3        0       11          12        2        1       15
family              8        5        0       13           3        7        0       10
federation          1        0        0        1           1        0        0        1
gang                1        1        0        2           0        0        0        0
generation          2        0        0        2           4        0        0        4
government         34        2        0       36          63       16        0       79
group               6        2        0        8          19        3        1       23
institute           0        2        0        2           5        0        0        5
majority            1        1        0        2           1        3        0        4
ministry            0        0        0        0           4        0        0        4
minority            0        0        0        0           0        1        0        1
opposition          1        0        0        1           1        1        0        2
party              18        1        1       20          24        5        2       31
population          9        0        0        9           3        1        0        4
staff               0       17        0       17           1        9        0       10
team                4        0        0        4           9        7        1       17
university          0        1        0        1           2        0        0        2
Total             173       46        1      220         244       88        9      341
                78.6%    20.9%     0.5%                71.6%    25.8%     2.6%
Table 2a. Verbal vs. pronominal concord in ICE-AUS (written vs. spoken)
            written                             spoken
            singular          plural            singular          plural
            verbal : pronom.  verbal : pronom.  verbal : pronom.  verbal : pronom.
army        0 : 0             0 : 0             1 : 0             2 : 2
association 8 : 3             0 : 1             4 : 1             1 : 1
audience    3 : 0             0 : 1             5 : 0             0 : 0
board       11 : 1            0 : 1             7 : 0             1 : 1
cast        0 : 0             0 : 0             1 : 1             0 : 0
clan        0 : 0             0 : 0             0 : 0             0 : 0
class       3 : 1             0 : 0             3 : 0             0 : 0
club        0 : 1             0 : 0             6 : 3             1 : 1
college     0 : 1             0 : 0             0 : 0             1 : 1
commission  2 : 2             0 : 1             11 : 3            0 : 1
committee   13 : 3            1 : 0             6 : 3             2 : 1
community   5 : 3             1 : 0             6 : 1             1 : 0
company     14 : 10           1 : 0             12 : 4            2 : 2
corporation 3 : 1             0 : 0             2 : 4             1 : 0
council     19 : 6            0 : 1             9 : 0             1 : 2
couple      1 : 0             2 : 1             1 : 0             4 : 4
crew        3 : 0             1 : 1             1 : 0             0 : 2
crowd       1 : 1             1 : 1             6 : 1             0 : 0
department  12 : 5            0 : 0             13 : 3            1 : 1
family      3 : 1             5 : 4             17 : 5            2 : 7
federation  1 : 0             0 : 0             4 : 1             0 : 0
gang        0 : 0             0 : 0             1 : 0             0 : 0
generation  3 : 0             0 : 0             0 : 0             1 : 0
government  54 : 33           1 : 3             74 : 22           1 : 9
group       18 : 3            0 : 2             25 : 4            9 : 11
institute   1 : 2             0 : 0             0 : 0             1 : 1
majority    0 : 0             1 : 0             0 : 0             7 : 5
ministry    0 : 1             0 : 0             1 : 1             0 : 0
minority    1 : 0             0 : 0             0 : 0             0 : 0
opposition  2 : 2             0 : 0             12 : 6            1 : 1
party       9 : 6             1 : 1             18 : 7            0 : 2
population  17 : 4            4 : 6             3 : 0             0 : 1
staff       0 : 0             15 : 12           1 : 0             7 : 4
team        4 : 0             4 : 2             16 : 1            9 : 11
university  5 : 3             0 : 0             6 : 4             0 : 1
Total       216 : 93          38 : 38           272 : 75          56 : 72
            69.9% : 30.1%     50% : 50%         78.4% : 21.6%     43.75% : 56.25%
Table 2b. Verbal vs. pronominal concord in ICE-NZ (written vs. spoken)
            written                             spoken
            singular          plural            singular          plural
            verbal : pronom.  verbal : pronom.  verbal : pronom.  verbal : pronom.
army        2 : 2             0 : 0             2 : 0             0 : 1
association 13 : 3            1 : 0             0 : 0             1 : 0
audience    7 : 2             1 : 1             3 : 0             3 : 3
board       18 : 13           0 : 1             12 : 0            0 : 3
cast        0 : 0             0 : 0             1 : 0             0 : 0
clan        0 : 0             0 : 0             0 : 0             0 : 0
class       2 : 3             4 : 6             4 : 3             3 : 2
club        6 : 1             1 : 0             0 : 0             2 : 2
college     1 : 0             0 : 1             2 : 0             0 : 0
commission  7 : 2             0 : 0             10 : 3            1 : 2
committee   19 : 5            3 : 0             6 : 0             0 : 2
community   6 : 2             0 : 0             14 : 1            0 : 2
company     35 : 15           0 : 2             14 : 7            0 : 2
corporation 2 : 1             0 : 0             1 : 2             2 : 1
council     26 : 19           2 : 1             13 : 4            2 : 4
couple      0 : 0             2 : 2             0 : 0             3 : 4
crew        0 : 0             3 : 2             1 : 0             0 : 0
crowd       0 : 0             1 : 0             3 : 0             6 : 2
department  10 : 2            0 : 1             12 : 3            2 : 2
family      6 : 1             4 : 5             13 : 1            9 : 7
federation  0 : 0             0 : 0             5 : 0             0 : 0
gang        2 : 1             0 : 0             1 : 0             0 : 0
generation  3 : 2             0 : 1             0 : 1             0 : 0
government  49 : 32           2 : 1             168 : 55          10 : 25
group       40 : 14           5 : 9             21 : 5            8 : 5
institute   3 : 0             0 : 0             0 : 0             0 : 0
majority    1 : 1             0 : 0             1 : 0             0 : 0
ministry    4 : 1             0 : 1             6 : 4             0 : 0
minority    0 : 0             0 : 0             0 : 0             0 : 0
opposition  1 : 0             0 : 1             1 : 1             3 : 2
party       16 : 13           2 : 5             39 : 13           7 : 8
population  22 : 3            0 : 1             6 : 1             1 : 1
staff       0 : 0             16 : 7            2 : 0             5 : 4
team        7 : 3             2 : 6             17 : 2            13 : 13
university  2 : 0             0 : 0             7 : 4             0 : 1
Total       310 : 141         49 : 54           385 : 110         81 : 98
            68.7% : 31.3%     47.6% : 52.4%     77.8% : 22.2%     45.3% : 54.7%
No in the lexicogrammar of English
Pam Peters & Yasmin Funk
Macquarie University
This paper analyzes the continuing uses of no in negative collocations in three
varieties of English: Australian, New Zealand and British, using their respective
ICE corpora. In all three varieties of English, the use of no as determiner in
nominal phrase collocations far outnumbers its use in adverbial collocations,
though the latter cluster high in the frequency rankings for both speech and
writing. Comparative analysis finds that while Australian English makes more use
of no as a reaction signal (No!) and its emphatic counterparts (e.g. No way!), the
New Zealand English data present a wider range of freely formed no collocations,
especially in writing. Thus the two southern hemisphere varieties diverge, with no
increasingly fixed into Australian lexical idiom, while it remains a well-utilized
syntactic resource in New Zealand English.
1. Introduction: Expressing negation
In the history of European languages, the expression of negation has been continually
evolving, especially in English. Language historians have traced the gradual replacement of no-negation with not-negation (Jespersen 1917); while recent corpus-based
research, including that of Tottie (1991), Hundt (1998) and Biber et al. (1999), has
focused on the legacy of variation in the formulation of negation in contemporary
English. In parallel with the analysis of negation in standard English, its realization
in nonstandard English continues to be of interest, especially the various devices
used for emphasizing negative polarity (Kortmann & Szmrecsanyi 2004), in the continual quest to reinforce its rhetorical strength. One such device is the emphatic use
of never instead of not in colloquial English; another is the double negative used to
express negation through more than one sentence constituent. Both can be found in
Australian “vernacular” speech, and even in the same sentence, witness the following
from a Tasmanian speaker recorded by Pawley (2004: 634).
(1) I never said nothing for a while.
The use of the double negative in English has a long history of commentary and censure
from the eighteenth century on, when grammarians argued over its interpretation in
terms of the logical principle whereby two negatives cancel each other out: “Two
negatives may make an affirmative but they cannot express a denial”, according to
Mennye in his 1785 Grammar (quoted in Leonard 1962: 93). Yet the use of more than
one negative in an English sentence is almost always significant in terms of intensifying or fine-tuning the negative polarity (Peters 2004: 163). There is nevertheless
a widespread stigma attached to the colloquial use of double negatives like that in
example 1 above, which has turned it into a shibboleth for standard English speakers,
and a sociolinguistic marker of lower socioeconomic speech in all quarters of the
English-speaking world. It ranks 7 out of the 76 features analysed by Kortmann and
Szmrecsanyi (2004). For sociolinguists, it is simply a form of negative iteration or
“negative concord” (Labov 1972). More recent grammars embrace it under the heading
of “multiple negation” (Biber et al. 1999: 177–9), showing that while complex sentences
with more than one kind of negative in successive clauses escape censure, a single
clause containing two or more negatives will probably not.
English grammar supports multiple negation within the clause through the fact
that it can be expressed through several elements of syntax:
– through the determiner no prefacing a noun phrase, as well as in nominal
compounds such as nothing, nobody, no-one
– through the adverb not attached to the verb phrase (or to other adverbial or adjectival
components of the clause); and in adverbial compounds such as nowhere. (Never was
historically a compound consisting of ne + ever)
Because of these various syntactic realizations, the analysis of negation still tends
to be scattered over several chapters in major grammars such as Quirk et al. (1985)
and Biber et al. (1999), whereas Huddleston and Pullum (2002) put all negation phenomena together in a single chapter. Either way, the discoursal and lexical aspects
of negation tend to be overshadowed, although they are of no small importance as
the contexts of use and the collocational raw material. Their impact on the use of
no-negation in particular is the focus of this study.
2. Research on the uses of no in contemporary English
While the use of no is generally declining in favor of not constructions in late twentieth
century English, research by Tottie (1991) has also indicated the existence of a substantial quantity of more or less fixed collocations with no. They may indeed be seen
as the natural correlates of its declining frequency (Bybee 2006: 728–9). At one end of
the scale of fixity there is the no “boilerplate” (Tottie 1999: 326), including emphatic
reaction signals formed with no such as “No way!” or in AusE especially “No worries!” They and others are the stereotypical responses of informal conversation. At the
other end of the scale there are no collocations that are clearly freely formed, as in the
immortal line “No birds sing” from Keats’s poem La Belle Dame sans Merci, which
has resonated so strongly with environmental concerns. Here the negative determiner
no is fully integrated within the noun phrase, in what for some is the essence of more
literary writing, and more elegant expression (Jespersen 1917: 56). Thus the uses of no
may be becoming polarized: occurring in speech as conventional, stereotypical units,
and in writing as individualized expression.
In between these contrasting uses of no, there are expressions which are at home
in both spoken and written discourse. Alongside the reaction signal “No way!”, the
same collocation takes its place in fuller discussion and formulation of issues: “there
was no way we could match…”, as well as in the adverbial phrase “in no way discouraged by the response”. Three-part “bundles” like in no way and four-part ones like “take
no notice of” were found to make up as much as 20% of written registers and 30% of
spoken registers (Biber et al. 1999: 1027). In ordinary conversation, the total of 2-, 3- and
4-part collocations has been shown to constitute more than 85% of the word count
(Eeg-Olofsson & Altenberg 1996). Negative collocations that contribute to these high
counts of prefabricated material are nevertheless capable of some variation. Research
by Biber et al. (1999: 169) estimated that no collocations could be paraphrased by not
constructions in about 80% of cases, whereas only about 30% of not collocations could
be paraphrased with no constructions. “There was no way of” could certainly be paraphrased by “There wasn’t any way of …”, and is thus not “boilerplate” but a discretionary
collocation available as a means of making a negative statement. These discretionary
uses of common collocations with no stand at the mid-point of the notional scale
between boilerplate and fresh uses of no as in “no birds sing”. As common collocations
they are on the cusp of English lexicogrammar: they belong to the idiomatic stock of the
lexicon, but they are also syntactic alternatives to the “not any” construction.
Previous research on no has diverged on whether the uses of no are generally more
frequent in speech or writing, though without always distinguishing the different uses
of no collocations. Tottie’s (1991) research found greater levels of no in writing, using
a methodology that included only instances of discretionary no (i.e. examples which
could equally well have been formulated with any). Meanwhile Biber et al. (1999)
found that not was more frequent in both speech and writing; however it is unclear
whether they restricted their data to discretionary uses of no. How no boilerplate and
ready-made collocations were counted is also unclear, though we might expect them
to have bulked up the presence of no in spontaneous conversation. Further research is
therefore needed to clarify the roles of no and its collocational behavior in writing and
speech. We might expect it to function more as a syntactic resource for writers, and as
a lexical resource for speakers.
Earlier studies of no-negation have also shown that its usage varies somewhat in
different varieties of English. No occurred far more often in American conversation
than British in indefinite noun phrases (i.e. have no… rather than have not any…),
according to Biber et al. (1999: 161), with the implication that AmE is more conservative in this detail of no usage. Meanwhile research based on four parallel corpora of
the 1980s and 1990s by Peters (2008: 153, 156) showed that AmE was ahead of AusE
and BrE in the replacement of no by not/n’t in written material, and that NZE was
the most conservative of the four in retaining no as a syntactic resource. Yet early
(mid-nineteenth century) attestation in NZE of the frequent use of the exclamation
“No fear” (citation in the Dictionary of New Zealand English (1997)) shows that boilerplate uses are also deeply embedded there, expressing the conversational need for
intensified negation. Regional tendencies to retain or replace no may be reinforced
or modified by the medium of communication (speech or writing). This suggests the
value of comparing spoken and written data on the uses of no collocations in New
Zealand and other varieties of English, to see if they present contrasting or consistent
patterns. Regional differentiation of sociolects has been associated with the maturation of new varieties of English (Schneider 2007), and may also be manifest in the
differentiation of registers. It is therefore worth investigating whether newer varieties
of English such as NZE and AusE differ in this respect from old varieties such as BrE.
3. Source material used in this study
The ICE corpora lend themselves to this study, since they provide comparative data on
three varieties of English from the same period (late twentieth century). All contain
samples in the same categories and sub-categories of spoken and written discourse,
by comparable speakers and writers (all adult, i.e. 18 years and over, native-speakers
of their variety, and all “educated”, i.e. they have completed a standard secondary education, in terms of the local educational institutions). This research is based on data
from the ICE-corpora of Australia (ICE-AUS) and New Zealand (ICE-NZ) as well
as ICE-GB (=BrE), to provide some comparative data from an older English in the
northern hemisphere.
The data from ICE-AUS and ICE-NZ were extracted by means of string searches
performed by the search engine built into the Macquarie University corpus website.
Data from ICE-GB was found using ICECUP III. The corpus data were then sorted,
first mechanically to establish their relative frequencies, and then by “manual” sifting
of individual examples, to separate out the various types of no in the data. Some of
them serve both as “boilerplate” and as discretionary uses of the same collocations
which could be paraphrased by not any. In order to distinguish these uses, the no
collocations were examined to assess their relative paraphrasability, and the impacts
of paraphrase on the original wording. No collocations which are adverbial in function do not usually require any reordering of elements when paraphrased with not
any, except that not attaches itself to the verb which may then require do-support, as
when “they went no further” becomes “they didn’t go any further”. However nominal
collocations can be embedded in various constituents of the clause, and may require
reordering or the rephrasing of the indeterminate NP with additional constituents.
The first example below is readily paraphrasable:
(2) Labour’s caucus delivered no surprises when it selected its new senior and
junior whips and caucus secretary yesterday, despite contests for all positions.
[ICE-NZ W2C-008:3]
As with adverbial collocations, the paraphrase needs only to turn “delivered no
surprises” into “didn’t deliver any surprises”. But paraphrasing the next example below
requires more intensive rewording:
(3) The problem lay with no criteria being attached to the new range of rates
and principals fearing that backdating of any rise to before April 1, would
not apply,〈-〉” he said. [ICE-NZ W2C-010: 64]
Here the paraphrase requires manipulation of the nonfinite clause, into “The problem
lay with there not being any criteria attached…” Provided some paraphrase of the no
constituent could be accommodated within the same clausal structure, the no collocation was considered to be paraphrasable, and discretionary rather than fixed. These
discriminations underlie the data presented below in Sections 5, 6 and 7.
4. Preliminary identification of reaction signals
The first step in this research study was to identify and set apart all the ordinary uses
of no as a reaction signal in spoken exchanges, including its repeated use. This filtering
process then allowed us to concentrate on collocations with no, and the question of
whether they were typically “boilerplate” expressions like “No way!”, or more discretionary uses of no collocations.
Table 1 below summarizes the overall frequency of no in the three ICE corpora,
and shows the dramatic reduction when all simple uses of no are subtracted from the
totals. This reflects the relatively large proportions of spoken material (60%) included
in the ICE-corpora.

Table 1. Data on no from three ICE corpora, showing the proportions of no used as a
simple negative reaction signal in each.
                              ICE-AUS         ICE-NZ          ICE-GB          TOTALS
no (total use)                3363            2941            2906            9210
no (as simple neg. reaction)  2138 (63.6%)    1488 (50.6%)    1529 (52.6%)    5155 (56%)
no collocations               1225            1453            1377            4055

Table 1 shows that the proportion of no used as a simple negative reaction signal
is more than 50% in all three corpora, and at its highest (63.6%) in ICE-AUS. It
contributes to the fact that negative forms are used many times more often in conversation than in written discourse (Biber et al. 1999: 159). Some speakers produce
a string of three or four nos to emphasize their negative reaction, the most basic way
of intensifying it. Yet not all uses of no are actually negative, and the transcriptions
occasionally have the negative reaction signal elongated as “Nooo!”, to show that it
effectively affirms the utterance of the other speaker.
With the removal of all instances of no as a reaction signal, the frequency of no
collocations in each variety is more visible. NZE emerges from Table 1 as the variety
containing the largest number of tokens, followed by BrE then AusE. It remains to be
seen whether the ratio of tokens to types of no collocations will also vary across the
three sets of data.
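
The arithmetic behind Table 1 can be retraced with a short Python sketch. It is offered only as an illustration: the counts are those reported in Table 1, while the names TABLE_1 and summarize are hypothetical and not part of the original study.

```python
# Counts of "no" as reported in Table 1 of this chapter.
TABLE_1 = {
    "ICE-AUS": {"total": 3363, "reaction": 2138},
    "ICE-NZ": {"total": 2941, "reaction": 1488},
    "ICE-GB": {"total": 2906, "reaction": 1529},
}


def summarize(counts):
    """Return (collocation tokens, reaction-signal share in %) for one corpus."""
    collocations = counts["total"] - counts["reaction"]
    share = 100 * counts["reaction"] / counts["total"]
    return collocations, round(share, 1)


summary = {name: summarize(counts) for name, counts in TABLE_1.items()}

# Rank the varieties by the number of collocation tokens that remain
# once the simple reaction signals have been set aside.
ranking = sorted(summary, key=lambda name: summary[name][0], reverse=True)

for name in ranking:
    tokens, share = summary[name]
    print(f"{name}: {tokens} no collocations, {share}% reaction signals")
```

Sorting by the remaining tokens reproduces the order noted above, with NZE first, then BrE, then AusE.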
5. Types of no collocation found in speech and writing
Let us now put the spotlight on the commoner collocations with no which come to
light in comparative analysis of the ICE data. They are typically two-part nominal
collocations e.g. no idea, no doubt, no worries, although some such as the second are
also quite commonly found as adverbial collocations:
Compare: There’s no doubt that they will be present (NP)
They will no doubt be present (AdvP)
The grammatical flexibility of some no collocations allows them to contribute to larger
discourse units. For example:
no good (nominal) > (adjectival phrase) no good reason
no less (as adverbial) > (adj. phrase) no less keen
no more (as adverbial) > (adj. phrase) no more need
no longer (as adverbial) > (adj. phrase) no longer available
no sooner (as adverbial) > (clause complex) no sooner said than done
Like other two-part “lexical bundles”, no collocations can be caught up in three- or
more-part lexical bundles (Biber et al. 1999: 169), as prefabricated elements of discourse,
especially speech. We may expect some variability in the inventory of collocations
contributing to the speech styles of regional varieties of English, given the more localized
nature of spoken discourse. But let us first set in parallel the most frequent types of
no collocation, nominal and adverbial, found in all three ICE corpora, to see whether
there are any broad differences.
Table 2 summarizes the data on the highest-frequency no collocations in ICE-AUS,
ICE-NZ and ICE-GB, i.e. those with a frequency of 7 or more in one of the three
corpora. In all these corpora no longer, no doubt and no more rank highest in terms of
frequency, the second and third collocations helped by their dual use as nominal and
as adverbial phrases. There are however relatively few adverbial collocations with no,
only those two plus no less and no further, both of which are also nominal collocations.
The paucity of adverbial collocations is in line with the relatively infrequent use of no
as adverb vis-à-vis no determiner found in the British National Corpus: in the ratio of
17: 1343 (Leech, Rayson & Wilson, 2002: 82).
Table 2. Rankings of the most frequent types of no collocations in the three ICE corpora.
RANK  ICE-AUS                 ICE-NZ                  ICE-GB
1     no longer         65    no longer         75    no doubt          88
2     no doubt          63    no doubt          58    no longer         79
3     no more           44    no more           40    no more           49
4     no way            35    no matter         29    no reason         22
5     no idea           28    no way            25    no idea           20
6     no matter         25    no idea           22    no means          19
7     no good           20    no reason         21    no good           18
8     no other          19    no wonder         20    no way            17
9     no reason         18    no other          19    no further        17
10    no problem        17    no time           15    no less           17
11    no time           14    no means          15    no other          16
12    no point          13    no need           14    no problem        16
13    no money          12    no problem        11    no need           14
14    no need           12    no run            11    no sign           13
15    no question       11    no use            11    no evidence       13
16    no evidence       11    no good           10    no time           12
17    no means           9    no evidence        9    no matter         11
18    no room            9    no chance          9    no point          10
19    no less            9    no further         8    no wonder         10
20    no better          9    no point           7    no question        9
21    no choice          9    no money           7    no choice          9
22    no right           8    no room            7    no difference      9
23    no further         7    no ball            7    no claims          8
24    no wonder          7    no alternative     7    no trouble         7
25                            no sign            7
26                            no less            7
Total                  474                     471                     503

Beginning with no longer/no doubt/no more, Table 2 shows a similar set of
high-ranking collocations at the top of each list. In the middle of each list the same
generic types of collocation appear with slightly different rankings e.g. no reason,
no problem, no need, no means – the common stock of southern hemisphere Englishes
and their northern hemisphere progenitor. Lower down the list we begin to observe
differences in the actual inventory, which reflect local interests and/or the subject
matter of particular texts within the regional corpus. For example no run (x 11) and
no ball (x 7) in ICE-NZ, extracted from local cricket commentary; and in ICE-GB the
repeated use of no claims (x 8) from the discussion of an insurance case. In ICE-GB we
also note the appearance of no trouble (as in it’s no trouble), a common British courtesy.
Absent from the ICE-GB list but present in both the Australian and New Zealand lists
are the collocations no money and no room, suggesting blunter discussion of these
essential resources.
Further asymmetries come to light as we separate the relative rankings for each
set of data in spoken and written discourse. Because of the unequal sizes of the ICE
spoken and written components (600 000 to 400 000 words), normalized frequencies
were calculated for the rankings, and used to decide the cut-off points for data from
the two types of discourse, at the nearest approximation to a normalized 7 per million.
Since a frequency of 4 amounts to 6.68 in spoken discourse, and of 3 to 7.5 in written
discourse, we used raw frequencies of 4 and 3 respectively as the thresholds for
inclusion in the sets presented in the table below.
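
The normalization and the choice of thresholds described above can be sketched in Python. This is a minimal illustration rather than the procedure actually used: the names MULTIPLIER, per_million and threshold are hypothetical, and the multipliers are the rounded factors (1.67 and 2.5) that follow from components of 600 000 and 400 000 words.

```python
# Rounded multipliers that scale a raw count to a rate per million words:
# 1 000 000 / 600 000 is roughly 1.67 (spoken); 1 000 000 / 400 000 is 2.5 (written).
MULTIPLIER = {"spoken": 1.67, "written": 2.5}


def per_million(raw, medium):
    """Normalize a raw frequency to occurrences per million words."""
    return round(raw * MULTIPLIER[medium], 2)


def threshold(medium, target=7.0):
    """Raw frequency whose normalized value lies nearest to the target rate."""
    candidates = range(1, 20)
    return min(candidates, key=lambda raw: abs(per_million(raw, medium) - target))


for medium in ("spoken", "written"):
    raw = threshold(medium)
    print(medium, raw, per_million(raw, medium))
```

With a target of 7 per million, the nearest raw frequencies are 4 for speech (6.68) and 3 for writing (7.5), matching the cut-off points stated above.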
Table 3 (in the Appendix) shows both common and differential preferences for
particular no collocations in spoken and written discourse in the three corpora. All
three show a mix of the more generic types used in both spoken and written collocations,
e.g. no matter, no reason, no evidence, as well as some more specific to speech. All three
corpora show the impacts of sporting commentary on the inventory of spoken collocations, with the British no maidens, no score matching the New Zealand cricketing
terms no run, no ball, and contrasting with the Australian no injuries, from Australian
football commentary. The rankings for no idea and no way are notably higher in speech
than writing, in keeping with their rather conspicuous role as conversational boilerplate
in all three varieties, as shown in the following sets:
(4) No way! What would you want to visit us for [ICE-AUS S1A-017:251]
(5) No no way not for a million dollars [ICE-NZ S1A-021:265]
(6) C: Oh no way [ICE-GB:S1A-046 #296]
A: No [ICE-GB S1A-046:306]
This boilerplate use of no way provides conversationalists in all three varieties with an
emphatic means of negating a topic of discourse, the New Zealand example showing
also the use of multiple negation. No idea provides a similarly categorical negative, not
usually duplicated, but sometimes followed up with an explanation. Thus apart from
providing rhetorical emphasis, it serves a discoursal purpose in signaling the need for
further discussion of the issue.
(7) C: Where Do you know where he lives
B: No idea
C: ‘Cos there’s a double storey house next to the shops now
[ICE-AUS S1A-089:122-4]
(8) J: what days are people going up?
B: no idea
B: see it all depends on the weather [ICE-NZ S1A-042:277-81]
(9) A: No no no
A: No idea
B: No not a word of course [ICE-GB S1A-069:81-105]
Both no way and no idea are also regularly used in discretionary formulations with
complementary phrases and clauses as illustrated below. The examples for each are all
from spoken data, though they have their counterparts in written data.
(10) There’s no way it’s the sole cause of the U S deficit [ICE-AUS S2A-027:160]
(11) You’ve got no way of browsing for something as you would have on the
bookshelf looking it up rejecting it [ICE-NZ S1B-016:65]
(12) I mean what might seem radical in cinema is in no way radical in the theatre
[ICE-GB S1B-045:74]
(13) I had no idea that they were next to me [ICE-AUS S1B-063:336]
(14) bill in his innocence has no idea what’s going on and many a reader doesn’t
notice hints like these [ICE-NZ S2B-031:122]
(15) She’s got no idea what a bet is 〈,〉 [ICE-GB S2A-030:28]
The versatility of these particular no collocations is evident – and speakers of all
three varieties of English evidently use them in both ways. There are others like
them, e.g. no problem, no matter, no wonder, which are similarly able to be used
as both emphatic reaction signals and as the stuff of more extended articulation of
negative ideas. No worries serves both purposes now in AusE, though its appearance
at rank 20 in the spoken list and at the bottom of the written ICE-AUS list (with a
frequency of 1) suggests that its boilerplate use is more salient than its use in formulating ideas in writing.
The rest of the collocations listed in Table 3 are usually postmodified in some way:
no reason to/for, no point in, no right to, no time for/to, with a following word or phrase.
All are typically discretionary uses of no collocations, part of more extended spoken
and written discourse. For speakers of all three varieties, these very frequent no collocations are all available as syntactic resources, though some are also conversational
boilerplate. Clearly no contributes to both the grammar and the lexicon in each variety,
with substantial overlaps in the types of collocation available.
6. Relative frequency of no collocations and not any paraphrases
The majority of no collocations may be paraphrased by means of not any (noted
above in Section 3). Formulations with no are preferred in some cases, as Tottie
(1991: 221) suggests, but alternatives with not any are attested for many common collocations in the ICE data shown in Table 2, and their relative frequency in spoken and
written data may be an index of stylistic preferences or constraints on the use of
either construction. Table 4 below presents the frequencies of five high-ranking no
collocations that are paraphrased with not any in the three corpora, separating out
their grammatical uses and frequencies in spoken and written discourse.
Table 4. Some no collocations and their variants with not any, with raw frequencies and
normalizations per 1 million words (plus overall totals).
                                 ICE-AUS                         ICE-NZ                          ICE-GB
                                 spoken  norm    written norm    spoken  norm    written norm    spoken  norm    written norm
                                         x1.67           x2.5            x1.67           x2.5            x1.67           x2.5
no idea (NP)                     24      40.08   4       10      15      25.05   7       17.5    18      30.06   2       5
not any idea (NP)                4       6.68    1       2.5     8       13.36   0       0       7       11.69   1       2.5
no good (NP, Adj.P)              16      26.72   3       7.5     8       13.36   2       5       13      21.71   5       12.5
not any good (NP)                5       8.35    1       2.5     6       10.02   2       5       6       10.02   2       5
no longer (Adv.P)                24      40.08   41      102.5   30      50.1    44      110     38      63.46   41      102.5
not any longer (Adv.P)           5       8.35    2       5       3       5.01    0       0       5       8.35    3       7.5
no doubt (NP)                    20      33.4    14      35      6       10.02   8       20      16      26.72   9       22.5
not any doubt (NP)               5       8.35    2       5       1       1.67    5       12.5    2       3.34    1       2.5
no doubt (Adv.P)                 15      25.05   15      37.5    12      20.04   31      77.5    23      38.41   30      75
no more (NP, Adj.P, Adv.P)       24      40.08   20      50      19      31.73   21      52.5    14      23.38   35      87.5
not any more (NP, Adj.P, Adv.P)  44      73.48   13      32.5    35      58.45   23      57.5    42      70.14   27      67.5
total: no collocations           123             97              90              113             122             122
total: not any paraphrases       63              19              53              30              62              34
total: (all constructions)       186             116             143             143             184             156

The overall totals for the no collocations shown in Table 4 are substantially
greater than those of their not any paraphrases. This is also true individually for the
first four pairs of collocations, but not for the fifth, where the not any paraphrase
has the greater overall frequency (spoken and written combined) in all three corpora. The results for
the first four show the effect of the no collocation being the “preferred” collocation
(Tottie 1991: 221) which resists paraphrase. This effect does not show up for the fifth
pair, which is much more commonly paraphrased.
The distribution of no collocations in Table 4 is not particularly weighted
towards speech or writing; whereas that of not any is weighted towards speech in all
three corpora, and in fact there are no examples of not any idea or not any longer in
ICE-NZ written data. This tendency for not any paraphrases to be used in speech is
most pronounced in the fifth pair of results (for not any more), and probably reflects its
ability to serve as nominal, adjectival and adverbial phrase. Compare the grammatical
specialization for no collocations within the first four pairs/sets of collocations, where
those which are used as nominal phrases (no idea, no good, no doubt) are usually more
frequent in speech, and those used as adverbial phrases (no doubt, no longer) tend to
be more frequent in writing.
These data suggest that the high-frequency no collocations with a single syntactic
role are able to hold their place in both speech and writing, and are effectively the more
salient form. Meanwhile the not any paraphrases are a ready alternative for the polyfunctional and less salient collocations, especially in speech. There the ability of not
any to replace no is most evident. But in both mediums of discourse, no and not any
collocations coexist, confirming that the use of the no collocations is not idiomatically
fixed but still often discretionary.
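
The weighting of not any towards speech can be checked from the totals at the foot of Table 4. The following Python sketch is illustrative only: the figures are those of Table 4, and the names TABLE_4_TOTALS and not_any_share are hypothetical.

```python
# (no collocations, not any paraphrases) totals from the bottom rows of Table 4.
TABLE_4_TOTALS = {
    ("ICE-AUS", "spoken"): (123, 63),
    ("ICE-AUS", "written"): (97, 19),
    ("ICE-NZ", "spoken"): (90, 53),
    ("ICE-NZ", "written"): (113, 30),
    ("ICE-GB", "spoken"): (122, 62),
    ("ICE-GB", "written"): (122, 34),
}


def not_any_share(no_count, not_any_count):
    """Percentage of all constructions that are realized with not any."""
    return round(100 * not_any_count / (no_count + not_any_count), 1)


shares = {key: not_any_share(*counts) for key, counts in TABLE_4_TOTALS.items()}

for (corpus, medium), share in shares.items():
    print(f"{corpus} {medium}: {share}% not any")
```

In each corpus the share of not any is larger in the spoken than in the written totals, while no remains the majority choice in both mediums.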
7. Freshly created no collocations
The high-frequency collocations discussed in Section 6 are the common lexical and
syntactic property of all native-speaker varieties. Let us now consider the no collocations at the lower end of the frequency lists and not discussed so far. We may
expect them to be relatively hard to paraphrase with not any because they are the
purposeful creations of individual speakers and writers in particular discoursal contexts. They represent speakers/writers responding with the full syntactic resources
of English for expressing the negative, subject to the needs of their subject matter
and the style of discourse they are generating. First let us estimate the quantities of
these individual uses of no in each variety by subtracting all instances of the most
frequent collocations of no listed in Table 3 (in the Appendix) from the totals of no collocations
other than simple reaction signals, i.e. the totals shown in Table 1.
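
The subtraction just described can be sketched in Python, using the spoken and written figures reported in Table 5. The variable names are hypothetical and serve only to make the arithmetic explicit.

```python
# All no collocations (minus reaction signals), as reported in Table 5.
ALL_COLLOCATIONS = {
    "ICE-AUS": {"spoken": 674, "written": 551},
    "ICE-NZ": {"spoken": 630, "written": 823},
    "ICE-GB": {"spoken": 705, "written": 672},
}

# Tokens of the high-frequency no collocations, as reported in Table 5.
HIGH_FREQUENCY = {
    "ICE-AUS": {"spoken": 309, "written": 210},
    "ICE-NZ": {"spoken": 267, "written": 266},
    "ICE-GB": {"spoken": 312, "written": 271},
}

# What remains are the low-frequency, more individual uses of no.
low_frequency = {
    corpus: {
        medium: ALL_COLLOCATIONS[corpus][medium] - HIGH_FREQUENCY[corpus][medium]
        for medium in ("spoken", "written")
    }
    for corpus in ALL_COLLOCATIONS
}

grand_total = sum(sum(media.values()) for media in low_frequency.values())

for corpus, media in low_frequency.items():
    print(corpus, media, "total:", sum(media.values()))
print("all corpora:", grand_total)
```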
Table 5. Comparative frequencies of low frequency no collocations in the three corpora
(spoken and written).
                                              ICE-AUS                    ICE-NZ                     ICE-GB                     Total
                                              spoken   written  total    spoken   written  total    spoken   written  total
all no collocations (minus reaction signals)  674      551      1225     630      823      1453     705      672      1377     4055
high frequency no collocations                309      210      519      267      266      533      312      271      583      1635
low frequency no collocations                 365      341      706      363      557      920      393      401      794      2420

Once all the reaction signals and high-frequency no constructions have been
removed, the more individual uses in each variety come to light. The bottom line
of Table 5 includes just a few no collocations which are multiple tokens of the same
type (up to three tokens in spoken data and up to two in written data), but most are
single tokens of a single type. The table shows that although the totals from each
variety are rather different, there are comparable numbers in spoken data from all
three corpora. The following are typical examples from spontaneous conversation
formed with existential be and stative have but with highly individual no collocations
embedded in them:
(16) So there are no right angles where water can gather [ICE-AUS S2A-054:75]
(17) there are no ideological spectacles sufficiently rose coloured to protect partisan
illusions [ICE-NZ S2B-027:27]
(18) But if there are no such standards only choices then moral language becomes
an anomaly [ICE-GB S2B-029:117]
(19) The course that you’re providing has no substance for them whatsoever
[ICE-AUS S1B-059:88]
(20) and we had no predetermined list of what those social services benefits as such
nor the g r i [ICE-NZ S2A-047:6-7]
(21) The city has no river and must bear the heavy cost of pumping water in
[ICE-GB S2B-022:116]
Examples formulated with existential be and stative have are relatively common, as
might be expected from previous research (Tottie 1991: 194; Biber et al. 1999: 172).
In fact they constitute 17% and 13% respectively of the totals of no collocations
in the spoken and written data from the three ICE corpora, although their impact
on the differences between spoken and written data from the three varieties is not
statistically significant (Peters 2008: 157). They have therefore been included with
the total frequencies of no collocations in the spoken and written data shown in
Table 5 above. But when the relative frequencies of all these fresh no collocations in the
written data from each corpus are compared, statistically significant (p < 0.001)
differences between spoken and written frequencies can be confirmed. This suggests there
are regional differences not attributable to the stock uses or syntactically conditioned
examples of no.
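
The chapter does not name the statistical test behind this result. Purely as an illustration, and on the assumption that a chi-square test of independence is an appropriate choice, the following Python sketch applies one to the low-frequency spoken and written counts of Table 5. The function chi_square and the variable names are hypothetical; the critical value 13.816 is the standard chi-square threshold for 2 degrees of freedom at p = 0.001.

```python
# Low-frequency no collocations (spoken, written) from Table 5.
observed = {
    "ICE-AUS": (365, 341),
    "ICE-NZ": (363, 557),
    "ICE-GB": (393, 401),
}


def chi_square(table):
    """Pearson chi-square statistic for a corpus-by-medium contingency table."""
    rows = list(table.values())
    row_totals = [sum(row) for row in rows]
    col_totals = [sum(col) for col in zip(*rows)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for r, row in enumerate(rows):
        for c, count in enumerate(row):
            # Expected count if medium were independent of corpus.
            expected = row_totals[r] * col_totals[c] / grand_total
            statistic += (count - expected) ** 2 / expected
    return statistic


# Critical value for (3 - 1) * (2 - 1) = 2 degrees of freedom at p = 0.001.
CRITICAL_DF2_P001 = 13.816

statistic = chi_square(observed)
significant = statistic > CRITICAL_DF2_P001
print(f"chi-square = {statistic:.2f}, significant at p < 0.001: {significant}")
```

Under this assumed test the statistic lies well above the critical value, which is at least consistent with the level of significance reported above. The largest contribution comes from the ICE-NZ row, where written tokens far exceed spoken ones.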
From the data presented in Table 5, it is clear that New Zealand writers make more
extensive and creative use of no collocations than their counterparts writing in the other
two varieties (AusE, BrE). The New Zealand written data offers numerous examples of
effective use of no collocations to articulate negative aspects of the imaginative world
being created:
(22)The children collapsed and slept with no urging. Rihi did not.
[ICE-NZ W2F-002:300-301]
(23)No gnomes or trolls or wood-sprites; and no pantheistic transports. Sensuous
things only, please, she asked. [ICE-NZ W2F-016:291-2]
The first example shows the use of no at its most succinct, the second its value in
more expansive phrasing where the repeated negatives make for the coherence of the
two very disparate coordinates. Collectively they create an emphatic negative polarity which contrasts with the less emphatic follow-up in the affirmative. Clearly there’s
no possibility of paraphrasing the use of no with not any in these larger NPs, and the
writer exploits the full force of no as determiner.
These are tokens of the many effective no collocations to be found in New Zealand fiction, the creations of published authors. They are not however a purely literary
resource, since fresh examples are also readily found in New Zealand nonfiction writing,
by authors little known outside their immediate professional contexts:
(24) Where the fire is burning there are no tracks. [ICE-NZ W2C-001:139]
(25)He said there would be no witch hunts, no pointing of the finger and no
disclosure of the names by the Government. [ICE-NZ W2C-006:98]
These examples, extracted from the press section of ICE-NZ (W2C), show that skill
in using no collocations is not confined to literary genres. It correlates with the fact
that the usage of no is higher in all registers of New Zealand writing than in the
equivalents for BrE, AmE and AusE (Peters 2008: 159). Within the ICE data, a greater
quantity of tokens and types of no collocations can be found in ICE-NZ than in ICE-AUS or ICE-GB. They confirm that the use of no as determiner is still a very lively
element of NZE grammar, and though it is available in the other varieties, its stylistic
and rhetorical value is less often exploited. This correlates with the stronger trend
towards replacement of no with not any in both AusE and BrE.
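The distinction drawn above between tokens and types of no collocations can be made concrete with a small counting routine. The following Python sketch is an illustration only, not the procedure used in the study: it simply pairs each instance of no with the word that follows it, ignoring punctuation and syntactic role, so it would also pick up reaction signals and adverbial uses. The function name no_collocations and the sample text (adapted from examples (24) and (25)) are assumptions made for the illustration.

```python
import re
from collections import Counter


def no_collocations(text):
    """Count each 'no + following word' pair in a text, ignoring case."""
    # Crude tokenization: runs of letters, apostrophes and hyphens.
    words = re.findall(r"[a-z'-]+", text.lower())
    # Pair every word with its successor and keep the pairs led by "no".
    return Counter(f"no {nxt}" for cur, nxt in zip(words, words[1:]) if cur == "no")


# Hypothetical sample, adapted from examples (24) and (25).
sample = (
    "Where the fire is burning there are no tracks. "
    "He said there would be no witch hunts, no pointing of the finger "
    "and no disclosure of the names by the Government."
)
counts = no_collocations(sample)
tokens = sum(counts.values())  # every instance of a no collocation
types = len(counts)            # distinct no collocations
print(counts.most_common(), tokens, types)
```

A corpus with many fresh collocations would show a high ratio of types to tokens, whereas one dominated by stock phrases such as no doubt or no longer would show a low one.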
8. Conclusions
In this analysis of the roles of no in collocation, the widest range is associated with its
use within nominal rather than adverbial phrases. In both there is some fossilization
of collocations into conversational boilerplate, to provide emphatic negation and
perhaps avoid recourse to the more stigmatized forms of multiple negation. The study
also found substantial discretionary use of some high frequency no collocations in
spoken discourse, where they are still favored over the not any alternatives. There is
also plenty of evidence of other creative uses of no in fresh collocations especially in
writing, which are the ultimate demonstration of its status as a syntactic resource.
Discretionary and creative uses of no expressions abound in NZE, in comparison
with both AusE and BrE. Perhaps this is a case of colonial lag: that they are better
preserved and exercised there while the expression of negation is elsewhere changing
in favor of not. The fact that these no collocations are most abundant in New Zealand
writing might suggest that register differentiation is stronger there than in the other
two varieties. The presence of local sociolinguistic differentiation has been associated
with the final stage (Stage V) in the evolution of new varieties of English (Schneider
2003, 2007), and if fresh register differentiation also occurs at that stage, it could mean
that NZE is more fully evolved than AusE in this respect. Otherwise this differentiation bespeaks the preservation of an older stylistic resource in the more controllable
medium of writing, where it can resist the rival everyday construction of negation with
not. This would align it with the most likely explanation of other details of NZ written
style, e.g. the use of the gerund-participle (see Peters 2009), in which NZE also seems
to preserve an older pattern of register differentiation.
All in all AusE emerges from this study as more advanced than NZE in terms of
replacement of no-negation with not-negation. It makes more use of no in reaction
signals and in boilerplate, and less by way of discretionary no collocations that reflect
its use as a syntactic resource. The two southern hemisphere varieties are thus on
opposite sides of the cusp in lexicogrammatical terms.
References
Biber, Douglas, Stig Johansson, Geoffrey Leech, Susan Conrad & Edward Finegan. 1999. Longman
Grammar of Spoken and Written English. Harlow: Longman.
Bybee, Joan. 2006. “From usage to grammar: the mind’s response to repetition”. Language 82 (4):
711–33.
Collins, Peter & Pam Peters. 2004. “Australian English morphology and syntax”. In
Kortmann et al. (eds): 592–610.
Dictionary of New Zealand English. 1997. ed. Harry Orsman. Auckland: Oxford University Press.
Eeg-Olofsson, Mats & Bengt Altenberg. 1996. “Recurrent word combinations in the London
Lund corpus: Coverage and use of word-class tagging”. In Carol Percy, Charles Meyer &
Ian Lancashire (eds), Synchronic Corpus Linguistics, 112–23. Amsterdam: Rodopi.
Huddleston, Rodney and Geoffrey Pullum. 2002. Cambridge Grammar of the English Language.
Cambridge: Cambridge University Press.
Hundt, Marianne. 1998. New Zealand English Grammar: Fact or Fiction. Amsterdam: John
Benjamins.
Jespersen, Otto. 1917. “Negation in English and other languages”. Repr. 1962 in Selected Writings
of Otto Jespersen, 3–151. London: Allen and Unwin.
Kortmann, Bernd, Edgar W. Schneider, Rajend Mesthrie and Kate Burridge. 2004. Handbook of
Varieties of English. 2 vols. Berlin: Walter de Gruyter.
Kortmann, Bernd & Benedict Szmrecsanyi. 2004. “Global synopsis: morphological and syntactic variation in English”. In Kortmann et al. (eds): 1142–1202.
Labov, William. 1972. Language in the Inner City. Philadelphia PA: University of Pennsylvania
Press.
Leech, Geoffrey, Paul Rayson & Andrew Wilson. 2002. Word Frequencies in Written and Spoken
English. London: Longman.
Leonard, Sterling A. 1962. The Doctrine of Correctness in English Usage 1700–1800. New York
NY: Russell and Russell.
Pawley, Andrew. 2004. “Australian Vernacular English: Some grammatical characteristics”. In
Kortmann et al. (eds): 611–642.
Peters, Pam. 2004. Cambridge Guide to English Usage. Cambridge: Cambridge University Press.
Peters, Pam. 2008. “Patterns of negation. The relationship between NO and NOT in regional varieties of English”. In Terttu Nevalainen, Irma Taavitsainen, Päivi Pahta & Minna Korhonen
(eds), The Dynamics of Linguistic Variation, 147–62. Amsterdam: Rodopi.
Peters, Pam. 2009. “Personal pronouns in spoken grammar”. In M. Moberg et al., Festschrift
for KA. University of Gothenburg Press.
Quirk, Randolph, Sidney Greenbaum, Geoffrey Leech & Jan Svartvik. 1985. A Comprehensive
Grammar of the English Language. London: Longman.
Rayson, Paul, Geoffrey Leech & Mary Hodges. 1997. “Social differentiation in the use of
English words. Some analyses of the conversational component of the BNC”. International
Journal of Corpus Linguistics 2(1): 133–52.
Schneider, Edgar W. 2003. “The dynamics of new Englishes: from identity construction to
dialect birth”. Language 79 (2): 233–81.
Schneider, Edgar W. 2007. Postcolonial English. Varieties of English around the World. Cambridge:
Cambridge University Press.
Tottie, Gunnel. 1991. Negation in Speech and Writing. San Diego CA: Academic Press.
Appendix
Table 3. Frequencies for the most common no collocations found in spoken and written data.

Spoken data
Rank  ICE-AUS spoken         ICE-NZ spoken          ICE-GB spoken
1     no doubt          35   no longer         31   no doubt          49
2     no way            27   no more           19   no longer         38
3     no more           24   no doubt          19   no idea           18
4     no idea           24   no way            17   no problem        15
5     no longer         24   no idea           15   no more           14
6     no good           16   no matter         14   no good           13
7     no reason         15   no reason         13   no way            13
8     no problem        14   no wonder         12   no reason         11
9     no matter         14   no run            11   no further        10
10    no other          11   no good            8   no sign            9
11    no money          10   no problem         8   no other           8
12    no point           9   no other           8   no claims          8
13    no question        9   no need            7   no means           8
14    no time            8   no ball            7   no point           7
15    no injuries        6   no money           6   no matter          7
16    no evidence        5   no evidence        6   no evidence        7
17    no knowledge       5   no chance          6   no time            6
18    no stage           5   no use             6   no wonder          6
19    no better          5   no room            5   no question        6
20    no worries         5   no fewer           5   no difference      6
21    no wonder          5   no time            4   no score           6
22    no right           5   no point           4   no confidence      5
23    no need            4   no means           4   no intention       5
24    no means           4   no interest        4   no less            5
25    no thanks          4   no property        4   no right           4
26    no room            4   no jobs            4   no accidents       4
27    no shoes           4   no alternative     4   no news            4
28    no trouble         4   no further         4   no excuse          4
29    no relationship    4   no question        4   no choice          4
30                           no sign            4   no food            4
31                           no deposit         4   no threat          4
32                                                  no maidens         4
Total                  309                    267                    312

Written data
Rank  ICE-AUS written        ICE-NZ written         ICE-GB written
1     no longer         41   no longer         44   no longer         41
2     no doubt          28   no doubt          39   no doubt          39
3     no more           20   no more           21   no more           35
4     no matter         11   no matter         15   no less           12
5     no less            9   no other          11   no need           11
6     no other           8   no time           11   no reason         11
7     no need            8   no means          11   no means          11
8     no way             8   no reason          8   no other           8
9     no time            6   no way             8   no further         7
10    no evidence        6   no wonder          8   no time            6
11    no choice          6   no need            7   no room            6
12    no room            5   no idea            7   no evidence        6
13    no means           5   no less            7   no trouble         6
14    no good            4   no use             5   no choice          5
15    no better          4   no problems        4   no good            5
16    no stopping        4   no power           4   no way             4
17    no further         4   no further         4   no part            4
18    no idea            4   no surprise        4   no sound           4
19    no limit           4   no people          3   no matter          4
20    no point           4   no evidence        3   no account         4
21    no work            3   no point           3   no exception       4
22    no problem         3   no difference      3   no feedback        4
23    no right           3   no problem         3   no attempt         4
24    no reason          3   no job             3   no sign            4
25    no place           3   no Maori           3   no substitute      4
26    no attempt         3   no sign            3   no wonder          4
27    no option          3   no female          3   no control         3
28                           no grounds         3   no point           3
29                           no harm            3   no system          3
30                           no alternative     3   no alternative     3
31                           no answers         3   no difference      3
32                           no chance          3   no question        3
33                           no damage          3
34                           no option          3
Total                  210                    266                    271
Zero complementizer, syntactic context,
and regional variety
Kate Kearns
University of Canterbury
This paper presents empirical findings on the alternation between that and
zero complementizer in a range of syntactic environments, including clausal
complement to a verb with or without an intervening indirect object or adverbial,
complement to an adjective, complement to a noun, it-extraposition sentences,
and cleft sentences. The data were taken from British, United States, Australian
and New Zealand newspapers. It is shown that Australian and New Zealand
English have significantly higher rates of zero complementizer than American
and British English, and that the effect of syntactic context on zero rates differs
across regional varieties. In particular, New Zealand and Australian English show
little or no inhibition of zero in contexts where the complementizer position is
not adjacent to a potentially licensing lexical head. New Zealand and Australian
English also show comparatively high zero rates in the complements to nouns,
but no general syntactic patterns (such as light verb constructions) were found
to be involved here. Instead, the higher rates of zero in noun complement clauses
appear to be associated with particular collocations.
1. Introduction
This article reports findings on the alternation between that and zero complementizer
in (non-relative) embedded clauses in various syntactic environments. First, I review
the alternation in syntactic environments chosen to represent general predictions of
formal syntactic theory, using data from American, Australian, British and New Zealand
newspapers. I show that syntactically determined variations in the rate of zero complementizer also vary across the main regional varieties. In the second part of the article I
address zero-promoting patterns in the clausal complements to nouns, where syntactic
theory stipulates zero complementizer to be ungrammatical. I present evidence that
certain routine collocations are associated with high rates of zero complementizer in
NZE and AusE.
2. The predictions of syntactic theory
Space does not allow a full discussion of the theoretical frameworks within which predictions arise concerning the distribution of zero complementizer, so my remarks here are
mainly confined to reviewing the predictions in descriptive terms. It is generally agreed
that zero complementizer is grammatical in principle in the complements to verbs and
adjectives as in (1)–(2), but ungrammatical in the complements to nouns, as in (3) (see
for example Stowell 1981; Bošković & Lasnik 2003; Pesetsky & Torrego 2004). In pre-Minimalist accounts of generative grammar, the licensing of zero complementizer as in
(1)–(2) is analyzed in terms of both thematic selection of the clause by a major lexical
head, and the fact that the head is adjacent to the clause – thus selection by and adjacency
to a lexical head are general factors in the licensing of zero complementizer Ø.
(1) He decided Ø the colour was too dark.
(2) She was afraid Ø the cat would get lost.
(3) *He bridled at the suggestion Ø his budget was unreasonable.
A verb complement clause is not adjacent to the selecting verb where an indirect object or
adverbial intervenes, as illustrated in (4)–(5). However, zero complementizer is observed
to be grammatical in the former case but not in the latter. Stowell (1981) and
Bošković and Lasnik (2003) both propose that verbs which take
indirect objects constitute a special class which can either license zero complementizer
without adjacency (Stowell) or select a clause that lacks the complementizer position
altogether (Bošković and Lasnik). For other verbs, as in (5), the lack of adjacency between
the verb and embedded clause prevents the licensing of zero complementizer.
(4) They told the team Ø the bus would pick them up later.
(5) *They announced the other day Ø a bus would be hired.
The requirement that the zero-licensing head be a major lexical category is demonstrated in various constructions where the embedded clause is presumably selected
by the copula and generally adjacent to it, but zero complementizer is nevertheless
considered to be ungrammatical. This is attributed to the minor lexical category status
of the copula (Bošković & Lasnik 2003: 529, fn. 5).
(6) *What I heard was Ø the tickets were discounted.
(7) *The trouble was Ø the tickets were really expensive.
(8) *It could be Ø we’ll see them at the concert.
Finally, it-subject constructions represent a context in which the embedded clause is
adjacent to a major lexical head which does not select it, as illustrated in (9). Assumptions
concerning grammaticality differ with this construction: although zero complementizer
in an it-subject construction was earlier considered to be ungrammatical, Bošković and
Lasnik find it grammatical, and propose that the main predicate in an it-subject construction (pity in (9)) is able to license an adjacent zero complementizer, despite not
selecting the embedded clause.
(9) It’s a pity Ø the whole crowd won’t be there.
The main syntactic contexts and the general zero-licensing factors they represent are
summarized in Table 1.
Table 1. Syntactic construction types
Complement to verb
(without indirect object or
intervening adverbial)
Complement to adjective
Complement to verb
preceded by indirect object
Extraposed complement
to verb (i.e. adverbial
intervenes)
It-subject construction
Copula construction
Clause is selected Clause is adjacent
complement to
to a head
a head
Selecting and/or
adjacent head is major
lexical category
Yes
Yes
Yes
Yes
Yes
Yes
No
Yes
Yes
Yes
No
Yes
No
No
Yes
Yes (except some
wh-cleft)
Yes
No
3. Non-syntactic factors in the occurrence of zero complementizer
Previously studied non-syntactic influences on the rate of zero complementizer include
genre, style, and channel, the form of the embedded subject, elements intervening
between the matrix verb and the subject of the embedded clause, and characteristics
of the matrix verb.
In general, a lower degree of formality (informal genre/style, spoken rather than
written channel) correlates with a higher zero rate (Bolinger 1972: 22; Burchfield
1996: 773, para 2; Elsness 1984: 519; Finegan & Biber 1995: 256, fn. 6; Huddleston &
Pullum 2002: 953; Poutsma 1929: 615; Quirk, Greenbaum, Leech, & Svartvik 1985: 734–5;
Storms 1966: 262).
A pronominal subject in the embedded clause promotes zero in comparison to a full
noun phrase (NP) subject (Elsness 1984; Finegan & Biber 1995; Hawkins 2003; Rissanen
1991; Rohdenburg 1999; Roland, Elman, and Ferreira 2005; Thompson & Mulac 1991).
Several authors have also noted that a long NP subject inhibits zero more than a shorter
NP (Elsness 1984; Hawkins 2003; Roland et al. 2005).1
A lower zero rate correlates with elements intervening between the matrix verb
and the subject of the embedded clause (Bolinger 1972; Elsness 1984; Finegan & Biber
1995; Hawkins 2003; Jespersen 1928; McDavid 1964; Rissanen 1991; Rohdenburg
1999; Tagliamonte & Smith 2005; Thompson & Mulac 1991). These elements include
indirect objects (see (4) above), matrix adverbials (see (5) above) and clause-initial
adverbials in the embedded clause, as in (10).
(10)She acknowledged that had the alarm been activated there would have been
no response.
As reviewed in the previous section, indirect objects and “extraposed” verb complement clauses characterize two of the contexts to be examined. A clause-initial
adverbial in the embedded clause was found to inhibit zero significantly (zero rate
in subject-initial verb complement clauses = 55%, zero rate in adverb-initial verb
complement clauses = 8%, p ≤ 0.0000, Pearson’s GFX). To control for this effect,
tokens of adverb-initial embedded clauses were excluded from the data.
Individual verbs are known to have different rates of zero complementizer, and
to some extent higher-frequency verbs appear with higher zero rates. Roland et al.
(2005), in a study of a very large corpus (142 956 tokens), found that verb frequency
was the weakest factor of those studied in predicting zero rates, but was still significant. In the present study the verb say was found to be by far the most frequent (38%
of all verb complement tokens) and had a very high zero rate of 90% (cf. the average
non-say verb complement zero rate of 57%). Say was not evenly distributed across the
regional corpora, and with a zero rate of 90% is not a good exemplar of alternation,
as it appears that zero is almost categorical for this verb. It was decided to remove say
tokens from further study to exclude the very strong zero-promoting effect.
To sum up, the non-syntactic factors to be further taken into account are the degree
of formality, the length of the embedded subject, and the frequency of the verb.
4. Description of the corpus and results for general syntactic factors
The study of general syntactic factors in zero rates is based on a corpus comprising the
first 1000 tokens of embedded, non-relative, finite clauses without wh-gaps encountered
in a search of each of the Boston Globe (US), the Los Angeles Daily Times (US), The Age
(Australia), the Daily Telegraph (Australia), the Guardian (UK), the Daily Telegraph (UK),
the Christchurch Press (NZ), and the Sunday Star Times (NZ) in the period 2002–4. For
each embedded clause the full containing sentence was collected. Data collection and
initial coding were carried out by research assistants and all codes were checked by the
author. Tokens representing the constructions discussed above were selected (excluding,
for example, conjoined clauses and clauses in apposition).
1. It has also been proposed that the topicality or referentiality of the subject may affect zero
rates. These potential factors were examined and found not to be significant in the corpus
used in the present study – for discussion see Kearns (2007).
The data were coded for three broad subtypes (within the genre of broadsheet newspaper prose): letters to the editor, news and columns, and directly quoted speech. A
preliminary check showed that these subtypes had different zero rates consistent with
expectations based on formality. Quoted speech, as the most informal, had an overall zero
rate of 65%, news and columns at an intermediate level of formality had a 58% zero rate,
and letters to the editor, representing the highest level of formality, had a 27% zero rate.
Letters and quoted speech, which together comprised only 23% of the data, were removed
from further analysis to exclude their strong zero-inhibiting and zero-promoting effects.
The remaining data (3623 tokens) were coded for the regional source (Australia,
Britain, New Zealand, United States), the form of the embedded subject (pronoun;
short NP = 1–2 words; long NP = 3+ words), and the construction type (complement
to verb other than say, complement to adjective, complement to noun, verb complement with an indirect object, extraposed complement to verb, it-subject construction,
copula construction).
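The coding scheme just described can be pictured as a simple record per token. The Python sketch below is a hypothetical illustration, not the coding tool used in the study: the names CodedToken, subject_form and zero_rate are invented here, and the sample tokens reuse the constructed examples (1), (4) and (9) with arbitrarily assigned regions. Only the category labels and the word-count thresholds (short NP = 1–2 words, long NP = 3+ words) come from the text.

```python
from dataclasses import dataclass

REGIONS = ("Australia", "Britain", "New Zealand", "United States")
CONSTRUCTIONS = (
    "verb complement",  # verbs other than say
    "adjective complement",
    "noun complement",
    "verb complement with indirect object",
    "extraposed verb complement",
    "it-subject construction",
    "copula construction",
)


def subject_form(subject, is_pronoun):
    """Classify an embedded subject as pronoun, short NP (1-2 words) or long NP (3+ words)."""
    if is_pronoun:
        return "pronoun"
    return "short NP" if len(subject.split()) <= 2 else "long NP"


@dataclass(frozen=True)
class CodedToken:
    region: str
    construction: str
    subject: str
    subject_is_pronoun: bool
    zero: bool  # True for zero complementizer, False for that

    def __post_init__(self):
        # Reject codes outside the scheme described in the text.
        if self.region not in REGIONS or self.construction not in CONSTRUCTIONS:
            raise ValueError("unknown region or construction type")

    @property
    def form(self):
        return subject_form(self.subject, self.subject_is_pronoun)


def zero_rate(tokens):
    """Proportion of tokens that have zero complementizer."""
    tokens = list(tokens)
    return sum(t.zero for t in tokens) / len(tokens)


# Hypothetical tokens; the region values are arbitrary.
sample = [
    CodedToken("New Zealand", "verb complement", "the colour", False, True),
    CodedToken("Australia", "it-subject construction", "the whole crowd", False, True),
    CodedToken("Britain", "verb complement with indirect object", "the bus", False, False),
]
print([t.form for t in sample], zero_rate(sample))
```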
The results of a logistic regression analysis using GoldVarb (2001) software
(Cedergren & Sankoff 1974) are shown in Table 2. All factors were significant.
Table 2. Results of logistic regression analysis
                                                         No. of tokens  Zero rate  Factor weight
Construction type  Verb complement                            2179         55%         0.638
                   Adjective complement                        176         54%         0.647
                   Verb complement with indirect object        207         48%         0.501
                   Extraposed verb complement                  271         29%         0.321
                   It-subject construction                     106         19%         0.246
                   Noun complement                             404         16%         0.201
                   Copula construction                         206          8%         0.095
Embedded subject   Pronoun                                    1217         62%         0.703
                   Short NP                                   1248         40%         0.447
                   Long NP                                    1084         29%         0.327
Regional source    NZE                                         854         62%         0.698
                   AusE                                        885         51%         0.621
                   AmE                                         739         41%         0.454
                   BrE                                        1071         26%         0.286
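Because the three factor groups in Table 2 cross-classify the same tokens, the token counts in each group should sum to the same total, and the token-weighted zero rate should agree across groups up to rounding. The short Python sketch below checks this arithmetic. It is offered as a reader's aid rather than part of the original analysis: the figures are transcribed from Table 2, and the helper names total_tokens and pooled_zero_rate are invented for the illustration.

```python
# Rows transcribed from Table 2: (factor, number of tokens, zero rate in %).
TABLE_2 = {
    "construction type": [
        ("verb complement", 2179, 55),
        ("adjective complement", 176, 54),
        ("verb complement with indirect object", 207, 48),
        ("extraposed verb complement", 271, 29),
        ("it-subject construction", 106, 19),
        ("noun complement", 404, 16),
        ("copula construction", 206, 8),
    ],
    "embedded subject": [
        ("pronoun", 1217, 62),
        ("short NP", 1248, 40),
        ("long NP", 1084, 29),
    ],
    "regional source": [
        ("NZE", 854, 62),
        ("AusE", 885, 51),
        ("AmE", 739, 41),
        ("BrE", 1071, 26),
    ],
}


def total_tokens(rows):
    """Sum the token counts of one factor group."""
    return sum(n for _, n, _ in rows)


def pooled_zero_rate(rows):
    """Token-weighted mean zero rate (approximate, since the percentages are rounded)."""
    return sum(n * pct for _, n, pct in rows) / (100 * total_tokens(rows))


for group, rows in TABLE_2.items():
    print(group, total_tokens(rows), round(pooled_zero_rate(rows), 3))
```

With the figures as printed, each group sums to 3549 tokens and the pooled zero rate comes out at about 44% in all three groups.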
As the results in Table 2 show, the differences in zero rates across constructions
do not reveal a categorical distinction between grammatical and ungrammatical environments for zero complementizer, but nevertheless the ranking of constructions by
zero rate does reflect the general syntactic licensing factors discussed previously. The
canonical licensing environment for zero is in the complement to a verb or adjective
with no element intervening between the verb or adjective and the embedded subject –
the embedded clause is selected by and adjacent to a major lexical head which is [+V].
These constructions have the highest overall zero rate of 55%. The lowest zero rate of 8%
occurs in copula constructions, where the embedded clause is neither selected by nor
adjacent to a major lexical head, the copula being a minor lexical category. Constructions with intermediate values show embedded clauses which are selected by a major
lexical head but not adjacent to it (indirect objects and extraposed clauses), or adjacent
to a major lexical category but not selected by it (it-subject constructions). The exceptional construction is the noun complement clause – the embedded clause is selected
by and adjacent to a major lexical head, but it is stipulated that nouns cannot be zero-licensers despite this configuration. Given a fairly general agreement in the syntactic
literature that zero is ungrammatical in noun complement clauses, the fairly high zero
rate of 16% is unexpected, and not much lower than the 19% zero rate in it-subject constructions, where Bošković and Lasnik (2003) predict zero to be grammatical.
The effect of the embedded subject is fairly constant across constructions, as
shown in Figure 1, although the pronoun effect is more robust than the difference
between short and long NPs, which does not always appear.
[Chart: zero rate (%) by construction (copula construction, noun complement, it-subject, extraposed complement, indirect object, verb/adjective complement), plotted separately for long NP, short NP and pronoun subjects]
Figure 1. Subject effect on zero rate across constructions (all varieties)
Within constructions, the different zero rates across varieties reflect different
strengths of subject effect. The most orderly pattern appears in the complements to
verbs and adjectives (Figure 2), with a similar pattern in extraposed complements
(Figure 3), although the difference between long and short NPs is not consistent.
[Chart: zero rate (%) for BrE, AmE, AusE and NZE, plotted separately for long NP, short NP and pronoun subjects]
Figure 2. Zero rate in verb/adjective complement across varieties
[Chart: zero rate (%) for BrE, AmE, AusE and NZE, plotted separately for long NP, short NP and pronoun subjects]
Figure 3. Zero rate in extraposed complement across varieties
The higher overall zero rate in the NZE data for it-subject constructions (Figure 4)
and noun complement clauses (Figure 5) appears largely attributable to an increased
pronoun effect.
[Chart: zero rate (%) for BrE, AmE, AusE and NZE, plotted separately for long NP, short NP and pronoun subjects]
Figure 4. Zero rate in it-subject across varieties
[Chart: zero rate (%) for BrE, AmE, AusE and NZE, plotted separately for long NP, short NP and pronoun subjects]
Figure 5. Zero rate in noun complement across varieties
With indirect objects, the form of the indirect object affects zero rates in the same
way as (and in addition to) the form of the embedded subject. That is, zero is promoted
by a pronoun in indirect object and/or embedded subject position, and comparatively
inhibited by a full NP in indirect object and/or embedded subject position. Preliminary investigation (see Kearns 2007: 316–7 for further discussion) suggested that the
effect is broadly the same for indirect objects and embedded subjects, and additive.
The data fell into three divisions. Tokens with two pronouns had the highest zero rate
and data with two full NPs had the lowest zero rates. Tokens with one pronoun and
one NP had intermediate zero rates, regardless of whether the indirect object or the
embedded subject was the pronoun. Accordingly, the zero rates for these three divisions are shown in Figure 6. This is the only construction in which the overall AusE
zero rate is higher than the NZE rate.
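The three-way division used for indirect object constructions amounts to counting how many of the two positions are filled by a pronoun. A minimal Python sketch of that rule follows; the function name iobj_division is an assumption made for the illustration, and the labels echo those of Figure 6.

```python
def iobj_division(iobj_is_pronoun, subj_is_pronoun):
    """Assign an indirect-object token to a division by counting its pronouns."""
    pronouns = int(iobj_is_pronoun) + int(subj_is_pronoun)
    labels = (
        "subj NP & obj NP",                          # no pronoun: lowest zero rates
        "subj pron & obj NP or subj NP & obj pron",  # one pronoun: intermediate
        "subj pron & obj pron",                      # two pronouns: highest zero rates
    )
    return labels[pronouns]


# Example (4): "They told the team Ø the bus would pick them up later."
# Both the indirect object (the team) and the embedded subject (the bus) are full NPs.
print(iobj_division(iobj_is_pronoun=False, subj_is_pronoun=False))
```

Treating the two mixed cases as one division reflects the observation that it makes no difference which of the two positions holds the pronoun.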
Finally, the subject effects across varieties were separately calculated for the verb
complement say tokens, which were excluded from the general analysis. Here (Figure 7)
we see the subject effect disappearing as zero becomes categorical.
[Chart: zero rate (%) for AmE, BrE, NZE and AusE, plotted separately for three divisions: subj pron & obj pron; subj pron & obj NP or subj NP & obj pron; subj NP & obj NP]
Figure 6. Zero rate with indirect object across varieties
[Chart: zero rate (%) for BrE, AmE, AusE and NZE, plotted separately for long NP, short NP and pronoun subjects]
Figure 7. Zero rate in complement to say across varieties
The cross-regional comparisons reviewed above were checked for possible interference from the unequal distribution of frequent lexemes – particularly verbs and
adjectives – or of different forms of embedded subject across varieties. Calculations
on subsets of the data excluding the most frequent heads, selecting pronoun subjects
only, or excluding pronoun subjects, yielded the same regional rankings, indicating
that the regional comparisons were not significantly distorted by these other factors
(see Kearns 2007 for further discussion).
Having seen that the effect of the embedded subject is fairly constant across
regional varieties, I turn now to the role of the general syntactic factors of selection
by and adjacency to a major lexical head, as discussed above, with their representative constructions. Noun complement clauses are not included as the general factors are not predictive of zero rates in that construction, which is stipulated to be
a non-licensing environment for zero. Significant differences in zero rates between
constructions were calculated within each variety, grouping the constructions into
three divisions for each variety. The first division is characterized by the canonical
licensing environment for zero, which is the (adjacent) complement to a verb. Any
construction for which the zero rate was not significantly lower than that of verb
complements was placed in this group. The third division was characterized by the
canonical non-licensing environment for zero, the copula construction, in which
there is no adjacent major lexical head to serve as a potential licenser. Any construction for which the zero rate was not significantly higher than for copula constructions
was placed in this group. The intermediate division contained any construction with
a zero rate which was significantly lower than verb complements and significantly
higher than copula constructions. The summarized results are shown in Table 3 (for
more detail see Kearns 2007).
Table 3. Grouping of constructions by zero rates across regional varieties
        Zero rate not significantly   Intermediate   Zero rate not significantly
        lower than in verb                           higher than in copula
        complement                                   construction
AmE     verb adj                                     iobj extr itsub cop
BrE     verb adj                      iobj           extr itsub cop
AusE    verb adj iobj                 extr           itsub cop
NZE     verb adj iobj                 extr itsub     cop
Note: verb = verb complement (not extraposed, no indirect object); adj = adjective complement;
iobj = verb complement preceded by indirect object; extr = extraposed verb complement;
itsub = it-subject construction; cop = copula construction.
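The three-way grouping procedure described above can be sketched in Python as follows. This is a hedged illustration, not the calculation reported in the study. The counts are approximations recovered from the pooled figures in Table 2 (tokens multiplied by the rounded zero rate), whereas the study grouped constructions within each variety. The 0.05 threshold, the choice of an uncorrected Pearson chi-square test, the order in which the two conditions are checked, and the names chi2_p and division are all assumptions.

```python
import math

# Approximate pooled counts (zero tokens, all tokens), recovered from Table 2
# as tokens x rounded zero rate. The study itself grouped within each variety.
COUNTS = {
    "verb": (1198, 2179),
    "adj": (95, 176),
    "iobj": (99, 207),
    "extr": (79, 271),
    "itsub": (20, 106),
    "cop": (16, 206),
}
ALPHA = 0.05  # assumed significance threshold; the text does not state one


def chi2_p(zero_a, n_a, zero_b, n_b):
    """p-value of a Pearson chi-square test (df = 1, no continuity correction)."""
    observed = [[zero_a, n_a - zero_a], [zero_b, n_b - zero_b]]
    rows = [n_a, n_b]
    cols = [zero_a + zero_b, n_a + n_b - zero_a - zero_b]
    total = n_a + n_b
    chi2 = sum(
        (observed[i][j] - rows[i] * cols[j] / total) ** 2 / (rows[i] * cols[j] / total)
        for i in range(2)
        for j in range(2)
    )
    # With one degree of freedom the chi-square tail probability is erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(chi2 / 2))


def division(name, counts=COUNTS, alpha=ALPHA):
    """Place a construction in the verb-like, intermediate or copula-like division."""
    zero, n = counts[name]
    verb_zero, verb_n = counts["verb"]
    cop_zero, cop_n = counts["cop"]
    lower_than_verb = (
        zero / n < verb_zero / verb_n and chi2_p(zero, n, verb_zero, verb_n) < alpha
    )
    higher_than_cop = (
        zero / n > cop_zero / cop_n and chi2_p(zero, n, cop_zero, cop_n) < alpha
    )
    if not lower_than_verb:
        return "verb-like"  # checked first: an assumption about tie-breaking
    if not higher_than_cop:
        return "copula-like"
    return "intermediate"


for name in COUNTS:
    print(name, division(name))
```

Because the counts are pooled across varieties, the output need not match any single row of Table 3; the point of the sketch is the decision rule, under which a construction moves out of the verb-like division only when its zero rate is significantly lower than that of verb complements.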
The main differences of interest are that zero rates for indirect object constructions and extraposed clauses are significantly higher for AusE and NZE than for AmE
and BrE. These constructions represent clauses which are selected by a major lexical
head but not adjacent to it. In addition, the intervening element is also selected by
the head in indirect object constructions, but not with extraposed clauses, and this
difference is presumably reflected in the fact that indirect object constructions have
a higher zero rate than extraposed clauses for all varieties except AmE, which has no
constructions of intermediate status. The results indicate that a decreased sensitivity to
non-adjacency is among the factors contributing to overall higher zero rates in NZE
and AusE.
The effect of adjacency to a selecting head can also be observed in additional constructions not included in the main analysis reported so far, but available in the main
corpus. The clause selected by conjunction except as in (11)–(12) and by purpose so as
in (13)–(14) is adjacent to the selecting head, while the clause selected by degree so as
in (15)–(16) is not.2
(11)I’m sure there’s some fun if you’re carried off the field at the end of the Super
Bowl (except Ø you’re soaking wet from that Gatorade bath and that one ice
cube is stuck somewhere down your shorts) (Boston Globe (US) 11.10.02)
(12)Nothing unusual in that, except that the film doesn’t open until May 16 in
Seattle and it’s winter over there. (The Age (AU) 05.02.02)
(13)But Mr White, sixth on the list, also has ambitions to open access to the
ombudsman so Ø people with a complaint on public administration can go
straight to the top, instead of approaching their MP first. (The Guardian (GB)
22.11.02)
(14)Changes were also written in so that objections from parents could be
overridden. (Christchurch Press (NZ) 23.09.02)
(15)A buyer from Florida was so impressed with the Verge show Ø he immediately
placed an order. (Sunday Star Times (NZ) 27.10.02)
(16)Digital Insight was so focused on acquisitions that they neglected their
existing customers and didn’t realize the implications.
(Los Angeles Daily News (US) 27.11.02)
In these contexts, zero rates were significantly higher in adjacent clauses than in
non-adjacent clauses for the AmE data (p ≤ 0.0467, Pearson’s GFX) and the BrE data
(p ≤ 0.000424, Fisher Exact), but not for the AusE data (p ≤ 0.7324, Pearson’s GFX)
or the NZE data (p ≤ 0.6441, Pearson’s GFX). This result further supports the finding
that the AusE and NZE data show less sensitivity than the AmE and BrE data to the
zero-inhibiting effect of non-adjacency.
2. The comparison assumes that except and so are major lexical items. Although they
seem not to be analysable in terms of the features [±N], [±V], these heads select the clause
both syntactically and semantically, and thus are reasonably classed as lexical rather than
functional heads.
5. Noun complement clauses
As noted above, the general syntactic factors reviewed in the previous section
apparently do not contribute to licensing zero in the complements to nouns,
where zero is stipulated to be ill-formed. However, we have seen that zero does
occur in noun complement clauses at an overall rate of 16%, so the question arises
whether any part of the syntactic context promotes zero in these cases. One possible hypothesis (see Chomsky 1975, Ch. 3, fn. 4) is that a light verb construction
(LVC) such as make the claim that S is reanalyzed as a complex predicate, which
presumably weakens the nominal character of the noun.3 There is considerable
variation in what counts as an LVC for different researchers. Here I use the term
to refer to constructions for which detailed syntactic analyses have been proposed
(Cattell 1984; Grimshaw & Mester 1988; Kearns 1989; Rosen 1990, among others).
The main characteristics of LVCs are that: (i) the noun N of an LVC is morphologically related to a verb V; (ii) N and V have the same argument structure; and (iii) N
contributes the argument structure to the LVC, and so the LVC projects the same
arguments as a sentence headed by V. Given the close relationship between N and
V, the LVC is loosely paraphrasable by the corresponding verb. These points are
illustrated in (17).
(17) a. They_x made him_z an offer of money_y.
b. They_x offered him_z money_y.
LVCs (strictly defined) form a subset of so-called complex predicates. I take a complex
predicate to be an expression which has the predicative semantics typically expressed
by a simple verb but consists of a verb with additional material, generally a noun
phrase or preposition (cf. Allerton’s (2002) “stretched verb constructions”). Complex
predicates which are not strict LVCs are illustrated in (18).
(18) a. He got the impression that she was nervous.
“He gathered ...” not “He impressed ...”
b. He got the idea that she was nervous.
no verb related to idea.
3. A possible correlation of ‘light verbs’ with zero complementizer in the complement is
noted by Jespersen (1928: 36).
The kind of formal reanalysis which has been proposed for LVCs, in which the noun
contributes the argument structure of the whole predicate, is not applicable to non-LVC
complex predicates like those in (18) because the noun does not have the appropriate
verb-related sense and argument structure. Nevertheless, the predicative sense of such
collocations could be considered to make the noun less “nominal”.
The main syntactic hypothesis to be tested, then, is that complex predicates –
complex predicates in general or specifically LVCs – promote zero complementizer.
If complex predicates are not found to significantly promote zero, then an alternative
hypothesis is that higher zero rates are associated with particular routines, which may
or may not be complex predicates.
6. Description of the data and results for noun complement clauses
To study primarily the effects of complex predicates and routinization on zero rates in
noun complement clauses, a limited number of head nouns which were observed to
show alternation were selected – the selected nouns were fact, belief, idea, indication,
impression, suggestion, and reminder. To maximize the zero rates and ensure sufficient
that/zero variation, only clauses with nominative pronoun embedded subjects I, he,
she, we, they were collected.4 It was also decided to select data from New Zealand
sources, as the NZE data in the study reviewed above had the highest zero rate in noun
complement clauses (34%, cf. AusE 18%, AmE 16%, BrE 4%).
Given these constraints, a very large source corpus was needed, and data were
collected using Webcorp at 〈http://webcorp.org.uk〉, developed by the Research and
Development Unit for English Studies, University of Central England, Birmingham. The
domain searched was .co.nz, “webpage last updated” in the period January-December
2006, and the token form was set as “full sentence”. The domain setting returned predominantly but not exclusively New Zealand-generated data. All tokens were checked
and irrelevant tokens (e.g. relative clauses) or unusable data (unanalyzable fragments)
were discarded.
The remaining tokens (n = 2230) were coded for the complement-taking noun, the
governing head to the NP (usually a verb, adjective, or preposition), any modifier internal
to the NP, the determiner in the NP including possessive phrases, and a simple binary
measure of degree of formality: tokens with marked informality (generally 1st person,
direct quotes, postings to discussion sites, and/or tokens with grammatical errors or nonstandard punctuation) were coded as informal, and all others as not informal.
4. Hawkins (2003: 191) suggests that pronoun embedded subjects with nominative morphology may promote zero more than non-case-marked pronouns. Although no evidence
supporting this was found in a previous study (see Kearns 2007), the selection of nominative
pronouns excluded a possible effect.
Using the initial codes as a guide, tokens were also coded as LVC or non-LVC,
complex predicate or non-complex predicate, and routine or non-routine. Tokens
were coded as LVC where the V-N sequence could be paraphrased with a simple verb
morphologically related to the noun, as in (19).
(19)I say that, because they are from Christchurch and also because it may give you
an indication (“indicate”) that they will be fragile, conservative folk. (Homepage
of Jimi Kumara, 〈http://homepages.ihug.co.nz/~lurid/index-16.htm〉)
Tokens were coded as complex predicates if the V-N sequence could be loosely paraphrased with a simple verb, as in (20).
(20)Labour MPs prior to the election went to meetings organized by the health
unions and did nothing to dispel the idea (“deny”) that they support a 10% plus
wage round in health. (Richard Prebble’s Budget 2000 speech, 〈http://www.act.
org.nz/news/budget-2000-speech-richard-prebble〉)
Tokens were also coded as complex predicates where fact was used as a clause-embedding
strategy, as in (21).
(21)Tanya takes her role as Miss Lucy seriously and loves the fact she can inspire
and get people excited about travelling, exploring and trying something
different. (House of Travel web page, 〈http://www.houseoftravel.co.nz/
about-us/miss-lucy.htm〉)
Tokens were coded as routines if the collocation of the noun and the governing head
appeared at least five times with determiner a/the and no modifier, or with a modifier
and/or determiner that appeared at least five times in the collocation, e.g. get the distinct impression, give every indication, have no idea.
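The routine criterion above amounts to a frequency threshold over the coded tokens. The following is a minimal sketch of one possible reading of it, not the procedure actually used in the study: the `Token` record, its field names, the pooling of a/the frames, and the sample data are all hypothetical.

```python
from collections import Counter
from typing import List, NamedTuple, Optional


class Token(NamedTuple):
    """Hypothetical coding record for one noun complement clause token."""
    head: str                 # governing head, e.g. "get"
    det: str                  # determiner, e.g. "the", "no"
    modifier: Optional[str]   # NP-internal modifier, or None
    noun: str                 # complement-taking noun


THRESHOLD = 5  # "at least five times", as stated in the text


def is_plain(tok: Token) -> bool:
    """True for frames with determiner a/the and no modifier."""
    return tok.modifier is None and tok.det in {"a", "the"}


def code_routines(tokens: List[Token]) -> List[bool]:
    """Label each token as routine (True) or non-routine (False)."""
    # Plain frames are pooled per head + noun collocation (an assumption).
    plain_counts = Counter((t.head, t.noun) for t in tokens if is_plain(t))
    # Other frames must recur with the same determiner and modifier.
    frame_counts = Counter(t for t in tokens if not is_plain(t))
    labels = []
    for t in tokens:
        n = plain_counts[(t.head, t.noun)] if is_plain(t) else frame_counts[t]
        labels.append(n >= THRESHOLD)
    return labels


# Invented sample data, for illustration only.
sample = (
    [Token("get", "the", None, "impression")] * 5
    + [Token("have", "no", None, "idea")] * 5
    + [Token("make", "the", "excellent", "suggestion")]
)
labels = code_routines(sample)
print(labels)
```

On this reading, a one-off combination such as make the excellent suggestion is left as non-routine, while frames that recur five or more times are coded as routines.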
A logistic regression analysis was carried out using Goldvarb (2001) software
(Cedergren & Sankoff 1974), with the results shown in Table 4. Only the individual
nouns and the factor of routinization were significant. LVCs had a slightly lower zero
rate than non-LVCs, and the small difference between complex predicates (26%) and
non-complex predicates (23%) was not significant. Accordingly, there is no support
for the hypothesis that complex predicate formation (including strictly defined LVCs)
may weaken the nominal character of the complement-taking noun and so weaken the
zero-inhibiting effect of the noun.
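For readers unfamiliar with variable rule analysis, the sketch below shows how factor weights of the kind reported in Table 4 are conventionally combined in the logistic model used by Varbrul-type programs: the log-odds of the input probability and of each applicable factor weight are summed. The input probability is not reported in this chapter, so the value 0.22 used here is purely a placeholder, and the function names are illustrative.

```python
import math


def logit(p: float) -> float:
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1 - p))


def inv_logit(x: float) -> float:
    """Inverse of logit: maps log-odds back to a probability."""
    return 1 / (1 + math.exp(-x))


def predicted_rate(input_prob: float, weights: list) -> float:
    """Combine an input probability with factor weights on the logit scale."""
    return inv_logit(logit(input_prob) + sum(logit(w) for w in weights))


HYPOTHETICAL_INPUT = 0.22  # placeholder, not a value from the study

# Factor weights as reported in Table 4.
fact_routine = predicted_rate(HYPOTHETICAL_INPUT, [0.669, 0.620])
reminder_non_routine = predicted_rate(HYPOTHETICAL_INPUT, [0.183, 0.421])
print(f"fact, routine:         {fact_routine:.3f}")
print(f"reminder, non-routine: {reminder_non_routine:.3f}")
```

A weight of 0.5 leaves the input probability unchanged, weights above 0.5 favour zero, and weights below 0.5 disfavour it.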
The zero-promoting effects of the individual nouns (except idea) correlate fairly
well with their frequency in the data, as shown in Figure 8.
Table 4. Results of logistic regression analysis
                                        no. of tokens   total zero rate   factor weight
noun                  fact              651             34%               0.669
                      belief            283             20%               0.501
                      idea              299             12%               0.317
                      indication        250             19%               0.404
                      impression        358             31%               0.523
                      suggestion        273             20%               0.506
                      reminder          116             6%                0.183
formality             informal          625             27%               not sig.
                      not informal      1605            23%               not sig.
± LVC                 LVC               181             19%               not sig.
                      non-LVC           2049            24%               not sig.
± complex predicate   predicate         849             26%               not sig.
                      non-predicate     1381            23%               not sig.
± routine             routine           879             32%               0.620
                      non-routine       1351            19%               0.421
[Figure 8: chart plotting frequency of N and zero rate for fact, impression, idea, belief, suggestion, indication and reminder; vertical axis 0–40]
Figure 8. Frequency of nouns and zero rates
The routine collocations are shown in Table 5 (p. 258). The main routines with both
high frequency (n ≥ 30) and markedly higher zero rates than non-routine instances of
the same noun are indicated in bold. There are three main types of expression: adverbial
prepositional phrases (despite/due to/apart from the fact, in/with the belief, etc.),
existential there constructions (exst there + det indication/suggestion), and complex
predicates. In contrast to the other nouns in the study, routines with impression are all
complex predicates, and account for 81% of all impression tokens.
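The 81% figure can be rechecked from the token counts. The sketch below is only an arithmetic illustration: the routine counts are those listed for impression in Table 5, the total of 358 impression tokens comes from Table 4, and the variable names are hypothetical.

```python
# Token counts for the impression routines, as listed in Table 5.
impression_routines = {
    "give the (clear/distinct) impression": 135,
    "get the (clear/distinct) impression": 53,
    "be under the impression": 32,
    "have det (distinct) impression": 21,
    "leave det (clear) impression": 16,
    "create the impression": 14,
    "leave X with det (clear/distinct) impression": 12,
    "gain det impression": 7,
}
total_impression_tokens = 358  # from Table 4

routine_tokens = sum(impression_routines.values())
share = routine_tokens / total_impression_tokens
print(f"{routine_tokens} of {total_impression_tokens} tokens = {share:.0%}")
```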
Table 5. Routines
noun          routine                                        no. of tokens   zero rate
fact (non-routine zero rate = 31%)
              despite the fact                               62              45%
              due to the fact                                41              46%
              like the fact                                  13              54%
              proud of the fact                              9               22%
              apart from the fact                            7               86%
              given the fact                                 7               71%
              take pride in the fact                         7               0%
              love the fact                                  6               67%
              enjoy the fact                                 6               33%
              get over the fact                              5               40%
              reflect the fact                               5               40%
              hide the fact                                  5               20%
belief (non-routine zero rate = 15%)
              in the belief                                  34              44%
              with the belief                                8               50%
              exst there det belief                          8               25%
              have det belief                                6               22%
              based on det belief                            5               40%
              give the belief                                5               0%
idea (non-routine zero rate = 5%)
              have no idea                                   61              38%
              have det idea                                  16              19%
              like the idea                                  10              0%
              get the idea                                   9               0%
              with the idea                                  8               0%
indication (non-routine zero rate = 15%)
              give det indication                            60              22%
              be a (good/clear) indication                   42              17%
              exst there det indication                      31              32%
              have det indication                            9               33%
              as an indication                               5               0%
              take as an indication                          5               20%
impression (non-routine zero rate = 17%)
              give the (clear/distinct) impression           135             36%
              get the (clear/distinct) impression            53              43%
              be under the impression                        32              45%
              have det (distinct) impression                 21              29%
              leave det (clear) impression                   16              13%
              create the impression                          14              36%
              leave X with det (clear/distinct) impression   12              8%
              gain det impression                            7               0%
suggestion (non-routine zero rate = 18%)
              exst there det suggestion                      34              35%
              reject det suggestion                          18              22%
              make det suggestion                            6               33%
reminder (non-routine zero rate = 7%)
              X be a (constant) reminder                     30              3%
              just a reminder                                8               0%
So far, routines are identified as fixed phrases except for some variation in the
form of determiner – only modifiers which are frequent in the collocation are included
in the routines. A number of the preposition + noun or verb + noun combinations
found in the routines also appear with less frequent modifiers and/or determiners, e.g.
with the pie-eyed belief, make the excellent suggestion, be a reasonably good indication.
To establish whether routines must be fixed phrases to produce a higher zero rate, the
zero rates were compared for fixed phrase routines, routine collocations including less
frequent modifiers or determiners, and all other tokens of the same noun. In this comparison fact was excluded because the fact routines show no variation in modifiers or
determiners, and reminder was excluded because the data were insufficient. The results
in Figure 9 indicate that frequent collocations produce the main rise in zero rates,
although fixed phrases have slightly higher zero rates again.
[Figure 9: chart of zero rates (vertical axis 0–45) for belief, impression, suggestion, idea and indication in three contexts: non-routine, frequent collocation, fixed phrase]
Figure 9. Zero complementizer rates in non-routine contexts, frequent collocations, and set
phrases
To sum up the findings of this section, the possible contribution of a general syntactic factor – the nominal category of the noun – was examined in light verb constructions and more loosely defined complex predicates in comparison with other
contexts, on the hypothesis that the noun in a complex predicate construction has
a weakened nominal category. It was found that complex predicates including light
verb constructions have no significant effect on zero rates. The only significant factors
were the individual noun and frequent collocations, identified as routines. The zero-promoting effects of the individual nouns were found to correlate fairly well with the
overall frequency of the nouns in the data (Figure 8).
The greatest increase in zero rates due to routinization is associated with routine collocations of the main lexical items (e.g. have + belief, create + impression) even
where the expression contains non-routine modifiers (e.g. have the firmly ingrained
belief, create the erroneous impression). Fixed phrase routines (e.g. have no idea, despite
the fact) showed a slight further increase in zero rates.
7. Concluding remarks
Previous studies of the that/zero alternation have generally focused on the complements to verbs and used only BrE and/or AmE data. The initial purpose of this study
was to examine the alternation in other syntactic contexts, with particular reference to
the predictions of formal syntactic theory. The possibility of regional variation arose
from the informal observation of zero complementizer in New Zealand newspapers in
syntactic contexts where zero is predicted to be ungrammatical – an additional purpose
of the study was to determine whether this was a regional feature or occurred in international English, possibly as a recent development.
The results reported here show that zero rates in a range of syntactic contexts are
significantly different across regional varieties, with higher zero rates in the NZE and
AusE data. It is also shown that the AusE and NZE data not only have higher zero rates
overall, but also differ from the BrE and AmE data in the pattern of occurrence of zero
across syntactic constructions. In particular, the general factor of adjacency between
the embedded clause and a potential zero-licensing lexical head is a weaker inhibitor
of zero in the AusE and NZE data than in the AmE and BrE data.
One interpretation of the results is that antipodean English is initiating a change in
the syntax of zero complementizer, which may or may not also appear in BrE and AmE.
However, given the previous focus on the complements to verbs in the study of the
that/zero alternation, there is no evidence that the rates reported here of zero complementizer in noun complement clauses, it-subject constructions or extraposed clauses
constitute a recent development. This question will be addressed in future research.
The second part of the study, examining noun complement clauses, tested the
hypothesis that the noun in a complex predicate has in some sense weakened categorial nominality, and that this would be expected to weaken the zero-inhibiting effect
of the noun. Apart from the lexical category of the noun, a noun complement clause
has the other proposed general syntactic zero-licensing properties, in that the clause is
selected by and (typically) adjacent to a major lexical head. However, it was found that
complex predicates (including light verbs) do not significantly affect zero rates. Rather,
higher zero rates in noun complement clauses show no identifiable syntactic pattern,
but occur in routine collocations of the noun with another lexical item. In addition, a
small increase in zero rates is evident where the routine collocation takes the form of
a fixed phrase. Cross-regional variation of zero rates in a range of routine collocations
is a matter for future research.
References
Allerton, David. 2002. Stretched Verb Constructions in English. London: Routledge.
Bolinger, Dwight. 1972. That’s That. The Hague: Mouton.
Bošković, Željko & Howard Lasnik. 2003. “On the distribution of null complementizers”. Linguistic
Inquiry 34: 527–46.
Burchfield, Robert (ed.). 1996. The New Fowler’s Modern English Usage. (3rd edn). Oxford: Clarendon Press.
Cattell, Ray. 1984. Composite Predicates in English [Syntax and Semantics 17]. New York NY:
Academic Press.
Cedergren, Henrietta & David Sankoff. 1974. “Variable rules: performance as a statistical reflection
of competence”. Language 50: 333–55.
Chomsky, Noam. 1975. Reflections on Language. London: Fontana/Collins.
Elsness, Johann. 1984. “That or zero? A look at the choice of object clause connective in a corpus
of American English”. English Studies 65: 519–33.
Finegan, Edward & Douglas Biber. 1995. “That and zero complementisers in Late Modern English:
Exploring ARCHER from 1650–1990”. In Bas Aarts & Charles F. Meyer (eds), The Verb in
Contemporary English, 241–57. Cambridge: Cambridge University Press.
Grimshaw, Jane & Arnim Mester. 1988. “Light verbs and theta-marking”. Linguistic Inquiry 19:
205–32.
Hawkins, John A. 2003. “Why are zero-marked phrases close to their heads?” In Günter Rohdenburg
& Britta Mondorf (eds), Determinants of Grammatical Variation in English, 175–204. Berlin:
Mouton de Gruyter.
Huddleston, Rodney & Geoffrey Pullum. 2002. The Cambridge Grammar of the English Language.
Cambridge: Cambridge University Press.
Jespersen, Otto. 1928. A Modern English Grammar on Historical Principles. Part III: Syntax, second
volume. London: George Allen and Unwin.
Kearns, Kate. 1989. “Predicate nominals in complex predicates”. In Itziar Laka & Anoop Mahajan
(eds), MIT Working Papers in Linguistics 10: 123–34.
Kearns, Kate. 2007. “Regional variation in the syntactic distribution of null finite complementizer”.
Language Variation and Change 19: 295–336.
McDavid, Virginia. 1964. “The alternation of ‘that’ and zero in noun clauses”. American Speech
39: 102–13.
Pesetsky, David & Esther Torrego. 2004. “Tense, case, and the nature of syntactic categories”. In
Jacqueline Guéron & Jacqueline Lecarme (eds), The Syntax of Time, 495–537. Cambridge
MA: The MIT Press.
Poutsma, Hendrik. 1929. A Grammar of Late Modern English. Groningen: P. Noordhoff.
Quirk, Randolph, Sidney Greenbaum, Geoffrey Leech & Jan Svartvik. 1985. A Comprehensive
Grammar of the English Language. London: Longman.
Rissanen, Matti. 1991. “On the history of that/zero as object clause links in English”. In
Karin Aijmer & Bengt Altenberg (eds), English Corpus Linguistics: Studies in Honour of Jan
Svartvik, 272–89. London: Longman.
Rohdenburg, Günter. 1999. “Clausal complementation and cognitive complexity in English”. In
Fritz-Wilhelm Neumann & Sabine Schülting (eds), Anglistentag 1998, Erfurt, 101–12. Trier:
Wissenschaftlicher Verlag.
Roland, Douglas, Jeffrey L. Elman & Victor S. Ferreira. 2006. “Why is that? Structural prediction
and ambiguity resolution in a very large corpus of English sentences”. Cognition 98: 245–72.
Rosen, Sara Thomas. 1990. Argument Structure and Complex Predication. New York NY: Garland.
Storms, G. 1966. “That-clauses in Modern English”. English Studies 47: 249–70.
Stowell, Timothy. 1981. “Origins of phrase structure”. Doctoral dissertation, MIT.
Tagliamonte, Sali & Jennifer Smith. 2005. “No momentary fancy! The zero complementizer in
English dialects”. English Language and Linguistics 9: 289–309.
Thompson, Sandra A. & Anthony Mulac. 1991. “The discourse conditions for the use of the
complementizer that in conversational English”. Journal of Pragmatics 15: 237–51.
Infinitival and gerundial complements
Christian Mair
Freiburg University
The present contribution investigates three patterns of non-finite clausal
complementation which are known to be variable in contemporary British
and American English, namely the use of bare and to-infinitives with help, the
presence or absence of from before gerunds following the verb prevent, and the
choice between infinitives and gerunds as complements of begin and start. On
the whole, Australian and New Zealand English usage displays a broadly British
profile of variation, and differences between the two antipodean varieties are
minor. While not spectacular in themselves, these findings fit quite well into
long-term developments that have been shaping the complement-clause system
of English in the Late Modern period. Australian and New Zealand English are
taking part in these world-wide drifts at a pace comparable to British English.
In particular, no rapid recent “Americanization” of usage can be observed.
1. Introduction
The system of English non-finite complement clauses has been subject to considerable
diachronic change and structural expansion since the Early Modern English period.
As shown, for example, in Mair (2006: 119–40), there have been two fairly general
diachronic drifts. On the whole, non-finite patterns of clausal complementation have
expanded both in frequency and diversity of function, partly at the expense of finite
complement clauses. Within this overall realignment in favour of non-finite clausal
complements, there has been an additional incursion of gerundial complements into
territory previously occupied by infinitival ones. These two major developments are
complemented by numerous less systematic shifts in the use of variants, for example
with regard to presence or absence of the infinitival marker to (e.g. with help) or the
complementizer/preposition from optionally preceding gerunds with verbs of prevention. Whether these changes will ultimately amount to the “Great Complement Shift”
claimed to be taking place in English (Vosberg 2006a, 2006b) is an open question,
but there is more than sufficient documentation that they are still going on and that,
at least in some cases, they have led to increasing regional divergence between BrE
and AmE in the twentieth and early twenty-first centuries. Given this background,
it will be very interesting to investigate the relevant variables in contemporary AusE
and NZE, in order to gauge the current position of these two antipodean standards
in a field defined by the partly conflicting pulls of an inherited British orientation,
contemporary influences from US usage, and independent local innovation.
In the expectation that – at least at the standard end of the dialect continuum – the
syntax of World English is marked by convergence in writing and (some) divergence in
speech (Mair 2007), the analysis will proceed in two steps. First, I will review findings
about the grammar of the verbs help, prevent, stop, begin and start in corpora of written
BrE and AmE and then place the findings from comparable AusE and NZE corpora
in this context. Second, we will turn to the study of variation in spoken corpora – on
the assumption that regional variability masked in writing may come to the fore in the
spoken medium. An additional advantage of studying stylistic variation within varieties is that in this way we can assess the relative prominence of region with regard to
other determinants of synchronic variation, such as medium, genre or stylistic level of
formality. The conclusion will assess the extent to which AusE and NZE have moved
away from an inherited BrE profile over the past century.
2. Gerunds and infinitives in written English: The Brown family of corpora
In two recent studies (Mair 2002, 2006) three instances of syntactic variability in
the field of non-finite complement clauses were investigated in BrE and AmE, using
four matching one-million word corpora of written BrE and AmE, namely LOB,
Brown, FLOB and Frown. LOB and Brown documented language use in Britain and
the US in 1961, and FLOB and Frown carried documentation forward to the years
1991/92. In this way it was possible to present an integrated treatment of regional
variation and ongoing grammatical change. In the absence of similar diachronic
documentation for AusE and NZE it is impossible to simply expand this integrated
treatment. However, the sampling dates for the Australian Corpus of English (ACE)
and the Wellington Corpus of New Zealand English (WWC) are sufficiently close
to FLOB and Frown to allow a meaningful synchronic comparison of the use of
gerundial and infinitival complements in these four standards in the last quarter of
the twentieth century.
2.1 help + infinitive
The verb help takes either a bare infinitive or a to-infinitive as a complement and
commonly occurs in one of the following four constructions (examples from ACE):
(1)It is little wonder that droughts, often lasting 10 years in regions, have helped to
cripple the country. [ACE H28a:41 ff.]
(2)At Koyuga one night a dance was held to help raise funds for a beauty queen.
[ACE G 22:164 ff.]
(3)These blood tests help the clinician to diagnose what is called “occult
heartworm”, i.e. where the disease is severe (lungs particularly are affected)
but the routine blood tests for the presence of microfilariae (offspring of the
adult worms) are inconclusive. [ACE E17b:60 ff.]
(4)A project aimed at helping young people cope with technological change
was launched today at the Futures in Education Conference in Melbourne.
[ACE A15d:2 ff.]
Many attempts have been made to account for this variable usage in present-day English.
In a small number of cases, structural factors force the use of one or the other variant. For
example, a negated infinitive seems to require the use of to. Choice between the two variants may also be motivated stylistically, with the more explicit to-variants being preferred
in formal styles and the bare infinitives being preferred in informal styles.1 Apart from
these structural and stylistic factors, grammarians have explored the possible iconic and
semantic motivations for structural variation. Dixon, for example, has argued that the
to-infinitive represents more indirect causation or support than the bare infinitive, claiming that John helped Mary to eat the pudding suggests that he did so indirectly, for example
“by guiding the spoon to her mouth,” while John helped Mary eat the pudding actually
means that he himself ate part of it (2005: 201). The most common assumption found
in the literature, however, is that this particular instance of grammatical variability in
present-day English reflects diverging preferences in British and American usage, with
the bare infinitives being the preferred option in AmE (cf. Trudgill & Hannah 2002: 67).
A previous analysis of the use of help in the Brown family of corpora (Mair 2002)
has shown a reversal of preferences in BrE between LOB and FLOB. Whereas the ratio
was 94:27 in favour of the to-infinitive in LOB, this changed to 77:122 in FLOB, bringing BrE in line with AmE (which displayed a preference for the bare infinitive both in
Brown and in Frown). Table 1 presents these figures and the corresponding ones from
ACE and WWC, which, for ease of reference, are visualized as Figure 1:
Table 1. Bare vs. to-infinitives with help in four corpora
            FLOB   ACE   WWC   Frown
bare inf.   122    89    129   203
to-inf.     77     67    65    44
Significances (chi-square): There are highly significant contrasts between Frown and each of the other
three corpora, but not among FLOB, ACE and Wellington themselves (Frown: FLOB – p ≤ 0.0001;
Frown: ACE – p ≤ 0.0001; Frown: WWC – p ≤ 0.0001).
1. Or cognitively less complex processing environments – on which see Rohdenburg 1996.
[Figure 1: chart of the counts for bare inf. and to-inf. in FLOB, ACE, WWC and Frown; vertical axis 0–250]
Figure 1. Bare vs. to-infinitives with help in four corpora
As can be seen, the profiles of BrE and NZE are virtually identical with regard to
the variable. The bare infinitive is the more common form also in AusE, although just
barely. On the assumption that 1960s AusE and NZE were like BrE as documented in
LOB, we can thus infer that the same reversal of preferences in favour of bare infinitival complements which we noted for BrE has also taken place in AusE and NZE.
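The pairwise contrasts reported for Table 1 can be rechecked with a short script. This is a sketch under stated assumptions, not the author's procedure: it uses an uncorrected Pearson chi-square on each 2×2 table (the chapter does not say whether a continuity correction was applied), and the function and variable names are illustrative. For one degree of freedom the p-value can be obtained from the complementary error function in the standard library.

```python
import math
from itertools import combinations

# Counts from Table 1: (bare infinitive, to-infinitive) per corpus.
TABLE_1 = {
    "FLOB": (122, 77),
    "ACE": (89, 67),
    "WWC": (129, 65),
    "Frown": (203, 44),
}


def chi_square_2x2(row1, row2):
    """Uncorrected Pearson chi-square and p-value (df = 1) for a 2x2 table."""
    a, b = row1
    c, d = row2
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of the chi-square distribution with 1 df.
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


results = {}
for x, y in combinations(TABLE_1, 2):
    results[(x, y)] = chi_square_2x2(TABLE_1[x], TABLE_1[y])
    chi2, p = results[(x, y)]
    print(f"{x} vs {y}: chi2 = {chi2:.2f}, p = {p:.4g}")
```

Under these assumptions, only the comparisons involving Frown fall below conventional significance thresholds, in line with the pattern described for Table 1; the exact p-values depend on the test variant chosen.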
2.2 prevent/stop + NP + (from) + gerund
If the development of help thus shows convergence between British and American
usage in the late twentieth century, the opposite is the case for prevent and stop. These
verbs can be used with from + gerund in all varieties of English, as is shown by the
following two examples from WWC and Frown:
(5)They discourage workers from moving out of declining industries by
preventing growing industries from offering them higher wages.
[WWC J45:65-7]
(6)But the questions she raises, unlike Lee’s, come from the perspective of a woman
who must deal not only with racism but with pregnancy, miscarriage, and the
experience of being an intellectual whose academic husband was able to do the
things her pregnancies prevented her from doing. [Frown G29:133-5]
An alternative, from-less pattern was current in eighteenth- and nineteenth-century
BrE and AmE.2 In the course of the twentieth century, however, it seems to have been
virtually eliminated from AmE but persists, and may even be increasing, in BrE and
related varieties, as is shown by the following two examples from ACE and WWC:
(7)Even so, there are frequent barriers to this natural process of adjustment –
barriers that may not prevent adjustment occurring altogether but rather
that slow it to a degree that causes problems to emerge. [ACE H14:12-14]
(8)Curiously, there is nothing in the Standing Orders of New Zealand House of
Representatives to prevent a bill first being referred to a select committee; yet
this has occurred only once when the Public Finance Bill 1977 was successfully
referred to the Public Expenditure Committee. [WWC J40:153-5]
2. Among major reference works on AmE, it is still recorded – mistakenly as contemporary
usage – in Webster’s Third New International Dictionary (1961) (s.v. prevent).
Similar variability in complementation can be observed in the corpora with the verb
stop. As the following example culled from a New Zealand website shows, the phenomenon seems to extend to other, less frequently used verbs of prevention, as well:
(9)With the minimum age for purchasing liquor now reconfirmed at 18, how
tough do we get on those under 18 who drink[?] Do we ban them drinking
alcohol at all, anywhere? Or do we ban it anywhere outside a home with
parental supervision? (Russell Fairbrother’s Napier Mail column for
15 November 2006; http://www.labour.org.nz/Our_mps_top/russell_fairbrother/
news/abig101106/index.html)
Given the size of the corpora, sufficient data can be retrieved only for the two verbs
prevent and stop, and Table 2 below presents the pertinent evidence. For reasons which
will become obvious in the discussion, the two rightmost columns contain the figures
from the British LOB (1961) and B-LOB (early 1930s) corpora.3
Table 2. From and “zero” with gerund with prevent and stop
                          FLOB   ACE   WWC   Frown   LOB   B-LOB
prevent NP from V-ing     24     24    27    36      34    54
prevent NP V-ing          24     9     11    1⁴      7     13
stop NP from V-ing        3      10    13    7       7     1
stop NP V-ing             17     12    13    -       4     1⁵
Statistically significant contrasts (chi-square) were only obtained for prevent, as several cells were too small
in the case of stop. No significant contrasts were found between FLOB: ACE, FLOB: WWC, ACE: WWC,
ACE: LOB, ACE: B-LOB, WWC: LOB, WWC: B-LOB, Frown: LOB, Frown: B-LOB. Significant contrasts
(p ≤ 0.01) were found in all other cases.
3. B-LOB (“before LOB”) is a further Brown “clone” recently completed at Lancaster
(UK) and documenting written BrE of the early 1930s. I would like to thank Geoffrey
Leech and Nicholas Smith for allowing me access to this resource, which is not yet publicly
available.
These figures require some interpretation and contextualization. First of all,
the one probably spurious counter-example notwithstanding, from-less variants
must be considered as categorically absent from present-day AmE, which removes
this feature from the realm of statistical facts for this variety. For what they are
worth, the results of the chi-square tests thus support this obvious regional contrast
between AmE and all other varieties. They further suggest a diachronic dynamic
within BrE which in fact has helped consolidate this regional contrast over the past
half century.
With regard to AusE and NZE, the two varieties at issue here, and their relation to BrE, the situation is characterized largely by the absence of statistically
significant regional contrasts in the distribution of variants, and Table 2 chiefly
shows two things:
a. AusE and NZE are extremely similar to each other (as corroborated by the absence
of any statistically significant contrast between them)
b. they are intermediate between BrE and AmE with regard to this variable
As for prevent, the ACE and WWC figures closely resemble earlier twentieth-century
usage as recorded in LOB and B-LOB, suggesting that although AusE and NZE preserve a generally British profile with regard to the variable at issue, they have not
participated in the most recent British developments, which have led to an extension
of the from-less variant. With stop, the American and British preferred forms are
represented in AusE and NZE in about equal measure.45
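The contrasts reported for prevent in the note to Table 2 can be checked with a short script. What follows is a minimal sketch, not the procedure actually used for the chapter: it assumes a plain Pearson chi-square on 2 × 2 tables without continuity correction, and it reads the Frown cell for the from-less variant as the single attestation mentioned in the text. The names chi_square_2x2, PREVENT and compare are illustrative only.

```python
import math
from itertools import combinations


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (1 d.f., no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    margins = (a + b) * (c + d) * (a + c) * (b + d)
    if margins == 0:
        raise ValueError("an empty row or column makes the test undefined")
    chi2 = n * (a * d - b * c) ** 2 / margins
    # For one degree of freedom the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# (with from, without from) for prevent, as read from Table 2
PREVENT = {
    "FLOB": (24, 24), "ACE": (24, 9), "WWC": (27, 11),
    "Frown": (36, 1), "LOB": (34, 7), "B-LOB": (54, 13),
}


def compare(corpus_a, corpus_b, counts=PREVENT):
    """Chi-square and p-value for the contrast between two corpora."""
    return chi_square_2x2(*counts[corpus_a], *counts[corpus_b])


if __name__ == "__main__":
    for left, right in combinations(PREVENT, 2):
        chi2, p = compare(left, right)
        flag = "significant" if p <= 0.01 else "n.s."
        print(f"{left:>5} : {right:<5}  chi2 = {chi2:6.2f}  p = {p:.4f}  {flag}")
```

Under these assumptions FLOB : Frown falls well below the 0.01 threshold, whereas FLOB : ACE does not (chi-square of about 4.2). A different test variant, for instance one with a continuity correction, would shift the individual p-values somewhat.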
4. The sole American attestation of the “British” pattern (in Frown) is from a work of
military history dealing with the Battle of Britain. Not unexpectedly, an “archaic”
construction with a gerund preceded by a possessive determiner (e.g. prevent his leaving),
which is extremely rare today, is attested best in B-LOB, with three unambiguous instances
and two more involving ambiguous her.
5. Interestingly, the example without from has stop in the passive, which is unusual even
today. “And old Farre, being not at all the fool he had seemed, had seen that Gronard could
not be stopped getting away with those secrets – save in one way.”
2.3 start and begin in catenative uses
Present-day English has a number of catenative verbs which may be used to indicate the
beginning, continuation or end of an activity or state, the most important among them
being begin, start, continue, go on, finish, cease and stop. They differ considerably with
regard to the complementation patterns they occur in. Stop (in the relevant sense) and
finish, for example, require an obligatory gerund and do not allow infinitives. Cease, by
contrast, allows both types of complementation, and so do go on, continue, start and
begin (though the statistical preferences and semantic constraints on the use of the two
options are far from comparable for these verbs).
In view of such variability, some ongoing diachronic change is only to be expected.
Corpus-based studies with a synchronic orientation (e.g. Biber et al. 1999: 746–7) have
found that with begin the infinitive is the statistically normal form and the gerund a
minor additional option, whereas with start there is a more even distribution of the two
variants. Diachronic studies (e.g. Mair 2002) have noted a tendency towards increasing
use of gerunds in the recent past, which thus seems to continue a long-term general
trend towards the use of more gerund complements (see Fanego 1996). Thereby, the
increase in gerunds takes off from a higher level and is more pronounced for start,
whereas it is as yet largely restricted to certain types of written AmE in the case of
begin. Table 3 provides corpus-evidence for the four varieties under study.
Table 3. Gerunds and infinitives with begin and start

                     FLOB   ACE   WWC   Frown
begin + to-inf.**     204   111   222     202
begin + gerund**       20    30    46      95
start + to-inf.*       49    44    79      59
start + gerund*        59    58    89     110

*Chi-square test results: no significant p-values at .01 level
**Chi-square test results: all p-values ≤ 0.01 except ACE : WWC, ACE : Frown
The situation is undramatic. There is no contrast worth mentioning between BrE,
AusE and NZE in the use of the gerund with start (corroborated by the absence of
statistically significant contrasts); and whether the fact that AusE and NZE occupy a
transitional position between BrE and AmE preferences in the use of the gerund after
begin has any sociolinguistic significance is highly doubtful. Despite the somewhat
different frequencies, the gerund is clearly the dispreferred option with begin in all
four varieties.
3. Gerunds and infinitives in spoken English: Data from three ICE corpora
3.1 help + infinitive
The complementation of help with infinitives is overall rarer in speech than in writing
(for confirmation in ICE corpora see Table 7 below), and certain types of construction
which are very common in written English are largely absent from the spoken language.6
Nevertheless, the preference for bare infinitival complements which was observed in
recent written material is also evident in spoken AusE and NZE:
Table 4. Bare vs. to-infinitives with help in three ICE corpora (spoken texts only)

             ICE-GB   ICE-AUS   ICE-NZ
bare inf.        23        42       45
to-inf.          27        16       22

Chi-square tests showed a statistically significant contrast for GB:AUS: p ≤ 0.01; otherwise no significant
differences
For ease of reference, the figures are visualized in Figure 2:
Figure 2. Bare vs. to-infinitives with help in three ICE corpora (spoken texts only)
[bar chart: counts of bare inf. and to-inf. for ICE-GB, ICE-AUS and ICE-NZ]
If anything, the preference is even more pronounced than in writing, suggesting
that bare infinitives are superseding to-infinitives even faster in informal language.
While the common trend manifests itself more sharply in AusE than NZE, the fact
that the observed numerical contrast between these two varieties turns out not to be
statistically significant should serve as a caution against over-interpreting the
difference. In fact, the only explanandum in these figures is the surprisingly conservative
profile of the spoken portions of ICE-GB, which seems to be specific to this corpus and
probably represents an age or social bias in the pool of informants. It is not replicated
in the much bigger spoken-demographic sub-part of the BNC, where bare infinitives
outnumber to-infinitives by 126 to 66, as expected.
6. Thus while, for example, the c. 400 000 words of written material in ICE New Zealand
contain a total of 82 instances of help with either type of infinitival complement, the 600 000
words of speech have only 67. A similar imbalance is found in ICE-GB (87 vs. 50). A usage
typically encountered in written and formal styles and largely absent from spontaneous speech
is illustrated by the following example, in which help is used to indicate that the activity in the
infinitive was only one of several contributory factors to the state of affairs described:
To help meet these objectives the Home Secretary appointed a Civil
Emergencies Adviser (Mr David Brook CB CBE). [FLOB, H24 48 f.]
Given the fact that, at least historically, the use of the bare infinitive with help is
a statistical Americanism, it is tempting to regard the figures observed in AusE and
NZE as a symptom of a recent “Americanization” of these two varieties. Unfortunately,
ICE-US – the ideal data-base to investigate such an assumption – is not yet completed.
However, available corpora of contemporary spoken AmE suggest that the preference
for bare infinitives is even stronger in this variety than in the three investigated here.
For example, the American National Corpus Switchboard component has 227 bare
infinitives against 40 to-infinitives. Interpreting synchronic regional variation in terms
of “apparent time”, it seems safe to conclude that AmE is leading the development and
providing the model for other national standards.
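The “apparent time” reading rests on a simple ranking of the varieties by their share of bare infinitives. The sketch below makes that ranking explicit for the spoken counts quoted in this section; the dictionary labels and the function names bare_share and ranked_by_bare_share are illustrative and not part of any corpus tool.

```python
# (bare infinitive, to-infinitive) counts for help in spoken data, as quoted in the text
HELP_SPOKEN = {
    "ICE-GB": (23, 27),
    "ICE-AUS": (42, 16),
    "ICE-NZ": (45, 22),
    "BNC spoken-demographic": (126, 66),
    "ANC Switchboard": (227, 40),
}


def bare_share(bare, to_inf):
    """Proportion of bare infinitives among all infinitival complements of help."""
    return bare / (bare + to_inf)


def ranked_by_bare_share(counts=HELP_SPOKEN):
    """Corpus labels sorted from the lowest to the highest share of bare infinitives."""
    return sorted(counts, key=lambda name: bare_share(*counts[name]))


if __name__ == "__main__":
    for name in ranked_by_bare_share():
        print(f"{name:<24} {bare_share(*HELP_SPOKEN[name]):.2f}")
```

On these counts the spoken part of ICE-GB has the lowest share (0.46) and Switchboard the highest (about 0.85), with the BNC sample, ICE-NZ and ICE-AUS in between. The ranking is descriptive only and says nothing about statistical significance.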
However, a closer look at long-term developments 1600 to 2000 (as undertaken in
Mair 2006: 138) shows that the situation is not as simple as that. The recent development
is unlikely to represent “Americanization” of the grammars of BrE, AusE and NZE in
the narrow sense of speakers, consciously or unconsciously, borrowing American usages
which they find prestigious. Rather, there is a longer-term groundswell promoting the
use of bare infinitives with help in all varieties of English studied here which is not due to
sociolinguistic prestige and stigma but to a grammaticalization process.
As shown by Mair (2006: 138), the shift to bare infinitives is not a zero sum game in
which the increase in the use of one variant compensates for the decrease in the other.
In other words, we are not dealing with realignments of variants within a stable variable.
Rather, what we note is a change of the variable itself. Help followed by any kind of infinitive has increased its frequency continuously over the past four centuries, which suggests
that the construction is undergoing the incipient stages of a grammaticalization process.
Interpreted against this background, contrasting preferences in mid-twentieth century
British and American usage have turned out to be a transient phenomenon. Ultimately
the bare infinitive, as the variant in which the infinitive is more closely integrated with
the superordinate verb, has asserted itself as the dominant variant everywhere.
3.2 prevent/stop + NP + (from) + gerund
An analysis of the complementation of the verbs of prevention in the spoken sections
of the ICE corpora is bedeviled by the very low figures in some cells.
Table 5. From and “zero” with gerund with prevent and stop in three ICE-corpora
(spoken texts only)

                         ICE-GB   ICE-AUS   ICE-NZ
prevent NP from V-ing        12         3       12
prevent NP V-ing              7         4        9
stop NP from V-ing            7        10       14
stop NP V-ing                19         5       27

Chi-square test results: not significant
The from-less variants, which emerged as a grammatical Briticism in the course
of the twentieth century, are firmly attested in AusE and NZE both with prevent and
stop. For what the figures are worth, one might even venture the generalization that with
regard to this particular variable NZE is closer to BrE than AusE is. The fact that we have
so much greater diversity in one matrix verb (stop) instantiating the construction, than
in the other (prevent) suggests that there may be a lexical or idiomatic factor at play. In
the absence of suitably large corpora, it is difficult to test this hunch systematically. For a
more general assessment of the lexical factor, see Section 4 below.
3.3 start and begin in catenative uses
As Table 6 shows, spoken BrE, AusE and NZE present identical profiles with regard to
the use of gerunds and infinitives after begin and start.
Table 6. Begin and start in three ICE-corpora (spoken texts only)

                        ICE-GB   ICE-AUS   ICE-NZ
begin + to-infinitive       56        31       41
begin + gerund               9         6        6
start + to-infinitive       49        73       96
start + gerund              76       111      123

Chi-square test results: not significant
There is a very clear preference for infinitival complements with begin, and a
lesser, but still pronounced reverse preference for the gerund with start. Note also that
while begin is the more frequent aspectual catenative of inception in the written corpora evaluated for Table 3 above, the informal start is more frequent in all three spoken
corpora. This point will be taken up again in a comparative analysis of the spoken and
written ICE texts in Section 4 below.
4. The regional factor in context: Medium, style and lexical incidence
Our analyses have shown that AusE and NZE by and large still display British-style
profiles of variation, both in their spoken and written forms. Where there are
obvious contrasts, for example in the use of help as documented in ICE-GB, they
cannot be replicated in other corpora of spoken BrE. Where slightly different tendencies are observed for AusE and NZE, lack of statistical significance usually warns
us against over-interpreting the findings. The question thus arises what we gain from an
investigation which systematically separates spoken and written usage, as has been
done in the present study.
Before attempting a definitive answer to this question, it is probably best to
extend the database, from a comparison of the findings from ICE spoken texts with
those from the one-million-word written reference corpora of the “Brown” type, to a
comprehensive survey of variation between speech and writing within the respective
ICE corpora themselves:7
Table 7. Verb complementation in three ICE corpora – written and spoken components

                            ICE-GB    ICE-GB    ICE-NZ    ICE-NZ    ICE-AUS   ICE-AUS
                            written   spoken    written   spoken    written   spoken
help + bare infinitive          46        23        54        45        44        42
help + to-infinitive            41        27        28        22        24        16
Total                           87        50        82        67        68        58

prevent + gerund                 8         7        17         9         5         4
prevent + from + gerund         19        12        12        12        11         3
Total                           27        19        29        21        16         7

stop + gerund                   11        19         8        27         4         5
stop + from + gerund             5         7         4        14         2        10
Total                           16        26        12        41         6        15

begin + gerund                  10         9        29         6        19         6
begin + to-infinitive           69        56       108        41        50        31
Total                           79        65       137        47        69        37

start + gerund                  26        76        39       123        37       111
start + to-infinitive           42        49        49        96        26        73
Total                           68       125        88       219        63       184

Chi-square test results: ICE-GB written : ICE-NZ written → no significance (all p-values > 0.01)
Chi-square test results: ICE-GB spoken : ICE-NZ spoken → no significance (all p-values > 0.01)
Chi-square test results: ICE-GB written : ICE-AUS written → no significance (all p-values > 0.01)
Chi-square test results: ICE-GB spoken : ICE-AUS spoken → no significance (p-values > 0.01)
except help + bare infinitive/help + to-infinitive (p = 0.005)
Chi-square test results: ICE-NZ written : ICE-AUS written → no significance (all p-values > 0.01)
Chi-square test results: ICE-NZ spoken : ICE-AUS spoken → no significance (all p-values > 0.01)
7. Note that the written components consist of 200 texts and the spoken of 300 texts.
With regard to the question of immediate interest in this paper, complement choice
after help, prevent, stop, begin and start in BrE and its two antipodean descendants, the
figures (and the results of the significance tests) convey the familiar message: there
are no dramatic regional contrasts, with the exception of the probably problematical
returns for help from ICE-GB already noted and discussed above, following Table 4.
Beyond that, though, the table is very instructive because it shows a strong interdependence between medium (speech vs. writing) and the incidence of the variables
(rather than specific variants) investigated. Irrespective of complement choice, the
matrix verbs begin and prevent are consistently more frequent in the ICE written
texts than in the spoken ones, whereas the opposite is true for start and stop. The
totals for help are similarly skewed: in spite of the fact that the written part of an ICE
corpus makes up only 40 per cent of the total, both variants of the help-construction
are consistently more common in writing, in every count in every corpus.
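Because the written and spoken components differ in size, the raw totals in Table 7 are easier to compare once they are normalized. The following sketch does this per 100,000 words. It rests on an assumption: that each ICE corpus has roughly 400,000 written and 600,000 spoken words, figures the chapter gives explicitly only for ICE New Zealand. The names TOTALS, per_100k and written_to_spoken_ratio are illustrative.

```python
WRITTEN_WORDS = 400_000  # assumption: c. 200 texts of about 2,000 words each
SPOKEN_WORDS = 600_000   # assumption: c. 300 texts of about 2,000 words each

# Totals per matrix verb from Table 7: corpus -> verb -> (written, spoken)
TOTALS = {
    "ICE-GB": {"help": (87, 50), "prevent": (27, 19), "stop": (16, 26),
               "begin": (79, 65), "start": (68, 125)},
    "ICE-NZ": {"help": (82, 67), "prevent": (29, 21), "stop": (12, 41),
               "begin": (137, 47), "start": (88, 219)},
    "ICE-AUS": {"help": (68, 58), "prevent": (16, 7), "stop": (6, 15),
                "begin": (69, 37), "start": (63, 184)},
}


def per_100k(count, corpus_words):
    """Occurrences per 100,000 words of running text."""
    return count * 100_000 / corpus_words


def written_to_spoken_ratio(written, spoken):
    """Ratio of normalized frequencies; values above 1 mean 'more typical of writing'."""
    return per_100k(written, WRITTEN_WORDS) / per_100k(spoken, SPOKEN_WORDS)


if __name__ == "__main__":
    for corpus, verbs in TOTALS.items():
        for verb, (written, spoken) in verbs.items():
            ratio = written_to_spoken_ratio(written, spoken)
            print(f"{corpus:<8} {verb:<8} written/spoken = {ratio:.2f}")
```

With these corpus sizes the ratio is above 1 for help, prevent and begin in all three corpora and below 1 for start and stop, so normalization sharpens the pattern described above without reversing it.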
Mukherjee and Hoffmann have recently pointed out that in addition to
“collocations and idioms, particle verbs, article usage and tense usage […] another
important field in the core area of lexicogrammar showing clear traces of regional
differentiation is verb complementation” (2006: 148), suggesting that more attention
should be paid to this previously under-researched area in studies of regional variability in World Englishes. The three variables studied here corroborate this view up
to a point, but a rider needs to be added. Medium, genre and style still account for a
greater amount of the potential variability in this area of the grammar than the effects
of nationally specific processes of grammatical standardization. From among the pool
of variants studied here, the only one which can be claimed to be a dominantly regional
variable is prevent/stop somebody doing something. While it does not discriminate
between AusE, BrE and NZE, it is categorically excluded from contemporary AmE. In
all other cases, the regional factor is secondary to style or medium. In one instance, the
preference for the bare infinitive with help, which was a plausible candidate for a statistical
Americanism in the mid-twentieth century, the regional factor has even become less important in
the course of the past century.
5. Conclusion
A corpus study of three instances of variable complement choice in present-day
English has failed to uncover more than minimal contrasts between BrE, AusE and
NZE. As usage is categorically different for one of these variables (prevent/stop someone doing something) in AmE, this means that, at least in this fragment of the grammar, there has not been any strong recent American influence.
Two of the three variables, namely choice between to- and bare infinitives after
help and between to-infinitives and gerunds with verbs of inception (begin, start) result
from long-term diachronic trends which affect varieties of English world-wide and
are therefore unlikely to lead to more than temporary regional differentiation. With
help, bare infinitives are expanding and thus superseding to-infinitives, probably as
a result of an auxiliation/grammaticalization process in which this verb is becoming
more like a semi-auxiliary of causation than a lexical verb expressing the notion of
assistance. This process may be somewhat more advanced in AmE than in either BrE,
AusE or NZE, but the direction of the development (and its presumable end-point) are
the same. The same is true for the spread of the gerund after verbs of inception – slow
with begin and faster with start. As these two variables thus stand for convergent long-term developments in World Englishes, it is unlikely that they will help to establish
nationally distinct profiles for AusE and NZE.
The more interesting case in this connection is represented by the third one of
the variables investigated, persistence into the present and possible expansion of
from-less gerunds after verbs of prevention. Variability between gerunds with and
without from after prevent and stop (and a few other less frequent verbs) was part
of the eighteenth- and nineteenth-century colonial legacy of English. It has been
eliminated in favour of the more elaborate variant (with from) in recent AmE. That
there is no sign of a similar trend towards the suppression of optional variability in
BrE, AusE and NZE is a sign of the power of historical continuity in the development of the grammar of standard varieties. Furthermore, it shows that processes of
simplification and regularization which have occurred in AmE are not automatically salient for speakers of other standard varieties and hence not candidates for
prestige borrowing. Where – as in the case of similar such processes (e.g. regularization of do-support for have and need in questions and negations) – speakers of
other varieties do seem to follow American norms, more may be involved than
straightforward “Americanization.”
References
Biber, Douglas, Stig Johansson, Geoffrey Leech, Susan Conrad & Edward Finegan. 1999. The
Longman Grammar of Spoken and Written English. London: Longman.
Dixon, Robert M.W. 2005. A Semantic Approach to English Grammar. 2nd edn. Oxford: Oxford
University Press.
Fanego, Teresa. 1996. “The development of gerunds as objects of subject-control verbs in
English (1400–1760)”. Diachronica 13: 29–62.
Mair, Christian. 2002. “Three changing patterns of verb complementation in Late Modern
English: A real-time study based on matching text corpora”. English Language and Linguistics
6: 105–31.
Mair, Christian. 2006. Twentieth-century English: History, Variation and Standardization. Cambridge: Cambridge University Press.
Mair, Christian. 2007. “British English/American English grammar: Convergence in writing –
divergence in speech?” Anglia 125: 84–100.
Mukherjee, Joybrato & Sebastian Hoffmann. 2006. “Describing verb-complementation
profiles of New Englishes: A pilot study of Indian English”. English World-Wide 27: 147–73.
Rohdenburg, Günter. 1996. “Cognitive complexity and increased grammatical explicitness in
English”. Cognitive Linguistics 7: 149–82.
Trudgill, Peter & Jean Hannah. 2002. International English: A Guide to Varieties of Standard English. 4th edn. London: Arnold.
Vosberg, Uwe. 2006a. Die Große Komplementverschiebung: Außersemantische Einflüsse auf die
Entwicklung satzwertiger Ergänzungen im Neuenglischen. Tübingen: Narr.
Vosberg, Uwe. 2006b. “The Great Complement Shift. Extra-semantic factors determining the
evolution of sentential complement variants in Modern English”. In English and American
Studies in German 2005. (Summaries of Theses and Monographs. A Supplement to Anglia.)
Tübingen: Niemeyer, 19–22.
Webster’s Third New International Dictionary. 1961. Springfield MA: Merriam-Webster.
Commas and connective adverbs
Peter G. Peterson
University of Newcastle
This chapter reports the findings of a corpus-based study of the use of three
connective adverbs, however, therefore and thus, to introduce a second main
clause within a single orthographic sentence (“run-on sentences”). It is
established that these three items still display in this usage all the criterial
syntactic properties of connective adverbs. This usage is more frequent in
current written Australian and New Zealand English than in British and
American English, and much more frequent in unedited writing.
The phenomenon is essentially a change in the use of punctuation devices,
demonstrating a tendency to treat as a single (orthographic) sentence two
clauses that form a closely linked logical sequence.
1. Introduction
This chapter reports the findings of a corpus-based study of connective adverbs, a
subset of adverbs whose central function is to provide a semantic (but not a syntactic)
link between two main clauses. The investigation focuses in particular on the use of
three of these connective adverbs, however, therefore and thus, to introduce a second
main clause within a single orthographic sentence (so-called “run-on sentences”).
Once it has been established that these connective adverbs are syntactically distinct
from subordinators, and that the phenomenon under investigation does indeed involve
“run-on sentences”, the frequency and distribution of this usage are examined in
corpora of current written AusE and NZE as well as BrE and AmE.
2. The problem
It is by no means unusual in current written English to encounter the use of however
with a preceding comma rather than a full stop or semicolon, as in:
(1) a. Currently there are no specials available, however our prices are such that
no matter where you travel in Australia or New Zealand you will be getting
value for money when staying at a Family Park.
b. This website includes all the information contained on the TOP 10 Holiday
Park Map Directory, however if you would like a copy of the TOP 10 Holiday
Parks Map Directory, please complete the online order form.
c. The unit is fully furnished, however you will need to bring your own towels
and linen.
However, along with moreover, nevertheless, therefore, thus, and a few other similar
items, is classified by Huddleston and Pullum (2002) as an adverb, with the specific
function (in relevant examples) of connective adjunct.1 The term “connective adjunct”
captures two facets of the function of these items: (i) they are adjuncts in that they are
optional (additional) syntactic elements in the clausal structure; and (ii) they carry the
semantic load of providing a semantic connection between two clauses, the one they
are attached to and the preceding clause. I will follow Huddleston and Pullum by using
the term “connective adverb” as a convenient way of referring to adverbs functioning
as connective adjuncts.
The problem presented by examples such as (1) is that adjuncts, even connective
adjuncts, unlike coordinators or subordinators, do not provide a syntactic link between
two clauses. The linkage that they supply is essentially a semantic one. Accordingly, if
however is a connective adverb, the examples in (1) “should” be a sequence of two
sentences, with a full stop replacing the comma. This view is represented explicitly in
Huddleston and Pullum:
In general, the absence of any grammatical link strongly favours a stronger
indicator than a comma to separate the clauses. Thus, although examples like the
following occur, they would be widely regarded as infelicitous in varying degrees:
[15] i. ?The locals prefer wine to beer, the village pub resembles a city wine bar.
ii. *Your Cash Management Call Account does not incur any bank fees,
however, government charges apply.
Example [i] illustrates what prescriptivists call a “spliced” or “run-on” comma,
with the implication that the sentence should be split into two. A special case of
this is where the second clause begins with a connective adjunct such as however,
nevertheless, thus, and the like; while [ii] is an attested example, it would generally
be regarded as unacceptable. (Huddleston & Pullum 2002: 1742)
Is the phenomenon illustrated by (1), and by Huddleston and Pullum’s [15ii], indeed
“unacceptable” as Huddleston and Pullum claim? If not, is it evidence that however
and its ilk are functioning as coordinators or subordinators, rather than as connective
adverbs, reflecting a change in syntactic category? Or is what we are observing
simply a change in the use of punctuation devices, with the result that a comma is
used to join two main clauses, an expansion of the range of asyndetic combination
of main clauses?
1. Chambers Dictionary, somewhat confusingly in light of the following discussion, classifies
however as a conjunction, but retains the traditional classification of adverb for therefore.
The situation is further muddied by the following discussion from Huddleston
and Pullum, which comes immediately after the quotation given above:
Nevertheless there are certain conditions under which a comma is acceptable ….:
[16] i. It was raining heavily, so we decided to postpone the trip.
Example [i] might be called “quasi-syndetic”: although so here does not belong to
the syntactic category of coordinators, it serves a similar linking function, and a
comma is strongly preferred over a semicolon or colon. Yet behaves in the same
way. … The comma-linked cases are thus broadly coordinative in interpretation.
(Huddleston & Pullum 2002: 1742)
There are a number of issues that arise from these quotations from Huddleston and
Pullum. For the purposes of the current discussion, we will focus on the central
question as formulated in (2):
(2) What is the actual attestable status of [15ii]? Are examples of “run-on” commas
with connective adverbs rare enough to be considered aberrations? Or are they
in fact well enough established to be considered a reflection of the grammatical
system of current English?
A related question here would be: What is the basis for the grammaticality judgments
provided by Huddleston and Pullum for their examples [15i] and [15ii]? Frequency?
Stylistic grounds? “Breach” of assumed “rules” of the grammar? The last presupposes
an independently verifiable notion of sentence connection, a question we will return
to later in this chapter. However, a detailed examination of the whole gamut of “run-on
sentences” would take us beyond the bounds of the current discussion, and must be
deferred to another occasion.
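Candidate examples of the kind shown in (1) can be pulled from a plain-text corpus with a simple pattern search. The sketch below is a hypothetical heuristic and not the retrieval method of this study: it looks for a comma, one of the three adverbs, and a following word with no intervening comma. The names RUN_ON and run_on_candidates are illustrative. The pattern deliberately skips parenthetical uses such as “, however,”, which means that it also misses run-ons punctuated like [15ii], and every hit still needs manual checking (thus after a comma may be an ordinary manner adverb, for instance).

```python
import re

# Hypothetical heuristic: a comma, one of the three adverbs, then a word with
# no intervening comma. Uses set off by commas (", however,") are not matched.
RUN_ON = re.compile(r",\s+(however|therefore|thus)\s+(?=\w)", re.IGNORECASE)


def run_on_candidates(sentence):
    """Return the adverbs that follow a comma without being set off by a second comma."""
    return [match.group(1).lower() for match in RUN_ON.finditer(sentence)]


if __name__ == "__main__":
    samples = [
        "The unit is fully furnished, however you will need to bring your own towels and linen.",
        "There is a howling southerly; the race, however, will still be held.",
    ]
    for sample in samples:
        print(run_on_candidates(sample), "<-", sample)
```

Example (1c) is returned as a candidate, while the medial however set off by two commas in the second sample is not.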
3. Properties of connective adverbs
Before we attempt to answer our central question, we first need to establish the grammatical status of the connective adverbs such as however, therefore, and thus, to ensure
that there has not been a functional shift to a different grammatical category. The key
distinctive properties of connective adverbs, which differentiate them in particular
from other linking items such as coordinators and subordinators, are summarized in
the following sub-sections. The discussion will focus on the properties of the connective
adverbs however, therefore, and thus, as these are the items examined in detail in
Section 3 below. As Huddleston and Pullum (2002: 1319–20) point out, yet and so differ
in some respects from the prototypical connective adverbs and they are excluded from
the current discussion.2
P1 Clause-initial position
One property that distinguishes connective adverbs from coordinators and subordinators is that a connective adverb may occur in initial, medial or final position in its clause,
whereas coordinators and subordinators must be clause-initial. Compare the acceptability
of non-initial moreover in (3b) with the unacceptable examples in (4b) and (5b):3
(3) a. It has been raining all week; moreover there is a howling southerly.
b. It has been raining all week; there is moreover a howling southerly.
(4) a. It has been raining all week and there is a howling southerly.
b. *It has been raining all week there is and a howling southerly.
(5) a. They have canceled the race because there is a howling southerly.
b. *They have canceled the race there is because a howling southerly.
While some connective adverbs can occur in non-initial position, as shown in (3),
this property does not apply uniformly to all members of the class. As Huddleston
and Pullum (2002: 1320) point out, so and yet strongly prefer initial position:
(6) a. There is a howling southerly so they have canceled the race.
b. *There is a howling southerly they have so canceled the race.
This then is a sufficient but not necessary criterion for connective adverb status: if the
item can occur in non-initial position, it is a connective adverb (and is not a coordinator
or subordinator); if it cannot occur in non-initial position, the item may or may not
be a connective adverb.
2. For more in-depth discussion of these and related criteria, see Huddleston and Pullum
(2002), Peterson (1999), Quirk et al. (1985).
3. The annotation of source material, in line with Huddleston and Pullum (2002: xii), is as
follows: * signifies “ungrammatical”, ? “of questionable grammaticality”, ?? “of very questionable
grammaticality”. For further explanation, see discussion in Section 3 below.
P2 Cooccurrence with a coordinator
Members of the coordinator category are mutually exclusive, i.e. it is not possible to
have a coordinator immediately preceding another coordinator on the same hierarchical
level. So there can be only one coordinator per coordinate, as shown in (7):
(7) *John is lazy and but he is still my friend.
This property distinguishes coordinators from connective adverbs such as therefore,
thus and nevertheless, which happily accept a preceding coordinator, as shown in (8):
(8) a. I realize it is not in vogue, but nevertheless I offer Ms. Johnson and
anyone who thinks like her to reconsider. [Frown B22:216]
b. It’s the single greatest cause in the breakdown of human relations
and therefore people have no great incentive or enthusiasm.
[ICE-AUS W2B-014:105]
Examples of however with a preceding coordinator seem marginal at best, as suggested
by the examples in (9); no examples were found in the corpora:
(9) a. ??There is a howling southerly but however the race will still be held.
b. ?There is a howling southerly but the race however will still be held.4
4. Interestingly Fowler (1926: 239), while regarding but … however as “disagreeable”,
expresses the opinion that “But however with nothing intervening … is … better”. Regrettably
he does not offer any actual examples.
P3 Symmetric reversibility
In some (prototypical) cases of coordination, the sequence [X coordinator Y] is
semantically equivalent to [Y coordinator X]. So, for instance, the pairs of sentences
in (10) and (11) have the same propositional meaning regardless of the order of the
coordinated elements:
(10) a. Pat was tall and Kim was fat.
b. Kim was fat and Pat was tall.
(11) a. Kim is very hard-working but Pat is inclined to be lazy.
b. Pat is inclined to be lazy but Kim is very hard-working.
Symmetric reversibility is a property which depends on the equal status of the
coordinated elements and for that reason is not shared with any hypotactic constructions. We
would therefore predict that neither connective adverbs nor subordinators display this
property. This prediction is confirmed by the following examples where the sequences
in (b) have a distinctly different meaning from the sequences in (a):
(12) a. I went away because you were angry.
b. You were angry because I went away.
(13) a. I went away; therefore you were angry.
b. You were angry; therefore I went away.
P4 Sequence-initial position
Connective adverbs share with coordinators the requirement that they can occur only
in the second (or subsequent) terms. Given the sequence [X c Y], where “c” represents a
coordinator or a connective adverb, there is no equivalent sequence [c Y, X]. Compare the
acceptable reordering of the subordinate and main clauses in (14) with the impossibility
of such a reordering with a coordinator (15) or a connective adverb (16):
(14) a. I’m going to be late because the car has broken down.
b. Because the car has broken down, I’m going to be late.
(15) a. The car has broken down and I can’t get a taxi.
b. *And I can’t get a taxi, my car has broken down.
(16) a. It has been raining all week; moreover there is a howling southerly.
b. *Moreover there is a howling southerly, it has been raining all week.
This then is a negative criterion for connective adverb status: if the sequence [linker + Y]
can be fronted, the linker is not a connective adverb (nor a coordinator).
P5 Category restrictions on linking
There are tighter restrictions on linking with subordinators and connective adverbs
than there are with coordinators. In particular, coordinators but not subordinators can
link finite VPs, as shown in examples (17) and (18):
(17) a. Kim works very hard and deserves a pay rise.
b. *Kim deserves a pay rise because works very hard.
(18) a. Pat has not worked here long but has earned our respect.
b. *Pat has earned our respect although hasn’t worked here long.
Huddleston and Pullum (2002: 1321, fn41) give one (presumably attested) example of
however linking finite VPs (reproduced here as (19)), and I have found one example of
finite VPs linked with therefore (20, from a teacher’s report).
(19) Please note that the costs are correct, however are subject to change prior to
final payment.
(20) Sean has trouble recognizing numerals, therefore struggles in these areas.
However, there are no such examples to be found in any of the corpora of written
English listed in Section 3 below, and the status of such examples must therefore be
considered marginal at best.5
Table 1 (below) summarizes the above properties of connective adverbs compared
with coordinators and subordinators:
Table 1. Summary of syntactic properties
      Property                          connective adverbs   coordinators   subordinators
P1    linker must be initial            no                   yes            yes
P2    disallows preceding coordinator   no                   yes            no
P3    reversibility                     no                   some           no
P4    sequence-initial position         no                   no             yes
P5    can link finite VPs               no                   yes            no
Connective adverbs share some characteristics with coordinators. There are some
semantic similarities; for instance, and and moreover are both additive, but and however are concessive. They also have at least one syntactic property (P4) in common with
coordinators; the item that they introduce cannot be the first in the linked sequence.
It is clear from Table 1 that connective adverbs also share a number of syntactic
properties (P2, P3, P5) with subordinators. However, although the three categories,
coordinators, subordinators and connective adverbs, have some properties in common, there are sufficient clear differences to allow us to identify members
of each category distinctly.
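The comparison summarized in Table 1 can be pictured as a small feature matrix. The following is a minimal sketch only: the dictionary layout and the helper name shared_properties are labels chosen here for illustration and are not part of the analysis itself, and the value “some” for coordinators under P3 is kept as a third value rather than forced to yes or no.

```python
# Table 1 encoded as a feature matrix.
# The layout and names are illustrative assumptions, not the author's notation.
TABLE_1 = {
    "connective adverb": {"P1": "no", "P2": "no", "P3": "no", "P4": "no", "P5": "no"},
    "coordinator": {"P1": "yes", "P2": "yes", "P3": "some", "P4": "no", "P5": "yes"},
    "subordinator": {"P1": "yes", "P2": "no", "P3": "no", "P4": "yes", "P5": "no"},
}


def shared_properties(category_a, category_b):
    """Return the properties on which two categories have the same value."""
    a, b = TABLE_1[category_a], TABLE_1[category_b]
    return [prop for prop in sorted(a) if a[prop] == b[prop]]


print(shared_properties("connective adverb", "coordinator"))   # ['P4']
print(shared_properties("connective adverb", "subordinator"))  # ['P2', 'P3', 'P5']
```

Read this way, the matrix simply restates the point made above: connective adverbs pattern with coordinators on P4 only, and with subordinators on P2, P3 and P5.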
What then is the status of the items however, therefore and thus when they are used
in “run-on” sentences? To repeat our earlier concern, can we ensure that “there has not
been a functional shift to a different grammatical category”? We can do this by taking
examples of “run-on” connective adverbs from the corpora and testing them against
the criterial syntactic properties P1–4. Consider, then, example (21):
(21) The market is opened to competition therefore the price of the good is reduced.
[ICE-AUS W1A-004:21]
5. Rather oddly, in Huddleston and Pullum’s footnote, (19) is left unannotated, whereas in
the same footnote an example of clauses linked with however is marked as “%” (indicating
“acceptable in some dialects only”).
We can see from the examples in (22) that the linking item therefore in (21) can be moved
into non-clause-initial position (22a) and can be preceded by a coordinator (22b), that
reversal of the linked clauses is not semantically neutral (22c), and that the linking item
with its clause cannot be sequence-initial (22d):
(22) a. The market is opened to competition; the price of the good is therefore
reduced.
b. The market is opened to competition and therefore the price of the good
is reduced.
c. The price of the good is reduced therefore the market is opened to
competition. [not semantically equivalent to (21)]
d. *Therefore the price of the good is reduced, the market is opened to
competition.
The linking item therefore in (21) has thus been shown to retain all the relevant
properties (P1–4) of connective adverbs. The same results can be demonstrated for
however (23) and thus (24), with the proviso that, as discussed above, however resists
the addition of a preceding coordinator (P2):
(23) a. Its proximity to the main line may provide problems however it remains a
possibility for the future. [ICE-NZ W2B-003#137:2]
b. Its proximity to the main line may provide problems; it remains however a
possibility for the future.
c. [not applicable]
d. It remains a possibility for the future however its proximity to the main line
may provide problems. [not semantically equivalent to (23a)]
e. *However it remains a possibility for the future, its proximity to the main
line may provide problems.
(24) a. The sales worker group is little changed from the census thus stockbrokers
are in this category. [WWC J29:092]
b. The sales worker group is little changed from the census; stockbrokers are
thus in this category.
c. The sales worker group is little changed from the census and thus
stockbrokers are in this category.
d. Stockbrokers are in this category thus the sales worker group is little
changed from the census. [not semantically equivalent to (24a)]
e. *Thus stockbrokers are in this category, the sales worker group is little
changed from the census.
The conclusion, then, is clear: however, therefore and thus have not shifted their
grammatical category. Despite their use in “run-on” sentences, they still display all
the criterial properties of connective adverbs.
An important theoretical consequence of the preceding discussion is that punctuation cannot be used as a guide to syntactic analysis. A sequence of clauses may be
punctuated with a full stop, a semi-colon, a comma, or even no punctuation device at all,
but that does not affect the grammatical status of the sequence, nor the grammatical
categorization of any linking item. Whether two clauses are linked with a comma
plus and, or by a full stop plus And, does not affect the status of and as a coordinator.
Likewise whether two clauses are linked with a comma plus however, or by a full stop
plus However, does not affect the grammatical status of however. And exactly the same
applies to thus, therefore, and other connective adverbs.
4. Commas and connective adverbs
To seek an answer to our central question, formulated in (2) above, concerning the
actual attestable status of “run-on” commas with connective adverbs, the following
corpora of current written English were examined:6
– ACE (AusE texts)
– WWC (NZE texts)
– FLOB (BrE texts)
– Frown (AmE texts)
– ICE-AUS (written genres only)
– ICE-NZ (written genres only)
The written components of the ICE corpora consist of 400 000 words (200 samples of
2000 words each); the other corpora consist of one million words each (500 samples
of 2000 words).
The corpora were searched for the use of connective adverbs used as the sole
clause linker, i.e. with no preceding coordinator, between two main clauses. The connective adverbs investigated were however, therefore, and thus.7 Examples of interest
were those in which the connective adverb is in clause-initial position, since it is in that
position that the writer has a choice between initiating a new “sentence” or producing
a “run-on” sentence. Relevant examples are illustrated in (25):
(25) a. Other services have been expanded to meet the need, however the
situation is still critical. [ACE H07:1646]
b. I do not believe you were given three thousand dollars. Therefore,
I have made my offer. [Frown N01:111]
6. My thanks to Pam Peters for facilitating my access to these corpora, and to Adam Smith
for his assistance.
7. However and thus occur not only as connective adverbs but also as adverbs of degree or
manner, as in:
(i) a. Thus fortified, Menzies consented without consulting cabinet or parliament
… [ICE-AUS W2A-008:14]
b. However did we live, only a few short years ago, without this array of
medication? [ICE-AUS W2B-023:156]
Such examples are of course not relevant to the present discussion of connective adverbs, and
were excluded from the count.
The examples in (26) illustrate constructions that are not relevant for our current
purposes: (26a) shows a connective adverb not in clause-initial position; (26b) shows
a connective adverb with a preceding coordinator:
(26) a. I always regarded him as a man of refined manners good intentions, and
an earnest desire to serve his country ... as a speaker however, he was not
fluent nor did he always know when to leave off. [ACE G15:3128]
b. It’s the single greatest cause in the breakdown of human relations
and therefore people have no great incentive or enthusiasm
[ICE-AUS W2B-014:105]
Tokens of clause-initial connective adverbs preceded by a comma or by no punctuation marking at all were counted as examples of “run-on” sentences. The following
are representative examples taken from the six corpora, with the relevant connective
adverb bolded:
(27) a. The importance of strong seat anchorages and seat structures in buses is
self-evident, however, design deficiencies in this area have been noted by
several witnesses. [ACE H04:1033]
b. Mail will probably take about 14–21 days to reach me and vice versa,
however I still think that this system is more reliable than Egyptian Post!
[ICE-AUS W1B-013:75]
c. It [the Fabian Society stall] took determination, a good map and sharp
eyesight to find, however many did and renewed old acquaintances, bought
literature and joined up. [FLOB F17:179]
(28) a. In practical terms, I will have to broadcast from a shed down the back
of the section, therefore I must restrict my broadcasting hours to the
times when it is cold throughout New Zealand and people are indoors.
[WWC G64:123]
b. She knew the body of a man, therefore she knew all their other secrets.
[ICE-NZ W2F-013:162]
c. For example, the Kulubnarti Nubians were egalitarian, therefore, evidence
for preferential access to food resources based on political or economic
status was absent. [Frown J13:200]
(29) a. The Japanese Government chose to turn a blind eye, thus it worked.
[WWC A14:179]
b. These messages require recording by a speaker and although a natural
sounding voice output is achieved, large amounts of data storage are
required thus speech coding is essential. [ICE-AUS W1A-009:13]
c. In the case of Courier the same actual numerals were used but their relative
positions were interchanged, thus 514 Ultimate became 145 Courier, and
so on. [ICE-NZ W2B-040:137]
Tokens of clause-initial connective adverbs preceded by a primary terminal (Huddleston
& Pullum 2002: 1731), i.e. a full stop, exclamation mark or question mark, clearly initiate a new (orthographic) sentence. Less clearly categorized are examples preceded by
a colon or a semicolon, or by other punctuation devices such as a dash or opening
parenthesis. Although the semicolon is traditionally regarded as an intra-sentence
punctuation device, and is so treated in Huddleston and Pullum (2002: 1735), between
clauses it seems to have the force of a “weakened full stop” rather than a “strong
comma”. Furthermore, Huddleston and Pullum (2002: 1735) note that the comma is the
“weakest” of the secondary boundary markers. For the purposes of this discussion,
examples with punctuation markers “stronger” than a comma are not regarded
as belonging to the “run-on” category. Thus, all examples such as those in (30) are
excluded from the count of “run-on” sentences.
(30) a. Of course, the security of such a system would depend on the probity of
the privacy auditor: however, a similar remark applies to the role of the
Auditor General in checking public accounts. [ACE J72:15712]
b. The story includes such a leader who came from Samaria; thus the
building by the shore of the Dead Sea was “Samaria” while he was there.
[ICE-AUS W2B-008:46]
c. We’ll move to the very Ireland from whence we were hewn and away
from this island from whence she was hewn also – thus we can save our
wonderful marriage [WWC K84:015]
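The counting criteria set out above can be sketched as a rough first-pass filter over plain text. This is an illustration under stated assumptions, not the procedure actually used in the study: the function name, the labels and the short list of coordinators are hypothetical, and the code looks only at what immediately precedes the adverb. It cannot tell a clause-initial token from a medial one, nor a connective adverb from a degree or manner adverb, so every candidate would still need manual inspection.

```python
import re

# Hypothetical helper names and labels, chosen for illustration only.
ADVERB = re.compile(r"\b(however|therefore|thus)\b", re.IGNORECASE)
COORDINATORS = {"and", "but", "or", "nor"}


def classify_preceding_context(text):
    """Label each adverb token by what immediately precedes it."""
    results = []
    for match in ADVERB.finditer(text):
        before = text[:match.start()].rstrip()
        if not before or before[-1] in ".!?":
            # Primary terminal (or start of text): a new orthographic sentence.
            label = "new sentence"
        elif before[-1] in ";:(–-":
            # Semicolon, colon, opening parenthesis or dash: excluded from the count.
            label = "stronger than comma"
        elif before[-1] == ",":
            label = "run-on candidate (comma)"
        elif before.split()[-1].lower() in COORDINATORS:
            label = "preceded by coordinator"
        else:
            label = "run-on candidate (no punctuation)"
        results.append((match.group(1).lower(), label))
    return results


print(classify_preceding_context(
    "Other services have been expanded to meet the need, "
    "however the situation is still critical."
))
```

Applied to (25a) the sketch returns a comma candidate, and applied to (25b) it returns a new sentence. A token such as the medial however in (26a) would be wrongly returned as a candidate, which is why the output can only be a starting point for hand-checking.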
Table 2 presents the number of tokens of “run-on” sentences involving clause-initial
connective adverbs in each corpus, with regional totals added for Australia and
New Zealand.
As Table 2 shows, the number of “run-on” examples in the corpora is not high.
Of 349 examples of however used as a connective adjunct in the ACE Corpus, only
6 were used with a preceding comma; the vast majority of examples employed a
semicolon, colon or full stop. The numbers for “run-on” however were even smaller
in WWC, with 3 examples out of 304; and FLOB and Frown produced just one
example each. ICE-AUS and ICE-NZ produced comparatively much higher figures,
for reasons to be discussed below. The figures for thus parallel those for however, on
a smaller scale throughout. The total figures for therefore are small; there seems to
be an avoidance in all corpora of clause-initial position for this connective adverb.
But in that small sample, “run-on” examples have a relatively strong showing,
especially in the NZE data.
Table 2. Clause-initial connective adverbs
Corpus          however         therefore       thus            total
                all    run-on   all    run-on   all    run-on   run-on
ACE             349    6        32     1        113    1        8
ICE-AUS (wr)    245    19       35     6        63     2        27
Total AusE      594    25       67     7        176    3        35
WWC             304    3        27     7        82     3        13
ICE-NZ (wr)     293    10       44     10       88     2        22
Total NZE       597    13       71     17       168    5        35
FLOB            304    1        23     0        141    1        2
Frown           188    1        43     1        134    0        2
What is particularly striking in Table 2 are the differences between the corpora. ICE-AUS and ICE-NZ have a noticeably higher number of “run-on” examples than even their
regional counterparts ACE and WWC. This is even more striking when we take into account
the differences in the sizes of the data bases: the ICE corpora contain 400 000 words of
written text, whereas the other corpora contain one million words each. If we “normalize”
the ICE results to represent the number of examples per million words (by multiplying
by 2.5), the figures for “run-on” however in ICE-AUS and ICE-NZ would be 47.5 and 25,
compared to 6 and 3 in ACE and WWC respectively. This mismatch can be explained, at
least in part, when we examine the distribution of examples in terms of genre.
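The normalization just described can be written out as a short calculation. The sketch below uses only the “run-on” however counts and corpus sizes given above; the function name per_million is an illustrative label.

```python
def per_million(tokens, corpus_words):
    """Scale a raw token count to a rate per million words."""
    return tokens * 1_000_000 / corpus_words


# "Run-on" however counts from Table 2, with corpus sizes as stated in the text.
however_run_on = {
    "ACE": (6, 1_000_000),
    "WWC": (3, 1_000_000),
    "ICE-AUS": (19, 400_000),
    "ICE-NZ": (10, 400_000),
}

for corpus, (tokens, words) in however_run_on.items():
    print(corpus, per_million(tokens, words))
# ACE 6.0
# WWC 3.0
# ICE-AUS 47.5
# ICE-NZ 25.0
```

Dividing by 400 000 and multiplying by one million is the same operation as multiplying by 2.5.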
Tables 3 and 4 show the distribution of “run-on” examples of connective adverbs
across the different genres represented in the corpora. The ICE corpora (ICE-AUS
and ICE-NZ) divide their data into eight genres, as represented in Table 3, whereas
ACE, WWC, FLOB and Frown use the classification established by the LOB and Brown
corpora, as represented in Table 4.
Table 3. Distribution by genre* in ICE-AUS and ICE-NZ
Genre   Description             however   therefore   thus   Total
W1A     non-printed essays      8         10          2      20
W1B     non-printed letters     14        1           2      17
W2A     academic writing        1         1           –      2
W2B     popular information     3         –           –      3
W2D     instructional writing   3         3           –      6
W2F     creative writing        –         1           –      1
* There were no instances in W2C and W2E.
Table 4. Distribution by genre in ACE, WWC, FLOB and Frown
Genre
Description
however
A
B
D
E
F
G
H
J
K–W
press reportage
press editorial
religion
skills, trades, hobbies
popular lore
belles lettres, essays
government, corporate
learned, scientific
(categories of) fiction
1
therefore
thus
Total
1
1
2
1
2
2
5
1
3
7
2
2
2
3
3
1
1
2
1
3
1
3
Although there are some obvious matches that could be made between the
categories used in the two sets of corpora, the small number of tokens involved
makes direct comparisons not worthwhile. However, some useful observations can be
made directly from the above tables. First, it is noteworthy that examples of “run-on”
connective adverbs occurred in a wide range of text-type categories within all the
corpora. In ACE, WWC, FLOB and Frown, examples were found in Press (A), Religion
(D), Skills (E), Popular Lore (F), Belles Lettres (G), Miscellaneous (government and
corporate) (H), Learned and scientific (J), and two different subcategories of Fiction.
In ICE-AUS and ICE-NZ, examples occurred in non-printed writing (W1A), Letters
(social and business) (W1B), Printed information (W2A, W2B), Instructional Writing
(W2D) and Creative Writing/Fiction (W2F). We can conclude from this that the phenomenon of “run-on” connective adverbs is not restricted to one type of writing only.
The second, and even more obvious, observation is that by far the greatest
number of tokens of “run-on” connective adverbs was found in the non-printed genres
W1A and W1B in the ICE corpora. Of the total of 74 “run-on” connective adverbs,
37 occurred in these two genres alone. 76% (37 of 49) of the examples from the ICE
corpora are found in the non-printed data. Given that the non-printed data comprised
only 25% of the ICE corpora (50 of 200 samples), this means that 76% of the ICE
“run-on” connective adverbs occurred in 25% of the data set. The conclusion is clear:
“run-on” connective adverbs are a phenomenon primarily of unmonitored, or at least
unedited, writing.
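The proportions behind this conclusion can be checked directly. The figures in the sketch below are those given in the text; the variable names, and the final ratio of the two shares, are added here for illustration only and the ratio is not a figure reported above.

```python
# Figures from the text: 37 of the 49 ICE "run-on" tokens fall in W1A and W1B,
# which together make up 50 of the 200 written samples in each ICE corpus.
non_printed_tokens = 37
ice_tokens = 49
non_printed_samples = 50
all_samples = 200

token_share = non_printed_tokens / ice_tokens
sample_share = non_printed_samples / all_samples

print(f"{token_share:.0%} of tokens in {sample_share:.0%} of the data")
# 76% of tokens in 25% of the data

# Illustrative extra step: how far the non-printed genres are over-represented.
print(f"{token_share / sample_share:.1f}")  # 3.0
```

On these figures the non-printed genres yield roughly three times as many “run-on” tokens as their share of the data would predict.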
We can now return to Table 2 to investigate the data on a regional basis. Compare
first the ACE and WWC corpora on the one hand with FLOB and Frown on the other
hand. All these corpora are directly comparable, being of the same size and containing
data from published sources only. Frown (representing AmE) and FLOB (representing
BrE) contain only 2 examples each of “run-on” connective adverbs. In marked contrast,
ACE has 8 examples and WWC has 13. It appears that the use of commas to precede
clause-initial adverbs is much more widespread in Australian and New Zealand
published writing than in American and British sources. This pattern is reinforced
when we look at the ICE corpora from Australia and New Zealand. Restricting the
count to tokens found in the W2 (published writing) genres, we find 7 examples in
each of ICE-AUS and ICE-NZ. As noted above, the written genres in the ICE corpora
contain only 400 000 words, and of these only 300 000 are from published sources.
7 examples from 300 000 words is equivalent to over 23 examples per million words,
compared to 2 examples per million words in FLOB and Frown. The numbers are too
small to warrant formal statistical analysis, but the trend is clear: “run-on” examples
with clause-initial connective adverbs are rare in published British and American
writing, but they are comparatively much more frequent in comparable Australian and
New Zealand sources.
We are then in a position to answer our central question (2). Although examples
of “run-on” connective adverbs are not frequently attested in the corpora, nevertheless
they do occur across a wide range of genres, particularly in unpublished written work.
Furthermore, informal surveys within Australia indicate that the use of a comma
preceding however in particular is essentially the norm in business communications
(inter-office memos, etc.), in primary teachers’ reports on student progress, in advertising
brochures, as in the examples in (1), and in other written work that is unmonitored or
not officially proofread.
If we take seriously the tenet of the descriptive grammarian, that “Whatever is in
general use in a language is for that very reason grammatically correct” (Sweet 1891: 5),
then we are bound to accept that connective adverbs can and do occur as the sole linking
item between two “run-on” clauses. The claim by Huddleston and Pullum (2002: 1742)
that such examples “would generally be regarded as unacceptable” cannot be sustained.
5. “One sentence or two?”8
Do “run-on” examples such as those in (1) constitute a single sentence, or a sequence
of two sentences? Matthews (1981: 32) states that “According to Bloomfield, two or
more forms stand in a relation of parataxis … if they are joined only by their intonation” (or, in written texts, by punctuation). Parataxis in this sense is equivalent to
Huddleston and Pullum’s notion of “asyndetic construction”, in which there is no
overt syntactic linkage. Huddleston and Pullum (2002: 1355) give the following as an
example of asyndetic supplementation:
(31) The poem asserts emotion without evoking it: it is sentimental.
8. I have borrowed this sub-heading from Matthews (1981: 29). I refer the reader to
Matthews’ discussion for important insights into the problems alluded to in this sub-section.
But the question we need to ask is whether this is in fact a “construction” at all. Is there
a distinction, from the point of view of syntax, between (31) as written, with a colon
between the two main clauses, and the same sequence with the colon replaced by a
full stop?
We can here bring in the important distinction drawn by Lyons (1981, 1998)
between what he terms “system-sentence” and “text-sentence”. A “system-sentence” is
a syntactic unit, part of the apparatus of the grammatical description of the language,
whereas a “text-sentence” is a unit of language in production, of “performance” in
Chomskyan terms. A parallel distinction is drawn in Huddleston and Pullum (2002)
between a sentence as a syntactic unit and an “orthographic sentence”. A “text-sentence”,
or “orthographic sentence”, is simply “what would be conventionally punctuated as such
in the written language” (Lyons 1981: 59).
Punctuation cannot be a determining criterion for a grammatical decision. The
relationship between punctuation and the syntax of written language is parallel to the
relationship between intonation and the syntax of spoken language, i.e. it is in part
(at most) a reflection of the grammatical structure. An “orthographic sentence” then
cannot be a guide to what is or is not a “system-sentence”.
We can now provide an answer to the question that heads this section, by reiterating the conclusion stated at the end of Section 2. Punctuation does not determine
grammatical status. Therefore although orthographically a “run-on” example produces a
single “text-sentence”, grammatically the result is a sequence of two “system-sentences”.
6. Towards a semantic explanation
We are left with the initial puzzle yet to solve. Writers, even those who use a comma
freely before a connective adverb, do not generally link two independent clauses with
a comma rather than a full stop. What then is “special” about however, therefore, and
thus that encourages writers to treat a sequence of clauses linked solely by one of these
connective adverbs as a single sentence? If the argument of the preceding sections is
accepted we have to rule out a syntactic solution. There is no evidence that the connective adverbs, either as a group or singly, are in the process of changing category,
to become either (marginal) coordinators or (marginal) subordinators. Essentially
we have two independent clauses linked asyndetically, that is, with no overt syntactic
marker of their linkage.
Huddleston and Pullum (2002: 1320) provide a valuable clue when they allude
to the semantic similarity between yet and but. This leads us to consider the semantic
function of connective adverbs in a little more detail. As stated in Section 1, the
central function of connective adverbs is to act as connective adjuncts, “serv[ing]
to relate the clause to the neighbouring text” (Huddleston & Pullum 2002: 775).
As Huddleston and Pullum (2002: 777ff) point out, some of the connective adverbs
may be regarded as “pure” connectives, in the sense that the connection is their sole
linking function, while others combine the linking function with other semantic
overlays such as “concession”, “condition” and “reason/result” (Huddleston & Pullum
2002: 779). But these “additional” functions may also be carried by coordinators such
as and and but. And can be used not simply as a “pure” link between two propositions
but with an additional sense of “therefore” or “consequently”; but can carry an implication of concession. So there are significant semantic parallels between connective
adverbs and coordinators.
The phenomenon we have been examining then – the use of commas rather than
a “stronger” punctuation device – is essentially a change in the use of punctuation
devices, driven by a sense of semantic unity.9 Since however is semantically very close
to but, and thus and therefore have a semantic affinity with the causal sense of and,
the sequences of clauses can be adjudged to have a semantic unity, to express a single
“complete thought”. Many writers, when not governed by the dictates of editorial
boards or eagle-eyed assessors, are demonstrating an inclination to treat as a single
(orthographic) sentence two clauses that form a closely linked logical sequence.
References
Chambers Dictionary. 1998. Edinburgh: Chambers.
Fowler, Henry W. 1926 (1950). A Dictionary of Modern English Usage. London: Oxford University
Press.
Huddleston, Rodney & Geoffrey Pullum. 2002. The Cambridge Grammar of the English Language. Cambridge: Cambridge University Press.
Lyons, John. 1981. Language and Linguistics. Cambridge: Cambridge University Press.
Lyons, John. 1998. “Sentences, clauses, statements and propositions”. In Peter Collins & David Lee
(eds), The Clause in English, 149–75. Amsterdam: John Benjamins.
Matthews, Peter. 1981. Syntax. Cambridge: Cambridge University Press.
Peterson, Peter. 1999. “Coordinators plus ‘plus’?” Journal of English Linguistics 27: 127–42.
Quirk, Randolph, Sidney Greenbaum, Geoffrey Leech & Jan Svartvik. 1985. A Comprehensive
Grammar of the English Language. London: Longman.
Sweet, Henry. 1891. New English Grammar. Oxford: Oxford University Press.
9. The fact that the use of “run-on” commas before however is not mentioned by that most
fierce and eagle-eyed commentator, Fowler (1926), suggests the phenomenon is of more recent
occurrence. An examination of the Brown and LOB corpora, representing writing of 1961,
revealed no examples of “run-on” however or therefore. Brown has three relevant examples of
thus, and LOB has one.
section v
Discourse
Information-packaging constructions
Peter Collins
University of New South Wales
The study whose findings are reported in this chapter compares the frequencies
and uses of five “information-packaging” constructions across four Englishes
(Australian, New Zealand, British and American) and a range of registers
(informal dialogue, learned writing, news reportage, editorials and fiction).
Antipodean practices are found to pattern more closely with British English
(with New Zealanders even more conservative than the British in various
respects). American English diverged in its resistance to the constructions in
the written language but acceptance of them in spoken language. The findings
are discussed in the light of recent diachronic trends in British English.
1. Introduction
This paper compares and contrasts three categories of “information-packaging” construction – existential, extraposition, and cleft (including it-cleft, basic pseudo-cleft
and reversed pseudo-cleft) – across a range of English dialects and registers. Information-packaging constructions characteristically share the same truth conditions and
illocutionary force as their structurally more basic counterparts, but differ from them
syntactically and informationally. The constructions are illustrated in the constructed
examples below (with basic counterparts in square brackets):
(1) a. There are some dark clouds on the horizon
[Some dark clouds are on the horizon]
b. It’s unlikely that the rain will fill the dams
[That the rain will fill the dams is unlikely]
c. It’s a prolonged downpour that the farmers need
[The farmers need a prolonged downpour]
d. What the farmers need is a prolonged downpour
[The farmers need a prolonged downpour]
e. A prolonged downpour is what the farmers need
[The farmers need a prolonged downpour]
Existentials (as in (1a)) and extrapositions (as in (1b)) share the broad similarity that
they are derived via right-movement of the subject of their canonical counterpart (or
occasionally the object, in the case of extraposition), namely some dark clouds in (1a)
and that the rain will fill the dams in (1b), and its replacement by a dummy pronoun.1
In the case of existentials, this movement enables new information to be placed later
in the clause (an effect which applies, as we shall see, even to cases where the displaced
subject is grammatically definite). In the case of extraposition a grammatically “heavy”
constituent – typically a clausal subject – is placed at the end of the clause. Not only does
extraposition facilitate compliance with the tendency in English for heavy constituents
(which in turn typically express new information) to be located later rather than earlier
in the sentence, but it also contributes to ease of processing (see further Collins 1994).
There are two main types of cleft construction, it-clefts (as in (1c)) and pseudo-clefts (basic, as in (1d), and reversed, as in (1e)). Both involve a more thoroughgoing
structural reorganization of the canonical counterpart than occurs with existentials
and extraposition. It-clefts involve the division of a more elementary clause into two
parts, one of which is foregrounded as complement of be (a prolonged downpour in
(1c)), and the other of which is backgrounded in a subordinate relative construction
(the effect of the latter being to present the content as a presupposition, and typically
as old information). In pseudo-clefts, as in it-clefts, a more elementary clause is divided
into two parts. One part, a prolonged downpour in (1d), is foregrounded as complement of be (or, in the case of the “reversed” pseudo-cleft construction, as subject of be),
and the other backgrounded in a subordinate (fused) relative construction (the effect
of the latter again being to present the associated information as a presupposition).
2. The corpora
The present study was based on texts extracted from four parallel corpora, of AusE,
NZE, BrE, and AmE. For AusE, NZE and BrE, I used ICE-AUS, ICE-NZ and ICE-GB,
the 1 million word Australian, New Zealand and British components of the International Corpus of English. Unfortunately ICE-US has not yet been completed, so for
AmE I selected a set of American texts that would parallel ICE-AUS, ICE-NZ and
ICE-GB as closely as possible: the c.116 000 words of Parts A and B of the Santa Barbara corpus (SBC), along with 76 000 words extracted from matching categories of the
Freiburg-Brown Corpus (Frown).
.­ In the absence of any standard term in the literature to refer to sentences with an extraposed
clause I shall adopt the practice of using “extraposition” as a non-count noun to refer to the derivational process, and as a count noun to refer to an instance of the resultant construction.
Information-packaging constructions 
The composition of the database, as represented below, reflects the 60%:40% ratio
of spoken to written texts found in the ICE corpora. It comprises, for each variety, c.116 000 words of
dialogic text (S1A 1–58 from each of ICE-AUS, ICE-NZ, ICE-GB, and SBC), and c.76 000
words of printed text from ICE-AUS, ICE-NZ, ICE-GB, and Frown. The selection of the four printed categories represented – learned, reportage, editorials and
fiction – was based on the recognition recorded in corpus work on register variation
(e.g. by Biber et al. 1999) that they cover much of the range of register variation in
written English (see further Section 4 below). When, in ensuing discussion and tables,
comparisons are being made between genre categories of differing sizes, frequencies
are normalized to tokens per 10 000 words.
Table 1. Composition of the database
Categories        AusE/NZE/BrE                   AmE                            Total
Spoken texts      ICE S1A 1–58   116 000 wds     SBC            116 000 wds     464 000 wds
Written texts
  Learned         ICE W2A 1–10    20 000 wds     Frown J 1–10    20 000 wds      80 000 wds
  Reportage       ICE W2C 1–10    20 000 wds     Frown A 1–10    20 000 wds      80 000 wds
  Editorials      ICE W2E 1–8     16 000 wds     Frown C 1–8     16 000 wds      64 000 wds
  Fiction         ICE W2F 1–10    20 000 wds     Frown K 1–10    20 000 wds      80 000 wds
  Total                           76 000 wds                     76 000 wds     304 000 wds
TOTAL                            192 000 wds                    192 000 wds     768 000 wds
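The normalization to tokens per 10 000 words, and the 60%:40% spoken-to-written balance of the database, can be illustrated with a minimal Python sketch. It is not part of the original study: the function name per_10k and the dictionary WORDS_PER_VARIETY are hypothetical, while the word counts are those of Table 1 and the token counts are the ICE-AUS existential figures reported in Table 4.

```python
# Illustrative sketch only: `per_10k` and `WORDS_PER_VARIETY` are
# hypothetical names; the word counts are those given in Table 1.
WORDS_PER_VARIETY = {"spoken": 116_000, "written": 76_000}


def per_10k(tokens: int, words: int) -> float:
    """Normalize a raw token count to tokens per 10 000 words."""
    if words <= 0:
        raise ValueError("word count must be positive")
    return tokens / words * 10_000


total_words = sum(WORDS_PER_VARIETY.values())  # 192 000 words per variety
spoken_share = WORDS_PER_VARIETY["spoken"] / total_words * 100

# ICE-AUS existentials: 333 spoken and 199 written tokens (Table 4).
print(round(per_10k(333, WORDS_PER_VARIETY["spoken"]), 1))   # 28.7
print(round(per_10k(199, WORDS_PER_VARIETY["written"]), 1))  # 26.2
print(round(spoken_share, 1))                                # 60.4
```

Normalizing in this way is what allows the 116 000-word spoken samples to be compared with the 76 000-word written samples, and the 16 000-word editorial samples with the 20 000-word samples of the other written registers.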
The four corpora yielded a total of 3518 information-packaging constructions,
whose distribution across the four dialects examined (AusE, NZE, BrE, and AmE),
across the spoken and written modes, and across the four selected written registers, is
discussed in Section 4 below.
3. Some diachronic trends
There is no published account available of diachronic variation in the frequency of the
constructions under investigation here. I shall begin with some diachronic observations for BrE based on a comparison of the frequencies obtained in the present study
(which are based on corpora with texts dating from the early 1990s) with those derived
from comparable corpora with texts dating from the early 1960s. It must be emphasized that these figures are just for BrE, the only dialect for which corpora are available
that are suitable for making diachronic comparisons for this period.
In the case of the cleft family of constructions, frequency findings for the 1960s
were available from an earlier study by the present author (Collins 1991) based on
the London-Lund and LOB corpora.2 For existentials and extraposition in the 1960s I
used 50 000 words of conversation from categories S.1.1–S.1.10 of the London-Lund
Corpus, and 80 000 words of writing from categories A1–10, C1–10, J1–10, and K1–10
of the LOB corpus.
The results are presented in Table 2 (where for comparability across the various
corpora all findings are reported as tokens per 10 000 words).3 Percentages indicating
a rise or fall are derived by calculating the difference between the frequencies derived
from the 1960s and 1990s corpora as a percentage of the former.
Table 2. Information-packaging constructions in BrE in the 1960s and 1990s*
                  SPEECH                  WRITING                 TOTAL
                  60s     90s     Diff    60s     90s     Diff    60s     90s     Diff
Existentials      32.6    31.6    –3%     20.5    25.5    +24%    25.3    29.2    +15%
Extrapositions    10.0    6.1     –39%    7.1     16.3    +130%   8.2     10.2    +24%
It-clefts         3.3     2.2     –33%    5.0     3.7     –26%    4.5     2.8     –38%
Basic P-Cs        3.9     2.8     –28%    1.2     3.2     +167%   2.0     2.9     +45%
Reversed P-Cs     4.3     4.9     +14%    1.1     0.9     –18%    2.1     3.3     +57%
* Figures represent frequencies per 10 000 words
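The "Diff" percentages in Table 2 express the difference between the 1960s and 1990s frequencies as a percentage of the 1960s figure. A minimal sketch of that calculation, applied to the writing columns, might look as follows; the function name percent_change is an illustrative assumption and not taken from the study.

```python
def percent_change(freq_1960s: float, freq_1990s: float) -> float:
    """Difference between the two frequencies as a percentage of the 1960s figure."""
    if freq_1960s == 0:
        raise ValueError("1960s frequency must be non-zero")
    return (freq_1990s - freq_1960s) / freq_1960s * 100


# Writing frequencies per 10 000 words from Table 2: (1960s, 1990s).
writing = {
    "Existentials": (20.5, 25.5),
    "Extrapositions": (7.1, 16.3),
    "It-clefts": (5.0, 3.7),
    "Basic P-Cs": (1.2, 3.2),
    "Reversed P-Cs": (1.1, 0.9),
}

changes = {name: round(percent_change(old, new)) for name, (old, new) in writing.items()}
for name, diff in changes.items():
    print(f"{name}: {diff:+d}%")
```

Because the base is the 1960s frequency, a small starting value (such as 1.2 for basic pseudo-clefts in writing) produces a large percentage rise from a modest absolute increase.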
The it-cleft is the only construction to have suffered an overall decline (of broadly similar
dimensions in speech and writing). It is arguably the most “rhetorical” or “crafted” of the
constructions, distinguished from the others by its consistently greater popularity in writing than in speech over the period from the 1960s to the 1990s (and by its comparable
decline in both the spoken and written modes). At the other extreme the construction
that has enjoyed the greatest rise in popularity, the reversed pseudo-cleft, is arguably the
least “rhetorical” – most instances being short and formulaic (see Section 7.3 below) – and
distinguished from the others in the extent of its preference for speech over writing and as
the only construction to have increased in frequency in speech. The remaining three constructions – existentials, extrapositions and basic pseudo-clefts – have all enjoyed overall
gains, but only because the extent of their increase in writing has outweighed the losses
they have suffered in speech.
2. Some recalculating of the figures presented in the earlier study was necessary. In the case
of it-clefts, tokens described in the earlier study as involving “zero-theme” (such as It’s not
that she is unwell) were not counted as clefts in the present study. In the case of pseudo-clefts,
only those with a relative clause headed by what were included in the present study, so the
figures from the earlier study had to be recalculated to exclude examples of the type That’s
why I left.
3. The figure of 16.3 tokens of extraposition per 10 000 words in 1990s writing is abnormally
high, and would appear to have been skewed by the popularity of this construction in a small
number of texts in ICE-GB.
The percentages in Table 2 suggest that stylistic factors have had a major role to
play in the fortunes of the information-packaging constructions. If we set aside reversed
pseudo-clefts, the overwhelming impression is one of flagging fortunes in speech, compensated by a concomitant increase in writing (apart from it-clefts which, though suffering a decline in writing, maintain their numerical supremacy in that mode).
4. Regional and stylistic variation
Table 3 below presents total frequencies for tokens of the information-packaging constructions investigated across the four Englishes. Existentials are by far the most numerous,
accounting for well over half the total number of tokens, while basic pseudo-clefts are
the least numerous. Total frequencies for the five constructions across the dialects show
AmE (with only 682 tokens) differentiated from the other three varieties, all with between about 900 and
1000, NZE having a slightly higher frequency (1010) than the other two (BrE 928, AusE
898). Interestingly the three constructions which are significantly less popular in AmE
than in the other varieties – existentials, extrapositions and it-clefts – all have a dummy
pronoun as subject, and it may be that the relative dispreference shown by American
speakers results from an American aversion to the “impersonality” generated by the presence of such a pronoun.
Table 3. The information-packaging constructions in the AusE, NZE, BrE, and AmE corpora*
                  ICE-AUS       ICE-NZ        ICE-GB        C-US          TOTAL
Existentials      532 (27.7)    568 (29.6)    560 (29.2)    335 (17.4)    1995 (25.9)
Extrapositions    204 (10.6)    255 (13.2)    195 (10.1)    145 (7.5)     799 (10.4)
It-clefts         54 (2.8)      89 (4.6)      53 (2.7)      39 (2.0)      235 (3.0)
Basic P-Cs        28 (1.6)      35 (1.8)      56 (2.9)      47 (2.4)      166 (2.1)
Reversed P-Cs     80 (4.1)      63 (3.2)      64 (3.3)      116 (6.0)     323 (4.2)
TOTAL             898           1010          928           682           3518
* Tokens per 10 000 words in brackets
The regional picture becomes more intriguing when we factor stylistic distribution
into the equation. As Table 4 shows, frequencies for the set of constructions across the
dialects do not differ greatly in speech (BrE 551 > AusE 544 > AmE 523 > NZE 492),
but they do in writing (NZE 518 > BrE 377 > AusE 354 > AmE 159). Why does NZE
have over three times as many tokens in writing as AmE? If the diachronic findings for
BrE reported in Table 2 above are generally applicable to the other dialects, may we
legitimately infer that NZE – which evidences the strongest preference for writing and
the strongest dispreference for speech relative to the other dialects – is in the vanguard
of change with the three dummy-subject constructions (existentials, extrapositions,
it-clefts)? Is it possible that, just as AmE seems often to play a leading role in cases of
contemporary grammatical change that are most advanced in the spoken word (as in
the case of reversed pseudo-clefts), so we might expect a more “conservative” dialect
such as NZE to be at the forefront of grammatical developments associated more with
the written word?
Table 4. The information-packaging constructions in the spoken and written categories of the ICE-AUS, ICE-NZ, ICE-GB, and C-US corpora*
                     ICE-AUS                   ICE-NZ                    ICE-GB                    C-US                      TOTAL
                     Sp           Wr           Sp           Wr           Sp           Wr           Sp           Wr           Sp            Wr
Existentials         333 (28.7)   199 (26.2)   280 (24.1)   288 (37.8)   366 (31.6)   194 (25.5)   261 (22.4)   74 (9.7)     1240 (26.7)   755 (24.8)
Extraposition        99 (8.5)     105 (13.8)   85 (7.3)     170 (22.3)   71 (6.1)     124 (16.3)   80 (6.9)     65 (8.6)     335 (7.2)     464 (15.3)
It-cleft             28 (2.4)     26 (3.4)     46 (3.9)     43 (5.6)     25 (2.2)     28 (3.7)     33 (2.8)     6 (0.8)      132 (2.8)     103 (3.4)
Basic P-Clefts       16 (1.4)     12 (1.6)     25 (2.1)     10 (1.3)     32 (2.8)     24 (3.2)     41 (3.5)     6 (0.8)      114 (2.5)     52 (1.7)
Reversed P-Clefts    68 (5.9)     12 (1.6)     56 (4.8)     7 (0.9)      57 (4.9)     7 (0.9)      108 (9.3)    8 (1.1)      289 (6.2)     34 (1.1)
TOTAL                544 (46.9)   354 (46.6)   492 (42.4)   518 (68.2)   551 (47.5)   377 (49.6)   523 (45.1)   159 (20.9)   2110 (45.5)   1408 (46.3)
* Tokens per 10 000 words in brackets
Let us consider each construction in turn.
i. Existentials were more popular in speech than writing in every variety (most markedly
in C-US where the ratio was 2.3:1) except for ICE-NZ (where the ratio was 0.6:1).
ii. Extraposition showed a more marked preference for occurrence in writing than
any other construction, and again it was NZE and AmE that were the most stylistically differentiated, NZE in the strength of its preference for writing (3.1:1 in
ICE-NZ) and AmE in the relative weakness of its preference (1.2:1 in C-US).
iii. If the decline of the it-cleft that appears to be under way in BrE (see Section 3) is
also occurring in the other varieties, the figures in Table 4 suggest that the decline
may be most advanced in AmE, the only variety for which the relatively healthy
number of it-cleft tokens in writing was not in evidence (with 6 tokens in C-US,
as against 26 for ICE-AUS, 28 for ICE-GB, and 43 for ICE-NZ).
iv. While the figures for basic pseudo-clefts in BrE presented in Table 2 above suggest
some degree of vitality, the figures in Tables 3 and 4 paint a gloomier picture for the other
three varieties. None of the latter has an overall frequency for this construction,
nor a degree of support for it in writing, to rival that enjoyed by BrE.
v. Reversed pseudo-clefts, on the rise in BrE apparently due to their popularity in
speech (see Table 2), show a strong preference for speech in the other dialects as
well (see Table 4), none more so than in AmE (where the ratio of tokens in speech/
writing is much larger than for the other varieties: C-US 8.5:1 > ICE-GB 5.4:1 >
ICE-NZ 5.3:1 > ICE-AUS 3.7:1).
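The mode ratios quoted in the list above appear to be derived from the normalized frequencies in Table 4 rather than from raw token counts. The following minimal sketch, in which the name mode_ratio is a hypothetical helper, recomputes the speech-to-writing ratios for reversed pseudo-clefts on that assumption and ranks the corpora accordingly.

```python
def mode_ratio(speech_rate: float, writing_rate: float) -> float:
    """Ratio of the speech frequency to the writing frequency (both per 10 000 words)."""
    if writing_rate == 0:
        raise ValueError("writing frequency must be non-zero")
    return speech_rate / writing_rate


# Reversed pseudo-clefts, tokens per 10 000 words from Table 4: (speech, writing).
reversed_pc = {
    "ICE-AUS": (5.9, 1.6),
    "ICE-NZ": (4.8, 0.9),
    "ICE-GB": (4.9, 0.9),
    "C-US": (9.3, 1.1),
}

ratios = {corpus: round(mode_ratio(sp, wr), 1) for corpus, (sp, wr) in reversed_pc.items()}
ranked = sorted(ratios, key=ratios.get, reverse=True)
for corpus in ranked:
    print(f"{corpus} {ratios[corpus]}:1")
```

For extraposition the text states the ratio the other way round (writing to speech), so the two arguments would simply be swapped, giving for example 22.3 / 7.3 for ICE-NZ.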
Consider, finally, the distribution of the five constructions across the four written
genres of each corpus. In Table 5 the genres are ordered from least to most “speech-like” in the extent of their interactiveness and focus on the personal concerns of the
writer/reader: learned, editorials, reportage and fiction. Learned writing, with its
specialist audience and concern with informing, arguing and explaining is the most
distant from conversational speech. Fiction is the closest, not only in its inclusion of
fictional dialogue, but more generally in lacking the informative motivation of the
other three genres. The two news genres are, on the one hand, like fiction in the breadth
of their readership but, on the other hand, like learned writing in their informational
and evaluational functions. Of the two it is reportage that is closest to fiction, with its
concern with past events (factual rather than fictional), while editorials are closer to
learned writing insofar as the writer’s concern is to analyze and reflect (specifically, in
order to evaluate news events and express an opinion about them).
Table 5. Information-packaging constructions in four written registers*
                            Learned       Editorials    Reportage     Fiction       TOTAL
Existentials      ICE-AUS   52 (26.0)     53 (33.2)     48 (24.0)     46 (23.0)     199 (26.2)
                  ICE-NZ    72 (36.0)     43 (26.8)     63 (31.5)     110 (55.0)    288 (37.8)
                  ICE-GB    49 (24.5)     45 (28.1)     39 (19.5)     61 (30.5)     194 (25.5)
                  C-US      15 (7.5)      12 (7.5)      19 (9.5)      28 (14.0)     74 (9.7)
                  Total     188 (23.5)    153 (23.9)    169 (21.1)    245 (30.6)    755 (24.8)
Extrapositions    ICE-AUS   29 (14.5)     35 (21.9)     29 (14.5)     12 (6.0)      105 (13.8)
                  ICE-NZ    57 (28.5)     55 (34.3)     30 (15.0)     28 (14.0)     170 (22.3)
                  ICE-GB    36 (18.0)     43 (26.8)     25 (12.5)     20 (10.0)     124 (16.3)
                  C-US      20 (10.0)     8 (5.0)       15 (7.5)      22 (11.0)     65 (8.6)
                  Total     142 (17.8)    141 (22.0)    99 (12.4)     82 (10.3)     464 (15.3)
It-clefts         ICE-AUS   9 (4.5)       3 (1.9)       2 (1.0)       12 (6.0)      26 (3.4)
                  ICE-NZ    9 (4.5)       11 (6.8)      4 (2.0)       19 (9.5)      43 (5.6)
                  ICE-GB    11 (5.5)      3 (1.9)       2 (1.0)       12 (6.0)      28 (3.7)
                  C-US      2 (1.0)       2 (1.3)       2 (1.0)       0 (0.0)       6 (0.8)
                  Total     31 (3.9)      19 (3.0)      10 (1.3)      43 (5.4)      103 (3.4)
Basic P-Cs        ICE-AUS   3 (1.5)       4 (2.5)       3 (1.5)       2 (1.0)       12 (1.6)
                  ICE-NZ    2 (1.0)       4 (2.5)       3 (1.5)       1 (0.5)       10 (1.3)
                  ICE-GB    17 (8.5)      1 (0.6)       4 (2.0)       2 (1.0)       24 (3.2)
                  C-US      0 (0.0)       2 (1.3)       2 (1.0)       2 (1.0)       6 (0.8)
                  Total     22 (2.8)      11 (1.7)      12 (1.5)      7 (0.9)       52 (1.7)
Rev P-Cs          ICE-AUS   5 (2.5)       0 (0.0)       0 (0.0)       7 (3.5)       12 (1.6)
                  ICE-NZ    0 (0.0)       1 (0.6)       0 (0.0)       6 (3.0)       7 (0.9)
                  ICE-GB    0 (0.0)       0 (0.0)       2 (1.0)       5 (2.5)       7 (0.9)
                  C-US      0 (0.0)       3 (1.9)       2 (1.0)       3 (1.5)       8 (1.1)
                  Total     5 (0.6)       4 (0.6)       4 (0.5)       21 (2.6)      34 (1.1)
TOTAL                       388 (48.5)    328 (51.3)    294 (36.8)    398 (49.8)    1408 (46.3)
* Tokens per 10 000 words in brackets
Overall, existentials were more popular in fiction than in the other three registers.
The most striking regional result was the large number of existentials in NZ fiction
(110), which accounts largely for the significantly higher frequency of existentials in
written NZE than in written AusE, BrE or AmE. Not only is NZE more welcoming
than the other varieties of existentials in the most speech-like of the four genres, but,
as we shall see in Section 5 below, it is also more tolerant than the other varieties of
existentials with “speech-preferred” features.
Not surprisingly extraposition, with its impersonal presentation of opinion, was
most popular in editorials, and least popular in fiction. This pattern was in
evidence in AusE, NZE and BrE, with AmE alone having more tokens in fiction than
in editorials (yet further evidence that extraposition in AmE – where we have already
noted this construction to be low in frequency and only weakly preferred in writing – is
faring differently than in the other varieties).
Given the quite small overall numbers for cleft constructions in writing, it would
be unwise to attach too much significance to the even smaller figures for individual
written registers. For it-clefts, as with extraposition, AusE, NZE and BrE showed
a degree of consistency in their genre-preferences (most frequent in fiction and least in
reportage) while AmE was completely at variance (most frequent in editorials and least
in fiction). With basic pseudo-clefts the favoured genre was learned (attributable to its
high popularity in several ICE-GB texts), while fiction was the least favoured. There
was little consistency across the dialects. The genre preferences of reversed pseudo-clefts were similar to those for it-clefts: fiction being the most popular register, followed
by learned, followed by the two newspaper registers. It may be that this distribution
originates in the informational similarities between the two constructions: both feature
a topical highlighted element, with the presupposed relative clause in final position.
Once again it is AmE (with its preference for reversed pseudo-clefts in editorials and
dispreference for learned texts) which is out-of-step with the other regional varieties.
In Sections 5–7, we shall discuss the specific features of each information-packaging
construction, and investigate their regional/generic distribution. It will be of interest to
determine whether this exercise can shed any light upon the quantitative findings presented
thus far: the contrasting degrees of robustness of existentials, extraposition and it-clefts in
NZE and AmE; the robustness of basic pseudo-clefts in BrE and of reversed pseudo-clefts
in AmE; and the relatedness of these regional findings to patterns of stylistic preference.
5. Existentials
A prototypical existential construction has dummy there as subject, be as main verb,
a displaced subject NP in the post-verbal complement position, and a locative complement as the “extension”, as in (2).4
4. The term “extension” was coined by Hannay (1985: 6) and subsequently adopted by
Collins (1993), Huddleston and Pullum (2002: 1393) and others. It refers to the range of
possible structures that can occur to the right of the displaced subject NP.
(2) There’s grapefruit juice as well in the fridge [ICE-GB S1A-047:40]
Dummy there behaves in most respects like the subject. Consider the following:
(3) Well you know then there’s no point in me going on about it is there
[ICE-AUS S1A-053:48]
(4) Um there seem to be a lot of people who sort of work there or at other you
know sort of other places [ICE-AUS S1A-016:212]
(5) but there’s so many different forms of it that they’d easily manage it
[ICE-GB S1A-005:6]
In (3) there occupies post-operator position in an interrogative tag, in (4) it functions
as subject in a so-called “raising” construction, and in (5) it is there, rather than the
NP so many different forms of it, which agrees in number with the verb. Verb agreement is a rather ambivalent criterion, however, with the number inflection of the verb
being determined by the post-verbal NP in formal style (compare there are so many
different forms of it). As can be seen in Table 6 (which presents figures for existentials
like (5) with a singular form of be and plural displaced subject, as a percentage of
all existentials), such cases were almost totally restricted to speech, and considerably
more common in NZE, and less common in BrE, than in the other varieties. Interestingly, all three tokens in
writing occurred in the NZ data, and this may provide a clue as to why existentials are
relatively more popular in written NZE than they are in written AusE, BrE and AmE
(see Section 4 above). Could it be that New Zealand writers are more tolerant than
others of certain types of informality? This suggestion is pursued further below.
Table 6. Displaced subject ~ verb disagreement in existentials
ICE-AUS    ICE-NZ     ICE-GB     C-US       Speech      Writing    Total
29/532     63/568     16/560     17/335     122/1240    3/755      125/1995
(5.5%)     (11.0%)    (2.9%)     (5.1%)     (9.8%)      (0.3%)     (6.2%)
It has sometimes been assumed (e.g. Leech et al. 1982: 127) that the displaced subject
NP must be indefinite. However, definite NPs are also possible: under certain conditions it
is possible for a definite NP to satisfy the pragmatic requisite for the displaced subject of
existential sentences that it should newly identify a referent. As Rando and Napoli (1978)
observe, one such condition involves the presentation of a list, some aspect of which –
such as the selection or number of members – is unknown, as in (6):
(6) Cos I mean there was Air Chief Marshal X and then there’s uh Air Marshal Z
and uh [ICE-GB S1A-030:89]
In the present study, as Table 7 shows, definite displaced subject NPs accounted for 3.9% of
tokens, with the vast majority occurring in speech and with NZE displaying a greater liking
for them than the other varieties. Interestingly, again it was written NZE that displayed the
greatest tolerance for informality: of the 17 tokens in writing 13 were from ICE-NZ.
Table 7. Definite displaced subjects in existentials
ICE-AUS    ICE-NZ     ICE-GB     C-US       Speech      Writing    Total
12/532     31/568     19/560     17/335     62/1240     17/755     79/1995
(2.3%)     (5.4%)     (3.4%)     (5.1%)     (5.0%)      (2.2%)     (3.9%)
Some existentials comprise just the elements discussed thus far. Following Huddleston
and Pullum (2002: 1393) I shall refer to these as “bare existentials”. They are typically used
to predicate the existence of an entity or entities as in (7), or the occurrence of an event
or events as in (8):
(7) well there’s four of us [ICE-NZ S1A-001:294]
(8) There was a lunch for those who rallied round, the day Patrick was to go into
hospital. [ICE-AUS W2F-003:73]
As Table 8 indicates, the proportion of bare existentials was consistently about 40% in
all four dialects, and in the spoken and written modes.
Table 8. Bare (vs. extended) existential clauses
ICE-AUS    ICE-NZ     ICE-GB     C-US       Speech      Writing    Total
211/532    216/568    244/560    148/335    498/1240    321/755    819/1995
(39.7%)    (38.0%)    (43.6%)    (44.2%)    (40.1%)     (42.5%)    (41.0%)
A range of structures may serve as the extension in extended existentials (compare Collins
1993): locative and temporal complements as in (9) and (10), predicative complements as
in (11), infinitivals as in (12), participials as in (13), and relative clauses as in (14).
(9) There’s a photograph on the mantelpiece, Jessie – yes, that’s the one.
[ICE-NZ W2F-009:246]
(10) Along the way there were three marriages. [Frown C06:34]
(11) The doorway was directly in front of her. “I’m going upstairs,” she
announced. “I think there’s something wrong.” [ICE-AUS W2F-009:63]
(12) Unfortunately the public always recognized him; there was a price to
pay for his appearance [ICE-NZ W2F-001:17]
(13) I expect there’s just as much going on in Galicia [ICE-GB S1A-006:193]
(14) I mean, there’re certain things that you can get cheaper out there
[ICE-GB S1A-048:340]
Frequencies and percentages based on extended existentials are presented in
Table 9. “Spatial” (locative + temporal) extensions accounted for over half of all
extended existentials (54.3%), non-finite clauses (infinitivals and participials) and finite relative clauses for about
20% each (21.3% and 20.0% respectively), while predicatives were a small category (4.5%). One noteworthy dialectal difference that is in keeping with findings reported above is the higher frequency of (speech-preferred) locatives and
temporals in ICE-NZ than in the other corpora, and the lower frequency of (writing-preferred) relative clauses.
Table 9. Existential clause extensions
           Loc           Temp         Predic      Vinfin       V-ing       V-en        Rel Cl        Total
ICE-AUS    143 (44.5%)   19 (5.9%)    15 (4.7%)   35 (10.9%)   30 (9.3%)   18 (5.6%)   61 (19.0%)    321 (100%)
ICE-NZ     184 (52.2%)   33 (9.3%)    15 (4.2%)   21 (5.9%)    24 (6.8%)   15 (4.2%)   60 (17.0%)    352 (100%)
ICE-GB     141 (44.6%)   20 (6.3%)    15 (4.7%)   19 (6.0%)    23 (7.3%)   24 (7.6%)   74 (23.4%)    316 (100%)
C-US       93 (49.7%)    5 (2.7%)     8 (4.3%)    17 (9.1%)    16 (8.6%)   8 (4.3%)    40 (21.4%)    187 (100%)
Speech     352 (47.4%)   31 (4.2%)    30 (4.0%)   41 (5.5%)    69 (9.3%)   36 (4.9%)   183 (24.7%)   742 (100%)
Writing    209 (48.2%)   46 (10.6%)   23 (5.3%)   51 (11.8%)   24 (5.5%)   29 (6.7%)   52 (12.0%)    434 (100%)
Total      561 (47.7%)   77 (6.5%)    53 (4.5%)   92 (7.8%)    93 (7.9%)   65 (5.5%)   235 (20.0%)   1176 (100%)
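The grouped percentages cited in the discussion of Table 9 (54.3%, 21.3%, 20.0% and 4.5%) can be recovered by summing the relevant columns of the Total row. The sketch below is illustrative only: the grouping labels are assumptions made for the example, with the 21.3% figure treated as covering infinitival as well as participial extensions, since that is the grouping that reproduces the reported value.

```python
# Total row of Table 9: tokens of each extension type in extended existentials.
extension_totals = {
    "Loc": 561, "Temp": 77, "Predic": 53,
    "Vinfin": 92, "V-ing": 93, "V-en": 65, "Rel Cl": 235,
}

# Hypothetical grouping labels, chosen to mirror the categories in the text.
groups = {
    "spatial": ("Loc", "Temp"),
    "non-finite": ("Vinfin", "V-ing", "V-en"),
    "relative": ("Rel Cl",),
    "predicative": ("Predic",),
}

n_extended = sum(extension_totals.values())  # 1176 extended existentials

shares = {
    label: round(sum(extension_totals[col] for col in cols) / n_extended * 100, 1)
    for label, cols in groups.items()
}
for label, share in shares.items():
    print(f"{label}: {share}%")
```

Working from raw counts rather than from the rounded column percentages avoids small rounding discrepancies (47.7% + 6.5% would give 54.2% rather than 54.3%).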
6. Extraposition
Extraposition moves a syntactic unit, characteristically a subordinate nominal clause,
to the right of the predicate in the superordinate clause and replaces it with the dummy
pronoun it. As mentioned in Section 1 above, this process enables a heavy constituent
(which typically expresses new information) to be placed at the end of the sentence,
where it is easier to process than in subject position. It is pointed out in Collins (1994)
that a complementary motivation for extraposition is the positioning of a typically
light and less informative matrix predicate at the front of the sentence.
The extraposed clause may be finite as in (15), or non-finite as in (16).
(15) And then the guy that was playing with this girl said Oh wouldn’t it be
funny if we lost this set six two [ICE-AUS S1A-031:116]
(16) it’s really funny to go round there [ICE-NZ S1A-013:81]
As Table 10 indicates, the total numbers for finite and non-finite clauses were quite
evenly balanced. Non-finite clauses were slightly more popular in speech, and finite clauses in
writing. Only one of the dialects, NZE, had a significant imbalance in the finite vs. non-finite frequencies, with its relative preference for non-finite clauses and dispreference for
finite clauses. Here, as noted above in the case of existentials, New Zealanders seem to be
more comfortable with some features associated with informality than Australians, the
British, or Americans (in fact evidencing more non-finite than finite tokens – 86:84 – in
the written ICE-NZ data).
Table 10. Extraposed clause types
              ICE-AUS       ICE-NZ        ICE-GB       C-US         Speech        Writing       Total
Finite        110 (53.9%)   109 (42.7%)   96 (49.2%)   74 (51.0%)   149 (44.5%)   240 (51.7%)   389 (48.7%)
Non-finite    94 (46.1%)    146 (57.3%)   99 (50.8%)   71 (49.0%)   186 (55.5%)   224 (48.3%)   410 (51.3%)
Total         204 (100%)    255 (100%)    195 (100%)   145 (100%)   335 (100%)    464 (100%)    799 (100%)
Another finding worth mentioning, which is not accessible from the figures in
Table 10, is the predominance of infinitival clauses within the non-finite category (376
tokens, or 91.7%) over present participial clauses (34 tokens, or 8.3%). This statistic
presumably reflects the greater degree of nominalization associated with present participial clauses generally (see Huddleston & Pullum et al. 2002: 1188). The distribution
of the 34 tokens of -ing across the two modes was also uneven (32 in speech and 2 in
writing), a finding which supports Quirk et al.’s (1985: 1393) claim that extraposition
of ing-clauses is “uncommon outside informal speech”.
7. Clefts
In this section we shall deal in turn with it-clefts, basic pseudo-clefts and reversed
pseudo-clefts.
7.1 It-clefts
As we have seen in Section 1 above, it-clefts serve the informational function of foregrounding one part of the non-cleft counterpart as complement of be, and backgrounding the other in a subordinate relative construction. A variety of grammatical classes
may be foregrounded (see Table 11 below). Those represented in the present study were
NPs as in (17), PPs as in (18), finite clauses as in (19), and adverbial phrases as in (20).
(17) Crazily she believed then that it was Ronald she had seen from the spy-hole of
the descending plane, living and reliving his moment of glory in that steaming
wilderness of tree and vine. [ICE-AUS W2F-005:54]
(18) It is through genealogy that kinship and economic ties are cemented and
that mana or power of a chief is inherited. [ICE-NZ W2A-004:113]
(19) It is because his team is essentially young that McFadden put them
through the most intensive pre-season programme of any league team.
[ICE-NZ W2A-010:23]
(20) If she had had parents it was here she would have brought them to show
them what she had become. [ICE-AUS W2F-007:34]
Table 11. It-cleft highlighted elements
           NP            PP           Clause       AdvP        Total
ICE-AUS    43 (79.6%)    5 (9.3%)     4 (7.4%)     2 (3.7%)    54 (100%)
ICE-NZ     59 (66.3%)    12 (13.5%)   17 (19.1%)   1 (1.1%)    89 (100%)
ICE-GB     40 (75.5%)    6 (11.3%)    4 (7.5%)     3 (5.7%)    53 (100%)
C-US       27 (69.2%)    6 (15.4%)    2 (5.1%)     4 (10.3%)   39 (100%)
Speech     89 (67.4%)    13 (9.8%)    25 (18.9%)   5 (3.8%)    132 (100%)
Writing    80 (77.7%)    16 (15.5%)   2 (1.9%)     5 (4.9%)    103 (100%)
Total      169 (71.9%)   29 (12.3%)   27 (11.5%)   10 (4.3%)   235 (100%)
Table 11 shows that NPs were overwhelmingly the most popular type of highlighted
element, accounting for almost three quarters of tokens. AmE was a little out of step with the
other dialects, with highlighted NPs and finite clauses being slightly less popular, and
PPs and adverbial phrases slightly more popular than in BrE and AusE. A comparison
of speech and writing reveals that PPs were relatively more popular in writing and that
the reverse was the case for finite clauses (findings consistent with those reported by
Collins 1991: 201 for BrE).
Do the frequencies in Table 11 offer us any clues to explain the greater frequency of
it-clefts in NZE than in the other varieties (which we related in turn, in Section 4 above,
to its stronger preference for it-clefts in writing than in speech)? The ICE-NZ figures in
Table 11 differ in at least two respects from those for the other three corpora: in NZE’s
relative preference for highlighted clauses, and in its relative dispreference for highlighted NPs. In both cases NZE leans towards distributional patterns associated with
speech rather than writing. Again the relative robustness of an information-packaging
construction in NZE may be related to a greater tolerance amongst New Zealanders of
speech-preferred features.
If the it-cleft is in decline not just in BrE (see Section 3 above) but more generally in English, and if its low numbers in the AmE data used in this study can be
interpreted to mean that its decline is more advanced in AmE than in the other
varieties, then we might again anticipate that Table 11 would yield some clues as
to why this might be happening. Indeed, the rows for speech/writing show that
highlighted PPs are favoured in writing and highlighted clauses in speech, the latter
markedly so. AmE, in its relative preference for highlighted PPs and dispreference
for clauses shows an orientation towards distributional patterns associated with
writing rather than speech (and of a type that are unlikely to guarantee its viability
in the dialect in the future).
The it-cleft relative clauses in the present data featured the following relativizers:
that as in (21), which as in (22), who as in (23), and zero as in (24).
(21) And then he said something about oh I think I’ll have to go again next time
just to see whether I liked it it wasn’t just the novelty that I enjoyed it but it was
actually really that I did enjoy that theatre [ICE-AUS S1A-021:165]
(22) It is, however, the third episode in this narrative which occupies the greatest
contemporary interest. [ICE-AUS W2A-007:14]
(23) I hate whoever it was who made me a Maori [ICE-NZ W2F-004:106]
(24) He was warm. Breathing. Breathing heavily. It was Toby’s breathing she had
heard from the door. [ICE-AUS W2F-009:105]
Table 12. It-cleft relativizers
           that          which        who          zero         Total
ICE-AUS    14 (40.0%)    5 (14.3%)    3 (8.6%)     13 (37.1%)   35 (100%)
ICE-NZ     32 (57.1%)    2 (3.6%)     10 (17.9%)   12 (21.4%)   56 (100%)
ICE-GB     20 (42.6%)    6 (12.8%)    7 (14.9%)    14 (29.8%)   47 (100%)
C-US       22 (66.7%)    0 (0.0%)     3 (9.1%)     8 (24.2%)    33 (100%)
Speech     48 (57.8%)    1 (1.2%)     11 (13.3%)   23 (27.7%)   83 (100%)
Writing    40 (45.5%)    12 (13.6%)   12 (13.6%)   24 (27.3%)   88 (100%)
Total      88 (51.5%)    13 (7.6%)    23 (13.5%)   47 (27.5%)   171 (100%)
Table 12, which excludes it-clefts with ellipsis of the relative clause, shows that that
was the favoured relativizer, followed by zero, with a smaller number of whichs and
whos. Perhaps owing to American prescriptive traditions, which was completely unattested in AmE, while that was proportionately far more common in AmE than in
the other varieties. This finding is consistent with Biber et al.’s (1999: 616) finding,
for the genre of news, that AmE news shows a “marked preference” for relative that
over which, in contrast to BrE. According to Biber et al. (1999: 616): “The AmE
preference for that over which reflects a willingness to use a form with colloquial
associations more widely in written contexts than BrE”. Who was more popular in
ICE-GB and ICE-NZ than in C-US or ICE-AUS, while ICE-AUS had a stronger
preference for zero than the others. As for mode, the two most striking findings
were the greater popularity of that in speech than writing, and the greater popularity of which in writing than speech (findings consistent with those of Biber et al.
1999: 609–11).
7.2 Basic pseudo-clefts
The information expressed in the relative clause of basic pseudo-clefts is, in the vast majority of cases, salient in the discourse. This salience is of various different types. In (25) the
presupposition represents information that has been explicitly evoked in the immediately
preceding clause.
(25) He was plotting, certainly, continually, every moment of the day, but what he was
plotting to do was to have a life like Ernest Tubb, The Gold Chain Troubadour.
[ICE-AUS W2F-002:11]
In (26) the discourse-oldness of the presupposition is established by the explicit contrast with what has gone before (between “can” and “cannot”).
(26) Contemporary feminism is, of course, not alone in its recent loss of faith and
interest in the critical potentialities of Marxism. The reasons for this general
crisis of confidence cannot be debated here. What can be discussed, however,
is the more specific issue of the significance which contemporary feminism has
given to its own turn away from Marxism. [ICE-AUS W2A-007:7]
In (27) the information is readily inferrable from prior discourse (that is, that the
implementation of ruthless financial measures will have repercussions).
(27) It is one thing to achieve efficiencies and savings. It is another to
decimate budgets in a bid to implement extra programs. What you can
so easily end up with is programs that do not work because they are
under-funded. [ICE-AUS W2C-002:195]
Collins (1991: 95; 2004a) identifies three categories of givenness relevant to the relative
clauses of pseudo-clefts in which the antecedent is recoverable from the extralinguistic
context rather than the discourse. For present purposes I have retained the names used
by register theorists such as Halliday (1978) and Gregory and Carroll (1978): “field”
antecedents are those signaled by the presence of pro-verb do or happen as in (28) and
(29) respectively, reflections of “a pragmatic principle that our experience of the world
consists of a series of ‘doings’ and ‘happenings’ ” (Collins 1991: 127). “Tenor” antecedents are those located in the thoughts or feelings of the speaker, as in (30). “Mode” (or
“metalinguistic”) antecedents are those involving an interpretation or clarification of a
previous section of discourse, as in (31).
(28) What I think I’ll do is um take these ideas back to Matt [SBC 16:1142]
(29) Okay, and so what happens is, your mic runs into it [SBC 16:854-6]
(30) So what I what ah I was interested from to hear from you is whether the advice
we’ve given them’s right or not [ICE-AUS S1A-003:59]
(31) okay so what you’re saying is fly to Bhutan [ICE-NZ S1A-024:137]
Information-packaging constructions 
Table 13 indicates that approximately one third of antecedents were located in prior
discourse, and that of the three extralinguistic categories, field was considerably larger
than the other two.
Table 13. Relative clause givenness in basic pseudo-clefts
           Discourse     Field         Field         Tenor         Mode          Total
                         “do”          “happen”      “feel”        “say”
ICE-AUS    11 (39.3%)    4 (14.3%)     6 (21.4%)     3 (10.7%)     4 (14.3%)     28 (100%)
ICE-NZ     5 (14.3%)     9 (25.7%)     13 (37.1%)    3 (8.6%)      5 (14.3%)     35 (100%)
ICE-GB     22 (39.3%)    11 (19.6%)    5 (8.9%)      12 (21.4%)    6 (10.7%)     56 (100%)
C-US       15 (31.9%)    17 (36.2%)    7 (14.9%)     3 (6.4%)      5 (10.6%)     47 (100%)
Speech     18 (15.8%)    39 (34.2%)    21 (18.4%)    10 (8.8%)     16 (14.0%)    114 (100%)
Writing    35 (67.3%)    2 (3.8%)      0 (0.0%)      11 (21.2%)    4 (7.7%)      52 (100%)
Total      53 (31.9%)    41 (24.7%)    31 (18.7%)    21 (12.7%)    20 (12.0%)    166 (100%)
As noted above in Section 4, it is in BrE that basic pseudo-clefts are most common. We have also previously noted the greater popularity of the construction in
writing over speech in that dialect, a finding compatible with the evidence presented in Section 3 that basic pseudo-clefts have been increasing in British writing
in recent decades. In view of these findings it is not surprising to find evidence in
Table 13 that BrE displays a stronger preference for basic pseudo-cleft tokens with
“writing-friendly” features. As the speech/writing rows indicate, discourse antecedents and tenor antecedents are more common for basic pseudo-clefts in writing
than speech, so it is not surprising that BrE has the strongest preference of the dialects for these (shared with AusE in the case of discourse antecedents). Also in line
with the orientation of BrE towards writing-friendly features is its dispreference for
field and mode antecedents.
There were three classes of highlighted elements: NPs as in (32), finite clauses as
in (33), and non-finite clauses as in (34).
(32) It it’s very high-tech because what you see is this giant X in chrome silver
don’t you [ICE-AUS S1A-002:52]
(33) So what I what ah I was interested from to hear from you is whether the
advice we’ve given them’s right or not [ICE-AUS S1A-003:59]
(34) What they seem to have chosen is to hit a few of those superannuitants very
very hard. [ICE-NZ W2C-004:67]
The findings for BrE presented in Table 14, as in Table 13, are compatible with
the synchronic and diachronic robustness of the basic pseudo-cleft in that dialect
noted above.
Table 14. Highlighted elements in basic pseudo-clefts
          ICE-AUS       ICE-NZ        ICE-GB        C-US          Speech        Writing       Total
NP        8 (28.6%)     6 (17.1%)     24 (42.9%)    11 (23.4%)    20 (17.5%)    29 (55.8%)    49 (29.5%)
Fin Cl    15 (53.6%)    23 (65.7%)    24 (42.9%)    30 (63.8%)    75 (65.8%)    17 (32.7%)    92 (55.4%)
NF Cl     5 (17.9%)     6 (17.1%)     8 (14.3%)     6 (12.8%)     19 (16.7%)    6 (11.5%)     25 (15.1%)
Total     28 (100%)     35 (100%)     56 (100%)     47 (100%)     114 (100%)    52 (100%)     166 (100%)
Again the relevant evidence relates to the strength of the British orientation to
“writing-friendly” features in the construction, in this case a strong preference for
highlighted NPs (which the speech/writing columns show to be favoured by basic
pseudo-clefts in writing), and a strong dispreference for highlighted finite clauses (which the
speech/writing columns show to be favoured by basic pseudo-clefts in speech).
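As a rough check on how the percentages in Table 14 are derived, the following Python
sketch recomputes them from the raw token counts. It assumes that each percentage is
the cell count divided by its column total, rounded to one decimal place; the names
COLUMNS, COUNTS and column_percentages are illustrative only and are not part of the
original study.

```python
# Raw token counts transcribed from Table 14
# (highlighted elements in basic pseudo-clefts).
COLUMNS = ["ICE-AUS", "ICE-NZ", "ICE-GB", "C-US", "Speech", "Writing", "Total"]
COUNTS = {
    "NP":     [8, 6, 24, 11, 20, 29, 49],
    "Fin Cl": [15, 23, 24, 30, 75, 17, 92],
    "NF Cl":  [5, 6, 8, 6, 19, 6, 25],
}


def column_percentages(counts):
    """Return (percentages, totals), where each percentage is a cell count
    expressed as a share of its column total (assumed method)."""
    # zip(*rows) regroups the row lists into columns so that each can be summed.
    totals = [sum(column) for column in zip(*counts.values())]
    percentages = {
        label: [round(100 * n / total, 1) for n, total in zip(row, totals)]
        for label, row in counts.items()
    }
    return percentages, totals


PERCENTAGES, TOTALS = column_percentages(COUNTS)

for label, row in PERCENTAGES.items():
    cells = ", ".join(f"{name}: {value}%" for name, value in zip(COLUMNS, row))
    print(f"{label:7s} {cells}")
```

Run as written, the sketch gives 42.9% for highlighted NPs in ICE-GB and 55.8% for
highlighted NPs in writing, the figures on which the discussion above relies.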
7.3 Reversed pseudo-clefts
Collins (1991: 145ff; 2004b) observes that reversed pseudo-clefts commonly have little
dynamically new information content, a fact which provides the basis for explaining
their typically “summative” discourse role (pulling the threads of a discourse together).
For example:
(35)
LINDA: She’s very practical.
PATTY: Yeah.
DIANE: Mhm.
PATTY: and she didn’t care ... the- to b- to she didn’t care about emancipation.
LINDA: Mhm.
PATTY: I mean,
LINDA: Mhm.
PATTY: She was.
LINDA: Mhm.
PATTY: So she … when you really are already that way, you don’t have to make
a big to-do about it.
EVELYN: Mhm.
LINDA: Mhm.
X: Mhm.
DIANE: But she also …
LOIS: thought also, in the movie, she had a very strong personality... a lot of
charisma.
LINDA: Mhm.
LOIS: And very positive,
DEBORAH: Did you see the movie Diane.
DIANE: Hm-m.
LOIS: which is also what you’re saying. But,
LINDA: Mhm.
LOIS: ... but she just radiated something which was .. personality.
LINDA: Mhm. [SBC 23:86]
The generally low informational content of the reversed pseudo-clefts derives from their
equation, in typical instances, of a discourse-anaphoric demonstrative with a presupposed
relative clause. The study confirmed the finding of other studies (Collins 1991; 2004b;
Biber et al. 1999) that by far the most popular highlighted item was demonstrative that, as
in (36): see below. Other items included in the present study were this as in (37), NPs as
in (38), and a small number of other items including it as in (39) and which as in (40), put
together as “other” in Table 15.
(36) that’s what Tala says [ICE-NZ S1A-015:225]
(37) This is what we do all the time [ICE-GB S1A-037:45]
(38) But that rule-based system is what LSE is providing [ICE-GB S1A-024:91]
(39) It’s what she earns [ICE-GB:S1A-020:72]
(40) which is what America do and stuff [ICE-NZ S1A-026:307]
Table 15. Highlighted elements in reversed pseudo-clefts
         ICE-AUS       ICE-NZ        ICE-GB        C-US          Speech         Writing       Total
that     62 (77.5%)    58 (92.1%)    52 (81.3%)    90 (77.6%)    246 (85.1%)    16 (47.1%)    262 (81.1%)
this     8 (10.0%)     0 (0.0%)      7 (10.9%)     12 (10.3%)    22 (7.6%)      5 (14.7%)     27 (8.4%)
NP       5 (6.3%)      2 (3.2%)      3 (4.7%)      6 (5.2%)      10 (3.5%)      6 (17.6%)     16 (5.0%)
other    5 (6.3%)      3 (4.8%)      2 (3.1%)      8 (6.9%)      11 (3.8%)      7 (20.6%)     18 (5.6%)
Total    80 (100%)     63 (100%)     64 (100%)     116 (100%)    289 (100%)     34 (100%)     323 (100%)
Table 15 documents the dominance of demonstratives, and most notably that, as
highlighted item in reversed pseudo-clefts. The two demonstratives accounted for
approximately 90% of all tokens in all four dialects. A comparison of the frequencies
for speech and writing, however, reveals the extent to which reversed pseudo-clefts
with highlighted that are a speech phenomenon. In speech they account for a massive
85.1% of all reversed pseudo-clefts, but a considerably smaller proportion (47.1%) in
writing. In the written mode the reduced popularity of that allows the other categories
to figure relatively more prominently.
As we have seen in Section 3 above, the reversed pseudo-cleft construction has
enjoyed increasing popularity in recent British speech, while the interdialectal comparisons pursued in Section 4 suggest that AmE, with almost twice as many tokens
as BrE and a far greater preponderance of tokens in speech, has probably enjoyed a
bigger increase than BrE and the other two dialects. Unfortunately Table 15 does not
appear to offer any insights into these differences, with the frequencies for highlighted
elements in AmE being generally consistent with those for the other dialects.
The content of the relative clause of reversed pseudo-clefts is typically given, or
predominantly given, with the same categories of informational familiarity that are
relevant to basic pseudo-clefts being applicable. Thus we find discourse-antecedents as
in (41), field-antecedents with do as in (42) and happen as in (43), tenor-antecedents as
in (44), and mode-antecedents as in (45).
(41) A country practice that’s what it was [ICE-AUS S1A-004:320]
(42) yeah well you see that’s what I do [ICE-NZ S1A-004:73]
(43) cos that’s what happened to Hannah [ICE-NZ S1A-022:267]
(44) that’s what I’m that’s what bothers me I think it’s rude and socially inadequate
[ICE-AUS S1A-012:246]
(45) Well that’s what I mean [ICE-GB:S1A-038:174]
Table 16. Relative clause givenness in reversed pseudo-clefts
           Discourse      Field         Field        Tenor        Mode           Total
                          “do”          “happen”     “feel”       “say”
ICE-AUS    24 (37.5%)     9 (14.1%)     2 (3.1%)     6 (9.4%)     23 (35.9%)     64 (100%)
ICE-NZ     18 (28.6%)     11 (17.5%)    2 (3.2%)     0 (0.0%)     32 (50.8%)     63 (100%)
ICE-GB     24 (37.5%)     9 (14.1%)     2 (3.1%)     6 (9.4%)     23 (35.9%)     64 (100%)
C-US       39 (33.6%)     12 (10.3%)    7 (6.0%)     10 (8.6%)    48 (41.4%)     116 (100%)
Speech     94 (32.5%)     40 (13.8%)    13 (4.5%)    20 (6.9%)    122 (42.2%)    289 (100%)
Writing    15 (44.1%)     3 (8.8%)      1 (2.9%)     0 (0.0%)     15 (44.1%)     34 (100%)
Total      109 (33.7%)    43 (13.3%)    14 (4.3%)    20 (6.2%)    137 (42.4%)    323 (100%)
In Table 16 the category of “discourse” includes cases where the content of the relative clause is given and/or new. A noteworthy aspect of the frequencies presented in
this table is the high proportion (42.4%) of mode tokens compared to basic pseudo-clefts (12.0%). This confirms what we have said above about the “texturing” role of
the reversed construction. Another statistic worthy of comment is the greater popularity of do-antecedents in speech than writing (as in basic pseudo-clefts), presumably a reflection of the more active than reflective orientation of speech (see further
Martin 1984). As was the case for Table 15, so for this table, there are no apparent
clues to explain the superior robustness of this construction in AmE, with the AmE
frequencies again being generally consistent with those for the other dialects.
8. Conclusion
In order to gain some initial insights into how the five information-packaging constructions have been faring in their frequency of use in recent decades, the figures for
BrE from the present study were compared with figures from earlier British corpora
(LLC and LOB). The comparison suggests a rise in the frequency of all but the it-cleft,
but one attributable to increasing use in writing rather than speech in the case of existentials, extraposition and basic pseudo-clefts, and one attributable to increasing use
in speech rather than writing in the case of reversed pseudo-clefts.
The findings for the two impersonal “it-constructions” were uncannily similar,
with NZE having by far the most tokens, AmE by far the least, and AusE and BrE
a similar number. The third dummy-subject construction, the existential, was similarly
unpopular in AmE, and most popular in NZE (but in this case NZE was only marginally ahead of AusE and BrE). These findings, it was noted, correlated interestingly with
patterns of distribution across speech and writing, with NZE providing the strongest
support for these constructions in writing, and AmE the least. Examination of the four
written genres provided further evidence of AmE differentiating itself from the other
three varieties in its distributional preferences.
The strongest support for basic pseudo-clefts came from BrE (followed by AmE,
with AusE and NZE again behaving similarly). Correlations with speech/writing preferences were not in evidence: AmE displayed its familiar preference for speech, NZE
for writing, while for AusE and BrE numbers were quite similar in the two modes.
For reversed pseudo-clefts, which differ from the other constructions in their formulaicity, their lack of “rhetorical” flavour, and their strong distributional preference
for speech, there was an entirely different story. This time it was AmE, with its familiar
preference for speech, that yielded the largest number of tokens, and NZE the least.
The general picture that emerged was, then, one of American resistance to the
use of these constructions in writing, but acceptance of them in speech (such that we
might extrapolate from the various findings that AmE is leading the way in a rise in
the popularity of the reversed pseudo-cleft, and in a decline in the popularity of the
it-cleft). The antipodean varieties generally pattern more closely with BrE than with
AmE, with NZE being even more “conservative” than BrE in its distributional preference for writing and overall support for the three dummy-subject constructions.
The next step was to scrutinize the grammatical and pragmatic properties of the
constructions, to see if the frequencies with which features were selected in particular dialects might correlate in any way with the general regional and stylistic findings reported above. One notable finding was that NZE was often more tolerant than
the other varieties of constructional features whose general association was with
the spoken word, suggesting that the vitality of the dummy-subject constructions in
NZE may be more than merely numerical, reflected as well in its tolerance of speech-friendly features in writing.
Further study is needed with larger corpora to test whether the differences
reported here may have been affected by the configurational discrepancies between
the specially-compiled American corpus used in the present study, and the ICE
corpora used.
References
Biber, Douglas, Stig Johansson, Geoffrey Leech, Susan Conrad & Edward Finegan. 1999. Longman
Grammar of Spoken and Written English. London: Longman.
Collins, Peter. 1991. Cleft and Pseudocleft Constructions in English. London: Routledge.
Collins, Peter. 1993. “Cleft existentials in English”. Language Sciences 14(4): 1–15.
Collins, Peter. 1994. “Extraposition in English”. Functions of Language 1(1): 1–18.
Collins, Peter. 2004a. “The information structure of what-clefts in English”. In Gerald Knowles,
Jamaliah Mohd Ali, Jariah Mohd Jan, Su’ad Awab & Zuraidah Mohd Don (eds), Language,
Linguistics and the Real World. Volume 1: Making Linguistics Relevant. Kuala Lumpur:
University of Malaya Press, 227–44.
Collins, Peter. 2004b. “Reversed what-clefts in English: information structure and discourse
function”. Australian Review of Applied Linguistics 27(2): 63–74.
Gregory, Michael & Susan Carroll. 1978. Language and Situation: Language Varieties and their
Social Contexts. London: Routledge and Kegan Paul.
Halliday, Michael A.K. 1978. Language as Social Semiotic: The Social Interpretation of Language
and Meaning. London: Edward Arnold.
Hannay, Michael. 1985. English Existentials in Functional Grammar. Dordrecht: Foris.
Huddleston, Rodney & Geoffrey Pullum. 2002. The Cambridge Grammar of the English Language.
Cambridge: Cambridge University Press.
Leech, Geoffrey, Margaret Deuchar & Robert Hoogenraad. 1982. English Grammar for Today:
A New Introduction. London: Macmillan.
Martin, James R. 1984. “Language, register and genre”. In Frances Christie (ed.), Language Studies:
Children Writing. Geelong, Vic.: Deakin University Press, 21–30.
Quirk, Randolph, Sidney Greenbaum, Geoffrey Leech & Jan Svartvik. 1985. A Comprehensive
Grammar of the English Language. London: Longman.
Rando, Emily & Donna Jo Napoli. 1978. “Definites in there-sentences”. Language 54: 300–13.
Like and other discourse markers*
Jim Miller
University of Edinburgh
This analysis of like as a discourse marker looks at its meaning relative to
its position in the clause, and the discoursal context including the type of
interaction. The data come from the Australian and New Zealand ICE corpora,
and additional transcripts of Australian radio talkback programs. Like is the
sixth most frequent discourse marker in the data, found in speech, both scripted
and unscripted, but absent from writing. Clause-initial like can be glossed as
“for example”; clause-medial like is a highlighter; clause-final like has to do with
explanations and preventing hearers making incorrect inferences. Clause-final
like is attested in novels by Scott and Hogg, and much older than generally
thought. In the antipodean corpus data, like is used by speakers ranging from
teenagers to 50-year-olds, including manual workers, skilled tradesmen, and
various types of professionals.
1. Introduction
This chapter provides a brief overview of some discourse markers current in AusE and
NZE but focuses on like. The occurrence and meaning of most of the discourse markers
(henceforth DM) follow the patterns described in Quirk et al. (1985) and Biber et al.
(1999), but the analysis of like does throw up some surprises with respect to the type of
texts it occurs in, the speakers who use it, the structures that occur and what information
it signals.
The chapter argues that like has various discourse functions. Clause-initial like
highlights clauses and phrases and these constituents may be used to exemplify a
previous general statement. Clause-medial like highlights phrases and the information
they carry; while clause-final like is used to anticipate objections, to provide explanations and to ask for explanations. The speakers who use like in this study are not
typically teenagers: most of them speak standard AusE or NZE and many of them are
highly educated professionals. Like occurs in informal conversation but also in lectures
and in monologic radio programs.
*I am grateful to Pam Peters, Peter Collins and Adam Smith for their insightful comments
on an earlier draft. Adam Smith carried out a computer search for instances of like in
ICE-NZ and supplied the results in a useful spreadsheet. Bernadette Vine, Manager of the
Wellington Corpus, kindly sent me CDs of WSC and WWC. She also gave generous help
to Andrea Calude, who made two visits to Wellington while researching her PhD thesis on
clefts and was able to listen to some passages with like in addition to many passages with
different types of cleft.
The analysis of DMs in AusE is based on data from the online corpora held at
Macquarie University. The discussion is mainly qualitative but the chapter also
offers a brief quantitative overview. The latter is based on data from the Australian
and New Zealand contributions to the International Corpus of English.1 The spoken
and written components of these corpora will be labeled (sp) and (wr) respectively, as in
ICE-AUS (sp) and ICE-NZ (wr). The qualitative analysis deals mainly with a subset of data from
ICE-AUS (conversation and monologue) and from the ART corpus with its transcripts of phone-in/talkback programs. Researchers at Macquarie University made
audio recordings available as well as orthographic transcripts, and both are essential
for adequate syntactic and discourse analysis of spoken language. In particular, for
many instances of like, only the audio recordings made it possible to decide which
clause an instance of like belonged to, or whether an utterance-final like was indeed
in utterance-final position or really belonged to a clause which the speaker failed
to complete.
The ICE-NZ spoken data can only be listened to in Wellington, but some qualitative
analysis was possible. This was carried out on the original Wellington Corpus of Spoken
New Zealand English.2
1. The Australian and the New Zealand contributions to ICE each consist of 300 chunks of
spoken language and 200 chunks of written language, each chunk consisting of 2000 words.
The ART corpus consists of various phone-in programs, some very monologic, extracted from
commercial and especially ABC Radio National, the Australian Broadcasting Corporation. The
sub-sets contain 124 421 words. The spoken contributions to ICE are classified into dialogue,
private and public, and monologue, scripted and unscripted. The written contributions are
classified into letters, persuasive texts, literature, instructional texts, and so on. WSC is classified into face-to-face conversation, telephone conversation, radio talkback, oral history, teacher
monologue, lecture and so on.
2. ICE-NZ has a transcription which uses no capital letters or written-language punctuation.
WSC contains symbols giving information about pauses, rate of speech, non-speech sounds
such as laughter, and so on. ICE-AUS and ART use a simpler transcription in which capital
letters occur. The original transcriptions have been retained here except that, in the interests
of legibility, the information about laughter and so on has been removed from the examples
from WSC.
2. Discourse markers
The title of this chapter indicates that we take like to be a DM, a view that will be
defended in Sections 3 and 5–7. But what are DMs? Schiffrin (2001: 54) mentions well,
but, oh and y’know and adds and, but, because, I mean, by the way, to sum up, so, then,
hence, therefore. Blakemore (2004: 221) gives as her initial list well, but, so, indeed, in
other words, as a result and now. Quirk et al. (1985: 631–4) offer syntactic criteria. They
talk of “adverbials”, not “discourse markers”, but their criteria are presented here as
applying to discourse markers.
As shown by sentences a–c, DMs cannot be the focus of a cleft construction.
a. Nonetheless you should send her the agenda
b. It is the agenda you should send her
c. *It is nonetheless you should send her the agenda
DMs are entirely optional and never carry a participant role such as Agent, Location,
Goal. Discourse markers occur in various positions, though not just anywhere within
the clause. See sentences d–f below for variations on (a) above.
d. You should nonetheless send her the agenda
e. You should send her the agenda nonetheless
f. You should send her nonetheless the agenda
Quirk et al. (1985: 632) state that “we relate [discourse markers] to the speaker’s
comment in one quite specific respect: his assessment of how he views the connection
between two linguistic units”. Quirk et al. emphasize that the linguistic units can be
very large or very small: sentences, paragraphs or even larger parts of a text.
Discourse markers relate both to text and to speakers and hearers, as recognized
by Quirk et al., and by Andersen (1998). Blakemore (2004: 238–9) focuses on users and
treats DMs as encoding information about the relevance of utterances. A may say to B
There’s nothing in your wallet. So you’ve spent all your money. A’s first clause conveys an
assumption to the hearer. In a different situation A might watch B arriving laden with
parcels and say So you’ve spent all your money. The assumption is made manifest by A’s
perceiving that B is heavy-laden.
While it is true that Blakemore’s Relevance Theory approach can deal with the
latter situation, which is beyond the reach of analyses confined to text, this is not sufficient reason to abandon text. All language use is certainly situated in context, but text
and reactions to text constitute the essential data and the focus of analysis.
Here we adopt the approach set out in Schiffrin (2001). It focuses both on
language (What form was used? What was its meaning?) and on social interactions.
DMs are multifunctional; but, for example, can signal that speakers are introducing a
proposition, disagreeing with the hearers, rebutting arguments or establishing themselves as the current speaker in an exchange (Schiffrin 2001: 55–6). Like Quirk et al.,
Schiffrin establishes formal properties of DMs: being syntactically detachable, allowing
various prosodic contours, and occurring in initial position. (Note however examples
d–f above, with DMs in other positions.) Items that have these properties can be single
words such as conjunctions (but), adverbs (now, then), interjections (oh) or lexicalized
phrases (y’know, I mean, after all).
We will see in Section 4 that like marks relationships between different chunks
of text, highlighting certain chunks/putting them into focus or signaling that the
current chunk of text serves as exemplification or explanation of a previous chunk.
The textual relations reflect what the speaker is doing: participating in the exchange
in a relevant way (explaining, exemplifying) or drawing particular attention to a
piece of information. For some users, particularly teenagers, it is possible that the
use of like can be interpreted as a marker of solidarity, but whether it functions as a
meaningless filler is open to doubt. (The analysis proposed in Andersen (1998, 2000)
is discussed in Section 4.) Because of the word limits on this chapter we focus on
textual data but return to the role of speakers in Section 5.
3. Quantitative data
Easy access to a digital corpus of respectable size tempts analysts to perform intensive
quantitative analysis. The writer has resisted the temptation, partly because like (DM)
requires preliminary qualitative analysis, which in turn requires careful listening to
the audio recordings – see Section 1. But some numbers are provided in order to give
a general picture of DM usage, and in particular to demonstrate the place of like (DM)
in that usage.
One crucial issue is whether like is worth the attention that has been paid to
it recently. Macaulay (2005) says “This item [like] has been the focus of intensive
study in recent years...Many of these studies have made a point of trying to counter
the negative image of this item. As a result, perhaps there has been a corresponding
danger of exaggerating its value.” Whether the value of like has been exaggerated
can only be determined by examining its frequency and its functions in discourse.
Nothing can be done about its negative image. However, linguists are supposed to
describe and not prescribe; we will show that like is very frequent relative to other
discourse particles and that it does have discourse functions.
The table below shows the frequency of 20 DMs in ICE-AUS and ICE-NZ spoken
and written texts. Apart from like, the DMs are those mentioned by Quirk et al., and
by Schiffrin (2001) and Blakemore (2004).
Table 1. Comparative frequencies of 20 discourse markers in spoken and written data
from the Australian and New Zealand ICE corpora
ICE-AUS                                            ICE-NZ
DM                frequency    frequency           DM                frequency    frequency
                  in speech    in writing                            in speech    in writing
well              >1000        55                  well              >1000        52
you know          >1000        10                  so                >1000        >220
so                >1000        >220                actually          738          6
like              794          0                   you know          697          13
actually          692          19                  like              670          0
anyway            259          29                  in fact           200          82
in fact           203          79                  anyway            181          68
however           73           347                 however           76           432
therefore         63           121                 therefore         52           187
plus              40           2                   in other words    30           14
in other words    32           10                  after all         16           17
after all         25           14                  nevertheless      13           31
by the way        24           3                   by the way        5            3
as a result       20           16                  as a result       5            12
besides           16           9                   moreover          4            19
in any case       13           10                  plus              3            0
nevertheless      10           19                  in any case       1            3
hence             6            28                  hence             1            62
moreover          1            16                  to sum up         0            0
to sum up         1            0                   besides           0            17
Though there are minor differences in the frequency rankings of DMs within the
two sets of numbers, Table 1 shows that there are comparable high, middle and low
frequency sets in both varieties. In each corpus there are DMs with more occurrences
than like, which is the fourth most frequent DM in ICE-AUS (sp) and the fifth most
frequent in ICE-NZ (sp). The quantities of like (DM) in ICE-AUS (sp) and ICE-NZ
(sp) are similar, and neither corpus presents any in the written component. For the
other DMs, however, the raw numbers cannot give an accurate picture, since the spoken
and written corpora are of different sizes (see footnote 1). Nonetheless several features emerge.
Some DMs are relatively frequent in speech and relatively infrequent in writing
in both corpora: well, so, actually, you know, like. In ICE-AUS (sp) the most frequent DMs are well, you know, so, like and actually, in that order. In ICE-NZ (sp)
the most frequent DMs are well, so, actually, you know, and like, in that order.3 Some
DMs are relatively infrequent in speech but relatively frequent in writing: however,
therefore, hence.4 Many of the DMs mentioned by Blakemore and Schiffrin have
very low frequencies, particularly in the spoken corpora, as shown by the frequencies for to sum up, therefore, hence, in other words, as a result, however, after all,
nevertheless, besides.5
3. The figures for actually relate to examples such as Actually, she dealt very efficiently with
the problems but not She actually refused to hand over the documents.
To provide closer comparison of the spoken and written frequencies of DM usage,
Table 2 shows the normalized frequencies per 10 000 words for a subset of 14 DMs in
the Australian and New Zealand data.
Table 2. Normalized frequencies of 14 DMs per 10 000 words in the Australian and
New Zealand ICE corpora
ICE-AUS frequency in                         ICE-NZ frequency in
DM                spoken    written          DM                spoken    written
well              >16.7     1.4              well              >16.7     2.3
you know          >16.7     0.3              you know          11.5      0.3
like              13.2      0                actually          12.3      0.2
actually          11.5      0.5              like              11.2      0
anyway            4.3       0.7              in fact           3.3       2.1
in fact           3.4       2                anyway            3         1.7
however           1.2       8.3              however           1.3       10.8
therefore         1.1       3                therefore         0.9       4.7
in other words    0.5       1.3              in other words    0.5       0.4
as a result       0.3       1.3              as a result       0.01      0.3
besides           0.3       0.2              besides           0         0.4
in any case       0.2       0.3              in any case       0.02      0.1
nevertheless      0.2       0.5              nevertheless      0.2       0.8
hence             0.1       0.7              hence             0.02      1.6
The normalized frequencies show four points very clearly:
– well, you know, actually and like are primarily spoken DMs
– in fact is both spoken and written; besides is a spoken and written DM in ICE-AUS
  but only a written DM in ICE-NZ, and has very low frequencies
– anyway is primarily a spoken DM in ICE-AUS but has a higher normalized
  frequency in ICE-NZ (wr) than in ICE-AUS (wr)
– however is primarily a written DM; therefore, nevertheless and hence are also
  primarily written DMs but less strikingly so than however
4. The figures for however relate to examples such as However, this turned out to be
incorrect but not however important this might be and however you decide to do this. The
DM plus appears in examples such as You will get a good meal plus you won’t have to pay,
but not in, e.g. They get a salary plus expenses or She gets $60 000 plus.
5. There are other items whose status as DMs is controversial but which may turn out to
merit analysis. One is mm, not the back-channel item but an assertion reinforcer produced
by the speaker. The second DM requiring further investigation is eh, one of whose functions
is to form yes-no tag questions. For the NZE eh, see Meyerhoff (1994); for eh in Scottish
English, see Miller (2008).
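The normalization behind Table 2 can be sketched in Python as follows. The sketch
assumes the nominal corpus sizes given in footnote 1 (300 spoken and 200 written texts
of 2000 words each) and simple rounding to one decimal place. The function and variable
names are hypothetical, and the published figures may rest on actual rather than
nominal word counts, so not every cell need be reproduced exactly.

```python
# Nominal corpus sizes, assumed from footnote 1: each ICE corpus has
# 300 spoken and 200 written texts of 2000 words each.
WORDS_PER_TEXT = 2000
CORPUS_WORDS = {
    "spoken": 300 * WORDS_PER_TEXT,   # 600 000 words
    "written": 200 * WORDS_PER_TEXT,  # 400 000 words
}


def per_10k(raw_count, component):
    """Convert a raw frequency into a rate per 10 000 words of the given
    corpus component ("spoken" or "written")."""
    return round(raw_count / CORPUS_WORDS[component] * 10_000, 1)


# Raw frequencies of like (DM) taken from Table 1.
like_aus_spoken = per_10k(794, "spoken")
like_nz_spoken = per_10k(670, "spoken")
like_aus_written = per_10k(0, "written")

# Raw frequency of however in ICE-NZ (wr), also from Table 1.
however_nz_written = per_10k(432, "written")

print(like_aus_spoken, like_nz_spoken, like_aus_written, however_nz_written)
```

On these assumptions the raw spoken counts for like in Table 1 (794 in ICE-AUS, 670
in ICE-NZ) come out at 13.2 and 11.2 per 10 000 words, as reported in Table 2.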
We close this section with a closer look at the raw frequencies of like (DM) in the
different types of spoken text contained within the corpora. What comes out very clearly
is that like is primarily a DM in private dialogue (whether domestic or social conversation), but also relatively frequent in public dialogue including radio discussion.
Table 3. Raw and normalized frequencies per 10 000 words for like, in the four subsets of
spoken data in the Australian and New Zealand corpora
                               ICE-AUS                        ICE-NZ
spoken text-type               raw frequency    normalized    raw frequency    normalized
private dialogue (S1A)         683              34.1          555              27.8
public dialogue (S1B)          78               4.9           103              6.4
unscripted monologue (S2A)     24               1.7           11               1
scripted monologue (S2B)       9                0.9           1                0.1
The comparative data summarized in Table 3 shows that like (DM) is somewhat
more frequently used in the private and public dialogue of ICE-AUS (= 39 per 10 000)
than ICE-NZ (= 34.2 per 10 000). Yet the rankings of its relative frequency in each of the
four text-types are the same in both corpora. Both find like (DM) far more common in
spontaneous dialogue than in any of the other text-types, with only very low levels of
occurrence in monologue, unscripted or scripted. The Table also confirms that like (DM)
has a higher frequency (per 10 000 words) in spontaneous dialogue than that of other
DMs in all kinds of speech: compare well, you know, and actually in Table 2 above. This is
some measure of its importance in everyday conversation.
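The arithmetic behind Table 3 can be sketched in a few lines of Python. This is an illustrative check only, not the procedure used for the chapter. The raw counts and printed rates are copied from the table. The subset sizes in ASSUMED_WORDS are an assumption based on the nominal ICE design (100, 80, 70 and 50 texts of about 2 000 words each), since the chapter does not state the actual word counts. The function names are hypothetical. Under that assumption the recomputed rates agree with the printed ones to within rounding, except that ICE-NZ unscripted monologue comes out near 0.8 where the table prints 1. The sketch also shows that the figures of 39 and 34.2 quoted in the text are the sums of the printed S1A and S1B rates, and that the rank order of the four text-types is the same in both corpora.

```python
# Figures printed in Table 3: (raw count, normalized rate per 10 000 words).
TABLE3 = {
    "ICE-AUS": {"S1A": (683, 34.1), "S1B": (78, 4.9), "S2A": (24, 1.7), "S2B": (9, 0.9)},
    "ICE-NZ": {"S1A": (555, 27.8), "S1B": (103, 6.4), "S2A": (11, 1.0), "S2B": (1, 0.1)},
}

# ASSUMPTION (not stated in the chapter): nominal subset sizes from the
# standard ICE design, i.e. 100/80/70/50 texts of about 2 000 words each.
ASSUMED_WORDS = {"S1A": 200_000, "S1B": 160_000, "S2A": 140_000, "S2B": 100_000}


def per_10k(raw: int, words: int) -> float:
    """Normalized frequency: occurrences per 10 000 words."""
    return raw / words * 10_000


def recomputed(corpus: str) -> dict[str, float]:
    """Rates recomputed from the raw counts and the assumed subset sizes."""
    return {t: per_10k(raw, ASSUMED_WORDS[t]) for t, (raw, _) in TABLE3[corpus].items()}


def ranking(corpus: str) -> list[str]:
    """Text-types ordered from highest to lowest printed rate."""
    rates = {t: rate for t, (_, rate) in TABLE3[corpus].items()}
    return sorted(rates, key=rates.get, reverse=True)


def dialogue_sum(corpus: str) -> float:
    """Sum of the printed S1A and S1B rates, as quoted in the text."""
    return round(TABLE3[corpus]["S1A"][1] + TABLE3[corpus]["S1B"][1], 1)


if __name__ == "__main__":
    for corpus in TABLE3:
        print(corpus, ranking(corpus), dialogue_sum(corpus))
        for text_type, rate in recomputed(corpus).items():
            printed = TABLE3[corpus][text_type][1]
            print(f"  {text_type}: recomputed {rate:.2f}, printed {printed}")
```

Because the dialogue figures are sums of two rates rather than a rate over the pooled word count, they are best read as a rough index for comparing the two corpora, not as a frequency per 10 000 words of dialogue.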
4. Constructions with like: Competing analyses
There are competing analyses of like, although the label like (DM) used in the preceding
sections presupposes an analysis of like as a discourse marker. This analysis will be set
out and defended below as we review three kinds of analysis. The AusE and NZE data
do not support the view that like is a filler. It is not accompanied by hesitations, false
starts and long pauses, which would indicate lexical indecision or problems in planning syntactic structure. In this respect the data match two sets of Scottish English
data, one analyzed by Miller and Weinert (1995) and the other by Macaulay (2005).
 Jim Miller
The former emphasized that like was integrated with the clause syntax, a finding
supported by Macaulay.
4.1 Like as a marker of intimacy and solidarity, especially among teenagers
Siegel (2002) treats like as having an interpersonal function among adolescents (though
she also claims that it is a product of lexical indecision). It reflects spontaneity and
intimacy between interactants who know each other and are participating in informal
exchanges. Siegel’s analysis is parallel to Schiffrin’s earlier account with the concepts of
participation and exchange structures.
The AusE and NZE data show that teenagers frequently use like (DM) – but
they are not the sole users, far from it (see Section 5.1 below). And since some of the
Australian teenagers who did use it were contributing to phone-in radio programs,
this does not support the idea that like (DM) necessarily reflects intimacy. These are
not face-to-face encounters but very public discussions. The callers are not friends or
acquaintances of the presenter or experts. Moreover, since potential contributors have
to telephone the radio station and join a queue of callers, they have time to prepare
at least a skeleton of what they want to say and are not being totally spontaneous. In
contrast, Siegel’s (2002) data was gathered by her daughter accosting friends and fellow
students at her high school, and asking them to answer on the spot a philosophical
question that required complex thinking.
Our conclusion is that while the use of like can reflect solidarity among speakers,
this fails to explain the vast majority of uses. It fails to distinguish like from other words
and phrases, since reciprocal accommodation among speakers can affect pronunciation, syntax, lexical items, phrases and idioms. Siegel’s explanation also does not apply
to clause-final like – see Section 9 below. Speakers who have reached the end of a clause
are no longer searching for syntactic patterns or lexical items for that clause.
4.2 Like as a marker of loose talk
Andersen (1998, 2000) considers like to be a signal of “loose talk”. This is talk
produced by people who, knowing that the exact time on their watch is 10.31.04,
announce: It’s ten thirty. The reply “impoverishes” the proposition in order to give
hearers the maximum of useful information (“optimal relevance”) with a minimum
of processing effort. In Andersen’s view, looseness is a general property of utterances
and literalness is the exception.
Andersen’s analysis is open to three major objections.
–– Most statements in informal conversation are examples of loose talk but only a
   minority of them contain like.
–– He provides no context for his examples and no argument for the analysis.
–– In the AusE and NZE data (and also the Scottish English data analyzed in Miller
   and Weinert 1995), speakers working with approximations hardly ever use like.
   Instead they use about, something like that, stuff like that, sort of like, kind of,
   whatever and about like as in 1–3 below. Macaulay (2005: 83) says that in the
   Glasgow corpus no more than a third of like + NP sequences could possibly be
   interpreted as “approximately”. The spoken part of ICE-AUS has only 10 examples
   of like immediately preceding a numeral, and its NZ ICE counterpart only 6.
Two examples are shown below. In the first, like precedes a numeral, while (2) is
an example of an approximation.
(1) A: …I had two months in Africa on the way back…I actually went back at
Christmas time…just for five weeks because I had a sort of long distance
sort of relationship going which is a bit ridiculous ’cos I mean
B: Mmm Mmm Right Really Oh Is it still going Can it work
A: Well yes…Kind of…I met I met X in um Zimbabwe…and it was the
week before I came back to Australia and he was flying back to London
and um we had like five days together…[ICE-AUS S1A-040:105-14]
Given the situation discussed in (1), we can be sure that speaker A knows exactly how
many days she spent with her boyfriend. In (2), on the other hand, speaker B signals
that the figure is approximate with the use of about:
(2) A: How much does it cost
B: It’s gonna cost us about five grand something like that
[ICE-AUS S1A-011:22-3]
Here the approximator about five grand is followed up with the phrase something like
that which the writer would use to mean “I don’t know the exact figure but my guess
is quite close”. In (3) below, the uncertainty is signaled by the phrase sort of like and
reinforced by the phrase or whatever. The essential point is that speakers do not signal
approximations just by means of like.
(3) FA: okay so we’re at it’s at we’re at five and it’ll be sort of like five till eight or
whatever and er carols and some food [WSC DPF020:0180]
Schourup (1985: 35–6) claims that speakers use like to signal that their conception
does not match the real world situation they refer to. This is not dissimilar to Andersen’s analysis, but Schourup’s analysis is difficult to replicate, and he provides no detailed
context or arguments for his account.
4.3 Like as a discourse marker
The analysis in terms of discourse function was first proposed by Underhill (1988)
and was independently used in Miller and Weinert (1995). Underhill provides rich
contexts, linguistic and extralinguistic. Miller and Weinert likewise provide rich
contexts but they also supply linguistic arguments. For example, they point out that
in 32 task-related dialogues (i.e. half their corpus), 76% of the occurrences of clause-initial and clause-medial like can be paraphrased by a Wh-cleft; 8% can be paraphrased
by an it-cleft; 4% by a structure with do you mean. The remaining 12% were unclear.
Consider examples 4–6 below, from a set of task-related dialogues.
(4)
A: go straight down + to the cattle stockade
B: like above or below it?
A: below it
B: right OK
[rephrasing: is it above or below it?/do you mean above or below it?]
(5)
A: er I’m I’m not very sure ++ what I’m supposed to be doing
B: em and then you have to go down again
C: like I go past the collapsed shelter?
[=so what I do is go past the collapsed shelter?]
(6) A: to the lefthand side of East Lake? like the very far end of East Lake?
[Is it the very far end of East Lake? // Do you mean the very far end of East Lake?]
Wh-clefts, it-clefts and Do you mean…? all serve to highlight constituents. Miller and
Weinert take the intersubstitutability of the clause-initial/clause-medial like constructions and the cleft constructions, and the paraphrase relationship with do you mean,
as evidence that they have the same discourse function and that like also highlights
constituents. These criteria do not apply to clause-final like. As we will see in Sections 7
and 8 below, this analysis applies straightforwardly to the AusE and NZE data.
5. The speakers
Who uses like (DM)? According to Siegel (2002) it is a hedge used by adolescents in the
United States. In fact, it is used by teenagers in all parts of the UK (Miller and Weinert
(1995), Andersen (1998, 2000), Levey (2003) and Macaulay (2005)), and in Australia
and New Zealand.
More importantly, like (DM) is used by adult speakers, male and female, in their
twenties and thirties or older; by speakers with a minimum of formal education,
speakers receiving higher education and speakers with university degrees and occupying professional posts. Although the Australian databases provide little socioeconomic
information about the contributors, a certain amount can be gleaned from voices, roles
and content of conversations. In contrast, there is a lot of socioeconomic information
about the New Zealand speakers in the Guide to the Wellington Corpus of Spoken NZ
English, a subset of which is part of the New Zealand contribution to ICE.
5.1 Australian speakers
Almost all the occurrences of like in the private dialogues (S1A) of the ICE corpus analyzed in detail were produced by male and female speakers between 20 and 40 years of
age. In the extracts analyzed in detail from S1B (public dialogue), the occurrences of like
(DM) are produced by university lecturers or school teachers in their 30s and 40s.
The callers to the phone-in radio programs included in the ART corpus ranged
in age from the mid 20s to the late 40s and possibly older. The experts answering the
questions ranged in age from the mid 30s to the mid 50s. Some of the experts used like
(DM), e.g. the scientist/science communicator Karl Kruszelnicki, and children’s literature specialist Kerry White (in their early fifties and forties respectively at the time of
the recordings), and a specialist in personal problems. However, two of the experts did
not use like: the author Tim Winton and a medical specialist. One caller specifically
identified himself as a teenager and another indirectly identified herself as a teenager
by referring to the secondary school she attended. Both of these younger callers used
like (DM) far more frequently than the others. The Australian data allow us to hypothesize that teenagers (at least in Australia) are heavy users of like, that this use declines as
they move into their twenties, but that they may continue to use like throughout adulthood.
Whether they use it at a constant rate has yet to be investigated.
5.2 New Zealand speakers
The occurrences of like (DM) that were analyzed in detail came mostly from speakers
in their 20s, 30s or 40s. There were three school students between the ages of 16
and 19, but the adult speakers included a computer consultant, a journalist and a
midwife (30–34), who each produced 3 occurrences; a public health officer (20–24),
who produced 6 occurrences; two IT Managers (45–49); a Policy Manager (40–44);
a software consultant (50–54); a lecturer (45–49); the Deputy Head of a School of
Health Sciences (40–44); the Head of a Polytechnic School (50–54). Occurrences
of like were also produced by a service station attendant (45–49), a builder (20–24), and
firefighters (30–34).
The traditional practice of recording data on the hoof with pencil and notebook
(or cut-and-paste on computers) also brings useful data, such as (7), an e-mail from
a very senior female academic in the University of Auckland, or (8), from a very
well-educated middle-class Scottish woman in her fifties.
(7) Just to remind everyone that all the documents relating to Special Topics are
due in very soon – like Friday.
(8) The bottom has really fallen out of the B&B market – like I mean I have
had literally no “oats”. This is terribly bad for general feelings of well-being &
self-confidence.
The idea that like (DM) is confined to the language of teenagers is clearly wrong.
Macaulay (2005: 83), working on a corpus of data collected from Glasgow teenagers
in the late 1990s by Jane Stuart-Smith of the University of Glasgow, found that the
heaviest users of like (DM) were female and middle-class, which is compatible with the
users in ICE-AUS and ICE-NZ. As Macaulay remarks, this is not the result one would
expect for an item that is nonstandard.
6. Constructions with like
6.1 The functions of like (DM) in three sentence locations
Like plays different roles in discourse depending on its position in the sentence. Both
clause-initial like (Section 7) and clause-medial like (Section 8) highlight new information (Miller & Weinert, 1995; Underhill, 1988). They also link up-coming clauses with
preceding text, marking them as exemplifying some previous point. Miller and Weinert
analyze clause-final like as having to do with anticipating objections and providing or
requesting explanations, as argued in Section 9 below.
Of the three like constructions, the clause-initial one is the most frequent in both
the ICE-AUS and ICE-NZ data.6
Table 4. Comparative frequencies of like (DM) in three sentence positions in spoken data
from ICE-AUS and ICE-NZ
              ICE-AUS (sp.)    ICE-NZ (sp.)
cl-initial    58               50
cl-medial     58               26
cl-final       6                3
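The counts in Table 4 can be cross-checked against the totals of 122 (AusE) and 79 (NZE) extracts reported in the note on the data, and converted into proportions. The following Python sketch is offered only as an illustration. The counts are read from the table, while the helper names (total, shares, aus_to_nz_ratio) are hypothetical and not part of the original study. The figures are raw counts from the extracts examined in detail, not normalized frequencies, so the percentages describe the samples only.

```python
# Counts of like (DM) by clause position, as read from Table 4.
TABLE4 = {
    "ICE-AUS": {"initial": 58, "medial": 58, "final": 6},
    "ICE-NZ": {"initial": 50, "medial": 26, "final": 3},
}


def total(corpus: str) -> int:
    """Number of extracts examined for one corpus."""
    return sum(TABLE4[corpus].values())


def shares(corpus: str) -> dict[str, float]:
    """Percentage of each clause position among the extracts for one corpus."""
    n = total(corpus)
    return {pos: round(100 * count / n, 1) for pos, count in TABLE4[corpus].items()}


def aus_to_nz_ratio(position: str) -> float:
    """Ratio of the Australian count to the New Zealand count for one position."""
    return TABLE4["ICE-AUS"][position] / TABLE4["ICE-NZ"][position]


if __name__ == "__main__":
    for corpus in TABLE4:
        print(corpus, total(corpus), shares(corpus))
    for position in ("initial", "medial", "final"):
        print(position, round(aus_to_nz_ratio(position), 2))
```

On these counts the two samples are closest for clause-initial like, while the Australian sample has just over twice as many clause-medial examples as the New Zealand one.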
6. For AusE 122 extracts with like were examined in detail. Some of the extracts were from
ICE-AUS, others were from ART. For NZE 79 extracts were examined in detail. The extracts
were taken direct from WSC (with the exception of one example of clause-final like that
turned up in the DVD of a film). Some of them may also be in ICE-NZ.
The table shows that like (DM) is used more frequently in the Australian data in
all three positions. Yet while the frequencies are much the same for clause-initial use,
there is much greater use in AusE of clause-medial like. The very low frequencies for
clause-final like in both corpora are also worth commenting on. This construction
is not generally recognized and rather elusive, perhaps because of its specialized
function: not just to highlight but to signal the supply of and requests for explanations.
In fact it is probably the oldest like construction, attested in dialogue in the novels of
Sir Walter Scott and James Hogg.
(9) The leddy, on ilka Christmas night, gae twelve siller pennies to ilka puir body
about, in honour of the twelve apostles like
(Sir Walter Scott Guy Mannering (1815) Cited in the Oxford English Dictionary)
(10) A man, my good friend, may act foolishly at a time, an’ yet no be a’thegither a
fool. To be a fool, you see, is to – is to – In short, it’s to be a fool – a born fool like.
(Hogg Winter Evening Tales 1820 Edinburgh: Oliver and Boyd. Reprint of the
Stirling/South Carolina Edition, ed. Ian Duncan 1995. Edinburgh: Edinburgh
University Press 2004. 245–6)
That clause-final like has its own discourse function was demonstrated by the following event. During the summer of 1991 the writer and his colleague Regina Weinert
had the services of a final-year (fourth year) undergraduate student at Edinburgh
University. Mike Cullen was in his late twenties and had come to university late. He
was studying for a degree in English Language and Linguistics and was an everyday
user of Scottish English and Scots. One day when Mike arrived late for a meeting, the
following interaction took place.
(11) MC1: sorry I’m late my father’s in the Royal [Edinburgh Royal Infirmary] again
[JM and RW look at each other, knowing that MC’s father has had serious heart
surgery.]
MC2: it’s OK he’s just in for observation like
JM and RW immediately commented that MC had used a clause-final like. MC
responded that he had done so because he could see that they were jumping to
the wrong conclusion and was providing another piece of explanation to prevent
the error. When JM and RW re-examined their examples of clause-final like, all in
rich context, they found that they were all used as part of explanations, many of
which served to prevent or correct wrong assumptions or conclusions, or as part
of requests for explanations. The speakers in (9) and (10) are supplying explanations (as the narrative context makes clear) and the speaker in (12) was asking for
further explanation.
(12) A1: how many of these interview or conversation things have you done?
B1: have I done? I must have done quite a few now actually – I mean I’ve spent
a couple of terms
A2: mostly in Edinburgh like? or
B2: at X – I was down at X High School – I had a couple of terms down
there – it was good – it was all laid on...
As Section 9 will show, clause-final like has the same explanatory function in the
Australian and New Zealand data.
7. Clause-initial like (DM)
Like (DM) is found in the databases of spoken AusE and NZE but not in the written
ones, as noted above in Section 3. In the Australian data it is used by speakers of both
genders and all ages, many of whom have gone through higher education. They use like
in private, informal conversation but also in public dialogues, such as discussions in
classrooms or on radio. We begin with an excerpt from talkback radio on ABC Radio
National. Note that the transcript uses minimal punctuation within each turn, to avoid
prejudging the sentence structures, but the beginnings of individual clauses can be
read in context. [P1] is the presenter, and the expert [E2] is Kerry White, a children’s
literature specialist.
(13)[P1] ... what I’m interested in is…what it tells us about our approach to
children and childhood. In the sixties it was very learning orientated
around literacy 〈E2 yes〉 rather than firing the human imagination.
[E2] That’s right I think the idea of fired human imaginations was left
up to family and friends that like that was a personal thing uh and so the
the school wasn’t providing um fiction as such and so at h you depended
on what happened at home. Um I know in Wollongong we did have a
a reasonably good children’s library but that had only started in the forties
like it hadn’t been around a long time and I and it became better resourced
by the sixties it was quite good but not everyone had access to that
I presume. [ART ABCnat3]
The two occurrences of like (DM) are in E2’s extended contribution. The first is
preceded by that, which is interpreted here, on the basis of the sound recording,
as a demonstrative. The speaker pauses (a short pause) and begins again, this time
producing a like. It highlights the upcoming clause: that was a personal thing, but it
also signals this clause as an addition to the preceding statement and an explanation
as to why family and friends were left to fire, or not to fire, children’s imaginations.
The second occurrence of like (DM) has a similar function. Kerry White comments that while Wollongong had a reasonably good children’s library it had only been
established in the 1940s. By the late 1990s the library had been in existence for 50 years
but Kerry White is talking about her experience as a child, and we know from her web
page that she was born in 1958. The significance of its being founded in the 1940s is
precisely that it had not been in existence for very long when she was using it. Like
(DM) signals an addition to that had only started in the forties, the addition being an
explanation of why the statement is significant.
Consider now the following text produced by the media-savvy Australian scientist,
Karl Kruszelnicki, born in 1948 according to his web page. In the following talkback
segment, he is the expert responding to a question about stitches, the kind people can
suffer when out running or walking.
(14) E1 Well there’s a guy called Darren Morton ...and he’s been looking at the stitch
and there’s been y’know what what the heck’s going on when you got a stitch.
Like you think oh it’s because you’re not very well trained y’know and but uh
y’know like one fifth of highly trained athletes you couldn’t train any more
they’d drop dead get stitches [ART ABCnat6]
The question is What causes a stitch? Kruszelnicki provides a possible answer, which
is introduced by like. The like highlights the answer in the following chunk: Like you
think… not very well trained, y’know, which could be glossed as “For example” or
“Here’s an example of an answer to that question: it’s a lack of training”. Clause-initial
like is balanced by clause-final y’know, which Kruszelnicki uses to include his listeners
in the discussion and the search for common ground.
Explanatory uses of clause-initial like also occur in private dialogue. Consider
(15) below, an informal conversation between a couple of Australian women in their
twenties. Speaker A is describing how she was badly injured in a road accident on a
freeway. The police had stopped the traffic, which was backed up, not just an ordinary
traffic jam but right across the bridge and further. The latter piece of information is
introduced by like, which again highlights the upcoming clause. It also marks it as an
explanation of how much traffic there was and the extent of the jam. A few lines later
in the conversation, in (16), the speaker explains that she had a tyre blowout (Because
my tyre blew) and foregrounds the fact that it wasn’t just an ordinary blowout (Like the
tyre blew right off the wheel).
(15) A: ... cos I remember lying on the side of the road ...
B: On the free on the freeway
A: And they had the traffic banked back Like it went right across the bridge
and out that way and they had the ambulances coming backwards down the
road [ICE-AUS S1A-041:9-15]
(16) B: Well why were you outside the car on that freeway
A: Because my tyre blew
A: Like the tyre blew right off the wheel so I was getting a tyre out of the
car and I couldn’t go anywhere because there was absolutely nowhere
to go …[ICE-AUS S1A-041:21-4]
Informal conversation in ICE-NZ provides a similar example (17) of explanatory
use of clause-initial like. The discussion concerns scarce medical resources and
access to them. Sammy’s friend was in very poor health and probably not going
to benefit from a liver transplant. Nonetheless his parents were wealthy and were
able to pay for a transplant. The speaker states that Sammy’s friend’s parents were
wealthy and then deviates from the main line of the story to say how they came to
be wealthy and what the speaker means by “wealthy”. The deviation is introduced
and highlighted by like:
(17) DY that friend of sammy’s had a new liver
MS what did you say then
DY 〈slowly〉that friend of sammy’s〈/slowly〉
MS oh right
DY had a new bit but i didn’t think that was very nice either cos i mean he was
just about dying 〈,,〉 but because his parents were exceedingly wealthy and like
his father was the financial director of X they happened to have two hundred
thousand in cash sitting round home so they could pay for him to have it in
sydney. [WSC DPF014: 0410-0430]
In all these examples (14–17), clause-initial like is used by the speaker to flag crucial
information for the listener. The device is used freely in both public and private
dialogue, on both sides of the Tasman Sea.
8. Clause-medial like (DM)
The clause-medial construction is possibly the most familiar one although the comparative frequencies shown above in Table 4 suggest that its regional frequency may
vary. There were twice as many examples in ICE-AUS as in ICE-NZ. We can nevertheless
demonstrate its use in both by adult speakers and teenagers, and its highlighting or
focusing function in public and private speech.
Example (18) is from a conversation between two Australian women in their
twenties. Speaker A is talking about a man she had seen. She employs a technique
common in unplanned speech whereby speakers do not produce complex noun
phrases but produce serially the components of a possible noun phrase. This technique lightens the cognitive load of speech production and helps comprehension of
the various pieces of information, as shown in the following:
(18)
A: He was tall
B: Hmm
A: He had sandy sort of blond-brown hair, really nice, in a little pony-tail
B: Hmm
A: He had beautiful skin, really beautiful skin, like really tanned and he had a
little tattoo on his arm, on his arm [ICE-AUS S1A-007:2-7]
The first piece of information in (18) is that the man had beautiful skin and the second,
that the skin did not just have ordinary beauty. After really beautiful the speaker pauses
before adding like really tanned. Tanned carries a High Rising Terminal (HRT). The
like phrase conveys new information but is also the culmination of the description,
which is highlighted by like and by the HRT (whatever other functions the HRT
might have).7
Example (19) shows clause-medial like in an excerpt from an Australian scripted
lecture. There are one or two disfluencies in the delivery but this is an excellent example
of how highly-educated speakers use like (DM), integrating it with complex language.
Again like functions as a focusing or highlighting item.
(19) A: Low self esteem starts early. Research among primary school students in the
US revealed that only twenty nine percent of girls were happy the way I am
compared to forty six percent of boys and that’s only in primary so imagine
what it would be in secondary
A: That would be like huge [ICE-AUS S2B-044:151-5]
The first segment quotes the actual percentages, which is followed by a rhetorical
question, used to emphasize their size. The additional turn (155) with like signals
new information, how to assess the percentages, but it also highlights huge as the
culmination of her comment.
Example (20) comes from casual conversation in WSC. Two occurrences of like
(DM) are produced by the same speaker in a discussion of driving in snowy conditions.
(20) BN ...so we thought oh we’ll take the car because we’ve got chains and
everything with it ...
BN i mean there will there WAS this what would you say the first one was
nineteen seventy seven there was snow or something in the village like when
we got here
AS yeah
BN you needed like CHAINS to drive around town [WSC DPF026:0200-0240]
The first instance of like (DM) occurs not in the middle of a single clause but in the
middle of a clause complex, consisting of the existential clause there was snow or
something in the village and the adverbial clause of time like when we got here. The speaker
emphasizes that they didn’t have to wait; snow was lying in the village, never mind up
on the mountains. The second instance of like highlights CHAINS. What is emphasized
is that even in the town/village it was necessary to fit chains to the wheels in
order to be able to drive around. Higher up, on the roads to the ski-slopes or climbing
areas themselves, chains are usually required in winter, but not in the town. CHAINS is
highlighted both by emphatic stress (signaled by the capitals) and by like.
7. According to Horvath (2008), the use of HRT is an ongoing change in AusE led by women.
She describes two proposed functions of HRT as plausible: seeking confirmation that the
listener has understood, or requesting the “heightened participation of the listener.” She
states that HRTs are used by speakers of all ages and socio-economic backgrounds, the
heaviest users being teenage working-class girls. Bauer and Warren (2008) accept Horvath’s
set of functions but add that HRT has been shown to be a positive politeness marker.
Clause-medial like is shown in all these examples to put the spotlight on the
following piece of information, and give it additional rhetorical and dramatic force.
9. Clause-final like (DM)
We come now to the rarest of the three constructions with like, as indicated by its low
frequency in both the Australian and New Zealand data (Table 4 above), and the dearth
of attestations in the literature. With the exception of D’Arcy (2005), clause-final like is
mentioned only in accounts of Scots or Scottish English. Examples are nevertheless to
be found in novels by Dorothy Sayers and John Mortimer, but only in dialogue in nonstandard English, from East Anglia and Cockney. In the Australian data clause-final
like is used by speakers producing quite complex language and discussing complex
topics. The speakers in the following excerpts are well educated, the first being a female
university lecturer, probably in her thirties. In (21) she discusses the characters in
Othello, their relationships, and why Desdemona has been criticized:
(21) A: Next, Desdemona Um there’s been a lot of criticism about Desdemona,
whether she’s um passive, just a helpless wimp who doesn’t do anything,
or whether she’s a um, next to a whore like, um, and she sort of asks asks
for what she gets in the end
A: Um, I I sort of, I sort of don’t see her as passive and I don’t see her as as a
whore either [ICE-AUS S1B-002:43-7]
The critics’ crucial question: “Is she a wimp, or is she a whore?” is marked off with
clause-final like. In (21) it signals the provision of an explanation. In other cases it can
be part of a request for further information, as in (22) and (23), both uttered by experts
during discussions on radio. In (22) the caller is countering the expert’s assumption
that he is going to drop a particular topic and says that he just wants more information
before developing the topic:
(22) C4: Oh no I was going to develop that but um 〈,〉 ih I was just wondering what
if you can give me a guide on what month like uh [ART ABCe1]
Similarly in (23), Speaker A is challenging B to supply more properties of pseudo-memory.
(23) A: I I I couldn’t really see it um you know too clearly than more more clearly
than that um…
A: Well what else can we think of as actually defining pseudo-memory like
B: Wasn’t it influenced by a lot of different parameters such as genome known
susceptibility and context
A: Yeah but that that doesn’t actually define it [ICE-AUS S1B-017:7-17]
Clause-final like serves to mark argument and counterargument in formal discussion
as well as private conversation. Example (24) is from a New Zealand telephone
conversation, with reflections on doing work for someone but doing it for nothing:
(24) BS: i’ve OCCASIONALLY thought that you could actually do some work for
kelvin and sharon but i’m not sure
AC: but they i should just do it for them for free
BS: yeah tricky isn’t it
AC: and also you don’t want to sort of overstep their privacy like
[WSC DPC059:0160-0175]
The turning-point of the discussion follows BS’s agreement (l.170) that the situation is
tricky, and AC’s alternative suggestion: you don’t want to overstep their privacy like. The
use of clause-final like marks the clinching argument in an exploratory conversation.
10. Like in combination with other DMs
Research on the use of like (DM) in Australian and NZ sources turned up examples in
which it combined with you know. Two such instances occur in (14) discussed above,
where the text-highlighting like is counterbalanced by an audience-oriented y’know,
in either order. A further example where like comes as the second component can be
found below as (25), from a recent DVD of the New Zealand film Goodbye Pork Pie,
containing a discussion of the film’s stunts by one of the writers and producers, Geoff
Murphy. He explains that they did not try to imitate the extravagant stunts to be seen
in films such as the James Bond series:
(25) when we do the stunts we’ll try and [ ] the stunts so that the average bloke in
the audience could go “I can do that” you know like instead of the super-duper
major spectacle stunts that are obviously impossible
The use of you know engages the audience before flagging the details of the explanation
with like. An Australian example of this combination, but in the reverse order, can be
found in (26). Here Speaker B counters A’s assumption that he will abandon public
transport and defends his decision to learn to drive a car:
(26) A: You live too far away
B: Well no See I public I always use public transport like you know as a rule
[ICE-AUS S1A-011:169-70]
In this combination, the use of you know probably serves to mitigate the first quasi-categorical statement which is emphasized by clause-final like, and it prefaces the move
to the more flexible as a rule. Used together, the two DMs allow the speaker to both
mediate and modulate his position. In all these examples like signals what the speaker is
doing, while y’know brings hearers into the discussion and encourages them to accept
the explanation or counter-reason. Speakers naturally prefer their explanations to be
accepted by their hearers; this explains why the two DMs occur together, though it does
not explain why the pairing is relatively frequent in the antipodean data but absent from
the Scottish English data analyzed by Miller and Weinert (1995).
11. Conclusion
The key points in this chapter are that like (DM) has three different discourse functions
depending on its position in a clause or phrase. Evidence to support this analysis can
be found in the form of paraphrases, and the substitution of other constructions for
those with like. Clause-medial like is simply a highlighter. Clause-initial and clause-final like are both implicated in the process of explanation or exegesis. Clause-initial like
has the simpler function, since it signals that an explanation or exegesis is being supplied by means of examples; hence the possible substitution of for example or that is. It
may also signal a side-step from or interpolation in the flow of the narrative in order to
provide the explanation. Clause-final like signals that an explanation is being supplied
in order to anticipate an objection or to counter a proposition already expressed (and
also to request an explanation).
Clause-initial and clause-final like both have an interpersonal role, the former
because of its use by speakers to persuade their hearers to go along with an explanation
or assertion, the latter because it is closely implicated in the give-and-take of discourse:
“I can see you are about to make an objection, I can anticipate what it is and I hereby
counter it” or “You have signaled a proposition, I disagree with it and I hereby signal
that I disagree and explain why”.
The accounts proposed by Siegel and Andersen and summarized in Section 4 put
like (DM) outside the language system or at best on the periphery. The account given
in Sections 7–9 places like (DM) firmly in the language system, in the component concerned with discourse organization.
References
Andersen, Gisle. 1998. “Like from a relevance-theoretic perspective”. In Andreas H. Jucker &
Yael Ziv (eds), Discourse Markers. Descriptions and Theory, 147–70. Amsterdam: John
Benjamins.
Andersen, Gisle. 2000. Pragmatic Markers and Sociolinguistic Variation. Amsterdam: John
Benjamins.
Bauer, Laurie & Paul Warren. 2008. “New Zealand English: Phonology”. In Burridge & Kortmann
(eds): 39–63.
Biber, Douglas, Stig Johansson, Geoffrey Leech, Susan Conrad & Edward Finegan. 1999. The
Longman Grammar of Spoken and Written English. London: Longman.
Blakemore, Diane. 2004. “Discourse markers”. In Lawrence Horn & Gregory Ward (eds),
The Handbook of Pragmatics, 221–40. Oxford: Blackwell.
Burridge, Kate & Bernd Kortmann (eds). 2008. Varieties of English. The Pacific and Australasia.
Berlin: Mouton de Gruyter.
D’Arcy, Alexandra. 2005. “Like: Syntax and Development”. Ph.D. thesis, University of Toronto.
Horvath, Barbara. 2008. “Australian English: Phonology”. In Burridge & Kortmann (eds): 89–110.
Levey, Stephen. 2003. “He’s like ‘do it now!’ and I’m like ‘No!’ ”. English Today 19(1): 24–32.
Macaulay, Ronald K.S. 2005. Talk That Counts. Age, Gender and Social Class Differences in Discourse.
Oxford: Oxford University Press.
Meyerhoff, Miriam. 1994. “Sounds pretty ethnic, eh?: A pragmatic particle in New Zealand
English”. Language in Society 23: 367–88.
Miller, Jim. 2008. “Scottish English: Morphology and syntax”. In Bernd Kortmann & Clive Upton
(eds), Varieties of English. The British Isles, 299–327. Berlin: Mouton de Gruyter.
Miller, Jim & Regina Weinert. 1995. “The function of LIKE in dialogue”. Journal of Pragmatics
23: 365–93.
Quirk, Randolph, Sidney Greenbaum, Geoffrey Leech & Jan Svartvik. 1985. A Comprehensive
Grammar of the English Language. London: Longman.
Schiffrin, Deborah. 2001. “Discourse markers”. In Deborah Schiffrin, Deborah Tannen &
Heidi E. Hamilton (eds), The Handbook of Discourse Analysis, 54–76. Oxford: Blackwell.
Schourup, Lawrence C. 1985. Common Discourse Particles in English Conversation. New York
NY: Garland.
Siegel, Muffy E.A. 2002. “Like: the discourse particle and semantics”. Journal of Semantics 19: 35–71.
Underhill, Robert. 1988. “Like is, like, focus”. American Speech 63: 234–46.
Final but in Australian English conversation*
Jean Mulder, Sandra A. Thompson & Cara Penry Williams
University of Melbourne / University of California, Santa Barbara / University of Melbourne
In contemporary Australian English but has progressed through a
grammaticization continuum to become a “fully developed” final discourse
particle. Here we document the place of Final Particle but in Australian English.
Firstly, we make a case that it provides further evidence of the mixed origins
of Australian English. Secondly, we show how prosody, turn organization, and
speaker interaction indicate that Final Particle but marks contrastive content and
is a turn-yielding discourse particle. Thirdly, we establish through survey data
that its usage in Australian English differs from that in American English and
that but as a Final Particle can be seen as a distinctive feature of Australian
English. Lastly, we argue that Final Particle but has social meaning and can
index “Australianness”.
1. Introduction
In investigating the occurrence of Final but in English conversation, Mulder and
Thompson (2008) found that but could be usefully seen as a connective which is in
the process of grammaticizing to a final particle.1 Moreover, their data showed that
in contemporary AusE, this process has led to Final but becoming a “fully developed”
discourse particle.
*Our thanks to the following people for valuable assistance in preparing this paper:
Laura Delrose, Caroline Thomas and Brian Fricker. In particular, we wish to thank Pam Peters
and Adam Smith for their expert and timely help in making specific sound recordings
available to us.
1. A shorter version of this paper was originally presented at the 2005 Conference of the
Australian Linguistic Society and appeared as Mulder and Thompson (2006a).
The following examples, drawn from the spoken part of ICE-AUS, illustrate the
AusE “Final Particle but”.2 In (1), Patricia, Jess and another female speaker are looking
at photos and opals from Jess’s recent visit to an opal mining area.
(1) Jess: But, if you ever go there, it’s good like, you don’t stay there for too long but.
Patricia: So [you]—
Jess: [We on]ly stayed there for three days.
Patricia: You haven’t got any other photos.
[ICE-AUS S1A-067:217-23]
In (2), four tennis players are discussing how various players respond when they get
upset on the court.
(2) Andrew: I think [~Leah] gets a bit agro,
Jill: [Yeah].
Rebecca: Does she?
Sharon: ~Leah,
Rebecca: ~Leah? Oh yea:h,
Andrew: Not, you know, this pounding [thing] like we do but.
Sharon: [Yeah].
Jill: Mmm.
[ICE-AUS S1A-031:21-5]
In this chapter we document the place of Final Particle but in AusE, where its
“final-particalization” is supported by a wealth of data. Firstly, we make a case that
it provides further evidence of the mixed origins of AusE. Secondly, we show how
prosody, turn organization, and speaker interaction in contemporary Australian
conversational data indicate that two central features of Final Particle but are that it
marks contrastive content and that it is a turn-yielding discourse particle. Thirdly,
we establish through survey data that its usage in AusE differs from that in AmE
and that but as a final particle can be seen as a distinctive feature of AusE. Lastly, we
argue that Final Particle but has social meaning and can index “Australianness”, as
evidenced in written dialogue, emails and text messages.
2. In examples (1) and (2), ‘[]’ indicates overlapped speech and ‘~’ indicates the use of
a pseudonym.
While we focus here on the usage of but as a final particle in AusE, in fact it appears
to be more widespread than this. We find citations of the “sentence-final discourse
marker but” in nonstandard NZE (Bauer 2002: 107); Falkland Islands English (Sudbury
2001: 73, 2004: 415); Hawaiian Creole (Sebba 1997: 172f); Australian Aboriginal English
(Leitner 2004: 245); South African Indian English (Mesthrie 1992: 21); and Irish English,
Scots English, and varieties of north-eastern England (Trudgill 1983: 26, 1986: 140; Beal
1993: 211). However, while Final but has been commented on anecdotally, as far
as we know, there has been scant previous research considering its prosodic and social
functions in everyday interactions.
2. The origins of Final Particle but in AusE
In this section we look at the distribution of Final Particle but in varieties of
English around the world and consider the implications for our understanding of
the origins of Final Particle but in AusE, hypothesizing that it is a result of dialect
retention, having migrated through colonialization from the British Isles to the
southern hemisphere.3
Our approach is to first locate the occurrence of this final particle in present-day
varieties of English spoken in the British Isles, noting its discourse functions and likely
presence at the time of the major emigrations to the southern hemisphere from the late
eighteenth to the early twentieth century. We then establish that there is a historical
link between these varieties and the varieties that had input in the genesis of AusE, and
consider the social setting from which Final Particle but has emerged as a nonstandard
feature of AusE.
As outlined above, ending a sentence with but has been noted in passing as a
feature of Irish English, especially that of Belfast (Horvath 1985: 39 citing Leslie
Milroy pc 1983; Trudgill 1983: 26, 1986: 140; Beal 1993: 211; Harris 1993: 176; Hickey
2007: 375); dialects of Scots English, particularly those on the west coast, midlands
and in Glasgow (Beal 1993: 211; Freegard pc 2004; Mac Donald pc 2004; Mc Donald pc
2004; Sharp pc 2004; Trudgill 1983: 26, 1986: 140; Turnbull pc 2004); and varieties
of north-eastern England, such as those of Northumberland and Tyneside (the latter,
popularly known as “Geordie”) (Trudgill 1983: 26, 1986: 140; Beal 1993: 211; Dickinson
pc 2004).4 In the brief mentions of Final but in these present-day varieties of
English, it is often depicted not only as being equivalent in meaning to standard
English though, but as “nonstandard”, being typical of “working/lower class” speech.
In Glasgow, where it seems to be commonly pronounced with a final glottal stop,
one teacher commented of her time in a junior high school in the early 1950s: “We
tried to stop them using but in that way, all the time. Our aim was not snobbish,
but a means of teaching a way of gaining successful employment after the school
years. The standard reply was: “Ah know it isne right, bu’. Ah canny help it, bu’.”
(Perriment pc 2004).
3. An earlier version of this section was presented at the 37th Poznań Linguistic Meeting,
Poland (Mulder & Thompson 2006b).
4. In contrast, Trudgill (1983: 26) found that use of but as a final particle is not understood
in southern England, which perhaps partially explains its presence as a vernacular rather than
a more standard feature of present day AusE (see below).
It appears that Final but has been around for some time in these varieties of
English. In Scots English its origin can likely be traced to a calque of terminal cia ta
in Gaidhlig (Scots Gaelic) (Mac Donald pc 2004). There has been a sustained Irish
English influence in Scotland (cf. Hickey 2007), and the occurrence of Final but in
Scots English may have also been reinforced by its presence in Irish English. In the
case of the Northumberland and Tyneside varieties of north-eastern England, there
has also been a strong Irish English influence, particularly through a major influx of
Irish immigrants from 1840 to the end of that century (Beal 1993; Hickey 2007). In
addition, these north-eastern English varieties have a common origin and continuing
close relationship with Lowland Scots.
Turning to the British colonialization of Australia, historical records point
to the mixing of settlers from differing regions of the British Isles, with a large
Irish presence, and a considerable but smaller Scottish presence (cf. Mitchell 2003;
Taylor 2003). In terms of the shape English would take as it developed in Australia,
it seems valid to assume that it would depend on the nature of the input varieties of
English and their relative weight, not only in terms of numbers of speakers but also
in terms of social prestige. In this respect it is generally agreed that AusE began
as a leveled variety with the favoured input, particularly in terms of pronunciation, being the speech of the south-eastern and London English emigrants, who
had the highest social status (cf. Hickey 2004; Kiesling 2004). While the speech
varieties of emigrants from Ireland, Scotland, and the north-east of England,
who all had lower social standings, were never a dominant influence, this is not
to say that these varieties had no input in the genesis of AusE. Rather, as Kiesling
(2004: 427) suggests for several phonological features that AusE shares with Irish
English: “These features do point to a “covert” entrance into AusE, as they are
generally from informal registers…”.
What this means for the origins of Final Particle but in AusE is that the likely
scenario is one of input from Irish English, mutually reinforced by that of Scots
English and the north-eastern English varieties. The standing association with nonstandard speech in these northern hemisphere varieties would further support a
“covert” entrance into AusE, and would also account for its status in present-day
AusE as a feature of colloquial usage (see Section 6). Such a development would
parallel that of youse in AusE which is generally considered to be of Irish English
origin (cf. Bradley 2003; Hickey 2003, 2007).
The hypothesis of dialect retention is further supported by the presence of
Final but with a similar usage in other southern hemisphere Englishes. For example,
NZE and Falkland Islands English originated from emigrants who left the British
Isles from roughly the same regions at roughly the same time.5 As the proportion
of speakers from the different input varieties was not the same in Australia, New
Zealand and the Falkland Islands, a scenario of mutual reinforcement is also supported. To illustrate, unlike Australia with its higher proportion of Irish emigrants,
the Falkland Islands had a higher proportion of Scottish emigrants, leading Sudbury
(2004: 415) to suggest that Final but, along with various other features, is a retention
from the early Scottish settlers.
In sum, while AusE may have largely been built on the backs of the south-eastern
and London English-speaking emigrants due to their higher social status, speakers
of English varieties with lower social status have also left their mark. The presence
of Final Particle but in AusE provides further evidence of the mixed sources of AusE
(Bradley 2003; Lonergan 2003; Leitner 2004).
3. Data
Our primary corpus for this chapter has been the conversational part of ICE-AUS,6
where we found many potential Final buts. After identifying these in the transcripts,
we then listened repeatedly to each one, and selected those whose prosody and
conversational context made it clear that they were functioning as Final Particle
buts, for a total of 12 examples.7 To these, we have added 15 examples drawn from other spoken sources, including the Monash University Dimensions of Australian English
Corpus, the ART corpus of Australian talkback radio,8 and various Australian film
and television productions such as “Aussie Rules”, “East of Eden”, “The Panel” and
“The Chaser’s War on Everything”. Unless otherwise noted, the examples given here
are drawn from ICE-AUS.
5. Correspondingly, Leitner (2004) attributes the presence of final but in Hawaiian Creole
to British sailors, whilst Mesthrie (1992: 21) speculates that in South African Indian English,
there is “a (very) small chance” that but as a “clause-final equivalent of though or isn’t it” derives
from the influence of missionaries who were in charge of education in the second half of the
nineteenth century, some of whom were Irish and English.
6. <http://www.ucl.ac.uk/english-usage/ice/iceaus.htm>
7. This entailed eliminating a number of potential examples because of overlapping talk.
8. <http://www.ling.mq.edu.au/shlrc/resources.htm>
4. Final Particle but in AusE
In investigating the use of Final but in both AmE and AusE conversations, Mulder and
Thompson (2008) argued for the grammaticization continuum shown in (3):
(3) Initial but [IU-initial conjunction] > Janus faced but > Final but [IU-final discourse particle]
For our purposes in this chapter, we concentrate on the Final but, of which
Mulder and Thompson (2008) found two types, here termed Final Hanging but
and Final Particle but.9 Both types of Final but have two essential features: (1) they
end an intonation unit (IU)10 and (2) they end a turn. Mulder and Thompson proposed that what differentiates the two types is that the Final Particle type has fully
developed as a discourse particle, parallel to that of though (Barth-Weingarten &
Couper-Kuhlen 2002). Furthermore, they observed that while both AmE and AusE
have Final Hanging but, only AusE has Final Particle but.
We begin our discussion with Final Hanging but, before moving in the following
section to Final Particle but, which is our focus in this chapter.
4.1 Final Hanging but
To give a feel for the use of Final Hanging but we provide several examples. This usage
occurs in both our American and Australian corpora; the first example below is from
our American data.11
9. Mulder and Thompson (2008) refer to these as “final-1 but” and “final-2 but”, respectively.
10. For ease of reference, following Chafe (1994) and Du Bois et al. (1993), we call these
prosodic units Intonation Units (IUs), fully recognizing that prosody involves much more
than ‘intonation’.
11. In example (4), we have kept the transcription system used by the original transcriber,
which follows the Jefferson system described in Atkinson and Heritage (1984). The most
noteworthy feature is that heavy stress is indicated by capital letters.
(4) 1 Steve: What is French [over the phone.]
2 Karen: [He and Didier -] give lessons over the phone.
3 Charles: French lessons.
4 (1.0)
5 Steve: Was this their own: - idea?
6 (2.8)
7 Karen: → W’l now Didier - makes his money by going to Atlantic City but
8 (1.7)
9 Charles: hhh hhh HAH HAH HAH HAH
10 (1.3)
11 Karen: ’ts inCREDible,
12 how they live
13 Charles: It IS incredible
[Thompson AmE corpus]
Here, three friends have been talking about a fourth friend, Didier, and his latest
money-making venture, French over the phone. In line 7, Karen jokingly comments
that currently, his income is coming from (gambling in) Atlantic City. This comment
ends with a Final Hanging but, where the implication left hanging is a contrasting idea
for the others to infer, perhaps something like “but who knows how long that will last”,
or “but who knows if he makes enough to live on”.
In the next example, which is from our Australian data,12 the speakers are talking
about going scuba diving as part of a one-day tour out to the Barrier Reef.
(5) 1 Tim: They give you .. some sort of certificate,
2 I’m not —
3 → I’m ^sure it’s not PADI: but,
4 Sean: Yeah.
5 Tim: @@
[S1A-003:171-75]
Tim’s utterance in line 3 again ends in a Final Hanging but, leaving a contrasting
implication hanging. A reasonable inference for the hearers to make in this context
would be that, while the certificate might not certify one as having acquired a recognized level of knowledge and skill as set out by PADI (Professional Association of
Diving Instructors), it does verify that one has done a day of scuba diving.
In each of these examples, but ends both an IU and a turn. Smoothly and with
no evidence of trouble, another participant then takes a turn. Yet there is a clear
implication left “hanging”, such that the clause ending with but is open to being
interpreted as a concession, with the claim for which it is a concession only implied
(see Couper-Kuhlen & Thompson 2000). That is, this but tells the hearer that there’s
an implication, and invites the listener to infer what it is and to continue the interaction appropriately given that implication. The prosody of the IU with but in each
of these examples accords with its function as leaving an implication “hanging”; as
indicated by the comma following but, the prosody with this Final Hanging but is not
the “final” falling prosody of clearly turn-yielding instances, including the examples
in the next section. As Mulder and Thompson (2008) show, there is strong evidence
in the data that participants routinely orient to this implication left open for listener
interpretation.
12. We have re-transcribed each of the Australian examples following the Du Bois et al. (1993)
transcription system, adapted such that within an utterance ‘:’ indicates prosodic lengthening,
and ‘=’ indicates latching. Symbols used in this paper include ‘@’ = laughter, ‘%’ = glottal stop
and ‘#word’ = uncertain hearing.
4.2 Final Particle but
Turning to the Final Particle but, our Australian data provide considerable evidence of
Final but having progressed far enough to be considered a final particle. Not only is it
uttered with final prosody, but instead of leaving an implication “hanging”, the semantically contrastive material is supplied in the IU ending with the Final Particle but.
That is, our data support the claim that in AusE but has become a fully-developed final
particle marking contrastive content.
To demonstrate these claims, let’s revisit examples (1) and (2) in greater detail:
(1) 1 Jess: But,
2 (0.3)
3 if you ever go there,
4 (0.5)
5 it’s good like,
6 (0.4)
7 → you don’t stay there for too long but.
8 (0.4)
9 Patricia: So [you]—
10 Jess: [〈F〉 We on]ly stayed 〈/F〉 there for three days.
11 Patricia: You haven’t got any other photos.
[ICE-AUS S1A-067:217-23]
In line 7 of (1), Jess is reporting that the opal mining area they went to is worth a visit,
even though one wouldn’t want to stay there for too long.
(2) 1 Andrew: I think [~Leah] gets a bit agro,
2 Jill: [Yeah].
3 (0.3)
4 Rebecca: Does she?
5 (0.4)
6 Sharon: 〈P〉 ~Leah 〈/P〉,
7 Rebecca: ~Leah?
8 Oh yea:h,
9 Andrew: Not,
10 you know,
11 → this pounding [thing] like we do but.
12 Sharon: [Yeah].
13 Jill: =Mmm.
[ICE-AUS S1A-031:21-5]
In line 1 of (2), Andrew identifies Leah as one tennis player who gets a bit aggravated
when she is upset on the court. Then in lines 9–11, he qualifies this statement by
conceding that when she gets upset she doesn’t pound on things like they do, though.
In these two examples, as compared to (4) and (5) in the previous subsection, there
is no unstated, “hanging” implication following the Final but. Rather, the semantically
contrastive material is supplied in the IU ending with the Final Particle but.13
To further support our characterization of the distinctive prosody and pragmatics
of Final Particle but and our claim that it is robust in spoken AusE, we offer a number
of additional examples.
In the first, three students are scheduling a time to hold an Asian Studies
Society meeting.
(6) 1 Tracey: I thought we would —
2 I’m going to the Jugs on Friday,
3 I wanna go to the Jugs on [Friday].
4 Ross: → [#Probly] ~Jeff ’ll wanna hit the town but.
5 (0.7)
6 James: Yeah well.
7 (0.8)
[ICE-AUS S1A-039:181-84]
In line 4, Ross ends his turn with but. Unlike the examples of Final Hanging but,
however, Ross’s turn is uttered with final prosody, and, crucially, it is taken by all participants to have been finished, as is evidenced by James’s responsive turn in line 6.
Ross’s line 4 is taken by the others to have conveyed the concession that, even though
Tracey wants to go to the Jugs on Friday, Jeff will probably want to do something else,
namely hit the town.
In the next example, the two speakers are discussing Alice’s performance in the
play “Abracadabra”.
13. See Mulder and Thompson (2008) for a discussion of the development of Final Particle
but as a retrospective clause-final contrast marker.
(7) 1 Carol: I mean,
2 you thought you were so 〈@〉 ^terrible in that 〈/@〉.
3 (0.3)
4 (H) and you ^were 〈MRC〉 so: bloody brilliant 〈/MRC〉.14
5 (0.2)
6 Alice: Yeah,
7 I know.
8 (1.2)
9 → I don’t know if I was ^brilliant but.
10 Carol: No,
11 you ^were.
[ICE-AUS S1A-022:268-9]
In line 4 of this example, Carol proclaims Alice to have been brilliant in the play. In
lines 6–7, Alice accepts the praise, but in line 9, she backs down from Carol’s “extreme
case formulation” with the concession that, even though she might have given an
adequate performance, she doesn’t know if it could be termed “brilliant”.15
As another example, consider the following extract from the ART talkback radio
corpus, with John Laws as the host. Prior to this extract, a Bureau of Meteorology
spokesperson was discussing a hailstorm that had hit Sydney the previous evening,
and the following caller had then noted that where she was, there was still hail on the
ground the next morning. Julie then follows on with this topic.
(8) 1 John: Okay?
2 Julie?
3 Julie: Morning John.
4 How are you love?
5 John: Pretty good,
6 tha〈@〉nk you〈/@〉.
7 Julie: =I’ve still got me: ^hail in Mascot here.
8 → Broke me gazebo but.
9 @@@@
10 John: Oh did it really?
11 Was it ^that heavy.
[ART COMe4 (Caller 6)]
In line 8, Julie finishes her turn with the Final Particle but, followed by laughter.
Again, her prosody is final, and her interlocutor John responds immediately with
an appreciation token oh did it really? Hail isn’t necessarily a positive event, but in a
drought-stricken region of Australia, precipitation in any form is a positive, even if,
as Julie concedes in line 8, it did break her gazebo.
14. MRC indicates a “marcato” voice quality that is slightly slower and very deliberate.
15. See Pomerantz (1986) for “extreme case formulations” and Couper-Kuhlen and
Thompson (2005) for discussions of concessions following such extreme case formulations.
In the following example, two female speakers are discussing various interactions
each had with her mother.
(9) 1 Bridget: We’re trying to convince Mum to get a cappuccino maker,
2 but she said,
3 oh.
4 Can you believe it?
5 (H) You’d be at it all the time.
6 (0.9)
7 Marie: Mmm,
8 that’s true.
9 (4.1)
10 → It’d be good to have but.
11 (0.5)
12 Bridget: Mm.
13 〈DREAMY〉 Be unreal 〈/DREAMY〉.
[ICE-AUS S1A-078:195-9]
To Bridget’s report of her mother’s reaction to having a cappuccino maker, Marie
assents with that’s true, adding the concession that, even though Bridget would be at it
all the time, such a machine would still be good to have.
In the final example we offer here, Diane is complaining about her work situation.
(10) 1 Diane: I think,
2 like ~Libby and ~Rosa are doing exactly the same job,
3 and like,
4 (1.3)
5 ~Rosa does all her stuff,
6 and I think she gets peeved,
7 ‘cause,
8 ~Libby doesn’t help sort of thing,
9 and,
10 (2.0)
11 Peter: → Don’t they have different patches but?
12 (0.5)
13 Diane: No.
14 They both work in the #zone,
15 (0.6)
16 #though.
[ICE-AUS S1A-045:27-32]
To Diane’s lament about her two co-workers Rosa and Libby, Peter asks in line 11
whether they don’t have different patches, i.e. different job responsibilities. This example
nicely illustrates the use of Final Particle but in a question; Peter is asking how it can
be that even though Rosa and Libby have different responsibilities, Rosa is still doing
the work of both of them.
In this subsection, we have examined a number of instances of Final Particle but,
illustrating its characteristic use to close a turn. In contrast to the examples of Final
Hanging but examined in Section 4.1, the Final Particle but does not leave the semantically contrastive material hanging as an implication, but instead closes a construction
which conveys the semantically contrasting content. Once again, we find that the
prosodic contours of our Final Particle but instances correlate with this interactional
“closing”; as indicated by the full stops, these examples have the final terminal falling
pitch characteristic of typical turn-yielding utterances in conversational English.
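The prosodic contrast between the two types of Final but is recorded in the transcripts by the punctuation that closes the IU: a comma for Final Hanging but and a full stop for Final Particle but. As a rough illustration only, the following Python sketch sorts candidate IUs on that basis. The function name classify_final_but is hypothetical, and treating a closing question mark as final prosody, as in example (10), is an assumption made for this sketch. A filter of this kind could at most produce candidates, since the classification reported in this chapter rested on repeated listening and on conversational context.

```python
import re

# Hypothetical helper, not part of any corpus toolkit.
# Assumption: in the transcription conventions used in this chapter, an IU that
# closes with "," has continuing prosody and one that closes with "." has final
# prosody; "?" is treated here as final as well (an assumption of this sketch).
_FINAL_BUT = re.compile(r"\bbut\s*([,.?])\s*$", re.IGNORECASE)


def classify_final_but(iu: str) -> str:
    """Label one transcribed intonation unit by how it ends."""
    match = _FINAL_BUT.search(iu)
    if match is None:
        return "no final but"
    # A comma after "but" suggests Final Hanging but; "." or "?" suggests
    # Final Particle but.
    return "hanging" if match.group(1) == "," else "particle"


examples = [
    "I'm ^sure it's not PADI: but,",           # example (5), line 3
    "you don't stay there for too long but.",  # example (1), line 7
    "Don't they have different patches but?",  # example (10), line 11
    "but she said,",                           # example (9), line 2: initial but
]
for iu in examples:
    print(classify_final_but(iu), "|", iu)
```

Punctuation is only a proxy for prosody here, so any label the sketch assigns would still need to be checked against the recording.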
4.3 Summary
Our AusE and AmE data provide us, then, with a rich array of examples of Final but.
As we have shown, while both AusE and AmE have Final Hanging but, only AusE has
Final Particle but. Interestingly, we find that this final particle usage of but is clearly
recognized as a feature of AusE, as we show in Section 5.
5. Comprehending Final but in contemporary AusE
To further investigate the place of Final but in AusE, we conducted a survey in 2007
where we explored levels of comprehension and awareness of its usage and compared
responses of AusE and AmE speakers. The AusE participants were a large group of
Victorian students in their final year of high school education and the AmE participants a comparable group of first-year university students in California. The survey
focused on investigating how these young people interpreted Final buts and whether
AusE and AmE groups differed in their interpretations.
The task involved listening to three speech segments ending in a Final but: two
taken from the Monash University Dimensions of Australian English Corpus and one
from the Australian feature film, “Aussie Rules”.
Here are transcriptions of the sound files that the participants heard:
(11) 1 John: Is your —
2 ... in y- in your family like are the jobs sha:red around fairly?
3 Chris: Yep.
4 They [are] now any[2way].
5 Wallace: [Yep].
6 Daniel: [2u:m],
7 → ... my mum doesn’t think so — %but,
[MECG4M-B 4]
In this example, Chris, Wallace and Daniel each answer John’s question in line 2, with
Daniel ending his response in line 7 with a Final but. This turn, however, could have
two interpretations, as an instance of Final Hanging but (“my mum doesn’t think so,
but they are”), or Final Particle but (“they are, but my mum doesn’t think so”).
In (12), a football coach is ending a practice session:
(12) 1 Coach: That’ll do it,
2 lads.
3 → Good work but.
(Aussie Rules)
At the arrow the coach finishes an IU with but and a final contour. Note that, unlike
in the previous example, there is no implication of semantically contrastive material
left “hanging” by the Final but. Rather, the coach conveys that training is over for
the day, but he is pleased with the effort by the team members – a Final Particle but
interpretation.
In (13), Diana has just made some strange noises:
(13) 1 Kylie: You sounded fun〈@〉ny〈/@〉 (H).
2 Diana: I know.
3 → Sounded like an alright person but.
[MEP1F-B 21]
As in (12), the arrowed IU, which ends with but and a final contour, has a Final Particle
interpretation. In her IU in line 2, Diana agrees with Kylie, and then in the following
IU introduces the contrastive material in the form of an assessment about herself that,
in spite of sounding funny, she still sounded like an alright person.
Both groups of participants were already gathered in similar university lecture
theatres for educational purposes: the Victorian students as part of a one-day enrichment program at the University of Melbourne, and the Americans for a lecture in
their first-year linguistics subject at the University of California. Participants provided
some basic personal information which allowed us to include responses only from the
participants aged 17–20 years old who completed all their primary schooling in the
country of survey. These parameters were set to make the groups maximally comparable and to eliminate participants who were possibly speakers of other varieties of
English. This left the AusE group with 319 participants (aged 17–19) and the AmE
group with 88 participants (aged 18–20).16
16. The instrument was slightly altered to seem less foreign to the American students; their
response form had mom not mum written on it and footy was changed to soccer. All American
participants were likely to have had some exposure to AusE accented speech as there was an
Australian student in the class (this student’s data was not included in the analysis).
Each sound file was played twice over the installed sound system when the
participants seemed ready, as judged by the researchers present. On hearing each
example, participants indicated their comprehension of but by selecting one of the
interpretations under the prompt “What do you think Speaker X meant in her/his
final utterance?” For each item, option (a) provided a Final Hanging but interpretation, (b) a Final Particle but interpretation, and (c) “either (a) or (b)”. The other
options were (d) “I don’t know/It’s nonsense” or (e) “other” with a space for participants
to write their own interpretation.
“Other” responses accounted for between 1.5% and 18.2% of answers. In our
analysis we recategorized many of the restatements supplied in (e) by putting them
together with either (a) the Hanging but or (b) the Final Particle but responses in
cases where the responders’ interpretations clearly matched one of these two categories. Additionally, we placed responses that indicated the participant had misheard the
Final but into a separate category to preserve these potentially revealing responses.
This left only five instances of genuine “other” responses, all for item (13).
In comparing the American and Australian participants’ responses, clear
differences in interpretations are evident. Firstly, in the responses to (11) there
is a statistically significant difference in whether the but was selected by the two
groups as (a) Hanging but, (b) Particle but or (c) Hanging/Particle but (p < 0.0001,
chi-square = 15.550, df = 2).
As is summarized in Table 1, in both groups the majority of participants selected
the Hanging but interpretation (a), but this is much higher in the AusE group, with
smaller percentages selecting the Final Particle but (b) and either Hanging but or Final
Particle but (c) interpretations. Overall, comprehension of the Final but by both groups
seems quite high as evidenced by the very low number of “I don’t know/It’s nonsense”
or “misheard” responses.
Table 1. AmE and AusE responses (frequencies and percentages) for (11)

                              AmE                   AusE
                              Frequency  Percent    Frequency  Percent
Hanging but                   47         53.4       241        75.5
Particle but                  17         19.3       26         8.2
Hanging/Particle              21         23.9       51         16.0
I don’t know/It’s nonsense    2          2.3        1          0.3
Misheard                      1          1.1        0          0.0
Total                         88         100.0      319        100.0
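As a check on the test statistic reported for (11), the chi-square value can be recomputed from the frequencies in Table 1. The following is a minimal sketch, assuming a standard Pearson chi-square test of independence applied only to the three Final but categories (a) to (c), that is, leaving out the “I don’t know/It’s nonsense” and “Misheard” rows. The helper name `pearson_chi_square` is illustrative and not part of the original analysis.

```python
def pearson_chi_square(table):
    """Return (chi-square statistic, degrees of freedom) for a contingency table.

    `table` is a list of rows, each a list of observed frequencies.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if group and interpretation were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected

    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


# Frequencies from Table 1: Hanging, Particle, Hanging/Particle.
item_11 = [
    [47, 17, 21],    # AmE
    [241, 26, 51],   # AusE
]

statistic, df = pearson_chi_square(item_11)
print(f"chi-square = {statistic:.3f}, df = {df}")
```

Under these assumptions the statistic should come out at approximately 15.55 with df = 2, in line with the values reported for (11).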
The results for (12) show that similar percentages of AmE and AusE participants
(50% and 48.4%, respectively) selected a Final Particle interpretation, as is displayed
in Table 2:
Table 2. AmE and AusE responses (frequencies and percentages) for (12)

                              AmE                   AusE
                              Frequency  Percent    Frequency  Percent
Hanging but                   16         18.2       79         24.8
Particle but                  44         50.0       154        48.4
Hanging/Particle              15         17.0       36         11.3
I don’t know/It’s nonsense    2          2.3        36         11.3
Misheard                      11         12.5       13         4.1
Total                         88         100.0      318        100.0
The two groups still show, however, statistically significant differences in their
understanding of the Final but (p = 0.002, chi-square = 17.41, df = 4). This is where
the misheard category is revealing. A higher percentage of the AusE group selected the
“I don’t know/It’s nonsense” category, but the “other” responses (in (e)) revealed that
12.5% of the Americans, compared to 4.1% of the Australians, seem to have misheard the
utterance, not perceiving the Final but at all. Also of interest were various AusE “other”
responses, which we recategorized as Final Particle but interpretations. Amongst these
were a number which restated the utterance with a though. Such interpretations were
absent in the American data, which provides strong support for the claim that AmE lags
behind AusE in the “particalization” of Final but.
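To see which response categories drive the difference for (12), the chi-square statistic can be broken down by category. The sketch below assumes a standard Pearson chi-square over all five rows of Table 2 and sums each category's contribution across the two groups; the names `contributions_by_category` and `item_12` are illustrative only. Under these assumptions the total should come out close to the reported 17.41 (small discrepancies reflect rounding), with the “Misheard” and “I don’t know/It’s nonsense” categories contributing most.

```python
def contributions_by_category(categories, group_a, group_b):
    """Return {category: contribution to the Pearson chi-square} for two groups."""
    total_a, total_b = sum(group_a), sum(group_b)
    grand_total = total_a + total_b
    result = {}
    for name, obs_a, obs_b in zip(categories, group_a, group_b):
        column_total = obs_a + obs_b
        # Expected counts if both groups responded in the same proportions.
        exp_a = total_a * column_total / grand_total
        exp_b = total_b * column_total / grand_total
        result[name] = (obs_a - exp_a) ** 2 / exp_a + (obs_b - exp_b) ** 2 / exp_b
    return result


categories = ["Hanging but", "Particle but", "Hanging/Particle",
              "I don't know/It's nonsense", "Misheard"]
ame_12 = [16, 44, 15, 2, 11]      # AmE frequencies from Table 2
ause_12 = [79, 154, 36, 36, 13]   # AusE frequencies from Table 2

item_12 = contributions_by_category(categories, ame_12, ause_12)
for name, value in sorted(item_12.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<28} {value:6.2f}")
print(f"{'Total':<28} {sum(item_12.values()):6.2f}")
```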
With the final item, (13), we find large differences in comprehension, although
here the results are not statistically significant:
Table 3. AmE and AusE responses (frequencies and percentages) for (13)

                              AmE                   AusE
                              Frequency  Percent    Frequency  Percent
Hanging but                   24         27.3       93         29.4
Particle but                  33         37.5       184        58.2
Hanging/Particle              4          4.5        24         7.6
I don’t know/It’s nonsense    24         27.3       13         4.1
Other                         3          3.4        2          0.6
Total                         88         100.0      316        100.0
As shown in Table 3, the AmE participants are fairly evenly divided between a
Hanging, Final Particle and “I don’t know/It’s nonsense” response, whilst the majority
of AusE participants selected the Final Particle interpretation with only a very small
percentage responding “I don’t know/It’s nonsense”.
Looking across the responses for the three items, we see that, while the most
common response in both the AmE and AusE groups was to interpret (11) as a Hanging but,
and (12) and (13) as a Final Particle but, the AmE group showed much less internal
agreement, with, for example, 30.7% of the AmE participants (as compared to 4.7%
of the AusE participants) not even identifying the final IU in (13) as containing some
type of Final but (i.e. as either (a) Hanging but, (b) Particle but or (c) Hanging/Particle
but). Combined with the fact that only the AusE participants substituted though for
but in their “other” responses, this leads us to conclude that AmE and AusE speakers
comprehend Final but in quite different ways. These results provide further evidence
of Final but differing in the two varieties of English, with AusE being further along the
grammaticization continuum, as argued by Mulder and Thompson (2008).
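The 30.7% and 4.7% figures cited above can be recovered from Table 3 by adding the “I don’t know/It’s nonsense” and “Other” frequencies for each group and dividing by the group total. A minimal sketch of that arithmetic follows; the function name `share_without_final_but` is illustrative.

```python
def share_without_final_but(dont_know, other, total):
    """Percentage of responses that are neither (a), (b) nor (c)."""
    return 100 * (dont_know + other) / total


# Frequencies for item (13) from Table 3.
ame_share = share_without_final_but(dont_know=24, other=3, total=88)
ause_share = share_without_final_but(dont_know=13, other=2, total=316)

print(f"AmE:  {ame_share:.1f}%")
print(f"AusE: {ause_share:.1f}%")
```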
Turning to just the AusE group, since these participants interpreted one example
as Hanging but (75.5%) and the other two as Final Particle but (48.4% and 58.2%), we
can also conclude that AusE speakers orient to the distinction between Final Hanging
but and Final Particle but as laid out in Sections 4.1 and 4.2. This is further supported
by considering individual speakers’ responses across the three items. There were 41
different patterns produced by the 319 Australian participants. The analysis presented
above, that is, (11) as a Hanging but and (12) and (13) as Final Particle buts, was the
mode for the AusE group, accounting for 23.2% of participants.17 It is also worth
noting that the AusE participants who selected this pattern are a diverse group: these
74 students are fairly evenly distributed across the Catholic, state, and independent
school systems, and come from different parts of Melbourne, as well as country Victoria. This indicates that a variety of AusE speakers hear and are familiar with these
two different types of Final buts.
17. As with the aggregate results discussed above, the AmE group clearly had lower levels of
consensus than the AusE group, with agreement by individuals peaking at 8% and 37 patterns
for the 88 participants, close to the total number of patterns for the much larger AusE group.
Additionally, only 5.7% of AmE participants displayed comprehension in line with the
analysis presented here. The difference between the two groups in selecting or not selecting
this combination was significant (p < 0.0001, chi-square = 12.737, df = 1).
Taken together, these findings underscore the place of Final Particle but in
spoken AusE. What’s more, while it is often claimed that AusE has few distinctive
grammatical features, in particular as distinct from BrE and AmE (cf. Lass 1987;
Newbrook 2001), the presence of but as a final discourse particle in AusE argues that
there are indeed distinguishing features, some of which may be shared with other
antipodean Englishes.
6. Social meaning of Final but in contemporary AusE
Along with the examples from ICE-AUS and other corpora of spoken AusE, we have
collected instances of Final Particle but usage from more informal, more conversational written styles such as internet forums, emails and text messages. One source of
these has been personal communications with the authors:
(14) Hey there cara. . Got in yesterday. . Pretty all over the shop at the mo!
Had excellent time but. . ! How are you goin? Good i hope!
(Text Message (21 March 2007 10:17 pm))
In this text message, the sender, who has just returned from an extended overseas trip,
is letting Cara know that he had an excellent time even though at the moment he is
pretty all over the shop, that is, “not thinking clearly” presumably because of jetlag. The
use of Final Particle but in informal contexts such as these may also relate to the use of
nonstandard language to show solidarity (Pawley 2004: 614).
While the use of Final Particle but in emails and text messages shows that it is part
of everyday AusE, the infrequent occurrence of syntactic forms and discourse particles
like but, as opposed to, for example, phonological forms such as vowel alternations, can
pose difficulties in applying variationist analysis to assess pervasiveness and sociolinguistic patterning (Cheshire 2005). That is, Final Particle but does not have a form it
alternates with, as do phonological variants. Nor is its occurrence obligatory, in the way
that a morphosyntactic feature such as tense or number marking is either present or
not present. For these reasons, to further investigate the sociolinguistic status of Final
Particle but, we conducted the survey presented in the previous section and, as reported
in Mulder and Thompson (2008), we have also explored sources of self-reporting or
folklinguistic comment such as the Macquarie Dictionary/Australian Broadcasting
Corporation’s “Australian Word Map”.18
Another fruitful source of data has been the representation of spoken conversation
in contemporary Australian fiction, where instances of Final Particle but seem to
draw on associations of nonstandard language and “Australianness”.19 As Leitner
(2004: 250) suggests, in the absence of well-recognized and marked regional varieties,
as there are in Britain, for example, nonstandard language can be employed to create
“localness” in the penning of Australian literary characters.
18. <http://macquariedictionary.com.au/wordmap>
19. Indeed, it was the occurrence of Final but in contemporary Australian fiction that first
drew our attention to this phenomenon.
Overall, our collection of 40 instances of Final Particle but in fictional dialogue
by a range of authors supports the observation that it is viewed as distinctively
Australian, although perhaps associated with particular styles, situations or speakers.
While Final Particle but is found in the dialogue of both male and female, and both
urban and rural characters, it is usually found in contexts involving cultural stereotypes;
in her Phryne Fisher novels set in Melbourne in the 1920s, author Kerry Greenwood
uses dialogue to help portray Bert and Cec, two ex-wharfies, as “very” Australian “salt
of the earth types”:
(15) “Strewth,”20 Bert declared after two fruitless hours. “What have you got, mate?”
“Not much,” said Cec. “Well, something. Not many people live around here.”
“Lotta dogs, but,” said Bert, who had been bailed up in two different yards by
hounds which Mr Baskerville might have considered overdrawn.
(Raisins & Almonds (2002 [1997]: 244))
Analogous to the use of Final Particle but observed in AusE conversation in Section 4.2,
Bert concedes in his reply to Cec that even though there were not many people, there
were a lot of dogs. Here the use of an apparent Final Particle but can be seen as standing alongside quintessential Australian words such as strewth and mate in drawing on a
particularly stereotypical Australian identity.
The next example is drawn from one of Shane Maloney’s Murray Whelan novels,
which are set in Melbourne from 1984 through 1998.
(16) “WOW.” Holly pummelled the air, buzzing from the action. “Did you see that?
Did you see that?”… “That’s the first time I’ve ever done anything like that
outside the ring.” Holly tore off her Nike cap and shook her hair loose. “I didn’t
think I still had it.”… “What was that back there?” I said. “Karate? Tae kwon
do?” “Kickboxing. I used to be the northern region Under-17 champion.
I’m a bit out of practice, but.” She glowed with false modesty, pleased as
Punch. Or Judy, I supposed. “Wow.” Now I was saying it. “You saved my life.”
(Nice Try (1998: 166–7))
In Holly’s answer to Murray, she first asserts that she was an Under-17 kickboxing
champion, then backs down from this “extreme case formulation” by conceding that
she is out of practice, though. Holly’s use of an apparent Final Particle but, along with
her use of wow, which Whelan, who is in his late 30s, comments on, can be taken as
serving to characterize her as a young AusE speaker.
20. strewth: an interjection derived from “God’s truth”, that expresses surprise or verification.
What we are suggesting here is that the recognition and utilization of Final
Particle buts in fictional dialogue can be taken as evidence that this usage has social
meaning, akin to that of a sociolinguistic marker (Labov 1972). Perhaps a more
accurate description, however, given the differences in these and Labov’s (1972)
data, is that Final Particle but has what Silverstein (2003) calls “n + 1st order indexicality”. Linguistic features with this sort of indexicality can do “social work” as
ideologies become attached to them, imbuing social meaning (Johnstone, Andrus
and Danielson 2006: 83). That is, what the form may be associated with in terms of
social categories such as gender and social class, its first level (nth) of indexicality,
can be drawn on for creative and stylistic purposes.
As the examples given here illustrate, writers seem to find Final Particle but
useful as a shorthand way of saying something about a character. Although present not only in AusE, Final Particle but appears to index “Australianness” through
the links between nonstandard language and essentialist representations of national
identity. The fact that it is the focus of folklinguistic comment also supports the idea
that it has more than first order indexicality – that is, that it has social meaning in
contemporary AusE.
7. Conclusion
In conclusion, we have documented the origins and current place of Final Particle
but in AusE. We have suggested that it is a turn-yielding particle that marks contrastive content in the utterance it closes. We have documented how it provides
further evidence of the mixed origins of AusE, and we have shown that its usage in
AusE is distinct from that in AmE. Finally, we have brought evidence in favour of
our contention that Final but can be seen as a distinctive feature of AusE and can be
used to index “Australianness”.
References
Atkinson, J. Maxwell & John Heritage (eds). 1984. Structures of Social Action: Studies in Conversation Analysis. Cambridge: Cambridge University Press.
Barth-Weingarten, Dagmar & Elizabeth Couper-Kuhlen. 2002. “On the development of final
though: A case of grammaticalization?” In Ilse Wischer & Gabriele Diewald (eds), New
Reflections on Grammaticalization. Amsterdam: John Benjamins, 345–61.
Bauer, Laurie. 2002. An Introduction to International Varieties of English. Edinburgh: Edinburgh
University Press.
Beal, Joan. 1993. “The grammar of Tyneside and Northumbrian English”. In James Milroy &
Lesley Milroy (eds), Real English: The Grammar of English Dialects in the British Isles.
London: Longman, 187–213.
Bradley, David. 2003. “Mixed sources of Australian English”. Australian Journal of Linguistics
23(2): 143–50.
Chafe, Wallace. 1994. Discourse, Consciousness, and Time: The Flow and Displacement of
Conscious Experience in Speaking and Writing. Chicago IL: University of Chicago Press.
Cheshire, Jenny. 2005. “Syntactic variation and beyond: Gender and social class variation in the
use of discourse-new markers”. Journal of Sociolinguistics 9(4): 479–508.
Couper-Kuhlen, Elizabeth & Sandra A. Thompson. 2000. “Concessive patterns in conversation”. In
Elizabeth Couper-Kuhlen & Bernd Kortmann (eds), Cause, Condition, Concession, and Contrast: Cognitive and discourse perspectives. Berlin: Mouton de Gruyter, 381–410.
Couper-Kuhlen, Elizabeth & Sandra A. Thompson. 2005. “A linguistic practice for retracting
overstatements: Concessive repair”. In Auli Hakulinen & Margret Selting (eds), Syntax and
Lexis in Conversation. Amsterdam: John Benjamins, 257–88.
Du Bois, John, Stephan Schuetze-Coburn, Danae Paolino & Susanna Cumming. 1993. “Outline of
discourse transcription”. In Jane A. Edwards & Martin D. Lampert (eds), Talking Data: Transcription and Coding Methods for Language Research. Hillsdale NJ: Lawrence Erlbaum, 45–89.
Greenwood, Kerry. 2002 [1997]. Raisins and Almonds. Crows Nest NSW: Allen and Unwin.
Harris, John. 1993. “The grammar of Irish English”. In James Milroy & Lesley Milroy (eds), Real
English: The Grammar of English Dialects in the British Isles. London: Longman, 139–86.
Hickey, Raymond. 2003. “Rectifying a standard deficiency: Pronominal distinctions in varieties
of English”. In Irma Taavitsainen & Andreas H. Jucker (eds), Diachronic Perspectives on
Address Term Systems. Amsterdam: John Benjamins, 343–74.
Hickey, Raymond (ed.). 2004. Legacies of Colonial English: Its Origins and Evolution. Cambridge:
Cambridge University Press.
Hickey, Raymond. 2007. Irish English: History and Present-day Forms. Cambridge: Cambridge
University Press.
Horvath, Barbara M. 1985. Variation in Australian English: The Sociolects of Sydney. Cambridge:
Cambridge University Press.
Johnstone, Barbara, Jennifer Andrus & Andrew E. Danielson. 2006. “Mobility, indexicality, and
the enregisterment of ‘Pittsburghese’ ”. Journal of English Linguistics 34(2): 77–104.
Kiesling, Scott F. 2004. “English input to Australia”. In Hickey (ed.): 418–39.
Labov, William. 1972. Sociolinguistic Patterns. Philadelphia PA: University of Pennsylvania Press.
Lass, Roger. 1987. The Shape of English. London: Dent and Sons.
Leitner, Gerhard. 2004. Australia’s Many Voices: Australian English – the National Language.
Berlin: Mouton de Gruyter.
Lonergan, Dymphna. 2003. “An Irish-centric view of Australian English”. Australian Journal of
Linguistics 23(2): 151–9.
Maloney, Shane. 1998. Nice Try. Melbourne: Text Publishing.
Mesthrie, Rajend. 1992. English in Language Shift: The History, Structure and Sociolinguistics of
South African Indian English. Cambridge: Cambridge University Press.
Mitchell, Alexander G. 2003. “The story of Australian English: Users and environment”. Australian
Journal of Linguistics 23(2): 111–28.
Mulder, Jean & Sandra A. Thompson. 2006a. “The grammaticization of but as a final particle in
English conversation”. In Keith Allan (ed.), Selected Papers from the 2005 Conference of the
Australian Linguistic Society. <http://www.als.asn.au>, 1–18.
Mulder, Jean & Sandra A. Thompson. 2006b. “The grammaticization of but as a final particle
in English conversation”. Paper presented at the 37th Poznań Linguistic Meeting, Poznań,
Poland.
Mulder, Jean & Sandra A. Thompson. 2008. “The grammaticization of but as a final particle in
English conversation”. In Ritva Laury (ed.), Crosslinguistic Studies of Clause Combining: The
Multifunctionality of Conjunctions. Amsterdam: John Benjamins, 179–204.
Newbrook, Mark. 2001. “Syntactic features and norms in Australian English”. In David Blair &
Peter Collins (eds), English in Australia. Amsterdam; Philadelphia PA: John Benjamins,
113–32.
Pawley, Andrew. 2004. “Australian Vernacular English: Some grammatical characteristics”. In
Bernd Kortmann, Kate Burridge, Rajend Mesthrie, Edgar W. Schneider & Clive Upton (eds),
A Handbook of Varieties of English: A Multimedia Reference Tool. Vol. 2, Morphology and
Syntax. Berlin: Mouton de Gruyter, 611–42.
Pomerantz, Anita M. 1986. “Extreme case formulations: A way of legitimizing claims”. Human
Studies 9: 219–30.
Sebba, Mark. 1997. Contact Languages: Pidgins and Creoles. Basingstoke: Macmillan Press.
Silverstein, Michael. 2003. “Indexical order and the dialectics of sociolinguistic life”. Language
and Communication 23: 193–229.
Sudbury, Audrey. 2001. “Falkland Islands English: A southern hemisphere variety?” English
World-Wide 22(1): 55–80.
Sudbury, Audrey. 2004. “English on the Falkland Islands”. In Hickey (ed.): 402–17.
Taylor, Brian. 2003. “Englishes in Sydney around 1850”. Australian Journal of Linguistics
23(2): 161–83.
Trudgill, Peter. 1983. On Dialect: Social and Geographical Perspectives. Oxford: Basil Blackwell.
Trudgill, Peter. 1986. Dialects in Contact. Oxford: Basil Blackwell.
Swearing
Keith Allan & Kate Burridge
Monash University
In this chapter, we provide an account of antipodean swearing patterns, drawing
on examples from existing written and spoken data banks. As part of this
investigation, we consider general questions to do with swearing: what it is, why
speakers do it and how swearing patterns have changed over the years. We identify
four overlapping functions of swearing: the expletive, abusive, social and stylistic
functions. We also consider the shift in social attitudes toward swearing and the
repercussions of this for the law. Swearing has always been characterized as an
earmark of Australian and New Zealand English. We conclude that it remains an
important feature of these varieties, but question just how uniquely antipodean it is.
1. Introduction
This chapter focuses on a particularly rich area of creativity engaged in by ordinary
AusE and NZE speakers in the use of swearing and insult – so-called “bad” language,
as described in, for example, Allan (1992a, b); Allan and Burridge (1991, 2006);
Andersson and Trudgill (1990); Dabke (1977); Taylor (1976). Australians, in particular, have always regarded their colloquial idiom as being a significant part of their
cultural identity. The standard language is more global in nature and many AusE
speakers see their colloquialisms, nicknames, diminutives, swearing, and insults to be
important indicators of their Australianness and expressions of cherished ideals such
as friendliness, nonchalance, mateship, egalitarianism, and anti-authoritarianism
(Lalor & Rendle-Short 2007, Seal 1999, Stollznow 2004, Wierzbicka 1992). Australian
attachment to the vernacular can be traced back to the earliest settlements of English
speakers during the eighteenth and nineteenth centuries. The language of convicts and
free settlers alike was largely derived from the slang and dialect vocabularies of Britain.
The “vulgar” language of London and the industrial Midlands, the cant of convicts, the
slang of seamen, whalers, and gold-diggers contributed significantly to the linguistic
melting pot in those early years. As Edward Wakefield wrote in his Letter to Sydney
in 1829:
Bearing in mind that our lowest class brought with it a peculiar language, and is
constantly supplied with fresh corruption, you will understand why pure English
is not, and is not likely to become, the language of the colony. (Ramson 1966: 47).
At that time colonial colloquialisms were an important way of fitting in and avoiding
the label “stranger” or “new chum” in Australia (Gunn 1970: 51). This holds also for
New Zealand. Turner (1966: 114) describes how a character in Alexander Bathgate’s
novel Waitaruna (Bathgate 1881) justifies his use of such colloquialisms: “No use letting every one know you are a new chum”. The cant of the underworld (so-called “flash
language”) flourished in those early days and, as the various corpora of modern AusE
and NZE attest, colloquialisms and “bad” language have remained an important part
of the antipodean idiom.
The examples in this chapter are drawn from a wide range of sources including the
internet, creative writing, spontaneous public speech, and private conversation. The
usual linguistic corpora (especially those consisting of written texts, such as ACE1 and
WWC) are not always fruitful when it comes to yielding examples of foul language.
However some examples come from corpus samples of fiction, and rather more from
informal speech within ICE-AUS, ICE-NZ and WSC, as well as ART, the talkback
radio corpus drawn from commercial radio stations and the Australia-wide ABC. We
take other examples from the English-language social networking website Myspace.com
(Bugeja pc 20082), using data from AusE speakers.
2. What is swearing?
Who ever stubbed his toe in the dark and cried out, “Oh, faeces!”? (Adams
1985: 45)
Swearing is the strongly emotive use of taboo terms in insults, epithets, and expletives.
In Modern English, only certain terms can function as swearwords. For instance, learned
words for sexual organs and effluvia generally do not (e.g. *You faeces! *Urine off!) and
nor do certain mild obscenities and nursery terms (e.g. *You willie! *Wee-wee on you!).
1. This and the other corpora listed were accessed via <http://www.ling.mq.edu.au/shlrc/resources.htm>.
All URLs referred to in this chapter were accessed in June 2008. To make
reading easier, the examples we quote from corpora do not stick rigidly to their transcription
conventions – which in any case vary from corpus to corpus.
2. Brendan Bugeja MS. “Teenagers, Myspace and language”.
The original meaning of the verb swear, based on entries in the Oxford English
Dictionary (1989), was “to take an oath; make a solemn declaration, statement, affirmation, promise or undertaking; often in the eyes of God or in relation to some sacred
object so that the swearer is, by implication, put in grave danger if found to be lying”,
e.g. I swear by Almighty God to tell the truth, the whole truth, and nothing but the truth.
So help me God. The noun oath “an act of swearing” is the nominal counterpart of
the verb swear. These meanings still obtain alongside those derived from it of profane
swearing and profane oaths. At first these would have been statements made with
profane reference to the deity; they have been around at least since the Middle Ages
and probably much longer.3 The extension of profanity from irreligious language to
incorporate obscene language took swearing and (to a lesser extent) oaths with it. The
dysphemistic (offensive) senses of swear and oath became dominant in unmarked contexts; a result aided by the fact that situations favourable to the attestation (I swear by
Almighty God …) are infrequent compared to the number conducive to profane swearing and profane oaths. Profane swearing, like slang, is restricted to colloquial styles
(which is not to say it never occurs in formal situations, see (7) below). It includes
religion-based profanity and blasphemy (i.e. irreligious language), as well as a wealth
of obscenities taken from the pool of “dirty words”. To swear at someone or something
is to insult and deprecate the object of abuse.
Used when a higher style is expected, taboo terms – whether as insults, epithets,
expletives or even descriptives – are likely to cause offence. They may also be specifically used to offend, but in both cases they reflect discredit on the speaker. It is not
only the style of usage, but also the relative status of the interlocutors that affects
the perceptions of profane swearing. Relative status derives from two sources: the
relative power of the interlocutors, and the social distance between them. The relative power is defined by social factors which obtain in the situation of utterance: e.g.
the relative power of a physician and a policeman is not given for every occasion, it
depends on where they encounter one another: imagine how it will differ depending
on whether the policeman is requiring a medical consultation at the doctor’s office, or
the doctor has been stopped for alleged dangerous driving. Social distance between
interlocutors is determined by such parameters as their comparative ages, genders,
and socio-cultural backgrounds. Swearing at someone of lower status is possible
without loss of status; though it is generally assumed to demean the person swearing
and can in principle be legally actionable. Swearing at someone of higher status is
more likely to lead the target to take umbrage and pursue sanctions against the low
status offender.
3. There is a reference to them in the ‘First Grammatical Treatise’ written in Icelandic around
1135, cf. Haugen (1972).
The dysphemistic connotations of swearing have led to its being associated with
cursing “imprecating malevolent fate”. Although curses can hardly be literally profane,
the term Curses! has certainly been used lightly as a disguised expletive (a euphemistic
dysphemism4) for several centuries.5 Hence we find in Matthew 26: 74 “Then began
he [St Peter] to curse and to swear”. Interestingly, the colloquial form of curse, cuss,6 is
often used in cussing and swearing. The term cuss word is found from the nineteenth
century as synonymous with swear word.
3. Why do we swear?
Children of both sexes use swearwords from as young as one year old and the practice
continues into old age – even when other critical linguistic abilities have been lost.
People with certain kinds of dementia and/or aphasia can curse profusely, producing
what sound like exclamatory interjections as an emotional reaction. However, when
called upon to repeat the performance, they are unable to do so because they have lost
the capacity to construct ordinary language. The fact that dirty words, abusive words,
and slurs pour forth in these particular mental disorders is only possible because they
are stored separately (or at least accessed differently) from other language.7
As we will see later, the language used varies across time; it also varies between
genders. According to Timothy Jay, AmE-speaking males swear about three times
. We refer to orthophemisms (straight-talking), dysphemisms (offensive language), and
euphemisms (sweet-talking). Orthophemisms and euphemisms are words or phrases used
as an alternative to a dispreferred expression. They avoid possible loss of face by the speaker
and also the hearer or some third party. An orthophemism is typically more formal and more
direct (or literal) than the corresponding euphemism. A euphemism is a word or phrase
used as an alternative to a dispreferred expression. It avoids possible loss of face: either the
speaker’s own positive face or, through giving offence, the negative face of the hearer or
some third party. A euphemism is typically more colloquial and figurative (or indirect) than
the corresponding orthophemism. A dysphemism is a word or phrase with connotations
that are offensive either about the denotatum and/or to people addressed or overhearing the
utterance. As examples of these different X-phemisms compare orthophemism faeces,
euphemism poo, dysphemism shit.
. For example: Seagoon: “Wait. (Raspberry) Curses, the spirit has gone. It must have
been only 70% proof.” (Spike Milligan script “The Internal Mountain” for a Goon Show first
broadcast March 29, 1954).
. Cuss~curse is just one pair of many synonymous doublets in which the colloquial variant
has a short lax vowel and the standard form a long tense vowel. Others are: ass~arse bin~been
bubby~baby bust~burst crick~creek critter~creature gal~girl hoss~horse hussy~housewife
puss~purse sassy~saucy tit~teat.
7. Jay (2000) offers a comprehensive account of the mental disorders associated with
coprolalia and other coprophenomena.
Swearing 
more frequently than females and they use “stronger” obscenities, e.g. among 8–12 year
olds “males used words such as shit, fuck and damn, while females used words such as
god or euphemisms darn it and shucks” (Jay 1992: 60–70). Among adults “[b]oth male
and female speakers are more likely to swear in the company of same sex companions” (Jay 1992: 123). Other studies such as Stapleton (2003), Murray (1995), Johnson
(1991) support Jay’s findings. Indeed, numerous surveys and studies leave no doubt
that in nearly all societies, if not all, males swear more and use more obscene language
than females – Australia and New Zealand are no exception. When Alcock (1999)
surveyed 242 Australian university students from the Melbourne area, she found that
male speakers reported refraining from swearing in the company of females, while
females appeared more reticent about swearing in front of authority figures and family
than before male friends and peers. However, times do change and what “nice” girls
say today, and what they used to say or not say, is very different. A study by Bayard
and Krishnayya (2001) of New Zealand university students’ use of expletives found
a general tendency for males to swear slightly more frequently than females, but it
also reported that there was very little difference in the strength of expletives used by
women and men. We need more research on the swearing patterns of female versus
male speakers in the twenty-first century, and especially in the New Zealand and
Australian context where there has been very little work to date on actual or even
reported usage.
We can identify at least four functions for swearing which often overlap: the expletive function, abuse and insult, expression of social solidarity, and stylistic choice – the
marking of attitude to what is said. We take them in turn.
3.1 The expletive function
Old Lady: I shouldn’t cry if I were you, little man.
Little Boy: Must do sumping; I bean’t old enough to swear.
(Punch cartoon April 2, 1913)
Most cussing is an emotive reaction to frustration, to something unexpected (and usually,
but not necessarily, undesirable), or to anger. This is the expletive function of swearing –
the use of a swear word to let off steam: imagine hitting yourself with a hammer or
being cut off in traffic. Expletives are kinds of exclamatory interjection, and, like other
interjections, they have an expressive function; cf. Wow!, Ouch!, Oh dear!, Gosh!, Shit!
(1) Welfare, my arsehole. [ACE F10:1953]
(2) “Clouding over my arse,” says Ruth. [WWC K20:055]
(3) Oh bugger I should’ve got the lunch bucket. [WSC DPC306:0430]
(4) Well, bollocks to that. [ACE N01:114]
 Keith Allan & Kate Burridge
(5) It’s my bloody birthday goddamn it. [ART ABCnat7]
(6) Oh damn it’s you see I turned I thought I turned that one on.
[ICE-AUS S1A-058:306]
(7) I ran off because it’s something like you know eeh eeh eeh eh eh eeh eeh eeh
eh suddenly this string just went Boom I don’t know bang and I just went Fuck
and ran off the stage. [ICE-AUS S1A-019:143]
(8) I said FUCK we’ve only got half a bloody house. [WSC DPC066:1275]
(9) Oh shit I’m getting lost. [ICE-NZ S1A-033:85]
(10) Shit has that tiger picture gone mouldy just from sitting there?
[ICE-NZ S1A-056:84]
Unlike typical expressives such as greetings or apologies, interjections (including
expletives) such as these are not normally addressed to the hearer. At best hearers are
treated like ratified participants, and at worst as overhearers (bystanders) and therefore not, strictly speaking, addressees. Instances of expletives, and other interjections
uttered without an audience, are expressions of auto-catharsis, a release of extreme
emotional energy.8 Even where they are used with an audience of ratified participants
or bystanders, they are concomitantly displays of auto-catharsis: i.e. the illocutionary
intention is to display a particular attitude or degree of feeling to oneself and anyone
who happens to be in earshot.
Since taboo terms make good dysphemisms, they also make good expletives.
Hence, many taboo terms share this particular function. Furthermore, the very fact
that a term is taboo may improve its value as auto-cathartic: the breaking of the taboo
is, ipso facto, an emotional release (cf. Allan & Burridge 2006: Ch. 10). As Allen Read
once described it (in his characteristically flamboyant fashion):
The ordinary reaction to a display of filth and vulgarity should be a neutral one
or else disgust; but the reaction to certain words connected with excrement
and sex is neither of these, but a titillating thrill of scandalized perturbation.
(Read 1977: 9)
This is what provides the auto-catharsis that a speaker wants in order to cope with the
situation that provoked the expletive. This very strong motivation no doubt accounts for
the consistent historical failure of legislation and penalties against swearing. “Cursing
intensifies emotional expressions in a manner that inoffensive words cannot achieve”
(Jay 1992: 68; Jay 2000: 91, 137); we have more to say about this in Section 3.4.
8. Pinker (2007: Ch. 7) offers an account of the neural mechanisms involved in cathartic
swearing.
It should be said that auto-catharsis through swearing is regarded as a conventional
way of violating a taboo: a convention that is not socially approved of, but one that is
grudgingly excused by society. In both public and private, an individual’s self-control
will determine the choice of vocabulary used. Where a situation provokes dysphemism,
a speaker can choose between using a full-blown swearword such as Fuck! or one of
the many euphemistic disguises such as Oh fiddle-faddle!. The latter can be regarded as
a euphemistic dysphemism. Here the locution (the form of words) is at variance with
the reference and illocutionary point of the utterance (i.e. what the speaker is doing
in making the utterance). The expressive exclamation Shit! typically expresses anger,
frustration, or anguish, and is ordinarily a dysphemism. Its remodeled forms Sugar!,
Shoot!, Shivers! or Shucks! are euphemisms – they are nothing but linguistic fig leaves
for a thought that can be castigated as dysphemistic. As the following examples show,
euphemism is not confined to expletives but occurs in other types of swearing as well.
(11) Oh shucks Tony could’ve made a gourmet. [ICE-AUS S1A-090:190]
(12) Oh sugar. We’ve burnt it. [ICE-AUS S1A-058:284]
(13) “Get stuffed,” answered Witcharde. [ACE L07:1281]
(14) Yeah When I think drugs I just think you know stuffed up mind body
everything, you know. [ICE-AUS S1A-053:159]
(15) This this advert sucks. [WSC DPC030:0170]
(16) you know I was going gosh don’t you remember anybody [WSC DPC219:1455]
(17) These screwed up men then screw up women. [ICE-NZ W1A-002:135]
(18) which I’m having to redo cos one of [the] disks was screwed.
[WSC DGZ079:0015]
These are prime examples of the censoring of language for the purpose of taboo
avoidance (for stuffed and screwed understand fucked). A person may feel the inner
urge to swear but at the same time not wish to appear overly coarse in their behaviour.
Society recognizes the dilemma and provides an out – a conventionalised euphemistic dysphemism like Oh shucks! or Oh sugar! Such euphemistic dysphemisms exist to
cause less face-loss or offence than an out-and-out dysphemism (although they will
not always succeed in doing so).
Conversely there are locutions that are dysphemistic while the illocutionary point
is euphemistic and these we label dysphemistic euphemisms. Where the situation provoking an emotional outburst is pleasing and there is no call for dysphemism, it is less
likely that a taboo term will be used. However, there are also situations under which
euphemistic uses of taboo terms are appropriate; for example, the well known 1999
West Australian Lotteries advertisement where the lottery winner uttered Bullshit!
upon hearing the good news. Similarly, in (19) the use of the offensive expression
Shit! is at odds with the positive emotions that lurk behind it. Similar things can be
said about (20–22) (which reveal the more social function of swearing that we discuss
in Section 3.3).
(19) SHIT that’s great. [WSC DPC331:1545]
(20) DAVEEE; crazy hockey cunt. Love him (Bugeja pc 2008)
(21) wookey is a gem love that cunt (Bugeja pc 2008)
(22) [laughs] you’re a gross cunt [laughs] [WSC DPC251:0980]
3.2 Swearing as abuse or insult
The language of swearing can also have an abusive function. This includes curses, namecalling, any sort of derogatory comment directed towards others to insult or wound
them. Speakers may also resort to swearwords to talk about the things that frustrate
and annoy them, things that they disapprove of and wish to disparage, humiliate and
degrade. Presumably there is no need to try to account for why people (deliberately)
use insults like You are a stupid little shit! or dysphemistic epithets like It’s a pain in
the arse: it is because they do not like who they are addressing, or who or what they
are talking about. To insult someone verbally is to abuse them by assailing them with
contemptuous, perhaps insolent, language that may include an element of bragging. It
is often directly addressed to the target as in You arsehole, you’re a fucking tight-assed
cunt! Get fucked!.
(23) show-off city bitch who thinks the sun shines out of her arse.
[ICE-NZ W2F-017:40]
(24) “Well bloody get your arses in here. I’m not getting up.” [WWC K49:151]
(25) the people on night fills are arseholes [WSC DPC311:0320]
(26) but he’s a ARSEHOLE man. [WSC DPF076:0750]
(27) nice tight poncey jeans. I hope they cut your balls off. [WWC G48:095]
(28) one word to say to you Mollie BOLLOCKS [WSC DGB024:0800]
(29) yes it is a bugger [WSC DPF021:0320]
(30) going to get you kidfucker! We’re gonna cut your balls off [ICE-NZ W1B-004:110]
(31) Like at the top there’s all these cocksuckers all these rich you know selfish
greedy power-hungry peoples and like they don’t do anything for anyone
except you know help their buddies. [ICE-AUS S1A-090:249]
(32) Fuck you NAME [ICE-AUS S1A-083:107]
(33) I can’t believe this shit They’re promoting this fucking ideal [look]
[ICE-AUS S1A-026:45]
(34) little shits dressed me up as a fucking angel [WSC DPC162:2150]
(35) oh yeah the audience thought it was really shit and he you know …
[WSC DPC118:1230]
(36) Outa Out of my way, sucker. [ICE-NZ W2B-012:31]
Example (27) shows how difficult it can be to draw clear lines between swearing and
abuse. There is no doubt that this is abusive, but is it swearing? The expression balls
is slang for a bawdy body part, but it is a fairly mild taboo term and not uncontroversially a swearword in this context. As an expletive expressing disbelief, Balls! is a clear
instance of swearing because of the emotional outburst. Yet (27) is insulting language
and is also aggressive. It would certainly be viewed by most people as “bad” language;
we’ll count it as swearing and leave it to the reader to agree or disagree.
The language of abuse is normally intended to wound the addressee or bring
a third party into disrepute, or both. Typically, insults pick on a person’s physical
appearance and mental ability, character, behaviour, beliefs, and familial and social
relations to degrade. Thus insults are sourced in the target’s supposed ugliness, skin
colour and/or complexion, over or undersize (too small, too short, too tall, too fat, too
thin), perceived physical defects (squint, big nose, sagging breasts, deformed limb),
slovenliness, dirtiness, smelliness, tartiness, stupidity, untruthfulness, unreliability,
unpunctuality, incompetence, incontinence, greediness, meanness, sexual laxness or
perversion, sexual persuasion, violence towards others (even self), ideological or religious persuasion, social or economic status, and social ineptitude. And additionally,
supposed inadequacies on any of the grounds just listed among the target’s family,
friends and acquaintances.
Verbal insults can occur in all styles of language and may or may not contain
swearwords; you dag! can be an expression of abuse, but it is not swearing. Abusive
swearing can involve epithets derived from tabooed bodily organs (e.g. asshole,
prick), bodily effluvia (e.g. shit), and sexual behaviours (e.g. whore, fucker, poofter,
arse-licker, dipshit, cock-sucker, wanker). Maledictions often utilize images of sexual
violation e.g. I was stuffed; We got fucked/screwed; What a ball buster/breaker; He was
just jerking us off.
A dysphemistic epithet like Short-arse! picks on real physical characteristics that
are treated as though they are abnormalities. Epithets like these merge into racist
dysphemisms, and dysphemistic epithets based on behaviours that the speaker disapproves of, such as homosexuality. There are many imprecations and epithets invoking
mental subnormality or derangement: Dickhead! Fuckwit! Fuckhead! Shithead!. These
are doubly-dysphemistic in that they not only ascribe mental derangement, but do
so using a dysphemistic locution which unscrambles as “your wits are (your head is)
fucked (deranged)”. Shithead! has much the same meaning as Shit for brains! where
the figure is made explicit.
3.3 The social function of swearing
Swearing can act as an in-group solidarity marker within a shared colloquial style –
especially when directed against out-groupers. Social swearing was the most
usual type of swearing in the corpora examined here. In the following handful of
examples, we have provided more context so as to better reveal the intentions of
the speakers.
(37) My my parents dressed me up in as an angel once and they said they had this
big poster on my on my chest saying my looks belie me or something belie
means I’m not really what I look w like I look sort of thing and I I mean I was
only about ten I didn’t know what the fuck I was wearing on my own chest you
know I was going yeah funny eh and everyone was laughing I thought it was
just funny you know and then then I found out later what it meant I’ll
never forget them little shits dressed me up as a fucking angel [laughs].
[WSC DPC162:2120-2150]
(38) S1: pray to baby Jesus open up your heart let god’s love come pouring in let
god’s love shine down on you like it has me and Miss Suzanne over here.
S2: oh fuck off. [ICE-NZ S1A-006:85-6]
(39) Yeah and I didn’t even know I was and I feel like I feel like I did real shit work
you know I feel like I let everyone down again. [ICE-AUS S1A-022:251]
(40) like NAME walked off to the loo or something and come back and put mousse
all over my head and we ended up in this big fight with like all this powder and
shit all over the house and we’re running around the place n doing laps of the
flat so everyone’s sort of looking out at us … [ICE-AUS S1A-045:103]
(41) Synge’s got a sense of humour though; before he hot-footed it down the
drive he hot wired the Porsche with a high tension lead from the engine to
the petrol tank. It fair blew the arse off the flashy car. [WWC K80:157]
(42) Marketing strategies [for this uni project] are going to be interesting.
Are you just choosing prostitution to be a smart arse? [WSC DPC164:1130]
Helen E. Ross (1960) examined swearing among a group of five male and three female
British zoologists in the Norwegian Arctic during continuous daylight. Although the
research was conducted some 50 years ago, it corresponds to what we believe to be the
case today and in the antipodes. Ross writes:
As the work entailed considerable interruption or loss of sleep, most members
had good cause for becoming irritable and swearing. […]
Each individual had his own vocabulary and habitual level of swearing and
tended to keep to the same rank order in the group however much the total
swearing level rose or fell.
The words used were blasphemous rather than obscene, as is to be expected
among the middle classes. Unlike the working classes, however, their use of
obscene words was deliberate rather than habitual, and they took delight in using
them in their correct biological sense. The heavier swearers used the more violent
language.
[…] The amount of swearing increased noticeably when people were relaxed and
happy, though it also increased under slight stress, it decreased when they were
really annoyed or tired. In fact there seemed to be two types of swearing: “social”
swearing and “annoyance” swearing. Social swearing was intended to be friendly
and a sign of being “one of the gang”; it depended upon an audience for its effect,
while annoyance swearing was a reaction to stress regardless of audience. Social
swearing was by far the commoner. […] Under conditions of serious stress, there
was silence. (Ross 1960: 480f)
Ashley Montagu (1968: 87–9) cites these findings by Ross and adds:
[The expedition leader commented i]t was his own impression that under extreme
stress fewer words are used, but that most of them are swear words.
Among Dr Ross’ interesting findings was the fact that absence of an appreciative
audience or the presence of nonswearers inhibited social swearing.
…[Furthermore] those who swear are likely to suffer less from stress than those
who do not swear. (Ibid.)
This reinforces the common observation that those who condemn swearing are
“uptight”. Ross (1960: 481) also confirms that social swearing typically diminishes if
there are non-swearers present. Shared swearing patterns indicate membership of
the group. Like the “incorrect” language of nonstandard grammar, taboo words fall
outside what is good and proper, and they therefore help to define the gang. Thus
we should extend this category to cover expressions of mateship and endearment
like fuckster, and the epithet “cute little shit” in Have you seen Edna’s baby boy? He’s
a cute little shit isn’t he?, or “silly bugger” in Joe’s a silly bugger, he should never have
married that woman. As in other native varieties of English, this usage is routine
in Australian and New Zealand English, and speakers often report that the more affectionate they feel towards someone, the more abusive the language can be towards that
person. The conversational corpora examined here certainly bear this out. Examples
like (20)–(22) and (43) are commonplace.
(43) fuck you’re exaggerating bitch [laughs]. [WSC DPC163:2240]
Many younger speakers when in the company of good mates engage in what can only
be described as a kind of ritual insult. Here are some examples from Australia.
(44)
[Two urban working class teenage Australian Aboriginal females]
A: Gimme the smoke if you want it lit Eggbert.
B: Here shit-for-brains. [passes the cigarette]
A: Geez you’re a fuckin’ sook. I swear to God.
B: Shut up fucker ... (Allen 1987: 63)
(45) A: If I had a pussy like yours I’d take it to the cat’s home and have it put down ...
B: If I had brains like yours I’d ask for a refund ...
A: Well, if I had tits like yours I’d sell them off for basket balls ... (Allen 1987: 62)
(46) [Two urban working class teenage Australian Aboriginal males]
A: Have you got a match?
B: Yeah, your prick and a jelly bean (Allen 1987: 66)
Ritual abuse of this nature is a competitive game, a kind of teasing. It utilizes the same
categories as the kind of insults to outgroupers (or people cast as outgroupers) that we
have just discussed. Yet it is not an attack on an enemy or someone who is an outsider
despised or disparaged, but an expression of group solidarity. This clearly comes out in
a celebrity roast, the “unmerciful mockery of a celebrity in his or her presence”. As displays
of upmanship, such exchanges use insults based on people’s (supposed) sexual practices,
age, appearance (body and clothes), smell, and domestic arrangements. Exactly these
categories are also found in true insults, intended to wound, humiliate, and belittle.
Thus true insults are subject to taboo and censoring.
As already mentioned, taboo terms make good offensive epithets and expletives
for the same reason that they make good insults. At least one occasional reason for
using taboo terms is to savour the hearer’s adverse reaction. A related reason is for the
speaker to flaunt his or her disrespect for social convention (this is presumably one
motivation for writers of graffiti); though in the verbal stoushes of ritual abuse this
inverts to a respect for the social convention of the game.
Over the years, the art of the ritual insult has gone by different names. The term
flyting has been around since Anglo-Saxon times, and continued into the fifteenth
and sixteenth centuries. Late nineteenth century American cowboys engaged in
cussing contests, where a saddle would be awarded to the most abusive participant.
The dozens is the term used of the same behaviour among African-Americans today.
The dozens is also called bagging, capping, chopping, cracking, cutting, dissing, hiking,
joning, joaning, joining, ranking, ribbing, serving, signifying, slipping, sounding, snapping. Essentially flyting, the dozens and the like are (at best) a confrontation of wit,
insight and upmanship in which people try to outdo each other in the richness of
their rhetorical scorn by taunting another person with insults about them or their
family in front of an audience.
If we make the solidarity function of ritual insult the criterion that distinguishes
it from true insult, then we have to class what is sometimes called friendly banter as
ritual insult. It is marked by the use of normally abusive address forms or epithets
which are uttered without animosity, which can be reciprocated without animus, and
which typically indicate a bond of friendship.
(47) First youth: Hullo congenital idiot!
Second youth: Hullo, you priceless old ass
Damsel: I’d no idea you two knew each other so well!
(Punch cartoon quoted in Stern 1965: 323)
Here is a recent example from a chat room interchange (logged August 29, 2002,
nationality of participants is unknown; for the uninitiated, “lol” = ‘laughing out loud’,
“:-)” = ‘smile’, “j/k” = ‘just kidding’).
(48)
<mark> didnt your motherboard come with any papers
<Darkman-X> iz that that book that says A7V333 on it?
<[RaW]> yes
<Darkman-X> lol the one that i’m using to prop up my comp table?
<[RaW]> probably
<Darkman-X> whoops
<Darkman-X> :-)
<Darkman-X> j/k
<JoHn> lol
<[RaW]> yur supposed to use your school books for that dummy
As mentioned earlier, there is a psychological gain in letting off steam and expressing
extreme emotion when expletives, forbidden words, automatically come tumbling out.
It is not surprising therefore to find that many societies have public acts of ceremonial
misbehaviour to function as a social safety valve. Flyting, playing the dozens, and other
kinds of competitive ritual insulting appear to manifest this function. When players
bait and tease each other, trying to outdo one another with insults, this represents a conventionalized
breaking of taboo, a way to let off steam without harming themselves or others.
3.4 Stylistic functions of swearing
One aspect of the stylistic function is to use bad language to spice up what is being
said: to make it more vivid and memorable than if orthophemism (straight-talking)
had been used. An example is Paul Keating’s alleged description of Australia as “the
arse end of the world”.9 Another, not unrelated, aspect is to display an attitude of
emotional intensity towards what is being said or referred to in the utterance as in (49).
(49) Welfare, my arsehole. [ACE F10:1953]
9. Alleged by Prime Minister Bob Hawke in 1990.
Here are some examples of spicing up what is said.
(50) While his partner and twin brother Norman had given up trying to make
it on his own and sworn never again to have anything to do with sheep
(“groundlice” as he scathingly called them), “as long as his arse pointed south,”
Battler took his sheep onto the stockroutes for three hard years, and never lost
faith in the return of the wool market. [ACE R09:1734]
(51) Don’t phone me yet as I am having both my ears transplanted to my nuts so I
can listen to you talk through your arse. [ACE S05:873]
(52) freeze your balls off in winter [ACE P13:2358]
(53) She put her hand on his cock. [ACE P13:2516]
(54) How in the HELL do they think they can change it by sitting on their arses
doing nothing? [WSC DGI148:0305]
(55)
Oh there was yeah those people over there
were still raging when i got back
nfs
n f s?
nfs
um oh true
[…]
I was on a totally different planet … actually
[laughs] trying to work out what n f s meant
yes [laughs]
[indecipherable]
nfs
is this what the dog’s called
no
oh
no fucking shit [laughs]
I thought he was asking me. No no fucking sex? [laughs]
[groans] [WSC DPC162:0855]
(56) Bad luck boys – you blew your arses this time. [ICE-NZ W2F-017:134]
(57) he also decided to get ripped to the tits [ICE-NZ W1B-004:116]
(58) Yeah we’re hooking up with them in Adelaide we’ll swab the decks finger each
other in the arses y’know all that sorta shit. [ART ABCnat7]
(59) So I mean if England can do it [security checks] in less than six months and
we’re supposed to be under the umbrella of the uh British so to speak with the
uh queen and such shouldn’t we be running that way instead of doing the old uh
head up Bush’s arse thing and y’know doing it their way, so … [ART ABCnat8]
(60) On the wall of his office was a framed Elbert Hubbard homily, If You Work
For A Man, For Heaven’s Sake Be Loyal To Him, blasphemously known to the
apprentices as the bumsuckers’ oath. [ACE S07:1186]
(61) [The shop] was called “Beauty Spot”. That’s a suckful name. [ICE-NZ S1A-002:106]
(62) You’ve been screwing someone else. [WWC K41:216]
Example (62) is a pained accusation in which “screwing” is less forceful and more
ladylike than fucking but displays more emotional intensity than sleeping with.
No discussion of antipodean swearing would be complete without some consideration of the so-called “great Australian adjective”. Although barely a taboo word
or a swearword in AusE and NZE, bloody still raises eyebrows in other parts of the
English-speaking world. In February 2006, Tourism Australia launched an international tourism campaign with a television advertisement showing images of everyday Australians set against a backdrop of famous landmarks, concluding with the
ockerish Australian invitation So where the bloody hell are you?10 The advertisement
was censored in North America and even managed to get itself banned from British
TV. However, the then Minister for Tourism, Fran Bailey, persisted with the advertising campaign: “This is a great Australian adjective. It’s plain speaking and friendly. It
is our vernacular”. The 2006 ban on the advert in the UK was not in keeping with a
country responsible for the designer label FCUK and comedies like “Absolutely Fabulous” <http://www.bbc.co.uk/comedy/abfab/> and “Little Britain” <http://www.bbc.co.uk/comedy/littlebritain/>. For British authorities to be squeamish about bloody hell
was also not in keeping with their own research into attitudes to offensive language.
Millwood-Hargrave (2000) was a joint study carried out by the Advertising Standards
Authority, the British Broadcasting Corporation, the Broadcasting Standards Commission, and the Independent Television Commission; 1500 participants were asked
to respond to the perceived “strength” of 28 swearwords: a mere 3% found bloody to
be offensive.
The description “great Australian adjective” goes back to the mid-1800s. Alexander
Marjoribanks (1847) wrote: “The word bloody is a favourite oath in that country. One
man will tell you that he married a bloody young wife, another, a bloody old one,
and a bushranger will call out, ‘Stop, or I’ll blow your bloody brains out’ ” (pp. 57f).
The word made a deep impression on Mr Marjoribanks who also noted that a bullock driver he encountered had used the word bloody 27 times in 15 minutes. So
astounded was Mr Marjoribanks, that he went on to further calculate that within a
50 year period that same bullock driver would use bloody 18 200 000 times. A few
10. It can be viewed at <http://www.wherethebloodyhellareyou.com/>.
years later the Sydney Bulletin referred to bloody as “the Australian adjective” and the
name has stuck.
It is worth pointing out that the swearword bloody is not truly an adjective.
Compare the bloody skies (ART) with the blue skies. While bloody appears to have the
attributive function of an adjective that precedes the noun it modifies, it cannot normally have a complementary predicative function. We can alternate between the blue
skies and the skies are blue but the skies are bloody is not a paraphrase for the bloody
skies. (There are occasions when It is just bloody can be heard, but the predicative
use is rare.) While most adjectives can be modified by intensifiers like very (e.g. the
very blue skies), modifying bloody in the very bloody skies invokes the literal meaning
(which then does make “bloody” an adjective).
In fact, bloody functions like an intensifier when it co-occurs with evaluative
adjectives, as in She’s a bloody good root; the same is true for blasted, bleeding, sodding,
fucking as well as standard intensifiers like very, awfully, exceedingly, etc. Concatenated
with nouns, adjectives, participles and verbs, bloody emphasizes the emotive, often
urgent attachment to the speaker’s speech act, as in the invented examples (63–69).
In brackets we’ve supplied a typical interpretation of the emotive force that might be
provided by these intensifying expletives. Later we give examples from the corpus.
(63) It’s a bloody/fucking crocodile! [warning]
(64) It’s a bloody/fucking picture! [nothing to make a fuss about]
(65) You’ve bloody/fucking broken it! [lamentation]
(66) But I’m going on bloody/fucking holiday! [exasperation at question asked]
(67) You’re driving too bloody/fucking fast! [condemnation]
(68) This train is bloody/fucking late/slow. [complaint, exasperation]
(69) It’s turned bloody/fucking red! [surprise]
As the corpus evidence confirms, the word bloody is still common in the antipodes.
In the conversational data of the ICE-AUS it occurs 46 times in 22 933 sentences (20
per 10 000), in the ART corpus there are 13 occurrences of bloody in 20 375 sentences
(6 per 10 000), and in the ICE-NZ corpus 43 times in 60 175 sentences (7 per 10 000).
For comparison, in COLT, the Corpus of London Teenage Language, there are 291
instances of bloody in 107 429 sentences (27 per 10 000). Which just goes to show that
Australians (and New Zealanders) lag well behind Londoners (though the populations
are not completely comparable).
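The normalised rates quoted above can be checked with a short calculation. The following is a minimal sketch in Python and is not part of the original study: the raw counts are those given in the text, while the function name per_10k and the dictionary layout are illustrative assumptions of ours.

```python
# Occurrences of "bloody" and total sentences, as quoted in the text.
counts = {
    "ICE-AUS (conversation)": (46, 22_933),
    "ART": (13, 20_375),
    "ICE-NZ": (43, 60_175),
    "COLT": (291, 107_429),
}


def per_10k(hits: int, sentences: int) -> float:
    """Return the number of hits per 10,000 sentences (hypothetical helper)."""
    return hits / sentences * 10_000


# Round to the nearest whole number, as the figures in the text are.
rates = {name: round(per_10k(hits, n)) for name, (hits, n) in counts.items()}

for name, rate in rates.items():
    print(f"{name}: {rate} per 10,000 sentences")
```

Normalising by the number of sentences is what allows corpora of very different sizes to be compared; it does not correct for differences in the age or social background of the speakers sampled.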
Here are some annotated examples of bloody from the corpora. As example (70) illustrates,
intensifying expletives such as bloody do not always convey an attitude of exasperation,
disapproval, or whatever, but may simply be a marker of excitement or exuberance that
serves to colour or spice up what is being said.
(70) Did you hear about the new Irish Airways they just had they were allowed to
come into into Australia for the first time. Anyway they were flying into Perth
n the conning tower there was a lotta cloud over the bloody skies n everything.
N the conning tower called up he said Irish Airways Irish Airways he said you
can’t land yet we’ll have to get you to circle round the airport so he says can you
give me your height n position please. So the little Irish bloody pilot gets up n
he says I’m five foot two n I’m sitting up the bloody front. [ART COMne2]
(71) Oh yeah Essie Essie’s There’s no point in Eddie taking her out because she’s
bloody too stuffed you know. She’s an old duck. She doesn’t want to bloody stuff
around town all day. […] Yeah she went down there and bloody went all over
the place. [ICE-AUS S1A-009:18]
(72) Yeah but when we eat a bloody meal ya bloody can hardly move when you’ve
finished it. [ICE-AUS S1A-009:80]
(73) we’re gonna bloody start doing that bloody extension to the house.
[ICE-NZ S1A-052:65]
(74) She leaped at the opportunity, as she always did in such places, to go and have
the total beauty treatment – face massage, manicure, pedicure, everything-bloody-cure! [ACE N01:58]
(75) Well uh I’d be straight down there I tell you right now. I I I’d be the first one
down there. And I tell you I’ve b I’ve been around the mill a few b few few times
I’ve got a young wife she’s only thirty-four and as I said I’m sixty-bloody-four
n n n no I mean I’ve brought up I got three other daughters. They’ve never had
they’ve never got pregnant thank Christ n w they were brought up in the sixties
n seventies and I taught them right from bloody wrong from the start and at
least each and every one of them have had their children and got married and
I’m really really proud of them and these little boys of mine are gunna be the
same way around mate. [ART COMne2]
Bloody has a fine pedigree. There are two colliding origins – both respectable (see Allan
and Burridge 1991: 130–1). One is basically the idea of blood. Quite simply, descriptions
like bloody battle and bloody murder would have extended to other expressions and the
colourful associations of bloodshed and murder would have made bloody a very suitable
intensifying word. You might compare other graphic intensifiers like awfully and horribly that have similar violent origins. A second source involved the so-called bloods, the
young aristocratic louts of the seventeenth and early eighteenth centuries. At that time
descriptions like drunk as a blood (i.e. drunk as a lord) meant that an expression like
bloody behaviour would have had double significance – objectionable behaviour, something you might expect of a young blood, with the added force of the intensifier bloody.
It is also quite apparent that early on in its life bloody was not considered a bad
word at all. In 1714 Jonathan Swift in a letter to a woman friend described the weather
as “bloody hot”. And in later letters he talked about being “bloody sick”, and the weather
being “bloody cold” (Montagu 1968: 245). The Dean of St Patrick’s Cathedral (Dublin)
seems to have been using bloody with the same freedom that gentlemen and ladies of
good breeding would have used terms like frightfully, vastly or dashed. It could not have
been an impolite term at that time.
And yet, two hundred years later, bloody had become such “a horrid word” that it
was necessary to render it in print with asterisks. Eliza Doolittle’s scandalous outburst
in Act III of Pygmalion (“Walk! Not bloody likely”, Shaw 1946: 78) provoked such an
outrage that the press in 1914 could do no more than just hint at it. It became “the
Unprintable Swearword”, “the Word”, “Shaw’s Bold Bad Word”. So why this fall from
grace? There are at least two reasons. One is undoubtedly the bogus etymologies that
derived the expression either from By our lady, an oath calling on the assistance of the
Virgin Mary, or from [God’s] blood. There is no evidence for either of these histories;
what is more, bloody is not an independent expletive like these two expressions, but
rather an intensifier. Yet, for some people there were blasphemous and profane implications and that was enough to condemn the word. Secondly, and probably more
importantly, its lurid associations meant it was much used by the criminal classes.
Captain Grose in his Dictionary of the Vulgar Tongue (1795) describes it as “a favourite word used by thieves in swearing”. He gives the example bloody rascal. This connection with the underworld explains its currency in the colonial slang of Australia
and New Zealand.
4. Swearwords as discourse particles
There are people who use expletives and taboo epithets so frequently that one cannot
persuade oneself they are auto-cathartic. Writing of the use of fuck by British soldiers
in World War I, Brophy and Partridge had this to say:
So common indeed [was fuck] in its adjectival form that after a short time the ear
refused to acknowledge it and took in only the noun to which it was attached. ... Far
from being an intensive to express strong emotion it became a merely conventional
excrescence. By adding -ing and -ingwell, an adjective and adverb were formed and
thrown into every sentence. It became so common that an effective way for the
soldier to express emotion was to omit this word. Thus if a sergeant said, “Get
your -ing rifles!” it was understood as a matter of routine. But if he said, “Get your
rifles!” there was an immediate implication of emergency and danger. (Brophy &
Partridge 1931: 16f.)
Where a taboo term such as fuck is bleached of its taboo quality, it loses all its standard force. The following example from AusE appears in the court case Police v Butler
(2003). The incident occurred outside the defendant’s house at around 11.30 at night;
he was intoxicated and was addressing the police and neighbours:
“What the fuck are youse doing here. My fuckin’ son had to get me out of bed. I
can’t believe youse are here. What the fuck are youse doing here?”
“I fuckin’ know what this is about. It’s about that fuckin’ gas bottle. They can get
fucked, I’m not paying them fucking nothing. They can get me our fuckin’ bottle
back” [to the police about the neighbours]
“We never had any fuckin’ trouble till youse fuckin’ moved here. Youse have
fuckin’ caused this trouble and called the fuckin’ police on me” [to the neighbours].
(Police v Butler [2003] NSWLC 2 before Heilpern J, June 14, 2002)
Lashings of obscenities have also become an earmark of celebrated chef and restaurateur Gordon Ramsay, so much so that his television cooking series is called “The
F-Word” (see <http://www.channel4.com/food/on-tv/f-word>). Ramsay uses obscenities as discourse particles – where other people might use like, well, I mean, you know,
and the like. This is not to suggest that such bleached swearwords are empty. Like other
discourse particles, these expressions convey subtle nuances of meaning and can have
complex effects on utterances. Wierzbicka (2002) describes the various meanings of
bloody in AusE and shows how they provide important clues to Australian attitudes
and values. Yet one must presume that under such circumstances the auto-cathartic
value of both the expletives and the corresponding epithets is reduced, and that either
alternative expressions will be invented or some other form of catharsis will be sought.
We are put in mind of Shakespeare’s aphorism:
If all the year were playing holidays
To sport would be as tedious as to work. (Henry IV Pt.1 I.ii.192)
Indeed, there is evidence that swearing will diminish under very stressful circumstances,
as suggested in the quote from Brophy and Partridge (1931) and the earlier ones from
Ross (1960) and Montagu (1968).
5. The evolution of swearing patterns – what is offensive changes over time
[I]f you were driving in your car, somebody cuts you up in your car, if they shout
and call you a f-ing idiot, or a bloody idiot or whatever, fair enough. If they
start putting your racial background into that, it’s unacceptable. (Interview in
Millwood-Hargrave 2000: 20)
The processing of the emotional components of language, such as swearwords, belongs
to the limbic system. This is an older deeper part of the mammalian midbrain (about
the size of a walnut) that adds emotional spice to the surrounding cerebral cortex –
the part of the brain that is responsible for verbal reasoning, calculation, analytical
thinking, and rational thought. As yet, there are no laboratory or neuro-imaging studies that have conclusively identified the exact neuroanatomical sites where tabooed
expressions are stored or that have evaluated specifically the neurological processing
of obscenities, but the evidence seems overwhelming: taboo language is rooted deeply
in human neural anatomy; it is hard-wired into the limbic systems of our brains (see
Allan & Burridge 2006 Ch. 10). What motivates the actual expression is the sociocultural setting.
That which is taboo in a society will furnish the language with its swearwords
and, because taboo is dynamic, there will always be shifts of idiom employing terms of
opprobrium. The history of foul language in English has seen the sweeping transition
from the religious to the secular in its patterns of swearing. Outside of Islam, blasphemous and religiously profane language is no longer considered offensive by a majority of
speakers and has given way to more physically and sexually based modes of expression.
In part, this reflects a natural bleaching process; it is a fact of lexical life that words wear
out over time and nowhere is this more evident than in slang terms and swearwords.
But this change also indicates a shift in the perception of what is taboo, concomitant
with and perhaps triggered by the waning power of the Church and growing secularisation of English-speaking societies. Consider the once shocking nature of the expressions that underlie remodeled curiosities such as drat and rats, both shortened forms
of May God rot you (your body, bones, and soul). Even in their full forms these would
be mild curses today, but they were heinous at a time when most people believed in the
fires of hell and eternal damnation. The 1600s saw the first organized form of linguistic
censorship, specifically laws against profanity on the stage. The fine was a whopping ten
pounds that could have bankrupted a theatre company of that time.¹¹ It is small wonder
that irreverent language went into heavy disguise giving rise to the so-called “minced”
or “dismembered” oaths such as Zounds/zoons “God’s wounds”; gadzooks “God’s hooks”
(meaning “God’s fingers” or “God’s bones”); slidikins “God’s little eyelids”.
The same weakening is now evident in the physically and sexually based swearwords. Sex and bodily functions no longer provide the potent swearwords they once
did. Our experience in Australia is that since the 1990s such words are frequently
encountered in the public arena and there now seems to be wide acceptance of them. The
designer label FCUK (French Connection, UK) appears prominently on billboards
everywhere. When in a radio interview (April 1999), the then Premier of Victoria,
Jeff Kennett, used the insult pricks to describe a group of people who had flouted the
restrictions that had been imposed during the gas crisis of that year, there was barely
a ripple. In June 1999, the Australia Institute’s executive director, Dr Clive Hamilton,
was heard using fuck during an interview on the ABC’s well-respected current affairs
11. See Hughes (1991) for a full historical account.
program Four Corners. This was the third occurrence of the word on a Four Corners
program that year. Around the same time appeared a highly successful TV advertisement using bugger to sell the new Toyota Hilux utility truck <http://www.youtube.com/watch?v=1Sn9L94YrNk>. This advertisement now has something of a cult following, especially in New Zealand. In Australia, the advertisement had followed hot
on the heels of the West Australian Lotteries advertisement in which a winner says
Bullshit! on being told he has won. On 19 July 2007, after renewed controversy over
then Prime Minister John Howard’s alleged broken promise to hand over the Liberal
leadership position to his Deputy Peter Costello, the Minister of Health and Ageing
Tony Abbott said in an ABC Lateline interview “Not to put too fine a point on it, shit
happens ... we just have to cope” <http://www.youtube.com/watch?v=g5acfubEyZs>.
Newspapers, which would once have resorted to coy abbreviations when reporting
such events, often used the full words without warning. In February 1991 the Press
Council of Australia in Adjudication No 479 defended the inclusion of four-letter
expletives in an interview with actor Bryan Brown, published in the Arts Section of
the Weekend Australian (4 August 1990). Mr JD Purvey wanted an apology for the
use of “vile obscene language”. Part of the determination reads:
News Ltd responded at some length to Mr Purvey’s objections, saying in essence
that the use of expletives had gained wide acceptance and such profanities were
no longer confined to the factory floor or dockside. It supported its argument
with a Telegraph-Mirror article quoting a university language expert as saying
that four-letter profanities were now widely used by both men and women. The
Council believes, in this case, that the use of the word in full was justified. (Cited
in Police v Butler 2003: 4)
In this regard, it is interesting that Roy Eccleston’s recent article on swearing
in the Weekend Australian Magazine (June 7–8, 2008, <http://www.theaustralian.news.com.au/story/0,25197,23819802-5012694,00.html>) used only abbreviations
such as f..k, c..t, the f-word and the c-word. Clearly, there are still some people
who are uncomfortable hearing these two particular swearwords. According to
recent research conducted by the Australian Communications and Media Authority (2008), around 5% of the viewers surveyed gave bad language as something of
concern. A Senate Committee was set up by the Liberal Senator Cory Bernardi to
investigate the frequency and use of coarse and foul language in programs <http://www.refused-classification.com/Reviews.htm>. It is reported that the Senate will
reject the notion that some profanities should be decreed unacceptable because
community standards evolve, and to codify them would be exceptionally difficult.
And “according to an ACMA survey, only 3% of parents stopped children from
watching programs because of bad language last year [2007], compared with 34% in
1995” (The Age June 19, 2008, <http://www.theage.com.au/national/ramsays-not-going-to-fffade-away-20080618-2swm.html>).
The corpora examined in preparing this chapter show an abundance of examples of bugger, bloody, fuck, fuck off, fucking, and also cunt. Free-to-air television
now frequently includes words such as fuck, fuck off, fucking, as well as cunt. “Foul
language” regularly turns up in movies rated PG (parental guidance), and is no
longer confined to MA (mature audiences) or R (restricted) rated movies. Clearly
the censors who make the classifications do not find language such as we have been
discussing a problem. In reality TV programs such as Big Brother, sitcoms like Sex
and the City, and dramas such as The Sopranos, these words are now commonplace. In
Australia, the swearing and sex clearly had no damaging effect on the ratings Channel Nine received for its (2008) television drama series Underbelly based on the real
events of the 1995–2004 gangland war in Melbourne <http://www.underbellytv.com>.
In Sydney, Brisbane, Adelaide and Perth where it was shown (the Supreme Court
suppressed it in Victoria), episode one of the series drew an average national audience of 1.32 million people. It was the most popular show of the night in these four
mainland capitals and the third most-watched show on Australian screens overall.
Moreover, this show screened at 8.30 pm, despite Australia’s official 9.00 pm TV
watershed, before which it is supposedly not permissible to show television programmes which have “adult content”. In New Zealand the show started at 9.30 pm
and although it was axed after only three episodes, it was reinstated due to a public
outcry in its favour.
The social acceptance of swearing explains why obscene language charges in
Australia and New Zealand are now typically dismissed, with courts ruling that words
such as fuck, shit, and cunt are no longer “offensive”. Earlier we quoted some of the
defendant’s words in the case Police v Butler (2003). Although the speaker was summonsed for using offensive language, the case was dismissed. Clearly the defendant
did use language that might reasonably be described as “offensive” – so why is it not
offensive in law (at least in the State of New South Wales, Australia)? The presiding
magistrate, Heilpern J, referred to another case where a defendant was summonsed
for saying to police trying to restrain him during a brawl, Get fucked you cunts, I’m
just trying to help my mates. That case was heard by Yeldham J, who wrote:
I determined by a consideration as best I could of community standards today
and decisions on this kind of legislation over the last twenty years, that the words
were not intrinsically “offensive” in the requisite legal sense of that word.
In Police v Butler, Heilpern J referred to several additional cases and also to the extreme
prevalence of words like fuck and cunt within the community, and their frequency on
free-to-air television and in other media.
Channel 9 has recently broadcast a show (Sex in the City) that includes the words
“fuck off ” and “fucking” as well as “cunt”. The word was used on “The Panel” and
the station only received two complaints. Recently, the Sydney Morning Herald
revealed that “fuck” was used in the television program “The Sopranos” seventy-one times in one single episode (SMH April 29, 2000, 3s). Big Brother residents
evidently cannot live without the word in every episode.
Heilpern J concluded that:
This is a classic example of conduct which offends against the standards of
good taste or good manners which is a breach of the rules of courtesy and runs
contrary to accepted social rules – to use the words of Justice Kerr. It was ill-advised, rude, and improper conduct. Some people may be offended by such
words, but I am not satisfied beyond a reasonable doubt that it is offensive within
the meaning of the section. There is doubt in my mind that a reasonably tolerant
and understanding and contemporary person in his or her reactions would be
wounded or angered or outraged. Such a person would be more likely to view
it as a regrettable but not uncommon part of living near people who drink to
excess. I have no doubt that people would have been disturbed as a result of
being awoken or distracted by the yelling and carry on, whatever the language
used. I ask myself this question – what difference would it make to the reasonably
tolerant person if swear words were used or not. I answer that there would be
little difference indeed.
What is interesting about this legal decision and similar judgements is that they
reflect the changes in social attitudes: taboos on various kinds of profanity have been
relaxed. They have been replaced by sexual, racial and ethnic slurs, so that the new
swearwords these days include expressions such as faggot, dike, queer, dago, kike, kaffir, nigger, mick, wog, boong, abo and so on. These reflect the new taboos in English-speaking societies. Since the 1980s, speakers have shown a growing apprehensiveness
about how to talk to and about those perceived to be disadvantaged or oppressed. There
has been a gradual establishment of legally recognised sanctions against what we have
described as -IST language (Allan & Burridge 1991; 2006). These new taboos make
sexist, racist, ageist, religionist, etc. language not only contextually dysphemistic, but
also legally so. The -IST taboos have surpassed in significance irreligious profanity,
blasphemy and sexual obscenity, against which laws have been relaxed. In the sporting arena, for example, players are occasionally sin-binned but never charged for foul
language on the field, that is, unless the complaint involves race discrimination and
vilification. In 1995 an Australian Rules football player, Damian Monkhorst, was disciplined for calling an Aboriginal player, Michael Long, a “black cunt” or “black bastard”
during a game. It was the racial abuse that triggered the furore and the incident gave
rise to the AFL’s “Rule 30: A Rule to Combat Racial and Religious Vilification” – a new
code of conduct to apply both on and off the sporting oval: see <http://www.austlii.edu.au/au/journals/AJHR/2000/18.html>. -IST language can be so provocative as to
be found offensive in law.
6. Swearing is ever changing, but here to stay
The whole history of swearing bears unequivocal testimony to the fact that
legislation and punishments against swearing have only had the effect of driving
it under the cloaca of those more noisome regions, where it has flourished and
luxuriated with the ruddiness of the poppy’s petals and blackness of the poppy’s
heart. It has never been successfully repressed. (Montagu 1968: 25)
Over the centuries, attempts to stamp out swearing have met with little to no success.
Censorship and repression, whether they amount to full-blown sanctions or merely
social niceties, seem only ever to provide a more fertile breeding ground for “dirty”
words to thrive. One only has to look at the oxymoronic behaviour of the Victorian
middle classes. When sex ceased to be talked about openly, the sex trade and pornography flourished underground. During the Renaissance the very first organised form of
linguistic censorship in England coincided with a flourishing of linguistic subterfuge
in the form of the minced or dismembered oaths mentioned earlier such as zounds
or sfoot.¹² Today we see the same mix of exuberance and restraint. Jonathon Green’s
(1996) collection of abuse terms reveals a flourishing lexicon of bigotry. His collection
of largely racial slurs highlights waves of new arrivals furnishing a brand new litany of
abuse. In grim irony, Green points out (p. 13) that the United States of America, the
land of immigrants and aliens, tops his list of abusers; American coinages make up
the largest proportion of dysphemistic language in his book. Work by Kevin Dunn,
James Forrest and colleagues at the University of New South Wales shows that there
is deep-rooted racism in Australia against Muslims, Indigenous Australians, Jews, and
people of Asian background (see e.g. Dunn 2003; Dunn, Forrest, Burnley et al. 2004).
Unfortunately we do not at this time have sustained linguistic evidence of racist slurs
arising from these attitudes and must leave it for another occasion. But it is worth
mentioning the relative scarcity of -IST abuse terms such as faggot, dike, queer, dago,
kike, kaffir, nigger, mick, wog, boong, abo in the spoken language corpora examined
here – one example of nigger, one of queer, two of faggot and two of abo.
Finally, as the corpora reveal, swearing remains an important feature of the
antipodean varieties of English. But just how uniquely Australian and New Zealander
are the swearing patterns that we have described here? We need comparisons with
the slang, swearing and terms of insult used in other varieties of English, especially
BrE and AmE. Prima facie there is much that is common to the northern hemisphere
and antipodean expressions used. It remains to be seen whether Australians and
12. Hughes (1991), Ch. 5, describes the ingenious circumvention that such repression encourages. Ch. 7 also offers a splendid account of the schizoid behaviour of the Victorians – a rich
exuberance of swearing went hand in hand with the decorum and censorship of the time.
New Zealanders really do live up to their popular image of having an unusually rich
and creative “bad” language.
References
Adams, Robert M. 1985. “Soft soap and nitty gritty”. In D.J. Enright (ed.), Fair of Speech: The Uses
of Euphemism, 44–55. Oxford: Oxford University Press.
Alcock, Sophie. 1999. “Attitudes to swearing in Australian English: A study of gender and
subculture differences”. Honours Thesis. Linguistics Department, La Trobe University.
Allan, Keith. 1992a. “Body-parts and animals”. In Tom Dutton, Darrell Tryon & Malcolm Ross
(eds), A Memorial Volume for Donald C. Laycock, 29–39. Canberra: Pacific Linguistics.
Allan, Keith. 1992b. “Something that rhymes with rich”. In Eva Kittay & Adrienne Lehrer (eds),
Frames, Fields, and Contrasts, 355–74. Norwood NJ: Lawrence Erlbaum.
Allan, Keith & Kate Burridge. 1991. Euphemism and Dysphemism: Language Used as Shield and
Weapon. New York NY: Oxford University Press.
Allan, Keith & Kate Burridge. 2006. Forbidden Words: Taboo and the Censoring of Language.
Cambridge: Cambridge University Press.
Allen, Wendy F. 1987. “Teenage speech: The social dialects of Melbourne teenagers”. B.A. Honours
Thesis. Linguistics Department, La Trobe University.
Andersson, Lars-Gunnar & Peter Trudgill. 1990. Bad Language. Harmondsworth: Penguin.
Australian Communications and Media Authority. 2008. ACMA Communications Report
2006–07. Melbourne: Commonwealth of Australia.
Bathgate, Alexander. 1881. Waitaruna: A Story of New Zealand Life. London.
Bayard, Donn & Sateesh Krishnayya. 2001. “Gender, expletive use, and context: Male and female
expletive use in structured and unstructured conversation among New Zealand university
students”. Women and Language 24 (1): 1–15.
Brophy, John & Eric Partridge. 1931. Songs and Slang of the British Soldier: 1914–1918. 3rd edn.
London: Routledge and Kegan Paul.
Dabke, Roswitha. 1977. “Swearing and abusive language of Australian Rules Football spectators”.
Talanya 4: 76–90.
Dunn, Kevin M. 2003. “Racism in Australia: findings of a survey on racist attitudes and experiences
of racism”. National Europe Centre Paper No. 77. Sydney: University of New South Wales.
Dunn, Kevin M., James Forrest, Ian Burnley & Amy McDonald. 2004. “Constructing racism in
Australia”. Australian Journal of Social Issues 39: 409–30.
Green, Jonathon. 1996. Words Apart: The Language of Prejudice. London: Kyle Cathie Ltd.
Grose, (Captain) Francis. 1795. A Classical Dictionary of the Vulgar Tongue. London: S. Hooper.
Gunn, John S. 1970. “Twentieth-century Australian idiom”. In William S. Ramson (ed.),
English Transported: Essays on Australasian English, 49–67. Canberra: Australian
National University Press.
Haugen, Einar. 1972. The First Grammatical Treatise: The Earliest Germanic Phonology. London:
Longman.
Hughes, Geoffrey. 1991. Swearing. A Social History of Foul Language, Oaths and Profanity in
English. Oxford: Blackwell.
Jay, Timothy. 1992. Cursing in America. Philadelphia PA: John Benjamins.
Jay, Timothy. 2000. Why We Curse: A Neuro-Psycho-Social Theory of Speech. Philadelphia PA:
John Benjamins.
Johnson, Jean L. 1991. “A Comparative Ethnography of Linguistic Taboo: Profanity and
Obscenity among American Undergraduate College Women”. Ph.D. thesis, Indiana
University of Pennsylvania.
Lalor, Thérèse & Johanna Rendle-Short. 2007. “ ‘That’s so gay’: A contemporary use of gay in
Australian English”. Australian Journal of Linguistics 27: 147–73.
Marjoribanks, Alexander. 1847. Travels in New South Wales. London: Smith, Elder, and Co.
Millwood-Hargrave, Andrea. 2000. Delete expletives? London: Advertising Standards Authority,
British Broadcasting Corporation, Broadcasting Standards Commission, Independent
Television Commission.
Montagu, Ashley. 1968. The Anatomy of Swearing. New York: Macmillan.
Murray, Thomas E. 1995. “Swearing as a function of gender in the language of Midwestern
American college students: Who does it more, what do they say, when and where do they
do it, and why do they do it?” Maledicta 11: 139–52.
Oxford English Dictionary. 2nd edn. 1989. 20 vols. Oxford: Clarendon Press.
Pinker, Steven. 2007. The Stuff of Thought: Language as a Window into Human Nature. New York
NY: Viking.
Ramson, William S. 1966. Australian English: An Historical Study of the Vocabulary, 1788–1898.
Canberra: Australian National University Press.
Read, Allen W. 1977. Classic American Graffiti: Lexical Evidence from Folk Epigraphy in Western
North America. Waukesha WI: Maledicta Press. (First published 1935).
Ross, Helen E. 1960. “Patterns of swearing”. Discovery 21 (November): 479–81.
Seal, Graham. 1999. The Lingo: Listening to Australian English. Sydney: University of New South
Wales Press.
Shaw, George B. 1946. Pygmalion. Harmondsworth: Penguin.
Stapleton, Karyn. 2003. “Gender and swearing: a community practice”. Women and Language
26 (2): 22–3.
Stern, Gustaf. 1965. Meaning and Change of Meaning (with Special Reference to the English Language).
Bloomington IN: Indiana University Press. (First published 1931).
Stollznow, Karen. 2004. “Whinger! Wowser! Wanker! Aussie English: Deprecatory language and the
Australian ethos”. In Christo Moskovskey (ed.), Proceedings of the 2003 Conference of the Australian
Linguistic Society. <http://au.geocities.com/austlingsoc/proceedings/als2003.html>
Taylor, Brian A. 1976. “Towards a sociolinguistic analysis of ‘swearing’ and the language of abuse
in Australian English”. In Michael G. Clyne (ed.), Australia Talks: Essays on the Sociology of
Australian Immigrant and Aboriginal Languages, 43–62. Canberra: Pacific Linguistics.
Turner, George W. 1966. The English Language in Australia and New Zealand. London:
Longman.
Wierzbicka, Anna. 1992. Semantics, Culture, and Cognition: Universal Human Concepts in
Culture-specific Configurations. New York NY: Oxford University Press.
Wierzbicka, Anna. 2002. “Australian cultural scripts – bloody revisited”. Journal of Pragmatics 34:
1167–1209.
Epilogue
Collective findings and conclusions
Pam Peters
Macquarie University
1. Differentiation among varieties of English
The major varieties of English are most likely to continue differing in phonology, less
likely to diverge in their lexica, and in grammar the direction is quite unclear, according to Trudgill (1998: 29–34). Yet evidence on the directions of grammatical change
is increasingly available through the compilation of corpora of the regional varieties
of English. Many of the contributions to this volume show how AusE and NZE are
subtly – and not so subtly – differentiated from the major northern hemisphere varieties at the turn of the millennium. Well-established settler (STL) varieties such as
AusE and NZE might indeed be expected to present a set of endonormative features,
in keeping with Schneider’s (2003/2007) model of the evolution of new Englishes. Yet
given the common British ancestry of both AusE and NZE, there would also be points
of usage on which one or both of them still select the variants preferred in BrE, and
thus seem to be not independent of it.
As regional standards consolidate and forge their own linguistic identity, further
internal differentiation of their registers and modes of discourse may be seen. This
is part of the ongoing lectal variation to be found in newly endonormative varieties
of English, and in linguistic changes in the relationships between registers in the
larger history of modern English (Biber, Finegan & Atkinson 1994; Biber 2004).
It is interconnected with recalibration of the formal and informal stylistic features
which combine to set the tenor of discourse. Features of spoken English such as the
use of contractions may become acceptable in written English, as they have in late
twentieth century newspaper material (Axelsson 1998). The changing alignments of
stylistic features help to integrate speech-based syntax and idiom into the common
standard, and to reduce the distance between spoken and written styles. The neutralization of stylistic elements and recalibration of register boundaries intersects with
varietal evolution.
Let us first focus on lexicogrammatical variants where AusE and NZE usage preferences remain in line with BrE, and typically contrast with AmE. We shall then proceed
to examine the usage preferences which AusE and NZE share with each other but not
with BrE, suggesting greater independence and endonormativity in the two southern
hemisphere varieties, and the possibility of an antipodean standard. There are also usage
issues on which AusE and NZE are clearly differentiated from each other. Finally we
take up the larger implications for corpus research and the evolution of world English.
2. Reflexes of BrE persisting in AusE and NZE
How far does the grammar of AusE and NZE diverge from that of BrE? The short
answer must be “not very much”, because they draw on a common stock of grammatical resources, of which the linguistic variables treated in this volume are only a subset.
Yet for that subset, corpus data can distinguish those on which AusE, NZE and BrE
preferences are more or less identical, and those where they diverge either in terms of
frequency or the actual selection of variants.
Identical or very similar patterns of selection in AusE, NZE and BrE were found
in Mair’s study of verb complementation (Section IV). He found that, following verbs of
hindering such as stop, prevent, all three varieties continue to exploit both simple -ing
and from + -ing constructions, while AmE usage is almost entirely concentrated on the
from + -ing construction. In the choice between simple gerund and to- infinitive following begin, start, BrE, AusE and NZE are less advanced than AmE in their use of the
gerund, in changes analogous to the “Great Complement Shift” (Rohdenburg 2006).
The closer proximity of BrE, AusE and NZE to each other than to AmE manifests itself
in other areas of complementation, e.g. the preposition following different. While from
is the most frequent choice in all four varieties (at least in written corpora), the second choice is sharply divided, with than preferred in AmE to the exclusion of to, and
to preferred to the exclusion of than in BrE, AusE and NZE (Hundt, Hay & Gordon
2004: 325–6).
Shared patterns of usage for AusE, NZE and BrE were also found in the study
of concord with collective nouns (Hundt, Section IV). Though the raw frequencies of
plural concord were slightly higher in BrE than AusE and NZE, the differences were
not statistically significant. Only at the lexicogrammatical level were there some less
convergent findings, as in the particularly frequent use of plural agreement in BrE for
family and board, and in AusE and NZE for the nouns team and group. However those
nouns amount to just 4 out of the 35 examined by Hundt, and the dominant finding
was one of similarity rather than difference among the three varieties. In their greater
tolerance of plural agreement, all three contrast with AmE (Hundt 2009).
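The kind of significance check mentioned above can be illustrated with a chi-square test of independence on a table of concord counts. What follows is only a minimal sketch in Python: the counts are invented for illustration and are not Hundt's data, the function name chi_square is hypothetical, and the choice of a chi-square test is itself an assumption, since the test actually used is not specified here.

```python
# Minimal sketch: chi-square test of independence for concord counts.
# All counts below are INVENTED for illustration; they are not corpus findings.

def chi_square(table):
    """Return (statistic, degrees_of_freedom) for a rows-by-columns table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if variety and concord type were independent
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(col_totals) - 1)
    return statistic, df

# Rows: three varieties; columns: (plural concord, singular concord).
hypothetical_counts = [
    [30, 70],  # variety A (hypothetical)
    [25, 75],  # variety B (hypothetical)
    [25, 75],  # variety C (hypothetical)
]

CRITICAL_VALUE_DF2_P05 = 5.991  # standard chi-square critical value, df = 2, p = 0.05

statistic, df = chi_square(hypothetical_counts)
significant = statistic > CRITICAL_VALUE_DF2_P05
print(f"chi-square = {statistic:.3f}, df = {df}, significant at 0.05: {significant}")
```

With these invented counts the statistic (about 0.85, with 2 degrees of freedom) falls well below the 5 per cent critical value, so the small surplus of plural concord in the first row would not count as a significant regional difference.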
The use of light verb constructions with have, take, give, make etc. is probably on
the increase world-wide (Smith, Section II). Their relative distribution, with take being
more characteristic of AmE and have of BrE, continues, and AusE and NZE share the
British preference. Likewise the inventories of frequently occurring non-numerical
Epilogue 
quantifiers are much the same for AusE, NZE and BrE. They share 12 out of the 15
examples examined by Smith (Section III), and only 3 showed particular regional
biases: loads of being much more strongly associated with BrE than the other two varieties, while heaps of and a bunch of were strongly associated with AusE and NZE.
In the information packaging of clausal information, i.e. the use of nonbasic clause
formulations such as cleft sentences as focusing devices, Collins (Section V) found
marked similarities between AusE, NZE and BrE. All three contrast with AmE in their
strong endorsement of “dummy subject” structures using it and there to introduce the
focal item, in both spoken and written discourse. In AmE, especially in writing, the frequency of dummy subject use is distinctly lower than in the other varieties, and the reversed
pseudo-cleft more strongly preferred. On discourse-structuring features like these, the
norms of AusE/NZE usage are still pretty much in line with those of BrE. Neither they
nor the other points of complementation just discussed provide clear signs of endonormativity in the antipodean varieties.
3. Similarities between AusE and NZE grammar: An antipodean standard?
Quite a few of the research studies reported in this volume show that the grammars of
AusE and NZE pattern together in contrast with the major northern hemisphere varieties, to form a common antipodean standard. These points of AusE/NZE convergence
are to be found at several levels of syntax, and in four out of the five sections.
In several details of morphology, AusE and NZE have more in common with each
other than with BrE. One is their higher usage of nonstandard and nonstandardized
verb forms within ordinary discourse (Peters, Section I). Another is their selections of
first person pronouns in coordinated constructions (Quinn, Section I), where both varieties resist the more general trend towards using the accusative pronoun me (Wales
1995; Biber et al. 1999: 335). Instead they are quite strongly inclined to use “X and I”
coordination in nonsubject roles in speech, so much so that Huddleston and Pullum
(2002: 463) treat it as an acceptable variant of standard English. AusE and NZE speakers are
also more inclined than AmE and BrE speakers to use “myself” as an alternative to me
in coordinated constructions, whether subject or nonsubject. A third example can be
seen in lexical morphology, where Bardsley and Simpson (Section I) found that the
exploitation of hypocoristic forms with -ie is far more extensive in AusE and NZE than
anything reported for BrE or AmE.
These morphological differences entail the further question as to whether they
represent elements of endonormativity in the antipodean varieties. The answer for the
first depends on whether we regard use of a single past form for irregular verbs such
as ring (rung), shrink (shrunk) as (i) deviation from current standard English, or (ii)
accommodation to the larger two-part verb paradigm discussed by Peters (Section I).
In the first case, they would represent a kind of “colonial lag”, in the second a more pronounced move (than in BrE) in the direction of consolidating the English verb system,
and therefore endonormativity. In their use of hypocoristic forms with -ie, AusE and NZE
would seem to be quite definitely endonormative, since both have enormous inventories of them which are unparalleled in either northern hemisphere variety. Both also
produce hypocorisms with -o, but there are many more types instantiated in AusE than
in NZE (see further Section 4 below).
When it comes to variability within the verb phrase, the two southern hemisphere
varieties generally sit close to each other in the middle of the scale with BrE and AmE
at the extremities. This is true in the case of usage of the present perfect, where AusE
and NZE show greater frequency of use (like BrE) as well as greater tolerance of its
use with past-referring adverbs (like AmE), and thus have more in common with each
other than with the northern hemisphere varieties (Elsness, Section II). In their use of
mandatives, both AusE and NZE show greater distancing from BrE during the course
of the twentieth century. In Australia’s case, this could be attributed to AmE influence,
especially through the American media, but the same cannot be argued for NZE (Peters, Section II; 2008a).
Their similar levels of use of the mandative may therefore reflect antipodean lag, apart
from the freedom to reconfigure variants supplied by BrE and AmE. Corpus-based
studies of many other grammatical variables (Hundt 1998; Levin 2009) have found
that AusE and NZE tend to occupy an intermediate position between the more extreme
values of BrE and AmE. This supports the notion of a common antipodean standard,
though not its endonormativity. Yet when it comes to AusE and NZE use of progressive
forms of verbs (Collins, Section II), both show much higher frequencies of use than
contemporary BrE and AmE, and are at the extreme end of the scale. On this variable
they pattern together ahead of the northern hemisphere varieties, and demonstrate an
endonormative development in the antipodes.
Several other cases of endonormativity in the grammar of AusE and NZE can be
found among the studies of sentence relations. A clear example emerges in Kearns’s
research on the use of zero complementizers (Section IV), where both AusE and NZE
data show significantly higher rates for their use with extraposed clauses and with it
subject constructions. Another is the tendency for connective adverbs such as however,
thus, therefore to become conjunctions preceded simply by a comma, especially in
unedited AusE and NZE (Peterson, Section IV). Corpus-based research on these variables shows AusE/NZE to be leading the fray in the transformation of these adverbs,
with levels of frequency not matched at all in comparable data from BrE/AmE. The
antipodean developments in both of these areas might of course be regarded as either
(i) greater dereliction of the standard, or (ii) greater innovativeness in terms of extending the underlying rules of syntax, at least in association with certain lexical exponents.
Even as extensions to the existing rules, they might still be regarded as lexicogrammatical developments rather than syntactic innovations. Yet language diffusion and change
typically involve “lead words” (Aitchison 2001). So the use of however as a contrastive
conjunction, and the lack of complementizer with phrases such as in the belief, due to
the fact, would seem to be leading-edge developments in AusE and NZE, showing independent patterns of usage which are not matched in corpus material from the northern
hemisphere varieties.
In matters of discourse, AusE and NZE again seem to show similar trends which
contrast to a greater or lesser extent with the northern hemisphere varieties. In both
southern varieties, like has well-established interactional functions as a discourse marker
in spontaneous speech (Miller, Section V), which are found to vary with its sentence
position. This interface of like with sentence position, especially initial and final position, has not so far been discussed for standard northern-hemisphere English, though
Miller presents symptomatic evidence of its use in Scottish (Glasgow) English. Further
research on spoken data from standard BrE and AmE is needed to show whether such
discoursal uses of like are mostly to be found in the southern hemisphere; and whether
other discourse markers play similar roles in other varieties of English.
The same goes for intensifiers/swearwords such as fuck(ing), bloody, shit(ty) etc.
used as markers of informal discourse in (STL) varieties of English. Allan and Burridge
(Section V) found a comparable range in AusE and NZE conversation in the ICE corpora,
though the frequency of bloody in the antipodean corpora was actually
lower than that in the contemporary British Corpus of teenager language (COLT).
So much for its proverbial status as the “great Australian adjective”! The prevalence
of swearing in casual BrE conversation was confirmed by other evidence from the
British National Corpus (Rayson, Leech and Hodges 1997), where fuck, shit were
among the most highly significant words for under 35s, corresponding to bloody, bugger for the over 35s. Often-used swearwords lose their impact on the public ear, and
survey evidence cited by Allan and Burridge suggests that the great majority of British
people may be less offended by the use of swearwords than their counterparts elsewhere
in the English-speaking world. Sensitivity to the use of bugger and bloody as conversational intensifiers is probably much lower in BrE than AmE (Peters 2004: 74–5; 82–3).
All this suggests that on a scale of responsiveness to swearing and offensive language,
AusE and NZE occupy the middle, flanked by BrE at the lighter end of the scale and
AmE at the most intense. So despite some deeply rooted stereotypes, Australian and
New Zealand use of swearwords does not seem to be particularly extreme, or endonormative. They pattern together to show that while swearwords are not taboo in informal
discourse, they are not so freely used as in BrE. Larger corpora of conversational data
(especially for AmE) are needed to confirm these regional differences.
These are some of the points of usage on which AusE/NZE are closely aligned,
so as to suggest a common regional standard in the South Pacific – a shared lexicogrammar which is distinguishable from both BrE and AmE. Many of these features
show the antipodean preference for the less formal variant, as in the morphological
variables discussed as well as the relaxation of syntactic constraints on the use of zero
complementizer and combinations of past-referring adverbs with the present perfect.
In their relative tolerance of using swearwords as discourse markers, AusE and NZE
also seem to set themselves apart from both BrE and AmE. However the acceptance of
more informal features within the standard range suggests that there are larger issues
of style and register intersecting with the antipodean lexicogrammar, to be discussed
below in Sections 5 and 6.
4. Differences between AusE and NZE lexicogrammar: Independent national characteristics
Despite the commonalities between AusE and NZE, some of the research studies included in this volume point to differences between them. The question then
is how far such differences contribute to their separate identities and to national
standards within the South Pacific region. One difference already noted is the
unequal use of -o amid the shared set of hypocoristic devices discussed by Bardsley
and Simpson (Section I). The data show that -o formations are far more common in
AusE, occurring across a variety of communicative settings, not just occupational
contexts. Other chronological evidence points to the -o suffix being much more
strongly rooted in AusE, and not very productive in NZE (Peters 2009). Differing
levels of productivity of the same linguistic resource have not hitherto been factored into the descriptions of individual varieties, yet the strong presence or effective absence of a grammatical variable is an obvious point of difference.
A syntactic example is the use of but as a final particle, which has been documented
in AusE since the mid-nineteenth century (Australian National Dictionary citation
1858), and is also known in Scottish and Irish English. It is found in contemporary
AmE (Mulder, Thompson & Williams, Section V), though only as “hanging” but, not as
the turn-yielding discourse particle into which it has evolved in AusE. This pragmatic
function identified through the accompanying prosody is indexical of AusE, according
to the authors. Although the use of but as a sentence-final adverb is noted as “NZ and
Aust. colloquial” in the New Zealand Oxford Dictionary (2005), its place in NZE grammar does not seem to have been recognized, judging by its absence from Hundt (1998)
and Hundt, Hay and Gordon (2004). There is as yet no New Zealand research to challenge Mulder et al.’s claim about its status in AusE.
Asymmetries in the use of male/female genderized language are clearly declining worldwide, and especially in Australia and New Zealand (Holmes, Sigley &
Terraschke, Section III). But the authors document an interesting new development
particular to AusE during the last ten years in the rise of gender-neutral language to
provide nonsexist labeling for occupational roles. In the absence of comparable NZ
data from the twenty-first century, this can only be presented as a feature of AusE, not
a more general antipodean trend.
In cases where AusE and NZE pattern together in contrast with the northern
hemisphere, e.g. in their relatively frequent use of progressive verb forms overall, they
nevertheless show differing commitment to particular subtypes. AusE shows the greatest number of special discoursal uses of -ing (Collins, Section II), e.g. to express politeness or a particular attitude or interpretation. Meanwhile data from NZE includes a
wider range of complex forms of the progressive.
Examples like the last show the distinctiveness of AusE and NZE in their greater
readiness to innovate, in comparison with their northern hemisphere counterparts.
However NZE is also distinctive in its greater conservation of certain older grammatical forms, such as the canonical modals. Many varieties of English (both STL and IDG)
show a tendency to replace them with quasi-modals (e.g. must with have to), especially
in spoken discourse, as shown in Collins (2008). But NZE emerges as the least inclined
to do this, with lower frequencies than BrE, AusE or AmE, and its inventory of modals
is correspondingly higher than the other three (Collins, Section II). Meanwhile the
AusE adoption of quasi-modals puts it closest to AmE. The two varieties are thus at
quite different stages in replacing modals with quasi-modals.
In its expression of negation, NZE has been shown to be more conservative in
its relatively frequent use of no in comparison with not (Peters 2008b). Correlating
with this is the much higher frequency of no collocations found in NZE than in AusE
(Peters and Funk, Section IV), especially in writing. There the ICE-NZ data provide far
more of the lower frequency no collocations, showing that no-negation continues to
be an expressive resource for New Zealand writers, though not so obviously for their
Australian counterparts. Meanwhile in the relative frequency of using stereotypical no
collocations, NZE and AusE are much the same, in spoken and written discourse.
In several of these points of difference between them, AusE distinguishes itself by
greater tolerance of colloquial features than NZE, and its willingness to deploy them
in writing as well. By contrast, NZE grammar is distinguished most by its conservation
of more formal elements, especially in written style. The effect is to reduce register differentiation in AusE and to intensify it in NZE.
5. Register differentiation in AusE and NZE via the ICE corpora
Though regional differences were demonstrated for quite a few of the grammatical variables analyzed in this volume, register differences emerged as being more important in
accounting for the distribution of some of them. A notable case was the tendency to
select particular verbs from complementary pairs in spoken and written registers. Thus
start and stop are much more often found in speech, and begin and prevent in writing.
This pattern of selection was firmly embedded in the spoken and written data from
all three varieties of English (AusE, NZE, BrE), where none of the regional differences
proved statistically significant (Mair, Section IV). The same held for the selections of
singular or plural concord with collective nouns (Hundt, Section IV), where contexts of
spoken discourse prompted higher levels of plural agreement than written discourse
in all varieties of English.
The contexts of speech and writing made a noticeable difference to the types of
non-numerical quantifiers (NNQs) deployed. Those bleached of lexical meaning e.g.
stacks of references were far more common in spoken data (Smith, Section III), whereas
NNQs used in writing, e.g. band(s) of air, often meshed into the fabric of meanings
within the text. The effects of register (spoken and written) can also be seen in contrasting uses of light verbs (Smith, Section II). Though light verbs are often said to be particularly
associated with spoken discourse, those typically used in speech are the unadorned
light verb constructions, e.g. have/take a look, and colloquial paraphrases of it, e.g. have
a shoofty, take a nosey. In written discourse, simple light verb constructions are often
elaborated by means of premodification: gave a still faintly doubting smile, as a means
of adding descriptive and interpretive material into the discussion.
Different types of no collocations are associated with spoken and written discourse
(Peters and Funk, Section IV). The data from AusE, NZE and BrE all showed how
no collocations with adverbial roles (e.g. no more, no doubt) abound in writing, while
nominal phrase examples such as no way, no idea predominate in speech, and serve
as prefabricated elements of ad hoc conversation. Yet the more unusual nominal collocations – especially those embedding complex themes – can be found in NZ writing.
Here again it is the specific lexical content of the no construction that links it to the
written register, not the construction itself.
The association of some linguistic variables with particular kinds of discourse
reflects their function in the production of discourse and management of its content.
Of the several constructions used for information packaging, there existentials seem
to be particularly frequent in spoken or speech-like discourse, as a means of setting
up informal narrative. Collins (Section V) also found particularly high levels of ordinary pseudo-clefts in samples of learned writing, which extract the object from the
unmarked clause and make it the ongoing topic. The restructuring of clausal elements
in all such constructions helps to vary both sentence patterns and topical progression
in extended written text (Halliday & Matthiessen 2004).
Spoken discourse is of course strongly embedded in a context, and its immediacy
(the “it’s happening as we speak” quality) fosters its own special grammatical forms, as
with the so-called “narrative present”. Another marked form is the tendency of some
speakers – especially those reporting news – to use the present perfect rather than
simple past forms (Engel & Ritz 2000). In his study, Elsness (Section II) found high levels of the present perfect in the Australian corpus of talkback radio, and that instances
of the present perfect with past-referring adverbials were especially common. Whether
this remarkable combination will remain a special feature of live news reporting, or
increase its presence in other forms of speech (and eventually writing) remains to be
seen. The same present perfect structure is of course used regularly to express past
events in modern French (= passé composé) and other European languages.
The case of the present perfect in radio talkback and news texts suggests that particular subregisters of English, especially spoken registers, may foster particular grammatical innovations. However most of the studies reported in this volume have pooled
the spoken and written material from the various subcategories of the ICE-corpora,
in order to create data sets sufficiently large for experimental purposes. The individual
differences between the subtypes of speech and writing are then merged, as are any
regional differences within them. The contrastive findings for speech and writing nevertheless provide a baseline against which finer-grained differences in the grammar of
spoken and written discourse can be measured.
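The effect of pooling described above can be sketched numerically. The Python fragment below is a hypothetical illustration only: the subcategory labels are loosely modelled on the types of speech and writing mentioned in this chapter, every count and word total is invented, and the helper names per_10k and pooled_rates are assumptions rather than anything used by the contributors.

```python
from collections import defaultdict

# Hypothetical records: (mode, subcategory, occurrences of a variable, words sampled).
# Every figure here is INVENTED for illustration; none comes from the ICE corpora.
samples = [
    ("spoken", "private conversation", 40, 200_000),
    ("spoken", "scripted speech", 5, 100_000),
    ("written", "press", 6, 100_000),
    ("written", "academic", 2, 100_000),
]

def per_10k(hits, words):
    """Normalize a raw count to a rate per 10,000 words."""
    return hits / words * 10_000

def pooled_rates(records):
    """Pool all subcategories of each mode, then normalize the pooled totals."""
    totals = defaultdict(lambda: [0, 0])  # mode -> [hits, words]
    for mode, _subcategory, hits, words in records:
        totals[mode][0] += hits
        totals[mode][1] += words
    return {mode: per_10k(hits, words) for mode, (hits, words) in totals.items()}

# Rates for each subcategory taken separately
for mode, subcategory, hits, words in samples:
    print(f"{mode:8} {subcategory:22} {per_10k(hits, words):.2f} per 10,000 words")

# Rates after pooling by mode (speech versus writing)
for mode, rate in pooled_rates(samples).items():
    print(f"{mode:8} pooled                 {rate:.2f} per 10,000 words")
```

In this invented case the pooled spoken rate (1.5 per 10,000 words) conceals the gap between the conversation rate (2.0) and the scripted speech rate (0.5), which is the kind of merging of subtype differences noted above.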
6. Corpus-based analysis and sociolinguistic variation
The sample corpora used by contributors to this volume are most useful in supporting comparative studies of regional varieties of English, and providing data on high-frequency grammatical variables. They are less useful for research on lower-frequency
grammatical and lexical items, and those which intersect with sociolinguistic and register variables. Each of the ICE-corpora contains a range of types of writing, though
not very many samples of each, and the four major types of speech (private conversation, formal public discussion, monologue, scripted speech) represent extreme
spoken registers. In terms of documenting the grammar of dialogue, there is a large
gap between the first two, since the first is often casual or intimate as between friends,
while the second is typically institutionalized discourse, as of the lawyer in a courtroom or the teacher in a classroom. In each case the participant roles condition the
discourse produced. The first tends to reflect sociolinguistic variables very strongly,
the second to foreground institutional styles (Peters, Section II). So data from less
polarized speech contexts are needed, to show the common-denominator resources of
grammar, those which may be used in a wide range of communicative contexts.
This was the motivation for compiling the additional ART corpus of Australian
radio talkback, which is less constrained than institutional dialogue, but also less personal
than that of private conversation. As a concentrated sample of a particular subregister,
it has its own discourse imperatives, and so probably accentuates the use of the present
perfect with past-referring adverbs instead of the past tense (Elsness, Section II). It also
allows us to examine certain sociolinguistic variables more closely, and helps to demonstrate that the age range of those using like as a discourse marker definitely includes
adults (Miller, Section V). The study of genderized usage (Holmes, Sigley and Terraschke,
Section III) was usefully extended with evidence from ART, to show that the trend towards
using nonsexist terminology is probably not just a reflection of officially mandated usage
(otherwise found mostly in written administrative texts). The AusE community does
appear to have taken up various gender-neutral terms which are underrepresented in
the ICE data. These various findings suggest the value of collecting corpora of talkback/
chatshows of other varieties of English.
Apart from the need for more varieties of spoken discourse, the overall volume
of spoken discourse in the ICE-corpora is rather limited, and so is the number of different speakers represented. Much of the conversational data contained in ICE-AUS
and ICE-NZ comes from students, yet its coverage of the more informal uses of the
1st and 2nd person pronouns is symptomatic (Quinn, Section I), and much more is
needed across the sociolinguistic spectrum to account for antipodean trends which
run counter to the general tendency to use me as the default pronoun. Alternatively
such data might be obtained from population surveys of the community at
large, like those carried out in Australia through Australian Style since 1992. Australian survey data used to complement corpus data on irregular verbs (Peters, Section I)
helps to show their uptake in the first decade of the twenty-first century, and to show
that mandatives are still majority usage (Peters, Section II). Age graded data is particularly useful in providing insights into the extent to which linguistic innovations
are accepted across the community, or likely to be, if used regularly by younger and
middle-aged people. The age of speakers can be extracted from the catalogues of those
whose data is included in the ICE corpora, whereas the age of the writers included is
rarely indicated.
7. Conclusions: Larger evolutionary trends in AusE, NZE and world English
The contributions to Comparative Studies in Australian and New Zealand English:
Grammar and Beyond show how ongoing variation in antipodean English parallels
that of the major northern hemisphere Englishes, and where southern and northern
varieties diverge. Research on South African English is clearly needed to show how far
it too participates in these hemispheric differences, or whether those discussed in this
volume are strictly antipodean (i.e. South Pacific).
The elements of an antipodean English found in these studies lend substance to
the notion of a regional standard postulated by Goerlach (1990) among others. They
raise the possibility that this STL (settler) standard could exercise some influence in
the region over more recent indigenized (IDG) varieties. A few examples of Australian
lexical inputs to Fiji English have been found (Tent 2001), while the evidence of syntactic influence sought awaits confirmation in larger corpora (Hundt & Biewer 2007).
The possibility of regional influence on Singapore English can also be considered,
given its preference for the mandative subjunctive over should paraphrases – despite
its base in BrE (Peters, Section II). The fact that large numbers of Singaporean students
come to Australasian universities exposes them to antipodean usage which impacts on
their own English. Thus together or individually, AusE and NZE may act as a kind of
epicentre (Peters 2009).
These examples also show that IDG Englishes in the Pacific (and elsewhere) have
less commitment to the particular features of BrE or AmE which were their primary
input, and may indeed blend features of both, as has been found in Chinese written English (Peters 2003). The reconfiguring of linguistic material – BrE and AmE-based elements inherited from the colonial era – is the most pervasive finding among
the grammatical studies contributed to this volume. It probably reflects exposure
to both major northern hemisphere varieties via global media (BBC and CNN), and
the ever-increasing impact of global English material on the internet. The impact of
international forms of English on the development of individual varieties is a further
dimension for research on the evolution of new Englishes.
The research presented in the nineteen chapters of this volume adds to the body
of evidence on the trend towards greater colloquialization of English. So far this has
mostly been identified in northern hemisphere varieties (Mair & Leech 2006; Mair
2008), though it is also a characteristic of AusE (Peters 2001, 2006). AusE and NZE
are not only participating in it but are at the leading edge of importing spoken features of
English into writing. Where they diverge, AusE is generally more advanced than NZE
in the colloquialization of standard usage, since NZE writers maintain greater separation of the spoken and written registers.
Finally, these studies of variation and standardization in English grammar show
how interconnected the variables are with lexical selections. Indeed it is in the lexicogrammar that changes can be seen to begin (cf. Schneider 2003, 2007), establishing
themselves in certain idiomatic structures before generalizing as patterns of syntax.
However these lexicogrammatical changes are also involved with other language systems, e.g. with phonology in confirming the pragmatic role of sentence-final but; and
with punctuation in marking the evolution of connective adverbs into conjunctions.
Movements at the grammatical level interconnect thus with the language at large.
References
Aitchison, Jean. 2001. Language Change: Progress or Decay? (3rd edn). Cambridge: Cambridge
University Press.
Australian National Dictionary. 1988. William S. Ramson (ed.). Melbourne: Oxford University
Press.
Australian Style. 1992–2007. Vols 1–15. Sydney: Dictionary Research Centre.
Axelsson, Margareta W. 1998. Contractions in British Newspapers in the late 20th Century. Acta
Universitatis Upsaliensis: Uppsala University Press.
Biber, Douglas. 2004. “Historical patterns for the grammatical marking of stance: A cross-register
comparison”. Journal of Historical Pragmatics 5(1): 107–36.
Biber, Douglas, Edward Finegan & Dwight Atkinson. 1994. “ARCHER and its challenges: Compiling and exploring a representative corpus of Historical English registers”. In Udo Fries,
Gunnel Tottie & Peter Schneider (eds), Creating and Using English Language Corpora, 1–14.
Amsterdam: Rodopi.
Biber, Douglas, Geoffrey Leech, Stig Johansson, Susan Conrad & Edward Finegan. 1999. Longman
Grammar of Spoken and Written English. London: Longman.
Collins, Peter. 2008. “The English modals and semimodals: Regional and stylistic variation”.
In Nevalainen et al. (eds): 129–46.
Dictionary of New Zealand English. 1997. Harry Orsman (ed.). Auckland: Oxford University Press.
Engel, Dulcie & Marie-Eve Ritz. 2000. “The use of the present perfect in Australian English”.
Australian Journal of Linguistics 20(2): 119–40.
Goerlach, Manfred. 1990. Studies in the History of the English Language. Heidelberg: Carl Winter.
Halliday, Michael & Christian Matthiessen. 2004. Introduction to Functional Grammar (3rd edn).
London: Hodder Arnold.
Huddleston, Rodney & Geoffrey Pullum. 2002. Cambridge Grammar of the English Language.
Cambridge: Cambridge University Press.
Hundt, Marianne. 1998. New Zealand English: fact or fiction? Amsterdam: John Benjamins.
Hundt, Marianne. 2009. “Colonial lag, colonial innovation, or simply language change?” In Rohdenburg & Schlüter (eds): 13–37.
Hundt, Marianne & Carolin Biewer. 2007. “The dynamics of inner and outer circle varieties in
the South Pacific and Asia”. In Marianne Hundt, Nadia Nesselhauf & Carolin Biewer (eds),
Corpus Linguistics and the Web, 249–69. Amsterdam, Rodopi.
Hundt, Marianne, Jen Hay & Elizabeth Gordon. 2004. “New Zealand English: morphosyntax”. In
Kortmann et al. (eds): 560–92.
Kortmann, Bernd, Edgar W. Schneider & Kate Burridge (eds), 2004. Handbook of Varieties of
English. 2 vols. Berlin: Mouton de Gruyter.
Levin, Magnus. 2009. “The formation of the preterite and the past participle”. In Rohdenburg &
Schlüter (eds): 60–85.
Mair, Christian. 2008. Twentieth Century English: History, Variation and Standardization.
Cambridge: Cambridge University Press.
Mair, Christian and Geoffrey Leech. 2006. “Current changes in English syntax”. In Bas Aarts &
April McMahon (eds), Handbook of English Linguistics. Oxford: Blackwell, 318–42.
Nevalainen, Terttu, Irma Taavitsainen, Paivi Pahta & Minna Korhonen (eds), 2008. The Dynamics
of Linguistic Variation. Amsterdam: John Benjamins.
New Zealand Oxford Dictionary. 2005. Graeme Kennedy & Tony Deverson (eds), Auckland:
Oxford University Press.
Pawley, Andrew. 2004. “Australian vernacular English: Some grammatical characteristics”. In
Kortmann et al. (eds): 611–42.
Peters, Pam. 2001 “Corpus evidence on some points of Australian style and usage”. In Peter Collins
& David Blair (eds), English in Australia, 163–78. Amsterdam: John Benjamins.
Peters, Pam. 2003. “What is international English?” In Pam Peters (ed.), From Local to Global
English: Proceedings of Style Council 2001/2, 33–9. Sydney: Dictionary Research Centre.
Peters, Pam, 2004. Cambridge Guide to English Usage. Cambridge: Cambridge University Press.
Epilogue 
Peters, Pam. 2006. “Similes and other evaluative formulae in Australian English”. In Paul Skandera,
(ed.), Phraseology and Culture in English. Berlin: Mouton de Gruyter, 235–56.
Peters, Pam. 2008a. “Australian and New Zealand English”. In Hal Momma & Mike Matto (eds),
Companion to the History of the English Language. Oxford: Blackwell, 389–99.
Peters, Pam. 2008b. “Patterns of negation: the relationship between NO and NOT in regional
varieties of English”. In Nevalainen et al. (eds): 147–62.
Peters, Pam. 2009. “Australian English as a regional epicentre”. In Lucia Siebers & Thomas Hoffmann
(eds), World Englishes: Problems – Properties – Prospects. Amsterdam: John Benjamins.
Rayson, Paul, Geoffrey Leech & Mary Hodges. 1997. “Social differentiation in the use of English
vocabulary: Some analyses of the conversational component of the British National corpus”.
International Journal of Corpus Linguistics 2(1): 133–52.
Rohdenburg, Günter. 2006. “The role of functional constraints in the evolution of the English
complementation system”. In Christiane Dalton-Puffer, Dieter Kastovsky, Nikolaus Ritt &
Herbert Schendl (eds), Syntax, Style and Grammatical Norms, 143–66. Bern: Peter Lang.
Rohdenburg, Günter & Julia Schlüter (eds). 2009. One language, two grammars? Differences
between British and American English. Cambridge: Cambridge University Press.
Schneider, Edgar W. 2003. “The dynamics of new Englishes: From identity construction to dialect
birth”. Language 79: 233–81.
Schneider, Edgar W. 2007. Postcolonial English: Varieties around the World. Cambridge: Cambridge
University Press.
Tent, Jan. 2001. “A profile of the Fiji English lexis”. English World-Wide 22(2): 209–45.
Trudgill, Peter. 1998. “World Englishes: convergence or divergence?” In Hans Lindquist,
Staffan Klintborg, Magnus Levin & Maria Estling (eds), The Major Varieties of English,
29–34. Vaxsjo: Vaxsjo University Press.
Wales, Katie. 1995. Personal Pronouns in English. Cambridge: Cambridge University Press.
Index
Note: Items covered in the Table of Contents and the List of Abbreviations (pp. v–vii) are not
detailed in the index below.
1 pl form See first person
plural
2 pl form See second person
plural
a
a lot of See non-numerical
quantifier
Aboriginal English,
Australian 340
abusive function, of
swearing 363, 365, 368–9,
372, 383, 384
adjunct 120, 141n., 278, 287,
291
age differences in language
use 4, 5, 26–7, 109–12, 113,
326–8, 350–4, 356, 396
agentive suffix 49, 52, 62
agreement 1, 6, 39, 159–60,
162, 163, 166–9, 174–6, 178,
207–11, 213–15, 217, 218,
248, 304, 335, 354, 388, 394
See (also) concord
Algeo, John 139–40, 141, 142,
145, 151–2, 154
American English (AmE) 3–7,
14–15, 17, 19, 21, 22, 24,
27, 31–47, 73–87, 89–113,
115–22, 125–8, 130, 131,
135–6, 141–5, 151–2, 160,
164, 172, 178, 183–202,
207–10, 218, 227–8, 237,
243–61, 263–75, 277,
285–90, 295–316, 339–57,
364–5, 372, 384, 387–97
American influence 6, 136, 274
Americanization 3, 79, 84, 263,
271, 275
Andersen, Gisle 319, 320, 324,
326, 336
antecedent 209, 212, 213, 216,
310–11, 314
antipodean varieties of
English 3–7, 18, 79, 89,
101, 115, 117, 122, 127, 172,
178, 260, 263, 264, 274,
295, 315, 317, 336, 355, 361,
362, 375, 384, 388–92, 393,
396, 397
antipodean lag See colonial
lag
aspect 5, 24–5, 115, 118, 185, 210,
304, 314, 373
aspectuality 115, 117, 118,
120–1
asyndetic combination of
clauses 279, 290–1
attitudinal use of the
progressive 118, 120, 121
Australian Style 18, 26–7,
106–9, 113, 135, 396
Australianness 7, 339, 340, 355,
357, 361
b
baby-talk 53, 57
bare existential 305
bare infinitive 263–75
Bauer, Laurie 1, 24, 25, 36,
95–6, 209, 333n.
be going to See quasi-modal
begin See complementation,
verb
Biber et al. 15, 17, 19, 24, 31,
34, 36, 38, 41, 115–17, 131,
132, 136, 160–1, 163, 167,
168, 172n., 210, 225–8,
230, 236, 269, 297, 309,
313, 317, 389
blasphemy 363, 371, 375,
380, 383
bloody 7, 375–8, 379, 391
boilerplate 226–9, 232, 233, 238
brandname 56
British English (BrE) 2–7,
13–28, 31–47, 73–87,
89–113, 115–22, 125–36,
139–54, 159–78, 183–202,
207–18, 225–38, 243–61,
263–75, 277, 285–90,
295–316, 354, 387–97
business name 56
but, connective adverb 281n., 283, 291–2
but, discourse marker 319–20
but, final particle 7, 339–57,
392, 397
c
Cambridge Grammar of
the English Language
See huddleston and
pullum
case, grammatical 1, 4, 31–42,
46, 255n.
censorship 367, 372, 374, 380,
382, 384
Channell, Joanna 162, 164–7,
169, 171, 177
Chinese written English 397
cleft construction 7, 31, 33,
38, 41–2, 243, 245, 295–6,
298–303, 307–15, 317n.,
319, 326, 389, 394
collective noun See noun,
collective
Collins, Peter 36, 42, 94n.
colloquialization 79, 84, 86,
118, 121, 128, 397
colonial lag 19, 238, 390
complement clause 6, 131–2,
243–61, 263–75
complementation,
adjective 243–61
complementation, noun 6,
139–40, 148, 150–1, 152,
153, 159, 161–9, 174–6, 178,
243–61
complementation, verb 6,
131–2, 243–61, 263–75,
296, 303, 305, 307,
388–9
complementizer 6, 243–61,
263–75, 391, 392
complex determiner 162, 178
complex predicate 254–61
Comprehensive Grammar
of the English Language
See quirk et al.
concession 292, 345, 347–9
concord 6, 136, 207–18,
223, 224, 226, 388, 394
See (also) agreement
conjoined pronouns 34–41
conjunction 34–41, 253, 278n.,
320, 344, 390–1, 397
connective adverb/
adjunct 278, 280, 282,
285–7, 291
consciousness-raising 188,
196, 202
conservative attitude 184, 198
conservative language 7, 14,
73, 77, 79, 82, 84, 87, 126,
135, 214, 217, 228, 271, 295,
301, 315, 393
contraction 5, 46, 98, 115, 118,
121–2, 387
contrastiveness 7, 339, 340,
345–7, 350–1, 357, 391
conversation 18, 24–5, 27,
35–6, 38–41, 43, 46, 57,
83, 102, 105, 116–17, 125,
127–9, 131–5, 147,
164–5, 167, 168, 186,
209, 217, 226–8, 230,
232–3, 236, 238, 298,
301, 318, 323–4, 326,
330–1, 335, 339–57,
362, 371, 376, 391, 394,
395–6
coordination 4, 31, 33, 34–41,
59, 237, 389
coordinator 278–86, 291–2
corporate noun 211
d
definite subject 296, 304–5
delexicalization 6, 145, 159,
160, 162, 163, 176, 178
democratization 81, 83
demonstrative 31, 44–5, 313,
330
deontic modality 73, 80–4,
86, 129, 130
diachronic variation 50n.,
73, 77, 85, 86, 89, 92, 115,
126, 207, 208, 263, 264,
267, 269, 275, 295, 297–9,
301, 311
differentiation of varieties 2,
87, 228, 238, 274, 387–8,
393–5
diminutive 53, 361
discord See mixed concord
discourse function 2, 4, 7,
310–14, 317–36, 339–57,
378–9, 387, 389, 391–2,
394, 396
discourse, type 17, 18, 25, 27,
130–1, 134, 177, 185, 186,
187, 189, 227, 228, 23–5,
238, 389, 391, 393–6
discourse marker/particle 7,
317–36, 339, 340, 344,
354, 355, 378–9, 391,
392, 395
double negative 225–6
dozens, the 372, 373
dummy pronoun/subject 7,
296, 299, 301, 303, 306,
315, 389
dynamic modality 73, 80–2,
84–6
dysphemism 363–4, 366–70,
383, 384
e
endonormativity 2, 122, 127,
387–91
Engel, Dulcie 2, 94–5, 102,
106, 394
epicene 201, 202
epistemic modality 73, 80–6,
129
epithet 7, 362, 363, 368–73,
387–9
euphemism 53, 363, 364n.,
365, 367
evolution of English verb 13,
16, 27
existential construction 167–9, 175–6, 178, 236,
257–8, 295–316, 333, 394
exonormativity 2, 127, 131, 136
expanded predicate 139–40,
141, 142, 145
expletive function of
swearing 7, 361, 362, 363,
365–8, 369, 372, 373, 376,
378, 379, 381
extended existential 305–6
extraposition 243, 245–9,
252–3, 260, 295–316, 296,
298, 301, 303, 306, 307,
315, 390
f
Falkland Islands English 340,
343
Feedback column (Australian
Style) 106–12, 113
feminism 183–4, 187–90, 202
field 310–11, 314
field, semantic 60, 82
Fiji English 396
first person plural 38, 40, 44
flash language 362
floating quantifier
See quantifier
flyting 372–3
formal style 18, 25, 36, 41, 46,
55, 83, 126, 127, 128, 132,
134, 135, 136, 146, 177, 179,
209, 210, 218, 245, 247,
265, 270n., 304, 335, 363,
364n., 387, 393, 395
Fowler, Henry 19–20, 22, 125,
281n., 292n.
Fries, Charles 13, 16, 17, 126
g
gender as a sociolinguistic
variable 25, 330, 357
gender-marking 6, 183, 185–6,
191, 194–6, 202
genre 78, 79, 116, 172, 216, 237,
245, 247, 264, 274, 285,
288–90, 297, 301–3, 309, 315
Geordie See tyneside, variety
of english
gerund 141, 238, 263–76, 388
give See light verb
grammaticalization/
grammaticization 7, 159,
175, 176, 178, 271, 275, 339,
344, 354
h
Halliday, Michael 310, 394
have See light verb
have got to See quasi-modal
have to See quasi-modal
Hawaiian Creole 340, 343n.
heaps of See non-numerical
quantifier
heavy constituent 296, 306
hedging 141, 148–9, 326
help See complementation,
verb
however See connective adverb
Huddleston and Pullum 16, 120, 136, 140, 150,
160–1, 162, 226, 245,
278–80, 282, 283n.,
287, 290–2, 303n., 305,
307, 389
Hundt, Marianne 1, 7, 22, 23,
96, 115, 117, 118, 121, 122,
125–8, 130–1, 159, 225,
388, 390, 392, 396
hypocoristic 4, 49–51, 53–65
i
identificational
construction 38, 41–2
impersonality 299, 303, 315
Indian English 127, 130 See
(also) south african
indian english
indigenized English 125,
127–31, 134, 396
infinitive See bare infinitive,
to-infinitive
informal style 4, 18, 21, 33,
35, 36, 40, 43, 44, 46,
49, 53, 54, 78, 86, 102, 103,
105, 112, 128, 142–3, 146–7,
150, 151, 154, 164, 165, 177,
179, 209, 210, 218, 226,
245, 247, 255–7, 260, 265,
270, 272, 295, 304, 305,
307, 318, 324, 330, 331, 342,
355, 362, 387, 391, 392,
394, 396
information-packaging 7,
295, 297–9, 302, 303,
308, 315
-ing construction See gerund
-ing participle See progressive,
aspect
institutional discourse 18,
49, 56, 125, 132, 133–4, 395
intentional use of the
progressive 118, 121
interaction, speaker 132–3,
134, 186, 301, 319, 324,
339, 340, 349–50, 391
interpersonal role of
language 134, 139, 146–7,
148, 150, 154, 324, 336
interpretive use of the
progressive 118, 120–1
Irish English 341, 342–3, 392
irregular verb 13–28, 97, 389,
396
irregularization 15, 16, 23, 27
-IST Language 383, 384
it BE constructions 32, 41–2
it-cleft See cleft construction
it-subject construction 244–5,
247–50, 260
j
Jespersen, Otto 117, 139, 141n.,
225, 227, 246, 254n.
job title See occupational
label/term
k
Kortmann, Bernd 31, 32, 34,
225, 226
l
L1 dictionary 139, 142–3, 153,
153–4
L2 dictionary 139, 142, 143–5,
153–4
Labov, William 226, 357
language policy 184, 200
language-internal
variation 207, 208,
209–10, 218
lead word 391
Leech, Geoffrey 74, 77, 79,
83, 84, 85, 115, 118, 126,
231, 245, 267n., 304,
391, 397
lexicogrammar 1, 225–38, 274,
387, 390, 391, 392, 397
lexicogrammatical
variation 208n., 210,
216–18, 387–93, 396–7
lexicography 53, 125, 142–5
lexis 2, 60, 139
light verb 5, 139–54, 243,
254–7, 259, 260, 388, 394
like See discourse marker
linguistic markedness 184, 202
loads of See non-numerical
quantifier
Longman Grammar of Spoken
and Written English See
biber et al.
Longman Spoken and Written
English Corpus 34
loose talk 324–6
Looser, Diana 50n., 54, 62
lots of See non-numerical
quantifier
m
main clause 5, 6, 115, 118,
119–20, 277, 279, 282,
285, 291
Mair, Christian 74, 77, 79, 83,
115, 118, 121, 126, 128, 388,
394, 397
make See light verb
mandative subjunctive
(MS) 4, 5, 125–36, 212,
390, 396, 397
marker, sociolinguistic 226, 357
marker of solidarity 7, 320,
324, 370
matter-of-course use of the
progressive 121
military lexicon 62
mixed concord 210–18
modal 5, 73–87, 118, 119, 129,
130, 393
modal auxiliary 5, 73, 74,
83, 84
modality See deontic,
dynamic, epistemic
mode 310–11, 314
morphology 1, 4, 11, 15, 17,
19, 21, 23, 27, 55, 86, 254,
255n., 256, 389, 391
morphosyntax 1n., 34, 94, 355
multiple negation 226, 232,
238
must See modal auxiliary
n
native speaker dictionary See
l1 dictionary
necessity (modal) 73, 80–3,
136
need See modal auxiliary
negation 6, 135, 225–6, 227,
228, 232, 238, 275, 393
negative collocation 225, 227
negative concord 226
negative polarity 225, 226,
237
new information 296, 306, 312,
328, 333
no as adverb(ial) 225, 228,
230–1, 235, 238
no as determiner 225, 226, 227,
231, 237
no collocation 225, 229–38,
393, 394
no idea 230–5, 394
no-negation 6, 135, 225–7,
238, 393
no way 225, 226–7, 229, 231–3,
394
no worries 226, 230, 233
nominal phrase 225, 230, 231,
234–5 See (also) noun
phrase
nominalization 140–1, 307
non-numerical quantifier
(NNQ) 5–6, 159–79,
388–9, 394
non-sexist language 184–5,
202, 392, 396
nonnative speaker of
English 165
nonstandard language 2, 44,
96, 150, 225, 234, 242, 328,
340, 341, 355–7, 371
nonstandard verb form 4,
13–28, 389
non-subject coordinate See
subject coordinate
north-eastern England,
varieties 341, 342
northern hemisphere
(English) 5–6, 20–1, 115,
117, 122, 130, 185, 202, 210,
228, 232, 342, 384, 387, 389,
390, 391, 393, 396, 397
Northumberland variety of
English 341, 342
not any 227–8, 234–5, 237, 238
not-negation 6, 235, 238
noun 4–6, 55, 59, 131, 135, 139,
140, 150, 151, 159–63,
165, 166, 168–70, 200,
207, 210–14, 216, 217,
247, 254–60, 296, 363,
376, 378
noun, collective 1, 6, 159,
162–3, 207–18, 388, 394
noun complement 6, 139–40,
148, 150–3, 159, 161–3,
165, 167, 169, 174–6,
247–8
noun complement clause 243,
248–9, 252, 254, 255, 260
noun complementation See
complementation, noun
noun phrase (NP) 4, 5, 31, 34,
41, 44, 45n., 159, 162, 168,
210, 226, 227, 228, 245,
249–51, 254, 332 See (also)
nominal phrase
null complementizer See zero
complementizer
number transparency 161, 162,
165, 169, 172, 178
o
obligation 73, 80, 84
occupational label/term 4, 6,
49, 50, 54, 62, 65, 183, 184,
185, 191–201, 392
orthophemism 364n., 373
p
parataxis 290
participle, present See
progressive (aspect)
past participle 4, 13–19, 25, 97
past tense 13–28, 117, 118–19,
145, 389, 394–5 See (also)
preterite
pejorative 53
perfect aspect 5, 89–113,
118–19, 390, 392, 394–5
personal name 49, 55, 60, 63
Peters, Pam 2, 42, 92n., 162,
185
Philippine English 5, 125, 127,
132, 209n., 218n.
placename 49, 50n., 55, 57–8,
64, 65–8
plural agreement/concord 159,
162, 168, 207–18, 388, 394
politeness 53, 121, 195, 198,
333n., 378, 393
political correctness 189, 202
possessive me 4, 31, 43, 44, 46
postmodification 183, 185,
195, 233
pragmatic use of the
progressive 5, 115, 118,
120–1
pragmatics 304, 310, 315, 347,
392, 397
predictive use of the modal 5, 73
premodification 145, 185, 192,
194, 196, 198, 201, 394
prescriptivism 83, 135, 195,
278, 309
present perfect 5, 89–113, 119,
390, 392, 394–5
presupposition 296, 303, 310,
313
preterite 4, 5, 14, 89–113 See
(also) past tense
prevent See complementation,
verb
prison argot 54
profanity 363, 378, 380, 381,
383
progressive (aspect) 5, 115–22,
390, 393
pronominal concord 209–10,
213, 215–16, 223, 224
pronouns 4, 30–47, 207, 209,
210, 213, 214, 216, 247–52,
255, 296, 299, 306, 389,
396
proper name/noun 34, 41, 49,
55, 61, 62–4, 213
property See syntactic
property
prosody 320, 339, 340–1, 343,
344n., 345n., 346–8,
350, 392
pseudo-cleft See cleft
construction
pseudo-generic term 183–4,
190, 199, 201, 202
punctuation 256, 277, 279, 284,
286, 287, 290–2, 318n.,
330, 397
q
quantifier 5, 41, 45, 159–79,
389, 394
quasi-modal 5, 73–87, 130,
393
Quirk et al. 16, 92, 118, 126, 127,
142, 160–1, 164, 165, 179,
208, 210, 226, 245, 280,
307, 317, 319, 320
r
radio talkback See talkback
radio
reaction signal 225, 226,
227, 229–30, 233, 235,
236, 238
regional differentiation 27, 31,
139, 141, 142, 164, 228, 237,
274, 275, 391, 394
regional standard 387, 389,
390, 391, 396
regional variation 15, 17, 77,
125, 142, 151, 159, 207, 208,
218, 260, 261, 264, 271
register 18, 31, 38, 41, 57, 92,
99, 103, 105, 127, 132,
140, 142, 145, 147, 150, 151,
159, 160, 164, 165–6, 170–3,
227, 228, 295, 297, 302–3,
310, 342, 387, 392–4, 395,
397
register differentiation 127, 131,
139, 179, 238, 393
regular verb 15, 16
regularization 4, 13–28, 168,
275
relative clause 140, 243, 246,
255, 298n., 303, 305–6,
309, 310–11, 313, 314
rhetoric 225, 232, 237, 298, 315,
334, 372
rhyming slang 62
Romaine, Suzanne 184, 186,
191, 192, 199
routine collocation 243,
256–61
run-on sentence 6, 277, 279,
283, 284, 286, 287
rural lexicon 51–2, 61–2
s
Schneider, Edgar 2, 127–8, 129,
130, 131, 228, 238, 387, 397
Scots/Scottish English 151n.,
322n., 323, 325, 329, 334,
336, 341, 342, 391
second-language learner
dictionary See L2
dictionary
second person plural 21, 32,
34, 45–6
self-form 36, 37n.
semantic domain 49, 61–2,
64n., 65
semantic weight 159, 163
settler English (STL) 18, 125,
127, 128, 129, 130, 134,
342–3, 361, 387, 391, 393,
396
sexist language 183, 184, 190,
196–7, 200, 202, 383
shall See modal auxiliary
should See modal auxiliary
singular agreement/concord 6, 159, 162, 167, 207–18,
394
slang 53, 54, 57, 62, 150,
361, 363, 369, 378,
380, 384
Smith, Nicholas 83, 84, 115, 118,
119, 122, 267n.
social change 183, 184, 189,
191, 202
social function of
swearing 361, 363, 365,
368, 370–3, 382
social meaning 339, 340,
355–7
sociolinguistic marker See
marker, sociolinguistic
sociolinguistic variation 25–7, 395–6
solidarity marker See marker
of solidarity
solidarity function of
language 7, 53, 54, 320,
324, 355, 365, 370, 372
South African English 396
South African Indian
English 341, 343n.
southern hemisphere
English 15, 19, 21, 22,
27–8, 32, 117, 122, 130, 152,
187, 193, 202, 214, 216–18,
232, 238, 341, 343, 388,
390, 391
speech vs. writing 6, 17, 18,
74–5, 77–87, 99, 119, 120,
122, 130, 136, 146, 164, 225,
227, 230–3, 235, 264, 269,
273, 274, 298–301, 304–15,
321–2, 393–5
speech-like, genre 301,
303, 394
spoken discourse 17, 18, 25,
185, 230, 232, 238, 393,
394, 396
spoken English 84, 86, 125,
136, 185, 186, 202, 218,
269, 387
spoken medium 177, 209, 264
spoken usage 18, 129, 146,
202, 207
standard See regional standard
standardization 274, 397
start See complementation,
verb
stop See complementation,
verb
strong verb See irregular verb
stylistic variation 3, 5, 73,
75, 78, 79, 83, 87, 118,
126–8, 130, 133, 134, 136,
207–10, 218, 234, 238,
264, 265, 279, 299–303,
315, 387
stylistic function of
swearing 361, 365, 373–8
suasive verb 131, 134–5
subject coordinate 34–40
subject-verb agreement 6,
39, 136, 167–70, 207–18,
304
subordinate clause 119–20, 212,
282, 296, 306, 307
subordinator 277–83, 291
subregister 395
summative discourse role 312
swearing 7, 361–85, 391
syntactic category 1, 25, 33,
34, 37, 42, 45, 83, 84, 115,
131, 135, 136, 207, 225–8,
233, 235, 238, 243–5, 246,
248, 252, 254, 255, 259,
260, 264, 277–9, 283, 284,
290, 291, 306, 318,
319, 323, 324, 355,
390, 392
syntactic property 42, 83, 87,
159, 260, 265, 275, 277,
280, 283, 284, 318, 319, 320
syntactic unit 291, 306, 318
syntax 1, 135, 226, 260, 264,
291, 324, 387, 389, 390, 397
system-sentence 291
t
taboo 7, 20, 362, 363, 366–7,
369, 371, 372, 373, 375, 378,
380, 383, 391
take See light verb
talkback radio 3, 55, 97, 112,
134, 145, 150, 186, 193, 317,
318, 330–1, 343, 348, 362,
394, 395–6
Taylor, Brian 49, 50n., 64,
342, 361
temporal frame 117–18
temporal specification 103–4,
113
temporariness 117
tenor 310–11, 314, 387
tense (past) See past tense
terms of address 183
text-sentence 291
than comparative 32, 41–2
there See existential
construction
therefore See connective adverb
though 341, 344, 353, 355
thus See connective adverb
to-infinitive 38, 86, 131, 140,
263–75, 388
Tottie, Gunnel 225–7,
234–5, 236
Trudgill, Peter 18, 25, 90, 92,
95n., 105, 110n., 208, 210,
265, 341, 361, 387
truncation 53, 57, 58, 61–4
turn organization 339, 340
Tyneside, variety of
English 341, 342
u
Underhill, Robert 325–6, 328
unmarked infinitive See bare
infinitive
v
vague language 5, 90, 91, 95,
103–5, 107, 111, 113, 162,
164–5, 169, 178, 179
variability 13, 15, 17, 35, 141,
165, 168, 178, 207, 230, 264,
265, 267, 269, 274, 275, 390
variable, linguistic 4, 5, 18–19,
21, 22, 27, 77, 80, 118, 122,
209, 263, 266, 268, 271–2,
274, 390, 392
variable (pattern) 159, 161, 165,
207–11, 263, 265
variant, linguistic 13, 22, 31, 43,
46, 143, 150, 265, 268, 271,
275, 364, 389, 391
variation See diachronic,
language-internal,
lexicogrammatical,
regional, sociolinguistic,
stylistic
variety of English See
aboriginal, american,
antipodean, british,
chinese written, falkland
islands, fiji, indian, irish,
northern hemisphere,
northumberland,
philippine, scots, south
african, south african
indian, southern
hemisphere, tyneside
verb 4–6, 13–17, 19–21, 24–8,
33, 37, 40, 45, 52, 91, 97–8,
100, 101, 104, 107, 129,
131–3, 135, 136, 159, 161,
162, 166, 167–9, 174, 175,
199, 200, 208, 210, 212, 213,
215, 226, 229, 243, 244–9,
251, 254–6, 260, 263, 264,
266, 267, 271–2, 274–5,
303, 310, 362, 363, 388–90,
393, 394
verb complementation See
complementation, verb
verb form (part) 14–16, 17,
18, 20, 21–4, 26, 27, 28, 39,
89, 90, 92, 94–7, 99, 104,
105, 107, 109, 110,
112, 113, 393
verb, light See light verb
verb morphology 1, 15, 17, 19,
21, 27, 256, 389
verb paradigm 13–17, 19, 21,
26, 27–8, 389
verb phrase 4, 226, 390
verb(al) agreement/
concord 162, 163, 165–9,
174–6, 178, 209, 210, 213,
215–16, 223, 224, 304
verbalization 141
vernacular English 19, 43,
44, 125, 128, 132, 134, 225,
341n., 361, 375
volitional use of modal 73, 85
w
Wales, Katie 32, 34, 37, 38, 42,
44–6, 389
want to See quasi-modal
weak verb See irregular verb
wh-cleft See cleft construction
Wierzbicka, Anna 49, 50n.,
53, 140, 141–2, 145, 146,
361, 379
will See modal auxiliary
word coinage 49, 50, 52, 54, 59,
61, 303n., 384
written discourse 17, 18, 130,
131, 227, 228, 230, 232–4,
389, 393–5
written English 20, 22, 83,
86, 128, 146, 148, 154, 160,
164–6, 177, 179, 183, 185,
264, 270, 277, 283, 285,
297, 387
written medium 7, 96, 209, 214
y
y’all 4, 31, 33, 45, 46
you guys 31, 32, 45
you lot 45, 46
yous(e) 4, 31, 32, 33, 45, 46, 342
z
zero complementizer 6, 243–61,
390, 392
In the series Varieties of English Around the World the following titles have been published thus
far or are scheduled for publication:
G40 Hoffmann, Thomas and Lucia Siebers (eds.): World Englishes – Problems, Properties and Prospects.
Selected papers from the 13th IAWE conference. xix, 432 pp. + index. Expected September 2009
G39 Peters, Pam, Peter Collins and Adam Smith (eds.): Comparative Studies in Australian and New Zealand
English. Grammar and beyond. 2009. x, 406 pp.
G38 Sedlatschek, Andreas: Contemporary Indian English. Variation and change. 2009. xix, 363 pp.
G37 Schreier, Daniel: St Helenian English. Origins, evolution and variation. 2008. xv, 312 pp.
G36 Murray, Thomas E. and Beth Lee Simon (eds.): Language Variation and Change in the American Midland.
A New Look at ‘Heartland’ English. 2006. xii, 320 pp.
G35 Hickey, Raymond: Dublin English. Evolution and change. 2005. x, 270 pp. (incl. CD-Rom).
G34 Mühleisen, Susanne and Bettina Migge (eds.): Politeness and Face in Caribbean Creoles. 2005.
viii, 293 pp.
G33 Lim, Lisa (ed.): Singapore English. A grammatical description. 2004. xiv, 174 pp.
G32 Hackert, Stephanie: Urban Bahamian Creole. System and variation. 2004. xiv, 256 pp.
G31 Thompson, Roger M.: Filipino English and Taglish. Language switching from multiple perspectives. 2003.
xiv, 288 pp.
G30 Aceto, Michael and Jeffrey P. Williams (eds.): Contact Englishes of the Eastern Caribbean. 2003.
xx, 322 pp.
G29 Nelson, Gerald, Sean Wallis and Bas Aarts: Exploring Natural Language. Working with the British
Component of the International Corpus of English. 2002. xviii, 344 pp.
G28 Görlach, Manfred: Still More Englishes. 2002. xiv, 240 pp.
G27 Lanehart, Sonja L. (ed.): Sociocultural and Historical Contexts of African American English. 2001.
xviii, 373 pp.
G26 Blair, David and Peter Collins (eds.): English in Australia. 2001. vi, 368 pp.
G25 Bell, Allan and Koenraad Kuiper (eds.): New Zealand English. JB/Victoria UP, 2000. 368 pp.
G24 Huber, Magnus: Ghanaian Pidgin English in its West African Context. A sociohistorical and structural
analysis. 1999. xviii, 322 pp. (incl. CD-rom).
G23 Hundt, Marianne: New Zealand English Grammar – Fact or Fiction? A corpus-based study in
morphosyntactic variation. 1998. xvi, 212 pp.
G22 Görlach, Manfred: Even More Englishes. Studies 1996–1997. With a foreword by John Spencer. 1998.
x, 260 pp.
G21 Kallen, Jeffrey L. (ed.): Focus on Ireland. 1997. xviii, 260 pp.
G20 Macaulay, Ronald K.S.: Standards and Variation in Urban Speech. Examples from Lowland Scots. 1997.
x, 201 pp.
G19 Schneider, Edgar W. (ed.): Englishes around the World. Studies in honour of Manfred Görlach. Volume 2:
Caribbean, Africa, Asia, Australasia. 1997. viii, 358 pp.
G18 Schneider, Edgar W. (ed.): Englishes around the World. Studies in honour of Manfred Görlach. Volume 1:
General studies, British Isles, North America. 1997. vi, 329 pp.
G17 Patrick, Peter L.: Urban Jamaican Creole. Variation in the Mesolect. 1999. xx, 329 pp.
G16 Schneider, Edgar W. (ed.): Focus on the USA. 1996. vi, 368 pp.
G15 de Klerk, Vivian (ed.): Focus on South Africa. 1996. iv, 328 pp.
G14 McClure, J. Derrick: Scots and its Literature. 1996. vi, 218 pp.
G13 Görlach, Manfred: More Englishes. New studies in varieties of English 1988–1994. 1995. 276 pp.
G12 Glauser, Beat, Edgar W. Schneider and Manfred Görlach: A New Bibliography of Writings on
Varieties of English, 1984–1992/93. 1993. 208 pp.
G11 Clarke, Sandra (ed.): Focus on Canada. 1993. xii, 302 pp.
G10 Fischer, Andreas and Daniel Amman: An Index to Dialect Maps of Great Britain. 1991. iv, 150 pp.
G9 Görlach, Manfred: Englishes. Studies in varieties of English 1984–1988. 1991. 211 pp.
G8 Görlach, Manfred and John Holm (eds.): Focus on the Caribbean. 1986. viii, 209 pp.
G7 Penfield, Joyce and Jack Ornstein-Galicia: Chicano English. 1985. vii, 112 pp.
G6 Petyt, K.M.: 'Dialect' and 'Accent' in Industrial West Yorkshire. 1985. viii, 401 pp.
G5 Görlach, Manfred (ed.): Focus on Scotland. 1985. iv, 241 pp.
G4 Viereck, Wolfgang (ed.): Focus on: England and Wales. 1984. iv, 304 pp. (includes 40 maps).
G3 Viereck, Wolfgang, Edgar W. Schneider and Manfred Görlach (comps.): A Bibliography of Writings
on Varieties of English, 1965–1983. 1984. iv, 319 pp.
G2 Day, Rita (ed.): Issues in English Creoles. Papers from the 1975 Hawaii Conference. (Julius Groos) Heidelberg,
1980. iii, 188 pp.
G1 Lanham, Len W. and C.A. MacDonald: The Standard in South African English and its Social History.
(Julius Groos) Heidelberg, 1979. 96 pp.
T9 Mühlhäusler, Peter, Thomas E. Dutton and Suzanne Romaine: Tok Pisin Texts. From the beginning
to the present. 2003. x, 286 pp.
T8 McClure, J. Derrick: Doric. The dialect of North-East Scotland. 2002. vi, 222 pp.
T7 Mehrotra, Raja Ram: Indian English. Texts and Interpretation. 1998. x, 148 pp.
T6 Winer, Lise: Trinidad and Tobago. 1993. xii, 368 pp.
T5 Wakelin, Martyn F.: The Southwest of England. 1986. xii, 231 pp.
T4 Platt, John, Heidi Weber and Mian Lian Ho: Singapore and Malaysia. 1983. iv, 138 pp.
T3 Macafee, Caroline: Glasgow. 1983. v, 167 pp.
T2 Holm, John: Central American English. (Julius Groos) Heidelberg, 1982. iv, 184 pp., + tape.
T1 Todd, Loreto: Cameroon. (Julius Groos) Heidelberg, 1982. 180 pp., 1 map.