Internet language in user-generated comments

Linguistic analysis of data from four commenting groups
Internetspråk i användarkommentarer
Lingvistisk analys av material från fyra kommenterande grupper
Jenny Dahlström
Faculty of Arts and Education
English
English III: Degree project in linguistics
15 hp
Supervisor: Pia Sundqvist
Examiner: Solveig Granath
Autumn 2012
Title: Internet language in user-generated comments: Linguistic analysis of data from four commenting groups
Title in Swedish: Internetspråk i användarkommentarer: Lingvistisk analys av material från fyra läsargrupper
Author: Jenny Dahlström
Pages: 33
Abstract
The present study examines typical features of internet language found in user-generated comments
collected from commenting groups from four online magazines aimed at different readerships: (1)
adult women (Working Mother and Mothering), (2) adult men (Esquire), (3) young women
(Seventeen) and (4) young men (Gameinformer). Approximately 5,000 words from each
commenting group were collected, creating a 21,087-word corpus which was analyzed with regard to
typographic features (emoticons, nonstandard typography of and and of the personal pronouns you and I),
orthographic features (abbreviations, acronyms) and syntactic and stylistic features resembling
spoken language (contracted forms, ellipsis of subject and/or verb and commenting tone). The
results show that adult men wrote the longest comments, followed by adult women, young men and
young women in descending order. Furthermore, as for the typical features regarding typography
and orthography, it was found that among the four commenting groups, adult men and adult women
used them very sparsely, young men used them occasionally and young women used the features
most frequently. The analysis of tone showed that adult men mostly used an aggressive or neutral
tone, while adult women, young women and young men mostly used a friendly or neutral tone.
Young women used an aggressive tone more often than adult women and young men. Moreover,
regarding the syntactic and stylistic features, results revealed that the young men were the most
frequent users of ellipsis of subject and/or verb, followed by adult women, young women and adult
men. Contracted forms were used extensively in the potential places of contractions, regardless of
commenting group. Since young men used the ellipsis of subject and/or verb most frequently of all
commenting groups and also used the contracted forms in all potential places of contractions, the
conclusion is that the young men used a style that is closer to spoken English than the three other
commenting groups.
Keywords: asynchronous CMC, internet language, netspeak, chatspeak, user-generated content,
user-generated comments, reader responses, gender
Summary (translated from Swedish)
This study examines linguistic features that are typical of language on the internet. The material
examined was collected from user comments in online magazines aimed at four different
readerships: (1) women (Working Mother, Mothering), (2) men (Esquire), (3) young women
(Seventeen) and (4) young men (Gameinformer). Approximately 5,000 words were collected from the
comment sections of each magazine, resulting in a corpus of 21,087 words in total. The corpus
was analyzed with regard to typographic features (smileys, nonstandard spelling of
the personal pronouns I and you and of and) and orthographic features (abbreviations, acronyms)
as well as syntactic and stylistic features reminiscent of spoken language (contractions, ellipsis of
subject and/or predicate verb, tone). The results showed that men wrote the longest
comments, followed by women, young men and young women in descending order. As for typical
typographic and orthographic features, the results show that they occurred very sparsely in
the women's and the men's data, that they occurred occasionally in the young men's data and that
the young women were the ones who used these features most frequently. The analysis of tone in
the user comments showed that men mostly used an aggressive or neutral tone, while
women, young women and young men mostly used a friendly or neutral tone. Young women
used an aggressive tone more often than women and young men. In addition, the results showed that
ellipsis of subject and/or predicate verb was most frequent in the young men's user comments,
followed by those of the women, the young women and the men. Contracted forms were used almost
without exception throughout the corpus. Since the boys showed the most frequent use of ellipsis of
subject and/or predicate verb and used contracted forms to the full extent, the
conclusion can be drawn that the young men's syntax is more influenced by spoken English than the
syntax of the three other commenting groups.
Keywords: internet language, user comments
Contents
1. Introduction
1.1 Presentation of aims
2. Theoretical background
2.1 Internet language
2.2 Typographic and orthographic features
2.3 E-grammar and level of formality
2.4 Internet language and gender
2.5 Internet language among young men and young women
3. Methods
3.1 Material
3.3 Methods of analysis
3.4 Ethical considerations
3.5 Methodological considerations
4. Analysis and results
4.1 Length and number of comments
4.2 Typographic features: emoticons and typographic respellings
4.3 Orthographic features: abbreviations
4.4 Syntactic features and level of formality
4.5 Commenting tone
4.6 Further remarks on the results
5. Discussion: Implications for the future of the English language
6. Conclusion and future research
References
Appendix A: Emoticons found in the corpus
Appendix B: Table of additional contractions
1. Introduction
There have been various attempts to define the language used on the Internet. Squires
(2010:457) explains that internet language is the variety of language commonly used for
communication on the internet and in other types of electronic communication such as
mobile phone text messages. The terms netspeak, chatspeak, computer mediated
communication (CMC) and electronically mediated language are also employed to describe
language used on the internet (Baron, 2009:43, Squires, 2010:457-459). Abbreviations,
blends, respelling of words and emoticons are examples of features that would be considered
typical of this variety of language. There are also grammatical structures that are typical of
internet language. For example, it is common to leave out the subject of a sentence
and to use punctuation in a more informal, and often exaggerated, way. Though
the features mentioned above are often considered typical of the language variety, scholars in
this field stress that the language used on the internet varies with its user, its purpose and the
genre in which the communication occurs (e.g., Baron, 2004:398, Hård af Segerstad,
2002:14, 16-21, Squires, 2010:463). In this paper the term internet language will be used. It
is a term introduced by Lauren Squires, who uses it to imply that it is a language variety that
has emerged from and is used on the internet (Squires, 2010:485). It should be stressed that
the term does not imply that the features of internet language are used only on the internet; they can be found elsewhere as well.
Research about internet language often examines to what extent the language variety consists
of features from spoken language. Internet language is predominantly written, yet it consists
of features of both spoken and written language, according to Squires (2010:462). The fact
that the language variety is informal, synchronous and ephemeral means that it contains
features typical of speech and the fact that it is editable, text-based and asynchronous means
that it contains features typical of writing. Research has shown that asynchronous internet
language, such as blogging, holds a position closer to writing whereas synchronous internet
language, such as communication in chatrooms, is closer to speech (Herring, 2007:8).
What impact does internet language currently have on changes in the English language?
What impact will it have on English in the future? Though the answers to these questions
must be mere speculations, they are often reflected in public discourse. Squires (2010:467,
475) explains that discussions in the mass media about this topic are often heated. Many
people are afraid of language change and take a prescriptive view, stating that Standard
English is superior to the more informal language used on the internet. The same opinions
were found in a study of articles from the print media in the early 2000s, where Thurlow
(2006:667) examined headlines and found that they warned about the effects of internet
language on language change. But will internet language make Standard English change at an
alarming rate as the articles predicted? Will it dumb down the English language? Will users
of English not still be able to understand in which contexts informal or formal language is
suitable? It is important to find answers to these questions, both to avoid prejudices against
the language variety and for the purpose of making a contribution to the public discussion,
with arguments from scholarly research.
1.1 Presentation of aims
The aim of this study is to examine the frequency of features typical of internet language and
relate the findings to different age and gender groups. The research questions are:
- To what extent are features considered typical of internet language used in user-generated comments in online magazines?
- What linguistic differences can be seen in the language used in the user-generated comments from the following commenting groups: (1) adult men, (2) adult women, (3) young men and (4) young women?
- What does scholarly discourse imply about the impact of internet language on the future English language? What impact of internet language on English can be seen in this study?
In order to answer these questions, the extent of some typical features present in internet
language is investigated using data from user-generated comments. The user-generated
comments are appended to articles, blog posts and columns in online magazines geared
towards four different groups of readers: adult men and adult women plus young men and
young women. The material on which the study is based consists of a corpus of
approximately 5,000 words from user-generated comments from each commenting group.
The corpus is analyzed with regard to linguistic differences between the commenting groups
as regards the frequency of nonstandard typography, abbreviations, acronyms, contractions,
ellipsis of subject and/or verb, commenting tone and level of formality in language use.
The choice to study abbreviations and acronyms is based on Squires’ (2010:467) information
that these are the nonstandard written features most commonly presented by the mass media
as typical of internet language in the first decade of the 21st century. Ellipsis of subject and/or
verb is studied based on Herring’s (2011:5) description of the syntax of internet language as
fragmented and Crystal’s (2006:467) claim that the non-standard usage of verbs is a feature
of the grammar of internet language. The study of ellipsis of subject and/or verb together
with contractions provides results at the syntactic level. In order to provide results at the
stylistic level, each user-generated comment is categorized according to the tone of language
and content. The method to categorize comments according to tone is based on Herring and
Kapidzic’s (2011) previous study on tone in instant messaging (IM). Herring (2007:19) has
earlier described tone as the manner in which the communication is carried out.
The texts in the corpora were run separately in the free version of the software program
Linguistic Inquiry and Word Count (LIWC), which can be found online.1 The program
automatically calculates the frequencies of different categories of words, such as self-references (I, me, my), social words, positive and negative emotions, cognitive words, articles
and words with more than six letters (big words). These linguistic dimensions are indications
of the level of formality in the writings as they are compared to percentages provided by
LIWC for reference to formal and personal texts. Previously this method has been used in an
analysis of teen chatroom language made by Herring and Kapidzic (2011) and in my study it
provides quantitative results at the word level.
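The kind of word-level count described above can be illustrated with a minimal Python sketch. It is not the LIWC program itself: the two word lists below are small hypothetical stand-ins for the much larger LIWC dictionaries, and the function names are my own. The sketch only shows the principle of expressing each category as a percentage of all words in a text, including the share of words with more than six letters.

```python
import re

# Hypothetical mini word lists; the real LIWC dictionaries are far larger
# and are not reproduced here.
CATEGORIES = {
    "self_references": {"i", "me", "my", "mine", "myself"},
    "articles": {"a", "an", "the"},
}

def tokenize(text):
    """Lowercase the text and return word tokens (internal apostrophes kept)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())

def category_percentages(text):
    """Return each category's share of all words in the text, in percent."""
    words = tokenize(text)
    total = len(words)
    if total == 0:
        return {}
    result = {
        name: 100 * sum(w in vocab for w in words) / total
        for name, vocab in CATEGORIES.items()
    }
    # "Big words": words with more than six letters (apostrophes not counted).
    big = sum(len(w.replace("'", "")) > 6 for w in words)
    result["big_words"] = 100 * big / total
    return result

sample = "I think the article is wonderful and my sister agrees"
print(category_percentages(sample))
```

Percentages obtained in this way can then be set against reference values for formal and personal texts, which is the comparison the study relies on.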
2. Theoretical background
As discussed above, various attempts have been made to define internet language. Crystal
(2006:20) characterizes it as a language with features unique for the internet and explains
that it has originated from a medium that is electronic, global and interactive. Squires
(2010:463), among others, does not agree with this definition since many of the features
Crystal describes as part of internet language are not new or unique: they have been used
before, for example in personal notes and telegraph messages. She explains that although
there are features that have emerged through the use of the internet as a medium, they are
neither used in the same way by all users nor used in all situations of communication on the
internet (Squires, 2010:463). The use of internet language varies with purpose and genre – as
all language use does. The divergence between Crystal’s and Squires’ views of internet
language implies that research about language use and communication on the internet has
not yet established its boundaries. Scholars are still breaking ground to investigate the impact
internet language has and will have on the English language.
This section will present internet language and give a historical as well as a current picture of
the variety. The discrepancy in opinions between scholars of this particular research area and
the print media is discussed, as are differences in language use between women and men.
1 LIWC, http://liwc.net/tryonline.php
Research about grammatical features considered typical of internet language is also
presented below.
2.1 Internet language
When trying to decide how internet language should be defined, several ideological issues
need to be considered. Squires (2010:458) points to the following ideologies which she finds
central when enregistering internet language:
- Linguistic correctness (internet language compared to Standard English)
- Distinction between what happens in “real” life (IRL) and what happens on the internet (virtual reality, VR)
- Technology-driven language change (technological determinism)
- Social acceptability
- English language protectionism
During the 1990s when research began in this field, the language used in CMC was
considered to be a result of the medium used to produce it, i.e., a result of the computer. It
was implied that the language variety and the words used in that variety were a direct result
of technology (Squires, 2010:461). The term for this view is technological determinism, and it
gives the place where communication occurs, i.e., the computer, precedence over
the actual language used to communicate. Today linguists who study language on the internet
take into consideration other ideological issues, such as the common opinion that Standard
English is superior to internet language and that the use of internet language is not
considered politically correct as it is assumed to differ greatly from Standard English. These
are sociocultural and historical considerations that need to be a part of the discussion about
this particular variety of language, according to Squires (2010:460). In addition, there is
another sociocultural question to address and consider: The speech community of internet
language is to a great extent heterogeneous – the people of the speech community are
different regarding for instance nationality, social characteristics, age and gender. This
means that internet language will vary with its users and the genre and situation in which
communication takes place (Hård af Segerstad, 2002:14, 16-21).
In the early days internet language was pictured more like a new lexical register than a new
language variety. The mass media printed glossaries of netspeak that were translated into
Standard English so new users could understand the jargon (Squires, 2010:465). Even today
technological determinism is highly present in public discourse and often reflected in the
mass media. This has in some ways determined people’s views on internet language and
established it as a more fixed language variety than it actually is, according to Squires
(2010:461, 464). Thus, ideological issues, as well as the picture produced by the mass media,
have affected public discourse on the subject over the last twenty years. This has made people
anxious about language change and an attempt to control the trends of language change is
made through prescriptivism, according to Thurlow (2006:668), who also states that many
blame technology for this decline in standards. Thurlow (2006:679, 686) explains further
that the mass media caricature online language by means of a selected set of features. It
is thereby presented as a threat to literacy, and this makes people in general look at the variety with an
exaggerated fear of drastic changes in language.
To find out whether the picture of internet language that Thurlow found in the early 2000s is
still present in public discourse online, I searched the social network site Facebook for online
groups about netspeak and chatspeak. Many groups sending the same kind of messages were
found. The groups convey that internet language is a language variety considered inferior and
a threat to future English. The groups have names like “Chatspeak is an insult to the English
language”2 and “Your chatspeak pisses me off; learn to type real words”.3 My search for the
opposite stand – groups that argue for the use of internet language – resulted in nothing on
the same social network site; there were no groups to be found on Facebook that defended
the use of electronically mediated language.
Another example that shows evidence of the picture presented by Thurlow is the online
collaborative Urban Dictionary and its definition of the word chatspeak. Users of chatspeak
are there said to be looked at in a disparaging way by people who have the decency to type out
full sentences and know how to spell correctly. The following definitions of chatspeak were
found in the Urban Dictionary:
(1) Also known as webspeak, chatspeak is basically an illiterate way of typing, and a way
to massacre a language. Shortening words (such as you to u), insisting on ignoring
captials [sic], making words numbers, (such as 2 or 4) and not using endmarks are all
parts of chatspeak. For most people it annoys them shitless, but certain people insist.
(2) This is a form of speech in which one shortens words and replaces the letter "s" with
the letter "z" in an effort to save time and look cool. Chatspeakers also rarely use
capitalization or correct punctuation. Chatspeakers are generally looked down on by
people who can actually spell and who have enough self-respect to type out a real
sentence. Chatspeak can never be considered 'literate'.4
2 https://www.facebook.com/pages/Chatspeak-is-an-Insult-to-the-EnglishLanguage/105385522831983?fref=ts. Accessed: 10 October, 2012.
3 https://www.facebook.com/pages/Language-On-TheInternet/206373759390544?ref=ts&fref=ts#!/pages/Your-CHATSPEAK-pisses-me-off-LEARN-TOTYPE-REAL-WORDS/10150099618180646?fref=ts. Accessed: 24 October, 2012.
4 http://www.urbandictionary.com/define.php?term=chatspeak. Accessed: 13 November, 2012.
In my opinion, these examples show that there is a public reaction to electronically mediated
language and the reaction sends out signals that the English language is degenerating on
account of internet language.
While public discourse about CMC is often concerned with people’s fear that the English
language will go to ruin, linguistic scholars like Baron, Crystal and Herring are more hesitant
about this assumption. They conclude that purists can relax since the digital dialect of
English is doing more good than harm (Boyd in Squires, 2010:467). In fact, scholars studying
internet language take quite the opposite stand from public discourse and point to facts that
show that electronically mediated language might not be as different from Standard English
as was previously thought. Many findings from scholarly research reject the assumption that
internet language is a unique and uniform language variety, since – as has been discussed
previously – it varies with purpose and genre like all use of language (e.g., Hård af Segerstad,
2002:14, 16-21, Squires, 2010:463). With this in mind, my conclusion is that the fear of
language decay is most likely a result of the fact that the media as well as public discussions
on the internet are responsible for spreading a myth about internet language – a myth built
on the exaggerated, or misinterpreted, scope of language change. The myth implies that
internet language is very different from Standard English and that its impact makes language
change very rapidly. This mediated picture has given people in general a distorted view of
internet language, one which differs from that of scholars.
2.2 Typographic and orthographic features
Research on internet language has provided evidence for a language variety with distinctive
written features, according to Squires (2010:457). Most common are acronyms like BRB (be
right back) and LOL (laughing out loud), abbreviations such as coz for because and u for
you, and respelling of words such as gal or grrlz where both respellings refer to the word
girl/s. The following changes in spelling are presented by Crystal (2006:86-98) as typical of
internet language: orthographic reduction of letters as in thx for thanks, rebus replacements
of letter combinations such as gr8 and b4 (great and before), capitalization and punctuation
that varies from standard use like in rAndoM!!!!!!!!!!!!!!!!!!!! (my own example) and letter
replacements such as s spelt z. Acronyms and abbreviations are examples of the loosened
orthographic norms widely considered to be a defining character of internet language, though
Herring (2011:3) stresses that they are not to be interpreted as misspellings but as a means to
emphasize the playfulness and creativity in chat-language.
Squires (2010:467) explains that abbreviations and acronyms were presented as iconic
characteristics of internet language in the mass media in the mid-2000s. The mass media’s
assumption that these features are most common is based on the idea that internet
communication needs to save time and space in order to be efficient. As discussed above,
findings by scholars like Baron (2004:416), Squires (2010:482) and Tagliamonte and Denis
(2008:12) contradict the presentation of internet language in the mass media. These scholars
have analyzed corpora from IM and found that “characteristic IM forms” are not used as
often as the media has led people to believe. In fact, they are not used often at all –
Tagliamonte and Denis’ (2008:12) study shows a proportion of 2.4% of the total corpus they
analyzed. This means that approximately 650 items out of nearly 27,000 words were
considered typical of IM language.
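The relationship between the two figures can be checked with simple arithmetic. The short Python sketch below uses the rounded numbers given above (650 items, 27,000 words); the function name is my own and only serves the illustration.

```python
def share_percent(items, total_words):
    """Share of items among all words, in percent."""
    return 100 * items / total_words

# Rounded figures as reported above for the IM corpus.
print(round(share_percent(650, 27_000), 1))  # 2.4
```

The same calculation can be applied to any feature count in a corpus, which makes frequencies comparable across corpora of different sizes.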
A study of text messages and IM carried out by Baron (2008:154) showed the same low
frequency of language features considered typical of internet language. The study examined
the language of female college students. Eight acronyms were found in the text
messages and four in the IM conversations: a total of eight examples of lol, two of omg, two
of ttyl (talk to you later) and one wtf (what the fuck). The two different corpora studied by
Baron were made up of a total of 2,619 words (Baron, 2008:151-154). In the same study, no
examples of abbreviations typical of internet language were found in the IM corpus, after the
lexical shortenings ya (you), prob. (probably) and em (them) were excluded on account of
not being typical of online language. In the text messages 47 abbreviations were found,
among others u (you), r (are) and k (OK) (Baron, 2008:154).
In internet language there is an overlap between nonstandard typography and nonstandard
orthography (Herring, 2011:2). This means that replacing letters with symbols representing
the sound, e.g., gr8 for great and u for you, can be classified as both nontraditional spelling
and as a typographic characteristic of internet language. Early studies of online language
emphasized the creativity in language use as the foremost drive for these phenomena, but
later research suggests that only a small number of nontraditional spellings have been
standardized and they occur most frequently in mainstream online contexts (Kapidzic in
Herring, 2011:3). The examples mentioned are u for you, msg for message and wanna for
want to.
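How such forms might be counted in a text can be sketched in Python as a simple lexicon lookup. The lexicon below is a small illustrative list built only from forms mentioned in this section, and the three category labels are my own simplification; a real analysis would need a fuller list and manual checking, since single letters such as u, r and k can also occur in other functions (for example as initials).

```python
import re
from collections import Counter

# Small illustrative lexicon built from forms mentioned in the text;
# a real study would need a much fuller, manually checked list.
LEXICON = {
    "acronym": {"brb", "lol", "omg", "ttyl", "wtf"},
    "abbreviation": {"coz", "u", "thx", "r", "k", "msg"},
    "rebus": {"gr8", "b4"},
}

def count_features(text):
    """Count lexicon hits per category and return (counts, number of tokens)."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    counts = Counter()
    for tok in tokens:
        for label, forms in LEXICON.items():
            if tok in forms:
                counts[label] += 1
    return counts, len(tokens)

counts, n_tokens = count_features("LOL thx for the msg, c u b4 lunch. That was gr8!")
print(dict(counts), n_tokens)
```

Dividing each count by the number of tokens gives the kind of proportion that is reported in the studies discussed above.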
Both Squires (2010:482, 484) and Tagliamonte and Denis (2008:14) note that the
nonstandard form of the first-person pronoun I written in lowercase is more common
in IM than its standard equivalent. Individual differences in the use of apostrophes in
contractions and the possessive are other findings made by Squires, who notes that some
individuals use the apostrophe all the time while others never use it. She finds it notable that
these orthographic features are rarely discussed as a part of internet language in the mass
media, even though they are more common in IM than acronyms and abbreviations (Squires,
2010:482).
Another typographic characteristic of internet language is the use of the non-alphabetic
keyboard symbols used in emoticons: sequences of keyboard characters that represent facial
expressions (Herring, 2011:2). Western-style emoticons are viewed at a 90-degree angle (:D
for a laughing face), while Asian-style emoticons are viewed straight on (O_o for a confused
face). Crystal (2006:41-42) claims that smileys are one of the most distinctive features of
internet language, but at the same time he explains that studies have shown that they are not
very common. Herring (2011:2) also argues that studies have shown that emoticons occur
less often than popularly believed in English internet language. Both Crystal and Herring
explain that the most frequently used emoticons are variants of a happy face, e.g., smileys
like :) and :))), and a winking face, e.g., winkies such as ;) and ;-) (Crystal, 2006:41, Herring,
2011:2). In the study of text messages and IM done by Baron (2008:151-154), emoticons were
very infrequent in the messages sent by females. In the corpus of 1,473 words from text
messages only two emoticons occurred, and in the corpus of IM conversations, five emoticons
were found in the 1,146 words (0.7%). Out of the eight examples, seven were smileys (Baron,
2008:154).
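Emoticons of the types described above can be located automatically with a regular expression. The Python sketch below is a rough approximation under my own assumptions: the pattern covers only Western-style faces made of eyes (: ; =), an optional nose (-) and one or more mouth characters, plus a simple Asian-style face such as O_o. It will miss other variants, so the results of such a search would still need manual inspection.

```python
import re

# Assumed pattern: Western-style eyes (: ; =), optional nose (-), one or more
# mouth characters, or a simple Asian-style face such as O_o.
EMOTICON = re.compile(r"(?<!\w)(?:[:;=]-?[)(DPp]+|[oO0]_[oO0])(?!\w)")

def find_emoticons(text):
    """Return all emoticon strings found in the text, in order."""
    return EMOTICON.findall(text)

def emoticon_share(text):
    """Emoticons as a percentage of whitespace-separated tokens."""
    tokens = text.split()
    if not tokens:
        return 0.0
    return 100 * len(find_emoticons(text)) / len(tokens)

comment = "Great article :) thanks ;-) wow :))) haha :D what O_o"
print(find_emoticons(comment))
```

The lookbehind and lookahead in the pattern keep a colon inside a word or a URL from being counted as the start of an emoticon.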
2.3 E-grammar and level of formality
Herring (2011:1) introduces the term e-grammar to represent the set of features that
characterize the grammar of electronic language. E-grammar does not imply that there is a
single grammar for all varieties of electronically mediated language, since there is evidence
that points to e-grammar as “varying systemically across languages, contexts, users, and
technological modes” (Herring, 2011:1). Herring (2011:6) refers to previous research on CMC
in which it has been shown that internet language can be distinguished from traditional
genres of speech and writing. Typically it falls between the two categories when scholars
measure the frequency of grammatical function words such as pronouns, determiners, modal
auxiliaries and negations in corpora. Herring (2011:6) explains that asynchronous modes of
internet language, e.g., email, are often closer to formal writing while synchronous chat often
is closer to casual speech.
The syntax of internet English is sometimes described as fragmented compared to standard
syntax (Herring, 2011:5). Herring points out that parts of speech such as articles and subject
pronouns can be elided in order to save keystrokes in informal styles. Messages that consist
of sentence fragments where the subject and/or finite predicate are left out (ellipsis of subject
and/or verb), are common in genres like chat, IM, texting and microblogging. Herring
(2011:5) states that ellipsis of subject and/or finite verb may be an attempt to type speech-like utterances or may be done for the sake of brevity. Other research shows that internet language can
be syntactically casual with freely omitted subjects, modals or articles (Maynor in Baron,
2001:193).
Another part of e-grammar, which traditionally has also been considered a marker for level of
formality in a text, is the use of contractions. Baron (2008:154) analyzed the use of
contracted forms in text messages and IM conversations. In IM, 68% of all potential
contractions were contracted, and in texting 85%. The
apostrophe in contractions was used far more frequently in IM than in text messages: 32% of
the contractions in texting had the apostrophe and 94% of the contractions in IM. Baron
(2008:160) suggests that there might be different reasons for leaving out the apostrophe in
contractions in the two different modes of electronically mediated language. For instance, in
texting several extra steps are needed to insert the apostrophe, whereas inserting it
while typing on a computer keyboard demands almost no effort (Baron, 2008:154). She reasons that the
cases where the apostrophe actually was inserted in text messages can reflect the author’s
writing habits from the computer keyboards (Baron, 2008:160).
2.4 Internet language and gender
When internet language was first discussed by sociolinguists in the 1990s, it was believed that
gender roles would be more equalized as the form of communication was more anonymous
than face-to-face communication (Baron, 2004:405). But soon the idea that the anonymity of
the internet should make communication more coequal had to give way to “the realization
that online dynamics often replicated offline gender distinctions” (Baron, 2004:405). If this
is linked to Walther and Jang’s (2012:4) claim that user-generated content can be in a
reactive or interactive interrelationship with the article it comments on, I assume that gender
differences in the style of internet language are evident in the same way as gender differences
are in speech-style.
Previous research has shown some gender differences in the use of internet language and
online communication. For example, Herring (2003:207, see also Baron, 2004:405-406)
analyzed one-to-many asynchronous CMC and found that men tended to post longer
messages and to be the ones who opened and closed conversations in gender-mixed groups.
Other findings by Herring (2011:40) show that men express their opinions strongly, use
harsh language and hold an adversarial orientation towards their interlocutors. Women, on
the other hand, tend to post relatively short messages where they express support of others
and manifest their alignment towards their interlocutors. Herring’s (2003:210) earlier study
shows that representations of smiles and laughter (e.g., emoticons, lol, hahaha, hehe etc.) are
typed three times as often by women as by men in chatrooms, while the gender ratio is
reversed for aggressive and insulting behaviour. The fact that women write shorter messages
contrasts with traditional research on writing, where findings show that women write longer
texts than men (Baron, 2004:418). To conclude, previous studies show that men and women
tend to use different discourse styles in asynchronous CMC (Herring, 2003:210). Women
tend to use a style which is supportive and aligned while men more often use a style which is
oppositional and adversarial.
While Herring notes differences in male and female language in asynchronous CMC, Atai and
Chahkandi (2012:887) conclude that gender-specific stylistic features were used by both
sexes, i.e., men used features considered typical of female language and the other way
around. The analysis was based on a corpus from two different professional listservs used by
both men and women. Their study also showed that the topic of conversation contributed to
the choice of language features and also that topics with real world consequences attracted
women more, while men tended to be attracted to topics containing abstract theorizing. If the
topic was about real world consequences, men wrote longer messages than women on the
same subject (Atai and Chahkandi, 2012:884). This can be linked to Dare’s (2011:185)
opinion that a holistic examination of the purposes and motivations for communication is an
important focus when identifying gender differences in online communication. Other
scholars like Walther and Jang (2012:4) also state that the qualities of linguistic, stylistic and
semantic components of a message are of interest in research.
When studying a motherhood blog in Brazil, Braga (2011:215) took such a holistic view. She
found that women participating in guestbook activities of the blog used this activity to
recover the social practice of “woman’s talk” – a practice the interlocutors were missing in
their private lives. Generally this female form of communication is viewed from a male
perspective and thereby considered as useless and futile, explains Braga. She elaborates on
blog environments and language:
In blog environments, topics are generally addressed in a backstage language, closer to spoken
than written language, even if their comments are all written. In other words, blog
communication is a written form of spoken language.
(Braga, 2011:218)
While Herring (2007:19, see below for further explanation) uses the term tone to categorize
conversations, Braga uses attitude (2011:218). The most regular pattern of attitude in
comments tagged to the motherhood blog that Braga has analyzed is kindness (Braga,
2011:218).
2.5 Internet language among young men and young women
When researching synchronous CMC in IM, Baron (2004:414-415) found indications that
conversations between female college students were about one third longer than
conversations between male college students. In other words, females took more turns in
conversations. Baron’s corpus is based on IM conversations between college students and
consists of about 12,000 words. Her study showed no patterns based on gender in the use of
acronyms and abbreviations, but showed a difference in the use of contractions and
emoticons:
- males used contracted forms at a rate of 77% while females used them at a rate of 57%
- twice as many emoticons were found in the females’ messages, and it is noteworthy
  that all the emoticons used by males were actually from one single person’s
  conversations with females
In IM, female college students took longer turns than men in the conversations and their
conversations were longer as well (Baron, 2004:418). This could be said to contrast with the
findings of Atai and Chahkandi (2012) and Herring (2003) in their studies of adult language,
where they found that men tended to write longer messages than women. Baron discusses the
fact that there are no indications that same-gender conversations in speech show patterns
that women’s oral discussions are longer than men’s, but women tend to write longer
sentences in essay writing. Her conclusion is that her corpus shows evidence of a female
writing style, but not of a female speech style (Baron, 2004:418). In addition to women’s
lesser use of contractions, this fact indicates that women look upon IM as a written medium
(Baron, 2004:418).
Another gender difference found in studies on internet language is the use of different
stylistic tones in which messages are written. Herring (2007:19) refers to tone as the manner
in which a message is performed. The message can be emphasized with the use of emoticons
that take on different pragmatic meanings depending on the tone (Herring, 2007:21).
According to Leurs and Ponzanesi (2011:205), the interlocutors can emphasize, hide and add
nuances to their identities through the use of subject matter, voice, tone and emoticons when
they communicate online. In this way it is possible for people in adolescence to monitor
response and interactions of others and this will contribute to the development of the identity
of the adolescent self (Leurs and Ponzanesi, 2011:206). Guiller and Durndell (2006, quoted
in Herring and Kapidzic, 2011:42) found significant gender differences in the use of stylistic
variables when researching students’ language in asynchronous discussion groups. Their
results showed that young men were more likely to use authoritative language and to respond
negatively in interactions, while young women showed support, agreed explicitly and made
more personal and emotional contributions.
In support of the findings of Guiller and Durndell, Herring and Kapidzic (2011:42) cite
research about tone in profiles on MySpace, done by Thelwall, Wilkinson and Uppal (2010).
Their research showed that female messages had a positive tone to a greater extent than the
male messages did. Herring and Kapidzic’s (2011:48) own study of tone in IM showed that
young women used a friendly tone much more often than young men, who used an aggressive
or flirtatious tone more often than young women did. It also showed that young men and
young women used the neutral tone to the same extent.
3. Methods
In this study a corpus of 21,087 words was collected in order to investigate typical features of
internet language. The data was collected from comments on articles in online magazines.
The term user-generated content is used by Walther and Jang (2012:2) and is a definition of
one of the message types used in participatory websites - also known as Web 2.0 or social
web sites. User-generated content includes readers’ responses to both proprietor content and
other user-generated content (Walther and Jang, 2012:4). The systems in participatory
websites present both
central messages posted by a web page proprietor, and user-generated content that other
readers contribute. These systems can both facilitate and complicate social influence because
they provide information from a variety of sources simultaneously who possess different
attributes and connote different relationships to readers.
(Walther, as seen in Walther and Jang, 2012:2)
One example of such user-generated content is talk-back features that are tagged to online
news stories (Walther and Jang, 2012:2). In my view this is comparable to comments posted
by readers in online magazines and therefore the term user-generated comments is used in
this study.
3.1 Material
The corpus was collected from online magazines geared towards adult men, adult women,
young men and young women respectively. In order to find suitable magazines, I began by
searching Top 10 lists of magazines for men, women and teenagers, since my study
focuses on both gender and age. Different online magazines were visited and I made my
choices based on the explicitly targeted readership of each magazine as well as on the ease of
accessing article comments. For each commenting group, i.e., adult men, adult women, young
men and young women, a second magazine option was selected in case there would not be
enough user-generated comments to reach a total of 5,000 words from the first magazine.
The online magazines used for collecting the corpus have explicit targeted readerships,
expressed either on the publisher’s homepage or on the homepage of the actual magazine.
The two magazines used to collect the adult women’s data are Working Mother
(http://www.workingmother.com/) and Mothering (http://www.mothering.com/), both of
which are geared towards women. Seventeen (http://www.seventeen.com/) is geared
towards teenage girls and young women and was used for collecting the data for young
women. The adult men’s data was collected from Esquire (http://www.esquire.com/), which
is geared towards men. A magazine geared towards people who play computer and TV games and are interested in technical equipment, Gameinformer
(http://www.gameinformer.com/), was used for the young men’s data. The choice of this
particular magazine is based on the assumption that the targeted readership of
Gameinformer is different from the targeted readerships of Esquire and Seventeen in terms
of age and gender. Gameinformer has a section that welcomes new users of the online
magazine. It is called “Welcome to the brotherhood” and is an indication that the target
group is young men and men. In the user guidelines of the site users are recommended to act
like adults even if they are not and to use language without curses.5 When a brief search of
ten different user-profile pages was done, eight users were in high school, college or
university, indicating that the users mostly are in their late teens or early twenties.
Both Esquire and Seventeen are parts of the Hearst Corporation’s publishing. Seventeen’s
aim is to report on issues that young women face everyday and it is a teen fashion and beauty
magazine.6 A brief search of ten of the user-profiles showed that six out of ten users go to
high school and the remaining four users did not state age on their profile page. Esquire is a
general life-style magazine “for sophisticated men of contemporary America”. It aims to
5 http://www.gameinformer.com/forums/f/32/t/3729.aspx [Accessed 28 October, 2012].
6 http://www.hearst.com/magazines/seventeen.php [Accessed 28 October, 2012].
reach men “who are intellectually curious and socially aware”.7 Research of ten of the user-profiles showed that nine out of ten were 50 years of age or older and all of them stated
having a university education. The community guidelines on both Esquire and Seventeen say
that obscene language or abusive behavior is not accepted in the communities. They
encourage respectful and civil behavior to others and do not allow commercial solicitation or
advertising. When the data was collected for adult men, only comments posted from user-profiles stating a male gender were used and when data for young women was collected only
female user-profiles were used.
Working Mother is geared towards mothers who work and reports on issues that are of
interest to women who are both mothers and professionals. The guidelines ask their users to
be careful with spelling mistakes.8 Mothering is a magazine for parents interested in “natural
family living” and the user guidelines do not encourage debate; instead they want to provide
support to parents on their website.9 Mothering’s forum guidelines do not allow advertising
in posts and they provide the users with a list of abbreviations and emoticons that can be
used in the community.10 When I collected the data for adult women, all comments that were
posted from male user-profiles were excluded.
User-generated comments posted on the sites in the week containing the 15th of the months
of July, August, September, October and November 2012 were collected from the magazines
to reach a total of 5,000 words for each commenting group. For Seventeen, the one-week
periods needed to be expanded when collecting the comments as there were not enough
comments posted during the 35 days. First an attempt was made to find comments on the second
choice magazine geared towards young women, but there were not enough comments to be
found there either. Among the magazines examined in this category, Seventeen is the one
that has most comments on their articles and therefore the choice was made to expand the
time of data collection rather than using the second option magazine. A total of
approximately 100 days were used to collect the data from Seventeen.
The user-generated comments from magazines geared towards adult women had to be
collected from both the first and the second choice of magazine. Many user-generated
comments on the articles in Working Mother and Mothering were posted with a commercial
7 http://www.hearst.com/magazines/esquire.php [Accessed 28 October, 2012].
8 http://www.workingmother.com/other/working-mother-magazine-writers-guidelines [Accessed 28 October, 2012].
9 http://www.mothering.com/community/a/about-us [Accessed 28 October, 2012].
10 http://www.mothering.com/community/a/pleased-to-meet-you-forum-guidelines [Accessed 28 October, 2012].
purpose: members of the community post a comment and at the same time get paid for
product placement. These comments often did not comment on the actual article, but instead
contained more general content that could apply to many different articles. That means that I
encountered problems in selecting which comments should be part of the corpus. I made the
choice to exclude comments with commercial content, and only collect comments that could
be read as a reader’s response to the article. Therefore two different magazines needed to be
searched to reach 5,000 words. Hence two different methods were used when the user-generated comments from the first choice of magazine were not enough:
- the period was expanded when collecting comments from magazines geared towards
  teenage girls and young women and
- two different magazines were used when collecting comments from magazines geared
  towards adult women.
On Gameinformer, Working Mother and Mothering the readers are allowed to have their
own user-written blogs, while Seventeen and Esquire only have proprietor-written blogs in
different categories. User-written blogs means that the members of the community are
allowed to create their own blogs on the magazine’s site, while proprietor-written articles and
blog posts are written by people employed by the magazine. The user-generated comments
were collected from user-written blog posts as well as proprietor-written articles and blog
posts. The structures of the magazines’ archives differed greatly and as a result, the collection
of comments from the various magazines differed in terms of the time it took collecting all
material. For example, collecting data from Gameinformer was quickly done and only took
about six hours, compared to the twenty hours needed for collecting data from Esquire and
Working Mother/Mothering, respectively. Seventeen required about 12 hours.
The major part of the comments from the magazines geared towards adult women were
found in the fashion and food sections, while Seventeen’s most commonly commented
articles were on music and celebrities. The political blog in Esquire was commented on to a
great extent and in Gameinformer there were comments on almost all articles; both on
proprietor-written articles and reviews as well as on many of the user-written blog posts. It is
not possible to say anything about which category was mostly commented on in
Gameinformer. While comments from Gameinformer and Esquire were mostly placed on
the same day as the articles, commenting threads from Seventeen at times had comments
that differed by one year in date.
3.3 Methods of analysis
The analysis of the corpus was done separately for each commenting group. First the number
of words and comments were counted and then the data was searched manually for
abbreviations, acronyms, nonstandard typography and orthography plus contractions and
ellipsis of subject and/or verb. Nonstandard typography and orthography were then counted
with the advanced search function in Word. The frequency of ellipsis of subject and/or verb
was manually counted in the data for each commenting group and in the analysis
exclamations were not regarded as examples of ellipsis of subject and/or verb. In this study a
word was defined as letters or symbols divided by blank spaces, which means that emoticons
and articles are counted as words. Each comment consists of at least one utterance but in
most cases more than one, based on a definition of utterance as consisting of one or several
words that provide referential or pragmatic meaning (Brock 1996, quoted in Sundqvist,
2009:104). Utterances mostly coincided with punctuation. In cases where punctuation was
missing, I added punctuation in order to divide the comment into utterances.
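The word definition used here (letters or symbols divided by blank spaces) can be illustrated with a minimal Python sketch. The function name count_words is a hypothetical helper introduced only for illustration, and the sample text is example (4) from this section; the counts reported in this study were not produced with this code.

```python
def count_words(comment: str) -> int:
    """Count words as runs of characters separated by blank space."""
    # str.split() without arguments splits on any run of whitespace,
    # so an emoticon such as ":)" is one word, and so is an article.
    return len(comment.split())


# Example (4) from the corpus; the emoticon contributes one word.
example = "This is such a great quote, brightened my day :) I love her!"
print(count_words(example))
```

Under this definition punctuation attached to a word does not create an extra word, while a free-standing emoticon does, which is why emoticons are included in the word totals.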
After the corpus had been checked for typical features of internet language, the free online
version of Linguistic Inquiry and Word Count (described in section 1.1) was used to examine
the corpus for self-references, social words, emotions, cognitive words, big words and
articles. The results from this analysis were expected to give an indication about the level of
formality in the data. The data for each of the four commenting groups was run separately in
LIWC.
Next, each user-generated comment was categorized as one out of three different modes of
tone – aggressive, friendly or neutral – in order to investigate the discourse style of the four
commenting groups. This categorization is based on Herring’s (2007:18) presentation of a
classification scheme for online language. Her scheme is developed from The Speaking model
by Hymes (Hymes in Herring 2007:6) and regards the modality of speech called key. In
Herring’s adaptation of the model this modality is called tone and a message can be separated
into different manners of tone: serious/playful, formal/casual, contentious/friendly and
cooperative/sarcastic etc. (Herring, 2007:18). In other words, “‘tone’ refers to the manner or
spirit in which discursive acts are performed” (Herring, 2007:19). In my study Herring and
Kapidzic’s basic categorization from 2010 is used. Their study categorized the messages from
teen chatrooms in three different tones: aggressive, friendly or neutral (Herring and
Kapidzic, 2011:45). Later they added three more categories to their study, but in my analysis
of the corpus of user-generated comments only the three first categories were used and each
comment was categorized under only one of the tones. The categorization was done manually
and impressionistically, with the following criteria to guide me:
- comments that carried irony or sarcasm were categorized as aggressive in those cases
  where a subject of insult was addressed in the comment
- comments with plaintive, reactive and/or adversarial content were categorized as
  aggressive
- if irony was used in a humorous way, the comment was categorized as friendly or
  neutral (e.g., friendly bantering)
- comments on content in articles, with or without expressed personal opinions, were
  categorized as neutral (i.e., a comment need not agree with the article’s content to be
  categorized as neutral)
- cheering comments expressing love, gratitude and/or encouragement were
  categorized as friendly
- if the comment contained more than one tone, it was categorized by the most
  dominant tone
Most of the comments contained only one tone, but in cases when commenting language was
friendly or neutral but the content insulting, adversarial, reactive or plaintive, the comment was
categorized in the most dominant tone, like in the following example:
(3) Ahhhhh hellllooooo that's what girls do we're girls!!!!!
Civil language is used in this comment, but the content is adversarial; the Ahhhhh
hellllooooo and the repeated exclamation marks imply the questioning of someone and the
wish to tell someone off. Therefore the speech-like exclamation indicates an adversarial
stance and the comment was consequently categorized as aggressive. To clarify further how
the categorization was made, a few examples (4-9) from the corpus are presented below.
(4) This is such a great quote, brightened my day :) I love her! (friendly)
(5) Glad you made it through Sandy with such a great attitude! (friendly)
(6) You know X, I like you and all but goddamit it seems like every time I read a post and
have something snarky, mildly humorous to say about it - there you are. Damn it, I
don't mind that you're better at it than I am - it's just this 'cut off at the pass' stuff that
sticks in my craw. (aggressive)
(7) I am very upset right now.
I HATE YOU I HATE YOU I HATE YOU I HATE YOU I HATE YOU. (aggressive)
(8) This looks a bit kitchy, just another way to juice your bucks (neutral)
(9) I really thik it's a good ideea to have more womens at US government. They must have
the same rights as men's do [sic] (neutral)
3.4 Ethical considerations
Based on the collected corpus of user-generated comments, a language analysis was made to
look for linguistic differences in language in relation to age and gender. The comments were
collected from the different magazines without names or the article which was commented
on, and then copied into a computer file. On the internet it is impossible to know exactly who
is behind the aliases or the user accounts on social network sites. Hence the conclusions from
this study have to be founded on the assumption that most commenters are a certain age and
a certain gender, based on the explicit aim of the magazines and the brief search that was
made on user-profiles of each commenting group (see section 3.1). Consequently, my
assumptions are that the commenting group of Esquire represents adult men, while the
commenting group of Gameinformer consists of young men. Likewise I assume that the
commenting group of Mothering and Working Mother consists of adult women, while young
women make up the commenting group of Seventeen.
3.5 Methodological considerations
The method chosen for this study is based on previous research on internet language. Hård af
Segerstad (2002:120) concluded that the use of language in email varies with age and gender.
Also Herring (2003, 2011) and Baron (2004) found gender differences in the language of IM,
as did Tagliamonte and Denis (2008). Tagliamonte and Denis (2008:24) concluded that
stylized IM forms are abandoned by adolescents at a young age although the unique style of
the medium is kept. Their studies of IM are quantitative and do not contain any
information about user-generated content or comments, but their findings nevertheless
motivate the focus of the present study: age and gender. The studies on online commenting
that I was able to retrieve were made from a technological, sociolinguistic and/or social
psychological perspective (e.g., Braga, 2011 and Atai and Chahkandi, 2012). This is in line
with Thurlow’s (2006:668-669) view that CMC research has tended to focus on ways of
communicating rather than on linguistic practice, presumably on account of technological
determinism. Walther and Jang (2012:4) believe that linguistic, stylistic and semantic qualities
are of interest when studying online language and behaviour. Thus, in contrast to early research
about internet language, when the variety was considered a result of the medium, today’s
research has a broader focus and takes traditional linguistic concerns such as genre, stylistics and
semantics into account. This study has its focus at the typographic, orthographic, syntactic
and stylistic levels, and is thereby focused on linguistic practice.
Baron (2004:398) argues that each type of CMC has its own usage conditions depending first
on the number of interlocutors and second on whether the communication is instantaneous
or not. The usage conditions in their turn can affect the language used in communication
when it comes to formality, tone, the number of words, correctness and informativeness.
Hård af Segerstad (2002:252) explains that communicators in asynchronous modes have
time to plan and revise their writings, while synchronicity demands more rapid typing and
with that comes less revision of the written text. She explains further that the relationship
between the participants as well as the activity where communication takes place are
conditions that need to be regarded:
Language use is adapted according to level of synchronicity, the particular conditions for
production and perception in each means of expression, as well as according to the
communicative situation and context
(Hård af Segerstad, 2002:253)
The findings of my corpus analysis will be linked and compared to previous quantitative
findings of research on IM corpora as well as previous findings about asynchronous internet
language. User-generated commenting and IM are two different online genres that differ in
terms of synchronicity and the number of interlocutors. User-generated comments are not
usually received by a known interlocutor, as is the case in IM conversations, but address
members of a community that the commenter belongs to. The other members might or might
not be known to the commenter and quite often user-generated comments can be read also
by people who are not members of the community. Hence the conclusion can be drawn that
the user-generated comments that were collected in this study belong to a public genre within
internet language. The comments do not require an instant answer from someone and an
online comment thread can be visited and read by many different members of the community
on different occasions. That is another contrasting feature compared to IM. The assumption
can be made that these conditions contribute to greater formality in language as the
comments address people unknown to the commenter and apart from that, the commenters
have time to revise and express themselves since no one is waiting for a quick reply. One-to-one communication between people who know each other must be assumed to encourage the
use of more informal language than one-to-many communication between people who are
unacquainted with one another.
4. Analysis and results
The results of the analysis of the corpus of user-generated comments from the four different
commenting groups are presented below. In 4.1, a basic analysis of comment length and
number of comments is presented and compared to findings by scholars cited above. In 4.2
and 4.3 the features generally considered typical of internet language (emoticons,
nonstandard spelling, abbreviations and acronyms) are analyzed. In 4.4 the LIWC analysis of
the level of formality in the comments is presented, together with a survey of ellipsis and
contractions. Finally section 4.5 delivers the results from the analysis of commenting tone.
Results from the gender and age analysis of typical features in the collected data are thereby
presented from a typographic, orthographic, syntactical and stylistic view respectively. The
nonstandard spellings analyzed in my study plus the omission of the apostrophe in
contractions are features that overlap in language level. Nonstandard spellings of the
pronouns I and you plus the conjunction and are in the present study regarded as
typographic features, but could also have been counted as orthographic features since they
are respellings (see further discussion in Herring 2011:2 and section 2.2 above). The omission
of the apostrophe in contractions can belong to both typography and syntax. As the use of
contractions traditionally has been considered a marker of informality in a text, the feature
will be covered in section 4.4, Syntactic features and level of formality and not in section 4.2,
Typographic features: emoticons and typographic respelling.
4.1 Length and number of comments
Tables 1 and 2 present the results from a basic analysis of the user-generated comments. Table
1 shows the analysis from a gender perspective and Table 2 shows the results from the
perspective of the four commenting groups.
Table 1. Survey of the corpus from a gender perspective.
Commenting group                            Male      Female
Total no. of words                          10,512    10,575
Total no. of comments                       239       402
Mean length of comments (no. of words)      44        26
Average number of utterances per comment    3.28      2.29
Table 2. Survey of the corpora from the perspective of the four commenting groups.
                                            Adult men   Adult women   Young men   Young women   Total
Number of words                             5,295       5,343         5,217       5,232         21,087
Number of comments                          98          132           141         270           641
Mean length of comments (no. of words)      54          41            37          19            33
Shortest comment                            6           2             4           1             1
Longest comment                             175         167           197         224           224
Average number of utterances per comment    3.32        2.92          3.25        1.98          2.66
As mentioned, traditional writing research has shown that women write longer texts than
men, but this contrasts with findings from asynchronous internet language in which women
have been found to write shorter messages than men (Baron, 2004:418, Herring, 2003:207).
This contrast is apparent in the analysis of the data in the present study where male
comments are nearly twice as long as the female comments in mean length (see Table 1). If a
comparison is made between the commenting groups of young women and adult men, the
contrast is even more pronounced and Table 2 shows that young women’s comments are
approximately one third of adult men’s in mean length. Thus my findings are in line with
earlier research done by Herring (2003) and Atai and Chahkandi (2012) on asynchronous
internet language.
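The gender figures in Table 1 follow from pooling the group figures in Table 2, which can be checked with a short Python sketch. The variable and function names are illustrative assumptions; the word and comment totals are those reported in the tables.

```python
# Words and comments per commenting group, as reported in Table 2.
groups = {
    "adult men": {"words": 5295, "comments": 98},
    "adult women": {"words": 5343, "comments": 132},
    "young men": {"words": 5217, "comments": 141},
    "young women": {"words": 5232, "comments": 270},
}


def mean_length(names):
    """Mean comment length in words for the pooled groups in `names`."""
    words = sum(groups[n]["words"] for n in names)
    comments = sum(groups[n]["comments"] for n in names)
    return words / comments


male = mean_length(["adult men", "young men"])
female = mean_length(["adult women", "young women"])

# Rounded means correspond to the "Mean length of comments" row of Table 1.
print(round(male), round(female))
# Ratio of male to female mean comment length.
print(round(male / female, 2))
```

Pooling the raw totals rather than averaging the two group means weights each group by its number of comments, which matters here because the young women posted far more comments than any other group.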
In the synchronous, one-to-one and private mode of internet language IM, Baron found that
female college students write longer messages than male college students (Baron 2004:418).
Though the data in this study was collected from a different mode of internet language – a
mode very different from the mode analyzed by Baron as it is asynchronous, one-to-many
and public – it is interesting to note the difference in results. In the data analyzed here, the
mean length of comments by young men is twice as long as the mean length in the young
women’s data (see Table 2); a result which differs radically from Baron’s findings. As the
genres are so different, too much should not be made of the comparison, but it emphasizes the
fact that internet language is not a single universal language variety used similarly across the
internet. As argued by Hård af Segerstad (2002:14, 16-21) and Squires (2010:463) among
others, it differs in features and style depending on user, genre and situation. Another result
to notice is the fact that the young women’s comments contain both the longest and the
shortest comment of the whole corpus.
4.2 Typographic features: emoticons and typographic respellings
Use of emoticons is a typographic feature of internet language, and as pointed out in the
theoretical background, it is generally presented as a very frequent feature of internet
language in the mass media. In my data for the four commenting groups, the differences in
emoticon usage range from none in the adult men’s data to 90 examples in the young
women’s data. In percentages, 1.7% of the words in the young women’s comments are
emoticons. In Baron’s (2004:414-415) study of IM, female college students used emoticons
twice as often as male college students, and Herring’s (2003:210) research likewise showed
that women use representations of smiles and laughter more often than men. Their findings
are consistent with the results of this study, where nine emoticons, all representations of
a happy face, were found in the data for the adult women and none were found in the adult
men’s data. Of
the 90 emoticons in the data for young women, 44 are different representations of smileys,
32 are graphic representations of a heart and five are representations of winkies (see
Appendix A for a full list). Altogether, the results show that the vast majority of the 90
emoticons represent positive feelings. Also in the young men’s comments the representations
of positive feelings are in the majority: of the total of 16 emoticons, six are smileys and four
are winkies.
Previous research by Baron (2008:151-154) and Herring (2011:2) has shown that emoticons
are very infrequent in IM and mobile phone text messages, even though they are often viewed
as a common feature in public discourse about internet language. The results from this study
are in accord with previous findings: the counts for the four commenting groups (see Table 3)
add up to 115 emoticons in the whole corpus of 21,087 words, a ratio of roughly 0.5%. This
must be considered a very low ratio, considering the space the feature is given in public
discourse (cf. Thurlow 2006:679, 686).
Table 3. Frequency of typographic features in the comments.
            Adult men     Adult women   Young men     Young women
Emoticons   0             9             16            90
I           102 (100%)    231 (97.9%)   305 (95.9%)   249 (92.9%)
i           0             5 (2.1%)      13 (4.1%)     19 (7.1%)
and         121 (98.4%)   134 (96.4%)   122 (96.8%)   98 (96.1%)
&           2 (1.6%)      5 (3.6%)      4 (3.2%)      4 (3.9%)
you         39 (100%)     55 (100%)     64 (98.5%)    124 (95.4%)
u           0             0             1 (1.5%)      6 (4.6%)
Respellings and nonstandard typography overlap, according to Herring (2011:2). In Table 3
the nonstandard spellings of I, and and you (i, &, u) are compared with the standard
spellings of the same words. And is occasionally represented by the keyboard symbol & in
the data, but overall the standard form is used more than 96% of the time. The use of u for
you is even rarer and does not occur at all in the comments by adult men or adult women.
Though the nonstandard spelling u is most frequent in the commenting group of young women,
it is rare even there: the standard form is used in 95% of the cases.
In IM the use of lowercase letter i for personal pronoun I has been noted to be more common
than the standard form (Squires 2010:482, 484, Tagliamonte & Denis 2008:14). As discussed
above it can be awkward to compare results from two very different genres within internet
language, unless it is to accentuate differences between the modes of communication. As can
be seen in Table 3, the use of lowercase i is clearly not more common than its standard
equivalent in the data of the present study. However, it is still the nonstandard typographic
feature most commonly used in the corpus of user-generated comments. The ratio is very low
compared to the ratio in IM, where lowercase i is more common than the standard spelling of
I; within my study it is the young women who show the highest frequency of nonstandard
spelling of the personal pronoun I. The results of the present study suggest that the young
women are the most frequent users of nonstandard typography, as their ratios are the highest
of the four commenting groups for all examined features.
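The percentages in Table 3 express each variant as a share of all occurrences of the word in question (standard plus nonstandard). As an illustration only, the nonstandard shares can be recomputed from the raw counts with the short Python sketch below; the names table3 and nonstandard_share are hypothetical and do not come from the study.

```python
# (standard count, nonstandard count) per commenting group, copied from Table 3.
table3 = {
    "I / i": {
        "adult men": (102, 0), "adult women": (231, 5),
        "young men": (305, 13), "young women": (249, 19),
    },
    "and / &": {
        "adult men": (121, 2), "adult women": (134, 5),
        "young men": (122, 4), "young women": (98, 4),
    },
    "you / u": {
        "adult men": (39, 0), "adult women": (55, 0),
        "young men": (64, 1), "young women": (124, 6),
    },
}


def nonstandard_share(standard, nonstandard):
    """Percentage of all occurrences of a word that use the nonstandard form."""
    total = standard + nonstandard
    return 100 * nonstandard / total if total else 0.0


shares = {
    pair: {group: nonstandard_share(*counts) for group, counts in groups.items()}
    for pair, groups in table3.items()
}

for pair, groups in shares.items():
    for group, share in groups.items():
        print(f"{pair:8} {group:12} {share:.1f}% nonstandard")
```

For every word pair the young women’s share is the highest of the four groups, which is the pattern described above.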
4.3 Orthographic features: abbreviations
When discussing the orthographic features there is a need to clarify the different kinds of
abbreviations that will be discussed. The acronym is one kind of abbreviation presented in
this study and here the term acronym also includes initialisms. This means that all
representations where the initial letters in the words of a phrase are combined will be called
acronyms, even though they could be divided into (1) acronyms (initial letters put together
that can be pronounced, e.g., lol) or (2) initialisms (initial letters put together that cannot be
pronounced, e.g., omg). This is done in line with a study of IM done by Baron (2004), in
which she did not separate the two. Only acronyms that seem to be distinctive of language
on the internet were counted, leaving out acronyms that are also part of common offline
writing, e.g., UN, US, i.e., etc. and the like. This is also in line with Baron’s (2004) study
of IM. The term abbreviation is used to represent all short forms of words found in the
corpus, such as clippings (e.g., fave), lexical shortenings (e.g., ‘cuz), and orthographic
reduction of letters (e.g., fk).
Abbreviations and acronyms are characterizing features of internet language, since they are
timesaving and save space when there is a need to be brief, according to Squires (2010:467).
In my corpus, both abbreviations and acronyms are very infrequent. As in Baron’s (2008:154)
study of IM, only a few abbreviations were found in the data. All but one were abbreviations
that are also used in informal written and spoken English and therefore cannot be considered
typical of internet language. The one abbreviation that can be considered typical of
internet language is the clipping props (proper recognition or proper respect), which was
found twice in the comments by the young men. Three abbreviated
representations of expletives (bugfk, fk, f’ing) were found in the data for adult men, all of
which contained orthographic reduction of letters. No expletives were found in the comments
by the three other commenting groups. Kinda, a contracted speech-like form of kind of,
occurred three times in the adult men’s data and five times in the young men’s data. Another
word normally found in speech, the lexical shortening ‘cuz (because), was found once in
the young women’s data. The clipping fave was found in both the young women’s and the adult
women’s data (three times and once, respectively) and another clipping, kiddi (kidding), was
found in the adult men’s comments. In the adult men’s data a total of 11 abbreviations were
found, though neocon (neoconservative), commies (communists) and kiddi cannot be
considered typical of language on the internet. The young men’s data showed a total of six
abbreviations, a total of three were found in the young women’s data and one in the adult
women’s data. All in all 21 abbreviated words, 0.1%, were found in the total corpus made up
of 21,087 words, not counting acronyms. Consequently the results from this study
corroborate Baron’s findings from IM conversations. The fact that expletives only occur in
the comments by the adult men also corroborates earlier findings about language use on the
internet (Herring 2011:40).
A special form of abbreviation is the acronym. Acronyms occurred in the data from
user-generated comments with approximately the same frequency as abbreviations. All in all 27
acronyms were found in the total corpus: 20 in the young women’s data, six in the young
men’s data and one in the adult men’s data. The adult women’s comments contained no
acronyms. Lol, including equivalents such as looolll, lolz and olz, was found nine times in the
young women’s comments and five times in the young men’s comments and is consequently
the most common acronym in the collected corpus. Omg occurred seven times, always in the
young women’s comments. Btw (by the way), idk (I don’t know), imho (in my humble opinion),
ba (bad a**) and pll (pretty little liar) were found once each. These results are in line
with Baron’s (2008:154) study of IM and text messages, where acronyms were also sparse.
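To make the comparison between abbreviations and acronyms concrete, the group totals reported in this section can be expressed as percentages of the 21,087-word corpus. The Python sketch below is illustrative only; the helper percent_of_corpus and the dictionary names are assumptions introduced here, not part of the study.

```python
CORPUS_WORDS = 21087  # total corpus size reported in Table 2

# Group totals as reported in the running text of section 4.3.
abbreviations = {"adult men": 11, "adult women": 1, "young men": 6, "young women": 3}
acronyms = {"adult men": 1, "adult women": 0, "young men": 6, "young women": 20}


def percent_of_corpus(count, total_words=CORPUS_WORDS):
    """Share of the whole corpus, in percent, taken up by `count` items."""
    return 100 * count / total_words


abbr_total = sum(abbreviations.values())
acro_total = sum(acronyms.values())

print(f"abbreviations: {abbr_total} ({percent_of_corpus(abbr_total):.2f}% of all words)")
print(f"acronyms:      {acro_total} ({percent_of_corpus(acro_total):.2f}% of all words)")
```

Both features stay close to one tenth of a percent of the corpus, which is why they are described as occurring with approximately the same frequency.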
4.4 Syntactic features and level of formality
As discussed above, both ellipsis and contractions can be regarded as orthographic
reductions as well as markers of informality in a text. Not all possible contractions have been
analyzed in this study, only the ones listed in Table 4. Those are the negated auxiliary verbs
that appeared more than ten times each in the corpus (a table of additional contractions can
be found in Appendix B). In the analysis of contracted forms, the use of the apostrophe in
contractions was also included, as it can be read as a signal of new patterns in the use of
the apostrophe.
In Baron’s (2004:414-415) study of IM, a difference in the use of contractions was noticed
between genders: men used contracted forms more often than women. In the study on IM
and text messages, she found that contracted forms were used 68% of the time in IM and
85% in text messages (Baron 2008:154). A difference in the use of the apostrophe in IM and
text messages was also found: in IM 94% of the contractions contained the apostrophe while
only 32% of the contracted forms in text messages contained the apostrophe. In the present
study, contracted forms are used in the great majority of the potential places of negative
contractions (91%), as shown in Table 4.
Table 4. Frequency of negated contractions and their uncontracted forms plus
frequency of ellipsis of subject and/or verb in user-generated comments (the number in
brackets shows the instances in which the apostrophe has been left out).
                Adult men   Adult women   Young men   Young women   Total
cannot          0           1             0           0             1
can’t/cant      1           7             7           10            25 (3)
do not          1           0             1           0             2
don’t/dont      11          5             11 (1)      27            54 (2)
is not          4           1             0           0             5
isn’t/isnt      2           1             2           2             7
will not        1           0             0           0             1
won’t/wont      2           0             3           5             10 (1)
did not         2           0             0           0             2
didn’t/didnt    2           1             13          5             21
Ellipsis*       3           19            37          28            87
Utterances**    325         385           458         535           1703
*ellipsis of subject and/or verb
**total number of utterances
Four further bracketed per-group counts, (1) and (2) in the can’t/cant row, (1) in the
don’t/dont row and (1) in the won’t/wont row, cannot be assigned to a column here.
Only 11 out of 128 possible contractions are uncontracted forms, of which eight were written
by the adult men. That means that the adult men’s comments contain the highest usage of
uncontracted negated auxiliaries in the present study. In six of the total of 117 contractions,
the apostrophe has been left out; in other words, 95% of the contractions contain the
apostrophe. Taken together, the results show that contracted forms are used more often in
this study of asynchronous internet language than in previous studies of synchronous
internet language. The use of the apostrophe in contractions in my study is comparable to
Baron’s (2008:154) findings based on IM.
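The two percentages used in this comparison, the contraction rate and the rate of apostrophe retention, can be derived from the Total column of Table 4. The following Python sketch shows the arithmetic under that reading of the table; the tuple layout and variable names are assumptions made for illustration.

```python
# Totals per negated auxiliary, read from the Total column of Table 4:
# (uncontracted forms, contracted forms, contractions written without apostrophe)
table4_totals = {
    "cannot / can't": (1, 25, 3),
    "do not / don't": (2, 54, 2),
    "is not / isn't": (5, 7, 0),
    "will not / won't": (1, 10, 1),
    "did not / didn't": (2, 21, 0),
}

uncontracted = sum(row[0] for row in table4_totals.values())
contracted = sum(row[1] for row in table4_totals.values())
no_apostrophe = sum(row[2] for row in table4_totals.values())

# Share of potential contraction sites where the contracted form was chosen.
contraction_rate = 100 * contracted / (contracted + uncontracted)
# Share of contracted forms that keep the apostrophe.
apostrophe_rate = 100 * (contracted - no_apostrophe) / contracted

print(f"contracted forms: {contracted} of {contracted + uncontracted} "
      f"({contraction_rate:.0f}%)")
print(f"with apostrophe:  {contracted - no_apostrophe} of {contracted} "
      f"({apostrophe_rate:.0f}%)")
```

The resulting figures can be set side by side with Baron’s 68% and 85% for contraction use and 94% and 32% for apostrophe retention in IM and text messages.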
Fragmented syntax is common in genres like chat, IM, texting and microblogging and can be
a way to try to write speech-like utterances (Herring 2011:5). The analysis of the data from
user-generated comments shows some differences in frequency between the commenting
groups. Ellipsis of subject and/or verb is most common in the comments written by the
young men, where 8% of the utterances contain ellipsis of subject and/or verb. Ratios for
the three other commenting groups, calculated from the counts in Table 4, are 5% (adult
women), 5% (young women) and 1% (adult men). Taking into account that comments consist of at
least one but sometimes up to 15 utterances, ellipsis cannot be considered a very common
feature in this study.
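The ellipsis ratios relate the number of utterances with ellipsis to the total number of utterances in each group (the last two rows of Table 4). A minimal Python sketch of that calculation follows; the dictionary names are illustrative and the per-utterance basis is the one stated in the text.

```python
# Counts copied from the "Ellipsis" and "Utterances" rows of Table 4.
ellipsis = {"adult men": 3, "adult women": 19, "young men": 37, "young women": 28}
utterances = {"adult men": 325, "adult women": 385, "young men": 458, "young women": 535}

# Percentage of utterances in each group that show ellipsis of subject and/or verb.
ellipsis_share = {
    group: 100 * ellipsis[group] / utterances[group] for group in ellipsis
}

for group, share in sorted(ellipsis_share.items(), key=lambda item: -item[1]):
    print(f"{group:12} {share:.1f}% of utterances")

overall = 100 * sum(ellipsis.values()) / sum(utterances.values())
print(f"whole corpus {overall:.1f}% of utterances")
```

Sorting by share makes the ranking visible at a glance: the young men come first and the adult men last.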
Table 5. Analysis of linguistic features at word level (percentages).
LIWC dimension   Self-references   Social words   Positive emotions   Negative emotions   Cognitive words   Articles   Big words
Adult men        3.04              7.34           2.42                2.11                5.89              7.89       19.36
Adult women      7.00              9.41           4.72                1.05                6.14              5.99       15.65
Young men        6.77              6.59           5.11                1.39                6.90              6.09       14.67
Young women      6.09              9.82           5.33                1.81                7.45              4.45       11.34
Formal*          4.2               8.0            2.6                 1.6                 5.4               7.2        19.6
Personal*        11.4              9.5            2.7                 2.6                 7.8               5.0        13.1
*percentages provided by LIWC for reference to a formal and a personal writing style (henceforth referred to as
the LIWC references)
As already mentioned, asynchronous modes of internet language are often closer to formal
writing than synchronous modes, and when the frequency of grammatical words such as
pronouns and determiners is measured, internet language typically falls between the two
extremes of speech and writing (Herring 2011:6). LIWC provides references for all word
categories in formal and personal texts (see Table 5). The LIWC analysis shows that the adult
men’s use of self-references and positive emotions is much lower than the three other
commenting groups’, though the adult men use big words and negative emotions to a higher
degree. By and large the analysis of data for the adult men shows similarities to the LIWC’s
references for formal texts. Adult women and young women have the highest figures when it
comes to the use of social words and positive emotions, both of which are indicators of
personal texts. Both the young men and the adult women fall between the LIWC references
for self-references, cognitive words, articles and big words. Data for the young women shows
infrequent use of big words, probably owing to the age of the members of the
commenting group. As shown in section 3.1, most of the user profiles stated that the
commenters were in high school. The LIWC analysis of the comments by the young women
shows closeness to the LIWC references for a personal text.
Seemingly the adult men’s comments include linguistic features at word level that are close to
a formal writing style, while the young men and the adult women fall between a formal and
personal writing style, though high as regards positive emotions. The analysis also shows that
the adult women use social words to a high extent, thus indicating a personal style. The
young women make frequent use of social words and words which display positive emotions,
but less so when it comes to the use of articles and big words. Thus indications of a personal
writing style are found in the young women’s comments.
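The statement that a group "falls between" the LIWC references can be checked mechanically: for each dimension, a value lies between the references if it is no lower than the smaller and no higher than the larger of the formal and personal percentages. The Python sketch below applies this test to the figures in Table 5. It is an illustration of the comparison, not the procedure used in the study, and the function name between_references is hypothetical.

```python
DIMENSIONS = [
    "self-references", "social words", "positive emotions",
    "negative emotions", "cognitive words", "articles", "big words",
]

# Percentages copied from Table 5, in the order of DIMENSIONS.
groups = {
    "adult men": [3.04, 7.34, 2.42, 2.11, 5.89, 7.89, 19.36],
    "adult women": [7.00, 9.41, 4.72, 1.05, 6.14, 5.99, 15.65],
    "young men": [6.77, 6.59, 5.11, 1.39, 6.90, 6.09, 14.67],
    "young women": [6.09, 9.82, 5.33, 1.81, 7.45, 4.45, 11.34],
}
formal = [4.2, 8.0, 2.6, 1.6, 5.4, 7.2, 19.6]
personal = [11.4, 9.5, 2.7, 2.6, 7.8, 5.0, 13.1]


def between_references(values):
    """Return the dimensions on which a group lies between the two LIWC references."""
    result = []
    for name, value, f, p in zip(DIMENSIONS, values, formal, personal):
        low, high = sorted((f, p))
        if low <= value <= high:
            result.append(name)
    return result


for group, values in groups.items():
    print(f"{group}: {', '.join(between_references(values)) or 'none'}")
```

For the young men the test returns self-references, cognitive words, articles and big words, the four dimensions named in the text; the adult women’s list contains the same four.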
4.5 Commenting tone
Below the ratio of use for each commenting group is presented in pie charts showing
comments categorized as aggressive, friendly or neutral. The results from the analysis of
tone in comments are at the stylistic level and to make it easier to compare the results, the pie
charts are presented next to one another, see Figures 1-4.
The results show that the neutral tone is used by the adult women, the young men and the
young women in slightly more than half of their comments, while nearly half of the adult
men’s comments were classified as neutral. This means that the most frequent tone,
regardless of commenting group, is the neutral tone. Moreover, the results show great
differences at the stylistic level when it comes to the use of the friendly and the aggressive
tone. The friendliness shown by the adult women and young men and to some extent by the
young women in my study is in line with Braga’s (2011:218) finding that the most common
attitude in the blog environment she analyzed was kindness. Almost half of the comments
from the young men were written in a friendly tone, which means that the young men use a
friendly tone more often than the young women do. This result is not in line with earlier
findings: the results from the study by Thelwall et al. (2010, as cited in Herring 2011:42)
showed that young women used a friendly tone to a greater extent than young men did, which
contrasts with the results of the present study.
Figure 1. Tone in user-generated comments by adult men.
Figure 2. Tone in user-generated comments by young men.
Figure 3. Tone in user-generated comments by adult women.
Figure 4. Tone in user-generated comments by young women.
Most striking in the analysis of tone is perhaps the extent to which the adult men in my data
use an aggressive tone in communication, compared to the three other commenting groups.
Also worth noting is the result showing that the young women use an aggressive tone five
times as often as the young men and the adult women. From the gender perspective it is
interesting that the aggressive tone is used only marginally in the data for the young men
while the data for the adult men shows a decidedly higher frequency. Likewise it is interesting
to note that the young women use an aggressive tone more often than the adult women do.
The finding from my study that the adult men use an aggressive tone in 42% of the
user-generated comments is in line with Herring’s (2011:40) findings that men express their
opinions strongly and hold an adversarial orientation to their interlocutors. The adult men in
the present study show a much higher use of aggressive and insulting behaviour compared to
males in Herring’s study from the early 2000s, who used it three times as often as females. In
this study of user-generated comments, the adult men used aggressive and insulting
behaviour in 42% of the comments, while the adult women used it in only 2% of their
comments. The fact that the young men used an aggressive tone in only 2% of the comments
contradicts previous findings. Guiller and Durndell (2006, cited in Herring 2011:42) found
that male students were more likely to use authoritative language and respond in a negative
way in interactions in asynchronous discussion groups. When my data is analyzed by gender
only, disregarding age, the results from the analysis of tone come closer to the findings of
Herring’s (2011) study: aggressive tone in female data accounts for 11% and male data
accounts for 43% in user-generated comments.
The finding by Atai and Chahkandi (2012:887) that gender-typical stylistic features are
used by both sexes can be linked to the results from both the young men’s and the young
women’s comments. In the young women’s data at least one in ten comments has an
aggressive tone, which is considered typical of a male style of online language. In contrast,
the data for the young men shows a rather large proportion of comments with a
friendly tone, generally considered a typical stylistic feature of women.
In this study the adult women and the young men can be said overall to display either a
friendly or a neutral stance, while the adult men more often are either aggressive or neutral
and only occasionally friendly. The young women are friendly or neutral in most cases, but
occasionally they use an aggressive tone.
4.6 Further remarks on the results
The reasons for the differences in the writing styles of the four commenting groups are hard
to determine, but educational background and age are likely to be important factors when a
person chooses his or her level of formality as regards writing style. In this study the people
with the highest education, the middle-aged men and mature women, use a more formal
written style than the two younger commenting groups. The influence of educational
background and age is also evident in that the data for the young men shows a more formal
style in typography and orthography than the data for the young women does. The commenting
group of young women is the youngest of the examined groups: most of the commenters are of
high school age. In the young men’s commenting group most of the commenters go to college or
university, which makes them approximately five years older than the young women. A
five-year age difference at this time in life is quite a large span. This might have affected
the results of the study of user-generated comments, and it would have been preferable if the
ages of the young women and the young men had been more alike. On the other hand, it is
possible to draw on the age difference and conclude that the typical features of internet
language are used more by young women in early and mid-teenage years and still evident,
but less pronounced, among young men in their late teens and early twenties. Therefore, it
can be assumed that the use of typical features of internet language changes with increasing
age in adolescence, which is in line with the conclusion by Tagliamonte and Denis
(2008:24) that stylized IM forms are abandoned by adolescents at an early age.
An interesting, and perhaps the most surprising, result in this study was the degree to which
adult men used an aggressive tone when commenting on articles in the online magazine.
Previous research has shown that men hold an adversarial stance to their interlocutors, so
this study is in line with that, but the frequency surprised me. In recent research, for
instance in Atai and Chahkandi (2012) and Braga (2011), the topic of discussion has been
taken into account when analyzing user-generated content. In this study, additional
information about topic might have revealed something about the great differences in use of
the aggressive tone between the commenting groups. Several of the scholars cited in this
paper argue that internet language varies with purpose and situation. Therefore the holistic
view that Dare (2011) argues for would have given this study a more explicit framework.
That might have made the study more informative concerning the user-generated content, as
a qualitative study of the content could have been performed alongside the quantitative one.
5. Discussion: Implications for the future of the English language
As discussed above, scholars are not very worried about the decay of the English language on
account of internet language, while the public reaction to internet language is often
represented by fear of drastic language changes. Though it is impossible to say exactly how
the English language will evolve, Baron has made some suppositions about possible future
changes. Baron (2009:44) argues that among the changes that will be brought into the future,
spelling or vocabulary will not be the most important. Instead she believes that the changes
in attitude towards language structures will be more significant. Two shifts are listed that,
according to Baron, are likely to affect future English language:
- the whatever-attitude towards language rules and correctness
- the enhanced control of linguistic interactions (Baron, 2009:44-45)
The whatever-attitude towards language rules and correctness is a change that will result in
people’s declining concern for what is prescriptively considered good English. People will
simply not worry too much about traditional language rules and grammar, as long as the
communication can be understood. Instead focus will be more on tolerance and personal
expression. The other shift – control of linguistic interactions – means that people in general
“increasingly come to see language not as an opportunity for interpersonal dialogue but as a
system we can maneuver for individual gain” (Baron, 2009:45). In other words, people have
the opportunity to manipulate their interaction with others by using communication acts they
can control. How then, is this manipulation set to work? According to Baron (2009:45) there
are a number of actions concerning the interaction with others, where the message sent can
be controlled:
- the choice between calling or texting someone
- the signals that are sent when designing social network pages with staged photos and
  when choosing what information the contacts will be able to access in social network
  profiles
- the possibility to choose not to answer a phone call since, with caller-id, the caller is
  known beforehand
- the possibility to pretend to be talking on the phone when meeting an unwanted
  conversation partner on the street (Baron, 2009:45)
In the present study the style used by the young women is closest to the writing style
commonly presented in the media and at social network sites as the style of language used on
the internet – a style ridiculed and mocked by people who consider Standard English
superior (see section 2.1 for quotes from Urban Dictionary). But will young women change
their online communication style as they grow up and get more educated as Tagliamonte and
Denis (2008) suggest? And will their written style be transferred to the next generation and
become a type of standard online communication style that young teenagers use? When
Baron explains the “whatever-attitude”, she states that correctness, which I suppose can be
analogous to Standard English, will not be as important in the future as it is today. Her
supposition says that people will be more tolerant towards language varieties and personal
expressions in the future and if this is so, parts of the internet language style used by the
young women in this study might have a bright future despite the prescriptivists’ views.
When concluding this discussion about future language, it is exciting to note that Baron’s
suppositions contrast radically with the wishes expressed in public discourse. While public
discourse holds that standard spelling and formally correct sentence structures are features
that ought to be preserved, Baron believes that what the future will bring is tolerance
towards nonstandard language features and a relaxed attitude to language correctness. The
gap between those two views on future language is rather large and mirrors two absolutely
opposite viewpoints. And what does that imply? Have the media given people with
controversial ideas more speaking space and thereby spread prejudices about language change,
which has resulted in a somewhat lopsided debate? Or have scholars researching online
language not succeeded in conveying their knowledge that Standard English is actually not
changing at a high speed? Whether this is a question of scholars being uninterested in making
their results known to the public, or of a lack of interest from the mass media in reporting
the scholarly results, as they do not make headlines, will be left unsaid here, but it needs
to be addressed in another discussion. As I see it, there is a question of responsibility
here that needs to be addressed so that the debate can become more balanced.
6. Conclusion and future research
One aim of this study was to examine the extent to which typical features of internet language
could be found in user-generated comments collected from four different commenting
groups. The adult men’s comments contain only a few features of internet language, adult
women’s and young men’s comments show a rather low, but still evident, usage of typical
features and the comments by the young women show the highest frequency of features
typical of internet language. Another aim of the study was to examine to what degree internet
language contains features of spoken language. Contractions, one of the examined features,
are common in spoken language, and ellipsis of subject and/or verb can be an attempt at
writing speech-like utterances, according to Herring (2011:5). Across the commenting groups,
contracted forms were used in the vast majority of the potential places of contraction.
Ellipsis of subject and/or verb was most common in the young men’s comments, but very
rare in the data for the adult men. This implies that the young men use a style that is closer to
spoken English than that of the three other commenting groups.
A matrix describing where the four commenting groups are placed along the dimensions
informal-formal and nonstandard-standard is shown in Figure 5. The young women are
placed in the first quadrant as their data contains many nonstandard features and indicates
an interpersonal writing style. They thus seem to represent the writing style closest to the
picture of internet language conveyed by the mass media. Their data shows the highest rates
of use of nonstandard typography and orthography, such as the use of emoticons and
acronyms, the use of lowercase letter i for I and the omission of apostrophes in contractions –
all features that are generally considered typical of internet language. In addition to this, the
analysis of the young women’s data in LIWC shows the highest use of social words and
positive emotions and values that overall represent a personal writing style. The fact that the
young women are high in the use of typical typographic and orthographic features and at the
same time show percentages at word level close to the LIWC references for a personal text,
indicates informality in the language of the comments. Other signs of an informal writing style
are the use of the friendly as well as the aggressive tone, showing that the content of the
comments is emotional.
Figure 5. Matrix describing language use on the internet for four commenting groups.
In contrast to the young women’s writing style, the adult men’s writing style is placed in the
fourth quadrant of the matrix. The adult men’s comments show many formal and standard
writing features. In the LIWC analysis they contain the highest rates of big words, negative
emotions and articles and in addition to that the comments have the lowest rate of
fragmented sentences, as well as the highest usage of uncontracted negated auxiliaries. Thus,
the overall analysis of the adult men’s comments indicates a writing style close to formal
writing, something which contradicts the picture of internet language conveyed by the
mass media. Though the adult men’s style is close to standard and indicates a formal writing
style, there are two things that contradict this picture: first, the results show that the adult
men use contracted forms more often than uncontracted forms and second, it is only in the
adult men’s comments that expletives are found.
The adult women’s and young men’s commenting groups are placed in between the young
women and the adult men in the matrix. In comparison to young women and adult men, the
comments by adult women show some formal features and other features considered typical
of internet language. The formal features are the relatively frequent usage of big words and
the frequent usage of standard typography, and the infrequent, almost nonexistent, presence
of abbreviations and acronyms. The use of emoticons and the extensive use of a friendly tone
are features that are indications of an interpersonal writing style. All in all, adult women
seem to use a personal and friendly writing style characterized by standard spelling and
syntax.
Similarly, the data for the young men indicates a friendly and interpersonal writing style. The
young men’s comments contain the highest ratio of fragmented sentences, which must be
considered a marker of an informal writing style. The rather infrequent use of big words and
the frequent use of positive emotions, as well as the extensive use of the friendly tone are
indicators of an interpersonal writing style. Another finding is that the young men fall
between the LIWC references for formal and personal writing styles in self-references,
cognitive words and articles. Typography and orthography are close to standard in the young
men’s comments and this is attested by the relatively low usage of emoticons, abbreviations
and acronyms and the standard typography in pronouns. Fragmented sentences and the use
of contractions are other signs of informal written language and the data for the young men
shows the highest ratio as regards the use of ellipsis of subject and/or verb and full use of
contractions. Therefore it is possible to conclude that the young men’s syntax is more heavily
influenced by spoken English than that of the other three commenting groups.
As for future research, I suggest a study that analyzes user-generated content quantitatively
to find out more about the differences in commenting tone. In such a study knowledge of
discourse content is of interest, as well as the purpose of the user-generated content.
References
Ames, Melissa, & Himsel Burcon, Sarah (eds.). 2011. Women and language: Essays on
gendered communication across media. Jefferson, NC: McFarland & Co.
Atai, Mahmood Reza, & Chahkandi, Fatemeh. 2012. Democracy in computer-mediated
communication: Gender, communicative style, and amount of participation in
professional listservs. Computers in Human Behavior 28(3): 881-888.
Baron, Naomi S. 2004. See you online: Gender issues in college student use of instant
messaging. Journal of Language and Social Psychology 23(4): 397-423.
Baron, Naomi S. 2008. Always on: Language in an online and mobile world. New York, NY:
Oxford University Press.
Baron, Naomi S. 2009. Are digital media changing language? Educational Leadership 66(6):
42-46.
Braga, Adriana. 2011. Gender blogging: Femininity and communication practices on the
internet. In M. Ames & S. Himsel Burcon (eds.), 215-228.
Crystal, David. 2006. Language and the internet (2nd edition). Cambridge: Cambridge
University Press.
Dare, Julie. 2011. Women, kin-keeping, and the inscription of gender in mediated
communication environments. In M. Ames & S. Himsel Burcon (eds.), 185-198.
Herring, Susan C. 2003. Gender and power in on-line communication. In J. Holmes & M.
Meyerhoff (eds.), 202-228.
Herring, Susan C. 2007. A faceted classification scheme for computer-mediated discourse.
Language@internet 4(1). Available at
http://www.languageatinternet.org/articles/2007/761/Faceted_Classification_Scheme_for_CMD.pdf
[Accessed December 7, 2012].
Herring, Susan C. 2011. Grammar and electronic communication. Preprint version retrieved
from: http://ella.slis.indiana.edu/~herring/e-grammar.2011.pdf. [Accessed November
10, 2012].
Herring, Susan C., & Kapidzic, Sanja. 2011. Gender, communication, and self-presentation in
teen chatrooms revisited: Have patterns changed? Journal of Computer-Mediated
Communication 17(1): 39-59.
Holmes, Janet, & Meyerhoff, Miriam (eds.). 2003. The handbook of language and gender.
Oxford: Blackwell Publishing.
Hård af Segerstad, Ylva. 2002. Use and adaptation of written language to the conditions of
computer-mediated communication. Göteborg: Göteborgs universitet.
Leurs, Koen, & Ponzanesi, Sandra. 2011. Gendering the construction of instant messaging. In
M. Ames & S. Himsel Burcon (eds.), 199-214.
Squires, Lauren. 2010. Enregistering internet language. Language in Society 39(4): 457-492.
Sundqvist, Pia. 2009. Extramural English matters: Out-of-school English and its impact on
Swedish ninth graders’ oral proficiency and vocabulary. Dissertation. Karlstad:
Karlstad University Studies 2009:55.
Tagliamonte, Sali A., & Denis, Derek. 2008. Linguistic ruin? LOL! Instant messaging and
teen language. American Speech 83(1): 3-34.
Thurlow, Crispin. 2006. From statistical panic to moral panic: The metadiscursive
construction and popular exaggeration of new media language in the print media.
Journal of Computer-Mediated Communication 11(3): 667-701.
Walther, Joseph B., & Jang, Jeong-woo. 2012. Communication processes in participatory
websites. Journal of Computer-Mediated Communication 18(1): 2-15.
Appendix A: Emoticons found in the corpus
Emoticon found in the adult women’s data:
Smileys:
:)
Emoticons found in the young men’s data:
Smileys:
:)
:-)
(:
^_^
Winkies:
;P
;)
;-)
(;
Other:
:/
:(
D:
-_-
;D
Emoticons found in the young women’s data:
Smileys:
:)
:-)
(:
(((:
Winkies:
;P
;)
;-)
(;
Other:
:/
D:
-_-
*_*
^_^
<33333
♥
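As an illustration only, the inventory above could be turned into a simple counter for emoticons in a comment. The sketch below is not the procedure used in this study; the names EMOTICONS and count_emoticons are hypothetical, and the sample comment is invented for demonstration.

```python
import re
from collections import Counter

# Emoticon inventory copied from Appendix A (all three groups combined).
EMOTICONS = [":)", ":-)", "(:", "(((:", "^_^", ";P", ";)", ";-)", "(;",
             ":/", ":(", "D:", "-_-", ";D", "*_*", "<33333", "♥"]

# Try longer emoticons first so that "(((:" is matched as one unit
# instead of being cut down to the shorter "(:".
_PATTERN = re.compile(
    "|".join(re.escape(e) for e in sorted(EMOTICONS, key=len, reverse=True))
)

def count_emoticons(comment: str) -> Counter:
    """Return how often each listed emoticon occurs in a comment.

    Assumption: plain substring matching is good enough. It is not for
    real data, e.g. ":/" would also match inside "http://".
    """
    return Counter(_PATTERN.findall(comment))

# Invented sample comment, not taken from the corpus.
sample_counts = count_emoticons("love it (((: see you ;) <33333")
```

Because matching is purely string-based, any automatic count of this kind would still need manual checking against the comments themselves.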
Appendix B: Table of additional contractions
Table 6. Frequency of contracted forms in user-generated comments (the
number in brackets shows the instances in which the apostrophe has been left
out)

              I am  I'm/im    I will  I'll/Ill  I would  I'd/Id  you are  you're/your  it is  it's/its
Adult men     4     8         4       2         4        3       2        2            4      18
Adult women   15    22        4       2         3        1       9        7            8      19
Young men     3     34 (6)*   4       10        7        4       1        10 (2)       3      37 (6)
Young women   15    25 (2)    3       1         11       3 (1)   6        7 (4)        3      38 (11)
Total         37    89 (8)    15      15 (0)    25       11 (1)  18       26 (6)       18     112 (17)

*Four of the instances where the apostrophe had been left out were written by the same person.
Note that the words your and its are used as contracted forms for you are and it is. The
instances where the words were used as possessive pronouns are not included in these
numbers.
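
The counts in Table 6 can be summarized as a share of contracted forms per commenting group. The following is a minimal sketch restricted to the five pairs in Table 6, so it does not reproduce the ratios reported in the main text; the names TABLE_6 and contraction_share are hypothetical, and the bracketed counts for missing apostrophes are not used.

```python
# Counts copied from Table 6 as (full form, contracted form) per pair.
# The bracketed apostrophe-less counts are left out of this sketch.
TABLE_6 = {
    "Adult men": {
        "I am": (4, 8), "I will": (4, 2), "I would": (4, 3),
        "you are": (2, 2), "it is": (4, 18),
    },
    "Adult women": {
        "I am": (15, 22), "I will": (4, 2), "I would": (3, 1),
        "you are": (9, 7), "it is": (8, 19),
    },
    "Young men": {
        "I am": (3, 34), "I will": (4, 10), "I would": (7, 4),
        "you are": (1, 10), "it is": (3, 37),
    },
    "Young women": {
        "I am": (15, 25), "I will": (3, 1), "I would": (11, 3),
        "you are": (6, 7), "it is": (3, 38),
    },
}

def contraction_share(pairs: dict) -> float:
    """Contracted forms divided by all forms (full plus contracted)."""
    full = sum(f for f, _ in pairs.values())
    contracted = sum(c for _, c in pairs.values())
    return contracted / (full + contracted)

# Share of contracted forms per commenting group, Table 6 pairs only.
shares = {group: contraction_share(pairs) for group, pairs in TABLE_6.items()}
for group, share in shares.items():
    print(f"{group}: {share:.2f}")
```

For these five pairs, the young men show the largest share of contracted forms, which is in line with the conclusion drawn above about their syntax.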