self-presentation in online dating – an analysis of behavioural diversity

Association for Information Systems
AIS Electronic Library (AISeL)
PACIS 2016 Proceedings
Pacific Asia Conference on Information Systems
(PACIS)
Summer 6-27-2016
SELF-PRESENTATION IN ONLINE DATING
– AN ANALYSIS OF BEHAVIOURAL
DIVERSITY
Martin Haferkorn
Goethe University Frankfurt, [email protected]
Moritz Christian Weber
Goethe University Frankfurt, [email protected]
Recommended Citation
Haferkorn, Martin and Weber, Moritz Christian, "SELF-PRESENTATION IN ONLINE DATING – AN ANALYSIS OF
BEHAVIOURAL DIVERSITY" (2016). PACIS 2016 Proceedings. 371.
http://aisel.aisnet.org/pacis2016/371
This material is brought to you by the Pacific Asia Conference on Information Systems (PACIS) at AIS Electronic Library (AISeL). It has been
accepted for inclusion in PACIS 2016 Proceedings by an authorized administrator of AIS Electronic Library (AISeL). For more information, please
contact [email protected].
SELF-PRESENTATION IN ONLINE DATING – AN ANALYSIS
OF BEHAVIOURAL DIVERSITY
Martin Haferkorn, Goethe University Frankfurt, Germany, [email protected]
Moritz Christian Weber, Goethe University Frankfurt, Germany, [email protected]
Abstract
Human communication experiences a major shift towards virtual interactions and social networks.
These virtual environments enable users to present themselves to a virtual audience. In this paper we
analyze a unique online dating data set from a mobile application, which allows observing the user’s
diversity in terms of gender and sexual orientation and their individual and environmental influences.
Based on the research streams on impression management, online dating, diversity of gender and
sexual orientation we derive hypotheses on similarity and differences in the group behavior. We also
analyze how deviations from the group mean behaviors affect the level of self-presentation. Our results indicate that gender research requires a more diverse perspective when analyzing male
and female behavior. We find first evidence that deviations from group behavior are related to the
emotional level of the user’s self-presentation, whereas the information content shared with other
users does not seem to be affected. Our results further indicate that individual and environmental
influences have an effect on the amount of shared information as well as the emotional level of
self-presentation.
Keywords: Impression Management, Online Dating, Sexual Orientation, Dating Platform
1 Introduction¹
In recent years, an ongoing shift from direct, personal communication and social circles towards
virtual interactions and social networks has become more and more apparent. In these social networks
people tend to present a better version of themselves and their lives (Zytko et al. 2014). These liberal
and open structures foster a freedom of choice in terms of self-presentation, including sexual
orientation and the selection of partners. Lim et al. (2008) argue that the use of today’s technological
opportunities enables so-called “minority members” to detach themselves from their typical
stereotypes, in particular in their impression management strategies. The limited and abstract nature
of online dating profiles leads to information disclosure and helps users to categorize profiles into
archetypes and stereotypes of similar behaviors (Tene 2012).
Online dating can be understood as a special case of self-presentation and impression management in
the field of social networks and systems. Besides the problems of privacy (Wilson et al. 2014) and
information disclosure, users face the selection problem of whom to choose from the pool of potential
partners based only on online profiles. Additionally, intimacy becomes relevant in this special case
(Jøsang et al. 2005). Therefore, users try to find potential mates that match their personal preferences
based on their self-presentation (Krämer & Winter 2008). According to Lim et al. (2008) impressions
are unlikely to change once they are formed. As the initial impression matters, users evaluate the
given information and behavior for aspects like authenticity and congruence in relation to the given
information and the expected behavior (Wang et al. 2014).
In this paper we investigate a unique online dating data set that highlights self-presentation constructs,
textual information and media data to analyze its relation with user-specific attributes like gender,
sexual orientation, age and urbanization. This data set gives a unique perspective into the behavior of
online dating users concerning their impression management and self-presentation. These insights are
not exclusively applicable to online dating, but might be transferable to general personal information
management systems. As our data set allows a differentiation of gender and sexual orientation effects,
we expect to gain insights into the behaviors of online dating users. If behaviors can be identified as
norms for groups of users, these stereotypes can help to understand users who differ from them and
might therefore be perceived differently by other users. If those individuals can be identified
as deviating from the group means, then it is interesting to analyze whether their individual behavior
compensates for the differences. Therefore, we ask the following two research questions:
How do gender and sexual orientation influence group behaviors in online dating information systems?
How does a deviation from group behavior norms influence self-presentation in online dating information systems?
The paper is structured as follows. In the next section we give an overview of the theoretical
background and related work on impression management, online dating, and research on gender and
sexual orientation. On this basis we identify research gaps, which are addressed in the research design
in the following section. Based on related research, we derive hypotheses on the similarities and
differences in group behaviors. Section 4 outlines the methodology and introduces the regression
analysis setup. Descriptive statistics of our data set and results of the regression analysis are
presented in the following section. These results are discussed afterwards in section 6, including
implications for literature and for practice. We conclude this paper in the last section by highlighting
potential limitations and further research and by giving final remarks.
¹ We are open to any kind of suggestion (in particular to increase neutrality) and apologize if personal
feelings are unintentionally hurt by any phrasing or indication in this investigation.
2 Theoretical Background and Related Work
2.1 Impression Management
Impression management is an important factor when interacting with others, as impressions are
unlikely to change once they are formed (Lim et al. 2008). In the following section we review
literature on impression management in the business context (I), in social networks (II) and on
distortions of impression management (III).
(I) Impressions are especially important in long-term relations such as those in the business context.
Here, Jeffery et al. (2007) find that lower performing employees tend to apply impression management
by setting personal goals that are visible to others. Further, Higdon (2008) investigates judicial
audiences. He finds that strong messages with strong non-verbal communication can influence the
advocates’ effectiveness. Non-verbal techniques and visual impressions influence the perceived
attractiveness and dominance. To analyze the underlying factors of impression management, Leary and
Kowalski (1990) build a two-component model that splits impression management into impression
motivation and impression construction. They find that impression construction is determined, amongst
others, by the current social image as well as desired and undesired identity images.
(II) In addition to the business context, impression management becomes relevant in private and
virtual settings. Lim et al. (2008) find that strategies for successful impression management in
virtual environments differ from those in less virtual settings. Krämer and Winter (2008) analyze
social networks. They find that efficient impression management is related to the number of virtual
friends, the level of profile detail and the style of personal photos. Kurian (2015) analyzes the
motivations for user-generated content on Facebook. He finds that self-presentation and relationships
are the main drivers, while users risk a loss of privacy, security problems and identity theft. Posting
textual messages supports impression management, enjoyment and relationships, while also risking
conflicts and emotional distress. Pike et al. (2012) find that individuals on social networks tend to
segment their self-presentation depending on the audience on those sites. Wilson et al. (2014) find
evidence for the hypothesis that social networks are used for impression management and that those
management capabilities can be seen as a strong counterbalance to privacy concerns.
(III) In general, social network users do not seem to provide untruthful information about themselves.
Zytko et al. (2014) analyze impression management in online dating. They find that users do not try to
trick potential partners in order to appear more attractive. Instead they highlight their positive
attributes. They also observe that the frustration in online dating results from insecurities about how
other users perceive them and why communication ended unexpectedly. Sezer et al. (2015) investigate
braggers in social networks. They confirm bragging as an effective strategy for self-promotion, but
also find that people malign braggers. Additionally, they find that, with any kind of brag, sincerity is
rewarded more highly than bragging. However, differences between virtual and physical
self-presentation indicate personality weaknesses (Wang et al. 2014).
2.2 Gender Differences
2.2.1 General Research
With our study, we add to the realm of gender differences in Information Systems (IS). Our review is
complemented by studies in the field of psychology. In general, we include the studies that fit into
the scope of our paper: textual differences among genders (I) and how engagement with IS depends on
gender (II).
Concerning (I), most of the work was conducted by analyzing various aspects of the microblogging
service Twitter. Bamman et al. (2014) analyze the differences in linguistic style between genders
using a cluster algorithm. They find that topics are gender-dependent. Regarding this content,
Soedjono (2012) finds that female tweets are more self-centered. Similar results are reported by Walton
and Rice (2013), who analyzed tweets with regard to mediated disclosure. For example, female users
exhibit a higher information disclosure than their male counterparts. Based on textual features, Cunha
et al. (2014) observe that male tweets are more likely to contain imperative verbal forms, while female
tweets contain more declarative forms.
(II) With regard to gender-dependent behavior, Slyke et al. (2010) analyze adoption behavior in an
e-commerce setting. They observe that male users prefer online shops as they have a relative
technological advantage over classic shops, while perceived compatibility with needs and values is
more important for female users. Hwang (2010) finds that social norms play a bigger role for female
users, while male users are influenced by enjoyment when adopting new e-commerce systems. Lin et al.
(2013) further observe that female users value vividness and diagnosticity, while male users place more
value on product presentation for a purchase intention. Trauth and Quesenberry (2005) observe that in
social networks individual inequalities influence the behavior of women when working in IT contexts.
2.2.2 Online Dating
The scholarly work on online dating focuses on the user behavior, in particular how users engage with
each other (I), with the platform (II) and how they present themselves (III).
(I) A major concern is the reply behavior on these platforms as it is crucial for a potential future
interaction. By analyzing the messaging behavior, Schoendienst and Dang-Xuan (2011) find a connection
between the message properties and the likelihood of a response. In particular, they discover that, in
contrast to women, response rates for men increase if they write longer messages. Conversely, women
receive more responses if they write short messages. In addition, women reply more selectively to
messages than men (Fiore et al. 2010). With a machine learning approach, Xia et al. (2014) find that
features from the user profiles, e.g. profile and preference, as well as graph-based features, e.g.
similarity and the number of followers, predict the reply behavior almost equally well. Aretz (2015)
investigates the usage of the mobile application Tinder. Via a survey she concludes that users engage
with the platform for amusement, self-validation, comfort, communication, flirting and sexual reasons.
(II) Considering the second stream, i.e. how users engage with the platform, Umyarov et al. (2013)
analyze how an anonymity feature (invisible visits to other profiles) alters user behavior. They observe
that the anonymity feature increases profile visits. Jung et al. (2014) analyze the usage of mobile
dating applications compared to web pages. They observe that users prefer mobile dating applications to
their classical counterparts. Further, they unveil gender-specific patterns in the usage of mobile
dating applications: female users were not only able to achieve more matches but also got more replies
per message than their male counterparts.
(III) Regarding the third stream, self-representation, Guadagno et al. (2012) observe by means of a
questionnaire that male participants increase self-presentation when future interactions, for example a
potential date, become more likely. Considering self-disclosure, Gibbs et al. (2006) show that the
perceived online dating success can be explained by the level of self-disclosure, which they measure by
honesty, amount, intent and valence. Further, Whitty (2008) observes that the level of
self-disclosure differs in an online dating setting compared to traditional ways of dating. Hancock et
al. (2007) analyze deception behavior within online dating platforms and find, similar to Zytko et al.
(2014), that in nine out of ten cases users lie about their properties; in particular, men exaggerate
their height and women understate their weight. Chidambaram et al. (2008) find that males and females
do not differ in the number of self-promotion tactics they use.
2.2.3 Sexual Orientation
IS research in the area of sexual orientation is not very widespread. In line with the previous
scholarly work, it focuses on explicit properties such as textual differences. Groom and Pennebaker
(2005) analyze the linguistic features of online personal advertisements. Their sample allows them to
distinguish between advertisements of heterosexual individuals, gay men and lesbian women². They find
no linguistic difference based on the sexual orientation in personal advertisements, but that, in
contrast to heterosexual individuals, homosexual individuals do not try to differentiate themselves
from their potential mates. However, concerning most linguistic features, such as words per sentence
and question marks, they observe no effect of sexual orientation. Still, they observe a significant
effect for the number of words. Bower (2008) analyzes the gender and sexual orientation specific effect
on orientation claims and harassment. He finds that people with a homosexual preference are more likely
to pass off sexual orientation claims as gender discrimination.
3 Research Design
Literature does not show consistency in terms of the behavior of male and female users as well as in
terms of their sexual orientation. Woods and Harbeck (2008) analyze the identity management
strategies of lesbian educators and find that they conceal their sexual orientation by distancing
themselves from homosexuality in order to be perceived as heterosexual. Schiller et al. (2011) analyze
virtual collaborations in the online game Second Life. They observe that male and female groups of two
behave differently in terms of impression management to achieve a specific team outcome.
Venkatsubramanyan and Hill (2009) document a gender difference in the perception of social network
presence. Females are found to be more favorably impressed than males when finding a desired person
with an online social profile. According to Kite and Deaux (1987) “people do subscribe to an implicit
inversion theory wherein male homosexuals are believed to be similar to female heterosexuals, and
female homosexuals are believed to be similar to male heterosexuals. These results offer additional
support for a bipolar model of gender stereotyping, in which masculinity and femininity are assumed
to be in opposition.” Consequently, we derive the following two working hypotheses:
H1: Profiles of heterosexual men and lesbian women are similar in online information systems.
H2: Profiles of heterosexual women and gay men are similar in online information systems.
Given these assumed differences and similarities in the mentioned groups, it is questionable whether
differences in the common group behaviors can explain effects of self-presentation. In a business
context, Benthaus (2014) finds that social media messages with higher sentiment can positively
influence corporate reputation. John and Robins (1994) analyze self-enhancement and
self-diminishing in managerial group discussions. They find that both are strongly related to
narcissism. For the professional context, Friesen and Weller (2006) investigated analyst earnings
forecasts. They find strong evidence that analysts with more information are overconfident about the
precision of their information and influenced by a cognitive dissonance bias. Hobson et al. (2012)
analyzed financial misreporting and CEO speeches. They find that vocal dissonance is positively related
to the possibility of irregularity restatements and is therefore an indicator for misreporting. As
evidence from the non-business context is sparse, we derive the following working hypotheses for
sentiment and information content from the outlined literature to investigate whether there are
differences in the non-business and hedonistic context:
H3: Higher deviations from the common group behavior result in more extreme sentiment.
H4: Higher deviations from the common group behavior result in lower information content.
Finally, Trauth (2013) observes that gender-variations are influenced by individual identity, individual
influences and environmental influences. Consequently, we include individual influences like age and
environmental influences like urbanization (distance to the city center) as control variables to our
analysis.
² The terms heterosexual individuals, gay men and lesbian women used in this paper are based on the “Guidelines for Psychological Practice with Lesbian, Gay, and Bisexual Clients” of the American Psychological Association (2012).
4 Methodology and Data
4.1 Hypotheses Tests and Regression Analysis
To investigate our research hypotheses, we employ OLS regressions (Greene 2007) on the data set
outlined in the next section. In order to identify the influences of gender and sexual orientation,
each user is coded by dummy variables into one of the following groups: heterosexual men, gay men,
heterosexual women and lesbian women. Additionally, we control for individual and environmental
influences by including information on age and distance to the city center (an indicator for
urbanization). For further investigation and for the sake of robustness, we estimate the regression
with and without a second content variable. As the second content variable we use the number of photos
as a media affinity proxy in the text length model and, likewise, text length as a media affinity proxy
in the photos model. This gives indications of the user’s media affinity as well as of the robustness
of the gender and sexual orientation effects. The regression setup to test H1 & H2 for text length is as follows:
The regression setup to test H1 & H2 for the number of photos is as follows:
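The two specifications can be sketched as follows. This is only a plausible form, assuming the regressors are exactly those reported in Table 3 and that lesbian women serve as the omitted benchmark group; the coefficient symbols and variable names are notation introduced here and are not taken from the original.

```latex
% Assumed notation: i indexes user profiles; heteroMen, heteroWomen and gayMen are 0/1 dummies;
% lesbian women are the omitted benchmark group.
\begin{align}
\text{textlength}_i &= \beta_0 + \beta_1\,\text{heteroMen}_i + \beta_2\,\text{heteroWomen}_i
  + \beta_3\,\text{gayMen}_i + \beta_4\,\text{distance}_i + \beta_5\,\text{age}_i
  + \beta_6\,\text{photos}_i + \varepsilon_i \\
\text{photos}_i &= \gamma_0 + \gamma_1\,\text{heteroMen}_i + \gamma_2\,\text{heteroWomen}_i
  + \gamma_3\,\text{gayMen}_i + \gamma_4\,\text{distance}_i + \gamma_5\,\text{age}_i
  + \gamma_6\,\text{textlength}_i + u_i
\end{align}
```

In this reading, the variants without the media affinity proxy simply drop the $\beta_6$ and $\gamma_6$ terms.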
We check for multicollinearity within our estimation by analyzing the correlations between our variables. The highest variance inflation factor (VIF) for our variables is 2. Thus, we dismiss
multicollinearity as a potential issue in our estimation.
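The VIF check can be illustrated with a small sketch. It computes VIF_j = 1 / (1 - R_j²) by regressing each regressor on all others; the function name and the synthetic data are assumptions for illustration only and do not reproduce the data or tooling of the study.

```python
import numpy as np

def variance_inflation_factors(X):
    """Return the VIF of each column of X (rows = users, columns = regressors).

    VIF_j = 1 / (1 - R_j^2), where R_j^2 is obtained by regressing
    column j on all other columns plus an intercept.
    """
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    vifs = []
    for j in range(k):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        A = np.column_stack([np.ones(n), others])   # intercept + remaining regressors
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        r2 = 1.0 - (resid @ resid) / ((y - y.mean()) ** 2).sum()
        vifs.append(1.0 / (1.0 - r2))
    return np.array(vifs)

# Synthetic stand-in data (NOT the data set of the paper)
rng = np.random.default_rng(0)
age = rng.normal(26, 6, size=1000)
distance = rng.exponential(35, size=1000)
photos = rng.integers(0, 10, size=1000).astype(float)
vifs = variance_inflation_factors(np.column_stack([age, distance, photos]))
```

Values close to 1 indicate nearly uncorrelated regressors; common rules of thumb only become concerned at values of 5 or 10.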
Previous studies on gender and confidence deviations analyzed absolute deviations from group means to
investigate behaviors that are untypical for a group (Postma et al. 2004; Clark & Grandy 1984; Kirchler &
Maciejovsky 2002). To calculate the deviation from the common group behaviors, we likewise calculate the
absolute deviations in text length and number of photos, which measure how much individuals
differ from the mean group behavior. Here, $\bar{x}$ is the mean of a group and $x_i$ is the observation
for user $i$ in this group:
$d_i = |x_i - \bar{x}|$
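A minimal sketch of this deviation measure, assuming each profile carries a group label and a numeric value such as text length. The function name and the toy values are hypothetical.

```python
from collections import defaultdict

def absolute_deviations(values, groups):
    """Return |x_i - mean of the group that user i belongs to| for every user."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for x, g in zip(values, groups):
        sums[g] += x
        counts[g] += 1
    means = {g: sums[g] / counts[g] for g in sums}
    return [abs(x - means[g]) for x, g in zip(values, groups)]

# Hypothetical toy profiles: text length in characters and group label
text_length = [100, 140, 90, 130]
group = ["gay men", "gay men", "lesbian women", "lesbian women"]
text_deviation = absolute_deviations(text_length, group)
```

Because the measure is absolute, a profile that is 20 characters shorter than its group mean and one that is 20 characters longer receive the same deviation.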
This deviation measure aims at explaining the relationship between deviations, individual influences
and environmental influences on the sentiment and the information content (entropy) of the profile
texts. The regression setup for H3 & H4 in terms of sentiment and entropy is as follows:
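One plausible way to write these two specifications, assuming the regressors are exactly those reported in Table 4. The coefficient symbols and variable names are notation introduced here, with textdev and photodev denoting the absolute deviations from the group means of text length and number of photos.

```latex
% Assumed notation: i indexes user profiles; textdev_i and photodev_i are absolute deviations
% of user i from the respective group mean.
\begin{align}
\text{sentiment}_i &= \delta_0 + \delta_1\,\text{textdev}_i + \delta_2\,\text{photodev}_i
  + \delta_3\,\text{distance}_i + \delta_4\,\text{age}_i + \varepsilon_i \\
\text{entropy}_i &= \theta_0 + \theta_1\,\text{textdev}_i + \theta_2\,\text{photodev}_i
  + \theta_3\,\text{distance}_i + \theta_4\,\text{age}_i + u_i
\end{align}
```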
4.2 Data Set
For our analysis we obtained a data set consisting of 82,012 user profiles from an international dating
platform. We chose New York City (NYC) as the location for our investigation as it has been reported
that such dating platforms are widely used in this city (Vanityfair 2015). Additionally, this allows
us to obtain a sample containing a great variety of users regarding their cultural background, thus
lessening cultural bias. The profiles used to download the data were placed at the Empire State
Building in NYC. We chose Manhattan as the reference point for our investigation as it has the highest
urbanization; thus we expect many users in this area. We expect that effects depend on the urbanization,
which decreases with the distance from the Empire State Building (Figure 1).
[Figure 1 plot: relative number of users (y-axis, 0% to 14%) against distance in miles from the Empire State Building, NYC (x-axis)]
Figure 1: Relative number of users by distance. The figure shows an unequal distribution of users
within the NYC area. The number of users is higher within the city center (20 miles) compared
to the suburbs showing the high urbanization within NYC.
A profile consists of a short text, user pictures, age, distance and time of last activity. While the
first two properties can be set by the user, the last three are automatically assigned. It is
interesting to note in this context that, despite the superficial reputation of these platforms drawn
by the media (The Guardian 2013), users tend to give further textual information along with their
profiles and do not rely on pictures only. Additionally, we find in our data that the relative shares of
men and women differ depending on the distance from the city center (Figure 2):
[Figure 2 plot: relative number of women (y-axis, 0% to 100%) against distance in miles from the Empire State Building, NYC (x-axis)]
Figure 2: Relative number of women (compared to men) by distance. The figure shows that women are
strongly outnumbered by men within the NYC city center. This effect gradually reverses by distance.
We received the data as a JSON data structure (Bray 2014) and transformed it into a cross-sectional
table. Additionally, dummy variables for the sexual orientation were included, which were derived
from the request-response pattern via a male and a female user profile. The sexual preference of these
two user profiles was set to both genders. Same-gender user suggestions indicate homosexual
preferences of the suggested user and different-gender suggestions indicate heterosexual preferences of
the suggested user. Finally, we count the number of provided picture URLs to measure the number of
pictures (neglecting links to Instagram profiles) and calculate profile text features like sentiment
based on the General Inquirer Dictionary (Stone et al. 1962) as well as entropy rate (Shannon and Weaver 1963).
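The feature derivation described above can be sketched as follows. The JSON field names (`gender`, `bio`, `photos`) and all function names are hypothetical, and the character-level Shannon entropy is only one possible reading of "entropy rate", as the exact estimator is not specified. The dictionary-based sentiment score is omitted because it requires the General Inquirer word lists.

```python
import json
import math
from collections import Counter

def infer_orientation(seed_gender, suggested_gender):
    """Label a suggested user based on the seed profile that received the suggestion.

    Assumes the seed profile searches for both genders, so a same-gender
    suggestion implies that the suggested user looks for the same gender.
    """
    return "homosexual" if seed_gender == suggested_gender else "heterosexual"

def count_photos(profile):
    """Number of picture URLs in a profile (the 'photos' key is an assumption)."""
    return len(profile.get("photos", []))

def shannon_entropy(text):
    """Character-level Shannon entropy in bits; 0.0 for an empty text."""
    if not text:
        return 0.0
    n = len(text)
    return sum((c / n) * math.log2(n / c) for c in Counter(text).values())

# Hypothetical record as it might arrive for the male seed profile
raw = '{"gender": "male", "bio": "abab", "photos": ["https://example.org/1.jpg", "https://example.org/2.jpg"]}'
profile = json.loads(raw)
row = {
    "orientation": infer_orientation("male", profile["gender"]),
    "photos": count_photos(profile),
    "text_length": len(profile["bio"]),
    "entropy": shannon_entropy(profile["bio"]),
}
```

Each such `row` would correspond to one line of the cross-sectional table.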
5 Descriptive and Empirical Results
5.1 Descriptive Results
5.1.1 Gender and Sexual Orientation
Table 1 gives an overview of the gender and sexual orientation structure of the data set used in our
study. Our data set consists of 82,012 dating profiles. 53,575 (65.3%) of the profiles belong to male
users and 28,437 (34.7%) to female users. As in previous research, we have a surplus of male users
(Hitsch et al. 2010). Taking the sexual orientation into consideration, we observe that our sample
contains 41,460 (50.6%) heterosexual individuals and 40,552 (49.4%) homosexual individuals.
Interestingly, both groups are almost evenly distributed. These observations are in line with Conway et al.
(2015), who analyzed online personal advertisements. Thus our data set has a similar structure. Finally,
we observe pair-wise similar text length and photo means for heterosexual men and lesbian women as
well as for gay men and heterosexual women.
Total users: 82,012

                 men (53,575; 65.3%)            women (28,437; 34.7%)
orientation      gay          heterosexual      lesbian      heterosexual
users            26,216       27,359            14,336       14,101
share of total   32.0%        33.4%             17.5%        17.2%
text length      125.191      107.856           107.212      126.199
photos           4.895        5.173             5.190        4.893
Table 1. Number of users and their respective gender, orientation, text length and photos (mean).
5.1.2 Key Descriptives
The descriptive statistics of our data set are shown in Table 2. We also include variables that are not of
primary interest for our research model to provide a better overview over our sample.
Statistic      Mean      Std. Dev.   Min      Max          Unit
age            26.12     6.60        17.00    110.00       years
sentiment      0.43      0.58        -1.00    1.00         -
text length    119.40    112.72      0.00     4,640.00     characters
distance       35.22     317.19      1.00     11,620.00    miles
photos         4.99      1.30        0.00     9.00         number
entropy        3.51      1.41        0.00     8.32         -
Table 2. Key descriptives of our data set.
The average age in our data is 26.12 years with a standard deviation of 6.6 years, indicating that our
data set consists mainly of young users. Even though the maximum age is 110 years, we do not assume that
we have hundred-year-old users in our sample. As the dating platform obtains the age via the
corresponding Facebook profile, users with a wrong birth year set on Facebook appear with the wrong age
on the dating platform. We expect this to be the reason behind this observation.
Taking distance into consideration (which is measured in miles from the Empire State Building, NYC),
we observe that most of our users (mean=35; median=6) are located within NYC. However, some
users seem to be located far away (max=11,620). These users seem to use the premium feature which
allows them to swipe in any location independent of their current position. Taking the length of the
text (text length) into consideration, we observe that the average biography length is 119.4
characters, which is similar to the length of tweets (Bamman et al. 2014; Soedjono 2012). The sentiment
of our data set (where -1 is a very negative sentiment and 1 a very positive one) is positive on average
(mean=0.4).
5.2 Regression Results
The results of our regression models show an interesting effect: heterosexual women and gay men seem to
exhibit similar coefficients and equal significance levels and signs. A second regression with heterosexual men and lesbian women estimated for robustness exhibits a similar effect, when dropping either
lesbian women or gay men due to perfect multicollinearity. The low coefficients for heterosexual men
in Table 3 also indicate a similarity to the behavior of lesbian women (dummy variable was dropped
due to perfect multicollinearity and behaves as a benchmark in comparison to lesbian women (benchmark coefficient=0)).
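The reason one group dummy has to be dropped can be illustrated with a small sketch. With an intercept and one dummy per group, the dummy columns sum to the constant column, so the design matrix is rank-deficient. The toy values below are invented for illustration and are not taken from the data set.

```python
import numpy as np

# Group order (assumed for this sketch): 0 = heterosexual men, 1 = gay men,
# 2 = heterosexual women, 3 = lesbian women
labels = np.array([0, 1, 2, 3, 0, 1, 2, 3])
y = np.array([100.0, 120.0, 125.0, 101.0, 104.0, 124.0, 129.0, 103.0])  # toy text lengths

dummies = np.eye(4)[labels]                    # one 0/1 column per group
const = np.ones((len(labels), 1))

full = np.hstack([const, dummies])             # intercept + all four dummies
reduced = np.hstack([const, dummies[:, :3]])   # lesbian women dropped as benchmark

rank_full = np.linalg.matrix_rank(full)        # fewer than the 5 columns: perfect multicollinearity
rank_reduced = np.linalg.matrix_rank(reduced)  # equals the 4 columns: estimable

# With the benchmark dropped, the constant is the benchmark mean and each
# dummy coefficient is the difference between that group's mean and the benchmark.
coef, *_ = np.linalg.lstsq(reduced, y, rcond=None)
```

In this toy example the coefficient for heterosexual men is zero because their mean equals the benchmark mean, which mirrors how a small coefficient is read as similarity to the benchmark group.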
                     text length (H1 & H2)          photos (H1 & H2)
                     (1)            (2)            (3)             (4)
heterosexual men     0.796          0.622          -0.0170         -0.0160
heterosexual women   11.648 ***     8.237 ***      -0.3310 ***     -0.3190 ***
gay men              10.384 ***     6.989 ***      -0.3280 ***     -0.3180 ***
distance             0.001          0.002          0.0001 ***      0.0001 ***
photos               10.680 ***
text length                                        0.0010 ***
age                  2.544 ***      2.602 ***      0.0020 **       0.0050 ***
constant             -7.728 ***     46.317 ***     4.9930 ***      5.0600 ***
Observations         81,973         81,973         81,973          81,973
R2                   0.042          0.027          0.027           0.012
Adjusted R2          0.042          0.027          0.027           0.012
Table 3: Results of the OLS regression on text length and photos; *p<0.1; **p<0.05; ***p<0.01.
A similar 2x2 clustering is observable in the means of the descriptive results (see section 5.1.1).
Therefore, we cannot reject H1 and H2. Additionally, our results show an effect depending on age. In
particular, older users write significantly more (2.5-2.6 characters more per year of age
on average) and upload more pictures on average. Interestingly, the number of photos is slightly
related to the urbanization (distance), while the text length exhibits no distance effect. The results
of the regression are robust to including and excluding the opposite media proxy (text length and photos).
Further, we observe a significant effect for longer texts and photos. Considering the R2, we observe
that it roughly doubles after including our media proxies. Results remain robust and significant even
when they are excluded.
The results of our regression analysis for H1 and H2 seem robust. The R2 is quite low with 4.2% for
model (1), 2.7% for models (2) and (3) and 1.2% for model (4). These low values seem to be
reasonable due to the large number of factors that might be related to text length and the number of
photos. The variance inflation factor (VIF) of the variables gives no indication of multicollinearity.
                   sentiment (H3)     entropy (H4)
                   (5)                (6)
text deviation     -0.00010 ***       0.00010
photo deviation    0.00800 **         -0.00300
distance           0.00002 **         0.00002
age                0.00300 ***        -0.00020
constant           0.34500 ***        3.68200 ***
Observations       65,407             78,278
R2                 0.00200            0.00005
Adjusted R2        0.00200            -0.00000
Table 4: Results of OLS regression on sentiment and entropy; *p<0.1; **p<0.05; ***p<0.01.
For the analysis of deviations in the groups (heterosexual men, heterosexual women, gay men and
lesbian women) from the mean behavior in relation to sentiment and information content, we run a
second regression analysis. Results show an obvious difference between the estimated regressions for
H3 and H4: while all sentiment coefficients are significant, in our entropy model (6) only the constant
is significant. Thus, we do not reject H3, but reject H4. The two deviation coefficients have opposite
signs in the sentiment model. The absolute deviation from the mean text length shows a slightly
negative, significant influence. Every character more or fewer than the group mean in the text
description seems to decrease the sentiment by 0.0001, i.e. increase the negativity. Every unit of
difference in the number of photos from the group mean seems to increase the positivity by 0.008.
The level of urbanization (distance to the Empire State Building, NYC)
has the smallest coefficient. People living outside the city center seem to present themselves slightly
more positively, by 0.00002 per mile on average (model (5) in Table 4). Results show a similar effect
for older users with a high level of significance. For every additional year of age, users seem to
increase the positivity of their profile text by 0.003.
Because the values of our sentiment measure range only between -1 and 1, the coefficients
are rather small. The same holds for the independent variable of model (6), which lies between
1 and 100. Additionally, it should be noted that sentiment and information content can be influenced
by further factors not covered by our data set. This is reflected in the low R² of 0.2% for the sentiment
model (5) and of 0.005% for the information content model (6). The F statistic indicates overall
validity of the factors in the sentiment model (5), but not in the entropy model (6) (not reported
due to space limits). Observations had to be dropped when sentiment values or entropy rates could not be
calculated from the underlying data (n_sentiment = 65,407 and n_entropy = 78,278). The variance inflation
factors (VIF) of the variables give no indication of multicollinearity.
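For readers unfamiliar with the multicollinearity check, the variance inflation factor of a regressor j is VIF_j = 1 / (1 - R_j²), where R_j² comes from regressing that regressor on all other regressors. The following Python sketch shows the calculation on synthetic data. It is not the procedure or the data used in this paper: all variable names and the generated values are assumptions made for illustration, and a statistics library would normally replace the hand-written least squares.

```python
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def r_squared(y, columns):
    """R^2 of an OLS regression of y on the given columns plus an intercept."""
    n = len(y)
    design = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(design[0])
    xtx = [[sum(row[p] * row[q] for row in design) for q in range(k)] for p in range(k)]
    xty = [sum(design[i][p] * y[i] for i in range(n)) for p in range(k)]
    beta = solve_linear_system(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, row)) for row in design]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def variance_inflation_factors(columns):
    """Return VIF_j = 1 / (1 - R_j^2) for every column."""
    vifs = []
    for j, col in enumerate(columns):
        others = columns[:j] + columns[j + 1:]
        vifs.append(1.0 / (1.0 - r_squared(col, others)))
    return vifs


# Synthetic regressors (assumed values, not the paper's data).
random.seed(0)
n = 500
age = [random.gauss(30, 8) for _ in range(n)]
distance = [random.gauss(10, 5) for _ in range(n)]
text_dev = [random.gauss(100, 40) for _ in range(n)]
age_noisy = [a + random.gauss(0, 2) for a in age]  # nearly a copy of age

vif_independent = variance_inflation_factors([age, distance, text_dev])
vif_collinear = variance_inflation_factors([age, distance, age_noisy])
print(vif_independent, vif_collinear)
```

Independent regressors yield values close to 1, while the near-duplicate column drives the VIF of both affected regressors up, which is the pattern a multicollinearity check looks for.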
6 Discussion
6.1 Contributions to Theory
Trauth et al. (2013) criticize that individuals are often classified into just one of two groups: masculine
or feminine. This neglects the diversity of sexual orientations and often forces minorities to be subsumed
under heteronormativity or simply to be ignored. This study addresses this gap and applies a diversified
research design to online dating. We conduct our research in line with
Trauth’s individual differences theory of gender and IT (2005), which proclaims that the underrepresentation
of women in the field of IT restricts the diversity of IT-related products and services. We
contribute a diverse perspective not only on women, but also on men using IT services. Finally,
this study adopts constructs from the individual differences theory of gender and IT: individual identity
seems to be covered by a cross-sectional user data set, individual influences by variables like age,
and environmental influences by proxies like urbanization.
This paper indirectly builds on Social Information Processing theory (SIP) by giving evidence for
relations that it proposes (Hall et al. 2010). According to SIP, users can use the reduced channels of
media and communication to select whether they want to present or misrepresent themselves. Our study
confirms the effect and adds a potential explanation for why there are differences in presentation
and representation. This study gives first evidence that deviations from the group mean in self-presentation
are related to a more negative sentiment in self-representation. A very negative self-presentation might
be an indicator of deviation from the group mean and vice versa.
In the field of self-presentation and impression management, this study gives indication to reject several
hypotheses from the literature. In particular, the combination of gender and sexual orientation effects
gives evidence as to why males and females are found to behave similarly in some studies but differently
in others. Our research shows that opposite genders combined with different sexual orientations
seem to be a strong indicator of similar behaviors, while opposite genders combined with the same
sexual orientation seem to be a strong indicator of different behaviors. From the evidence of our
model it seems that gay men and heterosexual women on the dating platform prefer to present
themselves with longer texts and more pictures, while lesbian women and heterosexual men on the dating
platform prefer to present themselves briefly (Hall et al. 2010). Generally, we argue that gender
research must not neglect the influence of sexual orientation, as this might result in faulty indications
and misinterpretation of results.
For psychology in general, our results seem to extend the evidence for the hypotheses H1 and H2, which
were motivated by Kite and Deaux (1987). They assumed that people subscribe to an implicit inversion
theory: gay men are believed to be similar to female heterosexuals, and lesbian women are believed to be
similar to male heterosexuals. Our results seem to extend this evidence: beyond what people believe under
an implicit inversion theory, the actual behavior in terms of text length and number of pictures seems to
be similar for gay men and heterosexual women, and likewise for lesbian women and heterosexual men.
Our results cannot give evidence on other attributes of the platform users, as self-presentation on this
platform is limited to texts and pictures. The evidence seems to be supported in three ways: first, by the
similarities of means between the 2x2 groups; second, by the size, sign and significance of the
group-specific dummy variables; and third, indirectly, by the finding that the absolute deviation from
the group mean is associated with a change of sentiment.
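The group means and absolute deviations referred to here can be sketched in a few lines of Python. The example is a minimal illustration under stated assumptions: the field names, the group labels and the eight toy profiles are invented for demonstration and are not taken from the data set of this study.

```python
from collections import defaultdict

# Invented toy profiles, two per group of the 2x2 design (gender x sexual orientation).
profiles = [
    {"gender": "male", "orientation": "heterosexual", "text_length": 80},
    {"gender": "male", "orientation": "heterosexual", "text_length": 120},
    {"gender": "female", "orientation": "heterosexual", "text_length": 200},
    {"gender": "female", "orientation": "heterosexual", "text_length": 300},
    {"gender": "male", "orientation": "homosexual", "text_length": 220},
    {"gender": "male", "orientation": "homosexual", "text_length": 260},
    {"gender": "female", "orientation": "homosexual", "text_length": 90},
    {"gender": "female", "orientation": "homosexual", "text_length": 110},
]


def group_means(rows, field):
    """Mean of `field` for each (gender, orientation) group."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        key = (row["gender"], row["orientation"])
        sums[key] += row[field]
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}


def absolute_deviations(rows, field):
    """Absolute deviation of each profile from the mean of its own group."""
    means = group_means(rows, field)
    return [abs(row[field] - means[(row["gender"], row["orientation"])])
            for row in rows]


means = group_means(profiles, "text_length")
devs = absolute_deviations(profiles, "text_length")
for key, value in sorted(means.items()):
    print(key, value)
print(devs)
```

The resulting deviation values would then enter a regression as explanatory variables, with sentiment or entropy as the dependent variable.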
6.2 Contributions to Practice
Our contributions to practice are manifold. First, we provide initial insights into the distinct information
sharing behavior among the sexual orientations. A new online dating platform should be designed to
reflect these results. For instance, if the platform focuses on homosexual
users, it should allow users to upload more pictures.
Taking into consideration our urbanization proxy, i.e. the distance, we observe that users share more
pictures the farther away they are from the city center. Thus, dating platforms designed for more rural
areas might reflect this finding by offering more picture slots as well.
Similar observations can be made regarding age, as the amount of information shared (i.e. text and
photos) increases with the age of the user. Again, this can be incorporated into the design of an online
dating system. If the outlined steps are followed, the platform could potentially suit the needs of these
users better. This might attract more users, resulting in an increased market share. By incorporating
these design suggestions, online dating platforms might gain a distinct competitive advantage.
We also contribute to users of these platforms by outlining an ideal profile. In particular, we
give some guidance about the profile characteristics expected in their target peer group, which
might increase their chances of actually meeting the expectations of potential mates. For example, as
older people tend to provide more information about themselves on average, a behavior similar to that of
their peer group is advisable. The same might apply to information disclosure by people living in rural
areas.
7 Limitations, Further Research and Conclusion
7.1 Limitations
Even though we are confident that our methodology is appropriate for our research question, our approach
still faces some limitations. Our data set only consists of profiles in NYC. Despite the fact that
NYC is a multicultural hub, we cannot entirely rule out a local bias. Further, as we obtained the data
set from the mobile application, our research results might be biased by the selection criteria of
the algorithm utilized at the firm. It is also important to mention that there are other online dating apps
that focus more on gay and lesbian users. Although the user groups of these apps are not distinct, there might
be a selection bias for all of them. We selected our platform for this investigation because it
seems to be more heteroneutral, so the selection bias might be smaller. The data set is only a
cross-sectional snapshot, and distributions and behaviors might
change in the future. These limitations might restrict the generalizability of our results.
Even though our research included gender and sexual orientation to provide evidence on group behaviors,
the current setup does not cover the full diversity of sexual orientations. Most notably, bisexual users are not
considered in the current methodology. Also, users without any sexual orientation cannot be covered
by the data provided by the platform. Identification of bisexual users would require a change in the
request procedure, filtering and intersection of platform data, and is therefore a potential improvement
and an interesting gap for further research.
Additionally, other dating platforms might be included for robustness checks and cross-checks.
In general, we see all results as indications from statistical tests that seem to give evidence on the
behavior of users of our international dating platform. These results are not intended as any kind of
offense or harassment against any of the observed groups. We analyzed the data from a neutral and
inclusive perspective. We are open to any kind of suggestions (in particular to increase neutrality) and
apologize if personal feelings are unintentionally hurt by any phrasing or indication in
this investigation.
7.2 Further Research
We identify at least four kinds of potential further research: extending the content analysis, extending
sexual orientation diversity, triangulation with other methodologies and a shift to professional HR
platforms. For the extension of the content analysis, we would like to conduct a deeper analysis of further
textual features and a computational analysis of the photos (color schemes, face detection, etc.). To
address the mentioned limitation regarding sexual orientation diversity, we plan to modify our crawling and
pre-processing tool chain to be able to identify mixed sexual preferences, i.e. those of bisexual users.
Currently, our analysis relies on a passive, empirical regression analysis of available user profiles. An
interesting extension could be to combine this data with actively collected user data for triangulation.
Potential approaches might be anonymized interviews and the analysis of matching statistics of the
interviewed users.
Finally, our analysis observes user behaviors in a non-professional, hedonistic environment. Especially
for the analysis of gender and sexual orientation behavior in professional settings, it might be
interesting to extend the current investigation to further platforms. For instance, there are job search apps
that enable users to match with job offers using the same user interface and workflow as our
platform. We do not expect to find sexual orientation data on such platforms, but if the other variables
behave similarly in the professional context, this would indicate that sexual orientation behavior might also
be relevant in the HR context, even though it is not observable in the job application data.
7.3 Conclusion
In this study we answer the two research questions regarding the behavior of groups in online dating
systems that were defined in the introduction. They deal with the influence of gender and sexual
orientation on group behaviors and the influence of deviations from group behavior on self-presentation.
In this paper, the insights from the literature on impression management, online dating, gender and
sexual orientation are applied to an online dating data set from our dating platform. The data
captures self-presentation with text and photos, gender and urbanization, and seems to allow
analyzing gender groups in relation to their attribution.
The analysis gives evidence that people do not only subscribe to an implicit inversion theory on behavior,
but that in terms of text length and number of pictures gay men and heterosexual women seem to
behave similarly, and lesbian women and heterosexual men seem to behave similarly as well. Using three
different tests (means, regression with dummy variables and R² of the deviation model (H3)) gives
indication of the robustness of this insight. Additionally, this study sheds light on deviations from the means
of the various groups (heterosexual men, heterosexual women, gay men and lesbian women). The
results of the second empirical regression analysis indicate that users tend to be more negative
(lower sentiment) if the deviation from the mean group behavior increases. This effect cannot be
statistically supported for the entropy of profile texts, which indicates that the information content is not
affected by deviations from group means.
Finally, the research questions on male and female behavior cannot be answered without regard to
sexual orientation. Male and female users seem to show similar behaviors in self-presentation if they
have different sexual orientations. They seem to differ if they have the same sexual orientation. Results
show that deviations from typical group behavior can result in a slightly more negative self-presentation.
In addition, we find that older users living outside of major cities tend to be less reserved about
sharing self-presentation content and in general present themselves with a slightly more positive sentiment.
We conclude this study with a call for broader diversity and more inclusion in information systems
research.
References
American Psychological Association (2012). Guidelines for Psychological Practice with Lesbian, Gay,
and Bisexual Clients. Working Paper.
Aretz (2015). Match me if you can: Eine explorative Studie zur Beschreibung der Nutzung von Tinder. Journal of Business and Media Psychology, 6 (1), 41-51.
Bamman, D., Eisenstein, J. and Schnoebelen, T. (2014). Gender identity and lexical variation in social
media. Journal of Sociolinguistics, 18 (2), 135–160.
Benthaus, J. (2014). Making the right impression for corporate reputation: Analyzing impression management of financial institutions in social media. In Proceedings of 22nd ECIS, Tel Aviv, Israel.
Bray, T. (2014). The JavaScript Object Notation (JSON) Data Interchange Format. RFC 7159, Internet
Engineering Task Force (IETF). URL: https://tools.ietf.org/html/rfc7159 (visited on 05/07/2016).
Bower, T. (2008). Social Cognition 'At Work:' Schema Theory and Lesbian and Gay Identity. Working Paper. Western State University.
Chidambaram, L., Lim, J.Y. and Carte, T.A. (2008). Gender, Media and Leader Emergence: Examining the Impression Management Strategies of Men and Women in Different Settings. In Proceedings of the AMCIS, Toronto, Canada.
Clark, M.J. and Grandy, J. (1984). Sex Differences in the Academic Performance of Scholastic Aptitude Test Takers. Working Paper. College Entrance Examination Board New York.
Conway, J.R. (2015). Finding your Soulmate: Homosexual and heterosexual age preferences in online
dating. Personal Relationships. Forthcoming.
Cunha, E., Magno, G., Gonçalves, M. A., Cambraia, C., Almeida, V. and Preis, T. (2014). He Votes or
She Votes? Female and Male Discursive Strategies in Twitter Political Hashtags. PLoS ONE 9(1), 1–9.
Fiore, A.T., Taylor, S.T., Zhong, X., Mendelsohn, G.A. and Cheshire, C. (2010). Who’s Right and
Who Writes: People, Profiles, Contacts, and Replies in Online Dating. In Proceedings of 47th
HICSS, Kauai, USA.
Friesen, G. C. and Weller, P.A. (2006). Quantifying Cognitive Biases in Analyst Earnings Forecasts.
Journal of Financial Markets 9(4), 333–365.
Greene, W.H. (2007). Econometric Analysis. 7th Edition. Upper Saddle River: Prentice Hall.
Groom, C.J. and J.W. Pennebaker (2005). The Language of Love: Sex, Sexual Orientation, and Language Use in Online Personal Advertisements. Sex Roles 52(7/8), 447–461.
Gibbs, J. L., Ellison, N. B., and R. D. Heino (2006). Self-presentation in online personals: The role of
anticipated future interaction, self-disclosure, and perceived success in Internet dating. Communication Research 33(2), 152–177.
Guadagno, R.E., Okdie, B.M. and S.A. Kruse (2012). Dating deception: Gender, online dating, and exaggerated self-presentation. Computers in Human Behavior 28(2), 642–647.
Hall, J. A., Park, N., Song, H., and M. J. Cody (2010). Strategic misrepresentation in online dating:
The effects of gender, self-monitoring, and personality traits. Journal of Social and Personal Relationships 27(1), 117–135.
Hancock, J.T., C. Toma and N. Ellison (2007). The Truth about Lying in Online Dating Profiles. In
CHI 2007 Proceedings, San Jose, USA.
Higdon, M. J. (2008). Oral Argument and Impression Management: Harnessing the Power of Nonverbal Persuasion for a Judicial Audience. Kansas Law Review 57 (3), 1–38.
Hitsch, G. J., Hortaçsu, A. and D. Ariely (2010). What makes you click? - Mate preferences in online
dating. Quantitative Marketing and Economics 8(4), 393–427.
Hobson, J. L., Mayew, W. J. and M. Venkatachalam (2012). Analyzing Speech to Detect Financial
Misreporting. Journal of Accounting Research 50(2), 349–392.
Hwang, Y. (2010). An Empirical Investigation of Normative, Affective, and Gender Influence on E-Commerce Systems Adoption. In Proceedings of the 16th AMCIS, Lima, Peru.
Jeffrey, S., Webb, A. and A. K. Schulz (2007). The Use of Self-Set Goals as an Impression Management Tactic: Antecedents and Consequences. Working Paper. University of Waterloo.
John, O. P. and R. W. Robins (1994). Accuracy and bias in self-perception: individual differences in
self-enhancement and the role of narcissism. Journal of Personality and Social Psychology 66(1),
206–219.
Jøsang, A., Fabre, J., Hay, B., Dalziel, J. and S. Pope (2005). Trust requirements in identity management. In Proceedings of the 2005 Australasian workshop on Grid computing and e-research,
Darlinghurst, Australia.
Jung, J., Umyarov, A., Bapna, R., and J. Ramaprasad (2014). Mobile as a Channel: Evidence from
Online Dating. In Proceedings of the 35th ICIS, Auckland, New Zealand.
Kirchler and Maciejovsky (2002). Simultaneous Over- and Underconfidence: Evidence from Experimental Asset Markets. Journal of Risk and Uncertainty 25(1), 65–85.
Kite and Deaux (1987). Gender belief systems: Homosexuality and the implicit inversion theory. Psychology of Women Quarterly 1(1), 83–96.
Krämer, N. C. and S. Winter (2008). Impression management 2.0: The relationship of self-esteem,
extraversion, self-efficacy, and self-presentation within social networking sites. Journal of Media
Psychology 20 (3), 106–116.
Kurian, J. C. (2015). Implications of User Generated Content on Facebook. In Proceedings of the 19th
Pacific Asia Conference on Information Systems (PACIS), Singapore, Singapore.
Leary, M. R. and R. M. Kowalski (1990). Impression management: A literature review and two-component model. Psychological Bulletin 107 (1), 34–47.
Lim, J. Y., Chidambaram, L. and T. Carte (2008). Impression Management and Leadership Emergence in Virtual Settings: The Role of Gender and Media. In Proceedings of JAIS Theory Development Workshop, New York, USA.
Lin, X., Featherman, M. and S.L. Brooks (2013). Factors Affecting Online Consumer's Behavior: An
Investigation Across Gender. In Proceedings of the 19th AMCIS, Chicago, USA.
Pike, J., Bateman, P. and B. Butler (2012). You Saw THAT?: Social Networking Sites, Self-Presentation, and Impression Formation in the Hiring Process. In Proceedings of the 18th AMCIS,
Seattle, USA.
Postma, A., Jager, G., Kessels, R. P. C., Koppeschaar, H. P. F. and van Honk, J. (2004). Sex
differences for selective forms of spatial memory. Brain and Cognition 54(1), 24–34.
Schiller, S., Nah, F., Mennecke, B. and K. Siau (2011). Gender Differences in Virtual Collaboration
on a Creative Design Task. In Proceedings of the 2011 ICIS, Shanghai, China.
Schoendienst, V. and L. Dang-Xuan (2011). The Role Of Linguistic Properties In Online Dating
Communication - A Large-Scale Study Of Contact Initiation Messages. In Proceedings of the Pacific Asia Conference on Information Systems, Brisbane, Australia.
Sezer, O., Gino, F. and M. I. Norton (2015). Humblebragging: A Distinct – And Ineffective – Self-Presentation Strategy. Working Paper. Harvard Business School.
Shannon, C. E. and Weaver, W. (1963). The Mathematical Theory of Communication. 1st Edition.
Illinois: University of Illinois Press.
Slyke, C., Bélanger, F., Johnson, R. and R. Hightower (2010). Gender-Based Differences in Consumer
E-Commerce Adoption. Communications of the Association for Information Systems 26(1), 17–34.
Soedjono, A. H. (2012). The comparisons between the language used by male and female peers in
Twitter. Working Paper. University of Unair.
Stone, P. J., Bales, R. F., Namenwirth, J. Z., and Ogilvie, D. M. (1962). The general inquirer: A computer system for content analysis and retrieval based on the sentence as a unit of information. Behavioural Science 7(4), 484–498.
Tene, O. (2012). Me, Myself and I: Aggregated and Disaggregated Identities on Social Networking
Services. Journal of International Commercial Law and Technology 8(2), 118–133.
The Guardian (2013). Tinder: the shallowest dating app ever? URL:
http://www.theguardian.com/lifeandstyle/2013/nov/23/tinder-shallowest-dating-app-ever (visited
on 09/22/2015).
Trauth and Quesenberry (2005). Individual Inequality: Women’s Responses in the IT Profession.
Working Paper. Pennsylvania State University.
Trauth, E. M. (2013). The role of theory in gender and information systems research. Information and
Organization 23(4), 277–293.
Umyarov, A., Bapna, R., Ramaprasad, J. and G. Shmueli (2013). One-Way Mirrors and Weak-Signaling in Online Dating: A Randomized Field Experiment. In Proceedings of the 34th ICIS, Milan, Italy.
Vanity Fair (2015). Tinder and the Dawn of the ‘Dating Apocalypse’. URL:
http://www.vanityfair.com/culture/2015/08/tinder-hook-up-culture-end-of-dating (visited on
09/22/2015).
Venkatsubramanyan, S. and T. R. Hill (2009). Gender Differences in Social Networking Presence
Effects on Web Based Impression Formation. In Proceedings of the 15th AMCIS, San Francisco,
USA.
Walton, S. C. and R. E. Rice (2013). Mediated disclosure on Twitter: The roles of gender and identity
in boundary impermeability, valence, disclosure, and stage. Computers in Human Behavior 29(4),
1465–1474.
Wang, C., Yang, Y. Y. and I. Shen (2014). Self-Present by Avatars in Multiplayer Online Role-Playing Games: The influence of self-esteem, online disinhibition, and self-discrepancy. In Proceedings of the PACIS 2014, Chengdu, China.
Whitty, M.T. (2008). Revealing the ‘real’ me, searching for the ‘actual’ you: Presentations of self on an
internet dating site. Computers in Human Behavior 24(4), 1707–1723.
Wilson, D., Proudfoot, J. and J. Valacich (2014). Saving Face on Facebook: Privacy Concerns, Social
Benefits, and Impression Management. In Proceedings of the 35th ICIS, Milan, Italy.
Woods and Harbeck (2008). Living in Two Worlds: The Identity Management Strategies Used by
Lesbian Physical Educators. Journal of Homosexuality 22(3–4), 141–166.
Xia, P., Jiang, H., Wang, X., Chen, C. and B. Liu (2014). Predicting User Replying Behavior on a
Large Online Dating Site. In Proceedings of the 8th International AAAI Conference on Weblogs
and Social Media, Michigan, USA.
Zytko, D., J. Quentin and S.A. Grandhi (2014). Impression Management and Formation in Online
Dating Systems. (Research in Progress) In Proceedings of 22nd ECIS, Tel Aviv, Israel.