Revealing Communication Patterns in an Online Dating System

Andrew T. Fiore
School of Information
University of California, Berkeley
102 South Hall
Berkeley, CA 94720-4600 USA
[email protected]
Abstract
Social visualizations are powerful tools both for users of
mediated communication systems seeking social
context and for researchers seeking insights into user
behavior. This paper describes a visualization tool built
to support research into patterns of communication
among more than 50,000 users of an online dating system,
featuring geographic and categorical variable layouts
overlaid with individual communications.
Keywords
Online personals, online dating, social visualization,
computer-mediated communication
ACM Classification Keywords
H5.3. Group and Organization Interfaces;
Asynchronous interaction; Web-based interaction.
Copyright is held by the author/owner(s).
CHI 2006, April 22–27, 2006, Montreal, Canada.
ACM 1-xxxxxxxxxxxxxxxxxx.
Introduction
Good social visualizations reveal patterns of interaction
that would otherwise be hard to perceive. In the best
social visualizations, these patterns provide insight to
participants trying to make sense of a social milieu or
researchers trying to understand the dynamics in terms
of a theory or hypothesis.
The organizers of this workshop categorize social
visualizations by whether they depict physical or virtual
worlds and by the modality of communication (text,
audio, video). I would like to elaborate on this typology
— I believe it is important also to consider the scope
and point of view of the visualization. Scope indicates
the breadth of the data presented; most salient is
whether the data (behaviors, emails, etc.) represent a
global or local collection — that is, are these data the
product of an individual or the entire population? Does
the user have access to data beyond her own?
Relatedly, the point of view of the visualization
describes how the display relates to the user: is it an
ego-centric or omniscient point of view? Although
scope and point of view are intertwined, they remain
distinct. We can conceive of an ego-centric
presentation of social information that is nonetheless
global, incorporating everyone's behavior.
End users undertaking social tasks are more likely to
benefit from ego-centric visualizations with either
global or local data; the omniscient bird's-eye view may
be more complex than necessary to help them navigate
a social world, for which just-in-time, just-in-place
information would likely suffice and be easier to
understand. On the other hand, global, omniscient
social visualizations are the kind most often used by
researchers studying mediated communication.
Studying Online Dating
The visualization I will describe here is a researcher's
tool that I built to facilitate the study of user behavior
in an online dating system (Fiore & Donath 2004). It
provides an omniscient, global perspective on the
communications among more than 50,000 users of this
heterosexual online dating system. At its core, this
visualization organizes the users according to some
spatial conceit and then overlays the communications
among them as thin, transparent lines.
Geographic View
I began the system as a geographic visualization,
plotting users by their location in the United States or
Canada (Figure 1). Because the colored points
representing users accumulate color intensity as the
number of users in the same location increases, this
view has the advantage of revealing the density of
users in various cities and regions. I then overlaid each
email sent from one user to another through the dating
system as a thin, transparent line. These, too,
accumulate in intensity, so common communication
paths are brighter than rare ones. However,
communication patterns in a visualization of North
America do not provide much insight because most
communications in online dating occur over short
distances that become vanishingly small on a map that
spans the continent.
Categorical Density View
Online dating systems collect a great deal of
demographic and personal information from their users.
Taking advantage of this information proved essential
to the development of a visualization useful for making
sense of communication patterns among the online
dating users.
Presenting two continuous, numerical variables in a
two-dimensional scatterplot is a common way to reveal
how they co-vary. Doing this with two categorical
variables is unusual and somewhat problematic.
However, the most interesting descriptive
characteristics from the online dating system I studied
were categorical — for example, race, religion, and
education. To show the population of users according
to two categorical variables, I used categorical density
plots. Specifically, I divided the space in the plot
according to the intersection of the levels of each
variable, then filled in the intersections with randomly
scattered points representing users who possess both of
the intersecting qualities. As a result, those
intersections of two categorical qualities that contain a
greater proportion of the population will appear denser
than those that contain less of the population. Figure 2
shows such a plot for the categorical variables sex and
marital status. This plot reveals that most users are
divorced or never married and that these
characteristics do not differ noticeably by sex.
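The layout described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code behind the tool: the function name categorical_density_points, the unit-square cell size, and the four sample users are all hypothetical. Each user is assigned to the cell for their pair of category levels and then placed at a uniformly random position inside that cell, so fuller cells look denser once the points are drawn.

```python
import random


def categorical_density_points(users, x_levels, y_levels, seed=0):
    """Return one (x, y) point per user for a categorical density plot.

    Each cell is a unit square: column i spans [i, i + 1) and row j spans
    [j, j + 1). A user is placed uniformly at random inside the cell for
    their (x_value, y_value) pair.
    """
    rng = random.Random(seed)  # fixed seed so the layout is reproducible
    x_index = {level: i for i, level in enumerate(x_levels)}
    y_index = {level: j for j, level in enumerate(y_levels)}
    points = []
    for x_value, y_value in users:
        col = x_index[x_value]
        row = y_index[y_value]
        points.append((col + rng.random(), row + rng.random()))
    return points


# Made-up sample data for illustration only.
x_levels = ["female", "male"]
y_levels = ["never married", "divorced", "widowed"]
users = [
    ("female", "divorced"),
    ("male", "never married"),
    ("female", "widowed"),
    ("male", "divorced"),
]
points = categorical_density_points(users, x_levels, y_levels)
```

The resulting coordinates can be passed to any scatterplot routine; drawing the points small and semi-transparent lets dense cells read as darker regions.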
WHO’S TALKING WITH WHOM?
Whereas the overlaid communication lines were not
particularly useful in the geographic layout because
they represented unusual cases (long-distance
communications), they are integral to the usefulness of
the categorical layout. The patterns of communication
lines reveal which groups of people, as defined by the
intersection of the two categorical characteristics, are
communicating with each other. In Figure 2, a glance
shows that most communications occurred between two
divorced users or two who had never been married.
That is, there was less between-group than within-group
communication. There is another, more subtle
pattern evident as well: there are more widowed
women than men, and these women seem to
communicate primarily with divorced men.
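The visual impression of within-group versus between-group communication can be checked with a simple tally. The sketch below is only an illustration: the names tally_group_pairs and within_group_share, the user identifiers, and the five sample messages are hypothetical and do not come from the dating system's data. It counts messages by (sender group, recipient group) and reports the share that stay within one group.

```python
from collections import Counter


def tally_group_pairs(messages, group_of):
    """Count messages by (sender group, recipient group)."""
    return Counter((group_of[sender], group_of[recipient])
                   for sender, recipient in messages)


def within_group_share(pair_counts):
    """Fraction of messages whose sender and recipient share a group."""
    total = sum(pair_counts.values())
    if total == 0:
        return 0.0
    within = sum(n for (a, b), n in pair_counts.items() if a == b)
    return within / total


# Made-up sample data for illustration only.
group_of = {
    "u1": "divorced",
    "u2": "divorced",
    "u3": "never married",
    "u4": "never married",
    "u5": "widowed",
}
messages = [("u1", "u2"), ("u2", "u1"), ("u3", "u4"),
            ("u5", "u1"), ("u3", "u2")]

pair_counts = tally_group_pairs(messages, group_of)
share = within_group_share(pair_counts)  # 3 of the 5 messages are within-group
```

Keeping the tally directional (sender first, recipient second) preserves asymmetries such as one group writing to another more often than it is written to.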
Observations like this drove the development of a
methodology (described in Fiore & Donath 2005) for
quantifying the degree to which communicating dyads
in this online dating system tend to share the same
value for a given categorical variable. A similar style of
pairwise analysis has also emerged recently in social
psychology, where it is used to characterize the
importance of various kinds of similarity and
dissimilarity among romantic couples (e.g., Klohnen &
Mendelsohn 1998).
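One simple way to quantify such a tendency is to compare the observed share of same-value dyads with the share expected if senders and recipients were paired at random. The sketch below is an assumption made for illustration and is not the methodology of Fiore & Donath 2005; the function names and the four sample dyads are hypothetical.

```python
from collections import Counter


def same_value_rate(dyads):
    """Observed fraction of dyads whose two members share a value."""
    return sum(1 for a, b in dyads if a == b) / len(dyads)


def expected_same_value_rate(dyads):
    """Same-value fraction expected under random pairing.

    Uses the marginal distributions of sender values and recipient values:
    the sum over values v of P(sender = v) * P(recipient = v).
    """
    n = len(dyads)
    senders = Counter(a for a, _ in dyads)
    recipients = Counter(b for _, b in dyads)
    return sum(senders[v] * recipients[v] for v in senders) / (n * n)


# Made-up sample data: (sender value, recipient value) for marital status.
dyads = [
    ("divorced", "divorced"),
    ("divorced", "divorced"),
    ("never married", "never married"),
    ("widowed", "divorced"),
]

observed = same_value_rate(dyads)           # 3 of 4 dyads match
expected = expected_same_value_rate(dyads)  # (2*3 + 1*1 + 1*0) / 16
ratio = observed / expected                 # above 1 suggests a same-value preference
```

A ratio near 1 would indicate that matching values occur about as often as chance pairing predicts, whereas a ratio well above 1 would be consistent with the within-group pattern visible in Figure 2.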
References
[1] Donath, J.S., Karahalios, K., & F.B. Viegas.
Visualizing Conversations. In Proc. HICSS 32, 1999.
[2] Fiore, A.T., & J.S. Donath. Homophily in Online
Dating: When Do You Like Someone Like Yourself?
Short paper, Computer-Human Interaction 2005.
[3] Fiore, A.T., & J.S. Donath. Online Personals: An
Overview. Short paper, Computer-Human Interaction
2004.
[4] Klohnen, E.C., & G.A. Mendelsohn. Partner
Selection for Personality Characteristics: A Couple-Centered
Approach. In Personality and Social
Psychology Bulletin 24 (3), March 1998, pp. 268-278.
Figure 1 (above). Geographic view of online dating users. At bottom,
communications among users have been overlaid as thin, transparent lines. Color
intensity of points and lines indicates the relative quantity of people and
communications.
Figure 2. Categorical density plots showing online dating
users according to marital status and sex. At bottom,
communications have been overlaid as in Figure 1.