Social Science Computer Review

http://ssc.sagepub.com
Massive Questionnaires for Personality Capture
William Sims Bainbridge
Social Science Computer Review 2003; 21; 267
DOI: 10.1177/0894439303253973
The online version of this article can be found at:
http://ssc.sagepub.com/cgi/content/abstract/21/3/267
Published by:
http://www.sagepublications.com
Additional services and information for Social Science Computer Review can be found at:
Email Alerts: http://ssc.sagepub.com/cgi/alerts
Subscriptions: http://ssc.sagepub.com/subscriptions
Reprints: http://www.sagepub.com/journalsReprints.nav
Permissions: http://www.sagepub.com/journalsPermissions.nav
Citations (this article cites 9 articles hosted on the
SAGE Journals Online and HighWire Press platforms):
http://ssc.sagepub.com/cgi/content/refs/21/3/267
Downloaded from http://ssc.sagepub.com at PENNSYLVANIA STATE UNIV on February 8, 2008
© 2003 SAGE Publications. All rights reserved. Not for commercial use or unauthorized distribution.
Massive Questionnaires for Personality Capture
WILLIAM SIMS BAINBRIDGE
National Science Foundation
Contemporary information technology facilitates the creation and administration of much longer questionnaires than was feasible traditionally. People might be motivated to respond to these questionnaires
as a means of capturing significant aspects of their personalities, and this can be useful when designing
sociable technology—computer avatars, software agents, and robots with simulated personalities—
and when creating personality archives for research or memorial purposes. In this article, the author illustrates how “personality capture” can be accomplished through 20,000 questionnaire items culled from
responses to open-ended online questions, content analysis of existing verbal or textual material, and
words from dictionaries, encyclopedias, and thesauri. This approach enables detailed idiographic study
of a single individual, based on fresh measurement items and scales derived from the ambient culture.
Keywords: personality capture; Survey2000; Survey2001; survey research; opinion
research; questionnaire; software
“Personality capture” is the process of entering substantial information about a person’s
mental and emotional functioning into a computer or information system, in principle in
enough detail to permit a somewhat realistic simulation. This term draws an analogy
with the widely used technique called “motion capture,” in which the movements of a human
being are entered into a computer, usually by some kind of machine vision system, so they
can be used to program realistic images of people in movies and videogames. If motion capture records the motions of a person, personality capture records the emotions, attitudes,
opinions, beliefs, values, habits, perceptions, and preferences of a person.
The Leiden Institute of Advanced Computer Science, Digital Life Technologies
(www.liacs.nl/research/DLT/) uses personality capture in exactly the sense
intended here, but the term has not yet become firmly rooted in the lexicons of either computer science or social science. Altiris (www.altiris.com), a software company, uses
the term to refer to the process of migrating a person’s files and software preference settings
from one computer to another. The abstract of a computer science journal article about modeling a person’s interpretations of images states “Personalizing web search engines, a crucial
issue nowadays, would obviously benefit from the system’s ability to capture such an important aspect of a user’s personality as visual impressions and their communication” [italics
added] (Bianchi-Berthouze, 2002, p. 43). Clearly, computer science is on the verge of adopting the term personality capture, and I suggest that social science consider doing so as well.
Some recent work connects personality capture to motion capture. For example, researchers have been developing computer vision systems that can scan a person’s facial expressions
AUTHOR’S NOTE: The views expressed in this article do not necessarily represent the views of the National Science Foundation or the United States.
Social Science Computer Review, Vol. 21 No. 3, Fall 2003 267-280
into a software system that performs “emotion extraction” to duplicate these expressions
graphically in an electronic “clone” (Thalmann, Kalra, & Escher, 1998). Several kinds of
conventional software already perform limited forms of personality capture. For example, a
person who wants his or her word processor to handle speech-to-text dictation must train the
speech recognition software by reading long samples of text out loud, thereby capturing the
parameters of his or her own unique voice. Recognizing that human personality can express
itself in many different modalities, this article will explain how one traditional social science
technique can be adapted for personality capture, while transcending some traditional limitations of that technique.
A RESEARCH PROGRAM
New information technologies can stimulate fresh developments in social science. Across
the other sciences, a new approach for fundamental science has been emerging in which
terabyte data sets are used to explore complex systems dynamics. The same can be done for
personality research by analyzing the complex connections among the thousands upon thousands of distinguishable memories and associations that make up a single human mind. It
would be premature at this point to predict what discoveries might be gained, but one possible area of accomplishment would be uniting personality psychology with cognitive neuroscience (Gazzaniga, 1995).
New needs have also arisen in recent years, notably the growing realization that researchers must find ways to design computer and information systems that are optimized for use by
human beings. This requires the development of data resources, tools, and conceptual
approaches for designing “sociable technology,” that is, computer avatars, software agents,
and robots that possess personalities themselves and that better serve our own personal
needs (Turkle, 2002). The very first fragmentary applications of personable computing have
begun to appear; for example, adaptive interfaces that employ very primitive artificial intelligence techniques to adjust to the user’s habits.
Another application area is the development of digital libraries and web sites that preserve
vast troves of information about individuals for historical or memorialization purposes. The
Library of Congress pioneered historical digital libraries in the mid-1990s by posting nearly
3,000 life histories on the web from the 1930s Folklore Project of the Federal Writers’ Project for the Work Projects Administration (http://memory.loc.gov/ammem/wpaintro/wpahome.html). The Survivors of the Shoah Visual History Foundation
(http://www.vhf.org/) has carried out digital video interviews with more than
50,000 informants about the experience of enduring the Nazi holocaust. Several leading
computer scientists have argued that some day it will be possible to enhance a rich archive of
data about an individual with artificial intelligence to achieve a kind of immortality (Bainbridge, 2000a; Bell & Gray, 2001; Kurzweil, 1999; Robinett, 2002).
The chief function of most standard personality tests is to reduce the unique complexity of
the individual to measurements along a small number of dimensions, often as few as five
(e.g., Zuckerman, Kuhlman, Joireman, Teta, & Kraft, 1993). Much of the psychological
research of the past century has been nomothetic, that is, seeking comprehensive ways of understanding humanity in general and testing hypotheses about general tendencies. In contrast,
personality capture is more idiographic, that is, seeking to document the distinctive characteristics of a specific individual (Pelham, 1993; Shoda, Mischel, & Wright, 1994). A vast number
of psychological scales exist that measure a multitude of concepts (Goldman et al., 1995-2002; Sweetland & Keyser, 1991). These scales will be useful for personality capture, but I
suggest we also need to break new ground in the sociology and anthropology of personality
and develop a great diversity of culture-based measures to chart individual characteristics.
Individuals do not exist in isolation. Rather, they internalize or react against elements of
the surrounding culture, for example, speaking a shared language with only slightly distinctive pronunciation and vocabulary. Thus, one way to develop new measures that would be
relevant for capturing the personality of a particular individual is to harvest questionnaire
items from the ambient culture surrounding that individual. This article offers examples of
three ways of doing this:
1. Collecting statements and other verbal material volunteered in response to open-ended questions administered through interviews or questionnaires
2. Culling existing verbal or textual material derived by content analysis or data mining from movies, television programs, novels, or other exemplars of popular culture
3. Developing measures based on the language itself, using words from dictionaries, encyclopedias, and thesauri
To explore these potentials and develop some of the specific technical methods that would
be needed for this work, I set the goal of using all three approaches to develop 20,000 questionnaire items. To facilitate creation and initial testing of such a massive corpus of items, it
was necessary to employ many of the latest technologies. The hardware included desktop,
laptop, and pocket computers. Data collection software was programmed in a conventional
computer language (Pascal), as web pages (HTML forms), and as specialized spreadsheets
(Excel), with text and data ported back and forth among these as well as to a standard statistical analysis package (SPSS). Thousands of people provided material for questionnaires, and
8 adults volunteered to serve as intensive test participants. The questionnaires were delivered
to the volunteers through web pages, Windows-based software downloaded from a web
page, or e-mail, or on magnetic disk and CD, or the questionnaires were transferred by wire
between a desktop and pocket computer.
The 20,000 items were assembled into 10 Windows-based software modules (see Table 1)
and administered to one or more participants. Each item consisted of a stimulus, such as a
statement, to which the participant would respond. As will be explained shortly, each module
presented a pair of response scales for each stimulus, so in terms of the data collected there
were actually 40,000 items, 2 for each of the 20,000 stimuli. We will look closely at 5 of these
modules, beginning with one that asked the research participant to think about the future.
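The bookkeeping behind these totals can be checked with a minimal sketch. The item counts are those listed in Table 1; the names `MODULE_ITEMS` and `total_responses` are illustrative assumptions and are not taken from the original software, which was written in Pascal.

```python
# Item counts per module, as listed in Table 1.
MODULE_ITEMS = {
    "Year 2100": 2000,
    "Beliefs": 2000,
    "Beliefs II": 2000,
    "Wisdom": 2000,
    "Emotions": 2000,
    "Experience": 2000,
    "Taste": 2000,
    "Self": 1600,
    "Association": 2000,
    "Action": 2400,
}

SCALES_PER_STIMULUS = 2  # each stimulus is rated on a pair of response scales


def total_responses(module_items, scales=SCALES_PER_STIMULUS):
    """Return (number of stimuli, number of scale responses)."""
    stimuli = sum(module_items.values())
    return stimuli, stimuli * scales


stimuli, responses = total_responses(MODULE_ITEMS)
print(stimuli, responses)
```

The two uneven modules (Self and Action) offset each other, so the 10 modules together hold 20,000 stimuli and 40,000 scale responses.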
THE YEAR 2100
In May 1997, I launched a web site called The Question Factory to administer surveys
through the Internet (Bainbridge, 2000b). One of these surveys pretested questionnaire items
about the future, such as “Imagine the future and try to predict how the world will change
over the next century. Think about everyday life as well as major changes in society, culture,
and technology.” After successful preliminary work with The Question Factory, this item
was included in the pioneering web-based questionnaire, Survey2000, organized by sociologist James Witte and sponsored by the National Geographic Society (Bainbridge, 2002b;
Weber, Loumakis, & Bergman, 2003; Witte, Amoroso, & Howard, 2000). About half of the
roughly 46,500 adults who responded to Survey2000 gave thoughtful written responses to
the item about the future, producing more than 10 megabytes of text.
The method of analysis has been used many times before, for example, in surveys about
the space program (Bainbridge, 1991; cf. 1989, 1992). I read carefully through the text,
TABLE 1
The 20,000 Idiographic Questionnaire Items in 10 Modules

Module       Items  Type                       Chief Sources                                     Scale 1                Scale 2
Year 2100    2,000  Predictions of the future  Online questionnaire Survey2000                   Good-bad               Likely-unlikely
Beliefs      2,000  Statements                 Online questionnaires, social science literature  Agree-disagree         Important-unimportant
Beliefs II   2,000  Statements                 Online questionnaire Survey2001                   Agree-disagree         Important-unimportant
Wisdom       2,000  Statements                 Babylon 5 TV program and novels                   Agree-disagree         Important-unimportant
Emotions     2,000  Emotional stimuli          Online questionnaires, web site search            Good-bad               Much-little (a)
Experience   2,000  Events, experiences        Questionnaire of a communal religion              Good-bad               Recently-never
Taste        2,000  Foods                      Online questionnaire Survey2000                   Healthy-unhealthy      Like-dislike
Self         1,600  Adjectives for a person    Sociology classes, dictionaries, thesauri         Much-little (b)        Good-bad
Association  2,000  Pairs of words             Dictionaries, thesauri                            Strong-weak (c)        Like-dislike
Action       2,400  Verbs                      Dictionaries, thesauri                            Important-unimportant  Active-passive

a. How much or how little the stimulus would make the respondent feel the given emotion.
b. How much or how little the respondent judges that he or she possesses the given quality.
c. Mental connection between the two words.
copying out phrases and sentences that seemed to identify distinct ideas about the future.
This process produced a new file with a little more than 5,000 text extracts, which were then
combined and edited into clear statements of single ideas. Iteratively, the ideas were categorized in many groups that were then combined, until there were 20 groups with 100 items in
each; see Table 2 for data for Participant 1. For ease in remembering, the groups have simple
mnemonic names. For example, the Domestic group not only has statements about people’s
homes but also includes ideas about urban and rural environments and about the food people
will eat at home.
The items were embedded in an administration software module. One response scale asks
the participant to say how good it would be if the particular statement came true, from 1 = bad
to 8 = good. A second scale asks how likely it is that the statement will come true, from 1 =
unlikely to 8 = likely. Figure 1 shows the area of the computer screen where the participant
enters responses. Above this area, the particular item is displayed. For example, the first of
the Domestic items is “There will be special rooms with three-dimensional projectors set
aside in homes for virtual reality entertainment.” The respondent first thinks about this prediction and decides how bad or good it would be if this came true over the next century. Participant 1 rated the prediction 5 by using the computer’s mouse to click the 5 button in the
horizontal BAD-GOOD row. Then the respondent clicked the 6 button in the vertical
UNLIKELY-LIKELY column to indicate that the prediction was somewhat likely to come
true. The computer displayed both clicked buttons in a lighter color, highlighting the participant’s tentative choices. At this point, the participant could change either of the choices or
click the OK button in the center to register the data.
TABLE 2
Items About the Future, 100 in Each of 20 Categories, From Survey2000
(Good and Likely give the number of the 100 items rated 6-8 on the 1-8 scale; Correlation is the correlation of the Good and Likely ratings.)

Mnemonic         Topic Areas                                                        Good  Likely  Correlation
Art              Art, music, literature, culture, entertainment, sports, style      19    29       .40
Business         Business, commerce, the economy, wealth, inequality                13    35       .46
Conflict         Conflict between groups, including nonviolent competition          18    29       .37
Domestic         Home life, houses, foods, urban and rural communities              29    37       .54
Education        Students, schools, academics, languages, education in society      14    30       .26
Family           Marriage, families, children, reproduction, sexuality              5     26       .10
Government       Government, politics, politicians, political systems, ideologies   24    26       .28
Health           Health, medicine, sickness, genetics, drugs, specific diseases     40    39       .30
International    International relations, nations, regions of the world             17    29       .55
Justice          Crime, justice, courts, law, police, morality, punishment          7     19       .26
Knowledge        Knowledge, science, beliefs, philosophies, worldviews              45    62       .08
Labor            Jobs, labor relations, occupations, working conditions, careers    20    47       .08
Miscellaneous    Miscellaneous aspects of technology, culture, society, life        27    41       .51
Nature           Environment, climate, natural resources, flora, fauna              23    37       .13
Outer space      Space exploration, space technology, human future in the universe  77    23      –.36
Population       Demography, life span, fertility, mortality, migration, cloning    14    33      –.14
Quality of life  Lifestyles, values, social problems, general quality of life       13    30       .23
Religion         Religion, spirituality, faith, secularization, denominations       23    28       .61
Society          Relations between individual people and social classes             7     16       .47
Technology       Transportation, communications, computer technology                19    33       .52
All 8 participants employed this cross-shape input method, which requires three clicks for
each stimulus and pair of responses. However, the software also includes a “block” input
method, where a single click on a checkerboard of 64 buttons registers both responses. Either
way, after one pair of responses is registered, the next stimulus will appear. The Back and
Skip buttons allow the participant to move backward or forward in the list of stimuli without
responding to them, primarily to allow reconsideration of responses. The Help button leads
to a screen that explains the input method and provides a demonstration of how it works. The
Figure 1: The Input System for the Year 2100 Software Module
Return button exits the input mode, for example, allowing the participant to save data and
quit the software.
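The "block" input method can be illustrated with a minimal sketch in which one of 64 buttons encodes both 1-8 responses. The row-major button numbering, the assignment of columns to the bad-good scale and rows to the unlikely-likely scale, and the function names are all assumptions made for illustration; the article does not describe how the original software laid out the checkerboard.

```python
SCALE_MIN, SCALE_MAX = 1, 8
SIDE = SCALE_MAX - SCALE_MIN + 1  # 8 buttons per side, 64 buttons in all


def encode_block_click(good, likely):
    """Map a (good, likely) pair of 1-8 ratings to a button index 0-63."""
    for value in (good, likely):
        if not SCALE_MIN <= value <= SCALE_MAX:
            raise ValueError("ratings must be between 1 and 8")
    # Assumed layout: rows hold the likely scale, columns the good scale.
    return (likely - SCALE_MIN) * SIDE + (good - SCALE_MIN)


def decode_block_click(button_index):
    """Map a button index 0-63 back to the (good, likely) pair it registers."""
    if not 0 <= button_index < SIDE * SIDE:
        raise ValueError("button index must be between 0 and 63")
    row, col = divmod(button_index, SIDE)
    return SCALE_MIN + col, SCALE_MIN + row


# Participant 1's ratings of the first Domestic item: good = 5, likely = 6.
index = encode_block_click(5, 6)
print(index, decode_block_click(index))
```

Under this assumed layout a single click carries the same information as the three clicks of the cross-shaped method, at the cost of a denser grid of targets.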
It is essential to note that Table 2 is merely a summary that communicates a superficial
overview of the items and the participant’s responses. Personality capture really focuses on
the full, undigested data set. For example, one can output a text file of the predictions that the
participant rated in any particular way. Participant 1 said that just two predictions were both
very likely (7-8 on the 1-8 likely scale) and very bad (1-2 on the bad-good scale): “Humanity
will not leave the Earth in meaningful numbers, because the technology required will be
beyond its grasp” and “Space exploration will stall, symbolizing the failed promises of technology,” respectively. Coincidentally, the participant rated just two items at the extremes of
good (8) and likely (8): “Human consciousness will be transmitted to advanced computers”
and “For the first time in human history, human-computer interfaces will permit development of technologies of the soul,” respectively. Clearly, this particular participant seems to
be pessimistic about the space program but optimistic that personality capture really can confer a kind of immortality.
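Selecting the predictions that a participant rated in a particular way amounts to a simple filter over the paired ratings. The following is a minimal sketch only; the `Rating` record, the `select` function, and the sample data are hypothetical, and the ratings shown are invented placeholders rather than Participant 1's responses.

```python
from typing import NamedTuple


class Rating(NamedTuple):
    text: str    # the prediction shown to the participant
    good: int    # 1 = bad ... 8 = good
    likely: int  # 1 = unlikely ... 8 = likely


def select(ratings, good_range, likely_range):
    """Return the texts whose two ratings fall inside the inclusive ranges."""
    g_lo, g_hi = good_range
    l_lo, l_hi = likely_range
    return [
        r.text
        for r in ratings
        if g_lo <= r.good <= g_hi and l_lo <= r.likely <= l_hi
    ]


# Invented placeholder data, for illustration only.
sample = [
    Rating("prediction A", good=1, likely=8),
    Rating("prediction B", good=8, likely=8),
    Rating("prediction C", good=5, likely=6),
    Rating("prediction D", good=2, likely=7),
]

very_bad_but_very_likely = select(sample, good_range=(1, 2), likely_range=(7, 8))
best_and_most_likely = select(sample, good_range=(8, 8), likely_range=(8, 8))
print("\n".join(very_bad_but_very_likely))
```

The joined lines could then be written to a text file for qualitative reading, which is the kind of output described above.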
Although this article focuses on the personality capture itself, it is necessary to think
ahead to how the data could be analyzed idiographically or employed in simulations. This
requires exploration of ways in which patterns could be found in the data, whether by conventional social-scientific statistical analysis or by the pattern recognition and data-mining
techniques that have become prominent in contemporary computer science. Table 2 lets us
see the differing levels of optimism the particular participant has concerning different areas
of human life and culture. Note that Participant 1 rated 77 of the 100 “outer space” items as
good (6-8 on the 1-8 scale) but only 23 of the 100 as likely. This is one indicator of the participant’s mixture of enthusiasm and pessimism about the space program.
A good way of measuring a person’s optimism is to calculate the correlation between
good and likely ratings in a given area. As the last column in Table 2 shows, Participant 1 is
most optimistic about religion (r = .61). This does not in itself say whether the participant is
religious but reveals that the predictions about religion the participant thinks would be good
tend also to be likely. One would have to look at the particular religion items the individual
thought were best: “Science will become the official state religion, with scientists as high
priests” and “The spiritual deadness affecting prosperous societies will lead to a proliferation
of strange cults and fanatic religious movements.” Clearly, this participant is not religiously
conventional, although optimistic in the area of religion. Participant 1 is most pessimistic
about the space program, as measured by a negative correlation (–.36) between rating the
space items good and likely.
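The two summary measures reported in Table 2 can be sketched as follows: a count of items rated 6-8, and a Pearson correlation between the good and likely ratings within one category. This is a hedged illustration, not the analysis code actually used (the article mentions SPSS); the function names are assumptions, and the eight ratings at the end are invented to show the extreme case of perfect pessimism.

```python
from math import sqrt


def count_high(ratings, threshold=6):
    """Count ratings at or above the threshold (6-8 on the 1-8 scale)."""
    return sum(1 for r in ratings if r >= threshold)


def pearson_r(xs, ys):
    """Pearson correlation between two equally long lists of ratings."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant ratings")
    return sxy / sqrt(sxx * syy)


# Invented ratings for eight items: whatever is good is judged unlikely.
good = [1, 2, 3, 4, 5, 6, 7, 8]
likely = [8, 7, 6, 5, 4, 3, 2, 1]
print(count_high(good), count_high(likely), round(pearson_r(good, likely), 2))
```

A positive coefficient within a category corresponds to what the text calls optimism (good outcomes are seen as likely), and a negative one to pessimism.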
This categorization in Table 2 is rather artificial, and the other modules described in the
next section used very different categorizing methods, beginning with one that also derived
its items from an online survey.
BELIEFS II
Three of the 10 modules consisted of Likert-type agree-disagree statements: Beliefs,
Beliefs II, and Wisdom. The material for 1,000 of the items in Beliefs II came from a second
National Geographic web-based survey, Survey2001 (Bainbridge, 2002a). A battery of 20
agree-disagree items measured people’s beliefs about 10 different issues at the borderland of
science, often called pseudoscience (Frazier, 1981), primarily for nomothetic research on the
cultural territory between religion and science (Bainbridge & Stark, 1980). These items were
in pairs, one phrased positively and the other negatively.
For example, one pair concerned astrology: “There is much truth in astrology—the theory
that the stars, the planets, and our birthdays have a lot to do with our destiny in life” and
“Astrologers, palm readers, tarot card readers, fortune tellers, and psychics cannot really
foresee the future.” Another pair concerned spiritual development techniques: “Some techniques can increase an individual’s spiritual awareness and power” and “Yoga, meditation,
mind control, and similar methods are really of no value for achieving mental or spiritual
development.” After this battery of items, subsets of the respondents were given pairs of
statements like these again and were asked to write comments about their topics. Following
the approach described above, 1,000 items were derived from the resulting verbiage, in 10
categories, as shown in Table 3. Each category consisted of a pair of items from Survey2001,
followed by 98 statements that came from the respondents’ comments.
Using software similar to that described previously, Participant 2 was asked to rate the
1,000 statements in terms of how true or false each was, as well as how important each was.
Then Participant 2 was given a laptop computer that had the 1,000 items in a spreadsheet file.
The participant was permitted to take the laptop for a few days, and whenever convenient to
look through each group of 100 items and mark all the statements that supported the first one
in the group. For the astrology items, Participant 2 marked 38 (including the first one) that in
some way supported the idea that astrology might be true. The remaining 62 items either contradicted belief in astrology (like the second item) or were neutral. Thus, Table 3 begins with
a categorization based on the origins of the items in an online survey, then subcategorizes in
terms of the participant’s individual categorization habits.
Table 3 shows one way it is possible to locate the belief of a single respondent in the surrounding culture. We see the percentage of 3,909 respondents to Survey2001 who agreed
with each of the 10 positive agree-disagree items, compared with the false-true ratings of
Participant 2. Whereas fully 48% of the respondents to Survey2001 apparently believe in
TABLE 3
Items About Pseudoscience, 100 in Each of 10 Categories, From Survey2001

                                                        Responses From Participant 2
                                                                            Mean Rating on 1 to 8
                                                        Number of Items     False-True Scale
Stimulus Statement                        Percentage    in 100 Supporting   Items        Items Not
From Online Survey                        Who Agree     This Item           Supporting   Supporting
                                          (N = 3,909)

There is much truth in astrology—         14.3          38                  3.6          5.7
the theory that the stars, the
planets, and our birthdays have
a lot to do with our destiny in life.

Every person’s life is shaped by          28.1          32                  3.3          5.5
three precise biological rhythms—
physical, emotional, and
intellectual—that begin at birth
and extend unaltered until death.

Scientifically advanced civilizations,    34.8          59                  3.8          6.2
such as Atlantis, probably existed
on Earth thousands of years ago.

Dreams sometimes foretell the future      55.4          39                  4.4          5.4
or reveal hidden truths.

Some people really experience             48.0          54                  2.7          5.7
telepathy, communication between
minds without using the traditional
five senses.

Some UFOs (Unidentified Flying            22.2          12                  2.8          5.2
Objects) are probably spaceships
from other worlds.

Some scientific instruments                9.1          42                  3.3          5.4
(e.g., e-meters, psionic machines,
and aura cameras) can measure
the human spirit.

Some techniques can increase an           57.3          55                  3.5          5.1
individual’s spiritual awareness
and power.

Some people can hear from or              23.4          44                  2.7          5.3
communicate mentally with
someone who has died.

Some people can move or bend              18.1          63                  2.7          5.5
objects with their mental powers,
what is called telekinesis.
telepathy, only 9% believe scientific instruments can measure the human spirit. In contrast,
Participant 2 rates 54 pro-telepathy items only 2.7 on the 1-8 false-true scale, compared with
3.3 for 42 items supporting the idea that the spirit can be measured.
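The last two columns of Table 3 rest on splitting each group of 100 statements by the participant's own judgment of which ones support the lead statement, then averaging the false-true ratings on each side. A minimal sketch of that step follows; the function name, the item identifiers, and the four sample ratings are hypothetical and invented for illustration.

```python
from statistics import mean


def split_means(ratings, supporting):
    """Return (mean rating of supporting items, mean rating of the rest).

    ratings    -- dict mapping item id to a 1-8 false-true rating
    supporting -- set of item ids the participant marked as supporting
    """
    sup = [r for item, r in ratings.items() if item in supporting]
    rest = [r for item, r in ratings.items() if item not in supporting]
    if not sup or not rest:
        raise ValueError("both groups must contain at least one item")
    return mean(sup), mean(rest)


# Invented example: four items, two of them marked as supporting.
ratings = {"item1": 2, "item2": 4, "item3": 6, "item4": 8}
supporting = {"item1", "item2"}
print(split_means(ratings, supporting))
```

A lower mean for the supporting items than for the others, as in every row of Table 3, indicates that the participant tends to judge statements favoring the belief as false.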
EMOTIONS
The Emotions module consists of 2,000 items measuring what stimuli make the respondent have 20 feelings: love, fear, joy, sadness, gratitude, anger, pleasure, pain, pride, shame,
TABLE 4
Stimuli Eliciting 20 Emotions

                            Mean Rating of 100 Stimuli    Correlation Between Saying
                            in Each of 20 Categories on   100 Stimuli Are Good and
Category Defining Words     1-8 Bad-Good Scale            They Elicit the Given Emotion
in 10 Near Antonym Pairs    First      Second             First      Second
                            Category   Category           Category   Category
Love-fear                   5.07       4.32                .59       –.72
Joy-sadness                 5.09       3.59                .79       –.56
Gratitude-anger             5.34       3.80                .60       –.34
Pleasure-pain               4.78       3.83                .66       –.53
Pride-shame                 5.53       3.88                .75       –.50
Desire-hate                 4.56       4.13                .44       –.66
Satisfaction-frustration    5.00       3.74                .73       –.53
Surprise-boredom            4.51       4.62               –.03       –.26
Lust-disgust                4.61       3.90                .55       –.60
Excitement-indifference     4.30       4.27                .05       –.01
desire, hate, satisfaction, frustration, surprise, boredom, lust, disgust, excitement, and indifference. One thousand stimuli came from a pair of questionnaires administered through The
Question Factory. Each questionnaire listed 10 emotions, each followed by a space in which
to write, and explained:
For each of these ten emotions, we will ask you to think of something that makes you have that
particular feeling. By “things” we mean anything at all—actions, places, kinds of person,
moods, physical sensations, sights, sounds, thoughts, words, memories—whatever might elicit
this emotion. We will also ask you to think of what makes someone else—a person very different
from you—have the same feelings.
The other 1,000 stimuli came from 20 searches of the World Wide Web using search
engines (Google, Alta Vista, Metacrawler) to find texts describing situations that elicited
each of the emotions. By this means, a large number of works of literature and online essays
were located that used the words in context. Each of the stimuli in the set was written on the
basis of the entire context around the quotation, although in many cases the phrase is a direct
quotation. Thus, 1,000 of these items were collected by means of a web-based survey,
whereas we culled the remaining 1,000 from existing expressions of the culture on the web.
Table 4 shows how Participant 3 responded to these 2,000 stimuli.
For example, one of the stimuli in the fear category was “not being able to breathe.” Participant 3, who was asthmatic as a child, said that this would be extremely bad (1 on a 1-8 bad-good scale) and would very strongly tend to elicit the given emotion of fear (8 on a 1-8 scale
of how much or little the stimulus would make the respondent feel the given emotion). The
20 emotions were arranged naturally in 10 pairs of opposites, as shown in Table 4, and the
participant generally prefers the stimuli in the first category of each pair, with the exception
of surprise and excitement, on which the participant appears to be neutral or ambivalent. The
last two columns of Table 4 show the correlations between the two scales within each of the
20 categories, again showing a connection between the stimuli in a category and goodness or
badness, with the notable exceptions of surprise, excitement, and indifference.
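The comparison within each antonym pair can be made explicit with a short sketch that uses the mean bad-good ratings reported in Table 4. The cutoff of 0.25 scale points for calling a pair "neutral" is an assumption chosen for illustration and does not come from the article, as are the names `PAIR_MEANS` and `classify_pairs`.

```python
# Mean bad-good ratings from Table 4: (first category, second category).
PAIR_MEANS = {
    "love-fear": (5.07, 4.32),
    "joy-sadness": (5.09, 3.59),
    "gratitude-anger": (5.34, 3.80),
    "pleasure-pain": (4.78, 3.83),
    "pride-shame": (5.53, 3.88),
    "desire-hate": (4.56, 4.13),
    "satisfaction-frustration": (5.00, 3.74),
    "surprise-boredom": (4.51, 4.62),
    "lust-disgust": (4.61, 3.90),
    "excitement-indifference": (4.30, 4.27),
}

NEUTRAL_BAND = 0.25  # assumed cutoff, not taken from the article


def classify_pairs(pair_means, band=NEUTRAL_BAND):
    """Label each pair 'first', 'second', or 'neutral' by the gap in means."""
    labels = {}
    for pair, (first, second) in pair_means.items():
        gap = first - second
        if abs(gap) < band:
            labels[pair] = "neutral"
        elif gap > 0:
            labels[pair] = "first"
        else:
            labels[pair] = "second"
    return labels


labels = classify_pairs(PAIR_MEANS)
neutral = sorted(pair for pair, label in labels.items() if label == "neutral")
print(neutral)
```

With this cutoff, only the surprise-boredom and excitement-indifference pairs fall in the neutral band, which matches the two exceptions noted in the text; a different cutoff could of course change the classification.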
TABLE 5
Wisdom Module Popular Culture Items Drawn From Babylon 5

                        Number of     Mean Rating on    Mean Rating on
Character               Statements    1-8 True Scale    1-8 Important Scale
Lennier                 32            4.22              4.69
Delenn                  147           4.44              5.08
Jeffrey Sinclair        86            4.49              4.85
Lyta Alexander          34            4.50              5.32
Byron                   30            4.60              5.37
G’Kar                   131           4.60              5.08
Alfred Bester           72            4.61              4.69
Vir Cotto               37            4.62              4.76
Michael Garibaldi       100           4.62              4.88
Minor characters        636           4.63              5.04
Marcus                  36            4.64              4.97
Londo Mollari           147           4.64              5.01
Susan Ivanova           72            4.82              4.93
Dr. Stephen Franklin    68            4.85              5.12
John Sheridan           190           4.90              5.05
Anonymous               159           5.01              5.33
Kosh                    23            5.09              5.48
WISDOM
The next module, Wisdom, shows how questionnaire items can be derived entirely from a
particular exemplar of the ambient culture. Material for it came from content analysis of 120
hours of the science-fiction television program, Babylon 5 (B5), guidebooks to its complex
mythos, and B5 fiction (up to but not including the Technomage trilogy of novels). Traditionally, social scientists have often culled potential questionnaire items from the writings of
a great thinker, as Richard Christie and Florence Geis (1970) did when they created the influential Mach Scale from the writings of political philosopher Niccolò Machiavelli. I have
merely done the same thing with a contemporary source that addressed some of the same
issues of power in human relationships as did Machiavelli.
Created by J. Michael Straczynski, B5 draws deeply from the traditions of science fiction
literature, thereby reflecting a major genre of popular culture (Bassom, 1997). B5 is a city in
space, where humans and aliens meet, unaware that two vast powers are battling for dominance of the universe on a level of technical sophistication far beyond human understanding.
On one side are the Vorlons, who value order and ask, “Who are you?” On the other side are
the Shadows, who value chaos and ask, “What do you want?” The challenge of the TV series
concerns whether humans can unite the other aliens against both of these forces and establish
a cosmopolitan culture valuing liberty and diversity.
The items are statements derived from sentences spoken in an episode of the program,
limited to 10 sentences per hour of television, or published in a book, limited to 20 sentences
from each source. In many cases, the item is a verbatim quotation, but in other cases the original was edited minimally to transform it into a statement about people or life in general.
Table 5 shows how Participant 4 responded to the 2,000 items, categorized by the B5 character who spoke the original words, arranged in ascending order of mean “true” ratings. Participant 4 did not know the identities of the characters while rating their statements, and the
statements were administered in random order.
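The tabulation behind Table 5 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `mean_ratings_by_speaker`, the tuple layout, and the sample ratings are hypothetical and are not taken from the software used in the study.

```python
from collections import defaultdict
from statistics import mean


def mean_ratings_by_speaker(items):
    """Group rated items by speaker and sort the rows by mean "true" rating.

    `items` is an iterable of (speaker, true_rating, important_rating)
    tuples. This layout is an assumption made for illustration; ratings
    are on the 1-8 scales described in the text.
    """
    grouped = defaultdict(list)
    for speaker, true_rating, important_rating in items:
        grouped[speaker].append((true_rating, important_rating))

    rows = []
    for speaker, ratings in grouped.items():
        rows.append({
            "speaker": speaker,
            "n": len(ratings),
            "mean_true": mean(t for t, _ in ratings),
            "mean_important": mean(i for _, i in ratings),
        })
    # Ascending order of mean "true" rating, the ordering used in Table 5.
    return sorted(rows, key=lambda row: row["mean_true"])


# Invented ratings for illustration only; not data from the study.
sample = [
    ("Kosh", 6, 7), ("Kosh", 5, 6),
    ("Delenn", 4, 5), ("Delenn", 3, 6),
    ("John Sheridan", 5, 5),
]
table = mean_ratings_by_speaker(sample)
```

Because the speaker label is attached only at tabulation time, the respondent can rate the statements without seeing who spoke them.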
The first two characters in the list, Delenn and Lennier, are aliens from a haughty species
that once tried to exterminate humans, and Participant 4 tends to rate their statements lowest
on the true scale. One of the statements spoken by the character Delenn communicates well
the central principle of their caste-ridden society: “Understanding is not required, only obedience.” At the opposite end of the list, Participant 4 gives the highest true rating to statements by Kosh, the enigmatic Vorlon. Kosh is famous throughout the science fiction subculture for making inscrutable pronouncements hinting at profound wisdom, such as “The
avalanche has already started; it is too late for the pebbles to vote.”
Participant 4 also gives somewhat high true ratings to anonymous statements and to those
from the commander of B5, John Sheridan. The “anonymous” category consists of statements from television characters who are so minor they lack names and from authors writing
about B5 without taking the voices of characters, so in a sense these statements lack personality. Sheridan, the central character of the series, is a Christ-like figure who dies but is reborn.
His statements express both optimism and stoicism: “If you’re falling off a cliff, you might as
well try to fly” and “The way to deal with pain is to turn it into something positive.”
SELF
Having seen several examples of how items could be collected by means of web-based
questionnaires or extracted from published exemplars of the ambient culture, we must now
conclude with an example that focuses on the language itself. This is also the only example
here of comparing across individuals, which can be a valid part of understanding the individual in a social context.
The Self software module consists of 1,600 adjectives that could describe a person. They
came from a line of research that began a decade ago with a project exploring the “semantic
differential.” This is a commonly used kind of questionnaire scale developed back in the
1950s that asks the respondent to judge something in terms of several pairs of opposite adjectives (Bainbridge, 1994; Osgood, Suci, & Tannenbaum, 1957). The items were developed
with the help of 36 students in classes on the sociology of organizations and on small group
processes. Students were asked to think about the qualities they would like to see in people
they were working with. Each student wrote down as many as 20 of these terms, then wrote
down the antonym of each. Four standard thesauri were then used to check these antonyms
and to generate pairs of opposites that described personal qualities relevant outside the context of work, without reusing any of the words or employing any obscure terms. A total of
800 pairs of antonym adjectives were incorporated in the Self software, but each item was
just a single word, and the software unobtrusively kept track of antonym linkages that connected the 1,600 words into pairs.
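One simple way to keep track of such antonym linkages is a two-way lookup table. The sketch below is an assumption about how this could be done, not a description of the Self software itself; `build_antonym_index` is a hypothetical name, and the two sample pairs are the ones mentioned in the discussion of Table 6.

```python
def build_antonym_index(pairs):
    """Map every word to its antonym, in both directions.

    Raises ValueError if a word appears in more than one pair, mirroring
    the rule that no word is reused across pairs.
    """
    index = {}
    for left, right in pairs:
        if left in index or right in index:
            raise ValueError(f"word reused across pairs: {left!r} / {right!r}")
        index[left] = right
        index[right] = left
    return index


# Two of the 800 pairs, used here purely as an example.
pairs = [("courageous", "fearful"), ("hopeful", "despairing")]
antonym_of = build_antonym_index(pairs)

# Each questionnaire item is a single word; the pairing stays in the index.
items = list(antonym_of)
```

With this structure, 800 pairs yield 1,600 single-word items, and the partner of any word can be recovered when the responses are analyzed.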
Table 6 summarizes responses from 4 participants, numbers 5 through 8 in this study. A
respondent judged how bad or good it was for a person to have the quality described by each
word and how little or much he or she actually possessed the quality. Because we have so
many data points for each individual, it is possible to correlate people with each other to see
how similar or different their ratings are. For example, of the 1,600 qualities, Participant 5
and Participant 6 correlate .67 with each other on their bad-good ratings and .52 on the little-much ratings. The averages for the six coefficients linking the 4 participants are .67 again
(ranging from .61 to .74) for bad-good ratings and .47 (ranging from .37 to .56) for little-much ratings. The difference between .67 and .47 is actually quite interesting. Apparently,
the 4 subjects share cultural assumptions about how good or bad the qualities are, but they
have different self-images, each stressing a somewhat different collection of personal
qualities.
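The person-to-person comparison can be sketched as follows. The article does not say which correlation coefficient was used, so Pearson's r is assumed here; the function names and the sample ratings are hypothetical. Four participants yield six distinct pairs, which is why six coefficients are averaged.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length rating lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 items")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant ratings")
    return sxy / sqrt(sxx * syy)


def pairwise_correlations(ratings):
    """Correlate every pair of participants across the same ordered items.

    `ratings` maps a participant number to that person's list of ratings,
    with every list in the same word order (an assumed layout).
    """
    return {
        (a, b): pearson(ratings[a], ratings[b])
        for a, b in combinations(sorted(ratings), 2)
    }


# Invented bad-good ratings for five words; not data from the study.
bad_good = {
    5: [8, 7, 2, 1, 6],
    6: [7, 8, 1, 2, 5],
    7: [8, 6, 3, 1, 7],
    8: [6, 7, 2, 2, 8],
}
coefficients = pairwise_correlations(bad_good)
average_r = sum(coefficients.values()) / len(coefficients)
```

Here the unit of analysis is the word rather than the respondent: each coefficient is computed across items for a fixed pair of people.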
TABLE 6
Adjectives Describing a Person’s Character, Categorized by Participant 5

                                                       Self-Esteem (correlation Good and Much)a
Most “Good” Items                           Number     Participant  Participant  Participant  Participant
in Participant 5’s Category                 of Items        5            6            7            8
 1. Alert, alive, motivated                     72         .57          .77          .95          .83
 2. Clear, dedicated, focused                  110         .63          .78          .94          .89
 3. Unique, credible, exceptional              102         .71          .78          .89          .82
 4. Healthy, complete, durable                  82         .12          .58          .92          .58
 5. Enlightened, innovative, aware              90         .87          .92          .95          .95
 6. Future, real, instinctive                   88         .59          .70          .93          .71
 7. Courageous, hopeful, inquisitive           150         .13          .44          .87          .77
 8. Constructive, inspiring, true              284         .61          .82          .91          .85
 9. Spiritual, affectionate, loveable          114        –.21          .37          .87          .75
10. Resourceful, best, energetic               178         .39          .80          .81          .83
11. Able, capable, honest                      196         .87          .91          .90          .93
12. Celestial, cosmic, eternal                  56         .38          .80          .87          .87
13. Good-natured, initiating, approachable      78         .33          .73          .95          .90

a. Self-esteem is defined as saying qualities are “good” and having them “much.”
The 13 categories of qualities that define the rows of Table 6 were developed by Participant 5. We gave the participant a pocket computer loaded with a spreadsheet listing the 800
pairs and asked the participant to categorize them in about a dozen groups, using any principles he or she wished. Over a period of several days, the participant carried the pocket computer and from time to time worked on the categorization task, which in itself was yet another
way of capturing aspects of the participant’s personality. The labels of the 13 categories are
the three words that garnered the highest total good score from all 4 participants.
For each participant, Table 6 gives the correlations between the Good and Much scales in
each of the categories, which is a plausible measure of self-esteem. For example, Category 7
includes qualities like courageous and hopeful (and their antonyms, fearful and despairing).
Participant 5 placed 75 pairs of items in this category. Among the ratings given these 150
items by Participant 5, the correlation between the Good and Much scales is only .13, which
means essentially no correlation between rating a quality good and feeling that one possesses
it. This is much lower than the self-esteem coefficients for the three other participants: .44,
.87, and .77 for Participants 6, 7, and 8, respectively.
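The self-esteem coefficient for one respondent can be sketched as a within-category correlation between the two scales. As before, Pearson's r is an assumption, since the article says only "correlation"; the function names, the category labels, and the sample ratings are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length rating lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 items")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant ratings")
    return sxy / sqrt(sxx * syy)


def self_esteem_by_category(responses):
    """Correlate the Good and Much ratings within each category.

    `responses` is an iterable of (category, good, much) tuples for a
    single respondent, one tuple per adjective (an assumed layout).
    """
    good_by_cat = defaultdict(list)
    much_by_cat = defaultdict(list)
    for category, good, much in responses:
        good_by_cat[category].append(good)
        much_by_cat[category].append(much)
    return {
        category: pearson(good_by_cat[category], much_by_cat[category])
        for category in good_by_cat
    }


# Invented ratings for one respondent; not data from the study.
responses = [
    ("able", 8, 7), ("able", 1, 2), ("able", 7, 6), ("able", 2, 3),
    ("spiritual", 8, 3), ("spiritual", 1, 5),
    ("spiritual", 7, 2), ("spiritual", 2, 6),
]
coefficients = self_esteem_by_category(responses)
```

A coefficient near zero, such as the .13 reported for Category 7, would mean that knowing how good the respondent considers a quality says almost nothing about how much of it the respondent claims to possess.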
However, it may not be appropriate to say that Participant 5 has abnormally low self-esteem, because we do not have population norms for the coefficients. In addition, it is
important to remember that self-esteem can be abnormally high, as well as abnormally low.
This can occur, for example, during a clinically manic episode, as was in fact the case for Participant 7. More important, we can compare the self-esteem coefficients within the data for a
given respondent. Participant 5’s self-esteem is lowest for qualities like “spiritual, affectionate, loveable” (–.21) and highest for qualities like “able, capable, honest” (.87). Indeed, the
tables in this article are only the most superficial sketch of the patterns that can be seen by
looking closely at extremely rich data concerning one individual.
CONCLUSION
Tens of millions of people work and play daily on computers, and a few million carry
laptops, pocket computers, or PDAs. They could, if they wished, respond to very long questionnaires by doing a few items at a time whenever they had a few spare moments. For example, this paragraph was written with a pocket computer while riding on a subway. Archiving
one’s own personality could become a pleasurable hobby in which a few people invest hundreds of hours over a period of years.
Obviously, this vision has little to do with the traditional use of questionnaires as tools for
surveying random samples of the population. But the new information technology might
enable a very wide range of new social science applications and research methods that enrich
science and human life.
In a sense, this article has turned questionnaire methodology upside down. Instead of having one person write a questionnaire for a thousand people to answer, thousands of people
created questionnaires for one individual respondent. Instead of calculating the correlation
between two items across 1,000 respondents, we calculated the correlation between two
responses across 2,000 items within one person.
Personality capture may be carried out in a variety of ways for a variety of purposes. Thus,
a great number and diversity of scientific studies will be needed to determine which applications will be valuable and how to create them. Massive questionnaires created from the ambient culture are one viable approach for idiographic social science study of an individual
personality.
REFERENCES
Bainbridge, W. S. (1989). Survey research: A computer-assisted introduction. Belmont, CA: Wadsworth.
Bainbridge, W. S. (1991). Goals in space. Albany: SUNY Press.
Bainbridge, W. S. (1992). Social research methods and statistics. Belmont, CA: Wadsworth.
Bainbridge, W. S. (1994). Semantic differential. In R. E. Asher & J. M. Y. Simpson (Eds.), The encyclopedia of language and linguistics (pp. 3800-3801). Oxford, UK: Pergamon.
Bainbridge, W. S. (2000a). New technologies for the social sciences. In M. Renaud (Ed.), Social sciences for a digital world (pp. 111-126). Paris: Organisation for Economic Co-Operation and Development.
Bainbridge, W. S. (2000b). Religious ethnography on the World Wide Web. In J. K. Hadden & D. E. Cowan (Eds.),
Religion on the Internet (pp. 55-80). New York: Elsevier.
Bainbridge, W. S. (2002a). Public attitudes toward nanotechnology. Journal of Nanoparticle Research, 4, 561-570.
Bainbridge, W. S. (2002b). Validity of web-based surveys. In O. V. Burton (Ed.), Computing in the social sciences
and humanities (pp. 51-66). Urbana: University of Illinois Press.
Bainbridge, W., & Stark, R. (1980). Client and audience cults in America. Sociological Analysis, 41, 199-214.
Bassom, D. (1997). Creating Babylon 5. New York: Ballantine.
Bell, G., & Gray, J. (2001). Digital immortality. Communications of the Association for Computing Machinery,
44(3), 29-31.
Bianchi-Berthouze, N. (2002). Mining multimedia subjective feedback. Journal of Intelligent Information Systems,
19(1), 43-59.
Christie, R., & Geis, F. L. (1970). Studies in Machiavellianism. New York: Academic Press.
Frazier, K. (Ed.). (1981). Paranormal borderlands of science. Buffalo, NY: Prometheus.
Gazzaniga, M. S. (Ed.). (1995). The cognitive neurosciences. Cambridge, MA: MIT Press.
Goldman, B. A., et al. (1995-2002). Directory of unpublished experimental mental measures (8 vols.). Washington,
DC: American Psychological Association.
Kurzweil, R. (1999). The age of spiritual machines. New York: Penguin.
Osgood, C. E., Suci, G. J., & Tannenbaum, P. H. (1957). The measurement of meaning. Urbana: University of Illinois Press.
Pelham, B. W. (1993). The idiographic nature of human personality: Examples of the idiographic self-concept.
Journal of Personality and Social Psychology, 64(4), 665-677.
Robinett, W. (2002). The consequences of fully understanding the brain. In M. C. Roco & W. S. Bainbridge (Eds.),
Converging technologies for improving human performance (pp. 148-151). Washington, DC: National Science
Foundation.
Shoda, Y., Mischel, W., & Wright, J. C. (1994). Intraindividual stability in the organization and patterning of behavior: Incorporating psychological situations into the idiographic analysis of personality. Journal of Personality
and Social Psychology, 67(4), 674-687.
Sweetland, R. C., & Keyser, D. J. (Eds.). (1991). Tests: A comprehensive reference for assessments in psychology,
education, and business. Austin, TX: PRO-ED.
Thalmann, N. M., Kalra, P., & Escher, M. (1998). Face to virtual face. Proceedings of the IEEE, 86(5), 870-883.
Turkle, S. (2002). Sociable technologies: Human performance when the computer is not a tool but a companion. In
M. C. Roco & W. S. Bainbridge (Eds.), Converging technologies for improving human performance (pp. 148-151). Washington, DC: National Science Foundation.
Weber, L. M., Loumakis, A., & Bergman, J. (2003). Who participates and why? Social Science Computer Review,
21(1), 26-42.
Witte, J. C., Amoroso, L. M., & Howard, P. E. N. (2000). Method and representation in Internet-based survey tools:
Mobility, community, and cultural identity in Survey2000. Social Science Computer Review, 18(2), 179-195.
Zuckerman, M., Kuhlman, D. M., Joireman, J., Teta, P., & Kraft, M. (1993). A comparison of three structural models for personality. Journal of Personality and Social Psychology, 65(4), 757-768.
William Sims Bainbridge earned his doctorate from Harvard University. He is the author of 10 books,
four textbook-software packages, and about 150 shorter publications in information science, social
science of technology, and the sociology of culture. His software employed innovative techniques to
teach theory and methodology: Experiments in Psychology, Sociology Laboratory, Survey Research,
and Social Research Methods and Statistics. Most recently, he coedited Converging Technologies to
Improve Human Performance, which explores the combination of nanotechnology, biotechnology, information technology, and cognitive science (National Science Foundation, 2002; www.wtec.org/ConvergingTechnologies/).
He has represented the social and behavioral sciences on five
advanced technology initiatives: high performance computing and communications, knowledge and distributed intelligence, digital libraries, information technology research, and nanotechnology. Currently, he
is deputy director of the Division of Information and Intelligent Systems of the National Science Foundation,
after having directed the division’s Human Computer Interaction, Universal Access, and Knowledge and
Cognitive Systems programs. He may be contacted at [email protected].