Elite Ideology Across Media: Constructing a Measure of Congressional Candidates’ Ideological
Self-Presentation on Social Media
David S. Lassen
Benjamin J. Toff
March 10, 2015
Draft prepared for discussion at the American Politics Workshop at UW-Madison
Please do not cite without authors’ permission
Introduction
Recently there has been renewed interest among political scientists in the messages
circulated by elected officials, especially members of Congress. Joining technical advances in
collection and analysis techniques with a significant body of previous studies showing the
informative potential in elite communication, scholars have argued that members’
“representational style” is a crucial, yet often overlooked element of concepts such as
representation and mass political discourse. By influencing media content, public opinion, and
the broader information environment, elected officials’ words help create the political world in
which we live. As Grimmer and Stewart note, “language is the medium for politics and political
conflict,” a tool elected officials frequently use to advance their interests (2013, 1).
This attention has implications for how we conceptualize and measure elite ideology. In
its purest form, ideology is often defined as “a latent set of values that organizes personal
political attitudes” (Burden, Caldeira, and Groseclose 2000). Recognizing that such a concept is
impossible to confidently measure in strategic actors, however, many scholars have noted that in
its observable form, congressional ideology is influenced by the context in which elite behavior
occurs, a space carved out by partisan and institutional forces. Roll call measures of ideology
such as NOMINATE (Poole and Rosenthal 1997), for example, permit a member to express their
position on gun control only in support of or opposition to legislative language they may have
had little control over. Some have therefore created alternate measures of members’ philosophies
rooted in non-voting data such as campaign contributions (Bonica 2013), members’ responses
to surveys (e.g., Bianco 1994, Erikson and Wright 1996), and relevant media coverage (Hill,
Hanna, and Shafqat 1997). Such efforts may provide only minimal benefits over established
action-based scores (e.g., Burden, Caldeira, and Groseclose 2000; Bishin 2003).
By contrast, we argue that the increasing volume and sophistication of language members
today distribute via social media may represent valuable material for a new measure of
congressional ideology. Using existing text-analytic methods and Twitter messages posted by
members of and candidates for the 113th Congress, we create and examine an indicator (which
we refer to as a “Tweetscore”) of the manner in which members voluntarily present themselves
to the public in a space relatively unencumbered by the congressional docket, scholars’ inquiries,
or party officials. Unlike previous studies and scores that sought to differentiate
themselves from NOMINATE, we do not claim that our measure captures a new
dimension of ideology. Indeed, we are reassured by evidence that our tweet-based measure
correlates well with and appears to be motivated in the same manner as NOMINATE. Instead,
we contend that because of its unusually constant and increasingly institutionalized production,
member social media content may provide an opportunity to examine heretofore elusive aspects
of operative congressional ideology (see Rohde 1991), including individual longitudinal change
and sensitivity to exogenous shocks such as national security crises.
Measuring Elite Ideology
Existing measures of congressional ideology have generally adopted one of four
empirical approaches. The first and most common centers on members’ roll call votes (e.g.,
Poole and Rosenthal 1997, Heckman and Snyder 1997, Snyder and Groseclose 1997).
NOMINATE is by far the most widely used of this group. Based on the votes cast in all nonunanimous roll calls,1 NOMINATE draws on a much larger dataset than most other measures.
The NOMINATE score for the average legislator in our data was constructed using almost 1200
1
Poole and Rosenthal (1997) include all votes in which more than 2.5 percent of legislators
disagreed.
distinct votes. Yet for most members these votes center on legislative language they had little
direct influence on. Roll call based measures of member ideology may therefore be best suited to
distinguish between partisans of different stripes. In other words, ideology in this case is
conceptualized as low dimensional scaling, differentiating members along a common spectrum
(Noel 2013). Another category of scores relies on organizational evaluations of member behavior
by interest groups such as the American Conservative Union (ACU) and Americans for
Democratic Action (ADA). Though common in the literature, these scores have been frequently
criticized for the small, potentially ideologically motivated sample of votes they rely on (e.g.,
Jackson and Kingdon 1992, Cox and McCubbins 1993, Krehbiel 1994, Box-Steffensmeier and
Franklin 1995). Because of the nature of their data, ACU and ADA scores are often seen as
insufficiently granular (many members receive the same score) and artificially extreme.2
Other scores leverage potentially costly candidate evaluations made by individual
observers. Some studies, for example, have based their scores on media descriptions and
analyses of candidate behavior (e.g., Clinton, Jackman, and Rivers 2004). These measures reflect
the manner in which members’ campaign and legislative activity are presented in major media
outlets. These scores assume that journalists possess a unique blend of skills and incentives to
identify, collect, and present key evidence of officials’ worldviews, especially during an election
campaign (Hill, Hanna, and Shafqat 1997). Reporters who fail to report important stories or
misstate key facts may face professional sanction. Broad comparisons of these media-based
measures suggest, however, that there are few key differences between them and action-based
scores such as NOMINATE (Burden, Caldeira, and Groseclose 2000). More recently, Bonica
2
Some have attempted to redeem interest group ratings in some form (e.g., Levitt 1996;
Groseclose, Levitt, and Snyder 1999; Bishin 2003; Anderson and Habel 2009; Chand,
Schreckhise, and Parry 2014), but they have largely fallen out of use.
(2013) has modeled ideology on patterns of campaign contributions. Given that their financial
support likely increases the probability of a candidate's victory, contributors have a clear
incentive to identify candidates whose ideology is similar to their own. Unlike many other
scores, however, the resultant Database on Ideology, Money in Politics, and Elections (DIME)
includes ideal point estimates for both incumbents and challengers.
Finally, some scholars adopt what may be referred to as a "sociological approach." These
studies attempt to directly infer members' preferences either through elite opinion surveys or
demographics (e.g., Miller and Stokes 1963, Kingdon 1988, Bianco 1994, Erikson and Wright
1996, Bishin 2003, Shor 2012). Direct approaches of this type have a clear, intuitive appeal,
mirroring mass opinion surveys. Yet they are also extremely resource intensive and inflexible.
Scores created in this style are either unlikely or unable to be updated to reflect potential changes
in members' value systems.
Measures arising from each empirical category thus offer distinct strengths and
weaknesses as empirical tools. Though most appear to represent a similar underlying construct,
some are more readily generated (roll call measures), widely applicable (campaign contribution
measures), or meaningful for a democratically important relationship (media-based measures).
At the same time, these measures often rely on infrequent, externally constrained behaviors that
may not fully capture members' representational intentions. We argue that our social media
measure of elite ideology carries with it many of these strengths but does not suffer from similar
weaknesses.
Congress and Twitter
Social media use among members of Congress has taken time to develop. Members have
long been wary of new technology.3 Electorally focused, members are often hesitant to change
previous patterns of constituent interaction,4 patterns that have been part of successful campaigns
(Wood 1974). As a technology becomes more widely used among the public, however, many
members of Congress do eventually embrace it and, over time, develop relatively stable patterns
of use. The well-studied case of congressional campaign websites is instructive in this regard.
Largely viewed as an inconsequential novelty when the first official congressional sites launched
in 1993 (Owen et al. 1999, Bimber and Davis 2003), early adopters were generally young,
inexperienced members looking to cement their place in the institution and among their
constituents (D’Alessio 1997, 2000). Over time as online activity among the public rose, more
tenured members also created websites. By 2004, approximately 90 percent of major party
candidates for Congress had campaign websites (Foot and Schneider 2006). In 2002 one
observer asserted that “the question is no longer whether candidates for major office will have a
website, but what the website will look like and how it will be used” (Williams, Aylesworth, and
Chapman 2002, 43).
More importantly for our purposes, during this time, candidates’ website content became
more similar as well. Perhaps recognizing the potential to connect with politically active voters
3
A telling event occurred in 1868 when a young Thomas Edison offered the House a reliable
system for electronically recording floor votes (North 2009). He was flatly rejected, with one
member purportedly replying “young man, that won’t do at all. That is just what we do not want”
(North 2009, 64, emphasis in original). It would be more than 100 years before the House
electronically recorded an official vote (United States House of Representatives). The Senate
continues to require all votes be recorded by hand (United States Senate).
4
In 1997 Pat Leahy quipped that many members of Congress “wouldn’t even know how to turn
on a computer if they had to. They think it’s a not working television that won’t give you CNN”
(Johnson 2004, 98).
(Tolbert and McNeal 2003; Norris 2004), activists (Foot and Schneider 2006), and journalists
(Lipinski and Neddenriep 2004) through the same medium, “certain content and functionality or
tools [became] standard features on [congressional] sites” (Gulati and Williams 2009). This is
not to suggest that all campaign websites are interchangeable; indeed, there is evidence of
significant variation by the gender (Gulati 2004) and geographic location (Esterling, Lazer, and
Neblo 2013) of the candidate-owner. Instead, this literature indicates that major party candidates’
website designs and content reflect generally predictable, stable responses to measurable
incentives (Druckman, Kifer, and Parkin 2014; Esterling, Lazer, and Neblo 2005). Congressional
website design has thus exhibited significant path dependence, developing norms among
incumbents that “may be taken for granted as design requirements for legislative websites”
(Esterling, Lazer, and Neblo 2011; see also Bimber 2003; Chadwick 2006; Fountain 2001;
Xenos and Foot 2005).
The experience of congressional candidates’ website development provides a rare
illustration of elites incorporating a new communication technology into existing
representational practices and thereby gives us a set of expectations for the language candidates
distribute via Twitter. Though candidate websites and Twitter feeds differ in many ways, they
share many of the same affordances.5 Each offers the opportunity to disseminate information to a
wide population, distribute multimedia content, develop a community of supporters, and direct
visitors to volunteer and donation opportunities. Even the most significant difference—Twitter’s
restriction of 140 characters per post—is mitigated by candidates’ extensive use of hyperlinks to
connect tweet readers with longer form content or online services.
5
Putnam (2008) defines technology affordances as “the ways in which technology offers or
supports certain things” (see also Gaver 1991).
In many ways elite use of Twitter has followed the same developmental arc as campaign
websites. Congressional elites began tweeting during the 2008 campaign season. Early adopters
were again young newcomers to Congress, though party leaders were soon advising all members
how to best communicate with voters on social media (Lassen and Brown 2011). The content of
early tweets varied widely, was mostly personal for some members, and at times was somewhat
bizarre.6 It was simple for observers to dismiss congressional Twitter use as an idiosyncratic
hobby. Like campaign websites, however, Twitter content has become an increasingly common,
professionalized element of candidate self-presentation efforts. By the end of the 2012 election
nearly 90 percent of all major party candidates were active Twitter users and were producing a
steady stream of 100,000 to 200,000 tweets annually. Though there has been little systematic
research on the evolution of the content of candidate tweets during this period, anecdotal
evidence suggests that congressional elites in 2012 had become more sophisticated in their social
media presence. Candidates in this election were, for example, more likely to tweet about
campaign (not personal) issues in a manner consciously integrated into larger, electorally
motivated communication strategies than in previous elections (Hemphill et al. 2013, Lassen and
Bode 2013, Evans et al. 2014).7
The widespread expansion of Twitter use among even older, more experienced members
of Congress is unsurprising when one considers the promises of the service as a communication
tool. Twitter allows individual members to create a dramatically inexpensive, direct
communication space largely unencumbered by partisan or institutional pressures. On Twitter
6
Senator Charles Grassley’s early tweets, for example, included phrases such as “We still on
skedul/even workinWKEND” and “hit a deer on hiway 136 … Assume deer dead” (see Malone
2012).
7
The development of Senator Charles Grassley’s approach to Twitter is an intriguingly open
example (Camia 2013).
members need not wait for relevant legislation, hope for a journalist's favorable pen, or pay for
access to a given media market to express their position on a subject. Similarly, producing one or
even a dozen tweets may be far less resource intensive for a campaign than crafting and
distributing longer form newsletters or press releases, especially given the potential for members'
tweets to be subsequently shared with new, larger audiences by citizens, journalists, and other
elites (Kwak, Lee, Park, and Moon 2010; Arceneaux and Weiss 2010). Tweets can form the
beginning of longer form attention and discussion. This is true even accounting for funding and
staffing disparities among members and candidates. Indeed, in important ways Twitter has
become a new kind of wire service, a place where all congressional actors may briefly but
meaningfully comment on issues and events they may or may not elaborate on elsewhere.
Candidates of all stripes thus have a strong normative and practical incentive to frequently and
directly communicate with potential supporters using a nearly universally known, low-cost
platform such as Twitter (Pitkin 1967, Mayhew 1974).
Elite social media use is also attractive for scholars. Similar to websites, the
extensiveness of candidate Twitter use facilitates the creation of datasets of directly comparable
messages, even among challengers who fail to attract much attention or support (see Druckman,
Kifer, and Parkin 2014). The development of recent computational tools further suggests that
such data may facilitate more direct and nuanced ideological comparisons of elites. Unlike many
websites, for example, members often tweet on a near daily basis, producing a continuous body
of time stamped content that facilitates longitudinal examinations. We therefore consider
congressional Twitter use not with the assumption that tweets necessarily exert a significant
influence on voters (although this potential has yet to be adequately examined), but that they may
provide a nearly universal indicator of broader campaign strategy and representational style
(Meinke 2009, Grimmer 2013).
Hypotheses
We therefore expect that a social media based measure of elite ideology would be very
useful for students of Congress. In the remainder of this paper we begin to examine a potential
score of this type. Existing evidence suggests to us that such a score will possess the following
characteristics:
H1 (Ideological Distinctiveness): The Twitter content produced by members of Congress
during 2011 and 2012 included significant, identifiable ideological language consistent with
members’ partisan identification. In other words, we expect Republicans’ tweets to more closely
resemble each other than those produced by otherwise similar Democrats.
H2 (Competition Constraints): As a result of candidates’ desire to win elections,
individuals facing a more challenging or competitive election will be more likely to use their
tweets to present themselves as ideologically appealing and distinct from other candidates, even
within their own party.
H3 (Universal Adoption): Because of Twitter’s low cost, accessible nature, we expect
that there will be few systematic differences between the ideological language posted by
otherwise similar incumbents and challengers. Unlike most other ideology measures, including
those proposed by Bonica (2013), our social media based measure puts incumbents and
challengers on nearly equal footing, allowing all candidates equal access to produce relevant
data.
Positioning Texts in Ideological Space
To construct our social media based measure of elite expressions of ideology, we first
examined every tweet posted by a major party general election candidate for the United States
Congress from January 1, 2011 to election day 2012. To compile this dataset, we identified all
Twitter handles (both campaign and congressional, where applicable)8 associated with all
relevant candidates and then collected all tweets from the accounts during our time period using
Twitter’s REST API.9 Our final sample includes 1,269 separate Twitter feeds. Since both official
congressional office and campaign handles were tracked using the API, where multiple feeds
were associated with a single member or candidate, feeds were combined and treated in the
analysis that follows as a single unit.10 Combining feeds by candidate/member produced a total
sample of 881 unique users and approximately 500,000 tweets.
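The feed-combining step described above might be sketched as follows. This is a minimal illustration rather than the authors' code: the function name `combine_feeds`, the record layout, and the example handles are all hypothetical.

```python
from collections import defaultdict


def combine_feeds(tweets, handle_to_candidate):
    """Pool tweets from every handle tied to the same candidate.

    tweets: iterable of dicts with "handle" and "text" keys (assumed layout).
    handle_to_candidate: dict mapping a lower-cased handle to a candidate id.
    """
    combined = defaultdict(list)
    for tweet in tweets:
        candidate = handle_to_candidate.get(tweet["handle"].lower())
        if candidate is not None:  # skip handles outside the sample
            combined[candidate].append(tweet["text"])
    return dict(combined)


# Hypothetical example: one official and one campaign account for one member.
example_map = {"repexample": "Example, Pat", "exampleforcongress": "Example, Pat"}
example_tweets = [
    {"handle": "RepExample", "text": "Voting on the budget today"},
    {"handle": "ExampleForCongress", "text": "Join us at the rally"},
]
pooled = combine_feeds(example_tweets, example_map)
```

Pooling at the candidate level, as in the text, means each candidate contributes a single document to the scoring step regardless of how many accounts he or she maintains.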
We then employed an established text-as-data computational method to extract
comparable ideology scores from each candidate’s combined feed. Our method involves both
systematic human examination of the words used by the feed owners as well as computer-aided
techniques.11 As we show below, our method offers scholars of 21st century political
8
Congressional rules require members to separate their Twitter content according to its purpose.
In other words, any tweet not directly related to the member’s official business as a legislator is
required to originate from an identifiably separate account (Davidson 2014).
9
For more information on all aspects of the Twitter API, visit: https://dev.twitter.com/docs/api.
10
The authors acknowledge that treating campaign feeds and feeds administered as part of
official congressional duties as a single unit may be problematic. However, combining feeds was often
necessary as some handles produced too few tweets for inclusion in the analysis.
11
See Grimmer and Stewart (2013) and Hopkins and King (2010) for a fuller discussion of the
various pros and cons of different computer-aided techniques for analyzing text-as-data.
communication great potential for examining hypotheses previously difficult to test in a
generalizable manner.12
For comparison purposes we also used the same method to construct scores from the text
of all one minute floor speeches given by members during the 112th Congress.13 Member
language on the floor of Congress provides a useful contrast to Twitter content because while it
possesses some of social media’s beneficial characteristics (e.g., one minute speeches are not
necessarily bound by current legislation (Maltzman and Sigelman 1996)), it is still heavily
influenced by party leaders who recruit and provide talking points to members for as many as
four in ten one minute speeches (Harris 2005). Similarly, many floor speeches are brief and
appear sensitive to constituents’ views (Kringer and Shen 2014). On this point, Shogan and
coauthors (2013) find that only 20 percent of one minute speeches include more than one policy-related statistic, such as the rate of Americans who lack employment or health insurance.
Automated methods offer both practical and theoretical benefits over human coding of
subsamples. Given the sheer unwieldy size of the data, the efficiency savings to researchers are
self-evident. In addition, the polysemic, subjective nature of political content makes reliance on a
team of human coders who will almost certainly differ in interpreting the content and ideological
character embedded in tweets problematic. Instead, while the methods employed in this paper
may fail to capture some nuances of language use, by using systematic, objective measures, we
12
Pre-processing of the text followed the methods in Toff and Kim (2014) and utilized the same
Python code. The steps included the removal of English language “stop-words,” punctuation,
numerals, hyperlinks, and at-mentions. Frequently used abbreviations on Twitter were replaced
with their full-length equivalents (e.g., “national” for “natl” or “continue” for “cont”), and
approximately 200 politically salient n-grams were created so as to distinguish between, for
instance, references to “social security” and other uses of the words “social” and
“security.” This list of political n-grams was adopted from the index of an introductory American
politics textbook (Canon and Bianco 2013).
13
Our thanks to Brad Jones for sharing this data with us.
may limit bias associated with researchers' own subjective impressions and aid in ensuring that
our findings may be replicated with other data.
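The pre-processing steps listed in the accompanying footnote (removal of stop-words, punctuation, numerals, hyperlinks, and at-mentions; expansion of abbreviations; joining of political n-grams) can be illustrated with a short sketch. It is not the Toff and Kim (2014) code: the word lists below are tiny stand-ins chosen for illustration, and the function name `preprocess` is hypothetical.

```python
import re

# Illustrative stand-ins only; the actual stop-word, abbreviation, and
# n-gram lists used in the paper are not reproduced here.
STOP_WORDS = {"the", "a", "an", "and", "or", "to", "of", "in", "on", "for", "is", "we"}
ABBREVIATIONS = {"natl": "national", "cont": "continue"}
NGRAMS = {"social security": "social_security"}


def preprocess(tweet):
    """Return a list of cleaned tokens for one tweet."""
    text = tweet.lower()
    text = re.sub(r"https?://\S+", " ", text)   # drop hyperlinks
    text = re.sub(r"@\w+", " ", text)           # drop at-mentions
    text = re.sub(r"[^a-z\s]", " ", text)       # drop punctuation and numerals
    # Expand abbreviations token by token.
    text = " ".join(ABBREVIATIONS.get(tok, tok) for tok in text.split())
    # Join politically salient n-grams into single tokens.
    for phrase, joined in NGRAMS.items():
        text = re.sub(r"\b" + re.escape(phrase) + r"\b", joined, text)
    # Remove stop-words last so that n-grams are matched on the full text.
    return [tok for tok in text.split() if tok not in STOP_WORDS]
```

Joining n-grams before removing stop-words is a design choice in this sketch; it keeps phrases that contain a stop-word intact, should any appear in the n-gram list.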
The specific text-as-data tool we employ is a "bag of words" approach known as
Wordscores, developed by Laver, Benoit, and Garry (2003). This automated method situates
documents (in this case, Twitter feeds) along a one-dimensional ideological continuum
according to the frequency with which partisan words are employed. The advantage of this
method over other similar dictionary-based methods is that it requires fewer theoretical
assumptions. Rather than assigning ideologies to words based upon a dictionary or detailed
codebook of a priori partisan language, the Wordscores procedure generates a list of scored
words based on assigned reference texts of known position in ideological space. These scored
words—and only the words that appear in the assigned reference texts—are then used to assess
documents of unknown partisanship.
As in Toff and Kim (2014), which uses Wordscores to examine Twitter messages
disseminated by party leaders and partisan media, this paper also utilizes the national Democratic
and Republican party platforms from each party’s 2012 nominating conventions as reference
texts. Platforms, although relatively obscure documents, are particularly useful for this enterprise
in that they capture each party’s official positions on the precise issues they chose to highlight
during the 2012 election year. Each document was assigned a position in ideological space (1 =
Republican platform, -1 = Democratic platform), and resulting Wordscores are interpretable in
relation to these reference scores assigned by the researchers.14 The set of unique words that
appear in each platform, weighted by their frequencies, are used to construct the list of scored
14
The precise numbers should not be read as absolute ideological scores but as reflective of a
single cross-section in time.
words with which the Wordscores procedure assigns average scores to our sample of Twitter
users. The number of documents and scored words for both our Twitter and floor speech samples
are presented in Table One.
[Insert Table One About Here]
Wordscores involves four basic computational steps: (1) selection of a reference set of
documents with known positions in an ideological space; (2) scoring of the unique words that
appear in the reference documents weighted according to their frequency; (3) scoring of a new
set of texts of unknown position in the same ideological space; and (4) transformation of those
numerical scores according to a stock formula that accounts for differences in the variance of
word usage in the reference and scored texts.15 The four steps are summarized in Figure One.16
[Insert Figure One About Here]
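The four steps can be made concrete with a compact sketch. It follows the general Wordscores logic of Laver, Benoit, and Garry (2003) but is not the software used in this paper; the function names are hypothetical, and the use of population standard deviations in the rescaling step is an assumption of the sketch.

```python
from collections import Counter
from statistics import mean, pstdev


def relative_freqs(tokens, vocab=None):
    """Relative word frequencies, optionally restricted to a vocabulary."""
    counts = Counter(t for t in tokens if vocab is None or t in vocab)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()} if total else {}


def score_words(reference_texts, reference_scores):
    """Steps 1-2: score every word that appears in the reference texts."""
    freqs = {name: relative_freqs(toks) for name, toks in reference_texts.items()}
    vocab = set().union(*freqs.values())
    word_scores = {}
    for w in vocab:
        total = sum(f.get(w, 0.0) for f in freqs.values())
        # Each reference score is weighted by the probability of being in
        # that reference text, given that word w is observed.
        word_scores[w] = sum(
            f.get(w, 0.0) / total * reference_scores[name]
            for name, f in freqs.items()
        )
    return word_scores


def score_text(tokens, word_scores):
    """Step 3: frequency-weighted mean score of the scored words in a text."""
    freqs = relative_freqs(tokens, vocab=word_scores)
    if not freqs:
        return None  # the text shares no words with the reference texts
    return sum(f * word_scores[w] for w, f in freqs.items())


def rescale(raw_scores, reference_scores):
    """Step 4: stretch raw scores to match the spread of the reference scores."""
    values = list(raw_scores.values())
    centre = mean(values)
    sd_texts = pstdev(values)
    sd_refs = pstdev(list(reference_scores.values()))
    if sd_texts == 0:
        return dict(raw_scores)
    return {k: (v - centre) * (sd_refs / sd_texts) + centre
            for k, v in raw_scores.items()}
```

With the reference scores used here (1 for the Republican platform, -1 for the Democratic platform), a word used only in the Republican platform receives a score of 1, and a word used equally often in both receives a score of 0. Because the rescaling depends on the full set of scored texts, rescaled values are comparable only within a single run.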
We refer to the member-specific scores that result from applying Wordscores to member
Twitter feeds in this manner as Tweetscores. This measure captures the extent to which
members’ self presentation on Twitter is consistent with the general ideological language used
by their party. The distributions of Tweetscores (and the related Wordscores-based measure for
our floor speech texts) for Democrats and Republicans are plotted in Figure Two. A clear divide
is evident between members of each party, and indeed estimates derived from both types of
sampled text are significantly correlated with DW-NOMINATE scores, although Tweetscores
15
This final step allows scholars to more easily compare texts of unknown ideological position to
the positions of the reference documents used in deriving the scores. However, in that the
transformation utilizes the variances of the entire set of scored documents, comparisons between
scores not derived simultaneously are impossible (Laver, Benoit and Garry 2003; Lowe 2008),
so caution must be applied to assessing differences in the magnitude of scores between samples
estimated separately.
16
For more information about Wordscores, please consult the website of its authors. See
http://www.tcd.ie/Political_Science/wordscores/.
are comparatively more so. Among the subset that includes incumbents whose NOMINATE
scores are available (N=483), the correlation is a healthy 0.65, and 0.69 when the sample is
further reduced to just those in competitive elections.17 Among incumbent senators, the
correlation is 0.75. While some of the difference in correlations may be attributable to
differences in messaging styles—senators might be expected to devote more resources to
maintaining a professional-style Twitter feed that more accurately captures their own political
self-presentation—other differences in the correspondence between NOMINATE scores and
Tweetscores likely reflect the precision of the estimates. As Figure Three demonstrates, the
correlation between NOMINATE scores and Tweetscores increases steadily as we restrict our
sample to only the most prolific tweeters in Congress—in other words as we increase the volume
of language available to score each feed.
[Insert Figure Two About Here]
[Insert Figure Three About Here]
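The comparison behind Figure Three, in which the correlation is recomputed as the sample is restricted to more prolific tweeters, could be sketched as follows. The field names (`n_tweets`, `tweetscore`, `nominate`), the minimum subset size, and the function names are hypothetical, and no data from the paper are reproduced.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation; returns None when either series is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None
    return cov / sqrt(var_x * var_y)


def correlation_by_volume(members, thresholds, min_n=3):
    """Correlate Tweetscores with NOMINATE among members at or above each
    tweet-count threshold, skipping subsets that are too small to use."""
    results = {}
    for threshold in thresholds:
        subset = [m for m in members if m["n_tweets"] >= threshold]
        if len(subset) < min_n:
            continue
        r = pearson([m["tweetscore"] for m in subset],
                    [m["nominate"] for m in subset])
        if r is not None:
            results[threshold] = r
    return results
```

Plotting the returned values against the thresholds would give a curve of the kind the text describes, although shrinking subsets also make each successive estimate less precise.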
More broadly, we also compare Tweetscores to other measures of congressional
ideology. Table Two reports the correlation between our measure and at least one measure from
each of the empirical approaches to ideology scoring noted earlier. Consistent with Hypothesis 1,
in each case Tweetscores presents comparable and significant results. Reassuringly, Tweetscores
correlate most strongly with the most widely used score presented here: NOMINATE. Note also
that Tweetscores correlates more strongly with each measure than our floor speech-based score
(CR Score) does with any other version. At the same time, however, it is not surprising that the
Tweetscore correlation is somewhat weaker than that evidenced by comparisons of existing
17
Defined as the 18 states and 81 House districts the New York Times identified as "in play" in
the 2012 election. See http://elections.nytimes.com/2012/ratings/house/.
measures. The Twitter data on which Tweetscores are based are unusually expansive and
unconstrained, allowing members and challengers to construct an ideological profile that, though
it may depart at times from some members’ operative institutional choices, represents a truer
portrait of the member’s desired ideological profile.
[Insert Table Two About Here]
As with NOMINATE scores that measure incumbent ideology through their roll call
voting behaviors, Tweetscores also appear closely related to the ideology of candidates' home
districts or states. In Figure Four, Tweetscore estimates are plotted alongside Cook Partisan
Voting Index (PVI) scores for the candidates' home district or state. Among winners of the
election, a clear positive relationship is evident, with candidates in more conservative
environments using significantly more conservative language in their Twitter feeds. Among
losers in the election, however, the relationship is reversed, as losing candidates appear markedly
misaligned ideologically with voters in their district or state. We find a similar but somewhat
weaker pattern (presented in Figure Five) among member language expressed by incumbent (i.e.,
non-retiring) members during floor speeches. Results in the left hand panel of Figure Five
include members who were defeated during a primary, including those whose district was
substantially altered for the 113th Congress. No similar relationship appears when considering the
ideological nature of the language used by retiring members in their floor speeches.
[Insert Figure Four About Here]
[Insert Figure Five About Here]
To more closely examine the factors shaping candidates’ Tweetscores, we created
multivariate regression models to predict incumbent and challenger ideology scores expressed in
both congressional debate and social media. The results of four OLS models are presented in
Table Three.18 The first two models predict each candidate’s raw Tweetscore and floor speech-based
ideology score. Again we see evidence of the influence of party, with Republicans in each
case more likely to receive a numerically higher (i.e., more conservative) score. Non-party
estimates in each model are also largely consistent with expectations. Female candidates, for
example, present a consistently more liberal face to the public than their male counterparts do.
The second two models in Table Three predict within party variance in scores of each type.
These results provide clear evidence in support of Hypothesis 2, suggesting that candidates are
sensitive to their electoral environment when managing their public ideological profile. As local
electoral pressures increase, candidates become more likely to diverge from their party’s mean
ideological position in order to appeal to local voters. Arguments that candidates position
themselves according to voters’ preferences are not new, but Tweetscores are uniquely well
suited to provide evidence on this point, especially when compared with roll-call scores like
NOMINATE.
[Insert Table Three About Here]
The model presented in Table Four also tests the utility of Hypothesis 2 by evaluating the
factors motivating extreme partisan language. Extreme partisan language is both a reflection of
an individual member or candidate's own political views and a strategically modulated
campaign device. We might expect candidates for office to moderate the ideology in their public
facing communications while engaged in electoral politics. To test this implication of Hypothesis
2, we estimated a fifth model (presented in Table Four) to compare estimates of extremity
generated by both Tweetscores and NOMINATE. Given that extreme language is the variable we
18
We also estimated a number of other hierarchical and nested models but finding that they
produced no substantive differences, we report the simpler models here.
seek to explain, and we operationalize it as the absolute value of a candidates' Tweetscore, we
must use a separate measure of candidate ideology as a covariate in the model itself. Therefore,
we test this hypothesis on a smaller subset of the overall sample for which NOMINATE scores
are available, using those scores as a measure of incumbent ideology.
[Insert Table Four About Here]
Using this considerably smaller sample, we do see a significant interaction between
candidate and district ideology in the expected direction. A member with a -0.5 NOMINATE
score can be expected to produce a more extreme Tweetscore (by 0.19)—or roughly a tenth of
the difference between the average Democrat and Republican in the sample—in a conservative
district where Obama won just 30 percent of the vote compared to a liberal district where 70
percent of ballots endorsed Obama. The difference is even more extreme among those with
conservative NOMINATE scores: an incumbent with a 0.5 score is expected to moderate his or
her score by almost a full point (0.97) across the same range of districts; however, given the
existing narrow spectrum of incumbents in districts that differ substantially from their own
ideology, using incumbents for these scenarios may be less revealing than scenarios involving
the full sample. Two other variables of note were also statistically significant predictors of
extreme language. Female members were associated with more extreme language (0.19), while
the median age of the district was associated with less extreme language, although there are no
clear theoretical explanations to support either of these findings.
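The two predicted differences above can be reproduced from the Table Four estimates for Obama 2012 Vote %, NOMINATE Score, and their interaction. The sketch below is a back-of-the-envelope check rather than the estimation code itself. It assumes that the Obama vote share enters the model as a proportion between 0 and 1 (an assumption, though one that recovers the reported 0.19 and 0.97) and that all other covariates are held fixed, so they cancel out of the comparison.

```python
# Coefficients as reported in Table Four (outcome: Tweetscore extremity).
B_OBAMA = -1.45        # Obama 2012 Vote %
B_NOMINATE = 0.64      # NOMINATE Score
B_INTERACTION = -1.94  # Ideology X Obama 2012


def extremity_gap(nominate, obama_low=0.30, obama_high=0.70):
    """Predicted extremity in the low-Obama district minus the high-Obama one.

    Assumption: Obama vote share is coded as a proportion (0.30, not 30).
    Covariates other than the three terms below are held constant and cancel.
    """
    def partial(obama):
        return (B_OBAMA * obama
                + B_NOMINATE * nominate
                + B_INTERACTION * nominate * obama)

    return partial(obama_low) - partial(obama_high)


gap_liberal = extremity_gap(-0.5)      # member with a -0.5 NOMINATE score
gap_conservative = extremity_gap(0.5)  # member with a 0.5 NOMINATE score
```

Because the NOMINATE main effect is identical in both districts, each gap reduces to (B_OBAMA + B_INTERACTION * nominate) * (obama_low - obama_high), which makes clear that the interaction term alone drives the asymmetry between liberal and conservative incumbents.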
Similarly, Tweetscores allow us to directly compare the ideological statements made by
both incumbents and challengers in similar electoral conditions. The results in Table Three also
support Hypothesis 3 in their lack of a clear distinction between challengers and incumbents.
Instead, as we might expect, incumbency status appears to have little to do with the ideological
nature of the public face of a campaign. Again, the value added here is less connected with the
nature of the argument itself and more with the ease with which we are able to compare the
public communication behavior of both incumbents and challengers.
Conclusion
In general, our paper makes two contributions to the ongoing discussion surrounding the
place of social media in congressional campaigns. First, we have demonstrated the tractability of
Twitter data. Using text-as-data methods accessible to political scientists, we have categorized
and modeled a large dataset of candidate messages. Our results lay bare both the benefits and
difficulties associated with these methods. While they hold the promise of quantifying the
increasingly large amount of campaign-related text produced and distributed on the Internet, they
are by definition bound by the quality of their reference texts. Though our Tweetscores highlight
the unique utility of the nearly daily strategic messages now produced on Twitter by the vast
majority of members of Congress, we rely on major party platforms to generate our ideal point
estimates. This leaves our measure vulnerable to internal partisan fragmentation and the
potentially declining relevance of unifying texts like party platforms.
The combination of these tools and the still increasing volume of tweets produced by
congressional Twitter users19 holds substantial promise for those interested in unpacking the
causes and consequences of members’ public displays of ideology. Because of Twitter’s nearly
universal adoption among candidates, low-cost nature (both to produce and collect), and
temporal precision (each tweet is time stamped with the exact moment it is posted), candidate
tweets offer a uniquely comparable and insightful dataset. Our Tweetscores therefore may allow
researchers to more effectively examine longitudinal and event-specific changes in candidates’
self-presentation. Similarly, because challengers as well as incumbents are frequent producers of
tweets, Tweetscores may allow us to model the manner in which electoral opponents respond to
one another over the course of a campaign. This may allow us to more fully address issues such
as how incumbents respond to challengers who effectively appeal to a large portion of the
electorate. Previous studies of campaigns have often been limited by the incumbent-centric
nature of ideology measures such as NOMINATE scores. We find that in races with sufficient
candidate Twitter activity, some of this limitation may be moderated. Data of the kind we have
considered have immediate, direct applications for studies of, at a minimum, campaign effects, mass
communication, and vote choice. Beyond this, however, tools such as Wordscores are powerfully
agnostic and offer the ability to plot candidate tweets along any two-dimensional space (as
defined by the selected reference texts).
19
As part of another, related project, the authors have identified more than 800,000 tweets posted
by candidates for Congress during 2013 and 2014, a 60 percent increase over the time period we
consider here.
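To make concrete how the choice of reference texts defines the scale, the following is a minimal sketch of the basic Wordscores calculation described by Laver, Benoit, and Garry (2003). It is not the code used to produce our Tweetscores. The reference positions of -1 and 1, the toy "platform" texts, and the function names are all assumptions made for illustration, and the rescaling step of the original method is omitted.

```python
from collections import Counter


def relative_freqs(text):
    """Relative frequency of each word in a whitespace-tokenized text."""
    words = text.lower().split()
    counts = Counter(words)
    total = len(words)
    return {w: c / total for w, c in counts.items()}


def word_scores(reference_texts):
    """Score every word seen in the reference texts.

    `reference_texts` maps a label to (text, assumed position). A word's
    score is the average of the reference positions, weighted by the
    probability that a use of the word comes from each reference text.
    """
    freqs = {label: relative_freqs(text)
             for label, (text, _) in reference_texts.items()}
    vocab = set().union(*freqs.values())
    scores = {}
    for w in vocab:
        total = sum(f.get(w, 0.0) for f in freqs.values())
        scores[w] = sum(
            (freqs[label].get(w, 0.0) / total) * position
            for label, (_, position) in reference_texts.items()
        )
    return scores


def score_text(text, scores):
    """Mean word score over the scored words in a new ("virgin") text."""
    words = [w for w in text.lower().split() if w in scores]
    if not words:
        return None  # no overlap with the reference vocabulary
    return sum(scores[w] for w in words) / len(words)


# Toy reference texts with assumed positions (-1 = liberal, 1 = conservative).
refs = {
    "dem_platform": ("health care jobs", -1.0),
    "rep_platform": ("tax cuts jobs", 1.0),
}
scores = word_scores(refs)
tweet_score = score_text("tax cuts create jobs", scores)
```

Swapping in a different pair of reference texts changes every word score and therefore the dimension on which new texts are placed, which is the sense in which the method is agnostic about the underlying scale.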
Second, we also identify the increasingly strategic nature of congressional candidate
Twitter use. At nearly every step our results are consistent with a story of fully integrated,
strategic tweeting by members of all ages, experience, support, and partisanship. Indeed, our
most consistent finding is the influence of local electoral competition. In nearly all instances,
candidate Twitter activity displayed a distinct, expected sensitivity to electoral conditions. Not
only did we find that candidates altered the nature of their tweets when the campaign winds
shifted, but we also found that the group of candidates we would expect to be most skilled at
campaign rhetoric—incumbents, those who have a record of electoral success—appear as such in
our data.
Ultimately, therefore, while we argue that it is possible to accrue enduring insights from
modeling the use of Twitter among major party candidates for Congress, the road to doing so
remains uneven. We have identified an important tool researchers can call upon to make sense of
candidates’ tweets—there are many others available as well—but their use requires planning and
careful application. Thus, while we have used these tools to begin to identify the value of a
social-media-based measure of congressional ideology and, by extension, the role of electoral
incentives in Twitter use, many questions about what motivates candidates’ tweet behavior
remain unanswered. We look forward to beginning to address these questions in future iterations
of this project.
Bibliography
Amman, Sky. 2010. “A Political Campaign Message in 140 Characters or Less: The Use of
Twitter by U.S. Senate Candidates in 2010.” Social Science Research Network.
Arceneaux, Noah and Amy Schmitz Weiss. 2010. “Seems Stupid Until You try it: Press
Coverage of Twitter, 2006-09.” New Media and Society 12 (8): 1262-79.
Arnold, Douglas R. 1990. The Logic of Congressional Action. New Haven, CT: Yale University
Press.
Bimber, Bruce. 2003. Information and American Democracy: Technology in the Evolution of
Political Power. New York: Cambridge University Press.
Bimber, Bruce and Richard Davis. 2003. Campaigning Online: The Internet in US Elections.
New York: Oxford University Press.
Bode, Leticia, David S. Lassen, Young Mie Kim, Dhavan Shah, Erika Franklin Fowler, Travis
N. Ridout, and Michael Franz. 2011. “Social and Broadcast Media in 2010 Midterms:
The Expanding Repertoire of Senate Candidates’ Campaign Strategies.” Prepared for the
Annual Meeting of the American Political Science Association, Seattle, WA.
Bovitz, Gregory L. and Jamie L. Carson. 2006. “Position-taking and Electoral Accountability in
the US House of Representatives.” Political Research Quarterly 59 (2): 297-312.
Camia, Catalina. 2013. “No More Dead Deer: Grassley Changes What He Tweets.”
USAToday.com. http://www.usatoday.com/story/onpolitics/2013/03/05/chuck-grassleytwitter-personal/1965235/ (accessed August 21, 2014).
Canon, David T. and William T. Bianco. 2013. American Politics Today, 3rd ed. New York: W.
W. Norton.
Chadwick, Andrew. 2006. Internet Politics: States, Citizens, and New Communication
Technologies. New York: Oxford University Press.
Chi, Feng and Nathan Yang. 2010. “Twitter in Congress: Outreach vs. Transparency.”
http://mpra.ub.uni-muenchen.de/24060/1/MPRA_paper_24060.pdf (accessed August 21,
2014).
D’Alessio, Dave. 1997. “Use of the World Wide Web in the 1996 US Election.” Electoral
Studies 16 (4): 489-500.
--------------. 2000. “Adoption of the World Wide Web by American Political Candidates, 1996-1998.” Journal of Broadcasting & Electronic Media 44 (4): 556-68.
Davidson, C. Simon. 2014. “When is a Tweet an Ethics Violation?” RollCall.com.
http://blogs.rollcall.com/beltway-insiders/when-is-a-tweet-an-ethics-violation-a-questionof-ethics/?dcz= (accessed August 21, 2014).
Druckman, James N., Martin J. Kifer, and Michael Parkin. 2014. “US Congressional Campaign
Communications in an Internet Age.” Journal of Elections, Public Opinion & Parties 24
(1): 20-44.
Esterling, Kevin M., David M.J. Lazer, and Michael A. Neblo. 2005. “Home (Page) Style:
Determinates of the Quality of the House Members’ Web Sites.” International Journal of
Electronic Government Research 1 (2): 50-63.
--------------. 2011. “Representative Communication: Web Site Interactivity and Distributional
Path Dependence in the US Congress.” Political Communication 28 (4): 409-39.
--------------. 2013. “Connecting to Constituents: The Diffusion of Representation Practices
among Congressional Websites.” Political Research Quarterly 66 (1): 102-14.
Evans, C. Lawrence, and Walter J. Oleszek. “The Internet Institutional Change.” In James A.
Thurber and Colton C. Campbell, eds., Congress and the Internet. New York: Prentice
Hall.
Evans, Heather K., Victoria Cordova, Savannah Sipole. 2014. “Twitter Style: An Analysis of
How House Candidates Used Twitter in their 2012 Campaigns.” PS: Political Science &
Politics 47 (2): 454-62.
Fenno, Richard. 1978. Home Style: House Members in their Districts. New York: Longman.
--------------. 2000. Congress at the Grassroots: Representational Change in the South, 1970-1998. Chapel Hill, NC: University of North Carolina Press.
Fiorina, Morris P. 1974. Representatives, Roll Calls, and Constituencies. Lexington, MA:
Lexington Books.
Foot, Kirsten A. and Steven M. Schneider. 2006. Web Campaigning. Cambridge, MA: MIT
Press.
Fountain, Jane E. 2001. “Building the Virtual State.” Information Technology and Institutional
Change 61-82.
Gaver, William. 1991. “Technology Affordances.” Proceedings of the SIGCHI Conference on
Human Factors in Computing Systems 79-84.
Glassman, Matthew Eric, Jacob R. Straus, and Colleen Shogan. 2010. “Social Networking and
Constituent Communications: Member Use of Twitter During a Two-Month Period in the
111th Congress.” CRS Report for Congress.
Golbeck, Jennifer, Justin M. Grimes, and Anthony Rogers. 2010. “Twitter Use by the US
Congress.” Journal of the American Society for Information Science and Technology 61
(8): 1612-21.
Grimmer, Justin. 2013. Representational Style in Congress: What Legislators Say and why it
Matters. New York: Cambridge University Press.
Grimmer, Justin and Brandon M. Stewart. 2013. “Text as Data: The Promise and Pitfalls of
Automatic Content Analysis for Political Texts.” Political Analysis 21(3): 1–31.
Grimmer, Justin and Gary King. 2011. “General Purpose Computer-Assisted Clustering and
Conceptualization.” Proceedings of the National Academy of Sciences 108(7): 2643–
2650.
Gulati, Jeff. 2004. “Members of Congress and Presentation of Self on the World Wide Web.”
International Journal of Press/Politics 9 (1): 22-40.
Gulati, Jeff and Christine Williams. 2009. “Closing Gaps, Moving Hurdles: Candidate Web Site
Communication in the 2006 Campaigns for Congress.” In Costas Panagopoulos, ed.,
Politicking Online: The Transformation of Election Campaign Communications. New
Brunswick, NJ: Rutgers University Press.
Gulati, Jeff and Christine Williams. 2010. “Communicating with Constituents in 140 Characters
or Less: Twitter and the Diffusion of Technology Innovation in the United States
Congress.” Paper prepared for the Annual Meeting of the Midwest Political Science
Association, Chicago, IL.
Haber, Steven. 2011. “The 2010 US Senate Elections in Less than 140 Characters: How Senate
Candidates Use Twitter as a Campaign Tool.”
http://aladinrc.wrlc.org/bitstream/handle/1961/10028/Haber,%20Steven%20%20Spring%20'11.pdf?sequence=1 (accessed August 21, 2014).
Hall, Richard. 1996. Participation in Congress. New Haven, CT: Yale University Press.
Hemphill, Libby, Jahna Otterbacher, and Matthew Shapiro. 2013. “What’s Congress Doing on
Twitter?” Proceedings of the 2013 Conference on Computer Supported Cooperative
Work 877-86.
Highton, Benjamin and Michael S. Rocca. 2005. “Beyond the Roll-call Arena: The Determinants
of Position Taking in Congress.” Political Research Quarterly 58 (2): 303-16.
Hopkins, Daniel J. and Gary King. 2010. “A Method of Automated Nonparametric Content
Analysis for Social Science.” American Journal of Political Science 54(1): 229–247.
Jacobson, Gary. 1987. “The Marginals Never Vanished: Incumbency and Competition in
Elections to the US House of Representatives, 1952-82.” American Journal of Political
Science 126-41.
Johnson, Dennis. 2004. Congress Online: Bridging the Gap between Citizens and their
Representatives. New York: Routledge.
Jones, David R. 2003. “Position Taking and Position Avoidance in the US Senate.” Journal of
Politics 65 (3): 851-63.
Kingdon, John W. 1989. Congressmen’s Voting Decisions. Ann Arbor, MI: University of
Michigan Press.
Kwak, Haewoon, Changhyun Lee, Hosung Park, and Sue Moon. 2010. “What is Twitter, a
Social Network or a News Media?” Paper presented at the International World Wide
Web Conference, Raleigh, NC.
Lassen, David S. and Adam R. Brown. 2011. “Twitter: The Electoral Connection?” Social
Science Computer Review 29 (4): 419-36.
Lassen, David S. and Bode, Leticia. 2013. “Social Media Coming of Age: Developing Patterns
of Congressional Twitter Use 2007-2012.” Paper prepared for presentation at the Annual
Meeting of the American Political Science Association, August 29-September 1, 2013.
Laver, Michael and John Garry. 2000. “Estimating Policy Positions from Political Texts.”
American Journal of Political Science 44(3): 619–634.
Laver, Michael, Kenneth Benoit and John Garry. 2003. “Extracting Policy Positions from
Political Texts Using Words as Data.” American Political Science Review 97(2): 311–
331.
Lowe, Will. 2008. “Understanding Wordscores.” Political Analysis 16(4): 356–371.
Lipinski, Daniel, and Gregory Neddenriep. 2004. “Using ‘New’ Media to Get ‘Old’ Media
Coverage: How Members of Congress Utilize their Web Sites to Court Journalists.” The
Harvard International Journal of Press/Politics 9 (1): 7-21.
Livne, Avishay, Matthew P. Simmons, Eytan Adar, and Lada A. Adamic. 2011. “Networks and
Language in the 2010 Election.” Prepared for the 2011 Political Networks Conference,
Ann Arbor, MI.
Mayhew, David R. 1974. Congress: The Electoral Connection, New Haven, CT: Yale University
Press.
Meinke, Scott R. 2009. “Presentation of Partisanship: Constituency Connections and Partisan
Congressional Activity.” Social Science Quarterly 90 (4): 854-67.
Mirer, Michael L. and Leticia Bode. 2013. “Tweeting in Defeat: How Candidates Concede and
Claim Victory in 140 Characters.” New Media & Society 1-17.
Norris, Pippa. 2004. “Who Surfs? New Technology, Old Voters, and Virtual Democracy in U.S.
Elections 1992-2000.” In E.C. Kamarck, eds., Democracy.com. Washington, DC:
Brookings Institution.
North, Sterling. 2009. Young Thomas Edison. New York: Penguin.
Owen, Diana, Richard Davis, and Vincent James Strickler. 1999. “Congress and the Internet.”
International Journal of Press Politics 4 (2): 10-29.
Parmalee, John H. and Shannon L. Bichard. 2012. Politics and the Twitter Revolution: How
Tweets Influence the Relationship Between Political Leaders and the Public. Lanham,
MD: Lexington Books.
Pierson, Paul. 2000. “Increasing Returns, Path Dependence, and the Study of Politics.” American
Political Science Review 94 (2): 251-67.
Pitkin, Hanna. 1967. The Concept of Representation. Berkeley, CA: University of California
Press.
Putnam, Robert. 2008. “Affordances of Technology for Supporting Teaching and Learning.”
SMU.edu. http://centres.smu.edu.sg/cte/innovative-development/affordance-oftechnology/ (accessed August 21, 2014).
Samuelsohn, Darren. 2014. “Pols Have a #Fakefollower Problem.” Politico.com.
http://www.politico.com/story/2014/06/twitter-politicians-107672.html (accessed August
21, 2014).
Stromer-Galley, Jennifer. 2000. “On-line Interaction and Why Candidates Avoid It.” Journal of
Communication 50 (4): 111-32.
Toff, Benjamin J. and Young Mie Kim. 2014. “Words that Matter: Twitter and Partisan
Polarization.” Paper presented at the Annual Meeting of the International
Communications Association, Seattle, WA.
Tolbert, Caroline J. and Ramona S. McNeal. 2003. “Unraveling the Effects of the Internet on
Political Participation.” Political Research Quarterly 56 (2): 175-85.
United States House of Representatives. “The First Electronic Vote.” House.gov.
http://history.house.gov/HistoricalHighlight/Detail/37169?ret=True (accessed August 21,
2014).
United States Senate. “Voting Process.” Senate.gov
https://www.senate.gov/reference/Index/Votes.htm (accessed August 21, 2014).
Williams, Christine, Andrew Aylesworth, and Kenneth J. Chapman. 2002. “The 2000 E-Campaign for U.S. Senate.” Journal of Political Marketing 1 (4): 39-63.
Xenos, Michael and Kirsten A. Foot. 2005. “Politics as Usual, or Political Unusual? Position
Taking and Dialogue on Campaign Websites in the 2002 US Elections.” Journal of
Communication 55 (1): 169-85.
Table One
Summary Statistics

Feature                  Tweets    Floor Speeches    Roll Call Votes
Documents (mean)         491.5     29.9              1189.7
Documents (median)       334                         1393.5
Scored words (mean)      2236      3104
Scored words (median)    1582      1083
Figure One
Wordscores Procedure Illustrated
Figure Two
Distribution of Ideology Scores by Communication Method
[Figure: density plots by party (Republicans, Democrats). Panels: Twitter Feeds, Congressional Record. Axes: Lengths (-4 to 4) and Density (0.0 to 0.4).]
Figure Three
Comparing Tweetscores and NOMINATE Scores by Tweet Volume
Table Two
Ideology Score Correlations

               Tweetscore   CR Score   NOMINATE   DIME    Nat Journal   Shor    ADA
CR Score       0.47
NOMINATE       0.63         0.43
DIME           0.53         0.49       0.94
Nat Journal    0.61         0.48       0.94       0.89
Shor           0.53         0.52       0.86       0.87    0.86
ADA            -0.58        -0.42      -0.91      -0.90   -0.91         -0.86
ACU            0.62         0.45       0.97       0.94    0.95          0.88    -0.95
Figure Four
Distribution of Tweetscores by Election Context and Outcome
[Figure: two panels (Winners, Losers), with points grouped as Republicans, Democrats, and Other. Axes: Wordscore of Twitter Feeds (-4 to 4) and PVI of District/State (-40 to 40).]
Figure Five
Distribution of CR Scores by Election Context
[Figure: two panels (Candidates, Retirees), with points grouped as Republicans, Democrats, and Other. Axes: Wordscore of Congressional Record (-4 to 4) and PVI of District/State (-40 to 40).]
Table Three
Modeling Tweetscores and CR Scores

                                                                            Distance from Party Mean
                                          Tweetscore       CR Score         Tweetscore       CR Score
District/State Covariates
District/State Competitiveness (PVI)      0.02*** (0.00)   0.02** (0.01)    0.02*** (0.00)   0.02** (0.01)
Percent Nonwhite                          0.75* (0.36)     0.88 (0.45)      0.46 (0.30)      0.05 (0.43)
Median Age                                0.00 (0.01)      -0.03 (0.02)     -0.01 (0.01)     -0.03* (0.02)
Percent with Bachelor's Degree            1.52* (0.70)     1.43 (0.86)      0.19 (0.59)      0.02 (0.78)
Median Income (logged)                    -0.63* (0.28)    -0.36 (0.34)     -0.13 (0.24)     0.24 (0.31)
Candidate Covariates
Party (1=Rep, -1=Dem)                     0.66*** (0.04)   0.41*** (0.08)   -0.47*** (0.04)  -0.63*** (0.07)
Party X PVI                                                                 0.01*** (0.00)   0.01 (0.01)
Senator (1=Yes)                                                             -0.10 (0.14)
Incumbent (1=Yes)                                                           -0.18 (0.10)
Campaign Spending (logged)                                                  -0.07* (0.03)
Age (in increments of 10)                                                   0.10** (0.04)
Years in Congress (in increments of 10)                                     0.01 (0.05)
Education (0=High School, 6=Doctorate)                                      0.02 (0.02)
Nonwhite (1=Yes)                                                            0.00 (0.11)
Female (1=Yes)                                                              -0.06 (0.08)
Unopposed (1=Yes)                                                           0.02 (0.24)
# of Tweets (logged)                                                        -0.04 (0.04)
(Intercept)                               6.71 (2.93)      1.68 (3.61)      2.40 (2.46)      -1.19 (3.50)
N                                         693              363              689              363
R-Squared                                 0.36             0.37             0.29             0.30
Adj. R-Squared                            0.35             0.34             0.27             0.27
***p<0.001 **p<0.01 *p<0.05. Results from OLS models.

[Remaining candidate-covariate estimates (rows Senator through # of Tweets) for the other three models, in source order; the extracted layout does not show which rows they belong to.]
Tweetscore and CR Score models: -0.19 (0.16), -0.04 (0.11), -0.10** (0.03), 0.09* (0.04), 0.05 (0.06), 0.05 (0.03), 0.27* (0.13), -0.31** (0.10), 0.51* (0.25), 1.50*** (0.32), 0.03 (0.08), 0.05 (0.06), 0.15* (0.07), 0.06 (0.04), 0.02 (0.18), -0.49*** (0.13)
Distance from Party Mean, CR Score model: -0.06 (0.22), 0.25 (0.29), 0.02 (0.08), 0.04 (0.06), 0.02 (0.06), 0.02 (0.03), 0.01 (0.16), 0.00 (0.12), 0.00 (0.27)
Table Four
Members’ Tweetscore Extremity

                                          TweetScore Extremity
District/State Covariates
Obama 2012 Vote %                         -1.45*** (0.55)
Percent Nonwhite                          -0.02 (0.40)
Median Age                                -0.03* (0.01)
Percent with Bachelor's Degree            -0.18 (0.71)
Median Income (logged)                    0.17 (0.28)
Candidate Covariates
NOMINATE Score                            0.64*** (0.49)
Ideology X Obama 2012                     -1.94*** (0.91)
Party (1=Rep, -1=Dem)                     -0.01 (0.08)
Senator (1=Yes)                           0.11 (0.22)
Incumbent (1=Yes)                         -0.15 (0.30)
Campaign Spending (logged)                0.06 (0.08)
Age (in increments of 10)                 -0.00 (0.00)
Education (0=High School, 6=Doctorate)    0.04 (0.03)
Nonwhite (1=Yes)                          -0.05 (0.15)
Female (1=Yes)                            0.19** (0.08)
# of Tweets (logged)                      -0.11*** (0.03)
(Intercept)                               6.16 (8.95)
N                                         347
Adj. R-Squared                            0.08