Elite Ideology Across Media: Constructing a Measure of Congressional Candidates’ Ideological Self-Presentation on Social Media

David S. Lassen
Benjamin J. Toff

March 10, 2015

Draft prepared for discussion at the American Politics Workshop at UW-Madison
Please do not cite without authors’ permission

Introduction

Recently there has been renewed interest among political scientists in the messages circulated by elected officials, especially members of Congress. Joining technical advances in collection and analysis techniques with a significant body of previous studies showing the informative potential in elite communication, scholars have argued that members’ “representational style” is a crucial, yet often overlooked element of concepts such as representation and mass political discourse. By influencing media content, public opinion, and the broader information environment, elected officials’ words help create the political world in which we live. As Grimmer and Stewart note, “language is the medium for politics and political conflict,” a tool elected officials frequently use to advance their interests (2013, 1).

This attention has implications for how we conceptualize and measure elite ideology. In its purest form, ideology is often defined as “a latent set of values that organizes personal political attitudes” (Burden, Caldeira, and Groseclose 2000). Recognizing that such a concept is impossible to confidently measure in strategic actors, however, many scholars have noted that in its observable form, congressional ideology is influenced by the context in which elite behavior occurs, a space carved out by partisan and institutional forces. Roll call measures of ideology such as NOMINATE (Poole and Rosenthal 1997), for example, permit a member to express their position on gun control only in support of or opposition to legislative language they may have had little control over.
Some have therefore created alternate measures of members’ philosophies rooted in non-voting data such as campaign contributions (Bonica DATE), members’ responses to surveys (e.g., Bianco 1994, Erikson and Wright 1996), and relevant media coverage (Hill, Hanna, and Shafqat 1997). Such efforts may provide only minimal benefits over established action-based scores (e.g., Burden, Caldeira, and Groseclose 2000; Bishin 2003). By contrast, we argue that the increasing volume and sophistication of language members today distribute via social media may represent valuable material for a new measure of congressional ideology. Using existing text-analytic methods and Twitter messages posted by members of and candidates for the 113th Congress, we create and examine an indicator (which we refer to as a “Tweetscore”) of the manner in which members voluntarily present themselves to the public in a space relatively unencumbered by the congressional docket, scholars’ inquiries, or party officials. Departing from previous studies and scores that sought to differentiate themselves from NOMINATE, the focus of our paper is not that our measure captures a new dimension of ideology. Indeed, we are reassured by evidence that our tweet-based measure correlates well with and appears to be motivated in the same manner as NOMINATE. Instead, we contend that because of its unusually constant and increasingly institutionalized production, member social media content may provide an opportunity to examine heretofore elusive aspects of operative congressional ideology (see Rohde 1991), including individual longitudinal change and sensitivity to exogenous shocks such as national security crises.

Measuring Elite Ideology

Existing measures of congressional ideology have generally adopted one of four empirical approaches. The first and most common centers on members’ roll call votes (e.g., Poole and Rosenthal 1997, Heckman and Snyder 1997, Snyder and Groseclose 1997).
NOMINATE is by far the most widely used of this group. Based on the votes cast in all nonunanimous roll calls,1 NOMINATE draws on a much larger dataset than most other measures. The NOMINATE score for the average legislator in our data was constructed using almost 1200 distinct votes. Yet for most members these votes center on legislative language they had little direct influence on. Roll call based measures of member ideology may therefore be best suited to distinguish between partisans of different stripes. In other words, ideology in this case is conceptualized as low dimensional scaling, differentiating members along a common spectrum (Noel 2013).

1 Poole and Rosenthal (1997) include all votes in which more than 2.5 percent of legislators disagreed.

Another category of scores relies on organizational evaluations of member behavior by interest groups such as the American Conservative Union (ACU) and Americans for Democratic Action (ADA). Though common in the literature, these scores have been frequently criticized for the small, potentially ideologically motivated sample of votes they rely on (e.g., Jackson and Kingdon 1992, Cox and McCubbins 1993, Krehbiel 1994, Box-Steffensmeier and Franklin 1995). Because of the nature of their data, ACU and ADA scores are often seen as insufficiently granular (many members receive the same score) and artificially extreme.2

Other scores leverage potentially costly candidate evaluations made by individual observers. Some studies, for example, have based their scores on media descriptions and analyses of candidate behavior (e.g., Clinton, Jackman, and Rivers 2004). These measures reflect the manner in which members’ campaign and legislative activity are presented in major media outlets. These scores assume that journalists possess a unique blend of skills and incentives to identify, collect, and present key evidence of officials’ worldviews, especially during an election campaign (Hill, Hanna, and Shafqat 1997).
Reporters who fail to report important stories or misstate key facts may face professional sanction. Broad comparisons of these media-based measures suggest, however, that there are few key differences between them and action-based scores such as NOMINATE (Burden, Caldeira, and Groseclose 2000).

2 Some have attempted to redeem interest group ratings in some form (e.g., Levitt 1996; Groseclose, Levitt, and Snyder 1999; Bishin 2003; Anderson and Habel 2009; Chand, Schreckhise, and Parry 2014), but they have largely fallen out of use.

More recently, Bonica (2013) has modeled ideology on patterns of campaign contributions. Given that their financial support likely increases the probability of a candidate's victory, contributors have a clear incentive to identify candidates whose ideology is similar to their own. Unlike many other scores, the resultant Database on Ideology, Money in Politics, and Elections (DIME) includes ideal point estimates for both incumbents and challengers.

Finally, some scholars adopt what may be referred to as a "sociological approach." These studies attempt to directly infer members' preferences either through elite opinion surveys or demographics (e.g., Miller and Stokes 1963, Kingdon 1988, Bianco 1994, Erikson and Wright 1996, Bishin 2003, Shor 2012). Direct approaches of this type have a clear, intuitive appeal, mirroring mass opinion surveys. Yet they are also extremely resource intensive and inflexible. Scores created in this style are either unlikely or unable to be updated to reflect potential changes in members' value systems.

Measures arising from each empirical category thus offer distinct strengths and weaknesses as empirical tools. Though most appear to represent a similar underlying construct, some are more readily generated (roll call measures), widely applicable (campaign contribution measures), or meaningful for a democratically important relationship (media-based measures).
At the same time, these measures often rely on infrequent, externally constrained behaviors that may not fully capture members' representational intentions. We argue that our social media measure of elite ideology carries with it many of these strengths but does not suffer from similar weaknesses.

Congress and Twitter

Social media use among members of Congress has taken time to develop. Members have long been wary of new technology.3 Electorally focused, members are often hesitant to change previous patterns of constituent interaction,4 patterns that have been part of successful campaigns (Wood 1974). As a technology becomes more widely used among the public, however, many members of Congress do eventually embrace it and, over time, develop relatively stable patterns of use. The well-studied case of congressional campaign websites is instructive in this regard. Websites were largely viewed as an inconsequential novelty when the first official congressional sites launched in 1993 (Owen et al. 1999, Bimber and Davis 2003), and early adopters were generally young, inexperienced members looking to cement their place in the institution and among their constituents (D’Alessio 1997, 2000). Over time as online activity among the public rose, more tenured members also created websites. By 2004, approximately 90 percent of major party candidates for Congress had campaign websites (Foot and Schneider 2006). In 2002 one observer asserted that “the question is no longer whether candidates for major office will have a website, but what the website will look like and how it will be used” (Williams, Aylesworth, and Chapman 2002, 43).

3 A telling event occurred in 1868 when a young Thomas Edison offered the House a reliable system for electronically recording floor votes (North 2009). He was flatly rejected, with one member purportedly replying “young man, that won’t do at all. That is just what we do not want” (North 2009, 64, emphasis in original). It would be more than 100 years before the House electronically recorded an official vote (United States House of Representatives). The Senate continues to require all votes be recorded by hand (United States Senate).

4 In 1997 Pat Leahy quipped that many members of Congress “wouldn’t even know how to turn on a computer if they had to. They think it’s a not working television that won’t give you CNN” (Johnson 2004, 98).

More importantly for our purposes, during this time, candidates’ website content became more similar as well. Perhaps recognizing the potential to connect with politically active voters (Tolbert and McNeal 2003; Norris 2004), activists (Foot and Schneider 2006), and journalists (Lipinski and Neddenriep 2004) through the same medium, “certain content and functionality or tools [became] standard features on [congressional] sites” (Gulati and Williams 2009). This is not to suggest that all campaign websites are interchangeable; indeed, there is evidence of significant variation by the gender (Gulati 2004) and geographic location (Esterling, Lazer, and Neblo 2013) of the candidate-owner. Instead, this literature indicates that major party candidates’ website designs and content reflect generally predictable, stable responses to measurable incentives (Druckman, Kifer, and Parkin 2014; Esterling, Lazer, and Neblo 2005). Congressional website design has thus exhibited significant path dependence, developing norms among incumbents that “may be taken for granted as design requirements for legislative websites” (Esterling, Lazer, and Neblo 2011; see also Bimber 2003; Chadwick 2006; Fountain 2001; Xenos and Foot 2005). The experience of congressional candidates’ website development provides a rare illustration of elites incorporating a new communication technology into existing representational practices and thereby gives us a set of expectations for the language candidates distribute via Twitter.
Though candidate websites and Twitter feeds differ in many ways, they share many of the same affordances.5 Each offers the opportunity to disseminate information to a wide population, distribute multimedia content, develop a community of supporters, and direct visitors to volunteer and donation opportunities. Even the most significant difference, Twitter’s restriction of 140 characters per post, is mitigated by candidates’ extensive use of hyperlinks to connect tweet readers with longer form content or online services.

5 Putnam (2008) defines technology affordances as “the ways in which technology offers or supports certain things” (see also Gaver 1991).

In many ways elite use of Twitter has followed the same developmental arc as campaign websites. Congressional elites began tweeting during the 2008 campaign season. Early adopters were again young newcomers to Congress, though party leaders were soon advising all members how to best communicate with voters on social media (Lassen and Brown 2011). The content of early tweets varied widely, was mostly personal for some members, and at times was somewhat bizarre.6 It was simple for observers to dismiss congressional Twitter use as an idiosyncratic hobby. Like campaign websites, however, Twitter content has become an increasingly common, professionalized element of candidate self-presentation efforts. By the end of the 2012 election nearly 90 percent of all major party candidates were active Twitter users and were producing a steady stream of 100,000 to 200,000 tweets annually. Though there has been little systematic research on the evolution of the content of candidate tweets during this period, anecdotal evidence suggests that congressional elites in 2012 had become more sophisticated in their social media presence.
Candidates in this election were, for example, more likely to tweet about campaign (not personal) issues in a manner consciously integrated into larger, electorally motivated communication strategies than in previous elections (Hemphill et al. 2013, Lassen and Bode 2013, Evans et al. 2014).7

6 Senator Charles Grassley’s early tweets, for example, included phrases such as “We still on skedul/even workinWKEND” and “hit a deer on hiway 136 … Assume deer dead” (see Malone 2012).

7 The development of Senator Charles Grassley’s approach to Twitter is an intriguingly open example (Camia 2013).

The widespread expansion of Twitter use among even older, more experienced members of Congress is unsurprising when one considers the promises of the service as a communication tool. Twitter allows individual members to create a dramatically inexpensive, direct communication space largely unencumbered by partisan or institutional pressures. On Twitter members need not wait for relevant legislation, hope for a journalist's favorable pen, or pay for access to a given media market to express their position on a subject. Similarly, producing one or even a dozen tweets may be far less resource intensive for a campaign than crafting and distributing longer form newsletters or press releases, especially given the potential for members' tweets to be subsequently shared with new, larger audiences by citizens, journalists, and other elites (Kwak, Lee, Park, and Moon 2010; Arceneaux and Weiss 2010). Tweets can form the beginning of longer form attention and discussion. This is true even accounting for funding and staffing disparities among members and candidates. In important ways, Twitter has become a new kind of wire service, a place where all congressional actors may briefly but meaningfully comment on issues and events they may or may not elaborate on elsewhere.
Candidates of all stripes thus have a strong normative and practical incentive to frequently and directly communicate with potential supporters using a nearly universally known, low-cost platform such as Twitter (Pitkin 1967, Mayhew 1974).

Elite social media use is also attractive for scholars. Similar to websites, the extensiveness of candidate Twitter use facilitates the creation of datasets of directly comparable messages, even among challengers who fail to attract much attention or support (see Druckman, Kifer, and Parkin 2014). The development of recent computational tools further suggests that such data may facilitate more direct and nuanced ideological comparisons of elites. Unlike many websites, for example, members’ Twitter feeds are often updated on a near daily basis, producing a continuous body of time stamped content that facilitates longitudinal examinations. We therefore consider congressional Twitter use not with the assumption that tweets necessarily exert a significant influence on voters (although this potential has yet to be adequately examined), but that they may provide a nearly universal indicator of broader campaign strategy and representational style (Meinke 2009, Grimmer 2013).

Hypotheses

We therefore expect that a social media based measure of elite ideology would be very useful for students of Congress. In the remainder of this paper we begin to examine a potential score of this type. Existing evidence suggests to us that such a score will possess the following characteristics:

H1 (Ideological Distinctiveness): The Twitter content produced by members of Congress during 2011 and 2012 included significant, identifiable ideological language consistent with members’ partisan identification. In other words, we expect Republicans’ tweets to more closely resemble each other than those produced by otherwise similar Democrats.
H2 (Competition Constraints): As a result of candidates’ desire to win elections, individuals facing a more challenging or competitive election will be more likely to use their tweets to present themselves as ideologically appealing and distinct from other candidates, even within their own party.

H3 (Universal Adoption): Because of Twitter’s low cost, accessible nature, we expect that there will be few systematic differences between the ideological language posted by otherwise similar incumbents and challengers. Unlike most other ideology measures, including those proposed by Bonica (2013), our social media based measure puts incumbents and challengers on nearly equal footing, allowing all candidates equal access to produce relevant data.

Positioning Texts in Ideological Space

To construct our social media based measure of elite expressions of ideology, we first examined every tweet posted by a major party general election candidate for the United States Congress from January 1, 2011 to election day 2012. To compile this dataset, we identified all Twitter handles (both campaign and congressional, where applicable)8 associated with all relevant candidates and then collected all tweets from the accounts during our time period using Twitter’s REST API.9 Our final sample includes 1,269 separate Twitter feeds. Since both official congressional office and campaign handles were tracked using the API, where multiple feeds were associated with a single member or candidate, feeds were combined and treated in the analysis that follows as a single unit.10 Combining feeds by candidate/member produced a total sample of 881 unique users and approximately 500,000 tweets. We then employed an established text-as-data computational method to extract comparable ideology scores from each candidate’s combined feed.
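The feed-combining step described above can be illustrated with a minimal sketch. It is not the authors' code: the function name `combine_feeds`, the handle roster, and the example tweets are all hypothetical, and the sketch assumes tweets have already been collected as (handle, text) pairs.

```python
from collections import defaultdict


def combine_feeds(tweets, handle_to_candidate):
    """Merge tweets from every handle belonging to the same candidate.

    tweets: iterable of (handle, text) pairs (assumed input format).
    handle_to_candidate: dict mapping lower-cased handles to a candidate id.
    Returns a dict mapping each candidate id to a list of tweet texts.
    """
    combined = defaultdict(list)
    for handle, text in tweets:
        candidate = handle_to_candidate.get(handle.lower())
        if candidate is None:
            continue  # handle is not on the candidate roster, so skip it
        combined[candidate].append(text)
    return dict(combined)


# Hypothetical roster: one member with an official and a campaign handle,
# and one challenger with a campaign handle only.
handle_to_candidate = {
    "repsmith": "Smith",
    "smithforcongress": "Smith",
    "jonesforsenate": "Jones",
}
tweets = [
    ("RepSmith", "Voted today on the budget"),
    ("SmithForCongress", "Join us Saturday"),
    ("JonesForSenate", "Thank you Ohio"),
    ("unrelated", "hello"),
]
feeds = combine_feeds(tweets, handle_to_candidate)
```

Treating the candidate rather than the handle as the unit of analysis is what reduces the 1,269 feeds to 881 users in the paper; the sketch simply makes that grouping explicit.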
Our method involves both systematic human examination of the words used by the feed owners and computer-aided techniques.11 As we show below, our method offers scholars of 21st century political communication great potential for examining hypotheses previously difficult to test in a generalizable manner.12

8 Congressional rules require members to separate their Twitter content according to its purpose. In other words, any tweet not directly related to the member’s official business as a legislator is required to originate from an identifiably separate account (Davidson 2014).

9 For more information on all aspects of the Twitter API, visit: https://dev.twitter.com/docs/api.

10 The authors acknowledge that treating campaign feeds and feeds administered as part of official congressional duties as a single unit may be problematic. However, combining feeds was often necessary as some handles produced too few tweets for inclusion in the analysis.

11 See Grimmer and Stewart (2013) and Hopkins and King (2010) for a fuller discussion of the various pros and cons of different computer-aided techniques for analyzing text-as-data.

For comparison purposes we also used the same method to construct scores from the text of all one minute floor speeches given by members during the 112th Congress.13 Member language on the floor of Congress provides a useful contrast to Twitter content because while such speeches possess some of social media’s beneficial characteristics (e.g., one minute speeches are not necessarily bound by current legislation (Maltzman and Sigelman 1996)), they are still heavily influenced by party leaders who recruit and provide talking points to members for as many as four in ten one minute speeches (Harris 2005). Similarly, many floor speeches are brief and appear sensitive to constituents’ views (Kringer and Shen 2014).
On this point, Shogan and coauthors (2013) find that only 20 percent of one minute speeches include more than one policy-related statistic, such as the rate of Americans who lack employment or health insurance.

Automated methods offer both practical and theoretical benefits over human coding of subsamples. Given the sheer unwieldy size of the data, the efficiency savings to researchers are self-evident. In addition, the polysemic, subjective nature of political content makes reliance on a team of human coders who will almost certainly differ in interpreting the content and ideological character embedded in tweets problematic. Instead, while the methods employed in this paper may fail to capture some nuances of language use, by using systematic, objective measures, we may limit bias associated with researchers' own subjective impressions and aid in ensuring that our findings may be replicated with other data.

12 Pre-processing of the text followed the methods in Toff and Kim (2014) and utilized the same Python code. The steps included the removal of English language “stop-words,” punctuation, numerals, hyperlinks, and at-mentions. Frequently used abbreviations on Twitter were replaced with their more universal cognates (e.g., “natl” for “national” or “cont” for “continue”), and approximately 200 politically salient n-grams were created so as to distinguish between, for instance, references to “social security” and other uses of the words “social” and “security.” This list of political n-grams was adopted from the index of an introductory American politics textbook (Canon and Bianco 2013).

13 Our thanks to Brad Jones for sharing this data with us.

The specific text-as-data tool we employ is a "bag of words" approach known as Wordscores, developed by Laver and Garry (2000). This automated method situates documents (or in this case, Twitter feeds) along a one-dimensional ideological continuum according to the frequency with which partisan words are employed.
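The pre-processing steps listed in footnote 12 can be sketched as follows. This is an illustration under stated assumptions, not the Toff and Kim (2014) code: the stop-word set, abbreviation table, and n-gram table are tiny hypothetical stand-ins for the real lists, and the order of operations (n-grams merged before stop-words are dropped) is our own choice.

```python
import re

# Tiny illustrative lists; the paper's actual lists are much longer.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "for", "on", "is", "we", "our"}
ABBREVIATIONS = {"natl": "national", "cont": "continue"}
NGRAMS = {("social", "security"): "social_security"}

URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
NON_LETTER_RE = re.compile(r"[^a-z\s]")


def preprocess(tweet):
    """Return the list of cleaned tokens for one tweet."""
    text = tweet.lower()
    text = URL_RE.sub(" ", text)         # remove hyperlinks
    text = MENTION_RE.sub(" ", text)     # remove at-mentions
    text = NON_LETTER_RE.sub(" ", text)  # remove punctuation and numerals
    tokens = [ABBREVIATIONS.get(tok, tok) for tok in text.split()]

    # Merge politically salient bigrams into single tokens.
    merged = []
    i = 0
    while i < len(tokens):
        pair = tuple(tokens[i:i + 2])
        if pair in NGRAMS:
            merged.append(NGRAMS[pair])
            i += 2
        else:
            merged.append(tokens[i])
            i += 1

    return [tok for tok in merged if tok not in STOP_WORDS]


example = preprocess("Protect Social Security! @SenX natl plan 2012 http://t.co/abc")
```

Merging n-grams before removing stop-words keeps multiword terms intact when one of their parts would otherwise be filtered out; a different ordering is equally defensible if the n-gram list contains no stop-words.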
The advantage of this method over similar dictionary-based methods is that it requires fewer theoretical assumptions. Rather than assigning ideologies to words based upon a dictionary or detailed codebook of a priori partisan language, the Wordscores procedure generates a list of scored words based on assigned reference texts of known position in ideological space. These scored words, and only the words that appear in the assigned reference texts, are then used to assess documents of unknown partisanship.

As in Toff and Kim (2014), which uses Wordscores to examine Twitter messages disseminated by party leaders and partisan media, this paper also utilizes the national Democratic and Republican party platforms from each party’s 2012 nominating conventions as reference texts. Platforms, although relatively obscure documents, are particularly useful for this enterprise in that they capture each party’s official positions on the precise issues they chose to highlight during the 2012 election year. Each document was assigned a position in ideological space (1 = Republican platform, -1 = Democratic platform), and resulting Wordscores are interpretable in relation to these reference scores assigned by the researchers.14 The set of unique words that appear in each platform, weighted by their frequencies, is used to construct the list of scored words with which the Wordscores procedure assigns average scores to our sample of Twitter users. The number of documents and scored words for both our Twitter and floor speech samples are presented in Table One.

14 The precise numbers should not be read as absolute ideological scores but as reflective of a single cross-section in time.
[Insert Table One About Here]

Wordscores involves four basic computational steps: (1) selection of a reference set of documents with known positions in an ideological space; (2) scoring of the unique words that appear in the reference documents weighted according to their frequency; (3) scoring of a new set of texts of unknown position in the same ideological space; and (4) transformation of those numerical scores according to a stock formula that accounts for differences in the variance of word usage in the reference and scored texts.15 The four steps are summarized in Figure One.16

[Insert Figure One About Here]

We refer to the member-specific scores that result from applying Wordscores to member Twitter feeds in this manner as Tweetscores. This measure captures the extent to which members’ self presentation on Twitter is consistent with the general ideological language used by their party. The distributions of Tweetscores (and the related Wordscores-based measure for our floor speech texts) for Democrats and Republicans are plotted in Figure Two. A clear divide is evident between members of each party, and indeed estimates derived from both types of sampled text are significantly correlated with DW-NOMINATE scores, although Tweetscores are comparatively more so.

15 This final step allows scholars to more easily compare texts of unknown ideological position to the positions of the reference documents used in deriving the scores. However, in that the transformation utilizes the variances of the entire set of scored documents, comparisons between scores not derived simultaneously are impossible (Laver, Benoit and Garry 2003; Lowe 2008), so caution must be applied when assessing differences in the magnitude of scores between samples estimated separately.

16 For more information about Wordscores, please consult the website of its authors. See http://www.tcd.ie/Political Science/wordscores/.
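The four steps can be made concrete with a short sketch of the Wordscores procedure as it is generally described in Laver, Benoit and Garry (2003). It is an illustration rather than the authors' implementation: the function names and toy texts are hypothetical, and the use of the population standard deviation in the rescaling step is an assumption on our part.

```python
from collections import Counter
from statistics import mean, pstdev


def score_words(reference_tokens, reference_scores):
    """Step 2: score each word found in the reference texts.

    A word's score is the average of the reference positions, weighted by
    the probability of reading each reference text given that word.
    """
    rel_freq = {}
    for name, tokens in reference_tokens.items():
        counts = Counter(tokens)
        total = sum(counts.values())
        rel_freq[name] = {word: n / total for word, n in counts.items()}

    vocabulary = set().union(*rel_freq.values())
    word_scores = {}
    for word in vocabulary:
        denom = sum(freqs.get(word, 0.0) for freqs in rel_freq.values())
        word_scores[word] = sum(
            freqs.get(word, 0.0) / denom * reference_scores[name]
            for name, freqs in rel_freq.items()
        )
    return word_scores


def score_text(tokens, word_scores):
    """Step 3: frequency-weighted mean score of the scored words in a text."""
    counts = Counter(tok for tok in tokens if tok in word_scores)
    total = sum(counts.values())
    if total == 0:
        return None  # the text shares no words with the reference texts
    return sum(n / total * word_scores[word] for word, n in counts.items())


def rescale(raw_scores, reference_scores):
    """Step 4: stretch raw scores to the dispersion of the reference scores."""
    values = list(raw_scores.values())
    centre = mean(values)
    sd_raw = pstdev(values)
    sd_ref = pstdev(list(reference_scores.values()))
    if sd_raw == 0:
        return dict(raw_scores)  # nothing to stretch
    return {k: (v - centre) * sd_ref / sd_raw + centre for k, v in raw_scores.items()}


# Step 1: hypothetical reference texts with assigned positions (1 and -1).
reference_scores = {"rep": 1.0, "dem": -1.0}
reference_tokens = {
    "rep": ["tax", "tax", "freedom", "jobs"],
    "dem": ["health", "care", "jobs", "tax"],
}
word_scores = score_words(reference_tokens, reference_scores)

feeds = {"A": ["tax", "freedom", "hello"], "B": ["health", "jobs"]}
raw = {name: score_text(tokens, word_scores) for name, tokens in feeds.items()}
raw = {name: score for name, score in raw.items() if score is not None}
rescaled = rescale(raw, reference_scores)
```

Because the rescaling uses the mean and dispersion of the whole batch of scored texts, rescaled values depend on which feeds are scored together, which is the reason for the caution about comparing separately estimated samples.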
Among the subset that includes incumbents whose NOMINATE scores are available (N=483), the correlation is a healthy 0.65, and 0.69 when the sample is further reduced to just those in competitive elections.17 Among incumbent senators, the correlation is 0.75. While some of the difference in correlations may be attributable to differences in messaging styles (senators might be expected to devote more resources to maintaining a professional-style Twitter feed that more accurately captures their own political self-presentation), other differences in the correspondence between NOMINATE scores and Tweetscores likely reflect the precision of the estimates. As Figure Three demonstrates, the correlation between NOMINATE scores and Tweetscores increases steadily as we restrict our sample to only the most prolific tweeters in Congress, in other words, as we increase the volume of language available to score each feed.

17 Defined as the 18 states and 81 House districts the New York Times identified as "in play" in the 2012 election. See http://elections.nytimes.com/2012/ratings/house/.

[Insert Figure Two About Here]

[Insert Figure Three About Here]

More broadly, we also compare Tweetscores to other measures of congressional ideology. Table Two reports the correlation between our measure and at least one measure from each of the empirical approaches to ideology scoring noted earlier. Consistent with Hypothesis 1, in each case Tweetscores present comparable and significant results. Reassuringly, Tweetscores correlate most strongly with the most widely used score presented here: NOMINATE. Note also that Tweetscores correlate more strongly with each measure than our floor speech-based score (CR Score) does with any other version. At the same time, however, it is not surprising that the Tweetscore correlation is somewhat weaker than that evidenced by comparisons of existing measures.
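The logic behind Figure Three, recomputing the Tweetscore and NOMINATE correlation on progressively more prolific tweeters, can be sketched as below. The record layout (`n_tweets`, `tweetscore`, `nominate`), the thresholds, and every number in the toy data are hypothetical and are not drawn from the paper's sample.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")  # correlation is undefined without variation
    return cov / sqrt(var_x * var_y)


def correlation_by_volume(members, thresholds, min_members=3):
    """Correlate Tweetscores with NOMINATE among members above each tweet count."""
    results = {}
    for threshold in thresholds:
        subset = [m for m in members if m["n_tweets"] >= threshold]
        if len(subset) < min_members:
            continue  # too few members left for a meaningful estimate
        results[threshold] = pearson(
            [m["tweetscore"] for m in subset],
            [m["nominate"] for m in subset],
        )
    return results


# Invented records for illustration only.
members = [
    {"n_tweets": 50, "tweetscore": 0.9, "nominate": -0.4},
    {"n_tweets": 200, "tweetscore": -0.5, "nominate": -0.3},
    {"n_tweets": 400, "tweetscore": -0.8, "nominate": -0.5},
    {"n_tweets": 800, "tweetscore": 0.7, "nominate": 0.6},
    {"n_tweets": 1200, "tweetscore": 0.4, "nominate": 0.4},
]
results = correlation_by_volume(members, thresholds=[0, 100, 1000])
```

In the toy data the low-volume member has a noisy score, so dropping that record raises the correlation; this mirrors the precision argument in the text but says nothing about the magnitudes in the actual sample.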
The Twitter data on which Tweetscores are based are unusually expansive and unconstrained, allowing members and challengers to construct an ideological profile that, though it may depart at times from some members’ operative institutional choices, represents a truer portrait of the member’s desired ideological profile.

[Insert Table Two About Here]

As with NOMINATE scores that measure incumbent ideology through their roll call voting behaviors, Tweetscores also appear closely related to the ideology of candidates' home districts or states. In Figure Four, Tweetscore estimates are plotted alongside Cook Partisan Voting Index (PVI) scores for the candidates' home district or state. Among winners of the election, a clear positive relationship is evident, with candidates in more conservative environments using significantly more conservative language in their Twitter feeds. Among losers in the election, however, the relationship is reversed, as losing candidates appear markedly misaligned ideologically with voters in their district or state. We find a similar but somewhat weaker pattern (presented in Figure Five) among member language expressed by incumbent (i.e., non-retiring) members during floor speeches. Results in the left hand panel of Figure Five include members who were defeated during a primary, including those whose district was substantially altered for the 113th Congress. No similar relationship appears when considering the ideological nature of the language used by retiring members in their floor speeches.

[Insert Figure Four About Here]

[Insert Figure Five About Here]

To more closely examine the factors shaping candidates’ Tweetscores, we created multivariate regression models to predict incumbent and challenger ideology scores expressed in both congressional debate and social media. The results of five OLS models are presented in Tables Three and Four.18 The first two models predict each candidate’s raw Tweetscore and floor speech-based
Again we see evidence of the influence of party, with Republicans in each case more likely to receive a numerically higher (i.e., more conservative score). Non-party estimates in each model are also largely consistent with expectations. Female candidates, for example, present a consistently more liberal face to the public than their male counterparts do. The second two models in Table Three predict within party variance in scores of each type. These results provide clear evidence in support of Hypothesis 2, suggesting that candidates are sensitive to their electoral environment when managing their public ideological profile. As local electoral pressures increase, candidates become more likely to diverge from their party’s mean ideological position in order to appeal to local voters. Arguments that candidates position themselves according to voters’ preferences are not new, but Tweetscores are uniquely well suited to provide evidence on this point, especially when compared with roll-call scores like NOMINATE. [Insert Table Three About Here] The model presented in Table Four also tests the utility of Hypothesis 2 by evaluating the factors motivating extreme partisan language. Extreme partisan language is both a reflection of an individual member or candidate's own political views but also a strategically modulated campaign device. We might expect candidates for office to moderate the ideology in their public facing communications while engaged in electoral politics. To test this implication of Hypothesis 2, we estimated a fifth model (presented in Table Four) to compare estimates of extremity generated by both Tweetscores and NOMINATE. Given that extreme language is the variable we 18 We also estimated a number of other hierarchical and nested models but finding that they produced no substantive differences, we report the simpler models here. 
seek to explain, and we operationalize it as the absolute value of a candidate’s Tweetscore, we must use a separate measure of candidate ideology as a covariate in the model itself. Therefore, we test this hypothesis on a smaller subset of the overall sample for which NOMINATE scores are available, using those scores as a measure of incumbent ideology.

[Insert Table Four About Here]

Using this considerably smaller sample, we do see a significant interaction between candidate and district ideology in the expected direction. A member with a -0.5 NOMINATE score can be expected to produce a more extreme Tweetscore (by 0.19, or roughly a tenth of the difference between the average Democrat and Republican in the sample) in a conservative district where Obama won just 30 percent of the vote than in a liberal district where 70 percent of ballots endorsed Obama. The difference is even larger among those with conservative NOMINATE scores: an incumbent with a 0.5 score is expected to moderate his or her score by almost a full point (0.97) across a similar range of districts. However, given the narrow spectrum of incumbents in districts that differ substantially from their own ideology, using incumbents for these scenarios may be less revealing than scenarios involving the full sample.

Two other variables of note were also statistically significant predictors of extreme language. Female members were associated with more extreme language (0.19), while the median age of the district was associated with less extreme language, although there are no clear theoretical explanations to support either of these findings.

Similarly, Tweetscores allow us to directly compare the ideological statements made by both incumbents and challengers in similar electoral conditions. The results in Table Three also support Hypothesis 3 in their lack of a clear distinction between challengers and incumbents.
Instead, as we might expect, incumbency status appears to have little to do with the ideological nature of the public face of a campaign. Again, the value added here is less connected with the nature of the argument itself and more with the ease with which we are able to compare the public communication behavior of both incumbents and challengers.

Conclusion

In general, our paper makes two contributions to the ongoing discussion surrounding the place of social media in congressional campaigns. First, we have demonstrated the tractability of Twitter data. Using text-as-data methods accessible to political scientists, we have categorized and modeled a large dataset of candidate messages. Our results lay bare both the benefits and difficulties associated with these methods. While they hold the promise of quantifying the increasingly large amount of campaign-related text produced and distributed on the Internet, they are by definition bound by the quality of their reference texts. Though our Tweetscores highlight the unique utility of the nearly daily strategic messages now produced on Twitter by the vast majority of members of Congress, we rely on major party platforms to generate our ideal point estimates. This leaves our measure vulnerable to internal partisan fragmentation and the potentially declining relevance of unifying texts like party platforms.

The combination of these tools and the still-increasing volume of tweets produced by congressional Twitter users[19] holds substantial promise for those interested in unpacking the causes and consequences of members’ public displays of ideology. Because of Twitter’s nearly universal adoption among candidates, low-cost nature (both to produce and collect), and temporal precision (each tweet is time-stamped with the exact moment it is posted), candidate tweets offer a uniquely comparable and insightful dataset.
Our Tweetscores therefore may allow researchers to more effectively examine longitudinal and event-specific changes in candidates’ self-presentation. Similarly, because challengers as well as incumbents are frequent producers of tweets, Tweetscores may allow us to model the manner in which electoral opponents respond to one another over the course of a campaign. This may allow us to more fully address issues such as how incumbents respond to challengers who effectively appeal to a large portion of the electorate. Previous studies of campaigns have often been limited by the incumbent-centric nature of ideology measures such as NOMINATE scores. We find that in races with sufficient candidate Twitter activity, some of this limitation may be moderated. Data of the kind we have considered have immediate, direct applications for studies of at least campaign effects, mass communication, and vote choice. Beyond this, however, tools such as Wordscores are powerfully agnostic and offer the ability to plot candidate tweets along any two-dimensional space (as defined by the selected reference texts).

[19] As part of another, related project, the authors have identified more than 800,000 tweets posted by candidates for Congress during 2013 and 2014, a 60 percent increase over the time period we consider here.

Second, we also identify the increasingly strategic nature of congressional candidate Twitter use. At nearly every step our results are consistent with a story of fully integrated, strategic tweeting by members of all ages, experience, support, and partisanship. Indeed, our most consistent finding is the influence of local electoral competition. In nearly all instances, candidate Twitter activity displayed a distinct, expected sensitivity to electoral conditions.
Not only did we find that candidates altered the nature of their tweets when the campaign winds shifted, but we also found that the group of candidates we would expect to be most skilled at campaign rhetoric (incumbents, those who have a record of electoral success) appear as such in our data.

Ultimately, therefore, while we argue that it is possible to accrue enduring insights from modeling the use of Twitter among major party candidates for Congress, the road to doing so remains uneven. We have identified an important tool researchers can call upon to make sense of candidates’ tweets (there are many others available as well), but their use requires planning and careful application. Thus, while we have used these tools to begin to identify the value of a social media-based measure of congressional ideology and, by extension, the role of electoral incentives in Twitter use, many questions about what motivates candidates’ tweet behavior remain unanswered. We look forward to beginning to address these questions in future iterations of this project.

Bibliography

Amman, Sky. 2010. “A Political Campaign Message in 140 Characters or Less: The Use of Twitter by U.S. Senate Candidates in 2010.” Social Science Research Network.
Arceneaux, Noah and Amy Schmitz Weiss. 2010. “Seems Stupid Until You Try It: Press Coverage of Twitter, 2006-09.” New Media and Society 12 (8): 1262-79.
Arnold, R. Douglas. 1990. The Logic of Congressional Action. New Haven, CT: Yale University Press.
Bimber, Bruce. 2003. Information and American Democracy: Technology in the Evolution of Political Power. New York: Cambridge University Press.
Bimber, Bruce and Richard Davis. 2003. Campaigning Online: The Internet in US Elections. New York: Oxford University Press.
Bode, Leticia, David S. Lassen, Young Mie Kim, Dhavan Shah, Erika Franklin Fowler, Travis N. Ridout, and Michael Franz. 2011.
“Social and Broadcast Media in 2010 Midterms: The Expanding Repertoire of Senate Candidates’ Campaign Strategies.” Prepared for the Annual Meeting of the American Political Science Association, Seattle, WA.
Bovitz, Gregory L. and Jamie L. Carson. 2006. “Position-taking and Electoral Accountability in the US House of Representatives.” Political Research Quarterly 59 (2): 297-312.
Camia, Catalina. 2013. “No More Dead Deer: Grassley Changes What He Tweets.” USAToday.com. http://www.usatoday.com/story/onpolitics/2013/03/05/chuck-grassleytwitter-personal/1965235/ (accessed August 21, 2014).
Canon, David T. and William T. Bianco. 2013. American Politics Today, 3rd ed. New York: W. W. Norton.
Chadwick, Andrew. 2006. Internet Politics: States, Citizens, and New Communication Technologies. New York: Oxford University Press.
Chi, Feng and Nathan Yang. 2010. “Twitter in Congress: Outreach vs. Transparency.” http://mpra.ub.uni-muenchen.de/24060/1/MPRA_paper_24060.pdf (accessed August 21, 2014).
D’Alessio, Dave. 1997. “Use of the World Wide Web in the 1996 US Election.” Electoral Studies 16 (4): 489-500.
--------------. 2000. “Adoption of the World Wide Web by American Political Candidates, 1996-1998.” Journal of Broadcasting & Electronic Media 44 (4): 556-68.
Davidson, C. Simon. 2014. “When is a Tweet an Ethics Violation?” RollCall.com. http://blogs.rollcall.com/beltway-insiders/when-is-a-tweet-an-ethics-violation-a-questionof-ethics/?dcz= (accessed August 21, 2014).
Druckman, James N., Martin J. Kifer, and Michael Parkin. 2014. “US Congressional Campaign Communications in an Internet Age.” Journal of Elections, Public Opinion & Parties 24 (1): 20-44.
Esterling, Kevin M., David M.J. Lazer, and Michael A. Neblo. 2005. “Home (Page) Style: Determinants of the Quality of the House Members’ Web Sites.” International Journal of Electronic Government Research 1 (2): 50-63.
--------------. 2011.
“Representative Communication: Web Site Interactivity and Distributional Path Dependence in the US Congress.” Political Communication 28 (4): 409-39.
--------------. 2013. “Connecting to Constituents: The Diffusion of Representation Practices among Congressional Websites.” Political Research Quarterly 66 (1): 102-14.
Evans, C. Lawrence, and Walter J. Oleszek. “The Internet Institutional Change.” In James A. Thurber and Colton C. Campbell, eds., Congress and the Internet. New York: Prentice Hall.
Evans, Heather K., Victoria Cordova, and Savannah Sipole. 2014. “Twitter Style: An Analysis of How House Candidates Used Twitter in their 2012 Campaigns.” PS: Political Science & Politics 47 (2): 454-62.
Fenno, Richard. 1978. Home Style: House Members in their Districts. New York: Longman.
--------------. 2000. Congress at the Grassroots: Representational Change in the South, 1970-1998. Chapel Hill, NC: University of North Carolina Press.
Fiorina, Morris P. 1974. Representatives, Roll Calls, and Constituencies. Lexington, MA: Lexington Books.
Foot, Kirsten A. and Steven M. Schneider. 2006. Web Campaigning. Cambridge, MA: MIT Press.
Fountain, Jane E. 2001. “Building the Virtual State.” Information Technology and Institutional Change 61-82.
Gaver, William. 1991. “Technology Affordances.” Proceedings of the SIGCHI Conference on Human Factors in Computing Systems 79-84.
Glassman, Matthew Eric, Jacob R. Straus, and Colleen Shogan. 2010. “Social Networking and Constituent Communications: Member Use of Twitter During a Two-Month Period in the 111th Congress.” CRS Report for Congress.
Golbeck, Jennifer, Justin M. Grimes, and Anthony Rogers. 2010. “Twitter Use by the US Congress.” Journal of the American Society for Information Science and Technology 61 (8): 1612-21.
Grimmer, Justin. 2013. Representational Style in Congress: What Legislators Say and Why It Matters. New York: Cambridge University Press.
Grimmer, Justin and Brandon M. Stewart. 2013.
“Text as Data: The Promise and Pitfalls of Automatic Content Analysis for Political Texts.” Political Analysis 21 (3): 1-31.
Grimmer, Justin and Gary King. 2011. “General Purpose Computer-Assisted Clustering and Conceptualization.” Proceedings of the National Academy of Sciences 108 (7): 2643-2650.
Gulati, Jeff. 2004. “Members of Congress and Presentation of Self on the World Wide Web.” International Journal of Press/Politics 9 (1): 22-40.
Gulati, Jeff and Christine Williams. 2009. “Closing Gaps, Moving Hurdles: Candidate Web Site Communication in the 2006 Campaigns for Congress.” In Costas Panagopoulos, ed., Politicking Online: The Transformation of Election Campaign Communications. New Brunswick, NJ: Rutgers University Press.
Gulati, Jeff and Christine Williams. 2010. “Communicating with Constituents in 140 Characters or Less: Twitter and the Diffusion of Technology Innovation in the United States Congress.” Paper prepared for the Annual Meeting of the Midwest Political Science Association, Chicago, IL.
Haber, Steven. 2011. “The 2010 US Senate Elections in Less than 140 Characters: How Senate Candidates Use Twitter as a Campaign Tool.” http://aladinrc.wrlc.org/bitstream/handle/1961/10028/Haber,%20Steven%20%20Spring%20'11.pdf?sequence=1 (accessed August 21, 2014).
Hall, Richard. 1996. Participation in Congress. New Haven, CT: Yale University Press.
Hemphill, Libby, Jahna Otterbacher, and Matthew Shapiro. 2013. “What’s Congress Doing on Twitter?” Proceedings of the 2013 Conference on Computer Supported Cooperative Work 877-86.
Highton, Benjamin and Michael S. Rocca. 2005. “Beyond the Roll-call Arena: The Determinants of Position Taking in Congress.” Political Research Quarterly 58 (2): 303-16.
Hopkins, Daniel J. and Gary King. 2010. “A Method of Automated Nonparametric Content Analysis for Social Science.” American Journal of Political Science 54 (1): 229-247.
Jacobson, Gary. 1987.
“The Marginals Never Vanished: Incumbency and Competition in Elections to the US House of Representatives, 1952-82.” American Journal of Political Science 126-41.
Johnson, Dennis. 2004. Congress Online: Bridging the Gap between Citizens and their Representatives. New York: Routledge.
Jones, David R. 2003. “Position Taking and Position Avoidance in the US Senate.” Journal of Politics 65 (3): 851-63.
Kingdon, John W. 1989. Congressmen’s Voting Decisions. Ann Arbor, MI: University of Michigan Press.
Kwak, Haewoon, Changhyun Lee, Hosung Park, and Sue Moon. 2010. “What is Twitter, a Social Network or a News Media?” Paper presented at the International World Wide Web Conference, Raleigh, NC.
Lassen, David S. and Adam R. Brown. 2011. “Twitter: The Electoral Connection?” Social Science Computer Review 29 (4): 419-36.
Lassen, David S. and Leticia Bode. 2013. “Social Media Coming of Age: Developing Patterns of Congressional Twitter Use 2007-2012.” Paper prepared for presentation at the Annual Meeting of the American Political Science Association, August 29-September 1, 2013.
Laver, Michael and John Garry. 2000. “Estimating Policy Positions from Political Texts.” American Journal of Political Science 44 (3): 619-634.
Laver, Michael, Kenneth Benoit and John Garry. 2003. “Extracting Policy Positions from Political Texts Using Words as Data.” American Political Science Review 97 (2): 311-331.
Lowe, Will. 2008. “Understanding Wordscores.” Political Analysis 16 (4): 356-371.
Lipinski, Daniel, and Gregory Neddenriep. 2004. “Using ‘New’ Media to Get ‘Old’ Media Coverage: How Members of Congress Utilize their Web Sites to Court Journalists.” The Harvard International Journal of Press/Politics 9 (1): 7-21.
Livne, Avishay, Matthew P. Simmons, Eytan Adar, and Lada A. Adamic. 2011. “Networks and Language in the 2010 Election.” Prepared for the 2011 Political Networks Conference, Ann Arbor, MI.
Mayhew, David R. 1974. Congress: The Electoral Connection. New Haven, CT: Yale University Press.
Meinke, Scott R. 2009. “Presentation of Partisanship: Constituency Connections and Partisan Congressional Activity.” Social Science Quarterly 90 (4): 854-67.
Mirer, Michael L. and Leticia Bode. 2013. “Tweeting in Defeat: How Candidates Concede and Claim Victory in 140 Characters.” New Media & Society 1-17.
Norris, Pippa. 2004. “Who Surfs? New Technology, Old Voters, and Virtual Democracy in U.S. Elections 1992-2000.” In E.C. Kamarck, eds., Democracy.com. Washington, DC: Brookings Institution.
North, Sterling. 2009. Young Thomas Edison. New York: Penguin.
Owen, Diana, Richard Davis, and Vincent James Strickler. 1999. “Congress and the Internet.” International Journal of Press/Politics 4 (2): 10-29.
Parmalee, John H. and Shannon L. Bichard. 2012. Politics and the Twitter Revolution: How Tweets Influence the Relationship Between Political Leaders and the Public. Lanham, MD: Lexington Books.
Pierson, Paul. 2000. “Increasing Returns, Path Dependence, and the Study of Politics.” American Political Science Review 94 (2): 251-67.
Pitkin, Hanna. 1967. The Concept of Representation. Berkeley, CA: University of California Press.
Putnam, Robert. 2008. “Affordances of Technology for Supporting Teaching and Learning.” SMU.edu. http://centres.smu.edu.sg/cte/innovative-development/affordance-oftechnology/ (accessed August 21, 2014).
Samuelsohn, Darren. 2014. “Pols Have a #Fakefollower Problem.” Politico.com. http://www.politico.com/story/2014/06/twitter-politicians-107672.html (accessed August 21, 2014).
Stromer-Galley, Jennifer. 2000. “On-line Interaction and Why Candidates Avoid It.” Journal of Communication 50 (4): 111-32.
Toff, Benjamin J. and Young Mie Kim. 2014. “Words that Matter: Twitter and Partisan Polarization.” Paper presented at the Annual Meeting of the International Communications Association, Seattle, WA.
Tolbert, Caroline J. and Ramona S. McNeal. 2003. “Unraveling the Effects of the Internet on Political Participation.” Political Research Quarterly 56 (2): 175-85.
United States House of Representatives. “The First Electronic Vote.” House.gov. http://history.house.gov/HistoricalHighlight/Detail/37169?ret=True (accessed August 21, 2014).
United States Senate. “Voting Process.” Senate.gov. https://www.senate.gov/reference/Index/Votes.htm (accessed August 21, 2014).
Williams, Christine, Andrew Aylesworth, and Kenneth J. Chapman. 2002. “The 2000 E-Campaign for U.S. Senate.” Journal of Political Marketing 1 (4): 39-63.
Xenos, Michael and Kirsten A. Foot. 2005. “Politics as Usual, or Politics Unusual? Position Taking and Dialogue on Campaign Websites in the 2002 US Elections.” Journal of Communication 55 (1): 169-85.

Table One
Summary Statistics

Feature                  Tweets    Floor Speeches    Roll Call Votes
Documents (mean)         491.5     29.9              1189.7
Documents (median)       334                         1393.5
Scored words (mean)      2236      3104
Scored words (median)    1582      1083

Figure One
Wordscores Procedure Illustrated

Figure Two
Distribution of Ideology Scores by Communication Method
[Panels: Twitter Feeds; Congressional Record. Series: Republicans, Democrats. Horizontal axis: Lengths. Vertical axis: Density.]

Figure Three
Comparing Tweetscores and NOMINATE Scores by Tweet Volume

Table Two
Ideology Score Correlations

               Tweetscore   CR Score   NOMINATE   DIME    Nat Journal   Shor    ADA
CR Score       0.47
NOMINATE       0.63         0.43
DIME           0.53         0.49       0.94
Nat Journal    0.61         0.48       0.94       0.89
Shor           0.53         0.52       0.86       0.87    0.86
ADA            -0.58        -0.42      -0.91      -0.90   -0.91         -0.86
ACU            0.62         0.45       0.97       0.94    0.95          0.88    -0.95

Figure Four
Distribution of Tweetscores by Election Context and Outcome
[Panels: Winners; Losers. Series: Republicans, Democrats, Other. Horizontal axis: PVI of District/State. Vertical axis: Wordscore of Twitter Feeds.]

Figure Five
Distribution of CR Scores by Election Context
[Panels: Candidates; Retirees. Series: Republicans, Democrats, Other. Horizontal axis: PVI of District/State. Vertical axis: Wordscore of Congressional Record.]

Table Three
Modeling Tweetscores and CR Scores

District/State Covariates District/State Competitiveness (PVI) Percent Nonwhite Median Age Percent with Bachelor's Degree Median Income (logged) Candidate Covariates Party (1=Rep, -1=Dem) Tweetscore Cr Score 0.02*** (0.00) 0.75* (0.36) 0.00 (0.01) 1.52* (0.70) -0.63* (0.28) 0.02** (0.01) 0.88 (0.45) -0.03 (0.02) 1.43 (0.86) -0.36 (0.34) 0.02*** (0.00) 0.46 (0.30) -0.01 (0.01) 0.19 (0.59) -0.13 (0.24) 0.02** (0.01) 0.05 (0.43) -0.03* (0.02) 0.02 (0.78) 0.24 (0.31) 0.66*** (0.04) 0.41*** (0.08) -0.19 (0.16) -0.04 (0.11) -0.10** (0.03) 0.09* (0.04) 0.05 (0.06) 0.05 (0.03) 0.27* (0.13) -0.31** (0.10) 0.51* (0.25) 1.50*** (0.32) 0.03 (0.08) 0.05 (0.06) 0.15* (0.07) 0.06 (0.04) 0.02 (0.18) -0.49*** (0.13) -0.47*** (0.04) 0.01*** (0.00) -0.10 (0.14) -0.18 (0.10) -0.07* (0.03) 0.10** (0.04) 0.01 (0.05) 0.02 (0.02) 0.00 (0.11) -0.06 (0.08) 0.02 (0.24) -0.04 (0.04) 2.40 (2.46) 689 0.29 0.27 -0.63*** (0.07) 0.01 (0.01) -0.06 (0.22) 0.25 (0.29) 0.02 (0.08) 0.04 (0.06) 0.02 (0.06) 0.02 (0.03) 0.01 (0.16) 0.00 (0.12) 0.00 (0.27) Party X PVI Senator (1=Yes) Incumbent (1=Yes) Campaign Spending (logged) Age (in increments of 10) Years in Congress (in increments of 10) Education (0=High School, 6=Doctorate) Nonwhite (1=Yes) Female (1=Yes) Unopposed (1=Yes) # of Tweets (logged) 6.71 1.68 (2.93) (3.61) N 693 363 R-Squared 0.36 0.37 Adj. R-Squared 0.35 0.34 ***p<0.001 **p<0.01 *p<0.05. Results from OLS models. (Intercept) Distance from Party Mean Cr Score Tweetscore -1.19 (3.50) 363 0.30 0.27

Table Four
Members’ Tweetscore Extremity

                                            Tweetscore Extremity
District/State Covariates
  Obama 2012 Vote %                         -1.45*** (0.55)
  Percent Nonwhite                          -0.02 (0.40)
  Median Age                                -0.03* (0.01)
  Percent with Bachelor's Degree            -0.18 (0.71)
  Median Income (logged)                    0.17 (0.28)
Candidate Covariates
  NOMINATE Score                            0.64*** (0.49)
  Ideology X Obama 2012                     -1.94*** (0.91)
  Party (1=Rep, -1=Dem)                     -0.01 (0.08)
  Senator (1=Yes)                           0.11 (0.22)
  Incumbent (1=Yes)                         -0.15 (0.30)
  Campaign Spending (logged)                0.06 (0.08)
  Age (in increments of 10)                 -0.00 (0.00)
  Education (0=High School, 6=Doctorate)    0.04 (0.03)
  Nonwhite (1=Yes)                          -0.05 (0.15)
  Female (1=Yes)                            0.19** (0.08)
  # of Tweets (logged)                      -0.11*** (0.03)
(Intercept)                                 6.16 (8.95)
N                                           347
Adj. R-Squared                              0.08
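The two scenario estimates discussed alongside Table Four (0.19 and 0.97) can be checked directly against the reported coefficients. What follows is a minimal sketch rather than the authors' own code: it assumes that the Obama 2012 vote share enters the model as a proportion (0.30 and 0.70), and the function names are hypothetical. Because every term that does not involve the vote share is the same in both districts, only the Obama coefficient and the interaction term contribute to the gap.

```python
# Coefficients copied from Table Four (dependent variable: Tweetscore extremity).
B_OBAMA = -1.45        # Obama 2012 Vote %
B_NOMINATE = 0.64      # NOMINATE Score
B_INTERACTION = -1.94  # Ideology X Obama 2012


def partial_prediction(nominate, obama_share):
    """Sum of the three model terms that involve ideology or Obama vote share.

    Assumption (not stated in the paper): obama_share is a proportion in [0, 1].
    All other covariates are held fixed, so they are omitted here.
    """
    return (B_OBAMA * obama_share
            + B_NOMINATE * nominate
            + B_INTERACTION * nominate * obama_share)


def extremity_gap(nominate, obama_low=0.30, obama_high=0.70):
    """Predicted extremity in the low-Obama district minus the high-Obama district."""
    return (partial_prediction(nominate, obama_low)
            - partial_prediction(nominate, obama_high))


for score in (-0.5, 0.5):
    gap = extremity_gap(score)
    print(f"NOMINATE {score:+.1f}: predicted extremity gap = {gap:.2f}")
```

Under these assumptions the script prints gaps of 0.19 and 0.97 for NOMINATE scores of -0.5 and 0.5, matching the figures reported in the text.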