Effect of Terror Attack Locale on Social Media Coverage and Sentiment
Batyrlan Nurbekov, Sam Khalandovsky
Introduction
The project explores how terror attack locations affect social media coverage and sentiment.
The primary analysis was done by extracting sentiment of Twitter posts, and Google News and Bing
were used to validate conclusions. We found that while coverage varied widely, the emotional response
was similar regardless of locale.
Hypothesis
Our hypothesis is that terrorist attacks in Western countries lead to much greater media
coverage, and that this difference is visible on Twitter through the sentiment of user posts.
Methodology
First, we selected a number of queries to analyze on Twitter, such as "Paris terror OR terrorism
OR terrorist", in order to isolate tweets related to the terror attacks in various locales. We chose Twitter
as a data source because of its large user base and its broad representation of people's reactions. To extract
the tweets, we used a tool built by Healey and Ramaswamy at North Carolina State University [1], which
uses the Twitter Search API to create a downloadable table of recent tweets matching the query.
The tweets were then run through a sentiment analysis algorithm, provided within the same tool and
described in the "Data analysis used" section, in which each tweet was assigned two scores from 1 to 9
representing its "pleasure" and "arousal", placing it on the Russell model of emotional affect. Using Excel,
we constructed a scatterplot for each query, positioning each tweet on the 2D plane of the
Russell model. Additionally, we calculated the means and standard deviations of the pleasure and
arousal values for the tweets of each query. By looking at scatterplot clustering, means, standard
deviations, and the number of tweets returned for each query, we were able to draw conclusions about the
social media impact of different attacks. We supported these conclusions with the number of hits for
the same queries on news search engines.
Assumptions
One of our assumptions is that sentiment analysis, like many other data mining algorithms, is
not precise. We arrived at this assumption after running the sentiment analysis on tweets over a period
of time and finding that the variance of the results was high. A potential error associated with
this assumption is drawing conclusions without analyzing a sufficiently wide set of data.
Another assumption that we made is that, although many of the tweets come from news outlets
rather than from individuals, sentiment analysis will still give valid results. We believe this to be true because
news headlines still vary from emotional to neutral. We found that the majority of tweets were in fact
links to media outlets, each preceded by a short comment or headline. Limitations of the API meant that
we were unable to filter out tweets containing links; however, our assumption was validated, as
sentiment was still visible despite the preponderance of media outlet tweets over personal ones.
Nevertheless, the skew introduced by using mostly headlines could affect the conclusions of our
analysis. One potential error that could result from ignoring this assumption would
be using the results as a measure of public opinion when the data is actually a mix of public
opinion and media output.
Additionally, we made the assumption that sentiment on Twitter over the past few days is
representative of sentiment over time, even if the attacks to which the query referred were not recent.
This was based on the idea that while sentiment is likely stronger immediately after an attack, it
is still visible over time as people continue to comment on the event and the media publishes new
information. This assumption was necessary because the Twitter Search API severely limited our
ability to collect older tweets. Our data supported this assumption: the Tell Tamer
attack was the only one that happened within a few days of our data collection, but negative sentiment was
clear even for attacks that had occurred several months earlier. Had the assumption proved inaccurate,
we would have needed to find a way to collect tweets from several months back, around the date of each
attack. Potential errors resulting from this assumption could come from other events
matching the same query and clouding the results; for instance, Syria-related searches could show results
relevant to a number of different events.
One invalid assumption that could be made is that the sentiment analysis is inaccurate
simply because it gives results with high variance. This is not true, since many
researchers have successfully validated the accuracy of sentiment analysis [2].
While the dataset of Twitter does not have inherent flaws, limitations of our access to the
dataset hindered analysis. Specifically, using the Twitter Search API it was difficult to filter out news
links, select precise date ranges, or find all tweets matching the query (rather than a subset of the "most
relevant"). However, as described in our assumptions above, we were able to work around these
limitations.
Results
Raw data
Examples of tweets found for "Paris terrorist OR terror OR terrorism" can be seen below, along
with their sentiment scores (more raw data can be found in the attached Excel sheet):
Date              Username      Pleasure  Arousal  Content
12/11/2015 17:34  CNNsWorld     3.75      5.85     Authorities issue alert for another Paris terror suspect https://t.co/hwogRziKiW #BreakingNews#ParisAttack#Paris #gamedev
12/11/2015 17:11  ThomasMullen  3.78      5.34     "In each case, it wasn't that the government couldn't obtain the information it needed to prevent a terrorist... https://t.co/I31Em8Gk2k
12/11/2015 16:21  tcschnell67   3.79      5.97     The fake passport used by 'Paris terrorist' to travel through Europe https://t.co/tykLNacWua via @MailOnline
Examples of tweets found for "Beirut terrorist OR terror OR terrorism" can be seen below, along
with their sentiment scores:
Date              Username         Pleasure  Arousal  Content
12/11/2015 17:03  NEILTONKS        4.6       6.12     Commuters on TERROR ALERT: Armed police on patrol of public transport after Leytonstone https://t.co/vrBhIOuF9r
12/5/2015 5:52    littlebirdbeans  3.94      5.97     Why do Beirut & the Russian airliner incidents not evoke media attention like Paris did? Are factors other than crime-terror responsible?
12/11/2015 16:21  FenRawrys        3.27      6.33     @mattyglesias @drvox cut and ran after a terrorist attack in Beirut
Data analysis used
The core of our analysis lay with the sentiment extraction algorithm that we used to assess
Twitter sentiment. Sentiment is defined as "an attitude, thought, or judgment prompted by feeling"; our
goal was to associate with each tweet a score representing the emotion it expresses. The
algorithm we used relies on the "Affective Norms for English Words" (ANEW) database, which
contains 1,034 English words selected for their ability to convey emotion, each scored from 1 to 9 on pleasure,
arousal, and dominance. These scores were obtained by surveying volunteers. To obtain Twitter
sentiment, only tweets that matched the query and contained at least two ANEW words were selected,
and the scores for pleasure and arousal were calculated by averaging the scores of the ANEW words
contained in the tweet.
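The averaging step described above can be illustrated with a minimal sketch. The lexicon below is a hypothetical four-word stand-in whose numbers are invented for illustration and are not actual ANEW ratings; the function name, the tokenization rule, and the `min_words` parameter are likewise assumptions rather than details of the tool we used.

```python
from statistics import mean

# Hypothetical mini-lexicon: word -> (pleasure, arousal) on a 1-9 scale.
# These numbers are made up for illustration; they are NOT real ANEW ratings.
TOY_LEXICON = {
    "terror": (2.0, 7.0),
    "attack": (2.5, 6.5),
    "alert": (5.0, 6.0),
    "peace": (8.0, 3.0),
}


def score_tweet(text, lexicon=TOY_LEXICON, min_words=2):
    """Average the (pleasure, arousal) scores of the lexicon words in a tweet.

    Returns None when fewer than `min_words` lexicon words are present,
    mirroring the rule that a tweet needs at least two rated words.
    """
    # Crude tokenization (an assumption): split on whitespace, strip punctuation.
    tokens = [t.strip(".,:;!?\"'()#@").lower() for t in text.split()]
    hits = [lexicon[t] for t in tokens if t in lexicon]
    if len(hits) < min_words:
        return None
    pleasure = mean(p for p, _ in hits)
    arousal = mean(a for _, a in hits)
    return pleasure, arousal
```

With this toy lexicon, a tweet such as "Terror alert after attack" contains three rated words and receives the average of their scores, while "Peace at last" contains only one and is discarded.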
The reasoning for using pleasure and arousal to evaluate sentiment comes from Russell's
model of emotional affect, a popular method in psychology for representing emotional states on a
2-dimensional plane, with the "pleasure" axis ranging from miserable to pleased, and the "arousal" axis
ranging from sleepy to aroused. According to the model, all major emotions (e.g. distress, depression,
excitement, etc.) can be placed on this plane. We can then represent all tweets for a specific query by
plotting them in a scatterplot on this plane to qualitatively assess the emotions associated with the
tweets.
Figure 1. Russell's model of emotional affect
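As a rough illustration of how a score pair maps onto the plane, the sketch below names the quadrant in which a tweet falls. Treating 5.0 (the middle of the 1-9 scale) as the neutral point is an assumption, as are the function name and the four coarse labels, which are taken from the emotion regions of Russell's model rather than from the tool we used.

```python
def russell_quadrant(pleasure, arousal, midpoint=5.0):
    """Name the quadrant of the pleasure-arousal plane that a score falls in.

    `midpoint` is an assumed neutral value for scores on a 1-9 scale.
    """
    if pleasure >= midpoint:
        # Pleasant half of the plane
        return "excitement" if arousal >= midpoint else "contentment"
    # Unpleasant half of the plane
    return "distress" if arousal >= midpoint else "depression"
```

For example, the first Paris tweet listed under "Raw data" (pleasure 3.75, arousal 5.85) falls in the low-pleasure, high-arousal quadrant.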
In addition to the scatterplot, for each query we also used standard statistical methods to
calculate the mean and standard deviation of both pleasure and arousal. Finally, we also looked at the
total number of tweets returned by our query. To quantitatively compare the Twitter sentiment of
different queries, we compared these five values (number of tweets, and mean and standard deviation for both
pleasure and arousal).
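The five values described above can be computed as in the following minimal sketch. We did this step in Excel, so the Python function and its field names are illustrative only, and the use of the sample standard deviation (n - 1 denominator) is an assumption. The input is the three example Paris tweets listed under "Raw data", not the full data set.

```python
from statistics import mean, stdev


def summarize(scores):
    """Summarize a list of (pleasure, arousal) pairs for one query."""
    pleasure = [p for p, _ in scores]
    arousal = [a for _, a in scores]
    return {
        "n_tweets": len(scores),
        "pleasure_mean": mean(pleasure),
        "pleasure_sd": stdev(pleasure),  # sample standard deviation (assumed)
        "arousal_mean": mean(arousal),
        "arousal_sd": stdev(arousal),
    }


# The three example Paris tweets from the "Raw data" section.
paris_examples = [(3.75, 5.85), (3.78, 5.34), (3.79, 5.97)]
summary = summarize(paris_examples)
```

Running the same function over every scored tweet of a query would give the values to compare across locations.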
Besides Twitter sentiment, we also looked at the number of hits on Google News and Bing News
for the queries. For Google, we limited the search to articles published since the attack in that location occurred,
and took the number of hits for the top grouping of news stories that Google provided. For Bing, we
could not limit results by date, but were able to take the total number of news hits (which Google did
not provide).
Overview of results
The query "Beirut terrorist OR terror OR terrorism" gave us only 135 results, whereas the similar
query "Paris terrorist OR terror OR terrorism" gave us 195 results. The same trend can be seen for
the other locations, Sinai, Syria and Tell Tamer: in each case fewer tweets (71, 174 and 71, respectively)
were found than for Paris.
We found a similar trend when we looked at the number of articles that were
published after terrorist attacks in different locations. The number of media articles about the terrorist
attacks in Paris was significantly larger than the number of articles about terrorist attacks in Beirut, Sinai,
Syria and Tell Tamer.
After performing sentiment analysis on the tweets, we found that the pleasure level did not vary
significantly between locations. Arousal for Paris was marginally stronger than for the other locations.
The spread of pleasure scores was smallest for Paris, and the spread of arousal scores was smaller for
Paris than for every location except Tell Tamer.
Details of results
The following five queries were used, corresponding to significant recent attacks. The returned
tweets were from the past few days only.
1. "Paris terrorist OR terror OR terrorism"
Mass shooting on 11/13, 130 dead and 368 wounded
2. "Sinai terrorist OR terror OR terrorism"
Bombing on 10/31, 224 dead
3. "Beirut terrorist OR terror OR terrorism"
Suicide bombing on 11/12, 43 dead and 240 wounded
4. "Syria terrorist OR terror OR terrorism"
A number of attacks across Syria over the past several months
5. "Tamer terrorist OR terror OR terrorism"
Bombing on 12/11 in Tell Tamer, 60 dead and 80 wounded
Figure 2. Sentiment analysis results
The chart above shows the results of sentiment analysis for the five queries; the Appendix provides the
raw data used in this chart. For each query, we display the total number of tweets that matched the
query and were valid for sentiment analysis, the average pleasure and arousal scores for each, and error
bars representing the standard deviation of the scores. The difference in tweet volume is the clearest
difference in this analysis, showing a much greater reaction to the attack in Paris. However, other
elements of the sentiment analysis show interesting information as well. For instance, Tell Tamer
shows a visibly lower pleasure score; the attack occurred shortly before the data was collected, which
suggests that recency increases the displeasure expressed by Twitter users. This effect is visible even though the Tell Tamer
attack received far fewer mentions on Twitter than the Paris attacks, which were much less recent.
Paris also showed slightly greater and more concentrated arousal scores. While Beirut and Paris showed
similar pleasure scores, the reactions to Paris were considerably more homogeneous, while Beirut
showed greater variance in sentiment. Scatterplots of those two queries show the same effect:
Figure 3. Scatter plot for Paris Data
Figure 4. Scatter plot for Beirut Data
The greater spread visible in the Beirut results is likely due to the greater number of events
that have occurred near Beirut, compared to just the one primary event in Paris. Overall, however, the
analysis shows that the emotions expressed in response to an attack are similar even when the volume
of the response is vastly different, corresponding to nervous, stressed, and upset reactions on the
Russell model.
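One way to put a number on the similarity between Paris and Beirut is Welch's t statistic computed from the summary statistics alone. This check was not part of our original analysis; it is a sketch that assumes the tweet counts (195 and 135) are the sample sizes behind the means and standard deviations listed in the Appendix, and the function name is our own.

```python
from math import sqrt


def welch_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Welch's t statistic for two groups described only by summary statistics."""
    standard_error = sqrt(sd_a**2 / n_a + sd_b**2 / n_b)
    return (mean_a - mean_b) / standard_error


# Paris vs. Beirut, using the Appendix values and the tweet counts as n.
t_pleasure = welch_t(4.140103, 0.857762, 195, 4.11563, 1.210374, 135)
t_arousal = welch_t(5.922103, 0.536653, 195, 5.815852, 0.760485, 135)
```

Both statistics come out below 2 in absolute value, which for samples of this size is conventionally read as no significant difference at the 5% level.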
The number of news articles found on Google and Bing is summarized in the table
below and compared to the number of tweets:
Location    Number of tweets found  Number of news articles since the date of incident found on Google  Number of news articles found on Bing
Paris       195                     22,345                                                               1,210,000
Beirut      135                     1,286                                                                273,000
Sinai       71                      3,234                                                                281,000
Syria       174                     10,959                                                               876,000
Tell Tamer  71                      252                                                                  54,000
The data from news search engines confirms what we saw from tweets: media coverage of the
Paris attacks was far greater than that of the Middle Eastern ones. This is particularly surprising when
comparing the Paris attacks to Syria, as Syria has seen a large number of terrorist attacks since the Paris
attacks.
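To make the size of the gap concrete, the counts reported for each location can be expressed as ratios relative to Paris. The sketch below only restates our reported figures; the dictionary layout and function name are illustrative.

```python
# location: (tweets, Google News hits since the incident, Bing News hits)
coverage = {
    "Paris": (195, 22_345, 1_210_000),
    "Beirut": (135, 1_286, 273_000),
    "Sinai": (71, 3_234, 281_000),
    "Syria": (174, 10_959, 876_000),
    "Tell Tamer": (71, 252, 54_000),
}


def ratio_to_paris(location):
    """How many times larger each Paris count is than the given location's count."""
    paris = coverage["Paris"]
    other = coverage[location]
    return tuple(round(p / o, 1) for p, o in zip(paris, other))


for place in coverage:
    if place != "Paris":
        print(place, ratio_to_paris(place))
```

For Beirut, for instance, Paris had about 1.4 times as many tweets, about 17.4 times as many Google News articles, and about 4.4 times as many Bing News hits.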
Conclusion
Based on the quantity of results, it can clearly be seen that media coverage is greater for
terrorist attacks in the West than for terrorist attacks in other countries, such as Syria, Lebanon
and Egypt. However, the sentiment analysis did not show a significant difference in tweet sentiment between
terrorist attacks in Western and Middle Eastern countries. This implies that the emotional reaction expressed in
the tweets that were posted did not differ significantly.
Overall, these two points lead to the conclusion that terrorist attacks in Western countries are
more widely broadcast on social media. However, the emotional reaction to terrorist attacks in the
Middle East is just as negative as the reaction to terrorist attacks in the West.
There are a number of additional avenues that can be explored with this dataset. One potential
area would be identifying if there is a difference in sentiment between tweets representing an
individual's opinion relative to tweets linking to a news article. Another interesting direction would be
analyzing how sentiment changes over time for a specific query, and seeing how long a specific event
remains relevant and emotionally charged across Twitter.
There are multiple directions that can be taken in subsequent experiments. First of all, a larger
number of data sources can be analyzed to reduce bias. Secondly, the sentiment analysis algorithm can
be improved in order to increase the precision of the results and reduce variance. Finally, a large offline
corpus of tweets could be collected, allowing better filtering based on date ranges and the occurrence
of links.
Without looking at the quantity of tweets and news results, someone could take our results to
mean that Twitter sentiment does not provide valuable data about actual sentiment because of its high
variance. Someone using our data could therefore arrive at a different conclusion and disregard sentiment
altogether. However, the track record of sentiment analysis demonstrates that it can be a useful tool,
even when what it shows is a lack of difference in sentiment.
Our research has a wide-ranging potential social impact in terms of opening people's eyes to the
unbalanced worldview presented by the media and amplified by social media like Twitter and by
digital news aggregators like Google News and Bing News. As digital media becomes an ever more
important source of information for many people worldwide, it is important for people to realize that
the issues that gain the most exposure in that media are not necessarily representative of their impact
on the world. By showing these results to people, we can encourage them to look at more sources of
media and perhaps consider the greater effect of specific events rather than just following whatever is
currently popular. The social impact of the project therefore potentially spans the entire English-speaking world
(470 million to 1 billion people) [3].
Appendix
Location         Average pleasure score  Pleasure score standard deviation  Average arousal score  Arousal score standard deviation
Paris            4.140103                0.857762                           5.922103               0.536653
Beirut           4.11563                 1.210374                           5.815852               0.760485
Sinai            4.014507                0.972595                           5.920704               0.600679
Syria            4.177759                1.136209                           5.702874               0.702284
(Tel/Tal) Tamer  3.759155                1.042239                           5.733521               0.387528
Averages and standard deviations for sentiment analysis
References
[1] https://www.csc.ncsu.edu/faculty/healey/tweet_viz/tweet_app/
[2] http://www.cs.columbia.edu/~julia/papers/Agarwaletal11.pdf
[3] McCrum, Robert; MacNeil, Robert; Cran, William (2003). The Story of English (Third Revised ed.).
London: Penguin Books. ISBN 978-0-14-200231-5.