Classifying Information Systems Risks

2013 46th Hawaii International Conference on System Sciences
Classifying Information Systems Risks:
What Have We Learned So Far?
Manuel Wiesche
Chair for Information Systems
Technische Universität München
Hristo Keskinov
Chair for Information Systems
Technische Universität München
Michael Schermann
Chair for Information Systems
Technische Universität München
Helmut Krcmar
Chair for Information Systems
Technische Universität München
Abstract¹
Understanding the risks caused by relying on
information systems is an enduring research
stream in the Information Systems (IS) discipline.
With information systems becoming ubiquitous, IS
risks permeate every aspect of life and effective
risk mitigation increasingly requires a holistic
structure. We use the largest and oldest publicly
available risk collection to understand the
development of IS risks, their characteristics, and
interdependencies. We review this data set using
text mining techniques. Interestingly, we find that
some types of IS risks tend to reoccur. We find
that this database provides rich opportunities for
learning from previous mistakes, which could
help avoid similar problems in the future. Our
contributions to theory include a risk-taker’s
view on contemporary information systems, a
differentiation between controllable and reoccurring
risks, and the increased interconnection of IS risks.
As implications for practice, we provide a basis for
learning from past IS risks and an initial structure.
¹ We thank SAP AG for funding this project as part of the
collaborative research center, Center for Very Large Business
Applications (CVLBA).
1530-1605/12 $26.00 © 2012 IEEE
DOI 10.1109/HICSS.2013.130
1. Introduction
Managing risks caused by relying on information
technology is an enduring research stream in the
Information Systems (IS) discipline [1]. Effectively
managing IS risks helps organizations recognize future
technological challenges early, prevents lock-in effects,
allows transparency over IT processes and ensures
responsibilities [2-4].
With information systems becoming ubiquitous, IS
risks permeate every aspect of life and risk mitigation
increasingly requires a holistic approach. Literature
argues that IS risks are neither exclusively
technological nor organizational but are embedded in
work systems that consist of information systems,
business processes, and work practices [1]. The
following examples illustrate the need for a holistic
approach to IS risks:
• The CEO of a multi-national corporation is
finalizing an important email on an upcoming
takeover in his hotel room. He is using the
unsecured wireless internet connection provided by
the hotel. He always felt that the VPN client
installed on his notebook is too cumbersome to use.
Against his better judgment, he sends off the email
without securing it. An industrial spy in the next
hotel room eavesdrops on the internet traffic of the
CEO and intercepts the particular email. He
publicizes the content, causing the multi-billion
takeover to fail.
• Currently, Facebook has more than 900 million
members that are producing 500,000 comments per
minute. Facebook collects information on people’s
private and professional lives. Other companies are
increasingly using Facebook as a platform for
marketing purposes or even for doing business. Several
privacy breaches have heightened the awareness of
potential risks from using Facebook on a private or
organizational level. Furthermore, the significant
role of Facebook in recent uprisings revealed the
power of the social network on a national level.
• Consider the design, implementation, and operation
of a nation-wide road toll billing system involving
satellite-based vehicle tracking. The project was
delayed by almost three years, which amounted to
1.6 billion Euro in penalties and 3.5 billion Euro in
lost earnings with legal actions still in progress.
Furthermore, privacy concerns were raised by
non-governmental agencies and prominent figures in
society. However, today the system is operating
very effectively and other countries are interested
in adopting the system.
The resulting entanglement of information
systems, organizations, people, and societies
challenges IS risk research. A central problem of
research on IS risks is the diverse dimensions in which IS
risks occur [1]. Literature on IS risks addresses
different topics ranging from IT security, software
projects, outsourcing, e-commerce, inter-organizational
systems, IT infrastructure, healthcare, to cryptography.
These topics span different risk domains including
operational, project, portfolio, monitoring, or strategic
risks. However, many of the articles research a
particular IS topic and focus on either one or a number
of risks associated with this topic in particular. Thus,
literature still demands a holistic perspective on IS
risks [5-9].
In order to understand the development of IS
risks, their characteristics, and interdependencies, we
review the largest and oldest publicly available IS risk
collection on the net, the Risks Digest. This risk
collection has been published by the Committee on
Computers and Public Policy of the Association for
Computing Machinery (ACM) since 1985. The Risks
Digest is edited by Peter G. Neumann and
characterized by its broad range of IS risks enhanced
by the knowledge and experience of experts from all
over the world.
There have been analyses of this interesting data
set [10, 11]. Several classifications have been
discussed within the database as well. Examples
include classifications of program bugs in the 1980s,
errors in the 1990s, and attacks in the 2000s. However,
none has been generally established and agreed upon.
We review the Risks Digest using text mining
techniques to learn from this longitudinal data set. We
automatically classify the entries and empirically
derive a taxonomy of IS risks. We contrast different
time slices and analyze the development of similarly
described risks over time. We use the data set to
analyze the development of IS risks in the past
decades. We find that some types of IS risks tend to
reoccur. Hence, this database provides rich
opportunities for learning from previous mistakes by
reusing risk factors, consequence assessments, and
mitigation strategies.
The remainder of this paper is organized as
follows. The next section outlines the theoretical
background by defining information systems and IS
risks. The third section describes the methodology we
followed in the course of our research. After that, the
fourth section presents our results. The fifth section
explores the limitations of our methodology and the
initial implications of the achieved results. The last
section summarizes our findings and presents the
conclusion.
2. Theoretical Background
Following fundamental definitions of risk in IS
reference disciplines [12, 13], IS researchers frequently
define risks as events with a perceived probability of
occurrence and a perceived negative impact on the
objectives [1, 14-16].
Only a few researchers explicitly state their
understanding of IS risk [1]. Among these, many
articles see risk as a quantifiable construct with a
probability and impact value [17-22]. Other articles
define risk explicitly as IS project failure [23-26] or
systems failure [26, 27]. Further researchers
understand risk from a broader perspective, in terms of
outcome variation. Out of these, risk is understood as
undesirable outcome variation [17, 28-30], uncertain
outcome [31-33], or variation in outcome [34]. Risk is
also defined in even broader terms as some kind of loss
in general [18-22, 35, 36]. For this research, we define
IS risk as “any threat that may lead to the improper
modification, destruction, theft, or lack of availability
of IT assets” [2].
Extensive research exists on the sources of IT risks
and individual countermeasures [37]. As a core body of
knowledge in IS risk research, literature on operational
IS risks focuses on managing and maintaining IT
systems. Main challenges include maintaining
availability, i.e. ensuring that systems do not break
down; integrity, i.e. keeping information from becoming
confused or incomplete; and confidentiality, i.e.
securing information systems against unauthorized
access [38]. Such risks are caused during the standard
usage of IT; the IT-component fails as part of a bigger
system [27]. Such IS risks can be divided into two
further categories: on the one hand, there are new and
unknown risks [39] and on the other hand, there are
known, but still unsolved risks [40]. Characteristics of
these known risks include the fact that the degree of
uncertainty is relatively low and the number of risks
occurring is relatively high. This makes it easier to
quantify probability and impact of the considered risks.
New and unknown risks usually occur with the
emergence of new technologies [41-43]. Similarly, for
IT projects, research identified lists of risk factors,
which affect the degree of variation in expected
outcomes [44, 45]. Countermeasures include software
development methodologies or contract design and
coincide with real options [14, 37]. On the other hand,
the example of the Y2K problem was researched only
at a certain point in time and thus researchers cannot
provide a structure they place their research in [39].
Considering all these heterogeneous topics and
different application domains, literature on IS risks is
scattered. Although there have been promising
attempts to classify IS risks [1, 6, 10, 46], research still
demands a holistic perspective on IS risks [5-9].
impact of information systems and their failure, and the
usage of information systems.
Participants provide contributions via the
comp.risks newsgroup and email. The contributions are
reviewed for relevance, soundness, taste, objectivity,
cogence, coherence, conciseness, nonrepetitiousness,
and meeting compliance regulations. Usually between
5 and 50 contributions are summarized within one
issue. The issues are aggregated into volumes of
between 45 and 98 issues. Though the Risks Digest is
published on an irregular basis, 1.65 issues are
published a week on average. The longest volume
lasted 116.4 weeks, the shortest 20.5. One volume
comprises 52.8 weeks on average. We focus on all
risks posted between July 1985 when the first issue
was published and March 2011 when we started this
research. Comprising a total number of 26,050 risk
items, IS risks cover 26 volumes and 2,264 issues.
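The reported averages can be cross-checked with simple arithmetic. The following minimal sketch uses only the figures stated above; the variable names are illustrative and not part of the original study.

```python
# Figures as reported in the text; variable names are illustrative.
volumes = 26
issues = 2264
items = 26050
weeks_per_volume = 52.8  # reported average duration of one volume

total_weeks = volumes * weeks_per_volume   # about 1,373 weeks
issues_per_week = issues / total_weeks     # compare with the reported 1.65
items_per_issue = items / issues           # compare with the 5-50 range per issue

print(f"{issues_per_week:.2f} issues per week, {items_per_issue:.1f} items per issue")
```

Under these figures, the derived rate of issues per week agrees with the reported average, and the mean number of items per issue lies within the stated range of 5 to 50 contributions.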
There have been many analyses of this data set.
However, these focus on single cases [10] and manual
classifications [11]. Others used parts of the risk
collection for empirically validating a developed
taxonomy of cyberspace deception [50] and scenario
analysis [51]. In line with these researchers, we consider
this risk collection unique and the most comprehensive
source for analyzing publicly available IS risk information
[10]. However, since this risk collection is edited by a
single person, the topics included could be biased. As
stated in the Risks Digest’s mission, the editor has a
broad perspective on risks and initial classifications
reveal the heterogeneity of the data set [11].
3. Methodology
In order to classify the risk collection for
understanding the development of IS risks, we adopted
co-word analysis, also referred to as “actor network
analysis” [47]. The co-word analysis is a content
analysis technique that uses co-occurrence patterns for
pairs of items, such as words or noun phrases, in a
corpus of texts. These items are necessary in order to
extract the themes and detect the linkages among
topics directly from the subject content presented in the
texts [47]. The co-word analysis is based on the
assumption that a document’s keywords constitute an
adequate description of its content and that two
keywords co-occurring in the same document indicate
a relationship between its topic and keywords [48].
Co-word analysis is generally conducted in three
steps: extract keyword list, data standardizing and data
mapping [49]. However, to start the analysis we had to
create and clean the risk database first. Having
collected the data and unified it by using text-mining
techniques as described in Section 3.2 “Data
transformation”, we manually selected the keywords
that best describe clusters of IS risks. Afterwards we
standardized the data by constructing a co-occurrence matrix of keywords. In the final step we
mapped the data to create semantic network maps that
revealed the relationship between the chosen keywords
and the clusters for the taxonomy of IS risks.
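The core counting step of co-word analysis can be sketched as follows. This is a minimal illustration, assuming documents are already tokenized; the function name `cooccurrence_counts` and the toy documents are hypothetical and not taken from the study.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(documents, keywords):
    """Count, for each keyword pair, the documents that contain both keywords."""
    keywords = set(keywords)
    pair_counts = Counter()
    for tokens in documents:
        present = sorted(keywords.intersection(tokens))
        # Each unordered pair is counted at most once per document.
        pair_counts.update(combinations(present, 2))
    return pair_counts


# Toy documents for illustration only.
docs = [
    ["password", "hack", "user"],
    ["password", "user", "login"],
    ["train", "signal"],
]
counts = cooccurrence_counts(docs, ["password", "user", "hack", "train", "signal"])
print(counts[("password", "user")])
```

Sorting the keywords found in a document gives every pair a canonical order, so each pair is stored under a single key.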
3.2 Data Transformation
In order to work on the risk collection, we
extracted the data into ‘RapidMiner’ - an environment
combining text, data and web mining, machine
learning, predictive analytics, and business analytics.
By specifying two regular expressions we defined a
region delimiter containing one single IS risk. Then we
cut the extracted web sites into single risks (that we
will refer to as documents) creating our unique risk
database. Each document contains not only a detailed
risk description, but also information about volume,
issue, publish date, subject, and author. Besides risk
contributions, these documents also include calls for
papers and book reviews. We automatically removed
contributions containing the string ‘call for paper’.
However, we kept book reviews, since they summarize
risk books and therefore address important IS risk
topics as well.
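The cutting step can be illustrated with a short sketch. The two regular expressions used in the study are not given, so the delimiter and header patterns below are assumptions, as are the function name `split_issue` and the sample text.

```python
import re

# Assumption: items are separated by a line of at least ten dashes and carry a
# "Subject:" header. The real delimiters used in the study are not given.
ITEM_DELIMITER = re.compile(r"^-{10,}[ \t]*$", re.MULTILINE)
SUBJECT_LINE = re.compile(r"^Subject:[ \t]*(.+)$", re.MULTILINE)


def split_issue(issue_text):
    """Cut one issue into documents and drop calls for papers."""
    documents = []
    for chunk in ITEM_DELIMITER.split(issue_text):
        chunk = chunk.strip()
        if not chunk:
            continue
        # The text reports filtering on the string 'call for paper';
        # matching case-insensitively is an assumption of this sketch.
        if "call for paper" in chunk.lower():
            continue
        match = SUBJECT_LINE.search(chunk)
        subject = match.group(1).strip() if match else None
        documents.append({"subject": subject, "text": chunk})
    return documents


sample_issue = """Subject: ATM card skimming
A contributor reports skimming devices on cash machines.
------------------------------
Subject: Call for Papers, security workshop
Call for papers: submissions are due in May.
------------------------------
Subject: Book review
A short review of a book on computer-related risks.
"""
documents = split_issue(sample_issue)
print([d["subject"] for d in documents])
```

In this sketch the book review is kept and the call for papers is dropped, mirroring the filtering decision described above.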
The first step when handling a text-based database
is breaking the stream of characters into words called
tokens [52]. Subsequently, we transformed all
characters in the documents to lower case and filtered
3.1 The Risks Digest
For this research, we review the largest and oldest
publicly available risk collection on the net, the Risks
Digest.2 It is published by the ACM Committee on
Computers and Public Policy and edited by Peter G.
Neumann. The first contribution appeared on August
1st 1985. Within the Risks Digest, researchers and
practitioners discuss various topics on IS risks. Topics
in this data set include technical security breaches,
2
http://catless.ncl.ac.uk/Risks/
out the stop words by removing every token that equals
one in the built-in stop word list. Afterwards we
discarded tokens consisting of only 1 or more than 15
characters. Once we segmented the character stream
into a sequence of uniform tokens, the next step was to
convert each of the tokens to a standard form as a basis
for all future operations. This process is referred to as
stemming or lemmatization [52]. We used the Porter
stemming algorithm for English words that applies an
iterative, rule-based replacement of word suffixes
intending to reduce the length of the words until a
minimum length of the stem is reached. Examples of
such rules are “y” or “ies” into “i”, “sses” into “ss”, and
“s” into “”. Having processed all documents, our risk
database consisted of 34,318 unique words. The total
frequency of occurrence of these words, including the
repeated use of a given word in one document, is 3.6
million. In addition, we calculated the number of
documents containing it for every unique word.
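The transformation steps described in this section (lower-casing, tokenizing, stop word removal, length filtering, stemming, and document frequency counting) can be sketched in a few lines. This is a simplified illustration and not the RapidMiner process used in the study: the stop word list is a tiny placeholder, and `simple_stem` applies only three suffix rules in the spirit of the Porter stemmer rather than the full algorithm.

```python
import re
from collections import Counter

# Tiny placeholder list; the study used a built-in stop word list.
STOP_WORDS = {"a", "an", "and", "in", "is", "of", "on", "the", "their", "to"}


def simple_stem(token):
    """Apply three illustrative suffix rules (NOT the full Porter algorithm)."""
    if token.endswith("sses"):
        return token[:-2]  # "sses" -> "ss"
    if token.endswith("ies"):
        return token[:-2]  # "ies" -> "i"
    if token.endswith("s") and not token.endswith("ss"):
        return token[:-1]  # "s" -> ""
    return token


def preprocess(text):
    """Lower-case, tokenize, remove stop words, filter by length, and stem."""
    tokens = re.findall(r"[a-z]+", text.lower())
    tokens = [t for t in tokens if t not in STOP_WORDS]
    tokens = [t for t in tokens if 2 <= len(t) <= 15]
    return [simple_stem(t) for t in tokens]


def document_frequency(token_lists):
    """Count, for every unique word, the number of documents containing it."""
    frequencies = Counter()
    for tokens in token_lists:
        frequencies.update(set(tokens))
    return frequencies


stems = preprocess("The users changed their passwords in two policies")
doc_freq = document_frequency([["password", "user"], ["password", "password"]])
print(stems, doc_freq["password"])
```

Counting over `set(tokens)` ensures that repeated use of a word within one document contributes only once to its document frequency.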
word analysis step, we opted for NetDraw to read the
standardized data and create a semantic network map
that presents the analyzed content. Working toward
building the risk clusters, we use the Girvan-Newman
method [53].
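The association strength between two keywords can be illustrated with the Pearson Chi-square statistic of a 2x2 contingency table. The exact normalization used in the study is not specified, so the signed variant below is an assumption: it is positive when two keywords co-occur more often than expected under independence and negative otherwise. The function name is hypothetical.

```python
def signed_chi_square(n_both, n_a, n_b, n_docs):
    """Pearson chi-square of a 2x2 co-occurrence table, signed by direction.

    n_both: documents containing both keywords
    n_a, n_b: documents containing keyword A, keyword B
    n_docs: total number of documents
    """
    a = n_both                        # both keywords
    b = n_a - n_both                  # keyword A only
    c = n_b - n_both                  # keyword B only
    d = n_docs - n_a - n_b + n_both   # neither keyword
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        return 0.0
    chi_square = n_docs * (a * d - b * c) ** 2 / denominator
    sign = 1 if a * d >= b * c else -1
    return sign * chi_square


# Toy counts for illustration only.
always_together = signed_chi_square(n_both=20, n_a=20, n_b=20, n_docs=100)
independent = signed_chi_square(n_both=25, n_a=50, n_b=50, n_docs=100)
never_together = signed_chi_square(n_both=0, n_a=50, n_b=50, n_docs=100)
print(always_together, independent, never_together)
```

Higher positive values indicate stronger relationships between the word pair, which is the property the network map relies on.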
3.5 Data Analysis
By following the methodology depicted above, we
created a semantic network map, containing 40 color-coded clusters based on the co-occurrences between
the keywords. An important factor in our analysis is
the chosen level of Chi-square. Its subjective choice
aims at uncovering the data structure optimally and
showing the IS clusters. In the decision process we
considered the number of depicted nodes and the
percentage of ties that they reveal. Keeping in mind
that maps featuring more than 200 nodes quickly
become unreadable [54], we chose to work with Chi-square
values higher than 180. This removed 35% of our
keywords due to their weak ties. As a result the
semantic network map contains 314 nodes and 1,170
ties (1.4% of all positive ties). However, only the
257 nodes with the strongest ties are used for our
analysis. Although the number is still bigger than 200,
varying the parameters showed that this is the smallest
number of words needed to best describe the potential
IS clusters, without losing any valuable information
with regards to content.
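The thresholding step can be sketched as a simple filter over weighted ties. The tie values below are toy numbers, and the function name `filter_ties` is hypothetical. The final lines check the reported 35% reduction under the assumption that each synonym pair or triple counts as one keyword, giving 432 + 46 + 6 = 484 keyword entries.

```python
def filter_ties(ties, threshold):
    """Keep ties whose association strength exceeds the threshold."""
    strong = {pair: value for pair, value in ties.items() if value > threshold}
    nodes = {keyword for pair in strong for keyword in pair}
    return strong, nodes


# Toy association values, not taken from the study.
ties = {
    ("password", "user"): 950.0,
    ("train", "signal"): 410.0,
    ("user", "train"): 12.5,
    ("joke", "fool"): 181.0,
}
strong, nodes = filter_ties(ties, threshold=180)

# Cross-check of the reported reduction (assumes 484 keyword entries).
keyword_entries = 432 + 46 + 6
removed_share = 1 - 314 / keyword_entries
print(len(strong), len(nodes), round(removed_share, 2))
```

Keywords whose ties all fall below the threshold disappear from the map, which is how the node count drops from the full keyword list to 314.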
We depicted our results in a network map. The
size of the nodes was chosen proportional to the
number of documents containing the word and the
thickness of the edges to the strength of co-occurring
ties. The length of the edges, the nodes’ location in the
two-dimensional space, and the color of the clusters
were used for illustration purposes. However, closely
related words are positioned near each other.
We use the modularity Q, calculated by the
Girvan-Newman community structure algorithm, as the
quality criterion for our network map [55]. Values
greater than Q = 0.3 appear to indicate significant
community structure. In our case Q was 0.824,
indicating that our network map is acceptable and
providing us with relationships for further discussions.
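For an unweighted network, the modularity of a given partition can be computed as Q = Σ_c [e_c / m − (d_c / 2m)²], where m is the number of ties, e_c the number of ties inside cluster c, and d_c the summed degree of its nodes. The sketch below implements this standard formula on a toy graph; whether the study used a weighted variant is not stated, so the unweighted form is an assumption.

```python
def modularity(edges, community_of):
    """Modularity Q of a partition of an unweighted, undirected graph."""
    m = len(edges)
    internal = {}  # ties with both endpoints in the same community
    degree = {}    # summed node degree per community
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree[cu] = degree.get(cu, 0) + 1
        degree[cv] = degree.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(
        internal.get(c, 0) / m - (degree[c] / (2 * m)) ** 2 for c in degree
    )


# Toy graph: two triangles joined by a single tie.
edges = [(1, 2), (2, 3), (1, 3), (4, 5), (5, 6), (4, 6), (3, 4)]
two_groups = {1: "A", 2: "A", 3: "A", 4: "B", 5: "B", 6: "B"}
one_group = {node: "A" for node in range(1, 7)}
q_split = modularity(edges, two_groups)
q_single = modularity(edges, one_group)
print(round(q_split, 3), q_single)
```

In this toy case the two-cluster split scores above the 0.3 guideline mentioned in the text, while placing all nodes in one cluster yields Q = 0.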
After examining the results of the network map we
developed a hierarchical categorization of IS risks
reducing the initial 40 clusters to 33. We combined
three pairs due to their thematic similarity and
excluded four because they either contained risk
synonyms (e.g. “error and mistake”) or were
consequences of IS risks (e.g. “death, injury” and
“disaster, recovery”). Afterwards we thematically
assigned the remaining 33 clusters to the 10 categories.
The classification is based on the cluster’s content and
was manually conducted by the first and second author.
3.3 Keywords selection
We selected relevant keywords from the unique
word list to understand IS risk antecedents,
characteristics, and developments. Since we used
manual classification, we reduced the total of more
than 34,000 words to a more manageable number. We
chose the 10% of words that occurred most often in
the risk collection. These words included ‘risk’, which
was mentioned 14,631 times, and ‘diskette’, which was
mentioned 100 times. Subsequently, we reviewed and
reduced these keywords by adhering to the
classification rules depicted in appendix A. Once we
selected all relevant IS risk words, we merged all
synonyms to avoid spuriously strong
relationships between them. As a result our final
keyword list contains 432 single words, 46 pairs of
double words (e.g. “automobile and car”), and 6 triples
(e.g. “airplane, aircraft, and plane”), which are used as
key words for developing IS risk categories.
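The frequency cut and the synonym merge can be sketched as follows. The function names and most frequencies are illustrative; only the counts for ‘risk’ and ‘diskette’ and the two synonym groups come from the text. The manual review against the classification rules is not modeled here.

```python
def top_fraction(frequencies, fraction=0.10):
    """Return the most frequent words, keeping the given fraction of the list."""
    ranked = sorted(frequencies.items(), key=lambda item: (-item[1], item[0]))
    keep = max(1, int(len(ranked) * fraction))
    return [word for word, _ in ranked[:keep]]


def merge_synonyms(tokens, synonym_groups):
    """Map every synonym to the first word of its group."""
    canonical = {}
    for group in synonym_groups:
        for word in group:
            canonical[word] = group[0]
    return [canonical.get(token, token) for token in tokens]


# 'risk' and 'diskette' counts are from the text; the rest are toy values.
frequencies = {
    "risk": 14631, "diskette": 100, "zebra": 3, "quilt": 2, "fjord": 2,
    "gnome": 1, "plum": 1, "kite": 1, "moss": 1, "yarn": 1,
}
selected = top_fraction(frequencies)

groups = [["automobile", "car"], ["airplane", "aircraft", "plane"]]
merged = merge_synonyms(["car", "plane", "aircraft", "bus"], groups)
print(selected, merged)
```

Merging synonyms before building the co-occurrence matrix prevents words such as “automobile” and “car” from forming a strong but uninformative tie with each other.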
3.4 Data standardizing and data mapping
In co-word analysis, once a research subject is
selected, a matrix based on the word co-occurrence is
built [53]. This matrix depicts the observed frequencies
of all selected keywords in a cross tabulation form.
Each value of a cell of two words is determined by the
times these two words both appear in the same
document. In order to calculate the association strength
between word pairs we use a normalized statistical
coefficient based on Chi-square analysis for the
relationship between qualitative variables. Here,
higher positive values are associated with stronger
relationships between the word pairs. For the last co-
Therefore we determined the size of each cluster
following the two principles:
• A document belongs to a given category if it
resides in at least one of the clusters in this
category.
• A document belongs to a given cluster if it
contains at least 2 of its keywords.
The second principle certainly undervalues the
size of clusters containing a small number of keywords,
but at the same time it increases our precision and the
quality of the results. As a consequence we classified
20,807 of all 26,050 documents, each of which falls
into one or more clusters (mean = 2). The remaining
5,243 documents are not classified because they do not
contain any of the wanted word combinations.
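The two assignment principles translate directly into set operations. The sketch below is a minimal illustration; the function name `classify` is hypothetical, and the cluster keywords are example words taken from Figure 2 rather than the complete keyword lists.

```python
def classify(document_tokens, clusters, categories, min_keywords=2):
    """Assign a document to clusters and categories using the two principles."""
    tokens = set(document_tokens)
    # A document belongs to a cluster if it contains at least 2 of its keywords.
    doc_clusters = {
        name
        for name, keywords in clusters.items()
        if len(tokens & set(keywords)) >= min_keywords
    }
    # A document belongs to a category if it resides in at least one of its clusters.
    doc_categories = {
        category
        for category, members in categories.items()
        if doc_clusters & set(members)
    }
    return doc_clusters, doc_categories


# Example words from Figure 2; not the complete keyword lists.
clusters = {
    "Cybercrimes": {"hack", "crack", "login", "password", "user"},
    "Railway risks": {"metro", "train", "rail", "signal"},
}
categories = {
    "Internet-related risks": {"Cybercrimes"},
    "Transportation-related risks": {"Railway risks"},
}
doc_clusters, doc_categories = classify(
    ["user", "password", "train", "late"], clusters, categories
)
print(doc_clusters, doc_categories)
```

The toy document mentions “train” only once among the railway keywords, so it is assigned to the cybercrime cluster alone. A document matching no cluster stays unclassified, as is the case for the 5,243 documents mentioned above.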
4. Results
Our analysis identified 257 topics of IS risks
within the Risks Digest database. Figure 1 provides an
overview of the topics we found and their interrelations.
Each bubble represents the topic as labeled. The size of
the bubble represents the number of documents within
the database. The interrelations are represented through
ties between bubbles. The size of the tie indicates the
strength of co-occurring interrelations.
The most prominent group of topics (represented
as red bubbles in the center of figure 1) comprised IT
security issues. The Risks Digest database comprises
articles on password safety, unauthorized access,
hacking, breaking into systems, and securing sessions.
In the context of fraud (black bubbles in the top right
hand corner of figure 1), credit-card fraud, theft,
involved parties and systems, and ATMs are discussed.
Concerning transportation related IS risks, the topics
within the Risks Digest database (green bubbles on the
left hand side of figure 1) are centered around
operational failure of information systems, various
transportation alternatives on land, water, and air,
and the consequences of such failures. Finally, IS risks
related to communication (purple bubbles in the middle
of figure 1) concern the Internet, e-mail providers,
technical infrastructure, and types of communication
threats.
Figure 1: Co-word semantic network map
To provide an overview of the risks that are
mentioned most often within the database, we
classified all topics into clusters and built categories
depending on the content of the topic. Figure 2
visualizes the developed classification of IS risks. Due
to the ambiguity of certain words, we considered the
full topic description to classify it to a certain category.
For example, network issues referred to
communication between hardware and addressed
the technical connection between devices. We
therefore categorized them as computer-related risks.
Another example is the classification of social media-related
risks. Although such risks could be classified as
content-related risks, we categorized social media risks
as Internet-related risks since the discussion focused on
technical implementations on the Internet.
The transportation-related IS risk category is
heavily discussed (8,056). Automotive risks (2,470) are
interconnected with different “vehicles” (693) such as
“automobile” (1,580), “motor” (337) and “truck”
(197). Among their most common causes are “GPS”
(336) and “brake” (292) problems as well as improper
“speed” (982). Railway risks (1,030) are the least
common in this category. “Trains” (1,304) are
considered a safe mode of transportation that reduces
pollution levels and traffic congestion. The cluster does
not contradict this assumption but still points out
some problems. “Track” (1,392), “signal” (1,039) and
“radio” (1,087) are the keywords with highest
frequency, which leads us to the conclusion that the
operational problems are predominant in this area.
Aviation risks (6,238) represent the biggest and most
clearly defined cluster. It includes civil aviation
incidents such as “autopilot” (133) malfunction,
“engine” (2,867) “failure” (3,936) or “traffic” (1,209)
control. Common causes are “navigation” (418) and
“weather” (271) problems. Unfortunately, they have
often ended with a “crash” (1,368) or “collision” (290).
Internet-related risks include 7,587 documents and
are thus one of the biggest categories of IS risks. They
consist of well-known computer threats and general
risks of online presence. Cybercrimes (4,862) refer to
security problems such as “hacks” (1,580) and
“cracks” (634). Among other common concerns are
stolen “passwords” (1,282) and unauthorized “access”
(3,076) to “user” (3,946) information. The social media
cluster (147) is labeled based on only two keywords:
“post” (2,696) and “Usenet” (352). Usenet is a
worldwide distributed Internet discussion system that
was established in 1980. Users communicate
interactively by posting messages in categories known
as newsgroups. Web browsing (4,438) is an important
part of modern life. However, everyone should be
aware that even simple activities like visiting
“websites” (2,734) and using browsers such as
“Internet Explorer” (228), “Netscape” (269) or
checking e-mails with “Microsoft’s” (1,157) “outlook”
(144) bear some risks.
Information systems risks
(IS risk categories with their IS risk clusters; example words in parentheses)

Content-related risks:
  Web content (google, blog, php, privacy, typo)
  Adult content (porn, pornography, sex, sexual)
  Broadcasting media (cable, video, tv, gamble, game)
Internet-related risks:
  Cybercrimes (hack, crack, login, password, user)
  Social media (usenet, post)
  Web browsing (SSL, java, IE, netscape, ISP, site)
Communication-related risks:
  Electronic messaging (mail, phish, scam, spam, AOL)
  Communication systems (phone, denial, service, laptop)
  Unethical behaviour (fool, joke)
Power supply-related risks (atom, plant, water, nuclear)
Health-related risks (drug, cancer, hazard, blood)
Research-related risks (lab, research, science)
Computer-related risks:
  Software engineering (ada, fortran, cpu, code, chip)
  Operational systems (linux, mac, os, windows, vm)
  Network security (network, ip, router, tcp)
  Data security (PGP, RSA, encryption, decryption)
  Computer networking (ARPA, collapse, BITNET, net)
  Mainframe computers (IBM, mainframe)
  Data representation (ASCII, printer, postscript)
  Malicious software (trojan, worm, virus, disk)
Government-related risks:
  Aeronautics (NASA, orbit, space, rocket)
  e-Government (biometric, vote, elect, passport)
  Freedom of speech (censorship, freedom, speech)
  Military technology (DOD, SDI, ship, weapon)
  Terrorism (terror, terrorist)
  Law enforcement (eavesdrop, FBI, wiretap)
  Intellectual property (copyright, cd, music, amazon)
Finance-related risks:
  Fraud and identity theft (ATM, PIN, id, credit, card)
  Stock market (market, price, stock)
  Account ownership (registration, fee)
Transportation-related risks:
  Automotive risks (GPS, speed, motor, truck)
  Railway risks (metro, train, rail jam, signal)
  Aviation risks (airplane, fail, engine, crash)

Figure 2: Classification of IS risks
Government-related risks are another large
category (5,937) that includes clusters with strong
dependencies on governmental structures and
regulations. They are all either financed or legally
regulated by the government, which is the means by
which a policy is enforced. Therefore it can be referred
to as the source of all related risks and problems.
Aeronautics (615) focuses on the design and
manufacturing of flight-capable machines. We separate
this cluster from the risks related to commercial
aviation. The most common words that describe it are
“space” (1,235), “satellite” (451), “tank” (214),
“rocket” (167) and “orbit” (154). Possible causes of
problems relate to “fire” (995) and “tank” (214) that
often lead to “explosion” (288). The only organization
that falls into this cluster and had to handle the
reliability and safety problems that occurred over the
years is “NASA” (524). e-Government (1,082) is a
thematically defined cluster containing two
independent subgroups. The first, e-Passport, is defined
by “biometric” (112) and “passport” (126). The
second, e-Voting, features “electronic” (2,826), “vote”
(957), “president” (875) and “election” (740). Freedom
of speech (112) is yet another small cluster without any
external ties. It consists of “freedom” (398), “speech”
(320) and “censorship” (221). On a lower level of Chi-square, this cluster connects with the adult content
risks and uncovers concerns in this field. Military
technology (1,208) consists mainly of governmental
institutions such as Department of Defense (198) as
well as the technology-driven facilities and equipment
at their disposal (e.g. “ship” (549) and “helicopter”
(107)). Terrorism (59) is a small but important cluster
because it concerns human life. Law enforcement
(2,960) combines consequences for criminal acts
related to IS (e.g. “jail” (255)) and mentions concerns
about their possible misuse by governmental
authorities (“eavesdrop” (158), “wiretap” (202), and
“FBI” (682)). Intellectual property (1,035) is
thematically built from two small clusters. The first
one addresses problems with digital property rights
(“cd” (338)) in the “music” (180) industry. The second
one reveals general concerns about “copyright” (784).
It is often associated with the keywords “book”
(1,796), “deal/dealer” (1,720) and “amazon” (254).
5. Discussion
In this section we will discuss the results of our
analysis of the Risks Digest database using text mining
techniques to understand the development of IS risks,
their characteristics, and interdependencies. We discuss
our results regarding (1) the temporal development of
IS risks, (2) the implications for IS risk mitigation, and
(3) the increasing complexity of IS risks.
We (1) found that IS risks develop over time. The
Girvan-Newman algorithm, which we used to identify
groups of related keywords, is a form of divisive
hierarchical clustering [55]. Figure 3 visualizes the
results of our analysis of the temporal development of
IS risks. The figure provides an initial overview of the
development of each IS risk category as developed
above. This exploratory analysis reveals that
some IS risks are dynamic and tend to reoccur over and
over again. For instance, phenomena like Facebook
and Twitter have caused many discussions about
privacy and security over the past few years, but such
concerns already alarmed people between 1991
and 2000. In contrast, there were many IS risks that are
certainly unique. Consider the example of the Y2K
problem: Being a risk concerning only one point in
time and resulting in fundamental redesign of
information systems, the information on this risk
within the database is very special and will probably
not provide any guidance for future IS risks.
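The timeline underlying such a temporal analysis can be derived by counting classified documents per year and category. The sketch below assumes each document carries a publication year and its assigned categories; the field names and toy records are hypothetical.

```python
from collections import Counter


def yearly_category_counts(documents):
    """Count documents per (year, category); a document may count in several categories."""
    counts = Counter()
    for document in documents:
        for category in document["categories"]:
            counts[(document["year"], category)] += 1
    return counts


# Toy records for illustration only.
documents = [
    {"year": 1988, "categories": {"Computer-related risks"}},
    {"year": 1988, "categories": {"Computer-related risks", "Internet-related risks"}},
    {"year": 1999, "categories": {"Computer-related risks"}},
]
timeline = yearly_category_counts(documents)
print(timeline[(1988, "Computer-related risks")])
```

Peaks in such a timeline can then be compared across decades to see which categories of risk reoccur and which appear only once.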
[Figure 3: number of documents per year (y-axis, 0 to 800) for each
IS risk category from 1985 to 2011 (x-axis). Annotated events: worm
attacks, end of cold war, commercialisation of the internet, IT-bubble
burst, attacks of 9/11. Series: computer-related, transportation-related,
communication-related, research-related, government-related,
content-related, power supply-related, finance-related, Internet-related,
and health-related risks.]
Figure 3: Discussion timeline of IS risk categories
Regarding (2) the implications for IS risk
mitigation, our analysis shows that many consequences
relate logically to the identified clusters (e.g. health-related
risks and “injury”, or cybercrimes and physical
“breach”). Future IS risk mitigation strategies can be
identified through a historical analysis of causal
relationships of the visualized peaks in figure 3. This
detailed database will provide the opportunity to learn
from previous mistakes by providing IS risk factors,
characteristics, consequences, and mitigation strategies
of previously assessed IS risks that have
characteristics similar to those of future IS risks.
Our results (3) clearly indicate an increase in the
complexity of IS risks. The temporal analysis showed
that ties between risks from different categories
increased by a factor of 1.8. To the best of the authors’
knowledge, a similar text-mining based analysis of IS
risks has not been conducted before. Yet, there have
been many theoretically and empirically driven
classifications of IS risks [e.g., 1, 6,
10]. Reflecting on these classifications, we find that
existing classifications indicate the need for further
details on existing IS risks (e.g., as in IS project risk
literature). In contrast, our results imply that due to the
rising interdependencies between IS risks, a holistic
view would help structure, assess, and mitigate today’s
and future IS risks more adequately.
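The growth in cross-category ties can be illustrated with a minimal sketch. It assumes that a "tie" is a pair of keywords that co-occur in one post and belong to different categories; the paper does not spell out the exact counting rule, so this reading, the function names, and the toy data are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def cross_category_ties(documents, category_of):
    """Count co-occurring keyword pairs whose categories differ (assumed definition)."""
    ties = Counter()
    for keywords in documents:
        # Keep only keywords that were assigned to a category.
        known = sorted({k for k in keywords if k in category_of})
        for a, b in combinations(known, 2):
            if category_of[a] != category_of[b]:
                ties[(a, b)] += 1
    return ties


def growth_factor(early_docs, late_docs, category_of):
    """Ratio of cross-category ties in the later period to the earlier one."""
    early_n = sum(cross_category_ties(early_docs, category_of).values())
    late_n = sum(cross_category_ties(late_docs, category_of).values())
    if early_n == 0:
        raise ValueError("no cross-category ties in the early period")
    return late_n / early_n


# Hypothetical toy data, not taken from the Risks Digest.
category_of = {
    "virus": "computer",
    "worm": "computer",
    "patient": "health",
    "bank": "finance",
    "phishing": "internet",
}
early = [{"virus", "worm"}, {"virus", "bank"}, {"patient"}]
late = [{"virus", "bank"}, {"phishing", "bank"}, {"patient", "worm"}, {"worm", "virus"}]

factor = growth_factor(early, late, category_of)
print(factor)
```

A raw ratio of this kind also grows with the number of posts per period, so comparing periods of unequal size would likely call for normalizing by the total number of ties.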
6. Implications and Limitations
Our analysis provides an overview of the most
prominent IS risks of the past 25 years that have been
discussed in the Risks Digest database, the oldest and
most comprehensive list of IT-related risks. In contrast
to other research, it accounts for the historical
development of IT risks from initially technology-related
(e.g., transportation) to rather socio-technical
(e.g., social networks) issues.
Implications for IS risk research are threefold: (1)
The temporal analysis reveals the growing complexity
and interconnection of IS risks and demands a holistic
perspective on them. (2) The finding that risks develop
differently indicates that it will be interesting to
differentiate between controllable and non-controllable
IS risks. Finally, (3) the developed classification can
guide further research on IS risks. Researchers can
use it to structure research endeavors and to delimit the
range in which their findings are pertinent. It further
allows transferring solutions to other IS risk topics that
either belong to the same category or have similar
characteristics.
Practitioners benefit from this research in two
ways. Professionals in charge of IS risk assessment
gain additional structure, on the one hand, to
identify potential IS risks and, on the other hand, to
classify and confine the IS risks they are confronted with
in their daily business.
We acknowledge that there are limitations to our
study. This research presents a tentative analysis and
must be seen in its context. An excerpt of the IS risks
has already been classified by Neumann [56].
However, this research extends the existing
classification by using learning algorithms and
predictive analytics to achieve completeness and
hierarchical structure. Classifying the whole data set
reveals various levels of IS risks and countermeasures.
We find IS risks closely related to work systems and
their mastery [1]. Further research could address the
relationship between system maturity and coping with
IS risks. Considering the development of the data set,
several biases need to be taken into account. First, the
list was established, initially filled, and is still
moderated by a single person with a certain perspective
on IS risks. Though the moderator uses a review board
for considering certain posts, the data might still be
biased. Second, our current methodology neglects
emerging IS risks because they are not yet discussed as
commonly as the already established ones.
Regarding the co-word analysis, the quality of results
depends on a variety of factors, such as keywords, the
scope of the database, and method adequacy for
simplifying and representing the findings [57]. We
carefully selected keywords and implemented
classification algorithms, but in a next step we need to
evaluate our classification using inter-coder reliability
and other algorithms. We provide an initial analysis of
the relative consequences of IS risks and their
countermeasures. We provide a basis for researching
empirical historical IS risks and outline worthwhile
avenues for further research.
7. Conclusion
In this paper, we conducted a long-term co-word analysis
of the oldest and most comprehensive IS risk database
available. After selecting keywords describing the IS
risks, we constructed a semantic map that helped us to
identify the core IS risk categories and clusters. We
find that IS risk complexity increases and that IS risks
develop over time. We derive implications for IS risk
mitigation by drawing on organizational learning. Our
analysis leads the way for using this comprehensive
documentation of IS risks for organizational learning
by providing knowledge on similar past IS risks and
their mitigation for future IS risks.
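The counting step that underlies a co-word map can be sketched as follows. The sketch uses the equivalence index, a normalization commonly used in co-word analysis; whether this study used that particular measure is an assumption, and the function names and sample posts are hypothetical.

```python
from collections import Counter
from itertools import combinations


def coword_counts(documents):
    """Return per-keyword occurrence counts and pairwise co-occurrence counts."""
    occurrences = Counter()
    cooccurrences = Counter()
    for keywords in documents:
        unique = sorted(set(keywords))  # count each keyword once per document
        occurrences.update(unique)
        cooccurrences.update(combinations(unique, 2))
    return occurrences, cooccurrences


def equivalence_index(occurrences, cooccurrences):
    """Normalize each tie: c_ij^2 / (c_i * c_j), a value between 0 and 1."""
    return {
        (a, b): c * c / (occurrences[a] * occurrences[b])
        for (a, b), c in cooccurrences.items()
    }


# Hypothetical posts, each reduced to its selected keywords.
posts = [
    ["virus", "email", "breach"],
    ["virus", "email"],
    ["pacemaker", "injury"],
    ["breach", "email"],
]
occ, cooc = coword_counts(posts)
strengths = equivalence_index(occ, cooc)
for pair, strength in sorted(strengths.items()):
    print(pair, round(strength, 2))
```

The resulting weighted pairs form the edges of a keyword network; a community detection step such as the one cited in [55] could then group them into clusters.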
8. References
[1] S. Alter and S. A. Sherer, "A general, but readily
adaptable model of information system risk,"
Communications of the AIS, vol. 14, pp. 1-28, 2004.
[2] J. Goldstein and A. Chernobai, "An Event Study
Analysis of the Economic Impact of IT Operational Risk
and its Subcategories," Journal of the Association for
Information Systems, vol. 12, pp. 606-631, 2011.
[3] P. G. Armour, "Sarbanes-Oxley and Software Projects,"
CACM, vol. 48, pp. 15-17, 2005.
[4] F. Caldwell, T. Scholtz, and J. Hagerty, "Magic
Quadrant for Enterprise Governance, Risk and
Compliance Platforms," Gartner, Stamford, CT, 2011.
[5] M. Parent and B. H. Reich, "Governing information
technology risk," California Management Review, vol.
51, pp. 134-152, 2009.
[6] S. Sharma and G. Dhillon, "IS risk analysis: a chaos
theoretic perspective," Issues in Information Systems,
vol. X, pp. 552-650, 2009.
[7] H. A. Smith and J. D. McKeen, "Developments in
Practice XXXIII: A Holistic Approach to Managing
IT-based Risk," Communications of the Association for
Information Systems, vol. 25, Article 41, 2009.
[8] D. W. Hubbard, The Failure of Risk Management.
Hoboken, NJ: John Wiley & Sons, 2009.
[9] R. K. Rainer, C. A. Snyder, and H. H. Carr, "Risk
Analysis for Information Technology," Journal of
Management Information Systems, vol. 8, pp. 129-147,
1991.
[10] P. G. Neumann, "Reviewing the Risks Archives,"
CACM, vol. 38, 1995.
[11] P. G. Neumann, "Illustrative risks to the public in the
use of computer systems and related technology," ACM
SIGSOFT Software Engineering Notes, vol. 19, pp. 169, 1994.
[12] J. G. March and Z. Shapira, "Managerial Perspectives
on Risk and Risk Taking," Management Science, vol.
33, pp. 1404-1418, 1987.
[13] F. H. Knight, Risk, Uncertainty and Profit. Washington,
DC, USA: BeardBooks, 2002.
[14] B. Boehm, "Software risk management: Principles and
practices," IEEE Software, vol. 8, pp. 32-41, 1991.
[15] F. Heemstra and R. Kusters, "Dealing with risk: A
practical approach," Journal of Information Technology,
vol. 11, pp. 333-346, 1996.
[16] R. Charette, "The mechanics of managing IT risk,"
Journal of Information Technology, vol. 11, pp. 373-378,
1996.
[17] H. Barki, S. Rivard, and J. Talbot, "An Integrative
Contingency Model of Software Project Risk
Management," Journal of Management Information
Systems, vol. 17, pp. 37-69, 2001.
[18] H. Barki, S. Rivard, and J. Talbot, "Toward an
Assessment of Software Development Risk," Journal of
Management Information Systems, vol. 10, pp. 203-225,
1993.
[19] R. L. Baskerville and J. Stage, "Controlling Prototype
Development Through Risk Analysis," MIS Quarterly,
vol. 20, pp. 481-504, 1996.
[20] K. D. Loch, H. H. Carr, and M. E. Warkentin, "Threats
to Information Systems: Today's Reality, Yesterday's
Understanding," MIS Quarterly, vol. 16, pp. 173-186,
1992.
[21] R. J. Kauffman and R. Sougstad, "Risk Management of
Contract Portfolios in IT Services: The Profit-at-Risk
Approach," Journal of Management Information
Systems, vol. 25, pp. 17-48, 2008.
[22] H. Tanriverdi and T. W. Ruefli, "The Role of
Information Technology in Risk/Return Relations of
Firms," Journal of the Association for Information
Systems, vol. 5, pp. 421-447, 2004.
[23] J. Ropponen and K. Lyytinen, "Can software risk
management improve system development: An
exploratory study," European Journal of Information
Systems, vol. 6, pp. 41-41, 1997.
[24] J. H. Iversen, L. Mathiassen, and P. A. Nielsen,
"Managing Risk in Software Process Improvement: An
Action Research Approach," MIS Quarterly, vol. 28, pp.
395-433, 2004.
[25] A. Mursu, K. Lyytinen, H. A. Soriyan, and M. Korpela,
"Identifying software project risks in Nigeria: an
International Comparative Study," European Journal of
Information Systems, vol. 12, pp. 182-194, 2003.
[26] H. K. Jain, M. R. Tanniru, and B. Fazlollahi, "MCDM
Approach for Generating and Evaluating Alternatives in
Requirement Analysis," Information Systems Research,
vol. 2, pp. 223-239, 1991.
[27] D. W. Straub and R. J. Welke, "Coping With Systems
Risk: Security Planning Models for Management
Decision Making," MIS Quarterly, vol. 22, pp. 441-469,
1998.
[28] R. Schmidt, K. Lyytinen, M. Keil, and P. Cule,
"Identifying Software Project Risks: An International
Delphi Study," Journal of Management Information
Systems, vol. 17, pp. 5-36, 2001.
[29] P. A. Pavlou and D. Gefen, "Psychological Contract
Violation in Online Marketplaces: Antecedents,
Consequences, and Moderating Role," Information
Systems Research, vol. 16, pp. 372-399, 2005.
[30] M. Keil, B. C. Y. Tan, K.-K. Wei, T. Saarinen, V.
Tuunainen, and A. Wassenaar, "A Cross-Cultural Study
on Escalation of Commitment Behavior in Software
Projects," MIS Quarterly, vol. 24, pp. 299-325, 2000.
[31] S. R. Nidumolu, "A Comparison of the Structural
Contingency and Risk-Based Perspectives on
Coordination in Software-Development Projects,"
Journal of Management Information Systems, vol. 13,
pp. 77-113, 1996.
[32] E. K. Clemons, M. C. Row, and M. E. Thatcher,
"Identifying Sources of Reengineering Failures: A
Study of the Behavioral Factors Contributing to
Reengineering Risks," Journal of Management
Information Systems, vol. 12, pp. 9-36, 1995.
[33] E. D. Hahn, J. P. Doh, and K. Bunyaratavej, "The
Evolution of Risk in Information Systems Offshoring:
The Impact of Home Country Risk, Firm Learning, and
Competitive Dynamics," MIS Quarterly, vol. 33, pp.
597-616, 2009.
[34] A. I. Nicolaou and D. H. McKnight, "Perceived
Information Quality in Data Exchanges: Effects on Risk,
Trust, and Intention to Use," Information Systems
Research, vol. 17, pp. 332-351, 2006.
[35] R. Willison and J. Backhouse, "Opportunities for
computer crime: considering systems risk from a
criminological perspective," European Journal of
Information Systems, vol. 15, pp. 403-414, 2006.
[36] T. Dinev and P. Hart, "An Extended Privacy Calculus
Model for E-Commerce Transactions," Information
Systems Research, vol. 17, pp. 61-80, 2006.
[37] M. Benaroch, Y. Lichtenstein, and K. Robinson, "Real
Options in Information Technology Risk Management:
An Empirical Validation of Risk-Option Relationships,"
MIS Quarterly, vol. 30, pp. 827-864, 2006.
[38] T. Herath and H. R. Rao, "Protection motivation and
deterrence: a framework for security policy compliance
in organisations," European Journal of Information
Systems, vol. 18, pp. 106-125, 2009.
[39] B. Zmud, "The Year 2000 Problem: A Laboratory for
MIS Research," MIS Quarterly, vol. 21, pp. iii-vi, 1997.
[40] A. Y. Du, X. Geng, R. Gopal, R. Ramesh, and A. B.
Whinston, "Capacity Provision Networks: Foundations
of Markets for Sharable Resources in Distributed
Computational Economies," Information Systems
Research, vol. 19, pp. 144-160, 2008.
[41] M. Alavi and I. R. Weiss, "Managing the Risks
Associated with End-User Computing," Journal of
Management Information Systems, vol. 2, pp. 5-20,
1985.
[42] E. K. Clemons and G. U. Bin, "Justifying Contingent
Information Technology Investments: Balancing the
Need for Speed of Action with Certainty Before
Action," Journal of Management Information Systems,
vol. 20, pp. 11-48, 2003.
[43] R. Y. Arakji and K. R. Lang, "Digital Consumer
Networks and Producer--Consumer Collaboration:
Innovation and Product Development in the Video
Game Industry," Journal of Management Information
Systems, vol. 24, pp. 195-219, 2007.
[44] S. Alter and S. A. Sherer, "Information system risks and
risk factors: Are they mostly about information
systems?," Communications of the AIS, vol. 14, pp.
29-64, 2004.
[45] A. Tiwana and M. Keil, "Functionality risk in
information systems development: An empirical
investigation," IEEE Transactions on Engineering
Management, vol. 53, pp. 412-425, 2006.
[46] M. Carr, S. Konda, I. Monarch, F. Ulrich, and C.
Walker, "Taxonomy-Based Risk Identification,"
Software Engineering Institute, Pittsburgh, 1993.
[47] Q. He, "Knowledge Discovery Through Co-Word
Analysis," Library Trends, vol. 48, pp. 133-159, 1999.
[48] A. Cambrosio, C. Limoges, J. P. Courtial, and F.
Laville, "Historical scientometrics? Mapping over 70
years of biological safety research with coword
analysis," Scientometrics, vol. 27, pp. 119-143, 1993.
[49] M. Rokaya, E. Atlam, M. Fuketa, T. C. Dorji, and J.-i.
Aoe, "Ranking of field association terms using Co-word
analysis," Information Processing & Management, vol.
44, pp. 738-755, 2008.
[50] N. Rowe, "A taxonomy of deception in cyberspace," in
International Conference in Information Warfare and
Security, Princess Anne, MD, 2006, pp. 173-181.
[51] M. Slaymaker, E. Politou, D. Power, S. Lloyd, and A.
Simpson, "Security Aspects of Grid-based Digital
Mammography," Methods Inf Med, vol. 44, pp. 207-10,
2005.
[52] S. Weiss, N. Indurkhya, and T. Zhang, Fundamentals of
Predictive Text Mining. London: Springer, 2010.
[53] Y. Ding, G. G. Chowdhury, and S. Foo, "Bibliometric
cartography of information retrieval research by using
co-word analysis," Information Processing &
Management, vol. 37, pp. 817-842, 2001.
[54] P. Bourret, A. Mogoutov, C. Julian-Reynier, and A.
Cambrosio, "A New Clinical Collective for French
Cancer Genetics: A Heterogeneous Mapping Analysis,"
Science Technology and Human Values, vol. 31, pp.
431-464, 2006.
[55] M. E. J. Newman, "Fast algorithm for detecting
community structure in networks," Physical Review E,
vol. 69, 2004.
[56] P. G. Neumann, Computer related risks. New York, NY:
Addison-Wesley, 1995.
[57] J. Law, S. Bauin, J.-P. Courtial, and J. Whittaker,
"Policy and the mapping of scientific change: A
co-word analysis of research into environmental
acidification," Scientometrics, vol. 14, pp. 251-264,
1988.
Appendix A: Keyword selection criteria

| Word type or meaning | Include words: description | Include words: example | Exclude words: description | Exclude words: example |
|---|---|---|---|---|
| Nouns | Related to IS | Algorithm, Chip | General | Star, Chair, Lesson |
| Verbs | No verbs | | All verbs | Explain, Present |
| Adjectives | Related to IS | Binary, Cellular | General | Slow, Fair, Nice |
| Conversions from verbs and adjectives to nouns | Related to IS | Charge, Crash, Trace, Fake, Defect | General | Turn, Order, Record |
| Adverbs | No adverbs | | All adverbs | Possibly, Easily |
| Names | Companies and Products | IBM, Microsoft, Outlook, Netscape | Personal | Eric, Kevin, Frank |
| Acronyms | Technical acronyms | PGP, RSA, SMTP, NT, NASA, ARPA | Abbreviations | Etc., Mr., Jr., Dr. |
| Countries, cities, nationalities | None of those | - | All of those | China, London, American |
| Outliers (frequency of occurrence > 6,500) | No outliers | - | All outliers | Risk, System, Time, Problem |
| Sources of information | None of those | - | All of those | Reuters, BBC, MIT |
| Unessential words (see explanation below) | No unessential words | - | All unessential words | Data, Software, Company, Address |
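
The selection criteria above can be read as a sequence of filters. The following is a minimal sketch under several assumptions: part-of-speech tags and the "related to IS" judgment are supplied by the caller (the appendix treats the latter as a manual decision), the exclusion list contains only the appendix's example words, and all names in the code are hypothetical.

```python
OUTLIER_THRESHOLD = 6500  # appendix: frequency of occurrence above 6,500
EXCLUDED_POS = {"verb", "adverb"}  # appendix: all verbs and all adverbs are excluded

# Illustrative exclusion list built only from the appendix examples
# (personal names, abbreviations, places, sources, unessential words).
EXCLUDED_WORDS = {
    "eric", "kevin", "frank",
    "etc.", "mr.", "jr.", "dr.",
    "china", "london", "american",
    "reuters", "bbc", "mit",
    "data", "software", "company", "address",
}


def select_keywords(candidates, frequencies):
    """Keep candidates that pass every exclusion rule.

    candidates: iterable of (word, pos, related_to_is) tuples, where the
    related_to_is flag is a manual judgment supplied by the caller.
    frequencies: mapping from lower-cased word to its corpus frequency.
    """
    selected = []
    for word, pos, related_to_is in candidates:
        key = word.lower()
        if pos in EXCLUDED_POS:
            continue
        if key in EXCLUDED_WORDS:
            continue
        if frequencies.get(key, 0) > OUTLIER_THRESHOLD:
            continue
        if not related_to_is:
            continue
        selected.append(word)
    return selected


# Hypothetical candidates and frequencies for illustration only.
candidates = [
    ("Algorithm", "noun", True),
    ("Chair", "noun", False),
    ("Explain", "verb", False),
    ("Risk", "noun", True),
    ("Reuters", "name", False),
    ("PGP", "acronym", True),
    ("Easily", "adverb", False),
]
frequencies = {"algorithm": 120, "risk": 9000, "pgp": 45}

keywords = select_keywords(candidates, frequencies)
print(keywords)
```

In this toy run, "Risk" is dropped as an outlier even though it is related to IS, which mirrors the appendix's treatment of very frequent words.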