2013 46th Hawaii International Conference on System Sciences

Classifying Information Systems Risks: What Have We Learned So Far?

Manuel Wiesche, Hristo Keskinov, Michael Schermann, Helmut Krcmar
Chair for Information Systems, Technische Universität München

1530-1605/12 $26.00 © 2012 IEEE. DOI 10.1109/HICSS.2013.130

Abstract

Understanding the risks caused by relying on information systems is an enduring research stream in the Information Systems (IS) discipline. With information systems becoming ubiquitous, IS risks permeate every aspect of life and effective risk mitigation increasingly requires a holistic structure. We use the largest and oldest publicly available risk collection to understand the development of IS risks, their characteristics, and interdependencies. We review this data set using text mining techniques. Interestingly, we find that some types of IS risks tend to reoccur. We find that this database provides rich opportunities for learning from previous mistakes, which could help avoid similar problems in the future. Our contributions to theory include a risk-taker’s view on contemporary information systems, a differentiation between controllable and reoccurring risks, and the increased interconnection of IS risks. As implications for practice, we provide a basis for learning from past IS risks and an initial structure.

We thank SAP AG for funding this project as part of the collaborative research center, Center for Very Large Business Applications (CVLBA).

1. Introduction

Managing risks caused by relying on information technology is an enduring research stream in the Information Systems (IS) discipline [1]. Effectively managing IS risks helps organizations recognize future technological challenges early, prevents lock-in effects, allows transparency over IT processes and ensures responsibilities [2-4]. With information systems becoming ubiquitous, IS risks permeate every aspect of life and risk mitigation increasingly requires a holistic approach. Literature argues that IS risks are neither exclusively technological nor organizational but are embedded in work systems that consist of information systems, business processes, and work practices [1]. The following examples illustrate the need for a holistic approach to IS risks:

- The CEO of a multi-national corporation is finalizing an important email on an upcoming takeover in his hotel room. He is using the unsecured wireless internet connection provided by the hotel. He always felt that the VPN client installed on his notebook is too cumbersome to use. Against his better judgment, he sends off the email without securing it. An industrial spy in the next hotel room eavesdrops on the internet traffic of the CEO and intercepts the particular email. He publicizes the content, causing the multi-billion takeover to fail.
- Currently, Facebook has more than 900 million members who produce 500,000 comments per minute. Facebook collects information on people’s private and professional lives. Other companies are increasingly using Facebook as a platform for marketing purposes or even for doing business. Several privacy breaches have heightened the awareness of potential risks from using Facebook on a private or organizational level. Furthermore, the significant role of Facebook in recent uprisings revealed the power of the social network on a national level.
- Consider the design, implementation, and operation of a nation-wide road toll billing system involving satellite-based vehicle tracking. The project was delayed by almost three years, which amounted to 1.6 billion Euro in penalties and 3.5 billion Euro in lost earnings with legal actions still in progress. Furthermore, privacy concerns were raised by nongovernmental agencies and prominent figures in society. However, today the system is operating very effectively and other countries are interested in adopting the system.

The resulting entanglement of information systems, organizations, people, and societies challenges IS risk research. A central problem of research on IS risks is the diversity of dimensions in which IS risks occur [1]. Literature on IS risks addresses different topics ranging from IT security, software projects, outsourcing, e-commerce, inter-organizational systems, IT infrastructure, and healthcare to cryptography. These topics span different risk domains including operational, project, portfolio, monitoring, or strategic risks. However, many of the articles research a particular IS topic and focus on one or a few risks associated with this topic in particular. Thus, literature still demands a holistic perspective on IS risks [5-9].

In order to understand the development of IS risks, their characteristics, and interdependencies, we review the largest and oldest publicly available IS risk collection on the net, the Risks Digest. This risk collection has been published by the Committee on Computers and Public Policy of the Association for Computing Machinery (ACM) since 1985. The Risks Digest is edited by Peter G. Neumann and characterized by its broad range of IS risks enhanced by the knowledge and experience of experts from all over the world. There have been analyses of this interesting data set [10, 11]. Several classifications have been discussed within the database as well. Examples include classifications of program bugs in the 1980s, errors in the 1990s, and attacks in the 2000s. However, none has been generally established and agreed upon.
We review the Risks Digest using text mining techniques to learn from this longitudinal data set. We automatically classify the entries and empirically derive a taxonomy of IS risks. We contrast different time slices and analyze the development of similarly described risks over time. We use the data set to analyze the development of IS risks in the past decades. We find that some types of IS risks tend to reoccur. Hence, this database provides rich opportunities for learning from previous mistakes by reusing risk factors, consequence assessments, and mitigation strategies.

The remainder of this paper is organized as follows. The next section outlines the theoretical background by defining information systems and IS risks. The third section describes the methodology we followed in the course of our research. After that, the fourth section presents our results. The fifth section explores the limitations of our methodology and the initial implications of the achieved results. The last section summarizes our findings and presents the conclusion.

2. Theoretical Background

Following fundamental definitions of risk in IS reference disciplines [12, 13], IS researchers frequently define risks as events with a perceived probability of occurrence and a perceived negative impact on the objectives [1, 14-16]. Only a few researchers explicitly state their understanding of IS risk [1]. Among these, many articles see risk as a quantifiable construct with a probability and impact value [17-22]. Other articles define risk explicitly as IS project failure [23-26] or systems failure [26, 27]. Other researchers understand risk from a broader perspective, in terms of outcome variation: as undesirable outcome variation [17, 28-30], uncertain outcome [31-33], or variation in outcome [34]. Risk is also defined in even broader terms as some kind of loss in general [18-22, 35, 36].
For this research, we define IS risk as “any threat that may lead to the improper modification, destruction, theft, or lack of availability of IT assets” [2]. Extant research addresses the sources of IT risks and individual countermeasures [37]. As a core body of knowledge in IS risk research, literature on operational IS risks focuses on managing and maintaining IT systems. Main challenges include maintaining availability, i.e. ensuring that systems do not break down; integrity, i.e. keeping information from becoming confused or incomplete; and confidentiality, i.e. securing information systems against unauthorized access [38]. Such risks are caused during the standard usage of IT; the IT component fails as part of a bigger system [27]. Such IS risks can be divided into two further categories: on the one hand, there are new and unknown risks [39], and on the other hand, there are known but still unsolved risks [40]. Characteristics of these known risks include the fact that the degree of uncertainty is relatively low and the number of risks occurring is relatively high. This makes it easier to quantify the probability and impact of the considered risks. New and unknown risks usually occur with the emergence of new technologies [41-43]. Similarly, for IT projects, research identified lists of risk factors, which affect the degree of variation in expected outcomes [44, 45]. Countermeasures include software development methodologies or contract design and coincide with real options [14, 37]. On the other hand, the example of the Y2K problem was researched only at a certain point in time and thus researchers cannot provide a structure in which to place their research [39]. Considering all these heterogeneous topics and different application domains, literature on IS risks is scattered. Although there have been promising attempts to classify IS risks [1, 6, 10, 46], research still demands a holistic perspective on IS risks [5-9].
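The quantifiable view of risk described above, with a probability and an impact value per risk, can be illustrated by multiplying both values into an expected loss and ranking risks by it. The following is a minimal Python sketch under that assumption; the `Risk` class, the `exposure` function, and the sample values are hypothetical and not part of this study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Risk:
    name: str
    probability: float  # perceived probability of occurrence, in [0, 1]
    impact: float       # perceived negative impact, e.g. a monetary loss


def exposure(risk: Risk) -> float:
    """Expected loss: probability of occurrence times impact."""
    if not 0.0 <= risk.probability <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    return risk.probability * risk.impact


# Invented values, for illustration only.
risks = [
    Risk("server outage", probability=0.5, impact=10_000.0),
    Risk("data theft", probability=0.25, impact=100_000.0),
]

# Rank risks from highest to lowest expected loss.
ranked = sorted(risks, key=exposure, reverse=True)
```

Such a ranking presupposes that probability and impact can be estimated at all, which is more plausible for known, frequently occurring risks than for new and unknown ones.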
3. Methodology

In order to classify the risk collection for understanding the development of IS risks, we adopted co-word analysis, also referred to as “actor network analysis” [47]. Co-word analysis is a content analysis technique that uses co-occurrence patterns for pairs of items, such as words or noun phrases, in a corpus of texts. These items are necessary in order to extract the themes and detect the linkages among topics directly from the subject content presented in the texts [47]. Co-word analysis is based on the assumption that a document’s keywords constitute an adequate description of its content and that two keywords co-occurring in the same document indicate a relationship between its topic and keywords [48].

Co-word analysis is generally conducted in three steps: keyword list extraction, data standardizing, and data mapping [49]. However, to start the analysis we had to create and clean the risk database first. Having collected the data and unified it by using text-mining techniques as described in section 3.2 “Data Transformation”, we manually selected the keywords that best describe clusters of IS risks. Afterwards we standardized the data by constructing a co-occurrence matrix of keywords. In the final step we mapped the data to create semantic network maps that revealed the relationship between the chosen keywords and the clusters for the taxonomy of IS risks.

3.1 The Risks Digest

For this research, we review the largest and oldest publicly available risk collection on the net, the Risks Digest (http://catless.ncl.ac.uk/Risks/). It is published by the ACM Committee on Computers and Public Policy and edited by Peter G. Neumann. The first contribution appeared on August 1st 1985. Within the Risks Digest, researchers and practitioners discuss various topics on IS risks. Topics in this data set include technical security breaches, impact of information systems and their failure, and the usage of information systems. Participants provide contributions via the comp.risks newsgroup and email. The contributions are reviewed for relevance, soundness, taste, objectivity, cogency, coherence, conciseness, nonrepetitiousness, and meeting compliance regulations. Usually between 5 and 50 contributions are summarized within one issue. The issues are aggregated into volumes of between 45 and 98 issues. Though the Risks Digest is published on an irregular basis, 1.65 issues are published a week on average. The longest volume lasted 116.4 weeks, the shortest 20.5. One volume comprises 52.8 weeks on average.

We focus on all risks posted between July 1985 when the first issue was published and March 2011 when we started this research. Comprising a total of 26,050 risk items, the data set covers 26 volumes and 2,264 issues. There have been many analyses of this data set. However, these focus on single cases [10] and manual classifications [11]. Others used parts of the risk collection for empirically validating a developed taxonomy of cyberspace deception [50] and scenario analysis [51]. In line with these researchers, we consider this risk collection unique and the most comprehensive for analyzing publicly available IS risk information [10]. However, since this risk collection is edited by a single person, the topics included could be biased. As stated in the Risks Digest’s mission, the editor has a broad perspective on risks and initial classifications reveal the heterogeneity of the data set [11].

3.2 Data Transformation

In order to work on the risk collection, we extracted the data into ‘RapidMiner’, an environment combining text, data and web mining, machine learning, predictive analytics, and business analytics. By specifying two regular expressions we defined a region delimiter containing one single IS risk. Then we cut the extracted web sites into single risks (that we will refer to as documents), creating our unique risk database. Each document contains not only a detailed risk description, but also information about volume, issue, publish date, subject, and author. Besides risk contributions, these documents also include calls for papers and book reviews. We automatically removed contributions containing the string ‘call for paper’. However, we kept book reviews, since they summarize risk books and therefore address important IS risk topics as well.

The first step when handling a text-based database is breaking the stream of characters into words called tokens [52]. Subsequently, we transformed all characters in the documents to lower case and filtered out the stop words by removing every token that equals one in the built-in stop word list. Afterwards we discarded tokens consisting of only 1 or more than 15 characters. Once we segmented the character stream into a sequence of uniform tokens, the next step was to convert each of the tokens to a standard form as a basis for all future operations. This process is referred to as stemming or lemmatization [52]. We used the Porter stemming algorithm for English words, which applies an iterative, rule-based replacement of word suffixes intending to reduce the length of the words until a minimum length of the stem is reached. Examples of such rules are “y” or “ies” into “i”, “sses” into “ss”, and a trailing “s” into the empty string. Having processed all documents, our risk database consisted of 34,318 unique words. The total frequency of occurrence of these words, including the repetitive use of a given word in one document, is 3.6 million. In addition, we calculated for every unique word the number of documents containing it.

3.3 Keyword Selection

We selected relevant keywords from the unique word list to understand IS risks’ antecedents, characteristics, and developments. Since we used manual classification, we reduced the total of more than 34,000 words to a more manageable number. We chose the 10% of words that occurred most often in the risk collection. These words included ‘risk’, which was mentioned 14,631 times, and ‘diskette’, which was mentioned 100 times. Subsequently, we reviewed and reduced these keywords by adhering to the classification rules depicted in appendix A. Once we selected all relevant IS risk words, we merged all synonyms in order to avoid spurious strong relationships between them. As a result our final keyword list contains 432 single words, 46 pairs of words (e.g. “automobile and car”), and 6 triples (e.g. “airplane, aircraft, and plane”), which are used as keywords for developing IS risk categories.

3.4 Data Standardizing and Data Mapping

In co-word analysis, once a research subject is selected, a matrix based on the word co-occurrence is built [53]. This matrix depicts the observed frequencies of all selected keywords in a cross tabulation form. The value of each cell for two words is the number of documents in which both words appear. In order to calculate the association strength between word pairs we use a normalized statistical coefficient based on Chi-square analysis for the relationship between qualitative variables. Here, higher positive values are associated with stronger relationships between the word pairs. For the last co-word analysis step, we opted for NetDraw to read the standardized data and create a semantic network map that presents the analyzed content. Working toward building the risk clusters, we use the Girvan-Newman method [53].

3.5 Data Analysis

By following the methodology depicted above, we created a semantic network map containing 40 color-coded clusters based on the co-occurrences between the keywords. An important factor in our analysis is the chosen level of Chi-square. Its subjective choice aims at uncovering the data structure optimally and showing the IS clusters. In the decision process we considered the number of depicted nodes and the percentage of ties that they reveal. Keeping in mind that maps featuring more than 200 nodes quickly become unreadable [54], we chose to work with Chi-square values higher than 180. This removed 35% of our keywords due to their weak ties. As a result the semantic network map contains 314 nodes and 1,170 ties (1.4% of all positive ties). However, only the 257 nodes with the strongest ties are used for our analysis. Although the number is still bigger than 200, varying the variables had shown that this is the smallest number of words needed to best describe the potential IS clusters without losing any valuable information with regard to content.

We depicted our results in a network map. The size of the nodes was chosen proportional to the number of documents containing the word and the thickness of the edges to the strength of co-occurring ties. The length of the edges, the nodes’ location in the two-dimensional space, and the color of the clusters were used for illustration purposes. However, closely related words are positioned near each other. We use the modularity Q, calculated by the Girvan-Newman community structure algorithm, as quality criterion for our network map [55]. Values greater than Q = 0.3 appear to indicate significant community structure. In our case Q was 0.824, indicating that our network map is acceptable and providing us with relationships for further discussions.

After examining the results of the network map we developed a hierarchical categorization of IS risks, reducing the initial 40 clusters to 33. We combined three pairs due to their thematic similarity and excluded four because they either contained risk synonyms (e.g. “error and mistake”) or were consequences of IS risks (e.g. “death, injury” and “disaster, recovery”). Afterwards we thematically assigned the remaining 33 clusters to the 10 categories. The classification is based on the cluster’s content and was manually conducted by the first and second author. Therefore we determined the size of each cluster following two principles:

- A document belongs to a given category if it resides in at least one of the clusters in this category.
- A document belongs to a given cluster if it contains at least 2 of its keywords.

The second principle certainly undervalues the size of clusters containing a small number of keywords, but at the same time it increases our precision and the quality of the results. As a consequence we classified 20,807 of all 26,050 documents, each of which falls into one or more clusters (mean = 2). The remaining 5,243 documents are not classified because they do not contain any of the required word combinations.

4. Results

Our analysis identified 257 topics of IS risks within the Risks Digest database. Figure 1 provides an overview of the topics we found and their interrelations. Each bubble represents the topic as labeled. The size of the bubble represents the number of documents within the database. The interrelations are represented through ties between bubbles. The size of the tie indicates the strength of co-occurring interrelations.

The most prominent group of topics (represented as red bubbles in the center of figure 1) comprises IT security issues. The Risks Digest database comprises articles on password safety, unauthorized access, hacking, breaking into systems, and securing sessions. In the context of fraud (black bubbles in the top right-hand corner of figure 1), credit-card fraud, theft, involved parties and systems, and ATMs are discussed. Concerning transportation-related IS risks, the topics within the Risks Digest database (green bubbles on the left-hand side of figure 1) are centered around operational failure of information systems, various transportation alternatives on land, water, and air, and the consequences of such systems.
Finally, IS risks related to communication (purple bubble in the middle of figure 1) concern the Internet, e-mail providers, technical infrastructure, and types of communication threats.

Figure 1: Co-word semantic network map

To provide an overview of the risks that are mentioned most often within the database, we classified all topics into clusters and built categories depending on the content of the topic. Figure 2 visualizes the developed classification of IS risks. Due to the ambiguity of certain words, we considered the full topic description to classify it to a certain category. For example, network issues referred to communication between hardware and address the technical connection between devices. We therefore categorized them as computer-related risks. Another example is the classification of social media-related risks. Although such risks could be classified as content-related risks, we categorized social media risks as Internet-related risks since the discussion focused on technical implementations on the Internet.

The transportation-related IS risk category is heavily discussed (8,056). Automotive risks (2,470) are interconnected with different “vehicles” (693) such as “automobile” (1,580), “motor” (337) and “truck” (197). Among their most common causes are “GPS” (336) and “brake” (292) problems as well as improper “speed” (982). Railway risks (1,030) are the least common in this category. “Trains” (1,304) are considered a safe means of transportation that reduces pollution levels and traffic congestion. The cluster does not contradict this assumption but still points out some problems. “Track” (1,392), “signal” (1,039) and “radio” (1,087) are the keywords with the highest frequency, which leads us to the conclusion that operational problems are predominant in this area. Aviation risks (6,238) represent the biggest and most clearly defined cluster.
It includes civil aviation incidents such as “autopilot” (133) malfunction, “engine” (2,867) “failure” (3,936) or “traffic” (1,209) control. Common causes are “navigation” (418) and “weather” (271) problems. Unfortunately, they have often ended with a “crash” (1,368) or “collision” (290).

Internet-related risks include 7,587 documents and are thus one of the biggest categories of IS risks. They consist of well-known computer threats and general risks of online presence. Cybercrimes (4,862) refer to security problems such as “hacks” (1,580) and “cracks” (634). Other common concerns are stolen “passwords” (1,282) and unauthorized “access” (3,076) to “user” (3,946) information. The social media (147) cluster is labeled based on only two keywords: “post” (2,696) and “Usenet” (352). Usenet is a worldwide distributed Internet discussion system that was established in 1980. Users communicate interactively by posting messages in categories known as newsgroups. Web browsing (4,438) is an important part of modern life. However, everyone should be aware that even simple activities like visiting “websites” (2,734), using browsers such as “Internet Explorer” (228) or “Netscape” (269), or checking e-mails with “Microsoft’s” (1,157) “outlook” (144) bear some risks.
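The document counts reported per cluster and category follow from the two membership principles of the methodology: a document belongs to a cluster if it contains at least 2 of the cluster's keywords, and to a category if it belongs to at least one of the category's clusters. The following is a minimal Python sketch of these two rules, assuming each document has already been reduced to a set of stemmed tokens. The function names and the small taxonomy excerpt are hypothetical; the excerpt only borrows a few example words from Figure 2.

```python
from typing import Dict, Set

# Hypothetical excerpt of the taxonomy: category -> cluster -> keywords.
TAXONOMY: Dict[str, Dict[str, Set[str]]] = {
    "Transportation-related risks": {
        "Automotive risks": {"gps", "speed", "motor", "truck"},
        "Aviation risks": {"airplane", "fail", "engine", "crash"},
    },
    "Internet-related risks": {
        "Cybercrimes": {"hack", "crack", "login", "password", "user"},
    },
}

MIN_KEYWORDS = 2  # a document needs at least 2 keywords of a cluster


def clusters_of(doc_tokens: Set[str]) -> Set[str]:
    """Clusters that share at least MIN_KEYWORDS keywords with the document."""
    return {
        cluster
        for clusters in TAXONOMY.values()
        for cluster, keywords in clusters.items()
        if len(doc_tokens & keywords) >= MIN_KEYWORDS
    }


def categories_of(doc_tokens: Set[str]) -> Set[str]:
    """Categories containing at least one cluster the document belongs to."""
    matched = clusters_of(doc_tokens)
    return {
        category
        for category, clusters in TAXONOMY.items()
        if any(cluster in matched for cluster in clusters)
    }


# Two aviation keywords qualify; a single cybercrime keyword does not.
doc = {"engine", "crash", "password"}
doc_clusters = clusters_of(doc)
doc_categories = categories_of(doc)
```

Because a single keyword match is not sufficient, a document mentioning only “password” remains unclassified in this sketch, which mirrors why clusters with few keywords tend to be undercounted.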
Figure 2: Classification of IS risks (each IS risk category is followed by its IS risk clusters, with example words in parentheses)

- Content-related risks: Web content (google, blog, php, privacy, typo); Adult content (porn, pornography, sex, sexual); Broadcasting media (cable, video, tv, gamble, game)
- Internet-related risks: Cybercrimes (hack, crack, login, password, user); Social media (usenet, post); Web browsing (SSL, java, IE, netscape, ISP, site)
- Communication-related risks: Electronic messaging (mail, phish, scam, spam, AOL); Communication systems (phone, denial, service, laptop); Unethical behaviour (fool, joke)
- Power supply-related risks (atom, plant, water, nuclear)
- Health-related risks (drug, cancer, hazard, blood)
- Research-related risks (lab, research, science)
- Computer-related risks: Software engineering (ada, fortran, cpu, code, chip); Operational systems (linux, mac, os, windows, vm); Network security (network, ip, router, tcp); Data security (PGP, RSA, encryption, decryption); Computer networking (ARPA, collapse, BITNET, net); Mainframe computers (IBM, mainframe); Data representation (ASCII, printer, postscript); Malicious software (trojan, worm, virus, disk)
- Government-related risks: Aeronautics (NASA, orbit, space, rocket); e-Government (biometric, vote, elect, passport); Freedom of speech (censorship, freedom, speech); Military technology (DOD, SDI, ship, weapon); Terrorism (terror, terrorist); Law enforcement (eavesdrop, FBI, wiretap); Intellectual property (copyright, cd, music, amazon)
- Finance-related risks: Fraud and identity theft (ATM, PIN, id, credit, card); Stock market (market, price, stock); Account ownership (registration, fee)
- Transportation-related risks: Automotive risks (GPS, speed, motor, truck); Railway risks (metro, train, rail jam, signal); Aviation risks (airplane, fail, engine, crash)

Government-related risks are another large category (5,937) that includes clusters with strong dependencies on governmental structures and regulations. They are all either financed or legally regulated by the government, which is the means by which a policy is enforced. Therefore it can be referred to as the source of all related risks and problems.

Aeronautics (615) focuses on the design and manufacturing of flight-capable machines. We separate this cluster from the risks related to commercial aviation. The most common words that describe it are “space” (1,235), “satellite” (451), “tank” (214), “rocket” (167) and “orbit” (154). Possible causes of problems relate to “fire” (995) and “tank” (214) problems, which often lead to an “explosion” (288). The only organization that falls into this cluster and had to handle the reliability and safety problems that occurred over the years is “NASA” (524). e-Government (1,082) is a thematically defined cluster containing two independent subgroups. The first, e-Passport, is defined by “biometric” (112) and “passport” (126). The second, e-Voting, features “electronic” (2,826), “vote” (957), “president” (875) and “election” (740). Freedom of speech (112) is yet another small cluster without any external ties. It consists of “freedom” (398), “speech” (320) and “censorship” (221). At a lower level of Chi-square, this cluster connects with the adult content risks and uncovers concerns in this field. Military technology (1,208) consists mainly of governmental institutions such as the Department of Defense (198) as well as the technology-driven facilities and equipment at their disposal (e.g. “ship” (549) and “helicopter” (107)). Terrorism (59) is a small but important cluster because of its bearing on human life. Law enforcement (2,960) combines consequences for criminal acts related to IS (e.g. “jail” (255)) and mentions concerns about their possible misuse by governmental authorities (“eavesdrop” (158), “wiretap” (202), and “FBI” (682)). Intellectual property (1,035) is thematically built from two small clusters.
The first one addresses problems with digital property rights (“cd” (338)) in the “music” (180) industry. The second one reveals general concerns about “copyright” (784). It is often associated with the keywords “book” (1,796), “deal/dealer” (1,720) and “amazon” (254).

5. Discussion

In this section we discuss the results of our analysis of the Risks Digest database using text mining techniques to understand the development of IS risks, their characteristics, and interdependencies. We discuss our results regarding (1) the temporal development of IS risks, (2) the implications for IS risk mitigation, and (3) the increasing complexity of IS risks.

We (1) found that IS risks develop over time. The Girvan-Newman algorithm, which we used to identify groups of related keywords, is a form of agglomerative hierarchical clustering [55]. Figure 3 visualizes the results of our analysis of the temporal development of IS risks. The figure provides an initial overview of the development of each IS risk category as developed above. This exploratory analysis reveals that some IS risks are dynamic and tend to reoccur over and over again. For instance, phenomena like Facebook and Twitter have caused many discussions about privacy and security over the past few years, but such concerns already alarmed people between 1991 and 2000. In contrast, there were many IS risks which are certainly unique. Consider the example of the Y2K problem: being a risk concerning only one point in time and resulting in a fundamental redesign of information systems, the information on this risk within the database is very special and will probably not provide any guidance for future IS risks.
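A timeline of the kind shown in Figure 3 amounts to counting, for every year, the documents assigned to each IS risk category. The following is a minimal Python sketch of that aggregation, assuming each document has already been reduced to a pair of publication year and assigned categories. The `yearly_counts` function and the sample records are hypothetical and not taken from the Risks Digest.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Set, Tuple

Record = Tuple[int, Set[str]]  # (publication year, categories of the document)


def yearly_counts(records: Iterable[Record]) -> Dict[str, Counter]:
    """Map each category to a Counter of documents per year."""
    timeline: Dict[str, Counter] = defaultdict(Counter)
    for year, categories in records:
        for category in categories:
            # A document in several categories counts once in each of them.
            timeline[category][year] += 1
    return timeline


# Invented sample records, for illustration only.
sample = [
    (1999, {"Computer-related risks"}),
    (1999, {"Computer-related risks", "Finance-related risks"}),
    (2001, {"Government-related risks"}),
]
timeline = yearly_counts(sample)
```

Plotting one such series per category over the years 1985 to 2011 would give a chart comparable in structure to Figure 3, in which reoccurring risks show up as repeated peaks and one-off risks as a single peak.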
Figure 3: Discussion timeline of IS risk categories (number of documents per year, 1985 to 2011, for each of the ten IS risk categories; annotated events: worm attacks, end of cold war, commercialisation of the internet, IT-bubble burst, attacks of 9/11)

Regarding (2) the implications for IS risk mitigation, our analysis shows that many consequences relate logically to the identified clusters (e.g. health-related risks and “injury”, or cybercrimes and physical “breach”). Future IS risk mitigation strategies can be identified through a historical analysis of causal relationships of the visualized peaks in figure 3. This detailed database provides the opportunity to learn from previous mistakes by providing IS risk factors, characteristics, consequences, and mitigation strategies of previously assessed IS risks which have similar characteristics to future IS risks.

Our results (3) clearly indicate an increase in the complexity of IS risks. The temporal analysis showed that ties between risks of at least one different category increased by a factor of 1.8.

To the best of the authors’ knowledge, a similar text-mining based analysis of IS risks has not been conducted before. Yet, there have been many theoretically and empirically driven classifications of IS risks [e.g., 1, 6, 10]. Reflecting on these classifications, we find that existing classifications indicate the need for further details on existing IS risks (e.g., as in IS project risk literature).
In contrast, our results imply that due to the rising interdependencies between IS risks, a holistic view would help structure, assess, and mitigate today’s and future IS risks more adequately.

6. Implications and Limitations

Our analysis provides an overview of the most prominent IS risks of the past 25 years that have been discussed in the Risks Digest database as the oldest and most comprehensive list of IT-related risks. In contrast to other research, it accounts for the historical development of IT risks from initially technology-related (e.g. transportation) to rather socio-technical (e.g. social networks) issues.

Implications for IS risk research are threefold: (1) The temporal analysis reveals the growing complexity and interconnection of IS risks and demands a holistic perspective on these. The finding that risks develop differently indicates (2) that it will be interesting to differentiate between controllable and non-controllable IS risks. Finally, (3) the developed classification allows guiding further research on IS risks. Researchers can use it to structure research endeavors and delimit the range in which their findings are pertinent. It further allows transferring solutions to other IS risk topics, which either belong to the same category or have similar characteristics.

Practitioners benefit from this research in two ways. Professionals in charge of IS risk assessment benefit from additional structure, on the one hand to identify potential IS risks and on the other hand to classify and confine IS risks they are confronted with in their daily business.

We acknowledge that there are limitations to our study. This research presents a tentative analysis and must be seen in its context. An excerpt of the IS risks has already been classified by Neumann [56]. However, this research extends the existing classification by using learning algorithms and predictive analytics to achieve completeness and hierarchical structure. Classifying the whole data set reveals various levels of IS risks and countermeasures. We find IS risks closely related to work systems and their mastery [1]. Further research could address the relationship between system maturity and coping with IS risks.

Considering the development of the data set, several biases need to be taken into account. First, the list was established, initially filled and is still moderated by a single person with a certain perspective on IS risks. Though the moderator uses a review board for considering certain posts, the data still might be biased. Second, our current methodology neglects emerging IS risks because they are not yet as commonly discussed as the already established ones. Regarding the co-word analysis, the quality of results depends on a variety of factors, such as keywords, the scope of the database, and method adequacy for simplifying and representing the findings [57]. We carefully selected keywords and implemented classification algorithms, but in a next step we need to evaluate our classification using inter-coder reliability and other algorithms. We provide an initial analysis of the relative consequences of IS risks and their countermeasures. We provide a basis for researching empirical historical IS risks and outline worthwhile avenues for further research.

7. Conclusion

In this paper we conduct a long-term co-word analysis of the oldest and most comprehensive IS risk database available. After selecting keywords describing the IS risks we constructed a semantic map that helped us to identify the core IS risk categories and clusters. We find that IS risk complexity increases and that IS risks develop over time. We derive implications for IS risk mitigation by drawing on organizational learning. Our analysis leads the way for using this comprehensive documentation of IS risks for organizational learning by providing knowledge on similar IS risks and their mitigation for future IS risks.

8. References

[1] S. Alter and S. A.
Sherer, "A general, but readily adaptable model of information system risk," Communications of the AIS, vol. 14, pp. 1-28, 2004.
[2] J. Goldstein and A. Chernobai, "An Event Study Analysis of the Economic Impact of IT Operational Risk and its Subcategories," Journal of the Association for Information Systems, vol. 12, pp. 606-631, 2011.
[3] P. G. Armour, "Sarbanes-Oxley and Software Projects," CACM, vol. 48, pp. 15-17, 2005.
[4] F. Caldwell, T. Scholtz, and J. Hagerty, "Magic Quadrant for Enterprise Governance, Risk and Compliance Platforms," Gartner, Stamford, CT, 2011.
[5] M. Parent and B. H. Reich, "Governing information technology risk," California Management Review, vol. 51, pp. 134-152, 2009.
[6] S. Sharma and G. Dhillon, "IS risk analysis: a chaos theoretic perspective," Issues in Information Systems, vol. X, pp. 552-650, 2009.
[7] H. A. Smith and J. D. McKeen, "Developments in Practice XXXIII: A Holistic Approach to Managing IT-based Risk," Communications of the Association for Information Systems, vol. 25, Article 41, 2009.
[8] D. W. Hubbard, The Failure of Risk Management. Hoboken, New Jersey: John Wiley & Sons, 2009.
[9] R. K. Rainer, C. A. Snyder, and H. H. Carr, "Risk Analysis for Information Technology," Journal of Management Information Systems, vol. 8, pp. 129-147, 1991.
[10] P. G. Neumann, "Reviewing the Risks Archives," CACM, vol. 38, 1995.
[11] P. G. Neumann, "Illustrative risks to the public in the use of computer systems and related technology," ACM SIGSOFT Software Engineering Notes, vol. 19, pp. 169, 1994.
[12] J. G. March and Z. Shapira, "Managerial Perspectives on Risk and Risk Taking," Management Science, vol. 33, pp. 1404-1418, 1987.
[13] F. H. Knight, Risk, Uncertainty and Profit. Washington, DC, USA: BeardBooks, 2002.
[14] B. Boehm, "Software risk management: Principles and practices," IEEE Software, vol. 8, pp. 32-41, 1991.
[15] F. Heemstra and R.
Kusters, "Dealing with risk: A practical approach," Journal of Information Technology, vol. 11, pp. 333-346, 1996.
[16] R. Charette, "The mechanics of managing IT risk," Journal of Information Technology, vol. 11, pp. 373-378, 1996.
[17] H. Barki, S. Rivard, and J. Talbot, "An Integrative Contingency Model of Software Project Risk Management," Journal of Management Information Systems, vol. 17, pp. 37-69, 2001.
[18] H. Barki, S. Rivard, and J. Talbot, "Toward an Assessment of Software Development Risk," Journal of Management Information Systems, vol. 10, pp. 203-225, 1993.
[19] R. L. Baskerville and J. Stage, "Controlling Prototype Development Through Risk Analysis," MIS Quarterly, vol. 20, pp. 481-504, 1996.
[20] K. D. Loch, H. H. Carr, and M. E. Warkentin, "Threats to Information Systems: Today's Reality, Yesterday's Understanding," MIS Quarterly, vol. 16, pp. 173-186, 1992.
[21] R. J. Kauffman and R. Sougstad, "Risk Management of Contract Portfolios in IT Services: The Profit-at-Risk Approach," Journal of Management Information Systems, vol. 25, pp. 17-48, 2008.
[22] H. Tanriverdi and T. W. Ruefli, "The Role of Information Technology in Risk/Return Relations of Firms," Journal of the Association for Information Systems, vol. 5, pp. 421-447, 2004.
[23] J. Ropponen and K. Lyytinen, "Can software risk management improve system development: An exploratory study," European Journal of Information Systems, vol. 6, pp. 41-41, 1997.
[24] J. H. Iversen, L. Mathiassen, and P. A. Nielsen, "Managing Risk in Software Process Improvement: An Action Research Approach," MIS Quarterly, vol. 28, pp. 395-433, 2004.
[25] A. Mursu, K. Lyytinen, H. A. Soriyan, and M. Korpela, "Identifying software project risks in Nigeria: an International Comparative Study," European Journal of Information Systems, vol. 12, pp. 182-194, 2003.
[26] H. K. Jain, M. R. Tanniru, and B.
Fazlollahi, "MCDM Approach for Generating and Evaluating Alternatives in Requirement Analysis," Information Systems Research, vol. 2, pp. 223-239, 1991.
[27] D. W. Straub and R. J. Welke, "Coping With Systems Risk: Security Planning Models for Management Decision Making," MIS Quarterly, vol. 22, pp. 441-469, 1998.
[28] R. Schmidt, K. Lyytinen, M. Keil, and P. Cule, "Identifying Software Project Risks: An International Delphi Study," Journal of Management Information Systems, vol. 17, pp. 5-36, 2001.
[29] P. A. Pavlou and D. Gefen, "Psychological Contract Violation in Online Marketplaces: Antecedents, Consequences, and Moderating Role," Information Systems Research, vol. 16, pp. 372-399, 2005.
[30] M. Keil, B. C. Y. Tan, K.-K. Wei, T. Saarinen, V. Tuunainen, and A. Wassenaar, "A Cross-Cultural Study on Escalation of Commitment Behavior in Software Projects," MIS Quarterly, vol. 24, pp. 299-325, 2000.
[31] S. R. Nidumolu, "A Comparison of the Structural Contingency and Risk-Based Perspectives on Coordination in Software-Development Projects," Journal of Management Information Systems, vol. 13, pp. 77-113, 1996.
[32] E. K. Clemons, M. C. Row, and M. E. Thatcher, "Identifying Sources of Reengineering Failures: A Study of the Behavioral Factors Contributing to Reengineering Risks," Journal of Management Information Systems, vol. 12, pp. 9-36, 1995.
[33] E. D. Hahn, J. P. Doh, and K. Bunyaratavej, "The Evolution of Risk in Information Systems Offshoring: The Impact of Home Country Risk, Firm Learning, and Competitive Dynamics," MIS Quarterly, vol. 33, pp. 597-616, 2009.
[34] A. I. Nicolaou and D. H. McKnight, "Perceived Information Quality in Data Exchanges: Effects on Risk, Trust, and Intention to Use," Information Systems Research, vol. 17, pp. 332-351, 2006.
[35] R. Willison and J. Backhouse, "Opportunities for computer crime: considering systems risk from a criminological perspective," European Journal of Information Systems, vol. 15, pp. 403-414, 2006.
[36] T. Dinev and P.
Hart, "An Extended Privacy Calculus Model for E-Commerce Transactions," Information Systems Research, vol. 17, pp. 61-80, 2006.
[37] M. Benaroch, Y. Lichtenstein, and K. Robinson, "Real Options in Information Technology Risk Management: An Empirical Validation of Risk-Option Relationships," MIS Quarterly, vol. 30, pp. 827-864, 2006.
[38] T. Herath and H. R. Rao, "Protection motivation and deterrence: a framework for security policy compliance in organisations," European Journal of Information Systems, vol. 18, pp. 106-125, 2009.
[39] B. Zmud, "The Year 2000 Problem: A Laboratory for MIS Research," MIS Quarterly, vol. 21, pp. iii-vi, 1997.
[40] A. Y. Du, X. Geng, R. Gopal, R. Ramesh, and A. B. Whinston, "Capacity Provision Networks: Foundations of Markets for Sharable Resources in Distributed Computational Economies," Information Systems Research, vol. 19, pp. 144-160, 2008.
[41] M. Alavi and I. R. Weiss, "Managing the Risks Associated with End-User Computing," Journal of Management Information Systems, vol. 2, pp. 5-20, 1985.
[42] E. K. Clemons and G. U. Bin, "Justifying Contingent Information Technology Investments: Balancing the Need for Speed of Action with Certainty Before Action," Journal of Management Information Systems, vol. 20, pp. 11-48, 2003.
[43] R. Y. Arakji and K. R. Lang, "Digital Consumer Networks and Producer-Consumer Collaboration: Innovation and Product Development in the Video Game Industry," Journal of Management Information Systems, vol. 24, pp. 195-219, 2007.
[44] S. Alter and S. A. Sherer, "Information system risks and risk factors: Are they mostly about information systems?," Communications of the AIS, vol. 14, pp. 29-64, 2004.
[45] A. Tiwana and M. Keil, "Functionality risk in information systems development: An empirical investigation," IEEE Transactions on Engineering Management, vol. 53, pp. 412-425, 2006.
[46] M. Carr, S. Konda, I. Monarch, F. Ulrich, and C.
Walker, "Taxonomy-Based Risk Identification," Software Engineering Institute, Pittsburgh, 1993.
[47] Q. He, "Knowledge Discovery Through Co-Word Analysis," Library Trends, vol. 48, pp. 133-159, 1999.
[48] A. Cambrosio, C. Limoges, J. P. Courtial, and F. Laville, "Historical scientometrics? Mapping over 70 years of biological safety research with coword analysis," Scientometrics, vol. 27, pp. 119-143, 1993.
[49] M. Rokaya, E. Atlam, M. Fuketa, T. C. Dorji, and J.-i. Aoe, "Ranking of field association terms using Co-word analysis," Information Processing & Management, vol. 44, pp. 738-755, 2008.
[50] N. Rowe, "A taxonomy of deception in cyberspace," in International Conference in Information Warfare and Security, Princess Anne, MD, 2006, pp. 173-181.
[51] M. Slaymaker, E. Politou, D. Power, S. Lloyd, and A. Simpson, "Security Aspects of Grid-based Digital Mammography," Methods Inf Med, vol. 44, pp. 207-10, 2005.
[52] S. Weiss, N. Indurkhya, and T. Zhang, Fundamentals of Predictive Text Mining. London: Springer, 2010.
[53] Y. Ding, G. G. Chowdhury, and S. Foo, "Bibliometric cartography of information retrieval research by using co-word analysis," Information Processing & Management, vol. 37, pp. 817-842, 2001.
[54] P. Bourret, A. Mogoutov, C. Julian-Reynier, and A. Cambrosio, "A New Clinical Collective for French Cancer Genetics: A Heterogeneous Mapping Analysis," Science Technology and Human Values, vol. 31, pp. 431-464, 2006.
[55] M. E. J. Newman, "Fast algorithm for detecting community structure in networks," Physical Review E, vol. 69, 2004.
[56] P. G. Neumann, Computer related risks. New York, NY: Addison-Wesley, 1995.
[57] J. Law, S. Bauin, J.-P. Courtial, and J. Whittaker, "Policy and the mapping of scientific change: A coword analysis of research into environmental acidification," Scientometrics, vol. 14, pp. 251-264, 1988.
Appendix A: Keyword selection criteria

| Word type or meaning | Include words: description | Include words: example | Exclude words: description | Exclude words: example |
|---|---|---|---|---|
| Nouns | Related to IS | Algorithm, Chip | General | Star, Chair, Lesson |
| Verbs | No verbs | - | All verbs | Explain, Present |
| Adjectives | Related to IS | Binary, Cellular | General | Slow, Fair, Nice |
| Conversions from verbs and adjectives to nouns | Related to IS | Charge, Crash, Trace, Fake, Defect | General | Turn, Order, Record |
| Adverbs | No adverbs | - | All adverbs | Possibly, Easily |
| Names | Companies and products | IBM, Microsoft, Outlook, Netscape | Personal | Eric, Kevin, Frank |
| Acronyms | Technical acronyms | PGP, RSA, SMTP, NT, NASA, ARPA | Abbreviations | Etc., Mr., Jr., Dr. |
| Countries, cities, nationalities | None of those | - | All of those | China, London, American |
| Outliers (frequency of occurrence > 6,500) | No outliers | - | All outliers | Risk, System, Time, Problem |
| Sources of information | None of those | - | All of those | Reuters, BBC, MIT |
| Unessential words (see explanation below) | No unessential words | - | All unessential words | Data, Software, Company, Address |
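The mechanical part of these criteria can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure used in the study: the function name `select_keywords` and the set `EXCLUDED_WORDS` are hypothetical, the exclusion list is only seeded with the example words from the table, and the judgment of whether a noun or adjective is "related to IS" is left to a human coder. Only the outlier cut-off of 6,500 occurrences is taken from the appendix.

```python
from collections import Counter

# Taken from Appendix A: words occurring more often than this are outliers.
OUTLIER_THRESHOLD = 6500

# Assumption: a tiny stand-in exclusion list built from the table's examples.
# The real lists used in the study are not given in the paper.
EXCLUDED_WORDS = {
    "star", "chair", "lesson",                  # general nouns
    "explain", "present",                       # verbs
    "slow", "fair", "nice",                     # general adjectives
    "possibly", "easily",                       # adverbs
    "eric", "kevin", "frank",                   # personal names
    "china", "london", "american",              # countries, cities, nationalities
    "reuters", "bbc", "mit",                    # sources of information
    "data", "software", "company", "address",   # unessential words
}


def select_keywords(tokens):
    """Count tokens case-insensitively and drop excluded words and outliers."""
    counts = Counter(token.lower() for token in tokens)
    return {
        word: freq
        for word, freq in counts.items()
        if freq <= OUTLIER_THRESHOLD and word not in EXCLUDED_WORDS
    }


# Toy input: "risk" occurs 7,000 times and is therefore treated as an outlier.
tokens = ["Algorithm", "chip", "London", "Explain", "crash", "Crash"] + ["risk"] * 7000
keywords = select_keywords(tokens)
print(keywords)  # {'algorithm': 1, 'chip': 1, 'crash': 2}
```

In this sketch the frequency test and the exclusion list are applied in one pass; a fuller pipeline would presumably add part-of-speech tagging to remove all verbs and adverbs rather than relying on a fixed list.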