VISMON: FACILITATING ANALYSIS OF TRADE-OFFS, UNCERTAINTY, AND SENSITIVITY IN FISHERIES MANAGEMENT DECISION MAKING by Maryam Booshehrian B.Sc. Sharif University of Technology, 2005 M.Sc., Sharif University of Technology, 2007 a Thesis submitted in partial fulfillment of the requirements for the degree of Master of Science in the School of Computing Science Faculty of Applied Sciences c Maryam Booshehrian 2012 SIMON FRASER UNIVERSITY Fall 2012 All rights reserved. However, in accordance with the Copyright Act of Canada, this work may be reproduced without authorization under the conditions for “Fair Dealing.” Therefore, limited reproduction of this work for the purposes of private study, research, criticism, review and news reporting is likely to be in accordance with the law, particularly if cited appropriately. APPROVAL Name: Maryam Booshehrian Degree: Master of Science Title of Thesis: Vismon: Facilitating Analysis of Trade-Offs, Uncertainty, and Sensitivity in Fisheries Management Decision Making Examining Committee: Dr. Alexandra Fedorova, Associate Professor Chair Dr. Torsten Möller Senior Supervisor Professor, Computing Science Dr. Randall Peterman Co-Supervisor Professor, Resource and Environmental Management Dr. Tamara Munzner Co-Supervisor Professor, Computer Science University of British Columbia Dr. Hao Zhang Internal Examiner Associate Professor, Computing Science Date Approved: ii Abstract In this work, we present an analysis and abstraction of the data and task in the domain of fisheries management, and the design and implementation of the Vismon tool to address the identified requirements. Vismon was designed to support sophisticated data analysis of simulation results by managers who are highly knowledgeable about the fisheries domain but not experts in simulation software and statistical data analysis. The previous workflow required the scientists who built the models to spearhead the analysis process. The features of Vismon include sensitivity analysis, comprehensive and global trade-offs analysis, and a staged approach to the visualization of the uncertainty of the underlying simulation model. The tool was iteratively refined through a multi-year engagement with fisheries scientists with a two-phase approach, where an initial diverging experimentation phase to test many alternatives was followed by a converging phase where the set of multiple linked views that proved effective were integrated together in a useable way. We performed a user study in Alaska among a group of fisheries scientists and managers to examine the usefulness of Vismon in data analysis. To date, several fisheries scientists have used Vismon to communicate with policy makers, and it is scheduled for deployment to policy makers in Alaska. iii to my parents, Fatemeh and Mohammad Hassan whose love and support enlightened the path . to my husband, Mehran without whom this would not be finished . to my daughter, Avina who gave me joy and enthusiasm iv “Data is not information; information is not knowledge; knowledge is not understanding; understanding is not wisdom.” — Clifford Stoll v Acknowledgments I would like to thank my supervisor, Dr. Torsten Möller, for finding a project that suitably matched my passion and kept me enthusiastic during these years. I also would like to thank Dr. Randall M. Peterman for introducing the great world of fisheries science to me. His productive criticism pushed me to think about things more deeply. I also thank Dr. Tamara Munzner. Her experience and knowledge in her field helped me gain greater understanding of the research work. I also thank my labmates in GrUVi and SCIRF lab. They made the lab a more enjoyable place to work in. My many years of study at SFU was special because of the great people working here. I would like thank my parents and my brother, Ahmad, and my sisters, Tara and Mahnaz. Although there were far from me, their understanding and support made me feel they were close-by all the time. And finally, I am truly thankful to my husband, Mehran. His endless conversations, reviews, and critiques were of great assistance in my research. This thesis is based on the following paper: • Maryam Booshehrian, Torsten Möller, Randall M. Peterman, and Tamara Munzner. ”Vismon: Facilitating Analysis of Trade-Offs, Uncertainty, and Sensitivity In Fisheries Management Decision Making”. Computer Graphics Forum (Proceedings of Eurographics Conference on Visualization 2012 (EuroVis 2012)), vol. 31, no. 3, pp. 1235-1244. The paper received the second best paper award in EuroVis 2012, in June 2012. vi Contents Approval ii Abstract iii Dedication iv Quotation v Acknowledgments vi Contents vii List of Tables ix List of Figures x 1 Introduction 1 2 Fisheries Background 3 2.1 Fisheries Policy Making . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3 2.2 Fisheries Science . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 3 Data and Task Analysis 6 3.1 The AYK Simulation Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 3.2 Data Abstraction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9 3.3 Analysis Task . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9 3.4 Design Requirements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12 vii 4 Related Work 14 4.1 Multidimensional Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14 4.2 Parameter Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14 4.3 Uncertainty Visualization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15 4.4 Fisheries Visualization Tools . . . . . . . . . . . . . . . . . . . . . . . . . . . 16 5 Vismon Interface 17 5.1 Constraint Pane . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19 5.2 Contours Pane . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23 5.3 Trade-offs Pane . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28 5.4 General Functionality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30 5.5 Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32 6 Case Study 6.1 36 AYK Chum Salmon . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36 7 Process and Validation 46 7.1 Two-Phase Process . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46 7.2 Lessons Learned . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48 7.3 Staged Deployment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51 7.4 Validation and Initial Deployment Outcome . . . . . . . . . . . . . . . . . . . 53 7.4.1 Initial Deployment in Alaska . . . . . . . . . . . . . . . . . . . . . . . 53 8 Conclusion and Future Work 64 Appendix A Questionnaire 66 Appendix B Vismon Versions 74 viii List of Tables 3.1 Analysis tasks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10 7.1 Q3 Results: The participants in the workshop were either managers or fisheries scientists. Only one participant did not give his/her occupation. 7.2 . . . . 54 Q4 Results: Division/Section of respondents, who came from Alaska Department of Food and Game (ADF&G), U.S. Fish and Wildlife Service (USFWS), or universities. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55 7.3 Q6 : Previous tools or methods used to carry out the kind of analysis supported by Vismon. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56 7.4 Q7 and Q8 Results: Number of management options and indicators in fisheries analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57 ix List of Figures 3.1 The Arctic-Yukon-Kuskokwim (AYK) region of Alaska, U.S.A. . . . . . . . . 3.2 The simulation model used by Vismon takes two inputs, known as manage- 7 ment options, and produces groups of output, called indicators. This example has three groups of indicators, each with a median, average, CV value, and a percentage of simulated years with undesirable behaviour. The data underlying each summarized indicator are from 500 Monte Carlo trials. 3.3 . . . . . . 8 The AYK model has two management options: Escapement target (X axis) and Harvest rate (Y axis). The plots show the contours of the statistical summaries of the 3 indicators in the high-level dataset. The indicators are Escapement (top row), Subsistence catch (middle row), and Commercial catch (bottom row). The statistical measures are Median, Average, CV, and %, from left to right, respectively. . . . . . . . . . . . . . . . . . . . . . . . . . . 13 5.1 The Vismon interface: (a) Constraint pane, top-left, shows the list of management options and indicators in separate tabs; (b) Contour Plot Matrix pane, top-right, shows the contour plots of indicators as functions of the two management options and supports scenario selection; (c) Trade-offs pane, bottom, shows detail with the indicators for the selected scenarios; (d) Separate sliders are assigned to management options and indicators in the Constraint pane. . 18 5.2 The Constraint pane: (a) Management Options tab shows bidirectional sliders and two text fields for each management option; (b) Indicators tab shows single handle sliders and one text field for each indicator. The handle’s small flag and the label Best show the desirable direction for the indicator (i.e., either minimum or maximum value). x . . . . . . . . . . . . . . . . . . . . . . 19 5.3 (a) Moving the management option slider results in (b) straight line sweeping out horizontally or vertically to change the active region of the contour plots. 5.4 (a) Moving the indicator slider results in (b) complex and non-obvious shapes for the active region. 5.5 20 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21 The Constraint pane sliders become scented widgets showing histograms on demand. (a) MC Trials shows the distribution of all indicator values across all Monte Carlo trials (500 in our example data). (b) Selecting a scenario shows its distribution vs. the full distribution, log-scaled vertically. (c) Probabilistic Objectives allows the user to set a second probabilistic constraint. (d) Any histogram can show the cumulative density function rather than the direct probability density function. 5.6 . . . . . . . . . . . . . . . . . . 22 The Contours pane shows a contour plot matrix of two-dimensional plots for each active output indicator. Numeric legends on contour lines are automatically shown when the plot size is sufficiently large. 5.7 . . . . . . . . . . . . . . 24 A contour plot can (a) be turned off (by selecting Delete plot) and (b) be turned on with a right-mouse popup menu. . . . . . . . . . . . . . . . . . . . 25 5.8 Several colourmaps are build in Vismon for colouring the contour plots. The contour plots are coloured by default with the first sequential blue-white colourmap. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26 5.9 Indicators tab in the Contours pane shows the list of selected indicators. . . 27 5.10 Scenarios tab in the Trade-offs pane shows the list of selected scenarios. Here, the user can (a) change the default colour of a scenario, (b) make a scenario invisible or visible, (c) select a scenario by clicking on its associated row, (d) delete a scenario, or (e) create a new scenario by typing the values of management options. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28 5.11 Contour plots can show (a) small points showing the underlying 11×11 grid of our example data that is the basis for isocontour interpolation, (b) the points size coded in two directions to summarize the underlying uncertainty as two numbers, (c) a histogram showing the full probability distribution for the point under the cursor, (d) the X’d out region from probabilistic constraints, for comparison to the greyed-out region from the deterministic constraints. . 29 xi 5.12 The Trade-offs pane shows a detailed assessment of the trade-offs between the selected scenarios. Coloured bars are associated with scenarios whereas the grey bars are associated with the scenario represented by the cross-hair’s current location on a contour plot. . . . . . . . . . . . . . . . . . . . . . . . . 30 5.13 The plots in the Trade-offs pane can be grouped in two modes: (a) the default mode is to group outputs by indicator, showing one plot for each indicator with a different coloured bar for each scenario; (b) the second mode is to group outputs by scenario, having one plot for each scenario with the bar heights showing all of its indicators. In Group by Scenario mode, the bar heights are normalized. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31 5.14 Plots in the Trade-offs pane can be sorted by the value of any indicator: (a) when the plots are grouped by indicator, the bars in all of them are sorted by the value of the Sort by indicator; in this case Average commercial catch. The bars for the selected indicator are displayed in the increasing order of heights; (b) when the plots are grouped by scenario, the plots are sorted by the value of the Sort by indicator, rather than the default based on the order in which the scenarios were created. The selected indicator is a grey outline box. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32 5.15 The Trade-offs pane can show uncertainty information on bars in four ways. (a) None. (b) Error bars. (c) Box plots. (d) Shaded distributions. . . . . . . 33 5.16 The Trade-offs pane shows small multiples of three possible plot types: (a) Bar charts; (b) Multipodes plot; (c) Star glyph. . . . . . . . . . . . . . . . . 33 5.17 Vismon enables the users to create scatterplots of any two indicators. Coloured dots are related to user-selected scenarios. Grey dots are the combination of management options that fall into the acceptable region, whereas white dots are the combinations that fall into the unacceptable region given the constraint limits. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34 5.18 Users can add reference lines to contour plots at specific values. (a) The Reference Lines window in which the user can add reference lines; (b) Reference lines are shown on contour plots. . . . . . . . . . . . . . . . . . . . . . . . . . 35 6.1 On startup, no constraint has been set, so there is no grey, inactive region on contour plots. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38 xii 6.2 After setting a constraint, the active region has an irregular shape. . . . . . 39 6.3 A second constraint shrinks the region more. . . . . . . . . . . . . . . . . . . 39 6.4 The third constraint shrinks the active region again. . . . . . . . . . . . . . . 40 6.5 Clicking on Contour plots selects scenarios. The scenarios are displayed in the Trade-offs pane. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40 6.6 Any contour plot can be turned off using a menu available through a rightmouse click. 6.7 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41 The plots in the Trade-offs pane are switched to the box plot uncertainty display. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41 6.8 The plots in the Trade-offs pane are switched to the shaded distribution uncertainty display. 6.9 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42 The plots are grouped by scenario. Labels for the bars appear on mouseover. 42 6.10 The bars are sorted by the Average commercial catch indicator. In the Constraint pane, the MC Trials histograms are turned on. . . . . . . . . . . 43 6.11 Choosing the red scenario shows that distribution against the rest at log scale. 43 6.12 The histograms on Contour plots are turned on. A new light blue scenario is selected based on the floating histograms. . . . . . . . . . . . . . . . . . . . . 44 6.13 The user turns on the second set of histograms underneath the sliders in the Constraint pane, and sets a probabilistic limit to the Average commercial catch. The Contour plots now have many crossed out locations on top of the deterministic constraints (greyed out regions) due to the probabilistic constraint. 7.1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45 The first-phase prototype of Vismon was developed in a diverging phase. It included many views that could be created by the user. During that phase, we gathered feedback for the target users about various views, and concluded that the users were showing a tendency to use contour plots and scatterplots, as these were the most familiar to the target users. In addition, the Tradeoffs pane (top-left) was showing the scenarios in separate window which was overlapping Vismon’s main window. . . . . . . . . . . . . . . . . . . . . . . . 47 xiii 7.2 The Overview pane versus the Constraint pane: (a) In the first-phase prototype of Vismon, the Overview pane (renamed to Constraint in the later versions) contained the list of management options as well as indicators. The constraint sliders for indicators were two-way, allowing the user to add constraints to both minimum and maximum values. (b) In the final version of Vismon, the Constraint pane has two separate tabs, one for management options and the other for indicators. The two-way sliders in the earlier Overview pane are substituted with one-way sliders, because there is no need to have both minimum and maximum constraints. 7.3 . . . . . . . . . . . . . . . . . . . 49 The first-phase prototype of Vismon included a correlation matrix. It showed the correlation value of every two indicators in the cell connecting the two indicators. The user was able to see the scatterplot of any two indicators, instead of their correlation value by clicking on their connecting cell. The colour coding encoded the “goodness” of the correlation between the two indicators, based on the maximum/minimum preferability. 7.4 . . . . . . . . . . 50 Parallel coordinates were among the many views included in the first-phase prototype of Vismon. However, they were rejected by the target users. 7.5 . . . 51 (a) The Probability Statement plot was an experimental visual encoding of uncertainty. It was a two-dimensional plot with the horizontal axis showing indicator values, from minimum to maximum, and the vertical axis showing probabilities from 0 to 1. A pixel, had a greyscale value indicating the percentage of management option combinations that satisfied the axis as the boundary value, and the y axis value as the probability value. This plot proved to be complicated. (b) The Probability Objectives histogram is substituted the complex Probability Statement plot. It shows only a single one of those curves. The Probability Statement plot in (a) is annotated with a red line to show where the Probability Objectives histogram (shown in (b)) is situated. 7.6 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52 Q9 Results: Usefulness of 9 visual encodings on a 5-point Likert scale. On the left is 1, very valuable; on the right is 5, not very valuable. High frequency on the left indicates the plot being more valuable, but on the right indicates the plot being less valuable. . . . . . . . . . . . . . . . . . . . . . . . . . . . 61 xiv 7.7 Q10 Results: Usefulness of Vismon in data analysis on a 5-point Likert scale. On the left is 1, strongly agree; on the right is 5, strongly disagree. Note that high frequency on the left indicates more agreeing, and on the right indicates more disagreeing. 7.8 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62 Q11 Results: How Vismon improved the communication between different groups of interest on a 5-point Likert scale. On the left is 1, strongly agree; on the right is 5, strongly disagree. Note that high frequency on the left indicates more agreeing, and on the right indicates more disagreeing. xv . . . . 63 Chapter 1 Introduction Fisheries management involves the regulation of fishing to balance the interests of groups of people who want to catch fish now, with the goal of sustaining viable fish populations for the future. Policy makers make these regulatory decisions in consultation with fisheries scientists, who run extensive computer simulations informed by real-world data. The sophistication of these simulations has steadily increased, and they now generate complex multi-dimensional datasets. The latest simulation models are stochastic and time-dynamic and reflect the variation in environmental influences and uncertainties of many processes [?]. The greatest need in this domain is to analyze the relationship between the simulation inputs and outputs. That is, there is a need to quantify tradeoffs among the simulation outputs in the form of outcome indicators, which are produced by the set of management options that are part of the simulation inputs. A second goal is to support sensitivity and uncertainty analysis; that is, to understand when small changes in inputs lead to relatively large changes in outputs. However, the sophistication of visualization support for the analysis of, and communication about, these model outputs lags far behind, leaving an unmet need for more effective interactive visualization tools. Most analysis is done with simple individual plots: viewing time series plots with multiple curves, individual scatterplots to see correlations, and contour plots for overview summary information. The need to link between these plots has long been recognized; decades ago Peterman proposed doing so manually with carefully aligned paper printouts of multiple plots and physical transparency printouts with multiple crosshairs [?]. Obviously, these manual methods are inadequate for the complex datasets of the present. 1 CHAPTER 1. INTRODUCTION 2 The contribution of this work is two-fold. We present an analysis and abstraction of the data, tasks, and requirements for this fisheries management domain, elucidating the similarities and differences from other domains that also involve multi-dimensional analysis of the relationship between input and output dimensions. We also present the design and implementation of the Vismon interactive visualization tool. It supports (a) sensitivity analysis to check whether small changes in input result in small or large changes in output, and (b) constraint-based analysis to rule out parts of the input space based on constraints in the output space. Vismon supports the (c) analysis of tradeoffs between indicators globally distributed in the output space, whereas many previous systems only support analysis within a local neighbourhood. It allows users to (d) incorporate uncertainty information into their analysis at many levels of detail, with a staged introduction on demand that allows, but does not mandate, drilling down to the full complexity of the underlying probability distributions computed by the stochastic model. Both the abstractions and design were developed and validated iteratively, through a long-term close collaboration [?] with fisheries scientists. The main focus of this research is to perform a design study in the domain of fisheries management and decision making. A design study is a problem-driven research, where the visualization researchers analyze a particular real-world problem, design a visualization tool for the domain experts to help solve the real-world problem, validate the solution, and finally reflect about lessons learned to improve the process of the design [?]. In Chapter 2, we discuss the background information about fisheries policies and science, followed by the fisheries tasks and data analysis in Chapter 3. After reviewing the related work in Chapter 4, we continue with a description of the Vismon design and interface in Chapter 5, and afterwards, we describe an example of using Vismon with a case study in Chapter 6. In Chapter 7, we review the iterative design and validation process for Vismon. We conclude with a discussion of future work in Chapter 8. Chapter 2 Fisheries Background We begin with background information about the policy making context of decision-making support for fisheries, and continue with a summary of the scientific problems faced by fisheries decision makers. 2.1 Fisheries Policy Making Fisheries management is in the midst of major changes, both in terms of the policy making situation and in terms of physical and biological conditions. For setting policy, there is greater stakeholder involvement (known as multi-stakeholder consultation) and a stronger expectation of accountable decision making than in the past. Institutional roles are also shifting. Fisheries science was once the uncontested domain of the government and academia, but is now being undertaken by other groups including nongovernmental conservation groups and the fishing industry itself. Physically, the impact of climate change means that the processes being studied have moving baselines. Efforts to interpret cause and effect are often confounded by the reality of multiple simultaneous changes in both natural and human processes. The goals of policy makers in fisheries management have evolved over time, in three major waves. The first was a command and control model : “we have the data, and we decide”. The second model, decide, announce, defend, grew out of a push for public consultation, but that did not go far enough for meaningful input from the public. The current model is multi-stakeholder consultation with the new idea of including many stakeholders in the analysis process itself. Thus, the current goal of policy makers in fisheries management is to 3 CHAPTER 2. FISHERIES BACKGROUND 4 make well-informed decisions based on large amounts of quantitative information. The new need is to support not only communication to, but also analysis with, multiple stakeholders. The fisheries scientists need to provide fisheries managers with extensive quantitative information to support decision making, via simulation informed by monitoring efforts. The managers choose actions to best meet management objectives, such as maintaining sufficient spawning fish and allocating the allowed catches among the competing interest groups. The stakeholders include environmentalists and three different fishing interest groups: the commercial fishing industry, subsistence fishing communities (both First Nations and other groups), and recreational anglers. 2.2 Fisheries Science The goal of fish stock assessment is to quantitatively estimate potential outcomes of contemplated management actions. Simulation has been extensively used for stock assessment for decades, with increasing complexity and sophistication [?, ?, ?, ?]. Current stochastic simulation models take into account several different sources of uncertainty and risk. Stochastic dynamic models of the natural population implicitly encode a range of alternative hypotheses, taking into account the scientists’ limited understanding of the structure and function of aquatic systems and the variability of natural environments. These stochastic models also take into account the challenges of estimating probabilities for uncertain quantities given current monitoring capabilities; real-world data collection (based on harvesting and assessment data) inevitably involves observation and sampling error. The stochastic evaluation of the performance of management options also takes into account both natural variation in catchability and the fact that there may be imperfect compliance with governmental control of human behaviour through regulation. A final challenge is communicating complex technical information to decision makers and the public, that is, conveying assumptions, results, and implications to people not actively involved in the analyses. Peterman describes a high-level risk assessment framework to use when addressing these complex problems through simulation [?]. The first step is to define management objectives along with indicators such as catch variation over time that can measure how well objectives are met. Indicators might have an associated undesirable probability of falling below some critical limit. The second step is to consider several management options explicitly by stating them as inputs to the simulation model. The third step is to run the stochastic model with CHAPTER 2. FISHERIES BACKGROUND 5 a wide range of hypotheses expressed as parameter values and alternative submodels used inside the simulation. Finally, a sensitivity analysis can be done to see the effects of the different assumptions on the outcomes. In this paper, we refer to the managers and stakeholders as non-expert users. Although they are experts in the fisheries domain, they are non-expert users of simulation software and statistical data analysis. In contrast, the fisheries scientists are expert users, as they are intimately familiar with the data analysis of the simulation systems they have created. The goal of Vismon is to make the data analysis workflow commonly employed by expert users accessible to non-expert users without needing the guidance of the expert user. Vismon is the result of a multi-year two-phase design process in collaboration with various expert users. Chapter 3 Data and Task Analysis We now discuss the data, workflow, and tasks for this design study using an example, moving from the domain-specific details to abstractions. 3.1 The AYK Simulation Model The specific simulation model used as a concrete driving example in this paper is for chum salmon populations in the Arctic-Yukon-Kuskokwim (AYK) region of Alaska, U.S.A. [?] (Figure 3.1). The motivation for this particular model was concern with the large and rapid decreases in abundance in the late 1990s. The three major stakeholder interests at play in this region are sustainability of salmon populations, commercial fishing revenue, and subsistence fishing. One input option used to help meet managers’ objectives is the escapement target, which is the desired number of spawning salmon, that is, the number that “escape” being fished. The other is the harvest rate, the number of fish that the combination of the commercial and subsistence harvesters should catch after the escapement target is met. Each option is set to one of 11 levels; thus, the simulation covers 121 combinations of these input parameters. Any combination of two options is called a scenario. If the user selects a scenario that has not been explicitly simulated, we use bilinear interpolation to predict the outcome of that scenario based on the closest four neighbouring simulated scenarios. The simulation output is in the form of 12 indicators for each scenario, grouped into 3 categories: escapement, subsistence catch, and commercial catch. Each simulation run covers a 100-year time period, and the indicators are statistical measures to characterize the 6 CHAPTER 3. DATA AND TASK ANALYSIS 7 Figure 3.1: The Arctic-Yukon-Kuskokwim (AYK) region of Alaska, U.S.A. results in each category with four output numbers: the average, median, temporal coefficient of variation (CV, or standard deviation divided by the average), and a risk measure expressed as the percentage of years that something undesirable happened during the 100 simulated years. The risks for each respective category are when the run size is below the escapement target, when the subsistence fishery is below the lower quartile of historical catches, or when no commercial fishing is allowed. The stochastic simulation carries out 500 Monte Carlo trial runs for each scenario. The full dataset has 726,000 data elements in total, from the product of 121 scenarios, 12 indicators, and 500 runs. The high-level dataset of 1452 elements aggregates the values over the 500 runs into a single number, either the average or the median. We use the term underlying uncertainty to mean the information contained in the full dataset rather than CHAPTER 3. DATA AND TASK ANALYSIS 8 the high-level aggregate dataset. In fisheries science, indicators of outcome need to be either maximized or minimized, but not both. Each of the twelve indicators has a direction of desired change as associated metadata, in addition to its set of quantitative values. The simulation output includes both the average and median so that the analysts can easily see whether these quantities are similar, indicating symmetric distributions. When they are different, their relative values give a quick sense of the direction of the distribution’s asymmetry. Figure 3.2: The simulation model used by Vismon takes two inputs, known as management options, and produces groups of output, called indicators. This example has three groups of indicators, each with a median, average, CV value, and a percentage of simulated years with undesirable behaviour. The data underlying each summarized indicator are from 500 Monte Carlo trials. CHAPTER 3. DATA AND TASK ANALYSIS 3.2 9 Data Abstraction Simulation models are one particular example of multi-dimensional models where the dimensions are divided into two classes: inputs and outputs. The independent input dimensions to the simulation are known as the management options, or options for short (2 in the AYK model). The dependent output dimensions are known as indicators (12 in the AYK model). The simulation generates the outputs given the inputs by running hundreds of Monte Carlo trials (500 in the AYK model). This full underlying dataset is summarized by a few statistical measures for each indicator (see Figure 3.2). The design target for Vismon is two input dimensions and ten to twenty output dimensions. This abstraction covers a significant and interesting part of the possible design space, but not all of it. Our collaborators do not currently run simulations with more than two input dimensions, and do not anticipate needing more than a few dozen indicators in total to reflect the interests of the major stakeholder groups. This choice to constrain the tool to supporting few inputs and a moderate number of outputs has many design implications; for example, it is possible to provide overviews of the entire input space and output space without recourse to complex user interactions for selecting which subsets to view or for exploring only local neighbourhoods. Simulation run times do not affect the interactivity of Vismon, which loads a precomputed static data file, which is the output file from the stochastic model. 3.3 Analysis Task Our collaborators, who are expert users, are working with very mature simulation models that have already been extensively refined and validated [?], so their interest is in using this model rather than building or verifying it. The analysis workflow established by the scientists contained a detailed list of domain-specific subtasks: • W1: summarize a large number of simulations, • W2: add constraints on value ranges for simulation input (options) and output (indicators) based on stakeholder interests, • W3: select a few candidate combinations of options, • W4: quantify trade-offs in indicators between selected options, CHAPTER 3. DATA AND TASK ANALYSIS 10 The selection and trade-off analysis should allow the user to observe the following constraints: • C1: avoid sensitive regions of the parameter space where small changes of options yield large changes of indicators, • C2: avoid options with high underlying uncertainty across all Monte Carlo trials. At the start of our collaboration, Peterman’s group was actively engaged in the analysis of simulation results, as they had been for years. Their previous analysis procedure was to use a wide range of individual plots generated with scripts for R and other similar packages. The need to link between these plots has long been recognized; decades ago Peterman proposed doing so manually with carefully aligned paper printouts of multiple plots and physical transparency printouts with multiple crosshairs [?]. In addition, although scripts for general-purpose frameworks are a powerful and flexible way to create nearly any individual view showing details at a low level, they require users to know exactly what to specify in advance. As well, such details were accessible to non-expert users only with close collaboration with scientists. Hence, the overarching goal of our collaborating scientists was: • G1: enable scientists to communicate simulation results to policy makers. • G2: to make this workflow accessible to the non-expert users without the scientists’ involvement. Table 3.1 contains the list of subtasks required for the analysis of our collaborators’ data. Workflow tasks W1: summarization Constraint tasks C1: sensitivity analysis W2: constraining C2: trade-off analysis W3: selection W4: trade-off quantification W4: trade-off quantification W4: trade-off quantification Table 3.1: Analysis tasks Collaboration goals G1: collaboration between scientists and policymakers G2: non-expert user involvement W4: trade-off quantification W4: trade-off quantification CHAPTER 3. DATA AND TASK ANALYSIS 11 In addition, in the past, only a tiny fraction of the information theoretically available in the dataset was actively considered in the analysis process. The scientists and managers were buried by the quantity of information put out by the models. Essentially, they picked a few points in the parameter space through trial and error and ignored the rest, because they did not have a systematic way to explore the information. Their view of the dataset was narrowly focused; they lacked high-level overviews and other ways to easily synthesize information across a combination of low-level detail views. Exploring the dataset at the level of the aggregate statistical measures (see Figure 3.2 and Figure 3.3) was very difficult, and understanding the underlying uncertainty expressed in the full details of the Monte Carlo runs was even more challenging. Their analysis procedure was most successful in supporting the first two subtasks of summarization (W1) and adding constraints (W2) at a basic level. A very small set of candidate management scenarios was picked (W3) based on past experience, rather than through a data-driven exploration of the simulation output. While the high-level trade-offs were well known to the scientists and managers, quantifying them for any specific combination of choices was difficult because the relationships are nonlinear (W4). Quantifying trade-offs involved a great deal of cognition and memory, with only minimal help from their perceptual system, in order to synthesize information across multiple individual views. Avoiding sensitive regions (C1) required a great deal of trial and error. Inspecting an individual contour plot showing the values for one indicator could show them regions of rapid change for that indicator where the contour lines were closely spaced, but synthesizing a mental model across all of the indicators was not well supported by the available methods of analysis. Understanding the complexity of the Monte Carlo trials (C2) was also not easily addressed; the scientists typically just worked with the averages because the full dataset was too overwhelming. Communicating results (G1) was only partially addressed. Although their process did support some level of communication between scientists, simulation results were very difficult for policy makers to understand and extremely challenging for stakeholders and the public to grasp (G2). Considering their tasks at a generic level, we conjectured that many useful scenarios might not even be considered in the candidate set, and conversely that too much analysis time was being spent exploring candidates later found to be unsuitable. Our goal was to allow managers to explore the data on their own and together with different stakeholders. Hence, we needed to create an interface that would allow anyone to quickly find relevant CHAPTER 3. DATA AND TASK ANALYSIS 12 scenarios and to compare several of them. 3.4 Design Requirements Two key design requirements were to allow all workflow tasks (W1-W4) and constraints (C1, C2) to be carried out within a single tool, and to support interleaving these tasks in any order. For example, the tool should allow uncertainty information to be incorporated into the process of selecting of the candidate actions. The requirement of encouraging but not forcing uncertainty analysis led us to the design goal of creating views that could all be used in a straightforward way with only the information from the high-level simplified dataset. The combination of these two goals led us to a strategy where views could be augmented with information from the full underlying uncertainty dataset at different levels of complexity. For example, a user should be able to start exploring scenarios using only the high-level average values, and then later incorporate uncertainty information to see which of them are uncertain or risky. The user should also be able to include that information in the initial exploration, so that when faced with scenarios that have the same average values, they can prefer the more certain ones. Figure 3.3: The AYK model has two management options: Escapement target (X axis) and Harvest rate (Y axis). The plots show the contours of the statistical summaries of the 3 indicators in the high-level dataset. The indicators are Escapement (top row), Subsistence catch (middle row), and Commercial catch (bottom row). The statistical measures are Median, Average, CV, and %, from left to right, respectively. CHAPTER 3. DATA AND TASK ANALYSIS 13 Chapter 4 Related Work We will review related work in the area of multidimensional analysis, followed by work that deals with the problem of parameter estimation. We will briefly discuss uncertainty visualization, and also fisheries visualization tools. 4.1 Multidimensional Analysis The previous work most relevant to Vismon pertains to systems for multi-dimensional analysis. One of the earliest systems for exploring multi-dimensional parameter spaces was HyperSlice [?]. van Wijk and van Liere suggested using a SlicePlane matrix to summarize the n-dimensional space through 2D slices. Their idea was later extended as the Prosection Matrix [?]. Contrary to HyperSlice, in Prosection Matrix, an (n-2)-dimensional sub-space is summarized into a single plot instead of a matrix of 2D slices. These systems required complex navigation of high-dimensional input spaces that is not necessary given our target of only two input dimensions, and they did not support uncertainty or trade-offs analysis. 4.2 Parameter Estimation During recent years, researchers in the visualization community have been actively engaged in analysis of high-dimensional parameter spaces. The goal is to find a combination of parameters that best meets the objectives of the users. The inspirational works of Piringer et al. [?] and Berger et al. [?] address some of the same problems as Vismon. We share some aspects of their solutions in HyperMoVal and 14 CHAPTER 4. RELATED WORK 15 its follow-on system, including our choice of linked views in general, and sensitivity analysis via contour plot matrices in particular. Although trade-off analysis was not supported by HyperMoVal [?], it was added in their follow-on work [?]. They used a Pareto frontier [?] technique that allowed the user to change the position of a single scenario in order to understand the trade-off of two indicators, supporting local trade-off analysis. Similarly, HyperMoVal does not support uncertainty analysis at all, whereas the follow-on system does provide simple uncertainty visualization such as box plots. A major difference between these two systems and Vismon is their support for validating simulations by comparing measured data to simulated data, a task that is not required in our domain. Another is their emphasis on navigating the space of many input dimensions, adding complexity that is not necessary for our target of only two inputs. Two previous systems have focused on parameter selection for image segmentation algorithms [?, ?]. Both of these systems handle the case of many input dimensions rather than many output dimensions, for a key difference from Vismon. Pretorius et al. [?] support sensitivity analysis for this complex case by presenting segmented images at the leaves of a tree encapsulating input parameter settings; users can make visual judgements of the difference in quality between nearby segmentations in local neighbourhoods. Their system does not support uncertainty analysis. On the other hand, the Tuner system [?] does handle sensitivity analysis in a similar fashion to Vismon, but only with two outputs rather than many. Tuner’s trade-off analysis is constrained to a local analysis, and uncertainty analysis is covered only to a limited extent. A strength of both systems is the incorporation of sampling into the analysis pipeline; however, in Vismon the simulation of the data and its analysis are decoupled, so handling sampling issues is not a requirement for our problem. A number of other systems, such as Design Galleries [?] and the more recent FluidExplorer [?] use clustering to present a comprehensive overview of the variations in the high-dimensional output space. However, neither sensitivity analysis, trade-offs analysis, nor uncertainty is covered in their systems. 4.3 Uncertainty Visualization Due to its uncertain nature, understanding scientific data is a challenging task. Imperfect data [?] affects the process and result of decision making [?]. Johnson et al. and MacEachren CHAPTER 4. RELATED WORK 16 et al. give a comprehensive overview of different techniques available for uncertainty visualization [?, ?]. Error bars and box plots are part of the standard visual encodings for uncertain data [?, ?, ?, ?]. Both techniques summarize data distribution, and are useful for comparison of distributions. To convey more information, variations of box plots are proposed by adding different density descriptions, such as skewness and modality, to standard box plots [?, ?]. Jackson proposed a density strip [?] to show the uncertainty information in full detail by using a saturation map that encodes the full distribution as normalized density. The highlevel aggregate number is not the most visually salient aspect of the display, but this visual encoding conveys the most information. 4.4 Fisheries Visualization Tools Fisheries scientists usually provide information for fisheries managers using standard sets of outputs, which include time series projections of future states of the fishery system and numerous summary statistics in tables and simple, individual graphs such as scatterplots and contour plots [?, ?]. Static graphs are typically generated by either data analysis tools, such as Excel, or statistical tools, such as R. Having many indicators and numerous scenarios to consider makes the analysis sophisticated and complex. To the best of our knowledge, fisheries scientists lack effective interactive visualization tools to cope with their high-dimensional data. The current manual methods are inadequate for the complex datasets of the present. Previous work on specific visual encoding and interaction techniques used in Vismon is discussed in the context of the design decisions covered in Chapter 5 and Chapter 7. Chapter 5 Vismon Interface In this section we explain the design decisions that went into the creation of Vismon. It is to be seen as complementing the walkthrough of an analysis as performed by a policy maker, which is described in Chapter 6. Vismon is built using multiple linked views, as shown in Figure 5.1. The three main data abstractions used in Vismon are management options (the two input dimensions), indicators (the many output dimensions), and scenarios (a specific combination of the two input options, each of which has associated with it a value for each output dimension). Each of the three main views has a different visual encoding to emphasize different aspects of these elements and the relationship between them. The colour coding for scenarios is the same across all these main views. Each main view is itself composed of small-multiple charts, with linked highlighting between analogous items on mouseover. All views provide ways to show uncertainty at multiple levels of complexity on demand, but do not force uncertainty analysis on users who want to start simply. The Constraint pane on the left (Figure 5.1(a)) has sliders that show ranges for, and allow constraints to be imposed on, individual options and indicators (Figure 5.1(d)), with optional histograms showing the underlying data distributions for a richer view. The location of the average or median of each scenario with respect to the value ranges is shown with a coloured triangle. The Contours pane on the right (Figure 5.1(b)) has a contour plot matrix with one plot for each active indicator showing values with respect to options axes, with scenarios shown as coloured dots. The Trade-offs pane on the bottom (Figure 5.1(c)) shows details about indicator values for the active scenarios through rectangular or radial charts. The view can either show a chart for each scenario with marks for indicators, or vice versa. 17 Figure 5.1: The Vismon interface: (a) Constraint pane, top-left, shows the list of management options and indicators in separate tabs; (b) Contour Plot Matrix pane, top-right, shows the contour plots of indicators as functions of the two management options and supports scenario selection; (c) Trade-offs pane, bottom, shows detail with the indicators for the selected scenarios; (d) Separate sliders are assigned to management options and indicators in the Constraint pane. CHAPTER 5. VISMON INTERFACE 18 CHAPTER 5. VISMON INTERFACE 5.1 19 Constraint Pane The Constraint pane shows a tab for the management options and a tab for indicators, both as one-dimensional ranges (see Figure 5.2). In both cases, the base small-multiple view shows a range slider [?, ?], with both a moveable handle for quick interactive positioning and a text box for precise numerical entry when the user knows a value of interest in advance. The sliders allow the user to restrict the active range of any input option or output indicator to avoid unacceptable regions, which changes the shape of the shaded permissible scenario region in the Contours pane plots. (a) (b) Figure 5.2: The Constraint pane: (a) Management Options tab shows bidirectional sliders and two text fields for each management option; (b) Indicators tab shows single handle sliders and one text field for each indicator. The handle’s small flag and the label Best show the desirable direction for the indicator (i.e., either minimum or maximum value). The results of moving the input option sliders are not surprising; a straight line sweeps CHAPTER 5. VISMON INTERFACE 20 (a) (b) Figure 5.3: (a) Moving the management option slider results in (b) straight line sweeping out horizontally or vertically to change the active region of the contour plots. out horizontally or vertically to change the rectangular size of the active region (see Figure 5.3(b)), because these values correspond with the underlying grid used for both simulation computation and the contour plot display axes. However, changing the undesirable range of the output indicators leads to complex and non-obvious shapes for the active region, as shown in Figure 5.4(b). With just a few minutes of exploring with these sliders, the analyst can get the gist of how constraining the different input and output dimensions affects the set of possible scenarios (helping to achieve goal W2). Options have bidirectional sliders with both a minimum and maximum handle, and two text boxes (see Figure 5.3(a)). Indicator sliders have only a single handle (Figure 5.4(a)), since their directionality is known from the metadata. The label Best appears instead of a text box on the side that is the most desirable direction, and the handle also has a small flag pointing in that direction as a subtle visual cue. The plain sliders for the indicators convert to scented widgets [?] on demand from the user, showing histograms of distributions in the underlying dataset in order to provide more CHAPTER 5. VISMON INTERFACE 21 (a) (b) Figure 5.4: (a) Moving the indicator slider results in (b) complex and non-obvious shapes for the active region. guidance on what choices to make when setting the ranges (Figure 5.5). There are two choices, either or both of which can be shown. The simpler choice, MC Trials, shows a histogram with the distribution of all values for this indicator across all the Monte Carlo trials. Figure 5.5(a) shows an example for the Average commercial catch indicator, where the slider bar has been moved from the default position of 0 to the value of 100K. We can see this is an indicator where the maximum value is the most desirable because the Best label is on the right side of the slider. The geometric intuition is straightforward: the user can see in advance whether a small or a large part of the distribution will be filtered out when the slider bar is moved to a particular position, rather than using a trial and error process where the slider is moved and then the results are scrutinized. Figure 5.5(b) shows the result of drilling down even further by clicking on the red triangle. The histogram has a coloured overlay allowing the user to compare the distribution of the 500 Monte Carlo trials just for the chosen scenario with that of the full dataset of all simulated scenarios. The Log label appears on the left to show that in this mode the vertical axis is now log-scale rather CHAPTER 5. VISMON INTERFACE 22 (a) (b) (c) (d) Figure 5.5: The Constraint pane sliders become scented widgets showing histograms on demand. (a) MC Trials shows the distribution of all indicator values across all Monte Carlo trials (500 in our example data). (b) Selecting a scenario shows its distribution vs. the full distribution, log-scaled vertically. (c) Probabilistic Objectives allows the user to set a second probabilistic constraint. (d) Any histogram can show the cumulative density function rather than the direct probability density function. than linear, to ensure that the overlay details are fully visible. The more complex choice, Probabilistic Objectives (see Figure 5.5(c)), allows a sophisticated user to reason about all the Monte Carlo trials, not just their average (following constraint C2). It uses a two-part filter with a second slider and histogram. The base slider still sets a limit on the value of the target indicator. The second slider allows the user to set a probabilistic limit corresponding to the percentage of Monte Carlo simulation trials that are above that indicator limit for indicators that need to be maximized, or below for those that need to be minimized. The second slider allows the user to change this probability value interactively from the default of 0%, meaning that no possibilities have been ruled out, up to a higher number. In Figure 5.5(c), the user has set the probability to 80% that the % of years escapement below target indicator is smaller than 50; that is, a high probability of avoiding low escapement. The plots in the Contours pane will show which scenarios have been ruled out by this limit by crossing them out with X’s, as illustrated in Figure 5.11(d). Typically, one can think of probability distributions as cumulative or direct distributions. CHAPTER 5. VISMON INTERFACE 23 By default all histograms show probability directly, but the user can switch them to show cumulative density functions, as in Figure 5.5(d), if this is the preferred mental model. We considered a design change to compress the histograms in the Constraint pane to use less vertical space, so that users could see more augmented sliders at once without the need to scroll. However, our collaborators had strong opinions that current information density is at a good set point, and that making the histograms any smaller would impede their utility. The sliders are not only controls but also displays, even when not augmented by the histograms. They act as legends that document the range of each option or indicator; the sliders have the same visual range on the screen but cover very different regions of data space. They also show the full name for options and indicators, rather than the short names used in the other panes to save space. Most importantly, the scenario triangles show the distribution of the scenarios with respect to these ranges in a high-precision way using spatial position. That distribution would require more mental effort to glean from the Contour plots, where it is encoded more indirectly and with lower precision as the colour of the contour band in which the scenario dot is embedded. 5.2 Contours Pane The Contours pane contains a contour plot matrix (Figure 5.6) that has one two-dimensional plot for each active output indicator. Each contour plot is drawn using Marching Squares [?] based on simulation values given at an 11 × 11 grid. Each (x,y) location in the contour plot represents a scenario. Again, a scenario is characterized by two independent variables, the parameter settings used for the input management option choices, and has many dependent variables, the output indicators. The small-multiple views are linked with a crosshair that appears at the same (x,y) location in each of them when the cursor moves across any of them, and the exact numeric value for the indicator at that point is shown in each title bar. The scenario at the same (x,y) location is also shown in the Trade-offs pane. The cross-hair scenario is displayed with grey to be distinguished from the already selected scenarios, as shown in Figure 5.1. Numeric legends on contour lines are automatically shown when the plot size is sufficiently large, as shown in Figure 5.6; these are different in each plot, since each indicator is separately normalized to the colour map. CHAPTER 5. VISMON INTERFACE 24 Figure 5.6: The Contours pane shows a contour plot matrix of two-dimensional plots for each active output indicator. Numeric legends on contour lines are automatically shown when the plot size is sufficiently large. The plots are all linked to the Constraint pane sliders that provide data-driven constraints on the active region within each of them. All plots have the same two axes of the options, and the demarcation between the coloured active region and the greyed-out restricted region is the same in all. The plots show the high-level dataset: either the average or the median of the underlying simulation runs (500 in our example data) for their indicator. The plots resize dynamically to fit within the pane as it resizes or the number of plots to show changes as indicators are de- or re-activated, so that they are always visible side by side without the need to scroll. By default, all indicator plots are shown; Figure 5.6 shows the full set of 12 in the example dataset. They can be turned on and off with a right-mouse popup menu when the cursor is over a plot (see Figure 5.7), through the control pane which CHAPTER 5. VISMON INTERFACE (a) 25 (b) Figure 5.7: A contour plot can (a) be turned off (by selecting Delete plot) and (b) be turned on with a right-mouse popup menu. is accessible through the Indicators tab (Figure 5.9) on the top right of the pane, or through the Indicator menu. In Figure 5.7, the Median escapement contour plot is turned off and then turned on by right-clicking on the contour plot. The contour plots are coloured by default with a sequential blue-white colourmap that incorporates hue and saturation in addition to luminance in order to make the contour patterns highly visually salient. Several other colour-blind-safe colourmaps are built in (see Figure 5.8), all generated by ColorBrewer [?]. The preferred direction of each indicator (minimum or maximum value) is taken into account so that the dark and highly saturated end is the most preferred value for each. The static array of contour patterns provides an overview of the high-level dataset that is focused on the individual indicators (helping to achieve W1). Moving the cursor across the plot allows fast comparison between the indicator values for a single scenario because of the dynamically linked crosshairs. We chose to keep a contour plot matrix at the heart of the system because such plots were both familiar and effective, and facilitate sensitivity analysis by allowing users to note regions where isocontours fall close together as places to avoid (facilitating constraint C1). One main use of this view is to guide the user in selecting a small set of candidate CHAPTER 5. VISMON INTERFACE 26 Figure 5.8: Several colourmaps are build in Vismon for colouring the contour plots. The contour plots are coloured by default with the first sequential blue-white colourmap. scenarios (part of W3), which can be compared in detail in the Trade-offs pane. Clicking within a contour plot selects the scenario at that point. Its location is marked with a coloured dot in all plots in this pane and a coloured triangle along each indicator range on the Constraint pane sliders. Marks representing selected scenarios are small and show an identifier that is unique for each scenario, so they are coded with high-saturation colours in different hues. This shared colour coding acts as a link across different views. The design target is that users are unlikely to select more than 10 scenarios to inspect in detail at once. A palette of pre-selected highly distinguishable colours is used for the first 10 scenarios, with random colours used for any additional ones. The user can use the colour picker in the Trade-off pane control panel to override these colour defaults (see Figure 5.10). The user can also explore some of the underlying uncertainty data in the Contour pane (facilitating constraint C2), as shown in Figure 5.11. The 11 × 11 grid of our example data through which the contours are interpolated is the set of 121 pre-computed scenarios, which can be shown on demand as small points (Figure 5.11(a)). These points can be size coded with two additional numbers that summarize the underlying Monte Carlo trials in terms of the same 95% confidence interval information that is used for the error bars described in CHAPTER 5. VISMON INTERFACE 27 Figure 5.9: Indicators tab in the Contours pane shows the list of selected indicators. the next section (Figure 5.11(b)). Uncertain regions for a particular indicator across 500 Monte Carlo trials are clearly indicated by large dots, and strongly asymmetric intervals can be seen where the dots have visibly different aspect ratios. The user can also turn on a histogram showing the full distribution over all Monte Carlo trials at the point under the crosshair (Figure 5.11(c)). The histogram updates as the cursor moves, and can be displayed either in the plot containing the cursor or in all of the linked plots. Defining probabilistic constraints (in the Constraint Pane as shown in Figure 5.5(c)) adds X’s to the contour plots. This way, the user can compare the greyed-out region (from the deterministic constraints) to the X’d-out region (from the probabilistic constraints) (Figure 5.11(d)). CHAPTER 5. VISMON INTERFACE 28 Figure 5.10: Scenarios tab in the Trade-offs pane shows the list of selected scenarios. Here, the user can (a) change the default colour of a scenario, (b) make a scenario invisible or visible, (c) select a scenario by clicking on its associated row, (d) delete a scenario, or (e) create a new scenario by typing the values of management options. 5.3 Trade-offs Pane The Trade-offs pane (Figure 5.12) allows a detailed assessment of the trade-offs between a small set of scenarios with a set of small-multiple bar chart plots (supporting W4). These plots support two kinds of analysis (see Figure 5.13). The default mode is to group outputs by indicator, showing one plot for each indicator with a different colored bar for each scenario, allowing easy comparison of how indicators change across scenarios. The opposite mode is to group by scenario, where each plot shows a single scenario with the bar heights showing all of its indicators. Conversely, this mode allows easy comparison of indicator values within a particular scenario, and the profiles of entire scenarios with each other. When the plots are grouped by indicator, the bars in all of them can be sorted by the value of any indicator, rather than the default based on the order in which the scenarios were created. Figure 5.14 shows the bars sorted by Average commercial catch. On the other hand, when the plots are grouped by scenario, the plots can be sorted by the value of any indicator. Here, the order of the plots changes based on the value of the selected indicator, as shown in Figure 5.14. The plots support four different levels of showing the underlying uncertainty information, as shown in Figure 5.15. The simplest possibility is none (Figure 5.15(a)), so that analysts who want to do only high-level analysis are not forced to deal with uncertainty. The default CHAPTER 5. VISMON INTERFACE 29 (a) (b) (c) (d) Figure 5.11: Contour plots can show (a) small points showing the underlying 11 × 11 grid of our example data that is the basis for isocontour interpolation, (b) the points size coded in two directions to summarize the underlying uncertainty as two numbers, (c) a histogram showing the full probability distribution for the point under the cursor, (d) the X’d out region from probabilistic constraints, for comparison to the greyed-out region from the deterministic constraints. error bar mode superimposes a simple error bar showing the 95% confidence interval on top of the mark (Figure 5.15(b)). In box plot mode (Figure 5.15(c)) the high-level aggregate number is still shown explicitly, but with less salience. Both error bars and box plots are part of the standard arsenal of visualization techniques for uncertain data [?, ?, ?, ?]. The shaded distribution mode shows the uncertainty information in full detail (supporting C2) by using a saturation map that encodes the full distribution as normalized density (Figure 5.15(d)); this visual encoding was first introduced by Jackson with the name of density strip [?]. The high-level aggregate number is not the most visually salient aspect of the display, but this visual encoding conveys the most information. CHAPTER 5. VISMON INTERFACE 30 Figure 5.12: The Trade-offs pane shows a detailed assessment of the trade-offs between the selected scenarios. Coloured bars are associated with scenarios whereas the grey bars are associated with the scenario represented by the cross-hair’s current location on a contour plot. The Trade-offs pane provides two other possible plot types. The default choice shown in Figure 5.16(a) is the familiar bar chart, where one-dimensional marks aligned on a horizontal axis encode values with spatial position along a vertical axis. The Multipodes plot in Figure 5.16(b) is a radial alternative where the bars are laid out along a circle. Both the standard bar chart and the radial bar chart can show the underlying uncertainty in analogous ways. The third plot type is the Star glyph, a simpler version of the Multipodes plot that uses a one-pixel width mark instead of a wider bar, shown in Figure 5.16(c). Star glyphs do not support showing uncertainty information as above, but they are a visual encoding that is already familiar to many fisheries scientists. We thus include them as a bridge to the less familiar but more powerful multipodes plots. The highly familiar flat bar chart layout allows fast and highly accurate comparison between bar lengths, but radial layouts may better support focusing on a particular data dimension [?]. The multipodes plot is designed to avoid many of the perceptual problems with previous radial displays [?, ?] by encoding data with length rather than angle, and is an example of the Disconnected Ring pattern in the taxonomy of Draper et al. [?]. 5.4 General Functionality The user can request a separate fully-linked window showing a single large contour plot for any indicator, or select any two indicators to create a separate window with a scatterplot comparing the simulated scenario values linked to the main windows through the colour coding of the dots that represent selected scenarios. Scatterplots are created either from the rightmouse popup menu on contour plots, or from the Constraint pane by clicking CHAPTER 5. VISMON INTERFACE 31 (a) (b) Figure 5.13: The plots in the Trade-offs pane can be grouped in two modes: (a) the default mode is to group outputs by indicator, showing one plot for each indicator with a different coloured bar for each scenario; (b) the second mode is to group outputs by scenario, having one plot for each scenario with the bar heights showing all of its indicators. In Group by Scenario mode, the bar heights are normalized. on the desired indicators. Although scatterplots were only occasionally used in the early prototype, they were sometimes useful for analysis of the Pareto frontier [?]. Figure 5.17 shows the scatter plot of Average commercial catch versus Median escapement. The grey and white dots show all of the combinations of management options, and the coloured dots are the scenarios that are selected by the user. When falling into an unacceptable region given the constraint limits, a grey dot changes to white fill with light-grey borders. This choice enables the user to focus on acceptable scenarios. Users can also add persistent reference lines that appear in all contour plots at specific values of the options. Figure 5.18 shows a contour plot with one horizontal and two vertical dashed reference lines. Vismon supports export for both analysis and communication workflows: the selected set of scenarios can be exported as a comma separated value (CSV) file so that results from a Vismon session can be imported into other analysis tools, and any window can be exported as a PNG image file for inclusion into external documents. In addition, users can save the project file to be reopened for future analysis. The project saves the constraints, as well as CHAPTER 5. VISMON INTERFACE 32 (a) (b) Figure 5.14: Plots in the Trade-offs pane can be sorted by the value of any indicator: (a) when the plots are grouped by indicator, the bars in all of them are sorted by the value of the Sort by indicator; in this case Average commercial catch. The bars for the selected indicator are displayed in the increasing order of heights; (b) when the plots are grouped by scenario, the plots are sorted by the value of the Sort by indicator, rather than the default based on the order in which the scenarios were created. The selected indicator is a grey outline box. the selected scenarios. 5.5 Implementation Vismon is implemented in Java 1.6, with diagrams drawn using custom Java2D graphics code. The tick marks on plot axes dynamically adapt to use the available space [?]. Our implementation achieves interactive response on current hardware with the datasets in use by our collaborators, after start-up preprocessing of a few seconds. CHAPTER 5. VISMON INTERFACE (a) 33 (b) (c) (d) Figure 5.15: The Trade-offs pane can show uncertainty information on bars in four ways. (a) None. (b) Error bars. (c) Box plots. (d) Shaded distributions. (a) (b) (c) Figure 5.16: The Trade-offs pane shows small multiples of three possible plot types: (a) Bar charts; (b) Multipodes plot; (c) Star glyph. CHAPTER 5. VISMON INTERFACE 34 Figure 5.17: Vismon enables the users to create scatterplots of any two indicators. Coloured dots are related to user-selected scenarios. Grey dots are the combination of management options that fall into the acceptable region, whereas white dots are the combinations that fall into the unacceptable region given the constraint limits. CHAPTER 5. VISMON INTERFACE 35 Figure 5.18: Users can add reference lines to contour plots at specific values. (a) The Reference Lines window in which the user can add reference lines; (b) Reference lines are shown on contour plots. Chapter 6 Case Study 6.1 AYK Chum Salmon This case study summarizes an interactive Vismon session using the driving example dataset described in Chapter 3. It incorporates the kinds of information shown in demonstrations by the fisheries scientists. The target user is a fisheries manager in Alaska who hopes to make an informed policy decision by finding a scenario with low uncertainty that best suits her objectives. On startup, no constraints have yet been set, so the entire rectangular region of each plot in the Contours pane is fully coloured to show that it is active (Figure 6.1). The active region consists of all the (x,y) combinations of management options that are acceptable given the constraint limits. The manager decides to add constraints using the sliders in the Constraint pane on the left, to reduce the size of the active region by eliminating scenarios that produce unacceptable values of particular indicators. She first chooses 50% as the maximum allowable percentage of years in which escapement is below the target, and a curvilinear region in the upper right becomes greyed out, as shown in Figure 6.2. She then chooses 30% as the maximum acceptable percentage of time in which subsistence catch is in the lower quartile historically, and Figure 6.3 shows the continuing decrease in the size of the active range. Finally, she sets a minimum of 100,000 fish for the average annual commercial catch, as shown in Figure 6.4. The resulting active region in the contour plots showing the set of feasible scenarios is much smaller than the full original set, thereby simplifying the complexity faced by the manager. Some indicators within the active region are dark blue, showing that 36 CHAPTER 6. CASE STUDY 37 they are in the highly preferred range, while others are the lighter green colour indicating unfavourable values. The manager then explores a few management options by clicking the mouse in a few locations within the active region, and the Vismon window updates to include the Trade-offs pane showing detailed information about those scenarios. The Trade-offs pane shows the cross-hair scenario in grey as the manager changes the location of the cursor, as shown in Figure 6.5. The manager decides that she no longer needs to consider any of the indicators that pertain to median values. After she deactivates those with the rightmouse popup menu, contour plots use the newly available room, as shown in Figure 6.6. Figure 6.7 shows the Trade-offs pane after she switches to showing more detailed uncertainty information as box plots rather than error bars. She then digs even deeper by looking at the shaded distributions (Figure 6.8). By assessing the trade-offs for the chosen scenario, and considering her objectives of having high commercial catch while having low percentage of years with low escapement target and no commercial fisheries, she decides that the red scenario looks promising. She then switches to grouping by scenario rather than indicator and returns to error bars so that she can compare the profiles easily between her five scenarios, as shown in Figure 6.9. She switches back to grouping the charts by indicator. She scrolls down in the Constraint pane to look at the commercial catch indicators, and then sorts the charts in the Trade-offs pane by the Average commercial catch indicator. She also changes the settings in the Constraint pane to show the distributions for each indicator (Figure 6.10). Figure 6.11 shows her view after selecting the red scenario to compare its distribution to the full dataset; the histograms in the Constraint pane are now log-scaled on the vertical axis so that the details are visible. Figure 6.12 shows the result of turning on the histograms in the Contours pane to see details about the full Monte Carlo trials for the scenario under the crosshairs. This additional uncertainty information leads the manager to realize that the new light blue scenario has better trade-offs than the red one. Figure 6.13 shows the display after she turns on the second set of histograms underneath the sliders in the Constraint pane, and sets the probabilistic limit that the Average commercial catch must be more than 100,000 fish in 70% of the Monte Carlo trials. The Contours plots now have many crossed out locations (Figure 6.13), indicating the option CHAPTER 6. CASE STUDY 38 combinations that have been ruled out by this setting for the probabilistic acceptable values. She notes that none of the scenarios are in the crossed out region that is unacceptable according to this probabilistic constraint (Figure 6.13). She concludes that the light blue scenario is indeed the best alternative, given her analysis of the underlying Monte Carlo information in addition to the summarized version of the dataset. Figure 6.1: On startup, no constraint has been set, so there is no grey, inactive region on contour plots. CHAPTER 6. CASE STUDY Figure 6.2: After setting a constraint, the active region has an irregular shape. Figure 6.3: A second constraint shrinks the region more. 39 CHAPTER 6. CASE STUDY 40 Figure 6.4: The third constraint shrinks the active region again. Figure 6.5: Clicking on Contour plots selects scenarios. The scenarios are displayed in the Trade-offs pane. CHAPTER 6. CASE STUDY 41 Figure 6.6: Any contour plot can be turned off using a menu available through a right-mouse click. Figure 6.7: display. The plots in the Trade-offs pane are switched to the box plot uncertainty CHAPTER 6. CASE STUDY 42 Figure 6.8: The plots in the Trade-offs pane are switched to the shaded distribution uncertainty display. Figure 6.9: The plots are grouped by scenario. Labels for the bars appear on mouseover. CHAPTER 6. CASE STUDY 43 Figure 6.10: The bars are sorted by the Average commercial catch indicator. In the Constraint pane, the MC Trials histograms are turned on. Figure 6.11: Choosing the red scenario shows that distribution against the rest at log scale. CHAPTER 6. CASE STUDY 44 Figure 6.12: The histograms on Contour plots are turned on. A new light blue scenario is selected based on the floating histograms. CHAPTER 6. CASE STUDY 45 Figure 6.13: The user turns on the second set of histograms underneath the sliders in the Constraint pane, and sets a probabilistic limit to the Average commercial catch. The Contour plots now have many crossed out locations on top of the deterministic constraints (greyed out regions) due to the probabilistic constraint. Chapter 7 Process and Validation The requirements for and design of Vismon were created through an iterative two-phase process, by engaging with target users in fisheries science and management. It has already been successfully deployed for communication between scientists and policymakers, and for use by policymakers in Alaska. 7.1 Two-Phase Process We embarked on a two-stage design process for Vismon: a diverging phase to test alternatives and refine our understanding of the requirements, and a converging phase to recombine the successful elements into a focused and useable system based on the fine-tuned requirements. We chose a framework featuring multiple linked views as the obvious starting point given known visualization design principles [?, ?]. We reasoned that this baseline capability would speed up the previous analysis process immensely, since our target users were essentially doing linking and brushing [?] by hand in their previous workflow (see Chapter 3.3). 46 Figure 7.1: The first-phase prototype of Vismon was developed in a diverging phase. It included many views that could be created by the user. During that phase, we gathered feedback for the target users about various views, and concluded that the users were showing a tendency to use contour plots and scatterplots, as these were the most familiar to the target users. In addition, the Trade-offs pane (top-left) was showing the scenarios in separate window which was overlapping Vismon’s main window. CHAPTER 7. PROCESS AND VALIDATION 47 CHAPTER 7. PROCESS AND VALIDATION 48 In the diverging design phase, we assembled many views, both existing and novel, within a test-bed linked-view framework (see Figure 7.1). These views included contour plots and scatterplots as single views or a full matrix, a correlation matrix, parallel coordinates, histograms, box plots, star glyphs, and an experimental pixel-based view to show probability distributions, as will be discussed later in this chapter. In the terminology of Lloyd and Dykes [?], this prototype/test-bed functioned as a data sketch, allowing the users to explore real data with many different visual encoding and interaction techniques. In formative testing over the course of 18 months, we gathered feedback from the target users about many versions of the interactive prototype. We were able to distinguish between the more and less effective views for their purposes, iteratively refined the more promising views, and iteratively refined our understanding of the tasks, requirements, and use case. Chapter 3.3 lays out this final understanding; it was not nearly so clearly understood at the start of this design study. In the converging design phase, we redesigned the entire interface for usability based on the combination of previously gathered user feedback and a cognitive walkthrough that incorporated our validated requirements. We carefully considered the trade-offs between the power of multiple views and the cognitive load of excessive window management, and considered screen real estate as a scarce resource. The final design had a core set of three main views with critical functionality that were always visible on screen, with a very small set of optional pop-up views for uncommon operations. This design proved to meet the needs of the scientists (as expressed in goal G1) much better than the first version (Figure 7.1), and was finalized after only a short round of further refinement. Chapter 5 describes the capabilities of this final version. 7.2 Lessons Learned We were able to simplify some aspects of the interface as we moved to a more precise data abstraction: for instance, the constraint sliders started as two-way sliders, but after we realized that indicators did not need to be both maximized and minimized, we substituted one-way sliders (see Figure 7.2). Other changes were based on a better understanding of the task abstraction. For example, noting that users had created individual scatterplots as part of their previous analysis process, we conjectured that a scatterplot matrix might be even more useful than a collection CHAPTER 7. PROCESS AND VALIDATION (a) 49 (b) Figure 7.2: The Overview pane versus the Constraint pane: (a) In the first-phase prototype of Vismon, the Overview pane (renamed to Constraint in the later versions) contained the list of management options as well as indicators. The constraint sliders for indicators were two-way, allowing the user to add constraints to both minimum and maximum values. (b) In the final version of Vismon, the Constraint pane has two separate tabs, one for management options and the other for indicators. The two-way sliders in the earlier Overview pane are substituted with one-way sliders, because there is no need to have both minimum and maximum constraints. of separate plots located at haphazard locations; we also provided a correlation matrix with small boxes colour-coded by correlation as a compact overview (see Figure 7.3). However, these views were not used; we eventually understood that finding correlations between the indicators was not a central task for sensitivity analysis. Several more exotic visual encodings were not effective in this domain. The familiar bar charts were strongly preferred by many of the users. Some were willing to use star glyphs, which were left active in the final tool. Parallel coordinates (Figure 7.4) were firmly rejected by these users. The histogram sliders underneath the constraint sliders evolved in response to the failure of an experimental visual encoding of uncertainty that proved to be incomprehensibly CHAPTER 7. PROCESS AND VALIDATION 50 Figure 7.3: The first-phase prototype of Vismon included a correlation matrix. It showed the correlation value of every two indicators in the cell connecting the two indicators. The user was able to see the scatterplot of any two indicators, instead of their correlation value by clicking on their connecting cell. The colour coding encoded the “goodness” of the correlation between the two indicators, based on the maximum/minimum preferability. complex. The original probability statement plot encoding (see Figure 7.5(a)) was a twodimensional plot with the horizontal axis encoding indicator values from minimum to maximum, and the vertical axis showing probabilities from 0 to 1. Within the box, a pixel had a greyscale value indicating the percentage of management option combinations that satisfied the x axis as the boundary value and the y axis value as the probability value. The final solution does not attempt to show this entire two-dimensional space simultaneously: the Probabilistic Objectives histogram (Figure 7.5(b)) shows only a single one of those curves, and changing the top slider changes which curve is shown. We realized that the users did not need to internalize the full details of the global uncertainty distribution; they only needed to inspect local regions. CHAPTER 7. PROCESS AND VALIDATION 51 Figure 7.4: Parallel coordinates were among the many views included in the first-phase prototype of Vismon. However, they were rejected by the target users. 7.3 Staged Deployment Vismon is being deployed through a staged process, in a similar spirit to the staged development approach of LiveRAC [?]. The primary target user is the gatekeeper to other fisheries scientists, and they in turn are the gatekeepers to fisheries policymakers. These two interesting deployment milestones for Vismon match our goals G1 and G2. Several versions of the Vismon prototype (see Appendix B) have been deployed at the G1 (scientists’ communication with policymakers) level by Peterman, including a demonstration to 40 research biologists and high-level fisheries managers from the Alaska Department of Fish and Game in October 2009. The response of these policymakers was highly positive, verifying that the goal of facilitating communication between technical and non-technical users was achieved. More specifically, the visual framework allowed managers to ask new questions, promoted discussion and debate, and built trust between managers and scientists for the data analysis process. CHAPTER 7. PROCESS AND VALIDATION (a) 52 (b) Figure 7.5: (a) The Probability Statement plot was an experimental visual encoding of uncertainty. It was a two-dimensional plot with the horizontal axis showing indicator values, from minimum to maximum, and the vertical axis showing probabilities from 0 to 1. A pixel, had a greyscale value indicating the percentage of management option combinations that satisfied the axis as the boundary value, and the y axis value as the probability value. This plot proved to be complicated. (b) The Probability Objectives histogram is substituted the complex Probability Statement plot. It shows only a single one of those curves. The Probability Statement plot in (a) is annotated with a red line to show where the Probability Objectives histogram (shown in (b)) is situated. In 2008, the AYK Sustainable Salmon Initiative1 called for an expert panel to review the harvest policies of the AYK region. As part of their work, they recommended a model-based analysis, whose outcomes are based on simulations, to allow policymakers to consider alternative harvesting policies. One scientist plans to use Vismon as the central tool for scientists and managers to understand this simulation’s data output, including the uncertainties in the model as well as the sensitivities of particular policy decisions to various assumptions. The authors have demonstrated Vismon to a number of Pacific Salmon biologists in the Department of Fisheries and Oceans in Nanaimo in February 2010. Moreover, another fisheries scientist used Vismon in a presentation to the managers of the Fraser River Panel in Vancouver in May 2011. Vismon is freely available, with written and video tutorials, at http://www.vismon.org. Scientists have begun to use it at the G1 level using only these training materials, for 1 http://www.aykssi.org/ CHAPTER 7. PROCESS AND VALIDATION 53 example a recent presentation to the Fraser River panel of the Pacific Salmon Commission. Two scientists have now gained sufficient confidence in Vismon to plan for its deployment at the G2 (analysis by policymakers) stage as part of the AYK Sustainable Salmon Initiative. Their trust was gained after the demonstration of Vismon in Alaska in April 2012. (The results of the deployment are discussed later in Chapter 7.4.) It will be a central tool for scientists and managers to carry out model-based analysis to consider alternative harvesting policies. The commitment has been made that “the software will be used by the Panel at Phase 2 workshops and stakeholder meetings”; the first such workshop was held in April 2012 in Alaska. Workshop participants were trained to build and run a management strategy evaluation model, and used Vismon to visualize its output. After the workshop, salmon fisheries managers in the Arctic-Yukon-Kuskokwim region of Alaska plan to use Vismon starting in Summer of 2012 (alongside their existing methods) to help them decide which of many management options to choose during the salmon fishing season. 7.4 Validation and Initial Deployment Outcome A fisheries scientist with extensive experience in building and analyzing simulation models was our primary target user and source of domain information (Randall M. Peterman). Over the course of 30 months, 15 interviews and feedback sessions took place. We also solicited feedback from four more fisheries scientists who work directly with fisheries policymakers for the AYK region, as arms-length target users who were not directly involved in the design process. There were a total of five individual sessions of a few hours each where they used versions of the Vismon first-phase prototype with Peterman’s data in individual sessions, and one additional session where one used a near-final Vismon secondphase prototype with his own dataset. 7.4.1 Initial Deployment in Alaska Overview In April 2012, Vismon 2.5 (See Appendix B) was demonstrated in a workshop about Management Strategy Evaluation I: Harvest Policy Trade-offs. The workshop was held in Alaska, CHAPTER 7. PROCESS AND VALIDATION 54 and a group of fisheries scientists and high-level fisheries managers from the Alaska Department of Fish and Game were the main attendees. Vismon was introduced as a tool that could assist in fisheries analysis. During the workshop, the scientists and managers had the opportunity to work with Vismon to get familiarized with its workflow. At the end of the workshop, we handed out a questionnaire to the participants of the workshop to gather feedback from the participants. A questionnaire was designed to gather information about the dimensionality of the data with which the participants typically work, the users’ familiarity of various visual encodings, how Vismon can facilitate data analysis, how Vismon can assist in communication between scientists, managers and stakeholders (towards goals G1 and G2), and finally, the usefulness of Vismon in general. In addition, there were some questions about the background of the participants. Appendix A shows the full questionnaire; the questions Q1 through Q14 and their results are discussed later. Results This section discusses the results of the Alaska initial deployment, which involved 14 respondents. However, not every participant answered all questions. Hence, the total number of answers for each question is not identical. Occupation Managers Fisheries scientists No answer Total Number of participants 5 8 1 14 Table 7.1: Q3 Results: The participants in the workshop were either managers or fisheries scientists. Only one participant did not give his/her occupation. The participants fell into two categories: managers and fisheries scientists. Table 7.1 shows the list of number of managers and fisheries scientists that participated in the workshop. Note that one participant did not mention their occupation. Among 14 participants who gave us their background information, in response to Q4, 11 participants were from the commercial fisheries research and sport fishing divisions of the Alaska Department of Fish and Game (ADF&G), 2 from U.S. Fish and Wildlife Service, and 1 was from a university (Table 7.2). The fish stocks on which the participants were focusing (Q5 ) were Chinook, CHAPTER 7. PROCESS AND VALIDATION 55 Chum, Sockeye, and Coho salmons in the Yukon, Kuskokwim, and Unalakleet rivers. Section Alaska Department of Fish and Game (ADF&G) U.S. Fish and Wildlife Service (USFWS) Universities Total Number of participants 11 2 1 14 Table 7.2: Q4 Results: Division/Section of respondents, who came from Alaska Department of Food and Game (ADF&G), U.S. Fish and Wildlife Service (USFWS), or universities. Table 7.3 shows the responses the participants gave to Q6: What previous tools or methods did you use to carry out the kind of analysis that Vismon supports?. The participants used to use models such as the Ricker model or data manipulation tools such as Excel, or used programming in R to carry out their data analysis. 7 out of 12 participants mentioned that they did not have any tools with capabilities similar to Vismon. This lack is supporting evidence that the fisheries scientists are in need for a tool that can assist them with analyzing their data and considering the trade-offs between various actions. Hence, Vismon has the potential to meet previously unmet needs. Based on the answers about the dimensionality of the data (Q7 and Q8 ), we found out that 8 out of 12 participants (67%) deal with 2 management options (Table 7.4). However, they are all interested to include more than 2 management options in their typical analysis. The number of indicators in the typical analysis is strictly less than 10, with 75% less than 7, and the desirable number of indicators increases to up to 12. Q9, Q10, and Q11 have 5-point Likert scales where 1 is best and 5 is worst. Figure 7.6(ai) shows histograms of answers to Q9 about the usefulness of visuals in the users’ analysis, with 1 being very valuable and 5 being not very valuable (Q9-a to Q9-i ). As expected, contour plots (Figure 7.6(a)) were the most useful visual encoding, with almost 85% of participants considering it very useful. However, bar charts (Figure 7.6(b)) and scatterplots (Figure 7.6(d)) were less favourable compared to contour plots (Figure 7.6(a)). Histograms (Figure 7.6(c)) seemed to be of less value in the participants’ data analysis. Finally, no one believed in the usefulness of star glyphs (Figure 7.6(e)) and multipodes plots (Figure 7.6(f)). Error bars (Figure 7.6(g)) and box plots (Figure 7.6(h)) were both considered valuable. Interestingly, the participants showed interest in using shaded distributions encoding CHAPTER 7. PROCESS AND VALIDATION 56 What previous tools or methods did you use to carry out the kind of analysis that Vismon supports? - Qualitative understanding of trade-offs - Used Excel and R to visualize outputs of models - Excel; R; VBA - Just a cursory examination of historical fishery performance under various exploitation rates of estimated run sizes. Looking at recruits /per spawner = productivity under varying levels of escapement (?) - Ricker analysis using Excel - Lony hand (!?); Excel - Essentially none - None have remotely similar capabilities - Simple S/L assessment of MSY; Not a lot of comparison - None - Ricker - Not sure - conversations w/ managers and stakeholders based on outcomes/(surficial) analysis of historical management actions & run conditions/sizes Table 7.3: Q6 : Previous tools or methods used to carry out the kind of analysis supported by Vismon. (Figure 7.6(i)) which was a newly-introduced visual for uncertainty visualization. Figure 7.7(a-f) shows the extent the participants agreed about different aspects of helpfulness of Vismon in data analysis (Q10-a to Q10-f ). 7 out of 14 participants (50%) strongly agreed that Vismon would help them do data analysis more quickly than previous tools (Figure 7.7(a)). Only 3 out of 14 (21%) participants disagreed with such a statement. However, the participants were less sure about how thoroughly Vismon helps them doing data analysis compared to previous tools. The high frequency of responses to the second question (Figure 7.7(b)) is more inclined towards the median (rank 3) compared to the first question (Figure 7.7(a)). Interestingly, 1 person strongly disbelieved that Vismon supports a thorough data analysis. We received bimodal answers to the question whether Vismon provides insights into the user’s data that they did not have using previous tools. 11 out of 14 (strongly) agreed with this statement, whereas 3 out of 14 strongly disagreed. We also asked the participants to compare the management policies they found using Vismon with previous tools. In all three questions (see Figure 7.7(d-f)), most responds were CHAPTER 7. PROCESS AND VALIDATION No. of Management Options Typical Analysis Desired Analysis 2 7 2 2 5-10 5-10 2-4 2-4 7 2 2 5 2 2 6 6 6 9 2 3 3-6 1-4 57 No. of Indicators Typical Analysis Desired Analysis 5 10 6 6 9 9 5 10 8 8 4 12 - Table 7.4: Q7 and Q8 Results: Number of management options and indicators in fisheries analysis around rank 3, which was stating neither agreeing nor disagreeing with these statements in question. The participants were more confident about finding better management policies than they found using previous tools, as they were more agreeing with it (Figure 7.7(d)). When asked whether Vismon increased their trust that they found the best possible management policy, the participants gave mostly neutral responses (Figure 7.7(e)): 9 out of 12 participants checked rank 3, and 3 out of 12 checked either rank 2 or rank 4. However, they were more confident in finding a very good management policy (Figure 7.7(f)). Altogether, from the responses we received about the usefulness of Vismon in data analysis, we concluded that Vismon has the potential to assist the users to do data analysis more quickly and throughly than previous tools. In addition, Vismon provides insights into the users’ data that they did not have using previous tools. However, at this time, the users do not have enough confidence to use Vismon instead of their previous data analysis methods; it may be more appropriate to augment rather than to replace them. Figure 7.8(a-c) show the responses to Q12 : whether Vismon increases the collaboration and communication between different groups of interest. The first question (see Figure 7.8(a)) asks about communication between scientists and managers. 10 out of 13 participants (77%) (strongly) agreed with this statement. Participants agreed less on Vismon CHAPTER 7. PROCESS AND VALIDATION 58 improving collaboration and communication among managers (Figure 7.8(b)) compared to the first question. The participants gave neutral responses to whether Vismon improved collaboration and communication between managers and other stakeholders. Two participants specifically wrote, “hopeful but will be difficult for manager folks to understand” “Likely too complicated to present to stakeholders.” Following the questions related to communication, we asked the participants to expand their answers with a more detailed explanation, if possible. The responses were: “Vismon will definitely provide a bridge between managers and research and once managers become proficient with the software, they will become better at explaining complexity of management trade-offs to stakeholders. Does provide transparency in that regard. However, federal and state fisheries managers will do what their respective bureaucracies mandate them to do.” “Could be difficult to explain contour plots to stakeholders.” “not work w/ enough to know.” “This is a great tool, but generally in a meeting it would be hard to present much of this in a way that the average person would understand.” “Looking forward to using Vismon more often. At this point, my experience is too limited to give any detailed feedback.” “It’s great to see all the trade-offs in graphical form and multiple scenarios with mean/CIs. I think this would be especially beneficial to show stakeholders our options and the probability of a certain outcome.” The responses to Q13, What aspects of Vismon did you find most useful?, were: “good visualization; it’s nice to be able to quantify trade-offs” “contour plots; trade-offs bar plots; scatter plots; results list; histograms” “Its speed in running; large amounts of simulations to look at variability in outcomes; It’s flexibility” “Contour plots, bar charts, scatter plots” “Am a graphic user to understand dots” “The ability to look at multiple parameters at same time” CHAPTER 7. PROCESS AND VALIDATION 59 “visualization of diff. scenarios; Ability to constrain; Comparing multiple indicators/tradeoffs; Ability to export scenario values/modify colors” The participants’ responses to Q14, What aspects of Vismon did you find confusing?, were: “The scatterplots. grey dots obscured white dots” “sometimes the dots don’t show up on the plots - or had to click multiple times to get it to come up” “Seemed very straight forward, with a very quick learning curve. The confusing part would be interpretation.” “I have problems with the crosshairs on plots; I couldn’t access the left half of any plot, therefore was unable to create trade-offs plots.” “Star glyphs - need labels” Furthermore, a participant replied to Q15, What aspects of Vismon did you find least useful? : “the program was slow and prone to freezing up.” None of the participants gave informative answers to Q16, What tasks are harder or impossible to perform, compared with previous tools that you have used?. All except one mentioned that they needed more time to work with Vismon. One participant wrote: “haven’t used similar tools” For Q17, What features are missing from Vismon?, participants mentioned: “scenario clicking didn’t work well; ability to consider more than 2 policy axes” “shaded pattern fills that would show up in black and white publications. something analogous to pattern fill options in graphing in Excel” “ability to export specific graphs → good for making presentations where a lot of graphs would be confusing” “Ability to modify figure labels for export for presentations” In questions 13 through 17, we received answers related to the fact that the participants needed more time to explore the tool. CHAPTER 7. PROCESS AND VALIDATION 60 Conclusion The Alaska initial deployment identifies many aspects regarding usefulness of Vismon in data analysis and communication. The participants agreed that Vismon improves the speed and the details of data analysis. They also agreed that Vismon includes visual encodings that are useful for its target users, and that visualizing uncertainty has been appropriately selected. The participants also agreed that Vismon can improve the collaboration and communication between scientists and managers, and moderately improve the collaboration between managers, and between managers and stakeholders. These results provide supporting evidence that Vismon has succeeded in accomplishing goals G1 and G2 (see Chapter 3). CHAPTER 7. PROCESS AND VALIDATION 61 (a) Contour plots (b) Bar charts (c) Histograms (d) Scatterplots (e) Star glyphs (f) Multipodes plot (g) Error bars (h) Box plots (i) Shaded distributions Figure 7.6: Q9 Results: Usefulness of 9 visual encodings on a 5-point Likert scale. On the left is 1, very valuable; on the right is 5, not very valuable. High frequency on the left indicates the plot being more valuable, but on the right indicates the plot being less valuable. CHAPTER 7. PROCESS AND VALIDATION 62 (a) Vismon helps me do data analysis more quickly than previous tools (b) Vismon helps me do data analysis more thoroughly than previous tools (c) Vismon provides insights into my data that I did not have using previous tools (d) Vismon allows me to find a better management policy than I found using previous tools (e) Vismon increases my trust that I have found the best possible management policy (f) Vismon increases my trust that I have found a very good management policy Figure 7.7: Q10 Results: Usefulness of Vismon in data analysis on a 5-point Likert scale. On the left is 1, strongly agree; on the right is 5, strongly disagree. Note that high frequency on the left indicates more agreeing, and on the right indicates more disagreeing. CHAPTER 7. PROCESS AND VALIDATION (a) Vismon will improve collaboration and communication between scientists and managers 63 (b) Vismon will improve collaboration and communication among managers (c) Vismon will improve collaboration and communication between managers and other stakeholders Figure 7.8: Q11 Results: How Vismon improved the communication between different groups of interest on a 5-point Likert scale. On the left is 1, strongly agree; on the right is 5, strongly disagree. Note that high frequency on the left indicates more agreeing, and on the right indicates more disagreeing. Chapter 8 Conclusion and Future Work Over the past three years we worked closely with fisheries scientists to develop a tool that allows the analysis of fisheries models to move from their builders into the hands of managers and policymakers. The need for such a tool arose from a changed climate for policy makers to a) base their policy decisions on a broader scientific basis and to b) incorporate different stakeholders in the decision process. We engaged in a requirements analysis that allowed us to articulate and abstract data and tasks for this domain. The resulting Vismon system facilitates the analysis of multidimensional data with 2 independent inputs (management options), 10 to 20 outputs (indicators), and includes a multi-level view of the models underlying uncertainty, expressed by multiple Monte Carlo runs. Sensitivity of a policy decision is encoded by close contour lines in the contour plots. A comprehensive trade-off analysis allows users to compare several alternative policy options. Several versions of Vismon have been demonstrated to many fisheries scientists and managers over the past three years. The response of these policymakers was highly positive, verifying that our visual framework allows managers and scientists to ask questions, promote debate and discussions to build trust between them for the data analysis process. In our last demonstration in Alaska in April 2012, we asked questions from participants about usefulness of Vismon in data analysis as well as its assistance in communication between various groups of interest. The feedback we received showed that Vismon in fact has the potential to assist fisheries scientists and managers along with their traditional methods. The responses we received on the dimensionality of the data that the domain expert is 64 CHAPTER 8. CONCLUSION AND FUTURE WORK 65 dealing with made us examine the fundamental fact of having only 2 independent dimensions (management options). Although the domain experts typically have 2 management options in the data analysis (due to lack of a systematic way of looking at more than 2), they are greatly interested in including higher number of management options. Expanding Vismon to deal with any number of management options would be interesting future work to augment Vismon’s current capabilities. Classifying the parameter space into classes of different outcomes, similar to what Design Galleries [?] and FluidExplorer [?] suggested, would be one way to do so. Appendix A Questionnaire Vismon 2 Questionnaire for Alaska Workshop April 2012 Your contact 1. What is your name (optional)? 2. If you are willing to be contacted about your further use of Vismon, please give us an email address or a phone number (optional): 66 APPENDIX A. QUESTIONNAIRE 67 Vismon 2 Questionnaire for Alaska Workshop April 2012 Your background 3. Are you a # manager # fisheries scientist # data analyst # other ? 4. What division/section are you in? 5. Which fish stocks are your main concern? 6. What previous tools or methods did you use to carry out the kind of analysis that Vismon supports? (For example, how did you select sets of management actions? How did you compare them?) APPENDIX A. QUESTIONNAIRE 68 Data and visuals 7. How many management objectives and indicators do you use in a typical analysis? [ management objectives, indicators] 8. What is the maximum number of management objectives and indicators that you have wanted to include in analysis? [ management objectives, indicators] How useful do you find these visuals in your analysis Very valu- 1 2 3 4 5 able Not very Unfamiliar valuable 9a. Contour plots # # # # # 2 9b. Bar charts # # # # # 2 9c. Histograms # # # # # 2 9d. Scatter plots # # # # # 2 9e. Star glyphs # # # # # 2 9f. Multipodes plot # # # # # 2 9g. Error bars # # # # # 2 9h. Box plots # # # # # 2 9i. Shaded distributions # # # # # 2 9j. Other (please specify) # # # # # 2 APPENDIX A. QUESTIONNAIRE 69 Vismon Please indicate to what extent you agree with the statements below: Strongly 1 2 3 4 5 agree 10a. Vismon helps me do data analysis more quickly than previous tools 10b. Vismon helps me do data analysis more thoroughly than previous tools 10c. Vismon provides insights into my data that I did not have using previous tools 10d. Vismon allows me to find a better management policy than I found using pre- disagree # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # vious tools 10e. Vismon increases my trust that I have found the best possible management policy 10f. Vismon increases my trust that I have found a very good management policy 11. How often do you expect that you will use Vismon for analysis? # Daily # Weekly # Monthly Strongly # Annually APPENDIX A. QUESTIONNAIRE 70 Please indicate to what extent you agree with the statements below: Strongly 1 2 3 4 5 agree 12a. Vismon will improve collaboration and communication between scientists and disagree # # # # # # # # # # # # # # # manager 12b. Vismon will improve collaboration and communication among managers 12c. Vismon will improve collaboration and communication between managers and other stakeholders Please expand on your answer with a more detailed explanation, if possible: 13. What aspects of Vismon did you find most useful? Strongly APPENDIX A. QUESTIONNAIRE 14. What aspects of Vismon did you find confusing? 71 APPENDIX A. QUESTIONNAIRE 72 15. What aspects of Vismon did you find least useful? 16. What tasks are harder or impossible to perform, compared with previous tools that you have used? 17. What features are missing from Vismon? APPENDIX A. QUESTIONNAIRE 73 Appendix B Vismon Versions Version Release Date Notes Vismon 1.0 Sep 2009 Demonstrated in Alaska in Oct 2009 Vismon 1.0.2 Feb 2010 Demonstrated in Nanaimo in Feb 2010 Vismon 2.0 May 2011 Demonstrated in Vancouver in May 2011 Vismon 2.4.1 Dec 2011 Submitted to EuroVis Conference in Dec 2011 Vismon 2.5 Jan 2012 Demonstrated and played with in Alaska in Apr 2012 Vismon 2.5.1 Aug 2012 Current version 74
© Copyright 2026 Paperzz