Challenges for visual analytics

Information
Visualization
http://ivi.sagepub.com/
Challenges for Visual Analytics
Jim Thomas and Joe Kielman
Information Visualization 2009 8: 309
DOI: 10.1057/ivs.2009.26
Original Article
Challenges for visual analytics
Jim Thomas (a, ∗) and Joe Kielman (b)
(a) Pacific Northwest National Laboratory, PO Box 999, K7-28, Richland, WA 99352, USA.
(b) Department of Homeland Security, Science and Technology Directorate, Washington, DC, USA.
(∗) Corresponding author.
E-mail: [email protected]
Abstract Visual analytics has seen unprecedented growth in its first 5 years
of mainstream existence. Great progress has been made in a short time, yet
significant challenges must be met in the next decade to provide new technologies that will be widely accepted throughout the world. This article explains
some of those challenges in an effort to provide a stimulus for research, both
basic and applied, that can realize or even exceed the potential envisioned for
visual analytics technologies. We start with a brief summary of the initial challenges, followed by a discussion of the initial driving domains and applications.
These are followed by a selection of additional applications and domains that
have been a part of recent rapid expansion of visual analytics usage. We then
look at the common characteristics of several tools illustrating emerging visual
analytics technologies and conclude with the top 10 challenges for the field of
study. We encourage feedback and continued participation by members of the
research community, the wide array of user communities and private industry.
Information Visualization (2009) 8, 309 -- 314. doi:10.1057/ivs.2009.26
Keywords: visual analytics; domains and applications; state of practice; future
challenges
This article is a product of a workshop on the Future of Visual Analytics, held in Washington,
DC on 4 March 2009. Workshop attendees included representatives from the visual analytics
research community across government, industry and academia. The goal of the workshop, and
the resulting articles, was to reflect on the first 5 years of the visual analytics enterprise
and propose research challenges for the next 5 years. The article incorporates input from
workshop attendees as well as from its authors.
Received: 26 May 2009
Revised: 7 July 2009
Accepted: 8 July 2009
Early Science Challenges for Visual Analytics
The visual analytics field of study was developed informally over several
years through a series of specific mission-focused research and development (R&D) projects. The publication of Illuminating the Path: The R&D
Agenda for Visual Analytics1 in 2005 marked what many consider the formal
beginning of the field. At that time, a team of about 40 visionary leaders,
including the authors, identified 19 major challenges and numerous additional sub-challenges. These challenges have provided the foundation for
both basic and applied visual analytics research around the world. The
enthusiastic feedback to the R&D agenda provided a strong indication
that the team had identified common technology and capability gaps.
Especially exciting were the acceptance of a new scientific vision and the
acknowledgement of a potentially broader application space for visual
analytics than was evident from the initial drivers.
The R&D agenda identified four main science areas: the science of
analytic reasoning; visual representations and interaction techniques; data
representations and transformations; and presentation, production and
dissemination. These four areas provide the foundation for visual analytics
as the science of analytical reasoning facilitated by interactive visual interfaces.
Progress in each of the four major science areas has exceeded expectations, as illustrated by the increasingly high-quality papers being
published at the IEEE Visual Analytics Science and Technology (VAST)
Symposium,2 in journals such as IEEE TVCG3 and Information Visualization,4 and at related workshops. Although interest in visual analytics started in the United
States, rapid developments and contributions from around the world,
especially in Canada and Europe, provided the foundational growth of the new science. The European
Union VisMaster and German DFG-SPP Visual Analytics
programs are good examples. The IEEE VisWeek Proceedings are also excellent references. Yet much work remains
in both the basic and applied aspects of visual analytics
sciences. The many new domains and applications have
also expanded the challenges and potential impact.
© 2009 Palgrave Macmillan 1473-8716 Information Visualization Vol. 8, 4, 309 – 314
www.palgrave-journals.com/ivs/
Domain/Applications
The initial domain driving the development of visual
analytics was homeland security, with leadership provided
by the US Department of Homeland Security (DHS)
Science and Technology (S&T) Directorate. In just its first
year in existence, and as it was developing its mission
and a stakeholder community, the S&T Directorate saw
the compelling need for advancement in visual analytics
to support its missions of preparedness for, response
to, and recovery from man-made and natural disasters.
Consequently, some of the first deployments for visual
analytics technologies have been for the public safety
and emergency response communities.
Almost immediately, however, other fields became
interested. In domains facing rapidly expanding data
volumes, complex analysis tasks, and the need to
communicate analytic outcomes simply and clearly, such
as human and environmental health, economics and
commerce, visual analytics can make important contributions. Other applications and needs
are described briefly below.
Security: Security has many aspects, including national
intelligence, national defense and regional law enforcement. Activities ranging from policymaking to strategic
thinking to tactical action planning and after-action evaluation require analytic reasoning and the application of
human judgment in real time. Both can be facilitated
by engaging visual interactions between the users and
their information spaces. Multiple, diffuse and diverse
information sources; ever-growing information volumes;
the complexity of decision making; and massive, multimodal real-time information streams all suggest that new
analytic reasoning methods and technologies are needed.
Another major driver is the presence of a new generation
of end users who are web literate and experienced game
players.
Health: Health care offers many potential applications
for visual analytics, including detection of disease or
other health issues, pharmaceutical development and
discovery, and many aspects of patient care in hospitals,
doctors’ offices, medical laboratories, field applications
and hospices or the home. All are driven by an increasing
and widening breadth of information that requires new
analytic methods with precise information response in
the language of the domain. The uncertainty of information and plausible effects of health-care products and
processes also require proactive thinking going beyond
just investigative analytics into ‘what if’ and ‘how can
I influence outcomes’ analytics.
Energy: The availability of effective and efficient energy is
essential for almost all aspects of modern life, and many
diverse factors influence energy discovery, development,
delivery and commerce. The sources of energy in the
current power grid are many and varied. Maintaining
the energy grid depends upon the reliability of energy
sources, the real-time dynamics and interplay between
sources, and real-time response to demand. Energy use
data often consist of billions of small transactional information units, are often updated in sub-second time or
real time, and must be combined with economic and
reliability data to be useful to operators as well as utility
managers. This information must be communicated to
operators and engineers in such a way that they can
quickly make highly complex analytic assessments.
Environment: Air, water and the climate all have interdependent effects on both near-term and long-term quality
of life. Assessing those impacts, investigating alternative
scenarios, and communicating effectively to the policymaker and the public are just a few of the many challenges that require new, more effective analytic methods
and tools. Visual analytics can become a game changer,
enabling all of us to understand our roles and responsibilities to provide better environmental health today and in
the future.
Commerce: Industry is often seen as a global enterprise,
in which large firms use economies of scale to remain
competitive, but even start-up companies are seeing
the need to better assess potential markets, alternative
product designs and product feedback. Such assessments
often rely on unstructured surveys, opinion analytics,
competitive evaluations and proactive market thinking.
Most analytic methods used in industry today are investigative in nature, look at single or a few sources and
use fixed format surveys. However, the potential data
are varied and not always applied to full advantage.
Their sources include text, web pages, projection models,
patents and news articles, among others. They may be
in multiple languages. Conversely, sometimes the data
are scarce but highly complex; the sources may combine
aspects of finance, large market reports or projections,
product evaluations and customer focus group results,
and even details of the enabling technologies. Such
assessments often do not require real-time analytics, but
industries differ significantly in their needs, scope and
analytic methods. Typically, only a few major players
in any specific industry conduct this type of strategic
analytics; nevertheless, to fully benefit from the available
data, industry requires a new suite of analytic technologies using visual analytics techniques to integrate and
help interpret the data available.
Transportation: The design of transportation systems and
methods and the development of motor vehicles, planes,
rail cars and other transportation modes require analysis
of massive amounts of detailed information. The details
required include technical data about products and
materials, up-to-the-minute understanding of dynamically changing customer needs, engineering designs,
manufacturing specifications, quality measurements,
maintenance processes and manuals, and supply chain
and delivery data. New visualization methods are required
to enable integration and understanding of these data
owing to the changing nature of the world’s transportation networks and, in many cases, the real-time dynamics
of transportation flows, alternate paths, and surrounding
environmental and economic impacts. A mix of transaction, modeling, document and sensor analytics based on
visual techniques offers just such a method.
Food/agriculture: Farm-based applications are becoming
highly information enabled, from selection of alternative
crops for a specific location and climate, to optimization of planting and growth cycles, to distribution and
sales of produce, to health impacts and public opinion
associated with pesticide use. The modern, highly interconnected food and agriculture industry now requires
access, management, analysis and reporting of information on a scale not seen in the last several decades. Many
farmers access the web daily to search for patterns of
food pricing, techniques for improving the health and
quality of their livestock or agricultural products,
and harvesting and delivery options to maximize profits
and quality. Some farmers are even using modeling
to project farm product business assessments before
planting, and then testing these models with real-time
data during the growth, harvest and delivery processes.
Clearly, the food and agriculture industry requires
analytics on a scale mostly unavailable today.
Economy: Our society’s financial system is complex,
dynamic, and, as recent events show, highly interrelated. Consequently, its soundness and stability require
constant monitoring. Whereas many financial applications for analytics are investigative in nature, many more
are prospective or even predictive, including evaluating
the potential and real impact of local or global economic
situations. Economic data are mostly numerical; however,
many economic decisions are based on impressions,
opinions, likely public response and other sources of
information typically expressed in text form. Additional
data of value can include news in the form of narrative
text documents or video, which can provide context for
changes observed in financial data or serve as indicators
of coming changes and trends. Also, because most of the
projections derived from these data, including financial
modeling data, real-time prices and money flows, and
human/product dynamics, are complex, the data sources
must be pulled together at a semantic level, synthesized,
and then made available through visual analytic technology empowered by human judgment.
Insurance: Insurance requires understanding of diverse
factors. An understanding of health trends, weather,
crime, safety and business all are important to the insurance industry. All of these areas require extensive analysis
to determine business cases and likely costs, identify
national and regional impacts of unexpected events,
and proactively isolate which health-care options reduce
insurance costs. Modeling, surveys, claim analysis, and
fraud detection analytics all involve understanding of
numbers, text, and simulations looking for patterns and
rare events. These require new visually based analytics
that support the use of informed human judgment.
Cyber security: Constant analysis of the cyber infrastructure and network-related data to identify malicious activities and enable effective operations while supporting
web content, security and effective delivery is emerging
as a critical need. The typical cyber analytics application involves examination of masses of transactions to
find patterns, rare events and hidden communication
content. An effective cyber analytics application also must
understand content flow dynamics and real-time infrastructure operational characteristics. All of these features
demand highly specialized analytics. In some cases,
responses must be virtually automatic, and in other, more
ambiguous cases, application feedback must be combined
with human judgment to respond effectively. In the
future, humans will need to be able to design tools that
automatically deploy cyber security snippets and agents,
while at the same time being able to monitor and respond
to real-time flows of traffic. Real-time analytics, investigative analytics, and proactive/predictive analytics are key
components of an analytical suite for cyber analytics.
Knowledge worker: The knowledge worker is a newly recognized job type for most industries, except for business
and state intelligence, where it was first acknowledged
as critical to the analysis process. Sophisticated knowledge workers must go beyond more traditional statistical
analysis of products, processes, and opinions and use
predictive and prospective techniques. This requires the
synthesis of multiple information types, across different
time scales, in often very dynamic situations, sometimes
engaging in gaming-like processes to determine likely
outcomes. Determining how to influence these outcomes
requires a blend of experimental, theoretical and predictive analytics, all of which are supported by human judgment using visual and highly interactive technology to
deal with uncertainty refinement.
Individual or personal uses: Through the web and other
sources, individuals have access to huge volumes of information. Humans typically are not experts in the use of
deep analytic tools, yet we are increasingly required to
understand and interpret large volumes of conflicting and
ambiguous information, to decide whether to investigate
a health condition, identify which new electronics to buy,
decide where to travel or determine how to invest. We may
even look for aids to effective communication through
storytelling within social networks. Internet searches typically generate thousands of responses. Even daily e-mail
correspondence is information intensive.
The discussion above has mentioned only a few application domains, and within each one, only a few applications were highlighted. These few, nonetheless, illustrate
the breadth and depth of applications that will benefit
from new visual analytics technologies. Common to all is
the notion we denote as a ‘walk-up usable’ interface. Such
interfaces are highly dynamic, interactive, progressively
more complex and as full-featured as the task demands,
and responsive to individual needs. They also allow immediate use of a tool/technology without training. In addition to
allowing the user to see immediate value in the application of visual analytics to her task, these interfaces can
lead to the development of progressively more complex
interfaces and capabilities.
Top 10 Observations for Visual Analytics
Technologies and Systems
The discussion above suggested that there are some common
requirements for visual analytics in 10 disparate domains
(for example, prospective or predictive techniques and
usable interfaces). Likewise, examining recent visual
analytics systems such as Jigsaw,5 WireVis,6 IN-SPIRE™,7
video analytics software,8 and geospatial software9
reveals some common approaches enabling analysis and
reasoning.
1. Whole-part relationship: In many visual analytics
systems, scale-independent visual representations of
the entire information space to be analyzed exist along
with a detailed representation. This approach provides
linked context at the highest and lowest levels of information understanding and involves multiple levels
of abstraction of vision and interaction.
2. Relationship discovery: Most systems include interaction techniques that enable discovery of relationships
among people, places, times and so on, through iterative queries or via full multi-dimensional exploration.
This discovery is accomplished through exploration of
high-dimensional spaces; temporary subsetting; identification of groups, clusters and rare events; and use
of search techniques including Boolean keyword or
phrase searching and search by example.
3. Combined exploratory and confirmatory interaction:
Exploratory interaction enables the analyst to discover
relationships, develop and refine hypotheses, and
confirm or refute hypotheses. This is the basis for a
human cognitive model for analytics. Some models
include the beginnings of predictive analytics.
4. Multiple data types: Systems today tend to be media-type-specific, focusing on unstructured text, video,
transactions, or, in some cases, problem-specific data
such as wire fraud and cyber data. The interactions
within these tools are usually specifically designed for
the data type and/or applications.
5. Temporal views and interactions: Almost all analytic
systems have a degree of temporal dynamics. Some
have flow representations, some have timelines, and
some have event and milestone representations. Some
of the systems are strongly geospatial in context and use
maps and cartography as their organizing principles.
6. Groupings and outlier identification: Most systems have
analytic methods that allow formation of individual
items into groups and groups into higher-order groups
with labeling and annotation.
7. Multiple linked views: Most systems have multiple linked
views active on the display(s) at the same time, with
actions on one view being represented within other
views.
8. Labeling: Most systems have developed extensive
methods for labeling all information on the displays.
Labeling conveys the context and details that enable
the analytic process. Often, labeling is dynamic and
can provide user control over such items as level of
detail, size and color.
9. Reporting: Critical to analytical assessment is the
ability to capture analytic process and results that can
become part of an assessment report, presentation,
web communication or other form of communication.
10. Interdisciplinary science: These systems and embedded
technologies are the products of highly interdisciplinary teams and often benefit by having direct and
regular access to the end users.
This in no way completely describes all the capabilities
of current visual analytic tool suites but, rather, offers
some characteristics found to be common among visual
analytic tools in use or under development.
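Observation 7 (multiple linked views) can be illustrated with a small sketch. The code below is not taken from any of the systems cited above; it is a minimal illustration, assuming a shared selection object that notifies every registered view when the analyst selects items. All names (SelectionModel, TimelineView, MapView) and the sample records are hypothetical.

```python
from typing import Callable, Dict, Iterable, List, Set, Tuple

# Hypothetical sample data: item id -> (year, place).
Records = Dict[int, Tuple[int, str]]


class SelectionModel:
    """Shared selection state that every linked view observes."""

    def __init__(self) -> None:
        self.selected: Set[int] = set()
        self._listeners: List[Callable[[Set[int]], None]] = []

    def subscribe(self, listener: Callable[[Set[int]], None]) -> None:
        self._listeners.append(listener)

    def select(self, ids: Iterable[int]) -> None:
        # Replace the selection, then notify every view so all stay in sync.
        self.selected = set(ids)
        for listener in self._listeners:
            listener(self.selected)


class TimelineView:
    """Stand-in for a temporal view: tracks which years are highlighted."""

    def __init__(self, records: Records, model: SelectionModel) -> None:
        self.records = records
        self.highlighted_years: List[int] = []
        model.subscribe(self.on_selection)

    def on_selection(self, selected: Set[int]) -> None:
        self.highlighted_years = sorted({self.records[i][0] for i in selected})


class MapView:
    """Stand-in for a geospatial view: tracks which places are highlighted."""

    def __init__(self, records: Records, model: SelectionModel) -> None:
        self.records = records
        self.highlighted_places: List[str] = []
        model.subscribe(self.on_selection)

    def on_selection(self, selected: Set[int]) -> None:
        self.highlighted_places = sorted({self.records[i][1] for i in selected})


records: Records = {
    1: (2007, "Sacramento"),
    2: (2008, "Columbus"),
    3: (2008, "Richland"),
}
model = SelectionModel()
timeline = TimelineView(records, model)
map_view = MapView(records, model)

# A selection made in any one view is routed through the shared model.
model.select([2, 3])
print(timeline.highlighted_years)   # [2008]
print(map_view.highlighted_places)  # ['Columbus', 'Richland']
```

Routing every selection through one shared model, rather than having views call each other directly, keeps each view independent of the others; this is one common design choice among several, not a description of how any particular tool is built.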
Top 10 Challenges for Visual Analytics
In framing any discussion of the challenges for visual
analytics in the coming years, it is salutary to first consider
the conditions under which the capabilities will be used.
Although there are many, all can be seen as variants of
the following:
• Untethered to device/network/interaction – that is, capabilities
should not depend on particular devices, network
designs or interaction schemes, and should operate
on any current or future multiplicity of such designs.
• Tethered to data/information – the key to future utility of
visual analytics capabilities is that they enable continual
use of multiple types, forms, and sources of data and
information.
• Indefinite or indeterminate data – the actual data or information sets in use at any one time will vary, and their
contents, forms and value will be unknown or
uncertain; nevertheless, the tools will have to enable
judgments on their usefulness to be made in real
time.
• Minimized transaction costs – the network bandwidth and
computational processing power, as well as the interaction and decision space, required for visual analytics
capabilities must be minimized to enable immediate
access and active use on multiple platforms.
• Trust – the provenance and validity of the data must be
known, and the security of the sources and privacy of
individuals guaranteed even for dynamically established
access and interaction.
There are many more than 10 challenges in visual
analytics, so selecting just a few to highlight is difficult.
We encourage the reader to use these as guides to deeper
investigation and prospective thinking toward the future
capabilities that can be enabled through visual analytics
across a wide variety of applications.
1. Human-information discourse: We need an understanding of and foundational science for the interaction underpinning effective visual analytics and
reasoning-supported systems. This science will provide
‘walk-up usable’ interfaces, interfaces supporting
mixed-initiative interactions and multi-device and
cross-platform interaction that are usable on systems
ranging from large display systems to desktops to
mobile devices.
2. Collaborative analytics: We need new reasoning foundations supporting not only evidential and confirmatory
analytics but also exploratory, hypothesis-driven, and
predictive and proactive thinking.
3. Holistic visual representations: We need visual representations that tell a complete story at a glance with effective labeling. These representations must present multisource, multi-type data, including both structured and
unstructured data from simulations, sensors, data structures and masses of streaming data.
4. Scale independence: We need scale-tolerant mathematical and visual approaches for analytics, enabling
reasoning over large, diverse information spaces to
facilitate analytics and uncertainty refinement.
5. Information representations: We need mathematical
and semantically rich, data-preserving representations; information synthesis of all forms of data
including model and sensor data into inter-related
knowledge structures; and representation of human
judgment. Such representations will be created using
discrete mathematics, knowledge generation techniques, and visualization sciences that enable scale and
complexity tolerant analytics. Inherent to these representations are techniques for maintaining privacy and
security.
6. Information sharing: We need effective decision-making
tool suites that support information sharing within
secure, privacy-aware technologies, with dissemination
and sharing between visual analytics components and
people.
7. Active information products: We need the methods
and science to capture reusable analytic components
into complete stories for effective communication of
analytic outcomes. These products must be active,
in that they must be able to support multiple levels
of abstraction and allow users to unwrap the logic
within the product, add their own reasoning and facts
and transform the results into new communication
products.
8. Lightweight software architectures: We need support and
standards to rapidly develop visual analytics applications and create specific analytic tools for new applications, domains and data types, with sharing among
visual analytics technologies and components.
9. Utility evaluation: We need science, support structure,
and data for evaluating the utility of visual analytics
science, technology and systems. We need to provide
core methods for utility-based evaluations that can
be used to test applications for audiences ranging
from national knowledge workers to regional mobile
analytics users such as law enforcement officers.
10. Sustaining talent base: We need a growing and sustainable talent base to enable research, application design
and development, and operations and training support
for new visual analytics applications and tools.
Some of these challenges have been stated before, so it
is fair to ask about the progress made to date. Although
we are excited by the initial research and deployed examples, the community and funding base still remain small.
The desired interdisciplinary mix of talents has not been
achieved, which limits progress on many fronts. For more
rapid and considered progress, we must develop interdisciplinary teams that work in what is sometimes called
transdisciplinary science. Furthermore, few are working
on the foundational vision; instead, in many cases,
priority is given to developing fully deployed systems.
Overall, each of these 10 challenges and expected results
would benefit from clearer definition and interactive
examples that would drive interdisciplinary research
teams.
We also must show near-term progress in the science,
development, and deployment of visual analytics systems
in order to maintain sustained interest of the funding
sources and to extend and expand interest of potential end
users. The research foundation must advance in parallel
to practical tool and technology development through
informed feedback on the changing analytic processes.
These capabilities should be applied to an ever-increasing
domain and application space.
Conclusion
The challenges described here are bold. Although they
represent long-term goals, we anticipate progress toward
achieving them will be made regularly and on a shorter-term basis. The progress made in the science, technology,
and deployment of visual analytics in the past 5 years has
helped clarify the needs and opportunities. (More discussion about where we are headed can be found in ‘The
Future for Visual Analytics’ and ‘Taxonomy for Visual
Analytics: Seeking Feedback,’ both found in VAC Views.10) These
initial successes have encouraged us in the view that
these opportunities are real and can be addressed within
a 10-year time frame – given the availability of resources,
partnerships and interdisciplinary talents.
Acknowledgements
We wish to thank Kris Cook and Pak Wong for their very
helpful edits. This work has been supported by the National
Visualization and Analytics Center™ (NVAC™) located at the
Pacific Northwest National Laboratory in Richland, WA. NVAC
is sponsored by the US Department of Homeland Security (DHS) Science and Technology (S&T) Directorate. The
Pacific Northwest National Laboratory is managed for the US
Department of Energy by Battelle Memorial Institute under
Contract DE-AC05-76RL01830.
References
1 Thomas, J.J. and Cook, K.A. (eds.) (2005) Illuminating the Path: The
Research and Development Agenda for Visual Analytics, Los Alamitos,
CA: IEEE Computer Society Press.
2 Ebert, D. and Ertl, T. (eds.) (2008). IEEE Symposium on
Visual Analytics Science and Technology: VAST ’08; 21–23
October 2008, Columbus, OH. Los Alamitos, CA: IEEE Computer
Society, http://conferences.computer.org/vast/vast2008/, accessed
13 August 2009.
3 Ertl, T. (ed.) (2009) IEEE Transactions on Visualization and Computer
Graphics. Los Alamitos, CA: IEEE Computer Society, available
online at: http://www2.computer.org/portal/web/tvcg.
4 Information Visualization (IVS), http://www.palgrave-journals.com/ivs/index.html.
5 Stasko, J., Görg, C. and Liu, Z. (2008) Jigsaw: Supporting
investigative analysis through interactive visualization.
Information Visualization 7(2): 118–132.
6 Chang, R.M. et al. (2007) WireVis: Visualization of categorical,
time-varying data from financial transactions. In: W. Ribarsky
and O. Keim (eds.). IEEE Symposium on Visual Analytics Science
and Technology: VAST ’07; 30 October–1 November, Sacramento,
CA. Los Alamitos, CA: IEEE Computer Society Press, pp. 155–162.
7 Pacific Northwest National Laboratory (PNNL) (2008) IN-SPIRE
visual document analysis, http://in-spire.pnl.gov, accessed 13
August 2009.
8 Ghoniem, M., Luo, D., Yang, J. and Ribarsky, W. (2007) NewsLab:
Exploratory broadcast news video analysis. In: W. Ribarsky and
O. Keim (eds.). IEEE Symposium on Visual Analytics Science and
Technology: VAST ’07; 30 October – 1 November, Sacramento,
CA. Los Alamitos, CA: IEEE Computer Society Press, pp. 123–130.
9 Pennsylvania State University (2006) GeoVista Center,
http://www.geovista.psu.edu/, accessed 13 August 2009.
10 Pacific Northwest National Laboratory. (2009) National Visualization and Analytics Center VAC Views,
http://nvac.pnl.gov/vacviews/, accessed 13 August 2009.