Introduction to Data Science Lecture 11 Data Visualization - b

Introduction to Data Science
Lecture 11
Data Visualization
CS 194 Fall 2014
John Canny
incorporating notes from Michael Franklin, Dan Bruckner, Evan
Sparks, Shivaram Venkataraman, Maneesh Agrawala and Jeff
Hammerbacher
Outline
Visualization:
• How not to do it
• How to do static visualizations
• Making it interactive
FIRST, A CLASSIC
Charles Joseph Minard 1869
Napoleon’s March
According to Tufte: “It may well be the best statistical graphic ever drawn.”
5 variables: army size, location, dates, direction, temperature during retreat
More Examples
• The famous Gapminder Video, Hans Rosling: 200 Countries, 200 Years, 4 Minutes
• https://www.youtube.com/watch?feature=player_embedded&v=jbkSRLYSojo
• NY Times Interactive Visualizations (e.g., 2013 Federal Budget)
• http://www.nytimes.com/interactive/2012/02/13/us/politics/2013-budget-proposal-graphic.html
• Also, map-based visualizations, such as CrimeMapping
• http://www.crimemapping.com/map.aspx?aid=3f1738a8-6160-4c68-998a-ae00f597613a
Some Anti-Examples
• Courtesy of WTFViz.net
Visualization to Educate?
from wtfviz.net
Another Interesting One
from wtfviz.net
Pie in the Sky?
from wtfviz.net
Needs Fixing
from wtfviz.net
Unsafe at Any Speed?
from wtfviz.net
Okay, so that’s how not to do it!
Let’s talk about how to do it well:
• Some principles
• Best practices for static visualization
• Emerging principles and tools for interactive
visualization
What is Visualization?
Definition (www.oed.com)
1. The action or fact of visualizing; the power or
process of forming a mental picture or vision of
something not actually present to the sight; a picture
thus formed.
More Definitions
• “Transformation of the symbolic into the geometric”
[McCormick et al. 1987]
• “... finding the artificial memory that best supports
our natural means of perception.” [Bertin 1967]
• “The use of computer-generated, interactive, visual
representations of data to amplify cognition.”
[Card, Mackinlay, & Shneiderman 1999]
Uses for Data Viz
A: Support reasoning about information (analysis)
• Finding relationships
• Discovering structure
• Quantifying values and influences
• Should be part of a query/analyze cycle
B: Inform and persuade others (communication)
• Capture attention, engage
• Tell a story visually
• Focus on certain aspects, and omit others
Data Presentation
• Designer-Reader-Data Trinity
From “Designing Data Visualizations”,
Iliinsky and Steele, O’Reilly, 2011
A case for Ugly visualizations
People instinctively gravitate to attractive visualizations, and
they have a better chance of getting on the cover of a journal.
But does this conflict with the goals of visualization?
• Rapid exploration
• Focus on most important details
• Easy and fast to develop and
customize
PowerPoint vs Keynote vs InDesign
A case for Ugly visualizations
But you can go too far:
Ugliness does correlate with being hard to interpret, but they’re not
the same thing.
Data Scientist’s Workflow
Sandbox
Production
Digging Around in Data
Hypothesize
Model
Evaluate
Interpret
Large Scale Exploitation
A case for Interactivity
Visualizations usually aren’t an end in themselves,
but part of a query/interpret cycle.
Interactivity can speed up the query/interpret cycle.
Baby Names Voyager
(Wattenberg et al. 2005)
An interactive visualization with rich narrative quality
(i.e. you can discover stories through the names).
http://www.babynamewizard.com/
Hides more than it reveals, but lets you explore in an
intuitive way, i.e. it supports rapid query/interpret cycles.
Many Eyes
(Wattenberg et al. 2007)
Participatory visualization and explanation site:
http://www.many-eyes.com
Outline
Visualization:
• How not to do it
• How to do static visualizations
• Making it interactive
Chart Selection – Andrew Abela
Chart Selection – Juice Analytics
Design Considerations
• Tables and charts
• Reduce chartjunk/tablejunk; increase data-ink ratio
• Lessons from perception: Limit the number of objects
displayed at once
• Typography: capitalization, serif/sans-serif; use what your
company uses!
• Colors
• Color scheme
• Contrast, emphasis
• Use what your company uses!
• 6 Gestalt Psychology principles (1912):
• For groups of objects: proximity, similarity, enclosure, connection
• Visual representation: closure, continuity
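One common way to act on the "limit the number of objects displayed at once" guideline is to keep only the largest few categories and fold the rest into a single bucket before charting. The slides do not prescribe this technique; the sketch below is an illustration under that assumption, and the function name `top_n_with_other`, its parameters, and the sample data are all hypothetical.

```python
def top_n_with_other(counts, n=5, other_label="Other"):
    """Keep the n largest categories and fold the rest into one bucket.

    counts: dict mapping category name -> numeric value (hypothetical input).
    Returns a list of (label, value) pairs, largest first, with the
    aggregated remainder appended last if it is non-zero.
    """
    # Rank categories from largest to smallest value.
    ranked = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
    kept = ranked[:n]
    # Sum everything that did not make the cut.
    remainder = sum(value for _, value in ranked[n:])
    if remainder:
        kept.append((other_label, remainder))
    return kept


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    sample = {"a": 5, "b": 3, "c": 2, "d": 1}
    for label, value in top_n_with_other(sample, n=2):
        print(label, value)
```

The choice of `n` is a judgment call; the point is only that the reader sees a handful of marks rather than dozens.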
Chart Design
• Example from Tim Bray
Design Considerations
• Color
• By default, use your organization’s palette
• Choose colors based on the information you want to
convey
• Sequential
• Diverging
• Categorical
• Use online resources to discover and record your color
schemes
• Color Brewer
• Kuler
• Colour Lovers
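To make the three scheme types concrete, here is a minimal sketch that generates one palette of each kind using only Python's standard `colorsys` module. It is an illustration, not how Color Brewer or the other tools build their palettes; the function names, hue values, and lightness ranges are assumptions chosen for the example, and hand-tuned palettes from the resources above are likely a better choice in practice.

```python
import colorsys


def _to_hex(r, g, b):
    """Convert RGB floats in [0, 1] to a '#rrggbb' string."""
    return "#{:02x}{:02x}{:02x}".format(
        round(r * 255), round(g * 255), round(b * 255)
    )


def sequential(n, hue=0.6):
    """One hue, lightness running from light to dark: for ordered data."""
    colors = []
    for i in range(n):
        lightness = 0.9 - 0.6 * i / max(n - 1, 1)  # 0.9 down to 0.3
        colors.append(_to_hex(*colorsys.hls_to_rgb(hue, lightness, 0.7)))
    return colors


def diverging(n, low_hue=0.0, high_hue=0.6):
    """Two hues fading to a light midpoint: for data with a meaningful center."""
    colors = []
    for i in range(n):
        t = 2 * i / max(n - 1, 1) - 1        # position from -1 to 1
        hue = low_hue if t < 0 else high_hue
        lightness = 0.95 - 0.55 * abs(t)     # lightest at the center
        colors.append(_to_hex(*colorsys.hls_to_rgb(hue, lightness, 0.7)))
    return colors


def categorical(n):
    """Evenly spaced hues at constant lightness: for unordered categories."""
    return [_to_hex(*colorsys.hls_to_rgb(i / n, 0.5, 0.65)) for i in range(n)]


if __name__ == "__main__":
    print("sequential: ", sequential(5))
    print("diverging:  ", diverging(5))
    print("categorical:", categorical(5))
```

The design choice mirrors the list above: sequential varies only lightness, diverging pairs two hues around a neutral center, and categorical varies only hue so that no category looks "larger" than another.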
Updates and Break
Midterm is on 11/24, 5:00-6:30 pm here.
Sample midterm (Spring 2014) is online now.
Project presentations on 12/1 and 12/3 (5 mins)
Poster session on Thursday 12/11 3:30-5pm, BIDS
BREAK