Subjective Probability
Information Design
Scott Matthews
Courses: 12-706 / 19-702 / 73-359

Admin Issues
- HW 5 (due next Wed)
- Next project schedule
- Case studies coming

12-706 and 73-359

Subjective Probabilities
- Main Idea: We all have to make personal judgments (and decisions) in the face of uncertainty (Granger Morgan’s career)
- These personal judgments are subjective
- Subjective judgments of uncertainty can be made in terms of probability
- Examples:
  - “My house will not be destroyed by a hurricane.”
  - “The Pirates will have a winning record (ever).”
  - “Driving after I have 2 drinks is safe.”

Outcomes and Events
- Event: something about which we are uncertain
- Outcome: result of uncertain event
- Subjectively: once event (e.g., coin flip) has occurred, what is our judgment on outcome?
  - Represents degree of belief of outcome
  - Long-run frequencies, etc. irrelevant - need one
- Example: Steelers* play AFC championship game at home. I TiVo it instead of watching live. I assume before watching that they will lose.
- *Insert Cubs, etc. as needed (Sox removed 2005)

Next Steps
- Goal is capturing the uncertainty, biases, etc. in these judgments
- Might need to quantify verbal expressions (e.g., remote, likely, non-negligible...)
- What to do if question not answerable directly?
- Example: if I say there is a “negligible” chance of anyone failing this class, what probability do you assume?
- What if I say “non-negligible chance that someone will fail”?
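When a probability cannot be stated directly, one common workaround is to adjust the stakes of two opposite bets until you are indifferent between them, then back out the probability that indifference implies. The sketch below is illustrative only: the function names and dollar amounts are hypothetical, and it assumes you are roughly risk-neutral over stakes this small.

```python
def implied_probability(win_if_event: float, lose_if_not: float) -> float:
    """Subjective P(event) implied by indifference between two opposite bets.

    Bet A: win `win_if_event` if the event occurs, lose `lose_if_not` otherwise.
    Bet B: the mirror image (lose `win_if_event` if the event occurs,
           win `lose_if_not` otherwise).
    Setting the two expected values equal gives p = lose / (win + lose).
    """
    if win_if_event <= 0 or lose_if_not <= 0:
        raise ValueError("both stakes must be positive")
    return lose_if_not / (win_if_event + lose_if_not)


def expected_values(p: float, win_if_event: float, lose_if_not: float):
    """Expected payoff of Bet A and Bet B for a given probability p."""
    bet_a = p * win_if_event - (1 - p) * lose_if_not
    bet_b = -p * win_if_event + (1 - p) * lose_if_not
    return bet_a, bet_b


# Hypothetical stakes: indifferent when winning $30 is set against losing $70.
p = implied_probability(30.0, 70.0)
ev_a, ev_b = expected_values(p, 30.0, 70.0)
print(f"implied probability: {p:.2f}")
print(f"expected values at indifference: {ev_a:.4f}, {ev_b:.4f}")
```

With these made-up stakes the implied probability is 0.7, and both bets have an expected value of about zero, which is what indifference means under the risk-neutrality assumption.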

Merging of Theories
- Science has long known that “objective” and “subjective” factors exist
- Only more recently did we realize we could represent subjective judgments as probabilities
- But inherently all of these subjective decisions can be ordered by decision tree
  - Where we have a gamble or bet between what we know and what we think we know
- Clemen uses the basketball game gamble example
  - We would keep adjusting payoffs until we are indifferent between the bets

Probability Wheel
- Mechanism for formalizing our thoughts on probabilities of comparative lotteries
- You adjust the area of the pie chart until you’re indifferent between the two lotteries
- Quick 2-person exercise. Then we’ll discuss p-values.

Continuous Distributions
- Similar to above, but we need to do it a few times
- E.g., try to get 5%, 50%, 95% points on distribution
- Each point done with a “cdf-like” lottery comparison

Danger: Heuristics and Biases
- Heuristics are “rules of thumb”
- Which do we use in life? Biased? How?
  - Representativeness (fit in a category)
  - Availability (seen it before, fits memory)
  - Anchoring/Adjusting (common base point)
  - Motivational Bias (perverse incentives)
- Idea is to consider these in advance and make people aware of them

Asking Experts
- In the end, often we do studies like this, but use experts for elicitation
- Idea is we should “trust” their predictions more, and can better deal with biases
- Lots of training and reinforcement steps
- But in the end, get nice prob functions

Information Design
- What is it? Idea of carefully linking what data you have with what you want to say
- “God” of the field: Edward Tufte (.com)
- Quotes from his books (mostly his first)
  - The eye can recognize 150 Mbits of information
  - And is connected to our brain, a great processor
- Perhaps most important: don’t just blindly use built-in graph/graphic tools when you have a significant point to make
- a.k.a. Excel and PowerPoint are not friends!
  - They create simplistic graphs that dumb us down
- Your graphics say a lot about your perceived command

Some pre-thoughts
- In statistics, plotting raw data is useful because it can show outliers (easy to see)
- Analytical results need same treatment

Strive for “Graphical Excellence”
- "consists of complex ideas communicated with clarity, precision, and efficiency
- is that which gives to the viewer the greatest number of ideas in the shortest time with the least “ink” in the smallest space
- is nearly always multivariate
- requires telling the truth about the data."

Graphics/Viz should:
- "show the data
- induce viewer to think about the substance rather than about methodology, graphic design, the technology, etc.
- avoid distorting what the data have to say
- present many numbers in a small space
- make large data sets coherent
- encourage the eye to compare different pieces of data
- reveal the data at several levels of detail, from a broad overview to the fine structure
- serve a reasonably clear purpose: description, exploration, tabulation, or decoration
- be closely integrated with the statistical and verbal descriptions of a data set."

Visualization goals
- Content focus
- Comparison rather than mere description
- Integrity
- High resolution
- Utilization of classic designs and concepts proven by time

Content Focus
- “Above all else show the data.”
- The focus should be on the content of the data, not the visualization technique. This leads to design transparency.
- The success of a visualization is based on deep knowledge and care about the substance, and the quality, relevance and integrity of the content
- Assume that the viewer is just as smart as you and cares just as much
- Never “dumb down” a visualization

Comparison vs. Description
- At the heart of quantitative reasoning is a single question: Compared to what?
- Most visualizations today are descriptive rather than comparative.
- The xy-plot invites reasoning about causality in a way that even the most impressive isosurface does not.
- We should strive for relational, rather than merely descriptive, visualizations.
- Avoid relying on the viewer's memory to make visual comparisons, a weak facility in most of us.

Integrity
- Misleading visualizations are common
- To help limit unintentional visualization lies:
  - "The representation of numbers, as physically measured on the surface of the graphic itself, should be directly proportional to the numerical quantities represented
  - Clear, detailed, and thorough labeling should be used to defeat graphical distortion and ambiguity
  - Write out explanations of the data on the graphic itself. Label important events in the data
  - Show data variation, not design variation
  - The number of information-carrying (variable) dimensions depicted should not exceed the number of dimensions in the data"

“Lie Factor”
- Lie-factor = size-of-effect-shown-in-visualization / size-of-effect-in-data

Design Guidelines
- Visualizations "are paragraphs about data and should be treated as such."
- Words, pictures, and numbers are all part of the information to be visualized, not separate entities
- "have a properly chosen format and design
- use words, numbers, and drawing together
- reflect balance, proportion, sense of relevant scale
- display an accessible complexity of detail
- often have a narrative quality, a story to tell about the data
- avoid content-free decoration, including “chartjunk” (miscellaneous graphics that have nothing to do with

Examples, and what’s wrong?
- Think of Tufte’s “rules” above. Specify.

Nice attempt gone bad..
- Graphic was bad before scan made it worse ;-)
- Source: NY Times, Aug 9, 1978, p. D-2
- Caption says “Fuel Economy Standards for Autos, set by Congress and supplemented by DOT, in miles per gallon”

What’s wrong? What could we do better?

- Sorted by 5-yr
- Formatted nicer (big small)
- Source: http://edwardtufte.com

- Consistent scale in this case causes lots of crossover and clutter.

- Labels on both sides!

- How far we’ve come!
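
The lie factor defined earlier can be sketched as a short calculation, where the size of an effect is taken as the relative change between two values. The function names here are hypothetical. The inputs are not from these slides: they are the figures usually quoted from Tufte's discussion of the 1978 fuel economy graphic (lines of roughly 0.6 and 5.3 inches drawn for 18 and 27.5 miles per gallon) and should be treated as assumed values.

```python
def relative_change(first: float, second: float) -> float:
    """Size of an effect, measured as the fractional change from first to second."""
    if first == 0:
        raise ValueError("first value must be non-zero")
    return (second - first) / abs(first)


def lie_factor(shown_first: float, shown_second: float,
               data_first: float, data_second: float) -> float:
    """Lie factor = size of effect shown in the graphic / size of effect in the data."""
    effect_in_data = relative_change(data_first, data_second)
    if effect_in_data == 0:
        raise ValueError("the data show no change, so the lie factor is undefined")
    effect_shown = relative_change(shown_first, shown_second)
    return effect_shown / effect_in_data


# Assumed inputs (not taken from the slides): drawn line lengths in inches
# and the fuel economy standards, in miles per gallon, that they represent.
shown_1978, shown_1985 = 0.6, 5.3
mpg_1978, mpg_1985 = 18.0, 27.5

factor = lie_factor(shown_1978, shown_1985, mpg_1978, mpg_1985)
print(f"effect shown:   {relative_change(shown_1978, shown_1985):.0%}")
print(f"effect in data: {relative_change(mpg_1978, mpg_1985):.0%}")
print(f"lie factor:     {factor:.1f}")
```

A lie factor close to 1 means the graphic is roughly proportional to the data. Under the assumed inputs above, the drawn change is many times larger than the change in the standards themselves.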