July 2013

Tutorial: Introduction to Game Theory
Jesus Rios
IBM T.J. Watson Research Center, USA
[email protected]
© 2013 IBM Corporation

Approaches to decision analysis
– Descriptive: understanding how decisions are actually made
– Normative: models of how decisions should be made
– Prescriptive: helping the DM make smart decisions
  • Use of normative theory to support the DM
  • Eliciting the inputs of normative models
    - DM preferences and beliefs (psycho-analysis)
    - Use of experts
  • Role of descriptive theories of DM behavior

Game theory arena
Non-cooperative games
– More than one intelligent player
– Individual action spaces
– Interdependent consequences
  • Each player's consequences depend on his own and the other players' actions
Cooperative game theory
– Normative bargaining models
  • Joint decision making: binding agreements on what to play
  • Given the players' preferences and the solution space, find a fair, jointly satisfying and Pareto-optimal agreement/solution
– Group decision making on a common action space (social choice)
  • Preference aggregation
  • Voting rules: Arrow's theorem
– Coalition games

Cooperative game theory: bargaining solution concepts
Working alone, Juan makes $10 and Maria makes $20; working together they make $100. How should they distribute the profits of cooperation?
– Disagreement point: BATNA, status quo
– Feasible solutions: ZOPA
– Pareto efficiency
– Aspiration levels and bliss point
– Fairness: K-S (Kalai-Smorodinsky), Nash, and maxmin solutions
– With x + y = 100, a fair split here is x = 45 for Juan and y = 55 for Maria
(The original slide shows this as a diagram of the feasible set, with Juan's share x and Maria's share y on the axes.)

Normative models of decision making under uncertainty
Models for a unitary DM
– vN-M expected utility
  • Objective probability distributions
– Subjective expected utility (SEU)
  • Subjective probability distributions
Example: investment decision problem
– One decision variable with two alternatives
  • In what to invest?
    - Treasury bonds
    - IBM shares
– One uncertainty with two possible states
  • IBM share price at the end of the year: High or Low
– One evaluation criterion for consequences
  • Profit from the investment
The simplest decision problem under uncertainty

Decision table
The DM chooses a row without knowing which column will occur.
Does the choice depend on the relative likelihood of High and Low?
– If the DM is sure that the IBM share price will be High, the best choice is to buy shares
– If the DM is sure that the IBM share price will be Low, the best choice is to buy bonds
– Elicit the DM's beliefs about which column will occur
The choice also depends on the value of money
– Expected return is not a good measure of decision preferences
  • The two alternatives can give the same expected return, yet most DMs would not feel indifferent between them
– Elicit the DM's risk attitude

Decision tree representation
– IBM shares (uncertainty): price High gives $2,000; price Low gives -$1,000
– Bonds (certainty): $500
What does the choice depend upon?
– The relative likelihood of High vs. Low
– The strength of preferences for money

Subjective expected utility solution
If the DM's decision behavior is consistent with a set of "rational" desiderata (axioms), the DM decides as if he had
– probabilities representing his beliefs about the future price of the IBM share
– "utilities" representing his preferences and risk attitude towards money
and chooses the alternative of maximum expected utility.
The subjective expected utility model balances, in a "rational" manner, the DM's beliefs and risk attitudes.
Applying it requires
– knowing the DM's beliefs and "utilities"
  • Different elicitation methods
– computing the expected utility of each decision strategy
  • This may require approximation in non-simple problems

A constructive definition of "utility"
The Basic Canonical Reference Lottery ticket (p-BCRL): win $2,000 with canonical probability p, lose $1,000 with probability 1 - p.
Preferences over BCRLs: p-BCRL > q-BCRL iff p > q, where p and q are canonical probabilities.

Eliciting the probability of the price of IBM shares
– Event H: IBM price High (payoff $2,000); event L: IBM price Low (payoff -$1,000); Pr(H) + Pr(L) = 1
– Move p from 1 to 0, asking which alternative the DM prefers: IBM shares or the p-BCRL
– There exists a breakeven canonical probability pH at which the DM is indifferent: pH-BCRL ~ IBM shares
– The judgmental probability of H is pH

Eliciting the utility of $500
– Compare the p-BCRL with the bonds ($500 for certain)
– Move p from 1 to 0, asking which alternative the DM prefers
– There exists a breakeven canonical probability u at which the DM is indifferent: u-BCRL ~ Bonds
– This scales the value of $500 between the values of $2,000 and -$1,000: U($500) = u
What then is U($500)?
– The probability of a BCRL between $2,000 and -$1,000 that is indifferent (for the DM) to getting $500 with certainty

Comparison of alternatives
– IBM shares ~ pH-BCRL and Bonds ~ U($500)-BCRL
– The DM prefers to invest in IBM shares iff pH > U($500)

Solving the tree: backward induction
Utility scaling: 0 = U(-$1,000) < U($500) = u < U($2,000) = 1
– IBM shares: price High, with probability pH, gives utility 1; price Low, with probability 1 - pH, gives utility 0
– Bonds give utility u

Preferences: value vs. utility
Value functions
– measure the desirability (intensity of preferences) of money gained,
– but do not measure risk attitude
Utility functions
– measure risk attitude,
– but not the intensity of preferences over sure consequences
Many methods exist to elicit a utility function
– Qualitative analysis of risk attitude leads to parametric utility functions
– Quantitative indifference questions between deals (one of which must be an uncertain lottery) assess the parameters of the utility function
– Consistency checks and sensitivity analysis

The Bayesian process of inference and evaluation with several stakeholders and decision makers (group decision making)

Disagreements in group decision making
Group decision making assumes
– a group value/utility function
– group probabilities on the uncertainties
If our experts disagree on the science (the expert problem)
– How do we draw together and learn from conflicting probabilistic judgements?
– Mathematical aggregation
  • Bayesian approach
  • Opinion pools
    - No opinion pool satisfies a minimal consensus set of "good" probabilistic properties
  • Issues
    - How do we model knowledge overlap/correlation?
    - Expertise evaluation
– Behavioural aggregation
– The textbook problem
  • If we do not have access to experts, we need to develop meta-analytical methodologies for drawing together expert-judgment
studies

Disagreements in group decision making
If group members disagree on the values
– How do we combine different individuals' rankings of options into a group ranking?
– Arbitration/voting
  • Ordinal rankings: Arrow impossibility results
  • Cardinal rankings (values, not utilities: decisions without uncertainty)
    - Interpersonal comparison of preference strengths
    - Supra-decision-maker approach (MAUT)
  • Issues: manipulation and truthful reporting of rankings
Disagreement on both the values and the science
– Combine individual probabilities and utilities into group probabilities and utilities, respectively, form the corresponding group expected utilities, and choose accordingly
– Impossibility of being Bayesian and Paretian at the same time
  • No aggregation method (of probabilities and utilities) exists that is compatible with the Pareto order
– Behavioral approaches
  • Consensus on group probabilities and utilities via sensitivity analysis
  • Agreement on what to do via negotiation

Decision analysis in the presence of intelligent others
Matrix games against nature
– One player: R (Row)
  • Two choices: U (Up) and D (Down)
– Payoff matrix (R's payoffs against Nature's columns L and R):

          L    R
    U     0    5
    D    10    3

If you were R, what would you do?
– D > U against L
– U > D against R

Games against nature
Do we know which column Nature will choose?
– We know our best responses to Nature's moves, but not which move Nature will choose
Do we know the (objective) probabilities of Nature's possible moves?
– YES: with Pr(L) = p,
  • expected payoff of U: 0p + 5(1 - p)
  • expected payoff of D: 10p + 3(1 - p)
  • U > D iff p < 1/6
(Payoffs are vNM utils.)
Games against nature and the SEU criterion
Do we know the (objective) probabilities of Nature's possible moves?
– No
  • A variety of decision criteria apply: maximin (pessimistic), maximax (optimistic), Hurwicz, minimax regret, ...

          L    R    Min   Max   Max regret
    U     0    5     0     5       10
    D    10    3     3    10        2

  • Maximin: D; Maximax: D; Minimax regret: D
SEU criterion
– Elicit the DM's subjective probabilistic beliefs about Nature's move (p)
– Compute the SEU of each alternative: D > U iff p > 1/6

Games against other intelligent players
Bimatrix (simultaneous) games
– A second intelligent player: C (Column)
  • Two choices: L (Left) and R (Right)
– Payoff bimatrix (row payoff, column payoff):

          L          R
    U   (0, 2)    (5, 4)*
    D   (10, 3)   (3, 8)

– We know C's payoffs and that he will try to maximize them
– As R, what would you do?
– Knowledge of C's payoffs and his rationality allows us to predict C's move (R) with certitude

One-shot simultaneous bimatrix games
Two players, each trying to maximize his own payoffs
– Each player must choose one of two fixed alternatives
– The row player chooses a row; the column player chooses a column
– Payoffs depend on both players' moves
Simultaneous-move game
– Players must act without knowing what the other player does
– Played once
No other uncertainties are involved
Players have full and common knowledge of
– the choice spaces
– the bimatrix payoffs
No cooperation is allowed

          L                       R
    U   uR(U,L), uC(U,L)     uR(U,R), uC(U,R)
    D   uR(D,L), uC(D,L)     uR(D,R), uC(D,R)

Dominant alternatives and social dilemmas
Prisoner's dilemma
– (NC, NC) is mutually dominant
  • Each player's choice is independent of information about the other player's move
– (NC, NC) is socially dominated by (C, C)

           C           NC
    C    (5, 5)     (-5, 10)
    NC  (10, -5)    (-2, -2)*

Another example: airport network security

Iterative dominance
No dominant strategy for either player; however
– there are iteratively dominated strategies
  • L > R
  • Now M is dominant in the restricted game: M > U and M > D
  • Now L > C in the restricted game: 20 > -10
– (M, L) is the solution by iterative elimination of (strictly) dominated strategies
  • This relies on common knowledge and rationality assumptions
Exercise
– Find
if there is a solution by iteratively eliminating dominated strategies.
Solution: (D, C)

Nash equilibrium
For games without
– a dominant solution, or
– a solution by iterative elimination of dominated alternatives

Battle of the sexes (two pure-strategy NEs, marked *):

              Concert     Ballet
    Concert   (1, 2)*     (0, 0)
    Ballet    (0, 0)      (2, 1)*

Matching pennies (no pure-strategy NE):

             Head        Tails
    Head    (1, -1)     (-1, 1)
    Tails   (-1, 1)     (1, -1)

Existence of Nash equilibrium (Nash)
Every finite game has a NE in mixed strategies
– This requires extending the original set of alternatives of each player
Consider the matching pennies game
– Mixed strategies
  • Choosing a lottery with given probabilities over Head and Tails
– Players' choice sets are defined by the lottery's probability
  • Row: p in [0, 1]
  • Column: q in [0, 1]
– The payoff associated with a pair of strategies (p, q) is
  • (p, 1 - p) P (q, 1 - q)^T, where P is the payoff matrix of the original game in pure strategies
  • Payoffs need to be vNM utilities
– Nash equilibrium (p*, q*)
  • The intersection of the players' best-response correspondences:
    uR(p*, q*) ≥ uR(p, q*) for all p
    uC(p*, q*) ≥ uC(p*, q) for all q

The Nash equilibrium concept as a predictive tool
Supporting the row player against the column player
Games with multiple NEs (row payoff, column payoff):

          L            R
    U   (4, -100)   (10, 6)*
    D   (12, 8)*    (5, 4)

– Two NEs: (D, L) and (U, R), with (D, L) > (U, R) for both players, since 12 > 10 and 8 > 6
– To protect himself against -100, C may prefer to play R
– Knowing this, R would prefer to play U, ending up at the inferior NE (U, R)
How can we model C's behavior?
– Bayesian K-level thinking

K-level thinking
Row is not sure about Column's move
– p: Row's belief that Column moves L
– Row's SEU
  • U: 4p + 10(1 - p)
  • D: 12p + 5(1 - p)
– U > D iff p < 5/13 ≈ 0.38
How to elicit p?
– Row’s analysis of Column’s decision • Assuming C behave as a SEU maximizer • q: C’s beliefs about whether Row is smart enough to choose D (best NE) • L SEU: -100 (1-q) + 8 q R SEU: 6 (1-q) + 4 q • L > R iff q > 53/55 = 0.96 • Since Row does not know q, his beliefs about q are represented by a CPD F • p = Pr (q > 0.96) = F(0.96) 28 © 2013 IBM Corporation Simultaneous vs sequential games First mover advantage – Both players want to move first • Credible commitment/threat Game of Chicken 29 Second mover advantage – Players want to observe their opponent’s move before acting – Both players try not to disclose their moves Matching pennies game © 2013 IBM Corporation Dynamic games: backward induction Sequential Defend-Attack games – Two intelligent players • Defender and Attacker – Sequential moves • First Defender, afterwards Attacker knowing Defender’s decision 30 © 2013 IBM Corporation Standard Game Theoretic Analysis Expected utilities at node S Best Attacker’s decision at node A Assuming Defender knows Attacker’s analysis Defender’s best decision at node D Solution: 31 © 2013 IBM Corporation Supporting a SEU maximizer Defender Defender’s problem Defender’s solution of maximum SEU Modeling input: 32 ?? © 2013 IBM Corporation Example: Banks-Anderson (2006) Exploring how to defend US against a possible smallpox attack – Random costs (payoffs) – Conditional probabilities of each kind of smallpox attack given terrorists know what defence has been adopted This is the problematic step of the analysis – Compute expected cost of each defence strategy Solution: defence of minimum expected cost 33 © 2013 IBM Corporation Predicting Attacker’s decision: Defender problem 34 . 
Defender’s view of Attacker problem © 2013 IBM Corporation Solving the assessment problem Defender’s view of Attacker problem Elicitation of A is an EU maximizer D’s beliefs about MC simulation 35 © 2013 IBM Corporation Bayesian decision solution for the sequential Defend- Attack model 36 © 2013 IBM Corporation Standard Game Theory vs. Bayesian Decision Analysis Decision Analysis (unitary DM) – Use of decision trees – Opponent’ actions treated as a random variables • How to elicit probs on opponents’ decisions?? • Sensitivity analysis on (problematic) probabilities Game theory (multiple DMs) – Use of game trees – Opponent’ actions treated as a decision variables – All players are EU maximizers • Do we really know the utilities our opponents try to maximizes? 37 © 2013 IBM Corporation Bayesian decision analysis approach to games One-sided prescriptive support – Use a prescriptive model (SEU) for supporting one of the DMs – Treat opponent's decisions as uncertainties – Assess probs over opponent's possible actions – Compute action of maximum expected utility The ‘real’ bayesian approach to games (Kadane & Larkey 1982) – Weaken common (prior) knowledge assumption How to assess a prob distribution over actions of intelligent others?? 
– "Adversarial Risk Analysis" (DRI, DB and JR)
– Development of new methods for eliciting probabilities on an adversary's actions
  • by modeling the adversary's decision reasoning
    - Descriptive decision models

Relevance to counter-bioterrorism
Biological Threat Risk Assessment for DHS (Battelle, 2006)
– Based on Probability Event Trees (PET)
  • Government and terrorist decisions treated as random events
Methodological improvements study (NRC committee)
– PET is appropriate for risk assessment of random failures in engineering systems, but not for adversarial risk assessment
  • Terrorists are intelligent adversaries trying to achieve their own objectives
  • Their decisions (if rational) can to some extent be anticipated
– PET cannot be used for a full risk-management analysis
  • The Government is a decision maker, not a random variable

Methodological improvement recommendations
Distinguish between risks from
– Nature/accidents, vs.
– the actions of intelligent adversaries
Models are needed to predict terrorist behavior
– Red-team role playing (simulations of the adversaries' thinking)
– Attack-preference models
  • Examine the decision from the Attacker's viewpoint (Terrorist as DM)
– Decision-analytic approaches
  • Transform the PET into a decision tree (Government as DM)
    - How do we elicit probabilities on terrorist decisions??
    - Sensitivity analysis on the (problematic) probabilities
    - Von Winterfeldt and O'Sullivan (2006)
– Game-theoretic approaches
  • Transform the PET into a game tree (Government and Terrorist as DMs)

Models to predict opponents' behavior
Role playing (simulations of the adversaries' thinking)
Opponent-preference models
– Examine the decision from the opponent's viewpoint
  • Elicit the opponent's probabilities and utilities from our viewpoint (point estimates)
– Treat the opponent as an EU maximizer (= rationality?)
  • Solve the opponent's decision problem by finding his action of maximum
EU
– Assuming we know the opponent's true probabilities and utilities,
  • we can anticipate with certitude what the opponent will do
Probabilistic prediction models
– Acknowledge our uncertainty about the opponent's thinking

Opponent-preference models
Von Winterfeldt and O'Sullivan (2006)
– Should We Protect Commercial Airplanes Against Surface-to-Air Missile Attacks by Terrorists?
– Decision tree + sensitivity analysis on the probabilities

Parnell (2007)
– Elicit the Terrorist's probabilities and utilities from our viewpoint (point estimates)
– Solve the Terrorist's decision problem
  • Find the Terrorist's action of maximum expected utility
– Assuming we know the Terrorist's true probabilities and utilities,
  • we can anticipate with certitude what the terrorist will do

Parnell (2007): Terrorist decision tree

Paté-Cornell & Guikema (2002): Attacker and Defender models

Paté-Cornell & Guikema (2002)
Assessing the probabilities of the terrorist's actions
– From the Defender's viewpoint
  • Model the Attacker's decision problem
  • Estimate the Attacker's probabilities and utilities (point estimates)
  • Calculate the expected utilities of the Attacker's actions
– The probability of each of the Attacker's actions is proportional to its perceived EU
Feed these probabilities into the Defender's decision problem
– The uncertainty about the Attacker's decisions has been quantified
– Choose the defence of maximum expected utility
Shortcoming
– If the (idealized) adversary is an EU maximizer, he would certainly choose the attack of maximum expected utility

How do we assess probabilities over the actions of an intelligent adversary??
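One simple answer, used in the Paté-Cornell & Guikema approach described above, is to make the probability of each of the Attacker's actions proportional to its perceived expected utility. A minimal sketch, with illustrative action names and EU numbers:

```python
# Paté-Cornell & Guikema-style prediction: instead of assuming the
# Attacker certainly plays his max-EU action, assign each action a
# probability proportional to its perceived expected utility.
# The action labels and EU values below are illustrative.

def eu_proportional_probs(eus):
    """Map nonnegative expected utilities to action probabilities."""
    if min(eus.values()) < 0:
        raise ValueError("shift utilities to be nonnegative first")
    total = sum(eus.values())
    return {a: v / total for a, v in eus.items()}

perceived_eu = {"attack_A": 6.0, "attack_B": 3.0, "no_attack": 1.0}
probs = eu_proportional_probs(perceived_eu)
print(probs)  # {'attack_A': 0.6, 'attack_B': 0.3, 'no_attack': 0.1}
```

Note the shortcoming stated above: a true EU maximizer would put probability one on the best action, so this proportional rule is a descriptive modeling choice, not a consequence of rationality.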
Raiffa (2002): asymmetric prescriptive/descriptive approach
– Prescriptive advice to one party, conditional on a (probabilistic) description of how the others will behave
– Assess the probability distribution from experimental data
  • Lab role-simulation experiments
Rios Insua, Rios & Banks (2009)
– Assessment based on an analysis of the adversary's rational behavior
  • Assuming the opponent is an SEU maximizer:
    - model his decision problem
    - assess his probabilities and utilities
    - find his action of maximum expected utility
– The uncertainty about the Attacker's decision stems from our uncertainty about his probabilities and utilities
– Sources of information
  • Available past statistical data on the Attacker's decision behavior
  • Expert knowledge / intelligence

The Defend-Attack-Defend model
Two intelligent players: Defender and Attacker
Sequential moves
– First, the Defender moves
– Afterwards, the Attacker, knowing the Defender's move
– Afterwards, the Defender again, responding to the attack
Infinite regress

Standard game theory analysis
Under common knowledge of utilities and probabilities
– Expected utilities at node S
– Best Attacker's decision at node A
– Best Defender's decision at node D
– Nash solution

Supporting the Defender against the Attacker
– Expected utilities at node S
– At node A: the Attacker's behavior is unknown to the Defender (??)
– Best Defender's decision at node D
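The standard analysis just outlined (the Attacker's best response at node A, then the Defender's best decision at node D) can be sketched, under the common knowledge assumption, for a toy sequential Defend-Attack game. All action names and payoffs below are illustrative assumptions:

```python
# Backward induction for a toy sequential Defend-Attack game under
# common knowledge: the Defender knows the Attacker's utilities, so she
# can compute the Attacker's best response a*(d) to each defence d and
# then pick the defence that maximizes her own utility at (d, a*(d)).
# All numbers are illustrative.

DEFENCES = ["harden", "status_quo"]
ATTACKS = ["attack", "no_attack"]

# u[(d, a)] = (Defender's utility, Attacker's utility)
u = {
    ("harden", "attack"): (-2, -1),
    ("harden", "no_attack"): (1, 0),
    ("status_quo", "attack"): (-10, 5),
    ("status_quo", "no_attack"): (2, 0),
}

def attacker_best_response(d):
    return max(ATTACKS, key=lambda a: u[(d, a)][1])

def defender_best_decision():
    return max(DEFENCES, key=lambda d: u[(d, attacker_best_response(d))][0])

d_star = defender_best_decision()
print(d_star, attacker_best_response(d_star))  # harden no_attack
```

Hardening deters the attack here precisely because the Defender anticipates the Attacker's best response to each defence before choosing her own move.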
Predicting the Attacker's problem as seen by the Defender

Assessing the Defender's judgmental inputs

Monte-Carlo approximation
– Draw samples, generate the corresponding optimal Attacker decisions, and approximate the predictive distribution

The assessment hierarchy
The Defender may want to exploit information about how the Attacker analyzes her problem
– Hierarchy of recursive analysis
– Infinite regress
– Stop when there is no more information to elicit

Games with private information
Example
– Consider the following two-person simultaneous game with asymmetric information
  • Player 1 (Row) knows whether he is stronger than Player 2 (Column), but Player 2 does not know this
  • A player's type is used to represent the information privately known by that player

Bayes Nash equilibrium
Assumption: a common prior over the row player's type
– Column's beliefs about the row player's type are common knowledge
– Why would Column disclose this information?
– Why would Row believe that Column is disclosing her true beliefs about his type?
Row's strategy is a function of his type

Is the common knowledge assumption realistic?
– Column is better off reporting that …

Modeling opponents' learning of private information
Simultaneous decisions
– Bayes Nash equilibrium
– No opportunity to learn about this information
Sequential decisions
– Perfect Bayesian equilibrium / sequential rationality
– Opportunity to learn from the observed decision behavior
  • Signaling games
Models of adversaries' thinking to anticipate their decision behavior
– need to model the opponents' learning of private information we want to keep secret
– how would this lead to a predictive probability distribution?
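The Monte Carlo approximation mentioned above can be sketched as follows: since the Defender is uncertain about the Attacker's probabilities and utilities, she draws them from her beliefs, solves the Attacker's maximum-EU problem for each draw, and approximates the predictive distribution over attacks by the resulting frequencies. The action names, distributions, and numbers are illustrative assumptions:

```python
import random

# Monte Carlo step of adversarial risk analysis, as a hedged sketch:
# each draw represents one possible pair of Attacker probabilities and
# utilities, summarized here directly as a draw of the Attacker's
# expected utilities. Per draw, the Attacker is an EU maximizer; the
# frequency of each optimal action estimates p(a | d).

random.seed(1)
ATTACKS = ["attack", "no_attack"]

def sample_attacker_eu():
    """One draw from the Defender's beliefs about the Attacker's EUs."""
    return {"attack": random.gauss(0.5, 0.3),
            "no_attack": random.gauss(0.4, 0.1)}

def predict_attack_probs(n_draws=50_000):
    counts = {a: 0 for a in ATTACKS}
    for _ in range(n_draws):
        eus = sample_attacker_eu()
        best = max(ATTACKS, key=eus.get)  # Attacker as EU maximizer per draw
        counts[best] += 1
    return {a: c / n_draws for a, c in counts.items()}

probs = predict_attack_probs()
print(probs)  # p(attack | d) comes out around 0.6 under these assumptions
```

The resulting predictive distribution is then fed into the Defender's own SEU maximization, replacing the point prediction of standard game-theoretic analysis.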
Sequential Defend-Attack model with Defender's private information
Two intelligent players: Defender and Attacker
Sequential moves
– First the Defender, afterwards the Attacker, knowing the Defender's decision
The Defender's decision takes into account her private information
– The vulnerabilities and importance of the sites she wants to protect
– The position of ground soldiers in the data-ferry control problem (ITA)
The Attacker observes the Defender's decision
– The Attacker can infer/learn about information she wants to keep secret
How do we model the Attacker's learning?

Influence diagram vs. game tree representation

A game-theoretic analysis and solution

Supporting the Defender
We weaken the common knowledge assumption
The Defender's decision problem: nodes D, S, A (??), V

Defender's solution

Predicting the Attacker's move

The Attacker's action of maximum expected utility

Assessing the Defender's judgmental inputs

How to stop this hierarchy of recursive analysis?
A potentially infinite analysis of nested decision models: where to stop?
– Accommodate as much information as we can
– Stop when the Defender has no more information
– Use a non-informative or reference model at the base
– Sensitivity analysis test
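One way to operationalize this stopping rule is a finite recursion: at the base level the Defender uses a non-informative (uniform) reference model of the Attacker, while at the next level she models the Attacker as an EU maximizer; deeper levels would nest this reasoning further and stop when no more information can be elicited. A minimal sketch, with illustrative actions and payoffs:

```python
# Stopping the recursive hierarchy with a non-informative base model.
# Level 0: the Defender's predictive model of the Attacker is uniform
# (reference model). Level 1: the Attacker is modeled as an EU maximizer
# best-responding to the observed defence. Deeper levels would nest this
# reasoning further; here the recursion stops at level 1.
# All payoffs are illustrative.

DEFENCES = ["harden", "status_quo"]
ATTACKS = ["attack", "no_attack"]
uD = {("harden", "attack"): -2, ("harden", "no_attack"): 1,
      ("status_quo", "attack"): -6, ("status_quo", "no_attack"): 8}
uA = {("harden", "attack"): -1, ("harden", "no_attack"): 0,
      ("status_quo", "attack"): 5, ("status_quo", "no_attack"): 0}

def attack_probs(d, level):
    """Defender's predictive distribution over attacks, given defence d."""
    if level == 0:
        return {a: 1 / len(ATTACKS) for a in ATTACKS}  # reference model
    best = max(ATTACKS, key=lambda a: uA[(d, a)])      # EU-maximizing Attacker
    return {a: float(a == best) for a in ATTACKS}

def best_defence(level):
    def seu(d):
        return sum(attack_probs(d, level)[a] * uD[(d, a)] for a in ATTACKS)
    return max(DEFENCES, key=seu)

print(best_defence(0), best_defence(1))  # status_quo harden
```

Here adding one level of information about the Attacker changes the recommended defence, which is exactly what the sensitivity-analysis test on the hierarchy is meant to detect.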