COLLEGIO EUROPEO DI PARMA
Borgo Lalatta 14, 43100 Parma
Tel. +39 (0)521.207525, Fax +39 (0)521.384653
www.collegio.europeo.parma.it

Advanced Diploma in European Studies (ADES)

THE ECONOMIC ROLE OF THE STATE
Uncertainty and Game Theory: Private Choices and Public Policies

by Pietro A. Vagliasindi*

April 2004
_______________________________
* Professor of Public Finance and Industrial Organisation, University of Parma
Director of the Department of Economics, Finance & International Law, Via Università 12, 43100 Parma
Tel: +39-0521-034561 Fax: +39-0521-034562 Email: [email protected]

Table of Contents

Introduction
1. Choice under uncertainty
   A. The expected return: Heads or tails?
   B. The expected utility: risk aversion and risk preference.
   C. Rational behaviour in an uncertain world.
   D. Von Neumann utility function and social welfare.
2. Strategic thinking and game theory
   A. Interdependencies, strategic behaviour and rational individuals
   B. Strategic Behaviour: the Prisoner's Dilemma
   C. Repeated Prisoner's Dilemma
3. Game Theory
   A. Static games, dynamic games and Nash equilibria.
   B. Extensive form, dynamic games and perfect Nash equilibria.
4. Economic Applications: Oligopoly.
   A. Oligopoly applications of the Nash Equilibrium
   B. Cournot Competition: Quantity Strategy.
   C. Bertrand Competition: Price Strategy.
   D. The Stackelberg dynamic game.
   E. Tacit collusion in repeated games: Cournot supergames
5. Appendix: Solution to selected problems

Introduction

Economic theory shows how markets work, i.e. how prices and quantities are determined. In a world of certainty, decisions involve predictable streams of costs and benefits. Converting these into present value, one can compare them and choose the optimal decision (that is, the one which maximises the benefits, net of costs). However, we live in an uncertain world (and in a strategic context), and public policies (like the ones described in this course) take place in it. In this setting the analysis of individual choices is a bit more complex, as you can see from Part 1. However, under some simplifying assumptions economic problems can be dealt with as in a world of certainty. Specifically, we assume that each individual does not know what will happen, but knows the likelihood with which each outcome will be realised. In an uncertain world, economic agents will then maximise their welfare by choosing the alternative that offers them the highest expected return (or utility), i.e.
the sum of returns (utility) associated with the different possible outcomes, each weighted by its probability. The reader not familiar with expected return and expected utility can become familiar with the implications of uncertainty by going first to the appendix. There we consider the utility flow from a year's expenditure, measuring expected return in "euros" and expected utility in "utiles".

To understand the strategic economic context in which firms operate we also need to be able to use game theory as a tool of analysis. Specifically, dealing with noncooperative game theory, we focus on individual players, who seek to attain their best interest subject to given rules and possibilities. In this way, we will gain a better knowledge of firms' interactions, of their behaviour when facing competitors, and of the way they may react to regulations and competition policies.

In order to give a feeling of strategic behaviour, in part 2 we introduce game theory starting with an informal description of the prisoner's dilemma, one of the most famous games. Later, in part 3 we formalise the analysis a bit more, discussing ways in which one might "solve" a static game, while trying to avoid mathematical definitions. Finally, in part 4 we discuss applications to oligopoly theory that are closer to the subjects dealt with in the topics related to competition policy. All parts are complemented with problems to be solved. Solutions (or helpful hints for problems marked with *, which will be solved in class) can be found in part 5.

1. Choice under uncertainty

Economic theory shows how markets work, through the determination of prices and quantities. In a world of certainty, decisions involve predictable streams of costs and benefits. Converting these into present value, one can compare them and choose the optimal decision (that is, the one which maximises the benefits, net of costs). An optimal decision must be characterised by a positive present value (PV):

PV = Σi (Bi − Ci)/(1+r)^i = Σi ρ^i Pi > 0

where Pi = Bi − Ci denotes the net benefits (profits, given by the difference between benefits and costs), r is the rate of discount, ρ = 1/(1+r) the discount factor (so that ρ^i = 1/(1+r)^i) and i = 1, … refers to time. An optimal decision must also be characterised by the highest value among all possible alternatives. Otherwise, undertaking such a decision would imply giving up an alternative one with a higher (positive) present value.
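As a purely numerical illustration of the present-value rule (a minimal sketch; the project figures below are hypothetical, not from the text):

# The present-value rule above: net benefits Pi = Bi - Ci accruing at
# time i are discounted at rate r and summed.
def present_value(net_benefits, r):
    # net_benefits[i] is the net benefit P received at time i + 1
    return sum(p / (1 + r) ** (i + 1) for i, p in enumerate(net_benefits))

# Hypothetical project: pay 100 at t = 1, collect 60 at t = 2 and t = 3.
print(round(present_value([-100, 60, 60], 0.10), 2))  # 3.76 > 0: worth undertaking

Among several alternatives with positive present value, the rule selects the one with the highest.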
In an uncertain world the analysis of individual choices is more complex. However, under some simplifying assumptions we can convert the problem into one we have already solved. Specifically, we can assume that each individual has a probability distribution over possible outcomes. He does not know what will happen, but he knows the likelihood with which each outcome will be realised. His problem is how to maximise his welfare. He can choose the alternative that gives the highest expected return (or utility), i.e. the sum of returns (utility) associated with the different possible outcomes, each weighted by its probability. In what follows, we analyse the implications of uncertainty, considering the utility flow from a year's expenditure, while temporarily ignoring other complications. To make things simple, we talk in "euros" and "utiles" instead of "euros per year" and "utiles per year", i.e. an income of x euros/year for one year equals x euros.

A. The expected return: Heads or tails?

Consider the case in which you are betting on whether a coin will come up heads or tails. You have 1 € and you can choose between a certain outcome (i.e. to decline the bet, ending up with 1 €) and an uncertain outcome (i.e. to accept the bet, ending up with either more or less than 1 €). Using a fair coin, half the time it will come up heads. A rational gambler will take bets that offer an expected payoff of more than 1 € and refuse any bet that offers less. For instance, if he is paid 2 € when the coin comes up heads and he pays 1 € if it comes up tails, then, by accepting the bet, on average he gains 0.50 €. If he is offered 0.50 € for the risk of 1 €, then on average he loses 0.25 € by accepting the bet and should hence refuse it. A gambler taking the same gamble many times should choose the one with the highest expected return: he should take any bet that is better than a fair gamble, i.e. one with positive expected return.

The case of a gambler betting many times on the toss of a coin can be generalised to describe any game of chance, following the rule "maximise expected return". The expected return (E R) is the sum, over all of the possible outcomes, of the return from each outcome times the probability of that outcome:

E R = Σi πi • Ri with Σi πi = 1

Here πi is the probability of outcome i occurring and Ri is the return from outcome i. Any gamble ends up with one of the alternative outcomes happening; for instance, when you toss a coin, it must come up either heads or tails. In this gamble, using a fair coin, π1 = π2 = 0.5 are the probabilities associated with the outcomes heads and tails, according to which the gambler respectively gains R1 = 2 € and loses R2 = 1 €. The expected return is € 0.50:

E R = (π1 • R1) + (π2 • R2) = [0.5 • (+2 €)] + [0.5 • (−1 €)] = +0.50 €.

If you play the game many times, you will on average make € 0.50 each time you play. The expected return from taking the gamble is positive, so you should take it, provided that you can repeat it many times. The same applies to any other gamble with a positive expected return. A gamble with a zero expected return is a fair gamble.

Problem 1: Check the result for π1 = π2 = 0.5, R1 = 0.5 €, R2 = −1 € by redoing the calculations. What happens if π1 = 0.9 and π2 = 0.1?

Suppose now that you are playing the game once and that the bet is € 50,000, i.e. all your income. If you lose, you starve; if you win, you gain only a modest welfare increase. You may feel that a decline in your wealth from € 50,000 to zero hurts you more than an increase from € 50,000 to € 150,000 would help you. The euros that raise your income from zero to € 50,000 are worth more (per unit) than the additional 100,000 euros, starting from an income equal to 50,000 €. The rule "maximise expected return" is no longer rational. What is the rational behaviour in such a case?

B. The expected utility: risk aversion and risk preference.

John Von Neumann, the inventor of game theory, provided the answer to the question at the end of the last section, by combining the idea of expected return used in the mathematical theory of gambling (probability theory) with the idea of utility used in economics. In this way, he showed that it is possible to describe the behaviour of individuals dealing with uncertain situations. The basic underlying idea is that instead of maximizing expected return in euros, individuals maximise expected return in utiles, i.e. expected utility. Each outcome i has an associated utility Ui.
He defined expected utility as:

E U(R) = Σi πi U(Ri)

The utility you get from outcome i depends only on how much more (or less) money that outcome gives you. If utility increases linearly with income, U(R) = a + (b • R), as shown along OE in Figure 1, whatever decision maximises E R also maximises E U:

E U(R) = Σi πi (a + b • Ri) = a Σi πi + b Σi πi Ri = a + b • E R

Hence, with a linear utility function the individual maximizing his expected utility behaves like the gambler maximizing his expected return. We can represent graphically the utility level of any outcome on a bidimensional graph, such as ODE in Figure 1. Along this curve we will then find the utility of the income Ri associated with outcome i. Considering ODE in Fig. 1, if you start with R* = 50,000 € and bet all of it at even odds on the toss of a coin (heads you win, tails you lose), then the utility to you of the outcome "heads" is the utility of € 100,000 (point E). The utility to you of the outcome "tails" is the utility of zero euros (point O).

[Fig. 1: utility of income for a risk-averse agent: the concave curve ODE (declining marginal utility), together with the risk-neutral line OE, marks U(RA) = 600, E U(R) = 900 (point C), U(R*) = 1,000 and U(RB) = 1,200 at the incomes RA, RC, R* and RB. Fig. 2: utility of income for a risk lover: the convex curve ODB marks U(RA) = 200, U(R*) = 800, E U(R) = 1,000 and U(RB) = 1,800.]

ODE shows a relation where income has declining marginal utility. That is, total utility increases with income, but it increases more and more slowly as income gets higher and higher. In deciding whether to bet € 25,000, you are choosing between two different gambles. If you do not take the bet, you have the certainty (π* = 1) of ending up with R* = € 50,000. If you do take the bet, you have a 0.5 chance of ending up with RA = € 25,000 and a 0.5 chance of ending up with RB = € 75,000. So in the first case, assuming U(50,000 €) = 1,000 utiles, we have:

E U(R*) = Σi πi Ui = π* • U* = 1,000 utiles

In the second case, with U(25,000 €) = 600 and U(75,000 €) = 1,200, we have:

E U(R) = Σi πi Ui = (0.5 • 600 utiles) + (0.5 • 1,200 utiles) = 900 utiles

The individual, taking the alternative with the higher expected utility, declines the bet. In money terms, the two alternatives are equally attractive; they yield the same expected return R* = 50,000 €. In that sense, it is a fair bet. In utility terms, the sure option U(R*) = 1,000 utiles is superior to the risky one, E U(R) = 900 utiles (point C). As long as the utility function has the shape shown in Figure 1, a certainty of X € will always be preferred to a gamble with the same expected return X €. An individual who behaves in that way is defined as risk averse. Such an individual would decline a fair gamble but might accept one that is better than fair, i.e. bet € 1,000 against € 1,500 on the flip of a coin, for example.

ODB in Figure 2 shows instead the utility function of a risk lover. It exhibits increasing marginal utility. A risk lover is willing to take a gamble slightly worse than fair, although he would decline one with a very low expected return. An individual who is neither a risk lover nor a risk averter is called risk neutral. The corresponding utility function (line OE) is also shown in Figure 1.

Problem 2: Will the agent accept the bet for U* = 1,000, UA = 200 and UB = 1,800 with πA = πB = 0.5 in Figure 2? What are the values of U*, UA and UB for a risk neutral agent in Figure 1? Will he prefer the certain income R* for πA = 0.6? And for πA = 0.4?

The degree to which someone exhibits risk preference or risk aversion depends on the shape of the utility function, the initial income, and the size of the bet.
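As a minimal numerical restatement of the bet just considered (taking the utile values quoted in the text as given):

# Expected return vs expected utility for the fair bet in the text,
# using the utile values read off the concave curve ODE of Figure 1.
U = {25_000: 600, 50_000: 1_000, 75_000: 1_200}

expected_return = 0.5 * 25_000 + 0.5 * 75_000          # 50,000: the bet is fair
expected_utility = 0.5 * U[25_000] + 0.5 * U[75_000]   # 900 utiles

print(expected_return, expected_utility, U[50_000])
# 50000.0 900.0 1000 -> equal expected return, lower expected utility:
# the risk-averse agent declines the bet.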
For small bets, we can expect everyone to be roughly risk neutral; the marginal utility of a euro does not change very much between an income of 49,999 € and an income of 50,001 €, which is the relevant consideration for someone with 50,000 € who is considering a 1 € bet.

C. Rational behaviour in an uncertain world.

As section B has shown, it is easier to predict the behaviour of someone maximizing expected return than that of someone maximizing expected utility. Each individual can still maximise his utility by maximizing his expected return, as long as he can repeat the same gamble many times (so that he can expect results to average out). His income in the long run is (almost) certain. He maximises his expected utility by making that income as large as possible, i.e. by choosing the gamble with the highest expected return. Maximizing expected utility is also equivalent to maximizing expected return (like the gambler we started with) when: (i) the individual is risk neutral, or (ii) the size of the prospective gains and losses is small compared to one's income (we can treat the marginal utility of income as constant and changes in utility as proportional to changes in income, so he should act as if he were risk neutral).

Let us now consider a firm rather than an individual. The management, wishing to raise the present price of the firm's stock, maximises the expected value of its future price by maximizing the expected value of future profits. The threat of takeover bids forces management to maximise the value of the firm's stock. When management pursues its own goals the conclusion no longer holds. If the firm goes bankrupt, the income of the chief executive may fall a lot. Accordingly, he may not be willing to take a gamble with a 50 percent chance of leading to bankruptcy even if it also has a 50 percent chance of tripling the firm's value. Hence the assumption of risk neutrality might not always be appropriate for firms either.

The existence of risk averse agents explains the need for insurance. Suppose Paula's income is 30,000 € and there is a small probability (0.01) that an accident reduces it to € 10,000. The insurance company offers to insure her against that accident for a fixed price of € 200. Whether or not the accident happens, she gives them € 200. If the accident happens, they give her back € 20,000. She has a choice between two gambles: to buy or not to buy the insurance. Buying the insurance, whether or not the accident occurs she has € 30,000 minus the € 200 paid for the insurance. For the first gamble: π1 = 1; R1 = 29,800 € and E U = π1 • U(R1) = 998 utiles. When she does not buy the insurance: π1 = 0.99; R1 = € 30,000; U(R1) = 1,000 utiles and π2 = 0.01; R2 = € 10,000; U(R2) = 600 utiles. That implies:

E U(R) = [π1 • U(R1)] + [π2 • U(R2)] = 990 utiles + 6 utiles = 996 utiles.

Paula is thus better off with the insurance than without it, and will buy it. Notice that for R1 = € 30,000 the marginal utility of 100 € is about 1 utile.

Problem 3: How much would the agent in Figure 1 pay to have a certain income R* = € 50,000, instead of RA = € 25,000 and RB = € 75,000 with π1 = π2 = 0.5? How much should instead the agent in Figure 2 be paid to accept the certain income R*?

Buying the insurance was a fair gamble: € 200 are paid in exchange for a 1% chance of receiving € 20,000. An insurance company making 100,000 such bets will end up receiving, on average, almost exactly the expected return, if the probabilities of these bets are independently distributed.
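Paula's comparison can be checked in a few lines, again taking the quoted utile values as given:

# Paula's insurance decision from the text: expected utility with and
# without the (fair) insurance contract.
U = {29_800: 998, 30_000: 1_000, 10_000: 600}

eu_insured = 1.00 * U[29_800]                       # 998 utiles
eu_uninsured = 0.99 * U[30_000] + 0.01 * U[10_000]  # 996 utiles
print(eu_insured > eu_uninsured)  # True: Paula buys the insurance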
When insurance is fair, the insurance company and the client break even in monetary terms, but the client gains in utility. In the real world, insurance companies incur additional expenses other than paying out claims, and offer gambles somewhat less than fair to clients. Sufficiently risk averse consumers still accept the gamble and buy an insurance contract that lowers their expected return but increases their expected utility. In our case, with a marginal utility of 100 € ≈ 1 utile, it would still be worth buying the insurance even if the company charged € 300 for it. It would no longer be worth buying at € 500.

Problem 4: Check those results.

Buying a lottery ticket is the opposite of buying insurance. When you buy a lottery ticket, you accept an unfair gamble, but this time you do it in order to increase your uncertainty. In fact, on average, a lottery pays out in prizes less than it takes in. If you are risk averse, it may make sense for you to buy insurance, but you should never buy lottery tickets. If you are a risk lover, it may make sense for you to buy a lottery ticket, but you should never buy insurance. This brings us to the lottery-insurance paradox. In the real world, the same people sometimes buy both insurance and lottery tickets. They both gamble and buy insurance knowing the odds are against them. Is this consistent with rational behaviour? We propose two possible explanations: (i) the individual is risk averse for one range of incomes and risk preferring for another, higher, range; (ii) what individuals get is not just one chance in a billion of a 100,000 € prize, but also, for a while, the daydream (at a very low price) of getting it, because they have a slim chance to actually win the prize.

D. Von Neumann utility function and social welfare.

Von Neumann proved that if individual choice under uncertainty meets a few consistency conditions, it is always possible to assign utilities to outcomes in such a way that the decisions actually made follow from maximizing expected utility. He considers individual behaviour "rational" or "consistent" under uncertainty (i.e. choosing among "lotteries": collections of outcomes, each with a probability) if: (i) given any two lotteries A and B, the individual either prefers A to B, prefers B to A, or is indifferent between them; (ii) preferences are transitive: if you prefer A to B and B to C, you must prefer A to C; (iii) in considering lotteries whose payoffs are themselves lotteries, people combine probabilities in a mathematically correct fashion; (iv) preferences are continuous; (v) when outcome A is preferred to outcome B and outcome B to outcome C, there is a probability mix of A and C (a lottery containing only those outcomes) equivalent to B; i.e. since U(A) > U(B) > U(C), as utility moves from U(A) to U(C), at some point it must be equal to U(B).

Accepting these axioms, and hence the Von Neumann utility, the statement "I prefer outcome X to outcome Y twice as much as I prefer Y to Z" is equivalent to "I am indifferent between a certainty of Y and a lottery that gives me a two-thirds chance of Z and a one-third chance of X".

Problem 5*: Check that these statements are equivalent by doing the calculations.

We can make quantitative comparisons of utility differences and quantitative comparisons of marginal utilities. The principle of declining marginal utility is equivalent to risk aversion.
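A quick sketch of the check behind Problem 5: the first statement says U(X) − U(Y) = 2 • [U(Y) − U(Z)], which rearranges to U(Y) = [U(X) + 2 U(Z)]/3, exactly the expected utility of the lottery. In numbers (ours, purely illustrative):

# Problem 5 in numbers: utilities chosen so that U(X) - U(Y) = 2 * (U(Y) - U(Z)).
UX, UY, UZ = 9.0, 5.0, 3.0
lottery_eu = (1 * UX + 2 * UZ) / 3   # one-third chance of X, two-thirds chance of Z
print(lottery_eu == UY)              # True: indifferent between Y and the lottery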
We agree about the order of preferences and about their relative intensity, but we may still disagree about the zero of the utility function and the size of the unit in which we are measuring them. Utility functions are arbitrary with respect to linear transformations: changes that consist of adding the same amount to all utilities (changing the zero), or multiplying all utilities by the same positive number (changing the scale), or both, do not really change the utility function, i.e. the behaviour is exactly the same. (The reader is invited to check this statement.)

Utilitarians used the concept of utility to determine social welfare, i.e. the total utility of individuals, which society should maximise. This was criticized because there is no way of making interpersonal comparisons of utility, nor of deciding if a change that benefits me and hurts you increases total utility. With Von Neumann utility the utilitarian rule "maximise total utility" is equivalent to "choose the alternative you would prefer if you could be any one of the people affected". If there are N individuals, the probability of being any one of them is π = 1/N; writing the utility of person i as Ui = U(Ri), the expected utility of the lottery with probability π of being each person is:

E U(R) = Σi πi Ui = Σi π Ui = π Σi U(Ri)

It is easy to see how Σi Ui is simply social welfare, i.e. society's total utility.

Problem 6: Would this result hold when individual utility functions are different?

2. Strategic thinking and game theory

A. Interdependencies, strategic behaviour and rational individuals

An economy is fundamentally an interdependent system. Usually, interdependencies are pushed into the background to simplify and solve economic problems. Markets are represented in terms of an individual, consumer or producer, maximizing against an opportunity set (e.g. a given budget constraint). Thus, important interaction features of many markets are eliminated: bargaining, threats, bluffs, i.e. the whole range of strategic behaviour. In this way economic theory explains a large part of real markets while avoiding situations involving strategic behaviour. In practice, it presents two basic standard models: competitive markets and monopolies. (1) On the one hand, each individual, i.e. the consumer and the producer, is a small part of the market. Therefore, he will take the behaviour of other individuals as given. He does not have to worry about how what he does will affect everyone else's behaviour. The rest of the world consists, for him, of a set of prices at which he can sell what he produces and buy what he wants. (2) On the other hand, the monopolist is so big that his behaviour affects the entire market. He deals with the mass of consumers, who individually know that they cannot affect his behaviour. For the monopolist, customers are not persons, but simply a demand curve. In fact, each consumer buys the quantity that maximises his welfare at the price the monopolist fixes.

In the analysis of oligopoly or bilateral monopoly, basic economics collapses from a coherent theory to a set of guesses. In fact, analysing strategic behaviour is really difficult. John Von Neumann, one of the brightest mathematicians and economists of the last century, created game theory, a whole new branch of mathematics, in the process of trying to solve it. The work of successive economists has brought economics closer to being able to say what economic agents will or should do in such a strategic framework.
We are interested in game theory as a tool of economic analysis, to understand the strategic context in which economic agents operate. Studying the interactions of rational individuals, we gain better knowledge and prediction of real ones. Game theory gives us a clear and precise language to express and formalise intuitive insights and notions in models (defining players and, for each one, strategies and payoffs, i.e. a numerical representation of preferences) that can be analysed deductively, testing their logical consistency and finding which hypotheses support particular conclusions. Accordingly, we assume that agents are able to calculate how to play the game and to consider every possibility before making their first move. Obviously, for most games that assumption is not realistic; but it makes it relatively straightforward to describe the perfect play of a game. Whatever the game, the perfect strategy is the one that produces the best result. It is more difficult to construct a theory of the imperfect decisions of realistic players with limited abilities. The assumption of rationality can be defended on the grounds that there is one right answer to a problem and many wrong ones. If individuals tend to choose the right one, we can analyse their behaviour as if they chose the best option.

Let us start with an informal description of the most famous game that recurs in economic contexts, to give a feeling of strategic behaviour. Parts 3 and 4 contain a more formal analysis (which avoids mathematical definitions), discussing ways in which one might "solve" a static game, and applications to oligopoly theory. Dealing with noncooperative theory, our unit of analysis is the individual player, subject to clearly defined rules and possibilities, who looks to his best interest.

B. Strategic Behaviour: the Prisoner's Dilemma

Paula and Albert are arrested. If convicted, each will receive a jail sentence of five years. The public accuser does not have enough evidence to get a conviction, so he puts the criminals in separate cells. He goes first to Paula. If she confesses and Albert does not, the major charge will be dropped and she gets only three months for a minor charge. If Albert also confesses, the charge cannot be dropped but the judge will be indulgent; Albert and Paula will get two years each. If Paula refuses to confess, the judge will not be indulgent. If Albert confesses, Paula will be convicted, possibly with the maximum sentence. If neither confesses, they will get a six-month sentence, for minor charges. Then, the public accuser goes to Albert's cell and gives a similar speech.

Figure 1A shows the matrix of outcomes in utiles facing Paula and Albert. Once Paula and Albert choose their strategies (a row and a column, respectively) we can read their payoffs (UP; UA) in a cell, the first for Paula, the second for Albert.

Figure 1A (P chooses the row, A the column):

            A: C      A: NC
  P: C     4 ; 4     2 ; 5
  P: NC    5 ; 2     3 ; 3

Figure 1B:

            A: C      A: NC
  P: C     4 ; 4     4 ; 3
  P: NC    3 ; 4     3 ; 3

Paula reasons as follows: (1) If Albert confesses (he chooses NC, not to cooperate with me) and I don't (C, NC), I get five years, UP(C, NC) = 2; if I confess too (NC, NC), I get two years, UP(NC, NC) = 3. If Albert is going to confess, I had better confess too: 3 utiles > 2 utiles. (2) If no one confesses (C, C), I go to jail for six months, UP(C, C) = 4. That is an improvement, but I can do better. If I confess (NC, C), I get three months, UP(NC, C) = 5. So if Albert is going to stay silent, I am better off confessing: 5 utiles > 4 utiles. (3) Whatever Albert does, I am better off confessing (choosing strategy NC).
Problem 7: Show that Albert makes the same calculation and reaches the same conclusion, so that both decide to confess.

Obviously, Paula picks her strategy with the objective of getting the highest payoff, since it is just a numerical representation of her preferences. Albert does the same. The same applies if Paula and Albert are two producers selling a similar product and payoffs represent net profits (ΠP; ΠA). Each can advertise (or make price concessions) to attract clients. So each one is led to advertise (choose NC) to increase profits at the expense of the rival. But if both increase advertising (NC, NC), net profits will decrease. They would improve by cooperating and restricting advertising or price concessions (C, C).

This game introduces a solution concept. They do not cooperate because not cooperating is better than cooperating, whatever the other does. In Figure 1A the row NC has a higher payoff for Paula than the row C whichever column Albert chooses. Similarly, the column NC has a higher payoff for Albert than the column C whichever row Paula chooses. If one strategy leads to a better outcome than another whatever the other player does, the first strategy is said to dominate the second. If one strategy dominates all others, then the player is always better off using it; if both players have such dominant strategies, we have a solution to the game. In this case, we are also assuming that each player believes that his strategy choice does not affect the others' strategy choices. When applying the dominance criterion iteratively, we assume that players assume that others will not play dominated strategies. As long as these premises are correct, dominance gives a neat and straightforward mechanism for making predictions. When the solution cannot be found with the dominance criterion we must resort to a different solution concept, due to Nash, which we will introduce in part 3.

In our example, both players act rationally and both are, as a result, worse off. Individual rationality, i.e. making the choice that best achieves the individual's ends, results in both individuals being worse off. The result of the prisoner's dilemma seems counter-intuitive, but there are a number of situations where rational behaviour by the individuals in a group makes all of them worse off. The explanation is that individual rationality and group rationality are different things. Paula is only choosing her strategy, not Albert's. If Paula could choose between the lower right-hand cell of the matrix and the upper left-hand cell, she would choose the latter; so would Albert. But those are not the choices they are offered. Paula is choosing a row, and the bottom row dominates the top one; it is better whichever column Albert chooses. Albert is choosing a column, and the right-hand column dominates the left-hand one whichever row Paula chooses. They do not cooperate, unless the structure of rewards and punishments facing them is changed, as in Fig. 1B. Criminals and producers make considerable efforts to raise the cost of non-cooperative strategies and lower the cost of cooperative ones. That does not refute the logic of the prisoner's dilemma; it merely means that real agents are sometimes playing other games, such as the one in Fig. 1B. When payoffs have the structure shown in Figure 1A, the logic of the game is compelling and there is no cooperation. P and A cannot make a binding agreement, since they must move simultaneously and independently, so that there is no way to bind the other or inflict some punishment.
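The dominance check just described can be written out mechanically; a small sketch (the helper names are ours, not the text's):

# Testing for strictly dominant strategies in a 2x2 game such as
# Figure 1A; payoff cells are (row player P, column player A).
FIG_1A = {("C", "C"): (4, 4), ("C", "NC"): (2, 5),
          ("NC", "C"): (5, 2), ("NC", "NC"): (3, 3)}

def dominant_strategy(payoffs, player):
    """player 0 = row (Paula), player 1 = column (Albert)."""
    moves = ("C", "NC")
    for mine in moves:
        alts = [m for m in moves if m != mine]
        # 'mine' dominates if it beats every alternative against each rival move
        if all(payoffs[(mine, his) if player == 0 else (his, mine)][player] >
               payoffs[(alt, his) if player == 0 else (his, alt)][player]
               for alt in alts for his in moves):
            return mine
    return None

print(dominant_strategy(FIG_1A, 0), dominant_strategy(FIG_1A, 1))  # NC NC

Running the same check on the Figure 1B payoffs returns C for both players, which is the content of Problem 8.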
Problem 8: Show that in Fig. 1B Paula and Albert make the same calculation and reach the conclusion to cooperate, so that both cooperate. Assume now that nature makes the first move and chooses game 1A with probability 0.2 and game 1B with probability 0.8, without telling Paula and Albert which is the case. How should they behave? They should use expected payoffs, according to the theory of choice under uncertainty. The expected payoff in case (C, NC) will thus be given by 0.2 • (2; 5) + 0.8 • (4; 3) = (3.6; 3.4). Show that both will cooperate in this case.

Game theory provides a taxonomy for economic situations, based on the strategic form. Other games, like the so-called battle of the sexes, can be applied to economic contexts. If Paula and Albert produce complementary goods they may wish to adopt compatible standards, even if they prefer different sorts of standards. In this game, shown in Figure 1C, players seek to coordinate their actions, although they have conflicting preferences.

Figure 1C:

            A: a      A: b
  P: a     3 ; 5     0 ; 0
  P: b     0 ; 0     5 ; 3

Figure 1D:

            A: a      A: b
  P: a     3 ; 5     0 ; 0
  P: b     0 ; 0     2 ; 1

We can have two solutions where a standard is adopted: a (favoured by Albert), or b (favoured by Paula). Also in Figure 1D we do not have a dominated strategy, but standard a here emerges as optimal from consistent preferences. We will provide the tools to solve these games in part 3, introducing Nash equilibria.

C. Repeated Prisoner's Dilemma

It could be argued that the prisoner's dilemma result holds simply because it is a one shot game. Real-world situations involve repeated plays. If one does not cooperate, he can expect a similar treatment next time, so they both cooperate. This "reputation" argument seems credible, but is it right? Consider Paula and Albert playing the game in Figure 1A a thousand times. A player who betrays his partner gains 1 utile in the short run. The victim will respond by betraying on the next turn, and perhaps several more. Both players would be better off cooperating every turn (4 > 3). Is that insignificant gain (1 utile) worth this huge price (1,000 utiles)? However, this reasoning has a problem. Consider the last turn of the game. Each player knows that whatever he does, the other will have no further opportunity to punish him. The last turn is therefore a one shot game: non-cooperation dominates cooperation. Each player knows that the other will betray him on the last move. He need not fear punishment for anything he does on the previous move; in any case the other is going to betray him on the next move. So both betray on that move as well, and there is now no punishment for betraying on the previous move. Thus, if they are rational, they betray each other on the first move and every move thereafter. If they had been irrational and cooperated, they would each have ended up with a better deal. The result seems paradoxical, but the backward induction argument is correct. The cooperative solution to the repeated prisoner's dilemma is unstable because it pays to betray on the last play, and hence on the next-to-last play, and so on back to the beginning. With sufficiently bounded rationality the cooperative solution is no longer unstable. Consider the game of repeated prisoner's dilemma (with 1,000 plays) played by robots programmed to compute only 900 possible states, due to their limited amount of memory (each state implies a move, cooperate or betray in the case of the prisoner's dilemma). They won't consider the last round if they cannot count up to 1,000.
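The unravelling argument can also be sketched numerically with the Figure 1A payoffs:

# A sketch of the backward-induction argument for the finitely repeated
# game, using the stage payoffs of Figure 1A (utiles).
PAYOFF = {("C", "C"): (4, 4), ("C", "NC"): (2, 5),
          ("NC", "C"): (5, 2), ("NC", "NC"): (3, 3)}

# In the last round only stage payoffs matter, and NC strictly dominates C
# for the row player against either move by the rival:
assert all(PAYOFF[("NC", his)][0] > PAYOFF[("C", his)][0] for his in ("C", "NC"))

# Equilibrium continuation play is (NC, NC) regardless of current moves, so
# the same dominance argument applies in every earlier round; over 1,000
# plays each player earns 3,000 utiles instead of the 4,000 utiles that
# permanent cooperation would yield.
rounds = 1_000
print(rounds * PAYOFF[("NC", "NC")][0], rounds * PAYOFF[("C", "C")][0])  # 3000 4000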
On the other hand, cooperation may be stable if the play is repeated an infinite or indefinite number of times. The promise to cooperate becomes credible thanks to the threat to punish non-cooperative behaviour, since once the cooperative equilibrium is broken neither side has an incentive to restore it unilaterally. However, there is no unique equilibrium: the non-cooperative solution is an equilibrium, and so are behaviours in which periods of cooperation and non-cooperation alternate. Suppose that P cooperates for two periods and then does not cooperate for one period, while A always cooperates. P gets 13 every three periods and A gets 10, but both are better off than with the 9 they would get by abandoning cooperation altogether.

Problem 9*: Consider Paula and Albert playing the game in Figure 1A when the probability of repeating it next time is π = 0.9. Assume both cooperate at first. Does it pay not to cooperate, if once an agent is betrayed he will always betray? What if the game is repeated an infinite number of times and agents discount future utility by r = 10%?

3. Game Theory

Game theory was conceived by Von Neumann and presented in The Theory of Games and Economic Behaviour, co-authored with Oskar Morgenstern. You have probably realised how broad the applications of "game theory" are. Von Neumann's objective was to understand all behaviours that could be structured like a game. That includes most of the usual subject matter of economics, sociology, interpersonal relations, political science, international relations, biology and perhaps more. Along his line of reasoning, "solving" a game means figuring out how an agent should play it perfectly. If one can set up any game as an explicit mathematical problem, the details of the solution of each particular game become simply applications. In this perspective, complicated games turn out to be trivial. The total number of moves, and thus the total number of possible ways to play, is limited: very large but finite. All a player needs to do is list all possible games, note his payoff, and then work backward from the last move, assuming at each step that each player chooses the move that leads to a higher payoff. This is the right approach to games if you seek a common way of describing all of them, in order to figure out in what sense games have solutions and how, in principle, one could find them. Obviously, it may not be a practical solution to many games. The number of possible moves can be very large, like the number of stars in the universe, so finding enough paper (or computer memory) to list them may still be difficult. But, as game theorists, we may not be interested in these difficulties, being willing to grant an unlimited amount of computer memory and time to solve a game.

A. Static games, dynamic games and Nash equilibria.

For simplicity's sake, let us consider two-person games. We need to show how any game can be represented in a reduced (or normal, or strategic) form, as in Figure 1, and in what sense the reduced form of a game can be solved. We can think of a dynamic game as a series of separate decisions: I make a first move, you respond, I respond to that, and so forth. We will see later that it can be represented by a tree in extensive form. However, we can describe the same game in terms of a single move by each side. The move consists of the choice of a strategy describing what the player will do in any situation.
A strategy is a complete description of how I would respond to any sequence of moves I might observe my opponent making, and to any sequence of random events, such as the toss of a die. Thus one possible strategy might be to start with a given move, then if the opponent's move is x to respond by y, if the opponent's move is instead z to respond by w, and so on.

Since a strategy determines everything a player will do in every situation, playing any game simply consists of each side picking one strategy. Strategy decisions are simultaneous: although each player may observe his opponent's moves as they are made, he cannot see into his opponent's mind when the strategies are chosen. Once the two strategies are chosen, everything is determined. We can imagine the two players writing down their strategies and then sitting back and watching as computers execute them. Considered in these terms, any two-person game can be represented by a payoff matrix, although it may require a huge number of rows and columns. Each row represents a strategy that P can choose; each column represents a strategy that A can choose. The cell at the intersection shows the outcome of that particular pair of strategies. If the game contains random elements, the cell contains the expected outcome, the average payoff over many plays of the game. In game theory, this way of describing a game is called its strategic or reduced form.

Let us now discuss the Nash equilibrium solution concept, a generalization of an idea developed by a French economist and mathematician in the nineteenth century. Consider a game played over and over. Each player observes what the other players are doing and alters his play accordingly. He acts on the assumption that what he does will not affect what they do. In this way he does not have to take such effects into account. He keeps changing his play until no further change can make him better off. All the other players do the same. The equilibrium is reached when each player has chosen a strategy that is optimal for him, given the strategies that the other players follow. This solution is called Nash equilibrium and is a generalization, by John Nash, of what Antoine Cournot realised more than a hundred years earlier. All players know what they and the others should do, i.e. it is self-evident how to play. Behaviours are self-evident and each player believes them to be clear to the others; so he chooses the best response to what the others are obviously doing. In practice, players may have talked beforehand and come to a credibly self-enforcing agreement, or have experience of each other, or follow social rules of conduct. The set of Nash equilibria is the collection of all the credibly self-enforcing agreements and stable conventions that can be arranged.

Consider the case of driving. Choosing a strategy consists of deciding what side of the road to drive on. The Hungarian population reached a Nash equilibrium in which everyone drives on the right (R, R). The situation is stable, and would be stable even without traffic police to enforce it. Since everyone else drives on the right, each player driving on the left would impose very large costs on himself (as well as on others); so it is in his interest to drive on the right. In England, everyone drives on the left (L, L). That is a Nash equilibrium too, for the same reason. It may be sub-optimal: since in other countries people drive on the right, cars must be manufactured with steering wheels on the right side for the English market.
Foreign tourists driving in England may automatically drift into the right-hand lane and discover their error only when they encounter an English driver face to face. If all English drivers switched to driving on the right, they might all be better off. But any English driver who tried to make the switch on his own initiative would be very much worse off. The Nash equilibrium is stable against individual action even when it leads to a sub-optimal outcome. It may not be stable against joint action, e.g. a country switching to driving on the right, i.e. everyone changing his strategy at once.

Problem 10: Represent such a game using a payoff matrix. Show that (a, a) and (b, b) are Nash equilibria in the game in Figure 1C. Is that true for the game in Figure 1D?

The Nash equilibrium is not, in general, unique; both everyone driving on the left and everyone driving on the right are equilibria. Part of its definition is that my strategy is optimal for me, given the strategies of the other players; I act as if what I do has no effect on what they do. But my actions do affect the other players, who respond following their optimal response strategies. Moreover, the choice of a different strategy variable gives rise to different Nash equilibria for otherwise identical games. So the rules of the game should also be agreed upon. Nash equilibria do not cover everything a good player would do. We explicitly ignore approaches like stealing candy from babies, i.e. strategies that work badly against good opponents but exploit the mistakes of bad ones. It is hard to include them, it being quite impossible to define a "best" strategy against many different opponents and the many different mistakes they might make. It seems reasonable to define a solution as the correct way to play against an opponent playing correctly.

Figure 2A (P chooses the row, A the column):

            A: a       A: b
  P: a    −5 ; 5     5 ; −5
  P: b     5 ; −5   −5 ; 5

[Figure 2B: the same game in extensive form: P moves first, then A chooses between a and b without observing P's decision; the nodes Aa and Ab lie in the same information set (the ellipse). Figure 2C: the same game with the chronological order interchanged, so that A moves first.]

Whether a solution exists for a game depends on what its reduced form looks like. Figure 1 shows the reduced forms of games, each of which has a solution. The game in Figure 2A, which has no solution in terms of pure strategies, still has a mixed-strategy solution. A mixed strategy is a probability mix of pure strategies, e.g. a 50% chance of a and a 50% chance of b for Paula, and vice-versa for Albert. A player who follows that mixed strategy will lose, on average, zero, whatever his opponent does. A player whose opponent follows that strategy will win, on average, zero, whatever he does. So the Von Neumann solution is for each player to adopt that strategy. It is not only a solution but the only solution: if player P follows any pure strategy (say a) more frequently than the other, his opponent A can win more often than he loses by always picking the pure strategy (a) that wins against that one.

Problem 11*: Change all the payoff signs in Fig. 2A and let Paula and Albert play the new game. Show it has only a mixed strategy solution and find it. What kind of game is it?

B. Extensive form, dynamic games and perfect Nash equilibria.

Figure 2B represents the same game as Figure 2A in extensive form. In this case attention is given to the timing of actions and the information available to players when they choose each action. In an extensive game we have a series of decision nodes labelled by the players whose turn comes when that position is reached.
At the start of the game, at the initial node P, the two lines labelled a and b indicate that player P must decide between a and b. In general, these lines may point to another node (Aa and Ab) or to a vector of numbers (payoffs) when that move ends the game. In Figure 2B, after the initial move by P, it is the turn of player A, who makes his choice between a and b without knowing P's decision, as indicated by the ellipse, called an information set, containing the nodes Aa and Ab. This means that A does not know which of the two nodes he is at when selecting his response. His choice ends the game. A game in extensive form is like a tree: starting at the initial node, it branches out till it reaches payoffs. In fact, each following node has exactly one line (a move) pointing to it and at least one line out of it (an action available to the player). Accordingly, from any node there is a single path towards the initial node, and it is impossible to cycle back to that node. Nodes in the same information set have the same range of choices and the same player. For every extensive form game there is a corresponding strategic form game, but a strategic form game may correspond to several extensive form games. Trivially, interchanging the chronological order, we can let A choose first as in Figure 2C. In Figure 3A, by removing the ellipse we introduce a dynamic game, in which A learns P's move. This game does not correspond to 2A and has different solutions.

Problem 12: Represent the game in Fig. 3A in normal form. Show that it has two pure strategy solutions and find them. Is there a first move advantage?

Let us now change the payoffs and consider the Stackelberg game in Figure 3B, where P and A are the only producers in a market facing a decreasing demand function. P can and does commit to her level of output (a = high, b = low) before A has the opportunity to act. A observes P's output and then decides what quantity to produce (a or b).

[Figure 3A: the game of Figure 2B without the information set, so that A observes P's move before choosing. Figure 3B: the Stackelberg game tree; the payoffs (P; A) at the terminal nodes are (−3; −3) after (a, a), (5; 2) after (a, b), (2; 5) after (b, a) and (4; 4) after (b, b). Figure 3C: the corresponding normal form, in which A has four contingent strategies 1–4 (see Problem 13).]

Suppose that before P moves A warns her by saying: "I will choose a whatever you choose". If P believes this threat, she should choose b in order to get 2 instead of −3. A's optimal response to the choice b by P is indeed a; so if P finds the threat credible and responds optimally to it, she will play b. But the threat is not credible: if P chooses a, A will face a loss if he carries it out, and will choose b in the absence of other considerations. A threat to act differently from the optimal response is not credible once P's move is already carried out. We can also have incredible promises. Suppose that A says: "I will choose a if you choose a and b if you choose b". If P believed that and acted accordingly, A would have an incentive to break his promise and choose a once P had chosen b. Accordingly, P will not believe him without some credible guarantees and will opt for a. But (b, a) is a Nash equilibrium.

Problem 13: The reader is invited to represent the game in normal form, as in Figure 3C, identifying the strategies 1, 2, 3, 4 available to A.

When P plays a, A should respond optimally to it, playing b: i.e., if node Aa is reached, the "rest of the game" (the subgame with initial node Aa) will be played in the standard way, with players acting in their best interest given the circumstances. Hence, (a, b) is the only Perfect Subgame Equilibrium (P.S.E.), i.e. the only profile that induces a Nash equilibrium in each subgame (including those that are not reached in equilibrium).
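Backward induction can be spelled out mechanically; a sketch using the Figure 3B payoffs reconstructed above:

# Finding the P.S.E. of Figure 3B by backward induction; cells are
# (P's payoff, A's payoff) after (P's move, A's move).
PAYOFF = {("a", "a"): (-3, -3), ("a", "b"): (5, 2),
          ("b", "a"): (2, 5), ("b", "b"): (4, 4)}

def a_reply(p_move):
    # A moves last: he simply picks the branch with his higher payoff.
    return max(("a", "b"), key=lambda m: PAYOFF[(p_move, m)][1])

# P anticipates A's reply and picks the branch with her higher final payoff.
p_move = max(("a", "b"), key=lambda m: PAYOFF[(m, a_reply(m))][0])
print(p_move, a_reply(p_move), PAYOFF[(p_move, a_reply(p_move))])  # a b (5, 2)

The incredible threat corresponds to replacing a_reply with the constant reply a: P would then play b, reproducing the Nash (but not subgame perfect) equilibrium (b, a).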
All P.S.E. are also Nash equilibria; vice versa, not all Nash equilibria are necessarily P.S.E. In fact, (b, a) is not a P.S.E., because it is not an equilibrium in the subgame with initial node Aa.

4. Economic Applications: Oligopoly.

A. Oligopoly applications of the Nash Equilibrium

In economics there are many applications of "game theory", but we shall limit ourselves to the case of oligopoly, a market structure somewhere in between monopoly and perfect competition. Oligopoly exists when a small number of firms sell in a single market. A reason for this situation is that the optimal size of the firm (at which average cost is minimized) is so large that there is only room for a few such firms; this corresponds to the cost curves shown in Figure 4. The situation differs from perfect competition because each firm is large enough to have a significant effect on the market price. It differs from monopoly because there is more than one firm. The firms are few enough, and their products similar enough, that each must take account of the behaviour of all the others. As far as their customers are concerned, oligopolists have no more need to worry about strategic behaviour than monopolies do. The problem arises with competitors. All the firms will be better off if they keep their output down and their prices up, but each firm is better off increasing its output in order to take advantage of the high price.

One can imagine at least three different outcomes. The firms might behave independently, each trying to maximise its own profit while somehow taking account of the effects of what it does on what the other firms do. A leader may emerge while the other firms behave as followers. In repeated games, firms might cooperate, coordinating their behaviour as if they were a single monopoly. In a static game, firms in an oligopolistic industry may talk about a cooperative agreement, although, as in the prisoner's dilemma, each will violate the agreement if doing so is in its interest. In a one shot game, agreements are not worth making because they cannot be enforced, even if they can be reached. In such a situation, each firm tries to maximise its profit independently and the result is a Nash equilibrium: each player takes what the other players are doing as given when deciding what he should do to maximise gains. But, since firms face a downward-sloping demand curve, we should carefully define the strategy, because different definitions (quantity or price) lead to different conclusions. Each firm may decide how much to sell and let the market determine the price; or it may choose its price and let the market determine the quantity. Considering a duopoly, let us first find the Nash equilibrium on the assumption that a firm's strategy is defined by the quantity it produces, i.e. the case originally analysed by Cournot.

B. Cournot Competition: Quantity Strategy.

Given the quantities produced by the other firms, each firm calculates how much it should produce to maximise profit. Figure 4 shows this situation from the point of view of firm 1. D is the demand curve for the whole industry. Q2 is the output of the other firms in the industry (a single firm in a duopoly). It also shows the marginal cost (MC, the slope of total cost, i.e. the cost of increasing quantity by one unit) and the average cost (AC, total cost over quantity) of firm 1.
Whatever price the firm decides to charge, it faces the residual demand curve (total demand minus Q2): D1 = D − Q2. To maximise profits the firm calculates the marginal revenue implied by the residual demand curve D1 and produces the quantity Q*1 at which it intersects the marginal cost, as in monopoly, provided that for that quantity it is not losing money. It makes profits if the average cost AC is smaller than the price P. If firms are identical, they will find the same profit-maximising output. In a Nash equilibrium with two firms, each firm produces Q*1, with a total output Q = 2Q*1. With free entry, if the price is above the average cost we have positive profits and new firms will enter the market. Hence, in equilibrium the price equals the average cost and profit is approximately equal to zero, as in Figure 4.

The Nash equilibrium can be found using reaction curves, which show what strategy one player chooses, given the strategy of the other. In Figure 4, D1 is the residual demand curve faced by Firm 1, given that Firm 2 is producing a quantity Q2. By repeating this calculation of Q1 for different values of Q2, we build R1 in Figure 5 as the reaction curve for Firm 1. It shows, for any quantity that Firm 2 chooses to produce, how much Firm 1 will produce. Point E is the point calculated using Figure 4. The same analysis can be used to generate R2, the reaction function showing how much Firm 2 will produce for any quantity Q1 that Firm 1 produces. Since the two firms are assumed to have the same cost curves, their reaction curves are symmetrical. The Nash equilibrium is reached at point E, where each firm produces its optimal quantity given the quantity produced by the other firm. It occurs only at point E, where the reaction curves intersect, since only there are the strategies consistent, each optimal against the other. This "reaction curve approach" can be applied to a wider range of problems.

[Figure 4: firm 1's residual demand curve D1 = D − Q2, with the curves MC1, AC1 and MR1 and the profit-maximising quantity Q*1 sold at price P1. Figure 5: the reaction curves R1 and R2 in the (Q1, Q2) space, intersecting at the Cournot-Nash equilibrium E = (Q*1, Q*2).]

C. Bertrand Competition: Price Strategy.

Let us now redo our analysis using a price strategy. Each firm observes the prices the other firms are charging and selects the price that maximises its profit. Since firms produce identical goods, only the lowest price matters. Figure 6 shows the situation from firm 1's perspective. P1 is the lowest of the prices charged by the other firms. The firm in this situation has three alternatives, as shown by D1. It can charge more than P1 and sell nothing. It can charge P1 and sell a determinate amount, Q(P1)/N, if there are N firms each charging P1. It can charge less than P1, say one cent less, and sell as much as it wants up to Q(P1). It is easy to see that, if P1 is greater than AC, the last choice maximises its profit. Firm 1 maximises profit by producing Q1(P1) and selling it for just under P1. In a Bertrand-Nash equilibrium, every firm must be maximizing its profit; but each firm always has the option of cutting its price (say by a cent) and selling all it wants. Whatever price the other firms are charging, it is in the interest of each firm to charge a cent less. The process stops when the price reaches a level consistent with each firm selling where price equals marginal cost. If additional identical firms are free to enter the industry, the process stops when the price gets down to minimum average cost and each firm is indifferent between selling as much as it likes and selling nothing.

[Figure 6A: firm 1's situation under Bertrand competition: the residual demand D1 at the rivals' lowest price P1, together with MC1 and AC1 and the quantities Q1(P1) and Q(P1). Figure 6B: the quantity space, with the Bertrand outcome B and the Cournot equilibrium E.]
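Both the Cournot and the Bertrand outcomes are easy to compute for a linear market; a sketch (the numbers are hypothetical; Problem 14 below works the general case):

# Cournot and Bertrand outcomes for linear demand Q = a - P and constant
# marginal cost c.
a, c = 100.0, 10.0

def cournot_best_reply(q_other):
    # Firm 1 maximises (a - q1 - q_other - c) * q1, giving its reaction curve.
    return max(0.0, (a - c - q_other) / 2)

# Iterating the reaction curves converges to their intersection (point E).
q = 0.0
for _ in range(200):
    q = cournot_best_reply(q)   # symmetric firms: both end up producing q
print(round(q, 2), a - 2 * q)   # 30.0 = (a - c)/3 each; price 40 > c

# Bertrand: undercutting stops only when the price hits marginal cost,
# so P = c = 10, total output a - c = 90, and profits are (about) zero.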
Oligopolistic firms engaged in Bertrand competition behave in a competitive way, ending up at point B instead of E. This seems a bit peculiar: in an oligopoly, where firms are large and affect the price, we would expect them to produce less than the competitive output.

D. The Stackelberg dynamic game.

Let us now finally consider the dynamic Stackelberg game using a quantity strategy. Firm 1 can commit to a given level of output before firm 2 has the opportunity to act. Firm 2 observes her output and then decides what quantity to produce. Figure 7A shows the Stackelberg game from firm 1's perspective. Firm 1 can choose the Cournot equilibrium production level Q*1, but also different levels. She knows that firm 2 will respond optimally to her chosen level. Starting from Q*1 she can improve her situation by increasing her output to Q'1. In fact, in this way firm 2 will find it optimal to cut his output to Q'2, and the price will fall only by a limited amount. By repeating this calculation of Π1 for different values of Q1, she will find the maximum profit Π'1. Her new profit Π'1 will be greater than Π*1, the Cournot equilibrium level.

[Figure 7A: firm 2's residual demand shifts from D2 to D'2 when firm 1 expands output from Q*1 to Q'1, and the price falls from P to P'; MR2 and MC2 determine Q'2. Figure 7B: the reaction curves with the Cournot point E, the Stackelberg point S1 on R2, and the monopoly output M1.]

In Figure 7B, looking at firm 2's reaction curve, the Stackelberg equilibrium is reached at point S1, to the right of E, where firm 1 produces a greater quantity and firm 2 a smaller one. In the quantity space, knowing that profits increase the nearer we are to the monopoly output M1, we can draw the isoprofit curves for firm 1 (and 2), i.e. the loci of output pairs yielding the firm a constant profit. They reach their maximum in correspondence with the reaction curve R1 (as it represents the best reply for a given Q2), where they are tangent to the horizontal lines. The point on R2 which maximises the profits of firm 1 is S1, i.e. the one on the lowest isoprofit curve. In practice, the first mover anticipates correctly her rival's reaction: she incorporates the follower's maximisation problem when setting Q'1. The follower behaves as in Cournot, since no further reactions are expected. In the equilibrium S1, firm 1, the first mover, chooses the output on the R2 reaction curve which maximises her own profits.

E. Tacit collusion in repeated games: Cournot supergames

Collusive outcomes are sustainable as non-cooperative equilibria in repeated games. Let our firms play the Cournot quantity-setting game an infinite number of times. Before starting the game, firms select a Pareto optimal equilibrium and commit to a strategy; afterwards, in each stage game, they choose output simultaneously. Firm i maximises the present value of its profits [with ρ = 1/(1+r) = discount factor]:

Πi = Σt ρ^t Πi(Q1t, Q2t), t = 0, 1, …, ∞

Outputs (Q1t, Q2t) are observed at the start of period t+1. Firms condition current actions on previous behaviour using a trigger quantity strategy (a single deviation triggers the end of cooperation): they cooperate (producing the collusive output) as long as the others do so; after a defection they turn to non-cooperation. Punishments correspond to the Cournot equilibrium in an infinitely long punishment phase (Cournot reversion).
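Under these trigger strategies each firm weighs the discounted value of cooperating forever against deviating once and being punished forever; the condition is derived formally below, and can be previewed numerically with the linear example used earlier (a = 100, c = 10; our numbers, not the text's):

# Collusion vs deviation under Cournot reversion, linear demand Q = a - P
# and marginal cost c; rho_min is the smallest discount factor that
# sustains collusion.
a, c = 100.0, 10.0
q_coll = (a - c) / 4                                  # half the monopoly output each
profit_star = (a - c - 2 * q_coll) * q_coll           # 1012.5 per period colluding
q_dev = (a - c - q_coll) / 2                          # best reply to the rival's quota
profit_dev = (a - c - q_dev - q_coll) * q_dev         # ~1139.1 in the deviation period
profit_pun = ((a - c) / 3) ** 2                       # 900.0 per period: Cournot profit

rho_min = (profit_dev - profit_star) / (profit_dev - profit_pun)
print(round(rho_min, 3))  # 0.529: collusion is sustainable for rho above ~9/17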
Stage game payoffs are represented in Figure 8.

Figure 8A (stage game payoffs; rows are firm P's choices, columns firm A's; each cell gives A's profit ; P's profit, where Πb denotes the profit of a firm that keeps cooperating while its rival deviates):

         C            NC
C     Π* ; Π*      Πd ; Πb
NC    Πb ; Πd      Πc ; Πc

Figure 8B (reaction curves R1 and R2 with the Bertrand point B, the Cournot point E and the collusive point C).

Each firm earns Π*/(1-ρ) by cooperating (where Π* = profit in the collusive equilibrium) and Πd + ρΠc/(1-ρ) by deviating (where Πd = profit when deviating from the collusive equilibrium and Πp = Πc = Cournot profits in the punishment phase). We have a collusive equilibrium when:

Π* > Πd(1-ρ) + ρΠp, i.e. ρ > (Πd - Π*)/(Πd - Πp) = (short-run gains from defection)/(permanent punishment losses)

Since Πd - Πc = (Πd - Π*) + (Π* - Πc) > Πd - Π*, a mild punishment is enough for ρ close to 1. When the response to "cheating" (detection and punishment) is quick, any payoff vector that is better for all players than a Nash equilibrium payoff vector can be sustained as the outcome of a perfect equilibrium of the infinitely repeated game, provided players are patient (ρ close to 1). Cournot reversion is not the most severe punishment; credible more competitive behaviour (like Bertrand) lowers Πp, promoting collusion.

Problem 14* Assume that there are only two firms with constant marginal costs (C1 = cQ1 and C2 = cQ2) facing the market demand function Q(P) = Q1+Q2 = a-P. Calculate the levels of output and profit for both firms with Cournot, Bertrand, Stackelberg, collusion and "cheating". Calculate also the minimum level of ρ supporting collusion.

5. Appendix: Solution to selected problems

Problem 1: Check the result for π1 = π2 = 0.5, R1 = 0.5 €, R2 = -1 € by redoing the calculations. What happens if π1 = 0.9 and π2 = 0.1?

E R = (π1 • R1) + (π2 • R2) = [0.5 • (0.5 €)] + [0.5 • (-1 €)] = -0.25 €. The expected return from taking the gamble is negative. It becomes positive if π1 = 0.9 and π2 = 0.1: E R = (π1 • R1) + (π2 • R2) = [0.9 • (0.5 €)] + [0.1 • (-1 €)] = 0.35 €.

Problem 2: Will the agent accept the bet for U* = 1,000, UA = 200 and UB = 1,800 with πA = πB = 0.5 in figure 2? What are the values of U*, UA and UB for a risk neutral agent in Figure 1? Will he prefer the certain income R* for πA = 0.6? And for πA = 0.4?

Yes, being indifferent; U* = 1,000, UA = 500 and UB = 1,500; Yes; No.

Problem 3: How much would the agent in figure 1 pay to have a certain income R* = € 50,000, instead of RA = € 25,000 and RB = € 75,000 with π1 = π2 = 0.5? How much should instead the agent in figure 2 be paid to have the certain income R*?

In both cases it is the segment B*BC.

Problem 4: Check those results. U(29,700) = 997 and U(29,600) = 995.

Problem 5*: Check that these statements are equivalent by doing the calculations. Lottery 1 consists of Y, Lottery 2 of a 2/3 chance of Z and a 1/3 chance of X. Rearranging statement 1, we find it implies statement 2.

Problem 6: Would this result hold when individual utility functions are different? Yes; in fact, with equal probabilities πi = π: E U(R) = Σi πi Ui(Ri) = π Σi Ui(Ri).

Problem 7 Show that Albert makes the same calculation and reaches the same conclusion, so that both confess.

Albert should reason as follows: (1) If Paula confesses (she chooses NC, not to cooperate with me) and I don't (NC, C), I get five years, UA(NC, C) = 2; if I confess too (NC, NC), I get two years, UA(NC, NC) = 3. If Paula is going to confess, I had better confess too; 3 utiles > 2 utiles. (2) If no one confesses (C, C), I go to jail for six months, UA(C, C) = 4. But if I confess (C, NC), I get three months, UA(C, NC) = 5. So I am better off confessing; 5 utiles > 4 utiles. (3) Whatever Paula does, I am better off confessing (choosing strategy NC).

Figure 1A (rows are Paula's choices, columns Albert's; each cell gives Paula's utility ; Albert's utility):

        C        NC
C    4 ; 4    2 ; 5
NC   5 ; 2    3 ; 3

Figure 1B (same convention):

        C        NC
C    4 ; 4    4 ; 3
NC   3 ; 4    3 ; 3
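The row-by-row comparisons of Problems 7 and 8 can be mechanised as a best-reply check (a Python sketch; the dictionary simply encodes Figure 1A in the convention above):

    # Sketch: find Albert's best reply in the 2x2 game of Figure 1A.
    # Keys are (Albert's move, Paula's move); values are (Albert's utility, Paula's utility).
    C, NC = "C", "NC"
    payoff = {
        (C, C): (4, 4), (C, NC): (2, 5),
        (NC, C): (5, 2), (NC, NC): (3, 3),
    }

    def best_reply_albert(paula_move):
        # Compare Albert's utility from C and NC, holding Paula's move fixed.
        return max((C, NC), key=lambda move: payoff[(move, paula_move)][0])

    for paula_move in (C, NC):
        print("Paula plays", paula_move, "-> Albert's best reply:", best_reply_albert(paula_move))
    # NC is the best reply to both of Paula's moves: confessing is dominant, as in Problem 7.

Replacing the payoffs with those of Figure 1B makes C the best reply to both moves, which is exactly the conclusion of Problem 8.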
Problem 8 Show that in Fig. 1B Paula and Albert make the same calculation and reach the conclusion to cooperate, so that both cooperate.

In this new setting Paula would reason as follows: (1) If Albert confesses (he chooses NC, not to cooperate with me) and I don't (C, NC), I get UP(C, NC) = 4; if I confess too (NC, NC), I get UP(NC, NC) = 3. If Albert is going to confess, I had better not confess; 4 utiles > 3 utiles. (2) If Albert is going to stay silent and I do not confess (C, C), I get UP(C, C) = 4, while if I confess (NC, C), I get UP(NC, C) = 3. So also in this case I am better off not confessing; 4 utiles > 3 utiles. (3) Whatever Albert does, I am better off not confessing (choosing the cooperative strategy C).

Problem 9* Consider Paula and Albert playing the game in Figure 1A when the probability of repeating it next time is π = 0.9. Assume both cooperate at first. Does it pay not to cooperate if, once an agent is betrayed, he will always betray? What if the game is repeated an infinite number of times and agents discount future utility by r = 10%?

Hint: Since the probability of playing the game next time is π = 0.9, Paula and Albert maximise the present value of their utility ui [with π as a discount factor]. The expected utility from betraying is

ui(NC) = Σt π^t Ui(NCt, NCt) = Σt 0.9^t • 3, t = 1, …, ∞

while the expected utility from cooperating is

ui(C) = Σt π^t Ui(Ct, Ct) = Σt 0.9^t • 4, t = 1, …, ∞

It pays to cooperate, since the difference ui(C) - ui(NC) = 0.9 • (4 - 3)/(1 - 0.9) = 9 is greater than the immediate gain from betraying, 5 - 4 = 1. If the game is repeated an infinite number of times and agents discount future utility by r = 10%, we get a discount factor ρ = 1/(1+r) ≈ 0.9. Notice how the structure of the problem is the same as with π = 0.9: there is no difference in terms of the present value of utility.

Problem 10 Represent such a game using a payoff matrix. Show that (a, a) and (b, b) are Nash equilibria in the game in figure 1C. Is that true for the game in figure 1D?

A possible representation of the English drivers game is provided in figure 1E with Paula and Albert. If all English drivers switched from (L, L) to driving (R, R), they would be better off, gaining 5 instead of 3. A single English driver (Paula or Albert) making the switch on his own would be worse off, gaining 0 instead of 3.

Figure 1E (rows Paula, columns Albert; entries Paula's payoff ; Albert's payoff):

        R        L
R    5 ; 5    0 ; 0
L    0 ; 0    3 ; 3

Figure 1C (same convention):

        a        b
a    3 ; 5    0 ; 0
b    0 ; 0    5 ; 3

Figure 1D (same convention):

        a        b
a    3 ; 5    0 ; 0
b    0 ; 0    2 ; 1

In the game in figure 1C, (a, a) is a Nash equilibrium: when the standard a (favoured by Albert) is adopted, Paula would suffer a large cost, -3 (and impose -5 on Albert), by choosing b; so it is in her interest to adopt a. The same is true when (b, b) is adopted: Albert would suffer a large cost, -3 (and impose -5 on Paula), by choosing a; so it is in his interest to adopt b. The same is true for the game in figure 1D, where (a, a) and (b, b) are Nash equilibria. In this case (b, b) is stable against individual action, even though it leads to a sub-optimal outcome. Paula and Albert would both gain by making the switch to (a, a) together, gaining 3 or 5 instead of 2 or 1, but each one would be worse off making the switch on his own, gaining 0 instead of 2 or 1.

Problem 11* Change all the payoff signs in Fig. 2A and let Paula and Albert play the new game. Show it has only a mixed strategy solution and find it. What kind of game is it?

(rows Paula, columns Albert; entries Paula's payoff ; Albert's payoff)

        a          b
a    -5 ; 5    5 ; -5
b     5 ; -5  -5 ; 5

Hint: Obviously, starting from any state a player will find it in his interest to change his strategy: e.g. in (a, a) Paula will improve by playing b. The mixed strategy solution is the following probability mix of pure strategies: a 50% chance of a and a 50% chance of b for both Paula and Albert. In fact, if Paula (or Albert) chooses one strategy more frequently than the other, Albert (Paula) will no longer be indifferent between the two alternatives but will strictly prefer one of them. This is a game of the type "you lose and I win", in practice a "zero-sum game".
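That the 50/50 mix leaves the opponent exactly indifferent, while any other mix does not, can be verified directly (a Python sketch; the payoffs are Paula's, taken from the matrix above, with Albert receiving the opposite):

    # Sketch: check the mixed-strategy equilibrium of the zero-sum game in Problem 11.
    # Paula's payoffs; on a match (a,a) or (b,b) Albert wins, so Paula gets -5.
    payoff_paula = {("a", "a"): -5, ("a", "b"): 5, ("b", "a"): 5, ("b", "b"): -5}

    def paula_expected(p_a, albert_move):
        # Paula plays a with probability p_a against a fixed pure move by Albert.
        return (p_a * payoff_paula[("a", albert_move)]
                + (1 - p_a) * payoff_paula[("b", albert_move)])

    print(paula_expected(0.5, "a"), paula_expected(0.5, "b"))  # 0.0 0.0: Albert is indifferent
    print(paula_expected(0.6, "a"), paula_expected(0.6, "b"))  # -1.0 1.0: Albert strictly prefers a

If Paula leans towards a, Albert (whose payoff is the negative of Paula's) strictly prefers to match her with a, confirming that only the 50/50 mix is an equilibrium.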
Problem 12 Represent the game in Fig. 3A in normal form. Show that it has two pure strategy solutions and find them. Is there a first move advantage?

We can represent the game as follows, once we define Albert's possible strategies: 1 = "Choose always a", 2 = "Choose always b", 3 = "Choose a if P chooses a and b if P chooses b", 4 = "Choose b if P chooses a and a if P chooses b".

(rows Paula's moves, columns Albert's strategies; entries Paula's payoff ; Albert's payoff)

          1          2          3          4
a     -5 ; 5     5 ; -5    -5 ; 5     5 ; -5
b      5 ; -5   -5 ; 5     -5 ; 5     5 ; -5

Strategy 3, i.e. "choose a if P chooses a and b if P chooses b", is the dominant strategy for Albert. Paula is indifferent between choosing a or b; hence (a, 3) and (b, 3) are the two pure strategy solutions. Any probability mix of pure strategies for Paula (e.g. a 50% chance of a and a 50% chance of b), combined with Albert's strategy 3, would also be a solution. There is no first move advantage here: on the contrary, Albert, moving second, can always match Paula's choice and win.

Problem 13 The reader is invited to represent the game in Fig. 3B in normal form, as in figure 3C, identifying the strategies 1, 2, 3, 4 available to A.

We can represent the game in fig. 3B in normal form, as in figure 3C, once we define Albert's possible strategies as follows: 1 = "Choose always b", 2 = "Choose always a", 3 = "Choose b if P chooses a and a if P chooses b", 4 = "Choose a if P chooses a and b if P chooses b".

Figure 3B (extensive form): P chooses a or b; A then chooses a or b; the payoffs (P ; A) are 5 ; 2 after (a, a), -3 ; -3 after (a, b), 4 ; 4 after (b, a) and 2 ; 5 after (b, b).

Figure 3C (normal form; rows Paula's moves, columns Albert's strategies, entries P ; A):

          1           2          3           4
a     -3 ; -3     5 ; 2     -3 ; -3     5 ; 2
b      2 ; 5      4 ; 4      4 ; 4      2 ; 5

Problem 14* Assume that there are only two firms with constant marginal costs (C1 = cQ1 and C2 = cQ2) facing the market demand function Q(P) = Q1+Q2 = a-P. Calculate the levels of output and profit for both firms with Cournot, Bertrand, Stackelberg, collusion and "cheating". Calculate also the minimum level of ρ supporting collusion.

Hint: Setting each firm's marginal revenue equal to its marginal cost gives the equations of the reaction functions: from MR1 ⇒ a - Q2 - 2Q1 = c and from MR2 ⇒ a - Q1 - 2Q2 = c. Solving the system of the two equations gives the Cournot solution, i.e. point E in figure 8B. The other equilibria can be calculated from points B and C in figure 8B. Given the equilibrium quantities we can calculate the equilibrium profits (Π*, Πd, Πp) and

ρ = (Πd - Π*)/(Πd - Πp) = (short-run gains from defection)/(permanent punishment losses)
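As a numerical cross-check of the hint (a Python sketch; the parameter values a = 10 and c = 2 are illustrative, and the closed-form quantities used below — (a-c)/3 per firm under Cournot, (a-c)/4 each under collusion, (a-c)/2 for the Stackelberg leader — are the standard results obtained from the reaction functions in the hint):

    # Sketch: Problem 14 equilibria for demand Q = a - P with marginal cost c.
    a, c = 10.0, 2.0

    def profit(q_own, q_rival):
        return (a - q_own - q_rival - c) * q_own

    def best_reply(q_rival):
        # From the reaction function a - q_rival - 2*q_own = c.
        return (a - c - q_rival) / 2

    # Cournot: intersection of the two symmetric reaction functions.
    q_cournot = (a - c) / 3
    pi_cournot = profit(q_cournot, q_cournot)

    # Bertrand: price falls to c, so profits are zero.
    pi_bertrand = 0.0

    # Collusion: the two firms split the monopoly output (a - c)/2.
    q_collusive = (a - c) / 4
    pi_collusive = profit(q_collusive, q_collusive)

    # Cheating: best reply to a rival who keeps producing the collusive output.
    q_cheat = best_reply(q_collusive)
    pi_cheat = profit(q_cheat, q_collusive)

    # Stackelberg: the leader's optimum, with the follower on his reaction function.
    q_leader = (a - c) / 2
    q_follower = best_reply(q_leader)
    pi_leader, pi_follower = profit(q_leader, q_follower), profit(q_follower, q_leader)

    rho_min = (pi_cheat - pi_collusive) / (pi_cheat - pi_cournot)
    print(pi_cournot, pi_collusive, pi_cheat, pi_leader, pi_follower)
    print(rho_min)  # 9/17 ≈ 0.53, independently of a and c

With Bertrand reversion the punishment profit Πp would fall to zero instead of the Cournot level, reducing the minimum ρ and so promoting collusion, as noted in the text.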