
cs6501: Imperfect Information Games
A Course on Poker?!?
There are few things that are so unpardonably neglected in our country as poker. The upper class knows very little about it. Now and then you find ambassadors who have sort of a general knowledge of the game, but the ignorance of the people is fearful. Why, I have known clergymen, good men, kind-hearted, liberal, sincere, and all that, who did not know the meaning of a “flush”. It is enough to make one ashamed of one’s species.
- Mark Twain (as quoted in A Bibliography of Mark Twain, Merle Johnson)
Principles Of Knowledge Engineering & Reconstruction
Spring 2010
University of Virginia
David Evans
A-K-Q Game (not von Neumann Poker)
John von Neumann
(1903-1957)
Pure Math
Quantum Physics
Atomic Bombs
Designer of Plutonium Bomb
Fission/Fusion Hydrogen Bomb
Computer Science
First Draft Report on EDVAC
von Neumann Architecture
Merge Sort
Random Number generation
Game Theory
Theory of Games and Economic Behavior (with Morgenstern)
Mutual Assured Destruction
Flickr:cc Malkav
A-K-Q Game Rules
• 3 card deck: Ace > King > Queen
• 2 Players, each player gets one card face-up
• Higher card wins
Without secrecy, stakes, betting, it’s not poker!
A-K-Q Game Rules
• 3 card deck: Ace > King > Queen
• 2 Players, each player gets one card face-down
• Higher card wins
• Betting: (half-street game)
– Ante: 1 chip
– Player 1: bet 1, or check
– Player 2: call or fold
• Stakes: scheduling signup order by chip count
Loosely based on Bill Chen and Jerrod Ankenman, The Mathematics of Poker.
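The rules above can be sketched as a small payoff function. This is a minimal sketch, not part of the original slides: the names `RANK`, `payoff_p1` and `DEALS` are hypothetical, and it assumes that a check leads straight to a showdown for the antes, so Player 2's call/fold choice only matters after a bet.

```python
from itertools import permutations

# Card strength: Ace > King > Queen.
RANK = {"Q": 0, "K": 1, "A": 2}


def payoff_p1(card1, card2, p1_bets, p2_calls):
    """Chips won by Player 1; Player 2 wins the negative of this."""
    p1_has_best = RANK[card1] > RANK[card2]
    if not p1_bets:
        # Assumption: a check goes to showdown for the 1-chip antes.
        return 1 if p1_has_best else -1
    if not p2_calls:
        # Bet and fold: Player 1 collects Player 2's ante.
        return 1
    # Bet and call: showdown for ante plus bet.
    return 2 if p1_has_best else -2


# The six equally likely deals (both players cannot hold the same card).
DEALS = list(permutations("AKQ", 2))

print(payoff_p1("A", "K", True, True))   # Ace bets, King calls
print(payoff_p1("Q", "K", True, False))  # Queen bluffs, King folds
```

Returning only Player 1's payoff is enough because the game is zero-sum.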
A-K-Q Analysis
Better to be player 1 or player 2?
Player 1: Ace (Bet or Check), King (Bet or Check), Queen (Bet or Check)
Player 2: Ace (Call or Fold), King (Call or Fold), Queen (Call or Fold)
Easy Decisions:
Hard Decisions:
Game Payoffs (Player 1, Player 2)
Columns: Player 1's card and action. Rows: Player 2's card and action.
Player 2       Ace Bet    Ace Check   King Bet   King Check   Queen Bet   Queen Check
Ace    Call    -          -           (-2,+2)    (-1,+1)      (-2,+2)     (-1,+1)
       Fold    -          -           (+1,-1)    (+1,-1)      (+1,-1)     (+1,-1)
King   Call    (+2,-2)    (+1,-1)     -          -            (-2,+2)     (-1,+1)
       Fold    (+1,-1)    (+1,-1)     -          -            (+1,-1)     (+1,-1)
Queen  Call    (+2,-2)    (+1,-1)     (+2,-2)    (+1,-1)      -           -
       Fold    (+1,-1)    (+1,-1)     (+1,-1)    (+1,-1)      -           -
Zero-Sum Game
$\sum_{p \in Players} Gain(p) = 0$
Payoffs for Player 1
Columns: Player 1's card and action. Rows: Player 2's card and action.
Player 2       Ace Bet    Ace Check   King Bet   King Check   Queen Bet   Queen Check
Ace    Call    -          -           -2         -1           -2          -1
       Fold    -          -           +1         +1           +1          +1
King   Call    +2         +1          -          -            -2          -1
       Fold    +1         +1          -          -            +1          +1
Queen  Call    +2         +1          +2         +1           -           -
       Fold    +1         +1          +1         +1           -           -
Strategic Domination
Strategy A dominates Strategy B if Strategy A always produces a better outcome than Strategy B regardless of the other player’s action.
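The domination test can be sketched in code. This is a hedged illustration rather than the slides' own method: it uses weak domination (never worse, sometimes better) instead of "always better", assumes a check goes straight to showdown, and the names `payoff_p1`, `p1_dominates`, `p2_dominates` and `SURVIVING` are hypothetical.

```python
RANK = {"Q": 0, "K": 1, "A": 2}  # Ace > King > Queen


def payoff_p1(card1, card2, p1_bets, p2_calls):
    """Player 1's payoff; assumes a check goes straight to showdown."""
    best = RANK[card1] > RANK[card2]
    if not p1_bets:
        return 1 if best else -1
    if not p2_calls:
        return 1
    return 2 if best else -2


def p2_dominates(card2, calls_a, calls_b):
    """True if Player 2's reply A weakly dominates reply B when facing a bet."""
    # Player 2's gain from A over B equals Player 1's loss (zero-sum).
    diffs = [
        payoff_p1(c1, card2, True, calls_b) - payoff_p1(c1, card2, True, calls_a)
        for c1 in RANK if c1 != card2
    ]
    return all(d >= 0 for d in diffs) and any(d > 0 for d in diffs)


def p1_dominates(card1, bets_a, bets_b, replies):
    """True if Player 1's action A weakly dominates B against the allowed replies."""
    diffs = [
        payoff_p1(card1, c2, bets_a, calls) - payoff_p1(card1, c2, bets_b, calls)
        for c2, options in replies.items() if c2 != card1
        for calls in options
    ]
    return all(d >= 0 for d in diffs) and any(d > 0 for d in diffs)


# Step 1: Player 2 always calls with an Ace and always folds a Queen.
print(p2_dominates("A", True, False), p2_dominates("Q", False, True))

# Step 2: against the surviving replies, Player 1 bets an Ace and checks a King.
SURVIVING = {"A": [True], "K": [True, False], "Q": [False]}
print(p1_dominates("A", True, False, SURVIVING))
print(p1_dominates("K", False, True, SURVIVING))
```

Under these assumptions neither action dominates for Player 1's Queen or Player 2's King, which is why those two decisions remain.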
Eliminating Dominated Strategies
(The Payoffs for Player 1 table, repeated.)
Simplified Payoff Matrix
Player 2       Ace Bet    King Check   Queen Bet   Queen Check
Ace    Call    -          -1           -2          -1
King   Call    +2         -            -2          -1
       Fold    +1         -            +1          +1
Queen  Fold    +1         +1           -           -
The Tough Decisions
Player 2       Ace Bet    Queen Bet   Queen Check
Ace    Call    -          -2          -1
King   Call    +2         -2          -1
       Fold    +1         +1          -1
Expected Value
$EV = \sum_{e \in Events} \Pr(e) \cdot Value(e)$
What if Player 1 never bluffs?
Never Bluff Strategy
Player 2           A Bet    K Check   Q Check
A    Call          -        -1        -1
K    Fold/Call     +1       -         -1
Q    Fold          +1       +1        -
$EV_1 = \frac{1}{3}(1) + \frac{1}{3}\left(-\frac{1}{2} + \frac{1}{2}\right) + \frac{1}{3}(-1) = 0$
What if Player 1 always bluffs?
Always Bluff Strategy
Player 2       A Bet    K Check   Q Bet
A    Call      -        -1        -2
K    Call      +2       -         -2
     Fold      +1       -         +1
Q    Fold      +1       +1        -
$EV_{1/CallK} = \frac{1}{3}\left(\frac{1}{2}(+2) + \frac{1}{2}(+1)\right) + \frac{1}{3}\left(-\frac{1}{2} + \frac{1}{2}\right) + \frac{1}{3}(-2) = -\frac{1}{6}$
$EV_{1/FoldK} = \frac{1}{3}(1) + \frac{1}{3}\left(-\frac{1}{2} + \frac{1}{2}\right) + \frac{1}{3}\left(\frac{1}{2}(-2) + \frac{1}{2}(1)\right) = \frac{1}{6}$
Recap
If player 1 never bluffs: $EV_1 = 0$
If player 1 always bluffs: $EV_1 = -\frac{1}{6}$
Is this a break-even game for Player 1?
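The expected values above can be checked by enumerating the six equally likely deals. This is a minimal sketch with hypothetical names (`payoff_p1`, `ev_p1`); it assumes Player 1 always bets an Ace and checks a King, Player 2 always calls with an Ace and folds a Queen, and a check goes straight to showdown.

```python
from fractions import Fraction
from itertools import permutations

RANK = {"Q": 0, "K": 1, "A": 2}  # Ace > King > Queen


def payoff_p1(card1, card2, p1_bets, p2_calls):
    """Player 1's payoff; assumes a check goes straight to showdown."""
    best = RANK[card1] > RANK[card2]
    if not p1_bets:
        return 1 if best else -1
    if not p2_calls:
        return 1
    return 2 if best else -2


def ev_p1(bluff_with_queen, call_with_king):
    """Expected value for Player 1 as a sum of Pr(deal) * Value(deal)."""
    deals = list(permutations("AKQ", 2))
    total = Fraction(0)
    for card1, card2 in deals:
        p1_bets = card1 == "A" or (card1 == "Q" and bluff_with_queen)
        p2_calls = card2 == "A" or (card2 == "K" and call_with_king)
        total += Fraction(payoff_p1(card1, card2, p1_bets, p2_calls), len(deals))
    return total


print(ev_p1(True, True))    # always bluff, King calls
print(ev_p1(True, False))   # always bluff, King folds
print(ev_p1(False, False))  # never bluff, King folds
```

Using `Fraction` keeps the results exact, so they can be compared directly with the sixths on the slides.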
Course Overview
• Topics
– Game Theory
– Machine Learning
– Anything else relevant to building a poker bot
• Format: most classes will be student-led
– Present a topic and/or research paper
Class Leader Expectations
• At least two weeks* before your scheduled class:
– Let me know what you are planning on doing (talk to me after class or email)
• At least one week before your scheduled class:
– Post on the course blog a description of the class topic and links to any reading/preparation materials
• At the class: lead an interesting class, bring any needed materials
• Later that day: post class materials on the course blog
• Follow-up: respond to any comments on the course blog
* If you signed up for Feb 1, you’re already late!
Course Project
Build a poker bot capable of competing in the Sixth Annual Computer Poker Competition
http://www.computerpokercompetition.org/
Work in small (2-4) person teams
A few preliminary projects earlier
Combine ideas/code/results from best teams
Note: overlaps with USENIX Security, August 9-12 (also in San Francisco)
My (Lack of) Qualifications
• I do research in computer security
• I have very limited knowledge and experience in game theory, machine learning, etc.
• I am (probably) a fairly lousy poker player
This course will be a shared learning experience, and will only work well if everyone contributes to make it interesting and worthwhile.
Things to Do
• Submit course survey
• Print and sign course contract: bring to Tuesday’s class
• Reading for Tuesday: Chapters 1 and 2 of Darse Billings’ dissertation
Everything will be posted on the course site (by tomorrow!):
http://www.cs.virginia.edu/evans/poker
Recap
If player 1 never bluffs: $EV_1 = 0$
If player 1 always bluffs: $EV_1 = -\frac{1}{6}$
Looks like a break-even game for Player 1: is there a better strategy?
Mixed Strategy
Never Bluff (A: Bet, K: Check, Q: Check): $EV_1 = 0$
Always Bluff (A: Bet, K: Check, Q: Bet): $EV_1 = -\frac{1}{6}$
Pure strategy: always do the same action for a given input state.
Mixed strategy: probabilistically select from a set of pure strategies.
Strategies
Player 1: Bluff with Queen ($S_{Bluff}$) or Check with Queen ($S_{Check}$)
Player 2: Call with King ($T_{Call}$) or Fold with King ($T_{Fold}$)
$EV_1(\langle S_{Bluff}, T_{Call} \rangle) = -\frac{1}{6}$
$EV_1(\langle S_{Bluff}, T_{Fold} \rangle) = \frac{1}{6}$
$EV_1(\langle S_{Check}, T_{Call} \rangle) = \frac{1}{6}$
$EV_1(\langle S_{Check}, T_{Fold} \rangle) = 0$
Finding the best strategy for Player 1: assume Player 2 plays optimally.
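A mixed strategy's expected value is the probability-weighted average of the pure-strategy values. The sketch below illustrates this with the four values of $EV_1$ listed above; the names `PURE_EV` and `mixed_ev` and the 50/50 example are hypothetical, chosen only for illustration.

```python
from fractions import Fraction

# EV for Player 1 of each pure-strategy pair, as listed on the slide.
PURE_EV = {
    ("bluff", "call"): Fraction(-1, 6),
    ("bluff", "fold"): Fraction(1, 6),
    ("check", "call"): Fraction(1, 6),
    ("check", "fold"): Fraction(0),
}


def mixed_ev(p_bluff, p_call):
    """EV for Player 1 when the Queen bluffs with probability p_bluff
    and the King calls with probability p_call."""
    p1 = {"bluff": p_bluff, "check": 1 - p_bluff}
    p2 = {"call": p_call, "fold": 1 - p_call}
    return sum(p1[s] * p2[t] * ev for (s, t), ev in PURE_EV.items())


# Example: both players flip a fair coin for their tough decision.
print(mixed_ev(Fraction(1, 2), Fraction(1, 2)))
```

Setting either probability to 0 or 1 recovers the pure strategies as special cases.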
Nash Equilibrium
• Player 1 is making the best decision she can, taking into account Player 2’s decisions.
• Player 2 is making the best decision he can, taking into account Player 1’s decisions.
• Neither player can improve its expected value by deviating from the strategy.
John Nash (born 1928)
Equilibrium Points in N-Person Games, 1950
Hence, to find the best strategy for Player 1, we need to find a strategy that makes Player 2 indifferent between his options.
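The indifference condition is a single linear equation in Player 1's bluffing probability. The sketch below solves it using the four pure-strategy values of $EV_1$ from the Strategies slide (-1/6, 1/6, 1/6, 0). The result is not stated in the slides; it follows from that arithmetic only, and the names `E_BC`, `E_BF`, `E_CC`, `E_CF` and `indifference_bluff_probability` are hypothetical.

```python
from fractions import Fraction

# EV for Player 1: B = bluff with Queen, C = check with Queen,
# second letter: C = King calls, F = King folds.
E_BC, E_BF = Fraction(-1, 6), Fraction(1, 6)
E_CC, E_CF = Fraction(1, 6), Fraction(0)


def indifference_bluff_probability():
    """Bluff probability b at which calling and folding give Player 2
    the same result: b*E_BC + (1-b)*E_CC == b*E_BF + (1-b)*E_CF."""
    return (E_CF - E_CC) / (E_BC - E_CC - E_BF + E_CF)


b = indifference_bluff_probability()
ev_if_call = b * E_BC + (1 - b) * E_CC
ev_if_fold = b * E_BF + (1 - b) * E_CF

print(b)                       # bluffing probability with a Queen
print(ev_if_call, ev_if_fold)  # equal by construction
```

With these inputs Player 1's expected value is the same whichever way Player 2 plays the King, which is exactly the indifference the slide asks for.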
Winning the AKQ Game
$EV_1(\langle S_{Bluff}, T_{Call} \rangle) = -\frac{1}{6}$
$EV_1(\langle S_{Bluff}, T_{Fold} \rangle) = \frac{1}{6}$
$EV_1(\langle S_{Check}, T_{Call} \rangle) = \frac{1}{6}$
$EV_1(\langle S_{Check}, T_{Fold} \rangle) = 0$
Payoffs for Player 1 (in units of 1/6):
          Bluff    Check
Call      -1       +1
Fold      +1       0
Player 1 wants to make Player 2 indifferent between $T_{Call}$ and $T_{Fold}$
Charge
• Submit course survey
• Print and sign course contract: bring to Tuesday’s class
• Reading for Tuesday: Chapters 1 and 2 of Darse Billings’ dissertation
Readings posted now. Everything else will be posted on the course site (by tomorrow!):
http://www.cs.virginia.edu/evans/poker
If you are signed up for February 1, by tomorrow: contact me about plans for class.