SARTRE: System Overview
A Case-Based Agent for Two-Player Texas
Hold'em
Jonathan Rubin & Ian Watson
University of Auckland Game AI Group
http://www.cs.auckland.ac.nz/research/gameai/
Overview
•
Introduction
•
Texas Hold'em
•
Approaches to Computer Poker
•
Sartre: System Overview
•
Results
•
Future Work
Texas Hold'em
•
Two-player Limit Hold'em
–
Very different from the full-table game
•
Chance events
•
Hidden Information
Approaches to Computer Poker
•
Near-Equilibrium Strategy
•
Exploitative Strategy
Near-Equilibrium Strategy
•
Nash Equilibrium
–
Assumes the opponent makes no mistakes
–
Attempts to minimise its losses against this
perfect opponent
•
Near-Equilibrium
–
Approximation needed, as the game tree is too large
–
Plays not to lose
Exploitative Strategy
•
Exploitative Strategy
–
Opponent Modelling
–
Attempts to punish weaknesses in the
opponent's strategy
–
Plays off the equilibrium
–
Plays to win
Sartre: System Overview
•
Similarity Assessment Reasoning for Texas
hold'em via Recall of Experience
•
Our entry for the 2009 Computer Poker
Competition
•
Case-base was constructed from past CPC
games
Sartre: System Overview
•
Hand picked by authors
•
Case Features
–
Previous betting for the hand
–
Hand Category
–
Board Category
1. Previous betting for the hand
•
Currently represented as a string
–
f = fold
–
c = check/call
–
r = bet/raise
•
Examples
–
r
–
rrc-r
–
rc-crrc-rc-cr
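As a minimal sketch of how such a betting string could be processed, the Python below splits a sequence into betting rounds and names each action. The helper names `parse_betting` and `raises_per_round`, and the reading of "-" as a separator between betting rounds, are assumptions made for illustration; they are not taken from Sartre's implementation.

```python
# Hypothetical helpers for illustration only; not Sartre's actual code.
ACTIONS = {"f": "fold", "c": "check/call", "r": "bet/raise"}


def parse_betting(sequence: str) -> list[list[str]]:
    """Split a betting string into rounds of named actions.

    Assumes "-" separates betting rounds (preflop, flop, turn, river).
    """
    rounds = []
    for chunk in sequence.split("-"):
        unknown = set(chunk) - set(ACTIONS)
        if unknown:
            raise ValueError(f"unknown action symbols: {sorted(unknown)}")
        rounds.append([ACTIONS[symbol] for symbol in chunk])
    return rounds


def raises_per_round(sequence: str) -> list[int]:
    """Count the bets/raises made in each betting round."""
    return [chunk.count("r") for chunk in sequence.split("-")]
```

For example, `parse_betting("rrc-r")` yields two rounds: a bet/raise, a bet/raise and a check/call, followed by a single bet/raise in the next round.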
2. Hand Category
•
Rule-based System
2. Hand Category
•
Two components
–
Hand Category
–
Hand Potential
•
Examples
–
Missed
–
One-Pair, Two-Pair, Three-of-a-kind
–
Flush-draw, Straight-draw
3. Board Category
•
Captures information about potential
–
Flush draws, or
–
Straight draws
•
Information that is likely to be noticed by a
good player
3. Board Category
•
Flush Highly Possible
3. Board Category
•
Straight Possible
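The slides do not give the exact rules behind the board categories, so the following is only a sketch of what a rule-based board classifier might look like. The thresholds (three suited cards for "flush possible", four for "flush highly possible", three ranks inside a five-rank window for "straight possible"), the category labels, and the card notation such as "Ah" are all assumptions.

```python
from collections import Counter

RANKS = "23456789TJQKA"


def flush_category(board: list[str]) -> str:
    """Classify a board by its most common suit; cards look like "Ah" or "Td"."""
    suited = max(Counter(card[1] for card in board).values())
    if suited >= 4:
        return "flush-highly-possible"  # assumed threshold
    if suited == 3:
        return "flush-possible"  # assumed threshold
    return "no-flush"


def straight_possible(board: list[str]) -> bool:
    """True if at least three board ranks fit inside one five-rank window."""
    values = {RANKS.index(card[0]) + 2 for card in board}
    if 14 in values:
        values.add(1)  # an ace can also play low
    return any(
        len(values & set(range(low, low + 5))) >= 3 for low in range(1, 11)
    )
```

A board such as 9c Td Jh would be flagged as straight-possible under these assumed rules, while 2c 7d Kh would not.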
Similarity
•
Currently either all or nothing
–
If a collection of cards maps to the same
category they are assigned a similarity of
1.0, otherwise 0.
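The all-or-nothing rule can be sketched in a few lines of Python. The function names and the representation of a case as a `(hand_category, board_category)` tuple are assumptions for illustration, not the system's real data structures.

```python
def category_similarity(query: str, stored: str) -> float:
    """All-or-nothing similarity: 1.0 on an exact category match, else 0.0."""
    return 1.0 if query == stored else 0.0


def retrieve(query: tuple[str, str], case_base: list[tuple[str, str]]) -> list[tuple[str, str]]:
    """Return the stored (hand, board) category pairs that match the query exactly."""
    return [
        case
        for case in case_base
        if all(category_similarity(q, s) == 1.0 for q, s in zip(query, case))
    ]
```

Under this scheme a query either matches a stored case on every feature or is treated as entirely dissimilar; there is no notion of a "close" case.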
Case Overview
•
Case Features
–
1. Previous betting for the hand
–
2. Hand Category
–
3. Board Category
•
Solution
–
f, c, r
•
Outcome
–
+/- value
–
+ Profit
–
- Loss
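One possible way to hold such a case in memory is a small immutable record. The class name `Case`, its field names, and the example category labels are hypothetical; the slides only specify which pieces of information a case contains.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Case:
    """A single case: three features, a solution and an outcome (hypothetical layout)."""

    betting: str         # previous betting for the hand, e.g. "rc-c"
    hand_category: str   # e.g. "one-pair"
    board_category: str  # e.g. "flush-possible"
    solution: str        # "f", "c" or "r"
    outcome: float       # positive = profit, negative = loss

    def __post_init__(self) -> None:
        if self.solution not in ("f", "c", "r"):
            raise ValueError(f"solution must be f, c or r, got {self.solution!r}")


example = Case("rc-c", "one-pair", "flush-possible", "r", 2.0)
```

Making the record frozen keeps stored cases from being modified by accident once the case-base has been built.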
Case Overview
•
Solution + Outcome
–
Recorded from equilibrium approaching
bots from previous AAAI Computer Poker
Competition
•
Separate case-bases for preflop, flop, turn
& river
•
Approx. 250,000 cases in each case-base.
Decision Making
•
Retrieved cases can have different
decisions
•
Three different versions
–
1. Probability Triple
–
2. Majority rules
–
3. Outcome-based
Decision Making
•
Probability Triple
–
Proportion of times that the solution
indicated to fold, call or raise
–
(f, c, r)
•
Majority Rules
–
Decision made the most is reused
•
Outcome-Based
–
Dependent on adjusted average outcome
values for each decision
–
If a call or raise decision was never made,
its outcome is unknown and is given a
value of +infinity
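The three policies can be sketched as follows, with each retrieved case reduced to an `(action, outcome)` pair. Several details are assumptions: the slides do not say how outcomes are "adjusted", so a plain average is used; an unseen fold is given a value of negative infinity; ties are broken in the order f, c, r; and at least one case is assumed to have been retrieved.

```python
import math
from collections import Counter

ACTIONS = ("f", "c", "r")


def probability_triple(cases: list[tuple[str, float]]) -> tuple[float, float, float]:
    """Proportion of retrieved cases whose solution was fold, call or raise."""
    counts = Counter(action for action, _ in cases)
    return tuple(counts[a] / len(cases) for a in ACTIONS)


def majority_rules(cases: list[tuple[str, float]]) -> str:
    """Reuse the decision that was made most often."""
    counts = Counter(action for action, _ in cases)
    return max(ACTIONS, key=lambda a: counts[a])


def outcome_based(cases: list[tuple[str, float]]) -> str:
    """Pick the action with the best average outcome (plain average assumed)."""
    averages = {}
    for a in ACTIONS:
        outcomes = [outcome for action, outcome in cases if action == a]
        if outcomes:
            averages[a] = sum(outcomes) / len(outcomes)
        elif a in ("c", "r"):
            averages[a] = math.inf   # unknown call/raise outcome, as on the slide
        else:
            averages[a] = -math.inf  # assumption: an unseen fold is never preferred
    return max(ACTIONS, key=averages.get)
```

With a probability triple in hand, an action could then be sampled with `random.choices(ACTIONS, weights=triple)`, whereas the other two policies return a single action deterministically.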
Duplicate Matches
•
Experimental results derived using
duplicate matches
–
Play N poker hands
–
Reset each player's memory
–
Reverse the position of each player and
deal the same N hands
•
Forward + Reverse Directions
•
Reduces variance
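A duplicate-match result can be computed by pooling both directions, as in this sketch. The function name and the input format (one agent's winnings per hand, in small bets, for the forward and reverse passes) are assumptions for illustration.

```python
def duplicate_match_result(forward: list[float], reverse: list[float]) -> float:
    """Average winnings per hand for one agent over both directions.

    forward[i] is the agent's result on deal i in its original seat;
    reverse[i] is its result on the same deal i played from the other seat.
    """
    if len(forward) != len(reverse):
        raise ValueError("both directions must use the same N hands")
    return (sum(forward) + sum(reverse)) / (2 * len(forward))


# Illustrative numbers only: lucky deals in one direction are largely
# offset when the opponent holds the same cards in the other direction.
score = duplicate_match_result([4.0, -2.0, 6.0], [-3.0, 2.0, -5.0])
```

Because each agent plays both sides of every deal, the luck of the cards tends to cancel and the remaining difference reflects the quality of the decisions.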
Self-Play Experiments
•
Small bets per hand (sb/h)
–
Assuming a $10/$20 game
•
Sartre-Probability Vs. Sartre-Outcome
–
Sartre-Probability wins 0.168 sb/h
–
On average $1.68 profit per hand
•
Sartre-Probability Vs. Sartre-Majority
–
Sartre-Majority wins 0.039 sb/h
–
On average $0.39 profit per hand
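The dollar figures follow from multiplying the win rate by the size of the small bet, which is $10 in a $10/$20 game. A short sketch, with a hypothetical function name:

```python
def sb_per_hand_to_dollars(sb_per_hand: float, small_bet: float = 10.0) -> float:
    """Convert a win rate in small bets per hand to dollars per hand."""
    return sb_per_hand * small_bet


probability_vs_outcome = sb_per_hand_to_dollars(0.168)   # about $1.68 per hand
majority_vs_probability = sb_per_hand_to_dollars(0.039)  # about $0.39 per hand
```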
Self-Play Experiments
•
Chose Sartre – Majority Rules.
•
Results not transitive
•
Makes Sartre more predictable and hence
more exploitable by strong opposition
2009 Computer Poker Competition
Results
•
Duplicate match structure
–
3000 hands in forward & reverse direction
•
Multiple matches against each opponent
until statistical significance obtained
•
Sartre placed 7th out of 13 entrants in limit
competition
2009 Computer Poker Competition
Results
Rank  Opponent              Sartre's result (sb/h)
1     MANZANA               -0.038
2     GGValuta              -0.043
3     HyperboreanLimitEqm   -0.051
4     HyperboreanLimitBR    -0.023
5     Rockhopper            -0.033
6     Slumbot               -0.012
7     Sartre
8     GS5                   -0.007
9     AoBot                  0.131
10    LIDIA                  0.145
11    dcurbhu                0.217
12    GS5Dynamic             0.119
13    tommybot               0.765
      Total                  0.097
2009 Computer Poker Competition
Results
•
Overall profit of +0.097 sb/h
•
Assuming a $10/$20 game
–
$0.97 per hand profit
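The overall figure is consistent with a simple average of Sartre's results against its twelve opponents, as the sketch below checks. Two assumptions are made: that the unlabelled -0.007 in the results listing is the result against GS5, and that the total is an unweighted mean across opponents.

```python
# Sartre's result against each opponent, in small bets per hand.
# Assumption: the unlabelled -0.007 in the listing belongs to GS5.
results = {
    "MANZANA": -0.038,
    "GGValuta": -0.043,
    "HyperboreanLimitEqm": -0.051,
    "HyperboreanLimitBR": -0.023,
    "Rockhopper": -0.033,
    "Slumbot": -0.012,
    "GS5": -0.007,
    "AoBot": 0.131,
    "LIDIA": 0.145,
    "dcurbhu": 0.217,
    "GS5Dynamic": 0.119,
    "tommybot": 0.765,
}

# Unweighted mean across the twelve opponents (assumed aggregation).
overall = sum(results.values()) / len(results)
```

The mean comes out at roughly 0.0975 sb/h, in line with the reported +0.097 sb/h. Most of the profit comes from the lower-ranked opponents; against each of the six agents that finished above it, Sartre lost a small amount per hand.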
Future Work
•
Investigate loosening of all-or-nothing
similarity
•
CBR and adaptive poker agents
–
Opponent modelling
–
Learning
•
Better solution adaptation
–
Combination of decision + outcome
The End!