SARTRE: System Overview
A Case-Based Agent for Two-Player Texas Hold'em
Jonathan Rubin & Ian Watson
University of Auckland Game AI Group
http://www.cs.auckland.ac.nz/research/gameai/

Overview
• Introduction
• Texas Hold'em
• Approaches to Computer Poker
• Sartre: System Overview
• Results
• Future Work

Texas Hold'em
• Two-player Limit Hold'em
  – Very different from the full-table game
• Chance events
• Hidden information

Approaches to Computer Poker
• Near-Equilibrium Strategy
• Exploitative Strategy

Near-Equilibrium Strategy
• Nash Equilibrium
  – Assumes the opponent makes no mistakes
  – Attempts to minimise its losses against this perfect opponent
• Near-Equilibrium
  – As the game tree is too large
  – Plays not to lose

Exploitative Strategy
• Exploitative Strategy
  – Opponent modelling
  – Attempts to punish weaknesses in the opponent's strategy
  – Plays off the equilibrium
  – Plays to win

Sartre: System Overview
• Similarity Assessment Reasoning for Texas hold'em via Recall of Experience
• Our entry for the 2009 Computer Poker Competition
• Case-base was constructed from past CPC games

Sartre: System Overview
• Case Features (hand-picked by the authors)
  – Previous betting for the hand
  – Hand Category
  – Board Category

1. Previous betting for the hand
• Currently represented as a string
  – f = fold
  – c = check/call
  – r = bet/raise
• Examples
  – r
  – rrc-r
  – rc-crrc-rc-cr

2. Hand Category
• Rule-based System

2. Hand Category
• Two components
  – Hand Category
  – Hand Potential
• Examples
  – Missed, One-Pair, Two-Pair, Three-of-a-kind
  – Flush-draw, Straight-draw

3. Board Category
• Captures information about potential
  – Flush draws, or
  – Straight draws
• Information that is likely to be noticed by a good player

3. Board Category
• Flush Highly Possible

3. Board Category
• Straight Possible

Similarity
• Currently either all or nothing
  – If a collection of cards maps to the same category, it is assigned a similarity of 1.0, otherwise 0.
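
The all-or-nothing similarity above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Features` and `Case` types, the function names, the exact-match treatment of the betting string, the reading of "-" as a round separator, and the product used to combine per-feature similarities are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Features:
    betting: str         # e.g. "rc-crrc-rc-cr" (f = fold, c = check/call, r = bet/raise)
    hand_category: str   # e.g. "One-Pair"
    board_category: str  # e.g. "Flush Highly Possible"


@dataclass(frozen=True)
class Case:
    features: Features
    solution: str        # "f", "c" or "r"
    outcome: float       # profit (+) or loss (-)


def betting_rounds(betting: str) -> list[str]:
    """Split a betting string into per-round actions.

    Assumption: "-" separates betting rounds, as the slide examples suggest.
    """
    return betting.split("-")


def feature_similarity(a: str, b: str) -> float:
    """All-or-nothing similarity: 1.0 for the same category, otherwise 0.0."""
    return 1.0 if a == b else 0.0


def case_similarity(query: Features, case: Case) -> float:
    """Combine per-feature similarities with a product (an assumption).

    The result is 1.0 only when all three features match exactly.
    """
    f = case.features
    return (feature_similarity(query.betting, f.betting)
            * feature_similarity(query.hand_category, f.hand_category)
            * feature_similarity(query.board_category, f.board_category))


def retrieve(query: Features, case_base: list[Case]) -> list[Case]:
    """Return every stored case that matches the query on all features."""
    return [case for case in case_base if case_similarity(query, case) == 1.0]
```

Because every feature is compared for exact equality, retrieval under these assumptions reduces to a lookup; loosening the similarity measure would mean replacing `feature_similarity` with a graded function.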
Case Overview
• Case Features
  – 1. Previous betting for the hand
  – 2. Hand Category
  – 3. Board Category
• Solution
  – f, c, r
• Outcome
  – +/- value
  – + Profit
  – - Loss

Case Overview
• Solution + Outcome
  – Recorded from equilibrium-approaching bots from previous AAAI Computer Poker Competition
• Separate case-bases for preflop, flop, turn & river
• Approx. 250,000 cases in each case-base

Decision Making
• Retrieved cases can have different decisions
• Three different versions
  – 1. Probability Triple
  – 2. Majority Rules
  – 3. Outcome-Based

Decision Making
• Probability Triple
  – (f, c, r)
  – Proportion of times that the solution indicated to fold, call or raise
• Majority Rules
  – Decision made the most is reused
• Outcome-Based
  – Dependent on adjusted average outcome values for each decision
  – If a call or raise decision was never made, its outcome is unknown and is given a value of +infinity

Duplicate Matches
• Experimental results derived using duplicate matches
  – Play N poker hands
  – Reset each player's memory
  – Reverse the position of each player and deal the same N hands
• Forward + Reverse Directions
• Reduces variance

Self-Play Experiments
• Small bets per hand (sb/h)
  – Assuming a $10/$20 game
• Sartre-Probability vs. Sartre-Outcome
  – Sartre-Probability wins 0.168 sb/h
  – On average $1.68 profit per hand
• Sartre-Probability vs. Sartre-Majority
  – Sartre-Majority wins 0.039 sb/h
  – On average $0.39 per hand

Self-Play Experiments
• Chose Sartre – Majority Rules
• Results not transitive
• Makes Sartre more predictable and hence more exploitable by strong opposition

2009 Computer Poker Competition Results
• Duplicate match structure
  – 3000 hands in forward & reverse direction
• Multiple matches against each opponent until statistical significance obtained
• Sartre placed 7th out of 13 entrants in limit competition

2009 Computer Poker Competition Results

Place  Entrant               Sartre's result vs. entrant (sb/h)
 1     MANZANA               -0.038
 2     GGValuta              -0.043
 3     HyperboreanLimitEqm   -0.051
 4     HyperboreanLimitBR    -0.023
 5     Rockhopper            -0.033
 6     Slumbot               -0.012
 7     Sartre
 8     GS5                   -0.007
 9     AoBot                  0.131
10     LIDIA                  0.145
11     dcurbhu                0.217
12     GS5Dynamic             0.119
13     tommybot               0.765
       Total                  0.097

2009 Computer Poker Competition Results
• Overall profit of +0.097 sb/h
• Assuming a $10/$20 game
  – $0.97 per hand profit

Future Work
• Investigate loosening of all-or-nothing similarity
• CBR and adaptive poker agents
  – Opponent modelling
  – Learning
• Better solution adaptation
  – Combination of decision + outcome

The End!
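
As a supplementary illustration, the three decision policies from the Decision Making slides (Probability Triple, Majority Rules, Outcome-Based) can be sketched in Python. This is a hedged sketch, not the authors' code: all function names are hypothetical, the slides do not say how the average outcome is "adjusted" (a plain mean is used here), ties are broken in the order f, c, r, and a never-observed fold is given -infinity, which is purely an assumption.

```python
import random
from collections import Counter

ACTIONS = ("f", "c", "r")  # fold, check/call, bet/raise


def probability_triple(solutions):
    """Proportion of retrieved cases whose solution was fold, call or raise."""
    if not solutions:
        raise ValueError("no retrieved cases")
    counts = Counter(solutions)
    return tuple(counts[a] / len(solutions) for a in ACTIONS)


def decide_probabilistic(solutions, rng=random):
    """Sample an action according to the probability triple."""
    weights = probability_triple(solutions)
    return rng.choices(ACTIONS, weights=weights, k=1)[0]


def decide_majority(solutions):
    """Reuse the decision made most often (ties broken in f, c, r order: an assumption)."""
    if not solutions:
        raise ValueError("no retrieved cases")
    counts = Counter(solutions)
    return max(ACTIONS, key=lambda a: counts[a])


def decide_outcome_based(cases):
    """Pick the action with the best average outcome.

    `cases` is a list of (solution, outcome) pairs. A call or raise that was
    never observed gets +infinity, as on the slide. A never-observed fold gets
    -infinity, and the plain mean stands in for the unspecified "adjusted"
    average; both are assumptions.
    """
    if not cases:
        raise ValueError("no retrieved cases")
    outcomes = {a: [] for a in ACTIONS}
    for solution, outcome in cases:
        outcomes[solution].append(outcome)

    def value(action):
        seen = outcomes[action]
        if seen:
            return sum(seen) / len(seen)
        return float("-inf") if action == "f" else float("inf")

    return max(ACTIONS, key=value)
```

For example, with retrieved solutions `["r", "r", "c", "f"]`, `probability_triple` gives `(0.25, 0.25, 0.5)` and `decide_majority` returns `"r"`. The deterministic majority policy always repeats the same action in the same situation, which is consistent with the slides' remark that it makes Sartre more predictable.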