Sicilian Reasoning

Falcon on a Cloudy Day
A Ro Sham Bo Algorithm
by Andrew Post
Let's Review

If you missed my previous presentation:

Ro Sham Bo = Rock Paper Scissors

Can be more complicated though
Ro Sham Bo has important applications
 Algorithms compete at Ro Sham Bo in tournaments
 Iocaine Powder is the world champ of Ro Sham Bo



Because it uses ‘Sicilian Reasoning’
I will beat Iocaine Powder

Eventually…
What is Ro Sham Bo?

Also known as Rock Paper Scissors
What is Ro Sham Bo?

Actually a generalized case of Rock Paper Scissors

Not always three choices
Ties can be resolved differently
The game is not necessarily zero-sum


Why does it matter?



Many competitive scenarios have the structure of a Ro Sham Bo game
Example:
CBS and NBC choosing Primetime TV Shows




They can choose to show a Drama, Comedy, or Sports show
Viewers prefer Comedy to Drama, Sports to Comedy, and Drama
to Sports, given the choice.
Neither station knows ahead of time what the other will choose
Billions of dollars every day rely on decisions like
these.
How it works

Simplest Non-Cooperative Game


Players cannot play to ensure they both win
Governed by the Nash Equilibrium
There are strategies which cannot be dominated
 http://www.youtube.com/watch?v=pdrBDfRvpBA

1:31 -- 2:20
How to Win

As you just heard, playing randomly can ensure
you don’t lose, but how do you win?
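
Why uniform random play cannot lose on average: a short worked calculation, assuming the standard payoffs (+1 for a win, 0 for a tie, -1 for a loss). The notation is an assumption chosen for illustration: Pr[0], Pr[1], Pr[2] are the opponent's move probabilities, and move x beats move x+2 and loses to move x+1 (indices mod 3).

```latex
\mathbb{E}[\text{payoff}]
  = \sum_{x=0}^{2} \tfrac{1}{3}\,\bigl(\Pr[x+2] - \Pr[x+1]\bigr)
  = \tfrac{1}{3}\Bigl(\sum_{x}\Pr[x+2] - \sum_{x}\Pr[x+1]\Bigr)
  = \tfrac{1}{3}\,(1 - 1) = 0
```

Whatever the opponent does, the expected result is zero, so random play never loses on average, but it never wins either.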

How to predict your opponent
Sub-Optimal Frequency Distributions
 Pattern Matching
 History Analysis

Iocaine Powder

International Ro Sham Bo Programming
Tournament Champion

Named for this famous scene:
http://youtube.com/watch?v=TUee1WvtQZU
0:57 -- 2:20
The Tournament



Tournament programs play thousands of rounds
Win by beating the most opponents by a large
margin
Most programs play sub-optimally, so exploiting
your opponent is more important than playing
randomly to avoid losing.
Iocaine Powder



IP is the algorithm which does this best.
IP uses the same heuristics as other programs to predict
what an opponent is most likely to do.
Using the same tools, how can you be better?
Sicilian Reasoning!
Sicilian Reasoning

Levels of second guessing:
1. Opponent will play rock, so play paper
2. Opponent knows you will counter rock with paper, and
   will play scissors, so play rock
3. Opponent knows all this, and will now play paper to
   beat your rock, so play scissors
4. Opponent will play rock again, which is the same as 1
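
The second-guessing cycle above can be sketched as a rotation of the predicted move. This is a minimal illustration, not the tournament code: the encoding 0 = rock, 1 = paper, 2 = scissors and the helper name `sr_response` are assumptions made here.

```python
ROCK, PAPER, SCISSORS = 0, 1, 2
NAMES = ["rock", "paper", "scissors"]

def beats(move):
    """Return the move that beats `move` (paper beats rock, and so on)."""
    return (move + 1) % 3

def sr_response(predicted, level):
    """Move suggested by second-guessing level 1, 2, 3, ... for a predicted opponent move."""
    response = beats(predicted)            # level 1: simply beat the prediction
    for _ in range(level - 1):
        # The opponent counters our previous answer; we beat that counter.
        response = beats(beats(response))
    return response

for level in range(1, 5):
    print(level, NAMES[sr_response(ROCK, level)])
```

Level 4 lands on the same move as level 1, which is why only three distinct levels exist for a given prediction.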
Sicilian Reasoning



Use your predictive strategies to evaluate what is
going to happen next.
Run SR on yourself and your opponent, and
keep a table of what each of the six levels of
reasoning say you should do.
Pick the level of reasoning which would have
won against what your opponent actually did the
most often.
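
One way to keep such a table is sketched below. It assumes moves are encoded 0 = rock, 1 = paper, 2 = scissors and that each level's suggestion for the round is already available. The class name `LevelWins` is hypothetical.

```python
def beats(move):
    """Return the move that beats `move`."""
    return (move + 1) % 3

class LevelWins:
    """Counts how often each reasoning level would have won."""

    def __init__(self, n_levels=6):
        self.wins = [0] * n_levels

    def update(self, suggestions, opponent_move):
        """`suggestions[i]` is the move level i proposed for the round just played."""
        winning_move = beats(opponent_move)
        for i, move in enumerate(suggestions):
            if move == winning_move:
                self.wins[i] += 1

    def best_level(self):
        """Index of the level with the most hypothetical wins (ties go to the lowest index)."""
        return max(range(len(self.wins)), key=self.wins.__getitem__)

table = LevelWins()
table.update([1, 0, 2, 2, 1, 0], opponent_move=0)  # opponent played rock
table.update([1, 1, 2, 0, 1, 0], opponent_move=0)
print(table.wins, table.best_level())
```

Every level is scored each round, whether or not it was the one actually played, so a level can overtake the current favourite without ever having been used.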
Wait, six? Don’t you mean three?


You can use the same predictive tools that your
opponent uses to ‘predict’ what you are going to
do.
Now you have three more levels of SR:
4. I will play rock, so he plays paper. So play scissors.
5. He knows I will counter with scissors, and plays rock. So
   play paper.
6. He expects me to counter-counter with paper, and will
   play scissors. So play rock.
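
The three extra levels can be sketched the same way, starting from the opponent's likely prediction of my own move. The encoding 0 = rock, 1 = paper, 2 = scissors and the function name `mirror_levels` are assumptions for illustration.

```python
def beats(move):
    """Return the move that beats `move`."""
    return (move + 1) % 3

def mirror_levels(my_predicted_move):
    """Suggestions for SR levels 4, 5 and 6, given what the opponent expects me to play."""
    # Level 4: he beats my predicted move, so I beat his answer.
    level4 = beats(beats(my_predicted_move))
    # Levels 5 and 6: each adds one more round of counter and counter-counter.
    level5 = beats(beats(level4))
    level6 = beats(beats(level5))
    return [level4, level5, level6]

# Predicted "I will play rock" (0) -> scissors, paper, rock, as on the slide.
print(mirror_levels(0))
```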
More Sicilian Reasoning



Just because one level of SR is winning now doesn't
mean it always will be.
Opponents will change how they play if they are
losing, so you must change too!
How do you switch your level of SR?
Switching Reasoning




SR-2 has just won the first 100 rounds
Opponent changes strategy
You lose 50 rounds before SR-4 has more than
100 theoretical wins.
You just wasted 50 rounds!
Switching Reasoning

Use several different methodologies for switches
Most wins in last 10, 25, 50, 100, 1000 rounds
 Has won the most in similar situations
 Causes the opponent to switch to a worse strategy
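
The "most wins in the last N rounds" rule can be sketched as follows. This only covers the windowed rule, not the other two methodologies. The class name `WindowedWins` and the 0/1 win-flag representation are assumptions.

```python
from collections import deque

WINDOWS = (10, 25, 50, 100, 1000)

class WindowedWins:
    """Per-level win flags for recent rounds, queried over several window sizes."""

    def __init__(self, n_levels=6, max_window=max(WINDOWS)):
        self.n_levels = n_levels
        # One entry per round: a tuple of 0/1 flags, one flag per level.
        self.history = deque(maxlen=max_window)

    def record(self, won_flags):
        """`won_flags[i]` is 1 if level i would have won the round, else 0."""
        self.history.append(tuple(won_flags))

    def best_level(self, window):
        """Level with the most hypothetical wins in the last `window` rounds."""
        recent = list(self.history)[-window:]
        totals = [sum(r[i] for r in recent) for i in range(self.n_levels)]
        return max(range(self.n_levels), key=totals.__getitem__)

w = WindowedWins()
for _ in range(100):
    w.record([0, 1, 0, 0, 0, 0])   # SR-2 (index 1) keeps winning
for _ in range(6):
    w.record([0, 0, 0, 1, 0, 0])   # opponent changes; SR-4 (index 3) now wins
print({n: w.best_level(n) for n in WINDOWS})
```

In this example the 10-round window already favours SR-4 after six rounds, while the longer windows still favour SR-2. Short windows react quickly, and long windows resist noise.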

Switching Reasoning

Here is the real genius – now use the switching
methodology which has helped you win the
most rounds!
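
A sketch of that last step: score each switching methodology by how often its recommended move would have won, then follow the current leader. The method names and the `MetaSelector` class are hypothetical. Moves are encoded 0 = rock, 1 = paper, 2 = scissors.

```python
def beats(move):
    """Return the move that beats `move`."""
    return (move + 1) % 3

class MetaSelector:
    """Chooses among switching methods by how often each one's pick would have won."""

    def __init__(self, method_names):
        self.wins = {name: 0 for name in method_names}

    def choose(self, proposals):
        """`proposals` maps method name -> recommended move; follow the best method so far."""
        best = max(self.wins, key=self.wins.get)
        return proposals[best]

    def update(self, proposals, opponent_move):
        """Credit every method whose proposal would have beaten the opponent's actual move."""
        for name, move in proposals.items():
            if move == beats(opponent_move):
                self.wins[name] += 1

meta = MetaSelector(["last10", "last100"])
meta.update({"last10": 0, "last100": 1}, opponent_move=0)  # only "last100" picked paper
print(meta.choose({"last10": 1, "last100": 2}))            # follows "last100"
```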
Falcon on a Cloudy Day

So you ask, how do you beat Iocaine Powder?
Improve the basic predictive heuristics
 Extend Sicilian Reasoning

Improving Prediction

What I have implemented:

Improved Variable History Analysis


Look at just your history, your opponent's, or both
Improved Frequency Analysis

EV[x] = Pr[x+2] - Pr[x+1]   (move indices taken mod 3)
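
The formula can be evaluated directly from the opponent's observed move frequencies. This is a minimal sketch of the plain formula only, assuming 0 = rock, 1 = paper, 2 = scissors (so move x beats x+2 and loses to x+1, mod 3). The function name and sample history are made up for illustration.

```python
from collections import Counter

def expected_values(opponent_history):
    """EV of playing each move x against the opponent's observed frequencies."""
    total = len(opponent_history)
    if total == 0:
        return [0.0, 0.0, 0.0]
    counts = Counter(opponent_history)
    pr = [counts[m] / total for m in range(3)]
    # EV[x] = Pr[x+2] - Pr[x+1]: chance of winning minus chance of losing.
    return [pr[(x + 2) % 3] - pr[(x + 1) % 3] for x in range(3)]

history = [0, 0, 0, 1, 2, 0, 1, 0]   # hypothetical opponent who favours rock
ev = expected_values(history)
best = max(range(3), key=ev.__getitem__)
print(ev, best)
```

Against this rock-heavy history the highest expected value belongs to paper.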
Demonstration

Here is how my project does with what is
implemented so far.
Improving Prediction

What I have not implemented yet:

Improved Pattern Matching


Markov Models with MegaHAL
Extended Sicilian Reasoning
More on MegaHAL



MegaHAL is a very simple "infinite-order"
Markov model.
Stores frequency information about the moves
the opponent has made in the past for all
possible contexts
Using the ‘context’ of the last few moves, the
“appropriate” response is then selected.
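
A simplified sketch of a context-frequency predictor in this spirit. It is not the MegaHAL entrant itself: the context length is capped at `max_order` rather than unbounded, only the opponent's own moves form the context, and the class name `ContextModel` is hypothetical.

```python
from collections import defaultdict

class ContextModel:
    """Frequency counts of the opponent's next move after each recent context."""

    def __init__(self, max_order=3):
        self.max_order = max_order
        # counts[context][move] = number of times `move` followed `context`
        self.counts = defaultdict(lambda: [0, 0, 0])
        self.history = []

    def update(self, opponent_move):
        """Record the move under every context length from 0 up to max_order."""
        n = len(self.history)
        for order in range(min(self.max_order, n) + 1):
            context = tuple(self.history[n - order:])
            self.counts[context][opponent_move] += 1
        self.history.append(opponent_move)

    def predict(self):
        """Most frequent follower of the longest context seen before, or None."""
        n = len(self.history)
        for order in range(min(self.max_order, n), -1, -1):
            seen = self.counts.get(tuple(self.history[n - order:]))
            if seen is not None:
                return max(range(3), key=seen.__getitem__)
        return None

model = ContextModel()
for move in [0, 1, 2, 0, 1, 2, 0, 1]:   # opponent cycles rock, paper, scissors
    model.update(move)
print(model.predict())                   # predicted next opponent move
```

The response is then the move that beats the prediction, here `(prediction + 1) % 3`.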
Extended Sicilian Reasoning



Q: Isn’t Sicilian Reasoning complete at 6?
A: Yes, but there is information we are ignoring.
By compressing your strategy decisions into the
idea of which of six strategies is best right now,
you have no way to keep track of how changing
your strategies has paid off best in the past.
Now for some Math





Hilbert Space
Game Trajectory and Game State
Projection Operators
Annotated History Analysis
Project Enigma