Parallel Programming in Chess Simulations
Tyler Patton
Discussion:
Background
Sequential Optimizations
Parallelization of chess
Background: What is Chess?
Strategic two-player game
64 squares
16 pieces per player
Objective: checkmate the opponent's king
Background: Chess Elo
Created by Arpad Elo as an improved chess rating system
Players of equal Elo have an equal chance to win
A difference of 400 Elo gives the higher-rated player an expected score of about 91%
Basic formula:
$$\text{Performance Rating} = \frac{\text{Opponents' Total Ratings} + 400 \times (\text{wins} - \text{losses})}{\text{Number of games}}$$
Average player rating: 1200
Highest player rating: 2850 (Magnus Carlsen)
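
The rating ideas on this slide can be tried out numerically. The sketch below assumes the commonly used logistic Elo expected-score formula (the slide does not say which model it uses) and implements the basic performance-rating formula as written. The function names and sample ratings are made up for illustration.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of player A against player B (logistic Elo model, assumed)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def performance_rating(opponent_ratings: list[float], wins: int, losses: int) -> float:
    """Basic performance rating: (sum of opponents' ratings + 400*(wins-losses)) / games."""
    games = len(opponent_ratings)
    return (sum(opponent_ratings) + 400 * (wins - losses)) / games


# Equal ratings versus a 400-point gap (hypothetical players).
print(round(expected_score(1200, 1200), 3))
print(round(expected_score(1600, 1200), 3))

# Four games against hypothetical opponents: 3 wins, 1 loss.
print(performance_rating([1500, 1600, 1700, 1800], wins=3, losses=1))
```

Draws count as neither wins nor losses in this basic formula, so they only enter through the number of games.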
Background: Scope
First estimate of the number of positions:
$64! / (32! \cdot (8!)^2 \cdot (2!)^6) \approx 10^{43}$ (Shannon)
Tight upper bound:
$\approx 10^{50}$ (Dr. Allis)
Number of possible game variations:
$10^{120}$ (Shannon number)
Given ~$10^3$ possibilities per move pair and an average game length of 40 move pairs
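
Both of Shannon's figures are easy to check with exact integer arithmetic. The sketch below evaluates the position formula and the game-variation estimate. The reading of the formula in the comments (32 empty squares, 8 pawns per side, six pairs of identical pieces) is an interpretation, and the variable names are illustrative.

```python
from math import factorial, log10

# Shannon's position estimate: 64 squares, 32 of them empty, 8 pawns per side,
# and six pairs of identical pieces (rooks, knights, bishops for each side).
positions = factorial(64) // (factorial(32) * factorial(8) ** 2 * factorial(2) ** 6)
print(f"positions ~ 10^{log10(positions):.1f}")

# Shannon number: ~10^3 possibilities per move pair, ~40 move pairs per game.
game_variations = (10 ** 3) ** 40
print(game_variations == 10 ** 120)
```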
Background: History of chess engines
1950: Alan Turing develops Turochamp.
1962: Alan Kotok of MIT develops the first “credible” chess program.
1997: Deep Blue defeats Garry Kasparov in a six-game match.
Present: Stockfish 6 holds a rating of 3309.
Deep Blue vs. Kasparov
Sequential Optimizations: Stockfish
Stockfish implementation:
• Alpha-Beta pruning
• Bitboards
• Transposition table
• Late move reductions
Sequential Optimizations: Minimax search
• Minimax:
  • Evaluate a given move, the opponent's responses, your responses to your opponent's moves, and so on.
  • Proceed to the next move and repeat.
  • Choose the tree which yields the best ending evaluation
  • Assume the opponent always chooses the best move
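
The steps above can be sketched on a toy game tree. This is a minimal illustration, not an engine: the tree is a nested list whose integer leaves stand in for position evaluations, and both that representation and the names are assumptions made for the example.

```python
def minimax(node, maximizing: bool) -> int:
    """Return the minimax value of a toy tree (nested lists, integer leaves)."""
    if isinstance(node, int):
        return node  # leaf: a stand-in for a static evaluation
    scores = [minimax(child, not maximizing) for child in node]
    # We pick the best score for ourselves; the opponent picks the worst for us.
    return max(scores) if maximizing else min(scores)


# Three candidate moves, each with two possible opponent replies.
tree = [[3, 5], [2, 9], [0, 7]]

# Score every move assuming the opponent answers with their best reply.
best_move = max(range(len(tree)), key=lambda i: minimax(tree[i], maximizing=False))
print(best_move, minimax(tree, maximizing=True))
```

The second move contains the largest leaf (9), but the opponent would never allow it, so the first move wins out.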
Sequential Optimizations: Alpha-Beta Pruning
• Alpha is the best score the player is already guaranteed (a lower bound on the result)
• Beta is the best score the opponent is already guaranteed (an upper bound on the result)
• Allows branches of the search tree to be eliminated
• Eliminates a branch if the opponent would never allow the position
• Ordered node complexity: $O(b^{d/2})$, with branching factor $b$ and search depth $d$
• Random order complexity: $O(b^{3d/4})$
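
A small sketch makes the effect of move ordering visible. It runs alpha-beta on a toy tree (nested lists with integer leaves, an assumption for illustration) and counts leaf evaluations, once with the best move first and once with it last.

```python
import math


def alphabeta(node, alpha, beta, maximizing, counter):
    """Alpha-beta search on a toy tree; counter[0] tracks evaluated leaves."""
    if isinstance(node, int):
        counter[0] += 1
        return node
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, alpha, beta, False, counter))
            alpha = max(alpha, value)
            if alpha >= beta:  # the opponent would never allow this branch
                break
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, alpha, beta, True, counter))
        beta = min(beta, value)
        if alpha >= beta:  # we already have something better elsewhere
            break
    return value


def search(tree):
    counter = [0]
    value = alphabeta(tree, -math.inf, math.inf, True, counter)
    return value, counter[0]


well_ordered = [[3, 5], [2, 9], [0, 7]]    # best move searched first
badly_ordered = [[0, 7], [2, 9], [3, 5]]   # best move searched last
print(search(well_ordered))
print(search(badly_ordered))
```

Both orderings return the same value, but the well-ordered tree needs fewer leaf evaluations because the first branch sets a strong alpha early.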
Sequential Optimizations: Transposition Table
• Stores the history of search evaluations
• Positions that have been searched are likely to be reached again
• Before a branch is searched, the transposition table is checked and returns the stored result if one exists
• Implemented as a hash table
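
The idea can be sketched with a Python dict standing in for the hash table. Chess is too large for a short example, so this uses a toy Nim game as a stand-in (an assumption): players alternately take 1 or 2 stones from one pile, and whoever takes the last stone wins. Different move orders reach the same piles, which is exactly the transposition effect.

```python
def negamax(piles, table, stats):
    """Return +1 if the side to move wins the toy Nim game, else -1."""
    key = tuple(sorted(piles))  # the same piles in any order are one position
    if table is not None and key in table:
        stats["hits"] += 1
        return table[key]  # reuse the stored evaluation
    stats["nodes"] += 1
    value = -1  # worst case: the side to move loses (also covers "no stones left")
    for i, pile in enumerate(key):
        for take in (1, 2):
            if take <= pile:
                child = key[:i] + (pile - take,) + key[i + 1:]
                value = max(value, -negamax(child, table, stats))
    if table is not None:
        table[key] = value
    return value


start = (2, 3, 4)

plain = {"nodes": 0, "hits": 0}
value_plain = negamax(start, None, plain)

cached = {"nodes": 0, "hits": 0}
value_cached = negamax(start, {}, cached)

print(value_plain, value_cached)
print(plain["nodes"], cached["nodes"], cached["hits"])
```

Both searches agree on the result, while the version with the table expands far fewer positions.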
Parallelization of Chess: Parallelizing Alpha-Beta pruning
• Goal: Use multiple processors to simultaneously search different branches of the game tree
• Drawback: Dependency on the alpha value
  • Processors depend on each other for updated alpha values, which causes communication overhead and locking
  • Parallel algorithms tend to be less efficient since the alpha value is not as strong, so fewer branches are pruned
  • A parallel implementation is only as efficient as the sequential algorithm if the first move examined is the best one
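
The cost of a weaker alpha can be shown without any real parallelism. The sketch below compares a sequential alpha-beta search with a simulated split in which every root move is searched independently, as if by separate processors that never share alpha. The toy tree and function names are assumptions for illustration.

```python
import math


def alphabeta(node, alpha, beta, stats):
    """Negamax alpha-beta on a toy tree; leaves are scores for the side to move."""
    if isinstance(node, int):
        stats["leaves"] += 1
        return node
    best = -math.inf
    for child in node:
        best = max(best, -alphabeta(child, -beta, -alpha, stats))
        alpha = max(alpha, best)
        if alpha >= beta:
            break
    return best


def sequential_search(tree):
    """One processor: alpha from earlier moves prunes later ones."""
    stats = {"leaves": 0}
    return alphabeta(tree, -math.inf, math.inf, stats), stats["leaves"]


def independent_search(tree):
    """Simulated split: each root move is searched with no alpha from its siblings."""
    stats = {"leaves": 0}
    best = max(-alphabeta(child, -math.inf, math.inf, stats) for child in tree)
    return best, stats["leaves"]


tree = [[3, 5], [2, 9], [0, 7]]
print(sequential_search(tree))
print(independent_search(tree))
```

Both searches find the same value, but the independent version evaluates every leaf because no worker ever sees the bound established by the first move.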
Parallelization of Chess: Principal Variation Splitting
• Early technique for parallelizing alpha-beta
• Assumptions:
  • The game tree is well ordered
  • The leftmost path is the best
• Updates alpha after a branch is searched
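
A simplified sketch of the splitting idea follows: search the leftmost child first to establish alpha, then hand the remaining siblings to a worker pool. Several things here are assumptions for illustration: the toy tree, the function names, and the choice to wait for all siblings rather than update alpha between them. Python threads also share one interpreter lock, so this shows the structure rather than a real speedup.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def alphabeta(node, alpha, beta):
    """Sequential negamax alpha-beta; leaves are scores for the side to move."""
    if isinstance(node, int):
        return node
    best = -math.inf
    for child in node:
        best = max(best, -alphabeta(child, -beta, -alpha))
        alpha = max(alpha, best)
        if alpha >= beta:
            break
    return best


def pv_split(node, alpha, beta, pool):
    """Search the leftmost (assumed best) child first, then its siblings in parallel."""
    if isinstance(node, int):
        return node
    first, *rest = node
    # Follow the assumed principal variation in the calling thread.
    best = -pv_split(first, -beta, -alpha, pool)
    alpha = max(alpha, best)
    if alpha >= beta or not rest:
        return best
    # Siblings start with the alpha established by the first branch.
    futures = [pool.submit(alphabeta, child, -beta, -alpha) for child in rest]
    for future in futures:
        best = max(best, -future.result())
    return best


tree = [[3, 5], [2, 9], [0, 7]]
with ThreadPoolExecutor(max_workers=2) as pool:
    print(pv_split(tree, -math.inf, math.inf, pool))
```

Only the sequential `alphabeta` tasks are submitted to the pool, so workers never wait on other workers.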
Parallelization of Chess: Typical Speedup
• Principal Variation Splitting
  • Implemented using 4 or fewer processors with a speedup of ~3.5
• The Young Brothers Wait Concept achieved a speedup of 140 on 256 nodes
• The CilkChess parallel engine is shown to be scalable up to 1000 nodes with a similar speedup to the Young Brothers Wait Concept
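
One way to compare these figures is parallel efficiency, meaning speedup divided by processor count. That measure is not named on the slide and is introduced here only for illustration, using the slide's own numbers.

```python
def parallel_efficiency(speedup: float, processors: int) -> float:
    """Fraction of ideal linear speedup actually achieved."""
    return speedup / processors


# Figures quoted on the slide.
pvs = parallel_efficiency(3.5, 4)
ybwc = parallel_efficiency(140, 256)
print(f"PV Splitting: {pvs:.1%}, Young Brothers Wait: {ybwc:.1%}")
```

On these numbers the per-processor efficiency is lower at 256 nodes, even though the absolute speedup is much larger.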
Parallelization of Chess: Looking to the future
• The best algorithm for large numbers of processors and indefinite tree size is unknown
• Explore new algorithms that avoid communication pitfalls
• Optimizations to existing algorithms and techniques are still possible
  • e.g. making the new alpha available to each processor when it's found, as opposed to when a processor finishes a search
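
The last idea can be sketched with a lock-protected shared bound that workers re-read while they search, instead of only receiving it between searches. Everything named here (`SharedAlpha`, `score_root_move`, `search_root`) is hypothetical, the tree is a toy, and Python threads will not give a real speedup. The point is only the communication pattern.

```python
import math
import threading
from concurrent.futures import ThreadPoolExecutor


class SharedAlpha:
    """Best root score found so far, visible to all workers (hypothetical helper)."""

    def __init__(self):
        self._value = -math.inf
        self._lock = threading.Lock()

    def get(self):
        with self._lock:
            return self._value

    def raise_to(self, score):
        with self._lock:
            self._value = max(self._value, score)


def minimax(node, maximizing):
    if isinstance(node, int):
        return node
    scores = [minimax(child, not maximizing) for child in node]
    return max(scores) if maximizing else min(scores)


def score_root_move(replies, shared):
    """Score one root move, re-reading the shared alpha before every reply."""
    value = math.inf
    for reply in replies:
        if value <= shared.get():  # another worker already found something better
            break
        value = min(value, minimax(reply, True))
    shared.raise_to(value)  # publish the result as soon as it is known
    return value


def search_root(tree, workers=3):
    shared = SharedAlpha()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        scores = list(pool.map(lambda replies: score_root_move(replies, shared), tree))
    return max(scores)


print(search_root([[3, 5], [2, 9], [0, 7]]))
```

How many replies get skipped depends on thread timing, but the returned value does not: a move is only abandoned once it can no longer beat a score that another worker has fully established.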
Questions?