
International Journal of Emerging Trend in Engineering and Basic Sciences (IJEEBS)
ISSN (Online) 2349-6967
Volume 2 , Issue 1(Jan-Feb 2015), PP481-487
Evaluation Function to predict move in Othello
More Onkar , Pisal Vishal , Singh Satyendra
Dept. of IT, Dr. D. Y. Patil College of Engineering, Savitribai Phule University, India
Abstract: This paper presents an implementation of AI game-theory algorithms to create a powerful
Othello/Reversi player. Our AI player uses search algorithms such as minimax and alpha-beta pruning,
optimizations such as transposition tables and bitboards, and strategies such as mobility and stability. Using these
techniques, the application is able to play the game at a very high level against strong opponents, and it also provides
a generic platform on which other games can be played. Its strength is measured against other Othello-playing
applications, and it can be distinguished from typical such applications in several respects. We propose some
enhancements of alpha-beta and Proof-Number Search that can be useful in other domains, validated through many
tests against computer opponents and a year of deployment on a popular board-gaming portal.
Keywords: Alpha-beta pruning, Bitboards, Frontiers, Minimax, Mobility, Transpositions
I. INTRODUCTION
1.1 Introduction of Othello
Since the beginning of the computer era, people have worked on game-playing programs. The first
challenge was to create a program that could win against the top human players at Chess. In 1996 Garry Kasparov,
then the best Chess player in the world, lost a game against the computer Deep Blue. Nowadays the best computer
programs for Chess are much stronger than the best human players.
Game-playing programs have existed from the early years of computer science and have steadily improved in
their ability. Game Theory was developed and established in the 1940s, making a lasting contribution to
mathematics. One of the topic's larger impacts is felt in economics, where the prediction of
trends and of human behaviour is fundamental.
Programs that play well have, on the other hand, served people's curiosity and set new challenges on the scene
of world-class Chess. Artificial Intelligence is now said to be tightly associated with such programs: computers
use brute force to simulate aspects of human thinking, rather than reproducing real neural function.
It would be safe to say that computers have not yet reached the goal of genuine thought, but brute force may
work for a game such as Chess. Othello can be considered an instance of the games for which brute force is of real
use, although, as explained later, there is a snag to this. Othello Master makes use of the classical
approaches to carry out its process of playing: it traverses the game tree (defined later) and inspects various
properties of positions systematically.
There is no real thinking involved, and the only rational decisions made are in the mind of the
programmer. All of these methods have had some famous successes, as indicated above, and also have
some important differences that make for interesting comparisons. Temporal-difference learning (TDL), for example,
uses information available during game play in an attempt to solve a credit-assignment problem, whereas coevolution
normally focuses only on the end result.
II. RELATED WORK
This paper implements AI and game-theory algorithms to create a powerful AI-based
Othello/Reversi player. The AI player uses minimax, alpha-beta pruning, bitboards, transposition
tables and various evaluation strategies: mobility, frontier discs, stability, etc. A human player can
play against the AI player, and the AI is also deployed on the cloud so that any human player in the world can play
against it.
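The bitboard optimization named above can be sketched in a few lines of Python. The square ordering and helper names here are illustrative assumptions, not taken from the paper: each colour's discs are packed into one 64-bit integer, one bit per square, so that counting a player's discs becomes a single population count.

```python
def square_bit(row, col):
    """One bit per square: bit 0 = a1, ..., bit 63 = h8 (an assumed layout)."""
    return 1 << (row * 8 + col)

def initial_position():
    """Return (black, white) bitboards for the standard Othello opening."""
    black = square_bit(3, 4) | square_bit(4, 3)
    white = square_bit(3, 3) | square_bit(4, 4)
    return black, white

def disc_count(bitboard):
    """Counting one player's discs is a single population count on the word."""
    return bin(bitboard).count('1')

black, white = initial_position()
```

Board-wide operations such as flipping runs of discs or finding empty neighbours then become shift-and-mask operations on these two machine words, which is the source of the bitboard speedup.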
2.1. System Approach:
2.1.1 System Description
At the beginning of the game, four stones are already placed at the centre of the standard 8x8 game
board, two of each player's colour arranged diagonally. By convention, the player to make the first move
plays the black stones while the other plays white. The stones are all two-sided, and flipping a stone changes
its colour to the opponent's colour. A legal move is one that reverses one or more stones of the opponent's
colour. To reverse stones, a player places one of his/her stones so that it flanks a sequence of
one or more of the opponent's stones; such a sequence must end in a board slot occupied by the
reversing player's own stone. All straight lines are applicable in such a reversal: horizontal, vertical
or diagonal. If no reversal is possible, the turn passes to the opponent. The game finishes when neither
player has any legal moves left; usually, by this point the board is completely full. Whoever has the most stones
placed on the board at that point wins the game.
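The legality rule described above can be sketched directly. The plain 8x8 grid representation ('B', 'W', '.') and the helper names are assumptions for illustration, not the paper's implementation (which uses bitboards):

```python
# The eight straight-line directions along which stones can be reversed.
DIRECTIONS = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
              (0, 1), (1, -1), (1, 0), (1, 1)]

def flips_for_move(board, row, col, player):
    """Return the opponent discs that placing at (row, col) would reverse."""
    if board[row][col] != '.':
        return []
    opponent = 'W' if player == 'B' else 'B'
    flipped = []
    for dr, dc in DIRECTIONS:
        run = []                      # opponent discs seen along this line
        r, c = row + dr, col + dc
        while 0 <= r < 8 and 0 <= c < 8 and board[r][c] == opponent:
            run.append((r, c))
            r, c = r + dr, c + dc
        # The run only counts if it ends on one of the player's own discs.
        if run and 0 <= r < 8 and 0 <= c < 8 and board[r][c] == player:
            flipped.extend(run)
    return flipped

def is_legal(board, row, col, player):
    """A move is legal exactly when it reverses at least one disc."""
    return bool(flips_for_move(board, row, col, player))
```

From the standard opening position, this yields the familiar four legal first moves for the player to move.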
Fig. 1. System Architecture
III. EVALUATION FUNCTION APPROXIMATION
The focus of this paper is on preference learning. Since this is a type of supervised learning, we first describe a
number of ways in which supervised approaches have been used to learn evaluation functions, for example from the
game logs of human players, and then motivate our preference-learning approach.
3.1. Game Trees
The progress of an Othello game can be represented as a tree. The nodes of the game tree represent
board situations and the branches represent how one board configuration is transformed into another, i.e. a move. The ply
of a game tree, p, is the number of levels of the tree, including the root level: a tree of depth d has p = d + 1.
This tree can then be searched to find the most promising move. It is, however, impossible to use an
exhaustive search of the game tree. The effective branching factor for Othello was found to be of the order of 7,
and the effective tree depth is 60, assuming that the majority of games end when the board is full. The number of
positions to examine is then of the order of 7^60 ≈ 5 × 10^50, far too many to evaluate. If there were an infallible way to rank the members of a
set of board situations, then it would be a simple matter to select the move which leads to the best situation
without using any search. Unfortunately, no such ranking procedure is available for Othello.
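As a quick sanity check of the magnitude quoted above, assuming the stated branching factor of 7 and depth of 60 plies:

```python
import math

branching_factor, plies = 7, 60
# log10(7^60) = 60 * log10(7), about 50.7,
# so 7**60 is roughly 5 * 10**50 -- a 51-digit number of positions.
log10_positions = plies * math.log10(branching_factor)
```

This is why the paper relies on a bounded-depth search with a static evaluator rather than exhaustive search.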
Fig. 2. Game Tree
3.2 Mini-max algorithm
Assume that we possess a function (the static evaluator) that maps a board situation to an
overall quality number (the static evaluation score). A positive number indicates an advantage to one player (the
max player), a negative number an advantage to the other (the min player), and the degree of advantage increases with the
absolute value of the number. The max player looks for a move that leads to the largest positive number and
assumes that the min player will try to force play towards situations with negative static evaluations. The
decisions of the max player take cognizance of the choices available to the min player at the next level down, and
vice-versa. Eventually the limit of the tree is reached, where static evaluation is used to select between
alternatives. This scoring information is passed up the game tree by a procedure called MINIMAX, which applies
whenever two players are involved and the available information is complete.
Pseudocode:
(* the minimax value of n, searched to depth d *)
fun minimax(n: node, d: int): int =
    if leaf(n) or d = 0 return evaluate(n)
    if n is a max node
        v := -∞
        for each child of n
            v' := minimax(child, d-1)
            if v' > v, v := v'
        return v
    if n is a min node
        v := +∞
        for each child of n
            v' := minimax(child, d-1)
            if v' < v, v := v'
        return v
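The pseudocode above can be made runnable. This Python sketch uses ∓infinity as the initial values and a tiny hand-built tree, which is an illustration only; a real player would expand Othello positions instead of reading a stored tree.

```python
import math

def minimax(node, depth, maximizing):
    """Return the minimax value of `node` searched to `depth` plies."""
    children = node.get('children')
    if not children or depth == 0:
        return node['value']          # static evaluation at the frontier
    if maximizing:
        best = -math.inf              # worst case for the max player
        for child in children:
            best = max(best, minimax(child, depth - 1, False))
        return best
    best = math.inf                   # worst case for the min player
    for child in children:
        best = min(best, minimax(child, depth - 1, True))
    return best

leaf = lambda score: {'value': score}
tree = {'children': [
    {'children': [leaf(3), leaf(5)]},   # a min node, worth 3 to max
    {'children': [leaf(2), leaf(9)]},   # a min node, worth 2 to max
]}
```

With this toy tree, the max player's best guaranteed outcome is 3: the min player holds each branch down to its smaller leaf, and max picks the better of the two branches.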
3.3 Alpha-beta pruning
Alpha-beta pruning is so named because it uses two parameters, alpha and beta, to keep track of
expectations. Whenever you discover a fact about a given node, you check what you know about its ancestor
nodes: it may turn out that no further work is required below the parent node, or that the best you can hope for
at the parent node has to be revised. Alpha-beta pruning is started at the root node with alpha set to -∞ and beta
to +∞; the procedure is then called recursively with a narrowing range between the alpha and beta values. It is
based on the idea that a branch of the game tree need not be explored further if it offers a solution that is no
better than one that has already been found.
Pseudocode:
(* the minimax value of n, searched to depth d.
 * If the value is less than min, returns min.
 * If greater than max, returns max. *)
fun minimax(n: node, d: int, min: int, max: int): int =
    if leaf(n) or d = 0 return evaluate(n)
    if n is a max node
        v := min
        for each child of n
            v' := minimax(child, d-1, v, max)
            if v' > v, v := v'
            if v > max return max
        return v
    if n is a min node
        v := max
        for each child of n
            v' := minimax(child, d-1, min, v)
            if v' < v, v := v'
            if v < min return min
        return v
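A runnable fail-hard version of the pseudocode above, where the min/max parameters play the roles of alpha and beta. The toy tree is illustrative only; note that the second leaf of the right-hand branch is never evaluated, because the cutoff fires as soon as the min node can already do no better for max than the left-hand branch.

```python
import math

def alphabeta(node, depth, alpha, beta, maximizing):
    """Fail-hard alpha-beta: the result is clamped to [alpha, beta]."""
    children = node.get('children')
    if not children or depth == 0:
        return node['value']
    if maximizing:
        v = alpha
        for child in children:
            v = max(v, alphabeta(child, depth - 1, v, beta, False))
            if v > beta:
                return beta           # beta cutoff: min would avoid this line
        return v
    v = beta
    for child in children:
        v = min(v, alphabeta(child, depth - 1, alpha, v, True))
        if v < alpha:
            return alpha              # alpha cutoff: max would avoid this line
    return v

leaf = lambda score: {'value': score}
tree = {'children': [
    {'children': [leaf(3), leaf(5)]},   # searched fully: worth 3
    {'children': [leaf(2), leaf(9)]},   # pruned after seeing the 2
]}
```

The result agrees with plain minimax on the same tree, but fewer leaves are evaluated, which is the entire point of the pruning.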
IV. STRATEGY
4.1 Coin Parity
This component of the utility function captures the difference in coins between the max player and the
min player. The return value is determined as follows:
Coin Parity Heuristic Value = 100 * (Max Player Coins – Min Player Coins) / (Max Player Coins + Min Player Coins)
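A minimal sketch of the coin-parity formula above, assuming the board is given as a flat sequence of 'B' (max player), 'W' (min player) and '.' cells; the representation is an assumption for illustration.

```python
def coin_parity(board, max_disc='B', min_disc='W'):
    """Return 100 * (max - min) / (max + min) over the two disc counts."""
    max_coins = board.count(max_disc)
    min_coins = board.count(min_disc)
    if max_coins + min_coins == 0:
        return 0.0
    return 100.0 * (max_coins - min_coins) / (max_coins + min_coins)
```

The value is normalized to the range [-100, 100], which makes it easy to combine with the other heuristics by simple weighting.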
4.2 Mobility
At each stage of the game you have to choose between the limited number of moves available to you. In
diagram 9, white has just 3 available moves or "liberties", two of which hand a corner to black straight away.
Assuming white plays to e8, black will have 13 moves available, of which 11 lead to a win with best possible
play by both sides thereafter. In this position white has poor mobility, having few moves to choose from (all
pretty bad at that), while black has good mobility, having lots of choice. As long as there is at least one
non-disastrous move for each player the game remains in balance, but if you can start to restrict the mobility of
your opponent while maintaining your own, you may be able to force them into having to make bad moves.
The mobility heuristic attempts to capture the relative difference between the number of possible moves for the max and the min
players, with the intent of restricting the opponent's mobility and increasing one's own. This value is
calculated as follows:
if (Max Player Moves + Min Player Moves != 0)
    Mobility Heuristic Value = 100 * (Max Player Moves – Min Player Moves) / (Max Player Moves + Min Player Moves)
else
    Mobility Heuristic Value = 0
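The mobility formula has the same normalized-difference shape as coin parity. In this sketch the two move counts are assumed to come from a legal-move generator that is not shown here.

```python
def mobility_heuristic(max_moves, min_moves):
    """Normalized difference of legal-move counts, in [-100, 100]."""
    if max_moves + min_moves == 0:
        return 0.0
    return 100.0 * (max_moves - min_moves) / (max_moves + min_moves)
```

For the position described above (black to move with 13 options against white's 3), black's mobility score would be 100 * (13 - 3) / 16 = 62.5 from black's point of view.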
4.3 Corners Captured
Corners hold special importance because, once captured, they cannot be flanked by the opponent. They
also allow a player to build coins around them and provide stability to the player's coins. This value is captured
as follows, using the two players' captured-corner counts:
if (Max Player Corners + Min Player Corners != 0)
    Corners Heuristic Value = 100 * (Max Player Corners – Min Player Corners) / (Max Player Corners + Min Player Corners)
else
    Corners Heuristic Value = 0
4.4 Stability
The stability measure of a coin is a quantitative representation of how vulnerable it is to being flanked.
Coins can be classified as belonging to one of three categories: (i) stable, (ii) semi-stable and (iii) unstable.
Stable coins are coins which cannot be flanked at any point in the game from the given state. Unstable
coins are those that could be flanked in the very next move. Semi-stable coins are those that could potentially be
flanked at some point in the future, but do not face the danger of being flanked in the immediately following
move. Corners are always stable, and by building upon corners, more coins in the region become stable.
Weights are associated with each of the three categories and summed to give a final stability value
for the player. Typical weights could be 1 for stable coins, -1 for unstable coins and 0 for semi-stable coins. The
stability value is calculated as follows:
if (Max Player Stability Value + Min Player Stability Value != 0)
    Stability Heuristic Value = 100 * (Max Player Stability Value – Min Player Stability Value) / (Max Player Stability Value + Min Player Stability Value)
else
    Stability Heuristic Value = 0
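A sketch of the weighted stability sum described above, assuming each disc has already been classified into one of the three categories; the classification itself (corner and edge-chain analysis) is the hard part and is not shown here.

```python
# Typical weights from the text: stable +1, unstable -1, semi-stable 0.
STABILITY_WEIGHTS = {'stable': 1, 'semi-stable': 0, 'unstable': -1}

def stability_value(classified_discs):
    """Sum the category weights over one player's classified discs."""
    return sum(STABILITY_WEIGHTS[category] for category in classified_discs)

def stability_heuristic(max_value, min_value):
    """Normalized difference of the two players' stability values."""
    if max_value + min_value == 0:
        return 0.0
    return 100.0 * (max_value - min_value) / (max_value + min_value)
```

A player with two stable discs, one unstable disc and one semi-stable disc thus has a stability value of 2 - 1 + 0 = 1.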
4.5 Frontiers
Each move is played to an empty square adjacent to an opponent's disc and flips at least one of their
discs. The discs which have empty neighbouring squares form the frontier, while those that do not are called
interior discs. The more frontier discs you have, the more choices your opponent has; likewise, a smaller set
of frontier discs restricts the number of available moves. It should be clear that minimising one's frontier is key
to winning the battle for mobility. In diagram 11, black should play a6, flipping 3 discs (rather than f7, which flips
1), as this keeps the frontier to a minimum. A move like this, which does not flip any frontier discs, is called a
quiet move and often represents good play. This suggests a refinement of the evaporation strategy in which you
try to evaporate your frontier discs while being less concerned with the total number of discs flipped at each
turn.
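Frontier discs can be counted directly from the definition above: a disc is on the frontier if at least one of its eight neighbours is empty. The plain-grid representation is an assumption for illustration; the paper's bitboard version would do this with shifts and masks instead.

```python
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
              (0, 1), (1, -1), (1, 0), (1, 1)]

def frontier_discs(board, player):
    """Count `player`'s discs that have at least one empty neighbour."""
    count = 0
    for r in range(8):
        for c in range(8):
            if board[r][c] != player:
                continue
            for dr, dc in NEIGHBOURS:
                nr, nc = r + dr, c + dc
                if 0 <= nr < 8 and 0 <= nc < 8 and board[nr][nc] == '.':
                    count += 1
                    break             # count each disc at most once
    return count
```

In the standard opening position all four centre discs border empty squares, so each side starts with a frontier of two discs.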
4.6 Openings
It is quite easy to lose control of the game in the first few moves. Play the wrong move and your
opponent will be able to restrict your choice of moves to those that work in their favour. While the concepts
discussed above may help guide your opening moves, it is worth looking at some standard openings which
appear to preserve the balance of control, at least for a while. In each position the most recently placed disc is
highlighted with a red dot. The expected difference in scores for each move was generated
with WZebra's larger opening book and a look-ahead of 24 moves. Because of the two diagonal lines of
symmetry there are at least 4 variations of each opening; however, only one of each is shown below. Where the
development does not lead to another named position, one or more "+" symbols are used to indicate progression.
These illustrated openings generally follow promising lines according to WZebra; however, the game is not
"solved" (unlike 6x6 Reversi), so these are only indications of the likely outcome for strong players following a
particular line.
4.7 Evaluation Function Score
Score = F(Board) = W1 * Eval(Stability) + W2 * Eval(Mobility) + W3 * Eval(Frontier) + W4 * Eval(Openings) + ...
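The weighted combination can be sketched as below. The weight values are placeholders for illustration, not the paper's tuned parameters, and the component names are only those discussed in the sections above.

```python
def evaluation_score(heuristics, weights):
    """Weighted sum of heuristic components; both dicts keyed by name."""
    return sum(weights[name] * value for name, value in heuristics.items())

# Illustrative component values (each already normalized to [-100, 100])
# combined with placeholder weights.
score = evaluation_score(
    {'stability': 10.0, 'mobility': 62.5, 'frontier': -20.0},
    {'stability': 30, 'mobility': 5, 'frontier': 25},
)
```

In practice the weights would be tuned, for example against a pool of opponents, and could even vary by game phase (opening, midgame, endgame).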
V. CONCLUSION
In this paper we implemented and employed algorithms sufficiently strong to perform convincingly well
against other available applications, and analysed the different approaches to conclude which
ones surpass the others and why. The application was constructed in accordance with the software-engineering
life-cycle, and possible extensions have been proposed. We also used differential preference learning to imitate
human play from game logs: using logs taken from the French Othello League, we applied preference learning in
two ways, using a standard output-negation method and using board inversion.
REFERENCES
1. Stefan Reisch (1980). "Gobang ist PSPACE-vollstandig (Gomoku is PSPACE-complete)". Acta Informatica 13:
59–66. doi:10.1007/bf00288536.
2. Wolfgang Slany: The Complexity of Graph Ramsey Games.
3. H. J. van den Herik; J. W. H. M. Uiterwijk; J. van Rijswijck (2002). "Games solved: Now and in the
future". Artificial Intelligence 134 (1–2): 277–311. doi:10.1016/S0004-3702(01)00152-7.
4. Hilarie K. Orman: Pentominoes: A First Player Win in Games of no chance, MSRI Publications – Volume 29,
1996, pages 339-344. Online: pdf.
5. See van den Herik et al for rules.
6. John Tromp (2010). "John's Connect Four Playground".
7. Michael Lachmann (July 2000). "Who wins domineering on rectangular boards?". MSRI Combinatorial Game
Theory Research Workshop.
8. Jonathan Schaeffer et al. (July 6, 2007). "Checkers is Solved". Science 317 (5844): 1518–
1522. doi:10.1126/science.1144079. PMID 17641166.
9. J. M. Robson (1984). "N by N checkers is Exptime complete". SIAM Journal on Computing, 13 (2): 252–
267.doi:10.1137/0213018.
10. See Allis 1994 for rules
11. M.P.D. Schadd, M.H.M. Winands, J.W.H.M. Uiterwijk, H.J. van den Herik and M.H.J. Bergsma (2008). "Best
Play in Fanorona leads to Draw". New Mathematics and Natural Computation 4 (3): 369–
387. doi:10.1142/S1793005708001124.
12. G.I. Bell (2009). "The Shortest Game of Chinese Checkers and Related Problems". Integers. arXiv:0803.1245.
13. Takumi Kasai, Akeo Adachi, and Shigeki Iwata (1979). "Classes of Pebble Games and Complete
Problems". SIAM Journal on Computing 8 (4): 574–586. doi:10.1137/0208046. Proves completeness of the
generalization to arbitrary graphs.
14. Mark H.M. Winands (2004). Informed Search in Complex Games (Ph.D. thesis). Maastricht University,
Maastricht, The Netherlands. ISBN 90-5278-429-9.
15. S. Iwata and T. Kasai (1994). "The Othello game on an n*n board is PSPACE-complete". Theor. Comp.
Sci. 123 (2): 329–340.doi:10.1016/0304-3975(94)90131-7.
16. Robert Briesemeister (2009). Analysis and Implementation of the Game OnTop (Thesis). Maastricht University,
Dept of Knowledge Engineering.
17. Stefan Reisch (1981). "Hex ist PSPACE-vollständig (Hex is PSPACE-complete)". Acta Inf. (15): 167–191.
18. John Tromp and Gunnar Farnebäck (2007). "Combinatorics of Go". This paper derives the bounds
48<log(log(N))<171 on the number of possible games N.
19. J. M. Robson (1983). "The complexity of Go". Information Processing; Proceedings of IFIP Congress. pp. 413–
417.
20. The size of the state space and game tree for chess were first estimated in Claude Shannon (1950). "Programming
a Computer for Playing Chess". Philosophical Magazine 41 (314). Shannon gave estimates of 10^43 and
10^120 respectively, smaller than the upper bound in the table, which is detailed in Shannon number.