The Implementation of Machine Learning in the Game of Checkers
Billy Melicher
Computer Systems Lab 08
10/29/08

Abstract
• Machine learning uses past information to predict future states
• Can be used in any situation where the past will predict the future
• Will adapt to situations

Introduction
• Checkers is used to explore machine learning
• Checkers has many tactical aspects that make it good for studying

Background
• Minimax
• Heuristics
• Learning

Minimax
• Method of adversarial search
• Every pattern (board) can be given a fitness value (heuristic)
• Each player chooses the outcome that is best for them from the choices they have
• Has exponential growth rate
• Can only evaluate a certain number of actions into the future (the ply)

Heuristic
• Heuristics predict the outcome of a board
• Fitness value of a board: the higher the value, the better the outcome
• Not perfect
• Requires expertise in the situation to create

Heuristics
• H(s) = c_0 F_0(s) + c_1 F_1(s) + … + c_n F_n(s)
• H(s) = heuristic
• Has many different terms
• In checkers, terms could be:
  • Number of checkers
  • Number of kings
  • Number of checkers on an edge
  • How far checkers are on board

Learning by Rote
• Stores every game played
• Connects the moves made for each board
• Relates the moves made from a particular board to the outcome of the board
• More likely to make moves that result in a win, less likely to make moves resulting in a loss
• Good in end game, not as good in mid game

Learning by Generalization
• Uses a heuristic function to guide moves
• Changes the heuristic function after games based on the outcome
• Good in mid game but not as good in early and end games
• Requires identifying the features that affect the game

Development
• Use of minimax algorithm with alpha-beta pruning
• Use of both learning by rote and learning by generalization
• Temporal difference learning

Temporal Difference Learning
• In temporal difference learning, you adjust the heuristic based on the difference between the heuristic at one time and at another
• Equilibrium moves toward ideal function
• U(s) ← U(s) + α(R(s) + γU(s') − U(s))
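
The temporal difference update above can be sketched in a few lines of Python. This is a minimal illustration, not the author's implementation: the function names, the dictionary used as a value table, and the default values of α and γ are all assumptions. The second function applies the same error term to the coefficients c_i of a linear heuristic H(s) = c_0 F_0(s) + … + c_n F_n(s), which is one common way (assumed here, not stated in the slides) to combine the update with learning by generalization.

```python
from typing import Dict, Hashable, List, Optional, Sequence


def td_update(U: Dict[Hashable, float], s: Hashable, s_next: Optional[Hashable],
              reward: float, alpha: float = 0.1, gamma: float = 0.9) -> float:
    """Tabular update: U(s) <- U(s) + alpha * (R(s) + gamma * U(s') - U(s)).

    Unseen states default to 0.0; s_next=None marks the end of a game.
    Returns the temporal difference error that was applied.
    """
    u_s = U.get(s, 0.0)
    u_next = U.get(s_next, 0.0) if s_next is not None else 0.0
    td_error = reward + gamma * u_next - u_s
    U[s] = u_s + alpha * td_error
    return td_error


def td_update_linear(weights: Sequence[float], feats: Sequence[float],
                     feats_next: Optional[Sequence[float]], reward: float,
                     alpha: float = 0.1, gamma: float = 0.9) -> List[float]:
    """Same error term, applied to the coefficients of a linear heuristic.

    Each weight moves in proportion to its own feature value (assumed scheme).
    """
    h = sum(c * f for c, f in zip(weights, feats))
    h_next = 0.0
    if feats_next is not None:
        h_next = sum(c * f for c, f in zip(weights, feats_next))
    td_error = reward + gamma * h_next - h
    return [c + alpha * td_error * f for c, f in zip(weights, feats)]
```

Calling `td_update` once per move while replaying a finished game, with the reward set only on the final position, pulls the value of each board toward the value of the board that followed it.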
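
The minimax search with alpha-beta pruning listed under Development, together with the linear heuristic H(s), can be sketched as follows. This is an illustrative example under stated assumptions, not the project's code: the toy game tree, the two features (checker difference and king difference), and the weights are hypothetical, and a real checkers program would generate child boards from the rules of the game.

```python
from typing import Callable, Dict, List, Sequence, Tuple


def linear_heuristic(weights: Sequence[float], features: Sequence[float]) -> float:
    """H(s) = c_0*F_0(s) + ... + c_n*F_n(s)."""
    return sum(c * f for c, f in zip(weights, features))


def alphabeta(state: str, depth: int, alpha: float, beta: float, maximizing: bool,
              children: Callable[[str], List[str]],
              evaluate: Callable[[str], float]) -> float:
    """Depth-limited minimax with alpha-beta pruning."""
    kids = children(state)
    if depth == 0 or not kids:
        return evaluate(state)  # ply limit reached or no moves left
    if maximizing:
        best = float("-inf")
        for child in kids:
            best = max(best, alphabeta(child, depth - 1, alpha, beta, False,
                                       children, evaluate))
            alpha = max(alpha, best)
            if alpha >= beta:
                break  # the minimizing player will never allow this branch
        return best
    best = float("inf")
    for child in kids:
        best = min(best, alphabeta(child, depth - 1, alpha, beta, True,
                                   children, evaluate))
        beta = min(beta, best)
        if alpha >= beta:
            break  # the maximizing player already has a better option
    return best


# Hypothetical two-ply tree: the root player moves, then the opponent replies.
TREE: Dict[str, List[str]] = {"root": ["a", "b"], "a": ["a1", "a2"], "b": ["b1", "b2"]}
# Hypothetical features per leaf: (checker difference, king difference).
FEATURES: Dict[str, Tuple[float, float]] = {
    "a1": (1, 0), "a2": (0, 1), "b1": (-1, 0), "b2": (2, 1),
}
WEIGHTS = (1.0, 3.0)  # assumed coefficients c_0, c_1

visited: List[str] = []  # records which boards were actually evaluated


def evaluate(state: str) -> float:
    visited.append(state)
    return linear_heuristic(WEIGHTS, FEATURES.get(state, (0, 0)))


value = alphabeta("root", 2, float("-inf"), float("inf"), True,
                  lambda s: TREE.get(s, []), evaluate)
```

In this toy tree the leaf `b2` is never evaluated: once the reply `b1` shows that move `b` is worse for the root player than move `a`, the rest of that branch is pruned.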