Programming Assignment 3
Tic-Tac-Toe
By:
Jason Wentland
Brent Ellwein
Instructions
Here is a basic tutorial for using our program. The menus and selections
are generally self-explanatory, so it is not necessary to follow this tutorial strictly.
1. When the program is first run, you will see the main menu screen; from here you
can navigate anywhere in the program. The first step is to create players for the
system. You should create both a human and an AI player (unless you want to do
the training yourself – this is NOT suggested).
2. Next, you can enter the training room.
3. The first step is to set up the training options. You may also alter a linear fitness
function. Many of these options are described in more detail beyond this tutorial.
4. Once the training options have been initialized, you need to assign an initial
configuration to each member of the population.
5. If you would like to view the results of the training, you can specify an output file
which will show you the record of each child created during training.
6. Finally, you can execute the training. Once it is complete, you will be asked to name
the child in the last generation with the highest fitness. You can then run more
training, or you can return to the main menu to play a game.
7. From the main menu, select play a game and then choose two players to compete. Player
1 is ‘X’ and player 2 is ‘O.’
8. When it is your turn, select a square according to the numeric keypad layout below
(a mapping sketch follows this list).
7 8 9
4 5 6
1 2 3
9. At the conclusion of the game you will be asked to play another.
10. When you have finished playing games, you may view the record for any player.
This includes the AI trainer, any human players, and any generated players.
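For reference, selecting a square simply maps the keypad digits onto the board grid shown in step 8. The lookup table below is an illustrative Python sketch; the (row, column) indexing is our assumption, not necessarily how the program stores the board.

    # Keypad digit -> board cell (row, column), with row 0 at the top.
    # The (row, column) indexing is an illustrative assumption.
    KEYPAD_TO_CELL = {
        7: (0, 0), 8: (0, 1), 9: (0, 2),
        4: (1, 0), 5: (1, 1), 6: (1, 2),
        1: (2, 0), 2: (2, 1), 3: (2, 2),
    }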
How it works
Our approach to the tic-tac-toe problem was to split the game into a set of states
and to base the decision for the next move on the current state. Because there are 9 squares
on a board, each square with 3 possible configurations, there are 3^9, or 19,683, possible
board configurations. By eliminating invalid, rotationally equivalent, and symmetric configurations,
we reduced this number to 593. By associating each of these states with a next move, the
genetic players would know what to do in each situation. When created, each genetic
player is randomly assigned a valid next-move for each state. During evolution, the
genetic children are assigned a next move for each state from one of their parents. If
mutation occurs, then a valid next move is randomly assigned for a given state.
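To make the state reduction concrete, the sketch below (in Python, not the program's actual code) folds the 19,683 raw boards into symmetry classes by taking the smallest of each board's 8 rotations and reflections. The validity check shown (X moves first, so X has as many marks as O, or one more) is only a first cut; the further rules we used to reach 593 states are not reproduced here.

    from itertools import product

    # Board cells are indexed row-major: 0 1 2 / 3 4 5 / 6 7 8.
    # ROT maps a board to its 90-degree clockwise rotation and FLIP to its
    # left-right mirror image, via new_board[i] = old_board[PERM[i]].
    ROT = (6, 3, 0, 7, 4, 1, 8, 5, 2)
    FLIP = (2, 1, 0, 5, 4, 3, 8, 7, 6)

    def transforms(board):
        # Yield all 8 rotations/reflections of a board (a tuple of 9 cells).
        b = board
        for _ in range(4):
            yield b
            yield tuple(b[i] for i in FLIP)
            b = tuple(b[i] for i in ROT)

    def canonical(board):
        # The smallest transform represents the whole symmetry class.
        return min(transforms(board))

    def plausible(board):
        # X (1) moves first, so X has as many marks as O (2), or one more.
        x, o = board.count(1), board.count(2)
        return x == o or x == o + 1

    # Fold all 3**9 = 19683 raw boards into symmetry classes.
    states = {canonical(b) for b in product((0, 1, 2), repeat=9) if plausible(b)}
    print(len(states))  # still above 593; extra validity rules shrink it further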
Training occurs by identifying an initial population and a tutor AI. Each member
of the initial population plays a number of games against the tutor (alternating who goes
first), and the results are tallied. The tallied records are then used to determine the fitness
of each player and finally the probability for genetic selection. Once a new generation
has been spawned, the process repeats.
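One generation of that loop might look like the sketch below. The play_game helper (returning 'win', 'draw', or 'loss' from the member's point of view) and the mate helper are hypothetical stand-ins for the program's actual routines, passed in as parameters.

    import random

    def train_generation(population, tutor, games_per_member,
                         fitness_fn, play_game, mate):
        # Tally each member's record against the tutor, alternating who
        # moves first each game.
        fitnesses = []
        for member in population:
            tally = {"win": 0, "draw": 0, "loss": 0}
            for g in range(games_per_member):
                result = play_game(member, tutor, member_first=(g % 2 == 0))
                tally[result] += 1
            fitnesses.append(fitness_fn(tally))
        # A member's fitness sets its probability of being picked as a parent.
        children = []
        for _ in range(len(population)):
            a, b = random.choices(population, weights=fitnesses, k=2)
            children.append(mate(a, b))
        return children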
Tutors
We tried several different types of tutor AIs. The first, a completely random
player, was unsuccessful. Second, we tried training the genetic players against another genetic
player. This resulted in the children evolving very quickly (in as little as 20 generations) to
beat the static trainer, but because they saw only one permutation of the game, they were
very unsuccessful against other opponents. Finally, we developed a ‘smart’ random
player. This player looks for any move that would win or block its opponent from
winning. If no such move is found, the AI picks a random open square. Training against
this AI yielded the best results. As an alternative to a single tutor, we experimented with
having the children play against each other, but found that this was not as good as the
smart AI.
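A minimal sketch of that ‘smart’ random player, assuming the board is a list of 9 cells (0 for empty) in row-major order:

    import random

    LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
             (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
             (0, 4, 8), (2, 4, 6)]              # diagonals

    def smart_random_move(board, me, opponent):
        empties = [i for i, cell in enumerate(board) if cell == 0]
        # Look for a winning move first, then for a blocking move.
        for mark in (me, opponent):
            for i in empties:
                board[i] = mark
                wins = any(all(board[j] == mark for j in line) for line in LINES)
                board[i] = 0
                if wins:
                    return i
        # Otherwise fall back to a random open square.
        return random.choice(empties)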
Fitness
The fitness function has a great deal of influence on the development of players
using genetic algorithms. We experimented with several models for a fitness function.
The first was a linear transformation of the record, of the form Fitness = A*wins +
B*draws + C*losses, where A, B, and C can take any real value. Second, we
experimented with quadratic transformations of the record, for example Fitness = A*losses +
B*wins^2. By altering the relationships between wins, losses, and draws, we
were able to encourage players that would try to draw or try to win. Overall, we found
that the best players evolved when the fitness function was linear and losses
were not taken into account.
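In code, the two families of fitness functions look roughly like this (the tally dictionary is our assumed record format, not the program's):

    def linear_fitness(tally, a, b, c):
        # Fitness = A*wins + B*draws + C*losses; our best players came from
        # linear functions that ignored losses (C = 0).
        return a * tally["win"] + b * tally["draw"] + c * tally["loss"]

    def quadratic_fitness(tally, a, b):
        # One quadratic variant: Fitness = A*losses + B*wins^2.
        return a * tally["loss"] + b * tally["win"] ** 2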
Mutation and population size
Our interface allows the specification of both a mutation factor and the size of the
population. The mutation factor is inversely related to the probability of a mutation
occurring on a given game state for a given child. For moderate population sizes, we have
found that a good value for the mutation factor is somewhere around 30,000. The size of
the population also affects the variance in the moves made by children. A small initial
population relies almost entirely upon mutation to introduce new moves, while a large,
varied initial population will learn well even with a very large mutation factor. In short, the
size of the initial population affects the need for mutation. Because any given state allows at
most 9 possible moves, starting populations of approximately 10 should contain all
possible choices. However, since these choices may be hidden in ‘bad’ players,
their information may be lost during selection, so some mutation is still necessary.
Generating children
We experimented with two different methods of generating children. First, two
parents can generate one child. For each state, this method simply takes the move from one
parent chosen at random. Second, we have an option where each pair of parents generates 2
children: for each state, one child takes the move from one parent and the other child
takes it from the other parent, making the children ‘opposite’ of each other. We were able to
obtain successful results using both methods, but the single child per pair of parents
seemed to work better. Finally, there is an option allowing some of the parents to survive
into the next generation. The number of fittest parents specified will be carried through
unchanged to the next generation. This number may be set to zero for ‘normal’ genetic
reproduction. (Note: this effect can also happen randomly when a parent is selected to
mate with itself; the parent carryover simply ‘encourages’ it to happen.)
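Both mating options are easy to sketch if each player is represented as a dictionary mapping board states to next moves (our assumed representation):

    import random

    def mate_single(parent_a, parent_b):
        # One child per pair: each state's move comes from a random parent.
        return {s: random.choice((parent_a[s], parent_b[s])) for s in parent_a}

    def mate_opposites(parent_a, parent_b):
        # Two 'opposite' children: whichever parent one child inherits a
        # state's move from, the sibling inherits from the other parent.
        child1, child2 = {}, {}
        for s in parent_a:
            if random.random() < 0.5:
                child1[s], child2[s] = parent_a[s], parent_b[s]
            else:
                child1[s], child2[s] = parent_b[s], parent_a[s]
        return child1, child2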
Results
The following attachments are graphs of some of our test results. We tried many
variations, and the following seemed to come out the best. A short description and the
settings used are on each graph.
population: 20
number of games: 100
generations: 250
tutor: smarterAI
mutation factor: 1000000000
mating option: single child
# of parents to survive: 10
fitness function drawWeight: 0
fitness function winWeight: 1
Here we tried to see what would happen if we weighted only the wins; we wanted to
get the AI to learn to win. It did learn fairly quickly during the first twenty or so
generations, but after that it didn’t get better. Because losses were weighted the same as
draws, it didn’t try to get draws over losses.
population: 20
number of games: 100
generations: 200
tutor: smarterAI
mutation factor: 300000
mating option: single child
# of parents to survive: 5
fitness function drawWeight: 4
fitness function winWeight: 1
The smartest players playing each other will always end a game in a draw. Here we tried
to make a player that did well at ending games in a draw. It was a fairly good AI to play
against.
population: 20
number of games: 100
generations: 200
tutor: smarterAI
mutation factor: 1000000000
mating option: single child
fitness function drawWeight: 1
fitness function winWeight: 2
These are fairly early results from a pretty standard setup. Parents die off, and wins are worth more
than draws. You can see that it does learn not to lose, but that’s about it. It ends up at about
one third each for wins, draws, and losses.
population: 100
number of games: 100
generations: 100
tutor: stupidAI
mutation factor: 1000000
mating option: single child
# of parents to survive: 50
fitness function drawWeight: 1
fitness function winWeight: 2
Playing against this AI after training showed it to be the smartest one we’ve made. Even
though it was trained against the random AI player, it did well against us as human
players. This is probably because it has a better chance of learning all possible game
states. Since it is easier to win against a stupid player, the ones who lose are weeded out pretty
quickly. This run was done after the feature allowing parents to survive was added. That feature
increased learning performance, because information a player had just learned
was no longer lost in the next round to a stupid child that can easily beat a stupid AI
trainer.
population: 100
number of games: 200
generations: 200
tutor: stupidAI
mutation factor: 1000000
mating option: single child
# of parents to survive: 50
fitness function drawWeight: 1
fitness function winWeight: 2
In an attempt to get even better results after the previous test using the random AI player
as the tutor, we ran this test with double the games and double the generations to see if
our genetic AI player would improve further. From this graph you can see it improved
slightly, mainly due to the longer run with more generations.
population: 100
number of games: 200
generations: 150
tutor: stupidAI
mutation factor: 1000000
mating option: single child
# of parents to survive: 50
fitness function drawWeight: 1
fitness function winWeight: 1
We decided that the number of losses was too high and wondered if we could reduce
it further. Here we ran the same test as the previous one, but made wins and draws worth
the same amount, hoping to encourage draws instead of losses. We also ran it for only
150 generations, as the previous run didn’t seem to do anything interesting past about 120
generations. This turned out to be our best AI. It played pretty well against us. It still
wasn’t perfect, but we think it would be extremely hard to generate a perfect player
genetically.