Tic Tac Toe
An Artificial Neural Network approach
CS539 Artificial Neural Network and Fuzzy Logic
Final Project
December 20, 2005
Justin Herbrand
Introduction
Getting a game to react to its player has always been a hard problem for game
programmers. Let's face it: a good game that doesn't challenge the user's ability
doesn't keep the user around very long. This applies to any kind of game. Board games
are never fun when the opponent never learns or catches on. With today's computers
always advancing, programmers are always looking for new ways to make a video game
more interesting and challenging for the user, and one solution they are turning to is
some form of Artificial Intelligence.
The idea behind artificial intelligence in games is that when it detects the user
starting to master the game, it should increase the difficulty to make it harder for the
player to keep advancing. Most people who buy games don't want something they will
beat in a few minutes and then throw on a shelf. People normally want a game that will
entertain them for months, so that when they have free time to spare later on, they can
pick the game up and still be challenged and entertained by it.
One way people are building artificial intelligence is with neural networks.
Neural networks are showing up everywhere, from entertainment and gambling to
security.
[Figure: a single artificial neuron with inputs and an output.]
The basic idea behind a neural network is shown in the picture above. What you
see there is one neuron. It takes in values through its inputs, computes a result from
what it sees, and passes that result out through its output. But people don't build just
one neuron to solve a complex problem; we network many neurons together so that the
group computes a desired output, as in the picture below.
[Figure: a network of connected neurons.]
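The neuron just described can be sketched in a few lines of code. The project itself is written in MatLab, but for illustration here is a minimal Python version; the sigmoid activation and the particular weights are assumptions, since the report does not fix them:

```python
import math

def neuron(inputs, weights, bias):
    """One neuron: a weighted sum of its inputs squashed by a sigmoid."""
    total = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1.0 / (1.0 + math.exp(-total))

def layer(inputs, weight_rows, biases):
    """Networking neurons together: a layer is several neurons that all
    read the same inputs and each produce one output."""
    return [neuron(inputs, w, b) for w, b in zip(weight_rows, biases)]
```

Feeding one layer's outputs into the next layer as inputs gives the multilayer networks (MLPs) used later in this report.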
This is the very basic idea going on behind the scenes of a game. In this
project, I apply a neural network to learn the game of tic tac toe so people can play
against the computer when they are looking for something to do. Granted, there are
simple algorithms out there that can compute what the player's next move is going to be,
but those algorithms can be too hard for the user to beat, and that is not very entertaining
for them. The reason people can write an algorithm for this type of game is the small
size of the game: with the amount of memory a computer has, it can store every possible
move the user can make, follow whatever the user does, and make sure to take the
important spots so he or she doesn't win.
Rules
For people who don't play tic tac toe very much, the game is relatively simple.
Each player has a distinct mark (usually X or O) and tries to get three of that mark in a
row. There are eight possible ways a player can do that (three horizontal, three vertical,
and two diagonal). The player who accomplishes this first wins the game.
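Those eight lines can be checked directly in code. A short Python sketch (the flat list layout, with squares numbered 0-8 row by row, is an assumption about the board representation):

```python
# The eight winning lines on a 3x3 board whose squares are indexed 0..8:
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # three horizontal
         (0, 3, 6), (1, 4, 7), (2, 5, 8),   # three vertical
         (0, 4, 8), (2, 4, 6)]              # two diagonal

def winner(board):
    """Return 'X' or 'O' if that player has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] != ' ' and board[a] == board[b] == board[c]:
            return board[a]
    return None
```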
When playing the game I developed in MatLab, you are asked where to place
your mark on the board using the corresponding numbers shown above it. It is done this
way because it was easier to put the information into matrix form and pass it from
function to function to be modified. You simply pick a different number every time it is
your turn and try to get three in a row before the computer does.
Building
The game was built with training in mind: the neural network should see the
most useful training moves, so that it didn't have to learn every combination there is in
the game. With that in mind, I built up a set of board combinations that the neural
network needed to see in order to do fairly well when deciding where to make a move.
The number of distinct positions a player might see in the first three to four moves is
304. This seems very few given that there are 3^9 = 19,683 ways to fill the board, but
not every combination is needed: a board full of X's, for example, means that the players
didn't take turns. The set shrinks further when you treat board combinations as identical
whenever rotating or flipping the board produces the same position. If you want more
information about this topic, visit
http://www.mathrec.org/old/2002jan/solutions.html for more information.
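The rotations and flips mentioned above are the eight symmetries of the square. A Python sketch of reducing a position to one canonical representative under those symmetries (the flat nine-element board layout is an assumption; the project's MatLab data files may store positions differently):

```python
def rotate(b):
    """Rotate a flat 3x3 board 90 degrees clockwise."""
    return [b[6], b[3], b[0], b[7], b[4], b[1], b[8], b[5], b[2]]

def reflect(b):
    """Mirror the board left to right."""
    return [b[2], b[1], b[0], b[5], b[4], b[3], b[8], b[7], b[6]]

def canonical(board):
    """The smallest of the eight rotations/reflections: two boards that
    are the same position up to symmetry share one canonical form, so
    only one of them needs to appear in the training set."""
    variants = []
    b = list(board)
    for _ in range(4):
        variants.append(tuple(b))
        variants.append(tuple(reflect(b)))
        b = rotate(b)
    return min(variants)
```

Collecting the canonical forms of all reachable early positions is what cuts the count down toward the 304 figure above.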
Another consideration was how to represent the data to the neural network. The
board is represented by placing a zero in every square where a move is still allowed, -1
for the user's moves, and 1 for the computer's moves. This way, it is easy to store a
board combination in an array and send it to the network to determine the next move to
be made. When the array is sent off to the neural network, the result that comes back is
a single array containing one 1, marking where the computer wants to make its move.
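A Python sketch of that coding. The report fixes the numeric scheme (-1 user, 0 open, +1 computer) and the one-hot output, but which of X and O belongs to the computer is an assumption here:

```python
def encode_board(board):
    """Map a board of 'X'/'O'/' ' cells to the -1/0/+1 vector fed to the
    network. Assumption: 'O' is the computer (+1), 'X' the user (-1)."""
    coding = {' ': 0, 'X': -1, 'O': 1}
    return [coding[cell] for cell in board]

def decode_move(output):
    """The network's answer is a length-9 vector with a single 1 in it;
    its position is the square the computer wants to take."""
    return output.index(1)
```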
Another important factor to determine is the learning rate and momentum that
back propagation requires for teaching the network. These are pretty much determined
by experimentation. I played around with the variables but didn't get very far when they
were changed, so the learning rate was left at .1 and the momentum was left at .8.
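The role these two parameters play can be shown with the standard momentum update rule for a single weight (a generic sketch of the rule, not the project's MatLab code):

```python
def update_weight(w, grad, prev_delta, learn=0.1, mom=0.8):
    """One back-propagation step with momentum: the new weight change
    blends the current error gradient (scaled by the learning rate)
    with the previous change (scaled by the momentum), which smooths
    the descent. Returns the new weight and the change applied."""
    delta = -learn * grad + mom * prev_delta
    return w + delta, delta
```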
One more big parameter that needed to be messed around with is how many
neurons the network requires. There is no scientific way of determining this, so it was
basically determined by trial and error: if the network showed smaller error with more
neurons, I kept adding neurons until the error went the wrong way. Neurons can be
configured either as a small number of neurons per layer or as massive layers. A lot of
different network depths were considered because, when I was testing the reliability of
the networks, I began to find that it was much harder to find a network that would fill in
open spots on the board than to find networks that kept picking spots that were already
taken.
Data collection and results
There were a lot of different configurations that I tried throughout this project.
The first couple of neural network configurations were found in other people's projects,
where they said the setups worked out well for them.
Learn = .1
mom = .8
Network level 9-9-9
Cmat =
     0     0     1     0     0     6    15     2     0
     0     0     2     1     0     1     7     2     0
     0     0    25     2     1     0    13     1     0
     0     0     1    37     3     2     5     0     1
     0     0     5     3    77     1    14     0     0
     0     0     5     4     1     7    11     1     0
     0     0     1     3     1     0    62     3     0
     0     0     7    11     0     1    13     4     0
     0     0     4     8     3     5    15     0    31
crate =
    60
[Training error plot (epoch size = 100): error axis 0 to 90, epoch axis 0 to 1000.]
Learn = .1
mom = .8
9-48-9
Cmat =
    15     0     0     0     1     1     7     0     0
     0     4     0     0     1     2     6     0     0
     0     0    27     0     0     2    12     0     1
     1     0     0    23     6     6    11     0     2
     0     0     0     0    92     1     7     0     0
     0     0     0     1     4    18     6     0     0
     0     0     0     0     3     1    66     0     0
     0     0     0     0     2     2    10    22     0
     1     0     0     0     4     2    11     0    48
crate =
77.7778
What I was mainly looking for in these first experiments was a high
classification rate with a good confusion matrix, so I could see the network making
more right decisions than wrong ones. But whenever I thought I had found a good rate,
I would test the network against the game itself and find that it wasn't very good at
predicting good moves. The first couple of neural networks spat out the same result for
every move instead of making a decision from the board I presented to them.
After these first couple of trials, I started going back through all the data, which I
had originally created by hand, basically by recording my own moves from games that I
played on the internet against an already-made AI. What I found was that the board
combinations the network was confused about were moves that were not in the data set.
If you look through my training files, there are three different files. The first is
tictactoe.txt. It only has 100 test points in it, and I figured that maybe it didn't have
enough points to support good decisions. That is when I introduced the idea of
generating the necessary board combinations that I discussed at the beginning of this
paper. The new file I used is called newpositions.txt.
With the new data, I observed that the error graph oscillated a lot more than it
used to. I also found, when I started testing the new neural networks, that they made
better decisions than before but still weren't performing very well in a normal game. So
I combined all the data together into one file called combodata.txt and started
experimenting more with the different layers.
After about fifteen different neural networks, I started to see a trend in the data:
instead of trying to pack all the neurons into a few hidden layers, I spread them out over
more hidden layers, and the MLP started to show the decision making I was looking for.
Even so, at this point of the project I was just looking for a network that would not try to
take a position that was not available.
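One way to guarantee that property, however well or badly the network trains, is to take the highest network output only among squares still coded 0. This is a Python sketch of that idea, not necessarily what the project's MatLab code does:

```python
def pick_move(outputs, board_vector):
    """Choose the computer's move from the raw network outputs, but only
    among squares still empty (coded 0 in the board vector). This guards
    against exactly the failure described above: the MLP preferring a
    square that is already taken."""
    legal = [i for i, cell in enumerate(board_vector) if cell == 0]
    return max(legal, key=lambda i: outputs[i])
```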
The best network that I got working with playable results was a 9-20-20-20-9 network.
[Training error plot (epoch size = 100): error axis 0 to 100, epoch axis 0 to 1000.]
It converged fairly well, and the confusion matrix had a definite diagonal down the
middle. Generally, the stronger the diagonal, the more the network was learning from its
training data, and the better it would perform. If you want to see intermediate data
samples, look at the end of my report for more data.
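For reference, the crate values in the listings are the percentage of correctly classified test points, i.e. the diagonal of Cmat measured against the matrix as a whole. A Python sketch (the exact denominator the MatLab scripts use is an assumption; here it is simply the sum of all entries):

```python
def classification_rate(cmat):
    """Percent of test points on the diagonal of a square confusion
    matrix: correctly classified points over all points."""
    correct = sum(cmat[i][i] for i in range(len(cmat)))
    total = sum(sum(row) for row in cmat)
    return 100.0 * correct / total
```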
Conclusion and Observations
The network developed in this project is not a very good one for competing
against anyone. When I tested the network against myself, I went relatively easy on the
computer because I wanted to see whether it would spot openings on the board that it
should jump on. The MLP would sometimes pick a spot that was already taken, but I
figure a lot of its decision making comes down to the amount of data in my training file.
Even though it kept training over and over on the same data, that doesn't mean it is
going to learn something that it never saw.
Another thing that I observed during my testing sessions was the way my error
graphs liked to jump around. When the graph had settled to a small error, it would still
make little hops every once in a while. I figured that might be due to a bad test point, so
I scanned over the data file to see if there were any bad points in it. I found one, and
removing it made the graph jump less often; I looked for more but didn't find any.
But to wrap it all up, the neural network that I developed isn't very good. I know
it can do better, because there are programs out there that play better and you don't have
to be gentle with them either. Since I am short on time, I'll have to work with what I've
got and present what works.
Game start up
To start the game with the current neural network, just type tictactoe in the
directory where the file is found. If you want to try to improve the network, just type
bp to run the back propagation program, start the teaching, and see how it performs.
Matlab Files
bp.m
bpconfig.m
bpdisplay.m
bptest.m
bptestap.m
checkboard.m
combodata.txt
cvgtest.m
fuzzymove.m
mlpconfig.mat
newpositions.txt
nextmove.m
partunef.m
Poistions.str
Randomize.m
Rsample.m
Scale.m
Sline.m
Tictactoe.m
Tictactoe.mat
Tictactoetrain.txt
References
http://www.colinfahey.com/2003apr20_neuron/index.htm#example_xor
http://www.adit.co.uk/html/neural_network_objectives.html
http://www.mathrec.org/old/2002jan/solutions.html
Data
Learn = .1
mom = .8
9-10-10-9
Cmat =
     0     0    12     0     0     1     3     1     7
     0     0     5     0     0     1     2     3     2
     0     0    31     0     0     1     4     0     6
     0     0     2    36     0     4     2     1     4
     0     0     5     0    79     6     3     0     7
     0     0     1     1     3    18     1     1     4
     0     0     2     1     0     3    57     0     7
     0     0     3     1     0     1     3    28     0
     0     0     2     1     1     1     5     1    55
crate =
75.0617
Learn = .1
mom = .8
9-20-20-9
Cmat =
18 5 0 0 1 0 0 0 0
1 11 0 0 1 0 0 0 0
0 6 34 0 1 0 0 0 1
0 3 0 43 1 1 0 1 0
0 8 0 0 92 0 0 0 0
0 3 0 1 1 23 0 1 0
0 8 0 0 2 0 60 0 0
0 3 1 2 1 0 0 29 0
0 6 1 1 2 0 0 1 55
crate =
90.1235
Learn = .1
mom = .8
9-10-10-10-9
Cmat =
    14     0     1     1     0     0     3     1     4
     1     0     1     1     0     0     3     5     2
     1     0    30     1     0     1     2     1     6
     1     0     0    35     1     1     9     1     1
     0     0     3     1    87     0     1     1     7
     0     0     3     3     2     9     6     5     1
     0     0     1     2     0     1    55     1    10
     0     0     0     1     0     0     3    31     1
     1     0     1     1     2     0     5     1    55
crate =
78.0247
[Training error plot (epoch size = 100): error axis 0 to 100, epoch axis 0 to 1000.]
Learn = .1
mom = .8
9-20-20-20-9
Cmat =
    18     0     1     1     3     0     0     0     1
     1     9     1     1     0     0     0     0     1
     0     0    35     1     3     0     0     2     1
     0     0     1    44     2     0     0     0     2
     0     0     1     1    96     0     0     0     2
     0     0     1     2     0    25     0     0     1
     0     0     1     1     6     0    60     1     1
     0     0     1     1     0     0     0    33     1
     0     0     2     1     3     0     0     1    59
crate =
93.5802
[Training error plot (epoch size = 100): error axis 0 to 100, epoch axis 0 to 1000.]
Learn = .1
mom = .8
9-10-10-10-10-9
Cmat =
    12     0     1     1     5     0     3     1     1
     1     1     4     2     1     0     4     0     0
     1     0    18     4     8     0     9     1     1
     0     0     1    35     4     0     9     0     0
     3     0     1     3    75     1    12     0     5
     1     1     4     5     3     6     7     1     1
     2     0     2     2     7     1    51     0     5
     1     0     7     5     4     1     9     9     0
     2     0     2     3     6     0     5     2    46
crate =
   62.4691
Learn = .1
mom = .8
9-20-20-20-20-9
Cmat =
    15     0     0     0     0     0     5     1     3
     2     3     1     0     0     0     4     2     1
     1     0    28     1     1     0     4     3     4
     2     0     2    38     0     2     2     2     1
     1     0     0     1    88     1     5     1     3
     1     0     1     0     2    21     2     2     0
     1     0     0     2     2     1    61     2     1
     1     0     0     2     0     0     1    30     2
     2     0     1     0     3     1     5     2    52
crate =
   82.9630
[Training error plot (epoch size = 100): error axis 0 to 100, epoch axis 0 to 1000.]
SUCCESS AND NO REPEATS
Learn = .1
mom = .8
9-10-10-10-10-10-9
Cmat =
     0     0     5     0     0     0     5     9     5
     0     0     6     0     1     0     1     4     1
     0     0    20     4     5     0     5     6     2
     0     0     5    35     2     0     3     4     0
     0     0     2     4    81     0     6     5     2
     0     0    11     4     3     0     3     6     2
     0     0     6     3     8     0    41    10     2
     0     0     8     3     0     0     2    23     0
     0     0     2     0     8     0     4    10    42
crate =
59.7531
[Training error plot (epoch size = 100): error axis 0 to 90, epoch axis 0 to 500.]
Learn = .1
mom = .8
9-20-20-20-20-20-9
Cmat =
    10     0     0     1     5     0     4     2     2
     2     0     0     2     1     0     5     0     3
     2     0     7     4    14     0     5     0    10
     2     0     0    27     9     0     8     1     2
     1     0     2     4    85     0     4     2     2
     0     0     2     2    16     0     5     1     3
     1     0     0     1    27     0    31     3     7
     0     0     0     6     5     0     4    14     7
     0     0     0     1    22     0     7     4    32
crate =
   50.8642
[Training error plot (epoch size = 100): error axis 0 to 90, epoch axis 0 to 1000.]
Learn = .1
mom = .8
9-20-20-20-20-20-9
Cmat =
     0     0     0     0    14     0     0    10     0
     0     0     0     0     9     0     0     4     0
     0     1     0     0    27     0     0    14     0
     0     1     0     0    42     0     0     6     0
     0     6     0     0    65     0     0    29     0
     0     2     0     0    23     0     0     4     0
     0     1     0     0    54     0     0    15     0
     0     1     0     0    21     0     0    14     0
     0     2     0     0    40     0     0    24     0
crate =
19.5062
[Training error plot (epoch size = 100): error axis 0 to 140, epoch axis 0 to 450.]
learn = .3
mom = .8
Cmat =
     0     0     0     1     2     1     0    20     0
     0     0     0     2     2     0     0     9     0
     0     0     0     9     3     0     0    30     0
     0     0     0    13    14     0     0    22     0
     0     0     0    15    31     0     0    53     1
     0     0     0     4    10     0     0    15     0
     0     0     0    15    10     0     0    42     3
     0     0     0     5     9     0     0    21     1
     0     0     0    14    17     0     0    35     0
crate =
16.0494
[Training error plot (epoch size = 100): error axis 40 to 180, epoch axis 0 to 1000.]
AND FOR ANY OTHER DIFFERENT LEARNING AND MOMENTUM
PARAMETERS.