Tic Tac Toe: An Artificial Neural Network Approach
CS539 Artificial Neural Network and Fuzzy Logic, Final Project
December 20, 2005
Justin Herbrand

Introduction

Making a game react intelligently to its user has long been a hard problem for game programmers. Let's face it: a game that does not challenge the user's ability to play does not keep the user around very long. This applies to any kind of game; board games are never fun when the opponent never learns or catches on. With computers always advancing, programmers keep looking for new ways to make a game more interesting and challenging for the user, and one solution they are turning to is artificial intelligence. The idea behind artificial intelligence in games is that when it senses the user starting to master the game, it should increase the difficulty to make it harder for the player to advance. Most people who buy a game are not looking for something they can beat in a few minutes and then throw on a shelf; they want a game that will entertain them for months, so that whenever they have free time later on they can pick it up and still be challenged and entertained.

One way people are building such artificial intelligence is with neural networks. Neural networks are showing up everywhere, from entertainment and gambling to security. The basic idea behind a neural network is shown in the picture above: a single neuron takes in input, computes something from what it sees on its inputs, and passes the result on through its output. But people do not build one single neuron to solve a complex problem; we network many neurons together to compute a desired output. See the picture below.
And this is the very basic idea going on behind the scenes of such a game. In this project, I apply a neural network to learn the game of tic tac toe, so people can play against the computer when they are looking for something to do. Granted, there are simple algorithms that can compute what the player's next move is going to be, but those algorithms can be too hard for users to beat, and that is not very entertaining for them. The reason an algorithm can be written for this type of game at all is the game's small size: with the amount of memory a computer has, it can store all the possible moves the user can make, follow whatever the user does, and always take the important spots so he or she cannot win.

Rules

For people who do not play tic tac toe very much, the game is relatively simple. Each player has a different mark (usually X or O) and tries to get three of that mark in a row. There are eight possible ways to do so (three horizontal, three vertical, and two diagonal). The first player to accomplish this wins the game. When playing the game I developed in MatLab, you are asked where to place your mark using the numbering of the board shown above. It was done this way because it was easier to put the information into matrix form and pass it between functions to be modified. You just pick a different number every time it is your turn and try to get three in a row before the computer does.

Building

The game was built with training in mind: the neural network was trained on the most useful moves, so it did not have to learn every combination in the game. With that in mind, I built up a set of board combinations that the neural network needed to see in order to make fairly good decisions about where to move.
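The eight winning lines described in the rules above are easy to enumerate. As a minimal sketch (the project itself is written in MatLab; this is an illustrative Python version using the -1/0/1 board encoding described later in this report):

```python
# Check the eight winning lines of a tic-tac-toe board.
# The board is a list of 9 cells: 0 = empty, -1 = the user's
# mark, 1 = the computer's mark (the encoding used in this report).

WIN_LINES = [
    (0, 1, 2), (3, 4, 5), (6, 7, 8),  # three horizontal lines
    (0, 3, 6), (1, 4, 7), (2, 5, 8),  # three vertical lines
    (0, 4, 8), (2, 4, 6),             # two diagonals
]

def winner(board):
    """Return -1 or 1 if that mark holds three in a row, else 0."""
    for a, b, c in WIN_LINES:
        if board[a] != 0 and board[a] == board[b] == board[c]:
            return board[a]
    return 0
```

For example, `winner([1, 1, 1, 0, 0, 0, -1, -1, 0])` reports a win for the computer via the top row.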
The number of distinct positions a player might see in the first three to four moves is 304. That seems very small given that there are 3^9 different assignments of the board, but not every combination is needed: a board full of X, for example, means that the players did not take turns. The set shrinks further when you consider board combinations that become identical when you rotate the board or flip it. If you want more information about this topic, visit http://www.mathrec.org/old/2002jan/solutions.html.

Another consideration was how to represent the data to the neural network. The board is represented with zeros in the places where a move is still allowed, -1 for the user's moves, and 1 for the computer's moves. This way, it is easy to store a board combination in an array and send it to the network to determine the next move. When the array is sent to the neural network, the result returned is a single array with one 1 in it, showing where the computer wants to make its move.

Another important factor to determine was the learning rate and momentum for back propagation. This was done essentially by experimentation: I played around with these variables but did not get very far when they were changed, so the learning rate was left at .1 and the momentum at .8. One more big parameter that needed to be messed around with was how many neurons the network required. There is no scientific way of determining this, so it basically came down to trial and error: as long as the network had smaller error with more neurons, I kept adding neurons, until the error went the wrong way.
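The reduction by rotations and flips mentioned above can be made concrete by mapping every board to a canonical representative under the eight symmetries of the square. This is a sketch of one way to do it (not code from the project; index permutations and function names are my own):

```python
# Canonicalize a tic-tac-toe board under the 8 symmetries of the
# square, so rotated/flipped duplicates collapse to one position.
# Cells are indexed 0..8, row by row.

ROT90 = [6, 3, 0, 7, 4, 1, 8, 5, 2]   # 90-degree clockwise rotation
FLIP  = [2, 1, 0, 5, 4, 3, 8, 7, 6]   # mirror left-right

def apply_perm(board, perm):
    return tuple(board[i] for i in perm)

def symmetries(board):
    """All 8 rotations/reflections of the board."""
    out, b = [], tuple(board)
    for _ in range(4):
        out.append(b)
        out.append(apply_perm(b, FLIP))
        b = apply_perm(b, ROT90)
    return out

def canonical(board):
    """Lexicographically smallest symmetry, used as the representative."""
    return min(symmetries(board))
```

Two boards that differ only by a rotation or flip get the same canonical form, so counting canonical forms instead of raw boards is what brings the position count down.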
The neurons can be configured with either a small number of neurons per layer or large numbers of neurons per layer. A lot of different network depths were considered because, while testing reliability, I began to find that it was much harder to find a network that would try to fill open spots on the board than to find one that kept picking spots that were already taken.

Data collection and results

There were a lot of different configurations that I tried throughout this project. The first couple of network configurations were found in other people's projects, which reported that they had worked out well.

Learn = .1, mom = .8, network 9-9-9: crate = 60
[9x9 confusion matrix and training error plot (epoch size = 100, 1000 epochs) omitted]

Learn = .1, mom = .8, network 9-48-9: crate = 77.7778
[confusion matrix omitted]

What I was mainly looking for in these first experiments was a high conversion rate with a good confusion matrix, so I could see that the network was making more right decisions than wrong ones. But whenever I thought I had found a good conversion rate, I would test the network in the game and find that it was not very good at predicting good moves. The first couple of networks spat out the same result for every move instead of making a decision from the board I presented.
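The conversion rate (crate) reported with each confusion matrix appears to be the fraction of correctly classified moves, i.e. the diagonal of the matrix over its total, as a percentage. Assuming that definition, it can be computed as follows (a sketch, not code from the project):

```python
# Classification ("conversion") rate from a confusion matrix:
# correct predictions sit on the diagonal, so the rate is the
# diagonal sum divided by the total, as a percentage.

def classification_rate(cmat):
    total = sum(sum(row) for row in cmat)
    correct = sum(cmat[i][i] for i in range(len(cmat)))
    return 100.0 * correct / total

# Tiny 2x2 illustration: 17 of 20 predictions on the diagonal.
cmat = [[8, 2],
        [1, 9]]
print(classification_rate(cmat))  # 85.0
```

A strong diagonal in the 9x9 move matrices corresponds directly to a high crate, which is why the two are reported together throughout the Data section.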
After these first couple of trials, I went back through all the data that I had originally created by hand, basically by recording my own moves from games I played on the internet against an already-made AI. What I found was that the board combinations the network was confused about were moves that were not in the data set. If you look through my training files, there are three different files. The first file, tictactoe.txt, has only 100 data points in it, and I figured that maybe it did not have enough points to make good decisions. That is when I introduced the idea of generating the necessary board combinations that I discussed at the beginning of this paper. The new file I used is called newpositions.txt. With the new data, I started to observe that the error graph oscillated a lot more than it used to, but I also found, when testing the new networks, that they made better decisions than before, though still not well enough for a normal game. So I combined all the data into one file called combodata.txt and started experimenting more with different layers.

After about fifteen different neural networks, I started to find a trend in the data: instead of trying to pack all the neurons into a few hidden layers, I spread them out over more hidden layers, and the MLP started to show the decision making I was looking for. Even at this point in the project, I was just looking for a network that would not try to take a position that was unavailable. The best network I got working, with playable results, was a 9-20-20-20-9 network.

[training error plot (epoch size = 100, 1000 epochs) omitted]

It converged fairly well, and the confusion matrix had a definite diagonal down the middle.
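Spreading the same neurons over more hidden layers, as in the trend above, also changes how many weights the network has. As a side calculation (not from the original report), the weight count of a fully connected MLP with biases can be sketched for configurations written the same way as in this report, e.g. 9-20-20-20-9:

```python
# Count trainable parameters (weights plus one bias per neuron)
# of a fully connected MLP, given its layer sizes.

def num_weights(layers):
    """Sum over layer pairs: (inputs + 1 bias) * outputs."""
    return sum((layers[i] + 1) * layers[i + 1]
               for i in range(len(layers) - 1))

print(num_weights([9, 20, 20, 20, 9]))  # 1229
print(num_weights([9, 48, 9]))          # 921
```

So the best-performing 9-20-20-20-9 network actually carries more parameters than the wide-but-shallow 9-48-9 network, even though both were trained with the same learning rate and momentum.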
Normally, the stronger the diagonal, the better the network has learned its training data and the better it will perform. If you want to see intermediate data samples, look at the end of my report for more data.

Conclusion and Observations

The network developed in this project is not a very good one for competing against anyone. When I tested the network against myself, I went relatively easy on the computer because I wanted to see if it would jump on openings in the board. The MLP would sometimes pick a spot that was already taken, but I figured that a lot of its decision making comes down to the amount of data in my training file: even though it trains over and over on the same training data, that does not mean it is going to learn something it never saw.

Another thing I observed during my testing sessions was the way the error graphs liked to jump around. Even when the error had settled low, it would make little hops every once in a while. I figured that maybe this was due to a bad training point, so I scanned the data file to see if there were any bad points that might have caused it. I found one, and removing it made the jumps less frequent; I looked further but did not find any more. To wrap it all up, the neural network I developed is not very good. I know it can do better, because there are programs out there that do better and that you do not have to be gentle with, either. Since I am short on time, I will have to work with what I have and present what works.

Game start up

To start up the game with the present neural network, just type tictactoe in the directory where the file is found. If you want to try to improve the network, just type bp to run the back propagation program to start teaching it and see how it performs.
Matlab Files

bp.m
bpconfig.m
bpdisplay.m
bptest.m
bptestap.m
checkboard.m
combodata.txt
cvgtest.m
fuzzymove.m
mlpconfig.mat
newpositions.txt
nextmove.m
partunef.m
Poistions.str
Randomize.m
Rsample.m
Scale.m
Sline.m
Tictactoe.m
Tictactoe.mat
Tictactoetrain.txt

References

http://www.colinfahey.com/2003apr20_neuron/index.htm#example_xor
http://www.adit.co.uk/html/neural_network_objectives.html
http://www.mathrec.org/old/2002jan/solutions.html

Data

All runs below use learn = .1 and mom = .8 except where noted. The full 9x9 confusion matrices and training error plots (epoch size = 100) are omitted here; each configuration is summarized by its classification rate:

9-10-10-9: crate = 75.0617
9-20-20-9: crate = 90.1235
9-10-10-10-9: crate = 78.0247
9-20-20-20-9: crate = 93.5802
9-10-10-10-10-9: crate = 62.4691
9-20-20-20-20-9: crate = 82.9630

SUCCESS AND NO REPEATS:
9-10-10-10-10-10-9: crate = 59.7531
9-20-20-20-20-20-9: crate = 50.8642
9-20-20-20-20-20-9 (second run): crate = 19.5062
learn = .3, mom = .8: crate = 16.0494
AND FOR ANY OTHER DIFFERENT LEARNING AND MOMENTUM PARAMETERS.