Intelligent Agent Design for Planet Wars Game

Intelligent Systems - Final assignment

Danny Meeuwsen (nr. 1785168), Joris Nijman (nr. 1743694), and Klaas Schuijtemaker (nr. 2566162)

Vrije Universiteit, Amsterdam

Abstract. An exploratory experiment is performed to compare the strength of different strategies for artificial agent design, in the environment of the game Planet Wars. Planet Wars can be represented as an observable, static and discrete environment in which agents look for the best possible move using search algorithms or other strategies. First, useful heuristics are investigated, based on the number of ships and the growth ratio of the player and the enemy; the number of ships and the enemy parameters should be given more weight for better results. Nine different artificial agents then play a competition. The algorithms that take the enemy's moves into account perform well (adversarial beam search and minimax), but purely strategic algorithms (super bully and the simple adaptive strategy) also perform better than plain search algorithms (greedy best-first search). Based on the results of the experiment it can be argued that developing adaptive bots, which combine strategic moves with moves based on search algorithms, will lead to the best results.

1 Introduction

Computer games are excellent testing environments for intelligent agent design strategies. The objective of this report is to compare different algorithms that are often used in the field of Artificial Intelligence. This is done by designing a number of intelligent agents (bots) that play the Planet Wars game and systematically comparing bots with different strategies. Planet Wars was used by Google for a competition in intelligent agent design (AI Challenge 2010, http://planetwars.aichallenge.org/). The version used for this report is a two-player strategy game in which you try to take over all enemy planets, restricted so that a player can only have one fleet in transit at a time.

The game starts with each player owning one planet occupied by a fleet of ships. Each turn a player can send half the ships from a planet he owns to another planet. If the destination planet is hostile or neutral and the attacking fleet is larger than the defending one, the player takes over the planet with the remaining ships. Each planet on the map has a growth ratio: the higher the ratio, the more ships the player gains each turn he owns the planet.
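To make the capture rule concrete, the sketch below shows how an arriving fleet might be resolved at its destination. This is an illustrative sketch only; the Planet class and its fields are assumed names, not the interface of the provided game engine.

    class Planet {
        int owner;      // 0 = neutral, 1 = player 1, 2 = player 2
        int ships;      // ships currently stationed on this planet
        int growthRate; // ships gained per turn while the planet is owned

        // Resolve a fleet of 'arriving' ships sent here by player 'sender'.
        void resolveArrival(int sender, int arriving) {
            if (owner == sender) {
                ships += arriving;        // reinforcement of an own planet
            } else if (arriving > ships) {
                owner = sender;           // capture: the attacker outnumbers the defence
                ships = arriving - ships; // the remaining ships occupy the planet
            } else {
                ships -= arriving;        // the attack is repelled
            }
        }
    }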
An intelligent agent playing the Planet Wars game has to decide where to send its ships. The simplest strategy is to send ships to a randomly chosen planet. A more structured approach is a simple state-space search, e.g. considering every possible move and searching for a goal state in which there are no enemy ships left. Such strategies lead to very long computation times and are therefore not practical for a game like Planet Wars. To optimise the search, heuristics can be used. A heuristic is an estimate of the cost of reaching a goal, or an estimate of how good a player's game position is compared to the opponent's. Using heuristics, an agent no longer has to look through all possible options; it has a way to decide which options to explore first. Because Planet Wars is a two-player game, an even better estimate of the game state can be made if the opponent's moves are also taken into account. This can be done with a minimax search algorithm.

How the different strategies contribute to the success of the bots is investigated by playing competitions between them. By systematically changing the heuristics and search strategies of the different bots, conclusions are drawn about the performance of the different algorithms within the Planet Wars environment.

The Planet Wars environment is fully observable: both bots know the complete game state. It is also static (nothing changes while a move is being calculated) and discrete (all locations in the game are fixed). One aspect of the environment is manipulated during the experiment: a deterministic version, in which the moves are turn based, is compared with a stochastic version, in which both bots perform their moves simultaneously. In the turn-based environment both bots know the exact outcome of their moves; in the parallel environment the outcome of a move depends on which move the opponent happens to choose.

A game state consists of the number of ships at every planet, the growth rate of the planets, and the ownership of the planets (player 1, player 2 or neutral). These variables are used to estimate the desirability of a particular state, and different such heuristics are compared for the algorithms that are used. The development of the bots starts from a number of bots that were provided with the assignment and proceeds by implementing more advanced algorithms and heuristics. The provided bots use a variety of strategies: random selection, a strategic bully, minimax, and an adaptive strategy that switches between random and bully play in different situations. The algorithms added in this experiment are: hill climbing, adversarial beam search, a super strategic bully, a minimax strategy with alpha-beta pruning, and an adaptive strategy. The search algorithms only use an estimate of the most desirable state and do not depend on path cost, which is irrelevant in this version of Planet Wars.

The expectation is that all bots will perform better than a bot that uses a random algorithm. The more precisely a heuristic describes the game state, the better its predictions and the better the bot should perform. A heuristic that only uses the player's total number of ships should perform worse than one that also uses the average growth ratio, and an even better performance is expected when the enemy's number of ships and growth ratio are also taken into account. These hypotheses are for the most part confirmed, although there were some interesting surprises: taking only the enemy growth rate into account also works remarkably well. To compare the different agent design strategies fairly, the heuristic calculation used in the subsequent experiment was the same for all bots.

All bots, both the provided and the implemented ones, play a competition against each other. The expectation is that the minimax algorithms perform better than the others. The further these algorithms can look into the future, the better they should perform, so the provided minimax bot, which looks two steps ahead, should perform worse than the implemented minimax bot, which looks four steps ahead. The adversarial beam search bot uses a minimax strategy combined with a beam size that restricts the options, and should perform best of all. These hypotheses were confirmed to some extent, but optimisation of the algorithms also turns out to be very important.
Surprisingly, the strategy-based bots also perform really well, which raises interesting questions about the importance of a strategic approach. The competition between the bots is performed in serial and in parallel mode; the results of these two situations are compared and explanations are given for the differences in performance between the bots.

2 Background Information

Designing bots for a game is a form of rational agent design. An agent is an entity that maps a percept sequence into an action. The agent has to calculate the best action based on its available perceptions: a performance measure, prior knowledge of the environment, the available percept sequence and its possible actions (Russell and Norvig, Artificial Intelligence: A Modern Approach, 3rd edition, 2009, pp. 35-37). The agent considers the game states that result from its possible actions and chooses the best option.

Search algorithms are used to look for the sequence of actions that leads to a goal or to the best possible state. The simplest search strategies are uninformed ones: they explore the search space systematically, with the sequence of nodes determined only by their name or number. Uninformed search strategies can work well for small search spaces, but when there are many nodes to expand these algorithms will not terminate within a reasonable time span. This is why they are not very useful for playing a game like Planet Wars.

To search more efficiently than by simply opening all possible nodes, the agent needs a way to evaluate the results of its moves. An estimate of the desirability of a certain game state is called a heuristic value. A heuristic value can be calculated for every move by using the path cost to a goal. Another way to look at heuristics is to calculate how 'good' the resulting game position is by just looking at the situation; in chess, for example, having more pawns than the enemy is preferred. The higher the heuristic value of the game state, the closer the search algorithm is to a goal state. Many search algorithms use heuristics to effectively make the search tree smaller, so that goals can be found faster. This helps considerably in a large search space like that of Planet Wars. The remaining problem is that the other player (the adversary) is trying to minimise your score. Bots that use informed search algorithms and also incorporate an estimate of the adversary's move should therefore suit this game better.

Algorithms that take the adversary into account are called minimax algorithms. Minimax considers all possible game states resulting from a move, chooses the best move, and then predicts the move of the enemy (which will minimise your own score). This is done up to a certain depth in the search tree, and the path leading to the best game position is then chosen.

The Planet Wars game is a fully observable environment: each player knows how many planets there are and what their growth rates are, which planets he and his adversary own, and the number of ships on all planets. The game is static; nothing changes while a move is being calculated. There is a finite number of moves a player can make every turn.

A game state describes the state of the Planet Wars game. When a player makes a move, a new game state is created; the new game state and the old game state are connected by the move that the player made. Many game states together form a state space. The state space of Planet Wars can be represented as a tree in which each node is a game state; the current situation is always described by the game state at the root of the tree.

Two versions of the game are used to compare search algorithms in a deterministic and a stochastic environment. In the deterministic game one of the players makes a move and the planets grow, followed by the other player making a move, after which the planets grow again. In the stochastic environment both players move simultaneously, after which the planets grow.
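The difference between the two versions can be made explicit in code. The following is an assumed structure for illustration only; GameState, Bot, Move and the helper methods are hypothetical names, not the engine's actual interface.

    void serialRound(GameState s, Bot p1, Bot p2) {
        s.applyMove(p1.chooseMove(s)); // player 1 moves on the current state
        s.grow();                      // the outcome is fully determined
        s.applyMove(p2.chooseMove(s)); // player 2 reacts to the new state
        s.grow();
    }

    void parallelRound(GameState s, Bot p1, Bot p2) {
        Move m1 = p1.chooseMove(s);    // both players choose on the same state,
        Move m2 = p2.chooseMove(s);    // so each opponent's move is uncertain
        s.applyMove(m1);
        s.applyMove(m2);
        s.grow();                      // the planets grow once per round
    }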
The search strategies used in this report do not use a heuristic that includes the path cost, because the path to the next state always has the same length (one move). An evaluation of the game state is therefore used to rank the options.

Hill climbing search chooses a node based on an evaluation of the game state: the best next state is chosen without looking any further. Beam search can be seen as hill climbing that keeps track of several alternatives to fall back on. When a summit is reached, the search continues on the next best path to see if that leads to a better option. The beam size determines how many alternatives are kept, and thereby how deep the search can go in the available time; it should be adjusted to fit the search space.

Minimax also uses an evaluation of the game state, but additionally takes into account that after every move the enemy will move towards the game state that is least optimal for the player. Because there are so many options to explore and minimax tries to look at them all, the search tree is limited to a certain depth, which should likewise be tuned to the specific search problem.

To extend the search depth of the minimax algorithm, alpha-beta pruning is used. Alpha-beta pruning is applied to a depth-limited, depth-first minimax search. Its defining property is that moves can be pruned from the tree by comparing the moves in memory to moves from already explored nodes: because each player will either minimise or maximise the score, some branches can be skipped, since they would never be reached when both players play optimally. In the best case a large number of states is pruned from the tree; with good move ordering the number of examined nodes drops from roughly b^d to b^(d/2) (for branching factor b and search depth d), which allows a search about twice as deep in the same time. In the worst case none of the states are pruned, which leaves the search time the same as for depth-limited depth-first search, namely b^d. To optimise the minimax search depth even further, a beam search-like restriction was implemented on top of minimax. This leads to even deeper searches, but is also more prone to missing search paths that might have been better.
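As a concrete illustration of the pruning rule described above, the following is a minimal depth-limited minimax with alpha-beta pruning. It is a sketch under assumed helper names (successors, opponentOf, isTerminal, evaluate); the bots' actual implementation is not reproduced in this report.

    int alphaBeta(GameState s, int depth, int alpha, int beta,
                  boolean maxTurn, int player) {
        if (depth == 0 || s.isTerminal()) {
            return evaluate(s, player);        // heuristic value at the horizon
        }
        int mover = maxTurn ? player : s.opponentOf(player);
        int best = maxTurn ? Integer.MIN_VALUE : Integer.MAX_VALUE;
        for (GameState next : s.successors(mover)) {
            int v = alphaBeta(next, depth - 1, alpha, beta, !maxTurn, player);
            if (maxTurn) {
                best = Math.max(best, v);
                alpha = Math.max(alpha, v);
            } else {
                best = Math.min(best, v);
                beta = Math.min(beta, v);
            }
            if (beta <= alpha) {
                break;  // prune: optimal opponents would never allow this branch
            }
        }
        return best;
    }

The adversarial beam search variant would restrict the loop to the few best successors according to the evaluation, which deepens the search at the risk of discarding the path that would have been best.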
The main difficulties of strategy design for Planet Wars are, first, the complexity of the search space (calculating an optimal route to a goal is very time consuming) and, second, the adversary: the route to the goal depends on a prediction of the opponent's move, and if the opponent does something different the strategy might fail miserably.

3 Research Question

To investigate which algorithms perform best in the Planet Wars game, many questions arise. Systematic comparison of heuristics, algorithms and strategies should lead to better predictions about which algorithms are useful for Planet Wars. First of all, the heuristics used in the algorithms should be the same, in order to be able to compare the algorithms; this is why the heuristic value is investigated first. It is expected that the better the heuristic reflects the desirability of a game state, the better an algorithm will perform. Thus the more game state variables the heuristic incorporates, the better it should perform. The importance of the different variables can be reflected in weight factors, which can be used to optimise the heuristic. An accurate evaluation of the game state is derived by adjusting these weights and testing the performance in an experimental research setting.

The most general question about the algorithms is: which perform better, the informed algorithms or the ones that also take the adversary into account? The adversarial ones are expected to perform better than the informed ones because they predict what the enemy will do. This assumption comes with a caveat: if the move of the enemy is not predicted well enough, the adversarial strategy might even perform worse than the normal informed strategies. For one bot to perform better than another, it should win more and lose fewer games than the other bot.

Based on the background information, the following predictions can be made about how well the bots will fare. First, the Greedy bot should beat the Random bot because of its use of heuristics. The LookAhead bot should do better than the Greedy bot because it also takes the first move of the enemy into account. The Minimax bot should perform better than the LookAhead bot because it looks further ahead, making use of alpha-beta pruning. The AdversaryBeamSearch bot can look even further ahead while also adjusting for the enemy; on that argument it should do better still, but because of its incompleteness problems the prediction is that it will be comparable to the Minimax bot.

Until now we have only looked at the development of useful search algorithms for playing Planet Wars. Another way of looking at the game is a more strategic approach: are there ways to win the game by making decisions based on strategy rather than on search? The easiest and most compelling strategy seems to be that of the SuperBully bot, which keeps attacking the smallest planet of the enemy from the owned planet populated by the most ships. This bot is therefore predicted to be quite effective; it could perform as well as the most advanced bots used in this report. Since all algorithms and strategies have their downsides (they do not always lead to an optimal move for every game state), a likely assumption is that bots can be improved by incorporating multiple strategic rules and search algorithms. The Adaptive bot makes use of multiple strategies, which makes it (given the right switching criteria) stronger than either strategy would be on its own. The Adaptive bot should therefore beat the Bully and the SuperBully bot, and because of this compelling strategy it might beat all other bots in the report.

4 Experimental Setup

4.1 Software and programming environment

The Planet Wars game is run as a Java program called PlayGame. This program was provided with the assignment, together with a visualiser written in Python and a number of example bots. The bots run as separate Java programs that are loaded into the game as player 1 or player 2. PlayGame decides the start, the end and the winner of the game; it imports the game maps and makes sure that the rules of the game cannot be broken. For every battle, at least PlayGame and two bots have to run. The game can be visualised by sending the game results to the visualiser.
4.2 Heuristic test

A few factors are of importance when calculating the heuristic value, namely the number of ships of both players, which planet is owned by which player, and the growth ratio of all the planets. The Greedy bot is used to test the heuristics, because its hill climbing algorithm relies completely on the heuristic to find the best possible move. The heuristics are tested by first using single game state variables as a heuristic, then adding variables, and finally experimenting with the weights of the values. The Greedy bot is first tested with a heuristic formula that only uses the total number of ships of the player. Then the growth ratio of the planets is added to the formula. After that, the ships of the opponent are also taken into account. All these factors are then given weights to see which ones are more important in predicting the game state. This leads to a competition with 15 Greedy bots, all using different heuristics based on variations of the following formula, where S is the number of ships owned, G the own growth ratio, Se the number of enemy ships, Ge the enemy growth ratio, and every Wi an independently chosen weight:

f = W1 * (W2 * S + W3 * G) + W4 * (W5 * (-Se) + W6 * (-Ge))
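Expressed as code, the evaluation could look as follows. This is an illustrative encoding only; the field and method names are assumptions, not the actual bot code. Heuristic 10 of Section 5.1, for example, corresponds to wShips = 5 and wGrowth = 1 applied to both players' terms.

    double heuristic(GameState s, int player,
                     double wOwn, double wEnemy, double wShips, double wGrowth) {
        double s1 = 0, g1 = 0, s2 = 0, g2 = 0;
        for (Planet p : s.planets()) {
            if (p.owner == player)  { s1 += p.ships; g1 += p.growthRate; }
            else if (p.owner != 0)  { s2 += p.ships; g2 += p.growthRate; }
        }
        return wOwn * (wShips * s1 + wGrowth * g1)
             - wEnemy * (wShips * s2 + wGrowth * g2);
    }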
4.3 Logger

In order to test the different hypotheses about the search algorithms, heuristics and strategies, a competition program called the Logger was developed. The Logger is able to simulate multiple battles between multiple bots on multiple maps in sequential order. After the battles are finished, the Logger saves the results in Comma Separated Values (csv) files, and afterwards it retries all failed battles. For every bot that participated in a competition, three csv files are created: a file with all battles won, a file with all battles tied, and a file with all battles lost. Every csv file contains the following information:

– All matches played
– Opponent bot name
– The map that was played on
– Number of turns the bot has used
– Average time it took to make a turn

4.4 Bot competition

Using the Logger, all bots that were developed play against each other on the three provided maps with 8 planets. On all maps every pairing is played twice: one normal battle and one with switched starting positions. The position switching does not happen when a bot plays against itself. This results in every bot playing 9 bots * 3 maps * 2 positions - 3 = 51 battles. From all the resulting log files, the wins are put into a table; the losses are automatically derived from the wins, and the draws are not taken into consideration.
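The sketch below shows one way the Logger's tournament loop could implement this schedule. The report describes the Logger's behaviour, not its code, so runBattle and record are hypothetical helpers; record would append a line to the appropriate win/tie/loss csv file.

    void runCompetition(java.util.List<String> bots, java.util.List<String> maps) {
        for (int i = 0; i < bots.size(); i++) {
            for (int j = i; j < bots.size(); j++) {
                for (String map : maps) {
                    record(runBattle(map, bots.get(i), bots.get(j)));     // normal battle
                    if (i != j) {
                        record(runBattle(map, bots.get(j), bots.get(i))); // switched positions
                    }
                }
            }
        }
        // Per bot: 8 opponents * 3 maps * 2 positions + 3 self matches = 51 battles.
    }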
5 Results

5.1 Heuristic evaluation

To evaluate which way of calculating the heuristic value is useful for Planet Wars, fifteen Greedy bots with different heuristics played against each other. The heuristics were systematically manipulated to see which factors contribute the most to winning the battle. The following formulas were used for comparison:

1. Ships self
2. Growth self
3. Ships + growth self
4. Ships enemy
5. Growth enemy
6. Ships + growth enemy
7. Ships + growth self + ships + growth enemy
8. Ships + 5 * growth self + ships + 5 * growth enemy
9. Ships + 10 * growth self + ships + 10 * growth enemy
10. 5 * ships + growth self + 5 * ships + growth enemy
11. 10 * ships + growth self + 10 * ships + growth enemy
12. 5 * (ships + growth self) + ships + growth enemy
13. 10 * (ships + growth self) + ships + growth enemy
14. Ships + growth self + 5 * (ships + growth enemy)
15. Ships + growth self + 10 * (ships + growth enemy)

Table 1 shows the number of games the bot in the left column won against the bots in the top row (six games per pairing, 84 matches per bot in total). Looking at the wins, bots 5, 6, 10, 11, 14 and 15 all win around 50 of their 84 matches. Looking at the losses, bots 6, 10, 11, 14 and 15 almost never lose. Giving the ships more weight while taking all factors into account works best (bots 10 and 11).

Table 1. Comparison of heuristic values between Greedy bots (wins of the row bot against the column bot)

Bot     #1  #2  #3  #4  #5  #6  #7  #8  #9 #10 #11 #12 #13 #14 #15  Wins
#1       -   0   0   5   0   0   0   0   0   0   0   0   0   0   0     5
#2       6   -   4   1   1   0   0   0   2   0   0   3   5   0   0    22
#3       2   0   -   6   0   0   0   0   1   0   0   0   1   0   0    10
#4       0   5   0   -   1   0   0   0   2   0   0   0   0   0   0     8
#5       6   5   6   5   -   3   3   3   3   0   0   6   6   3   3    52
#6       6   6   6   0   3   -   6   6   6   0   0   6   6   0   0    51
#7       6   6   6   6   3   0   -   3   3   0   0   6   6   0   0    45
#8       6   6   6   6   3   0   3   -   3   0   0   4   6   0   0    43
#9       6   4   5   4   3   0   3   3   -   0   0   5   5   0   0    38
#10      6   6   6   0   6   0   6   6   6   -   0   6   6   0   0    54
#11      6   6   6   0   6   0   6   6   6   0   -   6   6   0   0    54
#12      6   2   3   6   0   0   0   1   1   0   0   -   3   0   0    22
#13      3   1   2   4   0   0   0   0   1   0   0   1   -   0   0    12
#14      6   6   6   0   3   0   6   6   6   0   0   6   6   -   0    51
#15      6   6   6   0   3   0   6   6   6   0   0   6   6   0   -    51
Losses  71  59  62  43  32   3  39  40  46   0   0  55  62   3   3

5.2 Competitions between the bots

All nine bots were pitted against each other in both serial and parallel mode. Tables 2 and 3 show how many matches each bot won and lost against the other bots. A score can be computed by subtracting the number of losses from the number of wins (score = wins - losses). This score gives a general idea of how well a bot performed; with it, the bots can be ranked.

In serial mode the Beam bot, the Greedy bot, the Minimax bot and the SimpleAdaptive bot all won 25 or more of their 51 matches. The Beam and Minimax bots did not lose more than 2 times; the Greedy and SimpleAdaptive bots both lost 7 times. The Minimax bot won once from the Beam bot, while the Beam bot never won from the Minimax bot. The Random bot performed worst of all, with only 2 wins and 46 losses. The Adaptive bot performed only slightly better, and the Bully bot again a little better than the Adaptive bot, followed by the LookAhead bot and then the SuperBully bot, which did reasonably well with 19 wins and 13 losses.

Bot ranking in serial mode:

1. Score 30: Minimax bot
2. Score 27: Beam bot
3. Score 18: SimpleAdaptive bot
4. Score 18: Greedy bot
5. Score 6: SuperBully bot
6. Score -12: LookAhead bot
7. Score -18: Bully bot
8. Score -26: Adaptive bot
9. Score -44: Random bot

Table 2. Competition of the bots in serial mode (wins of the row bot against the column bot; bots numbered by rank)

Rank   #1  #2  #3  #4  #5  #6  #7  #8  #9  Wins
#1      -   1   2   2   2   6   6   6   6    30
#2      0   -   2   1   2   6   6   6   6    29
#3      0   0   -   1   2   6   5   5   6    25
#4      0   1   1   -   2   5   5   5   6    25
#5      0   0   0   0   -   5   4   4   6    19
#6      0   0   0   1   1   -   5   5   6    18
#7      0   0   1   1   2   1   -   4   6    15
#8      0   0   1   1   2   1   2   -   4    11
#9      0   0   0   0   0   0   0   2   -     2
Losses  0   2   7   7  13  30  33  37  46

Looking at the wins in the parallel mode version of the competition, the LookAhead bot, the SimpleAdaptive bot and the SuperBully bot all won more than 25 of their 51 matches. The LookAhead bot, however, lost almost as many matches as it won. The SimpleAdaptive bot and the SuperBully bot lost 3 and 0 matches respectively and perform almost equally well; the SimpleAdaptive bot could be called the winner because it wins 3 matches more than the SuperBully bot, although the SuperBully bot loses 3 matches fewer. The Random bot again loses almost every match. The Adaptive bot and the Bully bot follow the Random bot, winning only around 10 matches and losing the rest. The Beam, Greedy and Minimax bots do a little better, with around 20 wins and few losses.

Bot ranking in parallel mode:

1. Score 26: SimpleAdaptive bot
2. Score 26: SuperBully bot
3. Score 14: Minimax bot
4. Score 14: Beam bot
5. Score 12: Greedy bot
6. Score 6: LookAhead bot
7. Score -23: Bully bot
8. Score -27: Adaptive bot
9. Score -48: Random bot

Table 3. Competition of the bots in parallel mode (wins of the row bot against the column bot; bots numbered by rank)

Rank   #1  #2  #3  #4  #5  #6  #7  #8  #9  Wins
#1      -   0   2   2   2   5   6   6   6    29
#2      2   -   0   0   0   6   6   6   6    26
#3      0   0   -   0   0   4   5   5   6    20
#4      0   0   0   -   0   2   5   5   6    18
#5      0   0   0   0   -   3   5   5   6    19
#6      1   0   2   2   3   -   6   6   6    26
#7      0   0   1   0   1   0   -   4   6    12
#8      0   0   1   0   1   0   2   -   6    10
#9      0   0   0   0   0   0   0   0   -     0
Losses  3   0   6   4   7  20  35  37  48

6 Findings

The prediction was that the more explicitly the game state is reflected in the heuristic, the better the bot would perform. This seems to hold up for most conditions when looking at the bots that do not take all parameters into account (bots 1, 2, 3 and 4). Surprisingly, bot 5 and especially bot 6 performed really well, which would suggest that the information about the status of the enemy is the most important to look at. This is also reflected in bots 14 and 15 (which give the enemy parameters more weight) performing as well as bot 6. Only bots 10 and 11 perform a little better, which shows that giving the ships more weight than the growth ratio while taking all variables into account works best. It can be reasoned that a combination of giving more weight to the ships and also to all the enemy variables would perform even better.

As expected, the Random bot lost to all the algorithms, which shows that choosing any strategy, even a weak one, is better than playing with no strategy at all. The Greedy bot performed better than the LookAhead bot, which would imply that looking only at the next move works better than taking the enemy move into account. This finding is interesting but not conclusive, because the LookAhead bot uses a Bully strategy to predict the enemy. Only one other bot actually uses this strategy; the remaining bots do something else than the LookAhead bot predicts, causing it to miscalculate the results of its moves. This results in a lower overall score than the Greedy bot. Still, when the prediction is correct, the LookAhead bot wins most of its matches against that opponent (the Bully bot).

Another expectation was that the Minimax bot would be better than the LookAhead bot, because it uses a state space evaluation for the opponent and looks further ahead using alpha-beta pruning. This is clearly shown in the results: the Minimax bot wins all games against the LookAhead bot in serial mode, and Minimax's overall performance in both game modes is better than LookAhead's. Overall, the Minimax bot is the best performing bot in serial mode, with a total of 30 wins and 0 losses. In parallel mode, however, the Minimax bot performs considerably worse. This can be explained by the fact that the Minimax bot tries to predict a move that follows the move done by the player; in parallel mode the result of a move is different, because the other player moves simultaneously.

Adversarial beam search and the minimax algorithm perform almost the same in both game modes, which also matches the predictions. An interesting difference lies in the prediction of the opponent's move: the BeamSearch bot predicts every enemy move as if the enemy were a SuperBully bot, whereas the Minimax bot uses the same evaluation for the opponent as for its own moves. The SuperBully prediction should be less precise, since only one of the other bots actually uses the SuperBully strategy; this weaker prediction might be compensated by the Beam bot's ability to look further ahead.

Looking at the more strategy-based bots, the SuperBully bot performs really well. This has to be ascribed to the specific strategy it uses, since it does not calculate heuristics or look ahead. The SuperBully bot minimises its loss of ships and waits for the opponent to lose some, and then starts attacking the weakened enemy. One problem with this strategy is that it does not work on maps with many planets holding small numbers of ships: there the enemy gains an advantage by capturing these small planets, because of their growth rate. Combining the SuperBully and Bully behaviours makes the strategy even stronger. The SimpleAdaptive bot does this by changing from a SuperBully into a normal Bully when it is possible to gain an advantage by taking over some small planets. This makes for a strong strategy, which is reflected in the results by the SimpleAdaptive bot winning the parallel competition.
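A sketch of such a switching rule is given below. The threshold and all helper names are assumptions, since the report does not document the bot's actual criteria; the SuperBully move itself follows the description given in Section 3.

    Move chooseMove(GameState s, int me) {
        Planet cheap = s.smallestNeutralPlanet();
        if (cheap != null && cheap.ships < CAPTURE_THRESHOLD) {
            return bullyMove(s, me);   // Bully: grab cheap planets for their growth
        }
        return superBullyMove(s, me);  // otherwise keep besieging the enemy
    }

    Move superBullyMove(GameState s, int me) {
        Planet source = s.largestOwnedPlanet(me);  // own planet with the most ships
        Planet target = s.smallestEnemyPlanet(me); // weakest enemy planet
        return new Move(source, target);           // the engine sends half of source's ships
    }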
7 Conclusions

Different ways of programming artificial agents for playing Planet Wars were investigated, covering a broad range of factors in agent design. In an experiment, nine bots based on different principles played against each other, all using the same heuristic. A few strategies perform much better than the rest, and these findings can be related to the existing theory about search algorithms. Based on these results it becomes clear which strategies should be investigated further to develop an even better bot for the Planet Wars environment.

Almost all the predictions made on the basis of the literature hold up in this experiment. Heuristics play a big role in the strength of the algorithms: the competition between bots with different heuristics shows clearly that some heuristics perform really badly while others almost always win. The best heuristic is one in which the number of ships is given extra weight compared to the growth ratio, and the opponent's variables are weighted more heavily than the player's own.

The informed algorithms in general perform better than uninformed ones. A correct prediction of the enemy's move is an important factor in the success of an adversarial bot; given the right way to predict the enemy move, these algorithms can be very strong, and it is shown that a better prediction leads to better results. Another interesting point is that a lack of good predictions of the enemy can be countered by a deeper search depth. It can therefore be reasoned that algorithms that make good predictions and look far ahead are the strongest.

One of the best bots in the competition turned out to be a bot based on strategy, the SimpleAdaptive bot. This shows that strategies are a good starting point for developing a bot: especially when an adaptive strategy is used, the bot knows the best move in many situations. Because bots that use an adversarial strategy can incorporate a really strong strategy for the prediction of the enemy, they should in theory be able to counter bots that only play by such a strategy; this is partially shown in the results of this experiment. For further development of bots, the adversarial beam search and the minimax algorithms seem to be the most promising options.
The beam search especially can be improved by implementing a better estimation strategy for the opponent; for example, the SimpleAdaptive bot could be used as the opponent model. Further improvements could be made by combining strategies with search in one adaptive bot: using a strategy-based move in certain situations, but falling back on a search algorithm when it is not certain what to do.