Intelligent Decision Making Process for an Attack in RTS Games

Z. Zeinalpour-Tabrizi and B. Minaei-Bidgoli
Iran University of Science & Technology, Tehran, Iran
[email protected], [email protected]
Abstract—One of the important challenges in Real-Time Strategy (RTS) games is constructing the GameAI. Classic GameAI techniques are not applicable to this genre because of the sheer complexity and magnitude of these games' worlds. There are many areas in RTS games that need to be controlled by an AI, and one of the main components of an NPC in an RTS game is the attack tactic controller. In this paper, we propose a system for controlling the army during an invasion of an enemy city. The proposed system includes two main modules: the Attack Tactic Manager (ATM) and the Attack Tactic Designer (ATD). The ATM determines the preferable times for a change in the routine of the army's operation. At such times, the ATD makes the appropriate decision for each of the defined circumstances of the game. An implementation of the proposed system was built and tested in the game StarCraft. The results of the experiments illustrate optimized performance and intelligent behavior during attacks.
Index Terms—RTS, Non-Player Character (NPC), GameAI, Influence Map, Strategy, Tactic, Behavior

I. INTRODUCTION
The game industry has grown remarkably in recent years. Apart from games that rely purely on physical simulation, most games need a wise rival on the other side that does its best to beat the human player. These AI players have to be realistic enough and are expected to show human-like decisions during the game [1]. When talking about AI in the games field, the only important point is what the final user perceives of the GameAI's behavior; so it is fair to say "if our character seems to be intelligent, then it is intelligent" [2]. As a result, one of the main methods of evaluating an NPC's intelligence is intuitive assessment by human observers. Another GameAI goal is to raise the game's difficulty in order to challenge human players.

Although AI researchers have paid a lot of attention to classic games such as chess and checkers [3], younger computer game genres, which are mostly more popular, have received less attention than needed. These games require more complicated GameAI because of their inherent complexity. Among these complex games, the RTS genre has the most complicated game worlds as well as the greatest variety and number of entities, so its GameAI faces more challenges. Since the problems a GameAI faces in an RTS game are of different natures, these games can serve as an experimental environment for a wide range of AI solutions.
According to Sid Meier, programmer and designer of several popular computer strategy games, "A good game is a series of interesting decisions; the decisions must be both frequent and meaningful" [4]. To achieve decision frequency in complicated games such as RTS games, we need to grade the decisions, so we break the decisions of strategy games down into three levels: Strategy, Tactic, and Behavior. In this paper we design a major part of the tactic layer, called the "Attack Tactic Unit" (ATU). As illustrated in Fig. 1, when the strategy manager decides to attack an enemy city with a specific army, it calls the ATU and gives it the attack order and information. This information consists of the ID of the army assigned to the attack mission and the boundaries of the enemy's city.

Figure 1. Position of the proposed system in the hierarchical decision-making structure of NPCs in Real-Time Strategy games

II. RELATED WORK

A lot of research has been done on the GameAI of RTS games. Aha et al. apply case-based reasoning for strategy selection [5]. Their system uses several forms of domain knowledge, including a building-specific state lattice developed by Ponsen et al. [6]. Wintermute et al. developed knowledge-rich agents that play real-time strategy games by interfacing the ORTS game engine with the Soar cognitive architecture. The middleware they developed supports grouping, attention, coordinated path finding, and Finite State Machine (FSM) control of low-level unit behaviors; it attempts to provide the information humans use to reason about RTS games and facilitates creating agent behaviors in Soar [7]. Bergsma and Spronck utilize influence maps (IMs) to generate tactics for a turn-based strategy game [8]. They bring neural networks into play to combine the influence maps: a neural network layers several IMs to generate attack and move values for each cell. Similar work has been done by Jang and Cho, who also employ layered IMs with a neural network for a real-time strategy game [9]. Avery and Louis utilize coevolving influence maps to generate coordinated team tactics for an RTS game. Each entity in the team is assigned its own IM, generated with evolved parameters. The individual IMs allow each entity to act independently of the team, and team coordination is then achieved by evolving all team entities' IM parameters together as a single chromosome with a single evaluation [10]. Preuss et al. suggest improving AI behavior by combining well-known computational intelligence techniques applied in an original way. Team composition for battling spatially distributed opponent groups is supported by a learning self-organizing map (SOM) that relies on an evolutionary algorithm (EA) to adapt it to the game. The different abilities of the unit types are thus employed in a near-optimal way, reminiscent of human ad hoc decisions. Team movement is greatly enhanced by flocking and influence-map-based path finding, leading to more natural behavior that preserves individual motion types [11].

III. ATTACK TACTIC MANAGER

In several previous works, an attack method has been proposed, but none of them is designed to attack a specific area with the goal of destroying all of the enemy's property. These methods focus on fighting enemy units without considering the enemy's buildings, resources, or city structure. In this paper, we design a system that does the decision making on the battlefield for the attacker side. This system leads a complete attack on an enemy city while considering the whole battle situation. The proposed system consists of two major modules: the Attack Tactic Manager and the Attack Tactic Designer. In addition, we developed a simple micromanager unit in order to evaluate the proposed approach (Fig. 2). When the strategy unit decides to attack one of the enemy's cities with a specified army, it invokes the ATU. Within the ATU, the manager module calls the designer module in order to determine immediate goals and assign teams to them.

Figure 2. Structure of the ATU, illustrating the interactions between the ATM, the ATD, and the behavior layer
A. Attack Tactic Manager
The attack tactic manager's duty is to recognize when the attack process should be changed. This need arises in three specific situations: the first time the ATU is called, when one or more teams successfully complete their mission, and when the mission of one or more teams fails. In all of these situations, the ATM calls the ATD to rearrange the teams and the goals' priorities, so that the best plan for the current status of the map can be made.
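As a minimal sketch of this event-driven control, the following Python fragment reacts to exactly the three situations above. The names (`AttackTacticManager`, `designer.redesign`, and so on) are hypothetical illustrations, not identifiers from the paper's implementation:

```python
from enum import Enum, auto

class AttackEvent(Enum):
    ATU_INVOKED = auto()      # first call from the strategy layer
    TEAM_SUCCEEDED = auto()   # a team destroyed its assigned goal
    TEAM_FAILED = auto()      # a team was wiped out or repelled

class AttackTacticManager:
    """Watches the attack and asks the designer for a new plan
    whenever the routine of the operation should change."""

    def __init__(self, designer):
        self.designer = designer

    def on_event(self, event, game_state):
        # All three situations are handled the same way: the ATD
        # rearranges teams and re-prioritizes goals for the current map.
        if event in (AttackEvent.ATU_INVOKED,
                     AttackEvent.TEAM_SUCCEEDED,
                     AttackEvent.TEAM_FAILED):
            return self.designer.redesign(game_state)
        return None
```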
B. Attack Tactic Designer
The attack tactic designer's duty is to set the priority for destroying goals at the current time. This priority is used for organizing unit teams, allocating them to the goals, and calculating the unit teams' paths to each of them.
As illustrated in Fig. 3, the tactic designer module consists of six sub-modules: the Heuristic Function, Map Analyzer, Path Finding Procedure, Decision Maker, Learning Procedure, and Evaluation Function. Each enemy city includes different units and buildings. Our goal is to destroy all the units and buildings with the minimum possible cost and damage, and to act like a human player as well.

Figure 3. Structure of the ATD module

As a first step, we collected a set of rules about the priority in which buildings should be damaged. These sorting rules were extracted from interviews with two expert StarCraft: Brood War players, who were asked: "In an attack, which kinds of units or buildings do you try to destroy first?". In their answers, the interviewees explained how they decide the goals' destruction priorities under any condition and on any map. The heuristic function uses these rules to analyze the enemy's city structure and generate a queue of sets, where each set contains attack goals of the same priority. The information used to set these priorities is the type of the enemy buildings and units and the force balance between the NPC army and the enemy army; it is the decision maker's job to make the priorities more precise.
Starting from the first set in the queue, we allocate our units to that set's members and run the whole process described below. If any units remain unallocated at the end, we repeat the process for the second set, and so on. In this way we attack the goals in the heuristic function's order and use our army properly.
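To make the queue of same-priority sets concrete, here is a minimal Python sketch; the rule table and the building names are illustrative stand-ins, not the actual rules extracted from the interviews:

```python
from collections import defaultdict

# Illustrative priority rules in the spirit of the interviews
# (lower number = destroy earlier); the real rules come from the
# collected expert knowledge, not from this table.
PRIORITY_RULES = {"PhotonCannon": 0, "Gateway": 1, "Pylon": 2, "Nexus": 3}

def heuristic_function(enemy_buildings):
    """Group enemy goals into a queue of same-priority sets."""
    fallback = max(PRIORITY_RULES.values()) + 1
    buckets = defaultdict(list)
    for b in enemy_buildings:
        buckets[PRIORITY_RULES.get(b["type"], fallback)].append(b)
    return [buckets[p] for p in sorted(buckets)]   # queue of goal sets

# Example: both cannons land in the first set, the Nexus in the last.
queue = heuristic_function([
    {"type": "PhotonCannon", "pos": (10, 4)},
    {"type": "PhotonCannon", "pos": (12, 4)},
    {"type": "Nexus", "pos": (20, 8)},
])
```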
In the first step of the ATD we sort each goal set based on three parameters: the goal's value, its path's danger, and its path's cost. To calculate the value of each goal, we use the definition of each building or unit in the game our NPC is playing: each building or unit requires a certain amount of time and resources to be built and has certain effects on the rest. These calculations are good enough for us, because we use them only to sort goals that the heuristic function has already identified, based on their usage, as having identical priority.
In order to calculate the danger and the cost of getting our army to a goal, we need to find the best path to it. This path is calculated using two modules: the Map Analyzer and the Path Finding Procedure.
We need to analyze the map as a preliminary step for path finding. Everyday experience teaches us that the key to effective decision making is not merely having the best data, but presenting the data in the right way: raw data is useless until converted into contextual information, and an appropriate representation of the data will force the relevant underlying patterns to reveal themselves [12]. The map analyzer takes the map information, analyzes it, and generates an influence map.
In this influence map, we take the cost of each cell to be the time it takes a unit to move onto it. This amount depends on the terrain material; for cells with a building on them, it represents the cost of destroying that building, and for cells with impassable natural entities such as a lake or a mountain, the cost of moving is infinite.
The danger of each cell is the amount of danger that enemy units or defensive buildings pose to it. To calculate this amount, we add the danger of each building and unit to the cells within its attack range. For mobile units, the influence of their danger also propagates to neighboring cells; for defensive buildings, the danger influences only the cells within their shooting range.
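A minimal sketch of such an influence map in Python, assuming a simple grid representation; the footprints, ranges, and danger values are placeholders rather than tuned values from the paper:

```python
import math

INF = float("inf")

def build_influence_map(width, height, terrain_cost, buildings, shooters):
    """Build per-cell (cost, danger) layers as described above.

    terrain_cost[y][x]: traversal time of a cell (INF for lakes/mountains).
    buildings: list of (x, y, destroy_cost) footprint cells.
    shooters: list of (x, y, range, danger) for units and defense buildings.
    """
    cost = [[terrain_cost[y][x] for x in range(width)] for y in range(height)]
    danger = [[0.0 for _ in range(width)] for _ in range(height)]

    # Cells occupied by a building cost what it takes to destroy it.
    for bx, by, destroy_cost in buildings:
        cost[by][bx] += destroy_cost

    # Each attacker adds its danger to every cell inside its range
    # (for mobile units one could also blur into neighboring cells).
    for sx, sy, rng, d in shooters:
        for y in range(max(0, sy - rng), min(height, sy + rng + 1)):
            for x in range(max(0, sx - rng), min(width, sx + rng + 1)):
                if math.hypot(x - sx, y - sy) <= rng:
                    danger[y][x] += d
    return cost, danger
```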
In every frame of the game, our army's units are scattered over the map in several teams, organized in the previous decision-making pass. In order to find the best path to a goal with the minimum amount of danger and cost, we need a start point; so for each team we calculate the average coordinate of its members, P(T_i), using (1), in which m is the number of T_i's members, M_j is the jth member of the team, and P stands for position.
\[ P(T_i) = \frac{1}{m} \sum_{j=1}^{m} P(M_j) \qquad (1) \]
The inputs of the Path Finding Procedure for each (team, goal) pair are P(T_i) as the start point and the goal's coordinates as the end point. The procedure finds the best path to the goal and calculates its amounts of danger and cost, so for each pair we obtain a path together with estimates of its cost and danger. As shown in Fig. 4, P_ij is the optimal path for a unit in team T_i to reach goal G_j, c_ij is the cost of P_ij, and d_ij is its danger.
As in most games, we use the A* algorithm for path finding. It uses the physical distance to the goal as its heuristic function and g(c) as its cost function, given in (2). When calculating g(c), d_c and c_c are cell c's danger and cost in the influence map, and ω and δ are the coefficients of cost and danger. These two coefficients are necessary to balance the influence of danger and cost in the path finding procedure, and they have been tuned experimentally.
\[ g(c) = \sum_{c' \in P(s \to c)} \left( \omega\, c_{c'} + \delta\, d_{c'} \right) \qquad (2) \]

where P(s → c) is the set of cells on the path from the start point to c.
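Assuming the cumulative reading of (2), an A* search over the influence map could look like the following sketch, where ω and δ weight each cell's cost and danger:

```python
import heapq, itertools, math

def a_star(start, goal, cost, danger, omega=1.0, delta=1.0):
    """A* over the influence-map grid. Moving onto cell c adds
    omega*cost[c] + delta*danger[c]; the heuristic is the straight-line
    distance to the goal (admissible when every step weight is >= 1)."""
    h = lambda p: math.hypot(p[0] - goal[0], p[1] - goal[1])
    height, width = len(cost), len(cost[0])
    tie = itertools.count()                      # break heap ties safely
    open_set = [(h(start), next(tie), 0.0, start, None)]
    best_g, parent_of = {start: 0.0}, {}
    while open_set:
        _, _, g, cur, parent = heapq.heappop(open_set)
        if cur in parent_of:                     # already expanded
            continue
        parent_of[cur] = parent
        if cur == goal:                          # reconstruct the path
            path = []
            while cur is not None:
                path.append(cur)
                cur = parent_of[cur]
            return path[::-1]
        x, y = cur
        for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if 0 <= nx < width and 0 <= ny < height:
                step = omega * cost[ny][nx] + delta * danger[ny][nx]
                ng = g + step
                if ng < best_g.get((nx, ny), math.inf):
                    best_g[(nx, ny)] = ng
                    heapq.heappush(open_set, (ng + h((nx, ny)),
                                              next(tie), ng, (nx, ny), cur))
    return None                                  # goal unreachable
```

Cells carrying infinite cost (impassable terrain) are never expanded, since their accumulated g stays infinite.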
The decision maker gathers the cost and the danger of every (team, goal) pair, as well as the teams' status, and uses them to calculate the overall cost and danger that destroying each goal would cause for our army. We call these parameters the "goal cost" and "goal danger", and calculate them using (3) and (4):

\[ D_k = \frac{\sum_{i=1}^{n} F(T_i)\, d_{ik}}{\sum_{i=1}^{n} F(T_i)} \qquad (3) \]

\[ C_k = \frac{\sum_{i=1}^{n} F(T_i)\, c_{ik}}{\sum_{i=1}^{n} F(T_i)} \qquad (4) \]

These two equations are similar. In the first, D_k is the kth goal's danger, n is the number of teams organized in our last decision-making pass, d_ik is the danger of the ith team moving from its position to the position of the kth goal, and F(T_i) is T_i's total force, calculated by (5). In the second, C_k is the cost of reaching the kth goal and c_ik is the cost of going from the ith team's position to the kth goal, as provided by the path finding procedure.

\[ F(T_i) = \sum_{x=1}^{m} F(\mathit{Type}(M_x)) \cdot \frac{H(M_x)}{H(\mathit{Type}(M_x))} \qquad (5) \]

A team's force is the sum of its members' forces, and each member's force depends on its type and health. In (5), m is the number of T_i's members and M_x is the xth member of T_i; the typical properties of M_x are denoted by Type(M_x). H(M_x) is a function that returns the current health of M_x, while H(Type(M_x)) returns the health of that type in perfect condition.

Now that we have all three sorting parameters for each goal in the set, we can use a linear equation, (6), to sort its members. This is appropriate for a simple comparison between goals that the heuristic function has already placed in the same priority set.

\[ AP_k = V_k - \omega\, C_k - \delta\, D_k \qquad (6) \]

In (6), AP_k is the kth goal's attack priority and V_k is its value; ω and δ are the coefficients of cost and danger. These coefficients are necessary because we want value, cost, and danger to have specific effects on the priorities, and they have been tuned experimentally.
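The following sketch computes goal danger, goal cost, team force, and attack priority under the force-weighted-average reading of (3) and (4) assumed above; the per-type force and health tables are illustrative placeholders:

```python
TYPE_FORCE = {"Zealot": 16.0, "Dragoon": 20.0}   # illustrative, not from the paper
FULL_HEALTH = {"Zealot": 160, "Dragoon": 180}    # HP + shields, illustrative

def team_force(members):
    """Eq. (5): each member contributes its type's force scaled by
    its remaining fraction of the type's full health."""
    return sum(TYPE_FORCE[t] * h / FULL_HEALTH[t] for t, h in members)

def goal_stats(teams, d, c):
    """Eqs. (3)-(4), read as force-weighted averages; d[i][k] and c[i][k]
    are the path danger and cost from team i to goal k."""
    forces = [team_force(m) for m in teams]
    total = sum(forces)
    n_goals = len(d[0])
    D = [sum(f * d[i][k] for i, f in enumerate(forces)) / total
         for k in range(n_goals)]
    C = [sum(f * c[i][k] for i, f in enumerate(forces)) / total
         for k in range(n_goals)]
    return D, C

def attack_order(values, D, C, omega=1.0, delta=1.0):
    """Eq. (6): sort goals by AP_k = V_k - omega*C_k - delta*D_k, descending."""
    ap = [values[k] - omega * C[k] - delta * D[k] for k in range(len(values))]
    return sorted(range(len(values)), key=lambda k: ap[k], reverse=True)
```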
After calculating AP for all of the goals in the set, we sort them and insert them into a queue of goals. Having gone through the steps above, we have sorted the heuristic priorities based on the game's current status. The next step is to allocate units to the goals; for this purpose, it is necessary to calculate the amount of force needed to destroy each goal.
As shown in (7), the force necessary to destroy each target is the maximum of the force needed to overcome its danger and the optimal force required to destroy that target type, F_opt(Type(G_i)), which depends on the target's shape and resistance. If we allocate more units than are needed to destroy a target, some of our units will stand useless; and if we send far fewer, the destruction operation will look unnatural. In (7), F(G_i) is the force necessary to destroy the ith goal, β is the coefficient of danger, and β·D_i is the force necessary to overcome D_i.
\[ F(G_i) = \max\!\left( \beta\, D_i,\; F_{\mathrm{opt}}(\mathit{Type}(G_i)) \right) \qquad (7) \]
If F(G_i) is less than the force necessary to overcome D_i, we will lose some of our units; and if F(G_i) is much more than the necessary amount, some of our units will face unwanted idle time, so the usage of units will be neither optimal nor natural. The accuracy of β therefore has a significant impact on NPC performance. To obtain the proper value of β, we evaluate each team's results: if there were useless units in the team we decrease β, and if it suffered a high mortality rate we increase β. This adaptation is done as a real-time process during the game.
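A minimal sketch of this online adaptation; the multiplicative step size is an assumption, since the paper does not state one:

```python
def adapt_beta(beta, idle_units, casualties, step=0.05):
    """Shrink the danger coefficient when units stood idle,
    grow it when the team suffered losses (step size assumed)."""
    if idle_units > 0:
        beta *= 1.0 - step
    if casualties > 0:
        beta *= 1.0 + step
    return max(beta, 0.0)
```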
After calculating F(G) for all goals, we allocate force to them in order. At this point, three different cases may occur. The first case occurs when the total force of our army is less than the force needed for any of the goals in the set; in this case, the ATU retreats the army to our city. The second case occurs when, after allocating force to all the goals in the set, we still have some unallocated force; in such a case, we repeat all of the above steps on the next set in the heuristic function's queue and allocate the remaining force to the new goals. The third and most likely case occurs when, after allocating force to some of the goals in the priority set, the force left over is not enough for any of the remaining goals; in this case the remaining force is split among the goals that already have force allocated.
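In Python, the three cases might be handled as in the following sketch, where `required` holds F(G_i) for the goals of the current set in attack-priority order:

```python
def allocate(total_force, required):
    """Returns None for case 1 (retreat); otherwise a list of
    [goal_index, allocated_force] plus any force left for the next set."""
    if total_force < min(required):
        return None                                  # case 1: retreat
    funded, remaining = [], total_force
    for i, need in enumerate(required):
        if remaining >= need:
            funded.append([i, need])
            remaining -= need
    if len(funded) < len(required) and remaining > 0:
        share = remaining / len(funded)              # case 3: split leftovers
        for entry in funded:                         # among funded goals
            entry[1] += share
        remaining = 0.0
    return funded, remaining                         # case 2: caller spends
                                                     # `remaining` on the next set
```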
We use three different types of units in the implementation: the Probe, which is a resource extractor, and the Zealot and the Dragoon, two military units with short-range and long-range weapons respectively. According to most RTS experts, a team with different kinds of units is much more efficient than a team with a single kind, so we include the same ratio of each kind of unit in every team. To obtain this ratio, we first calculate the total force of each kind by (8) and (9).
\[ F(Z) = \sum_{j=1}^{N(Z)} F(Z_j) \qquad (8) \]

\[ F(D) = \sum_{j=1}^{N(D)} F(D_j) \qquad (9) \]

In these equations, N(Z) and N(D) are the numbers of Zealots and Dragoons respectively, Z_j and D_j are the jth unit of each kind, and F(Z) and F(D) are the total force of each kind. For each goal, the allocated force of each kind is obtained by (10) and (11), in which R(Z) = F(Z)/(F(Z)+F(D)) and R(D) = F(D)/(F(Z)+F(D)) are the ratios of each kind of unit to be included in a team. DF(G_i) is the force allocated to the ith goal, while DF_Z(G_i) and DF_D(G_i) are the forces of type Zealot and type Dragoon that we allocate to this goal.

\[ DF_Z(G_i) = R(Z) \cdot DF(G_i) \qquad (10) \]

\[ DF_D(G_i) = R(D) \cdot DF(G_i) \qquad (11) \]

To organize teams with the specified force of each kind, we use a greedy approach, choosing the best free units for each goal. The best free unit for a goal is the one with the minimum CD, the sum of the danger and the cost of the unit's best path to the goal:

\[ CD_{ij} = c_{ij} + d_{ij} \qquad (12) \]
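A sketch of the ratio split of (10)-(11) and the greedy CD-based team building of (12); here `cd` is assumed to be a lookup of precomputed path cost plus danger for each (unit, goal) pair:

```python
def split_by_ratio(df_goal, force_z, force_d):
    """Eqs. (8)-(11): split DF(G_i) between Zealots and Dragoons
    in the army-wide ratio of the two kinds' total forces."""
    total = force_z + force_d
    return df_goal * force_z / total, df_goal * force_d / total

def build_team(free_units, goal, cd, needed_force, unit_force):
    """Greedily take free units with the smallest CD = cost + danger
    of their best path to the goal, until the needed force is met."""
    team, gathered = [], 0.0
    for u in sorted(free_units, key=lambda u: cd[(u, goal)]):
        if gathered >= needed_force:
            break
        team.append(u)
        gathered += unit_force[u]
    return team
```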
When all the units have been allocated, the ATU orders the behavior layer to lead the units along the paths specified by the Path Finding Procedure to their assigned goals and to destroy any hostile units encountered on the way.
Figure 4. The information gathered for all (team, goal) pairs
IV. EXPERIMENTAL RESULTS
As presented in the previous sections, our goal is to implement an efficient and intelligent NPC, so evaluating the presented system requires a method that shows the degree to which these goals are achieved. As mentioned before, the implementation environment is StarCraft: Brood War. We focused our experiments on attack tactics, where an army of two different unit types attacks the enemy city.
Our NPC fought two different enemies in 10 scenarios each, facing different enemy city structures and varying force balances. The results of these games show how well the NPC optimizes its actions. As Tables I and II show, when our army's force is far lower than the enemy's defense force, the NPC decides to retreat immediately, so it neither suffers heavy losses nor keeps fighting to complete ruin. In scenarios where our army's military force is far higher than the enemy's defense force, certain success is achieved with very few losses. When the two rivals' powers are roughly equal, our NPC always wins against the default StarCraft NPC, and when playing against the expert, its wins outnumber its retreats. In all of the games, the value we lose is less than the value we destroy on the enemy's side. According to these results, the suggested NPC performs very well on a commercial RTS game platform.

TABLE I. Scenarios and results of ten games between the implemented NPC and the StarCraft default NPC

TABLE II. Scenarios and results of ten games between the implemented NPC and an expert player
As a further evaluation step, all 20 implemented games were recorded. We then asked three other experts to rate the intelligence level of the attacking army in these games with a number in [0, 10], where 10 stands for fully intelligent behavior and 0 for completely inaccurate behavior. The average of these grades, which reflects our NPC's intelligence level, is 9.29, demonstrating an admissible level of behavior and intelligence. The reason for such results seems to be the fact that this NPC's decision process is very similar to a human player's, and that its decisions are based on expert knowledge.
CONCLUSION

In this paper, we have presented an attack tactic unit structure for the game StarCraft, one of the most famous Real-Time Strategy games developed to date, which has recently been used in tournaments comparing different artificial intelligence techniques. In this structure we combine general expert knowledge about the game with a set of computations. To evaluate the structure, we ran 24 games against another automated player as well as an expert human player. The results showed that our NPC is very efficient and exhibits human-like behavior.
REFERENCES
[1] M. Cavazza, "AI in computer games: Survey and perspectives," Virtual Reality, Springer London, vol. 5, issue 4, pp. 223-235, December 2000.
[2] B. Hall, "Artificial Intelligence for Game Developers," Game Institute, 2010.
[3] D. B. Fogel, T. J. Hays, S. L. Hahn, and J. Quon, "A self-learning evolutionary chess program," Proceedings of the IEEE, vol. 92, issue 12, pp. 1947-1954, Dec. 2004.
[4] A. Rollings and D. Morris, Game Architecture and Design. Scottsdale, Ariz.: Coriolis, 2000.
[5] D. W. Aha, M. Molineaux, and M. Ponsen, "Learning to win: Case-based plan selection in a real-time strategy game," Lecture Notes in Computer Science, Springer Berlin/Heidelberg, pp. 5-20, Sep. 2005.
[6] M. Ponsen, H. Muñoz-Avila, P. Spronck, and D. W. Aha, "Automatically acquiring domain knowledge for adaptive game AI using evolutionary learning," in Proceedings of the Innovative Applications of Artificial Intelligence Conference (IAAI), vol. 3, pp. 1535-1540, 2005.
[7] S. Wintermute, J. Z. Xu, and J. E. Laird, "SORTS: A human-level approach to real-time strategy AI," in Proceedings of the 3rd Artificial Intelligence for Interactive Digital Entertainment Conference (AIIDE-07), pp. 55-60, 2007.
[8] M. Bergsma and P. Spronck, "Adaptive spatial reasoning for turn-based strategy games," in Proceedings of the Fourth Artificial Intelligence and Interactive Digital Entertainment Conference, AAAI, 2008.
[9] S. Jang and S. Cho, "Evolving neural NPCs with layered influence map in the real-time simulation game Conqueror," in Proceedings of the 2008 IEEE Symposium on Computational Intelligence in Games, Perth, WA, pp. 385-388, Dec. 2008.
[10] P. Avery and S. Louis, "Coevolving team tactics for a real-time strategy game," in Evolutionary Computation (CEC), 2010 IEEE Congress, Barcelona, pp. 1-8, July 2010.
[11] M. Preuss, N. Beume, H. Danielsiek, T. Hein, B. Naujoks, N. Piatkowski, R. Stuer, A. Thom, and S. Wessing, "Towards intelligent team composition and maneuvering in real-time strategy games," IEEE Transactions on Computational Intelligence and AI in Games, vol. 2, issue 2, pp. 82-98, June 2010.
[12] M. DeLoura, Game Programming Gems. Charles River Media, Aug. 2000.