
Evolution of Team Composition in Multi-agent Systems
Joshua Rubini
University of Idaho
Moscow, ID
83844
[email protected]
Robert B. Heckendorn
University of Idaho
Moscow, ID
83844
[email protected]
Terence Soule
University of Idaho
Moscow, ID
83844
[email protected]

ABSTRACT
Evolution of multi-agent teams has been shown to be an effective method of solving complex problems involving the exploration of an unknown problem space. These autonomous and heterogeneous agents are able to go places where humans are unable to go and perform tasks that would otherwise be dangerous or impossible to complete. This research tests the ability of the Orthogonal Evolution of Teams (OET) algorithm to evolve heterogeneous teams of agents which can change their composition, i.e. the numbers of each type of agent on a team. The results showed that OET could effectively produce both the correct team composition and a team for that composition that was competitive with teams evolved with OET where the composition was fixed a priori.

Categories and Subject Descriptors
I.2.9 [Artificial Intelligence]: Robotics - Autonomous vehicles; I.2.2 [Artificial Intelligence]: Automatic Programming

General Terms
Algorithms, Design

Keywords
autonomous vehicles, cooperation, teams

1. INTRODUCTION
Teams of robots or agents that are able to operate independently are important tools for solving problems in hostile environments, such as foreign planets, toxic sites, and ocean bottoms, or where manpower may not be available or may be spread too thin, such as in post-disaster search and rescue. Furthermore, rather than being a homogeneous team of bots, the teams may best be served by being a diverse team of specialist robots. Heterogeneity may also arise from the practical issue of what robots are available for a mission, cost constraints, or other factors. To maximize the benefits of heterogeneous teams, a number of hurdles remain to be overcome. One such hurdle is the composition of the team. Composition is the number of individuals of each capability in a heterogeneous team. In many cases, the optimal composition of a heterogeneous team is not known a priori. An arbitrary choice of composition may lead to significantly sub-optimal performance. Thus, the ability to discover the optimal composition is an important feature of any successful algorithm for creating cooperative, heterogeneous teams. This research extends the previously successful heterogeneous team evolution algorithm, Orthogonal Evolution of Teams (OET) [15], by optimizing team compositions as well as team cooperation.

The results showed that allowing teams to evolve their composition produced teams whose performance was no worse than that of teams which were given the optimal composition at the outset. The practical outcome of this is that OET does not need to be given an optimal team composition in order to evolve teams as effective as if the optimal team composition were known in advance. The surprising result was that some configurations of problems and implementations of OET may lead to a hysteresis with respect to team composition. In our experimental implementation two compositions repeatedly vied for supremacy in the population, and switching between compositions proved difficult. While our algorithm is ultimately effective in selecting the optimal composition, this detail turned out to significantly slow our convergence and should serve as a warning to implementers of algorithms that evolve team composition.

2. BACKGROUND
A great deal of research has been devoted to developing
training methods for teams of robotic agents, often referred
to as “cooperative multi-agent learning” [10]. There are two
fundamental requirements for creating a successful multi-agent team: the individual agents must be relatively successful, and the agents must cooperate in a way that promotes the
performance of the whole team. Typically this means that
the bots specialize to solve particular sub-problems. Common EC approaches for multi-agent teams can be roughly
divided into two categories: island [10] approaches and team
approaches.
In island approaches, independent evolutionary processes
are used to train specific members of the team [7, 2, 17,
10]. This approach assumes that each evolutionary process
will produce agents well suited to a particular role, and that
when combined they will naturally cooperate to cover the entire problem domain. However, research has shown that the members generated tend to have significantly 'overlapping' behaviors, such that much of the problem domain remains unaddressed and the overall performance of the team is poor [7, 5, 15].

GECCO'09, July 8–12, 2009, Montréal, Québec, Canada.
Copyright 2009 ACM 978-1-60558-325-9/09/07.
In contrast, in team approaches, a single population evolves.
Each ‘individual’ in the population represents a team of m
individuals. Fitness and selection are based purely on the
whole team’s performance [3, 8, 12, 1, 11]. This leads to
members that cooperate well. However, even in successful
teams, some team members can become “lazy”, letting others on the team cover for their poor performance. This can
produce poor overall performance [13, 1].
Thus, the team approach produces teams with strong cooperation, but some underperforming members, whereas the
island approach creates highly fit individuals which cooperate poorly. Despite these weaknesses both EC approaches
to evolving teams have proven successful when compared to
other forms of ensemble learning [4, 17, 6, 9].
To overcome these weaknesses, we developed a new training approach, called Orthogonal Evolution of Teams (OET) [15], which blends selection for individual agent performance with selection for overall team performance. Research using OET produced statistically significant improvements in cooperative
Evolution of Teams (OET) [15]. Research using OET produced statistically significant improvements in cooperative
teams’ ability to solve exploration problems involving finding and investigating areas of interest within a defined two
dimensional grid [16]. It was also shown [16] that OET,
in contrast to island or team approaches, produced teams
that were more robust in that they were better able to
absorb upgraded/replaced agents without substantially degrading their performance. Furthermore, scalability experiments [14] suggested that teams trained with OET could be
more effectively trained in small numbers and deployed
in large numbers than either island or team approaches.
Previous research into OET was limited in that it did not
allow the teams to determine their own makeup. For instance, if the team started with three model X robots and
three model Y robots the team was forced to use this distribution for the duration of the training. The algorithm could
not improve performance by choosing to swap a model X
for a model Y during training. Unless the ideal distribution of agent types is known in advance, picking an agent distribution at the outset could lead to sub-optimal results. Allowing the evolutionary algorithm to optimize the distribution of agents might avoid this problem. Our goal is to determine whether OET can optimize the agent distribution.
And, if it can, whether adding the team composition variable to the evolutionary parameters hinders or improves the training performance or the quality of solution.
3. METHODS
3.1 Hypotheses
This research extends the previous work with OET and
multi-agent teams by allowing the team composition to be
mutated. We have two hypotheses:
1. Allowing team composition to change, via mutation of
a single team member during steady-state evolution,
will result in higher average team fitness than if a fixed
team composition is chosen in advance.
2. Allowing team composition to change, via mutation of a single team member during steady-state evolution, will result in slower evolution, requiring more time to reach a given fitness.
Neither of these hypotheses addresses the practical matter that if the team composition is fixed at a non-optimal value, the resulting teams will most likely be suboptimal in comparison with teams evolved with a fixed optimal team composition. For many practical problems it is not clear that the optimal team composition could be guessed by the implementers.
3.2 Exploration Problem
To test our hypotheses we return to a robot team exploration problem we have used in previous research [15].
In this problem a heterogeneous team of robot agents must
explore a two-dimensional grid space in which some squares
are flagged as “interesting”. There are two types of robots:
scouts and investigators. Scouts find interesting squares and
flag them with a beacon that is detectable at a distance by investigators. Investigators investigate interesting squares and
mark them as investigated. Scouts travel at up to twice the
speed of investigators. Scouts automatically place beacons
in any interesting squares within a small pre-defined radius
unless a beacon is already present. If an investigator enters
an interesting square it changes the square to be investigated
and deactivates any beacons in the square. Agents can move
beyond the defined search area, but are penalized (see the
fitness function below) for moves that end outside of the exploration area. This model can be viewed as an abstraction of a number of practical problems, including planetary exploration, clearing a minefield, and search and rescue. The heterogeneous nature of the agents represents a trade-off between fast, mobile explorer robots and slower, "heavier" robots that must methodically deal with each discovery.
Because investigators work at half the speed of the scouts,
but can see the beacons at a distance, the space can be more
efficiently explored by fast moving scouts flagging interesting areas with beacons and investigators using the beacons
to go directly to the areas to be investigated. Furthermore,
the two groups of agents must efficiently divide up the space
to be searched and investigated since the task has a time
limit. It is clear that the optimum ratio of scouts to investigators for this problem changes depending on size and
shape of the area to be investigated, speed ratio of the two
kinds of agents, distribution of the “interesting” areas, and
other factors. It is also clear that the optimum ratio of the
two kinds of robots in a team is not immediately apparent
from the problem parameters. This is a realistic dilemma
for evolving a heterogeneous team of robots.
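As a concrete illustration, the environment just described might be modeled as in the following sketch. All names (`Grid`, `scout_pass`, etc.), the beacon radius, and the omission of the paper's size penalties are our own illustrative choices, not the authors' code.

```python
import math
import random

class Grid:
    """Toy model of the exploration environment described above.
    Illustrative sketch only, not the authors' implementation."""

    def __init__(self, size=45, interesting_fraction=0.20, seed=0):
        rng = random.Random(seed)
        self.size = size
        cells = [(x, y) for x in range(size) for y in range(size)]
        n = int(interesting_fraction * size * size)
        self.interesting = set(rng.sample(cells, n))
        self.beacons = set()       # flagged but not yet investigated
        self.investigated = set()

    def scout_pass(self, pos, radius=2):
        """A scout flags any interesting square within `radius`,
        unless a beacon is already present."""
        px, py = pos
        for (x, y) in list(self.interesting):
            if math.hypot(x - px, y - py) <= radius and (x, y) not in self.beacons:
                self.beacons.add((x, y))

    def investigator_enter(self, cell):
        """Entering an interesting square marks it investigated and
        deactivates any beacon in the square."""
        if cell in self.interesting:
            self.interesting.discard(cell)
            self.beacons.discard(cell)
            self.investigated.add(cell)

    def team_fitness(self):
        # Team fitness: beacons placed plus squares investigated
        # (the paper's tree-size penalty is omitted in this sketch).
        return len(self.beacons) + len(self.investigated)
```

A usage pattern would be to call `scout_pass` and `investigator_enter` once per agent per time step, then read `team_fitness` at the end of a trial.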
In our implementation, the problem environment keeps
track of “flagged” and “investigated” cells and communicates
these events to the agents. During each generation, two new
teams will each attempt to find all the locations of interest
on a randomly created grid. The teams have a limited number of time steps to explore. We designed our experiments
so it is unlikely that a team will investigate all of the locations of interest. The fitness of an individual is measured by
the number of locations flagged and/or investigated, with a
penalty imposed for each time step spent outside the problem space. The fitness of a team in the population is measured by the number of locations flagged and investigated
by the whole team. In the OET algorithm team fitness is
used for the replacement tournament and individual fitness
is used for selection of individuals by tournament for new teams.

3.4 The Orthogonal Evolution of Teams
The Orthogonal Evolution of Teams (OET) algorithm is
designed to put evolutionary pressure on both the evolving
teams as a whole (as in team approaches) and on the team
members (as in island approaches) by alternately treating
the evolving population as a series of M independent populations/islands and as a single population of teams of M
members. Figure 1 illustrates this idea. We use a steady-state algorithm. During the selection phase, agents are selected based on their individual fitness and combined into two new teams. These teams undergo crossover and mutation, producing two new offspring teams. Each of the two teams is then evaluated for its performance in solving the problem. The teams are then inserted into the population, replacing two teams selected for their low team fitness.
This approach puts pressure on both the teams as a whole
and on the individual members: members must have high
fitness to be selected, teams must have a high fitness to avoid
being selected for replacement. Other variations of OET are
possible, for example, teams may be selected for reproduction, and individuals for replacement, or the selection and
replacement can vary from iteration to iteration. Extensive
testing of these many variations to determine which, if any,
of these approaches is most reliable has not been performed.
However, previous work has shown that the form of OET
used in this work produces teams whose average members
are better than the average member in a typical team algorithm, teams whose members cooperate better than in an
island algorithm, and teams that perform better than teams
generated via either the island or team algorithms [15].
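To make the "orthogonal" selection concrete, the sketch below builds one parent team by running an independent tournament in each column (role position), as in Figure 1. The function name and the way fitness is supplied are our own assumptions, not the paper's implementation.

```python
import random

def select_parent_team(population, fitness_of, tournament_size=3, rng=random):
    """Build one parent team by running an independent tournament in each
    'column' of the population, selecting on individual agent fitness.
    `population` is a list of teams; each team is a list of M agents.
    `fitness_of(agent)` returns an individual agent's fitness.
    Illustrative sketch only."""
    m = len(population[0])
    parent = []
    for col in range(m):
        # Tournament among the agents occupying this column across teams.
        contestants = [team[col] for team in rng.sample(population, tournament_size)]
        parent.append(max(contestants, key=fitness_of))
    return parent
```

Because each column is drawn independently, the resulting parent team mixes members from different existing teams, which is what distinguishes this step from whole-team selection.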
3.3 Agent Design
Each team consists of six agents: some are scouts and some are investigators. The number of each is determined by the evolutionary process. Agents are controlled by vector expressions represented as expression trees. The vector
represented by the expression is the direction that the agent
will travel in the next time step. The allowed terminals of
the trees (the inputs to the vector expression) are:
1. North: A unit vector pointing North (Angle: π/2).
2. Constant: A vector generated randomly when the node
is created. It remains constant during the lifetime of
the node unless changed by mutation.
3. Random: A vector that is randomized each time step
(Magnitude: [0, 2], Angle: [0, 2π]).
4. Nearest Scout: A vector from the agent to the nearest
scout agent.
5. Nearest Investigator: A vector from the agent to the
nearest investigator agent.
6. Nearest Beacon: A vector from the agent to the nearest
location flagged by a scout.
7. Nearest Edge: A vector from the agent to the nearest
boundary of the problem space.
8. Last Move: A vector which contains the last move the
agent made.
9. Check Bounds: A zero magnitude vector with a small
arbitrary positive direction if the agent is inside the
problem space, and a small arbitrary negative direction
if it is outside.
If an input is meaningless, for example nearest beacon when
there are no beacons present, a random vector is generated.
The internal nodes of the tree (the vector functions) are:
1. Add: Takes two child vectors and computes the vector
sum.
2. Invert: Takes a single child vector and computes the negated vector.
3. If-MagA-Less-Than-MagB: Operates on 4 children. It
compares the magnitudes of the first two children. If
the magnitude of child A is less than the magnitude of
child B, then the vector value of this node is child C,
otherwise it is child D.
4. If-AngA-Less-Than-AngB: Operates on 4 children. It
compares the angles of the first two children. If the
angle of child A is less than the angle of child B, then
the vector value of this node is child C, otherwise it is
child D.
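A minimal evaluator for such vector expression trees might look like the following. The tuple encoding and operator names are our own, and vectors are represented as Python complex numbers (real part = x, imaginary part = y) for convenience; none of this is the authors' code.

```python
import cmath

# Trees are nested tuples: ('const', v), ('add', a, b), ('invert', a),
# ('if_mag', a, b, c, d), or ('if_ang', a, b, c, d). Hypothetical encoding.

def evaluate(node):
    op = node[0]
    if op == 'const':
        return node[1]
    if op == 'add':                      # vector sum of two children
        return evaluate(node[1]) + evaluate(node[2])
    if op == 'invert':                   # negated vector
        return -evaluate(node[1])
    if op == 'if_mag':                   # |A| < |B| ? C : D
        a, b, c, d = (evaluate(n) for n in node[1:])
        return c if abs(a) < abs(b) else d
    if op == 'if_ang':                   # angle(A) < angle(B) ? C : D
        a, b, c, d = (evaluate(n) for n in node[1:])
        return c if cmath.phase(a) < cmath.phase(b) else d
    raise ValueError(f"unknown op {op!r}")
```

Terminals such as Nearest Beacon would be supplied as `('const', v)` nodes whose vectors are recomputed by the environment each time step.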
Figure 1: A population in the OET algorithm. Competition for selection to form a new team occurs in
columns (a). This selects for better performing individuals in each role. Competition for insertion back
into the population occurs by replacing worse teams
with better teams, where teams are found in rows
(b). Thus, there is selective pressure on both teams
and individuals.
During each time step the agent moves in the direction
and possibly the distance of the vector generated by the expression tree. Scouts are limited to a movement of at most
two units, and investigators are limited to a movement of
one unit per time step. If a longer vector is returned by
the expression tree it is truncated to the maximum allowed
move. The agents move in real valued two-space, so a fraction of a unit can be moved if that is the magnitude of the
vector from the expression tree. The problem environment
determines what square of the grid each agent “lands on”
after moving.
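The movement rule above (truncate the tree's output to the agent's speed limit, then map the real-valued position to a grid cell) can be sketched as follows; the function and parameter names are illustrative assumptions.

```python
import math

def apply_move(pos, move_vec, is_scout):
    """Truncate the expression-tree output to the agent's speed limit
    (two units for scouts, one for investigators) and return the new
    real-valued position plus the grid cell landed on. Sketch only."""
    max_len = 2.0 if is_scout else 1.0
    x, y = move_vec
    length = math.hypot(x, y)
    if length > max_len:                 # truncate, keeping the direction
        scale = max_len / length
        x, y = x * scale, y * scale
    new_pos = (pos[0] + x, pos[1] + y)
    cell = (math.floor(new_pos[0]), math.floor(new_pos[1]))
    return new_pos, cell
```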
3.5 The Experiment
In our experiment we compared the performance of an
OET algorithm to solve the Exploration Problem with teams
formed of a fixed number of scouts and investigators versus
teams which are allowed to evolve the number of scouts and investigators. In both cases the total team size was fixed.

Figure 3: The basic elements of the core loop of the OET algorithm. Step 5 appears only in the variable composition version of the algorithm.
1. A tournament to select each member for each position in parent team A
2. A tournament to select each member for each position in parent team B
3. Crossover A and B based upon the crossover rate
4. Mutate each offspring in A and B based upon the mutation rate
5. Mutate team composition based upon the team mutation rate
6. Evaluate teams A and B
7. Tournament to select a team to be replaced by team A
8. Tournament to select a team to be replaced by team B
9. Go to step 1
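The steps of Figure 3 can be outlined in code roughly as follows. The genetic operators are passed in as callables and all names are stand-ins; this is a sketch under our own assumptions, not the authors' implementation.

```python
import random

def oet_core_loop(population, eval_team, agent_fitness, iterations,
                  crossover, mutate, mutate_composition,
                  team_mut_rate=0.05, tournament_size=3, rng=None):
    """Skeleton of the steady-state core loop in Figure 3
    (variable-composition variant). Illustrative outline only."""
    rng = rng or random.Random()

    def column_tournament(col):
        # Select this column's member by individual fitness (steps 1-2).
        teams = rng.sample(population, tournament_size)
        return max((t[col] for t in teams), key=agent_fitness)

    def replacement_tournament():
        # Pick the worst of a few sampled teams to be replaced (steps 7-8).
        idxs = rng.sample(range(len(population)), tournament_size)
        return min(idxs, key=lambda i: eval_team(population[i]))

    m = len(population[0])
    for _ in range(iterations):
        a = [column_tournament(c) for c in range(m)]    # step 1
        b = [column_tournament(c) for c in range(m)]    # step 2
        a, b = crossover(a, b)                          # step 3
        a, b = mutate(a), mutate(b)                     # step 4
        if rng.random() < team_mut_rate:                # step 5
            a = mutate_composition(a)
        if rng.random() < team_mut_rate:
            b = mutate_composition(b)
        eval_team(a); eval_team(b)                      # step 6
        population[replacement_tournament()] = a        # step 7
        population[replacement_tournament()] = b        # step 8
    return population
```

With identity operators and numeric "agents" this runs as-is, which makes the selection/replacement pressure easy to experiment with before plugging in real expression trees.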
Figure 2: Heterogeneous teams can be thought of as a row of team members, with one end of the row being investigators and the other end being scouts. Variation of composition is performed by changing a member that is adjacent to a member of a different type to that type. In the upper diagram there are four investigators and two scouts; in the lower diagram, there are three investigators and three scouts.

In the experiments in which the number of scouts and investigators was allowed to vary, a post-mutation step was added in which, with a fixed probability, a scout could be changed to an investigator or vice versa (see Figure 2). The core loop of our OET algorithm can be summarized as in Figure 3.

Individual agent mutation was performed at a set mutation rate per node. Mutated leaves were replaced with other leaf nodes. Internal nodes were not mutated. At the team level, mutation was performed with a set mutation probability each time a new team was created. If a team was chosen to mutate its composition, with even probability a scout was changed to an investigator, or vice versa. The best scout or investigator (by fitness) in the entire population was swapped in for the appropriate agent in the column where the division of labor was. In other words, if columns 0, 1, and 2 were scouts, and columns 3, 4, and 5 were investigators, then either column 2 would become an investigator or column 3 would become a scout (see Figure 2). The reason the change was made at the boundary, rather than choosing the member to convert by fitness, was that the agents within a column tended to specialize for a role in the team, and mutating the team composition only at the interior demarcation of roles gives the outer columns as long as possible to specialize. Future work will compare other techniques, such as replacing the worst team member.

Crossover was performed with a probability of .95. The crossover operator swapped random subtrees from the two parents.

The trees for the initial population were generated using a decaying probability of creating an interior node and a maximum depth. Each successive level of the agent tree had a smaller probability of generating an interior decision node as opposed to a leaf node. A leaf node was forced once the maximum depth was reached. The parameters were a .7 decay with a maximum depth of 10 levels.

Tree size was controlled during evolution by using the fitness function for both individuals and teams. Each type of agent is given a maximum tree size, and the entire team is given a maximum allowed size for all of the trees in the team. These limits were "soft" limits in that any agents or teams with trees larger than their limit were penalized for each node that exceeded the limit. This proved to be very effective at controlling tree sizes. Since memory consumption and execution time were the main limitations we encountered in the preliminary trials, this was a major improvement. In the preliminary trials we noticed that scouts tended to have much larger (between 5 and 10 times as large) decision trees than investigators. As a result, we set the tree size limit for the scouts at 250 and for investigators at 50. The combined size limit of 1500 for all of the trees in a team was chosen so as not to penalize teams of six scouts and zero investigators solely because of their composition.

The experimental parameters are given in Table 1. These parameters were selected to closely match previous work [15, 14]. Each population was evolved over a series of 200 generations. Each generation consisted of 500 iterations of generating two new offspring teams and replacing two unfit teams selected by tournament. Each offspring was evaluated once on a single grid, which was only changed at the beginning of the 500 iteration segment.

For efficiency, individuals and teams are only evaluated when they are created and every 500 evaluations. More specifically, at the beginning and end of each generation (500 evaluations), every team in the population undergoes two full evaluations of fitness against randomly generated grids. Because the locations of interesting squares are random, it is possible that a team would be given an environment it was particularly suited to and obtain a high fitness that did not accurately reflect its quality. The multiple evaluations limited the effect on the selection or replacement of an agent or team by a single "good" or "bad" test grid. Because the basic model is steady-state, with only the worst teams being replaced, a fortunate team would remain in the population for a considerable amount of time, potentially misleading the search process.
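The boundary-preserving composition mutation described above can be sketched as follows. The type-list representation and function name are our own; the paper's detail of swapping in the population's best agent of the new type is noted only in a comment, since this sketch tracks types alone.

```python
import random

def mutate_composition(team_types, rng=random):
    """Flip one member at the scout/investigator boundary: with even
    probability, the last scout column becomes an investigator or the
    first investigator column becomes a scout. In the paper the new
    member is a copy of the best agent of that type in the population;
    here we only track types. `team_types` is e.g.
    ['S', 'S', 'S', 'I', 'I', 'I'] with scouts in the leading columns.
    Illustrative sketch only."""
    team = list(team_types)
    n_scouts = team.count('S')
    if n_scouts == 0:                # all investigators: can only add a scout
        team[0] = 'S'
    elif n_scouts == len(team):      # all scouts: can only add an investigator
        team[-1] = 'I'
    elif rng.random() < 0.5:
        team[n_scouts - 1] = 'I'     # last scout becomes an investigator
    else:
        team[n_scouts] = 'S'         # first investigator becomes a scout
    return team
```

Keeping the flip at the demarcation column means every composition from 0/6 to 6/0 is reachable by a chain of single-member changes, while the outer columns are left undisturbed to specialize.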
leading the search process. Evaluating on multiple random environments reduces the impact of such fortunate teams by recalculating and averaging their fitness. The minimum, maximum, and average fitness of the surviving teams were recorded at the end of each iteration for later analysis. Fitness for the individual agents was calculated by:

  3 ∗ (beacons placed OR squares investigated)
  − .1 ∗ (time steps spent outside grid boundaries)
  − .3 ∗ (nodes more than the maximum allowed)

Fitness for the teams was calculated by:

  (beacons placed) + (squares investigated)
  − .1 ∗ (nodes in team above the allowed maximum)

Table 1: Algorithm Parameters
  Number of Trials                          60
  Size of Generation                        500 evaluations
  Size of Trial                             200 generations
  Total number of replacements              200000
  Population Size                           100
  Team composition mutation rate            0.05
  Avg. number of leaf mutations/offspring   2.5
  Crossover rate                            0.95
  Tournament size                           3
  Maximum investigator tree size            50
  Maximum scout tree size                   250
  Maximum team total tree size              1500
  Time steps per evaluation                 200
  Environment size                          45×45
  Number of interesting squares             405 (20%)
  Maximum possible fitness                  810

4. RESULTS

Data was collected for the average fitness per generation, the composition, and the tree size for all teams in the population. The final average fitnesses are shown in Table 2. All results were indicated to be significant by Student's t-test (p < .005).

Table 2: Average Fitness at 100000 Generations
  Group                              Avg Fitness   Std Dev
  Control: 3 scout/3 investigator    597.31        15.23
  Composition mutation               586.57        17.38
  Control: 4 scout/2 investigator    577.58        15.46

Figure 4 shows the average fitness over 60 trials of each of the different groups as they evolved. For approximately the first 5000 generations, the evolved team composition grew in fitness faster than the 3/3 (3 scout/3 investigator) control group, and at about the same rate as the 4/2 control group. As the fitness began to plateau at roughly generation 10000, the 3/3 control group started to overtake and subsequently outperform the test group. In fact, the optimum composition of a team for this problem, given the allowed amount of time, is 3/3. However, in the initial phase of exploration the population quickly finds that having more scouts than investigators tends to give a better point score than a balanced team. Only after the algorithm fine tunes the team does 3/3 show itself to be superior. That is, it is easier to find 4/2 teams that perform well than 3/3 teams.

When the individual trials are examined, the algorithm with evolved team composition tends to produce populations that converge to 4/2 and then, after some time, a significant portion of the trial populations revert to 3/3 (see Figure 5). The time to reversion varies from trial to trial. The reversion is generally a sudden transition in composition (see Figure 6). Not all populations are seen to revert before the generation limit of our experiment. Since 3/3 is a slightly better composition than 4/2, most teams which switch from 4/2 to 3/3 do not switch back.

Figure 5 shows the average number of teams over 60 trials with the three most prevalent ratios of scouts to investigators (3/3, 4/2, 5/1). The results show that the teams immediately shifted to a composition of 4 scouts and 2 investigators, and then slowly shifted back to 3 scouts and 3 investigators. All other combinations are rapidly removed from the population. Again, the averaging hides the suddenness of the transitional behavior in individual trials.

This experiment showed some interesting phenomena when teams evolved using OET are allowed to mutate their composition. First, unless the optimal team composition is well-defined, with little to no overlap in fitness with other compositions, the population in our implementation has a hard time shifting from one well-performing composition to another (e.g. 4/2 to 3/3). In this experiment, 4/2 was significantly better in early generations and was therefore easily converged upon. However, once the fitness plateaued, it was difficult for the population to then shift to the ultimately optimal composition of 3/3. We believe that when a composition change is made the new team member is not as well adapted, and that in a mature population teams are well coadapted, so that the new team is at a disadvantage. The sudden transition between two team compositions is the awaited discovery of a well adapted individual for the new composition. In OET, individuals are rewarded for their performance, but that reward is also dependent on their teammates. When selection for team composition occurs, it is based on the individual's ability to work as part of a team and individually score high points. So technically teams are not selected to reproduce. This reduces a team's ability to "sweep" a population, which is part of what makes OET a great performer in balancing the needs of the team with the needs of the individual. Indirectly, however, individuals in teams that perform well will in turn perform well and can sweep a population in a punctuated-evolution-like event.

To analyze the shift between compositions, we broke the test group into two sub-groups: populations that shifted the majority of their members from 4/2 to 3/3, and populations that stayed at 4/2. Calculating the average fitness for these two sub-groups showed that each sub-group yielded fitnesses comparable to the control group with the same composition. This is shown in Table 3. The averages between the test and control for both 3/3 and 4/2 compositions showed a significant difference only at p < .25 (using Student's t-test). This is encouraging, because it shows that allowing the population to mutate team composition is not detrimental to the final fitness when compared to populations evolved using the same fixed composition ratio for the entire evolution. But more importantly, it demonstrated that algorithms that apply varying composition during evolution might be able to leverage higher convergence rates and can still arrive
Figure 4: Average fitness over 60 trials with evolved ratios and fixed ratios of 3/3 and 4/2. Up to approximately generation 5000, evolved team composition and 4/2 outperform 3/3. (Axes: average fitness, roughly 500 to 600, versus iterations, 0 to 80000; series: fixed 3/3, evolved ratio, fixed 4/2.)
Figure 5: This shows the number of teams in the population for each composition (0/6, 1/5, ..., 6/0) averaged over 60 trials. Composition is denoted: scouts/investigators. (Axes: percent of teams that are scouts versus iterations, 0 to 80000.)
Figure 6: The average number of scouts on a team in a population for two example runs ("switches to 3/3" and "stays 4/2"). This shows the sudden transition from 4/2 to 3/3 that was not evident in the averaging of 60 trials in the previous graph. Compositions 3/3, 4/2 and 5/1 are displayed. The remaining compositions were less than 1% of the population. (Axes: average number of scouts, 0 to 6, versus iterations, 0 to 100000.)
at optima competitive with algorithms that were initially fixed at the optimum composition. The major advantage is that the optimum composition need not be known initially; rather, it can be discovered during evolution.

Table 3: Average fitness for test groups broken into 3/3 and 4/2 sub-populations
  Group                        Avg Fitness   Std Dev
  Test group evolved to 3/3    601.52        12.94
  Control group 3/3            597.31        15.23
  Test group evolved to 4/2    577.28        12.65
  Control group 4/2            577.58        15.46

5. CONCLUSIONS

In previous work, the OET algorithm was shown to outperform both team-based selection and replacement and individual-based selection and replacement, in terms of both quality of answer and robustness. However, the tested OET algorithms used a fixed team composition set by the user before the optimization. Therefore, an incorrect guess of the optimal team composition could lead to suboptimal team performance. In this work, we examined the ability of OET to optimize the composition of heterogeneous teams, and therefore to operate without prior knowledge of the optimal team composition.

The evolving team composition algorithm produced a better average fitness than the suboptimal 4/2 team composition. However, the optimal team composition of 3/3 produced a significantly better average performance than the variable team composition algorithm. This is not the end of the story, however: a comparison of the 3/3 and 4/2 results for both variable and fixed team composition reveals that there was little difference in the performance of the two algorithms when results are grouped by composition. This suggests that comparing the performance of the most frequently recurring compositions in repeated optimizations may identify the best composition. This is similar to a restart strategy. Furthermore, the performance of the best evolved compositions is likely similar to that of the best fixed composition. While this does not support our first hypothesis, it has practical implications for heterogeneous team development when the optimum composition is unknown. Further research is necessary across a wider selection of problems with an algorithm that exploits this approach.

Although the variable composition algorithm produced an equal or higher growth rate in performance than either of the fixed algorithms, once the groups plateaued, this advantage was lost. This disproved our second hypothesis, but since performance grouped by composition was not significantly worse, the algorithm is competitive in a practical sense.

While this exploration of evolved team composition did not produce better teams or evolve teams faster in the long run, teams that converged to the optimum composition were competitive with teams where the optimum composition was known and fixed in advance. This algorithm could be extremely useful in cases where the optimum composition is not known in advance.
6. REFERENCES
[1] M. Brameier and W. Banzhaf. Evolving teams of
predictors with linear genetic programming. Genetic
Programming and Evolvable Machines, 2(4):381–408,
2001.
[2] R. Feldt. Generating multiple diverse software versions
with genetic programming. In Proceedings of the 24th
EUROMICRO Conference, volume 1, pages 387–394,
1998.
[3] T. Haynes, S. Sen, D. Schoenefeld, and
R. Wainwright. Evolving a team. In E. V. Siegel and
J. Koza, editors, Working Notes of the AAAI-95 Fall
Symposium on GP, pages 23–30. AAAI Press, 1995.
[4] H. Iba. Bagging, boosting, and bloating in genetic
programming. In Proceedings of the Genetic and
Evolutionary Computation Conference, volume 2,
pages 1053–1060, 1999.
[5] K. Imamura. N-version Genetic Programming: A
probabilistic Optimal Ensemble. PhD thesis, University
of Idaho, 2002.
[6] K. Imamura and J. Foster. Fault Tolerant Computing with N-Version Genetic Programming. In Proceedings of the Genetic and Evolutionary Computation Conference (GECCO), page 178. Morgan Kaufmann, 2001.
[7] K. Imamura, R. B. Heckendorn, T. Soule, and J. A.
Foster. Behavioral diversity and a probabilistically
optimal gp ensemble. Genetic Programming and
Evolvable Machines, 4:235–253, 2004.
[8] S. Luke and L. Spector. Evolving teamwork and
coordination with genetic programming. In J. R.
Koza, D. E. Goldberg, D. B. Fogel, and R. R. Riolo,
editors, Genetic Programming 1996: Proceedings of
the First Annual Conference on Genetic Programming,
pages 150–156. Cambridge, MA: MIT Press, 1996.
[9] D. Opitz, S. Basak, B. Gute, W. Banzhaf, J. Daida,
A. Eiben, M. Garzon, V. Honavar, M. Jakiela, and
R. Smith. Hazard Assessment Modeling: An
Evolutionary Ensemble Approach. In Proceedings of
the Genetic and Evolutionary Computation
Conference, volume 2, pages 1643–1650. Morgan
Kaufmann, 1999.
[10] L. Panait and S. Luke. Cooperative Multi-Agent
Learning: The State of the Art. Autonomous Agents
and Multi-Agent Systems, 11(3):387–434, 2005.
[11] M. D. Platel, M. Chami, M. Clergue, and P. Collard. Teams of genetic predictors for inverse problem solving. In Proceedings of the 8th European Conference on Genetic Programming, EuroGP 2005, 2005.
[12] T. Soule. Voting teams: A cooperative approach to
non-typical problems. In W. Banzhaf, J. Daida, A. E.
Eiben, M. H. Garzon, V. Honavar, M. Jakiela, and
R. E. Smith, editors, Proceedings of the Genetic and
Evolutionary Computation Conference, pages 916–922,
Orlando, Florida, USA, 13-17 July 1999. Morgan
Kaufmann.
[13] T. Soule. Cooperative evolution on the intertwined
spirals problem. In Genetic Programming: Proceedings
of the 6th European Conference on Genetic
Programming, EuroGP 2003, pages 434–442.
Springer-Verlag, 2003.
[14] T. Soule and R. B. Heckendorn. Evolutionary
optimization of cooperative heterogeneous teams.
SPIE Defense and Security Symposium, 2007.
[15] T. Soule and P. Komireddy. Orthogonal evolution of teams: A class of algorithms for evolving teams with inversely correlated errors. In R. L. Riolo, T. Soule, and B. Worzel, editors, Genetic Programming Theory and Practice IV, volume 5 of Genetic and Evolutionary Computation, chapter 8. Springer, Ann Arbor, 11-13 May 2006.
[16] R. Thomason, R. B. Heckendorn, and T. Soule.
Training time and team composition robustness in
evolved multi-agent systems. In M. O’Neill,
L. Vanneschi, S. Gustafson, A. I. Esparcia Alcazar,
I. De Falco, A. Della Cioppa, and E. Tarantino,
editors, Proceedings of the 11th European Conference
on Genetic Programming, EuroGP 2008, volume 4971
of Lecture Notes in Computer Science, pages 1–12.
Springer-Verlag, 2008.
[17] B. Zhang, J. Joung, J. Koza, K. Deb, M. Dorigo, D. Fogel, M. Garzon, H. Iba, and R. Riolo. Enhancing Robustness of Genetic Programming at the Species Level. In Genetic Programming 1997: Proceedings of the Second Annual Conference, pages 336–342. Morgan Kaufmann, 1997.