Deep Barca: Artificial Intelligence for the Board Game Battle Line

Dobrow, Tom
Dept. of Computer Science
Middlebury College
[email protected]
Prof. McCulloch, Sean
Dept. of Mathematics and Computer Science
Ohio Wesleyan University
[email protected]
October 14, 2015
Abstract
that your opponent can place cards on. This forces them to commit
multiple cards to other flags sooner than they might wish. Taking
flags prematurely is a one-turn process. Upon finishing a formation,
if you can show that the formation is unbeatable (given the remaining unplayed Troop Cards and the Troop Cards your opponent has
already committed to the flag), then you can claim that formation
at the end of your turn. A flag can only be taken at the beginning of a
turn, so if Player One, say, claims a flag at the end of his/her turn,
then Player Two can respond to the claim on his/her own turn. Since
a claim must account for all unplayed Troop Cards, Player Two could
not possibly play a Troop Card on that flag to prevent Player One
from taking the flag at the beginning of his/her next turn. However,
a claim does not account for the Tactic Cards, since it only considers
the unplayed Troop Cards.
abstract text and stuff
Introduction
General AI Approach
Game State Trees
The Game of Battle Line
A Gentle Introduction to Battle Line
In this paper we focus on the two-player strategy board game Battle
Line. In Battle Line, players use a deck of “Troop Cards” of 10 values
in 6 colors and special-purpose “Tactic Cards” to fight for control of
nine flags by building three-card poker hands, called formations, in
front of each flag. Once both players have finished their formations in
front of a particular flag, the player with the stronger formation can
claim the flag, meaning they win the flag at the beginning of their
next turn. The formations, from strongest to weakest, are:
Wedge (three cards in a row in the same color), Phalanx (three cards
of the same value), Battalion (three cards in the same color, not all
in a row), Skirmish (three cards in a row, not all in the same color),
and Host (three cards that make no other formation).
If two formations are of the same type, the player with the higher
total sum of card values wins. A player wins once they have taken five of
the nine flags, or any three adjacent flags. On a player’s turn, they
select a card from the seven in their hand, place it on one of the
flags not yet taken, and then draw a card from either of the two decks
to replace it. This leads to 63 possible choices for each player on most
moves, and the randomness of card draw and the number of possible
formations make for an intractably large number of game states.
Figure 1: Example formations, in descending order of strength, from
left to right
Tactic Cards
Tactic Cards are a thing, and they can stop claims.
Proving Mechanism for Flag Claims
For this project, we needed to write the proving mechanism for the
early capture of flags. Each time a formation is finished (the third
card is placed on one side of the flag), we check whether it is beatable.
It is beatable only if there exists a stronger formation that the opposing player could possibly make, given the cards they have already
committed towards their formation, and the cards still available in
the deck of Troop Cards. To do this efficiently, we consider all categories that are strictly stronger than the category of the completed
formation. Here, a category is a class of formations, all with the same
strength. For instance, the following two formations belong to the
same category:
1. Red 8, Blue 8, and Green 8
2. Green 8, Yellow 8, and Orange 8
Both are Phalanxes with a sum of 24. The following two formations belong to the same category as well:
1. Purple 5, Purple 10, and Purple 3
2. Red 8, Red 6, and Red 4
Both are Battalions with a sum of 18. Each formation belongs to exactly
one category. There are a total of 72 categories.
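The category count can be checked by brute force. The following is a minimal sketch, not taken from Deep Barca's code: the encoding of a card as a (color, value) pair and the names COLORS, DECK, and classify are assumptions made for illustration. It classifies every three-card subset of the 60 Troop Cards and collects the distinct (formation type, sum) pairs.

```python
from itertools import combinations

# Illustrative encoding (an assumption): a Troop Card is a (color, value) pair.
COLORS = ["red", "orange", "yellow", "green", "blue", "purple"]
DECK = [(color, value) for color in COLORS for value in range(1, 11)]


def classify(cards):
    """Return (formation type, sum of values) for three Troop Cards."""
    values = sorted(value for _, value in cards)
    same_color = len({color for color, _ in cards}) == 1
    in_a_row = values[1] == values[0] + 1 and values[2] == values[1] + 1
    same_value = values[0] == values[2]
    if same_color and in_a_row:
        kind = "Wedge"
    elif same_value:
        kind = "Phalanx"
    elif same_color:
        kind = "Battalion"
    elif in_a_row:
        kind = "Skirmish"
    else:
        kind = "Host"
    return kind, sum(values)


# A category is a formation type together with a sum of card values.
categories = {classify(hand) for hand in combinations(DECK, 3)}
print(len(categories))  # 72
```

Under this encoding, classify([("purple", 5), ("purple", 10), ("purple", 3)]) and classify([("red", 8), ("red", 6), ("red", 4)]) both return ("Battalion", 18).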
In checking for a claim, we only need to consider all possible categories that can still be made, and not all possible formations. Thus,
if there exist no categories that are both stronger than the category
of the finished formation, and still makable, then the player of the
finished formation can immediately claim the flag.
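The claim check can be illustrated with a simpler stand-in. The sketch below does not implement the category-based check described above. It enumerates every way the opponent could complete their formation from the unplayed Troop Cards, which is slower but easier to verify. The (color, value) card encoding and the names strength and is_beatable are assumptions, and Tactic Cards are ignored.

```python
from itertools import combinations

# Formation types ranked from weakest (0) to strongest (4).
RANK = {"Host": 0, "Skirmish": 1, "Battalion": 2, "Phalanx": 3, "Wedge": 4}


def strength(cards):
    """Map three (color, value) cards to a comparable (type rank, sum) pair."""
    values = sorted(value for _, value in cards)
    same_color = len({color for color, _ in cards}) == 1
    in_a_row = values[1] == values[0] + 1 and values[2] == values[1] + 1
    if same_color and in_a_row:
        kind = "Wedge"
    elif values[0] == values[2]:
        kind = "Phalanx"
    elif same_color:
        kind = "Battalion"
    elif in_a_row:
        kind = "Skirmish"
    else:
        kind = "Host"
    return RANK[kind], sum(values)


def is_beatable(finished, committed, unplayed):
    """True if the opponent can still build a strictly stronger formation.

    finished:  the three cards of the completed formation
    committed: cards the opponent has already placed on this flag
    unplayed:  Troop Cards not yet played anywhere on the board
    """
    target = strength(finished)
    missing = 3 - len(committed)
    return any(
        strength(list(committed) + list(extra)) > target
        for extra in combinations(unplayed, missing)
    )


phalanx = [("red", 8), ("green", 8), ("yellow", 8)]
opponent = [("blue", 10), ("blue", 9)]
print(is_beatable(phalanx, opponent, [("blue", 8), ("purple", 2)]))  # True
print(is_beatable(phalanx, opponent, [("blue", 7), ("purple", 2)]))  # False
```

In the first call the opponent can still complete a blue 10-9-8 Wedge, so the Phalanx cannot be claimed. In the second, the blue 8 is no longer available, and the best the opponent can make is a Battalion.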
Premature Capturing of Flags
If a player can prove that a finished formation they have made is unbeatable, given the cards already in play, they can prematurely take
the flag. The act of prematurely taking flags is very important in
Battle Line, since in doing so you limit the number of available flags
Our Agent: Deep Barca
Results
Probabilistic Model for Decision Making
Emergent Behavior
Our agent uses a general probabilistic model for decision making to
play Battle Line. Each turn, for each card in its hand, and for each
flag that card could be placed on, the agent evaluates the probability
of winning the game. It then chooses the option that maximizes this
probability. Calculating the probability of winning the game for each
move is a three-part process:
Since our agent (for the most part) followed a general mathematical model for decision making, often it would make moves that went
against our immediate human intuitions. We found some emergent
behaviors that were particularly noteworthy:
1. Eight > Ten: Most high-level human players would prefer
a Troop Card with value 10 to a Troop Card with value 8,
all else equal. The 10 is a higher value, and thus wins all ties
against the 8. However, the 8 is eligible for more straight flushes, or Wedges
(10-9-8, 9-8-7, and 8-7-6, as opposed to just 10-9-8), which are the
strongest formations. Because of this, our agent’s preference for
an 8 over a 10 is not objectively bad, and many human players
even share this preference. We were pleased to see this behavior
emerge, as it was unexpected, and not caused by direct human
intervention.
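The counting argument behind this behavior is small enough to check directly. The helper below is an illustrative sketch (the name runs_containing and its parameters are assumptions, not part of the agent): it counts how many runs of three consecutive values between 1 and 10 include a given card value.

```python
def runs_containing(value, low=1, high=10, length=3):
    """Count runs of `length` consecutive values in [low, high] that include `value`."""
    starts = range(low, high - length + 2)  # lowest card of each possible run
    return sum(1 for start in starts if start <= value < start + length)


for value in (8, 10):
    print(value, runs_containing(value))
# 8 3
# 10 1
```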
1. First we calculate the top four formations that each player could
make on the flag. These are the four strongest formations that
could possibly be made, given the Troop Cards remaining in
the agent’s hand and the cards left in the deck of Troop Cards.
2. Next we calculate the probability of winning the flag given our
move. We calculate this in one of the following three ways:
(a) If neither player has played a card towards a formation on
the flag, then we simply say the probability of winning the
flag is 50%.
Performance
(b) If both players have played at least one card towards a
formation on the flag, we calculate the probability of winning the flag via what we call the Multiple Formation Approach, or MFA. For each of our top formations, we find
the probability of winning via that formation (the probability of making that formation multiplied by the probability that our opponent does not beat it).
The MFA score for that flag is the probability of winning via at least one of our top formations, that is, the union of these events.
We evaluated the quality of our agent against computer and human
players. To the best of our knowledge, there is only one other computer agent, the iPhone app Reiner Knizia’s Battleline, by
Rational Brothers LLC. We played a best-of-three match against it, which
our agent handily won 2-0. In one of the games, our agent won without losing a flag. Against experienced human players, the agent won
50% of its games. Against the
2014 Battle Line World Championship silver medalist Sean McCulloch, our agent won three of the ten games it played. With these
results we can confidently say that our agent can play competitively
against even the very best human Battle Line players. Also, our
agent was able to make its decision for each turn in between a tenth
of a second and one second, which is far faster than a human player
could play. This is encouraging, since it means we have room to add
additional features and advanced logic, and still have the agent run
at or faster than the speed of a human player.
(c) If one player has played at least one card towards a formation on the flag, but the other player has yet to play a card,
we use a mix of two approaches, the MFA from above
and the Best Single Formation Approach, or BSFA. In
the BSFA, for each of our top formations, the agent finds
the probability of winning via that formation, and then
chooses the single formation that maximizes its chances of
winning. We evaluate our chances of winning the flag as
if that single best formation were our only available formation. The BSFA on its own yields very narrow decision
making because the agent will not even consider many decent or adequate formations, since they are not the single
best formations it could make. Often even subpar formations are enough to yield a victory on a particular flag.
The MFA on its own is weak as well, since it is heavily
biased in favor of the player who has yet to play a card
on the flag. This is due to the fact that all of the top
formations generated for that player are simply the very
strongest formations in the game, since the player has yet
to commit any card that would narrow his or her options.
Using a mix of these two approaches, our agent makes
much better decisions than it would using either approach
by itself.
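The three cases for the per-flag probability can be sketched as follows. This is a hedged illustration rather than the agent's actual code. It assumes each top formation is summarized by a pair (probability of making it, probability it is not beaten), it computes the union as if the formations won independently of one another, and it blends MFA and BSFA with a hypothetical weight mix, since the exact mixing rule is not specified here.

```python
def win_via(p_make, p_unbeaten):
    """Probability of winning the flag through one particular formation."""
    return p_make * p_unbeaten


def mfa(formations):
    """Multiple Formation Approach: chance that at least one formation wins.

    Assumption: the formations are treated as independent events, so the
    union is 1 minus the product of the individual failure probabilities.
    """
    p_no_win = 1.0
    for p_make, p_unbeaten in formations:
        p_no_win *= 1.0 - win_via(p_make, p_unbeaten)
    return 1.0 - p_no_win


def bsfa(formations):
    """Best Single Formation Approach: keep only the most promising formation."""
    return max((win_via(m, u) for m, u in formations), default=0.0)


def flag_win_probability(ours_played, theirs_played, formations, mix=0.5):
    """Pick case (a), (b), or (c) from the number of cards each side has played."""
    if ours_played == 0 and theirs_played == 0:
        return 0.5  # case (a): nothing committed by either player
    if ours_played > 0 and theirs_played > 0:
        return mfa(formations)  # case (b): both players have committed cards
    # case (c): only one player has committed; `mix` is a hypothetical weight
    return mix * mfa(formations) + (1.0 - mix) * bsfa(formations)


top = [(0.5, 0.5), (0.5, 0.5)]
print(mfa(top), bsfa(top), flag_win_probability(1, 0, top))
```

With two formations that each win with probability 0.25, MFA gives 0.4375 and BSFA gives 0.25, which shows how MFA credits the agent for having several adequate options while BSFA looks only at the best one.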
Future Work
There are many possible improvements we could make to the agent
to strengthen the quality of play and reduce ad hoc decision making even
further.
At the moment there are a few aspects of the game logic where we
rely simply upon our intuition, rather than anything directly mathematically computed. Currently the ten unique Tactic Cards each
require specific logic for evaluating their strength, and determining
when to play them. Similarly, we currently directly weight the importance of flags by their distance to the central flag. This is because the
more central flags are eligible for more three-in-a-rows than the fringe
flags. We would like to find a more dynamic model for evaluating the
importance of flags that can adapt to flags being won. Lastly, we
would like to have a more sophisticated method for deciding whether
to draw a Troop Card or a Tactic Card each turn, that considers
more than just a few simple checks for what has already been drawn.
We would like to improve the agent’s ability to make longer-term
decisions. Since we evaluate the quality of each move based only on
the game state one turn in the future, the agent can often miss moves
involving multiple Tactic Cards used together, since the play of any
of the cards by itself is weak.
The agent’s ability to handle the scenario in which its opponent
claims multiple flags on the same turn is very weak. We would like to
be able to evaluate which of the claimed flags are worth fighting for,
and which of the flags might be too detrimental to lose, but currently
we do not.
3. Lastly, we calculate the probability of winning the game as a
function of the probabilities of winning each of the nine flags.
Since we have two different win conditions (winning five out of
nine, and winning three adjacent flags) we take the weighted
average of the probabilities of winning via each of these two conditions, as a function of the number of cards left in the Troop
Deck. Thus, in the beginning of the game, the agent is far
more concerned with simply winning the majority of the flags,
since this goal in practice entails doing as well as possible on
as many flags as possible. Near the end of the game, the agent
is more concerned with winning three adjacent flags, and will
even willingly sacrifice a fringe flag if it means winning three in
a row somewhere else.
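One way to combine the nine per-flag probabilities is sketched below. It is an illustration under stated assumptions, not the agent's formula: flags are treated as independent, the opponent's own win conditions are ignored, and the weight is taken to be linear in the number of cards left in the Troop Deck. The names p_majority, p_three_adjacent, p_win_game, and the deck_size normalizer are hypothetical.

```python
def p_majority(p_flags, needed=5):
    """Probability of winning at least `needed` flags, assuming independence."""
    dist = [1.0]  # dist[k] = probability of exactly k wins among flags seen so far
    for p in p_flags:
        nxt = [0.0] * (len(dist) + 1)
        for k, mass in enumerate(dist):
            nxt[k] += mass * (1.0 - p)
            nxt[k + 1] += mass * p
        dist = nxt
    return sum(dist[needed:])


def p_three_adjacent(p_flags, run=3):
    """Probability of winning `run` adjacent flags somewhere along the line."""
    # state[r] = probability the current streak has length r with no full run yet
    state = [1.0] + [0.0] * (run - 1)
    done = 0.0
    for p in p_flags:
        nxt = [0.0] * run
        for r, mass in enumerate(state):
            nxt[0] += mass * (1.0 - p)  # losing this flag resets the streak
            if r + 1 == run:
                done += mass * p  # streak completed
            else:
                nxt[r + 1] += mass * p
        state = nxt
    return done


def p_win_game(p_flags, cards_left, deck_size=60):
    """Weighted average of the two win conditions (linear weight is an assumption)."""
    w = cards_left / deck_size  # early game: w near 1 favors the majority condition
    return w * p_majority(p_flags) + (1.0 - w) * p_three_adjacent(p_flags)


even = [0.5] * 9
print(p_majority(even), p_three_adjacent([0.5] * 3), p_win_game(even, 60))
# 0.5 0.125 0.5
```

The linear weight reproduces the qualitative behavior described above: with a full deck the score depends only on the majority condition, and as the deck empties the three-adjacent condition dominates.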
Acknowledgements
NSF and OWU and stuff
More About Deep Barca
other cool things about the code