The Boats and Torpedoes Problem

The Boats and Torpedoes Problem
Report by
Miguel Covarrubias
CS 491AB
Professor Abbott
Spring 2009
1
I. Background
In most software, the algorithms that the software implements (or less formally, the
processes that the software performs) solve a problem or perform a task in ways that are
understood in advance. Artificial Intelligence is required when it is not known how to go
about solving a problem or performing a task. In such cases one writes a program that
searches for solutions to the problem or for ways to perform the task..
Search is the essence of artificial intelligence. In most software, the algorithms that the
software implements (or less formally, the processes that the software performs) solve a
problem or perform a task in ways that are understood in advance. Artificial Intelligence
is required when it is not known how to go about solving a problem or performing a task.
In such cases one writes a program that searches for solutions to the problem or for ways
to perform the task.
In the boat-torpedoes problem, a boat on a featureless ocean has as a goal to reach a safe
harbor, its home port. The boat is being pursued by torpedoes. The torpedoes are faster
than the boat, but the boat is able to turn more sharply than the torpedoes. The boat's
goal is to reach its home port without being hit by a torpedo.
All information is known. The boat knows where it is; it knowns where its home port is;
and it knows where the torpedoes are. The torpedoes know where the boat and the boat's
home port is.
The boat and the torpedoes move at constant speeds — with the torpedoes moving faster
than the boat.
2
II. Problem description
In the boat-torpedoes problem a boat on a featureless ocean has as a goal to reach a safe
harbor, its home port. The boat is being pursued by torpedoes. The torpedoes are faster
than the boat, but the boat is able to turn more sharply than the torpedoes. The boat's goal
is to reach its home port without being hit by a torpedo.
All information is known. The boat knows where it is; it knowns where its home port is;
and it knows where the torpedoes are. The torpedoes know where the boat and the boat's
home port is.
The boat and the torpedoes move at constant speeds — with the torpedoes moving faster.
The problem is formulated as a time-stepped simulation. At each time step the boat and
the torpedoes are required to turn either left or right by a fixed amount — with the boat
turning more than the torpedoes — and then to move forward by a fixed amount, with the
torpedoes moving farther than the boat. At the default settings, the torpedoes move twice
as fast as the boat, and they turn 30% as sharply.
The boat reaches home if it comes within epsilon of its home. A torpedo hits the boat if it
comes within epsilon of the boat. The overall space is square and toroidal. The sizes of
the distance increments are such that at the default settings the boat would be able to
traverse the height/width of the space in 400 steps were it to go in a straight line. Of
course the boat can't move in a straight line because it is constrained to turn either left of
right by a fixed angle (with a default value of 0.1 radians) at each time step.
3
Problem space size
The positions of the boat and torpedoes are recorded as double values. Were the space
digitized more coarsely, it would consist of a square grid of at least 400 x 400 or 160,000
locations. The boat and torpedoes could each be at virtually any of those points. Each
could be in any of more than 60 orientations. Under the default settings, there are 5
torpedoes. Hence the boat and torpedoes each could be in any of the nearly 160,000
locations facing in any of approximately 60 directions, resulting in an order of at least
(160,000*60)6/5! ~= 10,000,0006/5! which is nearly 1040 possible states. In other words,
the space is far too large to explore directly. Therefore one cannot approach the problem
by presuming that one could enumerate the possible states and examine the entire
problem space.
Intelligence
The torpedoes are constrained to be reactive. At each time step, they turn, as best they
can, towards a point somewhere in front of where the boat is currently located in the
direction in which it is currently headed. The closer a torpedo is to the boat, the less it
leads the boat. (Currently an ad hoc approximation is used that seems to maximize the
torpedoes' effectiveness in this regard.) The torpedoes do not work as a team. Each
torpedo attempts to intercept the boat individually.
The boat is an intelligent agent. At each time step it projects ahead along its possible
paths. It also projects how the torpedoes will behave given its possible moves. (It can do
that because it knows how the torpedoes operate.) It then selects what it considers its best
path. If time and space were not an issue, this would be a trivial problem. The boat would
4
search for a path to its home base that avoided all the torpedoes. Depending on the initial
conditions there may or may not be such a path. With no time and space constraints, this
would be similar to a maze problem in which the boat is navigating a maze whose
pathways predictably open or are blocked (by the presence of the torpedoes) as time
passes and as a function of how the boat moves.
Because the search space is so huge, this problem can't be approach in the normal way.
Furthermore, it is required that the boat respond in a fixed amount of time. This makes
the problem a real-time search problem. The boat is required to take a step at each time
tick after only a fixed amount of time has elapsed. (The torpedoes are also required to act
in real-time. That isn't a problem for them. At each time tick, they perform a simple
computation that determines how to adjust their heading as a function of their current
heading and position and that of the boat.)
The problem presents us with standard search strategies a new search strategy that seems
to be more effective for this problem.
III. User Guide
BF – breadth first
5
ST – search by lines
SP – spider search
SPO - spider original
6
Boat reached home
Boat Gets Hit by a Torpedo
IV. Implementation.
The class BoatTorpedoModel is the main framework of the game. It is the heart of the
implementation. It controls all search strategies.
package edu.csula.cs.boatTorpedo;
imports *;
public class BoatTorpedoModel extends SimpleModel {
public static Random randomNumberGenerator;
private static long seed;
public static void main(String[] args) {
seed = System.currentTimeMillis();
new SimInit().loadModel(new BoatTorpedoModel(), null, false);
7
}
/** Generate a random double in [0 .. 1).
*/
public static double nextDouble() {
return randomNumberGenerator.nextDouble();
}
/** Generate a random int in [i .. j). The possibilities
* include i but not j.
*/
public static int nextInt(int i, int j) {
return (int) (nextDouble() * (j-i)) + i;
}
// ========================= End of static declarations ============================
// Repast parameters
private OpenSequenceGraph plot = null;
private DisplaySurface displaySurface = null;
private Object2DDisplay seaSurface = null;
public Object2DTorus space = null;
protected int clusterSz = 1;
private FileWriter fileWriter = null;
private long tickStartTime;
// Execution parameters
public boolean display
= true;
public boolean displayTree
= true;
private int seaSide
= 600;//650;//670; //350;
private long sleep
= 30;
private double touchEpsilonFactor
= 1.5;
private final Color seaColor
= new Color(255, 255, 255);//new Color(120, 180, 200);
// The search strategy;
public Strategy strategySelector
private int nodesPerTick
= Strategy.Bf;
= 1200;
private double boatMaxAngle
= 0.1;
private double boatSpeed
= 0.005;
private double boatWidth
= 0.010;
private double boatLength
= 0.025;
private int torpedoCount
= 3;
private double torpedoAngleFactor
= 0.3;
private double torpedoSpeedFactor
= 2.0;
private double torpedoWidth
= 0.0025;
private double torpedoLength
= 0.010;
private int treeToTickLimitRatio
= 15;
// Execution variables
private boolean rerun
= false;
public boolean verbose
= false;
public BoatAgent boatAgent;
public HomeAgent homeAgent;
public SearchStrategy searchStrategy;
// Keep state private to make it difficult to get to.
// It shouldn't be looked at or changed outside of this object.
private ModelState modelState;
public TreeNode tree;
public ArrayList<TorpedoAgent> torpedoAgents;
8
// Multi-run
private int multirun
= 0;
private long[] seeds;
private Strategy[] strategies = new Strategy[]{Strategy.Bf, Strategy.ST, Strategy.Sp, Strategy.Spo};
private int seedCount = 0;
public int strategyCount = 0;
private boolean[][] results;
private boolean newRun;
private long newRun2;
private int[] totals;
private long multirunSeed;
private Splash splash = null;
private int nodesRemaining;
public BoatTorpedoModel () {
params =
new String[] {
"BoatMaxAngle", "BoatSpeed", "BoatLength", "BoatWidth", "Display",
"DisplayTree", "Multirun", "MultirunSeed",
"NodesPerTick", "Rerun", "SeaSide", "Seed", "Strategy_Sp_ST",
"TorpedoAngleFactor", "TorpedoCount",
"TorpedoLength", "TorpedoMaxAngle",
"TorpedoSpeedFactor", "TorpedoWidth", "TreeToTickLimitRatio", "Verbose"
};
try{fileWriter = new FileWriter( "result.txt", true);
} catch(Exception e) { System.out.println( "Can't create file!"); }
}
@Override
public void begin() {
// The following apparently includes a call to buildModel().
super.begin();
newRun = true;
newRun2 = 0;
seedCount = 0;
strategyCount = 0;
seeds = new long[multirun];
// Genereate the sequence of seeds.
if (!rerun) multirunSeed = System.currentTimeMillis();
if (multirun > 0) System.out.println("MultirunSeed: " + multirunSeed);
Random seedGenerator = new Random(multirunSeed);
for (int i = 0; i < seeds.length; i++)
seeds[i] = seedGenerator.nextLong();
results = new boolean[seeds.length][strategies.length];
totals = new int[strategies.length];
for (int i = 0; i < strategies.length; i++) totals[i] = 0;
if (display) displaySurface.updateDisplayDirect();
}
@Override
public void buildModel() {
if (space == null)
space = new Object2DTorus(getSeaWidth(), getSeaHeight());
if (display) {
9
if (seaSurface == null)
seaSurface = new Object2DDisplay(space);
displaySurface.addDisplayable(seaSurface, "Sea War");
displaySurface.setBackground( seaColor );
displaySurface.display();
}
}
private void createAgentsAndModelState() {
// Reset the static count of agents generated.
// Each agent gets its own index.
Agent.agentCount = 0;
//modelState = new ModelState(22);
modelState = new ModelState(getTorpedoCount() + 2);
nodesRemaining = 1;
if (multirun > 0) {
strategySelector = strategies[strategyCount];
}
// Normally the state object should not be passed out of the model.
searchStrategy = strategySelector.getStrategy(this, modelState);
homeAgent = new HomeAgent( space, this);
initAgent(homeAgent);
boatAgent = new BoatAgent( space, this, getBoatSpeed(), getBoatMaxAngle(), getBoatWidth(), getBoatLength());
initAgent(boatAgent);
torpedoAgents = new ArrayList<TorpedoAgent>();
for( int j = 0; j < getTorpedoCount(); j++) {
TorpedoAgent torpedoAgent =
new TorpedoAgent( space, this, getTorpedoSpeed(),
getTorpedoAngleFactor() * getBoatMaxAngle(), getTorpedoWidth(), getTorpedoLength() );
initAgent(torpedoAgent);
torpedoAgents.add(torpedoAgent);
}
}
private void incrementMultirun(boolean boatReachedHome) {
results[seedCount][strategyCount] = boatReachedHome;
newRun = true;
strategyCount++;
if (strategyCount < strategies.length) return;
// We finished all strategies on this seed.
printResults();
strategyCount = 0;
seedCount++;
if (seedCount < seeds.length) return;
// We are finished with the multirun.
System.out.print("Totals: ");
for (int j = 0; j < strategies.length; j++) {
System.out.print(strategies[j] + ": " + percentHome(totals[j], seedCount));
}
// Terminate the current line and leave a blank line after this Multirun.
System.out.println("\n");
10
// Stop this Repast run.
super.stop();
}
/** Require the torpedo to be within one step of the boat to count as a hit.
* The boat gets the same epsilon for getting home.
*/
public double getTouchEpsilonFactor() {
return touchEpsilonFactor;
}
public void setTouchEpsilonFactor(double touchEpsilonFactor) {
this.touchEpsilonFactor = touchEpsilonFactor;
}
public double getTouchEpsilon() {
return getTorpedoSpeed() * getTouchEpsilonFactor();
}
@SuppressWarnings("unchecked")
private void initAgent(Agent agent) {
agentList.add( agent );
AgentState agentState = agent.createState();
modelState.putAgentState(agentState);
agent.updateAvatar(agentState);
}
private void initializeRun() {
TreeNode.nodeCount = 0;
if (splash != null) splash.remove();
seed = multirun > 0 ? seeds[seedCount] :
rerun
? getSeed() :
System.currentTimeMillis();
randomNumberGenerator = new Random(seed);
if (agentList == null) agentList = new ArrayList<AutoStepable>();
else {
for (Object agent : agentList)
((Agent)agent).avatar.remove();
agentList.clear();
if (display) displaySurface.updateDisplay();
}
createAgentsAndModelState();
if (display) displaySurface.updateDisplay();
try {
fileWriter.write("\n" + this +
"\n" + searchStrategy + "\n\t" + seed);
} catch (IOException e) {e.printStackTrace();}
// Initialize the searchStrategy by letting it create its root state.
// This must be done after all the agents have been created and
// placed at their starting positions.
searchStrategy.createRoot(modelState);
11
}
public String parameters() {
return
"boatMaxAngle: " + boatMaxAngle +
", torpedoAngleFactor: " + torpedoAngleFactor +
", boatSpeed: " + boatSpeed +
", torpedoSpeedFactor: " + torpedoSpeedFactor
;
}
private String percentHome(int succeeses, int total) {
return succeeses + "/" + total + " = " +
Math.round(succeeses*100.0/total) + "%; ";
}
@Override
protected void postStep() {
// Each agent must update its avatar based on its current state.
for (Object agent : agentList) ((Agent)agent).postStep(modelState);
boolean boatWasHit = searchStrategy.boatWasHit();
boolean boatReachedHome = false; //searchStrategy.boatReachedHome();
boolean finished = boatWasHit || boatReachedHome;
// Was it hit? Did it reach home?
if( finished ) {
// There should never be a tie. If boatReachedHome(), we don't look for
// deadlyTorpedoes.
if (boatWasHit && boatReachedHome) System.out.println("Tie");
try {
fileWriter.write(" (" + TreeNode.nodeCount + " nodes) " +
(boatReachedHome ? "Home\n" : "Hit\n"));
fileWriter.flush();
}
catch (IOException e) {e.printStackTrace();}
if (boatReachedHome) homeAgent.avatar.remove();
// Draw a red or golden splash around the boat if it was hit or if it made it home.
// If it both reached home and was hit, give it credit for reaching home.
// That is, a tie goes to the boat.
splash = new Splash(space, modelState.getAgentState(boatAgent),
boatReachedHome ? Color.YELLOW : Color.RED );
}
if (display) displaySurface.updateDisplay();
if ( finished ) {
for (Object agent: agentList) ((Agent) agent).avatar.remove();
if (multirun > 0) incrementMultirun(boatReachedHome);
else super.stop();
}
if (display) sleep();
if (newRun2 == 30) {
torpedoCount += 1;
newRun2 = 0;
newRun = true;
//createAgentsAndModelState();
//displaySurface.updateDisplay();
//nodesRemaining = getNodesPerTick();
12
//boatAgent.plan();
preStep();
//JOptionPane.showMessageDialog(null, "I just added a new torpedo!\nThere are now " + getTorpedoCount()
+ " torpedos.");
} else if (Math.random() > .94) {
//sleep(300);
newRun2 += 1;
}
}
private void sleep(long count) {
try {Thread.sleep(count);}
catch(InterruptedException interruptedexception)
{
System.out.println(interruptedexception + " in exec thread");
}
}
@Override
protected void preStep() {
if (newRun) {
initializeRun();
newRun = false;
}
nodesRemaining = getNodesPerTick();
boatAgent.plan();
}
private void printResults() {
int index = seedCount+1;
System.out.print(index + ". " + seeds[seedCount] + ". ");
for (int j = 0; j < strategies.length; j++) {
if (results[seedCount][j]) totals[j]++;
System.out.print(strategies[j] + ": " + results[seedCount][j] + " ");
System.out.print(percentHome(totals[j], index));
}
System.out.println();
}
/**
* Run on reset and initially(?)
*/
@Override
public void setup() {
super.setup();
if (display) {
/*if (displaySurface != null)
{
try {
displaySurface.removeDisplayable(seaSurface);
displaySurface.dispose();
displaySurface = null;
space = null;
seaSurface = null;
} catch(Exception e) {}
}*/
displaySurface =
new DisplaySurface(
13
new Dimension(getSeaWidth(), getSeaHeight()),
this,
"Sea War");
super.registerDisplaySurface("Sea War", displaySurface);
}
if (plot != null) plot.dispose();
plot = new OpenSequenceGraph("Buttons", this);
plot.setAxisTitles("time", "cluster");
plot.addSequence(
"MaxCluster",
new Sequence() {
public double getSValue() {return clusterSz;}
});
}
private void sleep() {
long tickLength = System.currentTimeMillis() - tickStartTime;
if (tickLength < sleep) {
try {Thread.sleep(sleep - tickLength);}
catch(InterruptedException interruptedexception)
{
System.out.println(interruptedexception + " in exec thread");
}
}
// Start the clock for the next tick.
tickStartTime = System.currentTimeMillis();
}
/** The searchStrategy figured out what to do in its plan() step.
* Ask it what the next state is.
*/
@Override
protected void step() {
modelState = searchStrategy.getNextState();
}
@Override
public String toString() {
return getClass().getSimpleName() + "[" + parameters() + "]";
}
// ======================== Getters and setters ================================
@SuppressWarnings("unchecked")
public ArrayList getAgentList() {return agentList;}
public double getBoatLength() {return boatLength;}
public void setBoatLength(double boatLength) {
this.boatLength = boatLength;
}
public double getBoatMaxAngle() {return boatMaxAngle;}
public void setBoatMaxAngle(double boatMaxAngle) {
this.boatMaxAngle = boatMaxAngle;
}
public double getBoatSpeed() {return boatSpeed;}
public void setBoatSpeed(double boatSpeed) {
14
this.boatSpeed = boatSpeed;
}
public double getBoatWidth() {return boatWidth;}
public void setBoatWidth(double boatWidth) {
this.boatWidth = boatWidth;
}
public boolean getDisplay() {return display;}
public void setDisplay(boolean displayState) {
this.display = displayState;
}
public boolean getDisplayTree() {return displayTree;}
public void setDisplayTree(boolean displayState) {
this.displayTree = displayState;
}
public int getMultirun() {return multirun;}
public void setMultirun(int m) {this.multirun = m;}
public long getMultirunSeed() {return seed;}
public void setMultirunSeed(long s) {seed = s;}
public int getNodesPerTick() {return nodesPerTick;}
public void setNodesPerTick(int max) {nodesPerTick = max;}
public int decrementNodesRemaining() {return --nodesRemaining;}
public int getNodesRemaining() {return nodesRemaining;}
public boolean getRerun() {return rerun;}
public void setRerun(boolean r) {rerun = r;}
public int getSeaHeight() {return getSeaSide();}
public int getSeaSide() {return seaSide;}
public void setSeaSide(int s) {this.seaSide = s;}
public int getSeaWidth() {return getSeaSide();}
public long getSeed() {return seed;}
public void setSeed(long s) {seed = s;}
public long getSleep() {return sleep;}
public void setSleep(long sleep) {this.sleep = sleep;}
public Object2DTorus getSpace() {return space;}
public String getStrategy_Sp_ST() {return strategySelector.toString();}
public void setStrategy_Sp_ST(String s) {
strategySelector = Strategy.getStrategy(s);
}
public ArrayList<TorpedoAgent> getTorpedoAgents() {
return torpedoAgents;
}
public double getTorpedoAngleFactor() {return torpedoAngleFactor;}
public void setTorpedoAngleFactor(double torpedoMaxAngle) {
this.torpedoAngleFactor = torpedoMaxAngle;
}
15
public int getTorpedoCount() { return torpedoCount; }
// Don't allow more than 18 torpedoes.
public void setTorpedoCount(int tc) {
this.torpedoCount = Math.min(18, tc);
}
public double getTorpedoLength() {return torpedoLength;}
public void setTorpedoLength(double torpedoLength) {
this.torpedoLength = torpedoLength;
}
/** The torpedo speed is some multiplier of the boat speed.
*/
public double getTorpedoSpeed() {
return boatSpeed * torpedoSpeedFactor;
}
public double getTorpedoSpeedFactor() {
return torpedoSpeedFactor;
}
public void setTorpedoSpeedFactor(double torpedoSpeed) {
this.torpedoSpeedFactor = torpedoSpeed;
}
public double getTorpedoWidth() {return torpedoWidth;}
public void setTorpedoWidth(double torpedoWidth) {
this.torpedoWidth = torpedoWidth;
}
public int getTreeLimit() {
return getNodesPerTick() * getTreeToTickLimitRatio();
}
public int getTreeToTickLimitRatio() {
return treeToTickLimitRatio;
}
public void setTreeToTickLimitRatio(int max) {
treeToTickLimitRatio = max;
}
public boolean getVerbose() {return verbose;}
public void setVerbose(boolean v) {verbose = v;}
}
For the problems addressed by Chapter 3, the objective is for an agent (the problem
solver, in our case the boat) to solve a problem. A problem is defined in terms of what is
often called a search space. Each point in the search space represents a state — of the
world and of the search agent's relationship to the world. In the example discussed in the
chapter, the agent is located at a starting city. The problem is to find a way for the agent
16
to go to a goal city. Transitions from point to point represent possible transitions from
one state to another, in this case from one city to another. In this problem, the transitions
represent roads from one city to another. It is often desirable to find an optimal solution,
i.e., the least expensive (according to some measure) way for the agent to move from its
initial position to its goal.
Chapter 3 covers what are known as blind search strategies. The searches are called blind
because the agent has (or uses) no information about the problem space other than the
graph that defines possible transitions among states.
Most blind search approaches are best understood in terms of a tree. The agent is at the
root of the tree. It is able to move from the root to various other states. Its job is to
explore the tree until it finds a path (or an optimum path) to the goal state. Since the
problem space itself is not necessarily a tree, i.e., there may be loops in the state
transition graph, the agent must keep track of visited nodes so that it doesn't repeat itself.
By doing this it essentially transforms the problem space graph into a tree with the initial
(or current) state as the root. With this formulation, search can be re-examined as a
problem in tree traversal.
The best known tree-traversal methods are breadth-first and depth-first. There are
advantages and disadvantages to each. A variant of depth-first search, known as iterative
deepening (depth-first search to increasingly large depth limits), eliminates some of the
problems with depth-first search and gains some of the advantages of breadth-first search
by paying a penalty in search time.
17
Both breadth-first search and depth-first search may be understood in terms of a frontier
that keeps track of the nodes of the tree that are yet to be expanded. In bread-first search
the frontier is a FIFO queue. In depth first search, the frontier is a LIFO stack. As the
search progresses, the agent is at a particular node of the tree, which is the front element
of the frontier. It generates a list of all the node's children. If it appends the list of
children to the end of the frontier and draws nodes to be expanded from the front of the
frontier, one has breadth-first search. If it inserts the list of children at the front of the
frontier and then draws nodes to be examined from the front, one has depth-first search.
Blind search is not feasible for the boat-torpedoes problem. If one supposes that on
average the boat will require 500 steps to dodge the torpedoes and reach home (an
optimistic estimate), blind search would require that the boat explore a search tree of
depth 500. Clearly no blind search strategy can do that in a reasonable amount of time.
More specifically, breadth-first search is unsuccessful because breadth-first search
requires that the agent examine a full tree. Since each additional level of the tree
represents only a relatively small distance traveled, the depth required to make any real
difference is too great for realistic use.
Depth-first is no better. A pure depth-first search would expand nodes, continually
turning in one direction until the generated path is hit by a torpedo — or reaches home. If
hit by a torpedo, the search would back up one step, turn in the other direction, and
continue turning in the original direction. But this approach too falls victim to the tooshort-distance-per-step problem. Backing up a single step, turning in the other direction,
and then proceeding in the original direction will almost always make very little
18
difference. The torpedo that caught the boat along the first path will catch it along the
second path as well. Eventually, the boat will backtrack far enough so that it will avoid
that torpedo. But the number of nodes required to be examined will exceed any
reasonable amount.
Informed search modifies the tree searches discussed in Chapter 3 by ordering the nodes
in the frontier in a more sophisticated way. Instead of putting newly generated nodes
either at the front or back of the frontier, one might evaluate a node and place it at a
position in the frontier that corresponds to an estimate of how likely it is to produce a
solution. Nodes that are deemed more likely to lead to a solution are placed near the front
of the frontier. Those that are deemed less likely to lead to a solution are placed near the
back of the frontier.
This approach is called informed search because the agent must have some way of
evaluating nodes with respect to their likeliness of producing a solution. The two bestknow informed search techniques are best-first and A*. In best-first search one orders the
frontier on the basis of estimates of how close the nodes are to the goal. Nodes that are
deemed closer to the goal (i.e., better) are placed toward the front of the frontier. In A*
search, one orders the nodes on the basis of (a) how close the node is to the goal along
with (b) a measure of how much work is required to reach that node. In other words,
nodes are ordered according to an estimate of how good the solution will be if that node
leads to a solution. The nodes that are deemed likely to lead to better overall solutions are
put earlier in the frontier. For the boat-torpedoes problem, we don't care how long the
19
boat stays out as long as it eventually gets home. So best-first is essentially equivalent to
A*.
Best-first and A* search may seem more promising than pure blind search. Following a
best-first search strategy the boat would explore the most direct path from its location to
the home base. If that path is intercepted by a torpedo, the boat would back up a step and
turn in the opposite direction from the direction that represents the intercepted path. It
would then proceed toward its home base. But this approach also falls victim to the tooshort-distance-per-step problem. As noted above, backing up a single step, turning in the
other direction, and then proceeding toward home will almost always make very little
difference. The torpedo that caught the boat along the first path will catch it along the
second path as well.
The genetic algorithm
Sections 4.3 and 4.4 discuss searches in which one looks for solutions to a problem —
rather than looking for paths that lead to goals. The genetic algorithm is the best known
approach for this sort of search. The boat-torpedoes problem may be reformulated to fit
this paradigm by reframing it as a search for a path from the current position to the home
base. That is, one is no longer traversing a tree looking for a path from the root to a goal
leaf. Instead one is standing back from the tree and looking at all possible paths from the
root to determine whether one of them leads to the goal leaf. Since there are far too many
paths to do that exhaustively, the genetic algorithm approach is to generate a population
of possible paths and hope to evolve a successful path.
20
Originally, a form of genetic algorithm was used in this problem. The population
consisted of paths through the tree, starting at the root. These paths were subjected to
cross-over and mutation. They were evaluated on the basis of (a) whether they reached
home and if not (b) some combination of how long they were before being hit by a
torpedo and how close they came to reaching home.
The advantage of this approach is that it generates a population of fairly diverse paths —
which do not suffer from the too-short-distance-per-step problem. That is, two elements
of the population often differ significantly in terms of how they attempt to traverse the
ocean.
There are two disadvantages to this approach.
1. Genetic operators (especially cross-over) make no sense in this context. It makes
no sense to append the end of one path to the start of another. More sophisticated
cross-over schemes are even less applicable. If one drops cross-over and works
strictly with mutation, this approach degenerates to looking for random paths
through the tree.
2. The population tends not to be diverse enough. Even though the population is
diverse, it is not diverse in an important way. If one generates random paths
through the tree, each path is likely to have more or less the same number of left
and right turns. In fact, one would get a normal distribution with most of the paths
bunched in the middle and having approximately equal left and right turns. The
paths may be significantly different from each other on a turn-by-turn basis, but
the population as a whole tends not to explore a very broad portion of the ocean.
21
Although a genetic algorithm approach turned out not to be successful, its deficiencies
suggested an approach that was successful. Two variants of that approach, which work
quite well, are described at the bottom of the page.
LRTA*
So far we have not attempted to say what we mean when we say that the searches require
visits to too many nodes. Section 4.5 approaches this problem directly. As that section
says, the searches we have examined so far are all what are called offline searches. The
assumption is that the agent (the boat in our case) performs its search using some sort of
internal computation before making any actual moves. That's possible for the boattorpedo problem because the boat knows the locations of its home base and of the
torpedoes, and it knows how the torpedoes will respond to its moves. In what Russell and
Norvig call online search,
the agent operates by interleaving computation and action; first it takes an
action, then it observes the environment and computes the next action.
The preceding, of course, is an accurate description of the boat-torpedo problem. The
boat and the torpedoes are each required to do some time-limited computation and then to
make a move in the simulated world. Consequently, one might hope that techniques that
accommodate this framework might be useful.
According to Russell and Norvig, online searches are especially useful when an agent is
attempting to explore an unknown environment. The agent has no way to learn about the
environment except by moving around in it. That is not the case in our problem. The
22
ocean is featureless; there are no hidden hazards. Furthermore, the boat knows its own
location, the locations of the torpedoes, and the location of its home base. It also knows
exactly how the torpedoes will move at each time step. Those moves will depend on the
boat's own move, but the boat knows that also. So the boat doesn't have to move around
in the environment to learn about it.
Nonetheless, section 4.5 discusses a form of search that may initially seem relevant to our
problem. It is called Learning Real Time A* (LRTA*). For our purposes, the most
relevant aspect of LRTA* is the requirement that the agent, like our boat, must act in real
time. In other words, the agent is permitted only a fixed finite number of steps (either
actual exploratory steps or internal node expansions) before it is required to make a move
of some sort.
Unfortunately, LRTA* is less powerful than the searches we have already dismissed.
That's because the agent is essentially constrained to position itself somewhere in the
frontier. Consider the LRTA* variant of simple best-first. According to simple best first
search, the agent has access to the entire frontier. As it expands the current best node, it
inserts the children at appropriate places in the frontier. In LRTA* the agent is positioned
somewhere in the frontier. The frontier cannot be ordered but corresponds roughly to the
states the agent has already visited, along perhaps with neighbors of those states. Each
frontier state is marked with an estimate of the cost to complete the path from that state.
As the agent expands a node, it learns more about the possible paths from the expanded
node. That gives it more information about the neighboring states, which it updates. But
the agent is never able to develop a bird's-eye view of the entire frontier;. it can only
follow the local gradient of the frontier to the best currently known node. Since A* itself
23
is inadequate for the boat-torpedo problem, LRTA* is even worse. Not only is the boat
still victim to the too-small-distance-per-step problem, the boat can't even jump from one
point in the frontier to another that may look more promising.
Another mismatch between LRTA* and the boat torpedo problem is that LRTA* assumes
what Russell and Norvig call a safe environment. The most salient feature of the boattorpedo problem is that the environment is decidedly not safe: the boat may be destroyed
by a torpedo. Fortunately, the boat is able to compute ahead and avoid many (but not all)
of the torpedo hazards. In LRTA* search, the only way the agent can learn about hazards
in the environment is to encounter them.
One might modify LRTA* by giving the agent the ability to sense a neighborhood around
itself. Then the agent can learn about a hazard without necessarily falling victim to it. The
real heart of the boat-torpedo problem is essentially that: to provide the boat a way of
seeing far enough ahead in its environment so that it can avoid the torpedo hazards. The
challenge is that in order to avoid a torpedo, the boat must generally make a decision
about its direction very many moves in advance of its potential encounter with the
torpedo.
One might call problems in which significant foresight is required blind alley problems.
A blind alley problem is one that has the property that once an agent determines that the
path it is exploring turns out to be a blind alley — with a torpedo at the end in our case —
it's too late to turn back. It may be that the alley has some internal options. But the
difficulty is that the torpedoes adjust their moves dynamically. Once a torpedo has the
boat in range, it's often impossible for the boat to escape no matter how it turns. Thus the
boat must anticipate blind alleys far in advance. None of the search techniques discussed
24
so far allow the boat to explore multiple distinct paths in depth. It cannot look down an
alley and determine what lurks at the end.
How in general should one approach a blind alley problem? Unless one can see through
to the solution of a problem, every path is a potential blind alley. With time-limited
computing, it's impossible to determine in general whether a path is a blind alley or not.
The only advantage the agent has is that it can project ahead and keep a number of
options open for itself. As time passes, and as it explores these options further, it may
find that some of them are blind alleys and that some of them open up to new
possibilities. The best the agent can do in a blind alley problem is to assure as best it can
that it always has a range of available options. To do that it must ensure that the
collection of options it maintains for itself is as diverse as possible. Since it doesn't know
which of the options will eventually turn out to be dead-ends, it must provide for itself as
many different kinds of options as possible.
In fact, one might want to modify the boat-torpedoes problem to be a problem in pure
survival. Instead of having as a goal to reach a home base, the goal of the boat could be
simply to avoid the torpedoes as long as possible. This would make the problem
somewhat simpler to state and somewhat purer to solve. A strategy for the boat would
then be evaluated on the basis of how long the boat stays out of the paths of the
torpedoes. Although this seems like a worthwhile change, for this page we will stay with
the original problem.
The more general case of this sort of problem is the generic strategy planning
responsibility of people at very high levels of organizations. One never knows what's
25
going to happen. So it's important to keep as many options open as possible. Of course
it's also important to act when an opportunity presents itself. So there will always be a
tension between committing to a course of action and ensuring that multiple options are
available.
Adversarial search matches two adversaries with equal search capabilities. Adversarial
search is used for game playing. Each participant knows the rules of the game; each
participant knows what the possible moves are of the other participant; and each
participant can determine how best to move to ensure its best possible result.
In adversarial search both participants construct what is called a game tree. The game tree
indicates the states that would exist were the participants to make the various possible
moves. Through a process of minimax (and its more efficient implementation called
alpha-beta pruning) each participant can determine the best possible outcome.
In games in which the tree can be explored completely, the game is determined. In
games, like chess, in which the tree is too large to explore completely, heuristics must be
developed to assign values to the nodes that can be reached in the time available.
The boat-torpedo problem may be understood as something like a one-sided adversarial
search problem. The boat is capable of making decisions about how to move. The
torpedoes respond predictably at each opportunity. So one may imagine the boat
expanding the game tree and looking for the best result. But putting the problem in those
terms doesn't help. It simply points out that the problem is how to expand the tree. We
already know that expanding the tree in any compact way (breadth-first, depth-first, best-
26
first, A*, etc.) will be unsuccessful. Too many nodes are required before one gets far
enough away from the root to be able to distinguish in a useful way among different
paths.
The chapters on planning extend search to problem spaces that have more sophisticated
transitions. Like other searches, planning is an attempt to formulate a series of moves that
will transform an initial state into a desired goal state. Planning is more complex than
other searches in that the transition rules have preconditions for their application. In some
ways this is not very different from saying that the road that leads from city 1 to city 2
requires that the agent be in city 1 before the road may be taken.
Since the boat-torpedo problem does not involve sophisticated transition rules, the
additional power that sophisticated transition rules offer is of no help in this problem.
A successful approach
Interestingly, one of the approaches that Russell and Norvig dismiss turns out to be quite
successful. It is iterated gradient ascent but with the the twist that the paths generated are
intended not primarily for finding a solution but as a way to ensure that the boat has a
diversity of options.
We have developed two versions of this approach. Under either of these approaches, the
boat proceeds as follows. At each time step it generates a diverse population of possible
paths. Actually, because the boat is always expanding a tree, it will always have a
collection of options. At each time step it generates additional options and then selects
which ones to keep.
27
It does this in real time by limiting itself to a fixed number of node expansions. All this
work is done as a computation and does not involve the boat making any actual moves.
Once the boat has generated its path options, it selects whether to move left or right. This
approach then requires two types of decisions.
1. How should the boat generate the population of paths?
2. How should the boat select in which direction to more?
Our working framework has been (and continues to be) that the boat computes a tree of
possibilities. Thus at each time step, the boat has access to the tree of nodes that it has
already computed as being accessible, i.e., as being reachable without being hit by a
torpedo. The boat must then select nodes from that tree to expand.
How should the boat select nodes to expand?
There are a number of possibilities.
1. It could pick a node at random.
2. It could pick the "best" node, where best is measured as closest to home or
farthest from the nearest torpedo, or farthest from the root, or according to some
other measure.
3. It could select from nodes that are distributed between the two subtrees of its
current root node. The argument for that strategy is that when it moves it will be
required to select either the left or right subtree of its current root position. It
therefore makes sense to attempt to explore nodes from both sides of those
subtrees before it is required to discard one of them.
28
Once the boat has a strategy for selecting nodes to expand, it must generate expansion
nodes.
How should the boat generate paths?
We developed three approaches. All of these approaches are intended to ensure that the
generated paths are sufficiently different that as a collection they will offer the boat a
reasonably diverse set of options. In all three versions, the boat generates paths from
nodes selected for expansion. It generates each path until that path is intercepted by a
torpedo or until that path reaches home.
1. In one version the boat generates paths that reach toward home. A nice way to
generate a diverse collection of long paths toward home is to realize that we are
working in a toroidal world. From any point in that world, the boat has 9 straight
line directions to its home base. To see this think of the world as a square
surrounded on each side by a copy of itself and with four more copies at each of
the four corners. From the boat's position in the middle square, which represents
the actual world, there are nine lines to the home base images in each of the nine
versions of the world. So to generate a path, the boat generates a best-first path to
one of the nine images, selected at random. This approach tends to generate
enough long paths that are different enough from each other that the boat buys
itself time. Eventually, one of the randomly generated paths will actually reach its
selected home image, and the boat will make it safely home.
2. The other version is similar except that instead of generating best-first paths to a
home image, the boat generates a collection of paths, each having a randomly
29
selected percentage of left and right turns. That is, from the selected start node,
the boat generates a path with a given random percentage of left and right turns.
This too results in the generation of a diverse collection of sufficiently long paths
that the boat buys time until it finds a way home. This approach requires that the
boat favor path starting points that are closer to home. Note that this approach
generates curves with higher or lower curvatures. A path with all right turns for
example, will generate an approximate circle that reflects the tightest circle the
boat can make.
3. If, as suggested earlier on this page, we modified the problem to be one simply of
staying alive, one could generate paths by selecting a random direction and then
have the boat turn as closely as possible so that it is moving in that direction. The
initial part of these paths will curve until the boat changes its heading to
approximate the selected direction. After that the boat will zig-zag to move in the
selected direction. This version would be similar to the approach of generating a
straight line path to one of the 9 home images except that the boat is driven by a
direction rather than a target point.
In both of these strategies, the boat continues to generate paths—terminating a path when
it is hit by a torpedo or reaches home—until it reaches its node-expansion limit for the
time tick.
How should the boat select which subtree to keep?
Once the boat generates its paths, it needn't commit itself to any one of them. All it has to
do at each time tick is to commit itself to a single move: left or right. So the boat must
30
decide not among paths but between trees. Is it better to keep the left subtree or the right
subtree? The question becomes: is there a way of evaluating (sub)trees so that the boat
can select one over the other?

One option is to select the tree that contains the "best" node, however best is
determined. But that may not be the best choice. The best node may eventually
lead to a dead end, and the other paths in that tree may not provide enough viable
options.

Another possibility is to select the tree with the most nodes. That approach is
based on the argument that the tree with the most nodes gives the boat the most
chances. But if all the nodes of a tree are essentially corralled by the torpedoes,
e.g., in a large blind alley, a simple count of the number of nodes may not be the
best choice.

Another possibility is to select the tree with the longest path. That tends not to be
too bad a choice since a long path gives the boat a great many steps during which
to generate new nodes before it eventually reaches the end of the initially selected
path. By that time another path is likely to have opened up and be the longest.

It would seem that the best choice is the (sub)tree with the most different paths.
But it's not clear how to measure trees for path variety. At this point we have not
developed an effective metric for comparing trees on this basis. We suspect that
such a metric will combine path length with path direction. We will want long
paths, and we will want paths that end up going in different directions. A subtree
that has paths with both properties will probably offer the boat the broadest
variety of viable options.
31
V. Conclusions
There has been four milestones in this project.
I.First, the objective was to put the project together and make it work for four different
search strategies: Breadth First, ST (Frontier) Search, Spider Search, and an older version
of Spider Search
II.Change parameters to make game more visible and to run faster.
III.Display lines of search paths for each search strategy.
IV.Explore more the survival capabilities of the boat by eliminating the home port and
find out how much it can manage to stay alive.
Currently work is in progress to deliver milestone number 4. All other milestones for this
project have been completed and the remaing work for this quarter is to continue working
with milestone four.
VI. References
Chapters from Russell and Norvig are most directly relevant to search.
Chapter 3. Solving Problems by Searching. This chapter discusses the basic search
techniques.
Chapter 4. Informed Search and Exploration. This chapter discusses Best-first, A*, online
search, and genetic algorithms.
Chapter 6. Adversarial Search. This chapter discusses search in which one is dealing with
an adversary that is performing a search of the same search space but with opposing
objectives.
Chapter 11. Planning. This chapter discusses search in a state space that is much less
homogeneous than those discussed earlier.
32
Chapter 12. Planning in the Real World. This chapter discusses online search in an
uncertain environment.
33