Planning on robots

ROBOTICS
COE 584
Reactive Systems
Previously, on Robots …

Robots are agents in physical or virtual environments


Environments have different characteristics


Static/dynamic, accessible/inaccessible, deterministic/non-deterministic, …
Since we are lazy, we want robots to do things for us


Persistent, situated, responsive
Robots must consider task when deciding what to do
Action-selection problem:
What to do now in service of the task?
Physical Environments




Dynamic
Non-deterministic
Inaccessible
Continuous
A Crash Course in AI Planning



Planning: An approach to action-selection problem
Very long history, since the very beginning of AI
1971, seminal paper: STRIPS (Fikes and Nilsson)


Still cited and taught today, despite much progress
STRIPS originally developed for SRI robot, Shakey


This is ironic
Later on, STRIPS planning was rejected for robot control
AI Planning Definition: State
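{ On(A,table), On(B,table), On(C,table), In-hand(nil) }
[Figure: blocks A, B, and C, each on the table; hand empty]
{ On(A,B), On(B,table), On(C,table), In-hand(D) }
[Figure: block A on block B; blocks B and C on the table; block D in the hand]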
AI Planning Definition: Operators

Operators change state of the world
Preconditions, add-list, delete-list

Pick-A(?x)
Preconditions: On(A, ?x), In-hand(nil)
Add: In-hand(A)
Delete: In-hand(nil), On(A, ?x)

Put-A(?y)
Preconditions: In-hand(A), not On(?x, ?y)
Add: On(A,?y), In-hand(nil)
Delete: In-hand(A)
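The precondition / add-list / delete-list scheme can be sketched in a few lines of Python. This is only an illustration: the `Operator` class, its field names, and the choice to store facts as plain strings are assumptions made for the example, not part of STRIPS itself. The operator shown is Pick-A(?x) instantiated with ?x = table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Operator:
    """A ground (fully instantiated) STRIPS-style operator. Hypothetical helper class."""
    name: str
    preconditions: frozenset
    add_list: frozenset
    delete_list: frozenset

    def applicable(self, state: frozenset) -> bool:
        # Every precondition must be a fact in the current state.
        return self.preconditions <= state

    def apply(self, state: frozenset) -> frozenset:
        if not self.applicable(state):
            raise ValueError(f"{self.name} is not applicable")
        # Remove the delete-list facts, then add the add-list facts.
        return (state - self.delete_list) | self.add_list


# Pick-A(?x) with ?x bound to "table"
pick_a_table = Operator(
    name="Pick-A(table)",
    preconditions=frozenset({"On(A,table)", "In-hand(nil)"}),
    add_list=frozenset({"In-hand(A)"}),
    delete_list=frozenset({"In-hand(nil)", "On(A,table)"}),
)

state = frozenset({"On(A,table)", "On(B,table)", "On(C,table)", "In-hand(nil)"})
next_state = pick_a_table.apply(state)
print(sorted(next_state))
```

After the operator is applied, A is no longer on the table and the hand holds A, so the same operator is no longer applicable.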
STRIPS Planning

Given:




Initial state of the world
Operators
Goal state
Produce:


Plan: Ordered list of instantiated operators
Will change the world from initial state to goal state
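To make the input/output contract concrete, here is a minimal planner sketch for a three-block world. It uses plain breadth-first forward search, which is not the original STRIPS search procedure; it is swapped in because it is short and returns a shortest plan. The `Op` tuple, the `neg` field for negative preconditions, and two modelling choices (a block can be picked only if nothing is on it, and the table always has room) are assumptions of this example.

```python
from collections import deque
from typing import NamedTuple


class Op(NamedTuple):
    name: str
    pre: frozenset     # facts that must hold
    neg: frozenset     # facts that must NOT hold (negative preconditions)
    add: frozenset
    delete: frozenset


BLOCKS = ["A", "B", "C"]


def ground_operators():
    """Instantiate Pick-X(?x) and Put-X(?y) for every block and support."""
    ops = []
    for b in BLOCKS:
        others = [o for o in BLOCKS if o != b]
        for src in others + ["table"]:
            # Assumption: b can be picked only if nothing is on top of it.
            ops.append(Op(
                f"Pick-{b}({src})",
                frozenset({f"On({b},{src})", "In-hand(nil)"}),
                frozenset(f"On({o},{b})" for o in others),
                frozenset({f"In-hand({b})"}),
                frozenset({f"On({b},{src})", "In-hand(nil)"}),
            ))
        for dst in others + ["table"]:
            # Assumption: the table always has room; a block must be clear.
            neg = frozenset() if dst == "table" else frozenset(
                f"On({o},{dst})" for o in BLOCKS if o != dst)
            ops.append(Op(
                f"Put-{b}({dst})",
                frozenset({f"In-hand({b})"}),
                neg,
                frozenset({f"On({b},{dst})", "In-hand(nil)"}),
                frozenset({f"In-hand({b})"}),
            ))
    return ops


def plan(initial, goal, ops):
    """Breadth-first forward search; returns a shortest list of operator names."""
    frontier = deque([(initial, [])])
    seen = {initial}
    while frontier:
        state, steps = frontier.popleft()
        if goal <= state:
            return steps
        for op in ops:
            if op.pre <= state and not (op.neg & state):
                nxt = (state - op.delete) | op.add
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append((nxt, steps + [op.name]))
    return None


initial = frozenset({"On(A,table)", "On(B,table)", "On(C,table)", "In-hand(nil)"})
goal = frozenset({"On(B,A)", "On(C,B)"})
result = plan(initial, goal, ground_operators())
print(result)
```

The result is an ordered list of instantiated operators, which is exactly the kind of output the slide describes.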
Planning Example
Initial State
After Pick-B(table)
After Put-B(A)
After Pick-C(table)
After Put-C(B)
Goal State
[Figure: block configuration at each step; B is moved from the table onto A, then C is moved from the table onto B]
Planning on robots



Sense initial state using sensors
Create a full plan given goal state (given task)
Feed plan, step by step, to motors

No need to sense again
What’s wrong with this?
(Hint: Think about Schoppers’ paper)
Deliberative Control
Sense → Model → Think → Act
Deliberative:



Has internal state (typically a model of the world)
Uses this internal state to make decisions
Decisions made between alternatives
When plans go wrong

Dynamic environment
State changes even if no operator applied

Non-deterministic
State changes not according to operator specs

Inaccessible
Cannot sense entire state of the world

Continuous
Predicate-based description of world is discrete
Reactive control
Sense → Hard Wiring → Act
Reactive:




No internal state
Direct connection from sensors to actions
S-R (stimulus response) systems
No choices, no alternatives
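A stimulus-response controller can be as small as a single pure function. The sketch below is illustrative only: the sensor keys (`bumper_pressed`, `obstacle_distance_m`), the 0.5 m threshold, and the action names are all hypothetical.

```python
def reactive_controller(sensors: dict) -> str:
    """Map the current sensor readings directly to an action.

    No memory, no world model: the same readings always give the same action.
    """
    if sensors["bumper_pressed"]:
        return "reverse"
    if sensors["obstacle_distance_m"] < 0.5:   # hypothetical threshold
        return "turn_left"
    return "forward"


print(reactive_controller({"bumper_pressed": False, "obstacle_distance_m": 2.0}))
```

Because the function keeps nothing between calls, it cannot notice that it has been in the same situation before.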
Universal Planning

Have a plan ready for any possible contingency
Scouts: Be prepared!
From any initial state, know how to get to goal state

Input: Operators, goal state
Do not need to give initial state

Output: Decision tree
What operator to take, depending on environment state
Not a single ordered list of operators
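One way to see how such a plan could be built without an initial state is to search backward from the goal and record, for every state, the first operator on a shortest path to it. The sketch below does this for a two-block world. It is not Schoppers' algorithm and it produces a lookup table rather than a decision tree; the helper names (`fs`, `STATES`, `successors`, `universal_plan`) and the explicit enumeration of the five legal states are assumptions of the example.

```python
from collections import deque


def fs(*facts):
    return frozenset(facts)


# All legal states of a two-block world (enumerated by hand for this sketch).
STATES = [
    fs("On(A,table)", "On(B,table)", "In-hand(nil)"),
    fs("On(B,table)", "In-hand(A)"),
    fs("On(A,table)", "In-hand(B)"),
    fs("On(A,B)", "On(B,table)", "In-hand(nil)"),
    fs("On(B,A)", "On(A,table)", "In-hand(nil)"),
]


def successors(state):
    """Yield (operator_name, next_state) for each applicable Pick/Put."""
    for b, other in (("A", "B"), ("B", "A")):
        # Pick: hand empty and nothing on top of b.
        if "In-hand(nil)" in state and f"On({other},{b})" not in state:
            for src in (other, "table"):
                if f"On({b},{src})" in state:
                    nxt = (state - {f"On({b},{src})", "In-hand(nil)"}) | {f"In-hand({b})"}
                    yield f"Pick-{b}({src})", nxt
        # Put: holding b; the other block or the table can receive it.
        if f"In-hand({b})" in state:
            for dst in (other, "table"):
                nxt = (state - {f"In-hand({b})"}) | {f"On({b},{dst})", "In-hand(nil)"}
                yield f"Put-{b}({dst})", nxt


def universal_plan(states, is_goal, successors):
    """Backward breadth-first search from the goal states.

    Returns a dict mapping every non-goal state that can reach the goal
    to the first operator of a shortest path.
    """
    preds = {s: [] for s in states}
    for s in states:
        for op, nxt in successors(s):
            preds[nxt].append((s, op))
    dist = {s: 0 for s in states if is_goal(s)}
    frontier = deque(dist)
    policy = {}
    while frontier:
        s = frontier.popleft()
        for prev, op in preds[s]:
            if prev not in dist:
                dist[prev] = dist[s] + 1
                policy[prev] = op
                frontier.append(prev)
    return policy


policy = universal_plan(STATES, lambda s: "On(A,B)" in s, successors)
for state, op in policy.items():
    print(sorted(state), "->", op)
```

Note that no initial state is passed in: the table answers "what should I do now?" for every state, including states the robot was never expected to be in.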
Universal planning algorithm
[Figure: the block configurations from the planning example (Initial State, After Pick-B(table), After Put-B(A), After Pick-C(table), Goal State)]
Robot Control Algorithm Using Universal Planning




Robot given task (goal, operators)
Uses universal planner to create universal plan
Robot senses environment
Goal state reached?
No: Execute operator according to decision tree
Yes: Keep sensing (stay persistent)
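The control loop above can be sketched as follows. Everything here is a toy: `ToyWorld` is a hypothetical one-dimensional world (positions 0 to 3, goal at 3) with one scripted disturbance, and the universal plan is a hand-written table rather than the output of a planner.

```python
def control_loop(policy, is_goal, sense, act, steps):
    """Sense, consult the table, act. The loop never replans."""
    log = []
    for _ in range(steps):
        state = sense()
        if is_goal(state):
            log.append((state, None))   # goal holds: stay idle but keep sensing
            continue
        op = policy[state]
        act(op)
        log.append((state, op))
    return log


class ToyWorld:
    """Hypothetical 1-D world: position 0..3, goal is position 3."""

    def __init__(self):
        self.pos = 0
        self.tick = 0

    def sense(self):
        return self.pos

    def act(self, op):
        self.tick += 1
        if op == "right":
            self.pos += 1
        if self.tick == 2:      # scripted disturbance: pushed back to the start
            self.pos = 0


world = ToyWorld()
policy = {0: "right", 1: "right", 2: "right"}   # hand-written universal plan
log = control_loop(policy, lambda s: s == 3, world.sense, world.act, steps=7)
print(log)
```

The disturbance at the second action sends the robot back to position 0, and the loop simply picks up from there because the table has an entry for every state.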
Advantages of Universal Planning

Guaranteed to use optimal (shortest) plan to goal




A very good thing!
Optimal solution to action selection problem
Robust to failures
Robust in dynamic and non-deterministic domains
Problems with Universal Planning



Assumes accessibility
Assumes perfect sensors
Assumes discrete actions (operators)
Universal plan as mapping sensors to actions


Universal plan can be viewed as a function
Sensor readings to actions
u: S → A

Essentially a table: For each state, give action

Schoppers uses a decision-tree representation
Problems: Planning Time


What is the planning time?
Planning time grows with the number of states
Since we have to choose an operator for every state

What is the number of states in an environment?
Worst case: All possible combinations of sensor readings (state predicates)
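A quick calculation shows how fast the worst case grows. The sketch assumes each state predicate is a boolean that can be true or false independently of the others, so n predicates give 2^n combinations; the function name is made up for the example.

```python
def worst_case_states(num_predicates: int) -> int:
    """Number of combinations of n independent boolean predicates."""
    return 2 ** num_predicates


for n in (10, 20, 30):
    print(n, "predicates ->", worst_case_states(n), "states")
```

A universal plan stored as a table needs one entry per state, so in the worst case its size grows just as quickly.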
Problems: Universal Plan Size

Plan size grows with the number of possible states

“Curse of dimensionality”
[Figure: decision tree that branches on sensor predicates X1 and X2, with Pick and Put operators as the selected actions]
Problems: Stupid executor

Schoppers:
Baby goes around knocking blocks around?

Ginsberg:
What if baby repeatedly knocks down the same block?
Universal plans may get into cycles
This is because no deliberation is done

Universal planner relies on simple executor
Sense, consult table, act
Same as regular planner, except for sensing
Brooks’ Subsumption Architecture
Multiple levels of control: Behaviors
Plan changes
Identify objects
Monitor Change
Map
Explore
Wander
Avoid Object
Why does this work?

It breaks the ideal universal plan into behaviors
Avoids the curse of dimensionality
Behaviors (levels) interact to generate behavior
Note that programmer is responsible for task-oriented design

Goes both below and above universal plans
Hand programmed: approximate plan
Not automatically generated
Subsuming Layers

How to make sure overall output is coherent?


e.g., avoid object is in conflict with explore
Subsumption hierarchy: Higher levels modify lower
Map
Explore
Wander
Avoid Object
Coherence using subsumption
Key principle
Higher layers control inputs/outputs of lower layers

In practice:
Can be difficult to design
A lot depends on programmer
Single module per layer can be restrictive
Increasingly challenging to control at higher levels
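The key principle can be illustrated with three toy layers. This is a heavily simplified sketch, not Brooks' implementation (which wires finite-state-machine modules together with suppression and inhibition links); the function names, arguments, and action strings are all hypothetical.

```python
def wander(tick: int) -> str:
    """Level 1: propose a heading, alternating over time."""
    return "left" if tick % 2 else "right"


def explore(target_visible: bool):
    """Level 2: propose a heading toward a target, or None when it has nothing to say."""
    return "toward_target" if target_visible else None


def avoid(heading: str, obstacle_near: bool) -> str:
    """Level 0: pass the heading through unless an obstacle is near."""
    return "turn_away" if obstacle_near else heading


def subsumption_step(tick: int, obstacle_near: bool, target_visible: bool) -> str:
    heading = wander(tick)
    override = explore(target_visible)
    if override is not None:
        # The higher layer replaces the lower layer's output.
        heading = override
    # The lowest layer keeps running and gets the final say near obstacles.
    return avoid(heading, obstacle_near)


print(subsumption_step(0, obstacle_near=False, target_visible=False))
print(subsumption_step(0, obstacle_near=False, target_visible=True))
print(subsumption_step(0, obstacle_near=True, target_visible=True))
```

Even in this toy version the design burden is visible: the programmer decides by hand which layer may override which, and adding a fourth layer means revisiting those decisions.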
Irony

Ginsberg’s article pretty much killed universal planning


Though occasional papers still published
Reactive control very popular in practice


But due to theory problem, no more automated planners!
So we get lots of reactive plans, but no planning
Irony again

Ginsberg was right:



Approximating universal plan is possible
Tends to be useful only in fairly low-level locomotion control
Approximation is what Brooks had done

Which is why he often gets the credit for the revolution
Starting next week
Behavior-Based Control:
Expanding on Subsumption