Values of Fluents

Nishant Shukla
Notes
October 21, 2016
Values of Fluents
An AOG is a model of all valid plans to achieve a fluent-change. This formulation of values establishes
a total order on the parse-graphs.
Learning Values
Denote internal fluents as F (s) , x, external fluents as F (ŝ) , u, and motor control as a. A task plan is
the optimal motion sequence that maximizes value.
a∗[1:t] = arg max V (a[1:t] )
a[1:t]
The search for a value function is facilitated by the expansion below.
∂V
∂V ∂x ∂u
=
∂a[1:t]
∂x |{z}
∂u ∂a[1:t]
|{z}
| {z }
“why” “how”
To solve the first part
∂V
∂x ,
“what”
we define a Lagrangian,
L(ẋ, x, t) = Ekinetic − Epotential = Cost(ẋ) + V (x)
minimizing the functional,
Z
∗
x (t) = arg min
x(t)
b
L(ẋ, x, t) dt
a
to use the Euler-Lagrange equation.
d ∂L ∂L
=
dt ∂ ẋ
∂x
∂L
d
=
Cost(ẋ)
∂ ẋ
dẋ
∂L
d
=
V (x)
∂x
dx
d ∂
d
Cost(ẋ) =
V (x)
dt ∂ ẋ
dx
UCLA
Last modified: October 21, 2016
1
Nishant Shukla
Notes
October 21, 2016
Where the cost of a fluent-change is defined as the expected cost of actions that can achieve it.
Z
Cost(ẋ) , Eu|ẋ [Cost(u)] =
Cost(u) P (u|ẋ) du
Ωu
And in turn, the cost of an action is recursively defined as the expected cost of a fluent-change.
Z
Cost(u) , Eẋ|u [Cost(ẋ)] =
Cost(ẋ) P (ẋ|u) dẋ
Ωẋ
We hypothesis the interlinked cost functions converge to a fixed point equilibrium, like PageRank.
Either way,
d ∂
∂V
=
∂x
dt ∂ ẋ
Z
Cost(u) P (u|ẋ) du
Ω
AOG Task Plan
A fluent-vector describes all the fluents at some time, F (s) ∈ RN . The state-space for planning is a set
of fluent-vectors, {F (s)}s . A set of actions A can transition between states.
The And-Or Graph G is a policy, and any parse-graph pg is a valid plan. There are many plans pg to
achieve the same goal: Ω = {pg | pg ∈ G}.
The optimal action sequence is driven by an optimal plan, which is defined as the parse-graph with
minimum cost.
a∗[1:t] = pg ∗ = arg min Cost(pg) = arg max V (pg) = arg max V (a[1:t] )
pg∈Ω
pg∈Ω
a[1:t] =pg∈Ω
Every fluent-vector F (s) has a value VF (s) ∈ R.
Every change in fluent ∂F = (∂f1 , ..., ∂fN ) has a cost Cost(∂F ) ∈ R+ .
UCLA
Last modified: October 21, 2016
2