Segmentation of Human Motion into Dynamics - JHU

Segmentation of Human Motion into Dynamics Based Primitives with Application to Drawing Tasks
D. Del Vecchio, R.M. Murray, P. Perona
Alvina Goh
Reading group: 07/10/06
Motivation
• Develop a framework for the decomposition of human
motion using tools from dynamical systems and systems
identification.
• The approach in the paper by Bregler and Malik
– does not include an input, and
– is therefore applicable only to periodic or stereotypical motions, such as
walking and running, where the motion is always the same and
movemes are repeatable segments of the trajectory.
Aim of Paper
• Build an alphabet of movemes which one can compose to
represent and describe human motion similar to phonemes
used in speech.
• Segmentation and Classification:
Can a continuous trajectory of the human body be
decomposed automatically into its component movemes?
Dynamical Definition of Moveme
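• Basic definitions and properties
Let $M(\Theta)$ denote an LTI system class parameterized by $\Theta \in \mathcal{E}$, where $\mathcal{E}$ is a linear space.
Let $\mathcal{U}$ denote a class of inputs.
Let $y(t) = Y(M(\Theta)|_{u,x_0})(t)$ for $t \ge t_0$ denote the output of $M(\Theta)$, with the parameters $\Theta \in \mathcal{E}$, input $u \in \mathcal{U}$, and initial conditions already chosen.
Let $\theta \in \mathcal{E}' \subset \mathcal{E}$ be a parameter lying in a subspace of $\mathcal{E}$.
Define the map $\Upsilon : \mathcal{E} \to \mathcal{E}'$ and write $\theta = \Upsilon(\Theta)$ for the transformation from $\Theta \in \mathcal{E}$ to the reduced set of parameters $\theta \in \mathcal{E}'$.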
Dynamical Definition of Moveme
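• Definition of a moveme
Definition 2.1: Let $\mathcal{M}^1 = \{M(\Theta) \mid \theta \in \mathcal{C}^1\}$ and $\mathcal{M}^2 = \{M(\Theta) \mid \theta \in \mathcal{C}^2\}$ denote two subsets of $\mathcal{M}$, with $\mathcal{C}^j \subset \mathcal{E}'$ for $j = 1, 2$.
$\mathcal{M}^1$ and $\mathcal{M}^2$ are dynamically independent if
1. the class of systems $\mathcal{M}$ and the class of inputs $\mathcal{U}$ are such that
$$Y(M(\Theta_1)|_{u_1,x_0})(t) = Y(M(\Theta_2)|_{u_2,x_0})(t), \ \forall t \ge t_0 \iff (\Theta_1, u_1) = (\Theta_2, u_2)$$
for $u_1 \in \mathcal{U}$ and $u_2 \in \mathcal{U}$;
2. the sets $\mathcal{C}^1$ and $\mathcal{C}^2$ are nonempty, bounded, and have trivial intersection, i.e. $\mathcal{C}^1 \cap \mathcal{C}^2 = \emptyset$.
Each of the elements of a set $\mathcal{M} = \{\mathcal{M}^1, \dots, \mathcal{M}^l\}$ of mutually dynamically independent model sets is called a moveme.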
Dynamical Definition of Moveme
• Model used in the paper
The model class $\mathcal{M}$ and input $u$ are taken to be asymptotically stable linear systems driven by a unit step input with full state output:
$$\dot{x} = Ax + b, \qquad y = x \qquad (1)$$
where $A \in \mathbb{R}^{n \times n}$, $x = (x_1, \dots, x_n) \in \mathbb{R}^n$, $b \in \mathbb{R}^n$,
such that $\Theta = (A|b) \in \mathcal{E} = \mathbb{R}^{n \times (n+1)}$ and $\theta = A \in \mathcal{E}' = \mathbb{R}^{n \times n}$ with $\Upsilon(A|b) = A$.
Assumption 2.1: Given $x(t)$ as the output of the model in (1), we assume that the initial condition $x_0$ is such that, for any $v \in \mathbb{R}^{n+1}$, $v^T \bar{x}(t) = 0$ for $t \in [t_1, t_2]$, $t_2 > t_1$, implies $v = 0$, where $\bar{x} = (x^T, 1)^T$.
This assumption implies that (1) is minimal in the sense that $x(t)$ cannot be described by a lower-order dynamical system, as $x_n(t)$ cannot be a linear combination of $x_1(t), \dots, x_{n-1}(t)$.
Dynamical Definition of Moveme
A consequence of Assumption 2.1 is that there is a one-to-one correspondence between $x(t)$ and the parameters $(A|b)$ of model (1), giving us the following lemma.
Lemma 2.1: Let $x(t)$ and $z(t)$ be generated by two LTI systems
$$\dot{x} = A_1 x + b_1, \qquad \dot{z} = A_2 z + b_2 \qquad (2)$$
and let Assumption 2.1 hold. Then $z(t) = x(t)$ for all $t$ $\iff$ $(A_1|b_1) = (A_2|b_2)$.
Proof: ($\Leftarrow$) If $(A_1|b_1) = (A_2|b_2)$, then $z(t) = x(t)$ for all $t$ by uniqueness of solutions.
($\Rightarrow$) If $z(t) = x(t)$ for all $t$, then $\dot{z}(t) = \dot{x}(t)$ for all $t$, so $[(A_1|b_1) - (A_2|b_2)]\,\bar{x}(t) = 0$ for all $t$, which implies $(A_1|b_1) = (A_2|b_2)$ since Assumption 2.1 is satisfied.
Construction of Set of Movemes
Lemma 2.1 shows that Property 1 of Definition 2.1 is satisfied by the choice of $\mathcal{M}$ and $\mathcal{U}$.
By choosing $\mathcal{C}^j$, $j = 1, \dots, m$, as balls in $\mathbb{R}^{n \times n}$ with centers $A_c^j \in \mathbb{R}^{n \times n}$ and radii $r_j$ such that
$$\mathcal{C}^j = B_{r_j}(A_c^j), \quad j = 1, \dots, m; \qquad \mathcal{C}^j \cap \mathcal{C}^k = \emptyset, \quad j \ne k \qquad (3)$$
where $m$ is the number of movemes and the matrix norm is the Frobenius norm, we satisfy Property 2 of Definition 2.1.
Thus we have constructed a set $\mathcal{M} = \{\mathcal{M}^1, \dots, \mathcal{M}^m\}$ of $m$ movemes, where $\mathcal{M}^k = \{M((A|b)) \mid A \in \mathcal{C}^k\}$ for $k \in \{1, \dots, m\}$ and $M$ is in the form given by (1).
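As a quick check on the disjointness requirement in (3): two closed balls in a normed space are disjoint exactly when the distance between their centers exceeds the sum of their radii. The sketch below tests this pairwise under the Frobenius norm. The function names and the example centers and radii are illustrative assumptions, not values from the paper.

```python
import math

def frobenius_distance(a, b):
    """Frobenius norm of the difference of two equally sized matrices (lists of rows)."""
    return math.sqrt(sum((x - y) ** 2
                         for row_a, row_b in zip(a, b)
                         for x, y in zip(row_a, row_b)))

def balls_are_disjoint(centers, radii):
    """True if every pair of closed balls B_{r_j}(center_j) has empty intersection."""
    for j in range(len(centers)):
        for k in range(j + 1, len(centers)):
            # Balls overlap when the centers are no farther apart than r_j + r_k.
            if frobenius_distance(centers[j], centers[k]) <= radii[j] + radii[k]:
                return False
    return True

# Hypothetical moveme centers (both Hurwitz) and radii, chosen only for illustration.
centers = [[[-1.0, 0.0], [0.0, -1.0]],
           [[-5.0, 0.0], [0.0, -5.0]]]
radii = [1.0, 1.5]
print(balls_are_disjoint(centers, radii))
```

With radii enlarged to 3.0 each, the same centers would no longer give a valid set of movemes, because the two balls would overlap.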
Classification: Noiseless Case
Now, if we are given any signal $x(t)$, we would like to determine a good representative of this signal in the space of models given in (1). This is done by minimizing the following cost function:
$$(\hat{A}|\hat{b}) = \arg\min_{(A|b)} \frac{1}{2} \int_{t_0}^{T} (\dot{x} - (A|b)\bar{x})^T (\dot{x} - (A|b)\bar{x})\, dt \qquad (4)$$
with $\bar{x} = (x^T, 1)^T$.
The solution to this quadratic minimization problem is given as
$$(\hat{A}|\hat{b}) = \left( \int_{t_0}^{T} \dot{x}(t)\,\bar{x}(t)^T dt \right) \left( \int_{t_0}^{T} \bar{x}(t)\,\bar{x}(t)^T dt \right)^{-1}$$
Assumption 2.1 ensures that $\left( \int_{t_0}^{T} \bar{x}(t)\,\bar{x}(t)^T dt \right)^{-1}$ exists.
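A minimal numerical sketch of this estimate for the scalar case $n = 1$, where $\bar{x} = (x, 1)^T$ and the matrix to invert is 2x2. The integrals are replaced by sums over equally spaced samples (an assumption; the slides state only the continuous-time formula), and the function name and test trajectory are hypothetical.

```python
import math

def estimate_scalar_model(xs, xdots):
    """Least-squares fit of xdot = a*x + b from equally spaced samples.

    Discretizes (a|b) = (sum xdot*xbar^T)(sum xbar*xbar^T)^{-1} with
    xbar = (x, 1)^T; the common time step cancels between the two sums.
    """
    sxx = sum(x * x for x in xs)                  # sum of x^2
    sx = sum(xs)                                  # sum of x
    s1 = float(len(xs))                           # sum of 1
    sdx = sum(d * x for d, x in zip(xdots, xs))   # sum of xdot * x
    sd = sum(xdots)                               # sum of xdot
    det = sxx * s1 - sx * sx                      # determinant of the 2x2 Gram matrix
    if abs(det) < 1e-12:
        raise ValueError("Gram matrix is singular: the trajectory is constant")
    a_hat = (sdx * s1 - sd * sx) / det
    b_hat = (sd * sxx - sdx * sx) / det
    return a_hat, b_hat

# Hypothetical noiseless trajectory of xdot = a*x + b with x(0) = 0.
a_true, b_true = -2.0, 4.0
ts = [0.01 * k for k in range(200)]
xs = [2.0 - 2.0 * math.exp(a_true * t) for t in ts]   # closed-form solution
xdots = [a_true * x + b_true for x in xs]             # exact derivative samples
a_hat, b_hat = estimate_scalar_model(xs, xdots)
print(a_hat, b_hat)
```

Because the derivative samples satisfy the model exactly, the estimate recovers the true parameters up to rounding. A constant trajectory makes the Gram matrix singular, which is the situation Assumption 2.1 rules out.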
Classification: Noiseless Case
This gives the least squares estimate of the parameters $(\hat{A}|\hat{b})$, from which we obtain the estimate of $x$ in model class (1) as
$$\dot{\hat{x}} = \hat{A}\hat{x} + \hat{b}, \qquad \hat{x}(t_0) = x(t_0)$$
To classify $x(t)$ as the output of moveme $\mathcal{M}^j$, note that in the noiseless case $(\hat{A}|\hat{b}) = (A|b)$, so we only have to find the $k \in \{1, \dots, j, \dots, m\}$ such that $\hat{A} \in \mathcal{C}^k$.
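The membership test can be sketched as a search over the moveme balls: return the index whose center lies within its radius of the estimated matrix, measured in the Frobenius norm. The function name, the convention of returning `None` when no ball contains the estimate, and the example centers are assumptions made for illustration.

```python
import math

def classify(a_hat, centers, radii):
    """Return the index k with ||a_hat - centers[k]||_F <= radii[k], or None."""
    for k, (center, radius) in enumerate(zip(centers, radii)):
        dist = math.sqrt(sum((x - y) ** 2
                             for row_a, row_c in zip(a_hat, center)
                             for x, y in zip(row_a, row_c)))
        if dist <= radius:
            return k  # balls are disjoint, so at most one index can match
    return None

# Hypothetical moveme centers and radii (disjoint balls).
centers = [[[-1.0, 0.0], [0.0, -1.0]],
           [[-5.0, 0.0], [0.0, -5.0]]]
radii = [1.0, 1.5]

print(classify([[-1.2, 0.1], [0.0, -0.9]], centers, radii))  # estimate near the first center
print(classify([[-3.0, 0.0], [0.0, -3.0]], centers, radii))  # estimate outside both balls
```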
Classification: Perturbed Case
Now consider the signal $x(t)$ generated by
$$\dot{x} = Ax + b + d(t), \qquad y = x$$
with $A \in \mathcal{C}^j$ for some $j \in \{1, \dots, m\}$, where $d(t)$ is a bounded realization of white noise.
Note that the least squares estimate is given by
$$(\hat{A}|\hat{b}) = \left( \int_{t_0}^{T} \dot{x}(t)\,\bar{x}(t)^T dt \right) \left( \int_{t_0}^{T} \bar{x}(t)\,\bar{x}(t)^T dt \right)^{-1}$$
The inverse $\left( \int_{t_0}^{T} \bar{x}(t)\,\bar{x}(t)^T dt \right)^{-1}$ exists in the noiseless case $d(t) = 0$ by Assumption 2.1 and remains well defined when $d(t) \ne 0$ because $d(t)$ is a realization of white noise that is uncorrelated in time.
Classification: Perturbed Case
In addition, since $A \in \mathcal{C}^j$, there exists $\delta < r_j$ such that $A = A_c^j + \delta U$, with $U$ a unit norm matrix and $A_c^j$ the center of $\mathcal{C}^j$. We now get
$$\dot{x} = (A_c^j + \delta U)x + b + d(t), \qquad y = x \qquad (5)$$
Thus we need to identify $j$ in (5) under some conditions on $\delta$ and $d(t)$. Note that if $d(t) = 0$ then we can exactly identify $A_c^j + \delta U$ and correctly classify $x(t)$.
Under what conditions on $A$ and $d(t)$ can $x(t)$ be classified as the output of moveme $\mathcal{M}^j$? The noise $d(t)$ induces an estimation error such that $\hat{A} \ne A_c^j + \delta U$; however, as the following lemma shows, equality is not required for correct classification.
Lemma 2.2: Let $x(t)$, $t \in [t_0, T]$, be generated by (5), where $A_c^j$ is the center of $\mathcal{C}^j$ for some $j \in \{1, \dots, m\}$ in (3). Let $\hat{A}$ be the least squares estimate according to (4). There exist positive constants $\bar{d}$ and $\bar{\delta}$ such that if $\delta \le \bar{\delta}$ and $\|d(t)\| \le \bar{d}$, then
$$\arg_{k \in \{1, \dots, j, \dots, m\}} \{ \|\hat{A} - A_c^k\| \le r_k \} = j$$
Segmentation
Aim: Obtain sufficient conditions on noise level and parameter uncertainty that allow off-line determination of
1. the sequence of switching times $\{\tau_1, \dots, \tau_{l-1}\}$ ($\tau_0$ is the known starting time, $\tau_l$ is the known ending time), and
2. the sequence of matrices $\{A_1, \dots, A_l\}$ from the observation of the state $x$.
Once we have these, we simply apply Lemma 2.2 to classify.
Segmentation
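Consider the sequence of systems for $i = 0, \dots, l$
$$\begin{aligned} \dot{x} &= (A_i + \delta U_i)x + b_i + d(t), & t &\in [\tau_{i-1}, \tau_i) \\ \dot{x} &= (A_{i+1} + \delta U_{i+1})x + b_{i+1} + d(t), & t &\in [\tau_i, \tau_{i+1}) \end{aligned} \qquad (6)$$
with $x \in \mathbb{R}^n$, $A_i \in \mathbb{R}^{n \times n}$, $b_i \in \mathbb{R}^n$, $U_i \in \mathbb{R}^{n \times n}$ unit norm matrices, $\delta \in \mathbb{R}$ modeling the uncertainty, and $d(t)$ a realization of white noise.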
Segmentation Assumptions
We make the following assumptions:
Assumption 3.1: $A_i \in \mathbb{R}^{n \times n}$ is an unknown matrix whose value lies in the set of known Hurwitz matrices $\{A_c^1, \dots, A_c^m\}$, which are the centers of the balls $\mathcal{C}^j = B_{r_j}(A_c^j)$ with $\mathcal{C}^j \cap \mathcal{C}^l = \emptyset$ for $j \ne l$. The modeling uncertainty $\delta$ is such that $(A_i + \delta U_i) \in \mathcal{C}^j$ for some $j$.
(A square matrix $A$ is called a Hurwitz matrix if all eigenvalues of $A$ have strictly negative real part.)
Segmentation Assumptions
Assumption 3.2: Modeling uncertainty and disturbance are bounded: $|\delta| \le \bar{\delta}$ and $\|d(t)\| \le \bar{d}$.
Assumption 3.3: The vectors $b_i \in \mathbb{R}^n$ are unknown constant vectors, and the $\tau_i$ are unknown switching times, with $\tau_0$ the known starting time and $\tau_l$ the known ending time. The total number of switching times $l$ is unknown.
Segmentation Assumptions
Assumption 3.4: The nominal system obtained when noise and parameter uncertainty go to zero (for which Assumption 2.1 is satisfied),
$$\begin{aligned} \dot{x} &= A_i x + b_i, & t &\in [\tau_{i-1}, \tau_i) \\ \dot{x} &= A_{i+1} x + b_{i+1}, & t &\in [\tau_i, \tau_{i+1}) \end{aligned} \qquad (7)$$
satisfies the interconnection condition
$$\frac{\dot{x}(\tau_i^-)^T \dot{x}(\tau_i^+)}{\|\dot{x}(\tau_i^-)\|\,\|\dot{x}(\tau_i^+)\|} \le \rho_0 < 1$$
where
$$\dot{x}(\tau_i^-) = \lim_{\tau \to \tau_i^-} \dot{x}(\tau), \qquad \dot{x}(\tau_i^+) = \lim_{\tau \to \tau_i^+} \dot{x}(\tau)$$
This condition also gives a bound on the discontinuity in the trajectory's derivative at the switching point: the left-hand side is the cosine of the angle between the two derivative vectors.
Segmentation Solution
We propose an iterative approach in which we look for the maximizer of a function $W$ defined on $[t_0, t_M]$, where $t_M = \tau_l$ and $t_0$ is the starting time, coinciding with $\tau_0$ at the first iteration.
The least squares estimate for $x(t)$, $t \in [t_0, \tau]$, is
$$(\hat{A}|\hat{b})(\tau, t_0) = \left( \int_{t_0}^{\tau} \dot{x}(t)\,\bar{x}(t)^T dt \right) \left( \int_{t_0}^{\tau} \bar{x}(t)\,\bar{x}(t)^T dt \right)^{-1}$$
which generates the system $\dot{\hat{x}} = \hat{A}(\tau, t_0)\hat{x} + \hat{b}(\tau, t_0)$ with $\hat{x}(t_0) = x(t_0)$.
Segmentation Solution
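Define the following three quantities before we can define $W$.
1. Transition factor
$$\mathrm{Tr}(\tau) = \frac{1}{2}\left(1 - \frac{\dot{x}_{av}(\tau^-)^T \dot{x}_{av}(\tau^+)}{\|\dot{x}_{av}(\tau^-)\|\,\|\dot{x}_{av}(\tau^+)\|}\right)$$
where $\dot{x}_{av}(\tau^-) = \frac{1}{\Delta\tau}\int_{\tau - \Delta\tau}^{\tau} \dot{x}(t)\,dt$ and $\dot{x}_{av}(\tau^+) = \frac{1}{\Delta\tau}\int_{\tau}^{\tau + \Delta\tau} \dot{x}(t)\,dt$.
2. Parametric error at time $\tau$
$$e_p(\tau, t_0) = \min_{j = 1, \dots, m} \|\hat{A}(\tau, t_0) - A_c^j\|$$
3. Approximation error at time $\tau$
$$e_a(\tau, t_0) = \frac{1}{\tau - t_0}\int_{t_0}^{\tau} (x - \hat{x})^T (x - \hat{x})\,dt$$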
Segmentation Solution
We want a function whose maximizer falls in an interval $I$ around the first switching time encountered after time $t_0$; the length of this interval should go to zero when there is no uncertainty.
The function chosen is
$$W(\tau, t_0) = \frac{\exp\!\left(-e_p(\tau, t_0)^2 / \sigma^2\right)\mathrm{Tr}(\tau)}{a + e_a(\tau, t_0)}, \qquad \tau \in (t_0, t_M]$$
$W$ has a high value at $\tau$ where there is
1. a small approximation error $e_a(\tau, t_0)$,
2. a small parametric error $e_p(\tau, t_0)$, and
3. a high transition factor $\mathrm{Tr}(\tau)$.
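The way the three quantities combine can be sketched directly from the formula. The default values of `sigma` and `a` below are placeholders: the slides do not give numerical values for these constants, so they are assumptions.

```python
import math

def potential(e_p, e_a, tr, sigma=1.0, a=1e-3):
    """W = exp(-e_p^2 / sigma^2) * Tr / (a + e_a) for given error values."""
    return math.exp(-(e_p ** 2) / sigma ** 2) * tr / (a + e_a)

# W shrinks when either error grows and grows with the transition factor.
w_ideal = potential(e_p=0.0, e_a=0.0, tr=1.0, a=0.5)
w_param_error = potential(e_p=1.0, e_a=0.0, tr=1.0, a=0.5)
w_approx_error = potential(e_p=0.0, e_a=0.5, tr=1.0, a=0.5)
print(w_ideal, w_param_error, w_approx_error)
```

The constant `a` keeps the denominator positive when the approximation error vanishes, and `sigma` sets how quickly the parametric error suppresses the value.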
Main Theorem in Segmentation
The following theorem shows that $W$ has the desired properties.
Theorem 4.1: Consider the sequence of dynamical systems given in (6) subject to Assumptions 3.1-3.4. Let the function $W(\tau, t_0)$ be defined as
$$W(\tau, t_0) = \frac{\exp\!\left(-e_p(\tau, t_0)^2 / \sigma^2\right)\mathrm{Tr}(\tau)}{a + e_a(\tau, t_0)}, \qquad \tau \in (t_0, t_M]$$
for $t_0 = \tau_{i-1}$ and $t_M = \tau_l$. Then there exist bounds $\delta^*$ and $d^*$ such that if $\bar{\delta} \le \delta^*$ and $\bar{d} \le d^*$, the potential function $W(\tau)$ admits its global maximizer $\hat{\tau}_i$ for $\hat{\tau}_i \in I = [\tau_i - \Delta\tau,\ \tau_i - \Delta\tau + \ldots]$, where $I$ contracts to $\tau_i$ as $\bar{\delta} \to 0$ and $\bar{d} \to 0$.
Moreover, the estimated class $\hat{j}$ of the segment in $[t_0, \hat{\tau}_i]$ is equal to the class of the $i$th segment generated by the system (6).
The proofs of Theorem 4.1 and the following lemmas are found in Decomposition of
Human Motion into Dynamics Based Primitives with Application to Drawing Tasks, D.
Del Vecchio, R. M. Murray, and P. Perona. Automatica, vol. 39(12), pp. 2085-2098,
2003.
Lemmas Used in Proof
The following two lemmas hold for the nominal system given in (7).
Lemma 4.1: Consider the system given in (7) (the nominal noiseless system). There exists $k_1 > 0$ such that
$$\|(\hat{A}|\hat{b})(\tau) - (A_i|b_i)\|^2 \ge k_1 (\tau - \tau_i)^2, \qquad \tau_i < \tau < \tau_{i+1}$$
This lemma establishes that the closer $\tau$ is to the switching time $\tau_i$, the smaller the lower bound on the parameter estimation error for (7).
Lemma 4.2: Consider the system given in (7) and the approximation error $e_a(\tau)$. There exists $k_2 > 0$ such that
$$e_a(\tau) \ge k_2 \|(\hat{A}|\hat{b})(\tau) - (A_i|b_i)\|^2, \qquad \tau_i < \tau < \tau_{i+1}$$
This lemma establishes that, for the nominal system, the lower bound on the approximation error increases as the parameter estimates move away from the parameters $A_i$, $b_i$.
Lemmas Used in Proof
To relate the quantities for the perturbed system given in (6), the following two lemmas are given.
Lemma 4.3: Let $A$ and $A_1$ be Hurwitz matrices and consider the pair of systems
$$\dot{x} = Ax + b, \qquad \dot{z} = A_1 z + b_1 + d(t)$$
with $x, z \in \mathbb{R}^n$, $A, A_1 \in \mathbb{R}^{n \times n}$, $b, b_1 \in \mathbb{R}^n$, $\|d(t)\| \le \bar{d}$ and $\|(A|b) - (A_1|b_1)\| \le \bar{\delta}$.
Then if $x(0) = z(0)$ there exist $k_3 > 0$ and $k_4 > 0$ such that
$$\|x - z\|^2 \le k_3 \bar{\delta} + k_4 \bar{d}, \qquad \forall t \ge 0$$
This lemma bounds how far apart the states of the two systems can be when the systems differ in their parameters and one of them is subject to noise.
Lemmas Used in Proof
To relate the quantities for the perturbed system given in (6), the following two lemmas are given.
Lemma 4.4: Let $e_p(\tau)$ and $e_a(\tau)$ denote the parametric and approximation errors for the sequence of dynamical systems given in (6) (noisy case). Let $e_p^0(\tau)$ and $e_a^0(\tau)$ denote the parametric and approximation errors for the nominal system in (7). Then there exist constants $k_p > 0$ and $k_a > 0$ such that
$$e_p^0(\tau) - \Delta \le e_p(\tau) \le e_p^0(\tau) + \Delta$$
$$e_a^0(\tau) - \epsilon \le e_a(\tau) \le e_a^0(\tau) + \epsilon$$
with $\Delta = k_p(\bar{d} + \bar{d}^2 + \bar{d}^3 + \bar{\delta} + \bar{\delta}^2 + \bar{\delta}^3)$ and $\epsilon = k_a(\bar{d} + \bar{d}^2 + \bar{d}^3 + \bar{d}^4 + \bar{\delta} + \bar{\delta}^2 + \bar{\delta}^3 + \bar{\delta}^4 + \bar{\delta}^6)$.
This lemma explicitly links the parametric and approximation errors of the nominal and perturbed systems.
Lemmas Used in Proof
Finally, we want to find a possible value of the averaging time $\Delta\tau$, as a function of noise level and parameter uncertainty, such that for $\tau_i + \Delta\tau \le \tau \le \tau_{i+1} - \Delta\tau$, for each $i$, the transition factor becomes smaller and smaller as the perturbation decreases and reaches zero when there is no perturbation. The following lemma shows how to obtain $\Delta\tau$.
Lemma 4.5: Let the transition factor be defined as before. There exist positive constants $c_1$ and $c_2$ such that if
$$\Delta\tau = -c_1 \ln\!\left(\frac{1 - 2\beta}{1 - \beta}\right)$$
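then the transition factor is such that
$$\mathrm{Tr}(\tau) \le c_2 \beta, \qquad \tau_{i-1} + \Delta\tau \le \tau \le \tau_i - \Delta\tau$$
$$\mathrm{Tr}(\tau) \ge \frac{1 - \rho_0 - \varphi}{2}, \qquad \tau = \tau_i$$
for all $i$, where $\beta$ and $\varphi$ are perturbation-dependent quantities that go to zero as the perturbation goes to zero.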
Proposed Algorithm
The segmentation and classification algorithm is given as follows:
1. initialization: $t_0 = \tau_0$, $t_M = \tau_l$, $i = 1$;
2. maximize $W(\tau)$ for $\tau \in (t_0, t_M]$: $\hat{\tau}_i = \arg\max_{\tau \in (t_0, t_M]} W(\tau)$;
3. compute the class $j$ of the segment found: $j = \arg_{k \in \{1, \dots, m\}} \{\|\hat{A}(\hat{\tau}_i) - A_c^k\| \le r_k\}$;
4. compute $\Delta\tau$;
5. set $t_0 = \hat{\tau}_i + \Delta\tau$;
6. $i = i + 1$;
7. go to 2.
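The control flow of these steps can be sketched as a loop over a grid of candidate times. Several pieces are assumptions rather than part of the slides: the stopping rule (stop once $t_0$ passes the end time), a fixed `delta_tau` instead of recomputing it each iteration, the omission of the classification step, and `toy_w`, a hypothetical stand-in for $W$ that simply peaks at the next boundary (the last one being the known ending time).

```python
import math

def segment(times, w_func, delta_tau):
    """Return estimated segment boundaries by repeatedly maximizing w_func(tau, t0)."""
    t0, t_end = times[0], times[-1]            # step 1: initialization
    boundaries = []
    while t0 < t_end:                          # assumed stopping rule
        candidates = [t for t in times if t0 < t <= t_end]
        if not candidates:
            break
        # Step 2: maximizer of W over (t0, tM] on the sample grid.
        tau_hat = max(candidates, key=lambda t: w_func(t, t0))
        boundaries.append(tau_hat)
        t0 = tau_hat + delta_tau               # step 5: restart after the boundary
    return boundaries

# Hypothetical boundaries used only to build the toy potential below.
TRUE_BOUNDARIES = [2.0, 5.0, 10.0]

def toy_w(tau, t0):
    """Toy potential: a narrow bump centered on the first boundary after t0."""
    nxt = min(s for s in TRUE_BOUNDARIES if s > t0)
    return math.exp(-((tau - nxt) ** 2) / 0.01)

times = [k / 10 for k in range(101)]           # grid on [0, 10] with step 0.1
result = segment(times, toy_w, delta_tau=0.3)
print(result)
```

In a fuller version, `w_func` would be built from the least squares estimate on $[t_0, \tau]$ and the three error quantities, and each boundary found would be followed by the class lookup of step 3.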
Experimental Setup
• Subjects are shown 4 different prototypes: car, sun, ship and house.
• They are asked to reproduce them on a 700 x 500 canvas; the dimensions are chosen arbitrarily.
• Each drawing task is accomplished by performing a sequence of actions such as “reach pt A”, “draw a line to pt B”. These actions define elementary motions.
• Theorem 4.1 is used to find the sequence of reach and draw movements that the user performed to accomplish the task, together with the switching times.
• The (x,y) position is sampled everywhere on the screen at a rate of 100 Hz and a spatial resolution of 1 pixel.
Experimental Setup
• “Draws”: straight lines traced with a specific intention (like drawing a side of the house).
• “Reaches”: movements made with the intention of quickly shifting the equilibrium position.
• Both first and second order dynamical systems are considered. The second order decoupled system is found to be the best fit.
• A circle class is also introduced, since circular shapes such as the wheels of the car occur.
Experimental Results
• Classification error: the trajectory is correctly segmented but wrongly classified
• Segmentation error: the trajectory is over-segmented or a segment boundary is missed
Experimental Results
Finally, to differentiate between the categories (car, house, etc.), a Gaussian classifier is built based on the number of reaches, draws and circles.
Discussion