Gestures_CSE_Concord.. - The Topological Media Lab

A scientific approach to
gesture recognition tracking
in perfomative environments
Sha Xin Wei
[email protected]
Computer Science, Concordia
13 November 2007
1
motivating examples
Responsive Environments
2
life
a-signifying semiologies
ethico-aesthetic magma
pathic, improvisatory subjectivation
chaosmosis
art
Guattari
How can we build a world that’s not complicated, but rich?
Stengel: ethics expressed not in
3
3
Smart environments?? airports,
streets, EV
Strategy: look to experience from live
performance and architecture
4
TGarden responsive
playspaces
(1997) 2001-2002
Stanford; FoAM, Starlab Brussels; Banff New Media
Institute; Georgia Tech.
Exhibited: Siggraph 2000, Medi@Terra Athens 2001,
Ars Electronica Linz 2001, DEAF Rotterdam 2001;
subsequent: txOom & tgvu 2002
5
Preliminary TGarden 2000
SXW, Sponge, FoAM collectives, SIGGRAPH New Orleans
6
6
events in responsive spaces
!chamber scale"
SXW TGardens
2000-2001
sponge +
7
7
events in responsive spaces
!building scale"
txOom FoAM 2002
8
8
conditions
Live, embodied experience
Real-time, performative events
Collective as well as individual
Improvised (no a priori)
9
What’s a gesture?
Human movement
Vocal (vs sonic) movement
Elicits response
SXW, "Resistance Is Fertile: Gesture and Agency in the Field of Responsive
Media," Configurations, Vol 10, Number 3, Summer 2002.
10
sing of gesture
udio/visual mee have contribnsible, and repntinuous motion
hing to provide
gesturing user.
m requirements
architecture.
rs offer a wide
media. We are
pping freehand,
synthesis and
ng multidimene physical afforcloth substrate
formance, play
nd in [5].
o sets of users:
a spaces and the
chitecture has the unique requirement for relatively high
frequency (10-100Hz) sensor data. In trade-off we are not
as concerned about the high power consumption needed
for continuous transmission.
Three regimes of sensing are most relevant to our
situation: (1) approximate (topological) location using
magnetic and light fields, (2) contact based on point and
strip Force Sensitive
Resistors
(FSR)
and ((3)
followEnd-to-end
latency:
from
onset
mouse-down,
through gesture based on acceleration. Sustaining the illunot
mouse-up) of gesture to perception of media
sion of continuous, direct interaction requires the system
response
to support sensor readings with tight latency and update
frequency bounds.
What does real-time imply?
Table 1 Human interaction bounds
Type of feedback
visual
acoustic
acoustic (melodic)
haptic
Frequency Hz
15-25
5-10
50
1000
Latency ms
50
10-50
10
1
End-to-end latency is used to evaluate the responsiveness of interactive media and includes the time needed to
sense changes actuated by the user, to modify its state,
and to produce a feedback in the space. The maximum acceptable latency is determined by the human motor/perceptual system depending on interaction mode
latency ! rhythm
11
What sense modality?
machine sense ! human sense
sensor modality ! sense modality
12
Regimes of sensing
regimes of sensing vs. sensor modality
scales of distance x energy (example):
>> 1m x multiple bodies (pedestrian flow)
~ 1m x limb extension (greeting)
0 x contact (handshake)
touch nuance (caress,
Jill Fantauzza, et al., "Greeting Dynamics Using Expressive Softwear" (Ubicomp 2003)
13
what sensor modality?
Physical or physiological?
Critique of psychologism
Descombes, V. (2001). The Mind's Provisions: A Critique of Cognitivism, Princeton
University Press.
Martin Kusch, Psychologism, Stanford Encyclopedia of Philosophy, 21 Mar 2007
14
intrinsic vs extrinsic
Saussure: any semiotic system is an abstract
system of differences
Bobick: Kid’s Room method
15
continuous vs discrete
what can we do without tokenizing?
work with streams
(discretized) continuous model ! discrete
model
16
what is noise?
“heat” noise in the sensor
network or operating system indeterminacies
non-semantic signal
17
push vs. poll
Better sense of coupling, causality for
gesturer.
You can feel when the system stops
responding.
Joel Ryan, Institute of Sonology, and STEIM
18
some projects
19
continuous gesture tracking for sound
SXW, et al, Continuous Sensing of Gesture for Control of Audio-Visual Media
(ISWC 2003)
20
angular moments to continuous sound
SXW, Yvonne Caravia, Yoichiro Serita, Georgia Tech IDT
21
21
Platform (Iachello, Dow, Serita)
Usage (Sha, Serita, Ubicomp)
SXW, et al, Continuous Sensing of Gesture for Control of Audio-Visual Media
(ISWC 2003)
22
controlling what?
sound file play back vs continuous synthesis
not triggers, sound ! {objects in space}
classical methods for sound synthesis:
vocal formants (howling ball txOom)
granular synthesis
23
motes, MSP instruments
SXW, J Fantauzza, Ubicomp
24
24
WYSIWYG
Wearable Sound Gestural Instruments
SXW - Marcelo Wanderley
Hexagram
25
1 material, 2 general mapping, 3 time
WYSIWYG
•
Soft, continuous1 controllers
handkerchief, scarf, blanket
•
Continuous2 mapping to continuous3 sound
•
Improvised play
•
Collective movement
•
Technical Goal: Statistical correlates to
intentional gesture
* Doug Van Nort, * David Gauthier, Sha Xin Wei, Marcelo M. Wanderley,
“Extraction of Gestural Meaning from a Fabric-Based Instrument,” ICMC
International Computer Music Conference Proceedings, 2007.
* David Birnbaum, * Freida Abtan, Sha Xin Wei, Marcelo M. Wanderley,
“Mapping and Dimensionality of a Cloth-based Sound Instrument,” Proceedings
of Sound and Music Computing (SMC), Lefkada Greece, 2007.
26
WYSIWYG architecture
27
WYSIWYG blanket test
Woven w/ conductive fibers on Jacquard loom, M. Bromley, J. Berzowksa, XS Labs.
Mechatronics, Feature extraction, mappings to sound: E. Sinyor, D. Gauthier, F Abtan,
D. Birnbaum, D. v Nort
28
Choreography by SY Cho and members of Dance Department.
28
higher order features
Phase relationship between spatial locations
Periodicity as function of position
Harmonicity, generalizing “regular motion”
Directionality, as cue to focus and intent
But: we always start and end with human interpretant
29
Ouija
Collective gesture
Intentional gesture
Movement art: *Improvisation vs. choreography
* => no a prior classifiers. But also, no grammar!
30
Ostensive Experiments
Entrainment
pass dynamics body to body w/o explicit instruction
(un)rehearsed or (un)trained, 3-6 people
Contact Improv
Camouflage
Calligraphy
Delay
31
Implicit Experiments
Effects of no | passive | responsive temporal
media on movement.
Future:
Look (inspection) for emergence of co-articulation.
Look for statistical correlates among sensor features.
32
Ouija: entrainment
Ouija Experiment on Collective Gesture in
Responsive Media Spaces, June-July 2007
33
33
Ouija: Calligraphy 1/2
machine vision for feature extraction from movement,
continuous synthesis of video based on physics models,
leveraging corporeal intuition.
34
34
Ouija: Calligraphy 2/2
35
35
Ouija: Delay
Conventional performance’s fixed sequence of movement =>
less need for realtime gesture tracking? But how fixed is fixed?
36
36
future work
37
expert movement analysis
Dr. S. Gill, Middlesex & Cambridge (Ian Cross)
non-verbal movement, rhythm and musicality in
movement
SXW, S. Gill, Gesture and Response in Field-Based Performance, Creativity and
Cognition Conference, 2005.
38
expert movement analysis
D. van Nort, McGill (Marcelo Wanderley
IDMIL)
predictive and geometric models for gestural sound
e.g. correlation, ICA, realtime wavelets
D. Gauthier, M. Fortin
39
varying vector X[n] = {x [n], x [n], . . . , x [n]}, wherein
1
2
N
the general point of view that the observables
y(t)model
are the
result of
a X(t)
linearplus
dynamical
system
on some
parameters
non-linear
pertu
sendin
tosome
supply
other
process
Taking the general point
of view
that the
observables
y(t) arepoint
the result
ofsensor
a linear
dynamica
each
dimension
represents
a different
on
the
OneX(t)
example
oftoa steer
particularly
powerful
approach
is to
filters
to dif
esc
e model
parameters
plus some
non-linear
perturbation
of controllable
asKalman
”noise,”
we tal
canparameters
try tobehavior.
discover
those
X(t).
inspired
by
”steerable
ses
downstream
with
parameters
(Part
of
this
wasuse
tosome
supply
other
processes
downstream
with
tobehavior.
steer
their
behavior.
(Part
o
supply
other
processes
downstream
with
toparameters
steer
their
(Part
of this
was
am
with
parameters
to
steer
their
behavior.
(Part
of non-linear
this
was
on
model
parameters
X(t) their
plus
some
perturbation
of
controllable
as ”n
grid.
From
this
information,
we
build
a
collection
of
es(LeRoux
2003[12])
In
the
discrete
time
version
of
such
an
approach,
the
state
evo
to
discover
those
X(t).
and th
computation”
of physical
in and
the 1980’s.)
inspired
”steerable
computation”
of physical
in the
1980’s.)
by
”steerable
computation”
of
physical
simulations
insimulations
the
1980’s.)
”spired
of physical
simulations
insimulations
the
1980’s.)
can
try
toby
discover
those
X(t).
timators
feature
extractions
techniques
based
on
the
supply
other
processes
downstream
with
parameters
to
steer
their
behavior.
(Part
of
this
was
to be given by
One example of a particularly powerful
approach
is to
use
figure
Taking
theasgeneral
poin
general
idea
of
cross-correlation
across
spatial
channels
pired
by
”steerable
computation”
of
physical
simulations
in
the
1980’s.)
ample
of that
a particularly
powerful
approach
is toof2003[12])
use
Kalman
filters
to estimate
the model
state.
(LeRoux
In
the
discrete
time
version
of such
an ap
expre
on
some
parame
nt
of view
the
observables
y(t)
are
the
result
a
linear
dynamical
system
Taking
theare
general
point
of
view
that
the
observables
y(t)
are
the
result
of
a
linear
dynami
aking
the general
point
ofofthe
view
that
the
observables
y(t)
are
the
result
of
a
linear
dynamical
system
t the
observables
y(t)
of
a
linear
dynamical
system
One
example
a result
particularly
powerful
approach
is
to
use
Kalman
filters
to estimate
t
well
as
temporal
autocorrelation
at
given
points
along
the
ux
2003[12])
In
the
discrete
time
version
of
such
an
approach,
the
state
evolution
is
assumed
to
be
given
by
try
to
discover
tho
entire
eters
X(t)
plusparameters
some
non-linear
perturbation
of as
controllable
”noise,”
we can
on
some
model
parameters
X(t)
plus
some
non-linear
perturbation
of
controllable
as
some
model
X(t)
plus
some
non-linear
perturbation
of
controllable
as
”noise,”
we
us
some
non-linear
perturbation
ofthe
controllable
”noise,”
weas
(LeRoux
2003[12])
In
the
discrete
time
version
ofthe
such
an
approach,
the
state
evolution
is”
cloth
surface.
Considering
X[n]
as
a
wide-sense
stationking
the
general
point
of
view
that
observables
y(t)
are
result
of
a
linear
dynamical
system
X(t
+
1)
=
A(t)X(t)
+
b(t)
+
w(t)
iven
by
X(t).
its rel
try
to by
discover
X(t).non-linear
nse
try
to
discover
those
X(t).
tocan
be
given
ary stochastic
processperturbation
[4], we can express
its generalized
some
model
parameters
X(t) those
plus
some
of controllable
as ”noise,” we
One example of atime.
part
n try to discover those X(t).
spatio-temporal correlation sequence as
X(t
1) =
A(t)X(t)
+colum
b(t)
(LeRoux
2003[12])
Intrat+
ticularly
powerful
approach
to state
use Kalman
filters
touseestimate
state.
One
of isaispowerful
particularly
approach
is tothe
useA(t)
Kalman
filters
tosquare
estimate
where
the
vector
we
toKalman
estimate,
the
known
ne
example
of aisexample
particularly
approach
isintend
to
filters
tois+
estimate
the
state.
erful
approach
to
use X
Kalman
filters
to powerful
estimate
the
state.
X(t
+ 1)
=approach,
A(t)X(t)the
+ b(t)
+ w(t)
(1)
to
be
given
by
he
discrete
time
version
of
such
an
state
evolution
is
assumed
(LeRoux
2003[12])
In
the
discrete
time
version
of
such
an
approach,
the
state
the
process.
The
control
b(t)
is
given
and
there
is
a
zero
mean
process
noise i
eRoux
2003[12])
In
the
discrete
time
version
of
such
an
approach,
the
state
evolution
is
assumed
R
[k,
i,
j]
=
E(x
[n]x
[n
−
k])
(1)
me
version
of
such
an
approach,
the
state
evolution
is
assumed
X(t
+
1)
=
A(t)X(t)
+
b(t)
+
w(t)
n is to use Kalman
i
jfilters to estimate theevolution
ne example of a particularly powerful approach
state.
to be In
given
covariance
rwtime
(t). version
This
noise
w(t)
independant
X(t).
The
measured
vector
be given
by
where
is the
vectorthe
weof
intend
to
estimate,
A(t)
is t
eRoux
2003[12])
the by
discrete
ofXsuch
anisstate
approach,
state
evolution
is assumed
us
expression
the dependence
between
varithe we
measurement
equation
:
X
the state
vector
intend giving
to estimate,
A(t)
is the
known
square
theanprocess.
Theof
control
b(t) istransition
given
andmatrix
there of
is a zero
beisgiven
by
where X is the state vector
we
intend
to
estimate,
A(t)
is
the
known
square
transition
m
space
and
time.This
ocess. The control b(t) is givenables
and across
there
is
a zero
process
with knownof X(t).
covariance
rw mean
(t).
noisenoise
w(t) w(t)
is independant
the +
process.
The control
b(t)
is given
and
there
a zero mean
processfrom
noise w(t)
wit
X(t
=b(t)
A(t)X(t)
+
b(t)
+ the
w(t)
(1)
mean
Now,
this
us
with
aisstatistical
X(t
+ 1)+
=b(t)
A(t)X(t)
+ :b(t)
+framework
w(t)
nce
This1)+
noise
w(t)
isX(t
independant
ofleaves
X(t).
The
vector
y(t) is given
by (1)
+
1)
=
A(t)X(t)
+measured
w(t)
+ 1)rw
= (t).
A(t)X(t)
+ w(t)
(1)
measurement
equation
covariance rw (t). This
noisewew(t)
is build
independant
of
X(t). The
measured
vector y(t)
is
colum
which
must
a
proper
real-world
estimate,
as
well
y(t)
=
H(t)X(t)
+
v(t).
asurement equation :
X(t + 1): = A(t)X(t) + b(t) + w(t)
(1) ve
where X is the state
the measurement equation
the
process.
The of
cont
ector
toXvector
estimate,
A(t)
issquare
the
known
where
isisthe
state
vector
intendsquare
to estimate,
A(t)matrix
issquare
the ofknown
square
transition
here
is intend
the
state
intend
to we
estimate,
A(t)
istransition
theofknown
transition
matrix
nd
toXwe
estimate,
A(t)
thewe
known
transition
matrix
y(t)
= H(t)X(t)
+ Thi
v(t)
covariance
rwknown
(t).
trol
b(t)
is the
given
and
there
iscontrol
aprocess
zero b(t)
mean
process
w(t)is with
known
process.
The
isthere
given
and
there
av(t)
zero
mean
process
noise
w(t)
w
isb(t)
the
rectangular
measurement
matrix,
is
the
zero
mean
measurement
e process.
control
given
and
a noise
zero
mean
process
noise
w(t)
with
ven
and
is
aH(t)
zero
mean
noise
w(t)isA(t)
with
known
here
X is there
theThe
state
vector
we is
intend
to
estimate,
is
the
known
square
transition
matrix
of
y(t)
=
H(t)X(t)
+
v(t).
(2)
the
measurement
equat
s noise
w(t)
is of
independant
ofmeasured
X(t).
The
measured
vector
y(t)
is
given
by
rw
(t).w(t)
w(t)
is
ofmeasured
X(t).
The
measured
covariance
rv
(t).
The
noise
v(t)
is
independant
of X(t).
The
dimension
ofy(t)
w(t)i
variance
rwcovariance
(t).
This
noise
is noise
independant
ofais
X(t).
The
vector
y(t)
isvector
given
by
independant
X(t).
The
vector
y(t)
given
by
y(t)
=
H(t)X(t)
+
v(t).
eis
process.
The
control
b(t)
isThis
given
and
there
isindependant
zero
mean
process
noise
w(t)
with
known
tion
:
equation
: H(t)
of x(t);
dimension
of v(t)
isofthe
dimension
of y(t). The
covariance
ofisthe
e measurement
equation
:the
is the
rectangular
measurement
matrix,
the
zer
variance
rwthe
(t).measurement
This
noise
w(t)
is independant
X(t).
The measured
vector
y(t) isv(t)
given
bystat
rectangularequation
measurement
matrix, covav(t) isriance
the zero
mean
measurement
of known
rv (t).
The
noise v(t) is noise,
independant
of X(t). T
e the
measurement
:
H(t) is the rectangular measurement matrix, v(t) is the zero mean measurement noise, o
iance rv (t). The noise v(t) is independant
of X(t).
The dimension
the dimension
of x(t);
the dimension
of v(t)ofisw(t)
the is
dimension
of y(t). The
cova-y(t)
riance
rv
(t).
The
noise
v(t)
is
independant
of
X(t).
The
dimension
of
w(t) is the d
2
=v(t).
H(t)X(t)
+ v(t).
y(t)+covariance
=v(t).
H(t)X(t)
v(t).
P (t)(2)
= E[|X(t)
−(2)
X̄|vector
]
the
of +
v(t)
is the dimension
of H(t)X(t)
y(t). The
of+the
state
X(t) is (2)
y(t) =
y(t)dimension
= H(t)X(t)
of x(t); the dimension of v(t) is the dimension of y(t). The covariance of the state vector
y(t) = H(t)X(t) + v(t).
(2)
H(t) is the rectangular
2
P
(t)
=
E[|X(t)
−
X̄|
covariance
rv
(t).
The
r
measurement
matrix,
v(t)
is
the
zero
mean
measurement
noise,
of
known
H(t)isisthe
thezero
rectangular
measurement
matrix,
v(t)mean
is themeasurement
mean measurement
noise,
where
X̄ mean
= E[X(t)]
the expected,
value
ofzero
X(t).
(t)matrix,
is the rectangular
measurement
matrix,
v(t)
is
the
zero
noise, of known
nt
v(t)
measurement
noise,
of mean
known
2or
P
(t)
=
E[|X(t)
−
X̄|
]
(3)
2 The
of x(t);
the
dimension
eindependant
noise
v(t)rv
is(t).
independant
of
X(t).
The
dimension
of
isof
the
dimension
covariance
rv (t).
The
noise
isisP
independant
X(t).
ofdimension
is the
40 o
vaThemeasurement
noise
v(t)
is independant
of
X(t).
The
dimension
w(t)
is
the
s(t)
of
X(t).
The
dimension
of v(t)
w(t)
the
dimension
=w(t)
E[|X(t)
−
X̄|
] of dimension
isriance
the rectangular
matrix,
v(t)
is(t)
the
zero
mean
measurement
noise,
ofw(t)
known
statistical correlates, not
certificates, of intention
Correlation, e.g.
, or
Kalman filter estimates of state X(t) from
observables y(t), of form
minimizing covariance
engineering note
Body gesture
Object “gesture”
Mechanical energy
Sensing technology
Any movement
Any part [?]
Sensitivity
Hands
Feet
Lying/sitting
Handling
Wearing
Throwing
Handling
Holding
Pushing
Pressing
Squeezing
Acceleration
Accelerometer
Price,
Efficiency,
Availability1
254
Piezoresistive
Air pressure
Flexion
Piezoresistive
Load cell
Miniature pressure transducer
Piezoelectric
Strain gauge
Capacitive
Capacitive
Hygrometer
Piezoresistive
Piezoresistive
225
244
244
234
255
154
244
254
545
545
333
525
525
Piezoelectric
Piezoresistive
Capacitive
Piezoresistive
242
545
545
545
333
545
Mouth
Sweep
Movements
Energy/heat
Hands
Feet
Articulation
[?]
Space movements
Body parts
Contact/non-contact
1
1:
2:
3:
4:
5:
Force
Pressure
Specific (cutting, etc.)
Inside/outside
Hydrophile material
Inside/outside
Thermosensitive materials
Long objects
Large surfaces
Soft materials
Surface (de Rossi)
Rigid materials
Activation
Strain
Contact
Presence
Humidity
Temperature
Linear position
2D Localization
Flexion
Strain
3D Localization
Orientation
Rotation
Switching
Piezoresistive
Video
Tilting, acceleration, magnetic
Magnetic, piezoresistive
Capacitive
Piezo
Very poor
Poor
Moderate
Good
Very good
244
445
545
545
thanks to E. Sinyor
Table 1. Gestures and sensors, at a glance.
41
summary
Conditions:
Live, real-time, performative events
Collective as well as individual
Improvised (no a priori)
Strategy:
Leverage corporeal intuition
Continuous models
controllers, media, and mappings
geometrical / topological models
Statistical approaches
correlation
Kalman estimate state
42
topological media lab
Credits:
http://www.topologicalmedialab.net
People
Concordia University
Engineering Visual Arts, EV 7.725
1515 Ste. Catherine Ouest, Montreal H3G 2W1
43