Building Petri Nets from Video Event Ontologies

Petri-Nets for Video Event Understanding
Gal Lavee and Ehud Rivlin
Technion – Israel Institute of Technology
Motivation
• Huge volume of ‘interesting’ video data to process
• Growing visual surveillance demands
– Control of security-sensitive areas such as stores, airports, parking lots, banks and other public places
• Video database analysis demands
– Offline surveillance data analysis (e.g. abnormal behaviour detection, specific event querying)
• Well-understood supporting work
– Feature extraction
– Object detection
– Object tracking
– Object recognition
– “Low-level” event recognition
– Event domain knowledge specification using ontology languages
It's an important problem!
Video Events
Activities- Bobick
Composite Events- Bremond
Known Formalisms
• FSMs (Sequence)
• Bayesian Networks (Uncertainty, Factorize State Space)
• HMMs (Sequence, Uncertainty)
• CRFs (Sequence, Uncertainty, Relax Independence Assumptions)
• Grammars (Hierarchy, Partial Ordering)
What About?
• Long-Term Temporal Dependencies?
• Concurrency?
• Temporal, spatial, logical relations?
• “Incomplete” Events?
What About Semantics?
Petri Net Solution for Event
Understanding
• Petri-Net formalism (defined shortly)
• Represents the dynamic evolution of the video sequence
• Encodes semantic knowledge of the domain
• Formalism naturally captures inherent properties of video events
– Logical, Temporal and Spatial composition
– Concurrency and Partial Ordering
– Long-term temporal dependence
• Recursive Reuse of Fragments (Hierarchy)
• Recognizes events and…
• Snapshot of the state of the system at any time before or after an event has occurred
• How far away is the event of interest?
Petri Nets- Defined
• A graphical tool for formal description of system dynamics.
• What graph components does a Petri net comprise?
– Place nodes: describe possible local system states.
– Transition nodes: describe events that may modify the state.
– Arcs: specify the relation between system states and events.
– Tokens: markers that reside in place nodes and are used to specify the PN state
Shared Resource Example (figure): places P_resource_idle, P_resource_busy, P_user_active, P_requesting, P_user_access; transitions T_user_request, T_start, T_end
Petri Nets ctd.
• The instantaneous location of the tokens (called the marking) defines the current state of the model
• Enabling Rule
– A transition is enabled if all of its input places have tokens.
– Additional requirements may be added (conditional transitions)
• When a transition fires:
– Tokens are deleted from its input places
– Tokens are placed in its output places
(Nothing to do with the Petri dish)
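The enabling and firing rules can be sketched in a few lines of Python. This is a minimal illustration, not the implementation behind the slides: the class name PetriNet and the arc structure of the shared-resource net are assumptions, since only the place and transition names survive from the figure.

```python
from dataclasses import dataclass, field


@dataclass
class PetriNet:
    # transition name -> (list of input places, list of output places)
    transitions: dict
    # place name -> token count; this dictionary is the marking
    marking: dict = field(default_factory=dict)

    def enabled(self, t):
        """A transition is enabled if every input place holds a token."""
        inputs, _ = self.transitions[t]
        return all(self.marking.get(p, 0) > 0 for p in inputs)

    def fire(self, t):
        """Delete a token from each input place, add one to each output place."""
        if not self.enabled(t):
            raise ValueError(f"transition {t!r} is not enabled")
        inputs, outputs = self.transitions[t]
        for p in inputs:
            self.marking[p] -= 1
        for p in outputs:
            self.marking[p] = self.marking.get(p, 0) + 1


# Assumed arcs for the shared-resource example (hypothetical wiring).
net = PetriNet(
    transitions={
        "T_user_request": (["P_user_active"], ["P_requesting"]),
        "T_start": (["P_requesting", "P_resource_idle"],
                    ["P_user_access", "P_resource_busy"]),
        "T_end": (["P_user_access", "P_resource_busy"],
                  ["P_user_active", "P_resource_idle"]),
    },
    marking={"P_user_active": 1, "P_resource_idle": 1},
)

net.fire("T_user_request")  # user asks for the resource
net.fire("T_start")         # resource is idle, so access starts
print(net.marking)
```

After the two firings the tokens sit in P_user_access and P_resource_busy, so only T_end is enabled.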
Petri Net Properties
• Petri Net formalism is useful for modeling
– Logical Relations
– Temporal Relations
– Concurrency and Partial Ordering
Reachability Set and Reachability Graph
• Given a PN model and an initial marking
– Compute the reachability set (the set of reachable markings)
– The reachability graph visualizes the path taken to each particular state
Reachability Set (table): markings M0, M1 and M2 over the places P_user_active, P_requesting, P_user_access, P_resource_busy and P_resource_idle
Reachability Graph (figure): nodes M0, M1 and M2
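A breadth-first search over markings is one simple way to compute the reachability set and graph. The sketch below assumes a plausible wiring for the shared-resource net (the arcs are not recoverable from the slide) and an initial marking with one token in P_user_active and one in P_resource_idle. It terminates only for bounded nets.

```python
from collections import deque

PLACES = ["P_user_active", "P_requesting", "P_user_access",
          "P_resource_busy", "P_resource_idle"]

# Assumed arcs (hypothetical): transition -> (input places, output places)
TRANSITIONS = {
    "T_user_request": (["P_user_active"], ["P_requesting"]),
    "T_start": (["P_requesting", "P_resource_idle"],
                ["P_user_access", "P_resource_busy"]),
    "T_end": (["P_user_access", "P_resource_busy"],
              ["P_user_active", "P_resource_idle"]),
}


def fire(marking, inputs, outputs):
    """Return the successor marking, or None if the transition is not enabled."""
    m = dict(zip(PLACES, marking))
    if any(m[p] == 0 for p in inputs):
        return None
    for p in inputs:
        m[p] -= 1
    for p in outputs:
        m[p] += 1
    return tuple(m[p] for p in PLACES)


def reachability(initial):
    """Breadth-first exploration; returns (markings in discovery order, edges)."""
    seen = [initial]
    edges = []
    queue = deque([initial])
    while queue:
        current = queue.popleft()
        for name, (inputs, outputs) in TRANSITIONS.items():
            nxt = fire(current, inputs, outputs)
            if nxt is None:
                continue
            edges.append((current, name, nxt))
            if nxt not in seen:
                seen.append(nxt)
                queue.append(nxt)
    return seen, edges


markings, edges = reachability((1, 0, 0, 0, 1))
for i, m in enumerate(markings):
    print(f"M{i}", dict(zip(PLACES, m)))
```

Under these assumptions the search finds three markings connected in a cycle, which agrees with the three markings M0, M1 and M2 named on the slide.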
Petri Net Extensions
• Priority Transitions
– Each transition is associated with a natural number (its priority)
– Allows resolution of conflicts (e.g. multiple transitions which share an input)
• Timed Transitions
– Extend the priority concept
– Associate a real interval of time with each transition
– The interval must elapse between enabling and firing
– Model real-world phenomena
Petri Net Extensions
• Stochastic Petri Nets
– Stochastic timed transition delays (discrete-state stochastic process)
– Exponential distribution for transition delays
• Negative exponential probability density function (PDF)
D_n = 1 - e^{-t_n / \tau_n}
• A Generalized Stochastic Petri Net (GSPN) is a PN where:
– Immediate transitions coexist with stochastic timed transitions
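The coexistence of immediate and timed transitions can be illustrated with a small selection routine. This is a sketch under stated assumptions: the function name next_firing is hypothetical, immediate transitions are picked uniformly (GSPNs usually attach weights), and timed transitions race with exponentially distributed delays.

```python
import random


def next_firing(enabled_immediate, enabled_timed_rates, rng):
    """Choose which enabled transition fires next and after what delay.

    enabled_immediate: list of enabled immediate transition names
    enabled_timed_rates: dict mapping enabled timed transitions to a rate
                         (1 / mean delay) of their exponential delay
    Returns (transition name, delay).
    """
    # Immediate transitions take precedence and fire in zero time.
    if enabled_immediate:
        return rng.choice(enabled_immediate), 0.0
    # Otherwise sample a delay per timed transition; the smallest one wins.
    samples = {t: rng.expovariate(rate)
               for t, rate in enabled_timed_rates.items()}
    winner = min(samples, key=samples.get)
    return winner, samples[winner]


rng = random.Random(0)
print(next_firing(["T_immediate"], {"T_slow": 0.5}, rng))
print(next_firing([], {"T_fast": 2.0, "T_slow": 0.5}, rng))
```

The transition names in the usage lines are placeholders. The rates are the kind of parameter that could be learned from training data.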
PN Video Event Models
• PNs in the literature have not been built in an agreed-upon fashion
• However, two distinct classes of PN event models have emerged
– Plan Petri-Nets
– Object Petri-Nets
PN Video Event Models
                 Object PN                            Plan PN
Tokens           Objects                              Plan progress
Places           Object states                        Plan states
Transitions      Object state change (events)         Plan advancement
Enabling rules   Conditioned on object properties     Conditioned on scene properties
Event            Firing of a transition               End of plan (firing of
                 (not all events are interesting)     “sink” transitions)
                 One PN for multiple objects/events   One PN per event
Plan PNs
• Plan Petri-Nets
– Castel (1996)
– Natural extension of PN plans in other domains
– In this work a number of “plans” are tracked
– At each observation (knowledge received from the “numerical layer”) the plans in progress are checked for consistency with the observation.
Parking Lot Example
Plan Prototypes: Pedestrian Movement, Arsonist Action, Vehicle Arrival, Vehicle Departure
Explain Observation Using Plans
Observation: Cars Parked
– Consistent with Vehicle Departure: new plan created
– Not consistent with Vehicle Arrival, Pedestrian Movement, Arsonist Action
– Current plans: 1. Vehicle Departure
Observation: Pedestrian Appears
– Consistent with Vehicle Departure, Pedestrian Movement, Arsonist Action: new plans created
– Not consistent with Vehicle Arrival
– Current plans: 1. Vehicle Departure 2. Pedestrian Movement 3. Arsonist Action
Observation: Pedestrian Disappears
– Consistent with existing plans Vehicle Departure and Pedestrian Movement: plans maintained
– Pedestrian Movement reaches end of plan: the Pedestrian Movement event can be said to have occurred
– Not consistent with Arsonist Action: plan rejected
– Current plans: 1. Vehicle Departure 2. Pedestrian Movement (Terminated)
Observation: Vehicle Starts Moving
– Consistent with existing plan Vehicle Departure
– Current plans: 1. Vehicle Departure 2. Pedestrian Movement (Terminated)
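The bookkeeping in this walkthrough can be approximated with a simple plan tracker. The sketch below is an assumption-heavy simplification: each plan prototype is reduced to a linear list of expected observations, and the step lists themselves are invented for illustration (the actual plan nets of Castel et al. are not shown on the slides).

```python
# Hypothetical prototypes: plan name -> ordered list of expected observations.
# Every prototype here has at least two steps.
PROTOTYPES = {
    "Vehicle Departure": ["cars parked", "pedestrian appears",
                          "pedestrian disappears", "vehicle starts moving",
                          "vehicle leaves"],
    "Vehicle Arrival": ["vehicle appears", "vehicle stops",
                        "pedestrian appears"],
    "Pedestrian Movement": ["pedestrian appears", "pedestrian disappears"],
    "Arsonist Action": ["pedestrian appears", "pedestrian stops near car",
                        "fire starts"],
}


def explain(observations):
    """Return (plans still in progress, plans that reached their end)."""
    active = {}      # plan name -> index of the next expected observation
    terminated = []  # plans whose last step was observed (event recognized)
    for obs in observations:
        # Advance plans consistent with the observation, reject the others.
        for name in list(active):
            steps = PROTOTYPES[name]
            if steps[active[name]] == obs:
                active[name] += 1
                if active[name] == len(steps):
                    terminated.append(name)
                    del active[name]
            else:
                del active[name]
        # Create a new plan for every prototype that starts with this observation.
        for name, steps in PROTOTYPES.items():
            if name not in active and name not in terminated and steps[0] == obs:
                active[name] = 1
    return active, terminated


active, terminated = explain(["cars parked", "pedestrian appears",
                              "pedestrian disappears", "vehicle starts moving"])
print("in progress:", list(active))
print("terminated:", terminated)
```

With these invented step lists the tracker ends with Vehicle Departure still in progress and Pedestrian Movement terminated, matching the final plan list above.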
Parking Lot Example
• After the observation sequence…
• The only consistent event is Vehicle Departure, even though it has not concluded
• Using this system we can assert that a certain event is possible or not possible before having seen the complete event.
• Joined with information on the duration of the various sub-events, we can conjecture when a possible event might occur, as an offset from the current state
• A feedback loop with observation is possible
Object PNs
• Ghanem 2004 & 2007
• Tokens are objects
• Multiple objects can be represented within the same net
• Snapshot of the state of the system at any time before or after an event has occurred
• Feedback loop
Traffic Monitoring Domain Example
Negative Event: Security guard has not returned to post after 15 minutes
Video event modeling with GSPN
• Object PNs
• Borzin et al 2007
• Capture multiple events of interest in a domain within a single net
• Stochastic Timed Transitions (GSPN)
• The parameters of these distributions may be learned from training data
• Also allows estimates of the reachability of certain events (using Marking Analysis, discussed shortly)
Video event modeling with GSPN
• Basic representation concepts
– Each token represents a detected object in a specific state
– A place represents a possible state of one or more objects
– A transition represents an event or a satisfied relation
• Logical and temporal relations are enforced by appropriate PN fragments
• Spatial relations
– Topological, directional or distance
– Defined by the enabling rules attached to transitions
• Enabling Rules
– Define conditions on tokens (objects) that must be met for the associated transition to become enabled and fire
Surveillance System
(Block diagram: Video Input, Synthesized Dataset (optional), Intermediate Video Processing Unit, Abstracted Video Representation, Video Event Recognition Unit, Results, User Interface, Behavior Modeling Unit with its User Interface, GSPN Based Behavior Model)
• Intermediate video processing unit
– Motion detection
– Object detection and classification
– Object tracking
– Can be replaced with format-compliant datasets
– Supports CAVIAR ground truth format
• Behavior modeling unit
– Provides a graphical interface for creating GSPN models
– Allows splitting an entire graph into small fragments
– Supports various templates that can be edited or extended by the user
• Video event recognition unit
– Receives the intermediate video processing results and the model
– Provides a textual description of the detected events
Surveillance System Interface: Behavior Modeling Interface
Surveillance System Interface: Video Event Interpretation Interface
The Data
• Assume good tracking, detection and recognition
• CAVIAR annotation format
• Synthetic video
• Corresponds to real video
• Generation of similar video for training
Synthetic Video Animator
• Single scene editor
• User interface
• Scene series editor
Video Event Analysis
• Scenario 1: ‘Security check in a public place’
– Every visitor must pass a security check before he enters the place
– The following cases are considered to be abnormal and must be reported:
• A visitor enters the place without being checked
• The security check is abnormally long
– Example of event of interest:
Video Event Analysis
• GSPN model for ‘Security check in a public place’:
Video Event Analysis
• Interpretation results for ‘Security check in a public place’:
Frame   Message
1       'Object_Appeared' fired on objects: 0
20      'Object_Appeared' fired on objects: 2
25      'Object_Appeared' fired on objects: 6
56      'Visitor_Entered_the_Hall' fired on objects: 2
61      'Guard_Met_Visitor' fired on objects: 0, 2
61      'Visitor_Entered_the_Hall' fired on objects: 6
68      'Visitor_Was_Not_Checked' fired on objects: 6
71      'Security_Check_Is_Too_Long_Detected' fired on objects: 0, 2
86      'Meeting_Is_Over' fired on objects: 0, 2
Marking Analysis
• PN structure can be translated into a reachability graph
• A training set provides statistics about transitions between adjacent markings
• Using these, the probability of future transitions can be calculated
• Example (reachability graph figure): markings M0 to M5, with edges labeled λ0,1, λ0,2, λ1,3, λ1,4, λ1,5, λ2,5, λ3,4, λ4,0 and λ5,0
– Where the probability to move to marking Mk from marking Mn is:
\lambda_{n,k} = \frac{N_{n,k}}{N_n}
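The ratio above can be estimated directly by counting transitions in training data. The sketch below assumes that N_{n,k} is the number of observed moves from marking Mn to marking Mk and that N_n is the total number of observed moves out of Mn. The training sequences are made up for illustration.

```python
from collections import Counter


def transition_probabilities(sequences):
    """Estimate lambda[(n, k)] = N_{n,k} / N_n from marking sequences."""
    pair_counts = Counter()  # N_{n,k}: moves from marking n to marking k
    exit_counts = Counter()  # N_n: all moves out of marking n
    for seq in sequences:
        for src, dst in zip(seq, seq[1:]):
            pair_counts[(src, dst)] += 1
            exit_counts[src] += 1
    return {(src, dst): count / exit_counts[src]
            for (src, dst), count in pair_counts.items()}


# Hypothetical training runs over the markings M0..M5.
training = [
    ["M0", "M1", "M3", "M4", "M0"],
    ["M0", "M2", "M5", "M0"],
    ["M0", "M1", "M4", "M0"],
]
lam = transition_probabilities(training)
print(lam[("M0", "M1")], lam[("M0", "M2")])
```

By construction the estimated probabilities leaving any one marking sum to 1, which is what allows the marking graph to be read as a discrete-time Markov chain.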
Video Event Analysis
• Marking analysis for ‘Security check in a public place’:
– The marking data was collected during the training process
– The reachability graph was constructed from the possible states observed during the training process
– Statistical information was used to predict the most probable next system state
The marking graph is used as a Discrete Time Markov Chain description
• Example (marking graph figure): states Empty, Guard_In_Hall, Guard_Waiting, One_Visitor_Appeared, Visitor_Walking_Towards_Guard, One_Visitor_Stopped_Near_Guard, One_Visitor_Passed_Near_Guard, Guard_Met_One_Visitor, Guard_Met_Two_Visitors, Guard_Checked_One_Visitor and One_Visitor_has_Evaded_the_Check, with edges labeled by transition probabilities
– The probability to get to ‘Guard_Checked_One_Visitor’ from ‘Visitor_Walking_Towards_Guard’ is (the red path):
P = 0.72 \cdot 1 \cdot (0.63 + 0.37 \cdot 0.78) \approx 0.66
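The same number can be reproduced by summing path probabilities in a small Markov chain. The graph below is a hypothetical reconstruction: the state names s0 to s3 and the edge layout are assumptions chosen only to match the factors in the expression above, because the assignment of probabilities to edges cannot be recovered from the figure. The recursion is valid only for acyclic graphs.

```python
from functools import lru_cache

# Hypothetical acyclic chain: state -> {successor: transition probability}
EDGES = {
    "s0": {"s1": 0.72, "other": 0.28},
    "s1": {"s2": 1.0},
    "s2": {"target": 0.63, "s3": 0.37},
    "s3": {"target": 0.78, "other": 0.22},
}


@lru_cache(maxsize=None)
def reach_probability(state, target="target"):
    """Probability of eventually reaching target from state (acyclic graphs only)."""
    if state == target:
        return 1.0
    # Sum over successors: edge probability times probability from the successor.
    return sum(p * reach_probability(nxt, target)
               for nxt, p in EDGES.get(state, {}).items())


p = reach_probability("s0")
print(round(p, 2))
```

States without outgoing edges, such as "other", contribute zero, so only the two paths that end in the target state add to the total.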
Video Event Analysis
• Scenario 2: ‘Traffic junction control’
– Assume a junction without any traffic lights or traffic signs
– The law: a car may enter the junction unless there is a car on its right side
– Example event of interest:
Video Event Analysis
• GSPN model for ‘Traffic junction control’:
Video Event Analysis
• Interpretation results for ‘Traffic junction control’:
Frame   Message
0       'Car_Appeared' fired on the objects: 2
1       'Car_Appeared' fired on the objects: 0
10      'Car_Appeared' fired on the objects: 1
18      'Car_Entered_Z1' fired on the objects: 0
23      'Car_Entered_Z2' fired on the objects: 2
34      'Car_Entered_Z1' fired on the objects: 1
52      'Car_In_Z1_Breaks_the_Law' fired on the objects: 0
77      'Car_Entered_Z3' fired on the objects: 0
80      'Car_Appeared' fired on the objects: 3
Building PNs from Ontologies
• We have seen that Petri Nets are a formalism for describing video events in a particular event domain.
• Other formalisms exist (e.g. Hidden Markov Models, grammar models, etc.)
• A knowledge engineer can describe the semantic content of a particular event domain in a standard way.
• To build an event model, such an expert must also have expert knowledge of the modeling formalism.
• To bridge this gap we propose a process for translating domain knowledge into an event modeling formalism.
Ontology Languages
• To formalize our knowledge of an event domain we require a standard method of knowledge specification: an ontology
• Competing ontology specification standards for video events exist
– VERL (Nevatia et al 2004)
– CASEE (Hakeem et al 2004)
Ontology Languages - VERL
• Defines constructs such as Sequence, Change
• Allows definition of predicates and entities
• Captures temporal and logical relationships
SINGLE-THREAD(tailgate(ent x, ent y, facility f),
AND (portal-of(door, f))
Sequence(
approach(y, door),
unlock(y, door),
open(y, door),
AND(enter(y, f), near(x, y)),
NOT(unlock(x, door)),
enter(x, f)))
Ontology Languages - CASEE
• Extends the concept of case frames hierarchically
[ PRED: Moves, AG: Train, D: Signals, LOC: Zone1, FAC: Towards, AFTER:
[ PRED: Switches, AG: Signals, FAC: On, AFTER:
[ PRED: Moves, AG: Gate, FAC: Down, AFTER: Switches, SUB:
[ PRED: Stops, AG: Vehicle, LOC: Zone2, FAC: Outside, AFTER: Moves ] ] ] ]
Translating the Ontology to PN
• Methodology for constructing a PN from ontology descriptions (not automatic or optimal)
• Similar models for similar events
• Each sub-event is represented by a PN fragment
• Simple Sub-Event Fragment (no temporal relations)
• Temporal Sub-Event Fragment
Translating the Ontology to PN
• Next we connect the fragments representing the various sub-events in a manner that corresponds to their relationship in the ontology event description.
• Logical Relations (AND)
• Temporal Relations (OVERLAPS)
Temporal Relations
Translating the Ontology to PN
• It may be possible to represent a temporal relation in more than one way.
• Having a consistent representation avoids arbitrary model structures
• Nodes shared between fragments may be fused.
Building a Plan PN
• In a plan PN, transitions can represent “primitive” events as given by the ontology descriptions
• These primitive events can be projected into our event structure fragment
• Example: “Safe Crossing Event”
– An approaching train causes the signal to change, the gate to lower,
and the approaching car to stop.
– Specified using CASEE
– Create temporal sub-event fragments for each of the sub-events
– Connect fragments according to temporal relations
Building a Plan PN
CASEE Ontology Representation:
[ PRED: Moves, AG: Train, D: Signals, LOC: Zone1, FAC: Towards, AFTER:
[ PRED: Switches, AG: Signals, FAC: On, AFTER:
[ PRED: Moves, AG: Gate, FAC: Down, AFTER: Switches, SUB:
[ PRED: Stops, AG: Vehicle, LOC: Zone2, FAC: Outside, AFTER: Moves ] ] ] ]
Resulting Plan PN:
Simplifying the structure
• Redundant nodes are merged to give a simpler structure
• The label SE indicates each of the sub-events in the ontology specification
• Each transition is now associated with the appropriate “primitive” sub-event
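A simplified plan PN of this kind can be sketched as a chain with one transition per primitive sub-event. This is an illustration under strong assumptions, not the construction from the paper: the sub-event identifiers are invented, the AFTER and SUB relations of the “safe crossing” example are flattened into a strict sequence, and observations that do not match the next expected transition are simply ignored.

```python
# Hypothetical identifiers for the four sub-events of the "safe crossing" event,
# listed in the order implied by the AFTER relations.
SUB_EVENTS = [
    "train_moves_towards_signals",
    "signals_switch_on",
    "gate_moves_down",
    "vehicle_stops_outside_zone2",
]


def build_plan_pn(sub_events):
    """Chain P0 -> T(se_1) -> P1 -> ... -> Pn, one transition per sub-event."""
    places = [f"P{i}" for i in range(len(sub_events) + 1)]
    # transition (named after its sub-event) -> (input place, output place)
    transitions = {se: (places[i], places[i + 1])
                   for i, se in enumerate(sub_events)}
    return places, transitions


def run_plan(sub_events, observed):
    """Move the single plan-progress token; return (final place, plan completed?)."""
    places, transitions = build_plan_pn(sub_events)
    token = places[0]
    for event in observed:
        # A transition fires only if its input place currently holds the token.
        if event in transitions and transitions[event][0] == token:
            token = transitions[event][1]
    return token, token == places[-1]


print(run_plan(SUB_EVENTS, SUB_EVENTS))      # full sequence observed
print(run_plan(SUB_EVENTS, SUB_EVENTS[:2]))  # plan still in progress
```

The position of the token gives the snapshot of plan progress, and the number of transitions left before the last place is a crude measure of how far away the event is.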
Building an Object PN
• In an Object PN model, tokens represent objects in the system
• We define PN fragments that include places for each possible state of an object
• These fragments are then connected, as appropriate, to the special fragments representing the structure of the event (made up of sub-event fragments).
• Our objects of interest for the “safe crossing” event are train, car, signal and gate
• The object states are not known (they may be implicit in the event specification, or made explicit with a small extension to the ontology)
Building an Object PN
• The possible states of the car can be “inscene”, “stopped”, “inzone2”, “stoppedinzone2”.
• Each of these states is represented as a place in the car state transition fragment.
• Conditional transition nodes allow changes of state based on token properties.
• Similar fragments are constructed for train, gate and signal
• States are connected appropriately to the event structure fragment
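A car state fragment with conditional transitions might look like the sketch below. The four place names come from the slide. Everything else is assumed for illustration: the token properties speed and in_zone2, the guard conditions, and which arcs exist between the places.

```python
from dataclasses import dataclass


@dataclass
class CarToken:
    place: str = "inscene"   # place currently holding this token
    speed: float = 1.0       # hypothetical property from the tracker
    in_zone2: bool = False   # hypothetical property from the tracker


# (source place, target place, guard on token properties); arcs are assumed.
CAR_TRANSITIONS = [
    ("inscene", "stopped", lambda c: c.speed == 0 and not c.in_zone2),
    ("stopped", "inscene", lambda c: c.speed > 0 and not c.in_zone2),
    ("inscene", "inzone2", lambda c: c.speed > 0 and c.in_zone2),
    ("inzone2", "stoppedinzone2", lambda c: c.speed == 0 and c.in_zone2),
    ("stoppedinzone2", "inzone2", lambda c: c.speed > 0 and c.in_zone2),
]


def step(car):
    """Fire the first conditional transition enabled for this token, if any."""
    for src, dst, guard in CAR_TRANSITIONS:
        if car.place == src and guard(car):
            car.place = dst
            return True
    return False


car = CarToken()
car.in_zone2 = True  # tracker reports that the car entered zone 2
step(car)
car.speed = 0        # tracker reports that the car stopped
step(car)
print(car.place)
```

Each guard plays the role of an enabling rule: the transition can fire only when the token's properties satisfy the condition, in addition to the token being in the input place.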
Building an Object PN
Petri Nets- Summary
• Inference is the propagation of tokens through the PN
• Relies on semantic structure
• ‘Snapshot’ of the system
• ‘How far away?’, quantified by probabilities or absolute time.
• Naturally capture inherent video event properties such as concurrency and partial ordering
• May be specified subjectively
• A consistent process for specification needs to be implemented
What About Uncertainty??
• Video events ARE uncertain
• The PN model presented is deterministic
• Uncertainty resides at lower levels
• The higher level becomes more qualitative than quantitative
• Propagation Nets (Shi and Bobick 2004) assign a probability to each state transition
• Other work assigns a probability to tokens as they propagate through the network
• It is not critical to assign a probability to each event
– Prof. Bobick’s .7 vs .00000000000001 example
• If two events are feasible they should both be considered
• We have seen that multiple event explanations can exist
Thank You
Acknowledgements: Michael Rudzsky, Artyom Borzin
Questions?
Publications
• Surveillance Event Interpretation Using Generalized Stochastic Petri Nets. Artyom Borzin, Ehud Rivlin, and Michael Rudzsky. WIAMIS 2007, The 8th International Workshop on Image Analysis for Multimedia Interactive Services, 6-8 June 2007, Santorini, Greece.
• Representation and Recognition of Multiagent Interactions by Generalized Stochastic Petri Nets. Artyom Borzin, Ehud Rivlin, and Michael Rudzsky. CBMI-2007, Fifth International Workshop on Content-Based Multimedia Indexing, June 25-27, 2007, Bordeaux, France.
• Building Petri Nets from Video Event Ontologies. Gal Lavee, Artyom Borzin, Ehud Rivlin, and Michael Rudzsky. Advances in Visual Computing, Third International Symposium, ISVC 2007, Lake Tahoe, NV, USA, November 26-28, 2007. LNCS 4841, pp. 442-451.
• Recognition of Human Behavior from Surveillance Video using Marking Analysis in Generalized Stochastic Petri Nets. Artyom Borzin. MSc Thesis.
References
• Nagia M. Ghanem, Petri Net Models for Event Recognition in Surveillance Videos, PhD Thesis, 2007.
• Nagia Ghanem, Daniel DeMenthon, David Doermann, Larry Davis, "Representation and Recognition of Events in Surveillance Video Using Petri Nets," 2004 Conference on Computer Vision and Pattern Recognition Workshop (CVPRW'04), Volume 7, p. 112, 2004.
• C. Castel, L. Chaudron, and C. Tessier, "What is going on? A high level interpretation of sequences of images," in Proceedings of the Workshop on Conceptual Descriptions from Images, ECCV, 1996, pp. 13-27.