Data Conformance Checking
using Optimal Alignments
Felix Mannhardt, Massimiliano de Leoni,
Hajo A. Reijers
Problem (Adapted from Massimiliano de Leoni)
(a; {A = 5001; R = Michael; E = Pete});
(b; {V = OK; E = Pete});
(c; {I = 530; D = NOK; E = Sue});
(f; {E = Pete});

(a; {A = 3000; R = Michael; E = Pete});
(b; {V = OK; E = Sue});
(c; {I = 530; D = OK; E = Sue});
(f; {E = Pete});
• Activity d should have occurred, since amount < 5000
• Activity h hasn't been executed: D cannot be «OK»
• «Sue» is not authorized to perform b: is not Assistant
Department of Mathematics and Computer Science
How Does Data Alignment Work?
• Petri Nets with Data:
  Figure: Petri net with transitions A, B, C, D, X and two guards on a data variable, one with < 5000 and one with ≥ 5000
• Two new “Moves” with associated “Costs”:
  • Move with incorrect write operation
  • Move with missing write operation
• Formulation of an MILP problem for CF Alignment:
  • Objective: min … + cost(missing) = κ_DA
  • Constraints of the pattern … − … ≤ … ∧ −… − … ≤ −… ∧ … ≥ …
  • Variable domains: … ∈ ℤ
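The two data moves can be illustrated with a small cost count. The Python sketch below is not the MILP; it only scores given pairs of log writes and process writes, assuming unit costs for both move types. The names `data_move_cost` and `data_alignment_cost` and the cost parameters are hypothetical, and the values are taken from the example traces in this deck.

```python
from typing import Dict, List, Tuple

Writes = Dict[str, object]

def data_move_cost(log_writes: Writes, process_writes: Writes,
                   cost_incorrect: int = 1, cost_missing: int = 1) -> int:
    """Score one move in both (hypothetical helper, unit costs assumed)."""
    cost = 0
    for variable, process_value in process_writes.items():
        if variable not in log_writes:
            cost += cost_missing       # move with missing write operation
        elif log_writes[variable] != process_value:
            cost += cost_incorrect     # move with incorrect write operation
    return cost

def data_alignment_cost(moves: List[Tuple[Writes, Writes]]) -> int:
    """Sum the data cost over all (log writes, process writes) pairs."""
    return sum(data_move_cost(log, process) for log, process in moves)

# Log side vs. process side for activities a, b, c, f.
moves = [
    ({"A": 3000, "R": "Michael"}, {"A": 5001, "R": "Michael"}),
    ({"V": "NOK"}, {"V": "OK"}),
    ({"I": 530, "D": "OK"}, {"I": 530, "D": "NOK"}),
    ({}, {}),
]
total_cost = data_alignment_cost(moves)
```

With unit costs, the three differing writes (A, V and D) give a total of 3.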
[1] M. de Leoni, W. M. P. van der Aalst (2013). Aligning Event Logs and Process Models for Multi-Perspective Conformance Checking: An Approach Based on Integer Linear Programming.
[2] A. Adriansyah, B. F. van Dongen, W. M. P. van der Aalst (2011). Conformance checking using cost-based fitness analysis.
Current Data Conformance Checker in ProM
Figure: pipeline of the current checker
Input: Petri Net with Data, Event Log, Cost
Data Conformance Checking: Control Flow Alignment, then Data Alignment
Output: Data Alignment
Shortcomings of the Current Solution
(a; {A = 3000; R = Michael});
(b; {V = NOK});
(c; {I = 530; D = OK});
(f);
Perfect CF Alignment (κ_CA = [illegible])
L | a | b | c | f
P | a | b | c | f
Resulting DF Alignment (κ_DA = [illegible])
(a; {A = 5001; R = Michael});
(b; {V = OK});
(c; {I = 530; D = NOK});
(f);
Better DF Alignment (κ_CA = [illegible])
(a; {A = 3000; R = Michael});
(b; {V = NOK});
(c; {I = 530; D = OK});
(f);
First Idea (Multi-Alignment Approach)
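Figure: flow of the multi-alignment approach
Input: Petri Net with Data, Event Log, Cost
Data Conformance Checking: Control Flow Alignment, then Data Alignment, then Optimal? (κ_CA ≥ [bound illegible] OR κ_DA = [illegible])
  No: continue with a further Control Flow Alignment
  Yes: stop
Output: Optimal Data Alignment
Cache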
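The loop behind the multi-alignment idea can be sketched in Python as follows. The stopping condition on the slide is not legible in this copy, so the two rules used here are assumptions: stop once the control-flow cost of the next candidate already reaches the best total found so far, or once a candidate has data cost 0. The function `multi_alignment`, the callback `data_cost` (standing in for one MILP run) and the toy costs are all hypothetical.

```python
from typing import Callable, Iterable, List, Optional, Tuple

def multi_alignment(cf_alignments: Iterable[Tuple[int, str]],
                    data_cost: Callable[[str], int]) -> Optional[Tuple[int, str]]:
    """Pick the candidate with minimal total cost.

    cf_alignments must be sorted by ascending control-flow cost.
    """
    best: Optional[Tuple[int, str]] = None
    for cf_cost, alignment in cf_alignments:
        if best is not None and cf_cost >= best[0]:
            break  # assumed rule: no later candidate can beat the best total
        d_cost = data_cost(alignment)  # stands in for one MILP
        total = cf_cost + d_cost
        if best is None or total < best[0]:
            best = (total, alignment)
        if d_cost == 0:
            break  # assumed rule: a data cost of 0 cannot be improved
    return best

milp_calls: List[str] = []

def toy_data_cost(alignment: str) -> int:
    """Toy stand-in for the MILP; records each call."""
    milp_calls.append(alignment)
    return {"perfect": 3, "one-log-move": 0, "two-log-moves": 0}[alignment]

candidates = [(0, "perfect"), (1, "one-log-move"), (2, "two-log-moves")]
result = multi_alignment(candidates, toy_data_cost)
```

In this toy run the perfect control-flow alignment costs 3 in total, while the candidate with one log move costs 1, so the second candidate wins after two data-cost computations.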
Image source: http://commons.wikimedia.org/wiki/File:Pictofigo_-_Idea.png
Second Idea (Single-Alignment Approach)
(a; {A = 3000;R = Michael});
(b; {V = NOK});
(c; {I = 530;D = OK});
(f);
Figure: search space over prefixes of the trace. Nodes: <>, <a>, <a,b>, <a,b,c>, <a,b,c,f>, …; edges are labeled with the move taken (e.g. Move in Both); nodes carry cost pairs (κ_CA, κ_DA) such as (0,0), (1,0), (0,2).
Single-Alignment Approach I
• For each node in the search space
  • Compute a Data Alignment (MILP) for the prefix
  • Remember κ_DA and the variable assignment
  • Use the variable assignment to check if an MILP is needed
• A best-first search on the overall cost (κ_DA + κ_CA) returns one optimal Data Alignment
• Use of ILP heuristic for A* [2] still possible
• κ_DA + κ_CA never gets better (no negative edges!)
• But, our search space is bigger!
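A minimal Python sketch of such a best-first search is given below. It assumes that every state is hashable, that the data cost of a state comes from a callback (standing in for the MILP on the prefix), and that the combined cost never decreases along an edge. The function `best_first`, the toy graph and the `>>` notation for a log move are hypothetical.

```python
import heapq
from itertools import count
from typing import Callable, Dict, Iterable, List, Optional, Tuple

def best_first(start: str,
               successors: Callable[[str], Iterable[Tuple[int, str]]],
               is_goal: Callable[[str], bool],
               data_cost: Callable[[str], int]) -> Optional[Tuple[int, str]]:
    """Expand states in order of control-flow cost plus data cost."""
    tie = count()  # tie-breaker so the heap never compares states
    queue: List[Tuple[int, int, int, str]] = [(data_cost(start), next(tie), 0, start)]
    closed = set()
    while queue:
        total, _, cf_cost, state = heapq.heappop(queue)
        if state in closed:
            continue
        closed.add(state)
        if is_goal(state):
            return total, state
        for move_cost, nxt in successors(state):
            if nxt not in closed:
                new_cf = cf_cost + move_cost
                heapq.heappush(queue, (new_cf + data_cost(nxt), next(tie), new_cf, nxt))
    return None

# Toy search space: each edge is (control-flow cost of the move, next state).
edges: Dict[str, List[Tuple[int, str]]] = {
    "<>": [(0, "a")],
    "a": [(0, "a,b"), (1, "a,>>")],
    "a,b": [(0, "a,b,c")],
    "a,>>": [(0, "a,>>,c")],
    "a,b,c": [(0, "a,b,c,f")],
    "a,>>,c": [(0, "a,>>,c,f")],
}
toy_data = {"<>": 0, "a": 0, "a,b": 0, "a,>>": 0,
            "a,b,c": 2, "a,>>,c": 0, "a,b,c,f": 2, "a,>>,c,f": 0}

result = best_first("<>", lambda s: edges.get(s, []),
                    lambda s: s.endswith("f"), toy_data.__getitem__)
```

The toy graph mirrors the cost pairs (0,2) and (1,0): the path with only moves in both ends at total cost 2, while the path with one log move ends at total cost 1 and is returned.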
Single-Alignment Approach: Search Space
• Two states are equivalent iff
  • Same marking of “Event Net” & Process Model as in [2]
  • Same variable assignment wrt. all guards
Example (shown in two columns on the slide, separated by “vs.”):
(a; {A = 5001; R = Michael});
(b; {V = OK});
(c; {I = 530; D = OK});
(c; {I = 530; D = NOK});
(f);
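One way to turn this equivalence into a lookup key is sketched below in Python. It assumes that "same variable assignment wrt. all guards" means equal values for the variables that occur in some guard; the slide could also be read as equal guard outcomes. The class `SearchState`, the helper `make_state` and the marking encoding are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Iterable, Sequence, Tuple

@dataclass(frozen=True)
class SearchState:
    """Hashable key: two states with equal fields count as equivalent."""
    event_net_marking: Tuple[int, ...]
    model_marking: Tuple[int, ...]
    guard_assignment: FrozenSet[Tuple[str, object]]

def make_state(event_net_marking: Sequence[int],
               model_marking: Sequence[int],
               assignment: Dict[str, object],
               guard_variables: Iterable[str]) -> SearchState:
    # Keep only variables that occur in a guard (assumed interpretation).
    relevant = frozenset((v, assignment[v]) for v in guard_variables if v in assignment)
    return SearchState(tuple(event_net_marking), tuple(model_marking), relevant)

guard_vars = ["A", "D"]
s1 = make_state([0, 1], [0, 1, 0], {"A": 5001, "R": "Michael", "D": "OK"}, guard_vars)
s2 = make_state([0, 1], [0, 1, 0], {"A": 5001, "R": "Sue", "D": "OK"}, guard_vars)
s3 = make_state([0, 1], [0, 1, 0], {"A": 5001, "R": "Michael", "D": "NOK"}, guard_vars)
distinct_states = {s1, s2, s3}
```

Here `s1` and `s2` differ only in R, which no guard reads in this toy setup, so they collapse into one state, while `s3` differs in D and stays separate.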
Comparison: Improvement of Fitness
Figure: bar chart “Change in Fitness” (axis 0% to 60%), showing Average and Max for two logs: Insurance Institute (12,000 Traces) and Synthetic Example (1,200 Traces, Length 4-15, 10% Noise). Bar annotations: 254 Traces, 2 Traces.
Comparison: Dutch Insurance Institute
Figure: bar chart “Running Time” in Seconds (axis 0 to 120) for Old, Multi, Single.
Comparison: Dutch Insurance Institute [1]
Figure: two bar charts of # MILP (panels Average and Max; axes 0 to 4.5 and 0 to 120) for Multi and Single.
Comparison: Dutch Insurance Institute
Figure: two bar charts of # Queued States (panels Average and Max; axes 0 to 50 and 0 to 4,000) for Old, Multi, Single. Bar annotation: 64.
Comparison: Synthetic Model (10% Noise)
Figure: bar chart “Running Time” in Seconds (axis 0 to 80) for Old, Multi, Single.
Comparison: Synthetic Model (10% Noise)
Figure: two bar charts of # MILP (panels Average and Max; axes 0 to 16 and 0 to 1,600) for Multi and Single.
Comparison: Synthetic Model (10% Noise)
Figure: two bar charts of # Queued States (panels Average and Max; axes 0 to 800 and 0 to 180,000) for Old, Multi, Single. Bar annotations: 66, 26.
Comparison Wrap-up
• Multi-Alignment Approach
  • Building CF Alignments (sorted) up to a certain κ_CA is not feasible for certain models/traces
  • Though faster in some cases (Good Fitness)
• Single-Alignment Approach
  • Again, solving many (smaller) MILPs
  • Integrated Optimizations:
    – Check if guards already fulfilled ⇒ No MILP
    – Check if only write operations missing ⇒ No MILP
    – Re-use calculated Data Alignments ⇒ 1 x MILP
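Two of these optimizations can be sketched in Python as a thin wrapper around the solver call. The sketch assumes that guards are plain predicates over a variable assignment, that an assignment satisfying all guards has data cost 0, and that results can be cached per assignment. The function `cached_data_cost`, the callback `solve_milp` and the toy guard are hypothetical; the shortcut for missing write operations is left out.

```python
from typing import Callable, Dict, FrozenSet, List, Tuple

Assignment = Dict[str, int]
Guard = Callable[[Assignment], bool]
Cache = Dict[FrozenSet[Tuple[str, int]], int]

def cached_data_cost(assignment: Assignment, guards: List[Guard],
                     solve_milp: Callable[[Assignment], int],
                     cache: Cache) -> int:
    """Return the data cost, calling the solver only when unavoidable."""
    key = frozenset(assignment.items())
    if key in cache:
        return cache[key]                 # re-use a calculated result
    if all(guard(assignment) for guard in guards):
        cost = 0                          # guards already fulfilled: no MILP
    else:
        cost = solve_milp(assignment)     # one MILP for this assignment
    cache[key] = cost
    return cost

solver_calls: List[Assignment] = []

def toy_solver(assignment: Assignment) -> int:
    """Toy stand-in for the MILP; records each call and returns cost 1."""
    solver_calls.append(dict(assignment))
    return 1

guards: List[Guard] = [lambda a: a["A"] >= 5000]
cache: Cache = {}
cost_ok = cached_data_cost({"A": 5001}, guards, toy_solver, cache)
cost_bad = cached_data_cost({"A": 3000}, guards, toy_solver, cache)
cost_bad_again = cached_data_cost({"A": 3000}, guards, toy_solver, cache)
```

In this toy run the solver is called once: the first assignment already satisfies the guard, and the repeated assignment is answered from the cache.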
What Next?
• Improve the Implementation
  • Faster MILP solving by re-using the lpsolve instance?
  • Reduce memory footprint of both approaches?
  • Will a Decomposition of the process model help?
• Case study with Event Log from Italian local police
  • Event Log about the management of road-traffic fines
  • Process with multiple decision points
  • Process with non-trivial guards
  • Event Log contains data attributes
• Submit Paper to FASE 2014
Summary
• Current Data Alignment could be sub-optimal
• Two approaches for an optimal Data Alignment
  • Multi-Alignment Approach
    Figure: CF Alignment and MILP, repeated for several CF Alignments (…), then Optimal?, then Data Alignment
  • Single-Alignment Approach
    Figure: Best First Search, then Data Alignment
• Both implemented in ProM
  • Soon to be integrated in Data Aware Replayer
  • Which one to use depends on the case