DEBS 2009 - University of Toronto

Parallel Event Processing for Content-Based
Publish/Subscribe Systems
Amer Farroukh
Department of Electrical and Computer Engineering
University of Toronto
Joint work with Elias Ferzli, Naweed Tajuddin, and Hans-Arno Jacobsen
DEBS 2009
Motivation
• Event processing is ubiquitous in enterprise-scale
applications (Fraud detection, Data analysis)
• Network security monitoring and analysis tools require
gigabit-per-second speeds (application-layer firewalls)
• Selective dissemination of information for Internet-scale applications (RSS, XML, XPath)
• These systems need to support thousands of users and
process millions of events
• Achieving scalability and high performance under
heavy load is a challenging problem
• The matching engine is the most computation-intensive
function in event processing
How to support high data-processing
rates?
• Choose an existing, powerful matching
algorithm
• Leverage chip multi-processors
• Increase throughput or reduce matching time
• Evaluate multi-threading vs. software
transactional memory
Outline
• Related work
• Matching algorithm
• Parallelization techniques
• Implementation and results
Sequential Matching Algorithms
• Single phase: A_TREAT [E.H., 1992]
– Predicates are compiled into a test network
– Subscriptions may appear in one or several leaves
– Poor locality, space consuming, hard to maintain
• Two phase: SIFT [T.Y., 2000]
– Predicates are evaluated in the first phase
– Subscriptions are matched in the second phase
– Predicates and subscriptions are indexed
• Algorithm used: Filtering Algorithms [F.F., 2001]
Matching Algorithm
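[Figure: event E (P1, P2) is evaluated against the Price, Color, and Quantity predicate tables in Phase 1, producing a bit vector. In Phase 2, access predicates Ap1 ... Ap5 lead to clusters C1 to C3 and to the matched subscriptions S1, S5, and S9.]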
Multiple Events Independent Processing
[Figure: Thread 1 processes event E1 and Thread 2 processes event E2. Each thread runs Phase 1 (Price, Color, Quantity tables, one bit vector per event) and Phase 2 (access predicates Ap1 ... Ap5, clusters C1 to C3) on its own event.]
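The independent-processing scheme can be sketched as below. This is a minimal illustration and not the authors' implementation: the `SUBSCRIPTIONS` store, `match_event`, and `process_events` are hypothetical names, and the matcher is a simple linear scan rather than the two-phase algorithm. In CPython the global interpreter lock prevents real speedup for CPU-bound work, so the sketch shows only the structure: each event is handled by exactly one thread, and the shared data is read-only, so no synchronization is needed.

```python
import operator
from concurrent.futures import ThreadPoolExecutor

OPS = {"=": operator.eq, "<": operator.lt, ">": operator.gt}

# Hypothetical read-only subscription store shared by all threads.
SUBSCRIPTIONS = {
    "S1": [("quantity", "=", 2), ("price", "<", 30)],
    "S2": [("quantity", ">", 4), ("price", "=", 20)],
}

def match_event(event):
    """Sequentially match one event; reads shared data, writes only locals."""
    return [
        sub_id
        for sub_id, predicates in SUBSCRIPTIONS.items()
        if all(attr in event and OPS[op](event[attr], value)
               for attr, op, value in predicates)
    ]

def process_events(events, num_threads=2):
    # Each event is handled start to finish by a single worker thread.
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        return list(pool.map(match_event, events))

results = process_events([
    {"quantity": 2, "price": 10},   # satisfies S1
    {"quantity": 5, "price": 20},   # satisfies S2
])
```

Because no state is shared between events, adding threads raises throughput without changing the matching time of an individual event.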
Single Event Collaborative Processing
[Figure: Thread 1 and Thread 2 process a single event E together, dividing the Phase 1 work over the Price, Color, and Quantity tables and the Phase 2 work over the clusters under Ap1 ... Ap5 (C1 to C3).]
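A lock-based variant of collaborative processing might look like the sketch below. The interleaved partitioning, the single shared lock, and all names (`PREDICATES`, `SUBSCRIPTIONS`, `match_collaboratively`) are assumptions made for illustration; the slides do not specify how work is divided. Joining the threads between the two phases acts as a barrier, so Phase 2 only reads a complete bit vector.

```python
import operator
import threading

OPS = {"=": operator.eq, "<": operator.lt, ">": operator.gt}

# Hypothetical predicate list; the index of each entry is its bit position.
PREDICATES = [
    ("quantity", "=", 2),
    ("price", "<", 30),
    ("quantity", ">", 4),
    ("price", "=", 20),
]
# Hypothetical subscriptions: id -> bit positions of its predicates.
SUBSCRIPTIONS = {"S1": [0, 1], "S2": [2, 3]}

def match_collaboratively(event, num_threads=2):
    bit_vector = [0] * len(PREDICATES)   # shared by all workers
    matches = []                         # shared by all workers
    lock = threading.Lock()

    def phase1(worker):
        # Each worker evaluates an interleaved slice of the predicates.
        for bit in range(worker, len(PREDICATES), num_threads):
            attr, op, value = PREDICATES[bit]
            if attr in event and OPS[op](event[attr], value):
                with lock:
                    bit_vector[bit] = 1

    def phase2(worker):
        # Each worker checks an interleaved slice of the subscriptions.
        for sub_id in sorted(SUBSCRIPTIONS)[worker::num_threads]:
            if all(bit_vector[b] for b in SUBSCRIPTIONS[sub_id]):
                with lock:
                    matches.append(sub_id)

    for phase in (phase1, phase2):
        threads = [threading.Thread(target=phase, args=(w,))
                   for w in range(num_threads)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()   # barrier: finish this phase before starting the next
    return sorted(matches)

result = match_collaboratively({"quantity": 5, "price": 20})
```

Here all threads work on the same event, so the aim is a shorter matching time per event rather than higher throughput.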
Multiple Events Collaborative Processing
[Figure: threads are divided into groups. Group 1 (T1, T2) processes event E1 and Group 2 (T3, T4) processes event E2; the threads within a group collaborate on Phase 1 (Price, Color, Quantity tables) and Phase 2 (access predicates Ap1 ... Ap5, clusters C1 to C3) for their event.]
Implementation Setup
• Synchronization
– Static
– Locks
– Software transactional memory (STM)
• Machine
– 2.33GHz quad-core Xeon processors
– 32KB L1 cache and 4MB L2 cache
• Workload
– Number of subscriptions: 1M to 6M
– Average predicates per subscription: 10
– Predicate range: 1 to 15
– Number of events: 5000
– Average attributes per event: 50
Multiple Events Independent Processing Analysis
• Linear throughput and constant average matching time
Single Event Collaborative Processing Analysis
• Lock-based implementation performs best
• Bit vector size limits scalability
Multiple Events Collaborative Processing Analysis
• Threads can be allocated based on system requirements and load
Conclusions
• Parallel matching engine is a promising solution
• Over 1600 events/s with 6M subscriptions
• Trade-off between matching time and throughput
• Lock-based implementation is more efficient
• HTM is a potential candidate for improving speed and easing implementation
Predicate Tables (Phase 1)
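S1: quantity = 2, price < 30
S2: quantity > 4, price = 20
[Figure: one predicate table per attribute. The QUANTITY table has columns for the values 1 to 5 and the PRICE table has columns for the values 10 to 50; each table has one row per operator (EQUAL, LESS, GREATER, NOT EQUAL). The four predicates of S1 and S2 occupy the cells numbered 1 to 4: 1 (quantity = 2) and 3 (quantity > 4) in QUANTITY, 2 (price < 30) and 4 (price = 20) in PRICE.]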
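Phase 1 can be sketched as follows, using the two subscriptions from this slide. The `PredicateTable` class, its dictionary layout, and the zero-based bit positions are assumptions made for illustration. For simplicity the sketch scans the predicates of each attribute, whereas the tables on the slide are laid out by operator and value.

```python
import operator
from collections import defaultdict

# Operator names mirror the table rows on the slide.
OPS = {
    "EQUAL": operator.eq,
    "LESS": operator.lt,
    "GREATER": operator.gt,
    "NOT EQUAL": operator.ne,
}

class PredicateTable:
    """Assigns one bit position to each distinct (attribute, op, value)."""

    def __init__(self):
        self.bit_of = {}                        # predicate -> bit position
        self.by_attribute = defaultdict(list)   # attribute -> [(op, value, bit)]

    def add(self, attribute, op, value):
        key = (attribute, op, value)
        if key not in self.bit_of:
            bit = len(self.bit_of)
            self.bit_of[key] = bit
            self.by_attribute[attribute].append((op, value, bit))
        return self.bit_of[key]

    def evaluate(self, event):
        """Phase 1: return a bit vector with a 1 per satisfied predicate."""
        bits = [0] * len(self.bit_of)
        for attribute, event_value in event.items():
            for op, value, bit in self.by_attribute.get(attribute, []):
                if OPS[op](event_value, value):
                    bits[bit] = 1
        return bits

table = PredicateTable()
# S1: quantity = 2, price < 30
table.add("quantity", "EQUAL", 2)     # bit 0
table.add("price", "LESS", 30)        # bit 1
# S2: quantity > 4, price = 20
table.add("quantity", "GREATER", 4)   # bit 2
table.add("price", "EQUAL", 20)       # bit 3

bit_vector = table.evaluate({"quantity": 2, "price": 20})
```

For the event quantity = 2, price = 20, three of the four predicates hold; only quantity > 4 fails.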
Subscription Clusters (Phase 2)
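[Figure: subscriptions grouped into clusters by access predicate. Ap1 leads to a cluster holding S1 to S4 with predicate columns P1 to P3; Ap2 leads to a cluster holding S5 (P4); further clusters follow up to ApN.]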
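Phase 2 can be sketched as a lookup over clusters keyed by access predicate. The `ClusterIndex` class, the choice of the first predicate as access predicate, and subscription S3 are assumptions made for illustration. A cluster is visited only when its access predicate bit is set in the Phase 1 bit vector.

```python
from collections import defaultdict

class ClusterIndex:
    def __init__(self):
        # access predicate bit -> [(subscription id, remaining predicate bits)]
        self.clusters = defaultdict(list)

    def add(self, sub_id, predicate_bits):
        # Assumption: the first predicate serves as the access predicate.
        access, rest = predicate_bits[0], tuple(predicate_bits[1:])
        self.clusters[access].append((sub_id, rest))

    def match(self, bit_vector):
        """Phase 2: return the subscriptions whose predicate bits are all set."""
        matched = []
        for access, members in self.clusters.items():
            if not bit_vector[access]:
                continue   # access predicate failed: skip the whole cluster
            for sub_id, rest in members:
                if all(bit_vector[b] for b in rest):
                    matched.append(sub_id)
        return matched

index = ClusterIndex()
index.add("S1", [0, 1])   # quantity = 2, price < 30
index.add("S2", [2, 3])   # quantity > 4, price = 20
index.add("S3", [0, 3])   # hypothetical: quantity = 2, price = 20

# Bit vector for an event with quantity = 2 and price = 20.
matches = index.match([1, 1, 0, 1])
```

The cluster behind bit 2 is skipped without inspecting S2, which is where the saving from access predicates comes from.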
Time Profiling
Block Size
Subscriptions Effect
SE-CP
ME-IP