Efficient Dissemination of Personalized Information Using Content-Based Multicast (CBM)
Rahul Shah*, Dept. of Computer Science, Rutgers University
Ravi Jain*, Autonomous Comm. Lab, NTT DoCoMo USA Labs
Farooq Anjum, Applied Research, Telcordia
*Work performed while at Applied Research, Telcordia
Outline
 Motivation and background
 Problem definition
 Simulation results
 Concluding remarks
Ravi Jain / 20-Jun-02 / 2
Mobile Filters for Efficient
Personalized Information Delivery
 Users want targeted, personalized information, particularly
– as the amount and diversity of information increases, and
– as end-device capabilities are limited and resources are scarce
 Applications like personalized information delivery to large
numbers of users rely on multicast to conserve resources
 Traditional network multicast (e.g. IP multicast)
– does not consider the content or semantics of the information sent
– becomes difficult to manage as the number of groups increases
 Content-Based Multicast (CBM) filters the information being
sent down the multicast tree in accordance with the interests
of the recipients
 Problem: how to place software information filters in
response to
– the location and interests of the users, and how these change
– the additional cost and complexity of the filters
Related work
 Multicast
– Application layer multicast
 Assumes only unicast at the IP layer, while CBM assumes a multicast tree
(either at the IP or the application layer)
 Examples: Francis, Yoid, 2000; Chu et al., End System Multicast,
Sigmetrics 2000; Chawathe et al., Scattercast, 2000
– Publish-subscribe systems
 Many-many distribution with matching done by brokers in the network
 In CBM the brokers form the underlying multicast tree
 Examples: Aguilera, 1998; Banavar, 1998; Carzaniga, 1998
– Modifications to IP multicast
 Opyrchal, Minimizing number of multicast groups, Middleware 2000
 Wen et al., Use active network approaches, OpenArch 2001
 Theoretical work
– Classical k-median and facility location problems
Multicast filtering example
[Figure: multicast tree rooted at the content source, which offers items 1 to 8, with 15 links and 9 users at the leaves; active filters are marked at internal nodes. Link labels shown: {1, 3, 4, 5, 6, 7, 8}, {1, 3, 5, 6, 7, 8}, {1, 3, 5}, {3, 6, 7, 8}, {1, 3, 5, 8}, {4, 6, 7, 8}. Items desired by the users: {1, 3}, {1, 5}, {7, 8}, {3, 6}, {4}, {6, 7, 8}, {3, 8}, {1, 8}, {3, 5}.]
• Without filters, all 8 items are sent on all 15 links = 120 traffic units
• With filters at all internal nodes, traffic = 47 units
• With filters at 3 internal nodes, traffic = 63 units
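The traffic counts on this slide can be reproduced mechanically. The sketch below is a minimal illustration, not the authors' code: it assumes that a node with a filter forwards to each child only the items desired somewhere below that child, and that a node without a filter forwards everything it receives. The tree, the node names, and the function `total_traffic` are hypothetical and much smaller than the tree in the figure.

```python
def total_traffic(children, wants, root, filters, total_items):
    """Sum the number of items carried on every link of the tree.

    children:    dict mapping an internal node to a list of its children
    wants:       dict mapping each user (leaf) to the set of items it desires
    filters:     set of internal nodes that run an active filter
    total_items: number of items the source offers
    """
    needed = {}

    def collect(node):
        # Items desired by at least one user at or below this node.
        if node in wants:
            needed[node] = set(wants[node])
        else:
            needed[node] = set().union(*(collect(c) for c in children[node]))
        return needed[node]

    collect(root)
    total = 0
    stack = [(root, total_items)]
    while stack:
        node, incoming = stack.pop()
        for child in children.get(node, []):
            # A filtering node sends only what the child's subtree needs;
            # otherwise it forwards whatever it received.
            outgoing = len(needed[child]) if node in filters else incoming
            total += outgoing
            stack.append((child, outgoing))
    return total


# Hypothetical 6-link tree: source S, two internal nodes, four users.
children = {"S": ["A", "B"], "A": ["u1", "u2"], "B": ["u3", "u4"]}
wants = {"u1": {1, 2}, "u2": {2}, "u3": {3}, "u4": {3, 4}}

print(total_traffic(children, wants, "S", set(), 5))            # 30 = 5 items x 6 links
print(total_traffic(children, wants, "S", {"S"}, 5))            # 12
print(total_traffic(children, wants, "S", {"S", "A", "B"}, 5))  # 10
```

As on the slide, a few well-placed filters recover most of the saving: one filter at the source already cuts the traffic in this toy tree from 30 to 12 units.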
Mobile code problem definition
 Problem 1: Bandwidth optimization problem
– Criterion: Find optimal placement to minimize total bandwidth
– Cost model: k-Filters: Allow at most k filters to be used
 Problem 2: Delay optimization problem
– Criterion: Find optimal placement to minimize mean delivery delay
– Cost model: Delay:
 Each filter adds a delay D for processing
 The reduction in link utilization also results in reduction in link delay:
 Optimal placement changes as users move or change interests
– the filtering code should or could be mobile and
– the placement algorithm should be fast
 Results:
– Optimal centralized off-line algorithm for bandwidth optimization. Time = $O(k n^2)$
– Optimal centralized off-line algorithm for delay optimization. Time = $O(n^2)$
– Two centralized O(n) heuristics that restrict filter moves
– Evaluation using simulations
Filtering algorithm framework
 For simplicity, we assume the following framework
– 1: The multicast tree has previously been constructed and is known
– 2: Filters can be placed at all internal nodes of the multicast tree
– If not, simply consider the subtree where filters are permitted
– 3: Subscriptions propagate from the users to the source
 There is a simple list of information items that users can request
 Subscription changes are batched at the source
– At every batch (time slice) x% of the users change subscription
– 4: The source calculates filter placements
– 5: The source dispatches filters to the (new) placement
 Currently we ignore the signaling costs of subscriptions and filter movement because they are negligible for the applications considered (news clips, video clips, music, etc.)
 Alternatively, one could assume that filters are available at all nodes and are only activated or deactivated by signaling messages
Bandwidth minimization problem
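Optimal centralized algorithm
• f(x) = traffic required at node x
• Execution time = $O(k n^2)$, where n = number of nodes in the tree
• Time complexity calculated using Tamir (1996)
[Figure: model of the multicast tree at the source. Traffic f(p) arrives at node v, with p labeled "child of lowest filtering ancestor". The subtree at v is labeled T(v, i, p) and holds at most i filters; the left subtree carries f(l) and is labeled "j filters", the right subtree carries f(r) and is labeled "i - (j - 1) filters".]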
 Dynamic programming recurrence relations
– Traffic in the subtree rooted at v, with a filter at v:
$T(v, i, p) = f(l) + f(r) + \min_{0 \le j \le i-1} \left[ T(l, j, l) + T(r, i - j - 1, r) \right]$
– Traffic with no filter at v:
$S(v, i, p) = 2 f(p) + \min_{0 \le j \le i} \left[ T(l, j, p) + T(r, i - j, p) \right]$
– Traffic at a leaf node v:
$T(v, i, p) = S(v, i, p) = 0$
– Minimum traffic is
$\min\left[ T(v, k, p),\ S(v, k, p) \right]$
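The recurrences can be turned into a short memoized program. The sketch below is one possible reading, not the authors' implementation, and rests on three assumptions: p names the node whose requirement f(p) is the traffic flowing into v; the recursive terms for the children take the better of the filtered and unfiltered cases; and the unfiltered flow leaving the source is modeled by a virtual carrier `None` with f = `total_items`. The names `optimal_traffic` and `best`, and the example tree, are hypothetical.

```python
from functools import lru_cache


def optimal_traffic(children, wants, root, k, total_items):
    """Minimum total traffic on a binary tree using at most k filters.

    children: dict mapping an internal node to its (left, right) children
    wants:    dict mapping each leaf to the set of items it desires
    """
    needed = {}

    def collect(x):
        if x not in children:
            needed[x] = set(wants[x])
        else:
            left, right = children[x]
            needed[x] = collect(left) | collect(right)
        return needed[x]

    collect(root)
    f = {x: len(items) for x, items in needed.items()}
    f[None] = total_items  # assumption: virtual carrier for the unfiltered source flow

    @lru_cache(maxsize=None)
    def best(v, i, p):
        # min[T(v, i, p), S(v, i, p)]: cheapest traffic below v with at most
        # i filters, when the traffic arriving at v is f(p).
        if v not in children:
            return 0
        left, right = children[v]
        # S: no filter at v, both outgoing links carry f(p).
        cost = 2 * f[p] + min(
            best(left, j, p) + best(right, i - j, p) for j in range(i + 1)
        )
        if i >= 1:
            # T: filter at v uses one filter; links carry f(l) and f(r).
            with_filter = f[left] + f[right] + min(
                best(left, j, left) + best(right, i - 1 - j, right)
                for j in range(i)
            )
            cost = min(cost, with_filter)
        return cost

    return best(root, k, None)


# Hypothetical example: source S, two internal nodes, four users, 5 items.
children = {"S": ("A", "B"), "A": ("u1", "u2"), "B": ("u3", "u4")}
wants = {"u1": {1, 2}, "u2": {2}, "u3": {3}, "u4": {3, 4}}

for k in range(4):
    print(k, optimal_traffic(children, wants, "S", k, 5))  # 30, 12, 11, 10
```

Each state (v, i, p) tries every split of the filter budget between the two children. This direct memoization is convenient for small trees and makes no attempt to reach the $O(k n^2)$ bound quoted on the slide.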
Simulation results: Filters can be very effective
 Seven-level complete binary tree (n = 127), with 64 leaves
 m = 64 messages
 Uniform subscription: p(i, j) = Prob[user i subscribes to message j] = p
[Figure: optimum total traffic (messages, 0 to 9,000) versus subscription probability p (0 to 1), with one curve per number of filters k = 0, 3, 5, 10, 15, 20, 30, 63.]
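The two extreme curves of this experiment can be estimated in closed form. The sketch below assumes, beyond what the slide states, that users subscribe to each message independently with probability p, so a link with s users below it needs a given message with probability 1 - (1 - p)^s. The function names are hypothetical.

```python
def traffic_without_filters(levels, m):
    """All m messages travel on every link of a complete binary tree."""
    edges = 2 ** levels - 2  # a tree with 2**levels - 1 nodes has one fewer edge
    return m * edges


def expected_traffic_all_filters(levels, m, p):
    """Expected traffic when every internal node (source included) filters."""
    depth = levels - 1  # depth of the leaves; the root is at depth 0
    total = 0.0
    for d in range(1, depth + 1):
        edges_at_depth = 2 ** d           # links entering nodes at depth d
        users_below = 2 ** (depth - d)    # leaves under each such node
        prob_needed = 1.0 - (1.0 - p) ** users_below
        total += edges_at_depth * m * prob_needed
    return total


print(traffic_without_filters(7, 64))            # 8064
print(expected_traffic_all_filters(7, 64, 0.3))  # expected traffic with k = 63 at p = 0.3
```

With 126 links and 64 messages, the unfiltered total of 8,064 messages is consistent with the top of the chart's vertical axis.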
Interest Locality increases filtering benefits
Locality model: $P(i, j) = 1/N$ if $i = j$, and $P(i, j) = q^{r}/N$ otherwise, where $r = \mathrm{LCA}(i, j)$
q is a skew parameter inversely proportional to locality
[Figure: effect of locality, q. Optimum total traffic (messages, 0 to 9,000) versus number of filters k (0 to 64), with one curve per q = 1, 0.99, 0.97, 0.95, 0.9, 0.8, 0.7.]
Bandwidth minimization problem
Heuristic centralized algorithm
 Node importance, I = amount by which total traffic changes by placing a filter there
 Execution time = $O(n)$
[Figure: node v with incoming traffic f(v); its left and right subtrees carry f(l) and f(r) over z(l) and z(r) affected edges; at most k filters.]
 Importance of node v:
$I(v) = (f(v) - f(l))\, z(l) + (f(v) - f(r))\, z(r)$, where
$z(x) = 1$ if x has a filter, and $z(x) = 1 + z(\text{left child of } x) + z(\text{right child of } x)$ otherwise
 z(x) is the number of edges in the subtree rooted at x affected by a filter at x
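The importance formula translates directly into two small functions. The sketch below follows the definitions on this slide, with one added assumption that the slide does not state: a leaf counts as a single affected edge, so z(leaf) = 1. The function names and the example values of f are hypothetical.

```python
def affected_edges(x, children, filters):
    """z(x): 1 if x has a filter, else 1 + z(left child) + z(right child).

    Assumption: a leaf, which has no children, also contributes 1.
    """
    if x in filters or x not in children:
        return 1
    left, right = children[x]
    return (1 + affected_edges(left, children, filters)
            + affected_edges(right, children, filters))


def importance(v, children, f, filters):
    """I(v) = (f(v) - f(l)) z(l) + (f(v) - f(r)) z(r) for an internal node v."""
    left, right = children[v]
    return ((f[v] - f[left]) * affected_edges(left, children, filters)
            + (f[v] - f[right]) * affected_edges(right, children, filters))


# Hypothetical tree and traffic requirements f(x).
children = {"S": ("A", "B"), "A": ("u1", "u2"), "B": ("u3", "u4")}
f = {"S": 4, "A": 2, "B": 2, "u1": 2, "u2": 1, "u3": 1, "u4": 2}

print(importance("S", children, f, set()))   # 12
print(importance("S", children, f, {"A"}))   # 8: the filter at A shields its subtree
print(importance("A", children, f, set()))   # 1
```

Each call walks the subtree below v, so evaluating every node this way costs more than the $O(n)$ quoted on the slide; a single bottom-up pass that caches z(x) would avoid the repeated work.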
Centralized heuristic
 Subscriptions propagate up to the source, which
– calculates the required flow amount at each edge and the
Importance value of each node
– tries the Importance Flip
 $I_{\max}(v) = \max \{ I(v) : v \text{ does not have a filter} \}$
 $I_{\min}(u) = \min \{ I(u) : u \text{ has a filter} \}$
 If $I_{\max}(v) > I_{\min}(u)$, move the filter from u to v; that is, if the most important non-filtering node is more important than the least important filtering node, swap the filter location
– otherwise, tries the Parent-child flip
– is allowed to make at most one filter move
 The source dispatches one new filter, or a move instruction
to one existing filter
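The Importance Flip step can be sketched as a single comparison. The code below is a minimal illustration under stated assumptions: the importance values are already computed and passed in as a dictionary, ties are broken arbitrarily, and the Parent-child flip is left out because the slide does not define it. The name `importance_flip` and the example values are hypothetical.

```python
def importance_flip(importance, filters):
    """Return (u, v) if one filter should move from node u to node v, else None.

    importance: dict mapping each internal node to its importance I
    filters:    set of nodes that currently hold a filter
    """
    candidates = [v for v in importance if v not in filters]
    if not candidates or not filters:
        return None
    v = max(candidates, key=importance.get)  # most important node without a filter
    u = min(filters, key=importance.get)     # least important node with a filter
    if importance[v] > importance[u]:
        return (u, v)
    return None


# Hypothetical importance values for three internal nodes.
values = {"S": 12, "A": 1, "B": 5}
print(importance_flip(values, {"A"}))  # ('A', 'S'): move the filter from A to S
print(importance_flip(values, {"S"}))  # None: no better location exists
```

Returning at most one (u, v) pair mirrors the rule that the source makes at most one filter move per batch; the importance values would be recomputed before the next batch.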
Code mobility is not useful with uniform subscriptions and
static users
 opt = optimal placement at each trial
 heu = heuristic re-run at each trial
 init = initial placement, kept unchanged
k = 15, p = 0.3
[Figure: total traffic (messages, 5,220 to 5,300) versus trial instance (time unit, 0 to 60), with one curve per algorithm used: opt, heu, init.]
Mobility model
 User mobility: users gradually move from the left subtree to the right subtree
– Subscription skew, q
– At t = 0, users in the left subtree have p = 0.3 + q, and users in the right subtree have p = 0.3 - q
– At t = i, swap the probabilities of user i in the left subtree and user i in the right subtree
[Figure: multicast tree whose left subtree is labeled p = 0.3 + q and whose right subtree is labeled p = 0.3 - q.]
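The mobility model amounts to swapping one pair of subscription probabilities per time step. The sketch below assumes that users in each subtree are numbered from 1 and that the swaps accumulate, so at time t users 1 to t have already moved. The function name and its parameters are hypothetical.

```python
def subscription_probabilities(t, users_per_side, q, base=0.3):
    """Per-user subscription probabilities for the left and right subtrees at time t."""
    left = [base + q] * users_per_side
    right = [base - q] * users_per_side
    # By time t, users 1..t of each subtree have exchanged probabilities.
    for i in range(min(t, users_per_side)):
        left[i], right[i] = right[i], left[i]
    return left, right


left, right = subscription_probabilities(t=2, users_per_side=4, q=0.2)
print(left)   # first two entries now carry the right-subtree probability
print(right)  # first two entries now carry the left-subtree probability
```

Once t reaches the number of users per subtree, the two sides have fully exchanged their interest levels, which is the situation in which a filter placement chosen at t = 0 becomes stale.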
User mobility motivates filter mobility
[Figure: reduction in traffic with filter mobility (%, 0 to 40) versus number of filters (0 to 40), with one curve per subscription skew q = 0.2, 0.1, 0.]
Further work
 Theoretical improvements:
– More efficient algorithms
 Achieves $O(n \log n)$ time complexity
 Prototype and obtain actual bandwidth costs and delays for
filter movement using Aglets technology
 A distributed filtering algorithm, where the filters are agents
that coordinate with minimal involvement of the source
– How to avoid thrashing and loops
– How to ensure semi-autonomous agent movements do not degrade
performance
 Investigate different application domains