Collaborative joins in a pervasive computing environment

Filip Perich, Anupam Joshi, Yelena Yesha, Tim Finin
The VLDB Journal (2005)
2008. 11. 17.
Summarized & presented by Babar Tareen,
IDS Lab., Seoul National University
Center for E-Business Technology
Seoul National University
Seoul, Korea
Introduction
 To obtain data

Devices should not solely depend on centralized servers

Devices should not be required to pre-cache all required data
 A device should utilize its vicinity by collaborating with peers
Copyright © 2008 by CEBT
Introduction (2)
 Data

Static – User Profile

Dynamic – Context Sensitive Data
– Data that is affected by a change in context
– Not the actual context data itself
– For example: a list of restaurants near the user
 In this paper, context also includes

Beliefs, desires, and intentions

Stored in the user profile
 Based on MoGATU
Contribution
 Collaborative Query Protocol (CQP)

Based on Contract Nets

Enables a mobile device to query its vicinity for peers that can
answer a given query

Allows two or more devices to cooperate
 A realistic experimental model for simulating a city traffic
scenario
 Demonstrate the capability of CQP by implementing it in
MoGATU and by evaluating its performance
MoGATU Overview
 Information Providers

Represent data sources available in the environment
 Information Consumers

Entities that query and update data available in the environment
 Information Managers (InforMa)

Responsible for network communication and for most of the data
management functions
Data Representation
 Data Model

A set of ontologies
 Ontologies are defined in DAML+OIL
 Ontologies are used because they support reasoning
 The time needed to reason over the ontology knowledge is not taken into account
Query Representation

Explicit Query

User-generated query

Implicit Query

Device-generated query, inferred from the user profile

Example: the user takes lunch between 12:00 pm and 2:00 pm and prefers Chinese food

Queries are specified in DAML-S

For this paper, queries are abstracted to a select-from-where form

query = (O, σ, θ, Σ, τ)

O : Set of ontologies used

σ : Selection list

θ : Filtering statement

Σ : Cardinality

τ : Temporal constraints
SELECT (select_list)
FROM (ontology_list)
WHERE (conjunct_disjunct_predicate_list)
LIMIT [minCardinality, maxCardinality]
TIME neededBy
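The five-part query above can be sketched as a small record type. This is only an illustration of the (O, σ, θ, Σ, τ) abstraction: the class name, field names, and the sample values are assumptions made here, not identifiers from the paper or from MoGATU.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Query:
    """Hypothetical container for query = (O, σ, θ, Σ, τ)."""
    ontologies: Tuple[str, ...]       # O: ontologies used
    select_list: Tuple[str, ...]      # σ: selection list
    where: str                        # θ: filtering statement
    cardinality: Tuple[int, int]      # Σ: (minCardinality, maxCardinality)
    needed_by: Optional[float]        # τ: temporal constraint, if any

    def render(self) -> str:
        """Render the query in the select-from-where form shown on the slide."""
        low, high = self.cardinality
        lines = [
            "SELECT " + ", ".join(self.select_list),
            "FROM " + ", ".join(self.ontologies),
            "WHERE " + self.where,
            f"LIMIT [{low}, {high}]",
        ]
        if self.needed_by is not None:
            lines.append(f"TIME {self.needed_by}")
        return "\n".join(lines)


# Sample values are invented for illustration only.
q = Query(("restaurant",), ("name", "address"), "cuisine = 'Chinese'", (1, 5), 60.0)
print(q.render())
```

Keeping the cardinality and the temporal constraint as explicit fields makes it easy to copy them into a call-for-query message later.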
CQP
CQP (2)
 Call for query

Initially, the device attempts to satisfy the query using its local cache

If that is not possible, it creates a call-for-query message

The message contains
– The query or part of the query
– Cardinality requirements
– Deadline for delivering the complete answer
– Time when the winner will be announced

The device sends the message to its peers up to n hops away

and starts its bid-submission timer

If the device does not get any bid-submission response, it starts
to decompose the query
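A minimal sketch of the call-for-query step, assuming a cache keyed by query text. The message fields mirror the list above, while the function name, the default timer windows, and the sample cache contents are hypothetical choices made for this example.

```python
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass(frozen=True)
class CallForQuery:
    query: str               # the query, or one part of it
    min_cardinality: int     # cardinality requirements
    max_cardinality: int
    answer_deadline: float   # deadline for delivering the complete answer
    award_time: float        # time when the winner will be announced
    max_hops: int            # how many hops the call may travel


def answer_or_call(query: str, cache: Dict[str, List[str]], now: float,
                   min_card: int = 1, max_card: int = 10,
                   bid_window: float = 2.0, answer_window: float = 10.0,
                   max_hops: int = 2) -> Union[List[str], CallForQuery]:
    """Answer from the local cache if possible, else build a call-for-query."""
    rows = cache.get(query, [])
    if len(rows) >= min_card:
        return rows[:max_card]
    # Assumed timing: bids close after bid_window, answers are due after answer_window.
    return CallForQuery(query, min_card, max_card,
                        now + answer_window, now + bid_window, max_hops)


# Invented sample data, for illustration only.
cache = {"chinese restaurants": ["Golden Wok"]}
hit = answer_or_call("chinese restaurants", cache, now=100.0)
miss = answer_or_call("movie theaters", cache, now=100.0)
```

Here `hit` is served locally, while `miss` is a message the device would broadcast before starting its bid-submission timer.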
CQP (3)


Bid Submission (Upon receipt of call-for-query)

A device decides by inference whether it should take part in the proposed
collaboration

If the device does not wish to participate or cannot provide data, it simply ignores
the call-for-query

If the device wishes to collaborate, it calculates the size of the answer it can
provide

It returns a bid message that includes the estimated size of its answer

It starts a timer while awaiting a bid-award
Bid Award

The contractor waits for a predefined time period for any responses

When the bid-submission timer expires, the bidder that claims to deliver the most
data in the shortest time is selected as the winner

The contractor sends a bid-award message

and starts its ack timer

If a bidder does not receive a bid-award message before its timer expires, the
bidder resends its bid message n-1 more times
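The winner selection can be sketched as a simple ranking over the received bids. The slides do not say how "most data" and "shortest time" are weighed against each other, so this example assumes that answer size is compared first and delivery delay only breaks ties; the `Bid` fields and sample bids are likewise hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Bid:
    bidder: str
    estimated_size: int      # how many answer tuples the bidder claims to have
    estimated_delay: float   # claimed delivery time in seconds (assumed field)


def select_winner(bids: List[Bid]) -> Optional[Bid]:
    """Pick the bid with the largest answer; break ties by the shortest delay."""
    if not bids:
        return None  # no response: the contractor would decompose the query instead
    return min(bids, key=lambda b: (-b.estimated_size, b.estimated_delay))


# Invented bids, for illustration only.
bids = [Bid("B", 5, 3.0), Bid("C", 8, 4.0), Bid("D", 8, 2.5)]
winner = select_winner(bids)
```

A weighted score would be an equally plausible reading of the slide; the lexicographic ordering is used here only because it is the simplest rule to state.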
CQP (4)
 Acknowledgement

When the bidder receives the bid-award message, it sends back an
ack message

It starts an ack timer and waits for an ack from the contractor

When the contractor receives the ack from the bidder, it sends its own ack message
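The bidder's side of the award and acknowledgement exchange can be viewed as a small state machine. The sketch below is an interpretation rather than the paper's implementation: the state names, the message strings, and the choice to abandon the collaboration when the ack timer expires are all assumptions.

```python
from enum import Enum, auto
from typing import List


class BidderState(Enum):
    AWAITING_AWARD = auto()
    AWAITING_ACK = auto()
    COMMITTED = auto()
    ABANDONED = auto()


class Bidder:
    def __init__(self, max_bid_sends: int) -> None:
        self.state = BidderState.AWAITING_AWARD
        self.bid_sends = 1                  # the initial bid has already been sent
        self.max_bid_sends = max_bid_sends  # n: one initial bid plus n-1 resends
        self.outbox: List[str] = []         # messages this bidder would transmit

    def on_timer_expired(self) -> None:
        if (self.state is BidderState.AWAITING_AWARD
                and self.bid_sends < self.max_bid_sends):
            self.bid_sends += 1
            self.outbox.append("bid")       # resend the bid
        elif self.state in (BidderState.AWAITING_AWARD, BidderState.AWAITING_ACK):
            self.state = BidderState.ABANDONED  # assumed behaviour on final timeout

    def on_message(self, kind: str) -> None:
        if kind == "bid-award" and self.state is BidderState.AWAITING_AWARD:
            self.outbox.append("ack")       # acknowledge the award
            self.state = BidderState.AWAITING_ACK
        elif kind == "ack" and self.state is BidderState.AWAITING_ACK:
            self.state = BidderState.COMMITTED


bidder = Bidder(max_bid_sends=3)
bidder.on_timer_expired()        # no award yet: the bid is resent
bidder.on_message("bid-award")   # award arrives: the bidder acks
bidder.on_message("ack")         # contractor's ack completes the handshake
```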
Join Query over two streams

In case 1, querying device A asks its vicinity for one input stream only since it already
holds the second stream.

In case 2, A asks its vicinity for the final join result only.

In case 3, A asks for each stream separately in order to perform the join locally.

In case 4, A asks B to process the query, but B needs to first obtain the second stream
from some other device C.

In case 5, A “delegates” the task to C, which asks its vicinity for the input streams instead.
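To make case 3 concrete, where A collects both streams and joins them locally, here is a minimal hash-join sketch over two lists of records. The hash join is a standard technique swapped in for illustration; the slides do not state which join algorithm the paper uses, and the record fields below are invented.

```python
from collections import defaultdict
from typing import Any, Dict, List

Row = Dict[str, Any]


def hash_join(left: List[Row], right: List[Row], key: str) -> List[Row]:
    """Join two streams on a shared attribute by indexing the left stream."""
    index: Dict[Any, List[Row]] = defaultdict(list)
    for row in left:
        index[row[key]].append(row)
    joined: List[Row] = []
    for row in right:
        for match in index.get(row[key], []):
            joined.append({**match, **row})  # merge the two matching records
    return joined


# Invented streams, as A might receive them from two different peers.
restaurants = [
    {"street": "Broadway", "restaurant": "Golden Wok"},
    {"street": "5th Ave", "restaurant": "Noodle Bar"},
]
traffic = [{"street": "Broadway", "traffic": "heavy"}]
result = hash_join(restaurants, traffic, "street")
```

In cases 2, 4, and 5 the same join would run on a peer instead, and A would receive only `result`.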
Experimental Setup
 Environment

Realistic model that mapped streets and intersections south of
72nd Street in Manhattan

Directed graph with 793 intersections (vertices)

5000 x 9000 m

Each intersection was assigned an (x,y) coordinate

Each intersection was given a list of its neighboring intersections
 Beacon entity

A stationary beacon is assigned to each intersection

Each beacon has knowledge about its vicinity (restaurants,
theaters, etc.)
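The street model described above can be sketched as a dictionary of intersections, each with coordinates, a list of neighbouring intersections, and the data its beacon holds. The class name, field names, and the three-node sample graph are assumptions for illustration; the actual simulation used 793 intersections.

```python
import math
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Intersection:
    x: float                                               # metres
    y: float                                               # metres
    neighbors: List[str] = field(default_factory=list)     # outgoing directed edges
    beacon_data: List[str] = field(default_factory=list)   # what the beacon knows


def street_length(graph: Dict[str, Intersection], a: str, b: str) -> float:
    """Euclidean length of the directed street a -> b."""
    if b not in graph[a].neighbors:
        raise ValueError(f"no street from {a} to {b}")
    return math.hypot(graph[b].x - graph[a].x, graph[b].y - graph[a].y)


# Invented three-intersection graph; C is a dead end with no outgoing streets.
graph = {
    "A": Intersection(0.0, 0.0, ["B"], ["restaurant: hypothetical diner"]),
    "B": Intersection(300.0, 400.0, ["A", "C"]),
    "C": Intersection(300.0, 500.0),
}
length_ab = street_length(graph, "A", "B")
```

Because the graph is directed, a one-way street appears in only one of the two neighbour lists.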
Experimental Setup (2)
 Car entity

Use 100 cars

Transmission distance 125 m

Maximum throughput 2 Mbps
 Mobility model

Cars driven randomly by tourists (50 cars)

Cars driven by taxi drivers on the shortest possible route (50 cars)
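The two mobility models and the radio range can be sketched as follows. The slides do not say how the shortest route was computed, so Dijkstra's algorithm is assumed for the taxi, and a uniform random choice among outgoing streets is assumed for the tourist; the function names and the four-node sample map are hypothetical.

```python
import heapq
import math
import random
from typing import Dict, List, Tuple

Coords = Dict[str, Tuple[float, float]]
Edges = Dict[str, List[str]]


def dist(coords: Coords, a: str, b: str) -> float:
    return math.hypot(coords[a][0] - coords[b][0], coords[a][1] - coords[b][1])


def taxi_route(coords: Coords, edges: Edges, start: str, goal: str) -> List[str]:
    """Shortest route by Dijkstra's algorithm; empty list if goal is unreachable."""
    best = {start: 0.0}
    prev: Dict[str, str] = {}
    heap = [(0.0, start)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == goal:
            break
        if d > best[node]:
            continue  # stale heap entry
        for nxt in edges.get(node, []):
            nd = d + dist(coords, node, nxt)
            if nd < best.get(nxt, math.inf):
                best[nxt], prev[nxt] = nd, node
                heapq.heappush(heap, (nd, nxt))
    if goal not in best:
        return []
    route = [goal]
    while route[-1] != start:
        route.append(prev[route[-1]])
    return route[::-1]


def tourist_route(edges: Edges, start: str, steps: int, rng: random.Random) -> List[str]:
    """Random walk along outgoing streets; stops early at a dead end."""
    route = [start]
    for _ in range(steps):
        options = edges.get(route[-1], [])
        if not options:
            break
        route.append(rng.choice(options))
    return route


def in_radio_range(p: Tuple[float, float], q: Tuple[float, float],
                   radius: float = 125.0) -> bool:
    """True if two devices are within the 125 m transmission distance."""
    return math.hypot(p[0] - q[0], p[1] - q[1]) <= radius


# Invented map: two ways from A to C, the one through B being shorter.
coords = {"A": (0.0, 0.0), "B": (100.0, 0.0), "C": (100.0, 100.0), "D": (0.0, 150.0)}
edges = {"A": ["B", "D"], "B": ["C"], "D": ["C"], "C": []}
taxi = taxi_route(coords, edges, "A", "C")
tourist = tourist_route(edges, "A", 5, random.Random(0))
```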
Profile accuracy vs Query success rate
 Fig 7a,b.

a: Willingness to help = 0%

b: Willingness to help = 75%
Profile accuracy vs Computing Cost
Implicit Queries
Willingness to help vs query success rate
Profile Accuracy 80%
Willingness to help vs. computing and network cost
Profile Accuracy 80%
Willingness to help vs. query success rate / computing cost
Profile Accuracy 80%
Review
 Pros

CQP can be used to query data from multiple sources

CQP can be used in any environment, not just in a mobile peer-to-peer
scenario
 Cons

CQP is of little use if devices already have access to a
fixed network

More on the discussion slide
Discussion

Metrics used for evaluation are not appropriate

No comparison with any existing system with a similar architecture

No comparison with a centralized server architecture

Technical problems in device-to-device communication are not addressed

In the example scenario, beacons were installed at every intersection

Cost of installing such beacons is not specified

Enhancing a centralized system vs. installing beacons is not compared

Only 100 cars were used in an area of 5000 x 9000 m

How will the protocol perform if the number of devices increases?

Cost of ontology reasoning is not considered

I think there is a lot of packet overhead per query, so this protocol
might not be usable in practice

A combination of server-based and peer-to-peer querying might give better results