Talk slides

An Efficient, Low-Cost Inconsistency Detection Framework for Data and Service Sharing in an Internet-Scale System
Yijun Lu†, Hong Jiang†, and Dan Feng*
† University of Nebraska-Lincoln, USA
* Huazhong University of Science and Technology, China
Introduction
• Consistency control is important
– Active replication is essential to data security
– Systems need to handle updates
– Thus, consistency needs to be maintained
• Challenges
– Consistency requirements are difficult to predict
– Overhead to maintain consistency is high
– In Grid-like systems, the network is unreliable
Two Flavors:
• Inconsistency avoidance
– Avoids inconsistency in the first place, but incurs a high maintenance cost and supports only a specific application.
– Examples: strong consistency, NFS consistency, etc.
– Optimistic consistency protocol? Pre-defined
• Inconsistency detection
– Our new approach
– There is no need to define consistency protocols
Inconsistency Detection
• Features
– No need to pre-define consistency level
– Detect inconsistency among nodes in a timely manner
– Resolve inconsistencies based on application semantics
• Advantages
– Efficient: Timely inconsistency detection
– Low-cost: No prohibitive cost associated with a given
consistency protocol
– Versatile: Several applications with different consistency requirements can run simultaneously
Overview of IDF
Efficient Detection
Focus of this paper
Outline
• Background
• Design
• Evaluation
• Inconsistency resolution
• Related work
• Current status
Background
• RanSub
– Locate disjoint content within a system
– Two processes: collect/distribute
– Used to exchange information among nodes
• Gossip-based data dissemination
– A node disseminates non-duplicate packets to a random set of neighbors every T seconds.
– Each message travels a certain number of hops
– Used to distribute updates
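The gossip step above can be sketched as one round at one node. This is a minimal illustration, not the protocol used in the paper: the names `gossip_round`, `fanout`, and `hops_left` are hypothetical, and the round is assumed to be triggered externally every T seconds.

```python
import random

def gossip_round(neighbors, inbox, seen, fanout, rng):
    """One gossip round for one node: forward unseen packets to random neighbors.

    inbox: list of (packet_id, hops_left) pairs received since the last round.
    seen:  set of packet ids this node has already handled (mutated in place).
    Returns a list of (neighbor, packet_id, hops_left) messages to send.
    """
    outgoing = []
    for packet_id, hops_left in inbox:
        if packet_id in seen:
            continue  # duplicate: never forward the same packet twice
        seen.add(packet_id)
        if hops_left <= 0:
            continue  # hop budget exhausted: keep locally, do not forward
        # Assumed policy: pick `fanout` distinct neighbors uniformly at random.
        targets = rng.sample(neighbors, min(fanout, len(neighbors)))
        for target in targets:
            outgoing.append((target, packet_id, hops_left - 1))
    return outgoing
```

Calling `gossip_round(["b", "c", "d"], [("u1", 2), ("u1", 2), ("u2", 0)], set(), fanout=2, rng=random.Random(0))` forwards `u1` to two distinct neighbors with one hop left and forwards `u2` to nobody.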
Design of Timely Detection
• Basic idea
– Two layers
– Top layer captures most inconsistencies fast
– Bottom layer catches all inconsistencies that the top layer misses
• Terms
– Temperature: how frequently a user updates a given file within a period of time.
1. Measure the Updating Patterns
• Importance
– Use nodes’ updating patterns as an indicator of their
interest in a certain file, called temperature.
– The higher the temperature, the more likely a node is the “trouble maker” that causes most inconsistencies.
• Strategy
– A node tracks its updating history for a certain file
during a certain period of time.
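One way a node could track its own updating history is a sliding-window counter per file. The sketch below is an assumption about how temperature might be measured, not the paper's implementation: `TemperatureTracker` and `window_seconds` are hypothetical names, and temperature is taken to be the number of updates within the window.

```python
from collections import defaultdict, deque

class TemperatureTracker:
    """Tracks a node's own updates per file over a sliding time window."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self._updates = defaultdict(deque)  # file name -> update timestamps

    def record_update(self, file_name, now):
        """Remember that this node updated file_name at time `now`."""
        self._updates[file_name].append(now)

    def temperature(self, file_name, now):
        """Number of updates to file_name within the last window_seconds."""
        stamps = self._updates[file_name]
        while stamps and stamps[0] <= now - self.window_seconds:
            stamps.popleft()  # forget updates that fell out of the window
        return len(stamps)
```

With a 60-second window, updates recorded at times 0, 10, and 50 give a temperature of 3 at time 55 and 2 at time 65, because the update at time 0 has left the window.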
2. Learning the Updating Patterns
• Use RanSub
– Collect nodes’ updating patterns
– Each node learns a random disjoint set with each
distribution
• Possible improvement
– RanSub uses a single multicasting tree
– A single tree cannot tolerate the failure of even one interior node
– Deploy a multicasting forest?
3. Temperature Collection/Distribution
• Why does this matter?
– Network bandwidth cost could be prohibitive
– Consider the total number of files on a computer
• Interest-group based approach
– Nodes only report the temperature of files that they are
interested in.
– In distribution, an interior node only relays the temperature of files that nodes in its sub-tree are interested in
• Result
– It can be supported over any type of connection, including a dial-up connection.
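The interest-group filtering in the distribution phase can be illustrated with a small tree walk. This is a sketch under assumptions: the tree is given as a dict from a node to its children, interests as a dict from a node to a set of file names, and the function names are hypothetical rather than part of RanSub or the paper.

```python
def subtree_interest(tree, interests, node):
    """Set of files that `node` or any node below it is interested in."""
    files = set(interests.get(node, ()))
    for child in tree.get(node, ()):
        files |= subtree_interest(tree, interests, child)
    return files

def relay_down(tree, interests, node, temperatures):
    """Temperatures an interior node forwards to each child during distribution.

    temperatures: dict from file name to temperature known at `node`.
    Returns a dict from child to the filtered temperatures sent to it.
    """
    relayed = {}
    for child in tree.get(node, ()):
        wanted = subtree_interest(tree, interests, child)
        # Only relay files that some node in the child's sub-tree cares about.
        relayed[child] = {f: t for f, t in temperatures.items() if f in wanted}
    return relayed
```

A real deployment would presumably cache each sub-tree's interest set instead of recomputing it on every distribution; the recursion here only keeps the example short.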
4. Two-layer detection
An example:
• Two layers
– Solid line: top layer
– Dotted line: bottom layer
• Version vector is used to detect inconsistencies
• Mechanism
– Traverse the top layer first
– If no inconsistency is found in the top layer, go to the bottom layer
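The mechanism above can be sketched with the standard version-vector comparison. The code is illustrative only: it assumes each version vector is a dict from node id to update counter, treats concurrent updates (neither vector dominates the other) as the inconsistency to report, and uses the hypothetical names `compare` and `detect`.

```python
def compare(vv_a, vv_b):
    """Compare two version vectors (dicts: node id -> update counter)."""
    nodes = set(vv_a) | set(vv_b)
    a_ahead = any(vv_a.get(n, 0) > vv_b.get(n, 0) for n in nodes)
    b_ahead = any(vv_b.get(n, 0) > vv_a.get(n, 0) for n in nodes)
    if a_ahead and b_ahead:
        return "conflict"  # concurrent updates: neither vector dominates
    if a_ahead:
        return "newer"
    if b_ahead:
        return "older"
    return "equal"

def detect(local_vv, top_layer, bottom_layer):
    """Check the top layer first; use the bottom layer only if the top is clean.

    Each layer is a dict: node id -> that node's version vector for the file.
    Returns (layer_name, node_id) of the first conflicting replica, or None.
    """
    for layer_name, layer in (("top", top_layer), ("bottom", bottom_layer)):
        for node_id, remote_vv in layer.items():
            if compare(local_vv, remote_vv) == "conflict":
                return layer_name, node_id
    return None
```

For a local vector `{"a": 2, "b": 1}`, a top-layer replica holding the same vector is clean, so a bottom-layer replica holding `{"a": 1, "b": 2}` is reported as `("bottom", "n2")` when it is registered under the id `n2`.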
5. Caching & Garbage Collection
• Caching
– Cache temperature information
– Cache routing information among top-layer nodes, so that smart decisions can be made to save traversal time
• Garbage collection
– Keep the temperature fresh
– Assign time stamp to each piece of temperature
information
– A piece of temperature information expires when it is older than a threshold.
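Timestamp-based expiry of cached temperatures could look like the following. This is a minimal sketch with hypothetical names (`TemperatureCache`, `max_age`); the slides do not say how the threshold is chosen or when garbage collection runs.

```python
class TemperatureCache:
    """Caches temperature reports and drops entries older than max_age seconds."""

    def __init__(self, max_age):
        self.max_age = max_age
        self._entries = {}  # (node id, file name) -> (temperature, timestamp)

    def put(self, node_id, file_name, temperature, now):
        """Store a report and stamp it with the time it was received."""
        self._entries[(node_id, file_name)] = (temperature, now)

    def get(self, node_id, file_name, now):
        """Return the cached temperature, or None if missing or expired."""
        entry = self._entries.get((node_id, file_name))
        if entry is None or now - entry[1] > self.max_age:
            return None
        return entry[0]

    def collect_garbage(self, now):
        """Remove every expired entry; returns how many were removed."""
        stale = [key for key, (_, ts) in self._entries.items()
                 if now - ts > self.max_age]
        for key in stale:
            del self._entries[key]
        return len(stale)
```

Checking the age in `get` as well as in `collect_garbage` means a stale temperature is never served, even if garbage collection has not run yet.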
6. Discussion
• So far, we have treated the term “update” generically
– Only one kind of “update”
• In fact, several forms of update exist
– Creating
– Modifying
– Deleting
• It does not matter in the detection part, but does
matter when we design the APIs for applications
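To make the API point concrete, an application-facing interface could tag each update with its kind and let the application register one handler per kind. Everything in this sketch (`UpdateKind`, `Update`, `dispatch`) is hypothetical and is not the API of the framework.

```python
from dataclasses import dataclass
from enum import Enum

class UpdateKind(Enum):
    CREATE = "create"
    MODIFY = "modify"
    DELETE = "delete"

@dataclass(frozen=True)
class Update:
    node_id: str
    file_name: str
    kind: UpdateKind

def dispatch(update, handlers):
    """Call the application's handler registered for this kind of update.

    handlers: dict from UpdateKind to a callable taking the Update.
    """
    return handlers[update.kind](update)
```

Detection can ignore the `kind` field entirely, while an application could, for example, treat a delete that races with a modify differently from two racing modifies.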
Evaluation 1: Failure rate
• Why do we care about it?
– Top layer detects inconsistencies much faster than
bottom layer
– It is desirable that most inconsistencies are captured by
the top layer
• Analysis result
– In the worst-case scenario, two sub-cases exist: Case 1 (failure rate 0.04%) and Case 2 (failure rate 18.9%)
– See paper for clarification
• Main message
– Top layer captures the vast majority of inconsistencies!
Evaluation 2: Maintenance Cost
• Metric
– Number of messages each node receives as a result of the maintenance process
• Simulation setup
– 1000 nodes in the network.
– Simulation runs for 800 seconds.
• Result
– Max bandwidth cost: < 6 KB/s
Inconsistency Resolution
• Overview
– Utilize detection result
– Support multiple applications with different requirements for consistency control
• Semantic-based resolution (ongoing & future work)
– Get semantics: hint-based or middleware detection
– Resolution schemes: middleware automatically resolves the inconsistency, or asks for users’ preference before reacting
Related Work
• TACT
– Explores the trade-off between consistency level and performance
• DENO
– Peer-to-peer scheme, yet still aims to maintain strong consistency
• Lpbcast
– Pure gossip-based protocol
• Quorum system
– Could fail in the presence of node failures
Current Status
• Dealing with inconsistency resolution
– Support applications.
• Implementing a prototype on Planet-Lab
• Investigating the implications of the new
framework to large-scale distributed systems in
general