Peer-to-Peer Information Systems
Week 9: Accountability
Old Dominion University
Department of Computer Science
CS 495/595 Fall 2003
Michael L. Nelson <[email protected]>
10/23/03
Resource Allocation
• problems in (computer) resource allocation
are generally solved by accountability
• traditionally, accountability is maintained
through centralized control of resources
– cf. disk quotas on .cs.odu.edu machines
• what happens if disk quotas are removed?
Tragedy of the Commons
• first described in Hardin, Science, 162, pp. 1243-1248, 1968.
– “commons” - a shared grazing area in a village
– “tragedy” - misfortune brought about by inevitable
events
• eventually, the village grows to where the
commons cannot support all available livestock
– an individual herder has two choices:
• adding an additional animal of his to the commons
increases his utility by 1
• the additional cost to him of supporting the extra animal is 1 / (# of
herders), which is < 1
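The herder's arithmetic above can be sketched in a few lines. This is only an illustration under a stated assumption: each extra animal costs the commons exactly 1 in total, shared evenly by all herders. The function names are hypothetical.

```python
def herder_net_gain(num_herders: int) -> float:
    """Net gain to ONE herder who adds one animal (slide's model)."""
    private_benefit = 1.0              # the herder keeps all of the benefit
    private_cost = 1.0 / num_herders   # but pays only a share of the cost
    return private_benefit - private_cost


def village_net_gain(num_herders: int) -> float:
    """Net change for the village if EVERY herder adds one animal.

    Assumption: each animal costs the commons 1 in total.
    """
    total_benefit = num_herders * 1.0
    total_cost = num_herders * 1.0
    return total_benefit - total_cost


if __name__ == "__main__":
    for n in (2, 10, 100):
        print(n, herder_net_gain(n), village_net_gain(n))
```

Each herder sees a positive gain for any village with more than one herder, while the village as a whole gains nothing, which is the "tragedy".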
Collective Actions & Public Goods
• “Unless the number of individuals in a
group is quite small, or unless there is
coercion or some other special device to
make individuals act in their common
interest, rational, self-interested individuals will not act
to achieve their common or group interests”
Olson (1982), “The Logic of Collective Action”, In Barry & Hardin (eds.), Rational Man
and Irrational Society, p. 44
P2P Resource Allocation Problems
• Storage and bandwidth are finite
– Denial of service “attacks”
• e.g. “slashdot” effect
– Storage flooding “attacks”
• sending a 500MB email attachment to
[email protected]
• “spam” email
P2P Problems (p. 273)
• providing corrupted or low-quality
information
• reneging on promises to store data
• unavailability
• making false claims about other peers
Two Solution Approaches
• Restricting access
– using micropayments
• Reputation systems
– low rated users have lower access or less
favorable transactions
Real World Accountability
• Real world accountability measures
– reputation
– legal recourse
• With P2P you have no central control or authority
for many resources (bandwidth, storage,
computation)
– hard to permanently and uniquely identify peers
– no way to assess peer history / reputation
– no way to enforce “contracts”
P2P Accountability
• “As the systems become more dynamic and
diverge from real-world notions of identity, it
becomes more difficult to achieve accountability
and protect against attacks on resources”
– p. 276
• A P2P system must support:
– Privacy
• anonymity
• pseudonymity
– Dynamic participation
A Scale of Difficulty
• (mostly) static lists of peers; identities of peers are
known
– e.g. mixmaster remailers
• dynamic peers, identities are known
– e.g. Gnutella
• dynamic peers, pseudonymous
– e.g. Free Haven
• dynamic, anonymous
– e.g. ???
Minimizing Risk in P2P
Transactions
• Limit risk (bandwidth, storage, etc.) to the
benefit from the transactions
– fee-for-service / micropayments
• Make risk proportional to level of trust in the other
peer
– reputation system
• Ignore the problem; assume some bad servers and
work around them
– exploit redundancy, distributed resources, etc.
Accountability Approaches in
Existing Systems
• Freenet
– unpopular data is overwritten when space is needed
• Gnutella
– files are stored only locally
• Publius
– submissions are limited to 100Kbytes
• Free Haven
– you must provide storage to get storage
Pseudonymity
• Pseudospoofing
– simultaneously creating & controlling many
fake identities
– uses:
• corrupting a reputation system
• gaining more resources (email, storage, etc.)
– cf. Tragedy of the Commons
Eliminate Pseudospoofing?
• Abandon pseudonyms
– require people to prove who they are (e.g., PKI)
– problems:
• identity does not imply accountability
• not necessary
• Allow only a 1-1 mapping of identities and
pseudonyms
– problems:
• not really easier than the above
Eliminate Pseudospoofing?
• Monitor for pseudospoofing evidence
– look for multiple registrations, etc
– problems:
• privacy concerns?
• Remove pseudospoofing motive
– make it “expensive” or “unprofitable” to
generate / operate multiple identities
Reputations for Sale?
• How to handle when 1 pseudonym serially
maps to different people over time?
– cf. Ultima Online avatars for sale on ebay
– e.g. http://cgi.ebay.com/ws/eBayISAPI.dll?ViewItem&item=3053682454&category=33887
– solutions?
• embed some valuable, real-world information (e.g.
credit card #) as part of the password
– security problems…
• limit the life of passwords; require re-certification
– does not completely solve the problem…
Pseudonym Sold for $127.50
Handling Flooding & DoS Attacks
• first order approaches:
– Caching
• e.g. caching proxies
– Mirroring
• e.g. mirror selection in sourceforge.net
• active caching
– e.g. Freenet
• active mirroring
– e.g. Akamai
Consistent Hashing
for Active Mirroring
hash functions of the type:
$h(f) = (7f + 4) \bmod 23$
invalidate all previous
values when a new
node is added…
…Akamai uses consistent
hashes to handle dynamic
participation
Figure 1 from Karger, et al., WWW8 Conference Proceedings, http://www8.org/w8-papers/2a-webserver/caching/paper2.html
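A minimal sketch of why the slide's point matters: with "hash mod number-of-nodes" placement, adding a node remaps most keys, while a consistent-hash ring moves keys only onto the new node. This is a generic textbook-style ring, not Akamai's (proprietary) implementation; the names `Ring`, `_point`, `mod_assign`, the node labels, and the 50 virtual points per node are all assumptions made for illustration.

```python
import bisect
import hashlib


def _point(label: str) -> int:
    """Map a string to a position on the ring (a 64-bit integer)."""
    return int.from_bytes(hashlib.sha1(label.encode()).digest()[:8], "big")


def mod_assign(key: str, num_nodes: int) -> int:
    """Naive placement: hash modulo the current number of nodes."""
    return _point(key) % num_nodes


class Ring:
    """Consistent-hash ring with several virtual points per node."""

    def __init__(self, nodes, replicas: int = 50):
        self.replicas = replicas
        self._points = []   # sorted ring positions
        self._owner = {}    # ring position -> node name
        for node in nodes:
            self.add(node)

    def add(self, node: str) -> None:
        for i in range(self.replicas):
            p = _point(f"{node}#{i}")
            if p in self._owner:      # extremely unlikely collision: skip
                continue
            bisect.insort(self._points, p)
            self._owner[p] = node

    def lookup(self, key: str) -> str:
        """A key belongs to the first node point clockwise from it."""
        i = bisect.bisect_right(self._points, _point(key)) % len(self._points)
        return self._owner[self._points[i]]


def fraction_moved(before: dict, after: dict) -> float:
    return sum(1 for k in before if before[k] != after[k]) / len(before)


keys = [f"file{i}" for i in range(2000)]

# Naive scheme: grow from 4 to 5 nodes.
mod_moved = fraction_moved({k: mod_assign(k, 4) for k in keys},
                           {k: mod_assign(k, 5) for k in keys})

# Consistent hashing: add node "n4" to a 4-node ring.
ring = Ring(["n0", "n1", "n2", "n3"])
before = {k: ring.lookup(k) for k in keys}
ring.add("n4")
after = {k: ring.lookup(k) for k in keys}
ring_moved = fraction_moved(before, after)

if __name__ == "__main__":
    print(f"mod-N moved {mod_moved:.0%} of keys, ring moved {ring_moved:.0%}")
```

In the ring, every key that changes owner moves to the newly added node; no key is shuffled between the existing nodes.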
Akamai Summary
• most details are proprietary, but see:
– http://theory.lcs.mit.edu/~karger/Papers/web.ps.gz
– http://theory.lcs.mit.edu/~karger/Papers/Talks/Hash/
– http://www.siam.org/siamnews/12-99/akamai.pdf
– http://www8.org/w8-papers/2a-webserver/caching/paper2.html
[Figure: diagram with labels Client, DNS Resolution, Virtual Cache, Consistent Hashing, Actual Cache; note: # VC > # AC]
www.akamai.com resolution
at 3 different sites:
• cs.odu.edu
• ils.unc.edu
• larc.nasa.gov
Micropayments
• Macropayments examples:
– $29.95 for a month of service
– $0.99 for a song from iTunes
• Micropayments (“digital cash”)
– Nonfungible
• purpose: slow the person down, show proof-of-work
• cannot be re-used; they have no intrinsic value
– Fungible
• has some intrinsic, reusable value; surrogate for $, storage,
service, etc.
– Anonymous vs. identified
Email “Postage Stamps”
• Based on Dwork & Naor, 1992
– http://gunther.smeal.psu.edu/dwork93pricing.html
• Refuse to accept email unless a proof-of-work is
attached
– POW is a hash of the recipient’s email address +
timestamp, thus separate POWs have to be generated
for each address / each transmission
– POWs can be easily checked for validity
• keep a local database to ensure POWs are not reused
– keep a “frequent correspondent list” to manage
exceptions
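The mechanics on this slide can be sketched as follows. Assumptions, all hypothetical: the proof-of-work is a hashcash-style search for a SHA-256 digest with leading zero bits (not necessarily one of the pricing functions from the Dwork & Naor paper), and the names `mint_stamp` and `Mailbox`, the stamp format, and the 12-bit price are invented for illustration.

```python
import hashlib
import time


def _leading_zero_bits(digest: bytes) -> int:
    value = int.from_bytes(digest, "big")
    return len(digest) * 8 - value.bit_length()


def mint_stamp(recipient: str, timestamp: int, bits: int = 12) -> str:
    """Sender's work: try counters until the hash has `bits` leading zeros."""
    counter = 0
    while True:
        stamp = f"{recipient}:{timestamp}:{counter}"
        if _leading_zero_bits(hashlib.sha256(stamp.encode()).digest()) >= bits:
            return stamp
        counter += 1


class Mailbox:
    def __init__(self, address, bits=12, max_age=86400, friends=()):
        self.address = address
        self.bits = bits
        self.max_age = max_age          # seconds a stamp stays valid
        self.friends = set(friends)     # "frequent correspondent list"
        self.seen = set()               # local database of spent stamps

    def accept(self, sender, stamp, now=None):
        """Return True if the mail should be accepted."""
        if sender in self.friends:      # exception: no stamp needed
            return True
        if stamp is None or stamp in self.seen:
            return False
        now = int(time.time()) if now is None else now
        try:
            recipient, ts, _counter = stamp.rsplit(":", 2)
            ts = int(ts)
        except ValueError:
            return False
        if recipient != self.address or abs(now - ts) > self.max_age:
            return False
        # Checking costs a single hash, however long minting took.
        digest = hashlib.sha256(stamp.encode()).digest()
        if _leading_zero_bits(digest) < self.bits:
            return False
        self.seen.add(stamp)
        return True
```

Because the recipient address and timestamp are part of the hashed string, a stamp minted for one address or one time window is useless for another.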
“Hash Cash”
• Premise: Bob calculates a hash (say, 160 bits)
based on a secret that only Bob knows
– Alice can’t reverse the hash; and brute force creation of
input is expensive
– Bob sets the “payment” as finding an input whose hash
matches some subset of the bits of Bob’s hash
• the amount of “payment” depends on how many bits must be
matched
– http://www.hashcash.org/
– http://citeseer.nj.nec.com/back02hashcash.html
• Juels & Brainard, “Client Puzzles” are similar;
they adjust puzzle difficulty if they sense an attack
– http://www.rsasecurity.com/rsalabs/staff/bios/ajuels/publications/client-puzzles/
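A rough sketch of the premise on this slide, following the slide's description rather than the stamp format documented at hashcash.org: Bob hashes a secret, and Alice "pays" by finding any input whose 160-bit SHA-1 digest agrees with Bob's in the first k bits. The function names and the choice of matching the leading bits are assumptions made for illustration.

```python
import hashlib
import itertools
import os


def top_bits(digest: bytes, k: int) -> int:
    """First k bits (1 <= k <= 160) of a 160-bit SHA-1 digest."""
    return int.from_bytes(digest, "big") >> (160 - k)


def make_target() -> tuple:
    """Bob: hash a secret that only he knows; publish only the digest."""
    secret = os.urandom(16)
    return secret, hashlib.sha1(secret).digest()


def pay(target: bytes, k: int) -> bytes:
    """Alice: brute-force an input matching k bits (about 2**k tries)."""
    want = top_bits(target, k)
    for n in itertools.count():
        guess = n.to_bytes(8, "big")
        if top_bits(hashlib.sha1(guess).digest(), k) == want:
            return guess


def verify(target: bytes, k: int, guess: bytes) -> bool:
    """Bob: checking a payment costs one hash."""
    return top_bits(hashlib.sha1(guess).digest(), k) == top_bits(target, k)


if __name__ == "__main__":
    secret, target = make_target()
    for k in (8, 12, 16):               # each extra bit doubles the price
        print(k, verify(target, k, pay(target, k)))
```

A server in the spirit of Juels & Brainard could raise k when it suspects an attack and lower it again afterwards; that policy is not shown here.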
Nonparallelizable POWs
• If an attacker had access to M machines
(legitimate access, or cracked access), they could
solve the POWs / puzzles in 1/M time
• Time-lock puzzles
– http://citeseer.nj.nec.com/rivest96timelock.html
– requires the solving of puzzles (believed to be)
intrinsically non-parallelizable:
$2^{2^t} \bmod n$
– where n is the product of 2 large primes, p & q; t (the
number of sequential squarings) can be chosen to tune the puzzle
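A small sketch of the time-lock idea: the solver must perform t squarings one after another, while the puzzle creator, who knows the factors of n, can shortcut by first reducing the exponent modulo φ(n). The primes below are toy values picked for illustration (a real puzzle would use an RSA-sized modulus), and the function names are hypothetical.

```python
def solve_by_squaring(n: int, t: int) -> int:
    """Solver: compute 2^(2^t) mod n with t sequential squarings.

    Each squaring needs the previous result, so extra machines do not help.
    """
    x = 2
    for _ in range(t):
        x = (x * x) % n
    return x


def solve_with_factors(p: int, q: int, t: int) -> int:
    """Creator: knows p and q, so can reduce the exponent mod phi(n).

    Valid because gcd(2, n) = 1 for odd primes p and q (Euler's theorem).
    """
    n = p * q
    phi = (p - 1) * (q - 1)
    e = pow(2, t, phi)        # 2^t mod phi(n), cheap
    return pow(2, e, n)       # 2^e mod n, cheap


# Toy parameters (assumption): the 10,000th and 100,000th primes.
P, Q = 104729, 1299709
T = 100_000                   # tuning knob: number of squarings

if __name__ == "__main__":
    print(solve_by_squaring(P * Q, T) == solve_with_factors(P, Q, T))
```

Doubling t doubles the solver's work but barely changes the creator's cost, which is what lets t tune the puzzle.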
POWs vs. Reputations
• POWs are relevant to the current or
impending transaction
• Reputations are the sum of past transactions
– reputations also come from 3rd parties; one is
basing their actions on the feedback of others
Trust
• Pretty Good Privacy (PGP) model
– key signing; verifying that digital keys map to
verifiable humans
– small world effect
• http://bcn.boulder.co.us/~neal/pgpstat/ shows an average of 6.x
hops from one key to another; but some have 21 hops
– problem:
• key revocation
• Public Key Infrastructure (PKI) model
– hierarchical system, trusted root delegates to other
trusted members, etc.
– think “DNS for keys”
– problem: when the root key is compromised
• http://www.cert.org/advisories/CA-2000-19.html
Slashdot.org
• problem: how to have a large scale news service
and maintain a high signal-to-noise ratio?
• summary
– allow all posts; delete nothing
– score the posts on a scale of -1 .. 5
• users set their threshold for the comments they wish to view
• so where do the scores come from?
– details at: http://slashdot.org/faq/commod.shtml#cm600
Slashdot Moderation Evolution
• evolution:
– no moderation
– 25 moderators picked by fiat
– 400 more picked from “good” posters
– now, everyone has the occasional opportunity to moderate
• users selected based on account history, reading activity (high, but not
too high), and good “karma”
• moderators can rank the posters/posts they are
reading from -1 .. 5
– ability to moderate lasts for a “few points”
– “karma” is gained by making more good posts than bad
posts (as viewed by slashdot moderators)
• good karma increases your chance to moderate again
– Slashdot editors have unlimited moderation points
Slashdot Metamoderation
• Who will moderate the moderators?
– the readers. well, some of them anyway.
– your account must be one of the oldest 92.5% of accounts
• you can generally metamoderate after a few months
– you can rate the moderation several times a day
• exact # of times is frequently tuned
– details at: http://slashdot.org/faq/metamod.shtml
Advogato Trust Metric
• How to build a reputation system that resists
pseudospoofing and collaborating agents?
• Advogato trust metric
– http://www.advogato.org/trust-metric.html
– somewhat similar to how Google calculates the “rank”
of web pages…
– intuition:
• trust is modeled as a directed graph
• collaborating bad guys trust each other, but relatively little trust
will “flow” from the “good” part of the graph to “bad” part of
the graph
– uses 4 “seed” accounts for reference
Advogato
• Each Advogato account
has a certification level
l.
• An edge exists between
accounts s and t when s
has certified t at level l.
• capacity
– nodes have higher
“capacities” the closer
they are to a root node
figure from: http://www.advogato.org/trust-metric.html
Graph Conversion
to convert the graph
into a single source,
single sink with capacities
on the edges, convert this:

to this:
figure from: http://www.advogato.org/trust-metric.html
Flow Calculation
• We can calculate the maximum flow from a
seed to a node n
– this is the trust metric for n
– calculate using the Ford-Fulkerson method,
section 27.2 in your Cormen, Leiserson &
Rivest book
• Intuition: collaborating agents can heavily
inter-link, but create no new flow from the
seed
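The intuition can be checked with a small sketch. Assumptions: node capacities are supplied by hand (Advogato derives them from distance to the seed), each node x is split into an "in" and an "out" half with capacity c_x - 1 between them plus a unit edge to a single sink, and a node counts as accepted when its unit edge carries flow. The BFS variant of Ford-Fulkerson (Edmonds-Karp) is used; all names are hypothetical.

```python
from collections import defaultdict, deque

INF = float("inf")


def max_flow(cap, source, sink):
    """Ford-Fulkerson with BFS augmenting paths; `cap` holds residuals."""
    total = 0
    while True:
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return total                      # no augmenting path left
        bottleneck, v = INF, sink
        while parent[v] is not None:          # find the path's bottleneck
            bottleneck = min(bottleneck, cap[parent[v]][v])
            v = parent[v]
        v = sink
        while parent[v] is not None:          # push flow along the path
            u = parent[v]
            cap[u][v] -= bottleneck
            cap[v][u] += bottleneck
            v = u
        total += bottleneck


def accepted_nodes(certs, capacity, seed):
    """Return (flow, accepted) for certification edges `certs`."""
    cap = defaultdict(lambda: defaultdict(int))
    for x, c in capacity.items():
        cap[(x, "in")][(x, "out")] = c - 1    # trust x may pass on
        cap[(x, "in")]["sink"] = 1            # one unit accepts x itself
    for s, t in certs:
        cap[(s, "out")][(t, "in")] = INF
    flow = max_flow(cap, (seed, "in"), "sink")
    accepted = {x for x in capacity if cap[(x, "in")]["sink"] == 0}
    return flow, accepted


capacity = {"seed": 4, "alice": 3, "bob": 2, "m1": 1, "m2": 1, "m3": 1}
good = [("seed", "alice"), ("alice", "bob")]
colluders = [("m1", "m2"), ("m2", "m3"), ("m3", "m1")]  # heavy inter-linking

if __name__ == "__main__":
    print(accepted_nodes(good + colluders, capacity, "seed"))
    # One "confused" good node certifies a colluder:
    print(accepted_nodes(good + colluders + [("bob", "m1")], capacity, "seed"))
```

With no certification from the good part of the graph, the colluders' links to each other carry no flow; once bob certifies m1, only as much flow as bob can pass on reaches them.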
Damage from Pseudospoofing?
from inspection, if there are no “confused” nodes, there is no damage
figure from: http://www.advogato.org/trust-metric.html
Max-Flow, Min-Cut
• max-flow, min-cut
theorem (p. 593, CLR)
gives the cut indicated in
the figure
• the maximum weight the
pseudospoofers can have
is:
$\sum_x (c_x - 1)$
summed over the compromised
nodes $x$
$c_x$ represents the capacity of node $x$
figure from: http://www.advogato.org/trust-metric.html
Reputation Observations
• The hierarchical PKI model has supplanted
the P2P PGP model (!)
• Slashdot & Advogato appear to work
• eBay’s reputation system is vulnerable to
“be good, then turn evil and cash out”
attacks
– perhaps other models have similar weaknesses,
but eBay deals in fungible goods?
Reputation Issues
• Provable transactions
– eBay example
– New York Times bestseller list example (p. 317)
• Honest ratings
– if the raters have an interest in promoting the value of
their investment, their ratings are suspect
• Bootstrapping
– incentive to participate?
– centralized systems: must buy into the hierarchy of trust
– decentralized systems: all trust is experiential
Reputation Metrics
• How well does a linear metric represent
reputation?
– how to prevent bad merchants from cheating on
1/10 of their transactions?
– how to prevent bad customers from
besmirching good merchants?
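One way to see the first question concretely: under a purely linear score, a high-volume merchant who cheats on 1/10 of transactions can still outrank a small, perfectly honest one. The two scoring functions and the sample histories below are hypothetical illustrations, not any particular site's metric.

```python
def net_score(outcomes) -> int:
    """Linear metric: +1 per good transaction, -1 per bad one."""
    return sum(1 if ok else -1 for ok in outcomes)


def positive_fraction(outcomes) -> float:
    """Alternative summary: share of transactions that went well."""
    return sum(1 for ok in outcomes if ok) / len(outcomes)


# Hypothetical histories (True = satisfied customer).
honest_small = [True] * 50
cheats_one_in_ten = ([True] * 9 + [False]) * 100   # 1000 transactions

if __name__ == "__main__":
    for name, history in (("honest_small", honest_small),
                          ("cheats_one_in_ten", cheats_one_in_ten)):
        print(name, net_score(history), positive_fraction(history))
```

The net score rewards volume, so the cheater's 100 bad transactions are hidden behind 900 good ones; the fraction exposes the cheating but ignores how much history backs it.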
Accountability & Reputation
Portability
• Currently, accounts (and their associated
reputation) are tied to a particular context
– generally tied to a particular service/application
– how to maintain portable pseudonymity?
• How to move your account/reputation from:
– AIM -> Yahoo?
– Advogato -> SourceForge?
– eBay -> Amazon?
• Who will run 3rd party accounts / reputation?
– (PKI?)
– how much are you willing to pay?