Anonymity

Anonymity Metrics
R. Newman
Topics

Defining anonymity

Need for anonymity

Defining privacy

Threats to anonymity and privacy

Mechanisms to provide anonymity

Metrics for Anonymity

Applications of anonymity technology
Anonymity Set Size

Used with Chaum (free-route) Mixes

Anonymity measure: Anonymity set size



Relative to message m

All possible senders
Pfitzmann – log |AS|

Measure used is $\log_2 |AS(m)|$

AS(m) is Anonymity Set for message m

Does not capture different likelihoods for different senders
Free-Route Mix Network

Suppose Threshold Mixes, threshold N = 2
[Figure: free-route network of four mixes M1, M2, M3, M4]
Free-Route Mix Network

Suppose Threshold Mixes, threshold N = 2

Trace backwards through Mix that sent m
[Figure: mixes M1, M2, M3, M4; m is the message of interest]
Free-Route Mix Network

Suppose Threshold Mixes, threshold N = 2

Trace backwards through Mix that sent m

Recursively....
[Figure: mixes M1, M2, M3, M4; m is the message of interest, with one possible sender marked]
Free-Route Mix Network

Continue, and get all possible senders
[Figure: mixes M1, M2, M3, M4; m is the message of interest, with the possible senders marked]
Free-Route Mix Network

Continue, and get all possible senders

This is the Anonymity Set for m
|AS| = all 4 nodes!

[Figure: mixes M1, M2, M3, M4; m is the message of interest, with all possible senders marked]
Free-Route Mix Network

But are all senders equally likely?

Q: What is the likelihood of each sender?
[Figure: possible senders S1, S2, S3, S4 and mixes M1, M2, M3, M4; m is the message of interest]
Free-Route Mix Network

But are all senders equally likely?

Q: What is the likelihood of each sender?
|AS| = all 4 nodes!

[Figure: the same network of possible senders and mixes M1, M2, M3, M4, with links labeled with probabilities p = 1, 1/2, 1/4, and 1/8]
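The question of how likely each sender is can be explored with a small backward trace. The sketch below is only an illustration: the topology in `INPUTS` is a hypothetical chain, not a transcription of the slide's figure, and it assumes every mix is an honest threshold mix whose inputs are equally likely to be the traced message.

```python
from fractions import Fraction

# Hypothetical topology (an assumption for illustration): each mix maps to
# the nodes whose messages it received in the batch that produced m.
INPUTS = {
    "M4": ["S1", "M3"],
    "M3": ["S2", "M2"],
    "M2": ["S3", "S4"],
}

def sender_probabilities(node, weight=Fraction(1)):
    """Return {sender: probability} for a message that left `node`."""
    if node not in INPUTS:                # a sender, not a mix
        return {node: weight}
    probs = {}
    share = weight / len(INPUTS[node])    # honest mix: inputs equally likely
    for src in INPUTS[node]:
        for sender, p in sender_probabilities(src, share).items():
            probs[sender] = probs.get(sender, Fraction(0)) + p
    return probs

probs = sender_probabilities("M4")        # m leaves mix M4
for sender in sorted(probs):
    print(sender, probs[sender])
```

With this assumed topology, the sender closest to m gets probability 1/2 and the two farthest senders get 1/8 each, even though all four are in the anonymity set.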
Anonymity Set

Relative to a message m

All possible senders of m

If Mix M that forwards m is honest

AS(m) = Union of AS(m’) for all m’ input to M

If Mix M that forwards m is corrupt

AS(m) = AS(m’) for input message m’ linked to m

Can be further constrained by path limitations
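The honest and corrupt cases can be sketched as a recursive trace. This is a simplified model under stated assumptions: the topology in `INPUTS` and the table `LINKED_INPUT` (which input a corrupt mix reveals) are hypothetical, and nodes stand in for the individual messages m’.

```python
# Hypothetical trace data (assumption): each mix maps to the nodes that
# supplied its input messages for the batch containing m.
INPUTS = {
    "M4": ["S1", "M3"],
    "M3": ["S2", "M2"],
    "M2": ["S3", "S4"],
}

# Hypothetical: for a corrupt mix, the attacker learns which input was
# linked to the traced output.
LINKED_INPUT = {"M3": "M2"}

def anonymity_set(node, corrupt=frozenset()):
    """Return the set of possible senders of a message that left `node`."""
    if node not in INPUTS:                 # reached a sender
        return {node}
    if node in corrupt:                    # corrupt mix: follow the one linked input
        return anonymity_set(LINKED_INPUT[node], corrupt)
    result = set()
    for src in INPUTS[node]:               # honest mix: union over all inputs
        result |= anonymity_set(src, corrupt)
    return result

print(sorted(anonymity_set("M4")))                   # all mixes honest
print(sorted(anonymity_set("M4", corrupt={"M3"})))   # M3 corrupt
```

In this example, corrupting M3 removes S2 from the anonymity set, because the attacker knows the traced message entered M3 from M2.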
Effective Anonymity Set Size

Given that senders are NOT all equally likely

What is information that attacker has?

Can measure using information theory concept

Entropy of the distribution


$S = -\sum_u p_u \log_2 p_u$

Where $p_u$ is probability of element u
What is effective AS size for our example?
Effective Anonymity Set Size

What is effective AS size for our example?

Entropy of the distribution

$S = -\sum_u p_u \log_2 p_u$

Where $p_u$ is probability of element u

Distribution = {1/2, 1/4, 1/8, 1/8}

$S = -[(1/2)(-1) + (1/4)(-2) + (1/8)(-3) + (1/8)(-3)]$

$S = 1/2 + 1/2 + 6/8 = 1.75$

Effective AS size is $2^{1.75} \approx 3.36 < 4$

So non-uniform probabilities provide attacker with some usable information
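The arithmetic above can be checked with a few lines of code. This is a minimal sketch; the function names `entropy_bits` and `effective_size` are illustrative choices, not names from the source.

```python
import math

def entropy_bits(probs):
    """Shannon entropy in bits: S = -sum(p * log2(p))."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def effective_size(probs):
    """Effective anonymity set size, taken here as 2**S."""
    return 2 ** entropy_bits(probs)

example = [1/2, 1/4, 1/8, 1/8]
uniform = [1/4] * 4

print(entropy_bits(example), effective_size(example))
print(entropy_bits(uniform), effective_size(uniform))
```

The uniform distribution over four senders gives 2 bits and an effective size of 4, so the gap between 4 and about 3.36 is what the attacker gains from the skewed probabilities.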
Effective Anonymity Set Size

How to combine networks of Mixes?

Let Mix sec have l input Mixes, $M_1, M_2, \ldots, M_l$

All senders are independent

Analyze effective anonymity set size for sec

$S_{sec} = -\sum_{i=1}^{l} p_i \log_2 p_i$

Where $p_i$ is probability m came from Mix $M_i$

Let $S_i$ be the effective anonymity set size (entropy, in bits) of $M_i$

Then effective anonymity set size (entropy, in bits) for system is

$S_{total} = S_{sec} + \sum_{i=1}^{l} p_i S_i$
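The composition rule can be checked numerically against the entropy of the full sender distribution. The sketch below assumes the sender sets behind the input mixes are disjoint; the function names and the example numbers are hypothetical.

```python
import math

def entropy_bits(probs):
    """Shannon entropy in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def combined_entropy(p_mix, per_mix_dists):
    """S_total = S_sec + sum_i p_i * S_i."""
    s_sec = entropy_bits(p_mix)
    return s_sec + sum(p * entropy_bits(d) for p, d in zip(p_mix, per_mix_dists))

def flattened(p_mix, per_mix_dists):
    """Per-sender distribution, assuming the mixes have disjoint sender sets."""
    return [p * q for p, d in zip(p_mix, per_mix_dists) for q in d]

# Hypothetical example: sec has two inputs, each equally likely to carry m.
p_mix = [0.5, 0.5]
per_mix_dists = [[1.0], [0.5, 0.25, 0.25]]

total = combined_entropy(p_mix, per_mix_dists)
direct = entropy_bits(flattened(p_mix, per_mix_dists))
print(total, direct)
```

Both routes give 1.75 bits here, since the flattened distribution is {1/2, 1/4, 1/8, 1/8}.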
Route Length Constraints

Suppose max route length = 2

i.e., message only traverses 2 mixes
[Figure: free-route network of four mixes M1, M2, M3, M4]
Route Length Constraints

Suppose max route length = 2

i.e., message only traverses 2 mixes

[Figure: mixes M1, M2, M3, M4; one candidate sender is annotated "Can’t be this one: path too long!"]
What is effect on effective AS size?
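One way to explore this question is to add a hop budget to the backward trace. The sketch below is an assumption-laden illustration: the topology in `INPUTS` is hypothetical, all mixes are treated as honest, and "route length" is counted as the number of mixes a message passes through.

```python
# Hypothetical topology (assumption): each mix maps to the nodes that
# supplied its inputs for the batch containing m.
INPUTS = {
    "M4": ["S1", "M3"],
    "M3": ["S2", "M2"],
    "M2": ["S3", "S4"],
}

def anonymity_set(node, hops_left):
    """Possible senders of a message leaving `node`, when the message may
    have passed through at most `hops_left` mixes (counting `node`)."""
    if node not in INPUTS:        # reached a sender within the budget
        return {node}
    if hops_left == 0:            # another mix would make the path too long
        return set()
    result = set()
    for src in INPUTS[node]:
        result |= anonymity_set(src, hops_left - 1)
    return result

print(sorted(anonymity_set("M4", hops_left=3)))   # no effective constraint
print(sorted(anonymity_set("M4", hops_left=2)))   # max route length = 2
```

In this assumed network the limit of 2 cuts the anonymity set from four senders to two, so the effective size cannot exceed 2 either.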
Mix Cascade

Single chain of Mixes for a sender group

All traffic enters first Mix M1 in cascade

All traffic is shuffled and re-encrypted

All traffic is sent from Mi to Mi+1 in cascade

All traffic exits last Mix to destinations
[Figure: cascade M1 → M2 → M3 → M4]
Mix Cascade

What is effect of Mix cascade on effective AS size?

[Figure: cascade M1 → M2 → M3 → M4]
Threshold Pool Mixes

Mix starts with n messages in pool

When N messages arrive, Mix fires

Selects N of the N+n messages it holds, uniformly at random

Sends those N messages, keeping n in pool

[Figure: pool mix PM holding a pool of n messages; N messages in, N messages out]
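The firing rule can be sketched as a small class. This is a toy model under stated assumptions: the class name `ThresholdPoolMix` and its methods are hypothetical, messages are plain strings, and no encryption or reordering beyond a uniform shuffle is modeled.

```python
import random

class ThresholdPoolMix:
    """Toy threshold pool mix: fires after `threshold` arrivals."""

    def __init__(self, pool, threshold, rng=None):
        self.held = list(pool)            # messages initially in the pool
        self.threshold = threshold        # N: arrivals needed to fire
        self.arrived = 0                  # arrivals since the last firing
        self.rng = rng or random.Random()

    def receive(self, msg):
        """Accept one message; return the output batch if the mix fires."""
        self.held.append(msg)
        self.arrived += 1
        if self.arrived == self.threshold:
            return self.fire()
        return []

    def fire(self):
        """Send N uniformly chosen held messages; the rest stay in the pool."""
        self.rng.shuffle(self.held)
        out = self.held[:self.threshold]
        self.held = self.held[self.threshold:]
        self.arrived = 0
        return out

mix = ThresholdPoolMix(pool=["d0", "d1"], threshold=3, rng=random.Random(0))
outputs = [mix.receive(m) for m in ["a", "b", "c"]]
print(outputs[-1], mix.held)
```

Because the pool size is unchanged after each firing, a message that arrives in one round may leave in that round or in any later one.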
Threshold Pool Mixes


Using standard AS measure, a given message m sent by mix M could have been sent by any node that ever could have sent a message to the mix or to one of its predecessors before m was sent by M

What about effective AS size?

[Figure: pool mix PM over rounds 1, 2, …, k; in round j, the N senders $S_{(j-1)N+1}$ through $S_{jN}$ send to PM, which outputs N messages and keeps n in the pool]
Threshold Pool Mixes

For effective AS size, must analyze probability distribution for a message coming from senders at each previous round (i.e., firing)

For example, if m comes out at round k

Prob that message arrived at round x, 0 < x <= k

$P_x = \left[\frac{N}{N+n}\right] \left[\frac{n}{N+n}\right]^{k-x}$

First factor: Prob(arrived round x given that it didn’t arrive later)

Second factor: Prob(didn’t arrive rounds x+1 to k)

[Figure: mix M; a message it sends is one of the N new arrivals with Prob = N/(N+n), or came from the pool with Prob = n/(N+n)]
Threshold Pool Mixes

For effective AS size, must analyze probability distribution for a message coming from senders at each previous round (i.e., firing)

For example, if m comes out at round k

Prob that message arrived at round 0

$P_0 = [1] \left[\frac{n}{N+n}\right]^{k}$

First factor: Prob(arrived round 0 given that it didn’t arrive later)

Second factor: Prob(didn’t arrive rounds 1 to k)

[Figure: mix M; a message it sends is one of the N new arrivals with Prob = N/(N+n), or came from the pool with Prob = n/(N+n)]
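The round-by-round probabilities can be tabulated exactly and checked to sum to 1. This is a minimal sketch; the function name `arrival_distribution` is an illustrative choice, and exact fractions are used only to make the check free of rounding.

```python
from fractions import Fraction

def arrival_distribution(N, n, k):
    """Return [P_0, P_1, ..., P_k]: probability that a message output at
    round k arrived at round x (round 0 is the initial pool)."""
    stay = Fraction(n, N + n)     # message came from the pool at a firing
    fresh = Fraction(N, N + n)    # message was a new arrival at a firing
    dist = [stay ** k]                                           # P_0
    dist += [fresh * stay ** (k - x) for x in range(1, k + 1)]   # P_1 .. P_k
    return dist

dist = arrival_distribution(N=2, n=1, k=2)
print(dist, sum(dist))
```

For N = 2, n = 1, k = 2 this gives P_0 = 1/9, P_1 = 2/9, P_2 = 2/3, so the most recent round dominates but earlier rounds keep nonzero weight.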
Threshold Pool Mixes


Entropy measure is then the negated sum, over the senders in all possible arrival rounds (0 to k), of probability times log2 probability (kinda big to write out here)

As k -> infinity (large number of rounds), the expression converges to

$\lim_{k \to \infty} E_k = [1 + (n/N)] \log_2 (N+n) - (n/N) \log_2 n$

When n=0 (no pool: standard threshold mix)

$E_k = \log_2 N$

When n=1, $\lim_{k \to \infty} E_k = [(N+1)/N] \log_2 (N+1)$

For N = 100, this is about 6.725

Effective AS size is about 106
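The finite-round entropy and its limit can be compared numerically. The sketch below assumes every message in a round comes from a distinct sender, so the N senders of round x share that round's probability equally, and the n initial pool messages share the round-0 probability; the function names are hypothetical.

```python
import math

def pool_mix_entropy(N, n, k):
    """Entropy in bits over possible senders of a message output at round k."""
    stay = n / (N + n)
    total = 0.0
    # Rounds 1..k: the N senders of round x share (N/(N+n)) * stay**(k-x).
    for x in range(1, k + 1):
        p_round = (N / (N + n)) * stay ** (k - x)
        if p_round > 0:
            total -= p_round * math.log2(p_round / N)
    # Round 0: the n initial pool messages share stay**k.
    if n > 0:
        p0 = stay ** k
        if p0 > 0:
            total -= p0 * math.log2(p0 / n)
    return total

def pool_mix_entropy_limit(N, n):
    """Limit as k grows: (1 + n/N) log2(N+n) - (n/N) log2(n)."""
    if n == 0:
        return math.log2(N)
    return (1 + n / N) * math.log2(N + n) - (n / N) * math.log2(n)

limit = pool_mix_entropy_limit(100, 1)
print(pool_mix_entropy(100, 1, 50), limit, 2 ** limit)
```

Other pool sizes can be plugged in the same way; the finite sum approaches the limit quickly because the weight of old rounds decays geometrically.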
Threshold Pool Mixes


When n=10 and N=100

Lim Ek is about 7.13

Effective AS size is about 140
At what price?

Not free!

Increased delay due to chance of staying in pool

Average latency increases from 1 to 1 + n/N rounds

Variance of $n(N+n)^2/N^3$
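The average latency figure can be checked numerically. The sketch below assumes that at each firing a held message leaves with probability N/(N+n) independently of earlier rounds, so the delay in rounds is geometric; it checks only the mean, and the function name is hypothetical.

```python
def mean_delay_rounds(N, n, max_rounds=10_000):
    """Expected number of rounds a message spends in the mix, assuming it
    leaves at each firing with probability N/(N+n)."""
    leave = N / (N + n)
    stay = n / (N + n)
    # Sum d * P(delay = d) over a truncated range of d.
    return sum(d * leave * stay ** (d - 1) for d in range(1, max_rounds + 1))

print(mean_delay_rounds(100, 0))    # no pool
print(mean_delay_rounds(100, 10))   # pool of 10
```

With N = 100 and n = 10 the mean comes out at about 1.1 rounds, matching 1 + n/N.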
Entropy Measure


So now we have an effective way to account for what the attacker actually knows

That reflects the non-uniformity of probability distributions for senders (or recipients) of a given message