Accusation probabilities
in Tardos codes
Antonino Simone and Boris Škorić
Eindhoven University of Technology
CWG, Dec 2010
Outline
Introduction to forensic watermarking
◦ Collusion attacks
◦ Aim
◦ Attack models
Tardos scheme
◦ Code length history
◦ q-ary version
◦ Properties
New parameterization
Majority voting effect
Performance of the Tardos scheme
◦ False accusation probability
Results & Summary
Forensic Watermarking
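[Figure: block diagram. The Embedder takes the original content, the WM secrets and a payload, and outputs content with hidden payload. After the ATTACK, the Detector uses the original content and the WM secrets to recover the payload.]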
Payload = some secret code identifying the recipient
Collusion attacks
"Coalition of pirates"
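"Detectable positions" = positions where the pirates' symbols differ (shown as 0/1 in the attacked content)

pirate #1    1    1    1    0    1    0    1    0    0    0    0    1
pirate #2    1    0    1    0    1    0    1    0    1    0    1    1
pirate #3    1    0    1    0    1    0    1    0    0    0    1    1
pirate #4    1    1    1    0    0    0    1    1    0    0    0    1

Attacked
Content      1   0/1   1    0   0/1   0    1   0/1  0/1   0   0/1   1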
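The detectable positions in the example above can be recomputed mechanically. The following is a minimal sketch, not part of the talk; the function name `detectable_positions` and the string encoding of the four pirate codewords are illustrative assumptions.

```python
def detectable_positions(codewords):
    """Return the 0-based positions where the colluders' symbols are not all equal.

    Hypothetical helper: codewords are equal-length strings, one per pirate.
    """
    length = len(codewords[0])
    return [i for i in range(length) if len({w[i] for w in codewords}) > 1]


# The four pirate codewords from the slide, written as strings (assumed encoding).
pirates = [
    "111010100001",  # pirate #1
    "101010101011",  # pirate #2
    "101010100011",  # pirate #3
    "111000110001",  # pirate #4
]

positions = detectable_positions(pirates)
print(positions)
```

Every position in the returned list is one where the attacked content can carry either symbol; all other positions reveal nothing to the coalition.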
Aim
Trace at least one pirate from detected watermark
BUT
◦ Resist large coalition → longer code
◦ Low probability of innocent accusation (FP) (critical!) → longer code
◦ Low probability of missing all pirates (FN) (not critical) → longer code
AND
Limited bandwidth available for watermarking code
Attack models
Once pirates detect watermark positions, what can they do?
Alphabet = {A,B,C,D}
1. Restricted digit model
◦ Choice from available symbols only
2. Unreadable digit model
◦ Erasure allowed
3. Arbitrary digit model
◦ Arbitrary symbol (but not erasure)
4. General digit model
[Figure: example pirate symbols from the alphabet {A,B,C,D} with a possible output under each model; "?" denotes an erasure. Annotations on the figure: "Simpler to analyze", "More realistic scenario", "equivalent for binary symbols".]
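The four attack models differ only in which outputs the coalition may produce in a detectable position. A minimal sketch of that difference follows; the function name `allowed_outputs`, the model labels and the rule that undetectable positions must be copied unchanged (the usual marking assumption) are assumptions of this sketch rather than statements from the slide.

```python
ERASURE = "?"  # assumed marker for an unreadable (erased) symbol


def allowed_outputs(observed, alphabet, model):
    """Set of outputs available to the coalition in one position.

    observed: symbols held by the pirates in this position
    alphabet: full symbol alphabet, e.g. "ABCD"
    model:    "restricted", "unreadable", "arbitrary" or "general"
    """
    seen = set(observed)
    if len(seen) == 1:
        # Undetectable position: assumed marking assumption, no choice at all.
        return seen
    if model == "restricted":
        return seen                      # choice from available symbols only
    if model == "unreadable":
        return seen | {ERASURE}          # available symbols, erasure allowed
    if model == "arbitrary":
        return set(alphabet)             # any symbol, but no erasure
    if model == "general":
        return set(alphabet) | {ERASURE}  # any symbol or an erasure
    raise ValueError(f"unknown model: {model}")


for name in ("restricted", "unreadable", "arbitrary", "general"):
    print(name, sorted(allowed_outputs("AAB", "ABCD", name)))
```

In this sketch a binary alphabet makes the restricted and arbitrary models coincide, because a detectable binary position already contains both symbols.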
Code length history
Construction
Boneh and Shaw 1998:
Tardos 2003:
Chor et al 2000:
Staddon et al 2001:
Huang + Moulin; Amiri + Tardos 2009:
Tardos 2003:
Boneh and Shaw 1998:
Lower bound
$m \geq 2\ln 2 \cdot c_0^2 \ln[1/\varepsilon_1]$, for $q = 2$
$c_0$ = #pirates
n = #users
m = code length in symbols
q = alphabet size
$\varepsilon_1$ = Prob[accuse specific innocent]
$\eta$ = Prob[not all accused are guilty]
$\varepsilon_2$ = False Negative prob.
q-ary Tardos scheme (2008)
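• Arbitrary alphabet size q
• Dirichlet distribution F
• Symbol-symmetric
• m content segments, n users, c pirates
• Symbol biases (piA, piB, piC) for each segment i = 1, ..., m, drawn from distribution F
• Watermark after attack = y
[Figure: matrix of embedded symbols over the alphabet {A, B, C}, one row per user and one column per content segment, headed by the bias vectors (p1A, p1B, p1C), (p2A, p2B, p2C), ..., (piA, piB, piC), ..., (pmA, pmB, pmC); the c pirate rows are marked, followed by the symbols allowed and the watermark after attack, y.]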
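Code generation in this scheme can be sketched in a few lines: draw one bias vector per segment, then draw each user's symbol in that segment according to the bias. The sketch below is illustrative only. It assumes a symmetric Dirichlet distribution with a single concentration parameter `kappa` (the slide only says "Dirichlet"), and the names `sample_bias` and `generate_code` are hypothetical.

```python
import random


def sample_bias(q, kappa, rng):
    """One bias vector of length q from a symmetric Dirichlet(kappa).

    Uses the standard construction: normalise q independent Gamma(kappa, 1) draws.
    """
    draws = [rng.gammavariate(kappa, 1.0) for _ in range(q)]
    total = sum(draws)
    return [d / total for d in draws]


def generate_code(n_users, m, alphabet, kappa, seed=0):
    """Return (biases, code): m bias vectors and an n_users x m symbol matrix."""
    rng = random.Random(seed)
    biases = [sample_bias(len(alphabet), kappa, rng) for _ in range(m)]
    code = [
        [rng.choices(alphabet, weights=biases[i])[0] for i in range(m)]
        for _ in range(n_users)
    ]
    return biases, code


biases, code = generate_code(n_users=5, m=8, alphabet="ABC", kappa=0.5, seed=1)
for row in code:
    print("".join(row))
```

Each printed row is one user's embedded symbol sequence; the columns share a bias vector, which is what lets the detector later weight rare symbols more heavily.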
Tardos scheme continued
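Accusation:
• Every user gets a score
• User is accused if score > threshold
• Sum of scores per content segment
• Given that pirates have y in segment i: a user's score for segment i is g1(p_y) if the user has symbol y in that segment and g0(p_y) otherwise, where p_y is the bias of symbol y in segment i
• Symbol-symmetric
[Figure: g0(p) and g1(p) plotted against p]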
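The accusation sum can be sketched as follows. The slide shows g0 and g1 only as plotted curves, so the closed forms used here, g1(p) = sqrt((1-p)/p) and g0(p) = -sqrt(p/(1-p)), are an assumption (the usual Tardos-style choice); `user_score` and the dictionary layout of the biases are likewise hypothetical.

```python
import math


def g1(p):
    """Assumed score when the user's symbol matches the pirate output."""
    return math.sqrt((1.0 - p) / p)


def g0(p):
    """Assumed score when the user's symbol differs from the pirate output."""
    return -math.sqrt(p / (1.0 - p))


def user_score(user_word, y, biases):
    """Sum of per-segment scores for one user.

    user_word: the user's symbols, one per segment
    y:         the symbols found in the attacked content
    biases:    per segment, a dict mapping symbol -> bias
    """
    total = 0.0
    for x_i, y_i, p_i in zip(user_word, y, biases):
        p_y = p_i[y_i]
        total += g1(p_y) if x_i == y_i else g0(p_y)
    return total


biases = [{"A": 0.5, "B": 0.5}, {"A": 0.2, "B": 0.8}]
print(user_score("AB", "AA", biases))
```

With these assumed forms, a user whose symbol is drawn with probability p has an expected segment score of p*g1(p) + (1-p)*g0(p) = 0 and variance 1, so an innocent user's total score has mean 0 and variance m.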
Properties of the Tardos scheme
Asymptotically optimal
Random code book
No framing
◦ No risk to accuse innocent users if coalition
is larger than anticipated
F, g0 and g1 chosen ‘ad hoc’ (can still be
improved)
Accusation probabilities
[Figure: probability distributions of the total score (scaled) for innocent and guilty users, with the accusation threshold marked. m = code length, c = #pirates, μ̃ = expected coalition score per segment.]
Curve shapes depend on:
◦ F, g0, g1 (fixed ‘a priori’)
◦ Code length
◦ # pirates
◦ Pirate strategy
Pirates want to minimize μ̃ and lengthen the innocent tail
Central Limit Theorem → asymptotically Gaussian shape (how fast?)
2003-2010: innocent accusation curve shape unknown… till now!
New parameterization
Symbol-symmetric → only the symbol occurrences matter
$\sigma$ = pirate occurrences vector, $\sigma_\alpha$ = # α in segment
c pirates → $\sum_\alpha \sigma_\alpha = c$
A new parameterization is necessary!
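$\tilde\mu = \sum_b K_b\, W(b)$, where $W(b)$ contains the factor
$\dfrac{\Gamma(b+\kappa-\tfrac12)\,\Gamma(c-b+\kappa[q-1]-\tfrac12)}{\Gamma(b+\kappa)\,\Gamma(c-b+\kappa[q-1])}$
K_b = quantity that depends on the pirate strategy
K_b can be pre-computed
Which strategy minimizes μ̃?
[Figure: W(b) plotted against b]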
Some attack definitions
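Majority voting
◦ y_i = symbol that occurs most in segment i
Interleaving attack
◦ Prob[y_i = α] = σ_α / c, where σ_α is the number of pirates holding symbol α in segment i
Example:
[Figure: pirate symbols over the alphabet {A, B, C, D} in three segments, with the resulting outputs]
◦ Interleaving: P[A]=2/3, P[B]=1/3; majority voting: A
◦ Interleaving: P[B]=2/3, P[C]=1/3; majority voting: B
◦ Interleaving: P[A]=1/3, P[B]=1/3, P[D]=1/3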
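Both strategies follow directly from the symbol tallies in a segment. The sketch below is illustrative; the function names are hypothetical, and breaking majority ties alphabetically is an assumption made only to keep the output deterministic.

```python
import random
from collections import Counter


def majority_voting(symbols):
    """Output the symbol that occurs most often among the pirates.

    Ties are broken alphabetically (an assumption of this sketch).
    """
    tally = Counter(symbols)
    top = max(tally.values())
    return min(s for s, count in tally.items() if count == top)


def interleaving_probs(symbols):
    """Output distribution of the interleaving attack: Prob[y = a] = tally(a) / c."""
    c = len(symbols)
    return {s: count / c for s, count in Counter(symbols).items()}


def interleaving(symbols, rng=random):
    """Copy the symbol of a uniformly chosen pirate, which realises tally(a) / c."""
    return rng.choice(symbols)


segment = ["A", "A", "B"]  # symbols held by three pirates in one segment
print(majority_voting(segment))
print(interleaving_probs(segment))
```

For the segment A, A, B this reproduces the first example row: interleaving outputs A with probability 2/3 and B with probability 1/3, while majority voting always outputs A.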
Majority voting
Theorem: Majority voting strategy minimizes μ̃
Proof (intuitive):
Case 1:
• only 2 symbols detected
[Figure: W(b) plotted against b for c=19, with c/2 marked on the b axis and the best choice of b indicated]
Case 2:
• more than two symbols detected
• one symbol occurs more than c/2 times
[Figure: W(b) plotted against b for c=19, with c/2 marked on the b axis and the best choice of b indicated]
Case 3:
• more than two symbols detected
• all symbols occur less than c/2 times
[Figure: W(b) plotted against b for c=19, with c/2 marked on the b axis and the best choice of b indicated]
Innocent curve behaviour
Motivations:
◦ Most critical part in the Tardos scheme (FP ≈ $10^{-10}$)
◦ Still unknown
◦ Unknown innocent curve → unknown real code length
◦ Is Gaussian approximation good?
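The Gaussian estimate that the exact result is compared against can be written down directly. This is a minimal sketch under a stated assumption: innocent segment scores are independent with mean 0 and variance 1, so the total score over m segments is approximated by a normal distribution with variance m. The function name `gaussian_fp` is hypothetical.

```python
import math


def gaussian_fp(threshold, m):
    """Gaussian estimate of Prob[innocent total score > threshold].

    Assumes the total score is approximately N(0, m), i.e. unit variance
    per segment (an assumption of this sketch).
    """
    z = threshold / math.sqrt(m)          # scaled threshold, threshold / sqrt(m)
    return 0.5 * math.erfc(z / math.sqrt(2.0))


m = 10_000
for scaled in (4.0, 5.0, 6.0, 6.36):
    print(scaled, gaussian_fp(scaled * math.sqrt(m), m))
```

Under this approximation the estimate depends on the threshold only through threshold/√m, and a false accusation probability near $10^{-10}$ corresponds to a scaled threshold of a little over 6.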
Approach
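Steps:
1. $S = \sum_i S_i$, with $S_i$ the score in segment i
2. $\rho_{S_i}$ = pdf of the segment score $S_i$; $\rho_S$ = pdf of total score S
3. Fourier transform property: $\rho_S = \mathrm{InverseFourier}\big[(\mathrm{Fourier}[\rho_{S_i}])^m\big]$
◦ Trouble doing numerics (integral does not converge)
4. Compute $\mathrm{Fourier}[\rho_{S_i}]$
• Depends on strategy
• New parameterization for attack strategy
5. Compute the inverse transform
• Three sub-steps, each handled with a Taylor expansion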
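The Fourier property behind these steps can be checked on a toy case: for independent, identically distributed segment scores, the distribution of the sum is the inverse transform of the m-th power of the single-segment transform. The sketch below uses a small discrete distribution and a plain discrete Fourier transform, which is an illustrative stand-in and not the continuous computation with Taylor expansions used in the talk. All names are hypothetical.

```python
import cmath


def convolve(a, b):
    """Direct linear convolution of two probability vectors."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def dft(values, sign):
    """Naive discrete Fourier transform; sign=-1 forward, sign=+1 inverse (unnormalised)."""
    n = len(values)
    return [
        sum(values[k] * cmath.exp(sign * 2j * cmath.pi * j * k / n) for k in range(n))
        for j in range(n)
    ]


def sum_pmf_direct(pmf, m):
    """Distribution of the sum of m i.i.d. variables by repeated convolution."""
    out = [1.0]
    for _ in range(m):
        out = convolve(out, pmf)
    return out


def sum_pmf_fourier(pmf, m):
    """Same distribution via InverseFourier[(Fourier[pmf])^m]."""
    n = (len(pmf) - 1) * m + 1                     # support size of the m-fold sum
    padded = list(pmf) + [0.0] * (n - len(pmf))    # zero-pad to avoid wrap-around
    spectrum = [z ** m for z in dft(padded, -1)]
    return [(z / n).real for z in dft(spectrum, +1)]


pmf = [0.5, 0.3, 0.2]
direct = sum_pmf_direct(pmf, 5)
fourier = sum_pmf_fourier(pmf, 5)
print(max(abs(d - f) for d, f in zip(direct, fourier)))
```

The printed value is the largest discrepancy between the two routes; it should be at the level of floating-point rounding.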
Main result: false accusation probability curve
Example: interleaving attack
[Figure: log10 FP plotted against threshold/√m, comparing the exact FP with the result from the Gaussian approximation]
Conclusion: Gaussian approximation is worse for larger q
[Figure annotation: "Better than Gaussian!"]
Example: majority voting attack
[Figure: log10 FP plotted against threshold/√m, comparing the exact FP with the result from the Gaussian approximation]
FP is 70 times less than Gaussian approx in this example
But
Code 2-5% shorter than predicted by Gaussian approx
Summary
Results:
introduced a new parameterization of the attack strategy
majority voting minimizes μ̃
first to compute the innocent score pdf
◦ quantified how close FP probability is to Gaussian
◦ sometimes better than Gaussian!
◦ safe to use Gaussian approx
◦ larger q → Gaussian approximation less good
Future work:
study more general attacks
different parameter choices
Thank you for your attention!