Randomness Extractors
& their Cryptographic Applications
Salil Vadhan
Harvard University
to be posted at http://eecs.harvard.edu/~salil
Motivation
Original Motivation
[SV84,Vaz85,VV85,CG85,Vaz87,CW89,Zuc90,Zuc91]
• Randomization is pervasive in CS
– Algorithm design, cryptography, distributed computing, …
• Typically assume perfect random source.
– Unbiased, independent random bits
– Unrealistic?
• Can we use a “weak” random source?
– Source of biased & correlated bits.
– More realistic model of physical sources.
• (Randomness) Extractors: convert a weak random
source into an almost-perfect random source.
CS Theory Applications of Extractors
• Derandomization of (poly-time/log-space) algorithms
[Sip88,NZ93,INW94, GZ97,RR99, MV99,STV99,GW02]
• Distributed & Network Algorithms
[WZ95,Zuc97,RZ98,Ind02].
• Hardness of Approximation [Zuc93,Uma99,MU01]
• Data Structures [Ta02]
• Unify many important “pseudorandom” objects
– Hash Functions
– Expander Graphs
– Samplers
– Pseudorandom Generators
– Error-Correcting Codes
Crypto Applications of Extractors
• Privacy Amplification [BBR85]
• Pseudorandom Generators [HILL89]
• Protecting against Partial Key Exposure [CDHKS00]
• Crypto vs. Storage-bounded Adversaries [Lu02]
• Biometrics [DRS04]
• Crypto with Human Passwords [NV04]
Outline
• Motivation
• Definition & Basics
• Cryptographic Applications
• Conclusions & a Glimpse Beyond
Definition & Basics
Weak Random Sources
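• What is a source of biased & correlated bits?
– Probability distribution X on {0,1}^n.
– Must contain some “randomness”.
– Want: no independence assumptions ⇒ one sample
• Measure of “randomness”
– Shannon entropy: H(X) = E_{x←X}[log(1/Pr[X=x])]
No good: X can have high Shannon entropy and still take one fixed value with high probability.
– Better [Zuckerman `90]: min-entropy H_∞(X) = min_x log(1/Pr[X=x])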
Min-entropy
• Def: X is a k-source if H_∞(X) ≥ k,
i.e. Pr[X=x] ≤ 2^{-k} for all x
• Examples:
– Unpredictable Source [SV84]: ∀ i∈[n], b_1, ..., b_{i-1} ∈ {0,1}, Pr[X_i = 1 | X_1=b_1, ..., X_{i-1}=b_{i-1}] is bounded away from 0 and 1
– Bit-fixing [CGH+85,BL85,LLS87,CW89]: Some k coordinates of
X uniform, rest fixed (or even depend arbitrarily on others).
– Flat k-source: Uniform over S ⊆ {0,1}^n, |S| = 2^k
• Fact [CG85]: every k-source is convex combination of
flat ones.
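A small numerical illustration of why min-entropy, rather than Shannon entropy, is the measure used here. The example distribution and the function names are assumptions made for illustration; they are not from the talk.

```python
import math

def shannon_entropy(dist):
    """H(X) = sum_x p(x) * log2(1/p(x)) for a dict {outcome: probability}."""
    return sum(p * math.log2(1 / p) for p in dist.values() if p > 0)

def min_entropy(dist):
    """H_inf(X) = log2(1 / max_x p(x)); X is a k-source iff this is >= k."""
    return math.log2(1 / max(dist.values()))

# Hypothetical source on 16-bit strings: outputs 0 with probability 1/2,
# otherwise a uniform 16-bit string.
n = 16
N = 2 ** n
dist = {x: 0.5 / N for x in range(N)}
dist[0] += 0.5

print(shannon_entropy(dist))  # many bits of Shannon entropy (about n/2)
print(min_entropy(dist))      # under 1 bit: one outcome has probability > 1/2

# A flat k-source has min-entropy exactly k.
flat = {x: 1 / 8 for x in range(8)}
print(min_entropy(flat))
```

The first source looks fine by Shannon entropy, yet an adversary guesses its output correctly more than half the time, which is what min-entropy captures.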
Extractors: 1st attempt
• A function Ext : {0,1}^n → {0,1}^m s.t.
∀ k-source X, Ext(X) is “close” to uniform.
[Diagram: k-source of length n → EXT → m almost-uniform bits]
• Impossible! ∃ set of 2^{n-1} inputs x on which first bit of
Ext(x) is constant ⇒ flat (n-1)-source X, bad for Ext.
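The impossibility argument can be checked by brute force for small n. The sketch below takes any seedless candidate function and returns a set of at least 2^{n-1} inputs on which its first output bit is constant. The candidate function and all names are hypothetical.

```python
from itertools import product

def bad_flat_source(ext, n):
    """Return a set S of at least 2^(n-1) inputs on which the first output
    bit of `ext` is constant. The uniform distribution on S has min-entropy
    at least n-1, yet the first bit of ext on it is fixed."""
    inputs = list(product([0, 1], repeat=n))
    zeros = [x for x in inputs if ext(x)[0] == 0]
    ones = [x for x in inputs if ext(x)[0] == 1]
    # One of the two preimage sets must contain at least half the inputs.
    return zeros if len(zeros) >= len(ones) else ones

# Hypothetical seedless candidate: (parity of x, first bit of x).
def candidate_ext(x):
    return (sum(x) % 2, x[0])

n = 6
S = bad_flat_source(candidate_ext, n)
first_bits = {candidate_ext(x)[0] for x in S}
print(len(S), first_bits)
```

The same search succeeds for every deterministic candidate, which is why a seed is introduced next.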
Extractors
[Nisan & Zuckerman `93]
• Def: A (k,ε)-extractor is Ext : {0,1}^n × {0,1}^d → {0,1}^m
s.t. ∀ k-source X, Ext(X,U_d) is ε-close to U_m.
[Diagram: k-source of length n, plus a “seed” of d random bits → EXT → m almost-uniform bits]
• Key point: seed can be much shorter than output.
• Goals: minimize seed length, maximize output length.
Definitional Details
• U_t = uniform distribution on {0,1}^t
• Measure of closeness:
statistical difference (a.k.a. variation distance)
Δ(X,Y) = max_T |Pr[X∈T] − Pr[Y∈T]|
– T = “statistical test” or “distinguisher”
– metric, ∈ [0,1], very well-behaved
• Def: X, Y ε-close if Δ(X,Y) ≤ ε.
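A minimal sketch of statistical difference for explicitly given distributions. It uses the standard identity that the maximum over tests T equals half the L1 distance and is attained by T = {x : Pr[X=x] > Pr[Y=x]}. The two example distributions are made up for illustration.

```python
def statistical_difference(p, q):
    """max_T |Pr[X in T] - Pr[Y in T]|, computed as (1/2) * sum_x |p(x) - q(x)|."""
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(x, 0.0) - q.get(x, 0.0)) for x in support)

def best_test(p, q):
    """A maximizing statistical test: T = {x : p(x) > q(x)}."""
    return {x for x in set(p) | set(q) if p.get(x, 0.0) > q.get(x, 0.0)}

uniform = {x: 0.25 for x in range(4)}
biased = {0: 0.4, 1: 0.3, 2: 0.2, 3: 0.1}

delta = statistical_difference(biased, uniform)
T = best_test(biased, uniform)
# Advantage of the distinguisher that accepts exactly the outcomes in T.
advantage = sum(biased[x] for x in T) - sum(uniform[x] for x in T)
print(delta, T, advantage)
```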
Strong extractors
• Output looks random even after seeing the seed.
(important in most crypto applications)
• Def: Ext is a (k,ε) strong extractor if
Ext′(x,y) = y∘Ext(x,y) is a (k,ε) extractor
• i.e. ∀ k-sources X, for a 1−ε′ frac. of y∈{0,1}^d
Ext(X,y) is ε′-close to U_m
• In this talk, “extractor” ≡ “strong extractor”
The Parameters
• The min-entropy k:
– High min-entropy: k = n-a, a =o(n)
– Constant entropy rate: k = Ω(n)
– Middle (hardest) range: k = n^α, 0<α<1
– Low min-entropy: k = n^{o(1)}
• The error ε:
– In crypto apps, ε ≈ Pr[adversary “breaks” scheme]
(very small)
• The output length m:
– Certainly m ≤ k.
– Can this be achieved?
The Optimal Extractor
Thm [Sip88,RT97]: For every k ≤ n, ∃ a (k,ε)-extractor w/
– Seed length d = log(n−k) + 2log(1/ε) + O(1)
– Output length m = k − 2log(1/ε) − O(1)
“extract almost all the min-entropy w/logarithmic seed”
• Pf Sketch: Probabilistic Method.
– Show that for random Ext,
Pr[Ext not (k,ε)-extractor] < 1.
– By union bound over flat k-sources X on {0,1}^n and
statistical tests T ⊆ {0,1}^m
The Optimal Extractor
• Thm: For every k ≤ n, ∃ a (k,ε)-extractor w/
– Seed length d = log(n−k) + 2log(1/ε) + O(1)
– Output length m = k − 2log(1/ε) − O(1)
• Thm [NZ93,RT97]: Above tight up to additive constants.
• For applications, need explicit extractors:
– Ext(x,y) computable in time poly(n).
– Random extractor requires space ≥ 2^n to even store!
• Long line of research has sought to approach above
bounds with explicit constructions.
Extractors as Hash Functions
[Diagram: h_y maps {0,1}^n to {0,1}^m; a flat k-source, i.e. a set of size 2^k ≫ 2^m, is highlighted in the domain]
• For most y, h_y = Ext(·,y) maps sets of size K = 2^k almost uniformly
onto range.
Extractors from Hash Functions
• Leftover Hash Lemma [BBR85,ILL89]:
universal (i.e. pairwise independent) hash functions
yield strong extractors
– output length: m = k − 2log(1/ε) − O(1)
– seed length: d = n+m
– example: Ext(x,(a,b)) = first m bits of a·x+b in GF(2^n)
• Almost pairwise independence [SZ94,GW94]:
– seed length: d = O(log n + k)
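A toy instance of the example above for n = 8, as a sketch under stated assumptions: the field GF(2^8) is represented with the irreducible polynomial x^8 + x^4 + x^3 + x + 1, “first m bits” is taken to mean the m high-order bits, and the flat source is an arbitrary set of size 2^k. It averages the distance to uniform over all seeds and compares it with the standard Leftover Hash Lemma bound (1/2)·sqrt(2^{m−k}).

```python
import math
from collections import Counter

N_BITS = 8
POLY = 0x11B  # x^8 + x^4 + x^3 + x + 1, irreducible over GF(2) (assumed choice)

def gf_mul(a, b):
    """Multiply a and b in GF(2^8), reducing modulo POLY."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a >> N_BITS:      # degree reached 8: reduce
            a ^= POLY
    return result

def ext(x, seed, m):
    """Ext(x,(a,b)) = m high-order bits of a*x + b in GF(2^8)."""
    a, b = seed
    return (gf_mul(a, x) ^ b) >> (N_BITS - m)

def avg_distance_to_uniform(source, m):
    """Average over seeds of the statistical difference between
    Ext(X, seed) and U_m, for X uniform on `source`."""
    total = 0.0
    for a in range(2 ** N_BITS):
        # b only XORs a fixed string into the output, which does not change
        # the distance to uniform, so enumerating a with b = 0 suffices.
        counts = Counter(ext(x, (a, 0), m) for x in source)
        total += 0.5 * sum(abs(counts.get(z, 0) / len(source) - 2 ** -m)
                           for z in range(2 ** m))
    return total / 2 ** N_BITS

k, m = 6, 2
flat_source = list(range(3, 3 + 2 ** k))   # arbitrary set of size 2^k
avg = avg_distance_to_uniform(flat_source, m)
lhl_bound = 0.5 * math.sqrt(2 ** (m - k))
print(avg, lhl_bound)
```

Averaging over seeds is exactly the strong-extractor condition: the output must look uniform even when the seed is known.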
Application: Randomized algorithms
w/a weak source [Zuckerman `90,`91]
[Diagram: k-source and d-bit seed → EXT → m uniform bits → Randomized Algorithm (on input x) → accept/reject; errs w.p. ≤ 2(δ+ε)]
• Run algorithm using all 2^d seeds & output majority.
• Only polynomial slowdown, provided d=O(log n)
and Ext explicit.
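The seed-enumeration idea can be sketched as follows. Everything here is a toy assumption: the extractor is multiplication in GF(2^8) truncated to m bits, and the “randomized algorithm” is a stand-in that answers wrongly on exactly one coin string.

```python
N_BITS, POLY = 8, 0x11B  # toy field GF(2^8); the polynomial is an assumed choice

def gf_mul(a, b):
    """Multiply a and b in GF(2^8), reducing modulo POLY."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a >> N_BITS:
            a ^= POLY
    return result

def ext(x, seed, m):
    """Toy extractor: m high-order bits of seed * x in GF(2^8)."""
    return gf_mul(seed, x) >> (N_BITS - m)

def toy_algorithm(value, r):
    """Hypothetical randomized test for 'value is even':
    gives the wrong answer exactly when its coin string r is 0."""
    correct = (value % 2 == 0)
    return correct if r != 0 else not correct

def run_with_weak_source(value, weak_sample, d=8, m=3):
    """Run the algorithm on Ext(weak_sample, y) for all 2^d seeds y
    and output the majority answer."""
    votes = [toy_algorithm(value, ext(weak_sample, y, m)) for y in range(2 ** d)]
    return sum(votes) > len(votes) / 2

print(run_with_weak_source(10, weak_sample=0xB7))
print(run_with_weak_source(7, weak_sample=0xB7))
# A bad sample from the source (here 0) can still spoil every vote; the
# guarantee is only that such samples are rare for a k-source.
print(run_with_weak_source(10, weak_sample=0))
```

Only one sample from the weak source is used; the cost is a factor 2^d in running time, which is polynomial when d = O(log n).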
Cryptographic Applications
Crypto with Weak Random Sources?
• Enumerating seeds doesn’t work.
– e.g. get several encryptions of a message, most of which are
“secure”
• Thm [MP97,DOPS04]: Most crypto tasks are impossible
with only an (n-1)-source.
– Encryption, commitment, secret sharing, zero knowledge,…
• Alternative: Seek “seedless” extractors for restricted
classes of sources.
– Lots of recent progress
– Bit-fixing sources [KZ03], several independent weak sources
[CG88,BIW04,DEOR04,BKSSW04,Raz05], efficiently
samplable sources [TV00]
Seeded Extractors in Crypto:
Privacy Amplification
[Bennett,Brassard,Robert `85]
• honest parties hold a string X that is weakly random
to adversary ⇒ X a k-source
• Goal: obtain bits that adversary cannot distinguish
from uniform
• Natural approach: apply an extractor to X
• Q: where to get seed?
– Various solutions, depending on application
Key Agreement w/a Noisy Channel
[BBR85]
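[Diagram: Alice picks X ← {0,1}^n and sends it over a noisy communication channel; Bob receives Y, Eve receives Z]
⇒ w.h.p. Alice & Bob “share some randomness” unknown to Eve
• Information reconciliation protocol: Alice holds X, Bob obtains Y = X w.h.p.; Eve’s view becomes Z′.
⇒ w.h.p. over z ← Z′, X|Z′=z is a k-source for large k.
• Random seed R: Alice computes K = Ext(X,R), Bob computes K = Ext(Y,R); Eve’s view becomes Z″ = (Z′,R).
⇒ w.h.p. over z ← Z″, K|Z″=z is ε-close to uniform.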
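A minimal sketch of the last step, privacy amplification with a public seed. As an assumption, the extractor is instantiated with a random binary matrix over GF(2), which is a universal hash family and so fits the Leftover Hash Lemma; the talk does not fix a particular extractor here. The lengths n and m are arbitrary, and the noisy channel and reconciliation are not modeled.

```python
import secrets

def random_seed(n, m):
    """Seed R: the m rows (each an n-bit integer) of a random binary matrix."""
    return [secrets.randbits(n) for _ in range(m)]

def ext(x, seed):
    """K = M*x over GF(2): output bit i is the parity of (row_i AND x)."""
    key = 0
    for row in seed:
        key = (key << 1) | (bin(row & x).count("1") & 1)
    return key

n, m = 64, 16
x_alice = secrets.randbits(n)   # Alice's string X
y_bob = x_alice                 # after reconciliation, Y = X w.h.p. (assumed here)
R = random_seed(n, m)           # sent in the clear, so Eve's view includes R

k_alice = ext(x_alice, R)
k_bob = ext(y_bob, R)
print(k_alice == k_bob)
```

Because R is public, the strong-extractor property is what makes K look uniform to Eve.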
The Bounded-Storage Model [Maurer 90]
[Diagram: random string of length n; EXT applied to it with a seed; adversary with storage s]
• High-rate source of truly random bits.
• Lemma: conditioned on adversary’s state, have (n-s)-source w.h.p.
⇒ Output of extractor looks uniform to adversary [NZ93,Lu02]
Proof of Lemma
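Lemma: (X,Z) (correlated) random vars, X a k-source and |Z|=s
⇒ w.p. ≥ 1−ε over z←Z, X|Z=z is a (k−s−log(1/ε))-source.
Proof: Let BAD = { z : Pr[Z=z] ≤ ε·2^{-s} }. Then
• Pr[Z∈BAD] ≤ 2^s · ε·2^{-s} = ε
• For z∉BAD, Pr[X=x | Z=z] ≤ Pr[X=x]/Pr[Z=z] < 2^{-k}/(ε·2^{-s}) = 2^{-(k−s−log(1/ε))}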
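The lemma can be checked numerically on a small case. In this sketch X is uniform on n-bit strings (so k = n) and Z = f(X) is an s-bit function of X; the particular leakage function and parameters are hypothetical.

```python
import math
from collections import Counter

def bad_mass(n, s, eps, f):
    """X uniform on n-bit strings, Z = f(X) taking s-bit values.
    Return the probability over z <- Z that X | Z=z has min-entropy
    below n - s - log2(1/eps)."""
    sizes = Counter(f(x) for x in range(2 ** n))
    target = n - s - math.log2(1 / eps)
    # X | Z=z is uniform on the preimage of z, so its min-entropy is
    # log2 of the preimage size.
    return sum(c for c in sizes.values() if math.log2(c) < target) / 2 ** n

def leak(x):
    """Hypothetical adversary state: Hamming weight of x, capped to 3 bits."""
    return min(bin(x).count("1"), 7)

mass = bad_mass(n=10, s=3, eps=0.1, f=leak)
print(mass)  # the lemma guarantees this is at most eps
```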
The Bounded-Storage Model
Doing Cryptography:
• Seed = shared secret key
• Output of extractor = use for encryption (one-time pad),
message authentication
• Strong extractor ⇒ seed reusable, secure even if key
compromised later (“everlasting security” [ADR99])
The Bounded-Storage Model
Additional Constraint: honest parties should only have to
read a small # of bits from source
i.e. EXT should be “locally computable” [Lu 02,Vadhan 03]
(easily achieved using techniques in the extractor literature)
Extractors & Biometrics
[Dodis, Reyzin, Smith `03]
Goal: use biometric data (e.g. your fingerprint F) as
crypto keys
Problem: biometric data not uniform
• But seems to have significant min-entropy
⇒ use K = Ext(F,R) instead
[Diagram: user, client, server. Labels: F; “start session”; R; K, R; K = Ext(F,R)]
Extractors & Biometrics
Problem 2: biometric data not reliable
• Multiple readings will produce non-identical, but “close” (e.g. in
Hamming distance) values
Want: value C=C(F) s.t.
• F can be recovered from C and any F′ δ-close to F
• F still has high min-entropy given C
[Diagram: user, client, server. Labels: F′; F = Rec(F′,C); “start session”; R, C; K, R, C; K = Ext(F,R)]
Extractors & Biometrics
Want: value C=C(F) s.t.
• F can be recovered from C and any F′ δ-close to F
• F still has high min-entropy given C
Solution: C = F⊕Z
• Z random codeword in error-correcting code of relative minimum
distance >2δ and rate 1-
• Reduces min-entropy rate by at most
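A minimal sketch of the C = F⊕Z construction and the recovery step, assuming a 3-fold repetition code as the error-correcting code (chosen only because its decoder is a one-line majority vote; a real system would use a much better code). The reading F, the flipped positions and all function names are hypothetical.

```python
import secrets

REP = 3  # assumed toy code: each bit repeated 3 times, corrects 1 flip per block

def random_codeword(num_blocks):
    """Random codeword Z: each fresh random bit repeated REP times."""
    return [bit for _ in range(num_blocks) for bit in [secrets.randbits(1)] * REP]

def decode(word):
    """Nearest codeword: majority vote inside each block."""
    out = []
    for i in range(0, len(word), REP):
        majority = int(sum(word[i:i + REP]) * 2 > REP)
        out.extend([majority] * REP)
    return out

def xor(u, v):
    return [a ^ b for a, b in zip(u, v)]

def sketch(f):
    """C = F xor Z for a random codeword Z."""
    return xor(f, random_codeword(len(f) // REP))

def rec(f_noisy, c):
    """Rec(F', C): C xor F' = Z xor (F xor F') decodes to Z; then F = C xor Z."""
    z = decode(xor(c, f_noisy))
    return xor(c, z)

f = [1, 0, 1, 1, 1, 0, 0, 0, 1, 0, 1, 1]   # enrolled reading F (12 bits)
c = sketch(f)

f_noisy = list(f)       # a later reading F' that is close to F
f_noisy[1] ^= 1         # one flipped bit in block 0
f_noisy[9] ^= 1         # one flipped bit in block 3

recovered = rec(f_noisy, c)
print(recovered == f)
```

The key K = Ext(F,R) is then computed from the recovered F, so both readings lead to the same key.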
Additional Crypto Applications
• Protecting against partial key exposure [Canetti et al. 00]
– If adversary learns k bits of an n-bit key, key still has min-entropy n-k.
– Suffices to have extractors for “bit-fixing sources”
• Cryptography with human-memorizable passwords
[Nguyen-Vadhan `04]
– Human passwords are not uniform bit-strings, and have low entropy
(compared to “good crypto keys”).
– Extractors reduce using human passwords to using PINs.
Conclusions
• Randomness extractors address a basic problem that
arises often in cryptography.
• Language and basic results as important as the
actual constructions.
• Interplay between cryptography, theory of
computation, probability & information theory
(also combinatorics, algebra, …)
Beyond this Talk
• Close relations to expander graphs, pseudorandom
generators, error-correcting codes, samplers,…
• Explicit constructions.
Thm [...,NZ93,WZ93,GW94,SZ94,SSZ95,Zuc96,Ta96,Ta98,Tre99,
RRV99, ISW00,RSW00,RVW00,TUZ01,TZS01,SU01,LRVW02]:
For every k ≤ n, ε>0 ∃ an EXPLICIT (k,ε)-extractor w/
Seed d = O(log^2 n + log(1/ε)) &
Output m = .99k
• (Surprising) applications in other parts of CS.
Further Reading
• N. Nisan and A. Ta-Shma. Extracting randomness: a survey and
new constructions. Journal of Computer & System Sciences, 58
(1):148-173, 1999.
• R. Shaltiel. Recent developments in explicit constructions of
extractors. Bulletin of EATCS, 77:67-95, June 2002.
• S. Vadhan. Randomness extractors & their many guises.
Slides from tutorial at FOCS `02.
• S. Vadhan. Course Notes for CS225: Pseudorandomness.
http://eecs.harvard.edu/~salil