Average Case Complexity                                           February 8, 2011

                         Impagliazzo's Hardcore Lemma

Professor: Valentine Kabanets                                Scribe: Hoda Akbari

1  Average-Case Hard Boolean Functions w.r.t. Circuits
In this lecture, we introduce a notion of average-case hardness of Boolean functions with
respect to circuit size. For now, we focus on the uniform distribution over inputs.
Definition 1. A function f : {0, 1}^n → {0, 1} is said to be δ-hard for size s (for parameters δ, s)
if for all circuits C of size |C| ≤ s, we have

    Pr_{x∼U_n}[C(x) ≠ f(x)] > δ.
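For small n, the error probability appearing in this definition can be evaluated by brute force
over all 2^n inputs. The following Python sketch only illustrates the quantity being bounded; the
function f (parity) and the predictor C (first bit) are arbitrary toy choices, and checking δ-hardness
proper would of course require ranging over all circuits of size at most s.

```python
from itertools import product

n = 4  # toy input length

# Toy truth table f : {0,1}^n -> {0,1} (parity) and a toy predictor C (first bit).
f = {x: sum(x) % 2 for x in product((0, 1), repeat=n)}
C = lambda x: x[0]

# Error of C against f under the uniform distribution U_n.
err = sum(C(x) != f[x] for x in f) / 2**n
print(f"Pr_x[C(x) != f(x)] = {err}")
# f is delta-hard for size s only if EVERY circuit of size <= s has error > delta.
```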
In this sense,

    f is not δ-hard  ⇔  ∃ circuit C, |C| ≤ s, such that Pr_{x∼U_n}[C(x) ≠ f(x)] ≤ δ
                     ⇔  f ∈ Heur_δ SIZE(s).

Theorem 1. For all δ < 1/2 and 0 < ε < 1/2 such that δ = 1/2 − ε, there exist δ-hard functions
for circuit size s, where s ≤ (ε^2/n) · (2^n/64).
Proof. The proof is by a probabilistic argument over a randomized construction: we show that a
random function f works. For each circuit C of size s, we have
    Pr_f [ Pr_x [f(x) = C(x)] > 1/2 + ε ] ≤ exp(−ε^2 · 2^n)                                (1)
This is because the expected fraction of agreements is 1/2, and the claim follows by a Chernoff
bound argument.
Remark (Lower bound for circuit size). Suppose we have a truth table of size 2^n which is
compressed into a minimal circuit C of size s. To represent C with bits, we need at most
s(2 + 2 log s) ≤ 3s log s bits. Roughly, to have s log s ≥ 2^n we can choose s ≈ 2^n/n.²
Using the above remark and Equation (1), by the union bound, the probability that there exists
some circuit within the size limit computing f is upper-bounded by

    (number of circuits) · exp(−ε^2 · 2^n) ≤ 2^{3s log s} · exp(−ε^2 · 2^n) ≪ 1.
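As a quick numerical sanity check on this union bound (with illustrative values of n and ε, and
the constant 64 taken from the statement of Theorem 1):

```python
import math

n, eps = 100, 0.01
s = eps**2 * 2**n / (64 * n)                    # circuit size from Theorem 1

log2_num_circuits = 3 * s * math.log2(s)        # log_2(number of circuits) <= 3 s log s
log2_chernoff = -(eps**2) * 2**n / math.log(2)  # log_2 of exp(-eps^2 2^n)

# The union bound is << 1 exactly when this sum of exponents is very negative.
print(log2_num_circuits + log2_chernoff)
```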
Suppose now that we choose a hardness measure δ = 1/2 − ε,¹ and make ε sufficiently small. To
have a δ-hard function here means having a function that is very close to random, and essentially
unpredictable by any circuit within the allowed size, on almost all inputs.

¹ It never happens that δ > 1/2, because either 1 or 0 constitutes at least half of the function's
outputs; the constant circuit that outputs the more frequent bit, regardless of the input, therefore
errs on at most half of the inputs, so error at most 1/2 is always achievable.
² The inequality s ≤ 2^n/n always holds, since according to the Shannon-Lupanov bound [Shannon
and Lupanov '60], any function f : {0, 1}^n → {0, 1} can be computed by some circuit of size
≤ (2^n/n)(1 + o(1)). Thus, for the bound to be interesting, we must add a factor so that s ≪ 2^n/n.
In general, we can construct a δ-hard function f in the following way: choose a subset H of
density 2δ from the universe (|H| ≈ 2δ · 2^n). Let f be such that f restricted to H is random, and
f is identically 0 on the complement of H. With this construction, it can easily be verified that
f is (2δ · (1/2) = δ)-hard: any small circuit errs on about half of H, i.e., on about a δ fraction of
all inputs.
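A small sketch of this construction for a toy n (the set H is picked at random here purely for
illustration): f is random on H and identically 0 outside H, so, for example, the constant-0
predictor errs on about half of H, i.e. on about a 2δ · (1/2) = δ fraction of all inputs.

```python
import random

n, delta = 10, 0.1
universe = list(range(2**n))                               # inputs encoded as integers

H = set(random.sample(universe, int(2 * delta * 2**n)))    # |H| ~ 2*delta*2^n
f = {x: (random.randint(0, 1) if x in H else 0) for x in universe}

# The constant-0 predictor is correct outside H and errs on roughly half of H.
err = sum(f[x] != 0 for x in universe) / 2**n
print(f"error of the constant-0 predictor: {err:.3f}  (expected ~ {delta})")
```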
Intuitively, f is δ-hard because there exists a hard-core set H of inputs of size |H| ≥ 2δ · 2^n
such that f is essentially unpredictable on H by any small circuit. Surprisingly, this intuition is
actually correct! The following lemma formalizes this fact:
Lemma 1 (Hard-Core Lemma [Impagliazzo '95 [1], Holenstein '05 [2]]). Let f : {0, 1}^n → {0, 1}
be δ-hard for size s, where s ≤ (δ^2/n) · (2^n/32); then for any 0 < ε < 1, there exists a set
H ⊆ {0, 1}^n with size |H| ≥ 2δ · 2^n such that H is ε-hardcore for circuits of size s′ = s · poly(ε, δ).
This means that, for vanishing ε, no small circuit can do much better on H than randomly guessing
the output. Thus, ε is a measure of hard-coreness and determines how unpredictable our function
is. The concrete value of s′ is s′ = s · ε^2/(32n), and the lemma says that for any circuit C of size
|C| ≤ s′,

    Pr_{x∼H}[C(x) = f(x)] ≤ 1/2 + ε/2,

where x is drawn uniformly from H.
Proof. Let us first argue the easier result that there exists an ε-hardcore set of size ≥ δ · 2^n.
The proof is by contradiction. Suppose there is no such ε-hardcore set; that is,

    ∀ S ⊆ {0, 1}^n with |S| ≥ δ · 2^n, ∃ circuit C, |C| ≤ s′, s.t. Pr_{x∼S}[C(x) = f(x)] > 1/2 + ε/2.

Consider a two-player zero-sum game with players S (set player) and C (circuit player), where
the payoff matrix can be arranged as a table whose row headers are the choices of sets of size at
least δ · 2^n and whose column headers are all possible circuits of size at most s′. We define the
payoff as

    Payoff_{S,C} = Pr_{x∼S}[C(x) = f(x)]        (the amount S gives to C).
Note that 0 ≤ Payoff_{S,C} ≤ 1. Let v be the value of the game:

    v = min_S max_C E_{S∼S, C∼C}[Payoff_{S,C}] = max_C min_S E_{S∼S, C∼C}[Payoff_{S,C}]

That is, we consider mixed strategies for the two players (a mixed strategy is a probability distri-
bution over pure strategies: rows for S, columns for C). Then we look at the min-max or max-min
of the expected payoff when the players use their respective mixed strategies (with the set player
trying to minimize, and the circuit player trying to maximize, the payoff). To continue the proof,
we distinguish between the following two cases:
    (1) v < 1/2 + ε/2            (2) v ≥ 1/2 + ε/2
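As an aside, once a payoff matrix is written down explicitly, the value v of such a zero-sum game
can be computed by linear programming. The sketch below does this for a made-up 3×3 matrix
standing in for the (exponentially large) sets-versus-circuits matrix, with the row player minimizing,
as the set player does here; the scipy-based LP formulation is the standard one for game values.

```python
import numpy as np
from scipy.optimize import linprog

# Made-up payoff matrix: rows = pure strategies of the set player S (who pays),
# columns = pure strategies of the circuit player C (who receives the payoff).
A = np.array([[0.9, 0.4, 0.6],
              [0.5, 0.8, 0.3],
              [0.4, 0.6, 0.7]])
m, k = A.shape

# Variables: a row mixed strategy p (m entries) and the game value v (last entry).
# Minimize v subject to (p^T A)_j <= v for every column j, sum(p) = 1, p >= 0.
c = np.concatenate([np.zeros(m), [1.0]])
A_ub = np.hstack([A.T, -np.ones((k, 1))])
b_ub = np.zeros(k)
A_eq = np.concatenate([np.ones(m), [0.0]]).reshape(1, -1)
b_eq = [1.0]
bounds = [(0, None)] * m + [(None, None)]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
p, v = res.x[:m], res.x[-1]
print("optimal mixed strategy for the set player:", np.round(p, 3))
print("game value v =", round(v, 3))
```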
Case (1). By the min-max theorem, there exists a mixed strategy S for the set player such that,
for all circuits C within the size limit,

    E_{S∼S}[Payoff_{S,C}] < 1/2 + ε/2,

or equivalently,

    E_{S∼S}[ Pr_{x∼S}[C(x) = f(x)] ] < 1/2 + ε/2.                                        (2)
By averaging, we get that

    ∃ S, |S| ≥ δ · 2^n, s.t. Pr_{x∼S}[C(x) = f(x)] < 1/2 + ε/2.
This is for a fixed circuit C, though. We want to argue the existence of such a set S that has
this property for all circuits C. To do so, we need to use some concentration bounds. Define a
distribution D in the following way:
– Sample S from S;
– Sample x from S uniformly at random.
With this definition, Eq. (2) yields

    Pr_{x∼D}[C(x) = f(x)] < 1/2 + ε/2.
The rest of the proof for case (1) consists of a concentration-type argument about the quality of
a random sample, and is omitted here to avoid technicalities. In the end, one gets

    ∃ S, |S| ≥ δ · 2^n such that S is ε-hardcore,

contradicting the initial assumption.
Case (2). Similarly to case (1), there exists a mixed strategy C for the circuit player such that

    ∀ sets S, |S| ≥ δ · 2^n :  E_{C∼C}[Payoff_{S,C}] ≥ 1/2 + ε/2.

That is, for every set S of density δ, on average over x ∈ S, the majority of circuits (weighted
according to C) is correct on x. Let us define the set H of those inputs x on which less than a
1/2 + ε/2 fraction of the circuits (weighted under C) are correct. That is,

    H = { x : Pr_{C∼C}[C(x) = f(x)] < 1/2 + ε/2 }.
Note that, by the definition of H,

    E_{C∼C}[Payoff_{H,C}] = Pr_{x∼H, C∼C}[C(x) = f(x)] < 1/2 + ε/2.
By the definition of H and the case (2) property of C, it must hold that |H| < δ · 2^n. We will
give up on the inputs from H, which constitute less than a δ fraction of all inputs. Fix an x ∉ H.
For this x, at least a 1/2 + ε/2 fraction of the circuits C ∼ C are correct. Hence, if we sample
O(n/ε^2) circuits C from C, their majority is correct on x with probability 1 − exp(−n). By the
union bound,

    Pr[∃ x ∉ H s.t. the majority is wrong] ≪ 1.

As a result, there must exist a good choice of sampled circuits whose majority is correct on all
x ∉ H. This majority circuit has size roughly O(n/ε^2) · s′ ≤ s (this is where the choice
s′ = s · ε^2/(32n) is used) and agrees with f on more than a 1 − δ fraction of all inputs,
contradicting the δ-hardness of f.
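The amplification step for a single fixed x ∉ H can also be checked numerically: if each circuit
sampled from C is correct on x with probability at least 1/2 + ε/2, independently across samples,
then the majority of O(n/ε^2) samples fails only with probability exponentially small in n. A
Monte Carlo sketch with illustrative parameters:

```python
import math
import random

n, eps = 40, 0.2
t = math.ceil(8 * n / eps**2)          # number of circuits sampled from C

def majority_wrong_once():
    # Each sampled circuit is correct on the fixed x with probability 1/2 + eps/2.
    correct = sum(random.random() < 0.5 + eps / 2 for _ in range(t))
    return correct <= t / 2            # the majority vote is wrong

trials = 200
failures = sum(majority_wrong_once() for _ in range(trials))
print(f"t = {t} sampled circuits, empirical failure rate = {failures / trials}")
# A Chernoff bound gives failure probability <= exp(-eps^2 * t / 8) = exp(-n) here.
```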
Proof for the tight bound of 2δ · 2^n. Our approach is to use the same game with the same payoff
function as in the proof of the loose bound, but now we label the rows with sets S of density 2δ
(rather than δ). Again, case (1) v < 1/2 + ε/2 cannot happen, as this would imply the existence
of a hardcore set of density 2δ, which we assumed does not exist. We want to compute f on at
least a 1 − δ fraction of all inputs. As before, we can get a majority circuit that is correct on at
least a 1 − 2δ fraction of all inputs.
We want to recover half of the inputs in H; i.e., we want our algorithm to also be correct on
about half of the inputs in H (in addition to being correct outside of H). This would be easy if
we knew what H is, via the following randomized circuit:

    "On input x:
       if x ∉ H, output the majority vote of the sampled circuits on x;
       else output a random bit (or the more frequent bit)."

But this is not possible, since deciding membership in H requires knowing f(x), while f is hard.
We will instead define a set H in a special way (different from the one we used for the loose bound
above). Our approach is to sample a few circuits from C and estimate the expectation by the
sample average, appealing to the law of large numbers:

    E_{C∼C}[Payoff_{S,C}] ≈ Avg of Payoff_{S,C} over a few circuits C sampled from C.
Definition 2. Define α_corr(x) and α_1(x) as:

    α_corr(x) = 2 · Pr_{C in samples}[C(x) = f(x)] − 1
    α_1(x)    = 2 · Pr_{C in samples}[C(x) = 1] − 1

Note that α_1(x) is computable, since our collection of sampled circuits is only of polynomial size.
We have:

    Pr_{C in samples}[C(x) = f(x)] = 1/2 + (1/2) · α_corr(x)
    Pr_{C in samples}[C(x) = 1]    = 1/2 + (1/2) · α_1(x)
Observe that for all x, −1 ≤ α_corr(x) ≤ 1. We also define H to be the set, of size 2δ · 2^n, of all
those x's with the smallest values of α_corr(x). Let φ = max_{x∈H} α_corr(x). We have

    α_corr(x_1) ≤ α_corr(x_2) ≤ · · · ≤ α_corr(x_{2δ·2^n}) = φ,

where the x_i's are the members of H, sorted by α_corr.

Claim 2. φ > 0. (Because φ ≥ E_{x∈H}[α_corr(x)] > 2 · (1/2) − 1 = 0, where the middle inequality
holds since, on average over x ∈ H, the sampled circuits are correct with probability greater
than 1/2.)
For x ∉ H, α_corr(x) ≥ φ, so at least a 1/2 + (1/2) · α_corr(x) ≥ 1/2 + φ/2 fraction of the circuits
in the sample collection are correct. Define a randomized circuit C′(x) as follows:

    C′(x) = 0                                        if α_1(x) ≤ −φ,
    C′(x) = 1 with probability 1/2 + α_1(x)/(2φ)     if −φ < α_1(x) < φ,
    C′(x) = 1                                        if α_1(x) ≥ φ.
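A sketch of C′ as a randomized procedure, assuming the sampled circuits are available as a list of
Python callables and the threshold φ has already been fixed; note that only α_1(x) (not α_corr(x),
which refers to f) is evaluated at run time, which is why C′ is computable by a small circuit:

```python
import random

def alpha1(x, sampled_circuits):
    """alpha_1(x) = 2 * Pr_{C in samples}[C(x) = 1] - 1."""
    return 2 * sum(C(x) for C in sampled_circuits) / len(sampled_circuits) - 1

def C_prime(x, sampled_circuits, phi):
    """The randomized circuit C'(x) from the proof (phi > 0 by Claim 2)."""
    a1 = alpha1(x, sampled_circuits)
    if a1 <= -phi:
        return 0
    if a1 >= phi:
        return 1
    # Otherwise, output 1 with probability 1/2 + alpha_1(x) / (2 * phi).
    return 1 if random.random() < 0.5 + a1 / (2 * phi) else 0

# Toy usage with hypothetical predictors on 3-bit inputs given as tuples of 0/1.
circuits = [lambda x: x[0], lambda x: x[1], lambda x: x[0] ^ x[2], lambda x: 1 - x[2]]
print(C_prime((1, 0, 1), circuits, phi=0.25))
```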
Note that for every x ∉ H, C′(x) = f(x) with probability 1. Also, it can be proved that

    Pr[C′(x) = f(x)] = 1/2 + α_corr(x)/(2φ),   truncated to lie in [0, 1].
Thus, for each x ∈ H (where the truncation at 0 can only help, and the truncation at 1 never
occurs since α_corr(x) ≤ φ),

    Pr[C′(x) = f(x)] ≥ 1/2 + α_corr(x)/(2φ),

and therefore

    E_{x∼H}[ Pr[C′(x) = f(x)] ] ≥ 1/2 + E_{x∼H}[α_corr(x)]/(2φ) ≥ 1/2.

So C′ is correct on the entire 1 − 2δ fraction of inputs outside H and, on average, on at least half
of the 2δ fraction of inputs inside H, i.e., on at least a (1 − 2δ) + δ = 1 − δ fraction of all inputs.
By fixing the randomness of C′ so as to at least preserve this expectation, we get a deterministic
circuit that is correct on at least a 1 − δ fraction of inputs, contradicting the assumption that f
is δ-hard.
In the next lecture, we’ll see how to use these results for hardness amplification.
References
[1] R. Impagliazzo. "Hard-core distributions for somewhat hard problems". In Proc. 36th IEEE
Symposium on Foundations of Computer Science, pp. 538–545, 1995.
[2] T. Holenstein. "Key agreement from weak bit agreement". In Proc. 37th ACM Symposium on
Theory of Computing, pp. 664–673, 2005.