Proving Lower Bounds on Circuits Using Switching Lemma

M.Arunothia, 13378
Sai Kishan Pampana, 13458
Submitted to Prof. Raghunath Tewari for partial fulfilment of the course
requirements for CS396A, IITK
Abstract
The Switching Lemma is a tool from circuit complexity which shows that, under a random restriction, a CNF with bounded clause size can with high probability be rewritten as a DNF with bounded term size. The lemma has many applications. In the paper "Almost Optimal Lower Bounds for Small Depth Circuits", Hastad gives an inductive proof of the switching lemma and applies it to lower bound computations in circuit complexity. Benjamin Rossman, in his paper "The Average Sensitivity of Bounded-Depth Formulas", uses the switching lemma to give a tight lower bound on the size of constant-depth formulas computing the parity function. In the same paper he gives an upper bound on the average sensitivity of Boolean formulas of a given depth and size.
Department of Computer Science and Engineering
Indian Institute of Technology Kanpur
Contents
1 Basics
1.1 Random Restrictions
1.2 Min-Term
1.3 Formulas
2 Switching Lemma
3 Proof of Switching Lemma
4 Application of the Switching Lemma
4.1 Approach
4.2 Assignment and Timestamp
4.3 Main Definition
4.4 Switching Lemma
5 Main Theorem
6 Proof Idea
7 Bound on size when computing PARITY
8 Conclusion
9 Further Extensions of this Project
1 Basics
1.1 Random Restrictions
Definition 1. A restriction ρ maps the input variables of a Boolean Formula
to the set {0, 1, ∗}.
• ρ(x) = 0/1 means the variable x is mapped to 0/1 respectively.
• ρ(x) = ∗ means the variable remains as itself, no value assigned as yet.
Notation 1. F⌈ρ denotes the function obtained from the Boolean function F by making the substitutions prescribed by ρ.
Definition 2. Random restrictions are parametrized by a real number p ∈ [0, 1]. A restriction ρ is drawn from R_p if, independently for each variable x,
• ρ(x) = 0 with probability (1 − p)/2
• ρ(x) = 1 with probability (1 − p)/2
• ρ(x) = ∗ with probability p
Consequently, if F originally had n variables, F⌈ρ has pn unassigned variables in expectation.
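As an illustration of Definition 2, the following minimal sketch samples a restriction from R_p and applies it to a CNF. The function names and the encoding of a CNF as a list of clauses of (variable, negated) pairs are assumptions made for this example only; the simplification is purely syntactic.

```python
import random


def random_restriction(n, p, rng=random):
    """Sample rho from R_p on variables 0..n-1: '*' with probability p, else a uniform bit."""
    rho = []
    for _ in range(n):
        if rng.random() < p:
            rho.append("*")
        else:
            rho.append(rng.randint(0, 1))
    return rho


def restrict_cnf(cnf, rho):
    """Apply rho to a CNF (list of clauses, each a list of (variable, negated) literals).

    Returns 0 if some clause is falsified, 1 if every clause is satisfied,
    and otherwise the list of simplified clauses over the '*' variables.
    """
    remaining = []
    for clause in cnf:
        kept, satisfied = [], False
        for var, negated in clause:
            value = rho[var]
            if value == "*":
                kept.append((var, negated))
            elif value != int(negated):  # the literal evaluates to 1
                satisfied = True
                break
        if satisfied:
            continue
        if not kept:
            return 0  # every literal of this clause was set to 0
        remaining.append(kept)
    return remaining if remaining else 1


if __name__ == "__main__":
    rho = random_restriction(20, 0.25, random.Random(1))
    print(rho.count("*"), "of 20 variables left unassigned")
```

With p = 0.25 and n = 20 one would expect about 5 unassigned variables on average, in line with the remark after Definition 2.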
1.2 Min-Term
Definition 3. A minterm is a minimal way of forcing a Boolean function to
become 1. A minterm σ of F is a partial assignment such that
• σ forces F to become 1.
• No proper sub-assignment of σ forces F to become 1.
Example: If F = x1 ∧ (x2 ∨ x3), then x1 = 1, x2 = 1, x3 = ∗ is a minterm of F.
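The definition of a minterm can be checked by brute force on small functions. The sketch below is only an illustration: the helper names are hypothetical, variables are 0-indexed (so x1, x2, x3 become x[0], x[1], x[2]), and a partial assignment is a dictionary from variable index to bit.

```python
from itertools import product


def forces_one(f, n, sigma):
    """True if every completion of the partial assignment sigma makes f equal to 1."""
    free = [i for i in range(n) if i not in sigma]
    for bits in product((0, 1), repeat=len(free)):
        x = [0] * n
        for i, b in sigma.items():
            x[i] = b
        for i, b in zip(free, bits):
            x[i] = b
        if not f(x):
            return False
    return True


def is_minterm(f, n, sigma):
    """sigma forces f to 1 and no proper sub-assignment does.

    Forcing is monotone under extension, so it suffices to check
    the sub-assignments obtained by dropping a single variable.
    """
    if not forces_one(f, n, sigma):
        return False
    return not any(
        forces_one(f, n, {k: v for k, v in sigma.items() if k != drop})
        for drop in sigma
    )


def minterms(f, n):
    """Enumerate all minterms of f over n variables (exponential time)."""
    found = []
    for pattern in product((0, 1, None), repeat=n):
        sigma = {i: b for i, b in enumerate(pattern) if b is not None}
        if is_minterm(f, n, sigma):
            found.append(sigma)
    return found


if __name__ == "__main__":
    F = lambda x: x[0] and (x[1] or x[2])
    print(minterms(F, 3))
```

For the example function the enumeration returns the two assignments x1 = x2 = 1 and x1 = x3 = 1, each leaving the third variable unassigned.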
1.3 Formulas
A formula is a circuit with a finite tree-like structure in which the leaf nodes are literals (i.e. variables x_i or negated variables) and the non-leaf nodes are gates. The gates that we will be using are AND and OR.
2 Switching Lemma
The following Switching Lemma and Remark are as stated in paper [1].
Lemma 1. Let G be an AND of ORs all of size ≤ t and let ρ be a random restriction from R_p. Then the probability that G⌈ρ cannot be written as an OR of ANDs all of size < s is bounded by α^s, where α is the unique positive root of the equation
\[ \left(1 + \frac{4p}{(1+p)\alpha}\right)^{t} = \left(1 + \frac{2p}{(1+p)\alpha}\right)^{t} + 1. \]
Remark 1. α = 2pt/ln φ + O(p^2 t) < 5pt for sufficiently small p, where φ is the golden ratio.
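The root α of the equation in Lemma 1 has no closed form in general, but it is easy to approximate numerically. The following sketch uses bisection after the substitution u = 2p/((1+p)α); both the substitution and the function name are choices made for this example, and the code is intended for moderate values of t only.

```python
import math


def switching_alpha(p, t, iterations=200):
    """Approximate the positive root alpha of
    (1 + 4p/((1+p)alpha))^t = (1 + 2p/((1+p)alpha))^t + 1.
    """
    # With u = 2p / ((1+p) * alpha) the equation reads g(u) = 0, where
    # g(u) = (1 + 2u)^t - (1 + u)^t - 1 is increasing for u > 0.
    def g(u):
        return (1 + 2 * u) ** t - (1 + u) ** t - 1

    lo, hi = 0.0, 1.0  # g(0) = -1 < 0 and g(1) = 3^t - 2^t - 1 >= 0 for t >= 1
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    u = (lo + hi) / 2
    return 2 * p / ((1 + p) * u)


if __name__ == "__main__":
    golden = (1 + math.sqrt(5)) / 2
    for p, t in [(0.001, 3), (0.001, 10), (0.01, 5)]:
        alpha = switching_alpha(p, t)
        # Columns: p, t, numerical root, leading term of Remark 1, the bound 5pt
        print(p, t, alpha, 2 * p * t / math.log(golden), 5 * p * t)
```

Printing the three quantities side by side gives a quick sanity check of Remark 1 for small p.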
3 Proof of Switching Lemma
In this section, we discuss the inductive proof of the switching lemma given in Hastad's paper [1]. To prove Lemma 1, we first strengthen it so that it supports induction.
Lemma 2. Let G = ∧_{i=1}^{w} G_i, where the G_i are ORs of fan-in ≤ t. Let F be an arbitrary function and let ρ be a random restriction in R_p. Then
\[ \Pr[\min(G) \ge s \mid F\lceil_{\rho} \equiv 1] \le \alpha^{s}, \]
where min(G) ≥ s denotes the event that G⌈ρ has a min-term of size at least s.
Notice that the stronger Lemma 2 reduces to the originally stated Lemma 1 when F ≡ 1: if G⌈ρ has no min-term of size at least s, then it is the OR of its min-terms, each of which is an AND of size < s. The intuition behind this extension can be seen in the inductive proof.
Proof. The induction is carried out on w, the number of ORs in G.
Base Case w = 0. Here G is the empty AND, so G ≡ 1 and the lemma holds trivially.
Induction Hypothesis. The lemma holds for all i < w.
Inductive Step. We have G = ∧_{i=1}^{w} G_i. We split the proof into two cases.
1. G_1⌈ρ ≡ 1
In this case G⌈ρ = ∧_{i=2}^{w} G_i⌈ρ, which is an AND of only w − 1 ORs. Conditioning on F⌈ρ ≡ 1 and G_1⌈ρ ≡ 1 is the same as conditioning on (F ∧ G_1)⌈ρ ≡ 1, so the induction hypothesis, applied with F ∧ G_1 in place of F, shows that the lemma holds in this case.
2. G_1⌈ρ ≢ 1
Now we are left with proving
\[ \Pr[\min(G) \ge s \mid (F\lceil_{\rho} \equiv 1) \wedge (G_1\lceil_{\rho} \not\equiv 1)] \le \alpha^{s}. \]
• We shall assume that G_1 is an OR of only positive literals, i.e. G_1 = ∨_{i∈T} x_i where |T| ≤ t.
• Since in case 2 G_1 is not made true by the restriction, G_1 has to be made true by every min-term of G⌈ρ.
• Let Y ⊆ T be the set of variables in T to which a given min-term of G⌈ρ assigns values from {0, 1}. Y is non-empty because the min-term must make G_1 true.
• This implies ρ(Y) = ∗, i.e. ρ leaves every variable in Y unassigned.
• Let min(G)^Y ≥ s denote the event that G⌈ρ has a min-term of size at least s whose restriction to the variables in T assigns values to precisely those variables in Y.
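Write ρ = ρ_1 ρ_2, where ρ_1 is the part of ρ acting on the variables in T and ρ_2 is the part acting on the remaining variables, so that G_1⌈ρ ≢ 1 is the same condition as G_1⌈ρ_1 ≢ 1. Hence, summing over the possible sets Y, we have
\[
\begin{aligned}
\Pr[\min(G) \ge s \mid (F\lceil_{\rho} \equiv 1) \wedge (G_1\lceil_{\rho} \not\equiv 1)]
&\le \sum_{Y \subseteq T,\, Y \ne \emptyset} \Pr[\min(G)^{Y} \ge s \mid (F\lceil_{\rho} \equiv 1) \wedge (G_1\lceil_{\rho_1} \not\equiv 1)] \\
&= \sum_{Y \subseteq T,\, Y \ne \emptyset} \Pr[(\min(G)^{Y} \ge s) \wedge (\rho_1(Y) = *) \mid (F\lceil_{\rho} \equiv 1) \wedge (G_1\lceil_{\rho_1} \not\equiv 1)] \\
&= \sum_{Y \subseteq T,\, Y \ne \emptyset} \Pr[\rho_1(Y) = * \mid (F\lceil_{\rho} \equiv 1) \wedge (G_1\lceil_{\rho_1} \not\equiv 1)] \\
&\qquad\qquad \cdot \Pr[\min(G)^{Y} \ge s \mid (\rho_1(Y) = *) \wedge (F\lceil_{\rho} \equiv 1) \wedge (G_1\lceil_{\rho_1} \not\equiv 1)],
\end{aligned}
\]
and it remains to show that the last sum is at most α^s. The two factors are bounded separately in the following claims.
Claim 1.
\[ \Pr[\rho_1(Y) = * \mid G_1\lceil_{\rho_1} \not\equiv 1] = \left(\frac{2p}{1+p}\right)^{|Y|} \]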
Proof. Given G_1⌈ρ_1 ≢ 1, no variable in T can be assigned the value 1, so each variable in T is mapped to one of {0, ∗}. For a single variable x ∈ Y this gives
\[ \Pr[\rho_1(x) = * \mid \rho_1(x) \in \{0, *\}] = \frac{p}{p + \frac{1-p}{2}} = \frac{p}{\frac{1+p}{2}} = \frac{2p}{1+p}. \]
This probability is independent for every variable in Y, and hence the result of Claim 1 follows.
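The conditional probability in Claim 1 can be confirmed exactly on small instances. The sketch below enumerates all restrictions of t variables with exact rational arithmetic; the function name and the convention that Y consists of the first y variables are assumptions of this example.

```python
from fractions import Fraction
from itertools import product


def claim1_probability(p, t, y):
    """Pr[the first y of t variables are all '*' | no variable is set to 1]."""
    weight = {0: (1 - p) / 2, 1: (1 - p) / 2, "*": p}
    numerator = Fraction(0)
    denominator = Fraction(0)
    for rho1 in product((0, 1, "*"), repeat=t):
        if 1 in rho1:
            continue  # conditioning: G_1 is not made true by rho1
        w = Fraction(1)
        for value in rho1:
            w *= weight[value]
        denominator += w
        if all(value == "*" for value in rho1[:y]):
            numerator += w
    return numerator / denominator


if __name__ == "__main__":
    p = Fraction(1, 10)
    print(claim1_probability(p, 4, 2), (2 * p / (1 + p)) ** 2)
```

Both printed values should agree, since the enumeration computes the left-hand side of Claim 1 and the second expression is its right-hand side.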
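Claim 2.
\[ \Pr[\rho_1(Y) = * \mid (G_1\lceil_{\rho_1} \not\equiv 1) \wedge (F\lceil_{\rho} \equiv 1)] \le \left(\frac{2p}{1+p}\right)^{|Y|} \]
Proof. We use the fact that Pr[A | B ∧ C] ≤ Pr[A | C] is equivalent to Pr[B | A ∧ C] ≤ Pr[B | C]. Taking A to be the event ρ_1(Y) = ∗, B the event F⌈ρ ≡ 1 and C the event G_1⌈ρ_1 ≢ 1, it suffices to show that Pr[F⌈ρ ≡ 1 | (ρ_1(Y) = ∗) ∧ (G_1⌈ρ_1 ≢ 1)] ≤ Pr[F⌈ρ ≡ 1 | G_1⌈ρ_1 ≢ 1]. This holds by the intuition that requiring some variables to be ∗ cannot increase the probability that a function is determined. Together with Claim 1, this gives Claim 2.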
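Claim 3.
\[
\Pr[\min(G)^{Y} \ge s \mid (\rho_1(Y) = *) \wedge (F\lceil_{\rho} \equiv 1) \wedge (G_1\lceil_{\rho_1} \not\equiv 1)]
\le \sum_{\substack{\sigma_1 \in \{0,1\}^{|Y|} \\ \sigma_1 \ne 0^{|Y|}}}
\; \max_{\substack{\rho_1(Y) = * \\ \rho_1(T) \in \{0,*\}^{|T|}}}
\Pr_{\rho_2}[\min(G)^{Y,\sigma_1} \ge s \mid (F\lceil_{\rho_1\sigma_1})\lceil_{\rho_2} \equiv 1]
\]
Here min(G)^{Y,σ_1} ≥ s denotes the event that G⌈ρ has a min-term of size at least s which assigns the values σ_1 to the variables in Y and assigns no other variable in T.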
Proof. To arrive at the above reduction, we view a min-term of G⌈ρ as consisting of two parts:
• σ_1, which assigns values to the variables of Y;
• σ_2, which assigns values to some variables in T̄, the complement of T.
Notice that σ_2 is a min-term of the function (G⌈ρ_1σ_1)⌈ρ_2, and this is the key to obtaining an inductive argument. We get rid of the condition G_1⌈ρ_1 ≢ 1 by maximizing over all ρ_1 that satisfy this condition. Hence the second factor in case 2 reduces to the bound given in Claim 3.
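Claim 4.
\[
\sum_{\substack{\sigma_1 \in \{0,1\}^{|Y|} \\ \sigma_1 \ne 0^{|Y|}}}
\; \max_{\substack{\rho_1(Y) = * \\ \rho_1(T) \in \{0,*\}^{|T|}}}
\Pr_{\rho_2}[\min(G)^{Y,\sigma_1} \ge s \mid (F\lceil_{\rho_1\sigma_1})\lceil_{\rho_2} \equiv 1]
\le \alpha^{s-|Y|} \left(2^{|Y|} - 1\right)
\]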
Proof. We know that min(G)^{Y,σ_1} ≥ s implies that (G⌈ρ_1σ_1)⌈ρ_2 has a min-term of size at least s − |Y| on the variables in T̄. Since G_1 is satisfied by σ_1 and is no longer present, the induction hypothesis bounds this probability by α^{s−|Y|}. To substitute for the ∗s of ρ_1, we take the AND of the two formulas obtained by substituting 0 and 1. There are 2^{|Y|} − 1 possible σ_1, because σ_1 must make G_1⌈ρ_1 true and hence cannot be all 0.
Finally, using the results derived so far, we settle case 2 as follows:
\[ \sum_{Y \subseteq T,\, Y \ne \emptyset} \left(\frac{2p}{1+p}\right)^{|Y|} \alpha^{s-|Y|} \left(2^{|Y|} - 1\right) \le \alpha^{s}. \]
This follows from the definition of α in the lemma itself (it is the root of the specified equation). This completes the proof, as we have covered both cases.
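The last inequality can be verified with the binomial theorem. The following worked derivation is a sketch; the shorthand u = 2p/((1+p)α) is introduced here for convenience and is not notation from [1].

```latex
\begin{align*}
\sum_{\substack{Y \subseteq T \\ Y \neq \emptyset}}
  \left(\frac{2p}{1+p}\right)^{|Y|} \alpha^{s-|Y|} \left(2^{|Y|}-1\right)
&= \alpha^{s} \sum_{k=1}^{|T|} \binom{|T|}{k} u^{k} \left(2^{k}-1\right),
   \qquad u = \frac{2p}{(1+p)\alpha} \\
&= \alpha^{s} \left[ (1+2u)^{|T|} - (1+u)^{|T|} \right] \\
&\le \alpha^{s} \left[ (1+2u)^{t} - (1+u)^{t} \right] \\
&= \alpha^{s} \left[ \left(1+\frac{4p}{(1+p)\alpha}\right)^{t}
      - \left(1+\frac{2p}{(1+p)\alpha}\right)^{t} \right]
 = \alpha^{s}.
\end{align*}
```

The second line adds the k = 0 term, which contributes 1 − 1 = 0, to both binomial sums. The inequality uses |T| ≤ t together with the fact that (1+2u)^m − (1+u)^m is increasing in m, and the final equality is exactly the defining equation of α.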
4 Application of the Switching Lemma
Now that we have stated and proved the Switching Lemma, let us see an application of it. We will prove lower bounds on a class of Boolean circuits called formulas. Specifically, we will use the switching lemma to prove a lower bound on the size of formulas computing PARITY.
4.1 Approach
To show the lower bound for formulas, we apply a family of restrictions to the sub-formulas. Our main tools are this family of restrictions and the switching lemma, so the main remaining task is to find the required family of restrictions.
4.2 Assignment and Timestamp
Before we choose the kind of restrictions that we will be using, let us familiarize ourselves with some definitions. Our first definition is that of the timestamp τ and the assignment σ.
Definition: Let σ ∈ {0, 1}^n (assignment) and τ ∈ [0, 1]^n (timestamp) be independent, uniformly distributed random variables. For 0 ≤ p ≤ 1 we define the restriction
\[ R_p^{\sigma,\tau} : [n] \to \{0, 1, *\}, \qquad R_p^{\sigma,\tau}(i) = \begin{cases} * & \text{if } \tau_i \le p, \\ \sigma_i & \text{otherwise.} \end{cases} \]
For fixed p, R_p^{σ,τ} is distributed as a random restriction from R_p.
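The point of the timestamp is that a single pair (σ, τ) defines restrictions for all values of p at once, and these restrictions are nested. The following sketch illustrates this; the function names are hypothetical and chosen for this example.

```python
import random


def sample_sigma_tau(n, rng=random):
    """Draw a uniform assignment sigma in {0,1}^n and timestamp tau in [0,1]^n."""
    sigma = [rng.randint(0, 1) for _ in range(n)]
    tau = [rng.random() for _ in range(n)]
    return sigma, tau


def restriction(sigma, tau, p):
    """R_p^{sigma,tau}: variable i is '*' if tau[i] <= p, otherwise it is set to sigma[i]."""
    return ["*" if t <= p else s for s, t in zip(sigma, tau)]


if __name__ == "__main__":
    sigma, tau = sample_sigma_tau(10, random.Random(3))
    for p in (0.8, 0.4, 0.1):
        # Lowering p only turns '*' entries into the fixed bits of sigma.
        print(p, restriction(sigma, tau, p))
```

Lowering p can only replace ∗ entries by the corresponding bits of σ, so R_q^{σ,τ} refines R_p^{σ,τ} whenever q ≤ p.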
4.3 Main Definition
Here we state the main definition, that of the stopping time. Once we have decided on the family of restrictions, we can easily show the lower bounds.
Definition (Main Definition): For all formulas φ, we define the stopping time q^{σ,τ}(φ) ∈ [0, 1] inductively:
• If φ has depth d = 0, then q^{σ,τ}(φ) = 1.
• If φ is an AND or OR of Boolean formulas ψ_1, ψ_2, ..., ψ_m, then
\[ q^{\sigma,\tau}(\varphi) = \frac{p^{\sigma,\tau}(\varphi)}{14\, k^{\sigma,\tau}(\varphi)}, \]
where
\[ p^{\sigma,\tau}(\varphi) = \min_i q^{\sigma,\tau}(\psi_i), \qquad k^{\sigma,\tau}(\varphi) = \max\Bigl(1, \max_i D\bigl(\psi_i\lceil_{R^{\sigma,\tau}_{p^{\sigma,\tau}(\varphi)}}\bigr)\Bigr). \]
Here D(f) denotes the decision-tree depth of f.
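The inductive definition of the stopping time can be evaluated directly on very small formulas. The sketch below is only an illustration under stated assumptions: formulas are encoded as nested tuples (a hypothetical encoding), decision-tree depth is computed by exponential brute force, and a depth 0 formula is given stopping time 1, which is the value consistent with q(ψ) = 1/14 for depth 1 subformulas in the proof idea of Section 6.

```python
from fractions import Fraction
from itertools import product

# Hypothetical encoding: ("lit", i, negated), ("and", [children]), ("or", [children])


def evaluate(phi, x):
    """Evaluate formula phi on a full 0/1 assignment x."""
    if phi[0] == "lit":
        _, i, negated = phi
        return 1 - x[i] if negated else x[i]
    values = [evaluate(psi, x) for psi in phi[1]]
    return int(all(values)) if phi[0] == "and" else int(any(values))


def restriction(sigma, tau, p):
    """R_p^{sigma,tau} as a tuple over {0, 1, '*'}."""
    return tuple("*" if t <= p else s for s, t in zip(sigma, tau))


def dt_depth(phi, rho):
    """Decision-tree depth of phi restricted by rho, by brute force."""
    free = [i for i, v in enumerate(rho) if v == "*"]
    seen = set()
    for bits in product((0, 1), repeat=len(free)):
        x = list(rho)
        for i, b in zip(free, bits):
            x[i] = b
        seen.add(evaluate(phi, x))
    if len(seen) == 1:
        return 0  # the restricted function is constant
    # Query the best variable; the adversary picks the worse answer.
    return min(
        1 + max(dt_depth(phi, rho[:i] + (b,) + rho[i + 1:]) for b in (0, 1))
        for i in free
    )


def stopping_time(phi, sigma, tau):
    """q^{sigma,tau}(phi), with q = 1 at depth 0 (assumption stated above)."""
    if phi[0] == "lit":
        return Fraction(1)
    p = min(stopping_time(psi, sigma, tau) for psi in phi[1])
    rho = restriction(sigma, tau, p)
    k = max(1, max(dt_depth(psi, rho) for psi in phi[1]))
    return p / (14 * k)
```

For example, for φ = (x0 ∨ x1) ∧ (x1 ∨ x2) with σ = (0, 0, 0) and τ = (0.01, 0.02, 0.5), both subformulas have stopping time 1/14, the restriction at p = 1/14 leaves x0 and x1 free, the first subformula then has decision-tree depth 2, and so the stopping time of φ is 1/(14 · 14 · 2) = 1/392.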
4.4 Switching Lemma
Here we state the switching lemma again, in the form that we will use in our proofs.
Switching Lemma: Let k, l ∈ N and let f be an AND or OR of a family of Boolean functions f_i such that D(f_i) ≤ k for all i. Then for 0 ≤ p ≤ 1/2,
\[ \Pr_{\rho \in R_p}[D(f\lceil_{\rho}) \ge l] \le (5pk)^{l}. \]
Using the above two definitions, we state a lemma that will help us prove the results.
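Lemma: Let φ be a formula of depth ≥ 1 and let q be such that q = q^{σ,τ}(φ) for some σ, τ. Then for all 0 ≤ α ≤ 1 and l ∈ N,
\[ \Pr_{\sigma,\tau}\bigl[D(\varphi\lceil_{R^{\sigma,\tau}_{\alpha q}}) \ge l \;\big|\; q^{\sigma,\tau}(\varphi) = q\bigr] \le \left(\frac{\alpha}{e}\right)^{l}. \]
5 Main Theorem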
We have now gathered all the tools required to show the lower bound. Here we state the main theorem of this paper.
As mentioned previously, we need a family of restrictions, and for that purpose we introduced the stopping time. To show a lower bound, we also need to bound the value of q, which the next theorem does.
Theorem: For every depth d + 1 formula φ and 0 < λ ≤ 1,
\[ \Pr[q(\varphi) \le \lambda] \le \frac{|\varphi|}{\exp\bigl(\Omega(d\lambda^{-1/d}) - O(d)\bigr)}. \]
We present a proof idea for this in the next section.
6 Proof Idea
We will prove the above theorem by induction. To do this, we first rewrite the theorem in a form suited to induction.
Theorem (restated): For every depth d + 1 formula φ and l > 0,
\[ \Pr\left[q(\varphi) \le \frac{1}{14^{d+1}\, l}\right] \le |\varphi| \, \frac{C^{d}}{\exp\bigl(e^{-2} d\, l^{1/d}\bigr)}, \]
where C is a constant.
Proof: First, the statement is trivial when l ≤ e^d, because then l^{1/d} ≤ e and the RHS is at least C^d e^{−d/e} ≥ 1 for a suitable constant C. So we only consider the case l ≥ e^d.
The induction is on d. First consider the base case d = 1, in which the formula has depth 2. Now φ is an AND or OR of depth 1 subformulas ψ, and each of these subformulas has q(ψ) = 1/14, so by the definition of p(φ) we have p(φ) = 1/14. From this we can write the expression for q(φ):
\[ q(\varphi) = \frac{1}{14^{2}\, k(\varphi)}. \]
From the definition of q we can write the probability as
\[ \Pr\left[q(\varphi) \le \frac{1}{14^{2}\, l}\right] = \Pr[k(\varphi) \ge l] = \Pr\Bigl[\bigvee_{\psi} D\bigl(\psi\lceil_{R^{\sigma,\tau}_{p(\varphi)}}\bigr) \ge l\Bigr] \le \sum_{\psi} \Pr\bigl[D\bigl(\psi\lceil_{R^{\sigma,\tau}_{1/14}}\bigr) \ge l\bigr]. \]
Each ψ is an AND or OR of literals, so the switching lemma applies with k = 1 and p = 1/14 and bounds each term by (5/14)^l ≤ e^{−l}. Summing over the at most |φ| subformulas, we finally get
\[ \Pr\left[q(\varphi) \le \frac{1}{14^{2}\, l}\right] \le |\varphi| \, \frac{1}{\exp(l)}. \]
With minor changes this gives the desired bound, so the base case holds.
Now for the induction step, we assume that the result holds for d − 1. To carry out the induction step we use a different approach.
Approach: Instead of directly showing an upper bound on the probability of the event q(φ) ≤ 1/(14^{d+1} l), we show an upper bound for a different event which it implies, and this in turn proves our hypothesis. Before we go on to the proof, we first define some terms that we will be using.
For all i define k_i and α_i by
\[ k_i = e^{i-1}\, l^{1/d}, \qquad \alpha_i = \frac{k_i}{14^{d}\, l}. \]
After this we define the events
\[ A : \; p(\varphi) \le \alpha_0, \]
\[ B_i : \; \bigvee_{\psi} \Bigl( (q(\varphi) \le \alpha_{i+1}) \wedge D\bigl(\psi\lceil_{R^{\sigma,\tau}_{q(\psi)}}\bigr) \ge k_i \Bigr), \]
\[ C_{i,j} : \; \bigvee_{\psi} \Bigl( (\alpha_{i+j+1} < q(\varphi) \le \alpha_{i+j+2}) \wedge D\bigl(\psi\lceil_{R^{\sigma,\tau}_{\alpha_{i+1}}}\bigr) \ge k_i \Bigr). \]
Claim: If q(φ) ≤ 1/(14^{d+1} l), then A ∨ ∨_i (B_i ∨ ∨_j C_{i,j}) holds.
From this claim, the probability of the first event is at most that of the second. So if we bound the probability of the second event, we obtain an upper bound on the first.
It is therefore enough to bound the probability of the right-hand side events. These can be bounded using the switching lemma and the lemma stated after it in Section 4.4. This is the complete proof idea.
In the next section we use this theorem to bound the size of a formula computing parity.
7 Bound on size when computing PARITY
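We will use the results established so far to prove a lower bound for parity.
Theorem: Depth d + 1 formulas computing PARITY require size exp(Ω(d(n^{1/d} − 1))).
Proof: Say φ is a formula of depth d + 1 computing PARITY. Then
\[ \Pr_{\rho \in R_{1/n}}[\varphi\lceil_{\rho} \text{ is non-constant}] = 1 - \left(1 - \frac{1}{n}\right)^{n} > 1 - \frac{1}{e}. \]
This is true because the restricted formula is non-constant exactly when at least one of the variables is left unset by the restriction. Rewriting the left-hand side in terms of σ and τ, we get
\[
\begin{aligned}
1 - \frac{1}{e} &< \Pr_{\sigma,\tau}\bigl[D\bigl(\varphi\lceil_{R^{\sigma,\tau}_{1/n}}\bigr) \ge 1\bigr] \\
&\le \Pr_{\sigma,\tau}\bigl[D\bigl(\varphi\lceil_{R^{\sigma,\tau}_{\max\{1/n,\, q(\varphi)\}}}\bigr) \ge 1\bigr] \\
&\le \Pr[q(\varphi) \le 1/n] + \Pr\bigl[D\bigl(\varphi\lceil_{R^{\sigma,\tau}_{q(\varphi)}}\bigr) \ge 1\bigr].
\end{aligned}
\]
The switching lemma, in the conditional form of Section 4.4, bounds the last probability, and the main theorem with λ = 1/n bounds Pr[q(φ) ≤ 1/n] in terms of |φ|. Combining the two gives the lower bound on the size of formulas computing PARITY.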
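The first step of the proof rests on the elementary inequality 1 − (1 − 1/n)^n > 1 − 1/e. A short numerical sketch (the function name is chosen for this example) evaluates the exact probability that a restriction from R_{1/n} leaves at least one variable unset:

```python
import math


def prob_parity_nonconstant(n):
    """Pr over rho in R_{1/n} that at least one of the n variables stays '*'.

    PARITY restricted by rho is non-constant exactly in this case.
    """
    p = 1 / n
    return 1 - (1 - p) ** n


if __name__ == "__main__":
    for n in (1, 2, 10, 100, 10**6):
        print(n, prob_parity_nonconstant(n), 1 - 1 / math.e)
```

The printed probabilities decrease towards 1 − 1/e as n grows but stay above it, which is what the first displayed inequality of the proof uses.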
8 Conclusion
In the previous sections we have shown how the switching lemma can be used to bound the size of a formula computing PARITY. Apart from this, the switching lemma can also be used to bound other properties of a circuit. One example is the use of the switching lemma to bound the average sensitivity of a Boolean formula of depth d + 1 and size s.
The next section lists possible extensions of this project.
9 Further Extensions of this Project
• We could try to prove a result like the switching lemma for circuits that also include MOD_p gates. At the least, the extension should be attempted for circuits that include parity gates.
• Similarly, we could try to show results on the average sensitivity of circuits with MOD_p gates that have size s and depth d + 1.
References
[1] J Hastad. Almost optimal lower bounds for small depth circuits. In Proceedings of the Eighteenth Annual ACM Symposium on Theory of Computing,
STOC ’86, pages 6–20, New York, NY, USA, 1986. ACM.
[2] Benjamin Rossman. The average sensitivity of bounded-depth formulas.
[3] Ryan O’Donnell. Analysis of Boolean Functions. Cambridge University
Press, 2014.