A threshold for unsatisfiability

Andreas Goerdt
Universität -GH- Paderborn
FB 17/Mathematik-Informatik
Postfach 1621
D-4790 Paderborn
e-mail: goerdt@uni-paderborn.de
Abstract
We show the following threshold property of satisfiability of propositional
formulas in 2-CNF: If $C = 1 + \varepsilon$ where $\varepsilon > 0$ is fixed, then almost all formulas
in 2-CNF with $C \cdot n$ different clauses over $n$ variables are unsatisfiable. If
$C = 1 - \varepsilon$, then almost all such formulas are satisfiable. ("Almost all" simply
means that the probability w.r.t. the uniform distribution considered here
tends to 1 as $n \to \infty$.) Due to the close relationship between satisfiability of
formulas in 2-CNF and graph theoretic properties it is not surprising that our
proof uses techniques from the theory of random graphs, in particular [12].
Introduction
When analyzing the probabilistic performance of an algorithm one usually considers
this algorithm w.r.t. an infinite family of probability spaces of inputs. In the case of
satisfiability algorithms for propositional formulas, mainly two types of families of
input spaces are considered: one is the constant density model, dealt with for example in [6]; the other is the constant clause size model, treated among others in [2],
[3], [4], and [5]. The constant clause size model is defined as the family $\mathrm{Form}_n(q, k)$
where $\mathrm{Form}_n(q, k)$ is the probability space consisting of all formulas which are sequences of exactly $q$ clauses (with repetition), each of which has exactly $k$ literals
over $n$ variables, such that each formula is equally likely. Our result contributes to
an analysis of this model (modified such that each formula is a set of $q$ clauses, for
simplicity).
If the clause size is $k = 3$ and the number of clauses is (asymptotically) $C \cdot n$ for
a constant $C < 1$, it is known that almost all formulas are satisfiable [5], and if
$C > -\frac{\ln 2}{\ln(7/8)}$ $(\approx 5.19)$ almost all formulas are unsatisfiable. Not much is known for $C$
between these two values. Experiments show that the formulas become unsatisfiable
for $C$ around 4. If the number of literals per clause is $k = 2$, it is easy to see (cf.
also [4]) that for $C > -\frac{\ln 2}{\ln(3/4)}$ $(\approx 2.41)$ the average number of satisfying assignments
of formulas with $C \cdot n$ clauses approaches 0 as $n \to \infty$. Hence almost no formula is
satisfiable. If $C < -\frac{\ln 2}{\ln(3/4)}$ this average goes to infinity. However, this does not mean
that almost all formulas are satisfiable, and we show that they are not as long as
$C > 1$.
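The two constants above can be reproduced with a short first-moment calculation. The sketch below is ours, not part of the original argument; it assumes clauses are drawn independently, so that a fixed assignment satisfies each $k$-clause with probability $1 - 2^{-k}$ and the expected number of satisfying assignments is roughly $2^n (1 - 2^{-k})^{C n}$. The function names are hypothetical.

```python
import math


def first_moment_threshold(k: int) -> float:
    """Density C at which 2^n * (1 - 2^-k)^(C*n) stops growing and starts shrinking."""
    return -math.log(2) / math.log(1 - 2.0 ** (-k))


def log2_expected_assignments(n: int, c: float, k: int) -> float:
    """log2 of the (approximate) expected number of satisfying assignments."""
    return n + c * n * math.log2(1 - 2.0 ** (-k))


if __name__ == "__main__":
    print(round(first_moment_threshold(3), 2))  # 5.19
    print(round(first_moment_threshold(2), 2))  # 2.41
```

For $k = 2$ the exponent is positive below the threshold and negative above it, which matches the behaviour of the average described in the text.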
In proving our result we build on the well known observation that a propositional
formula in 2-CNF is unsatisfiable iff its formula graph [1] has a contradictory cycle,
i.e. a cycle containing a vertex $L$ and another one $\bar{L}$ for a literal $L$. The formula
graph of a random formula over $n$ variables with $C \cdot n$ different clauses is a directed
random graph over $2 \cdot n$ vertices with $C \cdot n$ independent pairs of edges, hence with
$2 \cdot C \cdot n$ edges. If we look at the analogous situation of a directed random graph
with $2 \cdot n$ vertices and $C \cdot 2 \cdot n$ independent edges we have [9]: If $C < 1$, then the
probability that such a graph has a cycle approaches $C$ and the average number of
cycles approaches $-\log(1 - C)$. As the probability $C$ approaches 1 as $C \to 1$, for $C > 1$ almost
all directed random graphs have a cycle. (Here one uses the trivial observation that
more edges are more likely to induce a cycle.)
To show the first half of our result (for $C < 1$ almost no formula graph has a contradictory cycle), we show that the average number of contradictory cycles approaches
0 for $n \to \infty$. As edges in formula graphs are thrown in pairwise, they can be
stochastically dependent. This makes it difficult to compute the average directly.
Fortunately this stochastic dependency has a positive effect, too: we can restrict
attention to contradictory cycles in a certain normalform. Then we show that the
average number of cycles in this normalform approaches 0.
To show the second half of our result (for $C > 1$ almost all formula graphs have
a contradictory cycle), we cannot proceed as in the analogous situation for directed
graphs mentioned above. Instead, we observe that the average number of contradictory cycles of length $O(\log n)$ goes to infinity when $C > 1$. Then we apply the second
moment method [10]. (The average number of contradictory cycles of length $O(1)$
approaches 0 even for $C > 1$.) The second moment method in its standard form is
applied to subgraphs of constant size. In [12] it is applied to cycles of increasing size
in random graphs. We transfer the techniques used there to our situation, which is
more complex because edges are thrown in pairwise and therefore cycles can have
common edges in a variety of ways.
Our result shows that the existence of contradictory cycles in a random formula
graph is a little bit less likely than that of cycles in the analogous directed random
graph. If we forget for the moment that edges in formula graphs are thrown in pairwise, we have some intuitive explanation for this: To get a contradictory cycle with
$l$ vertices we can choose at most $l - 1$ vertices from the set of all vertices because one
vertex must cause the contradiction. For a cycle of length $l$ in a directed random
graph we can choose all $l$ vertices from the set of all vertices and hence have more
possibilities (if $l$ is not too large).
Our results give no information if $C = 1$. Experiments performed at our department suggest that for $C = 1$ about 90% of the formulas are satisfiable. In [8]
experiments are reported showing that 100% of the formulas are satisfiable. This
seems unlikely in view of our result.
In section 1 we introduce some basics, in section 2 we treat the case $C < 1$ and in
section 3 we let $C > 1$. In the conclusion we show that our result yields an algorithm
which runs in linear time on almost all formulas in 3-CNF with $C \cdot n$ clauses if $C < 1$
and $n$ is the number of variables.
1 Basics
We let $\mathrm{Var}_n = \{x_1, \dots, x_n\}$ be a standard set of $n$ propositional variables,
$\mathrm{Lit}_n = \{x_1, \dots, x_n, \bar{x}_1, \dots, \bar{x}_n\}$ is the set of literals over $\mathrm{Var}_n$. A clause over $\mathrm{Var}_n$
is a set $\{L, K\}$ (or $L \vee K$) with $L, K \in \mathrm{Lit}_n$, $L \ne K$, and $\bar{L} \ne K$. The number of
clauses over $\mathrm{Var}_n$ is $N = 4 \cdot \binom{n}{2} = 2 \cdot n \cdot (n - 1)$. We let $n$ always be the number
of variables and assume that $n$ is large. A formula with $q$ clauses over $n$ variables is
a set $\{C_1, \dots, C_q\}$ (or $C_1 \wedge \dots \wedge C_q$) consisting of exactly $q$ clauses. $\mathrm{Form}_n(q)$ is the
set of all formulas with $q$ clauses over $n$ variables. Then $|\mathrm{Form}_n(q)| = \binom{N}{q}$ and we
make $\mathrm{Form}_n(q)$ a probability space by assuming that each formula is equally likely.
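The probability space $\mathrm{Form}_n(q)$ can be sampled directly from this definition. The following sketch is only an illustration under our own encoding (variable $x_i$ is the integer `i`, its negation is `-i`); the function names are hypothetical. It enumerates all clauses, which is quadratic in $n$ and therefore only suitable for small experiments.

```python
import itertools
import random


def all_clauses(n):
    """All clauses {L, K} over n variables with L != K and L not the complement of K."""
    literals = [sign * i for i in range(1, n + 1) for sign in (1, -1)]
    return [
        frozenset(pair)
        for pair in itertools.combinations(literals, 2)
        if pair[0] != -pair[1]
    ]


def random_formula(n, q, rng=random):
    """A uniformly random set of exactly q distinct clauses over n variables."""
    return frozenset(rng.sample(all_clauses(n), q))


if __name__ == "__main__":
    n = 5
    print(len(all_clauses(n)), 2 * n * (n - 1))  # both are N
    print(sorted(sorted(c) for c in random_formula(n, 4, random.Random(0))))
```

Sampling without replacement from the list of all $N$ clauses makes every formula with $q$ clauses equally likely, as the definition requires.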
A formula graph over $n$ variables is a directed graph $G = (V, E)$ with $V = \mathrm{Lit}_n$, $E \subseteq
\{(L, K) \mid L, K \in V, L \ne K, L \ne \bar{K}\}$ and $(L, K) \in E \Leftrightarrow (\bar{K}, \bar{L}) \in E$, see [1].
We call $L \to K$, $\bar{K} \to \bar{L}$ a pair of complementary edges. $FG_n(q)$ is the set of all
formula graphs over $n$ variables with $q$ pairs of complementary edges. The bijective
correspondence between $FG_n(q)$ and $\mathrm{Form}_n(q)$ is well known and makes $FG_n(q)$
a probability space where each graph is equally likely. Finally, for a subgraph (not
necessarily a formula graph) $H = (U, F)$ of a formula graph $G$ we define the complementary graph $\bar{H} = (\bar{U}, \bar{F})$ by $\bar{U} = \{\bar{L} \mid L \in U\}$ and $\bar{F} = \{\bar{L} \to \bar{K} \mid K \to L \in F\}$.
For example if $H$ is a path like $L_1 \to L_2 \to L_3$, then $\bar{H} = \bar{L}_3 \to \bar{L}_2 \to \bar{L}_1$. We
define $\mathrm{Pairs}\, H = \{L \to K, \bar{K} \to \bar{L} \mid L \to K \text{ is an edge of } H\}$ and $\mathrm{Lit}\, H = U$. We
abbreviate $L \in \mathrm{Lit}\, H$ by $L \in H$. We let $\mathrm{Edges}\, H$ be the set of edges, $F$, of $H$.
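To make these definitions concrete, here is a small sketch of the formula graph and of the unsatisfiability test it supports. It assumes the usual convention of [1], which the text does not spell out: a clause $\{L, K\}$ contributes the pair of complementary edges $\bar{L} \to K$ and $\bar{K} \to L$. Literals are encoded as non-zero integers and all function names are ours. The test checks whether some $x$ and $\bar{x}$ can reach each other, i.e. lie on a common closed walk.

```python
def formula_graph(formula):
    """Edge set of the formula graph of a 2-CNF formula (iterable of 2-element clauses)."""
    edges = set()
    for clause in formula:
        l, k = tuple(clause)
        edges.add((-l, k))  # if L is false, K must be true
        edges.add((-k, l))  # the complementary edge
    return edges


def complement(edges):
    """Complementary graph: the edge L -> K becomes not-K -> not-L."""
    return {(-k, -l) for (l, k) in edges}


def reachable(edges, start):
    """All vertices reachable from start by a directed path (depth-first search)."""
    successors = {}
    for u, v in edges:
        successors.setdefault(u, []).append(v)
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for v in successors.get(u, []):
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen


def has_contradictory_cycle(edges, n):
    """True iff some variable x and its negation reach each other."""
    return any(
        -x in reachable(edges, x) and x in reachable(edges, -x)
        for x in range(1, n + 1)
    )
```

A formula graph always equals its own complement, which is exactly the closure condition $(L, K) \in E \Leftrightarrow (\bar{K}, \bar{L}) \in E$. The reachability test above is quadratic; the linear-time algorithm of [1] uses strongly connected components instead.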
2 Fewer clauses than variables
The main result of this section is theorem 2.3(b) saying that for $C < 1$ almost no
formula graph with $C \cdot n$ pairs of edges has a contradictory cycle. To prove it we
introduce a normalform for contradictory cycles in formula graphs.
Definition 2.1 Let $a, b, c \in \mathbb{N}$ with $a \ge 1$, $b \ge a$, and $c \ge 0$. The cycle $\pi$ is of type
$a, b, c$ iff $\pi$ can be decomposed as
$$\pi = S_1 \to U_1 \to S_2 \to U_2 \to \dots \to S_a \to U_a \to \bar{S}_1 \to V_1 \to \bar{S}_2 \to V_2 \to \dots \to \bar{S}_a \to V_a \to S_1$$
where the $S_i$ are non-empty paths and the $U_i, V_i$ are (possibly empty) paths, such
that: (a) For all contradictory pairs $L, \bar{L} \in \pi$ we have exactly one $i$ with $L \in S_i$ and
(b) $\sum_i |\mathrm{Lit}\, S_i| = b$ (= the number of contradictory pairs of literals of $\pi$).
(c) $\sum_i |\mathrm{Lit}\, U_i| + |\mathrm{Lit}\, V_i| = c$ (= the number of literals $L \in \pi$ with $\bar{L} \notin \pi$).
We say $\pi$ consists of $a \ge 1$ sections. The sections are $S_1, \dots, S_a$. We have that
$|\mathrm{Pairs}\, \pi| = a + b + c$ and $\mathrm{Length}\, \pi = 2b + c = |\{L \mid L \in \pi\}|$.
We say $\pi$ is in normalform iff $\pi$ is of type $a, b, c$ for some $a, b, c$.
□
The cycle
$$x \to y \to z \to \bar{x} \to \bar{y} \to u \to x$$
is of type 2, 2, 2 with $S_1 = x$, $S_2 = y$, and $U_1 = \emptyset$,
$U_2 = z$, $V_1 = \emptyset$, $V_2 = u$. The cycle
$$x \to u \to y \to v \to \bar{y} \to r \to \bar{x} \to s \to x$$
is not in normalform because with $S_1 = x$, $S_2 = y$ we have $\bar{S}_2$ before $\bar{S}_1$ and with
$S_1 = x \to u \to y$ we do not have $\bar{S}_1$ in the cycle. But in a formula graph containing
the path $x \to u \to y$ we can replace the path $\bar{y} \to r \to \bar{x}$ with $\bar{y} \to \bar{u} \to \bar{x}$, which is
the complementary path of $x \to u \to y$, and get a cycle of type 1, 3, 2. This can be
generalized:
Lemma 2.2 Each formula graph with a contradictory cycle has a cycle in normalform.
□
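The identities $|\mathrm{Pairs}\, \pi| = a + b + c$ and $\mathrm{Length}\, \pi = 2b + c$ from Definition 2.1 can be checked mechanically on the type 2, 2, 2 example, read here as the cycle $x \to y \to z \to \bar{x} \to \bar{y} \to u \to x$ (our reading of the decomposition $S_1 = x$, $S_2 = y$, $U_2 = z$, $V_2 = u$). The sketch below uses a hypothetical integer encoding of literals; it derives $a$ from the number of pairs, which is only meaningful if the cycle is already in normalform.

```python
def cycle_statistics(cycle):
    """Return (a, b, c, number of pairs) for a cycle given as a list of literals.

    Literals are non-zero integers, -v is the complement of v, and the edge
    from the last literal back to the first one is implied.
    """
    literals = set(cycle)
    # b: contradictory pairs, i.e. variables occurring with both signs
    b = sum(1 for v in literals if v > 0 and -v in literals)
    # c: literals whose complement does not occur on the cycle
    c = sum(1 for v in literals if -v not in literals)
    edges = [(cycle[i], cycle[(i + 1) % len(cycle)]) for i in range(len(cycle))]
    # an edge L -> K and its complementary edge not-K -> not-L form one pair
    pairs = {frozenset({(u, v), (-v, -u)}) for (u, v) in edges}
    a = len(pairs) - b - c  # assumes |Pairs| = a + b + c, valid in normalform
    return a, b, c, len(pairs)


if __name__ == "__main__":
    x, y, z, u = 1, 2, 3, 4
    print(cycle_statistics([x, y, z, -x, -y, u]))  # (2, 2, 2, 6)
```

The cycle has $2b + c = 6$ literals and, because no edge is the complement of another edge on the cycle, also 6 pairs of complementary edges.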
Now the main theorem follows easily:
Theorem 2.3 Let $X_{a,b,c}(G)$ be the number of cycles of type $a, b, c$ of the formula
graph $G$ with $\le C \cdot n$ pairs of complementary edges and let $X(G)$ be the number of
all cycles in normalform in $G$. Then we have:
(a) $E X_{a,b,c} \le \binom{b-1}{a-1} \cdot \binom{c+2a-1}{2a-1} \cdot \frac{1}{2a} \cdot \left(\frac{1}{2(n-1)}\right)^a \cdot C^{a+b+c} \cdot (1 + o(1))$.
(b) If $C < 1$, then $E X \to 0$ for $n \to \infty$. Hence almost all formulas are satisfiable.
Proof (a) Let $\mu_{a,b,c}$ be the number of cycles of type $a, b, c$ in the complete formula
graph over $n$ variables. Then
$$\mu_{a,b,c} \le \binom{n}{b} \cdot b! \cdot 2^b \cdot \binom{n-b}{c} \cdot c! \cdot 2^c \cdot \binom{b-1}{a-1} \cdot \binom{c+2a-1}{2a-1} \cdot \frac{1}{2a}$$
because each $\pi$ of type $a, b, c$ can be obtained in exactly $2 \cdot a$ ways by the following
choosing process:
(1) Choose the path $S = S_1 \to S_2 \to \dots \to S_a$: $\binom{n}{b} \cdot b! \cdot 2^b$ choices. Choose the $b$
variables, order them, and negate them or not. (We do not yet choose the partition
into the $S_i$.)
(2) Choose the path $U = U_1 \to \dots \to U_a \to V_1 \to \dots \to V_a$: $\binom{n-b}{c} \cdot c! \cdot 2^c$ choices.
(3) Partition $S$ into the $S_i$ where each $S_i \ne \emptyset$: $\binom{b-1}{a-1}$ choices, because the number of
ways to partition $S$ is equal to the number of vectors $(m_1, \dots, m_a)$ with $m_i \in \mathbb{N} \setminus \{0\}$
and $\sum m_i = b$ [cf. 11, exercise 11, page 13].
(4) Partition the path $U$ into the $U_i, V_i$: $\le \binom{c+2a-1}{2a-1}$ choices: The number of choices
is bounded by the number of vectors $(m_1, \dots, m_{2a})$ with $m_i \in \mathbb{N}$, $m_i = 0$ is possible,
and $\sum m_i = c$ [cf. 11, proposition 6.1, page 11]. (If $a = 1$ we have $U_1 \ne \emptyset$ and
$V_1 \ne \emptyset$, therefore the "$\le$".)
Because of cyclic permutations each $\pi$ has been chosen $2 \cdot a$ times, therefore we multiply with $\frac{1}{2a}$.
As $|\mathrm{Pairs}(\pi)| = a + b + c$ for $\pi$ of type $a, b, c$, we get
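$$E X_{a,b,c} = \mu_{a,b,c} \cdot \frac{\binom{N-a-b-c}{q-a-b-c}}{\binom{N}{q}} \le \binom{b-1}{a-1} \cdot \binom{c+2a-1}{2a-1} \cdot \frac{1}{2a} \cdot \left(\frac{1}{2(n-1)}\right)^a \cdot C^{a+b+c} (1 + o(1))$$
with $q \le C \cdot n$ the number of pairs of complementary edges of $G$, where we have used iteratively the fact that $\frac{k}{l} \le \frac{k+i}{l+i}$ if $0 \le k \le l$ and
$i \ge 0$, and $n - 1 = n(1 + o(1))$.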
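The two counting facts cited from [11] in steps (3) and (4) are the standard formulas for compositions of an integer: $b$ splits into $a$ positive parts in $\binom{b-1}{a-1}$ ways, and $c$ splits into $2a$ non-negative parts in $\binom{c+2a-1}{2a-1}$ ways. A brute-force check for small values, with a hypothetical helper name, might look as follows.

```python
from itertools import product
from math import comb


def count_compositions(total, parts, min_part):
    """Number of vectors (m_1, ..., m_parts) with m_i >= min_part and sum equal to total."""
    return sum(
        1
        for m in product(range(min_part, total + 1), repeat=parts)
        if sum(m) == total
    )


if __name__ == "__main__":
    a, b, c = 2, 5, 3
    # step (3): split the b literals of S into a non-empty sections
    print(count_compositions(b, a, 1), comb(b - 1, a - 1))
    # step (4): split the c literals of U into 2a possibly empty paths
    print(count_compositions(c, 2 * a, 0), comb(c + 2 * a - 1, 2 * a - 1))
```

The enumeration is exponential in the number of parts and is meant only as a sanity check of the closed forms used in the proof.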
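(b) Now let $C < 1$. Then we get
$$E X = \sum_{a,b,c} E X_{a,b,c} \le \sum_{a=1}^{\infty} \frac{1}{2a} \left(\frac{1}{2(n-1)}\right)^a C^{a+1} \cdot \sum_{b-1=a-1}^{\infty} \binom{b-1}{a-1} \cdot C^{b-1} \cdot \sum_{c=0}^{\infty} \binom{c+2a-1}{2a-1} \cdot C^c \cdot (1 + o(1))$$
$$= \sum_{a=1}^{\infty} \frac{1}{2a} \cdot \frac{1}{(2(n-1))^a} \cdot \frac{C^{2a}}{(1-C)^{3a}} \cdot (1 + o(1)) \quad \text{[7, formulas 5.56 and 5.57]}$$
$$\le \sum_{a=1}^{\infty} \left(\frac{E}{n-1}\right)^a = \frac{1}{1 - \frac{E}{n-1}} - 1 \to 0 \text{ for } n \to \infty,$$
where $E$ is a suitable constant.
□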
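The step that cites [7] relies on two standard generating-function identities, $\sum_{k \ge 0} \binom{m+k}{m} z^k = \frac{1}{(1-z)^{m+1}}$ and $\sum_{k \ge m} \binom{k}{m} z^k = \frac{z^m}{(1-z)^{m+1}}$ for $|z| < 1$, which are presumably the formulas meant. The sketch below checks them numerically and evaluates the resulting bound in the form $\sum_{a \ge 1} \frac{x^a}{2a} = -\frac{1}{2}\ln(1 - x)$ with $x = \frac{C^2}{2(n-1)(1-C)^3}$. This closed form follows our reading of the garbled display and ignores the $1 + o(1)$ factor; all function names are ours.

```python
from math import comb, log


def shifted_binomial_series(m, z, terms=400):
    """Partial sum of sum_{k>=0} C(m+k, m) z^k, which tends to 1/(1-z)^(m+1)."""
    return sum(comb(m + k, m) * z**k for k in range(terms))


def binomial_series(m, z, terms=400):
    """Partial sum of sum_{k>=m} C(k, m) z^k, which tends to z^m/(1-z)^(m+1)."""
    return sum(comb(k, m) * z**k for k in range(m, m + terms))


def normal_form_bound(n, c):
    """Value of sum_{a>=1} x^a / (2a) for x = C^2 / (2 (n-1) (1-C)^3), if x < 1."""
    x = c * c / (2 * (n - 1) * (1 - c) ** 3)
    return -0.5 * log(1 - x) if x < 1 else float("inf")


if __name__ == "__main__":
    for n in (10**3, 10**4, 10**5, 10**6):
        print(n, normal_form_bound(n, 0.9))
```

For fixed $C < 1$ the printed values decrease towards 0 as $n$ grows, in line with statement (b); for $C$ close to 1 the constant $(1 - C)^{-3}$ makes the bound useful only for large $n$.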
3 More clauses than variables
In this section we consider the case $C > 1$ and prove:
Theorem 3.1 If $C > 1$ and $q(n) \in \mathbb{N}$ with $q(n) \ge C \cdot n$ then almost all formulas
of $\mathrm{Form}_n(q(n))$ are unsatisfiable.
□
For the rest of this section let $C = 1 + \varepsilon$ where $\varepsilon > 0$ is fixed, $q = q(n) \in \mathbb{N}$ with
$q(n) \ge C \cdot n$, $k = k(n) \in \mathbb{N}$ with $k(n) \sim \log_C n = \frac{\ln n}{\ln C}$,
$l = l(n) = 2k + 2$.
For $a, b, c$ fixed we get with theorem 2.3(a) that $E X_{a,b,c} = O((\frac{1}{n})^a)$ where $a \ge 1$.
Hence almost no graph has a cycle of type $a, b, c$ for fixed $a, b, c$. This forces us to
look at cycles of length logarithmic in $n$, actually of length $l = 2k + 2$.
In order to show that pairs of cycles with common edges are asymptotically irrelevant, which is at the heart of the second moment method, we have to distinguish
precisely several ways in which common edges can occur. This requires some terminological effort.
Definition 3.2 (a) Let $P$ be a (non-empty) path and $L$ a literal. We consider the
unordered pair of paths consisting of $L \to P \to \bar{L}$ and its complement $L \to \bar{P} \to \bar{L}$.
If a formula graph has one of these paths it has the other one, too. We distinguish
these two paths calling one of them main path and the other one side path. We
assume that for any pair of paths as above these names are fixed from now on.
(b) A cycle $\pi$ is called a simple cycle iff there exists a variable $x$ and main paths
$x \to L_1 \to L_2 \to \dots \to L_k \to \bar{x}$, $\bar{x} \to G_1 \to G_2 \to \dots \to G_k \to x$ such that
$$\pi = x \to L_1 \to L_2 \to \dots \to L_k \to \bar{x} \to G_1 \to G_2 \to \dots \to G_k \to x$$
and $x, \bar{x}$ is the only contradictory pair of $\pi$ (and the nodes are pairwise different).
Note that $\pi$ contains no pair of complementary edges.
We call $x \to L_1 \to L_2 \to \dots \to L_k \to \bar{x}$ the first main path of $\pi$, $\mathrm{Fmp}\, \pi$, and $\bar{x} \to
G_1 \to G_2 \to \dots \to G_k \to x$ the second main path of $\pi$, $\mathrm{Smp}\, \pi$. The first and second side path of $\pi$ are given by $\mathrm{Fsp}\, \pi = \overline{\mathrm{Fmp}\, \pi}$ and $\mathrm{Ssp}\, \pi = \overline{\mathrm{Smp}\, \pi}$. We have
$|\mathrm{Pairs}(\mathrm{Fmp}\, \pi)| = |\mathrm{Pairs}(\mathrm{Smp}\, \pi)| = k + 1$ and $|\mathrm{Pairs}\, \pi| = l$. Note that $\mathrm{Fsp}\, \pi =
x \to \bar{L}_k \to \bar{L}_{k-1} \to \dots \to \bar{L}_1 \to \bar{x}$ and $\mathrm{Ssp}\, \pi = \bar{x} \to \bar{G}_k \to \bar{G}_{k-1} \to \dots \to \bar{G}_1 \to x$.
The first paths always go from a variable $x$ to $\bar{x}$ whereas the second paths go from a
negated variable $\bar{x}$ to $x$.
If $\pi$ is a simple cycle in the complete formula graph over $n$ variables, we let $X_\pi$ be
the indicator random variable of the event "$G$ contains $\pi$", $X(G)$ is the number of
all simple cycles of a formula graph $G$, and $\mu = \mu(n)$ is the number of all simple
cycles in the complete formula graph.
□
The following corollary is easy to see; note that $(n)_l \sim n^l$ because $l = o(n^{1/2})$ [10].
Corollary 3.3 (a) $\mu = n \cdot \binom{n-1}{k} \cdot k! \cdot 2^{k-1} \cdot \binom{n-1-k}{k} \cdot k! \cdot 2^{k-1} = (n)_{l-1} \cdot 2^{2k-2} \sim n^{l-1} \cdot 2^{l-4}$,
(b)
(c) $E X = \mu \cdot \frac{(q)_l}{(N)_l} \ge \mu \cdot \left(\frac{q}{N}\right)^l (1 + o(1)) \ge \frac{1}{16} \cdot C^2 \cdot n \, (1 + o(1)) \to \infty$ as $n \to \infty$.
(Here we use $C > 1$!)
□
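The closed form in part (a) can be checked with exact integer arithmetic. The sketch below is ours; it compares the product form $n \cdot \binom{n-1}{k} k!\, 2^{k-1} \cdot \binom{n-1-k}{k} k!\, 2^{k-1}$ with the falling-factorial form $(n)_{l-1} \cdot 2^{2k-2}$ for $l = 2k + 2$, and evaluates the expectation as $\mu \cdot (q)_l / (N)_l$, where $(q)_l / (N)_l$ is the probability that $l$ fixed pairs of complementary edges all occur among $q$ out of $N = 2n(n-1)$ pairs. The function names are hypothetical and $k \ge 1$ is assumed.

```python
from fractions import Fraction
from math import comb, factorial, perm


def mu_product_form(n, k):
    """Number of simple cycles, counted as n * (first main path) * (second main path)."""
    def main_paths(m):
        return comb(m, k) * factorial(k) * 2 ** (k - 1)
    return n * main_paths(n - 1) * main_paths(n - 1 - k)


def mu_falling_form(n, k):
    """The same number written as (n)_{l-1} * 2^(2k-2) with l = 2k + 2."""
    l = 2 * k + 2
    return perm(n, l - 1) * 2 ** (2 * k - 2)


def expected_simple_cycles(n, q, k):
    """Exact expectation mu * (q)_l / (N)_l for graphs with q of the N pairs."""
    l = 2 * k + 2
    big_n = 2 * n * (n - 1)
    return mu_falling_form(n, k) * Fraction(perm(q, l), perm(big_n, l))


if __name__ == "__main__":
    print(mu_product_form(40, 3) == mu_falling_form(40, 3))
    print(float(expected_simple_cycles(40, 60, 3)))
```

Using `Fraction` avoids rounding problems, since $\mu$ is astronomically large while the probability of a single cycle is tiny.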
In the following we use the abbreviations
$\mu_1 = \binom{n-1}{k} \cdot k! \cdot 2^{k-1} = (n - 1)^k \cdot 2^{k-1} \cdot (1 + o(1))$,
$\mu_2 = \binom{n-1-k}{k} \cdot k! \cdot 2^{k-1} = (n - 1 - k)^k \cdot 2^{k-1} \cdot (1 + o(1))$,
as $k = o(n^{1/2})$. Then $\mu = n \cdot \mu_1 \cdot \mu_2$.
Though the average number of simple cycles goes to infinity this does not ensure
that each graph has got a simple cycle. To prove this we apply the second moment
method extending the argument of [12]. We have to show that $\frac{E(X^2)}{(EX)^2} = 1 + o(1)$. We
have $X^2 = \sum_{(\pi, \pi')} X_\pi \cdot X_{\pi'}$ where the sum is over all ordered pairs of simple cycles in
the complete formula graph. To compute $\frac{E(X^2)}{(EX)^2}$ we decompose the random variable
$X^2$ as a sum of random variables. This decomposition is induced by the different
ways in which two simple cycles $(\pi, \pi')$ can have common edges.
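For reference, the reason why $\frac{E(X^2)}{(EX)^2} = 1 + o(1)$ suffices is the standard Chebyshev argument behind the second moment method; the following display is a reminder added here and uses only the notation $X$, $EX$ from above:

```latex
\Pr[X = 0]
  \;\le\; \Pr\bigl[\,|X - \mathrm{E}X| \ge \mathrm{E}X\,\bigr]
  \;\le\; \frac{\operatorname{Var} X}{(\mathrm{E}X)^2}
  \;=\; \frac{\mathrm{E}(X^2)}{(\mathrm{E}X)^2} - 1 .
```

Hence, if the ratio is $1 + o(1)$, the probability that a random formula graph has no simple cycle tends to 0.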
Definition 3.4 (a) Let $V$ be a path in a formula graph, let $\emptyset \ne S \subseteq$ the set of
edges of $V$ and let $C = e_1 \dots e_n$ where $n \ge 1$ be a sequence of $n$ consecutive edges
of $V$. We say $C$ is a chain of $S$ on $V$ iff $C \subseteq S$ and if $f$ is the edge succeeding $e_n$
or preceding $e_1$ on $V$ then $f \notin S$. We let $\mathrm{Chain}_V S$ be the set of chains of $S$ on $V$.
We have $1 \le |\mathrm{Chain}_V S| \le |S|$ and $|\mathrm{Chain}_V S| \le \mathrm{Length}\, V - |S| + 1$.
(b) For an ordered pair $(\pi, \pi')$ of simple cycles we define conditions (1) through (7).
These conditions distinguish the ways in which $\pi \cup \bar{\pi}$ and $\pi' \cup \bar{\pi}'$ can have common
edges.
(1) $\pi = \pi'$.
The remaining conditions only apply if $\pi \ne \pi'$. Let $E = \mathrm{Edges}(\pi \cup \bar{\pi})$.
(2) $\mathrm{Edges}(\pi') \cap E = \emptyset$:
$\pi \cup \bar{\pi}$ and $\pi'$ have no common edges (they may have common nodes).
(3) $\mathrm{Fmp}\, \pi' = \mathrm{Fmp}\, \pi$ and (a) $\mathrm{Edges}(\mathrm{Smp}\, \pi') \cap E = \emptyset$ or
(b) $\mathrm{Edges}(\mathrm{Smp}\, \pi') \cap E = S \ne \emptyset$.
We say the pair $(\pi, \pi')$ satisfies condition (3)(b) with parameters $r, R$ iff $|S| = r$ and
$|\mathrm{Chain}_{\mathrm{Smp}\, \pi'} S| = R$. (Then $1 \le r \le k$, $1 \le R \le r$, and $R \le k + 1 - r + 1
= k - r + 2$.)
(4) (a)(b) analogously to (3) changing the roles of Fmp and Smp.
(5) $\mathrm{Edges}(\mathrm{Fmp}\, \pi') \cap E = \emptyset$, $\mathrm{Smp}\, \pi' \ne \mathrm{Smp}\, \pi$, and $\mathrm{Edges}(\mathrm{Smp}\, \pi') \cap E = S \ne \emptyset$.
As in (3)(b) we define condition (5) with parameters $r, R$.
(6) Analogously to (5) changing the roles of Smp and Fmp.
(7) $\mathrm{Fmp}\, \pi' \cap E = S \ne \emptyset$, $\mathrm{Fmp}\, \pi' \ne \mathrm{Fmp}\, \pi$ and
$\mathrm{Smp}\, \pi' \cap E = S' \ne \emptyset$, $\mathrm{Smp}\, \pi' \ne \mathrm{Smp}\, \pi$.
We say $(\pi, \pi')$ satisfies condition (7) with parameters $r, R$, and $t, T$ iff
$|S| = r$, $|\mathrm{Chain}_{\mathrm{Fmp}\, \pi'} S| = R$ and
$|S'| = t$, $|\mathrm{Chain}_{\mathrm{Smp}\, \pi'} S'| = T$.
For each pair of simple cycles exactly one of the conditions above applies. Note that
situations like $\mathrm{Fmp}\, \pi' = \mathrm{Fsp}\, \pi$ cannot occur because $\mathrm{Fmp}\, \pi'$ must be a main path.
Situations like $\mathrm{Fmp}\, \pi' = \mathrm{Smp}\, \pi$ cannot occur because first main paths always go
from $x$ to $\bar{x}$ whereas second main paths go from $\bar{y}$ to $y$.
(c) The random variable $X_1$ is given by $X_1 = \sum_{(\pi, \pi')} X_\pi \cdot X_{\pi'}$ where the sum
goes over all pairs of simple cycles satisfying (1).
The random variables
$X_2, X_{3a}, X_{3b}, X_{4a}, X_{4b}, X_5, X_6$, and $X_7$ are defined analogously.
□
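The notion of a chain in Definition 3.4(a) is simply a maximal run of consecutive edges of the path that belong to $S$. The following sketch, with hypothetical names and with edges represented by arbitrary labels listed in path order, computes the chains and checks the two bounds $1 \le |\mathrm{Chain}_V S| \le |S|$ and $|\mathrm{Chain}_V S| \le \mathrm{Length}\, V - |S| + 1$ by brute force on a short path, taking $\mathrm{Length}\, V$ to be the number of edges.

```python
from itertools import combinations


def chains(path_edges, s):
    """Maximal runs of consecutive edges of the path that lie in the set s."""
    runs, current = [], []
    for edge in path_edges:
        if edge in s:
            current.append(edge)
        elif current:
            runs.append(current)
            current = []
    if current:
        runs.append(current)
    return runs


def bounds_hold(length):
    """Check both bounds for every non-empty edge subset of a path with `length` edges."""
    edges = list(range(length))
    for r in range(1, length + 1):
        for subset in combinations(edges, r):
            count = len(chains(edges, set(subset)))
            if not (1 <= count <= r and count <= length - r + 1):
                return False
    return True


if __name__ == "__main__":
    print(chains(list("abcdef"), {"a", "b", "d"}))  # [['a', 'b'], ['d']]
    print(bounds_hold(8))  # True
```

The second bound holds because $R$ chains need at least $R - 1$ separating edges outside $S$.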
Now we have $X^2 = \sum X_i$ and we compute $\frac{E X_i}{(EX)^2}$ for all $i$. The proof of the following
lemma is easy:
Lemma 3.5 (a) $\frac{E X_1}{(EX)^2} = o(1)$.
(b) $\frac{E X_2}{(EX)^2} \le 1$. ($X_2$ counts the pairs without common edges.)
(c) $\frac{E X_{3a}}{(EX)^2} = o(1)$, $\frac{E X_{4a}}{(EX)^2} = o(1)$.
□
The remaining cases require a much more detailed analysis of the ways in which the
cycle $\pi'$ can have common edges with $\pi \cup \bar{\pi}$. We begin by distinguishing two kinds
of chains on $\pi \cup \bar{\pi}$. (Note that in 3.4(a) we define a chain of $S$ on $V$, where $V$ is a
path. However, $\pi \cup \bar{\pi}$ is not a path.)
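Definition 3.6 (a) Let $\emptyset \ne S \subseteq \mathrm{Edges}(\pi \cup \bar{\pi})$ and let $C = e_1 \dots e_n$ where $n \ge 1$ be a
sequence of consecutive edges of $\pi \cup \bar{\pi}$. $C$ is a chain of $S$ on $\pi \cup \bar{\pi}$ iff $C \subseteq S \cap \mathrm{Edges}(\pi)$
or $C \subseteq S \cap \mathrm{Edges}(\bar{\pi})$ and if $f$ is a successor of $e_n$ or a predecessor of $e_1$ on $\pi \cup \bar{\pi}$,
then $f \notin S$ and $\bar{f} \notin S$.
Let $\mathrm{Chain}_{\pi \cup \bar{\pi}} S$ be the set of chains of $S$ on $\pi \cup \bar{\pi}$. Let $\mathrm{Chain}_\pi S$ be the set of these
chains which are contained in $\pi$.
$C$ is a broken chain of $S$ on $\pi \cup \bar{\pi}$ iff $C \cap \mathrm{Edges}(\pi) \ne \emptyset$ and $C \cap \mathrm{Edges}(\bar{\pi}) \ne \emptyset$
and if $f$ is a successor of $e_n$ or a predecessor of $e_1$ on $\pi \cup \bar{\pi}$ then $f \notin S$ and $\bar{f} \notin S$.
$\mathrm{Brchain}_{\pi \cup \bar{\pi}} S$ is the set
of broken chains of $S$ on $\pi \cup \bar{\pi}$.
Broken chains go from $\pi$ to $\bar{\pi}$ or vice versa whereas chains either are in $\pi$ or in $\bar{\pi}$.
□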
Corollary 3.7 (a) If $(\pi, \pi')$ satisfies condition (3)(b) with parameters $r, R$, then
$|\mathrm{Chain}_{\pi \cup \bar{\pi}} S| = R$. (For $S$ see definition 3.4(b), condition (3)(b).)
(b) If $(\pi, \pi')$ satisfies condition (5) with parameters $r, R$, then either $|\mathrm{Chain}_{\pi \cup \bar{\pi}} S| = R$
or ($|\mathrm{Chain}_{\pi \cup \bar{\pi}} S| = R - 1$ and $|\mathrm{Brchain}_{\pi \cup \bar{\pi}} S| = 1$).
(c) If $(\pi, \pi')$ satisfies condition (7) with parameters $r, R, t, T$ then the following two
statements hold:
Either $|\mathrm{Chain}_{\pi \cup \bar{\pi}} S| = R$ or ($|\mathrm{Chain}_{\pi \cup \bar{\pi}} S| = R - 1$ and $|\mathrm{Brchain}_{\pi \cup \bar{\pi}} S| = 1$).
The second statement is the analogous one with $S', T$ instead of $S, R$.
Proof (a), (b), (c) The results follow easily from the structure of main paths: By
definition they have only one contradictory pair of literals, at the beginning and
end.
□
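Definition and corollary 3.8 (a) Let $\pi$ be a simple cycle. For $r, R$ with $1 \le r \le
k$, $1 \le R \le r$, and $R \le k - r + 2$ we define
$L_\pi(r, R) = |\{S \subseteq \mathrm{Edges}(\pi) \mid |S| = r, |\mathrm{Chain}_\pi S| = R\}|$,
$G_\pi(r, R) = |\{S \subseteq \mathrm{Edges}(\pi \cup \bar{\pi}) \mid |S| = r, |\mathrm{Chain}_{\pi \cup \bar{\pi}} S| = R\}|$,
$K_\pi(r, R) = |\{S \subseteq \mathrm{Edges}(\pi \cup \bar{\pi}) \mid |S| = r, |\mathrm{Chain}_{\pi \cup \bar{\pi}} S| = R - 1, |\mathrm{Brchain}_{\pi \cup \bar{\pi}} S| = 1\}|$,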
M ( r , R ) = 2 n . ~ . (~) . ( ' • ) .
(b) We have
$L_\pi(r, R) = \frac{l}{R} \binom{r-1}{R-1} \binom{l-r-1}{R-1}$,
$G_\pi(r, R) = 2^R \cdot L_\pi(r, R) \le M(r, R)$,
$K_\pi(r, R) \le M(r, R)$.
Proof (b) The first equation is from [12]. The rest follows easily from this.
□
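The first equation of (b) is legible only in fragments in our copy; the standard count of $r$-element edge subsets of a cycle with $l$ edges that form exactly $R$ maximal runs is $\frac{l}{R} \binom{r-1}{R-1} \binom{l-r-1}{R-1}$ for $r < l$, and we assume that this is the formula taken from [12]. The sketch below, with hypothetical names and treating $\pi$ as a plain cycle whose edges are numbered $0, \dots, l-1$, compares this closed form with a brute-force count.

```python
from itertools import combinations
from math import comb


def runs_on_cycle(l, chosen):
    """Number of maximal runs of chosen edges on a cycle with edges 0..l-1.

    A run starts at every chosen edge whose predecessor is not chosen, so
    this is only valid if not all l edges are chosen.
    """
    chosen = set(chosen)
    return sum(1 for e in chosen if (e - 1) % l not in chosen)


def count_subsets(l, r, runs):
    """Brute-force count of r-subsets of the l cycle edges forming exactly `runs` runs."""
    return sum(
        1 for s in combinations(range(l), r) if runs_on_cycle(l, s) == runs
    )


def closed_form(l, r, runs):
    """(l / R) * C(r-1, R-1) * C(l-r-1, R-1), assumed here to be L_pi(r, R)."""
    return l * comb(r - 1, runs - 1) * comb(l - r - 1, runs - 1) // runs


if __name__ == "__main__":
    l = 8
    print(all(
        count_subsets(l, r, runs) == closed_form(l, r, runs)
        for r in range(1, l)
        for runs in range(1, r + 1)
    ))
```

For example, with $l = 8$, $r = 2$ and $R = 2$ both counts give 20, namely the $\binom{8}{2} = 28$ pairs of edges minus the 8 adjacent pairs.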
Definition and lemma 3.9 (a) For $r, R$ with $1 \le r \le k$, $1 \le R \le r$ and
$R \le k - r + 2$ let
$\nu_1(r, R) = M(r, R) \cdot \binom{n-1-r-R}{k-r-R} \cdot (k - r)! \cdot 2^{k-r-R+2}$,
$\nu_2(r, R) = M(r, R) \cdot \binom{n-1-k-r-R}{k-r-R} \cdot (k - r)! \cdot 2^{k-r-R+2}$.
(b) If $R = 1$, the number of pairs of simple cycles satisfying condition (3)(b) with
parameters $r, R$ is $\le \mu \cdot n \cdot \nu_2(r, R)$.
If $R \ge 2$ this number is $\le \mu \cdot n^2 \cdot \nu_2(r, R)$.
The same applies to condition (4)(b).
(c) If $R = 1$, the number of pairs of simple cycles satisfying condition (5) with parameters $r, R$ is $\le \mu \cdot \mu_1 \cdot l \cdot n \cdot \nu_2(r, R)$.
If $R \ge 2$, this number is $\le \mu \cdot \mu_1 \cdot l \cdot n^2 \cdot \nu_2(r, R)$.
The same applies to condition (6).
(d) If $R = 1$ and $T = 1$, the number of pairs of simple cycles satisfying condition
(7) with parameters $r, R$ and $t, T$ is $\le \mu \cdot l \cdot n \cdot \nu_1(r, R) \cdot n \cdot \nu_2(t, T)$.
If $R = 1$ and $T \ge 2$ or if $R \ge 2$ and $T = 1$, this number is $\le \mu \cdot l \cdot n \cdot \nu_1(r, R) \cdot n^2 \cdot
\nu_2(t, T)$,
if $R \ge 2$ and $T \ge 2$ it is
$\le \mu \cdot l \cdot n^2 \cdot \nu_1(r, R) \cdot n^2 \cdot \nu_2(t, T)$.
Proof (b) The proof follows from the structure of the main paths of the cycle $\pi'$. □
The following lemma finishes the proof of theorem 3.1.
Lemma 3.10 $\frac{E X_{3b}}{(EX)^2} = o(1)$, $\frac{E X_5}{(EX)^2} = o(1)$, $\frac{E X_7}{(EX)^2} = o(1)$.
The same applies to $E X_{4b}$ and $E X_6$.
Proof The most difficult case is $X_7$. The other cases can be treated in an analogous
manner. For $r, R$ and $t, T$ with $1 \le r \le k$, $1 \le R \le r$ and $R \le k - r + 2$
and the
same for $t, T$ let $X_7^{r,R,t,T} = \sum X_\pi \cdot X_{\pi'}$, where the sum goes over all pairs $(\pi, \pi')$
satisfying condition (7) with parameters $r, R$ and $t, T$. With $A = \frac{E X_7^{r,R,t,T}}{(EX)^2}$ we have
$$\frac{E X_7}{(EX)^2} = \sum_{\substack{r \ge R = 1 \\ t \ge T = 1}} A + \sum_{\substack{r \ge R \ge 2 \\ t \ge T = 1}} A + \sum_{\substack{r \ge R = 1 \\ t \ge T \ge 2}} A + \sum_{\substack{r \ge R \ge 2 \\ t \ge T \ge 2}} A .$$
Note that $r, R, t$ and $T$ all are indices of the summation.
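For each pair $(\pi, \pi')$ satisfying condition (7) with parameters $r, R$ and $t, T$ we have
$\binom{N-2l+r+t}{q-2l+r+t}$ graphs in $FG_n(q)$ containing $\pi$ and $\pi'$. Therefore for $R = 1$ and $T = 1$
we get with 3.9(d)
$$A \le \frac{1}{\mu_1} \cdot l \cdot n^{1/2} \cdot \nu_1(r, R) \cdot \left(\frac{N}{q}\right)^r \cdot \frac{1}{\mu_2} \cdot n^{1/2} \cdot \nu_2(t, T) \cdot \left(\frac{N}{q}\right)^t \cdot (1 + o(1)).$$
If $R = 1$ and $T \ge 2$ we get
$$A \le \frac{1}{\mu_1} \cdot l \cdot n^{1/2} \cdot \nu_1(r, R) \cdot \left(\frac{N}{q}\right)^r \cdot \frac{1}{\mu_2} \cdot n \cdot n^{1/2} \cdot \nu_2(t, T) \cdot \left(\frac{N}{q}\right)^t \cdot (1 + o(1)),$$
analogously for $R \ge 2$ and $T = 1$.
Finally, for $R \ge 2$ and $T \ge 2$ we get
$$A \le \frac{1}{\mu_1} \cdot l \cdot n \cdot n^{1/2} \cdot \nu_1(r, R) \cdot \left(\frac{N}{q}\right)^r \cdot \frac{1}{\mu_2} \cdot n \cdot n^{1/2} \cdot \nu_2(t, T) \cdot \left(\frac{N}{q}\right)^t \cdot (1 + o(1)).$$
Our proof is finished by showing for $i = 1, 2$
$$S1_i = \sum_{r \ge R = 1} \frac{1}{\mu_i} \cdot n^{1/2} \cdot l \cdot \nu_i(r, R) \cdot \left(\frac{N}{q}\right)^r = o(1),$$
$$S2_i = \sum_{r \ge R \ge 2} \frac{1}{\mu_i} \cdot l \cdot n \cdot n^{1/2} \cdot \nu_i(r, R) \cdot \left(\frac{N}{q}\right)^r \cdot (1 + o(1)) = o(1),$$
which follows by extending the arguments used in [12].
□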
Conclusion
Our result yields a satisfiability algorithm with linear time on almost all formulas in
3-CNF with $C \cdot n$ clauses if $C < 1$ as follows:
Input: F
From each clause of F randomly delete 1 literal, call the resulting formula G. (Now
G is a random formula of $\mathrm{Form}_n(C \cdot n)$ and if G is satisfiable, then so is F.)
Apply a linear time satisfiability test to G [1].
if G is satisfiable (this is almost always the case by theorem 2.3(b))
then output "F is satisfiable"
else
Apply the Davis-Putnam procedure to F.
(This statement is reached with probability going to 0.)
fi.
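The procedure above can be sketched in a few lines. The code below is only an illustration under our own assumptions: clauses are tuples of non-zero integers over distinct variables, the 2-CNF test is a simple reachability check on the formula graph (quadratic, whereas [1] achieves linear time), and a plain backtracking search stands in for the Davis-Putnam procedure. All function names are hypothetical.

```python
import random


def two_cnf_satisfiable(clauses, n):
    """Unsatisfiable iff some x reaches its negation and back in the formula graph."""
    successors = {}
    for l, k in clauses:
        successors.setdefault(-l, set()).add(k)
        successors.setdefault(-k, set()).add(l)

    def reaches(source, target):
        seen, stack = {source}, [source]
        while stack:
            u = stack.pop()
            if u == target:
                return True
            for v in successors.get(u, ()):
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        return False

    return not any(reaches(x, -x) and reaches(-x, x) for x in range(1, n + 1))


def backtracking_search(clauses):
    """Simple complete search over a list of frozensets; fallback for the rare case."""
    if not clauses:
        return True
    if any(len(c) == 0 for c in clauses):
        return False
    literal = next(iter(clauses[0]))
    for choice in (literal, -literal):
        reduced = [c - {-choice} for c in clauses if choice not in c]
        if backtracking_search(reduced):
            return True
    return False


def satisfiable_3cnf(formula, n, rng=random):
    """Delete one random literal per clause, test the 2-CNF formula, fall back if needed."""
    reduced = []
    for clause in formula:
        literals = list(clause)
        literals.remove(rng.choice(literals))
        reduced.append(tuple(literals))
    if two_cnf_satisfiable(reduced, n):
        return True  # a model of the reduced formula also satisfies the input
    return backtracking_search([frozenset(c) for c in formula])
```

The answer is always correct, because every clause of the reduced formula is a subset of a clause of the input; the threshold result only affects how often the expensive fallback is reached.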
Franco [5] shows a similar result with a different proof: The Davis-Putnam procedure with the pure literal rule shows 3-CNF instances as above satisfiable almost
always without backtracking hence in polynomial time if C < 1.
I thank Prof. Kleine Büning and my colleagues for interest and advice concerning this paper. Moreover, Prof. Speckenmeyer pointed out to me the relationship
between formulas in 2-CNF and graphs.
References
[1] B. Aspvall, M. F. Plass and R. E. Tarjan: A linear-time algorithm for testing the truth of certain quantified boolean formulas, Information Processing Letters 8, 1979.
[2] M. T. Chao and J. Franco: Probabilistic analysis of two heuristics for the 3-satisfiability problem, SIAM J. Comput. 15 (1986), pp. 1106-1118.
[3] V. Chvátal and E. Szemerédi: Many hard examples for resolution, J. ACM 35 (1988), pp. 759-768.
[4] O. Dubois: Counting the number of solutions for instances of satisfiability, Theoret. Comp. Sci. 81 (1991), pp. 49-64.
[5] J. Franco: Probabilistic analysis of the pure literal heuristic for solving the satisfiability problem, Annals of Operations Research 1 (1984), pp. 273-289.
[6] A. Goldberg, P. W. Purdom, and C. A. Brown: Average time analyses of simplified Davis-Putnam procedures, Information Processing Letters 15 (1982), pp. 72-75 (Errata 16 (1983), 213).
[7] R. L. Graham, D. E. Knuth, and O. Patashnik: "Concrete Mathematics", Addison Wesley, Reading, Massachusetts, December 1988.
[8] P. Hansen, B. Jaumard, and M. Minoux: A linear expected-time algorithm for deriving all logical conclusions implied by a set of boolean inequalities, Mathematical Programming 34 (1986), pp. 223-231.
[9] I. Palásti: On the threshold distribution function of cycles in a directed random graph, Studia Scientiarum Math. Hung. 6 (1971), pp. 67-73.
[10] E. M. Palmer: "Graphical evolution", J. Wiley and Sons, New York, 1985.
[11] S. Ross: "A first course in probability", Macmillan, New York, 1976.
[12] E. M. Wright: Large cycles in large labelled graphs, Math. Proc. Camb. Phil. Soc. 78 (1975), pp. 7-17.
