On the Probability That a Random ± 1-Matrix Is Singular
Jeff Kahn; János Komlós; Endre Szemerédi
Journal of the American Mathematical Society, Vol. 8, No. 1. (Jan., 1995), pp. 223-240.
ON THE PROBABILITY THAT A RANDOM ±1-MATRIX IS SINGULAR

JEFF KAHN, JÁNOS KOMLÓS, AND ENDRE SZEMERÉDI
1.1. The problem. For M_n a random n × n ±1-matrix ("random" meaning with respect to the uniform distribution), set

P_n = Pr(M_n is singular).

The question considered in this paper is an old and rather notorious one: What is the asymptotic behavior of P_n?
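For readers who want to experiment, P_n is easy to estimate for small n by direct sampling. The following sketch is our own illustration, not part of the paper; the function name is ours.

```python
import numpy as np

def singular_fraction(n, trials, seed=0):
    """Monte Carlo estimate of P_n = Pr(a uniform n x n +-1 matrix is singular)."""
    rng = np.random.default_rng(seed)
    singular = 0
    for _ in range(trials):
        m = rng.choice((-1, 1), size=(n, n))
        # rank deficiency <=> zero determinant
        if np.linalg.matrix_rank(m) < n:
            singular += 1
    return singular / trials
```

For n = 2 the exact value is P_2 = 1/2 (the matrix is singular exactly when the two rows agree up to sign), and the estimate converges to it.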
It seems often to have been conjectured that

(1)  P_n = (1 + o(1)) n² 2^{1−n},

that is, that P_n is essentially the probability that M_n contains two rows or two columns which are equal up to sign. This conjecture is perhaps best regarded as folklore. It is more or less stated in [14] and is mentioned explicitly, as a standing conjecture, in [20], but has surely been recognized as the probable truth for considerably longer. (It has also been conjectured ([17]) that P_n/(n² 2^{−n}) → ∞.)

Of course the guess in (1) may be sharpened, e.g., by adding a term for the probability of having a minimal row or column dependency of length 4.
Despite the availability of the natural guess (1), upper bounds on P_n have not been easy to come by. That P_n → 0 was shown by Komlós in 1963 (but published somewhat later [12]). This is a discrete analogue of the fact that the variety of singular real matrices has Lebesgue measure 0, and should be quite
Received by the editors May 31, 1991 and, in revised form, January 28, 1994.
1991 Mathematics Subject Classification. Primary 15A52, 11L03; Secondary 15A36, 11C20.
Key words and phrases. Random matrices, exponential sums, Littlewood-Offord problem.
Research for the first author was supported in part by NSF grant DMS-9003376 and AFOSR grants AFOSR-89-0512, AFOSR-90-0008.
Research for the second author was supported in part by the Hungarian National Foundation for Scientific Research #1905.
© 1994 American Mathematical Society
obvious. It is somewhat surprising that no trivial proof is known. A simpler proof (based, like the original proof, on Sperner's theorem ([24] or, e.g., [7])), giving the bound P_n = O(1/√n), was offered in [14] (see also [1], XIV.2).

Here we give an exponential bound.

Theorem 1. There is a positive constant ε for which P_n < (1 − ε)ⁿ.

We prove this with ε = .001 for n ≥ n₀. While this could be improved somewhat, a proof of (1) seems to require substantial new ideas.
Let m_{ij}, i, j ≥ 1, be chosen at random from {±1} independently of each other (an infinite random matrix), and let M_n be the finite matrix (m_{ij})_{1≤i,j≤n}.

Our return to the estimation of P_n was motivated in part by a question proposed by Benji Weiss: Is it true that Σ P_n < ∞?

The point of the question is that an affirmative answer (as provided by our Theorem 1) implies, via the Borel-Cantelli Lemma, that with probability 1 only finitely many of the M_n are singular.
A few additional applications and extensions are mentioned in Section 4. Of the extensions, the most interesting is perhaps Corollary 4(b), which, improving a result of Odlyzko [20], says that for an appropriate constant C, n − C random {±1}-vectors are not only (a.s.) independent, but in fact span no other {±1}-vectors.

The problem of estimating P_n turns out to be closely related to questions arising in various other areas, e.g., geometry (Füredi [4]), threshold logic (Zuev [27]), and associative memories (Kanter-Sompolinsky [11]). Consequences of Theorem 1 for some of these are also discussed in Section 4.

For more on the by now vast literature on random matrices see, e.g., Girko [6] or Mehta [16]. See also Odlyzko [20] for a few problems more or less related to the present work.
In the remainder of this section we sketch the main points in the proof of
Theorem 1. Let us in particular draw the reader's attention to Theorem 2,
which is central to the proof of Theorem 1, and seems also to be of independent
interest.
1.2. Linear algebra. For a ∈ Rⁿ − {0}, let

p(a) = Pr(ε · a = 0),

where ε is drawn uniformly from {±1}ⁿ, and denote by E_a the event "Ma = 0", where M = M_n is a random n × n ±1-matrix. Thus Pr(E_a) = [p(a)]ⁿ, and

P_n = Pr(∪{E_a : a ∈ Zⁿ − {0}}).

Of course "Boole's inequality" P_n ≤ Σ Pr(E_a) gives nothing here. For a's with very small p(a), the following trivial lemma gives a usable upper bound. (This is just the case k = n of Lemma 2 below, but we prove it separately both by way of illustration and because it plays a special role in what follows.)
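For small n the quantity p(a) = Pr(ε · a = 0) can be computed exactly by enumerating all 2ⁿ sign vectors. A minimal sketch (our own illustration, not from the paper):

```python
from itertools import product

def p(a):
    """p(a) = Pr(eps . a = 0) with eps uniform on {-1,+1}^n, by full enumeration."""
    n = len(a)
    zeros = sum(1 for eps in product((-1, 1), repeat=n)
                if sum(e * x for e, x in zip(eps, a)) == 0)
    return zeros / 2 ** n
```

For a = (1, 1, 1, 1) this gives \binom{4}{2}/2⁴ = 3/8, the central-binomial behavior behind the O(1/√n) bounds mentioned above.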
Lemma 1. For any p₀, 0 < p₀ < 1,

Pr(∪{E_a : p(a) ≤ p₀}) ≤ n p₀.

Proof. The inequality is implied by the following observation. Let L_i denote the event "the ith row of M is a linear combination of the other n − 1 rows". Then Pr([∪{E_a : p(a) ≤ p₀}] ∩ L_i) ≤ p₀. Indeed, p₀ is an upper bound on the probability of this event conditioned on any values in the other n − 1 rows, since for such matrices the set {a : p(a) ≤ p₀, Ma = 0} is determined already by those n − 1 rows. Taking total probability gives the bound p₀ for each i; since the event in question implies some L_i, summing over i gives the lemma. □
To deal with large p(a)'s, we must somehow exploit dependencies among the E_a's. A framework for doing so, based on the idea that linearly dependent a's tend to be annihilated by the same M's, is given by the following lemma. (For S ⊆ Rⁿ, dim(S) is the dimension of the subspace spanned by S.)

Lemma 2. Let S be a subset of Rⁿ − {0}, k = dim(S), and p(S) = max{p(a) : a ∈ S}. Then

Pr(∪{E_a : a ∈ S}) ≤ \binom{n}{k−1} p(S)^{n−k+1}.

This is proved in Section 3.3. The factor \binom{n}{k−1} is somewhat wasteful. We will eventually substitute for Lemma 2 a more technical variant (Lemma 4) which gives a slightly better value of ε in Theorem 1.
1.3. Subspaces. In the proof of Theorem 1 we will try to cover Zⁿ \ {0} by a small number of subspaces of moderate dimensions, and simply add up the bounds in Lemma 2 (or rather Lemma 4). That is, we will use the estimate

(2)  P_n = Pr(∪{E_a : a ∈ Zⁿ \ {0}}) ≤ Σ_{i≥0} Pr(∪{E_a : a ∈ S_i}),

where {S_i} is an appropriate cover of Zⁿ \ {0}, and use Lemma 4 to bound the summands. We will choose S₀ = {a : p(a) ≤ p₀} (p₀ = (1 − ε)ⁿ), so dim(S₀) = n. But for i ≠ 0, dim(S_i) will be roughly γn, with γ < 1 a constant to be specified later.
It is perhaps most natural here to try to use S_i's of the form S(I) = ∩{ε^⊥ : ε ∈ I}, where I is a set of linearly independent vectors from {±1}ⁿ, the idea being that if a ∈ Zⁿ satisfies ε · a = 0 for many ε ∈ {±1}ⁿ, then a should lie in many S(I)'s. While this seems not quite to work, something similar does lead to usable S_i's. Namely, our subspaces will be of the above form S(I) with I a set of about (1 − γ)n linearly independent vectors from {−1, 0, +1}ⁿ, each with exactly d non-zero components, for some d = μn with μ a small constant.

To show that a moderate number of such subspaces can cover Zⁿ, we use a probabilistic construction (Lemma 5).
Definition. Let V_d be the set of vectors ε ∈ {−1, 0, +1}ⁿ with exactly d non-zero coordinates. A d-sum in a is an expression of the form Σᵢ ε_i a_i, where ε ∈ V_d. We write

Σ_d(a) = {ε ∈ V_d : ε · a = 0}

and

σ_d(a) = |Σ_d(a)|.

The analogue of p(a) for d-sums is

p_d(a) = σ_d(a)/N_d,  where N_d = |V_d| = \binom{n}{d} 2^d.
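These quantities are again easy to enumerate for tiny n (our own illustration; p_d is computed as σ_d(a)/N_d with N_d = \binom{n}{d}2^d, as above):

```python
from itertools import combinations, product
from math import comb

def sigma_d(a, d):
    """sigma_d(a) = #{eps in V_d : eps . a = 0}: supports of size d, signs +-1."""
    n = len(a)
    return sum(1
               for supp in combinations(range(n), d)
               for signs in product((-1, 1), repeat=d)
               if sum(s * a[i] for s, i in zip(signs, supp)) == 0)

def p_d(a, d):
    """The d-sum analogue of p(a): the fraction of d-sums of a that vanish."""
    return sigma_d(a, d) / (comb(len(a), d) * 2 ** d)
```

For a = (1, 1, 1, 1) and d = 2, each of the 6 supports contributes the two sign patterns (+, −) and (−, +), so σ_2 = 12 and p_2 = 1/2.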
1.4. A Halász-type inequality. As noted above, we may place all a's for which p(a) is small (p(a) < p₀ = (1 − ε)ⁿ, say) in a single set S₀, which by Lemma 1 contributes only np₀ to the bound (2).

The crucial question posed by Lemma 2 (or Lemma 4) is: What can be said about a's for which p(a) is large?

For example, as observed by Erdős [2] in connection with the "Littlewood-Offord Problem", Sperner's Theorem ([24] or, e.g., [7]) implies that if p(a) is much bigger than n^{−1/2}, then a has relatively small support.

A second example is given by a theorem of Sárközy and Szemerédi [23], which says that if a₁, ..., a_n are distinct, then

p(a) = O(n^{−3/2}).

(The precise bound here is a celebrated result essentially due to Stanley; see [25], [21].) So if p(a) is much bigger than n^{−3/2}, then a must have many repeated entries. (Incidentally, one can use this with Lemma 2 to show P_n = O(n^{−1/2}), which already answers the question of Weiss mentioned earlier. This was in fact our starting point.)

For smaller values of p(a), some deep theorems of Halász [8, 9] apply. They say, roughly, that if p(a) is much bigger than n^{−(2r+1)/2}, then there must be considerable duplication among the sums a_{i₁} ± ⋯ ± a_{i_r}.
We give here a more abstract condition, which says that for d much less than n, p(a) = p_n(a) tends to be significantly less than p_d(a). This is perhaps the most important step in the proof of Theorem 1.

In terms of random walks, the result says roughly the following. Let a₁, ..., a_n be integers and μ ∈ (0, 1/4). Then the probability that a random walk with step sizes a₁, ..., a_n returns to the origin at time n is less by a factor O(√μ) than the corresponding probability for the "lazy" walk which at the ith step moves a_i or −a_i, each with probability μ, and otherwise remains where it is.

While this is certainly the case for ordinary random walks, it is somewhat surprising that such a relation between p_n and p_d can be established for random walks with arbitrary step sizes, since in such generality it is hopeless to determine, or even to give reasonable estimates for, individual values of p_n.

Let supp(a) be the number of non-zero components in a.
Theorem 2. Let λ < 1 be a positive number, and let k be a positive integer such that 4λk² < 1. If a ∈ Zⁿ − {0}, then

p(a) ≤ ( (k(1 − 4λk²))^{−1} + (1 − 4λ)^{−1} e^{−(1−4λ) supp(a)/(4k²)} ) Q,

where Q = Q_λ(a) is defined as

Q = Σ_{d=0}^{n} \binom{n}{d} (λe^{−1})^d (1 − λe^{−1})^{n−d} p_d(a).

The choice k = (12λ)^{−1/2} leads to the following corollary.

Theorem 3. There exists (for each λ) K(λ) such that if (12λ)^{−1/2} is integral and supp(a) ≥ K(λ), then

p(a) < c₀ √λ Q,

where c₀ < 5.2.
Remark. Set μ = λe^{−1}. The weight function in Q is a binomial distribution which is highly concentrated around the expected value μn. Hence, typically, only the terms p_d(a) with d ≈ μn matter. Thus, Theorem 3 roughly says the following. Let μ > 0 be small. If a₁, ..., a_n are non-zero integers, and many (more than a (1 − μ)ⁿ proportion) of the 2ⁿ signed sums of the a's are 0, then, for some d ≈ μn, an even larger (by a factor of order √(n/d)) proportion of the \binom{n}{d}2^d signed sums of d terms are 0.
We just mention two illuminating examples.

Example 1 (verified for us by Imre Ruzsa [22]). If a_i = i^α, α a positive integer, then (for large enough d, n) p(a) ~ c n^{−α−1/2}, while p_d(a) ~ c n^{−α} d^{−1/2}.

Example 2. If the a_i are random integers chosen from the range {1, 2, ..., M}, then p(a) ~ c/(M√n) and p_d(a) ~ c/(M√d).

Now fix a positive constant ε′, and set

q_λ(a) = max{p_d(a) : |d − μn| < ε′n}.

By the Chernoff bound, the total weight in Q of the terms with |d − μn| ≥ ε′n is at most 2e^{−Div(μ,ε′)n}, where

Div(μ, ε′) = min{D(μ−ε′ ‖ μ), D(μ+ε′ ‖ μ)},

with D(x‖μ) = x log(x/μ) + (1 − x) log((1 − x)/(1 − μ)), the information-theoretical divergence of x from μ. Thus, under the conditions of Theorem 3 we have, provided supp(a) > K(λ),

(5)  p(a) < c₀ √λ (q_λ(a) + 2e^{−Div(μ,ε′)n}),

with c₀ < 5.2.
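The divergence bound can be checked directly against exact binomial tails (our sketch; D is the divergence defined above, in natural logarithms):

```python
from math import comb, exp, log

def D(x, mu):
    """Information divergence D(x || mu) = x log(x/mu) + (1-x) log((1-x)/(1-mu))."""
    return x * log(x / mu) + (1 - x) * log((1 - x) / (1 - mu))

def binom_tail(n, mu, lo, hi):
    """Exact Pr(Bin(n, mu) outside [lo, hi])."""
    return sum(comb(n, d) * mu ** d * (1 - mu) ** (n - d)
               for d in range(n + 1) if not lo <= d <= hi)

n, mu = 200, 0.1
tail = binom_tail(n, mu, 10, 30)                          # exact two-sided tail
chernoff = 2 * exp(-n * min(D(0.05, mu), D(0.15, mu)))    # Chernoff upper bound
```

The Chernoff bound is, as always, an overestimate of the exact tail, which is what licenses discarding the terms of Q with |d − μn| ≥ ε′n.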
1.5. Sketch of the proof of the main theorem. The proof of Theorem 2, using ideas of Halász, is given in Section 2. In Section 3, we complete the proof of Theorem 1. The argument (ignoring a's of small support, which are easily handled directly) will go roughly as follows.

We fix a small λ (eventually 1/108) and ε somewhat smaller (.002). Vectors a with p(a) < (1 − ε)ⁿ are placed in S₀.

For the remaining vectors, as indicated earlier, we use S_i's based on d-sums and having dimension γn, where d takes various values in the vicinity of μn and γ = 1 − ε′n/d.

The crucial difference between d-sums and full sums is in the factor √λ: for given σ, the number of S_i's used to cover a's with q_λ(a) = p_d(a) and σ_d(a) ≈ σ behaves roughly like (\binom{n}{d}2^d/σ)^{(1−γ)n} ≈ p_d(a)^{−(1−γ)n}; the binomial coefficient from Lemma 2 turns out not to be too important; and the factor (√λ)^{(1−γ)n} from p^{n−k+1} is small enough to give the desired exponential bound.
Recall that

(6)  ∏_{i=1}^{n} cos a_i = 2^{−n} Σ cos(ε₁a₁ + ⋯ + ε_n a_n),

where the sum is over ε ∈ {±1}ⁿ. This gives, for any a ∈ Rⁿ,

(7)  p(a) ≤ (1/2π) ∫₀^{2π} ∏_{i=1}^{n} |cos(a_i t)| dt.

Remark. The reader may notice that the integrand on the right-hand side of (7) is the Fourier transform of the distribution of Σ_{i=1}^{n} ε_i a_i, where ε is chosen uniformly from {±1}ⁿ. This is not by accident. Esseen's concentration lemma [3] says that for any finite measure μ,

sup_x μ([x, x+1]) ≤ c ∫_{−1}^{1} |ψ(t)| dt,

where ψ(t) is the Fourier transform ψ(t) = ∫ e^{itx} μ(dx) and c is an absolute constant.
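For integer a, the identity behind (6)-(7) can be verified numerically: averaging ∏ᵢ cos(a_i t) over a period recovers Pr(ε · a = 0) exactly. (Our illustration; a uniform grid average is exact for trigonometric polynomials once the grid exceeds the highest frequency.)

```python
import numpy as np
from itertools import product

def p_enum(a):
    """Pr(eps . a = 0) by enumeration of {-1,+1}^n."""
    n = len(a)
    return sum(1 for eps in product((-1, 1), repeat=n)
               if sum(e * x for e, x in zip(eps, a)) == 0) / 2 ** n

def p_fourier(a, grid=4096):
    """(1/2pi) * integral over [0, 2pi) of prod_i cos(a_i t) dt, via grid average."""
    t = np.linspace(0.0, 2 * np.pi, grid, endpoint=False)
    return float(np.prod(np.cos(np.outer(np.asarray(a), t)), axis=0).mean())
```

This is the Fourier-side representation that the rest of Section 2 manipulates.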
This remark may be used to generalize Theorem 1 to random matrices with
arbitrary independent identically distributed non-degenerate entries. (For such
a generalization of the result of [12], see [13].)
Returning to (7) and using the inequality |x| ≤ e^{−(1−x²)/2} together with 1 − cos²α = (1 − cos 2α)/2 and the integrality of the a_i, we have

p(a) ≤ (1/2π) ∫₀^{2π} exp(−(1/4) Σ_{i=1}^{n} (1 − cos(a_i t))) dt.

Setting

f(t) = (1/4) Σ_{i=1}^{n} (1 − cos(a_i t)),

we define

T(x) = {t ∈ (0, 2π) : f(t) ≤ x}  and  g(x) = (1/2π)|T(x)|

(|·| stands for Lebesgue measure). Using this f and g, we can rewrite our estimate as

(8)  p(a) ≤ (1/2π) ∫₀^{2π} e^{−f(t)} dt = ∫₀^∞ e^{−x} g(x) dx.
The following inequality of Halász ([8], see also [9]) is at the heart of our proof. For any x > 0 and positive integer k,

(9)  g(x) ≤ k^{−1} g(k²x),

provided g(k²x) < 1, which certainly holds if k²x ≤ supp(a)/4, since

(1/2π) ∫₀^{2π} f(t) dt = supp(a)/4

and f is not constant.

For the convenience of the reader, we sketch Halász's proof of (9). For a fixed integer k ≥ 2, let T*(x) = {t₁ + ⋯ + t_k : t_i ∈ T(x)} (addition modulo 2π). Then (9) follows from

(10)  T*(x) ⊆ T(k²x)

and

(11)  (1/2π) |T*(x)| ≥ min{k g(x), 1}.

The set containment (10) follows from

sin²(Σ_{i=1}^{k} α_i) ≤ (Σ_{i=1}^{k} |sin α_i|)² ≤ k Σ_{i=1}^{k} sin² α_i.

For the proof of (11), see [8]. (Alternatively, it is an easy consequence of the Cauchy-Davenport Theorem (e.g., Halberstam-Roth [10]).)
We return to the proof of Theorem 2. Let us fix a positive number λ < 1. First we use Chernoff's method to show

(12)  g(x) ≤ e^{4λx} Q.

By Markov's inequality,

g(x) = (1/2π) |{t ∈ (0, 2π) : exp(λ Σᵢ cos(a_i t)) ≥ exp(λ(n − 4x))}|
     ≤ exp(−λ(n − 4x)) (1/2π) ∫₀^{2π} exp(λ Σᵢ cos(a_i t)) dt.

Using the inequality

e^{λz} ≤ e^{λ} (1 − λe^{−1}(1 − z))  for |z| ≤ 1,

recalling μ = λe^{−1}, and then using (6), we have

(1/2π) ∫₀^{2π} exp(λ Σᵢ cos(a_i t)) dt ≤ e^{λn} (1/2π) ∫₀^{2π} ∏ᵢ (1 − μ(1 − cos(a_i t))) dt
= e^{λn} Σ_{d=0}^{n} μ^d (1 − μ)^{n−d} 2^{−d} σ_d(a)
= e^{λn} Σ_{d=0}^{n} \binom{n}{d} μ^d (1 − μ)^{n−d} p_d(a),

which combined with the preceding display proves (12).

Now, let k be a positive integer with 4λk² < 1. Let us write s = supp(a)/(4k²) and split the estimate (8) as

p(a) ≤ ∫₀^s e^{−x} g(x) dx + ∫_s^∞ e^{−x} g(x) dx.

We start with the second integral. By (12),

∫_s^∞ e^{−x} g(x) dx ≤ Q ∫_s^∞ e^{−(1−4λ)x} dx = (1 − 4λ)^{−1} e^{−(1−4λ)s} Q.

In the domain of the first integral we have k²x ≤ k²s = supp(a)/4. Thus (9) applies, and with (12) yields

∫₀^s e^{−x} g(x) dx ≤ k^{−1} ∫₀^s e^{−x} g(k²x) dx ≤ k^{−1} Q ∫₀^∞ e^{−(1−4λk²)x} dx = (k(1 − 4λk²))^{−1} Q,

proving Theorem 2. □
We assume throughout that n is large enough to support our approximations.
We generally treat large real numbers as integers without comment; if the reader
prefers, replacing each such number by its floor, say, removes this imprecision
without affecting any of the arguments.
3.1. a's with many 0's. We first dispose of the easy case of a's with many 0's. The following observation is from [14] (see also [1], p. 348, Lemma 10). For all K,

(13)

In particular,

(14)

Remark. This can easily be improved to the true value (1 + o(1)) n² 2^{1−n}, e.g., by substituting \binom{n−1}{K−1} for \binom{n}{K−1} in (13). Thus, vectors a with at least 3n/log₂ n 0's do not obstruct a proof of (1).
3.2. An Odlyzko-type lemma. We need one more easy observation, which generalizes Theorem 2 of [20]. (Recall that V_d is the set of {−1, 0, +1}-vectors with exactly d non-zero coordinates.)

Lemma 3. If S is a D-dimensional subspace of Rⁿ, then

(15)  |S ∩ V_d| ≤ F(D, d) := Σ_i \binom{D}{i} 2^i,

the sum extending over max(0, D − n + d) ≤ i ≤ min(d, D).

Proof. Without loss of generality, the set of restrictions of vectors in S to the coordinates {1, ..., D} is of dimension D. Thus different vectors in S ∩ V_d have different restrictions to these coordinates, each a {−1, 0, 1}-vector with between D − n + d and d nonzero coordinates. This gives (15). □

The case d = n is Odlyzko's result, which we state for future reference as

Corollary 1. For V a subspace of Rⁿ and r chosen uniformly at random from {±1}ⁿ,

Pr(r ∈ V) ≤ 2^{dim(V) − n}.

3.3. Back to Lemma 2. As mentioned earlier, a little more care with Lemma 2 eventually gives a somewhat better ε in Theorem 1.
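Corollary 1 is easy to confirm by enumeration in small dimension (our sketch): for the 2-dimensional V ⊂ R⁴ below, the fraction of ±1 vectors lying in V is exactly 2^{2−4} = 1/4, so the bound is attained.

```python
import numpy as np
from itertools import product

def frac_cube_in_subspace(basis, n):
    """Fraction of {-1,+1}^n lying in span(basis), via least-squares membership."""
    B = np.array(basis, dtype=float).T
    cnt = 0
    for v in product((-1.0, 1.0), repeat=n):
        v = np.array(v)
        x = np.linalg.lstsq(B, v, rcond=None)[0]
        if np.allclose(B @ x, v):
            cnt += 1
    return cnt / 2 ** n

frac = frac_cube_in_subspace([(1, 1, 1, 1), (1, 1, -1, -1)], 4)
```

Here the cube points in V are exactly ±(1,1,1,1) and ±(1,1,−1,−1), i.e., 4 of 16.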
Lemma 4. Suppose the k-dimensional subset S of Rⁿ and numbers p, ε″ satisfy

(16)  p(S) ≤ p,

together with the technical conditions

(17)

(18)

Then (for large enough n),

Pr(∪{E_a : a ∈ S}) < \binom{n}{ε″n} p^{n−k+1}.
Proofs of Lemmas 2 and 4. Let r₁, ..., r_n be the rows of M. A necessary condition for the event E_S := ∪{E_a : a ∈ S} is that there be at most k − 1 indices j ∈ [n] for which the containment

(19)  r_j^⊥ ⊇ S ∩ ⋂_{i<j} r_i^⊥

fails (each failure reduces dim(S ∩ ⋂_{i≤j} r_i^⊥) by at least one, and E_S requires that this dimension remain positive). Call the event in (19) F_j. For I ⊆ [n], let H_I be the event {S ∩ ⋂_{i∈I} r_i^⊥ ≠ {0}}.

Proof of Lemma 2. The discussion to this point implies

Pr(E_S) ≤ Σ { Pr(⋂_{j∈J} F_j ∩ H_{[n]\J}) : J ⊆ [n], |J| = n − k + 1 }.

But for any J ⊆ [n],

(20)  Pr(⋂_{j∈J} F_j ∩ H_{[n]\J}) ≤ p(S)^{|J|}.

To see this, fix (and condition on) rows r_i, i ∉ J, satisfying H_{[n]\J}. Then F_j implies that r_j lies in (S ∩ ⋂_{i∉J} r_i^⊥)^⊥, so in particular is orthogonal to any given a ∈ S ∩ ⋂_{i∉J} r_i^⊥. Since the latter occurs with probability at most p(S), and the rows are chosen independently, we have (20). The lemma follows. □
Proof of Lemma 4. We just elaborate the preceding proof a little. Note we may assume

(21)

since otherwise the conclusion follows from Lemma 2.

Given I ⊆ [n] with |I| ≤ k − 1, set J = [n] \ I,

G_I = H_I ∩ ⋂_{i∈I} F̄_i,   F_I = G_I ∩ ⋂_{j∈J} F_j.

Thus

(22)  Pr(E_S) ≤ Σ { Pr(F_I) : I ⊆ [n], |I| ≤ k − 1 }.

For j ∈ J let t(j) = |I \ [j]|. Our basic inequality is

(23)  Pr(F_I) ≤ ∏_{j∈J} min{ p, 2^{−t(j)−1} } =: f(I).

To see this, fix rows r_i, i ∈ I, satisfying G_I. Then F_j requires that

(24)  r_j ⊥ S ∩ ⋂_{i<j} r_i^⊥.

Now G_I implies that

dim(S ∩ ⋂_{i<j} r_i^⊥) ≥ k − |I| + t(j) ≥ t(j) + 1,

so by Corollary 1, (24) occurs with probability at most 2^{−t(j)−1}. But (24) also requires that r_j be orthogonal to any given a ∈ S ∩ ⋂_{i<j} r_i^⊥, so occurs with probability at most p(S). This gives (23).

Consider first I of size k − 1. Set m = k − ε″n and suppose |I ∩ [m]| = i. Then t(j) ≥ k − 1 − i for j ∈ J ∩ [m], and so, by (23),

(25)

Letting I vary, this gives

(26)

the second inequality by (16), (18).

For smaller I, notice that for any I ⊆ I′ ⊆ [n],

f(I) ≤ p^{|I′ \ I|} f(I′).

Thus (see (22))

Pr(E_S) ≤ Σ_{t=0}^{k−1} Σ { f(I) : I ⊆ [n], |I| = k − 1 − t },

and the right-hand side is at most the bound in the lemma: the inequality (26) is from (25), while (27) is a consequence of (17) and (21). □
3.4. A random construction. We can now construct the sets S_i for use in Lemma 4. Our basic parameters are λ, μ = λe^{−1}, ε, and ε′ = αμ, all small positive constants. We assume first of all that

(28)  1 − ε > e^{−Div(μ, ε′)}.

(As mentioned earlier, we also assume n is large.) Then Theorem 2 (see (5)) guarantees that for any a with

(29)  supp(a) > K(λ)  and  q_λ(a) > (1 − ε)ⁿ,

we have the crucial inequality

(30)  p(a) < 5.2 √λ q_λ(a).

Let us fix (temporarily) two integers d and σ with

(31)  |d − μn| < ε′n

and

(32)  σ > (1 − ε)ⁿ N,

where N = N_d = |V_d| = \binom{n}{d} 2^d.

For such d and σ, we also define the sets

M(d, σ) = {a ∈ Zⁿ − {0} : supp(a) > K(λ), q_λ(a) = p_d(a), and σ_d(a) = σ}.

As mentioned above, vectors a in any of the sets M(d, σ) satisfy (30).

Define in addition δ = d/n, γ = 1 − ε′/δ, and D = (1 − γ)n. We will choose the parameters so that d ≤ D/2, that is,

(33)  2δ² ≤ ε′,

implying that F(D, d) < 2\binom{D}{d} 2^d. Note also that, since γ < 1,

(34)  \binom{D}{d} ≤ (1 − γ)^d \binom{n}{d}.
We will cover M(d, σ) by a number of sets S_i, each consisting of a's which are orthogonal to some D linearly independent vectors from V_d.

Lemma 5. There exist m < (1 + o(1)) (N/σ)^D log\binom{N}{σ} and W₁, ..., W_m, each a set of D linearly independent vectors from V_d, such that any σ-subset of V_d contains at least one of the W_i.

Remark. If we don't require that the elements of W_i be independent, then Lemma 5 becomes a special case of a hypergraph covering result of Lovász [15] (see also Füredi [5] for a survey of this and related topics). In the present situation we use Lemma 3 to show that the independence requirement doesn't really cause any trouble.
Proof. Fix Σ ⊆ V_d with |Σ| = σ, and set q = σ/N. Let w₁, ..., w_D be chosen uniformly and independently from V_d, and set

F = {w₁, ..., w_D are linearly independent elements of Σ}.

We show that

Pr(F) = (1 − o(1)) q^D.

Since Pr(w₁, ..., w_D ∈ Σ) = q^D, it's enough to show

(35)  P := Pr(v₁, ..., v_D are linearly independent) = 1 − o(1),

where v₁, ..., v_D are drawn uniformly and independently from Σ. By Lemma 3,

P ≥ 1 − Σ_{i=1}^{D} Pr(v_i ∈ ⟨v₁, ..., v_{i−1}⟩) ≥ 1 − D F(D, d)/σ.

But, writing (x)_s for x(x − 1) ⋯ (x − s + 1) and using (32), (33), the last quantity is 1 − o(1); so (35) follows from (34).

This gives the lemma: For appropriate m < (1 + o(1)) (N/σ)^D log\binom{N}{σ}, if W₁, ..., W_m are uniformly and independently chosen D-subsets of V_d, then the expected number of σ-subsets of V_d containing no independent W_i is

\binom{N}{σ} (1 − (1 − o(1)) q^D)^m < 1,

so in particular there exist W_i's as in the statement of the lemma. □
236
JEFF KAHN, JANOS KOMLOS, AND ENDRE SZEMEREDI
3.5. Defining the S_i's. With notation as in Section 3.4, let W₁, ..., W_m be as in Lemma 5 and set

S_i = (W_i)^⊥ ∩ M(d, σ).

Suppose also that the constant ε″ satisfies

(36)  ε″ > −log₂(1 − ε).

Then applying Lemma 4 with S = S_i and k = n − D = γn, p = 5.2√λ σ/N, and using (30), we have

Pr(∪{E_a : a ∈ S_i}) < \binom{n}{ε″n} (5.2√λ σ/N)^{(1−γ)n+1},

and (since m < (N/σ)^D N)

(37)  Pr(∪{E_a : a ∈ M(d, σ)}) < N (N/σ)^D \binom{n}{ε″n} (5.2√λ σ/N)^{(1−γ)n+1}
< exp₂[(H₂(δ) + δ + H₂(ε″) + (1 − γ) log₂(5.2√λ))n] =: m_d.

Thus we have

(38)  Pr(∪{E_a : supp(a) > K(λ), q_λ(a) > (1 − ε)ⁿ}) < Σ_d N m_d,

where the sum is over d satisfying (31).
The factor N in (38) is much more than is necessary, and, though this makes only a small difference in our final bound, we modify the argument as follows to reduce it.

Partition [(1 − ε)ⁿN, N] into intervals of the form I = {σ : (1 + 1/n)^t < σ ≤ (1 + 1/n)^{t+1}} and use Lemma 5 to cover ∪_{σ∈I} M(d, σ) rather than an individual M(d, σ). This has essentially no effect on any of the above calculations, and allows us to replace (38) by, for example,

(39)  Pr(∪{E_a : supp(a) > K(λ), q_λ(a) > (1 − ε)ⁿ}) < n² Σ_d m_d.
Finally, set

S₀ = {a ∈ Zⁿ − {0} : q_λ(a) ≤ (1 − ε)ⁿ or supp(a) ≤ K(λ)} = Zⁿ − {0} − ∪_{d,σ} M(d, σ).

By (5) and (28), the conditions q_λ(a) ≤ (1 − ε)ⁿ, supp(a) > K(λ), with our eventual choice λ = 1/108, imply

(40)  p(a) < (1 − ε)ⁿ.

Thus, by Lemma 1 and (14),

Pr(∪{E_a : a ∈ S₀}) < n(1 − ε)ⁿ + n³ 2^{−n},

and finally,

(41)  P_n < n² Σ_d m_d + n(1 − ε)ⁿ + n³ 2^{−n}.
3.6. Choosing the parameters. It remains to set the parameters. Essentially this amounts to choosing λ and α = ε′/μ, the other values then being dictated by (28), (31), (33), and (36).

A convenient, if not quite optimal, choice is λ = 1/108 (k = 3, μ = λe^{−1}), α = .5 (ε′ = .5μ), ε″ = .01, and ε = .002 (i.e., something a little bigger than .001).

It is then straightforward to check that for any (1 − α)μ ≤ δ ≤ (1 + α)μ and γ = 1 − ε′/δ, the expression

H₂(δ) + δ + H₂(ε″) + (1 − γ) log₂(5.2√λ)

in (37) is less than log₂(1 − ε). (Its values at the extremes δ = (1 ∓ α)μ are less than log₂(1 − ε), and its second derivative with respect to δ is positive between the extremes.)

Thus the bound in (41) is essentially equal to its second term and we have Theorem 1. □
It would be of considerable interest to say more about the distribution of det(M_n). Viewed "up close", the distribution is not very nice (for instance, it's easy to see that det(M_n) is always divisible by 2^{n−1}), but it seems reasonable to expect some kind of limit distribution. The log-normal law for random determinants (Girko [6], Theorem 6.4.1) doesn't apply here, for the entries don't satisfy E m_{ij}⁴ = 3.

It is not hard to see, based on the above results, that for any b,

(42)  Pr(det(M_n) = b) < 2(1 − ε)ⁿ.

[Briefly, this is because: If we define p*(a) = max_c Pr(ε · a = c), then the bound of Theorem 2 applies to p*(a). (Multiplying the integrands in (7) by cos(ct) gives Pr(ε · a = c) in place of p(a), and the rest of the proof goes through as is.) This implies, as in the proof of Theorem 1, that with probability at least 1 − (1 − ε)ⁿ the first n − 1 rows, r₁, ..., r_{n−1}, of M annihilate a unique a, which satisfies p*(a) < (1 − ε)ⁿ. On the other hand, given any such r₁, ..., r_{n−1} and a, Pr(det(M) = b) ≤ p*(a) for any b.]

However, (42) should be far from the truth, which we believe to be that (except when b = 0) the probability in (42) is exp[−Ω(n log n)].
It is also not hard to see that (37) together with (13) implies the following extensions.

Corollary 2. For any γ > 0 there are constants C and ε* > 0 so that

Pr(M_n a = 0 for some a with supp(a) > C and p(a) > (1 − ε*)ⁿ) < γⁿ.

Corollary 3. For every γ > 0 there is a constant C such that

Corollary 4. There is a constant C so that if r ≤ n − C and ε₁, ..., ε_r are chosen (uniformly, independently) at random from {±1}ⁿ, then

(a) Pr(ε₁, ..., ε_r are linearly dependent) = (1 + o(1)) 2\binom{r}{2} 2^{−n},

(b) Pr(⟨ε₁, ..., ε_r⟩ ∩ {±1}ⁿ ≠ {±ε₁, ..., ±ε_r}) = (1 + o(1)) 4\binom{r}{3} (3/4)ⁿ.

(The precise error term in (a) is (1 + o(1)) 8\binom{r}{4} (3/8)ⁿ, while (b) has an error term O((5/8)ⁿ).)

Part (b) improves a result of Odlyzko [20] (studied in connection with a question on associative memories [11]) which gives the same bound provided r < n − 10n/log n. As he observes, the error term O((5/8)ⁿ) is not best possible.

We conjecture that the conclusions of Corollary 4 hold provided r ≤ n − 1, but expect that proving this will be about the same as proving (1).
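The bases (3/4)ⁿ and (3/8)ⁿ in Corollary 4 come from per-coordinate computations which are easy to verify (our sketch): a signed combination of three ±1 entries lies in {−1, +1} with probability exactly 3/4, and a signed sum of four ±1 entries vanishes with probability 3/8.

```python
from itertools import product

def frac_pm1(signs):
    """Fraction of (e_1, ..., e_k) in {-1,1}^k with sum_i signs[i]*e_i in {-1,+1}."""
    k = len(signs)
    good = sum(1 for e in product((-1, 1), repeat=k)
               if sum(s * x for s, x in zip(signs, e)) in (-1, 1))
    return good / 2 ** k

triple = frac_pm1((1, 1, -1))  # base of (3/4)^n in part (b)
# base of (3/8)^n in (a)'s error term: four +-1's summing to zero
quad_zero = sum(1 for e in product((-1, 1), repeat=4) if sum(e) == 0) / 16
```

So a fixed signed triple of the ε_i spans a further ±1 vector with probability (3/4)ⁿ, and a fixed signed quadruple is a minimal dependency with probability (3/8)ⁿ.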
Denote by T_n the number of threshold functions of n variables, that is, functions f : {±1}ⁿ → {±1} of the form

f(x) = sgn(a₀ + a₁x₁ + ⋯ + a_n x_n)

with a_i ∈ R. The behavior of log T_n was considered beginning in the late 1950s by various authors who established the bounds \binom{n}{2} < log₂ T_n < n². (See Muroga [18] for details, related results, and references.) More recently, Zuev [27] showed, using results from [26] and [20], that log₂ T_n ~ n². His precise bound is

(43)  T_n ≥ exp₂[n² − 10n²/log n − O(n log n)],

whereas an upper bound is

(44)  2 Σ_{k=0}^{n} \binom{2ⁿ − 1}{k} ≥ T_n.

Using Corollary 4(b) in place of [20] improves the lower bound (43) to exp₂[n² − n log₂ n − O(n)]. Moreover, if the conjecture that one may replace r ≤ n − C by r ≤ n − 1 in the corollary is true, then a slight elaboration of the argument of [27] gives the asymptotics, not just of log T_n, but of T_n itself: it would be asymptotic to the left-hand side of (44).

As pointed out by Füredi [4], Theorem 1 also gives some improvement in the bounds of that paper, namely, if n = O(d) and x₁, ..., x_n are chosen uniformly and independently from {±1}^d, then

(45)

where h(n, d) = Σ_{i=0}^{d−1} \binom{n−1}{i}.
We would like to thank József Beck and Gábor Halász for helpful conversations, and Volodia Blinovsky for pointing out some errors in the manuscript.
REFERENCES

[1] Béla Bollobás, Random graphs, Academic Press, New York, 1985.
[2] Pál Erdős, On a lemma of Littlewood and Offord, Bull. Amer. Math. Soc. 51 (1945), 898-902.
[3] C. G. Esseen, On the Kolmogorov-Rogozin inequality for the concentration function, Z. Wahrsch. Verw. Gebiete 5 (1966), 210-216.
[4] Zoltán Füredi, Random polytopes in the d-dimensional cube, Discrete Comput. Geom. 1 (1986), 315-319.
[5] ___, Matchings and covers in hypergraphs, Graphs Combin. 4 (1988), 115-206.
[6] V. L. Girko, Theory of random determinants, Math. Appl. (Soviet Ser.), vol. 45, Kluwer Acad. Publ., Dordrecht, 1990.
[7] C. Greene and D. J. Kleitman, Proof techniques in the theory of finite sets, Studies in Combinatorics (G.-C. Rota, ed.), Math. Assoc. Amer., Washington, D.C., 1978.
[8] Gábor Halász, On the distribution of additive arithmetic functions, Acta Arith. 27 (1975), 143-152.
[9] ___, Estimates for the concentration function of combinatorial number theory and probability, Period. Math. Hungar. 8 (1977), 197-211.
[10] H. Halberstam and K. F. Roth, Sequences, Vol. 1, Oxford Univ. Press, London and New York, 1966.
[11] I. Kanter and H. Sompolinsky, Associative recall of memory without errors, Phys. Rev. A (3) 35 (1987), 380-392.
[12] János Komlós, On the determinant of (0, 1) matrices, Studia Sci. Math. Hungar. 2 (1967), 7-21.
[13] ___, On the determinants of random matrices, Studia Sci. Math. Hungar. 3 (1968), 387-399.
[14] ___, Circulated manuscript, 1977.
[15] László Lovász, On the ratio of optimal integral and fractional covers, Discrete Math. 13 (1975), 383-390.
[16] Madan Lal Mehta, Random matrices, second ed., Academic Press, New York, 1991.
[17] N. Metropolis and P. R. Stein, A class of (0, 1)-matrices with vanishing determinants, J. Combin. Theory 3 (1967), 191-198.
[18] Saburo Muroga, Threshold logic and its applications, Wiley, New York, 1971.
[19] A. M. Odlyzko, On the ranks of some (0, 1)-matrices with constant row-sums, J. Austral. Math. Soc. Ser. A 31 (1981), 193-201.
[20] ___, On subspaces spanned by random selections of ±1 vectors, J. Combin. Theory Ser. A 47 (1988), 124-133.
[21] G. W. Peck, Erdős conjecture on sums of distinct numbers, Studies Appl. Math. 63 (1980), 87-92.
[22] Imre Ruzsa, Private communication.
[23] András Sárközy and Endre Szemerédi, Über ein Problem von Erdős und Moser, Acta Arith. 11 (1965), 205-208.
[24] E. Sperner, Ein Satz über Untermengen einer endlichen Menge, Math. Z. 27 (1928), 544-548.
[25] Richard P. Stanley, Weyl groups, the hard Lefschetz theorem, and the Sperner property, SIAM J. Alg. Discrete Math. 1 (1980), 168-184.
[26] Thomas Zaslavsky, Facing up to arrangements: Face-count formulas for partitions of space by hyperplanes, Mem. Amer. Math. Soc., vol. 154, Amer. Math. Soc., Providence, RI, 1975.
[27] Yu. A. Zuev, Methods of geometry and probabilistic combinatorics in threshold logic, Discrete Math. Appl. 2 (1992), 427-438.
ABSTRACT. We report some progress on the old problem of estimating the probability, P_n, that a random n × n ±1-matrix is singular:

Theorem. There is a positive constant ε for which P_n < (1 − ε)ⁿ.

This is a considerable improvement on the best previous bound, P_n = O(1/√n), given by Komlós in 1977, but still falls short of the often-conjectured asymptotical formula P_n = (1 + o(1)) n² 2^{1−n}.

The proof combines ideas from combinatorial number theory, Fourier analysis and combinatorics, and some probabilistic constructions. A key ingredient, based on a Fourier-analytic idea of Halász, is an inequality (Theorem 2) relating the probability that a ∈ Rⁿ is orthogonal to a random ε ∈ {±1}ⁿ to the corresponding probability when ε is random from {−1, 0, 1}ⁿ with Pr(ε_i = −1) = Pr(ε_i = 1) = μ and the ε_i's chosen independently.
(J. Kahn and J. Komlós) Department of Mathematics, Rutgers University, New Brunswick, New Jersey 08903
E-mail address: [email protected]
E-mail address: [email protected]

(E. Szemerédi) Department of Computer Science, Rutgers University, New Brunswick, New Jersey 08903
E-mail address: [email protected]

(J. Komlós and E. Szemerédi) Hungarian Academy of Sciences, Budapest, Hungary