Analysis of a Simple Greedy Matching Algorithm on Random Cubic Graphs

Alan Frieze    A. J. Radcliffe†    Stephen Suen
Department of Mathematics, Carnegie Mellon University, Pittsburgh PA 15213, U.S.A.

June 14, 1995

Abstract

We consider the performance of a simple greedy matching algorithm MINGREEDY when applied to random cubic graphs. We show that if $\Phi(n)$ is the expected number of vertices not matched by MINGREEDY, then there are positive constants $c_1$ and $c_2$ such that $c_1 n^{1/5} \le \Phi(n) \le c_2 n^{1/5}\log n$.

1  Introduction

There have been a number of papers analysing simple greedy-type algorithms for finding large matchings in graphs, e.g. Korte and Hausmann [6], Karp and Sipser [7], Tinhofer [8], Dyer and Frieze [3], Goldschmidt and Hochbaum [5] and Dyer, Frieze and Pittel [4]. Most of these deal with the expected performance of various algorithms on random graphs. In this paper we discuss the algorithm MINGREEDY, given below. It is simple and can easily be implemented in linear time.

We use the following notation in the description of MINGREEDY. $\Gamma_G(v)$ denotes the set of neighbours of the vertex $v$ in the graph $G$; often $G$ is understood and the subscript is omitted. $G \setminus \{u,v\}$ is the graph obtained from $G$ by deleting the vertices $u$ and $v$, all edges incident with them and, in addition, any new isolated vertices.

† Supported by NSF grant CCR-9024935. Current address: Department of Mathematics, Georgia Institute of Technology, Atlanta GA 30332.

MINGREEDY
Input $G$;
begin
  $M := \emptyset$;
  while $E(G) \ne \emptyset$ do
  begin
    A: Choose $v \in V$ uniformly at random from among the vertices of minimum degree in $G$;
    B: Choose $u \in \Gamma(v)$ uniformly at random and set $e = \{u,v\}$;
    $G := G \setminus \{u,v\}$;
    $M := M \cup \{e\}$
  end;
  Output $M$ as our matching
end

The performance of MINGREEDY, at least on random cubic graphs, is remarkably good. In Theorem 1.1 below we prove that the number of vertices left unmatched is usually comparatively small, but it is instructive to see the algorithm's performance in "practice". We have done a limited amount of computation and some results are given in Table 1. We have compared it to two other greedy matching algorithms: GREEDY, which chooses an edge at random from those available, and MODIFIED GREEDY, which randomly chooses a vertex $v$ and then an edge incident with it. The algorithms were run on random $n$-vertex cubic graphs, up to $n = 10^6$. The difference in performance is quite dramatic. It makes good sense to use an algorithm like MINGREEDY as a "front end" to an optimising algorithm.

   n      10^2     10^3      10^4       10^5        10^6
   i_M     1.2      2.2       3.5        6.2        10.4
   i_R    12.0    116.3    1193.3    11892.7    119091.9
   i_V    11.8    102.2    1049.9    10472.1    104546.8

Table 1: Performance of greedy type matching algorithms in 20 Monte Carlo runs, where $i_M$ (respectively $i_R$, $i_V$) is the average number of vertices left isolated when using MINGREEDY (respectively GREEDY, MODIFIED GREEDY).

This paper is concerned with the performance of MINGREEDY on random graphs. For a graph $G$ we let $\Phi(G)$ denote the expected number of vertices left unmatched by a run of MINGREEDY.[1] Let $\Omega_n$ denote the set of 3-regular graphs with vertex set $[n] = \{1,2,\dots,n\}$.

[1] MINGREEDY is a randomising algorithm and the expectation here is with respect to the algorithm's random choices.

Theorem 1.1  Let $G_n$ be chosen uniformly at random from $\Omega_n$. Then there exist constants $c_1, c_2 > 0$ such that
\[
  c_1 n^{1/5} \le E(\Phi(G_n)) \le c_2 n^{1/5}\ln n.
\]

It seems likely that the $\ln n$ factor in the upper bound is unnecessary.
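For readers who wish to reproduce experiments like those in Table 1, here is a short Python sketch of MINGREEDY (a sketch only, not the implementation used for Table 1). It assumes the networkx package, which is used purely to generate a random cubic test instance via networkx.random_regular_graph.

```python
import random
import networkx as nx  # used only to generate random cubic test graphs

def mingreedy(G):
    """One run of MINGREEDY on a copy of G; returns (matching, unmatched_count)."""
    G = G.copy()
    M = []
    # Remove any vertices that are isolated to begin with (none in a cubic graph).
    G.remove_nodes_from([v for v in list(G) if G.degree(v) == 0])
    isolated = 0
    while G.number_of_edges() > 0:
        # Step A: a uniformly random vertex of minimum (positive) degree.
        dmin = min(dict(G.degree()).values())
        v = random.choice([w for w in G if G.degree(w) == dmin])
        # Step B: a uniformly random neighbour of v.
        u = random.choice(list(G.neighbors(v)))
        M.append((v, u))
        G.remove_nodes_from([v, u])
        # Delete newly isolated vertices; they can never be matched.
        new_isolated = [w for w in G if G.degree(w) == 0]
        isolated += len(new_isolated)
        G.remove_nodes_from(new_isolated)
    return M, isolated

if __name__ == "__main__":
    n = 1000
    G = nx.random_regular_graph(3, n)
    _, unmatched = mingreedy(G)
    print(n, unmatched)
```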
Note, in contrast to Theorem 1.1, that the algorithm which at every stage chooses an edge randomly from among those available (analysed in [3]) is very likely to leave at least $\epsilon n$ vertices unmatched for some absolute constant $\epsilon > 0$. A similar estimate holds if, at every stage, one chooses a random vertex $v$ and then a random neighbour of $v$. These calculations are borne out by the computational results in Table 1; it is crucial to the success of our algorithm that we always choose a vertex of minimum degree.

We would like to extend Theorem 1.1 to random $r$-regular graphs and to sparse random graphs but it seems difficult to carry through the analysis. It should be noted however that MINGREEDY is likely to perform well. It is closest in spirit to Algorithm 2 of [7], which, when run on sparse random graphs, usually leaves only $o(n)$ more vertices unmatched than the minimum possible. (This is a difficult proof and there is no estimate given of the $o(n)$ error term.)

2  Overview of the Proof

Our proof of Theorem 1.1 is rather long and so we now give a brief overview. For functions $f$ and $g$, we use $f(n) \approx g(n)$ to denote that $f(n) = (1+o(1))g(n)$ as $n \to \infty$.

Let $t$ denote the number of iterations of the algorithm so far, i.e. the number of executions of Step A, and let $n_i = n_i(t)$ be the number of vertices of degree $i = 1,2,3$ in the graph $G(t)$ at the end of iteration $t$. (So $n_1(0) = n_2(0) = 0$, $n_3(0) = n$.) Similarly, write $m(t) = \frac12(n_1(t) + 2n_2(t) + 3n_3(t))$ for the number of edges of $G(t)$. Let $\zeta(t)$ denote the number of new isolated vertices created at time $t$. Then the number of unmatched vertices at the end is given by
\[
  \Phi(G_n) = \sum_{t\ge0}\zeta(t).
\]
Our ability to analyse the algorithm rests on the following fact.

Claim 2.1  Suppose that $G = G(t)$ has $n_i$ vertices of degree $i$, $i = 1,2,3$. Then $G$ is equally likely to be any member of $\mathcal{G}(n_1,n_2,n_3)$, the set of all graphs with vertex set $V(G)$ and $n_i$ vertices of degree $i$, $i = 1,2,3$.

The claim allows us to treat the progress of the algorithm as a Markov chain $\mathcal{M}$ on $N^3$ (where $N = \{0,1,2,\dots\}$), the state at time $t$ being $(n_1(t), n_2(t), n_3(t))$. Analysis of $\mathcal{M}$ yields the following fact.

Claim 2.2  Conditional on $n_1(t)$, $n_2(t)$ and $n_3(t)$, the expected number $\bar\zeta(t)$ of isolated vertices created at iteration $t$ is $\Theta(n_1(t)/m(t))$.

We will see that, in a typical execution of the algorithm (except near the end of the process), $n_3(t)$ and $m(t)$ almost surely satisfy
\[
  n_3(t) \approx \frac{(2m(t)/3)^{3/2}}{n^{1/2}},  \qquad (2.1)
\]
and $n_1$ is usually small compared to $n_2$ and $n_3$. The expected behaviour of $n_1$ is "controlled" by the size of $n_3$ until $m(t) \approx m_0 = n^{3/5}$. Indeed, conditional on $n_3(t)$ satisfying (2.1), we can bound the progress of $n_1(t)$ by a random walk on $N$ where the particle's expected distance from the origin is $O(m(t)/n_3(t))$ for $m(t) \ge m_0$. Thus from Claim 2.2 we see that the expected number of isolated vertices created before $m \le m_0$ is $O(\sum_{m(t)\ge m_0} n_1(t)/m(t))$. Hence the contribution to $E[\Phi(G_n)]$ before $m(t) \le m_0$ is
\[
  O\Big(\sum_{m(t)\ge m_0}\frac{n^{1/2}}{m(t)^{3/2}}\Big) = O(n^{1/5}).  \qquad (2.2)
\]
When $m < m_0$, we are near the end of the process and it is good enough for our purposes to bound $n_1$ by a symmetric random walk on $Z$ (the set of integers) in which the particle moves up or down by one when it moves, and stays where it is with probability $1 - O(n_3/m)$. Applying standard results on random walks, we obtain that for $m < m_0$, $E[n_1] = O(n^{1/5})$. Claim 2.2 now gives that the contribution to $E[\Phi(G_n)]$ after $m(t) \le m_0$ is $O(n^{1/5}\ln n)$, which together with (2.2) yields the upper bound in the theorem.
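For concreteness, the estimate (2.2) can be obtained as follows (a sketch; we treat the sum as running over the values taken by $m(t)$ and substitute (2.1) and Claim 2.2):
\[
  \sum_{m(t)\ge m_0} E\Big[\frac{n_1(t)}{m(t)}\Big]
  = \sum_{m(t)\ge m_0} O\Big(\frac{1}{n_3(t)}\Big)
  = O\Big(\sum_{m=m_0}^{3n/2}\frac{n^{1/2}}{(2m/3)^{3/2}}\Big)
  = O\big(n^{1/2} m_0^{-1/2}\big)
  = O\big(n^{1/2-3/10}\big) = O(n^{1/5}).
\]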
To prove the lower bound we consider only those times when $An^{3/5} \ge m \ge Bn^{3/5}$ for some large constants $A$, $B$. We use coupling arguments to bound $n_1(t)$ from below by a random walk, and to estimate the probability that this walk "reaches" its steady state before too long. We then deduce that for some constants $B_1 < A_1$ we have $E[n_1(t)] = \Omega(n^{1/5})$ for all $m$ between $B_1 n^{3/5}$ and $A_1 n^{3/5}$. This is enough to give the lower bound in the theorem.

We shall establish Claim 2.1 in the next section. The transition probabilities of the chain are then described in §4. In order to bound $E[n_1/m]$ it is necessary to study the behaviour of $n_3$, which is done in §5. The upper and lower bounds in Theorem 1.1 are proved in §§6 and 7 respectively.

3  The Graph Chain

We next establish Claim 2.1 by using induction on $t$. Since $G(0) = G_n$, which is a random member of $\mathcal{G}(0,0,n)$, we need only establish that for $x, y \in N^3$ and for $G \in \mathcal{G}(y_1,y_2,y_3)$, the number of triples in
\[
  I_G = \{(H,v,w) : H \in \mathcal{G}(x_1,x_2,x_3),\ v \text{ is of minimum degree in } H,\ w \text{ is a neighbour of } v \text{ and } G = H \setminus \{v,w\}\}
\]
depends only on $x, y$. But to construct a triple in $I_G$, we

(a) choose $v, w \in [n] \setminus V(G)$;

(b) choose $v_1, v_2, \dots, v_\ell \in [n] \setminus (V(G) \cup \{v,w\})$, where $\ell = (x_1+x_2+x_3) - (y_1+y_2+y_3) - 2$ is the number of new isolated vertices created in going from $x$ to $y$;

(c) add appropriate edges incident with $v, w, v_1, v_2, \dots, v_\ell$ to make a graph in $\mathcal{G}(x_1,x_2,x_3)$.

The number of choices in (a), (b) (trivially) depends only on $x, y$ and the same is true for the number of choices in (c), which is fixed once the degree sequence of $G$ is fixed. (Just observe that if one chooses $A$ as the set of edges in (c) and changes $G$ without changing the degree sequence then $A$ remains a valid choice.) This verifies Claim 2.1.

4  The Degree Chain

In this section we derive the transition probabilities of the chain $\mathcal{M}$ whose state at time $t$ is $n(t) = (n_1(t), n_2(t), n_3(t))$. We use the notation
\[
  p[x:y] = \Pr(n(t+1) = y \mid n(t) = x).
\]
We shall require the configuration model of Bollobás [2], which is a simple and useful description of that used by Bender and Canfield [1].

Suppose we are given a degree sequence $1 \le d_1, d_2, \dots, d_\nu$. Let $W_i = \{i\} \times [d_i]$ for $i \in [\nu]$ and $W = \bigcup_i W_i$. A configuration is a partition of $W$ into $\mu = \sum_i d_i/2$ pairs. Let $\Omega$ be the set of all configurations, and let $F$ be chosen uniformly from $\Omega$. Then let $\gamma(F)$ be the multigraph with vertex set $[\nu]$ and edges $\{i,j\}$ for each occurrence of $\{(i,x),(j,y)\} \in F$ for some $x \in [d_i]$, $y \in [d_j]$.

The properties that we need of this model are that: first, conditional on $\gamma(F)$ being simple, it is equally likely to be any simple graph with the given degree sequence; second, if $\max_i d_i$ is an absolute constant (here of course $\max_i d_i = 3$ would suffice), then
\[
  \Pr(\gamma(F) \text{ is simple}) = (1 + O(1/\mu))\exp\Big(-\frac{\lambda}{2} - \frac{\lambda^2}{4}\Big),  \qquad (4.1)
\]
where $\lambda = \frac{1}{2\mu}\sum_i d_i(d_i-1)$. (Note that $E[\Phi(G_n)] = O(E[\Phi(\gamma(F_n))])$, where $F_n$ is chosen uniformly at random from the set of configurations with $d_i = 3$ for all $i \in [n]$. Thus for proving the upper bound in Theorem 1.1 we need only consider this multigraph distribution together with a corresponding multigraph version of Claim 2.1.)

For $x \in N^3$, let $\nu = x_1 + x_2 + x_3$ and $\mu = (x_1 + 2x_2 + 3x_3)/2$. Also, without loss of generality, assume that $d_i = 1$ for $1 \le i \le x_1$; $d_i = 2$ for $x_1 < i \le x_1 + x_2$; and $d_i = 3$ for $x_1 + x_2 < i$. Now choose $F$ randomly from $\Omega$ and apply one step of Algorithm MINGREEDY to $\gamma(F)$. For $y \in N^3$, let $E_y$ be the event that the multigraph remaining has $y_i$ vertices of degree $i$ ($i = 1,2,3$).
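The configuration model just described is easy to simulate. The following Python sketch (an illustration, not part of the proof) samples $\gamma(F)$ for a given degree sequence by pairing up the points of $W$ uniformly at random, and empirically checks the simplicity probability against (4.1); for the cubic sequence, $\lambda = 2$ and the limit is $e^{-2}$.

```python
import random
import math

def random_configuration_multigraph(degrees):
    """Sample gamma(F): pair the 'points' of W uniformly at random and
    return the multigraph as a list of edges on vertices 0..nu-1."""
    points = [v for v, d in enumerate(degrees) for _ in range(d)]
    assert len(points) % 2 == 0, "sum of degrees must be even"
    random.shuffle(points)  # a uniformly random perfect pairing of the points
    return [(points[i], points[i + 1]) for i in range(0, len(points), 2)]

def is_simple(edges):
    """True iff the multigraph has no loops and no repeated edges."""
    seen = set()
    for u, v in edges:
        if u == v or (min(u, v), max(u, v)) in seen:
            return False
        seen.add((min(u, v), max(u, v)))
    return True

if __name__ == "__main__":
    degrees = [3] * 200
    trials = 2000
    simple = sum(is_simple(random_configuration_multigraph(degrees))
                 for _ in range(trials))
    print(simple / trials, math.exp(-2))  # the two numbers should be close
```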
We shall first prove that
\[
  p[x:y] = \Pr(E_y) + O(1/\mu).  \qquad (4.2)
\]
Note that from Claim 2.1, all we need to show is that
\[
  \Pr(E_y \mid \gamma(F) \text{ is simple}) = \Pr(E_y) + O(1/\mu).  \qquad (4.3)
\]
But
\[
  \Pr(E_y \mid \gamma(F) \text{ is simple}) = \frac{\Pr(\gamma(F) \text{ is simple} \mid E_y)\Pr(E_y)}{\Pr(\gamma(F) \text{ is simple})}
\]
and so we need only show
\[
  \Pr(\gamma(F) \text{ is simple} \mid E_y) = \Pr(\gamma(F) \text{ is simple}) + O(1/\mu)
\]
or
\[
  \Pr(\gamma(F) \text{ is simple} \mid D) = \Pr(\gamma(F) \text{ is simple}) + O(1/\mu)  \qquad (4.4)
\]
where $D$ is the set of pairs of points deleted from $F$ in one step and $D$ does not contain loops nor multiple edges. But $|D| \le 5$ and (4.4) follows easily from (4.1).

We next write down the transition probabilities $p[x:y]$. Now many of the transition probabilities are small (i.e. $O(1/m)$) and do not have a significant effect on our analysis. We will only give transition probabilities up to this level of accuracy. Suppose that $n(t) = x$ and thus $2m = x_1 + 2x_2 + 3x_3$. Let
\[
  p_i = \frac{i x_i}{2m} \quad \text{for } i = 1,2,3.
\]
The following table gives the significant transitions. There is an $O(1/m)$ term to be added to each probability. Any transitions not mentioned will have total probability $O(1/m)$ of occurring. (These are associated with triangles close to the chosen edge.) The probabilities in the table are only accurate for $m$ sufficiently large. (Once $m$ is small the remainder of the process cannot affect the number of uncovered vertices very much.) In Table 2 below, the first row corresponds to the initial state, the next seven rows correspond to cases where there are no vertices of degree one, and the last ten rows cover the cases where there are vertices of degree one.

These probabilities are derived by using (4.2). For example, consider the transition from $(x_1,x_2,x_3)$ to $(x_1-2, x_2+1, x_3-2)$. Here $v$ in Step A is of degree 1 and $u$ is of degree 3, which accounts for one factor $p_3$ in the transition probability. The other 2 neighbours of $u$ are of degree 1 and 3 respectively. This accounts for the factor $2p_1p_3$. The remaining probabilities can be checked in the same manner.

         x                    y                              p[x:y]
    1   (0, 0, x_3)          (0, 4, x_3 - 6)                 1
    2   (0, x_2, x_3)        (0, x_2 + 2, x_3 - 4)           p_3^4
    3   (0, x_2, x_3)        (1, x_2, x_3 - 3)               3 p_2 p_3^3
    4   (0, x_2, x_3)        (2, x_2 - 2, x_3 - 2)           3 p_2^2 p_3^2
    5   (0, x_2, x_3)        (3, x_2 - 4, x_3 - 1)           p_2^3 p_3
    6   (0, x_2, x_3)        (0, x_2, x_3 - 2)               p_2 p_3^2
    7   (0, x_2, x_3)        (1, x_2 - 2, x_3 - 1)           2 p_2^2 p_3
    8   (0, x_2, x_3)        (2, x_2 - 4, x_3)               p_2^3
    9   (x_1, x_2, x_3)      (x_1 - 1, x_2 + 2, x_3 - 3)     p_3^3
   10   (x_1, x_2, x_3)      (x_1, x_2, x_3 - 2)             2 p_2 p_3^2
   11   (x_1, x_2, x_3)      (x_1 + 1, x_2 - 2, x_3 - 1)     p_2^2 p_3
   12   (x_1, x_2, x_3)      (x_1 - 1, x_2 - 1, x_3 - 1)     2 p_1 p_2 p_3
   13   (x_1, x_2, x_3)      (x_1 - 2, x_2 + 1, x_3 - 2)     2 p_1 p_3^2
   14   (x_1, x_2, x_3)      (x_1 - 3, x_2, x_3 - 1)         p_1^2 p_3
   15   (x_1, x_2, x_3)      (x_1 - 1, x_2, x_3 - 1)         p_2 p_3
   16   (x_1, x_2, x_3)      (x_1, x_2 - 2, x_3)             p_2^2
   17   (x_1, x_2, x_3)      (x_1 - 2, x_2 - 1, x_3)         p_1 p_2
   18   (x_1, x_2, x_3)      (x_1 - 2, x_2, x_3)             p_1

Table 2: Transition probabilities for the chain $\mathcal{M}$

We continue by computing the expected number of new isolated vertices created in a single transition (conditional on the present state of $\mathcal{M}$ being $x$). Denote this by $\bar\zeta(x)$. Except in rare cases (i.e. of probability $O(1/m)$) isolated vertices are produced only when $x_1 > 0$. The following table gives the number created in each case. (The case numbers refer to the row numbers of Table 2.)

   case:       9   10   11   12   13   14   15   16   17   18
   isolated:   0    0    0    1    1    2    0    0    1    0

Table 3: Number of isolated vertices created in transitions of $\mathcal{M}$

Observe that, with probability 1, the number of isolated vertices created in each transition is $O(1)$.
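Since Tables 2 and 3 are used next to compute $\bar\zeta(x)$, here is a small Python check (a sketch, using exact arithmetic) that summing probability times count over rows 12, 13, 14 and 17 gives the closed form $p_1(p_2 + 2p_3)$ derived below; the $O(1/m)$ corrections are ignored.

```python
from fractions import Fraction

def zeta_bar(x1, x2, x3):
    """Expected new isolated vertices per transition, from Tables 2 and 3."""
    two_m = x1 + 2 * x2 + 3 * x3
    p1, p2, p3 = (Fraction(i * x, two_m) for i, x in ((1, x1), (2, x2), (3, x3)))
    return (2 * p1 * p2 * p3 * 1      # row 12: one isolated vertex
            + 2 * p1 * p3 ** 2 * 1    # row 13: one isolated vertex
            + p1 ** 2 * p3 * 2        # row 14: two isolated vertices
            + p1 * p2 * 1)            # row 17: one isolated vertex

def closed_form(x1, x2, x3):
    two_m = x1 + 2 * x2 + 3 * x3
    p1, p2, p3 = (Fraction(i * x, two_m) for i, x in ((1, x1), (2, x2), (3, x3)))
    return p1 * (p2 + 2 * p3)

if __name__ == "__main__":
    for state in [(5, 40, 100), (1, 7, 3), (10, 0, 50)]:
        assert zeta_bar(*state) == closed_form(*state)
    print("zeta_bar equals p1*(p2 + 2*p3) on the sampled states")
```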
Hence, we obtain from Table 3 that
\[
  \bar\zeta(x) = 2p_1p_2p_3 + 2p_1p_3^2 + 2p_1^2p_3 + p_1p_2 + O(1/m)
             = p_1(p_2 + 2p_3) + O(1/m)
             = p_1 + p_1(p_3 - p_1) + O(1/m),
\]
and hence that
\[
  E[\Phi(G_n)] \le 2\sum_{t\ge0}E\Big[\frac{n_1(t)}{m(t)}\Big] + O\Big(\sum_{t\ge0}\frac{1}{m(t)}\Big)
              = 2\sum_{t\ge0}E\Big[\frac{n_1(t)}{m(t)}\Big] + O(\ln n).  \qquad (4.5)
\]
In studying the lower bound, we will show, for suitable $t_0 < t_1$, that $p_1$ is negligible when compared with $p_3$ whenever $t$ is between $t_0$ and $t_1$. This gives
\[
  E[\Phi(G_n)] \ge (1-o(1))\sum_{t=t_0}^{t_1}E\Big[\frac{n_1(t)}{2m(t)}\Big] - O(\ln n).  \qquad (4.6)
\]

5  Behaviour of $n_3$

Note that for each edge $\{u,v\}$ removed by MINGREEDY, one of the end-points, say $u$, is picked from the vertices of minimal (but non-zero) degree, or $u$ is determined from a previous edge removal. The other end-point $v$ is chosen randomly from the neighbours of $u$. Now since almost all cubic graphs are connected, each decrease in $n_3$, except for the first edge removal, is accounted for exactly once as the end-point $v$ whenever $v$ is of degree 3. (Here, we regard $n_3$ as a function of the number $m$ of edges in the current graph.) Now consider the edge removal when the current graph has $n_3$ vertices of degree 3 and $m$ edges. The probability that the end-point $v$ in the edge removed is of degree 3 equals $(3n_3/2m)(1+O(1/m))$ (from applying arguments similar to those used in showing (4.2)). Thus the rate of change in $n_3$ with respect to $m$ should be approximately
\[
  \frac{dn_3}{dm} \approx \frac{3n_3}{2m},
\]
which gives the approximation stated in (2.1). These ideas are made rigorous by the following lemma.

Lemma 5.1  For any fixed $\epsilon > 0$,
\[
  \Pr\bigg(\exists t \text{ such that } m(t) \ge n^{1/2}\ln^3 n \text{ and } \Big|\frac{n_3\sqrt n}{(2m/3)^{3/2}} - 1\Big| \ge \epsilon\bigg) = O(n^{-2}).
\]

Proof.  We note first that MINGREEDY destroys edges of $G$ sequentially. Let $h = \lfloor n^{1/4}\rfloor$, and for $i = 0,1,2,\dots$, define
\[
  m_i = \frac{3n}{2} - ih.
\]
Let $z_i$ be the number of vertices with degree three in $G$ at the first time when $m \le m_i$, and let $E_i$ be the event that
\[
  z_i = \frac{1}{\sqrt n}\Big(\frac{2m_i}{3}\Big)^{3/2}\bigg(1 + O\Big(\sum_{j=0}^{i}\frac{n^{3/8}\ln^{1/2}n}{m_j^{5/4}}\Big)\bigg).  \qquad (5.1)
\]
We shall show that for $i$ such that $m_i \ge n^{1/2}\ln^3 n$,
\[
  \Pr(\exists j \le i \text{ such that } \bar E_j) = O(i/n^4).  \qquad (5.2)
\]
(All big O terms in this section are uniform over $i$.) Equation (5.2) is trivially true for $i = 0$. Assume as induction hypothesis that equation (5.2) holds for smaller values of $i$. Suppose next that $m_i \ge n^{1/2}\ln^3 n$. We need to show that
\[
  \Pr_1(\bar E_i) = O(1/n^4),  \qquad (5.3)
\]
where $\Pr_1$ is the probability conditional on $F_i = E_0 \cap \dots \cap E_{i-1}$. Let $\Delta z_i = z_{i-1} - z_i$. Recall that for each edge $\{x,y\}$ removed by MINGREEDY, one of the end-points, say $x$, is either chosen randomly from vertices of minimum degree or determined by a previous edge removal, while the other end-point, say $y$, is a degree three vertex with probability $p_3 + O(1/m)$. Then conditional on $F_i$ and for $m_i \le m \le m_{i-1}$, the probability that the vertex $y$ in an edge removal is of degree three is
\[
  p_3 = \frac{3(z_{i-1} + O(h))}{2(m_i + O(h))}
      = z_{i-1}\Big(\frac{3}{2m_i} + O\Big(\frac{h}{z_{i-1}m_i}\Big)\Big)
      = z_{i-1}\Big(\frac{3}{2m_i} + O\Big(\frac{n^{3/4}}{m_i^{5/2}}\Big)\Big).
\]
Thus, if $\Delta'z_i$ is the number of degree three vertices that appear as vertices $y$ in the edge removals during the period when $m$ decreases from $m_{i-1}$ to $m_i$, then using Chernoff bounds,
\[
  \Pr_1\Big(|\Delta'z_i - hp_3| \ge \sqrt{12hp_3\ln n}\Big) = O(1/n^4).
\]
However, $\Delta z_i = \Delta'z_i + \eta$, where $\eta$ denotes the number of degree three vertices that appear as vertices $x$ in the edge removals. Now $\eta$ is at most one plus the number of components of $G_n$. Hence we may assume for all $i$ that $\eta \le 10$, which incurs an error probability of $O(1/n^4)$. Since $hp_3 \to \infty$,
we have that
\[
  \Pr_1\Big(|\Delta z_i - hp_3| \ge \sqrt{12hp_3\ln n}\Big) = O(1/n^4),
\]
giving with conditional probability $1 - O(1/n^4)$ that
\[
  \Delta z_i = z_{i-1}\frac{3h}{2m_i} + O\Big(\frac{n^{3/4}h}{m_i^{5/2}}\Big)z_{i-1} + O\Big(\sqrt{hp_3\ln n}\Big)
            = z_{i-1}\Big(\frac{3h}{2m_i} + O\Big(\frac{n^{3/8}\ln^{1/2}n}{m_i^{5/4}}\Big)\Big).
\]
As $z_i = z_{i-1} - \Delta z_i$, we have with conditional probability at least $1 - O(1/n^4)$,
\[
\begin{aligned}
  z_i &= z_{i-1}\Big(1 - \frac{3h}{2m_i} + O\Big(\frac{n^{3/8}\ln^{1/2}n}{m_i^{5/4}}\Big)\Big) \\
      &= z_{i-1}\Big(\frac{m_i}{m_{i-1}}\Big)^{3/2}\Big(1 + O\Big(\frac{n^{3/8}\ln^{1/2}n}{m_i^{5/4}}\Big)\Big) \\
      &= \frac{1}{\sqrt n}\Big(\frac{2m_i}{3}\Big)^{3/2}\bigg(1 + O\Big(\sum_{j=0}^{i}\frac{n^{3/8}\ln^{1/2}n}{m_j^{5/4}}\Big)\bigg).
\end{aligned}
\]
This completes our proof of (5.3) and hence of (5.2).

Next, we note that for $m_i \ge n^{1/2}\ln^3 n$, $i$ is at most $N^* = \lfloor (\frac{3n}{2} - n^{1/2}\ln^3 n)/h\rfloor$ and that $\frac{3n}{2} - hN^* = n^{1/2}\ln^3 n + O(h)$. Hence the error term on the right hand side of (5.1) is at most
\[
  \sum_{j=0}^{N^*}\frac{n^{3/8}\ln^{1/2}n}{m_j^{5/4}}
  = O\Big(\frac{n^{3/8}\ln^{1/2}n}{h}\,\big(n^{1/2}\ln^3 n\big)^{-1/4}\Big)
  = O\big(\ln^{-1/4}n\big) = o(1).
\]
This shows that for any fixed $\epsilon > 0$,
\[
  \Pr\bigg(\exists i \text{ such that } m = m_i \ge n^{1/2}\ln^3 n \text{ and } \Big|\frac{n_3\sqrt n}{(2m/3)^{3/2}} - 1\Big| \ge \epsilon\bigg) = O(1/n^2).
\]
Similar arguments can be used to complete our proof of the lemma.  □

6  An Upper Bound for $E[n_1]$

In this and the following sections, we would like to estimate $E[n_1]$ when the current $G$ has $m$ edges. Let $t_m$ be the first time when $m(t) \le m$. Note that $m - m(t_m)$ is bounded by an absolute constant. The big O and small o terms in this and the next sections are uniform in $m$. We find an upper bound for $E[n_1(t_m)]$ in this section.

From Table 2, we have the following transition probabilities for $n_1(t)$ (where $\Delta n_1 = \Delta n_1(t) = n_1(t+1) - n_1(t)$ and $\Pr_1$ is the probability conditional on $n_1(t)$, $n_2(t)$, $n_3(t)$). When $n_1 = 0$ and $n_2 \ne 0$,
\[
  \Pr_1(\Delta n_1 = i) = \begin{cases} \binom{2}{i}p_2^{i+1}p_3^{2-i} + \binom{3}{i}p_2^{i}p_3^{4-i} + O(1/m), & i = 0,1,2,3; \\ 0, & \text{otherwise}; \end{cases}
\]
and when $n_1 \ne 0$,
\[
\begin{aligned}
  \Pr_1(\Delta n_1 = 1)  &= p_2^2p_3 + O(1/m), \\
  \Pr_1(\Delta n_1 = 0)  &= 2p_2p_3^2 + p_2^2 + O(1/m), \\
  \Pr_1(\Delta n_1 = -1) &= p_3^3 + 2p_1p_2p_3 + p_2p_3 + O(1/m), \\
  \Pr_1(\Delta n_1 = -2) &= p_1p_2 + 2p_1p_3^2 + p_1 + O(1/m), \\
  \Pr_1(\Delta n_1 = -3) &= p_1^2p_3 + O(1/m), \\
  \Pr_1(\Delta n_1 = i)  &= 0, \quad \text{if } i \ne 1,0,-1,-2,-3.
\end{aligned}
\]
Note that when both $n_1 = 0$ and $n_2 = 0$, we have
\[
  \Pr_1(\Delta n_1 \ne 0) = O(1/m), \qquad \Pr_1(\Delta n_1 \le 3) = 1.
\]
In particular, we may assume that no vertex of degree 1 is created in the first iteration of MINGREEDY; that is, $n_1(1) = 0$. Observe also that the $O(1/m)$ terms in the transition probabilities depend only on the current state of $\mathcal{M}$.

Let
\[
  \beta(x) = x - 2x^2 + x^3 = (1-x)^2x, \qquad \alpha(x) = x - x^2 + x^3 = x^3 + (1-x)x.
\]
We next consider a process $X(t)$ with initial state $X(1) = 0$ and transition probabilities defined below:
\[
\begin{aligned}
  \Pr_1(\Delta X = 1)  &= \beta(p_3) + O_1(1/m), \\
  \Pr_1(\Delta X = 0)  &= 1 - (\beta(p_3) + O_1(1/m)) - (\alpha(p_3) + O_2(1/m))\chi(X), \\
  \Pr_1(\Delta X = -1) &= (\alpha(p_3) + O_2(1/m))\chi(X), \\
  \Pr_1(\Delta X = i)  &= 0, \quad \text{if } i \ne 1,0,-1,
\end{aligned}
\]
where $\chi(x) = 1$ if $x > 0$ and zero otherwise. Note that when $n_1(t) = 0$, $\Pr(n_1(t+1) \le 3) = 1$, and that when $n_1(t) \ne 0$, we have (ignoring the $O(1/m)$ terms)
\[
  \Pr_1(\Delta n_1 = 1) \le \beta(p_3), \qquad \Pr_1(\Delta n_1 \le -1) \ge \alpha(p_3).
\]
Thus, the big O terms $O_1(1/m)$ and $O_2(1/m)$ in the transition probabilities of the process $X(t)$ can be properly defined so that when $n_1(t) \ne 0$,
\[
  \Pr_1(\Delta n_1 = 1) \le \Pr_1(\Delta X = 1), \qquad \Pr_1(\Delta n_1 \le -1) \ge \Pr_1(\Delta X = -1).
\]
It follows that $n_1$ and $X$ can be coupled so that $n_1$ is stochastically at most $X + 3$ for all $t$.
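The coupling above rests on the two inequalities $\Pr_1(\Delta n_1 = 1) \le \beta(p_3)$ and $\Pr_1(\Delta n_1 \le -1) \ge \alpha(p_3)$. The following Python sketch (an illustration only; it ignores the $O(1/m)$ corrections and simply sweeps a grid of admissible vectors $(p_1,p_2,p_3)$ with $p_1 + p_2 + p_3 = 1$) confirms them numerically.

```python
def beta(x):
    return (1 - x) ** 2 * x

def alpha(x):
    return x ** 3 + (1 - x) * x

steps = 200
ok = True
for a in range(steps + 1):
    for b in range(steps + 1 - a):
        p1, p2 = a / steps, b / steps
        p3 = 1 - p1 - p2
        up = p2 ** 2 * p3                                    # Pr(dn1 = +1)
        down = (p3 ** 3 + 2 * p1 * p2 * p3 + p2 * p3         # Pr(dn1 = -1)
                + p1 * p2 + 2 * p1 * p3 ** 2 + p1            # Pr(dn1 = -2)
                + p1 ** 2 * p3)                              # Pr(dn1 = -3)
        ok &= (up <= beta(p3) + 1e-12) and (down >= alpha(p3) - 1e-12)
print("inequalities hold on the grid:", ok)
```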
Next, let $X_2(t)$ be a process, with $X_2(1) = X(1)$, having the same transition probabilities, but without the big O terms, as those of $X(t)$. Note that the big O terms in the transition probabilities of $X(t)$ can only affect, with probability $1 - O(n^{-A})$ for any $A > 0$, at most $O(\ln n)$ transitions in $O(n)$ transitions of $X$. Now at each time $t$, conditional on $X$ having taken a transition not affected by the big O terms, we may couple $X$ and $X_2$ so that $|X(t) - X_2(t)| \le |X(t-1) - X_2(t-1)|$ (because of the reflecting barrier at the origin). It therefore follows that for all $t = O(n)$,
\[
  E[X(t)] = E[X_2(t)] + O(\ln n).
\]
Thus, $E[n_1] \le E[X_2] + O(\ln n)$. Since we have a fairly accurate estimate of $n_3$ as a function of $m$ (and hence an estimate of $p_3 = 3n_3/2m$), we proceed to find a bound for $X_2$ and hence one for $E[n_1]$.

Lemma 6.1  For $m \ge n^{1/2}\ln^3 n$ and for any fixed $\epsilon > 0$,
\[
  E[n_1(t_m)] \le \frac{(1-p)^2}{p} + O(\ln n),
\]
where $p = (1-\epsilon)(2m/3n)^{1/2}$.

Proof.  Consider a random walk $X'$ with transition probabilities as follows. We write $\beta' = \beta(p)$, $\alpha' = \alpha(p)$:
\[
  \Pr(\Delta X' = 1) = \beta', \qquad \Pr(\Delta X' = 0) = 1 - \beta' - \alpha'\chi(X'), \qquad \Pr(\Delta X' = -1) = \alpha'\chi(X').
\]
Note that the steady state distribution $\pi'$ of $X'(t)$ satisfies the following equations:
\[
\begin{aligned}
  \pi'_0 &= (1-\beta')\pi'_0 + \alpha'\pi'_1, \\
  \pi'_i &= \beta'\pi'_{i-1} + (1-\beta'-\alpha')\pi'_i + \alpha'\pi'_{i+1}, \qquad i \ge 1, \\
  \pi'_i &= 0, \qquad i < 0.
\end{aligned}
\]
These can be solved, giving
\[
  \pi'_i = 0, \quad i < 0; \qquad \pi'_i = \Big(1 - \frac{\beta'}{\alpha'}\Big)\Big(\frac{\beta'}{\alpha'}\Big)^i, \quad i \ge 0,  \qquad (6.1)
\]
with expectation $\sum_i i\pi'_i = \beta'/(\alpha'-\beta') = (1-p)^2/p$. We shall assume that $X'$ starts with its steady state distribution $\pi'$, and hence the distribution of $X'(t)$ equals $\pi'$ always.

We shall show later that if for all $t \le t_m$
\[
  p_3(t) \ge (1-\epsilon)\Big(\frac{2m(t)}{3n}\Big)^{1/2} \ge p,  \qquad (6.2)
\]
then for all $t \le t_m$ and for large $n$,
\[
  X_2(t) \preceq X'(t) \quad \text{in distribution.}  \qquad (6.3)
\]
It follows from Lemma 5.1 that (6.2) holds with probability $1 - O(n^{-2})$, and since $n_1 \le n$, we have
\[
  E[n_1(t)] \le E_2[n_1(t)] + O(1/n) \le E_2[X_2(t)] + O(\ln n) \le E[X'(t)] + O(\ln n) = \frac{(1-p)^2}{p} + O(\ln n),
\]
where $E_2$ is the expectation conditional on $p_3$ satisfying (6.2). This proves the lemma.

We shall use induction to prove (6.3). This is trivial when $t = 1$. Assuming it is true for $t$, we shall bound the distribution of $X_2(t+1)$. Define a random variable $Y = X'(t) + \Delta Y$, where the distribution of $\Delta Y$ is as follows. Writing $p_3 = p_3(t)$, define
\[
  \Pr(\Delta Y = 1) = \beta(p_3), \qquad \Pr(\Delta Y = 0) = 1 - \beta(p_3) - \alpha(p_3)\chi(X'), \qquad \Pr(\Delta Y = -1) = \alpha(p_3)\chi(X').
\]
We shall show that
\[
  X_2(t+1) \preceq Y \preceq X'(t+1)  \qquad (6.4)
\]
holds in distribution. The first inequality comes from the fact that we can couple $X_2(t)$ and $X'(t)$ so that $X_2(t) \le X'(t)$ (this uses the induction hypothesis). For the case where $X_2(t) = X'(t)$, we see that $X_2(t+1) \preceq Y$ in distribution; for the case $X_2(t) < X'(t)$, we use the fact that $X_2(t+1)$ and $Y$ can be coupled so that $X_2(t+1) \le Y$. The first inequality thus holds for both cases. To show the second inequality, we consider the distribution $\sigma$ of $Y$. Writing $\beta = \beta(p_3(t))$, $\alpha = \alpha(p_3(t))$, $\beta' = \beta(p)$, $\alpha' = \alpha(p)$, we have the following equations:
\[
\begin{aligned}
  \sigma_0 &= (1-\beta)\pi'_0 + \alpha\pi'_1, \\
  \sigma_i &= \beta\pi'_{i-1} + (1-\beta-\alpha)\pi'_i + \alpha\pi'_{i+1}, \qquad i \ge 1, \\
  \sigma_i &= 0, \qquad i < 0.
\end{aligned}
\]
Hence for $i \ge 1$,
\[
  \sigma_i = \pi'_i\Big(\beta\frac{\alpha'}{\beta'} + 1 - \beta - \alpha + \alpha\frac{\beta'}{\alpha'}\Big)
           = \pi'_i\Big(1 + (\alpha'-\beta')\Big(\frac{\beta}{\beta'} - \frac{\alpha}{\alpha'}\Big)\Big).  \qquad (6.5)
\]
Note that the functions $\alpha(x)$ and $\beta(x)$ are such that for all $x \in [0,1]$,
\[
  \alpha(x) - \beta(x) = x^2 \ge 0,
\]
and
\[
  \frac{d}{dx}\frac{\beta(x)}{\alpha(x)} = -\frac{1-x^2}{(1-x+x^2)^2} \le 0.
\]
It therefore follows from (6.2) and (6.5) that for $i \ge 1$,
\[
  \sigma_i \le \pi'_i,
\]
giving that for $i \ge 1$,
\[
  \sum_{j\ge i}\sigma_j \le \sum_{j\ge i}\pi'_j.
\]
As $\sigma_i = \pi'_i = 0$ for $i < 0$, the distribution $\sigma$ is smaller than $\pi'$. This shows the second inequality in (6.4), and so our proof of the lemma is complete.  □

Lemma 6.1 gives that for $m \ge n^{1/2}\ln^3 n$,
\[
  E[n_1(t_m)] = O((n/m)^{1/2}).
\]
Hence, if $m \ge m_0 = n^{3/5}$, then $E[n_1(t_m)] = O(n^{1/5})$.
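Spelled out (a quick check of the last two displays): with $p = (1-\epsilon)(2m/3n)^{1/2}$,
\[
  \frac{(1-p)^2}{p} \le \frac{1}{p} = \frac{1}{1-\epsilon}\Big(\frac{3n}{2m}\Big)^{1/2} = O\big((n/m)^{1/2}\big),
\]
and at $m = m_0 = n^{3/5}$ this is $O\big((n/n^{3/5})^{1/2}\big) = O(n^{1/5})$.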
The next lemma shows that $E[n_1(t_m)] = O(n^{1/5})$ holds when $m < m_0$ too.

Lemma 6.2  $E[n_1(t_m)] = O(n^{1/5})$ for all $m \le m_0$.

Proof.  As argued in Lemma 6.1, the assertion in this lemma follows if we can show that for all $m < m_0$,
\[
  E[X_2(t_m)] = O(n^{1/5}).  \qquad (6.6)
\]
Given $p_3$, consider a random walk $Z$ with transition probabilities defined as below:
\[
\begin{aligned}
  \Pr(\Delta Z = 1)  &= (\alpha(p_3) + \beta(p_3))(1 - \chi(Z)/2), \\
  \Pr(\Delta Z = 0)  &= 1 - \alpha(p_3) - \beta(p_3), \\
  \Pr(\Delta Z = -1) &= (\alpha(p_3) + \beta(p_3))\chi(Z)/2.
\end{aligned}
\]
Write $t_0 = t_{m_0}$. We consider the processes $X_2$ and $Z$ after time $t_0$. Suppose $Z(t_0) = X_2(t_0)$. As $\alpha(p_3) \ge \beta(p_3)$ for all $p_3 \in [0,1]$, we can couple the processes $X_2(t)$ and $Z(t)$ for all $t \ge t_0$ so that $Z$ is stochastically at least $X_2$. Hence (6.6) follows if
\[
  E[Z(t)] = O(n^{1/5}), \qquad \forall t \ge t_0.  \qquad (6.7)
\]
Since $Z$ is a symmetric random walk with a reflecting barrier at $Z = 0$, the reflection principle shows that
\[
  E[Z(t)] = E[\,|Z'(t)|\,],
\]
where $Z'$ is the same symmetric random walk as $Z$ but without the reflecting barrier. Next, let $Z''$ be a random walk with the same transition probabilities as those of $Z'$ but with $Z''(t_0) = 0$. Then it is clear that $Z'(t) = Z''(t) + Z'(t_0)$ and so
\[
  E[\,|Z'(t)|\,] \le E[\,|Z''(t)|\,] + E[Z'(t_0)] = E[\,|Z''(t)|\,] + E[X_2(t_0)].
\]
Let $M$ be the number of moves that $Z''$ makes after $t_0$ until the end of algorithm MINGREEDY. Then we have
\[
  E[\,|Z''(t)|\,] = O(E[\sqrt M]),
\]
giving that
\[
  E[Z(t)] = O\big(E[\sqrt M] + E[X_2(t_0)]\big).  \qquad (6.8)
\]
We next estimate $M$. Let $m_2 = n^{1/2}\ln^3 n$ and $m_1 = n^{2/5}$. We write
\[
  M = M_1 + M_2 + M_3,
\]
where $M_1$ (respectively $M_2$) is the number of moves that $Z''$ makes when $m$ is between $m_0$ and $m_2$ (respectively between $m_2$ and $m_1$), and $M_3$ is the number of moves when $m \le m_1$. Then clearly
\[
  M_3 \le n^{2/5}.
\]
To estimate $M_1$, note that with error probability $O(1/n^2)$, we can assume that for fixed $\epsilon > 0$, $p_3 \le (1+\epsilon)(2m/3n)^{1/2} = O((2m_0/3n)^{1/2}) = O(n^{-1/5})$ for $m$ between $m_2$ and $m_0$. Thus
\[
  E[M_1] = \sum\big(\alpha(p_3(t)) + \beta(p_3(t))\big) = O(n^{3/5}\,n^{-1/5}) = O(n^{2/5}).
\]
For $M_2$, we note that according to Lemma 5.1, when $m = m_2$, $n_3$ is of order $n^{1/4}\ln^{9/2}n$, and $n_3$ decreases as $m$ decreases. Hence when $m$ is between $m_1$ and $m_2$, we have
\[
  p_3 = O(n^{1/4}\ln^{9/2}n/n^{2/5}) = O(n^{-0.14}),
\]
and so
\[
  E[M_2] = \sum\big(\alpha(p_3(t)) + \beta(p_3(t))\big) = O(n^{1/2}\ln^3 n\,n^{-0.14}) = O(n^{2/5}).
\]
As $M_1$ and $M_2$ are sums of independent Bernoulli variables with probability of success $\alpha(p_3(t)) + \beta(p_3(t))$ (when the sequence $p_3$ is known), they are concentrated near their means. Hence, we see that
\[
  M = O(E[M_1]) + O(E[M_2]) + M_3 = O(n^{2/5})
\]
almost surely. This gives that
\[
  E[\sqrt M] = O(n^{1/5}).
\]
Equation (6.7) now follows from (6.8) and the fact that $E[X_2(t_0)] = O(n^{1/5})$. This completes our proof of the lemma.  □

We are now in a position to prove the upper bound in the theorem. Using (4.5), we have
\[
\begin{aligned}
  E[\Phi(G_n)] &\le 2\sum_{t\ge0}E[n_1(t)/m(t)] + O(\ln n) \\
  &\le 2\sum_{m=1}^{3n/2}E[n_1(t_m)/m] + O(\ln n) \\
  &= 2\sum_{m=1}^{3n/2}E[n_1(t_m)]/m + O(\ln n) \\
  &= O(n^{1/5})\sum_{m=1}^{3n/2}1/m + O(\ln n) = O(n^{1/5}\ln n).
\end{aligned}
\]

7  A Lower Bound for $E[n_1]$

Let $B \ge 3000$ be a constant, and let $m_B = Bn^{3/5}$. Write $t_B$ for the first time at which $m(t) \le m_B$. Fix a constant $A > B$ and a small constant $\epsilon > 0$. (The numbers $A$, $\epsilon$ will be chosen later.) Let $m_A = An^{3/5}$. Define $t_A$ as the time immediately after $m \le m_A$. Hence $t_A = t_{m_A}$ and $t_B = t_{m_B}$. Let $N = \lfloor n^{1/5}\rfloor$ and $r = (n^{1/5}+3)/m_B \approx n^{-2/5}/B$. We shall find a lower bound for $E[n_1(t_B)]$ by considering a random walk with reflecting barriers at 0 and $N$ as follows.

Suppose for now that the sequence $p_3(t)$, $t \in [t_A,t_B]$, is known.
Consider a random walk $W(t)$, $t \in [t_A,t_B]$, with initial state $W(t_A) = 0$ and transition probabilities given below. With the functions $\alpha$, $\beta$ and $\chi$ as defined before and writing $\beta = \beta(p_3)$ and $\alpha = \alpha(p_3)$,
\[
  \Pr(\Delta W = 1) = \beta\chi_N(W), \qquad \Pr(\Delta W = -1) = \alpha\chi(W), \qquad \Pr(\Delta W = 0) = 1 - \beta\chi_N(W) - \alpha\chi(W),
\]
where $\chi_N(x) = 1$ if $x < N$ and zero otherwise. Associated with $W$ is a process $U$ where $U(t_A) = 0$ and $U$ is decreased by 3 with probability $9r$ at each step. We claim that if the processes $n_1$, $W$ and $U$ are "suitably" defined in a common probability space, then
\[
  n_1(t) \ge W(t) + U(t), \qquad t \in [t_A,t_B].  \qquad (7.1)
\]
Equation (7.1) holds for $t = t_A$ trivially. Assume that (7.1) holds for $t$. We want to show that it holds for $t+1$ also. Now if $n_1(t) \ge N+3$, then (7.1) holds for $t+1$ as $W \le N$ and $\Delta n_1 \ge -3$ always. Thus, we assume that $n_1(t) < N+3$ and would like to show next that if $W$ and $U$ are properly defined, then $\Delta n_1 \ge \Delta W + \Delta U$. Note first that if $n_1(t) < N+3$, then $p_1 \le r$ from our definition of $r$. From the transition probabilities of $n_1$, we see that (ignoring the $O(1/m)$ terms)

Case 1: $\Delta n_1 = 1$ with probability at least $p_2^2p_3 \ge \beta(p_3) - 2p_1 \ge \beta(p_3) - 2r$;

Case 2: $\Delta n_1 = -1$ with probability at most $\alpha(p_3) + 2p_1 \le \alpha(p_3) + 2r$;

Case 3: $\Delta n_1 = -2$ with probability at most $4p_1 \le 4r$ and $\Delta n_1 = -3$ with probability at most $p_1 \le r$.

Thus, very crudely, we see that $n_1$ and $W$ can be coupled so that $\Delta n_1 \ge \Delta W$ with probability at least $1 - 9r$ and $\Delta n_1 \ge \Delta W - 3$ always. (The big O terms in the transition probabilities of $n_1$ equal $O(n^{-3/5})$ and are negligible when compared with $r = O(n^{-2/5})$.) This establishes (7.1). Note that
\[
  E[U(t_B)] = -3(t_B - t_A)(9r)(1+o(1)) \ge -27r(A-B)n^{3/5}(1+o(1)) \ge -\frac{27(A-B)}{B}n^{1/5}(1+o(1)),
\]
and so
\[
  E[n_1(t_B)] \ge E[W(t_B)] - \frac{27(A-B)}{B}n^{1/5}(1+o(1)).  \qquad (7.2)
\]

We next bound $E[W(t_B)]$. Let $D$ be a probability distribution with support $\{0,1,\dots,N\}$, and let $W_D(t)$ be a random walk with the same transition probabilities as those of $W(t)$, but $W_D(t_A)$ has distribution $D$. Since $W(t_A) = 0$, $W$ and $W_D$ can be coupled so that $W_D(t) \ge W(t)$ for all $t \ge t_A$ and so that $W_D(t) - W(t)$ is non-increasing in $t$. Let $T$ be the first time that $W(t) = W_D(t)$. Then
\[
  W(t_B) \ge W_D(t_B)\,\chi(\{T \le t_B\}),
\]
where $\chi(\{T \le t_B\})$ is the indicator function for the event $\{T \le t_B\}$. Since $W_D \le N$, we have
\[
  E[W(t_B)] \ge E[W_D(t_B)] - N\Pr(T > t_B).
\]
Let $T'$ be the first time that $W_D = 0$. Then clearly $T \le T'$ in distribution, as $W_D(t) \ge W(t) \ge 0$. Thus,
\[
  E[W(t_B)] \ge E[W_D(t_B)] - N\Pr(T' > t_B).  \qquad (7.3)
\]
According to Lemma 5.1, with probability $1 - O(n^{-2})$, we have that for $t$ between $t_A$ and $t_B$ and fixed $\epsilon > 0$,
\[
  (1-\epsilon)\Big(\frac{2m_B}{3n}\Big)^{1/2} \le p_3(t) \le (1+\epsilon)\Big(\frac{2m_A}{3n}\Big)^{1/2}.  \qquad (7.4)
\]
The assumption of (7.4) only gives an extra factor of $1 - O(n^{-2})$ to our estimate of $E[n_1]$. Let $q_0 = (1+\epsilon)(2m_A/3n)^{1/2}$ and $q_1 = (1-\epsilon)(2m_B/3n)^{1/2}$. We shall assume that (7.4) is satisfied.

We next define the distribution $D$ and bound $W_D$ suitably. Write $\beta' = \beta(q_0)$ and $\alpha' = \alpha(q_0)$. Consider a random walk $W'$ with transition probabilities given by
\[
  \Pr(\Delta W' = 1) = \beta'\chi_N(W'), \qquad \Pr(\Delta W' = -1) = \alpha'\chi(W'), \qquad \Pr(\Delta W' = 0) = 1 - \beta'\chi_N(W') - \alpha'\chi(W').
\]
Note that here $\beta'$ and $\alpha'$ no longer depend on $t$ and that the steady state distribution $\pi'$ of $W'$ satisfies
\[
\begin{aligned}
  \pi'_0 &= (1-\beta')\pi'_0 + \alpha'\pi'_1, \\
  \pi'_i &= \beta'\pi'_{i-1} + (1-\beta'-\alpha')\pi'_i + \alpha'\pi'_{i+1}, \qquad i = 1,2,\dots,N-1, \\
  \pi'_N &= (1-\alpha')\pi'_N + \beta'\pi'_{N-1}, \\
  \pi'_i &= 0, \qquad \forall i < 0,
\end{aligned}
\]
from which we obtain
\[
  \pi'_i = \frac{1 - \beta'/\alpha'}{1 - (\beta'/\alpha')^{N+1}}\Big(\frac{\beta'}{\alpha'}\Big)^i, \quad i = 0,1,\dots,N; \qquad \pi'_i = 0, \quad \forall i < 0.  \qquad (7.5)
\]
We assume that $W'(t_A)$ has distribution $\pi'$ and so the distribution of $W'(t)$ equals $\pi'$ always.

Suppose now that $W_D(t)$ starts with distribution $\pi'$ at time $t_A$ (that is, $D = \pi'$). We shall show by induction that $W_D(t)$ is at least $W'(t)$ in distribution.
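Before carrying out the induction, here is a small Python sketch giving intuition about the size of $E[W_D(t)]$ under (7.5) (illustrative parameter values only; $n$, $B$, $A$ and $\epsilon$ below are placeholders chosen so that $q_0 < 1$). It computes the mean of the truncated geometric distribution (7.5) and compares it with the unrestricted mean $\beta'/(\alpha'-\beta') = (1-q_0)^2/q_0$ from Lemma 6.1.

```python
def alpha(x):
    return x ** 3 + (1 - x) * x

def beta(x):
    return (1 - x) ** 2 * x

def truncated_stationary_mean(q, N):
    """Mean of the stationary distribution (7.5) of W' with parameters
    beta' = beta(q), alpha' = alpha(q) and reflecting barriers at 0 and N."""
    rho = beta(q) / alpha(q)                 # = (1-q)^2 / (1 - q + q^2) < 1
    weights = [rho ** i for i in range(N + 1)]
    Z = sum(weights)
    return sum(i * w for i, w in enumerate(weights)) / Z

if __name__ == "__main__":
    n = 10 ** 12                             # large enough that q0 < 1 for B ~ 3600
    B, eps = 3600.0, 0.001
    A = B + B ** 0.5
    N = int(n ** 0.2)
    q0 = (1 + eps) * (2 * A * n ** 0.6 / (3 * n)) ** 0.5
    unrestricted = beta(q0) / (alpha(q0) - beta(q0))   # = (1-q0)^2/q0
    print(truncated_stationary_mean(q0, N), unrestricted, N)
```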
To show this, we follow the method used in showing $X_2 \preceq X'$ in distribution in our proof of Lemma 6.1. Assume $p_3$ satisfies (7.4) and that $W_D(t) \succeq W'(t)$ as our induction hypothesis. Let $Y$ be the state of a process which starts with distribution $\pi'$ (which equals the distribution of $W'(t)$) followed by a one-step transition with transition probabilities as those of $W_D$. We claim that
\[
  W_D(t+1) \succeq Y \succeq W'(t+1).  \qquad (7.6)
\]
The justification for the first inequality here follows similar arguments as those used in showing the first inequality in (6.4). To show the second inequality in (7.6), use $\sigma$ to denote the distribution of $Y$. Writing $\beta = \beta(p_3(t))$ and $\alpha = \alpha(p_3(t))$, we find that $\sigma_i = 0$ for all $i < 0$, and
\[
\begin{aligned}
  \sigma_0 &= (1-\beta)\pi'_0 + \alpha\pi'_1, \\
  \sigma_i &= \beta\pi'_{i-1} + (1-\beta-\alpha)\pi'_i + \alpha\pi'_{i+1}, \qquad i = 1,\dots,N-1, \\
  \sigma_N &= (1-\alpha)\pi'_N + \beta\pi'_{N-1}.
\end{aligned}
\]
Note that, since $q_0 \ge p_3(t)$, we have $\beta'/\alpha' \le \beta/\alpha$. Also
\[
  \sigma_N = \pi'_N + \beta\pi'_{N-1} - \alpha\pi'_N = \pi'_N + \Big(\beta\frac{\alpha'}{\beta'} - \alpha\Big)\pi'_N \ge \pi'_N,
\]
and by following (6.5), we have that for $i = 1,2,\dots,N-1$,
\[
  \sigma_i = \pi'_i\Big(1 + (\alpha'-\beta')\Big(\frac{\beta}{\beta'} - \frac{\alpha}{\alpha'}\Big)\Big) \ge \pi'_i.
\]
As $\pi'_i = \sigma_i = 0$ for $i < 0$, the distribution $\sigma$ is at least $\pi'$. This completes the induction step, and so we conclude that the distribution of $W_D(t)$ is at least $\pi'$. It therefore follows that for $t \in [t_A,t_B]$,
\[
  E[W_D(t)] \ge \sum_i i\pi'_i = \frac{\beta'/\alpha'}{1 - \beta'/\alpha'} - \frac{(N+1)(\beta'/\alpha')^{N+1}}{1 - (\beta'/\alpha')^{N+1}}.
\]
Since $q_0 \le (1+\epsilon)(2A/3)^{1/2}n^{-1/5}$ and $(\beta'/\alpha')^N \le \exp(-q_0N)$, we have
\[
  E[W_D(t)] \ge \sqrt{\frac{3}{2A}}\,\frac{n^{1/5}}{1+\epsilon} - \frac{n^{1/5}\exp(-(1+\epsilon)(2A/3)^{1/2})}{1 - \exp(-(1+\epsilon)(2A/3)^{1/2})}.
\]
By choosing $\epsilon > 0$ such that $(3/2)^{1/2}/(1+\epsilon) \ge 1.22$, we have
\[
  E[W_D(t_B)] \ge 1.22\frac{n^{1/5}}{A^{1/2}} - \frac{n^{1/5}\exp(-A^{1/2}/1.22)}{1 - \exp(-A^{1/2}/1.22)}.  \qquad (7.7)
\]

To estimate $T'$, we write $\beta'' = \beta(q_1)$ and $\alpha'' = \alpha(q_1)$, and let $\pi''$ be the distribution given in (7.5) with $\beta'$, $\alpha'$ replaced with $\beta''$, $\alpha''$ respectively. Let $W''$ be the same random walk as $W'$ but with $\beta'$, $\alpha'$ replaced with $\beta''$, $\alpha''$, and let $W''(t_A)$ have distribution $\pi''$ instead of $\pi'$. Note that as $q_1 \le q_0$, we have $\beta'/\alpha' \le \beta''/\alpha''$, and it is not difficult to check that the distribution $\pi'$ is at most $\pi''$. Hence, our previous argument for showing $W' \preceq W_D$ in distribution can be used to show $W_D \preceq W''$ in distribution. Next, let $W_1$ be the walk whose transition probabilities are as those of $W''$ but with $W_1(t_A) = N$ and without the reflecting barrier at $W_1 = N$. A simple coupling argument gives that $W'' \preceq W_1$ in distribution. Let $T_1$ be the first time that $W_1 = 0$. Then we have
\[
  T' \le T_1 \quad \text{in distribution.}
\]
We next estimate $T_1$. Let $L$ be the time elapsed before $W_1$ first gets to $N-1$. Then we have
\[
  T_1 = L_1 + \dots + L_N + t_A,
\]
where $L_1,\dots,L_N$ are independent variables identically distributed as $L$. Note that (I) with probability $\alpha''$, $L = 1$; (II) with probability $\beta''$, $L = 1 + L' + L''$; (III) with probability $1 - \alpha'' - \beta''$, $L = 1 + L'''$, where $L'$, $L''$, $L'''$ are independent and identically distributed as $L$. Hence, writing $M(z) = E[z^L]$ for the probability generating function of $L$, we have
\[
  M(z) = \alpha''z + \beta''zM(z)^2 + (1-\alpha''-\beta'')zM(z),
\]
giving
\[
  M(z) = \frac{1 - (1-\alpha''-\beta'')z - \sqrt{(1-(1-\alpha''-\beta'')z)^2 - 4\alpha''\beta''z^2}}{2\beta''z}.
\]
Put $z = \exp(\theta q_1^3)$, for some $\theta < 1/4$ to be chosen later.
Then
\[
  M(z) = 1 + \frac{(1-\sqrt{1-4\theta})q_1}{2} + O(q_1^2).
\]
Since each step of MINGREEDY destroys at most 5 edges,
\[
\begin{aligned}
  \Pr(T' \ge t_B) &\le \Pr(T_1 - t_A \ge t_B - t_A) \le \Pr\big(L_1 + \dots + L_N \ge (A-B)n^{3/5}/5\big) \\
  &\le E\big[\exp\big((L_1+\dots+L_N)\ln z - (A-B)n^{3/5}(\ln z)/5\big)\big] \\
  &= M(z)^N\exp\big(-\theta(A-B)q_1^3n^{3/5}/5\big) \\
  &\le \exp\big((1-\sqrt{1-4\theta})Nq_1/2 - \theta(A-B)q_1^3n^{3/5}/5\big) \\
  &\le \exp\big((1-\sqrt{1-4\theta})(1-\epsilon)(B/6)^{1/2} - \theta(A-B)(1-\epsilon)^3(2B/3)^{3/2}/5\big).
\end{aligned}
\]
Putting $\theta = 0.249$ and $\epsilon = 0.00001$, we obtain
\[
  \Pr(T' \ge t_B) \le \exp\big(0.39B^{1/2} - 0.027(A-B)B^{3/2}\big).  \qquad (7.8)
\]
Hence, assuming (7.4), we have from (7.2), (7.3), (7.7) and (7.8) that
\[
  E[n_1(t_B)] \ge 1.22\frac{n^{1/5}}{A^{1/2}} - \frac{n^{1/5}e^{-A^{1/2}/1.22}}{1 - e^{-A^{1/2}/1.22}} - n^{1/5}e^{0.39B^{1/2} - 0.027(A-B)B^{3/2}} - \frac{27(A-B)}{B}n^{1/5}(1+o(1)).
\]
We now set $A = B + xB^{1/2}$, where $x = 0.04$. Then, since $A \le B + \sqrt B$,
\[
  E[n_1(t_B)] \ge \frac{n^{1/5}}{(B+\sqrt B)^{1/2}}\Big(1.22 - 27x\Big(\frac{B+\sqrt B}{B}\Big)^{1/2}\Big)(1+o(1)) - n^{1/5}\Big(\frac{e^{-A^{1/2}/1.22}}{1-e^{-A^{1/2}/1.22}} + e^{0.39B^{1/2} - 0.027xB^{2}}\Big) \ge \frac{n^{1/5}}{10(B+\sqrt B)^{1/2}},  \qquad (7.9)
\]
since $B \ge 3000$. We have therefore shown the following lemma.

Lemma 7.1  For any constant $B \ge 3000$, if $m = Bn^{3/5}$, then
\[
  E[n_1(t_m)] \ge \frac{n^{1/5}}{10(B+\sqrt B)^{1/2}}(1-o(1)).
\]

Proof.  Let $\mathcal{E}$ be the event that (7.4) occurs. Then
\[
  E[n_1(t_m)] \ge E_2[n_1(t_m)](1 - O(1/n^2)),
\]
where $E_2$ is the expectation conditional on $\mathcal{E}$. It follows from (7.9) that
\[
  E[n_1(t_m)] \ge \frac{n^{1/5}}{10(B+\sqrt B)^{1/2}}(1-o(1)).  \qquad \Box
\]

To prove the lower bound in the theorem, we let $B_0 = 3600$ and $B_1 = 3000$. Write $m_0 = B_0n^{3/5}$, $m_1 = B_1n^{3/5}$, $t_0 = t_{m_0}$ and $t_1 = t_{m_1}$, where $t_m$ is (as before) the first time at which $m(t) \le m$. Note that when $t$ is between $t_0$ and $t_1$, we have $n_3 = \Theta(n^{2/5})$ and $n_1 = o(n^{2/5})$ almost surely, which gives $p_3 > p_1$ almost surely. Then using (4.6), we have
\[
\begin{aligned}
  E[\Phi(G_n)] &\ge (1-o(1))\sum_{t=t_0}^{t_1}E[p_1(t)] - O(\ln n) \\
  &= (1-o(1))\sum_{t=t_0}^{t_1}E\Big[\frac{n_1(t)}{2m(t)}\Big] - O(\ln n) \\
  &\ge (1-o(1))\sum_{t=t_0}^{t_1}\frac{E[n_1(t)]}{2m_0} - O(\ln n) \\
  &\ge (1-o(1))\,\frac{m_0-m_1}{5}\cdot\frac{1}{2m_0}\cdot\frac{n^{1/5}}{10\sqrt{3660}} - O(\ln n) \\
  &= \Omega(n^{1/5}).
\end{aligned}
\]

Acknowledgement.  We would like to thank the referee for a careful reading of the manuscript.

References

[1] E. A. Bender and E. R. Canfield, The asymptotic number of labelled graphs with given degree sequences, Journal of Combinatorial Theory, Series A 24 (1978) 296-307.

[2] B. Bollobás, A probabilistic proof of an asymptotic formula for the number of labelled regular graphs, European Journal of Combinatorics 1 (1980) 311-316.

[3] M. E. Dyer and A. M. Frieze, Randomized greedy matching, Random Structures and Algorithms 2 (1991) 29-45.

[4] M. E. Dyer, A. M. Frieze and B. Pittel, On the average performance of the greedy matching algorithm, to appear.

[5] O. Goldschmidt and D. S. Hochbaum, A fast perfect-matching algorithm in random graphs, SIAM Journal on Discrete Mathematics 3 (1990) 48-57.

[6] B. Korte and D. Hausmann, An analysis of the greedy algorithm for independence systems, in Algorithmic Aspects of Combinatorics, B. Alspach, P. Hall and D. J. Miller, Eds., Annals of Discrete Mathematics 2 (1978) 65-74.

[7] R. M. Karp and M. Sipser, Maximum matchings in sparse random graphs, Proceedings of the 22nd Annual IEEE Symposium on Foundations of Computer Science (1981) 364-375.

[8] G. Tinhofer, A probabilistic analysis of some greedy cardinality matching algorithms, Annals of Operations Research 1 (1984) 239-254.