Analysis of a Simple Greedy Matching Algorithm on Random Cubic Graphs

Alan Frieze    A. J. Radcliffe†    Stephen Suen
Department of Mathematics, Carnegie Mellon University, Pittsburgh PA 15213, U.S.A.

June 14, 1995

Abstract

We consider the performance of a simple greedy matching algorithm MINGREEDY when applied to random cubic graphs. We show that if $\Phi(n)$ is the expected number of vertices not matched by MINGREEDY, then there are positive constants $c_1$ and $c_2$ such that $c_1 n^{1/5} \le \Phi(n) \le c_2 n^{1/5}\log n$.

1  Introduction

There have been a number of papers analysing simple greedy-type algorithms for finding large matchings in graphs, e.g. Korte and Hausmann [6], Karp and Sipser [7], Tinhofer [8], Dyer and Frieze [3], Goldschmidt and Hochbaum [5] and Dyer, Frieze and Pittel [4]. Most of these deal with the expected performance of various algorithms on random graphs. In this paper we discuss the algorithm MINGREEDY, given below. It is simple and can easily be implemented in linear time.

We use the following notation in the description of MINGREEDY. $\Gamma_G(v)$ denotes the set of neighbours of the vertex $v$ in the graph $G$; often $G$ is understood and the subscript is omitted. $G \setminus \{u,v\}$ is the graph obtained from $G$ by deleting the vertices $u$ and $v$, all edges incident with them and, in addition, any new isolated vertices.

† Supported by NSF grant CCR-9024935. Current address: Department of Mathematics, Georgia Institute of Technology, Atlanta GA 30332.

MINGREEDY
Input $G$;
begin
  $M := \emptyset$;
  while $E(G) \ne \emptyset$ do
  begin
    A: Choose $v \in V$ uniformly at random from among the vertices of minimum degree in $G$;
    B: Choose $u \in \Gamma(v)$ uniformly at random and set $e = \{u,v\}$;
    $G := G \setminus \{u,v\}$;
    $M := M \cup \{e\}$
  end;
  Output $M$ as our matching
end

The performance of MINGREEDY, at least on random cubic graphs, is remarkably good. In Theorem 1.1 below we prove that the number of vertices left unmatched is usually comparatively small, but it is instructive to see the algorithm's performance in "practice". We have done a limited amount of computation and some results are given in Table 1. We have compared it to two other greedy matching algorithms: GREEDY, which chooses an edge at random from those available, and MODIFIED GREEDY, which randomly chooses a vertex $v$ and then an edge incident with it. The algorithms were run on random $n$-vertex cubic graphs, up to $n = 10^6$. The difference in performance is quite dramatic. It makes good sense to use an algorithm like MINGREEDY as a "front end" to an optimising algorithm.

   n      10^2     10^3      10^4       10^5        10^6
   i_M     1.2      2.2       3.5        6.2        10.4
   i_R    12.0    116.3    1193.3    11892.7    119091.9
   i_V    11.8    102.2    1049.9    10472.1    104546.8

Table 1: Performance of greedy type matching algorithms in 20 Monte Carlo runs, where $i_M$ (respectively $i_R$, $i_V$) is the average number of vertices left isolated when using MINGREEDY (respectively GREEDY, MODIFIED GREEDY).

This paper is concerned with the performance of MINGREEDY on random graphs. For a graph $G$ we let $\Phi(G)$ denote the expected number of vertices left unmatched by a run of MINGREEDY.[1] Let $\Omega_n$ denote the set of 3-regular graphs with vertex set $[n] = \{1,2,\dots,n\}$.

[1] MINGREEDY is a randomising algorithm and the expectation here is with respect to the algorithm's random choices.

Theorem 1.1  Let $G_n$ be chosen uniformly at random from $\Omega_n$. Then there exist constants $c_1, c_2 > 0$ such that
\[
  c_1 n^{1/5} \le E(\Phi(G_n)) \le c_2 n^{1/5}\ln n.
\]

It seems likely that the $\ln n$ factor in the upper bound is unnecessary.
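For readers who wish to reproduce experiments like those in Table 1, here is a short Python sketch of MINGREEDY (a sketch only, not the implementation used for Table 1). It assumes the networkx package, which is used purely to generate a random cubic test instance via networkx.random_regular_graph.

```python
import random
import networkx as nx  # used only to generate random cubic test graphs

def mingreedy(G):
    """One run of MINGREEDY on a copy of G; returns (matching, unmatched_count)."""
    G = G.copy()
    M = []
    # Remove any vertices that are isolated to begin with (none in a cubic graph).
    G.remove_nodes_from([v for v in list(G) if G.degree(v) == 0])
    isolated = 0
    while G.number_of_edges() > 0:
        # Step A: a uniformly random vertex of minimum (positive) degree.
        dmin = min(dict(G.degree()).values())
        v = random.choice([w for w in G if G.degree(w) == dmin])
        # Step B: a uniformly random neighbour of v.
        u = random.choice(list(G.neighbors(v)))
        M.append((v, u))
        G.remove_nodes_from([v, u])
        # Delete newly isolated vertices; they can never be matched.
        new_isolated = [w for w in G if G.degree(w) == 0]
        isolated += len(new_isolated)
        G.remove_nodes_from(new_isolated)
    return M, isolated

if __name__ == "__main__":
    n = 1000
    G = nx.random_regular_graph(3, n)
    _, unmatched = mingreedy(G)
    print(n, unmatched)
```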
Note, in contrast to Theorem 1.1, that the algorithm which at every stage chooses an edge randomly from among those available (analysed in [3]) is very likely to leave at least $\epsilon n$ vertices unmatched for some absolute constant $\epsilon > 0$. A similar estimate holds if, at every stage, one chooses a random vertex $v$ and then a random neighbour of $v$. These calculations are borne out by the computational results in Table 1; it is crucial to the success of our algorithm that we always choose a vertex of minimum degree.

We would like to extend Theorem 1.1 to random $r$-regular graphs and to sparse random graphs but it seems difficult to carry through the analysis. It should be noted however that MINGREEDY is likely to perform well. It is closest in spirit to Algorithm 2 of [7], which, when run on sparse random graphs, usually leaves only $o(n)$ more vertices unmatched than the minimum possible. (This is a difficult proof and there is no estimate given of the $o(n)$ error term.)

2  Overview of the Proof

Our proof of Theorem 1.1 is rather long and so we now give a brief overview. For functions $f$ and $g$, we use $f(n) \approx g(n)$ to denote that $f(n) = (1+o(1))g(n)$ as $n \to \infty$.

Let $t$ denote the number of iterations of the algorithm so far, i.e. the number of executions of Step A, and let $n_i = n_i(t)$ be the number of vertices of degree $i = 1,2,3$ in the graph $G(t)$ at the end of iteration $t$. (So $n_1(0) = n_2(0) = 0$, $n_3(0) = n$.) Similarly, write $m(t) = \frac12(n_1(t) + 2n_2(t) + 3n_3(t))$ for the number of edges of $G(t)$. Let $\zeta(t)$ denote the number of new isolated vertices created at time $t$. Then the number of unmatched vertices at the end is given by
\[
  \Phi(G_n) = \sum_{t\ge0}\zeta(t).
\]
Our ability to analyse the algorithm rests on the following fact.

Claim 2.1  Suppose that $G = G(t)$ has $n_i$ vertices of degree $i$, $i = 1,2,3$. Then $G$ is equally likely to be any member of $\mathcal{G}(n_1,n_2,n_3)$, the set of all graphs with vertex set $V(G)$ and $n_i$ vertices of degree $i$, $i = 1,2,3$.

The claim allows us to treat the progress of the algorithm as a Markov chain $\mathcal{M}$ on $N^3$ (where $N = \{0,1,2,\dots\}$), the state at time $t$ being $(n_1(t), n_2(t), n_3(t))$. Analysis of $\mathcal{M}$ yields the following fact.

Claim 2.2  Conditional on $n_1(t)$, $n_2(t)$ and $n_3(t)$, the expected number $\bar\zeta(t)$ of isolated vertices created at iteration $t$ is $\Theta(n_1(t)/m(t))$.

We will see that, in a typical execution of the algorithm (except near the end of the process), $n_3(t)$ and $m(t)$ almost surely satisfy
\[
  n_3(t) \approx \frac{(2m(t)/3)^{3/2}}{n^{1/2}},  \qquad (2.1)
\]
and $n_1$ is usually small compared to $n_2$ and $n_3$. The expected behaviour of $n_1$ is "controlled" by the size of $n_3$ until $m(t) \approx m_0 = n^{3/5}$. Indeed, conditional on $n_3(t)$ satisfying (2.1), we can bound the progress of $n_1(t)$ by a random walk on $N$ where the particle's expected distance from the origin is $O(m(t)/n_3(t))$ for $m(t) \ge m_0$. Thus from Claim 2.2 we see that the expected number of isolated vertices created before $m \le m_0$ is $O(\sum_{m(t)\ge m_0} n_1(t)/m(t))$. Hence the contribution to $E[\Phi(G_n)]$ before $m(t) \le m_0$ is
\[
  O\Big(\sum_{m(t)\ge m_0}\frac{n^{1/2}}{m(t)^{3/2}}\Big) = O(n^{1/5}).  \qquad (2.2)
\]
When $m < m_0$, we are near the end of the process and it is good enough for our purposes to bound $n_1$ by a symmetric random walk on $Z$ (the set of integers) in which the particle moves up or down by one when it moves, and stays where it is with probability $1 - O(n_3/m)$. Applying standard results on random walks, we obtain that for $m < m_0$, $E[n_1] = O(n^{1/5})$. Claim 2.2 now gives that the contribution to $E[\Phi(G_n)]$ after $m(t) \le m_0$ is $O(n^{1/5}\ln n)$, which together with (2.2) yields the upper bound in the theorem.
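For concreteness, the estimate (2.2) can be obtained as follows (a sketch; we treat the sum as running over the values taken by $m(t)$ and substitute (2.1) and Claim 2.2):
\[
  \sum_{m(t)\ge m_0} E\Big[\frac{n_1(t)}{m(t)}\Big]
  = \sum_{m(t)\ge m_0} O\Big(\frac{1}{n_3(t)}\Big)
  = O\Big(\sum_{m=m_0}^{3n/2}\frac{n^{1/2}}{(2m/3)^{3/2}}\Big)
  = O\big(n^{1/2} m_0^{-1/2}\big)
  = O\big(n^{1/2-3/10}\big) = O(n^{1/5}).
\]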
To prove the lower bound we consider only those times when $An^{3/5} \ge m \ge Bn^{3/5}$ for some large constants $A$, $B$. We use coupling arguments to bound $n_1(t)$ from below by a random walk, and to estimate the probability that this walk "reaches" its steady state before too long. We then deduce that for some constants $B_1 < A_1$ we have $E[n_1(t)] = \Omega(n^{1/5})$ for all $m$ between $B_1 n^{3/5}$ and $A_1 n^{3/5}$. This is enough to give the lower bound in the theorem.

We shall establish Claim 2.1 in the next section. The transition probabilities of the chain are then described in §4. In order to bound $E[n_1/m]$ it is necessary to study the behaviour of $n_3$, which is done in §5. The upper and lower bounds in Theorem 1.1 are proved in §§6 and 7 respectively.

3  The Graph Chain

We next establish Claim 2.1 by using induction on $t$. Since $G(0) = G_n$, which is a random member of $\mathcal{G}(0,0,n)$, we need only establish that for $x, y \in N^3$ and for $G \in \mathcal{G}(y_1,y_2,y_3)$, the number of triples in
\[
  I_G = \{(H,v,w) : H \in \mathcal{G}(x_1,x_2,x_3),\ v \text{ is of minimum degree in } H,\ w \text{ is a neighbour of } v \text{ and } G = H \setminus \{v,w\}\}
\]
depends only on $x, y$. But to construct a triple in $I_G$, we

(a) choose $v, w \in [n] \setminus V(G)$;

(b) choose $v_1, v_2, \dots, v_\ell \in [n] \setminus (V(G) \cup \{v,w\})$, where $\ell = (x_1+x_2+x_3) - (y_1+y_2+y_3) - 2$ is the number of new isolated vertices created in going from $x$ to $y$;

(c) add appropriate edges incident with $v, w, v_1, v_2, \dots, v_\ell$ to make a graph in $\mathcal{G}(x_1,x_2,x_3)$.

The number of choices in (a), (b) (trivially) depends only on $x, y$ and the same is true for the number of choices in (c), which is fixed once the degree sequence of $G$ is fixed. (Just observe that if one chooses $A$ as the set of edges in (c) and changes $G$ without changing the degree sequence then $A$ remains a valid choice.) This verifies Claim 2.1.

4  The Degree Chain

In this section we derive the transition probabilities of the chain $\mathcal{M}$ whose state at time $t$ is $n(t) = (n_1(t), n_2(t), n_3(t))$. We use the notation
\[
  p[x:y] = \Pr(n(t+1) = y \mid n(t) = x).
\]
We shall require the configuration model of Bollobás [2], which is a simple and useful description of that used by Bender and Canfield [1].

Suppose we are given a degree sequence $1 \le d_1, d_2, \dots, d_\nu$. Let $W_i = \{i\} \times [d_i]$ for $i \in [\nu]$ and $W = \bigcup_i W_i$. A configuration is a partition of $W$ into $\mu = \sum_i d_i/2$ pairs. Let $\Omega$ be the set of all configurations, and let $F$ be chosen uniformly from $\Omega$. Then let $\gamma(F)$ be the multigraph with vertex set $[\nu]$ and edges $\{i,j\}$ for each occurrence of $\{(i,x),(j,y)\} \in F$ for some $x \in [d_i]$, $y \in [d_j]$.

The properties that we need of this model are that: first, conditional on $\gamma(F)$ being simple, it is equally likely to be any simple graph with the given degree sequence; second, if $\max_i d_i$ is an absolute constant (here of course $\max_i d_i = 3$ would suffice), then
\[
  \Pr(\gamma(F) \text{ is simple}) = (1 + O(1/\mu))\exp\Big(-\frac{\lambda}{2} - \frac{\lambda^2}{4}\Big),  \qquad (4.1)
\]
where $\lambda = \frac{1}{2\mu}\sum_i d_i(d_i-1)$. (Note that $E[\Phi(G_n)] = O(E[\Phi(\gamma(F_n))])$, where $F_n$ is chosen uniformly at random from the set of configurations with $d_i = 3$ for all $i \in [n]$. Thus for proving the upper bound in Theorem 1.1 we need only consider this multigraph distribution together with a corresponding multigraph version of Claim 2.1.)

For $x \in N^3$, let $\nu = x_1 + x_2 + x_3$ and $\mu = (x_1 + 2x_2 + 3x_3)/2$. Also, without loss of generality, assume that $d_i = 1$ for $1 \le i \le x_1$; $d_i = 2$ for $x_1 < i \le x_1 + x_2$; and $d_i = 3$ for $x_1 + x_2 < i$. Now choose $F$ randomly from $\Omega$ and apply one step of Algorithm MINGREEDY to $\gamma(F)$. For $y \in N^3$, let $E_y$ be the event that the multigraph remaining has $y_i$ vertices of degree $i$ ($i = 1,2,3$).
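The configuration model just described is easy to simulate. The following Python sketch (an illustration, not part of the proof) samples $\gamma(F)$ for a given degree sequence by pairing up the points of $W$ uniformly at random, and empirically checks the simplicity probability against (4.1); for the cubic sequence, $\lambda = 2$ and the limit is $e^{-2}$.

```python
import random
import math

def random_configuration_multigraph(degrees):
    """Sample gamma(F): pair the 'points' of W uniformly at random and
    return the multigraph as a list of edges on vertices 0..nu-1."""
    points = [v for v, d in enumerate(degrees) for _ in range(d)]
    assert len(points) % 2 == 0, "sum of degrees must be even"
    random.shuffle(points)  # a uniformly random perfect pairing of the points
    return [(points[i], points[i + 1]) for i in range(0, len(points), 2)]

def is_simple(edges):
    """True iff the multigraph has no loops and no repeated edges."""
    seen = set()
    for u, v in edges:
        if u == v or (min(u, v), max(u, v)) in seen:
            return False
        seen.add((min(u, v), max(u, v)))
    return True

if __name__ == "__main__":
    degrees = [3] * 200
    trials = 2000
    simple = sum(is_simple(random_configuration_multigraph(degrees))
                 for _ in range(trials))
    print(simple / trials, math.exp(-2))  # the two numbers should be close
```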
We shall first prove that
\[
  p[x:y] = \Pr(E_y) + O(1/\mu).  \qquad (4.2)
\]
Note that from Claim 2.1, all we need to show is that
\[
  \Pr(E_y \mid \gamma(F) \text{ is simple}) = \Pr(E_y) + O(1/\mu).  \qquad (4.3)
\]
But
\[
  \Pr(E_y \mid \gamma(F) \text{ is simple}) = \frac{\Pr(\gamma(F) \text{ is simple} \mid E_y)\Pr(E_y)}{\Pr(\gamma(F) \text{ is simple})}
\]
and so we need only show
\[
  \Pr(\gamma(F) \text{ is simple} \mid E_y) = \Pr(\gamma(F) \text{ is simple}) + O(1/\mu)
\]
or
\[
  \Pr(\gamma(F) \text{ is simple} \mid D) = \Pr(\gamma(F) \text{ is simple}) + O(1/\mu)  \qquad (4.4)
\]
where $D$ is the set of pairs of points deleted from $F$ in one step and $D$ does not contain loops nor multiple edges. But $|D| \le 5$ and (4.4) follows easily from (4.1).

We next write down the transition probabilities $p[x:y]$. Now many of the transition probabilities are small (i.e. $O(1/m)$) and do not have a significant effect on our analysis. We will only give transition probabilities up to this level of accuracy. Suppose that $n(t) = x$ and thus $2m = x_1 + 2x_2 + 3x_3$. Let
\[
  p_i = \frac{i x_i}{2m} \quad \text{for } i = 1,2,3.
\]
The following table gives the significant transitions. There is an $O(1/m)$ term to be added to each probability. Any transitions not mentioned will have total probability $O(1/m)$ of occurring. (These are associated with triangles close to the chosen edge.) The probabilities in the table are only accurate for $m$ sufficiently large. (Once $m$ is small the remainder of the process cannot affect the number of uncovered vertices very much.) In Table 2 below, the first row corresponds to the initial state, the next seven rows correspond to cases where there are no vertices of degree one, and the last ten rows cover the cases where there are vertices of degree one.

These probabilities are derived by using (4.2). For example, consider the transition from $(x_1,x_2,x_3)$ to $(x_1-2, x_2+1, x_3-2)$. Here $v$ in Step A is of degree 1 and $u$ is of degree 3, which accounts for one factor $p_3$ in the transition probability. The other 2 neighbours of $u$ are of degree 1 and 3 respectively. This accounts for the factor $2p_1p_3$. The remaining probabilities can be checked in the same manner.

         x                    y                              p[x:y]
    1   (0, 0, x_3)          (0, 4, x_3 - 6)                 1
    2   (0, x_2, x_3)        (0, x_2 + 2, x_3 - 4)           p_3^4
    3   (0, x_2, x_3)        (1, x_2, x_3 - 3)               3 p_2 p_3^3
    4   (0, x_2, x_3)        (2, x_2 - 2, x_3 - 2)           3 p_2^2 p_3^2
    5   (0, x_2, x_3)        (3, x_2 - 4, x_3 - 1)           p_2^3 p_3
    6   (0, x_2, x_3)        (0, x_2, x_3 - 2)               p_2 p_3^2
    7   (0, x_2, x_3)        (1, x_2 - 2, x_3 - 1)           2 p_2^2 p_3
    8   (0, x_2, x_3)        (2, x_2 - 4, x_3)               p_2^3
    9   (x_1, x_2, x_3)      (x_1 - 1, x_2 + 2, x_3 - 3)     p_3^3
   10   (x_1, x_2, x_3)      (x_1, x_2, x_3 - 2)             2 p_2 p_3^2
   11   (x_1, x_2, x_3)      (x_1 + 1, x_2 - 2, x_3 - 1)     p_2^2 p_3
   12   (x_1, x_2, x_3)      (x_1 - 1, x_2 - 1, x_3 - 1)     2 p_1 p_2 p_3
   13   (x_1, x_2, x_3)      (x_1 - 2, x_2 + 1, x_3 - 2)     2 p_1 p_3^2
   14   (x_1, x_2, x_3)      (x_1 - 3, x_2, x_3 - 1)         p_1^2 p_3
   15   (x_1, x_2, x_3)      (x_1 - 1, x_2, x_3 - 1)         p_2 p_3
   16   (x_1, x_2, x_3)      (x_1, x_2 - 2, x_3)             p_2^2
   17   (x_1, x_2, x_3)      (x_1 - 2, x_2 - 1, x_3)         p_1 p_2
   18   (x_1, x_2, x_3)      (x_1 - 2, x_2, x_3)             p_1

Table 2: Transition probabilities for the chain $\mathcal{M}$

We continue by computing the expected number of new isolated vertices created in a single transition (conditional on the present state of $\mathcal{M}$ being $x$). Denote this by $\bar\zeta(x)$. Except in rare cases (i.e. of probability $O(1/m)$) isolated vertices are produced only when $x_1 > 0$. The following table gives the number created in each case. (The case numbers refer to the row numbers of Table 2.)

   case:       9   10   11   12   13   14   15   16   17   18
   isolated:   0    0    0    1    1    2    0    0    1    0

Table 3: Number of isolated vertices created in transitions of $\mathcal{M}$

Observe that, with probability 1, the number of isolated vertices created in each transition is $O(1)$.
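Since Tables 2 and 3 are used next to compute $\bar\zeta(x)$, here is a small Python check (a sketch, using exact arithmetic) that summing probability times count over rows 12, 13, 14 and 17 gives the closed form $p_1(p_2 + 2p_3)$ derived below; the $O(1/m)$ corrections are ignored.

```python
from fractions import Fraction

def zeta_bar(x1, x2, x3):
    """Expected new isolated vertices per transition, from Tables 2 and 3."""
    two_m = x1 + 2 * x2 + 3 * x3
    p1, p2, p3 = (Fraction(i * x, two_m) for i, x in ((1, x1), (2, x2), (3, x3)))
    return (2 * p1 * p2 * p3 * 1      # row 12: one isolated vertex
            + 2 * p1 * p3 ** 2 * 1    # row 13: one isolated vertex
            + p1 ** 2 * p3 * 2        # row 14: two isolated vertices
            + p1 * p2 * 1)            # row 17: one isolated vertex

def closed_form(x1, x2, x3):
    two_m = x1 + 2 * x2 + 3 * x3
    p1, p2, p3 = (Fraction(i * x, two_m) for i, x in ((1, x1), (2, x2), (3, x3)))
    return p1 * (p2 + 2 * p3)

if __name__ == "__main__":
    for state in [(5, 40, 100), (1, 7, 3), (10, 0, 50)]:
        assert zeta_bar(*state) == closed_form(*state)
    print("zeta_bar equals p1*(p2 + 2*p3) on the sampled states")
```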
Hence, we obtain from Table 3 that
\[
  \bar\zeta(x) = 2p_1p_2p_3 + 2p_1p_3^2 + 2p_1^2p_3 + p_1p_2 + O(1/m)
             = p_1(p_2 + 2p_3) + O(1/m)
             = p_1 + p_1(p_3 - p_1) + O(1/m),
\]
and hence that
\[
  E[\Phi(G_n)] \le 2\sum_{t\ge0}E\Big[\frac{n_1(t)}{m(t)}\Big] + O\Big(\sum_{t\ge0}\frac{1}{m(t)}\Big)
              = 2\sum_{t\ge0}E\Big[\frac{n_1(t)}{m(t)}\Big] + O(\ln n).  \qquad (4.5)
\]
In studying the lower bound, we will show, for suitable $t_0 < t_1$, that $p_1$ is negligible when compared with $p_3$ whenever $t$ is between $t_0$ and $t_1$. This gives
\[
  E[\Phi(G_n)] \ge (1-o(1))\sum_{t=t_0}^{t_1}E\Big[\frac{n_1(t)}{2m(t)}\Big] - O(\ln n).  \qquad (4.6)
\]

5  Behaviour of $n_3$

Note that for each edge $\{u,v\}$ removed by MINGREEDY, one of the end-points, say $u$, is picked from the vertices of minimal (but non-zero) degree, or $u$ is determined from a previous edge removal. The other end-point $v$ is chosen randomly from the neighbours of $u$. Now since almost all cubic graphs are connected, each decrease in $n_3$, except for the first edge removal, is accounted for exactly once as the end-point $v$ whenever $v$ is of degree 3. (Here, we regard $n_3$ as a function of the number $m$ of edges in the current graph.) Now consider the edge removal when the current graph has $n_3$ vertices of degree 3 and $m$ edges. The probability that the end-point $v$ in the edge removed is of degree 3 equals $(3n_3/2m)(1+O(1/m))$ (from applying arguments similar to those used in showing (4.2)). Thus the rate of change in $n_3$ with respect to $m$ should be approximately
\[
  \frac{dn_3}{dm} \approx \frac{3n_3}{2m},
\]
which gives the approximation stated in (2.1). These ideas are made rigorous by the following lemma.

Lemma 5.1  For any fixed $\epsilon > 0$,
\[
  \Pr\bigg(\exists t \text{ such that } m(t) \ge n^{1/2}\ln^3 n \text{ and } \Big|\frac{n_3\sqrt n}{(2m/3)^{3/2}} - 1\Big| \ge \epsilon\bigg) = O(n^{-2}).
\]

Proof.  We note first that MINGREEDY destroys edges of $G$ sequentially. Let $h = \lfloor n^{1/4}\rfloor$, and for $i = 0,1,2,\dots$, define
\[
  m_i = \frac{3n}{2} - ih.
\]
Let $z_i$ be the number of vertices with degree three in $G$ at the first time when $m \le m_i$, and let $E_i$ be the event that
\[
  z_i = \frac{1}{\sqrt n}\Big(\frac{2m_i}{3}\Big)^{3/2}\bigg(1 + O\Big(\sum_{j=0}^{i}\frac{n^{3/8}\ln^{1/2}n}{m_j^{5/4}}\Big)\bigg).  \qquad (5.1)
\]
We shall show that for $i$ such that $m_i \ge n^{1/2}\ln^3 n$,
\[
  \Pr(\exists j \le i \text{ such that } \bar E_j) = O(i/n^4).  \qquad (5.2)
\]
(All big O terms in this section are uniform over $i$.) Equation (5.2) is trivially true for $i = 0$. Assume as induction hypothesis that equation (5.2) holds for smaller values of $i$. Suppose next that $m_i \ge n^{1/2}\ln^3 n$. We need to show that
\[
  \Pr_1(\bar E_i) = O(1/n^4),  \qquad (5.3)
\]
where $\Pr_1$ is the probability conditional on $F_i = E_0 \cap \dots \cap E_{i-1}$. Let $\Delta z_i = z_{i-1} - z_i$. Recall that for each edge $\{x,y\}$ removed by MINGREEDY, one of the end-points, say $x$, is either chosen randomly from vertices of minimum degree or determined by a previous edge removal, while the other end-point, say $y$, is a degree three vertex with probability $p_3 + O(1/m)$. Then conditional on $F_i$ and for $m_i \le m \le m_{i-1}$, the probability that the vertex $y$ in an edge removal is of degree three is
\[
  p_3 = \frac{3(z_{i-1} + O(h))}{2(m_i + O(h))}
      = z_{i-1}\Big(\frac{3}{2m_i} + O\Big(\frac{h}{z_{i-1}m_i}\Big)\Big)
      = z_{i-1}\Big(\frac{3}{2m_i} + O\Big(\frac{n^{3/4}}{m_i^{5/2}}\Big)\Big).
\]
Thus, if $\Delta'z_i$ is the number of degree three vertices that appear as vertices $y$ in the edge removals during the period when $m$ decreases from $m_{i-1}$ to $m_i$, then using Chernoff bounds,
\[
  \Pr_1\Big(|\Delta'z_i - hp_3| \ge \sqrt{12hp_3\ln n}\Big) = O(1/n^4).
\]
However, $\Delta z_i = \Delta'z_i + \eta$, where $\eta$ denotes the number of degree three vertices that appear as vertices $x$ in the edge removals. Now $\eta$ is at most one plus the number of components of $G_n$. Hence we may assume for all $i$ that $\eta \le 10$, which incurs an error probability of $O(1/n^4)$. Since $hp_3 \to \infty$,
we have that
\[
  \Pr_1\Big(|\Delta z_i - hp_3| \ge \sqrt{12hp_3\ln n}\Big) = O(1/n^4),
\]
giving with conditional probability $1 - O(1/n^4)$ that
\[
  \Delta z_i = z_{i-1}\frac{3h}{2m_i} + O\Big(\frac{n^{3/4}h}{m_i^{5/2}}\Big)z_{i-1} + O\Big(\sqrt{hp_3\ln n}\Big)
            = z_{i-1}\Big(\frac{3h}{2m_i} + O\Big(\frac{n^{3/8}\ln^{1/2}n}{m_i^{5/4}}\Big)\Big).
\]
As $z_i = z_{i-1} - \Delta z_i$, we have with conditional probability at least $1 - O(1/n^4)$,
\[
\begin{aligned}
  z_i &= z_{i-1}\Big(1 - \frac{3h}{2m_i} + O\Big(\frac{n^{3/8}\ln^{1/2}n}{m_i^{5/4}}\Big)\Big) \\
      &= z_{i-1}\Big(\frac{m_i}{m_{i-1}}\Big)^{3/2}\Big(1 + O\Big(\frac{n^{3/8}\ln^{1/2}n}{m_i^{5/4}}\Big)\Big) \\
      &= \frac{1}{\sqrt n}\Big(\frac{2m_i}{3}\Big)^{3/2}\bigg(1 + O\Big(\sum_{j=0}^{i}\frac{n^{3/8}\ln^{1/2}n}{m_j^{5/4}}\Big)\bigg).
\end{aligned}
\]
This completes our proof of (5.3) and hence of (5.2).

Next, we note that for $m_i \ge n^{1/2}\ln^3 n$, $i$ is at most $N^* = \lfloor (\frac{3n}{2} - n^{1/2}\ln^3 n)/h\rfloor$ and that $\frac{3n}{2} - hN^* = n^{1/2}\ln^3 n + O(h)$. Hence the error term on the right hand side of (5.1) is at most
\[
  \sum_{j=0}^{N^*}\frac{n^{3/8}\ln^{1/2}n}{m_j^{5/4}}
  = O\Big(\frac{n^{3/8}\ln^{1/2}n}{h}\,\big(n^{1/2}\ln^3 n\big)^{-1/4}\Big)
  = O\big(\ln^{-1/4}n\big) = o(1).
\]
This shows that for any fixed $\epsilon > 0$,
\[
  \Pr\bigg(\exists i \text{ such that } m = m_i \ge n^{1/2}\ln^3 n \text{ and } \Big|\frac{n_3\sqrt n}{(2m/3)^{3/2}} - 1\Big| \ge \epsilon\bigg) = O(1/n^2).
\]
Similar arguments can be used to complete our proof of the lemma.  □

6  An Upper Bound for $E[n_1]$

In this and the following sections, we would like to estimate $E[n_1]$ when the current $G$ has $m$ edges. Let $t_m$ be the first time when $m(t) \le m$. Note that $m - m(t_m)$ is bounded by an absolute constant. The big O and small o terms in this and the next sections are uniform in $m$. We find an upper bound for $E[n_1(t_m)]$ in this section.

From Table 2, we have the following transition probabilities for $n_1(t)$ (where $\Delta n_1 = \Delta n_1(t) = n_1(t+1) - n_1(t)$ and $\Pr_1$ is the probability conditional on $n_1(t)$, $n_2(t)$, $n_3(t)$). When $n_1 = 0$ and $n_2 \ne 0$,
\[
  \Pr_1(\Delta n_1 = i) = \begin{cases} \binom{2}{i}p_2^{i+1}p_3^{2-i} + \binom{3}{i}p_2^{i}p_3^{4-i} + O(1/m), & i = 0,1,2,3; \\ 0, & \text{otherwise}; \end{cases}
\]
and when $n_1 \ne 0$,
\[
\begin{aligned}
  \Pr_1(\Delta n_1 = 1)  &= p_2^2p_3 + O(1/m), \\
  \Pr_1(\Delta n_1 = 0)  &= 2p_2p_3^2 + p_2^2 + O(1/m), \\
  \Pr_1(\Delta n_1 = -1) &= p_3^3 + 2p_1p_2p_3 + p_2p_3 + O(1/m), \\
  \Pr_1(\Delta n_1 = -2) &= p_1p_2 + 2p_1p_3^2 + p_1 + O(1/m), \\
  \Pr_1(\Delta n_1 = -3) &= p_1^2p_3 + O(1/m), \\
  \Pr_1(\Delta n_1 = i)  &= 0, \quad \text{if } i \ne 1,0,-1,-2,-3.
\end{aligned}
\]
Note that when both $n_1 = 0$ and $n_2 = 0$, we have
\[
  \Pr_1(\Delta n_1 \ne 0) = O(1/m), \qquad \Pr_1(\Delta n_1 \le 3) = 1.
\]
In particular, we may assume that no vertex of degree 1 is created in the first iteration of MINGREEDY; that is, $n_1(1) = 0$. Observe also that the $O(1/m)$ terms in the transition probabilities depend only on the current state of $\mathcal{M}$.

Let
\[
  \beta(x) = x - 2x^2 + x^3 = (1-x)^2x, \qquad \alpha(x) = x - x^2 + x^3 = x^3 + (1-x)x.
\]
We next consider a process $X(t)$ with initial state $X(1) = 0$ and transition probabilities defined below:
\[
\begin{aligned}
  \Pr_1(\Delta X = 1)  &= \beta(p_3) + O_1(1/m), \\
  \Pr_1(\Delta X = 0)  &= 1 - (\beta(p_3) + O_1(1/m)) - (\alpha(p_3) + O_2(1/m))\chi(X), \\
  \Pr_1(\Delta X = -1) &= (\alpha(p_3) + O_2(1/m))\chi(X), \\
  \Pr_1(\Delta X = i)  &= 0, \quad \text{if } i \ne 1,0,-1,
\end{aligned}
\]
where $\chi(x) = 1$ if $x > 0$ and zero otherwise. Note that when $n_1(t) = 0$, $\Pr(n_1(t+1) \le 3) = 1$, and that when $n_1(t) \ne 0$, we have (ignoring the $O(1/m)$ terms)
\[
  \Pr_1(\Delta n_1 = 1) \le \beta(p_3), \qquad \Pr_1(\Delta n_1 \le -1) \ge \alpha(p_3).
\]
Thus, the big O terms $O_1(1/m)$ and $O_2(1/m)$ in the transition probabilities of the process $X(t)$ can be properly defined so that when $n_1(t) \ne 0$,
\[
  \Pr_1(\Delta n_1 = 1) \le \Pr_1(\Delta X = 1), \qquad \Pr_1(\Delta n_1 \le -1) \ge \Pr_1(\Delta X = -1).
\]
It follows that $n_1$ and $X$ can be coupled so that $n_1$ is stochastically at most $X + 3$ for all $t$.
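The coupling above rests on the two inequalities $\Pr_1(\Delta n_1 = 1) \le \beta(p_3)$ and $\Pr_1(\Delta n_1 \le -1) \ge \alpha(p_3)$. The following Python sketch (an illustration only; it ignores the $O(1/m)$ corrections and simply sweeps a grid of admissible vectors $(p_1,p_2,p_3)$ with $p_1 + p_2 + p_3 = 1$) confirms them numerically.

```python
def beta(x):
    return (1 - x) ** 2 * x

def alpha(x):
    return x ** 3 + (1 - x) * x

steps = 200
ok = True
for a in range(steps + 1):
    for b in range(steps + 1 - a):
        p1, p2 = a / steps, b / steps
        p3 = 1 - p1 - p2
        up = p2 ** 2 * p3                                    # Pr(dn1 = +1)
        down = (p3 ** 3 + 2 * p1 * p2 * p3 + p2 * p3         # Pr(dn1 = -1)
                + p1 * p2 + 2 * p1 * p3 ** 2 + p1            # Pr(dn1 = -2)
                + p1 ** 2 * p3)                              # Pr(dn1 = -3)
        ok &= (up <= beta(p3) + 1e-12) and (down >= alpha(p3) - 1e-12)
print("inequalities hold on the grid:", ok)
```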
Next, let $X_2(t)$ be a process, with $X_2(1) = X(1)$, having the same transition probabilities, but without the big O terms, as those of $X(t)$. Note that the big O terms in the transition probabilities of $X(t)$ can only affect, with probability $1 - O(n^{-A})$ for any $A > 0$, at most $O(\ln n)$ transitions in $O(n)$ transitions of $X$. Now at each time $t$, conditional on $X$ having taken a transition not affected by the big O terms, we may couple $X$ and $X_2$ so that $|X(t) - X_2(t)| \le |X(t-1) - X_2(t-1)|$ (because of the reflecting barrier at the origin). It therefore follows that for all $t = O(n)$,
\[
  E[X(t)] = E[X_2(t)] + O(\ln n).
\]
Thus, $E[n_1] \le E[X_2] + O(\ln n)$. Since we have a fairly accurate estimate of $n_3$ as a function of $m$ (and hence an estimate of $p_3 = 3n_3/2m$), we proceed to find a bound for $X_2$ and hence one for $E[n_1]$.

Lemma 6.1  For $m \ge n^{1/2}\ln^3 n$ and for any fixed $\epsilon > 0$,
\[
  E[n_1(t_m)] \le \frac{(1-p)^2}{p} + O(\ln n),
\]
where $p = (1-\epsilon)(2m/3n)^{1/2}$.

Proof.  Consider a random walk $X'$ with transition probabilities as follows. We write $\beta' = \beta(p)$, $\alpha' = \alpha(p)$:
\[
  \Pr(\Delta X' = 1) = \beta', \qquad \Pr(\Delta X' = 0) = 1 - \beta' - \alpha'\chi(X'), \qquad \Pr(\Delta X' = -1) = \alpha'\chi(X').
\]
Note that the steady state distribution $\pi'$ of $X'(t)$ satisfies the following equations:
\[
\begin{aligned}
  \pi'_0 &= (1-\beta')\pi'_0 + \alpha'\pi'_1, \\
  \pi'_i &= \beta'\pi'_{i-1} + (1-\beta'-\alpha')\pi'_i + \alpha'\pi'_{i+1}, \qquad i \ge 1, \\
  \pi'_i &= 0, \qquad i < 0.
\end{aligned}
\]
These can be solved, giving
\[
  \pi'_i = 0, \quad i < 0; \qquad \pi'_i = \Big(1 - \frac{\beta'}{\alpha'}\Big)\Big(\frac{\beta'}{\alpha'}\Big)^i, \quad i \ge 0,  \qquad (6.1)
\]
with expectation $\sum_i i\pi'_i = \beta'/(\alpha'-\beta') = (1-p)^2/p$. We shall assume that $X'$ starts with its steady state distribution $\pi'$, and hence the distribution of $X'(t)$ equals $\pi'$ always.

We shall show later that if for all $t \le t_m$
\[
  p_3(t) \ge (1-\epsilon)\Big(\frac{2m(t)}{3n}\Big)^{1/2} \ge p,  \qquad (6.2)
\]
then for all $t \le t_m$ and for large $n$,
\[
  X_2(t) \preceq X'(t) \quad \text{in distribution.}  \qquad (6.3)
\]
It follows from Lemma 5.1 that (6.2) holds with probability $1 - O(n^{-2})$, and since $n_1 \le n$, we have
\[
  E[n_1(t)] \le E_2[n_1(t)] + O(1/n) \le E_2[X_2(t)] + O(\ln n) \le E[X'(t)] + O(\ln n) = \frac{(1-p)^2}{p} + O(\ln n),
\]
where $E_2$ is the expectation conditional on $p_3$ satisfying (6.2). This proves the lemma.

We shall use induction to prove (6.3). This is trivial when $t = 1$. Assuming it is true for $t$, we shall bound the distribution of $X_2(t+1)$. Define a random variable $Y = X'(t) + \Delta Y$, where the distribution of $\Delta Y$ is as follows. Writing $p_3 = p_3(t)$, define
\[
  \Pr(\Delta Y = 1) = \beta(p_3), \qquad \Pr(\Delta Y = 0) = 1 - \beta(p_3) - \alpha(p_3)\chi(X'), \qquad \Pr(\Delta Y = -1) = \alpha(p_3)\chi(X').
\]
We shall show that
\[
  X_2(t+1) \preceq Y \preceq X'(t+1)  \qquad (6.4)
\]
holds in distribution. The first inequality comes from the fact that we can couple $X_2(t)$ and $X'(t)$ so that $X_2(t) \le X'(t)$ (this uses the induction hypothesis). For the case where $X_2(t) = X'(t)$, we see that $X_2(t+1) \preceq Y$ in distribution; for the case $X_2(t) < X'(t)$, we use the fact that $X_2(t+1)$ and $Y$ can be coupled so that $X_2(t+1) \le Y$. The first inequality thus holds for both cases. To show the second inequality, we consider the distribution $\sigma$ of $Y$. Writing $\beta = \beta(p_3(t))$, $\alpha = \alpha(p_3(t))$, $\beta' = \beta(p)$, $\alpha' = \alpha(p)$, we have the following equations:
\[
\begin{aligned}
  \sigma_0 &= (1-\beta)\pi'_0 + \alpha\pi'_1, \\
  \sigma_i &= \beta\pi'_{i-1} + (1-\beta-\alpha)\pi'_i + \alpha\pi'_{i+1}, \qquad i \ge 1, \\
  \sigma_i &= 0, \qquad i < 0.
\end{aligned}
\]
Hence for $i \ge 1$,
\[
  \sigma_i = \pi'_i\Big(\beta\frac{\alpha'}{\beta'} + 1 - \beta - \alpha + \alpha\frac{\beta'}{\alpha'}\Big)
           = \pi'_i\Big(1 + (\alpha'-\beta')\Big(\frac{\beta}{\beta'} - \frac{\alpha}{\alpha'}\Big)\Big).  \qquad (6.5)
\]
Note that the functions $\alpha(x)$ and $\beta(x)$ are such that for all $x \in [0,1]$,
\[
  \alpha(x) - \beta(x) = x^2 \ge 0,
\]
and
\[
  \frac{d}{dx}\frac{\beta(x)}{\alpha(x)} = -\frac{1-x^2}{(1-x+x^2)^2} \le 0.
\]
It therefore follows from (6.2) and (6.5) that for $i \ge 1$,
\[
  \sigma_i \le \pi'_i,
\]
giving that for $i \ge 1$,
\[
  \sum_{j\ge i}\sigma_j \le \sum_{j\ge i}\pi'_j.
\]
As $\sigma_i = \pi'_i = 0$ for $i < 0$, the distribution $\sigma$ is smaller than $\pi'$. This shows the second inequality in (6.4), and so our proof of the lemma is complete.  □

Lemma 6.1 gives that for $m \ge n^{1/2}\ln^3 n$,
\[
  E[n_1(t_m)] = O((n/m)^{1/2}).
\]
Hence, if $m \ge m_0 = n^{3/5}$, then $E[n_1(t_m)] = O(n^{1/5})$.
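Spelled out (a quick check of the last two displays): with $p = (1-\epsilon)(2m/3n)^{1/2}$,
\[
  \frac{(1-p)^2}{p} \le \frac{1}{p} = \frac{1}{1-\epsilon}\Big(\frac{3n}{2m}\Big)^{1/2} = O\big((n/m)^{1/2}\big),
\]
and at $m = m_0 = n^{3/5}$ this is $O\big((n/n^{3/5})^{1/2}\big) = O(n^{1/5})$.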
The next lemma shows that $E[n_1(t_m)] = O(n^{1/5})$ holds when $m < m_0$ too.

Lemma 6.2  $E[n_1(t_m)] = O(n^{1/5})$ for all $m \le m_0$.

Proof.  As argued in Lemma 6.1, the assertion in this lemma follows if we can show that for all $m < m_0$,
\[
  E[X_2(t_m)] = O(n^{1/5}).  \qquad (6.6)
\]
Given $p_3$, consider a random walk $Z$ with transition probabilities defined as below:
\[
\begin{aligned}
  \Pr(\Delta Z = 1)  &= (\alpha(p_3) + \beta(p_3))(1 - \chi(Z)/2), \\
  \Pr(\Delta Z = 0)  &= 1 - \alpha(p_3) - \beta(p_3), \\
  \Pr(\Delta Z = -1) &= (\alpha(p_3) + \beta(p_3))\chi(Z)/2.
\end{aligned}
\]
Write $t_0 = t_{m_0}$. We consider the processes $X_2$ and $Z$ after time $t_0$. Suppose $Z(t_0) = X_2(t_0)$. As $\alpha(p_3) \ge \beta(p_3)$ for all $p_3 \in [0,1]$, we can couple the processes $X_2(t)$ and $Z(t)$ for all $t \ge t_0$ so that $Z$ is stochastically at least $X_2$. Hence (6.6) follows if
\[
  E[Z(t)] = O(n^{1/5}), \qquad \forall t \ge t_0.  \qquad (6.7)
\]
Since $Z$ is a symmetric random walk with a reflecting barrier at $Z = 0$, the reflection principle shows that
\[
  E[Z(t)] = E[\,|Z'(t)|\,],
\]
where $Z'$ is the same symmetric random walk as $Z$ but without the reflecting barrier. Next, let $Z''$ be a random walk with the same transition probabilities as those of $Z'$ but with $Z''(t_0) = 0$. Then it is clear that $Z'(t) = Z''(t) + Z'(t_0)$ and so
\[
  E[\,|Z'(t)|\,] \le E[\,|Z''(t)|\,] + E[Z'(t_0)] = E[\,|Z''(t)|\,] + E[X_2(t_0)].
\]
Let $M$ be the number of moves that $Z''$ makes after $t_0$ until the end of algorithm MINGREEDY. Then we have
\[
  E[\,|Z''(t)|\,] = O(E[\sqrt M]),
\]
giving that
\[
  E[Z(t)] = O\big(E[\sqrt M] + E[X_2(t_0)]\big).  \qquad (6.8)
\]
We next estimate $M$. Let $m_2 = n^{1/2}\ln^3 n$ and $m_1 = n^{2/5}$. We write
\[
  M = M_1 + M_2 + M_3,
\]
where $M_1$ (respectively $M_2$) is the number of moves that $Z''$ makes when $m$ is between $m_0$ and $m_2$ (respectively between $m_2$ and $m_1$), and $M_3$ is the number of moves when $m \le m_1$. Then clearly
\[
  M_3 \le n^{2/5}.
\]
To estimate $M_1$, note that with error probability $O(1/n^2)$, we can assume that for fixed $\epsilon > 0$, $p_3 \le (1+\epsilon)(2m/3n)^{1/2} = O((2m_0/3n)^{1/2}) = O(n^{-1/5})$ for $m$ between $m_2$ and $m_0$. Thus
\[
  E[M_1] = \sum\big(\alpha(p_3(t)) + \beta(p_3(t))\big) = O(n^{3/5}\,n^{-1/5}) = O(n^{2/5}).
\]
For $M_2$, we note that according to Lemma 5.1, when $m = m_2$, $n_3$ is of order $n^{1/4}\ln^{9/2}n$, and $n_3$ decreases as $m$ decreases. Hence when $m$ is between $m_1$ and $m_2$, we have
\[
  p_3 = O(n^{1/4}\ln^{9/2}n/n^{2/5}) = O(n^{-0.14}),
\]
and so
\[
  E[M_2] = \sum\big(\alpha(p_3(t)) + \beta(p_3(t))\big) = O(n^{1/2}\ln^3 n\,n^{-0.14}) = O(n^{2/5}).
\]
As $M_1$ and $M_2$ are sums of independent Bernoulli variables with probability of success $\alpha(p_3(t)) + \beta(p_3(t))$ (when the sequence $p_3$ is known), they are concentrated near their means. Hence, we see that
\[
  M = O(E[M_1]) + O(E[M_2]) + M_3 = O(n^{2/5})
\]
almost surely. This gives that
\[
  E[\sqrt M] = O(n^{1/5}).
\]
Equation (6.7) now follows from (6.8) and the fact that $E[X_2(t_0)] = O(n^{1/5})$. This completes our proof of the lemma.  □

We are now in a position to prove the upper bound in the theorem. Using (4.5), we have
\[
\begin{aligned}
  E[\Phi(G_n)] &\le 2\sum_{t\ge0}E[n_1(t)/m(t)] + O(\ln n) \\
  &\le 2\sum_{m=1}^{3n/2}E[n_1(t_m)/m] + O(\ln n) \\
  &= 2\sum_{m=1}^{3n/2}E[n_1(t_m)]/m + O(\ln n) \\
  &= O(n^{1/5})\sum_{m=1}^{3n/2}1/m + O(\ln n) = O(n^{1/5}\ln n).
\end{aligned}
\]

7  A Lower Bound for $E[n_1]$

Let $B \ge 3000$ be a constant, and let $m_B = Bn^{3/5}$. Write $t_B$ for the first time at which $m(t) \le m_B$. Fix a constant $A > B$ and a small constant $\epsilon > 0$. (The numbers $A$, $\epsilon$ will be chosen later.) Let $m_A = An^{3/5}$. Define $t_A$ as the time immediately after $m \le m_A$. Hence $t_A = t_{m_A}$ and $t_B = t_{m_B}$. Let $N = \lfloor n^{1/5}\rfloor$ and $r = (n^{1/5}+3)/m_B \approx n^{-2/5}/B$. We shall find a lower bound for $E[n_1(t_B)]$ by considering a random walk with reflecting barriers at 0 and $N$ as follows.

Suppose for now that the sequence $p_3(t)$, $t \in [t_A,t_B]$, is known.
Consider a random walk $W(t)$, $t \in [t_A,t_B]$, with initial state $W(t_A) = 0$ and transition probabilities given below. With the functions $\alpha$, $\beta$ and $\chi$ as defined before and writing $\beta = \beta(p_3)$ and $\alpha = \alpha(p_3)$,
\[
  \Pr(\Delta W = 1) = \beta\chi_N(W), \qquad \Pr(\Delta W = -1) = \alpha\chi(W), \qquad \Pr(\Delta W = 0) = 1 - \beta\chi_N(W) - \alpha\chi(W),
\]
where $\chi_N(x) = 1$ if $x < N$ and zero otherwise. Associated with $W$ is a process $U$ where $U(t_A) = 0$ and $U$ is decreased by 3 with probability $9r$ at each step. We claim that if the processes $n_1$, $W$ and $U$ are "suitably" defined in a common probability space, then
\[
  n_1(t) \ge W(t) + U(t), \qquad t \in [t_A,t_B].  \qquad (7.1)
\]
Equation (7.1) holds for $t = t_A$ trivially. Assume that (7.1) holds for $t$. We want to show that it holds for $t+1$ also. Now if $n_1(t) \ge N+3$, then (7.1) holds for $t+1$ as $W \le N$ and $\Delta n_1 \ge -3$ always. Thus, we assume that $n_1(t) < N+3$ and would like to show next that if $W$ and $U$ are properly defined, then $\Delta n_1 \ge \Delta W + \Delta U$. Note first that if $n_1(t) < N+3$, then $p_1 \le r$ from our definition of $r$. From the transition probabilities of $n_1$, we see that (ignoring the $O(1/m)$ terms)

Case 1: $\Delta n_1 = 1$ with probability at least $p_2^2p_3 \ge \beta(p_3) - 2p_1 \ge \beta(p_3) - 2r$;

Case 2: $\Delta n_1 = -1$ with probability at most $\alpha(p_3) + 2p_1 \le \alpha(p_3) + 2r$;

Case 3: $\Delta n_1 = -2$ with probability at most $4p_1 \le 4r$ and $\Delta n_1 = -3$ with probability at most $p_1 \le r$.

Thus, very crudely, we see that $n_1$ and $W$ can be coupled so that $\Delta n_1 \ge \Delta W$ with probability at least $1 - 9r$ and $\Delta n_1 \ge \Delta W - 3$ always. (The big O terms in the transition probabilities of $n_1$ equal $O(n^{-3/5})$ and are negligible when compared with $r = O(n^{-2/5})$.) This establishes (7.1). Note that
\[
  E[U(t_B)] = -3(t_B - t_A)(9r)(1+o(1)) \ge -27r(A-B)n^{3/5}(1+o(1)) \ge -\frac{27(A-B)}{B}n^{1/5}(1+o(1)),
\]
and so
\[
  E[n_1(t_B)] \ge E[W(t_B)] - \frac{27(A-B)}{B}n^{1/5}(1+o(1)).  \qquad (7.2)
\]

We next bound $E[W(t_B)]$. Let $D$ be a probability distribution with support $\{0,1,\dots,N\}$, and let $W_D(t)$ be a random walk with the same transition probabilities as those of $W(t)$, but $W_D(t_A)$ has distribution $D$. Since $W(t_A) = 0$, $W$ and $W_D$ can be coupled so that $W_D(t) \ge W(t)$ for all $t \ge t_A$ and so that $W_D(t) - W(t)$ is non-increasing in $t$. Let $T$ be the first time that $W(t) = W_D(t)$. Then
\[
  W(t_B) \ge W_D(t_B)\,\chi(\{T \le t_B\}),
\]
where $\chi(\{T \le t_B\})$ is the indicator function for the event $\{T \le t_B\}$. Since $W_D \le N$, we have
\[
  E[W(t_B)] \ge E[W_D(t_B)] - N\Pr(T > t_B).
\]
Let $T'$ be the first time that $W_D = 0$. Then clearly $T \le T'$ in distribution, as $W_D(t) \ge W(t) \ge 0$. Thus,
\[
  E[W(t_B)] \ge E[W_D(t_B)] - N\Pr(T' > t_B).  \qquad (7.3)
\]
According to Lemma 5.1, with probability $1 - O(n^{-2})$, we have that for $t$ between $t_A$ and $t_B$ and fixed $\epsilon > 0$,
\[
  (1-\epsilon)\Big(\frac{2m_B}{3n}\Big)^{1/2} \le p_3(t) \le (1+\epsilon)\Big(\frac{2m_A}{3n}\Big)^{1/2}.  \qquad (7.4)
\]
The assumption of (7.4) only gives an extra factor of $1 - O(n^{-2})$ to our estimate of $E[n_1]$. Let $q_0 = (1+\epsilon)(2m_A/3n)^{1/2}$ and $q_1 = (1-\epsilon)(2m_B/3n)^{1/2}$. We shall assume that (7.4) is satisfied.

We next define the distribution $D$ and bound $W_D$ suitably. Write $\beta' = \beta(q_0)$ and $\alpha' = \alpha(q_0)$. Consider a random walk $W'$ with transition probabilities given by
\[
  \Pr(\Delta W' = 1) = \beta'\chi_N(W'), \qquad \Pr(\Delta W' = -1) = \alpha'\chi(W'), \qquad \Pr(\Delta W' = 0) = 1 - \beta'\chi_N(W') - \alpha'\chi(W').
\]
Note that here $\beta'$ and $\alpha'$ no longer depend on $t$ and that the steady state distribution $\pi'$ of $W'$ satisfies
\[
\begin{aligned}
  \pi'_0 &= (1-\beta')\pi'_0 + \alpha'\pi'_1, \\
  \pi'_i &= \beta'\pi'_{i-1} + (1-\beta'-\alpha')\pi'_i + \alpha'\pi'_{i+1}, \qquad i = 1,2,\dots,N-1, \\
  \pi'_N &= (1-\alpha')\pi'_N + \beta'\pi'_{N-1}, \\
  \pi'_i &= 0, \qquad \forall i < 0,
\end{aligned}
\]
from which we obtain
\[
  \pi'_i = \frac{1 - \beta'/\alpha'}{1 - (\beta'/\alpha')^{N+1}}\Big(\frac{\beta'}{\alpha'}\Big)^i, \quad i = 0,1,\dots,N; \qquad \pi'_i = 0, \quad \forall i < 0.  \qquad (7.5)
\]
We assume that $W'(t_A)$ has distribution $\pi'$ and so the distribution of $W'(t)$ equals $\pi'$ always.

Suppose now that $W_D(t)$ starts with distribution $\pi'$ at time $t_A$ (that is, $D = \pi'$). We shall show by induction that $W_D(t)$ is at least $W'(t)$ in distribution.
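Before carrying out the induction, here is a small Python sketch giving intuition about the size of $E[W_D(t)]$ under (7.5) (illustrative parameter values only; $n$, $B$, $A$ and $\epsilon$ below are placeholders chosen so that $q_0 < 1$). It computes the mean of the truncated geometric distribution (7.5) and compares it with the unrestricted mean $\beta'/(\alpha'-\beta') = (1-q_0)^2/q_0$ from Lemma 6.1.

```python
def alpha(x):
    return x ** 3 + (1 - x) * x

def beta(x):
    return (1 - x) ** 2 * x

def truncated_stationary_mean(q, N):
    """Mean of the stationary distribution (7.5) of W' with parameters
    beta' = beta(q), alpha' = alpha(q) and reflecting barriers at 0 and N."""
    rho = beta(q) / alpha(q)                 # = (1-q)^2 / (1 - q + q^2) < 1
    weights = [rho ** i for i in range(N + 1)]
    Z = sum(weights)
    return sum(i * w for i, w in enumerate(weights)) / Z

if __name__ == "__main__":
    n = 10 ** 12                             # large enough that q0 < 1 for B ~ 3600
    B, eps = 3600.0, 0.001
    A = B + B ** 0.5
    N = int(n ** 0.2)
    q0 = (1 + eps) * (2 * A * n ** 0.6 / (3 * n)) ** 0.5
    unrestricted = beta(q0) / (alpha(q0) - beta(q0))   # = (1-q0)^2/q0
    print(truncated_stationary_mean(q0, N), unrestricted, N)
```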
To show this, we follow the method used in showing $X_2 \preceq X'$ in distribution in our proof of Lemma 6.1. Assume $p_3$ satisfies (7.4) and that $W_D(t) \succeq W'(t)$ as our induction hypothesis. Let $Y$ be the state of a process which starts with distribution $\pi'$ (which equals the distribution of $W'(t)$) followed by a one-step transition with transition probabilities as those of $W_D$. We claim that
\[
  W_D(t+1) \succeq Y \succeq W'(t+1).  \qquad (7.6)
\]
The justification for the first inequality here follows similar arguments as those used in showing the first inequality in (6.4). To show the second inequality in (7.6), use $\sigma$ to denote the distribution of $Y$. Writing $\beta = \beta(p_3(t))$ and $\alpha = \alpha(p_3(t))$, we find that $\sigma_i = 0$ for all $i < 0$, and
\[
\begin{aligned}
  \sigma_0 &= (1-\beta)\pi'_0 + \alpha\pi'_1, \\
  \sigma_i &= \beta\pi'_{i-1} + (1-\beta-\alpha)\pi'_i + \alpha\pi'_{i+1}, \qquad i = 1,\dots,N-1, \\
  \sigma_N &= (1-\alpha)\pi'_N + \beta\pi'_{N-1}.
\end{aligned}
\]
Note that, since $q_0 \ge p_3(t)$, we have $\beta'/\alpha' \le \beta/\alpha$. Also
\[
  \sigma_N = \pi'_N + \beta\pi'_{N-1} - \alpha\pi'_N = \pi'_N + \Big(\beta\frac{\alpha'}{\beta'} - \alpha\Big)\pi'_N \ge \pi'_N,
\]
and by following (6.5), we have that for $i = 1,2,\dots,N-1$,
\[
  \sigma_i = \pi'_i\Big(1 + (\alpha'-\beta')\Big(\frac{\beta}{\beta'} - \frac{\alpha}{\alpha'}\Big)\Big) \ge \pi'_i.
\]
As $\pi'_i = \sigma_i = 0$ for $i < 0$, the distribution $\sigma$ is at least $\pi'$. This completes the induction step, and so we conclude that the distribution of $W_D(t)$ is at least $\pi'$. It therefore follows that for $t \in [t_A,t_B]$,
\[
  E[W_D(t)] \ge \sum_i i\pi'_i = \frac{\beta'/\alpha'}{1 - \beta'/\alpha'} - \frac{(N+1)(\beta'/\alpha')^{N+1}}{1 - (\beta'/\alpha')^{N+1}}.
\]
Since $q_0 \le (1+\epsilon)(2A/3)^{1/2}n^{-1/5}$ and $(\beta'/\alpha')^N \le \exp(-q_0N)$, we have
\[
  E[W_D(t)] \ge \sqrt{\frac{3}{2A}}\,\frac{n^{1/5}}{1+\epsilon} - \frac{n^{1/5}\exp(-(1+\epsilon)(2A/3)^{1/2})}{1 - \exp(-(1+\epsilon)(2A/3)^{1/2})}.
\]
By choosing $\epsilon > 0$ such that $(3/2)^{1/2}/(1+\epsilon) \ge 1.22$, we have
\[
  E[W_D(t_B)] \ge 1.22\frac{n^{1/5}}{A^{1/2}} - \frac{n^{1/5}\exp(-A^{1/2}/1.22)}{1 - \exp(-A^{1/2}/1.22)}.  \qquad (7.7)
\]

To estimate $T'$, we write $\beta'' = \beta(q_1)$ and $\alpha'' = \alpha(q_1)$, and let $\pi''$ be the distribution given in (7.5) with $\beta'$, $\alpha'$ replaced with $\beta''$, $\alpha''$ respectively. Let $W''$ be the same random walk as $W'$ but with $\beta'$, $\alpha'$ replaced with $\beta''$, $\alpha''$, and let $W''(t_A)$ have distribution $\pi''$ instead of $\pi'$. Note that as $q_1 \le q_0$, we have $\beta'/\alpha' \le \beta''/\alpha''$, and it is not difficult to check that the distribution $\pi'$ is at most $\pi''$. Hence, our previous argument for showing $W' \preceq W_D$ in distribution can be used to show $W_D \preceq W''$ in distribution. Next, let $W_1$ be the walk whose transition probabilities are as those of $W''$ but with $W_1(t_A) = N$ and without the reflecting barrier at $W_1 = N$. A simple coupling argument gives that $W'' \preceq W_1$ in distribution. Let $T_1$ be the first time that $W_1 = 0$. Then we have
\[
  T' \le T_1 \quad \text{in distribution.}
\]
We next estimate $T_1$. Let $L$ be the time elapsed before $W_1$ first gets to $N-1$. Then we have
\[
  T_1 = L_1 + \dots + L_N + t_A,
\]
where $L_1,\dots,L_N$ are independent variables identically distributed as $L$. Note that (I) with probability $\alpha''$, $L = 1$; (II) with probability $\beta''$, $L = 1 + L' + L''$; (III) with probability $1 - \alpha'' - \beta''$, $L = 1 + L'''$, where $L'$, $L''$, $L'''$ are independent and identically distributed as $L$. Hence, writing $M(z) = E[z^L]$ for the probability generating function of $L$, we have
\[
  M(z) = \alpha''z + \beta''zM(z)^2 + (1-\alpha''-\beta'')zM(z),
\]
giving
\[
  M(z) = \frac{1 - (1-\alpha''-\beta'')z - \sqrt{(1-(1-\alpha''-\beta'')z)^2 - 4\alpha''\beta''z^2}}{2\beta''z}.
\]
Put $z = \exp(\theta q_1^3)$, for some $\theta < 1/4$ to be chosen later.
Then
\[
  M(z) = 1 + \frac{(1-\sqrt{1-4\theta})q_1}{2} + O(q_1^2).
\]
Since each step of MINGREEDY destroys at most 5 edges,
\[
\begin{aligned}
  \Pr(T' \ge t_B) &\le \Pr(T_1 - t_A \ge t_B - t_A) \le \Pr\big(L_1 + \dots + L_N \ge (A-B)n^{3/5}/5\big) \\
  &\le E\big[\exp\big((L_1+\dots+L_N)\ln z - (A-B)n^{3/5}(\ln z)/5\big)\big] \\
  &= M(z)^N\exp\big(-\theta(A-B)q_1^3n^{3/5}/5\big) \\
  &\le \exp\big((1-\sqrt{1-4\theta})Nq_1/2 - \theta(A-B)q_1^3n^{3/5}/5\big) \\
  &\le \exp\big((1-\sqrt{1-4\theta})(1-\epsilon)(B/6)^{1/2} - \theta(A-B)(1-\epsilon)^3(2B/3)^{3/2}/5\big).
\end{aligned}
\]
Putting $\theta = 0.249$ and $\epsilon = 0.00001$, we obtain
\[
  \Pr(T' \ge t_B) \le \exp\big(0.39B^{1/2} - 0.027(A-B)B^{3/2}\big).  \qquad (7.8)
\]
Hence, assuming (7.4), we have from (7.2), (7.3), (7.7) and (7.8) that
\[
  E[n_1(t_B)] \ge 1.22\frac{n^{1/5}}{A^{1/2}} - \frac{n^{1/5}e^{-A^{1/2}/1.22}}{1 - e^{-A^{1/2}/1.22}} - n^{1/5}e^{0.39B^{1/2} - 0.027(A-B)B^{3/2}} - \frac{27(A-B)}{B}n^{1/5}(1+o(1)).
\]
We now set $A = B + xB^{1/2}$, where $x = 0.04$. Then, since $A \le B + \sqrt B$,
\[
  E[n_1(t_B)] \ge \frac{n^{1/5}}{(B+\sqrt B)^{1/2}}\Big(1.22 - 27x\Big(\frac{B+\sqrt B}{B}\Big)^{1/2}\Big)(1+o(1)) - n^{1/5}\Big(\frac{e^{-A^{1/2}/1.22}}{1-e^{-A^{1/2}/1.22}} + e^{0.39B^{1/2} - 0.027xB^{2}}\Big) \ge \frac{n^{1/5}}{10(B+\sqrt B)^{1/2}},  \qquad (7.9)
\]
since $B \ge 3000$. We have therefore shown the following lemma.

Lemma 7.1  For any constant $B \ge 3000$, if $m = Bn^{3/5}$, then
\[
  E[n_1(t_m)] \ge \frac{n^{1/5}}{10(B+\sqrt B)^{1/2}}(1-o(1)).
\]

Proof.  Let $\mathcal{E}$ be the event that (7.4) occurs. Then
\[
  E[n_1(t_m)] \ge E_2[n_1(t_m)](1 - O(1/n^2)),
\]
where $E_2$ is the expectation conditional on $\mathcal{E}$. It follows from (7.9) that
\[
  E[n_1(t_m)] \ge \frac{n^{1/5}}{10(B+\sqrt B)^{1/2}}(1-o(1)).  \qquad \Box
\]

To prove the lower bound in the theorem, we let $B_0 = 3600$ and $B_1 = 3000$. Write $m_0 = B_0n^{3/5}$, $m_1 = B_1n^{3/5}$, $t_0 = t_{m_0}$ and $t_1 = t_{m_1}$, where $t_m$ is (as before) the first time at which $m(t) \le m$. Note that when $t$ is between $t_0$ and $t_1$, we have $n_3 = \Theta(n^{2/5})$ and $n_1 = o(n^{2/5})$ almost surely, which gives $p_3 > p_1$ almost surely. Then using (4.6), we have
\[
\begin{aligned}
  E[\Phi(G_n)] &\ge (1-o(1))\sum_{t=t_0}^{t_1}E[p_1(t)] - O(\ln n) \\
  &= (1-o(1))\sum_{t=t_0}^{t_1}E\Big[\frac{n_1(t)}{2m(t)}\Big] - O(\ln n) \\
  &\ge (1-o(1))\sum_{t=t_0}^{t_1}\frac{E[n_1(t)]}{2m_0} - O(\ln n) \\
  &\ge (1-o(1))\,\frac{m_0-m_1}{5}\cdot\frac{1}{2m_0}\cdot\frac{n^{1/5}}{10\sqrt{3660}} - O(\ln n) \\
  &= \Omega(n^{1/5}).
\end{aligned}
\]

Acknowledgement.  We would like to thank the referee for a careful reading of the manuscript.

References

[1] E. A. Bender and E. R. Canfield, The asymptotic number of labelled graphs with given degree sequences, Journal of Combinatorial Theory, Series A 24 (1978) 296-307.

[2] B. Bollobás, A probabilistic proof of an asymptotic formula for the number of labelled regular graphs, European Journal of Combinatorics 1 (1980) 311-316.

[3] M. E. Dyer and A. M. Frieze, Randomized greedy matching, Random Structures and Algorithms 2 (1991) 29-45.

[4] M. E. Dyer, A. M. Frieze and B. Pittel, On the average performance of the greedy matching algorithm, to appear.

[5] O. Goldschmidt and D. S. Hochbaum, A fast perfect-matching algorithm in random graphs, SIAM Journal on Discrete Mathematics 3 (1990) 48-57.

[6] B. Korte and D. Hausmann, An analysis of the greedy algorithm for independence systems, in Algorithmic Aspects of Combinatorics, B. Alspach, P. Hall and D. J. Miller, Eds., Annals of Discrete Mathematics 2 (1978) 65-74.

[7] R. M. Karp and M. Sipser, Maximum matchings in sparse random graphs, Proceedings of the 22nd Annual IEEE Symposium on Foundations of Computer Science (1981) 364-375.

[8] G. Tinhofer, A probabilistic analysis of some greedy cardinality matching algorithms, Annals of Operations Research 1 (1984) 239-254.